Machine learning for prediction of diabetes risk in middle-aged Swedish people

Lara Lama; Oskar Wilhelmsson; Erik Norlander; Lars Gustafsson; Anton Lager; Per Tynelius; Lars Wärvik; Claes-Göran Östenson

doi:10.1016/j.heliyon.2021.e07419

Heliyon. 2021 Jun 25;7(7):e07419. doi: 10.1016/j.heliyon.2021.e07419. eCollection 2021 Jul.

Authors

Lara Lama¹, Oskar Wilhelmsson¹, Erik Norlander¹, Lars Gustafsson¹, Anton Lager², Per Tynelius², Lars Wärvik¹, Claes-Göran Östenson³

Affiliations

¹ CGI Inc., Stockholm, Sweden.
² Region Stockholm, Center for Epidemiological Research, Stockholm, Sweden.
³ Karolinska Institutet, Dept of Molecular Medicine & Surgery, Endocrine and Diabetes Unit, Stockholm, Sweden.

Abstract

Aims: To study whether machine learning methodology can be used to detect persons at increased risk of type 2 diabetes or prediabetes among people without known abnormal glucose regulation.

Methods: Machine learning and interpretable machine learning models were applied to research data from the Stockholm Diabetes Preventive Program, which included more than 8000 people who initially had normal glucose tolerance or prediabetes, to determine high- and low-risk features for further impairment in glucose tolerance at follow-up 10 and 20 years later.

Results: The features with the highest importance for the outcome were body mass index, waist-hip ratio, age, systolic and diastolic blood pressure, and diabetes heredity. High values of these features, as well as diabetes heredity, conferred increased risk of type 2 diabetes. The machine learning model was used to generate individual, comprehensible risk profiles, where the diabetes risk was obtained for each person in the data set. For each person, the features with the largest increasing or decreasing effects on the risk were determined.
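
The idea of an individual risk profile can be illustrated with a minimal sketch. It assumes a simple logistic model with independent features, for which the SHAP value of a feature on the log-odds scale reduces to the coefficient times the person's deviation from the cohort mean. The abstract does not state which model was used, and every coefficient, mean, and name below (COEF, MEAN, BASE_LOG_ODDS, risk_profile) is hypothetical and chosen only for illustration, not taken from the study.

```python
import math

# All numbers below are hypothetical and purely illustrative;
# they are NOT estimates from the study.
BASE_LOG_ODDS = -2.0  # assumed log-odds of the outcome for an "average" person
COEF = {  # assumed change in log-odds per unit of each feature
    "bmi": 0.10,
    "waist_hip_ratio": 3.0,
    "age": 0.03,
    "systolic_bp": 0.01,
    "heredity": 0.6,
}
MEAN = {  # assumed cohort means, used as the reference point
    "bmi": 26.0,
    "waist_hip_ratio": 0.88,
    "age": 47.0,
    "systolic_bp": 125.0,
    "heredity": 0.3,
}


def contributions(person):
    """Per-feature shift in log-odds relative to the average person."""
    return {name: COEF[name] * (person[name] - MEAN[name]) for name in COEF}


def risk_profile(person):
    """Return (predicted risk, contributions ranked by absolute size)."""
    contrib = contributions(person)
    log_odds = BASE_LOG_ODDS + sum(contrib.values())
    risk = 1.0 / (1.0 + math.exp(-log_odds))
    ranked = sorted(contrib.items(), key=lambda item: abs(item[1]), reverse=True)
    return risk, ranked


# A made-up person, not a record from the data set.
example = {
    "bmi": 31.0,
    "waist_hip_ratio": 0.95,
    "age": 55.0,
    "systolic_bp": 140.0,
    "heredity": 1.0,
}

risk, ranked = risk_profile(example)
for name, value in ranked:
    direction = "raises" if value > 0 else "lowers"
    print(f"{name}: {direction} log-odds by {abs(value):.2f}")
print(f"predicted risk: {risk:.2f}")
```

The contributions sum exactly to the difference between the person's log-odds and the baseline, which is the additivity property that makes such a profile readable feature by feature. For non-linear models, this closed form no longer holds and the contributions would have to be estimated instead.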

Conclusions: The primary application of this machine learning model is to predict individual type 2 diabetes risk in people without diagnosed diabetes and to identify the features to which that risk relates. However, since most features affecting diabetes risk, e.g. body mass index, diet composition, tobacco use, and stress, also play a role in metabolic control in diabetes, the tool could possibly also be used in diabetes care to develop more individualized, easily accessible health care plans for use in patient encounters.

Keywords: Individual healthcare plan; Interpretable machine learning; Machine learning; Risk screening; SHAP; Type 2 diabetes.