Background: Previous reports indicate that a very high percentage of individuals in Saudi Arabia have undiagnosed type 2 diabetes mellitus (T2DM). Several screening and awareness campaigns have been conducted, but they lacked full accessibility and consumed extensive human and material resources. Machine learning (ML) models could therefore enhance population-based screening. This study aims to compare the outcomes of a newly developed ML model with those of the validated American Diabetes Association (ADA) risk assessment in identifying people at high risk for T2DM.
Research design and methods: Patients' age, gender, and risk factors, obtained from the National Health Information Center's dataset, were used to build and train the ML model. To evaluate the model, an external validation study was conducted in three primary health care centers. A random sample (N = 3400) was selected from non-diabetic individuals.
Results: The Receiver Operating Characteristic (ROC) curve, which plots sensitivity against 100-specificity, yielded an area under the curve (AROC) of 0.803 (95% CI: 0.779-0.826).
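To illustrate how an AROC value and its 95% confidence interval of the kind reported above can be obtained, the following is a minimal Python sketch. It is not the study's analysis code: the toy labels and scores are invented for illustration, the function names are hypothetical, and the Hanley-McNeil standard error is an assumed choice, since the abstract does not state which interval method was used.

```python
import math


def auc_mann_whitney(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Approximate confidence interval for an AUC using the Hanley-McNeil
    standard error (assumed method; z=1.96 gives a 95% interval)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    se = math.sqrt(variance)
    # Clip to the valid [0, 1] range for an AUC.
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Toy data, invented for illustration only (1 = high risk, 0 = not high risk).
labels = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90, 0.20, 0.60]

auc = auc_mann_whitney(labels, scores)  # 0.75 for this toy data
lo, hi = hanley_mcneil_ci(auc, n_pos=sum(labels), n_neg=len(labels) - sum(labels))
print(f"AROC = {auc:.3f}, 95% CI: {lo:.3f}-{hi:.3f}")
```

With only eight toy observations the interval is very wide; a validation sample in the thousands, as in this study, gives a far narrower interval.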
Conclusions: This study presents a new ML model for population-level classification that can serve as an adequate tool for identifying individuals who are at high risk of T2DM or who already have T2DM but have not been diagnosed.
Keywords: Machine learning; Saudi Arabia; health informatics; high risk; type-2 diabetes mellitus.