A Prediction Model for Uncontrolled Type 2 Diabetes Mellitus Incorporating Area-level Social Determinants of Health

Med Care. 2019 Aug;57(8):592-600. doi: 10.1097/MLR.0000000000001147.

Abstract

Background: Social determinants of health (SDH) at the area level are understood to influence the likelihood of having poor glycemic control for patients with type 2 diabetes mellitus (T2DM).

Objectives: To develop a model for predicting whether a person with T2DM has uncontrolled diabetes (hemoglobin A1c ≥9%), incorporating individual and area-level (census tract) covariates.

Research design: Development and validation of machine learning models.

Subjects: A total of N=1,015,808 privately insured persons with T2DM in claims data.

Measures: C-statistic, sensitivity, specificity, positive predictive value, negative predictive value, and accuracy.
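As a minimal sketch of how the threshold-based measures listed above follow from a 2x2 confusion matrix, the Python below computes sensitivity, specificity, positive predictive value, negative predictive value, and accuracy. The `ConfusionCounts` class, the `classification_measures` function, and the example counts are hypothetical illustrations and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts from a 2x2 confusion matrix (positive = uncontrolled diabetes)."""
    tp: int  # predicted uncontrolled, actually uncontrolled
    fp: int  # predicted uncontrolled, actually controlled
    tn: int  # predicted controlled, actually controlled
    fn: int  # predicted controlled, actually uncontrolled


def classification_measures(c: ConfusionCounts) -> dict:
    """Return the five threshold-based measures as proportions in [0, 1].

    Assumes every denominator is non-zero, i.e. there is at least one
    actual positive, one actual negative, one predicted positive, and
    one predicted negative.
    """
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / total,
    }


# Hypothetical counts for illustration only; not data from the study.
example = ConfusionCounts(tp=70, fp=30, tn=870, fn=30)
measures = classification_measures(example)
```

Because positive and negative predictive values depend on how common the outcome is in the evaluated sample, they are best compared across models only when those models are scored on the same data.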

Results: A standard logistic regression model selecting among the available individual-level covariates and area-level SDH covariates (at the census tract level) performed poorly on a 25% held-out validation subset of the data, with a C-statistic of 0.685, sensitivity of 25.6%, specificity of 90.1%, positive predictive value of 56.9%, negative predictive value of 70.4%, and accuracy of 68.4%. By contrast, machine learning models improved risk prediction; the best-performing model, a random forest algorithm, achieved a C-statistic of 0.928, sensitivity of 68.5%, specificity of 94.6%, positive predictive value of 69.8%, negative predictive value of 94.3%, and accuracy of 90.6%. SDH variables alone explained 16.9% of variation in uncontrolled diabetes.
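For readers less familiar with the C-statistic reported above, the following Python sketch shows one common way to compute it for a binary outcome: the proportion of (uncontrolled, controlled) patient pairs in which the uncontrolled patient receives the higher predicted risk, with ties counted as one half. The function name `c_statistic` and the toy labels and scores are assumptions for illustration; the abstract does not describe the authors' implementation.

```python
def c_statistic(labels, scores):
    """Concordance (C) statistic for a binary outcome.

    labels: iterable of 0/1 outcomes (1 = uncontrolled diabetes).
    scores: iterable of predicted risks, in the same order as labels.
    Uses a simple O(n_pos * n_neg) pairwise comparison, which is fine
    for small examples but too slow for very large datasets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0   # positive case ranked above negative case
            elif p == n:
                concordant += 0.5   # tied scores count as half-concordant
    return concordant / (len(pos) * len(neg))


# Toy data for illustration only; not data from the study.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.4, 0.5, 0.1]
toy_c = c_statistic(toy_labels, toy_scores)  # 5 of 6 pairs are concordant
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking.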

Conclusions: A predictive model developed through a machine learning approach may help health care organizations identify which area-level SDH data to monitor for prediction of diabetes control, for potential use in risk adjustment and targeting.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Aged
  • Diabetes Mellitus, Type 2 / epidemiology*
  • Diabetes Mellitus, Type 2 / therapy
  • Female
  • Glycated Hemoglobin / analysis
  • Humans
  • Logistic Models
  • Machine Learning
  • Male
  • Middle Aged
  • Models, Statistical
  • Risk Assessment
  • Risk Factors
  • Social Determinants of Health / statistics & numerical data*

Substances

  • Glycated Hemoglobin A
  • hemoglobin A1c protein, human