An artificial neural network prediction model of congenital heart disease based on risk factors: A hospital-based case-control study

Medicine (Baltimore). 2017 Feb;96(6):e6090. doi: 10.1097/MD.0000000000006090.

Abstract

An artificial neural network (ANN) model was developed to predict the risks of congenital heart disease (CHD) in pregnant women.This hospital-based case-control study involved 119 CHD cases and 239 controls all recruited from birth defect surveillance hospitals in Hunan Province between July 2013 and June 2014. All subjects were interviewed face-to-face to fill in a questionnaire that covered 36 CHD-related variables. The 358 subjects were randomly divided into a training set and a testing set at the ratio of 85:15. The training set was used to identify the significant predictors of CHD by univariate logistic regression analyses and develop a standard feed-forward back-propagation neural network (BPNN) model for the prediction of CHD. The testing set was used to test and evaluate the performance of the ANN model. Univariate logistic regression analyses were performed on SPSS 18.0. The ANN models were developed on Matlab 7.1.The univariate logistic regression identified 15 predictors that were significantly associated with CHD, including education level (odds ratio = 0.55), gravidity (1.95), parity (2.01), history of abnormal reproduction (2.49), family history of CHD (5.23), maternal chronic disease (4.19), maternal upper respiratory tract infection (2.08), environmental pollution around maternal dwelling place (3.63), maternal exposure to occupational hazards (3.53), maternal mental stress (2.48), paternal chronic disease (4.87), paternal exposure to occupational hazards (2.51), intake of vegetable/fruit (0.45), intake of fish/shrimp/meat/egg (0.59), and intake of milk/soymilk (0.55). After many trials, we selected a 3-layer BPNN model with 15, 12, and 1 neuron in the input, hidden, and output layers, respectively, as the best prediction model. The prediction model has accuracies of 0.91 and 0.86 on the training and testing sets, respectively. The sensitivity, specificity, and Yuden Index on the testing set (training set) are 0.78 (0.83), 0.90 (0.95), and 0.68 (0.78), respectively. The areas under the receiver operating curve on the testing and training sets are 0.87 and 0.97, respectively.This study suggests that the BPNN model could be used to predict the risk of CHD in individuals. This model should be further improved by large-sample-size research.

Publication types

  • Observational Study

MeSH terms

  • Adult
  • Case-Control Studies
  • Diet
  • Female
  • Health Status
  • Heart Defects, Congenital / epidemiology*
  • Humans
  • Logistic Models
  • Maternal Exposure
  • Neural Networks, Computer*
  • Pregnancy
  • Reproductive History
  • Risk Factors
  • Sensitivity and Specificity
  • Socioeconomic Factors