Comparison of approaches for machine-learning optimization of neural networks for detecting gene-gene interactions in genetic epidemiology

Genet Epidemiol. 2008 May;32(4):325-40. doi: 10.1002/gepi.20307.

Abstract

The detection of genotypes that predict common, complex disease is a challenge for human geneticists. The phenomenon of epistasis, or gene-gene interaction, is particularly problematic for traditional statistical techniques. Additionally, the explosion of genetic information makes exhaustive searches of multilocus combinations computationally infeasible. To address these challenges, neural networks (NN), a pattern recognition method, have been used. One limitation of the NN approach is that its success depends on the architecture of the network. To address this limitation, machine-learning approaches have been suggested to evolve the best NN architecture for a particular data set. In this study we provide a detailed technical description of the use of grammatical evolution to optimize neural networks (GENN) for use in genetic association studies. We compare the performance of GENN to that of a previous machine-learning NN application, genetic programming neural networks, in both simulated and real data. We show that GENN greatly outperforms genetic programming neural networks in data sets with a large number of single nucleotide polymorphisms. Additionally, we demonstrate that GENN has high power to detect disease-risk loci in a range of high-order epistatic models. Finally, we demonstrate the scalability of the GENN method with increasing numbers of variables, up to 500,000 single nucleotide polymorphisms.
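
As a rough illustration of why exhaustive searches of multilocus combinations become infeasible, the sketch below counts the distinct SNP sets an exhaustive scan would have to evaluate at the largest scale mentioned in the abstract. The function name `n_combinations` and the choice of interaction orders 2 to 4 are illustrative assumptions, not details taken from the study.

```python
from math import comb

# Largest number of SNPs mentioned in the abstract.
N_SNPS = 500_000


def n_combinations(n_snps: int, order: int) -> int:
    """Return the number of distinct unordered SNP sets of size `order`.

    This is the binomial coefficient C(n_snps, order): the number of
    candidate models an exhaustive search would need to test at that
    interaction order.
    """
    return comb(n_snps, order)


if __name__ == "__main__":
    # Interaction orders 2-4 are chosen here only for illustration.
    for order in (2, 3, 4):
        count = n_combinations(N_SNPS, order)
        print(f"{order}-locus combinations: {count:,}")
```

For 500,000 SNPs this gives roughly 1.25 × 10^11 pairs and about 2.08 × 10^16 three-locus sets, which suggests why a search strategy that does not enumerate every combination is attractive.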

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Artificial Intelligence*
  • Data Interpretation, Statistical
  • Databases, Genetic
  • Epidemiologic Methods
  • Epistasis, Genetic
  • Genetic Predisposition to Disease
  • HIV Infections / epidemiology
  • HIV Infections / genetics
  • HIV Infections / immunology
  • Humans
  • Immunogenetics
  • Models, Genetic*
  • Neural Networks, Computer*
  • Pattern Recognition, Automated
  • Polymorphism, Single Nucleotide