Statistical and Machine-Learning Analyses in Nutritional Genomics Studies

Nutrients. 2020 Oct 14;12(10):3140. doi: 10.3390/nu12103140.

Abstract

Nutritional compounds may influence different OMICs levels, including genomics, epigenomics, transcriptomics, proteomics, metabolomics, and metagenomics. The integration of OMICs data is challenging but may provide new knowledge to explain the mechanisms involved in nutrient metabolism and disease. Traditional statistical analyses play an important role in data description and association; however, these procedures are not sufficiently powered to interpret large integrated multiple OMICs (multi-OMICs) datasets. Machine learning (ML) approaches can play a major role in the interpretation of multi-OMICs data in nutrition research. Specifically, ML can be used for data mining, sample clustering, and classification to produce predictive models and algorithms for the integration of multi-OMICs data in response to dietary intake. The objective of this review was to investigate the strategies used for the analysis of multi-OMICs data in nutrition studies. Sixteen recent studies that aimed to understand the association between dietary intake and multi-OMICs data are summarized. Multivariate analysis is the more commonly used approach in multi-OMICs nutrition studies. Overall, as nutrition research incorporates multi-OMICs data, novel analytical approaches such as ML need to complement traditional statistical analyses to fully explain the impact of nutrition on health and disease.

Keywords: data integration; genomics; machine learning; multi-OMICS; nutrition.

Publication types

  • Review

MeSH terms

  • Algorithms
  • Cluster Analysis
  • Data Interpretation, Statistical
  • Data Mining
  • Eating
  • Genome-Wide Association Study
  • Humans
  • Machine Learning*
  • Nutrigenomics / methods*
  • Nutrition Disorders / genetics
  • Nutrition Disorders / metabolism
  • Nutritional Physiological Phenomena / genetics