Universal method for robust detection of circadian state from gene expression

Proc Natl Acad Sci U S A. 2018 Sep 25;115(39):E9247-E9256. doi: 10.1073/pnas.1800314115. Epub 2018 Sep 10.

Abstract

Circadian clocks play a key role in regulating a vast array of biological processes, with significant implications for human health. Accurate assessment of physiological time using transcriptional biomarkers found in human blood can significantly improve diagnosis of circadian disorders and optimize the delivery time of therapeutic treatments. To be useful, such a test must be accurate, minimally burdensome to the patient, and readily generalizable to new data. A major obstacle in development of gene expression biomarker tests is the diversity of measurement platforms and the inherent variability of the data, often resulting in predictors that perform well in the original datasets but cannot be universally applied to new samples collected in other settings. Here, we introduce TimeSignature, an algorithm that robustly infers circadian time from gene expression. We demonstrate its application in data from three independent studies using distinct microarrays and further validate it against a new set of samples profiled by RNA-sequencing. Our results show that TimeSignature is more accurate and efficient than competing methods, estimating circadian time to within 2 h for the majority of samples. Importantly, we demonstrate that once trained on data from a single study, the resulting predictor can be universally applied to yield highly accurate results in new data from other studies independent of differences in study population, patient protocol, or assay platform without renormalizing the data or retraining. This feature is unique among expression-based predictors and addresses a major challenge in the development of generalizable, clinically useful tests.
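The "to within 2 h" figure above implies measuring error on a 24-hour clock, where a prediction of 23:30 against a true time of 00:30 is off by 1 h, not 23 h. The following is a minimal sketch of one way such an error could be computed. It is not the TimeSignature algorithm and is not taken from the paper. The function names, the wrap-around convention, and the sample values are all illustrative assumptions.

```python
def circular_error_hours(predicted, actual, period=24.0):
    """Signed difference (predicted - actual), wrapped into [-period/2, period/2).

    Hypothetical helper: times are hours on a clock of length `period`.
    """
    return (predicted - actual + period / 2.0) % period - period / 2.0


def fraction_within(predicted, actual, tolerance=2.0, period=24.0):
    """Fraction of samples whose absolute circular error is <= tolerance hours."""
    errors = [
        abs(circular_error_hours(p, a, period))
        for p, a in zip(predicted, actual)
    ]
    return sum(e <= tolerance for e in errors) / len(errors)


if __name__ == "__main__":
    # Made-up values for illustration only; not data from the study.
    predicted_hours = [23.5, 1.0, 12.0, 6.0]
    actual_hours = [0.5, 23.0, 15.5, 6.5]
    print(fraction_within(predicted_hours, actual_hours))
```

Wrapping the difference before taking the absolute value keeps predictions near midnight from being scored as near-24 h misses, which would otherwise distort any accuracy summary of this kind.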

Keywords: circadian rhythms; cross-platform prediction; gene expression dynamics; machine learning.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Biomarkers / blood
  • Circadian Clocks / genetics*
  • Circadian Rhythm / genetics
  • Gene Expression
  • Gene Expression Profiling / methods*
  • Genes / genetics
  • Humans
  • Machine Learning*
  • Models, Statistical
  • Reproducibility of Results
  • Sleep
  • Transcriptome

Substances

  • Biomarkers