How does the age of control individuals hinder the identification of target genes for Huntington's disease?

Front Genet. 2024 Jun 20:15:1377237. doi: 10.3389/fgene.2024.1377237. eCollection 2024.

Abstract

Several studies have compared the transcriptome across various brain regions in Huntington's disease (HD) gene-positive and neurologically normal individuals to identify differentially expressed genes (DEGs) that could serve as pharmaceutical or prognostic targets for HD. Despite adhering to technical recommendations for optimal RNA-Seq analysis, none of the genes identified as upregulated in these studies has yet proved successful as a prognostic or therapeutic target for HD. Earlier studies included samples from neurologically normal individuals who were older than the HD gene-positive group. Because aging induces gradual transcriptional changes in the brain, we posited that using samples from older controls could lead to the misidentification of DEGs. To test our hypothesis, we reanalyzed 146 samples from this study, accessible on the SRA database, and employed Propensity Score Matching (PSM) to create a "virtual" control group with an age distribution statistically comparable to that of the HD gene-positive group. Our study underscores the adverse impact of using neurologically normal individuals over 75 as controls in differential gene expression analysis, which produces both false positives and false negatives. We conclusively demonstrate that using such old controls leads to the misidentification of DEGs, detrimentally affecting the discovery of potential pharmaceutical and prognostic markers. This underscores the pivotal role of the age of control samples in RNA-Seq analysis and supports including it when best practices for such investigations are evaluated. Although our primary focus is HD, our findings suggest that judiciously selecting age-appropriate control samples can significantly improve best practices in differential expression analysis.
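To make the matching idea concrete, the following is a minimal sketch of one-to-one propensity score matching on age alone. It is not the authors' pipeline: the abstract does not state which covariates, matching algorithm, or software were used, so the greedy nearest-neighbor strategy, the caliper of 0.5 standard deviations, the function names `fit_propensity` and `match_controls`, and the toy ages below are all assumptions chosen for illustration.

```python
import math

def fit_propensity(ages, is_case, iters=50):
    """Fit logit P(case | age) = b0 + b1 * z by Newton's method (z = standardized age).

    Returns the logit of the propensity score for every sample.
    """
    n = len(ages)
    mean = sum(ages) / n
    sd = math.sqrt(sum((a - mean) ** 2 for a in ages) / n)
    z = [(a - mean) / sd for a in ages]
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for zi, yi in zip(z, is_case):
            eta = max(-30.0, min(30.0, b0 + b1 * zi))  # clamp to avoid overflow
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient terms
            g1 += (yi - p) * zi
            h00 += w                # Hessian terms (X^T W X)
            h01 += w * zi
            h11 += w * zi * zi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return [b0 + b1 * zi for zi in z]

def match_controls(ages, is_case, caliper_sd=0.5):
    """Greedy 1:1 nearest-neighbor matching on the logit score, without replacement.

    caliper_sd is an illustrative choice, expressed in SDs of the logit score.
    Returns a list of (case_index, control_index) pairs.
    """
    logits = fit_propensity(ages, is_case)
    mean = sum(logits) / len(logits)
    sd = math.sqrt(sum((v - mean) ** 2 for v in logits) / len(logits))
    caliper = caliper_sd * sd
    cases = [i for i, y in enumerate(is_case) if y == 1]
    free = [i for i, y in enumerate(is_case) if y == 0]
    pairs = []
    for i in cases:
        if not free:
            break
        j = min(free, key=lambda k: abs(logits[k] - logits[i]))
        if abs(logits[j] - logits[i]) <= caliper:
            pairs.append((i, j))
            free.remove(j)  # each control is used at most once
    return pairs

# Toy ages, invented for illustration only (not data from the study).
hd_ages = [45, 50, 52, 55, 58, 60, 62, 65]
control_ages = [48, 57, 63, 70, 76, 79, 82, 85, 88, 90]
ages = hd_ages + control_ages
is_case = [1] * len(hd_ages) + [0] * len(control_ages)

pairs = match_controls(ages, is_case)
matched_control_ages = [ages[j] for _, j in pairs]
print("matched (HD age, control age):", [(ages[i], ages[j]) for i, j in pairs])
print("mean age, all controls:    ", sum(control_ages) / len(control_ages))
print("mean age, matched controls:", sum(matched_control_ages) / len(matched_control_ages))
```

Controls that fall outside the caliper of every HD sample are simply left out, which is how a matched subset ends up with an age distribution closer to that of the HD group. With a single covariate the score is a monotone function of age, so this reduces to matching on age; the propensity score becomes more useful once further covariates (for example sex or RNA quality, if available) are included.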

Keywords: Huntington’s disease; PSM; RNA-seq analysis; aging; bioinformatics; case-control.

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. The authors thank the Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP; process numbers 2023/06116-2 and 2023/10353-0) and Fundação Butantan for the financial support.