TreeTerminus - creating transcript trees using inferential replicate counts

Noor Pratap Singh; Michael I Love; Rob Patro

doi:10.1016/j.isci.2023.106961

iScience. 2023 May 25;26(6):106961. doi: 10.1016/j.isci.2023.106961. eCollection 2023 Jun 16.

Authors

Noor Pratap Singh¹, Michael I Love²,³, Rob Patro¹

Affiliations

¹ Department of Computer Science, University of Maryland, College Park, MD, USA.
² Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA.
³ Department of Genetics, University of North Carolina, Chapel Hill, NC, USA.

Abstract

Transcript abundance estimates always carry some degree of uncertainty, which can make downstream analyses such as differential testing difficult for certain transcripts. Gene-level analysis, though less ambiguous, is often too coarse-grained. We introduce TreeTerminus, a data-driven approach for grouping transcripts into a tree structure in which leaves represent individual transcripts and internal nodes represent aggregations of transcript sets. TreeTerminus constructs trees such that, on average, inferential uncertainty decreases as we ascend the tree topology. The tree provides the flexibility to analyze data at nodes at different levels of resolution, which can be tuned to the analysis of interest. We evaluated TreeTerminus on two simulated and two experimental datasets and observed improved performance compared with transcripts (leaves) and other methods under several different metrics.
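The claim that uncertainty tends to decrease when moving from leaves to internal nodes can be illustrated with a small sketch. This is not the TreeTerminus implementation. The uncertainty measure is modeled on the inferential relative variance used in related tooling, and the pseudocount `pc`, the `shift`, the function names, and the toy replicate counts are all assumptions chosen for illustration.

```python
from statistics import mean, variance


def inf_rv(reps, pc=5.0, shift=0.01):
    """Uncertainty of one node from its inferential replicate counts.

    Assumed form: excess of variance over the mean, scaled by the mean.
    `pc` and `shift` are illustrative defaults, not values from the paper.
    """
    m = mean(reps)
    v = variance(reps)  # sample variance across replicates
    return max(v - m, 0.0) / (m + pc) + shift


def aggregate(*members):
    """Counts of an internal node: per-replicate sum over its members."""
    return [sum(vals) for vals in zip(*members)]


# Hypothetical counts for two transcripts that share ambiguous reads:
# whatever one transcript gains in a replicate, the other loses.
t1 = [100, 60, 140, 80, 120]
t2 = [100, 140, 60, 120, 80]

# The parent node holds 200 counts in every replicate.
parent = aggregate(t1, t2)

print("leaf t1:", inf_rv(t1))
print("leaf t2:", inf_rv(t2))
print("parent :", inf_rv(parent))
```

In this toy case each leaf has high uncertainty because reads shift between the two transcripts across replicates, while their sum is stable, so the parent node's uncertainty falls to the floor set by `shift`.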

Keywords: Bioinformatics; Data processing in systems biology; Transcriptomics.

Grants and funding

R01 HG009937/HG/NHGRI NIH HHS/United States