Enhanced Error Suppression for Accurate Detection of Low-Frequency Variants

Huimin Chen; Fei Yu; Debin Lu; Shiyue Huang; Songrui Liu; Boseng Zhang; Kunxian Shu; Dan Pu

doi:10.1002/elps.202400202

Electrophoresis. 2024 Dec 16. doi: 10.1002/elps.202400202. Online ahead of print.

Authors

Huimin Chen¹, Fei Yu¹, Debin Lu², Shiyue Huang³, Songrui Liu³, Boseng Zhang³, Kunxian Shu¹, Dan Pu¹

Affiliations

¹ Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China.
² Department of Neurology, The Second Affiliated Hospital of the Army Medical University of the People's Liberation Army, Chongqing, China.
³ Chongqing Yangjiaping Middle School, Chongqing, China.

PMID: 39679747

Abstract

The identification of low-frequency variants remains challenging because of the inherently high error rates of next-generation sequencing (NGS). Many promising strategies use unique molecular identifiers (UMIs) for error suppression. However, their efficiency depends heavily on redundant sequencing and quality control, which leads to substantial read waste and cost inefficiency. Here, we describe a novel approach, the enhanced error suppression strategy (EES), that addresses these challenges by (1) improving data utilization and reducing read waste through single-read correction, which retains the abundant single reads that complement other single reads or single-strand consensus sequences (SSCSs), and (2) improving the accuracy of NGS by applying Bayes' theorem. EES significantly improves variant detection accuracy, achieving a background error rate of less than 4.4 × 10^-5 per base pair. In addition, the data utilization rate increases markedly, with a 22.9-fold improvement in duplex consensus sequence (DCS) recovery compared with traditional methods. Furthermore, EES demonstrates superior error suppression across the various base substitution types. In conclusion, EES represents a significant advance in detecting low-frequency variants by improving data utilization and reducing sequencing errors. It may enhance the sensitivity and accuracy of NGS applications, making it valuable in clinical and research contexts where precise variant detection is critical.

Keywords: error suppression; low‐frequency variants; next‐generation sequencing (NGS) error; unique molecular identifier (UMI).
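
Illustrative sketch (not taken from the article): the abstract states that EES applies Bayes' theorem to improve accuracy but does not give the model. The Python example below shows one generic way Bayes' theorem can be applied to reads that share a UMI, by computing a posterior over the true base at a single position from the observed bases and their Phred quality scores. The function names, the uniform prior, the assumption that an erroneous call is equally likely to be any of the three other bases, and the 0.99 posterior threshold are all assumptions made for illustration; they are not the authors' EES implementation.

```python
import math

BASES = "ACGT"


def phred_to_error(q):
    """Convert a Phred quality score to an error probability.

    The result is clamped so that log() stays finite and so that a
    quality of 0 is treated as uninformative (all four bases equally likely).
    """
    e = 10 ** (-q / 10)
    return min(max(e, 1e-10), 0.75)


def base_posterior(observations, prior=None):
    """Posterior P(true base | observations) at one position.

    observations: list of (observed_base, phred_quality) pairs taken from
    reads assumed to derive from the same original molecule (same UMI).
    Assumption: an erroneous call is equally likely to be any other base.
    """
    if prior is None:
        prior = {b: 0.25 for b in BASES}  # assumed uniform prior
    log_post = {}
    for true_base in BASES:
        lp = math.log(prior[true_base])
        for obs, q in observations:
            e = phred_to_error(q)
            lp += math.log(1 - e) if obs == true_base else math.log(e / 3)
        log_post[true_base] = lp
    # Normalize in log space for numerical stability.
    m = max(log_post.values())
    weights = {b: math.exp(lp - m) for b, lp in log_post.items()}
    total = sum(weights.values())
    return {b: w / total for b, w in weights.items()}


def call_base(observations, min_posterior=0.99):
    """Return the most probable base, or 'N' if no base is confident enough."""
    post = base_posterior(observations)
    best = max(post, key=post.get)
    return best if post[best] >= min_posterior else "N"


if __name__ == "__main__":
    # Two high-quality A calls outweigh one lower-quality G call.
    print(call_base([("A", 30), ("A", 30), ("G", 20)]))  # A
    # Two conflicting low-quality reads cannot be resolved.
    print(call_base([("A", 10), ("C", 10)]))  # N
```

In this sketch, a single high-quality read can still yield a confident call, which loosely mirrors the abstract's point that single reads need not be discarded; how EES actually corrects and pairs single reads is not described in the abstract.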
