Anti-correlated feature selection prevents false discovery of subpopulations in scRNAseq

Nat Commun. 2024 Jan 24;15(1):699. doi: 10.1038/s41467-023-43406-9.

Abstract

While sub-clustering cell populations has become popular in single-cell omics, negative controls for this process are lacking. Popular feature-selection/clustering algorithms fail the null-dataset problem, allowing erroneous subdivisions of homogeneous clusters until nearly every cell is called its own cluster. Using real and synthetic datasets, we find that anti-correlated gene selection reduces or eliminates erroneous subdivisions, increases marker-gene selection efficacy, and efficiently scales to millions of cells.
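
The abstract does not spell out the selection procedure, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. It assumes that a gene is kept when it has at least one negative gene-gene correlation stronger than any seen after shuffling each gene independently across cells. The function name `anti_correlated_genes`, the `n_perm` parameter, the permutation-based cutoff, and the simulated Poisson data are all assumptions made for this example.

```python
import numpy as np

def anti_correlated_genes(counts, n_perm=5, seed=0):
    """Hypothetical helper: return (selected gene indices, cutoff).

    A gene is selected if it has at least one negative correlation with
    another gene that is stronger than the most negative correlation
    observed in shuffled (null) copies of the same data.
    """
    rng = np.random.default_rng(seed)
    x = np.log1p(np.asarray(counts, dtype=float))  # cells x genes
    n_genes = x.shape[1]
    off_diag = ~np.eye(n_genes, dtype=bool)

    def gene_corr(m):
        # Gene-by-gene Pearson correlation; constant genes yield NaN -> 0.
        with np.errstate(invalid="ignore", divide="ignore"):
            return np.nan_to_num(np.corrcoef(m, rowvar=False))

    # Null: shuffle each gene independently across cells. This destroys
    # gene-gene structure but keeps each gene's marginal distribution.
    cutoff = 0.0
    for _ in range(n_perm):
        shuffled = rng.permuted(x, axis=0)
        cutoff = min(cutoff, gene_corr(shuffled)[off_diag].min())

    hits = (gene_corr(x) < cutoff) & off_diag
    return np.flatnonzero(hits.any(axis=1)), cutoff

# Simulated data (assumed for illustration only).
rng = np.random.default_rng(1)
n_cells, n_genes = 300, 200

# Null dataset: one homogeneous population, all genes independent.
null_counts = rng.poisson(5.0, size=(n_cells, n_genes))

# Two populations: genes 0-9 mark the first, genes 10-19 mark the second.
rates = np.full((n_cells, n_genes), 5.0)
rates[:150, :10], rates[150:, :10] = 20.0, 2.0
rates[:150, 10:20], rates[150:, 10:20] = 2.0, 20.0
two_pop_counts = rng.poisson(rates)

null_selected, _ = anti_correlated_genes(null_counts)
two_pop_selected, _ = anti_correlated_genes(two_pop_counts)
print("null dataset, genes selected:", len(null_selected))
print("two populations, genes selected:", len(two_pop_selected))
```

In this sketch the homogeneous dataset should yield few or no selected genes, which gives a downstream clustering step little to subdivide, while the two-population dataset should recover the mutually exclusive marker genes. Computing a full gene-by-gene correlation matrix as done here would not scale to large gene sets without further work.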

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • Single-Cell Gene Expression Analysis*