Joint probabilistic modeling of single-cell multi-omic data with totalVI

Nat Methods. 2021 Mar;18(3):272-282. doi: 10.1038/s41592-020-01050-x. Epub 2021 Feb 15.

Abstract

The paired measurement of RNA and surface proteins in single cells with cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) is a promising approach to connect transcriptional variation with cell phenotypes and functions. However, combining these paired views into a unified representation of cell state is made challenging by the unique technical characteristics of each measurement. Here we present Total Variational Inference (totalVI; https://scvi-tools.org), a framework for end-to-end joint analysis of CITE-seq data that probabilistically represents the data as a composite of biological and technical factors, including protein background and batch effects. To evaluate totalVI's performance, we profiled immune cells from murine spleen and lymph nodes with CITE-seq, measuring over 100 surface proteins. We demonstrate that totalVI provides a cohesive solution for common analysis tasks such as dimensionality reduction, the integration of datasets with different measured proteins, estimation of correlations between molecules and differential expression testing.
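To make the idea of separating protein background from true signal concrete, the following is a minimal illustrative sketch, not totalVI's actual model. It assumes a toy two-component Poisson mixture for a single protein count, whereas the abstract does not specify the likelihood totalVI uses. The function names, the rates, and the prior are all hypothetical and chosen only for illustration.

```python
import math


def poisson_logpmf(k, rate):
    """Log-probability of observing count k under a Poisson with the given rate."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)


def foreground_probability(count, background_rate, foreground_rate,
                           prior_foreground=0.5):
    """Posterior probability that a protein count arose from the foreground
    (true expression) component rather than the background component.

    Toy assumption: each component is Poisson. This is a simplification for
    illustration and is not taken from the totalVI paper.
    """
    log_bg = math.log(1.0 - prior_foreground) + poisson_logpmf(count, background_rate)
    log_fg = math.log(prior_foreground) + poisson_logpmf(count, foreground_rate)
    # Subtract the max before exponentiating for numerical stability.
    m = max(log_bg, log_fg)
    w_bg = math.exp(log_bg - m)
    w_fg = math.exp(log_fg - m)
    return w_fg / (w_bg + w_fg)


if __name__ == "__main__":
    # Hypothetical rates: background around 3 counts, foreground around 50.
    for count in (2, 10, 60):
        p = foreground_probability(count, background_rate=3.0, foreground_rate=50.0)
        print(f"count={count:3d}  P(foreground)={p:.4f}")
```

Under these assumed rates, low counts are attributed almost entirely to background and high counts to foreground. A full model in the spirit of the abstract would additionally learn such parameters jointly with RNA and account for batch effects.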

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Animals
  • Cells, Cultured
  • Data Analysis
  • Female
  • High-Throughput Screening Assays / methods
  • Lymph Nodes / cytology
  • Lymph Nodes / metabolism*
  • Mice
  • Mice, Inbred C57BL
  • Proteins / analysis*
  • RNA / analysis
  • RNA / genetics
  • Single-Cell Analysis / methods*
  • Spleen / cytology
  • Spleen / metabolism*
  • Transcriptome / genetics*

Substances

  • Proteins
  • RNA