Visualizing single-cell transcriptomics data in an informative way is a major challenge in biological data analysis. Clustering of cells is a prominent analysis step, and the results are usually visualized in a planar embedding of the cells using methods such as PCA, t-SNE, or UMAP. Given a cluster of cells, one frequently searches for the genes that are highly expressed specifically in that cluster. At this point, visualization is usually replaced by studying a list of differentially expressed genes. Association Plots are derived from correspondence analysis and constitute a planar visualization of the features that characterize a given cluster of observations. We have adapted Association Plots to address the challenge of visualizing cluster-specific genes in large single-cell data sets. Our method is available as a free R package called APL. We demonstrate the application of APL and Association Plots to single-cell RNA-seq data on two example data sets. First, we show how to delineate novel marker genes with Association Plots, using Peripheral Blood Mononuclear Cell data as an example. Second, we show how to annotate cell clusters with known cell types using Association Plots and a predefined list of marker genes, based on data from the human cell atlas of fetal gene expression. We also compare the results from Association Plots with methods for deriving differentially expressed genes, and we show the integration of APL with Gene Ontology enrichment.
Keywords: Association Plot; correspondence analysis; gene expression; marker genes; single-cell data.
Copyright © 2022 The Author(s). Published by Elsevier Ltd. All rights reserved.