SNPs, short tandem repeats, and structural variants are responsible for differential gene expression across C57BL/6 and C57BL/10 substrains

Cell Genom. 2022 Mar 9;2(3):100102. doi: 10.1016/j.xgen.2022.100102.

Abstract

Mouse substrains are invaluable models for understanding disease. We compared C57BL/6J, the most commonly used inbred mouse strain, with eight C57BL/6 (B6) and five C57BL/10 (B10) closely related inbred substrains. Whole-genome sequencing and RNA-sequencing analysis yielded 352,631 SNPs, 109,096 indels, 150,344 short tandem repeats (STRs), 3,425 structural variants (SVs), and 2,826 differentially expressed genes (DE genes) among these 14 strains; 312,981 SNPs (89%) distinguished the B6 and B10 lineages. These SNPs were clustered into 28 short segments that are likely due to introgressed haplotypes rather than new mutations. Outside of these introgressed regions, we identified 53 SVs, protein-truncating SNPs, and frameshifting indels that were associated with DE genes. Our results can be used for both forward and reverse genetic approaches and illustrate how introgression and mutational processes give rise to differences among these widely used inbred substrains.