Comparing spatial maps of human population-genetic variation using Procrustes analysis

Stat Appl Genet Mol Biol. 2010;9(1):Article 13. doi: 10.2202/1544-6115.1493. Epub 2010 Jan 27.

Abstract

Recent applications of principal components analysis (PCA) and multidimensional scaling (MDS) in human population genetics have found that "statistical maps" based on the genotypes in population-genetic samples often resemble geographic maps of the underlying sampling locations. To provide formal tests of these qualitative observations, we describe a Procrustes analysis approach for quantitatively assessing the similarity of population-genetic and geographic maps. We confirm in two scenarios, one using single-nucleotide polymorphism (SNP) data from Europe and one using SNP data worldwide, that a measurably high level of concordance exists between statistical maps of population-genetic variation and geographic maps of sampling locations. Two other examples illustrate the versatility of the Procrustes approach in population-genetic applications, verifying the concordance of SNP analyses using PCA and MDS, and showing that statistical maps of worldwide copy-number variants (CNVs) accord with statistical maps of SNP variation, especially when CNV analysis is limited to samples with the highest-quality data. As statistical maps with PCA and MDS have become increasingly common for use in summarizing population relationships, our examples highlight the potential of Procrustes-based quantitative comparisons for interpreting the results in these maps.
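The abstract does not spell out the computation, so the following is only a minimal sketch of how a Procrustes comparison of two two-dimensional maps might be set up, not the authors' implementation. It assumes each map is an n-by-2 array with rows in the same sample order, allows translation, uniform scaling, rotation and reflection, and reports a similarity score between 0 and 1 (the sum of singular values of the cross-product of the centered, unit-norm configurations). The function names, the permutation test, and its parameters are illustrative assumptions.

```python
import numpy as np


def procrustes_similarity(X, Y):
    """Similarity in [0, 1] between two point configurations after the
    best-fitting translation, scaling, and rotation/reflection.

    Assumption: X and Y are (n, k) arrays whose rows refer to the same
    samples in the same order (e.g. geographic vs. PCA coordinates).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Remove translation by centering each configuration.
    X0 = X - X.mean(axis=0)
    Y0 = Y - Y.mean(axis=0)
    # Remove scale by normalizing to unit Frobenius norm.
    X0 = X0 / np.linalg.norm(X0)
    Y0 = Y0 / np.linalg.norm(Y0)
    # The optimal orthogonal alignment is determined by the SVD of X0^T Y0;
    # the sum of singular values equals sqrt(1 - D), where D is the
    # minimized sum of squared distances between the aligned maps.
    s = np.linalg.svd(X0.T @ Y0, compute_uv=False).sum()
    return float(min(s, 1.0))  # guard against rounding slightly above 1


def permutation_p_value(X, Y, n_perm=199, seed=0):
    """Hypothetical permutation test: shuffle the rows of Y to break the
    sample correspondence and count how often the similarity is at least
    as large as the observed one."""
    Y = np.asarray(Y, dtype=float)
    rng = np.random.default_rng(seed)
    observed = procrustes_similarity(X, Y)
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(Y))
        if procrustes_similarity(X, Y[perm]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic data only: a "genetic" map built as a rotated, rescaled,
    # shifted and noisy copy of a random "geographic" map.
    rng = np.random.default_rng(42)
    geo = rng.normal(size=(40, 2))
    theta = np.pi / 5
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    gen = 0.3 * geo @ rot + 5.0 + rng.normal(scale=0.02, size=geo.shape)
    score, p = permutation_p_value(geo, gen)
    print(f"similarity = {score:.3f}, permutation p = {p:.3f}")
```

Because both maps are centered and rescaled before alignment, the score does not depend on the units or orientation of either map, which is what makes it usable for comparing PCA or MDS coordinates with latitude and longitude.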

Publication types

  • Comparative Study
  • Research Support, N.I.H., Extramural
  • Research Support, N.I.H., Intramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biostatistics
  • DNA Copy Number Variations
  • Europe
  • Genetic Variation*
  • Genetics, Population / statistics & numerical data*
  • Humans
  • Models, Genetic
  • Models, Statistical
  • Multivariate Analysis
  • Polymorphism, Single Nucleotide
  • Principal Component Analysis