A model-based approach for analysis of spatial structure in genetic data

Nat Genet. 2012 May 20;44(6):725-31. doi: 10.1038/ng.2285.

Abstract

Characterizing genetic diversity within and between populations has broad applications in studies of human disease and evolution. We propose a new approach, spatial ancestry analysis (SPA), for modeling genotypes in two- or three-dimensional space. In SPA, we explicitly model the spatial distribution of each SNP by assigning its allele frequency as a continuous function of geographic location. We show that this explicit modeling of allele frequency allows individuals to be localized on the map on the basis of their genetic information alone. We apply SPA to a European and a worldwide population genetic variation data set, identify SNPs showing large gradients in allele frequency, and suggest these as candidate regions under selection. These regions include SNPs in the well-characterized LCT region, as well as at loci including FOXP2, OCA2 and LRP1B.
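
The idea of treating each SNP's allele frequency as a continuous function of location, and then placing an individual on the map from genotypes alone, can be illustrated with a minimal sketch. The abstract does not state the functional form, so the logistic surface below is an assumption, as are the binomial genotype model, the coarse grid search, and every name used (`allele_frequency`, `log_likelihood`, `localize`, the per-SNP parameters `a` and `b`). It is an illustration under those assumptions, not the published implementation.

```python
import math

def allele_frequency(x, y, a, b):
    """Assumed logistic surface: f(x, y) = 1 / (1 + exp(-(a . (x, y) + b))).

    a is a hypothetical 2-vector giving the direction and steepness of the
    frequency gradient; b is a hypothetical offset.
    """
    z = a[0] * x + a[1] * y + b
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(genotypes, snps, x, y):
    """Log-likelihood of genotypes (allele counts 0, 1, or 2) at location (x, y).

    Assumes each genotype is Binomial(2, f) with SNPs independent. The
    binomial coefficient is dropped because it does not depend on location.
    """
    total = 0.0
    for g, (a, b) in zip(genotypes, snps):
        f = allele_frequency(x, y, a, b)
        f = min(max(f, 1e-9), 1.0 - 1e-9)  # keep log() finite
        total += g * math.log(f) + (2 - g) * math.log(1.0 - f)
    return total

def localize(genotypes, snps, grid):
    """Return the grid point with the highest log-likelihood."""
    return max(grid, key=lambda p: log_likelihood(genotypes, snps, p[0], p[1]))

# Hypothetical example: two SNPs with perpendicular frequency gradients.
snps = [((1.0, 0.0), 0.0), ((0.0, 1.0), 0.0)]
grid = [(i * 0.5, j * 0.5) for i in range(-4, 5) for j in range(-4, 5)]
estimate = localize([2, 0], snps, grid)
```

In this sketch, the magnitude of `a` measures how steeply a SNP's frequency changes across space, which is the kind of quantity one would rank to find SNPs with large allele-frequency gradients. A real analysis would fit `a` and `b` from data and use a continuous optimizer over many SNPs rather than a coarse grid.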

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Demography*
  • Gene Frequency
  • Genetic Variation*
  • Genetics, Population
  • Genotype
  • Humans
  • Models, Genetic*
  • Polymorphism, Single Nucleotide
  • Selection, Genetic
  • White People / genetics