Fine-scaled human genetic structure revealed by SNP microarrays

Genome Res. 2009 May;19(5):815-25. doi: 10.1101/gr.085589.108.

Abstract

We report an analysis of more than 240,000 loci genotyped using the Affymetrix SNP microarray in 554 individuals from 27 worldwide populations in Africa, Asia, and Europe. To provide a more extensive and complete sampling of human genetic variation, we have included caste and tribal samples from two states in South India, Daghestanis from eastern Europe, and the Iban from Malaysia. Consistent with observations made by Charles Darwin, our results highlight shared variation among human populations and demonstrate that much genetic variation is geographically continuous. At the same time, principal components analyses reveal discernible genetic differentiation among almost all identified populations in our sample, and in most cases, individuals can be clearly assigned to defined populations on the basis of SNP genotypes. All individuals are accurately classified into continental groups using a model-based clustering algorithm, but between closely related populations, genetic and self-classifications conflict for some individuals. The 250K data permitted high-level resolution of genetic variation among Indian caste and tribal populations and between highland and lowland Daghestani populations. In particular, upper-caste individuals from Tamil Nadu and Andhra Pradesh form one defined group, lower-caste individuals from these two states form another, and the tribal Irula samples form a third. Our results emphasize the correlation of genetic and geographic distances and highlight other elements, including social factors, that have contributed to population structure.
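
As a rough illustration of the kind of principal components analysis the abstract refers to, the sketch below projects individuals onto the leading components of a SNP genotype matrix. It is not the authors' pipeline: the data are simulated, the function names `genotype_pca` and `simulate_two_populations` are hypothetical, and the allele-frequency standardization is one common choice assumed here for illustration.

```python
import numpy as np

def genotype_pca(genotypes, n_components=2):
    """Project individuals onto the top principal components.

    genotypes: (individuals x SNPs) array of minor-allele counts (0, 1, 2).
    The scaling by sqrt(2p(1-p)) is an assumed, commonly used normalization.
    """
    g = np.asarray(genotypes, dtype=float)
    freq = g.mean(axis=0) / 2.0            # estimated allele frequency per SNP
    keep = (freq > 0.0) & (freq < 1.0)     # drop monomorphic SNPs
    g, freq = g[:, keep], freq[keep]
    z = (g - 2.0 * freq) / np.sqrt(2.0 * freq * (1.0 - freq))
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def simulate_two_populations(n_per_pop=50, n_snps=2000, shift=0.15, seed=0):
    """Simulate two populations whose allele frequencies differ by `shift`."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(0.2, 0.8, size=n_snps)
    delta = rng.choice([-shift, shift], size=n_snps)
    freqs = [base + delta / 2.0, base - delta / 2.0]
    genotypes = np.vstack(
        [rng.binomial(2, f, size=(n_per_pop, n_snps)) for f in freqs]
    )
    labels = np.repeat([0, 1], n_per_pop)
    return genotypes, labels

# Simulated data only; none of these values come from the study.
genotypes, labels = simulate_two_populations()
pcs = genotype_pca(genotypes)
```

In this toy setting the first component is expected to separate the two simulated groups. Real populations that are closely related or admixed would overlap more, which matches the abstract's note that genetic and self-classifications conflict for some individuals.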

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Africa
  • Asia
  • Ethnicity / genetics
  • Europe, Eastern
  • Genetic Structures
  • Genetic Variation*
  • Genetics, Population
  • Genome, Human
  • Genotype
  • Humans
  • India
  • Oligonucleotide Array Sequence Analysis
  • Polymorphism, Single Nucleotide / genetics*
  • Population Dynamics