Pathogen exposure misclassification can bias association signals in GWAS of infectious diseases when using population-based common control subjects

Am J Hum Genet. 2023 Feb 2;110(2):336-348. doi: 10.1016/j.ajhg.2022.12.013. Epub 2023 Jan 16.

Abstract

Genome-wide association studies (GWASs) have been performed to identify host genetic factors for a range of phenotypes, including infectious diseases. Population-based common control subjects from biobanks and extensive consortia are a valuable resource for increasing sample sizes in the identification of associated loci with minimal additional expense. Non-differential misclassification of the outcome has been reported when the control subjects are not well characterized, which often attenuates the true effect size. However, for infectious diseases, the comparison of affected subjects to population-based common control subjects regardless of pathogen exposure can also result in selection bias. Through simulated comparisons of pathogen-exposed cases and population-based common control subjects, we demonstrate that not accounting for pathogen exposure can result in biased effect estimates and spurious genome-wide significant signals. Further, the observed association can be distorted depending upon the strength of the association between a locus and pathogen exposure and on the prevalence of pathogen exposure. We also used a real data example from the hepatitis C virus (HCV) genetic consortium comparing HCV spontaneous clearance to persistent infection with both well-characterized control subjects and population-based common control subjects from the UK Biobank. We find biased effect estimates for known HCV clearance-associated loci and potentially spurious HCV clearance associations. These findings suggest that the choice of control subjects is especially important for infectious diseases or outcomes that are conditional upon environmental exposures.
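The mechanism described in the abstract can be illustrated with a minimal simulation sketch. This is not the authors' simulation design; the allele frequency, exposure prevalence, effect sizes, sample size, and all function names below are assumptions chosen for illustration. A single variant is simulated that raises the odds of pathogen exposure but has no effect on the outcome among exposed individuals. Cases are then compared against exposed control subjects and against population-based control subjects drawn regardless of exposure.

```python
import math
import random


def expit(x):
    """Inverse logit: map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def allelic_odds_ratio(case_genotypes, control_genotypes):
    """Per-allele odds ratio from genotypes coded as alternate-allele counts (0/1/2)."""
    case_alt = sum(case_genotypes)
    case_ref = 2 * len(case_genotypes) - case_alt
    ctrl_alt = sum(control_genotypes)
    ctrl_ref = 2 * len(control_genotypes) - ctrl_alt
    return (case_alt / case_ref) / (ctrl_alt / ctrl_ref)


def simulate(n=200_000, maf=0.3, exposure_or=1.5, outcome_or=1.0,
             base_exposure=0.10, base_outcome=0.30, seed=1):
    """Return (OR vs exposed controls, OR vs population controls).

    All parameter values are illustrative assumptions:
      exposure_or -- per-allele odds ratio of the variant on pathogen exposure
      outcome_or  -- per-allele odds ratio on the outcome among the exposed
                     (1.0 means the variant truly has no effect on the outcome)
    """
    rng = random.Random(seed)
    a = math.log(base_exposure / (1 - base_exposure))  # exposure intercept
    c = math.log(base_outcome / (1 - base_outcome))    # outcome intercept
    b_exp = math.log(exposure_or)
    b_out = math.log(outcome_or)

    cases, exposed_controls, population_controls = [], [], []
    for _ in range(n):
        # Genotype: two independent allele draws (Hardy-Weinberg proportions).
        g = (rng.random() < maf) + (rng.random() < maf)
        exposed = rng.random() < expit(a + b_exp * g)
        # The outcome can only occur in exposed individuals.
        outcome = exposed and rng.random() < expit(c + b_out * g)
        if outcome:
            cases.append(g)
        else:
            # Population controls: every non-case, regardless of exposure.
            population_controls.append(g)
            # Well-characterized controls: exposed individuals without the outcome.
            if exposed:
                exposed_controls.append(g)

    return (allelic_odds_ratio(cases, exposed_controls),
            allelic_odds_ratio(cases, population_controls))


or_exposed, or_population = simulate()
print(f"OR vs exposed controls:    {or_exposed:.2f}")
print(f"OR vs population controls: {or_population:.2f}")
```

Under these assumed parameters, the odds ratio against exposed control subjects should fall close to 1, reflecting the absence of a true effect on the outcome, while the odds ratio against population-based control subjects should sit clearly above 1 because it absorbs the variant's association with exposure. Varying `exposure_or` and `base_exposure` gives a rough sense of how the distortion depends on the strength of the locus-exposure association and on exposure prevalence.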

Keywords: GWAS; common controls; genetic epidemiology; infectious disease; misclassification bias; population-based controls.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Communicable Diseases* / genetics
  • Genome-Wide Association Study
  • Hepacivirus
  • Hepatitis C* / genetics
  • Humans
  • Phenotype