Quantifying negative selection in human 3' UTRs uncovers constrained targets of RNA-binding proteins

Nat Commun. 2024 Jan 2;15(1):85. doi: 10.1038/s41467-023-44456-9.

Abstract

Many non-coding variants associated with phenotypes occur in 3' untranslated regions (3' UTRs) and may affect interactions with RNA-binding proteins (RBPs) to regulate gene expression post-transcriptionally. However, identifying functional 3' UTR variants has proven difficult. We use allele frequencies from the Genome Aggregation Database (gnomAD) to identify classes of 3' UTR variants under strong negative selection in humans. We develop intergenic mutability-adjusted proportion singleton (iMAPS), a generalized measure related to MAPS, to quantify negative selection in non-coding regions. This approach, in conjunction with in vitro and in vivo binding data, identifies precise RBP binding sites, miRNA target sites, and polyadenylation signals (PASs) under strong selection. For each class of sites, we identify thousands of gnomAD variants under selection comparable to that on missense coding variants, and find that sites in core 3' UTR regions upstream of the most-used PAS are under the strongest selection. Together, this work improves our understanding of selection on human genes and validates approaches for interpreting genetic variants in human 3' UTRs.
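
The abstract does not spell out how iMAPS is computed. As a rough illustration only, the sketch below shows the general shape of a MAPS-style statistic: the observed proportion of singletons in a variant class, minus the proportion expected from mutability alone. It assumes the expectation comes from a count-weighted linear fit on a presumed-neutral calibration set (intergenic variants, as the name iMAPS suggests). The function names, the linear model, and the toy numbers are all hypothetical and are not taken from the paper.

```python
def fit_singleton_model(calibration):
    """Fit proportion_singleton ~ mutability by weighted least squares.

    calibration: list of (mutability, n_variants, n_singletons) tuples,
    one per mutational context, from a presumed-neutral set (assumption:
    intergenic variants). Each context is weighted by its variant count.
    Returns (slope, intercept).
    """
    weights = [n for _, n, _ in calibration]
    xs = [m for m, _, _ in calibration]
    ys = [s / n for _, n, s in calibration]
    total = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / total
    y_bar = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def maps_score(variants, slope, intercept):
    """Observed minus expected proportion of singletons for one variant class.

    variants: list of (mutability, is_singleton) tuples.
    A higher score means more singletons than mutability predicts,
    which is read as stronger negative selection.
    """
    n = len(variants)
    observed = sum(1 for _, is_singleton in variants if is_singleton) / n
    expected = sum(intercept + slope * m for m, _ in variants) / n
    return observed - expected


# Toy calibration data (hypothetical): more mutable contexts have fewer singletons.
calibration = [(0.1, 100, 60), (0.2, 100, 50), (0.3, 100, 40)]
slope, intercept = fit_singleton_model(calibration)

# Toy variant class (hypothetical): 3 of 4 variants are singletons.
site_variants = [(0.1, True), (0.1, True), (0.3, True), (0.3, False)]
score = maps_score(site_variants, slope, intercept)
print(round(score, 3))  # approximately 0.25
```

In this toy case the class has 75% singletons against an expected 50%, giving a score of about 0.25. Real analyses would use per-context mutation rate estimates and far larger variant counts.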

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • 3' Untranslated Regions / genetics
  • Binding Sites / genetics
  • Humans
  • MicroRNAs* / genetics
  • MicroRNAs* / metabolism
  • Polyadenylation
  • RNA-Binding Proteins / genetics
  • RNA-Binding Proteins / metabolism

Substances

  • 3' Untranslated Regions
  • MicroRNAs
  • RNA-Binding Proteins