Machine learning guided aptamer refinement and discovery

Nat Commun. 2021 Apr 22;12(1):2366. doi: 10.1038/s41467-021-22555-9.

Abstract

Aptamers are single-stranded nucleic acid ligands that bind to target molecules with high affinity and specificity. They are typically discovered by searching large libraries for sequences with desirable binding properties. These libraries, however, are practically constrained to a fraction of the theoretical sequence space. Machine learning provides an opportunity to intelligently navigate this space to identify high-performing aptamers. Here, we propose an approach that employs particle display (PD) to partition a library of aptamers by affinity, and uses such data to train machine learning models to predict affinity in silico. Our model predicted high-affinity DNA aptamers from experimental candidates at a rate 11-fold higher than random perturbation and generated novel, high-affinity aptamers at a greater rate than observed by PD alone. Our approach also facilitated the design of truncated aptamers 70% shorter and with higher binding affinity (1.5 nM) than the best experimental candidate. This work demonstrates how combining machine learning and physical approaches can be used to expedite the discovery of better diagnostic and therapeutic agents.
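The workflow described in the abstract (partition sequences into affinity bins, train a model on those bins, then score new candidates in silico) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the abstract does not specify the model architecture, so a plain logistic regression stands in for it; the six-base sequences and their high/low labels are invented toy data, not particle display results; and the function names (`one_hot`, `train_logistic`, `predict`, `mutate`) are hypothetical.

```python
import math
import random

BASES = "ACGT"

def one_hot(seq):
    """Flatten a DNA sequence into a 4 * len(seq) one-hot vector."""
    vec = []
    for base in seq:
        vec.extend(1.0 if base == b else 0.0 for b in BASES)
    return vec

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, epochs=200, lr=0.5):
    """Per-example gradient descent on the logistic loss; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            g = p - yi  # gradient of the loss with respect to the logit
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(model, seq):
    """Predicted probability that seq belongs to the high-affinity bin."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, one_hot(seq))) + b)

def mutate(seq, rng):
    """Random single-base substitution (a random-perturbation baseline)."""
    i = rng.randrange(len(seq))
    alternatives = [b for b in BASES if b != seq[i]]
    return seq[:i] + rng.choice(alternatives) + seq[i + 1:]

# Invented toy data standing in for affinity-partitioned sequences.
high_bin = ["GGTTGG", "GGTAGG", "GGATGG", "GGTTGC"]
low_bin = ["ACACAC", "CACACA", "ATATAT", "TCTCTC"]

X = [one_hot(s) for s in high_bin + low_bin]
y = [1.0] * len(high_bin) + [0.0] * len(low_bin)
model = train_logistic(X, y)

# Generate perturbed candidates from a seed and rank them by predicted score.
rng = random.Random(0)
candidates = [mutate("GGTTGG", rng) for _ in range(20)]
ranked = sorted(candidates, key=lambda s: predict(model, s), reverse=True)
```

In this sketch the ranking step plays the role of in silico screening: only the top-scoring entries of `ranked` would be carried forward for experimental testing, rather than testing randomly perturbed sequences without any filter.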

MeSH terms

  • Aptamers, Nucleotide / chemistry
  • Aptamers, Nucleotide / genetics
  • Aptamers, Nucleotide / metabolism*
  • Computer Simulation
  • Drug Discovery / methods
  • Gene Library
  • Ligands
  • Lipocalin-2 / chemistry
  • Lipocalin-2 / genetics
  • Lipocalin-2 / metabolism
  • Machine Learning*
  • Models, Chemical
  • Protein Binding

Substances

  • Aptamers, Nucleotide
  • Ligands
  • Lipocalin-2