Significance estimation for sequence-based chemical similarity searching (PhAST) and application to AuroraA kinase inhibitors

Future Med Chem. 2012 Oct;4(15):1897-906. doi: 10.4155/fmc.12.148.

Abstract

Background: Chemical similarity searching allows the retrieval of preferred screening molecules from a compound database. Candidates are ranked according to their similarity to a reference compound (query). Assessing the statistical significance of chemical similarity scores helps to prioritize significant hits and to identify cases where the database does not contain any promising compounds.

Method: Our text-based similarity measure, Pharmacophore Alignment Search Tool (PhAST), employs pair-wise sequence alignment. We adapted the concept of E-values as significance estimates and employed a sampling technique that incorporates the principle of importance sampling in a Markov chain Monte Carlo simulation to generate distributions of random alignment scores. These distributions were used to compute significance estimates for similarity scores in a preliminary prospective virtual screen for inhibitors of Aurora A kinase.
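
The idea of turning an alignment score into a significance estimate can be illustrated with a minimal sketch. It is not the PhAST implementation: the symbol strings, the scoring parameters, and the function names `alignment_score` and `empirical_e_value` are assumptions made for illustration, and plain Monte Carlo shuffling stands in for the importance-sampling Markov chain Monte Carlo scheme described above. The E-value is taken here as the empirical p-value multiplied by the database size.

```python
import random


def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global (Needleman-Wunsch) alignment score with a linear gap penalty.

    The scoring parameters are illustrative, not those used by PhAST.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def empirical_e_value(query, hit, database_size, n_samples=1000, seed=0):
    """Estimate an E-value for the query/hit score from shuffled hits.

    Random sequences are produced by shuffling the hit's symbols (plain
    Monte Carlo, an assumption of this sketch). Returns (score, e_value).
    """
    rng = random.Random(seed)
    observed = alignment_score(query, hit)
    symbols = list(hit)
    exceed = 0
    for _ in range(n_samples):
        rng.shuffle(symbols)
        if alignment_score(query, "".join(symbols)) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (exceed + 1) / (n_samples + 1)
    return observed, p_value * database_size


if __name__ == "__main__":
    # Hypothetical symbol strings standing in for text-encoded molecules.
    score, e_value = empirical_e_value("ADHLRA", "ADHRLA", database_size=1000)
    print(score, e_value)
```

With plain shuffling, the estimated p-value cannot fall below 1/(n_samples + 1), so very rare high scores are resolved poorly. This limitation is one reason a sampling scheme that concentrates on the high-scoring tail, such as importance sampling, is attractive for this task.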

Conclusion: Assessing the significance of compound similarity computed with PhAST allows for a statistically motivated identification of candidate screening compounds. Inhibitors of Aurora A kinase were retrieved from a large compound library.

MeSH terms

  • Aurora Kinases
  • Combinatorial Chemistry Techniques
  • Databases, Chemical
  • Humans
  • Monte Carlo Method
  • Piperazines / chemistry
  • Protein Kinase Inhibitors / chemistry*
  • Protein Serine-Threonine Kinases / antagonists & inhibitors*
  • Protein Serine-Threonine Kinases / metabolism
  • Software

Substances

  • Piperazines
  • Protein Kinase Inhibitors
  • tozasertib
  • Aurora Kinases
  • Protein Serine-Threonine Kinases