Improved and Interpretable Prediction of Cytochrome P450-Mediated Metabolism by Molecule-Level Graph Modeling and Subgraph Information Bottlenecks

J Chem Inf Model. 2024 Dec 23;64(24):9487-9500. doi: 10.1021/acs.jcim.4c01632. Epub 2024 Nov 27.

Abstract

Accurately identifying sites of metabolism (SoM) mediated by cytochrome P450 (CYP) enzymes, which are responsible for drug metabolism in the body, is critical in the early stages of drug discovery and development. Current computational methods for CYP-mediated SoM prediction face several challenges, including the limitations of traditional machine learning models that operate at the atomic level, heavy reliance on complex feature engineering, and a lack of interpretability relevant to medicinal chemistry. Here, we propose GraphCySoM, a novel molecule-level modeling approach based on graph neural networks that uses lightweight features and interpretable annotations on substructures to predict CYP-mediated SoM effectively and interpretably. In contrast to approaches that depend on computationally expensive atomic descriptors derived from resource-intensive chemistry or even quantum chemistry calculations, we emphasize that graph-based molecular modeling initialized solely with lightweight features enables the adaptive learning of molecular topology through message-passing mechanisms combined with various aggregation kernels. Extensive ablation experiments demonstrate that GraphCySoM significantly outperforms baseline models and achieves superior performance compared with competing methods while exhibiting advantages in computational efficiency. Moreover, an attention mechanism and subgraph information bottlenecks are incorporated to analyze node importance and feature significance, which enables the mining of substructures associated with SoM. To the best of our knowledge, this is the first comprehensive study of CYP-mediated SoM prediction using molecule-level modeling and interpretability techniques. Our method achieves new state-of-the-art performance and provides potential insights into the molecular and pharmacological mechanisms underlying drug metabolism catalyzed by CYP enzymes. All source files and trained models are freely available at https://github.com/liyigerry/GraphCySoM.
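
The message-passing idea mentioned in the abstract can be illustrated with a toy example. The sketch below is not the GraphCySoM implementation. The feature choice (atomic number and heavy-atom degree), the ethanol graph, the additive update rule, and all function names are assumptions made purely for illustration. It shows one round of message passing in which each atom combines its own lightweight features with an aggregate of its neighbors' features, using three common aggregation kernels (sum, mean, max).

```python
# Illustrative sketch only; every name and the update rule are hypothetical,
# not taken from the GraphCySoM code base.
from typing import Callable, Dict, List

Vector = List[float]


def agg_sum(msgs: List[Vector]) -> Vector:
    """Element-wise sum of neighbor feature vectors."""
    return [sum(col) for col in zip(*msgs)]


def agg_mean(msgs: List[Vector]) -> Vector:
    """Element-wise mean of neighbor feature vectors."""
    return [sum(col) / len(msgs) for col in zip(*msgs)]


def agg_max(msgs: List[Vector]) -> Vector:
    """Element-wise maximum of neighbor feature vectors."""
    return [max(col) for col in zip(*msgs)]


def message_pass(
    features: Dict[int, Vector],
    neighbors: Dict[int, List[int]],
    aggregate: Callable[[List[Vector]], Vector],
) -> Dict[int, Vector]:
    """One message-passing round: new feature = own feature + aggregated neighbors."""
    updated: Dict[int, Vector] = {}
    for atom, feat in features.items():
        msgs = [features[n] for n in neighbors[atom]]
        # Isolated atoms receive a zero message.
        pooled = aggregate(msgs) if msgs else [0.0] * len(feat)
        updated[atom] = [own + msg for own, msg in zip(feat, pooled)]
    return updated


# Heavy-atom graph of ethanol: C0 - C1 - O2.
# Assumed lightweight features per atom: [atomic number, heavy-atom degree].
ETHANOL_FEATURES: Dict[int, Vector] = {0: [6.0, 1.0], 1: [6.0, 2.0], 2: [8.0, 1.0]}
ETHANOL_NEIGHBORS: Dict[int, List[int]] = {0: [1], 1: [0, 2], 2: [1]}

if __name__ == "__main__":
    kernels = (("sum", agg_sum), ("mean", agg_mean), ("max", agg_max))
    for name, kernel in kernels:
        print(name, message_pass(ETHANOL_FEATURES, ETHANOL_NEIGHBORS, kernel))
```

In a trained graph neural network the update would involve learned weights and nonlinearities rather than a plain addition, and several rounds would be stacked so that each atom's representation reflects a progressively larger neighborhood. The different kernels yield different representations for the same atom, which is one reason a model might combine several of them.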

MeSH terms

  • Cytochrome P-450 Enzyme System* / chemistry
  • Cytochrome P-450 Enzyme System* / metabolism
  • Drug Discovery / methods
  • Humans
  • Models, Molecular
  • Neural Networks, Computer

Substances

  • Cytochrome P-450 Enzyme System