In this study, we developed a machine-learning-aided protein design strategy for engineering Vitreoscilla hemoglobin (VHb) as a carbene transferase. A Natural Language Processing (NLP) model was used for the first time to construct an algorithm (EESP, enzyme enantioselectivity score predictor) that predicts the enantioselectivity of VHb. We identified critical amino acid residue sites by molecular docking and established a simplified mutant library by site-saturation mutagenesis. Based on this simplified mutant library, the trained EESP scored 160,000 virtual mutants, and 15 predicted high-score mutants were chosen for experimental validation. Among these mutants, VHb-WK (Y29W/P54K) showed the highest diastereoselectivity and enantioselectivity as a carbene transferase for olefin cyclopropanation under aqueous conditions. Subsequently, molecular dynamics simulations were performed to explore the interactions between the protein and its substrates. They indicated that the high enantioselectivity of VHb-WK stems from interactions among R47, Q53, and K84, which narrow the entrance of the enzyme's pocket and thereby restrict the formation of reaction intermediates. Integrating the NLP model with enzyme modification reduces the economic cost and workload of the protein engineering process.
© 2024 The Authors. Published by American Chemical Society.
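The virtual screening step described above (score every virtual mutant, then shortlist the top 15) can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in the abstract: 160,000 equals 20^4, which would be consistent with four fully randomized sites, but the number of sites is not given; only positions 29 and 54 are named (through Y29W/P54K), so the other two site labels are placeholders; and `toy_score` is an arbitrary stand-in, not the EESP model.

```python
import heapq
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues

# Positions 29 and 54 appear in the abstract (Y29W/P54K). The other two
# labels are placeholders: the abstract does not name the remaining sites,
# and a four-site library is itself an assumption (20**4 == 160,000).
SITES = ("29", "54", "site_c", "site_d")


def enumerate_variants(n_sites=len(SITES)):
    """Yield every combination of the 20 residues over n_sites positions."""
    for combo in product(AMINO_ACIDS, repeat=n_sites):
        yield "".join(combo)


def toy_score(variant):
    """Hypothetical stand-in for a trained predictor; NOT the EESP model."""
    # Arbitrary but deterministic, so the sketch is reproducible.
    weighted = sum((i + 1) * AMINO_ACIDS.index(aa) for i, aa in enumerate(variant))
    return weighted % 97


def top_candidates(score_fn, k=15):
    """Return the k highest-scoring variants as (score, variant) pairs."""
    scored = ((score_fn(v), v) for v in enumerate_variants())
    return heapq.nlargest(k, scored)  # sorted from highest to lowest score


library_size = sum(1 for _ in enumerate_variants())
shortlist = top_candidates(toy_score)
```

Here `library_size` counts the enumerated variants and `shortlist` holds the 15 candidates that would move on to experimental validation. Swapping `toy_score` for a real trained model is the only change the ranking step would need under these assumptions.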