E-pRSA: Embeddings Improve the Prediction of Residue Relative Solvent Accessibility in Protein Sequence

J Mol Biol. 2024 Sep 1;436(17):168494. doi: 10.1016/j.jmb.2024.168494. Epub 2024 Feb 15.

Abstract

Knowledge of the solvent accessibility of residues in a protein is essential for several applications, including the identification of interacting surfaces in protein-protein interactions and the characterization of variations. We describe E-pRSA, a novel web server that estimates the Relative Solvent Accessibility (RSA) values of residues directly from a protein sequence. The method exploits two complementary Protein Language Models to provide fast and accurate predictions. When benchmarked on different blind test sets, E-pRSA performs at the state of the art and outperforms DeepREx, a method we developed previously that relies on sequence profiles derived from Multiple Sequence Alignments. The E-pRSA web server is freely available at https://e-prsa.biocomp.unibo.it/main/, where users can submit single-sequence and batch jobs.
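
For readers unfamiliar with the quantity being predicted, RSA is conventionally defined as a residue's absolute accessible surface area (ASA) divided by a reference maximum ASA for that residue type. The sketch below illustrates only this general definition; it is not the E-pRSA method, which predicts RSA from sequence embeddings. The function name, the `MAX_ASA` table, and its values are assumptions made for illustration, since the abstract does not state which normalization scale is used.

```python
from typing import Dict, List

# Hypothetical placeholder reference maxima (in square angstroms) for a few
# residue types. These are illustrative assumptions, not values taken from
# the E-pRSA paper; substitute a published maximum-ASA scale in real use.
MAX_ASA: Dict[str, float] = {
    "A": 129.0,
    "G": 104.0,
    "L": 201.0,
    "K": 236.0,
}


def relative_solvent_accessibility(
    sequence: str,
    asa_values: List[float],
    max_asa: Dict[str, float] = MAX_ASA,
) -> List[float]:
    """Return per-residue RSA = ASA / max ASA, clipped to the range [0, 1]."""
    if len(sequence) != len(asa_values):
        raise ValueError("sequence and asa_values must have the same length")
    rsa = []
    for residue, asa in zip(sequence, asa_values):
        ratio = asa / max_asa[residue]
        # Observed ASA can slightly exceed the reference maximum, so clip.
        rsa.append(min(max(ratio, 0.0), 1.0))
    return rsa


if __name__ == "__main__":
    # Two residues: an alanine exposing half its reference area and a fully
    # exposed glycine.
    print(relative_solvent_accessibility("AG", [64.5, 104.0]))
```

Clipping to [0, 1] is a design choice made here so that values stay interpretable as fractions; other conventions leave ratios above 1 unclipped.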

Keywords: accessible surface area; deep learning; protein language models; relative surface area; web server.

MeSH terms

  • Amino Acid Sequence
  • Computational Biology / methods
  • Internet
  • Models, Molecular
  • Protein Conformation
  • Proteins* / chemistry
  • Proteins* / genetics
  • Sequence Alignment
  • Sequence Analysis, Protein / methods
  • Software*
  • Solvents* / chemistry

Substances

  • Solvents
  • Proteins