Sensitive Identification of Known and Unknown Protease Activities by Unsupervised Linear Motif Deconvolution

Anal Chem. 2022 Feb 1;94(4):2244-2254. doi: 10.1021/acs.analchem.1c04937. Epub 2022 Jan 14.

Abstract

The cleavage-site specificities for many proteases are not well understood, restricting the utility of supervised classification methods. We present an algorithm and web interface to overcome this limitation through the unsupervised detection of overrepresented patterns in protein sequence data, providing insight into the mixture of protease activities contributing to a complex system. Here, we apply the RObust LInear Motif Deconvolution (RoLiM) algorithm to confidently detect substrate cleavage patterns for SARS-CoV-2 MPro protease in the N-terminome data of an infected human cell line. Using mass spectrometry-based peptide data from a case-control comparison of 341 primary urothelial bladder cancer cases and 110 controls, we identified distinct sequence motifs indicative of increased matrix metallopeptidase activity in urine from cancer patients. The evaluation of N-terminal peptides from patient plasma post-chemotherapy detected novel granzyme B/corin activity. RoLiM will enhance the unbiased investigation of peptide sequences to establish the composition of known and uncharacterized protease activities in biological systems. RoLiM is available at http://langelab.org/rolim/.
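
As a rough illustration of what "unsupervised detection of overrepresented patterns" can mean for aligned cleavage-site windows, the sketch below flags residues that occur at a given position more often than a background frequency would predict, using a one-sided binomial test. It is not the RoLiM algorithm, whose details are not given in this abstract. The function names, the uniform 1/20 fallback background frequency, and the significance threshold are all assumptions made for illustration.

```python
from collections import Counter
from math import comb


def binom_sf(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def enriched_residues(windows, background, alpha=0.01):
    """Find (position, residue, count, p_value) tuples that look overrepresented.

    windows    -- equal-length peptide strings aligned on the cleavage site
    background -- dict mapping residue -> expected frequency (assumed input)
    alpha      -- illustrative significance cutoff, with no multiple-testing
                  correction applied
    """
    n = len(windows)
    width = len(windows[0])
    hits = []
    for pos in range(width):
        # Count how often each residue occurs at this aligned position.
        counts = Counter(w[pos] for w in windows)
        for residue, k in counts.items():
            # Fallback of 1/20 is an assumption for residues missing from background.
            p_value = binom_sf(k, n, background.get(residue, 0.05))
            if p_value < alpha:
                hits.append((pos, residue, k, p_value))
    # Most significant position/residue pairs first.
    return sorted(hits, key=lambda h: h[3])


if __name__ == "__main__":
    # Toy windows (hypothetical data), eight residues wide with the
    # cleavage site between index 3 and index 4.
    toy = ["AVLQSGFR", "TSLQAKEN", "KTLQGDPW", "EALQSMHI", "PRLQNVCY", "GILQATKD"]
    uniform = {r: 0.05 for r in "ACDEFGHIKLMNPQRSTVWY"}
    for hit in enriched_residues(toy, uniform):
        print(hit)
```

A real analysis would need a background estimated from the relevant proteome, a correction for multiple testing, and a way to separate co-occurring patterns from a mixture of proteases. This sketch attempts none of those.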

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • COVID-19
  • Coronavirus 3C Proteases / metabolism*
  • Humans
  • Proteolysis
  • SARS-CoV-2 / enzymology*
  • Substrate Specificity

Substances

  • 3C-like proteinase, SARS-CoV-2
  • Coronavirus 3C Proteases