Computational characterization of multiple histidine (His) post-translational modifications (PTMs) at enzyme active sites complements tedious experimental characterization in proteins of unknown function (PUFs) and domains of unknown function (DUFs). Only a handful of histidine PTM prediction tools exist, and each annotates only a single function. Here, we address this problem by applying artificial neural networks to a functional histidine dataset curated from enzyme (protein) sequences available in the UniProt database (sample size n = 1584). The convolutional neural network (CNN) model ('Hist-i-fy') performed best, with an overall accuracy and F1-score of 75%. A case study was performed on histidine phosphorylation sites (n = 34) obtained from mass spectrometry data. We report, for the first time, a prediction tool covering multiple His PTMs (https://histify.streamlit.app/ and https://github.com/dibyansu24-maker/Histify), with optimal performance. The inputs to the tool are (i) a protein sequence containing histidine and (ii) the histidine residue number. The prediction output is one of eight histidine functions: acetylation, ribosylation, glycosylation, hydroxylation, methylation, oxidation, phosphorylation, and protein splicing.
Communicated by Ramaswamy H. Sarma.
Keywords: Convolutional neural network (CNN); histidine post-translational modifications (PTMs); UniProt database; protein sequence.
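
The two tool inputs named in the abstract, a protein sequence and a histidine residue number, can be illustrated with a minimal Python sketch. This is not the Hist-i-fy implementation. The function names, the 1-based residue numbering, the flank size of 10, the 'X' padding, and the one-hot encoding are all assumptions chosen for illustration. Only the list of eight output functions is taken from the abstract.

```python
# Hypothetical sketch: none of these names come from the Hist-i-fy code base.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, one-letter codes

# The eight histidine functions listed in the abstract.
HIS_FUNCTIONS = [
    "acetylation", "ribosylation", "glycosylation", "hydroxylation",
    "methylation", "oxidation", "phosphorylation", "protein splicing",
]


def extract_window(sequence, residue_number, flank=10):
    """Return the residues around a histidine, padded with 'X' at the ends.

    residue_number is assumed to be 1-based; flank is an assumed window size.
    """
    seq = sequence.strip().upper()
    if not 1 <= residue_number <= len(seq):
        raise ValueError("residue number is outside the sequence")
    center = residue_number - 1
    if seq[center] != "H":
        raise ValueError(
            f"residue {residue_number} is {seq[center]}, not histidine (H)"
        )
    left = seq[max(0, center - flank):center]
    right = seq[center + 1:center + 1 + flank]
    # Pad so that every window has length 2 * flank + 1.
    return "X" * (flank - len(left)) + left + "H" + right + "X" * (flank - len(right))


def one_hot(window):
    """Encode each residue as a 20-dimensional indicator vector.

    Padding ('X') and any non-standard residue map to an all-zero vector.
    """
    return [[1 if aa == ref else 0 for ref in AMINO_ACIDS] for aa in window]
```

A fixed-length encoded window of this kind is one common way to present a sequence position to a convolutional model; a classifier would then assign the window to one of the eight entries in `HIS_FUNCTIONS`.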