Dynamic spectrum quality assessment and iterative computational analysis of shotgun proteomic data: toward more efficient identification of post-translational modifications, sequence polymorphisms, and novel peptides

Alexey I Nesvizhskii; Franz F Roos; Jonas Grossmann; Mathijs Vogelzang; James S Eddes; Wilhelm Gruissem; Sacha Baginsky; Ruedi Aebersold

doi:10.1074/mcp.M500319-MCP200

Mol Cell Proteomics. 2006 Apr;5(4):652-70. doi: 10.1074/mcp.M500319-MCP200. Epub 2005 Dec 12.

Authors

Alexey I Nesvizhskii¹, Franz F Roos, Jonas Grossmann, Mathijs Vogelzang, James S Eddes, Wilhelm Gruissem, Sacha Baginsky, Ruedi Aebersold

Affiliation

¹ Institute for Systems Biology, Seattle, Washington 98103, USA. nsvi@med.umich.edu

PMID: 16352522

Abstract

In mass spectrometry-based proteomics, frequently hundreds of thousands of MS/MS spectra are collected in a single experiment. Of these, a relatively small fraction is confidently assigned to peptide sequences, whereas the majority of the spectra are not further analyzed. Spectra are not assigned to peptides for diverse reasons. These include deficiencies of the scoring schemes implemented in the database search tools, sequence variations (e.g. single nucleotide polymorphisms) or omissions in the database searched, post-translational or chemical modifications of the peptide analyzed, or the observation of sequences that are not anticipated from the genomic sequence (e.g. splice forms, somatic rearrangement, and processed proteins). To increase the amount of information that can be extracted from proteomic MS/MS datasets we developed a robust method that detects high quality spectra within the fraction of spectra unassigned by conventional sequence database searching and computes a quality score for each spectrum. We also demonstrate that iterative search strategies applied to such detected unassigned high quality spectra significantly increase the number of spectra that can be assigned from datasets and that biologically interesting new insights can be gained from existing data.
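
The workflow described in the abstract (keep only high quality spectra left unassigned by the first search, then run further searches on them) can be illustrated with a minimal sketch. It is not the authors' implementation: the `Spectrum` class, the `iterative_search` function, the `min_quality` threshold, and the idea of a search pass as a plain callable are all hypothetical names chosen for illustration, and the quality score is assumed to be precomputed rather than derived as in the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class Spectrum:
    """Hypothetical record for one MS/MS spectrum."""
    spectrum_id: str
    quality: float                 # assumed precomputed quality score
    peptide: Optional[str] = None  # None means unassigned by the first search


# Hypothetical interface: a search pass returns a peptide string or None.
SearchPass = Callable[[Spectrum], Optional[str]]


def iterative_search(
    spectra: Iterable[Spectrum],
    passes: List[SearchPass],
    min_quality: float = 0.5,      # illustrative threshold, not from the paper
) -> Dict[str, str]:
    """Run successive search passes over unassigned, high quality spectra.

    A spectrum assigned in one pass is not searched again in later passes.
    Returns a mapping from spectrum id to the newly assigned peptide.
    """
    assignments: Dict[str, str] = {}
    # Keep only spectra that are unassigned and meet the quality threshold.
    pending = [
        s for s in spectra
        if s.peptide is None and s.quality >= min_quality
    ]
    for search in passes:
        still_pending = []
        for s in pending:
            hit = search(s)
            if hit is None:
                still_pending.append(s)
            else:
                assignments[s.spectrum_id] = hit
        pending = still_pending
    return assignments
```

In such a sketch each pass would stand in for a differently configured search, for example one that allows modifications and another that uses a larger sequence database; the ordering of the passes is a design choice left to the caller.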

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Alternative Splicing
Amino Acid Sequence
Mass Spectrometry
Molecular Sequence Data
Peptides / metabolism*
Polymorphism, Genetic*
Protein Processing, Post-Translational*
Proteomics*

Substances

Peptides

Grants and funding

N01-HV-28179/HV/NHLBI NIH HHS/United States