Fast and accurate non-sequential protein structure alignment using a new asymmetric linear sum assignment heuristic

Bioinformatics. 2016 Feb 1;32(3):370-7. doi: 10.1093/bioinformatics/btv580. Epub 2015 Oct 10.

Abstract

Motivation: The three-dimensional tertiary structure of a protein at near-atomic resolution provides insight into its function and evolution. Because a protein's structure determines its function, similarity in structure usually implies similarity in function. Structure alignment techniques are therefore often useful in classifying protein function. Given the rapid growth in the number of experimentally determined structures available from repositories such as the Protein Data Bank, fast and accurate computational structure comparison tools are required. This paper presents SPalignNS, a non-sequential protein structure alignment tool that uses a novel asymmetric greedy search technique.
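
To illustrate the general idea of a greedy heuristic for an asymmetric (rectangular) linear sum assignment problem, the following is a minimal Python sketch. It is not the SPalignNS algorithm, which the abstract does not describe in detail (and which is implemented in C++). The function names `greedy_assignment` and `residue_distance_matrix`, and the use of plain Euclidean distance between already-superposed coordinates as the pairing cost, are assumptions made for illustration only.

```python
import math


def residue_distance_matrix(coords_a, coords_b):
    """Pairwise Euclidean distances between two coordinate lists.

    Assumption: both structures are already superposed in a common frame.
    """
    return [[math.dist(a, b) for b in coords_b] for a in coords_a]


def greedy_assignment(cost):
    """Greedy heuristic for a rectangular linear sum assignment problem.

    cost[i][j] is the cost of pairing row i with column j. Candidate pairs
    are visited in order of increasing cost, and a pair is accepted only if
    neither its row nor its column has been used. The result is a valid
    one-to-one matching, but it is not guaranteed to be optimal.
    """
    n_rows = len(cost)
    n_cols = len(cost[0]) if cost else 0
    candidates = sorted(
        (cost[i][j], i, j) for i in range(n_rows) for j in range(n_cols)
    )
    used_rows, used_cols = set(), set()
    pairs = []
    for _, i, j in candidates:
        if i in used_rows or j in used_cols:
            continue
        pairs.append((i, j))
        used_rows.add(i)
        used_cols.add(j)
        if len(pairs) == min(n_rows, n_cols):
            break  # the smaller side is fully matched
    return sorted(pairs)


# Hypothetical toy coordinates: 2 residues versus 3 residues.
coords_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
coords_b = [(3.1, 0.0, 0.0), (0.2, 0.0, 0.0), (10.0, 0.0, 0.0)]
pairs = greedy_assignment(residue_distance_matrix(coords_a, coords_b))
print(pairs)
```

In this toy case the matched residues appear in reversed order along the two chains, which is the kind of pairing a sequence-order-constrained alignment cannot produce.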

Results: The performance of SPalignNS was evaluated against existing sequential and non-sequential structure alignment methods on commonly used benchmark datasets. The datasets used to gauge alignment accuracy were (i) 9538 pairwise alignments implied by the HOMSTRAD database of homologous proteins; (ii) a subset of 64 difficult alignments from set (i) that have low structure similarity; (iii) 199 pairwise alignments of proteins with similar structure but different topology; and (iv) a subset of 20 pairwise alignments from the RIPC set. SPalignNS is shown to achieve greater alignment accuracy (lower or comparable root-mean-square distance with increased structure overlap coverage) for all datasets, and the highest agreement with reference alignments from the challenging dataset (iv), when compared with both sequentially constrained alignments and other non-sequential alignments.
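
The accuracy criterion above combines root-mean-square distance with overlap coverage. The sketch below shows one way such quantities could be computed for a given set of aligned residue pairs. The function name `rmsd_and_coverage` is hypothetical, the coordinates are assumed to be already superposed, and defining coverage as the aligned fraction of the shorter structure is an assumption; the abstract does not state the exact definition used in the paper.

```python
import math


def rmsd_and_coverage(coords_a, coords_b, pairs):
    """Return (RMSD, coverage) for aligned residue index pairs (i, j).

    Assumptions: coordinates are already superposed, and coverage is the
    number of aligned pairs divided by the length of the shorter structure.
    """
    if not pairs:
        return float("nan"), 0.0
    squared_sum = sum(
        math.dist(coords_a[i], coords_b[j]) ** 2 for i, j in pairs
    )
    rmsd = math.sqrt(squared_sum / len(pairs))
    coverage = len(pairs) / min(len(coords_a), len(coords_b))
    return rmsd, coverage


# Hypothetical toy example: every aligned pair is exactly 1 unit apart.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
rmsd, coverage = rmsd_and_coverage(a, b, [(0, 0), (1, 1)])
print(rmsd, coverage)
```

Reporting both numbers together matters because a low distance is easy to obtain by aligning only a few residues, so a lower or comparable distance is only meaningful alongside equal or greater coverage.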

Availability and implementation: SPalignNS was implemented in C++. The source code, binary executable, and a web server version are freely available at: http://sparks-lab.org

Contact: yaoqi.zhou@griffith.edu.au.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Protein
  • Heuristics
  • Models, Molecular
  • Proteins / chemistry
  • Sequence Alignment
  • Software*
  • Structural Homology, Protein*

Substances

  • Proteins