Discovering protein complexes in dense reliable neighborhoods of protein interaction networks

Comput Syst Bioinformatics Conf. 2007;6:157-68.

Abstract

Multiprotein complexes play central roles in many cellular pathways. Although high-throughput experimental techniques have enabled systematic, large-scale screening of pairwise protein-protein interactions, experimentally determined protein complex data remain comparatively scarce. Researchers have therefore begun to exploit the vast amount of pairwise interaction data to help discover new protein complexes. Mining interaction networks for protein complexes is not easy, however, because the limitations of current high-throughput screening methods introduce many artefacts into the underlying protein-protein interaction data. We propose a novel algorithm, DECAFF (Dense-neighborhood Extraction using Connectivity and conFidence Features), to mine for dense and reliable subgraphs in protein interaction networks. Our method is devised to address two major limitations of current high-throughput protein interaction data, namely incompleteness and high noise. Experimental results with yeast protein interaction data show that the interaction subgraphs discovered by DECAFF matched actual protein complexes significantly better than those found by existing approaches. Our results demonstrate that pairwise protein interaction networks can be mined effectively to discover new protein complexes, provided that the artefacts in the underlying interaction data are adequately taken into account.
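To make the idea of a "dense and reliable" subgraph concrete, here is a minimal Python sketch. It is not the DECAFF algorithm, whose details the abstract does not give; it is an illustrative stand-in that scores each protein's immediate neighborhood by edge density (a connectivity feature) and by mean edge confidence (a reliability feature). The function names, the thresholds, and the toy edge confidences are all assumptions made for illustration.

```python
from itertools import combinations

# Assumed input format: a dict mapping frozenset({u, v}) to a confidence
# score in [0, 1]. Self-interactions are assumed to be absent.

def density(nodes, edges):
    """Fraction of all possible protein pairs in `nodes` that interact."""
    nodes = sorted(set(nodes))
    n = len(nodes)
    if n < 2:
        return 0.0
    present = sum(1 for u, v in combinations(nodes, 2)
                  if frozenset((u, v)) in edges)
    return present / (n * (n - 1) / 2)

def mean_confidence(nodes, edges):
    """Average confidence of the interactions that lie inside `nodes`."""
    nodes = sorted(set(nodes))
    scores = [edges[frozenset((u, v))] for u, v in combinations(nodes, 2)
              if frozenset((u, v)) in edges]
    return sum(scores) / len(scores) if scores else 0.0

def dense_reliable_neighborhoods(edges, min_density=0.7,
                                 min_confidence=0.5, min_size=3):
    """Return neighborhoods (protein + direct partners) passing both filters.

    The thresholds are hypothetical defaults, not values from the paper.
    """
    adjacency = {}
    for pair in edges:
        u, v = tuple(pair)
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    results = set()
    for protein, partners in adjacency.items():
        candidate = frozenset(partners | {protein})
        if len(candidate) < min_size:
            continue
        if (density(candidate, edges) >= min_density
                and mean_confidence(candidate, edges) >= min_confidence):
            results.add(candidate)
    return results

# Toy network with made-up confidences: A, B, C form a reliable triangle,
# while the C-D interaction is a low-confidence edge.
toy_edges = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.8,
    frozenset(("B", "C")): 0.9,
    frozenset(("C", "D")): 0.2,
}
complexes = dense_reliable_neighborhoods(toy_edges)
```

In this toy network the neighborhood of C also pulls in D, which lowers its density below the assumed threshold, so only the triangle A, B, C is kept. A real method would also need to handle missing interactions (incompleteness), which this simple filter does not attempt.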

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Models, Biological*
  • Multiprotein Complexes / metabolism*
  • Protein Interaction Mapping / methods*
  • Proteome / metabolism*
  • Signal Transduction / physiology*

Substances

  • Multiprotein Complexes
  • Proteome