A multi-step approach to time series analysis and gene expression clustering

R Amato; A Ciaramella; N Deniskina; C Del Mondo; D di Bernardo; C Donalek; G Longo; G Mangano; G Miele; G Raiconi; A Staiano; R Tagliaferri

doi:10.1093/bioinformatics/btk026

Bioinformatics. 2006 Mar 1;22(5):589-96. doi: 10.1093/bioinformatics/btk026. Epub 2006 Jan 5.

Authors

R Amato¹, A Ciaramella, N Deniskina, C Del Mondo, D di Bernardo, C Donalek, G Longo, G Mangano, G Miele, G Raiconi, A Staiano, R Tagliaferri

Affiliation

¹ Dipartimento di Scienze Fisiche, University of Naples Federico II, Naples, Italy.

PMID: 16397005
DOI: 10.1093/bioinformatics/btk026

Abstract

Motivation: The huge growth in gene expression data calls for the implementation of automatic tools for data processing and interpretation.

Results: We present a new and comprehensive machine learning data mining framework for clustering gene microarray data. It consists of a non-linear PCA neural network for feature extraction, and probabilistic principal surfaces combined with an agglomerative approach based on negentropy. The method, which provides a user-friendly visualization interface, can work on noisy data with missing points and is an automatic procedure for determining, with no a priori assumptions, the number of clusters present in the data. Application to a cell-cycle dataset, together with a detailed analysis, confirms the biological nature of the most significant clusters.
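
The abstract does not say how negentropy is estimated or how it drives the agglomerative step. As a rough illustration only, the sketch below uses the log-cosh approximation common in the ICA literature, J(y) ≈ (E[G(y)] − E[G(ν)])² with G(u) = log cosh(u) and ν a standard normal variable. The function name `negentropy_logcosh` and its parameters are hypothetical and are not taken from the paper or from the ASTRONEURAL package. The intuition it shows is that the union of two well-separated clusters is far from Gaussian and so scores a higher negentropy than a single Gaussian-like cluster, which is one way such a measure could inform a merge decision.

```python
import numpy as np


def negentropy_logcosh(x, n_ref=100_000, seed=0):
    """Approximate the negentropy of a 1-D sample (hypothetical helper).

    Uses J(y) ~ (E[G(y)] - E[G(nu)])^2 with G(u) = log cosh(u), where y is
    the standardized sample and nu is a standard normal reference sample.
    This is an assumed estimator, not necessarily the one used in the paper.
    """
    x = np.asarray(x, dtype=float)
    y = (x - x.mean()) / x.std()  # zero mean, unit variance

    # Numerically stable log cosh(u) = logaddexp(u, -u) - log(2)
    def g(u):
        return np.logaddexp(u, -u) - np.log(2.0)

    # Monte Carlo estimate of E[G(nu)] for a standard normal nu
    rng = np.random.default_rng(seed)
    nu = rng.standard_normal(n_ref)
    return (g(y).mean() - g(nu).mean()) ** 2


rng = np.random.default_rng(1)
single_cluster = rng.standard_normal(50_000)
two_clusters = np.concatenate([
    rng.normal(-3.0, 1.0, 25_000),
    rng.normal(3.0, 1.0, 25_000),
])

j_single = negentropy_logcosh(single_cluster)
j_union = negentropy_logcosh(two_clusters)
print(f"single cluster: {j_single:.2e}, union of two clusters: {j_union:.2e}")
```

In this toy setting a low score for a candidate merged cluster would suggest the two parts belong together, while a high score would suggest keeping them apart. Any threshold for that decision would be a further assumption.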

Availability: The software described here is a subpackage of the ASTRONEURAL package and is available upon request from the corresponding author.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

Evaluation Study

MeSH terms

Artificial Intelligence
Cluster Analysis
Computer Graphics
Computer Simulation
Databases, Protein*
Gene Expression Profiling / methods*
Information Storage and Retrieval / methods*
Models, Genetic
Oligonucleotide Array Sequence Analysis / methods*
Proteins / metabolism*
Software*
Time Factors
User-Computer Interface*

Substances

Proteins
