Big data bioinformatics

J Cell Physiol. 2014 Dec;229(12):1896-900. doi: 10.1002/jcp.24662.

Abstract

Recent technological advances allow for high-throughput profiling of biological systems in a cost-efficient manner. The low cost of data generation is leading us into the "big data" era. The availability of big data provides unprecedented opportunities but also raises new challenges for data mining and analysis. In this review, we introduce key concepts in the analysis of big data, including "machine learning" algorithms, with both "unsupervised" and "supervised" examples. We note packages for the R programming language that are available to perform machine learning analyses. In addition to programming-based solutions, we review web servers that allow users with limited or no programming background to perform these analyses on large data compendia.
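
The distinction between "unsupervised" and "supervised" learning can be illustrated with a minimal sketch. It is written in plain Python for self-containment rather than with the R packages the review refers to. The two algorithms (k-means clustering and a nearest-centroid classifier), the toy expression values, and the "control"/"tumor" labels are all assumptions chosen for illustration; none of them is taken from the abstract.

```python
# Toy expression profiles: each tuple is one sample measured on two
# hypothetical genes. All values and labels are invented for illustration.
samples = [
    (1.0, 1.2), (0.8, 1.0), (1.2, 0.9),   # low-expression samples
    (5.0, 5.1), (5.2, 4.8), (4.9, 5.3),   # high-expression samples
]
labels = ["control", "control", "control", "tumor", "tumor", "tumor"]


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, n_iter=10):
    """Unsupervised: group points using only their values, never the labels."""
    assignment = []
    for _ in range(n_iter):
        # Assign every point to its closest current centroid.
        assignment = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        updated = []
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            updated.append(mean_point(members) if members else centroids[k])
        centroids = updated
    return assignment, centroids


def fit_nearest_centroid(points, point_labels):
    """Supervised: learn one centroid per known class label."""
    return {
        c: mean_point([p for p, l in zip(points, point_labels) if l == c])
        for c in sorted(set(point_labels))
    }


def predict(model, point):
    """Return the class whose centroid is closest to the new point."""
    return min(model, key=lambda c: sq_dist(point, model[c]))


# Unsupervised: cluster indices are discovered from the data alone.
clusters, _ = kmeans(samples, centroids=[samples[0], samples[3]])

# Supervised: known labels train a model that classifies a new sample.
model = fit_nearest_centroid(samples, labels)
prediction = predict(model, (4.5, 5.0))

print(clusters)
print(prediction)
```

In the unsupervised case the output is a set of anonymous cluster indices that still have to be interpreted, whereas the supervised model returns one of the class names it was trained on.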

Publication types

  • Research Support, N.I.H., Extramural
  • Review

MeSH terms

  • Artificial Intelligence
  • Computational Biology / methods*
  • Data Mining / methods*
  • Gene Expression Profiling
  • High-Throughput Screening Assays
  • Humans
  • Software*