Boosted structured additive regression for Escherichia coli fed-batch fermentation modeling

Michael Melcher; Theresa Scharl; Markus Luchner; Gerald Striedner; Friedrich Leisch

doi:10.1002/bit.26073

Biotechnol Bioeng. 2017 Feb;114(2):321-334. doi: 10.1002/bit.26073. Epub 2016 Aug 30.

Authors

Michael Melcher¹,², Theresa Scharl¹,², Markus Luchner¹,³, Gerald Striedner¹,³, Friedrich Leisch¹,²

Affiliations

¹ Austrian Centre of Industrial Biotechnology, 8010 Graz, Austria.
² Institute of Applied Statistics and Computing, University of Natural Resources and Life Sciences, Peter-Jordan-Straße 82, 1190 Vienna, Austria.
³ Department of Biotechnology, University of Natural Resources and Life Sciences, Vienna, Austria.

PMID: 27530968
DOI: 10.1002/bit.26073

Abstract

The quality of biopharmaceuticals and patients' safety are of highest priority, and there are tremendous efforts to replace empirical production process designs by knowledge-based approaches. The main challenge in this context is that real-time access to process variables related to product quality and quantity is severely limited. To date, comprehensive on- and offline monitoring platforms are used to generate process data sets that allow for development of mechanistic and/or data-driven models for real-time prediction of these important quantities. The ultimate goal is to implement model-based feedback control loops that facilitate online control of product quality. In this contribution, we explore structured additive regression (STAR) models in combination with boosting as a variable selection tool for modeling the cell dry mass, product concentration, and optical density on the basis of online available process variables and two-dimensional fluorescence spectroscopic data. STAR models are powerful extensions of linear models allowing for inclusion of smooth effects or interactions between predictors. Boosting constructs the final model in a stepwise manner and provides a variable importance measure via predictor selection frequencies. Our results show that the cell dry mass can be modeled with a relative error of about ±3%, the optical density with ±6%, the soluble protein with ±16%, and the insoluble product with an accuracy of ±12%. Biotechnol. Bioeng. 2017;114: 321-334. © 2016 Wiley Periodicals, Inc.
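
The stepwise model construction and the selection-frequency importance measure mentioned in the abstract can be illustrated with a minimal sketch of componentwise L2 boosting. This is not the authors' implementation: it assumes purely linear base learners (one per predictor), whereas a STAR model would also offer smooth and interaction base learners, and the data, the function name `componentwise_l2_boost`, and the parameters `n_steps` and `nu` are hypothetical choices made only for illustration.

```python
import numpy as np


def componentwise_l2_boost(X, y, n_steps=60, nu=0.1):
    """Componentwise L2 boosting with one linear base learner per predictor.

    Illustrative sketch only; returns (intercept, coefficients,
    selection frequencies).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Center the predictors so each base learner is a no-intercept fit.
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    col_ss = (Xc ** 2).sum(axis=0)

    offset = y.mean()            # start from the constant model
    coef = np.zeros(p)
    counts = np.zeros(p, dtype=int)
    resid = y - offset

    for _ in range(n_steps):
        # Least-squares slope of the current residuals on each predictor.
        slopes = Xc.T @ resid / col_ss
        # Reduction in residual sum of squares achieved by each candidate.
        gain = slopes ** 2 * col_ss
        j = int(np.argmax(gain))
        # Update only the best-fitting base learner, shrunk by nu.
        coef[j] += nu * slopes[j]
        resid = resid - nu * slopes[j] * Xc[:, j]
        counts[j] += 1

    intercept = offset - x_mean @ coef
    return intercept, coef, counts / n_steps


# Synthetic data (hypothetical): only predictors 0 and 3 carry signal.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 6))
y_demo = 2.0 * X_demo[:, 0] - 1.0 * X_demo[:, 3] + rng.normal(scale=0.1, size=200)

intercept, coef, freq = componentwise_l2_boost(X_demo, y_demo)
print("selection frequencies:", np.round(freq, 2))
print("coefficients:", np.round(coef, 2))
```

In this toy setting the selection frequencies should concentrate on the two informative predictors, which is the sense in which boosting doubles as a variable selection tool. The number of boosting steps acts as the main tuning parameter; in practice it would typically be chosen by cross-validation rather than fixed as it is here.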

Keywords: Escherichia coli; boosting; machine learning; modeling; recombinant protein production; structured additive regression model.

MeSH terms

Algorithms
Batch Cell Culture Techniques / methods*
Bioreactors / microbiology
Escherichia coli / genetics
Escherichia coli / metabolism*
Fermentation
Machine Learning
Models, Biological*
Recombinant Proteins / chemistry*
Recombinant Proteins / genetics
Recombinant Proteins / metabolism*
Regression Analysis
Solubility

Substances

Recombinant Proteins