Background: Microbiome sequencing has brought increasing attention to the polymicrobial context of chronic infections. However, clinical microbiology continues to focus on canonical human pathogens and may therefore overlook informative but nonpathogenic biomarkers. We address this disconnect in lung infections in people with cystic fibrosis (CF).
Methods: We collected health information (lung function, age, and body mass index [BMI]) and sputum samples from a cohort of 77 children and adults with CF. Samples were collected during a period of clinical stability, and airway microbiome composition was determined by 16S rDNA sequencing. We used ElasticNet regularization to train linear models predicting lung function and to extract the most informative features.
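As an illustration of the ElasticNet step named in the Methods, the following is a minimal sketch, not the authors' pipeline. The abstract does not state the software, preprocessing, or hyperparameters used, so scikit-learn, the standardization step, the `l1_ratio` grid, the 5-fold cross-validation, and the helper name `fit_elastic_net` are all assumptions. The data are synthetic placeholders sized to match the 77-person cohort. They are not study data.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def fit_elastic_net(X, y, feature_names, seed=0):
    """Fit a cross-validated ElasticNet; return the model and its nonzero coefficients.

    Hypothetical helper: the l1_ratio grid and cv=5 are illustrative choices.
    """
    model = make_pipeline(
        StandardScaler(),  # put all features on a comparable scale before penalizing
        ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=5, random_state=seed),
    )
    model.fit(X, y)
    coefs = model[-1].coef_
    # Features with a nonzero coefficient are the ones the model retained.
    selected = {n: float(c) for n, c in zip(feature_names, coefs) if c != 0.0}
    return model, selected


# Synthetic placeholder data (NOT the study's data): 77 samples, 20 features,
# with only the first two features truly related to the outcome.
rng = np.random.default_rng(0)
n_samples = 77
feature_names = [f"feature_{i}" for i in range(20)]
X = rng.normal(size=(n_samples, len(feature_names)))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=n_samples)

model, selected = fit_elastic_net(X, y, feature_names)

# List retained features by absolute coefficient size; the sign gives the
# direction of association with the outcome.
for name, coef in sorted(selected.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {coef:+.3f}")
```

In a setting like the one described, the columns of `X` would hold taxon abundances and patient metadata, and `y` a lung function measure. The sign of each retained coefficient would then indicate a positive or negative predictor.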
Results: Models trained on whole-microbiome quantitation outperformed models trained on pathogen quantitation alone, with or without the inclusion of patient metadata. Our most accurate models retained key pathogens as negative predictors (Pseudomonas, Achromobacter) along with established correlates of CF disease state (age, BMI, CF-related diabetes). In addition, our models selected nonpathogen taxa (Fusobacterium, Rothia) as positive predictors of lung health.
Conclusions: These results support a reconsideration of clinical microbiology pipelines to ensure the provision of informative data to guide clinical practice.
Keywords: cystic fibrosis; machine learning; microbiome.
© The Author(s) 2021. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved. For permissions, e-mail: journals.permissions@oup.com.