Volatile organic compounds (VOCs) indicative of pork microbial spoilage can be quantified rapidly at trace levels using selected-ion flow-tube mass spectrometry (SIFT-MS). Packaging atmosphere is one of the factors influencing VOC production patterns during storage. Machine learning can therefore help to process complex volatolomic data and predict pork microbial quality efficiently. This study focused on (1) investigating model generalizability under different nested cross-validation settings, and (2) comparing the predictive power and feature importance of nine algorithms: Artificial Neural Network (ANN), k-Nearest Neighbors, Support Vector Regression, Decision Tree, Partial Least Squares Regression, and four ensemble learning models. The datasets contain the concentrations of 37 VOCs (input) and total plate counts (TPC, output) of 350 pork samples with different storage times, comprising 225 pork loin samples stored under three high-O2 and three low-O2 conditions, and 125 commercially packaged products. An appropriate choice of cross-validation strategy resulted in trustworthy and relevant predictions. When trained on all possible selections of two high-O2 and two low-O2 conditions, ANNs produced satisfactory TPC predictions for unseen test scenarios (one high-O2 condition, one low-O2 condition, and the commercial products). ANN-based bagging outperformed the other models when TPC exceeded ca. 6 log CFU/g. VOCs including benzaldehyde, 3-methyl-1-butanol, ethanol and methyl mercaptan were identified as having high feature importance. This detailed case study illustrates the strong prospects of real-time detection techniques and machine learning in meat quality prediction. Further investigation into handling low VOC levels would improve model performance and decision-making in commercial meat quality control.
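As an illustration of the condition-wise splitting described above, the following minimal sketch enumerates every selection of two high-O2 and two low-O2 conditions for training, with the remaining two conditions and the commercial products held out for testing. The condition labels and function names are hypothetical, and the inner leave-one-condition-out loop is an assumption about how a nested scheme could be arranged, not a description of the authors' exact procedure.

```python
from itertools import combinations

# Hypothetical condition labels: the abstract states only that there are
# three high-O2 and three low-O2 packaging conditions.
HIGH_O2 = ["high_A", "high_B", "high_C"]
LOW_O2 = ["low_A", "low_B", "low_C"]
COMMERCIAL = "commercial"  # commercial products are always held out


def outer_splits(high=HIGH_O2, low=LOW_O2, commercial=COMMERCIAL):
    """Yield (train_conditions, test_conditions) for every selection of
    two high-O2 and two low-O2 conditions used as training data."""
    for train_high in combinations(high, 2):
        for train_low in combinations(low, 2):
            train = list(train_high) + list(train_low)
            # The unused high-O2 and low-O2 conditions plus the
            # commercial products form the unseen test scenarios.
            test = [c for c in high + low if c not in train] + [commercial]
            yield train, test


def inner_splits(train_conditions):
    """Assumed inner loop: hold out one training condition at a time
    for validation (e.g. for hyperparameter tuning)."""
    for held_out in train_conditions:
        fit = [c for c in train_conditions if c != held_out]
        yield fit, [held_out]


splits = list(outer_splits())

if __name__ == "__main__":
    for train, test in splits:
        print("train:", train, "| test:", test)
```

Splitting by storage condition rather than by individual sample keeps all samples from a held-out condition out of training, which is what allows the test scores to reflect generalization to unseen packaging scenarios.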
Keywords: Ensemble learning; Nested cross-validation; Permutation feature importance; Pork storage; Volatile organic compounds.
Copyright © 2024 Elsevier Ltd. All rights reserved.