Background: Monitoring the biological attributes of mosquitoes is critical for understanding pathogen transmission and estimating the impacts of vector control interventions on the survival of vector species. Infrared spectroscopy and machine learning techniques are increasingly being tested for this purpose and have been shown to accurately predict the age, species, blood-meal sources, and pathogen infections in Anopheles and Aedes mosquitoes. However, as these techniques are still in early-stage implementation, there are no standardized procedures for handling samples prior to infrared scanning. This study investigated the effects of different preservation methods and storage durations on the performance of mid-infrared spectroscopy for age-grading females of the malaria vector, Anopheles arabiensis.
Methods: Laboratory-reared An. arabiensis (N = 3681) were collected at 5 and 17 days post-emergence, killed with ethanol, and then preserved using silica desiccant at 5 °C, freezing at −20 °C, or absolute ethanol at room temperature. For each preservation method, the mosquitoes were divided into three groups, stored for 1, 4, or 8 weeks, and then scanned using a mid-infrared spectrometer. Supervised machine learning classifiers were trained with the infrared spectra, and the support vector machine (SVM) emerged as the best model for predicting the mosquito ages.
Results: The model trained using silica-preserved mosquitoes achieved 95% accuracy when predicting the ages of other silica-preserved mosquitoes, but its accuracy declined to 72% and 66% when age-classifying mosquitoes preserved using ethanol and freezing, respectively. Prediction accuracies of models trained on samples preserved in ethanol or by freezing also declined when these models were applied to samples preserved by other methods. Similarly, models trained on samples stored for 1 week achieved accuracies of 97%, 83%, and 72% when predicting the ages of mosquitoes stored for 1, 4, and 8 weeks, respectively.
Conclusions: When using mid-infrared spectroscopy and supervised machine learning to age-grade mosquitoes, the highest accuracies are achieved when the training and test samples are preserved in the same way and stored for similar durations. When the training and test samples are handled differently, classification accuracies decline significantly. Protocols for infrared-based entomological studies should therefore emphasize standardized sample-handling procedures, and possibly additional statistical procedures such as transfer learning, for greater accuracy.
Keywords: Age-grading; An. arabiensis; Machine learning and infrared spectroscopy; Malaria; Sample handling; Vector control.
© 2022. The Author(s).
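Illustrative sketch: The cross-condition evaluation described in the Results (train on samples handled one way, test on samples handled each way) can be outlined roughly as follows. This is a minimal sketch under loud assumptions, not the study's pipeline: the "spectra" are synthetic four-point vectors, the per-method baseline shifts in `SHIFT` are hypothetical values chosen only to make handling methods differ, and a nearest-centroid classifier is swapped in for the SVM so that the example runs with the Python standard library alone. All function names are made up for illustration.

```python
import random
from statistics import mean

AGES = (5, 17)  # days post-emergence: the two age classes named in the abstract
METHODS = ("silica", "ethanol", "freezing")

# HYPOTHETICAL baseline shifts per preservation method. These are not measured
# values; they only make the synthetic spectra differ between handling methods.
SHIFT = {"silica": 0.0, "ethanol": 1.5, "freezing": -1.5}


def make_samples(method, n_per_age, rng):
    """Return (features, age) pairs of synthetic 4-point 'spectra'."""
    samples = []
    for age in AGES:
        signal = 1.0 if age == 17 else -1.0  # synthetic age-related signal
        for _ in range(n_per_age):
            x = [signal + SHIFT[method] + rng.gauss(0, 0.5) for _ in range(4)]
            samples.append((x, age))
    return samples


def fit_centroids(samples):
    """Nearest-centroid 'training' (a stand-in for the SVM): mean vector per age."""
    centroids = {}
    for age in AGES:
        rows = [x for x, a in samples if a == age]
        centroids[age] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign the age class whose centroid is closest in squared distance."""
    def dist2(c):
        return sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return min(centroids, key=lambda age: dist2(centroids[age]))


def accuracy(centroids, samples):
    """Fraction of samples whose predicted age matches the true age."""
    return mean(1.0 if predict(centroids, x) == a else 0.0 for x, a in samples)


def cross_condition_matrix(seed=0, n_per_age=200):
    """Accuracy for every (training method, test method) pair."""
    rng = random.Random(seed)
    train = {m: make_samples(m, n_per_age, rng) for m in METHODS}
    test = {m: make_samples(m, n_per_age, rng) for m in METHODS}
    models = {m: fit_centroids(train[m]) for m in METHODS}
    return {(tr, te): accuracy(models[tr], test[te])
            for tr in METHODS for te in METHODS}


if __name__ == "__main__":
    matrix = cross_condition_matrix()
    for tr in METHODS:
        # One row per training method; columns follow the order in METHODS.
        print(tr, [round(matrix[(tr, te)], 2) for te in METHODS])
```

In this toy setting the diagonal entries (same handling for training and test) are higher than the off-diagonal entries, which mirrors the qualitative pattern reported above; the numbers themselves carry no meaning for real mosquito spectra.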