The deployment of diverse data-generating technologies in livestock farming holds the promise of early disease detection and improved animal well-being. In this paper, we combine routinely collected dairy farm and herd data with weather and high-frequency sensor data from 6 farms to predict new lameness events over various future periods, ranging from the following day to 3 wk. A Random Forest classifier, using input features selected by the Boruta algorithm, was used for the prediction task; the effects of individual features were further assessed using partial dependence plots. We achieve precision of up to 93%, combined with a balanced accuracy of 79%, when predicting lameness over the next 3 wk using information from the preceding 3 wk. Removing sensor data tends to reduce the precision of predictions, especially when using information from the last 1, 2, or 3 wk. Moving to a larger dataset (without sensor data) of 44 farms maintains a similar balanced accuracy but reduces precision by more than 30%, revealing a substantial trade-off in model quality between false positives (false lameness alerts) and false negatives (missed lameness events). Sensor data holds promise to further improve the precision of these models, but its absence can be partially compensated by high-resolution data from other systems, such as automated milking systems.
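To make the trade-off between false lameness alerts and missed lameness events concrete, the following minimal sketch computes precision and balanced accuracy from confusion-matrix counts. The two sets of counts are hypothetical and chosen only for illustration; they are not results from this study, and the function names are our own.

```python
def precision(tp: int, fp: int) -> float:
    """Share of lameness alerts that correspond to true lameness events."""
    return tp / (tp + fp)


def balanced_accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Mean of sensitivity (lame cows detected) and specificity (healthy cows not flagged)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical counts (assumption, not study data): 100 lame and 1000 non-lame cases.
# Model A raises few alerts: it misses more events but produces few false alerts.
model_a = {"tp": 60, "fp": 20, "tn": 980, "fn": 40}
# Model B raises many alerts: it misses fewer events but produces many false alerts.
model_b = {"tp": 80, "fp": 220, "tn": 780, "fn": 20}

for name, counts in (("A", model_a), ("B", model_b)):
    p = precision(counts["tp"], counts["fp"])
    ba = balanced_accuracy(**counts)
    print(f"Model {name}: precision={p:.2f}, balanced accuracy={ba:.2f}")
```

In this illustration both models reach the same balanced accuracy, yet their precision differs widely, which is why the two metrics are best read together when lameness events are rare.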
Keywords: data integration; disease prediction; lameness; machine learning; precision livestock farming.
© 2024, The Authors. Published by Elsevier Inc. on behalf of the American Dairy Science Association®. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).