Detection of COVID-19 features in lung ultrasound images using deep neural networks

Commun Med (Lond). 2024 Mar 11;4(1):41. doi: 10.1038/s43856-024-00463-5.

Abstract

Background: Deep neural networks (DNNs) to detect COVID-19 features in lung ultrasound B-mode images have primarily relied on either in vivo or simulated images as training data. However, in vivo images are constrained by limited access to the manual labels required for thousands of training examples, and simulated images can generalize poorly to in vivo images due to domain differences. We address these limitations and identify the best training strategy.

Methods: We investigated in vivo COVID-19 feature detection with DNNs trained on our carefully simulated datasets (40,000 images), publicly available in vivo datasets (174 images), in vivo datasets curated by our team (958 images), and a combination of simulated and internal or external in vivo datasets. Seven DNN training strategies were tested on in vivo B-mode images from COVID-19 patients.
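As an illustration of the mixed-data strategy described above, the following minimal sketch assembles a training set from simulated, external in vivo, and internal in vivo images while holding out part of the internal set for testing. The function name, the tuple-based image identifiers, the 20% hold-out fraction, and the random seed are all hypothetical; the abstract does not state how the internal dataset was split.

```python
import random


def build_mixed_training_set(simulated, external, internal,
                             holdout_fraction=0.2, seed=0):
    """Mix simulated and in vivo images; hold out part of the internal set.

    holdout_fraction and seed are assumed values for illustration only.
    """
    rng = random.Random(seed)
    internal = list(internal)
    rng.shuffle(internal)

    # Reserve a subset of the internal in vivo images for testing only.
    n_test = int(round(len(internal) * holdout_fraction))
    test = internal[:n_test]

    # Train on simulated + external in vivo + remaining internal in vivo.
    train = list(simulated) + list(external) + internal[n_test:]
    rng.shuffle(train)
    return train, test


# Placeholder identifiers sized to match the dataset counts in the abstract.
simulated = [("simulated", i) for i in range(40000)]
external = [("external", i) for i in range(174)]
internal = [("internal", i) for i in range(958)]

train, test = build_mixed_training_set(simulated, external, internal)
```

Keeping the held-out internal images out of the training list is what makes the reported test score a measure of generalization rather than memorization.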

Results: Here, we show that Dice similarity coefficients (DSCs) between ground truth and DNN predictions are higher when simulated data are mixed with external in vivo data and tested on internal in vivo data (0.482 ± 0.211) than when using only simulated B-mode image training data (0.464 ± 0.230) or only external in vivo B-mode training data (0.407 ± 0.177). DSCs improve further when a separate subset of the internal in vivo B-mode images is included in the training dataset, with the highest DSC (and the shortest required training time, in epochs) obtained after mixing simulated data with internal and external in vivo data during training, then testing on the held-out subset of the internal in vivo dataset (0.735 ± 0.187).
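For readers unfamiliar with the metric, the DSC between a predicted mask P and a ground truth mask G is 2|P ∩ G| / (|P| + |G|). The sketch below computes it for binary masks with NumPy and summarizes a set of test images as mean and standard deviation. It assumes binary segmentation masks and treats two empty masks as perfect agreement; both are illustrative choices, and the abstract does not specify the authors' exact computation.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    intersection = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return float(2.0 * intersection / total)


def summarize_dsc(pred_masks, truth_masks):
    """Mean and standard deviation of DSC over paired masks."""
    scores = [dice_coefficient(p, t) for p, t in zip(pred_masks, truth_masks)]
    return float(np.mean(scores)), float(np.std(scores))


# Toy 2x2 masks, not data from the study.
pred = [[1, 1], [0, 0]]
truth = [[1, 0], [0, 0]]
score = dice_coefficient(pred, truth)  # 2*1 / (2 + 1)
```

A DSC of 1 indicates identical masks and 0 indicates no overlap, so the reported rise from 0.482 to 0.735 reflects substantially greater overlap with the ground truth.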

Conclusions: DNNs trained with simulated and in vivo data are promising alternatives to training with only real or only simulated data when segmenting in vivo COVID-19 lung ultrasound features.

Plain language summary

Computational tools are often used to aid detection of COVID-19 from lung ultrasound images. However, this type of detection method can be prone to misdiagnosis if the computational tool is not properly trained and validated to detect image features associated with COVID-19 positive lungs. Here, we devise and test seven different strategies that include real patient data and simulated patient data to train the computational tool to detect these image features accurately. Simulated data were created with software that models ultrasound physics and acoustic wave propagation. We find that incorporating simulated data in the training process improves training efficiency and detection accuracy, indicating that a properly curated simulated dataset can be used when real patient data are limited.