Background and purpose: Radiation pneumonitis (RP) is a dose-limiting toxicity of radiotherapy for locally advanced non-small cell lung cancer (LA-NSCLC). Prior studies have proposed dosimetric constraints to limit this toxicity. Using machine learning algorithms, we analyzed factors contributing to the development of RP to uncover previously unidentified criteria and to elucidate the relative importance of individual factors.
Materials and methods: We evaluated 32 clinical features per patient in a cohort of 203 stage II-III LA-NSCLC patients treated with definitive chemoradiation to a median dose of 66.6 Gy in 1.8 Gy daily fractions at our institution from 2008 to 2016. Of this cohort, 17.7% of patients developed grade ≥2 RP. For univariate analysis, we trained a decision stump on each feature individually to identify statistically significant predictors of RP and to perform feature selection. For multivariate analysis, we applied Random Forest to assess the combined performance of the important predictors of RP.
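As an illustration of the univariate step, a decision stump can be sketched as a single-threshold classifier on one feature. The snippet below is a minimal sketch, not the study's implementation: the function name `fit_stump`, the choice of Youden's J (sensitivity + specificity - 1) as the splitting criterion, and the toy lung V20 values are all assumptions made for this example. A conventional stump would more often split on Gini impurity or classification error.

```python
from typing import List, Tuple


def fit_stump(values: List[float], labels: List[int]) -> Tuple[float, float, float]:
    """Fit a one-feature decision stump (hypothetical helper).

    Predicts RP (label 1) when value >= threshold and returns the
    (threshold, sensitivity, specificity) that maximises Youden's J.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    best_j, best_thr, best_sens, best_spec = float("-inf"), 0.0, 0.0, 0.0
    # Try every observed value as a candidate cut-point.
    for thr in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if v >= thr and y == 1)
        tn = sum(1 for v, y in zip(values, labels) if v < thr and y == 0)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1.0
        if j > best_j:
            best_j, best_thr, best_sens, best_spec = j, thr, sens, spec
    return best_thr, best_sens, best_spec


# Toy data (invented for illustration): lung V20 in percent, 1 = grade >=2 RP.
v20 = [12.0, 18.0, 22.0, 25.0, 28.0, 31.0, 35.0, 38.0]
rp = [0, 0, 0, 0, 1, 0, 1, 1]
threshold, sensitivity, specificity = fit_stump(v20, rp)
```

Running one such stump per feature and comparing the resulting sensitivity and specificity is one plausible way to rank features before a multivariate model.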
Results: On univariate analysis, lung V20, lung mean, lung V10 and lung V5 were found to be significant RP predictors with the greatest balance of specificity and sensitivity. On multivariate analysis, Random Forest (AUC = 0.66, p = 0.0005) identified esophagus max (20.5%), lung V20 (16.4%), lung mean (15.7%) and pack-year (14.9%) as the most common primary differentiators of RP.
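For readers unfamiliar with the AUC reported above, it can be read as the probability that a randomly chosen patient with RP receives a higher model score than a randomly chosen patient without RP. The following is a minimal sketch of that pairwise definition; the function name `auc` and the toy scores are assumptions for illustration and are not data from the study.

```python
from typing import List


def auc(scores: List[float], labels: List[int]) -> float:
    """Area under the ROC curve via pairwise comparison (hypothetical helper).

    Counts the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy scores (invented): higher means the model considers RP more likely.
example_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.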
Conclusions: We highlight Random Forest as an accurate machine learning method to identify known and new predictors of symptomatic RP. Furthermore, this analysis confirms the importance of lung V20, lung mean and pack-year as predictors of RP while also introducing esophagus max as an important RP predictor.
Keywords: CART; Logistic regression; Machine learning; Non-small cell lung cancer; RUSBoost; Radiation pneumonitis; Random forest; Support vector machines.
Copyright © 2019 Elsevier B.V. All rights reserved.