Background and objectives: Predicting lymph node metastasis (LNM) in early colorectal cancer (CRC) is critical for determining treatment strategy after endoscopic resection. Several histologic parameters for predicting LNM have been established, but evaluator error and inter-observer disagreement remain unresolved. Here we describe a machine learning-based LNM prediction algorithm for submucosal invasive (T1) CRC.
Methods: We conducted a retrospective single-institution study of 397 T1 CRCs. Several morphologic parameters were extracted from whole slide images of cytokeratin immunohistochemistry using ImageJ. A random forest model was trained on a training dataset (n = 277) and then used to predict LNM in a test dataset (n = 120). The predictions were compared with conventional histologic evaluation of hematoxylin-eosin-stained sections.
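The train-and-predict workflow described in Methods can be illustrated with a minimal sketch. This is not the study's code or data: the feature matrix is synthetic, the five feature columns are hypothetical stand-ins for the ImageJ-derived morphologic parameters, and the stratified split, the number of trees, and the class weighting are assumptions made only for illustration. Only the case counts (397 total, 277 training, 120 test) come from the text. The sketch assumes scikit-learn and NumPy are available.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 397 cases x 5 hypothetical morphologic features.
rng = np.random.default_rng(0)
n_cases, n_features = 397, 5
X = rng.normal(size=(n_cases, n_features))

# Hypothetical labels (1 = LNM-positive), loosely tied to two features
# so that the positive class is a minority, as is typical for LNM.
noise = rng.normal(scale=1.0, size=n_cases)
y = (X[:, 0] + 0.5 * X[:, 1] + noise > 1.5).astype(int)

# Split into 277 training and 120 test cases.
# Stratification is an assumption; the abstract does not say how the split was made.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=120, stratify=y, random_state=0
)

# Fit a random forest on the training set (hyperparameters are illustrative).
clf = RandomForestClassifier(
    n_estimators=500, class_weight="balanced", random_state=0
)
clf.fit(X_train, y_train)

# Predict LNM on the held-out test set.
proba = clf.predict_proba(X_test)[:, 1]
pred = clf.predict(X_test)

# Summarize discrimination and count false negatives (missed LNM-positive cases).
auc = roc_auc_score(y_test, proba)
tn, fp, fn, tp = confusion_matrix(y_test, pred, labels=[0, 1]).ravel()
print(f"test AUC: {auc:.3f}")
print(f"false negatives: {fn} of {fn + tp} LNM-positive test cases")
```

The false-negative count is reported separately because a missed LNM-positive case is the clinically costly error when deciding whether additional surgery is needed after endoscopic resection.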
Results: Machine learning showed better LNM predictive ability than the conventional method on some datasets, although cross-validation revealed no significant difference between the two methods. Machine learning produced fewer false-negative cases than the conventional method.
Conclusions: Machine learning on whole slide images is a potential alternative for determining treatment strategies for T1 CRC.
Keywords: Colorectal cancer; Lymph node metastasis; Random forest; Supervised machine learning.
Copyright © 2019 Elsevier B.V. All rights reserved.