Background: Deep learning-based radiogenomic (DLR) models have shown promising performance in assisting lung cancer care. The purpose of this study was (1) to develop and validate DLR signatures for predicting EGFR mutation status and (2) to assess the incremental value of these DLR signatures over traditional clinical and semantic features.
Methods: A total of 223 patients were selected from two phase III randomized trials in patients with advanced non-squamous NSCLC, with either EGFR-sensitizing mutation or EGFR wild type, who were planned to receive palliative therapy (trial 1: gefitinib or gefitinib plus pemetrexed and carboplatin; trial 2: pemetrexed maintenance and erlotinib maintenance). Our method is an end-to-end pipeline that requires only a manually selected tumour region in the CT image, without precise tumour boundary segmentation or human-defined features. Two deep convolutional neural networks with 3D U-Net architectures were trained to segment lung masses and nodules from 3D regions of the CT image. The primary end point was prediction of EGFR mutation status using the radiomics and DLR pipelines. We also compared the performance of combinations of models in predicting mutational status.
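As an illustration of what "only the manually selected tumour region" can mean in practice, the following is a minimal sketch of computing a fixed-size 3D crop around a reader-selected voxel, shifted so that it stays inside the scan. The function name `crop_bounds`, the crop size, and the volume shape are hypothetical choices for this example and are not taken from the study pipeline.

```python
def crop_bounds(center, size, volume_shape):
    """Return per-axis (start, stop) voxel indices for a box of `size`
    centred on `center`, shifted as needed to stay inside `volume_shape`.

    All three arguments are (z, y, x) tuples of integers. This is an
    assumed pre-processing step, not the authors' documented method.
    """
    bounds = []
    for c, s, dim in zip(center, size, volume_shape):
        if s > dim:
            raise ValueError("crop size exceeds volume extent")
        start = c - s // 2                    # centre the box on the selected voxel
        start = max(0, min(start, dim - s))   # shift the box back inside the volume
        bounds.append((start, start + s))
    return bounds


# Hypothetical example: a 32 x 64 x 64 crop from a 120 x 512 x 512 CT volume.
example_bounds = crop_bounds((10, 100, 100), (32, 64, 64), (120, 512, 512))
```

The resulting index pairs can be used to slice the CT array before it is passed to a network, so no voxel-level tumour contour is needed at this stage.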
Results: A total of 223 patients (mean age, 54.18 years; age range, 28–80 years) were included in this study. There were 121 (54.3%) patients with EGFR mutation and 102 (45.7%) patients with EGFR wild type. On multivariate logistic regression analysis, the clinical variables and CT semantic features significantly associated with EGFR mutation were tumor stage, smoking status, pure solid texture, presence of non-tumor lobe nodule, and average enhancement. For predicting EGFR mutation, ROC analysis of the clinical variables model, CT semantic variables model, radiomics model, and DLR model yielded AUC values of 0.70, 0.73, 0.94, and 0.72, respectively. Adding clinical variables and semantic features to the radiomics and deep learning predictive models independently further improved performance for both, from AUC 0.94 ± 0.02 to 0.96 ± 0.02 and from AUC 0.72 ± 0.02 to 0.82 ± 0.04, respectively.
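For readers less familiar with the metric, an AUC can be read as the probability that a randomly chosen EGFR-mutant case receives a higher model score than a randomly chosen wild-type case. The sketch below computes it directly from that definition. The function name and the toy labels and scores are invented for illustration and are not study data.

```python
def auc_from_scores(labels, scores):
    """AUC as the Mann-Whitney statistic: the fraction of
    (positive, negative) pairs in which the positive case has the
    higher score, with ties counted as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = EGFR mutant, 0 = EGFR wild type.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = auc_from_scores(toy_labels, toy_scores)  # 8 of 9 pairs ranked correctly
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of mutant and wild-type cases.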
Conclusions: The radiomics and DLR models, built on machine-learned information extracted from CT images without precise manual segmentation, could predict EGFR mutation status with high accuracy (AUC up to 0.96 and 0.82, respectively, when combined with clinical and semantic features). This AI-based approach can be used as a non-invasive, easy-to-use surrogate imaging biomarker for EGFR mutation status prediction.