Hui Lin, Wei Zou, Taoran Li, Steven J Feigenberg, Boon-Keng K Teo, Lei Dong.
Abstract
In cancer radiation therapy, large respiration-induced tumor motion can introduce uncertainties in target delineation and treatment delivery, making active motion management an essential step in treating thoracic and abdominal tumors. In current practice, patients with tumor motion may need two sets of CT scans: an initial free-breathing 4-dimensional CT (4DCT) scan for tumor motion estimation, and a second CT scan under appropriate motion management such as breath-hold or abdominal compression. The aim of this study is to assess the feasibility of a machine-learning-based predictive model for estimating tumor motion in three-dimensional space. To characterize lung tumor motion, the model was developed from sixteen imaging features extracted from non-4D diagnostic CT images and eleven clinical features extracted from the Electronic Health Record (EHR) database of 150 patients. A super-learner model was trained to combine four base machine learning models (Random Forest, Multi-Layer Perceptron, LightGBM and XGBoost), whose hyper-parameters were also optimized to obtain the best performance. The super-learner outputs tumor motion predictions in the Superior-Inferior (SI), Anterior-Posterior (AP) and Left-Right (LR) directions, which were compared against tumor motions measured in the free-breathing 4DCT scans. Prediction accuracy was evaluated using Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) over ten rounds of independent tests. The MAE and RMSE were 1.23 mm and 1.70 mm in the SI direction, 0.81 mm and 1.19 mm in the AP direction, and 0.70 mm and 0.95 mm in the LR direction. In addition, the relative feature importance analysis showed that the imaging features were more important than the clinical features for tumor motion prediction.
Our findings indicate that a super-learner model can accurately predict tumor motion ranges as measured in the 4DCT, and could provide a machine learning framework to assist radiation oncologists in determining the active motion management strategy for patients with large tumor motion.
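The two accuracy metrics named in the abstract can be illustrated with a minimal sketch. The function names and the sample amplitudes below are hypothetical and chosen only for illustration; they are not study data.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average of |prediction - measurement|, in the units of the inputs (mm)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_square_error(y_true, y_pred):
    """Square root of the mean squared error; penalizes large misses more than MAE."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


# Hypothetical SI-direction motion amplitudes in mm (illustrative, not study data).
measured_si = [2.0, 5.0, 9.0, 12.0]
predicted_si = [3.0, 4.0, 9.0, 14.0]

mae = mean_absolute_error(measured_si, predicted_si)
rmse = root_mean_square_error(measured_si, predicted_si)
print(f"MAE = {mae:.2f} mm, RMSE = {rmse:.2f} mm")  # MAE = 1.00 mm, RMSE = 1.22 mm
```

RMSE is never smaller than MAE for the same residuals, which is consistent with the reported pairs (for example 1.23 mm and 1.70 mm in the SI direction).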
Year: 2019 PMID: 31619736 PMCID: PMC6795883 DOI: 10.1038/s41598-019-51338-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Feature importance ranks in the SI, AP and LR directions obtained by XGBoost RFE and collinearity removal are shown in (a–c). Imaging features are plotted in red and clinical features are plotted in blue. The F-scores were averaged over ten rounds of independent tests.
Figure 2. Five-fold cross-validated Mean Absolute Error (MAE) of each base machine learning model with default and with optimized hyper-parameters, alongside the MAE of the Super-Learner model. Comparing the MAE of the four base models under default and optimized hyper-parameters shows the benefit of hyper-parameter tuning. The base models are Random Forest (RF), Multi-Layer Perceptron (MLP) Networks, LightGBM (LGBM) and XGBoost (XGB). The MAE improvement of the super-learner model over each optimized base model shows the benefit of building a super-learner.
Figure 3. Predicted values of the super-learner models versus ground truth values in the SI, AP and LR directions, with the corresponding residual plots. A 2 mm error region is highlighted in each residual plot. The results of the ten independent test sets for each super-learner model are plotted in different colors.
Figure 4. Proposed workflow for automatically selecting the motion management strategy prior to patient simulation.
Characteristics of input features.
| Imaging features | Clinical features |
|---|---|
| Tumor centroid and edge locations relative to the apex of the lung (SI) | Age [yrs] |
| Tumor centroid location relative to the chest wall (AP, LR) | Weight [Lbs] |
| Tumor edge location relative to the chest wall (AP, LR) | Respiratory rate |
| Lung dimension in SI, AP and LR directions [cm] | Smoking history [pack yrs] |
| Tumor contact with the chest wall (2-classes) | Staging |
| Target lung volume [ | Primary tumor (T) |
| Contralateral lung volume [ | Regional lymph nodes (N) |
| Volume of Gross Tumor Volume (GTV) [ | Distant metastasis (M) |
| GTV density [HU] | Tumor location: lung-wise (Left/Right) |
| Density of surrounding tissues around GTV relative to the lung | Tumor location: lobe-wise (Upper/Middle/Lower) |
| Target lung density [HU] | Performance status |
The number of features is determined before data pre-processing.
Figure 5. Experimental design of the Super-Learner model. The entire dataset is first divided into training and independent test groups; the training group is then further divided using five-fold cross-validation (the validation set is shown in yellow).
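The splitting scheme described in the Figure 5 caption can be sketched as follows. This is a minimal illustration: the 20% test fraction, the per-round seeds and the helper names are assumptions not given in the source; only the 150 patients, the five folds and the ten rounds come from the text.

```python
import random


def train_test_split_indices(n_samples, test_fraction, seed):
    """Shuffle sample indices and hold out a fraction as the independent test group."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_indices(indices, n_folds=5):
    """Yield (training, validation) index lists; each sample is validated exactly once."""
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        validation = folds[k]
        training = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield training, validation


N_PATIENTS = 150      # cohort size stated in the abstract
TEST_FRACTION = 0.2   # assumption: the source does not state the split ratio

fold_sizes = []
for round_id in range(10):  # ten rounds of independent tests
    train_idx, test_idx = train_test_split_indices(N_PATIENTS, TEST_FRACTION, seed=round_id)
    for fold_train, fold_val in k_fold_indices(train_idx, n_folds=5):
        # A model would be fitted on fold_train and scored on fold_val here;
        # test_idx stays untouched until the final evaluation of the round.
        fold_sizes.append((len(fold_train), len(fold_val)))

print(len(fold_sizes), "cross-validation folds generated")
```

Keeping the test indices out of every fold is what makes the final test in each round independent of model selection and hyper-parameter tuning.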
Summary of the base machine learning models used in this study.
| Model | Characteristics | Parameters |
|---|---|---|
| Random Forest | A large number of decision trees based on random subsampling | n_estimators, max_depth, max_features, min_samples_split, min_samples_leaf |
| Multi-Layer Perceptron (MLP) Networks | Auxiliary features are generated by each layer; a high number of tunable weights | layer compositions, number of hidden units, dropouts, learning rate, number of epochs |
| XGBoost | A variation of boosting; generalizes weak learners by allowing optimization of the differentiable loss function | max_depth, min_child_weight, subsample, colsample_bytree, learning rate, num_boost_round |
| LightGBM | Gradient boost based on the decision tree algorithm | num_leaves, feature_fraction, lambdas, max_depth, min_child_samples |
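The idea of combining base models into a super-learner can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation. Three simple stand-in regressors replace Random Forest, MLP, XGBoost and LightGBM. The meta-learner here is a coarse grid search over convex weights that minimizes the MAE of the out-of-fold predictions; the source does not specify the actual meta-learner. All class names, function names and data are hypothetical.

```python
import itertools
import random


class MeanRegressor:
    """Stand-in base learner: always predicts the training mean."""
    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class UnivariateLinearRegressor:
    """Stand-in base learner: least-squares line on the first feature only."""
    def fit(self, X, y):
        xs = [row[0] for row in X]
        mx, my = sum(xs) / len(xs), sum(y) / len(y)
        sxx = sum((x - mx) ** 2 for x in xs)
        sxy = sum((x - mx) * (t - my) for x, t in zip(xs, y))
        self.slope_ = sxy / sxx if sxx else 0.0
        self.intercept_ = my - self.slope_ * mx
        return self

    def predict(self, X):
        return [self.intercept_ + self.slope_ * row[0] for row in X]


class KNearestRegressor:
    """Stand-in base learner: mean target of the k nearest training samples."""
    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        out = []
        for row in X:
            ranked = sorted(
                (sum((a - b) ** 2 for a, b in zip(row, r)), t)
                for r, t in zip(self.X_, self.y_)
            )
            nearest = ranked[: self.k]
            out.append(sum(t for _, t in nearest) / len(nearest))
        return out


def mae(y_true, y_pred):
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def out_of_fold_predictions(make_model, X, y, n_folds=5):
    """Predict every sample with a model that never saw it during fitting."""
    oof = [0.0] * len(y)
    for k in range(n_folds):
        val = list(range(k, len(y), n_folds))
        val_set = set(val)
        train = [i for i in range(len(y)) if i not in val_set]
        model = make_model().fit([X[i] for i in train], [y[i] for i in train])
        for i, p in zip(val, model.predict([X[i] for i in val])):
            oof[i] = p
    return oof


def fit_super_learner(factories, X, y, n_folds=5, step=0.1):
    """Choose convex blending weights on out-of-fold predictions, then refit on all data."""
    oof = [out_of_fold_predictions(f, X, y, n_folds) for f in factories]
    n_steps = round(1 / step)
    best_weights, best_error = None, float("inf")
    for combo in itertools.product(range(n_steps + 1), repeat=len(factories)):
        if sum(combo) != n_steps:
            continue  # keep only weight vectors that sum to one
        w = [c / n_steps for c in combo]
        blended = [sum(wj * col[i] for wj, col in zip(w, oof)) for i in range(len(y))]
        err = mae(y, blended)
        if err < best_error:
            best_weights, best_error = w, err
    models = [f().fit(X, y) for f in factories]
    return models, best_weights, best_error


def predict_super_learner(models, weights, X):
    preds = [m.predict(X) for m in models]
    return [sum(w * p[i] for w, p in zip(weights, preds)) for i in range(len(X))]


# Synthetic, hypothetical data: two features and one motion amplitude in mm.
rng = random.Random(0)
X = [[rng.uniform(0.0, 20.0), rng.uniform(0.0, 1.0)] for _ in range(60)]
y = [0.5 * row[0] + rng.gauss(0.0, 1.0) for row in X]

factories = [MeanRegressor, UnivariateLinearRegressor, lambda: KNearestRegressor(k=3)]
models, weights, cv_error = fit_super_learner(factories, X, y)
print("weights:", weights, "cross-validated MAE:", round(cv_error, 3))
```

Because the weight grid includes the vectors that put all weight on a single model, the blended out-of-fold MAE can be no worse than that of the best individual base learner on the same folds. In practice each direction (SI, AP, LR) would get its own fitted super-learner, as the abstract reports separate predictions per direction.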