| Literature DB >> 32150991 |
Wan Zhu1,2, Longxiang Xie1, Jianye Han3, Xiangqian Guo1.
Abstract
Deep learning has been applied to many areas of health care, including imaging diagnosis, digital pathology, prediction of hospital admission, drug design, classification of cancer and stromal cells, doctor assistance, etc. Cancer prognosis estimates the likely course of the disease, including the probabilities of recurrence and progression, and provides survival estimates to patients. Accurate prognosis prediction will greatly benefit the clinical management of cancer patients. Advances in biomedical translational research and the application of advanced statistical analysis and machine learning methods are the driving forces behind improved cancer prognosis prediction. In recent years, there has been a significant increase in computational power and rapid advancement in artificial intelligence technology, particularly in deep learning. In addition, the falling cost of large-scale next-generation sequencing, and the availability of such data through open-source databases (e.g., the TCGA and GEO databases), offer opportunities to build more powerful and accurate models for predicting cancer prognosis. In this review, we survey the most recently published works that used deep learning to build models for cancer prognosis prediction. Deep learning has been suggested to be more generic, to require less data engineering, and to achieve more accurate predictions when working with large amounts of data. Its application to cancer prognosis has been shown to be equivalent or superior to current approaches, such as Cox-PH. With the burst of multi-omics data, including genomics data, transcriptomics data and clinical information in cancer studies, we believe that deep learning could substantially improve cancer prognosis prediction.
Keywords: cancer prognosis; deep learning; machine learning; multi-omics; prognosis prediction
Year: 2020 PMID: 32150991 PMCID: PMC7139576 DOI: 10.3390/cancers12030603
Source DB: PubMed Journal: Cancers (Basel) ISSN: 2072-6694 Impact factor: 6.639
Summary of neural network models with no feature extraction.
| Publication a | Type of Cancer | Type of Data | Sample Size | Methods | Architecture | Outputs | Hyperparameters | Validation | NN Model Performance |
|---|---|---|---|---|---|---|---|---|---|
| Joshi et al., 2006 | Melanoma | Clinical data of tumors | 1946 (1160 females and 786 males) | 3 layers NN | Normalized input | Survival time | Sigmoid activation | Not reported | Achieved similar performance as Cox and Kaplan–Meier statistical methods |
| Chi et al., 2007 | Breast cancer | Cell images to measure 30 nuclear features | Dataset 1: 198 cases; | 3 layers NN | 30 input nodes, 20 hidden nodes | Survival time | Epoch = 1000, | 10-fold cross validation | As good as conventional methods |
| Petalidis et al., 2008 | Astrocytic brain tumor | A list of gene expressions from microarray data | 65 | A single-layer perceptron and an output (multiple binary models) | Number of inputs equal to the number of classifier genes in different models | Tumor grades | Lr 1 = 0.05, | Leave-one-out cross validation | 44, 9 and 7 probe sets achieved 93.3%, 84.6%, and 95.6% validation success rates, respectively. |
| Ching et al., 2018 | 10 types of cancer | TCGA gene expression data, clinical data and survival data | 5031 | NN | Input normalized and log-transformed, 0–2 hidden layers (143 nodes) | Survival time | L1, L2, or MCP 2 regularization, tanh activation for hidden layer(s), dropout, Cox regression as output layer | 5-fold cross validation | Similar or in some cases better performance than Cox-PH, Cox-boosting or RF |
| Katzman et al., 2018 | Breast cancer | METABRIC 3, GBSG 4 | METABRIC: 1980; GBSG: 1546 training, 686 testing | NN | METABRIC: 1 dense layer, 41 nodes | Survival | SELU 5 activation, | 20% of METABRIC patients used as test set; GBSG came pre-split into training and test sets | C-index: 0.654 for METABRIC and 0.676 for GBSG, both better than Cox-PH |
| Jing et al., 2019 | Breast cancer, nasopharyngeal carcinoma | METABRIC, | METABRIC: 1980 | DNN | METABRIC: 4 hidden layers, 45 nodes each; GBSG: 3 hidden layers, 84, 84 and 70 nodes, respectively | Survival | ELU 6, dropout, L1 and L2, momentum, LR decay, batch size. Loss function equals mean square error plus a pairwise ranking loss | After removing patients with missing data, 20% used as test set | C-index: 0.661 for METABRIC and 0.688 for GBSG, both better than Cox-PH and DeepSurv. C-index ranged 0.681–0.704 depending on input data for NPC 7, better than Cox-PH. |
Abbreviations: 1 Lr: learning rate; 2 MCP: minimax concave penalty; 3 METABRIC: Molecular Taxonomy of Breast Cancer International Consortium; 4 GBSG: the German Breast Cancer Study Group; 5 SELU: scaled exponential linear unit; 6 ELU: exponential linear unit; 7 NPC: nasopharyngeal carcinoma. Links to source codes if available from publications: Petalidis et al. [13]: http://www.imbb.forth.gr/people/poirazi/software.html. Ching et al. [59]: https://github.com/lanagarmire/cox-nnet. Katzman et al. [60]: https://github.com/jaredleekatzman/DeepSurv. Jing et al., 2019 [61]: http://github.com/sysucc-ailab/RankDeepSurv.
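Several of the models above (e.g., Ching et al.'s Cox-nnet and Katzman et al.'s DeepSurv) use a Cox regression output layer, which means training minimizes the negative Cox partial log-likelihood over the network's predicted risk scores. As a rough illustration — not the authors' code, but a simplified NumPy sketch that ignores tied event times — the loss could be written as:

```python
import numpy as np

def cox_ph_loss(risk, time, event):
    """Negative Cox partial log-likelihood for predicted risk scores.

    The risk set of patient i is every patient j with time[j] >= time[i];
    only patients with an observed event (event[i] == 1) contribute terms.
    Normalizing by the number of events is one common convention.
    """
    order = np.argsort(-time)                      # sort by descending time
    risk, event = risk[order], event[order]
    log_risk_set = np.log(np.cumsum(np.exp(risk))) # log-sum over each risk set
    return -np.sum((risk - log_risk_set) * event) / max(event.sum(), 1)
```

A model that assigns higher risk scores to patients who fail earlier receives a lower loss, which is exactly the behavior the Cox output layer rewards during backpropagation.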
Summary of neural network models that used feature extraction.
| Publication a | Type of Cancer | Type of Data | Sample Size | Methods Used in Feature Extraction | Architecture | Outputs | Hyperparameters | Validation | NN Model Performance |
|---|---|---|---|---|---|---|---|---|---|
| Sun et al., 2018 | Breast cancer | Gene expression profile, CNA 1 profile and clinical data | 1980 (1489 LTS 2, 491 non-LTS) | mRMR (extracted 400 features from gene expression and 200 features from CNA) | 4 hidden layers (1000, 500, 500, and 100 nodes, respectively) | Survival time | Lr 3 = 1e−3, | 10-fold cross validation | ROC 4: 0.845 (better than SVM, RF 5, and LR 6), Sp 7: 0.794–0.826, Pre 8: 0.749–0.875, Sn 9: 0.2–0.25, Mcc 10: 0.356–0.486 |
| Huang et al., 2019 | Breast cancer | mRNA, miRNA, CNB 11, TMB 12, clinical data | 583 (80% for training, 20% for testing in each fold of cross validation) | lmQCM 13, | Hybrid network; mRNA and miRNA dimension-reduction inputs have 1 hidden layer (8 and 4 nodes, respectively); CNB, TMB and clinical data have no hidden layer | Survival time | Adam optimizer, LASSO 14 regularization, | 5-fold cross validation | Multi-omics data network reached a median c-index 15 of 0.7285 |
| Hao et al., 2018 | Glioblastoma multiforme | Gene expression (TCGA), pathway (MSigDB 16) | 475 (376 non-LTS, 99 LTS) | Pathway-based analysis (12,024 genes from mRNA data to 574 pathways and 4359 genes) | 4 layers NN: gene layer—pathway layer—hidden layer—output | Survival time | Lr = 1e−4, | 5-fold cross validation | AUC 17 = 0.66 ± 0.013 |
| Chaudhary et al., 2018 | Liver cancer | mRNA, miRNA, methylation data, and clinical data (TCGA) | 360 samples training (5 additional cohorts, 230, 221, 166, 40 and 27 samples for validation) | Autoencoder unsupervised NN to extract 100 features from mRNA, miRNA and methylation data | 3 hidden layers NN (500, 100, 500 nodes, respectively); the 100-node middle layer serves as the bottleneck | Feature reduction | Epoch = 10, | Not reported | NN outputs were used for K-means clustering. |
| Shimizu and Nakayama, 2019 | Breast cancer | METABRIC 19 | 1903 (METABRIC, 952 samples for training) | Selected 23 genes by statistical methods | 3 layers NN | Survival time | Lr = 0.001, | 951 samples from METABRIC | NN node weights were used to calculate a mPS 20 |
Abbreviations: 1 CNA: copy number alteration; 2 LTS: long-term survivors; 3 Lr: learning rate; 4 ROC: receiver operating characteristic; 5 RF: random forest; 6 LR: logistic regression; 7 Sp: specificity; 8 Pre: precision; 9 Sn: sensitivity; 10 Mcc: Matthews correlation coefficient, computed as (TP*TN − FP*FN)/√[(TP + FN)*(TP + FP)*(TN + FN)*(TN + FP)]; 11 CNB: copy number burden; 12 TMB: tumor mutation burden; 13 lmQCM: local maximum Quasi-Clique Merger [67]; 14 LASSO: also known as L1 regularization; 15 c-index (concordance index): Steck et al. [70] suggested that the c-index is equivalent to the AUC. A c-index close to 0.5 suggests random prediction; the closer the c-index gets to 1, the better the model. 16 MSigDB: Molecular Signatures Database; 17 AUC: area under the ROC curve; 18 SGD: stochastic gradient descent; 19 METABRIC: Molecular Taxonomy of Breast Cancer International Consortium; 20 mPS: molecular prognostic score. Links to source codes if available from publications: Sun et al., 2018 [66]: https://github.com/USTC-HIlab/MDNNMD. Huang et al., 2019 [67]: https://github.com/huangzhii/SALMON/. Hao et al., 2018 [62]: https://github.com/DataX-JieHao/PASNet. Shimizu and Nakayama, 2019 [69]: https://hideyukishimizu.github.io/mPS_breast.
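The bottleneck idea behind Chaudhary et al.'s feature extraction — compress high-dimensional omics features through a narrow hidden layer and keep the bottleneck activations as the reduced features — can be illustrated with a toy tied-weight *linear* autoencoder. This is a deliberate simplification of their deeper nonlinear network; all shapes, learning rates, and the synthetic data below are illustrative only:

```python
import numpy as np

def train_linear_autoencoder(X, k, lr=0.01, epochs=300, seed=0):
    """Toy tied-weight linear autoencoder: encode X (n x d) to k bottleneck
    features via W, decode via W.T, minimizing reconstruction error by
    gradient descent. A simplification of the deeper autoencoders used
    for multi-omics feature reduction in practice."""
    n, d = X.shape
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(d, k))
    for _ in range(epochs):
        E = X @ W @ W.T - X                           # reconstruction error
        grad = (2.0 / n) * (X.T @ E @ W + E.T @ X @ W)
        W -= lr * grad
    return W

# Hypothetical data: 50 samples with 20 expression features, reduced to 5.
X = np.random.default_rng(1).normal(size=(50, 20))
W = train_linear_autoencoder(X, k=5)
Z = X @ W  # bottleneck features, one row per sample
```

In the actual workflow, the rows of `Z` (rather than the raw omics features) would feed a downstream survival model or, as in Chaudhary et al., a clustering step.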
Summary of CNN based models.
| Publication a | Type of Cancer | Type of Data | Sample Size | Architecture | Outputs | Hyperparameters | Validation | NN Model Performance |
|---|---|---|---|---|---|---|---|---|
| Korfiatis et al., 2017 | Glioblastoma multiforme | MRI images | 155 (66 methylated and 89 unmethylated tumors) | Base model: | 3 classes: methylated, unmethylated, or no tumor | Lr 1 = 0.01, mini-batch = 32, momentum = 0.5, weight decay = 0.1, | Stratified cross-validation | ResNet50-based model validation dataset performance: Accuracy = 94.9%, Precision = 96%, Recall = 95% |
| Han et al., 2018 | Glioblastoma multiforme | MRI images | 458,951 image frames from 5235 MRI scans of 262 patients (TCIA 3) | 3 convolutional layers, 2 fully connected layers, 1 bi-directional GRU 4 layer (RNN), 1 fully connected layer, softmax output | 2 classes (positive and negative methylation status) | Data augmentation (rotation and flipping, 90-fold increase of the dataset), Lr = 5e−6 – 5e−1, | Validation set reached a precision of 0.67, an AUC of 0.56. | Training data set obtained 0.97 accuracy; 0.67 and 0.62 accuracies on the validation and test sets, respectively |
| Mobadersany et al., 2018 | Low grade glioma and glioblastoma | H&E images, genomics data, clinical data | 769 gliomas from TCGA, containing genomics data (IDH mutation and 1p/19q codeletion), clinical data and 1061 slides | VGG19 as the base model and Cox regression as output; built 2 models with or without genomics data | Survival | Data augmentation, Lr = 0.001, epoch = 100, exponential learning decay | Monte Carlo cross-validation | SCNN median c-index is 0.754; GSCNN (adding IDH mutation and 1p/19q codeletion as features) improved the median c-index to 0.801 |
| Kather et al., 2019 | Colorectal cancer | H&E tissue slides | Training set (tissue): 86 H&E slides to create 100,000 image patches | Base models: VGG19, AlexNet, GoogLeNet, SqueezeNet and ResNet50, with an added output softmax layer | 9 tissue type classification | Lr = 3e−4, | An independent cohort of 409 samples | VGG19 gave the best results, 94–99% accuracy in tissue class prediction |
| Bychkov et al., 2018 | Colorectal cancer | H&E images of tumor tissue microarray | 420 patients (equal numbers who survived or died within five years after diagnosis) | VGG16 to generate a 16 × 16 feature from input data, followed by 3 LSTM 6 layers (264, 128 and 64 LSTM cells, respectively) | Survival | Default hyperparameters in VGG16; LSTM used hyperbolic tangent activation, binary cross-entropy loss function, Adadelta optimizer | 60 samples for validation, 140 samples for testing | CNN + LSTM model reached an AUC 7 of 0.69, better than CNN + SVM, CNN + LR 8, or CNN + NB 9 |
| Courtiol et al., 2019 | Mesothelioma | H&E slides | 2981 patient slides (MESOPATH/MESOBANK, 2300 training, 681 testing) | Divided each slide into up to 10,000 tiles as input data | Survival | Multi-layer perceptron with sigmoid activation, | 56 patient slides | MesoNet outperformed histology-based classification but was no better than a linear regression based model (Meanpool) |
| Wang et al., 2019 | High grade serous ovarian cancer | CT scanning venous phase images | Feature learning cohort: 8917 CT images from 102 patients | Five convolutional layers (24, 16, 16, 16, 16 filters, respectively) | 16-dimensional feature vector | Batch normalization, average pooling between adjacent convolutional layers | Not reported | CNN outputs were used to build a Cox-PH model |
Abbreviations: 1 Lr: learning rate; 2 SGD: stochastic gradient descent; 3 TCIA: The Cancer Imaging Archive; 4 GRU: gated recurrent unit, which is similar to LSTM and is used in building RNN models; 5 OS: overall survival; 6 LSTM: long short-term memory cell; 7 AUC: area under the ROC curve; 8 LR: logistic regression; 9 NB: naive Bayes; 10 c-index: also known as Harrell's concordance index. Links to source codes if available from publications: Han et al., 2018 [81]: http://onto-apps.stanford.edu/m3crnn/. Kather et al., 2019 [74]: http://dx.doi.org/10.5281/zenodo.1214456, http://dx.doi.org/10.5281/zenodo.1420524, http://dx.doi.org/10.5281/zenodo.1471616. Wang et al., 2019 [86]: http://www.radiomics.net.cn/post/111.
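Harrell's concordance index (footnote 10), used to score several of the models above, counts, among comparable patient pairs, how often the model assigns the higher risk to the patient who failed first. A minimal sketch (O(n²), no handling of tied event times, which full implementations such as those in survival analysis libraries do address):

```python
import numpy as np

def harrell_c_index(risk, time, event):
    """Harrell's concordance index. A pair (i, j) is comparable when patient
    i has an observed event before time[j]; the pair is concordant when the
    model gives i the higher risk score. Ties in risk count as half."""
    concordant, comparable = 0.0, 0
    n = len(time)
    for i in range(n):
        if not event[i]:                 # censored patients cannot anchor a pair
            continue
        for j in range(n):
            if time[i] < time[j]:        # comparable: i failed first
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why the reported c-indices of roughly 0.65–0.80 represent useful but imperfect discrimination.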
Figure 1. Workflow of building deep learning models for cancer prognosis prediction. The sources of input data include clinical data, which could be text data and/or structured data (numeric and/or categorical); clinical images, which could be tissue slides with H&E or immunohistological staining, MRI, CT, etc.; and genomic data, which could be expression data (i.e., mRNA and miRNA expression data), genomic sequence data (i.e., whole genome sequence, SNP data, CNA data, etc.), or epigenetic data (i.e., methylation data), etc. In the next step, researchers examine the data to handle missing and imbalanced data. Dimension reduction of high-dimensional genomic data is an optional step here. The features are then used to build a deep learning (neural network) model. The type of model to use depends on the input data: for example, a fully connected NN is commonly used for structured datasets, image data is used to build CNN models, and sequence data is often used to build RNN models. If multiple types of data exist, hybrid models can be built to accept the different data types. After the model is built, it is tested on the holdout (or validation) datasets. It is also important to test and compare models using benchmark datasets. Finally, the model can be used in applications. Abbreviations: FPR: false positive rate; TPR: true positive rate.
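One step of this workflow — holding out a test set while respecting imbalanced data — can be sketched as a stratified split that preserves the class proportions in both partitions. The label array, split fraction, and function name below are illustrative, not taken from any of the reviewed publications:

```python
import numpy as np

def stratified_holdout(y, test_frac=0.2, seed=0):
    """Split sample indices into train/test sets, preserving the class
    balance of label vector y — the holdout step of the workflow,
    sketched for a categorical outcome label."""
    rng = np.random.default_rng(seed)
    train, test = [], []
    for cls in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == cls))
        n_test = max(1, int(round(test_frac * len(idx))))
        test.extend(idx[:n_test])      # same fraction held out per class
        train.extend(idx[n_test:])
    return np.array(train), np.array(test)

# Hypothetical imbalanced labels: 80 long-term survivors, 20 non-survivors.
y = np.array([0] * 80 + [1] * 20)
train_idx, test_idx = stratified_holdout(y)
```

Without stratification, a random 20% holdout of a heavily imbalanced cohort can end up with very few (or zero) minority-class patients, making metrics such as sensitivity on the test set unreliable.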