Haiming Tang, Nanfei Sun, Steven Shen.
Abstract
BACKGROUND: Artificial intelligence is making progress in diagnostic pathology. Many studies applying deep learning models to histopathological images have been published in recent years. While many of these studies claim high accuracy, they may fall into the pitfalls of overfitting and poor generalization because histopathological images are highly variable. AIMS AND OBJECTIVES: To use model training on osteosarcoma as an example to illustrate the pitfalls of overfitting and to show how adding variability to the model input can help improve model performance.
Keywords: Artificial intelligence; computer vision; deep learning; diagnostic pathology; osteosarcoma; overfitting
Year: 2021 PMID: 34497734 PMCID: PMC8404558 DOI: 10.4103/jpi.jpi_78_20
Source DB: PubMed Journal: J Pathol Inform
Cross-tabulation of tumor types by patient for The Cancer Imaging Archive (TCIA) osteosarcoma dataset
| Tumor type / Patient ID | P9 | Case 3 | Case 4 | Case 48 | Total |
|---|---|---|---|---|---|
| Nontumor | 212 | 110 | 78 | 136 | 536 |
| Nonviable-tumor | 0 | 171 | 90 | 2 | 263 |
| Viable | 0 | 3 | 87 | 202 | 292 |
| Viable: Nonviable | 0 | 1 | 22 | 30 | 53 |
| Total | 212 | 285 | 277 | 370 | 1144 |
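The counts in the crosstab above can be used to see what a patient-level split looks like, such as the one in Experiment B of the performance summary, where Case 4 is held out for testing. The following is a minimal sketch, not the authors' code: the counts are copied from the table, and `patient_totals` and `split_by_patient` are hypothetical helper names introduced only for illustration.

```python
# Image counts per tumor type and patient, copied from the crosstab above.
PATIENTS = ["P9", "Case 3", "Case 4", "Case 48"]
COUNTS = {
    "Nontumor": [212, 110, 78, 136],
    "Nonviable-tumor": [0, 171, 90, 2],
    "Viable": [0, 3, 87, 202],
    "Viable: Nonviable": [0, 1, 22, 30],
}


def patient_totals():
    """Column totals: number of images contributed by each patient."""
    return {
        patient: sum(row[i] for row in COUNTS.values())
        for i, patient in enumerate(PATIENTS)
    }


def split_by_patient(held_out):
    """Class counts for a patient-level split (hypothetical helper).

    All images from `held_out` go to the test set and the rest go to
    training, so no patient contributes images to both sides.
    """
    j = PATIENTS.index(held_out)
    train = {tumor: sum(row) - row[j] for tumor, row in COUNTS.items()}
    test = {tumor: row[j] for tumor, row in COUNTS.items()}
    return train, test


# Example: hold out Case 4, as in Experiment B.
train_counts, test_counts = split_by_patient("Case 4")
print("per-patient totals:", patient_totals())
print("train:", train_counts)
print("test: ", test_counts)
```

Holding out a whole patient keeps that patient's images out of training entirely, whereas a random 70/30 split over all 1144 images places images from the same patient on both sides. The table also shows how uneven the patients are: P9 contributes only nontumor images, and Case 3 contributes almost no viable tumor.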
Figure 1. Sample images of the different osteosarcoma subtypes collected for Experiments E and F, shown in Table 2
Experimental setup and performance summary
| Experiment label | Training set | Test set | Specific subtype of osteosarcoma | AUC |
|---|---|---|---|---|
| A | 70% of TCIA set | 30% of TCIA set | NA | 0.831406 |
| B | TCIA set (P9, Case 3, Case 48) | TCIA set (Case 4) | NA | 0.700612 |
| C | 70% of TCIA set | 30% of all subtypes of osteosarcoma + all benign tissues, osteoma and osteoid osteoma | NA | 0.55 |
| D | TCIA set (P9, Case 3, Case 48) | Same test set as C | NA | 0.29 |
| E | 70% of different combinations of osteosarcoma subtypes + 70% of all benign tissues, osteoma and osteoid osteoma | Same test set as C | Smallcellvariant_fibroblastic | 0.471908 |
| | | | Smallcellvariant_fibroblastic_periosteal | 0.524636 |
| | | | Smallcellvariant_fibroblastic_periosteal_telangiectactic | 0.528312 |
| | | | Smallcellvariant_fibroblastic_periosteal_telangiectactic_complicatingpaget | 0.598128 |
| | | | Smallcellvariant_fibroblastic_periosteal_telangiectactic_complicatingpaget_epithelioid | 0.863568 |
| | | | Smallcellvariant_fibroblastic_periosteal_telangiectactic_complicatingpaget_epithelioid_withgiantcells | 0.851232 |
| | | | Smallcellvariant_fibroblastic_periosteal_telangiectactic_complicatingpaget_epithelioid_withgiantcells_parosteal | 0.840264 |
| | | | Smallcellvariant_fibroblastic_periosteal_telangiectactic_complicatingpaget_epithelioid_withgiantcells_parosteal_osteoblastic | 0.885664 |
| | | | Smallcellvariant_fibroblastic_periosteal_telangiectactic_complicatingpaget_epithelioid_withgiantcells_parosteal_osteoblastic_chondroblastic | 0.88704 |
| F | 70% of one specific type of osteosarcoma + 70% of all benign tissues, osteoma and osteoid osteoma | Same test set as C | Chondroblastic | 0.663656 |
| | | | Complicatingpaget | 0.451864 |
| | | | Epithelioid | 0.457356 |
| | | | Fibroblastic | 0.392112 |
| | | | Osteoblastic | 0.630856 |
| | | | Parosteal | 0.542764 |
| | | | Periosteal | 0.396016 |
| | | | Smallcellvariant | 0.388852 |
| | | | Telangiectactic | 0.399856 |
| | | | Withgiantcells | 0.457676 |
| G | 70% of one specific type of osteosarcoma + 70% of all benign tissues | 30% of all subtypes of osteosarcoma + all benign tissues | Chondroblastic | 0.916284 |
| | | | Complicatingpaget | 0.826564 |
| | | | Epithelioid | 0.906808 |
| | | | Fibroblastic | 0.732032 |
| | | | Osteoblastic | 0.94616 |
| | | | Parosteal | 0.761204 |
| | | | Periosteal | 0.783428 |
| | | | Smallcellvariant | 0.535136 |
| | | | Telangiectactic | 0.569512 |
| | | | Withgiantcells | 0.660888 |
TCIA: The Cancer Imaging Archive, NA: Not available, AUC: Area under the curve
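The AUC column reports the area under the receiver operating characteristic curve. As a rough illustration of what that number measures, the sketch below computes it as the fraction of (positive, negative) pairs that the model ranks correctly, with ties counted as half. This is an assumed, simplified implementation for small inputs, not the authors' evaluation code; `roc_auc` is a name introduced here, and the labels and scores are toy values.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (illustrative sketch).

    labels: iterable of 0 (benign) or 1 (tumor)
    scores: model scores, where higher means "more likely tumor"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Under this reading, an AUC of 0.5 corresponds to random ranking, and a value below 0.5, such as the 0.29 in Experiment D, means the model ranks benign images above tumor images more often than not on that test set.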
Figure 2. Benign dataset of Experiments E and F, consisting of benign bone tissues and two types of benign tumors: osteoma and osteoid osteoma
Figure 3. Model metrics: loss and area under the curve during training epochs; upper, Experiment A; lower, Experiment B
Figure 4. Receiver operating characteristic curves for Experiments C and D, which show the performance of the models from Experiments A and B applied to the test set composed of all subtypes of osteosarcoma, benign tissues, and benign bone tumors
Figure 5. Metrics of the model trained using only the chondroblastic subtype in Experiment E
Figure 6. Boxplot of the area under the curve over the 25 epochs of the subtype models in Experiment E on the same test dataset
Figure 7. Boxplots of the area under the curve over the 25 epochs of the models that add up different subtypes in Experiment F