Cheng-Yao Lin, Tsair-Wei Chien, Yen-Hsun Chen, Yen-Ling Lee, Shih-Bin Su.
Abstract
BACKGROUND: Breast cancer (BC) is the most common malignant cancer in women. A predictive model is required to predict the 5-year survival in patients with BC (5YSPBC) and to improve treatment quality by increasing their survival rate. However, no reports in the literature describe apps developed and designed for medical practice to classify the 5YSPBC. This study aimed to build a model and develop an app for automatic and accurate classification of the 5YSPBC.
Year: 2022 PMID: 35089226 PMCID: PMC8797502 DOI: 10.1097/MD.0000000000028697
Source DB: PubMed Journal: Medicine (Baltimore) ISSN: 0025-7974 Impact factor: 1.889
Illustration of the coding schema of response in this study.
| Variable and coding schema | n | % |
| Age | | |
| 1: <40 yrs | 261 | 14.42 |
| 2: 41–50 yrs | 614 | 33.92 |
| 3: 51–60 yrs | 512 | 28.29 |
| 4: >60 yrs | 423 | 23.37 |
| Pathologic M | | |
| 1: M0 | 985 | 54.42 |
| 2: M1 | 33 | 1.82 |
| 3: M1b uncertain | 706 | 39.01 |
| 4: M1c (found during or after surgery but not confirmed by pathology M1c) | 1 | 0.06 |
| 5: Null | 85 | 4.7 |
| Clinical N | | |
| 1: N0 | 1144 | 63.2 |
| 2: N1 | 381 | 21.05 |
| 3: N2 | 50 | 2.76 |
| 4: N2a | 4 | 0.22 |
| 5: N3 | 19 | 1.05 |
| 6: N3a | 2 | 0.11 |
| 7: N3b | 3 | 0.17 |
| 8: N3c | 12 | 0.66 |
| 9: Null | 195 | 10.77 |
| Clinical M | | |
| 1: M0 | 1533 | 84.7 |
| 2: M1 | 105 | 5.8 |
| 3: Null | 172 | 9.5 |
| Cancer status | | |
| 1: There is no evidence of this primary cancer | 1488 | 82.21 |
| 2: There is such primary cancer clinically | 322 | 17.79 |
| Vital status | | |
| 0: Non-survival | 298 | 16.46 |
| 1: Survival | 1512 | 83.54 |
Note. The coding schema is referred to as Supplemental Digital Content 1.
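Each percentage in the coding schema table is a category's share of all cases. As a minimal sketch (the function and variable names are illustrative and do not come from the study), the Clinical M percentages can be reproduced from the counts in the table:

```python
# Counts copied from the Clinical M rows of the coding schema table.
clinical_m_counts = {"M0": 1533, "M1": 105, "Null": 172}


def percentages(counts):
    """Return each category's share of the total, in percent, rounded to 2 decimals."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


total_cases = sum(clinical_m_counts.values())
clinical_m_pct = percentages(clinical_m_counts)
print(total_cases, clinical_m_pct)
```

The same check can be applied to any other variable in the table by swapping in its counts.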
Figure 1. Study flowchart.
The 15 variables extracted from the 53 eligible variables.
| Stage of variable | No. | Variable | Definition |
| TNM | 1 | Pathologic M | Whether distant metastasis is present |
| Characteristics | 2 | Body mass index (BMI) | kg/m² |
| TNM | 3 | Scope of regional lymph node surgery at this facility | The scope of simultaneous removal, slicing, or aspiration of regional lymph nodes during the primary site operation or another independent operation in the reporting hospital |
| Treatment | 4 | Surgery of the primary site | Whether clear surgical records and dates exist |
| TNM | 5 | Clinical N | Whether regional lymph node metastasis is present, and its extent |
| TNM | 6 | Regional lymph nodes positive | Total number of regional lymph nodes found positive by a pathologist |
| TNM | 7 | Clinical M | Whether distant metastasis is present |
| Treatment | 8 | Reason for no surgery of primary site | The reason the primary site was not operated on at any medical institution |
| Cancer | 9 | Grade/Differentiation | Degree of similarity between tumor and normal tissue |
| Cancer | 10 | Tumor size | The largest size or diameter of the primary tumor |
| Treatment | 11 | Surgical margins of the primary site | The final state of the surgical margin after resection of the primary tumor |
| TNM | 12 | Clinical stage group | Degree of disease invasion at the anatomical site, determined from clinical T, N, and M |
| Recurrence | 13 | Date of first recurrence | Interval from the date of first diagnosis to the date the relapse diagnosis was confirmed |
| Recurrence | 14 | Type of first recurrence | Type of the first relapse recorded after a disease-free interval or remission |
| Recurrence | 15 | Cancer status | Whether the case had cancer at the “last contact or death date” |
Note. The coding schema is referred to as Supplemental Digital Content 1.
Figure 2. Feature variable comparison between two groups (survival vs non-survival) using the forest plot.
Comparison of predictive models across indicators of accuracy and AUC.
| Model | n | Sensitivity | Specificity | Precision | | Accuracy | AUC | 95% CI |
| ANN | | | | | | | | |
| Training | 1357 | 0.87 | 0.58 | 0.88 | 0.87 | 0.8 | 0.72 | 0.7–0.75 |
| Testing | 540 | 0.83 | 0.49 | 0.9 | 0.86 | 0.78 | 0.66 | 0.62–0.70 |
| CNN | | | | | | | | |
| Training | 1357 | 0.92 | 0.68 | 0.91 | 0.78 | 0.87 | 0.8 | 0.78–0.82 |
| Testing | 540 | 0.9 | 0.66 | 0.93 | 0.76 | 0.86 | 0.78 | 0.74–0.81 |
| KNN3 | | | | | | | | |
| Training | 1357 | 0.93 | 0.77 | 0.94 | 0.93 | 0.89 | 0.85 | 0.83–0.87 |
| Testing | 540 | 0.87 | 0.6 | 0.93 | 0.9 | 0.82 | 0.73 | 0.69–0.77 |
| LR | | | | | | | | |
| Training | 1357 | 0.97 | 0.52 | 0.88 | 0.68 | 0.87 | 0.75 | 0.72–0.77 |
| Testing | 540 | 0.96 | 0.54 | 0.92 | 0.69 | 0.89 | 0.75 | 0.71–0.79 |
| Bayes | | | | | | | | |
| Training | 1357 | 0.83 | 0.61 | 0.88 | 0.7 | 0.78 | 0.72 | 0.69–0.74 |
| Testing | 540 | 0.72 | 0.76 | 0.94 | 0.74 | 0.72 | 0.74 | 0.70–0.78 |
Note. The CNN model shows the highest model stability (0.78) when the testing set is predicted by the model built on the training set.
ANN = artificial neural network, CNN = convolutional neural network, KNN = k-nearest neighbors algorithm, LR = logistic regression.
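The sensitivity, specificity, precision, and accuracy reported in the comparison table are conventionally derived from a 2×2 confusion matrix. The sketch below uses the standard textbook definitions; the source does not state how the authors computed them, and the counts passed in are hypothetical, not taken from the study. AUC is omitted because it requires per-case predicted scores rather than four counts.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp: survivors predicted as survivors
    fn: survivors predicted as non-survivors
    tn: non-survivors predicted as non-survivors
    fp: non-survivors predicted as survivors
    """
    return {
        "sensitivity": tp / (tp + fn),              # true-positive rate
        "specificity": tn / (tn + fp),              # true-negative rate
        "precision": tp / (tp + fp),                # positive predictive value
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, fn=10, tn=30, fp=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Because survivors make up most of the cohort, accuracy is dominated by sensitivity, which is why a model can show high accuracy alongside modest specificity.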
Figure 3. Data entry and assessment result.