Maliazurina Saad, Ik Hyun Lee.
Abstract
BACKGROUND: Clinical endpoint prediction remains challenging for health providers. Although predictors such as age, gender, and disease staging are of considerable predictive value, their accuracy often ranges only between 60 and 80%. An accurate prognosis assessment is required for making effective clinical decisions.
Keywords: Biomarker; Clinical decision-making; Endpoint; Imaging; Predictive models
Year: 2020 PMID: 33028301 PMCID: PMC7538849 DOI: 10.1186/s12911-020-01262-3
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Details of the dataset used in this work
| Dataset characteristics | No | (%) |
|---|---|---|
| 211 | 100 | |
| Incomplete records | 50 | 24 |
| Complete records | 161 | 76 |
| 68 | 42 | |
| I) Air Bronchogram | 24 | 35 |
| II) Cavitation | 15 | 22 |
| III) Cysts | 4 | 6 |
| IV) Reticulation | 5 | 7 |
| V) Mix of above | 20 | 29 |
| 91 | 58 | |
| Male | (48/68) | 71 |
| Female | (20/68) | 29 |
| ≤ 70 | (36/68) | 53 |
| > 70 | (32/68) | 47 |
| Primary tumor T | ||
| I | (30/68) | 44 |
| II | (25/68) | 37 |
| III | (9/68) | 13 |
| IV | (4/68) | 6 |
| Lymph node N | ||
| 0 | (56/68) | 82 |
| 1 | (4/68) | 6 |
| 2 | (8/68) | 12 |
| 3 | 0 | 0 |
| Metastasis M | ||
| 0 | (66/68) | 97 |
| 1 | (2/68) | 3 |
| Squamous cell | (10/68) | 15 |
| Non-Squamous cell | (58/68) | 85 |
| Survived | (23/68) | 34 |
| Expired | (45/68) | 66 |
Fig. 1 Flow diagram of the proposed model. The extended framework is indicated on the right side of the model, inside the dotted lines
Selected CBMs and their numeric mapping
| Covariates | Type | Range | Conversion |
|---|---|---|---|
| Gender | Nominal | {Male, Female} | (0,1) |
| Age | Real | (42–87) | NA* |
| Weights (lbs) | Real | (80–318) | NA |
| Smoking years | Real | (0–41) | NA |
| Histology | Nominal | {Squamous, Non-Squamous} | (0,1) |
| T | Categorical | [1–4] | Number of patients in a stage divided by the total number of patients (e.g., if 16 patients are categorized as T1, each of those patients is assigned the value 16/68 = 0.235). |
| N | Categorical | [0,1,2,3] | |
| M | Categorical | [0,1] | |
| Tumor size (mm) | Real | (11.7–73.9) | NA |
*NA means no conversion work is needed
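The conversions in the table above can be sketched in a few lines. This is a minimal illustration, assuming the categorical stages are replaced by their relative frequency in the cohort and the nominal covariates by a fixed 0/1 lookup; the function names and the toy cohort are hypothetical and are not data from the study.

```python
from collections import Counter


def frequency_encode(stages):
    """Replace each category by (patients in that category) / (total patients)."""
    counts = Counter(stages)
    total = len(stages)
    return [counts[s] / total for s in stages]


def binary_encode(values, mapping):
    """Replace each nominal value by its numeric code, e.g. Male -> 0, Female -> 1."""
    return [mapping[v] for v in values]


# Hypothetical toy cohort of 8 patients (illustrative only).
t_stage = [1, 1, 1, 1, 2, 2, 3, 4]
t_encoded = frequency_encode(t_stage)

gender_encoded = binary_encode(["Male", "Female", "Male"], {"Male": 0, "Female": 1})

# The worked example from the table: 16 of 68 patients in one stage.
example_value = round(16 / 68, 3)
```

Real-valued covariates (age, weight, smoking years, tumor size) pass through unchanged, matching the NA entries in the table.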
Fig. 2 Example of decreased density areas in a tumor. Rows from top to bottom show cavitation, reticulation, and the air-bronchogram sign, respectively. Columns A to C show an illustration of the internal features, the tumor mask, and an example CT scan for each case. In the first column, solid and non-solid components are shown in white and black, respectively
The list of covariates used to derive Eqs. 1–8. Active pixels are the white pixels in binary images
| Covariates | Definition |
|---|---|
| PA | The number of active pixels in A |
| PB | The number of active pixels in B |
| P (B∩A) | The number of active pixels that are true for both A and B |
| n | The number of affected slices |
| (x,y) | The contour vertices: air pocket contour vertices (xi, yi) and solid wall contour vertices (xk, yk) |
| Span | The longest distance between two vertices of the tumor mask |
| ID | Inner diameter of a lucent area. If more than one area is present, the average is calculated. |
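As a rough illustration of how the pixel-based covariates and Span defined above might be computed, the sketch below counts active pixels in two binary images, counts their overlap, and finds the longest distance between any two contour vertices. The masks, vertex list, and function names are hypothetical; Eqs. 1–8 themselves are not reproduced here.

```python
from itertools import combinations
from math import dist


def active_pixels(mask):
    """Count the active (non-zero, i.e. white) pixels in a binary image."""
    return sum(1 for row in mask for v in row if v)


def overlap_pixels(mask_a, mask_b):
    """Count pixels that are active in both binary images (P(B∩A))."""
    return sum(
        1
        for row_a, row_b in zip(mask_a, mask_b)
        for a, b in zip(row_a, row_b)
        if a and b
    )


def span(vertices):
    """Longest Euclidean distance between any two vertices of a contour."""
    return max(dist(p, q) for p, q in combinations(vertices, 2))


# Hypothetical 3x4 binary images (1 = white/active pixel).
A = [[0, 1, 1, 0],
     [1, 1, 1, 1],
     [0, 1, 1, 0]]
B = [[0, 0, 1, 0],
     [0, 1, 1, 0],
     [0, 0, 0, 1]]

p_a = active_pixels(A)
p_b = active_pixels(B)
p_ab = overlap_pixels(A, B)
longest = span([(0, 0), (3, 4), (1, 1)])
```

The brute-force pairwise search in `span` is quadratic in the number of vertices, which is usually acceptable for a single tumor contour.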
The set of predictors forming three test cases; CBM, IBM, and HBM. The HBM are a combination of CBMs and selected IBMs
| CBM pool | IBM pool | HBM pool | ||
|---|---|---|---|---|
| Age | RDC | Age | + | Selected imaging biomarkers based on correlation testing. |
| Gender | RLC | Gender | ||
| Weights | DoC | Weights | ||
| Smoking years | ATR | Smoking years | ||
| Histology | LoSA | Histology | ||
| T stage | LoSAR | T stage | ||
| N stage | LoCA | N stage | ||
| M stage | LoCAR | M stage | ||
| Tumor size | Solidity | Tumor size | ||
Fig. 3 Two sub-groups created based on the threshold set for each IBM. For a measurement that is not a ratio, such as LoSA and LoCA, a posterior probability is calculated prior to the threshold setting
Performance comparisons of AUC for the survival prediction in both models
| Classifiers | CBM (AUC) | IBM (AUC) | IDI | |
|---|---|---|---|---|
| Logistic Regression | 0.75 | 0.47 | 0.002 | |
| Random Forest | 0.61 | 0.54 | < 0.001 | |
| Support Vector Machine | 0.74 | 0.50 | < 0.001 | |
| Artificial Neural Network | 0.59 | 0.52 | < 0.001 | |
* denotes statistically significant data
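For readers unfamiliar with the two metrics in this table, the sketch below shows one common way to compute them. It assumes that IDI refers to the integrated discrimination improvement in its usual form (the change in the difference between mean predicted probabilities for events and non-events), which the record itself does not spell out; the labels and probabilities are hypothetical toy values.

```python
from statistics import mean


def auc(labels, scores):
    """AUC as the fraction of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def discrimination_slope(labels, probs):
    """Mean predicted probability among events minus that among non-events."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    return mean(pos) - mean(neg)


def idi(labels, probs_old, probs_new):
    """Integrated discrimination improvement of the new model over the old one."""
    return discrimination_slope(labels, probs_new) - discrimination_slope(labels, probs_old)


# Hypothetical predictions for four patients (1 = event).
labels = [1, 1, 0, 0]
probs_old = [0.6, 0.4, 0.5, 0.3]
probs_new = [0.8, 0.6, 0.4, 0.2]

auc_old = auc(labels, probs_old)
auc_new = auc(labels, probs_new)
improvement = idi(labels, probs_old, probs_new)
```

A positive IDI means the new model separates events from non-events more widely than the old one, independent of any single classification threshold.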
Univariable and multivariable survival analyses for CBMs were performed using the Kaplan-Meier (KM) method and the Cox proportional hazards model, respectively
| Biomarkers | Univariable HR | Univariable 95% CI | Multivariable HR | Multivariable 95% CI |
|---|---|---|---|---|
| Age | ||||
| ≤ 70 | 1 | 1 | ||
| > 70 | 2.235 | 1.097–4.556* | 2.257 | 1.240–4.111 |
| Gender | ||||
| Male | 1 | |||
| Female | 1.015 | 0.560–1.837 | ||
| Weights | ||||
| ≤ 150 | 1 | |||
| > 150 | 1.070 | 0.604–1.897 | ||
| Smoking Status | ||||
| Yes | 1 | |||
| No | 1.623 | 0.872–3.021 | ||
| Primary Tumor | ||||
| ≤ T2 | 1 | |||
| > T2 | 0.904 | 0.446–1.830 | ||
| Lymph Node | ||||
| N0 | 1 | 1 | ||
| ≥ N1 | 3.797 | 1.038–13.887* | 4.163 | 1.858–9.326 |
| Metastasis | ||||
| M0 | 1 | |||
| M1 | 4.863 | 0.238–99.479* | ||
| Histology | ||||
| Squamous | 1 | |||
| Non-Squamous | 1.194 | 0.582–2.452 | ||
| Longest diameter | 1.059 | 1.010–1.110* | ||
* denotes statistically significant data; HR, hazard ratio; CI, confidence interval
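The hazard ratios and intervals in a table like this are typically derived from Cox regression coefficients. The sketch below assumes the standard Wald form, HR = exp(β) with 95% CI = exp(β ± 1.96·SE); whether the intervals above were computed exactly this way is an assumption. As a consistency check it back-calculates the log-scale standard error from the reported multivariable interval for age > 70 (1.240–4.111) and rebuilds the interval around the reported HR of 2.257.

```python
from math import exp, log

Z_95 = 1.959963984540054  # two-sided 95% normal quantile


def hazard_ratio_ci(beta, se, z=Z_95):
    """Return (HR, lower, upper) for a Cox coefficient and its standard error."""
    return exp(beta), exp(beta - z * se), exp(beta + z * se)


def log_se_from_ci(lower, upper, z=Z_95):
    """Recover the standard error of log(HR) from a reported Wald interval."""
    return (log(upper) - log(lower)) / (2 * z)


# Values taken from the multivariable age > 70 row of the table.
se_age = log_se_from_ci(1.240, 4.111)
hr, ci_low, ci_high = hazard_ratio_ci(log(2.257), se_age)
```

Because a Wald interval is symmetric on the log scale, the reported HR should sit close to the geometric mean of its interval bounds, which gives a quick plausibility check on published rows.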
Univariable and multivariable survival analyses for IBMs were performed using the Kaplan-Meier (KM) method and the Cox proportional hazards model, respectively
| Biomarkers | Univariable HR | Univariable 95% CI | Multivariable HR | Multivariable 95% CI |
|---|---|---|---|---|
| RDC | ||||
| SD | 1 | 1 | ||
| NSD | 2.225 | 1.149–4.307* | 0.431 | 0.232–0.769 |
| DoC | ||||
| SD | 1 | |||
| NSD | 1.583 | 0.749–3.342 | ||
| LoSA | ||||
| SD | 1 | |||
| NSD | 1.963 | 1.019–3.781 | ||
| LoSAR | ||||
| SD | 1 | 1 | ||
| NSD | 2.445 | 1.345–4.443* | 0.395 | 0.216–0.708 |
| Solidity | ||||
| SD | 1 | |||
| NSD | 1.908 | 0.930–3.915* | ||
| RLC | ||||
| SD | 1 | |||
| NSD | 2.225 | 1.149–4.307* | ||
| ATR | ||||
| SD | 1 | |||
| NSD | 2.018 | 1.011–4.028* | ||
| LoCA | ||||
| SD | 1 | |||
| NSD | 0.643 | 0.337–1.225 | ||
| LoCAR | ||||
| SD | 1 | 1 | ||
| NSD | 2.274 | 1.153–4.488* | 0.422 | 0.232–0.769 |
* denotes statistically significant data; HR, hazard ratio; CI, confidence interval
Mean and median survival as calculated from Kaplan-Meier survival curves. Only IBMs that reached statistical significance in the univariable analysis in Table 7 are included in this table
| Biomarkers (5-year overall survival) | Mean | 95% CI | Difference | Median | 95% CI | Difference |
|---|---|---|---|---|---|---|
| SD | 5.20 | 4.27–2.57 | 2.04 | 4.43 | 2.93–5.48 | 1.17 |
| NSD | 3.16 | 2.57–3.74 | | 3.26 | 2.49–3.90 | |
| LoSAR | | | | | | |
| SD | 5.50 | 4.46–6.61 | 2.23 | 5.20 | 3.11–5.24 | 2.00 |
| NSD | 3.27 | 2.75–3.79 | | 3.20 | 2.71–3.84 | |
| SD | 4.96 | 4.08–5.85 | 1.64 | 3.90 | 2.85–5.48 | 0.58 |
| NSD | 3.32 | 2.68–3.96 | | 3.32 | 3.15–3.84 | |
| SD | 5.20 | 4.27–2.57 | 2.04 | 4.43 | 2.93–5.48 | 1.17 |
| NSD | 3.16 | 2.57–3.74 | | 3.26 | 2.49–3.90 | |
| SD | 5.05 | 4.11–5.99 | 1.69 | 3.78 | 2.76–5.48 | 0.46 |
| NSD | 3.36 | 2.82–3.90 | | 3.32 | 3.15–3.90 | |
| SD | 5.17 | 4.24–6.11 | 2.03 | 3.99 | 3.32–5.48 | 0.83 |
| NSD | 3.14 | 2.59–3.69 | | 3.16 | 2.49–3.84 | |
Data are presented in years
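For context on how median survival is read off a Kaplan-Meier curve, here is a minimal product-limit sketch. It is a generic textbook estimator written for illustration, not the authors' implementation, and the follow-up times and event flags are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, S(time)) at each distinct event time.

    times  -- follow-up time per patient (e.g. in years)
    events -- 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below (None if never)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up data for six patients.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]

curve = kaplan_meier(times, events)
median = median_survival(curve)
```

The Difference columns in the table are simply the SD value minus the NSD value for each biomarker, for example 5.50 − 3.27 = 2.23 years for the mean.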
Fig. 4 Comparison of ROC curves for survival prediction across all models: (a) Logistic Regression, (b) Random Forest, (c) Support Vector Machine, and (d) Artificial Neural Network