Jiangpeng Wu, Xiangyi Zan, Liping Gao, Jianhong Zhao, Jing Fan, Hengxue Shi, Yixin Wan, E Yu, Shuyan Li, Xiaodong Xie.
Abstract
BACKGROUND: Liquid biopsies based on blood samples have been widely accepted as a diagnostic and monitoring tool for cancers, but they frequently require extremely high sensitivity because the specially selected DNA, RNA, or protein biomarkers are released into the blood at very low levels. Routine blood indices tests, by contrast, are frequently ordered by physicians because they are easy to perform and cost-effective. In addition, machine learning is broadly accepted for its ability to decipher complicated connections between multiple sets of test data and diseases.
Keywords: Random Forest; lung cancer identification; routine blood indices
Year: 2019 PMID: 31418423 PMCID: PMC6714502 DOI: 10.2196/13476
Source DB: PubMed Journal: JMIR Med Inform
General demographic information on the test set and the training set (N=277).
| Characteristic | Training set: Lung cancer (n=149) | Training set: Tuberculosis (n=37) | Training set: Other (n=36) | Test set: Lung cancer (n=34) | Test set: Tuberculosis (n=14) | Test set: Other (n=7) |
| Male | 110 | 37 | 12 | 22 | 5 | 5 |
| Female | 39 | 20 | 24 | 12 | 9 | 2 |
| Median age (range) | 60 (27-81) | 46 (20-79) | 55 (30-78) | 58 (38-79) | 52 (20-78) | 62 (49-68) |
| Smokers, n | 44 | 2 | 2 | 5 | 0 | 1 |
Figure 1. Classification performance of the RBLC model. (A) Cross-validation results of models built on the top-ranking features. (B) ROC curves and the corresponding AUCs for the cross-validation on the training set and for the test set. RBLC: routine blood indices model for lung cancer; ROC: receiver operating characteristic; AUC: area under the curve; ACC: accuracy; MCC: Matthews correlation coefficient.
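The metrics named in this caption (ACC and MCC), along with the sensitivity and specificity reported in the comparison table, can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch using the standard definitions; the function name `binary_metrics` and the counts are hypothetical and are not taken from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute ACC, sensitivity, specificity, and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    acc = (tp + tn) / total
    # Sensitivity: fraction of true positives among all actual positives
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of true negatives among all actual negatives
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    # MCC: correlation between predicted and actual labels, in [-1, 1]
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "sensitivity": sens, "specificity": spec, "MCC": mcc}


# Hypothetical counts for illustration only (not the study's results)
metrics = binary_metrics(tp=45, fp=10, tn=40, fn=5)
print(metrics)
```

Unlike accuracy, MCC uses all four cells, which makes it a more informative summary when the positive and negative classes differ in size, as they do in this data set.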
Top-ranking blood indices for the identification of lung cancer.
| Rank | Index | Reference range |
| 1 | Basophil ratio | 0.00-0.01 |
| 2 | Creatine kinase isoenzymes (U/L) | 0.0-25.0 |
| 3 | Platelet large cell ratio (%) | 17.0-45.0 |
| 4 | Albumin (g/L) | 30.0-55.0 |
| 5 | Platelet distribution width (fL) | 9.0-17.0 |
| 6 | Neutrophilic granulocytes (10^9/L) | 2.00-7.00 |
| 7 | White blood cell count (10^9/L) | 4.00-10.00 |
| 8 | Albumin/Globulin ratio | 1.10-2.50 |
| 9 | Monocytes (10^9/L) | 0.12-1.20 |
| 10 | Monocyte ratio | 0.03-0.08 |
| 11 | Lymphocyte ratio | 0.20-0.40 |
| 12 | Neutrophil granulocyte ratio | 0.50-0.70 |
| 13 | Lactate dehydrogenase (U/L) | 0.0-240.0 |
| 14 | Carbamide (mmol/L) | 1.80-8.00 |
| 15 | Eosinophil cells (10^9/L) | 0.02-0.50 |
| 16 | Mean corpuscular volume (fL) | 80.0-100.0 |
| 17 | Alkaline phosphatase (U/L) | 0.0-120.0 |
| 18 | Mean corpuscular hemoglobin (pg) | 27.0-34.0 |
| 19 | Creatine kinase (U/L) | 0-195 |
Figure 2. The detailed forest structure for the RBLC model. (A) The general structure of the voting strategy of the RBLC model. (B) The independent decision rulings for different blood indices for the first tree (T1) in (A). T: tree; WBC: white blood cell count; NE%: neutrophil granulocyte ratio; LY%: lymphocyte ratio; MO%: monocyte ratio; BA%: basophil ratio; NE#: neutrophilic granulocytes; MO#: monocytes; EO#: eosinophil cells; MCV: mean corpuscular volume; MCH: mean corpuscular hemoglobin; PDW: platelet distribution width; P-LCR: platelet large cell ratio; UREA: carbamide; ALP: alkaline phosphatase; ALB: albumin; A/G: albumin/globulin; CK: creatine kinase; CK-MB: creatine kinase isoenzymes; LDH: lactate dehydrogenase.
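The voting strategy in panel (A) follows the general Random Forest pattern: each tree classifies the sample independently and the forest reports the majority label. The sketch below illustrates that idea only. The three toy trees, their thresholds, and the function name `forest_vote` are assumptions made for illustration; they are not the decision rules of the RBLC model.

```python
from collections import Counter


def forest_vote(trees, sample):
    """Return the majority label and the fraction of trees that voted for it."""
    votes = Counter(tree(sample) for tree in trees)
    label, count = votes.most_common(1)[0]
    return label, count / len(trees)


# Toy single-split "trees"; thresholds are illustrative, not taken from the paper
def t1(s):
    return "lung cancer" if s["LY%"] < 0.25 else "other"


def t2(s):
    return "lung cancer" if s["NE%"] > 0.65 else "other"


def t3(s):
    return "lung cancer" if s["WBC"] > 9.0 else "other"


# Hypothetical blood indices for one patient
sample = {"LY%": 0.20, "NE%": 0.70, "WBC": 6.5}
label, share = forest_vote([t1, t2, t3], sample)
print(label, round(share, 2))
```

Here two of the three toy trees vote for lung cancer, so the forest returns that label with a vote share of about 0.67. A real forest uses many deeper trees, and the vote share can serve as a rough confidence score.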
Figure 3. Web page of the RBLC tool for convenient online use. RBLC: routine blood indices model for lung cancer; ALB/GLB: albumin/globulin.
Comparison of the performance of different methods for predicting lung cancer on cross-validation.
| Prediction method | Sample size | Sensitivity, % | Specificity, % | Area under the curve |
| RBLC^a | 226 | 96.30 | 94.97 | 0.99 |
| Protein biomarker [ | 143 | 93.00 | 45.00 | N/A^b |
| RNA biomarker [ | 310 | 93.00 | 90.00 | 0.97 |
| DNA biomarker [ | 318 | 79.20 | 67.30 | 0.75 |
| Computed tomography scans [ | N/A | 94.40 | 72.60 | N/A |
^a RBLC: routine blood indices model for lung cancer.
^b N/A: not applicable.
Feature comparison of lung cancer and other samples.
| Feature | Negative sample | Positive sample (lung cancer) | P value |
| White blood cell count | 0.1986 | 0.3088 | <.001 |
| Neutrophil-granulocyte ratio | 0.4257 | 0.6502 | <.001 |
| Lymphocyte ratio | 0.5298 | 0.3232 | <.001 |
| Monocyte ratio | 0.4319 | 0.3970 | .20 |
| Basophil ratio | 0.2555 | 0.1242 | <.001 |
| Neutrophilic granulocytes | 0.1839 | 0.2808 | <.001 |
| Monocytes | 0.2795 | 0.384 | <.001 |
| Eosinophil cells | 0.3236 | 0.0833 | <.001 |
| Mean corpuscular volume | 0.6808 | 0.5453 | <.001 |
| Mean corpuscular hemoglobin | 0.6545 | 0.5983 | .008 |
| Platelet distribution width | 0.5765 | 0.6337 | .03 |
| Platelet large cell ratio | 0.5081 | 0.4010 | <.001 |
| Carbamide | 0.4181 | 0.3197 | <.001 |
| Alkaline phosphatase | 0.4138 | 0.1366 | <.001 |
| Albumin | 0.5757 | 0.5574 | .52 |
| Albumin/globulin | 0.3917 | 0.4155 | .46 |
| Creatine kinase | 0.1103 | 0.0867 | .19 |
| Creatine kinase Isoenzymes | 0.3557 | 0.2014 | <.001 |
| Lactate dehydrogenase | 0.5441 | 0.1462 | <.001 |