Qi-Qiang Zeng1, Kenneth I Zheng2, Jun Chen3, Zheng-Hao Jiang3, Tian Tian3, Xiao-Bo Wang4, Hong-Lei Ma2, Ke-Hua Pan5, Yun-Jun Yang5, Yong-Ping Chen2,6,7, Ming-Hua Zheng2,6,7.
Abstract
Clinicians have been faced with the challenge of differentiating between severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infected pneumonia (NCP) and influenza A infected pneumonia (IAP), a seasonal disease that coincided with the outbreak. We aimed to develop a radiomics-based machine-learning algorithm that distinguishes NCP from IAP by texture analysis of computed tomography (CT) images. Forty-one NCP and 37 IAP patients admitted from January to February 6, 2019 to two hospitals in Wenzhou, China, were included. All patients had undergone chest CT examination and routine blood tests before receiving medical treatment. NCP was diagnosed by real-time RT-PCR assays. Eight of 56 radiomic features extracted by LIFEx were selected by least absolute shrinkage and selection operator (LASSO) regression to develop a radiomics score, which was then built into a nomogram that predicted NCP with an area under the receiver operating characteristic curve of 0.87 (95% confidence interval: 0.77-0.93). The nomogram also showed excellent calibration, with the Hosmer-Lemeshow test yielding a nonsignificant statistic (P = .904). The novel nomogram may efficiently distinguish between NCP and IAP patients. It may be incorporated into existing diagnostic algorithms to effectively stratify patients with suspected SARS-CoV-2 pneumonia.
Keywords: COVID‐19; SARS‐CoV‐2; diagnosis
Year: 2020 PMID: 32838396 PMCID: PMC7436469 DOI: 10.1002/mco2.14
Source DB: PubMed Journal: MedComm (2020) ISSN: 2688-2663
FIGURE 1 Enrollment flow diagram of the study. Abbreviation: ASA score, American Society of Anesthesiologists score
FIGURE 3 Radiomic feature selection from signature heatmap using the least absolute shrinkage and selection operator (LASSO) logistic regression model. (A) Heat map of the relationships among texture analysis parameters. (B) Identification of the optimal penalization coefficient lambda (λ) in the LASSO model using 10‐fold cross‐validation and the minimum criterion. (C) LASSO coefficient profiles of the 56 radiomic features. The dotted vertical line was plotted at the value selected using 10‐fold cross‐validation in (B), for which the optimal λ resulted in 10 non‐zero coefficients
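For readers unfamiliar with the penalty behind panels (B) and (C), the standard form of an L1-penalized logistic regression objective is given below. This is the textbook formulation, not a formula quoted from the paper; the symbols $n$, $p$, $x_i$, $y_i$, $\beta_0$, and $\beta$ are notation assumed here for illustration, with $y_i = 1$ for NCP and $y_i = 0$ for IAP.

```latex
\hat{\beta}_0, \hat{\beta}
  = \arg\min_{\beta_0,\, \beta}
    \left\{
      -\frac{1}{n} \sum_{i=1}^{n}
        \left[
          y_i \left( \beta_0 + x_i^{\top} \beta \right)
          - \log\!\left( 1 + e^{\beta_0 + x_i^{\top} \beta} \right)
        \right]
      + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
    \right\}
```

As λ grows, more coefficients $\beta_j$ are shrunk to exactly zero, which is what the coefficient profiles in (C) trace out. Under the minimum criterion, λ is taken at the value that minimizes the cross-validated error in (B).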
Demographics and baseline characteristics of patients infected with SARS‐CoV‐2 or influenza A virus
| | SARS‐CoV‐2 (n = 41) | Influenza A (n = 37) | P value |
|---|---|---|---|
| Characteristics | |||
| Age, years | 46 (39‐50) | 55 (45‐67) | <.01 |
| Sex | | | .40 |
| Men | 15 (36.6%) | 17 (45.9%) | |
| Women | 26 (63.4%) | 20 (54.1%) | |
| Signs and symptoms | |||
| Fever | 32 (78.0%) | 28 (75.7%) | .80 |
| Highest temperature, °C | | | .70 |
| <37.3 | 9 (22.0%) | 9 (24.3%) | |
| 37.3‐38.0 | 15 (36.6%) | 11 (29.7%) | |
| 38.1‐39.0 | 11 (26.8%) | 8 (21.6%) | |
| >39.0 | 6 (14.6%) | 9 (24.3%) | |
| Cough | 22 (53.7%) | 31 (83.8%) | <.01 |
| Myalgia or fatigue | 11 (26.8%) | 8 (21.6%) | .61 |
| Headache | 2 (4.9%) | 2 (5.4%) | .92 |
| Hemoptysis | 1 (2.4%) | 3 (8.1%) | .34 |
| Diarrhea | 2 (4.9%) | 0 (0.0%) | .50 |
| Dyspnea | 1 (2.5%) | 15 (40.5%) | <.01 |
| Respiratory rate > 24 breaths per min | 1 (2.5%) | 4 (10.8%) | .19 |
| Laboratory data | |||
| White blood cell count, ×10⁹/L | | | .02 |
| <4 | 13 (32.50%) | 10 (27.03%) | |
| 4‐10 | 26 (65.00%) | 18 (48.65%) | |
| >10 | 1 (2.50%) | 9 (24.32%) | |
| Lymphocyte count, ×10⁹/L | | | |
| <1.0 | 5 (12.50%) | 1 (2.70%) | .20 |
| ≥1.0 | 35 (87.50%) | 36 (97.30%) | |
| Aspartate aminotransferase, U/L | | | .05 |
| ≤40 | 31 (81.58%) | 22 (59.46%) | |
| >40 | 7 (18.42%) | 15 (40.54%) | |
| Total bilirubin, mmol/L | 10.2 (7.1‐15.0) | 8.0 (6.0‐12.0) | .07 |
| Lactate dehydrogenase, U/L | | | .40 |
| ≤245 | 20 (62.50%) | 10 (47.62%) | |
| >245 | 12 (37.50%) | 11 (52.38%) | |
Note. Continuous variables are presented as median (IQR); categorical variables are presented as n (%). P values were calculated by one‐way ANOVA for normally distributed continuous variables, the Kruskal‐Wallis rank test for non‐normally distributed continuous variables, and Fisher's exact test for categorical variables.
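To make the categorical comparison concrete, the following is a minimal sketch of a two-sided Fisher's exact test for a 2×2 table, written with the Python standard library only. The function name `fisher_exact_two_sided` is an assumption for illustration, and the sketch is not the software the authors used. It is applied to the cough row of the table (22 of 41 versus 31 of 37), so its output need not match the reported P value digit for digit.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Probability that the top-left cell equals x with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)   # smallest feasible top-left count
    hi = min(row1, col1)       # largest feasible top-left count
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(p, 1.0)


# Cough: 22 of 41 SARS-CoV-2 patients versus 31 of 37 influenza A patients.
p_cough = fisher_exact_two_sided(22, 41 - 22, 31, 37 - 31)
print(f"Cough, two-sided Fisher's exact test: P = {p_cough:.4f}")
```

The same function can be applied to any other two-level row of the table by passing the counts with and without the symptom in each group.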
FIGURE 2 Radiomics‐based machine learning workflow: computed tomography (CT) image acquisition and region‐of‐interest (ROI) segmentation of inflammatory lesions; radiomic feature extraction by LIFEx; feature selection by least absolute shrinkage and selection operator (LASSO) with 10‐fold cross‐validation; radiomics prediction score and calibration; and nomogram development for a more clinician‐friendly application. A support vector machine (SVM) was also used to distinguish the two diseases
FIGURE 4 (A) Waterfall plot of the radiomics score for each patient. By default, NCP (red bars) is plotted above the baseline and IAP (green bars) below it. The plot shows the association between radiomics score and disease type; a bar whose color disagrees with its side of the baseline indicates misclassification by the radiomics score. (B) Nomogram for differentiating NCP from IAP. For each patient, the value of each of the eight variables is converted to points by projecting it onto the uppermost line (Points). Summing the points of the eight variables and projecting the total downward onto the bottom‐most line (DIAGNOSIS) determines the disease type. A value approaching 0 on the DIAGNOSIS line indicates a higher probability of IAP, whereas a value approaching 1 indicates a higher probability of NCP. The linear predictor is the nomogram visualization of NCP prediction by radiomics score. (C) Receiver operating characteristic curve for predicting NCP by the nomogram; AUC is expressed as estimate (95% confidence interval). (D) Calibration curve of the nomogram. The calibration curve depicts the agreement between the predicted risk of NCP and the observed outcomes. The y‐axis represents the actual NCP rate. The x‐axis represents the predicted NCP risk. The diagonal solid line represents a perfect prediction by an ideal model. The dotted line represents the performance of the nomogram; a closer fit to the diagonal solid line represents a better prediction. The Hosmer‐Lemeshow test yielded a nonsignificant statistic (P = .904), which suggested no significant departure from perfect fit. Abbreviations: NCP, SARS‐CoV‐2 infected pneumonia; IAP, influenza A infected pneumonia; AUC, area under the receiver operating characteristic curve
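The point arithmetic behind a nomogram such as the one in panel (B) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' model: the three feature names, their coefficients, value ranges, and the intercept are all hypothetical. The sketch uses the common convention that the variable with the largest effect range spans 0 to 100 points.

```python
from math import exp

# Hypothetical logistic model; every number below is invented for illustration.
INTERCEPT = -5.0
MODEL = {
    # feature name: (coefficient, minimum value, maximum value)
    "feature_a": (2.0, 0.0, 1.0),
    "feature_b": (-1.0, 0.0, 3.0),
    "feature_c": (0.5, 10.0, 12.0),
}

# The largest effect range |coefficient| * (max - min) defines the 100-point scale.
MAX_RANGE = max(abs(coef) * (hi - lo) for coef, lo, hi in MODEL.values())


def reference_value(coef, lo, hi):
    """Value that earns 0 points: the end of the range with the lowest contribution."""
    return lo if coef >= 0 else hi


def points(name, value):
    """Points for one variable, as read off the 'Points' line of a nomogram."""
    coef, lo, hi = MODEL[name]
    return 100.0 * coef * (value - reference_value(coef, lo, hi)) / MAX_RANGE


def probability_from_points(total_points):
    """Map total points back to the linear predictor, then to a probability."""
    baseline = INTERCEPT + sum(
        coef * reference_value(coef, lo, hi) for coef, lo, hi in MODEL.values()
    )
    linear_predictor = baseline + total_points * MAX_RANGE / 100.0
    return 1.0 / (1.0 + exp(-linear_predictor))


def probability_direct(patient):
    """Same probability computed directly from the logistic model."""
    lp = INTERCEPT + sum(MODEL[name][0] * value for name, value in patient.items())
    return 1.0 / (1.0 + exp(-lp))


patient = {"feature_a": 0.5, "feature_b": 1.0, "feature_c": 11.0}
total = sum(points(name, value) for name, value in patient.items())
print(f"total points = {total:.1f}")
print(f"probability via nomogram points = {probability_from_points(total):.3f}")
print(f"probability via direct model    = {probability_direct(patient):.3f}")
```

Both routes give the same probability, which is the sense in which a nomogram is a graphical restatement of the underlying regression model rather than a separate predictor.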
FIGURE 5 Support vector machine visualization plot. Black crosses indicate patients with IAP; red crosses indicate patients with NCP; circles indicate misclassified patients; the x‐ and y‐axes denote the mapping variables used for the two‐dimensional presentation of the multidimensional hyperplane