Prognostic Value and Reproducibility of AI-assisted Analysis of Lung Involvement in COVID-19 on Low-Dose Submillisievert Chest CT: Sample Size Implications for Clinical Trials.

Christopher Gieraerts¹, Anthony Dangis¹, Lode Janssen¹, Annick Demeyere¹, Yves De Bruecker¹, Nele De Brucker¹, Annelies van Den Bergh¹, Tine Lauwerier¹, André Heremans¹, Eric Frans¹, Michaël Laurent¹, Bavo Ector¹, John Roosen¹, Annick Smismans¹, Johan Frans¹, Marc Gillis¹, Rolf Symons¹.

Abstract

PURPOSE: To compare the prognostic value and reproducibility of visual versus AI-assisted analysis of lung involvement on submillisievert low-dose chest CT in COVID-19 patients.
MATERIALS AND METHODS: This was a HIPAA-compliant, institutional review board-approved retrospective study. From March 15 to June 1, 2020, 250 RT-PCR confirmed COVID-19 patients were studied with low-dose chest CT at admission. Visual and AI-assisted analyses of lung involvement were performed by using a semi-quantitative CT score and a quantitative percentage of lung involvement. Adverse outcome was defined as intensive care unit (ICU) admission or death. Cox regression analysis, Kaplan-Meier curves, and cross-validated receiver operating characteristic curve analysis with area under the curve (AUROC) were performed to compare model performance. Intraclass correlation coefficients (ICCs) and Bland-Altman analysis were used to assess intra- and interreader reproducibility.
RESULTS: Adverse outcome occurred in 39 patients (11 deaths, 28 ICU admissions). AUC values from AI-assisted analysis were significantly higher than those from visual analysis for both semi-quantitative CT scores and percentages of lung involvement (all P<0.001). Intrareader and interreader agreement rates were significantly higher for AI-assisted analysis than visual analysis (all ICC ≥0.960 versus ≥0.885). AI-assisted variability for quantitative percentage of lung involvement was 17.2% (coefficient of variation) versus 34.7% for visual analysis. The sample size to detect a 5% change in lung involvement with 90% power and an α error of 0.05 was 250 patients with AI-assisted analysis and 1014 patients with visual analysis.
CONCLUSION: AI-assisted analysis of lung involvement on submillisievert low-dose chest CT outperformed conventional visual analysis in predicting outcome in COVID-19 patients while reducing CT variability. Lung involvement on chest CT could be used as a reliable metric in future clinical trials. 2020 by the Radiological Society of North America, Inc.

Year: 2020 PMID: 33778634 PMCID: PMC7586438 DOI: 10.1148/ryct.2020200441

Source DB: PubMed Journal: Radiol Cardiothorac Imaging ISSN: 2638-6135

Summary

AI-assisted analysis of lung involvement in patients with COVID-19 outperformed conventional visual analysis in predicting adverse outcome while reducing variability; AI-assisted quantification of lung involvement in COVID-19 could be used as a reliable metric in clinical trials.
■ Area under the curve (AUC) values from automated AI analysis and AI analysis with manual correction were significantly higher than those from visual analysis for both semi-quantitative CT scores (0.888 and 0.903 vs 0.760, respectively) and percentages of lung involvement (0.878 and 0.880 vs 0.774, respectively). Kaplan-Meier curve analysis using the identified cutoffs (CT score ≥7 and lung involvement percentage ≥12.0% for visual analysis, CT score ≥8 and lung involvement percentage ≥19.8% for automated AI analysis, and CT score ≥8 and lung involvement percentage ≥20.5% for AI analysis with manual correction) showed that these values could be used to predict patient outcome (P<0.001 by log rank test for all analyses).
■ Intra- and interreader agreement was significantly higher for AI-assisted analysis with manual correction when compared to visual analysis.
■ Using an AI-assisted analysis can reduce the required sample size for clinical trials aiming to reliably detect a change in the extent of COVID-19 lung involvement by a factor of 4 (e.g., 250 patients vs 1014 patients to detect a 5% change in the extent of lung involvement with a power of 90% and an α error of 0.05).

Introduction

Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2) is a novel enveloped RNA betacoronavirus belonging to the same family of viruses causing severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS) (1). Patients with SARS-CoV-2 infection can develop clinical coronavirus disease 2019 (COVID-19) which was declared a pandemic by the World Health Organization (WHO) on the 11th of March 2020 (2,3). The full spectrum of COVID-19 severity is still being clarified but appears to be wide, ranging from asymptomatic status or mild upper respiratory tract symptoms to severe viral pneumonia, multiple organ dysfunction and even death (4). Chest computed tomography (CT) has emerged as an accurate tool for the initial diagnosis of patients with possible COVID-19 infection (5). Additionally, CT may represent a non-invasive tool for patient prognostication as the extent of lung involvement on chest CT appears to be an important prognostic marker (6,7). Multiple Artificial Intelligence (AI) software packages are currently being developed to aid radiologists in the quantification of lung involvement in COVID-19. However, little is known about the reproducibility of these software packages and how they may improve outcome prediction. We hypothesized that the use of semiautomated AI may both improve CT reproducibility and allow for more accurate patient prognostication. We assessed COVID-19 patients who underwent chest CT at our institution by conventional visual and AI-based quantification of lung injury. We also determined the impact of chest CT variability on sample size estimates that would be applicable in a clinical trial (e.g., to determine the potential response to novel antiviral therapies). The aim of this study was therefore to determine reader and software variability in the measurement of lung injury in COVID-19 and assess its impact on patient prognosis.

Materials and Methods

This retrospective study was compliant with the Health Insurance Portability and Accountability Act (HIPAA) and was approved by our institutional review board (Imelda Hospital, Bonheiden, Belgium). Informed consent was waived. From March 15 to June 1, 2020, 250 consecutive patients with clinical suspicion of COVID-19 pneumonia were tested with both RT-PCR and CT within a 2-hour interval of hospital admission. Epidemiological, demographic, clinical, and laboratory data at admission were obtained from the electronic patient management system. Two PCR platforms (Aries system, Luminex, Austin, USA and Rotorgene Q, Qiagen, Hilden, Germany) were used to detect SARS-CoV-2 in nasopharyngeal swabs (eSwab, Copan Diagnostics, Brescia, Italy), both using the E-gene as target. Primers and probe sequences for the E-gene were provided by the Belgian National Reference Center (University Hospitals Leuven, Belgium). No cross-reactivity with other human coronaviruses, influenza, or respiratory syncytial virus (RSV) has been shown for either platform. Part of the patient population has been previously reported in studies assessing the accuracy of chest CT for COVID-19 diagnosis and the impact of gender on the extent of lung injury (5,8). Adverse outcome was defined as death or intensive care unit (ICU) admission. In patients with multiple events, only the first event was considered for event-free survival analysis. Only patients with a final outcome (death or discharge) were included in the final analysis. No patients were excluded from analysis after initial inclusion. No adverse event occurred from the chest CT exams.

CT scan protocol

All patients underwent non-contrast low-dose chest CT by using a Somatom Definition AS 64-slice 0.6 mm detector scanner (Siemens Healthineers, Forchheim, Germany). We used vendor-supplied software (CareDose 4D and CarekV, Siemens Healthineers) to calculate size-specific radiation dose estimates for the low-dose chest CT protocol, which was adapted from the protocol used for lung cancer screening with reference values in an average patient of 100 kVp and 20 mAs (9). We used a 0.5 second rotation time and a pitch of 1.2 to limit motion artifacts in dyspneic patients. Effective radiation dose was calculated by multiplying the dose-length product (DLP) by 0.014 mSv/(mGy·cm) as the constant k-value for thoracic imaging (10). Reconstruction parameters were: 1 mm/0.7 mm slice thickness/increment with a standard lung-tissue kernel (I50f medium sharp) and 3 mm/3 mm slice thickness/increment with a standard soft tissue kernel (I31f medium smooth), sinogram-affirmed iterative reconstruction (SAFIRE) strength 3, 450 mm FOV, and 512 × 512 matrix size.
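The dose conversion described above is a single multiplication, and a minimal sketch makes the arithmetic explicit. The function and constant names below are illustrative assumptions, not part of the study software; the example input is the mean DLP of 43.2 mGy·cm reported in the Results.

```python
# Conversion factor for thoracic imaging quoted in the text, in mSv per mGy*cm.
K_CHEST = 0.014


def effective_dose_msv(dlp_mgy_cm: float, k: float = K_CHEST) -> float:
    """Estimate effective dose (mSv) from the dose-length product (mGy*cm)."""
    if dlp_mgy_cm < 0:
        raise ValueError("DLP must be non-negative")
    return dlp_mgy_cm * k


# Mean DLP reported in the Results section of this article.
mean_dose = effective_dose_msv(43.2)
print(f"{mean_dose:.2f} mSv")
```

With the reported mean DLP this gives roughly 0.60 mSv, in line with the effective dose stated in the Results.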

CT image analysis

Visual analysis of lung involvement was performed by using a semi-quantitative scoring system as previously described (5). In short, each lobe was scored from 0 to 5 with a total score ranging from 0 to 25: score 0, 0% involvement; score 1, <5% involvement; score 2, 5-25% involvement; score 3, 26-50% involvement; score 4, 51-75% involvement; score 5, 76-100% involvement. Involvement was visually defined as any area of ground-glass opacity (GGO), crazy-paving, or consolidation, and the percentage was estimated by combining axial, coronal, and sagittal reconstructions. For the semi-quantitative score, a higher number indicated a higher ranking and involvement (e.g., a score of >7 indicates all scores from 8 to 25).

AI-powered analysis of lung involvement was performed at a dedicated workstation using CT pneumonia analysis v.2.0 (Siemens Healthineers, Forchheim, Germany). The algorithm uses non-contrast CT data to automatically identify and 3D-segment both the lung parenchyma and abnormal areas of GGO and consolidation (11). The software outputs a percentage of total lung involvement (both GGO and consolidation). This percentage was translated to the same semi-quantitative scoring system used for visual analysis. Segmentation errors were manually corrected by trained readers. In cases of bacterial pneumonia coinfection, the total area of GGO and consolidation was included.

The following outcome measures were thus evaluated by the readers:
Semi-quantitative CT score (ranging from 0 to 25): CT scores from visual analysis, AI without manual correction (AI-auto), and AI with manual correction (AI-manual).
Percentage of lung involvement (ranging from 0 to 100%): percentage scores of lung involvement (combined GGO and consolidation) from visual analysis, AI-auto, and AI-manual.

Both metrics of lung involvement are reported because there is precedent for both approaches to assess the extent of lung involvement in COVID-19 (6,7). The truly quantitative approach with percentages of lung involvement is likely more accurate and will increasingly become available through the rapid development of multiple AI-based software packages for COVID-19. However, we opted to include the semi-quantitative approach as it has been used in early COVID-19 studies with good prognostic value and may be the only approach available to some institutions for the foreseeable future (6).

Intra- and interreader reproducibility were assessed for both visual analysis and AI-based analysis with manual correction. Six radiologists (C.G., A.Da., L.J., Y.D.B., A.De., and R.S.) independently scored the lung involvement on a subset of the patient population. Two cardiothoracic radiologists (C.G. and R.S. with 8 and 7 years of cardiothoracic imaging experience, respectively) assessed reproducibility. One reader (R.S.) reread a random sample of 50 scans after 1 week to assess intrareader reproducibility. Fifty randomly selected cases first read by another reader were reread by C.G. after 1 week to assess interreader reproducibility.
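The translation from a percentage of involvement to the semi-quantitative score can be sketched as a simple lookup. This is an illustration only: it assumes per-lobe percentages are available, and it treats the published bands as continuous (for example, anything above 25% and up to 50% scores 3), which the text does not specify. The function names and the example values are hypothetical.

```python
def lobe_score(percent: float) -> int:
    """Map one lobe's percentage of involvement to a 0-5 score."""
    if not 0 <= percent <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent == 0:
        return 0
    if percent < 5:
        return 1
    if percent <= 25:   # assumed upper bound of the 5-25% band
        return 2
    if percent <= 50:   # assumed upper bound of the 26-50% band
        return 3
    if percent <= 75:   # assumed upper bound of the 51-75% band
        return 4
    return 5


def total_ct_score(lobe_percents: list[float]) -> int:
    """Sum the five lobe scores into the 0-25 semi-quantitative CT score."""
    if len(lobe_percents) != 5:
        raise ValueError("expected one percentage per lobe (5 lobes)")
    return sum(lobe_score(p) for p in lobe_percents)


# Hypothetical patient: involvement per lobe in percent.
example_lobes = [0, 3, 10, 30, 80]
example_score = total_ct_score(example_lobes)
print(example_score)
```

For the hypothetical lobes above the individual scores are 0, 1, 2, 3, and 5, for a total of 11 out of 25.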

Statistical analysis

All statistical analysis was performed by using R v.4.0.0. (Foundation for statistical computing, Vienna, Austria). Data were tested for normal distribution with the Shapiro-Wilk test. Summary statistics for all continuous variables are reported as means ± standard deviations (SD) or as medians with interquartile ranges (IQR), as appropriate. Summary statistics for categorical variables are reported as absolute numbers and percentages. For continuous variables, a threshold that balances sensitivity and specificity, as identified by the Youden index, was calculated from receiver-operating characteristic (ROC) curve analysis (12). It is important, however, to realize this is just one approach to cutting the ROC curve and future, larger studies are needed to determine optimal thresholds considering other predictors of adverse outcome. We assessed discrimination with the 5-fold cross-validated area under the ROC (AUROC), reported with corresponding 95% confidence intervals (13). Survival curves were estimated using the Kaplan-Meier method and compared by using the log-rank test. Cox-model results were shown by hazard ratio (HR) estimates with 95% confidence intervals (CI). We checked the proportional-hazards assumption for each variable by testing Schoenfeld residuals and using the double-log plot method. In case of violation of the proportional-hazards assumption, the restricted mean survival time (RMST) was calculated as a measure of average survival from time 0 to a specified time point and estimated as the area under the survival curve (AUC) up to that point (14). Intra- and interreader agreement were assessed by using intraclass correlation coefficients (ICCs), Bland-Altman analysis with 95% limits of agreement (LOAs), Spearman rank correlation r, and coefficient of variation (CV) (15). A two-way model with measures of agreement was used to calculate the ICC values. ICCs of >0.75 and of 0.40–0.75 indicate strong and average agreement, respectively. 
A difference between ICCs was considered to be statistically significant when there was no overlap between their respective 95% CI limits. There were no missing data elements for the analyses. P<0.05 was considered to indicate a statistically significant difference. Sample size estimates were derived from the interreader SD of lung involvement as described by Machin and Altman (16,17). The sample size required by chest CT to show a change with 90% power and an α error of 0.05 was calculated by using the following formula: $n = f(\alpha, P) \cdot \sigma^2 \cdot 2 / \delta^2$, where α is the significance level, P is the study power, f is the value of the factor for different values of α and P (f = 10.5 for a P of 90% and an α error of 0.05), σ is the interstudy standard deviation, δ is the desired percentage difference to be detected, and n is the sample size needed (18). Chest CT reproducibility and sample size were calculated for both a visual and an AI-assisted analysis, as defined above.
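The Youden index threshold mentioned in this section can be illustrated with a short sketch. It scans every observed value as a candidate cutoff, classifies a patient as high risk when the value is at or above the cutoff, and keeps the cutoff that maximizes sensitivity + specificity - 1. The study itself used R; the function name and the toy data below are hypothetical and are not patient data.

```python
def youden_cutoff(scores: list[float], outcomes: list[int]) -> tuple[float, float]:
    """Return (cutoff, J) maximizing Youden's J = sensitivity + specificity - 1.

    A case is predicted positive when its score is >= cutoff.
    outcomes holds 1 for adverse outcome and 0 otherwise.
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")

    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, outcomes) if y == 1 and s >= cutoff)
        true_neg = sum(1 for s, y in zip(scores, outcomes) if y == 0 and s < cutoff)
        j = true_pos / positives + true_neg / negatives - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Toy CT scores and outcomes (hypothetical).
toy_scores = [2, 4, 5, 7, 8, 9, 12, 15]
toy_outcomes = [0, 0, 0, 0, 1, 0, 1, 1]
cutoff, j_stat = youden_cutoff(toy_scores, toy_outcomes)
print(cutoff, round(j_stat, 3))
```

On the toy data the selected cutoff is a score of 8, where sensitivity is 3/3 and specificity is 4/5. As the text notes, this is only one way of choosing an operating point on the ROC curve.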

Results

Patient demographics, CT findings and dose parameters, and outcome data are summarized in Table 1. The mean age for all patients was 67 years ± 17 years (SD) with fever, cough, and dyspnea as the most frequent clinical symptoms at presentation. Median time between symptom onset and ER presentation with RT-PCR and chest CT was 7 days (IQR: 4-10 days). Median time between CT scan acquisition and report was 20 minutes (IQR: 12-42 minutes). Median time for automated AI analysis was 9 minutes (IQR: 8-9 minutes), which increased to 12 minutes (IQR: 8-13 minutes) with manual correction. Manual correction was required in 154 patients (65.6%). However, manual correction changed the percentage of lung involvement by more than 1% in only 33 patients (13.2%) when compared to the automated AI analysis (Figure 3F).

Table 1:

Patient Characteristics, CT findings and Radiation Dose Parameters

Figure 3:

Bland-Altman plots show reproducibility between visual analysis, automated AI-assisted analysis, and AI-assisted analysis with manual correction. No significant bias was observed with narrower limits of agreement for AI-assisted analysis without and with manual correction.

CT radiation dose

Mean DLP for all patients was 43.2±24.9 mGy.cm, resulting in an effective radiation dose of 0.60±0.35 mSv (Table 1).

Outcome prediction

Adverse outcome occurred in 39 patients (15.6%) with 28 ICU admissions and 11 deaths. Five patients (17.9%) died in the ICU (6 other deaths occurred in frail older patients who were not transferred to the ICU) (19). Median time of ICU admission was 18 days (IQR:14-25 days). AUROC analyses identified the following values as Youden index based cutoffs for predicting the endpoint: a CT score of ≥7 (AUROC: 0.760, 95% CI: 0.680-0.841, P-value<0.001) and a lung involvement percentage of ≥12.0% (AUC: 0.774, 95% CI 0.693-0.854, P-value<0.001) for visual analysis, a CT score of ≥8 (AUC: 0.888, 95% CI 0.820-0.956, P-value<0.001) and a lung involvement percentage of ≥19.8% (AUC: 0.878, 95% CI 0.823-0.933, P-value<0.001) for automated AI analysis, and a CT score of ≥8 (AUC: 0.903, 95% CI: 0.836-0.969, P-value<0.001) and a lung involvement percentage of ≥20.5% (AUC: 0.880, 95% CI: 0.823-0.937, P-value<0.001) for AI analysis with manual correction (Figure 1). AUROC values from automated AI analysis and AI analysis with manual correction were significantly higher than those from visual analysis for both semi-quantitative CT scores and percentages of lung involvement (all P<0.001). Kaplan-Meier curve analysis using the identified cutoffs showed that these values could be used to predict patient outcome (P<0.001 by log rank test for all analyses) (Figure 2). Visually, it was clear that most adverse events occur within the first week after chest CT, which was confirmed by analysis of Schoenfeld residuals with violation of the proportional hazards assumption (20). The restricted mean survival time (RMST) was estimated at 1 week, and the difference and ratio of RMST were estimated by bootstrap simulation (Table 2). 
For example, for AI analysis with manual correction, a percentage of lung involvement of more than 20.5% resulted in an RMST difference of -2.5 days (95% CI: -3.2 to -1.7 days) and an RMST ratio of 0.640 (95% CI: 0.539-0.760), which significantly favored the group with less lung involvement (both P<0.001). Additional Kaplan-Meier curves with groups based on quartiles of lung involvement are presented in Figure E1.
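To make the RMST figures concrete, the sketch below computes a Kaplan-Meier curve and integrates it as a step function up to a truncation time of 7 days, then forms the difference and ratio between two groups. It yields point estimates only; the bootstrap confidence intervals reported in the article are not reproduced. All names and the two small groups are hypothetical.

```python
def kaplan_meier(times: list[float], events: list[int]) -> list[tuple[float, float]]:
    """Return [(t, S(t))] at each distinct event time (events: 1=event, 0=censored)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing the same time point.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def rmst(times: list[float], events: list[int], tau: float) -> float:
    """Area under the Kaplan-Meier step function from time 0 to tau."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)


# Hypothetical groups: days to adverse outcome (1) or censoring (0).
high_rmst = rmst([1, 2, 3, 10], [1, 1, 0, 0], tau=7)
low_rmst = rmst([8, 9, 10], [0, 0, 0], tau=7)
print(high_rmst - low_rmst, high_rmst / low_rmst)
```

In this toy example the high-involvement group has an RMST of about 4.25 days against 7 days in the low-involvement group, so the difference is negative and the ratio is below 1, the same direction as the values reported in Table 2.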

Figure 1:

Cross-validated receiver-operating characteristic (ROC) curve analysis for prediction of adverse outcome based on semi-quantitative CT score (A-C) or quantitative percentage of lung involvement (D-F). AI-assisted analysis without and with manual correction outperformed visual analysis for both types of assessment (B/C vs A and E/F vs D).

Figure 2:

Kaplan-Meier curves showing the time to adverse outcome according to the cutoffs of semi-quantitative CT score (A-C) and quantitative percentage of lung involvement (D-F). AI-assisted analysis improved outcome prediction with clear divergence of curves.

Table 2:

Restricted mean survival time (RMST) difference, RMST ratio, and restricted mean time lost (RMTL) ratio for the different types of analysis. Arm 1 = semi-quantitative CT score or percentage of lung involvement higher than optimal cutoff. Arm 0 = semi-quantitative CT score or percentage of lung involvement lower than optimal cutoff.

Reader reproducibility

Intrareader agreement was high for both visual and AI-assisted analysis with manual correction (Table 3). However, AI-assisted analysis resulted in significantly higher ICC values with lower CV for semi-quantitative CT scores (ICC: 0.986 vs 0.935, CV: 11.4% vs 24.9%) and quantitative percentage of lung involvement (ICC: 0.997 vs 0.958, CV: 9.7% vs 25.3%). No significant intrareader bias was observed with Bland-Altman analysis for both types of analysis (Online appendix, Figure E2).

Table 3:

Intrareader and interreader reproducibility for visual and AI-assisted analysis of lung involvement.

Interreader agreement was also high for both visual and AI-assisted analysis with manual correction (Table 3). However, AI-assisted analysis resulted in significantly higher ICC values with lower CV for semi-quantitative CT scores (ICC: 0.960 vs 0.885, CV: 16.6% vs 25.6%) and quantitative percentage of lung involvement (ICC: 0.986 vs 0.925, CV: 17.2% vs 34.7%). No significant interreader bias was observed with Bland-Altman analysis for either type of analysis (Online appendix, Figure E3).
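The agreement statistics reported here can be illustrated with a small sketch of the Bland-Altman calculation: the bias is the mean of the paired differences and the 95% limits of agreement are the bias ± 1.96 SD of those differences. The coefficient of variation shown is one common convention (SD of the paired differences divided by the overall mean); whether the authors used exactly this definition is an assumption. The readings are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(read_a: list[float], read_b: list[float]) -> tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) of the 95% limits of agreement."""
    diffs = [a - b for a, b in zip(read_a, read_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def coefficient_of_variation(read_a: list[float], read_b: list[float]) -> float:
    """CV in percent: SD of paired differences over the overall mean (assumed definition)."""
    diffs = [a - b for a, b in zip(read_a, read_b)]
    return 100 * stdev(diffs) / mean(read_a + read_b)


# Hypothetical percentages of lung involvement from two reads of four scans.
first_read = [10, 20, 30, 40]
second_read = [12, 18, 33, 37]
bias, lower, upper = bland_altman(first_read, second_read)
cv = coefficient_of_variation(first_read, second_read)
print(round(bias, 2), round(lower, 2), round(upper, 2), round(cv, 1))
```

Narrower limits of agreement and a smaller CV indicate better reproducibility, which is the pattern the article reports for AI-assisted analysis relative to visual analysis.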

Visual analysis vs AI-assisted analysis reproducibility

For semi-quantitative CT scores, visual analysis demonstrated average agreement with AI-assisted analysis without and with manual correction (ICC: 0.670 and 0.682, respectively), whereas the agreement between both AI-assisted analyses was excellent (ICC: 0.990). Overall, no significant bias was observed with Bland-Altman analysis across the different types of CT analysis (Table 4, Figure 3). However, in patients with more extensive lung involvement, there was a tendency for visual analysis to yield higher semi-quantitative CT scores when compared to AI-assisted analysis (Figure 3A-3B).

Table 4:

Reproducibility between visual analysis, automated AI-assisted analysis, and AI-assisted analysis with manual correction.

For quantitative percentage of lung involvement, visual analysis demonstrated excellent agreement with AI-assisted analysis without and with manual correction (ICC: 0.873 and 0.871, respectively). Agreement between both AI-assisted analyses, however, was even better (ICC: 0.997). No significant bias was observed with Bland-Altman analysis across the different types of CT analysis (Table 4, Figure 3). Example analyses are shown in Figures 4 and 5.

Figure 4:

Example images from a 48-year-old female patient with RT-PCR confirmed COVID-19. CT scan was obtained 14 days after symptom onset at ER presentation and shows bilateral subpleural areas of consolidation in the lower lobes consistent with limited late-stage COVID-19 (arrows in A,B,C). AI-assisted analysis: semi-quantitative CT score of 2/25 and quantitative lung involvement of 0.29%. No manual correction was required. Visual assessment: semi-quantitative CT score of 2/25 and quantitative lung involvement of 1%. 3D reconstruction highlights the areas of consolidation in the lower lobes (D). Window center, -600 HU; window width, 1600 HU; slice thickness, 1 mm; and increment, 0.7 mm for all images.

Figure 5:

Example images from a 68-year-old female patient with RT-PCR confirmed COVID-19. CT scan was obtained 7 days after symptom onset at ER presentation and shows bilateral extensive subpleural areas of ground-glass opacities and consolidation consistent with extensive COVID-19. Automated AI-assisted analysis (A,B) failed to detect small areas of ground-glass opacities in the left upper lobe and included part of the thoracic wall in the area of consolidation in the right upper lobe (arrows in A and B) (semi-quantitative CT score 8/25, percentage of lung involvement 23.60%). Reader manual correction added these small areas of ground-glass opacities and corrected the segmentation of the thoracic wall (arrows in C and D) (semi-quantitative CT score 9/25, percentage of lung involvement 25.24%). Patient was admitted to the ICU 1 day later. Window center, -600 HU; window width, 1600 HU; slice thickness, 1 mm; and increment, 0.7 mm for all images.

Sample size estimation for clinical trials

On the basis of the interreader variability of chest CT, we estimated sample sizes needed to detect significant decreases in lung involvement during a clinical trial (Figure 6). For example, a clinical trial intended to show a change of 5% in lung involvement over time (e.g., a change from 20% to 15% in lung involvement) with a power of 90% would require 250 patients in each group for an AI-assisted analysis, whereas 1014 patients would be required in each group for a visual analysis.

Figure 6:

Graph shows the estimated sample size required in each group to detect a change in percentage of lung involvement with 90% power and 0.05 α error. The x-axis represents the desired detectable change in lung involvement and the y-axis the corresponding sample size needed for visual analysis (blue) and AI-assisted analysis with manual correction (red).
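The sample size curve can be approximated with a short calculation. The sketch assumes the usual per-group form n = f · 2σ² / δ², with f = 10.5 for 90% power and an α error of 0.05 as stated in the Statistical analysis section. Using the reported interreader variability values (17.2 and 34.7, which the text labels as coefficients of variation) as σ is an assumption made here for illustration, as are the function and variable names.

```python
import math

# Factor for 90% power and a two-sided alpha error of 0.05, as quoted in the text.
F_POWER90_ALPHA05 = 10.5


def sample_size_per_group(sigma: float, delta: float, f: float = F_POWER90_ALPHA05) -> int:
    """Patients needed per group to detect a change of delta given variability sigma."""
    if sigma <= 0 or delta <= 0:
        raise ValueError("sigma and delta must be positive")
    return math.ceil(f * 2 * sigma**2 / delta**2)


# Assumed inputs: reported interreader variability, desired change of 5%.
n_ai = sample_size_per_group(sigma=17.2, delta=5)
n_visual = sample_size_per_group(sigma=34.7, delta=5)
print(n_ai, n_visual, round(n_visual / n_ai, 1))
```

Under these assumptions the sketch gives about 249 and 1012 patients per group, close to but not identical to the published 250 and 1014, presumably because the inputs quoted in the text are rounded. The roughly fourfold ratio follows directly from the variability being about twice as large for visual analysis, since n scales with σ².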

Discussion

The extent of lung involvement on chest CT in COVID-19 patients has important prognostic value and is associated with short-term clinical deterioration. Improved risk stratification of COVID-19 patients is crucial for cost-effective patient management by prompting safe hospital discharge of low-risk patients and prolonged in-hospital and follow-up surveillance of high-risk patients. The role of chest CT as a potential tool for COVID-19 diagnosis has been extensively studied with conflicting recommendations, ranging from using CT as a first-line screening modality to warnings against its overuse and a false sense of security (21,22). Our results suggest that chest CT may be viewed as a risk stratification tool rather than a diagnostic tool per se. However, it is important to realize that chest CT should not be viewed as the sole prognosticator in COVID-19 subjects as multiple clinical and biochemical factors have been previously shown to be associated with adverse outcome (4,8,23,24). Importantly, we found that an AI-assisted approach improved patient risk stratification and reduced variability over a conventional visual approach. These results are in line with previous studies showing superior performance of an AI-driven approach for several medical image segmentation applications, ranging from organ segmentation to segmentation of the vascular network of the human eye (25,26). This success can be attributed to its capability to learn representative and unique image features from large datasets, rather than relying on individually estimated features based on the subjective experience of human experts. Colombi et al. (7) previously found similar prognostic performance of visual and software-based quantification of lung involvement on chest CT. The superior performance of an AI-assisted approach in our study could be attributed to the recent developments within these software packages, whereas their software merely depended on a density-based approach. 
A density-based approach works well in normal lungs. Conversely, in lungs with severe COVID-19 involvement, average density is increased and thus thresholding without further texture analysis leads to errors (27). Another important advantage of an AI-assisted approach is a reduction in reader variability. Understanding variability in measurements of lung involvement on chest CT is crucial to interpret changes in lung involvement over time and accurately predict patient outcome. Our results suggest that the reproducibility of AI-assisted chest CT analysis is sufficient to accurately monitor treatment response in clinical trials with reasonable sample sizes. However, using only a visual analysis resulted in substantially larger sample sizes (by a factor of 4) and therefore is not recommended for future clinical trials. Interestingly, these excellent reproducibility results were obtained using low-dose scans with a mean effective radiation dose of 0.60 mSv ± 0.35 mSv (SD), suggesting high performance of the AI algorithm even in the presence of substantial image noise. Using a low-dose approach for COVID-19 patients may result in important radiation dose reductions on a population level as CT scans are extensively being used in the diagnostic and prognostic work-up of possible COVID-19 patients. Furthermore, a low-dose approach is even more critical in clinical studies where CT is used for follow-up or therapy response assessment as these patients would receive multiple CT scans. However, it is important to note that during a public health crisis radiation dose consideration should not be the determining factor in deciding imaging strategies. This study has several limitations. First, this study represents a single-center experience with one type of AI software. The software we used is freely available through the postprocessing software by one of the major CT manufacturers worldwide (Syngo.Via, Siemens Healthineers) and thus has the potential for broad clinical use. 
However, these results are only valid for the current version of the AI software (v.2.0) and further research evaluating and comparing different AI-based software packages is warranted. Second, true interstudy variability was not assessed in our study as this requires a second CT scan within a very short time frame (likely within hours of the first scan due to the virulent nature of COVID-19). Previous studies, however, have suggested very low interstudy variability in lung volume and nodule assessment on chest CT exams (28,29). Therefore, interstudy variability can be approximated by using the interreader variability. Third, overall risk stratification of COVID-19 patients should not solely rely on chest CT findings. Integration of clinical, biochemical, and radiological findings is essential for optimal risk prognostication. Larger studies are needed to allow for a more comprehensive, multivariable risk stratification of COVID-19 patients. Finally, the use of advanced deep-learning based iterative reconstruction algorithms and state-of-the-art hardware may result in better image quality at similar radiation doses and could theoretically further improve image segmentation (30). In conclusion, AI-assisted analysis of lung involvement on submillisievert low-dose chest CT outperformed conventional visual analysis in predicting outcome in COVID-19 patients while reducing CT variability. Lung involvement on chest CT could be used as a reliable metric in clinical trials.

22 in total

1. Evaluating variability in tumor measurements from same-day repeat CT scans of patients with non-small cell lung cancer.

Authors: Binsheng Zhao; Leonard P James; Chaya S Moskowitz; Pingzhen Guo; Michelle S Ginsberg; Robert A Lefkowitz; Yilin Qin; Gregory J Riely; Mark G Kris; Lawrence H Schwartz
Journal: Radiology Date: 2009-07 Impact factor: 11.105

2. Feasibility of Dose-reduced Chest CT with Photon-counting Detectors: Initial Results in Humans.

Authors: Rolf Symons; Amir Pourmorteza; Veit Sandfort; Mark A Ahlman; Tracy Cropper; Marissa Mallek; Steffen Kappler; Stefan Ulzheimer; Mahadevappa Mahesh; Elizabeth C Jones; Ashkan A Malayeri; Les R Folio; David A Bluemke
Journal: Radiology Date: 2017-07-28 Impact factor: 11.105

3. Why Test for Proportional Hazards?

Authors: Mats J Stensrud; Miguel A Hernán
Journal: JAMA Date: 2020-03-13 Impact factor: 56.272

4. Statistical methods for assessing agreement between two methods of clinical measurement.

Authors: J M Bland; D G Altman
Journal: Lancet Date: 1986-02-08 Impact factor: 79.321

5. Baseline Characteristics and Outcomes of 1591 Patients Infected With SARS-CoV-2 Admitted to ICUs of the Lombardy Region, Italy.

Authors: Giacomo Grasselli; Alberto Zangrillo; Alberto Zanella; Massimo Antonelli; Luca Cabrini; Antonio Castelli; Danilo Cereda; Antonio Coluccello; Giuseppe Foti; Roberto Fumagalli; Giorgio Iotti; Nicola Latronico; Luca Lorini; Stefano Merler; Giuseppe Natalini; Alessandra Piatti; Marco Vito Ranieri; Anna Mara Scandroglio; Enrico Storti; Maurizio Cecconi; Antonio Pesenti
Journal: JAMA Date: 2020-04-28 Impact factor: 56.272

6. Reduced Lung-Cancer Mortality with Volume CT Screening in a Randomized Trial.

Authors: Harry J de Koning; Carlijn M van der Aalst; Pim A de Jong; Ernst T Scholten; Kristiaan Nackaerts; Marjolein A Heuvelmans; Jan-Willem J Lammers; Carla Weenink; Uraujh Yousaf-Khan; Nanda Horeweg; Susan van 't Westeinde; Mathias Prokop; Willem P Mali; Firdaus A A Mohamed Hoesein; Peter M A van Ooijen; Joachim G J V Aerts; Michael A den Bakker; Erik Thunnissen; Johny Verschakelen; Rozemarijn Vliegenthart; Joan E Walter; Kevin Ten Haaf; Harry J M Groen; Matthijs Oudkerk
Journal: N Engl J Med Date: 2020-01-29 Impact factor: 91.245

7. Automated segmentation of lungs with severe interstitial lung disease in CT.

Authors: Jiahui Wang; Feng Li; Qiang Li
Journal: Med Phys Date: 2009-10 Impact factor: 4.071

8. Well-aerated Lung on Admitting Chest CT to Predict Adverse Outcome in COVID-19 Pneumonia.

Authors: Davide Colombi; Flavio C Bodini; Marcello Petrini; Gabriele Maffi; Nicola Morelli; Gianluca Milanese; Mario Silva; Nicola Sverzellati; Emanuele Michieletti
Journal: Radiology Date: 2020-04-17 Impact factor: 11.105

9. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China.

Authors: Chaolin Huang; Yeming Wang; Xingwang Li; Lili Ren; Jianping Zhao; Yi Hu; Li Zhang; Guohui Fan; Jiuyang Xu; Xiaoying Gu; Zhenshun Cheng; Ting Yu; Jiaan Xia; Yuan Wei; Wenjuan Wu; Xuelei Xie; Wen Yin; Hui Li; Min Liu; Yan Xiao; Hong Gao; Li Guo; Jungang Xie; Guangfa Wang; Rongmeng Jiang; Zhancheng Gao; Qi Jin; Jianwei Wang; Bin Cao
Journal: Lancet Date: 2020-01-24 Impact factor: 79.321

10. A role for CT in COVID-19? What data really tell us so far.

Authors: Michael D Hope; Constantine A Raptis; Amar Shah; Mark M Hammer; Travis S Henry
Journal: Lancet Date: 2020-03-27 Impact factor: 79.321

7 in total

1. Rapid quantification of COVID-19 pneumonia burden from computed tomography with convolutional long short-term memory networks.

Authors: Aditya Killekar; Kajetan Grodecki; Andrew Lin; Sebastien Cadet; Priscilla McElhinney; Aryabod Razipour; Cato Chan; Barry D Pressman; Peter Julien; Peter Chen; Judit Simon; Pal Maurovich-Horvat; Nicola Gaibazzi; Udit Thakur; Elisabetta Mancini; Cecilia Agalbato; Jiro Munechika; Hidenari Matsumoto; Roberto Menè; Gianfranco Parati; Franco Cernigliaro; Nitesh Nerlekar; Camilla Torlasco; Gianluca Pontone; Damini Dey; Piotr Slomka
Journal: J Med Imaging (Bellingham) Date: 2022-09-06

2. Assessment of pulmonary arterial circulation 3 months after hospitalization for SARS-CoV-2 pneumonia: Dual-energy CT (DECT) angiographic study in 55 patients.

Authors: Martine Remy-Jardin; Louise Duthoit; Thierry Perez; Paul Felloni; Jean-Baptiste Faivre; Stéphanie Fry; Nathalie Bautin; Cécile Chenivesse; Jacques Remy; Alain Duhamel
Journal: EClinicalMedicine Date: 2021-03-30

3. Rapid quantification of COVID-19 pneumonia burden from computed tomography with convolutional LSTM networks.

Authors: Kajetan Grodecki; Aditya Killekar; Andrew Lin; Sebastien Cadet; Priscilla McElhinney; Aryabod Razipour; Cato Chan; Barry D Pressman; Peter Julien; Judit Simon; Pal Maurovich-Horvat; Nicola Gaibazzi; Udit Thakur; Elisabetta Mancini; Cecilia Agalbato; Jiro Munechika; Hidenari Matsumoto; Roberto Menè; Gianfranco Parati; Franco Cernigliaro; Nitesh Nerlekar; Camilla Torlasco; Gianluca Pontone; Damini Dey; Piotr J Slomka
Journal: ArXiv Date: 2021-03-31

4. ¹²⁴I-Iodo-DPA-713 Positron Emission Tomography in a Hamster Model of SARS-CoV-2 Infection.

Authors: Camilo A Ruiz-Bedoya; Filipa Mota; Sabra L Klein; Alvaro A Ordonez; Catherine A Foss; Alok K Singh; Monali Praharaj; Farina J Mahmud; Ali Ghayoor; Kelly Flavahan; Patricia De Jesus; Melissa Bahr; Santosh Dhakal; Ruifeng Zhou; Clarisse V Solis; Kathleen R Mulka; William R Bishai; Andrew Pekosz; Joseph L Mankowski; Jason Villano; Sanjay K Jain
Journal: Mol Imaging Biol Date: 2021-08-23 Impact factor: 3.488

5. Automated AI-Driven CT Quantification of Lung Disease Predicts Adverse Outcomes in Patients Hospitalized for COVID-19 Pneumonia.

Authors: Marie Laure Chabi; Ophélie Dana; Titouan Kennel; Alexia Gence-Breney; Hélène Salvator; Marie Christine Ballester; Marc Vasse; Anne Laure Brun; François Mellot; Philippe A Grenier
Journal: Diagnostics (Basel) Date: 2021-05-14

6. A Pictorial Review of the Role of Imaging in the Detection, Management, Histopathological Correlations, and Complications of COVID-19 Pneumonia.

Authors: Barbara Brogna; Elio Bignardi; Claudia Brogna; Mena Volpe; Giulio Lombardi; Alessandro Rosa; Giuliano Gagliardi; Pietro Fabio Maurizio Capasso; Enzo Gravino; Francesca Maio; Francesco Pane; Valentina Picariello; Marcella Buono; Lorenzo Colucci; Lanfranco Aquilino Musto
Journal: Diagnostics (Basel) Date: 2021-03-04

7. AI-Based Quantitative CT Analysis of Temporal Changes According to Disease Severity in COVID-19 Pneumonia.

Authors: Selin Ardali Duzgun; Gamze Durhan; Figen Basaran Demirkazik; Ilim Irmak; Jale Karakaya; Erhan Akpinar; Meltem Gulsun Akpinar; Ahmet Cagkan Inkaya; Serpil Ocal; Arzu Topeli; Orhan Macit Ariyurek
Journal: J Comput Assist Tomogr Date: 2021 Nov-Dec 01 Impact factor: 1.826
