Hyungjin Kim1, Hyunsook Hong1, Soon Ho Yoon1. 1. From the Department of Radiology, Seoul National University College of Medicine, 101 Daehak-ro, Jongno-gu, Seoul 03080, Korea (H.K., S.H.Y.); Institute of Radiation Medicine, Seoul National University Medical Research Center, Seoul, Korea (H.K., S.H.Y.); and Medical Research Collaborating Center, Seoul National University Hospital, Seoul, Korea (H.H.).
Chest computed tomography scans for the primary screening or diagnosis of coronavirus disease 2019 would not be beneficial in a low-prevalence region due to the substantial rate of false-positives.
■ The pooled sensitivity and specificity were 94% (95% CI: 91%, 96%) and 37% (95% CI: 26%, 50%), respectively, for chest CT. The pooled sensitivity of reverse transcriptase-polymerase chain reaction (RT-PCR) was 89% (95% CI: 81%, 94%).
■ In low-prevalence (<10%) countries, the positive predictive value of RT-PCR (range: 47.3%, 84.3%) was more than ten times higher than that of CT scans (range: 1.5%, 8.3%). The negative predictive value of both methods ranged from 99.0% to 99.9%.
Introduction
The outbreak of coronavirus disease 2019 (COVID-19) began in Wuhan, China, in December 2019, and rapidly spread to neighboring Asian and Western countries. On January 30, 2020, the World Health Organization (WHO) declared COVID-19 a public health emergency of international concern (1), and on March 12, the WHO declared COVID-19 to be a pandemic (2). As of April 8, a total of 1,282,931 infected patients were reported globally, with 72,774 deaths, and COVID-19 cases had been reported in 211 countries or areas (2).
Given the shortage of reverse transcriptase-polymerase chain reaction (RT-PCR) testing kits for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the pathogen causing COVID-19, recent studies have suggested that chest computed tomography (CT) scans could be used as a primary screening or diagnostic tool in epidemic areas (3-5). Ai et al. (3) reported that chest CT had a high sensitivity (97%; 95% CI: 95%, 98%) for detecting COVID-19. In addition, a manufacturing defect in the earliest test kits in the United States raised the expectation that chest CT would become a major modality for COVID-19 screening or diagnosis (6).
Nevertheless, chest CT has low specificity because the findings of COVID-19 are nonspecific and overlap with those of other viral pneumonias, which raises concerns regarding the clinical utility of CT for COVID-19 screening (3, 7). If CT has a low positive predictive value (PPV), screenees would receive unnecessary radiation exposure. In addition, the heavy workload for hospital staff and difficulties with disinfection procedures are non-negligible issues related to the widespread use of CT as a diagnostic tool for COVID-19. Recently, the Society of Thoracic Radiology and American Society of Emergency Radiology jointly released a position statement according to which routine CT screening is not recommended for the diagnosis of patients under investigation for COVID-19 (8).
In this context, the diagnostic performance of chest CT scans for COVID-19 should be evaluated and carefully compared with RT-PCR in conjunction with the prevalence of the disease (9). The aim of this meta-analysis was to evaluate diagnostic performance measures, including predictive values, of chest CT and initial RT-PCR across a wide range of disease prevalence rates, simulating low- and high-prevalence regions.
Materials and Methods
This study was deemed exempt by the Seoul National University Hospital institutional review board, and informed consent was not necessary.
Search Strategy and Study Selection
This study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) reporting guidelines. Given the urgent need for evidence on diagnostic studies under the current conditions of the COVID-19 pandemic, we conducted this study without registration in a prospective registry. We initially searched MEDLINE and Embase from December 1, 2019 to March 16, 2020 for studies on COVID-19 that reported the diagnostic sensitivity and/or specificity of chest CT scans and/or RT-PCR assays. The search was updated as of April 3, 2020. Key words for the literature search included “coronavirus disease,” “novel coronavirus,” “2019-nCoV,” and “SARS-CoV-2.” The search strategy was designed by an experienced investigator (S.H.Y.) and conducted independently by two reviewers (H.K. and S.H.Y.). The search was further supplemented by screening the bibliographies of the retrieved articles and reviewing the COVID-19 materials related to RT-PCR that were provided by the WHO.
Eligibility Criteria
The inclusion criteria were as follows: 1) study populations consisting of at least five patients with COVID-19, 2) studies in which RT-PCR assays served as the reference standard, and 3) studies in which diagnostic performance measures (i.e., sensitivity and/or specificity) of initial RT-PCR and/or CT were extractable. The exclusion criteria were: 1) studies on pregnant women and neonates, 2) case reports or series with fewer than five patients, 3) lack of extractable data for a two-by-two contingency table, 4) studies that reported solely the specificity of RT-PCR, 5) non-accessible full-text versions, 6) a study population completely overlapping with that of other studies, 7) lack of a description of repeated RT-PCR assays as the reference standard for studies on the sensitivity of initial RT-PCR, and 8) studies on RT-PCR with per-sample basis analysis, in which the results of initial RT-PCR were not separated.
Data Extraction
Two authors (H.K. and S.H.Y.) independently extracted the following data from each study: (a) patient characteristics (number of patients, mean or median age, and the proportions of elderly patients [age > 65 years], men, asymptomatic patients, patients with comorbidities, and patients with severe to critical illness [10]); (b) study characteristics (region of the study, disease prevalence [i.e., the proportion of the patients diagnosed with COVID-19 among those tested], and usage of the open reading frame (ORF) 1ab and N genes as targets of the RT-PCR assay (11), as initially recommended by the Chinese Center for Disease Control and Prevention); and (c) the results of each initial diagnostic test (true positive, false positive, true negative, and false negative) with reference to the results of RT-PCR assays. We did not extract the results of diagnostic tests from studies in which particular conditions (i.e., COVID-19 versus viral pneumonia other than COVID-19) were intentionally enriched in the study population for research purposes, as those studies would not adequately reflect real settings.
For the reference standard of chest CT, any positive results and all negative results from initial or repeated RT-PCR assays were regarded as disease positives and disease negatives, respectively. For RT-PCR, all analyzed studies had repeated RT-PCR assays as the reference standard according to the eligibility criteria. That is, any positive results from the repeated RT-PCR assays were regarded as disease positives. We extracted the results of RT-PCR from clinical upper respiratory specimens (nasopharyngeal swab, throat swab, or sputum) within 14 days of symptom onset. When the literature provided a particular commercial name of an RT-PCR kit without target genes, we attempted to identify the genes by referring to the company’s website and the brochure of the kit. Disagreements between the two reviewers were resolved through consensus.
Data Analysis
The diagnostic sensitivity of chest CT and RT-PCR and the specificity of chest CT were pooled separately using a random-effects model. The specificity of RT-PCR was not pooled because, as the reference standard, it should be 100% by definition (i.e., no false-positives). Since only five studies reported both sensitivity and specificity of chest CT and only one study reported the diagnostic performance measures of both chest CT and RT-PCR, a bivariate model and a hierarchical summary receiver operating characteristic curve could not be employed (12). The positive predictive value and negative predictive value (NPV) of each diagnostic test were estimated for a wide range of disease prevalence rates, ranging from 0.1% to 90%. In addition, the actual disease prevalence in eight countries (i.e., Taiwan, Australia, South Korea, Germany, United States, Italy, France, and United Kingdom) was obtained from web sources (13), and the predictive values for each country were calculated and plotted. The specificity of RT-PCR was assumed to be 99% for the calculation of predictive values.
Meta-regression was performed to reveal the effect of potential explanatory factors including the region of the study, the proportion of elderly patients, the distribution of disease severity, the proportion of patients with comorbidities, the proportion of asymptomatic patients, and the usage of ORF genes as RT-PCR targets. Studies without extractable data were excluded from the meta-regression analysis. In addition, sensitivity analysis was conducted for chest CT. The sensitivity of chest CT was pooled for the studies that explicitly specified the reference standard as repeated RT-PCR assays.
We used the I² statistic to assess heterogeneity across the studies. The quality of the included publications was assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS)-2 tool. Publication bias was evaluated visually using funnel plots of sample size against sensitivity (14).
All p-values were based on two-sided tests. A p-value < 0.05 was considered to represent statistical significance. Statistical analyses were conducted using SAS® version 9.3 (SAS Institute, Cary, NC, USA) and R software version 3.6.1 (http://www.R-project.org).
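As an illustration only, the pooling of a proportion under a random-effects model and the accompanying I² statistic can be sketched as follows. The text does not state which estimator or transformation was used in SAS or R, so this sketch assumes a DerSimonian-Laird estimator on the logit scale with a 0.5 continuity correction; the function name and the study counts are hypothetical and are not taken from the included studies.

```python
import math


def expit(x):
    """Inverse logit: map a log-odds value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_logit_random_effects(events, totals):
    """Pool proportions with a DerSimonian-Laird random-effects model (assumed estimator).

    events: per-study counts of test-positive patients among the diseased
    totals: per-study counts of diseased patients
    Returns (pooled proportion, 95% CI lower, 95% CI upper, I^2 in percent).
    """
    y, v = [], []
    for e, n in zip(events, totals):
        # A 0.5 continuity correction (applied to every study for simplicity)
        # keeps the logit finite when e == 0 or e == n.
        e_c, n_c = e + 0.5, n + 1.0
        y.append(math.log(e_c / (n_c - e_c)))      # logit of the proportion
        v.append(1.0 / e_c + 1.0 / (n_c - e_c))    # approximate variance of the logit

    # Fixed-effect (inverse-variance) weights and Cochran's Q.
    w = [1.0 / vi for vi in v]
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1

    # Between-study variance (tau^2) and the I^2 heterogeneity statistic.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return expit(mu), expit(mu - 1.96 * se), expit(mu + 1.96 * se), i2


# Hypothetical true-positive counts and numbers of RT-PCR-confirmed patients.
tp = [45, 88, 19, 150]
n_pos = [50, 90, 30, 160]
pooled, lo, hi, i2 = pool_logit_random_effects(tp, n_pos)
print(f"pooled sensitivity {pooled:.1%} (95% CI {lo:.1%}, {hi:.1%}); I2 = {i2:.0f}%")
```

Pooling on the logit scale keeps the back-transformed confidence limits between 0 and 1, which matters when sensitivities are close to 100%. Other estimators and transformations are equally plausible and would give somewhat different intervals.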
Results
Study Selection
In total, 2172 studies published between January 1 and March 16 were identified by the electronic search strategy, and 1418 studies remained after duplicates were excluded. Through title and abstract review, 1262 publications were excluded. After a full-text review, 115 studies were further excluded for the following reasons: studies on pregnant women and neonates (n = 4), case reports or series with fewer than five patients (n = 17), non-full-text research articles (n = 39), not relevant to the topic of interest (n = 31), lack of extractable data (n = 8), studies solely on the specificity of RT-PCR (n = 2), full-text not accessible (n = 9), retracted article (n = 1), uncertain reference standard for RT-PCR (n = 1), and per-sample basis analysis for RT-PCR (n = 3). Three additional studies were found through the WHO National Laboratory (n = 2) and a bibliographic search (n = 1). Thirty-seven studies published between March 17 and April 3 were added after the updated literature search, and 13 studies with an overlapping study population were excluded. Consequently, 68 articles satisfied the inclusion and exclusion criteria. Sixty-three of these studies reported diagnostic performance for chest CT and 19 studies did so for RT-PCR (Fig 1).
Figure 1:
PRISMA flow diagram of the study selection process. CT = computed tomography; PRISMA = Preferred Reporting Items for Systematic Reviews and Meta-Analyses; RT-PCR = reverse transcriptase-polymerase chain reaction; WHO = World Health Organization.
Study Characteristics
The 63 studies that investigated chest CT comprised 6218 patients (3-5, 15-74) and the 19 studies analyzing RT-PCR comprised 1502 patients (3-5, 16, 22, 30, 31, 44, 53, 55, 60, 66-68, 75-79). All 63 studies investigating CT scans reported the diagnostic sensitivity of CT for COVID-19 (3-5, 15-74), while five of them also reported its specificity (3, 39, 43, 47, 50). Thus, both sensitivity and specificity were analyzed for the same study population in those five studies. Twenty-one studies on chest CT were performed with repeated RT-PCR assays as the reference standard (3-5, 16, 22, 28, 30, 31, 38, 39, 43, 44, 47, 53, 55, 60, 64, 66-68, 74). Fourteen studies reported the sensitivity of both chest CT scans and RT-PCR (3-5, 16, 22, 30, 31, 44, 53, 55, 60, 66-68).
The study population inclusion period ranged from December 2019 to March 2020. The mean or median age of patients ranged from 3 to 70 years. Twelve studies were from Wuhan, China (3, 23, 25, 26, 29, 36, 49, 51, 61, 71, 75, 77). Correlations were observed between disease severity and the proportion of patients with comorbidities (Spearman’s ρ = 0.53; p = 0.003) and between disease severity and the proportion of elderly patients (Spearman’s ρ = 0.52; p = 0.01). The characteristics of the included studies are outlined in Table E1 (online).
Quality Assessment
The included studies had a relatively low risk of bias in patient selection and in flow and timing (Fig 2). Regarding the index test, 33 of 68 studies (49%) did not describe whether CT images were interpreted blinded to the results of RT-PCR, causing an unclear risk of bias in those studies. A paucity of detail in the description of RT-PCR procedures existed in 28 of 68 studies (41%), resulting in an unclear risk of bias. Nevertheless, the paucity of details in those descriptions was determined not to raise concerns regarding the applicability of the CT and RT-PCR results in most studies.
Figure 2:
Grouped bar charts for risk of bias and concerns regarding the applicability of the 68 included studies using QUADAS-2. QUADAS-2 = Quality Assessment of Diagnostic Accuracy Studies-2.
Diagnostic Performance of Chest CT Scan and RT-PCR
The pooled sensitivity was 94% (95% CI: 91%, 96%; I² = 95%) for chest CT and 89% (95% CI: 81%, 94%; I² = 90%) for RT-PCR (Figs 3a and 3b). There was substantial heterogeneity for both chest CT and RT-PCR. The pooled specificity was 37% (95% CI: 26%, 50%; I² = 83%) for chest CT (Fig 3c).
Figure 3:
Forest plots of pooled sensitivity of (a) chest CT and (b) RT-PCR and pooled specificity of (c) chest CT. Univariate analyses were performed for sensitivity and specificity, respectively. CT = computed tomography; RT-PCR = reverse transcriptase-polymerase chain reaction.
The pooled prevalence of COVID-19 was 38% (95% CI: 26%, 51%; I² = 90%), which was extracted from the five studies from China (3, 39, 47), Italy (43), and Japan (50). The pooled prevalence in China was 39% (95% CI: 23%, 59%; I² = 92%), as extracted from the three studies (3, 39, 47). The estimated PPV and NPV of chest CT were 1.5% and 99.8% at a disease prevalence of 1%, 14.2% and 98.2% at a prevalence of 10%, and 48.8% and 90.6% at a prevalence of 39%, respectively (Fig 4a). The estimated PPV and NPV of RT-PCR were 47.3% and 99.9% at a disease prevalence of 1%, 90.8% and 98.8% at a prevalence of 10%, and 98.3% and 93.4% at a prevalence of 39%, respectively (Fig 4b).
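The dependence of the predictive values on prevalence follows from Bayes' theorem, and the calculation can be sketched as below. This is a minimal illustration that assumes the rounded pooled estimates reported in this study (sensitivity 94% and specificity 37% for chest CT; sensitivity 89% and an assumed specificity of 99% for RT-PCR); the function and variable names are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    # Expected fractions of the tested population in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded pooled estimates from this study; RT-PCR specificity is the assumed 99%.
TESTS = {"chest CT": (0.94, 0.37), "RT-PCR": (0.89, 0.99)}

for name, (sens, spec) in TESTS.items():
    for prev in (0.01, 0.10, 0.39):
        ppv, npv = predictive_values(sens, spec, prev)
        print(f"{name:8s} prevalence {prev:4.0%}: PPV {ppv:5.1%}, NPV {npv:5.1%}")
```

Because the false-positive term scales with (1 − prevalence), a test with 37% specificity produces far more false positives than true positives when few of the tested patients have the disease, which is why the PPV of chest CT falls so sharply at low prevalence.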
Figure 4:
Estimated positive predictive value and negative predictive value of (a) chest CT and (b) RT-PCR. The solid line indicates the positive predictive value, and the dotted line denotes the negative predictive value. The red dots indicate the predictive values for eight different countries and China (the right-most point; prevalence, 39%). CT = computed tomography; RT-PCR = reverse transcriptase-polymerase chain reaction.
The prevalence of COVID-19 outside China ranged from 1.0% to 22.9%. For chest CT scans, the PPV ranged from 1.5% to 30.7%, and the NPV ranged from 95.4% to 99.8%. For RT-PCR, the PPV ranged from 47.3% to 96.4%, while the NPV ranged from 96.8% to 99.9% (Table 1).
Table 1:
Estimated Predictive Values of Chest CT and RT-PCR for COVID-19 in Nine Countries
Meta-regression Analysis
Meta-regression analyses for the sensitivity of chest CT revealed that the distribution of disease severity, the proportion of patients with comorbidities, and the proportion of asymptomatic patients significantly affected heterogeneity (all p < 0.05). The region of study origin (p = 0.14) and the proportion of elderly patients (p = 0.07) were not associated with the sensitivity of chest CT. The sensitivity of RT-PCR was negatively associated with the proportion of elderly patients (p = 0.01). No evidence was found for any association of the sensitivity of RT-PCR with the region of study origin (p = 0.52), the distribution of disease severity (p = 0.91), the proportion of patients with comorbidities (p = 0.52), the proportion of asymptomatic patients (p = 0.19), or whether RT-PCR targeted ORF genes (p = 0.98). Detailed results are provided in Table 2.
Table 2:
Meta-Regression Analysis for the Diagnostic Sensitivity of Chest CT and RT-PCR
Sensitivity Analysis
The results of the sensitivity analysis (Fig 5) were similar to those of the primary analysis. The pooled sensitivity of chest CT for the studies with repeated RT-PCR as the reference standard was 95% (95% CI: 88%, 96%; I² = 87%). The pooled specificity was 35% (95% CI: 23%, 48%; I² = 86%).
Figure 5:
Sensitivity analysis for chest CT using studies with repeated RT-PCR assays as the reference standard. Forest plots of (a) pooled sensitivity and (b) pooled specificity. RT-PCR = reverse transcriptase-polymerase chain reaction.
Publication Bias
A visual assessment of funnel plots demonstrated that the likelihood of publication bias was low for the studies on chest CT scans and RT-PCR (Fig 6).
Figure 6:
Funnel plots. The likelihood of publication bias was low for the studies on chest CT scans and RT-PCR. CT = computed tomography; RT-PCR = reverse transcriptase-polymerase chain reaction.
Discussion
In this meta-analysis, we demonstrated that the pooled sensitivity was 94% (95% CI: 91%, 96%) for chest computed tomography (CT) and 89% (95% CI: 81%, 94%) for reverse transcriptase-polymerase chain reaction (RT-PCR). The pooled specificity of chest CT was 37% (95% CI: 26%, 50%). Given the low specificity of chest CT, a large gap in the positive predictive value (PPV) between chest CT and RT-PCR in low-prevalence regions was noted. Specifically, in countries with a prevalence less than 10%, the PPV of RT-PCR was more than ten times higher than that of CT scans. Nevertheless, the negative predictive value of both methods ranged from 99.0% to 99.9%. Our results imply that the usage of chest CT scans in low-prevalence regions could induce a large number of false-positive results. False-positive results may lead to further diagnostic testing, greater medical costs, and a heavier workload for medical staff, as well as patient anxiety.
Recent studies on the diagnostic performance of chest CT and RT-PCR have mostly been reported from China, and in those studies, the pooled prevalence was 39%. The prevalence in the studies from China was higher than the prevalence in the other eight countries we analyzed in this study. Suggestions from a high-prevalence region should be carefully adjudicated before clinical implementation in relatively low-prevalence regions. Our results coincide with the statements from the American College of Radiology recommending that CT should not be used to screen for or as a first-line test to diagnose COVID-19 and that CT should be used sparingly for hospitalized, symptomatic patients (80). The Society of Thoracic Radiology and American Society of Emergency Radiology also published a similar position statement that CT scans are not recommended as a primary screening tool (8).
In the meta-regression analysis, we found that the diagnostic sensitivity of CT was affected by the distribution of disease severity, comorbidities, and the proportion of asymptomatic patients. This reflects the fact that variations in diagnostic performance measures can be caused by the patient spectrum, referral filter, and reader expectations (81). The patient spectrum, which refers to differences in severity, might have been reflected in the variables we analyzed. It is known that tests in a more severely diseased population may be associated with a higher prevalence or with better diagnostic performance (82). In contrast, the sensitivity of RT-PCR was lower in studies with a higher proportion of elderly patients. This is plausible because RT-PCR requires a sample to be obtained (nasopharyngeal swab, throat swab, or sputum). Sampling in patients with poor performance status is difficult and prone to sampling error. This is a form of artifactual variability that can affect diagnostic test accuracy.
It should be noted that there was substantial heterogeneity in the included studies. The Higgins I² statistics for the sensitivity of chest CT and RT-PCR were 95% and 90%, respectively. In the subgroup analyses, there was a considerable level of unexplained heterogeneity, which complicates the interpretation and usefulness of pooled effect estimates. For example, the unexplained heterogeneity in the subgroup analysis for the target genes of RT-PCR was 91%, which suggests that other factors likely affected these results, such as vendor-specific effects (83) and differences in the quality assurance process. The same was true for CT scans. Readers’ experience, internal threshold, and acquisition parameters such as radiation dosage and slice thickness, which affect image quality, are potential factors that were not investigated in our study.
Our study had limitations. First, we included a small number of studies, which were not randomized clinical trials. Most of the studies had a retrospective design. However, we believe that a prospective diagnostic study would not be feasible in this global health emergency. Second, bivariate analysis for the sensitivity and specificity of each diagnostic method could not be performed due to the limited number of available studies. At least 10 studies are generally required to estimate all parameters of a bivariate model (12), but only five available studies reported both the sensitivity and specificity of chest CT scans. Therefore, the threshold effect was not investigated, which could give misleading results (84). Third, we could not consider the interval since symptom onset, which can affect the diagnostic performance of CT scans and RT-PCR assays.
In conclusion, chest CT scans for the primary screening or diagnosis of coronavirus disease 2019 would not be beneficial in a low-prevalence region due to the substantial rate of false-positives. A cost-effectiveness analysis and assessment of practicability are warranted for chest CT in high-prevalence regions.