Tobias Niedermaier1,2, Yesilda Balavarca3,4, Hermann Brenner1,3,4. 1. Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany. 2. Medical Faculty Heidelberg, University of Heidelberg, Heidelberg, Germany. 3. Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg, Germany. 4. German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany.
Abstract
OBJECTIVE: Fecal immunochemical tests (FITs) detect the majority of colorectal cancers (CRCs), but evidence for variation in sensitivity according to the CRC stage is sparse and has not yet been systematically synthesized. Thus, our objective was to systematically review and summarize evidence on the stage-specific sensitivity of FITs. METHODS: We screened PubMed, Web of Science, Embase, and the Cochrane Library from inception to June 14, 2019, for English-language articles reporting on the stage-specific sensitivity of FIT for CRC detection using colonoscopy as a reference standard. Studies reporting stage-specific sensitivities and the specificity of FIT for CRC detection were included. Summary estimates of sensitivity according to the CRC stage and study setting (screening cohorts, symptomatic/diagnostic cohorts, and case-control studies) were derived from bivariate meta-analysis. RESULTS: Forty-four studies (92,447 participants including 3,034 CRC cases) were included. Pooled stage-specific sensitivities were overall very similar but suffered from high levels of imprecision because of small case numbers when calculated separately for screening cohorts, symptomatic/diagnostic cohorts, and case-control studies. Pooled sensitivities (95% confidence intervals) for all studies combined were 73% (65%-79%) for stage-I-CRCs and 80% (74%-84%), 82% (77%-87%), and 79% (70%-86%) for the detection of CRC stages II, III, and IV, respectively. Even substantially larger variation was seen in sensitivity by T-stage, with summary estimates ranging from 40% (21%-64%) for T1 to 83% (68%-91%) for T3-CRC. DISCUSSION: Although FITs detect 4 of 5 CRCs at stages II-IV, the substantially lower sensitivity for stage-I-CRC and, in particular, T1 CRC indicates both need and potential for further improvement in performance for the early detection of CRC.
Fecal immunochemical tests (FITs) are used for colorectal cancer (CRC) screening in a large and increasing number of countries (1). Multiple large-scale screening cohorts have demonstrated that FITs detect the majority of CRCs. In a meta-analysis from 2019, pooled sensitivity ranged from 71% (95% confidence interval [CI]: 56%–83%) to 91% (84%–95%), and specificity ranged from 90% to 95% (2). However, little is known about the sensitivity of FIT for detecting CRC at individual stages because many of the individual studies did not report stage-specific sensitivities, and no meta-analysis of stage-specific sensitivities has been conducted.
The ability of FITs to detect CRC at early stages is of particular relevance for their use in CRC screening, as chances of cure are dramatically higher when CRCs are detected at earlier rather than later stages (3). Despite numerous studies reporting on stage-specific test characteristics of FIT (4–43), only one study to date (22) focused on this outcome. Given the small case numbers from the studies conducted among asymptomatic screening participants, obtaining reasonably precise stage-specific estimates of sensitivity from such studies is difficult if not impossible. Much larger case numbers have been included in the studies conducted among symptomatic participants who were diagnosed with CRC following colonoscopy for clarification of symptoms (symptomatic/diagnostic cohorts) or who were recruited after CRC diagnosis and compared with healthy controls in a clinical setting (case-control studies).
Although overall sensitivity is expected to be higher among these patient groups than among average-risk participants undergoing screening colonoscopy due to spectrum bias, stage-specific sensitivity may be comparable (44). In that case, the possibility of recruiting larger numbers of CRC patients in such settings might enable estimating stage-specific sensitivities of FIT irrespective of the study type at much higher levels of precision.
Our aim was therefore to systematically review and compare the estimates of the stage-specific sensitivity of FIT for CRC detection from various types of studies and, if possible, summarize them to derive reasonably precise estimates of stage-specific sensitivities.
MATERIALS AND METHODS
Data sources and searches
The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed when conducting this meta-analysis. We searched PubMed, Web of Science, and Embase for original English-language human research articles from inception to June 14, 2019. The search of citing, cited and related articles was conducted using PubMed, Web of Science, Embase, and Google Scholar. Furthermore, we searched the Cochrane Library. Our search terms (see Supplement A, Supplementary Digital Content 1, http://links.lww.com/AJG/B321) covered expressions for FIT, CRC, diagnostic accuracy (sensitivity), and CRC staging. Those terms were agreed on after intense discussion, repeated sample searches, and comparison with articles retrieved by a recent previous meta-analysis of studies on diagnostic accuracy of FIT (2).
Study selection
To be eligible, studies had to report on the sensitivity and specificity of FIT for CRC detection. Sensitivities or sufficient information to calculate them had to be provided according to at least one distinct cancer stage (preferably stages I–IV separately). We preferably used the Union Internationale Contre le Cancer/American Joint Committee on Cancer (UICC/AJCC) staging system, which was the most commonly reported system. Alternatively, Dukes' stages were also considered because they directly translate into a corresponding UICC/AJCC stage. Studies reporting only on T-stages were summarized separately because a wide range of combinations of T-, N-, and M-staging can result in the same UICC/AJCC stage.
Colonoscopy in all participants (not only FIT positives) was required for all studies. Three types of studies were identified: (i) studies in which FIT was conducted in asymptomatic average-risk screening participants before colonoscopy (screening cohorts), (ii) studies that prospectively recruited participants who underwent colonoscopy for clarification of symptoms and not for primary screening (symptomatic/diagnostic cohorts), and (iii) retrospective studies in which patients diagnosed with CRC were included and compared with healthy controls (case-control studies). Studies focusing on high-risk populations (family history of CRC) were not considered because high-risk individuals are recommended to undergo colonoscopy in shorter intervals rather than FIT screening (45), and among the previously meta-analyzed studies on accuracy of FIT among high-risk populations (46), no study reported on CRC stage-specific performance.
Data extraction
Two authors (T.N. and Y.B.) extracted data independently. The following data were extracted from relevant articles: first author's last name, year of publication, study population characteristics such as country in which the study was conducted, study setting, number of participants and CRC cases, age ranges, sex distribution, FIT brand and cutoff, number of samples, and number of CRC cases detected and missed by FIT, stratified by the CRC stage. Initial disagreement in extracted data was resolved by consensus after further review and discussion. We contacted the corresponding authors of 33 articles identified from the previous systematic reviews and meta-analyses on FIT screening (2,47) and through cross-referencing to provide further information on stage-specific sensitivities of FIT in their studies. The corresponding authors of 3 articles (14,48,49) provided additional data on exact stage-specific case numbers and sensitivities. For 2 further articles in which relevant data were not directly reported (29,37), Hb concentrations for each CRC case (29) and data presented in a figure of the manuscript (37) were used to calculate the required number of true-positive and false-negative CRC cases. If the results were presented for several FIT cutoffs, we selected the cutoff closest to the cutoff recommended by the manufacturer or, if not applicable, a cutoff yielding a specificity comparable with the previously estimated overall specificity of FIT of 94% (50). If sensitivities were examined at different specificities, we again selected estimates for specificities closest to 94%. Hemoglobin-based FITs were preferably selected over hemoglobin-haptoglobin–based FITs if the results for both were reported because the former are much more commonly used.
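The cutoff selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `select_cutoff`, the input layout, and the example numbers are hypothetical and are not taken from any included study. Only the 94% target specificity comes from the text.

```python
TARGET_SPECIFICITY = 0.94  # previously estimated overall specificity of FIT cited in the text


def select_cutoff(results, recommended=None):
    """Pick one (cutoff, specificity) pair from a study that reports several.

    results     -- list of (cutoff in ug Hb/g, specificity) tuples (hypothetical layout)
    recommended -- manufacturer's recommended cutoff, or None if not applicable
    """
    if not results:
        raise ValueError("no cutoffs reported")
    if recommended is not None:
        # Rule 1: the reported cutoff closest to the manufacturer's recommendation.
        return min(results, key=lambda r: abs(r[0] - recommended))
    # Rule 2: otherwise the cutoff whose specificity is closest to 94%.
    return min(results, key=lambda r: abs(r[1] - TARGET_SPECIFICITY))


# Hypothetical study reporting three cutoffs.
reported = [(10, 0.91), (15, 0.935), (20, 0.955)]
with_recommendation = select_cutoff(reported, recommended=20)
without_recommendation = select_cutoff(reported)
```

With these made-up values, the first call keeps the 20 μg/g result and the second falls back to the 15 μg/g result, whose specificity lies closest to 94%.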
Risk of bias and study quality assessment
We used the Quality Assessment of Diagnostic Accuracy Studies 2 instrument (QUADAS-2) (51) to assess the quality of included studies. Quality assessment was done by 2 authors (T.N. and Y.B.) in parallel, and disagreement was resolved in consensus after further discussion. Studies recruiting asymptomatic subjects for primary screening were categorized as low risk and case-control studies as high risk of bias in the category “Patient selection,” and symptomatic/diagnostic cohorts were rated as unclear risk. A detailed description of the quality criteria and how they were adapted to our research question is given in Supplement B (see Supplementary Digital Content 2, http://links.lww.com/AJG/B322).
Potential publication bias was assessed using the methods of Deeks et al. (52) and Macaskill et al. (53).
Statistical analyses
Overall and stage-specific sensitivities of FIT were calculated as the number of CRC cases (total or of the respective stage from I to IV) with a positive FIT divided by the total or stage-specific number of CRCs. Specificities were calculated as the number of FIT-negative participants without CRC divided by the total number of participants without CRC. We used R (54) for all statistical computations. The R package “mada” (55) was used for computations of Clopper-Pearson CIs of sensitivities and specificities and for bivariate meta-analyses of sensitivities and specificities using the Reitsma model (56). Meta-analysis was conducted only on studies reporting sensitivities for individual stages rather than stage groups because proportions of earlier and later stages may differ considerably between studies reporting combined stages, e.g., III and IV. For each stage, data from 2-by-2 tables were extracted and pooled using the function “reitsma” in the R package “mada,” which returns a pooled sensitivity and false-positive rate including 95% CIs. Pooled specificities were obtained from the same bivariate model but using all cases (irrespective of the stage) to calculate pooled sensitivities because sensitivity and specificity are always calculated together in the bivariate model.
Differences in pooled sensitivity between stages were assessed statistically using meta-regression techniques with stage as an explanatory variable and “stage I” as a reference group. Potential differences in pooled stage-specific sensitivity and specificity between settings were assessed using meta-regression with setting as an explanatory variable and “screening setting” as a reference group. Between-study heterogeneity was assessed by Higgins' I² and P values of Cochran's Q for each meta-analysis.
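As a minimal sketch of the per-study calculations described above, the following Python code computes sensitivity and specificity from 2-by-2 counts and an exact Clopper-Pearson interval by bisection on the binomial distribution. It is not the authors' code (the analyses were run with the R package “mada”), the function names are illustrative, the counts are hypothetical, and the bivariate Reitsma pooling is not attempted here.

```python
from math import comb


def sensitivity(tp, fn):
    """FIT-positive CRC cases divided by all CRC cases (one stage or overall)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """FIT-negative participants without CRC divided by all participants without CRC."""
    return tn / (tn + fp)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); a plain sum, adequate for modest n only."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_p(k, n, target):
    """Bisection for the p at which binom_cdf(k, n, p) equals target.

    For fixed k < n the CDF decreases as p grows, so a CDF above the
    target means p is still too small.
    """
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided CI for a proportion x/n."""
    lower = 0.0 if x == 0 else _solve_p(x - 1, n, 1 - alpha / 2)
    upper = 1.0 if x == n else _solve_p(x, n, alpha / 2)
    return lower, upper


# Hypothetical study: 25 stage-I cases (18 FIT positive), 1,000 participants without CRC.
sens = sensitivity(tp=18, fn=7)
spec = specificity(tn=940, fp=60)
ci_low, ci_high = clopper_pearson(18, 25)
```

The small denominators typical of stage-specific counts in single screening cohorts are what make such exact intervals wide, which is the motivation for pooling across studies.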
To investigate the influence of individual studies on the overall results, we conducted “leave-one-out” meta-analyses, excluding one study at a time, and reported the resulting summary estimates of sensitivity and specificity and measures of residual heterogeneity. We conducted further sensitivity analyses using the results for the highest and lowest reported cutoffs for meta-analyses instead of the cutoffs recommended by the manufacturer in studies that reported on several cutoffs. Potential publication bias was assessed using the method proposed by Deeks et al. (52), in which the diagnostic odds ratios as measures of overall effect size are plotted against the inverse of the square root of the effective sample size, and the method of Macaskill et al. (53), which plots diagnostic odds ratios against the overall sample size.
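The two coordinates used in the Deeks funnel plot described above can be illustrated as follows. This is a sketch, not the authors' implementation: the function name is hypothetical, the example counts are invented, and the 0.5 continuity correction for empty cells is an assumption that the text does not mention. The effective sample size is taken as 4·n1·n2/(n1 + n2), with n1 diseased and n2 non-diseased participants.

```python
from math import log, sqrt


def deeks_point(tp, fn, fp, tn):
    """Return (ln DOR, 1/sqrt(ESS)) for one study's 2-by-2 table."""
    n_diseased = tp + fn
    n_healthy = fp + tn
    if 0 in (tp, fn, fp, tn):
        # Assumed continuity correction so that the odds ratio stays finite.
        tp, fn, fp, tn = (c + 0.5 for c in (tp, fn, fp, tn))
    ln_dor = log((tp * tn) / (fp * fn))  # log diagnostic odds ratio
    ess = 4 * n_diseased * n_healthy / (n_diseased + n_healthy)  # effective sample size
    return ln_dor, 1 / sqrt(ess)


# Hypothetical study: 25 CRC cases (20 FIT positive), 1,000 controls (50 FIT positive).
ln_dor, inv_root_ess = deeks_point(tp=20, fn=5, fp=50, tn=950)
```

A regression of ln DOR on 1/sqrt(ESS) across studies with a slope clearly different from zero would point to small-study effects. The Macaskill variant uses the overall sample size on the other axis instead.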
RESULTS
Study selection process
A total of 13,013 unique articles were identified (Figure 1). After screening of titles and abstracts, 151 articles were given full-text assessment, and 110 of them were excluded. The most frequent exclusion criterion was lack of reporting on both the specificity and stage-specific sensitivity of FIT. Two relevant articles were found through cross-referencing (7,12). Forty-four studies (4–43,48,49,57,58) were in English, reported on the setting, specificity, and stage-specific sensitivity of FIT, and conducted colonoscopy in all participants, and thus fulfilled the inclusion criteria.
Figure 1.
Process to identify articles for study.
Description of included studies
An overview of study characteristics is given in Table 1. We identified 12 screening cohorts (8,12,18,20,24,29,37,38,48,49,57,58) comprising 277 CRC cases, 18 symptomatic/diagnostic cohorts (4–7,10,11,13,16,17,19,21,22,25,27,28,34,35,40) comprising 869 CRC cases, and 14 case-control studies (9,14,15,23,26,30–33,36,39,41–43) with a total of 1,888 CRC cases. One report (23) also comprised screening participants but was classified as clinical (case-control) because approximately 80% of the CRC cases were recruited from a case-control study.
Table 1.
Characteristics of the studies investigating the stage-specific sensitivity and specificity of FIT in a clinical or screening setting
The most frequently reported FIT brands were those manufactured by Eiken Chemical (4,18,20,24,28,29,37,42,48,49,51,57,58). Other FITs used include Ridascreen Hemoglobin (16,23), FOB Gold (18,38), and InSure (11). Using stool from one sample was most common. Nine studies (4,7,10–12,18,40,41,49) reported the use of 2 samples. FIT cutoffs ranged from 2 to 67 μg Hb/g stool, but most studies reported thresholds between 10 and 20 μg/g. Ten studies (4,6,7,11,14–17,32,41) used a qualitative FIT, and 2 studies (6,10) used a quantitative FIT but did not report on its positivity threshold.
Most study reports were published between 2003 and 2017. Three older studies comprising clinically detected CRC cases published between 1995 and 1998 (4,5,41) were also included. Studies were conducted among participants from Japan (4,8,12,13,26,40–42,57), Germany (5,16–18,23,38), Taiwan (24,27,34,58), Australia (7,11,35), the United States (29,30,33), the Netherlands (15,19,22), China (10,39,43), Spain (28,48,49), Thailand (14,37), Hong Kong (6,36), South Korea (9,20), the United Kingdom (25), Italy (21), and Saudi Arabia (32).
Reported mean or median ages ranged from 48.2 to 67 years. Shares of men ranged mostly between ∼40% and 65%, except for 3 studies (4,8,12) with a predominantly male study population (72%–90%).
Results of the study quality assessment
We summarized the results of the QUADAS-2 study quality assessment in Table 2. Owing to exclusion of studies that did not use colonoscopy as a reference standard, all included studies were deemed as low risk of bias in the category “reference standard.” However, not all studies described whether CRC staging was conducted by pathologists, which we required for a “low risk” evaluation in the applicability concerns for the reference standard. Overall, studies were of moderate-to-high quality.
Table 2.
Quality assessment of diagnostic accuracy studies 2 instrument risk of bias assessment
Meta-analysis of stage- and setting-specific performance of FIT
Distributions of CRC stages across studies are presented in Table 3. Screening cohorts included higher shares of early-stage CRCs (70% stage I or II) than symptomatic/diagnostic cohorts (57%) and case-control studies (54%).
Table 3.
CRC stage distribution of included studies
Pooled specificity was 87% in screening cohorts, 87% in symptomatic/diagnostic cohorts, and 93% in case-control studies (Table 4). In Table 4, we also show individual study and overall stage-specific sensitivities and specificities of FIT, grouped by the setting in which the study was conducted. In most individual studies, sensitivities were higher in more advanced CRC stages (II–IV) compared with stage-I-CRCs. Pooled estimates (95% CIs) of stage-specific sensitivities were similar across the 3 different study settings, with widely overlapping CIs (which were, however, much narrower for symptomatic/diagnostic cohorts and case-control studies because of the much higher number of CRC cases in these groups than in the screening cohorts). Pooled estimates across the study settings were 80% (95% CI 74–84%), 82% (95% CI 77–87%), and 79% (95% CI 70–86%) for stages II, III, and IV, respectively, compared with 73% (95% CI 65–79%) for stage I (P values for differences to stage I were 0.02, 0.005, and 0.28, respectively). Studies reporting on FIT sensitivity by T-stages showed substantially higher sensitivities in stages T2, T3, and T4 (79%, 83%, and 66%) than in T1 CRCs (40%). P values for differences to T1 were 0.02, 0.004, and 0.009, respectively.
Table 4.
Summary of the results on FIT performance of included studies for CRC, stratified by the stage
Sensitivity analyses
Leaving out one study at a time from each meta-analysis had only minor influence on pooled sensitivities and specificities (see Table, Supplementary Digital Content 3, http://links.lww.com/AJG/B319). Using the highest or lowest reported cutoff instead of the cutoff recommended by the manufacturer likewise did not change the summary estimates of sensitivity or specificity materially. Point estimates of pooled sensitivity using the highest reported cutoffs were 71%, 79%, 82%, and 79% for stages I, II, III, and IV, respectively. Using the lowest reported cutoffs, the respective estimates were 72%, 79%, 81%, and 80%.
Assessment of heterogeneity
Statistical heterogeneity was small. P values of Cochran's Q were >0.2 in all analyses. Furthermore, within each cancer stage, sensitivities did not differ significantly by the setting (all P ≥ 0.27, see Table, Supplementary Digital Content 4, http://links.lww.com/AJG/B320).
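For readers less familiar with the two heterogeneity measures, the usual relationship between them is I² = max(0, (Q − df)/Q) × 100%, where df is the number of studies minus one. The short sketch below applies this standard univariate definition. The function name and the example values are hypothetical, and the text does not detail how these statistics were obtained alongside the bivariate model.

```python
def i_squared(q, df):
    """Higgins' I^2 in percent from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    # Share of total variation attributed to between-study heterogeneity,
    # truncated at zero when Q falls below its expected value (df).
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical meta-analysis of 11 studies (df = 10).
low_heterogeneity = i_squared(q=12.0, df=10)
no_heterogeneity = i_squared(q=5.0, df=10)
```

A Q value close to or below its degrees of freedom, as implied by P values above 0.2, therefore corresponds to an I² near zero.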
Assessment of publication bias
Plots of effect size vs effective or actual sample size showed no trend, giving no indication for publication bias being present (see Figure, Supplementary Digital Content 5, http://links.lww.com/AJG/B318).
DISCUSSION
We conducted a systematic review of studies reporting the specificity and stage-specific sensitivity of FITs for CRC detection. Meta-analyses of suitable studies suggested that the sensitivity of FIT was considerably lower, by approximately 8 percentage points, in CRC stage I compared with stages II–IV. Meta-regression suggested that sensitivity for stage I cancers was significantly lower than sensitivity for stages II and III. In numbers, 1 of 4 stage-I-CRCs was missed by FIT, compared with approximately 1 of 5 CRCs at later stages. A stronger gradient in sensitivity was seen for lower vs higher T-stages, although the T-stage–specific results were reported from a few studies only. Estimated sensitivity for T1 CRCs was 40%, thus only slightly higher than the previously estimated sensitivity of FIT for advanced adenomas (16%–34%) (59). By contrast, stage-specific sensitivities did not differ significantly across different study settings.
A gradient in FIT sensitivity by stage is plausible because later-stage CRCs are typically larger and bleed more heavily. An even stronger gradient in sensitivity across T-stages is likewise plausible because the size of the primary tumor is directly related to intestinal bleeding, unlike nodal involvement (N) or metastases (M). However, these suggested associations between T or overall TNM stage and the sensitivity of FIT have not been quantified to date, and investigations of potential setting-specific differences in FIT sensitivity by stage were lacking. The lack of a further increase in sensitivity for stage IV compared with stage III might result from iron-deficiency anemia in more advanced tumors (60). Early-stage CRCs (stages I and II) are associated with considerably higher survival rates than stages III and IV (61). 
Thus, the sensitivity of FIT and other screening tests for these early stages would be much more relevant in clinical practice than sensitivity for late-stage CRCs.
Comparable stage-specific sensitivities across different study types have been suggested previously (44) and allowed for joint consideration of all studies reporting on FIT sensitivity by the CRC stage. CRC case numbers increased 11-fold compared with an analysis based on screening cohorts alone, resulting in much more precise estimates of stage-specific sensitivities of FIT. In particular, the width of CIs for the sensitivity of stage I, II, III, and IV CRC decreased from 32 to 14, from 24 to 10, from 29 to 10, and from 53 to 16 percentage points, respectively. The specificity of FIT was highest in case-control studies. Studies of this type often include healthy subjects as controls who are also typically less prone to potential alternative sources of bleeding (e.g., hemorrhoidal bleeding), making false-positive results unlikely. A similar specificity was observed in symptomatic/diagnostic cohorts and in screening cohorts. In the former, non-neoplastic gastrointestinal disorders may cause bleeding and, thus, reduce the specificity of FIT for colorectal neoplasia detection. In screening cohorts, the potentially older mean age of participants may contribute to a reduced specificity compared with symptomatic/diagnostic cohorts (62).
A meta-analysis on the overall sensitivity and specificity of FIT for CRC detection in screening settings irrespective of the stage was recently published (2). The reported overall levels of sensitivity (77%–94%) and specificity (85%–96%) of FIT are in the same range as ours (screening cohorts: 84% and 87%, respectively; all studies: 81% and 89%, respectively), with some variation according to FIT cutoffs. 
Unlike their meta-analysis, we did not restrict our search to studies conducted in a screening setting, which substantially increased the number of cases that could be included and enabled stage-specific analyses, an additional inclusion criterion in our meta-analysis. In 2017, Katsoula et al. (46) published a meta-analysis of FIT accuracy among high-risk individuals with a family history of CRC. Including only studies that used colonoscopy as a reference standard, they found a high sensitivity and specificity of FIT for CRC detection (93% and 91%, respectively). The high summary estimate of sensitivity seems to suggest that FIT might perform better in high-risk subjects compared with the general population. However, owing to the inclusion of only 40 CRC cases overall, the 95% CI for sensitivity was very wide (53%–99%), and the results were thus also compatible with a similar or even lower sensitivity than derived for CRC cases irrespective of family history in our meta-analysis.
It has been suggested that FIT has lower miss rates for distal CRC than for proximal colon cancer (59). Because no study reported on the sensitivity of FIT stratified by both the stage and location, we could not investigate the joint role of anatomic location and stage for FIT performance. Some site-specific variation in sensitivity by stage is conceivable. For example, early-stage CRCs with minimal bleeding might have a larger chance to be detected when located in the distal colon or rectum rather than the proximal colon, whereas site differences may be less relevant for advanced stages with more extensive bleeding.
To the best of our knowledge, this is the first systematic review and meta-analysis of the stage-specific (rather than overall) sensitivity of FIT for CRC detection. A particular strength is the comprehensive search in 4 databases. In total, data from 44 studies comprising >92,000 participants, including ∼3,000 CRC cases, were used. 
We did not restrict our search to a single geographical region, thereby achieving increased external validity. Data were extracted by 2 authors independently. Bivariate meta-analysis was used to jointly estimate pooled sensitivities and specificities, taking into account the cutoff dependence of and negative correlation between the 2 measures. We adhered to the standards of reporting for systematic reviews and meta-analyses (PRISMA) and for studies on diagnostic accuracy (STARD). Study quality was assessed using the QUADAS-2 tool specifically designed for studies of diagnostic accuracy. By focusing on studies using colonoscopy as a reference standard, verification bias, which may cause a systematic overestimation of sensitivity and underestimation of specificity, was ruled out.
Several limitations of our meta-analysis should be kept in mind. Despite extensive search in 4 databases and cross-referencing, it cannot be ruled out that we missed a relevant article. Language, publication, and reporting bias are conceivable. Although no statistical hypothesis was tested in the underlying studies, pronounced stage-specific differences could be more likely to be reported than small differences. We included studies irrespective of the FIT used, despite differences in FIT types (quantitative/qualitative), brands, and cutoffs. It has been suggested, however, that different quantitative FITs show similar performance characteristics when adjusting the cutoffs to yield equal specificities (63). More studies would be needed for subgroup analyses, e.g., according to the FIT brand, country of study conduct, and characteristics of study populations. Although no differences were observed in stage-specific sensitivities between different study types, the results of this meta-analysis should be interpreted with caution because of pooling of data from colonoscopies with various indications. 
Accuracy estimates may also be influenced by disease prevalence (64) and severity, which varied between the 3 study types. However, differences in prevalence would introduce spectrum bias only in case of an imperfect reference standard (65). Disease severity was addressed by stratifying according to the CRC stage. Somewhat uneven distributions within each stage (regarding substages, e.g., IIA or IIB) might remain, but those differences would likely be small and random. Another limitation of our meta-analysis is that few studies reported how they handled patients undergoing postpolypectomy surveillance regarding inclusion or exclusion. Among the 3 studies reporting on that criterion, one study (12) excluded such patients, whereas the other 2 studies (19,22) included them (but did not stratify the results according to this criterion). Thus, although desirable, it is unfortunately not feasible from the given data to conduct a subgroup analysis comparing patients from this group with others.
Our study suggests that CRC stage distribution is a major determinant of diagnostic performance of FIT. Because staging is routinely conducted, studies should report FIT results according to CRC stages. Ideally, complete TNM stages would be reported (from which AJCC/Dukes' stages can be derived), given that our study also suggests that T-stage is a particularly important determinant of FIT sensitivity. Still, the findings of this study require stringent validation. Such validation could take place through clinics and practices in which FITs are conducted for screening participants and symptomatic individuals and could be performed also among newly diagnosed CRC patients. By following standard operating procedures, potential variation due to different FITs, cutoffs, buffer solutions, delay until Hb measurement, etc. could be reduced to a minimum. 
Future studies should also consider reporting results on several FIT cutoffs, which would facilitate pooling of studies by using cutoffs yielding similar specificity. In addition, reporting accuracy estimates or ideally exact Hb concentrations for each CRC case (e.g., Aniwan et al. (37)) would enable detailed analyses of sensitivity/specificity for a range of possible cutoffs, along with the implications for the number of colonoscopies needed to detect one CRC (or precursors) (66). Those estimates would be particularly useful for countries with limited colonoscopy resources and allow for identifying an optimal FIT positivity threshold depending on characteristics such as CRC/adenoma prevalence, available colonoscopy resources, and costs associated with colonoscopy or other screening modalities.
This meta-analysis focused on the sensitivity and specificity of FIT in a single screening round. However, actual FIT screening is conducted in consecutive rounds, typically annually or biennially. Although adenoma detection rates would decrease in subsequent screening rounds because of removal of adenomas in FIT positives, and screening is associated with a shift to earlier cancer stages (67), it is unlikely that repeated screening would influence stage-specific sensitivity for CRC. Although repeated screening was not investigated in this study, our results may provide valuable input for microsimulation models of repeat screening scenarios, which crucially depend on stage-specific sensitivities. Our study may thereby help to fill a time gap until the results of large-scale randomized clinical trials on the impact of FIT screening on CRC mortality are available, e.g., from the CONFIRM trial (68) (results expected in 2028) and the SCREESCO trial (69) (results expected in 2034).
In summary, this is the first meta-analysis to provide precise estimates of the stage-specific sensitivity of FIT for CRC detection. 
Although 4 of 5 CRCs of stages II and IV were FIT positive, sensitivity was considerably lower, by approximately 10% points, for stage-I-CRCs. Our results therefore point to the need for further improvement of FITs in detection of early-stage CRC. Such improvement might be achievable by combining FIT with other diagnostic markers. To the best of our knowledge, however, the previously investigated combinations of FIT with other stool (70) or blood (71) markers achieved—if any—only very limited improvements in overall CRC detection. In conclusion, further large-scale screening cohorts are necessary to obtain more precise stage-specific sensitivity estimates of FIT and to evaluate promising marker combinations to improve the detection of early-stage CRCs.
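As a minimal sketch of the cutoff analysis proposed above for studies that report exact Hb concentrations, the following Python example computes sensitivity, specificity, and the number of colonoscopies per detected CRC at several positivity thresholds. The function name `evaluate_cutoffs`, the `CutoffResult` container, and all Hb values are hypothetical and serve illustration only; they are not data from any included study. The example also assumes that a test counts as positive when the Hb concentration is at or above the cutoff.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class CutoffResult:
    cutoff: float                 # FIT positivity threshold (e.g., µg Hb/g feces)
    sensitivity: float            # share of CRC cases testing positive
    specificity: float            # share of non-cases testing negative
    colonoscopies_per_crc: float  # FIT positives scoped per CRC detected


def evaluate_cutoffs(
    hb_cases: Sequence[float],
    hb_noncases: Sequence[float],
    cutoffs: Sequence[float],
) -> List[CutoffResult]:
    """Evaluate FIT accuracy at each cutoff from individual Hb concentrations.

    Assumption: a result is positive when Hb >= cutoff.
    """
    results = []
    for cutoff in cutoffs:
        true_pos = sum(hb >= cutoff for hb in hb_cases)
        false_pos = sum(hb >= cutoff for hb in hb_noncases)
        true_neg = len(hb_noncases) - false_pos
        sensitivity = true_pos / len(hb_cases)
        specificity = true_neg / len(hb_noncases)
        # Every FIT-positive participant is referred to colonoscopy, so the
        # number of colonoscopies per detected CRC is 1 / PPV. This figure is
        # only meaningful when cases and non-cases come from the same cohort,
        # so that their counts reflect the true prevalence.
        per_crc = (true_pos + false_pos) / true_pos if true_pos else float("inf")
        results.append(CutoffResult(cutoff, sensitivity, specificity, per_crc))
    return results


# Hypothetical toy values, NOT taken from any study.
hb_cases = [5, 12, 25, 60, 150, 400, 8, 90, 30, 18]
hb_noncases = [0, 0, 1, 2, 3, 4, 6, 9, 11, 15, 22, 0, 1, 2, 3, 5, 7, 0, 0, 35]

results = evaluate_cutoffs(hb_cases, hb_noncases, cutoffs=[10, 20])
for r in results:
    print(
        f"cutoff {r.cutoff}: sensitivity {r.sensitivity:.2f}, "
        f"specificity {r.specificity:.2f}, "
        f"colonoscopies per CRC {r.colonoscopies_per_crc:.2f}"
    )
```

In the toy data, raising the cutoff increases specificity and lowers the number of colonoscopies per detected CRC at the cost of sensitivity, which is the trade-off that programs with limited colonoscopy resources would need to weigh.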
CONFLICTS OF INTEREST
Guarantor of the article: Tobias Niedermaier, MPH, PhD. The lead author affirms that this manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned (and, if relevant, registered) have been explained.
Specific author contributions: H.B. designed the study. T.N. and Y.B. conducted the literature search and extracted the data. T.N. conducted the statistical analyses and drafted the manuscript. H.B. and Y.B. contributed important intellectual content and critically revised the manuscript.
Financial support: No specific funding was obtained for this study.
Potential competing interests: None to report.
WHAT IS KNOWN
✓ Overall, FITs detect the majority of CRCs in a single screening round. Cure rates for CRC are highest when detected at an early stage, making it valuable to derive stage-specific estimates of characteristics of FITs for CRC detection.
✓ Although FITs are suggested to detect larger proportions of late-stage CRCs compared with early-stage CRCs due to stronger bleeding, these potential differences have not yet been systematically investigated and quantified.
WHAT IS NEW HERE
✓ We reviewed and summarized evidence on the stage-specific sensitivity of FITs for CRC detection.
✓ Forty-four studies comprising a total of ∼92,500 participants and ∼3,000 CRC cases were included, yielding stage-specific estimates of FIT accuracy with high precision.
✓ Pooled sensitivities (95% CIs) of FIT were 73% (65%–79%) for stage-I-CRCs and 80% (74%–84%), 82% (77%–87%), and 79% (70%–86%) for stages II, III, and IV, respectively. Sensitivity was particularly low for T1 CRCs (40%, 95% CI 21%–64%).
✓ These results will enable estimating expected effectiveness and cost-effectiveness of planned screening programs with enhanced accuracy. A comparably low sensitivity of FIT for stage-I-CRCs and for T1 CRCs in particular warrants further research on ways to improve FITs in the detection of early-stage CRC.