Hyperbilirubinemia as a Predictor of Appendiceal Perforation: A Systematic Review and Diagnostic Test Meta-Analysis.

Paschalis Gavriilidis¹, Nicola de'Angelis², John Evans¹, Salomone Di Saverio³, Peter Kang¹.

Abstract

BACKGROUND: Misdiagnosis of the severity of acute appendicitis may lead to perforation and can consequently result in increased morbidity and mortality. In this study, the role of hyperbilirubinemia as a predictor of perforation is assessed by performing a meta-analysis of diagnostic accuracy.
METHODS: A systematic search of the literature published over the past 20 years was performed using the EMBASE, PubMed, Cochrane library, and Google Scholar databases.
RESULTS: Low values of sensitivity, specificity, and diagnostic odds ratio (DOR) were detected: 0.21 (95% confidence interval (CI): 0.13 - 0.30, standard error (SE) = 0.43), 0.27 (95% CI: 0.15 - 0.43, SE = 0.73), and 0.10 (95% CI: 0.3 - 0.28, SE = 0.05), respectively. The positive likelihood ratio (PLR) was low (0.29 (95% CI: 0.27 - 0.91, SE = 0.76)), whereas the negative likelihood ratio (NLR) was high (2.88 (95% CI: 1.66 - 5.14, SE = 0.10)). The hierarchical summary receiver operating characteristic curve was positioned towards the lower right corner, and the area under the curve was 0.19, both indicating a low level of overall accuracy and discrimination. Compared with the PLR, the negative inverse likelihood ratio (1/LR-) indicated that a positive result has a greater impact on the odds of disease than does a negative result.
CONCLUSIONS: Hyperbilirubinemia alone is not a reliable tool to predict perforation. Future studies should investigate whether the combined predictive values of bilirubin, C-reactive protein (CRP), and white blood cells are a more effective diagnostic tool.

Keywords: Complicated appendicitis; Hyperbilirubinemia; Perforated appendicitis

Year: 2019 PMID: 30834039 PMCID: PMC6396786 DOI: 10.14740/jocmr3724

Source DB: PubMed Journal: J Clin Med Res ISSN: 1918-3003

Introduction

Acute appendicitis may present with various symptoms, signs, and laboratory results, and its diagnosis can therefore be difficult [1]. The following scores are useful in the differential diagnosis of acute appendicitis in patients presenting with right iliac fossa pain: the Alvarado score, the modified Alvarado score, the Raja Isteri Pengiran Anak Saleha Appendicitis (RIPASA) score designed specifically for Asian patients, and the Appendicitis Inflammatory Response (AIR) score based on symptoms (nausea, vomiting, duration of pain, and migratory pain), signs (fever, localized tenderness, and rebound tenderness), and laboratory results (leukocytosis, neutrophilia, and increased C-reactive protein (CRP)). However, these scoring systems cannot distinguish between uncomplicated and complicated appendicitis [2-4]. The reported mortality rate of uncomplicated appendicitis is 0.3%; however, this increases to 6% in perforated cases [5]. In the diagnosis of acute appendicitis, the supplemental roles of ultrasound and computed tomography scans are essential. However, they have a low sensitivity in detecting perforated appendicitis [6, 7]. Therefore, the predictive ability of hyperbilirubinemia in the diagnosis of perforated appendicitis, as a supplementary tool to clinical examination and imaging studies, is worthy of investigation. To date, many authors of retrospective studies have evaluated hyperbilirubinemia as a prognostic factor for appendiceal perforation. Hyperbilirubinemia is reported to occur commonly in patients with septic conditions and in cases of complicated appendicitis [8, 9]. These retrospective studies, which are based on small population groups, suggest that hyperbilirubinemia may be a useful predictor of perforated appendicitis. Our study aims to investigate hyperbilirubinemia as a predictive factor of appendiceal perforation, using a meta-analysis of diagnostic test accuracy.

Methods

The preferred reporting items for systematic reviews and meta-analyses (PRISMA) statement checklist was followed in this study [10].

Literature search

A systematic search of the literature published over the past 20 years was performed through the EMBASE, PubMed, Cochrane library, and Google Scholar databases, using the following search terms as both free text and MeSH terms: perforated appendicitis, appendiceal perforation, hyperbilirubinemia, jaundice, bilirubin, elevated bilirubin, complicated appendicitis, and appendix. A grey literature search of the clinicaltrials.gov website was also undertaken. The references in the retrieved articles were manually checked for further studies. Disagreements between the authors were resolved through consensus-based discussions.

Study selection, inclusion, and exclusion criteria

Only studies that assessed the diagnostic test accuracy of elevated bilirubin for perforated appendicitis were included. Included studies were required to meet the following criteria: 1) reporting an all-comers population with appendicitis and bilirubin measurements; and 2) comparing a cohort of perforated cases with non-complicated cases. In the event of multiple publications by the same institution, only the most recent publication was included. Case reports and studies from which the outcomes could not be clearly calculated were excluded from the analysis.

Data extraction and outcomes

Two reviewers (PG and PK) independently extracted the following summary data for the included studies: true positives, false positives, true negatives, and false negatives.

Definitions

Hyperbilirubinemia was considered to be any value of total serum bilirubin above 1 mg/dL or 20.5 μmol/L. Only patients with histological findings of perforated appendicitis were counted as perforated cases. The sensitivity of the test was defined as the proportion of patients with histologically confirmed perforated appendicitis who had hyperbilirubinemia. The positive likelihood ratio (PLR) expresses how much more likely a positive test is to be found in a patient with perforation than in a patient without perforation. The negative likelihood ratio (NLR) expresses how much more likely a negative test is to be found in a patient with perforation than in a patient without perforation. The diagnostic odds ratio (DOR) was defined as the ratio of the odds of positive findings in a participant with disease relative to the odds of positive findings in a participant without disease.
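
As an illustration of these definitions, the following Python sketch computes the five accuracy measures from a single 2x2 table. The function name `accuracy_measures`, the `Accuracy` container, and the counts in the usage lines are hypothetical and are not taken from any included study. The formulas are the standard single-study definitions, not the pooled hierarchical estimates reported in the Results.

```python
from dataclasses import dataclass


@dataclass
class Accuracy:
    sensitivity: float
    specificity: float
    plr: float
    nlr: float
    dor: float


def accuracy_measures(tp: int, fp: int, fn: int, tn: int) -> Accuracy:
    """Standard 2x2 diagnostic accuracy measures for one study.

    tp: perforated, hyperbilirubinemia present
    fp: not perforated, hyperbilirubinemia present
    fn: perforated, bilirubin normal
    tn: not perforated, bilirubin normal
    Assumes no cell that appears in a denominator is zero.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    plr = sensitivity / (1 - specificity)
    nlr = (1 - sensitivity) / specificity
    dor = plr / nlr  # algebraically equal to (tp * tn) / (fp * fn)
    return Accuracy(sensitivity, specificity, plr, nlr, dor)


# Hypothetical counts, for illustration only.
example = accuracy_measures(tp=30, fp=40, fn=20, tn=110)
print(example)
```

A study with an empty cell would need a continuity correction before these ratios could be computed; that step is omitted here.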

Statistical analyses

All statistical analyses were performed using both STATA software (version 15, StataCorp LP, College Station, TX, USA) and R software (http://meta-analysis-with-r.org). The methodological quality of all studies was assessed with the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool [11, 12]. The following measures of test accuracy were estimated for each study: sensitivity, specificity, PLR, NLR, and DOR. Data analysis was based on the hierarchical summary receiver operating characteristic (HSROC) and the bivariate model, which were used to account for the correlation between sensitivity and specificity. Heterogeneity between studies was estimated by comparing the 95% confidence region and 95% prediction region. The influence of each study on the model parameters was estimated with Cook’s distance [13]. The inverse of the NLR (1/LR-) was also estimated, because larger values indicate a more accurate test. Additionally, comparison of this with the PLR can indicate whether a positive or negative test result has a greater impact on the odds of disease [13]. Positive and negative LRs were used to characterize the clinical utility of the test. An LR of 1 means that the post-test probability is equal to the pretest probability; a clinically useful test is therefore one with a high PLR (> 5, good for ruling in disease) and a low NLR (< 0.2, good for ruling out disease) [14].
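
The way a likelihood ratio shifts the probability of disease can be made concrete by converting the pretest probability to odds, multiplying by the LR, and converting back. The Python sketch below is illustrative only: `post_test_probability` is a hypothetical helper, and the pretest probability of 0.20 is an assumed value, not an estimate from this study.

```python
def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability to a post-test probability via odds."""
    if not 0 < pretest_probability < 1:
        raise ValueError("pretest_probability must be strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood_ratio must be non-negative")
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


assumed_pretest = 0.20  # assumption for illustration only

# An LR of 1 leaves the probability unchanged.
unchanged = post_test_probability(assumed_pretest, 1.0)

# Thresholds cited in the text: PLR > 5 (rule in), NLR < 0.2 (rule out).
after_positive = post_test_probability(assumed_pretest, 5.0)
after_negative = post_test_probability(assumed_pretest, 0.2)

print(f"LR = 1.0: {unchanged:.3f}")
print(f"LR = 5.0: {after_positive:.3f}")
print(f"LR = 0.2: {after_negative:.3f}")
```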

Results

Search strategy and characteristics of included studies

Twenty studies were included from an initial pool of 196, comprising a total of 8,751 cases, of which 6,235 (71%) had histologically confirmed acute appendicitis and 1,343 (15%) had perforated appendicitis; 12 were retrospective studies and eight were prospective non-randomized studies (Table 1, Fig. 1) [5, 15-33]. Three articles were excluded because they compared more than one test. The negative appendicectomy rate was 29%. No language or region restrictions were applied to the systematic search.

Table 1

Study Characteristics

Author, study, country, and year	Age	Histologically confirmed appendicitis	Perforated appendicitis	Positive likelihood ratio (97.5% CI)	Negative likelihood ratio (97.5% CI)
Estrada, RS, USA, 2007	33 (5 - 66)	157	41	0.34 (0.23 - 0.51)	2.30 (1.57 - 3.38)
Khan, PNR, Nepal, 2008	29 (8 - 73)	118	18	5.73 (0.42 - 6.58)	0.05 (0.02 - 0.12)
Sand, RS, Germany, 2009	36 (6 - 91)	376	97	0.29 (0.22 - 0.38)	1.94 (1.55 - 2.43)
Kaser, RS, Switzerland, 2010	22 (5 - 92)	725	155	0.21 (0.16 - 0.27)	2.05 (1.75 - 2.39)
Atahan, RS, Turkey, 2011	31 (18 - 83)	302	45	0.15 (0.11 - 0.20)	5.44 (3.03 - 9.75)
Emmanuel, RS, Ireland, 2011	27 (5 - 82)	386	45	0.09 (0.06 - 0.13)	6.56 (4.30 - 10.02)
Hong, RS, Korea, 2012	31	732	245	0.13 (0.10 - 0.17)	1.96 (1.73 - 2.22)
McGowan, RS, UK, 2013	NR	1,271	154	0.13 (0.08 - 0.16)	2.35 (1.95 - 2.82)
Chaudary, PNR, India, 2013	27 (15 - 64)	45	5	0.41 (0.24 - 0.69)	3.69 (2.31 - 5.90)
D’Souza, PNR, UK, 2013	28 (5 - 85)	89	19	0.23 (0.13 - 0.39)	2.96 (1.53 - 5.72)
Nomura, RS, Japan, 2014	NR	279	131	0.44 (0.34 - 0.56)	1.72 (1.41 - 2.09)
Socea, RS, Romania, 2013	NR	274	51	0.26 (0.20 - 0.34)	15.10 (3.15 - 71.53)
Chambers, RS, UK, 2015	33 ± 17	797	122	0.50 (0.43 - 0.59)	1.83 (1.59 - 2.11)
Muller, RS, Germany, 2015	29 (16 - 91)	312	56	0.10 (0.07 - 0.14)	4.71 (3.40 - 6.52)
Saxena, PNR, India, 2015	NR	181	32	0.24 (0.14 - 0.42)	2.67 (0.95 - 7.48)
Shahabuddin, PNR, India, 2016	25 (10 - 65)	35	15	0.44 (0.27 - 0.71)	3.81 (1.20 - 12.13)
Eren, RS, Turkey, 2016	36 (18 - 90)	100	41	0.57 (0.31 - 1.03)	1.26 (0.93 - 1.70)
Abouzeid, PNR, Egypt, 2017	NR	74	7	0.17 (0.39 - 0.70)	3.91 (1.62 - 9.45)
Cheekuri, PNR, India, 2017	27 (13 - 60)	65	35	0.52 (0.39 - 0.70)	3.91 (1.62 - 9.45)
Vineed, PNR, India, 2017	Below 13 excluded	71	29	0.50 (0.27 - 0.91)	1.45 (0.94 - 2.24)
Pooled estimates		6,235 (71%)	1,343 (15%)	0.29 (0.17 - 0.48), SE (0.76)	2.88 (1.16 - 5.14), SE (0.85)

CI: confidence interval; SE: standard error; RS: retrospective study; PNR: prospective non-randomized; NR: not reported.

Figure 1

Flow diagram of the search strategy.

Diagnostic accuracy measures

Pooled estimates for sensitivity and specificity were 0.21 (95% confidence interval (CI): 0.13 - 0.30, standard error (SE) = 0.43) and 0.27 (95% CI: 0.15 - 0.43, SE = 0.73), respectively (Fig. 2). Pooled estimates for PLR and NLR were 0.29 (95% CI: 0.27 - 0.91, SE = 0.76) and 2.88 (95% CI: 1.66 - 5.14, SE = 0.10), respectively (Table 1). The DOR was 0.10 (95% CI: 0.3 - 0.28, SE = 0.05). The inverse of the negative likelihood ratio (1/LR-) was 0.34 (95% CI: 0.19 - 0.61, SE = 0.10).
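
Two standard identities relate these quantities: the DOR equals LR+ divided by LR-, and 1/LR- is the reciprocal of the NLR, with the interval bounds swapping places on inversion. The short Python sketch below applies them to the point estimates and the NLR interval quoted in the text above. The variable names are illustrative, and because the identities are exact only for a single 2x2 table, agreement with pooled estimates should be expected only to within rounding.

```python
plr = 0.29             # pooled LR+ as reported in the text
nlr = 2.88             # pooled LR- as reported in the text
nlr_ci = (1.66, 5.14)  # 95% CI for LR- as reported in the text

dor_from_lrs = plr / nlr
inverse_nlr = 1 / nlr
# Taking reciprocals reverses the order of the interval bounds.
inverse_nlr_ci = (1 / nlr_ci[1], 1 / nlr_ci[0])

print(f"LR+ / LR- = {dor_from_lrs:.2f}")
print(f"1/LR- = {inverse_nlr:.2f} ({inverse_nlr_ci[0]:.2f} - {inverse_nlr_ci[1]:.2f})")
```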

Figure 2

Forest plot demonstrating sensitivity and specificity. coef: coefficient; SE: standard error; CI: confidence interval; Se: sensitivity; Sp: specificity; DOR: diagnostic odds ratio; LR+: positive likelihood ratio; LR-: negative likelihood ratio; 1/LR-: inverse negative likelihood ratio.

Figure 3

Plot showing the hierarchical summary receiver operating characteristic (HSROC) curve and summary operating point positioned towards the lower right corner. The marked visual discrepancy between the areas covered by the confidence and prediction regions indicates high between-study heterogeneity.

Study quality and sensitivity analysis

Generally, the quality of the included studies was poor; none fulfilled all items on the QUADAS checklist, and none reported diagnostic accuracy as required by the STARD guidelines [11, 34]. Moreover, none of the outcome assessors were blinded. Cook’s distance, a measure of the influence of a study on the model parameters, identified only one outlier, the study by Khan [16]. However, this study did not influence the summary estimates (Fig. 4). Refitting the model with each study left out in turn did not demonstrate any impact on the summary estimates.
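
The leave-one-out procedure described above can be sketched in Python as follows. This is a deliberately simplified stand-in: it pools logit sensitivity with fixed-effect inverse-variance weights, whereas the study refitted the bivariate/HSROC model in STATA and R. The study labels and counts are hypothetical, and the function names are illustrative.

```python
import math

# Hypothetical (true positive, false negative) counts per study; not real data.
studies = {
    "Study A": (12, 30),
    "Study B": (40, 8),
    "Study C": (15, 45),
    "Study D": (20, 70),
}


def pooled_sensitivity(counts):
    """Fixed-effect inverse-variance pooling of logit sensitivity.

    A 0.5 continuity correction is added to every cell so that the
    logit and its variance are always defined.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for tp, fn in counts:
        tp_c, fn_c = tp + 0.5, fn + 0.5
        logit = math.log(tp_c / fn_c)
        variance = 1 / tp_c + 1 / fn_c
        weight = 1 / variance
        weighted_sum += weight * logit
        total_weight += weight
    pooled_logit = weighted_sum / total_weight
    return 1 / (1 + math.exp(-pooled_logit))


def leave_one_out(named_counts):
    """Refit the pooled estimate with each study omitted in turn."""
    results = {}
    for omitted in named_counts:
        remaining = [c for name, c in named_counts.items() if name != omitted]
        results[omitted] = pooled_sensitivity(remaining)
    return results


overall = pooled_sensitivity(list(studies.values()))
print(f"All studies: {overall:.3f}")
for name, estimate in leave_one_out(studies).items():
    print(f"Without {name}: {estimate:.3f} (shift {estimate - overall:+.3f})")
```

A study whose omission shifts the pooled value far more than the others would be flagged as influential; in the hypothetical data, Study B plays that role.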

Figure 4

Left panel: Cook’s distance showing one outlier (Khan’s study). Right panel: standardized residuals showing one outlier (Khan’s study). cooksd: Cook’s distance; stid: study identity; Se:ustd = sensitivity standard deviation; Sp:ustd = specificity standard deviation.

Comparison of the prediction region with the confidence region gave a visual representation of high between-study heterogeneity (Fig. 3).

Discussion

Our study aimed to determine whether hyperbilirubinemia is a reliable prognostic factor of appendiceal perforation. The low pooled sensitivity demonstrates that hyperbilirubinemia is not a useful rule-out test. In addition, the insufficiently high value of the PLR characterizes hyperbilirubinemia as an unreliable rule-in test for perforated appendicitis. The PLR and NLR were 0.29 and 2.88, respectively. Considering that PLR > 5 and NLR < 0.2 characterize clinically useful tests, we conclude that hyperbilirubinemia is not a test of high clinical usefulness. However, comparison of the inverse negative likelihood ratio (1/LR-) with the PLR indicated that a positive result has a greater impact on the odds of disease than does a negative result. This finding may justify the usefulness of hyperbilirubinemia when it is detected in clinically confirmed appendicitis, alongside other positive laboratory results. In such cases, the surgeon should consider that there is an increased probability of perforation, and thus it would be wise to proceed with an emergency intervention rather than a planned one. The HSROC curve, which is used to account for the correlation between sensitivity and specificity, was positioned towards the lower right corner. In addition, the AUC was low. Both findings demonstrate that the test has low overall accuracy and discriminatory ability. Cook’s distance did not detect particularly influential studies; the only outlier was the study by Khan [16], which did not influence the pooled estimates (Fig. 4). Furthermore, refitting the model with each study left out in turn did not demonstrate any impact on the summary estimates. To the best of our knowledge, this is the first meta-analysis that includes 20 studies.

A previous paper with only eight studies concluded that although hyperbilirubinemia alone is not a strong predictor of perforation, it can be included in the diagnostic process as a supplementary tool [35]. The results of the current study demonstrated a poorer performance of hyperbilirubinemia. There are several possible reasons for the discrepancies between the results of the present and previous studies. First, the previous study’s total sample was half that of the present study. Second, studies published after the previous meta-analysis demonstrated worse results than the studies it included. Therefore, an underpowered sample, national and institutional characteristics, and selection bias may have influenced the results. The results of the present study should be interpreted within its limitations, as the overall quality of the included studies was poor and the studies were conducted at single centers. Therefore, national and institutional characteristics, performance, attrition, and selection bias may have influenced the results. Furthermore, another source of bias could be the high between-study heterogeneity (Fig. 3).

Conclusions

Hyperbilirubinemia alone has a low overall accuracy for predicting perforation. However, in cases of clinically confirmed appendicitis in which elevated bilirubin appears alongside other positive laboratory results, complicated appendicitis is more likely to be diagnosed. In the presence of hyperbilirubinemia, the treating surgeons may therefore prefer to proceed with immediate emergency surgery rather than continue non-operative management and expectant treatment and delay the decision to perform an appendicectomy, given the likelihood of complicated appendicitis and the risk of developing perforated appendicitis. Future studies that are adequately powered and report diagnostic accuracy according to the STARD guidelines are urgently needed. They should investigate whether the combined predictive value of bilirubin, CRP, and white blood cells would be a more effective diagnostic tool for surgeons.

1. Perforated versus nonperforated acute appendicitis: accuracy of multidetector CT detection.

2. C-reactive protein is superior to bilirubin for anticipation of perforation in acute appendicitis.

3. The causes of obvious jaundice in South West Wales: perceptions versus reality.

4. Systematic reviews of diagnostic test accuracy.

5. Elevated serum bilirubin in acute appendicitis: a new diagnostic tool.

6. Diagnostic value of hyperbilirubinemia as a predictive factor for appendiceal perforation in acute appendicitis.

7. Current concepts in imaging of appendicitis.

8. Hyperbilirubinemia in appendicitis: a new predictor of perforation.

9. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement.

10. Evaluation of QUADAS, a tool for the quality assessment of diagnostic accuracy studies.
