B Gerges1,2, W Li3, M Leonardi1,4, B W Mol3,5, G Condous1. 1. Acute Gynaecology, Early Pregnancy and Advanced Endosurgery Unit, Sydney Medical School Nepean, University of Sydney, Nepean Hospital, Kingswood, NSW, Australia. 2. Sydney West Advanced Pelvic Surgery (SWAPS), Blacktown Hospital, Blacktown, NSW, Australia. 3. Department of Obstetrics and Gynaecology, Monash University, Clayton, VIC, Australia. 4. Department of Obstetrics and Gynecology, McMaster University, Hamilton, Canada. 5. Aberdeen Centre for Women's Health Research, University of Aberdeen, Aberdeen, UK.
Abstract
STUDY QUESTION: Is there an ideal imaging modality for the detection of uterosacral ligaments/torus uterinus (USL), rectovaginal septum (RVS) and vaginal deep endometriosis (DE) in women with a clinical history of endometriosis?
SUMMARY ANSWER: The sensitivity for the detection of USL, RVS and vaginal DE using MRI seems to be better than that of transvaginal ultrasonography (TVS), whilst the specificity of both was excellent.
WHAT IS KNOWN ALREADY: The surgical management of women with DE can be complex and requires advanced laparoscopic skills, with maximal cytoreduction being vital at the first procedure to provide the greatest symptomatic benefit. Owing to a correlation of TVS findings with surgical findings, preoperative imaging has been used to adequately consent women and plan the appropriate surgery. However, until publication of the consensus statement by the International Deep Endometriosis Analysis Group in 2016, there were significant variations within the terms and definitions used to describe DE in the pelvis.
STUDY DESIGN, SIZE, DURATION: A systematic review and meta-analysis was conducted using Embase, Google Scholar, Medline, PubMed and Scopus to identify studies published from inception to May 2020, of which only those from 2010 were included owing to the increased proficiency of the sonographers and advancements in technology.
PARTICIPANTS/MATERIALS, SETTING, METHODS: All prospective studies that preoperatively assessed any imaging modality for the detection of DE in the USL, RVS and vagina and correlated with the reference standard of surgical data were considered eligible. Study eligibility was restricted to those including a minimum of 10 unaffected and 10 affected participants.
MAIN RESULTS AND THE ROLE OF CHANCE: There were 1977 references identified, from which 10 studies (n = 1188) were included in the final analysis. For the detection of USL DE, the overall pooled sensitivity and specificity for all TVS techniques were 60% (95% CI 32-82%) and 95% (95% CI 90-98%), respectively, and for all MRI techniques were 81% (95% CI 66-90%) and 83% (95% CI 62-94%), respectively. For the detection of RVS DE, the overall pooled sensitivity and specificity for all TVS techniques were 57% (95% CI 30-80%) and 100% (95% CI 92-100%), respectively. For the detection of vaginal DE, the overall pooled sensitivity and specificity for all TVS techniques were 52% (95% CI 29-74%) and 98% (95% CI 95-99%), respectively, and for all MRI techniques were 64% (95% CI 40-83%) and 98% (95% CI 93-99%), respectively. Pooled analyses were not possible for other imaging modalities.
LIMITATIONS, REASONS FOR CAUTION: There was a low quality of evidence given the high risk of bias and heterogeneity in the included studies. There are also potential biases secondary to the risk of misdiagnosis at surgery owing to a lack of either histopathological findings or expertise, coupled with the surgeons not being blinded. Furthermore, the varying surgical experience and the lack of clarity regarding complete surgical clearance, thereby also contributing to the lack of histopathology, could also explain the wide range of pre-test probability of disease.
WIDER IMPLICATIONS OF THE FINDINGS: MRI outperformed TVS for the preoperative diagnosis of USL, RVS and vaginal DE with higher sensitivities, although the specificities for both were excellent. There were improved results with other imaging modalities, such as rectal endoscopy-sonography, as well as with the addition of bowel preparation or ultrasound gel to either TVS or MRI, although these are based on individual studies.
STUDY FUNDING/COMPETING INTERESTS: No funding was received for this study. M.L. reports personal fees from GE Healthcare and grants from the Australian Women's and Children's Foundation, outside the submitted work. B.W.M. reports grants from NHMRC, outside the submitted work. G.C. reports personal fees from GE Healthcare, outside the submitted work, and is on the Endometriosis Advisory Board for Roche Diagnostics.
REGISTRATION NUMBER: Prospective registration with PROSPERO (CRD42017059872) was obtained.
WHAT DOES THIS MEAN FOR PATIENTS?
This study looks at whether any of the imaging methods (that produce pictures of the inside of the body) performed better in detecting endometriosis of the uterosacral ligaments (supporting the uterus and pelvic organs), rectovaginal septum (strong connective tissue between the rectum and vagina) and vagina before surgery. This is particularly important for both women and gynaecologists to ensure that as much endometriosis as possible is removed the first time, therefore avoiding multiple surgeries.
In this study, we searched for all studies that assessed any imaging method before surgery and compared its results with the findings at surgery. MRI was slightly more accurate than transvaginal ultrasound (through the vagina), although more studies are needed to test other methods.
Introduction
Since the correlation of ultrasonography with surgical findings of deep endometriosis (DE) in the pelvis by Bazot , there have been many studies assessing multiple imaging modalities to preoperatively diagnose the location and extent of DE. Some of the imaging techniques used have included transvaginal ultrasonography (TVS), MRI, rectal endoscopy-sonography (RES) and computed tomography.
However, until the publication of the consensus statement by the International Deep Endometriosis Analysis (IDEA) Group (Guerriero ) in 2016, there were significant variations within the terms and definitions used to describe DE in the pelvis. The IDEA consensus statement standardized these, as well as the sonographic evaluation of the pelvis, dividing the pelvis into the anterior and posterior compartments. The anterior compartment includes the bladder, uterovesical pouch and ureters, whilst the posterior compartment consists of the rectovaginal septum (RVS), uterosacral ligaments/torus uterinus (USL), posterior vaginal fornix and rectum/rectosigmoid (Guerriero ). In terms of the posterior compartment, the rectum/rectosigmoid has been the most evaluated, which is not surprising given the surgical complexity that DE at this site poses for the gynaecologist (Abrao ). Meanwhile, the remaining three regions of the posterior compartment are less studied.
The purpose of this systematic review was to assess the diagnostic accuracy of all imaging modalities for the preoperative detection of DE in the USL, RVS and posterior vaginal fornix, which will be referred to as the vagina, as defined by the IDEA group, compared with surgical data in women of reproductive age.
Materials and methods
Protocol and registration
This review was designed as per the Synthesizing Evidence from Diagnostic Accuracy Tests (SEDATE) guidelines (Sotiriadis ) and the PRISMA statement (Moher ). Prior to commencement, prospective registration of the protocol was obtained with PROSPERO (CRD42017059872) including the detailing of inclusion/exclusion criteria, data extraction and quality assessment. This study is one of a series of subgroups of the larger systematic review protocol. The protocol and following methodology, whilst standard for systematic reviews and meta-analyses, were used in a previously published study (Gerges ).
Eligibility criteria
Peer-reviewed, published studies which evaluated preoperative imaging modalities to assess the presence of DE and compared them with the reference standard of surgical/histological diagnosis were included, as per the criteria defined by Bazot . The studies were included if they were prospective cohort studies including women of reproductive age presenting with a clinical suspicion of DE, based on symptoms and/or physical examination, from any healthcare centre setting.
Any imaging modalities used for the detection of DE of the USL, RVS and vagina were included, namely, MRI, RES, sonovaginography (SVG) and TVS. We also included the variations of standard techniques, such as the addition of gel contrast, rectal water or bowel preparation (BP), with the outcome being the presence and location of DE. The imaging techniques were assessed as a group and separately. Only those studies with sufficient data to construct 2 × 2 contingency tables were included. The risk of selection bias was reduced by only including studies with at least 10 affected and 10 unaffected women by the reference standard. There were no restrictions on language.
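As an illustration of the 2 × 2 requirement and the minimum-size rule described above, the following minimal Python sketch shows how per-study sensitivity and specificity follow from a contingency table. The class name `TwoByTwo`, its method names and the counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Index-test results cross-tabulated against the surgical reference standard."""

    tp: int  # imaging positive, DE confirmed at surgery
    fp: int  # imaging positive, no DE at surgery
    fn: int  # imaging negative, DE confirmed at surgery
    tn: int  # imaging negative, no DE at surgery

    @property
    def affected(self) -> int:
        return self.tp + self.fn

    @property
    def unaffected(self) -> int:
        return self.fp + self.tn

    def is_eligible(self, minimum: int = 10) -> bool:
        # Eligibility rule from the text: at least 10 affected and 10 unaffected women.
        return self.affected >= minimum and self.unaffected >= minimum

    def sensitivity(self) -> float:
        return self.tp / self.affected

    def specificity(self) -> float:
        return self.tn / self.unaffected


# Hypothetical counts for illustration only.
example = TwoByTwo(tp=12, fp=3, fn=8, tn=77)
print(example.is_eligible())            # True (20 affected, 80 unaffected)
print(round(example.sensitivity(), 2))  # 0.6
print(round(example.specificity(), 2))  # 0.96
```

A study with fewer than 10 women in either group would fail `is_eligible()` and be excluded before pooling.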
Information sources
Searches were conducted using Embase, Google Scholar, Medline, PubMed and Scopus to identify published studies from inception (1946) until 1 May 2020, of which only those from 2010 were screened for eligible studies owing to the increased proficiency of the sonographers and advancements in technology. Filters were not utilized to reduce any exclusions of potentially relevant studies (Leeflang ). Furthermore, the references from included studies and relevant reviews were hand-searched by the authors. Where necessary, the authors of primary studies were contacted.
Search
The search criteria used with the aforementioned databases are outlined in Supplementary Data. The studies were then screened for those that assessed USL, RVS or vaginal DE to ensure that studies that used inconsistent or outdated descriptions of DE were not excluded.
Study selection
Initial screening of the records was based on titles and abstracts after which the full texts of the potentially eligible records were reviewed. Compliance with the inclusion criteria and selection of eligible studies was performed following the independent and blind examination by two authors (B.G. and G.C.) of these full texts. Where studies included either all or part of the same previously published study population, the most complete and recent study was selected to avoid duplication of studies or participants. Similarly, the most accurate and senior reviewer’s (G.C.) results were included in inter-observer diagnostic studies. The author M.L. was consulted to solve any disagreements. A ‘PRISMA’ flow chart (Moher ) was used to document the selection process.
Data items, risk of bias and quality assessment
B.G. extracted the data, and the risk of bias and applicability of individual studies were independently assessed by B.G. and M.L. as per QUADAS-2 (Whiting ; Gerges ). The four domains evaluated were: patient selection; index test; reference standard; and flow and timing (only risk of bias). An overall quality summary score for each study was not calculated (Whiting ).
Statistical analysis
Mixed-effects diagnostic meta-analysis was performed to determine the overall pooled sensitivity and specificity, from which the likelihood ratios of positive and negative tests (LR+, LR−) (Zwinderman and Bossuyt, 2008), diagnostic odds ratios (DORs) and AUC of summary receiver-operating characteristic curves (sROC) were derived, with their respective 95% CIs, for all diagnostic modules. At least four studies are required to perform a meta-analysis with this method (Sotiriadis ). Forest plots of sensitivity and specificity were produced for diagnostic modules that had adequate studies to be assessed. sROC were plotted to illustrate AUC and the relation between sensitivity and specificity. Sub-group analyses, where possible, were performed using the same methods.
The magnitude and presence of heterogeneity for sensitivity and specificity were assessed using Cochran’s Q test and the I² index. A P-value of Cochran’s Q test <0.1 suggests the presence of heterogeneity. The I² index describes the percentage of total variation across studies that is attributable to heterogeneity rather than chance. I² values of 25%, 50% and 75% would be considered to indicate low, moderate and high heterogeneity, respectively (Higgins ).
The Deeks funnel plot asymmetry test was used to assess publication bias by computing a regression of the diagnostic log odds ratio against 1/root (effective sample size), weighted by effective sample size. A P-value <0.10 for the slope coefficient suggests significant asymmetry and possible publication bias (Deeks ). All analyses were performed using STATA version 16.1 for Windows (Stata Corporation, College Station, TX, USA).
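The derived quantities named above can be sketched with their textbook definitions. This is a minimal Python illustration, not the mixed-effects STATA model used in the review; the function names are hypothetical, and the Q statistic and group sizes in the example are made up. Because the inputs are the rounded pooled estimates, the outputs only approximate the model-based values reported in Table II.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float, float]:
    """Return (LR+, LR-, DOR) from a sensitivity/specificity pair."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    dor = lr_pos / lr_neg  # diagnostic odds ratio
    return lr_pos, lr_neg, dor


def i_squared(q: float, n_studies: int) -> float:
    """I-squared (%) from Cochran's Q with n_studies - 1 degrees of freedom."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def effective_sample_size(n_affected: int, n_unaffected: int) -> float:
    """Effective sample size used as the weight in the Deeks regression."""
    return 4.0 * n_affected * n_unaffected / (n_affected + n_unaffected)


# Rounded pooled estimates for all TVS techniques at the USL (60% and 95%).
lr_pos, lr_neg, dor = likelihood_ratios(0.60, 0.95)
print(round(lr_pos, 1), round(lr_neg, 2), round(dor, 1))  # 12.0 0.42 28.5

# Hypothetical Q statistic across five studies.
print(round(i_squared(20.0, 5), 1))  # 80.0

# Hypothetical study with 20 affected and 80 unaffected women.
print(effective_sample_size(20, 80))  # 64.0
```

The unequal group sizes in the last example show why the effective sample size (64) is smaller than the total of 100 women: the weight is driven by the smaller group.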
Results
Search results
The literature search of the Embase, Google Scholar, Medline, PubMed and Scopus databases, from inception to 1 May 2020 identified 1977 references. The PRISMA flow diagram in Fig. 1 represents the selection of studies. Of the 45 (Fedele ; Dessole ; Delpy ; Takeuchi ; Bahr ; Abrao ; Biscaldi ; Guerriero ; Griffiths ; Guerriero ; Ribeiro ; Valenzano Menada ; Bazot ; Hottat ; Hudelist ; Piketty ; Bergamini ; Chassang ; Faccioli ; Goncalves ; Grasso ; Pascual ; Ferrero ; Hudelist ; Fiaschetti ; Savelli ; Bazot ; Holland ; Hudelist ; Manganaro ; Stabile Ianora ; Leon ; Tammaa ; Baggio ; Menakaya ; Ferrero ; Guerriero ; Jiang ; Ros ; Alborzi ; Carfagna ; Di Giovanni ; Reid ; Zhang ; Barra ), there were 10 (Pascual ; Hudelist ; Fiaschetti ; Bazot ; Holland ; Manganaro ; Tammaa ; Menakaya ; Alborzi ; Zhang ) which specifically assessed the USL, RVS and vagina that were included in the analysis after 2010.
Figure 1.
Flow of studies identified in literature for systematic review on imaging modalities for the preoperative diagnosis of uterosacral ligament/torus uterinus, rectovaginal septum and vaginal deep endometriosis.
The 10 studies included a total of 1188 women with a median of 91.5 per study (range 23 to 317) (Pascual ; Hudelist ; Fiaschetti ; Bazot ; Holland ; Manganaro ; Tammaa ; Menakaya ; Alborzi ; Zhang ). Of the 10 studies, seven were conducted in Europe, one in Asia, one in Australia and one in the Middle East.
A total of nine studies assessed USL DE (1150 participants) (Hudelist ; Fiaschetti ; Bazot ; Holland ; Manganaro ; Tammaa ; Menakaya ; Alborzi ; Zhang ). Seven of these assessed TVS (1085 women): five used two-dimensional (2D) TVS (568 participants) (Hudelist ; Fiaschetti ; Holland ; Tammaa ; Zhang ), one used SVG (Menakaya ) and one used TVS with BP (Alborzi ). A total of four studies assessed MRI (440 women, with 521 examinations included in the analysis because two studies compared more than one MRI technique in each woman), of which all four studies used 2D MRI (Fiaschetti ; Bazot ; Manganaro ; Alborzi ), one used three-dimensional (3D) MRI (Bazot ) and two studies used MRI with gel (Fiaschetti ; Manganaro ). There was one study that assessed RES (Alborzi ). The pre-test probabilities of disease for TVS, 2D TVS, MRI and 2D MRI were 33%, 34%, 47% and 47%, respectively.
A total of seven studies assessed RVS DE (1005 participants) (Pascual ; Hudelist ; Fiaschetti ; Holland ; Tammaa ; Menakaya ; Alborzi ), all of which assessed TVS (1005 participants). Of these, four studies assessed 2D TVS (450 participants) (Hudelist ; Fiaschetti ; Holland ; Tammaa ), one used SVG (Menakaya ), one used 3D TVS (Pascual ) and one used TVS-BP (Alborzi ). Two studies assessed MRI (432 participants) (Fiaschetti ; Alborzi ), of which Alborzi assessed 2D MRI and Fiaschetti compared MRI with and without vaginal gel. One study assessed RES (Alborzi ). The pre-test probabilities of DE for both TVS and 2D TVS were 14%.
A total of five studies assessed vaginal DE (474 participants) (Hudelist ; Fiaschetti ; Bazot ; Tammaa ; Menakaya ), of which four studies assessed TVS (451 participants), from which five data sets were obtained (516 participants) as Tammaa assessed the interobserver agreement of two experts. One study used SVG (Menakaya ) and the remaining three studies used 2D TVS (251 participants) (Hudelist ; Fiaschetti ; Tammaa ), from which four data sets were used (316 participants), which included the interobserver agreement of the two experts from Tammaa . Three studies assessed MRI (137 participants), from which four data sets were obtained (160 participants) as Fiaschetti compared MRI with and without vaginal gel. The pre-test probabilities of disease for TVS, 2D TVS and MRI were 10%, 14% and 20%, respectively. The study characteristics are summarized in Table I and the summary results are shown in Table II.
Table I
Characteristics of included studies.
| First author | Year | Country | Setting | n | Mean age (years) | Dysmenorrhea (%) | Chronic pelvic pain (%) | Dyspareunia (%) | Bowel symptoms (%) | Urinary symptoms (%) | Infertility (%) | Index test(s) | Observers | Reference standard | Cases with USL/RVS/vaginal DE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Alborzi | 2018 | Iran | Multicentre | 317 | 31 | – | – | – | – | – | – | TVS, RES and MRI | Single | Surgery and histopathology | 151/44/– |
| Bazot | 2013 | France | Single centre | 23 | 34 | – | – | – | – | – | – | MRI and 3D MRI | Two | Surgery and histopathology | 17/–/– |
| Fiaschetti | 2011 | Italy | Single centre | 58 | 34 | 86 | 60 | 67 | 45 | – | 27 | TVS and MRI with/without gel | Single | Surgery | 21/16/11 |
| Holland | 2013 | England | Multicentre | 198 | 35 | 72.2 | 49.5 | 45.9 | 9.6 | – | – | TVS | Two | Surgery | 40/32/– |
| Hudelist | 2011 | Austria | Multicentre | 129 | 32.2 | 86 | 34.8 | 55.8 | 30.2 | 4.6 | 15.5 | TVS | Single | Surgery and histopathology | 30/9/11 |
| Manganaro | 2013 | Italy | Single centre | 42 | 28 | – | – | – | – | – | – | 3D MRI | Single | Surgery and histopathology | 19/–/– |
| Menakaya | 2016 | Australia | Multicentre | 200 | 32.1 | 58.5 | 61 | 57.5 | 41.5 | – | 38 | SVG | Three | Surgery and histopathology | 17/9/10 |
| Pascual | 2010 | Spain | Single centre | 38 | 35.6 | – | 100 | – | – | – | 38 | 3D TVS | Three | Surgery and histopathology | –/19/– |
| Tammaa | 2015 | Austria | Single centre | 65 | 30.2 | 95 | 32 | 63 | 15 | 9 | 32 | TVS | Two | Surgery and histopathology | 17/8/11 |
| Zhang | 2019 | China | Single centre | 118 | 35.2 | – | – | – | – | – | – | TVS | Single | Surgery and histopathology | 85/–/– |
Only the first author of each study is given. All studies were prospective and included women with clinical suspicion of uterosacral/torus uterinus, rectovaginal septum or vaginal deep endometriosis (DE). Observers refer to the number of observers involved with each imaging modality.
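The cohort sizes listed in Table I can be cross-checked against the totals quoted in the text. The short Python sketch below uses only the n column of Table I; the variable names are illustrative.

```python
from statistics import median

# Number of women per included study, as listed in the n column of Table I.
cohort_sizes = {
    "Alborzi": 317,
    "Bazot": 23,
    "Fiaschetti": 58,
    "Holland": 198,
    "Hudelist": 129,
    "Manganaro": 42,
    "Menakaya": 200,
    "Pascual": 38,
    "Tammaa": 65,
    "Zhang": 118,
}

sizes = list(cohort_sizes.values())
total = sum(sizes)
print(total)                   # 1188 women overall
print(median(sizes))           # 91.5 women per study
print(min(sizes), max(sizes))  # 23 317

# Pascual assessed the RVS only, so the USL analysis covers the other nine studies.
usl_total = total - cohort_sizes["Pascual"]
print(usl_total)               # 1150
```

The totals agree with the 1188 women, median of 91.5 (range 23 to 317) and 1150 USL participants reported in the search results.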
Table II
Summary of findings of the pooled results of the preoperative diagnostic accuracy of imaging modalities.
| Site | Imaging technique | Studies (n) | Patients (n) | Sensitivity (CI) | Specificity (CI) | LR+ (CI) | LR− (CI) | DOR (CI) | AUC (CI) |
|---|---|---|---|---|---|---|---|---|---|
| USL DE | TVS, overall | 7 | 1085 | 60% (32–82%) | 95% (90–98%) | 13.2 (8.0–21.8) | 0.42 (0.22–0.82) | 31 (15–65) | 94% (92–96%) |
| USL DE | 2D TVS | 5 | 568 | 64% (29–89%) | 96% (88–89%) | 15.2 (6.5–35.2) | 0.37 (0.15–0.94) | 41 (14–121) | 95% (92–96%) |
| USL DE | MRI, overall | 4 | 440* | 81% (66–90%) | 83% (62–94%) | 4.8 (2.1–11.1) | 0.23 (0.14–0.38) | 21 (9–47) | 88% (85–91%) |
| USL DE | 2D MRI | 4 | 440 | 79% (56–91%) | 87% (67–96%) | 5.9 (2.3–15.1) | 0.25 (0.12–0.52) | 24 (8–71) | 90% (87–92%) |
| RVS DE | TVS, overall | 7 | 1005 | 57% (30–80%) | 100% (92–100%) | 147.1 (7.5–2895.2) | 0.44 (0.23–0.81) | 338 (18–6251) | 93% (91–95%) |
| RVS DE | 2D TVS | 4 | 450 | 42% (20–68%) | 99% (90–100%) | 63.6 (3.1–1295.5) | 0.58 (0.37–0.92) | 109 (4–2833) | 85% (82–88%) |
| Vaginal DE | TVS, overall | 4 | 451# | 52% (29–74%) | 98% (95–99%) | 27.1 (12.0–61.4) | 0.49 (0.30–0.80) | 55 (22–140) | 95% (93–97%) |
| Vaginal DE | 2D TVS | 3 | 251# | 60% (38–78%) | 97% (94–99%) | 21.7 (9.7–48.4) | 0.41 (0.25–0.69) | 53 (19–143) | 96% (93–97%) |
| Vaginal DE | MRI, overall | 3 | 137^ | 64% (40–83%) | 98% (83–99%) | 27.5 (8.4–90.8) | 0.37 (0.19–0.69) | 75 (16–351) | 98% (96–99%) |
Preoperative diagnostic accuracy of imaging modalities for the detection of uterosacral/torus uterinus (USL), rectovaginal septum (RVS) and vaginal DE.
* Corresponding to 521 examinations owing to some studies performing more than one MRI technique in the same patients (Fiaschetti ; Bazot ).
# Corresponding to 516 examinations (TVS, overall) and 316 examinations (2D TVS) owing to Tammaa assessing the interobserver agreement of two experts in the same patients.
^ Corresponding to 160 examinations owing to Fiaschetti performing more than one MRI technique in the same patients.
BP, bowel preparation; RES, transrectal endoscopic sonography; RWC, rectal water contrast; SVG, sonovaginography; TVS, transvaginal ultrasound; LR+, positive likelihood ratio; LR−, negative likelihood ratio.
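To show how the likelihood ratios in Table II relate to the pre-test probabilities quoted in the search results, the following Python sketch applies the standard odds form of Bayes' theorem. This calculation is not part of the review itself; the function name is hypothetical, and the example takes the reported values for all TVS techniques at the USL (pre-test probability 33%, LR+ 13.2, LR− 0.42) at face value.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Reported values for all TVS techniques in the detection of USL DE.
pre_test = 0.33
after_positive_scan = post_test_probability(pre_test, 13.2)
after_negative_scan = post_test_probability(pre_test, 0.42)

print(round(after_positive_scan, 2))  # 0.87
print(round(after_negative_scan, 2))  # 0.17
```

Under these assumptions, a positive scan raises the probability of USL DE from about one in three to roughly 87%, whereas a negative scan leaves it at about 17%, which mirrors the high specificity and modest sensitivity of TVS at this site.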
Methodological quality of included studies
The methodological quality, as per QUADAS-2 (Gerges ), of most of the studies was poor and is represented in Figs 2 and 3.
Figure 2.
QUADAS-2 (Quality Assessment of Diagnostic Accuracy Studies-2) quality evaluation of all 10 included studies.
Figure 3.
Traffic-light plot summarizing the authors' review of the QUADAS-2 risk of bias and applicability concerns.
Six studies were considered to be low risk for patient selection bias (Pascual ; Hudelist ; Holland ; Tammaa ; Menakaya ; Alborzi ), three were high risk (Fiaschetti ; Bazot ; Manganaro ) and one was unclear (Zhang ). With reference to the index test domain, seven studies were assessed to be low risk (Pascual ; Hudelist ; Bazot ; Holland ; Tammaa ; Menakaya ; Zhang ) and three were high risk (Holland ; Menakaya ; Alborzi ). Aside from Zhang , which was unclear, the remaining nine studies were considered high risk of bias for the reference standard domain as surgeons were not blinded to the preoperative imaging results. With respect to the flow and timing domain, five were considered unclear (Hudelist ; Fiaschetti ; Bazot ; Tammaa ; Alborzi ) and the remaining five were low risk (Pascual ; Holland ; Manganaro ; Menakaya ; Zhang ). With regard to applicability concerns, all the studies were deemed low risk as they were only included if they: had a clinically relevant population that would have undergone the index test in real practice; used any imaging modality, as all were included, for which the index test was described in sufficient detail; and had surgery as the reference standard.
Uterosacral ligament and torus uterinus deep endometriosis
Diagnostic performance of TVS
The overall pooled sensitivity and specificity, from which LR+, LR− and DOR were calculated, for the detection of USL DE with TVS and the sub-analysis with 2D TVS are shown in Table II. There was significant heterogeneity for sensitivity (Fig. 4). The sROC are displayed in Fig. 5. There was no evidence of publication bias for either of these analyses (P = 0.93 and P = 0.77, respectively) (Supplementary Fig. S1).
Figure 4.
Forest plots of studies included for the evaluation of uterosacral ligaments/torus uterinus deep endometriosis (TVS). Imaging modalities analysed are (a) ALL transvaginal ultrasound (TVS) and (b) sub-analysis of 2D TVS, displaying the pooled sensitivity, specificity and heterogeneity statistics (Cochran’s Q and I²).
Figure 5.
Summary ROC curves of studies included for the evaluation of uterosacral ligaments/torus uterinus deep endometriosis. Imaging modalities analysed are (a) ALL transvaginal ultrasound (b) sub-analysis of 2D transvaginal ultrasound (c) ALL MRI and (d) sub-analysis of 2D MRI. SENS, sensitivity; SPEC, specificity; SROC, summary receiver-operating characteristic.
Given the low number of studies, it was not possible to perform sub-analyses for TVS-BP or SVG. However, whilst the results were poorer with SVG, with a sensitivity and specificity of 24% and 98%, respectively (Menakaya ), they were much improved with BP, with a sensitivity and specificity of 71% and 93%, respectively (Alborzi ).
Diagnostic performance of MRI
The overall pooled sensitivity and specificity, from which LR+, LR− and DOR were calculated, for the detection of USL DE with MRI and the sub-analysis with 2D MRI are shown in Table II. There was significant heterogeneity for sensitivity and specificity (Fig. 6). The sROC are displayed in Fig. 5. There was no evidence of publication bias for either of these analyses (P = 0.53 and P = 0.79, respectively) (Supplementary Fig. S1).
Figure 6.
Forest plots of studies included for the evaluation of uterosacral ligaments/torus uterinus deep endometriosis (MRI). Imaging modalities analysed are (a) ALL MRI and (b) sub-analysis of 2D MRI, displaying the pooled sensitivity, specificity and heterogeneity statistics (Cochran’s Q and I²).
Given the low number of studies, it was not possible to perform sub-analyses for 3D MRI or MRI with ultrasound gel. 3D MRI had a slightly higher sensitivity of 88% but a much lower specificity of 33% (Bazot ), whilst the results of MRI with ultrasound gel were improved, with sensitivities and specificities ranging from 81% to 91% and 89% to 92%, respectively (Fiaschetti ; Manganaro ).
Diagnostic performance of RES
There was one study assessing RES, with a sensitivity and specificity of 83% and 90%, respectively (Alborzi ).
Rectovaginal septum deep endometriosis
The overall pooled sensitivity and specificity, from which LR+, LR− and DOR were calculated, for the detection of RVS DE with TVS and the sub-analysis with 2D TVS are shown in Table II. There was significant heterogeneity for sensitivity and specificity (Fig. 7). The sROC are displayed in Supplementary Fig. S2. There was no evidence of publication bias for either of these analyses (P = 0.44 and P = 0.57, respectively) (Supplementary Fig. S3).
Figure 7.
Forest plots of studies included for the evaluation of rectovaginal septum endometriosis. Imaging modalities analysed are (a) ALL TVS and (b) sub-analysis of 2D TVS, displaying the pooled sensitivity, specificity and heterogeneity statistics (Cochran’s Q and I²).
Given the low number of studies, it was not possible to perform sub-analyses for 3D TVS, TVS-BP or SVG. There was one study for each of these modalities, with a sensitivity/specificity of 89%/95% for 3D TVS (Pascual ), 86%/95% for TVS-BP (Alborzi ) and 22%/100% for SVG (Menakaya ).
There were only two studies that assessed MRI. One reported a sensitivity and specificity of 73% and 95%, respectively, using 2D MRI (Alborzi ), whilst Fiaschetti found that sensitivity improved from 69% without gel to 94% with gel, with very similar specificities of 93% and 91%, respectively.
Only one study assessed RES, with a sensitivity and specificity of 84% and 94%, respectively (Alborzi ).
Vaginal deep endometriosis
The overall pooled sensitivity and specificity, from which LR+, LR− and DOR were calculated, for the detection of vaginal DE with TVS and the sub-analysis with 2D TVS are presented in Table II. There was significant heterogeneity for sensitivity and specificity (Fig. 8). The sROC are displayed in Fig. 9. There was no evidence of publication bias for either of these analyses (P = 0.05 and P = 0.09, respectively) (Supplementary Fig. S4).
Figure 8.
Forest plots of studies included for the evaluation of vaginal deep endometriosis. Imaging modalities analysed are (a) ALL TVS and (b) sub-analysis of 2D TVS, displaying the pooled sensitivity, specificity and heterogeneity statistics (Cochran’s Q and I2).
Figure 9.
SROC curves of studies included for the evaluation of vaginal deep endometriosis. Imaging modalities analysed are (a) ALL TVS, (b) sub-analysis of 2D TVS and (c) ALL MRI.
As there was only one study assessing SVG, it was not possible to perform sub-analyses, although sensitivity and specificity were 20% and 99%, respectively (Menakaya ).
The overall pooled sensitivity and specificity, from which LR+, LR− and DOR were calculated, for the detection of vaginal DE with MRI are presented in Table II. There was significant heterogeneity for sensitivity and specificity (Supplementary Fig. S5). The sROC is displayed in Fig. 9. There was no evidence of publication bias for this analysis (P = 0.81) (Supplementary Fig. S4).
Given the low number of studies, it was not possible to perform sub-analyses for 2D MRI, 3D MRI and MRI with ultrasound gel. For 2D MRI, the sensitivities ranged widely from 36% (Fiaschetti ) to 60% (Bazot ), although the specificities were similar, ranging from 94% to 98% (Fiaschetti ; Bazot ). MRI with ultrasound gel outperformed 2D MRI with a sensitivity and specificity of 82% and 98%, respectively (Fiaschetti ).
Discussion
Summary of evidence
While USL DE is one of the most common sites of DE, found in up to 61% of women during laparoscopy (Fratelli ), assessment of disease in this region via TVS seems to be one of the most difficult, with sensitivities of less than 70% reported in the literature (Deslandes ). This is consistent with the findings of our meta-analysis, where the detection of USL DE using TVS was poorer than with MRI, with a pooled sensitivity, specificity, DOR and AUC of 61%, 95%, 24 and 93%, respectively, for the former and 81%, 86%, 27 and 89%, respectively, for the latter. MRI also had higher pooled sensitivity than TVS for both RVS and vaginal DE. The overall pooled sensitivity, specificity, DOR and AUC of MRI were 75%, 95%, 68 and 96%, respectively, for the detection of RVS DE and 70%, 96%, 55 and 90%, respectively, for the detection of vaginal DE. Meanwhile, the overall pooled sensitivity, specificity, DOR and AUC of TVS were 72%, 98%, 154 and 97%, respectively, for the detection of RVS DE and 58%, 97%, 46 and 95%, respectively, for the detection of vaginal DE. While MRI seems to outperform TVS, it is important to note that the CIs overlap and that the absence of clear differences is associated with the significant heterogeneity for sensitivity and specificity of both techniques.
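As an illustrative sketch only, the relationship between a sensitivity/specificity pair and the derived LR+, LR− and DOR can be worked through with the standard definitions, using the pooled point estimates quoted above as inputs. The function name `diagnostic_ratios` and the `pooled` dictionary are hypothetical and not part of the original analysis. Because the inputs are rounded point estimates, the resulting values are approximations and need not match the model-based figures reported in Table II.

```python
def diagnostic_ratios(sensitivity, specificity):
    """Return (LR+, LR-, DOR) for a sensitivity/specificity pair given as proportions."""
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1 - specificity)   # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity   # negative likelihood ratio
    dor = lr_pos / lr_neg                      # diagnostic odds ratio (unitless)
    return lr_pos, lr_neg, dor


# Rounded pooled point estimates quoted in the summary of evidence
# (site, modality) -> (sensitivity, specificity)
pooled = {
    ("USL", "TVS"): (0.61, 0.95),
    ("USL", "MRI"): (0.81, 0.86),
    ("RVS", "TVS"): (0.72, 0.98),
    ("RVS", "MRI"): (0.75, 0.95),
    ("vaginal", "TVS"): (0.58, 0.97),
    ("vaginal", "MRI"): (0.70, 0.96),
}

for (site, modality), (sens, spec) in pooled.items():
    lr_pos, lr_neg, dor = diagnostic_ratios(sens, spec)
    print(f"{site:8s} {modality}: LR+ = {lr_pos:5.1f}, LR- = {lr_neg:4.2f}, DOR = {dor:6.1f}")
```

The DOR is a ratio of odds rather than a percentage, so a high specificity (a small false-positive fraction in the denominator of LR+) can produce a large DOR even when sensitivity is modest.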
Interpretation of results
Our results were comparable to previously published meta-analyses with regard to TVS being outperformed by MRI for the detection of USL DE. Nisenblat compared all imaging modalities and obtained a sensitivity and specificity of 64% and 97%, respectively, for TVS (seven studies), and 86% and 84%, respectively, for MRI (four studies). Similarly, Guerriero published two reviews: the first, in 2016, assessed TVS, while the most recent, in 2018 (Guerriero ), compared TVS and MRI in women who had both tests. In 2016, a total of 11 studies were included, from which the sensitivity and specificity of TVS for the detection of USL DE were 53% and 93%, for RVS DE were 49% and 98%, and for vaginal DE were 58% and 96%, respectively. Aside from RVS DE, these results were very similar to ours, with the differences likely linked to the smaller number of studies included: the assessment of these regions is likely to have improved given the increased experience in the time between reviews. In the head-to-head review in 2018 (Guerriero ), a total of six studies were included, from which the sensitivity and specificity of TVS for the detection of USL DE were 67% and 86%, respectively, compared with 70% and 93% for MRI. For RVS DE, the sensitivity and specificity of TVS were 59% and 97%, respectively, compared with 66% and 97%, respectively, for MRI (Guerriero ). As only head-to-head studies were included, it is not surprising that there were some differences from our results, particularly given the limitations of the small number of studies included. Noventa performed a similar head-to-head meta-analysis, although they included retrospective studies, and interestingly found TVS to be marginally superior to MRI for the detection of USL DE, with sensitivities of 71% and 67%, respectively. This, however, was reversed for RVS DE (as with other studies), with sensitivities of 47% and 61%, respectively (Noventa ).
When comparing the performance of MRI, Medeiros confirmed very similar results in their meta-analysis reviewing the accuracy of MRI for DE, finding sensitivities and specificities of 85% and 80%, respectively, for the detection of USL DE; 77% and 95%, respectively, for the detection of RVS DE; and 82% and 82%, respectively, for the detection of vaginal DE.
In contrast to some of the studies discussed above (Medeiros ; Noventa ), the present analysis only included prospective studies with at least 10 unaffected and 10 affected women to reduce the risk of selection bias. Aside from an attempt to reduce selection bias, the reasoning for specifying the minimum number of women affected and not affected by the disease was to increase the applicability of the results to the general population, as inevitably many of these studies are performed in tertiary level referral centres. In addition to these strengths, the primary searches were purposely broad to capture all potentially applicable studies, particularly given the discrepancies in the definitions of USL, RVS and vaginal DE. Although the risk of studies not being identified in a search is a limitation of any systematic review, an attempt was made to reduce this by including all studies with any reference to ‘endometriosis’ and ‘deep’.
Limitations
As with many systematic reviews and meta-analyses assessing similar diagnostic studies, one of the limitations is the low quality of evidence given the high risk of bias and heterogeneity in the included studies. Similarly, there are potential biases secondary to the risk of misdiagnosis at surgery owing to the lack of either histopathological findings or expertise, coupled with the surgeons not being blinded. Furthermore, and importantly, many of the studies do not report the experience or the number of surgeons involved. This potential for varying surgical experience, together with the lack of clarity regarding complete surgical clearance (which also contributes to the lack of histopathology), could also explain the wide range of pre-test probability of disease. This would be particularly problematic with RVS and vaginal DE, both of which are less common than USL DE. Of note, while the Bazot criteria were used, as with other studies, two of the included studies (Fiaschetti ; Holland ) met the criteria based on pouch of Douglas obliteration but did not include histopathology. This implies a lack of accurate surgical mapping of the exact locations of DE posterior to the cervix, since dissection of the retroperitoneum was not performed. In these cases, the Bazot criteria are insufficient for evaluating diagnostic accuracy at these specific DE sites. Indeed, there is the impression that the diagnostic accuracy of TVS and MRI is very high and similar for both techniques in studies performed at dedicated endometriosis centres where there are both expert imaging operators and surgeons. This is further confirmation of the effectiveness of endometriosis units, where more accurate diagnoses are a result of the collaboration of experts in all fields.
In addition, as the number of studies which met the criteria was limited, it was not possible to perform pooled analyses of other imaging modalities, or sub-analyses within the modalities regarding the addition of BP or contrast media. More published prospective studies are necessary to obtain unbiased data in this regard. Finally, given the lack of standardized nomenclature prior to 2016, there is the risk that the defined regions assessed may be inaccurate, such as the difficulty in differentiating between vaginal and retrocervical lesions.
Conclusion
There is a lack of unbiased and standardized data, which makes it difficult to identify the optimal imaging modality. However, MRI outperformed TVS for the preoperative diagnosis of USL, RVS and vaginal DE with higher sensitivities, although the specificities for both were excellent. There were improved results with other imaging modalities, such as RES, as well as with the addition of BP or ultrasound gel to either TVS or MRI, although these are based on individual studies. Further studies assessing different contrast media are needed, as these may improve the accuracy of the imaging modality, and such studies should also adopt the standardized definitions proposed by IDEA.
Data availability
The data underlying this article will be shared on reasonable request to the corresponding author.
Authors’ roles
B.G. was responsible for the conception and design of the study, data collection, data analysis and interpretation, statistical analysis and manuscript preparation. W.L. was involved in statistical analysis. M.L. was involved in data analysis and interpretation. B.W.M. was involved in manuscript preparation. G.C. was involved with the conception and design of the study and manuscript preparation. All authors made contributions to drafting and revising the article critically for important intellectual content. All authors approved the final version of the article to be published.
Funding
No specific funding was obtained for this study.
Conflict of interest
M.L. reports personal fees from GE Healthcare, grants from the Australian Women’s and Children’s Foundation, outside the submitted work. B.W.M. reports grants from NHMRC, outside the submitted work. G.C. reports personal fees from GE Healthcare, outside the submitted work; and is on the Endometriosis Advisory Board for Roche Diagnostics.