Gianni Virgili, Manuele Michelessi, Alba Miele, Francesco Oddone, Giada Crescioli, Valeria Fameli, Ersilia Lucenteforte.
Abstract
AIM: To investigate the reproducibility of the updated Standards for the Reporting of Diagnostic Accuracy Studies tool (STARD 2015) in a set of 106 studies included in a Cochrane diagnostic test accuracy (DTA) systematic review of imaging tests for diagnosing manifest glaucoma.
Year: 2017 PMID: 29023557 PMCID: PMC5638332 DOI: 10.1371/journal.pone.0186209
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Table 1. Interrater agreement between the senior rater and any other rater (Overall Agreement) or each of the three other raters (ophthalmology resident, ophthalmologist, pharmacology researcher).
The joint judgement is presented as Positive Reporting.
| Section | No | Item | Positive Reporting, N (%) | Overall Agreement, N (%) | Ophthalmology resident, N (%) | Ophthalmologist, N (%) | Pharmacology researcher, N (%) |
|---|---|---|---|---|---|---|---|
| | 1 | Identification as a study of diagnostic accuracy using at least one measure of accuracy (such as sensitivity, specificity, predictive values, or AUC) | 106 (100.00) | 101 (95.3) | 34 (97.1) | 33 (94.3) | 35 (94.59) |
| | 2 | Structured summary of study design, methods, results, and conclusions (for specific guidance, see STARD for abstracts) | Not applicable | | | | |
| | 3 | Scientific and clinical background, including the intended use and clinical role of the index test | 10 (9.4) | 67 (63.21) | 34 (97.1) | 21 (60) | 12 (33.3) |
| | 4 | Study objectives and hypotheses | 40 (37.4) | 87 (82.08) | 35 (100.00) | 23 (65.7) | 29 (80.1) |
| Study design | 5 | Whether data collection was planned before the index test and reference standard were performed (prospective study) or after (retrospective study) | 59 (55.7) | 89 (84) | 32 (91.4) | 32 (91.4) | 25 (69.4) |
| Participants | 6 | Eligibility criteria | 103 (97.2) | 97 (91.5) | 34 (97.1) | 33 (94.3) | 30 (83.3) |
| | 7 | On what basis potentially eligible participants were identified (such as symptoms, results from previous tests, inclusion in registry) | 53 (50) | 77 (72.6) | 32 (91.4) | 28 (77.1) | 18 (50) |
| | 8 | Where and when potentially eligible participants were identified (setting, location and dates) | 56 (52.8) | 95 (89.6) | 32 (91.4) | 32 (85.7) | 33 (91.7) |
| | 9 | Whether participants formed a consecutive, random or convenience series | 44 (41.51) | 99 (93.4) | 35 (100.0) | 34 (97.1) | 30 (83.3) |
| Test methods | 10a | Index test, in sufficient detail to allow replication | 104 (98.1) | 100 (94.3) | 34 (97.1) | 34 (97.1) | 32 (88.9) |
| | 10b | Reference standard, in sufficient detail to allow replication | 106 (100) | 98 (92.5) | 34 (97.1) | 34 (97.1) | 31 (86.1) |
| | 11 | Rationale for choosing the reference standard (if alternatives exist) | 20 (18.9) | 91 (85.9) | 34 (97.1) | 25 (71.4) | 32 (88.9) |
| | 12a | Definition of and rationale for test positivity cut-offs or result categories of the index test, distinguishing pre-specified from exploratory | 97 (91.5) | 71 (67) | 32 (91.4) | 21 (60) | 18 (50) |
| | 12b | Definition of and rationale for test positivity cut-offs or result categories of the reference standard, distinguishing pre-specified from exploratory | 98 (92.5) | 98 (92.5) | 32 (91.4) | 34 (97.1) | 32 (88.9) |
| | 13a | Whether clinical information and reference standard results were available to the performers/readers of the index test | Not applicable | | | | |
| | 13b | Whether clinical information and index test results were available to the assessors of the reference standard | 29 (27.4) | 91 (85.9) | 33 (94.29) | 25 (71.4) | 33 (91.7) |
| Analysis | 14 | Methods for estimating or comparing measures of diagnostic accuracy | 105 (99.1) | 103 (97.2) | 35 (100.00) | 34 (97.1) | 34 (94.4) |
| | 15 | How indeterminate index test or reference standard results were handled | 93 (87.8) | 93 (87.7) | 35 (100.00) | 33 (94.3) | 25 (69.4) |
| | 16 | How missing data on the index test and reference standard were handled | 62 (58.5) | 66 (62.3) | 33 (94.29) | 26 (74.3) | 7 (19.4) |
| | 17 | Any analyses of variability in diagnostic accuracy, distinguishing pre-specified from exploratory | 27 (25.5) | 85 (80.2) | 33 (94.29) | 21 (60) | 31 (86.1) |
| | 18 | Intended sample size and how it was determined | 6 (5.7) | 104 (98.1) | 35 (100) | 34 (97.1) | 35 (97.2) |
| Participants | 19 | Flow of participants, using a diagram | 0 (0) | 104 (98.1) | 35 (100) | 33 (94.3) | 37 (100.00) |
| | 20 | Baseline demographic and clinical characteristics of participants | 28 (26.4) | 92 (86.8) | 34 (97.1) | 29 (82.9) | 29 (80.6) |
| | 21a | Distribution of severity of disease in those with the target condition | 105 (99.1) | 97 (91.5) | 34 (97.1) | 33 (94.3) | 30 (83.3) |
| | 21b | Distribution of alternative diagnoses in those without the target condition | 36 (34) | 100 (94.3) | 34 (97.1) | 33 (94.3) | 33 (91.7) |
| | 22 | Time interval and any clinical interventions between index test and reference standard | 49 (46.2) | 79 (74.5) | 33 (94.3) | 28 (80) | 18 (50) |
| Test results | 23 | Cross tabulation of the index test results (or their distribution) by the results of the reference standard | 106 (100) | 103 (97.2) | 35 (100) | 34 (97.1) | 34 (94.4) |
| | 24 | Estimates of diagnostic accuracy and their precision (such as 95% confidence intervals) | 89 (84) | 88 (83) | 33 (94.3) | 28 (80) | 27 (75) |
| | 25 | Any adverse events from performing the index test or the reference standard | Not applicable | | | | |
| | 26 | Study limitations, including sources of potential bias, statistical uncertainty, and generalisability | 80 (75.5) | 94 (88.68) | 34 (97.1) | 32 (91.4) | 26 (77.8) |
| | 27 | Implications for practice, including the intended use and clinical role of the index test | 34 (32.1) | 71 (67) | 35 (100.00) | 24 (65.7) | 13 (36.1) |
| | 28 | Registration number and name of registry | 2 (1.9) | 105 (99.1) | 35 (100.00) | 34 (97.1) | 37 (100.00) |
| | 29 | Where the full study protocol can be accessed | 2 (1.9) | 105 (99.1) | 35 (100.00) | 34 (97.1) | 37 (100.00) |
| | 30 | Sources of funding and other support; role of funders | 23 (21.9) | 86 (81.1) | 35 (100.00) | 27 (77.1) | 25 (69.4) |
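
The agreement columns are simple proportions: the number of studies on which the senior rater and a second rater gave the same judgement, divided by the number of studies that rater assessed. The sketch below reproduces the "Eligibility criteria" row under the assumption that the three raters assessed 35, 35 and 36 studies respectively. These denominators are not stated in this record; they are inferred from the reported percentages for that row, and the function names are illustrative only.

```python
def agreement(agreed, rated):
    """Return (count, percent) of studies on which two raters agreed."""
    return agreed, round(100 * agreed / rated, 1)


def pooled_agreement(agreed_counts, rated_counts):
    """Pool agreement over several raters by summing counts, not averaging percentages."""
    return agreement(sum(agreed_counts), sum(rated_counts))


# "Eligibility criteria" row: agreed counts are taken from the table.
# The denominators (35, 35, 36) are ASSUMED from the reported percentages.
agreed = {"resident": 34, "ophthalmologist": 33, "pharmacologist": 30}
rated = {"resident": 35, "ophthalmologist": 35, "pharmacologist": 36}

for rater, n in agreed.items():
    print(rater, agreement(n, rated[rater]))

print("overall", pooled_agreement(list(agreed.values()), list(rated.values())))
```

Summing counts before dividing weights each rater by the number of studies assessed, which is why the overall figure is not the plain mean of the three percentages.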
Table 2. Comments on main patterns of agreement or disagreement for each STARD 2015 item.
Positive Reporting and Overall agreement, as presented in Table 1, are also shown for clarity.
| Section | No | Item | Positive Reporting, N (%) | Overall agreement, N (%) | Comment |
|---|---|---|---|---|---|
| | 1 | Identification as a study of diagnostic accuracy using at least one measure of accuracy (such as sensitivity, specificity, predictive values, or AUC) | 106 (100) | 101 (95.3) | Terms sensitivity, specificity or AUC always used in title or abstract |
| | 2 | Structured summary of study design, methods, results, and conclusions (for specific guidance, see STARD for abstracts) | Not applicable | | |
| | 3 | Scientific and clinical background, including the intended use and clinical role of the index test | 10 (9.4) | 67 (63.21) | Intended use of the test and its potential role in the clinical pathway often lacking or not clearly reported, thus difficult to assess |
| | 4 | Study objectives and hypotheses | 40 (37.4) | 87 (82.08) | Study objectives always reported but study hypothesis often lacking or not clearly reported |
| Study design | 5 | Whether data collection was planned before the index test and reference standard were performed (prospective study) or after (retrospective study) | 59 (55.7) | 89 (84) | Clear definition of prospective or retrospective nature of the study not always reported |
| Participants | 6 | Eligibility criteria | 103 (97.2) | 97 (91.5) | Generally well reported both for cases and controls |
| | 7 | On what basis potentially eligible participants were identified (such as symptoms, results from previous tests, inclusion in registry) | 53 (50) | 77 (72.6) | Details not always clearly reported for both cases and controls, thus difficult to assess |
| | 8 | Where and when potentially eligible participants were identified (setting, location and dates) | 56 (52.8) | 95 (89.6) | Date more often missing than setting |
| | 9 | Whether participants formed a consecutive, random or convenience series | 44 (41.51) | 99 (93.4) | Most studies had case-control design |
| Test methods | 10a | Index test, in sufficient detail to allow replication | 104 (98.1) | 100 (94.3) | Characteristics of index test always reported and easy to retrieve |
| | 10b | Reference standard, in sufficient detail to allow replication | 106 (100) | 98 (92.5) | Reference standard (visual field or optic nerve head appearance or both) always reported and easy to retrieve |
| | 11 | Rationale for choosing the reference standard (if alternatives exist) | 20 (18.9) | 91 (85.9) | Acknowledgment of incorporation bias made in only a few cases |
| | 12a | Definition of and rationale for test positivity cut-offs or result categories of the index test, distinguishing pre-specified from exploratory | 97 (91.5) | 71 (67) | The use of a large number of continuous and/or categorical parameters led to relatively low agreement among reviewers |
| | 12b | Definition of and rationale for test positivity cut-offs or result categories of the reference standard, distinguishing pre-specified from exploratory | 98 (92.5) | 98 (92.5) | Clear definition of reference test criteria both for visual field and optic nerve head appearance |
| | 13a | Whether clinical information and reference standard results were available to the performers/readers of the index test | Not applicable | | |
| | 13b | Whether clinical information and index test results were available to the assessors of the reference standard | 29 (27.4) | 91 (85.9) | Relatively easy to assess despite low adherence |
| Analysis | 14 | Methods for estimating or comparing measures of diagnostic accuracy | 105 (99.1) | 103 (97.2) | Easy to detect which measures of diagnostic accuracy were used |
| | 15 | How indeterminate index test or reference standard results were handled | 93 (87.8) | 93 (87.7) | Often stated that low quality images were not included in the analysis, but the item may have been interpreted differently |
| | 16 | How missing data on the index test and reference standard were handled | 62 (58.5) | 66 (62.3) | Comparison between number of enrolled and number of included patients in the final analysis was often needed to ascertain the existence of missing data |
| | 17 | Any analyses of variability in diagnostic accuracy, distinguishing pre-specified from exploratory | 27 (25.5) | 85 (80.2) | Low adherence; in most cases, sub-analyses related to disc size or disease severity among patients |
| | 18 | Intended sample size and how it was determined | 6 (5.7) | 104 (98.1) | Low adherence but easy to assess |
| Participants | 19 | Flow of participants, using a diagram | 0 (0) | 104 (98.1) | Never reported |
| | 20 | Baseline demographic and clinical characteristics of participants | 28 (26.4) | 92 (86.8) | Age was almost always reported while sex, refraction and IOP were more often missing, but easy to assess |
| | 21a | Distribution of severity of disease in those with the target condition | 105 (99.1) | 97 (91.5) | High adherence regarding glaucoma severity based on any classification system or mean deviation |
| | 21b | Distribution of alternative diagnoses in those without the target condition | 36 (34) | 100 (94.3) | IOP in control patients often missing but easy to assess |
| | 22 | Time interval and any clinical interventions between index test and reference standard | 49 (46.2) | 79 (74.5) | Incompletely reported in the methods, results or discussion |
| Test results | 23 | Cross tabulation of the index test results (or their distribution) by the results of the reference standard | 106 (100) | 103 (97.2) | Never reported as a 2×2 table but always derived from sensitivity/specificity data |
| | 24 | Estimates of diagnostic accuracy and their precision (such as 95% confidence intervals) | 89 (84) | 88 (83) | Estimates always reported but measures of precision sometimes missing |
| | 25 | Any adverse events from performing the index test or the reference standard | Not applicable | | |
| | 26 | Study limitations, including sources of potential bias, statistical uncertainty, and generalisability | 80 (75.5) | 94 (88.68) | At least one limitation often reported, mainly case-control design or poor generalizability of the results due to the characteristics of included patients (disease severity or ethnicity) |
| | 27 | Implications for practice, including the intended use and clinical role of the index test | 34 (32.1) | 71 (67) | When reported, the pre-post test probability change or likelihood ratios were presented, rather than a discussion of false positive and false negative consequences |
| | 28 | Registration number and name of registry | 2 (1.9) | 105 (99.1) | Low adherence and high agreement, easy to assess |
| | 29 | Where the full study protocol can be accessed | 2 (1.9) | 105 (99.1) | Low adherence and high agreement, easy to assess |
| | 30 | Sources of funding and other support; role of funders | 23 (21.9) | 86 (81.1) | Low adherence and high agreement, easy to assess |