Christopher Norman (1, 2), Mariska Leeflang (2), Aurélie Névéol (1). 1. LIMSI, CNRS, Université Paris Saclay, F-91405 Orsay. 2. Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands.
Abstract
BACKGROUND: Systematic reviews are critical for obtaining accurate estimates of diagnostic test accuracy (DTA), yet they require extracting information buried in free-text articles, an often laborious process. OBJECTIVE: We create a dataset describing the data extraction and synthesis processes in 63 DTA systematic reviews, and demonstrate its utility by using it to replicate the data synthesis in the original reviews. METHOD: We construct the dataset using a custom automated extraction pipeline complemented with manual extraction, verification, and post-editing. We evaluate it through manual assessment by two annotators and by comparison against data extracted from source files. RESULTS: The constructed dataset contains 5,848 test results for 1,354 diagnostic tests from 1,738 diagnostic studies. We observe an extraction error rate of 0.06% to 0.3%. CONCLUSIONS: This is the first dataset describing the later stages of the DTA systematic review process, and it is intended to support automating or evaluating that process.