| Literature DB >> 34970844 |
Shifeng Liu1, Florence T Bourgeois2,3, Adam G Dunn1,2.
Abstract
A substantial proportion of trial registrations are not linked to corresponding published articles, limiting analyses and new tools. Our aim was to develop a method for finding articles reporting the results of trials that are registered on ClinicalTrials.gov when they do not include metadata links. We used a set of 27,280 trial registration and article pairs to train and evaluate methods for identifying missing links in both directions-from articles to registrations and from registrations to articles. We trained a classifier with six distance metrics as feature representations to rank the correct article or registration, using recall@K to evaluate performance and compare to baseline methods. When identifying links from registrations to published articles, the classifier ranked the correct article first (recall@1) among 378,048 articles in 80.8% of evaluation cases and 34.9% in the baseline method. Recall@10 was 85.1% compared to 60.7% in the baseline. When predicting links from articles to registrations, recall@1 was 83.4% for the classifier and 39.8% in the baseline. Recall@10 was 89.5% compared to 65.8% in the baseline. The proposed method improves on our baseline document similarity method to be feasible for identifying missing links in practice. Given a ClinicalTrials.gov registration, a user checking 10 ranked articles can expect to identify the matching article in at least 85% of cases, if the trial has been published. The proposed method can be used to improve the coupling of ClinicalTrials.gov and PubMed, with applications related to automating systematic review and evidence synthesis processes.Entities:
Keywords: clinical trials; information retrieval; trial registration
Mesh:
Year: 2022 PMID: 34970844 PMCID: PMC9090946 DOI: 10.1002/jrsm.1545
Source DB: PubMed Journal: Res Synth Methods ISSN: 1759-2879 Impact factor: 9.308