Aggregator: a machine learning approach to identifying MEDLINE articles that derive from the same underlying clinical trial.

Weixiang Shao, Clive E Adams, Aaron M Cohen, John M Davis, Marian S McDonagh, Sujata Thakurta, Philip S Yu, Neil R Smalheiser.

Abstract

OBJECTIVE: It is important to identify separate publications that report outcomes from the same underlying clinical trial, in order to avoid over-counting these as independent pieces of evidence.
METHODS: We created positive and negative training sets (composed of pairs of articles reporting on the same condition and intervention) that were, or were not, linked to the same ClinicalTrials.gov trial registry number. Features were extracted from MEDLINE and PubMed metadata; pairwise similarity scores were modeled using logistic regression.
RESULTS: Article pairs from the same trial were identified with high accuracy (F1 score = 0.843). We also created a clustering tool, Aggregator, that takes as input a PubMed user query for RCTs on a given topic and returns clusters of articles predicted to arise from the same clinical trial.
DISCUSSION: Although painstaking examination of the full text may be needed to reach a conclusive determination, metadata are surprisingly accurate in predicting when two articles derive from the same underlying clinical trial.
Copyright © 2014 Elsevier Inc. All rights reserved.
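
The pipeline described in the abstract (pairwise metadata similarity, a logistic model, then clustering) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the three features, the record fields, the weights and bias, the 0.5 threshold, and the use of connected components for clustering are hypothetical and are not taken from the paper, which does not specify them in the abstract.

```python
import math
from itertools import combinations

# Hypothetical article records; field names are illustrative, not the MEDLINE schema.
ARTICLES = {
    "A": {"authors": {"smith j", "lee k", "patel r"},
          "title": "Olanzapine versus placebo in schizophrenia: a randomized trial",
          "year": 2010},
    "B": {"authors": {"smith j", "lee k", "chen m"},
          "title": "Olanzapine versus placebo in schizophrenia: long-term outcomes",
          "year": 2011},
    "C": {"authors": {"garcia a", "novak p"},
          "title": "Risperidone for acute schizophrenia in adolescents",
          "year": 2004},
}

# Made-up coefficients standing in for a fitted logistic regression model.
WEIGHTS = [4.0, 3.0, 1.0]
BIAS = -3.0


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(x, y):
    """Assumed pairwise similarity features: authors, title words, year proximity."""
    return [
        jaccard(x["authors"], y["authors"]),
        jaccard(set(x["title"].lower().split()), set(y["title"].lower().split())),
        1.0 / (1.0 + abs(x["year"] - y["year"])),
    ]


def same_trial_probability(x, y):
    """Logistic function applied to a weighted sum of the pairwise features."""
    z = BIAS + sum(w * f for w, f in zip(WEIGHTS, pair_features(x, y)))
    return 1.0 / (1.0 + math.exp(-z))


def cluster(articles, threshold=0.5):
    """Group articles by linking every pair scored at or above the threshold.

    Uses union-find to form connected components; this is an assumed
    clustering rule, chosen only because it is simple to show.
    """
    parent = {key: key for key in articles}

    def find(key):
        while parent[key] != key:
            parent[key] = parent[parent[key]]  # path compression
            key = parent[key]
        return key

    for a, b in combinations(articles, 2):
        if same_trial_probability(articles[a], articles[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for key in articles:
        groups.setdefault(find(key), []).append(key)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    print(cluster(ARTICLES))
```

With these toy records, articles A and B share authors and title wording and are grouped together, while C remains in its own cluster. In practice the coefficients would be learned from labeled pairs, such as pairs that do or do not share a trial registry number, rather than set by hand.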

Keywords:  Bias; Clinical trials; Evidence-based medicine; Informatics; Information retrieval; Systematic reviews

Year: 2014; PMID: 25461812; PMCID: PMC4339517; DOI: 10.1016/j.ymeth.2014.11.006

Source DB: PubMed; Journal: Methods; ISSN: 1046-2023; Impact factor: 3.608

