Duy Duc An Bui1, Siddhartha Jonnalagadda2, Guilherme Del Fiol3. 1. Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA; Department of Preventive Medicine-Health and Biomedical Informatics, Northwestern University, Chicago, IL, USA. Electronic address: duy.bui@utah.edu. 2. Department of Preventive Medicine-Health and Biomedical Informatics, Northwestern University, Chicago, IL, USA. 3. Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA.
Abstract
OBJECTIVE: Literature database search is a crucial step in the development of clinical practice guidelines and systematic reviews. Despite advances in information technology, literature search is still conducted manually, making it costly, slow, and subject to human error. In this research, we sought to improve on the traditional search approach with novel query expansion and citation ranking methods.

METHODS: We developed a citation retrieval system composed of query expansion and citation ranking methods. The methods are unsupervised and easily integrated with the PubMed search engine. To validate the system, we developed a gold standard consisting of citations that were systematically searched and screened to support the development of cardiovascular clinical practice guidelines. The expansion and ranking methods were evaluated separately and compared with baseline approaches.

RESULTS: Compared with the baseline PubMed expansion, the query expansion algorithm improved recall (80.2% vs. 51.5%) with a small loss in precision (0.4% vs. 0.6%). The algorithm found all citations supporting a larger number of guideline recommendations than the baseline approach (64.5% vs. 37.2%, p<0.001). In addition, the citation ranking approach outperformed PubMed's "most recent" ranking (average precision +6.5%, recall@k +21.1%, p<0.001), PubMed's ranking by "relevance" (average precision +6.1%, recall@k +14.8%, p<0.001), and a machine learning classifier that identifies scientifically sound studies in MEDLINE citations (average precision +4.9%, recall@k +4.2%, p<0.001).

CONCLUSIONS: Our unsupervised query expansion and ranking techniques are more flexible and effective than PubMed's default search behavior and the machine learning classifier. Automated citation retrieval is a promising complement to traditional literature search.
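The evaluation metrics reported above (precision, recall, recall@k, and average precision over a ranked citation list) can be computed against a gold-standard set of PMIDs. The following is a minimal illustrative sketch, not the authors' evaluation code; the function names and the toy PMIDs are hypothetical.

```python
def precision_recall(retrieved, gold):
    """Set-based precision and recall of a retrieved citation set."""
    retrieved, gold = set(retrieved), set(gold)
    hits = retrieved & gold
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(gold) if gold else 0.0
    return precision, recall

def recall_at_k(ranked, gold, k):
    """Fraction of gold citations found among the top-k ranked results."""
    gold = set(gold)
    return len(set(ranked[:k]) & gold) / len(gold) if gold else 0.0

def average_precision(ranked, gold):
    """Mean of precision values at each rank where a gold citation appears."""
    gold = set(gold)
    hits, total = 0, 0.0
    for rank, pmid in enumerate(ranked, start=1):
        if pmid in gold:
            hits += 1
            total += hits / rank
    return total / len(gold) if gold else 0.0

# Toy example: 4 ranked citations, 2 of which are in the gold standard.
ranked = ["pmid1", "pmid2", "pmid3", "pmid4"]
gold = {"pmid2", "pmid4"}
print(precision_recall(ranked, gold))   # (0.5, 1.0)
print(recall_at_k(ranked, gold, 2))     # 0.5
print(average_precision(ranked, gold))  # (1/2 + 2/4) / 2 = 0.5
```

Average precision rewards rankings that place gold citations near the top, which is why it is a natural choice for comparing the ranking method against PubMed's "most recent" and "relevance" orderings.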