Yongjing Lin (1), Wenyuan Li, Keke Chen, Ying Liu.
(1) Laboratory for Bioinformatics and Medical Informatics, Department of Computer Science, University of Texas at Dallas, Richardson, TX 75083-0688, USA.
Abstract
OBJECTIVE: A major problem in biomedical informatics is how best to present information retrieval results. When a single query retrieves many results, showing them as one long list often provides a poor overview. With the goal of presenting users with reduced sets of relevant citations, this study developed an approach that retrieved MEDLINE citations, organized them into topical groups, and prioritized the important citations in each group.
DESIGN: A text mining framework for automatic document clustering and ranking organized the MEDLINE citations returned by simple PubMed queries. The system grouped the retrieved citations, ranked the citations in each cluster, and generated a set of keywords and MeSH terms to describe the common theme of each cluster.
MEASUREMENTS: Several candidate ranking functions were compared, including citation count per year (CCPY), citation count (CC), and journal impact factor (JIF). The framework was evaluated by treating as "important" those articles selected by the Surgical Oncology Society.
RESULTS: CCPY outperformed CC and JIF; that is, CCPY ranked the important articles higher than the other two functions did. Furthermore, the text clustering and knowledge extraction strategy grouped the retrieval results into informative clusters, as revealed by the keywords and MeSH terms extracted from the documents in each cluster.
CONCLUSIONS: The text mining system effectively integrated text clustering, text summarization, and text ranking, and organized MEDLINE retrieval results into different topical groups.
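The within-cluster ranking described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Citation` record, its field names, and the helper functions are hypothetical, and the abstract does not state how the number of years is counted. The sketch assumes CCPY is the citation count divided by the number of years since publication, counting the publication year itself as one year.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    # Hypothetical record; field names are illustrative, not from the paper.
    pmid: str
    pub_year: int
    citation_count: int
    cluster: int  # cluster label assigned by a prior clustering step


def ccpy(c: Citation, current_year: int) -> float:
    # Assumption: the publication year counts as one full year,
    # which also avoids division by zero for very recent articles.
    years = max(current_year - c.pub_year + 1, 1)
    return c.citation_count / years


def rank_by_ccpy(citations, current_year):
    # Group citations by cluster, then sort each group by descending CCPY.
    clusters = {}
    for c in citations:
        clusters.setdefault(c.cluster, []).append(c)
    return {
        label: sorted(members, key=lambda c: ccpy(c, current_year), reverse=True)
        for label, members in clusters.items()
    }


# Made-up example data, for illustration only.
example = [
    Citation("A", 2000, 100, 0),
    Citation("B", 2008, 40, 0),
    Citation("C", 2005, 10, 1),
]
ranked = rank_by_ccpy(example, current_year=2009)
```

In this made-up example, raw citation count (CC) would place the older article "A" first, whereas CCPY places the more recent article "B" first, because dividing by age keeps older articles from dominating simply by having had more time to accumulate citations.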
Authors: Yindalon Aphinyanaphongs; Ioannis Tsamardinos; Alexander Statnikov; Douglas Hardin; Constantin F Aliferis Journal: J Am Med Inform Assoc Date: 2004-11-23 Impact factor: 4.497
Authors: Ying Liu; Shamkant B Navathe; Alex Pivoshenko; Venu G Dasigi; Ray Dingledine; Brian J Ciliax Journal: Int J Data Min Bioinform Date: 2006 Impact factor: 0.667
Authors: Yan Wu; Zhaojing Zhong; James Huber; Rajiv Bassi; Bridget Finnerty; Erik Corcoran; Huiling Li; Elizabeth Navarro; Paul Balderes; Xenia Jimenez; Henry Koo; Venkata R M Mangalampalli; Dale L Ludwig; James R Tonra; Daniel J Hicklin Journal: Clin Cancer Res Date: 2006-11-01 Impact factor: 12.531
Authors: Weixiang Shao; Clive E Adams; Aaron M Cohen; John M Davis; Marian S McDonagh; Sujata Thakurta; Philip S Yu; Neil R Smalheiser Journal: Methods Date: 2014-11-20 Impact factor: 3.608