Mohammad Amin Morid (1), Siddhartha Jonnalagadda (2), Marcelo Fiszman (3), Kalpana Raja (2), Guilherme Del Fiol (4)
(1) Department of Operations and Information Systems, David Eccles School of Business, University of Utah, Salt Lake City, UT, USA.
(2) Department of Preventive Medicine, Division of Health and Biomedical Informatics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.
(3) Lister Hill Center, National Library of Medicine, Bethesda, MD, USA.
(4) Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA.
Abstract
OBJECTIVE: In a previous study, we investigated a sentence classification model that uses semantic features to extract clinically useful sentences from UpToDate, a synthesized clinical evidence resource. In the present study, we assess the generalizability of the sentence classifier to Medline abstracts.
METHODS: We applied the classification model to an independent gold standard of high-quality clinical studies from Medline. We then optimized the classifier, originally trained on UpToDate sentences, by retraining it on Medline abstracts and adding a sentence location feature.
RESULTS: The previous classifier yielded an F-measure of 58% on Medline versus 67% on UpToDate. Retraining the classifier on Medline improved the F-measure to 68%, and adding the sentence location feature improved it further to 76% (p<0.01).
CONCLUSIONS: The classifier's model and input features generalized to Medline abstracts, but the classifier needed to be retrained on Medline to achieve equivalent performance. Sentence location made an additional contribution to overall classification performance.
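To make the two quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how sentence location was encoded, so `relative_location` (position scaled to the range 0 to 1) is an assumption chosen for illustration. `f_measure` is the standard balanced F-measure (harmonic mean of precision and recall) over binary labels, where 1 marks a clinically useful sentence; both function names are hypothetical.

```python
from typing import List


def relative_location(index: int, n_sentences: int) -> float:
    """Position of a sentence within its abstract, scaled to [0, 1].

    Assumed encoding: 0.0 for the first sentence, 1.0 for the last.
    """
    if n_sentences <= 0 or not 0 <= index < n_sentences:
        raise ValueError("index must satisfy 0 <= index < n_sentences")
    if n_sentences == 1:
        return 0.0
    return index / (n_sentences - 1)


def f_measure(gold: List[int], predicted: List[int]) -> float:
    """Balanced F-measure for binary labels (1 = useful sentence)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # true positives
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # false positives
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # false negatives
    if tp == 0:
        # Precision or recall is zero (or undefined), so report 0.0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In a pipeline of this kind, the location value would be appended to each sentence's other features before training, and the F-measure would be computed on held-out sentences against the gold standard labels.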