
Prospective validation of text categorization filters for identifying high-quality, content-specific articles in MEDLINE.

Yindalon Aphinyanaphongs, Constantin Aliferis.

Abstract

In prior work, we introduced a machine learning method to identify high quality MEDLINE documents in internal medicine. The performance of the original filter models built with this corpus on years outside 1998-2000 was not assessed directly. Validating the performance of the original filter models on current corpora is crucial to validate them for use in current years, to verify that the model fitting and model error estimation procedures do not over-fit the models, and to validate consistency of the chosen ACPJ gold standard (i.e., that ACPJ editorial policies and criteria are stable over time). Our prospective validation results indicated that in the categories of treatment, etiology, diagnosis, and prognosis, the original machine learning filter models built from the 1998-2000 corpora maintained their discriminatory performance of 0.97, 0.97, 0.94, and 0.94 area under the curve in each respective category when applied to a 2005 corpus. The ACPJ is a stable, reliable gold standard and the machine learning methodology provides robust models and model performance estimates. Machine learning filter models built with 1998-2000 corpora can be applied to identify high quality articles in recent years.
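
The abstract reports discriminatory performance as area under the ROC curve (AUC) when earlier-trained filters are applied to a later corpus. As a minimal sketch of how such a figure is computed in general, the snippet below evaluates AUC from labels and filter scores using the rank-based (Mann-Whitney) formulation. It is not the authors' implementation: the function name `roc_auc`, the variable names, and the example labels and scores are all hypothetical and purely illustrative.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Equals the probability that a randomly chosen positive document
    receives a higher score than a randomly chosen negative one,
    with ties counted as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores from a filter fitted on an earlier corpus and
# applied, unchanged, to a later one (illustrative values only).
later_labels = [1, 1, 0, 1, 0, 0]   # 1 = high-quality article, 0 = other
later_scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.1]
auc = roc_auc(later_labels, later_scores)
```

In a prospective validation of the kind described, the filter would be left untouched after training and only scored against the later corpus, so that the resulting AUC can be compared with the estimate obtained on the original data.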

Year:  2006        PMID: 17238292      PMCID: PMC1839419     

Source DB:  PubMed          Journal:  AMIA Annu Symp Proc        ISSN: 1559-4076


  6 in total

1.  Robustness of empirical search strategies for clinical content in MEDLINE.

Authors:  Nancy L Wilczynski; R Brian Haynes
Journal:  Proc AMIA Symp       Date:  2002

2.  Developing optimal search strategies for detecting clinically sound causation studies in MEDLINE.

Authors:  Nancy L Wilczynski; R Brian Haynes
Journal:  AMIA Annu Symp Proc       Date:  2003

3.  Optimal search strategies for retrieving scientifically strong studies of diagnosis from Medline: analytical survey.

Authors:  R Brian Haynes; Nancy L Wilczynski
Journal:  BMJ       Date:  2004-04-08

4.  Text categorization models for high-quality article retrieval in internal medicine.

Authors:  Yindalon Aphinyanaphongs; Ioannis Tsamardinos; Alexander Statnikov; Douglas Hardin; Constantin F Aliferis
Journal:  J Am Med Inform Assoc       Date:  2004-11-23       Impact factor: 4.497

5.  Optimal search strategies for detecting clinically sound prognostic studies in EMBASE: an analytic survey.

Authors:  Nancy L Wilczynski; R Brian Haynes
Journal:  J Am Med Inform Assoc       Date:  2005-03-31       Impact factor: 4.497

6.  Challenges in the analysis of mass-throughput data: a technical commentary from the statistical machine learning perspective.

Authors:  Constantin F Aliferis; Alexander Statnikov; Ioannis Tsamardinos
Journal:  Cancer Inform       Date:  2007-02-16
  2 in total

1.  Classifying publications from the clinical and translational science award program along the translational research spectrum: a machine learning approach.

Authors:  Alisa Surkis; Janice A Hogle; Deborah DiazGranados; Joe D Hunt; Paul E Mazmanian; Emily Connors; Kate Westaby; Elizabeth C Whipple; Trisha Adamus; Meridith Mueller; Yindalon Aphinyanaphongs
Journal:  J Transl Med       Date:  2016-08-05       Impact factor: 5.531

2.  A Deep Learning Approach to Refine the Identification of High-Quality Clinical Research Articles From the Biomedical Literature: Protocol for Algorithm Development and Validation.

Authors:  Sophia Ananiadou; Wael Abdelkader; Tamara Navarro; Rick Parrish; Chris Cotoi; Federico Germini; Lori-Ann Linkins; Alfonso Iorio; R Brian Haynes; Lingyang Chu; Cynthia Lokker
Journal:  JMIR Res Protoc       Date:  2021-11-29
