
Reproducibility in Natural Language Processing: A Case Study of Two R Libraries for Mining PubMed/MEDLINE.

K Bretonnel Cohen, Jingbo Xia, Christophe Roeder, Lawrence E Hunter.

Abstract

There is currently a crisis in science related to highly publicized failures to reproduce large numbers of published studies. The current work proposes, by way of case studies, a methodology for moving the study of reproducibility in computational work to a full stage beyond that of earlier work. Specifically, it presents a case study in attempting to reproduce the reports of two R libraries for doing text mining of the PubMed/MEDLINE repository of scientific publications. The main findings are that a rational paradigm for reproduction of natural language processing papers can be established; the advertised functionality was difficult, but not impossible, to reproduce; and reproducibility studies can produce additional insights into the functioning of the published system. Additionally, the work on reproducibility led to the production of novel user-centered documentation that has been accessed 260 times since its publication, an average of once a day per library.


Keywords:  PubMed/MEDLINE; natural language processing; reproducibility

Year:  2016        PMID: 29568821      PMCID: PMC5860830     

Source DB:  PubMed          Journal:  LREC Int Conf Lang Resour Eval


