Toward an interactive article: integrating journals and biological databases.

Arun Rangarajan, Tim Schedl, Karen Yook, Juancarlos Chan, Stephen Haenel, Lolly Otis, Sharon Faelten, Tracey DePellegrin-Connelly, Ruth Isaacson, Marek S Skrzypek, Steven J Marygold, Raymund Stefancsik, J Michael Cherry, Paul W Sternberg, Hans-Michael Müller.

Abstract

BACKGROUND: Journal articles and databases are two major modes of communication in the biological sciences, so integrating these critical resources is urgently needed to increase the pace of discovery. Projects focused on bridging the gap between journals and databases have been on the rise over the last five years and have produced automated tools that can recognize entities within a document and link them to a relevant database. Unfortunately, automated tools cannot resolve the ambiguities that arise when one term is used to signify quite distinct entities; resolving these requires some manual oversight. Finding the right balance between the speed and portability of automation and the accuracy and flexibility of manual effort is crucial to making text markup a successful venture.
RESULTS: We have established a journal article markup pipeline that links GENETICS journal articles to the model organism database (MOD) WormBase. As a first step, the pipeline uses a lexicon built from entities in the database. The entity markup pipeline produces links for more than nine classes of objects, including genes, proteins, alleles, phenotypes and anatomical terms. New entities and ambiguities are discovered and resolved by a database curator through a manual quality control (QC) step, with help from authors via a web form that the journal provides to them. New entities discovered through this pipeline are immediately sent to an appropriate curator at the database. Ambiguous entities that do not automatically resolve to one link are resolved by hand, ensuring an accurate link. This pipeline has been extended to other databases, namely the Saccharomyces Genome Database (SGD) and FlyBase, and has been used to mark up a paper with links to multiple databases.
CONCLUSIONS: Our semi-automated pipeline hyperlinks articles published in GENETICS to model organism databases such as WormBase. It produces interactive, data-rich articles with highly accurate links. The manual quality control step sets this pipeline apart from other hyperlinking tools and benefits authors, journals, readers and databases.
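
The abstract does not describe how the pipeline is implemented, so the following is only a minimal sketch of the general idea it outlines: match text against a lexicon built from database entities, link terms that resolve to exactly one entry, and set aside ambiguous terms for manual QC. The lexicon entries, identifiers, base URL and the function name mark_up are all invented for illustration and are not taken from WormBase or from the authors' code.

```python
import re

# Hypothetical lexicon: surface term -> list of (entity class, database ID).
# All IDs are placeholders, not real database identifiers.
LEXICON = {
    "unc-13": [("gene", "GENE:0001")],
    "e51": [("allele", "VAR:0002")],
    "daf-2": [("gene", "GENE:0003")],
    # Invented example of a term that maps to two distinct entities.
    "ABa": [("anatomy", "ANAT:0004"), ("cell lineage", "LIN:0005")],
}

BASE_URL = "https://db.example.org/"  # placeholder, not a real database URL


def mark_up(text, lexicon=LEXICON, base_url=BASE_URL):
    """Link unambiguous lexicon terms; collect ambiguous ones for manual QC."""
    # List longer terms first so a short term never pre-empts a longer match.
    terms = sorted(lexicon, key=len, reverse=True)
    # Require that a match is not embedded in a longer word or hyphenated name.
    pattern = re.compile(
        r"(?<![\w-])(" + "|".join(re.escape(t) for t in terms) + r")(?![\w-])"
    )
    needs_review = []

    def replace(match):
        term = match.group(1)
        candidates = lexicon[term]
        if len(candidates) == 1:
            _, db_id = candidates[0]
            return f'<a href="{base_url}{db_id}">{term}</a>'
        # Several candidates: leave the term unlinked and flag it for a curator.
        if term not in needs_review:
            needs_review.append(term)
        return term

    return pattern.sub(replace, text), needs_review
```

Calling mark_up("The unc-13(e51) mutant affects ABa.") would link unc-13 and e51 and return ["ABa"] as the list of terms awaiting a curator's decision. Keeping ambiguous terms unlinked, rather than guessing, mirrors the trade-off the abstract describes between automated speed and manual accuracy.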

Year:  2011        PMID: 21595960      PMCID: PMC3213741          DOI: 10.1186/1471-2105-12-175

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169

