Chih-Hsuan Wei1, Robert Leaman1, Zhiyong Lu1. 1. National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), Bethesda, MD 20894, USA.
Abstract
UNLABELLED: The biomedical literature is a knowledge-rich resource and an important foundation for future research. With over 24 million articles in PubMed and an increasing growth rate, research in automated text processing is becoming increasingly important. We report here our recently developed web-based text mining services for biomedical concept recognition and normalization. Unlike most text-mining software tools, our web services integrate several state-of-the-art entity tagging systems (DNorm, GNormPlus, SR4GN, tmChem and tmVar) and offer a batch-processing mode able to process arbitrary text input (e.g. scholarly publications, patents and medical records) in multiple formats (e.g. BioC). We support multiple standards to make our service interoperable and allow simpler integration with other text-processing pipelines. To maximize scalability, we have preprocessed all PubMed articles, and use a computer cluster for processing large requests of arbitrary text. AVAILABILITY AND IMPLEMENTATION: Our text-mining web service is freely available at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/#curl CONTACT: : Zhiyong.Lu@nih.gov. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the US.
UNLABELLED: The biomedical literature is a knowledge-rich resource and an important foundation for future research. With over 24 million articles in PubMed and an increasing growth rate, research in automated text processing is becoming increasingly important. We report here our recently developed web-based text mining services for biomedical concept recognition and normalization. Unlike most text-mining software tools, our web services integrate several state-of-the-art entity tagging systems (DNorm, GNormPlus, SR4GN, tmChem and tmVar) and offer a batch-processing mode able to process arbitrary text input (e.g. scholarly publications, patents and medical records) in multiple formats (e.g. BioC). We support multiple standards to make our service interoperable and allow simpler integration with other text-processing pipelines. To maximize scalability, we have preprocessed all PubMed articles, and use a computer cluster for processing large requests of arbitrary text. AVAILABILITY AND IMPLEMENTATION: Our text-mining web service is freely available at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/#curl CONTACT: : Zhiyong.Lu@nih.gov. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the US.
Authors: David Salgado; Martin Krallinger; Marc Depaule; Elodie Drula; Ashish V Tendulkar; Florian Leitner; Alfonso Valencia; Christophe Marcelle Journal: Bioinformatics Date: 2012-07-12 Impact factor: 6.937
Authors: J Gregory Caporaso; William A Baumgartner; David A Randolph; K Bretonnel Cohen; Lawrence Hunter Journal: Bioinformatics Date: 2007-05-11 Impact factor: 6.937
Authors: Alexander A Morgan; Zhiyong Lu; Xinglong Wang; Aaron M Cohen; Juliane Fluck; Patrick Ruch; Anna Divoli; Katrin Fundel; Robert Leaman; Jörg Hakenberg; Chengjie Sun; Heng-hui Liu; Rafael Torres; Michael Krauthammer; William W Lau; Hongfang Liu; Chun-Nan Hsu; Martijn Schuemie; K Bretonnel Cohen; Lynette Hirschman Journal: Genome Biol Date: 2008-09-01 Impact factor: 13.583
Authors: Mustafa H Gunturkun; Efraim Flashner; Tengfei Wang; Megan K Mulligan; Robert W Williams; Pjotr Prins; Hao Chen Journal: G3 (Bethesda) Date: 2022-05-06 Impact factor: 3.542