tmChem: a high performance approach for chemical named entity recognition and normalization.

Robert Leaman, Chih-Hsuan Wei, Zhiyong Lu.

Abstract

Chemical compounds and drugs are an important class of entities in biomedical research, with great potential in a wide range of applications, including clinical medicine. Locating chemical named entities in the literature is a useful step in chemical text mining pipelines for identifying chemical mentions, their properties, and their relationships as discussed in the literature. We introduce the tmChem system, a chemical named entity recognizer created by combining two independent machine learning models in an ensemble. We use the corpus released as part of the recent CHEMDNER task to develop and evaluate tmChem, achieving a micro-averaged f-measure of 0.8739 on the CEM (chemical entity mention) subtask, which is evaluated at the mention level, and an f-measure of 0.8745 on the CDI (chemical document indexing) subtask, which is evaluated at the abstract level. We also report a high-recall combination (0.9212 for CEM and 0.9224 for CDI). tmChem achieved the highest f-measure reported in the CHEMDNER task for the CEM subtask, and the high-recall variant achieved the highest recall on both the CEM and CDI subtasks. We report that tmChem is a state-of-the-art tool for chemical named entity recognition and that performance for chemical named entity recognition now matches (or exceeds) the performance previously reported for genes and diseases. Future research should focus on tighter integration between the named entity recognition and normalization steps for improved performance. The source code and a trained model for both models of tmChem are available at: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmChem. The results of running tmChem (Model 2) on PubMed are available in PubTator: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/PubTator.
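
To make the evaluation metric in the abstract concrete, here is a minimal sketch of how a micro-averaged f-measure is commonly computed for mention-level evaluation: true positives, false positives, and false negatives are pooled over all documents before precision, recall, and f-measure are calculated once. The function name `micro_prf`, the representation of a mention as a pair of character offsets, and the toy data are assumptions made for illustration. They are not taken from tmChem or from the official CHEMDNER evaluation script.

```python
from typing import Dict, Set, Tuple

# Assumption: a mention is a (start_offset, end_offset) pair,
# and dictionary keys are document identifiers.
Mention = Tuple[int, int]


def micro_prf(gold: Dict[str, Set[Mention]],
              pred: Dict[str, Set[Mention]]) -> Tuple[float, float, float]:
    """Pool counts over all documents, then compute precision,
    recall, and f-measure once (micro-averaging)."""
    tp = fp = fn = 0
    for doc_id in set(gold) | set(pred):
        g = gold.get(doc_id, set())
        p = pred.get(doc_id, set())
        tp += len(g & p)  # predicted mentions that exactly match gold
        fp += len(p - g)  # predicted mentions absent from gold
        fn += len(g - p)  # gold mentions that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure


# Hypothetical toy annotations, not drawn from the CHEMDNER corpus.
gold = {"doc1": {(0, 7), (20, 29)}, "doc2": {(5, 12)}}
pred = {"doc1": {(0, 7)}, "doc2": {(5, 12), (30, 34)}}
precision, recall, f_measure = micro_prf(gold, pred)
```

Because the counts are pooled, documents with many mentions weigh more heavily than they would under macro-averaging, where a score is computed per document and then averaged.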

Year:  2015        PMID: 25810774      PMCID: PMC4331693          DOI: 10.1186/1758-2946-7-S1-S3

Source DB:  PubMed          Journal:  J Cheminform        ISSN: 1758-2946            Impact factor:   5.514


