
NLM-Chem, a new resource for chemical entity recognition in PubMed full text literature.

Rezarta Islamaj, Robert Leaman, Sun Kim, Dongseop Kwon, Chih-Hsuan Wei, Donald C Comeau, Yifan Peng, David Cissel, Cathleen Coss, Carol Fisher, Rob Guzman, Preeti Gokal Kochar, Stella Koppel, Dorothy Trinh, Keiko Sekiya, Janice Ward, Deborah Whitman, Susan Schmidt, Zhiyong Lu.

Abstract

Automatically identifying chemical and drug names in scientific publications advances information access for this important class of entities in a variety of biomedical disciplines by enabling improved retrieval and linkage to related concepts. While current methods for tagging chemical entities were developed for the article title and abstract, their performance in the full article text is substantially lower. However, the full text frequently contains more detailed chemical information, such as the properties of chemical compounds, their biological effects and interactions with diseases, genes and other chemicals. We therefore present the NLM-Chem corpus, a full-text resource to support the development and evaluation of automated chemical entity taggers. The NLM-Chem corpus consists of 150 full-text articles, doubly annotated by ten expert NLM indexers, with ~5000 unique chemical name annotations, mapped to ~2000 MeSH identifiers. We also describe a substantially improved chemical entity tagger, with automated annotations for all of PubMed and PMC freely accessible through the PubTator web-based interface and API. The NLM-Chem corpus is freely available.


Year: 2021    PMID: 33767203    DOI: 10.1038/s41597-021-00875-1

Source DB: PubMed    Journal: Sci Data    ISSN: 2052-4463    Impact factor: 6.444

