Bio-medical entity extraction using support vector machines.

Koichi Takeuchi, Nigel Collier.

Abstract

OBJECTIVE: Support vector machines (SVMs) have achieved state-of-the-art performance in several classification tasks. In this article we apply them to the identification and semantic annotation of scientific and technical terminology in the domain of molecular biology. This illustrates the extensibility of the traditional named entity task to special domains with large-scale terminologies, such as those in medicine and related disciplines.
METHODS AND MATERIALS: The foundation for the model is a sample of text annotated by a domain expert according to an ontology of concepts, properties and relations. The model then learns to annotate unseen terms in new texts and contexts. The results can be used for a variety of intelligent language processing applications. We illustrate the capabilities of SVMs using a sample of 100 journal abstracts taken from the {human, blood cell, transcription factor} domain of MEDLINE.
RESULTS: Approximately 3400 terms are annotated and the model performs at about 74% F-score on cross-validation tests. A detailed analysis based on empirical evidence shows the contribution of various feature sets to performance.
CONCLUSION: Our experiments indicate a relationship between feature window size and the amount of training data, and show that a combination of surface words, orthographic features and head noun features achieves the best performance among the feature sets tested.
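
The abstract names surface words, orthographic features and a feature window but does not define them. The following is a minimal sketch of how window-based token features of this kind might be built, not the authors' implementation. The orthographic classes, the function names `orthographic_class` and `window_features`, the `<PAD>` marker and the default window size are all assumptions made for illustration; head noun features are not covered.

```python
import re


def orthographic_class(token: str) -> str:
    """Map a token to a coarse orthographic class (hypothetical class set)."""
    if re.fullmatch(r"\d+", token):
        return "Digits"
    if re.fullmatch(r"[A-Z]+", token):
        return "AllCaps"
    if re.search(r"\d", token) and re.search(r"[A-Za-z]", token):
        return "AlphaDigit"
    if "-" in token:
        return "Hyphenated"
    if re.fullmatch(r"[A-Z][a-z]+", token):
        return "InitCap"
    if re.fullmatch(r"[a-z]+", token):
        return "Lower"
    return "Other"


def window_features(tokens, i, window=2):
    """Collect surface-word and orthographic features for token i
    and its neighbours within +/- `window` positions."""
    feats = {}
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(tokens):
            feats[f"word[{offset}]"] = tokens[j].lower()
            feats[f"orth[{offset}]"] = orthographic_class(tokens[j])
        else:
            # Positions outside the sentence get an explicit padding value.
            feats[f"word[{offset}]"] = "<PAD>"
            feats[f"orth[{offset}]"] = "PAD"
    return feats


tokens = ["IL-2", "gene", "expression", "in", "T", "cells"]
features = window_features(tokens, 0, window=1)
```

Each token yields one such feature dictionary. These could be one-hot encoded and passed, together with the expert-assigned class labels, to an SVM classifier. Widening `window` adds more context features per token, which is the kind of trade-off against training-set size that the conclusion refers to.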

Year:  2005        PMID: 15811781     DOI: 10.1016/j.artmed.2004.07.019

Source DB:  PubMed          Journal:  Artif Intell Med        ISSN: 0933-3657            Impact factor:   5.326


  6 in total

1.  Accurate prediction of coronary artery disease using reliable diagnosis system.

Authors:  Indrajit Mandal; N Sairam
Journal:  J Med Syst       Date:  2012-02-12       Impact factor: 4.460

2.  Framework for automatic information extraction from research papers on nanocrystal devices.

Authors:  Thaer M Dieb; Masaharu Yoshioka; Shinjiro Hara; Marcus C Newton
Journal:  Beilstein J Nanotechnol       Date:  2015-09-07       Impact factor: 3.649

3.  Collecting specialty-related medical terms: Development and evaluation of a resource for Spanish.

Authors:  Pilar López-Úbeda; Alexandra Pomares-Quimbaya; Manuel Carlos Díaz-Galiano; Stefan Schulz
Journal:  BMC Med Inform Decis Mak       Date:  2021-05-04       Impact factor: 2.796

4.  Hierarchical network analysis of co-occurring bioentities in literature.

Authors:  Heejung Yang; Namgil Lee; Beomjun Park; Jinyoung Park; Jiho Lee; Hyeon Seok Jang; Hojin Yoo
Journal:  Sci Rep       Date:  2022-05-12       Impact factor: 4.996

5.  Corpus refactoring: a feasibility study.

Authors:  Helen L Johnson; William A Baumgartner; Martin Krallinger; K Bretonnel Cohen; Lawrence Hunter
Journal:  J Biomed Discov Collab       Date:  2007-09-13

6.  Identifying medical terms in patient-authored text: a crowdsourcing-based approach.

Authors:  Diana Lynn MacLean; Jeffrey Heer
Journal:  J Am Med Inform Assoc       Date:  2013-05-05       Impact factor: 4.497
