
Automatic semantic classification of scientific literature according to the hallmarks of cancer.

Simon Baker, Ilona Silins, Yufan Guo, Imran Ali, Johan Högberg, Ulla Stenius, Anna Korhonen.

Abstract

MOTIVATION: The hallmarks of cancer have become highly influential in cancer research. They reduce the complexity of cancer into 10 principles (e.g. resisting cell death and sustaining proliferative signaling) that explain the biological capabilities acquired during the development of human tumors. Since new research depends crucially on existing knowledge, technology for semantic classification of scientific literature according to the hallmarks of cancer could greatly support literature review, knowledge discovery and applications in cancer research.
RESULTS: We present the first step toward the development of such technology. We introduce a corpus of 1499 PubMed abstracts annotated according to the scientific evidence they provide for the 10 currently known hallmarks of cancer. We use this corpus to train a system that classifies PubMed literature according to the hallmarks. The system uses supervised machine learning and rich features largely based on biomedical text mining. We report good performance in both intrinsic and extrinsic evaluations, demonstrating both the accuracy of the methodology and its potential in supporting practical cancer research. We discuss how this approach could be developed and applied further in the future.
AVAILABILITY AND IMPLEMENTATION: The corpus of hallmark-annotated PubMed abstracts and the software for classification are available at: http://www.cl.cam.ac.uk/~sb895/HoC.html.
CONTACT: simon.baker@cl.cam.ac.uk.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
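
The task described in the abstract is multi-label: one abstract may provide evidence for several hallmarks at once, or for none. The sketch below illustrates that setup only. It is not the authors' system: the abstract does not name the learning algorithm or the features, so a one-perceptron-per-label classifier over bag-of-words tokens is used here as a stand-in. The training sentences are invented toy examples, only the two hallmarks named in the abstract are included, and the function names (`tokenize`, `train`, `predict`) are hypothetical.

```python
import re

# Only the two hallmarks named in the abstract; the real task has 10.
HALLMARKS = ["resisting cell death", "sustaining proliferative signaling"]

# Invented toy "abstracts" with their hallmark labels (assumption, not corpus data).
TOY_EXAMPLES = [
    ("apoptosis evasion via bcl2 overexpression",
     {"resisting cell death"}),
    ("growth factor receptor drives proliferation",
     {"sustaining proliferative signaling"}),
    ("bcl2 blocks apoptosis while growth factor signaling drives proliferation",
     {"resisting cell death", "sustaining proliferative signaling"}),
    ("patient cohort survey of diet", set()),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(clf, tokens):
    """Linear score of one binary classifier: bias plus summed token weights."""
    return clf["bias"] + sum(clf["weights"].get(t, 0) for t in tokens)


def train(examples, labels, epochs=10):
    """Train one independent binary perceptron per label (binary relevance)."""
    model = {label: {"weights": {}, "bias": 0} for label in labels}
    for _ in range(epochs):
        for text, gold in examples:
            tokens = tokenize(text)
            for label in labels:
                clf = model[label]
                target = label in gold
                predicted = score(clf, tokens) > 0
                if predicted != target:
                    # Perceptron update: move weights toward the correct side.
                    step = 1 if target else -1
                    for t in tokens:
                        clf["weights"][t] = clf["weights"].get(t, 0) + step
                    clf["bias"] += step
    return model


def predict(model, text):
    """Return the set of labels whose classifier scores the text above zero."""
    tokens = tokenize(text)
    return {label for label, clf in model.items() if score(clf, tokens) > 0}


model = train(TOY_EXAMPLES, HALLMARKS)
print(predict(model, "bcl2 suppresses apoptosis"))
print(predict(model, "patient cohort survey"))
```

Training each label independently keeps the sketch short and lets an abstract receive zero, one, or several hallmark labels. A realistic system would need the annotated corpus and far richer features than raw tokens.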

Year:  2015        PMID: 26454282     DOI: 10.1093/bioinformatics/btv585

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  12 in total

1.  ML-Net: multi-label classification of biomedical texts with deep neural networks.

Authors:  Jingcheng Du; Qingyu Chen; Yifan Peng; Yang Xiang; Cui Tao; Zhiyong Lu
Journal:  J Am Med Inform Assoc       Date:  2019-11-01       Impact factor: 4.497

Review 2.  Reviewing cancer's biology: an eclectic approach.

Authors:  Ibrahim Diori Karidio; Senay Hamarat Sanlier
Journal:  J Egypt Natl Canc Inst       Date:  2021-11-01

3.  An extensive review of tools for manual annotation of documents.

Authors:  Mariana Neves; Jurica Ševa
Journal:  Brief Bioinform       Date:  2021-01-18       Impact factor: 11.622

Review 4.  Drug repurposing in oncology: Compounds, pathways, phenotypes and computational approaches for colorectal cancer.

Authors:  Patrycja Nowak-Sliwinska; Leonardo Scapozza; Ariel Ruiz i Altaba
Journal:  Biochim Biophys Acta Rev Cancer       Date:  2019-04-26       Impact factor: 10.680

Review 5.  Candidate biomarkers in the cervical vaginal fluid for the (self-)diagnosis of cervical precancer.

Authors:  Xaveer Van Ostade; Martin Dom; Wiebren Tjalma; Geert Van Raemdonck
Journal:  Arch Gynecol Obstet       Date:  2017-11-15       Impact factor: 2.344

6.  Text mining for improved exposure assessment.

Authors:  Kristin Larsson; Simon Baker; Ilona Silins; Yufan Guo; Ulla Stenius; Anna Korhonen; Marika Berglund
Journal:  PLoS One       Date:  2017-03-03       Impact factor: 3.240

7.  Cancer Hallmarks Analytics Tool (CHAT): a text mining approach to organize and evaluate scientific literature on cancer.

Authors:  Simon Baker; Imran Ali; Ilona Silins; Sampo Pyysalo; Yufan Guo; Johan Högberg; Ulla Stenius; Anna Korhonen
Journal:  Bioinformatics       Date:  2017-12-15       Impact factor: 6.937

8.  STonKGs: A Sophisticated Transformer Trained on Biomedical Text and Knowledge Graphs.

Authors:  Helena Balabin; Charles Tapley Hoyt; Colin Birkenbihl; Benjamin M Gyori; John Bachman; Alpha Tom Kodamullil; Paul G Plöger; Martin Hofmann-Apitius; Daniel Domingo-Fernández
Journal:  Bioinformatics       Date:  2022-01-05       Impact factor: 6.937

9.  BioVerbNet: a large semantic-syntactic classification of verbs in biomedicine.

Authors:  Olga Majewska; Charlotte Collins; Simon Baker; Jari Björne; Susan Windisch Brown; Anna Korhonen; Martha Palmer
Journal:  J Biomed Semantics       Date:  2021-07-15

10.  LION LBD: a literature-based discovery system for cancer biology.

Authors:  Sampo Pyysalo; Simon Baker; Imran Ali; Stefan Haselwimmer; Tejas Shah; Andrew Young; Yufan Guo; Johan Högberg; Ulla Stenius; Masashi Narita; Anna Korhonen
Journal:  Bioinformatics       Date:  2019-05-01       Impact factor: 6.937
