
Empirical investigations into full-text protein interaction Article Categorization Task (ACT) in the BioCreative II.5 Challenge.

Man Lan, Jian Su.

Abstract

Selecting documents that describe protein interactions is an important application in biology research and directly affects the quality of downstream BioNLP applications such as information extraction and retrieval, summarization, and question answering (QA). The BioCreative II.5 Challenge Article Categorization Task (ACT) is a binary text classification task: determine whether a given structured full-text article contains protein interaction information. This may be the first attempt in the wider community at classifying full-text protein interaction documents. In this paper, we compare and evaluate the effectiveness of different section types in full-text articles for text classification. Moreover, in practice, the small number of true-positive samples leads to unstable performance and to unreliable classifiers trained on them. Previous research on learning with skewed class distributions has altered the class distribution using oversampling and downsampling. We also investigate the classification of protein interaction documents under a skewed class distribution and analyze the effect of several factors, including the choice of external sources, oversampling of training sets, and the choice of classifier. We report on these factors to show that 1) a full-text biomedical article contains a wealth of scientific information important to users that may not be completely represented by abstracts and/or keywords, and using it improves classification accuracy, and 2) reinforcing true-positive samples significantly increases the accuracy and stability of classification.
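As an illustration of the oversampling idea mentioned in the abstract, the following is a minimal Python sketch of random oversampling: positive (true-positive) training samples are duplicated until the two classes are the same size. The abstract does not specify the exact scheme the authors used, so the function name `oversample_minority`, the balancing target, and the toy documents are all assumptions made for illustration only.

```python
import random


def oversample_minority(samples, labels, positive_label=1, seed=0):
    """Randomly duplicate positive samples until both classes are equal in size.

    Hypothetical helper for illustration; not the method from the paper.
    """
    rng = random.Random(seed)
    pos = [s for s, y in zip(samples, labels) if y == positive_label]
    neg = [s for s, y in zip(samples, labels) if y != positive_label]
    # Nothing to do if there are no positives or they are not the minority.
    if not pos or len(pos) >= len(neg):
        return list(samples), list(labels)
    # Draw duplicates (with replacement) from the positive samples.
    extra = [rng.choice(pos) for _ in range(len(neg) - len(pos))]
    out_samples = list(samples) + extra
    out_labels = list(labels) + [positive_label] * len(extra)
    return out_samples, out_labels


# Toy training set (invented): one positive document, five negatives.
docs = [
    "protein A binds protein B",
    "expression levels were measured",
    "cells were cultured overnight",
    "samples were stored at low temperature",
    "the growth medium was replaced",
    "statistical analysis was performed",
]
labels = [1, 0, 0, 0, 0, 0]

balanced_docs, balanced_labels = oversample_minority(docs, labels)
```

The balanced lists can then be passed to any classifier. Because duplicates are exact copies, this changes only the class distribution seen during training, not the information available to the model.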


Year:  2010        PMID: 20671314     DOI: 10.1109/TCBB.2010.49

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  5 in total

1.  Text mining for modeling of protein complexes enhanced by machine learning.

Authors:  Varsha D Badal; Petras J Kundrotas; Ilya A Vakser
Journal:  Bioinformatics       Date:  2021-05-01       Impact factor: 6.937

2.  Integrating image caption information into biomedical document classification in support of biocuration.

Authors:  Xiangying Jiang; Pengyuan Li; James Kadin; Judith A Blake; Martin Ringwald; Hagit Shatkay
Journal:  Database (Oxford)       Date:  2020-01-01       Impact factor: 3.451

3.  Simple and efficient machine learning frameworks for identifying protein-protein interaction relevant articles and experimental methods used to study the interactions.

Authors:  Shashank Agarwal; Feifan Liu; Hong Yu
Journal:  BMC Bioinformatics       Date:  2011-10-03       Impact factor: 3.169

4.  Detection of interaction articles and experimental methods in biomedical literature.

Authors:  Gerold Schneider; Simon Clematide; Fabio Rinaldi
Journal:  BMC Bioinformatics       Date:  2011-10-03       Impact factor: 3.169

5.  Micropublications: a semantic model for claims, evidence, arguments and annotations in biomedical communications.

Authors:  Tim Clark; Paolo N Ciccarese; Carole A Goble
Journal:  J Biomed Semantics       Date:  2014-07-04
