Document classification for mining host pathogen protein-protein interactions.

Lanlan Yin, Guixian Xu, Manabu Torii, Zhendong Niu, Jose M Maisog, Cathy Wu, Zhangzhi Hu, Hongfang Liu.

Abstract

OBJECTIVE: Scientific findings regarding human pathogens and their host responses are buried in the growing volume of biomedical literature, and there is an urgent need to mine information pertaining to pathogenesis-related proteins, especially host-pathogen protein-protein interactions (HP-PPIs), from the literature.
METHODS: We report the development of an automated system to identify MEDLINE abstracts that refer to HP-PPIs. We generated an annotated corpus of 1360 MEDLINE abstracts and used it to develop and evaluate document classification systems based on support vector machines (SVMs). We also investigated the effects of three feature selection methods: information gain (IG), the chi-square (χ²) test, and specific mutual information (SI). Performance was measured using normalized discounted cumulative gain (NDCG) and positive predictive value (PPV); all measures were obtained through 10-fold cross-validation.
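To illustrate the term-scoring step behind feature selection, the sketch below computes two of the three scores named in the methods, information gain and the chi-square statistic, from a 2x2 term/class contingency table. It is a minimal sketch and not the authors' implementation: the function names, the set-of-tokens document representation, and the tie-breaking rule are hypothetical choices, and SI is left out because the abstract does not give its formula.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/class table.

    a: positive docs containing the term   b: negative docs containing the term
    c: positive docs lacking the term      d: negative docs lacking the term
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a constant term or class carries no association
    return n * (a * d - b * c) ** 2 / denom


def information_gain(a, b, c, d):
    """Reduction in class entropy from knowing whether the term is present."""
    n = a + b + c + d
    if n == 0:
        return 0.0
    h_class = entropy([a + c, b + d])
    h_given_term = ((a + b) / n) * entropy([a, b]) + ((c + d) / n) * entropy([c, d])
    return h_class - h_given_term


def contingency(docs, labels, term):
    """Count the four cells (a, b, c, d) for one term over a labeled corpus."""
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        present = term in tokens
        if present and label:
            a += 1
        elif present:
            b += 1
        elif label:
            c += 1
        else:
            d += 1
    return a, b, c, d


def rank_terms(docs, labels, score):
    """Score every vocabulary term; highest first, ties broken alphabetically."""
    vocabulary = set().union(*docs)
    scored = [(t, score(*contingency(docs, labels, t))) for t in vocabulary]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))
```

Calling `rank_terms(docs, labels, chi_square)` with `docs` as a list of token sets and `labels` as 0/1 flags returns `(term, score)` pairs; keeping only the top-scoring terms is one way to obtain a reduced feature set for the classifier.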
RESULTS: NDCG for classification systems using all features, or a subset of features selected using IG or the chi-square (χ²) test, ranged from 0.83 to 0.89, while systems built on features selected using SI had relatively lower NDCG. The classification system achieved a PPV of 50.7% for the top 10% of ranked documents, compared with a baseline PPV of 10.0%.
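The two evaluation measures reported here can be sketched for a ranked list of documents as follows. This is an illustration under stated assumptions, not the authors' evaluation code: it uses the common binary-relevance form of NDCG with a log2 rank discount over the full ranking, the abstract does not say which variant was used, and the function names and the rounding rule for the cutoff are hypothetical.

```python
import math


def rank_labels(scores, labels):
    """Return the 0/1 labels ordered by descending classifier score."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    return [label for _, label in pairs]


def dcg(relevances):
    """Discounted cumulative gain: relevance at rank r is divided by log2(r + 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(scores, labels):
    """DCG of the actual ranking divided by the DCG of an ideal ranking."""
    ideal = dcg(sorted(labels, reverse=True))
    if ideal == 0:
        return 0.0  # no relevant documents, so NDCG is undefined; report 0
    return dcg(rank_labels(scores, labels)) / ideal


def ppv_at_top(scores, labels, fraction=0.10):
    """Fraction of true positives among the top `fraction` of ranked documents."""
    ranked = rank_labels(scores, labels)
    # Round to the nearest whole document count, keeping at least one.
    k = max(1, round(fraction * len(ranked)))
    return sum(ranked[:k]) / k
```

Under this reading, a plausible baseline for comparison is the proportion of positive documents in the whole set, `sum(labels) / len(labels)`, which is what a random ranking would be expected to achieve at any cutoff.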
CONCLUSIONS: Our results indicate that document classification systems can be constructed to efficiently retrieve HP-PPI-related documents. Feature selection was effective in reducing feature dimensionality, allowing a more compact system to be built.

Year:  2010        PMID: 20472411      PMCID: PMC2902599          DOI: 10.1016/j.artmed.2010.04.003

Source DB:  PubMed          Journal:  Artif Intell Med        ISSN: 0933-3657            Impact factor:   5.326

