Predicting potential cancer genes by integrating network properties, sequence features and functional annotations.

Wei Liu, HongWei Xie.

Abstract

The discovery of novel cancer genes is one of the main goals in cancer research. Bioinformatics methods can accelerate cancer gene discovery, which may aid the understanding of cancer and the development of drug targets. In this paper, we describe a classifier for predicting potential cancer genes that integrates multiple types of biological evidence, including protein-protein interaction network properties, sequence features and functional features. We detected 55 features that differed significantly between cancer genes and non-cancer genes. Fourteen cancer-associated features were chosen to train the classifier. Four machine learning methods, logistic regression, support vector machines (SVMs), BayesNet and a J48 decision tree, were explored in the classifier models to distinguish cancer genes from non-cancer genes. The predictive power of the models was evaluated by 5-fold cross-validation. The area under the receiver operating characteristic curve for the logistic regression, SVM, BayesNet and J48 tree models was 0.834, 0.740, 0.800 and 0.782, respectively. Finally, the logistic regression classifier with multiple biological features was applied to the genes in the Entrez database, and 1976 cancer gene candidates were identified. We found that the integrated prediction model performed much better than models based on individual types of biological evidence, and that the network and functional features had stronger predictive power than the sequence features.
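The evaluation scheme described in the abstract (5-fold cross-validation scored by the area under the ROC curve) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the two features are placeholders rather than the fourteen selected features, and the helper names (`roc_auc`, `fit_logistic`, `predict`, `cross_val_auc`) are hypothetical. Only logistic regression is shown, fitted by plain gradient descent.

```python
import math
import random


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def predict(w, x):
    """Logistic score for one sample; the last weight is the bias."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + w[-1]
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Batch gradient descent on the logistic loss (illustrative only)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def cross_val_auc(X, y, k=5, seed=0):
    """Mean AUC over k stratified folds."""
    rng = random.Random(seed)
    pos = [i for i, yi in enumerate(y) if yi == 1]
    neg = [i for i, yi in enumerate(y) if yi == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    # Deal positives and negatives into folds so each fold has both classes.
    folds = [pos[i::k] + neg[i::k] for i in range(k)]
    aucs = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        scores = [predict(w, X[i]) for i in fold]
        aucs.append(roc_auc([y[i] for i in fold], scores))
    return sum(aucs) / k


# Synthetic stand-in data: label 1 plays the role of "cancer gene".
# The first feature is weakly informative, the second is pure noise.
data_rng = random.Random(42)
y = [i % 2 for i in range(200)]
X = [[data_rng.gauss(1.0 if yi else 0.0, 1.0), data_rng.gauss(0.0, 1.0)]
     for yi in y]
mean_auc = cross_val_auc(X, y)
print(round(mean_auc, 3))
```

Comparing several models, as in the abstract, would amount to swapping `fit_logistic` and `predict` for other learners while keeping the same folds, so that the AUC values are computed on identical held-out genes.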

Year:  2013        PMID: 23838808     DOI: 10.1007/s11427-013-4500-6

Source DB:  PubMed          Journal:  Sci China Life Sci        ISSN: 1674-7305            Impact factor:   6.038


  3 in total

1.  NCG 5.0: updates of a manually curated repository of cancer genes and associated properties from cancer mutational screenings.

Authors:  Omer An; Giovanni M Dall'Olio; Thanos P Mourikis; Francesca D Ciccarelli
Journal:  Nucleic Acids Res       Date:  2015-10-29       Impact factor: 16.971

2.  Integrating network, sequence and functional features using machine learning approaches towards identification of novel Alzheimer genes.

Authors:  Salma Jamal; Sukriti Goyal; Asheesh Shanker; Abhinav Grover
Journal:  BMC Genomics       Date:  2016-10-18       Impact factor: 3.969

3.  Identification of infectious disease-associated host genes using machine learning techniques.

Authors:  Ranjan Kumar Barman; Anirban Mukhopadhyay; Ujjwal Maulik; Santasabuj Das
Journal:  BMC Bioinformatics       Date:  2019-12-27       Impact factor: 3.169
