
Detecting disease genes based on semi-supervised learning and protein-protein interaction networks.

Thanh-Phuong Nguyen, Tu-Bao Ho.

Abstract

OBJECTIVE: Predicting or prioritizing the human genes that cause disease, or "disease genes", is one of the emerging tasks in biomedical informatics. Research on network-based approaches to this problem rests on the key assumption that "the network-neighbour of a disease gene is likely to cause the same or a similar disease", and mostly employs data on well-known disease genes with supervised learning methods. This work aims to find an effective method that exploits the disease gene neighbourhood and integrates several useful omics data sources, which could enhance disease gene prediction.
METHODS: We present a novel method that predicts disease genes by exploiting, in a semi-supervised learning (SSL) scheme, data on both disease genes and their neighbours in a protein-protein interaction network. Multiple proteomic and genomic data were integrated from six biological databases (Universal Protein Resource, Interologous Interaction Database, Reactome, Gene Ontology, Pfam, and InterDom) and a gene expression dataset.
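The abstract does not say which SSL algorithm is used, so the following is only a minimal sketch of the general idea: scores spread from labeled genes to unlabeled neighbours over an interaction graph. It assumes a generic iterative label propagation with clamped seeds. The function name `propagate_labels`, the parameters `alpha` and `iterations`, and the toy protein names are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def propagate_labels(edges, seeds, alpha=0.8, iterations=50):
    """Generic label propagation on an undirected graph (illustrative only).

    edges: iterable of (protein_a, protein_b) interaction pairs
    seeds: dict mapping labeled proteins to 1.0 (known disease gene)
           or 0.0 (assumed non-disease gene)
    Returns a dict mapping every protein to a score in [0, 1].
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    nodes = sorted(neighbours)
    # Unlabeled proteins start from an uninformative prior of 0.5.
    prior = {n: seeds.get(n, 0.5) for n in nodes}
    scores = dict(prior)

    for _ in range(iterations):
        updated = {}
        for n in nodes:
            if n in seeds:
                # Labeled proteins are clamped to their known label.
                updated[n] = seeds[n]
            else:
                mean_nb = sum(scores[m] for m in neighbours[n]) / len(neighbours[n])
                # Blend the neighbourhood average with the prior.
                updated[n] = alpha * mean_nb + (1 - alpha) * prior[n]
        scores = updated
    return scores

# Toy chain: D1 (disease) - U1 - U2 - N1 (non-disease); U1 and U2 are unlabeled.
toy_edges = [("D1", "U1"), ("U1", "U2"), ("U2", "N1")]
toy_seeds = {"D1": 1.0, "N1": 0.0}
toy_scores = propagate_labels(toy_edges, toy_seeds)
```

In this toy chain the unlabeled protein next to the disease gene ends up with a higher score than the one next to the non-disease gene, which is the neighbourhood assumption in miniature. A real pipeline would also need the integrated feature data the abstract mentions, which this sketch omits.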
RESULTS: Under 10 repetitions of stratified 10-fold cross-validation, the SSL method outperforms the k-nearest neighbour and support vector machine methods, reaching a sensitivity of 85%, specificity of 79%, precision of 81%, accuracy of 82%, and a balanced F-measure of 83%. Further comparative experiments show the advantage of the proposed method when only a small amount of labeled data is available, with an accuracy of 78%. We applied the proposed method to detect 572 putative disease genes, which were validated biologically by indirect means.
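To make the evaluation terms concrete, here is a small sketch of how stratified folds and the five reported metrics are commonly computed for a binary classifier. The helper names `stratified_folds` and `binary_metrics` are hypothetical, and the confusion-matrix counts are invented for illustration. They are not the paper's data.

```python
import random

def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with roughly equal class ratios."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    folds = [[] for _ in range(k)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        # Deal each class round-robin so every fold gets its share.
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)
    return folds

def binary_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # recall on the positive class
    specificity = tn / (tn + fp)            # recall on the negative class
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Balanced F-measure: harmonic mean of precision and sensitivity.
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_measure": f_measure,
    }

# Hypothetical balanced dataset: 50 positives followed by 50 negatives.
example_labels = [1] * 50 + [0] * 50
# "10 times 10-fold": repeat the split with 10 different shuffles.
repeated_splits = [stratified_folds(example_labels, k=10, seed=s) for s in range(10)]

# Made-up counts, purely to exercise the formulas.
example_metrics = binary_metrics(tp=40, fp=5, tn=45, fn=10)
```

Repeating the split with different seeds and averaging the metrics over all 100 test folds reduces the variance that a single 10-fold partition would carry.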
CONCLUSION: Semi-supervised learning improved the ability to study disease genes, especially for a specific disease, where the known disease genes (the labeled data) are very often limited. Beyond the computational improvement, analysis of the predicted disease proteins indicates that the findings help in deciphering pathogenic mechanisms.
Copyright © 2011 Elsevier B.V. All rights reserved.

Year: 2011    PMID: 22000346    DOI: 10.1016/j.artmed.2011.09.003

Source DB: PubMed    Journal: Artif Intell Med    ISSN: 0933-3657    Impact factor: 5.326


  19 in total

1.  Network modeling of patients' biomolecular profiles for clinical phenotype/outcome prediction.

Authors:  Jessica Gliozzo; Paolo Perlasca; Marco Mesiti; Elena Casiraghi; Viviana Vallacchi; Elisabetta Vergani; Marco Frasca; Giuliano Grossi; Alessandro Petrini; Matteo Re; Alberto Paccanaro; Giorgio Valentini
Journal:  Sci Rep       Date:  2020-02-27       Impact factor: 4.379

2.  Network propagation with dual flow for gene prioritization.

Authors:  Shunyao Wu; Fengjing Shao; Jun Ji; Rencheng Sun; Rizhuang Dong; Yuanke Zhou; Shaojie Xu; Yi Sui; Jianlong Hu
Journal:  PLoS One       Date:  2015-02-17       Impact factor: 3.240

3.  A systems biology investigation of neurodegenerative dementia reveals a pivotal role of autophagy.

Authors:  Laura Caberlotto; Thanh-Phuong Nguyen
Journal:  BMC Syst Biol       Date:  2014-06-07

Review 4.  The role of protein interaction networks in systems biomedicine.

Authors:  Tuba Sevimoglu; Kazim Yalcin Arga
Journal:  Comput Struct Biotechnol J       Date:  2014-09-03       Impact factor: 7.271

5.  Locus heterogeneity disease genes encode proteins with high interconnectivity in the human protein interaction network.

Authors:  Benjamin P Keith; David L Robertson; Kathryn E Hentges
Journal:  Front Genet       Date:  2014-12-09       Impact factor: 4.599

6.  Multiple kernels learning-based biological entity relationship extraction method.

Authors:  Xu Dongliang; Pan Jingchang; Wang Bailing
Journal:  J Biomed Semantics       Date:  2017-09-20

7.  Integrative gene network construction to analyze cancer recurrence using semi-supervised learning.

Authors:  Chihyun Park; Jaegyoon Ahn; Hyunjin Kim; Sanghyun Park
Journal:  PLoS One       Date:  2014-01-31       Impact factor: 3.240

8.  Network analysis of neurodegenerative disease highlights a role of Toll-like receptor signaling.

Authors:  Thanh-Phuong Nguyen; Laura Caberlotto; Melissa J Morine; Corrado Priami
Journal:  Biomed Res Int       Date:  2014-01-16       Impact factor: 3.411

9.  Improved multi-level protein-protein interaction prediction with semantic-based regularization.

Authors:  Claudio Saccà; Stefano Teso; Michelangelo Diligenti; Andrea Passerini
Journal:  BMC Bioinformatics       Date:  2014-04-12       Impact factor: 3.169

10.  An extensive analysis of disease-gene associations using network integration and fast kernel-based gene prioritization methods.

Authors:  Giorgio Valentini; Alberto Paccanaro; Horacio Caniza; Alfonso E Romero; Matteo Re
Journal:  Artif Intell Med       Date:  2014-03-20       Impact factor: 5.326
