C-PUGP: A cluster-based positive unlabeled learning method for disease gene prediction and prioritization.

Abstract

Disease gene detection is an important step in understanding disease processes and treatment. Candidate disease genes have been identified using many machine learning methods. Although these methods differ in the feature vectors used for genes, the way reliable negative data (non-disease genes) are selected, and the classification method, the lack of negative data is their most significant challenge. Recently, candidate disease genes have been identified by semi-supervised learning methods based on positive and unlabeled data. These methods are reasonably accurate and achieve better results than preceding methods. In this article, we propose a novel Positive Unlabeled (PU) learning technique based on clustering and a One-Class classification algorithm. Unlike existing methods, we build a more Reliable Negative (RN) set in three steps: (1) clustering the positive data, (2) learning One-Class classifier models from the clusters, and (3) selecting the intersection set of negative data as the Reliable Negative set. Next, we identify and rank the candidate disease genes using a binary classifier based on the support vector machine (SVM) algorithm. Experimental results indicate that the proposed method yields the best results, namely 92.8, 93.6, and 93.1 in terms of precision, recall, and F-measure, respectively. Compared to existing methods, the proposed method improves on the best of them by 11.7 percent in terms of F-measure. The results also show an increase of about 6% in the prioritization results.
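
The three-step Reliable Negative selection described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not name the clustering algorithm or the one-class classifier, so plain k-means and a centroid-plus-radius model stand in for them here. All function names, the `slack` parameter, and the toy 2-D "gene feature vectors" are hypothetical. The final SVM ranking step is left out.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def kmeans(points, k, iters=20):
    """Step 1 (assumed algorithm): Lloyd's k-means, initialised
    deterministically with the first k points. Returns non-empty clusters."""
    centers = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        # Keep the old centre if a cluster ended up empty.
        centers = [centroid(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return [c for c in clusters if c]

def fit_one_class(cluster, slack=1.5):
    """Step 2 (stand-in model): describe a cluster by its centroid and a
    radius equal to `slack` times the largest member distance."""
    center = centroid(cluster)
    radius = slack * max(math.dist(p, center) for p in cluster)
    return center, radius

def is_negative(model, x):
    """A one-class model calls x negative if it lies outside the radius."""
    center, radius = model
    return math.dist(x, center) > radius

def reliable_negatives(positives, unlabeled, k=2):
    """Step 3: keep only unlabeled items that EVERY per-cluster model
    rejects, i.e. the intersection of the per-model negative sets."""
    models = [fit_one_class(c) for c in kmeans(positives, k)]
    return [u for u in unlabeled if all(is_negative(m, u) for m in models)]

# Toy data: two groups of known disease genes and four unlabeled genes.
positives = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
             (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
unlabeled = [(0.5, 0.6), (5.0, 5.0), (10.4, 10.5), (20.0, 0.0)]

print(reliable_negatives(positives, unlabeled, k=2))
# -> [(5.0, 5.0), (20.0, 0.0)]
```

Taking the intersection is the conservative choice: an unlabeled gene that resembles any one positive cluster is kept out of the negative set, which reduces the risk of training the later binary classifier on hidden positives.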

Keywords: Candidate disease genes; Classification; Clustering; Identification; Pul; Semi-supervised learning

Year: 2018 PMID: 29890338 DOI: 10.1016/j.compbiolchem.2018.05.022

Source DB: PubMed Journal: Comput Biol Chem ISSN: 1476-9271 Impact factor: 2.877

Cited

3 in total

1. Factor graph-aggregated heterogeneous network embedding for disease-gene association prediction.

Authors: Ming He; Chen Huang; Bo Liu; Yadong Wang; Junyi Li
Journal: BMC Bioinformatics Date: 2021-03-29 Impact factor: 3.169

2. A novel candidate disease gene prioritization method using deep graph convolutional networks and semi-supervised learning.

Authors: Saeid Azadifar; Ali Ahmadi
Journal: BMC Bioinformatics Date: 2022-10-14 Impact factor: 3.307

3. A novel one-class classification approach to accurately predict disease-gene association in acute myeloid leukemia cancer.

Authors: Akram Vasighizaker; Alok Sharma; Abdollah Dehzangi
Journal: PLoS One Date: 2019-12-11 Impact factor: 3.240