Probabilistic multi-class multi-kernel learning: on protein fold recognition and remote homology detection.

Theodoros Damoulas, Mark A Girolami.

Abstract

MOTIVATION: Protein fold recognition and remote homology detection have recently attracted a great deal of interest because they are challenging multi-feature, multi-class problems on which modern pattern recognition methods achieve only modest performance. As with many pattern recognition problems, multiple feature spaces or groups of attributes are available: global characteristics such as amino-acid composition (C), predicted secondary structure (S), hydrophobicity (H), van der Waals volume (V), polarity (P) and polarizability (Z), as well as attributes derived from local sequence alignment such as Smith-Waterman scores. This raises the need for a classification method that can assess the contribution of these potentially heterogeneous object descriptors while using that information to improve predictive performance. To that end, we offer a single multi-class kernel machine that informatively combines the available feature groups and, as demonstrated in this article, achieves state-of-the-art accuracy on the fold recognition problem. Furthermore, the proposed approach provides some insight by assessing the significance of recently introduced protein features and string kernels. The method is founded on a Bayesian hierarchical framework, and a variational Bayes approximation is derived that allows for efficient CPU processing times.
RESULTS: The best performance we report on the SCOP PDB-40D benchmark data-set is 70% accuracy, obtained by combining all the available feature groups: the global protein characteristics together with the sequence-alignment features. We offer an 8% improvement on the best reported performance that combines multi-class k-NN classifiers, while at the same time reducing computational costs and assessing the predictive power of the various available features. Furthermore, we examine the performance of our methodology on the SCOP 1.53 benchmark data-set, which simulates remote homology detection, and examine the combination of various recently proposed state-of-the-art string kernels.
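
The central idea in the abstract, one kernel machine built on a weighted combination of per-feature-group kernels, can be illustrated with a minimal sketch. This is not the authors' variational Bayes procedure: the weights here are fixed by hand rather than inferred, and the RBF base kernel, the function names (`rbf_kernel`, `combine_kernels`), the feature dimensions and the random data are all assumptions made purely for illustration.

```python
import numpy as np


def rbf_kernel(X, Y, gamma=1.0):
    """RBF (Gaussian) kernel matrix between the rows of X and the rows of Y."""
    sq_dists = (
        (X ** 2).sum(axis=1)[:, None]
        + (Y ** 2).sum(axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def combine_kernels(kernels, weights):
    """Convex combination of base kernels: K = sum_s w_s * K_s.

    Weights must be non-negative; they are normalised to sum to one.
    Returns the combined kernel and the normalised weights.
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be non-negative")
    weights = weights / weights.sum()
    combined = sum(w * K for w, K in zip(weights, kernels))
    return combined, weights


# Hypothetical data: 5 proteins described by two feature groups.
rng = np.random.default_rng(0)
composition = rng.random((5, 20))          # stand-in for amino-acid composition (C)
secondary_structure = rng.random((5, 21))  # stand-in for predicted secondary structure (S)

base_kernels = [
    rbf_kernel(composition, composition, gamma=0.5),
    rbf_kernel(secondary_structure, secondary_structure, gamma=0.5),
]

# Hand-picked weights (an assumption); the paper infers feature-group
# contributions within its Bayesian framework instead.
K_combined, norm_weights = combine_kernels(base_kernels, [2.0, 1.0])
```

Because a non-negative weighted sum of positive semi-definite kernels is itself positive semi-definite, `K_combined` can be passed to any kernel classifier that accepts a precomputed Gram matrix, and the normalised weights give a rough reading of how much each feature group contributes.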

Year: 2008   PMID: 18378524   DOI: 10.1093/bioinformatics/btn112

Source DB: PubMed   Journal: Bioinformatics   ISSN: 1367-4803   Impact factor: 6.937

