
Contact map prediction using a large-scale ensemble of rule sets and the fusion of multiple predicted structural features.

Jaume Bacardit, Paweł Widera, Alfonso Márquez-Chamorro, Federico Divina, Jesús S Aguilar-Ruiz, Natalio Krasnogor.

Abstract

MOTIVATION: In recent years, the prediction of a protein's contact map has become a crucial stepping stone for the prediction of the complete 3D structure of a protein. In this article, we describe a methodology for this problem that was shown to be successful in CASP8 and CASP9. The methodology is based on (i) the fusion of predictions of a variety of structural aspects of protein residues, (ii) an ensemble strategy used to facilitate the training process and (iii) a rule-based machine learning system from which we can extract human-readable explanations of the predictor and derive useful information about the contact map representation.
RESULTS: The main part of the evaluation is the comparison against the sequence-based contact prediction methods from CASP9, where our method achieved the best rank in five of the six evaluated metrics. We also assess the impact of the size of the ensemble used in our predictor to show the trade-off between performance and training time. Finally, we study the rule sets generated by our machine learning system. From this analysis, we are able to estimate the contribution of the attributes in our representation and how these interact to derive contact predictions.
AVAILABILITY: http://icos.cs.nott.ac.uk/servers/psp.html.
CONTACT: natalio.krasnogor@nottingham.ac.uk
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
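
ILLUSTRATION: The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of an ensemble of rule sets voting on whether a residue pair is in contact. It is not the authors' system. The class and function names, the feature names (`separation`, `both_strand`, `burial`) and the toy rules are all hypothetical, and simple vote averaging is an assumed combination scheme.

```python
from typing import Callable, Dict, List, Tuple

# One residue pair, described by (hypothetical) fused structural features.
Features = Dict[str, float]
# A rule is a condition on the features plus the label it assigns (1 = contact).
Rule = Tuple[Callable[[Features], bool], int]


class RuleSet:
    """Ordered rule list: the first rule whose condition holds decides the label."""

    def __init__(self, rules: List[Rule], default: int) -> None:
        self.rules = list(rules)
        self.default = default

    def predict(self, x: Features) -> int:
        for condition, label in self.rules:
            if condition(x):
                return label
        return self.default  # no rule matched


def ensemble_confidence(rule_sets: List[RuleSet], x: Features) -> float:
    """Fraction of rule sets that vote 'contact' for one residue pair."""
    votes = sum(rs.predict(x) for rs in rule_sets)
    return votes / len(rule_sets)


# Three toy rule sets (assumed, for illustration only).
rule_sets = [
    RuleSet([(lambda x: x["separation"] >= 6 and x["both_strand"] == 1, 1)], default=0),
    RuleSet([(lambda x: x["burial"] > 0.5, 1)], default=0),
    RuleSet([(lambda x: x["separation"] < 6, 0),
             (lambda x: x["both_strand"] == 1, 1)], default=0),
]

# Hypothetical feature values for a single residue pair.
pair = {"separation": 12, "both_strand": 1, "burial": 0.3}
confidence = ensemble_confidence(rule_sets, pair)
print(f"contact confidence: {confidence:.2f}")
```

Because each rule is an explicit condition on named features, counting how often a feature appears in the rules that fire is one simple way such a model could be inspected, in the spirit of the rule-set analysis mentioned in the abstract.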

Year:  2012        PMID: 22833524     DOI: 10.1093/bioinformatics/bts472

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  9 in total

1.  Multi-Dimensional Scaling and MODELLER-Based Evolutionary Algorithms for Protein Model Refinement.

Authors:  Yan Chen; Yi Shang; Dong Xu
Journal:  Proc Congr Evol Comput       Date:  2014-07

2.  Hard Data Analytics Problems Make for Better Data Analysis Algorithms: Bioinformatics as an Example.

Authors:  Jaume Bacardit; Paweł Widera; Nicola Lazzarini; Natalio Krasnogor
Journal:  Big Data       Date:  2014-09-01       Impact factor: 2.128

3.  Protein Residue Contacts and Prediction Methods.

Authors:  Badri Adhikari; Jianlin Cheng
Journal:  Methods Mol Biol       Date:  2016

4.  A machine learning heuristic to identify biologically relevant and minimal biomarker panels from omics data.

Authors:  Anna L Swan; Dov J Stekel; Charlie Hodgman; David Allaway; Mohammed H Alqahtani; Ali Mobasheri; Jaume Bacardit
Journal:  BMC Genomics       Date:  2015-01-15       Impact factor: 3.969

5.  Mining the entire Protein DataBank for frequent spatially cohesive amino acid patterns.

Authors:  Pieter Meysman; Cheng Zhou; Boris Cule; Bart Goethals; Kris Laukens
Journal:  BioData Min       Date:  2015-01-31       Impact factor: 2.522

6.  Functional networks inference from rule-based machine learning models.

Authors:  Nicola Lazzarini; Paweł Widera; Stuart Williamson; Rakesh Heer; Natalio Krasnogor; Jaume Bacardit
Journal:  BioData Min       Date:  2016-09-05       Impact factor: 2.522

7.  Exact p-values for pairwise comparison of Friedman rank sums, with application to comparing classifiers.

Authors:  Rob Eisinga; Tom Heskes; Ben Pelzer; Manfred Te Grotenhuis
Journal:  BMC Bioinformatics       Date:  2017-01-25       Impact factor: 3.169

8.  RRCRank: a fusion method using rank strategy for residue-residue contact prediction.

Authors:  Xiaoyang Jing; Qiwen Dong; Ruqian Lu
Journal:  BMC Bioinformatics       Date:  2017-09-02       Impact factor: 3.169

9.  An analysis and evaluation of the WeFold collaborative for protein structure prediction and its pipelines in CASP11 and CASP12.

Authors:  Chen Keasar; Liam J McGuffin; Björn Wallner; Gaurav Chopra; Badri Adhikari; Debswapna Bhattacharya; Lauren Blake; Leandro Oliveira Bortot; Renzhi Cao; B K Dhanasekaran; Itzhel Dimas; Rodrigo Antonio Faccioli; Eshel Faraggi; Robert Ganzynkowicz; Sambit Ghosh; Soma Ghosh; Artur Giełdoń; Lukasz Golon; Yi He; Lim Heo; Jie Hou; Main Khan; Firas Khatib; George A Khoury; Chris Kieslich; David E Kim; Pawel Krupa; Gyu Rie Lee; Hongbo Li; Jilong Li; Agnieszka Lipska; Adam Liwo; Ali Hassan A Maghrabi; Milot Mirdita; Shokoufeh Mirzaei; Magdalena A Mozolewska; Melis Onel; Sergey Ovchinnikov; Anand Shah; Utkarsh Shah; Tomer Sidi; Adam K Sieradzan; Magdalena Ślusarz; Rafal Ślusarz; James Smadbeck; Phanourios Tamamis; Nicholas Trieber; Tomasz Wirecki; Yanping Yin; Yang Zhang; Jaume Bacardit; Maciej Baranowski; Nicholas Chapman; Seth Cooper; Alexandre Defelicibus; Jeff Flatten; Brian Koepnick; Zoran Popović; Bartlomiej Zaborowski; David Baker; Jianlin Cheng; Cezary Czaplewski; Alexandre Cláudio Botazzo Delbem; Christodoulos Floudas; Andrzej Kloczkowski; Stanislaw Ołdziej; Michael Levitt; Harold Scheraga; Chaok Seok; Johannes Söding; Saraswathi Vishveshwara; Dong Xu; Silvia N Crivelli
Journal:  Sci Rep       Date:  2018-07-02       Impact factor: 4.379

