
Is it better to combine predictions?

R D King, M Ouali, A T Strong, A Aly, A Elmaghraby, M Kantardzic, D Page.

Abstract

We have compared the accuracy of the individual protein secondary structure prediction methods PHD, DSC, NNSSP and Predator against the accuracy obtained by combining the predictions of the methods. A range of ways of combining predictions were tested: voting, biased voting, linear discrimination, neural networks and decision trees. The combined methods that involve 'learning' (the non-voting methods) were trained using a set of 496 non-homologous domains; this dataset was biased, as some of the secondary structure prediction methods had used these domains for training. We used two independent test sets to compare predictions: the first consisted of 17 non-homologous domains from CASP3 (Third Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction); the second set consisted of 405 domains that were selected in the same way as the training set, and were non-homologous to each other and to the training set. On both test datasets the most accurate individual method was NNSSP, followed by PHD and DSC, and the least accurate was Predator; however, it was not possible to conclusively show a significant difference between the individual methods. Comparing the accuracy of the single methods with that obtained by combining predictions, it was found that it was better to use a combination of predictions. On both test datasets it was possible to obtain an approximately 3% improvement in accuracy by combining predictions. In most cases the combined methods were statistically significantly better (at P = 0.05 on the CASP3 test set, and P = 0.01 on the EBI test set). On the CASP3 test dataset there was no significant difference in accuracy between any of the combined methods of prediction; on the EBI test dataset, linear discrimination and neural networks significantly outperformed voting techniques. We conclude that it is better to combine predictions.
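
To make the voting idea concrete, here is a minimal Python sketch of per-residue voting over the outputs of several predictors. It is an illustration under stated assumptions, not the procedure used in the paper: the prediction strings are made-up data, the tie-break rule (prefer the method ranked more accurate in the abstract) is an assumption, and `weighted_vote` is only one plausible reading of "biased voting", with hypothetical weights.

```python
from collections import Counter

# Made-up three-state predictions (H = helix, E = strand, C = coil).
# These strings are illustrative only, not output of the real programs.
predictions = {
    "NNSSP":    "HHHHCCEEEC",
    "PHD":      "HHHCCCEEEC",
    "DSC":      "HHCCCCEECC",
    "Predator": "CHHHCCCEEC",
}

# Assumed tie-break order, following the accuracy ranking in the abstract.
PRIORITY = ["NNSSP", "PHD", "DSC", "Predator"]


def _check_lengths(preds):
    """Return the common sequence length, or raise if the lengths differ."""
    lengths = {len(p) for p in preds.values()}
    if len(lengths) != 1:
        raise ValueError("all predictions must have the same length")
    return lengths.pop()


def majority_vote(preds, priority=PRIORITY):
    """Pick the most frequent state per residue; break ties by method priority."""
    length = _check_lengths(preds)
    consensus = []
    for i in range(length):
        counts = Counter(p[i] for p in preds.values())
        best = max(counts.values())
        tied = {state for state, c in counts.items() if c == best}
        # Use the state chosen by the highest-priority method among the tied states.
        for name in priority:
            if preds[name][i] in tied:
                consensus.append(preds[name][i])
                break
    return "".join(consensus)


def weighted_vote(preds, weights):
    """Sum a per-method weight for each state and pick the highest-scoring state."""
    length = _check_lengths(preds)
    consensus = []
    for i in range(length):
        scores = {}
        for name, p in preds.items():
            scores[p[i]] = scores.get(p[i], 0.0) + weights[name]
        # sorted() makes the choice deterministic when scores are exactly equal.
        consensus.append(max(sorted(scores), key=scores.get))
    return "".join(consensus)


if __name__ == "__main__":
    print(majority_vote(predictions))
    # Hypothetical weights; in practice they would be estimated on training data.
    print(weighted_vote(predictions, {"NNSSP": 1.2, "PHD": 1.1, "DSC": 1.0, "Predator": 0.9}))
```

With four methods and three states, ties are common, so whatever tie-break rule is chosen can noticeably affect the consensus. The learned combiners mentioned in the abstract (linear discrimination, neural networks, decision trees) replace such fixed rules with parameters fitted on a training set.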

Year:  2000        PMID: 10679525     DOI: 10.1093/protein/13.1.15

Source DB:  PubMed          Journal:  Protein Eng        ISSN: 0269-2139


