On-line hierarchy of general linear models for selecting and ranking the best predicted protein structures.

Hani Zakaria Girgis, Jason J Corso, Daniel Fischer.

Abstract

To predict the three-dimensional structure of proteins, many computational methods sample the conformational space, generating a large number of candidate structures. Such methods then rank the generated structures using a variety of model quality assessment programs to obtain a small set of structures that are most likely to resemble the unknown, experimentally determined structure. Model quality assessment programs suffer from two main limitations: (i) the rank-one structure is not always the best predicted structure; for instance, the best predicted structure could be ranked 10th; and (ii) no single assessment method can correctly rank the predicted structures for all target proteins. However, because at least some of the methods often achieve a good ranking, a model quality assessment method based on a consensus of several such methods is likely to perform better. We have devised the STPdata algorithm, a consensus method based on five model quality assessment programs. We have applied it to build an on-line "custom-trained" hierarchy of general linear models to select and rank the best predicted structures. By "custom-trained", we mean that for each target protein the STPdata algorithm trains a unique model on data related to that target protein. To evaluate our method, we participated in CASP8 as human predictors. In CASP8, the STPdata algorithm trained 128 hierarchical models, one for each of the 128 target proteins. Based on the official results of CASP8, our method outperformed the best server by 6% and ranked fourth among human predictors. Our CASP results are based purely on computational methods, without any human intervention.
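The consensus idea in the abstract can be illustrated with a minimal sketch. This is not the STPdata algorithm: the abstract does not describe the hierarchy, the five assessment programs, or the training data, so the sketch fits a single ordinary least-squares linear model (no hierarchy) that maps five assessment scores to a quality value and then ranks candidate structures by predicted quality. All function names, score values, and the hidden linear rule used to generate the toy targets are assumptions made up for illustration.

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; more varied training data is needed")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                for k in range(col, n + 1):
                    m[row][k] -= factor * m[col][k]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear_model(features: Sequence[Sequence[float]], targets: Sequence[float]) -> List[float]:
    """Ordinary least squares with an intercept, solved via the normal equations."""
    rows = [[1.0] + list(f) for f in features]  # prepend the intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(weights: Sequence[float], scores: Sequence[float]) -> float:
    """Predicted quality: intercept plus the weighted sum of the assessment scores."""
    return weights[0] + sum(w * s for w, s in zip(weights[1:], scores))


def rank_candidates(weights: Sequence[float], candidates: Sequence[Sequence[float]]) -> List[int]:
    """Return candidate indices ordered from best to worst predicted quality."""
    predicted = [predict(weights, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: predicted[i], reverse=True)


# Toy training data (made up): each row holds five assessment scores for one
# structure whose true quality is known.
train_scores = [
    [0.9, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.9, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.9],
    [0.1, 0.1, 0.1, 0.1, 0.1],
    [0.5, 0.5, 0.5, 0.5, 0.5],
]
# Toy targets generated from a hidden linear rule (an assumption) so that the
# fit can be checked against known weights.
hidden_weights = [0.05, 0.3, 0.1, 0.2, 0.1, 0.2]
train_quality = [predict(hidden_weights, row) for row in train_scores]

weights = fit_linear_model(train_scores, train_quality)

# Candidate structures for a new target, described by the same five scores.
candidates = [
    [0.8, 0.7, 0.9, 0.6, 0.8],
    [0.3, 0.4, 0.2, 0.5, 0.3],
    [0.6, 0.6, 0.6, 0.6, 0.6],
]
ranking = rank_candidates(weights, candidates)
print(ranking)
```

In this sketch, "custom training" would correspond to calling `fit_linear_model` once per target on data chosen for that target, so that each target gets its own weights.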

Year:  2009        PMID: 19963875     DOI: 10.1109/IEMBS.2009.5332706

Source DB:  PubMed          Journal:  Conf Proc IEEE Eng Med Biol Soc        ISSN: 1557-170X


  4 in total

1.  MeShClust: an intelligent tool for clustering DNA sequences.

Authors:  Benjamin T James; Brian B Luczak; Hani Z Girgis
Journal:  Nucleic Acids Res       Date:  2018-08-21       Impact factor: 16.971

2.  Identity: rapid alignment-free prediction of sequence alignment identity scores using self-supervised general linear models.

Authors:  Hani Z Girgis; Benjamin T James; Brian B Luczak
Journal:  NAR Genom Bioinform       Date:  2021-02-01

3.  MULTICOM: a multi-level combination approach to protein structure prediction and its assessments in CASP8.

Authors:  Zheng Wang; Jesse Eickholt; Jianlin Cheng
Journal:  Bioinformatics       Date:  2010-02-11       Impact factor: 6.937

4.  HebbPlot: an intelligent tool for learning and visualizing chromatin mark signatures.

Authors:  Hani Z Girgis; Alfredo Velasco; Zachary E Reyes
Journal:  BMC Bioinformatics       Date:  2018-09-03       Impact factor: 3.169
