
Using sequence motifs for enhanced neural network prediction of protein distance constraints.

J Gorodkin, O Lund, C A Andersen, S Brunak.

Abstract

Correlations between sequence separation (in residues) and distance (in ångströms) for pairs of amino acids in polypeptide chains are investigated. For each sequence separation we define a distance threshold. For pairs of amino acids whose C-alpha atoms are closer than the threshold, a characteristic sequence (logo) motif is found. The motifs change as the sequence separation increases: for small separations they consist of one peak located between the two residues; then additional peaks appear at these residues; and finally, for very large separations, the center peak smears out. We also find correlations between the residues in the center of the motif. These and other statistical analyses are used to design neural networks with enhanced performance compared to earlier work. Importantly, the statistical analysis explains why neural networks perform better than simple statistical data-driven approaches such as pair probability density functions. The statistical results also explain characteristics of the network performance for increasing sequence separation. The improvement of the new network design is significant in the sequence separation range of 10-30 residues. Finally, we find that the performance curve for increasing sequence separation is directly correlated to the corresponding information content. A WWW server, distanceP, is available at http://www.cbs.dtu.dk/services/distanceP/.
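As an illustration of the analysis the abstract describes, the following is a minimal sketch, not the authors' implementation. It collects sequence windows around residue pairs at a fixed sequence separation whose C-alpha distance falls below a threshold, then computes a per-position information content of the kind plotted in a sequence logo. The function names, the `flank` parameter, the uniform 20-letter background, and the absence of a small-sample correction are all assumptions made for the example; the abstract does not give the actual thresholds or the exact logo definition.

```python
import math
from collections import Counter


def contact_windows(sequence, ca_coords, separation, threshold, flank=2):
    """Collect windows around pairs (i, i + separation) that are close in space.

    sequence   : one-letter amino acid string
    ca_coords  : list of (x, y, z) C-alpha coordinates, one per residue
    separation : sequence separation in residues
    threshold  : distance cutoff in angstroms (hypothetical value, chosen by caller)
    flank      : residues kept on each side of the pair (assumption of this sketch)
    """
    windows = []
    # i runs over all positions where the full window fits inside the chain
    for i in range(flank, len(sequence) - separation - flank):
        j = i + separation
        if math.dist(ca_coords[i], ca_coords[j]) < threshold:
            windows.append(sequence[i - flank : j + flank + 1])
    return windows


def information_content(windows, alphabet_size=20):
    """Per-position information in bits: log2(alphabet_size) - Shannon entropy.

    Assumes a uniform background distribution and applies no small-sample
    correction; both are simplifications made for this sketch.
    """
    if not windows:
        return []
    n = len(windows)
    bits = []
    for column in zip(*windows):  # iterate over aligned window positions
        counts = Counter(column)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        bits.append(math.log2(alphabet_size) - entropy)
    return bits


if __name__ == "__main__":
    # Toy chain: residues on a straight line, 3.8 angstroms apart
    seq = "ACDEFGHIKL"
    coords = [(3.8 * k, 0.0, 0.0) for k in range(len(seq))]
    close = contact_windows(seq, coords, separation=2, threshold=8.0, flank=1)
    print(len(close), information_content(close))
```

Repeating the calculation for each sequence separation, each with its own threshold, would yield one motif per separation, which is the kind of comparison the abstract describes.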

Year:  1999        PMID: 10786291

Source DB:  PubMed          Journal:  Proc Int Conf Intell Syst Mol Biol        ISSN: 1553-0833


