Hidden Markov models in computational biology. Applications to protein modeling.

A Krogh, M Brown, I S Mian, K Sjölander, D Haussler.

Abstract

Hidden Markov Models (HMMs) are applied to the problems of statistical modeling, database searching and multiple sequence alignment of protein families and protein domains. These methods are demonstrated on the globin family, the protein kinase catalytic domain, and the EF-hand calcium binding motif. In each case the parameters of an HMM are estimated from a training set of unaligned sequences. After the HMM is built, it is used to obtain a multiple alignment of all the training sequences. It is also used to search the SWISS-PROT 22 database for other sequences that are members of the given protein family, or contain the given domain. The HMM produces multiple alignments of good quality that agree closely with the alignments produced by programs that incorporate three-dimensional structural information. When employed in discrimination tests (by examining how closely the sequences in a database fit the globin, kinase and EF-hand HMMs), the HMM is able to distinguish members of these families from non-members with a high degree of accuracy. Both the HMM and PROFILESEARCH (a technique used to search for relationships between a protein sequence and multiply aligned sequences) perform better in these tests than PROSITE (a dictionary of sites and patterns in proteins). The HMM appears to have a slight advantage over PROFILESEARCH in terms of lower rates of false negatives and false positives, even though the HMM is trained using only unaligned sequences, whereas PROFILESEARCH requires aligned training sequences. Our results suggest the presence of an EF-hand calcium binding motif in a highly conserved and evolutionarily preserved putative intracellular region of 155 residues in the alpha-1 subunit of L-type calcium channels, which play an important role in excitation-contraction coupling. This region has been suggested to contain the functional domains that are typical or essential for all L-type calcium channels, regardless of whether they couple to ryanodine receptors, conduct ions, or both.
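
The discrimination test described in the abstract (scoring how closely a database sequence fits a trained HMM) can be illustrated with a small sketch. The code below is not the authors' implementation: it uses a generic two-state HMM with a two-letter toy alphabet rather than the match/insert/delete architecture of a protein-family model. All names (`forward_log_likelihood`, `log_odds`, `toy_model`) and all parameter values are hypothetical and chosen only for illustration. It computes the log-likelihood of a sequence with the forward algorithm and compares it with a uniform background model.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_log_likelihood(seq, states, log_start, log_trans, log_emit):
    """Forward algorithm: log P(seq | model), summed over all state paths."""
    # alpha[s] = log P(symbols seen so far, current state = s)
    alpha = {s: log_start[s] + log_emit[s][seq[0]] for s in states}
    for symbol in seq[1:]:
        alpha = {
            s: log_emit[s][symbol]
            + log_sum_exp([alpha[r] + log_trans[r][s] for r in states])
            for s in states
        }
    return log_sum_exp(list(alpha.values()))

def log_odds(seq, model, background):
    """Log-odds of the HMM against an independent-symbol background model."""
    null = sum(math.log(background[c]) for c in seq)
    return forward_log_likelihood(seq, *model) - null

def logs(table):
    """Convert a dict of probabilities to log probabilities."""
    return {k: math.log(v) for k, v in table.items()}

# Hypothetical toy parameters (assumed, not taken from the paper):
# state "m" prefers symbol "H"; state "i" emits both symbols equally.
states = ["m", "i"]
toy_model = (
    states,
    logs({"m": 0.5, "i": 0.5}),
    {"m": logs({"m": 0.9, "i": 0.1}), "i": logs({"m": 0.1, "i": 0.9})},
    {"m": logs({"H": 0.9, "L": 0.1}), "i": logs({"H": 0.5, "L": 0.5})},
)
background = {"H": 0.5, "L": 0.5}

for candidate in ["HHHH", "LLLL"]:
    print(candidate, round(log_odds(candidate, toy_model, background), 3))
```

A sequence rich in the symbol the model favors receives a positive log-odds score, and one lacking it receives a negative score. Thresholding such a score is one simple way to separate likely family members from non-members, under the assumptions stated above.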

Year:  1994        PMID: 8107089     DOI: 10.1006/jmbi.1994.1104

Source DB:  PubMed          Journal:  J Mol Biol        ISSN: 0022-2836            Impact factor:   5.469


  331 in total

1.  Evaluation of PSI-BLAST alignment accuracy in comparison to structural alignments.

Authors:  I Friedberg; T Kaplan; H Margalit
Journal:  Protein Sci       Date:  2000-11       Impact factor: 6.725

2.  The MetaFam Server: a comprehensive protein family resource.

Authors:  K A Silverstein; E Shoop; J E Johnson; A Kilian; J L Freeman; T M Kunau; I A Awad; M Mayer; E F Retzel
Journal:  Nucleic Acids Res       Date:  2001-01-01       Impact factor: 16.971

3.  trEST, trGEN and Hits: access to databases of predicted protein sequences.

Authors:  M Pagni; C Iseli; T Junier; L Falquet; V Jongeneel; P Bucher
Journal:  Nucleic Acids Res       Date:  2001-01-01       Impact factor: 16.971

4.  Evolution and horizontal transfer of dUTPase-encoding genes in viruses and their hosts.

Authors:  A M Baldo; M A McClure
Journal:  J Virol       Date:  1999-09       Impact factor: 5.103

5.  Associating genes with gene ontology codes using a maximum entropy analysis of biomedical literature.

Authors:  Soumya Raychaudhuri; Jeffrey T Chang; Patrick D Sutphin; Russ B Altman
Journal:  Genome Res       Date:  2002-01       Impact factor: 9.043

6.  Fold recognition without folds.

Authors:  Kristin K Koretke; Robert B Russell; Andrei N Lupas
Journal:  Protein Sci       Date:  2002-06       Impact factor: 6.725

7.  Enhanced protein domain discovery by using language modeling techniques from speech recognition.

Authors:  Lachlan Coin; Alex Bateman; Richard Durbin
Journal:  Proc Natl Acad Sci U S A       Date:  2003-03-31       Impact factor: 11.205

8.  A computational method for resequencing long DNA targets by universal oligonucleotide arrays.

Authors:  Itsik Pe'er; Naama Arbili; Ron Shamir
Journal:  Proc Natl Acad Sci U S A       Date:  2002-11-12       Impact factor: 11.205

9.  The directional atomic solvation energy: an atom-based potential for the assignment of protein sequences to known folds.

Authors:  Parag Mallick; Robert Weiss; David Eisenberg
Journal:  Proc Natl Acad Sci U S A       Date:  2002-12-02       Impact factor: 11.205

10.  A branch point consensus from Arabidopsis found by non-circular analysis allows for better prediction of acceptor sites.

Authors:  N Tolstrup; P Rouzé; S Brunak
Journal:  Nucleic Acids Res       Date:  1997-08-01       Impact factor: 16.971
