Dirichlet mixtures, the Dirichlet process, and the structure of protein space.

Viet-An Nguyen, Jordan Boyd-Graber, Stephen F Altschul.

Abstract

The Dirichlet process is used to model probability distributions that are mixtures of an unknown number of components. Amino acid frequencies at homologous positions within related proteins have been fruitfully modeled by Dirichlet mixtures, and we use the Dirichlet process to derive such mixtures with an unbounded number of components. This application of the method requires several technical innovations to sample an unbounded number of Dirichlet-mixture components. The resulting Dirichlet mixtures model multiple-alignment data substantially better than do previously derived ones. They consist of over 500 components, in contrast to fewer than 40 previously, and provide a novel perspective on the structure of proteins. Individual protein positions should be seen not as falling into one of several categories, but rather as arrayed near probability ridges winding through amino acid multinomial space.
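As a rough illustration of how a Dirichlet mixture is typically applied to alignment data, the sketch below takes the observed residue counts at one alignment column, weights each mixture component by how well it explains those counts, and returns the resulting posterior mean residue probabilities. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the three-letter alphabet, and the two toy components are hypothetical, and a real mixture would use 20 amino acids and the hundreds of components described in the abstract.

```python
from math import lgamma, log, exp

def log_marginal(counts, alpha):
    """Log Dirichlet-multinomial likelihood of counts under one component,
    omitting the multinomial coefficient (it is the same for every component)."""
    n, a = sum(counts), sum(alpha)
    total = lgamma(a) - lgamma(a + n)
    for c, al in zip(counts, alpha):
        total += lgamma(al + c) - lgamma(al)
    return total

def posterior_weights(counts, weights, alphas):
    """Posterior probability of each mixture component given the counts."""
    logs = [log(q) + log_marginal(counts, al) for q, al in zip(weights, alphas)]
    m = max(logs)  # subtract the maximum before exponentiating, for stability
    unnorm = [exp(v - m) for v in logs]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def posterior_mean(counts, weights, alphas):
    """Posterior mean residue probabilities under the Dirichlet mixture prior."""
    w = posterior_weights(counts, weights, alphas)
    n = sum(counts)
    probs = [0.0] * len(counts)
    for wj, al in zip(w, alphas):
        a = sum(al)
        for i, (c, ali) in enumerate(zip(counts, al)):
            probs[i] += wj * (c + ali) / (n + a)
    return probs

# Hypothetical toy mixture over a three-letter alphabet (assumed values).
toy_weights = [0.5, 0.5]
toy_alphas = [[5.0, 1.0, 1.0],   # component favouring letter 0
              [1.0, 1.0, 5.0]]   # component favouring letter 2
toy_counts = [4, 0, 0]           # a column where only letter 0 was observed

print(posterior_weights(toy_counts, toy_weights, toy_alphas))
print(posterior_mean(toy_counts, toy_weights, toy_alphas))
```

With few observations the estimate is pulled toward the components that best match the counts; as the number of observations grows, the observed frequencies dominate. The Dirichlet process described in the abstract addresses a different question, namely how many such components there should be, and is not sketched here.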

Substances: Proteins

Year: 2013 PMID: 23294268 PMCID: PMC3541698 DOI: 10.1089/cmb.2012.0244

Source DB: PubMed Journal: J Comput Biol ISSN: 1066-5277 Impact factor: 1.479

8 in total

1. A comparison of scoring functions for protein sequence profile alignment.

Authors: Robert C Edgar; Kimmen Sjölander
Journal: Bioinformatics Date: 2004-02-12 Impact factor: 6.937

2. An assessment of substitution scores for protein profile-profile comparison.

Authors: Xugang Ye; Guoli Wang; Stephen F Altschul
Journal: Bioinformatics Date: 2011-10-13 Impact factor: 6.937

3. On the inference of Dirichlet mixture priors for protein sequence comparison.

Authors: Xugang Ye; Yi-Kuo Yu; Stephen F Altschul
Journal: J Comput Biol Date: 2011-06-24 Impact factor: 1.479

4. The complexity of the Dirichlet model for multiple alignment data.

Authors: Yi-Kuo Yu; Stephen F Altschul
Journal: J Comput Biol Date: 2011-06-24 Impact factor: 1.479

5. Dirichlet mixtures: a method for improved detection of weak but significant protein sequence homology.

Authors: K Sjölander; K Karplus; M Brown; R Hughey; A Krogh; I S Mian; D Haussler
Journal: Comput Appl Biosci Date: 1996-08

6. Using Dirichlet mixture priors to derive hidden Markov models for protein families.

Authors: M Brown; R Hughey; A Krogh; I S Mian; K Sjölander; D Haussler
Journal: Proc Int Conf Intell Syst Mol Biol Date: 1993

7. Lines of descent in the diffusion approximation of neutral Wright-Fisher models.

Authors: R C Griffiths
Journal: Theor Popul Biol Date: 1980-02 Impact factor: 1.570

8. The construction and use of log-odds substitution scores for multiple sequence alignment.

Authors: Stephen F Altschul; John C Wootton; Elena Zaslavsky; Yi-Kuo Yu
Journal: PLoS Comput Biol Date: 2010-07-15 Impact factor: 4.475

4 in total

1. Log-odds sequence logos.

2. Inference of Functionally-Relevant N-acetyltransferase Residues Based on Statistical Correlations.

3. Bridging the gaps in statistical models of protein alignment.

4. Bayesian Top-Down Protein Sequence Alignment with Inferred Position-Specific Gap Penalties.