The compositional adjustment of amino acid substitution matrices.

Yi-Kuo Yu, John C Wootton, Stephen F Altschul.

Abstract

Amino acid substitution matrices are central to protein-comparison methods. In most commonly used matrices, the substitution scores take a log-odds form, involving the ratio of "target" to "background" frequencies derived from large, carefully curated sets of protein alignments. However, such matrices often are used to compare protein sequences with amino acid compositions that differ markedly from the background frequencies used for the construction of the matrices. Of course, the target frequencies should be adjusted in such cases, but the lack of an appropriate way to do this has been a long-standing problem. This article shows that if one demands consistency between target and background frequencies, then a log-odds substitution matrix implies a unique set of target and background frequencies as well as a unique scale. Standard substitution matrices therefore are truly appropriate only for the comparison of proteins with standard amino acid composition. Accordingly, we present and evaluate a rationale for transforming the target frequencies implicit in a standard matrix to frequencies appropriate for a nonstandard context. This rationale yields asymmetric matrices for the comparison of proteins with divergent compositions. Earlier approaches are unable to deal with this case in a fully consistent manner. Composition-specific substitution matrix adjustment is shown to be of utility for comparing compositionally biased proteins, including those of organisms with nucleotide-biased, and therefore codon-biased, genomes or isochores.
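The log-odds form and the notion of an implied scale can be illustrated with a small numerical sketch. The code below is not the adjustment procedure of the article; it only assumes the standard relationship s_ij = ln(q_ij / (p_i p'_j)) / lambda between scores, target frequencies q and background frequencies p, p', together with the usual normalization condition that the implied target frequencies sum to 1. The three-letter alphabet, the numbers in TARGET and all function names are hypothetical choices made for illustration.

```python
import math

def marginals(q):
    """Row and column sums of a joint target-frequency matrix q."""
    rows = [sum(row) for row in q]
    cols = [sum(row[j] for row in q) for j in range(len(q))]
    return rows, cols

def log_odds(q, p, p2, lam):
    """Scores s_ij = ln(q_ij / (p_i * p2_j)) / lam."""
    n = len(q)
    return [[math.log(q[i][j] / (p[i] * p2[j])) / lam for j in range(n)]
            for i in range(n)]

def scale_residual(s, p, p2, lam):
    """sum_ij p_i * p2_j * exp(lam * s_ij) - 1; zero at the implied scale."""
    n = len(s)
    return sum(p[i] * p2[j] * math.exp(lam * s[i][j])
               for i in range(n) for j in range(n)) - 1.0

def solve_scale(s, p, p2, lo=1e-6, hi=1.0, iters=200):
    """Bisection for the positive root of scale_residual.

    Assumes a negative expected score and at least one positive score,
    so the residual is negative just above 0 and eventually positive.
    """
    while scale_residual(s, p, p2, hi) <= 0.0:
        hi *= 2.0  # widen the bracket until the residual changes sign
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if scale_residual(s, p, p2, mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical target frequencies over a toy 3-letter alphabet (sum to 1).
TARGET = [[0.30, 0.05, 0.05],
          [0.05, 0.20, 0.05],
          [0.05, 0.05, 0.20]]
TRUE_LAMBDA = 0.5  # scale used to build the scores (an arbitrary choice)

# Consistency: the background frequencies are the marginals of the targets.
p_row, p_col = marginals(TARGET)
scores = log_odds(TARGET, p_row, p_col, TRUE_LAMBDA)

# Recover the scale from the scores and background frequencies alone,
# then rebuild the target frequencies the scores imply.
lam = solve_scale(scores, p_row, p_col)
recovered = [[p_row[i] * p_col[j] * math.exp(lam * scores[i][j])
              for j in range(3)] for i in range(3)]
print(lam)
```

In this sketch the background frequencies are taken as the marginals of the target frequencies, which is the consistency requirement mentioned in the abstract. If the same scores were paired with background frequencies from a compositionally biased protein, the recovered scale and target frequencies would no longer match those used to build the matrix, which is the situation the article addresses.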

Year:  2003        PMID: 14663142      PMCID: PMC307629          DOI: 10.1073/pnas.2533904100

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


