Using substitution probabilities to improve position-specific scoring matrices.

J G Henikoff, S Henikoff.

Abstract

Each column of amino acids in a multiple alignment of protein sequences can be represented as a vector of 20 amino acid counts. For alignment and searching applications, the count vector is an imperfect representation of a position, because the observed sequences are an incomplete sample of the full set of related sequences. One general solution to this problem is to model unobserved sequences by adding artificial 'pseudo-counts' to the observed counts. We introduce a simple method for computing pseudo-counts that combines the diversity observed in each alignment position with amino acid substitution probabilities. In extensive empirical tests, this position-based method out-performed other pseudo-count methods and was a substantial improvement over the traditional average score method used for constructing profiles.
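The abstract does not state the formula, so the following is only a minimal sketch of one way a position-based pseudo-count scheme of this kind could look. It assumes that the total pseudo-count for a column grows with the number of distinct residues observed there, and that this total is shared out according to substitution probabilities weighted by the observed counts. The function names, the multiplier `m`, and the three-letter `TOY_SUBST` table are hypothetical and chosen for illustration; a real profile would use all 20 amino acids and probabilities derived from a substitution matrix.

```python
from collections import Counter

# Hypothetical toy table: TOY_SUBST[i][a] is an assumed probability of seeing
# residue a at a position where residue i was observed. Each row sums to 1.
TOY_SUBST = {
    "A": {"A": 0.7, "G": 0.2, "S": 0.1},
    "G": {"A": 0.2, "G": 0.7, "S": 0.1},
    "S": {"A": 0.1, "G": 0.1, "S": 0.8},
}


def position_pseudocounts(column, subst, m=5.0):
    """Return (observed counts, pseudo-counts) for one alignment column.

    column: string of residues observed at this position.
    subst:  nested dict of substitution probabilities (assumed input).
    m:      assumed multiplier linking column diversity to pseudo-count total.
    """
    counts = Counter(column)
    n_total = sum(counts.values())
    diversity = len(counts)          # number of distinct residues observed
    total_pseudo = m * diversity     # diverse columns receive more pseudo-counts
    pseudo = {
        a: total_pseudo
        * sum(counts[i] / n_total * subst[i][a] for i in counts)
        for a in subst
    }
    return counts, pseudo


def smoothed_frequencies(column, subst, m=5.0):
    """Combine observed counts and pseudo-counts into position frequencies."""
    counts, pseudo = position_pseudocounts(column, subst, m)
    denom = sum(counts.values()) + sum(pseudo.values())
    # Counter returns 0 for residues that were never observed in the column.
    return {a: (counts[a] + pseudo[a]) / denom for a in subst}


if __name__ == "__main__":
    print(smoothed_frequencies("AAAG", TOY_SUBST))
```

In this sketch a residue that never appears in the column (here `S`) still receives a non-zero frequency, which is the purpose of pseudo-counts, while a fully conserved column receives fewer pseudo-counts than a variable one.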

Year:  1996        PMID: 8744776     DOI: 10.1093/bioinformatics/12.2.135

Source DB:  PubMed          Journal:  Comput Appl Biosci        ISSN: 0266-7061


  51 in total

1.  Reevaluation of the determinants of tyrosine sulfation.

Authors:  H B Nicholas; S S Chan; G L Rosenquist
Journal:  Endocrine       Date:  1999-12       Impact factor: 3.633

2.  Evolutionary relationship between K(+) channels and symporters.

Authors:  S R Durell; Y Hao; T Nakamura; E P Bakker; H R Guy
Journal:  Biophys J       Date:  1999-08       Impact factor: 4.033

3.  Predicting deleterious amino acid substitutions.

Authors:  P C Ng; S Henikoff
Journal:  Genome Res       Date:  2001-05       Impact factor: 9.043

4.  PSI-BLAST searches using hidden markov models of structural repeats: prediction of an unusual sliding DNA clamp and of beta-propellers in UV-damaged DNA-binding protein.

Authors:  A F Neuwald; A Poleksic
Journal:  Nucleic Acids Res       Date:  2000-09-15       Impact factor: 16.971

5.  CODEHOP (COnsensus-DEgenerate Hybrid Oligonucleotide Primer) PCR primer design.

Authors:  Timothy M Rose; Jorja G Henikoff; Steven Henikoff
Journal:  Nucleic Acids Res       Date:  2003-07-01       Impact factor: 16.971

6.  Tools for comparative protein structure modeling and analysis.

Authors:  Narayanan Eswar; Bino John; Nebojsa Mirkovic; Andras Fiser; Valentin A Ilyin; Ursula Pieper; Ashley C Stuart; Marc A Marti-Renom; M S Madhusudhan; Bozidar Yerkovich; Andrej Sali
Journal:  Nucleic Acids Res       Date:  2003-07-01       Impact factor: 16.971

7.  PARSESNP: A tool for the analysis of nucleotide polymorphisms.

Authors:  Nicholas E Taylor; Elizabeth A Greene
Journal:  Nucleic Acids Res       Date:  2003-07-01       Impact factor: 16.971

8.  Enhancement to the RANKPEP resource for the prediction of peptide binding to MHC molecules using profiles.

Authors:  Pedro A Reche; John-Paul Glutting; Hong Zhang; Ellis L Reinherz
Journal:  Immunogenetics       Date:  2004-09-03       Impact factor: 2.846

9.  Alignment of protein sequences by their profiles.

Authors:  Marc A Marti-Renom; M S Madhusudhan; Andrej Sali
Journal:  Protein Sci       Date:  2004-04       Impact factor: 6.725

10.  Sequence context-specific profiles for homology searching.

Authors:  A Biegert; J Söding
Journal:  Proc Natl Acad Sci U S A       Date:  2009-02-20       Impact factor: 11.205
