
The size distribution of insertions and deletions in human and rodent pseudogenes suggests the logarithmic gap penalty for sequence alignment.

X Gu, W H Li.

Abstract

The size distributions of deletions, insertions, and indels (i.e., insertions or deletions) were studied, using 78 human processed pseudogenes and other published data sets. The following results were obtained: (1) Deletions occur more frequently than do insertions in sequence evolution; none of the pseudogenes studied shows significantly more insertions than deletions. (2) Empirically, the size distributions of deletions, insertions, and indels can be described well by a power law, i.e., $f_k = C k^{-b}$, where $f_k$ is the frequency of deletion, insertion, or indel with gap length $k$, $b$ is the power parameter, and $C$ is the normalization factor. (3) The estimates of $b$ for deletions and insertions from the same data set are approximately equal to each other, indicating that the size distributions for deletions and insertions are approximately identical. (4) The variation in the estimates of $b$ among various data sets is small, indicating that the effect of local structure exists but only plays a secondary role in the size distribution of deletions and insertions. (5) The linear gap penalty, which is most commonly used in sequence alignment, is not supported by our analysis; rather, the power law for the size distribution of indels suggests that an appropriate gap penalty is $w_k = a + b \ln k$, where $a$ is the gap creation cost and $b \ln k$ is the gap extension cost. (6) The higher frequency of deletion over insertion suggests that the gap creation cost of insertion ($a_i$) should be larger than that of deletion ($a_d$); that is, $a_i - a_d = \ln R$, where $R$ is the frequency ratio of deletions to insertions.
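
The link between the power law and the logarithmic penalty can be illustrated with a minimal Python sketch. It assumes that a gap cost is read as a negative log-frequency, so that -ln f_k = -ln C + b ln k has the same form as w_k = a + b ln k. The function names, the truncation of the distribution at a maximum length `k_max`, and the numeric values of `a`, `b`, and `R` are all illustrative assumptions, not estimates or code from the paper.

```python
import math


def log_gap_penalty(k, a, b):
    """Logarithmic gap cost w_k = a + b * ln(k) for a gap of length k >= 1."""
    if k < 1:
        raise ValueError("gap length must be >= 1")
    return a + b * math.log(k)


def power_law_frequencies(b, k_max):
    """Frequencies f_k = C * k**(-b) for k = 1..k_max.

    Assumption: the distribution is truncated at k_max so that the
    normalization factor C is a finite sum.
    """
    weights = [k ** (-b) for k in range(1, k_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def insertion_creation_cost(a_del, ratio):
    """Gap creation cost for insertions, a_i = a_d + ln(R).

    ratio is R, the frequency ratio of deletions to insertions.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return a_del + math.log(ratio)


if __name__ == "__main__":
    b = 1.5        # hypothetical power parameter
    a_del = 2.0    # hypothetical deletion creation cost
    ratio = 2.0    # hypothetical deletion/insertion ratio R

    freqs = power_law_frequencies(b, k_max=50)
    for k in (1, 2, 5, 10):
        # Cost relative to a length-1 gap, derived from the frequencies,
        # next to the extension term b * ln(k) of the penalty.
        from_freq = -math.log(freqs[k - 1] / freqs[0])
        print(k, round(from_freq, 6), round(b * math.log(k), 6))

    print("deletion cost, k=3: ", log_gap_penalty(3, a_del, b))
    a_ins = insertion_creation_cost(a_del, ratio)
    print("insertion cost, k=3:", log_gap_penalty(3, a_ins, b))
```

In this sketch the two printed columns agree for every k, which is the sense in which a power-law size distribution corresponds to a logarithmic rather than a linear extension cost. A length-1 gap costs exactly the creation cost, since ln 1 = 0.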


Year:  1995        PMID: 7769622     DOI: 10.1007/BF00164032

Source DB:  PubMed          Journal:  J Mol Evol        ISSN: 0022-2844            Impact factor:   2.395


