Duplication count distributions in DNA sequences.

Suzanne S Sindi, Brian R Hunt, James A Yorke.

Abstract

We study quantitative features of complex repetitive DNA in several genomes by examining sequences that are sufficiently long that they are unlikely to have repeated by chance. For each genome we study, we determine the number of identical copies, the "duplication count," of each sequence of length 40, that is, of each "40-mer." We say a 40-mer is "repeated" if its duplication count is at least 2. We focus mainly on "complex" 40-mers, those without short internal repetitions. We find that we can classify most of the complex repeated 40-mers into two categories: one category has its copies clustered closely together on one chromosome, the other has its copies distributed widely across multiple chromosomes. For each genome and each of the categories above, we compute N(c), the number of 40-mers that have duplication count c, for each integer c. In each case, we observe a power-law-like decay in N(c) as c increases from 3 to 50 or higher. In particular, we find that N(c) decays much more slowly than would be predicted by evolutionary models where each 40-mer is equally likely to be duplicated. We also analyze an evolutionary model that does reflect the slow decay of N(c).
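The two quantities defined in the abstract, the duplication count of each k-mer and the distribution N(c), can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `duplication_counts` and `count_distribution` are hypothetical, only exact matches on the given strand are counted (reverse complements are not considered), and the "complex" filter for short internal repetitions is omitted because the abstract does not define it precisely.

```python
from collections import Counter

VALID_BASES = set("ACGT")

def duplication_counts(sequence, k=40):
    """Map each k-mer in `sequence` to its number of exact occurrences.

    Assumption: only the given strand is scanned, and windows containing
    characters other than A, C, G, T (for example N) are skipped.
    """
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if set(kmer) <= VALID_BASES:
            counts[kmer] += 1
    return counts

def count_distribution(counts):
    """Return N(c): the number of distinct k-mers with duplication count c."""
    return Counter(counts.values())

# Toy example with k=3 so the result can be checked by hand.
toy = "ACGACGACGTT"
dup = duplication_counts(toy, k=3)
n_of_c = count_distribution(dup)
for c in sorted(n_of_c):
    print(c, n_of_c[c])
```

In the toy sequence, ACG occurs three times, CGA and GAC twice each, and CGT and GTT once each, so N(1) = 2, N(2) = 2, and N(3) = 1. For a whole genome, an in-memory `Counter` of 40-mers is unlikely to be practical; a dedicated k-mer counter would be the usual choice.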

Entities: Chemical Species

Substances: DNA

Year: 2008 PMID: 19256873 PMCID: PMC3121164 DOI: 10.1103/PhysRevE.78.061912

Source DB: PubMed Journal: Phys Rev E Stat Nonlin Soft Matter Phys ISSN: 1539-3755
