Comparing the Statistical Fate of Paralogous and Orthologous Sequences.

Florian Massip, Michael Sheinman, Sophie Schbath, Peter F Arndt.

Abstract

For several decades, sequence alignment has been a widely used tool in bioinformatics. For instance, searching large databases for homologous sequences of known function gives insight into the function of nonannotated genomic regions. Very efficient tools such as BLAST have been developed to identify and rank possible homologous sequences. To estimate the significance of the homology, the ranking of alignment scores takes a background model for random sequences into account. Using this model, we can estimate the probability of finding two exactly matching subsequences by chance in two unrelated sequences. For two homologous sequences, the corresponding probability is much higher, which allows us to identify them. Here we focus on the distribution of lengths of exact sequence matches between protein-coding regions of pairs of evolutionarily distant genomes. We show that this distribution exhibits a power-law tail with an exponent [Formula: see text]. Developing a simple model of sequence evolution by substitutions and segmental duplications, we show analytically and computationally that paralogous and orthologous gene pairs contribute differently to this distribution. Our model explains the differences observed in the comparison of coding and noncoding parts of genomes, thus providing a better understanding of statistical properties of genomic sequences and their evolution.
Copyright © 2016 by the Genetics Society of America.
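
The background-model reasoning in the abstract can be illustrated with a minimal sketch. It assumes the simplest possible background, in which letters are drawn independently with fixed frequencies; the record does not specify the background model actually used, and the function names below are hypothetical. Under that assumption, the expected number of position pairs at which two unrelated sequences agree over r consecutive letters follows from linearity of expectation.

```python
def match_probability(freqs):
    """Probability that two independently drawn letters are identical.

    `freqs` maps each letter to its frequency (assumed to sum to 1).
    """
    return sum(f * f for f in freqs.values())


def expected_chance_matches(len_a, len_b, r, freqs):
    """Expected number of start-position pairs (i, j) whose next r letters agree.

    Assumes an i.i.d. letter model for both sequences (an illustrative
    simplification, not necessarily the model used in the paper).
    """
    if r > len_a or r > len_b:
        return 0.0
    p = match_probability(freqs)
    # Each of the (len_a - r + 1) * (len_b - r + 1) position pairs
    # matches over r letters with probability p ** r.
    return (len_a - r + 1) * (len_b - r + 1) * p ** r


if __name__ == "__main__":
    uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    for r in (10, 20, 30):
        print(r, expected_chance_matches(10**6, 10**6, r, uniform))
```

Because this expectation decays exponentially with r, long exact matches are very unlikely between unrelated sequences under such a model, which is why an excess of long matches, such as the power-law tail described in the abstract, points to homology.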

Keywords:  DNA duplications; comparative genomics; genome evolution; statistical genomics

Year: 2016; PMID: 27474728; PMCID: PMC5068840; DOI: 10.1534/genetics.116.193912

Source DB: PubMed; Journal: Genetics; ISSN: 0016-6731; Impact factor: 4.562


