Algorithms to distinguish the role of gene-conversion from single-crossover recombination in the derivation of SNP sequences in populations.

Yun S Song, Zhihong Ding, Dan Gusfield, Charles H Langley, Yufeng Wu.

Abstract

Meiotic recombination is a fundamental biological event and one of the principal evolutionary forces responsible for shaping genetic variation within species. In addition to its fundamental role, recombination is central to several critical applied problems. The most important example is "association mapping" in populations, which is widely hoped to help find genes that influence genetic diseases (Carlson et al., 2004; Clark, 2003). Hence, a great deal of recent attention has focused on problems of inferring the historical derivation of sequences in populations when both mutations and recombinations have occurred. In the algorithms literature, most of that recent work has been directed to single-crossover recombination. However, gene-conversion is an important, and more common, form of (two-crossover) recombination which has been much less investigated in the algorithms literature. In this paper, we explicitly incorporate gene-conversion into discrete methods to study historical recombination. We are concerned with algorithms for identifying and locating the extent of historical crossing-over and gene-conversion (along with single-nucleotide mutation), and problems of constructing full putative histories of those events. The novel technical issues concern the incorporation of gene-conversion into recently developed discrete methods (Myers and Griffiths, 2003; Song et al., 2005) that compute lower and upper-bound information on the amount of needed recombination without gene-conversion. We first examine the most natural extension of the lower bound methods from Myers and Griffiths (2003), showing that the extension can be computed efficiently, but that this extension can only yield weak lower bounds. We then develop additional ideas that lead to higher lower bounds, and show how to solve, via integer-linear programming, a more biologically realistic version of the lower bound problem. 
We also show how to compute effective upper bounds on the number of needed single-crossovers and gene-conversions, along with explicit networks showing a putative history of mutations, single-crossovers and gene-conversions. Both lower and upper bound methods can handle data with missing entries, and the upper bound method can be used to infer missing entries with high accuracy. We validate the significance of these methods by showing that they can be effectively used to distinguish simulation-derived sequences generated without gene-conversion from sequences that were generated with gene-conversion. We apply the methods to recently studied sequences of Arabidopsis thaliana, identifying many more regions where gene-conversion may have played a significant role than were previously identified (Plagnol et al., 2006). Demonstration software is available at www.csif.cs.ucdavis.edu/~gusfield.
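
As an illustration of the kind of discrete lower bound the abstract refers to, the sketch below computes the classical Hudson-Kaplan bound on single-crossover recombination from binary SNP haplotypes. It is not the authors' method: it ignores gene-conversion and missing entries, and the function names (`four_gamete_incompatible`, `hk_lower_bound`) and the input encoding (equal-length strings of `0` and `1`) are assumptions made for this example. It relies on the four-gamete test: under the infinite-sites mutation model, two sites that show all four combinations 00, 01, 10 and 11 require at least one recombination between them.

```python
from itertools import combinations

ALL_GAMETES = {("0", "0"), ("0", "1"), ("1", "0"), ("1", "1")}


def four_gamete_incompatible(haplotypes, i, j):
    """Return True if sites i and j together show all four gametes."""
    observed = {(h[i], h[j]) for h in haplotypes}
    return observed == ALL_GAMETES


def hk_lower_bound(haplotypes):
    """Hudson-Kaplan style lower bound on single-crossover events.

    Assumes haplotypes are equal-length strings over '0'/'1' with no
    missing entries (an assumption of this sketch).
    """
    if not haplotypes:
        return 0
    num_sites = len(haplotypes[0])
    # Every incompatible site pair (i, j) forces a breakpoint between i and j.
    intervals = [
        (i, j)
        for i, j in combinations(range(num_sites), 2)
        if four_gamete_incompatible(haplotypes, i, j)
    ]
    # Greedy interval scheduling: the largest set of intervals that do not
    # overlap (sharing an endpoint site is allowed) gives the bound.
    intervals.sort(key=lambda pair: pair[1])
    count = 0
    last_right = 0
    for left, right in intervals:
        if left >= last_right:
            count += 1
            last_right = right
    return count


if __name__ == "__main__":
    sample = ["000", "011", "101", "110"]
    print(hk_lower_bound(sample))
```

A gene-conversion event can resolve incompatibilities that would otherwise each require a separate crossover, which is one reason a bound of this single-crossover form does not carry over directly when gene-conversion is allowed.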

Year:  2007        PMID: 18047424      PMCID: PMC2581774          DOI: 10.1089/cmb.2007.0096

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  24 in total

1.  Distinguishing recombination and intragenic gene conversion by linkage disequilibrium patterns.

Authors:  T Wiehe; J Mountain; P Parham; M Slatkin
Journal:  Genet Res       Date:  2000-02       Impact factor: 1.588

2.  Gene conversion and different population histories may explain the contrast between polymorphism and linkage disequilibrium levels.

Authors:  L Frisse; R R Hudson; A Bartoszewicz; J D Wall; J Donfack; A Di Rienzo
Journal:  Am J Hum Genet       Date:  2001-08-29       Impact factor: 11.025

3.  Bounds on the minimum number of recombination events in a sample history.

Authors:  Simon R Myers; Robert C Griffiths
Journal:  Genetics       Date:  2003-01       Impact factor: 4.562

Review 4.  Finding genes underlying risk of complex disease by linkage disequilibrium mapping.

Authors:  Andrew G Clark
Journal:  Curr Opin Genet Dev       Date:  2003-06       Impact factor: 5.578

5.  On the minimum number of recombination events in the evolutionary history of DNA sequences.

Authors:  Yun S Song; Jotun Hein
Journal:  J Math Biol       Date:  2003-08-20       Impact factor: 2.259

6.  Close look at gene conversion hot spots.

Authors:  Jeffrey D Wall
Journal:  Nat Genet       Date:  2004-02       Impact factor: 38.330

Review 7.  Mapping complex disease loci in whole-genome association studies.

Authors:  Christopher S Carlson; Michael A Eberle; Leonid Kruglyak; Deborah A Nickerson
Journal:  Nature       Date:  2004-05-27       Impact factor: 49.962

8.  Estimating the rate of gene conversion on human chromosome 21.

Authors:  Badri Padhukasahasram; Paul Marjoram; Magnus Nordborg
Journal:  Am J Hum Genet       Date:  2004-07-12       Impact factor: 11.025

9.  Optimal, efficient reconstruction of phylogenetic networks with constrained recombination.

Authors:  Dan Gusfield; Satish Eddhu; Charles Langley
Journal:  J Bioinform Comput Biol       Date:  2004-03       Impact factor: 1.122

10.  Intense and highly localized gene conversion activity in human meiotic crossover hot spots.

Authors:  Alec J Jeffreys; Celia A May
Journal:  Nat Genet       Date:  2004-01-04       Impact factor: 38.330

  8 in total

1.  A decomposition theory for phylogenetic networks and incompatible characters.

Authors:  Dan Gusfield; Vikas Bansal; Vineet Bafna; Yun S Song
Journal:  J Comput Biol       Date:  2007-12       Impact factor: 1.479

2.  Molecular population genetics of Drosophila subtelomeric DNA.

Authors:  Jennifer A Anderson; Yun S Song; Charles H Langley
Journal:  Genetics       Date:  2008-01       Impact factor: 4.562

3.  Genome-wide compatible SNP intervals and their properties.

Authors:  Jeremy Wang; Fernando Pardo-Manuel de Villena; Kyle J Moore; Wei Wang; Qi Zhang; Leonard McMillan
Journal:  ACM Int Conf Bioinform Comput Biol (2010)       Date:  2010-08

4.  In the light of deep coalescence: revisiting trees within networks.

Authors:  Jiafan Zhu; Yun Yu; Luay Nakhleh
Journal:  BMC Bioinformatics       Date:  2016-11-11       Impact factor: 3.169

5.  Haplotypes spanning centromeric regions reveal persistence of large blocks of archaic DNA.

Authors:  Sasha A Langley; Karen H Miga; Gary H Karpen; Charles H Langley
Journal:  Elife       Date:  2019-06-25       Impact factor: 8.140

6.  Joint estimation of gene conversion rates and mean conversion tract lengths from population SNP data.

Authors:  Junming Yin; Michael I Jordan; Yun S Song
Journal:  Bioinformatics       Date:  2009-06-15       Impact factor: 6.937

7.  A comparison of phylogenetic network methods using computer simulation.

Authors:  Steven M Woolley; David Posada; Keith A Crandall
Journal:  PLoS One       Date:  2008-04-09       Impact factor: 3.240

8.  A human genome-wide library of local phylogeny predictions for whole-genome inference problems.

Authors:  Srinath Sridhar; Russell Schwartz
Journal:  BMC Genomics       Date:  2008-08-18       Impact factor: 3.969
