
Building a pan-genome reference for a population.

Ngan Nguyen, Glenn Hickey, Daniel R Zerbino, Brian Raney, Dent Earl, Joel Armstrong, W James Kent, David Haussler, Benedict Paten.

Abstract

A reference genome is a high-quality individual genome that is used as a coordinate system for the genomes of a population, or for the genomes of closely related subspecies. Given a set of genomes partitioned by homology into alignment blocks, we formalize the problem of ordering and orienting the blocks such that the resulting ordering maximally agrees with the underlying genomes' ordering and orientation, creating a pan-genome reference ordering. We show that this problem is NP-hard, but also demonstrate, empirically and in simulations, the performance of heuristic algorithms based upon a cactus graph decomposition to find locally maximal solutions. We describe an extension of our Cactus software to create a pan-genome reference for whole genome alignments, and demonstrate how it can be used to create novel genome browser visualizations, using human variation data as a test. In addition, we test the use of a pan-genome for describing variations and as a reference for read mapping.
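The ordering-and-orienting problem described above can be made concrete with a toy example. The sketch below is illustrative only and is not the paper's formulation or the Cactus implementation: it assumes blocks are signed integers (+b forward, -b reversed), that each block occurs at most once per genome, and that "agreement" is the number of block pairs whose relative order and orientation in a genome match the candidate reference. The names `agreement` and `best_reference` are hypothetical. The exhaustive search visits n! * 2^n candidates, which is why heuristics are needed beyond a handful of blocks.

```python
from itertools import permutations, product


def agreement(reference, genomes):
    """Count block pairs whose order and orientation in a genome match the reference.

    Blocks are signed integers: +b is block b read forward, -b is block b reversed.
    Assumes each block occurs at most once per genome (a simplification).
    """
    position = {abs(b): i for i, b in enumerate(reference)}
    ref_sign = {abs(b): (1 if b > 0 else -1) for b in reference}
    score = 0
    for genome in genomes:
        for i, a in enumerate(genome):
            for b in genome[i + 1:]:
                # +1 if the genome reads the block in the reference's orientation.
                rel_a = (1 if a > 0 else -1) * ref_sign[abs(a)]
                rel_b = (1 if b > 0 else -1) * ref_sign[abs(b)]
                if rel_a != rel_b:
                    continue
                forward = position[abs(a)] < position[abs(b)]
                # Same strand: a must precede b; opposite strand: b must precede a.
                if (rel_a == 1) == forward:
                    score += 1
    return score


def best_reference(genomes):
    """Exhaustively try every order and orientation; feasible only for tiny inputs."""
    blocks = sorted({abs(b) for g in genomes for b in g})
    best, best_score = None, -1
    for order in permutations(blocks):
        for signs in product((1, -1), repeat=len(blocks)):
            candidate = [s * b for s, b in zip(signs, order)]
            score = agreement(candidate, genomes)
            if score > best_score:
                best, best_score = candidate, score
    return best, best_score


# Three toy genomes; the second carries an inversion of block 2.
genomes = [[1, 2, 3], [1, -2, 3], [1, 2, 3]]
reference, score = best_reference(genomes)
print(reference, score)
```

Under this toy objective a reference and its reverse complement score identically, so the optimum is never unique; the search simply returns the first best candidate it encounters.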

Keywords:  algorithms; computational molecular biology; genomics; molecular evolution; sequence analysis

Year: 2015; PMID: 25565268; PMCID: PMC4424974; DOI: 10.1089/cmb.2014.0146

Source DB: PubMed; Journal: J Comput Biol; ISSN: 1066-5277; Impact factor: 1.479


  17 in total

Review 1.  Completing the human genome: the progress and challenge of satellite DNA assembly.

Authors:  Karen H Miga
Journal:  Chromosome Res       Date:  2015-09       Impact factor: 5.239

2.  A Flow Procedure for Linearization of Genome Sequence Graphs.

Authors:  David Haussler; Maciej Smuga-Otto; Jordan M Eizenga; Benedict Paten; Adam M Novak; Sergei Nikitin; Maria Zueva; Dmitrii Miagkov
Journal:  J Comput Biol       Date:  2018-05-24       Impact factor: 1.479

Review 3.  Whole-Genome Alignment and Comparative Annotation.

Authors:  Joel Armstrong; Ian T Fiddes; Mark Diekhans; Benedict Paten
Journal:  Annu Rev Anim Biosci       Date:  2018-10-31       Impact factor: 8.923

Review 4.  Genetic Variation, Comparative Genomics, and the Diagnosis of Disease.

Authors:  Evan E Eichler
Journal:  N Engl J Med       Date:  2019-07-04       Impact factor: 91.245

5.  Bloom Filter Trie: an alignment-free and reference-free data structure for pan-genome storage.

Authors:  Guillaume Holley; Roland Wittler; Jens Stoye
Journal:  Algorithms Mol Biol       Date:  2016-04-14       Impact factor: 1.405

6.  Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly.

Authors:  Valerie A Schneider; Tina Graves-Lindsay; Kerstin Howe; Nathan Bouk; Hsiu-Chuan Chen; Paul A Kitts; Terence D Murphy; Kim D Pruitt; Françoise Thibaud-Nissen; Derek Albracht; Robert S Fulton; Milinn Kremitzki; Vincent Magrini; Chris Markovic; Sean McGrath; Karyn Meltz Steinberg; Kate Auger; William Chow; Joanna Collins; Glenn Harden; Timothy Hubbard; Sarah Pelan; Jared T Simpson; Glen Threadgold; James Torrance; Jonathan M Wood; Laura Clarke; Sergey Koren; Matthew Boitano; Paul Peluso; Heng Li; Chen-Shan Chin; Adam M Phillippy; Richard Durbin; Richard K Wilson; Paul Flicek; Evan E Eichler; Deanna M Church
Journal:  Genome Res       Date:  2017-04-10       Impact factor: 9.043

7.  Characterizing the Major Structural Variant Alleles of the Human Genome.

Authors:  Peter A Audano; Arvis Sulovari; Tina A Graves-Lindsay; Stuart Cantsilieris; Melanie Sorensen; AnneMarie E Welch; Max L Dougherty; Bradley J Nelson; Ankeeta Shah; Susan K Dutcher; Wesley C Warren; Vincent Magrini; Sean D McGrath; Yang I Li; Richard K Wilson; Evan E Eichler
Journal:  Cell       Date:  2019-01-17       Impact factor: 41.582

8.  Progressive Cactus is a multiple-genome aligner for the thousand-genome era.

Authors:  Joel Armstrong; Glenn Hickey; Mark Diekhans; Ian T Fiddes; Adam M Novak; Alden Deran; Qi Fang; Duo Xie; Shaohong Feng; Josefin Stiller; Diane Genereux; Jeremy Johnson; Voichita Dana Marinescu; Jessica Alföldi; Robert S Harris; Kerstin Lindblad-Toh; David Haussler; Elinor Karlsson; Erich D Jarvis; Guojie Zhang; Benedict Paten
Journal:  Nature       Date:  2020-11-11       Impact factor: 49.962

9.  A tri-tuple coordinate system derived for fast and accurate analysis of the colored de Bruijn graph-based pangenomes.

Authors:  Jindan Guo; Erli Pang; Hongtao Song; Kui Lin
Journal:  BMC Bioinformatics       Date:  2021-05-27       Impact factor: 3.169

Review 10.  Plant NLR diversity: the known unknowns of pan-NLRomes.

Authors:  A Cristina Barragan; Detlef Weigel
Journal:  Plant Cell       Date:  2021-05-31       Impact factor: 12.085
