
Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome.

Elliott H Margulies, Gregory M Cooper, George Asimenos, Daryl J Thomas, Colin N Dewey, Adam Siepel, Ewan Birney, Damian Keefe, Ariel S Schwartz, Minmei Hou, James Taylor, Sergey Nikolaev, Juan I Montoya-Burgos, Ari Löytynoja, Simon Whelan, Fabio Pardi, Tim Massingham, James B Brown, Peter Bickel, Ian Holmes, James C Mullikin, Abel Ureta-Vidal, Benedict Paten, Eric A Stone, Kate R Rosenbloom, W James Kent, Gerard G Bouffard, Xiaobin Guan, Nancy F Hansen, Jacquelyn R Idol, Valerie V B Maduro, Baishali Maskeri, Jennifer C McDowell, Morgan Park, Pamela J Thomas, Alice C Young, Robert W Blakesley, Donna M Muzny, Erica Sodergren, David A Wheeler, Kim C Worley, Huaiyang Jiang, George M Weinstock, Richard A Gibbs, Tina Graves, Robert Fulton, Elaine R Mardis, Richard K Wilson, Michele Clamp, James Cuff, Sante Gnerre, David B Jaffe, Jean L Chang, Kerstin Lindblad-Toh, Eric S Lander, Angie Hinrichs, Heather Trumbower, Hiram Clawson, Ann Zweig, Robert M Kuhn, Galt Barber, Rachel Harte, Donna Karolchik, Matthew A Field, Richard A Moore, Carrie A Matthewson, Jacqueline E Schein, Marco A Marra, Stylianos E Antonarakis, Serafim Batzoglou, Nick Goldman, Ross Hardison, David Haussler, Webb Miller, Lior Pachter, Eric D Green, Arend Sidow.

Abstract

A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; comparisons of these methods reveal large-scale consistency but substantial differences in terms of small genomic rearrangements, sensitivity (sequence coverage), and specificity (alignment accuracy). We describe the quantitative and qualitative trade-offs concomitant with alignment method choice and the levels of technical error that need to be accounted for in applications that require multisequence alignments. Using the generated alignments, we identified constrained regions using three different methods. While the different constraint-detecting methods are in general agreement, there are important discrepancies relating to both the underlying alignments and the specific algorithms. However, by integrating the results across the alignments and constraint-detecting methods, we produced constraint annotations that were found to be robust based on multiple independent measures. Analyses of these annotations illustrate that most classes of experimentally annotated functional elements are enriched for constrained sequences; however, large portions of each class (with the exception of protein-coding sequences) do not overlap constrained regions. The latter elements might not be under primary sequence constraint, might not be constrained across all mammals, or might have expendable molecular functions. Conversely, 40% of the constrained sequences do not overlap any of the functional elements that have been experimentally identified. Together, these findings demonstrate and quantify how many genomic functional elements await basic molecular characterization.

Year:  2007        PMID: 17567995      PMCID: PMC1891336          DOI: 10.1101/gr.6034307

Source DB:  PubMed          Journal:  Genome Res        ISSN: 1088-9051            Impact factor:   9.043


  68 in total

1.  BLAT--the BLAST-like alignment tool.

Authors:  W James Kent
Journal:  Genome Res       Date:  2002-04       Impact factor: 9.043

2.  The human genome browser at UCSC.

Authors:  W James Kent; Charles W Sugnet; Terrence S Furey; Krishna M Roskin; Tom H Pringle; Alan M Zahler; David Haussler
Journal:  Genome Res       Date:  2002-06       Impact factor: 9.043

3.  Mutation rate variation in the mammalian genome. (Review)

Authors:  Hans Ellegren; Nick G C Smith; Matthew T Webster
Journal:  Curr Opin Genet Dev       Date:  2003-12       Impact factor: 5.578

4.  Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes.

Authors:  Samuel Aparicio; Jarrod Chapman; Elia Stupka; Nik Putnam; Jer-Ming Chia; Paramvir Dehal; Alan Christoffels; Sam Rash; Shawn Hoon; Arian Smit; Maarten D Sollewijn Gelpke; Jared Roach; Tania Oh; Isaac Y Ho; Marie Wong; Chris Detter; Frans Verhoef; Paul Predki; Alice Tay; Susan Lucas; Paul Richardson; Sarah F Smith; Melody S Clark; Yvonne J K Edwards; Norman Doggett; Andrey Zharkikh; Sean V Tavtigian; Dmitry Pruss; Mary Barnstead; Cheryl Evans; Holly Baden; Justin Powell; Gustavo Glusman; Lee Rowen; Leroy Hood; Y H Tan; Greg Elgar; Trevor Hawkins; Byrappa Venkatesh; Daniel Rokhsar; Sydney Brenner
Journal:  Science       Date:  2002-07-25       Impact factor: 47.728

5.  ProbCons: Probabilistic consistency-based multiple sequence alignment.

Authors:  Chuong B Do; Mahathi S P Mahabhashyam; Michael Brudno; Serafim Batzoglou
Journal:  Genome Res       Date:  2005-02       Impact factor: 9.043

6.  Comparative sequencing provides insights about the structure and conservation of marsupial and monotreme genomes.

Authors:  Elliott H Margulies; Valerie V B Maduro; Pamela J Thomas; Jeffery P Tomkins; Chris T Amemiya; Meizhong Luo; Eric D Green
Journal:  Proc Natl Acad Sci U S A       Date:  2005-02-17       Impact factor: 11.205

7.  Resolution of the early placental mammal radiation using Bayesian phylogenetics.

Authors:  W J Murphy; E Eizirik; S J O'Brien; O Madsen; M Scally; C J Douady; E Teeling; O A Ryder; M J Stanhope; W W de Jong; M S Springer
Journal:  Science       Date:  2001-12-14       Impact factor: 47.728

8.  Conserved fragments of transposable elements in intergenic regions: evidence for widespread recruitment of MIR- and L2-derived sequences within the mouse and human genomes.

Authors:  J C Silva; S A Shabalina; D G Harris; J L Spouge; A S Kondrashov
Journal:  Genet Res       Date:  2003-08       Impact factor: 1.588

9.  Assessing computational tools for the discovery of transcription factor binding sites.

Authors:  Martin Tompa; Nan Li; Timothy L Bailey; George M Church; Bart De Moor; Eleazar Eskin; Alexander V Favorov; Martin C Frith; Yutao Fu; W James Kent; Vsevolod J Makeev; Andrei A Mironov; William Stafford Noble; Giulio Pavesi; Graziano Pesole; Mireille Régnier; Nicolas Simonis; Saurabh Sinha; Gert Thijs; Jacques van Helden; Mathias Vandenbogaert; Zhiping Weng; Christopher Workman; Chun Ye; Zhou Zhu
Journal:  Nat Biotechnol       Date:  2005-01       Impact factor: 54.908

10.  Highly conserved non-coding sequences are associated with vertebrate development.

Authors:  Adam Woolfe; Martin Goodson; Debbie K Goode; Phil Snell; Gayle K McEwen; Tanya Vavouri; Sarah F Smith; Phil North; Heather Callaway; Krys Kelly; Klaudia Walter; Irina Abnizova; Walter Gilks; Yvonne J K Edwards; Julie E Cooke; Greg Elgar
Journal:  PLoS Biol       Date:  2004-11-11       Impact factor: 8.029

  118 in total

1.  Comparative assessment of methods for aligning multiple genome sequences.

Authors:  Xiaoyu Chen; Martin Tompa
Journal:  Nat Biotechnol       Date:  2010-05-23       Impact factor: 54.908

2.  CAGE: Combinatorial Analysis of Gene-cluster Evolution.

Authors:  Giltae Song; Louxin Zhang; Tomas Vinar; Webb Miller
Journal:  J Comput Biol       Date:  2010-09       Impact factor: 1.479

3.  ChIP-Seq identification of weakly conserved heart enhancers.

Authors:  Matthew J Blow; David J McCulley; Zirong Li; Tao Zhang; Jennifer A Akiyama; Amy Holt; Ingrid Plajzer-Frick; Malak Shoukry; Crystal Wright; Feng Chen; Veena Afzal; James Bristow; Bing Ren; Brian L Black; Edward M Rubin; Axel Visel; Len A Pennacchio
Journal:  Nat Genet       Date:  2010-08-22       Impact factor: 38.330

4.  Extrathymic generation of regulatory T cells in placental mammals mitigates maternal-fetal conflict.

Authors:  Robert M Samstein; Steven Z Josefowicz; Aaron Arvey; Piper M Treuting; Alexander Y Rudensky
Journal:  Cell       Date:  2012-07-06       Impact factor: 41.582

5.  Reliable prediction of regulator targets using 12 Drosophila genomes.

Authors:  Pouya Kheradpour; Alexander Stark; Sushmita Roy; Manolis Kellis
Journal:  Genome Res       Date:  2007-11-07       Impact factor: 9.043

6.  Comparative genomics beyond sequence-based alignments: RNA structures in the ENCODE regions.

Authors:  Elfar Torarinsson; Zizhen Yao; Eric D Wiklund; Jesper B Bramsen; Claus Hansen; Jørgen Kjems; Niels Tommerup; Walter L Ruzzo; Jan Gorodkin
Journal:  Genome Res       Date:  2007-12-20       Impact factor: 9.043

7.  Pseudogenes in the ENCODE regions: consensus annotation, analysis of transcription, and evolution.

Authors:  Deyou Zheng; Adam Frankish; Robert Baertsch; Philipp Kapranov; Alexandre Reymond; Siew Woh Choo; Yontao Lu; France Denoeud; Stylianos E Antonarakis; Michael Snyder; Yijun Ruan; Chia-Lin Wei; Thomas R Gingeras; Roderic Guigó; Jennifer Harrow; Mark B Gerstein
Journal:  Genome Res       Date:  2007-06       Impact factor: 9.043

8.  Prominent use of distal 5' transcription start sites and discovery of a large number of additional exons in ENCODE regions.

Authors:  France Denoeud; Philipp Kapranov; Catherine Ucla; Adam Frankish; Robert Castelo; Jorg Drenkow; Julien Lagarde; Tyler Alioto; Caroline Manzano; Jacqueline Chrast; Sujit Dike; Carine Wyss; Charlotte N Henrichsen; Nancy Holroyd; Mark C Dickson; Ruth Taylor; Zahra Hance; Sylvain Foissac; Richard M Myers; Jane Rogers; Tim Hubbard; Jennifer Harrow; Roderic Guigó; Thomas R Gingeras; Stylianos E Antonarakis; Alexandre Reymond
Journal:  Genome Res       Date:  2007-06       Impact factor: 9.043

9.  Nonhuman primate models of human viral infections. (Review)

Authors:  Jacob D Estes; Scott W Wong; Jason M Brenchley
Journal:  Nat Rev Immunol       Date:  2018-06       Impact factor: 53.106

10.  28-way vertebrate alignment and conservation track in the UCSC Genome Browser.

Authors:  Webb Miller; Kate Rosenbloom; Ross C Hardison; Minmei Hou; James Taylor; Brian Raney; Richard Burhans; David C King; Robert Baertsch; Daniel Blankenberg; Sergei L Kosakovsky Pond; Anton Nekrutenko; Belinda Giardine; Robert S Harris; Svitlana Tyekucheva; Mark Diekhans; Thomas H Pringle; William J Murphy; Arthur Lesk; George M Weinstock; Kerstin Lindblad-Toh; Richard A Gibbs; Eric S Lander; Adam Siepel; David Haussler; W James Kent
Journal:  Genome Res       Date:  2007-11-05       Impact factor: 9.043
