
ESPERR: learning strong and weak signals in genomic sequence alignments to identify functional elements.

James Taylor, Svitlana Tyekucheva, David C King, Ross C Hardison, Webb Miller, Francesca Chiaromonte.

Abstract

Genomic sequence signals, such as base composition, presence of particular motifs, or evolutionary constraint, have been used effectively to identify functional elements. However, approaches based only on specific signals known to correlate with function can be quite limiting. When training data are available, application of computational learning algorithms to multispecies alignments has the potential to capture broader and more informative sequence and evolutionary patterns that better characterize a class of elements. However, effective exploitation of patterns in multispecies alignments is impeded by the vast number of possible alignment columns and by a limited understanding of which particular strings of columns may characterize a given class. We have developed a computational method, called ESPERR (evolutionary and sequence pattern extraction through reduced representations), which uses training examples to learn encodings of multispecies alignments into reduced forms tailored for the prediction of chosen classes of functional elements. ESPERR produces a greatly improved Regulatory Potential score, which can discriminate regulatory regions from neutral sites with excellent accuracy (approximately 94%). This score captures strong signals (GC content and conservation), as well as subtler signals (with small contributions from many different alignment patterns) that characterize the regulatory elements in our training set. ESPERR is also effective for predicting other classes of functional elements, as we show for DNaseI hypersensitive sites and highly conserved regions with developmental enhancer activity. Our software, training data, and genome-wide predictions are available from our Web site (http://www.bx.psu.edu/projects/esperr).


Year:  2006        PMID: 17053093      PMCID: PMC1665643          DOI: 10.1101/gr.4537706

Source DB:  PubMed          Journal:  Genome Res        ISSN: 1088-9051            Impact factor:   9.043


