
Methods in comparative genomics: genome correspondence, gene identification and regulatory motif discovery.

Manolis Kellis, Nick Patterson, Bruce Birren, Bonnie Berger, Eric S Lander.

Abstract

In Kellis et al. (2003), we reported the genome sequences of S. paradoxus, S. mikatae, and S. bayanus and compared these three yeast species to their close relative, S. cerevisiae. Genomewide comparative analysis allowed the identification of functionally important sequences, both coding and noncoding. In this companion paper we describe the mathematical and algorithmic results underpinning the analysis of these genomes. (1) We present methods for the automatic determination of genome correspondence. The algorithms enabled the automatic identification of orthologs for more than 90% of genes and intergenic regions across the four species despite the large number of duplicated genes in the yeast genome. The remaining ambiguities in the gene correspondence revealed recent gene family expansions in regions of rapid genomic change. (2) We present methods for the identification of protein-coding genes based on their patterns of nucleotide conservation across related species. We observed the pressure to conserve the reading frame of functional proteins and developed a test for gene identification with high sensitivity and specificity. We used this test to revisit the genome of S. cerevisiae, reducing the overall gene count by 500 genes (10% of previously annotated genes) and refining the gene structure of hundreds of genes. (3) We present novel methods for the systematic de novo identification of regulatory motifs. The methods do not rely on previous knowledge of gene function and in that way differ from the current literature on computational motif discovery. Based on genomewide conservation patterns of known motifs, we developed three conservation criteria that we used to discover novel motifs. We used an enumeration approach to select strongly conserved motif cores, which we extended and collapsed into a small number of candidate regulatory motifs. These include most previously known regulatory motifs as well as several noteworthy novel motifs. 
The majority of discovered motifs are enriched in functionally related genes, allowing us to infer a candidate function for novel motifs. Our results demonstrate the power of comparative genomics to further our understanding of any species. Our methods are validated by the extensive experimental knowledge in yeast and will be invaluable in the study of complex genomes like that of the human.
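As a rough illustration of the reading-frame idea in point (2), the sketch below scores how well a second species preserves the frame of a reference sequence in a gapped pairwise alignment. It is a minimal sketch under stated assumptions, not the test described in the paper: the function name `reading_frame_conservation`, the use of `-` as the gap character, and the choice to report the best of the three frame offsets are all assumptions made for illustration.

```python
def reading_frame_conservation(ref_aln: str, other_aln: str) -> float:
    """Fraction of aligned (gap-free) columns that share the most common
    frame offset between two gapped sequences of equal length.

    Hypothetical helper for illustration only; '-' is assumed to mark gaps.
    """
    if len(ref_aln) != len(other_aln):
        raise ValueError("aligned sequences must have the same length")

    ref_pos = 0      # ungapped position in the reference sequence
    other_pos = 0    # ungapped position in the other sequence
    counts = [0, 0, 0]  # aligned columns seen at each frame offset (0, 1, 2)
    aligned = 0

    for a, b in zip(ref_aln, other_aln):
        if a != "-" and b != "-":
            # Offset between the two coordinates, modulo the codon length.
            counts[(other_pos - ref_pos) % 3] += 1
            aligned += 1
        if a != "-":
            ref_pos += 1
        if b != "-":
            other_pos += 1

    return max(counts) / aligned if aligned else 0.0


if __name__ == "__main__":
    ref = "ATGAAACCCGGG"
    # A gap whose length is a multiple of three keeps the frame intact.
    print(reading_frame_conservation(ref, "ATG---CCCGGG"))
    # A single-base gap shifts the frame for every column after it.
    print(reading_frame_conservation(ref, "ATGA-ACCCGGG"))
```

In this toy scoring, gaps whose lengths are multiples of three leave the score unchanged, while frame-shifting gaps lower it. That mirrors the intuition that functional protein-coding genes tolerate the former far better than the latter.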

Year:  2004        PMID: 15285895     DOI: 10.1089/1066527041410319

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  42 in total

1.  MSARI: multiple sequence alignments for statistical detection of RNA secondary structure.

Authors:  Alex Coventry; Daniel J Kleitman; Bonnie Berger
Journal:  Proc Natl Acad Sci U S A       Date:  2004-08-10       Impact factor: 11.205

2.  Emergence of species-specific transporters during evolution of the hemiascomycete phylum.

Authors:  Benoît De Hertogh; Frédéric Hancy; André Goffeau; Philippe V Baret
Journal:  Genetics       Date:  2005-08-22       Impact factor: 4.562

3.  Functional analysis of gene duplications in Saccharomyces cerevisiae.

Authors:  Yuanfang Guan; Maitreya J Dunham; Olga G Troyanskaya
Journal:  Genetics       Date:  2006-12-06       Impact factor: 4.562

4.  RNAcode: robust discrimination of coding and noncoding regions in comparative sequence data.

Authors:  Stefan Washietl; Sven Findeiss; Stephan A Müller; Stefan Kalkhof; Martin von Bergen; Ivo L Hofacker; Peter F Stadler; Nick Goldman
Journal:  RNA       Date:  2011-02-28       Impact factor: 4.942

5.  A Statistical Model for Event Sequence Data.

Authors:  Kevin Heins; Hal Stern
Journal:  JMLR Workshop Conf Proc       Date:  2014

6.  Systematic discovery and characterization of fly microRNAs using 12 Drosophila genomes.

Authors:  Alexander Stark; Pouya Kheradpour; Leopold Parts; Julius Brennecke; Emily Hodges; Gregory J Hannon; Manolis Kellis
Journal:  Genome Res       Date:  2007-11-07       Impact factor: 9.043

7.  Revisiting the protein-coding gene catalog of Drosophila melanogaster using 12 fly genomes.

Authors:  Michael F Lin; Joseph W Carlson; Madeline A Crosby; Beverley B Matthews; Charles Yu; Soo Park; Kenneth H Wan; Andrew J Schroeder; L Sian Gramates; Susan E St Pierre; Margaret Roark; Kenneth L Wiley; Rob J Kulathinal; Peili Zhang; Kyl V Myrick; Jerry V Antone; Susan E Celniker; William M Gelbart; Manolis Kellis
Journal:  Genome Res       Date:  2007-11-07       Impact factor: 9.043

8.  The cellular robustness by genetic redundancy in budding yeast.

Authors:  Jingjing Li; Zineng Yuan; Zhaolei Zhang
Journal:  PLoS Genet       Date:  2010-11-04       Impact factor: 5.917

9.  Statistical assessment of discriminative features for protein-coding and non coding cross-species conserved sequence elements.

Authors:  Teresa M Creanza; David S Horner; Annarita D'Addabbo; Rosalia Maglietta; Flavio Mignone; Nicola Ancona; Graziano Pesole
Journal:  BMC Bioinformatics       Date:  2009-06-16       Impact factor: 3.169

10.  Assessing the quality of whole genome alignments in bacteria.

Authors:  Firas Swidan; Ron Shamir
Journal:  Adv Bioinformatics       Date:  2009-11-15