

Human-mouse gene identification by comparative evidence integration and evolutionary analysis.

Lingang Zhang, Vladimir Pavlovic, Charles R Cantor, Simon Kasif.

Abstract

The identification of genes in the human genome remains a challenge: existing predictions disagree substantially and vary with the specific gene-finding methodology used. Because the pattern of conservation in coding regions is expected to differ from that in intronic or intergenic regions, a comparative computational analysis can, in principle, improve the computational identification of genes in the human genome by using a reference such as the mouse genome. However, this comparative methodology depends critically on three factors: (1) the selection of the most appropriate reference genome (in particular, it is not clear whether the mouse is at the correct evolutionary distance from the human to provide sufficiently distinctive conservation levels in different genomic regions); (2) the selection of comparative features that provide the most benefit to gene recognition; and (3) the selection of an evidence integration architecture that effectively interprets the comparative features. We address the first question with a novel evolutionary analysis that allows us to explicitly correlate the performance of the gene recognition system with the evolutionary distance (time) between the two genomes. Our simulation results indicate that a wide range of reference genomes at different evolutionary time points appear to deliver reasonable comparative prediction of human genes. In particular, the evolutionary time between human and mouse generally falls in the region of good performance; however, better accuracy might be achieved with a reference genome more distant than mouse. To address the second question, we propose several natural comparative measures of conservation for identifying exons and exon boundaries. Finally, we experiment with Bayesian networks for the integration of comparative and compositional evidence.
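The abstract names two ingredients without giving their form: a comparative conservation measure and a Bayesian network that integrates comparative and compositional evidence. The sketch below is only an illustration of the general idea, not the authors' method. It assumes a pre-aligned human/mouse sequence pair with `-` for gaps, uses plain percent identity as the conservation measure, and uses a naive Bayes model (the simplest Bayesian network) with two discretized features. The function names and all probability values are hypothetical and are not taken from the paper.

```python
def window_identity(human, mouse, start, end):
    """Fraction of ungapped aligned columns in [start, end) with identical bases."""
    matches = 0
    aligned = 0
    for h, m in zip(human[start:end], mouse[start:end]):
        if h == "-" or m == "-":
            continue  # skip columns where either sequence has a gap
        aligned += 1
        if h == m:
            matches += 1
    return matches / aligned if aligned else 0.0


# Hypothetical conditional probability tables P(feature bin | class).
# The numbers are made up for illustration only.
P_CONSERVATION = {
    "coding": {"low": 0.1, "high": 0.9},
    "noncoding": {"low": 0.7, "high": 0.3},
}
P_COMPOSITION = {
    "coding": {"low": 0.3, "high": 0.7},
    "noncoding": {"low": 0.6, "high": 0.4},
}


def posterior_coding(conservation_bin, composition_bin, prior_coding=0.5):
    """Posterior P(coding | evidence) under a naive Bayes model.

    The two features are assumed conditionally independent given the class.
    """
    joint_coding = (
        prior_coding
        * P_CONSERVATION["coding"][conservation_bin]
        * P_COMPOSITION["coding"][composition_bin]
    )
    joint_noncoding = (
        (1.0 - prior_coding)
        * P_CONSERVATION["noncoding"][conservation_bin]
        * P_COMPOSITION["noncoding"][composition_bin]
    )
    return joint_coding / (joint_coding + joint_noncoding)


if __name__ == "__main__":
    identity = window_identity("ACGT-A", "ACGA-A", 0, 6)
    conservation_bin = "high" if identity >= 0.75 else "low"  # arbitrary cutoff
    print(identity, posterior_coding(conservation_bin, "high"))
```

A richer network could add dependencies between features or model exon boundaries explicitly; the naive structure is used here only because it keeps the arithmetic easy to follow.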


Year: 2003 PMID: 12743024 PMCID: PMC403647 DOI: 10.1101/gr.703903

Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043

42 in total

1. Codon-substitution models for heterogeneous selection pressure at amino acid sites.

Authors: Z Yang; R Nielsen; N Goldman; A M Pedersen
Journal: Genetics Date: 2000-05 Impact factor: 4.562

2. Analysis of expressed sequence tags indicates 35,000 human genes.

Authors: B Ewing; P Green
Journal: Nat Genet Date: 2000-06 Impact factor: 38.330

3. A comparison of the Celera and Ensembl predicted gene sets reveals little overlap in novel genes.

Authors: J B Hogenesch; K A Ching; S Batalov; A I Su; J R Walker; Y Zhou; S A Kay; P G Schultz; M P Cooke
Journal: Cell Date: 2001-08-24 Impact factor: 41.582

4. Comparative ab initio prediction of gene structures using pair HMMs.

Authors: Irmtraud M Meyer; Richard Durbin
Journal: Bioinformatics Date: 2002-10 Impact factor: 6.937

5. Computational prediction of eukaryotic protein-coding genes. (Review)

Authors: Michael Q Zhang
Journal: Nat Rev Genet Date: 2002-09 Impact factor: 53.242

6. A hidden Markov model that finds genes in E. coli DNA.

Authors: A Krogh; I S Mian; D Haussler
Journal: Nucleic Acids Res Date: 1994-11-11 Impact factor: 16.971

7. Maximum-likelihood models for combined analyses of multiple sequence data.

Authors:
Journal: J Mol Evol Date: 1996-05 Impact factor: 2.395

8. Computational methods for the identification of genes in vertebrate genomic sequences. (Review)

Authors: J M Claverie
Journal: Hum Mol Genet Date: 1997 Impact factor: 6.150

9. A codon-based model of nucleotide substitution for protein-coding DNA sequences.

Authors: N Goldman; Z Yang
Journal: Mol Biol Evol Date: 1994-09 Impact factor: 16.240

10. Human and mouse gene structure: comparative analysis and application to exon prediction.

Authors: S Batzoglou; L Pachter; J P Mesirov; B Berger; E S Lander
Journal: Genome Res Date: 2000-07 Impact factor: 9.043

10 in total

1. Subtree power analysis and species selection for comparative genomics.

Authors: Jon D McAuliffe; Michael I Jordan; Lior Pachter
Journal: Proc Natl Acad Sci U S A Date: 2005-05-23 Impact factor: 11.205

2. Prediction of small, noncoding RNAs in bacteria using heterogeneous data.

Authors: Brian Tjaden
Journal: J Math Biol Date: 2007-03-13 Impact factor: 2.259

3. Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures.

Authors: Alexander Stark; Michael F Lin; Pouya Kheradpour; Jakob S Pedersen; Leopold Parts; Joseph W Carlson; Madeline A Crosby; Matthew D Rasmussen; Sushmita Roy; Ameya N Deoras; J Graham Ruby; Julius Brennecke; Emily Hodges; Angie S Hinrichs; Anat Caspi; Benedict Paten; Seung-Won Park; Mira V Han; Morgan L Maeder; Benjamin J Polansky; Bryanne E Robson; Stein Aerts; Jacques van Helden; Bassem Hassan; Donald G Gilbert; Deborah A Eastman; Michael Rice; Michael Weir; Matthew W Hahn; Yongkyu Park; Colin N Dewey; Lior Pachter; W James Kent; David Haussler; Eric C Lai; David P Bartel; Gregory J Hannon; Thomas C Kaufman; Michael B Eisen; Andrew G Clark; Douglas Smith; Susan E Celniker; William M Gelbart; Manolis Kellis
Journal: Nature Date: 2007-11-08 Impact factor: 49.962

4. The truth about mouse, human, worms and yeast.

Authors: David R Nelson; Daniel W Nebert
Journal: Hum Genomics Date: 2004-01 Impact factor: 4.639

5. Genomix: a method for combining gene-finders' predictions, which uses evolutionary conservation of sequence and intron-exon structure.

Authors: Avril Coghlan; Richard Durbin
Journal: Bioinformatics Date: 2007-05-05 Impact factor: 6.937

6. GeneWaltz--A new method for reducing the false positives of gene finding.

Authors: Kazuharu Misawa; Reiko F Kikuno
Journal: BioData Min Date: 2010-09-28 Impact factor: 2.522

7. Gene finding in the chicken genome.

Authors: Eduardo Eyras; Alexandre Reymond; Robert Castelo; Jacqueline M Bye; Francisco Camara; Paul Flicek; Elizabeth J Huckle; Genis Parra; David D Shteynberg; Carine Wyss; Jane Rogers; Stylianos E Antonarakis; Ewan Birney; Roderic Guigo; Michael R Brent
Journal: BMC Bioinformatics Date: 2005-05-30 Impact factor: 3.169