Computational inference of homologous gene structures in the human genome.

R F Yeh, L P Lim, C B Burge.

Abstract

With the human genome sequence approaching completion, a major challenge is to identify the locations and encoded protein sequences of all human genes. To address this problem we have developed a new gene identification algorithm, GenomeScan, which combines exon-intron and splice signal models with similarity to known protein sequences in an integrated model. Extensive testing shows that GenomeScan can accurately identify the exon-intron structures of genes in finished or draft human genome sequence with a low rate of false positives. Application of GenomeScan to 2.7 billion bases of human genomic DNA identified at least 20,000-25,000 human genes out of an estimated 30,000-40,000 present in the genome. The results show an accurate and efficient automated approach for identifying genes in higher eukaryotic genomes and provide a first-level annotation of the draft human genome.

Year:  2001        PMID: 11337476      PMCID: PMC311055          DOI: 10.1101/gr.175701

Source DB:  PubMed          Journal:  Genome Res        ISSN: 1088-9051            Impact factor:   9.043


