A question of size: the eukaryotic proteome and the problems in defining it.

Paul M Harrison, Anuj Kumar, Ning Lang, Michael Snyder, Mark Gerstein.

Abstract

We discuss the problems in defining the extent of the proteomes for completely sequenced eukaryotic organisms (i.e. the total number of protein-coding sequences), focusing on yeast, worm, fly and human. (i) Six years after completion of its genome sequence, the true size of the yeast proteome is still not defined. New small genes are still being discovered, and a large number of existing annotations are being called into question, with these questionable ORFs (qORFs) comprising up to one-fifth of the 'current' proteome. We discuss these in the context of an ideal genome-annotation strategy that considers the proteome as a rigorously defined subset of all possible coding sequences ('the orfome'). (ii) Despite the greater apparent complexity of the fly (more cells, more complex physiology, longer lifespan), the nematode worm appears to have more genes. To explain this, we compare the annotated proteomes of worm and fly, relating the difference to both genome-annotation and genome-evolution issues. (iii) The unexpectedly small size of the gene complement estimated for the complete human genome provoked much public debate about the nature of biological complexity. However, for the human genome, the relationship between gene number and proteome size is, in the first instance, far from simple. We survey the current estimates for the numbers of human genes and, from this, we estimate a range for the size of the human proteome. The determination of this is substantially hampered by the unknown extent of the cohort of pseudogenes ('dead' genes), in combination with the prevalence of alternative splicing. (Further information relating to yeast is available at http://genecensus.org/yeast/orfome)

Year:  2002        PMID: 11861898      PMCID: PMC101239          DOI: 10.1093/nar/30.5.1083

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


