Digging for dead genes: an analysis of the characteristics of the pseudogene population in the Caenorhabditis elegans genome.

P M Harrison, N Echols, M B Gerstein.

Abstract

Pseudogenes are non-functioning copies of genes in genomic DNA, which may result either from reverse transcription of an mRNA transcript (processed pseudogenes) or from gene duplication and subsequent disablement (non-processed pseudogenes). As pseudogenes are apparently 'dead', they usually have a variety of obvious disablements (e.g., insertions, deletions, frameshifts and truncations) relative to their functioning homologs. We have derived an initial estimate of the size, distribution and characteristics of the pseudogene population in the Caenorhabditis elegans genome, performing a survey in 'molecular archaeology'. Corresponding to the 18 576 annotated proteins in the worm (i.e., in Wormpep18), we have found an estimated total of 2168 pseudogenes, about one for every eight genes. Few of these appear to be processed. Details of our pseudogene assignments are available from http://bioinfo.mbb.yale.edu/genome/worm/pseudogene. The population of pseudogenes differs significantly from that of genes in a number of respects: (i) pseudogenes are distributed unevenly across the genome relative to genes, with a disproportionate number on chromosome IV; (ii) the density of pseudogenes is higher on the arms of the chromosomes; (iii) the amino acid composition of pseudogenes is midway between that of genes and (translations of) random intergenic DNA, with enrichment of Phe, Ile, Leu and Lys, and depletion of Asp, Ala, Glu and Gly relative to the worm proteome; and (iv) the most common protein folds and families differ somewhat between genes and pseudogenes: the most common fold found in the worm proteome is the immunoglobulin fold, whereas the most common 'pseudofold' is the C-type lectin. In addition, the size of a gene family bears little overall relationship to the size of its corresponding pseudogene complement, indicating a highly dynamic genome. There are in fact a number of families associated with large populations of pseudogenes. For example, one family of seven-transmembrane receptors (represented by gene B0334.7) has one pseudogene for every four genes, and another uncharacterized family (represented by gene B0403.1) is approximately two-thirds pseudogenic. Furthermore, over a hundred apparent pseudogenic fragments do not have any obvious homologs in the worm.
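
The composition comparison in point (iii) and the gene-to-pseudogene ratio can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `composition` and `composition_shift` are hypothetical, and any sequences passed to them are placeholders. Only the two counts (18 576 proteins, 2168 pseudogenes) come from the abstract.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(sequences):
    """Fractional frequency of each standard amino acid across sequences."""
    counts = Counter()
    for seq in sequences:
        # Ignore stop codons, gaps and ambiguity codes such as '*', '-', 'X'.
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}

def composition_shift(sample, reference):
    """Per-residue difference in fraction (sample minus reference).

    Positive values indicate enrichment in the sample, negative depletion.
    """
    return {aa: sample[aa] - reference[aa] for aa in AMINO_ACIDS}

# Counts quoted in the abstract.
n_proteins = 18576
n_pseudogenes = 2168
genes_per_pseudogene = n_proteins / n_pseudogenes  # between 8 and 9
```

In use, one would pass translated pseudogene sequences as the sample and the annotated proteome as the reference, then inspect the sign of the shift for residues such as Phe, Ile, Leu and Lys.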

Year:  2001        PMID: 11160906      PMCID: PMC30377          DOI: 10.1093/nar/29.3.818

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


