Nicholas M Luscombe, Jiang Qian, Zhaolei Zhang, Ted Johnson, Mark Gerstein.
Abstract
BACKGROUND: The sequencing of genomes provides us with an inventory of the 'molecular parts' in nature, such as protein families and folds, and their functions in living organisms. Through the analysis of such inventories, it has been shown that different genomes have very different usage of parts; for example, the common folds in the worm are very different from those in Escherichia coli.
Year: 2002 PMID: 12186647 PMCID: PMC126234 DOI: 10.1186/gb-2002-3-8-research0040
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Power-law behaviour is observed for many genomic properties. (a) The occurrence of DNA words, InterPro families and protein folds in the worm genome. Black diamonds, 6-mers; dark-gray diamonds, 7-mers; mid-gray diamonds, 8-mers; light-gray diamonds, 9-mers; open diamonds, 10-mers; red circles, gene families; open green squares, protein superfamilies; blue crosses, protein folds. The solid lines represent the best-fit power-law functions for each distribution. (b) The occurrence of pseudogene families (open green squares) and pseudomotifs (black crosses) in the worm intergenic regions. (c) The occurrence of InterPro families in M. genitalium (black diamonds); E. coli (dark-gray diamonds); S. cerevisiae (mid-gray diamonds); and D. melanogaster (open diamonds). (d) Other properties that follow the power law. Black crosses, the number of assigned functions for each fold; open blue squares, the number of protein-protein interactions each fold makes in the yeast two-hybrid experiment; open green circles, the number of transcripts of each fold during vegetative growth in yeast. (e) Best-fit functions for the occurrence of protein folds in the worm genome (blue crosses): linear ($y = a - bx$), exponential ($y = ae^{-bx}$), double-exponential ($y = ae^{-bx} + ce^{-dx}$), triple-exponential ($y = ae^{-bx} + ce^{-dx} + fe^{-gx}$), stretched-exponential, and power-law ($y = ax^{-b}$) functions. The residuals (R) between the functions and genomic data are calculated as $R = \sum (N_{\text{actual}} - N_{\text{fitted}})^2$. (f) Properties that do not follow the power law. The occurrence of 3-mers (open blue squares); 4-mers (green crosses); and 5-mers (open dark-blue squares) in the worm genome. Open blue circles, the average composition of asparagine in different folds; open red diamonds, the number of residues involved in protein flexibility in different folds. The slopes (exponent b) are given on the plots.
The worm genome was taken from the database at the National Center for Biotechnology Information [41], the family assignments were obtained from the InterPro proteome database [42], and the fold assignments from the Partslist database [20]. Solid red line, best-fit line for worm InterPro families.
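The comparison of candidate functions by their residuals, as described for Figure 1(e), can be illustrated with a minimal sketch. It assumes that the power-law and exponential fits are obtained by linear least squares on log-transformed data, which is a common approach but is not stated in the record. The function names and the synthetic data are hypothetical and do not come from the paper.

```python
import numpy as np


def fit_power_law(x, y):
    """Fit y = a * x**(-b) by least squares on log-log axes (an assumed method)."""
    slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
    return float(np.exp(intercept)), float(-slope)


def fit_exponential(x, y):
    """Fit y = a * exp(-b * x) by least squares on semi-log axes (an assumed method)."""
    slope, intercept = np.polyfit(x, np.log(y), 1)
    return float(np.exp(intercept)), float(-slope)


def residual(actual, fitted):
    """R = sum over points of (N_actual - N_fitted)**2."""
    return float(np.sum((actual - fitted) ** 2))


# Synthetic, illustrative data only: an exact power law with a = 1000, b = 1.5.
x = np.arange(1, 51, dtype=float)
y = 1000.0 * x ** -1.5

a_pow, b_pow = fit_power_law(x, y)
a_exp, b_exp = fit_exponential(x, y)

r_pow = residual(y, a_pow * x ** -b_pow)
r_exp = residual(y, a_exp * np.exp(-b_exp * x))

print(f"power law:   a = {a_pow:.3f}, b = {b_pow:.3f}, R = {r_pow:.3g}")
print(f"exponential: a = {a_exp:.3f}, b = {b_exp:.3f}, R = {r_exp:.3g}")
```

On data that truly follow a power law, the power-law residual should be far smaller than the exponential one. Note that fitting in log space weights small counts more heavily than a direct nonlinear fit would, so exponents obtained this way may differ slightly from those produced by other fitting procedures.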
List of genomic properties that display power-law behavior and the associated exponent (b) for the best-fitting power-law function
| Organism | 6-10-mers | Protein families | Protein superfamilies | Protein folds | Pseudomotifs | Pseudogene families | Functions per protein fold | Interactions per protein fold | Transcripts per protein family |
|---|---|---|---|---|---|---|---|---|---|
| | 3.7 | 1.7 | 1.6 | 1.9 | - | - | - | - | - |
| | 3.5 | 1.6 | 1.5 | 1.8 | - | - | - | - | - |
| | 3.7 | 1.7 | 1.6 | 1.9 | - | - | - | - | - |
| | 3.7 | 1.7 | 1.6 | 1.9 | - | - | - | - | - |
| | 3.4 | 1.6 | 1.5 | 1.7 | - | - | - | - | - |
| | 3.6 | 1.6 | 1.5 | 1.7 | - | - | - | - | - |
| | 3.8 | 1.7 | 1.5 | 1.9 | - | - | - | - | - |
| | 3.5 | 1.6 | 1.5 | 1.7 | - | - | - | - | - |
| | 3.4 | 1.5 | 1.4 | 1.6 | - | - | - | - | - |
| | 3.5 | 1.6 | 1.5 | 1.7 | - | - | - | - | - |
| | 3.8 | 2.0 | 1.8 | 2.2 | - | - | - | - | - |
| | 3.9 | 1.9 | 1.7 | 2.0 | - | - | - | - | - |
| | 3.8 | 1.8 | 1.6 | 1.9 | - | - | - | - | - |
| | 3.4 | 1.6 | 1.5 | 1.7 | - | - | - | - | - |
| | 3.4 | 1.5 | 1.4 | 1.6 | - | - | - | - | - |
| | 3.3 | 1.4 | 1.3 | 1.5 | - | - | - | - | - |
| | 3.2 | 1.5 | 1.4 | 1.6 | - | - | - | - | - |
| | 3.2 | 1.4 | 1.3 | 1.5 | 0.9 | 1.5 | 1.6 | 2.2 | 1.2 |
| | 3.1 | 1.1 | 1.0 | 1.2 | 1.0 | 1.8 | - | - | |
| | 3.3 | 1.2 | 1.1 | 1.3 | 1.2 | - | - | - | |
| Human chromosomes 21 and 22 | - | - | - | - | 1.0 | 1.9 | - | - | |