Vineet K Sharma, Samir K Brahmachari, Srinivasan Ramachandran.
Abstract
BACKGROUND: Gene duplication and diversification contributed significantly to the creation of human gene families. The (TG/CA)n repeats exhibit length variability, display genome-wide distribution, and are abundant in the human genome. Accumulating evidence for their multiple functional roles, including regulation of transcription and stimulation of recombination and splicing, marks them as functional elements. Here, we report an analysis of the distribution of (TG/CA)n repeats in human gene families.
Year: 2005 PMID: 15935094 PMCID: PMC1177943 DOI: 10.1186/1471-2164-6-83
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Distribution pattern of human gene families with respect to family sizes. X axis: family size (number of genes in each gene family). Y axis: number of gene families corresponding to various family sizes. Note the inverse exponential relationship.
Figure 2. Global distribution of gene families, genes and proportion of genes containing (TG/CA)n repeats classified into the six functional classes. The numbers correspond to the height of the vertical bars in each group.
Figure 3. Relationship between proportion of genes with (TG/CA)n repeats in each functional class and the average gene length in the corresponding functional classes. X axis: Proportion of genes with (TG/CA)n repeats (%); Y axis: Average gene length (kb). (CC: Cell cycle; IN: Information; IR: Immune and related functions; MET: Metabolism; SC: Signaling and communication; STM: Structure and motility.)
Figure 4. A Venn diagram of the genes containing the three types of intragenic (TG/CA)n repeats (types I, II and III). Note that several genes (shaded area) have multiple types of repeats in various combinations.
Figure 5. Distribution of proportion of genes with three types of (TG/CA)n repeats in the six functional classes.
Figure 6. Distribution of the densities of three types of (TG/CA)n repeats in the genes of six functional classes.
Distribution of (TG/CA)n repeats in large gene families
| Gene family | Chromosomal distribution (a) | Average gene length, kb (b) | Number of genes | Genes with (TG/CA)n repeats (%) | Genes with type I repeats | Genes with type II repeats | Genes with type III repeats | Type I density | Type II density | Type III density |
|---|---|---|---|---|---|---|---|---|---|---|
| Histone proteins family | Dispersed | 3.55 | 76 | 7.9 | 4 | 4 | 0 | 1.7 | 0.7 | 0 |
| Interleukins | Dispersed | 14.67 | 43 | 32.6 | 9 | 6 | 2 | 1.3 | 0.9 | 0.1 |
| Serine (or cysteine) proteinase inhibitor family | Dispersed | 17.87 | 32 | 50 | 12 | 9 | 2 | 1.4 | 0.8 | 0.2 |
| Tumor necrosis factor (ligand) superfamily | Dispersed | 23.49 | 38 | 63.2 | 21 | 14 | 1 | 1.9 | 0.9 | 0.1 |
| CD antigens | Dispersed | 26.96 | 54 | 46.3 | 20 | 11 | 3 | 1.9 | 1.3 | 0.3 |
| Immunoglobulin heavy chains | Intrachromosomal | 0.38 | 162 | 0.6 | 1 | 0 | 0 | 1 | 0 | 0 |
| Immunoglobulin kappa chains | Intrachromosomal | 0.55 | 73 | 5.5 | 1 | 3 | 0 | 0.3 | 3 | 0 |
| Immunoglobulin lambda chains | Intrachromosomal | 0.35 | 88 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Interleukin receptors family | Dispersed | 29.77 | 32 | 59.4 | 17 | 10 | 0 | 3.4 | 0.9 | 0 |
| T cell receptor beta chains | 84 Intrachromosomal, 9 Dispersed | 0.42 | 94 | 9.6 | 5 | 3 | 1 | 0.8 | 0.3 | 0.1 |
| Homeo box | Dispersed | 5.48 | 40 | 25 | 6 | 6 | 1 | 1.2 | 0.9 | 0.2 |
| Eukaryotic translation initiation factor | Dispersed | 36.54 | 33 | 45.5 | 13 | 10 | 3 | 2.1 | 0.9 | 0.2 |
| Zinc finger protein family | Dispersed | 30.33 | 200 | 42.5 | 63 | 44 | 6 | 2.2 | 1.1 | 0.1 |
| DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptides | Dispersed | 43.44 | 32 | 62.5 | 17 | 8 | 2 | 1.5 | 0.6 | 0.1 |
| Ribosomal protein genes | Dispersed | 4.94 | 96 | 6.3 | 6 | 0 | 0 | 1 | 0 | 0 |
| Mitochondrial ribosomal protein genes | Dispersed | 16.92 | 74 | 23 | 14 | 12 | 1 | 1.8 | 0.9 | 0.1 |
| Cytochrome P450 superfamily | Dispersed | 31.34 | 45 | 46.7 | 15 | 11 | 2 | 1.9 | 1 | 0.1 |
| Proteasome subunit genes | Dispersed | 25.64 | 40 | 32.5 | 11 | 8 | 0 | 1.5 | 0.8 | 0 |
| G protein-coupled receptor family | Dispersed | 24.71 | 98 | 33.7 | 26 | 19 | 6 | 2.3 | 1.5 | 0.2 |
| Tripartite motif-containing family | Dispersed | 29.29 | 40 | 60 | 19 | 12 | 2 | 1.6 | 0.8 | 0.1 |
| Solute carrier family | Dispersed | 59.19 | 223 | 62.8 | 134 | 87 | 22 | 2.9 | 1.5 | 0.2 |
| RAS oncogene family | Dispersed | 39.92 | 60 | 65 | 38 | 17 | 4 | 1.8 | 0.7 | 0.1 |
| ATP-binding cassette transporters gene family | Dispersed | 73.85 | 44 | 68.2 | 29 | 24 | 4 | 3.6 | 2.5 | 0.3 |
| Guanine nucleotide binding protein (G protein) polypeptide genes | Dispersed | 58.83 | 32 | 59.4 | 18 | 12 | 5 | 3.3 | 1.9 | 0.4 |
| Potassium voltage-gated channel genes | Dispersed | 104.95 | 38 | 57.9 | 17 | 16 | 6 | 8.6 | 4.4 | 0.5 |
| Protein phosphatase subunit genes | Dispersed | 65.62 | 59 | 57.6 | 27 | 22 | 7 | 2.9 | 1.8 | 0.3 |
| Collagen family | Dispersed | 132.83 | 37 | 86.5 | 29 | 23 | 10 | 5.7 | 2.3 | 0.4 |
a: Chromosomal distribution of the members of gene families. 'Dispersed' indicates that members are distributed on different chromosomes.
b: Average gene length (in kb) for each gene family.
c: Numbers in parentheses show the number of large sized gene families in each functional class.
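Figure 3 relates the proportion of repeat-containing genes to average gene length at the level of functional classes. The same kind of relationship can be explored per gene family from the table above. The following is a minimal sketch, not the authors' analysis. It assumes that the third, fourth and fifth columns of each row are the average gene length (kb), the number of genes in the family, and the percentage of genes with (TG/CA)n repeats. The names `FAMILIES`, `pearson` and `genes_with_repeats` are hypothetical helpers introduced only for this illustration.

```python
from math import sqrt

# (family, average gene length in kb, number of genes, % genes with (TG/CA)n repeats)
# Values are copied from the table above; the column meanings are an assumption.
FAMILIES = [
    ("Histone proteins family", 3.55, 76, 7.9),
    ("Interleukins", 14.67, 43, 32.6),
    ("Serine (or cysteine) proteinase inhibitor family", 17.87, 32, 50.0),
    ("Tumor necrosis factor (ligand) superfamily", 23.49, 38, 63.2),
    ("CD antigens", 26.96, 54, 46.3),
    ("Immunoglobulin heavy chains", 0.38, 162, 0.6),
    ("Immunoglobulin kappa chains", 0.55, 73, 5.5),
    ("Immunoglobulin lambda chains", 0.35, 88, 0.0),
    ("Interleukin receptors family", 29.77, 32, 59.4),
    ("T cell receptor beta chains", 0.42, 94, 9.6),
    ("Homeo box", 5.48, 40, 25.0),
    ("Eukaryotic translation initiation factor", 36.54, 33, 45.5),
    ("Zinc finger protein family", 30.33, 200, 42.5),
    ("DEAD/H box polypeptides", 43.44, 32, 62.5),
    ("Ribosomal protein genes", 4.94, 96, 6.3),
    ("Mitochondrial ribosomal protein genes", 16.92, 74, 23.0),
    ("Cytochrome P450 superfamily", 31.34, 45, 46.7),
    ("Proteasome subunit genes", 25.64, 40, 32.5),
    ("G protein-coupled receptor family", 24.71, 98, 33.7),
    ("Tripartite motif-containing family", 29.29, 40, 60.0),
    ("Solute carrier family", 59.19, 223, 62.8),
    ("RAS oncogene family", 39.92, 60, 65.0),
    ("ATP-binding cassette transporters gene family", 73.85, 44, 68.2),
    ("Guanine nucleotide binding protein polypeptide genes", 58.83, 32, 59.4),
    ("Potassium voltage-gated channel genes", 104.95, 38, 57.9),
    ("Protein phosphatase subunit genes", 65.62, 59, 57.6),
    ("Collagen family", 132.83, 37, 86.5),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def genes_with_repeats(n_genes, percent):
    """Approximate count of repeat-containing genes from family size and percentage."""
    return round(n_genes * percent / 100)


lengths = [row[1] for row in FAMILIES]
percents = [row[3] for row in FAMILIES]
r = pearson(lengths, percents)
print(f"Pearson r (average gene length vs. % genes with repeats): {r:.2f}")

for name, _, n_genes, percent in FAMILIES[:3]:
    print(name, genes_with_repeats(n_genes, percent))
```

A plain Pearson coefficient is used here only because it needs no external libraries. A rank-based measure may be more suitable, given that gene lengths in the table span more than two orders of magnitude.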