Yen Kaow Ng, Wei Wu, Louxin Zhang.
Abstract
BACKGROUND: Co-expressing genes tend to cluster in eukaryotic genomes. This paper analyzes the correlation between the proximity of eukaryotic genes and their transcriptional expression patterns in the zebrafish (Danio rerio) genome using available microarray data and gene annotation.
Year: 2009 PMID: 19159490 PMCID: PMC2654907 DOI: 10.1186/1471-2164-10-42
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Distribution of 10,000 mean R values. Each plot shows the distribution of 10,000 mean R values. Each mean R value is calculated by first randomly permuting the gene order of the genome, and then averaging the R values for every pair of neighboring genes in the resulting gene order. The mean R value in the real genome is shown as a single line on each plot. Both plots are based on the same gene expression dataset: (A) the results on the original dataset (average of mean R = 0.03086, σ = 0.00384); (B) the results after tandem duplicates are removed (average of mean R = 0.03071, σ = 0.00389).
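The permutation procedure described in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes R is a Pearson correlation between two genes' expression profiles (this record does not define R), it treats the genome as a single linear gene order, and the names `pearson`, `mean_neighbor_r`, and `permutation_null` are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a flat profile is treated as uncorrelated
    return sxy / sqrt(sxx * syy)


def mean_neighbor_r(profiles):
    """Average R over every pair of adjacent genes in the given order."""
    rs = [pearson(a, b) for a, b in zip(profiles, profiles[1:])]
    return sum(rs) / len(rs)


def permutation_null(profiles, n_perm=10_000, seed=0):
    """Mean neighbor R for n_perm random permutations of the gene order."""
    rng = random.Random(seed)
    shuffled = list(profiles)  # copy so the caller's order is untouched
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(mean_neighbor_r(shuffled))
    return null


# Toy usage: four genes with three-point expression profiles (made-up data).
toy = [[1, 2, 3], [1, 2, 3], [3, 2, 1], [3, 2, 1]]
observed = mean_neighbor_r(toy)
null = permutation_null(toy, n_perm=100, seed=1)
print(observed, sum(null) / len(null))
```

Comparing `observed` against the spread of `null` corresponds to the single line drawn over each distribution in the figure.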
Figure 2. Mean of pair-wise R values. This is compared to the mean of 100 values obtained similarly, each from the same analysis after a random permutation of: (1) the gene order of the entire genome (△); (2) the order of genes in each chromosome (□); (3) the order of non-overlapping blocks of 3 consecutive genes (■). Plots (△), (□) and (■) are shown with standard deviations. The points in (A) are from analyses with the full dataset, while those in (B) are from analyses after tandem duplicates are removed.
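The three permutation schemes named in the Figure 2 caption can be illustrated with a short sketch. The details are assumptions rather than the authors' method: the genome is modeled as a list of chromosomes, each a list of gene identifiers; scheme (1) keeps chromosome sizes fixed; scheme (3) shuffles blocks within each chromosome and lets the last block be shorter than 3 genes. All function names are hypothetical.

```python
import random


def permute_genome(chromosomes, rng):
    """Scheme (1): shuffle all genes across the whole genome.

    Assumption: each chromosome keeps its original number of genes.
    """
    genes = [g for chrom in chromosomes for g in chrom]
    rng.shuffle(genes)
    out, start = [], 0
    for chrom in chromosomes:
        out.append(genes[start:start + len(chrom)])
        start += len(chrom)
    return out


def permute_within_chromosomes(chromosomes, rng):
    """Scheme (2): shuffle the gene order inside each chromosome."""
    out = []
    for chrom in chromosomes:
        copy = list(chrom)
        rng.shuffle(copy)
        out.append(copy)
    return out


def permute_blocks(chromosomes, rng, block=3):
    """Scheme (3): shuffle non-overlapping blocks of consecutive genes.

    Assumption: blocks are shuffled within their own chromosome.
    """
    out = []
    for chrom in chromosomes:
        blocks = [chrom[i:i + block] for i in range(0, len(chrom), block)]
        rng.shuffle(blocks)
        out.append([g for b in blocks for g in b])
    return out


# Toy usage: two chromosomes with integer gene identifiers (made-up data).
rng = random.Random(0)
genome = [list(range(9)), list(range(9, 15))]
print(permute_genome(genome, rng))
print(permute_within_chromosomes(genome, rng))
print(permute_blocks(genome, rng))
```

Scheme (3) preserves adjacency inside each block, so it keeps short-range neighbor relationships while breaking longer-range ones.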
Figure 3. Mean R values. Gene pairs up to 50 kbp apart were binned according to their intergenic distance; the plots are shown with regression lines. (A) is from the full dataset, whereas (B) is from the dataset that results after tandem duplicates are removed.
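The binning and regression described in the Figure 3 caption might look like the sketch below. The bin width of 5 kbp is an assumption (the caption does not state one), the regression is an ordinary least-squares line through the per-bin means, and `bin_mean_r` and `fit_line` are hypothetical names.

```python
def bin_mean_r(pairs, bin_width=5_000, max_dist=50_000):
    """Group (intergenic_distance_bp, r) pairs into distance bins.

    Returns {bin_start_bp: mean r}. Pairs farther apart than max_dist are
    dropped; a pair exactly at max_dist falls in the last bin.
    """
    last_bin = max_dist // bin_width - 1
    sums = {}
    for dist, r in pairs:
        if dist < 0 or dist > max_dist:
            continue
        start = min(int(dist // bin_width), last_bin) * bin_width
        total, count = sums.get(start, (0.0, 0))
        sums[start] = (total + r, count + 1)
    return {s: t / c for s, (t, c) in sorted(sums.items())}


def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


# Toy usage with made-up (distance, R) pairs.
pairs = [(1_000, 0.2), (4_000, 0.4), (6_000, 0.1), (60_000, 0.9)]
means = bin_mean_r(pairs)
slope, intercept = fit_line(list(means), list(means.values()))
print(means, slope, intercept)
```

A negative slope in such a fit would indicate that mean R falls as intergenic distance grows.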
Descriptive statistics for pair-wise comparison of neighboring genes according to orientation of transcription
| ← → | 1681 | 0.0637 ± 0.0084 | 0.0238 | 215221.3 ± 7887.2 | 88957 | |
| → →/← ← | 3418 | 0.0877 ± 0.0061 | 0.0547 | 201669.6 ± 5611.0 | 75199 | |
| → ← | 1678 | 0.0592 ± 0.0082 | 0.0238 | 207196.8 ± 8385.1 | 75805 | |
| ← → | 1635 | 0.0618 ± 0.0086 | 0.0220 | 219869.4 ± 8382.6 | 91546 | |
| → →/← ← | 3295 | 0.0762 ± 0.0061 | 0.0438 | 209041.5 ± 5714.5 | 82818 | |
| → ← | 1632 | 0.0594 ± 0.0083 | 0.0250 | 216217.3 ± 8729.1 | 80961 | |
Mean of R values for all neighboring gene pairs found within some cluster of size d (d = 2, 3, ..., 7, > 7).
| 2 | 877 | 867 | 0.1051 ± 0.0122 | 0.0930 ± 0.0121 |
| 3 | 560 | 520 | 0.0952 ± 0.0152 | 0.0799 ± 0.0157 |
| 4 | 315 | 282 | 0.1237 ± 0.0202 | 0.1148 ± 0.0207 |
| 5 | 140 | 148 | 0.0739 ± 0.0303 | 0.0870 ± 0.0306 |
| 6 | 125 | 80 | 0.1302 ± 0.0328 | 0.1330 ± 0.0385 |
| 7 | 60 | 48 | 0.2360 ± 0.0545 | 0.2505 ± 0.0585 |
| > 7 | 88 | 62 | 0.3427 ± 0.0390 | 0.2456 ± 0.0492 |
The positional clusters that contain at least 8 genes. The xxx stands for genes with unknown functions.
| 3 | 10 | 100K | 1.24e-6 | 0.289 | |
| 4 | 8 | 197K | 9.19e-7 | 0.031 | |
| 5 | 8 | 89K | 1.36e-5 | 0.213 | |
| 13 | 8 | 139K | 6.39e-5 | 0.148 | |
| 15 | 17 | 174K | 2.69e-7 | 0.371 | |
| 19 | 13 | 189K | 4.12e-7 | 0.229 | |
| | 8 | 159K | 3.04e-5 | -0.026 | |
| 20 | 8 | 168K | 1.77e-6 | -0.075 | |
| | 8 | 110K | 2.61e-4 | 0.259 | |
| 23 | 10 | 96K | 2.12e-5 | 0.606 | |
Number of positional gene clusters found with intergenic distance D = 25K.
| 328 | 48 | 12 | 3 | 2 | 0 | 3 | ||
| 847 | 280 | 105 | 35 | 25 | 10 | 10 | ||
| 322 | 33 | 9 | 1 | 2 | 1 | 1 | ||
| 867 | 260 | 94 | 37 | 16 | 8 | 8 | ||
Figure 4. Mean R values. Mean R values of neighboring gene pairs in -lg p intervals. All p-values were calculated with D = 25K. Gene pairs grouped into parallel, divergent, and convergent orientations are plotted similarly. For both the divergent and convergent cases, only one gene pair has a -lg p-value in the interval 5~6; these points are hence omitted from the plot.
Figure 5. Results from the same analysis as in Figure 4 after tandem duplicates are removed.
Number of neighboring gene pairs found in clusters in different -lg p-value intervals (D = 25K).
| 64 | 1101 | 660 | 210 | 79 | 16 | 35 | ||
| 30 | 541 | 331 | 107 | 48 | 14 | 23 | ||
| 18 | 237 | 167 | 51 | 17 | 1 | 6 | ||
| 16 | 323 | 162 | 52 | 14 | 1 | 6 | ||
| 63 | 1073 | 586 | 169 | 83 | 14 | 19 | ||
| 27 | 518 | 291 | 89 | 49 | 12 | 9 | ||
| 17 | 244 | 152 | 41 | 18 | 1 | 5 | ||
| 19 | 311 | 143 | 39 | 16 | 1 | 5 | ||
Figure 6. Average distance between neighboring gene pairs in different -lg p-value intervals.