Ashwin Prakash, Samuel S Shepard, Jie He, Benjamin Hart, Miao Chen, Surya P Amarachintha, Olga Mileyeva-Biebesheimer, Jason Bechtel, Alexei Fedorov.
Abstract
BACKGROUND: Mid-range inhomogeneity (MRI) is the significant enrichment of particular nucleotides in genomic sequences extending from 30 up to several thousand nucleotides. The best-known manifestation of MRI is CpG islands, which represent CG-rich regions. Recently it was demonstrated that MRI can be observed not only for G+C content but also for all other nucleotide pairings (e.g. A+G and G+T) as well as for individual bases. Various types of MRI regions are 4 to 20 times more abundant in mammalian genomes than in random models.
Year: 2009 PMID: 19891785 PMCID: PMC2779198 DOI: 10.1186/1471-2164-10-513
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Substitution Rates in MRI Regions for a Combination of Nucleotides. For each X MRI region, where X stands for GC-, GT-, or GA-rich or poor regions, the rate of change in X-base composition is given for all substitutions at different levels of fixation within the human population. The rate of change (S) is the ratio of X to nonX substitutions over nonX to X substitutions in those particular X-rich regions. Thus, a ratio of 1 means no change in the X-richness of the region, whereas a ratio greater than 1 implies degradation of the X-rich region and a ratio less than 1 implies enrichment of the X-rich MRI region. Note that in the control X-average regions the S_X ratio is always the inverse of the S_nonX ratio (S_X = 1/S_nonX). Therefore, only one graph for each S_X and S_nonX pair is presented. Since S-ratios vary significantly between X compositions, the graphs are presented on two different scales. The white background presents changes of S-ratios in the 0.8 to 2 range, while the gray background presents changes in the 0 to 7 range. Vertical bars show the standard error of the mean (see Methods section).
Figure 2. Substitution Rates in MRI Regions for Single Nucleotides. For each X MRI region, where X stands for A-, T-, G-, or C-rich or poor regions, the rate of change in X-base composition is given for all substitutions at different levels of fixation within the human population. The rate of change (S) is the ratio of X to nonX substitutions over nonX to X substitutions in those particular X-rich regions. Thus, a ratio of 1 means no change in the X-richness of the region, whereas a ratio greater than 1 implies degradation of the X-rich region and a ratio less than 1 implies enrichment of the X-rich MRI region. Note that in the control X-average regions the S_X ratio is always the inverse of the S_nonX ratio (S_X = 1/S_nonX). Therefore, only one graph for each S_X and S_nonX pair is presented. Since S-ratios vary significantly between X compositions, the graphs are presented on two different scales. The white background presents changes of S-ratios in the 0.8 to 2 range, while the gray background presents changes in the 0 to 7 range. Vertical bars show the standard error of the mean (see Methods section).
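The S-ratio defined in the two figure captions can be sketched in a few lines of Python. This is only an illustration of the stated definition (X to nonX substitutions divided by nonX to X substitutions). The function names and the substitution counts are hypothetical and are not taken from the paper.

```python
def s_ratio(x_to_nonx: int, nonx_to_x: int) -> float:
    """Return S = N(X -> nonX) / N(nonX -> X) for one class of regions."""
    if nonx_to_x <= 0:
        raise ValueError("need at least one nonX -> X substitution")
    return x_to_nonx / nonx_to_x


def interpret(s: float, tol: float = 1e-9) -> str:
    """Read S as in the captions: 1 = no change, >1 = degradation, <1 = enrichment."""
    if abs(s - 1.0) <= tol:
        return "no change"
    return "degradation" if s > 1.0 else "enrichment"


# Hypothetical counts, for illustration only (not data from the paper).
s = s_ratio(x_to_nonx=1500, nonx_to_x=1000)
print(s, interpret(s))
```

Swapping the two counts gives the reciprocal value, which is the S_X = 1/S_nonX relation noted for the control regions.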
Projected X-Equilibria.
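| Region | Current X content | Equilibrium for rare SNPs (%) | Equilibrium for minor SNPs (%) | Equilibrium for medium SNPs (%) | Equilibrium for major SNPs (%) | Equilibrium for fixed substitutions (%) |
|---|---|---|---|---|---|---|
| G-rich | 40% | 14.5 | 16.9 | 22.3 | 36.0 | 32.1 |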
| nonG-rich | 7% | 14.6 | 13.4 | 10.6 | 7.6 | 7.8 |
| G-average | 20% | 17.1 | 18.0 | 19.2 | 23.0 | 22.2 |
| C-rich | 40% | 13.8 | 16.9 | 23.0 | 33.90 | 32.4 |
| nonC-rich | 7% | 14.5 | 12.7 | 10.2 | 8.0 | 7.8 |
| C-average | 20% | 17.1 | 18.1 | 19.2 | 23.1 | 22.1 |
| A-rich | 49.5% | 41.4 | 40.3 | 42.0 | 43.5 | 44.1 |
| nonA-rich | 12.9% | 32.3 | 26.3 | 20.8 | 10.9 | 12.6 |
| A-average | 29.4% | 34.1 | 32.6 | 30.7 | 26.0 | 27.0 |
| T-rich | 49.5% | 39.5 | 39.5 | 42.8 | 43.2 | 44.6 |
| nonT-rich | 12.9% | 33.4 | 27.3 | 19.9 | 11.5 | 12.6 |
| T-average | 29.4% | 34.2 | 32.6 | 30.8 | 25.9 | 27.1 |
| GT-rich | 69.8% | 56.9 | 60.7 | 64.6 | 70.8 | 70.4 |
| nonGT-rich | 30.1% | 41.7 | 37.7 | 36.5 | 29.2 | 30.1 |
| GT-average | 50.0% | 49.9 | 50.0 | 50.0 | 50.0 | 50.0 |
| GA-rich | 70.0% | 56.6 | 56.6 | 60.0 | 63.2 | 65.8 |
| nonGA-rich | 29.9% | 44.6 | 42.1 | 39.1 | 31.7 | 34.1 |
| GA-average | 50.0% | 49.9 | 50.1 | 50.0 | 49.9 | 49.9 |
| GC-rich | 71.3% | 26.4 | 31.7 | 39.5 | 56.1 | 60.6 |
| nonGC-rich | 20.0% | 30.2 | 29.3 | 27.8 | 27.4 | 24.4 |
| GC-average | 40.7% | 34.9 | 36.8 | 39.0 | 45.7 | 45.0 |
The calculated equilibrium percentages (see Equation 3) for X-bases in X-rich MRI regions and in control regions with average X-composition. Projected equilibria are given based on the substitution rates of rare, minor, medium, and major SNPs, as well as on the fixed substitution rates (chimp-macaque to human).
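Equation 3 is not reproduced in this record, so the following is only a sketch of how a projected equilibrium could be obtained under a simple two-state model, and it should not be read as the authors' formula. The assumptions are that each X site mutates to nonX at a constant per-site rate, each nonX site mutates to X at a constant per-site rate, and the observed S-ratio reflects these rates at the current X fraction f. Under those assumptions the equilibrium fraction is f / (f + S(1 - f)). The function name is hypothetical.

```python
def projected_equilibrium(current_fraction: float, s: float) -> float:
    """Equilibrium X fraction under an assumed two-state substitution model.

    Assumption (not from the paper): per-site rates u (X -> nonX) and
    v (nonX -> X) are constant, so the equilibrium is v / (u + v).
    With S = N(X -> nonX) / N(nonX -> X) observed at X fraction f,
    u / v = S * (1 - f) / f, which gives f / (f + S * (1 - f)).
    """
    if not 0.0 < current_fraction < 1.0:
        raise ValueError("current_fraction must lie strictly between 0 and 1")
    if s <= 0.0:
        raise ValueError("s must be positive")
    f = current_fraction
    return f / (f + s * (1.0 - f))


# With S = 1 the composition is already at equilibrium.
print(projected_equilibrium(0.5, 1.0))
# With S > 1 the projected equilibrium falls below the current fraction.
print(projected_equilibrium(0.4, 2.0))
```

In this model an S-ratio above 1 always projects an equilibrium below the current X content, which matches the qualitative reading of S given in the figure captions.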
Impact of Indels on X-rich MRI Regions, with X Representing Any Single Base. The impact of indels on X-rich MRI regions and on X-average regions, where X stands for A-, T-, C-, or G-rich or poor. For each particular region we give the total length of examined regions in megabases, the percentage composition or content of X, the number of changes in X due to insertions and deletions (ΔX = N_ins(X) - N_del(X)), and the net change in X composition due to indels and due to substitutions.
| | A-rich | nonA-rich | A-average |
|---|---|---|---|
| total length | 66.9 Mb | 72.4 Mb | 800.4 Mb |
| content of A | 49.6% | 12.9% | 30.5% |
| ΔA | -16850 | -7390 | -44182 |
| ΔnonA | -24748 | -29257 | -98769 |
| net A% change INDEL | 0.006% | -0.004% | -0.0001% |
| net A% change SUBST | -0.027% | -0.002% | -0.014% |
| | T-rich | nonT-rich | T-average |
| total length | 67.8 Mb | 71.1 Mb | 800.4 Mb |
| content of T | 49.5% | 13.1% | 30.5% |
| ΔT | -21849 | -7078 | -47238 |
| ΔnonT | -24084 | -22716 | -97057 |
| net T% change INDEL | 0.001% | -0.004% | -0.0004% |
| net T% change SUBST | -0.024% | -0.002% | -0.013% |
| | G-rich | nonG-rich | G-average |
| total length | 52.0 Mb | 60.4 Mb | 884.7 Mb |
| content of G | 40.10% | 7.20% | 20.40% |
| ΔG | -1185 | -7080 | -31780 |
| ΔnonG | -12864 | -37512 | -139126 |
| net G% change INDEL | 0.009% | -0.006% | 0.0003% |
| net G% change SUBST | -0.052% | 0.009% | 0.016% |
| | C-rich | nonC-rich | C-average |
| total length | 52.0 Mb | 60.4 Mb | 883.9 Mb |
| content of C | 40.10% | 7.20% | 20.50% |
| ΔC | -829 | -6700 | -33823 |
| ΔnonC | -12418 | -35277 | -140331 |
| net C% change INDEL | 0.009% | -0.006% | 0.0002% |
| net C% change SUBST | -0.049% | 0.009% | 0.015% |
Impact of Indels on MRI Regions, with X Representing Combinations of Any Two Bases.
| | GC-rich | nonGC-rich | GC-average |
|---|---|---|---|
| total length | 17.8 Mb | 54.8 Mb | 780.6 Mb |
| content of GC | 71.00% | 20.30% | 40.90% |
| ΔGC | 1405 | -9100 | -31622 |
| ΔnonGC | -765 | -5951 | -56278 |
| net GC% change INDEL | 0.005% | -0.011% | 0.001% |
| net GC% change SUBST | -0.094% | 0.042% | 0.034% |
| | GT-rich | nonGT-rich | GT-average |
| total length | 34.9 Mb | 34.6 Mb | 1192 Mb |
| content of GT | 69.10% | 30.90% | 50.00% |
| ΔGT | -8278 | -6837 | -121644 |
| ΔnonGT | -4518 | -8502 | -120128 |
| net GT% change INDEL | 0.002% | -0.006% | -0.0001% |
| net GT% change SUBST | 0.004% | 0.001% | -0.0003% |
| | GA-rich | nonGA-rich | GA-average |
| total length | 69.2 Mb | 70.0 Mb | 978.3 Mb |
| content of GA | 69.75% | 30.22% | 49.99% |
| ΔGA | -23641 | -13935 | -96617 |
| ΔnonGA | -14185 | -28480 | -100013 |
| net GA% change INDEL | 0.004% | -0.002% | 0.0002% |
| net GA% change SUBST | -0.014% | 0.014% | 0.0002% |
The impact of indels on X-rich MRI regions and on X-average regions, where X stands for GC-, GT-, or GA-rich or poor. For each particular region we give the total length of examined regions in megabases, the percentage composition or content of X, the number of changes in X due to insertions and deletions (ΔX = N_ins(X) - N_del(X)), and the net change in X composition due to indels and due to substitutions.
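The "net X% change INDEL" rows can be approximated from the other rows of the indel tables. The sketch below assumes, without confirmation from the source, that the net change is the X fraction after applying ΔX and ΔnonX minus the X fraction before. The function name is hypothetical; the input values are the A-rich column of the single-base table.

```python
def net_percent_change_from_indels(
    total_length: float, x_fraction: float, delta_x: int, delta_nonx: int
) -> float:
    """Change in X content, in percentage points, after applying indel counts.

    Assumption (not stated in the source): the net change is the new X
    fraction minus the old X fraction, where indels alter both the number
    of X bases and the total region length.
    """
    x_before = x_fraction * total_length
    x_after = x_before + delta_x
    length_after = total_length + delta_x + delta_nonx
    return 100.0 * (x_after / length_after - x_fraction)


# A-rich regions: 66.9 Mb, 49.6% A, ΔA = -16850, ΔnonA = -24748.
change = net_percent_change_from_indels(66.9e6, 0.496, -16850, -24748)
print(f"{change:.3f}%")  # prints "0.006%"
```

Under this assumption the A-rich column comes out close to the tabulated 0.006%. The rounded inputs (length to 0.1 Mb, content to 0.1%) limit how closely other columns can be reproduced.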