Eran Elhaik, Matteo Pellegrini, Tatiana V Tatarinova.
Abstract
BACKGROUND: The methylation of cytosines at CpG dinucleotides, which plays an important role in gene expression regulation, is one of the most studied epigenetic modifications. Thus far, DNA methylation has been detected mostly by experimental methods, which are not only prone to bench effects and artifacts but are also time-consuming, expensive, and difficult to scale up to many samples. It is therefore useful to develop computational prediction methods for DNA methylation. Our previous studies highlighted the existence of correlations between the GC content of the third codon position (GC₃), methylation, and gene expression. We thus designed a model to predict methylation in Oryza sativa based on genomic sequence features and gene expression data.
Year: 2014 PMID: 24447369 PMCID: PMC3903047 DOI: 10.1186/1471-2105-15-23
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Distribution of gene-body methylation.
Figure 2. Log-methylation vs. GC₃.
Correlation between gene-body methylation and other genic features
| GC3 | 0.08 | −0.68 |
| Mean expression | 0.07 | 0.22 |
| Standard deviation of gene expression | 0.14 | 0.06 |
| Coefficient of variation of gene expression | −0.04 | −0.18 |
| Relative abundance of CpG | 0.07 | −0.70 |
| Gene length | 0.06 | 0.29 |
Correlation between nine gene compositional features and gene-body methylation
| GC3 | GC3 | −0.673 |
| Gene expression: mean | GE_MEAN | 0.255 |
| Standard deviation | GE_STDEV | 0.084 |
| CV of expression | GE_CV | −0.217 |
| Genome signature | GEN_SIG | −0.697 |
| CDS length | l | 0.286 |
| Change in GC3 from the left to the middle of the gene | GRADLM | 0.269 |
| Change in GC3 from the middle to the right of the gene | GRADMR | −0.289 |
| GC3 in the left third of the coding sequence (CDS) | GCL | −0.364 |
| GC3 in the middle third of CDS | GCM | −0.545 |
| GC3 in the right third of CDS | GCR | −0.343 |
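The positional features in the table above (GCL, GCM, GCR, GRADLM, GRADMR) can be illustrated with a minimal sketch. The record does not state how the CDS is split into thirds or how the "change" between thirds is computed, so the codon-based split and the simple differences (middle minus left, right minus middle) used here are assumptions, and the function name `gc3_features` is hypothetical.

```python
def gc3_features(cds: str) -> dict:
    """Compute GC3 for a whole CDS and for its left, middle, and right thirds.

    Assumptions (not taken from the source): thirds are formed by splitting
    the list of complete codons, and gradients are plain differences.
    """
    cds = cds.upper()
    # Keep complete codons only; a trailing partial codon is ignored.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if len(codons) < 3:
        raise ValueError("need at least three complete codons")

    def gc3(codon_list):
        # Fraction of codons whose third base is G or C.
        return sum(c[2] in "GC" for c in codon_list) / len(codon_list)

    n = len(codons)
    a, b = n // 3, 2 * n // 3
    gcl, gcm, gcr = gc3(codons[:a]), gc3(codons[a:b]), gc3(codons[b:])
    return {
        "GC3": gc3(codons),
        "GCL": gcl,
        "GCM": gcm,
        "GCR": gcr,
        "GRADLM": gcm - gcl,  # assumed definition: middle minus left
        "GRADMR": gcr - gcm,  # assumed definition: right minus middle
    }


if __name__ == "__main__":
    # Toy sequence for illustration only, not real rice data.
    print(gc3_features("ATGGCCAAATTTGGGCCA"))
```

The function works on codons rather than raw bases so that the third-position count stays in frame within each third of the sequence.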
Figure 3. Gene methylation levels estimated from sixmers (x-axis) and the gene methylation level obtained from experimental data (after moving average smoothing). An exponential fit is shown in red.
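The smoothing and exponential fit mentioned in the Figure 3 caption can be sketched as follows. The caption gives neither the window size nor the fitting procedure, so this example assumes a simple full-window moving average and fits y = a·exp(b·x) by ordinary least squares on ln(y); the function names and the synthetic data are hypothetical.

```python
import math


def moving_average(values, window):
    """Mean of each full window of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) via linear least squares on ln(y).

    Requires all ys > 0. This log-linear approach is an assumption;
    the source does not say how its exponential fit was obtained.
    """
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxy / sxx
    a = math.exp(mean_l - b * mean_x)
    return a, b


if __name__ == "__main__":
    # Synthetic data for illustration only.
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [2.0 * math.exp(0.5 * x) for x in xs]
    a, b = fit_exponential(xs, ys)
    print(a, b)
    print(moving_average([1, 2, 3, 4, 5], 3))
```

Fitting on the log scale weights relative rather than absolute errors; a nonlinear least-squares fit on the original scale would be an equally plausible reading of the caption.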
Figure 4. A flow chart of the proposed algorithm to calculate methylation from gene expression data (left to right). Calculations marked with black arrows are carried out once for all genes, whereas calculations marked in green are carried out for each specific gene.