Hehuang Xie, Min Wang, Maria de F Bonaldo, Christina Smith, Veena Rajaram, Stewart Goldman, Tadanori Tomita, Marcelo B Soares.
Abstract
DNA methylation, the only known covalent modification of mammalian DNA, occurs primarily in CpG dinucleotides. 51% of CpGs in the human genome reside within repeats, and 25% within Alu elements. Despite that, no method has been reported for large-scale ascertainment of CpG methylation in repeats. Here we describe a sequencing-based strategy for parallel determination of the CpG-methylation status of thousands of Alu repeats, and a computational algorithm to design primers that enable their specific amplification from bisulfite-converted genomic DNA. Using a single primer pair, we generated amplicons of high sequence complexity, and derived CpG-methylation data from 31 178 Alu elements and their 5' flanking sequences, altogether representing over 4 Mb of a human cerebellum epigenome. The analysis of the Alu methylome revealed that the methylation level of Alu elements is high in the intronic and intergenic regions, but low in the regions close to transcription start sites. Several hypomethylated Alu elements were identified and their hypomethylated status verified by pyrosequencing. Interestingly, some Alu elements exhibited a strikingly tissue-specific pattern of methylation. We anticipate the amplicons herein described to prove invaluable as epigenome representations, to monitor epigenomic alterations during normal development, in aging and in diseases such as cancer.
Year: 2009 PMID: 19458156 PMCID: PMC2715246 DOI: 10.1093/nar/gkp393
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Diagram of experimental design. Genomic DNA is first digested with a methylation-insensitive enzyme, ligated to adaptors and then subjected to bisulfite treatment. Bisulfite-treated DNA is amplified with an adaptor primer and a primer specific for the targeted repeat elements. PCR products contain repeat sequence and flanking unique genomic sequence. Lastly, the repeat-containing amplicon library thus constructed is subjected to large-scale sequencing. Methylated cytosines are indicated with filled circles and unmethylated cytosines with open circles. The green and red segments represent the repeat elements, with the red segment indicating the region to which the primer anneals. The adaptors are indicated by blue boxes.
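The bisulfite step in this design is what makes methylation readable from sequence: unmethylated cytosines are read as T after conversion and amplification, while methylated cytosines remain C. The following Python sketch illustrates that principle on a toy top-strand sequence. The function names, the example sequence and the methylated position are hypothetical and are not taken from the study, and the sketch ignores incomplete conversion, sequencing errors and the reverse strand.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate the top-strand read obtained after bisulfite treatment and PCR.

    Unmethylated cytosines are read as T; cytosines at the given
    0-based methylated positions are protected and stay C.
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_cpg_methylation(reference, converted_read):
    """Return {CpG position: True if methylated, False if unmethylated}.

    Assumes the read is already aligned base-for-base to the reference.
    """
    ref = reference.upper()
    calls = {}
    for i in range(len(ref) - 1):
        if ref[i:i + 2] == "CG":
            calls[i] = converted_read[i] == "C"
    return calls


# Toy example (invented sequence): only the CpG at position 1 is methylated.
reference = "ACGTCCGATCG"
read = bisulfite_convert(reference, methylated_positions=[1])
calls = call_cpg_methylation(reference, read)
print(read)
print(calls)
```

Comparing the converted read against the unconverted reference at each CpG position is the basic operation behind per-site methylation calls; a real pipeline would also need alignment of converted reads, which is not shown here.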
Figure 2. The preservation of CpG dinucleotides along Alu consensus sequences. The X-axis represents the nucleotide position within the consensus Alu sequence. The Y-axis represents the percentage of all Alus that contain CpG dinucleotides for each given position. The green arrow represents the PCR primer designed in this study.
Statistics of Alu elements sequenced
| Alu subfamily | Number of Alu elements in genome | Number of distinct Alu elements sequenced | Number of sequences generated | Percentage of sequences generated (a) |
|---|---|---|---|---|
| AluY | 137 925 | 23 857 | 207 602 | 84.5 |
| AluSc | 49 325 | 1725 | 7365 | 3.0 |
| AluSx | 339 002 | 1438 | 2744 | 1.1 |
| AluSg | 81 915 | 1304 | 4381 | 1.8 |
| AluYa-h (b) | 9646 | 672 | 3494 | 1.4 |
| AluJb | 128 539 | 539 | 1044 | 0.4 |
| AluJo | 141 841 | 510 | 806 | 0.3 |
| AluSq | 94 487 | 450 | 926 | 0.4 |
| AluSp | 50 917 | 231 | 467 | 0.2 |
| Others | 147 375 | 452 | 1218 | 0.5 |
| Total | 1 180 972 | 31 178 | 230 047 | 93.6 |
(a) The percentage was calculated as the number of sequences from each Alu subfamily divided by the total number of sequences mapped (245 825 sequence reads).
(b) AluYa-h denotes the young AluY families: AluYa to AluYh.
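The percentage column can be reproduced directly from the footnote's definition: reads per subfamily divided by the 245 825 mapped reads. The short Python check below uses only the counts from the table; the variable and function names are illustrative.

```python
TOTAL_MAPPED_READS = 245_825  # total mapped sequence reads, from the table footnote

# Number of sequences generated per Alu subfamily, copied from the table.
sequences_per_subfamily = {
    "AluY": 207_602,
    "AluSc": 7_365,
    "AluSx": 2_744,
    "AluSg": 4_381,
    "AluYa-h": 3_494,
    "AluJb": 1_044,
    "AluJo": 806,
    "AluSq": 926,
    "AluSp": 467,
    "Others": 1_218,
}


def percentage_of_mapped(count, total=TOTAL_MAPPED_READS):
    """Share of all mapped reads, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


percentages = {
    name: percentage_of_mapped(count)
    for name, count in sequences_per_subfamily.items()
}
alu_reads = sum(sequences_per_subfamily.values())
total_percentage = percentage_of_mapped(alu_reads)

for name, pct in percentages.items():
    print(f"{name}: {pct}")
print(f"Total: {alu_reads} reads, {total_percentage}")
```

The total is below 100% because the denominator is all mapped reads, not only the reads assigned to Alu elements.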
The average methylation level of CpG clusters with different genomic localization
| Genomic location | Number of methylation values | Number of distinct CpG dinucleotides | Methylation level (%) |
|---|---|---|---|
| Within 1 kb upstream from TSSs (a) | 11 185 | 1437 | 84.2 |
| 5′UTR | 586 | 135 | 82.8 |
| CDS | 857 | 187 | 79.2 |
| 3′UTR | 5082 | 687 | 93.3 |
| Intron | 443 817 | 58 718 | 93.5 |
| Intergenic (b) | 791 035 | 98 061 | 93.6 |
| SUM | 1 252 562 | 159 225 | |
(a) TSSs denotes transcription start sites.
(b) The genomic regions within 1 kb upstream from TSSs were excluded from intergenic regions.
Figure 3. The methylation profile of CpG sites sequenced. (A) CpG methylation near transcription start sites (TSSs). The X-axis represents the distance of the sequenced CpG sites to the TSSs. The Y-axis represents the average methylation level. The average methylation levels were calculated for CpG sites adjacent to TSSs in 100-bp increments. (B) Spatial methylation correlation surrounding unmethylated CpG sites. The X-axis represents the distance to the nearest unmethylated CpG site. The Y-axis represents the average methylation level. The average methylation levels of CpG dinucleotides were determined and plotted against the spatial distance, in 10-bp increments, to the nearest unmethylated CpG site.
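Both panels rest on the same operation: grouping CpG sites into fixed-width distance bins (100 bp in panel A, 10 bp in panel B) and averaging the methylation level within each bin. A minimal Python sketch of that binning follows. The function name and the toy data are invented for illustration, and the study's exact bin boundaries and weighting are not stated here, so this is one plausible reading and not the authors' procedure.

```python
from collections import defaultdict


def average_by_bin(sites, bin_size=100):
    """Average methylation level per distance bin.

    `sites` is an iterable of (distance_in_bp, methylation_level) pairs.
    Each bin is labelled by its lower edge; floor division keeps negative
    (upstream) distances in the correct bin, e.g. -40 falls in [-100, 0).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for distance, level in sites:
        bin_start = (distance // bin_size) * bin_size
        sums[bin_start] += level
        counts[bin_start] += 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}


# Invented toy data: (distance to TSS in bp, methylation level in %).
toy_sites = [
    (-250, 90.0), (-220, 80.0),
    (-40, 20.0), (-10, 40.0),
    (30, 10.0),
    (150, 70.0),
]
profile = average_by_bin(toy_sites, bin_size=100)
print(profile)
```

Calling the same function with `bin_size=10` and distances measured to the nearest unmethylated CpG site would give the kind of profile described for panel B.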
Figure 4. Pyrosequencing validation and tissue-specific Alu methylation. (A) Pyrosequencing validation for high-throughput sequencing results. Asterisks represent the genomic regions identified to be completely unmethylated by high-throughput sequencing. The X-axis represents the distinct genomic loci selected. The Y-axis represents the average methylation level determined by 454 sequencing or by pyrosequencing. (B) Methylation profiles of four loci adjacent to promoters in cerebellum, cortex and fetal adrenal gland. The X-axis represents the distinct tissue sample IDs. CE, CO and AG represent the cerebellum, cortex and fetal adrenal gland tissues, respectively. Paired cerebellum and cortex samples are shown with the same number. The Y-axis represents the average methylation level determined by pyrosequencing. Similar methylation profiles were observed in duplicate experiments.