Lists of HumanMethylation450 BeadChip probes with nucleotide-variant information obtained from the Phase 3 data of the 1000 Genomes Project.

Kohji Okamura, Tomoko Kawai, Kenichiro Hata, Kazuhiko Nakabayashi.

Abstract

Illumina's Infinium HumanMethylation450 (HM450) BeadChip array allows simultaneous examination of the DNA methylation status of more than 480,000 CpG sites in the human genome. Its relatively simple protocol relies on hybridization followed by single-base extension reactions. However, nucleotide variation among individuals within the hybridization probe sequences can affect the results, i.e., the estimates of methylation levels. To investigate possible effects of maternal nutritional conditions on the extent of epigenetic alterations in utero, we previously examined genome-wide DNA methylation profiles of 33 chorionic villus samples collected in Japan (GEO accession number GSE62733) and showed, using the Smirnov-Grubbs outlier test, that epigenetic alterations accumulate in placentas under adverse in utero environments. In that study, we compiled a list of HM450 probes overlapping with nucleotide variants reported in the Phase 3 dataset (release 20130502) of the 1000 Genomes Project. To reduce the number of outliers that could have been spuriously identified because of variants at or near the target CpG sites, we excluded from the identified methylation outliers the probes whose sequences overlapped with variants with a minor allele frequency (MAF) higher than 1% in the Japanese population. Here, we compiled lists of HM450 probes with MAF information for the African, European, American, South Asian, and East Asian populations, in addition to the Japanese population. The lists are useful for methylome analyses of human populations using HM450 BeadChip arrays.

Keywords:  DNA methylation; Genetic variations; Methylation BeadChip; Minor allele frequency; The 1000 Genomes Project

Year:  2015        PMID: 26981364      PMCID: PMC4778612          DOI: 10.1016/j.gdata.2015.11.023

Source DB:  PubMed          Journal:  Genom Data        ISSN: 2213-5960


Direct link to deposited data

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE62733.

Experimental design, materials and methods

In our previous study, we investigated the possible effects of maternal nutritional conditions during pregnancy on the extent of methylation changes in the fetal genome [1]. We collected 33 postpartum placentas from Japanese women and obtained chorionic villous tissues. Extracted genomic DNA samples were treated with the EpiTect Plus DNA Bisulfite Kit (Qiagen), and 300 ng of each sample was applied to Illumina's Infinium HumanMethylation450 (HM450) BeadChip array for methylome profiling [2]. The data, obtained using the manufacturer's standard protocol, have been submitted to NCBI GEO under accession number GSE62733. We paid particular attention to genetic variants overlapping with the sequence intervals of the probes on the BeadChip array. A nucleotide variant at the target CpG site of a probe results in the loss of the target site, and variants elsewhere in the probe are considered to impair hybridization and single-base extension to varying degrees depending on their distance from the target CpG site. Because determining the whole-genome sequences of all samples was impractical in terms of cost, we surveyed nucleotide variants overlapping with the probe intervals using the variant data (Phase 3 dataset) of the 1000 Genomes Project [3], available at the FTP site ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/. The full set of probes on the HM450 BeadChip targets 482,421 CpG sites in the human genome. The genomic locations of the probes (50 bp in length) and their target CpG sites were determined as described previously [4]. When all variants in the Phase 3 dataset were used regardless of their allele frequencies in the populations, 395,546 (82.0%) of the 482,421 probes overlapped with at least one variant. Of these 395,546 probes, 138,401 (35.0%) overlapped with nucleotide variant(s) at their target CpG sites.
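The overlap survey can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Probe` record, the function `classify_variant`, the probe ID, and the 1-based inclusive coordinates are all hypothetical. Only the 50-bp probe interval and the 2-bp target CpG, located by the position of its C (MAPINFO), come from the text. The target CpG is tested first because its position relative to the probe interval should be taken from the probe annotation rather than assumed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Probe:
    """Hypothetical record for one HM450 probe (1-based, inclusive coordinates)."""
    probe_id: str
    chrom: str
    start: int    # first base of the 50-bp probe interval
    end: int      # last base of the probe interval
    cpg_pos: int  # position of the C of the target CpG (MAPINFO)


def classify_variant(probe: Probe, chrom: str, pos: int) -> Optional[str]:
    """Classify a single-base variant as 'target_cpg', 'probe_body', or None."""
    if chrom != probe.chrom:
        return None
    # The target CpG is the C at cpg_pos and the G immediately after it.
    if pos in (probe.cpg_pos, probe.cpg_pos + 1):
        return "target_cpg"
    # Any other position inside the probe interval counts as the probe body.
    if probe.start <= pos <= probe.end:
        return "probe_body"
    return None


# Made-up probe and variant positions, for illustration only.
probe = Probe("cg_example", "chr1", start=1000, end=1049, cpg_pos=1048)
print(classify_variant(probe, "chr1", 1049))  # variant on the G of the target CpG
print(classify_variant(probe, "chr1", 1010))  # variant elsewhere in the probe
print(classify_variant(probe, "chr1", 2000))  # variant outside the probe
```

A probe would then be counted as variant-containing if at least one variant returns a non-`None` class, and as variant-containing at its target CpG site if at least one returns `target_cpg`.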
Of the 138,401 CpG sites, the alternate dinucleotides at 100,523 sites (72.6%) were either TpG or CpA (Fig. 1A), most of which are presumed to have arisen through spontaneous deamination of C at methylated CpG sites. An alternate TpG (or CpA when observed from the complementary strand) dinucleotide in the genomic sequence is indistinguishable from the bisulfite-converted form of an unmethylated CpG and can therefore be spuriously identified as hypomethylated. It should be noted, however, that the variants in the majority of these probes were found in only one to several individuals among all subjects (n = 2504). When only variants with a minor allele frequency (MAF) higher than 1% among all subjects were considered, the number of variant-containing probes and the number of probes with a variant at their target CpG site dropped to 105,280 (21.8%) and 17,274 (3.6%), respectively (Fig. 1B).
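The reason a TpG or CpA variant can pass for an unmethylated CpG can be illustrated with a small sketch. The function names below are hypothetical, and the model is deliberately simplified: bisulfite treatment is represented only as "an unmethylated C is read as T, a methylated C is read as C".

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the sequence as read from the complementary strand."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def alternate_dinucleotide(offset: int, alt: str) -> str:
    """Apply a single-base substitution to a reference CpG.

    offset 0 replaces the C, offset 1 replaces the G.
    """
    if offset not in (0, 1):
        raise ValueError("offset must be 0 (the C) or 1 (the G)")
    alt = alt.upper()
    if alt not in ("A", "C", "G", "T"):
        raise ValueError("alt must be one of A, C, G, T")
    bases = ["C", "G"]
    if bases[offset] == alt:
        raise ValueError("alt allele equals the reference base")
    bases[offset] = alt
    return "".join(bases)


def bisulfite_read(dinuc: str, c_methylated: bool) -> str:
    """Simplified bisulfite model: an unmethylated C is read as T."""
    return dinuc if c_methylated else dinuc.replace("C", "T")


def mimics_unmethylated_cpg(alt_dinuc: str) -> bool:
    """True if the variant dinucleotide, on either strand, equals the
    sequence that an unmethylated CpG yields after bisulfite conversion."""
    unmethylated_signal = bisulfite_read("CG", c_methylated=False)
    return unmethylated_signal in (alt_dinuc, reverse_complement(alt_dinuc))


for offset, alt in [(0, "T"), (1, "A"), (0, "A")]:
    dinuc = alternate_dinucleotide(offset, alt)
    print(dinuc, mimics_unmethylated_cpg(dinuc))
```

In this model a C-to-T change gives TpG directly, and a G-to-A change gives CpA, whose reverse complement is TpG, so both are flagged, whereas a C-to-A change (ApG) is not.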
Fig. 1

Numbers of HM450 probes whose genomic interval overlaps with nucleotide variants reported in the Phase 3 dataset of the 1000 Genomes Project. A. Of all 482,421 probes that target a CpG site, 395,546 contained one or more reported variant(s) in their genomic intervals. Among them, 138,401 probes (35.0%) overlapped with nucleotide variant(s) at their target CpG sites. Of the 138,401 sites, the alternate dinucleotides at 100,523 sites (72.6%) were either TpG or CpA. B. Numbers of probes containing nucleotide variants detected among Japanese (JPT), African (AFR), European (EUR), American (AMR), South Asian (SAS), and East Asian (EAS) subjects, and among all subjects (n = 2504). In each panel, variant-containing probes were classified into three categories (A, B, and C) depending on the distance to the C (MAPINFO [4]) of the target CpG site. In each category, probes were further divided into three sub-categories depending on the minor allele frequency of the overlapping variant(s) and shown as stacked column charts (black, MAF ≥ 5%; dark gray, 5% > MAF ≥ 1%; light gray, 1% > MAF).
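The three MAF sub-categories in the caption can be expressed as a small binning function. This sketch rests on two assumptions that the text does not state: the MAF of a biallelic variant is taken as the smaller of the alternate-allele frequency and its complement, and a probe overlapping several variants is assigned the class of the variant with the highest MAF. The function names are hypothetical.

```python
from typing import Iterable


def minor_allele_frequency(alt_af: float) -> float:
    """MAF of a biallelic variant from its alternate-allele frequency."""
    if not 0.0 <= alt_af <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    return min(alt_af, 1.0 - alt_af)


def maf_class(maf: float) -> str:
    """Assign one of the three sub-categories used in Fig. 1B."""
    if maf >= 0.05:
        return "MAF >= 5%"
    if maf >= 0.01:
        return "5% > MAF >= 1%"
    return "1% > MAF"


def probe_maf_class(alt_afs: Iterable[float]) -> str:
    """Class of a probe, assuming the highest-MAF variant decides (an assumption)."""
    return maf_class(max(minor_allele_frequency(af) for af in alt_afs))


# Made-up allele frequencies, for illustration only.
print(maf_class(minor_allele_frequency(0.20)))
print(maf_class(minor_allele_frequency(0.97)))
print(probe_maf_class([0.004, 0.03]))
```

Population-specific panels would use the allele frequency of the population in question. A frequency for the JPT subset alone would presumably have to be computed from the genotypes of the JPT samples.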

In our aforementioned study [1], we excluded CpG sites whose probes overlapped with variants with MAF > 1% in the Japanese population from the list of methylation outliers, i.e., sites detected by the Smirnov–Grubbs outlier test as differentially methylated with statistical significance in one sample compared with the others. Because the exclusion criteria for variant-containing probes can vary depending on the aims of a study and the ethnic background of the populations enrolled, we extended our analysis and compiled lists of HM450 probes overlapping with nucleotide variants detected in the East Asian (EAS), American (AMR), African (AFR), European (EUR), and South Asian (SAS) populations in the 1000 Genomes Project (Supplementary Tables 1 and 2), as well as lists of nucleotide variants overlapping with HM450 probes (Supplementary Tables 3 and 4). Our lists are more up to date than the Illumina-provided probe lists containing nucleotide variant information: HumanMethylation450_15017482_v.1.1.csv, which lists 89,678 probes as single nucleotide polymorphism (SNP)-containing in its “probe_SNPs” and “probe_SNPs_10” columns based on NCBI dbSNP Build 131, and humanmethylation450_dbsnp137.snpupdate.table.v2.sorted.txt, which lists 273,660 probes as SNP-containing.
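The exclusion step amounts to a simple filter over the outlier list. The sketch below assumes a hypothetical mapping from probe ID to the highest MAF among its overlapping variants in the population of interest. The probe IDs and values are invented, and the layout of the supplementary tables is not assumed here.

```python
from typing import Dict, List


def exclude_variant_probes(
    outliers: List[str],
    probe_max_maf: Dict[str, float],
    threshold: float = 0.01,
) -> List[str]:
    """Keep outlier probes that overlap no variant with MAF above the threshold.

    Probes absent from probe_max_maf are treated as overlapping no variant.
    """
    return [
        probe_id
        for probe_id in outliers
        if probe_max_maf.get(probe_id, 0.0) <= threshold
    ]


# Made-up example: three outlier probes, two of which overlap variants.
outliers = ["cg_a", "cg_b", "cg_c"]
probe_max_maf = {"cg_a": 0.12, "cg_b": 0.002}

print(exclude_variant_probes(outliers, probe_max_maf))
```

Keeping the threshold as a parameter reflects the point made above that the appropriate exclusion criterion depends on the study and the population. Setting it to 0.0 would exclude every probe that overlaps a reported variant.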

Discussion

When spontaneous deamination occurs at an unmethylated cytosine, it converts the cytosine to uracil. Because uracil is not a canonical base of DNA, such a mutation is immediately recognized and corrected by DNA repair mechanisms in vivo [5]. In contrast, deamination of methylated cytosine gives rise to thymine, which cannot be readily corrected. Although it may be repaired by a mismatch-specific thymine-DNA glycosylase, C-to-T (or G-to-A when observed from the complementary strand) is the most common single-nucleotide substitution in organisms with cytosine methylation [6]. Using 8.2 million SNPs available from dbSNP123 (released on November 3, 2004), Zhao and Zhang reported that the frequency of CpG dinucleotides at polymorphic sites was 6.09 times higher than that in the human genome reference sequence [7]. Consistently, HM450 probes were found to contain nucleotide variants at their target CpG sites (2 bp) much more frequently than in the rest of the probe intervals (48 bp for Type I probes and 49 bp for Type II probes [2]) (Fig. 1). We observed that hypomethylated outliers tended to coincide with variant-containing probes more often than hypermethylated ones did [1], indicating that a significant fraction of the hypomethylated outliers were detected as hypomethylated not because of a methylation change but because of a nucleotide variant that resulted in the loss of the target CpG site. In CpG islands (CGIs), CpG SNPs were reported to be 3.92-fold less frequent than in the human genome as a whole [7]. Nevertheless, we observed that hypermethylated outliers tended to be clustered in CGIs [1], indicating that such hypermethylated outliers are more likely to represent bona fide placental epigenetic alterations caused by adverse in utero environments.
As demonstrated in our previous study [1], the lists of HM450 probes with nucleotide variant information compiled in this study should be helpful for methylome analyses until whole-genome sequences can be readily determined for all subjects.

Conflict of interest

The authors declare that there are no conflicts of interest.

Specifications [standardized info for the reader]

Organism/tissue:  Homo sapiens/postpartum placentas (chorionic villi)
Sex:  Females and males
Sequencer or array type:  Illumina's Infinium HumanMethylation450 BeadChip array
Data format:  Raw and analyzed
Experimental factors:  Maternal gestational weight gain and growth of corresponding fetus
Experimental features:  Outlier tests for genome-wide DNA methylation profiles of chorionic villi
Consent:  Publicly available from NCBI GEO
Sample source location:  Japan

References

1.  Increased epigenetic alterations at the promoters of transcriptional regulators following inadequate maternal gestational weight gain.

Authors:  Tomoko Kawai; Takahiro Yamada; Kosei Abe; Kohji Okamura; Hiromi Kamura; Rina Akaishi; Hisanori Minakami; Kazuhiko Nakabayashi; Kenichiro Hata
Journal:  Sci Rep       Date:  2015-09-29

2.  High density DNA methylation array with single CpG site resolution.

Authors:  Marina Bibikova; Bret Barnes; Chan Tsan; Vincent Ho; Brandy Klotzle; Jennie M Le; David Delano; Lu Zhang; Gary P Schroth; Kevin L Gunderson; Jian-Bing Fan; Richard Shen
Journal:  Genomics       Date:  2011-08-02

3.  A global reference for human genetic variation.

Authors:  Adam Auton; Lisa D Brooks; Richard M Durbin; Erik P Garrison; Hyun Min Kang; Jan O Korbel; Jonathan L Marchini; Shane McCarthy; Gil A McVean; Gonçalo R Abecasis
Journal:  Nature       Date:  2015-10-01

4.  Additional annotation enhances potential for biologically-relevant analysis of the Illumina Infinium HumanMethylation450 BeadChip array.

Authors:  Magda E Price; Allison M Cotton; Lucia L Lam; Pau Farré; Eldon Emberly; Carolyn J Brown; Wendy P Robinson; Michael S Kobor
Journal:  Epigenetics Chromatin       Date:  2013-03-03

5.  DNA repair enzymes.

Authors:  T Lindahl
Journal:  Annu Rev Biochem       Date:  1982

6.  Cloning and expression of human G/T mismatch-specific thymine-DNA glycosylase.

Authors:  P Neddermann; P Gallinari; T Lettieri; D Schmid; O Truong; J J Hsuan; K Wiebauer; J Jiricny
Journal:  J Biol Chem       Date:  1996-05-31

7.  Sequence context analysis of 8.2 million single nucleotide polymorphisms in the human genome.

Authors:  Zhongming Zhao; Fengkai Zhang
Journal:  Gene       Date:  2005-11-28