Rejane Hughes Carvalho, Vanja Haberle, Jun Hou, Teus van Gent, Supat Thongjuea, Wilfred van Ijcken, Christel Kockx, Rutger Brouwer, Erikjan Rijkers, Anieta Sieuwerts, John Foekens, Mirjam van Vroonhoven, Joachim Aerts, Frank Grosveld, Boris Lenhard, Sjaak Philipsen.
Abstract
BACKGROUND: Non-small cell lung carcinoma (NSCLC) is a complex malignancy that, owing to its heterogeneity and poor prognosis, poses many challenges to diagnosis, prognosis and patient treatment. DNA methylation is an important mechanism of epigenetic regulation involved in normal development and cancer. It is a very stable and specific modification and therefore, in principle, a very suitable marker for epigenetic phenotyping of tumors. Here we present a genome-wide DNA methylation analysis of NSCLC samples and paired lung tissues, in which we combine MethylCap and next-generation sequencing (MethylCap-seq) to provide comprehensive DNA methylation maps of the tumor and paired lung samples. The MethylCap-seq data were validated by bisulfite sequencing and methyl-specific polymerase chain reaction of selected regions.
Year: 2012 PMID: 22726460 PMCID: PMC3407794 DOI: 10.1186/1756-8935-5-9
Source DB: PubMed Journal: Epigenetics Chromatin ISSN: 1756-8935 Impact factor: 4.954
Figure 1. Experimental design for profiling of DNA methylation patterns in non-small cell lung carcinoma. (A) Overall view of the steps followed to generate the profiles (ADC: adenocarcinoma; LCC: large cell carcinoma; MBD: methyl-binding domain protein; N: lung; SCC: squamous cell carcinoma; T: tumor). (B) MethylCap using methyl-binding domain proteins: sheared genomic DNA is used as input fraction (methyl groups in red) and incubated with beads [26] coated with streptavidin (green)-biotin (yellow) and linked to methyl-binding domain protein (blue) to capture methylated DNA. Captured fragments are subjected to high-throughput sequencing. (C) Summary of the bioinformatics approach used to generate the methylation profiles.
Data from patients used for MethylCap-seq and bisulfite sequencing validation
| 2213 N | healthy | SCC1 | IIB | 2,43 | Alive | M | 54,54 | Caucasian |
| 2214 T | SCC | |||||||
| 2235 N | healthy | SCC2 | IIB | 51,23 | Deceased | M | 73,73 | Caucasian |
| 2236 T | SCC | |||||||
| 2245 N | healthy | ADC1 | IB | 12,57 | Alive | F | 66,26 | Caucasian |
| 2246 T | ADC | |||||||
| 2255 N | healthy | ADC2 | IA | 98,93 | Deceased | F | 54,26 | Caucasian |
| 2256 T | ADC | |||||||
| 2257 N | healthy | ADC3 | N/A | N/A | N/A | M | 78,96 | Caucasian |
| 2258 T | ADC | |||||||
| 2261 N | healthy | SCC3 | IIB | 6,77 | Alive | M | 70,03 | Caucasian |
| 2262 T | SCC | |||||||
| 22he | healthy | LCC1 | IB | 35,73 | Deceased | F | 56,58 | Caucasian |
| 22tu | LCC | |||||||
ADC: adenocarcinoma; LCC: large cell carcinoma; N/A: not applicable; SCC: squamous cell carcinoma.
Figure 2. Global analysis of DNA methylation patterns in non-small cell carcinoma. (A) Correlation between experimental replicates. Each point represents the raw methylation signal (mean coverage in 10 bp bins). Density of points (log10 scale) is shown in different shades of blue. Pearson’s correlation coefficient is denoted in the top left corner of each scatter plot. (B) Correlation between tumor (x axis) and paired lung tissue sample (y axis). (C) Raw reads from a lung sample (top panel) and its matching tumor sample (middle panel) at position chr2:176,716,200 to 176,738,910, viewed in the UCSC genome browser; the normalized P-value is depicted in the bottom panel. The y axis depicts the number of sequence reads in each region per sample. The dashed line represents the maximum number of reads (highest peak) in each sample. (D) Distribution of hypomethylated (green) and hypermethylated (red) regions across chromosomes in a tumor versus paired lung sample. (E) Composition of hypo- and hypermethylated regions in comparison to randomly sampled regions across the genome. Bars indicate the mean proportions of regions covered by each distinct genomic feature. Error bars denote standard error of the mean. Promoters were defined as regions ± 1 kb from all Ensembl transcription start sites. Distinct classes of repeats were retrieved from the UCSC Table Browser (RepeatMasker table for hg18). Repeats were excluded from all subsequent features. Random regions were sampled across the genome independently seven times, and each time their number was matched to the number of differentially methylated regions in one sample pair. Statistical testing was done against random regions using a one-tailed Student’s t-test (*P≤0.01; **P≤0.001; ***P≤0.0001).
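The signal described in panel (A) of Figure 2 can be illustrated with a minimal sketch: per-base coverage is averaged in 10 bp bins, and Pearson's correlation coefficient is computed between two replicates. The coverage vectors are toy values and the function names `mean_coverage_bins` and `pearson` are hypothetical; this is an illustration under those assumptions, not the authors' pipeline.

```python
from math import sqrt


def mean_coverage_bins(coverage, bin_size=10):
    """Average per-base coverage over consecutive, non-overlapping bins."""
    bins = []
    for start in range(0, len(coverage), bin_size):
        chunk = coverage[start:start + bin_size]
        bins.append(sum(chunk) / len(chunk))
    return bins


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Toy per-base coverage for two replicates over a 30 bp stretch (assumed data).
replicate_1 = [0] * 10 + [4] * 10 + [8] * 10
replicate_2 = [1] * 10 + [5] * 10 + [7] * 10

bins_1 = mean_coverage_bins(replicate_1)
bins_2 = mean_coverage_bins(replicate_2)
r = pearson(bins_1, bins_2)
print(bins_1, bins_2, round(r, 3))
```

Each scatter-plot point in panel (A) would correspond to one pair of binned values such as `(bins_1[i], bins_2[i])`.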
Figure 3. Differentially methylated regions in non-small cell carcinoma versus paired lung tissue samples. (A) Heatmap of the 14,742 most significant DMRs, picturing regions that are mostly hypermethylated (top), mostly hypomethylated (middle), and mixed (bottom). The color bar at the bottom represents the log ratio of the normalized signal in tumor versus lung (red = hypermethylation, green = hypomethylation). The dashed line represents the cut in the dendrogram, generating nine groups of DMRs. The DMRs from the two most distinctive groups are depicted on the chromosomes. Red: cluster containing regions hypermethylated in all samples; green: cluster containing regions hypomethylated in all samples. (B) Bar plot showing mean enrichment of DMRs at chromosome ends, calculated as the ratio between the proportion of the 1 Mbp regions at chromosome ends and the total proportion of each chromosome covered by DMRs. Error bars show standard error of the mean of all chromosomes. Statistical testing was done against random regions using a one-tailed Student’s t-test (***P≤0.0001). (C) DMR distribution relative to gene position. Number of hyper- and hypomethylated genes and regions (outside gene area), and their distribution in promoter, gene body and 3′ UTR (top panel: all seven samples; middle panel: squamous cell carcinomas; bottom panel: adenocarcinomas). (D) Gene function distribution of the differentially methylated genes, showing the number of hyper- and hypomethylated genes relative to gene classes or function. DMR: differentially methylated region; UTR: untranslated region.
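The enrichment ratio in panel (B) of Figure 3 can be sketched as follows. The caption does not spell out the exact computation, so this sketch assumes it means the fraction of the two terminal 1 Mbp windows covered by DMRs, divided by the fraction of the whole chromosome covered by DMRs. The function names, the toy chromosome length and the DMR coordinates are all hypothetical.

```python
END_WINDOW = 1_000_000  # 1 Mbp window at each chromosome end


def covered_bases(regions, start, end):
    """Bases of [start, end) covered by half-open, non-overlapping regions."""
    return sum(max(0, min(r_end, end) - max(r_start, start))
               for r_start, r_end in regions)


def end_enrichment(regions, chrom_length, window=END_WINDOW):
    """Ratio of DMR coverage at chromosome ends to coverage of the whole chromosome.

    Assumes chrom_length >= 2 * window and at least one DMR on the chromosome.
    """
    end_covered = (covered_bases(regions, 0, window)
                   + covered_bases(regions, chrom_length - window, chrom_length))
    end_fraction = end_covered / (2 * window)
    total_fraction = covered_bases(regions, 0, chrom_length) / chrom_length
    return end_fraction / total_fraction


# Toy example: a 10 Mbp chromosome with three DMRs (assumed coordinates).
chrom_length = 10_000_000
dmrs = [(100_000, 101_000), (5_000_000, 5_001_000), (9_500_000, 9_502_000)]
enrichment = end_enrichment(dmrs, chrom_length)
print(round(enrichment, 2))
```

A value above 1 indicates that DMRs cover a larger share of the chromosome ends than of the chromosome as a whole; the bar plot would then report the mean of this ratio over all chromosomes.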
Figure 4. Bisulfite sequencing validation for MethylCap-seq. (A) Sequence reads for Frag_01 (genomic region located at position chr2:119,331,343 to 119,331,692) in a tumor (2214 T) and matching lung tissue (2213 N), plotted in the Genome Browser, showing the distribution of the 25 CpGs contained in the fragment highlighted in green. Dashed lines represent the highest number of reads in 2213 N and 2214 T. (B) Methylation status of each CpG in all 36 individually sequenced clones in the same samples and fragment shown in Figure 4A. The middle row represents the average methylation in all clones per CpG position. (C) Average methylation of all clones sequenced per patient in each fragment (M: control totally methylated DNA; N: lung; T: tumor; U: control totally unmethylated DNA). (D) Average methylation status of all clones combined, grouped per histological subtype; comparisons between adenocarcinoma and squamous cell carcinoma in all fragments were statistically significant. (E) Correlation between normalized methylation signals from MethylCap-seq and CpG methylation from bisulfite sequencing. Different regions are shown in different colors; lung samples are marked by dots and tumors by stars. Pearson’s correlation coefficient is denoted above the linear regression curve. *P≤0.001.
Figure 5. Methylation-specific polymerase chain reaction screening in 48 tumors and matching lung tissues. (A) Cropped gel images for the seven patients used for MethylCap-seq, grouped by histological subtype and controls (M: “methylated primer set”; N: lung; T: tumor; Tot_M: totally methylated control samples; Tot_U: totally unmethylated control sample; U: unmethylated primer set; ø: blank/water). (B) Histogram of the methylation status in lung samples and tumor samples in all 48 patients in the five fragments analyzed. (C) Histogram of the methylation status per histological subtype (statistics were calculated with Fisher’s exact test, two-tailed p-values: *P≤0.02; **P≤0.001 and ***P≤0.01).
Frequency of methylation by methylation-specific polymerase chain reaction in the 96 samples (48 tumor and matching lung tissue) in five fragments
| | Count (n) | Sensitivity (%) | Specificity (%) | Count (n) | Sensitivity (%) | Specificity (%) | Count (n) | Sensitivity (%) | Specificity (%) | Count (n) | Sensitivity (%) | Specificity (%) | Count (n) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Lung | 24/48 | 75 | 27 | 0/47 | 62 | 100 | 9/48 | 75 | 56 | 0/41 | 42 | 100 | 0/39 | 77 | 100 |
| Tumor | 36/48* | | | 29/47** | | | 36/48** | | | 17/41** | | | 30/39** | | |
| Adenocarcinoma | 13/17 | 76.5 | 46.4 | 5/16 | 31.3 | 27.7 | 12/17 | 70.6 | 46.1 | 8/15 | 53.3 | 53.3 | 7/13 | 53.9 | 36.8 |
| Squamous cell carcinoma | 15/16 | 93.8 | 53.6 | 13/16 | 81.3** | 72.2 | 14/16 | 87.5 | 53.9 | 7/16 | 43.8 | 46.7 | 12/16 | 75.0 | 63.2 |
Sensitivity is calculated as the ratio between the number of methylated (amplified) samples and the total number of samples analyzed per fragment; specificity is the ratio of the difference between amplification in tumor and in the matching lung sample (or adenocarcinoma versus squamous cell carcinoma). Statistical significance of the differences between control lung samples and tumors, and between adenocarcinomas and squamous cell carcinomas, was calculated using the chi-square test. *P<0.02; **P<0.0001.
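The sensitivity definition in the note above can be checked with a short calculation. The sketch below takes the tumor counts from the table (methylated samples over samples analyzed) and computes the percentage per fragment. The fragment labels are placeholders, because the table does not name the five fragments, and the function name `sensitivity` is hypothetical.

```python
# Tumor counts (methylated, analyzed) per fragment, copied from the Tumor row
# of the table. The fragment labels are placeholders (assumption).
tumor_counts = {
    "fragment_1": (36, 48),
    "fragment_2": (29, 47),
    "fragment_3": (36, 48),
    "fragment_4": (17, 41),
    "fragment_5": (30, 39),
}


def sensitivity(methylated, analyzed):
    """Percentage of analyzed samples that were methylated (amplified)."""
    return 100.0 * methylated / analyzed


sensitivities = {
    name: round(sensitivity(methylated, analyzed), 1)
    for name, (methylated, analyzed) in tumor_counts.items()
}
for name, value in sensitivities.items():
    print(f"{name}: {value}%")
```

The resulting percentages approximately match the sensitivity values listed in the first two rows of the table, which suggests that those values refer to the tumor samples.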
List of most differentially methylated genes in non-small cell carcinoma
| Common in all tumors | 4.8 | transcription regulator | BARX homeobox 1 | HGNC:955 | |
| 4.3 | transcription regulator | paired box 9 | HGNC:8623 | ||
| 4.0 | transcription regulator | orthodenticle homeobox 1 | HGNC:8521 | ||
| 3.8 | G-protein coupled receptor | natriuretic peptide receptor C/guanylate cyclase C (atrionatriuretic peptide receptor C) | HGNC:7945 | ||
| 3.7 | growth factor | fibroblast growth factor 12 | HGNC:3668 | ||
| 3.6 | transcription regulator | one cut homeobox 2 | HGNC:8139 | ||
| 3.5 | transcription regulator | PR domain containing 14 | HGNC:14001 | ||
| 3.5 | transcription regulator | retina and anterior neural fold homeobox | HGNC:18662 | ||
| 3.3 | transcription regulator | short stature homeobox 2 | HGNC:10854 | ||
| 3.1 | transcription regulator | DMRT-like family A2 | HGNC:13908 | ||
| 3.1 | unknown | fer-1-like 4 ( | HGNC:15801 | ||
| 3.1 | transcription regulator | SIX homeobox 6 | HGNC:10892 | ||
| 3 | transcription regulator | GATA binding protein 3 | HGNC:4172 | ||
| 3 | transcription regulator | SKI family transcriptional corepressor 1 | HGNC:21326 | ||
| 3 | transcription regulator | homeobox A9 | HGNC:5109 | ||
| 2.9 | transcription regulator | sal-like 1 (Drosophila) | HGNC:10524 | ||
| 2.7 | transcription regulator | iroquois homeobox 2 | HGNC:14359 | ||
| 2.7 | ion channel | glutamate receptor, ionotropic, kainate 2 | HGNC:4580 | ||
| 2.6 | transcription regulator | SATB homeobox 2 | HGNC:21637 | ||
| 2.5 | transcription regulator | Meis homeobox 1 | HGNC:7000 | ||
| VAX1 | 2.4 | transcription regulator | ventral anterior homeobox 1 | HGNC:12660 | |
| TBX15 | 2.3 | transcription regulator | T-box 15 | HGNC:11594 | |
| −2.4 | unknown | cyclin N-terminal domain containing 2 | HGNC:25805 | ||
| −2.6 | unknown | zinc finger, MYND-type containing 10 | HGNC:19412 | ||
| −2.9 | transcription regulator | v-myc myelocytomatosis viral oncogene homolog (avian) | HGNC:7553 | ||
| −2.9 | Plasma Membrane | tetraspanin 9 | HGNC:21640 | ||
| −3.1 | unknown | neuron navigator 1 | HGNC:15989 | ||
| −3.4 | unknown | cytoplasmic polyadenylation element binding protein 3 | HGNC:21746 | ||
| Unique for squamous cell carcinomas | 6.5 | cell differentiation | trafficking protein particle complex 9 | HGNC:30832 | |
| 6.3 | hydrolase | abhydrolase domain containing 2 | HGNC:18717 | ||
| 6.2 | transcription regulator | catenin (cadherin-associated protein), delta 1 | HGNC:2515 | ||
| 6.2 | histone protein | histone cluster 1, H2bb | HGNC:4751 | ||
| 5.8 | protein binding | epithelial membrane protein 1 | HGNC:3333 | ||
| 5.6 | transcription regulator | transducin (beta)-like 1 X-linked receptor 1 | HGNC:29529 | ||
| 5.6 | protein binding | neurexophilin 1 | HGNC:20693 | ||
| 5.6 | transcription regulator | Zic family member 4 | HGNC:20393 | ||
| 5.4 | hydrolase | acyloxyacyl hydrolase (neutrophil) | HGNC:548 | ||
| 5.4 | transporter | actinin, alpha 4 | HGNC:166 | ||
| 5.2 | unknown | chromosome 1 open reading frame 21 | HGNC:15494 | ||
| 5.2 | transporter | protein kinase C and casein kinase substrate in neurons 2 | HGNC:8571 | ||
| 5.2 | isomerase | phosphomannomutase 2 | HGNC:9115 | ||
| 5.1 | histone methyltransferase | DOT1-like, histone H3 methyltransferase ( | HGNC:24948 | ||
| 5 | e3 ubiquitin-protein ligase | WW domain containing E3 ubiquitin protein ligase 2 | HGNC:16804 | ||
| 5 | transcription regulator | general transcription factor IIIC, polypeptide 1, alpha 220 kDa | HGNC:4664 | ||
| 5 | nuclear chaperone | MDN1, midasin homolog (yeast) | HGNC:18302 | ||
| 5 | transcription regulator | death inducer-obliterator 1 | HGNC:2680 | ||
| 5 | histone protein | histone cluster 1, H3c | HGNC:4768 | ||
| 4.9 | unknown | ankyrin repeat domain 13B | HGNC:26363 | ||
| 4.9 | hormone | calcitonin-related polypeptide beta | HGNC:1438 | ||
| 4.9 | phosphatase | protein tyrosine phosphatase, receptor type, A | HGNC:9664 | ||
| 4.8 | transcription regulator | signal transducer and activator of transcription 5A | HGNC:11366 | ||
| 4.8 | cytoskeleton | LIM domain kinase 1 | HGNC:6613 | ||
| 4.7 | transporter | solute carrier family 23 (nucleobase transporters), member 2 | HGNC:10973 | ||
| 4.5 | transcription regulator | BARX homeobox 1 | HGNC:955 | ||
| 4.5 | unknown | Nance-Horan syndrome (congenital cataracts and dental anomalies) | HGNC:7820 | ||
| 4.5 | glycosyltransferase | methylthioadenosine phosphorylase | HGNC:7413 | ||
| 4.5 | transcription regulator | forkhead box K1 | HGNC:23480 | ||
| 4.4 | O-methyltransferase activity | protein-L-isoaspartate (D-aspartate) O-methyltransferase | HGNC:8728 | ||
| 4.3 | transcription regulator | SET domain containing 1A | HGNC:29010 | ||
| 4.2 | centromere protein | centromere protein P | HGNC:32933 | ||
| 4.1 | unknown | KIAA1217 | HGNC:25428 | ||
| 4.1 | enhances neuronal dendrite outgrowth | SLIT and NTRK-like family, member 1 | HGNC:20297 | ||
| 4 | transcription regulator | RAR-related orphan receptor B | HGNC:10259 | ||
| 4 | postsynaptic scaffold in neuronal cells | discs, large (Drosophila) homolog-associated protein 1 | HGNC:2905 | ||
| 4 | unknown | chromosome 3 open reading frame 21 | HGNC:26639 | ||
| 4 | protein glycosylation | ST6 (alpha-N-acetyl-neuraminyl-2,3-beta-galactosyl-1,3)-N-acetylgalactosaminide alpha-2,6-sialyltransferase 1 | HGNC:23614 | ||
| 3.9 | transcription regulator | Zic family member 3 (odd-paired homolog, Drosophila) | HGNC:12874 | ||
| 3.8 | hydrolase | lysophospholipase I | HGNC:6737 | ||
| 3.8 | transcription regulator | RE1-silencing transcription factor | HGNC:9966 | ||
| −2 | unknown | transmembrane protein 132D | HGNC:29411 | ||
| −2.1 | microtubule | microtubule-associated protein 1 light chain 3 beta 2 | HGNC:34390 | ||
| −2.2 | unknown | chibby homolog 3 ( | HGNC:33278 | ||
| −2.3 | signal transduction | Ras interacting protein 1 | HGNC:24716 | ||
| −2.3 | kinase | protein kinase, cGMP-dependent, type I | HGNC:9414 | ||
| −2.4 | unknown | WD repeat domain 72 | HGNC:26790 | ||
| −2.7 | ion channel | potassium voltage-gated channel, KQT-like subfamily, member 2 | HGNC:6296 | ||
| −2.8 | G-protein coupled receptor | BAI1-associated protein 3 | HGNC:948 | ||
| −3.0 | kinase | mitogen-activated protein kinase kinase kinase 10 | HGNC:6849 | ||
| Unique for Adenocarcinomas | 4.4 | transcription regulator | musculin | HGNC:7321 | |
| 4.1 | unknown | family with sequence similarity 78, member B | HGNC:13495 | ||
| 3.6 | transcription regulator | homeobox A1 | HGNC:5099 | ||
| 3.3 | enzyme | septin 9 | HGNC:7323 | ||
| 3.3 | Cell cycle/growth | growth arrest-specific 1 | HGNC:4165 | ||
| 3.1 | phosphatase | protein tyrosine phosphatase, receptor type, N polypeptide 2 | HGNC:9677 | ||
| 2.7 | Wnt receptor signaling pathway | R-spondin 2 homolog ( | HGNC:28583 | ||
| 2.6 | transcription regulator | POU class 3 homeobox 3 | HGNC:9216 | ||
| 2.5 | transporter | transient receptor potential cation channel, subfamily A, member 1 | HGNC:497 | ||
| 2.3 | transporter | synaptotagmin VI | HGNC:18638 | ||
| 2.3 | transporter | solute carrier family 6 (neurotransmitter transporter, noradrenalin), member 2 | HGNC:11048 | ||
| 2.1 | transcription regulator | LIM homeobox 1 | HGNC:6593 | ||
| 2.1 | small GTPase mediated signal transduction | Rap guanine nucleotide exchange factor (GEF) 5 | HGNC:16862 | ||
| 2.1 | growth factor | growth differentiation factor 10 | HGNC:4215 | ||
| 1.6 | unknown | chromosome 3 open reading frame 45 | HGNC:26781 | ||
| 1.6 | differentiation/apoptosis | slit homolog 2 (Drosophila) | HGNC:11086 | ||
| −1.6 | transcription regulator | zinc finger protein 423 | HGNC:16762 | ||
| −1.8 | actin filament organization | ras homolog gene family, member F (in filopodia) | HGNC:15703 |
Genes were selected based on the average of log2 ratios from the samples in each group. Hypermethylated genes common to all tumors and unique to the ADC group were selected based on log2 ratio > 2.5; for hypomethylation, all genes from those groups were included. For SCC we selected the 50 genes with the highest log2 ratios for hypermethylation, and genes with log2 ratio < −2.0 for hypomethylation. Gene classes and their corresponding biological functions were retrieved from Gene Ontology (http://www.geneontology.org/) and the Protein Knowledgebase UniProtKB (http://www.uniprot.org/).
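The selection rules in the note above can be sketched in a few lines. The gene names and per-sample log2 ratios below are invented for illustration, the helper names are hypothetical, and the reading of "all genes" for hypomethylation as "all genes with a negative average log2 ratio" is an assumption rather than something the note states.

```python
from statistics import mean


def average_log2(per_sample_ratios):
    """Average the per-sample log2 ratios of each gene within one group."""
    return {gene: mean(values) for gene, values in per_sample_ratios.items()}


def hyper_by_threshold(avg, threshold=2.5):
    """Hypermethylated genes with average log2 ratio above a threshold."""
    hits = [gene for gene, value in avg.items() if value > threshold]
    return sorted(hits, key=lambda gene: avg[gene], reverse=True)


def hyper_top_n(avg, n=50):
    """The n hypermethylated genes with the highest average log2 ratio."""
    hits = [gene for gene, value in avg.items() if value > 0]
    return sorted(hits, key=lambda gene: avg[gene], reverse=True)[:n]


def hypo(avg, threshold=0.0):
    """Hypomethylated genes with average log2 ratio below a threshold.

    threshold=0.0 keeps every gene with a negative average (assumed reading
    of "all genes"); threshold=-2.0 mirrors the rule stated for SCC.
    """
    hits = [gene for gene, value in avg.items() if value < threshold]
    return sorted(hits, key=lambda gene: avg[gene])


# Invented per-sample log2 ratios (tumor versus lung) for four made-up genes.
ratios = {
    "GENE_A": [3.0, 3.4, 2.9],
    "GENE_B": [2.0, 2.2, 2.4],
    "GENE_C": [-2.5, -2.1, -2.6],
    "GENE_D": [-1.0, -1.4, -1.2],
}
avg = average_log2(ratios)
print(hyper_by_threshold(avg), hyper_top_n(avg, n=2), hypo(avg), hypo(avg, -2.0))
```

With the thresholds from the note, `hyper_by_threshold(avg)` corresponds to the rule for the common and ADC groups, while `hyper_top_n(avg, n=50)` and `hypo(avg, -2.0)` correspond to the rules for SCC.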