Genome-wide epigenetic profiling of breast cancer tumors treated with aromatase inhibitors.

Ekaterina Nevedomskaya, Lodewyk Wessels, Wilbert Zwart.

Abstract

Aromatase inhibitors (AIs) are extensively used in the treatment of estrogen receptor-positive breast cancers; however, resistance to AI treatment is commonly observed. Apart from estrogen receptor α (ERα) expression, no predictive biomarkers for response to AI treatment are clinically applied. Because other therapeutic options, such as tamoxifen, exist in the clinic, there is an urgent medical need for treatment-selective biomarkers that enable personalized endocrine treatment selection in breast cancer. In the described dataset, ERα chromatin binding and the histone marks H3K4me3 and H3K27me3 were assessed genome-wide by chromatin immunoprecipitation (ChIP) combined with massively parallel sequencing (ChIP-seq). These datasets were used to develop a classifier that stratifies breast cancer patients by outcome after AI treatment in the metastatic setting. Here we describe in detail the data and quality control metrics, as well as the clinical information associated with the study, published by Jansen et al. [1]. The data are publicly available through the GEO database under accession number GSE40867.

Keywords:  Aromatase inhibitor treatment outcome; Breast cancer; ChIP-seq; Epigenetic modifications; Estrogen Receptor

Year:  2014        PMID: 26484094      PMCID: PMC4536071          DOI: 10.1016/j.gdata.2014.06.023

Source DB:  PubMed          Journal:  Genom Data        ISSN: 2213-5960


Direct link to deposited data

Deposited data can be found here: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE40867.

Experimental design, materials and methods

Study population and clinical data

A cohort of 84 patients with metastatic ERα-positive breast cancer who received AI therapy was selected for evaluation. Tumor material analyzed by genomic profiling was extracted from primary surgery specimens. The patient selection criteria, definitions of follow-up, tumor staging, and response to therapy were previously described by Ramirez-Ardila et al. [5]. Briefly, fresh frozen ERα-positive breast tumor tissue specimens were collected from female patients with primary operable breast cancer whose metastatic disease was treated with first-line aromatase inhibitors (anastrozole, letrozole, or exemestane). Time to progression (TTP) was taken as the end point. Thirteen specimens, each containing more than 50% ER-positive tumor cells, were selected for chromatin immunoprecipitation (ChIP) and massively parallel sequencing (ChIP-seq) analyses. Poor-outcome patients were defined as those with a TTP < 12 months, and good-outcome patients as those with a TTP > 24 months. Clinical characteristics of the selected groups of patients are provided in Table 1, and clinical characteristics per sample are provided in Supplementary Table 1.
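Table 1

Patient and tumor characteristics for the selected groups.

Patients (n = 13)

| Characteristic | Good outcome | Poor outcome |
|---|---|---|
| No. of patients | 5 | 8 |
| Age at diagnosis (mean), years | 64 | 60 |
| Age at start therapy (mean), years | 68 | 63 |
| Treatment type: Anastrozole | 2 | 5 |
| Treatment type: Exemestane | 0 | 1 |
| Treatment type: Exemestane | 0 | 1 |
| Treatment type: Letrozole | 3 | 1 |
| Grade 1 | 1 | 0 |
| Grade 2 | 3 | 3 |
| Grade 3 | 1 | 4 |
| ER status: Negative | 0 | 0 |
| ER status: Positive | 5 | 8 |
| PR status: Negative | 0 | 0 |
| PR status: Positive | 5 | 8 |
| HER2 status: Negative | 3 | 5 |
| HER2 status: Positive | 1 | 1 |
| TTP (median), months | 38 | 6.5 |
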
The anonymized clinical data were deposited in the Gene Expression Omnibus database (GEO; [3]) under accession number GSE40867.
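
The outcome grouping described above can be written down as a small rule. The following is a minimal sketch, not the authors' code: the function and constant names are hypothetical, and the only facts taken from the text are the two TTP cutoffs (< 12 months for poor outcome, > 24 months for good outcome).

```python
from typing import Optional

# Cutoffs taken from the text; the names are illustrative assumptions.
POOR_MAX_MONTHS = 12.0  # TTP < 12 months -> poor outcome
GOOD_MIN_MONTHS = 24.0  # TTP > 24 months -> good outcome


def outcome_group(ttp_months: float) -> Optional[str]:
    """Assign an outcome group from time to progression (TTP) in months.

    Returns "poor", "good", or None when the TTP lies between the two
    cutoffs and therefore belongs to neither group as defined in the text.
    """
    if ttp_months < POOR_MAX_MONTHS:
        return "poor"
    if ttp_months > GOOD_MIN_MONTHS:
        return "good"
    return None


# Median TTP values reported in Table 1 for the two groups.
median_good = outcome_group(38)
median_poor = outcome_group(6.5)
```

Applied to the group medians in Table 1 (38 and 6.5 months), the rule returns "good" and "poor" respectively; a TTP of 12 to 24 months is deliberately left unassigned.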

Chromatin immunoprecipitations and sequencing

Chromatin immunoprecipitation (ChIP) was performed as described before [1]. To obtain input material, tumor samples were cryosectioned (30 × 30 μm sections) prior to further processing for ChIP-seq as described before [7]. For each ChIP, 10 μg of antibody and 100 μL of Protein A magnetic beads (Invitrogen) were used. Antibodies against ERα (SC-543; Santa Cruz), H3K4me3 (ab8580; Abcam), and H3K27me3 (07–449; Millipore) were used. ChIP DNA was amplified as described [1], [4]. Sequences were generated on the Illumina HiSeq 2000 genome analyzer (using 50 bp reads) and aligned to the Human Reference Genome (assembly hg19, February 2009). Non-ChIP input DNA from a randomly selected tumor was sequenced as an input control. Enriched regions of the genome were identified by comparing the ChIP samples to input using the MACS peak caller [5], version 1.3.7.1, with default parameters except for the p-value cutoff, which was set at 10⁻⁷. Details on the number of reads obtained, the percentage of reads aligned, and the number of peaks called can be found in Table 2. ChIP-seq data and sample annotations were deposited in GEO under accession number GSE40867.
Table 2

Read count, number of peaks and quality parameters.

| GEO accession | ChIP | Total reads | Mapped reads (%) | No. of peaks | Fraction of reads in peaks (%) | NSC | RSC |
|---|---|---|---|---|---|---|---|
| GSM1003708 | ERα | 23,760,885 | 21,964,709 (92.4) | 524 | 0.15 | 1.02 | 0.48 |
| GSM1003709 | H3K4me3 | 22,772,852 | 20,520,046 (90.1) | 16,384 | 26.35 | 1.64 | 1.72 |
| GSM1003710 | H3K27me3 | 26,990,559 | 26,122,735 (96.8) | 14,890 | 3.56 | 1.01 | 0.46 |
| GSM1003711 | ERα | 23,802,294 | 22,002,226 (92.4) | 2255 | 0.6 | 1.02 | 0.47 |
| GSM1003712 | H3K4me3 | 22,591,289 | 20,813,064 (92.1) | 16,857 | 31.03 | 1.61 | 1.49 |
| GSM1003713 | H3K27me3 | 22,096,326 | 21,343,362 (96.6) | 10,078 | 2.91 | 1.01 | 0.32 |
| GSM1003714 | ERα | 20,789,758 | 17,832,808 (85.8) | 15,381 | 4.38 | 1.09 | 0.89 |
| GSM1003715 | H3K4me3 | 23,075,271 | 20,411,990 (88.5) | 25,111 | 11.97 | 1.07 | 0.53 |
| GSM1003716 | H3K27me3 | 19,103,286 | 17,130,759 (89.7) | 4008 | 1.48 | 1.02 | 0.32 |
| GSM1003717 | ERα | 22,555,195 | 21,115,239 (93.6) | 2726 | 0.84 | 1.04 | 0.76 |
| GSM1003718 | H3K4me3 | 19,872,399 | 18,226,845 (91.7) | 16,320 | 9.05 | 1.05 | 0.44 |
| GSM1003719 | H3K27me3 | 23,961,464 | 22,493,285 (93.9) | 3085 | 0.64 | 1.02 | 0.39 |
| GSM1003720 | ERα | 16,604,876 | 15,605,068 (94.0) | 13,575 | 3.61 | 1.09 | 1.03 |
| GSM1003721 | H3K4me3 | 10,238,004 | 9,467,187 (92.5) | 19,012 | 6.69 | 1.09 | 0.33 |
| GSM1003722 | H3K27me3 | 22,530,535 | 21,625,249 (96.0) | 33,661 | 9.13 | 1.04 | 0.83 |
| GSM1003723 | ERα | 19,902,396 | 18,778,288 (94.4) | 6387 | 1.46 | 1.04 | 0.7 |
| GSM1003724 | H3K4me3 | 20,235,985 | 18,151,245 (89.7) | 18,351 | 41.23 | 1.83 | 1.56 |
| GSM1003725 | H3K27me3 | 24,169,596 | 23,067,266 (95.4) | 30,588 | 7.44 | 1.02 | 0.42 |
| GSM1003726 | ERα | 16,011,312 | 13,905,708 (86.8) | 2287 | 0.58 | 1.02 | 0.41 |
| GSM1003727 | H3K27me3 | 16,423,400 | 15,482,959 (94.2) | 28,514 | 7.34 | 1.03 | 0.57 |
| GSM1003728 | ERα | 21,552,073 | 17,908,925 (83.1) | 709 | 0.72 | 1.02 | 0.31 |
| GSM1003729 | H3K4me3 | 27,693,755 | 25,171,058 (90.9) | 27,023 | 15.47 | 1.16 | 1.03 |
| GSM1003730 | H3K27me3 | 27,372,177 | 24,765,816 (90.5) | 11,395 | 1.28 | 1.02 | 0.67 |
| GSM1003731 | ERα | 15,620,215 | 14,134,239 (90.5) | 5170 | 2.48 | 1.05 | 0.72 |
| GSM1003732 | H3K4me3 | 20,741,336 | 18,816,604 (90.7) | 26,821 | 14.94 | 1.1 | 0.57 |
| GSM1003733 | H3K27me3 | 21,310,477 | 20,553,892 (96.4) | 27,122 | 3.28 | 1.02 | 0.49 |
| GSM1003734 | ERα | 18,169,785 | 16,090,891 (88.6) | 19,716 | 5.43 | 1.14 | 1.19 |
| GSM1003735 | H3K4me3 | 26,621,106 | 24,586,405 (92.4) | 23,785 | 1.58 | 1.11 | 0.87 |
| GSM1003736 | H3K27me3 | 26,069,531 | 25,135,569 (96.4) | 59,910 | 22.03 | 1.06 | 1.23 |
| GSM1003737 | ERα | 20,867,111 | 18,925,868 (90.7) | 1110 | 0.29 | 1.01 | 0.37 |
| GSM1003738 | H3K4me3 | 20,012,887 | 17,988,530 (89.9) | 16,427 | 22.24 | 1.38 | 1.41 |
| GSM1003739 | H3K27me3 | 23,750,330 | 22,949,910 (96.6) | 4256 | 0.79 | 1.01 | 0.33 |
| GSM1003740 | ERα | 13,499,179 | 12,530,097 (92.8) | 924 | 1.76 | 1.02 | 0.33 |
| GSM1003741 | H3K4me3 | 26,027,543 | 24,076,775 (92.5) | 26,396 | 11.24 | 1.13 | 0.96 |
| GSM1003742 | H3K27me3 | 31,996,441 | 30,602,420 (95.6) | 7067 | 1.13 | 1.03 | 0.86 |
| GSM1003743 | Input | 27,097,497 | 25,588,905 (94.4) | | | | |
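
The quality columns of Table 2 can be checked against the ENCODE thresholds used in the Quality control section (fraction of reads in peaks of at least 1%, NSC > 1.05, RSC > 0.8). The sketch below is illustrative only: the values are transcribed by hand from Table 2 in GEO accession order (GSM1003708 to GSM1003742, input excluded), and the variable and function names are assumptions, not part of the published analysis.

```python
# (ChIP target, fraction of reads in peaks in %, NSC, RSC), from Table 2.
TABLE2 = [
    ("ERalpha", 0.15, 1.02, 0.48), ("H3K4me3", 26.35, 1.64, 1.72), ("H3K27me3", 3.56, 1.01, 0.46),
    ("ERalpha", 0.6, 1.02, 0.47), ("H3K4me3", 31.03, 1.61, 1.49), ("H3K27me3", 2.91, 1.01, 0.32),
    ("ERalpha", 4.38, 1.09, 0.89), ("H3K4me3", 11.97, 1.07, 0.53), ("H3K27me3", 1.48, 1.02, 0.32),
    ("ERalpha", 0.84, 1.04, 0.76), ("H3K4me3", 9.05, 1.05, 0.44), ("H3K27me3", 0.64, 1.02, 0.39),
    ("ERalpha", 3.61, 1.09, 1.03), ("H3K4me3", 6.69, 1.09, 0.33), ("H3K27me3", 9.13, 1.04, 0.83),
    ("ERalpha", 1.46, 1.04, 0.7), ("H3K4me3", 41.23, 1.83, 1.56), ("H3K27me3", 7.44, 1.02, 0.42),
    ("ERalpha", 0.58, 1.02, 0.41), ("H3K27me3", 7.34, 1.03, 0.57),
    ("ERalpha", 0.72, 1.02, 0.31), ("H3K4me3", 15.47, 1.16, 1.03), ("H3K27me3", 1.28, 1.02, 0.67),
    ("ERalpha", 2.48, 1.05, 0.72), ("H3K4me3", 14.94, 1.1, 0.57), ("H3K27me3", 3.28, 1.02, 0.49),
    ("ERalpha", 5.43, 1.14, 1.19), ("H3K4me3", 1.58, 1.11, 0.87), ("H3K27me3", 22.03, 1.06, 1.23),
    ("ERalpha", 0.29, 1.01, 0.37), ("H3K4me3", 22.24, 1.38, 1.41), ("H3K27me3", 0.79, 1.01, 0.33),
    ("ERalpha", 1.76, 1.02, 0.33), ("H3K4me3", 11.24, 1.13, 0.96), ("H3K27me3", 1.13, 1.03, 0.86),
]

# Thresholds quoted in the text.
MIN_FRIP_PERCENT = 1.0
MIN_NSC = 1.05
MIN_RSC = 0.8


def pass_fraction(rows, predicate):
    """Fraction of rows for which predicate(row) is true."""
    rows = list(rows)
    return sum(1 for row in rows if predicate(row)) / len(rows)


# All ChIP samples: fraction meeting the minimal fraction of reads in peaks.
frip_pass = pass_fraction(TABLE2, lambda r: r[1] >= MIN_FRIP_PERCENT)

# H3K4me3 samples only: fractions meeting the NSC and RSC criteria.
h3k4me3 = [r for r in TABLE2 if r[0] == "H3K4me3"]
h3k4me3_nsc_pass = pass_fraction(h3k4me3, lambda r: r[2] > MIN_NSC)
h3k4me3_rsc_pass = pass_fraction(h3k4me3, lambda r: r[3] > MIN_RSC)

print(f"FRiP >= 1%: {frip_pass:.1%} of {len(TABLE2)} samples")
print(f"H3K4me3 NSC > 1.05: {h3k4me3_nsc_pass:.1%}; RSC > 0.8: {h3k4me3_rsc_pass:.1%}")
```

With the transcribed values, 27 of 35 samples meet the fraction-of-reads-in-peaks threshold, and 10 of 11 and 7 of 11 H3K4me3 samples meet the NSC and RSC criteria, respectively.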

Quality control

Prior to analysis, regions known to typically bind ERα or to carry histone modifications were visually inspected using the Integrative Genomics Viewer (IGV) 2.1 (www.broadinstitute.org/igv/). Examples of such regions are provided in Fig. 1A. As expected, ERα peaks were found at the enhancers of known estrogen-responsive genes (e.g. XBP1 (Fig. 1A), RARA, GREB1), H3K4me3 signal was observed at promoters of estrogen-responsive genes, and H3K27me3 marked genes not expressed in breast tissue, such as NEUROD1 (Fig. 1A). The peaks of the H3K4me3 histone modification are often wider than the peaks of ERα binding [6], while the transcription-repressive histone mark H3K27me3 can cover large areas, including full gene bodies [7], which also results in the identification of broad peaks for this histone mark. Peak widths for all three datasets are illustrated by the density distributions in Fig. 1B.
Fig. 1

Quality control and data metrics of ChIP-seq data. (A) Example genomic regions with distinct and unique signal of ERα (red), H3K4me3 (blue), and H3K27me3 (green) binding events. Genomic coordinates are indicated. Tag count is shown for each position. (B) Distribution of peak widths in different ChIP-seq datasets. (C) Example of a cross-correlation plot. Blue dashed line indicates the ‘phantom’ peak corresponding to the read length, red dashed line marks the peak of the fragment length. (D) Distribution of ERα motifs relative to the peak position of ERα binding events.
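
The two peaks marked in the cross-correlation plot (Fig. 1C) are the basis of the NSC and RSC values reported in Table 2. As a rough sketch of how these ratios are commonly defined, NSC is the cross-correlation at the fragment-length peak divided by the minimum cross-correlation, and RSC is the fragment-length peak minus the minimum, divided by the read-length ('phantom') peak minus the minimum. The code below is not the phantompeakqualtools implementation: the function name, the dictionary input, the `margin` parameter used to separate the two peaks, and the synthetic profile are all assumptions for illustration.

```python
import math


def nsc_rsc(cc_by_shift, read_length, margin=10):
    """Estimate fragment length, NSC and RSC from a cross-correlation profile.

    cc_by_shift maps strand shift (bp) to a positive cross-correlation value.
    The fragment-length peak is taken as the highest value at shifts more
    than `margin` bp beyond the read length (a simplifying assumption).
    """
    cc_min = min(cc_by_shift.values())
    # Phantom peak: the profile value at the shift closest to the read length.
    phantom_shift = min(cc_by_shift, key=lambda s: abs(s - read_length))
    cc_phantom = cc_by_shift[phantom_shift]
    # Fragment-length peak: maximum beyond the read-length region.
    candidates = [s for s in cc_by_shift if s > read_length + margin]
    fragment_shift = max(candidates, key=lambda s: cc_by_shift[s])
    cc_fragment = cc_by_shift[fragment_shift]
    nsc = cc_fragment / cc_min
    rsc = (cc_fragment - cc_min) / (cc_phantom - cc_min)
    return fragment_shift, nsc, rsc


# Synthetic profile: baseline 0.10, phantom bump at 50 bp, fragment bump at 200 bp.
profile = {
    s: 0.10
    + 0.02 * math.exp(-((s - 50) ** 2) / (2 * 5 ** 2))
    + 0.03 * math.exp(-((s - 200) ** 2) / (2 * 30 ** 2))
    for s in range(0, 400)
}
fragment_length, nsc, rsc = nsc_rsc(profile, read_length=50)
print(fragment_length, round(nsc, 2), round(rsc, 2))
```

In this synthetic profile the fragment-length bump is higher than the phantom bump, so RSC exceeds 1; a profile dominated by the read-length peak would give RSC below 1.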

There is currently no consensus on quality control metrics for ChIP- and enrichment-based technologies such as ChIP-seq and GRO-seq. Commonly, the numbers of reads and peaks detected are reported. The total number of reads, the number of aligned reads, and the number of peaks for each ChIP-seq sample are shown in Table 2. A few quality control procedures have been suggested in the literature [8], [9]; however, their use is not established practice, and some of them may not be applicable to the full variety of ChIP-seq data. Here we employed the quality control measures suggested by the ENCODE consortium [8]. It is important to mention, however, that the ENCODE guidelines are used for the analysis of data from cell line experiments. Data from tumor samples, as used in the current study, are more difficult to process because of intrinsic intra-tumor heterogeneity and biological variation. We therefore cannot expect our tumor-based ChIP-seq data to fully meet the criteria used for cell line data. The minimal fraction of reads in peaks prescribed by ENCODE (1%), an indicator of ChIP efficiency, was met in almost 80% of the samples (Table 2). Cross-correlations of positive and negative strands were calculated using publicly available scripts (http://code.google.com/p/phantompeakqualtools) [10], [11]. An example of a cross-correlation plot is shown in Fig. 1C. Dominant fragment and read lengths were calculated from the cross-correlations, and the related measures, namely Normalized Strand Coefficient (NSC) and Relative Strand Correlation (RSC), were assessed. As can be seen from Table 2, not all samples meet the ENCODE criteria of NSC > 1.05 and RSC > 0.8. The best results for these parameters are achieved in the H3K4me3 data, with over 90% of samples meeting the NSC criterion and over 60% meeting the RSC criterion. Overall, the quality metrics for ERα ChIP-seq have lower values than those for the histone marks.
This is not surprising, however, for a number of reasons. First, immunoprecipitation of chromatin with histone marks is more efficient because histones are an intrinsic part of chromatin, whereas ERα is a transcription factor that is not integrated in the chromatin structure. Second, because ERα is a hormone-dependent transcription factor, its chromatin interactions depend on the physiological levels of estradiol (E2), which may be non-saturating within the tumor and could vary from patient to patient. Third, as shown before, high-quality ChIP-seq datasets with a limited number of genuine binding sites may produce low NSC and RSC values [8]. We further validated that the peaks detected in the ERα ChIP-seq data are genuine signal and correspond to ERα binding sites. Called peaks found in at least two tumor samples were considered for analysis, resulting in 11,262 peaks for the ERα dataset. This high number of consensus peaks illustrates the quality of the data available for analysis. We subsequently located ERα motifs in these peaks using the ScreenMotif tool from Cistrome (cistrome.org). As seen in Fig. 1D, the motifs are clearly concentrated around the centers of the identified peaks. This illustrates that, despite the marginal NSC and RSC values, the detected ERα peaks represent genuine signal. R scripts for the analysis are available upon request.
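
The consensus step described above (keeping peaks found in at least two tumor samples) can be sketched as follows. The exact overlap rule used in the study is not stated here, so this example assumes one plausible definition: peaks from all samples are pooled, overlapping intervals are merged into regions, and a region is kept when it contains peaks from at least `min_samples` distinct samples. The function name, the input layout, and the toy coordinates are hypothetical; this is Python, not the authors' R scripts.

```python
from collections import defaultdict


def consensus_peaks(peaks_by_sample, min_samples=2):
    """Merge overlapping peaks across samples and keep well-supported regions.

    peaks_by_sample maps a sample id to a list of (chrom, start, end)
    half-open intervals. Returns merged (chrom, start, end) regions that
    contain peaks from at least `min_samples` distinct samples.
    """
    by_chrom = defaultdict(list)
    for sample, peaks in peaks_by_sample.items():
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end, sample))

    consensus = []
    for chrom in sorted(by_chrom):
        cur_start = cur_end = None
        samples = set()
        for start, end, sample in sorted(by_chrom[chrom]):
            if cur_start is not None and start < cur_end:
                # Overlaps the region being built: extend it.
                cur_end = max(cur_end, end)
                samples.add(sample)
            else:
                # Close the previous region, then start a new one.
                if cur_start is not None and len(samples) >= min_samples:
                    consensus.append((chrom, cur_start, cur_end))
                cur_start, cur_end, samples = start, end, {sample}
        if cur_start is not None and len(samples) >= min_samples:
            consensus.append((chrom, cur_start, cur_end))
    return consensus


# Toy input with made-up coordinates for three hypothetical tumors.
toy_peaks = {
    "tumor_A": [("chr1", 100, 200), ("chr1", 500, 600)],
    "tumor_B": [("chr1", 150, 250), ("chr2", 10, 50)],
    "tumor_C": [("chr1", 590, 700)],
}
shared = consensus_peaks(toy_peaks)
print(shared)
```

In the toy input, the two chr1 regions are each supported by two tumors and are kept, while the chr2 peak is seen in only one tumor and is dropped.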

Discussion

Here we describe a unique dataset in which we profiled the chromatin binding landscapes of ERα, H3K4me3 and H3K27me3 in primary human ERα-positive luminal breast tumor specimens. Patients were treated with AIs in the metastatic setting, and survival data are available in the public data repositories. Our dataset thus consists of two parts: clinical data and ChIP-seq data. The clinical data include outcome upon treatment with aromatase inhibitors and other important clinicopathological characteristics. The ChIP-seq data comprise genome-wide profiles of estrogen receptor (ERα) binding to chromatin, the promoter-specific histone modification H3K4me3, and the transcription-repressive histone mark H3K27me3. This dataset was recently used in a publication to find epigenetic signatures related to outcome upon aromatase inhibitor treatment for metastatic breast cancer [1]. The following are the supplementary data related to this article.

Supplementary Table S1.

Clinicopathological parameters of the patient series.

Specifications

Organism/cell line/tissue: Homo sapiens
Sex: Female
Sequencer or array type: Illumina HiSeq 2000 genome analyzer
Data format: Raw: SRA study; processed: BED
Experimental factors: Poor vs. good outcome tumors
Experimental features: Genome-wide binding of Estrogen Receptor α (ERα), as well as histone marks H3K4me3 and H3K27me3, was assessed in tumors from breast cancer patients with good or poor survival outcome after aromatase inhibitor therapy.
Consent: All patients gave their written informed consent before study entry.
Sample source location: Samples were from breast cancer patients treated at the Erasmus University Medical Center (EMC; Rotterdam, the Netherlands), the Netherlands Cancer Institute/Antoni van Leeuwenhoek hospital (Amsterdam, the Netherlands), and the Translational Cancer Research Unit (Saint Augustinus Hospital, Antwerpen, Belgium).

References

1.  A chromatin landmark and transcription initiation at most promoters in human cells.

Authors:  Matthew G Guenther; Stuart S Levine; Laurie A Boyer; Rudolf Jaenisch; Richard A Young
Journal:  Cell       Date:  2007-07-13       Impact factor: 41.582

2.  ChIP-seq: using high-throughput sequencing to discover protein-DNA interactions.

Authors:  Dominic Schmidt; Michael D Wilson; Christiana Spyrou; Gordon D Brown; James Hadfield; Duncan T Odom
Journal:  Methods       Date:  2009-03-09       Impact factor: 3.608

3.  Hotspot mutations in PIK3CA associate with first-line treatment outcome for aromatase inhibitors but not for tamoxifen.

Authors:  Diana E Ramirez-Ardila; Jean C Helmijr; Maxime P Look; Irene Lurkin; Kirsten Ruigrok-Ritstier; Steven van Laere; Luc Dirix; Fred C Sweep; Paul N Span; Sabine C Linn; John A Foekens; Stefan Sleijfer; Els M J J Berns; Maurice P H M Jansen
Journal:  Breast Cancer Res Treat       Date:  2013-04-17       Impact factor: 4.872

4.  Hallmarks of aromatase inhibitor drug resistance revealed by epigenetic profiling in breast cancer.

Authors:  Maurice P H M Jansen; Theo Knijnenburg; Esther A Reijm; Iris Simon; Ron Kerkhoven; Marjolein Droog; Arno Velds; Steven van Laere; Luc Dirix; Xanthippi Alexi; John A Foekens; Lodewyk Wessels; Sabine C Linn; Els M J J Berns; Wilbert Zwart
Journal:  Cancer Res       Date:  2013-11-15       Impact factor: 12.701

5.  ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia.

Authors:  Stephen G Landt; Georgi K Marinov; Anshul Kundaje; Pouya Kheradpour; Florencia Pauli; Serafim Batzoglou; Bradley E Bernstein; Peter Bickel; James B Brown; Philip Cayting; Yiwen Chen; Gilberto DeSalvo; Charles Epstein; Katherine I Fisher-Aylor; Ghia Euskirchen; Mark Gerstein; Jason Gertz; Alexander J Hartemink; Michael M Hoffman; Vishwanath R Iyer; Youngsook L Jung; Subhradip Karmakar; Manolis Kellis; Peter V Kharchenko; Qunhua Li; Tao Liu; X Shirley Liu; Lijia Ma; Aleksandar Milosavljevic; Richard M Myers; Peter J Park; Michael J Pazin; Marc D Perry; Debasish Raha; Timothy E Reddy; Joel Rozowsky; Noam Shoresh; Arend Sidow; Matthew Slattery; John A Stamatoyannopoulos; Michael Y Tolstorukov; Kevin P White; Simon Xi; Peggy J Farnham; Jason D Lieb; Barbara J Wold; Michael Snyder
Journal:  Genome Res       Date:  2012-09       Impact factor: 9.043

6.  ChIP-seq analysis reveals distinct H3K27me3 profiles that correlate with transcriptional activity.

Authors:  Matthew D Young; Tracy A Willson; Matthew J Wakefield; Evelyn Trounson; Douglas J Hilton; Marnie E Blewitt; Alicia Oshlack; Ian J Majewski
Journal:  Nucleic Acids Res       Date:  2011-06-07       Impact factor: 16.971

7.  Model-based analysis of ChIP-Seq (MACS).

Authors:  Yong Zhang; Tao Liu; Clifford A Meyer; Jérôme Eeckhoute; David S Johnson; Bradley E Bernstein; Chad Nusbaum; Richard M Myers; Myles Brown; Wei Li; X Shirley Liu
Journal:  Genome Biol       Date:  2008-09-17       Impact factor: 13.583

8.  Design and analysis of ChIP-seq experiments for DNA-binding proteins.

Authors:  Peter V Kharchenko; Michael Y Tolstorukov; Peter J Park
Journal:  Nat Biotechnol       Date:  2008-11-16       Impact factor: 54.908

9.  Large-scale quality analysis of published ChIP-seq data.

Authors:  Georgi K Marinov; Anshul Kundaje; Peter J Park; Barbara J Wold
Journal:  G3 (Bethesda)       Date:  2014-02-19       Impact factor: 3.154

10.  A quality control system for profiles obtained by ChIP sequencing.

Authors:  Marco-Antonio Mendoza-Parra; Wouter Van Gool; Mohamed Ashick Mohamed Saleem; Danilo Guillermo Ceschin; Hinrich Gronemeyer
Journal:  Nucleic Acids Res       Date:  2013-09-14       Impact factor: 16.971
