Literature DB >> 26484082

Gene expression analysis of livers from female B6C3F1 mice exposed to carcinogenic and non-carcinogenic doses of furan, with or without bromodeoxyuridine (BrdU) treatment.

Anna Francina Webster¹, Andrew Williams², Leslie Recio³, Carole L Yauk².

Abstract

Standard methodology for identifying chemical carcinogens is both time-consuming and resource intensive. Researchers are actively investigating how new technologies can be used to identify chemical carcinogens in a more rapid and cost-effective manner. Here we performed a toxicogenomic case study of the liver carcinogen furan. Full study and mode of action details were previously published in the Journal of Toxicology and Applied Pharmacology. Female B6C3F1 mice were sub-chronically treated with two non-carcinogenic (1 and 2 mg/kg bw) and two carcinogenic (4 and 8 mg/kg bw) doses of furan for 21 days. Half of the mice in each dose group were also treated with 0.02% bromodeoxyuridine (BrdU) for five days prior to sacrifice [13]. Agilent gene expression microarrays were used to measure changes in liver gene and long non-coding RNA expression (published in Toxicological Sciences). Here we describe the experimental and quality control details for the microarray data. We also provide the R code used to analyze the raw data files, produce fold change and false discovery rate (FDR) adjusted p values for each gene, and construct hierarchical clustering between datasets.

Entities: Chemical Disease Gene Species

Keywords: Dose–response; Furan; Liver cancer; Microarray; Statistical analysis in R

Year: 2014 PMID： 26484082 PMCID： PMC4536026 DOI： 10.1016/j.gdata.2014.05.013

Source DB: PubMed Journal: Genom Data ISSN： 2213-5960

Direct link to deposited data

The complete dataset is available through the Gene Expression Omnibus. Non-BrdU dataset: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE48644 BrdU dataset: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE54078

Experimental design, materials and methods

Study design

5–6 week old female specific pathogen free B6C3F1 mice were purchased from Charles River Breeding Laboratories (Portage, ME) and were allowed to acclimatize for at least seven days prior to the start of the study. Feed (NIH-07; Zeigler Brothers, Inc., Gardners, PA) and tap water were available ad libitum up until the time of necropsy. Mice were housed five per cage in polycarbonate cages in a specific pathogen free (SPF) and Association for Assessment and Accreditation of Laboratory Animal Care (AAALAC) accredited facility. Female mice were dosed with furan (CAS no. 110-00-9) (> 99% pure) (Sigma-Aldrich Chemical Co., Milwaukee, WI) in Mazola corn oil at 0, 1, 2, 4, or 8 mg/kg bw per day by oral gavage for three weeks (n = 10 per dose groups). Upon necropsy, there remained n = 5 mice in each non-BrdU dose group. However, some mice were lost due to early (pre-BrdU treatment) mis-dosing or esophageal puncture, therefore the 0, 1, 2, 4, or 8 mg/kg bw + BrdU groups had n = 4, 5, 3, 4, or 5 mice, respectively. + BrdU mice were treated with 0.02% BrdU (Sigma Chemical Co., St. Louis, MO, USA) in drinking water for five days prior to sacrifice. Four hours after their final dosing, mice were anesthetized by carbon dioxide inhalation prior to euthanasia by exsanguination achieved by cutting the caudal vena cava after blood collection. One animal per group was killed and this continued until all mice had been sacrificed; this occurred over a period of 100 min (beginning at 1 pm). The left, median, right posterior and right anterior lobes of the liver were cut into 0.25–0.5 cm3 pieces that were either snap-frozen in liquid nitrogen and stored at or below − 70 °C or were formalin fixed and paraffin embedded.

Microarray: sample labeling and hybridization

RNA was extracted from ~ 100 mg frozen liver tissue using the RNeasy Midi RNA Extraction kit (Qiagen, Mississauga, ON, Canada). An Omni tissue homogenizer with a disposable 7 mm Omni generator probe was used (Omni #34750, Omni International, Marietta, GA). RNA was quantified using a NanoDrop Spectrophotometer (Thermo Fisher Scientific Inc., Wilmington, DE, USA), qualified using an Agilent 2100 Bioanalyzer (Agilent Technologies Inc., Mississauga, ON, Canada), and stored at − 80 °C. Sample RNA integrity numbers (RINs) ranged from 8.9–10. Sample RNA (200 ng) was used together with a mouse universal reference RNA (Stratagene by Agilent Technologies Inc.) to synthesize, amplify and label cRNA using the Low Input Quick Amp Labeling Kit (Agilent Technologies Inc.). Labeled cRNA was purified using the RNeasy Mini Kit (Qiagen). Amplification and labeling efficiency of cRNA were quantified using a NanoDrop spectrophotometer. Hybridization mixes were prepared using the Hi-RPM Gene Expression Hybridization Kit (Agilent Technologies Inc.). 300 ng of Cy3-labeled reference RNA and 300 ng Cy5-labeled sample cRNA were hybridized on SurePrint G3 Mouse GE 8 × 60 K microarrays (Agilent Technologies Inc.) at 65 °C for 17 h at 10 rpm. Samples were arranged on arrays according to a randomized block design (RBD). An RBD is used to control for a source of variability that is not related to the experimental question. The microarray slides used in this experiment each had eight arrays, which means that a total of eight samples could be hybridized to each slide. Since this experiment had more than eight samples, there were two blocking factors that could impact the results: (1) which slide the samples appeared on, and (2) which array the sample was assigned to (i.e. the sample's location on a microarray slide, of which there are eight options). All samples (control, treated, ± BrdU) were randomized across five slides. Slides were washed according to the manufacturer's specifications with Gene Expression Wash Buffers 1 and 2 (Agilent Technologies Inc.), scanned using an Agilent G2505B scanner at 5 μm resolution. Data were pre-processed using the Agilent Feature Extraction Software, version 11. All arrays met the minimum Agilent QA/QC standards. All kits were carried out according to the manufacturer's protocol.

Normalization of microarray data

Boxplots of the relative ratio and signal intensities (MA plots) were constructed and inspected. MA plots are used to identify systematic variations (including dye biases and poor hybridization efficiency) within arrays. ‘M’ is the log red/green intensity ratio (log(R/G), a measure of differential gene expression) and ‘A’ is the average log of the product of the two intensities (log(R⁎G), a measure of overall fluorescence intensity). Since the majority of probes are not expected to be differentially expressed in response to the treatment, the majority of points on the y-axis (M) should fall at zero. Data where the majority of points on the y-axis are not at zero must be LOWESS (LOcally WEighted Scatterplot Smoothing) normalized [10] (Fig. 1). Non-background-subtracted median signal intensities were normalized in R [7] using the transform.madata function in the MAANOVA library [9]. Probes with technical replicates were then averaged.

Fig. 1

A representative MA plot for (A) before and (B) after LOWESS normalization. Each dot represents a probe on the microarray. The red line is the line of best fit through all data points.

Quality assessment and control of microarray data

The background for each array was assessed using the negative control 3xSLv1 probe, which forms a hair pin and does not hybridize well with labeled samples. There are 182 technical replicates of this probe distributed across the microarray. The trimmed mean (trim = 0.05) and standard deviation was used to measure the background fluorescence and background variability for each array. Using trimmed statistics reduces the effects of outliers that may be present. This is done by removing a percentage of the largest and smallest values before calculating the statistics. Fig. 2 provides a visual representation of these estimates. Arrays with background fluorescence that fell outside of these ranges were repeated.

Fig. 2

Background estimates of each array for reference (Cy3, green) and sample (Cy5, red) mean fluorescent signal of the 3xSLv1 negative control probe (n = 182 probes). Error bars = standard deviation; FLU = fluorescent units.

To identify microarrays with poor data quality, hierarchical clustering analysis with average linkage (a method for calculating distance between clusters that averages over all pairs of objects between two clusters) was performed in R software (hclust function in the stats library) using the normalized log ratios (M values) for all probes (Fig. 3). Since the majority of probes will not be differentially expressed in response to the treatment, the expectation is that there will be no differences between samples. To obtain a fair inter-array comparison, signal intensities were adjusted to control for slide effect (i.e. to control for slide-to-slide variation in signal intensity that occurs as an artifact of experimental conditions, as opposed to a real treatment-related effect) using a linear model, which subtracted the estimated slide effect from the log ratios. One minus the Pearson correlation was used as the measure of dissimilarity. The original dendrogram was cut at 0.3, which has 6 clusters. The 5 samples that clustered separately (at the top) were considered outliers and were then repeated.

Fig. 3

Hierarchical clustering of all probes based on normalized signal intensity ratios. The red line and bracket delineate outlier arrays, which were repeated (repeated arrays are indicated by arrows). Colored boxes represent dose groups where orange, lime, green, aqua-green, and light blue are the 0, 1, 2, 4, and 8 mg/kg bw groups; and, royal blue, purple, and pink are the 0, 1, and 8 mg/kg bw + BrdU groups.

Upon repetition and/or removal of outlying arrays (arrays with high background, systematic dye variation, or poor hybridization efficiency), the final sample sizes used for gene expression analysis were n = 5, 4, 5, 4, and 5 for the non-BrdU 0, 1, 2, 4, and 8 mg/kg bw furan dose groups, respectively; and n = 4, 5, 5 for the 0, 1, and 8 mg/kg bw furan + BrdU groups, respectively. Gene expression analysis was not conducted for the 2 and 4 mg/kg furan + BrdU animals.

Differential gene expression analysis

Differential gene expression was determined in R using the MicroArray ANalysis Of VAriance (MAANOVA) library [9]. Included in the MAANOVA model are the block (slide) and the treatment effects. A typical ANOVA generates an F statistic, which is the ratio of inter- to intra-group variance. The MAANOVA uses a modified F statistic, called an Fs statistic [3], to determine gene-specific treatment effects. Using the James–Stein shrinkage concept [6], the Fs statistic has an improved estimation of error variance because it estimates intra-group variance based on global gene expression (as opposed to expression of individual genes, which have far fewer data points associated with them). The associated p-values were then estimated using the permutation method (i.e. bootstrapping) with 30,000 permutations and residual shuffling. In this instance, bootstrapping increases statistical power of gene expression associated p-values because the typical sample size for gene expression studies is quite low (n = ~ 3–6 per group). The p-values were adjusted for multiple comparisons using the Benjamini–Hochberg false discovery rate (FDR) approach [1]. This approach sets a more stringent significance threshold of α = 0.05 for all comparisons (as opposed to α = 0.05 for individual comparisons, which would result in a higher false positive rate). Individual gene fold changes were estimated using least square means of each pairwise comparison. Genes having an FDR-adjusted p ≤ 0.05 and a fold change ≥ ± 1.5 were deemed differentially expressed.

Gene expression meta-analysis in R

Data were obtained from Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo/) and the European Bioinformatics Institute (http://www.ebi.ac.uk/arrayexpress/). All data generated using the one-color Affymetrix platform (E-MEXP-82; GSE13149; GSE18858; GSE20427; GSE26538) were background-subtracted and normalized using Robust Multi-array Average, RMA [12], using the ReadAffy() function in the affy library in R [4]. Since the RMA requires background subtraction, data generated by all other platforms were also background subtracted for consistency. Signal intensities generated on the Agilent two-color platform (GSE48644; BaP) were normalized using the LOWESS approach. For two-color experiments using other, non-Agilent platforms (GSE35934; GSE4874), signal intensities (with a dye adjustment) were normalized using cyclic LOWESS [2]. Normalized probe intensities for probes with common gene symbols were then averaged using the median normalized signal intensity. Normalized data from each data set were then merged together yielding 3190 common gene symbols. Hierarchical clustering was conducted using the hclust function with average linkage using one minus the Pearson correlation as the distance metric in R. The relevant figure can be viewed in [5].

Discussion

Here we have described the steps taken to analyze changes in gene expression in the livers of furan-exposed female B6C3F1 mice using Agilent microarrays and R (see Fig. 4 for summary). We have demonstrated that the data are of high quality and have provided all of the tools required for reproducibility. The biological significance of the study was previously reported: differential gene expression [5], differential long non-coding RNA expression [8], and effect of BrdU on gene expression [11]. We anticipate that the importance of toxicogenomics studies in chemical carcinogen assessment will continue to increase in the coming years and believe that the rate at which this occurs will be highly dependent upon ensuring public availability of these very powerful datasets.

Fig. 4

Summary of steps taken to generate, normalize and analyze two-color Agilent gene expression microarray data.

Specifications
Organism	B6C3F1 mice
Sex	Female
Array type	Agilent SurePrint G3 Mouse GE 8x60K Microarray
Data format (in GEO)	Raw data: TXT files; normalized data: TXT files
Experimental factors	Furan exposed vs. un-exposed control
Experimental features	Female B6C3F1 mice were sub-chronically exposed for 21 days to control (0 mg/kg bw), non-carcinogenic (1, 2 mg/kg bw), and carcinogenic (4, 8 mg/kg bw) doses of furan. Half of the mice in each group were also given 0.02% BrdU for five days prior to sacrifice (days 16–21). All non-BrdU mice as well as the 0, 1, and 8 mg/kg bw furan + BrdU mice were used for gene expression analysis. Necropsy occurred four hours after the final furan dosing. RNA was extracted from livers and changes in gene expression were analyzed using Agilent microarrays.
Consent	All procedures were conducted in compliance with the Animal Welfare Act Regulations (9CFR1–4). Mice were handled and treated according to the guidelines provided in the National Institutes of Health (NIH) Guide for the Care and Use of Laboratory Animals (ILAR, 1996; http://dels.nas.edu/ilar/).
Sample source location	5–6 week old female specific pathogen free B6C3F1 mice were purchased from Charles River Breeding Laboratories (Portage, ME). Experiments were conducted at ILS, P.O. Box 13501, Research Triangle Park, NC 27709, USA.

8 in total

1. Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation.

Authors: Yee Hwa Yang; Sandrine Dudoit; Percy Luu; David M Lin; Vivian Peng; John Ngai; Terence P Speed
Journal: Nucleic Acids Res Date: 2002-02-15 Impact factor: 16.971

2. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias.

Authors: B M Bolstad; R A Irizarry; M Astrand; T P Speed
Journal: Bioinformatics Date: 2003-01-22 Impact factor: 6.937

3. Exploration, normalization, and summaries of high density oligonucleotide array probe level data.

Authors: Rafael A Irizarry; Bridget Hobbs; Francois Collin; Yasmin D Beazer-Barclay; Kristen J Antonellis; Uwe Scherf; Terence P Speed
Journal: Biostatistics Date: 2003-04 Impact factor: 5.899

4. affy--analysis of Affymetrix GeneChip data at the probe level.

Authors: Laurent Gautier; Leslie Cope; Benjamin M Bolstad; Rafael A Irizarry
Journal: Bioinformatics Date: 2004-02-12 Impact factor: 6.937

5. Improved statistical tests for differential gene expression by shrinking variance components estimates.

Authors: Xiangqin Cui; J T Gene Hwang; Jing Qiu; Natalie J Blades; Gary A Churchill
Journal: Biostatistics Date: 2005-01 Impact factor: 5.899

6. Differential expression of long noncoding RNAs in the livers of female B6C3F1 mice exposed to the carcinogen furan.

Authors: Leslie Recio; Suzanne L Phillips; Tim Maynor; Michael Waters; Anna Francina Jackson; Carole L Yauk
Journal: Toxicol Sci Date: 2013-07-12 Impact factor: 4.849

7. Bromodeoxyuridine (BrdU) treatment to measure hepatocellular proliferation does not mask furan-induced gene expression changes in mouse liver.

Authors: Anna Francina Webster; Andrew Williams; Leslie Recio; Carole L Yauk
Journal: Toxicology Date: 2014-06-06 Impact factor: 4.221

8. Case study on the utility of hepatic global gene expression profiling in the risk assessment of the carcinogen furan.

Authors: Anna Francina Jackson; Andrew Williams; Leslie Recio; Michael D Waters; Iain B Lambert; Carole L Yauk
Journal: Toxicol Appl Pharmacol Date: 2013-10-29 Impact factor: 4.219

8 in total

2 in total

1. Impact of Genomics Platform and Statistical Filtering on Transcriptional Benchmark Doses (BMD) and Multiple Approaches for Selection of Chemical Point of Departure (PoD).

Authors: A Francina Webster; Nikolai Chepelev; Rémi Gagné; Byron Kuo; Leslie Recio; Andrew Williams; Carole L Yauk
Journal: PLoS One Date: 2015-08-27 Impact factor: 3.240

2. BMDExpress Data Viewer - a visualization tool to analyze BMDExpress datasets.

Authors: Byron Kuo; A Francina Webster; Russell S Thomas; Carole L Yauk
Journal: J Appl Toxicol Date: 2015-12-15 Impact factor: 3.446

2 in total