funtooNorm: an R package for normalization of DNA methylation data when there are multiple cell or tissue types.

Kathleen Oros Klein¹, Stepan Grinek¹, Sasha Bernatsky², Luigi Bouchard³, Antonio Ciampi⁴, Ines Colmegna⁵, Jean-Philippe Fortin⁶, Long Gao⁴, Marie-France Hivert⁷, Marie Hudson⁸, Michael S Kobor⁹, Aurelie Labbe⁴, Julia L MacIsaac¹⁰, Michael J Meaney¹¹, Alexander M Morin¹⁰, Kieran J O'Donnell¹², Tomi Pastinen¹³, Marinus H Van Ijzendoorn¹⁴, Gregory Voisin¹, Celia M T Greenwood¹⁵.

Abstract

MOTIVATION: DNA methylation patterns are well known to vary substantially across cell types or tissues. Hence, existing normalization methods may not be optimal if they do not take this into account. We therefore present a new R package for normalization of data from the Illumina Infinium Human Methylation450 BeadChip (Illumina 450 K) built on the concepts in the recently published funNorm method, and introducing cell-type or tissue-type flexibility.
RESULTS: funtooNorm is relevant for data sets containing samples from two or more cell or tissue types. A visual display of cross-validated errors informs the choice of the optimal number of components in the normalization. Benefits of cell (tissue)-specific normalization are demonstrated in three data sets. Improvement can be substantial; it is strikingly better on chromosome X, where methylation patterns have unique inter-tissue variability.
AVAILABILITY AND IMPLEMENTATION: An R package is available at https://github.com/GreenwoodLab/funtooNorm, and has been submitted to Bioconductor at http://bioconductor.org.


Year: 2015 PMID: 26500152 PMCID: PMC4743629 DOI: 10.1093/bioinformatics/btv615

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

1 Introduction

Recently, a normalization method, funNorm, was introduced by Fortin et al. (2014) specifically for the Illumina Infinium Human Methylation 450 BeadChip (Illumina 450 K) and implemented in Bioconductor's minfi package (Aryee et al., 2014). The percentile-specific adjustments in funNorm are the key feature, allowing batch effects and technical artefacts to have non-constant influence across the range of signal strengths. However, methylation patterns may differ substantially across cell types or tissues, leading to cell-type- or tissue-type-specific quantiles, and optimal normalization adjustments should capture this. Here we present an R package for normalization of Illumina 450 K data, funtooNorm (an extension of the ideas in funNorm), applicable to such heterogeneous data sets.
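As a rough illustration of what a percentile-specific adjustment means, the sketch below maps a raw signal onto an adjusted scale by linear interpolation between matched quantile pairs. It is not the funNorm or funtooNorm implementation: the function name `interpolate_adjustment`, the toy quantile values, and the constant-shift rule outside the analyzed range are all assumptions made for illustration.

```python
from bisect import bisect_right


def interpolate_adjustment(x, raw_q, adj_q):
    """Map raw signal x to the adjusted scale.

    raw_q: strictly increasing raw quantile values of one sample.
    adj_q: the adjusted value assigned to each of those quantiles.
    Hypothetical helper; the out-of-range handling is an assumption.
    """
    if x <= raw_q[0]:
        # Below the lowest analyzed quantile: reuse that quantile's shift.
        return adj_q[0] + (x - raw_q[0])
    if x >= raw_q[-1]:
        # Above the highest analyzed quantile: reuse that quantile's shift.
        return adj_q[-1] + (x - raw_q[-1])
    k = bisect_right(raw_q, x) - 1              # interval containing x
    frac = (x - raw_q[k]) / (raw_q[k + 1] - raw_q[k])
    return adj_q[k] + frac * (adj_q[k + 1] - adj_q[k])


# Toy values (invented): the shift is +10 at 100, -10 at 200, +20 at 400.
raw_quantiles = [100.0, 200.0, 400.0]
adjusted_quantiles = [110.0, 190.0, 420.0]

for signal in (150.0, 300.0):
    print(signal, interpolate_adjustment(signal, raw_quantiles, adjusted_quantiles))
```

Because the shift differs at each quantile, a weak signal and a strong signal from the same sample receive different corrections, which is the non-constant influence referred to above.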

2 Methods

Key features of funtooNorm and funNorm are identical, i.e. normalization adjustments are estimated via regression models applied to a series of quantiles of the probe-type-specific signals in each sample. Covariates, derived from the control probes, capture variation not associated with the biological signals of interest. In funtooNorm, an augmented covariate matrix is constructed by including interactions between cell-type or tissue-type indicators and the average signal from each control probe type. Either principal component regression (PCR) or partial least squares regression (PLS) (Tenenhaus, 1998) can be fit (the type.fits option); as in funNorm, normalized methylation values are based on predictions from linear interpolations between the analyzed percentiles (see Supplemental Methods).

The function, funtoonorm, operates in two distinct modes:

Normalization mode: When validate = FALSE, normalization of the data is performed for a chosen number of components in the regressions. The model-fitting step requires only a set of quantiles for each sample, and hence is efficient both computationally and in memory usage. Calculations can be performed in a modular fashion; intermediary results can be saved by setting appropriate flags.

Cross-validation mode: When validate = TRUE, a graphical display of root mean squared errors (RMSE) obtained with cross-validation facilitates choice of an appropriate number of components (Fig. 1). Plots are provided for both PCR and PLS fits.

Three data sets are used to illustrate performance (Supplemental Table S1). In the Replication Data Set, methylation was measured in ten healthy individuals who contributed 2–3 samples of each of whole blood, buccal swab and dried blood spots, including a mixture of technical and biological replicates. In the Systemic Autoimmune Diseases Data (SARDS), monocytes and CD4+ T-cells from incident patients were separated from whole blood, with repeated samples drawn before and after 6 months of immunosuppressive treatment. For the Gestational Diabetes Data (GD), one technical replicate sample was available for each of fetal placenta and cord blood tissues. Agreement within a tissue or cell type is measured by the average (over probes) of the squared intra-replicate set differences, summed over distinct individuals.
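The augmented covariate matrix described in the Methods can be pictured with a small sketch. This is not code from the package: the function name `augment_covariates`, the column labels, and the choice of treating the first cell type as a reference level (so that main-effect columns are not duplicated) are assumptions for illustration only.

```python
def augment_covariates(control_means, cell_types):
    """Append cell-type x control-probe interaction columns.

    control_means: one list per sample, holding the average signal of each
                   control probe type.
    cell_types:    one label per sample.
    Returns (rows, column_names). Reference-level coding is an assumption.
    """
    levels = sorted(set(cell_types))
    n_ctrl = len(control_means[0])

    names = [f"ctrl{j}" for j in range(n_ctrl)]
    for level in levels[1:]:                    # first level is the reference
        names += [f"ctrl{j}:{level}" for j in range(n_ctrl)]

    rows = []
    for means, cell_type in zip(control_means, cell_types):
        row = list(means)                       # main-effect columns
        for level in levels[1:]:
            indicator = 1.0 if cell_type == level else 0.0
            row += [indicator * m for m in means]   # interaction columns
        rows.append(row)
    return rows, names


# Toy input (invented): three samples, two control probe types.
toy_means = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
toy_types = ["blood", "buccal", "blood"]

matrix, columns = augment_covariates(toy_means, toy_types)
print(columns)
for row in matrix:
    print(row)
```

With interaction columns present, a regression on this matrix can estimate a different control-probe effect for each cell or tissue type, rather than one shared effect.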

Fig. 1. Root mean square error from cross-validation comparing different numbers of components in funtooNorm on the Replication Data Set. Separate model fits are implemented for A and B signals, and for different probe types.
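The idea behind the cross-validation display, comparing RMSE across candidate numbers of components, can be sketched as follows. This is a stand-in under stated assumptions rather than the package's code: it uses a minimal single-response PLS fit (NIPALS style) written from scratch, the helper names `fit_pls1` and `cv_rmse` are hypothetical, and the data are synthetic.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit_pls1(X, y, n_comp):
    """Fit single-response PLS with up to n_comp components; return a predictor."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    comps = []                                  # (weights, loadings, y-loading)
    for _ in range(n_comp):
        w = [dot([row[j] for row in Xc], yc) for j in range(p)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:
            break                               # no covariance left to model
        w = [v / norm for v in w]
        t = [dot(row, w) for row in Xc]         # scores
        tt = dot(t, t)
        load = [dot([row[j] for row in Xc], t) / tt for j in range(p)]
        q = dot(yc, t) / tt
        # Deflate X so the next component is orthogonal to this one.
        Xc = [[row[j] - ti * load[j] for j in range(p)] for row, ti in zip(Xc, t)]
        comps.append((w, load, q))

    def predict(x):
        r = [x[j] - x_mean[j] for j in range(p)]
        out = y_mean
        for w, load, q in comps:
            ti = dot(r, w)
            out += q * ti
            r = [r[j] - ti * load[j] for j in range(p)]
        return out

    return predict


def cv_rmse(X, y, n_comp, n_folds=5):
    """K-fold cross-validated root mean squared prediction error."""
    n = len(X)
    sq_err = 0.0
    for fold in range(n_folds):
        held_out = set(range(fold, n, n_folds))
        train_X = [X[i] for i in range(n) if i not in held_out]
        train_y = [y[i] for i in range(n) if i not in held_out]
        predict = fit_pls1(train_X, train_y, n_comp)
        sq_err += sum((predict(X[i]) - y[i]) ** 2 for i in held_out)
    return math.sqrt(sq_err / n)


# Synthetic data (invented): the response is an exact linear function of X.
toy_X = [[float(i % 6), float((i * i) % 7), float((i ** 3) % 4)] for i in range(20)]
toy_y = [2.0 * a - b + 0.5 * c + 1.0 for a, b, c in toy_X]

for k in (1, 2, 3):
    print(k, cv_rmse(toy_X, toy_y, k))
```

In practice one would look for the number of components beyond which the cross-validated RMSE stops decreasing appreciably, which is the kind of judgement the plot in Fig. 1 is meant to support.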

3 Results

Figure 1 displays the cross-validation RMSE plot for the Replication Data Set with PCR. The optimal number of components varies across the percentiles and signals; evidently there is substantial improvement in mean squared error when moving from 2 to 3 components. Technical replicate agreement was improved with funtooNorm compared to funNorm (Supplemental Figs S1 and S2, Supplemental Tables S2 and S3). Agreement improved substantially for technical replicates of whole blood, blood spots, and fetal placenta tissues, although there was little difference between the methods for buccal swabs or cord blood. For biological replicates, we saw improvements of 10–20% in many tissues. Performance was particularly good for probes on the X chromosome. Supplemental Figure S3 shows that the distribution across probes of the differences between tissue types is distinct on the X chromosome; this is captured by our augmented covariate matrix. A similar argument explains enhanced performance for some probe annotations (Supplemental Fig. S4). Performance on the Y chromosome was poor, since with only 416 probes, a quantile-based model fit is overly complex; we recommend the simpler method implemented in funNorm for this chromosome.
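The replicate agreement measure defined in the Methods could be computed along the following lines. The text does not spell out how sets with more than two replicates are handled, so this sketch assumes one reading: squared differences are averaged over all replicate pairs within a set, then averaged over probes, then summed over individuals. The function name `replicate_disagreement` and the toy values are hypothetical.

```python
from itertools import combinations


def replicate_disagreement(replicate_sets):
    """Sum over individuals of the probe-averaged squared replicate difference.

    replicate_sets: dict mapping an individual to a list of replicate
                    vectors (one methylation value per probe).
    Averaging over replicate pairs is an assumption of this sketch.
    Smaller values indicate closer agreement between replicates.
    """
    total = 0.0
    for replicates in replicate_sets.values():
        n_probes = len(replicates[0])
        pairs = list(combinations(replicates, 2))
        per_probe = [
            sum((a[j] - b[j]) ** 2 for a, b in pairs) / len(pairs)
            for j in range(n_probes)
        ]
        total += sum(per_probe) / n_probes      # average over probes
    return total


# Toy values (invented): two individuals, two probes each.
toy_sets = {
    "id1": [[0.1, 0.5], [0.3, 0.5]],
    "id2": [[0.2, 0.8], [0.2, 0.6], [0.2, 0.7]],
}
print(replicate_disagreement(toy_sets))
```

Computing this quantity separately for each tissue or cell type, once per normalization method, gives a like-for-like comparison of the kind reported for funNorm and funtooNorm.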

4 Discussion

Most methylation studies today are designed to detect inter-individual differences, rather than inter-tissue differences. Improved normalization of datasets containing multiple tissues can be expected to translate into increased power to detect associations of interest, due to the inferred reduction in residual error; funNorm and this extension funtooNorm are designed with this goal in mind.

References

1. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays.

Authors: Martin J Aryee; Andrew E Jaffe; Hector Corrada-Bravo; Christine Ladd-Acosta; Andrew P Feinberg; Kasper D Hansen; Rafael A Irizarry
Journal: Bioinformatics Date: 2014-01-28 Impact factor: 6.937

2. Functional normalization of 450k methylation array data improves replication in large cancer studies.

Authors: Jean-Philippe Fortin; Aurélie Labbe; Mathieu Lemire; Brent W Zanke; Thomas J Hudson; Elana J Fertig; Celia Mt Greenwood; Kasper D Hansen
Journal: Genome Biol Date: 2014-12-03 Impact factor: 13.583
