Evan Gatev (1,2,3,4), Nicole Gladish (3,4,5), Sara Mostafavi (3,4,5,6), Michael S Kobor (3,4,5)
1. Graduate Program in Bioinformatics, University of British Columbia, Vancouver, BC V5T 4S6, Canada.
2. Department of Finance, Beedie School of Business, Simon Fraser University, Burnaby, BC V5A 1S6, Canada.
3. BC Children's Hospital Research Institute, Vancouver, BC V5Z 4H4, Canada.
4. Centre for Molecular Medicine and Therapeutics, Vancouver, BC V5Z 4H4, Canada.
5. Department of Medical Genetics.
6. Department of Statistics, University of British Columbia, Vancouver, BC V6T 1Z4, Canada.
Abstract
MOTIVATION: High-dimensional DNA methylation (DNAm) array coverage, while sparse in the context of the entire DNA methylome, still comprises a very large number of CpG probes. The ensuing multiple-test corrections reduce the statistical power to detect associations, likely contributing to the limited reproducibility that is prevalent in such studies. Array probes measuring proximal CpG sites often have correlated levels of DNAm, which may be biologically meaningful but also implies statistical dependence and redundancy. New methods that account for such correlations between adjacent probes may improve the specificity, discovery and interpretation of statistical associations in DNAm array data.

RESULTS: We developed a method named Co-Methylation with genomic CpG Background (CoMeBack) that estimates DNA co-methylation, defined as proximal CpG probes with correlated DNAm across individuals. CoMeBack outputs co-methylated regions (CMRs): sets of array probes whose spans are constructed based on all genomic CpG sites, including those not measured on the array, and without any phenotypic variable inputs. This approach can reduce the multiple-test correction burden while enhancing the discovery and specificity of statistical associations. We constructed and validated CMRs in whole blood, using publicly available Illumina Infinium 450K array data from over 5000 individuals. These CMRs were enriched for enhancer chromatin states and for binding site motifs of several transcription factors involved in blood physiology. We illustrated how CMR-based epigenome-wide association studies can improve discovery and reduce false positives for associations with chronological age.

AVAILABILITY AND IMPLEMENTATION: https://bitbucket.org/flopflip/comeback.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
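The core idea in the abstract, grouping proximal probes whose DNAm is correlated across individuals so that fewer units need to be tested, can be illustrated with a minimal Python sketch. This is not the CoMeBack implementation: it ignores the unmeasured genomic CpG background that CoMeBack uses, and the function names, the `max_gap` and `min_corr` thresholds, and the toy data are all assumptions made purely for illustration.

```python
import numpy as np


def comethylated_regions(positions, betas, max_gap=400, min_corr=0.3):
    """Chain adjacent probes into candidate co-methylated regions.

    positions: sorted probe coordinates (bp) on one chromosome.
    betas: array of shape (n_individuals, n_probes) with DNAm levels.
    max_gap, min_corr: hypothetical thresholds, not the published ones.
    Returns a list of probe-index lists, each containing >= 2 probes.
    """
    positions = np.asarray(positions)
    betas = np.asarray(betas, dtype=float)
    regions, current = [], [0]
    for j in range(1, len(positions)):
        i = j - 1
        close = positions[j] - positions[i] <= max_gap
        # Pearson correlation of the two adjacent probes across individuals.
        corr = np.corrcoef(betas[:, i], betas[:, j])[0, 1]
        if close and corr >= min_corr:
            current.append(j)
        else:
            if len(current) > 1:
                regions.append(current)
            current = [j]
    if len(current) > 1:
        regions.append(current)
    return regions


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Toy data (assumed): 20 individuals, 5 probes.
base = np.linspace(0.0, 1.0, 20)
alternating = np.tile([0.0, 1.0], 10)
betas = np.column_stack([
    base,               # probe 0
    0.9 * base + 0.05,  # probe 1: linear in probe 0
    base ** 2,          # probe 2: monotone in probe 0
    alternating,        # probe 3: far away from probe 2
    1.0 - alternating,  # probe 4: near probe 3 but anti-correlated
])
positions = [100, 250, 500, 5000, 5100]

regions = comethylated_regions(positions, betas)
n_in_regions = sum(len(r) for r in regions)
# Each region counts as one test; probes outside regions are tested singly.
n_units = len(regions) + (len(positions) - n_in_regions)
print(regions, n_units)
print(bonferroni_threshold(len(positions)), bonferroni_threshold(n_units))
```

In this toy case the first three probes collapse into a single region, so the number of tested units falls below the number of probes and the per-test Bonferroni threshold becomes less stringent. How a region is summarised for testing (for example by one representative probe or a composite value) is left open here.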