
MIRA: an R package for DNA methylation-based inference of regulatory activity.

John T Lawson, Eleni M Tomazou, Christoph Bock, Nathan C Sheffield.

Abstract

Summary: DNA methylation contains information about the regulatory state of the cell. MIRA aggregates genome-scale DNA methylation data into a DNA methylation profile for a given region set with shared biological annotation. Using this profile, MIRA infers and scores the collective regulatory activity for the region set. MIRA facilitates regulatory analysis in situations where classical regulatory assays would be difficult and allows public sources of region sets to be leveraged for novel insight into the regulatory state of DNA methylation datasets. Availability and implementation: http://bioconductor.org/packages/MIRA.

Year:  2018        PMID: 29506020      PMCID: PMC6061852          DOI: 10.1093/bioinformatics/bty083

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


DNA methylation interacts with other regulatory features to control gene expression (Stadler et al.). The connection between methylation and transcription factor (TF) binding goes both ways: TF binding affects and is affected by DNA methylation (Zhu et al.), which makes it difficult to infer the causative factor. Nevertheless, independent of directionality, the inverse correlation between DNA methylation and gene expression indicates that regulatory information can be derived from DNA methylation data. Multiple approaches have been used to relate DNA methylation to regulatory activity, for example correlating differential methylation with expression of nearby genes (Yao et al.) or testing enrichment of TFs in differentially methylated regions (Wijetunga et al.; Yao et al.). These approaches are limited by arbitrary thresholds for differential methylation and do not make full use of genome-wide data. In addition, factors other than DNA methylation levels, such as the shape of the DNA methylation profile around a site, may be important to the site’s activity (Kapourani and Sanguinetti, 2016).

We recently introduced and validated a novel method called MIRA (Methylation-based Inference of Regulatory Activity), which takes advantage of genome-scale DNA methylation data to assess regulatory activity (Sheffield et al.). We now present the MIRA R package, which enhances this method and makes it broadly available.

MIRA requires two inputs: (i) single-nucleotide-resolution DNA methylation data; and (ii) a set of genomic regions (Fig. 1A). The DNA methylation data can come from sources such as whole genome or reduced representation bisulfite sequencing (WGBS or RRBS), or from microarrays. MIRA has been tested successfully on data with coverage as low as that of 450k arrays. Genomic regions can be derived from sequencing assays such as ChIP-seq, DNase-seq, or ATAC-seq. Many region sets are publicly available through large-scale genomics projects and can be conveniently accessed through R packages like LOLA (Sheffield and Bock, 2016).
Fig. 1.

MIRA workflow. (A) Two inputs to MIRA: DNA methylation data for the sample of interest and a set of genomic regions that share a biological annotation. (B) Three regions from the region set are shown for this example, but a region set would normally be composed of thousands of regions. The DNA methylation level at individual CpGs is plotted for each 4.5 kb region, which is centered around a site of interest. (C) Each region is split into 11 bins of approximately equal size and an average methylation level is calculated based on the CpGs in each bin. (D) All regions are aggregated into a single DNA methylation profile by averaging methylation from the corresponding bins of each region. (E) The methylation profile is scored by taking the log of the ratio between the average methylation of the two shoulders and the methylation of the center. An algorithm determines the position of the shoulders. (F) As might be seen in an experiment that uses MIRA, the single score calculated from this sample is compared to scores from other samples of the same type (condition 1) as well as to samples of a different type (condition 2). All scores were calculated using the same region set. The difference in scores between groups suggests differential activity of this region set. (G) Real MIRA profiles for a TF region set and for an H3K27 acetylation region set with DNA methylation data from six mesenchymal stem cell samples.

Using these two inputs, MIRA aggregates the DNA methylation of individual CpGs to create a summary profile through several steps. First, each region (Fig. 1B) is split into n bins. Second, the DNA methylation level (0–100%) within a bin is averaged (Fig. 1C). Third, the regions are aggregated into a single summary profile by averaging the DNA methylation levels of each bin across all regions (Fig. 1D). MIRA thus creates a ‘meta-region profile’ that provides general information about the activity of that region type across the genome.
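The three aggregation steps above can be illustrated with a minimal Python sketch. This is not the MIRA package's code (MIRA is an R package); the function names `bin_region` and `aggregate_profile`, the single-chromosome coordinates, the equal-width bins, and the toy data are all assumptions made for illustration.

```python
from statistics import mean

def bin_region(cpgs, start, end, n_bins):
    """Average methylation (0-100%) of the CpGs in each of n_bins equal-width bins.

    cpgs: iterable of (position, methylation) pairs on one chromosome (assumption).
    Bins that contain no CpG are reported as None.
    """
    width = (end - start) / n_bins
    binned = [[] for _ in range(n_bins)]
    for pos, meth in cpgs:
        if start <= pos < end:
            # Clamp the index to guard against floating-point rounding at the edge.
            idx = min(int((pos - start) / width), n_bins - 1)
            binned[idx].append(meth)
    return [mean(b) if b else None for b in binned]

def aggregate_profile(regions, cpgs, n_bins=11):
    """Average corresponding bins across all regions into one summary profile."""
    per_region = [bin_region(cpgs, start, end, n_bins) for start, end in regions]
    profile = []
    for i in range(n_bins):
        # Only regions that had at least one CpG in bin i contribute to that bin.
        values = [r[i] for r in per_region if r[i] is not None]
        profile.append(mean(values) if values else None)
    return profile

# Toy example: two 110 bp regions with sparse CpG coverage (hypothetical values).
regions = [(0, 110), (1000, 1110)]
cpgs = [(5, 80.0), (55, 20.0), (105, 90.0),
        (1005, 60.0), (1055, 0.0), (1105, 70.0)]
print(aggregate_profile(regions, cpgs))
```

In this toy input each region covers only three of its 11 bins, yet the aggregated profile still shows a low center (the average of 20 and 0) flanked by higher edges, which is the sense in which aggregation tolerates sparse data.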
Through aggregation, MIRA handles sparse DNA methylation data well. This makes MIRA well suited for low-coverage bisulfite sequencing (e.g. Farlik et al.).

Once an aggregate profile is constructed (Fig. 1E), it is scored to quantify the regulatory activity (Fig. 1F). MIRA assumes that genomic regions with lower DNA methylation levels have higher regulatory activity and assigns a score based on the depth of the ‘dip’ in the middle of the ‘meta-region profile’. MIRA automatically determines the location of the edges of the dip and calculates the score as the natural logarithm of the ratio between the DNA methylation level at the edges of the dip and the DNA methylation level at the center of the dip (Fig. 1F). The score reduces the DNA methylation profile to a single number, which predicts the region set’s aggregate regulatory activity. MIRA scores can be compared between samples to identify regulatory differences.

MIRA supports a variety of applications depending on the context and the type of region set used. For example, MIRA can be used to compare the chromatin states of different types of cells (Sheffield et al.). MIRA makes analysis of regulatory activity possible in cases where it would otherwise be infeasible. When sample amount or quality would not allow ATAC-seq or ChIP-seq but DNA methylation data can be obtained, regulatory analysis can be done with MIRA using existing ATAC-seq or ChIP-seq data (e.g. from a database). MIRA is also valuable where traditional regulatory assays would be impractical in terms of time or cost, such as in large-scale cohort studies.

The MIRA R package can be accessed via Bioconductor and comes with multiple vignettes demonstrating how to apply it to biological data. MIRA provides a novel tool to enhance analysis of DNA methylation and to leverage existing data from regulatory assays to gain new regulatory insights.
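The scoring step described above, the natural logarithm of shoulder methylation over center methylation, can be sketched in Python as follows. The text does not specify how MIRA locates the edges of the dip, so this sketch substitutes a simple stand-in rule (the highest bin on each side of the center); that rule, the function name `dip_score`, and the example profiles are assumptions, not the package's implementation. The profile is assumed to have an odd number of bins and no missing values.

```python
import math

def dip_score(profile):
    """ln(mean shoulder methylation / center methylation) for an odd-length profile."""
    if len(profile) < 3 or len(profile) % 2 == 0:
        raise ValueError("profile needs an odd number of bins (at least 3)")
    mid = len(profile) // 2
    center = profile[mid]
    # ASSUMPTION: each shoulder is the highest bin on its side of the center.
    # The shoulder-finding algorithm used by MIRA is not described in the text.
    left_shoulder = max(profile[:mid])
    right_shoulder = max(profile[mid + 1:])
    shoulders = (left_shoulder + right_shoulder) / 2
    if center <= 0 or shoulders <= 0:
        raise ValueError("methylation levels must be positive to take a log ratio")
    return math.log(shoulders / center)

# Hypothetical profiles (methylation in %), 11 bins each.
deep_dip = [80, 80, 70, 50, 30, 10, 30, 50, 70, 80, 80]
flat = [50] * 11

print(dip_score(deep_dip))  # ln(80 / 10) = ln 8, about 2.08
print(dip_score(flat))      # ln(50 / 50) = 0.0
```

Under the stated assumption, a deeper dip yields a higher score and a flat profile scores zero, so scores computed with the same region set can be compared across samples or conditions.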

Funding

This work has been supported by the University of Virginia and by an NIH training grant to J.L. (NLM; 5T32LM012416). Conflict of Interest: none declared.
References

1.  DNA-binding factors shape the mouse methylome at distal regulatory regions.

Authors:  Michael B Stadler; Rabih Murr; Lukas Burger; Robert Ivanek; Florian Lienert; Anne Schöler; Erik van Nimwegen; Christiane Wirbelauer; Edward J Oakeley; Dimos Gaidatzis; Vijay K Tiwari; Dirk Schübeler
Journal:  Nature       Date:  2011-12-14       Impact factor: 49.962

2.  Higher order methylation features for clustering and prediction in epigenomic studies.

Authors:  Chantriolnt-Andreas Kapourani; Guido Sanguinetti
Journal:  Bioinformatics       Date:  2016-09-01       Impact factor: 6.937

3.  Transcription factors as readers and effectors of DNA methylation. (Review)

Authors:  Heng Zhu; Guohua Wang; Jiang Qian
Journal:  Nat Rev Genet       Date:  2016-08-01       Impact factor: 53.242

4.  DNA methylation heterogeneity defines a disease spectrum in Ewing sarcoma.

Authors:  Nathan C Sheffield; Gaelle Pierron; Johanna Klughammer; Paul Datlinger; Andreas Schönegger; Michael Schuster; Johanna Hadler; Didier Surdez; Delphine Guillemot; Eve Lapouble; Paul Freneaux; Jacqueline Champigneulle; Raymonde Bouvier; Diana Walder; Ingeborg M Ambros; Caroline Hutter; Eva Sorz; Ana T Amaral; Enrique de Álava; Katharina Schallmoser; Dirk Strunk; Beate Rinner; Bernadette Liegl-Atzwanger; Berthold Huppertz; Andreas Leithner; Gonzague de Pinieux; Philippe Terrier; Valérie Laurence; Jean Michon; Ruth Ladenstein; Wolfgang Holter; Reinhard Windhager; Uta Dirksen; Peter F Ambros; Olivier Delattre; Heinrich Kovar; Christoph Bock; Eleni M Tomazou
Journal:  Nat Med       Date:  2017-01-30       Impact factor: 53.440

5.  Single-cell DNA methylome sequencing and bioinformatic inference of epigenomic cell-state dynamics.

Authors:  Matthias Farlik; Nathan C Sheffield; Angelo Nuzzo; Paul Datlinger; Andreas Schönegger; Johanna Klughammer; Christoph Bock
Journal:  Cell Rep       Date:  2015-02-26       Impact factor: 9.423

6.  Inferring regulatory element landscapes and transcription factor networks from cancer methylomes.

Authors:  Lijing Yao; Hui Shen; Peter W Laird; Peggy J Farnham; Benjamin P Berman
Journal:  Genome Biol       Date:  2015-05-21       Impact factor: 13.583

7.  A pre-neoplastic epigenetic field defect in HCV-infected liver at transcription factor binding sites and polycomb targets.

Authors:  N A Wijetunga; M Pascual; J Tozour; F Delahaye; M Alani; M Adeyeye; A W Wolkoff; A Verma; J M Greally
Journal:  Oncogene       Date:  2016-10-10       Impact factor: 9.867

8.  LOLA: enrichment analysis for genomic region sets and regulatory elements in R and Bioconductor.

Authors:  Nathan C Sheffield; Christoph Bock
Journal:  Bioinformatics       Date:  2015-10-27       Impact factor: 6.937

