Elior Rahmani, Regev Schweiger, Brooke Rhead, Lindsey A Criswell, Lisa F Barcellos, Eleazar Eskin, Saharon Rosset, Sriram Sankararaman, Eran Halperin.
Abstract
High costs and technical limitations of cell sorting and single-cell techniques currently restrict the collection of large-scale, cell-type-specific DNA methylation data. This, in turn, impedes our ability to tackle key biological questions that pertain to variation within a population, such as identification of disease-associated genes at a cell-type-specific resolution. Here, we show mathematically and empirically that cell-type-specific methylation levels of an individual can be learned from its tissue-level bulk data, conceptually emulating the case where the individual has been profiled with a single-cell resolution and then signals were aggregated in each cell population separately. Provided with this unprecedented way to perform powerful large-scale epigenetic studies with cell-type-specific resolution, we revisit previous studies with tissue-level bulk methylation and reveal novel associations with leukocyte composition in blood and with rheumatoid arthritis. For the latter, we further show consistency with validation data collected from sorted leukocyte sub-types.
Year: 2019 PMID: 31366909 PMCID: PMC6668473 DOI: 10.1038/s41467-019-11052-9
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1. Observed bulk methylation levels may obscure cell-type-specific signals. Neither the observed methylation levels nor the observed levels after adjusting for the variability in cell-type composition can demonstrate a clear difference between cases and controls, in spite of a clear (unobserved) difference in cell type 3. Methylation levels are represented by a gradient of red color, and adjusted observed levels were calculated for each sample by removing the cell-type-specific mean levels, weighted by its cell-type composition.
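The adjustment described in the Fig. 1 caption can be sketched for a single methylation site as follows. This is a minimal illustration, not the authors' code: the function name `adjust_bulk` and all numeric values are hypothetical, and the sketch assumes the adjustment is a plain subtraction of the composition-weighted cell-type means.

```python
def adjust_bulk(bulk, proportions, cell_type_means):
    """Subtract the composition-weighted cell-type means from each sample.

    bulk            -- observed bulk level per sample (one site)
    proportions     -- per-sample list of cell-type proportions
    cell_type_means -- mean methylation level of each cell type at this site
    """
    adjusted = []
    for x_i, w_i in zip(bulk, proportions):
        # Level expected for this sample from its cell-type composition alone.
        expected = sum(w * mu for w, mu in zip(w_i, cell_type_means))
        adjusted.append(x_i - expected)
    return adjusted


# Hypothetical toy values: two samples, three cell types.
cell_type_means = [0.2, 0.5, 0.8]
proportions = [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
bulk = [0.45, 0.60]

adjusted = adjust_bulk(bulk, proportions, cell_type_means)
```

The adjusted value is still a single number per sample and site, so a deviation confined to one cell type is scaled by that cell type's proportion and can remain hard to see, which is the point the figure makes.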
Fig. 2. TCA versus a traditional decomposition approach. Given bulk DNA methylation data from a heterogeneous tissue, previous decomposition methods (e.g., PCA, ReFACTor[33], or a reference-based decomposition[26]) aim at estimating a matrix of the cell-type proportions of the individuals and a matrix of the cell-type-specific methylomes in the sample (shared across individuals). In contrast, TCA aims at estimating a matrix of the cell-type proportions of the individuals and, for each individual, a matrix of the unique cell-type-specific methylomes of the individual.
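The contrast drawn in the Fig. 2 caption can be written out in symbols. The notation below is an assumption introduced for illustration and does not appear in this record: x_{ij} is the bulk level of individual i at site j, w_{ih} is the proportion of cell type h in individual i, and k is the number of cell types.

```latex
% Traditional decomposition: one methylome \mu_{hj} per cell type,
% shared across all individuals.
x_{ij} \approx \sum_{h=1}^{k} w_{ih}\,\mu_{hj}

% TCA: each individual i has its own cell-type-specific level z^{(i)}_{hj}.
x_{ij} \approx \sum_{h=1}^{k} w_{ih}\,z^{(i)}_{hj}
```

In the first form the unknowns are a proportions matrix and a single cell-type-by-site matrix; in the second, the cell-type-by-site matrix is estimated separately for every individual.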
Fig. 3. An evaluation of power for detecting cell-type-specific associations with DNA methylation. Performance was evaluated using three approaches: TCA, a standard linear regression with the observed bulk data, and CellDMC with the true cell-type proportions as an input. The numbers of true positives (TPs) were measured under three scenarios using a range of effect sizes: different effect sizes for different cell types (Scenario I), the same effect size for all cell types (Scenario II), and a single effect size for a single cell type (Scenario III); each of the scenarios was evaluated under the assumption of three constituent cell types (k = 3; top row) and six constituent cell types (k = 6; bottom row). Lines represent the median performance across 10 simulations and the colored areas reflect the range of results across the multiple executions. The colored dots reflect the results of TCA under different initializations of the cell-type proportion estimates (i.e., different levels of noise injected into TCA), where the color gradients represent the mean absolute correlation of the initial estimates with the true values (across all cell types).
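The median line and shaded range in Fig. 3 amount to a per-setting summary of true-positive counts across repeated simulations. A minimal sketch of that summary is shown here; the helper `summarize_runs` and the ten counts are hypothetical and are not taken from the study.

```python
from statistics import median


def summarize_runs(tp_counts):
    """Summarize true-positive counts from repeated simulations.

    Returns the median (the plotted line) and the min/max
    (the edges of the shaded area).
    """
    return {
        "median": median(tp_counts),
        "min": min(tp_counts),
        "max": max(tp_counts),
    }


# Hypothetical TP counts from 10 simulations at one effect size.
tp_counts = [41, 38, 45, 40, 43, 39, 44, 42, 37, 46]
summary = summarize_runs(tp_counts)
```

Repeating this for each effect size and each method would give the points needed to draw one panel.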
Fig. 4. Results of the association analysis with level of immune activity and with rheumatoid arthritis in the Liu et al. whole blood methylation data, presented by Manhattan plots of the −log10 P-values for the association tests. a, b Shown are results with immune activity using CellDMC (results subsampled and truncated for visualization) and using TCA. c, d Shown are results of the RA analysis using standard regression and using TCA under the assumption of a single effect size for all cell types. e, f Shown are results of a cell-type-specific analysis of RA using CellDMC and using TCA. Solid horizontal red lines represent the experiment-wide significance threshold, and dotted horizontal red lines represent the significance threshold adjusted for three experiments corresponding to the three cell types.
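The two threshold lines in Fig. 4 can be illustrated with a short calculation. The caption does not state how the thresholds were derived, so this sketch assumes a Bonferroni correction; the function name, the significance level of 0.05, and the number of sites are all hypothetical.

```python
import math


def bonferroni_threshold(alpha, n_tests, n_experiments=1):
    """P-value threshold after dividing alpha by the total number of tests."""
    if n_tests <= 0 or n_experiments <= 0:
        raise ValueError("n_tests and n_experiments must be positive")
    return alpha / (n_tests * n_experiments)


n_sites = 100_000  # hypothetical number of tested methylation sites

# Experiment-wide threshold (the solid line, under the Bonferroni assumption).
solid = bonferroni_threshold(0.05, n_sites)

# Threshold further adjusted for three cell-type-specific experiments
# (the dotted line, under the same assumption).
dotted = bonferroni_threshold(0.05, n_sites, n_experiments=3)

# Heights at which the lines would be drawn on a -log10(P) axis.
solid_y = -math.log10(solid)
dotted_y = -math.log10(dotted)
```

On a Manhattan plot the stricter threshold corresponds to the higher line, so under this assumption the dotted line sits above the solid one.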