Liduo Yin, Yanting Luo, Xiguang Xu, Shiyu Wen, Xiaowei Wu, Xuemei Lu, Hehuang Xie.
Abstract
BACKGROUND: Numerous cell types can be identified within plant tissues and animal organs, and the epigenetic modifications underlying such enormous cellular heterogeneity are just beginning to be understood. It remains a challenge to infer cellular composition using DNA methylomes generated for mixed cell populations. Here, we propose a semi-reference-free procedure to perform virtual methylome dissection using the nonnegative matrix factorization (NMF) algorithm.
Keywords: Cellular heterogeneity; DNA methylation; Nonnegative matrix factorization; Single-cell methylome
Year: 2019 PMID: 31711526 PMCID: PMC6844058 DOI: 10.1186/s13072-019-0310-9
Source DB: PubMed Journal: Epigenetics Chromatin ISSN: 1756-8935 Impact factor: 4.954
Fig. 1 A three-step process for methylome dissection using eigen-pCSM loci. a In the first step, bipolar 4-CG segments are identified and a nonparametric Bayesian clustering algorithm is used to determine pCSM loci. b In the second step, co-methylation analysis is performed by k-means clustering coupled with WGCNA. Within each co-methylation module, PCA is used to pick eigen-pCSM loci as representatives of the whole module. c In the third step, methylome dissection is performed by nonnegative matrix factorization (NMF): matrix N, the raw methylation profile, is decomposed into two matrices, W and H. Matrix W represents the methylation profiles of the cell components, and matrix H represents the proportions of the cell components
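The decomposition in step c can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain multiplicative-update NMF in NumPy rather than the regularized factorization of a dedicated package, and the function name nmf_dissect, the synthetic input, and the final column normalization of H are all assumptions made for illustration.

```python
import numpy as np

def nmf_dissect(N, k, n_iter=1000, seed=0, eps=1e-9):
    """Factor N (loci x samples) into W (loci x k) and H (k x samples).

    Plain multiplicative updates that minimize the squared reconstruction
    error while keeping every entry of W and H nonnegative.
    """
    rng = np.random.default_rng(seed)
    n_loci, n_samples = N.shape
    W = rng.random((n_loci, k))
    H = rng.random((k, n_samples))
    for _ in range(n_iter):
        H *= (W.T @ N) / (W.T @ W @ H + eps)
        W *= (N @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic example (hypothetical data): 3 cell components, 200 loci, 20 mixtures.
rng = np.random.default_rng(1)
true_W = rng.random((200, 3))                    # methylation level per locus and component
true_H = rng.dirichlet(np.ones(3), size=20).T    # mixing proportions, columns sum to 1
N = true_W @ true_H                              # observed mixed methylomes

W, H = nmf_dissect(N, k=3)
# Assumed post-hoc step: rescale each sample's column so it reads as proportions.
P = H / H.sum(axis=0, keepdims=True)
rel_err = np.linalg.norm(N - W @ H) / np.linalg.norm(N)
```

Here the columns of W play the role of component methylation profiles and the columns of P the role of per-sample proportions. The components come out in arbitrary order, so they still have to be matched to reference cell types afterwards.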
Fig. 2 pCSM segments reflected methylation heterogeneity. a Distribution of methylation differences between cell subsets classified with pCSM and non-CSM segments. b Average methylation levels of pCSM segments and non-CSM segments across single cells. c, d Relationship between methylation level and methylation difference of pCSM segments (c) and non-CSM segments (d). The color indicates the densities of pCSM segments or non-CSM segments from low (blue) to high (red). e The distribution of pCSM loci across various genomic features compared to those of control regions
Fig. 3 Co-methylation analysis to extract eigen-pCSM loci. a Heatmap of the methylation levels of pCSM loci across brain methylomes. Methylation levels are represented by a color gradient from blue (unmethylated) to red (fully methylated). The color key in the right panel indicates co-methylation modules. b Methylation profiles of the top five co-methylation modules. Each blue line represents the methylation level of a pCSM locus across brain methylomes, and each red line represents the methylation level of an eigen-pCSM locus picked by PCA in that module. The 10% of eigen-pCSM loci with the maximal loadings in PC1 are shown
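Selecting representative loci by their PC1 loadings can be sketched as follows. This is only one plausible reading of the procedure: it assumes that loci are treated as the PCA variables, that loadings are ranked by absolute value, and that the 10% fraction from the caption is used as the cutoff. The function name pick_eigen_loci and the toy module are hypothetical.

```python
import numpy as np

def pick_eigen_loci(M, frac=0.10):
    """Return indices of the loci with the largest absolute PC1 loadings.

    M is a module matrix of shape (loci, samples). Each locus is centered
    across samples, and PCA is done through an SVD of the (samples x loci)
    matrix, so the first right singular vector holds the PC1 loadings.
    """
    X = (M - M.mean(axis=1, keepdims=True)).T
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    loadings = np.abs(Vt[0])
    n_keep = max(1, int(round(frac * M.shape[0])))
    return np.argsort(loadings)[::-1][:n_keep]

# Toy module (hypothetical data): 50 loci, 12 samples, mostly flat with small noise.
rng = np.random.default_rng(0)
trend = np.linspace(0.0, 1.0, 12)
M = 0.5 + rng.normal(0.0, 0.01, size=(50, 12))
M[:5] += 0.4 * (trend - 0.5)   # the first five loci share a strong common trend

selected = pick_eigen_loci(M)
```

In this toy module the five loci that carry the shared trend dominate PC1, so they are the ones returned.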
Fig. 4 Virtual methylome dissection based on eigen-pCSM loci. a Methylation profiles of eigen-pCSM loci, with each row representing an eigen-pCSM locus and each column representing one synthetic methylome. b Methylation profiles of NMF-predicted cell types, with each row representing an eigen-pCSM locus and each column representing an NMF-predicted cell type. c Heatmap of cell proportions predicted with NMF across all samples, with each row representing an NMF-predicted cell type and each column representing a sample. The proportions are represented by a color gradient from blue (low) to red (high). d Clustering analysis of cell types predicted by NMF and 16 reference methylomes. e Recovery of the mixing ratios for 16 neuronal cell types. Reference cell types that could not be unambiguously assigned to a latent methylation component (LMC) were considered failures in prediction, with a ratio of zero. In each line plot, the synthetic samples are sorted by ascending true mixing proportion
Fig. 5 Performance of virtual methylome dissection based on eigen-pCSM loci and hVar-CpG sites. a Number of correctly predicted cell types in each simulation. b Pearson correlation coefficient between LMCs and their corresponding reference methylomes. c The root-mean-square error (RMSE) between LMCs and their corresponding reference methylomes. d Mean absolute error (MAE) between NMF-predicted proportions and real proportions, with the dot showing the mean MAE and the shading showing the standard deviation of the MAE in 100 simulations
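The three evaluation measures named in panels b to d have standard definitions, sketched below in NumPy. The helper names and the small example vectors are made up for illustration and are not data from the study.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    return float(np.corrcoef(x, y)[0, 1])

def rmse(x, y):
    """Root-mean-square error between two equal-length vectors."""
    return float(np.sqrt(np.mean((np.asarray(x) - np.asarray(y)) ** 2)))

def mae(x, y):
    """Mean absolute error between two equal-length vectors."""
    return float(np.mean(np.abs(np.asarray(x) - np.asarray(y))))

# Hypothetical methylation levels at four loci: a reference profile and a predicted one.
reference = np.array([0.1, 0.5, 0.9, 0.3])
predicted = np.array([0.2, 0.4, 0.8, 0.4])

r = pearson(reference, predicted)
err_rmse = rmse(reference, predicted)
err_mae = mae(reference, predicted)
```

Pearson correlation and RMSE would be applied to a predicted profile against its matched reference methylome, and MAE to predicted against true mixing proportions.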
Fig. 6 Virtual methylome dissection of five sorted neuronal cell populations. a Selection of parameters k and λ by the cross-validation provided in the MeDeCom package. b Clustering analysis of predicted cell types and reference cell types when k = 3, with the red nodes representing the predicted cell types and the blue nodes representing the reference cell types from single-cell methylomes. c Predicted proportions of each LMC in five datasets