Hanshuang Li1, Na Ta2, Chunshen Long1, Qiutang Zhang1, Siyu Li1, Shuai Liu2, Lei Yang3, Yongchun Zuo1.
Abstract
Understanding the regulatory relationship between a pioneer factor and the genes it binds is crucial for improving the efficiency of TF-mediated reprogramming. Because Oct4 is the only factor that cannot be substituted by other POU family members, a quantitative model describing its spatial binding pattern at target genes is urgently needed. The dynamic profiles of Oct4 binding showed that the major wave occurs at the intermediate stage of cell reprogramming (from day 7 to day 15) and that promoters are the preferred target regions. The distribution of Oct4 binding shows significant chromosome bias: overall enrichment on chromosomes 1-11 is higher than on the others, and the dramatic events of TF-mediated reprogramming are concentrated mainly on autosomes. We also found that the spatial binding ability of Oct4 can be represented quantitatively by three peak parameters (height, width and distance). The dynamic changes in Oct4 binding indicated that width plays a more important role in regulating the expression of target genes. Finally, multivariate linear regression was used to establish a spatial binding model of Oct4. The evaluation results confirmed that height and width are positively correlated with gene expression, and that additive interaction terms of height and width improve model performance more than multiplicative terms. The best average coefficient of determination of the improved model reached 81.38%. Our study provides new insights into the cooperative regulation of the spatial binding pattern of pioneer factors in cell reprogramming.
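The regression step described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the exact model forms, including the additive interaction term, are not given in this record, so the sketch only fits an ordinary least-squares model on the three peak parameters and compares it with a variant that adds a multiplicative height-by-width term. All data are synthetic, and the helper name `fit_r2` and every coefficient below are assumptions made for illustration.

```python
import numpy as np

def fit_r2(X, y):
    """Fit ordinary least squares with an intercept; return (coefficients, R^2)."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares solution
    resid = y - A @ coef
    ss_res = float(resid @ resid)                  # residual sum of squares
    ss_tot = float(((y - y.mean()) ** 2).sum())    # total sum of squares
    return coef, 1.0 - ss_res / ss_tot

# Synthetic peak parameters (hypothetical ranges, not taken from the paper).
rng = np.random.default_rng(0)
n = 200
height = rng.uniform(1, 50, n)        # peak height
width = rng.uniform(100, 2000, n)     # peak width
distance = rng.uniform(0, 2000, n)    # distance parameter of the peak

# Synthetic expression values: an assumed linear relation plus noise.
expression = (0.05 * height + 0.002 * width - 0.0005 * distance
              + rng.normal(0, 0.5, n))

# Baseline model: the three parameters only.
base = np.column_stack([height, width, distance])
# Variant: the same model plus a multiplicative height * width term.
with_product = np.column_stack([base, height * width])

coef_base, r2_base = fit_r2(base, expression)
coef_prod, r2_prod = fit_r2(with_product, expression)

print(f"R^2, height + width + distance:      {r2_base:.4f}")
print(f"R^2, with height*width term added:   {r2_prod:.4f}")
```

On training data, R² can only stay the same or rise when a term is added to a nested model, so a comparison like the one reported in the abstract would normally rest on held-out data or an averaged criterion rather than on these in-sample values.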
Keywords: Cell reprogramming; Multivariate linear regression; Pioneer factor; Spatial binding pattern
Year: 2019 PMID: 31921389 PMCID: PMC6944736 DOI: 10.1016/j.csbj.2019.09.002
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1. Dynamic profiles of Oct4 binding and characteristics of its chromosomal distribution. (A) Flowchart of this study. (B) Dynamic profiles of Oct4 binding across the whole genome during reprogramming. (C) Distribution of Oct4 peaks on chromosomes at different time points of reprogramming. The thickness of the connecting lines between chromosomes and time points is proportional to the number of peaks. (D) The number of Oct4 peaks distributed on different chromosomes during reprogramming. (E) Boxplot of the dynamic change in peak height during reprogramming. (F) Boxplot of the dynamic change in peak width during reprogramming. (G) Kernel density plot of the width of Oct4 binding on different target genes during reprogramming. (H) The dynamic change in the width of Oct4 peaks during reprogramming.
Fig. 2. Dynamic binding of Oct4 in promoter regions. (A) The number of Oct4 target genes at different reprogramming time points, shown as a bar graph with the specific number marked at the top of each bar. (B) Distribution of Oct4-binding sites relative to the TSS within promoters at different reprogramming time points. (C) Boxplots of the height (upper left), width (bottom left) and distance (upper right) of Oct4 peaks located in target gene promoters, and of the expression of the corresponding target genes (bottom right). (D) Genome browser views of Oct4 binding at the Erbb3, Sall1, Lama3 and Lin28a loci at the different time points of reprogramming. The blue boxes represent the promoters of these genes and their surrounding regions. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 3. Modeling of Oct4 spatial binding and its target genes. (A) Pearson correlation analysis of the spatial binding parameters of Oct4 and target gene expression levels at different time points of reprogramming. (B) The number of genes screened based on the Pearson correlation coefficient. (C) Evaluation of the multivariate regression models. The red line indicates the average coefficient of determination (R2) of model 6. The higher the coefficient, the better the goodness of fit of the model. (D) The form and detailed parameters of the optimal model. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
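The correlation-based screening mentioned in this legend can be sketched as follows. This is only an illustration under stated assumptions: the legend does not give the cutoff or say whether the sign of the correlation matters, so the threshold of 0.8 on the absolute value, the function names `pearson_r` and `screen_genes`, and the toy gene names and values are all hypothetical.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.sqrt((xc @ xc) * (yc @ yc))
    # A constant series has no defined correlation.
    return float(xc @ yc / denom) if denom > 0 else float("nan")

def screen_genes(param_by_gene, expr_by_gene, threshold=0.8):
    """Return genes whose |r| between a peak parameter and expression
    meets the (hypothetical) threshold."""
    selected = []
    for gene, values in param_by_gene.items():
        r = pearson_r(values, expr_by_gene[gene])
        if not np.isnan(r) and abs(r) >= threshold:
            selected.append(gene)
    return selected

# Toy data: one peak parameter (e.g. width) and expression at four time points.
width_by_gene = {
    "geneA": [100, 200, 300, 400],
    "geneB": [100, 200, 300, 400],
}
expr_by_gene = {
    "geneA": [1.0, 2.0, 3.0, 4.0],   # rises with width
    "geneB": [2.0, 1.0, 2.0, 1.0],   # no clear trend
}

selected = screen_genes(width_by_gene, expr_by_gene)
print(selected)
```

With only a handful of time points per gene, such correlations are noisy, so any real cutoff would need to be chosen with that small sample size in mind.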
Fig. 4. Identification and characterization of strongly correlated target genes. (A) Pearson correlation analysis of the spatial binding parameters of Oct4, and combinations of those parameters, with target gene expression values. (B) UpSet plot showing the target genes screened at four time points of reprogramming (horizontal bars). The number of identified genes shared between sets is shown in the top bar chart, aligned with the solid points below it; each column represents the genes shared between the linked time points (linked dots). Figure generated using the Upset R package. (C) KEGG pathway enrichment analysis. (D) Genome browser view of Oct4 density in the Nanog, Dppa3 and Sall4 regions at different time points of reprogramming. The blue boxes represent the promoters of these genes and their surrounding regions. The bar chart shows the dynamic change in the expression level of Sall4, and the abscissa is the log-transformed expression value. (E) Gene co-expression network analysis. Circular nodes represent genes, and the size of each edge represents the strength of the relationship between the nodes it connects.