
Generation of mouse ES cell lines engineered for the forced induction of transcription factors.

Lina S Correa-Cerro, Yulan Piao, Alexei A Sharov, Akira Nishiyama, Jean S Cadet, Hong Yu, Lioudmila V Sharova, Li Xin, Hien G Hoang, Marshall Thomas, Yong Qian, Dawood B Dudekula, Emily Meyers, Bernard Y Binder, Gregory Mowrer, Uwem Bassey, Dan L Longo, David Schlessinger, Minoru S H Ko.

Abstract

Here we report the generation and characterization of 84 mouse ES cell lines with doxycycline-controllable transcription factors (TFs) which, together with the previous 53 lines, cover 7-10% of all TFs encoded in the mouse genome. Global gene expression profiles of all 137 lines after the induction of TFs for 48 hrs can associate each TF with the direction of ES cell differentiation, regulatory pathways, and mouse phenotypes. These cell lines and microarray data provide building blocks for a variety of future biomedical research applications as a community resource.

Year:  2011        PMID: 22355682      PMCID: PMC3240988          DOI: 10.1038/srep00167

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Mammalian genomes encode 1,500–2,000 transcription factors (TFs)1, which cross-regulate one another to form the network of TFs. The network controls the transcriptome of cells, thereby defining the identity of cells. A powerful approach to deciphering such a complex network is the systematic perturbation of individual TFs followed by global gene expression profiling2.

Results

Here we report the generation of mouse embryonic stem (ES) cell lines, each of which has been engineered by integrating an expression cassette of a specific transcription factor (TF) into the ubiquitously expressing Rosa26 locus (Fig. 1a)2. The Rosa26 locus3 drives relatively uniform expression of the exogenous copy (transgene) of a TF, which is repressed by doxycycline (Dox) and can be induced in Dox- cell culture conditions (Fig. 1b)4. Combined with the 53 ES cell lines reported previously2, we present a total of 137 ES cell lines. The majority of the manipulated genes were TFs, which were selected from a set of high-priority genes involved in critical functions in mouse ES cells and their differentiation5. To ensure the quality of these ES cell lines, we implemented rigorous QC steps that have been described previously in detail2. As a part of the characterization of these ES cell lines, we carried out global gene expression profiling by DNA microarrays 48 hours after TF induction (Fig. 1c; GEO accession number, GSE31381). The induction of each TF was confirmed by qRT-PCR (Fig. 1d; Supplementary Table 1 for primer pairs). The effect of TF induction on the transcriptome of mouse ES cells was highly variable (Fig. 1e; Supplementary Table 2). Measured by the number of genes whose expression changed significantly (FDR ≤ 0.05, fold change ≥ 1.5), the top 10% of studied TFs changed 4676 genes on average (e.g., Dmrt1), whereas the bottom 50% of TFs caused significant changes in expression in only 54.5 genes on average (e.g., Mbd3) (Fig. 1c, d).
Figure 1

Induction of transcription factors (TFs) in ES cells:

(a) plasmid structure that includes loxP recombination sites, puromycin resistance gene, open reading frame (ORF) of a TF with hCMV promoter followed by His6-FLAG tag; (b) schematic diagram showing the expression of transgenic TF induced in Dox- conditions; (c) examples of scatterplots of gene expression in Dox- versus Dox+ condition. Green and red dots indicate genes that are differentially expressed with statistical significance (FDR<0.05, change >1.5 fold); (d) Increase of transcription factor expression after the induction of a transgene, as measured by qPCR (Dox- vs. Dox+); results from two biological replicates (3 technical replicates each); error bars (S.E.M.; ANOVA); and dashed line = 2 fold change; (e) a list of TFs and the number of genes up- or down-regulated by the induction of the TF (FDR<0.05, change >1.5 fold) (Supplementary Table S2).

To further characterize the transcriptome alterations caused by each TF, we compared our microarray data with 3 public databases: the gene expression profiles of many mouse organs/tissues at The Genomics Institute of the Novartis Research Foundation (GNF) (ver. 2 & 3)6,7, the Genetic Association Database (GAD) on gene sets associated with mouse phenotypes8, and the MSigDB database (ver. 3) of gene sets associated with signaling pathways and cellular functions9. Because the GNF database is quantitative and the two other databases are qualitative, we used different methods to quantify association: correlation of median-subtracted log-transformed gene expression values for the GNF database, and Parametric Analysis of Gene Set Enrichment (PAGE)10 for the GAD and MSigDB databases (see Supplementary Methods). A comparison of our microarray data with the GNF database showed that the induction of a TF in ES cells often initiates the differentiation of ES cells into specific cell types as soon as 48 hr later, when cells do not yet exhibit any overt phenotypes (Fig. 2 for GNF ver. 3; Supplementary Fig. 1 for GNF ver. 2). For example, the transcriptome of ES cells shifted toward a neural profile after the induction of Sox9, Foxg1, Klf3, or Pou5f1; toward endoderm after the induction of Hnf4a, Gata2, Gata3, or Esx1; and toward skeletal muscle and heart after the induction of Myod1 or Mef2c. Similarly, the transcriptome of ES cells shifted toward hematopoietic cell lineages after the induction of Sfpi1, Elf1, or Irf2; and toward T-cells and thymocytes after the induction of Elf5 or Tgif1. Interestingly, TFs associated positively with transcriptome changes toward specific lineages showed a negative association with those toward different cell lineages (Fig. 2). For example, TFs associated with transcriptome changes toward neural tissues were negatively associated with those toward hematopoietic lineages (e.g., Sox9 and Foxg1 in Fig. 2), and vice versa (e.g., Irf2, Elf1, Sfpi1 in Fig. 2). These data suggest that TF networks are organized to cross-regulate as if different tissue lineages are mutually exclusive.
Figure 2

Correlation of gene expression response to the induction of TFs with tissue-specific gene expression from the GNF ver. 3 database7.

A comparison of our microarray data with the GAD database identified associations of TFs with mouse phenotypes (Fig. 3). Many newly identified associations are consistent with published data. For example, Hoxa2 was associated with the pancreatic alpha and beta cells11; Foxc1, with hair follicle/shaft12,13; and Sox11, with skeletal defects14. A comparison of our microarray data with the MSigDB database identified the association of each TF with specific cells and pathways (Fig. 4). For example, Smad6 was associated with keratinocytes15; Myod1, with alveolar rhabdomyosarcoma16; and Hnf4a, with lipoproteins17.
Figure 3

Enrichment of gene sets associated with mouse phenotypes from GAD database8 among genes that were upregulated (positive) or downregulated (negative) after the induction of various TFs.

Figure 4

Enrichment of gene sets associated with various functions and signaling pathways from the MSigDB ver. 3 database9 among genes that were upregulated (positive) or downregulated (negative) after the induction of various TFs.

Discussion

The collection of mouse ES cell lines reported here is freely available to the research community (http://esbank.nia.nih.gov/index.html). The analysis presented here can help researchers select ES cell lines suitable for their own research programs. For example, these TF-manipulable ES cell lines can be used to study the complex mechanisms of ES cell differentiation toward specific lineages. These ES cell lines are also adaptable to a variety of experiments and analyses, as shown in our previous report2. For example, each TF is C-terminally tagged with His6-FLAG, which simplifies studies of TF localization, protein-protein interactions, and protein-DNA interactions2. Further mining of the microarray results reported here, as well as additional experiments with the provided ES cell lines and their derivatives, will yield more insight into gene regulatory networks. Carrying out similar experiments for more regulatory proteins (ideally for all TFs and additional signaling proteins) should give increasingly complete information for understanding gene regulation in mammalian cells and organs.

Methods

Derivation of transgenic ES cell lines

ES cell lines with inducible TF transgenes were derived from MC1 mouse ES cells (129S6/SvEvTac), passage 17. Cells were cultured in DMEM with 15% FBS and LIF on feeder cells. Cells were electroporated with a linearized pMWROSATcH vector and selected with hygromycin B. Knock-in at the ROSA-TET locus was confirmed by Southern blotting. For exchange vectors, PCR-amplified ORFs were subcloned into pZhcSfi, which was modified to express a His6-FLAG-tagged protein and a puromycin resistance gene. ES cells were co-transfected with a sequence-verified exchange vector and pCAGGS-Cre and selected with puromycin in the presence of doxycycline (Dox). Isolated clones were tested for Venus expression, hygromycin B susceptibility, transgene RNA expression, Cre-mediated integration (by genotyping), and mycoplasma contamination.

Gene expression analysis of cells with induced TFs

ES cells (passage 25) were cultured in the standard LIF+ medium with Dox+ on a gelatin-coated dish throughout the experiments. Cells from each cell line were split into 6 wells, and the medium was changed 24 hr after cell plating: 3 wells received Dox+ medium, and 3 wells received Dox- medium to induce the transgenic TF. Dox was removed by washing 3 times with PBS at 3-hour intervals. Total RNA was isolated with TRIzol (Invitrogen) after 48 hr, and two replicates were used for real-time qPCR (see primers in Supplementary Table S1) and for microarray hybridization. Total RNA samples were labeled using the Low RNA Input Fluorescent Linear Amplification Kit (Agilent). For most TFs, we hybridized a Cy3-CTP-labeled sample from Dox- medium together with a Cy5-CTP-labeled sample from Dox+ medium. For 7 TFs, however, we labeled the samples from Dox- and Dox+ media with Cy3 and hybridized them independently with a Cy5-labeled reference target, which is a mixture of Stratagene Universal Mouse Reference RNA and MC1 cell RNA (this method requires twice as many arrays). Analysis showed that both methods produce results of comparable quality. Targets were hybridized to the NIA Mouse 44K Microarray v3.0 (Agilent, design ID 015087)18. Slides were scanned with an Agilent DNA Microarray Scanner. All DNA microarray data are available in Supplementary Table S2, at GEO/NCBI19 (http://www.ncbi.nlm.nih.gov/geo; accession number GSE31381), and in the NIA Array Analysis software20 (http://lgsun.grc.nia.nih.gov/ANOVA).

Normalization of microarray data and detection of outliers

Two methods of array hybridization were used in this study: (1) RNA extracted from cells with induced transcription factors (TFs) (cultured in Dox- conditions) and from control cells (cultured in Dox+ conditions) was labeled with Cy3, and each sample was hybridized on a separate array together with reference RNA labeled with Cy5; and (2) RNA extracted from cells with induced TFs (Dox-) was labeled with Cy3 and hybridized together with RNA from control cells (Dox+) labeled with Cy5. The second method does not use reference RNA. Data processing depended on the method of hybridization. Potential Cy3/Cy5 bias in microarrays with the hybridization of Dox- vs. Dox+ samples was removed by normalization to the median logratio of gene expression change in all TF-manipulation experiments. The details of the method are available in Supplementary Information.
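The median-logratio normalization can be illustrated with a minimal sketch. It assumes that the bias is estimated per gene as the median logratio across all TF experiments and then subtracted; the exact procedure is given only in the Supplementary Information, so this reading, the function name `remove_dye_bias`, and the data layout are all assumptions made for illustration.

```python
from statistics import median

def remove_dye_bias(logratios):
    """Subtract a per-gene bias estimate from Dox- vs. Dox+ logratios.

    logratios: {experiment_name: {gene: logratio}} (hypothetical layout).
    Assumption: the bias of a gene is the median of its logratio across
    all TF-manipulation experiments.
    """
    genes = set().union(*(exp.keys() for exp in logratios.values()))
    # Median logratio of each gene over every experiment that measured it.
    bias = {g: median(exp[g] for exp in logratios.values() if g in exp)
            for g in genes}
    return {name: {g: v - bias[g] for g, v in exp.items()}
            for name, exp in logratios.items()}
```

The rationale for such a correction would be that most TFs leave most genes unchanged, so a consistent nonzero logratio across experiments is more likely to reflect dye bias than a biological response.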

Statistical analysis of microarray data

For statistical analysis we used NIA Array Analysis, which estimates the False Discovery Rate (FDR) to account for multiple hypothesis testing20. The response of genes to the induction of TFs was measured as a logratio (i.e., the difference between the means of log-transformed intensities) between manipulated (Dox-) and control (Dox+) cells. We considered a gene expression change significant if the logratio was significantly different from zero (FDR < 0.05) and the change of expression was >1.5 fold.
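The two-part significance criterion can be sketched as follows. This is not the NIA Array Analysis implementation: the Benjamini-Hochberg procedure is swapped in as a generic FDR estimate, log10 is an assumed base for the log transformation, and the function names are hypothetical.

```python
import math
from statistics import mean

def bh_fdr(pvalues):
    """Benjamini-Hochberg adjusted p-values (stand-in for the FDR estimate)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def logratio(dox_minus, dox_plus):
    """Difference between the means of log10-transformed intensities."""
    return (mean(math.log10(v) for v in dox_minus)
            - mean(math.log10(v) for v in dox_plus))

def significant(logratios, pvalues, fdr_cutoff=0.05, fold_cutoff=1.5):
    """Flag genes with FDR < fdr_cutoff and a change above fold_cutoff."""
    fdr = bh_fdr(pvalues)
    min_abs = math.log10(fold_cutoff)  # fold threshold on the log10 scale
    return [q < fdr_cutoff and abs(lr) > min_abs
            for lr, q in zip(logratios, fdr)]
```

Both conditions must hold, so a gene with a large fold change but a weak FDR, or a tiny but statistically robust change, is not counted.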

Correlation with tissue-specific gene expression

Association of gene expression changes induced by TF manipulation with tissue-specific gene expression was evaluated based on the correlation between our microarray results and the GNF database7. Correlation was estimated between gene expression responses to TF manipulation (logratio of Dox- vs. Dox+) and median-centered log-transformed gene expression in various tissues from the GNF database (ver. 2 and 3). Because the importance of genes in ES cells and adult tissues may differ, and because the microarray platforms used in these studies are not 100% compatible, we applied correlation analysis to a subset of genes that are highly expressed and dynamic in both data sets. We selected the 10,000 genes in each database with the highest score, defined as the product of average log-expression and the standard deviation of expression (after induction of various TFs or in different tissues), and then took the intersection: 5,595 genes for GNF ver. 3 (5,295 genes for ver. 2). Correlation values and the corresponding z-values were then estimated based on this subset of genes. The matrix was sorted using hierarchical clustering in TMEV ver. 3.1 (ref. 21).
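The gene selection and correlation steps can be sketched as below. This is a minimal illustration under stated assumptions: the data layout and function names are hypothetical, Pearson correlation and the sample standard deviation are assumed choices, and the conversion of correlations to z-values is not covered because the text does not specify it.

```python
import math
from statistics import mean, median, stdev

def top_genes(expr, n_top):
    """Select genes with the highest score = mean log-expression * SD.

    expr: {gene: [log-expression across conditions]} (hypothetical layout).
    """
    score = {g: mean(v) * stdev(v) for g, v in expr.items()}
    return set(sorted(score, key=score.get, reverse=True)[:n_top])

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def tissue_correlation(response, tissue_expr, tissue_index, genes):
    """Correlate a TF response with median-centered tissue expression.

    response: {gene: logratio of Dox- vs. Dox+}
    tissue_expr: {gene: [log-expression in each tissue]}
    genes: the shared subset, e.g. top_genes(a, n) & top_genes(b, n)
    """
    genes = sorted(genes)
    x = [response[g] for g in genes]
    y = [tissue_expr[g][tissue_index] - median(tissue_expr[g]) for g in genes]
    return pearson(x, y)
```

In use, `top_genes` would be called once on the TF-induction data and once on the tissue data with `n_top=10000`, and the intersection of the two sets passed as `genes`.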

Analysis of gene set enrichment

Enrichment of target genes in subsets of genes that were upregulated and/or downregulated following the manipulation of a TF was quantified using a modified Parametric Analysis of Gene Set Enrichment (PAGE)10. PAGE is based on the comparison of the average expression change in a specific subset of genes, x_set, with the average expression change in all genes, x_all:

$$z = \frac{(x_{set} - x_{all})\,\sqrt{n_{set}}}{SD_{all}} \qquad (1)$$

where n_set is the size of the gene set and SD_all is the standard deviation of expression change among all genes. We modified this method by applying equation (1) to the subset of N top upregulated genes and to another subset of N top downregulated genes rather than to all genes combined, which allowed us to detect the enrichment of the same gene set among both upregulated and downregulated genes. The value of N = 5000 was selected empirically because the enrichment of genes with TF binding sites appeared to be always limited to the top 5000 upregulated or downregulated genes. The probability distribution of expression change within subsets of N upregulated and downregulated genes is not normal; however, because we compare averages for large sets of genes (usually, n_set is >50), the probability distribution of these averages is close to normal based on the central limit theorem22. Thus, it is reasonable to use equation (1) as an approximation. When both upregulated and downregulated genes were enriched in a specific functional gene set, we subtracted the smaller z-value from both z-values. The matrix of z-values was first sorted using hierarchical clustering in TMEV ver. 3.1 (ref. 21), and then manually converted to a semi-diagonal form.
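A minimal sketch of the modified PAGE calculation follows, using the standard PAGE z-score, z = (x_set − x_all) · sqrt(n_set) / SD_all. Several details are assumptions rather than the authors' code: the function names, the use of the population standard deviation, returning 0 when a gene set has no members in a subset, and flipping the sign of the changes in the downregulated subset so that enrichment among the most strongly downregulated genes yields a positive z.

```python
import math
from statistics import mean, pstdev

def page_z(changes, gene_set):
    """PAGE z-score of gene_set within changes = {gene: expression change}."""
    members = [changes[g] for g in gene_set if g in changes]
    values = list(changes.values())
    sd_all = pstdev(values)
    if not members or sd_all == 0:
        return 0.0  # assumption: no evidence of enrichment
    return (mean(members) - mean(values)) * math.sqrt(len(members)) / sd_all

def modified_page(changes, gene_set, n_top=5000):
    """Apply PAGE separately to the top-N up- and downregulated genes."""
    ranked = sorted(changes, key=changes.get, reverse=True)
    up = {g: changes[g] for g in ranked[:n_top]}
    # Sign flip (assumption): larger value = stronger downregulation.
    down = {g: -changes[g] for g in ranked[-n_top:]}
    z_up, z_down = page_z(up, gene_set), page_z(down, gene_set)
    if z_up > 0 and z_down > 0:
        # Both directions enriched: subtract the smaller z from both.
        smaller = min(z_up, z_down)
        z_up, z_down = z_up - smaller, z_down - smaller
    return z_up, z_down
```

Treating the two tails separately is what lets the same gene set score as enriched among both upregulated and downregulated genes, which a single z-score over all genes would average away.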

Author Contributions

LSC, YP, AN, JSC, HY, LVS, LX, HGH, MT, EM, BYB, GM, and UB carried out the experiments. AAS, YQ, DD, and MSHK carried out the data analysis. MSHK conceived the project. DLL, DS, and MSHK supervised the project. AAS and MSHK wrote the manuscript with inputs from all authors. All authors reviewed the manuscript.
  21 in total

1.  A web-based tool for principal component and significance analysis of microarray data.

Authors:  Alexei A Sharov; Dawood B Dudekula; Minoru S H Ko
Journal:  Bioinformatics       Date:  2005-02-25       Impact factor: 6.937

2.  The establishment of a predictive mutational model of the forkhead domain through the analyses of FOXC2 missense mutations identified in patients with hereditary lymphedema with distichiasis.

Authors:  Fred B Berry; Yahya Tamimi; Michelle V Carle; Ordan J Lehmann; Michael A Walter
Journal:  Hum Mol Genet       Date:  2005-08-04       Impact factor: 6.150

3.  Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles.

Authors:  Aravind Subramanian; Pablo Tamayo; Vamsi K Mootha; Sayan Mukherjee; Benjamin L Ebert; Michael A Gillette; Amanda Paulovich; Scott L Pomeroy; Todd R Golub; Eric S Lander; Jill P Mesirov
Journal:  Proc Natl Acad Sci U S A       Date:  2005-09-30       Impact factor: 11.205

4.  Requirement for Shh and Fox family genes at different stages in sweat gland development.

Authors:  Makoto Kunisada; Chang-Yi Cui; Yulan Piao; Minoru S H Ko; David Schlessinger
Journal:  Hum Mol Genet       Date:  2009-03-06       Impact factor: 6.150

5.  DGAT1 participates in the effect of HNF4A on hepatic secretion of triglyceride-rich lipoproteins.

Authors:  Sergey Krapivner; Maria Jesus Iglesias; Angela Silveira; Jesper Tegnér; Johan Björkegren; Anders Hamsten; Ferdinand M van't Hooft
Journal:  Arterioscler Thromb Vasc Biol       Date:  2010-02-18       Impact factor: 8.311

6.  NCBI GEO: archive for functional genomics data sets--10 years on.

Authors:  Tanya Barrett; Dennis B Troup; Stephen E Wilhite; Pierre Ledoux; Carlos Evangelista; Irene F Kim; Maxim Tomashevsky; Kimberly A Marshall; Katherine H Phillippy; Patti M Sherman; Rolf N Muertter; Michelle Holko; Oluwabukunmi Ayanbule; Andrey Yefanov; Alexandra Soboleva
Journal:  Nucleic Acids Res       Date:  2010-11-21       Impact factor: 16.971

7.  Dissecting Oct3/4-regulated gene networks in embryonic stem cells by expression profiling.

Authors:  Ryo Matoba; Hitoshi Niwa; Shinji Masui; Satoshi Ohtsuka; Mark G Carter; Alexei A Sharov; Minoru S H Ko
Journal:  PLoS One       Date:  2006-12-20       Impact factor: 3.240

8.  PAGE: parametric analysis of gene set enrichment.

Authors:  Seon-Young Kim; David J Volsky
Journal:  BMC Bioinformatics       Date:  2005-06-08       Impact factor: 3.169

9.  Transcript copy number estimation using a mouse whole-genome oligonucleotide microarray.

Authors:  Mark G Carter; Alexei A Sharov; Vincent VanBuren; Dawood B Dudekula; Condie E Carmack; Charlie Nelson; Minoru S H Ko
Journal:  Genome Biol       Date:  2005-06-30       Impact factor: 13.583

10.  Systematic analysis, comparison, and integration of disease based human genetic association data and mouse genetic phenotypic information.

Authors:  Yonqing Zhang; Supriyo De; John R Garner; Kirstin Smith; S Alex Wang; Kevin G Becker
Journal:  BMC Med Genomics       Date:  2010-01-21       Impact factor: 3.063

