Guillermo-Issac Guerrero-Ramirez, Cesar-Miguel Valdez-Cordoba, Jose-Francisco Islas-Cisneros, Victor Trevino.
Abstract
There is a need for specific cell types in regenerative medicine and biological research. Frequently, specific cell types cannot be easily obtained, or the quantity obtained is insufficient for study. Reprogramming by direct conversion (transdifferentiation) or by re‑induction of induced pluripotent stem cells has therefore been used to obtain cells with expression profiles similar to those of the desired types. Induction requires a specific cocktail of transcription factors (TFs), but identifying the correct combination of TFs is difficult. Although certain computational approaches have been proposed for this task, their methods are complex, and the corresponding implementations are difficult to use and to generalize to specific source or target cell types. In the present review, four computational approaches that have been proposed to obtain likely TFs are compared and discussed. A simplified view of the computational complexity of these methods is provided, consisting of three basic ideas: i) the definition of target and non‑target cell types; ii) the estimation of candidate TFs; and iii) the filtering of candidates. This simplified view was validated by analyzing a well‑documented cardiomyocyte differentiation. Subsequently, the reviewed methods were compared when applied to an unknown differentiation, that of corneal endothelial cells. The generated results may provide important insights for laboratory assays. Data and computer scripts that may assist with direct conversions in other cell types are also provided.
Year: 2018 PMID: 29845286 PMCID: PMC6072137 DOI: 10.3892/mmr.2018.9092
Source DB: PubMed Journal: Mol Med Rep ISSN: 1791-2997 Impact factor: 2.952
Figure 1. Simplified view of TF identification for cell conversion. (A) Process of defining at least two cell populations. (B) Differential expression analysis of TFs between defined populations to identify pre-candidate TFs. (C) Filtering process of pre-candidates in order to generate a short list of TFs whose overexpression will likely control the desired cell state. TF, transcription factor.
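The three steps in Figure 1 can be illustrated with a minimal Python sketch. It is not the implementation of any of the reviewed methods: the difference of means used as the score, the function name `rank_candidate_tfs`, the gene labels `TF_A` to `TF_C` and all expression values are assumptions made up for illustration.

```python
from statistics import mean


def rank_candidate_tfs(expression, target, tf_names, top_n=2):
    """Rank TFs by mean expression in the target minus mean in non-targets.

    expression: dict mapping cell type -> dict mapping gene -> list of samples.
    """
    # (A) Define populations: everything that is not the target is a non-target.
    non_targets = [cell for cell in expression if cell != target]

    # (B) Score each TF by a simple difference of means (an assumed score).
    scores = {}
    for tf in tf_names:
        target_values = expression[target][tf]
        other_values = [v for cell in non_targets for v in expression[cell][tf]]
        scores[tf] = mean(target_values) - mean(other_values)

    # (C) Filter: keep only TFs higher in the target, then take the top_n.
    ranked = sorted(tf_names, key=lambda tf: scores[tf], reverse=True)
    return [(tf, scores[tf]) for tf in ranked if scores[tf] > 0][:top_n]


# Toy data (hypothetical values, not taken from the review).
toy_expression = {
    "target": {"TF_A": [9.0, 8.0], "TF_B": [2.0, 2.0], "TF_C": [5.0, 5.0]},
    "other1": {"TF_A": [1.0, 2.0], "TF_B": [2.0, 3.0], "TF_C": [4.0, 5.0]},
    "other2": {"TF_A": [2.0, 1.0], "TF_B": [6.0, 5.0], "TF_C": [5.0, 4.0]},
}

candidates = rank_candidate_tfs(toy_expression, "target", ["TF_A", "TF_B", "TF_C"])
print(candidates)  # [('TF_A', 7.0), ('TF_C', 0.5)]
```

In this toy case `TF_B` is dropped by the filter because it is lower in the target than elsewhere. The reviewed methods differ mainly in how they choose the non-target populations and which score they use in step (B).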
Figure 2. Comparison of the definition of cell populations.
Definition of populations of cell types by all methods.
| Author, year | Data | Target | Non-targets | (Refs.) |
|---|---|---|---|---|
| Cahan | GEO, queried datasets, 16–20 cell types | Several samples of the same cell or tissue type | Remaining cell types | ( |
| D'Alessio | GEO, 504 datasets, 233 cell types | Several samples of the same cell or tissue type | Remaining cell types (balanced) | ( |
| Rackham | FANTOM5, >700 datasets (CAGE-Seq) | Samples of the same cell type | Remaining cell types but avoiding close and distant related ones | ( |
| Okawa | GEO, Specific data | A daughter cell type | The progenitor and sister cell types | ( |
GEO, gene expression omnibus.
Figure 3. Comparison of conceptual definitions to identify TF differences. TF, transcription factor; mag, magnitude.
Identification of differentially expressed TFs.
| Author, year | Method | Comparison | (Refs.) |
|---|---|---|---|
| Cahan | Tissue-Specific Context Likelihood of Relatedness | Pairs of co-expressed TF | ( |
| D'Alessio | Jensen-Shannon Divergence | Per TF | ( |
| Rackham | Combines P-values and fold-change | Per TF | ( |
| Okawa | Normalized Ratio Difference | Pairs of swap-expressed TF | ( |
TF, transcription factor.
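As an illustration of a per-TF comparison based on Jensen-Shannon divergence, the sketch below scores how closely a TF's expression profile across cell types matches an idealized profile that is expressed only in the target. This is one common way of turning the divergence into a specificity score and is an assumption here; it is not claimed to reproduce the exact formulation of D'Alessio. The function names and the example values are hypothetical.

```python
import math


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) of two discrete distributions."""
    def kl(a, b):
        # Kullback-Leibler divergence; terms with a zero probability contribute 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    midpoint = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, midpoint) + 0.5 * kl(q, midpoint)


def specificity_score(expression_by_cell_type, target):
    """Return 1 - sqrt(JSD) between a TF profile and the ideal target-only profile."""
    cell_types = list(expression_by_cell_type)
    total = sum(expression_by_cell_type.values())
    if total == 0:
        return 0.0  # a TF that is not expressed anywhere cannot be specific
    profile = [expression_by_cell_type[c] / total for c in cell_types]
    ideal = [1.0 if c == target else 0.0 for c in cell_types]
    divergence = max(0.0, jensen_shannon_divergence(profile, ideal))
    return 1.0 - math.sqrt(divergence)


# Hypothetical mean expression per cell type for two TFs.
specific_tf = {"target": 10.0, "cell_a": 0.0, "cell_b": 0.0}
ubiquitous_tf = {"target": 5.0, "cell_a": 5.0, "cell_b": 5.0}

print(specificity_score(specific_tf, "target"))
print(specificity_score(ubiquitous_tf, "target"))
```

With base-2 logarithms the divergence lies between 0 and 1, so the score is 1 for a TF expressed only in the target and falls toward 0 as expression shifts to other cell types.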
Figure 4. Comparison of the generation of candidate TFs. TF, transcription factor; CLR, context likelihood of relatedness; JSD, Jensen-Shannon divergence; NRD, normalized ratio difference; GSEA, gene set enrichment analysis.
Resources available for finding key TFs.
| Author, year | Resources and limitations | (Refs.) |
|---|---|---|
| Cahan | CellNet: Web interface and R package. Any source cell type as input but only from certain Affymetrix arrays, and Illumina arrays (in R). Only specific target cell types are available | ( |
| D'Alessio | File for 233 cell type predictions. Manual estimations are possible for a target. Source is not used. | ( |
| Rackham | Mogrify: Web interface. Specific for several already cataloged source and target cell types. | ( |
| Okawa | None available. | ( |
TF, transcription factor.
Top 20 genes per method for cardiomyocyte differentiation.
| Author, year | Delta | t-test | Rackham | D'Alessio | Okawa | Mentions, n | TF comments (Refs.) |
|---|---|---|---|---|---|---|---|
| Kamaraj | TBX20 | TBX20 | TBX20 | ZNF705A | GATA4 | HAND1, 5 | Computational prediction ( |
| Ieda; Addis | GATA4 | GATA4 | GATA4 | ZNF283 | TBX20 | HAND2, 5 | First described in ( efficiency of CM expression markers ( |
| Ieda; Addis; Ebrahimi | HAND1 | HAND1 | TBX5 | ZSCAN4 | HAND1 | GATA4, 4 | Key TF first described in ( and computationally ( |
| Ieda; Addis; Ebrahimi | TBX5 | TBX5 | GATA6 | LIN28B | TBX5 | TBX5, 4 | Key TF first described in ( and computationally ( |
| Addis | HAND2 | HAND2 | HAND1 | HAND2 | HAND2 | NKX2.5, 4 | Increased efficiency of CM expression markers ( |
| Xiang | ESRRG | ESRRG | CSDC2 | HAND1 | ESRRG | TBX20, 4 | Implicated in CM proliferation and cardiac function in mice ( |
| Fu | NKX2.5 | NKX2.5 | NKX2.5 | TFDP3 | CSDC2 | ESRRG, 4 | Improved CM phenotype ( |
| Kamaraj | CSDC2 | CSDC2 | HAND2 | POU1F1 | NKX2.5 | HEY2, 4 | [ |
| Rastegar-Pouyani | PROX1 | PROX1 | ESRRG | E2F8 | PROX1 | TCF21, 4 | [ |
| Kamaraj | TCF21 | TCF21 | PROX1 | HNF4G | TCF21 | GATA6, 4 | [ |
| | HEY2 | HEY2 | HEY2 | ZNF20 | HEY2 | CSDC2, 4 | [ |
| Risebro | GATA6 | GATA6 | NPAS2 | NR1H4 | GATA6 | PROX1, 4 | Muscle structure maintenance ( |
| Kamaraj | NR0B2 | NR0B2 | TEAD2 | RFX6 | NR0B2 | EBF2, 4 | [ |
| Liu | EBF2 | EBF2 | PPARA | CDX4 | EBF2 | MEIS2, 4 | May be important in CM ( |
| Rastegar-Pouyani | IRX3 | IRX3 | MEIS2 | ESX1 | ID4 | TEAD2, 4 | [ |
| | ETV1 | ETV1 | EBF2 | ZFP42 | IRX3 | EBF3, 3 | [ |
| Shekhar | MEIS2 | MEIS2 | TCF21 | X.2878 | ETV1 | ETV1, 3 | Involved in rapid impulse conduction ( |
| | TEAD2 | TEAD2 | TEAD1 | SRY | MEIS2 | IRF6, 3 | [ |
| Koizumi | IRF6 | IRF6 | IRX4 | FOXR2 | TEAD2 | IRX3, 3 | Involved in cardiac rhythm ( |
| Nam | EBF3 | EBF3 | EBF3 | RFX8 | IRF6 | NR0B2, 3 | Involved in cardiac hypertrophy ( |
TF not tested for differentiation.
GeneCards Human Gene Database, www.genecards.org/cgi-bin/carddisp.pl?gene=CSDC2. Top 20 genes by each criterion including those most frequently appearing (Mentions column). TF, transcription factor; CM, cardiomyocyte; HAND1, heart and neural crest derivatives expressed 1; HAND2, heart and neural crest derivatives expressed 2; GATA4, GATA binding protein 4; TBX5, T-box 5; NKX2.5, NK2 homeobox 5; TBX20, T-box 20; ESRRG, estrogen related receptor γ; HEY2, hes related family bHLH transcription factor with YRPW motif 2; TCF21, transcription factor 21; GATA6, GATA binding protein 6; CSDC2, cold shock domain containing C2; PROX1, prospero homeobox 1; EBF2, early B cell factor 2; MEIS2, meis homeobox 2; TEAD2, TEA domain transcription factor 2; EBF3, early B cell factor 3; ETV1, ETS variant 1; IRF6, interferon regulatory factor 6; IRX3, iroquois homeobox 3; NR0B2, nuclear receptor subfamily 0 group B member 2.
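The Mentions column counts in how many per-method lists a gene appears. A minimal sketch of that tally is shown below, using only the first five genes listed for each method in the table above. Because the lists are truncated to five entries, the counts differ from the published column, which is based on the top 20. The helper name `count_mentions` is hypothetical.

```python
from collections import Counter


def count_mentions(gene_lists):
    """Count the number of method lists in which each gene appears."""
    mentions = Counter()
    for genes in gene_lists.values():
        mentions.update(set(genes))  # a gene counts at most once per method
    return mentions


# First five genes per method, copied from the cardiomyocyte table.
top5_by_method = {
    "Delta": ["TBX20", "GATA4", "HAND1", "TBX5", "HAND2"],
    "t-test": ["TBX20", "GATA4", "HAND1", "TBX5", "HAND2"],
    "Rackham": ["TBX20", "GATA4", "TBX5", "GATA6", "HAND1"],
    "D'Alessio": ["ZNF705A", "ZNF283", "ZSCAN4", "LIN28B", "HAND2"],
    "Okawa": ["GATA4", "TBX20", "HAND1", "TBX5", "HAND2"],
}

mentions = count_mentions(top5_by_method)
consensus = sorted(gene for gene, n in mentions.items() if n >= 4)
print(consensus)  # ['GATA4', 'HAND1', 'HAND2', 'TBX20', 'TBX5']
```

Even on this truncated input, the genes found by four of the five scores are the cardiac TFs at the top of the Mentions column.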
Figure 5. Results for the CEC example. (A) Comparison of the five scores. The t-test P-value is indicated as -Log10. (B) Table of the top 20 genes by each criterion including those most frequently arising (Mentions column). Genes were assigned specific colors. Genes in italics were repeated, although not in the top 20. Black genes were specific to each score. (C) Comparison of gene expression of genes in column Mentions in panel (B) across CEC and non-target cell types. CEC, corneal endothelial cells.