Tomoya Mori1,2, Haruka Takaoka3, Junko Yamane1, Cantas Alev1, Wataru Fujibuchi4.
Abstract
Deciphering the key mechanisms of morphogenesis during embryonic development is crucial to understanding the guiding principles of the body plan and to promoting applications in biomedical research. Although several computational tissue reconstruction methods using cellular gene expression data have been proposed, those methods are insufficient for arranging cells in their correct positions in tissues or organs unless spatial information is explicitly provided. Here, we report SPRESSO, a new in silico three-dimensional (3D) tissue reconstruction method that uses stochastic self-organizing map (stochastic-SOM) clustering to estimate the spatial domains of cells in tissues or organs from their gene expression profiles alone. With only five gene sets defined by Gene Ontology (GO), we reconstructed the four-domain structure of the mid-gastrula mouse embryo (E7.0) with high reproducibility (success rate = 99%). Interestingly, the five GOs contain 20 genes, most of which are related to differentiation and morphogenesis, such as activin A receptor and Wnt family member genes. Further analysis indicated that Id2 is the most influential gene contributing to the reconstruction. SPRESSO may provide new insights into the mechanisms of 3D structure formation in living tissues by identifying informative genes that act as spatial discriminators.
Year: 2019 PMID: 31467377 PMCID: PMC6715814 DOI: 10.1038/s41598-019-49031-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Overview of the 3D reconstruction method of mid-gastrula mouse embryo using stochastic-SOM clustering. The gene expression data of mid-gastrula mouse embryo published by Peng et al.[27] were downloaded from GEO (accession number: GSE65924) and used as input data for our 3D reconstruction method. The expression data consisted of 41 samples with 23,361 genes. After filtering out low-expression genes, we used 5,585 genes as the input data. We generated candidate spatial discriminator gene sets according to GOs. We evaluated all the reconstructed structures from stochastic-SOM clustering in terms of success rate and total variance. Finally, we projected the samples to the paraboloid to reproduce the embryo structure.
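The clustering-and-evaluation step in this overview can be illustrated with a minimal sketch. It is not the authors' stochastic-SOM: it is a plain online self-organizing map with randomly ordered sample presentation, and the definition of success rate used here (the fraction of repeated runs whose clustering matches a reference domain labeling) is an assumption made for illustration. All function names, parameters, and the toy data are hypothetical.

```python
import math
import random


def best_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (Euclidean)."""
    return min(range(len(weights)), key=lambda u: math.dist(weights[u], x))


def train_som(samples, n_units=4, n_iter=300, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1D chain of SOM units, presenting samples in random order."""
    rng = random.Random(seed)
    # Initialise each unit from a randomly chosen sample (copied, not aliased).
    weights = [list(rng.choice(samples)) for _ in range(n_units)]
    for t in range(n_iter):
        decay = 1.0 - t / n_iter
        lr = lr0 * decay
        sigma = sigma0 * decay + 1e-3  # keep the neighbourhood width positive
        x = rng.choice(samples)
        bmu = best_unit(weights, x)
        for u in range(n_units):
            # Gaussian neighbourhood over the chain of unit indices.
            h = math.exp(-((u - bmu) ** 2) / (2.0 * sigma ** 2))
            weights[u] = [w + lr * h * (xi - w) for w, xi in zip(weights[u], x)]
    return weights


def same_partition(labels_a, labels_b):
    """True if both labelings group the samples identically, ignoring label names."""
    pairs = set(zip(labels_a, labels_b))
    return len(pairs) == len(set(labels_a)) == len(set(labels_b))


def success_rate(samples, reference, n_runs=20, **som_kwargs):
    """Assumed definition: fraction of seeded runs that recover the reference grouping."""
    hits = 0
    for seed in range(n_runs):
        weights = train_som(samples, seed=seed, **som_kwargs)
        labels = [best_unit(weights, x) for x in samples]
        hits += same_partition(labels, reference)
    return hits / n_runs


# Toy data (hypothetical): 8 "samples" with 2 "genes", in four groups.
toy_samples = [
    [0.0, 0.0], [0.1, 0.0],
    [1.0, 0.0], [1.0, 0.1],
    [0.0, 1.0], [0.1, 1.0],
    [1.0, 1.0], [0.9, 1.0],
]
toy_reference = [0, 0, 1, 1, 2, 2, 3, 3]
rate = success_rate(toy_samples, toy_reference, n_runs=20, n_units=4)
print(f"success rate over 20 seeds: {rate:.2f}")
```

Because the outcome depends on the random seed, repeating the training and counting matching runs gives a reproducibility measure in the spirit of the success rate reported here; the low-expression filter, the total variance, and the paraboloid projection are not covered by this sketch.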
Figure 2. Success rate and total variance of GOs and their combinations. The horizontal and vertical axes show the success rate and total variance, respectively. Each dot indicates a feature gene set selected by GO. (a) 6,778 GOs were selected from the 17,940 GOs to which the mouse genes belong according to the following two criteria: (i) the GO includes 1,000 genes or fewer, and (ii) it contains three or more of the 5,585 input genes. GO:0060412 (ventricular septum morphogenesis) shows the highest success rate, 84%. (b) The results of all pairs of GO:0060412 and the other 6,777 GOs. The success rates of 22 pairs are equal to or higher than 85%, and the highest is 95%.
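The two selection criteria in panel (a) amount to a simple filter over a GO-to-gene mapping. The sketch below assumes the annotation is available as a dictionary from GO identifier to member genes; the function name, the toy identifiers, and the toy gene names are hypothetical.

```python
def select_go_terms(go_to_genes, expressed_genes, max_size=1000, min_expressed=3):
    """Keep GO terms with at most max_size member genes, of which at least
    min_expressed are in the filtered expression data.

    Returns {go_id: sorted list of member genes present in the expression data}.
    """
    expressed = set(expressed_genes)
    selected = {}
    for go_id, members in go_to_genes.items():
        members = set(members)
        if len(members) > max_size:  # criterion (i): 1,000 genes or fewer
            continue
        hits = members & expressed
        if len(hits) >= min_expressed:  # criterion (ii): three or more input genes
            selected[go_id] = sorted(hits)
    return selected


# Toy annotation (hypothetical identifiers and gene names).
toy_go = {
    "GO:toy_a": ["g1", "g2", "g3", "g4"],           # passes both criteria
    "GO:toy_b": ["g1", "g9"],                       # too few input genes
    "GO:toy_c": [f"x{i}" for i in range(1001)],     # more than 1,000 members
}
toy_expressed = ["g1", "g2", "g3", "x0", "x1", "x2"]
selected = select_go_terms(toy_go, toy_expressed)
print(selected)
```

Only the genes that survive the expression filter are kept for each selected term, since those are the features a clustering step could actually use.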
22 GOs showing success rates of reconstruction equal to or higher than 85% when combined with GO:0060412.
| Success rate (%) | Total variance | GO | Term |
|---|---|---|---|
| 84 (single GO) | 0.124 | GO:0060412 | ventricular septum morphogenesis |
| 85 | 0.110 | GO:0005021 | vascular endothelial growth factor-activated receptor activity |
| 85 | 0.122 | GO:1905456 | regulation of lymphoid progenitor cell differentiation |
| 85 | 0.128 | GO:0031117 | positive regulation of microtubule depolymerization |
| 85 | 0.137 | GO:0005381 | iron ion transmembrane transporter activity |
| 87 | 0.122 | GO:0070986 | left/right axis specification |
| 87 | 0.130 | GO:0044117 | growth of symbiont in host |
| 87 | 0.130 | GO:0044130 | negative regulation of growth of symbiont in host |
| 87 | 0.130 | GO:0044146 | negative regulation of growth of symbiont involved in interaction with host |
| 87 | 0.135 | GO:0030169 | low-density lipoprotein particle binding |
| 88 | 0.129 | GO:0072079 | nephron tubule formation |
| 89 | 0.141 | GO:0003214 | cardiac left ventricle morphogenesis |
| 90 | 0.110 | GO:2000392 | regulation of lamellipodium morphogenesis |
| 90 | 0.110 | GO:2000394 | positive regulation of lamellipodium morphogenesis |
| 90 | 0.127 | GO:0002830 | positive regulation of type 2 immune response |
| 90 | 0.127 | GO:0045630 | positive regulation of T-helper 2 cell differentiation |
| 91 | 0.126 | GO:0010899 | regulation of phosphatidylcholine catabolic process |
| 92 | 0.121 | GO:0042827 | platelet dense granule |
| 92 | 0.132 | GO:0048681 | negative regulation of axon regeneration |
| 93 | 0.106 | GO:0034707 | chloride channel complex |
| 93 | 0.122 | GO:1905564 | positive regulation of vascular endothelial cell proliferation |
| 95 | 0.120 | GO:0046716 | muscle cell cellular homeostasis |
| 95 | 0.121 | GO:0031994 | insulin-like growth factor I binding |
Final set of 18 genes derived from the five GOs reproducing 100% success rate.
| Gene | Official full name |
|---|---|
| Acvr1 | activin A receptor, type 1 |
| Cited2 | Cbp/p300-interacting transactivator, with Glu/Asp-rich carboxy-terminal domain, 2 |
| Coro1b | coronin, actin binding protein 1B |
| Dll1 | delta like canonical Notch ligand 1 |
| Enpp2 | ectonucleotide pyrophosphatase/phosphodiesterase 2 |
| Fgfrl1 | fibroblast growth factor receptor-like 1 |
| Flt1 | FMS-like tyrosine kinase 1 |
| Fzd1 | frizzled class receptor 1 |
| Hes1 | hes family bHLH transcription factor 1 |
| Id2 | inhibitor of DNA binding 2 |
| Igfbp3 | insulin-like growth factor binding protein 3 |
| Igfbp4 | insulin-like growth factor binding protein 4 |
| Itga6 | integrin alpha 6 |
| Nrp2 | neuropilin 2 |
| Pdgfra | platelet derived growth factor receptor, alpha polypeptide |
| Rreb1 | ras responsive element binding protein 1 |
| Slit3 | slit guidance ligand 3 |
| Wnt5a | wingless-type MMTV integration site family, member 5A |
Figure 3. Visualization of reconstructed models from gene expression profiles of mouse embryo samples. Reconstructed mouse embryo models and heatmaps of the domain correlation for different gene sets are shown. When only the feature gene set GO:0060412 was used, the success rate was 84%. When four optimal GOs were added, the success rate increased to 99%, and when the Arl13b and Smad7 genes were then removed, it increased to 100%; in both cases the total variance was smaller than with GO:0060412 alone. The visualization distance from the centroids of the output units to each sample reflects the similarity (Euclidean norm) between the centroids and the sample vectors. In the domain correlations, D4 shows a distinct gene expression pattern from the other three domains.
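The two quantities described in this caption, the Euclidean distance from a centroid to each sample and the correlation between domains, can be sketched as follows. The toy expression vectors are hypothetical, each centroid is taken as the plain mean of a domain's samples rather than a trained SOM unit, and the use of Pearson correlation between domain centroids is an assumption, since the caption does not state how the domain correlation is computed.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length vectors."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


# Toy expression vectors (hypothetical): two samples per domain, three genes.
domains = {
    "D1": [[1.0, 2.0, 3.0], [1.0, 2.0, 5.0]],
    "D4": [[3.0, 2.0, 1.0], [5.0, 2.0, 1.0]],
}

centroids = {name: centroid(vs) for name, vs in domains.items()}

# Euclidean norm between each sample vector and its domain centroid.
distances = {
    name: [math.dist(v, centroids[name]) for v in vs]
    for name, vs in domains.items()
}

# Correlation between the two domain centroids (assumed Pearson).
corr = pearson(centroids["D1"], centroids["D4"])
print(centroids, distances, round(corr, 3))
```

A strongly negative or low correlation between one domain's centroid and the others would correspond to the kind of distinct expression pattern noted for D4.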