Abstract
Animals use a stereotypical set of developmental genes to build body architectures of varying sizes and organizational complexity. Some genes are critical to developmental patterning, while other genes are important to physiological control of growth. However, growth regulator genes may not be as important in small-bodied "micro-metazoans" such as nematodes. Nematodes use a simplified developmental strategy of lineage-based cell fate specifications to produce an adult bilaterian body composed of a few hundred cells. Nematodes also lost the MYC proto-oncogenic regulator of cell proliferation. To identify additional regulators of cell proliferation that were lost with MYC, we computationally screened and determined 839 high-confidence genes that are conserved in bilaterians/lost in nematodes (CIBLIN genes). We find that 30 % of all CIBLIN genes encode transcriptional regulators of cell proliferation, epithelial-to-mesenchyme transitions, and other processes. Over 50 % of CIBLIN genes are unnamed genes in Drosophila, suggesting that there are many understudied genes. Interestingly, CIBLIN genes include many Myc synthetic lethal (MycSL) hits from recent screens. CIBLIN genes include key regulators of heparan sulfate proteoglycan (HSPG) sulfation patterns, and lysyl oxidases involved in cross-linking and modification of the extracellular matrix (ECM). These genes and others suggest the CIBLIN repertoire services critical functions in ECM remodeling and cell migration in large-bodied bilaterians. Correspondingly, CIBLIN genes are co-expressed with Myc in cancer transcriptomes, and include a preponderance of known determinants of cancer progression and tumor aggression. We propose that CIBLIN gene research can improve our understanding of regulatory control of cellular growth in metazoans.
Keywords: Apoptosis; Cancer; Cell migration; Cell proliferation; Comparative genomics; Genetic screens; Mnt; Myc; Myc synthetic lethals (MycSL genes)
Year: 2015 PMID: 26173873 PMCID: PMC4568025 DOI: 10.1007/s00427-015-0508-1
Source DB: PubMed Journal: Dev Genes Evol ISSN: 0949-944X Impact factor: 0.900
The 839 conserved in bilaterians/lost in nematodes (CIBLIN) genes
| Set (a) | Orthology groups checked and identified in: | Orthologs NOT called in: | Data system | No. of distinct fly genes | No. of distinct human genes |
|---|---|---|---|---|---|
| 1 | | | Metazoa EnsemblCompara (invertebrates) | 3009 | n.d. |
| 2 | | | Ensembl Genes 78 (vertebrates + | 1197 | 1389 |
| 3 | | | Ensembl Genes 78 | 968 | 1158 |
| 4 | Three-way strict orthology across | N/A | Ensembl Genes 78 | 881 | 971 |
Gene sets identified and analyzed in this study. Rows in yellow represent three CIBLIN gene lists. Set 1 is from a precursor step. Set 2 includes all CIBLIN genes, including those duplicated in any one lineage and regardless of whether they are present outside of Bilateria. Other sets are used for specific analyses described in the text
n.d. not determined, N/A not applicable
(a) Each numbered set describes a subset of genes identified from the previous set
(b) Human-mouse many-to-many relationships (independent duplications) were removed. This predominantly removes multi-copy genes such as histone-encoding genes
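The logic behind a "conserved in bilaterians/lost in nematodes" call can be illustrated as a simple set filter: keep an orthology group only if it has a called ortholog in every required bilaterian species and in none of the nematode species. The following is a minimal sketch under stated assumptions, not the pipeline used in the study. The species labels, the toy `groups` dictionary, and the function name `ciblin_candidates` are all hypothetical.

```python
from typing import Dict, Iterable, List, Set


def ciblin_candidates(
    groups: Dict[str, Set[str]],
    required: Iterable[str],
    nematodes: Iterable[str],
) -> List[str]:
    """Return IDs of orthology groups that are present in every species in
    `required` and absent from every species in `nematodes`."""
    required_set = set(required)
    nematode_set = set(nematodes)
    return sorted(
        gid
        for gid, species in groups.items()
        if required_set <= species and not (species & nematode_set)
    )


# Hypothetical toy input: orthology group -> species with a called ortholog.
groups = {
    "grp_myc": {"human", "mouse", "fly"},
    "grp_med12": {"human", "mouse", "fly", "c_elegans", "trichinella"},
    "grp_med15": {"human", "mouse", "fly", "trichinella"},
    "grp_fly_only": {"fly"},
}

hits = ciblin_candidates(
    groups,
    required=["human", "mouse", "fly"],
    nematodes=["c_elegans", "trichinella"],
)
print(hits)
```

In this toy input, a group detected in even one nematode species is excluded, which mirrors why a gene called in Trichinella alone would not count as lost in nematodes.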
Fig. 1 Identification of CIBLIN genes. a Sufficient genomes and comparative genomic resources exist to attempt a screen for genes conserved in bilaterians/lost in nematodes (CIBLIN). Tree is based on the phylogenetic analysis of the Med12 protein sequence, which is not lost (see “Materials and methods”). Tree shows only the species whose genomes were used to search for genes lost in the stem-nematode lineage, during which the genes encoding the Myc and Mnt bHLH transcription factors were lost. The identification of genes lost in the stem-nematode lineage might correspond to general cell proliferation programs used by animals. Image of nematode is of an adult C. elegans, which has only 959 somatic cells in the adult (image adapted from Bob Goldstein, UNC Chapel Hill, CC-BY-A 2006). b Plot of evolutionary rates for 971 CIBLIN orthologs present as single-copy genes in mammals. Graph plots each gene using the ω values (dN/dS) computed between the human and mouse genes (x-axis) or the human and rat genes (y-axis), and shows that these genes predominantly evolve at clock-like rates, indicating negative (purifying) selection. The red dot represents the average rates for the 971 mammalian CIBLIN genes (~0.14), indicating that most of these are diverging only slowly. The box in yellow encloses the most conserved ~490 mammalian CIBLIN genes, which correspond to the ranked set at which “developmental process” is most significant of all ranked sets (167 N genes/top 490 M genes; see Table 3). Thus, the GO attribute for “developmental process” is significantly overrepresented in the most conserved CIBLIN genes
Developmental regulatory processes are most overrepresented in the most conserved CIBLIN orthologs
| N | M (a) | X | P | P adj | Attrib. ID | Gene Ontology (GO) attribution name |
|---|---|---|---|---|---|---|
| 13 | 66 | 384 | 4.2E−10 | 0.000 | GO:0048598 | Embryonic morphogenesis |
| 16 | 785 | 53 | 1.5E−10 | 0.000 | GO:0008146 | Sulfotransferase activity |
| 17 | 785 | 64 | 3.6E−10 | 0.000 | GO:0016782 | Transferase act., transferring sulfur-con. groups |
| 18 | 136 | 407 | 4.8E−10 | 0.000 | GO:0009887 | Organ morphogenesis |
| 15 | 260 | 179 | 1.8E−08 | 0.000 | GO:0030278 | Regulation of ossification |
| 27 | 248 | 550 | 2.1E−09 | 0.000 | GO:0009888 | Tissue development |
| 27 | 212 | 666 | 3.9E−09 | 0.000 | GO:0048731 | System development |
| 49 | 128 | 2675 | 3.2E−12 | 0.000 | GO:0048856 | Anatomical structure development |
| 41 | 204 | 1273 | 7.7E−11 | 0.000 | GO:0009653 | Anatomical structure morphogenesis |
| 39 | 320 | 789 | 8.0E−10 | 0.000 | GO:0043565 | Sequence-specific DNA binding |
| 50 | 298 | 1290 | 9.3E−10 | 0.000 | GO:0045595 | Regulation of cell differentiation |
| 55 | 407 | 1025 | 1.1E−10 | 0.000 | GO:0003700 | Sequence-specific DNA binding txn. factor activity |
| 55 | 407 | 1026 | 1.1E−10 | 0.000 | GO:0001071 | Nucleic acid binding transcription factor activity |
| 59 | 318 | 1502 | 1.8E−10 | 0.000 | GO:0006357 | Reg. of txn. from RNA polymerase II promoter |
| 69 | 343 | 1778 | 2.2E−10 | 0.000 | GO:0050793 | Regulation of developmental process |
| 100 | 320 | 3294 | 1.3E−10 | 0.000 | GO:0006355 | Regulation of transcription, DNA-templated |
| 101 | 320 | 3424 | 5.1E−10 | 0.000 | GO:0051252 | Regulation of RNA metabolic process |
| 108 | 320 | 3716 | 2.2E−10 | 0.000 | GO:0010556 | Reg. of macromolecule biosynthetic process |
| 110 | 320 | 3928 | 1.3E−09 | 0.000 | GO:0009889 | Regulation of biosynthetic process |
| 91 | 433 | 2234 | 5.0E−09 | 0.000 | GO:0048869 | Cellular developmental process |
| 108 | 320 | 3895 | 3.5E−09 | 0.000 | GO:0031326 | Regulation of cellular biosynthetic process |
| 139 | 411 | 4010 | 1.5E−10 | 0.000 | GO:0010468 | Regulation of gene expression |
| 156 | 488 | 3994 | 9.0E−10 | 0.000 | GO:0044767 | Single-organism developmental process |
| 167 | 489 | 4456 | 4.6E−09 | 0.000 | GO:0032502 | Developmental process |
| 35 | 559 | 431 | 3.3E−08 | 0.001 | GO:0006366 | Transcription from RNA polymerase II promoter |
| 153 | 411 | 4917 | 3.0E−08 | 0.001 | GO:0060255 | Regulation of macromolecule metabolic process |
| 8 | 305 | 34 | 4.1E−08 | 0.002 | GO:0032570 | Response to progesterone |
| 47 | 318 | 1230 | 4.0E−08 | 0.002 | GO:1902680 | Positive regulation of RNA biosynthetic process |
| 48 | 318 | 1273 | 4.3E−08 | 0.002 | GO:0010628 | Positive regulation of gene expression |
| 9 | 128 | 115 | 6.5E−08 | 0.003 | GO:0061448 | Connective tissue development |
| 26 | 316 | 474 | 6.5E−08 | 0.003 | GO:0008134 | Transcription factor binding |
| 42 | 387 | 861 | 7.2E−08 | 0.003 | GO:0045944 | Pos. reg. of txn. from RNA pol. II promoter |
| 45 | 318 | 1172 | 7.1E−08 | 0.003 | GO:0045893 | Pos. regulation of transcription, DNA-templated |
| 47 | 318 | 1251 | 6.7E−08 | 0.003 | GO:0051254 | Positive regulation of RNA metabolic process |
| 51 | 318 | 1405 | 5.1E−08 | 0.003 | GO:0010557 | Pos. reg. of macromolecule biosynthetic process |
| 150 | 320 | 6382 | 6.8E−08 | 0.003 | GO:0044260 | Cellular macromolecule metabolic process |
| 133 | 320 | 5472 | 1.0E−07 | 0.006 | GO:0031323 | Regulation of cellular metabolic process |
| 5 | 40 | 60 | 1.4E−07 | 0.008 | GO:2000648 | Positive regulation of stem cell proliferation |
| 46 | 295 | 1336 | 1.2E−07 | 0.008 | GO:2000026 | Reg. of multicellular organismal development |
| 53 | 318 | 1534 | 1.3E−07 | 0.008 | GO:0009891 | Positive regulation of biosynthetic process |
| 31 | 204 | 1062 | 2.0E−07 | 0.012 | GO:0048513 | Organ development |
| 35 | 311 | 840 | 2.1E−07 | 0.012 | GO:0051094 | Positive regulation of developmental process |
Gene Ontology (GO) attributes related to developmental processes, stem cell proliferation, and organogenesis are predominantly associated with the CIBLIN repertoire (yellow highlighted rows). GO attributes related to DNA-binding transcriptional regulators are also overrepresented (light red highlighted rows). The 167 CIBLIN genes with the GO term for developmental process (dark yellow highlight) are highlighted in Fig. 1b
P, single-hypothesis one-sided P value of association between attribute and query based on Fisher’s exact test; P adj, an empirically adjusted P value given by the fraction of 1000 null-hypothesis simulations having attributes with this single-hypothesis P value or smaller
(a) Genes ordered by average human/mouse and human/rat dN/dS ratios from slowest to fastest rates. In this context, M corresponds to the first M genes in the ranked list producing the most significant P value for any significant attribute among 17,658 attributes
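The ranked-list idea in footnote (a) can be sketched as a prefix scan: walk down the list ordered from slowest to fastest dN/dS, test each prefix of length M for enrichment of an attribute, and keep the prefix with the smallest P value. The sketch below is illustrative only. It assumes the background for each test is the ranked list itself, which may differ from the software actually used, and the toy list and the function names `hypergeom_tail` and `best_prefix` are hypothetical.

```python
from math import comb
from typing import List, Tuple


def hypergeom_tail(k: int, n: int, K: int, N: int) -> float:
    """One-sided probability of seeing at least k annotated genes in a draw
    of n genes from a pool of N genes, K of which are annotated."""
    upper = min(n, K)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


def best_prefix(ranked_has_attr: List[bool]) -> Tuple[int, int, float]:
    """Scan every prefix of a ranked list and return (M, N, P) for the prefix
    whose enrichment P value is smallest (the first such prefix on ties)."""
    total = len(ranked_has_attr)
    annotated = sum(ranked_has_attr)
    best = (0, 0, 1.0)
    hits = 0
    for m, flag in enumerate(ranked_has_attr, start=1):
        hits += flag
        p = hypergeom_tail(hits, m, annotated, total)
        if p < best[2]:
            best = (m, hits, p)
    return best


# Hypothetical toy list, ordered from most conserved to least conserved gene;
# True marks a gene carrying the attribute of interest.
ranked = [True, True, True, False, False, True, False, False, False, False]
m, n, p = best_prefix(ranked)
print(m, n, p)
```

Because the best of many prefixes is selected, the raw P value from such a scan is optimistic, which is one reason an empirically adjusted P value from null simulations is a useful companion.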
Fig. 2 Interaction network for human CIBLIN transcriptional regulators. a Of the 1158 human CIBLIN genes (set 3, Table 1), 101 have GO attributes associated with either “sequence-specific DNA binding transcription factor activity” (GO:0003700) or “Mediator complex” (GO:0016592). The top panel shows the interaction network for the human genes based on physical-interaction data, shared protein domains, interactions predicted from other species (e.g., studies in mouse and others), and pathway data. The bottom panel shows a subset of 56 genes that are most closely expressed with MYC (big yellow halo), MYCN (small yellow halo), MYCL (small yellow halo), or MNT (small pink halo) based on all available human transcriptome studies. The studies contributing most to the expression association map are predominantly cancer transcriptomes (see Table 4). b Co-expression network for 52 regulator genes (a subset of genes in Fig. 2a) co-expressed with MYC, MYCN, MYCL, and MNT (highlighted gene nodes in each corner) over 287 transcriptomic studies using human cells. The specific studies that contributed the most to the Pearson correlations between these genes are listed in Table S5 and ranked by weight
Examples of conserved CIBLIN genes with roles in development and/or cancer
| Rank | Avg. dN/dS (a) | Human gene | Human gene description [source: HGNC] | Roles in development and/or cancer progression |
|---|---|---|---|---|
| 1 | 0.000 | BZW1 | Basic leucine zipper and W2 domains 1 | Proliferation regulator, in salivary mucoepidermoid carcinoma (Li et al. |
| 2 | 0.000 | ENY2 | Enhancer of yellow 2 homolog ( | Insulator/barrier regulator, binds CTCF (Maksimenko et al. |
| 3 | 0.000 | LMO4 | LIM domain only 4 | Proliferation and epithelial-to-mesenchyme regulator; neuroblastomas; mammary stem cells and breast tumorigenesis (Ferronha et al. |
| 7 | 0.001 | OTP | Orthopedia homeobox | Breast cancer; pulmonary carcinoids (Kim et al. |
| 11 | 0.006 | PRRX1 | Paired related homeobox 1 | Glioblastoma invasiveness (Sugiyama et al. |
| 14 | 0.008 | TADA3 | Transcriptional adaptor 3 | Embryonic progression, cell cycle checkpoint (Mohibi et al. |
| 15 | 0.008 | MAD2L2 | MAD2 mitotic arrest deficient-like 2 (yeast) | Mitotic checkpoint (Cahill et al. |
| 28 | 0.014 | SHOX2 | Short stature homeobox 2 | Embryoid bodies, hepatocellular carcinoma, breast cancer, lung cancers (Schneider et al. |
| 31 | 0.015 | PTOV1 | Prostate tumor overexpressed 1 | Epithelial ovarian cancers, prostate cancers, high grade malignant tumors (Alana et al. |
| 33 | 0.015 | GBX2 | Gastrulation brain homeobox 2 | Promotes pluripotent cell fates (Tai and Ying |
| 34 | 0.015 | SOX10 | SRY-box 10 | Melanoma progression (Shakhova et al. |
See Supplementary tables for complete list
(a) Average dN/dS values are the average of the human/mouse and human/rat alignments
Transcriptional regulators are overrepresented in the 971 mammalian CIBLIN orthologs
| N | X | P | P adj | Attrib. ID | Gene Ontology (GO) attribution name |
|---|---|---|---|---|---|
| 16 | 53 | 3.3E−09 | <0.001 | GO:0008146 | Sulfotransferase activity |
| 17 | 64 | 9.4E−09 | <0.001 | GO:0016782 | Transferase act., transferring sulfur-containing groups |
| 259 | 4010 | 1.7E−06 | 0.004 | GO:0010468 | Regulation of gene expression |
| 45 | 431 | 2.3E−06 | 0.005 | GO:0006366 | Transcription from RNA polymerase II promoter |
| 85 | 1025 | 2.5E−06 | 0.005 | GO:0003700 | Sequence-specific DNA binding transcription factor act. |
| 85 | 1026 | 2.6E−06 | 0.006 | GO:0001071 | Nucleic acid binding transcription factor activity |
| 4 | 4 | 6.1E−06 | 0.022 | GO:0004720 | Protein-lysine 6-oxidase activity |
| 6 | 12 | 1.1E−05 | 0.032 | GO:0048484 | Enteric nervous system development |
| 6 | 12 | 1.1E−05 | 0.032 | GO:0070286 | Axonemal dynein complex assembly |
| 7 | 18 | 1.5E−05 | 0.040 | GO:0001539 | Cilium or flagellum-dependent cell motility |
Functions associated with cell migration or remodeling of the extracellular matrix are highlighted in green. Functions associated with transcriptional regulation are highlighted in red and correspond to exactly 299 genes with these terms (~1/3 of the genes). Functions associated with development are highlighted in yellow. For context, the human Gene Ontology database is composed of 19,452 genes with 17,658 attributes
N, number of genes in the tested set that carry the given GO attribute; X, total number of genes in the genome with that attribute; P, the “single hypothesis one-sided P value of the association between attribute and query based on Fisher’s exact test” (Berriz et al. 2009; Berriz et al. 2003); P adj, an empirically adjusted P value, which is the “fraction of 1000 null-hypothesis simulations having attributes with this single-hypothesis P value or smaller” (Berriz et al. 2009; Berriz et al. 2003)
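As an illustration of how N, X, and P relate, a one-sided Fisher's exact test for overrepresentation is equivalent to an upper-tail hypergeometric probability. The sketch below uses only the Python standard library and takes its inputs from the sulfotransferase row of the table (N = 16, X = 53), the 971-gene query set, and the 19,452-gene database size. It assumes that all 19,452 genes form the background universe and that every query gene lies within it; the exact background used by the original software may differ, so the printed value need not reproduce the table entry. The function name `fisher_one_sided` is hypothetical.

```python
from math import comb


def fisher_one_sided(n_hits: int, query_size: int, attr_total: int, genome_size: int) -> float:
    """Upper-tail hypergeometric probability: the chance that a random query
    of `query_size` genes contains at least `n_hits` of the `attr_total`
    annotated genes in a genome of `genome_size` genes."""
    upper = min(query_size, attr_total)
    numerator = sum(
        comb(attr_total, k) * comb(genome_size - attr_total, query_size - k)
        for k in range(n_hits, upper + 1)
    )
    # Exact integer arithmetic; the final division yields a float.
    return numerator / comb(genome_size, query_size)


# Inputs from the sulfotransferase activity row (GO:0008146); the background
# size of 19,452 genes is an assumption about the test universe.
p = fisher_one_sided(n_hits=16, query_size=971, attr_total=53, genome_size=19_452)
print(f"{p:.2e}")
```

Exact integer binomials avoid the floating-point underflow that can affect very small tail probabilities, at the cost of some speed for large genomes.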
Fig. 3 Reduction of Mediator complex accompanied loss of CIBLIN regulators in nematodes. a The head, middle, and tail subcomplexes, as well as the kinase module of Mediator, are shown, along with the subunits that are not detectable in nematode genomes (specifically the genomes for species shown in Fig. 1a). The undetectable subunits, which are likely lost or else under relaxed selection and fast-evolving, are indicated in blue with a delta symbol (“deleted”). Conserved subunits are indicated in fuchsia. Subunits in purple are putatively present as extremely divergent forms and have been given suggestive names Mdt-15 and Mdt-11. Med27 was only detected in the enoplian species of Trichinella. Human Myc is known to physically contact human Med1 and Med16 (vertical and horizontal lines, from Fig. 2a). b An alignment of the Med15 protein from human (H. sap.), fly (D. mel.), and the nematode Trichinella (T. spi.) and Mdt-15 from C. elegans (C. ele.), which is most likely Med-15, is highlighted here to make several points about the threshold sensitivity of the CIBLIN repertoire. The EnsemblCompara pipelines are able to make the call for Med15 in Trichinella (Ensembl Metazoa EnsemblCompara) but not in C. elegans (both Metazoan Ensembl Compara and the main Ensembl Genes 78 computation). Med15 protein sequence does not feature any major domains and at no place is there more than a single amino acid residue conserved twice in a row in all four species. Insertions and deletions predominate, and few residues are conserved across all taxa (yellow highlight). Med15/Mdt15 is not a CIBLIN gene because of its detection in Trichinella
Fig. 4 Human CIBLIN genes include Myc synthetic lethal hits from several screens. A Venn diagram of overlap between human CIBLIN genes and Myc synthetic lethal (MycSL) hits identified by screening small hairpin RNA (shRNA) or siRNA libraries. The list of 1389 human CIBLIN genes was cross-checked with the 11 MycSL kinome hits in a screen using human mammary epithelial cells (HMECs), 397 MycSL hits found in an HMEC screen, and 101 MycSL hits from a screen in human foreskin fibroblasts (HFFs) (Kessler et al. 2012; Liu et al. 2012; Toyoshima et al. 2012). The first two studies produced ectopic Myc using an inducible Myc-ER fusion, while the third screen used a retroviral vector to drive expression of ectopic levels of Myc. Thirty-one or ~6.1 % of human CIBLIN genes were found to be MycSL hits in one of the three MycSL screens as indicated. Thus, there is more overlap between the CIBLIN genes and any one MycSL screen than overlap between the MycSL screens themselves. In addition, the list of human CIBLIN genes includes many important factors connected to cancer progression but not directly connected to Myc-related pathways (list of genes in red includes a small sample of relevant genes not listed in other figures). See also Supplementary Table S6 for a breakdown of genes