Abstract
BACKGROUND: Magic roundabout (ROBO4) is an unusual endothelial-specific paralog of the family of neuronally-expressed axon guidance receptors called roundabouts. Endothelial cells (ECs), whose uninterrupted sheet delimits the lumen of all vertebrate blood vessels and which are absent from invertebrate species, are a vertebrate-specific evolutionary novelty.
Year: 2019 PMID: 30802244 PMCID: PMC6389290 DOI: 10.1371/journal.pone.0208952
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. The directions of the data-mining of the ROBO4 paralog.
This conceptual figure illustrates the four major directions of this analysis: (1) the phylogenetics of roundabouts, (2) the protein interactions of ROBO4, (3) the gene expression patterns of roundabouts, and (4) the architecture of the promoter of ROBO4. In the center of the figure, ROBO4 is symbolized by the homology model of the extra-cellular domain of the gene. (Note that there is not yet a crystal structure of ROBO4.)
Fig 2. The phylogenetic tree of roundabouts.
In the tree, nodes corresponding to gene duplications are annotated with a bootstrap value (B) and the taxon of duplication (red labels). The tree suggests a single ancestral bilaterian roundabout in the last common ancestor of vertebrates and insects. There are four vertebrate roundabout paralogs: ROBO1, ROBO2, ROBO3 and ROBO4. They derive from gene duplications dated by phylogenetic timing to the base of vertebrates. In addition, two lineage-specific duplications in D. melanogaster gave rise to three paralogs, robo1-3 (these duplications were not the focus of our analyses). The tree was calculated from the protein-guided nucleotide alignment of roundabout sequences, displayed using TreeViewJ [102], and annotated graphically in Adobe Illustrator using data from the .nhx tree file (S5 File). Extant species are labeled as follows: HUMAN, Homo sapiens; MOUSE, Mus musculus; RAT, Rattus norvegicus; CHICKEN, Gallus gallus; BRARE, Danio rerio; CAEEL, Caenorhabditis elegans; DROME, Drosophila melanogaster. The tree is rooted on time.
The clustered arrangement of roundabouts is conserved in vertebrates.
| Species | ROBO3-ROBO4: size | ROBO3-ROBO4: chr. | ROBO3-ROBO4: boundaries | ROBO2-ROBO1: size | ROBO2-ROBO1: chr. | ROBO2-ROBO1: boundaries |
|---|---|---|---|---|---|---|
| | 30 kb | 11 | 124.87–124.9 Mb | 3.87 Mb | 3 | 75.9–79.77 Mb |
| | 30 kb | 9 | 37.4–37.43 Mb | 1.75 Mb | 16 | 72.65–74.4 Mb |
| | 35 kb | D1 | 21.72–21.75 Mb | 1.8 Mb | C2 | 30.69–32.5 Mb |
| | 32 kb | 5 | 9.55–9.58 Mb | 1.5 Mb | 31 | 8.1–9.6 Mb |
| | 46 kb | 29 | 28.69–28.74 Mb | 1.65 Mb | 1 | 24.35–26 Mb |
| | 26 kb | 7 | 33.4–33.43 Mb | 1.4 Mb | 26 | 10.5–11.9 Mb |
| | 185 kb | scaffold GL172933 | 1.015–1.2 Mb | 0.6 Mb | scaffold GL172645 | 0.75–1.35 Mb |
| | 10 kb | 10 | 3.135–3.15 Mb | 0.8 Mb | 15 | 38.95–39.75 Mb |
* In all the species, the arrangements of genes within both clusters are tail-to-tail
** In a tail-to-tail arrangement, but there is an additional copy of robo2 downstream of the cluster.
NOTE: Genomic locations are given in the coordinates of the hg38 assembly.
Fig 3. ROBO4 integrates several functional networks including a neo-functionalized network regulating angiogenesis.
ROBO4 is an endothelial-specific network hub and a signaling bridge that integrates three functional sub-networks of the vertebrate cell: (1) the angiogenesis network, (2) the actin/filopodia network, and (3) the axon guidance network. The legend beneath the network graph provides information on sources of evidence and scores obtained for the interactors. Functional enrichment analysis at the bottom of the figure provides information on enriched gene ontology terms and KEGG pathways. Roman numerals indicate two sets of ohnologs: ROBO1 and ROBO4 (I) and SLIT1-3 (II).
Experimentally-verified protein interactions of ROBO4 in human and mouse.
| Interactor | Interactant | Assay | Reference |
|---|---|---|---|
| Human | Slit guidance ligand 2 ( | Co-immunoprecipitation; | [ |
| Human | Retinoid X receptor alpha ( | [ | |
| Mouse | Slit guidance ligand 3 ( | Co-immunoprecipitation; | [ |
| Mouse | Paxillin ( | Yeast two hybrid; | [ |
| Human | Roundabout 1 ( | Yeast two hybrid, | [ |
| Mouse | Unc-5 Netrin Receptor B ( | Surface plasmon resonance, co-immunoprecipitation; | [ |
| Human | Fms related tyrosine kinase 1 ( | Co-immunoprecipitation; | [ |
Fig 4. The expression profiles of the roundabout TSSes.
Alternative TSSes were identified in the ROBO3-ROBO4 (A) and ROBO2-ROBO1 (B) clusters based on location with reference to the beginnings of RefSeq transcripts and maximal expression (ME), in that order. Expression profiles were inferred from human F5 data and TSSes were visualized using the Zenbu browser. For example, ROBO3 has two alternative TSSes: one is expressed in melanocytes and epithelial cells (ROBO3-TSS2), and the other is characterized by endothelial and neuroectodermal expression (ROBO3-TSS1). In contrast, ROBO4 has a single, sharply defined, strong endothelial-specific TSS (ROBO4-TSS1). (Graphical gene models of roundabouts derive from the UCSC Genome Browser, modified with Adobe Illustrator.) The boxes of TSSes which are conserved in mouse are highlighted in red. Note that the ROBO4 transcript is 5'-truncated in relation to the other roundabouts. Protein homology to the ROBO4 protein starts only at the third exon of the transcript of ROBO1-TSS1 (ENSEMBL exon ID ENSE00001149757, marked with the blue `^` sign by the gene models of ROBO1). In panel (C), a heatmap of pairwise Pearson correlation coefficients visualizes the co-expression between roundabout TSSes in F5 primary cells. ROBO4-TSS (highlighted in red) is dissimilar in expression and clusters as an out-group to the expression profiles of other TSSes.
The expression profiles of the TSSes of roundabouts.
For each TSS, we show either the expression signal in each individual library (bold font) or enrichment in the sets of human F5 samples grouped in sample ontologies.
| GENE-TSS | The location of the TSS | The top five tissues of expression with signal in tags per million (TPM) | ||
|---|---|---|---|---|
| ROBO4-TSS1 | chromosome 11: | |||
| ROBO3-TSS2 | chromosome 11: | |||
| ROBO3-TSS1 | chromosome 11: | |||
| ROBO2-TSS2 | chromosome 3: | |||
| ROBO2-TSS1 | chromosome 3: | |||
| ROBO1-TSS1 | chromosome 3: | |||
| ROBO1-TSS2 | chromosome 3: | |||
* Groupings of samples derive from the following sample ontologies: UBERON, or Gene Ontology. Ontologies are sorted according to the Wilcoxon-Mann-Whitney rank-sum enrichment z-score. The top five enriched sample categories are given for each TSS.
Note: locations are given in the coordinates of the hg19 assembly.
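The rank-sum enrichment z-score mentioned in the footnote can be illustrated with a minimal sketch. It assumes the normal approximation to the Wilcoxon-Mann-Whitney statistic without a tie correction to the variance; the function names and the toy TPM values are hypothetical and do not come from the study.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_z(in_group, out_group):
    """z-score of the rank sum of in_group against all remaining samples."""
    n1, n2 = len(in_group), len(out_group)
    ranks = average_ranks(list(in_group) + list(out_group))
    w = sum(ranks[:n1])                      # observed rank sum
    mean = n1 * (n1 + n2 + 1) / 2.0          # expected rank sum under the null
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (w - mean) / sd

# Hypothetical TPM values: three samples in an ontology term, four outside it.
z = rank_sum_z([10.0, 12.0, 15.0], [1.0, 2.0, 3.0, 4.0])
```

A positive z indicates that samples annotated with the term tend to rank higher in expression than the rest, which is the sense in which a sample category is called enriched here.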
ROBO4-TSS1 does not correlate in expression with the other roundabout TSSes.
Pearson correlation coefficients (PCCes) for expression profiles in human F5 libraries from primary cells are given. Both asymptotic P-values (PA) and P-values from sampled randomization (PR) are shown. Interestingly, the negative PCCes of ROBO4 are not significant in the asymptotic test, but are significant in the randomization test*.
| TSS | RefSeq | ROBO3-TSS1/2 | ROBO2-TSS2 | ROBO2-TSS1 | ROBO1-TSS1 | ROBO1-TSS2 |
|---|---|---|---|---|---|---|
| ROBO4-TSS1 | NM_019055 | PCC = -0.01, | ||||
| ROBO3-TSS1/2 | NM_022370 | PCC = 0.03, | ||||
| ROBO2-TSS2 | NM_001128929 | |||||
| ROBO2-TSS1 | NM_002942 | |||||
| ROBO1-TSS1 | NM_133631 |
NOTE: PCCes significant in the randomization test are highlighted in bold. Those which are significant in both the randomization test and the asymptotic test are highlighted in both bold and italics.
* The randomization test rejects a data-conditioned null hypothesis: that the value of correlation is not different from the distribution obtained for the genomic background.
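As a rough illustration of the randomization test described in the footnote, the sketch below compares an observed PCC with PCCs computed for randomly drawn pairs of background profiles. The sampling scheme, the one-sided (lower-tail) comparison, and all names and values are assumptions made for illustration; the original analysis may differ in each of these respects.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile has no defined correlation
    return sxy / math.sqrt(sxx * syy)

def background_p_value(observed, background, n_pairs=1000, seed=0):
    """Empirical lower-tail P-value: the share of random background pairs
    whose PCC is at most the observed PCC (with a +1 pseudo-count)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_pairs):
        a, b = rng.sample(background, 2)
        if pearson(a, b) <= observed:
            hits += 1
    return (hits + 1) / (n_pairs + 1)

# Hypothetical background of positively co-expressed profiles.
background = [[1, 2, 3], [2, 4, 6], [1, 3, 5], [0, 1, 2]]
p_r = background_p_value(-0.5, background)
```

Under this scheme a weakly negative PCC can be significant even when its asymptotic P-value is not, provided the background pairs are predominantly positively correlated.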
Promoter architectures associated with TSSes of roundabouts.
Strong sites* are shown in bold font. Those detected in ECs are marked with (+). TFBSes of GATA2 and GATA3, as well as of AP-1 subunits (FOS and JUN family), are underlined. The sites do not include RNA polymerase II peaks, whose number correlates trivially with the level of gene expression. We note that in some loci high-affinity sites were shown to be inhibitory while low-affinity sites were shown to activate transcription [108]. This is not the case for ROBO4, as most of its strong TFBSes were activating endothelial sites. However, several weak sites of ROBO4-TSS1 were also of endothelial origin, suggesting that both strong and weak TFBSes could be functional in this promoter architecture.
| TSS | ENCODE TFBSes |
|---|---|
| ROBO4-TSS1 | |
| ROBO3-TSS2 | |
| ROBO3-TSS1 | |
| ROBO2-TSS1 | |
| ROBO2-TSS2 | |
| ROBO1-TSS1 | |
| ROBO1-TSS2 |
* The quality score cutoff of 500 divided ENCODE TFBSes into strong and weak sites.
Accelerated evolution and positively selected sites in ROBO4.
The table shows the values of log likelihoods (ℓ) and the estimates of parameters under different models of the rate of evolution among the codons of ROBO4. The models were applied to the small dataset only. The first group of models average the rates of evolution over the entire tree. The model assuming one parameter (i.e., the one-ratio model) calculates a single ω (the dN/dS ratio), which equaled 0.1095, for the entire tree. This model was the least likely (ℓ = -14869). The next group of models allow the rate of evolution to vary between branches. The free-ratio model calculates a separate rate of evolution for each branch. The branch model calculates one rate of evolution for the ROBO4 branch (foreground) and another for the remaining branches (background). The branch-site model allows the rate of evolution to vary both among sites and between the foreground and background.
| Model | The estimates of parameters | Positively selected sites | |||
|---|---|---|---|---|---|
| 1 | -14869 | Not allowed | |||
| 2 | -14849 | Not allowed | |||
| 4 | -14849 | None | |||
| 2 | -14714 | Not allowed | |||
| 5 | -14715 | None | |||
| 15 | -14790 | Not allowed | |||
| 2 | -14864 | Not allowed | |||
| 2 | -14830 | Not allowed | |||
| 4 | -14808 | 58 sites (BEB>95%). 12 sites (BEB>99%) out of 630 | |||
NOTE: p stands for the number of parameters under different models. ℓ stands for log-likelihood. Branch lengths are fixed at their maximum likelihood estimates under the one-ratio model. The sites of positive selection are inferred at a BEB score higher than 95%.
Likelihood ratio tests on PAML data.
| Models compared | D.f. | Chi-squared | Conclusions | |
|---|---|---|---|---|
| One-ratio | 79 | 14 | Rejects the one-ratio model | |
| One-ratio | 10 | 1 | Rejects the one-ratio model | |
| Branch-site null | 44 | 2 | Rejects the branch-site null model |
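For readers unfamiliar with likelihood ratio tests on nested codon models, the following is a minimal sketch of the standard calculation: twice the difference in log-likelihood is compared with a chi-squared distribution whose degrees of freedom equal the difference in the number of parameters. The chi-squared tail is computed in closed form for df = 1 or even df only, and pairing the log-likelihoods -14869 (p = 1) and -14790 (p = 15) as a one-ratio versus free-ratio comparison is an assumption made for illustration.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of the chi-squared distribution.
    Closed forms only: df = 1, or any even df."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df > 0 and df % 2 == 0:
        half = x / 2.0
        term, total = 1.0, 1.0
        # Sum of (x/2)^i / i! for i = 0 .. df/2 - 1.
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    raise ValueError("this sketch handles only df = 1 or even df")

def likelihood_ratio_test(lnl_null, p_null, lnl_alt, p_alt):
    """Return (2 * delta lnL, degrees of freedom, P-value) for nested models."""
    stat = 2.0 * (lnl_alt - lnl_null)
    df = p_alt - p_null
    return stat, df, chi2_sf(stat, df)

# Assumed pairing of table rows: p = 1 (one-ratio) against p = 15.
stat, df, p_value = likelihood_ratio_test(-14869.0, 1, -14790.0, 15)
```

A small P-value means the simpler (null) model fits significantly worse than the richer alternative and is rejected.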
Fig 5. The promoter architecture associated with ROBO4-TSS1 is dissimilar to those of other roundabout TSSes.
In this figure, the promoter architecture of ROBO4-TSS1 is compared against the architectures of other roundabouts using two approaches: pairwise Manhattan distances and a Venn diagram. In panel (A), a heatmap visualizes Manhattan distances between vectorized promoter architectures of the TSSes. Short distances indicate similarity, long distances indicate divergence. The average distance to ROBO4-TSS1 was 10.7 (N = 6). The distances between the similar promoter architectures of ROBO2-TSS1, ROBO2-TSS2, ROBO1-TSS2, and ROBO3-TSS1 (N = 6), which cluster in the heatmap, are significantly shorter (mean 3.7, Wilcoxon P-value 0.005). To put these distances in context, we note that the average distance between all roundabout TSSes (N = 21, all pairwise comparisons excluding self-comparisons) equaled 8.67, roughly half the average distance between random pairs of RefSeq promoters (16.259, N = 834,585,940, Wilcoxon P-value = 0.03614). Panel (B) shows a Venn diagram for the sets of strong TFBSes in the architectures of ROBO4-TSS1, ROBO1-TSS1, ROBO2-TSS1, ROBO3-TSS1 and ROBO3-TSS2. The five architectures have only one TF binding site in common: CTCF.
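The distance calculation in panel (A) can be sketched as follows, assuming each promoter architecture is encoded as a binary presence/absence vector over a shared list of transcription factors. The encoding, the TSS names and the site sets are hypothetical; the original vectors may, for example, hold counts rather than 0/1 values.

```python
from itertools import combinations

def vectorize(architecture, vocabulary):
    """Encode a set of TFBSes as a 0/1 vector over the shared vocabulary."""
    return [1 if tf in architecture else 0 for tf in vocabulary]

def manhattan(u, v):
    """Sum of absolute coordinate differences between two vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

# Hypothetical promoter architectures (sets of bound transcription factors).
architectures = {
    "TSS_A": {"CTCF", "GATA2", "FOS"},
    "TSS_B": {"CTCF", "JUN"},
    "TSS_C": {"CTCF", "GATA2", "FOS", "JUN"},
}

vocabulary = sorted(set().union(*architectures.values()))
vectors = {name: vectorize(sites, vocabulary)
           for name, sites in architectures.items()}

# All pairwise comparisons, excluding self-comparisons.
distances = {(a, b): manhattan(vectors[a], vectors[b])
             for a, b in combinations(sorted(vectors), 2)}

mean_distance = sum(distances.values()) / len(distances)
```

With seven TSSes the same enumeration yields 21 unordered pairs, which matches the N = 21 quoted in the caption.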
Gene families of putative 2R-ohnologs in the signaling and regulatory networks of ROBO4.
Paralogs were inferred from TreeFam v9 from trees corresponding to family IDs given (www.treefam.org/). In each case, the most recent duplication event was placed at the base of vertebrates by phylogenetic timing. The paralogs listed are descendants of such vertebrate duplication nodes. The ancestral bilaterian gene was inferred on the basis of the fly or worm ortholog. Paralogs directly in the network are highlighted in bold font. Note that many gene members of these gene families are preferentially expressed in ECs, for example: KDR, FLT1, TIE1, TIE2, ROBO4, GATA2.
| TreeFam family ID | 2R-ohnologs (vertebrate paralogs) | Ancestral bilaterian gene |
|---|---|---|
| Receptors | ||
| TF325768 | kinase insert domain receptor (KDR); | PDGF- and VEGF-receptor related (Pvr) |
| TF317568 | tyrosine kinase with immunoglobulin like and EGF like domains 1 (TIE1); | Unknown |
| TF351053 | roundabout guidance receptor (robo) | |
| Ligands | ||
| TF319554 | vascular endothelial growth factor A (VEGFA); | PDGF- and VEGF-related factor 1 (Pvf1) |
| TF336658 | angiopoietin 1 (ANGPT1); | uncertain or missing |
| TF332887 | secreted glycoprotein Slit (sli) | |
| Transcription factors | ||
| TF315391 | GATA binding protein 1 (GATA1); | Grain (Grn) |
| TF106430 | CCCTC-binding factor (CTCF); | CTCF |
| TF318648 | signal transducer and activator of transcription 1 (STAT1); | signal transducer and activator of transcription 1 (Sta-1) |
| TF316127 | forkhead box A1 (FOXA1); | Forkhead (fkh) |
The accessions of the sequences of roundabouts.
| Species | Gene | Protein accession | Amino-acids | Coding sequence |
|---|---|---|---|---|
| Human | ROBO1 | NP_002932 | 1651 | CCDS54611 |
| Human | ROBO2 | NP_001276969 | 1443 | NM_00129004, 644–4975 bp |
| Human | ROBO3 | NP_071765 | 1386 | CCDS44755 |
| Human | ROBO4 | NP_061928 | 1007 | CCDS8455 |
| Mouse | Robo1 | NP_062286 | 1612 | CCDS37376 |
| Mouse | Robo2 | NP_780758 | 1508 | CCDS49886 |
| Mouse | Robo3 | NP_001158239 | 1402 | CCDS52770 |
| Mouse | Robo4 | NP_001296319 | 1022 | CCDS80976 |
| Rat | Robo1 | NP_071524 | 1651 | NM_022188, 1–4956 bp |
| Rat | Robo2 | NP_115289 | 1512 | NM_032106, 459–4997 bp |
| Rat | Robo3 | NP_001101605 | 1305 | NM_001108135, 352–4269 bp |
| Rat | Robo4 | NP_852040 | 961 | NM_181375, 1–2886 bp |
| Zebrafish | Robo1 | NP_001296753 | 1646 | NM_001309824, 699–5639 bp |
| Zebrafish | Robo2 | NP_571708 | 1513 | NM_131633, 168–4709 bp |
| Zebrafish | Robo3 | NP_001315345 | 1419 | NM_001328416, 355–4614 bp |
| Zebrafish | Robo4 | XP_689255 | 1134 | XM_684163, 647–4051 bp |
| Chicken | Robo1 | XP_015153996 | 1573 | XM_015298510, 326–5047 bp |
| Chicken | Robo2 | XP_015154089 | 1516 | XM_015298603, 419–4969 bp |
| Chicken | Robo3 | XP_015153567 | 1232 | XM_015298081, 196–3894 bp |
| Chicken | Robo4 | XP_015153568 | 1064 | XM_015298082, 335–3529 bp |
| Fly | Robo1 | NP_476899 | 1395 | NM_057551, 176–4363 bp |
| Fly | Robo2 | NP_536792 | 1463 | NM_080531, 289–4680 bp |
| Fly | Robo3 | NP_001259866 | 1342 | NM_001272937, 475–4503 bp |
| Worm | Sax-3 | NP_001024990 | 1273 | NM_001029819, 1–3822 bp |
The protein domains of roundabouts.
| Gene ID | Domain name | E-value | START | END |
|---|---|---|---|---|
| ROBO1 | Ig_3 | 1.5e-84 | 67 | 151 |
| ROBO1 | I-set | 2.3e-90 | 172 | 257 |
| ROBO1 | Ig_3 | 1.5e-84 | 261 | 334 |
| ROBO1 | Ig_3 | 1.5e-84 | 350 | 432 |
| ROBO1 | Ig_3 | 1.5e-84 | 454 | 528 |
| ROBO1 | V-set | 3.5e-28 | 504 | 537 |
| ROBO1 | fn3 | 8e-35 | 562 | 646 |
| ROBO1 | V-set | 3.5e-28 | 675 | 721 |
| ROBO1 | fn3 | 8e-35 | 778 | 864 |
| ROBO1 | Ig_2 | 5.6e-52 | 944 | 961 |
| ROBO1 | C2-set_2 | 5.2e-18 | 1253 | 1269 |
| ROBO2 | Ig_5 | 3.2e-11 | 104 | 114 |
| ROBO2 | I-set | 1.4e-88 | 135 | 221 |
| ROBO2 | I-set | 1.4e-88 | 225 | 310 |
| ROBO2 | Ig_3 | 1.2e-82 | 317 | 397 |
| ROBO2 | Ig_3 | 1.2e-82 | 421 | 495 |
| ROBO2 | fn3 | 1.5e-35 | 528 | 611 |
| ROBO2 | Ig_3 | 1.2e-82 | 651 | 669 |
| ROBO2 | fn3 | 1.5e-35 | 743 | 830 |
| ROBO3 | V-set | 5.9e-30 | 137 | 161 |
| ROBO3 | I-set | 1.1e-82 | 168 | 249 |
| ROBO3 | I-set | 1.1e-82 | 258 | 343 |
| ROBO3 | Ig_3 | 3.3e-78 | 346 | 426 |
| ROBO3 | Ig_3 | 3.3e-78 | 450 | 524 |
| ROBO3 | fn3 | 2.1e-31 | 558 | 641 |
| ROBO3 | I-set | 1.1e-82 | 682 | 694 |
| ROBO3 | fn3 | 2.1e-31 | 782 | 857 |
| ROBO3 | C2-set_2 | 1.4e-15 | 1143 | 1162 |
| ROBO4 | Ig | 1.5e-13 | 107 | 124 |
| ROBO4 | I-set | 3.6e-24 | 138 | 220 |
| ROBO4 | fn3 | 1.5e-10 | 264 | 333 |
| ROBO4 | fn3 | 1.5e-10 | 350 | 432 |
| ROBO4 | Ig_2 | 6.7e-17 | 985 | 1005 |
NOTE: I-set (PF07679.14), V-set (PF07686.15), C2-set_2 (PF08205.10), Ig (PF00047.23), Ig_2 (PF13895.4), and Ig_3 (PF13927.4) are sub-types of immunoglobulin domains; fn3 is the fibronectin type III domain (PF00041.19). The E-value, that is, the expected number of random hits of equal strength, was calculated by the program hmmscan from the HMMER v3.1 package.
The smaller number of protein-binding domains in the ROBO4 extra-cellular domain is the consequence of the loss of coding exons. Human ROBO4 has 18 exons, while ROBO1, ROBO2 and ROBO3 have 29, 27 and 28, respectively. (The numbers of exons were derived from the UCSC Genome Browser on the hg38 assembly, using the gene models for ROBO4 (transcript ENST00000306534.7), ROBO1 (ENST00000467549.5), ROBO2 (ENST00000332191.12) and ROBO3 (ENST00000397801.5).)
* START and END positions are given in the coordinates of the amino-acid sequence of the roundabout.
The locations of the promoters of roundabouts in the human genome (hg19 coordinates) and corresponding regions in mouse (mm9).
For each promoter, maximal expression (ME) for both human and mouse is given in square brackets.
| Promoter ID | Closest RefSeq | ENSEMBL gene ID | Location in the human genome and maximal expression [ME] | Top mouse BLAT hit: location (score) and [ME] Conserved? |
|---|---|---|---|---|
| ROBO4-TSS1 | NM_019055 | ENSG00000154133 | chr11:124764760–124770760 | chr9:37208261–37212629 ( |
| ROBO3-TSS2 | NM_022370 | ENSG00000154134 | chr11:124732300–124738300 | chr9:37239055–37242808 ( |
| ROBO3-TSS1 | NM_022370 | ENSG00000154134 | chr11:124743700–124749700 | chr9:37225070–37231494 ( |
| ROBO2-TSS2 | NM_001128929 | ENSG00000185008 | chr3:75952845–75958845 | chr16:75420561–75421064 ( |
| ROBO2-TSS1 | NM_002942 | ENSG00000185008 | chr3:77086294–77092294 | chr16:74409016–74413502 ( |
| ROBO1-TSS1 | NM_133631 | ENSG00000169855 | chr3:79065600–79071600 | chr16:72662054–72666244 ( |
| ROBO1-TSS2 | NM_002841 | ENSG00000169855 | chr3:79814200–79820200 | chr16:72024757–72029898 ( |