Samantha J England, Paul C Campbell, Santanu Banerjee, Annika J Swanson, Katharine E Lewis.
Abstract
Polycystic kidney disease (PKD) proteins are transmembrane proteins that have crucial roles in many aspects of vertebrate development and physiology, including the development of many organs as well as left-right patterning and taste. They can be divided into structurally distinct PKD1-like and PKD2-like proteins, and usually one PKD1-like protein forms a heteromeric polycystin complex with a PKD2-like protein. For example, PKD1 forms a complex with PKD2, and mutations in either of these proteins cause Autosomal Dominant Polycystic Kidney Disease (ADPKD), which is the most frequent potentially-lethal single-gene disorder in humans. Here, we identify the complete family of pkd genes in zebrafish and other teleosts. We describe the genomic locations and sequences of all seven genes: pkd1, pkd1b, pkd1l1, pkd1l2a, pkd1l2b, pkd2, and pkd2l1. pkd1l2a/pkd1l2b are likely to be ohnologs of pkd1l2, preserved from the whole genome duplication that occurred at the base of the teleosts. However, in contrast to mammals and cartilaginous and holostei fish, teleosts lack pkd2l2 and pkdrej genes, suggesting that these have been lost in the teleost lineage. In addition, teleost and holostei fish have only a partial pkd1l3 sequence, suggesting that this gene may be in the process of being lost in the ray-finned fish lineage. We also provide the first comprehensive description of the expression of zebrafish pkd genes during development. In most structures we detect expression of one pkd1-like gene and one pkd2-like gene, consistent with these genes encoding a heteromeric protein complex. For example, we found that pkd2 and pkd1l1 are expressed in Kupffer's vesicle and pkd1 and pkd2 are expressed in the developing pronephros. In the spinal cord, we show that pkd1l2a and pkd2l1 are co-expressed in KA cells. We also identify potential co-expression of pkd1b and pkd2 in the floor-plate. Interestingly, and in contrast to mouse, we observe expression of all seven pkd genes in regions that may correspond to taste receptors. Taken together, these results provide a crucial catalog of pkd genes in an important model system for elucidating cell and developmental processes and modeling human diseases, and the most comprehensive analysis of embryonic pkd gene expression in any vertebrate.
Keywords: PKD; TRP proteins; dorsal forerunner cells; kidney; node; polycystin; spinal cord; taste buds
Year: 2017 PMID: 28271061 PMCID: PMC5318412 DOI: 10.3389/fcell.2017.00005
Source DB: PubMed Journal: Front Cell Dev Biol ISSN: 2296-634X
Figure 1. Mapping. Summary of mRNA transcript mapping results for pkd1 (A), pkd1l2a (B), and pkd1l2b (C). Approximate length in base pairs (bp) is indicated by scale at top of each panel. Mapped transcripts are shown in next row of each panel. Coding sequence is blue and UTR is gray. Numbers flanking these mapped transcripts indicate nucleotide positions. Black vertical lines (coding sequence box, A) indicate putative start codon, and morpholino sequence positions. Purple vertical lines (coding sequence boxes, A–C) indicate exon boundaries, where known. Mapped PCR amplicons generated in this study are indicated with white boxes. Red indicates riboprobe sequences used in this study. Dark blue indicates novel sequence identified in this study but not currently present in Ensembl GRCz10 genome. Genbank reference sequences used at beginning of this study are shown as pink boxes. Magenta lines with double arrows beneath these indicate regions of sequence homology identified at start of this project. Genbank reference sequences identified during this study are shown as orange boxes. Ensembl Zv9 transcript sequences are shown as lilac boxes. Ensembl GRCz10 transcript sequences are shown as green boxes. Numbers beneath sequences show nucleotide positions. ∧, break in aligned sequence. Thin purple vertical lines in green boxes indicate exon boundaries, where known. Key exons for interpreting mapping results are numbered. (A) Our newly mapped pkd1 transcript contains all but 173 bases of the older Zv9 ENSDART00000039911 transcript (lilac boxes) as well as all but the first 45 nucleotides of the current GRCz10 ENSDART00000039911 transcript (green boxes). We have also identified additional 5′ sequence and missing regions of coding sequence. The GRCz10 ENSDART00000039911 transcript corresponds to nucleotides 2975–18401 of our mapped transcript. 
The Zv9 ENSDART00000039911 transcript corresponds to nucleotides 218–13798 of our mapped transcript but contains some gaps (nucleotides 292–357, 430–597, 875–925, 1778–1786, 1935–1973, 8363–8410, 9804–9811, 10715–10725, 11608–11640, 12356, 12359–12389, 12852–12890, and 13109–13675 of our mapped transcript). Inverse PCR identified 5′ transcript sequence along with a novel stretch of nucleotides (292–357) absent from the GRCz10 Ensembl genome (shown in dark blue). Nucleotides 2659–12720 of the Zv9 transcript are almost 100% identical to nucleotides 46–10179 of the GRCz10 transcript and these regions align with nucleotides 2975–13108 of our mapped transcript. Nucleotides 10180–10746 of the GRCz10 transcript share no homology with the Zv9 transcript, but correspond to nucleotides 13109–13675 of our mapped transcript. Nucleotides 12721–12843 of the Zv9 transcript share 100% homology with nucleotides 10747–10869 of the GRCz10 transcript and correspond to nucleotides 13676–13798 of our mapped transcript. The coding sequence of the GRCz10 transcript terminates 30 nucleotides downstream of the Zv9 transcript and is followed by 4573 bp of unique 3′ UTR sequence. Using RT-PCR we have confirmed that our mapped transcript utilizes the same stop codon and 3′ UTR sequence. Specifically, we have confirmed that nucleotides 10192–11136, 11173–12085, 12598–13484, and 14524–15376 of 3′ UTR sequence in the GRCz10 transcript are transcribed and map to nucleotides 13121–14065, 14102–15014, 15527–16413, and 17453–18395 of our mapped transcript, respectively. Our inverse PCR revealed 217 nucleotides of coding sequence upstream of the Zv9 transcript and 66 nucleotides of novel coding sequence between nucleotides 71 and 72 of the Zv9 transcript. In total, this produces an 18401 bp transcript that encodes a 4608 amino acid protein and we have deposited this sequence in NCBI (NCBI accession number KY074550). This sequence lacks a start methionine. 
* indicates an in-frame methionine at nucleotides 527–529. However, if this is the start codon, the resulting protein would lack the leucine rich repeat domain, encoded by the 175 amino acids in-frame upstream of this methionine, that is present in mouse, human and stickleback PKD1. There is a putative in-frame start codon a further 54 nucleotides (18 amino acids) upstream of our present transcript, which we think is more likely to be the true start codon. The location of the splice-blocking morpholino sequence used by Mangos et al. (2010) that resulted in kidney cysts in some animals is also indicated (nucleotides 1187–1197 of the Zv9 transcript). (B) Our current transcript for pkd1l2a encompasses both LOC101884812 and XM_002662913 and contains additional exons not present in either of these sequences. The start of the current ENSDART00000173234.1 transcript coincides with the start of exon 23 in our longer transcript, but the first exon of ENSDART00000173234.1 is shorter than exon 23 in our transcript. Exons 2–3, 4–7, and 9–17 of ENSDART00000173234.1 are identical to exons 24–25, 27–30, and 33–41 of our transcript. Exon 26 of our transcript is absent in ENSDART00000173234.1 and exons 31–32 and intron 31–32 exist as a single exon, exon 8, in ENSDART00000173234.1. (C) Our current transcript for pkd1l2b contains both the si:ch211-168k15.4 and ENSDARG00000101214 (ENSDART00000124969.2) sequences, utilizing a start codon 4 bases upstream of exon 1 in the current si:ch211-168k15.4 annotation, and transitioning between exon 16 of si:ch211-168k15.4 immediately into exon 9 of ENSDART00000124969.2. Nucleotides 683–7026 of our new 7898 bp mRNA transcript align perfectly with the original XM_009303604 6344 bp reference sequence. The start codon was identified in this study along with novel 5′ UTR sequence.
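The coordinate correspondences quoted in panel (A) of this caption can be sanity-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis: it assumes that GRCz10 transcript positions relate to the mapped pkd1 transcript by a single constant offset (taken from the first quoted pair, 46 to 2975), and the names `OFFSET`, `PAIRS`, `to_mapped`, and `inconsistent_pairs` are hypothetical. Under that assumption, every quoted interval fits except the last 3′ UTR interval, whose quoted end position differs from the offset-based value; the sketch only flags this and does not say which number is intended.

```python
# Assumed constant offset between GRCz10 ENSDART00000039911 coordinates and
# the mapped pkd1 transcript, taken from the first quoted pair (46 -> 2975).
OFFSET = 2975 - 46  # 2929

# (GRCz10 start, GRCz10 end, mapped start, mapped end), copied from the caption.
PAIRS = [
    (46, 10179, 2975, 13108),
    (10180, 10746, 13109, 13675),
    (10747, 10869, 13676, 13798),
    (10192, 11136, 13121, 14065),
    (11173, 12085, 14102, 15014),
    (12598, 13484, 15527, 16413),
    (14524, 15376, 17453, 18395),
]


def to_mapped(grcz10_pos):
    """Convert a GRCz10 transcript position to a mapped-transcript position."""
    return grcz10_pos + OFFSET


def inconsistent_pairs(pairs=PAIRS):
    """Return the quoted intervals that do not fit the constant offset."""
    return [
        p for p in pairs
        if (to_mapped(p[0]), to_mapped(p[1])) != (p[2], p[3])
    ]


if __name__ == "__main__":
    print("offset:", OFFSET)
    for pair in inconsistent_pairs():
        print("does not fit the constant offset:", pair)
```

A constant offset is a deliberately simple model; it would not hold for the Zv9 comparison, where the caption lists gaps in the alignment.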
Characterization of teleost pkd genes.
| Species | Former gene names | Genomic position | Ensembl gene ID | Ensembl annotation accurate and complete? | Mapped in this study? | Transcript length (bp) | Complete transcript available? | Protein length (aa) | Protein domains (aa positions) | Tblastn hits (zebrafish query aa region, % identity) | Supporting data |
|---|---|---|---|---|---|---|---|---|---|---|---|
| pkd1 | | | | | | | | | | | |
| Zebrafish | | 1(–): 53597305–53745945 | ENSDARG00000030417; | No: 5′ coding sequence incomplete | Yes | 18401 | No∧ | 4608 | Leucine rich repeat (93–149); Carbohydrate binding WSC domain (201–273); Lectin C-type domain (454–551); PKD domains (1067–1151, 1175–1236, 1257–1317, 1343–1402, 1425–1491, 1516–1572, 1597–1657, 1682–1747, 1854–1914, 1933–1998, 2112–2174); REJ domain (2210–2662); PLAT/LH2 domain (3198–3304); Polycystin-cation-channel domain (3786–4156) | Pkd2 (aa 54–95, 48%); | Phylogeny, synteny, domain structure, expression, and morpholino data (see text). |
| Green spotted pufferfish | None | 18(+): 2403952–2439321 | ENSTNIG00000014075 | Yes | No | 13572 | Yes | 4523 | Carbohydrate binding WSC domain (201–281); Lectin C-type domain (441–544); PKD domains (1065–1155, 1172–1237, 1262–1319, 1342–1400, 1429–1496, 1521–1579, 1602–1663, 1693–1761, 2130–2186); REJ domain (2224–2690); PLAT/LH2 domain (3230–3336); Polycystin-cation-channel domain (3920–4171) | Pkd1 (aa 1–371, 34%) | Phylogeny, synteny, and domain structure. |
| Medaka | | 1(+): 31241283–31246625 | ENSORLG00000011636 (Novel) | No: 5′ and 3′ coding sequence incomplete | No | 1926 | No | 642 | Polycystin-cation-channel domain (95–508) | Pkd1 (aa 135–331, 56%) | Phylogeny and synteny. |
| Medaka | | Scaffold1073 (+): 14296–55909 | ENSORLG00000019755 | No: 5′ and 3′ coding sequence incomplete | No | 10200 | No | 3400 | Carbohydrate binding WSC domain (114–187); Lectin C-type domain (342–442); PKD domains (204–257, 981–1072, 1089–1155, 1178–1236, 1259–1323, 1347–1411, 1451–1493, 1514–1579, 1612–1673, 1694–1754, 2029–2094); REJ domain (2131–2599); PLAT/LH2 domain (3146–3251) | | Domain structure. |
| Stickleback | None | GroupIX(+): 14963056–15005610 | ENSGACG00000018946 | No: 5′ and 3′ coding sequence incomplete | No | 12849 | No | 4283 | Leucine rich repeat (72–127); Carbohydrate binding WSC domain (179–254); PKD domains (289–341, 1037–1126, 1143–1208, 1235–1290, 1311–1377, 1404–1466, 1495–1550, 1573–1633, 1663–1732, 1839–1897, 1917–1983, 2096–2156); Lectin C-type domain (435–535); REJ domain (2194–2649); PLAT/LH2 domain (3158–3263); Polycystin-cation-channel domain (3754–4098) | Pkd1 (aa 8–74 and 135–371, 50%) | Phylogeny, synteny, and domain structure. |
| pkd1b | | | | | | | | | | | |
| Zebrafish | None | 12(+): 21576321–21626795 | ENSDARG00000033029; | Possible additional 5′ coding sequence (see text) | No | 11674 | Yes | 3827 | PKD domains (543–613, 641–696, 735–775, 895–951, 968–1036, 1493–1552); REJ domain (1589–2031); PLAT/LH2 domain (2519–2624); Polycystin-cation-channel domain (3250–3542) | No significant hits other than with itself. | Phylogeny with other teleosts (see text). |
| Green spotted pufferfish | None | 2(–): 4745387–4760923 | ENSTNIG00000004457 | No: 5′ and 3′ coding sequence incomplete | No | 10137 | No | 3379 | PKD domains (210–296, 313–378, 406–467, 486–550, 645–702, 989–1049, 1249–1311); REJ domain (1346–1787); PLAT/LH2 domain (2293–2397); Polycystin-cation-channel domain (2980–3242) | Pkd1b (aa 300–367, 57%) | Phylogeny and domain structure. Synteny with Stickleback |
| Medaka | Not applicable. | | | | | | | | | | |
| Stickleback | | GroupV(–): 6686030–6695058 | ENSGACG00000005441 (Novel) | No: 5′ and 3′ coding sequence incomplete | No | 4044 | No | 1348 | REJ domain (9–211); PLAT/LH2 domain (718–822) | | Synteny with Green spotted pufferfish |
| pkd1l1 | | | | | | | | | | | |
| Zebrafish | | 24(+): 17140489–17184898 | ENSDARG00000099162 | No: 3′ coding sequence incomplete | No | 6486 | Yes (see text). | 2162 | PKD domains (16–73, 99–159); REJ domain (200–781); GPS motif (1213–1251); PLAT/LH2 domain (1322–1251); Polycystin-cation-channel domain (1960–2140) | Pkd1l2a (aa 4–49, 37%); | Phylogeny. Synteny and expression data with other teleosts. Domain structure (see text). |
| Green spotted pufferfish | None | 6(+): 2144742–2156425 | ENSTNIG00000000844 (Novel) | No: 5′ and 3′ coding sequence incomplete | No | 6459 | No | 2153 | PKD domains (12–74, 98–158); REJ domain (203–786); PLAT/LH2 domain (1293–1411); Polycystin-cation-channel domain (1925–2153) | Pkd1l1 (aa 3–67, 43%); | Phylogeny. Synteny with other teleosts. Domain structure (see text). |
| Medaka | None | 20(+): 13051104–13070105 | No | Not applicable | No | 8226≫ | No: 5′ and 3′ coding sequence incomplete | 2742 | PKD domains (294–350, 380–440); REJ domain (482–1081); GPS motif (1541–1574); PLAT/LH2 domain (1650–1768); Polycystin-cation-channel domain (2270–2530) | Pkd1l1 (aa 2–65, 52%) | Phylogeny. Synteny and expression data with other teleosts. Domain structure (see text). |
| Stickleback | None | GroupXXI(+): 5717457–5729908 | No | Tblastn with the full-length zebrafish Pkd1l1 protein sequence identified this non-annotated region of homology on the forward strand of GroupXXI. Unfortunately, there is no mRNA or reference sequence available to determine the protein structure for this region. However, the genes flanking this locus show synteny with those flanking the other teleost | | | | | | Pkd1l1 (aa 3–67, 48%); | Synteny with other teleost |
| pkd1l2a | | | | | | | | | | | |
| Zebrafish | | 7(-): 64953104–65012621 | ENSDARG00000105344; | No: 5′ coding sequence incomplete | Yes | 8526 | Yes∧ | 2485 | Lectin C-type domain (51–159); Galactose binding lectin domain (174–258); REJ domain (648–919); GPS motif (1298–1335); PLAT/LH2 domain (1408–1512); Polycystin-cation-channel domain (2019–2437) | Pkd1 (aa 217–296, 24%); | Phylogeny, synteny, domain structure, and expression data (see text). |
| Green spotted pufferfish | | 5(–): 6132261–6135595 | ENSTNIG00000009353 (Novel) | No: 5′ and 3′ coding sequence incomplete | No | 2109 | No | 703 | Polycystin-cation-channel domain (250–657) | Pkd1l1 (aa 4–109, 38%); | Phylogeny. Synteny with other teleost |
| Medaka | | 3(+): 18867151–18882761 | ENSORLG00000007572 (Novel) | No: 5′ and 3′ coding sequence incomplete | No | 7410 | No | 2470 | Lectin C-type domain (47–157); Galactose binding lectin domain (172–255); REJ domain (600–872); GPS motif (1276–1314); PLAT/LH2 domain (1387–1500); Polycystin-cation-channel domain (1992–2412) | Pkd1l2a (aa 32–426, 56%); | Phylogeny. Synteny with other teleost |
| Stickleback | | GroupII(+): 10511520–10528238 | ENSGACG00000015742 (Novel) | No: 3′ coding sequence incomplete | No | 7373 | No | 2458 | Lectin C-type domain (49–158); Galactose binding lectin domain (173–256); REJ domain (644–895); GPS motif (1285–1323); PLAT/LH2 domain (1396–1499); Polycystin-cation-channel domain (1999–2416) | Pkd1l2a (aa 30–426, 68%); | Phylogeny. Synteny with other teleost |
| pkd1l2b | | | | | | | | | | | |
| Zebrafish | | 7(+): 67029232–67087646 | ENSDARG00000101214; | Incomplete: incorrect exon boundaries (see text) | Yes | 7898 | Yes∧ | 1902 | Lectin C-type domain (42–150); Galactose binding lectin domain (164–245); GPS motif (760–797); PLAT/LH2 domain (871–895); Polycystin-cation-channel domain (1441–1857) | Pkd1 (aa 152–197, 37%); | Phylogeny and domain structure (see text). |
| Green spotted pufferfish | | Un_random (+): 65025810–65043223 | ENSTNIG00000003968 (Novel) | No: 5′ and 3′ coding sequence incomplete | No | 7143 | No | 2381 | Lectin C-type domain (10–117); GPS motif (1249–1287); Polycystin-cation-channel domain (1927–2352) | Pkd1l1 (aa 4–87, 23%); | Phylogeny. |
| Medaka | | 21(–): 29489012–29547025 | ENSORLG00000018124 (Novel) | No: 5′ and 3′ coding sequence incomplete | No | 7293 | No | 2431 | Lectin C-type domain (44–157); Galactose binding lectin domain (173–255); GPS motif (1278–1316); PLAT/LH2 domain (1389–1491); Polycystin-cation-channel domain (1976–2389) | Pkd1l2a (aa 32–417, 59%); | Phylogeny and domain structure. Synteny with Stickleback |
| Stickleback | | GroupXVI(–): 4027610–4044964 | ENSGACG00000002153 (Novel) | No: 5′ and 3′ coding sequence incomplete | No | 4965 | No | 1655 | GPS motif (500–538); PLAT/LH2 domain (611–713); Polycystin-cation-channel domain (1218–1626) | Pkd1l2a (aa 30–426, 59%); | Phylogeny. Synteny with Medaka |
| pkd1l3 | | | | | | | | | | | |
| Zebrafish | | 7(–): 56186991–56215290 | ENSDARG00000091803 (Zv9) | Yes | No | 3252 | Yes | 1083 | Lectin C-type domain (55–155); GPS motif (493–531); PLAT/LH2 domain (604–713) | | Synteny with amniote |
| Green spotted pufferfish | None | Un_random (–): 80610479–80612299 | ENSTNIG00000005751 (Novel) | Yes | No | 1194 | Yes | 397 | GPS motif (34–68); PLAT/LH2 domain (144–258) | | Synteny with amniote |
| Medaka | None | 3(+): 16335410–16337664 | No | Tblastn with the full-length longest isoform of mouse PKD1L3 protein sequence identified this non-annotated region of homology on the forward strand of Chromosome 3. Unfortunately, there is no mRNA or reference sequence available to determine the protein structure for this region. However, the genes flanking this locus show synteny with those flanking the other teleost putative | | | | | | | Synteny with amniote |
| Stickleback | None | GroupII(+): 8624083–8628300 | ENSGACG00000015469 (Novel) | No: 5′ and 3′ coding sequence incomplete | No | 1575 | No | 525 | GPS motif (183–219); PLAT/LH2 domain (293–409) | | Synteny with amniote |
| pkd2 | | | | | | | | | | | |
| Zebrafish | None | 1(+): 49895013–49913204 | ENSDARG00000014098; | Yes | No | 3336 | Yes | 904 | Polycystin-cation-channel domain (204–624) | Pkd1 (aa 112–195, 45%); | Phylogeny, synteny and domain structure. |
| Green spotted pufferfish | None | 18(+): 7404161–7408783 | ENSTNIG00000011045 | No: 5′ and 3′ coding sequence incomplete | No | 2553 | No | 851 | Polycystin-cation-channel domain (170–589) | Pkd1 (aa 83–348, 22%); | Phylogeny, synteny and domain structure. |
| Medaka | None | 1(–): 20624482–20633236 | ENSORLG00000007003 | Yes | No | 2709 | Yes | 902 | Polycystin-cation-channel domain (211–629) | Pkd1 (aa 15–154, 34%); | Phylogeny, synteny and domain structure. |
| Stickleback | None | GroupIX(–): 7437117–7444710 | ENSGACG00000017335 | No: 5′ coding sequence incomplete | No | 3386 | No | 903 | Polycystin-cation-channel domain (214–633) | Pkd1 (aa 85–154, 43%); | Phylogeny, synteny and domain structure. |
| pkd2l1 | | | | | | | | | | | |
| Zebrafish | None | 13(+): 25324869–25340926 | ENSDARG00000022503; | Yes | No | 2718 | Yes | 790 | Polycystin-cation-channel domain (139–559) | Pkd1 (aa 113–150, 42%); | Phylogeny, domain structure and expression data (see text). Synteny with other teleost |
| Green spotted pufferfish | None | 17(–): 1034193–1037975 | ENSTNIG00000013092 | No: 5′ and 3′ coding sequence incomplete | No | 1977 | No | 659 | Polycystin-cation-channel domain (69–489) | Pkd1 (aa 82–333, 23%); | Phylogeny and domain structure. Synteny with other teleost |
| Medaka | None | 15(+): 29369729–29374015 | ENSORLG00000013731 | No: 5′ and 3′ coding sequence incomplete | No | 2058 | No | 686 | Polycystin-cation-channel domain (149–569) | Pkd1 (aa 83–154, 37%); | Phylogeny and domain structure. Synteny with other teleost |
| Stickleback | None | GroupVI(–): 2954929–2959614 | ENSGACG00000003378 | No: 5′ and 3′ coding sequence incomplete | No | 2250 | No | 750 | Polycystin-cation-channel domain (153–573) | Pkd1 (aa 83–154, 41%); | Phylogeny and domain structure. Synteny with other teleost |
Column 2 lists former gene names used in previous genome assemblies. Column 3 shows the position in the current version of the appropriate genome. Where an Ensembl gene annotation exists that supports a particular pkd gene, this is listed in column 4, and column 5 describes whether this annotation is accurate and/or complete. For zebrafish genes, the ZFIN ID is also given in column 4. Zebrafish transcripts that we have mapped in this study are indicated in column 6. ∧ Mapped sequences have been submitted to NCBI (see Section Materials and Methods). Columns 7 and 9 indicate transcript and protein lengths, in base pairs and amino acids, respectively. Column 8 indicates whether a complete transcript for the gene is available, either on Ensembl or as a result of our analyses in this paper. Column 10 indicates protein domains identified by searching against the Pfam protein database. Column 11 indicates results (percentage amino acid sequence identity) from Tblastn searches of the respective genome with the polycystin-cation-channel domains of each zebrafish Pkd protein. In most cases, a specific gene was only found by a subset of these Tblastn searches. The amino acid regions of the zebrafish query sequence that align with the candidate sequence are indicated. Column 12 lists data that support the annotations shown here (see Section Results for more information). Alternative transcripts are available in Ensembl (see Section Materials and Methods for genome assemblies) for the following species and genes: zebrafish: pkd1b and pkd1l1; green spotted pufferfish: pkd1, pkd1b, pkd1l1 and pkd1l2a; medaka: pkd2l1; and stickleback: pkd1 and pkd2. We cannot rule out the possibility that alternative transcripts may exist for other genes. Where multiple transcripts exist, the longest protein isoform is shown.
Indicates that there are two potential medaka pkd1 sequences. The currently annotated pkd1 gene [ENSORLG00000019755, formerly pkd1 (2 of 2)] is present on Scaffold1073 of the medaka genome. This is the only gene on Scaffold1073 and it encodes a 3400 amino acid protein containing all of the protein domains found within other Pkd1 proteins, with the exception of the carboxy-terminus polycystin-cation-channel domain. Tblastn analysis against the medaka genome with the polycystin-cation-channel domains of each of the zebrafish Pkd proteins did not detect a polycystin-cation-channel domain on Scaffold1073, but it did find one in novel gene ENSORLG00000011636 [formerly pkd1 (1 of 2)] on chromosome 1. This novel gene shares synteny and phylogeny with both mammalian and teleost pkd1 genes (Figures 3, 4A). Tblastn with Scaffold1073 Pkd1 protein sequence failed to detect any more Pkd1-like protein sequences encoded upstream of this novel gene on chromosome 1. Therefore, there is no evidence at present that these two sequences constitute different parts of the same gene, although the protein domains that they encode suggest that this may be the case.
When Tblastn was performed against the stickleback genome with full-length zebrafish Pkd1b, sequence with homology to REJ and PLAT/LH2 domains was detected in novel gene ENSGACG00000005441. This gene was called pkd1b in a previous genome assembly and it has conserved synteny with green spotted pufferfish pkd1b. However, unlike all bona-fide Pkd proteins, this stickleback protein does not contain a carboxy-terminus polycystin-cation-channel domain.
≫ When Tblastn was performed against the medaka genome with the polycystin-cation-channel domain from zebrafish Pkd1l1, a region of homology on chromosome 20 was identified that shared synteny with pkd1l1 genes in other teleosts. Since this genomic region was not annotated, we searched the literature for putative medaka pkd1l1 sequences and identified an 8226 bp partial mRNA sequence lacking both start and stop codons, reported by Kamura et al.
This sequence is a non-annotated region that falls within the final intron (intron 6–7) of the third transcript of the sult5a1 gene (ENSDART00000162934.1) in the current GRCz10 Ensembl assembly. This corresponds to the locus occupied by gene ENSDARG00000091803 in Zv9. We are showing information for the retired Zv9 annotation, since our synteny analyses support this sequence being a partial pkd1l3 ortholog.
This sequence is a non-annotated region that falls between the genes sult5a1 and hp on chromosome 3 in the current assembly of the medaka genome. Sequence homology and synteny analyses suggest that this is a putative medaka partial pkd1l3 ortholog.
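The domain annotations in column 10 of the table above follow a regular "name (start–end, start–end)" pattern, so they can be turned into a small data structure for counting repeats or checking coordinates. The sketch below is illustrative only and is not part of the original study: the function names `parse_domains` and `suspicious_spans` are hypothetical, and the example string is the zebrafish Pkd1 entry copied from the table (protein length 4608 amino acids).

```python
import re

# Column-10 entry for zebrafish Pkd1, copied from the table above.
PKD1_DOMAINS = (
    "Leucine rich repeat (93–149); "
    "Carbohydrate binding WSC domain (201–273); "
    "Lectin C-type domain (454–551); "
    "PKD domains (1067–1151, 1175–1236, 1257–1317, 1343–1402, 1425–1491, "
    "1516–1572, 1597–1657, 1682–1747, 1854–1914, 1933–1998, 2112–2174); "
    "REJ domain (2210–2662); PLAT/LH2 domain (3198–3304); "
    "Polycystin-cation-channel domain (3786–4156)"
)
PKD1_LENGTH = 4608  # protein length in amino acids, from column 9


def parse_domains(cell):
    """Parse 'Name (a–b, c–d); ...' into {name: [(a, b), (c, d)]}."""
    domains = {}
    for entry in cell.split(";"):
        match = re.match(r"\s*(.+?)\s*\((.+)\)\s*$", entry)
        if match is None:
            continue  # skip empty or unparseable fragments
        name, ranges = match.groups()
        spans = []
        for item in ranges.split(","):
            # Ranges use an en dash in the table; also accept a hyphen.
            start, end = re.split(r"[–-]", item.strip())
            spans.append((int(start), int(end)))
        domains[name] = spans
    return domains


def suspicious_spans(domains, protein_length):
    """Return (name, span) pairs that are reversed or exceed the protein."""
    return [
        (name, span)
        for name, spans in domains.items()
        for span in spans
        if not (1 <= span[0] < span[1] <= protein_length)
    ]


if __name__ == "__main__":
    parsed = parse_domains(PKD1_DOMAINS)
    print("PKD domain repeats:", len(parsed["PKD domains"]))
    print("suspicious spans:", suspicious_spans(parsed, PKD1_LENGTH))
```

The same two functions could be applied to any other row, for example to flag a range whose end precedes its start.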
Figure 3. Phylogenetic analysis of PKD proteins. Phylogenetic analysis of human (Homo sapiens, hsa), mouse (Mus musculus, mmu), spotted gar (Lepisosteus oculatus, loc), elephant shark (Callorhinchus milii, cmi), zebrafish (Danio rerio, dre), medaka (Oryzias latipes, ola), green spotted pufferfish (Tetraodon nigroviridis, tni), and stickleback (Gasterosteus aculeatus, gac) PKD1-like proteins (A) and PKD2-like proteins with the Drosophila melanogaster (dme) Pkd2 protein as an outgroup (B). In both cases a region of the polycystin-cation-channel domain that was present in all of the proteins was used (see Section Materials and Methods and Supplementary Figures 1, 2). Both analyses used a maximum likelihood method, with WAG substitution, performed using PhyML (v3.1/3.0 aLRT; see Section Materials and Methods). Human PKD1L2, stickleback Pkd1b, and teleost and spotted gar Pkd1l3 proteins are not included as they lack the polycystin-cation-channel domain. We did not include an invertebrate protein in the analysis of PKD1-like proteins because the evolution of this gene family seems to be more complex: while vertebrate PKD1-like proteins do have some homology to invertebrate proteins, this homology is limited and we did not identify an invertebrate protein that had good support for being a clear outgroup for this family. aLRT SH-like branch support values are shown in red to the left of each branch. Red arrowheads indicate the branch that each value corresponds to. Scale bar = 0.4 nucleotide substitutions per site (A), 0.5 nucleotide substitutions per site (B).
Figure 4. Conserved synteny around zebrafish pkd1 family genes. Examination of syntenic relationships between pkd and neighboring genes in genomic regions associated with zebrafish pkd1 family genes. Species is indicated on left and chromosomes on right. Un-Random (tni), unordered random sequences that have yet to be assigned to a chromosome. hsa, human (Homo sapiens); mmu, mouse (Mus musculus); dre, zebrafish (Danio rerio); ola, medaka (Oryzias latipes); tni, green spotted pufferfish (Tetraodon nigroviridis); and gac, stickleback (Gasterosteus aculeatus). Pkd genes are indicated in bold red text. Schematics are not to scale. For ease of comparison, gene clusters are shown in the same orientation, even though in some cases, gene organization is as shown, but on the opposite strand of the chromosome. Schematics only include annotated coding genes. Antisense processed transcripts and ribosomal and long-non-coding RNA loci are not included. Colors indicate homologous genes within an individual panel. So, for example, pink genes in pkd1 (A) are homologous to each other (they are all SLC9A3R2 despite their slightly different positions) but they are not homologous to pink genes in the pkd1l1 panel. However, gray (novel) genes in (A) are an exception, as these three genes are not homologous to each other. We did not find a pkd1b gene in medaka, and none of the genes flanking pkd1b in green spotted pufferfish and stickleback are found near the zebrafish pkd1b gene (B). The PKD1L1 locus is syntenic within but not between amniotes and teleosts (C). Zebrafish pkd1l2a is the only teleost gene to share synteny with both the amniote and other teleost PKD1L2 loci (D). Only stickleback and medaka pkd1l2b genes share any synteny among the pkd1l2b genes (D). As in amniotes, all teleost putative partial pkd1l3 orthologs are flanked by dhodh genes (E).
Figure 2. PKD protein domains. Schematics of protein domains identified in all eight human (Homo sapiens, hsa) and mouse (Mus musculus, mmu) and seven zebrafish (Danio rerio, dre) PKD proteins. The zebrafish putative partial pkd1l3 ortholog is also shown. Approximate protein length is indicated by scale at top. Where multiple transcripts exist in Ensembl, the longest protein isoform is shown. In all three species, PKD1 is the longest PKD protein and the only protein to contain a leucine-rich repeat and carbohydrate-binding WSC domain in the amino-terminus. Pkd1b is not present in mammals. Zebrafish Pkd1b resembles Pkd1 with multiple PKD domain repeats in the amino-terminus and REJ, PLAT/LH2, and polycystin-cation-channel domains in the carboxy-terminus. In all three species, PKD1L1 contains a shorter polycystin-cation-channel domain, approximately half the size of that in other Pkd proteins. Unlike mammals, zebrafish Pkd1l1 also contains a GPS motif upstream of the PLAT/LH2 domain. The 5′ coding sequence of mouse Pkd1l1 gene is presently incomplete. PKD1L2 is unusual in humans in that, according to information on Ensembl, longer transcripts represent polymorphic pseudogenes that have acquired mutations, preventing them from being expressed as functional proteins. As a result, the current version of human PKD1L2 is half the size of mouse and zebrafish Pkd1l2 and lacks the polycystin-cation-channel domain characteristic of PKD proteins. If this is correct, then this suggests that human PKD1L2 is no longer a bona-fide PKD gene. Mouse PKD1L2 and zebrafish Pkd1l2a and Pkd1l2b have identical domain structures, with the exception that Pkd1l2b lacks the REJ domain. PKD1L3 protein structure differs slightly between mammals. Human PKD1L3 has a Lectin C-type domain in the amino-terminus and mouse PKD1L3 does not. In addition, the polycystin-cation-channel domain in mouse PKD1L3 is 411 amino acids long, compared to only 237 amino acids in human PKD1L3. 
We have identified a putative partial pkd1l3 ortholog in zebrafish, but the sequence lacks the polycystin-cation-channel domain, so we do not consider it a bona-fide pkd gene. Pkdrej and Pkd2l2 are not present in zebrafish. The only currently identified domain in PKD2, PKD2L1, and PKD2L2 proteins is the polycystin-cation-channel domain.
Similarities of polycystin-cation-channel domains of zebrafish Pkd proteins.
| Protein | Domain size (aa) | Pkd1 | Pkd1b | Pkd1l1 | Pkd1l2a | Pkd1l2b | Pkd2 | Pkd2l1 |
|---|---|---|---|---|---|---|---|---|
| Pkd1 | 392 | 100 | 27.78 | 16.95 | 23.81 | 25.20 | 25.86 | 24.07 |
| Pkd1b | 293 | 27.78 | 100 | 21.64 | 20.98 | 20.49 | 20.63 | 19.23 |
| Pkd1l1 | 181 | 16.95 | 21.64 | 100 | 28.49 | 21.79 | 27.37 | 22.91 |
| Pkd1l2a | 419 | 23.81 | 20.98 | 28.49 | 100 | 62.26 | 29.56 | 29.80 |
| Pkd1l2b | 417 | 25.20 | 20.49 | 21.79 | 62.26 | 100 | 28.47 | 27.97 |
| Pkd2 | 421 | 25.86 | 20.63 | 27.37 | 29.56 | 28.47 | 100 | 57.38 |
| Pkd2l1 | 421 | 24.07 | 19.23 | 22.91 | 29.80 | 27.97 | 57.38 | 100 |
Percentage identity between polycystin-cation-channel domains of zebrafish Pkd proteins generated using Clustal Omega (see Section Materials and Methods). Column 2 indicates the size of the polycystin-cation-channel domain in amino acids. Compare the protein in each row to the protein in each column to read the pairwise identity percentage.
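The pairwise identity table above can be read programmatically to rank protein pairs by similarity. The following is a small sketch under stated assumptions, not the authors' workflow: the values are copied from the table, the column order is assumed to match the row order (which the symmetry of the values supports), and the names `IDENTITY`, `is_symmetric`, and `ranked_pairs` are hypothetical.

```python
# Row and column order, assumed to be the same as the row order in the table.
NAMES = ["Pkd1", "Pkd1b", "Pkd1l1", "Pkd1l2a", "Pkd1l2b", "Pkd2", "Pkd2l1"]

# Percentage identity between polycystin-cation-channel domains (from the table).
IDENTITY = [
    [100, 27.78, 16.95, 23.81, 25.20, 25.86, 24.07],
    [27.78, 100, 21.64, 20.98, 20.49, 20.63, 19.23],
    [16.95, 21.64, 100, 28.49, 21.79, 27.37, 22.91],
    [23.81, 20.98, 28.49, 100, 62.26, 29.56, 29.80],
    [25.20, 20.49, 21.79, 62.26, 100, 28.47, 27.97],
    [25.86, 20.63, 27.37, 29.56, 28.47, 100, 57.38],
    [24.07, 19.23, 22.91, 29.80, 27.97, 57.38, 100],
]


def is_symmetric(matrix):
    """Check that identity(i, j) equals identity(j, i) for every pair."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))


def ranked_pairs(names=NAMES, matrix=IDENTITY):
    """List each unordered protein pair once, most similar first."""
    pairs = [
        (names[i], names[j], matrix[i][j])
        for i in range(len(names))
        for j in range(i + 1, len(names))
    ]
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


if __name__ == "__main__":
    print("symmetric:", is_symmetric(IDENTITY))
    for first, second, identity in ranked_pairs()[:2]:
        print(f"{first} / {second}: {identity:.2f}%")
```

With these values, the two most similar pairs are Pkd1l2a/Pkd1l2b (62.26%) and Pkd2/Pkd2l1 (57.38%), and the least similar pair is Pkd1/Pkd1l1 (16.95%).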
Figure 5. Conserved synteny around zebrafish pkd2 family genes. Examination of syntenic relationships between pkd and neighboring genes in genomic regions associated with zebrafish pkd2 family genes. Species is indicated on left and chromosomes on right. hsa, human (Homo sapiens); mmu, mouse (Mus musculus); dre, zebrafish (Danio rerio); ola, medaka (Oryzias latipes); tni, green spotted pufferfish (Tetraodon nigroviridis); and gac, stickleback (Gasterosteus aculeatus). Pkd genes are indicated in bold red text. Schematics are not to scale. For ease of comparison, gene clusters are shown in the same orientation, even though in some cases, gene organization is as shown, but on the opposite strand of the chromosome. Schematics only include annotated coding genes. Antisense processed transcripts and ribosomal and long-non-coding RNA loci are not included. Colors only indicate homologous genes within an individual panel. So, for example, pink genes in the pkd2 panel are not homologous to pink genes in the pkd2l1 panel. The teleost pkd2 genes share synteny with human but not mouse PKD2 (A). The teleosts share considerable synteny at the pkd2l1 locus, but only zebrafish and stickleback pkd2l1 genes share any synteny with amniotes (B).
Figure 6. Lateral expression of pkd genes at 8.3, 10, and 12 h. Region shown in main panel at each stage is indicated by red dotted boxes in schematics (A–C). Inset images in (D–S) show whole-mount view of embryo, with dorsal forerunner cells/KV located in the bottom right-hand corner. There is no expression of pkd1, pkd1b, pkd1l2a, pkd1l2b, or pkd2l1 in dorsal forerunner cells or KV at any of these stages. The boundary of the KV cavity is faintly visible in (M) because a slightly different focal plane is shown to include spinal cord expression. However, pkd1b is not expressed in the margin of KV. Arrows in (M) indicate caudal limit of spinal cord expression of pkd1b. pkd1l1 is expressed in the KV region at all three stages (D–F) and pkd2 is expressed at 10 and 12 h but not 8.3 h (G–I). Scale bar (D) = 50 μm for main panels (D–S) and 200 μm for inset panels.
Figure 7. Expression of . Lateral views of whole embryo expression of pkd genes at 24 h and 5 dpf. Rostral left, dorsal up. (A,B) pkd1 is strongly expressed in the pronephros at 24 h (arrows, A) but not at 5 dpf. By 5 dpf, pkd1 expression persists only in the putative taste receptors (white asterisks, B). (C,D) pkd1b is broadly expressed throughout the dorsal-ventral hindbrain and spinal cord, and in the caudal-most midbrain at 24 h. By 5 dpf, strong expression persists in the floor plate in the midbrain and hindbrain, whilst weaker expression persists in putative taste receptors of the pharynx (white asterisks, D). (E,F) pkd1l1 is not expressed at 24 h but is detected in the ear at 5 dpf (white dotted line, F). (G,H) pkd1l2a is expressed in cells in the ventral-most spinal cord at 24 h. This expression persists at 5 dpf, as does expression in putative taste receptors (white asterisk, H). (I,J) pkd1l2b expression is not detected at 24 h and persists only weakly in the pharyngeal cartilage at 5 dpf (white asterisk, J). (K,L) pkd2 is expressed in the pronephros (arrows, K) and perhaps very weakly in the floor plate at 24 h (arrowheads, K). By 5 dpf, pkd2 expression is restricted to the ventral region of the rostral somites and putative taste receptors (white asterisks, L). (M,N) Like pkd1l2a, pkd2l1 is also expressed in cells in the ventral-most spinal cord at 24 h. This expression also persists at 5 dpf, together with weak expression in putative taste receptors (white asterisks, N). Low-level diffuse staining in the brain in (A,C,F,H,L,N) and more widely in (E,I,K) is probably background staining. These embryos were stained for longer periods to try to detect any weak, but specific, expression in the spinal cord. As a consequence, the brain, which contains large ventricles that sometimes trap RNA riboprobes, often has background staining (see Discussion). Scale bar (A) = 100 μm.
Figure 8. Spinal cord expression of zebrafish . Lateral views showing expression of pkd genes at 1–5 dpf. Rostral left, dorsal up. (A–F) pkd1b is expressed broadly in the spinal cord. pkd1l2a (G–L) and pkd2l1 (M–R) are both expressed in two rows of cells in the ventral spinal cord and occasionally weakly in more dorsal cells (asterisk). (S–U) pkd1, (V–X) pkd1l1, (Y–A') pkd1l2b, and (B'–D') pkd2 are not expressed in the spinal cord. Some of these embryos have background staining because we stained them for long periods of time to try to detect any weak, but specific, expression. Expression of pkd2 is visible in the rostral ventral somites (D'). Scale bar (A) = 50 μm.
Figure 9. Expression of zebrafish . Lateral views (A,B,E,F,I,J,M,N,Q,R,U,V,Y,Z) and cross-sections (C,D,G,H,K,L,O,P,S,T,W,X,A',B') of pkd expression in the trunk of mindbomb mutants and sibling embryos with WT phenotypes. Dorsal is up. In lateral views, rostral is left and only the spinal cord region is shown. Arrows (O,P,A',B') indicate pronephros expression. Arrowheads (Y,Z,A' and higher magnification inset in A') indicate weak expression of pkd2 in the floor plate of the spinal cord. The focal plane in B' does not include labeled floor plate cells. Scale bar (A) = 50 μm (lateral views, A,B,E,F,I,J,M,N,Q,R,U,V,Y,Z); scale bar (C) = 30 μm (cross-sections, C,D,G,H,K,L,O,P,S,T,W,X,A',B') and 10 μm (inset in A').
Figure 10. Expression of . Lateral view of pkd1 (A,B) or pkd2 (C,D) expression in pronephros (black and white arrows) at 27 and 36 h. Rostral left, dorsal up. pkd1 is strongly expressed in pronephros at 27 h (A). Expression starts to decline at 36 h (B). Expression of pkd2 is weak in pronephros at 27 h (C) and is reduced even further by 36 h (D). (E–H) Lateral expression of pkd genes in the ear at 4–5 dpf. Dotted line shows ear boundary. Weak expression of pkd1l1 (E) and pkd2l1 (G) is first detected at 4 dpf in the inner ear ectoderm that supports the posterior canal and posterior crista (black arrowheads). pkd1l1 is also weakly expressed in the utricular otolith (white arrows). By 5 dpf, pkd1l1 expression persists in the utricular otolith and the underlying utricular macula (white arrows). It is also expressed in neighboring ectoderm flanking the lateral canal and lateral crista (white asterisks; F). At 5 dpf the expression of pkd2l1 persists in tissue surrounding the posterior canal and posterior crista (black arrowheads; H). (I–M) Lateral view of pkd1 expression in neuromasts (white asterisks) and lateral line primordium (white dotted line) at 36 h and 3 dpf. Rostral left, dorsal up. Weak expression of pkd1 in neuromasts and lateral line primordium is first detected at 36 h [I, higher magnification of the neuromasts (J) and lateral line primordium (K)]. By 3 dpf expression persists in neuromasts (L and higher magnification view, M). pkd1 is also expressed in pectoral fin buds (black arrows) at 36 h [dorsal view, rostral top (N); lateral view, rostral left, dorsal up (O)]. (P–R) Lateral expression of pkd2 in rostral somites at 4 and 5 dpf. Rostral left, dorsal up. pkd2 is first expressed in the ventral half of each rostral somite at 4 dpf (black arrows in P, higher magnification in Q) and persists at 5 dpf (black arrows in R). (S–W) Lateral expression of pkd genes in the eye at 4 dpf. Rostral left, dorsal up. pkd1b, pkd1l1, pkd1l2a, pkd1l2b, and pkd2l1 are expressed in the ganglion cell layer (adjacent to lens, single white cross) and amacrine cells (outer cell layer immediately adjacent to ganglion cell layer, double white cross) of the eye at 4 dpf. The expression of pkd1b (S) and pkd1l2b (V) is weak and the expression of pkd1l1 (T), pkd1l2a (U), and pkd2l1 (W) is stronger. Only the expression of pkd1l2b persists in these cell layers at 5 dpf (data not shown). Scale bar (A) = 23 μm (J,K,M); 42 μm (E–H,O); 50 μm (A–D,Q,R); 55 μm (N); 62.5 μm (S–W); and 100 μm (I,L,P).
Figure 13. Summary of . (A–U) Schematics showing ventral views of zebrafish pharyngeal regions summarizing expression of pkd genes at 3, 4, and 5 dpf. Black dots indicate expression on pharyngeal cartilage and red dots indicate expression on pharyngeal walls. Cartilage schematics are modified from Schilling and Kimmel (1997), Knight et al. (2003), and Edmunds et al. (2016). Locations of the major cartilaginous elements are shown in panel (A). m, Meckel's cartilage (ventral component of lower jaw); ch, ceratohyal cartilage (derived from the second pharyngeal arch); pq, palatoquadrate cartilage (derived from the first pharyngeal arch, forms the dorsal mandibular cartilage); hs, hyosymplectic cartilage (also derived from the second pharyngeal arch); cb1–5, ceratobranchial cartilage 1–5 (forms the ventral branchial or gill arches).
Figure 11. and . Dorsal view of pkd1l2a (A) and pkd2l1 (B) expression in 24 h spinal cord. Rostral left. Most of the labeled cells are KA cells that abut the central canal. Asterisks indicate weak expression in occasional, more lateral cells (which correspond to the more dorsal cells indicated in Figures 8G–R). Prolonged staining sometimes reveals additional weakly labeled lateral cells (data not shown). (C) Average number of cells (y-axis) expressing pkd1l2a (blue) and pkd2l1 (red) in KA″ and KA′ cells (x-axis) at 24 h in WT spinal cord region adjacent to somites 6–10 (n = 5). Error bars indicate standard error of the mean. There is no statistically significant difference between the numbers of pkd1l2a- and pkd2l1-expressing KA″ (p = 0.6419) or KA′ (p = 0.8571) cells (Student's t-test). These data do not include occasional non-KA lateral cells (2 cells each in 2/5 pkd2l1-labeled embryos; 0 cells in 5 pkd1l2a-labeled embryos). (D–F',G–I') Lateral views of zebrafish spinal cord at 30 h. Anterior left, dorsal top. In situ hybridization (purple) for pkd1l2a (D,D') and pkd2l1 (G,G'), EGFP immunohistochemistry (green) in Tg(–8.1gata1:gata1-EGFP) embryos (E,E',H,H') and merged views (F,F',I,I'). (D'–I') Magnified single confocal plane of white dotted box region. 100% of pkd1l2a- and pkd2l1-expressing KA cells co-express Tg(–8.1gata1:gata1-EGFP) and 100% of GFP-positive Tg(–8.1gata1:gata1-EGFP) KA cells co-express either pkd1l2a (F, indicated with + in F') or pkd2l1 (I, indicated with + in I'). No GFP-positive Tg(–8.1gata1:gata1-EGFP) dorsal V2b cells co-express either pkd1l2a (white ˆ in F,F') or pkd2l1 (white ˆ in I,I'). Double-labeled cells are not indicated in (D–I) main panels as they are so numerous. Scale bar (A) = 50 μm (A,B). Scale bar (D) = 50 μm (D–I) and 20 μm (D'–I').
Figure 12. Expression of . Lateral (A–C,G–I,M–O,S–U,Y,Z,A',E'–G',K'–M') and ventral (D–F,J–L,P–R,V–X,B'–D',H'–J',N'–P') views of pkd gene expression at 3, 4, and 5 dpf. Rostral is left (A–C,G–I,M–O,S–U,Y,Z,A',E'–G',K'–M') and top (D–F,J–L,P–R,V–X,B'–D',H'–J',N'–P'). In most of the lateral views, the eyes are out of focus. Insets in (M–O) and (Y–A') show expression of pkd1l2a and pkd2l1 in KA cells in the rostral spinal cord (small white arrows). Insets in (E'–G') show expression of pkd1b in the floor plate of the midbrain and hindbrain. White arrowheads indicate the locations of pharyngeal expression. Scale bar (A) = 100 μm.