Francesco Musacchia, Filip Vasilev, Marco Borra, Elio Biffali, Remo Sanges, Luigia Santella, Jong Tai Chun.
Abstract
Starfish have been instrumental in many fields of biological and ecological research. Oocytes of Astropecten aranciacus, a common species native to the Mediterranean Sea and the East Atlantic, have long been used as an experimental model to study meiotic maturation, fertilization, intracellular Ca2+ signaling, and cell cycle control. However, investigation of the underlying molecular mechanisms has often been hampered by the overall lack of DNA or protein sequences for the species. In this study, we have assembled a transcriptome for this species from the oocytes, eggs, zygotes, and early embryos, which are known to have the highest RNA sequence complexity. Annotation of the transcriptome identified over 32,000 transcripts, including those encoding 13 distinct cyclins and as many cyclin-dependent kinases (CDK), as well as the expected components of the intracellular Ca2+ signaling toolkit. Although the mRNAs of the cyclin and CDK families did not undergo significant abundance changes through the stages from oocyte to early embryo, as judged by real-time PCR, the transcript encoding Mos, a negative regulator of the mitotic cell cycle, was drastically reduced during the period of rapid cleavages. Molecular phylogenetic analysis using the homologous amino acid sequences of cytochrome oxidase subunit I from A. aranciacus and 30 other starfish species indicated that Paxillosida, to which A. aranciacus belongs, is not likely to be the most basal order in Asteroidea. Taken together, this first transcriptome assembled for the species is expected to enable comparative studies and the design of gene-specific molecular tools with which to tackle long-standing biological questions.
Year: 2017 PMID: 28873438 PMCID: PMC5584759 DOI: 10.1371/journal.pone.0184090
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Comparison of the A. aranciacus reference transcriptome with the existing genetic sequence databases.
| Species | Percentage (number of transcripts) |
|---|---|
| | 34.2% (11,221) |
| | 6.2% (2,046) |
| | 1.6% (529) |
| | 1.4% (470) |
| | 1.4% (456) |
| | 1.0% (312) |
| | 0.8% (258) |
| | 0.7% (231) |
| | 0.4% (145) |
| | 0.4% (145) |
| | 0.4% (144) |
| | 0.4% (137) |
| | 0.3% (106) |
| | 0.3% (106) |
| | 0.3% (100) |
Note: A total of 32,783 annotated transcripts were analyzed with respect to the UniRef classification. Only the 15 species that most frequently provided the closest match to transcripts across the entire transcriptome are shown, named according to NCBI taxonomy. The numbers in parentheses are the counts of distinct transcripts whose highest-scoring hit fell in the given reference species.
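The percentages in the table follow directly from the counts in parentheses and the total of 32,783 annotated transcripts. A minimal sketch of that arithmetic, using the first five rows; the function and variable names are illustrative and not taken from the study:

```python
# Total annotated transcripts, as stated in the table note.
TOTAL_ANNOTATED = 32_783

# Top-hit counts from the first five rows of the table.
top_hit_counts = [11_221, 2_046, 529, 470, 456]


def percent_of_total(count: int, total: int = TOTAL_ANNOTATED) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


percentages = [percent_of_total(c) for c in top_hit_counts]
print(percentages)  # [34.2, 6.2, 1.6, 1.4, 1.4]
```

The rounded values reproduce the percentages listed for those rows.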
Fig 1. Comparison of the transcript population in the cells of the four different developmental stages based on the occurrence frequency of GO classes.
(A) Morphology of the cells at the different developmental stages: GV-stage oocytes, mature eggs, zygotes (20 min post-fertilization), and the early embryos (4 h and 20 min post-fertilization). (B) Comparison of the four stages on the basis of the three GO domains: biological processes (BP), molecular functions (MF) and cellular components (CC). The numbers on the Y-axis represent the percentage of the transcripts belonging to the labeled category in reference to the net annotated transcript population at the given stage.
Fig 2. Comparison of the transcript population in the cells of the four different stages on the basis of pathway annotation.
Transcripts were classified with respect to the metabolic pathways to which their deduced proteins contribute. The occurrence frequencies of the major pathways were plotted in reference to the net annotated transcripts in the cells of four stages (percentage, Y-axis). The three plots show the three different levels of pathways as in UniPathway: level 1 (A), level 2 (B), level 3 (C).
Transcripts encoding putative enzymes and ion channels involved in intracellular calcium signaling.
| Transcripts | Annotation | Length (amino acid) | Coverage/Identity | Top Hit Species | Accession No. | E-value |
|---|---|---|---|---|---|---|
| comp201526 | Inositol 1,4,5-trisphosphate receptor | 2,736 | 100%/86% | | Q8WSR4 | 0.0 |
| comp200930 | Phospholipase C-beta4 | 1,243 | 97%/71% | | AAS55894 | 0.0 |
| comp197619 | Phospholipase C-gamma1 | 2,160 | 100%/91% | | AAR85355.1 | 0.0 |
| comp197008 | Phospholipase C-delta4 | 772 | 100%/60% | | NP_001008790 | 0.0 |
| comp191901 | Phospholipase C-epsilon1 | 2,455 | 58%/45% | | XP_011680276 | 0.0 |
| comp201603 | Phospholipase C-eta2 | 2,150 | 38%/55% | Strongylocentrotus purpuratus | XP_011680765 | 0.0 |
| comp202094 | Ryanodine Receptor | 5,261 | 70%/68% | | BAB84714 | 0.0 |
| comp183145 | ADP-ribosyl cyclase-like 1 | 343 | 94%/43% | | XP_002735899 | 2e-56 |
| comp190002 | ADP-ribosyl cyclase-like 2 | 320 | 98%/41% | Saccoglossus kowalevskii | XP_002735899 | 4e-20 |
| comp188422 | ADP-ribosyl cyclase-like 3 | 219 | 65%/46% | Saccoglossus kowalevskii | XP_002735899 | 4e-40 |
| comp199516 | Two-pore channel 1 | 915 | 89%/56% | | CBI63263 | 0.0 |
| comp195246 | Two-pore channel 2 | 830 | 99%/55% | | CBI63264 | 0.0 |
| comp195231 | Two pore calcium channel protein 1 isoform X1 | 1,022 | 99%/53% | | XP_011679870 | 0.0 |
| comp198265 | Sarco/endoplasmic reticulum calcium transporting ATPase (SERCA) | 795 | 100%/81% | | XP_011663710 | 0.0 |
| comp188433 | Plasma membrane calcium transporting ATPase (PMCA) | 1,141 | 99%/81% | | XP_011672381 | 0.0 |
| comp181009 | Sodium/calcium exchanger 3 (NCX) isoform X2 | 968 | 96%/70% | | XP_011661939 | 0.0 |
| comp197721 | Sodium/calcium exchanger 2 | 465 | 51%/63% | | XP_011683878 | 0.0 |
| comp200471 | Sodium/calcium exchanger 3 isoform X6 | 801 | 97%/71% | | XP_794875 | 0.0 |
| comp201718 | Voltage-dependent calcium channel type A subunit alpha-1 | 1,255 | 51%/70% | | XP_011662956 | 0.0 |
| comp196123 | Voltage-dependent calcium channel subunit alpha-2/delta-1 | 1,084 | 99%/53% | | XP_011671023 | 0.0 |
| comp199854 | Voltage-dependent calcium channel subunit alpha-2/delta-2 | 741 | 58%/60% | | XP_013386854 | 0.0 |
| comp200305 | Voltage-dependent calcium channel subunit alpha-2/delta-3 | 583 | 90%/49% | | XP_006811294 | 1e-138 |
| comp198886 | Voltage-dependent calcium channel subunit alpha-2/delta-4 | 1,101 | 93%/41% | | XP_012994116 | 0.0 |
| comp201327 | Voltage-dependent L-type calcium channel subunit beta-1 | 729 | 98%/66% | | XP_011661591 | 0.0 |
| comp195452 | Voltage-dependent T-type calcium channel subunit alpha-1G | 2,395 | 79%/49% | | XP_015150974 | 0.0 |
| comp162850 | Sodium/potassium/calcium exchanger Nckx30C isoform X5 (NCKX) | 660 | 92%/51% | | XP_018318698 | 0.0 |
| comp201917 | Sodium/potassium/calcium exchanger 3 | 634 | 99%/57% | | XP_780438 | 0.0 |
| comp192844 | Sodium/potassium/calcium exchange CG1090 isoform X2 | 534 | 87%/44% | | XP_013391924 | 6e-134 |
| comp191999 | Stromal interaction molecule 1 (STIM-1) | 689 | 72%/62% | | XP_011666877 | 0.0 |
| comp200692 | Orai-2-like | 210 | 97%/73% | | XP_780791 | 9e-92 |
| comp200190 | Transient receptor potential cation channel (TRP) subfamily A1 | 1,393 | 100%/59% | | XP_797912 | 0.0 |
| comp200408 | TRP subfamily M3-like | 1,090 | 76%/43% | | XP_006816420 | 0.0 |
| comp194828 | TRP subfamily M2, isoform X6 | 1,475 | 87%/53% | | XP_011664677 | 0.0 |
| comp200918 | TRP subfamily M2 | 1,574 | 75%/51% | | XP_006814003 | 0.0 |
| comp191759 | TRP subfamily V6 | 819 | 99%/48% | | XP_011668037 | 0.0 |
| comp200620 | TRP subfamily V5 | 759 | 89%/55% | | XP_786430 | 0.0 |
| comp189572 | Calcium channel flower homolog isoform X1 | 171 | 72%/52% | | XP_011671877 | 8e-40 |
* Acquisition of the composite sequence containing all the expected domains required higher coverage filtering.
a-f Isoforms exist. The longest one was considered for BLAST analyses.
a) An additional isoform exists lacking four amino acid residues (AA) 907–911.
b) Three other isoforms exist lacking AA 499–532 to various extents.
c) Two other isoforms exist; one lacks AA 151–157, the other lacks AA 151–157 and AA 541–542.
d) An additional isoform exists that lacks AA 1389–1398 and contains a substitution (E1399K).
e) Another isoform exists that lacks AA 303–313 and contains a substitution (S301P).
f) Three other isoforms exist; one lacks AA 207–213 and contains a substitution (I206A), another lacks AA 2012–2025, and the third has both changes.
Families of cyclin and cyclin-dependent kinase.
| Transcripts | Annotation | Length (amino acid) | Coverage/Identity | Top Hit Species | Accession No. | E-value |
|---|---|---|---|---|---|---|
| comp196719 | Cyclin A | 446 | 99%/79% | | BAA14010 | 0.0 |
| comp190183 | Cyclin B | 403 | 100%/100% | | CAO99272 | 0.0 |
| comp196609 | Cyclin B3 | 447 | 100%/77% | | CBG91877 | 0.0 |
| comp195049 | Cyclin C | 278 | 98%/75% | | XP_006817695 | 1e-151 |
| comp195288 | Cyclin D2 | 315 | 99%/61% | | XP_013410672 | 1e-114 |
| comp202146 | Cyclin E | 428 | 100%/81% | | CBG91878.1 | 0.0 |
| comp196253 | Cyclin F | 594 | 49%/49% | | XP_011666474 | 1e-120 |
| comp190754 | Cyclin G2 | 390 | 98%/46% | | XP_788820 | 1e-111 |
| comp196371 | Cyclin H | 322 | 100%/57% | | XP_787341 | 6e-130 |
| comp195488 | Cyclin I | 369 | 98%/51% | | XP_794154 | 1e-112 |
| comp194056 | Cyclin J | 308 | 90%/52% | | XP_798281 | 1e-89 |
| comp200893 | Cyclin K | 659 | 30%/81% | | XP_795740.3 | 4e-150 |
| comp192356 | Cyclin L1 | 582 | 78%/63% | | XP_790064 | 9e-175 |
| comp196794 | Cyclin-dependent kinase (CDK)1 | 306 | 100%/91% | | BAA11477 | 0.0 |
| comp185828 | CDK2 | 300 | 100%/92% | | BAH97197 | 0.0 |
| comp194905 | CDK5 | 296 | 100%/87% | | XP_002732807 | 0.0 |
| comp197484 | CDK6 | 331 | 97%/65% | | XP_002741831 | 2e-140 |
| comp194115 | CDK7 | 360 | 89%/76% | | XP_003216396 | 0.0 |
| comp197368 | CDK8 | 490 | 88%/80% | | XP_003728284 | 0.0 |
| comp189294 | CDK9 | 385 | 93%/75% | | XP_798269 | 0.0 |
| comp196995 | CDK10 | 400 | 96%/73% | | XP_797002.1 | 0.0 |
| comp199997 | CDK11B | 696 | 47%/76% | | XP_006823032 | 0.0 |
| comp190838 | CDK12 | 1,263 | 82%/52% | | XP_789337 | 0.0 |
| comp201876 | CDK14 | 478 | 96%/69% | | XP_011675680 | 0.0 |
| comp191141 | CDK17 | 371 | 100%/84% | | XP_011680140 | 0.0 |
| comp190114 | CDK20 | 340 | 100%/83% | | XP_002741922 | 0.0 |
a–c Isoforms exist. The longest one was considered for BLAST analyses.
a) An additional isoform exists that lacks amino acid residues (AA) 270–278 and contains a substitution (S269R).
b) An additional isoform exists lacking AA 446–476.
c) An additional isoform exists varying at the C-terminus: AA 340–360 substituted by a shorter peptide GRLAKKLVF.
Fig 3. Quantitation of the transcripts encoding proteins of the cyclin family.
(A) Abundance of the transcripts encoding distinct cyclins in mature eggs. Data are presented in both counts per million (CPM, green bars) and reads per kilobase per million mapped reads (RPKM, brown bars). (B) Relative expression levels of the cyclin family transcripts in GV-stage oocytes (GV), mature eggs (Mat), zygotes (Zyg) and early embryos (EE), estimated by real-time qPCR. Data were normalized to the values of the internal control transcript and plotted on a logarithmic scale as described in Materials and methods. Histograms and data points with error bars indicate mean ± standard deviation (n = 4).
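For readers unfamiliar with the two abundance units in panel A, the sketch below applies the standard textbook definitions of CPM and RPKM. It is only an illustration: the read counts, transcript length, and library size are hypothetical, and the excerpt does not specify which software or exact formula the authors used.

```python
def cpm(read_count: int, total_mapped_reads: int) -> float:
    """Counts per million: reads for one transcript per million mapped reads."""
    return read_count * 1e6 / total_mapped_reads


def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical values, not taken from the study:
# 500 reads on a 2,000-bp transcript in a library of 20 million mapped reads.
example_cpm = cpm(500, 20_000_000)
example_rpkm = rpkm(500, 2_000, 20_000_000)
print(example_cpm, example_rpkm)  # 25.0 12.5
```

Because RPKM additionally divides by transcript length, a long and a short transcript with the same CPM will show different RPKM values, which is one reason a figure may report both units.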
Fig 4. Quantitation of the transcripts encoding proteins of the CDK family.
(A) Abundance of the transcripts encoding distinct members of the CDK family in mature eggs. Data are presented in both CPM (green bars) and RPKM (brown bars). (B) Relative expression levels of CDK transcripts in GV-stage oocytes (GV), mature eggs (Mat), zygotes (Zyg) and early embryos (EE), estimated by real-time qPCR. Data were normalized to the values of the internal control transcript and plotted on a logarithmic scale as described in Materials and methods. Histograms and data points with error bars indicate mean ± standard deviation (n = 4).
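The normalization in panel B can be illustrated with a small sketch. The Materials and methods are not part of this excerpt, so the calculation below is an assumption: it uses the common ΔCt approach (relative level = 2^−(Ct_target − Ct_control), assuming 100% amplification efficiency), the sample standard deviation, and hypothetical Ct values for n = 4 replicates.

```python
import math
import statistics


def relative_expression(ct_target: float, ct_control: float) -> float:
    """Relative level by the delta-Ct method (assumed, not confirmed by the source)."""
    return 2.0 ** -(ct_target - ct_control)


# Hypothetical Ct values for four replicates; not data from the study.
target_ct = [24.0, 24.0, 25.0, 25.0]
control_ct = [22.0, 22.0, 22.0, 22.0]

values = [relative_expression(t, c) for t, c in zip(target_ct, control_ct)]
mean_value = statistics.mean(values)   # arithmetic mean of the replicates
sd_value = statistics.stdev(values)    # sample standard deviation (n - 1)
log10_mean = math.log10(mean_value)    # position of the bar on a log10 axis

print(values, mean_value, sd_value, log10_mean)
```

A target amplifying two cycles later than the control gives a relative level of 0.25, so on a logarithmic axis transcripts less abundant than the control plot below zero.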
Fig 5. Comparison of the transcript levels of selected genes estimated by transcriptome statistics and real-time qPCR.
(A) The CPM values of histone H2A (H2A), catenin β, Mos, Y-Box transcription factor, and 40S ribosomal protein S2 (Ribo S2) were normalized to that of the internal control gene. (B) Quantitation of the same transcripts in mature eggs (Mat), zygotes (Zyg), and early embryos (EE) by real-time qPCR. Differences between bars marked with distinct letters (e.g., a versus b) are considered statistically significant: P < 0.01 (post hoc test) in all cases except those indicated with asterisks (P < 0.05). Histograms with error bars indicate mean ± standard deviation.
Fig 6. A phylogenetic tree of starfish based on the amino acid sequences of cytochrome oxidase subunit I (COS-I).
Homologous amino acid sequences of uniform length were selected from the N-terminal coding region of the COS-I gene, which was commonly available in 31 starfish species, and were subjected to multiple alignment and tree-building as described in Materials and Methods. Bootstrap values are indicated to the left of the corresponding nodes. For simplicity, bootstrap values below 50 are not shown in the cladogram. The scale bar (0.1) for the branches shows the number of substitutions per site. Abbreviations in parentheses represent the taxonomic order of the species: BRI, Brisingida; FOR, Forcipulatida; NOT, Notomyotida; PAX, Paxillosida; SPI, Spinulosida; VAL, Valvatida; VEL, Velatida. N1, N2, and N3: key nodes discussed in the text.