De novo male gonad transcriptome draft for the marine mussel Perumytilus purpuratus with a focus on its reproductive-related proteins.

Carolina Briones, José J Nuñez, Montse Pérez, Daniela Espinoza-Rojas, Cristian Molina-Quiroz, Ricardo Guiñez.

Abstract

Perumytilus purpuratus is a marine mussel considered a bioengineer species with a broad distribution along the Pacific and Atlantic coasts of South America. Studies have shown two geographically and genetically differentiated subpopulations, which differ at the molecular level and in sperm morphological traits. To open avenues for molecular research on P. purpuratus, a global de novo transcriptome from gonadal tissue of mature males was sequenced using the Illumina platform. From a total of 126.38 million reads, 37,765 transcripts were successfully annotated. BUSCO analysis determined a level of 89% completeness for the assembled transcriptome. The functional gene ontology (GO) annotation indicated that, in terms of abundance, the transcripts related to molecular function were the most represented, followed by those related to biological process and cellular components. Additionally, a subset of GO annotations generated using the "sperm" term yielded a total of 1,294 sequences, among which the biological process category was the most represented, with transcripts strongly associated with sperm processes required for fertilization and with processes in which sperm-egg interaction could be implicated. Our work will contribute to the evolutionary understanding of the molecular mechanisms related to tissue-specific functions. This work reports the first male gonad transcriptome for the mussel P. purpuratus, generating a useful transcriptomic resource for this species and other closely related mytilids.

Keywords:  BUSCO; Bivalve; Gamete-recognition proteins; Gene Ontology; Mollusca; Sperm

Year:  2018        PMID: 30510598      PMCID: PMC6275399          DOI: 10.7150/jgen.27864

Source DB:  PubMed          Journal:  J Genomics


Introduction

The marine mussel Perumytilus purpuratus (Mytilidae) is a dominant competitor and an ecosystem engineer that forms extensive beds in the rocky intertidal zone 1. The species has a broad latitudinal distribution along the Pacific coast, from Guayaquil in Ecuador (3ºS) to Cabo de Hornos (56ºS) in Chile 2, and up the south Atlantic coast of Argentina to 41ºS 3. A recent macro-scale study using microsatellite markers proposed a biogeographical break in P. purpuratus at ca. 40ºS, with two genetically divergent groups identified north and south of that latitude 4. Phylogenetic studies using nuclear and mitochondrial markers have also found phylogeographic patterns congruent with this break 5, 6. A comparison of sperm morphological traits in P. purpuratus revealed geographic intra-specific variation 5, consistent with the phylogeographic patterns and genetic divergence previously reported. Sperm morphological traits have been used in systematic studies to differentiate closely related species, and in general sperm ultrastructure appears to be highly conserved at the species level 7, 8. Accordingly, sperm proteins play an important role in the process of speciation 9. In this context, recent developments in high-throughput sequencing techniques such as whole-transcriptome sequencing (RNAseq) have enabled large-scale analysis of genetic variation and gene expression in different tissues and species 10. Furthermore, RNAseq is an effective way to discover genes participating in specific biological processes when no reference genome is available 11, as is the case for P. purpuratus. The aim of this study was to characterize the male gonad transcriptome of P. purpuratus.
The information generated here may be used in future integrated analyses of transcriptomic and proteomic data for this species, and will allow the identification of reproduction-related genes that could help determine whether populations of P. purpuratus are undergoing cryptic speciation mediated by reproductive isolation. Finally, our work represents the first transcriptomic resource for P. purpuratus and a reference transcriptome for closely related species.
For the characterization of the male gonad transcriptome, individuals of P. purpuratus were collected from the mid-intertidal zone off the coast of Valparaíso, Chile (32º57'S/71º33'W). In this species, sex can be easily determined from the colour of the gonad and mantle tissue: males display a range of light colours (between white and yellow) and females show a browner coloration 3. Gonadal tissue was obtained from mature males, preserved in RNAlater (Life Technologies), and stored at -80°C until use. To optimize RNA extraction, total gonadal RNA was extracted from eight samples using the E.Z.N.A Total RNA Kit II (Omega Biotek), and the sample with the highest quality and concentration was selected for library construction and sequencing. mRNA enrichment and cDNA library construction were carried out using the Illumina Tru-Seq™ Stranded Total RNA Library Preparation Kit with Ribozero (human/mouse/rat). The cDNA was sequenced on an Illumina HiSeq 2500 using 100 bp paired-end reads. Sequencing was performed by STAB Vida (Lisboa, Portugal).
Adapters and low-quality bases were trimmed using Trimmomatic 13; reads or bases with Q ≤ 30 and reads shorter than 36 bp were removed. The de novo transcriptome assembly for the gonadal tissue of P. purpuratus was performed using Trinity (Release 2013-11-10) 12. The resulting transcripts were then annotated by Blast searches against the Swiss-Prot database 14 using an e-value threshold of 1E-05.
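The e-value cut-off used for the Swiss-Prot annotation can be illustrated with a minimal sketch. It assumes the Blast results are available in the standard 12-column tabular layout (e-value in column 11, bit score in column 12) and that one best hit per transcript is kept; neither assumption is stated in the text, and the function name and example hits are hypothetical.

```python
E_VALUE_THRESHOLD = 1e-5  # threshold stated in the text


def best_hits(tabular_text, max_evalue=E_VALUE_THRESHOLD):
    """Return {query_id: (subject_id, evalue, bitscore)}.

    For each query, keep the highest-bit-score hit whose e-value does not
    exceed the threshold. Assumes 12-column tab-separated Blast output.
    """
    best = {}
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # hit is not significant at the chosen threshold
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical hits, invented for illustration only.
example = "\n".join([
    "tr1\tsp|P00001|A\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-30\t200",
    "tr1\tsp|P00002|B\t60.0\t100\t40\t0\t1\t100\t1\t100\t1e-10\t90",
    "tr2\tsp|P00003|C\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.5\t30",
])
hits = best_hits(example)
print(hits)
```

In this toy input, the second transcript has no hit below the threshold and would therefore remain unannotated.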
To assess the quality and completeness of the assembly quantitatively, BUSCO (Benchmarking Universal Single-Copy Orthologs) version 3.0.2 15 was run with default settings, using as a reference the Metazoa gene set, which consists of 978 single-copy genes. The transcripts were then assigned to the three principal functional categories (biological process, molecular function and cellular component) using Blast2GO 16, with GO terms annotated at both the second and third levels 17.
Similar to previous studies of molluscs and other invertebrates, total RNA extractions yielded a single non-degraded ribosomal RNA band of the size typical of 18S rRNA 18. Because the 28S and 18S rRNA fragments co-migrated, the RNA Integrity Number (RIN) could not be used; thus only the absence of RNA degradation was considered 18. A total of 126.38 million paired-end reads were obtained from transcriptome sequencing. After trimming, 105.42 million high-quality paired-end reads with a mean length of 96.01 bp were retained for the de novo Trinity assembly. From these, a total of 385,288 transcripts and 314,741 different unigenes were generated. The transcripts had an average length of 430.5 bp and an N50 of 445 bp (Table 1), and GC content was 35%, congruent with other mussel assemblies where mantle tissue was used 19, 20. The BUSCO analysis 15 showed that, of the 978 Metazoa BUSCOs, 89.7% (878) were complete, 60.94% (596) complete and single-copy, 28.8% (282) complete and duplicated, and 10.1% (99) fragmented. Because the fraction of missing BUSCOs was very low (0.1%), we concluded that the transcriptome of P. purpuratus was sufficiently complete and comprehensive for further analysis.
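The BUSCO percentages above follow directly from the reported counts. The short sketch below recomputes them as a consistency check; the variable and function names are illustrative, and only the counts come from the text.

```python
TOTAL_BUSCOS = 978  # size of the Metazoa gene set, as stated in the text

# Counts reported in the text.
counts = {
    "complete_single_copy": 596,
    "complete_duplicated": 282,
    "fragmented": 99,
}

# Complete BUSCOs are the sum of single-copy and duplicated ones.
complete = counts["complete_single_copy"] + counts["complete_duplicated"]
# Missing BUSCOs are whatever is neither complete nor fragmented.
missing = TOTAL_BUSCOS - complete - counts["fragmented"]


def pct(n, total=TOTAL_BUSCOS):
    """Percentage of the reference gene set, rounded to two decimals."""
    return round(100.0 * n / total, 2)


print("complete:", complete, pct(complete))
print("missing:", missing, pct(missing))
for name, n in counts.items():
    print(name, n, pct(n))
```

The counts add up to 977 of the 978 reference genes, which leaves a single missing BUSCO and matches the reported 0.1%.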
Table 1

Summary statistics of the sequencing, de novo assembly, and annotated transcripts of gonadal tissue of Perumytilus purpuratus male.

Statistic                                        Total
Obtained from sequencing
  Reads                                          126,380,260
  Number of bases (Mbp)                          12,131
  Mean read length (bp)                          101
After trimming for de novo assembly
  Reads                                          105,419,693
  Total base pairs (Gb)                          12,130
  Mean read length (bp)                          96.01
De novo assembly
  Transcripts                                    385,288
  Shortest transcript length (bp)                201
  Longest transcript length (bp)                 16,455
  Mean transcript length (bp)                    290
  Average length (bp)                            430.5
  Average GC (%)                                 35
  N50 (bp)                                       445
Transcripts retained for annotation
  Transcripts annotated in Swiss-Prot database   43,756
  Transcripts annotated with GO terms            37,992
  Shortest transcript length (bp)                201
  Longest transcript length (bp)                 16,455
  Mean transcript length (bp)                    327
  Average length (bp)                            521
  Average GC (%)                                 35.5
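For readers unfamiliar with the N50 statistic reported in Table 1, the following sketch shows the usual definition: the length L such that transcripts of length L or longer contain at least half of all assembled bases. The function name and the toy lengths are illustrative and are not taken from the assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Lengths are sorted from longest to shortest and accumulated until the
    running total reaches half of the summed length; the length at which
    that happens is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must not be empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example, invented for illustration only.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))
```

Because N50 is weighted by sequence length, it responds to the length distribution differently from the plain average, which is why assemblies usually report both.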
Of the 385,288 transcripts, 43,756 (11.4%) were successfully annotated by Blast against the Swiss-Prot database 14, and 37,992 (9.9%) were annotated with functional GO terms in Blast2GO 16. Although this proportion seems low, similar percentages of GO annotation have been reported in other non-model molluscan species, such as pearl oysters 21 and the mussel Perna viridis 22. The low percentage is probably related to the fact that the number of BLAST hits obtained for a de novo transcriptome of a non-model species depends strongly on the availability of annotated sequence information from a reference genome 23, 24. For instance, higher percentages of GO annotation (> 25%) have been reported for mollusc species whose transcriptomes are well characterized, as in Haliotis species 25 and in mussels of the genus Mytilus 26. The second-level functional annotation showed that 34,763 transcripts (91.50% of the GO-annotated transcripts) were related to biological processes, 35,423 (93.24%) were involved in molecular functions, and 34,246 (90.14%) were associated with cellular components (Fig. 1). Of the GO-annotated transcripts, 31,148 (81.99%) were associated with all three ontology categories; thus, most of the annotated genes were involved in multiple biological functions (Fig. 1). The functional descriptions of the basic GO categories at the second and third levels are shown in Fig. 2.
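The annotation percentages can be recomputed from the reported counts, and the regions of the Venn diagram in Fig. 1 can be derived by inclusion-exclusion. The sketch below assumes that the 31,148 transcripts are those shared by all three categories (the central region of the diagram) and that every GO-annotated transcript belongs to at least one category; both are assumptions, and the variable names are illustrative.

```python
ASSEMBLED = 385_288      # transcripts in the assembly
SWISSPROT = 43_756       # transcripts with a Swiss-Prot hit
GO_ANNOTATED = 37_992    # transcripts with GO terms

bp, mf, cc = 34_763, 35_423, 34_246  # per-category counts from the text
all_three = 31_148  # assumed to be the intersection of the three categories


def pct(part, whole):
    """Percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# With e1, e2, e3 = transcripts in exactly one, two, three categories:
#   e1 + e2 + e3     = GO_ANNOTATED
#   e1 + 2*e2 + 3*e3 = bp + mf + cc
exactly_two = (bp + mf + cc) - GO_ANNOTATED - 2 * all_three
exactly_one = GO_ANNOTATED - all_three - exactly_two

print(pct(SWISSPROT, ASSEMBLED), pct(GO_ANNOTATED, ASSEMBLED))
print(pct(bp, GO_ANNOTATED), pct(mf, GO_ANNOTATED), pct(cc, GO_ANNOTATED))
print(exactly_one, exactly_two, all_three)
```

Under these assumptions the derived counts are non-negative and sum to the number of GO-annotated transcripts, so the reported figures are mutually consistent.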
Figure 1

Venn diagram for number of transcripts annotated in each GO basic category: Biological process, Molecular function, and Cellular component.

Figure 2

Proportions of gene ontology annotations to gonadal tissue of Perumytilus purpuratus male. The functional descriptions for each basic GO category are shown: biological process (a), molecular function (b), and cellular components (c). GO second level is nested on GO third level.

Representing the basic processes performed by the organism, biological process was the most diverse category at both the second and third levels; its three principal GO terms were cellular (95%), single-organism (90%) and metabolic (83%) processes (Fig. 2A). Conversely, molecular function was the least variable category (Fig. 2B), with transcripts associated principally with the binding (93%) and catalytic (58%) terms. Within molecular function, binding-related transcripts at the third level had different binding targets; specifically, the protein-binding term was associated with the highest number of transcripts (80%). Finally, the annotated transcripts associated with cellular components (Fig. 2C) indicate that the basic cellular terms were successfully identified. In this category, most transcripts were assigned to cell- and organelle-related genes and were involved in some type of cellular structural organization. The number of transcripts at the second and third levels of each GO category is detailed in the Electronic supplementary material.
Our de novo transcriptome also contained a total of 5,075 transcripts related to the reproductive process and reproduction terms of the biological process category, for example single-organism reproductive process, multi-organism reproductive process, developmental process involved in reproduction, and sexual reproduction (Fig. 2A and Electronic supplementary material). In addition, a specific subset of GO annotations was generated using the “sperm” term. As a result, a total of 1,294 sequences were assigned to the three functional GO categories; biological process was the most represented category (1,002 sequences), followed by cellular component (282 sequences) and molecular function (10 sequences). In this search, many transcripts were related to sperm-specific processes.
Accordingly, the transcripts in the cellular component category were related to specific sperm structures, for instance the acrosomal vesicle (GO: 0001669), sperm flagellum (GO: 0097223) and sperm midpiece (GO: 0097225). The biological process transcripts, in turn, were strongly associated with sperm processes required for fertilization, such as spermatogenesis (GO: 0007283), sperm capacitation (GO: 0048240), sperm motility (GO: 0030317) and sperm ejaculation (GO: 0042713), and with processes in which sperm-egg interaction could be implicated, such as prevention of polyspermy (GO: 0060468), regulation of fusion of sperm to egg plasma membrane (GO: 0043012), fusion of sperm to egg plasma membrane (GO: 0007342), sperm-egg recognition (GO: 0035036) and binding of sperm to zona pellucida (GO: 0007339). Gamete-recognition proteins have been widely studied, particularly in free-spawning marine invertebrates 27-29. The evidence suggests that these proteins often evolve rapidly 30 and could have an important role in reproductive isolation during the early stages of speciation 30. Nevertheless, to identify gamete-recognition proteins such as sperm-binding proteins more accurately, future experimental analyses should compare gonad transcriptome expression profiles in males and females of P. purpuratus at different maturation stages. Finally, our de novo transcriptome and its functional annotations provide a valuable molecular resource to improve our knowledge of reproductive and fertilization-related genes in P. purpuratus.
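The text does not specify how the "sperm" subset was extracted. One plausible approach, sketched below under that caveat, is a case-insensitive keyword match on GO term names followed by a per-category count of distinct transcripts. The record layout, function names and transcript identifiers are hypothetical; the GO identifiers and term names are real GO terms used only as sample data.

```python
from collections import Counter

# Hypothetical annotation records:
# (transcript id, GO id, GO category, GO term name)
annotations = [
    ("tr1", "GO:0007283", "biological_process", "spermatogenesis"),
    ("tr2", "GO:0001669", "cellular_component", "acrosomal vesicle"),
    ("tr3", "GO:0097225", "cellular_component", "sperm midpiece"),
    ("tr4", "GO:0030317", "biological_process", "sperm motility"),
    ("tr4", "GO:0007339", "biological_process",
     "binding of sperm to zona pellucida"),
    ("tr5", "GO:0005515", "molecular_function", "protein binding"),
]


def subset_by_keyword(records, keyword="sperm"):
    """Keep records whose GO term name contains the keyword."""
    keyword = keyword.lower()
    return [r for r in records if keyword in r[3].lower()]


def count_transcripts_by_category(records):
    """Count distinct transcripts per GO category."""
    seen = set()
    counts = Counter()
    for transcript_id, _go_id, category, _name in records:
        if (transcript_id, category) not in seen:
            seen.add((transcript_id, category))
            counts[category] += 1
    return counts


sperm_records = subset_by_keyword(annotations)
sperm_counts = count_transcripts_by_category(sperm_records)
print(sperm_counts)
```

A match on term names alone would miss terms such as acrosomal vesicle, whose name does not contain the keyword, so the published subset was presumably built from a broader search, for example one that also covers term definitions or descendant terms.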
References

1.  A Tridimensional Self-Thinning Model for Multilayered Intertidal Mussels.

Authors:  Ricardo Guiñez; Juan Carlos Castilla
Journal:  Am Nat       Date:  1999-09       Impact factor: 3.926

2.  BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs.

Authors:  Felipe A Simão; Robert M Waterhouse; Panagiotis Ioannidis; Evgenia V Kriventseva; Evgeny M Zdobnov
Journal:  Bioinformatics       Date:  2015-06-09       Impact factor: 6.937

3.  Scorched mussels (BIVALVIA: MYTILIDAE: BRACHIDONTINAE) from the temperate coasts of South America: phylogenetic relationships, trans-Pacific connections and the footprints of Quaternary glaciations.

Authors:  Berenice Trovant; J M Lobo Orensanz; Daniel E Ruzzante; Wolfgang Stotz; Néstor G Basso
Journal:  Mol Phylogenet Evol       Date:  2014-10-14       Impact factor: 4.286

4.  Transcriptomic responses of Perna viridis embryo to Benzo(a)pyrene exposure elucidated by RNA sequencing.

Authors:  Xiu Jiang; Liguo Qiu; Hongwei Zhao; Qinqin Song; Hailong Zhou; Qian Han; Xiaoping Diao
Journal:  Chemosphere       Date:  2016-08-11       Impact factor: 7.086

5.  Transcriptomics provides insight into Mytilus galloprovincialis (Mollusca: Bivalvia) mantle function and its role in biomineralisation.

Authors:  Nadège A Bjärnmark; T Yarra; A M Churcher; R C Felix; M S Clark; D M Power
Journal:  Mar Genomics       Date:  2016-03-29       Impact factor: 1.710

6.  Sequencing technologies and genome sequencing.

Authors:  Chandra Shekhar Pareek; Rafal Smoczynski; Andrzej Tretyn
Journal:  J Appl Genet       Date:  2011-06-23       Impact factor: 3.240

7.  Development of a Pacific oyster (Crassostrea gigas) 31,918-feature microarray: identification of reference genes and tissue-enriched expression patterns.

Authors:  Nolwenn M Dheilly; Christophe Lelong; Arnaud Huvet; Pascal Favrel
Journal:  BMC Genomics       Date:  2011-09-27       Impact factor: 3.969

8.  SWISS-PROT: connecting biomolecular knowledge via a protein database.

Authors:  E Gasteiger; E Jung; A Bairoch
Journal:  Curr Issues Mol Biol       Date:  2001-07       Impact factor: 2.081

9.  Full-length transcriptome assembly from RNA-Seq data without a reference genome.

Authors:  Manfred G Grabherr; Brian J Haas; Moran Yassour; Joshua Z Levin; Dawn A Thompson; Ido Amit; Xian Adiconis; Lin Fan; Raktima Raychowdhury; Qiandong Zeng; Zehua Chen; Evan Mauceli; Nir Hacohen; Andreas Gnirke; Nicholas Rhind; Federica di Palma; Bruce W Birren; Chad Nusbaum; Kerstin Lindblad-Toh; Nir Friedman; Aviv Regev
Journal:  Nat Biotechnol       Date:  2011-05-15       Impact factor: 54.908

10.  Trimmomatic: a flexible trimmer for Illumina sequence data.

Authors:  Anthony M Bolger; Marc Lohse; Bjoern Usadel
Journal:  Bioinformatics       Date:  2014-04-01       Impact factor: 6.937
