María Prado-Álvarez, Sonia Dios, Pablo García-Fernández, Ricardo Tur, Ismael Hachero-Cruzado, Pedro Domingues, Eduardo Almansa, Inmaculada Varó, Camino Gestal.
Abstract
Cephalopods have long been considered enigmatic animals and have attracted the attention of scientists from different areas of expertise. However, many questions about the way of life of these invertebrates remain open. The aim of this study is to construct a reference transcriptome for Octopus vulgaris early life stages, to enrich existing databases and to provide a new dataset that can be reused by other researchers in the field. To this end, samples from different developmental stages were combined, including embryos, newly hatched paralarvae, and paralarvae of 10, 20 and 40 days post-hatching. Additionally, different dietary and rearing conditions and pathogenic infections were tested. At least three biological replicates per condition were submitted to RNA-seq analysis. All sequencing reads from the experimental conditions were combined into a single dataset to generate a reference transcriptome assembly, which was then functionally annotated. The number of reads aligned to this reference was counted to estimate transcript abundance in each sample. This dataset provides a complete reference for future transcriptomic studies in O. vulgaris.
Year: 2022 PMID: 36209315 PMCID: PMC9547907 DOI: 10.1038/s41597-022-01735-2
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 8.501
Fig. 1 Experimental and RNA-seq workflows of O. vulgaris transcriptomes. (a) Set of samples collected over the pre-settlement development of O. vulgaris (filled coloured boxes) and the three trials (infection, suboptimal culture condition and dietary) carried out at different stages (empty coloured boxes). Pictures of each stage are shown at the corresponding life stage and dietary treatment. Age of paralarvae is indicated in days post-hatching (dph). (b) Sample preparation and library construction workflow. (c) Transcriptome analysis workflow including construction of reference transcriptome. Photographs by R. Tur.
Fig. 2 Quality data filtering in a representative sample. (a) Electropherogram showing the fluorescence and running time. (b) Classification of raw reads into clean reads (purple), reads containing adapter contamination (green), reads containing uncertain nucleotides in more than 10% of the read length (orange) and reads containing uncertain base pairs (N) (yellow). (c) Error rate per base position. (d) GC content distribution per base position.
Fig. 3 Assembled transcriptome quality. Graphical representation of BUSCO scores of the O. vulgaris paralarvae transcriptome: C:87.2% [S:34.8%, D:52.4%], F:3.6%, M:9.2%; n:5295 - Mollusca Odb10 database.
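The BUSCO summary in Fig. 3 uses the usual short notation, in which C (complete) is the sum of S (single-copy) and D (duplicated), and C, F (fragmented) and M (missing) together account for all n searched orthologues. The following is a minimal Python sketch that parses the string and checks this arithmetic; the helper name `parse_busco` is hypothetical and is not part of BUSCO itself.

```python
import re

def parse_busco(summary: str) -> dict:
    """Parse a BUSCO short-summary string into percentages plus the count n.

    Hypothetical helper for illustration; not a BUSCO API.
    """
    fields = dict(re.findall(r"([CSDFMn]):([\d.]+)", summary))
    scores = {key: float(value) for key, value in fields.items() if key != "n"}
    scores["n"] = int(fields["n"])
    return scores

# Summary string copied from the Fig. 3 caption.
summary = "C:87.2% [S:34.8%, D:52.4%], F:3.6%, M:9.2%; n:5295"
scores = parse_busco(summary)

# Complete = single-copy + duplicated; complete + fragmented + missing = 100%.
complete_from_parts = scores["S"] + scores["D"]
total_percent = scores["C"] + scores["F"] + scores["M"]
print(round(complete_from_parts, 1), round(total_percent, 1))
```

The high duplicated fraction (D) is worth keeping in mind when reading the table of transcripts and unigenes, since the assembly retains many near-identical sequences.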
Number of transcripts and unigenes classified by length intervals and length distribution.
| | Transcripts | Unigenes |
|---|---|---|
| 200-500 bp | 226553 | 226334 |
| 500-1k bp | 117727 | 117726 |
| 1k-2k bp | 50903 | 50902 |
| >2k bp | 31553 | 31553 |
| Total number | 426736 | 426515 |
| Minimum length (bp) | 201 | 201 |
| Mean length (bp) | 797 | 797 |
| Median length (bp) | 475 | 475 |
| Maximum length (bp) | 38516 | 38516 |
| N50 (bp) | 1139 | 1139 |
| N90 (bp) | 344 | 344 |
| Total nucleotides (bp) | 340093823 | 340038556 |
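For readers unfamiliar with the N50 and N90 rows, the sketch below shows the standard definition: sort sequences from longest to shortest and report the length at which the running total first reaches 50% (or 90%) of all assembled bases. The function name `nx` and the toy lengths are illustrative assumptions; the source does not state which tool produced the values in the table.

```python
def nx(lengths, fraction=0.5):
    """Return the Nx statistic for a list of sequence lengths.

    Nx is the length of the shortest sequence in the smallest set of
    longest sequences that together cover `fraction` of all bases.
    """
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    raise ValueError("lengths must be non-empty")

# Toy lengths (made up for illustration, not taken from the dataset).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
n50 = nx(toy_lengths, 0.5)
n90 = nx(toy_lengths, 0.9)
print(n50, n90)
```

Applied to the full list of assembled transcript lengths, the same calculation would yield the N50 and N90 values of the kind reported in the table.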
Total number and percentage of unigenes successfully annotated in each database.
| | Number of Unigenes | Percentage (%) |
|---|---|---|
| Annotated in Nr | 113338 | 26.57 |
| Annotated in Nt | 150149 | 35.2 |
| Annotated in KO | 5572 | 1.3 |
| Annotated in Swiss-Prot | 92701 | 21.73 |
| Annotated in Pfam | 112932 | 26.47 |
| Annotated in GO | 113619 | 26.63 |
| Annotated in KOG | 48550 | 11.38 |
| Annotated in all Databases | 3216 | 0.75 |
| Annotated in at least one Database | 192189 | 45.06 |
| Total Unigenes | 426515 | 100 |
Fig. 4 Classification of annotated unigenes. (a) Percentage of species similarity based on Nr annotation. (b) Number of unigenes successfully annotated into GO Database and grouped into three main GO domains: Biological Process (BP), Cellular Component (CC), and Molecular Function (MF).
Fig. 5 Sample correlation and gene expression level of experimental conditions. (a) Scatter diagram of pairwise correlation between samples (Pearson coefficient). (b) Box plot of log10(FPKM + 1) per sample group.
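The expression values in Fig. 5 are plotted as log10(FPKM + 1). As a rough sketch of what that transformation involves, the Python snippet below applies the standard FPKM definition (read count scaled by transcript length in kilobases and by millions of mapped reads) followed by the log transform. The function names and the example numbers are hypothetical; the source does not specify the software used for quantification.

```python
import math

def fpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)

def log_fpkm(value):
    """log10(FPKM + 1); the +1 keeps unexpressed transcripts at 0."""
    return math.log10(value + 1)

# Hypothetical example: 500 reads on a 1000 bp unigene, 10 million mapped reads.
expressed = fpkm(500, 1000, 10_000_000)
silent = fpkm(0, 2000, 10_000_000)
print(expressed, log_fpkm(expressed), log_fpkm(silent))
```

Adding 1 before taking the logarithm avoids an undefined log10(0) for unigenes with no aligned reads, which is why box plots of this kind start at zero.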
| Measurement(s) | Transcriptome sequencing assay |
| Technology Type(s) | RNA-seq assay (Illumina) |
| Sample Characteristic - Organism | Octopus vulgaris |
| Sample Characteristic - Environment | Ocean |
| Sample Characteristic - Location | NW Spain (Ría de Vigo, Galicia) |