| Literature DB >> 34822568 |
Dany Domínguez-Pérez1, José Carlos Martins1, Daniela Almeida2, Pedro Reis Costa3, Vitor Vasconcelos1,4, Alexandre Campos1.
Abstract
Bivalves constitute an important source of proteins for human consumption, but some accumulate biotoxins such as diarrhetic shellfish toxins (DSTs), constituting a risk to human health. The cockle Cerastoderma edule is one of the most important species harvested in the Portuguese coast but also one of the most affected species due to recurrent DSTs exposure. However, little is known regarding the effects of the toxins produced by blooming dinoflagellates on C. edule. Herein, we explore the Differentially Expressed Genes (DEGs) of two tissues (gills and digestive gland) from wild cockles sampled in Portugal, through their whole transcriptomic response in two different seasons (exposed and not exposed to DSTs). The de novo transcriptome assembly returned 684,723 contigs, N50 of 1049, and 98.53% completeness. Altogether, 1098 DEGs were identified, of which 353 DEGs were exclusive for the digestive gland, 536 unique for the gills and 209 DEGs were common. Among DEGs were identified known DSTs-biomarkers including glutathione peroxidase, glutathione S-transferase, superoxide dismutase, cytochrome P450, ABC transporters, actin and tubulin-related proteins, Heat shock proteins and complement C1Q-like proteins. This study provides the first transcriptomic profile of C. edule, giving new insights about its molecular responses under different environmental conditions of DSTs exposure.Entities:
Keywords: DSTs; Differentially Expressed Genes (DEGs); Portugal; RNAseq; biomarkers; bivalves
Mesh:
Substances:
Year: 2021 PMID: 34822568 PMCID: PMC8625317 DOI: 10.3390/toxins13110784
Source DB: PubMed Journal: Toxins (Basel) ISSN: 2072-6651 Impact factor: 4.546
Figure 1Shellfish harvesting area in Ria de Aveiro, Portugal. The composite figure displays in the background (A) the map of Portugal (x-axis: longitude, y-axis: latitude), highlighting the geographical localization of Ria de Aveiro, which is enlarged in panel (B), showing the coordinates of Aveiro city (red) and the sampling point (green). In the panel (C) is shown a picture of the harvesting area in Ria de Aveiro, during the low tide in the sampling point of (D) the cockles Cerastoderma edule.
Statistics summary of transcriptomic analyses performed with wild cockles sampled in Ria de Aveiro, Portugal, exposed to Diarrhetic Shellfish Toxins (DSTs), and without DST. The table includes Raw Data Statistics from sequencing process, the de novo Assembly and Annotation metrics.
| Summary of Analyses | Statistics | Ce_DGc | Ce_DGe | Ce_Gc | Ce_Ge |
|---|---|---|---|---|---|
|
| Total Bases 2 | 6,248,071,054 | 6,402,268,630 | 6,258,018,330 | 6,333,078,920 |
| Read Count 3 | 41,377,954 | 42,399,130 | 41,443,830 | 41,940,920 | |
| GC (%) 4 | 42.33 | 41.63 | 41.09 | 41.05 | |
| AT (%) 5 | 57.67 | 58.37 | 58.91 | 58.95 | |
| Q20 (%) 6 | 97.95 | 97.89 | 98.02 | 97.9 | |
| Q30 (%) 7 | 94.06 | 93.83 | 93.98 | 93.8 | |
|
| Assembled transcripts | 684,723 | |||
| Total genes | 361,460.0 | ||||
| Minimum/Maximum Contig length (bp: base pair) 8 | 200/42,164 bp | ||||
| Contig length (N50) 8 | 1094 bp | ||||
| Contig length (Average) 9 | 736.09 bp | ||||
| GC (%) | 37.39 | ||||
| Total assembled bases (all transcripts) | 504,019,879 | ||||
| Total assembled bases (longest isoform per gene) | 199,701,724 | ||||
| Percentage of all read aligned/not aligned 10 | 92.1%/7.8% | 92.4%/7.5% | 93.1%/6.8% | 93.3%/6.6% | |
|
| number of BUSCOs/species | 954/65 | |||
| Complete Single-Copy | 282/29.56% | ||||
| Complete duplicated | 658/69% | ||||
| Fragmented | 8/0.84% | ||||
| Missing | 6/0.63% | ||||
|
| number of clusters at 90% identity | 402,961 | |||
| Complete Single-Copy | 771/80.82% | ||||
| Complete Duplicated | 170/17.82% | ||||
| Fragmented | 8/0.84% | ||||
| Missing | 5/0.52% | ||||
|
| Transcripts with ORF (best isoform per gene) | 48,107 | |||
|
| Blast Hits nr/NCBI (Metazoa/non-Metazoa) 14 | 37,184/362 | |||
| Blast Hits SwissProt (Metazoa/non-Metazoa) 15 | 26,968/22 | ||||
| Averall ORFs with Blast hits/no hits 16 | 37,262 (77.5%)/10,845 (22.5%) | ||||
| EggNOG 17 | 31,848 (66.2%) | ||||
| KEGG Pathways/KO 18 | 12,760 (25.9%)/20,621 (42.8%) | ||||
| IPS/GO annotation (%) 19 | 48,107 (100%)/19,398 (40.3%) | ||||
| Gene Ontology (ORFs with GO annotation) 20 | 21,799 (45.3%) | ||||
1 Sample ID/Condition: Ce_DGc: Cerastodesma edule digestive gland (DG) without dinophysistoxins (DSTs), Ce_DGe: C. edule DG exposed to DSTs, Ce_Gc: C. edule gills (G) without DSTs, Ce_Ge: C. edule gills exposed to DSTs. 2 Total read bases: Total number of bases sequenced. 3 Total reads: Total number of reads, corresponding to the sum of read 1 (Upstream Files Pattern: _1) and read 2 (Downstream Files Pattern: _2), obtained from Illumina paired-end sequencing. 4 GC(%): GC content. 5 AT(%): AT content. 6 Q20(%): Ratio of bases that have Phred quality score (expresses the accuracy of each nucleotide) of over 20 (99% base call accuracy). 7 Q30(%): Ratio of bases that have Phred quality score (expresses the accuracy of each nucleotide) of over 30 (99.9% Base call accuracy). 8 N50: is defined as the sequence length of the shortest contig at 50% of the total transcriptome/genome length. 9 Average contigs length. 10 RNA-Seq Read Representation corresponding to the percentage of the overall alignment (reads aligned/not aligned). 11 The de novo assembly quality and completeness assessment with the Benchmarking Universal Single-Copy Orthologs (BUSCOs), showing the number of BUSCOs/relative representation (percentage) of Complete Single-Copy, Complete Duplicated, Fragmented and Missing. BUSCO v5 run in mode: Transcriptome, lineage Metazoa, e-value 1.0E−3. 12 Sequence redundancy of the de novo assembled transcriptome was reduced using CD-HIT 4.8.1, setting 0.9 as the Sequence Identity Threshold. 13 Protein Coding Sequences (CDS) obtained by six-frame translation with TransDecoder v5.5.0., considering a minimum length of 100 amino acids for open reading frames (ORFs), homology to known proteins via Pfam searches, and the best/longest isoform per gene. 14 Blast hits of the 48,107 CDS (best isoforms) obtained from the transcriptome assembly searched against the non-redundant (nr) protein database from NCBI, using BlastX program, filtering by the taxonomic section Metazoa/non-Metazoans (accessed on 11 May 2021), setting a cut-off e-value 1.0E−3. 15 Blast hits of the 48,107 CDS (best isoforms) obtained from the transcriptome assembly searched against the protein database UniProtKB/Swiss-Prot, using BlastX program, filtering by the taxonomic section Metazoa/non-Metazoans (accessed on 12 May 2021), setting a cut-off e-value 1.0E−3. 16 Summary of Blast hits/no hits found against both databases searched, and their relative representation (percentage) within all (48,107) predicted ORFs. 17 Functionally annotated ORFs according to orthologous groups (OGs: i.e., COGs, arCOGs, ENOGs, KOGs), using the eggNOG Mapper 1.0.3 with EggNOG 5.0.0, and its relative representation (percentage) within all (48,107) predicted ORFs. 18 Functionally annotated ORFs according to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and KEGG Orthology (KO), and its relative representation (percentage) within all (48,107) predicted ORFs. 19 Functionally annotated ORFs according to Gene Ontology (GO) and known proteins domains using InterProScan (IPS) v5.51–85.0, searched against InterPro member databases (i.e., Pfam, PROSITE, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, Gene3D, and PANTHER), as well as its relative representation (percentage) within all (48,107) pred. 20 Overall functionally annotated ORFs according to Gene Ontology (GO) and its relative representation (percentage) within all (48,107) predicted ORFs.icted ORFs.
Figure 2Pairwise Differential Expression results using NOISeq R/Bioc package (p < 0.01). The figure shows a scatter plot showing the log-fold change (M) and the absolute value of the difference in expression between conditions (D), where D values are displayed in a log scale. The Differentially Expressed Genes (DEGs) are depicted as red dots. The corresponding heat map of DEGs (Z-score values) from each condition tested (in columns) are depicted: Ce_DGe_vs_Ce_DGc, in the top panel corresponding to Digestive Gland (DG) samples, and Gills (G) in the bottom (Ce_Ge_vs_Ce_Gc). The contrast condition and reference/control samples are shown as Ce DGe/Ce Ge and Ce DGc/Ce Gc, respectively.
Figure 3Venn Diagram of the Differentially Expressed Genes (DEGs) and heat map of the shared DEGs between pairwise comparisons. The figure shows in the left panel unique and overlapped DEGs between the tested conditions Ce_DGe_vs_Ce_DGc, and Ce_Ge_vs_Ce_Gc, corresponding to Digestive Gland (DEGs_DG) samples, and Gills (DEGs_G). The contrast condition and reference/control sample are shown in the columns of the heat map (right panel) as Ce_DGe/Ce_Ge and Ce_DGc/Ce_Gc, respectively. The heat map depicts the corresponding expression (Z-score values) of the 209 common DEGs found, of which a cluster of up-regulated genes in the exposed condition, comprising 16 transcripts highlighted (in the bottom between dashed lines). Venn diagram was drawn with an online free tool, available at the webserver of the Bioinformatics and Evolutionary Genomics Center (BEG/Van de Peer Lab site, Ghent University, Belgium, http://bioinformatics.psb.ugent.be/webtools/Venn/accessed on 15 July 2021).
List of Differentially Expressed Genes (DEGs) related to known biomarkers involved in the shellfish molecular response after DSTs exposure.
| Functional 1 | Gene/Transcript 2 | Gene/Transcript Description 3 | Gene/Transcript 4
| Organism 5
| DEGs Subset 6 (UP-Regulated/Down-Regulated) LFC 7 | |||||
|---|---|---|---|---|---|---|---|---|---|---|
| Unique (DG) 6.1
| Unique (DG) 6.2 168 DR | Common 16 UR 6.3 | Common 189 DR 6.4 | Unique (G) 6.5 80 UR | Unique (G) 6.6 456 DR | |||||
| Antioxidant | TRINITY_DN158227_c0_g1 | Extracellular superoxide dismutase [Cu-Zn] | Superoxide metabolic process; metal ion |
| −8.15 | |||||
| TRINITY_DN250845_c0_g1 | Extracellular superoxide dismutase [Cu-Zn] |
| −8.15 | |||||||
| TRINITY_DN57639_c0_g2 | Superoxide dismutase |
| −9.14 | |||||||
| TRINITY_DN54684_c0_g1 | Glutathione | Response to oxidative stress; glutathione peroxidase activity |
| −7.91 | ||||||
| TRINITY_DN1482_c1_g1 | Glutathione | Glutathione catabolic process glutathione hydrolase activity |
| 2.84 | ||||||
| TRINITY_DN7416_c0_g1 | Glutathione S-transferase Mu 3 | Glutathione metabolic process; protein binding |
| −8.29 | ||||||
| TRINITY_DN8990_c0_g1 | Putative glutathione-specific γ-glutamyl | Glutathione catabolic process; γ-glutamyl |
| 3.46 | ||||||
| Metabolic detoxification | TRINITY_DN9032_c0_g3 | Cytochrome P450 2U1-like | Monooxygenase |
| 3.66 | |||||
| TRINITY_DN2028_c0_g1 | Steroid 17-α-hydroxylase |
| 5.2 | |||||||
| 3.1 | ||||||||||
| TRINITY_DN12127_c0_g1 | ABC transporter G family member 21-like isoform X1 | ATPase-coupled transmembrane transporter activity |
| 6.64 | ||||||
| TRINITY_DN86_c1_g1 | Phosphatidylcholine translocator ABCB4-like | ATP binding; xenobiotic transmembrane transporter activity |
| 3.18 | ||||||
| Transcription | TRINITY_DN70033_c0_g1 | KH domain protein | RNA processing; DNA |
| −8.01 | |||||
| −9.24 | ||||||||||
| TRINITY_DN22122_c0_g1 | Cell division cycle and apoptosis regulator protein 1 | regulation of transcription, DNA-templated |
| −9.21 | ||||||
| −10.38 | ||||||||||
| TRINITY_DN54844_c0_g2 | CREB-binding protein | regulation of transcription, |
| −7.39 | ||||||
| TRINITY_DN116751_c0_g1 | Transcription elongationregulator 1 | FF domain; protein-protein interaction |
| −8.26 | ||||||
| −8.56 | ||||||||||
| cytoskeleton, actin related proteins | TRINITY_DN9687_c0_g1 | Tubulin α chain testis- | structural constituent of cytoskeleton; |
| −8.65 | |||||
| −8.66 | ||||||||||
| TRINITY_DN9687_c1_g2 | Tubulin α-chain |
| −8.48 | |||||||
| TRINITY_DN181044_c0_g1 | Actin depolymerizing factor | actin filament depolymerization; actin binding |
| −7.5 | ||||||
| TRINITY_DN34498_c0_g1 | Actin-1 isoform X2 | actin family |
| −7.08 | ||||||
| TRINITY_DN613_c1_g1 | VChain V, beta-actin | ATP binding; cytoplasm; cytoskeleton |
| −11.08 | ||||||
| −11.81 | ||||||||||
| TRINITY_DN6932_c0_g1 | Actin β/γ 1 | Actin family |
| −8.61 | ||||||
| TRINITY_DN7730_c0_g1 | Actin β/γ 1 |
| −5.42 | |||||||
| −7.54 | ||||||||||
| Inmune response | TRINITY_DN10313_c0_g2 | Complement C1Q-like protein | Protein binding |
| −3.83 | |||||
| TRINITY_DN1050_c0_g1 | Complement C1q tumor necrosis factor-related protein 4-like |
| 8.06 | |||||||
| TRINITY_DN14010_c0_g1 | Putative C1q domain containing protein MgC1q83 |
| −8.38 | |||||||
| TRINITY_DN1909_c0_g2 | Complement C1q-like protein 2 | 3.91 | ||||||||
| TRINITY_DN45102_c0_g1 | Complement C1Q-like protein |
| −5.41 | |||||||
| Heat shock protein HSP 90 (HSP90) | TRINITY_DN16801_c0_g1 | Hsp90 protein | Protein folding; ATP binding; ATP hydrolysis activity; unfolded protein binding |
| −8.31 | |||||
| −10.24 | ||||||||||
| TRINITY_DN260239_c0_g1 | Hsp90 co-chaperone Cdc37 isoform 1 | Protein kinase binding |
| −7.99 | ||||||
1 Major Functional Categories related to known proteins/biomarkers involved in dinophysistoxins (DSTs) metabolism and detoxification. 2 Sequence name of the DEGs identified within the corresponding Major Functional Category. 3 Blast top-hit description of the corresponding DEGs. 4 Gene Ontology (GO)/InterProScan (IPS) description of the corresponding DEG (Gene/transcript). 5 Organism and accession number of the corresponding Blast top-hit obtained for each DEGs. 6 Subsets of unique and overlapped DEGs (used as test set in the enrichment analyses), identified in the digestive gland and gills of the cockle C. edule, exposed to DSTs, where contrast condition and reference/control sample were Ce_DGe/Ce_Ge and Ce_DGc/Ce_Gc, respectively. Subsets: 6.1 185 unique up-regulated DEGs identified in the digestive gland (DG) (unique DG 185 UR, Ce_DGe_vs_Ce_DGc); 6.2 168 unique down-regulated DEGs identified in the digestive gland (unique DG 168 DR, Ce_DGe_vs_Ce_DGc); 6.3 16 common DEGs with a similar up-regulated pattern identified in both conditions exposed to DSTs (common 16 UR); 6.4 189 common DEGs with a similar down-regulated pattern identified in both conditions exposed to DSTs (common 189 DR); 6.5 80 unique up-regulated DEGs identified in the gills (Ce_Ge_vs_Ce_Gc, unique (G) 80 up-regulated); 6.6 456 unique down-regulated DEGs as test set (file: test_set_456_unique_DEGs_down_Ce_Ge_vs_Ce_Gc). 7 The log-fold change (M) and the absolute value of the difference in expression obtained with NOISeq R/Bioc package between conditions: Ce_DGe_vs_Ce_DGc, in the Digestive Gland (DG), and Ce_Ge_vs_Ce_Gc in the Gills (G).
Figure 4Enrichment analyses performed for the Differentially Expressed Genes (DEGs). The top-thirty enriched GO terms obtained with one-tailed Fisher’s exact test (FDR < 0.05) are represented by a stacked bar (in green) as the percentage of the GO terms found in the Reference Set (in red), for each subset of DEGs used as Test Set: 189 common DEGs with a similar down-regulated pattern identified in both condition exposed to DSTs (189_common_DEGs), unique down-regulated DEGs identified in the digestive gland (168_unique_DEGs_DG), and 456 unique down-regulated DEGs identified in the gills (456_unique_DEGs_G). The tested conditions in the pairwise analyses were Ce_DGe_vs_Ce_DGc and Ce_Ge_vs_Ce_Gc, corresponding to Digestive Gland (DEGs_DG) samples, and Gills (DEGs_G), where the contrast condition and reference/control samples were Ce_DGe/Ce_Ge and Ce_DGc/Ce_Gc, respectively.