Romain Péden1,2, Pascal Poupin1, Bénédicte Sohm1, Justine Flayac1, Laure Giambérini1, Christophe Klopp3, Fanny Louis1, Sandrine Pain-Devin1, Marine Potet1, Rémy-Félix Serre4, Simon Devin5.
Abstract
Dreissenids are established model species for ecological and ecotoxicological studies: they are sessile filter feeders and therefore reflect in situ freshwater quality. Despite this strong interest for hydrosystem biomonitoring, omics data are still scarce. In the present study, we produced full de novo transcriptome assemblies of digestive glands to gain molecular insight into Dreissena polymorpha and D. rostriformis bugensis. Transcriptomes were obtained by Illumina RNA sequencing of seventy-nine organisms from fifteen populations inhabiting sites that exhibit multiple freshwater contamination levels and different hydrosystem topographies (open or closed systems). Using a recent de novo assembly algorithm, we produced complete, quality-checked and annotated transcriptomes. The strength of the present study lies in the completeness of the transcriptomes, which gather sequencing data from organisms of multiple populations, and in their full availability through an open-access interface that gives user-friendly, ready-to-use access to the data. The use of such data for proteogenomic and targeted biological pathway investigations is promising, as they are the first full transcriptomes for these two Dreissena species.
Year: 2019 PMID: 31653851 PMCID: PMC6814772 DOI: 10.1038/s41597-019-0252-x
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Fig. 1 Top 10 fields treemap for Dreissena publications.
BioProject deposit. The BioProject gathered all BioSamples, SRA[30] and TSA[31,32] related to this Data Descriptor.
| BioProject | Datasets (nb) | Accessions |
|---|---|---|
| PRJNA507340 | BioSamples (79) | SAMN10537936 to SAMN10538014 |
| | SRA (79) | SRR8354718 to SRR8354796 |
| | TSA (2) | GHIW00000000 and GHIX00000000 |
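The accession ranges listed in the table above can be expanded into individual identifiers, for example to script per-sample downloads. The following is a minimal sketch, assuming that every accession in a range shares the same alphabetic prefix and that the numeric suffix increments by one; the helper name `expand_accessions` is hypothetical and not part of any NCBI tool.

```python
import re


def expand_accessions(first: str, last: str) -> list[str]:
    """Expand an inclusive accession range, e.g. SRR8354718 to SRR8354796.

    Assumption: both accessions are an alphabetic prefix followed by digits.
    """
    pattern = re.compile(r"^([A-Za-z]+)(\d+)$")
    m_first, m_last = pattern.match(first), pattern.match(last)
    if not m_first or not m_last or m_first.group(1) != m_last.group(1):
        raise ValueError("accessions must share the same alphabetic prefix")
    prefix = m_first.group(1)
    width = len(m_first.group(2))  # keep any zero padding of the suffix
    start, stop = int(m_first.group(2)), int(m_last.group(2))
    if stop < start:
        raise ValueError("range end precedes range start")
    return [f"{prefix}{n:0{width}d}" for n in range(start, stop + 1)]


# Ranges copied from the BioProject table.
biosamples = expand_accessions("SAMN10537936", "SAMN10538014")
sra_runs = expand_accessions("SRR8354718", "SRR8354796")
print(len(biosamples), len(sra_runs))
```

Both ranges expand to 79 identifiers, which matches the dataset counts given in the table.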
NGSPipeline deposit.
| Data type | URL | Implemented software |
|---|---|---|
| Web interface | | BioMart and BLAST |
Datasets are fully accessible through a user-friendly web interface provided by the Genotoul sequencing and bioinformatics platform[29,33]. BioMart and BLAST are implemented for quick database interrogation.
Figshare deposit.
| Descriptive filename | Data format |
|---|---|
| | fasta |
| | fasta |
| BLAST_annotations_for_ | csv |
| BLAST_annotations_for_ | csv |
| GOterms_annotations_for_ | csv |
| GOterms_annotations_for_ | csv |
| KEGG_annotations_for_ | csv |
| KEGG_annotations_for_ | csv |
Supplementary files, including annotations (GO, KEGG) and ready-to-use transcriptomes in fasta format, are available on figshare[34].
Assembly metrics.
| Type | D. polymorpha | D. rostriformis bugensis |
|---|---|---|
| Number of contigs | 44,538 | 49,679 |
| Total size of contigs (bp) | 103,039,811 | 97,982,186 |
| N50 (bp) | 3,094 | 2,674 |
| Average length (bp) | 2,314 | 1,972 |
| Longest contig (bp) | 39,481 | 39,593 |
| Shortest contig (bp) | 210 | 207 |
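For readers less familiar with the N50 and average length rows above: N50 is the length of the contig at which, going from longest to shortest, the running total first reaches half of the total assembly size. The sketch below illustrates that definition on toy contig lengths (not data from this study) and checks the reported average lengths against the reported totals and contig counts. The function name `n50` is illustrative only.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Toy example (hypothetical lengths): total is 5,400 bp, half is 2,700 bp,
# and 1000 + 900 + 800 reaches that threshold, so N50 is 800.
toy_lengths = [1000, 900, 800, 700, 600, 500, 400, 300, 200]
toy_n50 = n50(toy_lengths)

# Consistency check using the table: average length = total size / contig count.
avg_first_column = round(103_039_811 / 44_538)   # first value column
avg_second_column = round(97_982_186 / 49_679)   # second value column
print(toy_n50, avg_first_column, avg_second_column)
```

The two computed averages agree with the reported 2,314 bp and 1,972 bp.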
Individual realignment statistics.
| Type | D. polymorpha | D. rostriformis bugensis |
|---|---|---|
| Total read count average | 69,815,940 ± 9,645,652 | 73,730,343 ± 9,641,714 |
| Alignment rate | 97.56% ± 0.7% | 98.15% ± 0.05% |
| Properly paired rate | 88.84% ± 1.15% | 86.58% ± 1.60% |
Crassostrea gigas proteins.
| Type | D. polymorpha | D. rostriformis bugensis |
|---|---|---|
| Number of | 4,352 | 4,281 |
| Number of unique Dreissenid contigs mapped to | 3,594 | 3,558 |
BUSCO analysis.
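| Type | D. polymorpha (count) | D. polymorpha (%) | D. rostriformis bugensis (count) | D. rostriformis bugensis (%) |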
|---|---|---|---|---|
| Complete BUSCOs | 921 | 94.2% | 923 | 94.4% |
| Complete and single-copy BUSCOs | 494 | 50.5% | 450 | 46.0% |
| Complete and duplicated BUSCOs | 427 | 43.7% | 473 | 48.4% |
| Fragmented BUSCOs | 5 | 0.5% | 7 | 0.7% |
| Missing BUSCOs | 52 | 5.3% | 48 | 4.9% |
| Total BUSCO groups searched | 978 | 100% | 978 | 100% |
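The percentages in the BUSCO table follow directly from the counts: each category is divided by the 978 groups searched, and complete BUSCOs are the sum of the single-copy and duplicated categories. The short sketch below recomputes them as a consistency check. The dictionary keys and the labels `column_1` and `column_2` (first and second pair of value columns) are naming assumptions made here, not BUSCO output fields.

```python
TOTAL_GROUPS = 978  # total BUSCO groups searched, from the table

# Counts copied from the table, one dictionary per pair of value columns.
counts = {
    "column_1": {"single": 494, "duplicated": 427, "fragmented": 5, "missing": 52},
    "column_2": {"single": 450, "duplicated": 473, "fragmented": 7, "missing": 48},
}


def busco_summary(c, total=TOTAL_GROUPS):
    """Return counts and percentages, including the derived 'complete' class."""
    full = dict(c, complete=c["single"] + c["duplicated"])
    # The three top-level classes must account for every group searched.
    assert full["complete"] + full["fragmented"] + full["missing"] == total
    return {name: (n, round(100 * n / total, 1)) for name, n in full.items()}


summaries = {label: busco_summary(c) for label, c in counts.items()}
for label, summary in summaries.items():
    print(label, summary)
```

The recomputed values match the table, for example 921 complete BUSCOs (94.2%) in the first column and 923 (94.4%) in the second.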
Fig. 2 Most represented species hits. Top 5 best species hits for (A) Dreissena polymorpha and (B) Dreissena rostriformis bugensis.
Biomarker best contig hits and closest species hits.
| Biomarkers | Contig database name (Dp) | Contig database name (Db) | Closest blastn hit E-value (Dp) | Closest blastn hit E-value (Db) | Closest blastn hit % identity (Dp) | Closest blastn hit % identity (Db) |
|---|---|---|---|---|---|---|
| Acetylcholinesterase | Dp_LOC105324424.2.2 | Db_LOC105324424.3.3 | 2e-8 | 4e-12 | 68% | 67% |
| Beta actin | Dp_ACT.1.3 | Db_ACT.2.4 | 0.0 | 0.0 | 90% | 90% |
| Catalase | Dp_LOC105339902 | Db_LOC105339902.2.4 | 0.0 | 0.0 | 77% | 76% |
| Superoxide dismutase (Cu-Zn) | Dp_SODC | Db_SODC | 4e-55 | 1e-48 | 73% | 72% |
| Superoxide dismutase (Mn) | Dp_LOC101852344 | Db_LOC101852344 | 3e-78 | 3e-77 | 72% | 72% |
| Estrogen receptor | Dp_LOC105318922.2.2 | Db_LOC105318922.1.2 | 0.0 | 0.0 | 75% | 74% |
| GABARAP | Dp_LOC105335545.1.2 | Db_LOC105335545.1.6 | 2e-103 | 1e-98 | 84% | 83% |
| Glutathione peroxidase 1 | Dp_GPX1.1.2 | Db_GPX1.1.2 | 1e-54 | 6e-59 | 72% | 73% |
| Glutathione peroxidase (Se) | Dp_LOC106070504.1.5 | Db_LOC106070504.6.8 | 3e-79 | 6e-75 | 73% | 73% |
| Heat Shock Protein70 | Dp_HSP7D.3.8 | Db_HSP7D.4.8 | 0.0 | 0.0 | 80% | 81% |
| Metallothionein (isoform 1) | Dp_MT.6.6 | Db_MT.6.6 | 6e-05 | 6e-05 | 80% | 80% |
| mTOR | Dp_LOC105331599 | Db_LOC105331599 | 0.0 | 0.0 | 72% | 72% |
| Na/K ATPase | Dp_LOC106058320 | Db_LOC106058320.1.2 | 0.0 | 0.0 | 76% | 77% |
| Succinate dehydrogenase | Dp_LOC101864456.1.3 | Db_LOC105338659 | 9e-14 | 2e-16 | 66% | 67% |
| Acid phosphatase | Dp_LOTGIDRAFT_139839 | Db_contig_19914 | 1e-11 | 0.0 | 67% | 95% |
| Thioredoxin reductase 1 | Dp_LOC105322705.2.2 | Db_LOC105322705 | 0.0 | 0.0 | 71% | 71% |
| MRP1 (Abcc1 gene) | Dp_LOC105347802.1.2 | Db_LOC105347802 | 0.0 | 0.0 | 69% | 69% |
| MDR1 (Abcb1 gene) | Dp_LOC101858982.2.4 | Db_LOC101858982.2.2 | 0.0 | 0.0 | 68% | 70% |
The D. polymorpha and D. rostriformis bugensis databases were searched for common ecotoxicological biomarkers. Contig coding sequences were then screened against NCBInr (blastn), and the closest species (excluding D. polymorpha and D. rostriformis bugensis) were reported with their E-values and identity percentages. GABARAP: Gamma-aminobutyric acid receptor-associated protein; mTOR: mechanistic target of rapamycin; MRP1: Multidrug resistance-associated protein 1; MDR1: Multidrug resistance protein 1; Cds: coding sequence.
Interspecies sequence homologies.
| Biomarkers | D. polymorpha contig accession | Cds length (bp) | D. rostriformis bugensis contig accession | Cds length (bp) | Interspecies Cds homology: % Ident. | E-val. | Gap |
|---|---|---|---|---|---|---|---|
| Acetylcholinesterase | Dp_LOC105324424.2.2 | 1,599 | Db_LOC105324424.3.3 | 1,644 | 91% | 0.0 | 0 (0%) |
| Beta actin | Dp_ACT.1.3 | 1,128 | Db_ACT.2.4 | 1,128 | 98% | 0.0 | 0 (0%) |
| Catalase | Dp_LOC105339902 | 1,515 | Db_LOC105339902.2.4 | 1,515 | 91% | 0.0 | 0 (0%) |
| Superoxide dismutase (Cu-Zn) | Dp_SODC | 459 | Db_SODC | 459 | 91% | 0.0 | 0 (0%) |
| Superoxide dismutase (Mn) | Dp_LOC101852344 | 621 | Db_LOC101852344 | 621 | 92% | 0.0 | 0 (0%) |
| Estrogen receptor | Dp_LOC105318922.2.2 | 1,476 | Db_LOC105318922.1.2 | 1,476 | 85% | 0.0 | 111 (3%) |
| GABARAP | Dp_LOC105335545.1.2 | 351 | Db_LOC105335545.1.6 | 351 | 95% | 7e-164 | 0 (0%) |
| Glutathione peroxidase 1 | Dp_GPX1.1.2 | 426 | Db_GPX1.1.2 | 426 | 94% | 0.0 | 0 (0%) |
| Glutathione peroxidase (Se) | Dp_LOC106070504.1.5 | 729 | Db_LOC106070504.6.8 | 714 | 85% | 0.0 | 15 (2%) |
| Heat Shock Protein70 | Dp_HSP7D.3.8 | 1,959 | Db_HSP7D.4.8 | 1,965 | 92% | 0.0 | 14 (<1%) |
| Metallothionein (isoform 1) | Dp_MT.6.6 | 219 | Db_MT.6.6 | 219 | 100% | 6e-117 | 0 (0%) |
| mTOR | Dp_LOC105331599 | 7,422 | Db_LOC105331599 | 7,410 | 92% | 0.0 | 12 (<1%) |
| Na/K ATPase | Dp_LOC106058320 | 3,093 | Db_LOC106058320.1.2 | 3,096 | 89% | 0.0 | 3 (<1%) |
| Succinate dehydrogenase | Dp_LOC101864456.1.3 | 504 | Db_LOC105338659 | 504 | 93% | 0.0 | 0 (0%) |
| Acid phosphatase | Dp_LOTGIDRAFT_139839 | 1,293 | Db_contig_19914 | 1,098 | 90% | 0.0 | 5 (0%) |
| Thioredoxin reductase 1 | Dp_LOC105322705.2.2 | 1,788 | Db_LOC105322705 | 1,932 | 92% | 0.0 | 3 (<1%) |
| MRP1 (Abcc1 gene) | Dp_LOC105347802.1.2 | 3,510 | Db_LOC105347802 | 4,686 | 91% | 0.0 | 6 (<1%) |
| MDR1 (Abcb1 gene) | Dp_LOC101858982.2.4 | 4,032 | Db_LOC101858982.2.2 | 4,023 | 89% | 0.0 | 15 (<1%) |
Coding sequences from D. polymorpha and D. rostriformis bugensis biomarker contigs were used for interspecies homology analysis. Lengths of the coding sequences are also indicated (in base pairs; for acronym definitions, see Table 8).
| Measurement(s) | transcriptome |
| Technology Type(s) | RNA sequencing |
| Factor Type(s) | sampling location |
| Sample Characteristic - Organism | Dreissena polymorpha • Dreissena rostriformis bugensis |
| Sample Characteristic - Environment | freshwater biome |
| Sample Characteristic - Location | Metropolitan France |