Christophe Klopp, Cédric Cabau, Gonzalo Greif, André Lasalle, Santiago Di Landro, Denise Vizziano-Cantonnet.
Abstract
MOTIVATION: Siberian sturgeon is a long-lived, late-maturing fish farmed for caviar production in 50 countries. Functional genomics makes it possible to identify genes of interest for fish farming. In the absence of a reference genome, a reference transcriptome is very useful for sequencing-based functional studies.
Year: 2020 PMID: 33238003 PMCID: PMC7687680 DOI: 10.1093/database/baaa082
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Comparison of the four contig sets built with two assemblers (Trinity and Oases) and two strategies (a single assembly of all reads, or one assembly per sample followed by contig reconciliation)
| Metrics | Sample | All_Oases | All_trinity | Meta_Oases | Meta_trinity |
|---|---|---|---|---|---|
| Assembly | N seq | | 75 514 | 71 263 | 105 556 |
| | N50 | | 1957 | 2501 | 1844 |
| | L50 | | 16 772 | 14 784 | 21 757 |
| | Length sum | | 104 981 273 | 118 836 837 | 134 182 723 |
| | Length mean | | 1390 | 1668 | 1271 |
| Chimera | N chimeric contigs | 140 | 140 | 228 | |
| | N chimeric nt | | 40 559 | 72 474 | 35 620 |
| Read alignment | Mapped | 84.57% | 81.58% | 85.64% | |
| | Properly paired | 77.16% | 75.00% | 79.17% | |
| | Mate mapped to different chromosome | 8 062 249 | 11 643 098 | | 7 454 996 |
| BUSCO | Complete | 3872 | 3682 | | 4093 |
| vertebrata_odb9 | Complete single-copy | 1675 | 2653 | | 3382 |
| | Complete duplicated | 2197 | 1029 | | 711 |
| | Fragmented | 132 | 293 | | 117 |
| | Missing | 580 | 609 | | 374 |
| Proteins | N protein(s) aligned on contig | 48 925 | 41 836 | | 53 009 |
| TransRate | Score | 0.3115 | 0.2294 | | 0.2546 |
| | Optimal score | 0.4356 | 0.366 | | 0.4395 |
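
The assembly metrics in the table above follow the usual contiguity definitions. As a minimal sketch, assuming N50 is the contig length at which the cumulative sum of length-sorted contigs reaches half of the total assembly length, and L50 is the number of contigs needed to reach that point (which matches the magnitudes reported here), they could be computed from a list of contig lengths as follows. The function name `assembly_stats` is illustrative and is not taken from any tool used by the authors.

```python
def assembly_stats(lengths):
    """Return basic contiguity metrics for a list of contig lengths (in nt)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    total = sum(ordered)
    running = 0
    for rank, length in enumerate(ordered, start=1):
        running += length
        # Stop at the first contig where the cumulative length reaches
        # half of the assembly: its length is N50, its rank is L50.
        if 2 * running >= total:
            n50, l50 = length, rank
            break
    return {
        "n_seq": len(ordered),
        "length_sum": total,
        "length_mean": round(total / len(ordered)),
        "n50": n50,
        "l50": l50,
    }


# Toy example with five hypothetical contigs
stats = assembly_stats([100, 200, 300, 400, 500])
print(stats)
```

On the toy input, the two longest contigs (500 and 400 nt) already cover more than half of the 1500 nt total, so N50 is 400 and L50 is 2.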
Public sturgeon transcriptome assembly comparison
| Assembly | Nb seq | N50 | L50 | Lg sum | Lg mean | Lg max | proteins |
|---|---|---|---|---|---|---|---|
| GEUL01.1 | 179 564 | 1946 | 25 772 | 166 715 666 | 928 | | 46 841 |
| GGQL01.1 | 77 634 | 2523 | 18 243 | 135 647 056 | 1747 | 16 131 | 41 703 |
| GGWJ01.1 | 53 624 | 1086 | | 34 807 151 | 649 | 15 639 | 17 213 |
| GGWK01.1 | 369 441 | 763 | 69 627 | 208 011 161 | 563 | 15 644 | 29 402 |
| GGYF01.1 | 121 398 | 2211 | 23 671 | 168 641 140 | 1389 | 2596 | 49 218 |
| GGZT01.1 | 203 131 | 1874 | 40 438 | 254 007 803 | 1250 | 34 023 | 30 551 |
| GGZX01.1 | | 604 | 132 361 | | 533 | 1664 | 20 915 |
| GICD01.1 | 91 579 | 1011 | 18 288 | 136 604 581 | 1492 | 20 419 | 53 729 |
| SSTdb | 79 217 | | 16 359 | 150 824 770 | | 45 872 | |
Public assemblies are identified by their TSA prefix. The Siberian sturgeon reference transcriptome database is named SSTdb.
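
The columns of this comparison are not independent: the mean contig length (Lg mean) should equal the total length (Lg sum) divided by the number of sequences (Nb seq). The sketch below is a hedged consistency check on three rows copied from the table; the helper name `inconsistent_rows` and the tolerance of 1 nt are assumptions made for illustration.

```python
# (Nb seq, Lg sum, reported Lg mean) copied from three complete rows of the table
ROWS = {
    "GGQL01.1": (77_634, 135_647_056, 1747),
    "GGWK01.1": (369_441, 208_011_161, 563),
    "GICD01.1": (91_579, 136_604_581, 1492),
}


def mean_length(n_seq, length_sum):
    """Mean contig length, rounded to the nearest nucleotide."""
    return round(length_sum / n_seq)


def inconsistent_rows(rows, tolerance=1):
    """Return names of rows whose reported mean differs from Lg sum / Nb seq."""
    flagged = []
    for name, (n_seq, length_sum, reported) in rows.items():
        if abs(mean_length(n_seq, length_sum) - reported) > tolerance:
            flagged.append(name)
    return flagged


print(inconsistent_rows(ROWS))  # an empty list means the three rows agree
```

The same division can be used to sanity-check any other row for which both Nb seq and Lg sum are available.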
Figure 1. BUSCO scores of the eight sturgeon transcriptome assemblies found in TSA plus the sturgeon reference database named SSTdb.
Reference transcriptome manual validation table
| Type | Number of genes searched | Number of contigs found | Confirmed genes (%) |
|---|---|---|---|
| Hypophysiotropic peptides | 17 | 24 | 88 |
| Hypophysiotropic peptide receptors | 17 | 18 | 88 |
| Pituitary hormones | 8 | 8 | 100 |
| Gonad related | 11 | 11 | 100 |
| Liver | 7 | 10 | 100 |
| Gastrointestinal hormones genes | 10 | 11 | 100 |
| Kidney and anterior kidney | 7 | 8 | 86 |
| Immunologically-relevant genes | 8 | 13 | 100 |
For some genes, more than one contig was found in the assembly.
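
The validation table reports confirmation rates as rounded percentages. As a rough sketch, the number of confirmed genes per category can be estimated by multiplying the number of genes searched by the percentage and rounding; because the percentages are themselves rounded, the resulting counts and the overall rate are estimates rather than figures stated in the source. The names `confirmed_count` and `overall_rate` are illustrative.

```python
# (category, genes searched, confirmed %) copied from the validation table
VALIDATION = [
    ("Hypophysiotropic peptides", 17, 88),
    ("Hypophysiotropic peptide receptors", 17, 88),
    ("Pituitary hormones", 8, 100),
    ("Gonad related", 11, 100),
    ("Liver", 7, 100),
    ("Gastrointestinal hormones genes", 10, 100),
    ("Kidney and anterior kidney", 7, 86),
    ("Immunologically-relevant genes", 8, 100),
]


def confirmed_count(searched, percent):
    """Estimate the number of confirmed genes from a rounded percentage."""
    return round(searched * percent / 100)


def overall_rate(rows):
    """Return (estimated confirmed, total searched, overall % confirmed)."""
    searched = sum(n for _, n, _ in rows)
    confirmed = sum(confirmed_count(n, p) for _, n, p in rows)
    return confirmed, searched, 100 * confirmed / searched


confirmed, searched, rate = overall_rate(VALIDATION)
print(f"{confirmed}/{searched} genes confirmed ({rate:.1f}%)")
```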