Laia Ribas, Belén G Pardo, Carlos Fernández, José Antonio Alvarez-Diós, Antonio Gómez-Tato, María Isabel Quiroga, Josep V Planas, Ariadna Sitjà-Bobadilla, Paulino Martínez, Francesc Piferrer.
Abstract
BACKGROUND: Genomic resources for plant and animal species exploited primarily for human consumption are increasingly important, among other things, for understanding physiological processes and for establishing adequate genetic selection programs. Currently available high-throughput sequencing techniques have been applied to a number of species, including fish, to describe their transcriptomes. The objective of this study was to generate a comprehensive transcriptomic database for turbot, a highly priced farmed fish species in Europe with potential for expansion to other areas of the world and with unsolved production bottlenecks, in order to better understand reproductive- and immune-related functions. This information is essential for implementing marker-assisted selection programs useful to the turbot industry.
Year: 2013 PMID: 23497389 PMCID: PMC3700835 DOI: 10.1186/1471-2164-14-180
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Increase of the genomic resources for the turbot (Scophthalmus maximus) with the successive databases
| Database | Tissue | Platform and library | Sequences | Unique sequences | Reference |
|---|---|---|---|---|---|
| 1 | Liver, head kidney, spleen ( | ABI3730 cDNA library | 9,873 | 3,482 (1,073 + 2,409) | Pardo |
| 2 | Muscle | ABI3730 microsatellite-enriched DNA library | 1,371 | - | Pardo |
| | Liver, kidney and gills (nodavirus infection and stimulation | ABI3730 cDNA library | 3,339 | - | Park |
| | Liver, head kidney, spleen, pyloric caeca and thymus ( | ABI3730 cDNA library | 3,043 | 6,170 (1,827 + 4,343) | Ribas |
| 3 | Brain-hypophysis, gonad | 454 Roche Titanium | 1,191,866 | 52,427 + 176,451 | Ribas |
Summary statistics of the turbot (S. maximus) 454 pyrosequencing run
| Total reads (raw wells) | 2,762,845 |
| High quality reads (filtered) | 1,191,866 |
| Total megabases (Mb) | 341.2 |
| Average read length (bp) | 286 |
| N50 read length (bp) | 383 |
| Number of contigs | 65,472 |
| Number of contigs > 500 bp | 32,612 |
| Number of singletons | 172,108 |
| Total consensus length (Mb) | 41 |
| Average contig coverage | 4.6 |
| Maximum coverage | 578.7 |
| Mean contig length (bp) | 625.9 |
| Median contig length (bp) | 499 |
| Mode contig length (bp) | 365 |
| N25 contig length (bp) | 1,217 |
| N50 contig length (bp) | 748 |
| N75 contig length (bp) | 482 |
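
The N25, N50 and N75 contig lengths listed above are length-weighted statistics. As a minimal sketch (not the authors' pipeline; the function name `nx_length` and the toy contig lengths are illustrative assumptions), Nx can be computed as the length L such that contigs of length L or longer together hold at least x% of all assembled bases:

```python
def nx_length(lengths, x=50):
    """Return the Nx length: walking contigs from longest to shortest,
    the length of the contig at which the running sum first reaches
    x% of the total assembled bases."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < x <= 100:
        raise ValueError("x must be in (0, 100]")
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * x / 100
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length
    return ordered[-1]


# Hypothetical contig lengths in bp, not data from the study.
toy_contigs = [100, 200, 300, 400, 500]
n25 = nx_length(toy_contigs, 25)
n50 = nx_length(toy_contigs, 50)
n75 = nx_length(toy_contigs, 75)
print(n25, n50, n75)
```

By construction N25 ≥ N50 ≥ N75, which matches the ordering of the reported values (1,217, 748 and 482 bp).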
Figure 1. Length range distribution of pyrosequencing contig reads of the turbot brain-hypophysis-gonad axis at different stages of gonad development. (A) Range distribution of contigs obtained in the 454 FLX Titanium run. (B) Average coverage distribution per nucleotide of the contigs obtained in the 454 FLX Titanium run.
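List of the top 20 longest contigs originated from the 454 run of brain-hypophysis-gonad axis tissues of turbot (S. maximus)
| Contig ID | Length (bp) | Reads | Average coverage | GC (%) | Description | E-value | Hit ID | Database |
|---|---|---|---|---|---|---|---|---|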
| 358 | 5,012 | 584 | 33.2 | 43.4 | Cytochrome c oxidase subunit 3 | 1.00E-127 | UniRef90_C7S7B2 | Uniref90 |
| 4,333 | 4,486 | 133 | 8.3 | 45.7 | Adrenodoxin-like protein mitochondrial | 2.00E-60 | UniRef90_Q08C57 | Uniref90 |
| 2,251 | 4,042 | 146 | 10.5 | 46.1 | SSU rRNA | 3.00E-13 | AAKN02033512 | SSU |
| 425 | 4,010 | 293 | 23.3 | 52.1 | Aspartate/tyrosine/aromatic aminotransferase | 1.00E-101 | YDR111c | COG |
| 1,894 | 3,992 | 212 | 13.6 | 44.3 | Unknown | | | |
| 2,165 | 3,964 | 158 | 11.0 | 50.4 | Ubiquitin carboxyl-terminal hydrolase | 0 | UniRef90_B7Z855 | Uniref90 |
| 762 | 3,896 | 464 | 32.4 | 48.6 | Proliferation-associated 2G4b | 1.00E-165 | dre:323462 | KEGG |
| 1,956 | 3,797 | 182 | 13.8 | 47.5 | Novel protein (Zgc:55794) | 0 | UniRef90_Q1LWK5 | Uniref90 |
| 9,828 | 3,757 | 54 | 4.3 | 44.1 | FAM3C | 9.00E-57 | UniRef90_B5X712 | Uniref90 |
| 12,120 | 3,753 | 63 | 4.3 | 47.0 | Zgc:165446 protein | 1.00E-65 | UniRef90_A6H8R7 | Uniref90 |
| 3,201 | 3,707 | 128 | 10.3 | 46.6 | Novel protein similar to WDR44 | 0 | dre:569045 | KEGG |
| 2,212 | 3,700 | 116 | 9.1 | 38.6 | Osteocalcin | 2.00E-12 | UniRef90_D2XEB2 | Uniref90 |
| 1,541 | 3,673 | 181 | 14.7 | 48.3 | Cell division cycle | 1.00E-105 | UniRef90_Q6PFU4 | Uniref90 |
| 2,167 | 3,623 | 173 | 12.3 | 45.1 | M-phase phosphoprotein 10 | 1.00E-148 | dre:323426 | KEGG |
| 280 | 3,613 | 287 | 27.3 | 44.9 | NADH dehydrogenase subunit 5 | 0 | UniRef90_C7S7B6 | Uniref90 |
| 4,193 | 3,611 | 85 | 6.2 | 41.3 | Suppressor of tumorigenicity 7 protein | 1.00E-34 | UniRef90_Q1RLU8 | Uniref90 |
| 644 | 3,571 | 531 | 42.1 | 46.3 | RAD1 homolog | 1.00E-145 | UniRef90_Q6P2T4 | Uniref90 |
| 1,248 | 3,569 | 127 | 10.8 | 43.8 | 60S ribosomal protein | 2.00E-45 | UniRef90_P61513 | Uniref90 |
| 2,845 | 3,555 | 106 | 8.1 | 43.4 | NF-kappaB repressing factor | 1.00E-79 | gga:422370 | KEGG |
| 5,634 | 3,550 | 84 | 5.8 | 47.1 | WD repeat domain phosphoinositide | 0 | UniRef90_Q5MNZ6 | Uniref90 |
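List of the top 20 deepest contigs originated from the 454 run of brain-hypophysis-gonad axis tissues of turbot (S. maximus)
| Contig ID | Length (bp) | Reads | Average coverage | GC (%) | Description | E-value | Hit ID | Database |
|---|---|---|---|---|---|---|---|---|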
| 1 | 744 | 1,997 | 578.7 | 53.7 | 40S ribosomal protein S9 | 8.00E-93 | UniRef90_P46781 | Uniref90 |
| 19 | 676 | 1,728 | 541.1 | 45.6 | Parvalbumin | 4.00E-34 | UniRef90_B5WX08 | Uniref90 |
| 1,926 | 101 | 466 | 398.6 | 43.6 | Unknown | | | |
| 13 | 663 | 935 | 382.1 | 51.0 | Nucleolar protein family A3 | 4.00E-25 | UniRef90_A4IHX9 | Uniref90 |
| 88 | 787 | 1,545 | 366.8 | 53.2 | Similar to ribosomal protein S7 | 5.00E-98 | UniRef90_UPI0000E801DC | Uniref90 |
| 44 | 701 | 930 | 336.4 | 56.0 | Unknown | | | |
| 36 | 709 | 725 | 291.6 | 51.8 | NADH dehydrogenase 1 beta | 8.00E-75 | UniRef90_C1BWY9 | Uniref90 |
| 2 | 1,222 | 1,216 | 289.6 | 46.7 | Chromobox protein homolog 3 | 3.00E-80 | UniRef90_C3KJI6 | Uniref90 |
| 6 | 1,103 | 1,335 | 285.0 | 48.4 | Ribosomal protein L7 | 1.00E-108 | dre:336710 | KEGG |
| 47 | 675 | 651 | 277.6 | 51.6 | Fatty acid binding protein 11a | 2.00E-49 | dre:447944 | KEGG |
| 3 | 1,275 | 1,250 | 271.4 | 48.0 | General transcription factor IIIA | 6.00E-89 | dre:445389 | KEGG |
| 92 | 603 | 687 | 271.2 | 51.9 | Ribosomal protein S11 | 9.00E-73 | UniRef90_B5FX82 | Uniref90 |
| 9 | 759 | 929 | 270.6 | 50.8 | Histone deacetylase complex | 6.00E-74 | UniRef90_C3KI01 | Uniref90 |
| 179 | 649 | 738 | 266.8 | 50.0 | Unknown | | | |
| 68 | 782 | 1,065 | 264.7 | 51.9 | Ribosomal protein | 1.00E-99 | UniRef90_A9Z0M8 | Uniref90 |
| 82 | 663 | 679 | 264.5 | 55.7 | 40S ribosomal protein S27a | 4.00E-65 | UniRef90_P68200 | Uniref90 |
| 59 | 749 | 842 | 260.6 | 53.4 | Ribosomal protein L12 | 6.00E-82 | UniRef90_Q5BKW5 | Uniref90 |
| 31 | 869 | 973 | 258.7 | 54.0 | 60S ribosomal protein L13 | 1.00E-101 | UniRef90_B5DGD9 | Uniref90 |
| 15 | 997 | 950 | 255.3 | 51.6 | Ferritin | 1.00E-90 | UniRef90_Q4SBB8 | Uniref90 |
| 4 | 1,346 | 1,340 | 253.2 | 42.6 | Epididymal secretory protein E1 | 3.00E-70 | UniRef90_C3KIM5 | Uniref90 |
Figure 2. Venn diagrams showing annotation (A) and functional classification by Gene Ontology terms (B) in the Turbot 3 database. BP: Biological Process, CC: Cellular Component, MF: Molecular Function.
Figure 3. BLASTx top-hit species distribution of gene annotations in the Turbot 3 database.
Figure 4. Second-level Gene Ontology assignment of sequences in the Turbot 3 database. (A) Biological process; (B) Cellular component; (C) Molecular function.
Selection of some of the novel relevant immune-related genes identified in the Turbot 3 database
| DNA fragmentation factor, 40kDa, beta polypeptide | 607 | GO:0006917 GO:0006309 GO:0030263 | 2.00E-73 | | Uniref90 |
| BCL2-like 1 | 698 | GO:0045087 GO:0045768 | 6.00E-19 | | nr |
| Tumor necrosis factor receptor-associated factor 2 | 528 | GO:0045087 GO:0008624 GO:0050870 GO:0042981 | 2.00E-53 | | Uniref90 |
| Interleukin 1 receptor activated kinase 1 | 842 | GO:0045087 GO:0008063 GO:0006916 | 6.00E-28 | | Uniref90 |
| JNK1/MAPK8-associated membrane protein | 223 | GO:0006986 GO:0030433 | 8.00E-23 | | Uniref90 |
| Toll-interacting protein | 401 | GO:0045087 GO:0006954 GO:0045321 | 2.00E-55 | | nr |
| TNF receptor-associated factor 6 | 187 | GO:0045087 GO:0006915 GO:0008063 GO:0050852 | 5.00E-15 | | nr |
| FYN oncogene related to SRC, FGR, YES | 467 | GO:0006468 GO:0016310 | 1.00E-74 | | nr |
| Cytoplasmic protein 1 | 382 | GO:0042110 GO:0050852 GO:0042102 | 4.00E-38 | | nr |
| Cytoplasmic protein NCK2 | 779 | GO:0042102 GO:0042110 | 3.00E-12 | Amniota | Uniref90 |
| Disks large homolog 1 | 402 | GO:0070830 GO:0044419 | 2.00E-09 | | nr |
| Mitogen-activated protein kinase 8 | 705 | GO:0031295 GO:0000165 GO:0000186 | 1.00E-07 | | Uniref90 |
| T-cell-specific surface glycoprotein CD28 | 689 | GO:0031295 GO:0006959 GO:0008624 GO:0042102 GO:0045768 GO:0002863 GO:0045086 GO:0045066 | 2.00E-18 | | nr |
| GRB2-related adaptor protein 2 | 721 | GO:0031295 GO:0050852 GO:0007265 GO:0007267 | 5.00E-24 | | Uniref90 |
| Growth factor receptor-bound protein 2 | 370 | GO:0031295 GO:0050900 GO:2000379 GO:0030168 GO:0007265 | 3.00E-17 | | nr |
Selection of some of the novel relevant reproductive-related genes identified in the Turbot 3 database
| Androgen receptor alpha | ARA | 1,780 | GO:0005634 GO:0003707 GO:0003700 GO:0006355 | 1.00E-110 | Uniref90 |
| Cytochrome P450 aromatase | CYP19A | 2,109 | GO:0009055 GO:0020037 GO:0005506 GO:0004497 | 0 | Uniref90 |
| Follicle stimulating hormone receptor | FSHR | 2,890 | GO:0016021 GO:0007186 | 0 | Uniref90 |
| Gonadal soma derived factor | GSF | 2,035 | GO:0008083 | 3.00E-60 | Uniref90 |
| Gonadotropin alpha | GTC | 1,529 | GO:0005576 GO:0005179 | 2.00E-47 | Uniref90 |
| Growth differentiation factor 9 | GDF9 | 2,407 | GO:0008083 | 1.00E-131 | Uniref90 |
| Meiotic nuclear division protein 1 homolog | MND1 | 1,386 | | 4.00E-96 | Uniref90 |
| Mitotic arrest deficient 2 | MAD2 | 396 | GO:0007067 | 2.00E-26 | Uniref90 |
| Müllerian inhibiting substance | AMH | 1,388 | GO:0008083 GO:0008406 | 2.00E-47 | Uniref90 |
| Pituitary tumor-transforming | PTTG1IP | 1,257 | GO:0008083 | 4.00E-51 | Uniref90 |
| Sex hormone binding globulin | SHBG | 906 | | 1.00E-70 | Uniref90 |
| SRY-box containing gene 6 | SOX6 | 804 | GO:0005634 GO:0003677 | 1.00E-83 | nr |
| SRY-box containing gene 9 | SOX9 | 610 | GO:0005634 GO:0003677 | 2.00E-59 | nr |
| Spermatogenesis associated 13 | SPATA13 | 1,369 | GO:0005089 GO:0005622 GO:0035023 | 1.00E-146 | nr |
| StAR-related lipid transfer protein 5 | START-5 | 737 | GO:0006694 GO:0015485 GO:0017127 | 2.00E-81 | Uniref90 |
| StAR-related lipid transfer protein 7 | START-7 | 603 | | 4.00E-48 | nr |
| Steroid 11-beta-hydroxylase | CYP11B | 584 | GO:0009055 GO:0020037 GO:0005506 GO:0004497 | 6.00E-53 | Uniref90 |
| Vasa | VASA | 2,027 | GO:0005524 GO:0008026 | 1.00E-128 | Uniref90 |
| Zona pellucida glycoproteins | ZPC | 468 | | 2.00E-14 | Uniref90 |
| Zygote arrest protein 1 | ZAR1 | 353 | GO:0005737 GO:0007275 | 4.00E-22 | Uniref90 |
Representation of reproductive pathways with more than 50% coverage in the Turbot 3 database
| Pathway | KEGG ID | Genes in pathway | Genes in Turbot 3 | Coverage (%) |
|---|---|---|---|---|
| Oocyte meiosis | dre04114 | 137 | 114 | 83.2 |
| Circadian rhythm | dre04710 | 30 | 24 | 80.0 |
| mTOR signaling pathway | dre04150 | 60 | 46 | 76.7 |
| ErbB signaling pathway | dre04012 | 102 | 77 | 75.5 |
| Progesterone-mediated oocyte maturation | dre04914 | 105 | 79 | 75.2 |
| GnRH signaling pathway | dre04912 | 121 | 91 | 75.2 |
| Insulin signaling pathway | dre04910 | 163 | 117 | 71.8 |
| Wnt signaling pathway | dre04310 | 187 | 121 | 64.7 |
| Steroid Biosynthesis | dre00100 | 19 | 12 | 63.2 |
| Notch signaling pathway | dre04330 | 55 | 31 | 56.4 |
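
The coverage figures in the last column can be reproduced from the two gene counts. The sketch below assumes that the third column is the number of genes in the KEGG reference pathway and the fourth is the number of those genes found in Turbot 3; this reading is inferred from the arithmetic, and the helper name `coverage_percent` is illustrative:

```python
# (pathway, genes in pathway, genes found in Turbot 3, reported coverage %)
# Values copied from the table above; the column meanings are an assumption.
pathways = [
    ("Oocyte meiosis", 137, 114, 83.2),
    ("Circadian rhythm", 30, 24, 80.0),
    ("mTOR signaling pathway", 60, 46, 76.7),
    ("ErbB signaling pathway", 102, 77, 75.5),
    ("Progesterone-mediated oocyte maturation", 105, 79, 75.2),
    ("GnRH signaling pathway", 121, 91, 75.2),
    ("Insulin signaling pathway", 163, 117, 71.8),
    ("Wnt signaling pathway", 187, 121, 64.7),
    ("Steroid Biosynthesis", 19, 12, 63.2),
    ("Notch signaling pathway", 55, 31, 56.4),
]


def coverage_percent(found, total):
    """Percentage of pathway genes represented, rounded to one decimal."""
    return round(100 * found / total, 1)


# Collect any row whose recomputed coverage differs from the reported one.
mismatches = [
    name
    for name, total, found, reported in pathways
    if coverage_percent(found, total) != reported
]
print(mismatches)
```

An empty `mismatches` list indicates that every reported percentage is consistent with its pair of counts.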
Frequency distribution of the new SSRs by motif length in the Turbot 3 database
| | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Di | - | - | 98 | 60 | 33 | 42 | 33 | 125 | 391 | 60.7 | 31.6 |
| Tri | 392 | 136 | 85 | 46 | 30 | 23 | 17 | 27 | 756 | 35.4 | 61.1 |
| Tetra | 36 | 17 | 10 | 5 | 2 | 2 | 1 | 2 | 75 | 3.4 | 6.1 |
| Penta | 10 | 2 | 1 | 1 | 0 | 0 | 0 | 1 | 15 | 0.5 | 1.2 |
| Total | 438 | 155 | 194 | 112 | 65 | 67 | 51 | 155 | 1,237 | 100 | 100 |
| % | 35.4 | 12.5 | 15.7 | 9.1 | 5.3 | 5.4 | 4.1 | 12.5 | 100 | | |
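
For readers unfamiliar with how SSRs are binned by motif length, the following is a minimal sketch of perfect-repeat detection with a regular expression. It is not the tool used in the study: the function name `find_ssrs`, the single `min_repeats` threshold and the toy sequence are all assumptions made for illustration, and the minimum repeat numbers actually applied per motif class are not stated here.

```python
import re

MOTIF_CLASSES = {2: "Di", 3: "Tri", 4: "Tetra", 5: "Penta"}


def find_ssrs(seq, min_repeats=5):
    """Return (class, motif, repeats, start) for perfect SSRs in seq.

    min_repeats is an illustrative threshold, not a value from the study.
    """
    seq = seq.upper()
    hits = []
    for size, name in MOTIF_CLASSES.items():
        # A motif of `size` bases followed by at least min_repeats - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. ACAC, AAA).
            if motif in (motif + motif)[1:-1]:
                continue
            repeats = len(match.group(0)) // size
            hits.append((name, motif, repeats, match.start()))
    return hits


# Hypothetical sequence: (AC)6 at position 3 and (AGC)5 at position 18.
toy = "TTG" + "AC" * 6 + "GGT" + "AGC" * 5 + "T"
ssrs = find_ssrs(toy)
print(ssrs)
```

The periodicity check keeps a run such as ACACACAC from being counted as a tetranucleotide repeat when it is really a dinucleotide one.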
Summary statistics of SNPs in the Turbot 3 database
| Total number SNPs | | 7,030 |
| Total contigs with SNPs | | 1,040 |
| with 4 sequences | 131 | 270 |
| with 5-10 sequences | 620 | 2,516 |
| with 11-20 sequences | 147 | 926 |
| with 21-30 sequences | 63 | 859 |
| with 31-50 sequences | 40 | 940 |
| with > 50 sequences | 39 | 1,859 |
| Total number of transitions | | 2,223 |
| C/T | | 1,328 |
| A/G | | 895 |
| Total number of transversions | | 2,404 |
| A/T | | 500 |
| A/C | | 746 |
| T/G | | 614 |
| C/G | | 544 |
| Total number of indels | | 1,578 |
| Tri-allelic polymorphisms | | 1,044 |
| Tetra-allelic polymorphisms | | 113 |
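
The split into transitions and transversions follows from base chemistry: a substitution within purines (A, G) or within pyrimidines (C, T) is a transition, and any purine-pyrimidine exchange is a transversion. A minimal sketch, using the per-class counts from the table and a hypothetical helper `classify_snp`, recovers the two subtotals and the resulting ratio:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(a, b):
    """Classify a biallelic substitution as 'transition' or 'transversion'."""
    a, b = a.upper(), b.upper()
    alleles = {a, b}
    if a == b or not alleles <= PURINES | PYRIMIDINES:
        raise ValueError("expected two different bases from A, C, G, T")
    same_class = alleles <= PURINES or alleles <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Substitution counts copied from the table above.
snp_counts = {"C/T": 1328, "A/G": 895, "A/T": 500, "A/C": 746, "T/G": 614, "C/G": 544}

totals = {"transition": 0, "transversion": 0}
for pair, count in snp_counts.items():
    first, second = pair.split("/")
    totals[classify_snp(first, second)] += count

ts_tv_ratio = totals["transition"] / totals["transversion"]
print(totals, round(ts_tv_ratio, 2))
```

The recomputed subtotals, 2,223 transitions and 2,404 transversions, agree with the table.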
Filtration process results for the 47,291 sequences with oligos in forward and reverse orientation
| Reverse | Both systems | 12,189 | 275 | 4,297 | 5,276 | 22,037 |
| | Only immune | 531 | 74 | 71 | 116 | 792 |
| | Only reproduction | 6,131 | 47 | 2,536 | 1,946 | 10,660 |
| | Without signal | 8,310 | 179 | 3,119 | 2,194 | 13,802 |
| Total | | 27,161 | 575 | 10,023 | 9,532 | 47,291 |
| | | | | | | |
| Reverse | Both systems | 2,976 | 153 | 1,627 | 5,493 | 10,249 |
| | Only immune | 298 | 85 | 86 | 257 | 726 |
| | Only reproduction | 2,332 | 61 | 1,360 | 3,112 | 6,865 |
| | Without signal | 9,059 | 359 | 4,812 | 15,221 | 29,451 |
| Total | | 14,665 | 658 | 7,885 | 24,083 | 47,291 |
Representative sample of miRNAs found in the Turbot 3 database. miRNAs were identified by BLAST searches of Turbot 3 database sequences against miRBase
| 6,514 | 54 | Early growth response 1 | MI0020478 | |
| 32,392 | 3 | LSU rRNA | MI0022328 | |
| 1,984r | 97 | Serine/threonine-protein phosphatase 2A | MI0014190 | |
| 34,898r | 2 | Similar to dynamin 3 | MI0000247 | |
| 49,897 | 2 | Similar to Elongation factor 1-gamma | MI0015615 | |
| 40,442 | 3 | Similar to ORFa | MI0019438 | |
| 40,442r | 3 | Similarity to transposases | MI0001124 | |
| 10,002r | 1 | Spindle and kinetochore-associated protein 2-like | MI0018844 | |
| 8,914r | 22 | Unknown | MI0019711 | |
| 10,002 | 1 | Unnamed protein product | MI006276 |