Mohamed Salem, Caird E Rexroad, Jiannan Wang, Gary H Thorgaard, Jianbo Yao.
Abstract
BACKGROUND: Rainbow trout are an important fish for aquaculture and recreational fisheries and serve as a model species for research on carcinogenesis, comparative immunology, toxicology and evolutionary biology. However, to date there is no genome reference sequence to facilitate the development of molecular technologies that utilize high-throughput characterization of gene expression and genetic variation. Alternatively, transcriptome sequencing is a rapid and efficient means for gene discovery and genetic marker development. Although a large number (258,973) of EST sequences are publicly available, the duplicated nature of the rainbow trout genome hinders assembly and complicates annotation.
Year: 2010 PMID: 20942956 PMCID: PMC3091713 DOI: 10.1186/1471-2164-11-564
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Summary statistics of the major rainbow trout Sanger-based EST projects
| Library Name | Supplier | Tissue/Organ | # in TCs | # of singletons | Total ESTs |
|---|---|---|---|---|---|
| NCCCWA 1RT | NCCCWA-USDA | Pooled | 34927 | 11452 | 46379 |
| NCCCWA 02RT | NCCCWA-USDA | Pooled | 14506 | 3452 | 17958 |
| NCCCWA 03RT | NCCCWA-USDA | Pooled juvenile | 17412 | 651 | 18063 |
| NCCCWA 04RT | NCCCWA-USDA | Pooled juvenile | 2198 | 407 | 2605 |
| NCCCWA 05RT | NCCCWA-USDA | Pooled juvenile | 1789 | 282 | 2071 |
| NCCCWA 07RT | NCCCWA-USDA | Pituitary | 3208 | 321 | 3529 |
| NCCCWA 09RT | NCCCWA-USDA | Testis | 4615 | 537 | 5152 |
| NCCCWA 10RT#1-4 | NCCCWA-USDA | Oocyte | 254 | 78 | 332 |
| NCCCWA 10RT#2 | NCCCWA-USDA | Oocyte | 147 | 19 | 166 |
| NCCCWA 10RT#3 | NCCCWA-USDA | Oocyte | 14594 | 3060 | 17654 |
| NCCCWA 10RT#4 | NCCCWA-USDA | Oocyte | 68 | 16 | 84 |
| leuk | NCCCWA-USDA | Anterior kidney/spleen | 6223 | 166 | 6389 |
| EMB | NCCCWA-USDA | Embryo | 3529 | 227 | 3756 |
| AGENAE Rainbow trout multi-tissues library (tcaa) | INRA - SCRIBE | Adipose, blood, brain, gonad | 863 | 256 | 1119 |
| AGENAE Rainbow trout normalized testis library (tcab) | INRA - SCRIBE | Testis | 730 | 294 | 1024 |
| AGENAE Rainbow trout multi-tissues subtracted library (tcay) | INRA - SCRIBE | Adipose, blood, brain, gonad | 17859 | 8433 | 26292 |
| AGENAE Rainbow trout normalized ovarian library (tcby) | INRA - SCRIBE | Ovary | 4714 | 204 | 4918 |
| AGENAE Rainbow trout normalized multi-tissues library (tcac) | INRA - SCRIBE | Adipose, blood, brain, gonad | 3265 | 713 | 3978 |
| AGENAE Rainbow trout subtracted multi-tissues library (tcav) | INRA - SCRIBE | Adipose, blood, brain, gonad | 2516 | 850 | 3366 |
| Agenae (tcbj) | INRA - SCRIBE | | 179 | 17 | 196 |
| AGENAE Rainbow trout normalized multi-tissues library (tcad) | INRA - SCRIBE | Adipose, blood, brain, gonad | 5015 | 1131 | 6146 |
| AGENAE Rainbow trout normalized testis library (tcbi) | INRA - SCRIBE | Testis | 11494 | 3603 | 15097 |
| AGENAE Rainbow trout multi-tissues-normalized (tcbk) | INRA - SCRIBE | Multi-tissues | 21132 | 5757 | 26889 |
| AGENAE Rainbow trout multi-tissues library (tcce) | INRA - SCRIBE | Embryos to adults | 6211 | 1664 | 7875 |
| Oncorhynchus mykiss reproductive | University of Victoria | Gonads | 5387 | 848 | 6235 |
| Oncorhynchus mykiss Chilliwack River steelhead whole | University of Victoria | Whole embryo/juvenile | 3760 | 246 | 4006 |
| Oncorhynchus mykiss Tzenaicut Lake whole | University of Victoria | Whole embryo/juvenile | 2266 | 188 | 2454 |
Data from the National Center for Cool and Cold Water Aquaculture (NCCCWA-USDA) and the National Institute for Agricultural Research (INRA) constitute 47% and 37%, respectively.
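The supplier shares quoted in the footnote can be approximately reproduced from the "Total ESTs" column. The sketch below assumes the denominator is the 258,973 publicly available ESTs cited in the abstract (the record does not state the denominator explicitly); the variable names are illustrative only.

```python
# "Total ESTs" values copied from the table above, grouped by supplier.
ncccwa = [46379, 17958, 18063, 2605, 2071, 3529, 5152,
          332, 166, 17654, 84, 6389, 3756]
inra = [1119, 1024, 26292, 4918, 3978, 3366, 196,
        6146, 15097, 26889, 7875]

# Assumption: shares are relative to the 258,973 public ESTs
# mentioned in the abstract, not to the sum of the listed libraries.
TOTAL_PUBLIC_ESTS = 258_973


def share(counts, total=TOTAL_PUBLIC_ESTS):
    """Return the percentage of `total` contributed by `counts`."""
    return 100.0 * sum(counts) / total


ncccwa_share = share(ncccwa)
inra_share = share(inra)

print(f"NCCCWA-USDA: {sum(ncccwa)} ESTs ({ncccwa_share:.1f}%)")
print(f"INRA:        {sum(inra)} ESTs ({inra_share:.1f}%)")
```

Under that assumption the shares come out at about 47.9% and 37.4%, in line with the 47% and 37% given in the footnote.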
Summary statistics of the Sanger-based assembly, 454-pyrosequencing assembly and a combination assembly of both datasets
| | Combination assembly | Sanger assembly | 454 assembly |
|---|---|---|---|
| | 521 M bases | 74 M bases | 447 M bases |
| | 1,380,311 | 258,973 | 1,290,292 |
| | 422,889 | 90,019 | 376,238 |
| | 1,119,240 (81%, Ave = 394 bp) | 209,565 (81%) | 1,065,901 (83%, Ave = 374 bp) |
| | 261,071 (Ave = 417 bp) | 49,408 | 224,391 (Ave = 345 bp) |
| | 161,818 (Ave = 758 bp) | 40,320 (Ave = 999 bp) | 151,847 (Ave = 662 bp) |
| | 101,464 | 78,572 | 84,810 |
Figure 1. Average length distribution of assemblies and pyrosequencing reads of the rainbow trout ESTs. Average contig lengths of the combination assembly exceed those of the 454-pyrosequencing assembly over the 100-700 bp length range, indicating that adding the Sanger-based data improved assembly of the short 454-pyrosequencing reads up to 700 bp. However, over the 700-2000 bp length range, average contig lengths of the combination assembly are shorter than those of the 454-pyrosequencing assembly, indicating that contigs longer than 700 bp in the combination assembly were mainly derived from the 454-pyrosequencing data, possibly because most of the 454-pyrosequencing reads were 400-600 bp. The number of 454-pyrosequencing reads is divided by 10 for scaling.
Figure 2. Gene Ontology (GO) assignment (2. Biological processes constitute the majority of the GO assignments of the transcripts (17,694 counts, 39%), followed by cellular components (16,452 counts, 36%) and molecular functions (11,160 counts, 25%).
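The percentages in the Figure 2 caption follow directly from the three counts. A minimal sketch of that arithmetic, assuming each percentage is the category count divided by the sum of the three counts (the dictionary keys are illustrative labels, not identifiers from the study):

```python
# GO assignment counts as stated in the Figure 2 caption.
go_counts = {
    "biological process": 17694,
    "cellular component": 16452,
    "molecular function": 11160,
}

# Assumption: percentages are relative to the sum of the three categories.
total_assignments = sum(go_counts.values())

# Round each share to the nearest whole percent, as in the caption.
go_percent = {
    category: round(100 * count / total_assignments)
    for category, count in go_counts.items()
}

for category, pct in go_percent.items():
    print(f"{category}: {go_counts[category]} counts ({pct}%)")
```

The three counts sum to 45,306 assignments, and the rounded shares match the 39%, 36% and 25% reported in the caption.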
Figure 3. BLASTx top-hit species distribution of gene annotations, showing high homology to fish species with known genome sequences. Only 5.7% of the BLAST hits matched rainbow trout protein sequences because of the limited number of rainbow trout proteins (6,915) currently available in the NCBI database (compared to 141,396 proteins for zebrafish).
Figure 4. Gene Ontology (2.