Abstract
The silver carp (Hypophthalmichthys molitrix) is among the most intensively pond-cultured fish species and is used in the wild to counteract water bloom in China. However, little genomic information is available for this species, especially regarding its ability to grow rapidly even in water contaminated with high concentrations of poisonous microcystin. In this study, we performed de novo transcriptome assembly and analysis of 17.10 million short-read sequences produced by Illumina paired-end sequencing. Using an improved multiple k-mer contig assembly method coupled with further scaffolding, 85,759 sequences were obtained. In total, 23,044 sequences were annotated with 3,423 gene ontology terms, for 104,196 term occurrences covering the three corresponding organizing principles. A total of 38,200 assembled sequences were involved in 218 predicted Kyoto Encyclopedia of Genes and Genomes metabolic pathways. We also recovered 41 of 44 genes involved in the biosynthesis of glutathione. Of these, five genes were identified as having experienced positive selection between silver carp and zebrafish, as determined by the likelihood ratio test. This report is the first annotated review of the silver carp transcriptome. These data will be of interest to researchers investigating the evolution and biological processes of the silver carp. This work also provides an archive for future studies of recent speciation and evolution of Cyprinidae fishes and can be used in comparative studies of other fishes.
Year: 2012 PMID: 22279088 PMCID: PMC3325077 DOI: 10.1093/dnares/dsr046
Source DB: PubMed Journal: DNA Res ISSN: 1340-2838 Impact factor: 4.458
Summary statistics of the assemblies used to assess the performance of the multi-K de novo assembly method
| Method | k-mer | Contigs > 100 bp | N50 | Max length | Total length (Mb) | Average contig size |
|---|---|---|---|---|---|---|
| single K | 58 | 3328 | 65 | 3324 | 2.076 | 87 |
| single K | 54 | 22 397 | 159 | 5597 | 8.197 | 127 |
| single K | 52 | 37 207 | 233 | 6087 | 12.602 | 140 |
| single K | 50 | 51 041 | 239 | 8297 | 18.059 | 142 |
| single K | 48 | 61 097 | 241 | 8045 | 23.245 | 144 |
| single K | 46 | 69 717 | 242 | 8358 | 28.235 | 145 |
| single K | 44 | 77 038 | 239 | 11 062 | 33.133 | 142 |
| single K | 42 | 82 806 | 236 | 10 004 | 37.633 | 139 |
| single K | 40 | 87 673 | 233 | 7322 | 41.928 | 135 |
| single K | 38 | 91 694 | 228 | 10 092 | 45.964 | 130 |
| single K | 34 | 97 153 | 220 | 13 873 | 53.206 | 119 |
| multi K | | 118 764 | 257 | 13 880 | 58.075 | 159 |
These statistics correspond to the set of contigs > 100 bp. k-mer, length of the identical overlap required by ABySS between two reads; N50, length-weighted median contig length; max length, length of the longest contig; total length, summed length of all contigs > 100 bp.
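The summary columns defined above can be sketched in a few lines of Python. This is a minimal illustration on made-up contig lengths, not the script used in the study; the function names `n50` and `summarize` and the toy data are assumptions introduced here, and N50 is computed under the common convention that it is the length at which the running sum of lengths (longest first) reaches half of the total.

```python
def n50(lengths):
    """Length-weighted median: the length L such that sequences of
    length >= L together hold at least half of all assembled bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no sequence lengths given")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


def summarize(lengths, min_len=100):
    """Summary statistics over sequences strictly longer than min_len bp."""
    kept = [n for n in lengths if n > min_len]
    return {
        "count": len(kept),
        "n50": n50(kept),
        "max_length": max(kept),
        "total_length": sum(kept),
        "average": sum(kept) / len(kept),
    }


# Hypothetical contig lengths in bp (not data from the study).
toy_lengths = [90, 120, 150, 200, 400, 1000]
stats = summarize(toy_lengths)
print(stats)
```

The same function applies unchanged to scaffold lengths, since only the list of lengths matters.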
Summary statistics of the scaffolds produced by SSPACE
| k-mer | Scaffolds > 100 bp | N50 | Max length | Total length (Mb) | Average scaffold size |
|---|---|---|---|---|---|
| 58 | 2805 | 65 | 4835 | 2.074 | 92 |
| 54 | 19 184 | 241 | 7069 | 8.165 | 135 |
| 52 | 31 314 | 279 | 11 041 | 12.561 | 151 |
| 50 | 41 815 | 301 | 11 669 | 18.033 | 155 |
| 48 | 49 814 | 325 | 12 403 | 23.239 | 159 |
| 46 | 57 241 | 332 | 10 964 | 28.259 | 160 |
| 44 | 63 799 | 324 | 11 062 | 33.196 | 156 |
| 42 | 69 856 | 314 | 10 950 | 37.804 | 152 |
| 40 | 74 827 | 302 | 12 140 | 42.046 | 146 |
| 38 | 79 408 | 291 | 11 339 | 46.097 | 139 |
| 34 | 87 408 | 270 | 13 880 | 53.831 | 127 |
These statistics correspond to the set of scaffolds > 100 bp. k-mer, length of the identical overlap required by ABySS between two reads; N50, length-weighted median scaffold length; max length, length of the longest scaffold; total length, summed length of all scaffolds > 100 bp.
Figure 1. Length distributions of scaffolds assembled by a multiple k-mer method.
Figure 2. The relationship of RPKM versus transcript size. RPKM, Reads Per Kilobase of exon model per Million mapped reads.
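As a brief illustration of the RPKM measure named in the Figure 2 caption, the sketch below applies the standard definition: reads mapped to a transcript, divided by the transcript length in kilobases and by the total mapped reads in millions. The function name `rpkm` and the example numbers are hypothetical and are not taken from the study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads Per Kilobase of exon model per Million mapped reads.

    RPKM = 1e9 * C / (N * L), where C is the number of reads mapped to
    the transcript, N the total number of mapped reads, and L the
    transcript length in bp.
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical transcript: 500 reads on a 2000 bp transcript,
# out of 10 million mapped reads in the library.
value = rpkm(500, 2000, 10_000_000)
print(f"RPKM = {value:.2f}")
```

Because the length appears in the denominator, two transcripts with the same read count but different sizes receive different RPKM values, which is why expression is plotted against transcript size.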
Figure 3. Functional classification of the silver carp transcriptome and comparison with the zebrafish transcriptome. (A) GO: biological process. (B) GO: molecular function. (C) GO: cellular component. In some cases, one transcript may have multiple functions. Grey, silver carp; black, zebrafish.
Figure 4. COG annotations of putative proteins. All putative proteins were aligned to the COG database and can be classified functionally into at least 25 molecular families.
The top 20 pathways with the highest sequence numbers
| Num | Pathway | All genes with pathway annotation (38 200) | Pathway ID |
|---|---|---|---|
| 1 | Metabolic pathways | 4510 (11.81%) | ko01100 |
| 2 | Pathways in cancer | 1790 (4.69%) | ko05200 |
| 3 | Regulation of actin cytoskeleton | 1634 (4.28%) | ko04810 |
| 4 | Focal adhesion | 1518 (3.97%) | ko04510 |
| 5 | MAPK signaling pathway | 1463 (3.83%) | ko04010 |
| 6 | Endocytosis | 1345 (3.52%) | ko04144 |
| 7 | Tight junction | 1256 (3.29%) | ko04530 |
| 8 | Adherens junction | 1073 (2.81%) | ko04520 |
| 9 | Phagosome | 1034 (2.71%) | ko04145 |
| 10 | Dilated cardiomyopathy | 1027 (2.69%) | ko05414 |
| 11 | Vascular smooth muscle contraction | 1014 (2.65%) | ko04270 |
| 12 | Complement and coagulation cascades | 1005 (2.63%) | ko04610 |
| 13 | Hypertrophic cardiomyopathy (HCM) | 957 (2.51%) | ko05410 |
| 14 | Chemokine signaling pathway | 955 (2.50%) | ko04062 |
| 15 | Calcium signaling pathway | 942 (2.47%) | ko04020 |
| 16 | Axon guidance | 939 (2.46%) | ko04360 |
| 17 | Insulin signaling pathway | 912 (2.39%) | ko04910 |
| 18 | Huntington's disease | 907 (2.37%) | ko05016 |
| 19 | Leukocyte transendothelial migration | 869 (2.27%) | ko04670 |
| 20 | Protein processing in endoplasmic reticulum | 864 (2.26%) | ko04141 |
Sequences recovered in the glutathione synthesizing pathway
| Gene id | Description | Length | Matched |
|---|---|---|---|
| dre:100002145 | Gamma-glutamyltranspeptidase | 2082 | 0 |
| dre:100006589 | Isocitrate dehydrogenase 1 (NADP+) | 1290 | 1254 |
| dre:100124622 | Glutathione | 672 | 514 |
| dre:100330864 | Ribonucleoside-diphosphate reductase subunit M2-like | 1161 | 1057 |
| dre:100333757 | Gamma-glutamyltransferase 5-like | 1521 | 1162 |
| dre:114426 | Ornithine decarboxylase | 1386 | 1370 |
| dre:30733 | Ribonucleotide reductase M2 polypeptide | 1161 | 1057 |
| dre:30740 | Ribonucleotide reductase M1 polypeptide | 2385 | 2385 |
| dre:322533 | Alanyl (membrane) aminopeptidase b | 2898 | 1238 |
| dre:324366 | Glutathione | 660 | 658 |
| dre:324900 | Protein-disulfide reductase (glutathione) | 519 | 515 |
| dre:326857 | Glutamate-cysteine ligase, catalytic subunit | 1896 | 1896 |
| dre:333974 | Glutamate-cysteine ligase, modifier subunit | 822 | 736 |
| dre:352926 | Glutathione peroxidase 1a | 576 | 565 |
| dre:352928 | Glutathione peroxidase 4a | 561 | 561 |
| dre:352929 | Glutathione peroxidase 4b | 576 | 576 |
| dre:386951 | Isocitrate dehydrogenase 2 (NADP+), mitochondrial | 1350 | 1350 |
| dre:394009 | Spermidine synthase | 870 | 749 |
| dre:406278 | Gamma-glutamylcyclotransferase | 663 | 600 |
| dre:406703 | Glutathione | 672 | 571 |
| dre:406736 | Glutathione | 453 | 425 |
| dre:406762 | Phosphogluconate hydrogenase | 1536 | 1418 |
| dre:431762 | Glutathione | 459 | 451 |
| dre:436833 | Glutathione | 690 | 680 |
| dre:436894 | Glutathione | 723 | 722 |
| dre:449784 | Microsomal glutathione | 465 | 244 |
| dre:450084 | Glutathione synthetase | 1428 | 1385 |
| dre:552981 | Glutathione peroxidase 7 | 561 | 0 |
| dre:553169 | Glutathione | 627 | 625 |
| dre:553575 | Glutathione reductase (NADPH) | 1278 | 1257 |
| dre:555478 | Aminopeptidase N | 2883 | 763 |
| dre:562854 | Leucine aminopeptidase 3 | 1554 | 1320 |
| dre:563972 | Glutathione | 729 | 729 |
| dre:566746 | Gamma-glutamyltranspeptidase | 1773 | 82 |
| dre:567275 | Glutathione | 423 | 423 |
| dre:568744 | Glutathione | 660 | 658 |
| dre:569014 | Gamma-glutamyltranspeptidase 1-like | 1725 | 281 |
| dre:570579 | Glucose-6-phosphate dehydrogenase | 1572 | 1572 |
| dre:571365 | Glutathione | 660 | 658 |
| dre:723997 | Microsomal glutathione | 411 | 0 |
| dre:79381 | Glutathione | 627 | 626 |
| dre:798788 | Glutathione peroxidase 3 | 669 | 529 |
| dre:799288 | Glutathione | 672 | 512 |
| dre:80872 | Spermine synthase | 1083 | 1057 |
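The recovery figure quoted in the abstract (41 of 44 genes) can be cross-checked against the table above. The sketch below transcribes the Gene id, Length and Matched columns and counts entries with a non-zero match; treating "Matched > 0" as "recovered" and reading Matched / Length as a coverage fraction are assumptions made here, since the table does not define these columns.

```python
# (zebrafish gene id, Length, Matched) transcribed from the table above.
ROWS = """
100002145 2082 0     100006589 1290 1254  100124622 672 514
100330864 1161 1057  100333757 1521 1162  114426 1386 1370
30733 1161 1057      30740 2385 2385      322533 2898 1238
324366 660 658       324900 519 515       326857 1896 1896
333974 822 736       352926 576 565       352928 561 561
352929 576 576       386951 1350 1350     394009 870 749
406278 663 600       406703 672 571       406736 453 425
406762 1536 1418     431762 459 451       436833 690 680
436894 723 722       449784 465 244       450084 1428 1385
552981 561 0         553169 627 625       553575 1278 1257
555478 2883 763      562854 1554 1320     563972 729 729
566746 1773 82       567275 423 423       568744 660 658
569014 1725 281      570579 1572 1572     571365 660 658
723997 411 0         79381 627 626        798788 669 529
799288 672 512       80872 1083 1057
""".split()

# Group the flat token list into (gene_id, length, matched) triples.
genes = [
    (ROWS[i], int(ROWS[i + 1]), int(ROWS[i + 2]))
    for i in range(0, len(ROWS), 3)
]

# Assumption: a gene counts as recovered when any part of it was matched.
recovered = [g for g in genes if g[2] > 0]
missing = [g[0] for g in genes if g[2] == 0]

# Assumed coverage: fraction of the reference length that was matched.
coverage = {gene_id: matched / length for gene_id, length, matched in genes}

print(f"{len(recovered)} of {len(genes)} genes recovered")
print("not recovered:", ", ".join("dre:" + g for g in missing))
```

Under these assumptions the count agrees with the abstract, and the three unmatched entries are the rows with a Matched value of 0.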
Genes determined to be under positive selection
| Gene id | Model | Log likelihood | d | Estimates of parameters | Sites under selection |
|---|---|---|---|---|---|
| dre_322533 | M7 (beta) | −4530.079761 | 0.3750 | | 164, 167, 256 |
| dre_322533 | M8 (beta & ω) | −4519.934204 | 0.7805 | | |
| dre_406703 | M7 (beta) | −1299.159488 | 0.1663 | | No |
| dre_406703 | M8 (beta & ω) | −1295.384925 | 16.1965 | | |
| dre_563972 | M7 (beta) | −1399.109283 | 0.3750 | | 224, 226, 227, 233 |
| dre_563972 | M8 (beta & ω) | −1390.997418 | 2.5343 | | |
| dre_79381 | M7 (beta) | −1105.063900 | 0.1428 | | 129, 174 |
| dre_79381 | M8 (beta & ω) | −1100.372249 | 14.0364 | | |
| dre_799288 | M7 (beta) | −1334.096144 | 0.1833 | | No |
| dre_799288 | M8 (beta & ω) | −1328.432914 | 13.0557 | | |
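The likelihood ratio test mentioned in the abstract can be illustrated with the log likelihoods in the table above. The sketch below assumes the usual convention for comparing M7 with M8: twice the log-likelihood difference is referred to a chi-square distribution with 2 degrees of freedom. The source does not state the degrees of freedom or the significance threshold, so both are assumptions, and the function name `lrt_m7_vs_m8` is hypothetical.

```python
import math

# Log likelihoods (M7, M8) transcribed from the table above.
LOG_LIKELIHOODS = {
    "dre_322533": (-4530.079761, -4519.934204),
    "dre_406703": (-1299.159488, -1295.384925),
    "dre_563972": (-1399.109283, -1390.997418),
    "dre_79381": (-1105.063900, -1100.372249),
    "dre_799288": (-1334.096144, -1328.432914),
}


def lrt_m7_vs_m8(lnl_m7, lnl_m8):
    """Return (statistic, p_value) for the M7 versus M8 comparison.

    Assumes 2 degrees of freedom. For a chi-square distribution with
    2 degrees of freedom the survival function has the closed form
    exp(-x / 2), so no statistics library is needed.
    """
    statistic = 2.0 * (lnl_m8 - lnl_m7)
    p_value = math.exp(-statistic / 2.0) if statistic > 0 else 1.0
    return statistic, p_value


results = {
    gene: lrt_m7_vs_m8(m7, m8) for gene, (m7, m8) in LOG_LIKELIHOODS.items()
}

ALPHA = 0.05  # assumed significance threshold
for gene, (statistic, p_value) in results.items():
    verdict = "reject M7" if p_value < ALPHA else "keep M7"
    print(f"{gene}: 2*dlnL = {statistic:.3f}, p = {p_value:.2e} ({verdict})")
```

Under these assumptions all five genes favour M8 over M7 at the 0.05 level, which is consistent with their being reported as under positive selection.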