Jin-Tu Wang, Jiong-Tang Li, Xiao-Feng Zhang, Xiao-Wen Sun.
Abstract
BACKGROUND: Common carp (Cyprinus carpio) is thought to have undergone one extra round of genome duplication compared to zebrafish. Transcriptome analysis has been used to study the existence and timing of genome duplication in species for which genome sequences are incomplete. Large-scale transcriptome data for the common carp genome should help reveal the timing of the additional duplication event.
Year: 2012 PMID: 22424280 PMCID: PMC3352309 DOI: 10.1186/1471-2164-13-96
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Summary of common carp contig annotation
| Method | Category | Database | Number |
|---|---|---|---|
| Homolog search | Protein-coding | Fish protein database* | 24,167 |
| | | UniProt database | 409 |
| | | NCBI nr protein database | 208 |
| | | UTRdb | 3,658 |
| | | NCBI nr nucleotide database | 14,524 |
| ab initio search | Protein-coding | CPC | 47 |
| Unknown | | | 6,656 |
* Fish protein database consists of protein sequences from Zebrafish, Fugu, Stickleback, Tetraodon and Medaka.
Figure 1. A bar plot showing the hits to protein sequences from five sequenced teleost species. Common carp contigs were aligned to protein sequences from Zebrafish, Fugu, Stickleback, Tetraodon and Medaka, respectively.
Figure 2. Data were grouped into bins of 0.01 Ks units for graphing. The Ks distributions of duplication events are shown in red for common carp and in green for zebrafish. A secondary Ks peak within common carp (red line) indicates a genome duplication; given the rate of synonymous substitutions per site per year, this peak gives the time of the 4R genome duplication. Within zebrafish, no secondary peak was observed in the Ks distribution of paralogous sequences (green line). The Ks distribution of the orthologous pairs is plotted as a blue line and shows a distinct secondary Ks peak, indicating the speciation time between these two species.
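The binning and dating steps described in the Figure 2 caption can be illustrated with a minimal sketch. It assumes the commonly used relation T = Ks / (2r), where r is the synonymous substitution rate per site per year. The function names, the toy Ks values, and the rate of 1e-8 are hypothetical placeholders, not values taken from the paper.

```python
from collections import Counter


def bin_ks(ks_values, bin_width=0.01, ks_max=2.0):
    """Count Ks values per bin of width `bin_width`, ignoring values >= ks_max."""
    counts = Counter()
    for ks in ks_values:
        if 0 <= ks < ks_max:
            # The small epsilon guards against floating-point results
            # such as 0.29 / 0.01 == 28.999999999999996.
            counts[int(ks / bin_width + 1e-9)] += 1
    return counts


def peak_bin_center(counts, bin_width=0.01):
    """Return the Ks value at the center of the most populated bin."""
    idx = max(counts, key=counts.get)
    return (idx + 0.5) * bin_width


def ks_to_years(ks, rate_per_site_per_year):
    """Convert a Ks peak to a divergence time, assuming T = Ks / (2 * r)."""
    return ks / (2.0 * rate_per_site_per_year)


# Toy Ks values (hypothetical); three of them fall in the bin [0.10, 0.11).
toy_ks = [0.101, 0.105, 0.108, 0.25, 0.55, 2.4]
counts = bin_ks(toy_ks)
peak = peak_bin_center(counts)

# Hypothetical rate; substitute the rate appropriate to the species studied.
ASSUMED_RATE = 1e-8
years = ks_to_years(peak, ASSUMED_RATE)
print(f"peak Ks ~ {peak:.3f}, estimated time ~ {years:.3g} years")
```

This sketch picks only the tallest bin. To isolate a secondary peak, the same functions can be applied to a Ks sub-range that excludes the primary peak.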
Number of sequences and paralogs within common carp and zebrafish
| Species | Sequences in final dataset (a) | Paralogous pairs | Paralogous sequences (b) | Percentage of paralogs (c) | Gene families (d) | Duplication events with median Ks < 2 (e) |
|---|---|---|---|---|---|---|
| common carp | 49,669 | 129,984 | 19,159 | 38.6% | 4,689 | 8,190 |
| zebrafish | 25,348 | 46,385 | 3,774 | 14.9% | 869 | 2,721 |
a Number of the longest sequences.
b Number of paralogous sequences found in the final dataset using BLASTN search.
c Percentage of paralogous sequences found in the final dataset.
d Number of gene families constructed with paralogous sequences using single linkage clustering.
e Number of duplication events with a median Ks < 2, which were used for the distributions in Figure 2.
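Footnote d refers to single linkage clustering of paralogous sequences into gene families. A minimal sketch of that idea follows, using a union-find structure: any two sequences connected by a chain of paralogous pairs end up in the same family. The function name and the toy pairs are hypothetical, and this is not the authors' actual pipeline. The last lines recompute the "Percentage of paralogs" column from the counts in the table.

```python
def single_linkage_families(pairs):
    """Group sequences into families: two sequences share a family
    if a chain of paralogous pairs connects them (single linkage)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    families = {}
    for seq in list(parent):
        families.setdefault(find(seq), set()).add(seq)
    return list(families.values())


# Toy paralogous pairs (hypothetical sequence identifiers).
toy_pairs = [("seqA", "seqB"), ("seqB", "seqC"), ("seqD", "seqE")]
families = single_linkage_families(toy_pairs)
print(sorted(len(f) for f in families))

# Percentage of paralogs = paralogous sequences / sequences in final dataset.
carp_pct = round(19159 / 49669 * 100, 1)
zebrafish_pct = round(3774 / 25348 * 100, 1)
print(carp_pct, zebrafish_pct)
```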
Figure 3. Distribution of common carp GO terms in biological process and molecular function categories. The relative proportions of GO terms represented by more than 100 contigs are shown for the biological process (A) and molecular function (B) categories in the GO vocabulary. The enriched GO terms in common carp (p < 0.05) are highlighted in orange.
The enriched pathways in common carp identified by KOBAS (corrected p-value < 0.05)
| KEGG pathway | ID | Common carp contigs proportion | Zebrafish gene proportion | Biological process GO term (level 2)* |
|---|---|---|---|---|
| Protein digestion and absorption | ko04974 | 547/10308 | 126/7433 | multicellular organismal process |
| Glycolysis/Gluconeogenesis | ko00010 | 358/10308 | 76/7433 | metabolic process |
| Pancreatic secretion | ko04972 | 564/10308 | 172/7433 | localization |
| Complement and coagulation cascades | ko04610 | 360/10308 | 105/7433 | immune system process |
| Starch and sucrose metabolism | ko00500 | 243/10308 | 56/7433 | metabolic process |
| Oxidative phosphorylation | ko00190 | 415/10308 | 139/7433 | metabolic process |
| Antigen processing and presentation | ko04612 | 278/10308 | 85/7433 | immune system process |
| Pyruvate metabolism | ko00620 | 175/10308 | 47/7433 | metabolic process |
| TCA cycle | ko00020 | 149/10308 | 36/7433 | metabolic process |
| Pentose phosphate pathway | ko00030 | 134/10308 | 31/7433 | cellular process |
| RNA transport | ko03013 | 352/10308 | 162/7433 | localization |
| Mineral absorption | ko04978 | 134/10308 | 43/7433 | developmental process |
| PPAR signaling pathway | ko03320 | 232/10308 | 96/7433 | response to stimulus |
| Protein processing in endoplasmic reticulum | ko04141 | 400/10308 | 215/7433 | cellular process |
| Ribosome biogenesis in eukaryotes | ko03008 | 113/10308 | 42/7433 | cellular component biogenesis |
| RNA degradation | ko03018 | 153/10308 | 77/7433 | metabolic process |
* The enriched pathways correspond to level 2 biological process GO terms in the GO vocabulary. These biological processes were also enriched in common carp, indicating consistency between the GO term comparison and the KEGG pathway analysis.
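To make the proportions in the table concrete, the following sketch computes a fold enrichment and an uncorrected Pearson chi-square test on the 2x2 table for one pathway. This is an illustrative stand-in only: the source reports corrected p-values from KOBAS and does not state which test was applied, so the statistics below are not expected to reproduce the published values. The function names are hypothetical.

```python
import math


def fold_enrichment(k_fg, n_fg, k_bg, n_bg):
    """Ratio of the foreground proportion to the background proportion."""
    return (k_fg / n_fg) / (k_bg / n_bg)


def chi_square_2x2(k_fg, n_fg, k_bg, n_bg):
    """Pearson chi-square (1 degree of freedom, no continuity correction)
    for in-pathway vs. not-in-pathway counts in two groups."""
    a, b = k_fg, n_fg - k_fg
    c, d = k_bg, n_bg - k_bg
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Protein digestion and absorption (ko04974): 547/10308 carp contigs
# versus 126/7433 zebrafish genes, taken from the table above.
fold = fold_enrichment(547, 10308, 126, 7433)
chi2, p_value = chi_square_2x2(547, 10308, 126, 7433)
print(f"fold enrichment = {fold:.2f}, chi2 = {chi2:.1f}, p = {p_value:.3g}")
```

When many pathways are tested at once, the raw p-values would still need a multiple-testing correction before being compared with the 0.05 threshold used in the table.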