Peifeng Ji, Guiming Liu, Jian Xu, Xumin Wang, Jiongtang Li, Zixia Zhao, Xiaofeng Zhang, Yan Zhang, Peng Xu, Xiaowen Sun.
Abstract
BACKGROUND: Common carp (Cyprinus carpio) is one of the most important aquaculture species of Cyprinidae, with an annual global production of 3.4 million tons, accounting for nearly 14% of the world's freshwater aquaculture production. Given the economic and ecological importance of common carp, genomic data are urgently needed for genetic improvement. However, sufficient transcriptome data are still not available. The objective of this project is to sequence the transcriptome deeply and provide well-assembled transcriptome sequences to the common carp research community.

RESULTS: Transcriptome sequencing of common carp was performed on the Roche 454 platform. A total of 1,418,591 clean ESTs were collected and assembled into 36,811 cDNA contigs, with an average length of 888 bp and an N50 length of 1,002 bp. The assembled contigs were annotated, and a total of 19,165 unique proteins were identified. Gene Ontology and KEGG analyses were performed to classify all contigs into functional categories for understanding gene functions and regulatory pathways. Open reading frames (ORFs) were detected in 29,869 (81.1%) contigs, with an average ORF length of 763 bp. From these contigs, 9,625 full-length cDNAs were identified, with sequence lengths from 201 bp to 9,956 bp. Comparative analysis revealed that 27,693 (75.2%) contigs have significant similarity to zebrafish RefSeq proteins, and 24,371 (66.2%), 24,501 (66.5%) and 25,025 (70.0%) to Tetraodon, medaka and three-spined stickleback RefSeq proteins, respectively. A total of 2,064 microsatellites were initially identified from 1,730 contigs, and 1,639 unique sequences had sufficient flanking sequence on both sides for primer design.
Year: 2012 PMID: 22514716 PMCID: PMC3325976 DOI: 10.1371/journal.pone.0035152
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Statistics of common carp transcriptome sequences.
| Number of raw reads | 2,116,226 |
| Average length of raw reads | 331 bp |
| Number of cleaned reads | 1,418,591 |
| Average length of cleaned reads | 321 bp |
| Median length of cleaned reads | 328 bp |
| Sequences for assembly | 1,150,339 |
Statistics of transcriptome assembly.
| Contig number | 36,811 |
| Maximum contig length | 14,971 bp |
| Minimum contig length | 100 bp |
| Average contig length | 888 bp |
| N50 length | 1,002 bp |
| Average number of reads per contig | 31.3 |
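The N50 reported above is the contig length at which the contigs of that length or longer together hold at least half of the assembled bases. A minimal Python sketch of that calculation follows; the function name `n50` and the toy lengths are illustrative assumptions, not data or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest contig down until half of all bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not the carp assembly).
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)                      # 400
toy_mean = sum(toy_lengths) / len(toy_lengths)  # 300.0
```

In the toy example the N50 exceeds the mean length, the same ordering seen in the table above (1,002 bp versus 888 bp), because N50 weights long contigs more heavily.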
Summary of BLASTX search results of common carp transcriptome.
| Database | Common carp hits | Unique proteins | % of total unique proteins |
| NR | 28,055 | 19,165 | |
| RefSeq/Ensembl | | | |
| Zebrafish | 27,693 | 14,554 | 53.4% of 27,271 |
| Medaka | 24,501 | 12,471 | 50.6% of 24,661 |
| Tetraodon | 24,371 | 12,536 | 54.2% of 23,118 |
| Three-spined stickleback | 25,025 | 13,147 | 47.7% of 27,576 |
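The last column of the table above can be reproduced by dividing the number of unique proteins hit by the size of each reference protein set. The short Python check below uses the figures from the table; the dictionary layout and the name `percent` are illustrative assumptions.

```python
# (unique proteins hit, total proteins in the reference set), copied from the table.
blastx_unique = {
    "Zebrafish": (14554, 27271),
    "Medaka": (12471, 24661),
    "Tetraodon": (12536, 23118),
    "Three-spined stickleback": (13147, 27576),
}


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


coverage = {species: percent(hit, total)
            for species, (hit, total) in blastx_unique.items()}

for species, value in coverage.items():
    print(f"{species}: {value}% of reference proteins hit")
```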
Figure 1. Comparative analysis and functional classification of common carp and zebrafish genes.
KEGG biochemical mappings for common carp.
| KEGG categories represented | Unique sequences |
| Metabolism | |
| Carbohydrate Metabolism | 906 (682) |
| Amino Acid Metabolism | 210 (169) |
| Energy Metabolism | 174 (134) |
| Nucleotide Metabolism | 144 (109) |
| Metabolism of Cofactors and Vitamins | 117 (93) |
| Lipid Metabolism | 248 (171) |
| Glycan Biosynthesis and Metabolism | 149 (119) |
| Metabolism of Other Amino Acids | 82 (55) |
| Xenobiotics Biodegradation and Metabolism | 69 (51) |
| Biosynthesis of Secondary Metabolites | 20 (17) |
| Biosynthesis of Polyketides and Nonribosomal Peptides | 22 (21) |
| Genetic Information Processing | |
| Replication and Repair | 129 (101) |
| Folding, Sorting and Degradation | 408 (307) |
| Transcription | 187 (147) |
| Translation | 426 (290) |
| Environmental Information Processing | |
| Signal Transduction | 564 (383) |
| Signaling Molecules and Interaction | 255 (184) |
| Membrane Transport | 28 (24) |
| Cellular Processes | |
| Cell Motility | 134 (83) |
| Cell Growth and Death | 245 (176) |
| Transport and Catabolism | 411 (280) |
| Cell Communication | 281 (174) |
| Organismal Systems | |
| Immune System | 441 (308) |
| Endocrine System | 277 (193) |
| Circulatory System | 112 (73) |
| Digestive System | 187 (126) |
| Excretory System | 98 (64) |
| Nervous System | 265 (186) |
| Sensory System | 34 (24) |
| Development | 160 (109) |
| Environmental Adaptation | 37 (25) |
Unique sequences indicate non-redundant sequences involved in a particular KEGG category.
Figure 2. Length distribution of identified ORFs from common carp transcriptome assembly.
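The record does not say which tool or ORF definition produced these lengths. As a rough illustration only, the Python sketch below assumes an ORF runs from an ATG to the first in-frame stop codon and searches all six reading frames; the function names and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse-complement a DNA string (characters other than ACGT are left as-is)."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_orf(seq):
    """Return the longest ATG-to-stop ORF (stop codon included) over six frames.

    Assumption: an ORF must have both a start and a stop codon; real ORF
    finders often also accept ORFs truncated at the contig ends.
    """
    seq = seq.upper()
    best = ""
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    orf = strand[start:i + 3]
                    if len(orf) > len(best):
                        best = orf
                    start = None  # look for the next ORF in this frame
    return best


# Hypothetical contig: an 18 bp ORF embedded in frame 3 of the forward strand.
example = "CC" + "ATG" + "GCT" * 4 + "TAA" + "GG"
orf = longest_orf(example)
```

Collecting `len(longest_orf(contig))` over every contig would give a length distribution of the kind summarised in the figure, under the stated assumptions.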
Figure 3. Distribution of common carp transcriptome contigs on zebrafish chromosomes.
Figure 4. Length distribution of putative full-length cDNAs of common carp.
Statistics of microsatellites identified from common carp transcriptome.
| Total number of contigs | 36,811 |
| Microsatellites identified | 2,064 |
| Di-nucleotide repeats | 964 |
| Tri-nucleotide repeats | 951 |
| Tetra-nucleotide repeats | 128 |
| Penta-nucleotide repeats | 29 |
| Hexa-nucleotide repeats | 10 |
| Number of contigs containing microsatellites | 1,730 |
| Number of microsatellites with sufficient flanking sequence for PCR primer design | 1,639 |
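The record does not name the software or thresholds used to find these repeats. The Python sketch below shows one common approach: a regular-expression scan for di- to hexa-nucleotide repeats, followed by a check that enough sequence remains on both sides for primers. The minimum repeat counts in `MIN_REPEATS` and the 50 bp default flank are hypothetical values chosen for illustration.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_microsatellites(seq, min_repeats=MIN_REPEATS):
    """Return (start, end, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least min_rep - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AA" or "ACAC".
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            copies = (match.end() - match.start()) // motif_len
            hits.append((match.start(), match.end(), motif, copies))
    return hits


def has_primer_flanks(start, end, seq_len, flank=50):
    """True if at least `flank` bases lie on each side of the repeat."""
    return start >= flank and seq_len - end >= flank


# Hypothetical sequence with an (AC)8 and an (ATG)5 repeat.
example = "GGATCT" + "AC" * 8 + "TTGACC" + "ATG" * 5 + "CCGTA"
hits = find_microsatellites(example)
# hits == [(6, 22, "AC", 8), (28, 43, "ATG", 5)]
```

Filtering the hits with `has_primer_flanks` mirrors the last row of the table, where only microsatellites with enough flanking sequence are kept for primer design.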
Figure 5. Transcriptome assembly and analysis pipeline.