Qing Yang, Fanyue Sun, Zhi Yang, Hongjun Li.
Abstract
Calanus sinicus Brodsky (Copepoda, Crustacea) is a dominant zooplanktonic species widely distributed in the marginal seas of the Northwest Pacific Ocean. In this study, we utilized an RNA-Seq-based approach to develop molecular resources for C. sinicus. Adult samples were sequenced using the Illumina HiSeq 2000 platform. The sequencing data generated 69,751 contigs from 58.9 million filtered reads. The assembled contigs had an average length of 928.8 bp. Gene annotation allowed the identification of 43,417 unigene hits against the NCBI database. Gene ontology (GO) and KEGG pathway mapping analysis revealed various functional genes related to diverse biological functions and processes. Transcripts potentially involved in stress response and lipid metabolism were identified among these genes. Furthermore, 4,871 microsatellites and 110,137 single nucleotide polymorphisms (SNPs) were identified in the C. sinicus transcriptome sequences. SNP validation by the melting temperature (Tm)-shift method suggested that 16 primer pairs amplified target products and showed biallelic polymorphism among 30 individuals. The present work demonstrates the power of Illumina-based RNA-Seq for the rapid development of molecular resources in nonmodel species. The validated SNP set from our study is currently being utilized in an ongoing ecological analysis to support a future study of C. sinicus population genetics.
Year: 2014 PMID: 24982883 PMCID: PMC4055022 DOI: 10.1155/2014/493825
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. Schematic presentation of the copepod Calanus sinicus transcriptome analysis. After sequencing, raw reads were trimmed by stripping the adaptor sequences and ambiguous nucleotides using SeqPrep and Sickle. De novo assembly was performed using Trinity. The assembled contigs were used for three separate analyses: (a) gene identification and annotation analysis; (b) pathway analysis; and (c) SSR and SNP screening and validation.
Summary of RNA-Seq of the copepod Calanus sinicus transcriptome.
| Category | Number/length |
|---|---|
| Reads from raw data | 58,944,478 |
| Average read length (bp) | 100 |
| Reads after trimming | 57,773,604 |
| Percentage retained | 98.0% |
| Average read length after trimming (bp) | 97.9 |
| Contigs after removing redundancy | 69,751 |
| Average length (bp) | 928.8 |
| Final N50 (bp) | 1,127 |
| Unigenes | 43,417 |
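
The retention percentage and N50 in the summary table follow from simple arithmetic on read counts and contig lengths. The sketch below is illustrative only: the function names and the toy contig lengths are assumptions, not part of the original pipeline, and only the two read counts are taken from the table.

```python
def n50(lengths):
    """Return the N50: the contig length at which the running total of
    contigs sorted from longest to shortest first reaches at least half
    of all assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


def percent_retained(raw_reads, trimmed_reads):
    """Percentage of reads that survive trimming."""
    return 100.0 * trimmed_reads / raw_reads


# Hypothetical contig lengths, for illustration only.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))

# Read counts as reported in the summary table.
print(round(percent_retained(58_944_478, 57_773_604), 1))
```

With the table's read counts, the second call reproduces the reported 98.0% retention.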
Figure 2. Size distribution of the assembled contigs in the Calanus sinicus transcriptome.
Figure 3. Gene ontology classification of assembled unigenes of the Calanus sinicus transcriptome on biological process, cellular component, and molecular function levels.
Representative transcripts involved in stress response and regulation of diapause in the Calanus sinicus transcriptome.
| Gene function | Number of unigenes | Size range (bp) |
|---|---|---|
| Response to stimulus | | |
| Heat shock protein 90 | 10 | 92–714 |
| Heat shock protein 70 | 17 | 120–900 |
| Heat shock protein 60 | 1 | 584 |
| Heat shock protein 40 | 1 | 410 |
| Heat shock protein 10 | 1 | 112 |
| Cytochrome P450 (CYP) | 71 | 103–551 |
| Glutathione S-transferase (GST) | 31 | 103–409 |
| Ferritin | 14 | 105–226 |
| Copper/zinc superoxide dismutase (Cu/Zn-SOD) | 12 | 156–280 |
| Mitochondrial manganese superoxide dismutase (Mn-SOD) | 1 | 230 |
| Catalase | 4 | 207–696 |
| Diapause/lipid metabolism | | |
| Long-chain-fatty-acid-CoA ligase 3-like | 35 | 115–726 |
| Fatty acid binding protein (FABP) | 3 | 86–135 |
| Long-chain fatty acid transport protein 4-like | 8 | 167–659 |
| Elongation of very long chain fatty acids protein (ELOV) | 19 | 88–363 |
| Short-chain dehydrogenase/reductase family 16C member 6-like | 8 | 116–312 |
| Xanthine dehydrogenase (XAD) | 12 | 139–1318 |
| Hippocalcin | 1 | 118 |
| Ecdysteroid receptor (Ecr) | 3 | 277–890 |
KEGG biochemical mapping for Calanus sinicus.
| KEGG pathways | Subpathways | Number of isogenes | Number of genes |
|---|---|---|---|
| Metabolism | Metabolism of cofactors and vitamins | 283 | 191 |
| | Amino acid metabolism | 764 | 492 |
| | Nucleotide metabolism | 424 | 282 |
| | Metabolism of terpenoids and polyketides | 134 | 87 |
| | Glycan biosynthesis and metabolism | 401 | 260 |
| | Lipid metabolism | 740 | 513 |
| | Xenobiotics biodegradation and metabolism | 285 | 199 |
| | Energy metabolism | 541 | 373 |
| | Carbohydrate metabolism | 901 | 559 |
| | Metabolism of other amino acids | 365 | 207 |
| | Biosynthesis of other secondary metabolites | 132 | 102 |
| | Overview | 270 | 161 |
| Genetic information processing | Replication and repair | 327 | 194 |
| | Translation | 1024 | 679 |
| | Transcription | 512 | 341 |
| | Folding, sorting, and degradation | 1051 | 686 |
| Environmental information processing | Signal transduction | 1534 | 979 |
| | Signaling molecules and interaction | 247 | 201 |
| | Membrane transport | 90 | 67 |
| Cellular processes | Cell growth and death | 702 | 444 |
| | Cell motility | 282 | 170 |
| | Transport and catabolism | 1121 | 736 |
| | Cell communication | 774 | 478 |
| Organismal systems | Nervous system | 774 | 505 |
| | Excretory system | 312 | 194 |
| | Sensory system | 198 | 136 |
| | Digestive system | 672 | 470 |
| | Circulatory system | 380 | 259 |
| | Endocrine system | 953 | 625 |
| | Immune system | 698 | 437 |
| | Development | 451 | 298 |
| | Environmental adaptation | 331 | 198 |
Figure 4. Clusters of orthologous groups (COG) classification. In total, 6,383 of the 43,417 sequences with nonredundant (nr) protein hits were grouped into 25 COG classifications.
Summary of simple sequence repeat (SSR) types in the Calanus sinicus transcriptome.
| SSR type | Number of SSRs | Percentage of total SSRs (%) |
|---|---|---|
| Dinucleotide | 236 | 4.8 |
| Trinucleotide | 4,500 | 92.4 |
| Tetranucleotide | 118 | 2.4 |
| Pentanucleotide and hexanucleotide | 17 | 0.3 |
| Total | 4,871 | |
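
As an illustration of how di- to hexanucleotide repeats of this kind can be located in a contig, here is a minimal regex-based sketch. It is not the tool or the criteria used in the study: the minimum repeat counts in `MIN_REPEATS`, the function name `find_ssrs`, and the example sequence are all assumptions made for the example.

```python
import re

# Assumed minimum number of tandem copies per motif length
# (hypothetical thresholds, not taken from the paper).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # A motif of motif_len bases followed by at least min_rep - 1 more copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT"),
            # so each repeat is reported only under its shortest motif.
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(1)) // motif_len))
    return sorted(hits)


# Hypothetical sequence with one dinucleotide and one trinucleotide repeat.
example = "GG" + "AT" * 7 + "CC" + "CAG" * 5 + "T"
for start, motif, copies in find_ssrs(example):
    print(start, motif, copies)
```

Tallying the motif lengths of such hits across all contigs would give per-type counts of the kind summarized in the table.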
Figure 5. Classification of single nucleotide polymorphisms (SNPs) identified in the Calanus sinicus transcriptome.
Single nucleotide polymorphism (SNP) markers derived from the transcriptome of Calanus sinicus.
| Locus | Putative function | SNP type | Primer sequence (5′-3′) | Ho | He | Minor allele and frequency |
|---|---|---|---|---|---|---|
| CsSNP01 | Heat shock protein 40 | G/A | AS1: | 0.156 | 1.257 | A |
| CsSNP02 | Cytosolic heat shock protein 90 kDa | G/T | AS1: | 0.333 | 0.282 | G |
| CsSNP03 | Heat shock protein 70 | C/T | AS1: | 0.500 | 0.523 | T |
| CsSNP04 | Ferritin | G/T | AS1: | 0.194 | 0.325 | G |
| CsSNP05 | 10 kDa heat shock protein | C/T | AS1: | 0.822 | 0.522 | T |
| CsSNP06 | Prophenoloxidase | G/A | AS1: | 0.098 | 0.475 | A |
| CsSNP07 | Selenium-dependent salivary glutathione peroxidase | G/A | AS1: | 0.125 | 0.225 | G |
| CsSNP08 | Selenium-dependent glutathione peroxidase | G/A | AS1: | 0.246 | 0.325 | A |
| CsSNP09 | Superoxide dismutase | G/A | AS1: | 0.056 | 0.468 | A |
| CsSNP10 | Ferritin heavy subunit | C/T | AS1: | 0.083 | 0.475 | C |
| CsSNP11 | Lysosomal aspartic protease precursor | C/T | AS1: | 0.500 | 0.482 | T |
| CsSNP12 | Catalase | C/T | AS1: | 0.250 | 0.335 | C |
| CsSNP13 | Trypsin | C/T | AS1: | 0.833 | 0.501 | T |
| CsSNP14 | Zwilch-like protein | C/T | AS1: | 0.778 | 0.516 | C |
| CsSNP15 | Broad-complex core protein isoform 6 | C/T | AS1: | 0.223 | 0.212 | T |
| CsSNP16 | V-type proton ATPase 116 kDa subunit | C/T | AS1: | 0.456 | 0.487 | C |
GC tails are underlined, and additional deliberate mismatches are boxed. AS1 and AS2: allele-specific primers; CR: common reverse primer; Ho: observed heterozygosity; He: expected heterozygosity.
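
For readers unfamiliar with the two heterozygosity measures, the sketch below shows one common way to compute them for a biallelic locus from genotype counts. It assumes the simple Hardy-Weinberg expectation 2pq for expected heterozygosity. The source does not state which estimator was used, and the genotype counts in the example are hypothetical.

```python
def heterozygosity(n_aa, n_ab, n_bb):
    """Return (observed, expected) heterozygosity for one biallelic locus.

    n_aa, n_ab, n_bb: numbers of individuals with each genotype.
    Expected heterozygosity is taken as 2pq (an assumed estimator).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped individuals")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    h_obs = n_ab / n                 # fraction of heterozygous individuals
    h_exp = 2.0 * p * q              # Hardy-Weinberg expectation
    return h_obs, h_exp


# Hypothetical genotype counts for 30 individuals.
h_obs, h_exp = heterozygosity(12, 15, 3)
print(round(h_obs, 3), round(h_exp, 3))
```

Under this definition, expected heterozygosity at a biallelic locus reaches its maximum of 0.5 when both alleles are equally frequent.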
Figure 6. Melting curve of locus CsSNP02 genotyped with the Tm-shift method. GC tails of different lengths were added to the allele-specific primers. Samples homozygous for allele A or T are amplified with the short GC-tailed primer and show the lower-temperature peak. Samples homozygous for allele G or C are amplified with the long GC-tailed primer and show the higher-temperature peak. Heterozygous samples show both temperature peaks.
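
The genotype-calling rule described in the caption can be sketched as a small function. This is a simplified illustration, not the authors' scoring procedure: the function name, the single temperature threshold separating the two peaks, and the example peak temperatures are all assumptions.

```python
def call_genotype(peak_temps, threshold, low_allele, high_allele):
    """Call a genotype from the melting peaks of one sample.

    peak_temps: melting peak temperatures in degrees Celsius.
    threshold: assumed cut-off separating the low and high peaks.
    low_allele: allele on the short GC-tailed primer (A or T).
    high_allele: allele on the long GC-tailed primer (G or C).
    Returns a two-letter genotype, or None if no peak was detected.
    """
    has_low = any(t < threshold for t in peak_temps)
    has_high = any(t >= threshold for t in peak_temps)
    if has_low and has_high:
        return low_allele + high_allele  # heterozygote: both peaks
    if has_low:
        return low_allele * 2            # homozygous for the A/T allele
    if has_high:
        return high_allele * 2           # homozygous for the G/C allele
    return None                          # no amplification


# Hypothetical peaks for a G/T locus such as CsSNP02, with an assumed 80.0 degree cut-off.
for peaks in ([78.5], [83.1], [78.4, 83.0]):
    print(peaks, call_genotype(peaks, 80.0, "T", "G"))
```

In practice the cut-off would have to be chosen per locus from the observed melting curves, since peak temperatures depend on the amplicon and the tail lengths.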