Hongyu Ma, Chunyan Ma, Shujuan Li, Wei Jiang, Xincang Li, Yuexing Liu, Lingbo Ma.
Abstract
In this study, we report the characterization of the first transcriptome of the mud crab (Scylla paramamosain). Pooled cDNAs of four tissue types from twelve wild individuals were sequenced using the Roche 454 FLX platform. The analysis included de novo assembly of transcriptome sequences, functional annotation, and molecular marker discovery. A total of 1,314,101 high-quality reads with an average length of 411 bp were generated by 454 sequencing of a mixed cDNA library. De novo assembly of these reads produced 76,778 contigs (comprising 818,154 reads) with 5.4-fold average sequencing coverage; the remaining 495,947 reads were singletons. A total of 78,268 unigenes were identified based on sequence similarity with known proteins (E ≤ 0.00001) in the UniProt and non-redundant protein databases. Meanwhile, 44,433 sequences were identified (E ≤ 0.00001) using a BLASTN search against the NCBI nucleotide database. Gene Ontology (GO) analysis indicated that biosynthetic process, cell part, and ion binding were the most abundant terms in the biological process, cellular component, and molecular function categories, respectively. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that 4,878 unigenes were distributed across 281 different pathways. In addition, 19,011 microsatellites and 37,063 potential single nucleotide polymorphisms were detected in the transcriptome of S. paramamosain. Finally, thirty polymorphic microsatellite markers were developed and used to assess the genetic diversity of a wild population of S. paramamosain. Existing sequence resources for S. paramamosain have so far been extremely limited. The present study provides a characterization of the transcriptome from multiple tissues and individuals, as well as an assessment of the genetic diversity of a wild population.
These sequence resources will facilitate the investigation of population genetic diversity, the development of genetic maps, and the conduct of molecular marker-assisted breeding in S. paramamosain and related crab species.
Year: 2014 PMID: 25054331 PMCID: PMC4108364 DOI: 10.1371/journal.pone.0102668
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Characteristics of reads generated from 454 pyrosequencing.
| Category | The former quarter run | The later quarter run | The final half run | Total |
| Raw sequencing reads | 398,920 | 456,640 | 1,068,775 | 1,924,335 |
| High quality reads | 256,228 | 331,441 | 726,432 | 1,314,101 |
| Total bases (bp) | 100,671,040 | 139,089,398 | 300,240,405 | 540,000,843 |
| Average read length (bp) | 392.90 | 419.65 | 413.31 | 410.93 |
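The average read lengths in the table follow directly from dividing total bases by high-quality reads. A minimal Python sketch of that arithmetic is shown here; the counts are copied from the table, while the variable names and the rounding to two decimals are assumptions made for illustration, not part of the original analysis.

```python
# Values copied from the table: (high-quality reads, total bases in bp).
# The dictionary layout and names are illustrative assumptions.
runs = {
    "former quarter run": (256_228, 100_671_040),
    "later quarter run": (331_441, 139_089_398),
    "final half run": (726_432, 300_240_405),
}

# Average read length per run = total bases / high-quality reads.
per_run_average = {
    name: round(bases / reads, 2) for name, (reads, bases) in runs.items()
}

# Totals across the three runs, and the pooled average read length.
total_reads = sum(reads for reads, _ in runs.values())
total_bases = sum(bases for _, bases in runs.values())
overall_average = round(total_bases / total_reads, 2)

for name, value in per_run_average.items():
    print(f"{name}: {value:.2f} bp")
print(f"total: {total_reads} reads, {total_bases} bp, {overall_average:.2f} bp")
```

Note that the pooled average is the total bases divided by the total reads, not the mean of the three per-run averages, because the runs contribute different numbers of reads.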
Figure 1. Size distribution of reads generated by 454 FLX platform pyrosequencing.
Summary of contigs generated by de novo assembly.
| Item | Number |
| Number of contigs | 76,778 |
| Total bases of contigs (bp) | 46,525,023 |
| Average length of contigs (bp) | 605.97 |
| Largest contig length (bp) | 3,579 |
| Number of contigs (≥1 kbp) | 4,058 |
| Number of reads in contigs | 818,154 |
| N50 of contigs (bp) | 639 |
| Number of singletons | 495,947 |
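For readers unfamiliar with the N50 statistic in this table, the following Python sketch shows one common way to compute it: sort contig lengths from longest to shortest and report the length at which the running total first reaches half of all assembled bases. The source does not state which N50 convention the assembler used, so this definition is an assumption, and the function name `n50` and the toy lengths are hypothetical. The average contig length, by contrast, is simply the total bases divided by the number of contigs reported in the table.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (common convention, assumed here)."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half of all bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths, purely illustrative (not data from the study).
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)

# Average contig length recomputed from the table's totals.
average_contig_length = round(46_525_023 / 76_778, 2)

print(toy_n50)
print(average_contig_length)
```

The per-contig lengths are not listed in this record, so the reported N50 of 639 bp cannot be recomputed here; only the average length can be checked against the table.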
Figure 2. Size distribution of contigs resulting from de novo assembly.
Figure 3. Classification of unigenes in three GO categories (level 3).
Figure 3–1 shows biological process, Figure 3–2 cellular component, and Figure 3–3 molecular function. The x-axis indicates the number of unigenes assigned to each term; the y-axis indicates the GO term.
Figure 4. The ten most representative pathways resulting from KEGG pathway annotation.
The x-axis indicates the number of unigenes in a pathway; the y-axis indicates the ten representative pathways.