Hyungtaek Jung, Russell E Lyons, Hung Dinh, David A Hurwood, Sean McWilliam, Peter B Mather.
Abstract
BACKGROUND: The giant freshwater prawn (Macrobrachium rosenbergii, or GFP) is the most economically important freshwater crustacean species. However, as little is known about its genome, 454 pyrosequencing of cDNA was undertaken to characterise its transcriptome and identify genes important for growth.
Year: 2011 PMID: 22174756 PMCID: PMC3234237 DOI: 10.1371/journal.pone.0027938
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of 454 pyrosequencing, assembly and analysis of M. rosenbergii transcriptomic sequences.
| Dataset name | | All | Muscle | Ovary | Testis |
| Total number of bases (Mbp) | | 244.37 | 114.38 | 86.06 | 43.94 |
| Average read length (bp) | | 310 | 311 | 308 | 311 |
| No. of reads | Total | 787,731 | 367,379 | 279,393 | 140,959 |
| | Assembled | 645,837 | 323,044 | 189,771 | 112,271 |
| | Singleton | 115,123 | 33,622 | 77,455 | 24,995 |
| | Repeat | 276 | 136 | 197 | 77 |
| No. of contigs | Total contigs | 8,411 | 1,723 | 5,346 | 1,004 |
| | Average contig length (bp) | 845 | 1,027 | 796 | 848 |
| | Largest contig (bp) | 7,531 | 7,304 | 6,955 | 7,530 |
| | No. of large contigs (> 500 bp) | 5,724 | 1,171 | 3,559 | 683 |
| | Average coverage (x) | 29.85 | 59.56 | 14.92 | 43.09 |
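The read counts in the table can be compared across libraries more easily as percentages of each library's total. The following is a minimal sketch of that arithmetic; the dictionary layout and the helper name `percent_of_total` are illustrative assumptions, and only the numbers come from the table.

```python
# Read counts copied from the summary table (All, Muscle, Ovary, Testis).
reads = {
    "All": {"total": 787_731, "assembled": 645_837, "singleton": 115_123},
    "Muscle": {"total": 367_379, "assembled": 323_044, "singleton": 33_622},
    "Ovary": {"total": 279_393, "assembled": 189_771, "singleton": 77_455},
    "Testis": {"total": 140_959, "assembled": 112_271, "singleton": 24_995},
}


def percent_of_total(counts, key):
    """Return counts[key] as a percentage of counts['total'], to one decimal."""
    return round(100.0 * counts[key] / counts["total"], 1)


# Percentage of reads that were assembled into contigs, per library.
assembled_pct = {lib: percent_of_total(c, "assembled") for lib, c in reads.items()}

# Sanity check: the three tissue libraries should add up to the "All" total.
tissue_total = sum(reads[lib]["total"] for lib in ("Muscle", "Ovary", "Testis"))

for lib, pct in assembled_pct.items():
    print(f"{lib}: {pct}% of reads assembled")
print("Tissue totals match 'All':", tissue_total == reads["All"]["total"])
```

On these figures the muscle library has the highest assembled fraction (roughly 88%) and the ovary library the lowest (roughly 68%), which is consistent with the ovary library contributing the most singletons.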
Figure 1. Summary of M. rosenbergii transcriptomic sequences.
The contig sequences are represented by solid bars and the singleton sequences by open bars.
Figure 2. Top 30 hit species distribution based on BLASTx.
The E-value cut-off was 1e–5. The top 30 hit species for the gene annotations show high homology to members of the phylum Arthropoda (Insecta and Crustacea) with known genome sequences. Only contig sequences were used. Bold text indicates non-arthropod homology.
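An E-value cut-off of this kind is usually applied as a filter over the search output before hits are counted. The sketch below assumes the hits are available in the standard 12-column BLAST tabular format (query ID in column 1, subject ID in column 2, E-value in column 11); this record does not say which output format the authors used, and the function name `best_hits` and the example rows are hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # cut-off quoted in the figure caption


def best_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Keep, per query, the hit with the lowest E-value below the cut-off.

    Assumes 12-column tab-separated BLAST output; shorter rows are skipped.
    Returns {query_id: (subject_id, evalue)}.
    """
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if len(row) < 12:
            continue  # blank or malformed line
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue >= cutoff:
            continue  # not significant at the chosen cut-off
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical rows for illustration only (not data from the study).
example_rows = [
    ["contig_1", "subject_a", "85.0", "200", "30", "0", "1", "600", "1", "200", "1e-40", "150"],
    ["contig_1", "subject_b", "70.0", "180", "54", "0", "1", "540", "1", "180", "1e-12", "80"],
    ["contig_2", "subject_c", "40.0", "50", "30", "0", "1", "150", "1", "50", "0.5", "25"],
]
example_text = "\n".join("\t".join(r) for r in example_rows)
example_best = best_hits(example_text)
print(example_best)
```

Here `contig_2` is dropped because its only hit does not pass the cut-off, and `contig_1` keeps the lower-E-value hit of its two.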
Figure 3. Comparative summary of M. rosenbergii transcriptomic sequences among the three libraries.
Putative sequence descriptions were counted from BLASTx results (E-value <1e–5) after excluding ribosomal proteins and redundant entries. Bold numbers indicate contigs and numbers in italics indicate singletons.
Figure 4. Gene ontology (GO) terms for the transcriptomic sequences of M. rosenbergii and comparison among libraries.
(A) cellular component, (B) molecular function and (C) biological process.
Summary of top 20 domains predicted in M. rosenbergii sequences.
| IPR accession | Domain name | Domain description | Occurrence |
| IPR000504 | RRM_dom | RNA recognition motif domain | 188 |
| IPR017442 | Se/Thr_prot_kinase-like_dom | Serine/threonine-protein kinase-like domain | 180 |
| IPR000719 | Prot_kinase_cat_dom | Protein kinase, catalytic domain | 180 |
| IPR007087 | Znf_C2H2 | Zinc finger, C2H2-type | 179 |
| IPR004000 | Actin-like | Actin-like | 150 |
| IPR002290 | Ser/Thr_prot_kinase_dom | Serine/threonine-protein kinase domain | 144 |
| IPR012677 | Nucleotide-bd_a/b_plait | Nucleotide-binding, alpha-beta plait | 112 |
| IPR016024 | ARM-type_fold | Armadillo-type fold | 108 |
| IPR015943 | WD40/YVTN_repeat-like_dom | WD40/YVTN repeat-like-containing domain | 104 |
| IPR011009 | Kinase-like_dom | Protein kinase-like domain | 104 |
| IPR015880 | Znf_C2H2-like | Zinc finger, C2H2-like | 102 |
| IPR017986 | WD40_repeat_dom | WD40-repeat-containing domain | 90 |
| IPR011046 | WD40_repeat-like_dom | WD40 repeat-like-containing domain | 88 |
| IPR002041 | Ran_GTPase | Ran GTPase | 86 |
| IPR008271 | Ser/Thr_prot_kinase_AS | Serine/threonine-protein kinase, active site | 84 |
| IPR013783 | Ig-like_fold | Immunoglobulin-like fold | 84 |
| IPR011989 | ARM-like | Armadillo-like helical | 84 |
| IPR023796 | Serpin_dom | Serpin domain | 79 |
| IPR013083 | Znf_RING/FYVE/PHD | Zinc finger, RING/FYVE/PHD-type | 72 |
| IPR016040 | NAD(P)-bd_dom | NAD(P)-binding domain | 72 |
Genes of interest for growth and muscle development in M. rosenbergii sequences.
| Candidate genes | Contig IDs | Length (bp) |
| Actin | A000585; A000586; A000587; A000588; A000763; A000764; A000765; A000766; A001338; A001339 | 1329; 1318; 1306; 1295; 1679; 1676; 1641; 1638; 930; 738 |
| Alpha skeletal muscle | A000008; A000407; A000408; A000807; A002601; A002969 | 710; 1141; 1016; 1110; 1474; 1166 |
| Calponin/calponin transgelin | A002718; A002875; A006133 | 1383; 1232; 518 |
| Cyclophilin a | A001348; A001349 | 811; 850 |
| Farnesoic acid O-methyltransferase | A002527 | 1587 |
| Fatty acid binding protein | A004382 | 728 |
| Lim domain binding | A000448 | 2610 |
| Muscle lim protein | A000421; A000422; A000423; A000424; A000425 | 5788; 5694; 4595; 4501; 1716 |
| Myosin heavy chain | A000009; A000016; A001103; A001282; A001283; A002073; A003870; A004348; A004442; A007715; A008193 | 612; 1510; 672; 942; 916; 711; 828; 733; 717; 383; 277 |
| Myosin heavy nonmuscle or smooth muscle | A000018; A000968; A000969; A001363; A008338; A008352 | 1512; 5609; 2201; 730; 156; 148 |
| Myosin light chain | A008264; A008271; A008339 | 220; 209; 155 |
| Myosin light chain smooth muscle | A000639; A000783; A000785 | 3309; 2919; 1544 |
| Profilin | A002454; A003703 | 1696; 872 |
| Skeletal muscle actin 6 | A000022; A000409; A005595; A006187; A007119; A008308; A008374 | 853; 522; 574; 512; 433; 177; 122 |
| Transforming growth factor beta regulator 1 | A006817 | 458 |
| Tropomyosin | A000105; A000106; A000107; A000108; A000109; A000110; A000111; A000112; A000113; A000114; A000115; A000116; A001463; A002025; A002026; A007719 | 2777; 2769; 2770; 2768; 2762; 2760; 2773; 2775; 2765; 2767; 1962; 1954; 377; 1391; 110; 383 |
*The prefix “A” in contig IDs indicates contigs merged from all three libraries.
Figure 5. Distribution of putative single nucleotide polymorphisms (SNP) and indels in M. rosenbergii sequences.
Figure 6. Distribution of simple sequence repeat (SSR) nucleotide classes among different nucleotide types found in M. rosenbergii sequences.
Both contig and singleton sequences were used to predict the SSR loci.
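For readers unfamiliar with SSR prediction, the idea can be sketched as a scan for short motifs repeated in tandem. The example below is a generic regular-expression approach, not the authors' method: this record does not name the tool or thresholds they used, so the minimum repeat counts in `MIN_REPEATS`, the function names, and the test sequence are all assumptions made for illustration.

```python
import re

# Assumed minimum tandem repeat counts per motif length (hypothetical values).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_compound_motif(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    n = len(motif)
    return any(
        n % k == 0 and motif == motif[:k] * (n // k)
        for k in range(1, n)
    )


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return a list of (start, motif, repeat_count) for tandem repeats.

    Positions are 0-based. Only di- to hexanucleotide motifs are scanned.
    """
    sequence = sequence.upper()
    hits = []
    for motif_len, min_count in sorted(min_repeats.items()):
        # A motif followed by at least (min_count - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_count - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if is_compound_motif(motif):
                continue  # already reported under a shorter motif class
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


# Hypothetical test sequence: a 7x (AT) repeat and a 5x (AGC) repeat.
example_seq = "GGC" + "AT" * 7 + "CCGT" + "AGC" * 5 + "TT"
example_ssrs = find_ssrs(example_seq)
print(example_ssrs)
```

Grouping the resulting hits by motif length gives the kind of per-class counts (dinucleotide, trinucleotide, and so on) that a distribution such as Figure 6 summarises.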