Seyoung Mun, Yun-Ji Kim, Kesavan Markkandan, Wonseok Shin, Sumin Oh, Jiyoung Woo, Jongsu Yoo, Hyesuck An, Kyudong Han.
Abstract
The Manila clam, Ruditapes philippinarum, is an important bivalve species in aquaculture worldwide, including in Korea. Its aquaculture production is threatened by diverse environmental factors, including viruses, microorganisms, parasites, and water conditions, and is declining as a result. Despite its importance as a marine resource, a reference genome of R. philippinarum for comprehensive genetic studies has been lacking. Here, we report the de novo whole-genome assembly of R. philippinarum and a transcriptome assembly from three tissues (foot, gill, and adductor muscle), providing basic data for advanced studies in selective breeding and disease control aimed at successful aquaculture systems. An approximately 2.56 Gb high-quality whole genome was assembled using several library construction methods. A total of 108,034 protein-coding gene models were predicted, and repetitive elements including simple sequence repeats and noncoding RNAs were identified to further our understanding of the genetic background of R. philippinarum for genomics-assisted breeding. Comparative analysis with bivalve marine invertebrates revealed that the gene family related to complement C1q was enriched. Furthermore, we performed transcriptome analysis of the three tissues to support genome annotation and identified 41,275 annotated transcripts. The R. philippinarum genome resource will markedly advance a wide range of genetic studies, serve as a reference genome for comparative analysis of bivalve species, and help unravel the mechanisms of biological processes in molluscs. We believe that the R. philippinarum genome will serve as an initial platform for breeding better-quality clams using a genomic approach.
Keywords: Ruditapes philippinarum; de novo assembly; repeat elements; transcriptome
Year: 2017 PMID: 28505302 PMCID: PMC5499747 DOI: 10.1093/gbe/evx096
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Statistics of De Novo Assembly for the R. philippinarum Genome
| Assembly | No. of Sequences | Total Bases | Longest (kb) | N50 (kb) | N90 (kb) |
|---|---|---|---|---|---|
| Short-read assembly | | | | | |
| Contig (PE) | 4,861,413 | 2,257,620,557 | 96,344 | 3,333 | 128 |
| Scaffold (PE + MP) | 298,671 | 2,478,800,284 | 474,845 | 32,797 | 5,215 |
| Long-read assembly | | | | | |
| Contig (TSLR) | 247,445 | 2,206,994,927 | 149,501 | 12,971 | 4,470 |
| Total-read assembly | | | | | |
| Scaffold (PE + MP + TSLR) | 223,851 | 2,561,070,351 | 572,939 | 48,447 | 7,827 |
| HaploMerger | 13,411 | 1,078,771,101 | 1,050,406 | 119,518 | 39,029 |
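The N50 and N90 columns in the table above summarize assembly contiguity. As a minimal sketch of how such a statistic is usually computed (the standard definition, not necessarily the tool the authors used), the hypothetical helper `nx` below returns the length L such that sequences of length at least L contain the requested percentage of all assembled bases. The contig lengths are toy values, not data from the paper.

```python
def nx(lengths, percent):
    """Return the Nx length for a list of sequence lengths.

    Nx is the length L such that sequences of length >= L together
    cover at least `percent` percent of the total assembled bases.
    """
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point edge cases.
        if running * 100 >= percent * total:
            return length
    raise ValueError("lengths must be a non-empty list of positive integers")


# Hypothetical toy assembly (total = 300 bases).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(nx(toy_contigs, 50))  # 70
print(nx(toy_contigs, 90))  # 30
```

Because N90 requires covering more of the assembly, it can never exceed N50, which matches every row of the table.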
Results of Gene Prediction for the R. philippinarum Genome
| Classification | Quantification |
|---|---|
| Total no. of gene models predicted | 108,034 |
| Unique gene models (no.) | 106,102 |
| Genes with isoforms (no.) | 1,932 |
| RNA-Seq-supported gene models (no.) | 98,442 |
| Average gene length (bp) | 5,117 |
| Total bases of gene models (Mb) | 552.90 |
| Genes in the draft genome (%) | 21.58 |
| No. of exons | 451,049 |
| Average no. of exons per gene | 4.17 |
| Average exon length (bp) | 232 |
| Exons in the draft genome (%) | 4.09 |
| No. of introns | 343,015 |
| Average no. of introns per gene | 3.17 |
| Average intron length (bp) | 1,230 |
| Introns in the draft genome (%) | 16.48 |
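The figures in the gene prediction table can be cross-checked with simple arithmetic. The sketch below uses only values from the tables; the one assumption, flagged in the code, is that the percentage rows are computed against the 2,561,070,351-base total-read scaffold assembly, which the source does not state explicitly. Variable names are illustrative.

```python
# Values copied from the gene prediction table.
gene_models = 108_034
unique_gene_models = 106_102
exons = 451_049
introns = 343_015
gene_bases = 552.90e6  # "Total bases of gene models (Mb)"

# Assumption: percentages are relative to the total-read scaffold assembly.
assembly_bases = 2_561_070_351

# Gene models that are not unique correspond to the isoform row.
genes_with_isoforms = gene_models - unique_gene_models

# A gene model with k exons has k - 1 introns, so summed over all models:
introns_expected = exons - gene_models

exons_per_gene = exons / gene_models
introns_per_gene = introns / gene_models
gene_percent = 100 * gene_bases / assembly_bases

print(genes_with_isoforms)          # 1932
print(introns_expected == introns)  # True
print(f"{exons_per_gene:.3f} exons/gene, {introns_per_gene:.3f} introns/gene")
print(f"{gene_percent:.2f}% of the assembly in gene models")
```

The isoform and intron counts reproduce the table exactly, and the per-gene averages and the gene percentage agree with the tabulated values to within 0.01.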
Phylogenetic tree of C1qDC genes. The tree was reconstructed with the BLOSUM62 matrix and the neighbor-joining method in Jalview v2.10.1 and visualized with FigTree v1.4.3. C1qDC genes from R. philippinarum, C. gigas, and Pfam seeds are labeled in green, red, and blue, respectively. A total of 1,589 C1qDC genes were used for this analysis and divided into five groups. The numbers in brackets indicate the number of C1qDC genes from R. philippinarum, C. gigas, and Pfam seeds clustering in each group.
Orthologous gene clusters in the Mollusca lineage. The Venn diagram shows the number of unique and shared gene families among the four Mollusca genomes (Manila clam, octopus, snail, and oyster).
Gene ontology (GO) categories of gene families unique to R. philippinarum among the four Mollusca. The functions of 68 gene families were classified and subdivided into a total of 93 GO terms. Only the majority of these gene families are shown in the figure, under three main categories: cellular component (yellow), molecular function (green), and biological process (blue). The x axis denotes the functional categories and the y axis denotes the number of gene models associated with each GO category.
Summary of Whole Transcriptome Analysis and Unigene Construction
| Tissue | Total RNA Reads | Filtered RNA Reads (%) | GC Rate (%) | Assembly Result: Nonredundant Unigenes | Assembly Result: Average Unigene Length | Assembly Result: Unigene N50 (bp) | Predicted Transcripts: Gene | Predicted Transcripts | Expressed Transcripts (FPKM > 0) | Expressed Transcripts (FPKM > 1) |
|---|---|---|---|---|---|---|---|---|---|---|
| Gill | 67,441,250 | 62,923,824 (93.3) | 37.63 | 199,345 | 882 | 1,600 | 132,766 | 144,535 | 112,061 | 66,879 |
| Adductor muscle | 81,677,808 | 75,874,878 (92.9) | 38.92 | | | | | | 111,441 | 64,844 |
| Foot | 67,685,344 | 62,907,532 (92.9) | 37.49 | | | | | | 108,559 | 73,058 |
Bioinformatic analysis for transcript expression profile. (A) Frequency of transcripts with 0 FPKM in each tissue. (B) Distribution of FPKM values for transcripts expressed in each tissue. The x axis and y axis indicate the number of expressed transcripts and log-transformed FPKM values of transcripts (FPKM > 0) from three tissues, respectively. The blue box denotes the majority of transcripts (1 < FPKM < 10).
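The expression thresholds above (FPKM > 0, FPKM > 1, 1 < FPKM < 10) are easier to read with the definition in hand. The source does not say which tool produced the FPKM values, so the sketch below simply applies the standard definition: fragments per kilobase of transcript per million mapped fragments. The function name, parameter names, and input numbers are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    fragments              -- fragments assigned to this transcript
    transcript_length_bp   -- transcript length in base pairs
    total_mapped_fragments -- all mapped fragments in the library
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million).
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical transcript: 500 fragments on a 2,000 bp transcript
# in a library of 50 million mapped fragments.
value = fpkm(500, 2_000, 50_000_000)
print(value)           # 5.0
print(1 < value < 10)  # True
```

Normalizing by both transcript length and library size is what makes values comparable across the three tissue libraries, which differ in read depth.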