Chao Li, Xiao Liu, Bo Liu, Bin Ma, Fengqiao Liu, Guilong Liu, Qiong Shi, Chunde Wang.
Abstract
Background: The Peruvian scallop, Argopecten purpuratus, is mainly cultured in southern Chile and Peru and was introduced into China in the last century. Unlike other Argopecten scallops, the Peruvian scallop normally has a long life span of up to 7 to 10 years. Researchers have therefore been using it to develop hybrid vigor. Here, we performed whole genome sequencing, assembly, and gene annotation of the Peruvian scallop, with the aim of developing genomic resources for genetic breeding in scallops.
Findings: A total of 463.19 Gb of raw DNA reads were sequenced. A draft genome assembly of 724.78 Mb was generated (accounting for 81.87% of the estimated genome size of 885.29 Mb), with a contig N50 size of 80.11 kb and a scaffold N50 size of 1.02 Mb. Repeat sequences were estimated to make up 33.74% of the whole genome, and 26,256 protein-coding genes and 3,057 noncoding RNAs were predicted from the assembly.
Conclusions: We generated a high-quality draft genome assembly of the Peruvian scallop, which will provide a solid resource for further genetic breeding and for the analysis of the evolutionary history of this economically important scallop.
Year: 2018 PMID: 29617765 PMCID: PMC5905365 DOI: 10.1093/gigascience/giy031
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1: Picture of a representative Peruvian scallop in China.
Summary of the Peruvian scallop genome assembly and annotation
| Genome assembly | Parameter |
|---|---|
| Contig N50 size (kb) | 80.11 |
| Scaffold N50 size (Mb) | 1.02 |
| Estimated genome size (Mb) | 885.29 |
| Assembled genome size (Mb) | 724.78 |
| Genome coverage (×) | 303.83 |
| Longest scaffold (bp) | 11,125,544 |
| Genome annotation | Parameter |
| Protein-coding gene number | 26,256 |
| Average transcript length (kb) | 10.53 |
| Average CDS length (bp) | 1,418.29 |
| Average intron length (bp) | 1,505.92 |
| Average exon length (bp) | 201.09 |
| Average exons per gene | 7.05 |
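The N50 values and the assembled fraction reported above can be illustrated with a minimal Python sketch. The `n50` helper and the toy contig lengths are hypothetical and are not taken from the assembly. Only the two genome sizes come from the table.

```python
def n50(lengths):
    """Return the N50 of a set of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy contig lengths (not from the scallop assembly).
toy_contigs = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_contigs)  # total 300, half 150; 100 + 80 = 180 >= 150

# Values from the summary table (Mb).
assembled_mb = 724.78
estimated_mb = 885.29
assembled_fraction_pct = round(100 * assembled_mb / estimated_mb, 2)

print(toy_n50, assembled_fraction_pct)
```

With the table values, the assembled fraction works out to the 81.87% quoted in the abstract.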
The prediction of repeat elements in the Peruvian scallop genome
| Type | Repeat size (bp) | % of genome |
|---|---|---|
| TRF | 83,037,380 | 11.46 |
| RepeatMasker | 237,471,691 | 32.76 |
| RepeatProteinMask | 21,719,425 | 3.00 |
| Total | 294,496,811 | 40.63 |
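The percentages in this table can be recomputed from the base-pair counts. The sketch below assumes that the denominator is the 724.78-Mb assembled genome size; the table does not state this, but the arithmetic agrees with it. It also shows that the three per-method rows sum to more than the total, which would be expected if regions detected by more than one method are counted once in the total (an interpretation, not stated in the source).

```python
# Assumption: percentages are relative to the assembled genome size (bp).
ASSEMBLY_BP = 724.78e6

# Repeat sizes in bp, copied from the table.
repeats_bp = {
    "TRF": 83_037_380,
    "RepeatMasker": 237_471_691,
    "RepeatProteinMask": 21_719_425,
    "Total": 294_496_811,
}

# Percentage of the assembly covered by each category.
percent = {name: round(100 * bp / ASSEMBLY_BP, 2) for name, bp in repeats_bp.items()}

# Sum of the per-method rows versus the reported total.
row_sum = sum(bp for name, bp in repeats_bp.items() if name != "Total")
overlap_bp = row_sum - repeats_bp["Total"]

for name, value in percent.items():
    print(f"{name}: {value:.2f}%")
print(f"Rows exceed total by {overlap_bp:,} bp")
```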
Figure 2: Distribution of genes in different species. Abbreviations: Aca, Aplysia californica; Apu, Argopecten purpuratus; Bfl, Branchiostoma floridae; Bpl, Bathymodiolus platifrons; Cel, Caenorhabditis elegans; Cgi, Crassostrea gigas; Cte, Capitella teleta; Dme, Drosophila melanogaster; Hsa, Homo sapiens; Hdi, Haliotis discus; Hro, Helobdella robusta; Lan, Lingula anatina; Lgi, Lottia gigantea; Mph, Modiolus philippinarum; Obi, Octopus bimaculoides; Pfu, Pinctada fucata; Pye, Patinopecten yessoensis; Tca, Tribolium castaneum.
Figure 3: Bootstrap support for the phylogenetic tree. A maximum likelihood tree was constructed using RAxML based on 108 single-copy protein-coding genes of the related species, with 100 bootstrap replicates.