Jia Li, Chao Bian, Yinchang Hu, Xidong Mu, Xueyan Shen, Vydianathan Ravi, Inna S Kuznetsova, Ying Sun, Xinxin You, Ying Qiu, Xinhui Zhang, Hui Yu, Yu Huang, Pao Xu, Ruobo Gu, Junmin Xu, László Orbán, Byrappa Venkatesh, Qiong Shi.
Abstract
Asian arowana (Scleropages formosus), an ancient teleost of the Osteoglossomorpha, is a valuable ornamental fish with several varieties. However, biological studies and breeding of the species have been severely limited by the lack of a reference genome. To address this, we report high-quality genome sequences of three common varieties of Asian arowana (the golden, red and green arowana). We first generated a chromosome-level genome assembly of the golden arowana on the basis of a genetic linkage map constructed with restriction site-associated DNA sequencing (RAD-seq). In addition, we obtained draft genome assemblies of the red and green varieties. Finally, we annotated 22,016, 21,256 and 21,524 protein-coding genes in the genome assemblies of the golden, red and green varieties, respectively. Our data were deposited in publicly accessible repositories to promote biological research and molecular breeding of Asian arowana.
Year: 2016 PMID: 27922628 PMCID: PMC5139669 DOI: 10.1038/sdata.2016.105
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Overview of genome assembly and annotation for the three varieties of Asian arowana.
| Parameter | Golden | Red | Green |
| Sequence coverage (×) | 137.60 | 109.60 | 100.10 |
| Estimated genome size (Mb) | 828 | 949 | 897 |
| Assembled genome size (Mb) | 779 | 753 | 759 |
| Scaffold N50 (Mb) | 5.97 | 1.63 | 1.85 |
| Contig N50 (kb) | 30.73 | 60.19 | 62.80 |
| Number of genes | 22,016 | 21,256 | 21,524 |
| Repeat content | 27.34% | 27.93% | 28.04% |
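The scaffold and contig N50 values in the table above follow the usual definition: the length L such that sequences of length L or longer together cover at least half of the total assembled length. A minimal sketch of that calculation is given below; the function name `n50` and the toy lengths are illustrative assumptions, not the authors' pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy scaffold lengths (NOT taken from the arowana assemblies).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)  # 80 + 70 = 150 reaches half of 300, so N50 is 70
```

Applied to real assemblies, the input would be the list of scaffold (or contig) lengths read from the assembly FASTA file.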
Figure 1. Construction of the 25 linkage groups (or pseudo-chromosomes) of golden arowana based on RAD-seq.
Markers on the scaffolds (dark blue) were aligned and reordered onto the corresponding chromosomes (chr; orange). Axis units are centimorgans (cM) for the chromosomes and megabases (Mb) for the scaffolds.
Construction of the sequencing libraries.
| Variety | Insert size | Total data (Gb) | Read length (bp) | Sequence coverage (×) |
| Golden arowana | 170 bp | 24.7 | 100 | 30.1 |
| | 500 bp | 20.3 | 100 | 24.7 |
| | 800 bp | 13.8 | 100 | 16.8 |
| | 2 kb | 22.5 | 49 | 27.4 |
| | 5 kb | 10.6 | 49 | 12.9 |
| | 10 kb | 9.9 | 49 | 12.0 |
| | 20 kb | 6.6 | 49 | 8.0 |
| | 40 kb | 4.7 | 49 | 5.7 |
| | Total | 113.1 | | 137.6 |
| Red arowana | 250 bp | 39.7 | 150 | 41.7 |
| | 500 bp | 26.6 | 90 | 28.0 |
| | 2 kb | 17.7 | 100 | 18.6 |
| | 5 kb | 19.8 | 100 | 20.8 |
| | Total | 103.8 | | 109.6 |
| Green arowana | 250 bp | 32.7 | 150 | 36.3 |
| | 500 bp | 27.6 | 90 | 30.7 |
| | 2 kb | 15.8 | 100 | 17.6 |
| | 5 kb | 14.4 | 100 | 16.0 |
| | Total | 90.5 | | 100.1 |
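Sequence coverage is conventionally the total amount of sequenced bases divided by the genome size. The sketch below recomputes it from the "Total" rows of the library table and the estimated genome sizes in the overview table. It assumes that the first numeric column of the library table is total data in Gb, that the last is coverage, and that the overview columns are ordered golden, red, green; these readings are inferred from the numbers, since the table headers are missing. The recomputed values come out close to, but not identical with, the reported ones, which suggests that the authors used slightly different genome-size estimates or rounding.

```python
# Total sequencing data (Gb), read from the "Total" rows of the library table
# (assumed interpretation of the column).
total_data_gb = {"golden": 113.1, "red": 103.8, "green": 90.5}

# Estimated genome sizes (Mb) from the overview table
# (column order golden, red, green is an assumption).
estimated_size_mb = {"golden": 828, "red": 949, "green": 897}

# Coverage values as reported in the tables.
reported_coverage = {"golden": 137.6, "red": 109.6, "green": 100.1}


def coverage(data_gb, genome_mb):
    """Fold coverage = sequenced bases / genome size (both converted to Mb)."""
    return data_gb * 1000 / genome_mb


recomputed = {
    variety: round(coverage(total_data_gb[variety], estimated_size_mb[variety]), 1)
    for variety in total_data_gb
}

for variety, value in recomputed.items():
    print(f"{variety}: recomputed {value}x, reported {reported_coverage[variety]}x")
```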
Statistics of the anchored scaffolds for the golden variety.
| No. of scaffolds (>2 kb) | 554 |
| N50 of Scaffold (Mb) | 5.97 |
| Assembled genome size (Mb) | 779 |
| No. of the anchored scaffolds | 194 |
| N50 of the anchored scaffolds (Mb) | 7.26 |
| Genome size of the anchored scaffolds (Mb) | 683 |
Versions of genome assemblies of the eight vertebrate species used for homology annotation.
| Human | Release 75 | |
| Zebrafish | ||
| Japanese fugu | ||
| Spotted green pufferfish | ||
| Three-spined stickleback | ||
| Japanese medaka | ||
| Half-smooth tongue sole | ||
| Coelacanth |
Assessing the completeness of gene regions in the three genome assemblies by RNA-seq of skin tissue.
| Golden | 312,697 | 419,386,592 | 95.10 | 89.85 | 96.74 |
| Red | 271,689 | 278,274,436 | 97.20 | 91.82 | 98.64 |
| Green | 329,917 | 427,288,534 | 96.07 | 91.45 | 98.41 |
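Coverage rates of core eukaryotic genes in the three assembled genomes by CEGMA.
| Variety | Group 1 (no.) | Group 1 (%) | Group 2 (no.) | Group 2 (%) | Group 3 (no.) | Group 3 (%) | Group 4 (no.) | Group 4 (%) |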
| Golden | 66 | 100% | 55 | 98.21% | 60 | 98.36% | 63 | 96.92% |
| Red | 65 | 98.48% | 56 | 100% | 60 | 98.36% | 65 | 100% |
| Green | 65 | 98.48% | 56 | 100% | 60 | 98.36% | 65 | 100% |
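The percentages in the CEGMA table are consistent with the four CEGMA conservation groups containing 66, 56, 61 and 65 core genes (248 in total). The sketch below recomputes the per-group rates from the counts and adds an overall rate across all 248 genes. The group sizes are inferred from the reported percentages, the pairing of each count with a group is an assumption about the column order, and the overall rate is a derived figure that does not appear in the table.

```python
# Sizes of the four CEGMA groups (inferred from the reported percentages).
GROUP_SIZES = (66, 56, 61, 65)

# Core genes found per group, read from the table above
# (assumed column order: group 1 to group 4).
found = {
    "golden": (66, 55, 60, 63),
    "red": (65, 56, 60, 65),
    "green": (65, 56, 60, 65),
}


def group_rates(counts):
    """Per-group coverage rate in percent, rounded to two decimals."""
    return [round(100 * c / n, 2) for c, n in zip(counts, GROUP_SIZES)]


def overall_rate(counts):
    """Coverage rate in percent across all four groups combined."""
    return round(100 * sum(counts) / sum(GROUP_SIZES), 2)


for variety, counts in found.items():
    print(variety, group_rates(counts), overall_rate(counts))
```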