Mun Hua Tan, Christopher M Austin, Michael P Hammer, Yin Peng Lee, Laurence J Croft, Han Ming Gan.
Abstract
Background: Some of the most widely recognized coral reef fishes are clownfish or anemonefish, members of the family Pomacentridae (subfamily: Amphiprioninae). They are popular aquarium species due to their bright colours, adaptability to captivity, and fascinating behaviour. Their breeding biology (sequential hermaphroditism) and symbiotic mutualism with sea anemones have attracted much scientific interest. Moreover, some curious geography-based phenotypes warrant investigation. Leveraging advances in Nanopore long-read technology, we report the first hybrid assembly of the clown anemonefish (Amphiprion ocellaris) genome utilizing Illumina and Nanopore reads, further demonstrating the substantial impact of modest long-read sequencing data sets on improving genome assembly statistics.
Year: 2018 PMID: 29342277 PMCID: PMC5848817 DOI: 10.1093/gigascience/gix137
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1: The clown anemonefish (Amphiprion ocellaris). Photo by Michael P. Hammer.
Genome and transcriptome statistics of the clownfish (Amphiprion ocellaris) genome
| | Illumina (≥500 bp) | Illumina + Nanopore (≥500 bp) |
|---|---|---|
| Genome assembly | | |
| Contig statistics | | |
| Number of contigs | 133 997 | 7810 |
| Total contig size, bp | 851 389 851 | 880 159 068 |
| Contig N50 size, bp | 15 458 | 323 678 |
| Longest contig, bp | 204 209 | 2 051 878 |
| Scaffold statistics | | |
| Number of scaffolds | 106 526 | 6404 |
| Total scaffold size, bp | 852 602 726 | 880 704 246 |
| Scaffold N50 size, bp | 21 802 | 401 715 |
| Longest scaffold, bp | 227 111 | 3 111 502 |
| GC/AT/N, % | 39.6/60.2/0.14 | 39.4/60.5/0.06 |
| BUSCO genome completeness | | |
| Complete | 3691 (80.5%) | 4417 (96.3%) |
| Complete and single copy | 3600 (78.5%) | 4269 (93.1%) |
| Complete and duplicated | 91 (2.0%) | 148 (3.2%) |
| Fragmented | 534 (11.6%) | 63 (1.4%) |
| Missing | 359 (7.9%) | 104 (2.3%) |
| Transcriptome assembly | | |
| Number of contigs | 25 364 | |
| Total length, bp | 68 405 796 | |
| Contig N50 size, bp | 3670 | |
| BUSCO completeness | | |
| Complete | 4253 (92.8%) | |
| Complete and single-copy | 4128 (90.1%) | |
| Complete and duplicated | 125 (2.7%) | |
| Fragmented | 127 (2.8%) | |
| Missing | 204 (4.4%) | |
| Genome annotation | | |
| Number of protein-coding genes | 27 420 | |
| Number of functionally annotated proteins | 26 211 | |
| Mean protein length | 514 aa | |
| Longest protein | 29 084 aa (titin protein) | |
| Average number (length) of exon per gene | 9 (355 bp) | |
| Average number (length) of intron per gene | 8 (1532 bp) | |
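
The contig and scaffold N50 values in the table summarize assembly contiguity. As a minimal sketch of how such a statistic is commonly computed (not the authors' pipeline), N50 can be taken as the length of the sequence at which the running total, counted from the longest sequence downward, first reaches half of the total assembly size. The function name `n50` and the toy lengths are hypothetical and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration: sequences are sorted from
    longest to shortest, and the N50 is the length of the sequence at
    which the cumulative sum first reaches half of the total length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() requires at least one sequence length")


# Toy contig lengths (illustrative only, not data from the assembly).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

With these toy lengths the total is 300, so the running sum reaches 150 at the second-longest contig. The same definition explains why merging many short contigs into fewer long ones raises N50 even when the total assembly size changes little.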
Figure 2: Mapping of MinION long reads, Illumina-assembled scaffolds, and RNA-sequencing reads of male and female A. ocellaris to the genomic region containing the cyp19a1a gene. Transcripts per million (TPM) values were calculated using Kallisto, version 0.43.1 [46].
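
For readers unfamiliar with the TPM unit used in Figure 2, the following is a minimal sketch of the standard TPM definition, assuming per-transcript read counts and effective lengths are already available. It does not reproduce Kallisto's internals, which estimate those counts and effective lengths itself; the function name `tpm` and the toy numbers are hypothetical.

```python
def tpm(counts, effective_lengths):
    """Convert read counts to transcripts per million (TPM).

    Hypothetical helper for illustration: each count is divided by the
    transcript's effective length to give a per-base rate, and the rates
    are rescaled so that they sum to one million.
    """
    rates = [c / l for c, l in zip(counts, effective_lengths)]
    total = sum(rates)
    if total == 0:
        raise ValueError("all rates are zero; TPM is undefined")
    return [r / total * 1e6 for r in rates]


# Toy values (illustrative only, not data from the study).
toy_counts = [100, 200, 300]
toy_lengths = [1000, 2000, 1000]
print(tpm(toy_counts, toy_lengths))
```

Because TPM values always sum to one million within a sample, they express relative transcript abundance, which makes expression of the same gene comparable between the male and female libraries.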