Stefan Prost, Malte Petersen, Martin Grethlein, Sarah Joy Hahn, Nina Kuschik-Maczollek, Martyna Ewa Olesiuk, Jan-Olaf Reschke, Tamara Elke Schmey, Caroline Zimmer, Deepak K Gupta, Tilman Schell, Raphael Coimbra, Jordi De Raad, Fritjof Lammers, Sven Winter, Axel Janke.
Abstract
Ever-decreasing costs, along with advances in sequencing and library preparation technologies, enable even small research groups to generate chromosome-level assemblies today. Here we report the generation of an improved chromosome-level assembly for the Siamese fighting fish (Betta splendens) that was carried out during a practical university master's course. The Siamese fighting fish is a popular aquarium fish and an emerging model species for research on aggressive behavior. We updated the current genome assembly by generating a new long-read nanopore-based assembly with subsequent scaffolding to chromosome level using previously published Hi-C data. The use of ∼35× nanopore-based long-read data sequenced on a MinION platform (Oxford Nanopore Technologies) allowed us to generate a baseline assembly of only 1,276 contigs with a contig N50 of 2.1 Mbp and a total length of 441 Mbp. Scaffolding using the Hi-C data resulted in 109 scaffolds with a scaffold N50 of 20.7 Mbp. More than 99% of the assembly is contained in 21 scaffolds. The assembly showed the presence of 96.1% complete BUSCO genes from the Actinopterygii dataset, indicating a high-quality assembly. We present an improved full chromosome-level assembly of the Siamese fighting fish generated during a university master's course. The use of ∼35× long-read nanopore data drastically improved the baseline assembly in terms of continuity. We show that relatively inexpensive high-throughput sequencing technologies such as the long-read MinION sequencing platform can be used in educational settings, allowing students to gain practical skills in modern genomics and to generate high-quality results that benefit downstream research projects.
Keywords: Betta splendens; chromosome-level genome assembly; master’s course
Year: 2020 PMID: 32385046 PMCID: PMC7341155 DOI: 10.1534/g3.120.401205
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Genome continuity statistics for the Hi-C scaffolded genome calculated with QUAST in comparison to the assembly of Fan et al. (2018)
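| Statistic | This study | Fan et al. (2018) |
|---|---|---|
| Number of contigs/scaffolds | 1,276/109 | 139,323/91,819 |
| Contig N50 | 2.1 Mbp | 18.9 kbp |
| Scaffold N50 | 20.7 Mbp | 19.7 Mbp |
| Scaffold L50 | 9 | 10 |
| Largest scaffold | 34.1 Mbp | 34.9 Mbp |
| Total length | 441.2 Mbp | 456.2 Mbp |
| GC content (%) | 45.1 | 45.2 |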
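For readers unfamiliar with the continuity statistics, N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly, and L50 is the number of sequences in that set. The following is a minimal sketch of that definition, not the QUAST implementation; the function name `n50_l50` and the toy lengths are hypothetical and chosen only for illustration.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first sequence that pushes the running sum to half of the
        # total length defines N50; its rank in the sorted list is L50.
        if running >= half_total:
            return length, count
    raise ValueError("lengths must contain at least one sequence")


# Hypothetical toy assembly (not data from this study): total length 30,
# so half is 15; the two longest sequences (10 + 8 = 18) reach that point.
toy_lengths = [10, 8, 6, 4, 2]
print(n50_l50(toy_lengths))  # (8, 2)
```

Applied to real data, the input would be the list of all contig or scaffold lengths in the assembly, for example read from a FASTA index.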
Figure 1 (A) Hi-C contact map of the 21 chromosome-level scaffolds and the shorter unplaced scaffolds. The plot shows only small amounts of trans-chromosomal interactions in the assembly. The scale refers to the link coverage along and between scaffolds. (B) Whole-genome synteny between the chromosome-level assembly of Fan et al. (2018) (on the left) and our chromosome-level assembly (on the right). The lines indicate aligned regions between the two assemblies.
Comparison of BUSCO scores for our assembly and the chromosome-level assembly of Fan et al. (2018). We did not include complete single-copy and duplicated BUSCO statistics for the transcriptome assembly, as it includes isoforms. The analyses are based on a total of 3,640 BUSCOs.
| BUSCO category | Genome (this study) | Genome (Fan) | Annotation (this study) | Annotation (Fan) | Transcriptome (this study) |
|---|---|---|---|---|---|
| Complete | 3,528 (96.9%) | 3,543 (97.3%) | 2,870 (78.8%) | 3,164 (87.0%) | 3,194 (87.7%) |
| Complete, single-copy | 3,499 (96.1%) | 3,498 (96.1%) | 2,819 (77.4%) | 3,093 (85.0%) | — |
| Complete, duplicated | 29 (0.8%) | 45 (1.2%) | 51 (1.4%) | 71 (2.0%) | — |
| Fragmented | 11 (0.3%) | 11 (0.3%) | 148 (4.1%) | 225 (6.2%) | 149 (4.1%) |
| Missing | 101 (2.8%) | 86 (2.4%) | 622 (17.1%) | 251 (6.8%) | 297 (8.2%) |
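The BUSCO counts are internally consistent: complete BUSCOs are the sum of single-copy and duplicated ones, and complete, fragmented, and missing BUSCOs add up to the 3,640 searched. As a rough sketch of that arithmetic, the snippet below recomputes the percentages for the genome assembly of this study from the counts in the table. The function and variable names are hypothetical, and this is not part of the BUSCO software.

```python
TOTAL_BUSCOS = 3640  # size of the BUSCO set stated in the table caption

# Counts for the genome assembly of this study, copied from the table.
genome_this_study = {
    "single_copy": 3499,
    "duplicated": 29,
    "fragmented": 11,
    "missing": 101,
}


def busco_summary(counts, total):
    """Return {category: (count, percent)} and check the counts add up."""
    complete = counts["single_copy"] + counts["duplicated"]
    if complete + counts["fragmented"] + counts["missing"] != total:
        raise ValueError("BUSCO categories do not sum to the total")
    categories = {"complete": complete, **counts}
    # Percentages rounded to one decimal place, as in the table.
    return {name: (n, round(100 * n / total, 1)) for name, n in categories.items()}


summary = busco_summary(genome_this_study, TOTAL_BUSCOS)
for name, (count, percent) in summary.items():
    print(f"{name}: {count} ({percent}%)")
```

The same check can be run on the other columns that report all four categories by swapping in their counts.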
Repeat content of the Hi-C scaffolded assembly
| Type of element | Number of elements | Length (bp) | Percentage of assembly |
|---|---|---|---|
| | 12,939 | 2,732,859 | 0.62% |
| | 51,088 | 31,431,107 | 7.13% |
| | 24,596 | 21,574,199 | 4.89% |
| | 54,813 | 14,902,018 | 3.37% |
| | 74,783 | 23,656,821 | 5.36% |
| | 6,578 | 1,201,687 | 0.27% |
| | 2,397 | 1,216,117 | 0.28% |
| | 385,394 | 24,039,213 | 5.44% |
| | 35,821 | 2,031,477 | 0.46% |
| Total | | | 27.82% |
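The total repeat percentage can be approximately reproduced from the per-class lengths. The sketch below sums the nine lengths from the table and divides by the total assembly length. It assumes the classes do not overlap and uses the rounded assembly length of 441.2 Mbp reported in the continuity statistics, so the result can differ from the tabulated 27.82% in the last digit. The variable and function names are hypothetical.

```python
# Lengths (bp) of the nine repeat classes, copied from the table in row order.
repeat_lengths_bp = [
    2_732_859, 31_431_107, 21_574_199, 14_902_018, 23_656_821,
    1_201_687, 1_216_117, 24_039_213, 2_031_477,
]

# Assumption: rounded total assembly length (441.2 Mbp); the exact length
# in base pairs is not given in this record.
ASSEMBLY_LENGTH_BP = 441_200_000


def repeat_percentage(lengths_bp, assembly_bp):
    """Percentage of the assembly covered by the listed repeat classes."""
    return 100 * sum(lengths_bp) / assembly_bp


total_repeat_bp = sum(repeat_lengths_bp)
repeat_pct = repeat_percentage(repeat_lengths_bp, ASSEMBLY_LENGTH_BP)
print(f"{total_repeat_bp:,} bp of repeats, about {repeat_pct:.2f}% of the assembly")
```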