Hanno Schmidt, Sören Lukas Hellmann, Ann-Marie Waldvogel, Barbara Feldmeyer, Thomas Hankeln, Markus Pfenninger.
Abstract
Chironomus riparius is an important study species in fields such as ecotoxicology, molecular genetics, developmental biology and ecology. However, only a fragmented draft genome exists to date, which hinders the recent surge of population genomic studies in this species. Making use of 50 NGS datasets, we present a hybrid genome assembly from short and long sequence reads that makes the C. riparius genome one of the most contiguous dipteran genomes published, the first complete mitochondrial genome of the species, and an estimate of its recombination rate, which is among the first recombination rates reported for any insect. The genome assembly and associated resources will be highly valuable to the broad community working with dipterans in general and chironomids in particular. The estimated recombination rate will help evolutionary biologists gain a better understanding of commonalities and differences in genomic patterns among insects.
Keywords: Chironomus riparius; hybrid genome assembly; recombination rate
Year: 2020 PMID: 32060047 PMCID: PMC7144091 DOI: 10.1534/g3.119.400710
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
– Characteristics of C. riparius genome assembly. Shown are the improvements in quality by combining short and long reads in comparison to the previous Illumina-only assembly. Values are based on the nuclear genome only
| | Illumina-only draft genome | Hybrid assembly (present study) | Degree of improvement |
|---|---|---|---|
| number of scaffolds | 5,292 | 752 | 1/7 |
| total scaffold length (bp) | 180,652,019 | 178,167,951 | equal |
| average scaffold length (bp) | 34,136 | 236,926 | ×7 |
| longest sequence (bp) | 2,056,324 | 2,626,431 | +25% |
| N50 (bp) | 272,065 | 539,778 | ×2 |
| N content (%) | 15.96 | 0.08 | 1/200 |
| BUSCOs found (complete and fragmented, %) | 92.8 | 93.7 | +1% |
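
The contiguity metrics in this table can be derived from nothing more than a list of scaffold lengths. The following is a minimal sketch, not the authors' pipeline: the function name `assembly_stats` and the toy lengths are hypothetical, and N50 is taken in its usual sense as the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly.

```python
def assembly_stats(lengths):
    """Return basic contiguity statistics for a list of scaffold lengths (bp).

    Hypothetical helper for illustration; not taken from the study.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)

    # Walk from the longest scaffold downwards until half of the
    # total assembly length is covered; that scaffold's length is N50.
    running = 0
    n50 = ordered[-1]
    for length in ordered:
        running += length
        if running * 2 >= total:
            n50 = length
            break

    return {
        "count": len(ordered),
        "total": total,
        "mean": total / len(ordered),
        "longest": ordered[0],
        "n50": n50,
    }


# Toy data (made up): five scaffolds totalling 300 bp.
toy_stats = assembly_stats([100, 80, 60, 40, 20])
print(toy_stats)
```

In a real setting the lengths would come from the assembly's FASTA file; reading that file is left out here to keep the sketch dependency-free.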
Figure 1. Effect of gap filling procedures. Gap filling with corrected PacBio reads (PBJelly) and Illumina paired-end reads (Gapfiller). Shown is the decrease in the number of gaps (“# gaps”, dashed line) and the fraction of undefined nucleotides (“% Ns”, dotted line) in the scaffolds during the iterative gap filling process.
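
The two quantities tracked in this figure, the number of gaps and the percentage of Ns, can be measured on any scaffold sequence. The sketch below is an illustration only: `gap_stats` is a hypothetical helper, the sequences are made up, and a gap is assumed to mean one uninterrupted run of `N` characters (the source does not state a minimum gap length).

```python
import re


def gap_stats(sequence):
    """Count gaps (runs of N) and the percentage of N bases in a sequence.

    Assumption: every uninterrupted run of N/n counts as one gap.
    """
    gaps = re.findall(r"[Nn]+", sequence)
    n_bases = sum(len(gap) for gap in gaps)
    pct_n = 100.0 * n_bases / len(sequence) if sequence else 0.0
    return len(gaps), pct_n


# Made-up scaffold before and after one round of gap filling.
before = "ACGTNNNNACGTNNACGT"
after = "ACGTACGAACGTNNACGT"

gaps_before, pct_before = gap_stats(before)
gaps_after, pct_after = gap_stats(after)
print(gaps_before, round(pct_before, 2))
print(gaps_after, round(pct_after, 2))
```

Running such a function after each iteration of a gap filler would yield the kind of two curves described in the caption.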
Figure 2. C. riparius chromosome-specific recombination rates. Recombination rates from all individuals across populations were pooled. Chromosome 3 is represented without the identified part of the sex determining region, which is displayed separately (SDR). White circles show medians, box limits indicate the 25th and 75th percentiles, whiskers extend 1.5 times the interquartile range from the 25th and 75th percentiles, and polygons represent density estimates of the data and extend to extreme values. A Kruskal-Wallis test (nonparametric, for data that are not normally distributed) with Dunn’s multiple comparison post-test (GraphPad Prism v5) revealed that both chromosome 3 and the SDR differ significantly (P < 0.001) from all other chromosomes. Chr = Chromosome.
– Comparative statistics of the nuclear genome’s annotation. Content of protein-coding genes in the genome of C. riparius compared to published genomes of other chironomids and Drosophila melanogaster
| | gene count | average number of exons per gene | average exon length (bp) | protein coding part of the genome (%) |
|---|---|---|---|---|
| | 13,449 | 5.1 | 378 | 14.6 |
| | 15,120 | 3.8 | 312 | 9 |
| | 17,137 | 4.3 | 324 | 20.2 |
| | 16,553 | 4.0 | 328 | 20.3 |
| | 11,005 | 5.0 | 321 | 19.6 |
| | 14,041 | 4.5 | 329 | 29.6 |
| | 13,468 | 6.2 | 215 | 13.0 |
| | 13,907 | 5.5 | 538 | 18.3 |
Figure 3. GC content for genomic features. Different genomic features revealed differences in GC content. GC content in exons was above the genome average, while the opposite was found for introns. 10 kb windows were generated without regard to their content. Box limits indicate the 25th and 75th percentiles, whiskers extend 1.5 times the interquartile range from the 25th and 75th percentiles.
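
Windowed GC content of the kind summarised in this figure can be sketched as follows. This is an illustration under stated assumptions rather than the authors' method: the helper names `gc_fraction` and `windowed_gc` are hypothetical, windows are assumed to be non-overlapping, a trailing partial window is dropped, and N bases are excluded from the denominator. The source specifies only the 10 kb window size.

```python
def gc_fraction(seq):
    """Fraction of G and C among called bases (A, C, G, T).

    Returns None if the sequence has no called bases (e.g. all N).
    """
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return None
    return (seq.count("G") + seq.count("C")) / called


def windowed_gc(seq, window=10_000):
    """GC fraction for consecutive non-overlapping windows.

    Assumption: a final window shorter than `window` is discarded.
    """
    return [
        gc_fraction(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, window)
    ]


# Made-up sequence and a tiny window so the result is easy to inspect.
toy_windows = windowed_gc("GCGCATNNGGAT", window=4)
print(toy_windows)
```

The same `gc_fraction` helper could be applied to exon or intron sequences extracted from an annotation to compare features against the windowed genome background.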
Figure 4. Mitochondrial genome of C. riparius. The circular genome consists of 15,467 bp. Prediction of protein-coding sequences by the EMBOSS tool tcode (blue graph at the inner edge of the genome; green ring = coding, red ring = non-coding) is largely consistent with the annotation from MITOS.