Jacqueline Heckenhauer, Paul B Frandsen, Deepak K Gupta, Juraj Paule, Stefan Prost, Tilman Schell, Julio V Schneider, Russell J Stewart, Steffen U Pauls.
Abstract
Members of the speciose insect order Trichoptera (caddisflies) provide important ecosystem services, for example, nutrient cycling through the breakdown of organic matter. They are also of industrial interest due to their larval silk secretions. These form the basis for their diverse case-making behavior, which allows them to exploit a wide range of ecological niches. Only five genomes of this order have been published thus far, with variable qualities regarding contiguity and completeness. A low-cost sequencing strategy, that is, using a single Oxford Nanopore flow cell per individual along with Illumina sequence reads, was successfully used to generate high-quality genomes of two Trichoptera species, Plectrocnemia conspersa and Hydropsyche tenuis. Of the de novo assembly methods compared, assembly of low-coverage Nanopore reads (∼18×) and subsequent polishing with long reads followed by Illumina short reads (∼80–170× coverage) yielded the highest genome quality in terms of both contiguity and BUSCO completeness. The presented genomes are the shortest to date and extend our knowledge of genome size across caddisfly families. The genomic region that encodes light (L)-chain fibroin, a protein component of larval caddisfly silk, was identified and compared with existing L-fibroin gene clusters. The new genomic resources presented in this paper are among the highest quality Trichoptera genomes and will increase the knowledge of this important insect order by serving as the basis for phylogenomic and comparative genomic studies.
Keywords: de novo genome assembly; Trichoptera; fibroin; genome size; insect genomics; silk genes
Year: 2019 PMID: 31774498 PMCID: PMC6916706 DOI: 10.1093/gbe/evz264
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Comparison of Genome Assemblies among the Seven Published Caddisfly Genomes
| Species | Accession | Suborder | Sequencing Data | Coverage | Assembly Length (bp) | Scaffold N50 (kb) | BUSCO Present (%) | Estimated Haploid Genome Size (1C) |
|---|---|---|---|---|---|---|---|---|
| (this study) | VTFK00000000 | Annulipalpia | Nanopore+Illumina | 16.5 | 229,663,394 | 2190 | 98.3 | 222 Mb (FCM: 258 Mb) |
| (this study) | VTON00000000 | Annulipalpia | Nanopore+Illumina | 17.1 | 396,695,105 | 869 | 98.6 | 315 Mb (FCM: 455 Mb) |
| | v1 | Annulipalpia | PacBio+Illumina | 153 | 451,494,475 | 1297 | 98.7 | 407 |
| (i5k Consortium) | Llun_2.0 | Integripalpia | Illumina | 80.1 | 1,369,180,260 | 69.1 | 94.3 | n.a. |
| | ASM334726v1 | Annulipalpia | Illumina | 53 | 604,293,666 | 16.7 | 96.8 | 1.52 Gb |
| | ASM300347v1 | Integripalpia | Illumina | 43 | 1,015,727,762 | 3.1 | 76.2 | 616 Mb |
| | n.a. | Integripalpia | Illumina | 8.12 | 757,289,448 | 1.47 | 62.2 | n.a. |
Coverage of data used for genome assembly.
N Insecta = 1,658; present = complete + fragmented.
Based on GenomeScope (Vurture et al. 2017).
Based on closely related species.
Based on 17-mer analysis.
FCM, flow cytometry; n.a., not available.
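Scaffold N50, the contiguity measure reported in the table, is conventionally defined as the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The following is a minimal sketch of that conventional definition, not the pipeline used in the study; the function name `n50` and the toy lengths are illustrative assumptions, and the result is in the same unit as the input (bp here, whereas the table reports kb).

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Conventional definition: sort lengths from longest to shortest and
    return the length at which the running sum first reaches at least
    half of the total length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths in bp (hypothetical, not taken from any assembly above).
toy_scaffolds = [50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # 40: 50 + 40 = 90 reaches half of the 150 bp total
```

Because N50 depends only on the length distribution, it says nothing about correctness or gene content, which is why the table pairs it with BUSCO scores.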
Fig. 1. (A) Jukes–Cantor distance tree. I: Annulipalpia; II: Integripalpia. (B) Maximum likelihood tree. Bootstrap values ≥90% are given for each node. L-fibroins of the two sequenced genomes are indicated in red. GenBank accessions: Bombyx mori: NM_001044023.1 (Suetsugu et al. 2013), Rhyacophila obliterata: AB354690.1 (Yonemura et al. 2009), Limnephilus decipiens: AB214510.1 (Yonemura et al. 2006), Hesperophylax occidentalis: KM384738.1 (Wang et al. 2014), Hydropsyche tenuis: this study, Hydropsyche angustipennis: AB214508.1 (Yonemura et al. 2006), Plectrocnemia conspersa: this study, Stenopsyche marmorata: LC057252.1 (Bai et al. 2015), and Stenopsyche tienmushanensis: Luo et al. (2018). (C) Aligned amino acid residues of L-fibroin. Each color in the alignment represents a different amino acid. Mean pairwise identity over all pairs in the column: green, 100% identity; greeny-brown, at least 30% and under 100% identity; red, below 30% identity.