Brian W Strehlow1, Astrid Schuster2, Warren R Francis2, Donald E Canfield2.
Abstract
OBJECTIVES: These data were collected to generate a novel reference metagenome for the sponge Halichondria panicea and its microbiome for subsequent differential expression analyses.
DATA DESCRIPTION: These data include raw sequences from four separate sequencing runs of the metagenome of a single individual of Halichondria panicea: one Illumina MiSeq (2 × 300 bp, paired-end) run and three Oxford Nanopore Technologies (ONT) long-read sequencing runs, generating 53.8 and 7.42 Gbp, respectively. A comparison of the Illumina-only, ONT-only and hybrid Illumina-ONT assemblies showed the hybrid to be the 'best' assembly, comprising 163 Mbp in 63,555 scaffolds (N50: 3084). This assembly, however, was still highly fragmented and contained only 52% of core metazoan genes (with 77.9% partial genes), so it remained incomplete. Nevertheless, this sponge is an emerging model species for field and laboratory work, and there is considerable interest in genomic sequencing of this species. Although the assemblies resulting from the data presented here are suboptimal, this data note can inform future studies in three ways: by providing an estimated genome size and coverage requirements for future sequencing, by sharing additional data that could improve other suboptimal assemblies of this species, and by outlining potential limitations and pitfalls of the combined Illumina and ONT approach to novel genome sequencing.
Keywords: Halichondria panicea; Hologenome; Metagenome; Microbiome; Objective; Porifera
Year: 2022 PMID: 35397610 PMCID: PMC8994243 DOI: 10.1186/s13104-022-06013-3
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
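The abstract summarises assembly contiguity with an N50 value (3084 for the hybrid assembly, presumably in bp). As a reminder of what that statistic measures, the following is a minimal sketch of one common definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The function name `n50` and the toy scaffold lengths are illustrative assumptions, not values or code from the data note, and assembly tools may differ in how they handle edge cases.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Illustrative helper (not from the data note): sort lengths from
    longest to shortest and return the length at which the running
    total first reaches half of the summed length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths (hypothetical): total 30, half 15.
# Running total is 10 after the first scaffold and 18 after the second,
# so the N50 is the length of the second scaffold.
toy_scaffolds = [10, 8, 6, 4, 2]
print(n50(toy_scaffolds))  # prints 8
```

A low N50 relative to the expected gene length, as reported here, is one reason an assembly is described as highly fragmented.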
Overview of data files/data sets
| Label | Name of data file/data set | File types (file extension) | Data repository and identifier (DOI or accession number) |
|---|---|---|---|
| Data file 1 | DNA extraction protocol | .io | Protocols.io, 10.17504/protocols.io.yvkfw4w |
| Data set 1 | Illumina raw sequences lane 1 | fastq | NCBI Sequence Read Archive, |
| Data set 2 | Illumina raw sequences lane 2 | fastq | NCBI Sequence Read Archive, |
| Data set 3 | Nanopore run 1 raw sequences | fastq | NCBI Sequence Read Archive, |
| Data set 4 | Nanopore run 2 WGA raw sequences | fastq | NCBI Sequence Read Archive, |
| Data set 5 | Nanopore run 3 WGA raw sequences | fastq | NCBI Sequence Read Archive, |
| Data set 6 | Whole metagenome assembly (from Illumina sequences) | fasta | NCBI Assembly, |
| Data set 7 | | fasta | NCBI Assembly, |
| Data set 8 | HOC36 bin assembly (from Illumina sequences) | fasta | NCBI Assembly, |
| Data set 9 | Proteobacteria bin assembly (from Illumina sequences) | fasta | NCBI Assembly, |
| Data set 10 | Nanopore only metagenome assembly | fasta | NCBI Assembly, |
| Data set 11 | Hybrid nanopore-Illumina assembly | fasta | NCBI Assembly, |
| Data file 2 | Supplementary material | | Harvard Dataverse, 10.7910/DVN/DJYOOI |