Mattia De Vivo, Hsin-Han Lee, Yu-Sin Huang, Niklas Dreyer, Chia-Ling Fong, Felipe Monteiro Gomes de Mattos, Dharmesh Jain, Yung-Hui Victoria Wen, John Karichu Mwihaki, Tzi-Yuan Wang, Ryuji J Machida, John Wang, Benny K K Chan, Isheng Jason Tsai.
Abstract
High-throughput sequencing has enabled genome skimming approaches to produce complete mitochondrial genomes (mitogenomes) for species identification and phylogenomics. In particular, the portable sequencing device from Oxford Nanopore Technologies (ONT) has the potential to facilitate hands-on training from sampling to sequencing and interpretation of mitogenomes. In this study, we present the results from sampling and sequencing of six gastropod mitogenomes (Aplysia argus, Cellana orientalis, Cellana toreuma, Conus ebraeus, Conus miles and Tylothais aculeata) from a graduate-level biodiversity course. The students were able to produce mitogenomes from sampling to annotation using existing protocols and programs. Approximately 4 Gb of sequence was produced from 16 Flongle and one MinION flow cells, averaging 235 Mb and N50 = 4.4 kb per flow cell. Five of the six 14.1-18 kb mitogenomes were circularised and contained all 13 core protein-coding genes. Additional Illumina sequencing revealed that the ONT assemblies spanned highly AT-rich sequences in the control region that were otherwise missing in Illumina-assembled mitogenomes, but still contained one base error every 70.8-346.7 bp under fast mode basecalling, with the majority occurring in homopolymer regions. Our findings suggest that the portable MinION device can be used to rapidly produce low-cost mitogenomes onsite and can be tailored to genomics-based training in biodiversity research.
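The abstract summarises read lengths with an N50 value. As a minimal sketch of how that statistic is commonly defined (the function name `n50` and the toy read lengths are illustrative assumptions, not taken from the study's pipeline), the N50 is the length L such that reads of length L or longer contain at least half of all sequenced bases:

```python
def n50(lengths):
    """Return the N50 of a collection of read (or contig) lengths.

    Assumption: the common definition is used, i.e. the length L such that
    reads of length >= L together hold at least half of the total bases.
    """
    if not lengths:
        raise ValueError("no lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy read lengths in bp (hypothetical values, not data from the study).
read_lengths = [8000, 6000, 4400, 3000, 2000, 1000]
print(n50(read_lengths))  # 6000
```

In the toy example the total is 24,400 bp, and the two longest reads already cover more than half of it, so the N50 is the length of the second read.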
Year: 2022 PMID: 35705661 PMCID: PMC9200733 DOI: 10.1038/s41598-022-14121-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Sample identification (ID) codes, together with original morphological identification and BLASTn results of the cox1 sequence.
| Sample ID | Aoc | Cra | DJ | Ceb | Cfl | Mku |
|---|---|---|---|---|---|---|
| Family | Aplysiidae | Nacellidae | Nacellidae | Conidae | Conidae | Muricidae |
| Initial morphological identification | | | | | | |
| Bit Score (fast) | 1175 | 1186 | 1042 | 1151 | 1158 | 1136 |
| Nucleotide identity (%) (fast) | 98.9 | 99.1 | 99.3 | 99.2 | 99.5 | 98.0 |
| Bit Score (hac) | 1194 | 1199 | 1066 | 1151 | 1170 | 1197 |
| Nucleotide identity (%) (hac) | 99.4 | 99.5 | 100 | 99.4 | 99.8 | 99.5 |
| Genbank Accession | ON018801 | ON018804 | ON018805 | ON018802 | ON018803 | ON018806 |
| Length (bp) | 14,124 | 16,169 | 16,268 | 18,031 | 16,243 | 17,024 |
| AT content (%) | 66.5 | 69.5 | 68.4 | 67.5 | 61.8 | 67.0 |
| Bit Score | 1194 | 1205 | 1240 | 1175 | 1170 | 1205 |
| Nucleotide identity (%) | 99.4 | 99.7 | 100 | 100.0 | 99.8 | 99.7 |
*Latest species names are provided in the table; some were not yet updated in GenBank. **Conus cloveri, with 87.2% nucleotide identity, was identified as the top hit when the full cox1 sequence was used. We instead searched using the Folmer region and identified C. ebraeus with much higher nucleotide identity.
Figure 1. Quantification of ONT errors from fast mode basecalling. (A) Number of INDELs (+/−) and substitutions (*) in ONT assemblies before and after consensus improvement using Illumina reads. Error types that occurred once (n = 15) and twice (n = 8) were excluded from the plot. (B) Relationship between composition of single-base INDELs and homopolymer length.
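Panel B relates single-base INDELs to homopolymer length. A minimal sketch of how homopolymer runs could be located in an assembled sequence is given here; the function name `homopolymer_runs`, the `min_len` threshold and the toy sequence are hypothetical, and this is not the authors' analysis code:

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=2):
    """List (start, base, length) for each run of identical bases.

    Positions are 0-based. Only runs of at least `min_len` bases are kept;
    the threshold is an illustrative assumption.
    """
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = len(list(group))
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


# Toy sequence (hypothetical), keeping runs of 3 bases or more.
print(homopolymer_runs("ACGGGTTTTTA", min_len=3))  # [(2, 'G', 3), (5, 'T', 5)]
```

Error positions from an assembly comparison could then be checked against these intervals to tally INDELs by run length.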
Figure 2. ONT assembly features of sample Mku. (A) Dotplot against Illumina assembly. (B) AT content in 50 bp windows. (C) Nanopore and Illumina read coverage in 50 bp windows.
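Panel B reports AT content in 50 bp windows. The caption does not say whether the windows overlap, so the sketch below assumes consecutive non-overlapping windows; the function name `at_content_windows` and the toy sequence are hypothetical:

```python
def at_content_windows(seq, window=50):
    """Return (start, AT percentage) for consecutive non-overlapping windows.

    Assumptions: windows do not overlap, and a shorter final window is
    scored over its own length rather than discarded.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    seq = seq.upper()
    result = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        at_count = chunk.count("A") + chunk.count("T")
        result.append((start, 100.0 * at_count / len(chunk)))
    return result


# Toy sequence (hypothetical): 50 bp of pure AT followed by 50 bp of pure GC.
toy = "AT" * 25 + "GC" * 25
print(at_content_windows(toy))  # [(0, 100.0), (50, 0.0)]
```

Applied to a full mitogenome, sustained windows near 100% would mark AT-rich stretches such as those described for the control region.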
Figure 3. cox1 (left) and mitogenome (right) phylogenies for each family. From top to bottom: (A) Aplysiidae (with Aplysia argus); (B) Patellogastropoda (with Cellana orientalis and Cellana toreuma); (C) Conidae (with Conus ebraeus and Conus miles); and (D) Muricidae (with Tylothais aculeata). Blue dots represent bootstrap support ≤ 75 and yellow dots represent bootstrap support ≥ 95; intermediate values are written out. Red bold tips represent our specimens; black bold tips represent sequences of the identified species.
Figure 4. Synteny comparison among our samples and reference mitogenomes. Red labels denote our samples. The lengths of the control region between tRNAPhe and cox3 are shown when a difference of more than 1 kb is observed between closely related species.