J Antonio Baeza, F J García-De León.
Abstract
BACKGROUND: Whole mitochondrial genomes are quickly becoming markers of choice for exploring within-species genealogical and among-species phylogenetic relationships. Mitochondrial chromosomes are most often assembled using 'primer walking' or 'long PCR' strategies plus Sanger sequencing, or low-pass whole genome sequencing with Illumina short reads. In this study, we first confirmed that mitochondrial genomes can be sequenced using nanopore long-read data exclusively. Next, we examined the accuracy of the mitochondrial chromosomes assembled from long reads by comparing them to a 'gold standard' reference mitochondrial chromosome assembled from Illumina short reads.
Keywords: Elasmobranch; Long-read sequencing; Nanopore
Year: 2022 PMID: 35459089 PMCID: PMC9027416 DOI: 10.1186/s12864-022-08482-z
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 4.547
Fig. 1 Circularized mitochondrial genome ideogram of the silky shark Carcharhinus falciformis. The map is annotated and depicts a single putative control region, 22 transfer RNA (tRNA) genes, 2 ribosomal RNA genes (rrnS [12S ribosomal RNA] and rrnL [16S ribosomal RNA]), and 13 protein-coding genes (PCGs). Shark photograph: Joi Ito (Attribution 2.0 Generic [CC BY 2.0])
Accuracy metrics for different de novo and reference-based mitochondrial genome assemblies using nanopore long reads exclusively in the silky shark Carcharhinus falciformis
| Assembly pipeline | Contigs | Length (bp) | Coverage | p-distance | Errors (b) |
|---|---|---|---|---|---|
| Flye +1p | circular | 16,690 | 20x | 0.001023172 | 65 |
| Flye +1p + Medaka | circular | 16,475 | 20x | 0.000541679 | 70 |
| Flye +5p | circular | 16,691 | 20x | 0.001023172 | 69 |
| Flye +5p + Medaka | circular | 16,475 | 20x | 0.000541679 | 71 |
| Flye +10p | circular | 16,691 | 20x | 0.001023172 | 69 |
| Flye +10p + Medaka | circular | 16,475 | 20x | 0.000541679 | 71 |
| Unicycler - N | circular | 16,801 | 2.28x (a) | 0.001143545 | 110 |
| Unicycler - N + Medaka | circular | 16,781 | 2.28x (a) | 0.000601866 | 89 |
| Unicycler - B | circular | 16,801 | 2.28x (a) | 0.001143545 | 110 |
| Unicycler - B + Medaka | circular | 16,781 | 2.28x (a) | 0.000601866 | 89 |
| Unicycler - C | circular | 16,801 | 2.28x (a) | 0.001143545 | 110 |
| Unicycler - C + Medaka | circular | 16,781 | 2.28x (a) | 0.000601866 | 89 |
| Rebaler - | circular | 15,782 | 50.59x | 0.001324105 | 106 |
| Rebaler - | circular | 16,774 | 50.59x | 0.000541679 | 81 |
| Rebaler - | circular | 15,790 | 49.94x | 0.000902799 | 95 |
| Rebaler - | circular | 16,776 | 49.94x | 0.000361119 | 73 |
| Rebaler - | circular | 16,789 | 52.52x | 0.000842612 | 96 |
| Rebaler - | circular | 16,777 | 52.52x | 0.000541679 | 81 |
| Reference mtDNA | circular | 16,705 | 9.8x | – | – |
(a) Unicycler normalizes the depth of contigs to the median value
(b) Errors refers to the total number of errors quantified in the long-read assemblies compared to the short-read assembly. Errors were classified as single, double, triple, quadruple, quintuple, sextuple, or septuple ‘homopolymer insertions’ or ‘homopolymer deletions’, ‘simple substitution’, ‘single insertion’, ‘short insertion (< 5 bp)’, ‘single deletion’, and ‘short deletion (< 5 bp)’
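The p-distance column above is the proportion of differing sites between a long-read assembly and the short-read reference. The sketch below illustrates how such a value could be computed from a pairwise alignment. It is not the authors' procedure: the function name `p_distance` is hypothetical, the two sequences are assumed to be already aligned to equal length, and columns containing a gap are assumed to be skipped (pairwise deletion), which this record does not confirm.

```python
def p_distance(aligned_a: str, aligned_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Assumption: columns with a gap ('-') in either sequence are skipped
    (pairwise deletion), so indels do not contribute to the distance.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0  # gap-free columns
    differing = 0  # gap-free columns where the bases differ
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue
        compared += 1
        if base_a != base_b:
            differing += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return differing / compared


# Toy alignment: 5 gap-free columns, 1 substitution.
toy_distance = p_distance("ACGT-A", "ACGAAA")
```

Under this definition, insertions and deletions affect only the error counts in the last column, not the p-distance, which would be one way for an assembly to show a low p-distance while still carrying many homopolymer errors.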
Fig. 2 Sequence errors per de novo (Unicycler and Flye) and reference-based (Rebaler) assembler, without and with ‘extra polishing’ using the program Medaka, for the silky shark Carcharhinus falciformis mitochondrial genome. All long-read assemblies were benchmarked against the Illumina short-read assembly (‘gold’ standard)
Fig. 3 Annotation of reference-based (Rebaler) and de novo (Flye and Unicycler) mitochondrial genomes assembled using long reads in the silky shark Carcharhinus falciformis. Assemblies depicted include those with and without ‘extra polishing’ with the program Medaka
Fig. 4 Mitophylogenomic analysis of the genus Carcharhinus and allies, including mitochondrial genomes of the silky shark Carcharhinus falciformis assembled with long reads exclusively and short reads (‘gold standard’). Nodes with bootstrap support values > 90 are marked with an orange circle. Shark photograph: Joi Ito (Attribution 2.0 Generic [CC BY 2.0])
Fig. 5 Barcoding analysis of the genus Carcharhinus using the D-Loop/Control Region (CR), including the CR retrieved from mitochondrial genomes of the silky shark Carcharhinus falciformis assembled with long reads alone and with short reads (‘gold standard’), plus 447 other specimens belonging to the genus Carcharhinus retrieved from GenBank. Shark drawings from M. Dando (used with permission) [47]
Fig. 6 Bioinformatics pipeline to assemble the mitochondrial chromosome of the silky shark Carcharhinus falciformis using nanopore long reads exclusively