M M Malmberg, G C Spangenberg, H D Daetwyler, N O I Cogan.
Abstract
Despite the high accuracy of short read sequencing (SRS), it remains difficult to obtain accurate single nucleotide polymorphism (SNP) genotypes at low sequencing coverage and in highly duplicated genomes, because of misalignment. Long read sequencing (LRS) systems, including the Oxford Nanopore Technologies (ONT) minION, have become popular options for de novo genome assembly and structural variant characterisation. The current high error rate often requires substantial post-sequencing correction and would appear to prevent the adoption of this system for SNP genotyping, but nanopore sequencing errors are largely random. The use of low coverage ONT minION sequencing for genotyping of pre-validated SNP loci was examined in nine canola doubled haploids. The minION genotypes were compared to the Illumina sequences to determine the extent and nature of genotype discrepancies between the two systems. The significant increase in read length improved alignment to the genome, and the absence of classical SRS biases resulted in a more even representation of the genome. Sequencing errors are present, primarily in the form of heterozygous genotypes, which can be removed in completely homozygous backgrounds but require more advanced bioinformatics in heterozygous genomes. Developments in this technology are promising for routine genotyping in the future.
Year: 2019 PMID: 31213642 PMCID: PMC6582154 DOI: 10.1038/s41598-019-45131-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Tablet view of the alignment of minION sequencing reads to the Darmor-bzh reference genome, with a known SNP (chr A01 position 533686) in DH-9 outlined.
Coverage of the Darmor-bzh genome generated in each sample across all sequencing runs and their associated total percentage of heterozygous genotypes.
| Sample | Illumina coverage | Illumina % hets | minION all runs coverage | minION all runs % hets | minION run 1 coverage | minION run 1 % hets | minION run 2 coverage | minION run 2 % hets | minION run 3 coverage | minION run 3 % hets | minION run 4 coverage | minION run 4 % hets |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DH-1 | 7.6 | 3.7 | 4.2 | 4.0 | 0.9 | 2.6 | 0.5 | 2.0 | 1.5 | 2.6 | 1.4 | 2.5 |
| DH-2 | 9.1 | 4.6 | 7.4 | 4.0 | 0.9 | 2.8 | 0.4 | 2.1 | 3.0 | 2.5 | 3.0 | 2.5 |
| DH-3 | 4.9 | 4.8 | 2.8 | 4.4 | 0.9 | 2.7 | 0.6 | 2.0 | 0.7 | 3.1 | 0.6 | 3.0 |
| DH-4 | 8.4 | 6.1 | 3.9 | 4.1 | 1.0 | 2.5 | 0.6 | 1.8 | 1.2 | 2.9 | 1.1 | 2.8 |
| DH-5 | 4.4 | 5.2 | 1.7 | 5.1 | 1.0 | 3.0 | 0.5 | 2.3 | 0.1 | 3.7 | 0.1 | 3.6 |
| DH-6 | 6.3 | 3.6 | 2.4 | 4.0 | 1.1 | 2.4 | 0.5 | 2.1 | 0.4 | 3.2 | 0.4 | 3.0 |
| DH-7 | 12.0 | 4.8 | 4.9 | 4.1 | 0.9 | 2.7 | 0.6 | 2.1 | 1.8 | 2.7 | 1.7 | 2.7 |
| DH-8 | 14.4 | 4.3 | 4.9 | 3.9 | 1.1 | 2.5 | 0.6 | 1.9 | 1.6 | 2.7 | 1.6 | 2.6 |
| DH-9 | 9.5 | 4.1 | 1.7 | 4.2 | 0.9 | 2.5 | 0.5 | 1.9 | 0.2 | 3.5 | 0.1 | 2.9 |
| AVERAGE | | | | | | | | | | | | |
| Correlation with coverage | | −0.09 | | −0.56 | | −0.15 | | −0.47 | | −0.89*** | | −0.75** |
*** Significant at 0.01.
** Significant at 0.02.
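The "Correlation with coverage" row can be illustrated with a short calculation. The source does not say which correlation coefficient was used; the sketch below assumes a Pearson correlation between per-sample coverage and percentage of heterozygous genotypes, and applies it to the rounded Illumina values from the table. Under that assumption the result comes out close to the reported −0.09. The function and variable names are illustrative only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Illumina columns for DH-1 to DH-9, copied from the table above (rounded values).
illumina_coverage = [7.6, 9.1, 4.9, 8.4, 4.4, 6.3, 12.0, 14.4, 9.5]
illumina_pct_hets = [3.7, 4.6, 4.8, 6.1, 5.2, 3.6, 4.8, 4.3, 4.1]

r_illumina = pearson_r(illumina_coverage, illumina_pct_hets)
print(round(r_illumina, 2))
```

The same function can be applied to any of the minION column pairs; small differences from the reported values are expected because the table entries are rounded.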
Figure 2. Percentage of accurate genotype calls for each DH sample based on the sequencing data from all 4 minION sequencing runs. Four different filtering treatments were applied: no filtering (no dp filter); a minimum read depth of 2 and a maximum read depth of 5 (dp 2–5); removal of any heterozygous genotype calls in the minION sequences without depth filtering (homozygous no dp filter); and removal of any heterozygous genotype calls in the minION sequences combined with a minimum read depth of 2 and a maximum read depth of 5 (homozygous dp 2–5).
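The four filtering treatments can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the `Call` record, its field names, and the toy data are assumptions, and accuracy is taken here to mean the percentage of retained minION calls that match the Illumina genotype.

```python
from dataclasses import dataclass

@dataclass
class Call:
    depth: int        # number of minION reads supporting the call (hypothetical field)
    genotype: tuple   # minION genotype as a pair of alleles, e.g. ("A", "G")
    truth: tuple      # Illumina genotype at the same SNP

def is_het(genotype):
    """A genotype is heterozygous when its two alleles differ."""
    return genotype[0] != genotype[1]

def apply_treatment(calls, depth_range=None, drop_hets=False):
    """Keep calls inside an inclusive depth range and/or drop heterozygous minION calls."""
    kept = []
    for call in calls:
        if depth_range is not None and not (depth_range[0] <= call.depth <= depth_range[1]):
            continue
        if drop_hets and is_het(call.genotype):
            continue
        kept.append(call)
    return kept

def accuracy_pct(calls):
    """Percentage of calls whose minION genotype matches the Illumina genotype."""
    matches = sum(sorted(c.genotype) == sorted(c.truth) for c in calls)
    return 100.0 * matches / len(calls)

# Toy data, invented for illustration only.
toy = [
    Call(1, ("A", "A"), ("A", "A")),
    Call(3, ("A", "G"), ("A", "A")),
    Call(4, ("G", "G"), ("G", "G")),
    Call(9, ("A", "A"), ("G", "G")),
    Call(2, ("C", "C"), ("C", "C")),
]

treatments = {
    "no dp filter": apply_treatment(toy),
    "dp 2-5": apply_treatment(toy, depth_range=(2, 5)),
    "homozygous no dp filter": apply_treatment(toy, drop_hets=True),
    "homozygous dp 2-5": apply_treatment(toy, depth_range=(2, 5), drop_hets=True),
}
results = {name: accuracy_pct(kept) for name, kept in treatments.items()}
print(results)
```

Dropping heterozygous calls is only meaningful here because doubled haploids are expected to be fully homozygous, so a heterozygous minION call is treated as a likely sequencing error.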
Figure 3. Percentage of accurate genotype calls based on the number of supporting reads. The optimal number of supporting reads is marked with an asterisk (*).
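A per-depth accuracy curve of this kind can be computed by grouping genotype calls by their number of supporting reads. The sketch below assumes each call is reduced to a `(depth, concordant)` pair, where `concordant` records whether the minION genotype matches Illumina; the data are invented and the function name is illustrative.

```python
from collections import defaultdict

def accuracy_by_depth(calls):
    """Map each supporting-read count to the percentage of concordant calls."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for depth, concordant in calls:
        totals[depth] += 1
        if concordant:
            correct[depth] += 1
    return {d: 100.0 * correct[d] / totals[d] for d in sorted(totals)}

# Toy (depth, concordant) pairs, invented for illustration only.
toy_calls = [
    (1, True), (1, False),
    (2, True), (2, True),
    (3, True), (3, False), (3, True), (3, True),
]

by_depth = accuracy_by_depth(toy_calls)
best_depth = max(by_depth, key=by_depth.get)  # depth with the highest accuracy
print(by_depth, best_depth)
```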
Effect of minimum read length filtering and trimming the ends of minION reads on heterozygosity and accuracy across all DH samples, filtered for between 2 and 5 supporting reads.
| DP 2–5 genotype calls | None | >500 bp reads | >1 kbp reads | >4 kbp reads | 100 bp ends trimmed |
|---|---|---|---|---|---|
| Average number of comparable SNPs per individual | 779,184 | 776,426 | 770,463 | 708,370 | 759,754 |
| Discrepant with Illumina % | 8.3 | 8.4 | 8.4 | 8.4 | 8.4 |
| Concordant with Illumina % | 91.7 | 91.6 | 91.6 | 91.6 | 91.6 |
| Heterozygous in minION % | 4.2 | 4.2 | 4.2 | 4.2 | 4.2 |
| | | | | | |
| Discrepant with Illumina % | 5.3 | 5.3 | 5.3 | 5.4 | 5.3 |
| Concordant with Illumina % | 94.7 | 94.7 | 94.7 | 94.6 | 94.7 |
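The read-level treatments in the table (minimum read length and trimming of read ends) can be sketched as a simple pre-alignment filter. This is an assumption-laden illustration: reads are represented as plain strings, the thresholds are treated as strict ("longer than"), and trimming removes the same number of bases from both ends. Real data would be FASTQ records whose quality strings must be trimmed in step with the sequence.

```python
def filter_and_trim(reads, min_length=0, trim=0):
    """Keep reads longer than min_length, then cut `trim` bases off each end.

    Reads that would be empty after trimming are discarded.
    """
    kept = []
    for seq in reads:
        if len(seq) <= min_length:
            continue  # too short (with min_length=0 this drops only empty reads)
        if trim:
            if len(seq) <= 2 * trim:
                continue  # nothing would remain after trimming both ends
            seq = seq[trim:-trim]
        kept.append(seq)
    return kept

# Toy reads of 300, 600 and 1500 bases, invented for illustration only.
toy_reads = ["A" * 300, "C" * 600, "G" * 1500]

over_500 = filter_and_trim(toy_reads, min_length=500)
over_1k = filter_and_trim(toy_reads, min_length=1000)
trimmed = filter_and_trim(toy_reads, trim=100)
print(len(over_500), len(over_1k), [len(r) for r in trimmed])
```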
Figure 4. Cumulative percentage frequency of SNPs for the minimum proportion of true genotypes per SNP. The blue lines include heterozygous genotype calls from the minION data; the orange lines include only homozygous genotype calls. The solid lines represent only genotype calls that are congruent between Illumina and minION; the dashed lines represent all genotype calls that are congruent between Illumina and minION, as well as genotypes that are heterozygous in Illumina but homozygous in minION.
Analysis of genotype calls which are not consistent between the Illumina and minION sequences, based on whether the genotype call is homozygous or heterozygous in Illumina.
| Illumina Genotype | minION Genotype | Percentage |
|---|---|---|
| Homozygous | Same genotype as Illumina | 95.0 |
| | Discrepant | 5.0 |
| | Discrepant and homozygous | |
| | Discrepant and heterozygous | |
| Heterozygous | Same genotype as Illumina | 15.2 |
| | Discrepant (i.e. homozygous) | 84.7 |
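The categories in this table can be reproduced from paired genotype calls with a small classifier. The sketch below is illustrative only: genotypes are assumed to be allele pairs, percentages are computed within each Illumina zygosity class, and the toy pairs are invented.

```python
from collections import Counter

def is_het(genotype):
    """A genotype is heterozygous when its two alleles differ."""
    return genotype[0] != genotype[1]

def classify(illumina, minion):
    """Return (Illumina zygosity, outcome of the minION call) for one SNP."""
    zygosity = "Heterozygous" if is_het(illumina) else "Homozygous"
    if sorted(illumina) == sorted(minion):
        outcome = "Same genotype as Illumina"
    elif is_het(minion):
        outcome = "Discrepant and heterozygous"
    else:
        outcome = "Discrepant and homozygous"
    return zygosity, outcome

def breakdown(pairs):
    """Percentage of each outcome within each Illumina zygosity class."""
    counts = Counter(classify(illumina, minion) for illumina, minion in pairs)
    totals = Counter()
    for (zygosity, _), n in counts.items():
        totals[zygosity] += n
    return {key: 100.0 * n / totals[key[0]] for key, n in counts.items()}

# Toy (Illumina, minION) genotype pairs, invented for illustration only.
toy_pairs = [
    (("A", "A"), ("A", "A")),
    (("A", "A"), ("A", "A")),
    (("A", "A"), ("A", "G")),
    (("A", "A"), ("G", "G")),
    (("A", "G"), ("A", "G")),
    (("A", "G"), ("G", "A")),  # same genotype, alleles listed in the other order
    (("A", "G"), ("A", "A")),
    (("A", "G"), ("G", "G")),
]

table = breakdown(toy_pairs)
for key, pct in sorted(table.items()):
    print(key, pct)
```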
Figure 5. Nucleotide bias in minION sequencing. The percentage of discrepant and correct genotype calls in minION sequencing for each homozygous genotype class captured in the Illumina sequencing.