Maria G Khrenova, Tatiana V Panova, Vladimir A Rodin, Maxim A Kryakvin, Dmitrii A Lukyanov, Ilya A Osterman, Maria I Zvereva.
Abstract
Nanopore sequencing (ONT) is a new and rapidly developing method for determining nucleotide sequences in DNA and RNA. Compared to next-generation sequencing, it can produce long reads of thousands of nucleotides without assembly or amplification during sequencing. Nanopore sequencing can help determine the genetic changes that lead to antibiotic resistance. This study presents the application of ONT technology to the assembly of an E. coli genome characterized by a deletion of the tolC gene and known single-nucleotide variations leading to antibiotic resistance, in the absence of a reference genome. We performed benchmark studies to determine the minimum coverage depth needed to obtain a complete genome, depending on the quality of the ONT data. We also compared existing assembly programs: Flye produced more plausible assemblies than the others (Shasta, Canu, and Necat). The coverage depth required for successful assembly strongly depends on read length. For high-quality samples with an average read length of 8 Kbp or more, a coverage depth of 30× is sufficient to assemble the complete genome de novo and reliably determine single-nucleotide variations in it. For samples with shorter reads, with mean lengths of 2 Kbp, a higher coverage depth of 50× is required. Mechanical mixing must be avoided during sample preparation. Nanopore sequencing can be used alone to determine the genetic features of bacterial strains that confer antibiotic resistance.
Keywords: ONT sequencing; SNV; antibiotic resistance; deletion; tolC gene
Year: 2022 PMID: 35955702 PMCID: PMC9369328 DOI: 10.3390/ijms23158569
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 6.208
Main features of the considered samples. Deletions and SNVs are confirmed by alternative experimental methods.
| Sample | T1 | T2 | T3 | T4 | T5 |
|---|---|---|---|---|---|
| Deletion | | | | | |
| Gene with SNV | | | | | |
| Total bases, Mbp | 4377.9 | 1952.1 | 114.8 | 663.5 | 831.7 |
| Number of reads | 3,888,954 | 935,363 | 62,748 | 70,397 | 97,636 |
| Mean coverage | 952× | 424× | 25× | 144× | 181× |
| Mean read length, bp | 1126 | 2087 | 1829 | 9425 | 8518 |
| Mean read length | 12,000 | 12,759 | 12,916 | 13,244 | 13,255 |
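The coverage and read-length rows of the table follow from the two rows above them by simple division. The sketch below reproduces that arithmetic; `GENOME_SIZE_BP` is an assumption (about 4.6 Mbp, a typical E. coli genome size, not a value stated in the source), and the function names are hypothetical. The helper `reads_for_coverage` applies the same relation in reverse to the 30× and 50× thresholds quoted in the abstract.

```python
import math

# Assumption: an E. coli genome of about 4.6 Mbp (not stated in the source).
GENOME_SIZE_BP = 4.6e6

# (total bases in bp, number of reads), copied from the table above.
samples = {
    "T1": (4377.9e6, 3_888_954),
    "T2": (1952.1e6, 935_363),
    "T3": (114.8e6, 62_748),
    "T4": (663.5e6, 70_397),
    "T5": (831.7e6, 97_636),
}


def mean_coverage(total_bases, genome_size=GENOME_SIZE_BP):
    """Mean coverage depth = sequenced bases / genome size."""
    return total_bases / genome_size


def mean_read_length(total_bases, n_reads):
    """Mean read length = sequenced bases / number of reads."""
    return total_bases / n_reads


def reads_for_coverage(target_depth, read_length, genome_size=GENOME_SIZE_BP):
    """Approximate number of reads needed to reach a target mean depth."""
    return math.ceil(target_depth * genome_size / read_length)


for name, (total, n_reads) in samples.items():
    print(name,
          f"coverage ~{mean_coverage(total):.0f}x,",
          f"mean read length ~{mean_read_length(total, n_reads):.0f} bp")

# Thresholds from the abstract: 30x at 8 Kbp reads, 50x at 2 Kbp reads.
print(reads_for_coverage(30, 8000), reads_for_coverage(50, 2000))
```

Because the total-bases row is rounded to 0.1 Mbp, the recomputed values can differ from the table by a base pair or so.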
Figure 1. Accumulation curves demonstrating read length distributions for the considered samples. (A,B) Entire samples T1–T5. Each point on the curve corresponds to the fraction of all (A) reads or (B) nucleotides with a read length equal to or smaller than the read length at this point. In the legend of panel (B), the values after colons are the read lengths at which the curves reach a fraction of 0.5. (C) The same graph as (B) obtained for the T5 set and its subsets; the values after colons are the fractions of reads from the entire T5 sample in the corresponding curves.
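The accumulation curves described in the Figure 1 caption can be computed directly from a list of read lengths. The following is a minimal sketch, not the authors' code: the function names are hypothetical and the toy read lengths are invented for illustration only.

```python
def accumulation_curves(read_lengths):
    """Return sorted lengths with cumulative fractions of reads and of bases.

    For each length L in sorted order, read_frac is the fraction of reads
    and base_frac the fraction of nucleotides seen so far, i.e. in reads
    of length <= L (exact at the last read of each distinct length).
    """
    lengths = sorted(read_lengths)
    n_reads = len(lengths)
    total_bases = sum(lengths)
    read_frac, base_frac = [], []
    running_bases = 0
    for i, length in enumerate(lengths, start=1):
        running_bases += length
        read_frac.append(i / n_reads)
        base_frac.append(running_bases / total_bases)
    return lengths, read_frac, base_frac


def length_at_fraction(lengths, fractions, target=0.5):
    """Smallest read length at which the curve reaches the target fraction."""
    for length, frac in zip(lengths, fractions):
        if frac >= target:
            return length
    return lengths[-1]


# Toy data (invented, not from the study).
toy = [1000, 2000, 3000, 4000]
lengths, read_frac, base_frac = accumulation_curves(toy)
print(length_at_fraction(lengths, read_frac))  # panel (A)-style value
print(length_at_fraction(lengths, base_frac))  # panel (B)-style value
```

The value read off the nucleotide curve at 0.5 is the kind of number reported after the colons in the panel (B) legend.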
Figure 2. Deletion of the tolC gene. Alignment of (A) ONT reads and (B) de novo assembled genomes to the E. coli reference genome. All genes considered in this study are marked.
SNVs in samples T3–T5. The probability of error is determined by the quality of the data at this position and the coverage depth during the analysis by the Medaka program [28]. The Fraction column shows the fraction of reads from the entire set used for the analysis. The values of the coverage depth and error probability are separated by commas if several subsets of data were generated with a given fraction of the entire sample. “-” refers to subsets in which no SNV was found. The last column presents data on SNV detection in de novo assembled genomes.
| Sample | Gene | SNV and Its Coordinate | Fraction | Coverage Depth (Error Probability) | SNV in De Novo Genomes, Yes/No |
|---|---|---|---|---|---|
| T3 | | 248: C → T | 1 | 21× (0.01%) | 1/0 |
| T4 | | 272: C → T | 1 | 150× (0.5%) | 1/0 |
| | | | 1/2 | 92× (3%), 58× (1%) | 2/0 |
| | | | 1/4 | 50× (0.2%), 42× (0.7%), 31× (1%), 27× (0.3%) | 3/1 |
| | | | 1/8 | 27× (0.3%), 23× (1%), 22× (2%), 20× (3%), 16× (5%), 15× (0.6%), 9× (28%), - | 5/3 |
| T5 | | 599: T → A | 1 | 165× (0.001%) | 1/0 |
| | | | 1/2 | 91× (0.006%), 74× (0.002%) | 2/0 |
| | | | 1/4 | 53× (0.01%), 45× (0.002%), 38× (0.01%), 29× (0.02%) | 4/0 |
| | | | 1/8 | 26× (0.01%), 24× (0.01%), 17× (0.01%), 21× (0.004%), 15× (0.02%), -, -, - | 8/0 |
| T5 | | 272: C → T | 1 | 171× (0.4%) | 1/0 |
| | | | 1/2 | 96× (1%), 75× (0.3%) | 2/0 |
| | | | 1/4 | 50× (0.6%), 46× (2%), 41× (3%), 33× (3%) | 4/0 |
| | | | 1/8 | 27× (5%), 27× (1%), 25× (2%), 23× (2%), 20× (5%), 19× (45%), -, - | 5/3 |
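Two quick readings of the table can be scripted. First, the per-subset coverage depths at fractions 1/2 and 1/4 add up to the full-sample depth, which is consistent with the subsets being disjoint parts of the whole read set (an inference from the numbers, not a statement in the source). Second, an error probability can be re-expressed on the standard Phred scale, Q = -10·log10(p). The sketch below uses the T4 rows only; the function name `phred` and the dictionary layout are hypothetical, and whether Medaka reports exactly this quantity is an assumption.

```python
import math


def phred(error_percent):
    """Phred-scaled quality for an error probability given in percent."""
    return -10 * math.log10(error_percent / 100)


# (coverage depth, error probability in %) copied from the T4 rows above.
t4_calls = {
    "1": [(150, 0.5)],
    "1/2": [(92, 3), (58, 1)],
    "1/4": [(50, 0.2), (42, 0.7), (31, 1), (27, 0.3)],
}

for fraction, calls in t4_calls.items():
    total_depth = sum(depth for depth, _ in calls)
    qualities = [round(phred(err), 1) for _, err in calls]
    print(f"fraction {fraction}: summed depth {total_depth}x, Phred {qualities}")
```

On this scale an error probability of 1% corresponds to Q20 and 0.01% to Q40, which makes the gap between the 272: C → T and 599: T → A calls easier to compare across subsets.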
Main features of the considered samples.
| Sample Vortexing Time | 0 s | 5 s | 10 s | 30 s |
|---|---|---|---|---|
| Mean read length | 13,101.7 ± 72.4 | 12,851.1 ± 167.5 | 13,021.5 ± 51.4 | 12,942.5 ± 121.1 |