DNA isolation methods for Nanopore sequencing of the Streptococcus mitis genome.

David Pinzauti¹, Francesco Iannelli¹, Gianni Pozzi¹, Francesco Santoro¹.

Abstract

Streptococcus mitis is a Gram-positive bacterium, member of the oral commensal microbiota, which can occasionally be the etiologic agent of diseases such as infective endocarditis, bacteraemia and septicaemia. The highly recombinogenic and repetitive nature of the S. mitis genome impairs the assembly of a complete genome when relying only on short sequencing reads. Oxford Nanopore sequencing can overcome this limitation by generating long reads, enabling the resolution of repeated genomic regions and the assembly of a complete genome sequence. Since the output of a Nanopore sequencing run is strongly influenced by genomic DNA quality and molecular weight, DNA isolation is the crucial step for an optimal sequencing run. In the present work, we have set up and compared three DNA isolation methods on two S. mitis strains, evaluating their capability of preserving genomic DNA integrity and purity. Sequencing of DNA isolated with a mechanical lysis-based method, despite being cheaper and quicker, did not generate ultra-long reads (maximum read length of 59516 bases) and did not allow the assembly of a circular complete genome. Two methods based on enzymatic lysis of the bacterial cell wall, followed by either (i) a modified CTAB DNA isolation procedure, or (ii) a DNA purification after osmotic lysis of the protoplasts, allowed the sequencing of ultra-long reads up to 107294 and 181199 bases in length, respectively. The reconstruction of a circular complete genome was possible when sequencing DNAs isolated using the enzymatic lysis-based methods.

Keywords: Gram positive; Oxford Nanopore sequencing; Streptococcus mitis; genomic DNA extraction; high molecular weight DNA

Substances: DNA, Bacterial

Year: 2022 PMID: 35171093 PMCID: PMC8942023 DOI: 10.1099/mgen.0.000764

Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858

Data Summary

All Nanopore sequencing data used in this study have been uploaded to the Sequence Read Archive (SRA): SRX13372482, SRX13372483, SRX13372484, SRX13372479, SRX13372480, SRX13372481. The S022-V3-A4 and S022-V7-A3 complete genomes are available at the NCBI with accession numbers CP047883.1 and CP067992.1.

The possibility of achieving long sequencing reads makes Oxford Nanopore technology an essential tool for bacterial genomics. Long reads can span long repetitive elements, resolving genomic complexities and allowing complete genome assembly. Preservation of genomic DNA integrity, during DNA isolation and sequencing library preparation, is crucial to obtain ultra-long sequence reads. The isolation of high molecular weight DNA from Gram-positive bacteria can be particularly challenging. Here we set up three DNA isolation methods, which were validated for the sequencing of the highly recombinogenic S. mitis genome. Sequencing of DNA isolated using a method based on cell-wall enzymatic lysis followed by protoplast osmotic lysis produced multiple ultra-long reads, up to 181199 bases in length. These methods can be readily applied for isolation of high molecular weight genomic DNA from difficult-to-lyse Gram-positive bacteria.

Introduction

The ‘mitis group’ of Streptococci comprises 20 species of human-associated Streptococci, including the pathogen Streptococcus pneumoniae. The phylogenetic relationships among , and are complex and cannot be completely untangled using standard biochemical approaches or sequencing marker genes such as sodA or the 16S rRNA gene [1]. S. mitis, formerly known as Streptococcus mitior, is a Gram-positive bacterium, member of the oral commensal microbiota, which can occasionally be the etiologic agent of human infectious diseases [2-4]. The S. mitis genome is between 1.8 and 2.1 Mb in length, contains up to 2277 open reading frames, and has a median G+C content of 40 %.

To date (December 2021), 170 S. mitis genomes are deposited in GenBank (https://www.ncbi.nlm.nih.gov/genome/genomes/530), of which only ten are complete. The 170 S. mitis genomes were downloaded and analysed with the speciator tool built into the PathogenWatch website (https://pathogen.watch/), which uses the Mash algorithm [5] to query a reference genome database and perform species assignment. The analysis confirmed the species assignment for 156 genomes (91.76 %); of those, 125 had a Mash distance (roughly corresponding to one minus the average nucleotide identity) of 0, while 31 had a Mash distance between 0 and 0.05. The remaining 14 genomes were incorrectly classified as S. mitis and were assigned to (n=4), (n=4), (n=3), S. timonensis (n=1), generically to Streptococcus sp (n=1), or could not be classified (n=1). This analysis confirmed the difficulties in taxonomic assignments within the ‘mitis group’ of Streptococci, and calls for a better, genome-based classification.

Unfortunately, bacterial species such as , [6] or [7] have highly recombinogenic and repetitive genomes, which impair the assembly of a complete genome sequence. Furthermore, the presence of more than one copy of the same insertion sequences and/or of the ribosomal RNA genes in bacterial genomes makes the assembly of complete genome sequences difficult [8]. In fact, to date (December 2021) only 7.12 % (24884 out of 349934) of the bacterial genomes available in the GenBank database are complete. Oxford Nanopore technology can generate long and ultra-long sequencing reads, which can be as long as the DNA template used for sequencing library preparation. Long reads enable resolution of genomic complexities and allow a complete genome assembly. Since Nanopore sequencing is strongly influenced by genomic DNA quality and molecular weight, the DNA isolation step remains the first challenge for an optimal sequencing run. In the present study we have set up and compared three different DNA extraction methods, evaluating their ability to isolate pure high molecular weight DNA from two S. mitis strains prior to Oxford Nanopore sequencing. These methods can be readily applied for isolation of high molecular weight genomic DNA from difficult-to-lyse Gram-positive bacteria, allowing the assembly of complete genomes.
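The Mash distance used for the species assignment can be illustrated with a short Python sketch. It follows the published Mash formula, D = −(1/k)·ln(2j/(1+j)), where j is the Jaccard index between two k-mer sketches. The k-mer size of 21 is the Mash default and is an assumption here, since the settings used by the speciator tool are not stated in the text; the Jaccard value in the example is purely illustrative.

```python
import math


def mash_distance(jaccard: float, k: int = 21) -> float:
    """Mash distance from a Jaccard index (k=21 is an assumed default)."""
    if jaccard <= 0.0:
        return 1.0  # no shared k-mers: distance is capped at 1
    if jaccard >= 1.0:
        return 0.0  # identical sketches
    return -math.log(2.0 * jaccard / (1.0 + jaccard)) / k


# Illustrative value only: two sketches sharing half of their k-mers.
distance = mash_distance(0.5)
print(f"Mash distance: {distance:.4f}")
print(f"Approximate ANI: {100 * (1 - distance):.1f} %")
```

Under these assumptions, a Jaccard index of 0.5 already corresponds to a distance below the 0.05 threshold mentioned above.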

Methods

DNA isolation methods

S022-V3-A4 and S022-V7-A3 bacterial strains were isolated from the saliva of a healthy subject treated with minocycline during the ANTIRESDEV project and identified as S. mitis by 16S rRNA gene sequencing [9]. Frozen starter cultures were diluted 50-fold in 10 ml of Tryptic Soy Broth (TSB) and incubated at 37 °C in ambient air. Bacterial growth was monitored until the culture reached an OD₅₉₀ of 1.0, corresponding approximately to 3×10⁷ c.f.u. ml⁻¹. Bacterial cultures were centrifuged at 6600 g for 5 min and the supernatants were discarded. Genomic DNAs were isolated using three different methods designated as cetyl trimethyl ammonium bromide (CTAB), Raffinose and TissueLyser. The CTAB and Raffinose methods rely on enzymatic lysis of the bacterial cell wall, while the TissueLyser method is based on mechanical lysis. A schematic representation of the isolation methods is reported in Fig. 1.

Fig. 1.

Schematic representation of the DNA isolation methods.

The CTAB method was adapted from Current Protocols in Molecular Biology [10]. The collected bacterial pellet was resuspended in 14.8 ml of sterile TE buffer (10 mM Tris, 1 mM EDTA, pH 8.0). Pre-digestion of the bacterial cell wall was carried out for 60 min at 37 °C in 2.6 mg ml⁻¹ lysozyme (Sigma Aldrich). Bacterial cells were lysed for 30 min in 0.5 % sodium dodecyl sulphate (SDS) and 0.1 mg ml⁻¹ proteinase K. Proteins and polysaccharides were precipitated by incubating for 10 min at 65 °C in 0.5 M NaCl and CTAB/NaCl (10 % CTAB, 0.7 M NaCl). DNA purification was carried out twice in an equal volume of chloroform/isoamyl alcohol (24 : 1 v/v), mixing well by inversion. The solution was centrifuged at 6600 g for 15 min, and the supernatant was recovered and transferred into a fresh tube. DNA was precipitated by incubating for 30 min at −20 °C in 0.6 volumes of ice-cold isopropanol. The DNA precipitate was pelleted by centrifuging at 6600 g for 15 min and left to dry at room temperature (RT); the pellet was then resuspended in 100 µl of saline solution (0.9 % NaCl) and stored at 4 °C.

In the Raffinose method, bacterial lysis relies on osmotic lysis of the protoplasts, while precipitation of proteins and polysaccharides was obtained in the absence of CTAB. The bacterial pellet was washed with 10 ml of sterile TE buffer and resuspended in 7.5 ml of Raffinose buffer (50 mM Tris pH 8, 5 mM EDTA, 20 % Raffinose). Pre-digestion of the bacterial cell wall was carried out for 60 min at 37 °C in 2.6 mg ml⁻¹ lysozyme. The solution was centrifuged at 6600 g for 5 min and the supernatant was discarded. The pellet was resuspended in 8 ml of sterile water; 400 µl of 10 % SDS (final concentration of 0.5 %) and 80 µl of 10 mg ml⁻¹ proteinase K (final concentration of 0.1 mg ml⁻¹) were added, and the mixture was incubated for 30 min at 37 °C to induce osmotic lysis of the protoplasts. Proteins and polysaccharides were precipitated by incubating for 10 min at RT in 0.5 M NaCl. DNA purification was carried out as described in the CTAB method.

The TissueLyser protocol relies on glass beads to mechanically disrupt the bacterial cell wall. The bacterial pellet was resuspended in 1 ml of sterile TE buffer and transferred into a fresh 2 ml microtube containing 0.04 g of sterile glass beads (Sigma-Aldrich, Ø150–212 µm). Bacterial cells were lysed with two passages of 2 min at 30 Hz in the TissueLyser device (Qiagen). The solution was centrifuged for 5 min at 1800 g and the supernatant was recovered. DNA purification was performed by adding 0.4 × AMPure XP beads (Beckman Coulter) and incubating for 15 min in a rotator mixer. Beads were pelleted using a magnetic rack (NimaGen), washed twice with freshly prepared 70 % ethanol, and resuspended in 100 µl of saline, incubating for 15 min at 37 °C. Beads were pelleted again in the magnetic rack, and the eluate containing the DNA was recovered and stored at 4 °C.
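The nominal final concentrations quoted for the Raffinose lysis mix can be checked with a simple dilution calculation (C_final = C_stock × V_stock / V_total). The Python sketch below is illustrative only: the function name is hypothetical, and the calculation assumes the pellet contributes no volume.

```python
def final_concentration(stock_conc, stock_volume_ml, other_volumes_ml):
    """Concentration after dilution; pellet volume is assumed negligible."""
    total_volume_ml = stock_volume_ml + sum(other_volumes_ml)
    return stock_conc * stock_volume_ml / total_volume_ml


# Volumes from the Raffinose method: 8 ml water, 400 µl SDS, 80 µl proteinase K.
water_ml, sds_ml, protk_ml = 8.0, 0.400, 0.080

# 10 % SDS stock, result in % (w/v).
sds_final = final_concentration(10.0, sds_ml, [water_ml, protk_ml])
# 10 mg/ml proteinase K stock, result in mg/ml.
protk_final = final_concentration(10.0, protk_ml, [water_ml, sds_ml])

print(f"SDS: {sds_final:.2f} %")
print(f"Proteinase K: {protk_final:.3f} mg/ml")
```

Both values round to the nominal 0.5 % and 0.1 mg ml⁻¹ stated in the protocol.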

DNA quantification

Extraction methods were compared in terms of genomic DNA quantity, quality and molecular weight. Extraction yields were measured using the Qubit dsDNA BR assay kit (Thermo Fisher Scientific) in a Qubit 2.0 Fluorometer device (Invitrogen). The DNA concentration and the total amount of DNA recovered were measured for each extraction protocol. Genomic DNA purity was determined using a NanoPhotometer device (IMPLEN), evaluating the DNA absorbance ratios 260 nm/280 nm and 260 nm/230 nm. DNA integrity was determined by agarose gel electrophoresis. Samples were loaded into a 0.8 % agarose gel and run for 4 h at 3 V/cm in 0.5 × Tris-Borate-EDTA buffer.

Oxford Nanopore library preparation

Isolated DNA samples were used for sequencing library preparation employing the Ligation Sequencing kit (SQK-LSK108) and the Expansion Barcoding kit (EXP-NB103), manufactured by Oxford Nanopore Technologies (ONT). A unique barcode sequence was associated with each DNA sample, enabling sample multiplexing. Sequencing libraries were prepared following the manufacturer’s instructions (https://nanoporetech.com/), with the addition of an initial size-selection step to reduce contamination by small DNA fragments. Wide-bore pipette tips were used and vortexing was avoided to further reduce DNA shearing. Briefly, for each extraction protocol 2.5 µg of genomic DNA was size-selected using 0.7 × AMPure XP beads. The reaction was incubated for 15 min in a rotator mixer and pelleted on a magnetic rack. The bead pellet was washed twice using freshly prepared 70 % ethanol and eluted in 40 µl of saline. Size-selected samples were end-repaired using FFPE and Ultra II End-Prep kits (New England BioLabs, NEB). Barcodes were ligated using Blunt TA Master Mix (NEB) and the samples were mixed together into an equimolar pool of 1.2 µg DNA in 50 µl of saline. Adapter proteins were ligated using Quick T4 Ligase (NEB) for 30 min at room temperature. Finally, 520 ng of sequencing library was loaded on a new R9.4 flowcell (ONT). The sequencing run was performed using a MinION device (ONT).

Data analysis

Oxford Nanopore native fast5 files were basecalled and demultiplexed using the stand-alone Guppy module v. 5.0.16 (High Accuracy mode, quality threshold >7). Sequencing readouts were analysed using NanoStat (v. 1.38.0) [11], while read-length distributions were plotted using the R-package ggplot2 (v. 2.2.1) [12]. Nanopore reads were filtered using Filtlong (v. 0.2.0) (https://github.com/rrwick/Filtlong) to exclude reads smaller than 3000 bases (--min_length 3000) and the worst 5 % of reads (--keep_percent 95). The de novo genome assembly was performed with Flye (v. 2.9) [13] assembler and polished with Medaka (v. 1.4.4) (https://github.com/nanoporetech/medaka). A hybrid assembly was generated with Unicycler (v. 0.4.7) [14] using Nanopore and Illumina reads. Assembly quality and completeness were evaluated comparing predicted genes with the Benchmarking Universal Single-Copy Orthologs (BUSCO v. 5.2.2) tool [15], using the ‘genome’ running mode and lactobacillales_odb10 database. All tools were run using default parameters unless otherwise specified.
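The read-filtering step can be sketched as a small Python wrapper around the Filtlong command line. Only the two options quoted in the text (--min_length 3000 and --keep_percent 95) are used; the file names and function names are hypothetical, and the sketch assumes a `filtlong` executable is available on the PATH.

```python
import subprocess


def filtlong_command(reads_path, min_length=3000, keep_percent=95):
    """Build the Filtlong argument list with the thresholds from the text."""
    return [
        "filtlong",
        "--min_length", str(min_length),
        "--keep_percent", str(keep_percent),
        str(reads_path),
    ]


def run_filtlong(reads_path, output_path):
    """Run Filtlong and save the filtered reads it writes to stdout."""
    with open(output_path, "wb") as handle:
        subprocess.run(filtlong_command(reads_path), stdout=handle, check=True)


# Hypothetical input file name, shown for illustration only.
print(" ".join(filtlong_command("barcode01.fastq.gz")))
```

Calling `run_filtlong("barcode01.fastq.gz", "barcode01.filtered.fastq")` would execute the command; building the argument list separately keeps the thresholds easy to inspect before anything is run.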

Results

Three genomic DNA isolation methods were set up and compared on two S. mitis strains, namely S022-V3-A4 and S022-V7-A3. The capability of isolating high molecular weight DNA without compromising purity was evaluated. We then used the purified DNAs as templates in whole-genome sequencing experiments to assess the ability to generate long and ultra-long Nanopore reads suitable for de novo genome assembly. One method, based on mechanical lysis of bacterial cells using the Qiagen TissueLyser device, was designated as TissueLyser. Two methods based on enzymatic lysis of the bacterial cell wall, followed by either (i) a modified CTAB DNA extraction procedure, or (ii) a DNA purification after osmotic lysis of the protoplasts, were designated as CTAB and Raffinose, respectively.

Genomic DNA quantitative and qualitative analysis

To determine the concentration, yield, purity and integrity of genomic DNAs, fluorometric, spectrophotometric and agarose gel electrophoretic analyses were carried out. Quantitative fluorometric analysis, performed with the Qubit Fluorometer, showed that the two enzymatic lysis-based methods were more efficient than the mechanical lysis-based method, achieving a total DNA yield of up to 56.2 µg (Table 1). Qualitative spectrophotometric analysis showed that both DNA absorbance ratios were within acceptable ranges for all methods. Agarose gel electrophoresis showed that the enzymatic lysis-based methods were able to preserve the integrity of genomic DNAs, which could be detected as two sharp high molecular weight (HMW) bands, of which one was retained in the loading well and the other had an estimated MW of about 30 kb (Fig. 2). The TissueLyser method yielded a smear in the lanes with no HMW band, indicating DNA fragmentation.

Table 1.

Quantitative and qualitative analysis of genomic DNAs

Method	DNA concentration* (ng μl⁻¹)		A₂₆₀/A₂₈₀*		A₂₆₀/A₂₃₀*
	S022-V3-A4	S022-V7-A3	S022-V3-A4	S022-V7-A3	S022-V3-A4	S022-V7-A3
CTAB	177.13±80.1	355±221.73	1.96±0.07	1.79±0.17	2.13±0.04	1.97±0.10
Raffinose	158.4±43.4	196.66±42.72	1.92±0.05	1.94±0.07	2.04±0.07	2.09±0.02
TissueLyser	48.08±45.74	25.74±16.29	1.96±0.13	2.03±0.13	2.00±0.13	2.11±0.05

*Values represent the average of at least three independent DNA isolation experiments or measurements. For Oxford Nanopore sequencing, acceptable DNA absorbance ratios are as follows: A₂₆₀/A₂₈₀=1.8–2.0 and A₂₆₀/A₂₃₀=2.0–2.2.
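The acceptance windows in the footnote can be applied to the reported means with a small check. The Python sketch below is only an illustration: the function name is hypothetical, and treating a value as acceptable when its mean ± standard deviation interval overlaps the window is an assumption made here, not a criterion stated by the authors.

```python
# Acceptance windows quoted in the Table 1 footnote.
ACCEPTABLE = {"A260/A280": (1.8, 2.0), "A260/A230": (2.0, 2.2)}


def overlaps_window(ratio_name, mean, sd=0.0):
    """True if the interval mean ± sd overlaps the acceptable window."""
    low, high = ACCEPTABLE[ratio_name]
    return (mean + sd) >= low and (mean - sd) <= high


# CTAB, strain S022-V7-A3 (Table 1): A260/A280 = 1.79 ± 0.17.
print(overlaps_window("A260/A280", 1.79))        # mean alone
print(overlaps_window("A260/A280", 1.79, 0.17))  # mean with its deviation
```

A mean that sits marginally outside the nominal window can still overlap it once the spread between replicate isolations is taken into account.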

Fig. 2.

Agarose gel electrophoretic analysis of genomic DNAs. (a) Representative DNAs extracted from S022-V3-A4 strain, (b) representative DNAs extracted from S022-V7-A3 strain. Lane 1, Lambda (λ) DNA/HindIII (100 ng). Genomic DNA purified with the CTAB method: lane 2, (1 µl); lane 3, (1/10 µl); lane 4, (1/20 µl). Genomic DNA purified with the Raffinose method: lane 5, (1 µl); lane 6, (1/10 µl); lane 7, (1/20 µl). Genomic DNA purified with the TissueLyser method: lane 8, (1 µl); lane 9, (1/10 µl); lane 10, (1/20 µl); lane 11, λ DNA (100 ng); lane 12, λ DNA (30 ng); lane 13, GeneRuler 1 kb Plus DNA Ladder (100 ng).

Genome DNA sequencing

As a final readout for the comparison of the three DNA extraction methods, three representative DNA samples obtained with each method were mixed and used as templates in whole-genome sequencing experiments (Table 2). Libraries prepared with the different DNA samples were run in the same flow cell; the sequencing run was stopped after 4 h, generating a total of ~700 Mb. All sequencing experiments yielded a 50 × genome coverage, with the exception of the S022-V7-A3 sequencing on the templates obtained with the CTAB method, which achieved roughly 37 × coverage. Statistical analysis suggested a similar mean read length across the three methods, while the TissueLyser method generated a higher median length, possibly due to the two size-selection steps performed during DNA isolation and before library preparation. Read-length N50 values suggested that both enzymatic lysis-based methods were able to preserve DNA integrity, generating longer reads. The highest N50 values were achieved sequencing DNAs extracted with the Raffinose method, which generated multiple ultra-long reads (>100 kb). Read-length distributions showed that sequencing of DNAs extracted with the TissueLyser method generated more reads of about 10 kb in length compared to DNAs extracted with the two enzymatic-based methods (Fig. 3). A tail of longer reads can be observed for the sequencing of DNAs obtained with the CTAB and Raffinose methods.

De novo genome assembly showed that the long reads obtained sequencing Raffinose DNAs resolved genomic complexities, achieving a complete genome for both strains (Table 3). Sequencing of the CTAB DNAs generated a complete genome for S022-V7-A3, but not for S022-V3-A4 (five contigs). Incomplete genomes were generated assembling reads obtained sequencing the TissueLyser DNAs: (i) an incomplete, fragmented assembly (five contigs) of 2083754 bp was obtained for the S022-V3-A4 strain, where the longest contig is 1941254 bp; (ii) an incomplete, single-contig assembly of 2032458 bp was obtained for the S022-V7-A3 strain.

A total of 402 BUSCO genes, contained in the lactobacillales_odb10 database, were used to assess genome quality and completeness of the nanopore-only assemblies (Table 3). BUSCO identified on average 375.5±5.5 (93.4±1.4 %) and 372.5±16.5 (92.7±4.1 %) complete, single-copy genes for S022-V3-A4 and S022-V7-A3, respectively. A total of 20±5 (4.95±1.25 %) and 22.5±13.5 (5.6±3.4 %) fragmented genes, and 6±4 (1.15±0.65 %) and 7±3 (1.7±0.7 %) missing genes, were identified for the S022-V3-A4 and S022-V7-A3 strains, respectively. BUSCO analysis of GenBank deposited genomes showed scores >99 % for all but eight genomes, suggesting that nanopore reads alone do not yet allow perfect bacterial genome assemblies.

To further improve the assembly quality, the genomic architecture of both strains was finally reconstructed with a hybrid assembly approach, using both Nanopore and Illumina reads: (i) representative samples were also sequenced with the Illumina technology at MicrobesNG (Birmingham, UK) (sequence data not shown); (ii) Nanopore reads obtained sequencing DNAs extracted with the three different methods were merged, achieving an overall 150 × genome coverage (~300 Mb) for each strain. The S022-V3-A4 genome was assembled into a complete chromosome of 2086958 bp, while the S022-V7-A3 genome was assembled into a complete chromosome of 2033396 bp. Both final genomes were assigned to S. mitis with a Mash distance <0.05 and reached a BUSCO score of 100 %.
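Genome coverage can be estimated as total sequenced bases divided by genome length. The Python sketch below applies this to two datasets from Table 2; using the hybrid-assembly chromosome lengths as the genome size is an assumption made here for illustration, and the function name is hypothetical.

```python
def genome_coverage(total_bases, genome_length_bp):
    """Average depth of coverage: sequenced bases per genome position."""
    return total_bases / genome_length_bp


# Chromosome lengths of the final hybrid assemblies (assumed genome sizes).
GENOME_LENGTH = {"S022-V3-A4": 2086958, "S022-V7-A3": 2033396}

# Total bases taken from Table 2.
ctab_v7 = genome_coverage(74956919, GENOME_LENGTH["S022-V7-A3"])
raffinose_v3 = genome_coverage(101004905, GENOME_LENGTH["S022-V3-A4"])

print(f"CTAB, S022-V7-A3: {ctab_v7:.1f}x")
print(f"Raffinose, S022-V3-A4: {raffinose_v3:.1f}x")
```

The first value reproduces the roughly 37 × coverage reported for the CTAB templates of S022-V7-A3.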

Table 2.

S. mitis genome sequencing readout

Sequencing output	DNA extraction method
	CTAB		Raffinose		TissueLyser
	S022-V3-A4	S022-V7-A3	S022-V3-A4	S022-V7-A3	S022-V3-A4	S022-V7-A3
Mean read length	4476.0	4216.3	5338.5	4270.9	5121.9	4128.9
Median read length	2552	2171	2519	2047	4695	3455
Mean read quality	11.3	11.2	11.3	11.2	11.2	11.1
Number of reads	18485	17778	18920	24227	19109	57825
Read length N50	7608	7700	10610	8371	6651	5317
Total bases	82739476	74956919	101004905	103470132	97875297	238752790
Longest reads:
1	82241	107294	181199	124694	34966	59516
2	73834	93505	130614	105568	31951	52114
3	73597	84226	118691	103501	31676	47645
4	70310	82270	116448	97398	31098	38005
5	68235	80707	114758	92372	27268	36981
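The read-length N50 reported in Table 2 is the length L such that reads of length L or longer contain at least half of all sequenced bases. A minimal Python sketch of that definition follows; the read lengths in the example are hypothetical toy values, not data from this study.

```python
def read_n50(read_lengths):
    """Return the N50 of a collection of read lengths (0 if empty)."""
    total_bases = sum(read_lengths)
    running_total = 0
    # Walk from the longest read down until half of the bases are covered.
    for length in sorted(read_lengths, reverse=True):
        running_total += length
        if running_total * 2 >= total_bases:
            return length
    return 0


# Toy example with hypothetical read lengths (bases).
toy_reads = [10000, 8000, 6000, 4000, 2000]
print(read_n50(toy_reads))
```

Because the N50 is weighted by bases rather than by read count, a modest number of very long reads can raise it even when the median read length stays low, which matches the pattern seen for the enzymatic lysis-based methods.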

Fig. 3.

Read-length distribution of genome sequencing experiments. (a) S022-V3-A4 strain, (b) S022-V7-A3 strain. Read-length values are plotted on the X-axis, while the Y-axis reports the number of reads. Reads obtained sequencing the DNAs isolated using the three methods are represented with different colours: blue for CTAB, red for Raffinose and green for TissueLyser.

Table 3.

S. mitis genomes' assembly and data availability

DNA extraction method	Length*		BUSCO†		Sequence read archive no.		Genome GenBank no.‡
	S022-V3-A4	S022-V7-A3	S022-V3-A4	S022-V7-A3	S022-V3-A4	S022-V7-A3	S022-V3-A4	S022-V7-A3
CTAB	2094027 bp	2033697 bp	C:370 (92%), F:25 (6.2%), M:7 (1.8%)	C:356 (88.6%), F:36 (9%), M:10 (2.4%)	SRX13372482	SRX13372479	CP047883.1	CP067992.1
Raffinose	2087264 bp	2033595 bp	C:377 (93.8%), F:15 (3.7%), M:10 (2.5%)	C:372 (92.5%), F:23 (5.7%), M:7 (1.8%)	SRX13372483	SRX13372480
TissueLyser	2083754 bp	2032458 bp	C:381 (94.8%), F:19 (4.7%), M:2 (0.5%)	C:389 (96.8%), F:9 (2.2%), M:4 (1%)	SRX13372484	SRX13372481

*Five linear contigs were obtained assembling reads from DNAs extracted with the CTAB and the TissueLyser methods for the S. mitis S022-V3-A4 strain, while DNAs extracted with the TissueLyser method generated a single, linear contig for the S022-V7-A3 strain.

†Assembly completeness was evaluated with BUSCO v. 5.2.2 using the lactobacillales_odb10 database. A total of 402 BUSCO genes were assessed; the number of complete and single-copy (C), fragmented (F) and missing (M) genes in each assembly is reported, with their relative abundance in brackets.

‡The deposited genome sequences were assembled combining Illumina reads and Nanopore reads, obtained using DNAs isolated with the three different methods.
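The BUSCO percentages in Table 3 follow from dividing each count by the 402 genes in the database. The summary values quoted in the Results (for example 375.5±5.5 complete genes for S022-V3-A4) are reproduced by taking the midpoint and half-range of the three per-method counts; this is an observation about the numbers made here, not a description given by the authors. The function names below are hypothetical.

```python
TOTAL_BUSCO_GENES = 402

# Complete, single-copy counts from Table 3 (CTAB, Raffinose, TissueLyser).
COMPLETE = {
    "S022-V3-A4": [370, 377, 381],
    "S022-V7-A3": [356, 372, 389],
}


def busco_percent(count, total=TOTAL_BUSCO_GENES):
    """Share of BUSCO genes in a category, as a percentage."""
    return 100.0 * count / total


def midpoint_and_half_range(values):
    """Centre and half-width of the interval spanned by the values."""
    lowest, highest = min(values), max(values)
    return (lowest + highest) / 2, (highest - lowest) / 2


for strain, counts in COMPLETE.items():
    centre, spread = midpoint_and_half_range(counts)
    print(strain, f"{centre}±{spread}", f"({busco_percent(centre):.1f} %)")
```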


Conclusions

In the present study, to overcome difficulties in complete-genome assembly of the highly recombinogenic and repetitive S. mitis genome, we have set up and compared three DNA isolation methods. The methods aimed to isolate pure high molecular weight DNA from S. mitis to be used as a template in Oxford Nanopore sequencing experiments. We found that the two enzymatic lysis-based methods (CTAB and Raffinose) were able to preserve genomic DNA integrity and purity, while the mechanical lysis-based method (TissueLyser) was not able to isolate high molecular weight DNA. Oxford Nanopore sequencing showed that: (i) the DNA isolated with the mechanical lysis-based method did not generate ultra-long reads and failed to yield a complete genome assembly; (ii) the DNAs isolated using the enzymatic lysis-based methods generated ultra-long reads, up to 181199 bases, achieving circular complete genome assembly. Despite the limited sequencing coverage (~50 ×) achieved and the overall low read accuracy (mean quality 11, 92 % accuracy), the evaluation of BUSCO scores suggested that it was possible to achieve complete, good-quality genomes relying solely on Nanopore reads.
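The correspondence between a mean read quality of 11 and roughly 92 % accuracy follows from the standard Phred scale, where the error probability is 10^(−Q/10). A short Python sketch of the conversion (the function name is hypothetical):

```python
def phred_to_accuracy(quality_score):
    """Per-base accuracy implied by a Phred quality score."""
    error_probability = 10 ** (-quality_score / 10)
    return 1.0 - error_probability


accuracy = phred_to_accuracy(11)
print(f"Q11 corresponds to {100 * accuracy:.1f} % accuracy")
```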

References

1. Preparation of genomic DNA from bacteria.

Authors: K Wilson
Journal: Curr Protoc Mol Biol Date: 2001-11

2. Streptococcus mitis: walking the line between commensalism and pathogenesis.

Authors: J Mitchell
Journal: Mol Oral Microbiol Date: 2011-01-18

3. Mash: fast genome and metagenome distance estimation using MinHash.

Authors: Brian D Ondov; Todd J Treangen; Páll Melsted; Adam B Mallonee; Nicholas H Bergman; Sergey Koren; Adam M Phillippy
Journal: Genome Biol Date: 2016-06-20

4. Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads.

Authors: Ryan R Wick; Louise M Judd; Claire L Gorrie; Kathryn E Holt
Journal: PLoS Comput Biol Date: 2017-06-08

5. Resolving the complex Bordetella pertussis genome using barcoded nanopore sequencing.

Authors: Natalie Ring; Jonathan S Abrahams; Miten Jain; Hugh Olsen; Andrew Preston; Stefan Bagby
Journal: Microb Genom Date: 2018-11-21

6. Resolving Phylogenetic Relationships for Streptococcus mitis and Streptococcus oralis through Core- and Pan-Genome Analyses.

Authors: Irina M Velsko; Megan S Perez; Vincent P Richards
Journal: Genome Biol Evol Date: 2019-04-01

7. Parallel evolution of Streptococcus pneumoniae and Streptococcus mitis to pathogenic and mutualistic lifestyles.

Authors: Mogens Kilian; David R Riley; Anders Jensen; Holger Brüggemann; Hervé Tettelin
Journal: MBio Date: 2014-07-22

8. Same Exposure but Two Radically Different Responses to Antibiotics: Resilience of the Salivary Microbiome versus Long-Term Microbial Shifts in Feces.

Authors: Egija Zaura; Bernd W Brandt; M Joost Teixeira de Mattos; Mark J Buijs; Martien P M Caspers; Mamun-Ur Rashid; Andrej Weintraub; Carl Erik Nord; Ann Savell; Yanmin Hu; Antony R Coates; Mike Hubank; David A Spratt; Michael Wilson; Bart J F Keijser; Wim Crielaard
Journal: MBio Date: 2015-11-10

9. BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes.

Authors: Mosè Manni; Matthew R Berkeley; Mathieu Seppey; Felipe A Simão; Evgeny M Zdobnov
Journal: Mol Biol Evol Date: 2021-09-27