Genome diversity of domesticated Acinetobacter baumannii ATCC 19606^T strains.

Irene Artuso¹, Massimiliano Lucidi¹, Daniela Visaggio^1,2, Giulia Capecchi¹, Gabriele Andrea Lugli³, Marco Ventura³, Paolo Visca^1,2.

Abstract

Acinetobacter baumannii has emerged as an important opportunistic pathogen worldwide, being responsible for large outbreaks of nosocomial infections, primarily in intensive care units. A. baumannii ATCC 19606T is the species type strain, and a reference organism in many laboratories due to its low virulence, amenability to genetic manipulation and extensive antibiotic susceptibility. We wondered if frequent propagation of A. baumannii ATCC 19606T in different laboratories may have driven micro- and macro-evolutionary events that could determine inter-laboratory differences in genome-based data. By combining Illumina MiSeq, MinION and Sanger technologies, we generated a high-quality whole-genome sequence of A. baumannii ATCC 19606T, then performed a comparative genome analysis between A. baumannii ATCC 19606T strains from several research laboratories and a reference collection. Differences between publicly available ATCC 19606T genome sequences were observed, including SNPs, macro- and micro-deletions, and the uneven presence of a 52 kb prophage belonging to genus Vieuvirus. Two plasmids, pMAC and p1ATCC19606, were invariably detected in all tested strains. A putative replicase, a replication origin containing four 22-mer direct repeats, and a toxin-antitoxin system implicated in plasmid stability were predicted by in silico analysis of p1ATCC19606, and experimentally confirmed. This work refines the sequence, structure and functional annotation of the A. baumannii ATCC 19606T genome, and highlights some remarkable differences between domesticated strains, likely resulting from genetic drift.

Keywords: Acinetobacter baumannii ATCC 19606T; genome refinement; native plasmids; strain domestication; Φ19606 phage

Year: 2022 PMID: 35084299 PMCID: PMC8914354 DOI: 10.1099/mgen.0.000749

Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858

Data Summary

The sequence data for ATCC 19606(A) used in this study are freely available from the NCBI BioProject database under accession number PRJNA637288.

For decades, A. baumannii has been emerging as a major antibiotic-resistant nosocomial pathogen, and the type strain ATCC 19606T has been used as the reference organism for research in many laboratories. However, frequent subculturing and local differences in culture conditions can result in domestication of laboratory strains, a micro-evolutionary process driven by mutational events at the genome level that can translate into a variable phenotype. Motivated by the remarkable diversity in publicly available ATCC 19606T whole-genome sequences, we generated an accurately revised whole-genome sequence of ATCC 19606T, which will set a more solid basis for studies of the genetics and genomics of this model organism. Accurate genome assembly made it possible to characterize indigenous plasmids and a new prophage in the ATCC 19606T genome. The remarkable genome diversity among ATCC 19606T strains from different laboratories means that researchers need to specify the lineage of the strain used, as local culturing and storage practices may affect strain microevolution.

Introduction

Acinetobacter baumannii is a worldwide-distributed Gram-negative bacterium and a major opportunistic pathogen, especially among critically ill patients in intensive care units [1-3]. Due to extensive antibiotic resistance [4, 5], A. baumannii is at the top of the global priority list of pathogens for which new and effective drugs are urgently needed, according to the World Health Organization [6]. While it is likely that infections caused by bacteria now classified as A. baumannii emerged during the 1970s [7], the species was not formally established until 1986, when strain ATCC 19606T was designated as the type strain [8]. ATCC 19606T was isolated in 1948 from the urine of a US patient, and called Bacterium anitratum [9, 10], then filed with the ATCC and renamed A. baumannii in 1986 [8]. More recently, ATCC 19606T was assigned type O by pulsed-field gel electrophoresis and ST52 by multi-locus sequence typing (MLST) [11]. ATCC 19606T has been used extensively as a model strain for research on antimicrobial resistance [12-15], desiccation and osmotic shock tolerance [16], transcriptional regulation and virulence both in vitro and in vivo [17-24], with more than 180 published papers according to ISI Web of Science records (accessed July 2021). The importance of ATCC 19606T as a model organism led us to initiate a whole-genome sequencing and annotation project. While this work was in progress, three complete genome sequences of ATCC 19606T were released [CP045110.1 [25], hereafter ATCC 19606(H); AP022836 [26], hereafter ATCC 19606(O); CP046654.1 [27], hereafter ATCC 19606(M)], showing remarkable differences in sequence and annotation compared with the ATCC 19606T genome determined in our laboratory. We wondered if the maintenance and propagation of ATCC 19606T in laboratories throughout the world may have entailed micro- and macro-evolutionary events responsible for such differences, which could ultimately affect the inter-laboratory comparison of genome-based data, as inferred for other laboratory-adapted strains [28-32]. 
To test this hypothesis, we generated a high-quality complete genome sequence of ATCC 19606T employing Illumina MiSeq and MinION technologies and performed a comparative analysis with previously deposited whole-genome sequences of this strain [25-27]. To gain further insights into inter-laboratory diversification of ATCC 19606T, the whole-genome sequences of strains maintained in different research institutions throughout Europe and at the ATCC (complete genome sequence available at https://www.lgcstandards-atcc.org/Products/All/19606.aspx?geo_country=it#generalinformation) were compared. Numerous single nucleotide polymorphisms (SNPs) and insertions-deletions (INDELs), as well as the occasional loss of a prophage and the invariable presence of two indigenous plasmids, were demonstrated, probably reflecting strain domestication. Moreover, the toxin-antitoxin (TA) module and the replicase gene of the indigenous plasmid p1ATCC19606 [25] were characterized in silico and experimentally verified, providing useful information about the mechanisms of plasmid maintenance and replication in A. baumannii.

Methods

Bacterial strains and culture media

Bacterial strains used in this study (Table 1) were grown in Luria–Bertani (LB) broth or on LB agar (LA) plates at 37 °C. When needed, kanamycin (Km), gentamicin (Gm) or tetracycline (Tc) was added. The Km concentrations used for E. coli DH5α and A. baylyi BD413 were 40 µg ml−1 and 15 µg ml−1, respectively. The Gm and Tc concentrations used for E. coli DH5α were 10 µg ml−1 and 12.5 µg ml−1, respectively. Zeocin (Zeo) was added to low-salt LA [33] at 25 µg ml−1 and 250 µg ml−1 for E. coli DH5α and Acinetobacter spp., respectively.

Table 1.

Bacterial strains and plasmids

Strain or plasmid	Relevant characteristics*	Received year	Reference and/or source
Strain
A. baumannii
ATCC 19606(A)	Clinical isolate; type strain	2014	Beate Averhoff collection; genome accession number: CP058289.1
ATCC 19606(D)	Clinical isolate; type strain	2020	German Collection of Microorganisms and Cell Cultures, DSMZ GmbH (genome available at https://genomes.atcc.org/genomes/1577c3a70f334038)
ATCC 19606(H)	Clinical isolate; type strain	–	[25]; genome accession number: CP045110.1
ATCC 19606(M)	Clinical isolate; type strain	–	[27]; genome accession number: CP046654.1
ATCC 19606(O)	Clinical isolate; type strain	–	[26]; genome accession number: AP022836
ATCC 19606(S)	Clinical isolate; type strain	2019	Harald Seifert collection
ATCC 19606(T)	Clinical isolate; type strain	2010	Kevin Towner collection
ATCC 17978	Clinical isolate	2007	[114]
ACICU	MDR clinical isolate, prototype of the international clonal lineage II	2007	[115]
AB5075	MDR and hypervirulent clinical isolate	2019	[116]
A. baylyi BD413 (ADP1)	Naturally transformable strain	2017	[117]
Acinetobacter dijkshoorniae 271	Member of the ACB complex	2017	[118]; H. Seifert collection
Acinetobacter nosocomialis UKK_0361	Member of the ACB complex	2017	[66]; H. Seifert collection
Acinetobacter pittii UKK_0145	Member of the ACB complex	2017	[66]; H. Seifert collection
E. coli DH5α	recA1 endA1 hsdR17 supE44 thi-1 gyrA96 relA1 Δ(lacZYA-argF)U169 [ϕ80dlacZΔM15] F^- Nal^R	–	[60]
Plasmid
pCR-Blunt II-TOPO	E. coli cloning vector; Km^R, Zeo^R	–	ThermoFisher
p1ATCC19606	Native plasmid of A. baumannii ATCC 19606^T	–	[25]
pMAC	Native plasmid of A. baumannii ATCC 19606^T	–	[87]
pVRL1	E. coli-Acinetobacter species shuttle vector for general cloning purposes; Gm^R	–	[66]
pVRL1ΔTA	pVRL1 carrying a deletion in the TA system; Gm^R	–	[66]
pVRL2	E. coli-Acinetobacter species shuttle vector for arabinose-inducible gene expression; Gm^R	–	[66]
pME6032	Broad-host-range shuttle vector for IPTG-inducible gene expression; Tc^R	–	[68]
pCR-p1ATCC19606	Full length p1ATCC19606 ligated to pCR-Blunt II-TOPO; Km^R, Zeo^R	–	This work
pCR-p1ATCC19606Δ1	Deletion derivative of p1ATCC19606 cloned into pCR-Blunt II-TOPO; Km^R, Zeo^R	–	This work
pCR-p1ATCC19606Δ2	Deletion derivative of p1ATCC19606 cloned into pCR-Blunt II-TOPO; Km^R, Zeo^R	–	This work
pCR-p1ATCC19606Δ3	Deletion derivative of p1ATCC19606 cloned into pCR-Blunt II-TOPO; Km^R, Zeo^R	–	This work
pCR-p1ATCC19606Δ4	Deletion derivative of p1ATCC19606 cloned into pCR-Blunt II-TOPO; Km^R, Zeo^R	–	This work
pCR-p1ATCC19606Δ5	Deletion derivative of p1ATCC19606 cloned into pCR-Blunt II-TOPO; Km^R, Zeo^R	–	This work
pCR-p1ATCC19606ΔhigB2A2	pCR-p1ATCC19606 carrying a deletion in the TA system; Km^R, Zeo^R	–	This work
pVRL2higA2	higA2-like antitoxin promoterless gene cloned into pVRL2; Gm^R	–	This work
pME6032higB2	higB2-like toxin promoterless gene cloned into pME6032; Tc^R	–	This work

*Nal^R, nalidixic acid resistant; Km^R, kanamycin resistant; Tc^R, tetracycline resistant; Zeo^R, zeocin resistant; Gm^R, gentamicin resistant.

DNA manipulation

Genomic DNA was extracted and purified with the QIAamp DNA minikit (Qiagen), according to the manufacturer’s instructions. Plasmid DNA was extracted from overnight bacterial cultures using the Wizard Plus SV Minipreps DNA Purification System (Promega Corporation), according to the manufacturer’s instructions. PCR reactions were performed using Thermo Scientific Phusion High-Fidelity DNA Polymerase with primers listed in Table S1 (available in the online version of this article). FastDigest restriction enzymes were purchased from Thermo Fisher Scientific. Plasmid DNA sequencing was performed using an ABI3730 Sequencer (service by Bio-fab Research, Rome, IT).

Genome sequencing, assembly and annotation

Whole-genome sequencing of ATCC 19606(A) was performed by GenProbio srl (Parma, Italy) using a MiSeq platform (Illumina, San Diego, CA, USA) according to the supplier’s protocol. Genomic DNA extracted from ATCC 19606(A) was also subjected to whole-genome sequencing using a MinION (Oxford Nanopore, UK) at GenProbio srl, according to the supplier’s protocol. MinION long reads obtained from genome sequencing runs were used as input for a de novo genome assembly using Canu v1.8 with the estimated ‘genomeSize’ parameter set to 4.0m [34], generating a single complete sequence of the genome. Then, fastq files of Illumina paired-end reads (250 bp) and MinION long reads (ranging from 1000 to 100 065 bp) were used as input for a second genome assembly through the MEGAnnotator pipeline [35]. The SPAdes program v3.13.0 was used for the hybrid assembly of the genome sequence with the pipeline option ‘--isolate’, a list of k-mer sizes of 21, 33, 55, 77, 99, 127, and the complete genome sequence obtained through Canu v1.8 for gap closure and repeat resolution using the option ‘--trusted-contigs’ [36]. The chromosome sequence, together with those of the plasmids, was then analysed by MEGAnnotator for the prediction of protein-encoding ORFs using Prodigal [37]. Predicted ORFs were functionally annotated by means of RAPSearch2 (Reduced Alphabet based Protein similarity Search) (cutoff E value, 1×10^−5; minimum alignment length, 20 amino acids) by interrogation of the NCBI nr database [38] coupled with hidden Markov model (HMM) profile searches (http://hmmer.org/), performed against the manually curated Pfam-A database (cutoff E value, 1×10^−10). tRNA genes were identified using tRNAscan-SE v1.4 [39], while rRNA genes were detected using RNAmmer v1.2 [40]. 
Before genome submission to NCBI, a protein integrity check that takes neighbouring pairs of proteins and runs a blastp analysis on them was performed to detect frameshifts (https://www.ncbi.nlm.nih.gov/genomes/frameshifts/frameshifts.cgi). Pairs of neighbours that hit the same longer protein were annotated as pseudogenes. The presence of genomic islands (GIs) was predicted by IslandViewer 4 [41], which uses the SIGI-HMM, IslandPath-DIMOB and IslandPick prediction algorithms to evaluate codon usage and dinucleotide bias within a genome, generating a dataset of GIs. Only GIs that were predicted by at least one of the three algorithms and that did not completely overlap with any predicted phage region were considered. Insertion sequences (ISs) were predicted by ISEScan [42]. Prophage sequence prediction and annotation were performed using PHASTER (PHAge Search Tool Enhanced Release) (http://phaster.ca), and only intact (score >90) and questionable (score >70 to 90) phage genomes were considered, whereas incomplete phage genomes (score ≤70) were discarded [43]. The raw read data of the ATCC 19606(M) sequencing project (SRR10295884) were downloaded and screened for putative plasmid sequences using both plasmidSPAdes [44] and Bowtie 2 [45]. The quality of genome assemblies, namely completeness and contamination percentages, was evaluated using CheckM [46].
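The PHASTER score cutoffs stated above can be written out as a small filter. This is only an illustrative sketch: the function names and the example region scores are hypothetical and are not taken from the study; only the three score classes (>90, >70 to 90, ≤70) come from the text.

```python
def classify_prophage(score: float) -> str:
    """Map a PHASTER completeness score to the class used in the text."""
    if score > 90:
        return "intact"
    if score > 70:
        return "questionable"
    return "incomplete"


def keep_prophage(score: float) -> bool:
    """Intact and questionable regions are kept; incomplete ones are discarded."""
    return classify_prophage(score) != "incomplete"


# Hypothetical region scores, for illustration only.
regions = {"region_1": 150, "region_2": 80, "region_3": 70}
retained = [name for name, score in regions.items() if keep_prophage(score)]
print(retained)
```

Note that a score of exactly 90 falls in the questionable class and a score of exactly 70 in the incomplete class, following the inequalities given in the text.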

Comparative genome analyses and mutation detection

ProgressiveMauve [47] was used for pairwise comparison of the ATCC 19606(A) assembled genome with the genome assemblies of ATCC 19606 strains D, M, H and O. To detect SNPs and microindels (insertion or deletion events), sequence reads belonging to ATCC 19606(A) were mapped against the ATCC 19606(M) and ATCC 19606(H) genome sequences with BWA mem v0.7.17 [48], using default parameters. A consensus pileup was produced using SAMtools v1.10 [49]. Then, SNPs and microindels were called using VarScan v2.3.6 [50] with the following parameters: minimum coverage (8), min-reads2 (2), min-avg-qual (15), min-var-freq (0.5), P-value (99×10^−2). SNPs and microindels were manually inspected in the output files with Artemis [51]. All discrepancies, i.e. mutations inferred from comparative genome analysis, between ATCC 19606(A), ATCC 19606(M) and ATCC 19606(H) were confirmed by Sanger sequencing of PCR products encompassing the mutated site using an ABI3730 Sequencer (service by Bio-Fab Research, Rome, Italy). For each mutated protein product, a corresponding orthologue from strain ACICU (CP031380.1) was identified using blastp analysis (https://blast.ncbi.nlm.nih.gov/Blast.cgi) with a cutoff value of E<10^−5 and 50 % identity across at least 80 % of the protein sequence. Twenty-two genes showing SNPs and INDELs in pairwise comparisons between ATCC 19606(A), ATCC 19606(H) and ATCC 19606(M) genomes were concatenated, and aligned with MAFFT v7, selecting the G-INS-i method [52]. A phylogenetic tree based on the concatenated alignment was constructed using the neighbour-joining (NJ) method and visualized using iTOL v6.1.2 [53]. The tree was rooted on ATCC 19606(A).
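To make the variant-calling thresholds concrete, the sketch below applies the four count-based cutoffs quoted above (coverage 8, supporting reads 2, average quality 15, variant frequency 0.5) to hypothetical pileup positions. It is a simplified stand-in and not VarScan itself: the `Site` record, its fields and the example values are assumptions made for illustration, and the P-value test that VarScan also applies is not reproduced here.

```python
from dataclasses import dataclass

# Thresholds quoted in the text.
MIN_COVERAGE = 8
MIN_READS2 = 2
MIN_AVG_QUAL = 15
MIN_VAR_FREQ = 0.5


@dataclass
class Site:
    """Hypothetical summary of one pileup position."""
    position: int
    ref_reads: int       # reads supporting the reference base
    var_reads: int       # reads supporting the variant base
    avg_var_qual: float  # mean base quality of variant-supporting reads


def passes_thresholds(site: Site) -> bool:
    """Return True if the site meets all four count-based cutoffs."""
    coverage = site.ref_reads + site.var_reads
    if coverage < MIN_COVERAGE:
        return False
    if site.var_reads < MIN_READS2:
        return False
    if site.avg_var_qual < MIN_AVG_QUAL:
        return False
    return site.var_reads / coverage >= MIN_VAR_FREQ


# Hypothetical positions, for illustration only.
sites = [
    Site(100, ref_reads=2, var_reads=10, avg_var_qual=30),   # passes
    Site(200, ref_reads=5, var_reads=2, avg_var_qual=30),    # coverage too low
    Site(300, ref_reads=10, var_reads=4, avg_var_qual=30),   # frequency too low
    Site(400, ref_reads=1, var_reads=9, avg_var_qual=10),    # quality too low
]
called = [s.position for s in sites if passes_thresholds(s)]
print(called)
```

With a minimum variant frequency of 0.5, only variants carried by at least half of the reads at a position are retained, which suits a haploid bacterial genome where true mutations are expected to be near-fixed in the read pile.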

Detection of phage Φ19606

The presence of phage Φ19606 in ATCC 19606 strains A, D, S and T was verified by PCR using the primer pairs listed in Table S1 and 30 ng of bacterial DNA as a template. To evaluate the loss of Φ19606 during serial propagation steps, ten 1 mm colonies were randomly taken from primary plating on LA of the original ATCC 19606(D) vial and suspended in 500 µl of saline at 4 °C for 30 min with intermittent vortexing to maximize the separation of individual cells. The absence of cell aggregates was verified by bright-field microscopy. Then 1 µl of this suspension was streaked onto an LA plate and incubated at 37 °C for 24 h, while the rest was centrifuged for total DNA extraction from the cell pellet using QIAamp DNA minikit (Qiagen), according to the manufacturer’s instructions. Fourteen serial passages were repeated according to this procedure, for a total of 15 passages including the primary plating from the DSMZ stock vial. For each propagation step of ATCC 19606(D), the presence/absence of Φ19606 was verified by PCR using the primers listed in Table S1 and 30 ng of the purified genome as template.

Phylogenetic analysis of phage Φ19606

The complete genome sequences of phages belonging to the Siphoviridae family were retrieved from the NCBI database. The resulting dataset contained the genomes of 19 phages (Table S2), including Φ19606. All pairwise comparisons of the nucleotide sequences were conducted using the Genome-blast Distance Phylogeny (GBDP) method [54] under settings recommended for prokaryotic viruses [55] using VICTOR (https://ggdc.dsmz.de/victor.php). The resulting intergenomic distances were used to infer a balanced minimum evolution tree with branch support via FASTME including SPR postprocessing [56] for the formula D0. Branch support was inferred from 100 pseudo-bootstrap replicates each. The tree was rooted at the midpoint [57] and visualized with iTOL [53]. Taxon boundaries at the genus level were estimated with the optsil programme [58], with the recommended clustering threshold of 0.84 [56] and an F value (fraction of links required for cluster fusion) of 0.5 [59]. The presence of the Φ19606 sequence was searched in all strains using blastn against the NCBI database with cutoff values of 95 % identity and 85 % coverage of consecutive segments from the query sequence. For each strain containing the Φ19606 sequence, MLST was performed according to the Pasteur scheme (https://pubmlst.org/bigsdb?db=pubmlst_abaumannii_pasteur_seqdef) and the corresponding sequence type (ST) was determined.

Preparation of Acinetobacter spp. and E. coli DH5α competent cells

E. coli competent cells were prepared by the rubidium-calcium chloride method and transformed according to the heat-shock protocol [60]. Electrocompetent cells of Acinetobacter spp. were prepared according to Lucidi and coworkers [22]. Plasmid DNA was introduced into Acinetobacter spp. by electroporation as previously described [22]. Naturally competent A. baylyi BD413 was transformed with 150 ng of plasmid, as previously reported [61].

Deletion analysis of p1ATCC19606

DNA fragments encompassing different regions of the p1ATCC19606 plasmid were amplified by PCR using primer pairs listed in Table S1. The amplicons were cloned into pCR-Blunt II-TOPO (pCR, Thermo Fisher Scientific) according to the manufacturer’s instructions, and the resulting constructs (Table 1) were introduced by electroporation into different Acinetobacter spp. The transformation efficiency (TE) was calculated as the ratio between the c.f.u. counts on selective agar plates and the amount of DNA used for transformation, and expressed as c.f.u. μg−1 of plasmid DNA.
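The TE definition above reduces to a one-line ratio. The following sketch is an assumption-laden illustration: the function name, the `plated_fraction` correction (for plating only part of the transformation mix) and the example numbers are hypothetical and are not described in the text, which specifies only c.f.u. counts divided by the amount of DNA.

```python
def transformation_efficiency(colonies: int, dna_ng: float,
                              plated_fraction: float = 1.0) -> float:
    """Return c.f.u. per microgram of plasmid DNA.

    colonies        -- colonies counted on the selective plate
    dna_ng          -- plasmid DNA used in the transformation, in nanograms
    plated_fraction -- fraction of the transformation mix that was plated
                       (hypothetical correction, 1.0 means all of it)
    """
    if dna_ng <= 0 or not 0 < plated_fraction <= 1:
        raise ValueError("dna_ng must be > 0 and plated_fraction in (0, 1]")
    total_cfu = colonies / plated_fraction   # scale up to the whole mix
    dna_ug = dna_ng / 1000.0                 # convert ng to µg
    return total_cfu / dna_ug


# Hypothetical example: 100 colonies from half of a mix containing 500 ng DNA.
print(transformation_efficiency(100, dna_ng=500, plated_fraction=0.5))  # 400.0
```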

Homology searches and protein modelling

Putative protein-coding genes from plasmid p1ATCC19606 were manually annotated by integrating the MEGAnnotator [35] and Blast2GO (Basic Local Alignment Search Tool 2 Gene Ontology; [62]) outputs. The putative TA system of p1ATCC19606 was characterized by predicting the protein structures using I-TASSER [63] and SWISS-MODEL [64]. MatchMaker analyses and superimposition of proteins were performed using the UCSF Chimera software [65].

Impact of the higB2A2-like toxin-antitoxin system on p1ATCC19606 stability

The p1ATCC19606 plasmid was ligated to the pCR plasmid to enable replication in E. coli, yielding pCR-p1ATCC19606. Subsequently, a deletion of the higB2A2-like operon (293 bp) was generated using primers ΔTA FW and ΔTA RV (Table S1) and a Q5 site-directed mutagenesis kit (New England BioLabs). The resulting construct, named pCR-p1ATCC19606ΔhigB2A2, was introduced into E. coli DH5α for plasmid stability testing. Briefly, bacterial strains were first grown for 18 h in LB with 40 µg ml−1 Km, then washed and diluted 1000-fold in LB (without antibiotic). Bacterial cultures were refreshed (1 : 1000) every 12 h four times, for a total of 48 h. Bacterial colony counts were determined on LA (N) and LA supplemented with 40 µg ml−1 Km (N Ant). Plasmids pVRL1 and pVRL1ΔTA were used as controls for plasmid stability, as reported elsewhere [66]. Plasmid stability is defined by the ratio N Ant /N [67].
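As a worked illustration of the stability measure, the sketch below computes the N Ant /N ratio and gives a rough estimate of the number of generations spanned by the serial 1 : 1000 dilutions. The generation estimate is an assumption added here, not a figure from the study: it supposes that each diluted culture regrows to its previous density, so each 1000-fold dilution corresponds to about log2(1000) doublings. Function names and colony counts are hypothetical.

```python
import math


def plasmid_stability(n_ant: float, n: float) -> float:
    """Fraction of cells retaining the plasmid: counts on LA + Km over counts on LA."""
    if n <= 0:
        raise ValueError("total colony count N must be positive")
    return n_ant / n


def estimated_generations(dilution_factor: float, cycles: int) -> float:
    """Approximate doublings, assuming full regrowth after every dilution."""
    return cycles * math.log2(dilution_factor)


# Hypothetical counts: 50 colonies on LA + Km versus 200 on plain LA.
print(plasmid_stability(50, 200))                 # 0.25
# Four 1:1000 growth cycles, under the regrowth assumption stated above.
print(round(estimated_generations(1000, 4), 1))   # 39.9
```

A ratio close to 1 indicates that nearly all cells kept the plasmid without selection, whereas a lower ratio for the TA deletion construct than for its parent would point to a stabilizing role of the TA system.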

Construction of plasmids directing the controlled expression of higB2 and higA2 genes

The promoterless higA2-like antitoxin gene was amplified from p1ATCC19606 with primers pVRL2higA2_FW and pVRL2higA2_RV (Table S1), and the 357 bp amplicon was ligated to the XhoI and HindIII sites of the pVRL2 vector [66], yielding pVRL2higA2. The promoterless higB2-like toxin gene was amplified by PCR from p1ATCC19606 with primers pME6032higB2_FW and pME6032higB2_RV (Table S1), and the 368 bp amplicon was ligated to the SacI and XhoI sites of the pME6032 vector [68], yielding pME6032higB2. Plasmids pVRL2higA2 and pME6032higB2 were individually introduced into E. coli DH5α, and transformants were selected on LA supplemented with 10 µg ml−1 Gm or 12.5 µg ml−1 Tc, respectively. Subsequently, pME6032higB2 was introduced into E. coli DH5α(pVRL2higA2) and transformants were routinely maintained on LA supplemented with 10 µg ml−1 Gm, 12.5 µg ml−1 Tc and 0.5 % l-arabinose to ensure plasmid selection and expression of the higA2-like antitoxin gene. To test the neutralizing effect of the HigA2 antitoxin on the HigB2 toxin, E. coli DH5α carrying both pVRL2higA2 and pME6032higB2 was grown in LB supplemented with 10 µg ml−1 Gm, 12.5 µg ml−1 Tc and 0.5 % l-arabinose, then washed in saline, diluted to OD600=0.001 (corresponding to ca 5×10^5 c.f.u. ml−1) and dispensed in a microtitre plate containing LB supplemented with 12.5 µg ml−1 Tc and different concentrations of l-arabinose (from 1 to 0.015 %) and isopropyl-β-D-1-thiogalactopyranoside (IPTG; from 300 to 9.4 µM). After 24 h incubation at 37 °C, growth was determined by OD600 measurements in a Spark 10M microtitre reader (Tecan). E. coli DH5α(pVRL2higA2 pME6032higB2) was also plated on LA supplemented with 0.5, 0.25, 0.12 and 0 % l-arabinose. Then paper discs containing IPTG (from 5 mM to 0.08 mM) were placed on the plate, and the zone of inhibition (ZOI) was determined after 24 h incubation at 37 °C.
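The inducer ranges quoted above are consistent with two-fold serial dilutions, although the text gives only the endpoints. Under that assumption (the dilution factor and the number of steps are inferred here, not stated in the study), the concentration series can be sketched as follows; the function name is hypothetical.

```python
def twofold_series(start: float, steps: int) -> list[float]:
    """Return `steps` concentrations, halving from `start` at each step."""
    return [start / 2 ** i for i in range(steps)]


# Assumed two-fold series matching the reported endpoints.
arabinose_percent = twofold_series(1.0, 7)   # 1 % down to about 0.015 %
iptg_um = twofold_series(300.0, 6)           # 300 µM down to about 9.4 µM
iptg_disc_mm = twofold_series(5.0, 7)        # 5 mM down to about 0.08 mM

print(arabinose_percent)
print(iptg_um)
print(iptg_disc_mm)
```

Crossing the arabinose and IPTG series in a microtitre plate gives a checkerboard in which antitoxin induction varies along one axis and toxin induction along the other, so growth in each well reports the balance between the two.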

Results and discussion

Historical background of ATCC 19606T

In the course of a study on ‘paracolon bacilli’ (old designation for coliform bacteria that do not ferment lactose) carried out by Dr K. Wheeler and Dr C.A. Stuart, several isolates were sent to Dr I.G. Schaub and Dr F.D. Hauber (Johns Hopkins University, Baltimore, MD, USA) for laboratory identification. Among these, one isolate (no. 81 according to Schaub’s and Hauber’s nomenclature) was reported to be an unidentifiable Bacterium sp. responsible for a urinary tract infection. The isolate was proposed to belong to the species Bacterium anitratum [9]. This isolate was sent to Dr E.O. King (CDC, Atlanta, GA, USA) in 1965, and a descendant was sent to Dr R. Hugh (George Washington University, Washington, DC, USA) before being filed with ATCC in 1966, and made publicly available with the ATCC 19606 designation (chain of custody available at https://www.lgcstandards-atcc.org/products/all/19606.aspx#history). A first taxonomic reassignment of Bacterium anitratum to the genus Acinetobacter was proposed in 1964, but it was not accepted by the scientific community [69]. Thereafter, ATCC 19606 was formally designated as the type strain of B. anitratum [10]. In the 1980s, DNA–DNA hybridization studies reclassified B. anitratum as [70] and, subsequently, the definitive name of A. baumannii ATCC 19606T was assigned to the original Schaub and Hauber no. 81 isolate [8].

Genealogy of ATCC 19606T strains from different sources

We wondered if maintenance and sequential propagation of ATCC 19606T in different laboratories may result in genotype differences. To address this issue, four ATCC 19606 strains (A, D, S, T) were obtained from three independent research laboratories in Europe and the German Collection of Microorganisms and Cell Cultures-DSMZ GmbH (Braunschweig, Germany) (Table 1). ATCC 19606(A) was directly purchased from ATCC by microbiologists of the Robert Koch Institute (Wernigerode, Germany), then given to Professor B. Averhoff of the Goethe University (Frankfurt am Main, Germany) before reaching our laboratory. ATCC 19606 strain D was purchased by our laboratory from the DSMZ. ATCC 19606 strains S and T were shared by Dr P.A.D. Grimont of Institut Pasteur (Paris, France) with Professor H. Seifert (Institut für Medizinische Mikrobiologie, Immunologie und Hygiene, Cologne, Germany) and Professor K.J. Towner (Department of Clinical Microbiology, Nottingham University Hospitals NHS Trust, Queen’s Medical Centre, Nottingham, UK), respectively, then sent to our laboratory. From the history of the various strains, it emerged that ATCC 19606(A, S, T) were serially propagated several times (personal communications), while ATCC 19606(D) was propagated at least five times before reaching our laboratory (see previous section Historical background of ATCC 19606T and https://www.lgcstandards-atcc.org/products/all/19606.aspx?geo_country=it#history).

General features of the ATCC 19606(A) genome

The whole-genome sequence of ATCC 19606(A) was determined in our laboratory using a hybrid de novo assembly combining Illumina paired-end reads and MinION long reads. The resulting ATCC 19606(A) genome sequence is a complete genome with 696-fold coverage (426-fold from short reads and 270-fold from long reads), a consensus sequence of 3 927 723 bp and a mean GC content of 39.18 % (Fig. 1). The ATCC 19606(A) genome contains 3618 ORFs, 71 tRNAs and 6 rRNA operons (six copies each of the 5S, 16S and 23S rRNA genes). Two indigenous plasmids, named p1ATCC19606 and pMAC, were entirely reconstructed, resulting in 9540 and 7655 bp sequences with 11 and 13 predicted ORFs, respectively. The ATCC 19606(A) chromosome and plasmid sequences are now publicly available under GenBank accession numbers CP058289, CP058290 and CP058291, respectively. ATCC 19606(A) was assigned to ST52 with the MLST Pasteur scheme (cpn60-fusA-gltA-pyrG-recA-rplB-rpoB, 3-2-2-7-9-1-5) [71], in accordance with a previous report [11]. Six ISs were predicted in the ATCC 19606(A) genome: five already reported by Zhu and coworkers [27], and an additional IS belonging to the ISL3 family spanning positions 2 528 296–2 531 742 of the chromosome (Table S3). A total of 11 GIs were predicted in the chromosome (Fig. 1, Table S4), many of which are located in the proximity of tRNA genes, which serve as integration sites for exogenous genetic elements [72, 73]. Several genes located in the predicted GI-10 (named GI19606 in [25]) encode proteins involved in arsenic resistance (arsenic resistance protein ArsH, and arsenic reductase and transporter [HTZ92_3354-HTZ92_3356]) and sulfamethoxazole resistance (Sul2 [HTZ92_3349]). Moreover, the gene encoding a putative chlorhexidine efflux transporter (HTZ92_1759), plausibly implicated in resistance to this disinfectant, was detected in GI-3. In GI-4, two efflux pumps (HTZ92_2076 and HTZ92_2085) were detected, whose role is still unknown. 
The lprI lipoprotein gene (HTZ92_2140) was detected in GI-5. LprI was previously reported to act as a lysozyme inhibitor [74] and could therefore contribute to evasion of the host’s innate immunity [75].
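The coverage figures above follow from simple arithmetic: mean depth is the total number of sequenced bases divided by the genome length, and the short-read and long-read depths add up to the combined value. The sketch below illustrates this using the reported depths and chromosome length; the function names are hypothetical, and the base totals it prints are back-calculated for illustration rather than reported in the study.

```python
GENOME_LENGTH_BP = 3_927_723   # consensus chromosome length from the text
SHORT_READ_DEPTH = 426         # Illumina fold coverage from the text
LONG_READ_DEPTH = 270          # MinION fold coverage from the text


def mean_depth(total_bases: int, genome_length: int) -> float:
    """Mean fold coverage: sequenced bases divided by genome length."""
    return total_bases / genome_length


def bases_for_depth(depth: int, genome_length: int) -> int:
    """Bases needed to reach a given mean depth (back-calculation)."""
    return depth * genome_length


combined_depth = SHORT_READ_DEPTH + LONG_READ_DEPTH
print(combined_depth)  # 696

# Implied sequencing yields, in gigabases, for each technology.
for label, depth in (("short reads", SHORT_READ_DEPTH), ("long reads", LONG_READ_DEPTH)):
    gb = bases_for_depth(depth, GENOME_LENGTH_BP) / 1e9
    print(f"{label}: about {gb:.2f} Gb")
```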

Fig. 1.

Chromosome map of ATCC 19606(A). Circular map created by the CGView server. From the outermost to innermost, the tracks show the genes on positive (dark blue) and negative (light blue) strands, ORFs on positive and negative strands (with colours indicating COG classifications; [119]), prophages (red) with dotted lines indicating the excision site of the missing prophage, GIs (orange), GC content (green) and GC skew (purple and light green for positive and negative, respectively). Position 1 in ATCC 19606(A) corresponds to position 3772737 in ATCC 19606(H) and position 1094161 in ATCC 19606(M). Both genomes are in reverse orientation relative to ATCC 19606(A).

Comparative quality assessment of available ATCC 19606T complete genome sequences

To assess the quality of the assembled ATCC 19606(A) genome sequence, a pairwise comparison between three publicly available complete genome sequences of the same strain retrieved from the NCBI database, i.e. ATCC 19606(M) (CP046654.1 [27]), ATCC 19606(O) (AP022836 [26]) and ATCC 19606(H) (CP045110.1 [25]), and the sequence of ATCC 19606(D) deposited at the ATCC web site (https://genomes.atcc.org/genomes/1577c3a70f334038), was conducted. All genome assemblies showed >99 % completeness and no contamination (Fig. 2). Screening with the MAUVE multiple genome aligner [47] unveiled several SNPs and small INDELs (Fig. 2). ATCC 19606(O) showed the highest number of mutations with respect to ATCC 19606(H), ATCC 19606(M) and ATCC 19606(D), with a total of 1698, 1683 and 1843 SNPs plus INDELs, respectively (Fig. 2). These results are consistent with previous observations by Hamidian and coworkers [25]. Manual screening of the ORFs revealed 263 frameshifted genes in ATCC 19606(O), which also showed the lowest annotation accuracy, with 1284 hypothetical proteins. Although both Racon v1.3.1.1 and Pilon v1.20.1 were used to improve the sequence quality of ATCC 19606(O) [26], the high number of SNPs, INDELs and frameshifted ORFs in its genome suggests they could result from a low-quality sequence. Therefore, further comparative genome analyses were limited to the ATCC 19606(A) and the publicly available genomes with the highest quality represented by ATCC 19606(M) and ATCC 19606(H).

Fig. 2.

Relevant features of genome sequences of different ATCC 19606T strains.

Differential genomic traits of ATCC 19606 strains A, M and H

ATCC 19606(A) and ATCC 19606(M) exhibited similar annotation accuracy (Fig. 2), while ATCC 19606(H) showed lower accuracy, with 1599 hypothetical proteins. Discrepancies were also observed in the number of annotated pseudogenes, specifically six in ATCC 19606(A), none in ATCC 19606(H) and 45 in ATCC 19606(M), including frameshifted, incomplete or internally stopped ORFs [27]. Eleven out of 45 putative pseudogenes annotated in ATCC 19606(M) encode frameshifted proteins, and likely result from DNA sequencing errors; 28 putative pseudogenes shared 100 % sequence identity with ATCC 19606(A) counterparts but were differently annotated in ATCC 19606(M) and ATCC 19606(A) (Table S5). The remaining six putative pseudogenes were detected also in ATCC 19606(A), according to the NCBI annotation pipeline, which interprets pairs of neighbouring proteins that hit the same longer protein in blastp search as encoded by a single pseudogene. This implies that the pair may represent a single gene that has gained frameshift or other mutations along the way. Although the increasing number of pseudogenes is suggestive of genome erosion [76], in this case, the difference in the number of predicted pseudogenes seems to be due to discrepancies in sequencing and annotation. It was also noticed that ATCC 19606(A) lacks three tRNA genes compared with ATCC 19606 strains M and H (Fig. 2): one tRNA-Gly (GO593_07205) arranged in tandem with another tRNA-Gly and two tRNA-Gln (GO593_10340 and GO593_10345). The absence of these tRNA genes was confirmed in both Illumina and Nanopore assemblies.

SNPs and INDELs

A comparison of ATCC 19606(A), considered as reference, with ATCC 19606(M) and ATCC 19606(H) genome sequences revealed 67 individual mutations: 47 in ATCC 19606(M) only, 12 in ATCC 19606(H) only, and eight in both ATCC 19606(M) and ATCC 19606(H) (Fig. S1). Among the 47 mutations detected in ATCC 19606(M), 13 insertions mapped inside intergenic regions (Table S6) and one SNP resulted in a nucleotide substitution in GO593_04005, encoding a tRNA-Arg. None of the intergenic mutations mapped within putative promoter sequences, as predicted with the BPROM software [77]. The remaining 33 mutations were located within ORFs, causing frameshifts in 12 genes of ATCC 19606(M), compared with ATCC 19606(A) (Table 2). Multiple mutations were detected in GO593_18950 (ten insertions and two SNPs) and in GO593_18955 (11 insertions), predicted to encode a sodium/glutamate symporter and an alpha-beta fold hydrolase, respectively. These mutations were detected in neither ATCC 19606(A) nor ATCC 19606(H), in which the full-length protein products matched their orthologues in strain ACICU. One insertion in the ATCC 19606(M) GO593_04990 gene (encoding a putative 3′−5′ exonuclease domain-containing protein) produced a shift in the stop codon, leading to an extended protein product.

Table 2.

Comparative analysis of SNPs detected in ATCC 19606(M), ATCC 19606(H) and ATCC 19606(A) genomes

Mutation	Position in ATCC 19606(M)	Position in ATCC 19606(H)	Nucleotide change	Aminoacid change	Protein length (aa) in ATCC 19606(M)/ ATCC 19606(A)/ ATCC 19606(H)	Gene designation in ATCC 19606(M)/ ATCC 19606(H)	Gene designation in ATCC 19606(A)	Gene product	Gene designation in ACICU (Protein length)
	SNPs between ATCC 19606(M) and ATCC 19606(A)
Frameshift	175 678		T ->TG	G80W	85/513/513	GO593_01010/FQU82_02766	HTZ92_0804	MFS transporter	DMO12_ 07743 (513)
	426 614		A ->AT	L235F	258/548/548	GO593_02100/FQU82_02985	HTZ92_0587	Phospholipid carrier-dependent glycosyltransferase	DMO12_08688 (548)
	1 052 567		G ->GA	F207I	222/213/213	GO593_04990/FQU82_03557	HTZ92_0046	3'−5' exonuclease domain-containing protein 2	DMO12_10425 (213)
	1 452 728		A ->AG	L101T	144/250/250	GO593_06935/FQU82_00154	HTZ92_3299	Transcriptional regulator LldL	DMO12_00333 (250)
	2 027 426		A ->AC	I100Y	113/462/462	GO593_09615/FQU82_00690	HTZ92_2784	Aminodeoxychorismate synthase component I	DMO12_02013 (462)
	2 761 191		T ->TG	L96I	100/209/209	GO593_13255/FQU82_01413	HTZ92_2145	Hypothetical protein	DMO12_03804 (209)
	3 154 380		C ->CT	N62E	67/181/181	GO593_15095/FQU82_1785	HTZ92_1775	Acyltransferase	DMO12_04716 (181)
	3 271 196		T ->TG	R380Q	381/576/576	GO593_15665/FQU82_01898	HTZ92_1663	Dipeptide ABC transporter ATP-binding protein	DMO12_05028 (576)
	3 974 985		T ->TC	F157I	172/212/212	GO593_18930/FQU82_02556	HTZ92_1005	TetR family transcriptional regulator	DMO12_07332 (212)
	3 976 370		G ->GA	E45R	75/300/300	GO593_18945/FQU82_02559	HTZ92_1002	ATPase AAA	DMO12_07341 (300)
	3 977 672		T ->TC*	W9L	19/411/411	GO593_18950/FQU82_02560	HTZ92_1001	Sodium/glutamate symporter gltS	DMO12_07344 (411)
	3 978 830		A ->ATC†	21L‡	20/382/382	GO593_18955/FQU82_02561	HTZ92_1000	Alpha-beta fold hydrolase	DMO12_07347 (382)
	SNPs between ATCC 19606(M, H) and ATCC 19606(A)
Stop codon	711 049	3 389 623	T ->A	K337‡	403/345/403§	GO593_03295/FQU82_03219	HTZ92_0363	Methyltransferase	DMO12_09396 (286)
Aminoacid change	261 181	2 939 754	G ->A	G40D	213/213/213	GO593_01345/FQU82_02834	HTZ92_0737	Hypothetical protein	DMO12_08034 (213)
Aminoacid change	1 149 064	3 827 640	A ->T	V56D	265/265/265	GO593_05440/FQU82_03650	HTZ92_3585	MBL fold metallo-hydrolase	DMO12_10692 (265)
	1 168 711	3 847 287	A ->T	T36S	70/70/70	GO593_05520/FQU82_03666	HTZ92_3569	Hypothetical protein	DMO12_10740 (70)
	3 331 716	2 029 407	A ->G	V94A	711/711/711	GO593_15945/FQU82_01954	HTZ92_1607	TonB-dependent siderophore receptor	DMO12_05199 (711)
Synonymous	20 323	2 698 895	T ->C	S117	475/478/478	GO593_00130/FQU82_02587	HTZ92_0978	M48 family metalloprotease	DMO12_07425 (478)
	2 190 622	887 198	T ->C	I319	394/394/394	GO593_10430/FQU82_00850	HTZ92_2632	Hypothetical protein	DMO12_02541 (394)
Intergenic	1 184 582	3 863 158	A ->T	–	–	GO593_05600-GO593_05605/FQU82_03682-FQU82_03683	HTZ92_3552-HTZ92_3553	–	–
	SNPs between ATCC 19606(H) and ATCC 19606(A)
Stop codon		2 834 189	T ->A	117L‡	178/117/178	GO593_00915/FQU82_02747	HTZ92_0820	Peptidase C39	DMO12_07686 (178)
Aminoacid change		2 573 642	C ->A	R190L	201/201/201	GO593_18455/FQU82_02459	HTZ92_1102	Potassium-transporting ATPase C chain	DMO12_07086 (201)
Synonymous		3 278 427	C ->T	R362||	549/549/533¶	GO593_02765/FQU82_03113	HTZ92_0463	Lipid A phosphoethanolamine transferase	DMO12_09075 (533)

*Only the first nucleotide change is shown: GO593_18950 contains ten insertions and two SNPs.

†Only the first nucleotide change is shown: GO593_18955 contains 11 insertions.

‡Indicates a stop codon.

§The predicted ORF in ATCC 19606(A) starts 27 nucleotides upstream, relative to ATCC 19606(M) and ATCC 19606(H).

||Only the first nucleotide change is shown: FQU82_03113 contains ten SNPs.

¶The predicted ORF in ATCC 19606(A) starts 48 nucleotides downstream, relative to ATCC 19606(M) and ATCC 19606(H).
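
The "Nucleotide change" column of Table 2 uses a REF ->ALT notation in which an insertion repeats the reference base (for example, T ->TG). As a minimal sketch, assuming that this notation holds for every row, the helper below sorts such strings into SNPs and insertions and re-checks the mutation tally given in the text. The function name `classify_change` is hypothetical and is not taken from the original study.

```python
def classify_change(change: str) -> str:
    """Classify a 'REF ->ALT' string from Table 2 (footnote symbols removed)."""
    ref, alt = (part.strip() for part in change.split("->"))
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(alt) > len(ref) and alt.startswith(ref):
        return "insertion"
    if len(ref) > len(alt) and ref.startswith(alt):
        return "deletion"
    return "complex"


# A few entries copied from the "Nucleotide change" column of Table 2.
examples = ["T ->TG", "A ->ATC", "T ->A", "C ->T"]
labels = {change: classify_change(change) for change in examples}

# Mutation counts as stated in the text (M only, H only, shared by M and H).
counts = {"M only": 47, "H only": 12, "M and H": 8}
total_mutations = sum(counts.values())

if __name__ == "__main__":
    for change, label in labels.items():
        print(f"{change:8s} {label}")
    print("total mutations:", total_mutations)
```

The classification is purely string-based: it does not consult the genome sequences and therefore cannot detect annotation or sequencing errors.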

Of the eight mutations shared by both ATCC 19606(M) and ATCC 19606(H) (Table 2), one SNP mapped in an intergenic region, and two SNPs were synonymous mutations. Only one SNP introduced a stop codon in HTZ92_0363, whose predicted product is a methyltransferase, leading to the production of a truncated protein form in ATCC 19606(A), compared with the full-length protein predicted for both ATCC 19606(M) and ATCC 19606(H) genomes. Four SNPs detected in both ATCC 19606(M) and ATCC 19606(H) caused amino acid substitutions relative to the corresponding ATCC 19606(A) proteins (Table 2). Twelve additional mutations were detected by direct comparison between ATCC 19606(H) and ATCC 19606(A), taken as reference genome; ten were synonymous changes in the pmrC gene encoding lipid A phosphoethanolamine transferase (FQU82_03113), one caused a non-conservative (R→L) amino acid substitution, and one generated a stop codon in the peptidase C39 gene (FQU82_02747), resulting in a truncated protein in ATCC 19606(H). To validate the above observations, Sanger sequencing of the PCR-generated genomic regions encompassing individual SNPs was performed using the ATCC 19606(A) DNA as template. Results confirmed the sequence determined by the hybrid assembly for all SNP-containing regions (100 % identity). Since multiple validations of the reconstructed genome sequence of ATCC 19606(A) made it possible to exclude sequencing errors, it can be argued that ATCC 19606T strains domesticated in different laboratories have diversified their genome sequence.

Domesticated ATCC 19606 strains A, S and T lack the Φ19606 prophage

A preliminary comparison between the ATCC 19606(A) and ATCC 19606(M) genomes revealed one major structural difference: ATCC 19606(A) lacks a 52 kb prophage region that spans GO593_11545 to GO593_11900 in ATCC 19606(M). In ATCC 19606(A), the corresponding site lies between a gene coding for a hypothetical protein (HTZ92_2409) and the ssrS gene (HTZ92_ssRs). This prophage region was also detected in the ATCC 19606(D) and ATCC 19606(H) genomes and was predicted by PHASTER as an ‘intact’ prophage in the region 2 438 683–2 490 916 of the ATCC 19606(M) genome, with a GC content of 38.22 %. We propose to designate this putative prophage Φ19606 (Fig. 3a). A 60-nucleotide repeat (N60), overlapping the sequence of the ssrS gene, which encodes the 6S regulatory RNA, was detected at both ends of the prophage in ATCC 19606 strains D, H and M. Notably, N60 was present as a single copy in ATCC 19606(A), which lacks Φ19606 (Fig. 3b). Genome inspection of other A. baumannii strains, such as AYE (CU459141.1) and AB307-0294 (CP001172.2), revealed that the ssrS gene is also an insertion site for other phages [78]. No significant homology was observed between Φ19606 and known phage genomes. Integration of Φ19606 in ATCC 19606(M) occurred immediately downstream of the stop codon of the ssrS gene, causing target site duplication of the N60 sequence. Consequently, N60 could represent a homology region between Φ19606 and the ATCC 19606(M) genome, possibly implicated in phage integration. Similar repeats could constitute possible recognition sites for the predicted phage terminase (GO593_11720); at the end of their infection cycle, dsDNA phages generally form concatemers that are cut by the terminase, enabling packaging of the mature phage genome [79].
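
The relationship between the duplicated N60 repeat in the lysogens and the single N60 copy in ATCC 19606(A) can be illustrated with a deliberately simplified model, in which excision is treated as recombination between two identical direct repeats. The sketch below is an assumption-laden toy: the 10-mer repeat and the lower-case "prophage" string are hypothetical placeholders, not the real N60 or Φ19606 sequences, and the function name is invented for illustration.

```python
def excise_between_repeats(genome: str, repeat: str) -> tuple[str, str]:
    """Model excision as recombination between two direct repeats.

    Returns (chromosome_after_excision, excised_element). The chromosome
    keeps one repeat copy; the excised element carries the other one.
    """
    first = genome.find(repeat)
    if first == -1:
        raise ValueError("repeat not found")
    second = genome.find(repeat, first + len(repeat))
    if second == -1:
        raise ValueError("only one copy of the repeat: nothing to excise")
    chromosome = genome[:first] + genome[second:]
    excised = genome[first:second]
    return chromosome, excised


REPEAT = "ACGTTGCAAC"          # hypothetical 10-mer standing in for N60
PROPHAGE = "ttttaaaaccccgggg"  # hypothetical prophage body (lower case)

lysogen = "GGCC" + REPEAT + PROPHAGE + REPEAT + "AATT"
cured, excised = excise_between_repeats(lysogen, REPEAT)

if __name__ == "__main__":
    print("repeat copies in lysogen:", lysogen.count(REPEAT))
    print("repeat copies after excision:", cured.count(REPEAT))
    print("excised element:", excised)
```

In this toy model the excised element, once circularized, also carries a single repeat copy, which would be consistent with a repeat-mediated integration/excision cycle; the actual mechanism in Φ19606 is not established by this sketch.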

Fig. 3.

Φ19606 phage. (a) Circular map of the Φ19606 genome drawn with DNAPlotter. The genome map illustrates putative ORFs along with the direction of transcription indicated with arrows. Functional proteins predicted by PHASTER are depicted in different colours. (b) Integration site of Φ19606 (black) into the ATCC 19606(M, D, H) chromosomes (top). The double slash denotes a phage region that is not shown. Positions refer to the ATCC 19606(M) genome sequence. Structure of ATCC 19606(A, S, T) after phage loss (bottom). Positions refer to the ATCC 19606(A) genome sequence. Sequences flanking the insertion site are boxed, with predicted phage nucleotides italicized. Primer positions are indicated with black arrows. N60 stands for the 60-nucleotide sequence generated by phage insertion/excision. (c) Agarose gel electrophoresis of the PCR products obtained by using different primer pairs indicated in (b). (d) Presence (+) or absence (-) of amplicons detected in the different ATCC 19606T strains.

Seventy-two ORFs were predicted in the Φ19606 prophage, 48 of which match proteins in the phage protein database, five match bacterial proteins and 19 are annotated as hypothetical proteins (Table S7). According to PHASTER results, prophage Φ19606 showed partial similarity with previously published siphoviral phages YMC/09/02/B1251_ABA_BP (NC_019541 [80]) and YMC11/11/R3177 (NC_041866 [81]), with 38 and 31 homologous proteins, respectively. Intriguingly, Φ19606 harbours the GO593_11890 gene, encoding a putative lipid A phosphoethanolamine transferase (eptA), showing 95 % identity with the chromosomal pmrC gene, implicated in colistin resistance [82]. However, colistin susceptibility testing by the broth microdilution method showed similar MIC values (1 µg ml⁻¹ colistin) for both ATCC 19606(A) and ATCC 19606(D) (data not shown).
While the absence of Φ19606 is a distinctive feature of ATCC 19606(A), three prophage regions, classified as ‘intact’ or ‘questionable’ by PHASTER, were conserved among ATCC 19606 strains A, D, H and M (Table S8). Since the whole-genome sequence is unavailable for ATCC 19606 strains S and T, the occurrence of Φ19606 in the genome of these domesticated strains was experimentally tested by PCR, including the ATCC 19606(D) and ATCC 19606(A) DNA as positive and negative control, respectively (Fig. 3c, d). An amplicon of 498 bp was invariably detected in domesticated strains ATCC 19606(A, S, T), indicating the absence of the Φ19606 prophage. Intriguingly, the generation of both 560 bp and 498 bp amplicons from genomic DNA of ATCC 19606(D) suggests that the Φ19606 prophage was present in both integrated and episomal form (Fig. 3c). Consistently, genes encoding the Cro (GO593_11615) and CI (GO593_11625) repressors, likely implicated in the switch control between the lysogenic and lytic cycles [83, 84], were detected in Φ19606, with the divergent gene organization typical of the λ-like coliphages. Since ATCC 19606(A, S, T) were originally filed with ATCC before being distributed to the various laboratories, we hypothesized that multiple propagations from the original vial resulted in Φ19606 loss. To reproduce the in vitro conditions that may have led to the phage loss, serial propagation steps of ATCC 19606(D) from the original vial obtained from DSMZ were conducted. In particular, preliminary viable counts showed that a single ~1 mm colony of ATCC 19606(D) generated after 24 h growth at 37 °C on LA plates is composed of 2.14 (±0.7)×10⁷ c.f.u. Assuming that one colony originates from a single cell, it was calculated that ATCC 19606(D) replicates ca 24 times to produce a ~1 mm colony, with a mean generation time of ~1 h. Therefore, it was estimated that ATCC 19606(D) underwent a total of 360 generations after 15 daily passages on LA plates.
For each propagation step, the presence of the prophage was verified by PCR with the primers listed in Table S1, using 30 ng of genomic DNA purified from ten randomly selected colonies as a template for PCR. Amplicons of 560 and 498 bp were detected for all colonies at all passages. In addition, a large screening conducted on 200 randomly selected colonies from the last propagation plate invariably yielded both amplicons, indicating that none of the tested colonies had lost Φ19606 during 360 generations. Spontaneous prophage induction could be due to extrinsic factors, intrinsic factors, or a combination of both [85]. Extrinsic factors, such as pH variations, accumulation of reactive oxygen species, UV radiation or other factors causing DNA damage, could have triggered Φ19606 excision. On the other hand, spontaneous activation of the genetic circuitry causing prophage excision in single cells of bacterial populations has also been observed in the absence of an external trigger, a phenomenon dubbed ‘spontaneous prophage induction’ [85]. Consequently, the events that led to Φ19606 loss remain unpredictable and difficult to reproduce in vitro, likely resulting from a combination of stochastic and/or ill-defined environmental conditions.
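
The generation estimate used for the propagation experiment follows from simple doubling arithmetic. The sketch below reproduces it, assuming (as stated in the text) that each colony arises from a single cell and that growth is by binary fission; the variable names are illustrative only.

```python
import math

CFU_PER_COLONY = 2.14e7   # mean viable count of one ~1 mm colony (from the text)
HOURS_PER_PASSAGE = 24    # growth time per passage on LA plates
PASSAGES = 15             # number of daily passages

# Number of doublings needed to go from 1 cell to CFU_PER_COLONY cells.
doublings_exact = math.log2(CFU_PER_COLONY)
doublings = round(doublings_exact)

# Mean generation time and cumulative generations over all passages.
generation_time_h = HOURS_PER_PASSAGE / doublings
total_generations = PASSAGES * doublings

if __name__ == "__main__":
    print(f"doublings per colony: {doublings_exact:.2f} (rounded to {doublings})")
    print(f"mean generation time: {generation_time_h:.2f} h")
    print(f"total generations after {PASSAGES} passages: {total_generations}")
```

The calculation ignores the reported uncertainty (±0.7×10⁷ c.f.u.), which changes the exact number of doublings by less than one generation per colony.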

Φ19606 belongs to genus Vieuvirus and its host range is likely restricted to A. baumannii

To address the phylogeny of bacteriophage Φ19606, genome comparisons between Φ19606 and all 19 previously reported Acinetobacter phages belonging to the Siphoviridae family were performed using VICTOR, in agreement with the recommendations of the International Committee on Taxonomy of Viruses [55]. The resulting minimum evolution tree grouped Φ19606 with members of the genus Vieuvirus, with Ab105-3phi and Ab105-2phi being the closest relatives (Fig. 4). Therefore, Φ19606 can be classified as a member of the genus Vieuvirus, within the Siphoviridae family.

Fig. 4.

Phylogenetic tree of phages belonging to the Siphoviridae family. The tree was generated by VICTOR using the complete genome sequences of the Siphoviridae family members. Filled circles at the nodes are GBDP pseudo-bootstrap support values >70 % from 100 replications. The scale bar indicates the number of substitutions per variable site. Phages belonging to the genus Vieuvirus are grey-shaded, Φ19606 is in bold. The tree was rooted at the midpoint.

A significant portion (over 85 %) of the Φ19606 sequence was detected in 123 non-redundant complete A. baumannii genomes (Table S9). Strains harbouring Φ19606 belonged to clonal complex 2 (95 %), including ST2 and a single-locus variant of ST2, and to ST52 and a single-locus variant of ST52 (3 %). In most of the genomes (75 %), Φ19606 was inserted between a homologue of the hypothetical protein coding sequence HTZ92_2409 [according to the ATCC 19606(A) annotation] and the ssrS gene (HTZ92_ssRs), and in six strains an ISAba1 was detected between HTZ92_2409 and Φ19606. In 19 % of genomes, Φ19606 mapped between the HTZ92_2409 homologue and the gene encoding another hypothetical protein [absent in ATCC 19606(A)], and in 6 % between an integrase gene and the ssrS gene. Φ19606 was not detected in any organisms other than A. baumannii, in line with previous evidence that Vieuvirus phages only infect Acinetobacter spp. [86].

ATCC 19606T harbours two plasmids: pMAC and p1ATCC19606

An indigenous plasmid, called pMAC, was previously identified and characterized in ATCC 19606T [87]. pMAC is a 9.5 kb mobilizable episomal element carrying the genetic determinants for resistance to organic peroxides [87]. The existence of an additional replicon was first predicted from previous genome assemblies (https://genomes.atcc.org/genomes/1577c3a70f334038; [25, 26]), but no physical evidence of the existence of two different replicons in ATCC 19606T has so far been provided. Moreover, the predicted size of ATCC 19606T plasmids differs between previous studies (Fig. 2). To address these discrepancies, both ATCC 19606T plasmids were isolated and entirely sequenced. The two indigenous plasmids were found to coexist in four different type strains (A, D, S, T), as demonstrated by agarose gel electrophoresis of clear lysates (Fig. 5a). The 7655 bp extrachromosomal element was identified as p1ATCC19606 [25] and copurified with pMAC in order to perform a restriction analysis of both replicons (Fig. 5b); Illumina sequences were confirmed by primer walking and Sanger sequencing with primers listed in Table S1. Fig. 5(c) displays the physical and functional maps of pMAC and p1ATCC19606. The pMAC size (9540 bp) and annotated ORFs exactly match previous data [87]. Inspection of publicly available genomes did not detect p1ATCC19606 and pMAC in strains other than ATCC 19606T. Multiple analyses conducted on the raw sequence read data of ATCC 19606(M) did not detect plasmid p1ATCC19606 (7655 bp). This is probably because small plasmids are frequently overlooked in long-read-only assemblies [88], likely due to the removal of <~10 kb DNA fragments during library preparation.

Fig. 5.

Plasmids p1ATCC19606 and pMAC harboured by ATCC 19606T strains. (a) Agarose gel electrophoresis of clear lysates of ATCC 19606(A) (lane 1), ATCC 19606(D) (lane 2), ATCC 19606(S) (lane 3) and ATCC 19606(T) (lane 4). M, Lambda DNA/HindIII marker (ThermoFisher). White arrows indicate the closed circular forms of pMAC (upper band) and p1ATCC19606 (lower band). (b) p1ATCC19606 and pMAC were copurified from strains ATCC 19606(A) (lanes 1 and 5), ATCC 19606(D) (lanes 2 and 6), ATCC 19606(S) (lanes 3 and 7) and ATCC 19606(T) (lanes 4 and 8), and digested with XhoI (lanes 1–4) and BclI (lanes 5–8). M, BenchTop 1 kb DNA Ladder (Promega). (c) Physical and functional maps of the p1ATCC19606 and pMAC plasmids. Restriction sites for the enzymes used to generate the electropherogram in (b) are shown. Unique cutter restriction enzymes are indicated in bold. Nomenclature of p1ATCC19606: rep, putative replicase; dbp, gene encoding a predicted DNA-binding protein; cspE-like, putative cold-shock protein gene; sel1-like, putative gene coding for a Sel1-repeat family protein; yedL-like, gene coding for the putative YedL N-acetyltransferase; oriC, predicted origin of replication. Nomenclature of pMAC: repM, replication protein M; dbp, gene encoding a predicted DNA-binding protein; ohr, gene encoding an organic hydroperoxide resistance protein; mobA, plasmid mobilization protein; oriC, origin of replication. ORFs shown in black are predicted to encode hypothetical proteins. All genes are drawn to scale relative to the total length of each plasmid. Images were obtained with the SnapGene software (GSL Biotech).

Determination of the minimal self-replicating region of p1ATCC19606

For a preliminary functional characterization of p1ATCC19606, protein-coding genes were annotated by integrating the output of MEGAnnotator and Blast2GO. Thirteen putative ORFs (ORF-1 to ORF-13) were identified (Table 3). Among them, three overlapping ORFs (ORF-10, ORF-11 and ORF-12) were predicted to be involved in plasmid replication and segregation, namely ORF-10 encoding a putative replication protein located downstream of the predicted origin of replication, ORF-11 encoding a putative DNA-binding protein, and ORF-12 encoding a putative integral membrane protein, likely involved in plasmid segregation.

Table 3.

Annotation of protein-coding genes of plasmid p1ATCC19606

Predicted ORF	Gene ID	Position (bp)	Protein length (aa)	Blast2GO description (e-value)
ORF-1	HTZ92_3642	225–497	91	Helix-turn-helix domain-containing protein (7.34E-59)
ORF-2	HTZ92_3643	490–810	107	Type II toxin-antitoxin system RelE/ParE family toxin (4.55E-71)
ORF-3	HTZ92_3644	997–1194	66	Hypothetical protein (3.33E-39)
ORF-4	HTZ92_3645	1261–1488	76	Hypothetical protein (2.14E-45)
ORF-5	HTZ92_3646	1586–1801	73	Cold shock-like protein CspE (1.97E-42)
ORF-6	HTZ92_3647	2128–2640	171	Hypothetical protein (6.05E-70)
ORF-7	HTZ92_3648	2735–3094	120	Sel1 repeat family protein (6.67E-80)
ORF-8	HTZ92_3649	3203–3343	47	Uncharacterized protein (6.73E-25)
ORF-9	HTZ92_3650	3343–3714	124	N-acetyltransferase YedL (1.3311E-84)
ORF-10	HTZ92_3651	4925–5875	317	Initiator replication family protein (0)
ORF-11	HTZ92_3652	5868–6443	192	DNA replication protein (1.73E-140)
ORF-12	HTZ92_3653	6463–6606	48	Hypothetical protein - integral component of membrane (5.11E-23)
ORF-13	HTZ92_3654	7050–7385	112	Hypothetical protein (2.1E-60)

The p1ATCC19606 origin of replication (oriC), predicted by using the DoriC 5.0 software [89], consists of a 1072 bp sequence containing four 22-mer direct repeats (5′-GCAAGGTAAACGGTGTCATATT-3′). Similarly arranged iterons of four 21 bp direct repeats were also identified in pMAC [87] and could be implicated in the initiation of plasmid replication and copy-number control [90, 91]. Interrogation of PlasmidFinder (https://cge.cbs.dtu.dk/services/PlasmidFinder/; [92]) did not retrieve any replication class for the p1ATCC19606 oriC region, suggesting that p1ATCC19606 has a narrow host range, consistent with the absence of predicted mobilization proteins. To determine the shortest DNA region enabling self-replication in Acinetobacter spp., deletion analysis of p1ATCC19606 was performed.
Overlapping DNA fragments encompassing the oriC region were generated by PCR with the primer pairs listed in Table S1, and the resulting amplicons were cloned into the pCR vector for transformation of Escherichia coli DH5α. After sequence verification, all constructs (designated as pCR-p1ATCC19606 to pCR-p1ATCC19606Δ5) were individually transferred by electroporation into different Acinetobacter spp. to assess self-replication (Fig. 6). Except for the putative replicase gene, all ORFs could be deleted from p1ATCC19606 without affecting the replication of the hybrid constructs in Acinetobacter spp. All p1ATCC19606 deletion derivatives showed similar transformation efficiencies (TEs) (Table S10). Intriguingly, the transformation of UKK_0145 with pCR-p1ATCC19606Δ4, lacking the putative replicase gene, yielded some transformants, though with low efficiency. A blastp search of the p1ATCC19606 putative replicase protein sequence in the NCBI nr protein database retrieved a chromosomally located DNA replication protein (KQF43430.1) sharing 85 % identity with the putative replicase of p1ATCC19606. It can therefore be speculated that a chromosomally encoded replication protein could act in trans to enable pCR-p1ATCC19606Δ4 replication in UKK_0145 (Table S10).
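
Locating the 22-mer direct repeats reported for the p1ATCC19606 oriC is a plain exact-match search. The sketch below shows one way to do it; the iteron sequence is the one given in the text, whereas the toy origin (spacers and flanks) is entirely hypothetical and the function name is invented for illustration. It is not the procedure used by DoriC.

```python
import re

ITERON = "GCAAGGTAAACGGTGTCATATT"  # 22-mer direct repeat reported for oriC


def iteron_starts(sequence: str, iteron: str = ITERON) -> list[int]:
    """Return 0-based start positions of non-overlapping exact iteron copies."""
    pattern = re.escape(iteron)
    return [match.start() for match in re.finditer(pattern, sequence.upper())]


# Hypothetical toy origin: four iteron copies joined by made-up spacers.
toy_ori = "AT" + ITERON + "CCG" + ITERON + "TTA" + ITERON + "G" + ITERON + "CA"

starts = iteron_starts(toy_ori)
spacings = [b - a for a, b in zip(starts, starts[1:])]

if __name__ == "__main__":
    print("iteron length:", len(ITERON))
    print("copies found:", len(starts))
    print("start positions:", starts)
    print("start-to-start spacings:", spacings)
```

Exact matching would miss degenerate iteron copies; searching a real sequence for imperfect repeats would need a mismatch-tolerant approach, which is outside the scope of this sketch.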

Fig. 6.

Deletion analysis of p1ATCC19606 to determine the minimal region required for autonomous plasmid replication in Acinetobacter spp. Deletion fragments of p1ATCC19606 were generated by PCR amplification with primers listed in Table S1 and cloned into pCR. The resulting p1ATCC19606 deletion derivatives were introduced into A. baylyi BD413 and A. baumannii AB5075 to map the minimal self-replicating region (black box). Relevant coding regions are indicated with colours: red, predicted minimal origin of replication (oriC); yellow, putative replicase (rep); orange, gene encoding a predicted DNA-binding protein (dbp); dark green, putative higA2-like antitoxin gene; light green, putative higB2-like toxin gene; cyan, putative cold-shock protein gene (cspE); blue, gene coding for a putative Sel1-repeat family protein (sel1); white, gene coding for the putative YedL N-acetyltransferase (yedL). Four copies of the 22-mer direct repeat (DR1–DR4) in the predicted origin of replication are shown on top. ORFs in black are predicted to encode hypothetical proteins. All genes are drawn to scale relative to the total length of the plasmid. Images were obtained with the SnapGene software (GSL Biotech).

The p1ATCC19606 predicted replicase was modelled on the π initiator protein–iteron complex of plasmid R6K (2NRA [93]). Both SWISS-MODEL and I-TASSER provided highly superimposable models (Fig. S2), with maximum superimposition at the level of the replicase DNA-binding domain. This prediction suggests that the p1ATCC19606 replicase could interact with the four 22-mer direct repeats identified in oriC, similar to the R6K replicase. However, R6K replication also requires iterons located outside the predicted origin of replication [93, 94], which are not detectable in p1ATCC19606, arguing for a different replication mechanism.

A HigB2-HigA2-like TA system accounts for p1ATCC19606 plasmid stability

ORF-1 and ORF-2 of p1ATCC19606 (Table 3) were predicted to encode a putative antitoxin and a toxin, respectively, possibly constituting a type-II TA module [95-97]. TA modules are implicated in plasmid maintenance since they encode a poisonous toxin and a neutralizing antitoxin [98-100]. Homology modelling of ORF-2 and superimposition onto the HigB2 toxin structure (5JA8; [101]), using both I-TASSER and SWISS-MODEL, provided high scores for template modelling (TM) and Qualitative Model Energy ANalysis (QMEAN) (Table S11; Fig. 7a). ORF-1 was modelled by I-TASSER and SWISS-MODEL on the HigA2 antitoxin cocrystallized with its cognate HigB2 toxin (5JAA; [101]) (Fig. 7a). The predicted HigB2A2-like module was substantiated by the unusual genetic organization of this TA system: although the antitoxin gene is usually located upstream of the toxin gene, the higB2A2 module has the reverse gene organization [102-104]. Although I-TASSER and SWISS-MODEL employed the same protein template to model the putative HigA2-like antitoxin, SWISS-MODEL provided higher modelling scores than I-TASSER (Table S11), probably because antitoxin proteins show considerable structural flexibility, which limits the superimposition of the ORF-1 product on the HigA2 template structure (Fig. 7a). After structure editing of the I-TASSER-derived antitoxin model, consisting of the torsion angle modification of five residues (from 26 to 30), both SWISS-MODEL and I-TASSER models became superimposable onto the HigB2A2 crystal structure complex (5JAA; [101]) (Fig. 7b), enabling prediction of a 3D model of the interaction between the p1ATCC19606 putative toxin and antitoxin proteins (Fig. 7c).

Fig. 7.

HigB2-like and HigA2-like components of the TA system of p1ATCC19606. (a) Superimposition of the HigB2A2-like TA complex on the HigB2A2 TA crystal structure (5JAA). The query structure is shown in grey, while the structural analogue is displayed in orange or cyan for I-TASSER- and SWISS-MODEL-based models, respectively. Only the first-ranked model predicted by I-TASSER and SWISS-MODEL for each query is shown. Torsion angles of amino acid residues 26–30 of the I-TASSER-based model of the predicted HigA2-like antitoxin were modified to orient the α-helix involved in the interaction with the HigB2-like toxin. (b) Superimposition of the predicted p1ATCC19606 TA complex models (I-TASSER, orange; SWISS-MODEL, cyan) over the crystal structure of HigB2-HigA2 (grey; 5JAA). (c) GRASP surface representation of the HigB2-like toxin (red)-HigA2-like antitoxin (green) complex based on the SWISS-MODEL predictions, displaying the interaction between the putative toxin and antitoxin proteins. The images shown in (a–c) were obtained using UCSF Chimera. (d) Schematic illustration of HigB2-like toxin neutralization by the HigA2-like antitoxin. The arabinose-inducible expression of the higA2-like antitoxin gene provided in trans from pVRL2 allows the growth of DH5α expressing the IPTG-inducible higB2-like toxin gene from plasmid pME6032higB2. (e) Bacterial growth assessed after 24 h incubation at 37 °C in LB supplemented with the appropriate antibiotic concentration. To induce the expression of the higA2-like antitoxin gene from the arabinose-inducible PBAD promoter and of the higB2-like toxin gene from the IPTG-inducible P promoter, the medium was supplemented with the indicated arabinose and IPTG concentrations, respectively. OD600 values are representative of three independent experiments giving similar results.

To demonstrate the involvement of the predicted TA system in p1ATCC19606 stability, the higB2A2-like gene system was deleted from pCR-p1ATCC19606, yielding pCR-p1ATCC19606ΔhigB2A2.
The stability of p1ATCC19606ΔhigB2A2 in DH5α after 48 h growth in the absence of antibiotic selection (N Ant /N ratios) was reduced by ca 99 % compared with the parent pCR-p1ATCC19606 plasmid (Table S12). A comparable reduction in plasmid stability was also observed for the TA deletion derivative of pVRL1 (i.e. pVRL1ΔTA; [66]), used as a control. Therefore, deletion of the TA module dramatically reduces p1ATCC19606 stability. To provide direct evidence of the toxicity of the HigB2-like toxin, DH5α was transformed with both the pVRL2higA2 and pME6032higB2 plasmids, which direct the arabinose- and IPTG-inducible expression of the higA2 and higB2 genes, respectively (Fig. 7d). If the HigB2-like protein is a toxin and the HigA2-like protein is its cognate antitoxin, cells expressing the toxin should be viable only upon expression of the higA2-like antitoxin gene, i.e. in the presence of arabinose. To test this, the growth of DH5α carrying both pVRL2higA2 and pME6032higB2 was determined after 24 h incubation at 37 °C in LB supplemented with 10 µg ml−1 Tc and different concentrations of IPTG and arabinose. The results demonstrate that IPTG-inducible expression of the higB2-like toxin gene abrogates growth unless compensated by arabinose-inducible expression of the higA2-like antitoxin gene (Fig. 7e). Indeed, bacterial growth increased with increasing arabinose concentrations and decreased with increasing IPTG concentrations. A similar growth inhibition profile of DH5α carrying both pVRL2higA2 and pME6032higB2 was observed around paper discs soaked with increasing IPTG concentrations and applied to LA plates supplemented with arabinose (Fig. S3). The peculiar gene architecture, together with the structural and functional similarities between the p1ATCC19606 TA system and the HigB2A2 complex of Vibrio cholerae, suggests that the HigB2-like toxin acts as a ribonuclease targeting translating mRNA, causing a stall in protein synthesis and cell death in plasmid-free daughter cells [105].
The activity of the HigB2-like toxin is neutralized by the cognate HigA2-like antitoxin, securing the survival of the cells that inherit the p1ATCC19606 plasmid.
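The ca 99 % loss of stability reported above can be made concrete with a short worked calculation. The sketch below assumes, as a reading of the garbled N Ant /N notation, that plasmid retention is the ratio of colony counts on antibiotic-containing plates to colony counts on non-selective plates; the function names and all c.f.u. values are hypothetical and are not taken from Table S12.

```python
def retention(cfu_selective, cfu_total):
    """Fraction of cells still carrying the plasmid: colonies on
    antibiotic plates divided by colonies on non-selective plates."""
    if cfu_total <= 0:
        raise ValueError("cfu_total must be positive")
    return cfu_selective / cfu_total


def percent_reduction(parent_ratio, mutant_ratio):
    """Relative loss of stability of a deletion derivative with
    respect to its parent plasmid, expressed as a percentage."""
    if parent_ratio <= 0:
        raise ValueError("parent_ratio must be positive")
    return 100.0 * (1.0 - mutant_ratio / parent_ratio)


# Hypothetical counts (c.f.u. per ml) after growth without selection.
parent = retention(cfu_selective=9.0e8, cfu_total=1.0e9)   # parent plasmid
mutant = retention(cfu_selective=9.0e6, cfu_total=1.0e9)   # TA deletion derivative
reduction = percent_reduction(parent, mutant)
print(f"parent {parent:.3f}, mutant {mutant:.3f}, reduction {reduction:.1f} %")
```

Expressing the result relative to the parent plasmid, rather than as an absolute retention value, separates the contribution of the TA module from the background loss rate of the vector itself.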

Conclusion

For decades ATCC 19606T has been the reference strain for A. baumannii research. However, frequent subculturing and local differences in culture conditions are known to result in the domestication of laboratory strains, a micro-evolutionary process driven by mutational events at the genome level that can even be reflected in variable phenotypes [28–32, 106]. Motivated by the remarkable diversity among publicly available ATCC 19606T genome sequences, we generated an accurately revised genome sequence of ATCC 19606T, which will hopefully provide a more solid basis for studies of the genetics and genomics of this model organism. Previous long-read sequence data, such as those generated by PacBio and Oxford Nanopore technologies [26, 27], allowed complete assembly of the ATCC 19606T genomic sequence without manual intervention. However, assemblies built from long-read data alone can retain sequencing errors [107] and, in the case of ATCC 19606T, failed to detect both plasmids [25]. To overcome these limitations, we combined deep Illumina short-read sequencing with MinION long-read and Sanger technologies to generate a high-quality genome sequence. Subsequent annotation reduced the number of genes encoding hypothetical proteins, as well as the number of pseudogenes, compared with previously released ATCC 19606T genome sequences. Indeed, comparative analysis of our genomic sequence with previously published ones [25-27] highlighted a high number of SNPs and INDELs, and a difference in the annotation of putative pseudogenes. Sanger sequencing of the genomic regions encompassing individual SNPs confirmed the sequence determined by the hybrid assembly of ATCC 19606(A), allowing us to exclude sequencing errors. Therefore, the confirmed SNPs and INDELs could result from micro-evolutionary events in individual strains during domestication, whereas differences in pseudogene number are suggestive of sequencing and/or annotation errors, rather than genome erosion events.
Accurate genome assembly also made it possible to characterize the indigenous plasmid p1ATCC19606, whose presence and size were undefined in previous versions of the ATCC 19606T genome sequence (Fig. 2). The HigB2A2-like TA system and the minimal self-replicating region of p1ATCC19606 were characterized both in silico and in vitro, providing insights into the mechanisms of plasmid maintenance and replication, respectively. Of note, HigB2A2-like modules are the most prevalent plasmid-borne TA systems in A. baumannii [100, 108]. Consistent with this, the invariable presence of p1ATCC19606 in all tested ATCC 19606 strains (A, D, S and T) denotes intrinsic stability, which can be ascribed to a very efficient maintenance system rather than to a selective advantage, e.g. antibiotic resistance, conferred by plasmid carriage. However, the function of p1ATCC19606 has so far remained elusive.

Prophages are important sources of new genetic information, having the potential to transfer virulence and antibiotic resistance genes [109-111]. Prophages belonging to the Siphoviridae and Myoviridae families are the most frequently detected in A. baumannii genomes [109, 112]. Here, we show that a remarkable difference among domesticated ATCC 19606T strains is the uneven presence of a 52 kb region, which we identified as the siphoviral Φ19606 prophage. This genetic element was only detected in ATCC 19606T stocks directly originating from ATCC, as inferred from genome analysis of strains D, H, M and O, but not in strains A, S and T, which have been passed from laboratory to laboratory since the 1980s. Φ19606 belongs to the Vieuvirus genus and was not detected in species other than A. baumannii, showing high prevalence among strains belonging to the successful clonal complex 2 [113]. In ATCC 19606(D), both integrated and episomal forms of Φ19606 were experimentally shown to coexist during serial passages under non-curing conditions (ca 360 generations on LA plates), denoting substantial phage stability.
Therefore, the stage in the evolutionary history of ATCC 19606 strains A, S and T at which Φ19606 was lost remains an open question. The genetic drift resulting from laboratory domestication of reference strains can be reflected in phenotypic variability, and genome-level differences among laboratory-adapted strains were observed to affect inter-laboratory experimental reproducibility in the case of Pseudomonas aeruginosa PAO1 lineages [28, 30]. Our studies on ATCC 19606T strains from different laboratories highlighted remarkable diversity at the genome level. This calls for researchers to specify the lineage of the strain used, as individual culturing and storage practices may affect micro-evolution, and should encourage the storage of strains in a single glycerol stock, revitalizing an aliquot when necessary, without resorting to subculturing. Stringent quality controls and strain assessments, including sequencing and the use of low-passage cultures, will help ensure the reproducibility and consistency of research.

112 in total

1. MqsR, a crucial regulator for quorum sensing and biofilm formation, is a GCU-specific mRNA interferase in Escherichia coli.

Authors: Yoshihiro Yamaguchi; Jung-Ho Park; Masayori Inouye
Journal: J Biol Chem Date: 2009-08-18 Impact factor: 5.157

2. Ribosome-dependent Vibrio cholerae mRNAse HigB2 is regulated by a β-strand sliding mechanism.

Authors: San Hadži; Abel Garcia-Pino; Sarah Haesaerts; Dukas Jurenas; Kenn Gerdes; Jurij Lah; Remy Loris
Journal: Nucleic Acids Res Date: 2017-05-05 Impact factor: 16.971

3. Acinetobacter baumannii ATCC 19606 Carries GIsul2 in a Genomic Island Located in the Chromosome.

Authors: Mohammad Hamidian; Ruth M Hall
Journal: Antimicrob Agents Chemother Date: 2016-12-27 Impact factor: 5.191

4. ISAba1-dependent overexpression of eptA in clinical strains of Acinetobacter baumannii resistant to colistin.

Authors: Anaïs Potron; Jean-Baptiste Vuillemenot; Hélène Puja; Pauline Triponney; Maxime Bour; Benoit Valot; Marlène Amara; Laurent Cavalié; Christine Bernard; Laurence Parmeland; Florence Reibel; Gerald Larrouy-Maumus; Laurent Dortet; Rémy A Bonnin; Patrick Plésiat
Journal: J Antimicrob Chemother Date: 2019-09-01 Impact factor: 5.790

5. Letter to the Editor: Prophages Encode Antibiotic Resistance Genes in Acinetobacter baumannii.

Authors: Gamaliel López-Leal; Rosa Isela Santamaria; Miguel Ángel Cevallos; Victor Gonzalez; Santiago Castillo-Ramírez
Journal: Microb Drug Resist Date: 2020-02-28 Impact factor: 3.431

6. New insights into Acinetobacter baumannii pathogenesis revealed by high-density pyrosequencing and transposon mutagenesis.

Authors: Michael G Smith; Tara A Gianoulis; Stefan Pukatzki; John J Mekalanos; L Nicholas Ornston; Mark Gerstein; Michael Snyder
Journal: Genes Dev Date: 2007-03-01 Impact factor: 11.361

7. Complete genome sequence of DSM 30083(T), the type strain (U5/41(T)) of Escherichia coli, and a proposal for delineating subspecies in microbial taxonomy.

Authors: Jan P Meier-Kolthoff; Richard L Hahnke; Jörn Petersen; Carmen Scheuner; Victoria Michael; Anne Fiebig; Christine Rohde; Manfred Rohde; Berthold Fartmann; Lynne A Goodwin; Olga Chertkov; Tbk Reddy; Amrita Pati; Natalia N Ivanova; Victor Markowitz; Nikos C Kyrpides; Tanja Woyke; Markus Göker; Hans-Peter Klenk
Journal: Stand Genomic Sci Date: 2014-12-08

8. Fallacy of the Unique Genome: Sequence Diversity within Single Helicobacter pylori Strains.

Authors: Jenny L Draper; Lori M Hansen; David L Bernick; Samar Abedrabbo; Jason G Underwood; Nguyet Kong; Bihua C Huang; Allison M Weis; Bart C Weimer; Arnoud H M van Vliet; Nader Pourmand; Jay V Solnick; Kevin Karplus; Karen M Ottemann
Journal: mBio Date: 2017-02-21 Impact factor: 7.867

9. Emergence, molecular mechanisms and global spread of carbapenem-resistant Acinetobacter baumannii.

Authors: Mohammad Hamidian; Steven J Nigro
Journal: Microb Genom Date: 2019-10

10. Complete Genome Sequence of Acinetobacter baumannii ATCC 19606^T, a Model Strain of Pathogenic Bacteria Causing Nosocomial Infection.

Authors: Taishi Tsubouchi; Masato Suzuki; Makoto Niki; Ken-Ichi Oinuma; Mamiko Niki; Hiroshi Kakeya; Yukihiro Kaneko
Journal: Microbiol Resour Announc Date: 2020-05-14
