Quantitative assessment of insertion sequence impact on bacterial genome architecture.

Mark D Adams1, Brian Bishop1, Meredith S Wright1.   

Abstract

Insertion sequence (IS) elements are important mediators of genome plasticity and can lead to phenotypic changes with evolutionary significance. In multidrug-resistant Acinetobacter baumannii and Klebsiella pneumoniae, IS elements have contributed significantly to the mobilization of genes that encode resistance to antimicrobial drugs. A systematic analysis of IS elements is needed for a more comprehensive understanding of their evolutionary impact. We developed a computational approach (ISseeker) to annotate IS elements in draft genome assemblies and applied the method to analysis of IS elements in all publicly available A. baumannii (>1000) and K. pneumoniae (>800) genome sequences, in a phylogenetic context. Most IS elements in A. baumannii genomes are species-specific ISAba elements, whereas K. pneumoniae genomes contain significant numbers of both ISKpn elements and elements that are found throughout the Enterobacteriaceae. A. baumannii genomes have a higher density of IS elements than K. pneumoniae, averaging ~33 vs ~27 copies per genome. In K. pneumoniae, several insertion sites are shared by most genomes in the ST258 clade, whereas in A. baumannii, different IS elements are abundant in different phylogenetic groups, even among closely related Global Clone 2 strains. IS elements differ in the distribution of insertion locations relative to genes, with some more likely to disrupt genes and others predominantly in intergenic regions. Several genes and intergenic regions had multiple independent insertion events, suggesting that those events may confer a selective advantage. Genome- and taxon-wide characterization of insertion locations revealed that IS elements have been active contributors to genome diversity in both species.

Keywords:  Acinetobacter baumannii; Klebsiella pneumoniae; antibiotic resistance; genome-wide analysis; insertion sequence; mobile genetic element

Year:  2016        PMID: 28348858      PMCID: PMC5343135          DOI: 10.1099/mgen.0.000062

Source DB:  PubMed          Journal:  Microb Genom        ISSN: 2057-5858


The ISseeker software has been deposited in GitHub at http://github.com/JCVI-VIRIFX/ISseeker. Tables S3 and S4 provide the GenBank accession numbers for all genome sequences analysed and a URL to retrieve each sequence from the GenBank database.

Impact Statement

Mobile genetic elements are well recognized for the role they have played in the dissemination of antimicrobial resistance genes in Gram-negative bacteria and in the rise of multi-drug resistance in several human pathogens. With large collections of genome sequences available for many bacterial species, it is now possible to quantify the abundance and distribution of these elements and assess the role they have played in genome evolution. Genome-wide surveys of the locations of insertion sequence (IS) elements in Acinetobacter baumannii and Klebsiella pneumoniae showed that several different IS elements are common within each species, and that IS elements have made significant contributions to the evolution of genome structure and variation in both species.

Introduction

Insertion sequences (IS) are mobile genetic elements smaller than ~2 kbp that encode only a transposase. Once acquired, IS elements can spread in a genome by transposition, creating genetic variation and playing important roles in adaptation (Bennett, 2004; Siguier ). The density of coding content in bacterial genomes means that most random insertions occur in functional genome regions. Intragenic insertions can cause loss-of-function mutations, while intergenic insertions may disrupt promoter function or can result in up-regulation of adjacent genes in cases where the IS element encodes an outward-facing promoter. Most insertions are presumed to be deleterious, but some may confer a selective advantage. For example, in Acinetobacter baumannii an ISAba1 insertion upstream of the chromosomal ampC gene results in over-expression of the Acinetobacter-derived cephalosporinase (ADC) beta-lactamase and resistance to extended-spectrum cephalosporins (Corvec ; Heritier ; Turton ). In addition to disrupting a gene and modifying gene expression, pairs of IS elements can act as a transposon, mobilizing new genetic material via lateral gene transfer such as the ISAba1-flanked blaOXA-23 termed Tn2006 that confers carbapenem resistance (Mugnier ). The blaKPC carbapenemase is bracketed by ISKpn7 and ISKpn6 in Tn4401a (Naas ) and the ISAba125 element was involved in the emergence of blaNDM-1 (Poirel , 2011). Several IS elements have been reported that drive mobilization and expression of blaOXA-58 (Poirel & Nordmann, 2006) and blaRTG (Potron ; Bonnin ). Despite the importance of these elements in the evolution of antimicrobial resistance, few studies have addressed their genome-wide distribution across a diverse set of strains. Gaffé ) found that IS elements contributed significantly to adaptive evolution of Escherichia coli under controlled growth conditions in continuous culture. 
They examined the distribution of eight IS elements in 120 Escherichia coli genomes following long-term growth in chemostats, and identified new IS locations that altered the global regulatory program. A study of eight clinical isolates of E. coli O157 found that IS629 and ISEc8 caused frequent small-size structural polymorphisms and suggested that IS elements may play a role in the inactivation of incoming phage and plasmids (Ooka ). Open questions remain regarding the genome-wide impact of IS elements, the relative abundance and diversity across evolutionary lineages, and the extent to which IS elements may be reshaping the genomes of clinically important pathogens. In draft genome assemblies, multi-copy IS elements are typically collapsed into a single contig that represents the full-length IS element sequence. Each IS copy cannot be placed in its correct genome location during assembly unless long reads, mate pairs, or some other long-range linking strategy is employed. Typically, contigs are broken at IS locations, and sequence reads that span the junction from chromosomal sequence to IS element sequence extend several bases into the IS sequence (Fig. 1a). This extension or ‘stub’ is often approximately half the read length when using the Velvet (Zerbino & Birney, 2008) or SPAdes (Bankevich ) assemblers. Three software programs have been described for mapping transposable element locations: ISmapper (Hawkey ), TIF (Nakagome ), and breseq (Barrick ). Each of these programs relies on primary read data rather than sequence assemblies, making them effective at defining junctions, but difficult to apply to large surveys involving hundreds of genomes given the very large input datasets. 
We developed the ISseeker software to identify flanks of IS elements in genome assemblies – both full length copies in long contiguous sequences and stubs at contig edges – extract the flanking sequences, and align those flanks to a common reference to enable comparison of IS locations across many strains.
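The flank-extraction idea described above can be illustrated with a minimal Python sketch. It is not the ISseeker implementation (which is written in Perl); the function name `extract_flanks`, the 0-based half-open coordinates, and the toy contig are assumptions made for illustration. The default window of 500 bp follows the flank length given in the Methods.

```python
def extract_flanks(contig, is_start, is_end, flank_len=500):
    """Return (upstream, downstream) sequence flanking an IS match.

    is_start and is_end are 0-based, half-open coordinates of the IS match
    on the contig (an assumed convention). A flank is shorter than
    flank_len, possibly empty, when the match lies near a contig edge.
    """
    upstream = contig[max(0, is_start - flank_len):is_start]
    downstream = contig[is_end:is_end + flank_len]
    return upstream, downstream


# Toy contig: 8 chromosomal bases, a 6-base stand-in for an IS, 4 more bases.
contig = "ACGTACGT" + "TTTTTT" + "GGCC"
upstream, downstream = extract_flanks(contig, 8, 14, flank_len=5)
print(upstream, downstream)
```

In this toy case the downstream flank is truncated because the contig ends before the requested window is filled, which is the situation expected for IS stubs at contig edges.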
Fig. 1.

Outline of ISseeker process. (a) Illustration of four possible alignments between an IS element and a query genome. IS element sequences are represented by blue arrows and contig sequences as a solid line. (b) Process diagram for determining IS locations based on alignment of IS flanks with a reference genome. Inputs are in blue boxes and outputs in green boxes. ISseeker program steps are shown in yellow boxes. White boxes depict intermediate results that determine next steps in program execution. Arrows depict the flow of logic in the program.

Thirty-six Acinetobacter baumannii species-specific ISAba elements have been registered with the ISfinder database (https://www-is.biotoul.fr/; Siguier ). Twenty-five ISKpn elements are in the ISfinder database. Several of these elements were initially described in genome sequencing projects, while others were identified based on their participation in antibiotic resistance gene mobilization (Tables 1 and 2). Klebsiella pneumoniae strains also have elements that are commonly found throughout the Enterobacteriaceae. Other IS elements that have been described in both genomes were included in the analysis as well. ISseeker was used to define the location of IS elements in over 1000 A. baumannii genome sequences and in over 800 K. pneumoniae genomes. The resulting patterns of IS distribution show that several elements are abundant in both species and that IS elements have played a significant role in genome evolution.
Table 1.

IS Elements Surveyed in A. baumannii genomes

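| IS element | Genomes with annotated sites | Genomes with un-annotated sites | Distinct sites*: no. of sites | Distinct sites*: in genes | Distinct sites*: between genes | Distinct sites*: % in genes | Distinct sites*: % between genes | Total sites†: no. of sites | Total sites†: in genes | Total sites†: between genes | Total sites†: % in genes | Total sites†: % between genes | Diversity ratio‡ | Linked genes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| IS26 | 776 | 30 | 265 | 208 | 57 | 78.5 | 21.5 | 3087 | 1612 | 1475 | 52.2 | 47.8 | 0.086 | |
| ISAba1 | 815 | 20 | 1229 | 735 | 497 | 59.8 | 40.4 | 14510 | 7383 | 7127 | 50.9 | 49.1 | 0.085 | |
| ISAba2 | 46 | 31 | 57 | 43 | 14 | 75.4 | 24.6 | 185 | 160 | 25 | 86.5 | 13.5 | 0.308 | |
| ISAba3 | 4 | 12 | 14 | 9 | 5 | 64.3 | 35.7 | 20 | 13 | 7 | 65.0 | 35.0 | 0.700 | OXA-58 |
| ISAba4 | 0 | | | | | | | | | | | | | |
| ISAba5 | 19 | 31 | 41 | 25 | 16 | 61.0 | 39.0 | 59 | 36 | 23 | 61.0 | 39.0 | 0.695 | |
| ISAba6 | 1 | | | | | | | | | | | | | |
| ISAba7 | 1 | | | | | | | | | | | | | |
| ISAba8 | 1 | 4 | 4 | 3 | 1 | 75.0 | 25.0 | 4 | 3 | 1 | 75.0 | 25.0 | 1.000 | |
| ISAba9 | 2 | | 2 | | 2 | 0.0 | 100.0 | 2 | | 2 | 0.0 | 100.0 | 1.000 | RTG-4 |
| ISAba10 | 38 | 12 | 97 | 59 | 38 | 60.8 | 39.2 | 241 | 143 | 98 | 59.3 | 40.7 | 0.402 | OXA-23 |
| ISAba11 | 12 | 19 | 69 | 39 | 30 | 56.5 | 43.5 | 85 | 48 | 37 | 56.5 | 43.5 | 0.812 | LpxA/C |
| ISAba12 | 142 | 122 | 357 | 209 | 148 | 58.5 | 41.5 | 990 | 592 | 398 | 59.8 | 40.2 | 0.361 | |
| ISAba13 | 393 | 80 | 572 | 309 | 263 | 54.0 | 46.0 | 3639 | 1865 | 1774 | 51.3 | 48.7 | 0.157 | |
| ISAba14 | 13 | 15 | 13 | 9 | 4 | 69.2 | 30.8 | 25 | 20 | 5 | 80.0 | 20.0 | 0.520 | RTG-6 |
| ISAba15 | 0 | | | | | | | | | | | | | LpxD |
| ISAba16 | 140 | 15 | 258 | 172 | 86 | 66.7 | 33.3 | 1032 | 661 | 371 | 64.1 | 35.9 | 0.250 | OXA-51 |
| ISAba17 | 276 | 12 | 98 | 60 | 38 | 61.2 | 38.8 | 2007 | 1119 | 888 | 55.8 | 44.2 | 0.049 | |
| ISAba18 | 46 | 25 | 60 | 46 | 14 | 76.7 | 23.3 | 199 | 172 | 27 | 86.4 | 13.6 | 0.302 | |
| ISAba19 | 62 | 25 | 163 | 132 | 31 | 81.0 | 19.0 | 387 | 328 | 59 | 84.8 | 15.2 | 0.421 | |
| ISAba20 | 3 | 1 | 5 | 4 | 1 | 80.0 | 20.0 | 5 | 4 | 1 | 80.0 | 20.0 | 1.000 | |
| ISAba21 | 3 | 20 | 12 | 10 | 2 | 83.3 | 16.7 | 12 | 10 | 2 | 83.3 | 16.7 | 1.000 | RTG-5 |
| ISAba22 | 63 | 301 | 71 | 49 | 22 | 69.0 | 31.0 | 112 | 89 | 23 | 79.5 | 20.5 | 0.634 | AadB |
| ISAba23 | 0 | | | | | | | | | | | | | |
| ISAba24 | 200 | 1 | 9 | 7 | 2 | 77.8 | 22.2 | 240 | 49 | 191 | 20.4 | 79.6 | 0.038 | |
| ISAba25 | 157 | 19 | 340 | 219 | 121 | 64.4 | 35.6 | 1391 | 884 | 507 | 63.6 | 36.4 | 0.244 | |
| ISAba26 | 381 | 39 | 190 | 136 | 54 | 71.6 | 28.4 | 652 | 255 | 397 | 39.1 | 60.9 | 0.291 | |
| ISAba27 | 46 | 20 | 343 | 116 | 227 | 33.8 | 66.2 | 818 | 258 | 560 | 31.5 | 68.5 | 0.419 | |
| ISAba28 | 1 | 1 | | | | | | | | | | | | |
| ISAba29 | 56 | 25 | 128 | 102 | 26 | 79.7 | 20.3 | 308 | 263 | 45 | 85.4 | 14.6 | 0.416 | |
| ISAba30 | 0 | | | | | | | | | | | | | |
| ISAba31 | 29 | 51 | 93 | 34 | 60 | 36.6 | 64.5 | 122 | 39 | 83 | 32.0 | 68.0 | 0.762 | |
| ISAba32 | 0 | 16 | | | | | | | | | | | | |
| ISAba33 | 29 | 20 | 64 | 33 | 31 | 51.6 | 48.4 | 94 | 41 | 53 | 43.6 | 56.4 | 0.681 | |
| ISAba36 | 42 | 12 | 253 | 172 | 81 | 68.0 | 32.0 | 712 | 446 | 266 | 62.6 | 37.4 | 0.355 | |
| ISAba125 | 402 | 160 | 528 | 338 | 190 | 64.0 | 36.0 | 1595 | 1054 | 541 | 66.1 | 33.9 | 0.331 | OXA-58 |
| ISAba825 | 3 | 2 | 6 | 3 | 3 | 50.0 | 50.0 | 6 | 3 | 3 | 50.0 | 50.0 | 1.000 | OXA-58 |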

*Distinct sites is the number of unique annotated locations on the reference genome.

†Total sites is the number of annotated and unannotated sites summed for all genomes.

‡(Number of distinct insertion sites)/(Total insertions in all genomes).
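As a worked check of the footnote definitions, the short Python sketch below recomputes the percentage columns and the diversity ratio from the site counts. The function name `site_summary` and its argument names are illustrative assumptions; the input values are the in-gene and between-gene counts read from the IS26 row of Table 1 (208 and 57 distinct sites, 1612 and 1475 total sites).

```python
def site_summary(distinct_in, distinct_between, total_in, total_between):
    """Recompute the derived columns of Tables 1 and 2 from raw site counts."""
    distinct = distinct_in + distinct_between
    total = total_in + total_between
    return {
        "pct_distinct_in_genes": round(100 * distinct_in / distinct, 1),
        "pct_total_in_genes": round(100 * total_in / total, 1),
        # Diversity ratio: distinct insertion sites / total insertions.
        "diversity_ratio": round(distinct / total, 3),
    }


# IS26 in A. baumannii, counts as read from Table 1.
summary = site_summary(208, 57, 1612, 1475)
print(summary)
```

A ratio near 1 means that almost every copy sits at its own site, whereas a ratio near 0 means that a small number of sites are shared by many genomes.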

Table 2.

IS Elements Surveyed in K. pneumoniae genomes

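| IS element | Genomes with annotated sites | Genomes with un-annotated sites | Distinct sites*: no. of sites | Distinct sites*: in genes | Distinct sites*: between genes | Distinct sites*: % in genes | Distinct sites*: % between genes | Total sites†: no. of sites | Total sites†: in genes | Total sites†: between genes | Total sites†: % in genes | Total sites†: % between genes | Diversity ratio‡ | Linked genes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| IS1294 | 40 | 16 | 127 | 97 | 30 | 76.4 | 23.6 | 325 | 227 | 98 | 69.8 | 30.2 | 0.391 | |
| IS1F | 452 | 49 | 87 | 52 | 35 | 59.8 | 40.2 | 1030 | 75 | 955 | 7.3 | 92.7 | 0.084 | |
| IS26 | 494 | 69 | 211 | 160 | 51 | 75.9 | 24.2 | 2064 | 1577 | 487 | 76.4 | 23.6 | 0.102 | |
| IS4321R | 159 | 33 | 21 | 13 | 8 | 61.9 | 38.1 | 376 | 141 | 235 | 37.5 | 62.5 | 0.056 | |
| IS5 | 19 | 23 | 44 | 22 | 22 | 50.0 | 50.0 | 53 | 26 | 27 | 49.1 | 50.9 | 0.830 | |
| IS5075 | 318 | 26 | 36 | 22 | 14 | 61.1 | 38.9 | 883 | 292 | 591 | 33.1 | 66.9 | 0.041 | |
| IS6100 | 430 | 30 | 15 | 11 | 4 | 73.3 | 26.7 | 937 | 140 | 797 | 14.9 | 85.1 | 0.016 | |
| IS903B | 491 | 73 | 194 | 119 | 75 | 61.3 | 38.7 | 1403 | 467 | 936 | 33.3 | 66.7 | 0.138 | |
| ISEcp1B | 90 | 16 | 51 | 35 | 16 | 68.6 | 31.4 | 136 | 99 | 37 | 72.8 | 27.2 | 0.375 | |
| ISKpn1 | 583 | 41 | 146 | 27 | 119 | 18.5 | 81.5 | 2967 | 57 | 2910 | 1.9 | 98.1 | 0.049 | |
| ISKpn2 | 0 | 35 | | | | | | | | | | | | |
| ISKpn3 | 0 | | | | | | | | | | | | | |
| ISKpn4 | 0 | | | | | | | | | | | | | |
| ISKpn6 | 413 | 1 | 3 | | 3 | 0.0 | 100.0 | 823 | | 823 | 0.0 | 100.0 | 0.004 | KPC |
| ISKpn7 | 414 | 1 | 3 | | 3 | 0.0 | 100.0 | 824 | | 824 | 0.0 | 100.0 | 0.004 | KPC |
| ISKpn8 | 24 | 5 | 8 | 1 | 7 | 12.5 | 87.5 | 32 | 3 | 29 | 9.4 | 90.6 | 0.250 | |
| ISKpn9 | 0 | | | | | | | | | | | | | |
| ISKpn10 | 0 | | | | | | | | | | | | | |
| ISKpn11 | 0 | 6 | | | | | | | | | | | | |
| ISKpn12 | 0 | 12 | | | | | | | | | | | | |
| ISKpn13 | 0 | | | | | | | | | | | | | |
| ISKpn14 | 64 | 142 | 58 | 33 | 25 | 56.9 | 43.1 | 98 | 51 | 47 | 52.0 | 48.0 | 0.592 | |
| ISKpn15 | 0 | | | | | | | | | | | | | |
| ISKpn18 | 387 | 1 | 85 | 76 | 9 | 89.4 | 10.6 | 1188 | 1153 | 35 | 97.1 | 2.9 | 0.072 | |
| ISKpn19 | 12 | 16 | 11 | 9 | 2 | 81.8 | 18.2 | 12 | 10 | 2 | 83.3 | 16.7 | 0.917 | |
| ISKpn20 | 3 | 3 | | | | | | | | | | | | |
| ISKpn21 | 47 | 36 | 19 | 9 | 10 | 47.4 | 52.6 | 80 | 43 | 37 | 53.8 | 46.3 | 0.238 | |
| ISKpn23 | 0 | | | | | | | | | | | | | |
| ISKpn24 | 235 | 19 | 29 | 20 | 9 | 69.0 | 31.0 | 458 | 37 | 421 | 8.1 | 91.9 | 0.063 | |
| ISKpn25 | 71 | 76 | 13 | 6 | 7 | 46.2 | 53.8 | 76 | 6 | 70 | 7.9 | 92.1 | 0.171 | |
| ISKpn26 | 486 | 33 | 621 | 379 | 242 | 61.0 | 39.0 | 4238 | 1627 | 2611 | 38.4 | 61.6 | 0.147 | |
| ISKpn27 | 0 | | | | | | | | | | | | | |
| ISKpn28 | 452 | 25 | 49 | 20 | 29 | 40.8 | 59.2 | 855 | 166 | 689 | 19.4 | 80.6 | 0.057 | |
| ISKpn31 | 0 | 10 | | | | | | | | | | | | |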

*Distinct sites is the number of unique annotated locations on the reference genome.

†Total sites is the number of annotated and unannotated sites summed for all genomes.

‡(Number of distinct insertion sites)/(Total insertions in all genomes).


Methods

ISseeker was written in Perl to annotate the locations of a range of IS elements in complete and draft genome sequences. Search results are output as a text log file, a comma-separated values file, and SQL statements that can be loaded into a MySQL database to facilitate complex queries. The outline of the program is illustrated in Fig. 1. ISseeker identifies complete and partial IS matches in a query genome using blastn, with a user-specifiable percent identity threshold (default 97 %). Using contig length information, matches are classified as embedded in a contig (and either full-length or partial), consisting of an entire contig (this is common in draft assemblies), or representing the edge of the IS element matching the edge of a contig. Full-length embedded matches and valid edge matches are selected for annotation. A 500 bp sequence region adjacent to the IS element is extracted from the contig and searched against the reference genome using blastn. Matches are evaluated using user-defined thresholds for percent identity (default 97 %) and length, and those passing the thresholds are reported, together with their location relative to adjacent genes in the reference genome. The program attempts to link matches into pairs representing the start and end of the IS element that map to equivalent sites in the reference genome and thus correspond to a single insertion event. In practice, this pairing is incomplete because deletions appear to be common near IS elements, so it is not always obvious whether a single event is represented. Flanks that do not match the reference are also included in the output as ‘unannotated’ flanks. When evaluating the IS locations, we found many instances of IS sites clustered within a few bases of one another. These could represent independent insertion events or alignment artifacts. 
Manual review suggested that the latter was common, so for the purpose of reporting the number of distinct insertion sites, we bundled annotated locations within 10 bases of each other as a single event. Locations relative to genes were inferred based on the GenBank annotation for the reference genome, with locations outside annotated coding regions designated intergenic and those inside coding regions designated intragenic. The ISseeker software and the MySQL schema are available at https://github.com/JCVI-VIRIFX/ISseeker.

A user-specified reference genome is required for ISseeker analysis. The TYTH-1 genome sequence [GenBank accession no. CP003856.1 (Liou )] was selected as the reference A. baumannii genome after consideration of several completed genome sequences. TYTH-1 was isolated in Taiwan in 2008 and is a GC2 strain (Nemec ), as are a majority of strains with genome sequences in the GenBank database. NJST258_1 [GenBank accession no. CP006923.1 (Deleo )] was selected as the K. pneumoniae reference genome. NJST258_1 is a KPC-positive ST258 strain isolated in New Jersey, USA, in 2010.

All completed and draft A. baumannii and K. pneumoniae genomes available in the GenBank database as of 1 August 2015 were downloaded. Genome assemblies that were highly fragmented (>300 contigs), were assembled with newbler, or represented non-baumannii Acinetobacter strains or non-pneumoniae Klebsiella strains were excluded. The newbler assembler (Miller ) suppresses the IS stubs at contig edges, so newbler assemblies cannot be used by ISseeker. In total, 1035 complete and draft A. baumannii genomes and 807 complete and draft K. pneumoniae genomes were analysed.

All species-specific IS elements cataloged in the ISfinder database (Siguier ) were downloaded and compared against the full genome set for each species. In addition, several complete genome sequences for each species were searched against the ISfinder database by BLAST to identify species non-specific elements. Eighteen additional (non-ISKpn) IS elements found in K. pneumoniae genomes were analysed and seven additional (non-ISAba) IS elements were analysed in A. baumannii genomes. Results are included in Tables 1 and 2 for those elements that were present in more than five genomes.

ISseeker was compared with ISmapper using a set of genomes for which both a finished genome sequence (i.e. a ‘gold standard’) and Illumina short reads were available. Illumina read sets were downloaded from NCBI’s Sequence Read Archive (SRA) using the SRA Toolkit utility fastq-dump. ISmapper was run on each read set using default parameters. Each Illumina read set was assembled using SPAdes (Bankevich ). ISseeker was run on the finished genome sequence and on the Illumina assembly. The performance of ISseeker was also evaluated by performing runs against the full set of sequences for each species using varying values for percent identity of matches to the IS element and of IS-flanking sequences to the reference, and using an alternative reference genome.

A core phylogeny was constructed with FastTree 2 (Price ) from single-nucleotide variants (SNVs; 278 322 SNVs for A. baumannii, 332 571 SNVs for K. pneumoniae) identified by NASP (Sahl ). Genome positions with allele calls in at least 80 % of strains were included in the analysis. Fig. 2 was prepared using the graphics tools available through the interactive Tree of Life (iTOL) web service (Letunic & Bork, 2011).
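The bundling of nearby annotated locations described in the Methods (locations within 10 bases of each other counted as a single event) can be sketched in Python as follows. The source does not state whether "within 10 bases" is measured from the previous site or from the start of a cluster; this sketch assumes chaining from the previous site (single linkage). The function name `cluster_sites` and the coordinates are hypothetical.

```python
def cluster_sites(positions, max_gap=10):
    """Group reference coordinates into clusters of neighbouring sites.

    A position joins the current cluster when it lies within max_gap bases
    of the previous position (single-linkage chaining, an assumption).
    """
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return clusters


# Hypothetical annotated insertion coordinates on a reference genome.
sites = [1000, 1004, 1012, 5000, 5011, 9000]
clusters = cluster_sites(sites)
print(len(clusters), clusters)
```

With chaining, 1000, 1004 and 1012 collapse into one event even though the first and last are 12 bases apart, while 5000 and 5011 (11 bases apart) stay separate. A start-anchored rule would split the first group differently, so the choice affects the count of distinct sites.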
Fig. 2.

IS representation in a phylogenetic context. The most abundant IS elements (present in >100 genomes) are shown in the context of the A. baumannii (a) and K. pneumoniae (b) phylogeny based on SNP markers. Isolation locations for the strains from the largest collections have a color code in the inner circle. The height of each bar represents the number of copies of each element in each genome. Scale rings illustrate the height of the histograms on each tree diagram. In (a), strain groups are denoted with coloured branches: Global Clone 1 (pink), GC2 (green) and ST79 (orange). In (b), the two major sub-groups of ST258 are denoted as ST258A (orange) and ST258B (blue).

The statistical significance of comparisons of IS element composition between strain sets was assessed using Student’s t-test.

Results

Description and evaluation of the ISseeker program

Four classes of IS alignment are considered by the ISseeker program (Fig. 1a): contigs that consist entirely of IS sequence, IS element matches that are full-length and embedded in a long genomic contig, matches to the beginning or end of an IS element at the start or end of a contig, and partial matches internal to a contig. Contig sequences flanking each IS element are extracted and compared to a reference genome (Fig. 1b). By mapping all IS/genome junctions to a single reference, it is possible to compare IS locations across strains. The performance of ISseeker was evaluated from two perspectives: (1) comparison with ISmapper, using IS locations in completely sequenced reference genomes as a gold standard, and (2) the impact of alternative run parameters on the detection of IS element locations. Two other programs that can identify IS elements in short read data were not included in the evaluation because they are not strictly comparable. Breseq was designed for mutation-finding in long-term culture experiments and is best suited to comparing very closely related genomes to a sequenced reference. TIF uses the Unix grep command to identify IS-matching reads and is thus unable to identify non-exact matches. Results from ISseeker and ISmapper were compared on four K. pneumoniae genomes and four A. baumannii genomes for which Illumina reads, Illumina assemblies, and finished genome sequences were available (Table S1, available in the online Supplementary Material). Across these eight genomes, there were 74 insertion sites for the test IS elements ISAba1 or ISKpn26. ISseeker found all 74 sites when run using both the finished sequences and the draft genome assemblies, while ISmapper missed 20 sites, for a sensitivity of 73 % (54 of 74). This is lower than the value reported by Hawkey et al. (2015). Further analysis showed that most missed sites were in genomes with low read coverage (<80x) or at locations with structural variation relative to the reference. 
ISseeker reports every IS-flanking sequence, including locations that cannot be annotated in the reference genome and those that are not in valid pairs matching both the IS element beginning and end sequences at a common reference location. ISmapper is more conservative in reporting only valid IS edge pairs in the primary output, with some additional information in ancillary output files. One interesting case identified by ISseeker, but not ISmapper, involved the creation and mobilization of a compound transposon composed of inverted repeat copies of ISAba1 in the ORAB01 genome (Fig. S1). ISseeker identified both copies and their correct locations, but the details of the structure were only apparent in the finished genome sequence. The sensitivity to alteration in run parameters was evaluated for an abundant IS element in each species (ISAba1 and ISKpn26) across the full set of assemblies for each species (Table S2). Reduction of the minimum percent identity of the matches (IS edge detection and flank alignment to the reference) from the default of 97 % to 95 % resulted in annotation of 3–4 % more sites. Upon manual review, some of these were determined to be spurious, so the more conservative threshold was retained. Increasing the stringency of the flank alignment to require a full 500 bp match reduced the number of annotated sites by 13 % (A. baumannii) and 8 % (K. pneumoniae). Use of an alternative A. baumannii reference (a GC1 strain rather than a GC2 strain) also resulted in a loss of about 7 % of the annotated sites.
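The four alignment classes described at the start of this section can be expressed as a small decision function. The Python sketch below is a simplified illustration, not the ISseeker code: the function name, the class labels, the `tol` parameter, and the assumption that coordinates are 1-based, inclusive, and already normalised so that start <= end on both sequences are all hypothetical.

```python
def classify_is_match(contig_len, c_start, c_end, is_len, is_start, is_end, tol=0):
    """Classify one IS alignment on a contig into one of four classes.

    c_start/c_end are the match coordinates on the contig, is_start/is_end
    the matched interval on the IS element. tol is the slack (in bases)
    allowed when deciding that an alignment reaches an end.
    """
    at_contig_start = c_start <= 1 + tol
    at_contig_end = c_end >= contig_len - tol
    has_is_start = is_start <= 1 + tol
    has_is_end = is_end >= is_len - tol

    if at_contig_start and at_contig_end:
        return "whole_contig"          # contig consists entirely of IS sequence
    if has_is_start and has_is_end:
        return "embedded_full_length"  # complete IS copy inside a longer contig
    if (at_contig_start or at_contig_end) and (has_is_start or has_is_end):
        return "edge_stub"             # IS terminus abutting a contig edge
    return "partial"                   # partial match, not usable as a junction


# Hypothetical 1180 bp IS element aligned to contigs of various lengths.
print(classify_is_match(1180, 1, 1180, 1180, 1, 1180))
print(classify_is_match(50000, 2000, 3179, 1180, 1, 1180))
print(classify_is_match(50000, 49951, 50000, 1180, 1, 50))
print(classify_is_match(50000, 2000, 2299, 1180, 400, 699))
```

Per the Methods, only the full-length embedded matches and the valid edge matches would go forward to flank extraction. A production version would also need to handle strand and to check that the IS terminus and the contig edge are on compatible sides, which this sketch ignores.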

Analysis of IS elements in A. baumannii and K. pneumoniae

Each ISAba and ISKpn element was compared against the corresponding full set of complete and draft genomes (Tables 1, 2, S3 and S4). Several non-species-specific IS elements were also included, based on elements present in a sampling of genomes from each species. 89 % of A. baumannii genomes and 94 % of K. pneumoniae genomes had at least one IS element detected and several elements were detected in hundreds of genomes (Fig. 2). The overall numbers of insertions and chromosomal locations of IS elements were greater in A. baumannii than in K. pneumoniae. In K. pneumoniae, 18 869 total IS copies were found across 782 genomes. In A. baumannii, 32 539 copies were found across 976 genomes. On average, A. baumannii genomes contained 33 copies of IS elements, while K. pneumoniae genomes contained 27 copies (p<0.001). A strong pattern of similar IS content among phylogenetically related strains is apparent, suggesting that many insertions are conserved. The number of genomes that share a set of IS insertion locations for the most abundant elements is shown in Fig. 3. ISAba1 and ISKpn26 are the only elements with >10 shared sites in a substantial number of strains. Four additional IS elements have >10 copies per genome in some A. baumannii strains. IS26 and ISKpn1 have 2 and 6 copies in shared locations per genome, respectively, in 300 strains, corresponding to ST258 strains.
Fig. 3.

Distribution of conserved IS clusters for the most common IS elements. The number of strains sharing a set of IS element locations is plotted for the five IS elements with the largest number of copies (>1500 total copies for each element) in A. baumannii and for ISAba27 that is greatly expanded in certain genomes (a), and for six K. pneumoniae IS elements with the largest number of copies (>1000 total copies) (b).

There are more distinct IS element insertions in the examined A. baumannii genomes compared to K. pneumoniae. With respect to distinct sites mapped to each reference genome (TYTH-1 for A. baumannii and NJST258_1 for K. pneumoniae), there were 1843 distinct K. pneumoniae genome locations with IS insertions and 5341 distinct A. baumannii locations. These distinct insertion sites represent the minimum number of insertion events that occurred over time because some insertion sites could not be mapped to the selected reference genomes, and because multiple independent insertions could have occurred at a given site. Twelve different IS elements have over 100 distinct insertion sites across the A. baumannii strain set, but only five IS elements have that many distinct insertion sites in K. pneumoniae genomes. There are many more IS insertion sites shared by up to 100 A. baumannii genomes than there are shared sites across similar numbers of K. pneumoniae genomes (Table 3). In contrast, K. pneumoniae genomes have more sites shared in >250 genomes reflecting the relatively homogenous IS patterns in the dominant ST258 clade in the dataset. In addition, there are many more strain-specific insertion events in A. baumannii (3194 vs 1234). Another view of the extent of shared insertion sites is given in Fig. S2 that depicts the number of genomes that share sites along the A. baumannii or K. pneumoniae reference chromosome. There are many more moderately abundant shared sites among A. baumannii strains than K. pneumoniae strains.
Table 3.

Shared IS element sites

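| Number of genomes sharing a site | Total copies: A. baumannii | Total copies: K. pneumoniae | Distinct sites: A. baumannii | Distinct sites: K. pneumoniae |
|---|---|---|---|---|
| >500 | 736 | 0 | 1 | 0 |
| 250–499 | 6975 | 12789 | 25 | 36 |
| 150–249 | 3794 | 3385 | 18 | 18 |
| 100–149 | 2561 | 1136 | 21 | 9 |
| 50–99 | 1559 | 911 | 21 | 14 |
| 20–49 | 4985 | 950 | 172 | 32 |
| 10–19 | 2642 | 612 | 195 | 42 |
| 2–9 | 5590 | 1634 | 1694 | 471 |
| 1 | 3194 | 1234 | 3194 | 1234 |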
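A tally of the kind shown in Table 3 can be produced from a mapping of each distinct insertion site to the genomes that carry it. The Python sketch below is an illustration under stated assumptions: the input structure `site_to_genomes`, the function names, and the toy data are hypothetical, and a site shared by exactly 500 genomes (which the table's labels do not cover) is placed in the 250-499 bin.

```python
from collections import Counter

# Lower bound of each bin, largest first; labels follow Table 3.
# Assumption: exactly 500 genomes falls in "250-499".
BINS = [(501, ">500"), (250, "250-499"), (150, "150-249"), (100, "100-149"),
        (50, "50-99"), (20, "20-49"), (10, "10-19"), (2, "2-9"), (1, "1")]


def bin_label(n_genomes):
    """Return the Table 3 bin for a site present in n_genomes genomes."""
    for lower, label in BINS:
        if n_genomes >= lower:
            return label
    raise ValueError("a site must occur in at least one genome")


def tally_shared_sites(site_to_genomes):
    """Count distinct sites and total copies per sharing bin."""
    distinct_sites = Counter()
    total_copies = Counter()
    for genomes in site_to_genomes.values():
        n = len(set(genomes))          # each genome counted once per site
        label = bin_label(n)
        distinct_sites[label] += 1
        total_copies[label] += n
    return distinct_sites, total_copies


# Toy data: three sites carried by 3, 1 and 12 hypothetical genomes.
toy = {
    "site_A": ["g1", "g2", "g3"],
    "site_B": ["g1"],
    "site_C": [f"g{i}" for i in range(12)],
}
distinct_sites, total_copies = tally_shared_sites(toy)
print(dict(distinct_sites), dict(total_copies))
```

Under this definition the last row of the table has equal totals and distinct counts, since a site found in a single genome contributes exactly one copy.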
A majority of genomes in the datasets for both species belong to multidrug-resistant clonal groups that have recently expanded: 62 % of A. baumannii genomes belong to GC2 [i.e. multi-locus sequence type (MLST) ST2] and 53 % of K. pneumoniae genomes belong to MLST ST258. In these subsets of strains, IS elements are also much more frequent and their locations are more diverse in A. baumannii compared to K. pneumoniae (see Fig. 2). In K. pneumoniae, the common location of IS insertion sites shared by large numbers of ST258 strains regardless of geographic origin supports a model of recent expansion with limited strain- or clade-specific IS gain or loss. Variability in IS element composition and abundance is higher in the A. baumannii GC2 set, with more clustering of IS patterns by geographic location and phylogenetic position. It has been hypothesized that the ST258 lineage of K. pneumoniae arose around 1995 (Bowers ), whereas the oldest known MDR GC2 strain of A. baumannii was isolated in 1982 (Diancourt ; Blackwell ).

A. baumannii IS elements

ISAba1 has had the largest impact on A. baumannii genomes, with copies detected in 815 of the A. baumannii genome assemblies and over 14 500 total insertions mapped in those strains. An ISAba1 insertion site is present upstream of the blaADC (ampC) gene in most of the genomes that have copies of this element (736 genomes). The second most common insertion site for ISAba1 is upstream of the other chromosomal β-lactamase gene, blaOXA-51-like (369 genomes). This insertion results in over-expression of the OXA-51-like carbapenemase and resistance to imipenem and meropenem (Nemec ). The median number of ISAba1 sites per genome was 19 and the maximum number in a single genome was 34. Four other elements (ISAba125, ISAba13, ISAba26 and IS26) were present in over 350 strains each, and five additional elements were present in more than 100 strains (Table 1). In a few cases, it seems that an IS element has run amok in a genome, such as the ISAba6 and ISAba7 elements in A. baumannii strain SDF (Vallenet ). Most ST79 strains have 50–100 copies of ISAba27. Seven genomes have copies of 10 or more different IS elements and five genomes have more than 100 total IS copies. ISAba4, ISAba15, ISAba23, ISAba30 and ISAba32 were not found in any of the sequenced genomes. ISAba6 and ISAba7 were found only in the SDF genome (Vallenet ). ISAba8 and ISAba28 were also only found in one genome each. ISAba2, ISAba18, ISAba19 and ISAba29 are IS3-family elements and are 85–95 % identical to one another, making inference of their abundance and correct locations difficult in draft genomes. Likewise, ISAba16 and ISAba25 are 97 % identical to one another and many of their annotated sites overlap with one another and are thus ambiguous as to the specific element that is present at each location. ISAba12 and ISAba13 are 84.8 % identical, including regions of 100 % identity in the first 23 bases and last 21 bases, and are also difficult to discriminate in draft genomes. 
Of the non-ISAba elements examined, only IS26 was abundant enough to be included. Genomes that are closely related to each other on the phylogenetic tree tended to have similar patterns of IS element composition (Fig. 2). There are a few large strain collections representing restricted geographic regions among the 1035 genomes, including 442 isolates from Maryland (NCBI BioProject PRJNA224116) and 174 from Ohio (Wright , 2016). Many of these genomes are very similar, potentially representing clonal series, but differences in IS content are apparent within each group. Among the GC2 genomes, there are several interesting phylogenetic clusters, some of which correspond to geographically restricted strain collections. For example, some clusters of strains isolated in Maryland have copies of ISAba16/ISAba25 that are largely absent from other strains. Strains previously identified as Clade D (Wright , 2016) are clearly distinct from other GC2 strains by having 7–31 copies of ISAba12. Most Ohio strains have one or two copies of ISAba22, ISAba24, and ISAba26 that are found in relatively few other strains. Most Maryland strains have 7–10 copies of ISAba13 and of ISAba17, elements that are not as abundant in other branches of the tree.

K. pneumoniae IS elements

Overall, there are fewer IS element copies in K. pneumoniae genomes than in A. baumannii. This difference is reflected both in the number of distinct insertion sites (reflecting historical independent insertion events) and in the total number of copies across the genome set (reflecting the success of strains carrying those elements) (Fig. 2, Tables 1 and 2). In K. pneumoniae, about 350 ST258 genomes share IS insertion locations for ISKpn1, ISKpn26 and IS1F. This suggests that ST258 genomes have spread rapidly worldwide with a reasonably stable repertoire of IS elements and only limited new IS mobilization activity. Thirteen IS elements were present in >100 K. pneumoniae genomes (Table 2, Fig. 2). ISKpn6 and ISKpn7 are present on Tn4401, which carries the blaKPC gene; the presence of those two elements corresponded closely with the presence of the blaKPC gene. Both of those elements were observed only in the Tn4401 context and so appear not to have mobilized to other sites in the K. pneumoniae genomes. ISKpn24 is also present on the pNJST258N2 plasmid that carries Tn4401, and copies in most ST258 genomes map to that plasmid. Unlike ISKpn6 and ISKpn7, ISKpn24 was observed at several other sites in a subset of genomes. Eight additional IS elements were found in more than 200 genomes (ISKpn1, ISKpn18, ISKpn26, ISKpn28, IS26, IS903B, IS1F, and IS6100). Of these, ISKpn18 is almost entirely restricted to ST258 strains, but the other IS elements are found throughout the phylogenetic tree. ISKpn1 was the most broadly distributed, appearing in 583 genomes. Most ST258 genomes share five common ISKpn1 insertion sites. The number of copies of ISKpn26 is more variable across genomes, indicating more active mobilization than for other elements. NJST258_1 has seven chromosomal copies of ISKpn26 and one on pNJST258N1. Copies at equivalent positions are present in other ST258 strains. 
ISKpn28 mapped to another large plasmid (pNJST258N1) in the reference genome, and one copy was present in the chromosome in most ST258 strains. ST258 genomes have more than three times as many IS copies as non-ST258 genomes (average 34.5 vs 10; p<0.01). Unlike in A. baumannii, there are no large expansions in IS copy number in any of the K. pneumoniae genomes: the maximum number of copies of a single element was 22 copies of ISKpn26, and only 18 genomes have ≥50 total IS insertions, compared to 189 A. baumannii genomes with ≥50 elements. Eight ISKpn elements were not found in any sequenced genome.
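The clade comparison and the ≥50-copy count above are simple summaries of per-genome totals. The sketch below shows one way such numbers could be derived; the function name, the dictionary layout and the toy totals are hypothetical, and no significance test is included because the text does not state which test produced the reported p-value.

```python
from statistics import mean


def compare_groups(total_copies, clade_members, threshold=50):
    """Compare mean IS copies per genome inside and outside a clade.

    total_copies: dict mapping genome_id -> total IS copies (assumed layout).
    clade_members: set of genome_ids assigned to the clade of interest.
    """
    clade = [n for g, n in total_copies.items() if g in clade_members]
    other = [n for g, n in total_copies.items() if g not in clade_members]
    return {
        "clade_mean": mean(clade),
        "other_mean": mean(other),
        "fold_difference": mean(clade) / mean(other),
        # number of genomes with at least `threshold` IS copies in total
        "genomes_at_or_above_threshold": sum(
            n >= threshold for n in total_copies.values()
        ),
    }


# Toy totals, invented for illustration only.
toy_totals = {"kp1": 36, "kp2": 33, "kp3": 51, "kp4": 12, "kp5": 8}
toy_clade = {"kp1", "kp2", "kp3"}
result = compare_groups(toy_totals, toy_clade)
```

With these toy values the clade mean is 40 copies against 10 for the remaining genomes, a four-fold difference, and one genome reaches the 50-copy threshold.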

Distribution relative to genes

The examined IS elements vary in their insertion locations relative to coding regions. By definition, an element may insert within a gene or between genes. Intragenic insertions have the potential to act as gene knockouts. Intergenic insertions may have no effect on adjacent genes or could either positively or negatively affect expression, depending on the precise location relative to promoters. ISAba1 and ISAba125 have strong outward-facing promoters and can up-regulate the expression of adjacent genes (Lopes & Amyes, 2012); other elements have not been characterized for promoter activity. We considered the fraction of intragenic insertions for each element from two perspectives: the total number of sites across all genomes, and the number of distinct sites (Tables 1 and 2). The former measure incorporates the abundance (number of genomes carrying each insertion), while the latter more accurately reflects the number of insertion events and is not biased by repeated sampling of closely related genomes. The two measures are closely correlated for IS elements with more than 15 distinct insertion sites. In A. baumannii, the proportion of intragenic insertions varies from approximately 30 % (ISAba31 and ISAba27) to over 70 % (ISAba19 and ISAba22). In K. pneumoniae, the intragenic proportion ranges from less than 20 % (ISKpn1) to over 80 % (ISKpn18). A low proportion of intragenic insertions could arise because gene-disrupting insertions are more likely to be selected against than intergenic insertions. Alternatively, there may have been strong positive selection for certain intergenic events that has resulted in their high frequency. Another indirect measure of the potential adaptive effects of IS insertion is the diversity of sites, which we calculated as the ‘diversity ratio’: the number of distinct sites divided by the total number of observed insertions in all genomes for each IS element. 
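The two intragenic fractions and the diversity ratio defined above can be computed from one list of insertions per element. This is a minimal sketch under stated assumptions: the `(genome, site, is_intragenic)` record layout, the function name and the toy data are hypothetical and are not taken from ISseeker.

```python
def site_statistics(insertions):
    """Intragenic fractions and diversity ratio for a single IS element.

    insertions: list of (genome_id, site_id, is_intragenic) tuples, one per
    observed insertion (assumed layout; a site shared by several genomes
    appears once per genome).
    """
    if not insertions:
        raise ValueError("no insertions supplied")

    total = len(insertions)
    # Collapse to distinct sites; each site keeps its intragenic flag.
    distinct = {site: intragenic for _genome, site, intragenic in insertions}

    intragenic_total = sum(1 for _g, _s, intra in insertions if intra)
    intragenic_distinct = sum(1 for intra in distinct.values() if intra)

    return {
        "total_insertions": total,
        "distinct_sites": len(distinct),
        "intragenic_fraction_total": intragenic_total / total,
        "intragenic_fraction_distinct": intragenic_distinct / len(distinct),
        # distinct sites divided by all observed insertions
        "diversity_ratio": len(distinct) / total,
    }


# Toy data: one intergenic site shared by four genomes, plus one
# intragenic site seen in a single genome.
toy = [
    ("g1", "siteA", False), ("g2", "siteA", False),
    ("g3", "siteA", False), ("g4", "siteA", False),
    ("g5", "siteB", True),
]
stats = site_statistics(toy)
```

In the toy data the widely shared intergenic site pulls the intragenic fraction down to 0.2 when every observed insertion is counted, whereas counting distinct sites gives 0.5; the diversity ratio is 2/5 = 0.4.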
A high ratio means that most insertions are strain-specific, while a low ratio means that a few IS locations are shared by most strains carrying that element, with few additional strain-specific insertions. The latter group is more likely to represent positively selected insertions. In A. baumannii, the diversity ratio ranged from <10 % (ISAba1, ISAba17, IS26) to >75 % (ISAba11, ISAba31) (Table 1). As an example of a low diversity ratio, 190 of the 200 genomes that contain ISAba24 have an insertion between the genes encoding hypothetical proteins M3Q_2649 and M3Q_2651 in TYTH-1. On the other hand, among the 29 genomes with ISAba31, 72 of the 93 insertion sites are strain-specific. In K. pneumoniae, the abundant IS elements have diversity ratios less than 0.4, except ISKpn14 and IS5. Multiple independent insertions by the same or different IS elements in the same genomic region may also indicate that those insertions convey a selective advantage. In K. pneumoniae, there are fewer than two dozen genes or intergenic regions with insertions by more than two different IS elements. In A. baumannii, however, there are 320 genes with three or more different IS elements inserted in them across the strain set (Table S5). An additional 185 intergenic locations have three or more different IS elements (Table S6). Among these, there is a strong bias for insertions between genes that are oriented so as to be up-regulated by an IS-encoded promoter. Only 18 (10 %) of the intergenic insertions are between genes oriented toward the IS insertion site; at the remaining 89 % of insertion sites, a gene is oriented so as to be up-regulated by the adjacent IS element. One genome segment with multiple insertions is the four-gene region M3Q_2685–M3Q_2688 encoding the type I pilus proteins CsuA/B and their regulators, which has dozens of independent insertions by thirteen different IS elements. 
Twelve different IS elements were found in the 176 bp region between M3Q_2382 and M3Q_2383 in 389 strains. The repeated insertions at this locus suggest that these genes may encode important functions, although each encodes a hypothetical protein with no functionally characterized domains.
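Loci hit by several different IS elements, such as those summarised above, can be found by grouping insertions by locus and counting the distinct elements at each. The sketch below illustrates the idea only; the `(genome, element, locus)` record layout, the function name and the locus labels are hypothetical, and assigning insertions to genes or intergenic regions is assumed to have been done beforehand.

```python
from collections import defaultdict


def multi_element_loci(insertions, min_elements=3):
    """Return loci carrying insertions by at least `min_elements` different
    IS elements across the strain set.

    insertions: iterable of (genome_id, element, locus) tuples, where
    `locus` is a gene or intergenic-region label (assumed layout).
    """
    elements_at = defaultdict(set)
    for _genome, element, locus in insertions:
        elements_at[locus].add(element)
    return {
        locus: sorted(elements)
        for locus, elements in elements_at.items()
        if len(elements) >= min_elements
    }


# Toy data, invented for illustration only.
toy_hits = [
    ("g1", "ISAba1", "locus_A"), ("g2", "ISAba125", "locus_A"),
    ("g3", "IS26", "locus_A"),
    ("g1", "ISAba1", "locus_B"), ("g2", "ISAba1", "locus_B"),
    ("g3", "ISAba13", "locus_B"),
]
hotspots = multi_element_loci(toy_hits)
```

Here `locus_A` qualifies because three different elements are found there, whereas `locus_B` has three insertions but only two distinct elements and is excluded.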

Discussion

After correcting for the larger number of A. baumannii genomes in the analysis, there were about 40 % more total insertions observed in A. baumannii genomes than in K. pneumoniae genomes, and more than twice as many distinct insertion sites. Considering that A. baumannii genomes (~4 Mbp) are about 30 % smaller than K. pneumoniae genomes (~5.6 Mbp), the difference in IS element density is even greater, with about one IS element every ~109 kbp in A. baumannii, compared with one every ~185 kbp in K. pneumoniae. As can be seen in Table 3, there are many more IS locations that are shared in up to 10 % of the A. baumannii strains, while K. pneumoniae genomes have more sites that are shared by about half of the genomes, reflecting the large proportion of very similar ST258 strains in the dataset and the greater diversity of IS location patterns among the A. baumannii genomes. There are several potential explanations for the more diverse set of IS locations in A. baumannii than in K. pneumoniae. The most straightforward may be that the A. baumannii strains represent a more diverse evolutionary history than the K. pneumoniae strains, and thus more time for IS elements to move and accumulate. There are more genomes on long branches of the A. baumannii phylogenetic tree than the K. pneumoniae tree. Although difficult to discern in Fig. 2, this is also true for the most abundant MLST groups: the sum of the branch lengths of GC2 A. baumannii strains is about four times the sum for the ST258 K. pneumoniae strains. However, in both species, IS elements are more abundant in the recently emerged lineages than in the more diverse strains, so divergence time alone cannot explain the abundance differences. It appears that strong selection for a founder ST258 strain carrying the blaKPC gene resulted in a rapid expansion of this lineage (Bowers ), which contains a shared set of IS insertion sites that were present in the founder. 
Only the ISAba1 site upstream of the blaADC gene is common to most GC2 strains, and it has been argued that this is due to multiple independent insertions, rather than a single shared ancestor (Hamidian & Hall, 2013). Hawkey et al. have described a program that uses primary reads to identify the locations of insertion sites in A. baumannii. Their approach has some advantages over ours: by relying on primary sequence reads rather than assemblies, the impact of variation in sequencing and assembly methods and efficacy is reduced. However, publicly available read sets are much more difficult to work with than assemblies, generally requiring 50–150 times as much disk space and computing time during analysis. For example, it takes 20–120 min to download the reads from a single genome from SRA using the SRA Toolkit program fastq-dump. In the same length of time, contig sequences can be downloaded from 1000 genomes from the WGS division of GenBank and searched for IS content using ISseeker. Other limitations to mapping IS locations in draft genomes include sequence variation in the IS element, differences between the query genome and the reference, and assembly artifacts that tend to occur near repetitive genome regions. Use of alternative run parameters resulted in differences in detection rate (Table S2), suggesting that for any particular insertion site, additional computational analysis may be needed to determine the insertion status in strains of interest.
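The density comparison at the start of the Discussion rests on simple arithmetic, sketched below. The genome sizes (~4 Mbp and ~5.6 Mbp) and spacings (~109 kbp and ~185 kbp) are the rounded values quoted in the text. The implied copies per genome are back-calculated here for illustration, are not figures reported by the authors, and need not match per-genome averages given elsewhere; the function name is hypothetical.

```python
def implied_copies(genome_size_kbp, spacing_kbp):
    """Copies per genome implied by one IS element every `spacing_kbp` kbp."""
    return genome_size_kbp / spacing_kbp


AB_SIZE_KBP = 4000   # A. baumannii, ~4 Mbp (rounded value from the text)
KP_SIZE_KBP = 5600   # K. pneumoniae, ~5.6 Mbp (rounded value from the text)

ab_copies = implied_copies(AB_SIZE_KBP, 109)   # about 36.7
kp_copies = implied_copies(KP_SIZE_KBP, 185)   # about 30.3

# Fraction by which the A. baumannii genome is smaller: about 0.29,
# in line with the "about 30 % smaller" statement.
size_reduction = 1 - AB_SIZE_KBP / KP_SIZE_KBP

# Ratio of IS densities (copies per kbp) between the two species;
# this equals 185 / 109, about 1.7.
density_ratio = (ab_copies / AB_SIZE_KBP) / (kp_copies / KP_SIZE_KBP)
```

Because density is the reciprocal of the spacing, the density ratio reduces to 185/109, so A. baumannii carries roughly 1.7 times as many IS elements per unit of genome length under these rounded values.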
  35 in total

1.  A novel and hybrid composite transposon at the origin of acquisition of bla(RTG-5) in Acinetobacter baumannii.

Authors:  Rémy A Bonnin; Laurent Poirel; Patrice Nordmann
Journal:  Int J Antimicrob Agents       Date:  2012-05-23       Impact factor: 5.283

2.  Inference of the impact of insertion sequence (IS) elements on bacterial genome diversification through analysis of small-size structural polymorphisms in Escherichia coli O157 genomes.

Authors:  Tadasuke Ooka; Yoshitoshi Ogura; Md Asadulghani; Makoto Ohnishi; Keisuke Nakayama; Jun Terajima; Haruo Watanabe; Tetsuya Hayashi
Journal:  Genome Res       Date:  2009-06-29       Impact factor: 9.043

3.  FastTree 2--approximately maximum-likelihood trees for large alignments.

Authors:  Morgan N Price; Paramvir S Dehal; Adam P Arkin
Journal:  PLoS One       Date:  2010-03-10       Impact factor: 3.240

4.  Genome sequence of Acinetobacter baumannii TYTH-1.

Authors:  Ming-Li Liou; Chih-Chin Liu; Chia-Wei Lu; Ming-Feng Hsieh; Kai-Chih Chang; Han-Yueh Kuo; Chi-Ching Lee; Chun-Tien Chang; Cheng-Yao Yang; Chuan Yi Tang
Journal:  J Bacteriol       Date:  2012-12       Impact factor: 3.490

5.  Insertion sequence-driven evolution of Escherichia coli in chemostats.

Authors:  Joël Gaffé; Christopher McKenzie; Ram P Maharjan; Evelyne Coursange; Tom Ferenci; Dominique Schneider
Journal:  J Mol Evol       Date:  2011-03-12       Impact factor: 2.395

6.  The population structure of Acinetobacter baumannii: expanding multiresistant clones from an ancestral susceptible genetic pool.

Authors:  Laure Diancourt; Virginie Passet; Alexandr Nemec; Lenie Dijkshoorn; Sylvain Brisse
Journal:  PLoS One       Date:  2010-04-07       Impact factor: 3.240

7.  Long-term predominance of two pan-European clones among multi-resistant Acinetobacter baumannii strains in the Czech Republic.

Authors:  Alexandr Nemec; Lenie Dijkshoorn; Tanny J K van der Reijden
Journal:  J Med Microbiol       Date:  2004-02       Impact factor: 2.472

8.  ISfinder: the reference centre for bacterial insertion sequences.

Authors:  P Siguier; J Perochon; L Lestrade; J Mahillon; M Chandler
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

9.  Genomic Analysis of the Emergence and Rapid Global Dissemination of the Clonal Group 258 Klebsiella pneumoniae Pandemic.

Authors:  Jolene R Bowers; Brandon Kitchel; Elizabeth M Driebe; Duncan R MacCannell; Chandler Roe; Darrin Lemmer; Tom de Man; J Kamile Rasheed; David M Engelthaler; Paul Keim; Brandi M Limbago
Journal:  PLoS One       Date:  2015-07-21       Impact factor: 3.240

10.  Comparative analysis of Acinetobacters: three genomes for three lifestyles.

Authors:  David Vallenet; Patrice Nordmann; Valérie Barbe; Laurent Poirel; Sophie Mangenot; Elodie Bataille; Carole Dossat; Shahinaz Gas; Annett Kreimeyer; Patricia Lenoble; Sophie Oztas; Julie Poulain; Béatrice Segurens; Catherine Robert; Chantal Abergel; Jean-Michel Claverie; Didier Raoult; Claudine Médigue; Jean Weissenbach; Stéphane Cruveiller
Journal:  PLoS One       Date:  2008-03-19       Impact factor: 3.240
