
Continuing evolution of Burkholderia mallei through genome reduction and large-scale rearrangements.

Liliana Losada, Catherine M Ronning, David DeShazer, Donald Woods, Natalie Fedorova, H Stanley Kim, Svetlana A Shabalina, Talima R Pearson, Lauren Brinkac, Patrick Tan, Tannistha Nandi, Jonathan Crabtree, Jonathan Badger, Steve Beckstrom-Sternberg, Muhammad Saqib, Steven E Schutzer, Paul Keim, William C Nierman.

Abstract

Burkholderia mallei (Bm), the causative agent of the predominately equine disease glanders, is a genetically uniform species that is very closely related to the much more diverse species Burkholderia pseudomallei (Bp), an opportunistic human pathogen and the primary cause of melioidosis. To gain insight into the relative lack of genetic diversity within Bm, we performed whole-genome comparative analysis of seven Bm strains and contrasted these with eight Bp strains. The Bm core genome (shared by all seven strains) is smaller in size than that of Bp, but the inverse is true for the variable gene sets that are distributed across strains. Interestingly, the biological roles of the Bm variable gene sets are much more homogeneous than those of Bp. The Bm variable genes are found mostly in contiguous regions flanked by insertion sequence (IS) elements, which appear to mediate excision and subsequent elimination of groups of genes that are under reduced selection in the mammalian host. The analysis suggests that the Bm genome continues to evolve through random IS-mediated recombination events, and differences in gene content may contribute to differences in virulence observed among Bm strains. The results are consistent with the view that Bm recently evolved from a single strain of Bp upon introduction into an animal host followed by expansion of IS elements, prophage elimination, and genome rearrangements and reduction mediated by homologous recombination across IS elements.

Keywords:  bacterial evolution; bacterial virulence; comparative genomics; genome erosion

Year:  2010        PMID: 20333227      PMCID: PMC2839346          DOI: 10.1093/gbe/evq003

Source DB:  PubMed          Journal:  Genome Biol Evol        ISSN: 1759-6653            Impact factor:   3.416


Introduction

Burkholderia mallei (Bm) is a pathogen that is not found outside its mammalian host (Sanford 1995), yet its genome is highly similar to that of Burkholderia pseudomallei (Bp), a versatile, saprophytic pathogen endemic to the warm, wet soils of South East Asia and Northern Australia (Dance 1991). Bm causes glanders in equids, usually resulting in chronic infections, but it can cause fatal, acute infection in humans and other domesticated mammals. Its historical use as a biological weapon has led the Centers for Disease Control and Prevention to classify Bm and Bp as category B select agents. Bp causes the human disease melioidosis and has been associated with disease in numerous hosts beyond mammals, including birds and reptiles, and can even survive inside amoebae (Inglis et al. 2000). It has been suggested that Bm evolved from a single strain of Bp, after an ancestral strain infected an animal host and then lost genes not required for survival in the host, ultimately becoming an obligate pathogen (Godoy et al. 2003; Nierman et al. 2004). This hypothesis is supported by the genomic similarity shared by two reference strains: both Bp K96243 and Bm ATCC23344 possess two circular chromosomes, nearly all Bm genes have orthologs in Bp, and Bp has roughly 1,200 additional genes.

The versatility of Bp’s host range and living environments is reflected in the species’ genome. For example, there exists a wide array of genomic islands (GIs) variably represented across different Bp strains that give each strain different characteristics (Sim et al. 2008; Tuanyok et al. 2008; Tumapa et al. 2008). Moreover, it is proposed that these GIs were acquired via horizontal gene transfer from other soil saprophytes, consistent with a life in diverse environments outside of a host. Lastly, different GIs are present in strains isolated from different regions of the world (Sim et al. 2008; Tuanyok et al. 2008; Tumapa et al. 2008), demonstrating that the genomes are adapted to different environmental conditions. In contrast, the underlying mechanism for host and environmental restriction in Bm is not clearly understood. These observations are similar to those in other bacterial genera where a “host-generalist” pathogen (in this case Bp) has undergone genome erosion (Ochman and Davalos 2006) that resulted in a “host-restricted” pathogen (Bm). Bm appears to be in an intermediate stage of erosion similar to Shigella flexneri, Salmonella typhi, and Francisella tularensis (Ochman and Davalos 2006).

Genome evolution in bacterial pathogens is a dynamic process that can occur over long periods of time but also during the span of short infections in a host (Oliver et al. 2000; Kraft et al. 2006). Under great selective pressures, such as survival in a host, unnecessary or deleterious genes could mutate rapidly or be lost entirely. Recombination across repeated sequences in a genome can lead to rapid gene mutation and loss. The genomes of Bp and Bm have very high contents of simple sequence repeats and IS elements that could have mediated recombination, resulting in the common gene disruptions, genomic inversions, translocations, duplications, and deletions observed in the reference Bm genome (Nierman et al. 2004). However, the extent of these gene losses and rearrangements across multiple Bm isolates has not been studied, and thus, it is unknown how common these events have been across the species.

We hypothesized that comparative genomic analysis of several Bm and Bp genomes would reveal a core set of genes essential for survival and virulence in a mammalian host, and elucidate genes involved in environmental survival. In addition, the analysis would also clarify the evolutionary process from a Bp ancestor to a modern Bm genome. Our results provide strong evidence for the evolution of Bm from a single ancestral Bp strain whose genome eroded through IS-mediated elimination of clusters of genes. The analysis suggests that the deleted genes were those that contributed to survival of Bp in the environment but were nonessential to the life of Bm as a mammalian pathogen. In addition, several clusters of genes were variably lost from different Bm strains, suggesting that the Bm genomes still contain genes that are under reduced selection in the equid host and might be unnecessary for survival in the host. Last, the results show that Bm continues to undergo genomic erosion that can lead to reduced virulence.

Materials and Methods

Bacterial Strains

Seven B. mallei strains and eight Bp strains were selected for sequencing and analysis based on geographic origin and virulence status (table 1).
Table 1

Burkholderia mallei and B. pseudomallei Strains Used in This Study

| Strain | GenBank accession number | Virulent | Source | MLST | Chromosome I size (bp) | Chromosome II size (bp) | Total genes | Variable genes (% of genome) [a] |
|---|---|---|---|---|---|---|---|---|
| B. mallei | | | | | | | | |
| ATCC23344 (Nierman et al. 2004) | NC_006348, NC_006349 | Yes [b] | Burma 1944 | 40 | 3,510,148 | 2,325,379 | 5,229 | 1,773 (34%) |
| NCTC10229 | NC_008836, NC_008835 | Yes [b] | Hungary 1961 | 40 | 3,458,208 | 2,284,095 | 5,519 | 2,063 (37%) |
| NCTC10247 | NC_009080, NC_009079 | Attenuated [b] | Turkey 1960 | 100 | 3,495,687 | 2,352,693 | 5,869 | 2,413 (41%) |
| SAVP1 | NC_008785, NC_008784 | No | Schutzer et al. (2008) | 40 | 3,497,479 | 1,734,922 | 5,200 | 1,744 (33%) |
| 2002721280 | NZ_AANX00000000 [c] | No [b] | Pasteur Institute | 40 | | | 5,300 | 2,239 (35%) |
| ATCC10399 | NZ_AAHN00000000 [c] | Yes [b] | China 1942 | 40 | | | 5,749 | 1,844 (40%) |
| PRL-20 | NZ_AAZP00000000 [c] | Yes | Pakistan 2005 | 40 | | | 5,469 | 2,013 (37%) |
| B. pseudomallei | | | | | | | | |
| K96243 (Holden et al. 2004) | NC_006350, NC_006351 | Yes | Thailand 1996 | 10 | 4,074,542 | 3,173,005 | 6,324 | 688 (11%) |
| 1106a | NC_009076, NC_009078 | Yes | Thailand 1993 | 70 | 3,988,455 | 3,100,794 | 7,187 | 1551 (21%) |
| 1710b | NC_007434, NC_007435 | Yes | Thailand 1999 | 177 | 4,126,292 | 3,181,762 | 7,088 | 1452 (20%) |
| 668 | NC_009074, NC_009075 | Yes | Australia 1995 | 129 | 3,912,947 | 3,127,456 | 7,232 | 1388 (19%) |
| 1655 | NZ_AAHR00000000 [c] | Yes | Australia 2003 | 131 | | | 6,980 | 1344 (19%) |
| 406e | NZ_AAMM00000000 [c] | Yes | Thailand 1988 | 211 | | | 6,880 | 1244 (18%) |
| S13 | NZ_AAHW00000000 [c] | Yes | Singapore | 51 | | | 7,217 | 1581 (22%) |
| Pasteur 52237 | NZ_AAHV00000000 [c] | Yes | Viet Nam | 411 | | | 7,154 | 1518 (21%) |

[a] Core genome is 3,456 genes for Bm and 5,636 for Bp.

[b] Virulence determined by Syrian hamster infection model. Three groups of female Syrian hamsters (five per group) were infected by the intraperitoneal route with a range of 10^1–10^3 cfu for each strain of B. mallei examined. Mortality was recorded daily for 14 days and on day 15, the surviving animals from each group were euthanized.

[c] WGS, whole-genome shotgun sequencing (unfinished).

Sequencing and Annotation

The Bm type strain ATCC23344 was previously sequenced (Nierman et al. 2004). Three Bm strains (NCTC10229, NCTC10247, and SAVP1) were sequenced with full closure and manually annotated using approaches previously described (Nierman et al. 2004). The remaining three (2002721280, ATCC10399, and PRL-20) were sequenced to 8× Sanger sequence coverage by the whole-genome shotgun method (Fleischmann et al. 1995) without closure and assembled using Celera Assembler (Myers et al. 2000), and the contigs were oriented by alignment to the reference strain ATCC23344 using PROMER (Delcher et al. 2002). Open reading frames (ORFs) were predicted and annotated automatically using GLIMMER (Salzberg et al. 1998; Delcher et al. 1999). Pseudochromosomes were constructed from the ordered scaffolds, using manual examination where necessary. Bp strains 1106a, 1710b, and 668 were sequenced with full closure and manual annotation, whereas 1655, 406e, S13, and Pasteur 52237 were sequenced to 8× coverage without closure and annotated automatically. The Bp type strain K96243 was downloaded for analysis (Holden et al. 2004).

Analysis of Functional Role Categories

Proportions of genes in each functional role category were calculated for each strain and then averaged over all seven Bm strains, over four virulent Bm strains, or over three avirulent Bm strains. T-tests were performed on the square root transformed percentage data to determine the significance of the difference between core and variable genes.
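As a rough illustration of the comparison described above, the sketch below square-root transforms per-strain percentages and computes a two-sample t statistic. The text does not say which t-test variant was used, so Welch's unequal-variance form is an assumption here, and the percentages and function names are hypothetical. Only the statistic and degrees of freedom are computed; a p-value would require a t-distribution routine from a statistics package.

```python
import math
from statistics import mean, variance


def sqrt_transform(percentages):
    """Square-root transform a list of percentage values."""
    return [math.sqrt(p) for p in percentages]


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na  # squared standard error of mean A
    vb = variance(sample_b) / nb  # squared standard error of mean B
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical percentages of genes in one role category, one value per strain
core_pct = [4.0, 4.4, 3.6, 4.2, 3.9, 4.1, 4.3]
variable_pct = [11.0, 12.5, 9.8, 13.1, 10.4, 12.0, 11.6]

t_stat, dof = welch_t(sqrt_transform(core_pct), sqrt_transform(variable_pct))
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```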

Identification of Shared and Strain-Specific Genes

Coding sequences (CDSs) from each strain were aligned against the whole-genome sequence of every other strain using the Program to Assemble Spliced Alignments (Haas et al. 2003). All CDSs that could not be aligned were thus assumed to be specific to that strain relative to the strain against which it was aligned.

Identification of Paralogs

CD-Hit was used to identify paralogs with 90% amino acid sequence identity within each of the Bm genomes.

Pan-Genome Analysis

The pan-genome analysis was carried out as described previously (Tettelin et al. 2005). Briefly, after sequentially comparing the seven Bm strains and the eight Bp strains in all possible combinations, the sizes of the species core- and pan-genomes were extrapolated (for detailed statistical calculations, see Tettelin et al. 2005). The core genome analysis was also conducted using OrthoMCL with a Blast e value cutoff of 1 × 10^−5 and an inflation parameter of 1.5. The OrthoMCL output was used to construct tables of shared orthologs and strain-specific genes.
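The sequential-comparison idea can be sketched as follows. This is a simplified illustration, not the statistical extrapolation of Tettelin et al. (2005): it assumes each genome has already been reduced to a set of gene-family identifiers, averages the number of new genes contributed by the nth genome over every ordering of the strains, and uses toy data. The function name and strain labels are hypothetical.

```python
from itertools import permutations
from statistics import mean


def new_gene_curve(genomes):
    """Mean number of genes new to the nth genome added, over all orderings.

    genomes: dict mapping strain name -> set of gene-family identifiers.
    Returns a list whose entry n-1 is the mean count for the nth genome.
    """
    strains = list(genomes)
    counts = [[] for _ in strains]
    for order in permutations(strains):
        seen = set()
        for position, strain in enumerate(order):
            counts[position].append(len(genomes[strain] - seen))
            seen |= genomes[strain]
    return [mean(c) for c in counts]


# Toy data: three strains sharing most gene families
toy_genomes = {
    "s1": {"a", "b", "c"},
    "s2": {"a", "b", "d"},
    "s3": {"a", "b", "c"},
}
curve = new_gene_curve(toy_genomes)
print(curve)
```

A curve that falls quickly toward zero corresponds to a closed pan-genome, whereas a curve that flattens above zero implies that each additional genome keeps contributing new genes. Enumerating all orderings is only practical for small numbers of genomes.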

Whole-Genome Alignments

WebACT (Abbott et al. 2005) and the multigenome homology tools at the Pathema web site (http://pathema.jcvi.org) were used to generate alignment images with an e value cutoff of 1 × 10^−5.

Construction of Species Tree

First, orthologous proteins (60–80% identical over at least 90% of their length) from Bm ATCC23344, Bp K96243, B. thailandensis E264, and B. cenocepacia AU 1065 were identified by cluster analysis. From this set, all proteins annotated as “putative,” “domain,” “family,” and “related,” as well as all hypothetical and unknown proteins, were eliminated. The selected proteins from each of the four species were concatenated and searched individually against the complete protein sets of B. ambifaria MC40-6, B. cepacia AMMD, B. multivorans ATCC17616, B. phymatum STM815, B. phytofirmans PsJN, B. vietnamiensis G4, B. xenovorans LB400, and Pseudomonas aeruginosa PA7 using BlastP to identify orthologs from these species. The final set, which consisted of 56 proteins from each of the 12 species that were 60–80% identical over at least 95% of their length, was aligned using Muscle (Edgar 2004) and then concatenated (supplementary table 1, Supplementary Material online). Bootstrapped maximum likelihood trees were calculated from the concatenated protein set using the PHYLIP package, applying the JTT substitution model with a gamma distribution (α = 0.5) of rates over four categories of variable sites, and a consensus tree was produced from the bootstrap replicates. Bootstrapped maximum parsimony and Neighbor-Joining trees were also created with PHYLIP, using the default parameters for those methods.

Identification of Orthologous Genes and Evolutionary Comparisons (dN/dS Analysis)

Orthologous gene pairs were compiled from eight Bm strains by identifying symmetrical best hits between proteins from the reference strain ATCC23344 and the other seven Bm genomes using BlastP (http://www.ncbi.nlm.nih.gov/BLAST/) with a cutoff of 1 × 10^−10. Nucleotide sequence alignments were produced for orthologous pairs of ATCC23344 and each other Bm strain using MUSCLE and OWEN (Ogurtsov et al. 2002; Edgar 2004). Alignments of CDSs were guided by their corresponding amino acid sequence alignments (Kondrashov and Shabalina 2002). In cases where greater than 30% of the gaps or annotated regions of putative orthologs did not align or where pairs of sequences aligned perfectly (100% similarity), the sequence pairs were removed from further analysis. dN and dS values were calculated by the Nei–Gojobori method (Nei and Gojobori 1986; Yang 1997). Overall, 1,018 and 219 detailed alignments were generated from the original 4,197 core and 996 variable Bm genes, respectively, and dN/dS ratios were estimated. Differences between rates of synonymous (dS) and nonsynonymous (dN) substitutions in the variable and core coding regions were analyzed with the Wilcoxon rank sum test.
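The symmetrical (reciprocal) best-hit step can be illustrated with a minimal sketch. It assumes the BlastP results have already been parsed into tuples of query, subject, e value, and bit score; that tuple layout, the use of bit score to rank hits, and all identifiers below are hypothetical choices for illustration rather than details taken from the analysis.

```python
def best_hits(hits, max_evalue=1e-10):
    """Map each query to its highest-scoring subject passing the e value cutoff.

    hits: iterable of (query, subject, evalue, bitscore) tuples.
    """
    best = {}
    for query, subject, evalue, bitscore in hits:
        if evalue > max_evalue:
            continue  # discard hits weaker than the cutoff
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a, max_evalue=1e-10):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b, max_evalue)
    reverse = best_hits(b_vs_a, max_evalue)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy search results between a reference proteome and one other strain
ref_vs_other = [
    ("ref1", "oth1", 1e-50, 200.0),
    ("ref1", "oth2", 1e-20, 90.0),
    ("ref2", "oth2", 1e-30, 120.0),
    ("ref3", "oth3", 1e-5, 40.0),  # fails the e value cutoff
]
other_vs_ref = [
    ("oth1", "ref1", 1e-50, 200.0),
    ("oth2", "ref1", 1e-20, 90.0),
    ("oth2", "ref2", 1e-30, 120.0),
    ("oth3", "ref3", 1e-5, 40.0),
]
pairs = reciprocal_best_hits(ref_vs_other, other_vs_ref)
print(pairs)
```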

Results

Genome Features

Bm was reported to have evolved from a single strain of Bp that became highly adapted to its mammalian host (Godoy et al. 2003). In order to determine whether Bm was the result of genome reduction and clarify the mechanism of the proposed host adaptation, six Bm strains and seven Bp strains were sequenced and used in whole-genome comparative analyses. Each of the strains sequenced was selected based on its geographical or clinical isolation (table 1). Among the Bm strains, two were avirulent in a Syrian hamster model (SAVP1 and 2002721280) and one had reduced virulence (NCTC10247). The genome sizes of the seven sequenced Bm strains were similar, averaging 5.7 Mb (table 1). However, chromosome II of strain SAVP1 was significantly smaller than that of the other fully sequenced strains. The eight sequenced Bp strains averaged 7.2 Mb, approximately 1.5 Mb larger than that of Bm, and the corresponding chromosomes of the four fully sequenced and closed strains were relatively similar in size. The genomic diversity among seven housekeeping genes in Bm and Bp strains was studied using multilocus sequence type (MLST) analysis (Maiden et al. 1998). Despite the differences in geographical distribution or virulence, all but one of the Bm strains belonged to the same MLST (http://bpseudomallei.mlst.net/; table 1), suggesting a lack of genetic diversity. The two identified MLST groups differed only in one nucleotide within the gltB locus, further demonstrating a highly similar genetic landscape. These results were consistent with Chantratita et al. (2006), who found that 21 isolates of Bm belonged to only one MLST type. In contrast, each of the eight Bp strains belonged to a different MLST group (table 1), and none of the Bp MLST groups matched the Bm MLST groups. Based on MLST relatedness, K96243 is the closest sequenced Bp relative, although there exist several Bp isolates with closer MLST profiles whose genome sequence is not known (Godoy et al. 2003). 
Combined, the genome properties and MLST data provide evidence for the clonal evolution of Bm from a single Bp ancestor.

Bm Lost Large Clusters of Bp Genes Associated with Environmental Survival

To better understand the genome reduction among Bm strains, we performed reciprocal comparisons of all CDSs of each strain of one species with the genome sequence of each strain of the other species. The results showed that, as expected, many genes were Bp-specific relative to Bm (ranging from 1,122 to 1,488), whereas only very few (0–8) Bm-specific genes exist (data not shown). All the Bm-specific genes were either hypothetical proteins or phage integrases, presumably relics from a Bp ancestor. Interestingly, roughly 40% of the Bp-specific genes were clustered in the Bp genome and mapped to the GIs identified previously (Holden et al. 2004; Tuanyok et al. 2008; Tumapa et al. 2008; fig. 1). Furthermore, none of the GIs from the sequenced B. pseudomallei genomes are represented in any of the Bm genomes (data not shown). Almost all the remaining 60% of Bp-specific genes also clustered in the genome (fig. 1) and, in some cases, were deletions surrounding the GIs, similar to the observation made in a wide panel of Bp isolates (Sim et al. 2008). The loss of these GIs could explain why Bm is not found in the environment because many of the GIs lost in Bm have functions associated with survival and competition in the soil environment (Holden et al. 2004; Tuanyok et al. 2008; Tumapa et al. 2008). For instance, at least four of the GIs lost encode multidrug resistance pumps. In addition, several of the Bp GIs encode secondary metabolite clusters that could act as antibacterials or antifungals (Duerkop et al. 2009), and thus allow Bp to compete in the soil, whereas Bm would be at a disadvantage.
Fig. 1. Multigenome alignment of eight Bp and seven Bm strains. Each circle represents a genome as presented in Materials and Methods. All genomes are aligned with the Bp K96243 genome as a reference, which appears as the outermost multicolored circle. The Bp genomes are the eight outermost circles, and Bm genomes are internal. Areas in each color represent homologies between the subject genome and the reference. Areas in black in the reference chromosome (outermost circle) are regions present in K96243 but absent in the query genome. Areas in black in each of the concentric circles are regions present in the query genome but absent from K96243. Representative Bp GIs are shown with red arrows. Representative clusters of Bp-specific genes absent from all Bm genomes (black on the K96243 ring) are highlighted with a yellow arrow.

Because the GIs are proposed to be the source of environmental variability in Bp (Holden et al. 2004; Tuanyok et al. 2008; Tumapa et al. 2008), and given that these are absent from the Bm genome, we hypothesized that the entire genetic complement, or pan-genome (Tettelin et al. 2005), of Bm would be significantly reduced compared with Bp. Pan-genome analysis confirmed that the Bm strains were remarkably homogeneous in their gene content. The number of new genes dropped off precipitously and essentially leveled off after inclusion of only five genomes, indicating that sequencing additional Bm strains would not reveal a significant number of novel genes (fig. 2). In other words, essentially all Bm genes will be identified after only 4 or 5 additional genomes are sequenced. In contrast, the number of new genes leveled off much more gradually in Bp (fig. 2), suggesting that 25–50 new genes will be revealed with each newly sequenced strain.
Fig. 2. Pan-genome analysis of seven Bm and eight Bp strains. The CDSs in all Bm genomes (blue line) and Bp genomes (red line) were compared, and the number of new genes was plotted against the number of genomes used. The blue dashed line represents the extrapolated number of Bm strain-specific genes. The red dashed line represents the extrapolated minimum number of new genes discovered with each Bp genome.

Bm Has a Distinct Variable and Core Genome

The Bm core genome is defined as the set of genes that is common to all strains, whereas the strain-specific variable gene sets contain genes that are absent from at least one of the other Bm genomes. The Bm core genome consisted of 3,456 genes, whereas the pan-genome (the core gene set plus variable genes) contained about 2,300 more (roughly 5,700 genes; table 1). SAVP1 and ATCC23344 had the fewest variable genes (1,773 and 1,774, respectively), whereas NCTC10247 had the most (2,413). Many of the core genes had duplicates and paralogs that were considered part of the variable gene set. The total number of duplicates or paralogs in each strain ranged from 240 to 253, most of which were annotated as IS elements. Consistent with the hypothesis that Bm evolved from a single strain of Bp, these Bm variable genes all had orthologs in Bp K96243, suggesting that the mammalian host environment offered no opportunity for new gene influx into the Bm pan-genome. These results suggest that the Bm pan-genome is closed and that the organism has entered an evolutionary bottleneck in the host. The Bp core- and pan-genomes (ca. 5,300 and 7,500 genes, respectively; table 1) were larger than those of Bm. The variable genome of Bp ranged from 454 to 837 genes, many of which were encoded within GIs. Interestingly, the variable genome in Bm encompassed a larger portion of the genome (33–41%) than in Bp (20.6%), suggesting that even with a relatively narrow genetic base, the genome of Bm is continuing to change, albeit without actual gain of genes to the pan-genome. It is possible that the large Bm variable genome is an artifact of in vitro culture deletions that led to a loss of virulence (Schutzer et al. 2008). In vitro culture would remove the selective pressure on genes essential for survival in the mammalian host, leading to the loss of some of these genes. 
To address this possibility, the analysis was repeated after removing the two avirulent strains (SAVP1 and 2002721280). The size of the variable genome decreased by 610 genes, and accordingly, the core genome increased by 610 genes because those genes were shared among the remaining five strains. Interestingly, none of the 610 genes were lost from both avirulent strains, showing that there exist at least two independent traits that are essential for virulence in a mammalian host (see below).

Analysis of functional role categories of variable genes among strains of Bm and of Bp revealed significant differences between the two variable genomes (table 2) that were consistent with each species’ lifestyle. Much of the Bp variable genome was associated with phage elements or complete prophage (Ronning CM, Nierman WC, Ulrich RL, DeShazer D, in preparation) and had predominant gene functions of mobile and extrachromosomal elements (29.3%) and DNA metabolism (24.5%; table 2). These genes were probably acquired through lateral gene transfer in the soil environment. In contrast, the predominant roles in the Bm variable genes are cell envelope, cellular processes, energy metabolism, regulatory functions, and transport and binding (table 2). These functions are probably essential for survival and competition in the environment but are under lower selection in the host (Casadevall 2008).
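The core/variable bookkeeping and the effect of dropping the avirulent strains can be sketched with set operations. The example assumes each genome is represented as a set of ortholog identifiers; the strain labels, gene names, and helper function are hypothetical toy values, not data from the study.

```python
def core_genome(genomes, strains=None):
    """Genes shared by every listed strain (all strains if none are listed)."""
    selected = list(genomes) if strains is None else strains
    return set.intersection(*(genomes[s] for s in selected))


# Toy gene sets: two virulent and two avirulent strains
toy = {
    "virulent_1": {"g1", "g2", "g3", "g4"},
    "virulent_2": {"g1", "g2", "g3", "g4"},
    "avirulent_1": {"g1", "g2", "g3"},  # has lost g4
    "avirulent_2": {"g1", "g2", "g4"},  # has lost g3
}

core_all = core_genome(toy)
core_virulent = core_genome(toy, ["virulent_1", "virulent_2"])

# Genes that move from the variable set into the core once the
# avirulent strains are excluded
gained = core_virulent - core_all

# Of those, genes missing from BOTH avirulent strains
lost_from_both = {
    g for g in gained
    if g not in toy["avirulent_1"] and g not in toy["avirulent_2"]
}

variable_in_v1 = toy["virulent_1"] - core_all
print(sorted(gained), sorted(lost_from_both), sorted(variable_in_v1))
```

In this toy case each avirulent strain has lost a different gene, so `lost_from_both` is empty, mirroring the pattern reported for the 610 genes.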
Table 2

Percentages of Total Variable Genes within Each Functional Role Category

| Role category | B. pseudomallei mean (%) | B. pseudomallei standard deviation (%) | B. mallei mean (%) | B. mallei standard deviation (%) |
|---|---|---|---|---|
| Amino acid biosynthesis | 1.49 | 2.37 | 2.16 | 0.89 |
| Biosynthesis of cofactors, prosthetic groups, and carriers | 0.83 | 1.55 | 1.02 | 0.16 |
| Cell envelope | 6.80 | 4.78 | 11.46 | 2.44 |
| Cellular processes | 6.12 | 3.44 | 12.42 | 3.16 |
| Central intermediary metabolism | 2.10 | 3.12 | 2.67 | 0.35 |
| DNA metabolism | 24.51 | 14.39 | 0.90 | 0.21 |
| Energy metabolism | 3.57 | 4.83 | 14.42 | 1.86 |
| Fatty acid and phospholipid metabolism | 0.66 | 1.33 | 3.89 | 0.78 |
| Mobile and extrachromosomal element functions | 29.34 | 14.70 | 0.82 | 0.41 |
| Protein fate | 3.56 | 4.18 | 6.56 | 1.26 |
| Protein synthesis | 0.76 | 2.14 | 1.09 | 0.41 |
| Purines, pyrimidines, nucleosides, and nucleotides | 0.00 | 0.00 | 1.85 | 0.78 |
| Regulatory functions | 8.84 | 5.75 | 16.78 | 1.23 |
| Signal transduction | 0.00 | 0.00 | 5.37 | 3.50 |
| Transcription | 2.36 | 2.39 | 0.75 | 0.29 |
| Transport and binding proteins | 7.33 | 6.52 | 17.85 | 1.83 |
| Viral functions | 1.73 | 3.97 | 0.00 | 0.00 |

NOTE.—Mean, standard deviation, and range are given for eight Bp strains and seven Bm strains. Hypothetical and unknown proteins and proteins of unknown function have been excluded.

Bm Variable Genes Exist in Multigene Contiguous Clusters Flanked by IS Elements

For all Bm strains, the vast majority of the genes that were present in a particular strain but absent from one or more of the other strains tended to occur in contiguous clusters within that strain, with the total number of these variable gene clusters ranging from 9 to 18 for each strain (table 3). The presence or absence of these variable regions appeared to be the primary difference between Bm strains. In all strains, there were more variable gene clusters on chromosome II than chromosome I, even though chromosome II is smaller. The variable clusters among the seven Bm strains were classified into 24 groups based on sequence homology (table 3). The number of strains from which each cluster was absent ranged from 1 (clusters A, D, F, G, I, J, L, M, N, P, Q, and R) to 5 (cluster X). The variable regions varied greatly in size, from ∼3.4 kb (cluster N) to ∼269 kb (cluster Q).
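One simple way to find such contiguous clusters, given a strain's genes in chromosomal order and the set of genes flagged as variable, is to collect runs of adjacent variable genes. The sketch below is an assumption about how this could be done, not the procedure used in the study; the gene identifiers, function name, and minimum run length are hypothetical.

```python
from itertools import groupby


def contiguous_clusters(gene_order, variable, min_genes=2):
    """Return runs of adjacent variable genes along one chromosome.

    gene_order: gene identifiers in chromosomal order.
    variable: set of identifiers absent from at least one other strain.
    min_genes: shortest run that is reported as a cluster.
    """
    clusters = []
    for is_variable, run in groupby(gene_order, key=lambda g: g in variable):
        run = list(run)
        if is_variable and len(run) >= min_genes:
            clusters.append(run)
    return clusters


# Toy chromosome with one three-gene cluster and one isolated variable gene
order = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]
variable_genes = {"g2", "g3", "g4", "g7"}

clusters = contiguous_clusters(order, variable_genes)
print(clusters)
```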
Table 3

Variable Gene Clusters in Bm

| Cluster | 5' end | 3' end | Size (bp) | Boundary (5'/3') | Strains with cluster (of 7) | Number of putative virulence genes [a] | NRPS/PKS/Multidrug efflux pump [b] |
|---|---|---|---|---|---|---|---|
| A | 600,776 | 612,728 | 11,953 | IS407A | 6 | 1 | |
| B | 1,000,692 | 1,080,040 | 79,349 | IS407A | 4 | 11 | RND |
| C | 1,269,317 | 1,277,504 | 8,188 | IS407A | 6 | 4 | |
| D | 2,053,557 | 2,070,428 | 16,872 | IS407A | 6 | 5 | |
| E | 2,335,045 | 2,354,063 | 19,019 | IS407A | 6 | 2 | PKS |
| F | 2,527,011 | 2,629,142 | 102,132 | ISBm2/IS407A | 6 | 20 | |
| G | 3,320,410 | 3,346,619 | 26,210 | ISBm2 | 6 | 6 | |
| H | 104,657 | 170,441 | 65,785 | IS407A | 5 | 13 | RND |
| I | 173,242 | 319,417 | 146,176 | ISBm2 | 6 | 32 | |
| J | 409,775 | 432,884 | 23,110 | IS407A | 6 | 5 | |
| K | 567,683 | 655,441 | 87,759 | IS407A | 5 | 15 | |
| L | 658,191 | 733,816 | 75,626 | IS407A | 6 | 9 | RND |
| M | 839,856 | 869,581 | 29,726 | IS407A | 6 | 6 | |
| N | 895,207 | 898,647 | 3,441 | ISBm2 | 6 | 1 | |
| O | 1,015,758 | 1,061,756 | 45,999 | IS407A | 5 | 6 | RND, PKS |
| P | 1,176,775 | 1,225,744 | 48,970 | ISBm1/IS407A | 6 | 13 | NRPS |
| Q | 1,518,817 | 1,790,695 | 271,879 | IS407A | 6 | 64 | NRPS, PKS |
| R | 2,158,811 | 2,265,535 | 106,725 | ISBm2 | 6 | 28 | PKS |
| S | 1,136,910 | 1,145,707 | 8,798 | None/IS407A | 5 | 2 | |
| T | 783,963 | 817,798 | 33,836 | IS407A/transposase | 5 | 18 | |
| U | 2,650,189 | 2,695,429 | 45,241 | IS407A | 4 | 0 | |
| V | 947,304 | 951,928 | 4,625 | IS407A/none | 3 | 1 | |
| W | 1,809,469 | 1,823,849 | 14,381 | A, transposase OrfB/IS407A | 3 | 0 | |
| X | 1,237,829 | 1,245,922 | 8,094 | IS407A/ISBma2 | 3 | 1 | RND |

NOTE.—Each variable cluster was assigned a letter. Genomic locations for clusters A–R are from ATCC23344, where the bold font represents those located on chromosome II. Genomic locations for clusters S–W are from NCTC10247 (bold, chromosome II), and cluster X from NCTC10399 chromosome II. The "Strains with cluster" column gives the number of the seven Bm strains (ATCC23344, SAVP1, NCTC10229, NCTC10247, ATCC10399, 2002721280, and PRL-20) whose genome contains the cluster.

[a] Virulence genes were determined by using MVirDB as described in Materials and Methods.

[b] NRPS, nonribosomal peptide synthase; PKS, polyketide synthase; RND, resistance nodulation-division like pump.

Most of these clusters were flanked by transposases associated with IS elements, usually of the same type; however, a few were bounded by a transposase on one end only (table 3). Interestingly, some of these variable regions appeared contiguously in some genomes, for example, clusters C and T in SAVP1 and NCTC10229. By searching the sequence flanking the variable gene clusters against the strains from which the cluster is absent, putative excision points were mapped back to most of the strains (supplementary table 2, Supplementary Material online), which were invariably marked by transposases. In cases where the cluster was lost from several genomes, the excision point was the same in each genome. The results suggest that the variable gene clusters were present in the Bp ancestor and were differentially lost through IS element-mediated excision in different Bm strains. Interestingly, several of the Bm variable genes had functions associated with survival and competition in a soil environment such as synthesis of secondary metabolites and drug resistance mechanisms. In total, 5/24 of the variable regions contain genes involved in nonribosomal peptide or polyketide synthesis (table 3). In addition, several metal ion resistance genes and stress-related proteins also belong to the variable gene set (data not shown). 
Lastly, a different set of five variable regions encode multidrug efflux pumps (table 3). Interestingly, the genomes of ATCC10399 and SAVP1 encoded a 50-kb region containing a multidrug efflux pump that we had previously proposed as the source for aminoglycoside resistance (Nierman et al. 2004). Both of these genomes contained the same arrangement at the amrAB-ompR locus as Bp (data not shown; Moore et al. 1999) but contained a 6-bp deletion within the coding region of amrB that resulted in a two-amino-acid deletion in a highly conserved transmembrane motif (Putman et al. 2000) toward the C terminus of the protein. Both NCTC10247 and NCTC10229 contained a homolog of amrA, but the AmrB protein was truncated at amino acid 244, potentially resulting in sensitivity to aminoglycosides and macrolides. None of the remaining Bm genomes encoded this region. The finding that this cluster is present in some Bm strains could help explain previous studies where a few of the Bm strains were resistant to aminoglycosides (Thibault et al. 2004). A recent study found several aminoglycoside-sensitive clinical Bp isolates, some of which had also lost the entire amrAB-ompR locus, whereas others used an entirely different and unknown mechanism to repress expression of the operon (Trunck et al. 2009), suggesting that this locus is not necessary for survival in the host.

The Bm Genome Has Undergone a Dramatic Expansion of IS Elements That Mediated Extensive Intrachromosome Rearrangements within the Bm Strains

Whole-chromosome Rearrangements

In addition to the IS elements flanking the variable gene clusters, each Bm strain had a considerable repertoire of IS elements (ranging from 166 to 218; supplementary fig. 1, Supplementary Material online). In particular, IS elements of the type IS407A had undergone a significant expansion in all the sequenced Bm strains, accounting for 76% of all IS elements (supplementary fig. 1, Supplementary Material online). Interestingly, most of the IS407A elements in Bm did not have the flanking 4-bp repeat that results from a transposon insertion (88% in chromosome I and 77% in chromosome II; DeShazer et al. 2001), suggesting that these elements had been subject to homologous recombination. Bias in base composition among the existing 4-bp repeats suggested that the initial transposon insertions within the chromosome were nonrandom, but rearrangement since then was random (fig. 3). Whole-genome alignments demonstrated that Bm chromosomes were dramatically and extensively rearranged by recombination across IS407A elements (fig. 3). Among the Bm strains, none of the IS407A rearrangements occurred between chromosome I and chromosome II. Intriguingly, Bp contained an average of seven IS elements per genome (supplementary fig. 1, Supplementary Material online), but these have not catalyzed such genome-wide rearrangements (fig. 3). Thus, it is unclear whether environmental selective pressures maintain Bp’s genomic arrangement, as in Salmonella typhimurium (Kothapalli et al. 2005), or whether rearrangements occur in Bm because of its high IS element content.
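The test for a flanking 4-bp repeat described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the toy sequences, and the use of an exact direct repeat on a linear sequence are all assumptions made for illustration.

```python
def has_flanking_repeat(genome: str, start: int, end: int, repeat_len: int = 4) -> bool:
    """Return True if the repeat_len bases immediately upstream of the element
    genome[start:end] are identical to the repeat_len bases immediately downstream.

    Assumptions: 0-based, end-exclusive coordinates; a linear sequence (no
    wrap-around for a circular chromosome); an exact direct repeat.
    """
    if start < repeat_len or end + repeat_len > len(genome):
        return False  # not enough flanking sequence to compare
    upstream = genome[start - repeat_len:start]
    downstream = genome[end:end + repeat_len]
    return upstream == downstream


# Hypothetical toy sequences: a made-up "element" with and without a 4-bp repeat.
element = "TTTTCCCCAAAA"
inserted = "ACGT" + "GACG" + element + "GACG" + "TTAA"    # repeat kept: consistent with transposition
recombined = "ACGT" + "GACG" + element + "CTAG" + "TTAA"  # repeat lost: consistent with recombination

print(has_flanking_repeat(inserted, 8, 8 + len(element)))
print(has_flanking_repeat(recombined, 8, 8 + len(element)))
```

An element that keeps its repeat is consistent with an original transposition event, whereas mismatched flanks are what recombination between two copies of the element would be expected to leave behind.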
Fig. 3. IS407A rearrangement of whole genomes. (A) Relative occurrence of the nucleotides in the 4-bp direct repeat of IS407A element insertion is shown as bar graphs for each position in the box below. (B) Four fully sequenced Bm genomes were aligned using WebACT. Red lines denote homology between chromosomes organized in the same orientation. Blue lines show homology but inverse orientation in each chromosome. Yellow lines show the presence of IS407A elements. Regions with no homology are shown by the absence of red or blue lines. (C) Four fully sequenced Bp genomes were aligned, as described for Bm.


rrn Rearrangements in Bm

Chromosome I

It has been reported that many host-specific pathogens, so-called specialists, have undergone considerable ribosomal RNA operon rearrangements when compared with their generalist relatives (Liu and Sanderson 1998). We investigated whether any of the large-scale rearrangements observed in Bm also affected the position and organization of rrn operons when compared with Bp. All the finished Bp strains (K96243, 1106a, 1710b, and 668) shared the same number, distribution, and organization of the rrn operons. There are three complete operons on chromosome I and one on chromosome II that all share the same order: rrs(16S)–tRNA-Ile–tRNA-Ala–rrl(23S)–rrf(5S). In contrast, each of the four completely sequenced Bm strains had a different number and distribution of rrn loci (fig. 4). Each chromosome I had at least one complete rrn locus with the same order as described above. However, each Bm strain had lost an entire operon from chromosome I, and the third locus was interrupted at position 1,427 of the 23S locus. SAVP1 had an additional degenerate rrn locus remaining on chromosome I, in which an IS407A element interrupted the 23S locus at position 284. This IS element had the 4-bp repeat associated with an insertion event (DeShazer et al. 2001). Interestingly, the 5S locus was lost at all the degenerate locations. These results suggest that at least two 23S loci in Bm are susceptible to mutations via insertion of IS elements or phages that drive the loss of the 5S gene as well.
Fig. 4. IS407A-mediated rearrangements of rrn and replichores among Bm strains. (A) rrn rearrangements due to IS407A recombinations. The outermost ring corresponds to Bp K96243 but is representative of all Bp genomes. Green, ATCC23344; orange, NCTC10229; purple, NCTC10247; brown, SAVP1. The brown rrn cluster represents the locus rearranged into chromosome II in Bm. Red bars represent degenerate rrn loci. (B) Guanine/cytosine-skew representation of the NCTC10247 genome generated in DNAplotter (Carver et al. 2009). Green represents a negative guanine/cytosine skew, suggesting that ORFs are oriented in the negative strand, and purple represents a positive guanine/cytosine skew, suggesting that ORFs are oriented in the positive strand. The origin of replication for NCTC10247 chromosome I is predicted at around 2.3 Mb and the terminus at around 1.0 Mb. (C) Alignment of chromosome II of ATCC23344 with chromosome I of Bp K96243 as the Bp representative. Regions of homology are shown in blue. For the sake of clarity, only the genomic regions of interest are depicted.

In addition to the loss of the 3′ sequence at two loci, each of the Bm strains displayed a different organization of rrn operons on chromosome I (fig. 4). Despite a considerable degree of rearrangement, the orientation of the rrn operons was always in the direction of replication, consistent with observations in other species (Liu and Sanderson 1998; Shu et al. 2000). However, rearrangements in other species, like Salmonella and Shigella, almost always resulted in rrn operons that are equidistant from the origin of replication (Kothapalli et al. 2005). Compared with Bp, only ATCC23344 had an rrn operon at the same distance (0.2 Mb) from the origin of replication. Interestingly, NCTC10247 (reduced virulence strain) had a drastic rearrangement that left the rrn locus 1.1 Mb away from the origin of replication and also resulted in a chromosome with differently sized replichores (fig. 4). 
It has been proposed that rrn loci must be close to the ori for adequate expression of ribosomal components necessary during cellular division (Schmid and Roth 1987; Kothapalli et al. 2005). In addition, Escherichia coli strains with differently sized replichores are at a growth disadvantage (Lesterlin et al. 2008). Thus, it is possible that the attenuation in virulence in NCTC10247 can be explained by these genomic constraints. The growth rate of NCTC10247 in rich media over a 24-h span was only slightly slower than NCTC10229 (average td = 191 min and 193 min, respectively). However, during early exponential growth, the doubling time (td) of NCTC10247 was 104 min compared with only 79 min for NCTC10229. In an animal host, this difference in growth could be sufficient to explain the attenuation.
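To give a sense of scale for the early-exponential doubling times quoted above (104 min versus 79 min), the following sketch converts a doubling time into a specific growth rate, mu = ln(2)/td, and estimates how far apart two cultures would drift under idealized, uninterrupted exponential growth. The 8-h window is a hypothetical value chosen for illustration and is not taken from the study.

```python
import math


def growth_rate(td_min: float) -> float:
    """Specific growth rate (per minute) for a doubling time in minutes: ln(2) / td."""
    return math.log(2) / td_min


def fold_difference(td_fast: float, td_slow: float, minutes: float) -> float:
    """Ratio of population sizes (fast / slow) after `minutes` of ideal exponential
    growth from equal starting densities."""
    return 2 ** (minutes / td_fast - minutes / td_slow)


td_ncct10229 = 79.0    # min, early exponential phase (from the text)
td_nctc10247 = 104.0   # min, early exponential phase (from the text)
window = 8 * 60        # hypothetical 8-h window, for illustration only

print(f"mu (NCTC10229): {growth_rate(td_ncct10229):.5f} per min")
print(f"mu (NCTC10247): {growth_rate(td_nctc10247):.5f} per min")
print(f"fold difference after {window} min: {fold_difference(td_ncct10229, td_nctc10247, window):.2f}")
```

Under these simplifying assumptions, a modest difference in doubling time compounds into a several-fold difference in cell number within hours, which is the intuition behind the suggestion that the growth difference could matter in a host.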

Chromosome II

Neither of the rrn loci on Bm chromosome II has been subject to degeneration. When compared with Bp chromosome II, the additional rrn locus could be the result of intrachromosomal duplication of the existing locus or of an exchange between the two chromosomes. Whole-genome alignments with Bp revealed that this locus was part of a 46-kb interchromosomal exchange between chromosome I and chromosome II flanked by IS407A elements (fig. 4). This exchange occurred after the divergence between Bp and Bm because all the Bp strains carried this cluster on chromosome I and all Bm strains carried it on chromosome II. As in chromosome I, the organization of the rrn loci on chromosome II was not conserved, and each Bm strain had a different organization around the chromosome. However, all loci are oriented in the same direction of transcription, with none being as close to the origin of replication as the rrn locus in the Bp chromosome II.

fliP IS407A Element Insertions

Bm is nonmotile, and thus, it was originally surprising to find that flagellar biosynthesis genes were present in the ATCC23344 genome with only one obvious mutation: an IS407A insertion into fliP (Nierman et al. 2004). Comparative analysis of the seven Bm strains showed that all the flagellar genes are present in all strains, but each one has an IS407A element at the same location in fliP, 124 bp from the start position. In all of these genomes, the N-terminal disruption of fliP also resulted in a 4-bp GACG complementary direct repeat that suggests the IS element was initially introduced via a transposition event. None of the Bp strains have a similar fliP mutation. These results suggest that functional flagella are necessary for environmental survival or generalist behavior but not for survival or virulence in the narrow Bm host range. Furthermore, the retention of all other flagellar genes in Bm suggests that they might be used as an alternate secretion apparatus, as in Buchnera spp. (Toft and Fares 2008). Interestingly, three different types of alleles were identified (fig. 5) among the seven strains. In NCTC10247, NCTC10229, and 2002721280, only the IS407A element disrupts the gene. In ATCC23344, SAVP1, and NCTC10399, an additional 65-kb region was located adjacent to the IS407A element. This region was flanked by phage-associated proteins on the end closest to the fliP C terminus and by an ISBma1 element on the end closest to the fliP N terminus. In NCTC10247 and NCTC10229, this 65-kb genomic region is located elsewhere on chromosome I, flanked by ISBma1 and IS407A, suggesting that the arrangement in the latter group is due to a recombination event across the two IS407A elements, perhaps aided by the ISBma1 transposase. A third allele, present in PRL-20, shares all the loci present in the ATCC23344 allele but has at least three additional IS element-flanked insertions that resulted in a 179-kb insertion at fliP. 
None of the other IS407A elements within this region contain the 4-bp complementary direct repeat, further signifying that these insertions were not a result of transposition but instead due to intrachromosomal recombination mediated by IS407A. These observations suggest that genes in the Bm genome under no selection in the equid host will acquire IS insertions, and conversely, those genes without IS may be experiencing selection in the host.
Fig. 5. Genomic organization of the fliP locus in Bp and Bm. The wild-type fliP locus is present in all Bp. The fliP CDS is represented by dark purple rectangles. The NCTC10247 allele is interrupted by an IS407A (aquamarine) element. In ATCC23344, an ISBma1 (gray) is located upstream of the IS407A element and an additional 65 kb was inserted at this location. PRL-20 had additional IS407A-mediated insertions into fliP. Figures are not to scale, and IS407A elements in PRL-20 were made smaller.


Loss of Virulence Is Explained by IS-Mediated Loss of Essential Gene Clusters

We wished to determine whether any of the variable clusters contained virulence genes, particularly those absent from the two avirulent strains and the one attenuated strain. Putative virulence genes were identified by BLAST searches against the MvirDB database (Zhou et al. 2007; supplementary table 3, Supplementary Material online). Several of the clusters contained putative virulence genes, five of which (groups D, F, G, I, and R) were absent only from the avirulent strain 2002721280 and five (groups J, L, M, P, and Q) were absent solely from avirulent strain SAVP1. It has recently been reported that SAVP1 lacked the entire animal type III secretion system (TTSS) gene complex that was essential for virulence (Nierman et al. 2004; Ulrich and DeShazer 2004; Schutzer et al. 2008). The TTSS was encoded in the variable gene cluster P (table 3) that was lost through IS-mediated deletion in SAVP1 but was present in all other Bm strains. Because of its obvious virulence deficiency, no further analysis was done on this strain. Analysis of the other avirulent strain did not immediately reveal an obvious virulence defect. However, clusters D and F were lost through IS407A recombination. These clusters contain amino acid synthesis and transporter genes, and their loss probably resulted in a strain auxotrophic for lysine and ornithine and at least partially deficient in its capacity to take up several amino acids (glutamate, aspartate, leucine, valine, and isoleucine). Indeed, 2002721280 did not grow on minimal media (data not shown). Thus, it is likely that these deficiencies are sufficient to explain the lack of virulence observed in 2002721280, as was demonstrated for a branched-chain amino acid auxotroph of Bp (Atkins et al. 2002). Alternatively, the presence of large numbers of regulatory genes within the variable gene clusters lost from 2002721280 may, together with the identified virulence genes present within the clusters, influence the virulence phenotype of this strain. 
The attenuation of virulence observed in NCTC10247 could not be explained solely by the loss of genes. Only two variable gene clusters were absent from NCTC10247 (groups B and X), but these two groups also were absent from other virulent strains, suggesting that the attenuation may be due to the loss of a single or a few genes rather than a whole cluster. However, pairwise comparisons of each of the six other strains compared with NCTC10247 showed it has lost very few genes compared with any of the strains, and in fact, NCTC10229 had no unique genes relative to NCTC10247. These results were surprising because the other avirulent strains appeared to have lost their virulence through gene loss while cultured in the laboratory. Thus, the mechanism of attenuation is not clear from genomic data and could be due to differential transcriptional control or other reasons such as the disequilibrium of the replichores as discussed above.

The Bm Core Genome Is Under Stronger Purifying Selection Than the Variable Genome

To evaluate the evolutionary forces that affect the variable regions in Bm genomes, we constructed detailed alignments and calculated the evolutionary rates for Bm orthologous gene pairs. Significant differences between rates of synonymous (dS) and nonsynonymous (dN) substitutions in the variable and core coding regions of Bm genomes were detected. Both dN and dS values were on average significantly lower for the core gene set: 0.0013 versus 0.0020 for dN (P = 0.0005) and 0.0026 versus 0.0033 for dS (P = 0.0005). However, the selection pressure on nonsynonymous sites varied dramatically between core and variable genes, indicating the existence of stronger purifying selection pressure on Bm core genes. The same trend was observed for virulent strains alone. When three avirulent or attenuated strains were excluded from the analysis, average dN values for the core gene sets were significantly lower than for variable genes (P < 0.03). Although overall dN/dS ratios were significantly different between variable and core genes (P < 0.001), a large fraction of completely conserved genes (with dS and/or dN equal to 0) was found in both groups (fig. 6) but was lower for variable genes (P < 0.005). This trend of higher conservation in the core genes was observed for all individual analyzed strains as well (data not shown), indicating stronger purifying selection on these genes. The observed stronger purifying selection on the core genes is consistent with the hypothesis that the variable genes experience reduced selective pressure within the mammalian host.
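As a rough illustration of the averages reported above, the sketch below computes a dN/dS ratio from the mean dN and dS values for the core and variable gene sets. Note the assumption: a ratio of averages is not the same quantity as the per-gene dN/dS distribution shown in fig. 6, so this is only a back-of-the-envelope comparison. The helper name is hypothetical.

```python
from typing import Optional


def dn_ds(dn: float, ds: float) -> Optional[float]:
    """Return dN/dS, or None when dS is 0 (the ratio is undefined for genes
    with no synonymous substitutions)."""
    if ds == 0:
        return None
    return dn / ds


# Mean substitution rates quoted in the text.
core = {"dN": 0.0013, "dS": 0.0026}
variable = {"dN": 0.0020, "dS": 0.0033}

core_ratio = dn_ds(core["dN"], core["dS"])
variable_ratio = dn_ds(variable["dN"], variable["dS"])

print(f"core:     {core_ratio:.2f}")
print(f"variable: {variable_ratio:.2f}")
```

Both ratios fall below 1, and the lower value for the core set is in line with the stronger purifying selection described for the core genes.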
Fig. 6. Distribution of dN/dS in variable and core genes of Bm genomes aligned with corresponding regions of the reference strain ATCC 23344. dN and dS rates were calculated as described in Materials and Methods. Cumulative data for the seven Bm strains is shown.


Phylogenetic Analysis of Bm

An initial phylogenetic analysis comparing the Bm and Bp reference strains relative to nine other Burkholderia spp. and to P. aeruginosa illustrated the close identity of Bm and Bp (supplementary fig. 2, Supplementary Material online). The two species clustered with the avirulent B. thailandensis and were distinct from the other Burkholderia spp. as reported previously (Lin et al. 2008). We performed phylogenetic analysis of the Bm species, first using a single nucleotide polymorphism (SNP)-based approach and then by indel analysis. Phylogenetic reconstructions using 515 SNPs as characters indicated that Bm is a monophyletic group and is highly consistent with a strictly clonal pattern of evolution (supplementary fig. 3, Supplementary Material online). There were 253 SNPs unique to individual strains, and the remaining 262 SNPs defined a highly robust tree with 34 homoplastic SNPs (all nodes had 100% bootstrap support) and a consistency index of 0.84. The root of the tree was determined by polarizing the SNP character states as ancestral or derived by comparison with the Bp strain K96243. In contrast, indel phylogeny based upon whole-gene differences resulted in a poorly resolved topology and a lower consistency index (0.62; supplementary fig. 3, Supplementary Material online). We found 6,683 genes differing among the seven strains, which was astounding for a recently emerged pathogen. In this analysis, three pairs of highly similar strains clustered together and their association was consistent with the SNP-based tree. The deeper topology, however, was not consistent between the phylogenies. The indel-analysis tree had a four-node polytomy, illustrating the lack of topological resolution. Different rates of character evolution were clearly seen when gene indels were placed on the SNP-based phylogeny (fig. 7). Some branches had a very large number of gene indels (e.g., 2,284 and 2,552) relative to other branches (0, 3, 45, etc.) of comparable SNP length. 
Of the 6,683 gene indels analyzed, 997 require two or more “map locations” on the SNP-based tree (data not shown). Superimposing the variable gene cluster data from table 3 revealed that those indels belonged to clusters that had been differentially lost in different strains (fig. 7). The results from the phylogenetic trees support the hypothesis that Bm evolved from a single Bp ancestor whose genome has been continually rearranged, accompanied by the loss of clusters of genes from different strains in a process of convergent evolution.
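The idea that some gene indels need more than one “map location” on a tree can be sketched with a small parsimony count. The code below uses Fitch's algorithm on a hypothetical four-strain tree; the tree, the strain labels, and the presence/absence patterns are invented for illustration and are not the topology or data from this study. Fitch parsimony also does not distinguish gains from losses, whereas the analysis described here assumes deletions.

```python
def min_changes(tree, states):
    """Minimum number of state changes (Fitch parsimony) for one presence/absence
    character on a rooted binary tree.

    tree:   nested 2-tuples whose leaves are strain names (str)
    states: dict mapping each strain name to 1 (gene present) or 0 (gene absent)
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):            # leaf: its observed state
            return {states[node]}
        left, right = node
        a, b = visit(left), visit(right)
        shared = a & b
        if shared:                           # children agree: no change needed here
            return shared
        changes += 1                         # children disagree: one change required
        return a | b

    visit(tree)
    return changes


# Hypothetical tree with two pairs of sister strains.
toy_tree = (("strain_A", "strain_B"), ("strain_C", "strain_D"))

lost_in_one_clade = {"strain_A": 0, "strain_B": 0, "strain_C": 1, "strain_D": 1}
lost_in_two_lineages = {"strain_A": 0, "strain_B": 1, "strain_C": 0, "strain_D": 1}

print(min_changes(toy_tree, lost_in_one_clade))      # one event explains the pattern
print(min_changes(toy_tree, lost_in_two_lineages))   # needs two independent events
```

A gene whose absence pattern cannot be explained by a single event on the tree, as in the second toy pattern, corresponds to the kind of character that requires multiple map locations.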
Fig. 7. Evolutionary tree of Bm showing the number of genes deleted and the evolutionary point of change. In total, 5,686 gene changes can be mapped onto this tree in a manner that assumes only single evolutionary deletion events. Conversely, 997 gene changes require 2 or 3 independent deletions of the same gene. Because we did not compare these genes with Bp, we do not know the ancestral state for 45 of these genes. These 45 genes could be additions or deletions with equal parsimony with mutations occurring along the basal branches of this tree. Letters in red represent the variable regions lost in each branch.


Discussion

MLST analysis provided data supporting the evolution of Bm from a single strain of Bp (Godoy et al. 2003). The results presented here from comparative genomic analysis between Bm strains and relative to Bp provide further evidence that Bm arose as a founder population from a single Bp strain, most likely after colonization of an equine-like ancestral host. The evolution of Bm from a Bp ancestor was a result of IS-mediated gene loss and genomic recombination that occurred once genes that provided adaptability to variable environments were no longer under selection in a host. These extraneous genes provided expansion targets for the resident IS element population. Homologous recombination then ensued across IS elements, leading to beneficial genomic losses. Genome evolution continues in Bm, leading to strains that are fitter under different pressures. Our results support the notion that virulence is multifactorial because no gene losses were common among the avirulent and attenuated strains. In addition, the results agree with the hypothesis of genome reduction and erosion as an adaptation to intracellular lifestyle (Ochman and Davalos 2006; Casadevall 2008). IS element-mediated gene loss in Bm was random and continues to be a major evolutionary mechanism for this species; however, only viable strains can be isolated from an animal host. Random gene loss is evidenced by the unsystematic distribution of variable gene clusters across Bm strains (table 3) and by the independent loss of variable clusters in different branches of the phylogenetic tree (fig. 7). In previous laboratory studies, IS407A-mediated gene loss and recombination were observed frequently in vitro (DeShazer et al. 2001; Nierman et al. 2004) and in some cases resulted in lower fitness in an animal host, as in SAVP1 (Schutzer et al. 2008) and 2002721280. 
Genomic inversions and rearrangements were a natural outcome of IS expansion with no explicit benefit to Bm and, in some cases such as NCTC10247, were potentially detrimental to the fitness of the organism. This phenomenon of excess IS and other repetitive elements that mediate recombination, and hence rearrangements, has been observed in closely related species of other genera, for example, Bordetella (Parkhill et al. 2003), Shigella (Yang et al. 2005), Yersinia (Gu et al. 2007), Orientia (Nakayama et al. 2008), and Clostridium (Myers et al. 2006), to name a few. Reconstruction of the ancestral Bp isolate is impractical. First, essentially all the genes in the pan-genome of Bm have already been elucidated (fig. 2), meaning that the closest common ancestor to all Bm strains is most similar to either NCTC10247 or NCTC10399, which harbor the greatest number of variable gene clusters, plus the clusters that were lost from each of these strains (B and X, or S, U, and V, respectively). Second, because all the Bp genomic islands (GIs) have been lost in Bm, it is impossible to infer which, if any, of these GIs were present in the ancestor. Sim et al. (2008) found a large number of Bp isolates that have lost all but two of the GIs. Therefore, it is possible that the ancestral Bp strain looked very similar to one of these GI-deficient Bp isolates. Interestingly, those Bp strains were more commonly associated with environmental isolation than with human or animal hosts (Sim et al. 2008). Our results from Bm are in better agreement with the finding that there was little correlation between GI content and disease symptoms in melioidosis patients (Tumapa et al. 2008), as all GIs were lost in assuming an obligate mammalian parasite lifestyle. Last, each of the Bm chromosomes has undergone such dramatic rearrangements (fig. 4) that it is almost impossible to discover the ancestral organization of the genome. 
Although it is possible to conduct a simple concatenation of synteny blocks on known Bp genomes, it is likely that the ancestral Bp strain itself was also rearranged in the process of losing the GIs. It is noteworthy that the massive intrachromosomal shuffling of gene clusters has occurred with an almost complete absence of interchromosomal recombination. There were no observed interchromosomal exchanges among any of the Bm strains. However, the Bp ancestral strain underwent an interchromosomal exchange that encompassed one of the rrn operons and an anthranilate-resistance operon (fig. 4) in chromosome I. This cluster is located in chromosome II and is flanked by IS407A elements in Bm but not in the Bp genomes. Thus, it is difficult to conclude whether the exchange was induced by IS407A elements that had incorporated into chromosome I of the ancestor or whether the rearrangement sites were hot spots for IS407A insertion after exchange into chromosome II. Interchromosomal rrn exchange was observed in Bartonella spp. and Brucella suis biovar 3 (Jumas-Bilak et al. 1998; Alsmark et al. 2004). However, in both of these genera, the rrn exchange occurred from the smaller to the larger replicon, ultimately leading to a reduction in the chromosome number. In Bm, the opposite occurred, perhaps as a mechanism to maintain some of the essential genes in the smaller replicon. Analysis of the shotgun assemblies of other Bp strains revealed that Bp1655 and Bp406e have also undergone dramatic changes in their rrn operon content and organization (data not shown). In contrast to Bm, which has lost the 3′ end of rrn loci, each of these Bp strains has lost the 5′ region of at least one rrn locus through recombination across the 23S CDS. These recombinations have resulted in two of only three major chromosomal rearrangements observed in Bp (Nandi T, et al., in preparation). 
Combined, these results suggest that the 23S loci of both Bp and Bm are hot spots for chromosomal rearrangement and provide further evidence that Bm is the evolutionary product of a single Bp ancestor. Even though the Bp genome is larger overall, Bm has a larger variable (or accessory) genome. Two explanations are possible for this observation: 1) Bm is an active intermediary step between a “generalist” and an obligate pathogen (Ochman and Davalos 2006; Casadevall 2008). In this case, each Bm strain still carries many genes that granted its Bp ancestor its generalist status. Given enough time, most of these genes would be eroded, resulting in a much smaller genome. 2) The inclusion of reduced-virulence strains resulted in an artificially large variable genome because these strains lost IS-defined regions essential for in vivo survival (Schutzer et al. 2008). Certainly, Bm isolated from animals do not resemble SAVP1 or 2002721280, but continuing evolution outside of a host must also account for variability within a species. Compared with Bp, the Bm variable genome functions are not very diverse. Functional role category analysis (table 2) allows us to speculate that Bp has unlimited access to variable genes via lateral transfer and through other means not available to Bm, including phage (Ronning CM, Nierman WC, Ulrich RL, DeShazer D, in preparation). These data show that Bm has entered a population bottleneck and that the small effective population size has further contributed to the homogeneity and reduced genome size of Bm. The resulting Bm population has been at a competitive disadvantage outside of the mammalian host and thus is never isolated from the natural environment. In summary, our results provide very strong evidence that Bm evolved from a single Bp ancestor through genetic loss and genome rearrangements mediated by recombination across IS elements. 
Bm strains continue to evolve in vivo and in vitro and provide another snapshot in our growing understanding of genomic erosion on the path toward adaptation to the intracellular lifestyles observed in so many other bacterial pathogens. Further studies into the specific traits lost in avirulent Bm strains and the potential role of large-scale rearrangements in the reduction of virulence need to be pursued in order to achieve a full understanding of the pathogenicity of Bm.

Funding

This project has been funded with federal funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services under contract number N01-AI-30071.

Supplementary Material

Supplementary tables 1–3 and supplementary figures 1–3 are available at Genome Biology and Evolution online (http://www.oxfordjournals.org/our_journals/gbe/).
  52 in total

1.  Fast algorithms for large-scale genome alignment and comparison.

Authors:  Arthur L Delcher; Adam Phillippy; Jane Carlton; Steven L Salzberg
Journal:  Nucleic Acids Res       Date:  2002-06-01       Impact factor: 16.971

2.  Comparative analysis of the genome sequences of Bordetella pertussis, Bordetella parapertussis and Bordetella bronchiseptica.

Authors:  Julian Parkhill; Mohammed Sebaihia; Andrew Preston; Lee D Murphy; Nicholas Thomson; David E Harris; Matthew T G Holden; Carol M Churcher; Stephen D Bentley; Karen L Mungall; Ana M Cerdeño-Tárraga; Louise Temple; Keith James; Barbara Harris; Michael A Quail; Mark Achtman; Rebecca Atkin; Steven Baker; David Basham; Nathalie Bason; Inna Cherevach; Tracey Chillingworth; Matthew Collins; Anne Cronin; Paul Davis; Jonathan Doggett; Theresa Feltwell; Arlette Goble; Nancy Hamlin; Heidi Hauser; Simon Holroyd; Kay Jagels; Sampsa Leather; Sharon Moule; Halina Norberczak; Susan O'Neil; Doug Ormond; Claire Price; Ester Rabbinowitsch; Simon Rutter; Mandy Sanders; David Saunders; Katherine Seeger; Sarah Sharp; Mark Simmonds; Jason Skelton; Robert Squares; Steven Squares; Kim Stevens; Louise Unwin; Sally Whitehead; Bart G Barrell; Duncan J Maskell
Journal:  Nat Genet       Date:  2003-08-10       Impact factor: 38.330

Review 3.  Evolution of intracellular pathogens.

Authors:  Arturo Casadevall
Journal:  Annu Rev Microbiol       Date:  2008       Impact factor: 15.500

4.  Differences in chromosome number and genome rearrangements in the genus Brucella.

Authors:  E Jumas-Bilak; S Michaux-Charachon; G Bourg; D O'Callaghan; M Ramuz
Journal:  Mol Microbiol       Date:  1998-01       Impact factor: 3.501

5.  Skewed genomic variability in strains of the toxigenic bacterial pathogen, Clostridium perfringens.

Authors:  Garry S A Myers; David A Rasko; Jackie K Cheung; Jacques Ravel; Rekha Seshadri; Robert T DeBoy; Qinghu Ren; John Varga; Milena M Awad; Lauren M Brinkac; Sean C Daugherty; Daniel H Haft; Robert J Dodson; Ramana Madupu; William C Nelson; M J Rosovitz; Steven A Sullivan; Hoda Khouri; George I Dimitrov; Kisha L Watkins; Stephanie Mulligan; Jonathan Benton; Diana Radune; Derek J Fisher; Helen S Atkins; Tom Hiscox; B Helen Jost; Stephen J Billington; J Glenn Songer; Bruce A McClane; Richard W Titball; Julian I Rood; Stephen B Melville; Ian T Paulsen
Journal:  Genome Res       Date:  2006-07-06       Impact factor: 9.043

6.  Microbial gene identification using interpolated Markov models.

Authors:  S L Salzberg; A L Delcher; S Kasif; O White
Journal:  Nucleic Acids Res       Date:  1998-01-15       Impact factor: 16.971

7.  Multilocus sequence typing and evolutionary relationships among the causative agents of melioidosis and glanders, Burkholderia pseudomallei and Burkholderia mallei.

Authors:  Daniel Godoy; Gaynor Randle; Andrew J Simpson; David M Aanensen; Tyrone L Pitt; Reimi Kinoshita; Brian G Spratt
Journal:  J Clin Microbiol       Date:  2003-05       Impact factor: 5.948

8.  Quorum-sensing control of antibiotic synthesis in Burkholderia thailandensis.

Authors:  Breck A Duerkop; John Varga; Josephine R Chandler; Snow Brook Peterson; Jake P Herman; Mair E A Churchill; Matthew R Parsek; William C Nierman; E Peter Greenberg
Journal:  J Bacteriol       Date:  2009-04-17       Impact factor: 3.490

9.  Burkholderia pseudomallei genome plasticity associated with genomic island variation.

Authors:  Sarinna Tumapa; Matthew T G Holden; Mongkol Vesaratchavest; Vanaporn Wuthiekanun; Direk Limmathurotsakul; Wirongrong Chierakul; Edward J Feil; Bart J Currie; Nicholas P J Day; William C Nierman; Sharon J Peacock
Journal:  BMC Genomics       Date:  2008-04-25       Impact factor: 3.969

10.  The core and accessory genomes of Burkholderia pseudomallei: implications for human melioidosis.

Authors:  Siew Hoon Sim; Yiting Yu; Chi Ho Lin; R Krishna M Karuturi; Vanaporn Wuthiekanun; Apichai Tuanyok; Hui Hoon Chua; Catherine Ong; Sivalingam Suppiah Paramalingam; Gladys Tan; Lynn Tang; Gary Lau; Eng Eong Ooi; Donald Woods; Edward Feil; Sharon J Peacock; Patrick Tan
Journal:  PLoS Pathog       Date:  2008-10-17       Impact factor: 6.823


1.  Bifidobacterium animalis subsp. lactis ATCC 27673 is a genomically unique strain within its conserved subspecies.

Authors:  Joseph R Loquasto; Rodolphe Barrangou; Edward G Dudley; Buffy Stahl; Chun Chen; Robert F Roberts
Journal:  Appl Environ Microbiol       Date:  2013-08-30       Impact factor: 4.792

Review 2.  Insights from genomic comparisons of genetically monomorphic bacterial pathogens.

Authors:  Mark Achtman
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2012-03-19       Impact factor: 6.237

3.  Microbial species delineation using whole genome sequences.

Authors:  Neha J Varghese; Supratim Mukherjee; Natalia Ivanova; Konstantinos T Konstantinidis; Kostas Mavrommatis; Nikos C Kyrpides; Amrita Pati
Journal:  Nucleic Acids Res       Date:  2015-07-06       Impact factor: 16.971

4.  Antibodies against In Vivo-Expressed Antigens Are Sufficient To Protect against Lethal Aerosol Infection with Burkholderia mallei and Burkholderia pseudomallei.

Authors:  Shawn M Zimmerman; Jeremy S Dyke; Tomislav P Jelesijevic; Frank Michel; Eric R Lafontaine; Robert J Hogan
Journal:  Infect Immun       Date:  2017-07-19       Impact factor: 3.441

5.  Comparative phylogenomics and evolution of the Brucellae reveal a path to virulence.

Authors:  Alice R Wattam; Jeffrey T Foster; Shrinivasrao P Mane; Stephen M Beckstrom-Sternberg; James M Beckstrom-Sternberg; Allan W Dickerman; Paul Keim; Talima Pearson; Maulik Shukla; Doyle V Ward; Kelly P Williams; Bruno W Sobral; Renee M Tsolis; Adrian M Whatmore; David O'Callaghan
Journal:  J Bacteriol       Date:  2013-12-13       Impact factor: 3.490

6.  Cross-species comparison of the Burkholderia pseudomallei, Burkholderia thailandensis, and Burkholderia mallei quorum-sensing regulons.

Authors:  Charlotte D Majerczyk; Mitchell J Brittnacher; Michael A Jacobs; Christopher D Armour; Matthew C Radey; Richard Bunt; Hillary S Hayden; Ryland Bydalek; E Peter Greenberg
Journal:  J Bacteriol       Date:  2014-09-02       Impact factor: 3.490

Review 7.  Antibiotic resistance in Burkholderia species.

Authors:  Katherine A Rhodes; Herbert P Schweizer
Journal:  Drug Resist Updat       Date:  2016-07-30       Impact factor: 18.500

8.  Genetic and phenotypic diversity in Burkholderia: contributions by prophage and phage-like elements.

Authors:  Catherine M Ronning; Liliana Losada; Lauren Brinkac; Jason Inman; Ricky L Ulrich; Mark Schell; William C Nierman; David Deshazer
Journal:  BMC Microbiol       Date:  2010-07-28       Impact factor: 3.605

Review 9.  Developing insights into the mechanisms of evolution of bacterial pathogens from whole-genome sequences.

Authors:  Josephine Bryant; Claire Chewapreecha; Stephen D Bentley
Journal:  Future Microbiol       Date:  2012-11       Impact factor: 3.165

10.  The Type VI secretion system spike protein VgrG5 mediates membrane fusion during intercellular spread by pseudomallei group Burkholderia species.

Authors:  Isabelle J Toesca; Christopher T French; Jeff F Miller
Journal:  Infect Immun       Date:  2014-01-13       Impact factor: 3.441

