Literature DB >> 31744907

Conserved Patterns of Symmetric Inversion in the Genome Evolution of Bordetella Respiratory Pathogens.

Michael R Weigand1, Yanhui Peng2, Dhwani Batra3, Mark Burroughs3, Jamie K Davis3, Kristen Knipe3, Vladimir N Loparev3, Taccara Johnson2, Phalasy Juieng3, Lori A Rowe3, Mili Sheth3, Kevin Tang3, Yvette Unoarumhi3, Margaret M Williams2, M Lucia Tondella2.   

Abstract

Whooping cough (pertussis), primarily caused by Bordetella pertussis, has resurged in the United States, and circulating strains exhibit considerable chromosome structural fluidity in the form of rearrangement and deletion. The genus Bordetella includes additional pathogenic species infecting various animals, some even causing pertussis-like respiratory disease in humans; however, investigation of their genome evolution has been limited. We studied chromosome structure in complete genome sequences from 167 Bordetella species isolates, as well as 469 B. pertussis isolates, to gain a generalized understanding of rearrangement patterns among these related pathogens. Observed changes in gene order primarily resulted from large inversions and were only detected in species with genomes harboring multicopy insertion sequence (IS) elements, most notably B. holmesii and B. parapertussis While genomes of B. pertussis contain >240 copies of IS481, IS elements appear less numerous in other species and yield less chromosome structural diversity through rearrangement. These data were further used to predict all possible rearrangements between IS element copies present in Bordetella genomes, revealing that only a subset is observed among circulating strains. Therefore, while it appears that rearrangement occurs less frequently in other species than in B. pertussis, these clinically relevant respiratory pathogens likely experience similar mutation of gene order. The resulting chromosome structural fluidity presents both challenges and opportunity for the study of Bordetella respiratory pathogens.IMPORTANCE Bordetella pertussis is the primary agent of whooping cough (pertussis). The Bordetella genus includes additional pathogens of animals and humans, including some that cause pertussis-like respiratory illness. The chromosome of B. pertussis has previously been shown to exhibit considerable structural rearrangement, but insufficient data have prevented comparable investigation in related species. In this study, we analyze chromosome structure variation in several Bordetella species to gain a generalized understanding of rearrangement patterns in this genus. Just as in B. pertussis, we observed inversions in other species that likely result from common mutational processes. We used these data to further predict additional, unobserved inversions, suggesting that specific genome structures may be preferred in each species.

Entities:  

Keywords:  Bordetella; evolution; genomics; insertion sequence; pertussis; rearrangement; whooping cough

Year:  2019        PMID: 31744907      PMCID: PMC6867878          DOI: 10.1128/mSystems.00702-19

Source DB:  PubMed          Journal:  mSystems        ISSN: 2379-5077            Impact factor:   6.496


INTRODUCTION

The genus Bordetella presently includes 16 named species, comprising human and animal pathogens, as well as environmental colonizers. The closely related Bordetella species B. pertussis, B. parapertussis, and B. bronchiseptica are collectively known as the “classic bordetellae” due to their pathogenic ecologies and were previously regarded as subspecies differentiated by host adaptation (1, 2). Most notably, B. pertussis is the primary causative agent of whooping cough (pertussis), a respiratory disease with high morbidity and mortality in infants too young to be vaccinated (i.e., those <2 months of age). B. bronchiseptica causes respiratory disease in a range of mammals, such as kennel cough in dogs, but also various infections in humans (3–6). Divergent lineages of B. parapertussis infect the respiratory tracts of sheep and humans (7–9). The remaining “nonclassic” species are phylogenetically more distant and have been recovered from varied hosts and pathologies, in particular, B. holmesii and B. hinzii from humans, which is reflected in their discrete virulence-associated-gene compositions (10). Considerable recent attention has been given to B. pertussis as the number of reported pertussis cases in many developed countries—including the United States—has risen over the last 2 decades, despite high or increasing coverage with pertussis-containing vaccines (11). Increased disease reporting likely results from many factors, including heightened awareness, expanded surveillance, improved laboratory diagnostic testing, and allelic mismatch between circulating and vaccine reference strains (11–13). Moreover, the protection conferred by acellular vaccine formulations—which replaced whole-cell preparations in the United States during the 1990s—wanes over time, resulting in increased disease rates among vaccinated individuals (11, 14–16). B. holmesii and B. parapertussis cause similar cough illnesses in humans that, while generally regarded as less severe, can be clinically indistinguishable from B. pertussis infection and meet the pertussis clinical case definition (17–24). Cocirculation and coinfection have been reported with no discernible difference in symptomology between etiologies (23–28). It is unclear whether increased detection of B. holmesii or B. parapertussis respiratory infection results from recent genetic changes or, rather, heightened awareness and improved diagnostics (24, 29, 30). Nonetheless, many diagnostic PCR assays cannot differentiate B. pertussis and B. holmesii, and retrospective testing of specimens sometimes reveals misdiagnosed cases of B. holmesii (26, 27, 31–33). For these reasons, the true burden of pertussis-like respiratory disease caused by B. holmesii and B. parapertussis remains understudied and largely unknown (19, 34). Recent advances in high-throughput sequencing and bioinformatics have yielded a large collection of complete genome assemblies from B. pertussis clinical isolates, and comparative analyses have described considerable rearrangement fluidity (35, 36). In contrast, much lower amounts of genomic data exist for non-pertussis Bordetella species, including other human pathogens, limiting opportunities to research these species. To address these knowledge gaps, we assembled and compared closed genome sequences from 167 Bordetella species isolates for analyses of genome evolution and gene order rearrangement within this genus. Together with a comparison of 469 B. pertussis closed genome sequences, these data reveal conserved patterns of inversion within species whose genomes harbor multicopy insertion sequence (IS) elements. A linear model of observed rearrangements predicted additional potential inversions based on IS element content. These results indicated that Bordetella species are subject to common rearrangement processes and that each exhibits a preferred gene order.

RESULTS

Rearrangement in Bordetella spp.

Complete genome assemblies were prepared or publicly available for 11 named or provisional species in the genus Bordetella and were compared in the present study to identify rearrangements of gene order. A neighbor-joining phylogenetic reconstruction of these species, plus two distantly related Bordetella isolates (H567 and J329), rooted with outgroup Achromobacter xylosoxidans, clustered the classic bordetellae apart from the otherwise disparate species branches (Fig. 1), consistent with previous studies (10). The number of single nucleotide polymorphisms (SNPs) observed in each species varied. Isolates of B. bronchiseptica derived primarily from two distinct clades, complex I and complex IV, which differ by ∼35,000 SNPs (Fig. S1A and B in the supplemental material). The distant B. bronchiseptica strain I328 (37) did not belong to either complex (Fig. 1) and was separated from other B. bronchiseptica isolates by ∼60,000 SNPs. Isolates of B. pertussis (complex II), B. parapertussis (complex III), and B. holmesii were much more similar, exhibiting <100 SNP differences between most isolates in each (Fig. S1C). Complete genomes from isolates of B. hinzii (n = 3) and Bordetella trematum (n = 2) differed by ∼9,600 and ∼11,000 SNPS, respectively.
FIG 1

Phylogeny of the genus Bordetella. Neighbor-joining tree of named and provisional Bordetella species with available complete-genome sequences based on pairwise mash distances and rooted with Achromobacter xylosoxidans. Species sequenced at the CDC are listed in red. Triangles indicate collapsed nodes.

Phylogeny of the genus Bordetella. Neighbor-joining tree of named and provisional Bordetella species with available complete-genome sequences based on pairwise mash distances and rooted with Achromobacter xylosoxidans. Species sequenced at the CDC are listed in red. Triangles indicate collapsed nodes. (A) Phylogenetic reconstruction of the “classic bordetellae” from 37,085 core variable SNPs using maximum parsimony. (B) Pairwise SNP distances between isolates of B. bronchiseptica, as well as within complex I and complex IV separately. (C) Pairwise SNP distances between isolates of B. holmesii, B. parapertussis, and B. pertussis. Download FIG S1, PDF file, 0.7 MB. Among the various Bordetella species, only B. holmesii and B. parapertussis displayed signs of chromosome structure variation in alignments of the isolate genomes analyzed here (Fig. 2). Genomes of B. holmesii (n = 63) comprised 11 discrete chromosome structures (Fig. S2A). The most abundant structure was observed in 48 isolates (cluster BH-01), and six other structures each differed from it by a single, unique inversion (Fig. 2A). All observed inversions were symmetric, encompassing the replication origin-terminus axis, and flanked by opposing copies of IS elements. Other structures differed by small deletions, not rearrangements, including cluster BH-03 (Data Set S1). Genomes of B. parapertussis (n = 79) exhibited two predominant chromosome structures of comparable levels of abundance, varying by a symmetric inversion between copies of IS1001, which largely followed phylogenetic boundaries (Fig. 2B). In both species, unique structures were also observed in only one isolate (Fig. S2B, singletons). Predicted protein-coding gene contents adjacent to select inversion breakpoints observed in B. holmesii and B. parapertussis are listed in Data Set S1. These results indicated that inversions like those frequently observed in B. pertussis also occur in at least a few other Bordetella species.
FIG 2

Genome rearrangements in Bordetella species. (A) Eleven unique chromosome structures that included six symmetric inversions were observed among B. holmesii isolate genomes. (B) Two predominant structures were observed in B. parapertussis. (C, D) Phylogenetic placement of structures observed in B. holmesii (C) and B. parapertussis (D) isolates was reconstructed using maximum parsimony from 1,496 and 677 core SNPs, respectively.

Genome rearrangements in Bordetella species. (A) Eleven unique chromosome structures that included six symmetric inversions were observed among B. holmesii isolate genomes. (B) Two predominant structures were observed in B. parapertussis. (C, D) Phylogenetic placement of structures observed in B. holmesii (C) and B. parapertussis (D) isolates was reconstructed using maximum parsimony from 1,496 and 677 core SNPs, respectively. Abundance distribution of unique chromosome structures in B. holmesii (A), B. parapertussis (B), and B. pertussis (C). Abundances of singletons are highlighted (red). (D) Rarefaction curve of unique B. pertussis structures recovered during 2000 to 2016. Download FIG S2, PDF file, 0.1 MB. Annotated genes encoded near boundaries of select inversions and deletions. Download Data Set S1, XLSX file, 0.07 MB.

Chromosome structural diversity of B. pertussis.

The chromosome of B. pertussis displays structural fluidity (36), and the 469 complete genomes of B. pertussis included here were aligned in 109,746 pairwise comparisons to identify rearrangements and group isolates with colinear chromosomes. These genomes exhibited 107 unique chromosome structures, including 32 clusters of two or more isolates (e.g., cluster BP-01 and cluster BP-02) and 75 singletons observed in only one isolate (Fig. S2C). The abundances of unique structures varied, with the largest cluster including 95/469 (20%) of the isolates; the distribution suggested that structural diversity remains undersampled, as observed previously (36). The data set here sampled years 2000 to 2016 (n = 370) most densely, and a rarefaction curve of unique structures recovered during this time period did not approach saturation, as expected (Fig. S2B). Links between clusters of unique structures reported here and those reported previously are listed in Table S2. Circulating isolates of B. pertussis exhibited little gene sequence diversity (13, 38), but the distribution of pairwise SNP distances between isolates within the largest clusters varied. While cluster BP-01 and cluster BP-05 appeared relatively clonal, exhibiting <15 SNPs between isolates, SNP distances within others, such as cluster BP-02, cluster BP-03, cluster BP-04, cluster BP-06, and cluster BP-08, measured up to 32 SNPs (Fig. S3A). These differences largely reflected the temporal spread of isolates within a cluster. Isolates in clusters with fewer pairwise SNPs were recovered during narrower time frames than those with more SNPs and wider temporal distributions (Fig. S3B). These data illustrate that certain chromosome structures predominate due to their stability (e.g., cluster BP-02), while the abundances of others can be attributed to recent expansion (e.g., cluster BP-01). Distribution of pairwise SNP distances (A) and years of isolation (B) among isolates of B. pertussis within colinear genome clusters. Download FIG S3, PDF file, 0.9 MB.

Rearrangement network mapping.

Pairwise alignments between a subset of genomes representing each of the unique structures in B. pertussis, B. holmesii, and B. parapertussis were investigated further to identify pairs that differed by either a single inversion (Fig. 3A) or insertion/deletion (i.e., a gap of >1,500 bp) but were otherwise colinear. Eighty-two pairs were identified among the 107 B. pertussis structures, of which 49 differed by symmetric inversion, 11 by asymmetric inversion, and 22 by insertion/deletion. Symmetric inversions of varied sizes were observed near both the replication origin (oriC) and terminus (dif) and were larger than asymmetric inversions on average (Fig. 3B).
FIG 3

Genome rearrangement in B. pertussis. (A) Observed inversions were either symmetric, encompassing either replication origin (purple) or terminus (light blue), or asymmetric (blue). (B) Single inversions observed in pairwise alignments varied in size and were predominantly symmetric. (C) Undirected network constructed from single inversions or insertion/deletions observed between 71 unique chromosome structures. Node diameter and edge line type indicate cluster size and rearrangement, respectively, according to the key. Nodes with the highest degrees of centrality (Fig. S5A) and their connections at the network core are highlighted (green). Shaded arrows indicate directionality inferred from the SNP phylogeny (Fig. S7), including divergence toward the predominant cluster BP-01 (pink), as described in the text.

Genome rearrangement in B. pertussis. (A) Observed inversions were either symmetric, encompassing either replication origin (purple) or terminus (light blue), or asymmetric (blue). (B) Single inversions observed in pairwise alignments varied in size and were predominantly symmetric. (C) Undirected network constructed from single inversions or insertion/deletions observed between 71 unique chromosome structures. Node diameter and edge line type indicate cluster size and rearrangement, respectively, according to the key. Nodes with the highest degrees of centrality (Fig. S5A) and their connections at the network core are highlighted (green). Shaded arrows indicate directionality inferred from the SNP phylogeny (Fig. S7), including divergence toward the predominant cluster BP-01 (pink), as described in the text. To investigate relationships among discrete B. pertussis chromosome structures, the identified pairs were used to construct an undirected network that connected 71 unique structures (nodes) by inferred single inversion or insertion/deletion events (Fig. 3C, edges). Sixty-two structures were connected to a single graph, representing 407 isolates (87%) in the data set. The structures in vaccine and laboratory reference strains, such as strain 134 (B202 [accession no. CP016338]), were not connected to the larger network of recent clinical isolates but instead formed small groups of 2 to 4 related structures. The remaining 36 structures, representing 49 isolates that included Tohama I, could not be connected to the network or each other. Some of the largest structural clusters were located near the network core and connected through loops of parallel, symmetric inversions. Four of these clusters also exhibited the highest degrees of centrality, indicating their significant connectivity to other structures in the network (Fig. S4A). In contrast, cluster BP-01, which was the most abundant, appeared closer to the periphery, following a path of successive rearrangements that diverged from the network core, and therefore, it deviated from many other abundant clusters. Singletons also formed peripheral nodes, often crowded around an abundant cluster, suggesting they represent rare aberrations from predominant chromosome structures. Network edges representing insertion/deletion and asymmetric inversion were also largely responsible for connecting small, peripheral nodes to the network, while edges at the core comprised symmetric inversions. Similar patterns were observed in smaller networks constructed of chromosome structures observed in B. holmesii and B. parapertussis (Fig. S5). Taken together, these results demonstrated the extent to which symmetric inversion has shaped and continues to shape the genome compared to the relative rarity of asymmetric and insertion/deletion events. (A, B) Relationship of B. pertussis colinear genome cluster size with network centrality (A) and relative replichore size balance (B). (C, D) Replichore balance in clusters of B. holmesii (C) and B. parapertussis (D). Red points denote singletons. Download FIG S4, PDF file, 0.2 MB. Undirected networks constructed from single inversions or insertion/deletions observed between unique chromosome structures in B. holmesii (A) and B. parapertussis (B). Node diameter and edge line type indicate cluster size and rearrangement, respectively, according to the key. Download FIG S5, PDF file, 0.09 MB.

Rearrangement divergence dating.

The phylogenetic history of discrete chromosome structures observed in B. pertussis was further explored to infer the divergence of specific structures and assign putative directionality to network edges shown in Fig. 3C, where possible. A time-scaled Bayesian reconstruction of 438 isolates from the ptxP3-prn2-ptxA1 background was calculated using 908 variable, core nucleotides (Fig. S6) with an estimated mutation rate of 1.30 × 10−7 substitutions per site per year. The resulting tree topology separated isolates into two large clades corresponding to the fimH1 (fim3-1) and fimH2 (fim3-2) alleles, consistent with previous studies (13, 36), and large structural clusters were restricted to a single clade. Bayesian time-scaled phylogenetic reconstruction of B. pertussis isolates from the ptxP3 clade. Tip colors indicate predominant structures according to the key, and subclades corresponding to alleles fimH1 (fim3-1) and fimH2 (fim3-2) are shaded. Divergence date estimates for internal nodes discussed in the text are labeled with the upper and lower bounds of their highest posterior density intervals (95% HPD). Select structures could be connected by symmetric inversions (inset). Download FIG S6, PDF file, 0.3 MB. Within the fimH1 clade, there was insufficient sampling to discriminate the ancestral order of cluster BP-03 and cluster BP-04. Genomes exhibiting both structures appeared near the tree root and in isolates dating back to 2000 (Fig. S6; Table S1). However, inspection of more contemporary subclades revealed evidence for multiple inversion events between the two clusters, consistent with the wide range of pairwise SNPs in each (Fig. S3). The divergence dates of some chromosome structures could be estimated from the phylogeny (e.g., cluster BP-01 and cluster BP-07), while multiple events were predicted for others (e.g., cluster BP-10) (Fig. S6). Likewise, the fimH2 allele background included the large central cluster BP-02, and multiple inversion events were predicted that gave rise to unrelated isolates with colinear chromosomes (e.g., cluster BP-06 and cluster BP-08) (Fig. S6). While some inversions have occurred repeatedly or reversibly, most chromosome structures exhibited stable phylogenetic linkage, suggesting that rearrangement events are either rare or under selection. Characteristics of Bordetella species isolates, metadata, and genome accession numbers. Download Table S1, XLSX file, 0.05 MB.

Repeat content.

Observed inversions among genomes of B. pertussis, B. parapertussis, and B. holmesii were primarily flanked by IS elements, which appeared in multiple copies throughout each genome. All repetitive sequence content was quantified independently of gene annotation by counting unique 15-bp sequences (15-mers) in a representative isolate genome from each of the species analyzed here. Genomes of B. pertussis and B. holmesii each included a set of highly abundant 15-mers, while B. parapertussis, B. petrii, B. bronchialis, Bordetella sp. strain H567, B. bronchiseptica I328, B. pseudohinzii, and B. flabilis included decreased numbers of abundant 15-mers (Fig. S7A). Representatives from all other species included in Fig. 1 did not exhibit any abundant 15-mers. A similar pattern resulted from comparison of predicted protein-coding gene sequences (Fig. S7B), and abundant genes corresponded to transposases encoded in various IS elements in B. pertussis, B. holmesii, B. parapertussis, B. petrii, B. bronchialis, Bordetella sp. H567, B. bronchiseptica I328, B. pseudohinzii, and B. flabilis (Table 1). The complements and copy numbers of the IS elements detected varied, with B. pertussis carrying the greatest number. While all species considered here included a few duplicated protein-coding genes of other functions, those absent from Table 1 did not encode any, including transposases, with >2 copies. These results support the expected correlation between genome repeat contents, specifically IS elements, and the observation of rearrangement.
TABLE 1

IS element contents in representative isolates of Bordetella species

Species (isolate)IS elementCopy no.No. of inversions predicted
B. pertussis (J549)IS481249918
IS1663161
IS100260
B. holmesii (C690)IS1001-like68106
IS4814830
IS4074859
IS21 family40
B. parapertussis (B271)IS10012212
IS100290
B. petrii (DSM1204)IS3 family (RS00975-RS00980)256
IS3 family (RS01055–RS01060)101
IS3 family (RS20025)60
IS3 family (RS20230)20
IS110 family61
ISBope140
B. bronchialis (AU171976)IS48150
IS256 family30
Bordetella sp. H567IS256 family102
B. bronchiseptica I328IS1002-like50
B. pseudohinzii (HI4681)IS21 family40
B. flabilis (AU10664)IS3 family20
IS element contents in representative isolates of Bordetella species Repetitive sequence contents in Bordetella species. Abundance of unique 15-mers (A) and predicted protein-coding genes (B) in a representative isolate from each species, colored according to the key. The greatest repeat contents were observed in B. pertussis and B. holmesii (insets). Download FIG S7, PDF file, 0.2 MB.

Inversion boundary prediction.

Most observed inversions were symmetric, and their “balance” was assessed by calculating the ratio of breakpoint distances relative to either the replication origin (oriC) or terminus (dif), whichever was closer, according to equation 1. The size of replichores, oppositely replicated chromosome halves, in most of the 107 unique B. pertussis chromosome structures were balanced, as the natural logs of their ratios were close to zero (Fig. 4A), particularly among those representing the largest clusters, and most of the deviations were observed in small clusters or singletons (Fig. S4B). Replichore sizes were similarly balanced in B. holmesii (Fig. S4C). Although replichore sizes in B. parapertussis appeared less balanced, the two predominant structures produced similar results (Fig. S4D). Many of the 49 symmetric inversions observed in B. pertussis were also balanced (Fig. 4B), maintaining similar relative replichore sizes before and after inversion. Some unbalanced inversions were also observed, and these appeared more common near the replication terminus than near the origin.
FIG 4

Symmetric inversion balance and prediction. (A) Density of relative replichore size balance observed in each of the 107 unique B. pertussis genome structures. (B) Histogram of breakpoint balance observed in symmetric inversions encompassing the replication origin (purple) or terminus (light blue). The combined densities of observed (solid line) and predicted (dotted line) symmetric inversions were scaled to 10. (C) Linear regression of right and left breakpoint distances observed in B. pertussis (circle) and B. holmesii (triangle) nearest to the replication origin (purple) or terminus (light blue). Open symbols represent duplicated points inverted for model calculation. Blue line indicates linear fit, and shading represents the 95% confidence interval. Dotted red lines denote boundaries of the model’s 95% prediction interval, and gray points correspond to all predicted inversions between copies of IS elements in B. pertussis J549.

Symmetric inversion balance and prediction. (A) Density of relative replichore size balance observed in each of the 107 unique B. pertussis genome structures. (B) Histogram of breakpoint balance observed in symmetric inversions encompassing the replication origin (purple) or terminus (light blue). The combined densities of observed (solid line) and predicted (dotted line) symmetric inversions were scaled to 10. (C) Linear regression of right and left breakpoint distances observed in B. pertussis (circle) and B. holmesii (triangle) nearest to the replication origin (purple) or terminus (light blue). Open symbols represent duplicated points inverted for model calculation. Blue line indicates linear fit, and shading represents the 95% confidence interval. Dotted red lines denote boundaries of the model’s 95% prediction interval, and gray points correspond to all predicted inversions between copies of IS elements in B. pertussis J549. Linear regression of the right and left relative breakpoint distances in 55 symmetric inversions observed in B. pertussis and B. holmesii further illustrated the bias toward balance (Fig. 4C). The resulting linear model was used to predict all possible symmetric inversions in a representative isolate genome from each species based on the positions of detected IS elements (Table 1). As expected, the number of predicted inversions increased with IS element copy number and was greater than the number observed in the current data set of B. pertussis, B. holmesii, and B. parapertussis genomes (Fig. 5). Additional breakpoints were determined from alignment of B. pertussis J549 (cluster BP-04) with representatives of all 106 other structures but were still fewer than those derived from predicted inversions (Fig. 5A). Many individual copies of IS481 in J549 were predicted to engage in multiple inversions, pairing with as many as 16 other insertions, reflecting the particularly high density of this IS element and varied balance points of observed inversions. Only one genome sequence each was available for B. petrii and Bordetella sp. H567, but symmetric inversions were predicted based on the locations of their IS elements (Table 1). No inversions were predicted between IS elements in genomes of B. bronchiseptica I328, B. bronchialis, B. flabilis, or B. pseudohinzii, as suitable pairs could not be matched within the prediction window. This generalized model of symmetric inversion revealed considerable rearrangement potential, much of which remains unobserved. The model also illustrated that inversion potential does not simply depend on IS element copy number but on relative position and orientation as well.
FIG 5

Inversion potential of Bordetella species genomes based on IS element content. Chromosome maps of B. pertussis J549 (A), B. holmesii C690 (B), and B. parapertussis B271 (C). Tracks of red lines indicate the locations of IS elements in either the forward or reverse orientation (above or below center, respectively). Connecting lines link breakpoints of observed symmetric (black), observed asymmetric (blue), and predicted symmetric (gray) inversions. Additional tracks in panel A denote all breakpoints observed in pairwise alignments between J549 and genomes of the ptxP1-ptxP2 (orange) or ptxP3 (green) clades.

Inversion potential of Bordetella species genomes based on IS element content. Chromosome maps of B. pertussis J549 (A), B. holmesii C690 (B), and B. parapertussis B271 (C). Tracks of red lines indicate the locations of IS elements in either the forward or reverse orientation (above or below center, respectively). Connecting lines link breakpoints of observed symmetric (black), observed asymmetric (blue), and predicted symmetric (gray) inversions. Additional tracks in panel A denote all breakpoints observed in pairwise alignments between J549 and genomes of the ptxP1-ptxP2 (orange) or ptxP3 (green) clades.

DISCUSSION

In this study, we extended analyses of chromosome structure variation in B. pertussis to include diverse Bordetella species to gain a comprehensive understanding of rearrangement patterns in this genus. Those species colonized by IS elements, in particular B. holmesii and B. parapertussis, also exhibited similar patterns of chromosome inversion. Just as in B. pertussis, inversions observed in these species were biased toward symmetry and maintenance of replichore balance between predominant—and thus potentially favorable—gene arrangements. The results here reveal the breadth of chromosome rearrangement in the continued genome evolution of a genus that encompasses human and other animal pathogens. The chromosome of B. pertussis experiences ongoing structural fluidity, and rearrangements supply a notable source of mutational diversity within the circulating population (36, 39), first investigated before the availability of assembled genomes (40, 41). Study of complete genome assemblies from other Bordetella species revealed that such patterns are not a unique feature of B. pertussis. IS element colonization and subsequent copy number proliferation generate regions of homology throughout genomes, measured here as both 15-mers and protein-coding genes, that supply substrates for homologous-recombination-mediated rearrangement (42). There are >240 copies of IS481 in genomes of circulating B. pertussis strains, which has significantly impacted the evolution of this species (39). Likewise, the genomes of B. holmesii and B. parapertussis each harbor a collection of different IS elements that have contributed to their speciation through genome reduction (10, 43, 44), as well as mediated the rearrangements observed here. B. holmesii has evolved independently of the classic bordetellae, reiterating the influence of IS element colonization, not phylogenetic history, on genome structural fluidity in the genus. However, B. holmesii may have first acquired IS481 from B. pertussis via horizontal gene transfer (HGT) (43). Not all Bordetella species included in this study encoded IS elements, and while sample sizes for direct comparison were low for some (e.g., B. hinzii), these are unlikely to exhibit rearrangement, based on the results presented here. Publicly available genome sequences of B. bronchiseptica and ovine-specific B. parapertussis show differences in gene order, with putative rearrangements flanked by rRNA operons and IS elements, but were excluded from the present study because their assemblies lacked the independent structural validation employed here (2, 9, 44, 45). Bacterial chromosomes are highly organized with respect to both the placement of protein-coding genes (46, 47) and the distribution of noncoding sequences (48, 49). Accordingly, this organization imposes limits on genome rearrangement (50, 51), and gene location can directly influence expression (52–55). Consistent with this understanding, some abundant B. pertussis chromosome structures have persisted across SNP and temporal distances. A time-scaled phylogeny also revealed a degree of flexibility in the form of repeated, or reversible, symmetric inversion among similar, abundant structures. Many structures have given rise to other rare clusters or singletons, depicted here as peripheral nodes in an undirected rearrangement network, through random exploration of mutation space. Analogous to nucleotide sequence variation, beneficial structural mutations are expected to persist, suggesting structures that are low in abundance may comprise less favorable gene order arrangements. Although sampling of B. holmesii and B. parapertussis genomes was considerably lower, each similarly exhibited a predominant chromosome structure, or two, which had given rise to rare variants through individual rearrangements. Phenotypes derived from chromosome structure variation in these species have not yet been observed experimentally, and therefore, whether specific gene arrangements are indeed favorable remains a hypothesis. Based on observations of IS element involvement and proclivity for symmetry around the replication origin-terminus axis, additional inversions could be predicted, even in Bordetella species in which rearrangement could not be observed here. Not surprisingly, the number of predicted inversions was proportional to the IS element load, with B. pertussis exhibiting the most due to the high copy number of IS481. The model here considered not only IS element copy number, but position and orientation relative to the replication origin-terminus axis as well. However, limitations to the model likely include an oversimplification of balance, according to the predominant structures seen in B. parapertussis, and disregard for asymmetric inversion. As a result, predicted inversions were likely underestimated, particularly in species with sporadic IS element insertions. In species where IS element-mediated rearrangement was observed, the high number of predicted possible inversions reinforced the idea that specific gene order arrangements are likely more favorable than others. The B. pertussis IS481 copy number continues to slowly increase (36), and additional insertion target sites are predicted in B. pertussis (36) and B. holmesii (M. R. Weigand, unpublished data) that could facilitate even more potential inversions than those predicted here. The present genomic study of the genus Bordetella highlights patterns of chromosome structural variation in two species, B. parapertussis and B. holmesii. Although B. pertussis is the primary agent, B. parapertussis also causes clinical whooping cough (18, 28, 56), and increasing reports have attributed pertussis-like cough illness to B. holmesii (19–22). Current acellular pertussis vaccines contain immunogenic proteins purified from B. pertussis and provide no cross-protection against B. holmesii (57, 58) and debated protection against B. parapertussis (59–66). However, widespread application of acellular pertussis vaccines may have influenced the circulating population structure of all three species (19, 29, 67, 68). Acellular pertussis vaccination and B. pertussis infection each appear to enhance B. parapertussis colonization (61, 69), and interactions among the three species have even been suggested to underlie the periodicity of pertussis cycles (56, 70). B. parapertussis and B. holmesii are likely endemic to the United States, and their detection through improved diagnostics warrants additional attention. While chromosome structural fluidity challenges existing approaches to the genomic surveillance of bacterial pathogens, it may hold valuable information about the emergence of these related etiologies of respiratory disease.

MATERIALS AND METHODS

Strain selection.

The Centers for Disease Control and Prevention’s (CDC’s) collection includes U.S. B. pertussis isolates collected through surveillance and during outbreaks. In total, 469 assembled genomes were included in the current study based on availability, and most were selected for sequencing as part of previous studies (Table S1 in the supplemental material). One set (n = 170) was selected to capture potential geographic diversity among 28 U.S. states in 2000 to 2013 (71). Another selection (n = 108) focused on isolates obtained through the Enhanced Pertussis Surveillance/Emerging Infection Program Network (EPS) (72) in 2011 to 2014, which prioritized isolates from hospitalized cases and forced sampling of every possible combination of year, state, vaccination status, and age group. Additionally, 133 isolates were selected prospectively from sporadic or EPS submission in 2014 to 2016; 31 epidemic isolates were sequenced previously (35); and 27 others, such as vaccine reference strains and novel immunogen-deficient genotypes, were also included (73–76). U.S. isolates of other Bordetella species were selected from the CDC collection for sequencing across as many years as available, but with additional emphasis on those recovered after 2000 from patients presenting respiratory illness. A few isolates sequenced previously were also included (37, 77, 78). The present study featured new genome assemblies for 125 B. pertussis, 11 B. bronchiseptica, 60 B. holmesii, 78 B. parapertussis, 2 B. trematum, 1 B. hinzii, and 1 other Bordetella species isolate as outlined in Table S1.

Genomic DNA preparation, sequencing, and assembly.

Isolates were cultured on Regan-Lowe agar without cephalexin for 72 h at 37°C. Genomic DNA isolation and purification was performed using the Gentra Puregene yeast/bacteria kit (Qiagen, Valencia, CA) with slight modifications. Briefly, two aliquots of approximately 1 × 109 bacterial cells were harvested and resuspended in 500 μl of 0.85% sterile saline and then pelleted by centrifugation for 1 min at 16,000 × g. Recovered genomic DNA was resuspended in 100 μl of DNA hydration solution. Aliquots were quantified using a Nanodrop 2000 (Thermo Fisher Scientific, Inc.; Wilmington, DE). Additional chloroform purification was performed to remove polysaccharides from preparations of all species except B. pertussis (79). Whole-genome shotgun sequencing was performed using a combination of the PacBio RSII (Pacific Biosciences, Menlo Park, CA), Illumina HiSeq/MiSeq (Illumina, San Diego, CA), and Argus (OpGen, Gaithersburg, MA) platforms as described previously (35). Genomic DNA libraries were prepared for PacBio sequencing using the SMRTbell template preparation kit 1.0 and polymerase binding kit P4 or P6, while Illumina libraries were prepared using the NEB ultra library preparation kit (New England Biolabs, Ipswich, MA). De novo assembly was performed using either the Hierarchical Genome Assembly Process (HGAP) (version 3; Pacific Biosciences) (80) or FLYE (version 2.4.2) (81). The resulting consensus sequences were circularized using either circlator (version 1.5.1) (82) or gepard (version 1.30) (83). Where necessary, hybrid de novo assembly of PacBio and Illumina sequences was performed using Unicycler (version 0.4.0) (84). All assemblies were reordered to start at the coding region for glucose-inhibited cell division protein A (gidA). Assemblies were confirmed by comparison to restriction digest optical maps using the Argus system (OpGen) with MapSolver (version 2.1.1; OpGen) and further polished by mapping Illumina reads using CLC Genomics Workbench (version 10) (CLC bio, Boston, MA). Assemblies were annotated using the National Center for Biotechnology Information (NCBI) Prokaryotic Genome Annotation Pipeline (PGAP). Genome sequence-based allele characterization of B. pertussis molecular typing loci (prn, ptxP, ptxA, ptxB, and fimH) was assigned by genome alignment to a curated set of wild-type and deficient alleles using high stringency.

Phylogenetic reconstruction.

A neighbor-joining phylogenetic reconstruction of the genus Bordetella was determined from pairwise Mash (version 2.0) (85) distances using mashtree (version 0.29) (86). Further phylogenetic reconstruction within species B. holmesii and B. parapertussis was calculated using kSNP3 (87), with a kmer size of 23 bp. Maximum-parsimony trees and pairwise SNP distances were calculated from core variable positions. Trees were annotated with iTOL (version 4) (88). The phylogeny of B. pertussis isolates from the predominant ptxP3-prn2-ptxA1 background was reconstructed from SNPs determined relative to the reference strain C734 (accession number CP013078) using snippy (version 3.1) (https://github.com/tseemann/snippy) after masking all IS elements with N’s. Isolates with <75% coverage breadth were removed (n = 6). A maximum-likelihood phylogeny was estimated from the core SNP alignment using RAxML (version 8.1.16) (89), and a temporal correlation of 0.5231 was determined using TempEst (version 1.5.1) (90). Bayesian divergence time estimation from the core alignment was performed using BEAST (version 1.8.3) (91) with an HKY substitution model, four-category gamma site heterogeneity, and strict clock. Markov chain length was 100,000,000 with parameters sampled every 1,000 states. Model convergence and parameter expected sample sizes were evaluated with tracer (version 1.6) (92), and a maximum clade credibility tree was calculated after a burn-in of 10,000,000 states using TreeAnnotator (version 1.10.0) (91) with common ancestor heights. Tree visualization and annotation were performed with the ggtree R package (version 1.10.5) (93). Repeat content in each genome was quantified by counting the abundance of unique 15-mers with jellyfish (version 2.2.6) (94). Protein-coding genes within each genome were clustered using CD-HIT (version 4.6) (95) with default parameters, except for match cutoffs, which were set at 95% nucleotide sequence identity and 90% length difference. Detected IS elements were annotated using ISfinder (96), and copy numbers were quantified by BLASTn query of a representative genome from each species.

Genome structural variation.

Genomes from each species were aligned in all pairwise combinations using progressiveMauve (97) with optimized parameters (–seed-weight = 16, –hmm-identity = 0.85) and clustered according to their structures as described previously (36). To improve the accuracy of B. pertussis whole-genome alignments, adjacent insertions of IS481 with the same orientation and separated only by their 6-bp target sequences were first collapsed to a single insertion using a custom Perl script. A rarefaction curve of structures observed among B. pertussis isolates collected during 2010 to 2016 was calculated using the vegan R package (version 2.3.0) (98). Alignments were further analyzed using custom Perl scripts to identify pairs of B. pertussis genome clusters differing by only a single inversion or single insertion/deletion of >1,500 bp, set to exceed the length of individual IS elements, and to characterize inversion symmetry measured relative to the positions of oriC and dif (99, 100). The resulting list was used to construct an undirected network of rearrangement and insertion/deletion events (edges) between genome clusters (nodes) using the network (version 1.13.0) (101) and sna (version 2.4) (102) R packages. The centralization of each node in the network was computed as degree of centrality using sna. The balance of symmetric inversions was calculated from the left and right breakpoint distances, measured from the nearest replication origin or terminus as follows: The replichore balance of discrete chromosome structures was determined by the same equation using the distances between the replication origin and terminus. Symmetric inversions observed among 49 B. pertussis alignment pairs were used to fit a linear model to the balance of breakpoint distances in the right and left replichores. The distribution of natural log ratios (balance) was used to prune extremely unbalanced inversions (outliers). The remaining 35 breakpoint distances were used to fit a linear model using the stats R package (version 3.4.4). The resulting model was used to predict all possible inversions between IS elements located within the corresponding 95% prediction intervals in a representative genome from each species. Observed and predicted inversions were visualized using the circlize R package (version 0.4.4) (103).

Data availability.

Source code for custom scripts developed under the present study is available at https://github.com/mikeyweigand/Bordetella_species. The whole-genome shotgun sequences have been deposited at DDBJ/EMBL/GenBank under the accession numbers listed in Table S1. The versions described in this paper are the first versions. Raw sequence data are available from the NCBI Sequence Read Archive, organized under BioProject accession numbers PRJNA279196 and PRJNA287884. Chromosome structural clusters matched to designations from Weigand et al. (36). Download Table S2, DOCX file, 0.01 MB.
  96 in total

1.  Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences.

Authors:  Weizhong Li; Adam Godzik
Journal:  Bioinformatics       Date:  2006-05-26       Impact factor: 6.937

2.  The History of Bordetella pertussis Genome Evolution Includes Structural Rearrangement.

Authors:  Michael R Weigand; Yanhui Peng; Vladimir Loparev; Dhwani Batra; Katherine E Bowden; Mark Burroughs; Pamela K Cassiday; Jamie K Davis; Taccara Johnson; Phalasy Juieng; Kristen Knipe; Marsenia H Mathis; Andrea M Pruitt; Lori Rowe; Mili Sheth; M Lucia Tondella; Margaret M Williams
Journal:  J Bacteriol       Date:  2017-03-28       Impact factor: 3.490

3.  Characterization of Bordetella holmesii isolates from patients with pertussis-like illness in The Netherlands.

Authors:  Frits R Mooi; Sylvia Bruisten; Ineke Linde; Frans Reubsaet; Kees Heuvelman; Saskia van der Lee; Audrey J King
Journal:  FEMS Immunol Med Microbiol       Date:  2011-12-14

4.  Human but not ovine isolates of Bordetella parapertussis are highly clonal as determined by PCR-based RAPD fingerprinting.

Authors:  M H Yuk; U Heininger; G Martínez de Tejada; J F Miller
Journal:  Infection       Date:  1998 Sep-Oct       Impact factor: 3.553

5.  Infections with the agent of 'kennel cough' in patients with cancer.

Authors:  Ariel D Szvalb; Kenneth V Rolston; Nobuyoshi Mori; Jeffrey J Tarrand; Victor E Mulanovich
Journal:  J Infect       Date:  2018-07-23       Impact factor: 6.072

6.  Epidemiologic and laboratory features of a large outbreak of pertussis-like illnesses associated with cocirculating Bordetella holmesii and Bordetella pertussis--Ohio, 2010-2011.

Authors:  Loren Rodgers; Stacey W Martin; Amanda Cohn; Jeremy Budd; Mario Marcon; Andrew Terranella; Sema Mandal; Douglas Salamon; Amy Leber; Maria-Lucia Tondella; Kathleen Tatti; Kevin Spicer; Allen Emanuel; Elizabeth Koch; Londell McGlone; Lucia Pawloski; Mysheika Lemaile-Williams; Naomi Tucker; Radhika Iyer; Thomas A Clark; Mary Diorio
Journal:  Clin Infect Dis       Date:  2012-10-19       Impact factor: 9.079

7.  Host Specificity of Ovine Bordetella parapertussis and the Role of Complement.

Authors:  Sara E Hester; Laura L Goodfield; Jihye Park; Heather A Feaga; Yury V Ivanov; Liron Bendor; Dawn L Taylor; Eric T Harvill
Journal:  PLoS One       Date:  2015-07-09       Impact factor: 3.240

8.  Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees.

Authors:  Ivica Letunic; Peer Bork
Journal:  Nucleic Acids Res       Date:  2016-04-19       Impact factor: 16.971

9.  Insertion sequence-caused large-scale rearrangements in the genome of Escherichia coli.

Authors:  Heewook Lee; Thomas G Doak; Ellen Popodi; Patricia L Foster; Haixu Tang
Journal:  Nucleic Acids Res       Date:  2016-07-18       Impact factor: 16.971

Review 10.  Hard-Wired Control of Bacterial Processes by Chromosomal Gene Location.

Authors:  Jelle Slager; Jan-Willem Veening
Journal:  Trends Microbiol       Date:  2016-06-27       Impact factor: 17.079

View more
  12 in total

1.  Replication-associated inversions are the dominant form of bacterial chromosome structural variation.

Authors:  Matthew D'Iorio; Ken Dewar
Journal:  Life Sci Alliance       Date:  2022-10-19

2.  A Mutation Upstream of the rplN-rpsD Ribosomal Operon Downregulates Bordetella pertussis Virulence Factor Production without Compromising Bacterial Survival within Human Macrophages.

Authors:  Jakub Novák; David Jurnečka; Irena Linhartová; Jana Holubová; Ondřej Staněk; Daniel Štipl; Ana Dienstbier; Branislav Večerek; Nayara Azevedo; Jan Provazník; Vladimír Beneš; Peter Šebo
Journal:  mSystems       Date:  2020-12-08       Impact factor: 6.496

3.  Bilateral symmetry of linear streptomycete chromosomes.

Authors:  Lis Algora-Gallardo; Jana K Schniete; David R Mark; Iain S Hunter; Paul R Herron
Journal:  Microb Genom       Date:  2021-11

Review 4.  Pathogenicity and virulence of Bordetella pertussis and its adaptation to its strictly human host.

Authors:  Thomas Belcher; Violaine Dubois; Alex Rivera-Millot; Camille Locht; Françoise Jacob-Dubuisson
Journal:  Virulence       Date:  2021-12       Impact factor: 5.882

5.  Comparative genomics of Bordetella pertussis isolates from New Zealand, a country with an uncommonly high incidence of whooping cough.

Authors:  Natalie Ring; Heather Davies; Julie Morgan; Maithreyi Sundaresan; Audrey Tiong; Andrew Preston; Stefan Bagby
Journal:  Microb Genom       Date:  2022-01

6.  Towards comprehensive understanding of bacterial genetic diversity: large-scale amplifications in Bordetella pertussis and Mycobacterium tuberculosis.

Authors:  Jonathan S Abrahams; Michael R Weigand; Natalie Ring; Iain MacArthur; Joss Etty; Scott Peng; Margaret M Williams; Barret Bready; Anthony P Catalano; Jennifer R Davis; Michael D Kaiser; John S Oliver; Jay M Sage; Stefan Bagby; M Lucia Tondella; Andrew R Gorringe; Andrew Preston
Journal:  Microb Genom       Date:  2022-02

7.  Characterization of Bordetella pertussis Strains Isolated from India.

Authors:  Shweta Alai; Manish Gautam; Sonali Palkar; Jitendra Oswal; Sunil Gairola; Dhiraj P Dhotre
Journal:  Pathogens       Date:  2022-07-14

8.  A comprehensive resource for Bordetella genomic epidemiology and biodiversity studies.

Authors:  Sébastien Bridel; Valérie Bouchez; Bryan Brancotte; Sofia Hauck; Nathalie Armatys; Annie Landier; Estelle Mühle; Sophie Guillot; Julie Toubiana; Martin C J Maiden; Keith A Jolley; Sylvain Brisse
Journal:  Nat Commun       Date:  2022-07-01       Impact factor: 17.694

Review 9.  Bordetella Type III Secretion Injectosome and Effector Proteins.

Authors:  Jana Kamanova
Journal:  Front Cell Infect Microbiol       Date:  2020-09-04       Impact factor: 5.293

10.  Tatajuba: exploring the distribution of homopolymer tracts.

Authors:  Leonardo de Oliveira Martins; Samuel Bloomfield; Emily Stoakes; Andrew J Grant; Andrew J Page; Alison E Mather
Journal:  NAR Genom Bioinform       Date:  2022-02-02
View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.