Matthew N Van Ert, W Ryan Easterday, Lynn Y Huynh, Richard T Okinaka, Martin E Hugh-Jones, Jacques Ravel, Shaylan R Zanecki, Talima Pearson, Tatum S Simonson, Jana M U'Ren, Sergey M Kachur, Rebecca R Leadem-Dougherty, Shane D Rhoton, Guenevier Zinser, Jason Farlow, Pamala R Coker, Kimothy L Smith, Bingxiang Wang, Leo J Kenefic, Claire M Fraser-Liggett, David M Wagner, Paul Keim.
Abstract
Anthrax, caused by the bacterium Bacillus anthracis, is a disease of historical and current importance that is found throughout the world. The basis of its historical transmission is anecdotal and its true global population structure has remained largely cryptic. Seven diverse B. anthracis strains were whole-genome sequenced to identify rare single nucleotide polymorphisms (SNPs), followed by phylogenetic reconstruction of these characters onto an evolutionary model. This analysis identified SNPs that define the major clonal lineages within the species. These SNPs, in concert with 15 variable number tandem repeat (VNTR) markers, were used to subtype a collection of 1,033 B. anthracis isolates from 42 countries to create an extensive genotype data set. These analyses subdivided the isolates into three previously recognized major lineages (A, B, and C), with further subdivision into 12 clonal sub-lineages or sub-groups and, finally, 221 unique MLVA15 genotypes. This rare genomic variation was used to document the evolutionary progression of B. anthracis and to establish global patterns of diversity. Isolates in the A lineage are widely dispersed globally, whereas the B and C lineages occur on more restricted spatial scales. Molecular clock models based upon genome-wide synonymous substitutions indicate there was a massive radiation of the A lineage that occurred in the mid-Holocene (3,064-6,127 ybp). On more recent temporal scales, the global population structure of B. anthracis reflects colonial-era importation of specific genotypes from the Old World into the New World, as well as the repeated industrial importation of diverse genotypes into developed countries via spore-contaminated animal products. These findings indicate humans have played an important role in the evolution of anthrax by increasing the proliferation and dispersal of this now global disease. 
Finally, the value of global genotypic analysis for investigating bioterrorist-mediated outbreaks of anthrax is demonstrated.
Year: 2007 PMID: 17520020 PMCID: PMC1866244 DOI: 10.1371/journal.pone.0000461
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. The relationship between canSNPs, sub-lineages and/or sub-groups: The stars in this dendrogram represent specific lineages that are defined by one of the seven sequenced genomes of B. anthracis.
The circles represent branch points along the lineages that contain specific subgroups of isolates. These sub-groups are named after the canSNPs that flank these positions. Indicated in red are the positions and names for each of the canSNPs (also see Table 1).
Table 1. Canonical SNPs.
| Lineage/Group | Type strain | Sequence | A.Br.001 | A.Br.002 | A.Br.003 | A.Br.004 | A.Br.006 | A.Br.007 | A.Br.008 | A.Br.009 | B.Br.001 | B.Br.002 | B.Br.003 | B.Br.004 | A/B.Br.001 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C.Br.A1055 | C.A1055 | C.USA.A1055 | T | G | A | T | C | T | T | A | T | G | G | T | |
| B.Br.Kruger | B1.A0442 | KrugerB | T | G | A | T | C | T | T | A | | T | A | T | A |
| B.Br.001/002 | B1.A0102 | | T | G | A | T | C | T | T | A | | | A | T | A |
| B.Br.CNEVA | B2.A0402 | CNEVA.9066 | T | G | A | T | C | T | T | A | T | G | A | | A |
| A.Br.Ames | A2.A0462 | Ames | | A | G | C | A | T | T | A | T | G | G | T | A |
| A.Br.001/002 | A2.A0034 | | | | G | C | A | T | T | A | T | G | G | T | A |
| A.Br.Aust94 | A1.A0039 | Australia94 | T | | | C | A | T | T | A | T | G | G | T | A |
| A.Br.003/004 | A2.A0489 | | T | G | | | A | T | T | A | T | G | G | T | A |
| A.Br.Vollum | A1.A0488 | Vollum | T | G | A | T | A | | T | A | T | G | G | T | A |
| A.Br.005/006 | A1.A0158 | | T | G | A | | | | T | A | T | G | G | T | A |
| A.Br.008/009 | A1.A0293 | | T | G | A | T | A | T | | | T | G | G | T | A |
| A.Br.WNA | A1.A0193 | W. N. America | T | G | A | T | A | T | G | | T | G | G | T | A |
CanSNPs and profiles for the lineages/groups: This table lists each of the 12 lineages and groups and indicates the canonical SNPs that help to define each of the sub-lineages and sub-groups (canSNPs that define a particular sub-lineage or sub-group are indicated in yellow). Each lineage is named after the whole genome sequence that is positioned as an end point in a branch created by a comparison of that particular genome sequence to the 6 other genomes (stars in Figures 1 and 3). As endpoints, all but one of the lineages are defined by a single canSNP (see profiles in yellow for B.Br.Kruger, B.Br.CNEVA, A.Br.Vollum, A.Br.Ames and A.Br.WNA). Although Aust94 is an endpoint, the canSNPs that define this lineage were developed before the draft sequence and, as a result, two canSNPs (A.Br.002 and A.Br.003) define the branch point where this isolate is located. Similarly, the groups are positions that define branch points [5], [35] along the different lineages (circles in Figs. 1 and 3). They carry the group name designations corresponding to the canSNPs that flank these positions and are indicated in blue in this table (e.g. A.Br.001/002). Note that each sub-group needs at least two canSNPs (one SNP on either side of the node) to be assigned correctly. Sub-group A.Br.005/006 requires three canSNPs to assign an exact genotype because a canSNP for A.Br.005 has not yet been tested. The whole genome sequences for Bacillus anthracis strains A1055, Ames Ancestor, CNEVA-9066, Kruger B, Vollum, Western North America (WNA) and Australia 94 can be found on the NCBI microbial genome website at http://www.ncbi.nlm.nih.gov
Figure 3. Worldwide distribution of B. anthracis clonal lineages: Phylogenetic and geographic relationships among 1,033 B. anthracis isolates.
(A) Population structure based upon analysis of data from 12 canSNPs (Protocol S1). The numbers of isolates (N) and associated MLVA genotypes (G) within each sub-lineage are indicated, as well as the average Hamming distance (D) as estimated from VNTR data. The major lineages (A, B, C) are labelled, as are the derived sub-lineages (1–12), which are also color-coded. (B) Frequency and geographic distribution of the B. anthracis sub-lineages. The colors represented in the pie charts correspond to the sub-lineage color designations in panel A.
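The average Hamming distance (D) mentioned in this caption can be illustrated with a minimal Python sketch. It assumes that the distance between two MLVA genotypes is the number of VNTR loci at which their allele calls differ, averaged over all pairs of genotypes in a group. Whether the published D values were divided by the number of loci is not stated here, so normalisation is left as an option. The function names and the toy three-locus genotypes are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of loci at which two genotypes carry different alleles."""
    if len(a) != len(b):
        raise ValueError("genotypes must be typed at the same loci")
    return sum(x != y for x, y in zip(a, b))


def average_hamming(genotypes, normalise=True):
    """Mean pairwise Hamming distance within one group of genotypes.

    normalise=True divides by the number of loci (an assumption, see above).
    """
    pairs = list(combinations(genotypes, 2))
    if not pairs:
        return 0.0  # a single genotype has no within-group diversity
    mean = sum(hamming(a, b) for a, b in pairs) / len(pairs)
    return mean / len(genotypes[0]) if normalise else mean


# Hypothetical allele calls (repeat counts) at three VNTR loci.
toy_group = [(10, 7, 3), (10, 7, 4), (11, 7, 4)]
raw_d = average_hamming(toy_group, normalise=False)
scaled_d = average_hamming(toy_group)
```

For real data each tuple would hold the 15 allele calls of one MLVA15 genotype.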
Figure 2. UPGMA dendrogram of VNTR data from worldwide B. anthracis isolates: Fifteen VNTR loci and UPGMA cluster analysis were used to establish genetic relationships among the 1,033 B. anthracis isolates.
In this UPGMA dendrogram, which was created using MEGA software [39], groups of genetically similar isolates are collapsed into black triangles that are sized in proportion to the number of isolates in that particular lineage. VNTR loci mutate at faster rates than SNPs and, hence, provide greater resolution for terminal branches. The lengths of longer branches, such as the B and C lineages, are underestimated in this analysis due to mutational saturation. The scale bar indicates genetic distance. Also illustrated on this figure is the distribution of the canonical SNP groups relative to the MLVA phylogeny (right columns). The number of isolates (N) associated with each canSNP group is shown in the second column. The correlation between the phylogenetic clusters identified by the canSNP and MLVA analyses with regard to the worldwide geographic distribution of these clusters can be seen in Figure 3.
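The dendrogram itself was built with MEGA. The pure-Python sketch below is not that implementation; it only illustrates how UPGMA clustering proceeds on a pairwise distance matrix: the two closest clusters are merged, and distances to the merged cluster are the size-weighted mean of the distances to its two parts. The function name, the tuple-based tree representation, and the toy matrix are hypothetical.

```python
from itertools import combinations


def upgma(labels, matrix):
    """Cluster labelled items by UPGMA.

    matrix[i][j] is the distance between labels[i] and labels[j].
    Returns a nested tuple (left, right, height); leaves are the labels.
    """
    # Active clusters: id -> (subtree, number of leaves in it).
    clusters = {i: (labels[i], 1) for i in range(len(labels))}
    # Pairwise distances keyed by (smaller id, larger id).
    dist = {(i, j): matrix[i][j] for i, j in combinations(range(len(labels)), 2)}
    next_id = len(labels)
    while len(clusters) > 1:
        # Closest pair; ties are broken by cluster id for reproducibility.
        (a, b), d_ab = min(dist.items(), key=lambda item: (item[1], item[0]))
        tree_a, n_a = clusters.pop(a)
        tree_b, n_b = clusters.pop(b)
        del dist[(a, b)]
        # Distance from the merged cluster to each remaining cluster.
        for c in clusters:
            d_ac = dist.pop((min(a, c), max(a, c)))
            d_bc = dist.pop((min(b, c), max(b, c)))
            dist[(c, next_id)] = (n_a * d_ac + n_b * d_bc) / (n_a + n_b)
        # UPGMA places the new node at half the merge distance.
        clusters[next_id] = ((tree_a, tree_b, d_ab / 2), n_a + n_b)
        next_id += 1
    return next(iter(clusters.values()))[0]


# Hypothetical distances among three isolates.
toy_tree = upgma(["iso1", "iso2", "iso3"],
                 [[0, 2, 6],
                  [2, 0, 6],
                  [6, 6, 0]])
```

Here `iso1` and `iso2` join first at height 1.0, and `iso3` joins that pair at height 3.0. Because UPGMA averages observed distances, saturated (underestimated) distances on long branches shorten those branches in the tree, as the caption notes for the B and C lineages.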
Table 2. Molecular clock estimates of separation times among B. anthracis sub-lineages.
| Compared lineages^a | Major groupings | Total synonymous sites^b | Observed sSNPs | sSNP substitution frequency | 1 death per year model (ybp ± 2 STD)^c | 0.5 death per year model (ybp ± 2 STD)^c |
|---|---|---|---|---|---|---|
| Vollum^d / Ames^e | A vs. A | 899,987 | 153 | 1.7E-04 | 3,801 ± 123 | 7,603 ± 174 |
| Vollum^d / WNA^f | A vs. A | 899,957 | 129 | 1.4E-04 | 3,205 ± 113 | 6,411 ± 160 |
| Ames^e / WNA^f | A vs. A | 902,239 | 114 | 1.3E-04 | 2,825 ± 106 | 5,651 ± 150 |
| CNEVA^g / Ames^e | B vs. A | 901,936 | 322 | 3.6E-04 | 7,983 ± 179 | 15,966 ± 253 |
| KrugerB^h / Ames^e | B vs. A | 902,983 | 384 | 4.3E-04 | 9,509 ± 195 | 19,019 ± 276 |
| CNEVA^g / KrugerB^h | B vs. B | 901,935 | 188 | 2.1E-04 | | |
| C.A1055^i / CNEVA^g | C vs. B | 901,783 | 484 | 5.4E-04 | | |
| C.A1055^i / Ames^e | C vs. A | 901,791 | 553 | 6.1E-04 | | |
a Sub-lineages according to Fig. 1.
b Total synonymous sites: the total sites for synonymous substitutions were determined separately for each pair-wise comparison.
c The model for sSNP substitution rate is particularly sensitive to the number of death cycles per year. Therefore, two possible scenarios (1 and 0.5 deaths per year) were modelled (see supporting methods on the PNAS website for more details). STD = the standard deviation for observed sSNPs, calculated as the square root of the time estimate. Thus, 2 STD represents a ∼95% confidence interval based upon fluctuation in this parameter of the model.
d Sequence from the Vollum strain, The Institute for Genomic Research (TIGR).
e Sequence from the 'Ames Ancestor' strain, GenBank Reference Sequence NC_007530.
f Sequence from the Western North America USA 6153 strain, TIGR.
g Sequence from the CNEVA-9066 strain, TIGR.
h Sequence from the Kruger B strain, TIGR.
i Sequence from the A1055 strain, TIGR.
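The arithmetic behind the separation-time columns can be sketched in Python. The per-year rate constant used here is an assumption: it is back-calculated from the completed rows of the table (for example 153 / 899,987 / 3,801 ≈ 4.47E-08) and is not stated in this record; the underlying per-generation mutation rate and generation counts are in the supporting methods. The ± value follows the footnote: 2 STD is twice the square root of the time estimate. The function and constant names are hypothetical.

```python
import math

# Assumption: pairwise synonymous divergence per site per year at one death
# cycle per year, back-calculated from the table rather than quoted from it.
IMPLIED_PAIRWISE_RATE = 4.472e-8


def separation_time(observed_ssnps, total_syn_sites, deaths_per_year=1.0):
    """Return (years before present, 2 STD) for one pairwise comparison."""
    frequency = observed_ssnps / total_syn_sites
    # Fewer death cycles per year means a slower clock and an older split.
    years = frequency / (IMPLIED_PAIRWISE_RATE * deaths_per_year)
    two_std = 2 * math.sqrt(years)  # STD = sqrt(time estimate), per footnote c
    return years, two_std


# Vollum / Ames row: 153 sSNPs over 899,987 synonymous sites.
years_1, err_1 = separation_time(153, 899_987, deaths_per_year=1.0)
years_half, err_half = separation_time(153, 899_987, deaths_per_year=0.5)
```

With these inputs the sketch gives values that round to the table's 3,801 ± 123 and 7,603 ± 174 ybp for the two models.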