Xavier Didelot, Rachel Urwin, Martin C J Maiden, Daniel Falush.
Abstract
Despite the increasing popularity of multilocus sequence typing (MLST), the most appropriate method for characterizing bacterial variation and facilitating epidemiological investigations remains a matter of debate. Here, we propose that different typing schemes should be compared on the basis of their power to infer clonal relationships. We investigate the utility of sequence data for genealogical reconstruction by exploiting new statistical tools and data from 20 housekeeping loci for 93 isolates of the bacterial pathogen Neisseria meningitidis. Our analysis demonstrated that all but one of the hyperinvasive isolates established by multilocus enzyme electrophoresis (MLEE) and MLST were grouped into one of six genealogical lineages, each of which contained substantial variation. Due to the confounding effect of recombination, evolutionary relationships among these lineages remained unclear, even using 20 loci. Analyses of the seven loci in the standard MLST scheme using the same methods reproduced this classification, but were unable to support finer inferences concerning the relationships between the members within each complex.
Year: 2009 PMID: 19643763 PMCID: PMC2762044 DOI: 10.1099/mic.0.031534-0
Source DB: PubMed Journal: Microbiology (Reading) ISSN: 1350-0872 Impact factor: 2.777
Gene fragments analysed
Annotations are from Ensembl Genomes (http://bacteria.ensembl.org), accessed 22 July 2009, except where a different locus name is customarily used in the MLST scheme. In these cases the Ensembl designation is given in parentheses.
| Putative ABC transporter ATP-binding protein | 433 | 17.3 | 15* | 0.049 | |
| Adenylate kinase | 465 | 3.7 | 10* | 0.010 | |
| Shikimate dehydrogenase | 490 | 33.9 | 19* | 1.545 | |
| Fumarate hydratase class II | 465 | 8.2 | 19* | 0.022 | |
| Glucose-6-phosphate 1-dehydrogenase ( | 501 | 5.6 | 16* | 0.045 | |
| Monofunctional biosynthetic peptidoglycan transglycosylase ( | 497 | 15.3 | 16* | 0.115 | |
| Pyruvate dehydrogenase subunit E1 ( | 480 | 16.7 | 24* | 0.063 | |
| Phosphoglucomutase | 450 | 17.1 | 21* | 0.119 | |
| Probable signal recognition particle protein | 432 | 11.6 | 36* | 0.002 | |
| Proline iminopeptidase | 416 | 6.2 | 19* | 0.449 | |
| Polyphosphate kinase | 579 | 13.3 | 23* | 0.088 | |
| Phosphoserine aminotransferase | 451 | 14.9 | 29* | 0.123 | |
| Aspartate ammonia-lyase | 432 | 6.7 | 11† | 0.078 | |
| Carbamoyl phosphate synthase large subunit | 495 | 9.7 | 15† | 0.014 | |
| Dihydropteroate synthase | 366 | 21.2 | 33† | 0.189 | |
| Glutamine synthetase | 471 | 13.0 | 11† | 0.036 | |
| Ribose-5-phosphate isomerase A | 567 | 8.2 | 21† | 0.032 | |
| Transaldolase | 492 | 11.0 | 15† | 0.237 | |
| 2,3-Bisphosphoglycerate-dependent phosphoglycerate mutase ( | 534 | 5.0 | 18† | 0.163 | |
| Pyruvate kinase ( | 453 | 10.4 | 17† | 0.071 |
*From 107 isolates.
†From 93 isolates.
Fig. 1. Majority-rule consensus tree for the clonal genealogies inferred by ClonalFrame when using all 20 gene fragments. The isolates are represented according to their MLEE designations, as shown in the key; the absence of a symbol means that no MLEE designation corresponds to the electrophoretic type of that isolate. ST numbers are shown for the isolates that do not belong to any of the six hyperinvasive complexes; see Fig. 3 for the others.
Properties of the six hyperinvasive lineages
| No. of isolates | 8 | 10 | 10 | 14 | 13 | 23 |
| No. of ETs* | 1 | 1 | 1 | 5 | 2 | 4 |
| No. of MLST STs | 4 | 3 | 1 | 3 | 8 | 4 |
| No. of types (all fragments) | 8 | 8 | 8 | 14 | 12 | 14 |
| No. of polymorphic sites | 270 | 162 | 121 | 395 | 314 | 246 |
| No. of polymorphic fragments | 12 | 9 | 9 | 16 | 17 | 14 |
| Mean pairwise distance | 0.009 | 0.004 | 0.004 | 0.010 | 0.008† | 0.007 |
| Relative age‡ | 0.3 [0.1;0.5] | 0.2 [0.1;0.4] | 0.1 [0.1;0.2] | 0.4 [0.2;0.6] | 0.3 [0.2;0.5]§ | 0.2 [0.1;0.4] |
| No. of mutation events|| | 1.6 [0.6;3.3] | 0.5 [0.2;1.5] | 0.9 [0.6;1.6] | 8.8 [5.5;13.2] | 5.8 [3.1;9.8] | 0.9 [0.2;1.8] |
| No. of recombination events|| | 5.1 [3.6;7.1] | 3.0 [2.0;5.1] | 1.5 [1.2;2.2] | 12.9 [10.6;15.4] | 9.5 [6.0;14.4] | 5.4 [4.7;6.1] |
| No. of substitutions via recombination|| | 85 [65;107] | 49 [28;76] | 22 [19;36] | 121 [94;138] | 83 [53;120] | 58 [56;63] |
| Proportion of clonal frame|| | 80 % [74;86] | 87 % [83;91] | 92 % [90;93] | 72 % [68;79] | 75 % [69;82] | 81 % [79;82] |
| ρ/θ | 3.8 [1.5;9.2] | 6.8 [2.0;14.5] | 1.7 [0.8;3.5] | 1.5 [0.9;2.6] | 1.7 [0.9;2.9] | 7.7 [2.7;26.5] |
| r/m | 63.7 [25.5;144.6] | 108.9 [35.7;217.6] | 25.7 [13.8;45.6] | 14.4 [8.7;23.0] | 15.2 [8.5;24.6] | 82.3 [31.2;271.9] |
| External/internal branch length ratio | 3.9 [1.7;7.3] | 2.3 [0.8;4.3] | 2.4 [0.9;5.1] | 1.4 [1.1;1.9] | 2.8 [1.4;5.1] | 1.9 [1.0;3.0] |
| Statistical significance¶ | 0.012 | 0.047 | 0.046 | 0.045 | 0.006 | 0.004 |
*Electrophoretic types as determined by MLEE.
†Between the isolates of lineage 3 and the three others, the mean per site pairwise distance is 0.015.
‡As a fraction of the estimated age of N. meningitidis. The numbers in square brackets in this and subsequent rows are 95% credibility intervals.
§The relative age of the group of isolates of lineage 3 is 0.2 [0.1;0.3].
||Measured between each isolate of each complex and their most recent common ancestor.
¶Deviation from random expectation under the Kingman coalescent (Kingman, 1982a).
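The "Mean pairwise distance" row can be illustrated with a short calculation. The sketch below assumes the statistic is the proportion of differing sites averaged over all pairs of aligned sequences within a lineage; this record does not spell out the exact definition, and the function name and toy sequences are hypothetical.

```python
from itertools import combinations


def mean_pairwise_distance(seqs):
    """Mean per-site proportion of differing sites over all sequence pairs.

    Assumes the sequences are already aligned and of equal length.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    total = 0.0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        # Per-site distance for this pair: differing sites / alignment length.
        total += sum(x != y for x, y in zip(a, b)) / length
        n_pairs += 1
    return total / n_pairs


# Toy alignment for illustration only (not data from the study).
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
print(mean_pairwise_distance(toy_alignment))
```

With real data, the input would be the concatenated gene fragments of the isolates assigned to one lineage.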
Fig. 2. Inferred (black) and expected (white) values for the ratio of the external to internal branch lengths in the ST-4/5 complex and in the ST-41 complex. Expectations were estimated under the neutral coalescent model (Kingman, 1982a, b).
Fig. 3. Evolutionary events inferred by ClonalFrame on the branches of the ST-8 complex for the 20-locus dataset. The red line indicates the probability of recombination; substitutions, caused by either mutation or recombination, are represented as black crosses.
Fig. 4. Comparison of the ClonalFrame analyses for the seven- and 20-locus datasets. For each of the six complexes, the seven-locus analyses are shown to the left and the 20-locus analyses to the right. The isolates are represented according to their MLEE designations as in Fig. 1, and labelled with their ST number according to the seven MLST gene fragments. The numbers attached to internal nodes of the ClonalFrame outputs are in the format X/Y, where X and Y are the confidence in the node according to the seven- and 20-locus analyses respectively. X and Y are expressed on a scale from 0 to 10.
Fig. 5. Number of nodes found compared to coalescent expectation. The numbers of clusters found by the 20-locus (light grey) and seven-locus (dark grey) analyses, sorted according to the inferred age of the clusters, are compared to the expectation (crosses) and 95 % credibility interval for the number of groups of different ages under a coalescent model computed using a Monte Carlo simulation (Kingman, 1982a, b).
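The coalescent expectations referred to in the captions of Fig. 2 and Fig. 5 come from simulation under the Kingman model. As a minimal sketch of that kind of Monte Carlo calculation, the code below simulates neutral coalescent trees and measures the ratio of external to internal branch length, then reports how often a simulated ratio is at least as large as an observed one. It is an illustration under stated assumptions, not the authors' procedure: the function names, the number of simulations and the one-sided comparison are all hypothetical choices.

```python
import random


def simulate_branch_lengths(n, rng):
    """Simulate one Kingman coalescent tree for n sampled isolates.

    Returns (external, internal): the total length of branches leading
    directly to a sampled isolate, and the total length of all other
    branches. Time is in coalescent units (each pair merges at rate 1).
    """
    if n < 3:
        raise ValueError("need at least 3 isolates for a non-zero internal length")
    # True marks a lineage that is still an external (tip) branch.
    is_external = [True] * n
    external = 0.0
    internal = 0.0
    while len(is_external) > 1:
        k = len(is_external)
        # Waiting time to the next coalescence among k lineages.
        wait = rng.expovariate(k * (k - 1) / 2)
        n_ext = sum(is_external)
        external += n_ext * wait
        internal += (k - n_ext) * wait
        # Merge two lineages chosen uniformly at random; their parent is internal.
        i, j = sorted(rng.sample(range(k), 2), reverse=True)
        del is_external[i]
        del is_external[j]
        is_external.append(False)
    return external, internal


def tail_probability(observed_ratio, n, n_sim=10000, seed=1):
    """Fraction of simulated trees whose external/internal ratio is >= observed."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        ext, inter = simulate_branch_lengths(n, rng)
        if ext / inter >= observed_ratio:
            hits += 1
    return hits / n_sim


# Inputs taken from the first column of the lineage table (8 isolates, ratio 3.9),
# used here only as an illustration of how the function is called.
print(tail_probability(3.9, n=8))
```

The printed value should not be expected to reproduce the "Statistical significance" row of the table, because this record does not describe how the published test handled uncertainty in the inferred genealogies.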