Alexander Mellmann, Thomas Weniger, Christoph Berssenbrügge, Jörg Rothgänger, Michael Sammeth, Jens Stoye, Dag Harmsen.
Abstract
BACKGROUND: For typing of Staphylococcus aureus, DNA sequencing of the repeat region of the protein A (spa) gene is a well-established discriminatory method for outbreak investigations. Recently, it was hypothesized that this region also reflects long-term epidemiology. However, no automated and objective algorithm existed to cluster different repeat regions. In this study, the Based Upon Repeat Pattern (BURP) implementation, a heuristic variant of the newly described EDSI algorithm, was investigated to infer the clonal relatedness of different spa types. For calibration of BURP parameters, 400 representative S. aureus strains with different spa types were characterized by MLST and clustered using eBURST as the "gold standard" for their phylogeny. Typing concordance analyses between eBURST and BURP clustering (spa-CC) were performed using all possible BURP parameters to determine their optimal combination. BURP was subsequently evaluated with a strain collection reflecting the breadth of diversity of S. aureus (JCM 2002; 40:4544).
Year: 2007 PMID: 17967176 PMCID: PMC2148047 DOI: 10.1186/1471-2180-7-98
Source DB: PubMed Journal: BMC Microbiol ISSN: 1471-2180 Impact factor: 3.605
Figure 1. Concordance analysis of eBURST and BURP clustering in dependence of all possible BURP parameters. A surface curve displaying the dependence of concordance (in %) between eBURST MLST CCs and BURP spa-CCs applying all possible combinations of the BURP parameters "exclude spa types that are shorter than x repeats" and "spa types are clustered if costs are less or equal than y".
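The parameter scan behind Figure 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the BURP implementation: the pairwise costs are taken as given (the EDSI cost computation is not reproduced), clustering is assumed to be simple single linkage on "cost <= y", and concordance is approximated by pairwise agreement with the reference grouping, which may differ from the statistic used in the paper. All spa type names, repeat counts, costs, and function names below are hypothetical.

```python
from itertools import combinations


def burp_like_clusters(repeat_counts, costs, min_repeats, max_cost):
    """Drop spa types shorter than min_repeats, then link pairs with cost <= max_cost."""
    kept = [t for t, n in repeat_counts.items() if n >= min_repeats]
    parent = {t: t for t in kept}

    def find(t):
        # Union-find root lookup with path halving.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for a, b in combinations(kept, 2):
        cost = costs.get(frozenset((a, b)))
        if cost is not None and cost <= max_cost:
            parent[find(a)] = find(b)
    return {t: find(t) for t in kept}


def pairwise_concordance(labels_a, labels_b):
    """Percentage of pairs on which both groupings agree (assumed measure)."""
    shared = sorted(set(labels_a) & set(labels_b))
    pairs = list(combinations(shared, 2))
    if not pairs:
        return None
    agree = sum(
        (labels_a[x] == labels_a[y]) == (labels_b[x] == labels_b[y])
        for x, y in pairs
    )
    return 100.0 * agree / len(pairs)


def calibrate(repeat_counts, costs, reference, repeat_range, cost_range):
    """Scan every (x, y) combination; return {(x, y): (concordance %, excluded)}."""
    results = {}
    for x in repeat_range:
        for y in cost_range:
            clusters = burp_like_clusters(repeat_counts, costs, x, y)
            excluded = len(repeat_counts) - len(clusters)
            results[(x, y)] = (pairwise_concordance(clusters, reference), excluded)
    return results


# Hypothetical toy data: five spa types, their repeat counts, some pairwise
# costs, and a reference grouping standing in for eBURST clonal complexes.
repeat_counts = {"s1": 8, "s2": 7, "s3": 3, "s4": 9, "s5": 6}
costs = {
    frozenset(("s1", "s2")): 2,
    frozenset(("s1", "s3")): 1,
    frozenset(("s4", "s5")): 4,
    frozenset(("s2", "s4")): 7,
}
reference = {"s1": "X", "s2": "X", "s3": "Y", "s4": "Z", "s5": "Z"}

grid = calibrate(repeat_counts, costs, reference, repeat_range=(1, 5), cost_range=(4, 7))
for (x, y), (concordance, excluded) in sorted(grid.items()):
    print(f"min repeats {x}, max cost {y}: {concordance:.1f}% concordant, {excluded} excluded")
```

In this toy scan, raising the minimum repeat count removes the short type that was misgrouped, which raises agreement at the price of one excluded type. That is the same trade-off between concordance and excluded spa types that the calibration weighs.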
Figure 2. High range of concordance between eBURST and BURP for optimal BURP calibration. Graph showing curves for cost integers in the high concordance range. Curves labeled "Costs: 1 to 10" represent different cost values. For the curve with the overall highest concordance (Costs: 4) the first inflection point is marked (arrow) and corresponds to the first local optimum giving a good balance between concordance and percentage of excluded spa types.
Figure 3. Population snapshot of the 400 S. aureus strains. Population snapshot of the 400 S. aureus strains after grouping with the calibrated BURP ("exclude spa types that are shorter than 5 repeats" and "spa types are clustered if costs are less or equal than 4"; 31 spa types were excluded). Clusters of linked isolates correspond to spa-CCs. Whereas eBURST uses the number of relatives (single locus variants, SLVs) to define founders and subfounders of groups, BURP sums up costs to define a founder-score for each spa type in a cluster. The spa type with the highest founder-score is defined as the founder of the cluster (blue). Subfounders are the spa types with the second highest founder-score and are labeled in yellow. If two or more spa types share the same highest founder-score, they are all colored blue. For clarity, only the spa-CCs are labeled. Note that the spacing between linked spa types, and between unlinked spa types and spa-CCs, provides no information about the genetic distance between them.
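The founder and subfounder labeling described in this caption can be sketched as below. The caption does not give the formula by which costs are summed into a founder-score, so the scores are taken as an input here; the function name, the spa type names, and the score values are hypothetical.

```python
def label_founders(founder_scores):
    """Label each spa type in one cluster from precomputed founder-scores.

    The highest score marks the founder (blue in the figure); ties at the top
    are all founders. The second highest score marks subfounders (yellow).
    """
    if not founder_scores:
        return {}
    ranked = sorted(set(founder_scores.values()), reverse=True)
    top = ranked[0]
    second = ranked[1] if len(ranked) > 1 else None
    labels = {}
    for spa_type, score in founder_scores.items():
        if score == top:
            labels[spa_type] = "founder"
        elif score == second:
            labels[spa_type] = "subfounder"
        else:
            labels[spa_type] = "member"
    return labels


# Hypothetical scores for one cluster; s1 and s2 tie for the highest score.
cluster_labels = label_founders({"s1": 12, "s2": 12, "s3": 9, "s4": 4})
print(cluster_labels)
```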
Comparison of BURP and eBURST clustering results
| spa type(s) | MLST ST | MLST CC |
| --- | --- | --- |
| t004, t015, t028, t029, t031, t033, t038, t040, t043, t049, t050, t061, t065, t069, t073, t077, t080, t095, t102, t116, t123, t124, t130, t141, t142, t157, t161, t204, t230, t247, t266, t277, t330, t331, t333, t340, t350, t361, t370, t371, t424 | 45 | 45 |
| t180 | 53 | 45 |
| t220 | 54 | 45 |
| t295 | 278 | 45 |
| t209 | 109 | 9 |
| t133 | 254 | 239 |
| t412 | 846 | 395 |
| t302 | 625 | singleton^a |
| t397 | 842 | singleton |
| t383 | 1008^b | singleton |
The spa types of spa-CC004, their corresponding MLST sequence types (ST), and clonal complexes (CC) are shown. (a) No clonal complex was assigned to these singletons by eBURST analysis. (b) This ST is preliminarily named ST1008 and has the allelic profile 6, 5, 6, 6, 7, 17, 19.
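As a reading aid, the rows above can be tallied by MLST clonal complex. This is only a count over the table as transcribed here, not the concordance statistic used in the study; the variable names are illustrative.

```python
from collections import Counter

# Rows of the table: (spa types, MLST ST, MLST CC), transcribed from above.
st45_types = (
    "t004 t015 t028 t029 t031 t033 t038 t040 t043 t049 t050 t061 t065 t069 "
    "t073 t077 t080 t095 t102 t116 t123 t124 t130 t141 t142 t157 t161 t204 "
    "t230 t247 t266 t277 t330 t331 t333 t340 t350 t361 t370 t371 t424"
).split()
rows = [
    (st45_types, "45", "45"),
    (["t180"], "53", "45"),
    (["t220"], "54", "45"),
    (["t295"], "278", "45"),
    (["t209"], "109", "9"),
    (["t133"], "254", "239"),
    (["t412"], "846", "395"),
    (["t302"], "625", "singleton"),
    (["t397"], "842", "singleton"),
    (["t383"], "1008", "singleton"),
]

# Count spa types per MLST clonal complex within spa-CC004.
cc_counts = Counter()
for spa_types, _st, cc in rows:
    cc_counts[cc] += len(spa_types)

total = sum(cc_counts.values())
majority_cc, majority_n = cc_counts.most_common(1)[0]
print(f"{majority_n} of {total} spa types fall in MLST CC{majority_cc} "
      f"({100 * majority_n / total:.0f}%)")
for cc, n in cc_counts.items():
    print(f"CC {cc}: {n}")
```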