Amaya Campillay Lagos, Martin Sundqvist, Fredrik Dyrkell, Marc Stegger, Bo Söderquist, Paula Mölling.
Abstract
Whole genome sequencing (WGS) of methicillin-resistant Staphylococcus aureus (MRSA) provides high-resolution typing, facilitating surveillance and outbreak investigations. The aim of this study was to evaluate the genomic variation rate in MRSA by comparing commonly used core genome multilocus sequence typing (cgMLST) with single nucleotide polymorphism (SNP) analyses. WGS was performed on 95 MRSA isolates collected from 20 carriers during 2003-2019. To assess variation and method-related differences, two cgMLST schemes were applied, using Ridom SeqSphere+ and the cloud-based 1928 platform. In addition, two SNP methods were used: the 1928 platform and the Northern Arizona SNP Pipeline (NASP). cgMLST with Ridom SeqSphere+ and 1928 showed a median of 5.0 and 2.0 allele variants per year, respectively. In the SNP analysis, performed with two reference genomes (COL and Newman), 1928 showed a median of 13 and 24 SNPs per individual/year including presumed recombination, and 3.8 and 4.0 SNPs per individual/year without recombination. Correspondingly, NASP showed a median of 5.5 and 5.8 SNPs per individual/year. In conclusion, an estimated genomic variation rate of 2.0-5.8 genetic events per year (without recombination) is suggested as a general guideline for clinical laboratories to use in surveillance and outbreak investigations, independently of the analysis approach used.
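As a rough illustration of how the suggested guideline could be applied, the sketch below scales the 2.0-5.8 events/year range by the time between two isolates. The linear scaling, the function names and the example numbers are assumptions made for illustration; the abstract does not specify a decision rule.

```python
from typing import Tuple

# Guideline range from the abstract: 2.0-5.8 genetic events per year
# (recombination excluded). Scaling it linearly with time is an
# assumption of this sketch, not a rule stated by the authors.
RATE_LOW = 2.0
RATE_HIGH = 5.8


def expected_events(years: float) -> Tuple[float, float]:
    """Return the (low, high) number of events expected after `years`."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return (RATE_LOW * years, RATE_HIGH * years)


def exceeds_guideline(observed_events: float, years: float) -> bool:
    """True if the observed distance is above the upper guideline bound."""
    _, high = expected_events(years)
    return observed_events > high


# Hypothetical example: two isolates collected 3 years apart.
low, high = expected_events(3.0)
print(f"Expected after 3 years: {low:.1f}-{high:.1f} events")
print("40 events exceeds guideline:", exceeds_guideline(40, 3.0))
```

A distance above the upper bound would only flag a pair of isolates for closer review; it is not by itself evidence of a strain change.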
Year: 2022 PMID: 35732699 PMCID: PMC9214674 DOI: 10.1038/s41598-022-14640-w
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Classification of samples according to multilocus sequence type (MLST), core genome MLST complex type (CT) and Staphylococcus aureus protein A (spa) type, per patient and year, performed with SeqSphere+.
A change of MRSA strain is clearly visible (orange) for patients 5, 9 and 12; these samples (n = 4) were therefore excluded from the calculations. Two of the five samples from patient 10 had a different CT and spa type; consequently, patient 10 was divided into two groups, patient 10.1 (dark colour) and patient 10.2 (light colour). Three samples (light yellow) were re-analysed for spa type with Sanger sequencing because WGS failed to define the spa type. In addition, five more samples (two from patient 2, two from patient 6 and one from patient 16) were Sanger sequenced to verify the WGS results, because the spa type changed within the patient during follow-up.
Figure 1. Circular neighbour-joining tree based on the data analysed with SeqSphere+, with missing pairwise values ignored. The tree is arranged by MRSA sample, coloured by complex type (CT) as defined by the core genome multilocus sequence typing (cgMLST) analysis with SeqSphere+. MLST sequence type (ST) and clonal complex (CC) are also shown. The remaining 91 samples included in the analyses are shown in the phylogenetic tree, and samples where the spa type differed are marked with a red box.
Mean distance between MRSA samples for each patient and for each analysis performed. cgMLST values were calculated from the pairwise allelic differences between samples, and SNP values from the pairwise SNP differences between samples.
| Patient | Samples | Years (A) | cgMLST (B): SeqSphere COL | cgMLST (B): 1928 Newman | SNP NASP (C): COL | SNP NASP (C): Newman | SNP with recombination 1928: COL | SNP with recombination 1928: Newman | SNP without recombination 1928: COL | SNP without recombination 1928: Newman |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 5 | 4 | 5.8 | 1.5 | 5.8 | 5.8 | 47.0 | 22.3 | 5.5 | 5.0 |
| 2 | 5 | 4 | 6.5 | 4.0 | 8.0 | 8.0 | 25.0 | 13.5 | 6.5 | 5.8 |
| 3 | 4 | 4 | 9.3 | 6.5 | 9.3 | 9.0 | 36.0 | 13.3 | 6.3 | 6.3 |
| 4 | 4 | 4 | 8.3 | 6.8 | 9.3 | 9.3 | 41.3 | 13 | 4.8 | 4.8 |
| 5 | 2 | 3 | 1.5 | 0.5 | 2.5 | 2.0 | 10.8 | 3.8 | 1.0 | 1.0 |
| 6 | 6 | 8 | 4.0 | 2.3 | 4.8 | 4.8 | 14.1 | 10.9 | 3.4 | 3.5 |
| 7 | 6 | 6 | 10.8 | 3.5 | 12.5 | 12.3 | 32.5 | 17.8 | 8.7 | 8.8 |
| 8 | 3 | 2 | 6.5 | 4.5 | 8.0 | 8.0 | 12.0 | 20 | 6.5 | 6.0 |
| 9 | 2 | 2 | 0.5 | 0.6 | 2.2 | 1.2 | 9.4 | 3.6 | 1.4 | 1.2 |
| 10.1 | 2 | 2 | 5.5 | 1.5 | 6.5 | 5.5 | 37.0 | 13.0 | 1.0 | 1.0 |
| 10.2 | 3 | 3 | 4.3 | 2.7 | 5.7 | 4.3 | 26.0 | 19.3 | 2.0 | 2.0 |
| 11 | 4 | 4 | 4.0 | 1.5 | 5.0 | 5.0 | 29.0 | 17.5 | 3.8 | 3.8 |
| 12 | 5 | 5 | 6.0 | 3.8 | 6.8 | 6.6 | 23.6 | 13.2 | 4.4 | 4.0 |
| 13 | 3 | 4 | 4.5 | 2.0 | 4.8 | 4.8 | 28.3 | 6.8 | 3.5 | 3.5 |
| 14 | 9 | 11 | 2.7 | 1.1 | 2.5 | 2.5 | 16.1 | 4.1 | 2.0 | 1.7 |
| 15 | 9 | 8 | 6.3 | 3.4 | 6.3 | 6.3 | 23.9 | 10.1 | 4.3 | 4.0 |
| 16 | 4 | 3 | 9.3 | 4.0 | 10.3 | 8.3 | 14.7 | 14.3 | 7.7 | 7.7 |
| 17 | 4 | 5 | 5.0 | 1.2 | 6.4 | 6.2 | 8.4 | 9.4 | 4.8 | 5.0 |
| 18 | 4 | 4 | 2.3 | 1.3 | 3.5 | 2.3 | 15.5 | 13.3 | 0.75 | 0.8 |
| 19 | 3 | 5 | 1.6 | 0.6 | 2.6 | 2.6 | 12.8 | 6.2 | 1.2 | 1.2 |
| 20 | 4 | 6 | 4.2 | 1.0 | 5.2 | 5.2 | 12.8 | 11.3 | 3.3 | 2.3 |
| Mean | | | 5.2 | 2.6 | 6.1 | 5.7 | 22.7 | 12.2 | 4.0 | 3.8 |
| Median | | | 5.0 | 2.0 | 5.8 | 5.5 | 23.6 | 13.0 | 4.0 | 3.8 |
| Standard deviation | | | 2.7 | 1.8 | 2.8 | 2.8 | 11.3 | 5.3 | 2.0 | 2.3 |
A: years passed from the first to the last sample collection. B: cgMLST = alleles per year. C: SNP = SNPs per year.
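The summary rows of the table can be recomputed from the per-patient values. The sketch below does this for the two cgMLST columns only, using Python's standard `statistics` module. The variable names are illustrative, and the use of the sample standard deviation is an assumption, since the source does not state which form was used.

```python
import statistics

# cgMLST columns (alleles per year), copied from the table above, one value
# per row in table order (patients 1-20, with patient 10 split in two).
columns = {
    "SeqSphere+ cgMLST": [
        5.8, 6.5, 9.3, 8.3, 1.5, 4.0, 10.8, 6.5, 0.5, 5.5, 4.3,
        4.0, 6.0, 4.5, 2.7, 6.3, 9.3, 5.0, 2.3, 1.6, 4.2,
    ],
    "1928 cgMLST": [
        1.5, 4.0, 6.5, 6.8, 0.5, 2.3, 3.5, 4.5, 0.6, 1.5, 2.7,
        1.5, 3.8, 2.0, 1.1, 3.4, 4.0, 1.2, 1.3, 0.6, 1.0,
    ],
}

summary = {}
for name, values in columns.items():
    summary[name] = {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Sample standard deviation (n - 1); an assumption of this sketch.
        "sd": statistics.stdev(values),
    }
    s = summary[name]
    print(f"{name}: mean={s['mean']:.1f} median={s['median']:.1f} sd={s['sd']:.1f}")
```

The same loop applies to the SNP columns if their values are added to `columns`.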
Estimated genomic variation rate within each sequence type (ST) based only on the COL reference sequence.
| ST | Patients/samples | SeqSphere+ alleles/year | NASP SNPs/year | 1928 WOR (A) SNPs/year |
|---|---|---|---|---|
| 1 | 1/2 | 1.5 | 2.5 | 1.0 |
| 5 | 1/9 | 6.2 | 6.3 | 4.2 |
| 6 | 1/4 | 9.2 | 9.2 | 6.2 |
| 22 | 2/9 | 6.9 | 7.5 | 5.2 |
| 45 | 2/9 | 3.9 | 5.2 | 1.2 |
| 59 | 1/2 | 0.6 | 2.2 | 1.2 |
| 72 | 2/10 | 3.6 | 3.6 | 2.8 |
| 80 | 1/4 | 9.2 | 10.3 | 7.6 |
| 88 | 7/29 | 4.5 | 5.6 | 4.1 |
| 1340 | 1/6 | 10.8 | 12.5 | 8.75 |
| 5210 | 1/5 | 5.9 | 6.8 | 4.4 |
A: WOR = without recombination.
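The per-ST rates in the table above can be compared directly to see how the three analyses relate. The following sketch counts the STs in which the 1928 rate without recombination is below the NASP rate, and those in which it is the lowest of the three methods. It also compares the gaps between methods for ST88. The data structure and variable names are illustrative choices, not part of the study.

```python
# Rates per ST from the table above, COL reference:
# (SeqSphere+ alleles/year, NASP SNPs/year, 1928 WOR SNPs/year).
rates = {
    "1": (1.5, 2.5, 1.0),
    "5": (6.2, 6.3, 4.2),
    "6": (9.2, 9.2, 6.2),
    "22": (6.9, 7.5, 5.2),
    "45": (3.9, 5.2, 1.2),
    "59": (0.6, 2.2, 1.2),
    "72": (3.6, 3.6, 2.8),
    "80": (9.2, 10.3, 7.6),
    "88": (4.5, 5.6, 4.1),
    "1340": (10.8, 12.5, 8.75),
    "5210": (5.9, 6.8, 4.4),
}

# STs where 1928 (without recombination) is below NASP.
below_nasp = [st for st, (cg, nasp, wor) in rates.items() if wor < nasp]

# STs where 1928 (without recombination) is the lowest of the three.
lowest_of_three = [st for st, (cg, nasp, wor) in rates.items() if wor < min(cg, nasp)]

# ST88: absolute gaps between the methods.
cg88, nasp88, wor88 = rates["88"]
gap_cg_wor = abs(cg88 - wor88)
gap_cg_nasp = abs(cg88 - nasp88)
gap_wor_nasp = abs(wor88 - nasp88)

print(f"1928 WOR below NASP in {len(below_nasp)} of {len(rates)} STs")
print(f"1928 WOR lowest of the three in {len(lowest_of_three)} of {len(rates)} STs")
print(f"ST88 gaps: cgMLST-1928 {gap_cg_wor:.1f}, cgMLST-NASP {gap_cg_nasp:.1f}, "
      f"1928-NASP {gap_wor_nasp:.1f}")
```

Several STs are represented by a single patient, so these comparisons are descriptive only.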
Figure 2. Mean genomic variation rate per year for each ST found among the 20 MRSA carriers, shown as a dot plot. The cgMLST analysis (circle) was performed with SeqSphere+. Two SNP analyses were performed: the NASP pipeline (square) and the cloud-based software 1928 (triangle). The plot is based on the results obtained with the COL reference. The three analyses are grouped by ST to facilitate visual comparison of their means. The 1928 analysis shows lower results than the cgMLST and NASP analyses, which give similar results. For ST88, the ST with the most samples, the cgMLST and 1928 results are closer to each other than to NASP. A larger collection could give results different from those observed in this study.
Figure 3. Sampling distribution for each MRSA carrier (y-axis) per year (x-axis). Each dot represents a sample (n = 95).
Figure 4. Flowchart of the analysis process, from frozen MRSA samples to data analysis performed with different platforms for core genome multilocus sequence typing (cgMLST) and single nucleotide polymorphism (SNP) analysis.