Pippa Scott, Ji Zhang, Trevor Anderson, Patricia C Priest, Stephen Chambers, Helen Smith, David R Murdoch, Nigel French, Patrick J Biggs.
Abstract
Epidemiological studies of communicable diseases increasingly use large whole-genome sequencing (WGS) datasets to explore the transmission of pathogens. It is important to obtain an initial overview of datasets and identify closely related isolates, but this can be challenging with large numbers of isolates and imperfect sequencing. We used an ad hoc whole-genome multilocus sequence typing (wgMLST) method to summarise data from a longitudinal study of Staphylococcus aureus in a primary school in New Zealand. Each pair of isolates was compared, and the number of genes where alleles differed between isolates was tallied to produce a matrix of "allelic differences". We plotted histograms of the number of allelic differences between isolates for: all isolate pairs; pairs of isolates from different individuals; and pairs of isolates from the same individual. 340 sequenced isolates were included, and the ad hoc shared genome contained 445 genes. There were between 0 and 420 allelic differences between isolate pairs, and the majority of pairs had more than 260 allelic differences. We found many genetically closely related S. aureus isolates from single individuals and a smaller number of closely related isolates from separate individuals. Multiple S. aureus isolates from the same individual were usually very closely related or identical over the ad hoc shared genome. Siblings carried genetically similar but not identical isolates. An ad hoc shared genome approach to WGS analysis can accommodate imperfect sequencing of the included isolates, and can provide insights into relationships between isolates in epidemiological studies with large WGS datasets containing diverse isolates.
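The pairwise tally described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the input format (a dictionary of allele calls per isolate, with `None` for a locus that could not be called), the function names, and the toy isolates and hosts are all hypothetical. The "ad hoc shared genome" is taken here to mean the loci with an allele call in every included isolate.

```python
from itertools import combinations


def shared_loci(profiles):
    """Loci with an allele call in every isolate (assumed meaning of 'ad hoc shared genome')."""
    locus_sets = [
        {locus for locus, allele in profile.items() if allele is not None}
        for profile in profiles.values()
    ]
    return sorted(set.intersection(*locus_sets)) if locus_sets else []


def allelic_differences(profiles):
    """Count, for each isolate pair, the shared loci at which the alleles differ."""
    loci = shared_loci(profiles)
    diffs = {}
    for a, b in combinations(sorted(profiles), 2):
        diffs[(a, b)] = sum(profiles[a][locus] != profiles[b][locus] for locus in loci)
    return diffs


def split_by_host(diffs, host_of):
    """Separate pairwise counts into same-individual and different-individual lists."""
    same, different = [], []
    for (a, b), count in diffs.items():
        (same if host_of[a] == host_of[b] else different).append(count)
    return same, different


# Hypothetical toy data: geneD is uncalled in iso2, so it drops out of the shared genome.
profiles = {
    "iso1": {"geneA": 1, "geneB": 4, "geneC": 2, "geneD": 7},
    "iso2": {"geneA": 1, "geneB": 4, "geneC": 2, "geneD": None},
    "iso3": {"geneA": 3, "geneB": 5, "geneC": 2, "geneD": 7},
}
host_of = {"iso1": "child1", "iso2": "child1", "iso3": "child2"}

diffs = allelic_differences(profiles)
same_host, different_host = split_by_host(diffs, host_of)
```

The two lists returned by `split_by_host` are the inputs one would pass to a histogram routine to reproduce the same-individual and different-individual views; restricting the comparison to loci called in every isolate is what lets imperfectly sequenced isolates stay in the analysis.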
Year: 2021 PMID: 34645857 PMCID: PMC8514452 DOI: 10.1038/s41598-021-99080-8
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Sample collection schedule.
Figure 2. Isolate flow chart.
Figure 3. Histograms of allelic differences between sequence pairs. Red hashed bars indicate comparisons of isolates from siblings. Blue hashed bars indicate comparisons of isolates from a child participant and a study staff member (unlikely to represent direct transmission).
Frequency of conventional 7-locus MLST profiles in the 340 included isolates.
| 7-locus MLST type | Number of isolates |
|---|---|
| 15 | 38 |
| 59 | 32 |
| 45 | 30 |
| 5 | 28 |
| 6 | 20 |
| 1 | 19 |
| 78 | 17 |
| 30 | 13 |
| 25 | 10 |
| 97 | 10 |
| 894 | 9 |
| 2851 | 9 |
| 12 | 8 |
| 39 | 8 |
| 188 | 8 |
| 508 | 8 |
| 22 | 7 |
| 149 | 8 |
| 4826 | 6 |
| 5112* | 6 |
| 779 | 5 |
| 5383* | 5 |
| 5384* | 5 |
| 5386* | 5 |
| 828 | 4 |
| 5111* | 4 |
| 72 | 3 |
| 5385* | 3 |
| 8 | 2 |
| 121 | 2 |
| 672 | 2 |
| 5388* | 1 |
| 630 | 1 |
| 2276 | 1 |
| 5387* | 1 |
| 5390* | 1 |
*MLST type newly registered from this study.
Observed allelic differences between MLSTs in the same or different clonal complexes.
| Clonal complex present in dataset | 7-locus MLST present in dataset | n wgMLST allelic differences within MLST | 7-locus MLST pair with 15–120 allelic differences in wgMLST | (n wgMLST allelic differences, n 7-locus MLST loci differences) | 7-locus MLST pair with 260–420 allelic differences in wgMLST | (n wgMLST allelic differences, n 7-locus MLST loci differences) |
|---|---|---|---|---|---|---|
| CC1 | 1 | (0–42) | 1:2851 | (39–62, 1) | 1:188 | (260–267, 2) |
| | 188 | (0–38) | 1:5388 | (45–65, 1) | 188:2851 | (262–265, 3) |
| | 2851 | (0–4) | 2851:5388 | (82–84, 2) | 188:5388 | (273–275, 2) |
| | 5388 | (n/a) | | | | |
| CC5 | 5 | (0–61) | 5:149 | (28–59, 1) | 5:6 | (322–333, 2) |
| | 6 | (0–29) | | | 6:149 | (323–325, 3) |
| | 149 | (0) | | | | |
| CC8 | 8 | (79) | 8:630 | (112–115, 1) | 8:72 | (323–383, 3) |
| | 72 | (1–48) | 8:828 | (22–89, 1) | 72:630 | (319–323, 3) |
| | 630 | (n/a) | 630:828 | (116–117, 2) | 72:828 | (326–332, 4) |
| | 828 | (0–3) | | | | |
| CC15 | 15 | (0–58) | 15:5386 | (19–54, 1) | – | |
| | 5386 | (0–1) | | | | |
| CC22 | 22 | (0–2) | 22:5387 | (45–46, 1) | – | |
| | 5387 | (n/a) | | | | |
| CC30 | 30 | (0–45) | 30:39 | (85–97, 2) | – | |
| | 39 | (0–48) | 30:4826 | (43–40, 1) | | |
| | 4826 | (0) | 30:5112 | (38–43, 1) | | |
| | 5112 | (0–2) | 30:5383 | (92–94, 3) | | |
| | 5383 | (0) | 30:5384 | (54–60, 2) | | |
| | 5384 | (1–2) | 39:4826 | (83–95, 3) | | |
| | | | 39:5112 | (82–94, 3) | | |
| | | | 39:5383 | (44–53, 1) | | |
| | | | 39:5384 | (95–108, 4) | | |
| | | | 4826:5112 | (42–43, 2) | | |
| | | | 4826:5383 | (90, 4) | | |
| | | | 4826:5384 | (58–59, 2) | | |
| | | | 5112:5383 | (88–89, 4) | | |
| | | | 5112:5384 | (56–58, 3) | | |
| | | | 5383:5384 | (101–102, 4) | | |
| CC45 | 45 | (0–76) | | (56–74, 1) | – | |
| | 508 | (0–53) | | (44–73, 1) | | |
| | 5385 | (0) | | (60–70, 1) | | |
| CC97 | 97 | (0–38) | – | | – | |
| Unclassified | 12 | (0–1) | | (97–98, 2) | All remaining 7-locus MLST pairs (including between clonal complexes) | |
| | 25 | (0–3) | | | | |
| | 59 | (0–41) | | | | |
| | 78 | (0–28) | | | | |
| | 121 | (1) | | | | |
| | 672 | (2) | | | | |
| | 779 | (0–1) | | | | |
| | 894 | (0–3) | | | | |
| | 2276 | (n/a) | | | | |
| | 5111 | (0–3) | | | | |
| | 5390 | (n/a) | | | | |
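Ranges like those in the table above (minimum to maximum wgMLST allelic differences within an MLST, or between a pair of MLSTs) can be derived from a pairwise difference matrix by grouping isolate pairs on their sequence types. The following is a rough sketch only: the function name, the input format, and all values in the toy data are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def range_by_st_pair(diffs, st_of):
    """Return (min, max) allelic differences for each unordered pair of sequence types."""
    groups = defaultdict(list)
    for (a, b), count in diffs.items():
        # Sort the two ST labels so (X, Y) and (Y, X) land in the same group.
        key = tuple(sorted((st_of[a], st_of[b])))
        groups[key].append(count)
    return {key: (min(counts), max(counts)) for key, counts in groups.items()}


# Hypothetical pairwise counts for four isolates from two made-up sequence types.
toy_diffs = {
    ("w", "x"): 0,
    ("w", "y"): 41,
    ("w", "z"): 62,
    ("x", "y"): 39,
    ("x", "z"): 60,
    ("y", "z"): 3,
}
toy_st = {"w": "ST_A", "x": "ST_A", "y": "ST_B", "z": "ST_B"}

ranges = range_by_st_pair(toy_diffs, toy_st)
```

Keys whose two labels are identical give the within-MLST range; a sequence type represented by a single isolate produces no within-type pair and so no entry, which is one plausible reading of the "(n/a)" cells in the table.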
Figure 4. Phylogenetic network plot, full dataset. Isolates from the same ST have the same shape and colour. Symbols with no fill and/or an ST number with an asterisk indicate MLST types newly registered from this study.