Shuyi Wang1,2, Chunjiang Zhao2, Yuyao Yin2, Fengning Chen1,2, Hongbin Chen2, Hui Wang1,2.
Abstract
With falling sequencing costs and faster sequencing turnaround, directly linking the genotype and phenotype of bacteria has become particularly important. Here, we predicted for the first time the minimum inhibitory concentrations (MICs) of ten antimicrobial agents for Staphylococcus aureus from 466 isolates, by extracting k-mers directly from whole genome sequencing data and combining them with three machine learning algorithms: random forest, support vector machine, and XGBoost. Allowing a deviation of one two-fold dilution, the essential agreement and the category agreement reached >85% and >90%, respectively, for most antimicrobial agents. For clindamycin, cefoxitin and trimethoprim-sulfamethoxazole, the essential agreement and the category agreement reached >91% and >93%, respectively, providing important information for clinical treatment. The successful prediction of cefoxitin resistance showed that the model could identify methicillin-resistant S. aureus. The results suggest that small datasets available in large hospitals could be used to predict the bacterial phenotype accurately, without relying on existing basic research or known antimicrobial resistance genes.
Keywords: Staphylococcus aureus; WGS; antimicrobial resistance (AMR); k-mer algorithm; machine learning
Year: 2022 PMID: 35308374 PMCID: PMC8924536 DOI: 10.3389/fmicb.2022.841289
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
FIGURE 1. Sequence types (STs) and collection years of the 466 isolates in this study. A total of 23 STs were identified and the STs of 2 isolates were unknown. Most isolates were collected in 2019 and 2020.
States and collection years of the 466 isolates in this study.
| State | 2005–2011 | 2012–2014 | 2016–2018 | 2019 | 2020 |
| Beijing | 15 | 2 | 10 | 79 | 1 |
| Chongqing | 1 | 1 | 4 | 20 | 1 |
| Guangdong | 4 | 3 | 5 | 28 | 5 |
| Hubei | 2 | - | 4 | 19 | - |
| Hunan | 1 | 1 | 3 | 9 | 7 |
| Jiangsu | 1 | 2 | 3 | 22 | - |
| Liaoning | 1 | 1 | - | - | - |
| Shaanxi | 1 | 1 | 4 | - | - |
| Shandong | 1 | 38 | 1 | 20 | 6 |
| Shanghai | 4 | 1 | 2 | 5 | 6 |
| Shanxi | - | - | 3 | - | 25 |
| Shenyang | - | 2 | 2 | - | - |
| Tianjin | - | - | 2 | 26 | - |
| Zhejiang | 2 | 1 | 3 | 49 | 6 |
The 466 isolates were widely distributed across 14 states in China.
Number of genomes with different minimum inhibitory concentration (MIC) to the 10 antimicrobial agents for the 466 Staphylococcus aureus isolates used in this study.
| Antimicrobial agent (columns: MIC, μg/mL) | 0.032 | 0.064 | 0.125 | 0.25 | 0.5 | 1 | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 | 512 | Total |
| Clindamycin | 46 | 163 | 32 | 3 | 5 | 1 | 3 | 1 | 2 | 2 | 208 | 466 | ||||
| Cefoxitin | 46 | 171 | 31 | 54 | 63 | 7 | 94 | 466 | ||||||||
| Oxacillin | 217 | 40 | 39 | 39 | 15 | 13 | 5 | 5 | 93 | 466 | ||||||
| Levofloxacin | 51 | 207 | 45 | 26 | 5 | 9 | 33 | 90 | 466 | |||||||
| Trimethoprim-Sulfamethoxazole | 102 | 257 | 43 | 21 | 8 | 10 | 9 | 2 | 2 | 12 | 466 | |||||
| Vancomycin | 61 | 396 | 9 | 466 | ||||||||||||
| Linezolid | 22 | 278 | 166 | 466 | ||||||||||||
| Erythromycin | 4 | 33 | 112 | 6 | 4 | 9 | 6 | 9 | 12 | 12 | 10 | 237 | 454 | |||
| Daptomycin | 13 | 175 | 221 | 22 | 431 | |||||||||||
| Gentamicin | 8 | 34 | 3 | 5 | 1 | 1 | 4 | 2 | 1 | 5 | 3 | 2 | 69 |
In this table, light green refers to susceptible isolates defined according to clinical breakpoints, light orange refers to intermediate isolates, and colorless refers to resistant isolates.
FIGURE 2. Schematic workflow of the actual operation in this study. DNA was isolated from the specimen and sequenced on an Illumina platform to obtain the FASTQ file. We used KMC to obtain k-mer count files and converted them into a matrix in which each row corresponds to an isolate and each column holds the counts of one k-mer. Finally, we trained various machine learning algorithms on this matrix and measured their prediction accuracy on the testing set.
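The matrix-building step can be illustrated with a minimal pure-Python sketch. It is not the study's pipeline: the study used KMC on FASTQ reads, whereas this stand-in counts k-mers in short in-memory strings. The function names `count_kmers` and `build_kmer_matrix`, the toy sequences, and the choice k = 3 are all assumptions made for illustration.

```python
from collections import Counter

def count_kmers(sequence, k):
    """Count every overlapping k-mer made only of A/C/G/T in one sequence."""
    counts = Counter()
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):  # skip k-mers containing N or other symbols
            counts[kmer] += 1
    return counts

def build_kmer_matrix(isolates, k):
    """Return (row names, column k-mers, count matrix).

    `isolates` maps a hypothetical isolate name to its sequence.
    Rows are isolates and columns are k-mers, as in the workflow caption.
    """
    per_isolate = {name: count_kmers(seq, k) for name, seq in isolates.items()}
    columns = sorted(set().union(*per_isolate.values()))
    rows = sorted(per_isolate)
    # A Counter returns 0 for k-mers that an isolate does not contain.
    matrix = [[per_isolate[name][kmer] for kmer in columns] for name in rows]
    return rows, columns, matrix

# Toy example with made-up sequences (not data from the study).
rows, columns, matrix = build_kmer_matrix({"iso1": "ACGTAC", "iso2": "ACGAAC"}, k=3)
for name, counts in zip(rows, matrix):
    print(name, dict(zip(columns, counts)))
```

For whole genomes the number of distinct k-mers is very large, so a real pipeline would likely rely on a dedicated counter such as KMC and a sparse matrix rather than nested lists.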
FIGURE 3. The prediction accuracy within one two-fold dilution, essential agreement (EA), and category agreement (CA) of all antimicrobial agents.
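As a rough illustration of the two agreement measures, the sketch below treats EA as the share of predicted MICs within one two-fold dilution of the measured MIC, and CA as the share of predictions falling in the same susceptible/intermediate/resistant category. The function names and the breakpoints `s_max` and `r_min` are placeholders chosen for the example, not the clinical breakpoints used in the study, and the MIC values are invented.

```python
import math

def essential_agreement(true_mics, pred_mics, max_dilutions=1):
    """Percent of predictions within `max_dilutions` two-fold dilutions."""
    hits = 0
    for t, p in zip(true_mics, pred_mics):
        # One two-fold dilution equals a difference of 1 on the log2 scale.
        if abs(math.log2(p) - math.log2(t)) <= max_dilutions + 1e-9:
            hits += 1
    return 100.0 * hits / len(true_mics)

def categorize(mic, s_max, r_min):
    """Map an MIC to S/I/R using placeholder breakpoints (assumed values)."""
    if mic <= s_max:
        return "S"
    if mic >= r_min:
        return "R"
    return "I"

def category_agreement(true_mics, pred_mics, s_max, r_min):
    """Percent of predictions in the same S/I/R category as the measurement."""
    hits = sum(
        categorize(t, s_max, r_min) == categorize(p, s_max, r_min)
        for t, p in zip(true_mics, pred_mics)
    )
    return 100.0 * hits / len(true_mics)

# Invented MICs in ug/mL; breakpoints S <= 2 and R >= 8 are illustrative only.
measured = [0.5, 1, 2, 8]
predicted = [1, 1, 8, 4]
print(essential_agreement(measured, predicted))                  # 75.0
print(category_agreement(measured, predicted, s_max=2, r_min=8))  # 50.0
```

In the toy data the last isolate counts toward EA (8 versus 4 is one dilution apart) but not toward CA (resistant versus intermediate), which shows why the two measures can diverge.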
The area under the curve (AUC), sensitivity, specificity, negative predictive value (NPV), positive predictive value (PPV), very major error (VME), and major error (ME) of the prediction results for all antimicrobial agents evaluated in this study.
| Antimicrobial agents | AUC (%) | Sensitivity (%) | Specificity (%) | NPV (%) | PPV (%) | VME (%) | ME (%) |
| Clindamycin | 94.61 | 91.30 | 97.92 | 92.16 | 97.67 | 8.70 | 2.08 |
| Cefoxitin | 92.65 | 94.00 | 93.18 | 93.18 | 94.00 | 6.00 | 6.82 |
| Oxacillin | 94.47 | 86.96 | 80.28 | 95.00 | 58.82 | 13.04 | 20.59 |
| Levofloxacin | 88.99 | 100.00 | 88.00 | 100.00 | 67.86 | 0.00 | 12.00 |
| Trimethoprim-Sulfamethoxazole | 92.77 | 100.00 | 98.92 | 100.00 | 50.00 | 0.00 | 1.08 |
| Vancomycin | 96.13 | - | 100.00 | 100.00 | - | - | 0.00 |
| Linezolid | 82.02 | - | 100.00 | 100.00 | - | - | 0.00 |
| Erythromycin | 94.83 | 96.49 | 85.29 | 93.55 | 91.67 | 3.51 | 14.71 |
| Daptomycin | 82.24 | - | 100.00 | 100.00 | - | - | 0.00 |
| Gentamicin | 85.46 | 100.00 | 91.67 | 100.00 | 66.67 | 0.00 | 8.33 |
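The columns of this table can be related to one another with a small sketch that treats "resistant" as the positive class. It assumes the usual definitions: VME is the share of resistant isolates predicted susceptible, and ME is the share of susceptible isolates predicted resistant. The source does not report confusion matrices; the counts in the example are hypothetical values back-calculated so that the output matches the clindamycin row. Returning `None` when a denominator is zero mirrors the "-" entries, which presumably arise when no resistant isolates are present.

```python
def pct(numerator, denominator):
    """Percentage rounded to two decimals, or None if undefined."""
    if denominator == 0:
        return None
    return round(100.0 * numerator / denominator, 2)

def resistance_metrics(tp, fn, tn, fp):
    """Metrics with resistant as the positive class.

    tp: resistant predicted resistant    fn: resistant predicted susceptible
    tn: susceptible predicted susceptible fp: susceptible predicted resistant
    """
    return {
        "sensitivity": pct(tp, tp + fn),
        "specificity": pct(tn, tn + fp),
        "NPV": pct(tn, tn + fn),
        "PPV": pct(tp, tp + fp),
        "VME": pct(fn, tp + fn),  # very major error: missed resistance
        "ME": pct(fp, tn + fp),   # major error: false resistance
    }

# Hypothetical counts, chosen only to be consistent with the clindamycin row.
print(resistance_metrics(tp=42, fn=4, tn=47, fp=1))

# With no resistant isolates, sensitivity, PPV and VME are undefined.
print(resistance_metrics(tp=0, fn=0, tn=90, fp=0))
```

Under these definitions VME equals 100 minus sensitivity and ME equals 100 minus specificity, a relationship most rows of the table follow.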
FIGURE 4. Scatter plot of MICs of cefoxitin (FOX) and oxacillin (OXA) and specific composition of OS-MRSA. On the left, the horizontal axis shows the FOX MICs of the Staphylococcus aureus isolates and the vertical axis shows their OXA MICs. Green represents methicillin-susceptible Staphylococcus aureus (MSSA) and blue represents methicillin-resistant Staphylococcus aureus (MRSA) with Staphylococcal cassette chromosome mec (SCCmec) typing. In total, 79 isolates were resistant to FOX and susceptible to OXA (OS-MRSA). The specific composition of these 79 isolates is shown in the table on the right; there were 56 ST59 isolates, accounting for the vast majority of OS-MRSA.
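The OS-MRSA definition in this caption (resistant to FOX, susceptible to OXA) can be expressed as a short phenotype rule. This is a sketch only: the function name and the default resistance cut-offs `fox_r_min` and `oxa_r_min` are assumptions for illustration and should be replaced with the clinical breakpoints actually in use.

```python
def fox_oxa_phenotype(fox_mic, oxa_mic, fox_r_min=8.0, oxa_r_min=4.0):
    """Label an isolate by its FOX and OXA MICs (ug/mL).

    The default cut-offs are assumed values, not taken from the source.
    """
    fox_resistant = fox_mic >= fox_r_min
    oxa_resistant = oxa_mic >= oxa_r_min
    if fox_resistant and not oxa_resistant:
        return "FOX-R/OXA-S (OS-MRSA)"
    if fox_resistant and oxa_resistant:
        return "FOX-R/OXA-R"
    if oxa_resistant:
        return "FOX-S/OXA-R"
    return "FOX-S/OXA-S"

# Invented MIC pairs for illustration.
for fox, oxa in [(16, 1), (16, 8), (4, 0.5)]:
    print(fox, oxa, fox_oxa_phenotype(fox, oxa))
```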
FIGURE 5. Schematic workflow of this study. Traditional microbial identification and antimicrobial susceptibility testing rely on microbial culture, which is time-consuming and requires 4 days to obtain the antimicrobial susceptibility report. This study used whole genome sequencing and was 6 h faster than routine clinical testing.