Ekaterina Avershina1, Priyanka Sharma2, Arne M Taxt1,3, Harpreet Singh4, Stephan A Frye3, Kolin Paul5, Arti Kapil6, Umaer Naseer7, Punit Kaur2, Rafi Ahmad1,8.
Abstract
Antibiotic resistance poses a major threat to public health. More effective approaches to antibiotic prescription are needed to delay the spread of antibiotic resistance. Employing sequencing technologies together with trained neural network algorithms for genotype-to-phenotype prediction will reduce the time needed to identify an antibiotic susceptibility profile from days to hours. In this work, we sequenced and phenotypically characterized 171 clinical isolates of Escherichia coli and Klebsiella pneumoniae from Norway and India. Based on these data, we created neural networks to predict susceptibility to ampicillin, 3rd generation cephalosporins and carbapenems. All networks were trained on unassembled data, enabling prediction within minutes after the sequencing information becomes available. Moreover, they can be used on both Illumina- and MinION-generated data and do not require high genome coverage for phenotype prediction. We cross-checked our networks against previously published algorithms for genotype-to-phenotype prediction and their corresponding datasets. We also created an ensemble of networks trained on different datasets, which improved cross-dataset prediction compared to a single network. Additionally, we used data from direct sequencing of spiked blood cultures and found that AMR-Diag networks, coupled with MinION sequencing, can predict bacterial species, resistome, and phenotype in as little as 1-8 h from the start of sequencing. To our knowledge, this is the first genotype-to-phenotype prediction study (1) employing a neural network method; (2) using data from more than one sequencing platform; and (3) utilizing sequence data from spiked blood cultures.
Keywords: Antibiotic resistance; Antibiotic susceptibility testing; Bacterial infection; Carbapenemases; Colistin; E. coli; Extended spectrum β-lactamases; Genotype to phenotype; K. pneumoniae; Machine learning; Neural networks; Rapid diagnostics
Year: 2021 PMID: 33897984 PMCID: PMC8060595 DOI: 10.1016/j.csbj.2021.03.027
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
MIC distribution of the isolates. The cut-off between WT (light blue) and NWT (no coloring) is set according to EUCAST ECOFF values [20]. N – number of NWT isolates. A. E. coli, B. K. pneumoniae.
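The WT/NWT split described in this caption can be sketched in a few lines, using the standard reading of an epidemiological cut-off: an isolate is wild type when its MIC is at or below the ECOFF. The function name, the isolate names, the MIC values and the cut-off of 8 mg/L below are illustrative assumptions, not data from the study; real cut-offs should be taken from the EUCAST tables.

```python
def classify_by_ecoff(mic_mg_per_l: float, ecoff_mg_per_l: float) -> str:
    """Return 'WT' if the MIC is at or below the ECOFF, otherwise 'NWT'."""
    return "WT" if mic_mg_per_l <= ecoff_mg_per_l else "NWT"


# Hypothetical cut-off and MIC values (mg/L), for illustration only.
example_ecoff = 8.0
example_mics = {"isolate_a": 2.0, "isolate_b": 8.0, "isolate_c": 64.0}

calls = {name: classify_by_ecoff(mic, example_ecoff)
         for name, mic in example_mics.items()}

# N in the caption corresponds to the count of NWT isolates.
n_nwt = sum(1 for call in calls.values() if call == "NWT")
print(calls, n_nwt)
```

An MIC exactly equal to the cut-off is treated as WT here, which matches the "at or below" convention.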
General statistics of the de novo assemblies.
| Species | Number of contigs [median (IQR)] | Largest contig, bp | Total length, bp | GC, % | N50, bp |
|---|---|---|---|---|---|
| E. coli | 199 (208) | 412 016 ± 239 231 | 5 133 633 ± 1 156 150 | 51 ± 1 | 157 841 ± 114 466 |
| K. pneumoniae | 119 (381) | 421 222 ± 312 860 | 4 848 755 ± 2 044 236 | 57 ± 2 | 149 514 ± 114 415 |
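For readers unfamiliar with the N50 column, the statistic can be computed from a list of contig lengths as sketched below. This uses the usual definition (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly). The contig lengths are made up for illustration, and the source does not say which tool produced its statistics.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths in bp."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths (bp), not taken from the study.
toy_contigs = [100, 80, 50, 40, 30]
summary = {
    "contigs": len(toy_contigs),
    "largest_contig": max(toy_contigs),
    "total_length": sum(toy_contigs),
    "n50": n50(toy_contigs),
}
print(summary)
```

For the toy input the two longest contigs (100 + 80 = 180 bp) already cover half of the 300 bp total, so the N50 is 80 bp.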
Fig. 1. Prevalence of featured BLAKs associated with antibiotic resistance in various gene groups in (A) E. coli and (B) K. pneumoniae. The number of featured BLAKs in each species-antibiotic pair is given in brackets.
Accuracy, precision and recall of 12 feed-forward neural networks for WT/NWT prediction of E. coli and K. pneumoniae isolates. *HLN – number of neurons in hidden layers.
| Bacteria | Antibiotic | HLN* | Train [80%] WT, correct/all | Train [80%] NWT, correct/all | Validate [10%] WT, correct/all | Validate [10%] NWT, correct/all | Test [10%] WT, correct/all | Test [10%] NWT, correct/all | Accuracy, % | Precision/Recall for NWT isolates from test subset, % |
|---|---|---|---|---|---|---|---|---|---|---|
| E. coli | AMP | 24; 12 | 24/24 | 40/46 | 5/6 | 3/3 | 3/4 | 4/5 | 91; 89; 78 | 80/80 |
| E. coli | CTX | 128; 64 | 44/44 | 25/26 | 6/7 | 2/2 | 7/7 | 4/4 | 99; 90 | 100/100 |
| E. coli | TAZ | 128; 64 | 42/43 | 25/27 | 6/6 | 3/3 | 8/8 | 3/3 | 96 | 100/100 |
| E. coli | MEM | 192; 96 | 55/55 | 12/15 | 8/8 | 1/1 | 10/10 | 1/1 | 97 | 100/100 |
| E. coli | IMI | 48; 24 | 63/63 | 3/7 | 7/7 | 1/2 | 8/8 | 3/3 | 94; 89 | 100/100 |
| E. coli | ERT | 24; 12 | 49/50 | 19/20 | 4/4 | 5/5 | 8/8 | 3/3 | 97 | 100/100 |
| K. pneumoniae | CTX | 96; 48 | 27/27 | 33/33 | 5/5 | 2/3 | 5/5 | 3/3 | 100; 87 | 100/100 |
| K. pneumoniae | TAZ | 48; 24 | 25/25 | 32/34 | 6/6 | 2/2 | 5/5 | 3/3 | 97 | 100/100 |
| K. pneumoniae | MEM | 24 | 41/41 | 18/19 | 6/6 | 1/2 | 6/6 | 2/2 | 98; 87 | 100/100 |
| K. pneumoniae | IMI | 24; 12 | 42/44 | 15/16 | 6/6 | 2/2 | 5/5 | 3/3 | 100 | 100/100 |
| K. pneumoniae | ERT | 12 | 37/37 | 23/23 | 3/3 | 5/5 | 4/4 | 4/4 | 100 | 100/100 |
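As a worked check of how the reported figures relate to the correct/all counts, the sketch below derives accuracy, precision and recall for the NWT class from per-class counts. It assumes a strictly two-class setting, so every misclassified WT isolate counts as a false NWT call; the function name is hypothetical. The input is the AMP test subset from the table (WT 3/4, NWT 4/5).

```python
def metrics_from_counts(wt_correct, wt_total, nwt_correct, nwt_total):
    """Accuracy plus precision/recall for the NWT class from correct/all counts."""
    tp = nwt_correct               # NWT isolates predicted as NWT
    fn = nwt_total - nwt_correct   # NWT isolates predicted as WT
    fp = wt_total - wt_correct     # WT isolates predicted as NWT (two-class assumption)
    accuracy = (wt_correct + nwt_correct) / (wt_total + nwt_total)
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, precision, recall


# AMP test subset: WT 3/4 correct, NWT 4/5 correct.
acc, prec, rec = metrics_from_counts(3, 4, 4, 5)
print(round(100 * acc), round(100 * prec), round(100 * rec))  # 78 80 80
```

The result matches the 78% accuracy and 80/80 precision/recall listed for that network's test subset.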
Fig. 2. BLAK distribution in E. coli and K. pneumoniae isolates for the different β-lactam antibiotics. Wild type (WT) and non-wild-type (NWT) isolates.
Cross-performance of neural networks for prediction of E. coli AMR phenotype. Precision/recall for WT and NWT classes are given below the count of isolates.
| Models/datasets | TAZ WT | TAZ NWT | MEM WT | MEM NWT | IMI WT | IMI NWT | ERT WT | ERT NWT |
|---|---|---|---|---|---|---|---|---|
| VAMPr | 51/56 | 25/30 | 71/71 | 7/15 | 73/73 | 7/13 | 55/59 | 7/17 |
| | 91%/91% | 83%/83% | 90%/100% | 47%/47% | 92%/100% | 54%/93% | 85%/93% | 64%/41% |
| AMR-Diag models & VAMPr | 24/38 | 56/66 | 90/90 | 3/10 | 13/13 | 0/4 | 79/79 | 0/14 |
| | 71%/63% | 80%/85% | 93%/100% | 100%/30% | 76%/100% | 0%/0% | 85%/100% | 0%/0% |
Cross-performance of ML algorithms and datasets for prediction of K. pneumoniae AMR phenotype. Precision/recall for WT and NWT classes are given below the count of isolates.
| Models/datasets | TAZ WT | TAZ NWT | MEM WT | MEM NWT | IMI WT | IMI NWT |
|---|---|---|---|---|---|---|
| Nguyen et al. | 0/36 | 38/38 | 0/52 | 22/22 | 5/54 | 16/20 |
| | 0%/0% | 51%/100% | 0%/0% | 30%/100% | 55%/9% | 25%/80% |
| AMR-Diag models & Nguyen et al. | 7/10 | 1096/1481 | 1025/1025 | 0/352 | 1000/1088 | 1/482 |
| | 2%/70% | 99%/74% | 74%/100% | 0%/0% | 68%/92% | 1%/0% |
Performance of neural networks ensemble for prediction of K. pneumoniae AMR phenotype. Two networks used in the ensemble are a network trained on data from Nguyen et al. and a network trained on AMR-Diag data. Precision/recall for WT and NWT classes are given below the count of isolates.
| Models/datasets | TAZ WT | TAZ NWT | MEM WT | MEM NWT | IMI WT | IMI NWT |
|---|---|---|---|---|---|---|
| Networks ensemble & Nguyen et al. | 0/10 | 1481/1481 | 974/1025 | 319/352 | 983/1028 | 290/322 |
| | 0%/0% | 100%/99% | 97%/95% | 86%/91% | 97%/96% | 86%/90% |
| Networks ensemble & AMR-Diag data | excluded* | excluded* | 53/53 | 22/22 | 55/55 | 20/20 |
| | | | 100%/100% | 100%/100% | 100%/100% | 100%/100% |
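The record does not state how the two networks' outputs are combined in the ensemble. One common approach, shown here purely as an assumption and not as the authors' method, is to average the per-isolate NWT scores from both networks and apply a threshold. The function name, the scores and the 0.5 threshold are all hypothetical.

```python
def ensemble_predict(scores_a, scores_b, threshold=0.5):
    """Average two models' NWT scores per isolate and call WT/NWT.

    Assumed combination rule (simple mean); the source does not specify one.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both models must score the same isolates")
    calls = []
    for a, b in zip(scores_a, scores_b):
        mean_score = (a + b) / 2
        calls.append("NWT" if mean_score >= threshold else "WT")
    return calls


# Hypothetical NWT scores from two networks trained on different datasets.
scores_net_1 = [0.9, 0.2, 0.6]
scores_net_2 = [0.7, 0.1, 0.3]
ensemble_calls = ensemble_predict(scores_net_1, scores_net_2)
print(ensemble_calls)
```

Averaging lets a confident network outweigh an uncertain one for a given isolate. Alternatives such as majority voting or a weighted mean would only change the line that computes `mean_score`.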
Phenotype prediction from partial ONT MinION data of spiked blood cultures. Sequencing data were taken from the start of sequencing up to the time point at which the first target ARG was detected, as well as from the whole sequencing run. Predictions that correspond to the correct phenotypic data are highlighted in bold.
| Blood culture spiked with | Time when first ARG was detected | CTX (class; score) | TAZ (class; score) | MEM (class; score) | IMI (class; score) | ERT (class; score) |
|---|---|---|---|---|---|---|
| 59 min | WT; 1.0/WT; 1.0 | WT; 1.0/WT; 1.0 | ||||
| 40 min | ||||||
| 10 min | WT; 1.0/WT; 1.0 | |||||