Dionysios A Antonopoulos, Rida Assaf, Ramy Karam Aziz, Thomas Brettin, Christopher Bun, Neal Conrad, James J Davis, Emily M Dietrich, Terry Disz, Svetlana Gerdes, Ronald W Kenyon, Dustin Machi, Chunhong Mao, Daniel E Murphy-Olson, Eric K Nordberg, Gary J Olsen, Robert Olson, Ross Overbeek, Bruce Parrello, Gordon D Pusch, John Santerre, Maulik Shukla, Rick L Stevens, Margo VanOeffelen, Veronika Vonstein, Andrew S Warren, Alice R Wattam, Fangfang Xia, Hyunseung Yoo.
Abstract
The Pathosystems Resource Integration Center (PATRIC, www.patricbrc.org) is designed to provide researchers with the tools and services that they need to perform genomic and other 'omic' data analyses. In response to mounting concern over antimicrobial resistance (AMR), the PATRIC team has been developing new tools that help researchers understand AMR and its genetic determinants. To support comparative analyses, we have added AMR phenotype data to over 15 000 genomes in the PATRIC database, often assembling genomes from reads in public archives and collecting their associated AMR panel data from the literature to augment the collection. We have also been using this collection of AMR metadata to build machine learning-based classifiers that can predict the AMR phenotypes and the genomic regions associated with resistance for genomes being submitted to the annotation service. Likewise, we have undertaken a large AMR protein annotation effort by manually curating data from the literature and public repositories. This collection of 7370 AMR reference proteins, which contains many protein annotations (functional roles) that are unique to PATRIC and RAST, has been manually curated so that it projects stably across genomes. The collection currently projects to 1 610 744 proteins in the PATRIC database. Finally, the PATRIC Web site has been expanded to enable AMR-based custom page views so that researchers can easily explore AMR data and design experiments based on whole genomes or individual genes.
Keywords: RAST; antibiotic; antimicrobial resistance (AMR); genome annotation; minimum inhibitory concentration; the SEED
Year: 2019 PMID: 28968762 PMCID: PMC6781570 DOI: 10.1093/bib/bbx083
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
Figure 1. PATRIC annotation process for integrating AMR data in both genomic regions and genes.
AMR classifiers in the PATRIC annotation system
| Species | Antibiotic | Resistant genomes | Susceptible genomes | F1 score | Initially described in |
|---|---|---|---|---|---|
| | Carbapenem | 122 | 110 | 0.95 | [ |
| | Amikacin | 1190 | 364 | 0.92 | [ |
| | Aztreonam | 1377 | 100 | 0.75 | [ |
| | Cefoxitin | 555 | 976 | 0.80 | [ |
| | Ciprofloxacin | 119 | 1435 | 0.91 | [ |
| | Ertapenem | 265 | 178 | 0.96 | [ |
| | Gentamicin | 786 | 768 | 0.86 | [ |
| | Imipenem | 1100 | 453 | 0.94 | [ |
| | Levofloxacin | 246 | 1307 | 0.93 | [ |
| | Meropenem | 1123 | 430 | 0.92 | [ |
| | Piperacillin–tazobactam | 322 | 1230 | 0.76 | [ |
| | Tetracycline | 658 | 896 | 0.79 | [ |
| | Tobramycin | 501 | 1053 | 0.94 | [ |
| | Co-trimoxazole | 331 | 1223 | 0.87 | [ |
| | Amikacin | 210 | 350 | 0.91 | This study |
| | Capreomycin | 204 | 350 | 0.83 | This study |
| | Isoniazid | 250 | 250 | 0.88 | [ |
| | Kanamycin | 188 | 250 | 0.87 | [ |
| | Ofloxacin | 239 | 250 | 0.79 | [ |
| | Rifampicin | 250 | 250 | 0.86 | [ |
| | Streptomycin | 250 | 250 | 0.71 | [ |
| | Azithromycin | 213 | 246 | 0.97 | This study |
| | Ceftriaxone | 228 | 86 | 0.86 | This study |
| | Clarithromycin | 213 | 246 | 0.99 | This study |
| | Clindamycin | 310 | 89 | 0.74 | This study |
| | Moxifloxacin | 188 | 271 | 0.97 | This study |
| | Levofloxacin | 192 | 290 | 0.85 | This study |
| | Ciprofloxacin | 467 | 762 | 0.98 | This study |
| | Clindamycin | 350 | 274 | 0.97 | This study |
| | Erythromycin | 484 | 821 | 0.96 | This study |
| | Gentamicin | 162 | 1144 | 0.98 | This study |
| | Methicillin | 707 | 886 | 0.99 | [ |
| | Penicillin | 886 | 156 | 0.96 | This study |
| | Tetracycline | 203 | 1029 | 0.97 | This study |
| | Co-trimoxazole | 142 | 178 | 0.96 | This study |
| | Beta-lactam | 2124 | 584 | 0.90 | [ |
| | Chloramphenicol | 165 | 289 | 0.94 | This study |
| | Co-trimoxazole | 2124 | 584 | 0.88 | [ |
| | Erythromycin | 381 | 324 | 0.96 | This study |
| | Tetracycline | 368 | 290 | 0.96 | This study |
(a) AMR data in PATRIC may be described as individual antibiotics or classes of antibiotics.
(b) Used for building the classifiers.
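The F1 scores in the table summarize each classifier as the harmonic mean of precision and recall. The record does not state which phenotype is treated as the positive class or how the scores were averaged, so the following is only a minimal sketch of the standard definition. The function name and the confusion counts are hypothetical and are not taken from the paper.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall.

    tp, fp, fn are true positives, false positives and false negatives
    for whichever phenotype is treated as the positive class
    (an assumption; the source does not say which one that is).
    """
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (not data from the table).
print(f"{f1_score(tp=90, fp=10, fn=10):.2f}")
```

Because F1 ignores true negatives, it can be more informative than plain accuracy for the strongly imbalanced rows of the table, where one phenotype greatly outnumbers the other.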
Figure 2. ROC curves for AdaBoost-based AMR classifiers installed in the annotation service since the publication of the Davis et al. [16] and Long et al. [27] papers. Accuracy and F1 scores are displayed in each inset. ROC curves depict classifiers for (A) P. difficile, (B) S. aureus and (C) K. pneumoniae (Kpn), M. tuberculosis (Mtb), P. aeruginosa (Pae) and S. pneumoniae (Spn). Antibiotic abbreviations are: AZM, azithromycin; CC, clindamycin; CIP, ciprofloxacin; CLR, clarithromycin; CRO, ceftriaxone; E, erythromycin; GM, gentamicin; MFX, moxifloxacin; OX, ofloxacin; P, penicillin; SXT, trimethoprim-sulfamethoxazole; TE, tetracycline.
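An ROC curve like those in Figure 2 is traced by sweeping a decision threshold over a classifier's scores and recording the false-positive and true-positive rates at each step. The sketch below shows that general construction in plain Python. It is not the PATRIC implementation; the function names, labels and scores are hypothetical, with label 1 assumed to mean "resistant".

```python
from typing import List, Tuple


def roc_points(labels: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) points from sweeping a threshold over the scores.

    labels: 1 for the positive class, 0 for the negative class.
    scores: higher means more likely positive.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        # Samples with tied scores cross the threshold together.
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical labels and scores for six genomes (not data from the paper).
curve = roc_points([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.4, 0.2])
print(curve, auc(curve))
```

The area under the curve equals the fraction of resistant/susceptible pairs that the scores rank correctly, which is why a curve hugging the top-left corner indicates a strong classifier.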
Figure 3. Summary information for the antibiotic methicillin at PATRIC. The antibiotic interface provides a summary of the antibiotic, its synonyms and actions, and also provides links via separate tabs for AMR phenotypes, genes and regions across all the data available in PATRIC.
Figure 4. A taxon-level summary on the PATRIC Web site describing AMR phenotype data across all of the genomes that are part of the Staphylococcus genus. (A) A bar graph summarizes the antibiotics, the AMR phenotype (resistant, intermediate or susceptible) and the number of genomes that match that phenotype. (B) The AMR phenotype tabular view, which shows all the genomes that have associated AMR data, includes a dynamic filter for rapid selection of genomes based on the metadata.
Figure 5. AMR predicted regions, located in the genome of S. aureus strain 08S00974, as visualized in the PATRIC JBrowse viewer [57]. These predicted regions, numbered sequentially by their occurrence in the genome as ‘classifier_predicted_regions 12–15’, were predicted by the ML algorithm that is being used to predict AMR phenotypes. The predicted regions are located in and around a gene (fig|1280.11691.peg.56) that is annotated as ‘Tetracycline resistance, MFS efflux pump => Tet(K)’. The annotation for this gene came from the focused manual curation effort at PATRIC to incorporate and propagate information for specific genes that were known to play an important role in AMR.
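The gene identifier in the Figure 5 caption, fig|1280.11691.peg.56, appears to follow the SEED-style layout `fig|<genome id>.<feature type>.<number>`, where the genome ID itself contains a dot and `peg` conventionally denotes a protein-encoding gene. Under that assumption, a small helper can split such an identifier into its parts. This is a sketch only; `FeatureId` and `parse_feature_id` are hypothetical names and not part of any PATRIC API.

```python
from typing import NamedTuple


class FeatureId(NamedTuple):
    genome_id: str     # e.g. "1280.11691"
    feature_type: str  # e.g. "peg"
    index: int         # e.g. 56


def parse_feature_id(fid: str) -> FeatureId:
    """Split an identifier of the assumed form fig|<genome>.<type>.<n>."""
    prefix, sep, rest = fid.partition("|")
    if prefix != "fig" or not sep:
        raise ValueError(f"not a fig| identifier: {fid!r}")
    # The genome ID contains a dot, so split from the right.
    parts = rest.rsplit(".", 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected identifier layout: {fid!r}")
    genome_id, feature_type, index = parts
    return FeatureId(genome_id, feature_type, int(index))


print(parse_feature_id("fig|1280.11691.peg.56"))
```

Splitting out the genome ID this way makes it straightforward to group curated AMR genes by the genome they belong to.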