Yichen He1, Xiujuan Zhou1, Ziyan Chen1, Xiangyu Deng2, Andrew Gehring3, Hongyu Ou1, Lida Zhang1, Xianming Shi4.
Abstract
BACKGROUND: Antibiotic resistance genes (ARGs) can spread among pathogens via horizontal gene transfer, resulting in disparities in their distribution even within the same species. Therefore, a pan-genome approach to analyzing resistomes is necessary for thoroughly characterizing patterns of ARG distribution within particular pathogen populations. Software tools are readily available for either ARG identification or pan-genome analysis, but few combine the two functions.
Keywords: Identification; Machine learning; Pan-resistome; Visualization
Year: 2020 PMID: 31941435 PMCID: PMC6964052 DOI: 10.1186/s12859-019-3335-y
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Diagrammatic sketch of the k-mer algorithm, using two kernels as an example.
Fig. 2 PRAP workflow. The input files and steps are shown in blue, and the output files and steps are shown in red. The cells in gray represent the PRAP modules.
Performance of different methods for ARG identification
| Input Format | Database | SMC | MCC | Runtime (min) |
|---|---|---|---|---|
| Raw Reads | CARD | 0.9638 | 0.9809 | 25 |
| Scaffolds | CARD | 0.8954 | 0.9440 | < 1 |
| CDSs | CARD | 0.9600 | 0.9789 | < 1 |
| Proteins | CARD | 0.9532 | 0.9347 | 30 |
| Raw Reads | ResFinder | 0.9345 | 0.9649 | 25 |
| Scaffolds | ResFinder | 0.9924 | 0.9960 | < 1 |
| CDSs | ResFinder | 0.9899 | 0.9946 | < 1 |
| Proteins | ResFinder | 0.9647 | 0.9812 | 30 |
Parameters for the k-mer method were a k value of 25, two searching kernels, a depth of 20, an area score of at least 100, and at least 90% coverage by length. BLASTn and BLASTp used identity thresholds of 95% and 98%, respectively, with at least 90% coverage by length for both. Runtime is the time taken to analyze 26 genomes.
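The SMC and MCC columns in the table above can be illustrated with a minimal sketch. It assumes that SMC denotes the simple matching coefficient and MCC the Matthews correlation coefficient, both computed from a confusion matrix of predicted versus expected ARG calls. The function names and the counts are hypothetical and are not taken from the article.

```python
from math import sqrt


def smc(tp, tn, fp, fn):
    """Simple matching coefficient: share of calls that agree with the reference."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts (illustration only, not data from the study):
# tp = ARGs correctly reported, tn = ARGs correctly not reported,
# fp = ARGs reported but absent, fn = ARGs present but missed.
tp, tn, fp, fn = 90, 880, 10, 20

smc_value = smc(tp, tn, fp, fn)
mcc_value = mcc(tp, tn, fp, fn)
print(f"SMC = {smc_value:.4f}, MCC = {mcc_value:.4f}")
```

Both scores reach 1 for perfect agreement. MCC is generally the stricter of the two when absent genes far outnumber present ones, because it does not reward a method for the many true negatives alone.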
Fig. 3 Features of the pan-resistome. a ARG distribution based on the CARD. b ARG distribution based on the ResFinder database. c Models of pan and core resistomes based on the CARD. d Models of pan and core resistomes based on the ResFinder database.
Fig. 4 Characteristics of the accessory resistomes based on the ResFinder database. a Total counts of antibiotic resistance genes for individual strains of S. enterica serotypes. The different colors correspond to the different antibiotics shown in the legend. b Clustering results of the accessory resistomes. The darker the color, the greater the number of related genes. c Comparison matrix of accessory ARGs within each genome. Each symbol represents the number of genes related to a specific antibiotic. Blue symbols indicate that the genomes on the x-axis and the y-axis have equal numbers of genes (nx = ny), green symbols indicate nx < ny, and orange symbols indicate nx > ny. If the two genomes carry equal numbers of genes, all symbols lie on the diagonal; otherwise, the symbols shift away from the diagonal.
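The color rule described for panel c can be sketched as a pairwise comparison of per-antibiotic gene counts. This is only an illustration of the rule as stated in the caption, not PRAP's code; the function names, genome dictionaries, and counts are hypothetical.

```python
def compare_counts(nx, ny):
    """Return the symbol color for one antibiotic, following the caption's rule."""
    if nx == ny:
        return "blue"
    return "green" if nx < ny else "orange"


def compare_genomes(counts_x, counts_y):
    """Compare two genomes antibiotic by antibiotic; missing entries count as 0."""
    antibiotics = sorted(set(counts_x) | set(counts_y))
    return {
        ab: compare_counts(counts_x.get(ab, 0), counts_y.get(ab, 0))
        for ab in antibiotics
    }


# Hypothetical accessory ARG counts per antibiotic class (illustration only).
genome_x = {"beta-lactam": 2, "tetracycline": 1, "sulphonamide": 1}
genome_y = {"beta-lactam": 2, "tetracycline": 3}

colors = compare_genomes(genome_x, genome_y)
print(colors)
```

With these example counts, the β-lactam symbol would be blue (equal counts), the tetracycline symbol green (nx < ny), and the sulphonamide symbol orange (nx > ny).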
Fig. 5 Matrix analysis of β-lactam antibiotics based on the ResFinder database. a Clustering results of ARGs associated with β-lactam resistance with the “allele” parameter. b Clustering results of ARGs associated with β-lactam resistance with the “detailed” parameter, together with user-provided phenotypes of β-lactam antibiotic resistance. The deeper the color, the greater the number of antibiotics to which the isolate is resistant.