Xiaomei Zhang1, Michael Payne1, Ruiting Lan1.
Abstract
Salmonella enterica subspecies enterica is a highly diverse subspecies with more than 1500 serovars, and the ability to distinguish serovars within this group is vital for surveillance. With the development of whole-genome sequencing technology, serovar prediction by traditional serotyping is being replaced by molecular serotyping. Existing in silico serovar prediction approaches utilize surface antigen encoding genes, core genome MLST and serovar-specific gene markers or DNA fragments for serotyping. However, these serovar-specific gene markers or DNA fragments distinguish only a small number of serovars. In this study, we compared 2258 Salmonella accessory genomes to identify 414 candidate serovar-specific or lineage-specific gene markers for 106 serovars, including 24 polyphyletic serovars and the paraphyletic serovar Enteritidis. A combination of several lineage-specific gene markers can be used for the clear identification of the polyphyletic serovars and the paraphyletic serovar. We designed and evaluated an in silico serovar prediction approach by screening 1089 genomes representing 106 serovars against a set of 131 serovar-specific gene markers. The presence or absence of one or more serovar-specific gene markers was used to predict the serovar of an isolate from genomic data. We show that serovar-specific gene markers have comparable accuracy to other in silico serotyping methods, with 84.8% of isolates assigned to the correct serovar with no false positives (FP) or false negatives (FN) and 10.5% of isolates assigned to a small subset of serovars containing the correct serovar with varied FP. Combined, 95.3% of genomes were correctly assigned to a serovar. This approach would be useful as diagnosis moves to culture-independent and metagenomic methods, and would provide a third alternative to confirm other genome-based analyses.
The identification of a set of gene markers may also be useful in the development of more cost-effective molecular assays designed to detect specific gene markers of all the major serovars in a region. These assays would be useful in serotyping isolates where cultures are no longer obtained and traditional serotyping is therefore impossible.
Keywords: Salmonella enterica; accessory genomes; lineage-specific gene markers; paraphyletic serovar; polyphyletic serovars; serotyping; serovar prediction; serovar-specific gene markers
Year: 2019 PMID: 31068916 PMCID: PMC6491675 DOI: 10.3389/fmicb.2019.00835
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
FIGURE 1. The distribution of sensitivity and specificity of 354 potential serovar-specific gene markers. TPR, true positive rate; FPR, false positive rate. Values are displayed as a gradient from light blue (low percentage) to dark blue (high percentage).
Lineage-specific candidate gene markers for the polyphyletic serovars and the paraphyletic serovar.
| Serovar | No of genomes | No of lineages | Lineages | No of genes | Sensitivity# | Specificity# |
|---|---|---|---|---|---|---|
| Bareilly | 20 | 2 | Bareilly-I | 2 | 100.00 | 98.76 |
| | | | Bareilly-II | 1 | 100.00 | 99.11 |
| Bovismorbificans | 34 | 2 | Bovismorbificans-I | 1 | 100.00 | 97.25 |
| | | | Bovismorbificans-II | 1 | 100.00 | 99.91 |
| Bredeney | 5 | 2 | Bredeney | 1 | 100.00 | 97.61 |
| Cerro | 40 | 2 | Cerro-I | 4 | 100.00 | 100.00 |
| | | | Cerro-II | 2 | 100.00 | 100.00 |
| Derby | 24 | 3 | Derby-I&II | 1 | 100.00 | 100.00 |
| | | | Derby-III | 4 | 100.00 | 100.00 |
| Enteritidis | 165 | 2 | Enteritidis-clade A/C | 1 | 100.00 | 98.85 |
| | | | Enteritidis-clade B | 5 | 96.43* | 99.65 |
| Give | 26 | 3 | Give-I&II | 4 | 100.00 | 94.60 |
| | | | Give-III | 1 | 100.00 | 99.82 |
| Havana | 20 | 2 | Havana-I | 2 | 100.00 | 97.39 |
| | | | Havana-II | 4 | 100.00 | 100.00 |
| Hvittingfoss | 16 | 3 | Hvittingfoss-I&II | 1 | 100.00 | 100.00 |
| | | | Hvittingfoss-III | 1 | 100.00 | 100.00 |
| Kentucky | 31 | 2 | Kentucky-I | 5 | 100.00 | 100.00 |
| | | | Kentucky-II | 3 | 100.00 | 100.00 |
| Kottbus | 12 | 3 | Kottbus | 1 | 100.00 | 93.98 |
| Livingstone | 17 | 2 | Livingstone | 1 | 88.24* | 99.47 |
| London | 11 | 2 | London-I | 2 | 100.00 | 99.11 |
| | | | London-II | 3 | 100.00 | 99.87 |
| Mississippi | 14 | 2 | Mississippi-I | 5 | 100.00 | 100.00 |
| | | | Mississippi-II | 1 | 100.00 | 100.00 |
| Newport | 85 | 3 | Newport-I&II | 1 | 100.00 | 92.87 |
| | | | Newport-I&III | 1 | 100.00 | 91.67 |
| Oranienburg | 29 | 4 | Oranienburg-I&II&IV | 1 | 100.00 | 98.67 |
| | | | Oranienburg-III | 1 | 100.00 | 98.72 |
| Oslo | 9 | 2 | Oslo-I | 2 | 100.00 | 99.91 |
| | | | Oslo-II | 1 | 100.00 | 100.00 |
| Paratyphi B | 72 | 3 | Paratyphi B-I&II | 11 | 100.00 | 97.83 |
| | | | Paratyphi B-III | 1 | 100.00 | 100.00 |
| | | | Paratyphi B-mono | 1 | 100.00 | 100.00 |
| Reading | 8 | 2 | Reading-I | 1 | 100.00 | 100.00 |
| | | | Reading-II | 2 | 100.00 | 99.96 |
| Saintpaul | 31 | 3 | Saintpaul-I | 11 | 100.00 | 98.14 |
| | | | Saintpaul-II | 5 | 100.00 | 100.00 |
| | | | Saintpaul-III | 1 | 100.00 | 98.27 |
| Senftenberg | 27 | 3 | Senftenberg-I&II | 2 | 100.00 | 99.96 |
| | | | Senftenberg-III | 1 | 100.00 | 100.00 |
| Stanleyville | 6 | 3 | Stanleyville-I&II | 2 | 83.33* | 95.44 |
| Tell El Kebir | 8 | 2 | Tell El Kebir-I | 3 | 100.00 | 100.00 |
| | | | Tell El Kebir-II | 6 | 100.00 | 100.00 |
| Thompson | 32 | 2 | Thompson-I | 2 | 100.00 | 98.49 |
| | | | Thompson-II | 2 | 100.00 | 100.00 |
| Virchow | 39 | 2 | Virchow | 1 | 100.00 | 100.00 |
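
The sensitivity and specificity columns can be illustrated with a short calculation. The sketch below assumes the standard definitions (sensitivity is the percentage of target-lineage genomes carrying the marker, specificity is the percentage of non-target genomes lacking it); the table footnote that defines them is not reproduced here, so the definitions are an assumption. The function name and the toy input are hypothetical and are not the study's data.

```python
def marker_sensitivity_specificity(calls, target):
    """Return (sensitivity, specificity) in percent for one marker.

    calls  : iterable of (lineage, marker_present) pairs, one per genome
             (hypothetical input format)
    target : the lineage the marker is meant to identify
    """
    tp = fn = fp = tn = 0
    for lineage, present in calls:
        if lineage == target:
            if present:
                tp += 1  # target genome carrying the marker
            else:
                fn += 1  # target genome missing the marker
        else:
            if present:
                fp += 1  # non-target genome carrying the marker
            else:
                tn += 1  # non-target genome without the marker
    sensitivity = 100.0 * tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = 100.0 * tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data: 17 target genomes (15 carry the marker) and
# 200 non-target genomes (1 carries the marker).
toy_calls = (
    [("target", True)] * 15
    + [("target", False)] * 2
    + [("other", True)] * 1
    + [("other", False)] * 199
)
sens, spec = marker_sensitivity_specificity(toy_calls, "target")
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Under these definitions, a sensitivity below 100 means that some genomes of the target lineage lack the marker, and a specificity below 100 means that the marker also occurs in genomes outside the lineage.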
Functional categories of serovar-specific genes.
| Category by RAST | No of genes∗ |
|---|---|
| DNA Metabolism | 18 |
| Regulation and cell signaling | 5 |
| Carbohydrates | 2 |
| Membrane Transport | 8 |
| Virulence, Disease and Defence | 1 |
| RNA Metabolism | 4 |
| Stress Response | 2 |
| Cofactors, Vitamins, Prosthetic Groups, Pigments | 1 |
| Cell Wall and Capsule | 1 |
| Phages related | 2 |
| Protein Metabolism | 1 |
| Amino Acids and Derivatives | 1 |
| Uncategorized | 152 |
| Hypothetical proteins with unknown function | 217 |
FIGURE 2. The distribution of a minimal set of 131 serovar-specific genes in 106 serovars. The Y-axis shows serovar-specific or lineage-specific gene markers and the X-axis shows serovars or lineages. The details are listed in Supplementary Table S4. Gray indicates that zero genomes contain a gene (TN). Gene/genome pairs along the diagonal represent genomes containing the serovar-specific gene markers that match their serovar (TP). Red represents genes that are present in 100% of genomes for a given serovar or lineage. Where a gene is present in less than 100% of a serovar, a gradient from light blue (low percentage) to dark blue (high percentage) is displayed. Blue pairs along the diagonal represent the presence of FN. Pairs that are blue or red outside of the diagonal represent pairs containing genes that do not match the predicted serovar of the genome (FP).
A panel of serovar-specific genes for typing the ten most frequent serovars in Australia.
| Serovar | Gene 1 | Gene 2 | Gene 3 | Gene 4 | Gene 5 | Gene 6 | Gene 7 | Gene 8 | Gene 9 | Gene 10 | Gene 11 | Gene 12 | Gene 13 | Gene 14 | Gene 15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Typhimurium | + | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| Enteritidis-B | - | + | - | - | - | - | - | - | - | - | - | - | - | - | - |
| Enteritidis-A/C | - | - | + | - | - | - | - | - | - | - | - | - | - | - | - |
| Virchow | - | - | - | + | - | - | - | - | - | - | - | - | - | - | - |
| Saintpaul-I | - | - | - | - | + | - | - | - | [+] | - | - | - | - | - | - |
| Saintpaul-II | - | - | - | - | - | + | - | - | - | - | - | - | - | - | - |
| Saintpaul-III | [+] | - | - | - | - | - | + | - | - | - | - | - | - | - | - |
| Infantis | - | - | - | - | - | - | - | + | - | - | - | - | - | - | - |
| Paratyphi B-I&II | - | - | - | - | - | - | - | - | + | - | - | - | - | - | - |
| Paratyphi B-III | [+] | - | - | - | - | - | - | - | - | + | - | - | - | - | - |
| Chester | - | - | - | - | - | - | - | - | - | - | + | - | - | - | - |
| Hvittingfoss-I&II | - | - | - | - | - | - | - | - | - | - | - | + | - | - | - |
| Hvittingfoss-III | [+] | - | - | - | - | - | - | - | - | - | - | - | + | - | - |
| Muenchen-I | - | - | - | - | - | - | [+] | - | - | - | - | - | - | + | - |
| Muenchen-II | - | - | - | - | - | - | - | - | - | - | - | - | - | - | + |
| Error rate | 2.4 | 0 | 1.5 | 0 | 2.9 | 0 | 0.2 | 0 | 1 | 0 | 2.2 | 0 | 0 | 0 | 0.9 |
| Specificity | 97.6 | 100 | 98.5 | 100 | 97.1 | 100 | 99.8 | 100 | 99 | 100 | 97.8 | 100 | 100 | 100 | 99.1 |
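
The panel can be read as a lookup from a gene presence/absence profile to a serovar or lineage. The following is a minimal sketch of that reading, not the authors' implementation. It assumes that "+" marks the gene that must be detected for a lineage and that "[+]" marks a gene that may additionally be detected in that lineage (the source does not define "[+]"). `PANEL` and `predict_lineages` are hypothetical names, and genes are referred to by their column number in the table.

```python
# Lineage -> (required gene, genes that may also be present), transcribed
# from the panel table. The meaning given to "[+]" is an assumption.
PANEL = {
    "Typhimurium": (1, set()),
    "Enteritidis-B": (2, set()),
    "Enteritidis-A/C": (3, set()),
    "Virchow": (4, set()),
    "Saintpaul-I": (5, {9}),
    "Saintpaul-II": (6, set()),
    "Saintpaul-III": (7, {1}),
    "Infantis": (8, set()),
    "Paratyphi B-I&II": (9, set()),
    "Paratyphi B-III": (10, {1}),
    "Chester": (11, set()),
    "Hvittingfoss-I&II": (12, set()),
    "Hvittingfoss-III": (13, {1}),
    "Muenchen-I": (14, {7}),
    "Muenchen-II": (15, set()),
}


def predict_lineages(detected):
    """Return the lineages whose panel row is consistent with the detected genes.

    A lineage is consistent when its required gene is detected and every
    detected gene is either that required gene or one of its "[+]" genes.
    """
    detected = set(detected)
    return sorted(
        name
        for name, (required, optional) in PANEL.items()
        if required in detected and detected <= ({required} | optional)
    )


print(predict_lineages({1}))      # only Gene 1 detected
print(predict_lineages({1, 7}))   # Gene 1 and Gene 7 detected
print(predict_lineages({2, 3}))   # a profile that matches no single row
```

With this rule, Gene 1 alone points to Typhimurium, while Gene 1 together with Gene 7 points to Saintpaul-III, because the bracketed entry allows Gene 1 in that lineage. An empty result means the profile matches no single row of the panel.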