Cyril Savin, Alexis Criscuolo, Julien Guglielmini, Anne-Sophie Le Guern, Elisabeth Carniel, Javier Pizarro-Cerdá, Sylvain Brisse.
Abstract
The genus Yersinia comprises species that differ widely in their pathogenic potential and public-health significance. Yersinia pestis is responsible for plague, while Yersinia enterocolitica is a prominent enteropathogen. Strains within some species, including Y. enterocolitica, also vary in their pathogenic properties. Phenotypic identification of Yersinia species is time-consuming, labour-intensive and may lead to incorrect identifications. Here, we developed a method to automatically identify and subtype all Yersinia isolates from their genomic sequence. A phylogenetic analysis of Yersinia isolates based on a core subset of 500 shared genes clearly demarcated all existing Yersinia species and uncovered novel, yet undefined Yersinia taxa. An automated taxonomic assignment procedure was developed using species-specific thresholds based on core-genome multilocus sequence typing (cgMLST). The performance of this method was assessed on 1843 isolates prospectively collected by the French National Surveillance System and analysed in parallel using phenotypic reference methods, leading to nearly complete (1814; 98.4 %) agreement at species and infra-specific (biotype and serotype) levels. For 29 isolates, incorrect phenotypic assignments resulted from atypical biochemical characteristics or lack of phenotypic resolution. To provide an identification tool, a database of cgMLST profiles and reference taxonomic information has been made publicly accessible (https://bigsdb.pasteur.fr/yersinia). Genomic sequencing-based identification and subtyping of any Yersinia is a powerful and reliable novel approach to define the pathogenic potential of isolates of this medically important genus.
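The threshold-based assignment described in the abstract could be sketched roughly as follows. This is a minimal illustration only: the abstract does not give the distance definition, the threshold values, or the reference profiles, so the mismatch-fraction distance, the names `taxon_A` and `taxon_B`, and the 0.5 cut-off below are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# A cgMLST profile maps a locus name to an allele number (None = locus not called).
Profile = Dict[str, Optional[int]]


def allelic_mismatch_fraction(a: Profile, b: Profile) -> Optional[float]:
    """Fraction of loci called in both profiles that carry different alleles."""
    shared = [locus for locus in a
              if locus in b and a[locus] is not None and b[locus] is not None]
    if not shared:
        return None  # no comparable loci
    mismatches = sum(1 for locus in shared if a[locus] != b[locus])
    return mismatches / len(shared)


def assign_taxon(query: Profile,
                 references: List[Tuple[str, Profile]],
                 thresholds: Dict[str, float]) -> Tuple[str, Optional[float]]:
    """Assign the query to the taxon of its closest reference profile,
    provided the distance is within that taxon's own threshold (assumed values)."""
    best: Optional[Tuple[str, float]] = None
    for taxon, profile in references:
        distance = allelic_mismatch_fraction(query, profile)
        if distance is None:
            continue
        if best is None or distance < best[1]:
            best = (taxon, distance)
    if best is None:
        return ("unassigned", None)
    taxon, distance = best
    # Taxa without a configured threshold only accept exact matches.
    if distance <= thresholds.get(taxon, 0.0):
        return (taxon, distance)
    return ("unassigned", distance)


# Hypothetical toy data, not taken from the study.
references = [
    ("taxon_A", {"l1": 1, "l2": 1, "l3": 1, "l4": 1}),
    ("taxon_B", {"l1": 2, "l2": 2, "l3": 2, "l4": 2}),
]
thresholds = {"taxon_A": 0.5, "taxon_B": 0.5}

close_query = {"l1": 1, "l2": 1, "l3": 1, "l4": 5}
distant_query = {"l1": 9, "l2": 9, "l3": 9, "l4": 9}
print(assign_taxon(close_query, references, thresholds))
print(assign_taxon(distant_query, references, thresholds))
```

Keeping one threshold per taxon, rather than a single global cut-off, mirrors the "species-specific thresholds" wording of the abstract; a query that is not close enough to any reference stays unassigned, which is how a novel taxon would surface in such a scheme.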
Keywords: Yersinia; core-genome multilocus sequence typing; genotyping; identification; phylogenetics; species
Year: 2019 PMID: 31580794 PMCID: PMC6861861 DOI: 10.1099/mgen.0.000301
Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858
Species and bioserotype composition of the genomic dataset
| Species | Bioserotype | Origin: public genomes | Origin: YNRL | Total |
|---|---|---|---|---|
|  |  | 6 | 14 | 20 |
|  |  | 2 | 12 | 14 |
|  |  | 4 | 17 | 21 |
|  | 1A | 35 | 42 | 77 |
|  | 1B | 13 | 3 | 16 |
|  | 2/O:9 | 28 | 8 | 36 |
|  | 2–3/O:5,27 | 15 | 7 | 22 |
|  | 3/O:3 | 0 | 19 | 19 |
|  | 4 | 26 | 115 | 141 |
|  | 5 | 7 | 4 | 11 |
|  | Unknown | 8 | 0 | 8 |
|  |  | 0 | 2 | 2 |
|  |  | 22 | 19 | 41 |
|  |  | 16 | 7 | 23 |
|  |  | 13 | 11 | 24 |
|  |  | 2 | 5 | 7 |
|  |  | 10 | 11 | 21 |
|  |  | 1 | 0 | 1 |
|  |  | 2 | 0 | 2 |
|  |  | 270 | 20 | 290 |
|  |  | 43 | 442 | 485 |
|  |  | 6 | 12 | 18 |
|  |  | 8 | 11 | 19 |
|  |  | 5 | 13 | 18 |
|  |  | 2 | 5 | 7 |
| Undefined |  | 0 | 3 | 3 |
| Total |  | 544 | 802 | 1346 |
Fig. 1. Maximum-likelihood phylogenetic tree of the genus Yersinia based on 500 concatenated multiple sequence alignments. Only bootstrap-based branch support values >70 % are shown. Bar, 0.01 amino acid substitutions per character.
Correspondence between genotypic and phenotypic characterization
| Genotypic species | Genotypic lineage | Phenotypic species | Phenotypic infra-specific category |
|---|---|---|---|
|  | 1Aa |  | Biotype 1A |
|  | 1Ab |  |  |
|  | 1B |  | Biotype 1B |
|  | 2/3-9a |  | Biotype 2/O:9 |
|  | 2/3-9b |  |  |
|  | 2/3-5a |  | Bioserotype 2–3/O:5,27 |
|  | 2/3-5b |  |  |
|  | 3-3a |  | Bioserotype 3/O:3 |
|  | 3-3b |  |  |
|  | 3-3c |  |  |
|  | 3-3d |  |  |
|  | 4 |  | Bioserotype 4/O:3 |
|  | 5 |  | Biotype 5 |
|  | 1 to 32 |  | 21 O-serotypes |
|  | 1 (sublineage 1a and 1b) |  |  |
|  | 2 |  |  |
|  |  |  | Diverse serotypes |
|  |  |  | Diverse serotypes |
|  | 1 |  |  |
|  | 2 |  |  |
| NEW 1 |  |  |  |
| NEW 2 |  |  |  |
| NEW 3 |  |  | Biotype 1A |
| NEW 4 |  |  | Biotype 1A |
Fig. 2. Maximum-likelihood phylogenetic tree of species based on 500 concatenated multiple sequence alignments. The tree was rooted with isolates from the groups NEW 3 and NEW 4 (not shown). Only bootstrap-based branch support values >70 % are shown. Bar, 0.001 amino acid substitutions per character.
Fig. 3. Maximum-likelihood phylogenetic tree of , and based on 500 concatenated multiple sequence alignments. The tree was rooted with (not shown). Only bootstrap-based branch support values >70 % are shown. Bar, 0.001 amino acid substitutions per character. AAG, Auto-agglutinable; NAG, non-agglutinable; NT, non-typable as does not harbour the O-antigen; O, O antigen serotype; Unk, unknown serotype.
Validation of the genotypic characterization
Consistent characterization with both methods was found for 1814 strains (out of 1843).
| Phenotypic species | Phenotypic biotype or serotype | No. | Genotypic species | Genotypic lineage | No. |
|---|---|---|---|---|---|
|  |  | 2 |  |  | 2 |
|  |  | 10 |  |  | 10 |
|  |  | 45 |  |  | 3 |
|  |  |  |  |  | 34 |
|  |  |  |  |  | 8 |
|  |  | 18 |  |  | 18 |
|  |  | 7 |  |  | 3 |
|  |  |  |  |  | 4 |
|  |  | 1 |  | 1 | 1 |
|  |  | 4 |  |  | 4 |
|  |  | 9 |  |  | 9 |
|  | 1A | 573 |  | 1Aa | 565 |
|  |  |  |  | 1Ab | 8 |
|  | 2/O:5–27 | 16 |  | 2/3-5a | 16 |
|  | 2/O:9 | 147 |  | 2/3-9a | 11 |
|  |  |  |  | 2/3-9b | 136 |
|  | 3/O:3 | 3 |  | 3-3b | 2 |
|  |  |  |  | 3-3c | 1 |
|  | 4/O:3 | 950 |  | 4 | 950 |
|  | O:3 | 1 |  | 14 | 1 |
|  | O:1 | 26 |  | 2 | 3 |
|  |  |  |  | 4 | 2 |
|  |  |  |  | 7 | 4 |
|  |  |  |  | 10 | 5 |
|  |  |  |  | 12 | 2 |
|  |  |  |  | 15 | 6 |
|  |  |  |  | 16 | 3 |
|  |  |  |  | 17 | 1 |
|  |  | 2 | NEW 2 |  | 2 |
| Total |  | 1814 |  |  | 1814 |
Correspondence between the genotypic and phenotypic characterizations for the 29 non-concordant strains (out of 1843)
| Phenotypic species | Phenotypic biotype or serotype | No. | Genotypic species | Genotypic lineage | No. |
|---|---|---|---|---|---|
|  | 1A | 21 | NEW 4 |  | 21 |
|  |  | 4 |  | 2 | 4 |
|  | 1A/O:5 | 1 |  |  | 1 |
|  | 2/O:27 | 1 |  | 3-3c | 1 |
|  | 2/O:5–27 | 1 |  | 1Ab | 1 |
|  | 3/O:3 | 1 |  | 4 | 1 |
| Total |  | 29 |  |  | 29 |
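As a quick consistency check, the figures reported above (1843 isolates, 1814 concordant, 29 non-concordant, 98.4 % agreement) can be recomputed from the table counts. The variable names are illustrative only; the numbers are those given in the tables and abstract.

```python
# Counts taken from the abstract and the two validation tables.
total_isolates = 1843
concordant = 1814

# Per-row counts of the non-concordant table (21, 4, 1, 1, 1, 1).
discordant_rows = [21, 4, 1, 1, 1, 1]

discordant = total_isolates - concordant
agreement_pct = 100 * concordant / total_isolates

print(discordant, round(agreement_pct, 1))  # prints: 29 98.4
print(sum(discordant_rows) == discordant)   # prints: True
```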
Fig. 4. Time-to-results comparison between cgMLST and phenotypic methods. Calculations are based on 1113 isolates received in 2017 at the YNRL for the phenotypic characterization (mean=7.5 days) and 1440 isolates received in 2018 for the cgMLST (mean=8.8 days). Statistical analysis was performed with Student's t-test. ****, Difference highly significant, with P value <0.0001.
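For readers unfamiliar with the test named in the caption, a two-sample Student's t statistic with pooled variance can be computed as sketched below. The per-isolate turnaround times are not given here, so the two short lists are made-up toy values used only to exercise the function; they are not the study data, and the function name `student_t` is an assumption for this example.

```python
import math
from statistics import mean, variance


def student_t(x, y):
    """Two-sample Student's t statistic (equal-variance form) and its degrees of freedom."""
    n1, n2 = len(x), len(y)
    # Pooled variance from the two unbiased sample variances.
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t_stat = (mean(x) - mean(y)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t_stat, n1 + n2 - 2


# Hypothetical toy samples (days to result), NOT the values behind Fig. 4.
toy_phenotypic = [1.0, 2.0, 3.0]
toy_cgmlst = [2.0, 3.0, 4.0]

t_stat, dof = student_t(toy_phenotypic, toy_cgmlst)
print(round(t_stat, 4), dof)
```

Turning the statistic into a P value requires the cumulative distribution of Student's t with `dof` degrees of freedom; in practice a statistics library such as SciPy (`scipy.stats.ttest_ind`) returns both in one call.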