Margarita Gomila, Antonio Busquets, Magdalena Mulet, Elena García-Valdés, Jorge Lalucat.
Abstract
The Pseudomonas syringae phylogenetic group comprises 15 recognized bacterial species and more than 60 pathovars. The classification and identification of strains is relevant for practical reasons but also for understanding the epidemiology and ecology of this group of plant pathogenic bacteria. Genome-based taxonomic analyses have been introduced recently to clarify the taxonomy of the whole genus. A set of 139 draft and complete genome sequences of strains belonging to all species of the P. syringae group available in public databases was analyzed, together with the genomes of closely related species used as outgroups. Comparative genomics based on the genome sequences of the species type strains in the group allowed the delineation of phylogenomic species and demonstrated that a high proportion of the strains included in the study are misclassified. Furthermore, representatives of at least 7 putative novel species were detected. It was also confirmed that P. ficuserectae, P. meliae, and P. savastanoi are later synonyms of P. amygdali and that "P. coronafaciens" should be revived as a nomenspecies.
Keywords: ANIb; GGDC; MLSA; P. syringae; core genome; pangenome; phylogenetic group; phylogenomic species
Year: 2017 PMID: 29270162 PMCID: PMC5725466 DOI: 10.3389/fmicb.2017.02422
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Figure 1. UPGMA dendrogram based on ANIb values of all pairwise comparisons. Each species name, as submitted in the database, is labeled with a different color. Roman numerals at the corresponding nodes indicate the phylogenomic branches defined. Phylogenomic species inside each phylogenetic branch are highlighted with different colors. Species type strains are labeled in bold. Accession numbers of the corresponding genomes are given in brackets. Proposed phylogenomic species are indicated in the external circle. Putative novel species are marked in quotation marks or by capital letters (A–E).
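As an illustration of how a UPGMA dendrogram can be derived from pairwise ANI values, the following is a minimal pure-Python sketch of average-linkage clustering. It is not the authors' pipeline: the strain labels `s1` to `s4` and their ANI values are hypothetical, each pair is assumed to have been reduced to a single symmetric ANI value, and the distance is assumed to be 100 minus ANI.

```python
from itertools import combinations

def upgma(labels, distance):
    """Average-linkage (UPGMA) clustering.

    labels:   list of leaf names
    distance: dict mapping frozenset({a, b}) -> distance between leaves a and b
    Returns (tree, merges): a nested-tuple tree and a list of
    (left_subtree, right_subtree, mean_distance) in merge order.
    """
    # Each cluster is (subtree, list of member leaves).
    clusters = [(label, [label]) for label in labels]
    merges = []
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            members_i, members_j = clusters[i][1], clusters[j][1]
            # UPGMA distance: mean of all leaf-to-leaf distances between clusters.
            total = sum(distance[frozenset((a, b))]
                        for a in members_i for b in members_j)
            mean = total / (len(members_i) * len(members_j))
            if best is None or mean < best[0]:
                best = (mean, i, j)
        mean, i, j = best
        tree_i, members_i = clusters[i]
        tree_j, members_j = clusters[j]
        merges.append((tree_i, tree_j, mean))
        merged = ((tree_i, tree_j), members_i + members_j)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0], merges

# Hypothetical ANI values (percent) for four made-up strains.
labels = ["s1", "s2", "s3", "s4"]
ani = {
    ("s1", "s2"): 98.0, ("s3", "s4"): 97.0,
    ("s1", "s3"): 88.0, ("s1", "s4"): 89.0,
    ("s2", "s3"): 87.0, ("s2", "s4"): 88.0,
}
# Assumed conversion: distance = 100 - ANI.
distance = {frozenset(pair): 100.0 - value for pair, value in ani.items()}

tree, merges = upgma(labels, distance)
print(tree)
for left, right, height in merges:
    print(left, right, height)
```

In this toy input the two high-ANI pairs join first and the two resulting clusters join last, which is the pattern a species-level dendrogram relies on.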
ANIb, GGDC, 3 genes-MLSA, and core MLSA indices for the delineation of the proposed phylogenomic species in each genomic branch.
| Branch | Species | Strains | ANIb | ANIb | GGDC | GGDC | 3 genes-MLSA | 3 genes-MLSA | Core MLSA | Core MLSA |
| I | | 3 | 98.42 | 94.15 | 87.7 | 58.7 | 99.64 | 98.64 | 99.7 | 98.7 |
| | | 9 | 94.79 | 95.15 | 68.4 | 62.8 | 98.75 | 98.39 | 98.8 | 98.9 |
| | | 1 | – | 95.07 | – | 61.2 | – | 97.47 | – | 99 |
| | Species A (strain B728a) | 2 | 98.27 | 95.21 | 87.6 | 62.8 | 99.25 | 98.96 | 99.6 | 98.9 |
| II | | 13 | 96.66 | 95.41 | 73.2 | 65 | 99.03 | 98.85 | 99.2 | 99.1 |
| | “ | 7 | 98.41 | 95.41 | 89 | 65 | 99.35 | 98.85 | 99.7 | 99.1 |
| III | | 3 | 97.49 | 95.17 | 70.9 | 62.8 | 98.89 | 98.42 | 99.5 | 98.9 |
| | “ | 1 | – | 95.17 | – | 62.8 | – | 98.42 | – | 99 |
| | Species B (strain CC1583) | 2 | 98.91 | 89.53 | 90.9 | 38.8 | 99.71 | 96.81 | 99.7 | 97.5 |
| | “ | 11 | 97.68 | 86.45 | 74 | 32.9 | 99.28 | 96.84 | 99.7 | 95.9 |
| IV | | 57 | 96.79 | 89.2 | 76.4 | 39 | 97.45 | 97.54 | 98.8 | 98.1 |
| | | 3 | 97.3 | 89.38 | 78.8 | 39 | 98.97 | 97.49 | 99.5 | 98.1 |
| V | | 1 | 100 | 94.63 | – | 59 | – | 98.57 | – | 98.8 |
| | Species C (strain CC1417) | 2 | 99.05 | 94.63 | 92.7 | 59 | 99.64 | 98.57 | 99.9 | 98.8 |
| | | 7 | 96.39 | 86.69 | 71.6 | 32.8 | 98.24 | 95.2 | 99.6 | 96.8 |
| VI | | 2 | 99.94 | 81.35 | 100 | 23.6 | 100 | 92.26 | 100 | 91.3 |
| | Species D (strain UB246) | 1 | – | 88.52 | – | 36.8 | – | 97.31 | – | 97.6 |
| | | 1 | – | 88.52 | – | 32.6 | – | 96.31 | – | 97.5 |
| | Species E (strain S25) | 1 | – | 86.93 | – | 23.5 | – | 97.31 | – | 97.5 |
The representative strain for the unnamed phylogenomic species is indicated in brackets.
Figure 2. Core and pangenome analysis of the 127 strains in the P. syringae phylogenetic group. (A) Venn diagram of core genomes generated by the BDBH, COG, and OMCL strategies. (B) Estimate of core genome size with the Tettelin (blue) and Willenbrock (red) fits. (C) Estimate of pangenome size with the Tettelin fit. (D) Venn analysis of pangenomes generated by COG and OMCL. (E,F) Partition of the OMCL pangenomic matrix into shell, cloud, soft-core, and core compartments. These plots can be created with the GET_HOMOLOGUES auxiliary scripts, as explained in the manual.
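To make the compartment partition in panels (E,F) concrete, here is a minimal sketch of how gene clusters could be binned by occupancy, that is, by the number of genomes in which each cluster occurs. It does not reproduce the GET_HOMOLOGUES implementation. The cutoffs are assumptions chosen for illustration (soft core: at least 95% of genomes; cloud: at most 2 genomes), and the cluster and genome names are hypothetical. Core clusters are counted inside the soft core, so cloud, shell, and soft core together cover the whole pangenome.

```python
def partition_pangenome(presence, n_genomes, soft_core_percent=95, cloud_max=2):
    """Bin gene clusters into pangenome compartments by occupancy.

    presence: dict mapping cluster id -> set of genome ids containing it
    soft_core_percent, cloud_max: assumed cutoffs, not taken from the source
    """
    compartments = {"core": [], "soft_core": [], "shell": [], "cloud": []}
    for cluster, genomes in presence.items():
        occupancy = len(genomes)
        if occupancy == n_genomes:
            # Strict core: present in every genome (also part of the soft core).
            compartments["core"].append(cluster)
        if occupancy * 100 >= soft_core_percent * n_genomes:
            compartments["soft_core"].append(cluster)
        elif occupancy <= cloud_max:
            compartments["cloud"].append(cluster)
        else:
            compartments["shell"].append(cluster)
    return compartments

# Hypothetical presence/absence data for 20 made-up genomes.
genomes = [f"g{i}" for i in range(1, 21)]
presence = {
    "c_all": set(genomes),      # in 20 of 20 genomes
    "c_19": set(genomes[:19]),  # in 19 of 20 genomes
    "c_10": set(genomes[:10]),  # in 10 of 20 genomes
    "c_2": set(genomes[:2]),    # in 2 of 20 genomes
    "c_1": set(genomes[:1]),    # in 1 of 20 genomes
}
result = partition_pangenome(presence, n_genomes=len(genomes))
print(result)
```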
Figure 3. Phylogenetic tree of the concatenated amino acid sequences of 219 monocopy proteins of the core genome defined in the 127 genomes analyzed. A total of 43,133 amino acid positions were used to construct the tree. Each species name, as submitted in the database, is labeled with a different color. Roman numerals at the corresponding nodes indicate the phylogenomic branches defined. Phylogenomic species inside each phylogenetic branch are highlighted with different colors. Species type strains are labeled in bold. Accession numbers of the corresponding genomes are given in brackets. Proposed phylogenomic species are indicated in the external circle. Putative novel species are marked in quotation marks or by capital letters (A–E). All bootstrap values are indicated at the nodes.
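The concatenation step behind such a tree can be sketched as follows: per-protein alignments are joined, strain by strain, into one long alignment whose length is the sum of the individual alignment lengths. This is only an illustrative sketch with hypothetical strain names and toy sequences. It assumes each protein has already been aligned and that every strain is present in every alignment, as is the case for monocopy core proteins.

```python
def concatenate_alignments(alignments, strains):
    """Join per-protein alignments into one concatenated alignment.

    alignments: list of dicts mapping strain -> aligned sequence
                (all sequences within one dict must have equal length)
    strains:    list of strain names that must appear in every alignment
    """
    parts = {strain: [] for strain in strains}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if set(aln) != set(strains) or len(lengths) != 1:
            raise ValueError(
                "each alignment must cover every strain with equal-length sequences"
            )
        for strain in strains:
            parts[strain].append(aln[strain])
    return {strain: "".join(chunks) for strain, chunks in parts.items()}

# Toy example: two made-up strains, two made-up protein alignments.
strains = ["s1", "s2"]
alignments = [
    {"s1": "MKT-A", "s2": "MKTLA"},
    {"s1": "GG", "s2": "GA"},
]
supermatrix = concatenate_alignments(alignments, strains)
print(supermatrix)
```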
Core and pangenome analyses of the 127 strains included in the P. syringae phylogenetic group, as well as of 5 of the individual groups defined (I–V).
| | All strains | I | II | III | IV | V |
| No. of strains | 127 | 15 | 20 | 17 | 57 | 10 |
| Core genome proteins | 343 | 1,938 | 1,229 | 2,367 | 1,694 | 3,900 |
| Pangenome proteins | 27,904 | 13,150 | 13,493 | 11,873 | 16,810 | 7,704 |
| Cloud | 19,461 | 8,139 | 7,559 | 6,366 | 9,826 | 2,329 |
| Shell | 6,688 | 1,888 | 3,715 | 2,189 | 4,014 | 1,254 |
| Soft core | 1,755 | 3,123 | 2,219 | 3,318 | 2,970 | 4,121 |
| % Conserved genes | 6.29 | 23.75 | 16.45 | 27.95 | 17.67 | 53.49 |
| % Flexible genome | 93.71 | 76.25 | 83.55 | 72.05 | 82.33 | 46.51 |
Numbers of genes in the shell, cloud, soft-core, and core compartments are indicated. Percentages of conserved genes and flexible genome are also given.
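The percentages in this table are numerically consistent with defining the conserved fraction as soft core divided by pangenome and the flexible fraction as cloud plus shell divided by pangenome. That reading is inferred from the figures rather than stated in the text, so the sketch below should be taken as a consistency check and not as the authors' formula. It also assumes that the six columns correspond to the full set of 127 strains followed by groups I to V, in the order given in the caption.

```python
# Values copied from the table; the column labels are an assumption (see above).
table = {
    "all": {"pangenome": 27904, "cloud": 19461, "shell": 6688, "soft_core": 1755},
    "I":   {"pangenome": 13150, "cloud": 8139,  "shell": 1888, "soft_core": 3123},
    "II":  {"pangenome": 13493, "cloud": 7559,  "shell": 3715, "soft_core": 2219},
    "III": {"pangenome": 11873, "cloud": 6366,  "shell": 2189, "soft_core": 3318},
    "IV":  {"pangenome": 16810, "cloud": 9826,  "shell": 4014, "soft_core": 2970},
    "V":   {"pangenome": 7704,  "cloud": 2329,  "shell": 1254, "soft_core": 4121},
}

def genome_fractions(row):
    """Return (% conserved, % flexible) under the inferred definitions."""
    conserved = 100 * row["soft_core"] / row["pangenome"]
    flexible = 100 * (row["cloud"] + row["shell"]) / row["pangenome"]
    return round(conserved, 2), round(flexible, 2)

fractions = {group: genome_fractions(row) for group, row in table.items()}
for group, (conserved, flexible) in fractions.items():
    print(f"{group}: conserved {conserved}%, flexible {flexible}%")
```

For every column of this table the three compartments sum to the pangenome size, so the two percentages add up to 100.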
Core and pangenome analyses of the phylogenomic species proposed with more than 6 strains in the P. syringae phylogenetic group.
| No. of strains | 9 | 7 | 13 | 11 | 57 | 7 |
| Core genome proteins | 2,185 | 4,197 | 1,329 | 2,888 | 1,694 | 4,438 |
| Pangenome proteins | 11,966 | 7,022 | 12,361 | 9,674 | 16,810 | 6,452 |
| Cloud | 7,210 | 1,605 | 7,007 | 4,512 | 9,826 | 1,244 |
| Shell | 1,394 | 875 | 2,952 | 1,145 | 4,014 | 524 |
| Soft core | 3,392 | 4,542 | 2,402 | 4,017 | 2,970 | 4,684 |
| % Conserved genes | 28.35 | 64.68 | 19.43 | 41.52 | 17.67 | 72.60 |
| % Flexible genome | 71.90 | 35.32 | 80.57 | 58.48 | 82.33 | 27.40 |
Numbers of genes in the shell, cloud, soft-core, and core compartments are indicated. Percentages of conserved genes and flexible genome are also given.