Sander Wuyts, Stijn Wittouck, Ilke De Boeck, Camille N Allonsius, Edoardo Pasolli, Nicola Segata, Sarah Lebeer.
Abstract
Although the genotypic and phenotypic properties of the Lactobacillus casei group have been studied extensively, the taxonomic structure has been the subject of debate for a long time. Here, we performed a large-scale comparative analysis by using 183 publicly available genomes supplemented with a Lactobacillus strain isolated from the human upper respiratory tract. On the basis of this analysis, we identified inconsistencies in the taxonomy and reclassified all of the genomes according to their most closely related type strains. This led to the identification of a catalase-encoding gene in all 10 L. casei sensu stricto strains, making it the first described catalase-positive species in the Lactobacillus genus. Moreover, we found that 6 of 10 L. casei genomes contained a SecA2/SecY2 gene cluster with two putative glycosylated surface adhesin proteins. Altogether, our results highlight current inconsistencies in the taxonomy of the L. casei group and reveal new clade-associated functional features.

IMPORTANCE The closely related species of the Lactobacillus casei group are extensively studied because of their applications in food fermentations and as probiotics. Our results show that many strains in this group are incorrectly classified and that reclassifying them to their most closely related species type strain improves the functional predictive power of their taxonomy. In addition, our findings may spark increased interest in the L. casei species. We find that after reclassification, only 10 genomes remain classified as L. casei. These strains show some interesting properties. First, they all appear to be catalase positive. This suggests that they have increased oxidative stress resistance. Second, we isolated an L. casei strain from the human upper respiratory tract and discovered that it and multiple other L. casei strains harbor one or even two large, glycosylated putative surface adhesins. This might inspire further exploration of this species as a potential probiotic organism.
Keywords: Lactobacillus casei group; accessory Sec system; catalase; comparative genomics; phylogenomics
Year: 2017 PMID: 28845461 PMCID: PMC5566788 DOI: 10.1128/mSystems.00061-17
Source DB: PubMed Journal: mSystems ISSN: 2379-5077 Impact factor: 6.496
FIG 1 GC contents of all of the genomes analyzed in this study. Genomes are grouped according to their species annotation in the NCBI database (except for upper respiratory tract isolate L. casei AMBR2) and colored by the phylogenetic clade they belong to, as defined in Fig. 2.
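The GC content plotted in Fig. 1 is the fraction of G and C bases among the unambiguous bases of a genome assembly. A minimal sketch of that calculation follows, assuming plain FASTA input; the function names `read_fasta_sequences` and `gc_content` are hypothetical and this is not the authors' pipeline.

```python
from typing import Iterable, Iterator


def read_fasta_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield one concatenated sequence string per FASTA record."""
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A header line closes the previous record, if any.
            if chunks:
                yield "".join(chunks)
                chunks = []
        elif line:
            chunks.append(line)
    if chunks:
        yield "".join(chunks)


def gc_content(sequences: Iterable[str]) -> float:
    """Return the GC fraction (0 to 1) over all A/C/G/T bases.

    Ambiguous bases such as N are ignored (an assumption of this sketch).
    """
    gc = 0
    total = 0
    for seq in sequences:
        s = seq.upper()
        g_c = s.count("G") + s.count("C")
        a_t = s.count("A") + s.count("T")
        gc += g_c
        total += g_c + a_t
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return gc / total
```

For a multi-contig assembly, all contigs are pooled before dividing, so long contigs weigh more than short ones; averaging per-contig values instead would give a different number.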
FIG 2 Phylogenetic trees of the whole L. casei group and the individual clades. (A) Phylogenetic tree constructed of all 184 genome assemblies of the L. casei group by using 776 single-copy marker genes identified by roary (15) with L. nasuensis JCM 17158 as the outgroup. Colors show NCBI database classifications. URT, upper respiratory tract. (B) Subtree of clade A. (C) Subtree of clade B. (D) Subtree of clade C. Strong clades (bootstrap support of >70) are indicated by black dots.
FIG 3 Pairwise ANIb and TETRA values for all genomes. The phylogenetic tree on the left is the same as that in Fig. 2A with the outgroup removed.
Overview of the gene content distribution in the L. casei group
| Group | No. of genomes | Avg no. of genes/genome ± SD | Avg no. of orthogroups/genome ± SD | No. of core orthogroups | No. of accessory orthogroups |
|---|---|---|---|---|---|
| L. casei group | 184 | 2,827 ± 141 | 2,654 ± 92 | 1,814 | 4,101 |
| Clade A | 70 | 2,897 ± 148 | 2,696 ± 106 | 1,924 | 2,866 |
| Clade B | 10 | 2,847 ± 111 | 2,615 ± 96 | 1,924 | 1,576 |
| Clade C | 104 | 2,780 ± 119 | 2,629 ± 68 | 2,133 | 2,363 |
Gene content metrics were calculated for the L. casei group as a whole, as well as for the three clades defined by the phylogenetic tree. A core orthogroup is defined as an orthogroup present in >95% of the genomes.
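The >95% rule above can be illustrated with a short sketch. It assumes orthogroup presence is available as a mapping from orthogroup ID to the set of genomes containing it, and that every orthogroup that is not core counts as accessory; both the data layout and the name `split_core_accessory` are hypothetical, not taken from the study.

```python
from typing import Dict, Set, Tuple


def split_core_accessory(
    presence: Dict[str, Set[str]],
    n_genomes: int,
    threshold: float = 0.95,
) -> Tuple[Set[str], Set[str]]:
    """Split orthogroups into core and accessory sets.

    presence maps an orthogroup ID to the set of genome IDs that contain it.
    An orthogroup is core if it occurs in strictly more than `threshold`
    of the genomes; everything else is treated as accessory (assumption).
    """
    if n_genomes <= 0:
        raise ValueError("n_genomes must be positive")
    core: Set[str] = set()
    accessory: Set[str] = set()
    for orthogroup, genomes in presence.items():
        if len(genomes) / n_genomes > threshold:
            core.add(orthogroup)
        else:
            accessory.add(orthogroup)
    return core, accessory


# Toy example with 20 genomes (made-up IDs, not data from the study).
genomes = [f"g{i}" for i in range(20)]
toy = {
    "OG_all": set(genomes),       # 20/20 genomes: core
    "OG_19": set(genomes[:19]),   # 19/20 = 95%, not strictly above: accessory
    "OG_rare": {"g0", "g1"},      # 2/20 genomes: accessory
}
core, accessory = split_core_accessory(toy, n_genomes=len(genomes))
```

Because the comparison is strict, an orthogroup found in exactly 95% of the genomes falls on the accessory side in this sketch.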
FIG 4 PCoA of predicted functional capacity per clade based on mapping of all orthogroups to the eggNOG database (v4.5) (19). Each letter represents a different functional category, as defined above each plot. Some orthogroups mapped to multiple functional categories. The majority of the orthogroups (2,662 of them) mapped to category S (function unknown).
FIG 5 Gene content and order of the clade B-specific GT-rich gene cluster. The gene cluster is shown in all 10 clade B strains. Contigs were mapped to AMBR2; contig boundaries are indicated by broken vertical lines. Functional annotation was performed on the orthogroup level; multiple orthogroups can have the same function assigned to them. For example, the SecA2/SecY2 system consists of five orthogroups, SecA2, SecY2, Asp1, Asp2, and Asp3.