Samrat Ghosh1,2, Aditya Narayan Sarangi1, Mayuri Mukherjee1,2, Swati Bhowmick1, Sucheta Tripathy1,2.
Abstract
Lactobacillus paracasei are diverse Gram-positive bacteria that are very closely related to Lactobacillus casei and belong to the Lactobacillus casei group. Because the genomes of L. casei and L. paracasei are extremely similar, many strains have been assigned to the wrong one of the two species. We had earlier sequenced and analyzed the genome of Lactobacillus paracasei Lbs2, but mistakenly identified it as L. casei. We reassembled the Lbs2 reads into a 2.5 Mb genome that is 91.28% complete with 0.8% contamination, and that is now placed under L. paracasei based on Average Nucleotide Identity and Average Amino Acid Identity. For comparison, we took 74 sequenced genomes of L. paracasei from GenBank, with assembly sizes ranging from 2.3 to 3.3 Mb and genome completeness between 88% and 100%. The pan-genome of the 75 L. paracasei strains holds 15,945 gene families (215,232 genes), while the core genome contains about 8.4% of the total genes of the pan-genome (243 gene families with 18,225 genes). Phylogenomic analysis based on core gene families revealed that the Lbs2 strain is most closely related to L. paracasei subsp. tolerans DSM20258. Finally, in-silico analysis of the L. paracasei Lbs2 genome revealed an important pathway that could underpin the production of thiamin, which may contribute to host energy metabolism.
Keywords: COG; Lactobacillus paracasei; carbohydrate active enzyme; horizontal gene transfer; host-specific gene; pan/core-genome; thiamin
Year: 2019 PMID: 31731444 PMCID: PMC6920896 DOI: 10.3390/microorganisms7110487
Source DB: PubMed Journal: Microorganisms ISSN: 2076-2607
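The pan-genome and core-genome figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration, not the authors' pipeline: it takes the pan-genome gene count as 215,232 (an assumption about the intended digit grouping), and the variable and function names are hypothetical.

```python
# Figures as quoted in the abstract (the 215,232 reading is an assumption).
PAN_FAMILIES = 15_945    # gene families in the pan-genome
PAN_GENES = 215_232      # genes in the pan-genome
CORE_FAMILIES = 243      # gene families in the core genome
CORE_GENES = 18_225      # genes in the core genome
N_STRAINS = 75           # 74 GenBank genomes plus Lbs2


def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


core_gene_share = percent(CORE_GENES, PAN_GENES)
core_family_share = percent(CORE_FAMILIES, PAN_FAMILIES)
genes_per_core_family = CORE_GENES / CORE_FAMILIES

print(f"core genes / pan genes:       {core_gene_share:.2f}%")
print(f"core families / pan families: {core_family_share:.2f}%")
print(f"genes per core family:        {genes_per_core_family:.1f}")
```

The gene-level share comes out at roughly 8.5%, in line with the "about 8.4%" stated in the abstract, and the core genes divide evenly into 75 per family, which matches one copy per strain across the 75 genomes.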
Proposed names of L. casei strains based on Average Nucleotide Identity (ANI) and Average Amino Acid Identity (AAI) calculations.
| Existing Name in NCBI Database (GenBank) | Proposed Name |
|---|---|
Figure 1. Circular map of the L. paracasei Lbs2 genome. Labeling from outside to inside of the circle, each ring carries information of the genome: coding sequences (CDSs) on the forward strand (magenta); CDSs on the reverse strand (cyan); tRNA genes (red); rRNA genes (blue); GC content; GC skew.
Figure 2. Cluster of Orthologous Groups (COG) frequency heatmap based on hierarchical clustering. The horizontal axis depicts functional COG categories, and the vertical axis represents 75 L. paracasei strains. Genome of interest ‘Lbs2’ is marked as red.
Figure 3. Pan-genome analysis of 75 L. paracasei strains. (A) Pie-chart representing core and accessory genes distribution. (B) Pie-chart representing the distribution of COG categories of 243 core-gene families.
Figure 4. Histogram illustrating Ka/Ks ratios of each core gene.
Figure 5. Whole genome phylogenetic tree of L. paracasei strains was inferred using the maximum likelihood method. The tree was built on the basis of the core-genome and it is presented as a cladogram. Complete genomes are marked as blue, while the genome of interest and its closest one are marked as red.
Genome assembly statistics of the three Lactobacillus paracasei strains (Lpc-37, Lbs2, DSM20258).
| Features | Lpc-37 | Lbs2 | DSM20258 |
|---|---|---|---|
| Source | Microbial food product | Human Gut | Not available |
| Genome Status | Draft | Draft | Draft |
| Accession Number | NOKL00000000.1 | JPKN00000000.3 | AYYJ00000000.1 |
| N50 (bp) | 3,112,081 | 10,992 | 14,516 |
| L50 | 1 | 68 | 49 |
| Completeness (%) | 100 | 91.28 | 97.19 |
| Contamination (%) | 0 | 0.87 | 0 |
| Size (Mb) | 3.16 | 2.50 | 2.36 |
| GC% | 46.33 | 46.97 | 46.44 |
| Genes | 3125 | 2380 | 2424 |
| Proteins | 3010 | 2308 | 2339 |
| tRNA | 59 | 20 | 37 |
| rRNA | 15 | 3 | 2 |
| Other RNA | 41 | 49 | 46 |
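The N50 and L50 rows in the table above follow the usual assembly-statistics definitions: N50 is the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the assembly size, and L50 is the number of contigs needed to get there. The sketch below shows that calculation under those standard definitions; the function name and the toy contig lengths are hypothetical and are not taken from the assemblies in the table.

```python
from typing import Iterable, Tuple


def n50_l50(contig_lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of contig lengths in bp."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("contig_lengths must not be empty")


# Toy example with made-up contig lengths (not real assembly data).
toy_contigs = [100, 80, 60, 40, 20]
toy_n50, toy_l50 = n50_l50(toy_contigs)
print(f"N50 = {toy_n50} bp, L50 = {toy_l50}")
```

An assembly dominated by a single very long contig returns an L50 of 1, which is the pattern reported for Lpc-37, whereas the more fragmented Lbs2 and DSM20258 drafts need 68 and 49 contigs respectively to cover half of the assembly.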
Figure 6. The schematic illustration of pathways. (A) The role of thiamin (marked as red) as a co-factor in major metabolic pathways (marked as blue). (B) Genes involved in thiamin biosynthesis.