Burhan Lehri, Alan M Seddon, Andrey V Karlyshev.
Abstract
This article provides an overview of the genomic features of Lactobacillus fermentum strain 3872. The genome sequence reported here is one of three L. fermentum genome sequences completed to date. Comparative genomic analysis identified genes that may contribute to the enhanced probiotic properties of this strain. In particular, genes encoding putative mucus-binding proteins, collagen-binding proteins and a class III bacteriocin were identified, as well as exopolysaccharide- and prophage-related genes. Genes related to bacterial aggregation and survival under harsh conditions in the gastrointestinal tract, along with genes required for vitamin production, were also found.
Keywords: Bacteriocin; Collagen binding protein; Genome sequencing; Lactobacillus fermentum; Mucus binding protein; Probiotics; Prophage
Year: 2017 PMID: 28163828 PMCID: PMC5286655 DOI: 10.1186/s40793-017-0228-4
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Fig. 1 Photomicrograph of L. fermentum 3872. The bacteria were grown overnight at 37 °C on MRS agar and Gram stained. The image was taken using an optical microscope at 100× magnification
Classification and general features of Lactobacillus fermentum 3872T [38]
| MIGS ID | Property | Term | Evidence code^a |
|---|---|---|---|
| | Classification | Domain Bacteria | TAS [ |
| | | Phylum | TAS [ |
| | | Class | TAS [ |
| | | Order | TAS [ |
| | | Family | TAS [ |
| | | Genus | TAS [ |
| | | Species | TAS [ |
| | | (Type) strain: 3872 | |
| | Gram stain | Positive | IDA |
| | Cell shape | Rod | IDA |
| | Motility | Not known | |
| | Sporulation | Not known | |
| | Temperature range | 30-42 °C | TAS [ |
| | Optimum temperature | 37 ± 2 °C | TAS [ |
| | pH range; Optimum | Not known; 5.5-6.0 | TAS [ |
| | Carbon source | D-Ribose, D-Galactose, D-Glucose, D-Fructose, D-Maltose, D-Lactose, D-Melibiose, D-Sucrose, D-Trehalose, D-Raffinose | TAS [ |
| MIGS-6 | Habitat | | TAS [ |
| MIGS-6.3 | Salinity | Not known | |
| MIGS-22 | Oxygen requirement | Facultative anaerobe | TAS [ |
| MIGS-15 | Biotic relationship | Commensal | TAS [ |
| MIGS-14 | Pathogenicity | None known | NAS |
| MIGS-4 | Geographic location | Russia/Moscow region | TAS [ |
| MIGS-5 | Sample collection | 2011 | TAS [ |
| MIGS-4.1 | Latitude | Not known | |
| MIGS-4.2 | Longitude | Not known | |
| MIGS-4.4 | Altitude | Not known | |
^a Evidence codes: IDA, inferred from direct assay; TAS, traceable author statement (i.e., a direct report exists in the literature); NAS, non-traceable author statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [46]
Fig. 2 Phylogenetic tree based on comparative analysis of 16S rRNA genes. The sequences were aligned using the MUSCLE alignment tool [47]. The numbers above the tree nodes represent Bayesian posterior probabilities, expressed as percentages, computed using MrBayes 3.2.2 [48] with the HKY85 substitution model. The analysis used a Markov chain Monte Carlo chain length of 1,100,000 with a burn-in length of 100,000, four heated chains and a heated chain temperature of 0.2. Lactobacillus_reuteri_DSM_20016_NZ_AZDD00000000.1 was used as an out-group. The tree generated was further modified using Geneious tree builder [15]
Project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | Completed high quality |
| MIGS-28 | Libraries used | Ion Torrent OT2 400 sequencing kit, PacBio P6/C4 |
| MIGS-29 | Sequencing platforms | Ion Torrent Personal Genome Machine, PacBio RSII sequencing machine |
| MIGS-31.2 | Fold coverage | 19.7 (PacBio), 49.6 (Ion Torrent run 1), 60.1 (Ion Torrent run 2), 47.9 (Ion Torrent run 3) |
| MIGS-30 | Assemblers | CELERA, MIRA |
| MIGS-32 | Gene calling method | NCBI PGAP, PROKKA, RAST, BASys |
| | Locus Tag | NZ_CP011536 |
| | GenBank ID | CP011536.1 |
| | GenBank Date of Release | 28/5/2015 |
| | GOLD ID | Ga0099330 |
| | BIOPROJECT | PRJNA224116, PRJNA213970 |
| MIGS-13 | Source Material Identifier | VKM:B-2793D |
| | Project relevance | Biotechnological, antimicrobial, probiotic |
Summary of the genome: one chromosome and one plasmid
| Label | Size (bp) | Topology | INSDC identifier | RefSeq ID |
|---|---|---|---|---|
| Chromosome 1 | 2,297,851 | Circular | CP011536.1 | NZ_CP011536.1 |
| Plasmid 1 | 32,641 | Circular | CP011537.1 | NZ_CP011537.1 |
Genome statistics
| Attribute | Value | Percent |
|---|---|---|
| Genome size (bp) | 2,330,492 | 100.00 |
| DNA coding (bp) | 2,028,095 | 87.02 |
| DNA G + C (bp) | 1,179,376 | 50.56 |
| DNA scaffolds | 2 | 100.00 |
| Total genes | 2,328 | 100.00 |
| Protein coding genes | 2,127 | 91.37 |
| RNA genes | 73 | 3.14 |
| Pseudo genes | 128 | 5.50 |
| Genes in internal clusters | 481 | 20.66 |
| Genes with function prediction | 1,824 | 78.35 |
| Genes assigned to COGs | 1,563 | 67.14 |
| Genes with Pfam domains | 1,898 | 81.53 |
| Genes with signal peptides | 37 | 1.59 |
| Genes with transmembrane helices | 507 | 21.78 |
| CRISPR repeats | 3 | 0.13 |
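The Percent column can be re-derived from the raw counts: sequence-level attributes are expressed relative to the genome size, and gene-level attributes relative to the total gene count. The following is a minimal Python sketch that uses only figures from the tables above; the helper `percent` and the constant names are illustrative assumptions, not part of the original analysis.

```python
# Values copied from the genome summary and genome statistics tables.
CHROMOSOME_BP = 2_297_851
PLASMID_BP = 32_641
CODING_BP = 2_028_095
TOTAL_GENES = 2_328
PROTEIN_CODING_GENES = 2_127
RNA_GENES = 73


def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# The genome size is the sum of the two circular replicons.
genome_bp = CHROMOSOME_BP + PLASMID_BP

# Sequence-level attribute: denominator is the genome size in bp.
coding_pct = percent(CODING_BP, genome_bp)

# Gene-level attributes: denominator is the total gene count.
protein_pct = percent(PROTEIN_CODING_GENES, TOTAL_GENES)
rna_pct = percent(RNA_GENES, TOTAL_GENES)

print(genome_bp)    # 2330492
print(coding_pct)   # 87.02
print(protein_pct)  # 91.37
print(rna_pct)      # 3.14
```

The same helper can be applied to any other row of the table by choosing the matching denominator.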
Number of genes associated with general COG functional categories
| Code | Value | Percent^a | Description |
|---|---|---|---|
| J | 179 | 8.42 | Translation, ribosomal structure and biogenesis |
| A | 0 | 0.00 | RNA processing and modification |
| K | 108 | 5.08 | Transcription |
| L | 97 | 4.56 | Replication, recombination and repair |
| B | 0 | 0.00 | Chromatin structure and dynamics |
| D | 26 | 1.22 | Cell cycle control, Cell division, chromosome partitioning |
| V | 34 | 1.60 | Defense mechanisms |
| T | 58 | 2.72 | Signal transduction mechanisms |
| M | 83 | 3.90 | Cell wall/membrane biogenesis |
| N | 11 | 0.52 | Cell motility |
| U | 13 | 0.61 | Intracellular trafficking and secretion |
| O | 52 | 2.44 | Posttranslational modification, protein turnover, chaperones |
| C | 72 | 3.39 | Energy production and conversion |
| G | 91 | 4.28 | Carbohydrate transport and metabolism |
| E | 148 | 6.96 | Amino acid transport and metabolism |
| F | 96 | 4.51 | Nucleotide transport and metabolism |
| H | 95 | 4.47 | Coenzyme transport and metabolism |
| I | 63 | 2.96 | Lipid transport and metabolism |
| P | 81 | 3.80 | Inorganic ion transport and metabolism |
| Q | 21 | 0.99 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 134 | 6.30 | General function prediction only |
| S | 80 | 3.76 | Function unknown |
| - | 766 | 36.01 | Not in COGs |
^a Based on the total number of protein-encoding genes
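As the footnote states, the COG percentages use the number of protein-coding genes (2,127 in the genome statistics table) as the denominator, not the total gene count (2,328). The short Python sketch below illustrates the difference for a few rows of the table; the function name `cog_percent` and the dictionary layout are hypothetical choices made for this example.

```python
# Denominators taken from the genome statistics table.
PROTEIN_CODING_GENES = 2_127
TOTAL_GENES = 2_328

# A few rows copied from the COG table: category code -> gene count.
cog_counts = {
    "J": 179,  # Translation, ribosomal structure and biogenesis
    "K": 108,  # Transcription
    "R": 134,  # General function prediction only
    "-": 766,  # Not in COGs
}


def cog_percent(count, denominator=PROTEIN_CODING_GENES):
    """Percentage of genes in a category, rounded to two decimals."""
    return round(100 * count / denominator, 2)


for code, count in cog_counts.items():
    print(f"{code}\t{count}\t{cog_percent(count):.2f}")

# Using the total gene count instead gives a different (smaller) value,
# which does not match the table.
j_vs_total = cog_percent(cog_counts["J"], TOTAL_GENES)
```

A gene may be assigned to more than one category, so the counts need not sum to the number of protein-coding genes.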
Fig. 3 L. fermentum 3872 genome representation showing GC skew. Leading and lagging strands are shown in green and purple. BlastN comparisons of the genome of L. fermentum 3872 against those of strains IFO 3956, CECT 5716 and F6 are indicated by the colour-coded key. The intensity of each colour indicates the percentage nucleotide identity. The diagram was generated using BRIG software [16] with an upper identity threshold of 70% and a lower identity threshold of 50%
Fig. 4 Comparison of the genomes of L. fermentum strains 3872, F6, CECT 5716 and IFO 3956 using the LASTZ program with a step length of 20 and a seed pattern of 12 of 19 [26]. Similar direct and inverted regions are shown in blue and red, respectively
Fig. 5 Comparison of the genomes of L. fermentum strains 3872, F6, CECT 5716 and IFO 3956 using the LASTZ program with a step length of 20 and a seed pattern of 12 of 19 [26], with a close-up of the regions containing the bacteriocin gene and prophages
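The "12 of 19" seed pattern and the step length mentioned in the LASTZ captions can be illustrated with a toy spaced-seed search: a 19-base word counts as a seed hit when the 12 positions marked `1` in the pattern are identical, and the step length controls how often a word is sampled along one sequence. The Python sketch below is an illustration of that idea only, not LASTZ's implementation. The pattern string is assumed to correspond to the 12-of-19 seed and should be checked against the LASTZ documentation; the function names are hypothetical.

```python
# Assumed 12-of-19 spaced seed: '1' = position must match, '0' = ignored.
SEED_12_OF_19 = "1110100110010101111"


def spaced_key(seq, pos, pattern=SEED_12_OF_19):
    """Extract the bases at the '1' positions of the pattern."""
    return "".join(seq[pos + i] for i, c in enumerate(pattern) if c == "1")


def seed_hits(target, query, pattern=SEED_12_OF_19, step=20):
    """Return (target_pos, query_pos) pairs whose spaced words agree.

    Every query position is indexed; the target is sampled every
    `step` positions, mirroring the step length in the captions.
    """
    span = len(pattern)
    index = {}
    for j in range(len(query) - span + 1):
        index.setdefault(spaced_key(query, j, pattern), []).append(j)

    hits = []
    for i in range(0, len(target) - span + 1, step):
        for j in index.get(spaced_key(target, i, pattern), []):
            hits.append((i, j))
    return hits


word = "ACGTACGTACGTACGTACG"           # 19 bases
tolerated = word[:3] + "A" + word[4:]  # mismatch at an ignored position
rejected = "T" + word[1:]              # mismatch at a required position

print(seed_hits(word, tolerated))  # [(0, 0)]
print(seed_hits(word, rejected))   # []
```

A larger step samples fewer target words, which trades some sensitivity for speed.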