Ravi Kant, Jochen Blom, Airi Palva, Roland J Siezen, Willem M de Vos.
Abstract
The genus Lactobacillus includes a diverse group of bacteria consisting of many species that are associated with fermentations of plants, meat or milk. In addition, various lactobacilli are natural inhabitants of the intestinal tract of humans and other animals. Finally, several Lactobacillus strains are marketed as probiotics, as their consumption can confer a health benefit to the host. Presently, 154 Lactobacillus species are known and a growing fraction of these are subject to draft genome sequencing. However, complete genome sequences are needed to provide a platform for detailed genomic comparisons. Therefore, we selected a total of 20 genomes of various Lactobacillus strains for which complete genomic sequences have been reported. These genomes varied in size from 1.8 to 3.3 Mb and in other characteristic features, such as G+C content, which ranged from 33% to 51%. The Lactobacillus pan genome was found to consist of approximately 14 000 protein-encoding genes, while all 20 genomes shared a total of 383 sets of orthologous genes that defined the Lactobacillus core genome (LCG). Based on advanced phylogeny of the proteins encoded by this LCG, we grouped the 20 strains into three main groups and defined core group genes present in all genomes of a single group, signature group genes shared by all genomes of one group but absent in all other Lactobacillus genomes, and group-specific ORFans present in core group genes of one group and absent in all other complete genomes. The latter are of specific value in defining the different groups of genomes. The study provides a platform for current individual comparisons as well as future analysis of new Lactobacillus genomes.
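The gene-set definitions in the abstract (pan genome, core genome, core group genes, signature group genes) can be read as plain set operations over orthologous groups. The following is a minimal sketch of that reading, not the authors' pipeline: the genome names, group labels and orthologous-group IDs are hypothetical toy data, and ORFan detection (which requires searching against external genomes) is deliberately left out.

```python
# Hypothetical toy data: genome name -> set of orthologous-group IDs.
# None of these names or IDs come from the study.
genomes = {
    "A1": {"og1", "og2", "og3", "og5"},
    "A2": {"og1", "og2", "og3", "og6"},
    "B1": {"og1", "og2", "og4"},
    "B2": {"og1", "og2", "og4", "og7"},
}
# Hypothetical phylogenetic groups: group label -> member genomes.
groups = {"A": ["A1", "A2"], "B": ["B1", "B2"]}


def pan_genome(genomes):
    """All orthologous groups found in at least one genome."""
    return set().union(*genomes.values())


def core_genome(genomes):
    """Orthologous groups shared by every genome."""
    return set.intersection(*genomes.values())


def core_group_genes(genomes, members):
    """Orthologous groups present in all genomes of one group."""
    return set.intersection(*(genomes[m] for m in members))


def signature_group_genes(genomes, members):
    """Core group genes that are absent from every genome outside the group."""
    outside = set().union(
        *(genes for name, genes in genomes.items() if name not in members)
    )
    return core_group_genes(genomes, members) - outside


print("pan genome size:", len(pan_genome(genomes)))
print("core genome:", sorted(core_genome(genomes)))
for label, members in groups.items():
    print(label, "core group:", sorted(core_group_genes(genomes, members)))
    print(label, "signature:", sorted(signature_group_genes(genomes, members)))
```

In this toy example the core genome is a subset of every core group set, and each signature set is a subset of its core group set, which mirrors the nesting implied by the definitions above.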
Year: 2010 PMID: 21375712 PMCID: PMC3818991 DOI: 10.1111/j.1751-7915.2010.00215.x
Source DB: PubMed Journal: Microb Biotechnol ISSN: 1751-7915 Impact factor: 5.813
A general overview of the origin and genome statistics of the 20 Lactobacillus genomes.
| Genome | Length (bp) | G+C content (%) | Predicted ORFs | Isolated from | Genes assigned to COG | LPXTG genes | Genes encoding signal peptides predicted by SignalP (%) | Genes encoding signal peptides predicted by LocateP (%) | Reference |
|---|---|---|---|---|---|---|---|---|---|
| | 1 993 564 | 34.71 | 1864 | Infant faeces | 1433 | 13 | 21.03 | 9.40 | |
| | 2 080 931 | 37.08 | 1757 | Cheese | 1396 | 2 | 17.07 | 6.96 | |
| | 1 894 360 | 35.26 | 1755 | Human Gut | 1316 | 14 | 17.89 | 6.38 | |
| | 2 043 161 | 36 | 2024 | Chicken faeces | 1499 | 8 | 13.58 | 9.39 | |
| | 1 781 645 | 34.43 | 1733 | Human faeces | 1320 | 12 | 24.58 | 6.49 | |
| | 1 992 676 | 34.61 | 1821 | Human faeces | 1403 | 18 | 19.17 | 7.41 | |
| | 1 856 951 | 49.69 | 1721 | Yoghurt | 1196 | 3 | 17.49 | 8.28 | |
| | 1 864 998 | 49.72 | 2094 | Yoghurt | 1153 | 2 | 16.19 | 8.69 | |
| | 2 924 325 | 46.58 | 2771 | Cheese | 1959 | 18 | 20.39 | 8.55 | |
| | 3 079 196 | 46.34 | 3044 | Cheese | 2152 | 20 | 20.47 | 8.40 | |
| | 3 010 111 | 46.69 | 2944 | Human Gut | 2032 | 18 | 30.16 | 7.83 | |
| | 3 033 106 | 46.68 | 2992 | Cheese | 2099 | 15 | 31.45 | 8.34 | |
| | 1 884 661 | 41.26 | 1879 | Meat | 1462 | 7 | 19.96 | 8.46 | |
| | 2 340 228 | 46.06 | 2218 | Human | 1678 | 12 | 21.06 | 9.38 | |
| | 3 197 759 | 44.66 | 2948 | Human saliva | 2248 | 34 | 28.05 | 7.94 | |
| | 3 348 625 | 44.42 | 3100 | Adult Intestine | 2305 | 35 | 19.87 | 7.88 | |
| | 2 098 684 | 51.47 | 1843 | Adult Intestine | 1519 | 5 | 14.81 | 5.48 | |
| | 1 999 618 | 38.87 | 1935 | Silage | 1529 | 4 | 15.76 | 4.84 | A. Copeland, S. Lucas, A. Lapidus, K. Barry, J.C. Detter, T. Glavina del Rio, N. Hammon |
| | 2 039 414 | 38.88 | 1820 | Fermented plant material | 1495 | 5 | 16.65 | 5.22 | |
| | 2 133 977 | 33.04 | 2073 | Terminal ileum of human | 1476 | 5 | 14.52 | 6.58 | |
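As a quick consistency check, the genome lengths and G+C values listed in the table can be compared with the ranges quoted in the abstract (1.8 to 3.3 Mb and 33% to 51%). The sketch below simply copies the two columns from the table in row order; the variable and function names are illustrative only.

```python
# Genome lengths (bp) and G+C content (%) copied from the table, in row order.
lengths_bp = [
    1993564, 2080931, 1894360, 2043161, 1781645, 1992676, 1856951,
    1864998, 2924325, 3079196, 3010111, 3033106, 1884661, 2340228,
    3197759, 3348625, 2098684, 1999618, 2039414, 2133977,
]
gc_percent = [
    34.71, 37.08, 35.26, 36.0, 34.43, 34.61, 49.69, 49.72, 46.58, 46.34,
    46.69, 46.68, 41.26, 46.06, 44.66, 44.42, 51.47, 38.87, 38.88, 33.04,
]


def value_range(values):
    """Return the (minimum, maximum) of a sequence of numbers."""
    return min(values), max(values)


min_len, max_len = value_range(lengths_bp)
min_gc, max_gc = value_range(gc_percent)

# Convert base pairs to megabases and round to one decimal place.
print("genomes:", len(lengths_bp))
print("size range (Mb):", round(min_len / 1e6, 1), "to", round(max_len / 1e6, 1))
print("G+C range (%):", round(min_gc), "to", round(max_gc))
```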
Figure 1. COG distribution of the predicted function of the LCG genes.
Figure 2. Phylogenetic grouping of the Lactobacillus spp. with known genomes based on the features of their LCG. Three groups are shaded with different colours and termed NCFM, WCFS and GG groups (for further explanation see text).
Proteins found in core group and signature group genes of Lactobacillus genomes.
| | NCFM | WCFS | GG |
|---|---|---|---|
| Core group genes | 771 | 636 | 991 |
| Signature group genes | 119 | 14 | 88 |
General statistics of proteins predicted to be ORFans from the three specific core groups of Lactobacillus genomes.
| Data set | Genes blasted | ORFans found | Hypothetical | Annotated |
|---|---|---|---|---|
| Complete core (LCG) | 383 | 41 | 13 | 28 |
| NCFM | 119 | 56 | 34 | 22 |
| WCFS | 14 | 4 | 3 | 1 |
| GG | 88 | 30 | 15 | 15 |
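The ORFan table can be sanity-checked with simple arithmetic: in each row the hypothetical and annotated counts should sum to the ORFans found, and the share of blasted genes that turned out to be ORFans follows directly. The sketch below copies the rows from the table; the function name and output layout are illustrative, not part of the study.

```python
# Rows copied from the ORFan table:
# (data set, genes blasted, ORFans found, hypothetical, annotated)
rows = [
    ("Complete core (LCG)", 383, 41, 13, 28),
    ("NCFM", 119, 56, 34, 22),
    ("WCFS", 14, 4, 3, 1),
    ("GG", 88, 30, 15, 15),
]


def orfan_summary(rows):
    """Check each row's internal consistency and compute its ORFan share."""
    summary = {}
    for name, blasted, found, hypothetical, annotated in rows:
        summary[name] = {
            # Hypothetical and annotated ORFans should add up to the total.
            "consistent": hypothetical + annotated == found,
            # Percentage of blasted genes that were classified as ORFans.
            "orfan_percent": round(100 * found / blasted, 1),
        }
    return summary


for name, info in orfan_summary(rows).items():
    print(name, info)
```

The "Genes blasted" values for the NCFM, WCFS and GG rows match the signature group gene counts in the preceding table (119, 14 and 88).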