Shinkuro Takenaka, Takeshi Kawashima, Masanori Arita.
Abstract
In prokaryotes, a major contributor to genomic evolution is the exchange of genes via horizontal gene transfer (HGT). Areas with a high density of HGT networks are defined as genetic exchange communities (GECs). Although some phenotypes associated with specific ecological niches are linked to GECs, little is known about the phenotypic influences on HGT in bacterial groups within a taxonomic family. Thanks to the published genome sequences and phenotype data of lactic acid bacteria (LAB), it is now possible to obtain more detailed information about the phenotypes that affect GECs. Here, we have investigated the relationship between HGT and internal and external environmental factors for 178 strains from 24 genera in the Lactobacillaceae family. We found a significant correlation between strains with high utilization of sugars and HGT bias. The result suggests that the ability to utilize a variety of sugars is key to the construction of GECs in this family. This feature is consistent with the fact that the Lactobacillaceae family contributes to the production of a wide variety of fermented foods by sharing niches such as those in vegetables, dairy products and brewing-related environments. This result provides the first evidence that phenotypes associated with ecological niches contribute to the formation of GECs in the LAB family.
Keywords: accessory genes; distribution; ecological niche; genetic exchange community; lactic acid bacteria; ortholog analysis
Year: 2021 PMID: 34468734 PMCID: PMC8440127 DOI: 10.1093/femsle/fnab117
Source DB: PubMed Journal: FEMS Microbiol Lett ISSN: 0378-1097 Impact factor: 2.742
Figure 1. Phylogenetic tree based on the 16S rRNA genes of the LAB strains, annotated with the phenotypic and genomic features identified. The inner band shows species colored by genus. The next five symbols show phenotypic characteristics of each LAB strain: the first (inward-facing triangle) indicates growth at 15°C, the second (outward-facing triangle) growth at 45°C, the third (star) microaerophilic growth, the fourth (red, inward-facing) facultatively anaerobic growth and the fifth (circle) obligately anaerobic growth. A filled symbol means the strain has the phenotype, and an open symbol means that it does not. A blank means that no relevant information is available. The next red band shows the number of sugar types that the strain can utilize. The outer bands show the number of coding sequences (CDS) for each strain: navy blue indicates the estimated number of CDS acquired by horizontal gene transfer (HGT) and light blue indicates the number of native CDS.
Figure 2. Values of the coefficients of the multiple regression analysis for (A) genome size and (B) the number of CDS judged to be HGTs. The genome size or the number of CDS judged to be HGTs was set as the objective variable, and six phenotypic features (sugar utilization value, growth at 15°C, growth at 45°C, microaerobic, facultatively anaerobic and obligate anaerobic) and four genomic features (G/C content, number of rRNAs, number of tRNAs and number of CRISPRs) were used as explanatory variables. * indicates a P-value ≤ 0.05.
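As an illustration of the regression setup described in the Figure 2 caption, the following is a minimal sketch of ordinary least squares with one objective variable and several explanatory variables. The function name `fit_ols` and the toy data are hypothetical and are not taken from the study; the sketch returns coefficients only and does not compute the P-values that the asterisks in the figure refer to, and it is not meant to represent the authors' software or any variable scaling they may have applied.

```python
def fit_ols(X, y):
    """Ordinary least squares with an intercept.

    X is a list of rows of explanatory variables, y the objective variable.
    Returns [intercept, coef_1, ..., coef_k].
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]
    k = len(rows[0])
    # Normal equations (A^T A) beta = A^T y, stored as an augmented matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    # Assumes the explanatory variables are not collinear (non-singular system).
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        piv = aug[col][col]
        aug[col] = [v / piv for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]


# Toy data (hypothetical): y = 1 + 2*x1 - 3*x2 with no noise.
toy_X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
toy_y = [1, 3, -2, 0, 2]
toy_coefficients = fit_ols(toy_X, toy_y)
```

In practice a statistics library would normally be used for this step, since it also reports standard errors and P-values for each coefficient.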
Figure 4. The clusters of orthologous groups (COG) ratios for each group of orthologs. The COG ratios of the core genome, accessory genome, generalist group orthologs and specialist group orthologs are displayed. [J] Translation, ribosomal structure and biogenesis, [A] RNA processing and modification, [K] Transcription, [L] Replication, recombination and repair, [B] Chromatin structure and dynamics, [D] Cell cycle control, cell division and chromosome partitioning, [Y] Nuclear structure, [V] Defense mechanisms, [T] Signal transduction mechanisms, [M] Cell wall/membrane/envelope biogenesis, [N] Cell motility, [Z] Cytoskeleton, [W] Extracellular structures, [U] Intracellular trafficking, secretion and vesicular transport, [O] Post-translational modification, protein turnover and chaperones, [X] Mobilome: prophages and transposons, [C] Energy production and conversion, [G] Carbohydrate transport and metabolism, [E] Amino acid transport and metabolism, [F] Nucleotide transport and metabolism, [H] Coenzyme transport and metabolism, [I] Lipid transport and metabolism, [P] Inorganic ion transport and metabolism, [Q] Secondary metabolites biosynthesis, transport and catabolism, [R] General function prediction only and [S] Function unknown. Orthologs not assigned to a COG are shown in gray. In the accessory genome, the metabolism-related categories ‘carbohydrate transport and metabolism’ (G) and ‘amino acid transport and metabolism’ (E), as well as ‘transcription’ (K) and ‘defense mechanisms’ (V), were enriched relative to the core genome. In contrast, the ratios of ‘translation, ribosomal structure and biogenesis’ (J) and ‘replication, recombination and repair’ (L) were lower than in the core genome.
Figure 3. ASU value and number of strains for each ortholog. The vertical axis indicates the number of strains in each ortholog, and the horizontal axis indicates the ASU value of each ortholog. The ASU (Average of Sugar Utilization for the Ortholog) value is the mean sugar utilization value of the strains whose sequences belong to the ortholog. For example, if two sequences derived from strains A and B were clustered as one ortholog, its ASU value was calculated as the average of the sugar utilization values of A and B. We also calculated the overall mean and standard deviation of the sugar utilization value across the 178 strains, and selected ortholog clusters whose ASU values were more than the mean plus one standard deviation or less than the mean minus one standard deviation. Orthologs with high ASU values are designated generalist group orthologs (red dots) and those with low ASU values are designated specialist group orthologs (blue dots). Core genes from the 178 LAB strains are shown as green dots. The top and side histograms show the number of orthologs along each axis.
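The ASU calculation and the one-standard-deviation cutoff described in the Figure 3 caption can be sketched as follows. The function names, the strain and ortholog identifiers, and the sugar utilization values are all hypothetical, and the use of the sample standard deviation is an assumption, since the caption does not say which form was used.

```python
from statistics import mean, stdev


def asu_values(ortholog_members, sugar_utilization):
    """Map each ortholog to the mean sugar utilization of its member strains."""
    return {
        ortholog: mean([sugar_utilization[s] for s in strains])
        for ortholog, strains in ortholog_members.items()
    }


def classify_orthologs(asu, sugar_utilization):
    """Label orthologs by comparing ASU with the overall mean +/- one SD."""
    values = list(sugar_utilization.values())
    mu = mean(values)
    sd = stdev(values)  # assumption: sample SD; the source does not specify
    labels = {}
    for ortholog, value in asu.items():
        if value > mu + sd:
            labels[ortholog] = "generalist"
        elif value < mu - sd:
            labels[ortholog] = "specialist"
        else:
            labels[ortholog] = "neither"
    return labels


# Toy data (hypothetical): number of sugar types utilized per strain.
toy_sugar = {"A": 1, "B": 2, "C": 7, "D": 8, "E": 13, "F": 14}
toy_members = {
    "og_low": ["A", "B"],
    "og_mid": ["C", "D"],
    "og_high": ["E", "F"],
}
toy_asu = asu_values(toy_members, toy_sugar)
toy_labels = classify_orthologs(toy_asu, toy_sugar)
```

With these toy values the overall mean is 7.5, so an ortholog found only in strains A and B (ASU 1.5) falls below the lower cutoff and one found only in E and F (ASU 13.5) falls above the upper cutoff.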
T-test and Benjamini–Hochberg method used to compare the functional COG ratios between groups. The left three columns of the table give the t-test P-values comparing each COG ratio for every pairwise combination of the three groups (accessory genome, generalist group orthologs and specialist group orthologs). The right three columns give the Boolean outcome of the Benjamini–Hochberg correction at a false discovery rate (FDR) of 0.05; TRUE indicates a significant difference. ND, not determined.
| COG | P-value: all accessory vs generalist | P-value: all accessory vs specialist | P-value: generalist vs specialist | BH significant: all accessory vs generalist | BH significant: all accessory vs specialist | BH significant: generalist vs specialist |
|---|---|---|---|---|---|---|
| J | 0.326101 | 0.114384 | 0.32189 | FALSE | FALSE | FALSE |
| A | 0.770197 | 0.86256 | ND | FALSE | FALSE | FALSE |
| K | 0.660644 | 0.001324 | 0.005024 | FALSE | TRUE | FALSE |
| L | 0.016087 | 0.454098 | 0.458151 | FALSE | FALSE | FALSE |
| B | ND | ND | ND | FALSE | FALSE | FALSE |
| D | 0.233915 | 0.902782 | 0.498252 | FALSE | FALSE | FALSE |
| Y | ND | ND | ND | FALSE | FALSE | FALSE |
| V | 0.253986 | 0.908512 | 0.590247 | FALSE | FALSE | FALSE |
| T | 0.546536 | 0.086224 | 0.073969 | FALSE | FALSE | FALSE |
| M | 0.609181 | 0.285109 | 0.484595 | FALSE | FALSE | FALSE |
| N | 0.330625 | 0.666394 | 0.873454 | FALSE | FALSE | FALSE |
| Z | ND | ND | ND | FALSE | FALSE | FALSE |
| W | 0.795567 | 0.973348 | 0.906121 | FALSE | FALSE | FALSE |
| U | 0.164648 | 0.519524 | 0.133258 | FALSE | FALSE | FALSE |
| O | 0.74121 | 0.073661 | 0.129009 | FALSE | FALSE | FALSE |
| X | 0.003727 | 0.155424 | 0.688248 | FALSE | FALSE | FALSE |
| C | 0.115125 | 0.690668 | 0.197208 | FALSE | FALSE | FALSE |
| G | 0.971753 | 0.014538 | 0.025503 | FALSE | FALSE | FALSE |
| E | 0.000799 | 0.679508 | 0.012048 | TRUE | FALSE | FALSE |
| F | 0.062515 | 0.913128 | 0.279673 | FALSE | FALSE | FALSE |
| H | 0.002552 | 0.136954 | 0.679383 | TRUE | FALSE | FALSE |
| I | 0.018139 | 0.633887 | 0.046447 | FALSE | FALSE | FALSE |
| P | 0.275201 | 0.034176 | 0.140896 | FALSE | FALSE | FALSE |
| Q | 0.159491 | 0.383424 | 0.094268 | FALSE | FALSE | FALSE |
| R | 0.149804 | 0.147752 | 0.581598 | FALSE | FALSE | FALSE |
| S | 0.145587 | 0.624075 | 0.713207 | FALSE | FALSE | FALSE |
| Not_assigned | 0 | 0 | 0.804117 | TRUE | TRUE | FALSE |
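For readers unfamiliar with the Benjamini–Hochberg correction behind the Boolean columns of the table above, the following is a minimal sketch of the standard step-up procedure at a given FDR level. The function name and the toy P-values are hypothetical; the sketch assumes that ND entries have been dropped beforehand, and it is not claimed to reproduce the exact procedure or the set of tests the authors corrected over.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where the null hypothesis is rejected.

    Standard step-up procedure: find the largest rank k such that the
    k-th smallest P-value is <= (k / m) * fdr, then reject the k smallest.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Toy P-values (hypothetical), in arbitrary order.
toy_p = [0.001, 0.045, 0.025, 0.5, 0.012]
toy_significant = benjamini_hochberg(toy_p, fdr=0.05)
```

Because the threshold grows with the rank, a P-value that would fail a Bonferroni cutoff (0.05 / m) can still be declared significant when enough smaller P-values precede it.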
Figure 5. The networks for the generalist and specialist group orthologs. Each of the 178 nodes represents a LAB genome; nodes are colored and numbered by genus. A dotted red edge (generalist) or a solid blue edge (specialist) was drawn between two genomes when they shared more than five generalist or specialist group orthologs, respectively.
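The edge rule in the Figure 5 caption (connect two genomes when they share more than five orthologs of a given group) can be sketched as follows. The function name, genome identifiers and ortholog identifiers are hypothetical; the same function would be run once on the generalist group orthologs and once on the specialist group orthologs.

```python
from itertools import combinations


def shared_ortholog_edges(genome_orthologs, threshold=5):
    """Return (genome_a, genome_b, n_shared) for every pair sharing
    more than `threshold` orthologs."""
    edges = []
    for a, b in combinations(sorted(genome_orthologs), 2):
        n_shared = len(genome_orthologs[a] & genome_orthologs[b])
        if n_shared > threshold:
            edges.append((a, b, n_shared))
    return edges


# Toy data (hypothetical): ortholog IDs present in each genome.
toy_genomes = {
    "g1": set(range(0, 10)),   # orthologs 0..9
    "g2": set(range(4, 12)),   # orthologs 4..11
    "g3": set(range(7, 20)),   # orthologs 7..19
}
toy_edges = shared_ortholog_edges(toy_genomes, threshold=5)
```

Here g1 and g2 share six orthologs and are connected, whereas g2 and g3 share exactly five and are not, because the rule requires strictly more than five.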