Evgeni Bolotin, Ruth Hershberg.
Abstract
Horizontal gene transfer (HGT) serves as an important source of innovation for bacterial species. We used a pangenome-based approach to identify genes that were horizontally acquired by four closely related bacterial species belonging to the Enterobacteriaceae family. This enabled us to examine the extent to which such closely related species tend to share horizontally acquired genes. We find that a high percentage of horizontally acquired genes are shared among these closely related species. Furthermore, we demonstrate that the extent of sharing of horizontally acquired genes among these four closely related species is predictive of the extent to which these genes will be found in additional bacterial species. Finally, we show that acquired genes shared by more species tend to be better optimized for expression within the genomes of their new hosts. Combined, our results demonstrate the existence of a large pool of frequently horizontally acquired genes whose characteristics are distinct from those of horizontally acquired genes that are less frequently shared between species.
Keywords: bacterial evolution; gene content; genome composition; horizontal gene transfer; pangenome
Year: 2017 PMID: 28890711 PMCID: PMC5575156 DOI: 10.3389/fmicb.2017.01536
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Summary of the pangenome data in the analyzed species.
| 16 | 9586 | 5278 | 55.06 |
| 85 | 11741 | 7596 | 64.70 |
| 34 | 8652 | 3692 | 42.67 |
| 72 | 9091 | 4893 | 53.82 |
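The column headers of this table did not survive extraction, but one relationship can be checked from the numbers alone: in every row, the fourth value equals the third value expressed as a percentage of the second. The short Python check below recomputes it. The interpretation of the individual columns is not stated in the surviving text, so the code refers to them by position only.

```python
# Rows copied from the table above. Column meanings are not given in the
# surviving text, so the values are referred to by position only.
rows = [
    (16, 9586, 5278, 55.06),
    (85, 11741, 7596, 64.70),
    (34, 8652, 3692, 42.67),
    (72, 9091, 4893, 53.82),
]


def percent_of(part, whole):
    """Return part as a percentage of whole, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Recompute the fourth column from the second and third columns.
recomputed = [percent_of(third, second) for _, second, third, _ in rows]
reported = [fourth for *_, fourth in rows]

for got, expected in zip(recomputed, reported):
    print(f"recomputed {got:.2f}  reported {expected:.2f}")
```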
Figure 1. The percentage of rare pangenes shared between the studied Enterobacteriaceae species is likely underestimated. Saturation plots are presented for each possible species pair. The average percentage of “rare” pangenes of one species that are found in a second species (Y-axis) is depicted as a function of the number of second-species strains included in the analysis (X-axis). Error bars represent the deviation of the number of shared “rare” pangenes in the individual combinations of strains from the average (see Materials and methods). The resulting data were fitted using Heaps' law. For all species pairs, including more second-species strains in the analysis continues to uncover additional shared “rare” pangenes, with no evident saturation.
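As a rough illustration of what fitting such a saturation curve involves, the sketch below fits a power law of the form y = k * N^gamma, the usual statement of Heaps' law, by ordinary least squares on log-transformed values. The caption does not say how the fit was performed, so the log-log regression, the function name `fit_heaps_law`, and the synthetic data are all assumptions made for illustration.

```python
import math


def fit_heaps_law(n_strains, values):
    """Least-squares fit of values = k * n_strains**gamma on log-log axes.

    This is one simple way to fit a power law; it is an assumption here,
    not necessarily the procedure used for the figure.
    """
    xs = [math.log(n) for n in n_strains]
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx                       # slope on the log-log scale
    k = math.exp(mean_y - gamma * mean_x)   # intercept mapped back
    return k, gamma


# Synthetic, noise-free data (hypothetical, not taken from the paper).
strains = [1, 2, 4, 8, 16, 32]
shared_percent = [5.0 * n ** 0.4 for n in strains]

k, gamma = fit_heaps_law(strains, shared_percent)
print(f"k = {k:.3f}, gamma = {gamma:.3f}")
```

Under this form, a fitted exponent above zero means the curve keeps rising as strains are added, which is the behavior the caption describes as no evident saturation.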
Distribution of the shared “rare” pangenes across the four studied species.
| 5278 | 58.34 | 41.66 | 18.76 | 12.56 | 10.34 |
| 7596 | 62.40 | 37.60 | 19.73 | 10.73 | 7.14 |
| 3692 | 53.17 | 46.83 | 20.88 | 14.82 | 11.13 |
| 4893 | 49.15 | 50.85 | 24.22 | 15.57 | 11.06 |
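The headers of this table are also missing, but the rows are internally consistent in a way that can be checked directly: the second and third values sum to 100, and the last three values sum to the third. One reading consistent with the caption is that the third value is the shared percentage and the last three split it by number of sharing species, but that is an inference from the arithmetic, not something the surviving text states. A minimal Python check:

```python
# Rows copied from the table above; columns are addressed by position
# because the original headers are not available.
rows = [
    (5278, 58.34, 41.66, 18.76, 12.56, 10.34),
    (7596, 62.40, 37.60, 19.73, 10.73, 7.14),
    (3692, 53.17, 46.83, 20.88, 14.82, 11.13),
    (4893, 49.15, 50.85, 24.22, 15.57, 11.06),
]


def row_is_consistent(row, tol=0.02):
    """Check the two sum relationships, allowing for rounding error."""
    _, second, third, fourth, fifth, sixth = row
    sums_to_100 = abs(second + third - 100.0) <= tol
    parts_match = abs(fourth + fifth + sixth - third) <= tol
    return sums_to_100 and parts_match


consistent = [row_is_consistent(r) for r in rows]
print(consistent)
```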
Distribution of the shared “rare” pangenes in each species by a pangene category of their homolog in the second species.
| 81.34 | 5.97 | 12.69 | 67.27 | 9.79 | 22.94 | 81.43 | 3.91 | 14.66 |
| 78.94 | 8.41 | 12.65 | 67.93 | 13.19 | 18.88 | 82.02 | 5.53 | 12.45 |
| 79.78 | 11.26 | 8.96 | 81.33 | 7.33 | 11.34 | 76.64 | 4.29 | 19.07 |
| 83.03 | 7.07 | 9.90 | 85.50 | 5.94 | 8.56 | 71.97 | 13.45 | 14.58 |
Figure 2. Horizontally acquired genes shared by more of the four studied Enterobacteriaceae species tend to be more conserved in additional species as well. (A) Box plot representing the number of species, outside of the four studied Enterobacteriaceae species, in which “rare” pangenes were found. For each species four boxes are given: red, “rare” pangenes that are not shared with any of the three other species; yellow, “rare” pangenes shared by one additional species; green, “rare” pangenes shared by two additional species; blue, “rare” pangenes shared by all species. Whisker length for each boxplot represents 1.5 IQR. Statistical significance of differences between the gene-loss groups according to a non-paired, one-sided Mann-Whitney-Wilcoxon test is denoted by: ***P ≤ 0.001. The Y-axis is presented in logarithmic scale. (B) Bar plot depicting the percentage of “rare” pangenes in each sharing group that are not conserved outside the four analyzed species (orange), or that are shared outside the four analyzed species (blue).
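For readers unfamiliar with the test named in the caption, the sketch below shows a one-sided, unpaired Mann-Whitney comparison of two groups in plain Python. It uses the normal approximation without tie or continuity correction, which is a simplification chosen for brevity and not necessarily what the authors used. The function name and the two sample lists are hypothetical.

```python
import math


def mann_whitney_one_sided(x, y):
    """Test whether values in x tend to be larger than values in y.

    Returns (U, p). The p-value uses the normal approximation without tie
    or continuity correction, so it is only approximate for small samples.
    """
    n1, n2 = len(x), len(y)
    # U counts pairs in which the x value exceeds the y value; ties count half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal probability
    return u, p


# Hypothetical counts of additional species per gene (not from the paper).
shared_by_all = [40, 55, 61, 70, 82, 90, 120, 150]
not_shared = [1, 2, 2, 3, 5, 8, 9, 12]

u_stat, p_value = mann_whitney_one_sided(shared_by_all, not_shared)
print(f"U = {u_stat}, one-sided p = {p_value:.5f}")
```

For real analyses, an established statistics library with exact or tie-corrected p-values would be the safer choice.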
Figure 3. Horizontally acquired genes that are shared by more of the studied Enterobacteriaceae species are better adapted to the codon usage of their host species. Boxplots depicting: (A) levels of codon usage bias (as measured using ENC') and (B) frequency of the optimal codons (FOP) in the “rare” pangenes. For each species four boxes are given: red, “rare” pangenes that are not shared with any of the three other species; yellow, “rare” pangenes shared by one additional species; green, “rare” pangenes shared by two additional species; blue, “rare” pangenes shared by all species. Whisker length for each boxplot represents 1.5 IQR. Statistical significance of differences between the gene-loss groups according to a non-paired, one-sided Mann-Whitney-Wilcoxon test is denoted by: ***P ≤ 0.001, **P ≤ 0.01, *P ≤ 0.05, and (ns) for P > 0.05.
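To make the FOP measure in panel (B) concrete, the following sketch computes the fraction of codons in a coding sequence that belong to a set of optimal codons. The caption does not say how optimal codons were identified for each host, so the optimal-codon set and the toy sequence here are hypothetical, and the convention of skipping Met, Trp, and stop codons is a common one that is assumed, not confirmed by the source.

```python
# Codons without synonymous alternatives (Met, Trp) and stop codons are
# commonly left out of FOP; that convention is assumed here.
EXCLUDED = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def frequency_of_optimal_codons(cds, optimal_codons):
    """Return the fraction of counted codons in cds that are optimal."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counted = [c for c in codons if c not in EXCLUDED]
    if not counted:
        return 0.0
    return sum(c in optimal_codons for c in counted) / len(counted)


# Hypothetical optimal-codon set and toy gene, for illustration only.
toy_optimal = {"CTG", "AAA", "GAA"}
toy_gene = "ATGCTGAAAGAGGAATTATAA"

fop = frequency_of_optimal_codons(toy_gene, toy_optimal)
print(f"FOP = {fop:.2f}")
```

A higher value indicates that more of the gene's codons match those the host favors, which is the sense in which the caption speaks of genes being better adapted to host codon usage.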