Lucía Graña-Miraglia1, Luis F Lozano1, Consuelo Velázquez2, Patricia Volkow-Fernández2, Ángeles Pérez-Oseguera1, Miguel A Cevallos1, Santiago Castillo-Ramírez1.
Abstract
Genome sequencing has been useful to gain an understanding of bacterial evolution. It has been used for studying the phylogeography and/or the impact of mutation and recombination on bacterial populations. However, it has rarely been used to study gene turnover at microevolutionary scales. Here, we sequenced Mexican strains of the human pathogen Acinetobacter baumannii sampled from the same locale over a 3-year period to obtain insights into the microevolutionary dynamics of gene content variability. We found that the Mexican A. baumannii population was recently founded and has been emerging due to a rapid clonal expansion. Furthermore, we noticed that on average the Mexican strains differed from each other by over 300 genes and, notably, this gene content variation has accrued more frequently and faster than the accumulation of mutations. Moreover, due to its rapid pace, gene content variation reflects the phylogeny only at very short periods of time. Additionally, we found that the external branches of the phylogeny had almost 100 more genes than the internal branches. All in all, these results show that rapid gene turnover has been of paramount importance in producing genetic variation within this population and demonstrate the utility of genome sequencing to study alternative forms of genetic variation.
Keywords: A. baumannii; gene content; genetic variation; microevolution; pathogen; phylogeography; population genomics
Year: 2017 PMID: 28979253 PMCID: PMC5611417 DOI: 10.3389/fmicb.2017.01817
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Figure 1. Maximum likelihood phylogeny and population structure analysis. The mid-point rooted phylogeny is based on the concatenated alignment of all the single gene families not affected by recombination and was constructed via PhyML. The color labels represent all the known STs for which there were two or more strains. The green rectangle shows the newly sequenced Mexican strains, which form a single cluster at the deepest level of the hierarchical population structure analysis. The scale bar represents substitutions per site.
Figure 2. Gene content variation among the strains. Heat map of the gene content correlation matrix used to analyse the gene content differences among the strains. The top row on the heat map shows the BAPS groups of the first level of clustering of the population structure analysis, most of which are split according to the clustering of the heat map. The black dotted rectangle shows the Mexican strains sequenced for this study. The dendrograms across the top and side reflect the clustering by gene content.
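As a rough illustration of the kind of matrix behind the Figure 2 heat map, the sketch below builds a strain-by-strain correlation matrix from a gene presence/absence table. The strain names and the 0/1 values are hypothetical, and the use of Pearson correlation via NumPy is an assumption, since the caption does not say how the correlation was computed.

```python
import numpy as np

# Hypothetical presence/absence matrix: rows are strains, columns are gene families.
# 1 = gene family present, 0 = absent. Values are illustrative, not from the study.
strains = ["strain_A", "strain_B", "strain_C", "strain_D"]
presence = np.array([
    [1, 1, 0, 1, 0, 1],
    [1, 1, 0, 1, 1, 1],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 0, 0],
])


def gene_content_correlation(matrix):
    """Return the strain-by-strain Pearson correlation of gene content profiles."""
    # np.corrcoef treats each row as one variable, so rows (strains) are compared.
    return np.corrcoef(matrix)


corr = gene_content_correlation(presence)

# Print the matrix with strain labels; higher values mean more similar gene content.
for name, row in zip(strains, corr):
    print(name, np.round(row, 2))
```

Strains with similar gene repertoires get values close to 1, so blocks of high correlation in such a matrix correspond to the clusters that a heat map with dendrograms would display.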
Topology tests.
| Topology | Δ log-likelihood | KH | SH | RELL |
| --- | --- | --- | --- | --- |
| NJ whole data set | −658053.472 | 0.000 | 0.000 | 0.000 |
| NJ Mexican clade | −0.172 | 0.336 | 0.352 | 0.29 |
Kishino-Hasegawa (KH), Shimodaira-Hasegawa (SH), and RELL bootstrap proportion tests to determine whether the NJ topologies, based on the gene content matrix, differ significantly from the ML phylogenies.
Δ log-likelihood: difference in log-likelihood to the ML phylogeny.
KH, SH, RELL: p-values under the different tests.
Branch models of gene family turnover.
| Model | Log-likelihood | Number of parameters | AIC | ΔAIC |
| --- | --- | --- | --- | --- |
| GD-FR-ML | −7500.2956 | 29 | 15058.59 | 0.00 |
| GD-GR-ML | −8290.2042 | 3 | 16586.41 | 15015.68 |
The GD-GR-ML model implies that all the branches have the same turnover rates, whereas the GD-FR-ML model assumes that each branch has its own turnover rates. In both models, turnover rates were estimated by Maximum Likelihood (ML).
Log-likelihood: log-likelihood scores.
ΔAIC: the difference in the Akaike Information Criterion (AIC) for each model to the best model. We used BadiRate to implement the Gain-and-Death stochastic population model to estimate the gene family turnover rates.
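The model comparison above can be checked with a short calculation. The sketch below assumes the standard definition AIC = 2k − 2 ln L, which the source does not spell out, and takes the log-likelihoods and parameter counts from the table; the function and variable names are illustrative. Under this definition the two AIC values in the table are reproduced, and the gap between them is about 1527.82, whereas the table lists 15015.68 for GD-GR-ML; the sketch does not resolve that difference.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion, assuming the standard form 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


# Log-likelihood and number of parameters for each branch model, as tabulated.
models = {
    "GD-FR-ML": (-7500.2956, 29),  # free rates: each branch has its own turnover rates
    "GD-GR-ML": (-8290.2042, 3),   # global rates: all branches share the same rates
}

scores = {name: aic(lnl, k) for name, (lnl, k) in models.items()}

# The best model is the one with the lowest AIC; delta is measured against it.
best = min(scores, key=scores.get)
delta = {name: score - scores[best] for name, score in scores.items()}

for name in models:
    print(f"{name}: AIC = {scores[name]:.2f}, delta AIC = {delta[name]:.2f}")
```

The free-rates model has the lower AIC despite its 29 parameters, which is consistent with the table ranking GD-FR-ML as the best model.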
Figure 3. Estimates of the ancestral gene content and the minimum number of losses and gains per branch. The bold numbers next to the nodes show the estimates of the ancestral gene content, whereas the taxa labels give the total gene content for the newly sequenced Mexican strains. The numbers on the branches mark the minimum number of gains (number before the slash) and the minimum number of losses (number after the slash).
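To make the gains/losses labels in Figure 3 concrete, the sketch below counts them for a single branch, assuming the gene family sets at the ancestral and descendant nodes are already known. This is only an illustration with hypothetical gene family names; it is not the reconstruction procedure used in the study. The counts are minima because a family that was gained and then lost again along the same branch leaves no trace in the two endpoint sets.

```python
def branch_gains_losses(ancestor, descendant):
    """Return (gains, losses) on one branch from the gene family sets at its two ends."""
    gains = descendant - ancestor    # families present only in the descendant
    losses = ancestor - descendant   # families present only in the ancestor
    return len(gains), len(losses)


# Hypothetical gene family sets for one internal node and one of its descendants.
ancestral_node = {"famA", "famB", "famC"}
descendant_tip = {"famB", "famC", "famD", "famE"}

gains, losses = branch_gains_losses(ancestral_node, descendant_tip)

# Same "gains/losses" layout as the branch labels described in the caption.
print(f"{gains}/{losses}")
```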
Figure 4. Boxplots of the differences due to either mutations or gene losses/gains. The boxes on the left refer to the differences due to mutations, whereas the boxes on the right describe the differences in gene content. We carried out pairwise comparisons including all the strains (A) and pairwise comparisons without the hypermutator strains (B).
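The pairwise comparisons summarised in Figure 4 can be sketched as follows. For every pair of strains, the code counts nucleotide differences in an alignment and gene families present in one strain but not the other. The strain names, sequences, and gene sets are hypothetical, and treating mutations as mismatched alignment columns is an assumption made for illustration.

```python
from itertools import combinations


def pairwise_differences(gene_sets, alignment):
    """Return (strain_a, strain_b, mutation_differences, gene_content_differences) per pair."""
    rows = []
    for a, b in combinations(sorted(gene_sets), 2):
        # Gene content difference: families found in exactly one of the two strains.
        gene_diff = len(gene_sets[a] ^ gene_sets[b])
        # Mutation difference: aligned positions where the two sequences disagree.
        snp_diff = sum(x != y for x, y in zip(alignment[a], alignment[b]))
        rows.append((a, b, snp_diff, gene_diff))
    return rows


# Hypothetical toy data: gene family sets and equal-length aligned sequences.
gene_sets = {
    "s1": {"g1", "g2", "g3"},
    "s2": {"g1", "g3", "g4"},
    "s3": {"g1", "g2", "g3", "g5"},
}
alignment = {"s1": "ACGTAC", "s2": "ACGTAT", "s3": "ACGTAC"}

rows = pairwise_differences(gene_sets, alignment)
for a, b, snp_diff, gene_diff in rows:
    print(a, b, "mutations:", snp_diff, "gene content:", gene_diff)
```

Collecting the third and fourth columns over all pairs gives the two distributions that a boxplot such as Figure 4 would compare; dropping selected strains from both dictionaries before the call corresponds to a comparison like panel B.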