Extensive genomic variation within clonal bacterial groups resulted from homologous recombination.

Weilong Hao

Abstract

Due to divergence, genetic variation is generally believed to be high among distantly related strains, low among closely related ones and little or none within the same classified clonal groups. Several recent genome-wide studies, however, revealed that significant genetic variation resides in a considerable number of genes among strains with identical MLST (Multilocus sequence typing) types and much of the variation was introduced by homologous recombination. Recognizing and understanding genomic variation within clonal bacterial groups could shed new light on the evolutionary path of infectious agents and the emergence of particularly pathogenic or virulent variants. This commentary presents our recent contributions to this line of work.

Keywords:  homologous recombination; horizontal gene transfer; multilocus sequence typing; pathogenic adaptation; phylogenomics; prophage

Year:  2013        PMID: 23734294      PMCID: PMC3661140          DOI: 10.4161/mge.23463

Source DB:  PubMed          Journal:  Mob Genet Elements        ISSN: 2159-2543


Introduction

Nucleotide sequences diverge over time due to the combined effects of point mutation and homologous recombination. Recombination events change regions of contiguous bases in single events and were generally assumed to be rare in bacteria. However, there is growing evidence that homologous recombination has a significant impact on sequence diversification during bacterial genome evolution. A recent analysis of the MLST (multilocus sequence typing) data of 46 bacterial and two archaeal species revealed 27 species (56%) in which homologous recombination contributed more nucleotide changes than point mutation. The rapid genetic change introduced by homologous recombination could facilitate ecological adaptation and drive pathogenesis in bacterial pathogens. Currently, the MLST scheme, which uses DNA fragments from seven housekeeping genes, is routinely used to characterize bacterial isolates. The standard MLST scheme has also been extended to construct fine-scale relationships and to further subdivide identical multilocus sequence types (STs) using more loci or a large amount of shared genomic sequence. Given the common occurrence of homologous recombination, it is crucial to investigate its genome-wide extent, which could also benefit the reconstruction of strain history and the tracking of emerging pathogens.

Identification and Quantification of Nonvertically Acquired Genes via Recombination within Identical STs

Identifying recombinational exchanges in closely related strains is challenging, as exchanges involving a small number of nucleotides may be mistaken for point mutations. Guttman and Dykhuizen (1994) examined the clonal divergence of E. coli strains in the ECOR group A by considering the divergence time and mutation rate, and showed that recombination has occurred at a rate 50-fold higher than the mutation rate in four loci. Feil et al. (2000) estimated the ancestral allele for isolates that differ at only one of the seven MLST loci and assigned recombination based on the number of nucleotides derived from the ancestral allele and on whether those nucleotides are novel in the population. We adopted a new approach (illustrated in Fig. 1) to identify recombinant genes in Neisseria meningitidis strains with identical STs. It does not require the estimation of divergence time or ancestral alleles and can be applied to any two strains with identical STs. In brief, nucleotide substitution was assumed to follow a binomial distribution, and an upper bound of genome-wide divergence (μ) by point mutation was calculated given that no substitution was observed at any nucleotide site of the seven MLST loci. The estimated maximum genome-wide divergence was then used as a benchmark to compute, for each gene in the genome, a P-value for the observed nucleotide changes being explained by point mutation. Genes that have more than the expected number of nucleotide changes at a significance level of 0.001 were deemed recombinant genes. Our results revealed that up to 19% of commonly present genes in N. meningitidis strains with identical STs have been affected by homologous recombination.

Figure 1. Inference of homologous recombination in strains with identical STs. Under a binomial distribution of nucleotide substitution, the probability of no nucleotide change in the seven MLST loci is (1 - μ)^n, where n is the number of nucleotides in the seven MLST loci. Setting (1 - μ)^n = 0.001 gives μ, the upper bound of genome-wide nucleotide divergence at the 0.001 significance level given no change in the seven MLST loci. At genome-wide divergence μ, genes that have more than the expected number of nucleotide changes at the 0.001 significance level were deemed nonvertically acquired genes.
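The calculation in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the function names, the choice of an upper-tail probability P(X >= k) as the per-gene P-value, and the MLST length of 3000 sites are all hypothetical. Solving (1 - μ)^n = 0.001 for μ gives μ = 1 - 0.001^(1/n).

```python
import math

def max_divergence(n_sites, alpha=0.001):
    """Upper bound mu that solves (1 - mu)^n_sites = alpha."""
    return 1.0 - alpha ** (1.0 / n_sites)

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k <= 0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(1.0, total)

def is_recombinant(n_changes, gene_length, mu, alpha=0.001):
    """Flag a gene whose observed changes are unlikely under point mutation alone."""
    return binom_upper_tail(n_changes, gene_length, mu) < alpha

# Hypothetical total length of the seven MLST loci (an assumption for illustration).
MLST_SITES = 3000
mu = max_divergence(MLST_SITES)

# A 1000-bp gene with 3 changes is compatible with point mutation at divergence mu;
# the same gene with 12 changes is not.
print(mu, is_recombinant(3, 1000, mu), is_recombinant(12, 1000, mu))
```

Because μ is an upper bound on divergence by point mutation, the test is conservative in the sense that a gene must exceed what even the maximum plausible mutation-only divergence would produce before it is flagged.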

In another study on E. coli O104 (ST678) genomes, we visualized recombinant genes by plotting the pairwise DNA distance of orthologous genes along the genome and identified 167 genes in three gene clusters that have likely undergone homologous recombination. A reanalysis of the orthologs between the E. coli ON2010 and 55989 (labeled as Ec55989 hereafter to avoid unnecessary confusion) genomes using both pairwise DNA distance and the P-values as described in ref. 15 yielded remarkably similar results (Fig. 2). In fact, the use of nucleotide divergence between two genomes to detect homologous recombination has been successful in other studies, one of which was on two E. coli ST131 strains. A higher proportion of core genes (at least 9%) was found to be affected by homologous recombination in the E. coli ST131 genomes than in the E. coli ST678 genomes (Fig. 2). The findings in both N. meningitidis and E. coli showed extensive genomic variation within identical STs. Since many bacterial species have a comparable or higher level of recombinogenicity than N. meningitidis or E. coli, extensive genomic variation within identical STs should be expected in many bacterial species.
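Scanning pairwise distances of orthologs along the genome can be sketched as follows. This is a simplified illustration, not the published pipeline: it swaps in an uncorrected p-distance (proportion of differing sites) for the model-corrected distances computed by DNADIST, and the input layout (gene name, genome position, two pre-aligned sequences) and function names are assumptions.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap or ambiguous base in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            diffs += a != b
    return diffs / compared if compared else float("nan")

def distance_profile(orthologs):
    """Return (position, gene, distance) tuples ordered by genome position.

    orthologs: iterable of (gene_name, genome_position, aligned_a, aligned_b).
    """
    ordered = sorted(orthologs, key=lambda rec: rec[1])
    return [(pos, name, p_distance(a, b)) for name, pos, a, b in ordered]

# Toy input: two hypothetical orthologous gene pairs.
example = [
    ("geneB", 200, "AAAA", "AATT"),
    ("geneA", 100, "ACGT", "ACGT"),
]
for pos, name, dist in distance_profile(example):
    print(pos, name, dist)
```

Plotting the third column against the first would give a profile in which clusters of recombinant genes appear as runs of elevated distance against a background of identical or nearly identical genes.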

Figure 2. Inferring genes involved in homologous recombination by comparing orthologs between two E. coli strains, ON2010 and Ec55989. (A) DNA distance was measured using DNADIST of the PHYLIP package. (B) P-values were calculated based on the maximum genome-wide divergence given the seven identical MLST loci, as illustrated in Figure 1. For simplicity, P-values smaller than 0.0001 were shown as 0.0001. Genes located in the prophage regions were colored in blue. Please note that more genes (4207 genes in total) were examined here than in our previous study (3794 genes), since our previous study focused on the genes present in both the O104 strains and the IAI1 strain.

It is important to note that the high genomic variation discovered within identical STs should not be interpreted as an artifact of these studies. The high level of genomic variation within identical STs could instead be explained if many non-vertical genes within identical STs are deleterious or transiently adaptive and undergo fast rates of evolution. In fact, the ratio of recombination to mutation rates was higher in comparisons of clonally related strains than in comparisons of relatively broadly sampled strains from the corresponding species. Such a discrepancy between the estimated recombination-mutation ratios highlights the need for a population genetics framework for the study of recombination and bacterial genome evolution.

Genomic Regions Involved in Recombination

Among the three gene clusters of recombinant genes we identified in E. coli O104, one gene cluster contained 125 genes and was likely involved in direct chromosomal homologous recombination specific to the ON2010 strain. These 125 genes were found in 20 different functional categories and 70 of them were found in all the studied 57 E. coli and Shigella genomes. This is consistent with the conclusion that genes from all functional categories are subject to DNA exchange. Furthermore, the nearest phylogenetic neighbors of these genes were not clustered in a single phylogenetic group. We hypothesized that extensive recombination with a broad spectrum of strains has taken place in one genome, and this highly mosaic genome then recombined with the precursor to the ON2010 genome. The other two gene clusters of recombinant genes in E. coli O104 were located in the prophage regions, but the genes in these two gene clusters were identical between ON2010 and Ec55989 genomes. It is noteworthy that the reanalysis with more single-copy genes (with details in Fig. 2) revealed 5 prophage genes involved in recombination. These prophage genes are not present in all O104 strains and the outgroup IAI1 strain. This could be explained by frequent recombination of the prophage genes with infecting phages or different prophages from other bacterial chromosomes. Since all examined O104 genomes are of conserved genome synteny, our observations support the argument that homologous (legitimate) recombination drives module exchange between phages. Together, these findings suggest that homologous recombination takes place frequently in both core genes and dispensable genes.

Phylogenomic Consequence

As the cost of sequencing drops, the characterization of bacterial isolates has utilized more shared genes or loci and shifted toward phylogenomic analysis. Quite often, multiple gene alignments were concatenated into a single super-alignment, from which phylogenies were reconstructed using a variety of methodologies. Such a data set, also known as a supermatrix, has been demonstrated to resolve previously ambiguous or unresolved phylogenies, even in the presence of a low amount of horizontal gene transfer in the data set. Unfortunately, the supermatrix approach becomes very sensitive to recombination when applied to strains with identical STs because genuine sequence diversity is limited. The concatenated sequences of 3794 genes in the E. coli O104 strains were overwhelmed by the phylogenetic signal of the 125 recombinant genes, as many other genes are identical among the E. coli O104 strains (Fig. 2). The accuracy and robustness of the constructed evolutionary relationships can be improved by excluding recombinogenic and incongruent sequences. In fact, removing the 125 recombinant genes from the E. coli O104 data set resulted in consistent phylogenetic relationships of O104 strains across different phylogenetic approaches. One interesting finding of our E. coli O104 study is that the number of identical loci, as implemented in BIGSdb, was less sensitive to homologous recombination than the concatenated sequences of all loci. This could be explained by the fact that recombination has affected a relatively small number of genes but introduced a substantial amount of diversity in the ON2010 genome. It is further noteworthy that supertrees, another widely used approach for phylogenomic analysis, are not suitable for characterizing strains with identical MLST types, as many individual genes are identical or nearly identical and so provide little or no phylogenetic information for the individual gene trees.
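The contrast between concatenated sequences and counts of identical loci can be illustrated with a toy example. The sketch below uses invented data and hypothetical function names; it is not BIGSdb code and does not reproduce the O104 analysis. One strain carries a single recombinant locus with many changes, while another carries single point mutations in three loci.

```python
def mutate(seq, positions):
    """Return seq with a guaranteed base change at each listed position."""
    chars = list(seq)
    for p in positions:
        chars[p] = "A" if chars[p] != "A" else "C"
    return "".join(chars)

def concatenated_distance(s1, s2):
    """Differing sites over total sites across all shared, aligned loci."""
    loci = sorted(s1.keys() & s2.keys())
    diffs = sum(a != b for l in loci for a, b in zip(s1[l], s2[l]))
    sites = sum(len(s1[l]) for l in loci)
    return diffs / sites

def differing_loci(s1, s2):
    """Number of shared loci whose alleles are not identical."""
    return sum(s1[l] != s2[l] for l in s1.keys() & s2.keys())

BASE = "ACGTACGTACGTACGTACGT"  # 20-bp toy allele shared by all loci
x = {f"g{i}": BASE for i in range(1, 6)}

# y: identical to x except one locus replaced by a divergent (recombinant) allele.
y = dict(x, g3=mutate(BASE, range(0, 16, 2)))  # 8 changes in one locus

# z: one point mutation in each of three loci.
z = dict(x, g1=mutate(BASE, [0]), g2=mutate(BASE, [5]), g4=mutate(BASE, [9]))

print(concatenated_distance(x, y), concatenated_distance(x, z))  # y looks farther
print(differing_loci(x, y), differing_loci(x, z))                # y looks closer
```

In this toy case the concatenated distance places y farther from x than z, while the locus count places y closer, because a recombinant locus counts once regardless of how many substitutions it introduces.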

Homologous Recombination and Pathogenic Adaptation

Homologous recombination can bring together beneficial mutations arising in different genomes and have a strong impact on ecological adaptation. One well-known example was the recombination in the penA genes during the emergence of penicillin resistance in N. meningitidis. Variation of the penA gene corresponding to different levels of penicillin susceptibility has also been observed between N. meningitidis strains with the same MLST types. Furthermore, genetic variation within the same MLST types has been evident in the capsule gene cluster and in genes used as vaccine targets in N. meningitidis. These observations suggest a strong relationship between homologous recombination and pathogenic adaptation involving antibiotic resistance, capsule biosynthesis and vaccine efficacy. Recombination-mediated pathogenic adaptation was also evident in E. coli. Recombination has affected fimH, which encodes the mannose-specific type 1 fimbrial adhesin, resulting in distinct fluoroquinolone-resistance profiles in ST131 strains. A survey of the fimH gene in the 57 E. coli and Shigella genomes revealed that ON2010 was the only E. coli O104 genome containing a fimH BLAST hit > 10% of length (Fig. 3). Except for one nucleotide, the fimH sequence in ON2010 was identical to those in E24377A and S88. On the ON2010 genome scaffold, fimH is immediately upstream of a fructuronic acid transporter gene, gntP, which is present in all E. coli and Shigella genomes. The gntP gene in ON2010 was also found to be involved in homologous recombination (Fig. 2) and, most importantly, the sequences most similar to the ON2010 gntP were also in E24377A and S88 (data not shown). The shared origin of the adjacent fimH and gntP genes in ON2010 suggests that patchily distributed genes involved in pathogenesis could be introduced by homologous recombination of the conserved flanking genes.

Figure 3. Sequence alignment of fimH. Only informative sites are shown with coordinates at the top. The ON2010 sequence and its most similar sequences (differing by one nucleotide) are shown in light green.

References (10 of 28 listed)

1.  E J Feil; J M Smith; M C Enright; B G Spratt. Estimating recombinational parameters in Streptococcus pneumoniae from multilocus sequence typing data. Genetics. 2000-04.

2.  A J Clark; W Inwood; T Cloutier; T S Dhillon. Nucleotide sequence of coliphage HK620 and the evolution of lambdoid phages. J Mol Biol. 2001-08-24.

3.  Olga Zhaxybayeva; J Peter Gogarten; Robert L Charlebois; W Ford Doolittle; R Thane Papke. Phylogenetic analyses of cyanobacterial genomes: quantification of horizontal gene transfer events. Genome Res. 2006-08-09.

4.  Xavier Didelot; Mark Achtman; Julian Parkhill; Nicholas R Thomson; Daniel Falush. A bimodal pattern of relatedness between the Salmonella Paratyphi A and Typhi genomes: convergence or divergence by homologous recombination? Genome Res. 2006-11-07.

5.  Sandip Paul; Elena V Linardopoulou; Mariya Billig; Veronika Tchesnokova; Lance B Price; James R Johnson; Sujay Chattopadhyay; Evgeni V Sokurenko. Role of homologous recombination in adaptive diversification of extraintestinal Escherichia coli. J Bacteriol. 2012-11-02.

6.  Julia S Bennett; Keith A Jolley; Sarah G Earle; Craig Corton; Stephen D Bentley; Julian Parkhill; Martin C J Maiden. A genomic approach to bacterial taxonomy: an examination and proposed reclassification of species within the genus Neisseria. Microbiology (Reading). 2012-03-15.

7.  Keith A Jolley; Martin C J Maiden. BIGSdb: Scalable analysis of bacterial genome variation at the population level. BMC Bioinformatics. 2010-12-10.

8.  Weilong Hao; Vanessa G Allen; Frances B Jamieson; Donald E Low; David C Alexander. Phylogenetic incongruence in E. coli O104: understanding the evolutionary relationships of emerging pathogens in the face of homologous recombination. PLoS One. 2012-04-06.

9.  Danesh Moradigaravand; Jan Engelstädter. The effect of bacterial recombination on adaptation on fitness landscapes with limited peak accessibility. PLoS Comput Biol. 2012-10-25.

10.  Marie Touchon; Claire Hoede; Olivier Tenaillon; Valérie Barbe; Simon Baeriswyl; Philippe Bidet; Edouard Bingen; Stéphane Bonacorsi; Christiane Bouchier; Odile Bouvet; Alexandra Calteau; Hélène Chiapello; Olivier Clermont; Stéphane Cruveiller; Antoine Danchin; Médéric Diard; Carole Dossat; Meriem El Karoui; Eric Frapy; Louis Garry; Jean Marc Ghigo; Anne Marie Gilles; James Johnson; Chantal Le Bouguénec; Mathilde Lescat; Sophie Mangenot; Vanessa Martinez-Jéhanne; Ivan Matic; Xavier Nassif; Sophie Oztas; Marie Agnès Petit; Christophe Pichon; Zoé Rouy; Claude Saint Ruf; Dominique Schneider; Jérôme Tourret; Benoit Vacherie; David Vallenet; Claudine Médigue; Eduardo P C Rocha; Erick Denamur. Organised genome dynamics in the Escherichia coli species results in highly diverse adaptive paths. PLoS Genet. 2009-01-23.