Genome-wide evidence for positive selection and recombination in Actinobacillus pleuropneumoniae.

Zhuofei Xu, Huanchun Chen, Rui Zhou.

Abstract

BACKGROUND: Actinobacillus pleuropneumoniae is an economically important animal pathogen that causes contagious pleuropneumonia in pigs. The molecular evolutionary trajectories of this pathogenic bacterium remain poorly understood and can be better elucidated with comparative genomics data. We therefore employed a comparative phylogenomic approach to obtain a comprehensive understanding of the roles of natural selective pressure and homologous recombination during adaptation of this pathogen to its swine host.
RESULTS: In this study, 12 A. pleuropneumoniae genomes were used to carry out phylogenomic analyses. We identified 1,587 orthologous core genes as an initial data set for the estimation of genetic recombination and positive selection. Based on four recombination tests, 23% of the core genome of A. pleuropneumoniae showed strong signals for intragenic homologous recombination. Furthermore, the selection analyses indicated that 57 genes were undergoing significant positive selection. The functional properties of these positively selected genes indicated that genes coding for products relevant to bacterial surface structures and pathogenesis are prone to natural selective pressure, presumably due to their potential roles in the avoidance of the porcine immune system.
CONCLUSIONS: Overall, substantial genetic evidence was shown to indicate that recombination and positive selection indeed play a crucial role in the adaptive evolution of A. pleuropneumoniae. The genome-wide profile of positively selected genes and/or amino acid residues will provide valuable targets for further research into the mechanisms of immune evasion and host-pathogen interactions for this serious swine pathogen.

Year:  2011        PMID: 21749728      PMCID: PMC3146884          DOI: 10.1186/1471-2148-11-203

Source DB:  PubMed          Journal:  BMC Evol Biol        ISSN: 1471-2148            Impact factor:   3.260


Background

In the evolutionary history of many microorganisms, positive selection and homologous recombination are two indispensable driving forces for adaptation to new niches. Both contribute to the genetic variation that may influence the population diversification and adaptation of pathogenic microorganisms [1,2]. Recent studies of genome-wide evolutionary dynamics have highlighted the important roles of selection and recombination in the molecular evolution of bacterial pathogens, including Escherichia coli [1], Listeria monocytogenes [3], Salmonella spp. [4], Streptococcus spp. [5], and Campylobacter spp. [6]. These analyses have revealed that protein-coding genes subject to natural selection pressure are often involved in the dynamic interactions between host and pathogen, especially in immune and defense-associated functions [1]. Diversifying selection operating on these genes may be caused by a pathogen-host co-evolutionary arms race [7,8]. In the present study, dN/dS-based methods were applied to detect evidence of genome-wide positive Darwinian selection. Estimating the ratio (ω) of the rate of nonsynonymous nucleotide substitutions (dN) to that of synonymous substitutions (dS) is a powerful approach for measuring selective pressure at the protein-coding level: ω = 1, ω < 1, and ω > 1 indicate neutral evolution, purifying (negative) selection, and positive (adaptive) selection, respectively [9,10]. The codon models further developed by Nielsen and Yang allow ω to vary among sites [11]; they can therefore detect adaptive evolution in the many functional genes where only a small fraction of amino acid sites are subject to strong positive selective pressure [12]. Thus far this approach has been widely used for genome-wide selection analyses in pathogenic viruses, bacteria, and eukaryotes [9,13].
A substantial number of genes encoding highly variable antigens have been found to undergo adaptive selection, particularly at functional sites involved in the evasion of host immunity [1,4,14]. Actinobacillus pleuropneumoniae, a Gram-negative coccobacillus belonging to the genus Actinobacillus of the family Pasteurellaceae, is a strict swine pathogen that colonizes the upper respiratory tract of pigs [15]. This pathogen causes an economically severe disease characterized by pulmonary lesions, pleuritis, and pericarditis in pigs [16]. According to differences in capsular polysaccharides, A. pleuropneumoniae has been divided into 15 serovars [17]. Recent comparative genomics studies based on high-throughput genome sequencing and microarrays have described the composition of the pan-genome and confirmed the contribution of gene loss and gain to the diversity in virulence and serovar of A. pleuropneumoniae [18,19]. However, besides large genetic variations resulting from DNA acquisition and genome reduction, small sequence differences occurring in the conserved genes, including point mutations, insertions/deletions (indels), and intragenic recombination, may also play a crucial part in the alteration of antibiotic resistance, pathogenicity and immunogenicity [20,21]. To date, no study has adequately addressed the links between genetic alterations and putative functional roles in intraspecies conserved genes of A. pleuropneumoniae at the whole-genome level. To further trace evolutionary trajectories in the core genome of A. pleuropneumoniae, we employed a genome-wide approach to investigate the effects of natural selection and homologous recombination operating on the coding genes. Our analyses focused on the evolutionary characterization of core genome genes that are shared by 12 A. pleuropneumoniae genomes.
Many genes were shown to be under strong positive selective pressure and primarily associated with the fitness and immunogenic properties of this swine pathogen.

Methods

Genome dataset and alignment

Twelve genome sequences of A. pleuropneumoniae were retrieved from the NCBI Genome database (http://www.ncbi.nlm.nih.gov/genome). The sequences included 3 complete genomes and 9 draft genome assemblies (see details in Table 1). Orthologous gene content information and annotation with COG functional classification were defined in our recent work and used here [19]. To increase the accuracy and power of the selection analyses, an ortholog set was excluded if it satisfied any of the following criteria: any gene shorter than 80% of the maximum length in the set, more than one gene from a single genome, or fewer than four sequences. Protein-coding sequences longer than 50 codons were used in this study. Subsequently, the orthologous protein sequences were aligned using a progressive method implemented in T-Coffee v8.93 [22].
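The exclusion criteria above can be sketched as a small filter. This is only an illustration in Python, not the authors' pipeline: the cluster layout (a list of genome/sequence pairs), the function name and the parameter names are all hypothetical. The minimum length is read here as at least 50 codons, following the "> 49 codons" wording of Table 1; that reading is an assumption.

```python
from typing import List, Tuple

# Hypothetical layout: one ortholog cluster is a list of (genome_id, coding_sequence) pairs.
Cluster = List[Tuple[str, str]]


def keep_cluster(cluster: Cluster,
                 min_len_ratio: float = 0.8,
                 min_seqs: int = 4,
                 min_codons: int = 50) -> bool:
    """Return True if the cluster passes all exclusion criteria described in the text."""
    # Fewer than four sequences: excluded.
    if len(cluster) < min_seqs:
        return False
    # More than one gene from the same genome: excluded.
    genomes = [genome for genome, _ in cluster]
    if len(set(genomes)) != len(genomes):
        return False
    # Any gene shorter than 80% of the longest gene in the set: excluded.
    lengths = [len(seq) for _, seq in cluster]
    if min(lengths) < min_len_ratio * max(lengths):
        return False
    # Assumed reading of the length cut-off: every gene has at least 50 codons.
    if min(lengths) // 3 < min_codons:
        return False
    return True
```

Keeping the thresholds as keyword arguments makes it easy to see how sensitive the retained gene set is to each cut-off.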
Table 1

Genome sequences of A. pleuropneumoniae used in this study

Strain | GenBank accession no. | Genome size (Mbp) | No. of CDS (> 49 codons) | Reference
--- | --- | --- | --- | ---
JL03 | CP000687 | 2.24 | 2,101 (2,035) | [60]
L20 | CP000569 | 2.27 | 2,137 (2,072) | [61]
Ap76 | CP001091 | 2.33 | 2,203 (2,134) |
4074 | ADOD00000000 | 2.26 | 2,180 (2,101) | [19]
S1536 | ADOE00000000 | 2.22 | 2,137 (2,061) | [19]
M62 | ADOF00000000 | 2.27 | 2,223 (2,151) | [19]
Femφ | ADOG00000000 | 2.31 | 2,219 (2,138) | [19]
CVJ13261 | ADOI00000000 | 2.26 | 2,204 (2,124) | [19]
D13039 | ADOJ00000000 | 2.27 | 2,168 (2,091) | [19]
56153 | ADOK00000000 | 2.27 | 2,195 (2,120) | [19]
1096 | ADOL00000000 | 2.19 | 2,096 (2,033) | [19]
N273 | ADOM00000000 | 2.25 | 2,149 (2,086) | [19]

Frameshift mutations (indels whose length is not a multiple of three nucleotides) can inflate nonsynonymous substitution rates and thereby produce false positives when positive selection is estimated from dN/dS ratios [5]. To avoid incorrect indels in the codon alignments, multiple sequence alignments were first performed on the amino acid sequences of each gene cluster and then converted to the corresponding codon alignments using custom Perl scripts. Coding sequences located at the beginning or end of contigs appeared to be more prone to frameshift sequencing errors. Therefore, we further assessed the quality of each alignment using the overall identity and the identity in the first 30 nt and last 30 nt of the alignment. When identity was low, codon alignments containing frameshift mutations were checked and edited manually in MEGA4 [23].
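The conversion from a protein alignment to a codon alignment can be illustrated as follows. The study used custom Perl scripts that are not shown in the text; the Python function below is a hypothetical stand-in that assumes each unaligned coding sequence is in frame and collinear with its protein, and that "-" is the gap character.

```python
def protein_to_codon_alignment(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Thread an unaligned coding sequence onto its aligned protein sequence.

    Every residue consumes one codon from the coding sequence; every gap in the
    protein alignment becomes a three-character gap, so indels in the result
    are always whole codons and cannot introduce a frameshift.
    """
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) < 3 * n_residues:
        raise ValueError("coding sequence is shorter than the aligned protein")
    codons = []
    pos = 0
    for aa in aligned_protein:
        if aa == gap:
            codons.append(gap * 3)
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    # Any trailing nucleotides (for example a stop codon) are ignored.
    return "".join(codons)
```

Because gaps are only ever inserted in multiples of three, any remaining frameshift must come from the underlying sequence itself, which is what the identity checks on the first and last 30 nt are meant to catch.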

Calculation of dN, dS, codon bias, nucleotide diversity and informative sites

Following the method of Nei and Gojobori [24], the number of synonymous nucleotide substitutions per synonymous site (dS) and the number of nonsynonymous nucleotide substitutions per nonsynonymous site (dN) were estimated for the resulting gene alignments using the program SNAP [25]. The gene-by-gene number of informative sites and genetic diversity were obtained from the output of the PhiPack program [26]. Codon usage variation was analyzed by computing the effective number of codons (Nc), a general measure of the departure from equal codon usage in a gene. The Nc value ranges from 20 for the strongest bias (where only one codon is used for each amino acid) to 61 for no bias [27,28]. Nc was calculated with the program CodonW 1.4 (http://codonw.sourceforge.net/).
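To make the Nc scale concrete, the sketch below computes the effective number of codons from the textbook form of Wright's formula, Nc = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, where Fk is the mean codon homozygosity of the k-fold degenerate amino acids. It is an illustration in Python and not the CodonW implementation: the function and variable names are hypothetical, the standard genetic code is assumed, and degeneracy classes without usable data raise an error instead of being approximated.

```python
from collections import Counter, defaultdict

BASES = "TCAG"
# Standard genetic code in TCAG order ("*" marks stop codons).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Group sense codons into synonymous families, keyed by amino acid.
FAMILIES = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    if amino_acid != "*":
        FAMILIES[amino_acid].append(codon)


def effective_number_of_codons(cds: str) -> float:
    """Effective number of codons (Nc) of one in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Homozygosity F = (n * sum(p_i^2) - 1) / (n - 1) for each amino acid,
    # collected by the degeneracy (family size) of that amino acid.
    f_by_degeneracy = defaultdict(list)
    for codons in FAMILIES.values():
        k = len(codons)
        n = sum(counts[c] for c in codons)
        if k == 1 or n < 2:
            continue  # Met/Trp carry no information; F is undefined for n < 2
        sum_p2 = sum((counts[c] / n) ** 2 for c in codons)
        f_by_degeneracy[k].append((n * sum_p2 - 1) / (n - 1))

    nc = 2.0  # Met and Trp each contribute exactly one codon
    for k, n_families in ((2, 9), (3, 1), (4, 5), (6, 3)):
        values = f_by_degeneracy.get(k, [])
        mean_f = sum(values) / len(values) if values else 0.0
        if mean_f <= 0:
            raise ValueError(f"cannot estimate homozygosity for {k}-fold families")
        nc += n_families / mean_f
    return min(nc, 61.0)  # values above 61 are conventionally capped
```

A gene that uses a single codon for every amino acid gives F = 1 in every class and therefore Nc = 2 + 9 + 1 + 5 + 3 = 20, the lower bound quoted above.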

Detection of recombination events

Since recombining fragments among aligned codon sequences have a profound effect on the detection of positive selection [29], we first tested for recombination signals between sequences in each alignment of orthologous genes. Four statistical procedures were applied to detect homologous recombination signals: GENECONV [30], the pairwise homoplasy index (PHI) [26], maximum χ2 [31] and the neighbor similarity score (NSS) [32]. GENECONV version 1.81 was run as a separate program, whereas the other three methods were implemented in the PhiPack package [26]. For GENECONV, the parameter g-scale was set to 1, which allows mismatches within a recombining fragment. The p-values for inner fragments, based on 10,000 random permutations, were used to indicate the significance of putative recombinant regions. For maximum χ2, a fixed window size of 2/3 the number of polymorphic sites was used. For PHI, the window size was set to 100 nucleotides. Simulated p-values were estimated from 1,000 permutations for PHI, maximum χ2 and NSS.

Detection of Selection

Maximum likelihood (ML) phylogenetic trees were reconstructed for each gene in the core genome data set using the PhyML program [33]. A general time-reversible (GTR) model of nucleotide substitution, with ML estimates of gamma-distributed rate heterogeneity in four categories (Γ4) and of the proportion of invariable sites, was used for all tree reconstructions. The resulting ML tree topologies were used in the subsequent selection analyses. To detect selective pressure acting on each coding gene, the rates of synonymous and nonsynonymous substitutions were estimated site by site using the codeml program from the PAML 4.2b package [34]. Using the ML tree topology for each gene alignment, two site-specific models that allow the ratio of nonsynonymous (dN) to synonymous (dS) rates (ω = dN/dS) to vary among codons were applied to our data set: M1a (NearlyNeutral) and M2a (PositiveSelection). The null model M1a is nested within the alternative selection model M2a. The latter adds an extra site class for a fraction of positively selected amino acid sites with ω > 1, whereas model M1a only allows site classes with ω between 0 and 1 [10,35]. A likelihood ratio test (LRT) comparing M1a against M2a was carried out to infer the occurrence of sites subject to positive selective pressure. Three replicates were run with codeml, and the maximum likelihood value obtained for each model was used in the LRT. The LRT statistic (twice the log-likelihood difference between the null and the alternative models) was compared with the χ2 distribution with two degrees of freedom. The Bayes empirical Bayes approach was employed to identify positively selected sites under the likelihood framework [36].
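The likelihood ratio test described above reduces to a short calculation once the codeml log-likelihoods are available. The sketch below is an illustration in Python with hypothetical function and argument names; it takes the best log-likelihood among the replicate runs of each model and uses the fact that the upper tail of a χ2 distribution with two degrees of freedom is exp(-x/2), so no statistics library is needed.

```python
import math
from typing import Sequence, Tuple


def lrt_m1a_vs_m2a(lnl_m1a: Sequence[float],
                   lnl_m2a: Sequence[float]) -> Tuple[float, float]:
    """Likelihood ratio test of M1a (null) against M2a (alternative).

    Each argument holds the log-likelihoods from replicate codeml runs of one
    model; the highest value per model is used, as described in the text.
    Returns the statistic 2 * (lnL_M2a - lnL_M1a) and its p-value.
    """
    best_null = max(lnl_m1a)
    best_alt = max(lnl_m2a)
    # The nested alternative cannot truly fit worse; clamp optimisation noise.
    statistic = max(0.0, 2.0 * (best_alt - best_null))
    # Survival function of the chi-square distribution with 2 degrees of freedom.
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value
```

For example, a statistic of 10.03 corresponds to an uncorrected p-value of about 0.0066; whether a gene is reported then depends on the multiple-testing correction described under "Statistical analyses".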

Mapping of positively selected sites to structure models of proteins

The web server PSORTb v3.0 was used to predict bacterial protein subcellular localization [37]. Integral beta-barrel outer membrane proteins were predicted by BOMP [38]. The three dimensional structure model of the protein encoded by the gene that showed evidence for positive Darwinian selection was modeled using the Phyre server [39]. The sites subject to positive selective pressure were mapped onto the structure and visualized by PyMol (http://www.pymol.org/).

Statistical analyses

Multiple testing correction was performed to control for Type I errors according to the approach of Benjamini & Hochberg [40]. For all genes tested for recombination and positive selection, q-values were calculated from p-values using the R package qvalue with the proportion of true null hypotheses set to 1 (π0 = 1) [41]. A false discovery rate (FDR) of 10% was used for the recombination analyses. As the tests used for detecting positive selection were conservative [42], an FDR of 20% was set for the selection analyses. The non-parametric Mann-Whitney U-test was employed to determine the significance of differences in the selected continuous variables (i.e., dN, dS, codon bias and nucleotide diversity) between a given COG functional category and all other categories. A binomial test was used to estimate the association between each COG category and evolutionary forces (i.e. positive selection and/or homologous recombination); Bonferroni corrections for multiple comparisons were performed according to the number of one-sided tests. The significance level was set to 5% in this study. All statistical analyses were carried out using Perl scripts and R 2.11.1 [43].
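With π0 fixed at 1, the q-values should coincide with Benjamini-Hochberg adjusted p-values, which can be computed in a few lines. The study used the R package qvalue; the Python function below is a hypothetical stand-in written only to show the calculation.

```python
from typing import List, Sequence


def bh_qvalues(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values (q-values with pi0 = 1).

    For the p-value of rank r among m tests, the raw estimate is p * m / r;
    a running minimum taken from the largest rank downwards makes the
    q-values monotone in the p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        qvalues[index] = running_min
    return qvalues
```

A gene is then called significant when its q-value falls below the chosen FDR, that is, 0.10 for the recombination tests and 0.20 for the selection test.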

Results

Properties of orthologous genes in 12 A. pleuropneumoniae genomes

In our recent work [19], 2,531 orthologous genes and 772 strain-specific genes were identified in the pan-genome of 12 A. pleuropneumoniae strains using BlastClust. This data set was used here to further characterize the phylogenomics of pathogenic A. pleuropneumoniae. In the present study, we estimated whether homologous recombination and natural selection pressure operate on the conserved coding genes. After manual editing of the aligned gene sequences and removal of low-quality alignments, a data set of 1,960 ortholog alignments was retained; 81% (n = 1,587) were core genes present as one copy per genome, and the remainder (n = 373) were distributed genes present in at least four genomes. The codon bias of each orthologous gene was measured by the effective number of codons (Nc value) calculated by CodonW [28]. A lower Nc indicates stronger bias, which correlates significantly with high gene expressivity [44]. A. pleuropneumoniae genes in the COG functional categories "Energy production and conversion", "Translation", "Amino acid transport and metabolism", "Nucleotide transport and metabolism", and "Carbohydrate transport and metabolism" had significantly higher codon usage bias (P < 0.001, P < 0.001, P = 0.003, P = 0.001, and P < 0.001, respectively; one-tailed U-test) than genes in other COG categories. Genes with stronger codon bias are likely to be highly expressed and to have housekeeping features [3,45]. The high codon bias of genes in these five COG categories is therefore likely to reflect the need for their products in the fundamental life cycle and essential physiological activities of A. pleuropneumoniae.
A. pleuropneumoniae genes in the functional categories "Replication, recombination and repair" and "Amino acid transport and metabolism" tended to have higher rates of synonymous (dS) nucleotide substitutions (P = 0.006 and P < 0.001, respectively; one-tailed U-test) than genes in other role categories (Table 2). Genes in the functional categories "Replication, recombination and repair", "Amino acid transport and metabolism", "Coenzyme transport and metabolism" and "General function prediction only" tended to have higher rates of nonsynonymous (dN) substitutions (P = 0.012, P = 0.001, P = 0.007, and P = 0.007, respectively; one-tailed U-test) than genes in other COG categories (Table 2). A positive correlation was observed between dS and dN values within each COG category of A. pleuropneumoniae genes, indicating that natural selection might act uniformly on the synonymous and nonsynonymous sites of a gene. In addition, the average dS and dN values were significantly lower (P = 0.001 and P < 0.001, respectively; one-tailed U-test) for genes in the COG category "Translation" than for genes in other COG categories. It has been suggested that genes involved in the translation machinery, e.g. ribosomal proteins and tRNA synthetases, usually evolve slowly with low dS and dN, probably due to structural and functional constraints imposed by the fundamental cell life cycle [20,46,47].
Table 2

The rates of synonymous (dS) and nonsynonymous (dN) nucleotide substitutions among different functional categories for A. pleuropneumoniae genes

COG category a | Number of genes analyzed | dS × 10^-3 (± SE) b | dN × 10^-3 (± SE) c | r d
--- | --- | --- | --- | ---
Energy production and conversion | 114 | 44.2 (± 5.2) | 2.7 (± 0.5) | 0.78
Cell cycle control and cell division | 26 | 59.7 (± 13.0) | 3.3 (± 0.8) | 0.69
Amino acid transport and metabolism | 133 | 80.1 (± 6.6) | 4.1 (± 0.4) | 0.80
Nucleotide transport and metabolism | 52 | 53.7 (± 8.3) | 2.5 (± 0.4) | 0.67
Carbohydrate transport and metabolism | 109 | 56.8 (± 6.5) | 3.2 (± 0.6) | 0.73
Coenzyme transport and metabolism | 92 | 59.5 (± 7.7) | 4.7 (± 0.7) | 0.57
Lipid transport and metabolism | 32 | 61.4 (± 11.3) | 3.7 (± 0.9) | 0.75
Translation | 151 | 42.9 (± 4.5) | 2.1 (± 0.3) | 0.58
Transcription | 78 | 69.4 (± 11.8) | 3.8 (± 0.8) | 0.86
Replication, recombination and repair | 94 | 83.3 (± 11.8) | 6.2 (± 1.3) | 0.69
Cell membrane and envelope biogenesis | 120 | 63.3 (± 10.8) | 4.8 (± 0.8) | 0.85
Posttranslational modification, protein turnover, chaperones | 84 | 55.3 (± 7.2) | 3.6 (± 0.9) | 0.75
Inorganic ion transport and metabolism | 124 | 63.5 (± 6.1) | 3.9 (± 0.4) | 0.67
Secondary metabolites biosynthesis, transport and catabolism | 12 | 34.3 (± 11.4) | 2.7 (± 0.9) | 0.90
General function prediction only | 203 | 65.7 (± 6.1) | 4.6 (± 0.5) | 0.82
Function unknown | 177 | 73.6 (± 9.1) | 4.7 (± 0.6) | 0.86
Signal transduction mechanisms | 30 | 39.5 (± 8.6) | 2.0 (± 0.4) | 0.60
Intracellular trafficking, secretion and vesicular transport | 47 | 66.4 (± 16.5) | 7.0 (± 3.5) | 0.85
Defense mechanisms | 22 | 63.6 (± 20.1) | 3.9 (± 1.5) | 0.76
Not in COGs | 258 | 56.1 (± 5.9) | 6.6 (± 0.8) | 0.72

a Two COG functional categories that each include only one gene (i.e. "RNA processing and modification" and "Extracellular structures") are not shown.

b The average rate (dS) of synonymous substitutions (nucleotide mutations that do not alter the amino acid sequence) for all orthologous genes within a given COG role category is displayed, followed by the standard error in parentheses.

c The average rate (dN) of nonsynonymous substitutions (nucleotide mutations that alter the amino acid sequence) for all orthologous genes within a given COG role category is displayed, followed by the standard error in parentheses.

d Correlation coefficient between the values of dS and dN.

A substantial number of genes showing evidence for recombination in the core genome of A. pleuropneumoniae

Among the 1,587 orthologous core genes, 2% (29 genes) had no nucleotide substitutions and thus were not further investigated for evidence of homologous recombination. Among the remaining genes, 197 alignments containing fewer than two informative sites could not be analyzed with the programs in PhiPack and were removed from the ortholog sets. Finally, 86% of all core genes were retained for the subsequent recombination analyses with the four approaches. The evolutionarily conserved core genes (n = 226) were summarized (Additional file 1); the biological functions carried out by their products may be essential for the survival of A. pleuropneumoniae. Notably, conserved genes were significantly enriched in the COG category "Translation" (Bonferroni-corrected P < 0.001; binomial test), a result consistent with the low dS and dN values mentioned above. These translation-associated protein-coding genes are generally involved in fundamental cellular activities and therefore show hardly any changes at the amino acid level as a result of functional constraints. Overall, among the 12 A. pleuropneumoniae genomes, 822 orthologous core genes (52% of all 1,587 core genome genes) showed significant evidence for recombination (FDR < 10%) in at least one of the four tests (Additional file 2). A total of 493, 675, 659 and 559 orthologs were identified as having recombination signals using GENECONV, maximum χ2, NSS and PHI, respectively. Additionally, 149, 148, 160, and 365 orthologs exhibited recombination signals in exactly one, two, three, and all four recombination tests, respectively. It is worth noting that the 23% of core genes identified as recombinants by all four methods have more informative sites (P < 0.001; one-sided U-test) and higher nucleotide diversity (P < 0.001; one-sided U-test).
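The counts reported in this paragraph can be cross-checked with simple arithmetic. The short Python sketch below uses only numbers quoted in the text; the variable names are illustrative.

```python
core_genes = 1587
invariant = 29                    # no nucleotide substitutions
too_few_informative_sites = 197   # could not be analyzed with PhiPack

# Genes carried forward to the four recombination tests.
tested = core_genes - invariant - too_few_informative_sites

# Genes with a recombination signal, keyed by the number of tests that detected it.
by_number_of_tests = {1: 149, 2: 148, 3: 160, 4: 365}
recombinant = sum(by_number_of_tests.values())


def percent(part: int, whole: int) -> int:
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


print(tested, percent(tested, core_genes))            # genes tested and their share
print(invariant + too_few_informative_sites)          # conserved core genes
print(recombinant, percent(recombinant, core_genes))  # detected by at least one test
print(percent(by_number_of_tests[4], core_genes))     # detected by all four tests
```

The totals agree with the text: 1,361 genes (86%) were tested, 226 genes were set aside as conserved, 822 genes (52%) showed a signal in at least one test, and the 365 genes supported by all four tests make up 23% of the core genome.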
For all core genome genes, association between COG categories and the number of genes with recombining fragments was estimated (Figure 1). Core genes that exhibit evidence for recombination were significantly overrepresented in three COG categories "Replication, recombination and repair", "Amino acid transport and metabolism", and "Inorganic ion transport and metabolism" (uncorrected P = 0.007, P < 0.001, and P = 0.029, respectively; one-sided Binomial test). However, after Bonferroni correction, only the association for the COG "Amino acid transport and metabolism" was significant (Bonferroni corrected P = 0.004).
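The enrichment test used here can be sketched as a one-sided binomial tail probability followed by a Bonferroni correction. This is a minimal illustration in Python, not the authors' R code: the function names are hypothetical, and the null proportion is assumed to be the share of all core genes that belong to the COG category.

```python
from math import exp, lgamma, log


def binomial_upper_tail(k: int, n: int, p0: float) -> float:
    """One-sided binomial test: P(X >= k) for X ~ Binomial(n, p0).

    n is the number of recombining genes, k the number of those genes in the
    COG category, and p0 the proportion of all core genes in that category.
    """
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")

    def log_pmf(i: int) -> float:
        # Log-space binomial probability avoids overflow for large n.
        return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p0) + (n - i) * log(1.0 - p0))

    return min(1.0, sum(exp(log_pmf(i)) for i in range(max(k, 0), n + 1)))


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-corrected p-value, capped at 1."""
    return min(1.0, p_value * n_tests)
```

As a usage example with made-up numbers, `bonferroni(binomial_upper_tail(50, 365, 0.08), 20)` would test whether 50 of 365 recombining genes is more than expected for a category holding 8% of the core genome, corrected for 20 categories.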
Figure 1

Genes with evidence of recombination are enriched in three COG functional categories. The abscissa represents different COG functional categories. The ordinate represents the proportion of genes in each COG category. Bars in dark gray stand for proportions of genes (n = 365) with evidence for recombination (FDR < 10%). Bars in white stand for proportions of all core genes (n = 1,587) of A. pleuropneumoniae used in this study. Asterisks mark certain COG categories that significantly enriched with recombining genes (P < 0.05, Binomial test). The COG categories are coded as follows: J, translation; K, transcription; L, DNA replication, recombination and repair; D, cell division and chromosome partitioning; V, defense mechanisms; T, signal transduction; M, cell wall/membrane biogenesis; U, intracellular trafficking, secretion and vesicular transport; O, posttranslational modification, protein turnover and chaperones; C, energy production and conversion; G, carbohydrate transport and metabolism; E, amino acid transport and metabolism; F, nucleotide transport and metabolism; H, coenzyme metabolism; I, lipid metabolism; P, inorganic ion transport and metabolism; Q, secondary metabolites biosynthesis, transport and catabolism; R, general functional prediction only; S, function-unassigned conserved proteins; -, unknown proteins not in the COG collection.

Evidence for 57 A. pleuropneumoniae core genes subject to positive selection

The analysis of positive selection implemented in PAML was carried out for the 1,587 core genome genes of A. pleuropneumoniae (our initial experiment included all 1,960 orthologous genes). Based on the LRT statistic comparing the null model M1a with the selection model M2a against the χ2 distribution, with correction for multiple testing (FDR < 20%), a total of 57 genes were identified as being under strong positive selective pressure (Table 3; Additional file 3). Genes in the COG category "General function prediction only" were significantly enriched (P = 0.004; one-sided binomial test). In addition to the four positively selected genes in the COG category "cell wall/membrane biogenesis", many genes with homologues in other COG categories or without homologues in the COG collection were also predicted to encode surface- or membrane-localized proteins and were likewise subject to positive selective pressure, e.g. gntT, cysW, apaA, pcaK, aphA, pqiB and ytfN.
Table 3

Genes that show evidence for positive Darwinian selection

Name (Systematic) | Cluster ID a | COG b | Putative function of coding products | 2Δℓ c | Q-value | p d | ω e | Positively selected sites f
--- | --- | --- | --- | --- | --- | --- | --- | ---
napA | APO_0068 | C | Periplasmic nitrate reductase | 10.03 | 0.189 | 0.000 | 1.02 |
glpA | APO_0197 | C | Anaerobic glycerol-3-phosphate dehydrogenase subunit A | 12.26 | 0.096 | 0.009 | 11.45 |
fumC | APO_0306 | C | Fumarate hydratase | 15.63 | 0.032 | 0.004 | 99.56 |
prlC | APO_0109 | E | Oligopeptidase A | 14.72 | 0.042 | 0.000 | 607.84 |
argH | APO_0322 | E | Argininosuccinate lyase | 13.10 | 0.071 | 0.014 | 10.79 | 423
gntT | APO_0415 | E | Transporter, gluconate/H+ symporter (GntP) family | 11.74 | 0.106 | 0.008 | 13.97 | 190, 193
proA | APO_0424 | E | Gamma-glutamyl phosphate reductase | 10.02 | 0.189 | 0.051 | 3.29 |
artP | APO_0998 | E | Arginine transport ATP-binding protein | 14.75 | 0.042 | 0.004 | 521.75 | 225
leuD | APO_1132 | E | 3-isopropylmalate dehydratase small subunit | 11.14 | 0.135 | 0.032 | 57.70 |
cpdB | APO_0120 | F | 2',3'-cyclic-nucleotide 2'-phosphodiesterase/3'-nucleotidase | 19.13 | 0.010 | 0.032 | 3.82 | 9, 99, 554
purD | APO_0394 | F | Phosphoribosylamine--glycine ligase | 13.44 | 0.062 | 0.061 | 34.74 |
glgP | APO_0074 | G | Maltodextrin phosphorylase | 14.13 | 0.054 | 0.019 | 9.66 | 13, 27, 392
kdgK | APO_0752 | G | 2-dehydro-3-deoxygluconokinase | 9.98 | 0.190 | 0.042 | 8.28 |
hemN | APO_0325 | H | Oxygen-independent coproporphyrinogen-III oxidase | 21.87 | 0.005 | 0.015 | 27.37 | 38, 136, 428
lipB | APO_1067 | H | Octanoyltransferase | 10.14 | 0.189 | 0.000 | 1.00 |
valS | APO_0044 | J | Valyl-tRNA synthetase | 22.18 | 0.005 | 0.016 | 3.82 | 47, 792
rumA | APO_0356 | J | 23S rRNA (uracil-5-)-methyltransferase | 26.17 | 0.002 | 0.003 | 47.26 | 20
trmA2 | APO_0540 | J | tRNA (uracil-5-)-methyltransferase | 10.70 | 0.160 | 0.028 | 4.08 | 24, 210
yhgF | APO_0081 | K | Transcriptional accessory protein | 18.60 | 0.010 | 0.024 | 3.78 | 10, 142
lysR | APO_0797 | K | Transcriptional regulatory protein | 18.79 | 0.010 | 0.012 | 10.60 | 284, 287, 292
parC | APO_0090 | L | DNA topoisomerase 4 subunit A | 13.47 | 0.062 | 0.003 | 10.50 | 357
mutM | APO_0873 | L | Heptosyltransferase family | 12.15 | 0.096 | 0.077 | 14.90 | 270
tatD | APO_0977 | L | Deoxyribonuclease | 13.73 | 0.059 | 0.198 | 2.42 | 188, 204, 234
hcsA | APO_0102 | M | Capsule polysaccharide modification protein | 10.07 | 0.189 | 0.027 | 3.16 | 483
hcsB | APO_0367 | M | Capsule polysaccharide modification protein | 18.72 | 0.010 | 0.034 | 18.52 | 13, 136, 366
ompP2B | APO_0561 | M | Outer membrane protein P2-like protein | 16.15 | 0.026 | 0.095 | 8.07 | 306, 317, 320
mscL | APO_1465 | M | Large-conductance mechanosensitive channel | 12.17 | 0.096 | 0.044 | 103.40 | 97
ptrA | APO_0039 | O | Protease III | 17.69 | 0.013 | 0.049 | 6.26 | 43, 44, 50, 91, 617, 909
glnD | APO_0064 | O | uridylyltransferase | 17.93 | 0.013 | 0.024 | 12.23 | 17, 190, 514, 551
sppA | APO_0149 | O | Protease 4 | 10.42 | 0.173 | 0.030 | 9.37 |
lonH | APO_0207 | O | Lon protease | 15.24 | 0.035 | 0.013 | 106.47 | 19
nrfG | APO_0983 | O | Formate-dependent nitrite reductase complex subunit | 12.29 | 0.096 | 0.084 | 27.77 |
tehA | APO_0711 | P | Tellurite resistance protein | 11.70 | 0.106 | 0.007 | 70.81 | 252
cysW | APO_0883 | P | Sulfate transport system permease protein | 13.80 | 0.059 | 0.074 | 3.68 | 145
- | APO_0030 | R | Helicase | 12.14 | 0.096 | 0.051 | 2.56 |
pqiB | APO_0055 | R | Paraquat-inducible protein B | 20.31 | 0.008 | 0.046 | 3.62 | 259, 318, 384
tldD | APO_0268 | R | Protease | 10.16 | 0.189 | 0.009 | 15.91 | 377, 379
- | APO_0269 | R | Hypothetical protein | 10.02 | 0.189 | 0.000 | 1.00 | 463
pcaK | APO_0359 | R | Major facilitator transporter | 12.09 | 0.096 | 0.022 | 5.96 | 195, 238, 416, 417
thiH | APO_0509 | R | Thiazole biosynthesis protein | 18.32 | 0.011 | 0.012 | 12.98 | 15, 55
murQ | APO_0754 | R | N-acetylmuramic acid 6-phosphate etherase | 17.69 | 0.013 | 0.059 | 4.05 | 54, 99, 293
- | APO_0792 | R | Nucleoside-diphosphate sugar epimerase | 11.92 | 0.102 | 0.060 | 9.71 |
rssA | APO_0839 | R | Patatin | 11.22 | 0.132 | 0.019 | 41.55 | 225
cof | APO_0888 | R | Hydrolases of the HAD superfamily | 10.62 | 0.160 | 0.000 | 25.15 |
smtA | APO_0969 | R | Methyltransferase | 10.66 | 0.160 | 0.045 | 101.87 | 134, 204
aphA | APO_1107 | R | Membrane protein affecting hemolysin expression | 10.83 | 0.153 | 0.012 | 24.96 | 46
recX | APO_1397 | R | Regulatory protein | 19.96 | 0.008 | 0.008 | 37.09 | 47
ytfN | APO_0025 | S | Hypothetical protein | 14.07 | 0.054 | 0.002 | 239.60 | 553
- | APO_0098 | S | Oligopeptide transporter | 20.79 | 0.007 | 0.012 | 15.91 | 82, 127, 546
glnE | APO_0041 | T | Glutamate-ammonia-ligase adenylyltransferase | 11.88 | 0.102 | 0.023 | 3.60 | 53
typA | APO_0150 | T | GTP-binding protein | 33.59 | 0.000 | 0.006 | 15.43 | 395, 567, 569
apaA | APO_0772 | T | ABC transporter, periplasmic binding protein | 13.49 | 0.062 | 0.031 | 7.10 | 103, 147
- | APO_0051 | - | Hypothetical protein | 19.00 | 0.010 | 0.088 | 5.16 | 105
guaB | APO_0259 | - | Inosine-5'-monophosphate dehydrogenase | 15.25 | 0.035 | 0.013 | 4.98 | 448
- | APO_0393 | - | Hypothetical protein | 24.92 | 0.002 | 0.027 | 57.62 | 97, 104, 134, 355
- | APO_0571 | - | Hypothetical protein | 12.27 | 0.096 | 0.020 | 28.21 | 199, 275, 283
wecF | APO_0577 | - | TDP-Fuc4NAc:lipid II Fuc4NAc transferase | 22.68 | 0.005 | 0.035 | 14.23 | 70, 92, 182, 183, 289

a Protein designations were taken from the A. pleuropneumoniae ortholog annotation that was summarized in our recent publication [19].

b The abbreviations of COG function categories were assigned based on Figure 1.

c 2Δℓ denotes the likelihood ratio test statistic.

d The proportion of the amino acid sites under positive selection.

e ω is the ratio of dN to dS (number of nonsynonymous changes per nonsynonymous site / number of synonymous changes per synonymous site) for amino acid sites under positive selection (model M2a).

f Positively selected sites identified with posterior probability P > 95%.

Genes that show evidence for positive Darwinian selection

Notably, the dS values did not differ appreciably between genes under positive selection and the remaining genes, whereas the dN values, the number of informative sites, and genetic diversity were all significantly higher in the positively selected genes (P < 0.001, P < 0.001, P = 0.023; one-sided U-test). No association between positive selection and COG categories was observed, as the number of positively selected genes is low in each role category. Among the 57 positively selected genes, 24 also showed significant evidence for homologous recombination in all four recombination tests. Furthermore, 41 genes under positive selection showed evidence for recombination in at least one test. This overlap indicates that signals of positive selection are associated with intragenic recombination, as recombination can cause phylogenetic incongruence and a high rate of false positives when selective pressure on protein-coding sequences is estimated [3,29].
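To make the 2Δℓ column of the table easier to read, the following minimal sketch converts a likelihood ratio statistic into an uncorrected p-value. It assumes the statistic is compared against a chi-square distribution with 2 degrees of freedom, which is the usual convention when model M2a is tested against its nearly neutral null model; this section does not state the degrees of freedom, so treat that choice as an assumption. For 2 degrees of freedom the upper-tail probability is exactly exp(-x/2), so no statistics library is needed. The function name is hypothetical, and the resulting p-values are not the q-values reported in the table, which additionally account for multiple testing.

```python
import math


def lrt_p_value_df2(stat: float) -> float:
    """Upper-tail p-value of a chi-square distribution with 2 degrees of freedom.

    Assumption: the likelihood ratio statistic 2*delta_lnL is referred to
    chi-square(df=2). For df=2 the survival function is exactly exp(-x/2).
    """
    if stat < 0:
        raise ValueError("a likelihood ratio statistic cannot be negative")
    return math.exp(-stat / 2.0)


# Statistics copied from the table rows for typA, recX and tldD.
examples = [("typA", 33.59), ("recX", 19.96), ("tldD", 10.16)]

for gene, stat in examples:
    p = lrt_p_value_df2(stat)
    print(f"{gene}: 2*delta_lnL = {stat:.2f}, uncorrected p = {p:.2e}")
```

Larger statistics give smaller p-values, which matches the ordering of the q-values for these three genes in the table.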

Discussion

Gene acquisitions and losses that contribute to the virulence and serotypic diversification of A. pleuropneumoniae have been described in detail [18,19], but small genetic variations caused by positive selection and homologous recombination, which also influence the evolutionary trajectories of protein-coding genes, have not yet been well studied in this swine pathogen. In this report, we chose 12 genomes of A. pleuropneumoniae to study the evolutionary driving forces acting on the core genome of this animal pathogen using a comparative phylogenomic approach.

Intragenic recombination and positive selection both play a key role in the evolution of A. pleuropneumoniae pan-genome

Tests for intragenic homologous recombination and positive selection were performed on 1,587 orthologous genes present in the core genome of twelve strains of A. pleuropneumoniae. Overall, our results indicated that about a quarter of the genes in the A. pleuropneumoniae core genome showed significant evidence for intragenic recombination. For comparison, core-genome recombination is also evident in two species of the genus Streptococcus: 18% and 37% of the core genome of S. agalactiae and S. pyogenes, respectively, showed evidence for homologous recombination [5]. Notably, in A. pleuropneumoniae, two COG categories, "Replication, recombination and repair" and "Amino acid transport and metabolism", both of which presented high values of dS and dN, were favored by intragenic recombination. On the other hand, 57 A. pleuropneumoniae genes, accounting for approximately 3.6% of the core genome, were identified as undergoing positive selection. A similar study of genes under positive selection in E. coli reported that 0.7% of 3,505 genes found in six E. coli genomes showed evidence for positive selection and no evidence for recombination [1]. As in other pathogenic bacteria, a substantial number of positively selected genes in A. pleuropneumoniae encode protein products involved in the biogenesis and structural components of the bacterial cell wall and/or outer membrane. These genes are likely to be associated with co-evolutionary arms races between pathogenic microorganisms and their hosts. To further decipher the roles of evolutionary pressure operating on the core genome of A. pleuropneumoniae, we analyzed the functional properties of the positively selected genes and the potentially important residues subject to positive selection.
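The proportions quoted in the text can be checked with a few lines of arithmetic. The sketch below uses only counts stated in the article (1,587 core genes, 57 positively selected genes, of which 24 were flagged by all four recombination tests and 41 by at least one); the variable and function names are illustrative.

```python
# Counts taken from the text of the article.
CORE_GENES = 1587
POSITIVELY_SELECTED = 57
SELECTED_RECOMB_ALL_FOUR_TESTS = 24
SELECTED_RECOMB_ANY_TEST = 41


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


share_selected = percent(POSITIVELY_SELECTED, CORE_GENES)
share_all_four = percent(SELECTED_RECOMB_ALL_FOUR_TESTS, POSITIVELY_SELECTED)
share_any_test = percent(SELECTED_RECOMB_ANY_TEST, POSITIVELY_SELECTED)

print(f"Positively selected share of core genome: {share_selected:.1f}%")
print(f"Selected genes recombinant in all four tests: {share_all_four:.1f}%")
print(f"Selected genes recombinant in at least one test: {share_any_test:.1f}%")
```

The first figure reproduces the approximately 3.6% stated above; the other two express the overlap between positive selection and recombination as shares of the 57 selected genes.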

Genes subject to positive selection in A. pleuropneumoniae

We found that many protein products encoded by the positively selected genes are exposed on the cell surface or are structural constituents of the bacterial cell wall. Some of these proteins have been reported to be important virulence factors associated with bacterial adherence, colonization and persistence. This suggests that genes under diversifying selection may interact dynamically with the host immune and defense systems.

Beta barrel porins are pore proteins that allow the passive diffusion of small, hydrophilic, or charged molecules across Gram-negative bacterial outer membranes [48]. These pore proteins are believed to be crucial not only for dynamic interactions with the host immune system but also for bacterial pathogenesis [1,49]. The outer membrane protein OmpP2, which was predicted to be a beta barrel porin, showed strong evidence for positive selection with a low q-value (Table 3). The Bayes empirical Bayes (BEB) analyses showed that A. pleuropneumoniae OmpP2 amino acid residues 306, 317, and 320 were subject to intense positive selective pressure (Figure 2). All three residues are located on a predicted extracellular loop near the C-terminus and may be associated with a potential antigenic epitope. In addition, OmpP2 has been experimentally confirmed to be essential for in vivo survival of A. pleuropneumoniae by signature-tagged mutagenesis, and to be an immunogenic surface antigen by an immunoproteomic approach [50,51]. In our initial selection analyses of a set of 1,960 genes, fepA, which is present in 11 A. pleuropneumoniae genomes and encodes a beta barrel porin (Figure 2), was also identified as showing evidence for positive selection (data not shown). FepA of A. pleuropneumoniae shares a TonB-dependent receptor plug domain (PF07715) with the E. coli outer membrane protein FepA, which is a receptor for ferric enterobactin and for colicins B and D [52]. FepA of A. pleuropneumoniae has already been reported to exhibit immunogenic activity [53]. The adaptive changes in both porins might help A. pleuropneumoniae escape the host immune system and attack by phages, antibiotics, and colicins.

Figure 2

Three-dimensional structural models of beta barrel porins OmpC and FepA. Orange spheres stand for amino acid sites that are subject to strong positive selection (posterior probability > 95%).

Bacterial surface polysaccharides, which are often involved in adherence and colonization, may be directly exposed to host immune pressure. Three A. pleuropneumoniae genes (hcsA, hcsB, and wecF) that participate in the biogenesis of surface polysaccharides showed significant evidence for positive selection. The positively selected genes hcsA and hcsB code for capsule polysaccharide modification proteins that share 63% and 64% identity with Haemophilus influenzae HcsA and HcsB, respectively; these proteins facilitate transport of capsular polysaccharide across the outer membrane and are essential for bacterial virulence [54]. In addition, the positively selected gene wecF codes for a 4-alpha-L-fucosyltransferase and is located in a wec locus whose colinearity is highly conserved in all A. pleuropneumoniae genomes. The products of wecF and the other wec genes exhibit high similarity to the E. coli K12 homologues that are involved in the assembly of a cell surface glycolipid [55]. Another gene, apaA, which encodes an antigenic membrane lipoprotein that could provide cross-protection against heterologous A. pleuropneumoniae serovars [56], was also under strong positive selection (q-value = 0.062). These analyses indicate that the positively selected genes involved in the biosynthesis and structural composition of the cell surface and cell wall have undergone adaptive functional changes, perhaps allowing bacterial pathogens to escape recognition by the host immune system and by phages. Such phenomena have already been proposed in previous studies of natural selection on the E. coli genome [1,14].

The proteases of A. pleuropneumoniae have been reviewed as important virulence factors that contribute to pathogenesis [57]. Overall, four protease genes (ptrA, lonH, sppA and tldD) showed significant evidence for positive selection. To our knowledge, the precise function of these protease genes is not well understood for this pathogen. However, proteolytic enzymes are pivotal to the secretion processes of Gram-negative pathogens, and several of them have been described as attractive drug targets in other pathogens, e.g. ClpP [58] and Lon [59].

Conclusion

Our findings indicate that intragenic homologous recombination and positive Darwinian selection play crucial roles in the evolution of pathogenic A. pleuropneumoniae. Across a broad range of functional categories, we found that genes involved in the formation of the cell surface and membrane are favored by positive selective pressure. The adaptive changes in these positively selected genes and/or residues are likely attributable to dynamic interactions with the host immune and defense systems. Diversifying selection on genes encoding metabolic functions may also be advantageous, improving bacterial fitness in response to a variety of environmental signals. More experimental work is required to verify the functions of these adaptive genes. Overall, the genetic evidence of positive selection provides promising targets for further research into the mechanisms of immune evasion and host-pathogen interaction in A. pleuropneumoniae.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

ZX carried out the data collection and data analyses and wrote the manuscript. ZX and RZ participated in the study design and revised the manuscript. RZ and HC supervised and coordinated the project. All authors read, edited and approved the final manuscript.

Additional file 1

Highly conserved genes in the core genome of A. pleuropneumoniae. Detailed information for each gene alignment is provided, including nucleotide diversity, informative sites, and codon bias.

Additional file 2

Detailed information on the tests of recombination. A. pleuropneumoniae genes showing evidence for recombination detected by at least one method (FDR < 10%).

Additional file 3

Alignments for positively selected genes. Compressed file containing all alignments for genes under positive selection (FASTA format).