Yeşim Soyer, Renato H Orsi, Lorraine D Rodriguez-Rivera, Qi Sun, Martin Wiedmann.
Abstract
BACKGROUND: The bacterium Salmonella enterica includes a diversity of serotypes that cause disease in humans and different animal species. Some Salmonella serotypes show a broad host range, some are host restricted and exclusively associated with one particular host, and some are associated with one particular host species but are able to cause disease in other host species and are thus considered "host adapted". Five Salmonella genome sequences, representing a broad host range serotype (Typhimurium), two host restricted serotypes (Typhi [two genomes] and Paratyphi) and one host adapted serotype (Choleraesuis), were used to identify core genome genes that show evidence for recombination and positive selection.
Year: 2009 PMID: 19912661 PMCID: PMC2784778 DOI: 10.1186/1471-2148-9-264
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Salmonella genomes used in this study
| Serotype | No. of ORFs | Accession No. | Sequencing Center | Reference |
|---|---|---|---|---|
| Choleraesuis | 4801 | | Chang Gung Univ. | [ |
| Paratyphi A | 4093 | | Washington Univ. | [ |
| Typhi CT18 | 4395 | | Sanger Centre/Imperial College | [ |
| Typhi Ty2 | 4323 | | Univ. of Wisconsin | [ |
| Typhimurium | 4553 | | Washington University Consort. | [ |
Figure 1. Example of a neighbor-joining tree used for positive selection analysis. Gene-specific trees were used for all positive selection analyses. The tree shown here represents the phylogeny of 849 genes. Branches used for branch-specific analyses are indicated; Ch# = Choleraesuis branch-specific test; Ty# = Typhi branch-specific test; Tym# = Typhimurium branch-specific test; Pty# = Paratyphi A branch-specific test.
Genes used to confirm positive selection and recombination patterns identified in genome wide analyses
| Gene name | Protein name | JCVI role category | Gene length (bp) | Genome analyses: positive selection (a) | Genome analyses: recombination (b) | Sequence analyses (c): positive selection (d) | Sequence analyses (c): recombination (e) |
|---|---|---|---|---|---|---|---|
| | 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase | Biosynthesis of cofactors, prosthetic groups, and carriers | 480 | ( | GEN, MAX | ( | GEN, MAX, NSS |
| STM3258 | Putative PTS system IIA component | Transport and binding proteins | 465 | Ty# | - | Ty# | - |
| | Probable pathogenicity island effector protein | Unclassified | 1455 | ( | GEN, MAX | ( | GEN, MAX, NSS, PHI |
| | Phosphoribosylaminoimidazole carboxylase, catalytic subunit | Purines, pyrimidines, nucleosides, and nucleotides | 510 | Ty# | - | Ty# | NSS |
(a) Positive selection tests that were significant (Q < 0.2) are listed; TO = overall test; Ch# = Choleraesuis branch-specific test; Ty# = Typhi branch-specific test. For genes that showed evidence of recombination, results are shown in parentheses because recombination may affect the positive selection analyses.
(b) Recombination tests that were significant (Q < 0.1) are listed; GEN = GENECONV; MAX = Maximum χ2; PHI = pairwise homoplasy index; NSS = neighbor similarity score.
(c) Results of positive selection and recombination analyses were based on gene sequence data for the 5 genomes and 42 additional Salmonella isolates (see Additional file 2); for folK-2 and sseC, sequences were obtained for only 36 additional isolates; for STM3258, sequences were obtained for only 37 additional isolates.
(d) Positive selection tests that were significant (P < 0.05) are listed. For genes that showed evidence of recombination with multiple tests, results are shown in parentheses because recombination may affect the positive selection analyses.
(e) Recombination tests that were significant (P < 0.05) are listed.
Genes that show evidence of recombination in all four tests (a)
| Gene annotation no. for | Protein name | Gene name | JCVI Role Category |
|---|---|---|---|
| STM0067 | Carbamoyl-phosphate synthase, large subunit | | Purines, pyrimidines, nucleosides, and nucleotides |
| STM0224 | Surface antigen | | Unknown function |
| STM0540 | Conserved hypothetical protein | - | Hypothetical proteins |
| STM0661 | Inosine-uridine preferring nucleoside hydrolase | | Purines, pyrimidines, nucleosides, and nucleotides |
| STM2287 | Conserved hypothetical protein | - | Hypothetical proteins |
| STM2660 | ATP-dependent protease, Hsp 100, part of novel | | Protein fate |
| STM2947 | Sulfite reductase (NADPH) hemoprotein beta-component | - | Central intermediary metabolism |
| STM2948 | Sulfite reductase (NADPH) flavoprotein alpha-component | - | Central intermediary metabolism |
| STM3174 | DNA topoisomerase IV, A subunit | | DNA metabolism |
| STM4066 | Fructokinase | | Energy metabolism |
(a) These genes showed evidence for recombination (Q < 0.1) in all four tests (i.e., GENECONV, Maximum χ2 [Max-χ2], pairwise homoplasy index [PHI], and neighbor similarity score [NSS]).
Figure 2. Proportions of genes with evidence of recombination among individual JCVI role categories. Genes with evidence for recombination (Q < 0.1) in at least one of the four tests were included. Bars indicate the estimated standard error for the proportion of genes with evidence of recombination in each role category; standard errors were calculated as the square root of p(1-p)/n, where p is the proportion of genes with evidence of recombination in a given role category, and n is the total number of genes in that role category. Among the 20 JCVI role categories, two did not include genes with evidence of recombination (i.e., "Signal Transduction" and "Viral functions") and are thus not included in this figure.
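The standard error described in this caption can be sketched in a few lines of Python. This is only an illustration of the stated formula, the square root of p(1-p)/n; the function name `proportion_se` and the example counts (12 of 150 genes) are hypothetical and are not taken from the study.

```python
import math


def proportion_se(k, n):
    """Return (p, SE) for k genes with evidence out of n genes in a role category.

    SE is computed as sqrt(p * (1 - p) / n), the formula given in the caption.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    p = k / n
    return p, math.sqrt(p * (1 - p) / n)


# Hypothetical role category: 12 of 150 genes show evidence of recombination.
p, se = proportion_se(12, 150)
print(f"proportion = {p:.3f}, standard error = {se:.3f}")
```

Note that this estimate shrinks to zero when p is 0 or 1, so error bars for very small categories should be read with caution.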
Evidence of recombination among genes with evidence for positive selection
| Test for positive selection (a) | No. of genes with evidence for positive selection and no evidence for recombination | Positive selection and recombination with GENECONV (b) | Positive selection and recombination with Max-χ2 (b) | Positive selection and recombination with PHI (b) | Positive selection and recombination with NSS (b) | Total no. of genes with evidence for positive selection and recombination (c) |
|---|---|---|---|---|---|---|
| TO | 5 | 12 | 7 | 3 | 4 | 13 |
| Ch# | 8 | 11 | 10 | 1 | 3 | 11 |
| Ty# | 16 | 4 | 3 | 0 | 3 | 5 |
| Tym# | 7 | 4 | 4 | 0 | 1 | 4 |
| Pty# | 5 | 8 | 7 | 1 | 2 | 8 |
(a) TO = overall test; Ch# = Choleraesuis branch-specific test; Ty# = Typhi branch-specific test; Tym# = Typhimurium branch-specific test; Pty# = Paratyphi A branch-specific test.
(b) In our preliminary analysis of 3316 orthologous genes, 81 genes showed evidence of positive selection in at least one test. Of these 81 genes, 32 also showed evidence of recombination with at least one of the four recombination tests used in our study. Statistical analysis showed that genes with evidence of recombination were more likely to be under positive selection (P < 0.0001; chi-square test). Therefore, we excluded the 270 genes with evidence of recombination from our final positive selection analysis.
(c) This column lists the number of genes that show evidence for both recombination and positive selection in a given test (e.g., TO); because many genes showed evidence of recombination in more than one recombination test, the number in this column is lower than the sum of the four recombination counts in the same row. While a total of 32 genes showed evidence of recombination and positive selection, the sum of the numbers in this column is > 32 because some genes showed evidence of positive selection in two tests.
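The chi-square comparison mentioned in footnote b can be approximated from the counts given there. The sketch below assumes that the 270 recombinant genes, the 81 positively selected genes, and the 32 genes in both groups all come from the same set of 3316 orthologs, and it uses a Pearson chi-square without continuity correction; the source does not state which variant was used, so this is an illustration and not a reproduction of the original analysis.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts taken from footnote b (assumed to describe one common gene set).
total_genes = 3316
recombinant = 270   # genes with evidence of recombination
selected = 81       # genes with evidence of positive selection
both = 32           # genes with evidence of both

a = both                              # recombination, positive selection
b = recombinant - both                # recombination, no positive selection
c = selected - both                   # no recombination, positive selection
d = total_genes - recombinant - c     # neither

chi2, p_value = chi_square_2x2(a, b, c, d)
print(f"chi-square = {chi2:.1f}, P = {p_value:.3g}")
```

Under these assumptions the proportion of positively selected genes is 32/270 among recombinant genes versus 49/3046 among the rest, which is the contrast the test evaluates.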
Genes with evidence for positive selection
| Gene annotation no. for | Gene name | Protein name (a) | Alignment length (bp) | Positive selection (b) | BEB (c) |
|---|---|---|---|---|---|
| STM1441 | - | Membrane protein, putative | 1995 | Ch# (0.0043) | - |
| STM2267 | | Outer membrane protein C precursor | 1134 | Ch# (0.0986) | 274 |
| STM0743 | - | Putative lipoprotein | 273 | Ch# (0.1830) | - |
| STM2801 | | Conserved hypothetical protein | 300 | Pty# (0.020) | - |
| STM0301 | | Outer membrane usher, | 2508 | TO (0.0104) | 85, 111, 405, 692 |
| STM4106 | | Catalase hydroperoxidase HPI(I) | 2178 | TO (0.0035) | - |
| STM1425 | | Hypothetical integral membrane protein | 1371 | Tym# (0.0145) | - |
| STM4023 | - | Putative 3-hydroxyisobutyrate dehydrogenase | 840 | Ch# (0.0138) | - |
| STM3680 | | Aldehyde dehydrogenase B | 1536 | Pty# (0.020) | - |
| STM0698 | | Phosphoglucomutase, alpha-D-glucose phosphate-specific | 1638 | Ty# (0.0157) | - |
| STM3515 | | MalT regulatory protein | 2703 | Ty# (0.0198) | 801 |
| STM4187 | | Acetate operon repressor | 819 | Ty# (0.0693) | - |
| STM0401 | | Glycosyl hydrolase, family 13 | 1815 | Tym# (0.1005) | - |
| STM3329 | - | Conserved hypothetical protein TIGR01212 | 927 | Ch# (0.1471) | - |
| STM1854 | - | Hypothetical protein | 162 | Pty# (0.1973) | 32, 40, 44, 45 |
| STM0861 | - | Conserved hypothetical protein | 471 | Tym# (0.1149) | - |
| STM1515 | - | Conserved hypothetical protein TIGR00156 domain protein | 384 | Ty# (0.0167) | - |
| STM4015 | - | Hypothetical protein | 846 | Ty# (0.0693) | - |
| STM4258 | - | Conserved hypothetical protein | 1386 | Ty# (0.0884) | - |
| STM1532 | - | Hypothetical protein | 678 | Ty# (0.1614) | - |
| STM1280 | - | Conserved hypothetical protein | 396 | Tym# (0.0145) | - |
| STM3463 | - | Conserved hypothetical protein | 201 | Tym# (0.1005) | - |
| STM0534 | | Phosphoribosylaminoimidazole carboxylase, catalytic subunit | 507 | Ty# (0.0194) | - |
| STM2806 | | NrdI protein | 408 | Ty# (0.0693) | - |
| STM2107 | | GDP-mannose mannosyl hydrolase | 435 | Tym# (0.1081) | - |
| STM1679 | | Oligopeptide ABC transporter, periplasmic oligopeptide-binding protein | 1605 | Ch# (0.0121) | - |
| STM3685 | | PTS system, mannitol-specific IIC component subfamily, putative | 1914 | TO (0.0035) | - |
| STM3258 | - | PTS system IIA component, putative | 462 | Ty# (0.0157) | 124, 139, 143, 144, 147 |
| STM3626 | | Oligopeptide ABC transporter, ATP-binding protein | 1011 | Ty# (0.0157) | - |
| STM3592 | - | Proton/peptide symporter family protein | 1470 | TO (0.0104) | - |
| STM1088 | | Pathogenicity island encoded protein: SPI5, PipB | 873 | TO (0.0688) | 173 |
| STM0248 | - | Histidinol phosphatase-related protein | 573 | Ch# (< 0.0001) | 175, 184, 185, 191 |
| STM3565 | - | Acetyltransferase, GNAT family | 381 | Pty# (0.0391) | - |
| STM3955 | | RarD protein | 879 | Ty# (0.0194) | - |
| STM1450 | - | Pyridoxal kinase | 666 | Ty# (0.0147) | - |
| STM3057 | | 2-octaprenyl-6-methoxyphenol hydroxylase, UbiH | 1176 | Tym# (0.1149) | - |
| STM2678 | | Putative membrane protein, CorE | 750 | Ch# (0.1411) | - |
| STM4242 | - | 99% identical to TraF of plasmid R64 | 1284 | Pty# (0.0329) | - |
| STM3655 | | Glycyl-tRNA synthetase, beta subunit | 2067 | Ty# (0.0393) | 313 |
| STM0603 | | Aminotransferase, class I | 1158 | Ty# (0.1976) | - |
| STM0395 | - | Exonuclease SbcC, putative | 3096 | Ty# (0.1041) | - |
(a) Protein designations were taken from the Typhi CT18 annotation; where limited annotation information was available, additional information was extracted from JCVI primary annotations and Typhimurium LT2 and Paratyphi A annotations.
(b) Tests that were significant for positive selection (FDR < 20%) are shown; TO = overall test; Ch# = Choleraesuis branch-specific test; Pty# = Paratyphi A branch-specific test; Ty# = Typhi branch-specific test; Tym# = Typhimurium branch-specific test; numbers in parentheses indicate q-values.
(c) Amino acid (aa) sites identified by Bayes Empirical Bayes (BEB) as having a probability > 95% of being under positive selection are shown; aa sites are numbered by their location in the alignment (alignments for genes under positive selection are provided as Additional file 6).
(d) Role categories were assigned based on annotations for S. Typhi CT18; JCVI locus names for Typhi CT18 for these genes are listed in Additional file 5.
(e) Other role categories include Protein synthesis (STM3655), Central intermediary metabolism (STM0603), and DNA metabolism (STM0395).
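To make the FDR < 20% threshold in footnote b concrete, the sketch below converts a list of p-values into q-values and keeps those under 0.2. The source does not say which FDR procedure produced the q-values in the table, so the Benjamini-Hochberg step-up method is used here purely as a widely known stand-in; the function name `bh_qvalues` and the example p-values are hypothetical.

```python
def bh_qvalues(p_values):
    """Benjamini-Hochberg adjusted p-values (q-values), returned in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical p-values for five genes in one branch-specific test.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
q_values = bh_qvalues(p_values)

# Genes whose q-value falls below the 20% FDR threshold used in the table.
significant = [i for i, q in enumerate(q_values) if q < 0.2]
print(q_values, significant)
```

A 20% threshold is lenient: on average up to one in five of the listed genes may be a false positive, which is why the individual q-values are reported alongside each test.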
Figure 3. Proportions of genes with evidence of positive selection among individual JCVI role categories. Only genes that showed no evidence for recombination were used to generate the data shown here. Bars indicate the estimated standard error for the proportion of genes with evidence of positive selection in each role category; standard errors were calculated as the square root of p(1-p)/n, where p is the proportion of genes with evidence of positive selection in a given role category, and n is the total number of genes in that role category. Among the 20 JCVI role categories, seven did not include genes with evidence of positive selection and are thus not included in this figure. For each role category, the proportions of genes with evidence of positive selection in the overall test (TO) and in each of the four branch-specific tests (Ch# = Choleraesuis branch-specific test; Ty# = Typhi branch-specific test; Tym# = Typhimurium branch-specific test; Pty# = Paratyphi A branch-specific test) are shown.
Salmonella pathogenicity island (SPI) genes with evidence of positive selection and recombination
| SPI (a) | Location (b) | No. of orthologous genes found in SPI | No. of genes with evidence for positive selection | No. of genes with evidence for recombination |
|---|---|---|---|---|
| 1 | STM2865-2914 | 34 | 0 | 1 ( |
| 2 | STM1379-1422 | 32 | 0 | 3 ( |
| 3 | STM3752-3764 STM3766-3775 | 5 | 0 | 0 |
| 4 | STM4257-4262 | 7 | 1 ( | 0 |
| 5 | STM1087-1094 | 4 | 1 ( | 0 |
| 6 | NT03ST0297-0356 | 20 | 1 ( | 4 ( |
(a) This table lists genes in the Salmonella pathogenicity islands (SPIs) 1 to 6.
(b) Genes in SPIs 1 to 5 are reported as described by [53] using primary annotation locus numbers for Salmonella Typhimurium LT2; genes in SPI-6 are reported as described by [54] using JCVI locus numbers for Salmonella Typhi CT18.