Kathryn E Holt, Nicholas R Thomson, John Wain, Gemma C Langridge, Rumina Hasan, Zulfiqar A Bhutta, Michael A Quail, Halina Norbertczak, Danielle Walker, Mark Simmonds, Brian White, Nathalie Bason, Karen Mungall, Gordon Dougan, Julian Parkhill.
Abstract
BACKGROUND: Of the >2000 serovars of Salmonella enterica subspecies I, most cause self-limiting gastrointestinal disease in a wide range of mammalian hosts. However, S. enterica serovars Typhi and Paratyphi A are restricted to the human host and cause the similar systemic diseases typhoid and paratyphoid fever. Genome sequence similarity between Paratyphi A and Typhi has been attributed to convergent evolution via relatively recent recombination of a quarter of their genomes. The accumulation of pseudogenes is a key feature of these and other host-adapted pathogens, and overlapping pseudogene complements are evident in Paratyphi A and Typhi.
Year: 2009 PMID: 19159446 PMCID: PMC2658671 DOI: 10.1186/1471-2164-10-36
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Insertion/deletion events between Paratyphi A AKU_12601 and ATCC9150
| Effect | Gene | Event |
| --- | --- | --- |
| pseudo-forming | | 217 bp del |
| pseudo-forming | | 1 bp del (homopol) |
| pseudo-forming | | 95 bp del |
| pseudo-forming | | 1338 bp ins (IS10) |
| pseudo-forming | | 1 bp del (homopol) |
| pseudo-forming | | 171 bp del |
| pseudo-forming | | 7 bp del |
| pseudo-forming | | 1 bp ins (homopol) |
| pseudo-forming | | 1 bp ins (homopol) |
| pseudo-forming | | 2 bp ins |
| pseudo-forming | SSPA3202 | 1 bp ins (homopol) |
| pseudo-forming | | 352 bp del |
| pseudo-forming | | 1 bp ins (homopol) |
| pseudo-forming | | 1 bp del (homopol) |
| pseudo-forming | | 1 bp del |
| pseudo-forming | | 1 bp ins (homopol) |
| already pseudo | SSPA4008a | 1338 bp ins (IS10) |
| coding change | | 6 bp VNTR |
| coding change | | 12 bp VNTR |
| coding change | | 9 bp del |
| coding change | SSPA3558a | 10 bp del |
| coding change | | 3 bp del |
| coding change* | SSPA0733,4,5 | 5770 bp VNTR |
| intergenic | - | 1 bp ins (homopol) |
| intergenic | - | 1 bp ins (homopol) |
| intergenic | - | 3 bp ins (homopol) |
| intergenic | - | 1 bp ins |
| intergenic | - | 1 bp ins |
| intergenic | - | 1 bp ins |
| intergenic | - | 1 bp ins |
| intergenic | - | 1 bp ins |
| intergenic | - | 122 bp VNTR |
| RNA | | 175 bp VNTR |
| rRNA | | 1 bp del (homopol) |
| rRNA | | 1 bp ins (homopol) |
| rRNA | | 1 bp ins |
| sRNA | | 1 bp ins (homopol) |
| tRNA | | 7 bp del |
| tRNA | tRNA- | 220 bp VNTR |
Del – deletion; ins – insertion; homopol – insertion or deletion in homopolymeric sequence; IS10 – IS10 transposase insertion; VNTR – variable number tandem repeat. Pseudo-forming – mutation results in formation of a pseudogene in one strain; coding change – mutation results in a change to the translated amino acid sequence but retains the reading frame; * fusion of two genes, retaining the reading frame.
Figure 1. Tandem repeats in the O-antigen biosynthesis cluster in Paratyphi A ATCC9150. Bottom row: gene arrangement in Paratyphi A AKU_12601 and Typhi, presumed to be the ancestral form. Top row: gene arrangement in Paratyphi A ATCC9150, apparently resulting from two tandem duplications. Labels give systematic identifiers for the gene sequences in each genome; identical coding sequences are shown in the same colours, and identical sequences are joined by lines.
Inactivating mutations unique to either AKU_12601 or ATCC9150
| Gene | Mutation | Strain | Product |
| --- | --- | --- | --- |
| | del | AKU_12601 | probable acyl Co-A dehydrogenase |
| | 1 bp del (homopol) | ATCC9150 | asparagine synthetase B |
| | 88 bp del | AKU_12601 | cytochrome c-type biogenesis protein H2 |
| | nonsense SNP | AKU_12601 | glutamate/aspartate transport system permease |
| | IS10 ins | AKU_12601 | outer membrane porin |
| | 1 bp del (homopol) | ATCC9150 | propanediol diffusion facilitator |
| | 171 bp del | AKU_12601 | propanediol dehydratase reactivation protein |
| | 7 bp del | ATCC9150 | ProP effector |
| | 1 bp ins (homopol) | ATCC9150 | high affinity ribose transport protein |
| | 1 bp ins (homopol) | ATCC9150 | ribose operon repressor |
| | 2 bp ins | ATCC9150 | putative ATP-dependent RNA helicase |
| SSPA1447 | nonsense SNP | AKU_12601 | putative oxidoreductase |
| SSPA3202 | 1 bp ins (homopol) | AKU_12601 | putative lipoprotein |
| SSPA3581 | nonsense SNP | AKU_12601 | conserved hypothetical protein |
| | 352 bp del | ATCC9150 | acyl-CoA thioesterase II |
| | nonsense SNP | AKU_12601 | anthranilate synthase component II |
| | 1 bp ins (homopol) | AKU_12601 | putative glycosyl transferase |
| | 1 bp del | ATCC9150 | putative amino-acid transport protein |
| | 1 bp del | ATCC9150 | conserved hypothetical protein |
| | 1 bp ins (homopol) | ATCC9150 | putative inner membrane protein |
| | nonsense SNP | ATCC9150 | putative transport system protein |
| | nonsense SNP | ATCC9150 | putative membrane protein |
Insertion, deletion or substitution events identified between Paratyphi A strains AKU_12601 and ATCC9150, causing gene inactivation in one strain. Strain – Paratyphi A strain in which the inactivating mutation occurs; del – deletion; ins – insertion; homopol – variation in homopolymeric sequence.
Pseudogenes shared between Paratyphi A and Typhi
| Group | SSPA | STY | Gene name | Product | Div. |
| --- | --- | --- | --- | --- | --- |
| i^ | 0062a | n/a | - | putative viral protein | - |
| i^ | 0255a | n/a | - | putative uncharacterized protein | - |
| i | 1103 | 1362 | - | Pertussis toxin subunit S1 related protein | 1.22% |
| i^ | 1699a | 0971 | | *secreted effector protein SopD homolog | 1.73% |
| i^ | 2014 | 0610 | | *putative inner membrane proton/cation antiporter | 1.08% |
| i^ | 2014a | 0609a | | *putative copper-ion sensor protein | 0.18% |
| i^ | 3229 | 4202 | - | putative phosphosugar-binding protein | 0.14% |
| i | 3640 | 3800 | | CDP-diacylglycerol pyrophosphatase | 2.32% |
| i^ | 3888 | 4728a | - | putative uncharacterized protein | 1.35% |
| i | | | | | |
| ii | 0097 | 0113 | - | *putative secreted protein | 0.25% |
| ii | 0431b | 2631 | - | putative IS transposase | 0.24% |
| ii | 0754a | 2275 | | *secreted effector protein | 0.23% |
| ii | 3228 | 4203 | - | putative L-asparaginase | 0.14% |
| ii | 3365a | 4037 | | putative uncharacterized protein (SPI-3) | 0.14% |
| iii | 0192a | 0218 | | *ferrichrome-iron receptor precursor | 23.95% |
| iii | 0317a | 2775 | - | putative anaerobic dimethylsulfoxide reductase component | 1.79% |
| iii | 0329a | 2762 | | *putative invasin (CS54) | 1.17% |
| iii | 0331a | 2758 | | *putative lipoprotein (CS54) | 1.67% |
| iii | 0331b | 2755 | | *putative uncharacterized protein (CS54) | 2.11% |
| iii | 0621a | 2422 | | *galactoside transport ATP-binding protein | 1.09% |
| iii | 0720a | 2311 | | *putative extracellular polysaccharide biosynthesis protein | 1.82% |
| iii | 0756a | 2268 | | penicillin-binding protein | 2.19% |
| iii | 0850a | 2166 | | lysine-N-methylase | 3.11% |
| iii | 0943a | 1995 | - | transposase | 4.77% |
| iii | 1014a | 1913 | | hydrogenase-1 small subunit | 0.33% |
| iii | 1220a | 1508 | - | *putative transport protein | 1.31% |
| iii | 1367a | 1739 | - | putative ribokinase (SPI-2) | 1.42% |
| iii | 1531a | 1244 | | *FhuE receptor precursor | 0.96% |
| iii | 1642a | 1104 | - | *putative secreted protein | 1.54% |
| iii | 1820a | 0833 | | *secreted effector protein | 1.95% |
| iii | 2045a | 0569 | | *putative allantoin transporter | 1.19% |
| iii | 2301a | 0333 | | *probable lipoprotein (SPI-6 fimbrial cluster) | 1.52% |
| iii | 3388a | 4007 | - | putative cytoplasmic protein | 1.12% |
| iii | 3636a | 3805 | - | *putative permease of the Na+:galactoside symporter family | 2.42% |
| iii | 3828b | 4503 | | anaerobic dimethyl sulfoxide reductase chain A | 0.22% |
| iii | 3998a | 4839 | | *putative fimbrial protein (SPI-10) | 0.18% |
| iii, iv1 | 1197a | 1486 | | respiratory nitrate reductase 2 delta chain | 1.74% |
| iii, iv2 | 2900a | 3421 | | *putative transport system protein | 0.58% |
| iv3 | 0708 | 2328 | | putative uncharacterized protein | 1.35% |
(i) Ancestral pseudogenes (shared by virtue of inheritance of an inactivated gene from a common ancestor); ^ intact in Typhimurium. Note that 30 ancestral pseudogenes encoding phage or transposase genes are excluded here but listed in Additional file 1. (ii) Recombined pseudogenes (shared by recombination). (iii) Recent conserved pseudogenes (independent inactivating mutations in each serovar). (iv) Recent strain-specific pseudogenes (pseudogenes in some but not all strains belonging to their respective serovar): iv1, pseudogene in Paratyphi A (both strains) and Typhi Ty2; iv2, pseudogene in Typhi (both strains) and Paratyphi A ATCC9150; iv3, pseudogene in Paratyphi A ATCC9150 and Typhi CT18. SSPA and STY – systematic identifiers in Paratyphi A AKU_12601 and Typhi CT18 respectively; n/a – not annotated. For genes lying in Salmonella pathogenicity islands (SPIs) the island is indicated in brackets after the gene product. Div. – nucleotide divergence reported in [15].
Figure 2. Scenarios of recombination and pseudogene formation in Paratyphi A and Typhi. (a) True distribution of pseudogenes in the Paratyphi A AKU_12601 and Typhi CT18 genomes (gene order based on gene co-ordinates in Typhi CT18). (b-c) Distribution of pseudogenes resulting from data simulated under two scenarios, under both of which 40 pseudogenes are inherited from the most recent common ancestor of Paratyphi A and Typhi, and extensive accumulation of pseudogenes occurs before or after recombination of 25% of genes. For ease of simulation, the recombination shown is uni-directional, but bi-directional exchange would result in similar patterns. (b) Scenario 1: 150 additional pseudogenes accumulate in each serovar, followed by recombination. (c) Scenario 2: only 20 additional pseudogenes arise before recombination, after which a further 150 pseudogenes accumulate in each serovar.
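The two scenarios in this legend can be explored with a toy simulation. The sketch below is not the authors' procedure. It assumes a hypothetical genome of 4000 genes, treats the recombined 25% as a random set of genes rather than contiguous regions, and uses an illustrative function name (`simulate`). Only the counts 40, 150, 20 and 25% come from the legend.

```python
import random

def simulate(n_before, n_after, n_genes=4000, n_ancestral=40,
             frac_recombined=0.25, seed=0):
    """Count shared pseudogenes under one scenario.

    n_genes and the random (non-contiguous) recombined set are assumptions
    made for illustration; they are not taken from the paper.
    """
    rng = random.Random(seed)
    genes = range(n_genes)

    def accumulate(pseudo, k):
        # Inactivate k genes that are still intact in this serovar.
        intact = [g for g in genes if g not in pseudo]
        pseudo.update(rng.sample(intact, k))

    # Both serovars start with the same ancestral pseudogenes.
    ancestral = set(rng.sample(genes, n_ancestral))
    donor, recipient = set(ancestral), set(ancestral)
    accumulate(donor, n_before)
    accumulate(recipient, n_before)

    # Uni-directional recombination: for the recombined genes, the recipient
    # takes the donor's state (intact or pseudogene).
    recombined = set(rng.sample(genes, int(frac_recombined * n_genes)))
    recipient = (recipient - recombined) | (donor & recombined)

    accumulate(donor, n_after)
    accumulate(recipient, n_after)

    shared = donor & recipient
    return {
        "shared_recombined": len(shared & recombined),
        "shared_elsewhere": len(shared - recombined),
    }

scenario_1 = simulate(n_before=150, n_after=0)   # accumulate, then recombine
scenario_2 = simulate(n_before=20, n_after=150)  # recombine, then accumulate
print("Scenario 1:", scenario_1)
print("Scenario 2:", scenario_2)
```

Under these assumptions, scenario 1 should concentrate shared pseudogenes in the recombined genes, because every donor pseudogene transferred there becomes shared. Scenario 2 should leave far fewer shared pseudogenes there, since most inactivation happens independently after the exchange.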
Distribution of serovar-specific and shared pseudogenes in recombined regions
| Typhi-specific | 114 | 39 | 0.33 (p-value = 0.57) |
| Paratyphi A-specific | 92 | 24 | 1.63 (p-value = 0.20) |
| Shared | 46 | 20 |
Pearson χ² tests were performed separately for each serovar based on the two-way contingency table obtained from the respective serovar-specific row and shared row.
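As a rough check on the test described above, each serovar-specific row can be paired with the shared row to form a 2×2 table, and the statistic computed directly. This is a minimal sketch. The source says only "Pearson χ² tests", so the use of Yates' continuity correction is an assumption, chosen because it appears to reproduce the reported values. The function name `yates_chi2_2x2` is illustrative.

```python
import math

def yates_chi2_2x2(a, b, c, d):
    """Chi-square statistic and p-value (1 df) for the table [[a, b], [c, d]].

    Applies Yates' continuity correction (an assumption, see lead-in).
    """
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

shared_row = (46, 20)  # counts from the "Shared" row of the table
serovar_rows = {
    "Typhi-specific": (114, 39),
    "Paratyphi A-specific": (92, 24),
}

for label, row in serovar_rows.items():
    chi2, p_value = yates_chi2_2x2(*row, *shared_row)
    print(f"{label}: chi2 = {chi2:.2f}, p = {p_value:.2f}")
# Typhi-specific: chi2 = 0.33, p = 0.57
# Paratyphi A-specific: chi2 = 1.63, p = 0.20
```

Neither p-value is small, so on these counts the serovar-specific and shared pseudogenes do not differ detectably in how they are split between the two count columns.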
Figure 3. Pseudogene formation in the evolutionary histories of Paratyphi A and Typhi. Phylogenetic tree based on multiple alignments of all nonrecombined genes as defined in [15], rooted using S. bongori and E. coli as outgroups. Scale bar is nucleotide divergence. The timing of the recombination between Paratyphi A and Typhi is an approximation inferred from published divergence data [15]. Group (i) pseudogenes were inactivated prior to the divergence of Paratyphi A and Typhi; some are also inactivated in Typhimurium and Paratyphi B. Following their divergence, Paratyphi A and Typhi likely accumulated few additional pseudogenes. During the recombination of 23% of their genomes (direction of transfer unknown), 18 pseudogene sequences were shared between Paratyphi A and Typhi, including five non-ancestral pseudogenes (group ii). Many pseudogenes, including most group (iii) pseudogenes, were formed during a period of accelerated pseudogene accumulation in both serovars. Pseudogenes continue to accumulate in individual sub-lineages after the most recent common ancestor of each serovar (group iv).