Siew Hoon Sim, Yiting Yu, Chi Ho Lin, R Krishna M Karuturi, Vanaporn Wuthiekanun, Apichai Tuanyok, Hui Hoon Chua, Catherine Ong, Sivalingam Suppiah Paramalingam, Gladys Tan, Lynn Tang, Gary Lau, Eng Eong Ooi, Donald Woods, Edward Feil, Sharon J Peacock, Patrick Tan.
Abstract
Natural isolates of Burkholderia pseudomallei (Bp), the causative agent of melioidosis, can exhibit significant ecological flexibility that is likely reflective of a dynamic genome. Using whole-genome Bp microarrays, we examined patterns of gene presence and absence across 94 South East Asian strains isolated from a variety of clinical, environmental, or animal sources. 86% of the Bp K96243 reference genome was common to all the strains, representing the Bp "core genome", comprising genes largely involved in essential functions (e.g. amino acid metabolism, protein translation). In contrast, 14% of the K96243 genome was variably present across the isolates. This Bp accessory genome encompassed multiple genomic islands (GIs), paralogous genes, and insertions/deletions, including three distinct lipopolysaccharide (LPS)-related gene clusters. Strikingly, strains recovered from cases of human melioidosis clustered on a tree based on accessory gene content, and were significantly more likely to harbor certain GIs compared to animal and environmental isolates. Consistent with the inference that the GIs may contribute to pathogenesis, experimental mutation of BPSS2053, a GI gene, reduced microbial adherence to human epithelial cells. Our results suggest that the Bp accessory genome is likely to play an important role in microbial adaptation and virulence.
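The core/accessory split described in the abstract can be illustrated as a partition of a gene presence/absence matrix. The following is a minimal sketch only: the function name, gene labels and presence calls are hypothetical, and the study's actual calls were derived from microarray hybridization data rather than a ready-made boolean table. The final two lines simply check the reported 86%/14% split against the gene counts given in the enrichment table (4619 core and 750 accessory genes out of 5369).

```python
def split_core_accessory(presence):
    """Partition genes into core (present in every strain) and accessory (absent in at least one)."""
    core, accessory = [], []
    for gene, calls in presence.items():
        (core if all(calls) else accessory).append(gene)
    return core, accessory

# Hypothetical toy data: 4 genes x 5 strains (True = gene detected in that strain).
toy = {
    "geneA": [True, True, True, True, True],
    "geneB": [True, False, True, True, True],
    "geneC": [True, True, True, True, True],
    "geneD": [False, False, True, False, True],
}
core, accessory = split_core_accessory(toy)
core_fraction = len(core) / len(toy)

# Gene counts taken from the enrichment table in this record.
core_pct = round(100 * 4619 / 5369)
accessory_pct = round(100 * 750 / 5369)
```

With the table's counts, `core_pct` and `accessory_pct` round to the 86% and 14% quoted in the abstract.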
Year: 2008 PMID: 18927621 PMCID: PMC2564834 DOI: 10.1371/journal.ppat.1000178
Source DB: PubMed Journal: PLoS Pathog ISSN: 1553-7366 Impact factor: 6.823
Figure 1. The Core and Accessory Genomes of Bp.
Chromosome 1 is on the left and Chromosome 2 on the right. Both chromosomes are centered around the origin of replication. From outside to inside: Red - Computationally-identified GIs (12 on Chr 1 and 4 on Chr 2) (33); Accessory (Blue) and Core (Yellow) Genes; Internal red - False Discovery Values as assessed by GMM - A red peak indicates high variability in that genomic region (see Methods). Black arrows - Representative examples of novel indels.
Enriched Functions of Core and Accessory Genes in Bp.
| | Gene Distribution | | | |
| | Accessory (A) | Core (C) | Total | p-value |
| Total Number of Genes | 750 | 4619 | 5369 | |
| | | | | |
| Amino acid transport and metabolism | 37 | 377 | 414 | 1.5×10⁻³ |
| Inorganic ion transport and metabolism | 16 | 199 | 215 | 3.96×10⁻³ |
| Nucleotide transport and metabolism | 4 | 78 | 82 | 0.0152 |
| Protein Translation | 12 | 158 | 170 | 0.007 |
| Virulence Components | 30 | 321 | 351 | 1.83×10⁻³ |
| | | | | |
| Paralogous Genes | 73 | 228 | 301 | 2.25×10⁻⁷ |
| Hypothetical Proteins | 233 | 1132 | 1365 | 3.3×10⁻⁴ |
P-values were computed using a Fisher Test.
*: P-values were computed based upon the simultaneous comparison of 25 COG pathways.
+: Virulence genes were obtained from an annotated listing provided in Holden et al (2004) [34].
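The enrichment p-values above come from a Fisher test on 2×2 tables of gene counts. As a rough illustration of that kind of test, the sketch below implements a two-sided Fisher exact test in pure Python and applies it to the "Paralogous Genes" row. The 2×2 layout (category vs. non-category, accessory vs. core) is an assumption about how the table was tested, and no multiple-testing adjustment is applied, so the result need not match the published value exactly.

```python
from math import exp, lgamma

def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def log_p(x):
        # Hypergeometric log-probability of seeing x in the top-left cell.
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = log_p(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(exp(log_p(x)) for x in range(lo, hi + 1)
                if log_p(x) <= observed + 1e-7)
    return min(1.0, total)

# Paralogous genes: 73 accessory, 228 core; all genes: 750 accessory, 4619 core.
p_paralog = fisher_exact_two_sided(73, 228, 750 - 73, 4619 - 228)
```

The log-gamma formulation avoids overflow with genome-scale counts; the small tolerance guards against floating-point ties between equally likely tables.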
Novel indels in Bp.
| Indel | Genes | Size (kb) | Integrase/bacteriophage/transposase | GC (%) | Presence in BT? | Gene Functions |
| 1 | | 2.7 | 0 | | − | Hypothetical proteins |
| 2 | | 3.7 | 1 integrase | 60.2 | − | Hypothetical proteins and putative phage-related integrase |
| 3 | | 2.5 | 0 | 68.2 | + | Miscellaneous island; contains lipoprotein, putative amino acid transport protein and 30S ribosomal protein S15 |
| 4 | | 5.0 | 0 | | − | Hypothetical proteins; replaced by BTH_I2688, 2689 and 2690 |
| 5 | | 4.5 | 0 | 69.6 | + | Miscellaneous island; contains family U32 unassigned peptidase, putative 2-nitropropane dioxygenase, hypothetical protein and putative regulatory protein |
| 6 | | 3.7 | 0 | 68.4 | + | LPS biosynthesis; contains phosphoglucomutase, LPS biosynthesis protein and glycosyl transferase |
| 7 | | 4.1 | 0 | 68.3 | + | Miscellaneous; contains hypothetical proteins, probable alcohol dehydrogenase and putative OmpW-family exported protein |
| 8 | | 4.6 | 0 | 66.8 | + | Miscellaneous; contains C4-dicarboxylate transport protein, putative GntR-family regulatory protein, cyn operon transcriptional activator (LysR-family) and carbonic anhydrase |
| 9 | | 3.6 | 1 integrase | 64.3 | + | Hypothetical protein, integrase and DNA-binding protein |
| 10 | | 2.4 | 0 | 68.0 | + | Hypothetical proteins and glutathione S-transferase-like protein |
| 11 | | 1.3 | 2 bacteriophage proteins | | − | Bacteriophage protein Gp49 and hypothetical protein |
| 12 | | 2.7 | 0 | 66.7 | + | LPS biosynthesis; contains O-acetyl transferase, glycosyl transferase (O-antigen related) and hypothetical protein |
| 13 | | 2.4 | 0 | 69.2 | + | Miscellaneous; contains AraC family regulatory protein and hypothetical proteins |
| 14 | | 4.3 | 0 | 71.3 | + | Miscellaneous; contains sensor kinase protein and hypothetical protein |
| 15 | | 4.1 | 0 | 69.0 | + | Miscellaneous; contains MarR family regulator protein, fumarylacetoacetate (FAA) hydrolase family protein and hypothetical proteins |
| 16 | | 7.5 | 0 | 69.8 | + | Metabolic; contains citrate lyase, transporter proteins, zinc-binding dehydrogenase and isochorismatase |
| 17 | | 3.3 | 0 | 73.2 | + | Miscellaneous; contains acylphosphatase protein and hypothetical protein |
| 18 | | 3.0 | 0 | 73.5 | − | Miscellaneous; contains zinc-binding dehydrogenase and hypothetical proteins |
| 19 | | 4.8 | 0 | 71.6 | − | LPS biosynthesis; contains LPS biosynthesis proteins and transferases |
| 20 | | 3.2 | 0 | 69.8 | + | Miscellaneous; contains lipoprotein and hypothetical proteins |
*: Presence indicated by +; and absence indicated by −.
Indels exhibiting atypical %GC content are indicated in bold.
Figure 2. Frequency of Indels in Bp.
The graph shows, for each indel segment (n1–n20), the percentage of strains in which the indel is entirely absent (blue) or only partially absent (red).
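The two quantities plotted in Figure 2 can be sketched as follows, assuming each indel is represented by per-gene presence calls for every strain. The function name, strain labels and calls are hypothetical illustrations, not data from the study.

```python
def indel_absence_rates(calls_by_strain):
    """Return (pct_total_absence, pct_partial_absence) for one indel.

    calls_by_strain maps a strain name to a list of booleans,
    one per gene in the indel (True = gene present).
    """
    n = len(calls_by_strain)
    # Total absence: none of the indel's genes are detected.
    total = sum(1 for calls in calls_by_strain.values() if not any(calls))
    # Partial absence: some, but not all, of the indel's genes are detected.
    partial = sum(1 for calls in calls_by_strain.values()
                  if any(calls) and not all(calls))
    return 100.0 * total / n, 100.0 * partial / n

# Hypothetical three-gene indel scored across four strains.
toy_indel = {
    "strain1": [True, True, True],     # indel fully present
    "strain2": [False, False, False],  # indel entirely absent
    "strain3": [True, False, True],    # indel partially absent
    "strain4": [False, False, False],  # indel entirely absent
}
pct_total, pct_partial = indel_absence_rates(toy_indel)
```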
Figure 3. Unsupervised Accessory Genome Clustering of Bp Isolates.
Clustering diagram of Bp strains on the basis of accessory genome content. The tree was constructed using MultiExperiment Viewer (MeV) version 4, based on the entire 750-gene accessory genome and combined average linkage hierarchical clustering. Clinical (red), animal (blue) and environmental (green) strains are indicated. Isolates from Thailand are highlighted by the red broken circle. Three broad clusters/clades are identified and named C (clinical), A (animal) and E (environmental), with the percentage of concordant strains in each cluster. Numbers on branches represent bootstrap values based on 1000 tests. The bootstrapping analysis reveals a clear distinction between the C (clinical) cluster and the A/E (non-clinical: animal and environmental) clusters (bootstrap value = 100).
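Average linkage hierarchical clustering of presence/absence profiles can be sketched in a few lines. This is a naive illustration, not a reproduction of the MeV analysis: the distance metric used in the study is not stated here, so Hamming distance is an assumption, the strain names and profiles are hypothetical, and bootstrapping is not included.

```python
from itertools import combinations

def hamming(u, v):
    """Number of genes whose presence call differs between two profiles."""
    return sum(a != b for a, b in zip(u, v))

def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    clusters = [(name,) for name in profiles]
    merges = []

    def dist(c1, c2):
        # Average linkage: mean pairwise distance between members of the two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(hamming(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the closest pair of clusters (first pair wins on ties).
        c1, c2 = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merges.append((c1, c2, dist(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical accessory-gene profiles (1 = present, 0 = absent).
toy_profiles = {
    "clin1": [1, 1, 1, 0, 0, 1],
    "clin2": [1, 1, 1, 0, 0, 0],
    "env1":  [0, 0, 0, 1, 1, 0],
    "env2":  [0, 0, 1, 1, 1, 0],
}
merges = average_linkage(toy_profiles)
```

In this toy example the two "clin" profiles and the two "env" profiles pair up first, and the two pairs are joined last at a much larger distance, mirroring the kind of separation the tree in Figure 3 depicts.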
Figure 4. Enrichment of Genomic Islands in Clinical Isolates.
Heat map representing the presence and absence of GI genes in clinical, animal and environmental isolates. Top row (“Cluster”): AGC clusters corresponding to clinical (left), animal (middle), and environmental (right) isolates. Second row (“Source”): strains are color-coded according to their original source of isolation, where red = clinical, blue = animal, and green = environmental. Third row: strains from Thailand are highlighted in pink. In the heat map, black indicates gene presence and red indicates gene absence. Locations of the fourteen GIs are depicted on the right.
Concordance of AGC Clusters and MLST Sequence Types.
| | AGC Clades | | |
| | C | A | E |
| ST51 | 3 | | 0 |
| ST423 | 5 | 0 | 4 |
| ST422 | | 0 | 0 |
| ST84 | 0 | 0 | |
| ST169 | 1 | 0 | 0 |
| ST46 | 1 | 0 | 2 |
| ST54 | 0 | 0 | 1 |
| ST414 | 1 | 0 | 1 |
| ST289 | 0 | 0 | 1 |
| Total | 15 | 17 | 13 |
Depicted are the distributions of 45 Bp strains subjected to both AGC and MLST analysis. Strain numbers in bold (e.g. ST51) highlight STs where the majority of strains were found in one AGC clade.
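One simple way to summarize concordance between a sequence type and the AGC clades is the share of its strains falling in its most common clade. The sketch below applies this to the rows of the table whose three counts are all legible in this record. The function name and the tie-breaking rule (first clade in C, A, E order) are assumptions for illustration, not the authors' criterion for marking STs in bold.

```python
CLADES = ("C", "A", "E")

def dominant_clade(counts):
    """Return (clade, share) for the clade holding the most strains of one ST.

    counts is a (C, A, E) tuple of strain counts; ties go to the first
    clade in C, A, E order (an arbitrary choice made for this sketch).
    """
    total = sum(counts)
    best = max(range(len(CLADES)), key=lambda i: counts[i])
    return CLADES[best], counts[best] / total

# Rows with all three counts legible in the table above.
st_counts = {
    "ST423": (5, 0, 4),
    "ST169": (1, 0, 0),
    "ST46": (1, 0, 2),
    "ST54": (0, 0, 1),
    "ST414": (1, 0, 1),
    "ST289": (0, 0, 1),
}
summary = {st: dominant_clade(c) for st, c in st_counts.items()}
```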