Neethu Shah, Rosmarie Gaupp, Hideaki Moriyama, Kent M Eskridge, Etsuko N Moriyama, Greg A Somerville.
Abstract
BACKGROUND: The Per-Arnt-Sim (PAS) domain represents a ubiquitous structural fold that is involved in bacterial sensing and adaptation systems, including several virulence-related functions. Although PAS domains and the subclass of PhoQ-DcuS-CitA (PDC) domains have a common structure, there is limited amino acid sequence similarity. To gain greater insight into the evolution of PDC/PAS domains present in the bacterial kingdom, and in staphylococci in particular, the PDC/PAS domains from the genomic sequences of 48 bacteria, representing 5 phyla, were identified using a sensitive search method based on HMM-to-HMM comparisons (HHblits).
Year: 2013 PMID: 23902280 PMCID: PMC3734008 DOI: 10.1186/1471-2164-14-524
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Three-dimensional structures of PAS (A) and PDC (B) domains. The structures are based on (A) the Rhizobium meliloti oxygen sensor FixL protein with its ligand heme (UniProt P10955: positions 122–251) (PDB: 1D06) [21], and (B) the ligand-binding domain of the Klebsiella pneumoniae sensor kinase CitA protein with its ligand citrate (UniProt P52687: positions 5–135) (PDB: 1P0Z) [22]. For both structures, the core β-strands are labeled from 1 to 5. Schematic models were generated with PyMOL (Schrödinger, Portland, OR). Each region is colored as follows: the amino end in blue, the leading α-helix region in green, the first two β-strands in orange, the inter-domain α-helix region in magenta, the last three β-strands in yellow, and the carboxyl end in red. Ligands are shown as white stick models.
Table 1. Summary of the 48 bacterial genomes used in this study and the PAS/PDC domains identified
| Species (a) | Strain | Accession | Phylum | Pathogenic (b) | Motile (c) | Total proteins (d) | PDC/PAS proteins (e) | PAS domains (f) | PDC domains (g) |
|---|---|---|---|---|---|---|---|---|---|
| | | NC_011835.1 | Actinobacteria | 0 | 0 | 1527 | 5 | 4 | 2 |
| | NCTC 13129 | NC_002935.2 | Actinobacteria | 1 | 0 | 2272 | 4 | 2 | 2 |
| | ATCC 13032 | NC_006958.1 | Actinobacteria | 0 | 0 | 3057 | 7 | 5 | 4 |
| | NCTC 2665 | NC_012803.1 | Actinobacteria | 0 | 0 | 2236 | 5 | 4 | 3 |
| | TN | NC_002677.1 | Actinobacteria | 1 | 0 | 1605 | 6 | 7 | 1 |
| | MC2 155 | NC_008596.1 | Actinobacteria | 0 | 0 | 6717 | 20 | 17 | 10 |
| | CDC1551 | NC_002755.2 | Actinobacteria | 1 | 0 | 4189 | 10 | 6 | 7 |
| | A3(2) | NC_003888.3 | Actinobacteria | 0 | 0 | 8154 | 51 | 48 | 18 |
| | CDC 684 | NC_012581.1 | Firmicutes | 1 | 0 | 5902 | 43 | 30 | 30 |
| | ATCC 10987 | NC_003909.8 | Firmicutes | 1 | 1 | 5843 | 41 | 30 | 25 |
| | QM B1551 | NC_010010.2 | Firmicutes | 0 | 1 | 5612 | 53 | 57 | 21 |
| | | NC_000964.3 | Firmicutes | 0 | 1 | 4176 | 32 | 21 | 30 |
| | ATCC 824 | NC_003030.1 | Firmicutes | 0 | 1 | 3847 | 36 | 16 | 34 |
| | A str. ATCC 3502 | NC_009496.1 | Firmicutes | 1 | 1 | 3590 | 48 | 31 | 40 |
| | 630 | NC_009089.1 | Firmicutes | 1 | 1 | 3749 | 49 | 49 | 26 |
| | V583 | NC_004668.1 | Firmicutes | 1 | 0 | 3264 | 9 | 7 | 7 |
| | DO | NZ_AAAK00000000 | Firmicutes | 1 | 0 | 3114 | 9 | 3 | 10 |
| | ATCC 334 | NC_008526.1 | Firmicutes | 0 | 0 | 2768 | 8 | 6 | 4 |
| | | NC_009004.1 | Firmicutes | 0 | 0 | 2434 | 5 | 5 | 0 |
| | EGD-e | NP_463535.1 | Firmicutes | 1 | 1 | 2846 | 12 | 9 | 7 |
| | JCSC5402 | NC_011999.1 | Firmicutes | 0 | 0 | 2052 | 7 | 6 | 4 |
| | | NC_010079.1 | Firmicutes | 1 | 0 | 2693 | 8 | 5 | 7 |
| | | NC_002951.2 | Firmicutes | 1 | 0 | 2612 | 8 | 5 | 7 |
| | | NC_012121.1 | Firmicutes | 0 | 0 | 2461 | 9 | 8 | 5 |
| | ATCC 12228 | NC_004461.1 | Firmicutes | 1 | 0 | 2416 | 8 | 5 | 7 |
| | JCSC1435 | NC_007168.1 | Firmicutes | 1 | 0 | 2676 | 7 | 5 | 4 (h) |
| | HKU09-01 | CP_001837.1 | Firmicutes | 1 | 0 | 2490 | 7 | 5 | 5 |
| | HKU10-03 | NC_014925.1 | Firmicutes | 1 | 0 | 2450 | 8 | 5 | 7 |
| | | NC_007350.1 | Firmicutes | 1 | 0 | 2446 | 6 | 4 | 5 |
| | 2603 V/R | NC_004116.1 | Firmicutes | 1 | 0 | 2124 | 11 | 6 | 7 |
| | D39 | NC_008533.1 | Firmicutes | 1 | 0 | 1914 | 8 | 5 | 4 |
| | MGAS10270 | NC_008022.1 | Firmicutes | 1 | 0 | 1986 | 11 | 4 | 10 |
| | | | | | | | | | |
| | PCC 73102 | NC_010628.1 | Cyanobacteria | 0 | 1 | 6689 | 97 | 131 | 38 |
| | CC9311 | NC_008319.1 | Cyanobacteria | 0 | 1 | 2892 | 12 | 8 | 6 |
| | C58 | NC_003062.2 | Proteobacteria | 0 | 1 | 5355 | 63 | 61 | 27 |
| | Houston-1 | NC_005956.1 | Proteobacteria | 1 | 0 | 1488 | 8 | 10 | 2 |
| | BTAi1 | NC_009485.1 | Proteobacteria | 0 | 1 | 7621 | 98 | 103 | 51 |
| | K-12 substr. MG1655 | NC_000913.2 | Proteobacteria | 0 | 1 | 4146 | 32 | 24 | 18 |
| | 83 | CP002605.1 | Proteobacteria | 1 | 1 | 1609 | 5 | 1 | 7 |
| | 342 | NC_011283.1 | Proteobacteria | 1 | 0 | 5768 | 33 | 24 | 18 |
| | MC58 | NC_003112.1 | Proteobacteria | 1 | 0 | 2063 | 3 | 2 | 1 |
| | PAO1 | NC_002516.1 | Proteobacteria | 1 | 1 | 5571 | 70 | 71 | 35 |
| | | NC_010067.1 | Proteobacteria | 1 | 1 | 4500 | 21 | 12 | 13 |
| | 2a str. 2457 T | NC_004741.1 | Proteobacteria | 1 | 0 | 4060 | 18 | 11 | 9 |
| | MJ-1236 | NC_012668.1 | Proteobacteria | 1 | 1 | 3772 | 71 | 42 | 59 |
| | CO92 | NC_003143.1 | Proteobacteria | 1 | 1 | 4066 | 20 | 10 | 13 |
| | serovar Patoc | CP000788.1 | Spirochaetes | 0 | 1 | 3726 | 67 | 76 | 30 |
| | | CP000805.1 | Spirochaetes | 1 | 1 | 1028 | 5 | 1 | 6 |
(a) For those included in our phylogenetic analysis, abbreviations for species names are shown in parentheses.
(b) 1: pathogenic, 0: non-pathogenic.
(c) 1: motile, 0: non-motile.
(d) Total number of proteins in the genome.
(e) Total number of PAS/PDC-containing proteins identified in the genome.
(f) Total number of PAS domains identified.
(g) Total number of PDC domains identified.
(h) One of the PDC/PAS-containing proteins in Staphylococcus haemolyticus (Sh.3, YP_253148.1, PhoR) has no PDC domain identified by either HHblits or HHsearch, although all other Staphylococcus PhoR homologs have clearly identified PDC domains. However, as noted in Supplementary Table S1, a very weakly conserved PDC-like region was identified in this protein based on the predicted secondary structure. This potential PDC domain is not included in this table or in our analysis.
Figure 2. Correlation between the total protein numbers and the numbers of PDC/PAS proteins across 48 bacterial genomes. The correlation is significant whether it is based on all 48 genomes or on only 43 genomes (excluding 5 over-represented Staphylococcus genomes): for 48 genomes, Pearson’s correlation coefficient r = 0.77 (p < 0.0001) and Spearman’s rank correlation ρ = 0.83 (p < 0.0001); for 43 genomes, Pearson’s correlation coefficient r = 0.76 (p < 0.0001) and Spearman’s rank correlation ρ = 0.82 (p < 0.0001). Bacterial species were classified as motile or non-motile (see Table 1) and plotted with open and closed circles, respectively.
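The two statistics named in the Figure 2 caption can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names are hypothetical, the p-values are not computed, and the eight (total proteins, PDC/PAS proteins) pairs are only a subset of Table 1 rows chosen for illustration, so the printed coefficients are not expected to match the reported values for 48 or 43 genomes.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient r for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative subset of Table 1: (total proteins, PDC/PAS proteins).
subset = [(1527, 5), (2272, 4), (3057, 7), (6717, 20),
          (8154, 51), (4146, 32), (7621, 98), (1028, 5)]
total_proteins = [row[0] for row in subset]
pdc_pas_proteins = [row[1] for row in subset]

print(f"Pearson r  = {pearson(total_proteins, pdc_pas_proteins):.2f}")
print(f"Spearman rho = {spearman(total_proteins, pdc_pas_proteins):.2f}")
```

Spearman's coefficient is computed here as Pearson's r on average ranks, which handles ties such as the two genomes in the subset with 5 PDC/PAS proteins.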
Figure 3. Overview of PDC/PAS proteins in 28 Gram-positive bacteria. The total number of PDC/PAS proteins found in each genome is shown in parentheses next to the species name and corresponds to the bar length. The colored segments in each bar indicate the proportions of different domain architectures. They include "single PDC" (dark blue: PDC), "two PDCs" (light blue: PDC = 2), "PDC and PAS" (purple: PDC PAS), "single PAS" (orange: PAS), "two PASs" (red: PAS = 2), "more than two PASs" (brown: PAS > 2), and "multiple PAS and PDC" (yellow green: PAS/PDC ≥ 2). See Additional file 1: Table S1 for details. On the left-hand side, the maximum likelihood phylogeny of 16S rDNA sequences is given to show the evolutionary relationships among these bacteria. Dark circles at internal nodes represent those supported by bootstrap values greater than 70%.
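One way to read the architecture categories in the Figure 3 caption is as a mapping from per-protein domain counts to a label. The sketch below is an interpretation, not the authors' procedure: the function name is hypothetical, "multiple PAS and PDC" is assumed to mean that both domain types are present with more than two domains in total, and proteins with more than two PDC domains and no PAS domain, which the caption does not list, are returned as "unclassified".

```python
def classify_architecture(n_pas: int, n_pdc: int) -> str:
    """Map PAS and PDC domain counts of one protein to a Figure 3 label.

    Assumption: labels follow the caption's short forms; the boundary
    between "PDC PAS" and "PAS/PDC >= 2" is inferred, not stated.
    """
    if n_pas < 0 or n_pdc < 0 or n_pas + n_pdc == 0:
        raise ValueError("a PDC/PAS protein needs at least one domain")
    if n_pas == 0:
        # PDC-only proteins; the caption lists only one or two PDCs.
        return {1: "PDC", 2: "PDC = 2"}.get(n_pdc, "unclassified")
    if n_pdc == 0:
        # PAS-only proteins: one, two, or more than two PAS domains.
        if n_pas == 1:
            return "PAS"
        return "PAS = 2" if n_pas == 2 else "PAS > 2"
    if n_pas == 1 and n_pdc == 1:
        return "PDC PAS"
    # Both types present and more than two domains in total (assumed).
    return "PAS/PDC >= 2"


# Hypothetical domain counts for a handful of proteins.
for counts in [(0, 1), (0, 2), (1, 1), (3, 0), (2, 1)]:
    print(counts, "->", classify_architecture(*counts))
```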
Figure 4. Maximum likelihood phylogeny of PAS domain protein sequences. 372 PAS domain sequences obtained from 28 Gram-positive bacterial genomes are included. For reference, the phylogeny is shown in two color schemes: based on PDC/PAS domain architectures (A) and based on bacterial genera (B). Black and green circles at internal nodes represent, respectively, those supported by bootstrap values greater than 60% and those supported by all three phylogenetic methods although bootstrap values were 60% or lower. Fourteen PAS domain sequences whose structures are known and that were used as the search queries were included in the phylogeny and are labeled with black letters. One PDC domain sequence, 1P0Z (CitA, Klebsiella pneumoniae), was also included as the outgroup and is shown in black. See Table 1, Additional file 1: Table S1, and Additional file 3: Table S2 for the species name abbreviations and protein IDs. Species abbreviations not listed in Table 1 are Av (Azotobacter vinelandii), Bj (Bradyrhizobium japonicum), Gs (Geobacter sulfurreducens), Hm (Haloarcula marismortui), Hh (Halorhodospira halophila), Rc (Rhodospirillum centenum), Rj (Rhodococcus jostii), Rm (Rhizobium meliloti), Tm (Thermotoga maritima), and Vp (Vibrio parahaemolyticus).
Figure 5. Maximum likelihood phylogeny of PDC domain protein sequences. 303 PDC domain sequences identified from 28 Gram-positive bacterial genomes are included. The phylogeny is shown in two color schemes: based on PDC/PAS domain architectures (A) and based on bacterial genera (B). "Distal" and "Proximal" PDCs are the first and second domains, respectively, in the two-PDC proteins (shown as "PDC = 2" in Figure 3). Black and green circles at internal nodes represent, respectively, those supported by bootstrap values greater than 60% and those supported by all three phylogenetic methods although bootstrap values were 60% or lower. Seven PDC domain sequences whose structures are known and that were used as the search queries were included in the phylogeny and are labeled with black letters. One PAS domain sequence, 1D06 (FixL, Rhizobium meliloti), was included as the outgroup and is shown in black. See Table 1, Additional file 1: Table S1, and Additional file 3: Table S2 for the species name abbreviations and protein IDs. Species abbreviations not listed in Table 1 are Gs (Geobacter sulfurreducens), Rm (Rhizobium meliloti), and Vp (Vibrio parahaemolyticus).