Karolina L Tkaczuk, Igor A Shumilin, Maksymilian Chruszcz, Elena Evdokimova, Alexei Savchenko, Wladek Minor.
Abstract
We present the crystal structures of two universal stress proteins (USPs) from Archaeoglobus fulgidus and Nitrosomonas europaea in both apo- and ligand-bound forms. This work is the first complete synthesis of the structural properties of the 26 USPs available in the Protein Data Bank, over 75% of which were determined by structural genomics centers with no additional information provided. Bioinformatic analyses of all available USP structures and their sequence homologs revealed that the two new USP structures share overall structural similarity with previously determined USP structures. Clustering and cladogram analyses, however, show how they diverge from other members of the USP superfamily and reveal greater similarity to USPs from organisms inhabiting extreme environments. We compared them with other archaeal and bacterial USPs and discuss their similarities and differences in the context of structure, sequence motifs, and potential function. We also attempted to group all analyzed USPs into families, so that potential functions could be assigned by extrapolation to those with no experimental data available.
Keywords: Archaeoglobus fulgidus; Nitrosomonas europaea; crystal structures; pathogens; sequence analyses; structural comparison; structural genomics; universal stress protein
Year: 2013 PMID: 23745136 PMCID: PMC3673472 DOI: 10.1111/eva.12057
Source DB: PubMed Journal: Evol Appl ISSN: 1752-4571 Impact factor: 5.183
Figure 1. Schematic maps of the Archaeoglobus fulgidus and Nitrosomonas europaea chromosomes, showing the positions of usp genes. The names on the map are usp genes labeled by locus tag. If the structure of the protein encoded by a usp gene has been determined, the gene is labeled with the Protein Data Bank codes and underlined in red.
Figure 2. Domain composition of known USP proteins. In this figure, USP denotes the universal stress protein domain, CBS is the cystathionine beta synthase domain, and CD is a conserved domain of unknown function. All other domains are labeled by Pfam family identifier: PF00069 is a protein kinase domain, PF04564 the U-box domain, PF02080 the TrkA domain (unknown function), PF02702 the KdpD domain, PF13493 the DUF4118 domain (unknown function), PF00512 the HisKA domain, PF02518 the HATPase_c domain, PF00999 the sodium/hydrogen exchanger family domain, PF00654 the Voltage CLC domain, PF09413 the DUF2007 domain (unknown function), PF10494 the Stk19 domain (a family of Ser/Thr protein kinases), and PF13520 the AA_permease_2 domain (a family of amino acid permeases).
General information on USP crystal structures deposited in the Protein Data Bank (PDB)
| Name | PDB code | Organisms | Annot. | ATP-rel. | ATP-binding motif | Motif | ATP | Ion | SG center | Ref. |
|---|---|---|---|---|---|---|---|---|---|---|
| NE1028 | 2PFS/3TNJ | | USP | D12, V40 | G114-SH-G117-(8X)-G126ST | Typical | No/AMP | No | MCSG | – |
| AF0826 | 3DLO/3QTB | | USP | D11, S40 | G103-IR-K106-(9X)-G116SV | Typical | No/dAMP | Cl | MCSG | – |
| AF1760 | 3LOQ | | USP | D154, V182 | G234-SR-G237-(9X)-G247ST | Typical | AMP | Cl | MCSG | n/a |
| Lp1163 | 3S3T | | USP | D13, V41 | G115-AT-G118-(9X)-G128ST | Typical | ATP | Ca | MCSG | n/a |
| Lp3663 | 3FG9 | | UspA | D20, V50 | G123-AD-T126-(11X)-G138PR | Degen. | No | Mg | MCSG | n/a |
| WS0661 | 3IDF | | USP | D9, V38 | G108-SS-E111-(8X)-A120SH | Degen. | No | No | MCSG | n/a |
| Rv2623 | 3CIS | | UspE | D167, A195 | G262-SR-G265-(9X)-G275SV | Typical | ATP | Mg | MCSG | n/a |
| Rv1636 | 1TQ8 | | | D25, A52 | G126-NV-G129-(9X)-G139SV | Typical | No | No | NYSGXRC | n/a |
| Rv2623 | 2JAX | | UspE | D167, A195 | G262-SR-G265-(9X)-G275SV | Typical | ATP | No | n/a | n/a |
| KPN01444 | 3FH0 | | USP | D7, V37 | A111-SH-R114-(8X)-G123SN | Degen. | ADP | No | MCSG | n/a |
| KPN01444 | 3FDX | | UspF | D7, V37 | A111-SH-R114-(8X)-G123SN | Degen. | ATP | Mg | MCSG | n/a |
| AT3G01520 | 2GM3 | | USP | N13, V53 | G131-SR-G134-(9X)-G275SV | Typical | AMP | No | CESG | n/a |
| Aq178 | 1Q77 | | UspA | D9, V37 | A113-CY-P130 | Degen. | No | No | MCSG | n/a |
| PMI1202 | 3OLQ | | UspE | N161, A198 | G270-IL-G273-(10X)-N284TA | Degen. | No | No | MCSG | n/a |
| PA1789 | 3MT0 | | UspE | D139, A174 | G241-TV-A244-(9X)-G254NT | Degen. | No | Cl | MCSG | n/a |
| | 1WJG | | – | D10, A38 | G106-TR-G109-(9X)-G119SQ | Typical | No | No | RSGI | n/a |
| TTHA0895 | 2Z3V | | USP | D10, A38 | G106-TR-G109-(9X)-G119SQ | Typical | No | No | RSGI | n/a |
| TTHA0895 | 2Z08/9 | | USP | D10, A38 | G106-TR-G109-(9X)-G119SQ | Typical | ATP/ACT | Mg | RSGI | n/a |
| TTH0350 | 3AB7/8 | | USP | D8, V36 | G121-RS-D124-(5X)-G130ST | Degen. | ATP | Mg | n/a | (1) |
| HI0815 | 1JMV | | UspA | D10, A38 | G109-HH-Q112-(6X)-M119SV | Degen. | No | No | n/a | (2) |
| HELO1754 | 3HGM | | TeaD | D10, A38 | G117-AE-G120-(9X)-G130SV | Degen. | ATP | Mn | n/a | (3) |
| MJ0577 | 1MJH | | – | D13, V41 | G127-SH-G130-(9X)-G140SV | Typical | ATP | Mn | BSGC | (4) |
MCSG, Midwest Center for Structural Genomics; RSGI, RIKEN Structural Genomics/Proteomics Initiative; CESG, Center for Eukaryotic Structural Genomics; BSGC, Berkeley Structural Genomics Center; NYSGXRC, New York SGX Research Consortium.
X denotes structures not solved by structural genomics centers.
This work. X denotes any residue, and the digit in front of it gives the number of X residues. AMP, ADP, and ACT are the following derivatives of ATP: ACT, phosphomethylphosphonic acid adenylate ester (C11H18N5O13P3); AMP, adenosine monophosphate (C10H14N5O7P); ADP, adenosine-5′-diphosphate (C10H15N5O10P2). (1) Publication by Iino et al. (2011); (2) work by Sousa and McKay (2001); (3) publication by Schweikhard et al. (2010); (4) work by Zarembinski et al. (1998).
Special case of USP.
ATP-binding protein.
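The motif notation in the table above, for example G114-SH-G117-(8X)-G126ST, encodes two anchor residues separated by two named residues, a gap of unspecified residues, and a third anchor with its trailing residues. A minimal sketch of how such an entry could be parsed and its residue numbering cross-checked is given below. The function names `parse_motif` and `numbering_is_consistent` are hypothetical and used only for illustration; they are not part of the original analysis, and the check says nothing about whether a motif is typical or degenerate.

```python
import re

# Pattern for entries such as "G114-SH-G117-(8X)-G126ST":
# anchor 1, intervening residues, anchor 2, gap length, anchor 3, trailing residues.
MOTIF_RE = re.compile(
    r"^(?P<r1>[A-Z])(?P<p1>\d+)-(?P<mid>[A-Z]+)-(?P<r2>[A-Z])(?P<p2>\d+)"
    r"-\((?P<gap>\d+)X\)-(?P<r3>[A-Z])(?P<p3>\d+)(?P<tail>[A-Z]*)$"
)


def parse_motif(text):
    """Split a motif string into its parts; return None if the format differs."""
    match = MOTIF_RE.match(text.replace(" ", ""))  # tolerate stray spaces
    if match is None:
        return None
    parts = match.groupdict()
    for key in ("p1", "p2", "gap", "p3"):
        parts[key] = int(parts[key])
    return parts


def numbering_is_consistent(motif):
    """Check that residue positions agree with the stated spacing."""
    first_ok = motif["p1"] + len(motif["mid"]) + 1 == motif["p2"]
    second_ok = motif["p2"] + motif["gap"] + 1 == motif["p3"]
    return first_ok and second_ok


if __name__ == "__main__":
    for entry in ("G114-SH-G117-(8X)-G126ST", "G131-SR-G134-(9X)-G275SV"):
        parsed = parse_motif(entry)
        print(entry, numbering_is_consistent(parsed))
```

An entry whose numbering does not add up, or that does not follow the three-anchor format at all, is only flagged by this check; the deposited sequence would still need to be consulted to decide what the correct value is.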
Data collection and structure determination statistics. Crystallographic parameters, data-collection (native data) and refinement statistics for Archaeoglobus fulgidus and Nitrosomonas europaea proteins (apo and ligand-bound structures)
| | apo-AF0826 (3DLO) | AF0826-dAMP (3QTB) | apo-NE1028 (2PFS) | NE1028-AMP (3TNJ) |
|---|---|---|---|---|
| Data collection | ||||
| Beamline | 19-BM | 21-ID | 19-ID | 19-BM |
| Wavelength (Å) | 0.9793 | 0.9792 | 0.9792 | 0.9791 |
| Resolution (Å) | 1.97 (1.97–2.03) | 2.10 (2.10–2.14) | 2.25 (2.25–2.29) | 2.00 (2.00–2.03) |
| Space group | P21 | C2 | P321 | P321 |
| a (Å)/b (Å) | 43.2/99.2 | 109.6/42.7 | 76.0/77.8 | 77.8/77.8 |
| c (Å) | 57.4 | 61.3 | 43.0 | 39.9 |
| α/β (°) | 90.0/92.4 | 90.0/116.8 | 90.0/90.0 | 90.0/90.0 |
| γ (°) | 90.0 | 90.0 | 120.0 | 120.0 |
| Solvent content (%) | 30.7 | 33.3 | 42.5 | 41.5 |
| Completeness (%) | 99.6 (77.6) | 99.6 (98.8) | 97.60 (81.5) | 97.5 (93.7) |
| Observed reflections | 33270 | 36964 | 6882 | 9070 |
| Unique reflections | 33213 | 33270 | 6538 | 8637 |
| I/σ (I) | 24.1 (2.7) | 21.5 (2.6) | 58.5 (2.9) | 29.0 (2.7) |
| Rmerge (%) | 7.3 (40.5) | 7.0 (40.5) | 6.0 (48.6) | 5.1 (57.5) |
| Refinement | ||||
| R (%)/Rfree (%) | 17.5/23.0 | 20.0/23.7 | 19.8/25.5 | 19.1/24.1 |
| Mean B value (Å²) | 23.2 | 38.9 | 52.4 | 48.8 |
| Protein atoms | 4398 | 1970 | 944 | 996 |
| Chloride ions | 2 | 0 | 2 | 0 |
| Water molecules | 182 | 50 | 40 | 36 |
| Structure quality | ||||
| Ramachandran statistics | ||||
| Favored (%) | 97.9 | 99.6 | 100 | 100 |
| Allowed (%) | 2.1 | 0.4 | 0 | 0 |
| All-atoms contacts and protein geometry | ||||
| Clash score | 12.25 (71st) | 8.15 (93rd) | 9.17 (93rd) | 7.47 (93rd) |
| MolProbity score | 2.13 (63rd) | 1.44 (99th) | 1.85 (93rd) | 1.61 (94th) |
| RMS deviation | ||||
| Bond lengths (Å) | 0.019 | 0.013 | 0.017 | 0.014 |
| Bond angles (°) | 1.8 | 1.5 | 1.6 | 1.4 |
Data for the highest resolution shell are given in parentheses.
Pro and Gly residues were excluded from calculation.
Percentile.
Figure 3. Topology diagrams of the (A) AF0826 and (B) NE1028 structures. Cylinders represent α-helices, while arrows correspond to β-strands. (C) Cartoon representation of the AF0826 monomer; the arrow shows the location of the β-hairpin insertion. (D) Cartoon representation of the NE1028 monomer.
Figure 4. Ligand-binding sites of USP family members. (A) ATP-binding residues in the best characterized USP, namely UspA MJ0577 from M. jannaschii; (B) universal stress protein NE1028-AMP from the β-proteobacterium N. europaea; (C) fusion UspE protein KPN01444-ATP from Klebsiella pneumoniae; (D) universal stress protein AF0826-dAMP from the euryarchaeon Archaeoglobus fulgidus.
Figure 8. Multiple sequence alignment (MSA) of USP proteins. This figure presents the MSA of selected representatives of each family. Invariant and strongly conserved residues are highlighted in black and gray, respectively. Residues interacting with the ligand are marked with an asterisk (*).
Figure 5. Dimerization pattern of USP family members. (A) Probable dimer assembly of USP NE1028 from N. europaea [Protein Data Bank (PDB) code: 2PFS]; (B) likely incorrect dimeric assembly of USP NE1028 from N. europaea (PDB code: 2PFS) predicted by the PISA server; (C) dimeric assembly of UspE protein Rv2623 from Mycobacterium tuberculosis (PDB code: 3CIS); (D) superposition of type 1 dimers (representatives listed in Table 1); (E) incorrect UspF assembly (PISA AB); (F) correct assembly (PISA AA) of UspF (PDB code: 3FDX). ATP-binding residues are shown in pink, dimerization interface residues from monomers A and B are shown in green and blue, respectively, and ligand molecules are shown in CPK colors in either space-filling or ball-and-stick representation.
Figure 6. Clustering analysis results. (A) Clustering results with full-length UspE. (B) Clustering results with each UspE protein divided into separate domains (UspE1 and UspE2) that are treated separately. Each USP family is shown in a different color and labeled.
Figure 7. Cladogram depicting the grouping of USP protein families.