Debanu Das, Alexey G Murzin, Neil D Rawlings, Robert D Finn, Penelope Coggill, Alex Bateman, Adam Godzik, L Aravind.
Abstract
BACKGROUND: CA_C2195 from Clostridium acetobutylicum is a protein of unknown function. Sequence analysis predicted that part of the protein contained a metallopeptidase-related domain. There are over 200 homologs of similar size in large sequence databases such as UniProt, with pairwise sequence identities in the range of ~40-60%. CA_C2195 was chosen for crystal structure determination for structure-based function annotation of novel protein sequence space.
Year: 2014 PMID: 24646163 PMCID: PMC4000134 DOI: 10.1186/1471-2105-15-75
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Summary of crystal parameters, data collection and refinement statistics for PDB 3k9t
| Data collection | λ1 | λ2 | λ3 |
| Space group | H32 | | |
| Unit cell parameters (Å) | a = 153.78, b = 153.78, c = 168.38 | | |
| Wavelength (Å) | 0.91837 | 0.97925 | 0.97911 |
| Resolution range (Å) | 29.1-2.37 (2.43-2.37) | 29.1-2.44 (2.50-2.44) | 29.1-2.25 (2.31-2.25) |
| No. of observations | 172,585 | 157,212 | 403,378 |
| No. of unique reflections | 31,178 | 28,543 | 36,347 |
| Completeness (%) | 99.9 (100.0) | 99.9 (100.0) | 100.0 (100.0) |
| Mean I/σ(I) | 9.0 (1.5) | 9.2 (1.6) | 12.7 (1.9) |
| Rmerge (%)† | 18.9 (101.7) | 18.3 (93.1) | 20.9 (132.1) |
| Rmeas (%)‡ | 20.9 (112.3) | 20.3 (102.9) | 21.9 (138.4) |
| Rpim (%)‡‡ | 8.8 (47.4) | 8.6 (43.4) | 6.5 (41.2) |
| Model and refinement statistics | | | |
| Resolution range (Å) | 29.1-2.37 | | |
| No. of reflections (total) | 31,177§ | | |
| No. of reflections (test) | 1,576 | | |
| Completeness (%) | 100.0 | | |
| Data set used in refinement | λ1 | | |
| Cutoff criteria | \|F\| > 0 | | |
| Rcryst¶ | 0.171 | | |
| Rfree¶ | 0.212 | | |
| Stereochemical parameters | | | |
| Restraints (RMSD observed) | | | |
| Bond angles (°) | 1.61 | | |
| Bond lengths (Å) | 0.015 | | |
| Average isotropic B-value (Å²)†† | 29.5 | | |
| ESU‡‡‡ based on Rfree (Å) | 0.18 | | |
| Protein residues / atoms | 422 / 3386 | | |
| Waters / Zn / Cl / Imd / MRD | 221 / 1 / 2 / 1 / 4 | | |
Values in parentheses are for the highest resolution shell.
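†$R_{\mathrm{merge}} = \sum_{hkl}\sum_i |I_i(hkl) - \langle I(hkl)\rangle| / \sum_{hkl}\sum_i I_i(hkl)$.
‡$R_{\mathrm{meas}} = \sum_{hkl}[N/(N-1)]^{1/2}\sum_i |I_i(hkl) - \langle I(hkl)\rangle| / \sum_{hkl}\sum_i I_i(hkl)$ [17].
‡‡$R_{\mathrm{pim}}$ (precision-indicating $R_{\mathrm{merge}}$) $= \sum_{hkl}[1/(N-1)]^{1/2}\sum_i |I_i(hkl) - \langle I(hkl)\rangle| / \sum_{hkl}\sum_i I_i(hkl)$ [19,20].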
§ Typically, the number of unique reflections used in refinement is slightly less than the total number that were integrated and scaled. Reflections are excluded owing to systematic absences, negative intensities and rounding errors in the resolution limits and unit-cell parameters.
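¶$R_{\mathrm{cryst}} = \sum ||F_{\mathrm{obs}}| - |F_{\mathrm{calc}}|| / \sum |F_{\mathrm{obs}}|$, where $F_{\mathrm{calc}}$ and $F_{\mathrm{obs}}$ are the calculated and observed structure-factor amplitudes, respectively. $R_{\mathrm{free}}$ is the same as $R_{\mathrm{cryst}}$ but for 5.1% of the total reflections chosen at random and omitted from refinement.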
†† This value represents the total B that includes TLS and residual B components.
‡‡‡ Estimated overall coordinate error [18].
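The R-factor definitions in the footnotes above can be illustrated with a short calculation. The following is a minimal sketch, not the software used for this structure: the input format (a list of `((h, k, l), intensity)` pairs), the function names, and the choice to skip reflections measured only once are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def merging_r_factors(observations):
    """Return (R_merge, R_meas, R_pim) as fractions, not percentages.

    `observations` is a hypothetical input format: an iterable of
    ((h, k, l), intensity) pairs, one pair per symmetry-equivalent observation.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    num_merge = num_meas = num_pim = denom = 0.0
    for intensities in groups.values():
        n = len(intensities)
        if n < 2:
            # Assumption: singly measured reflections are skipped, because
            # the N/(N-1) weighting is undefined for N = 1.
            continue
        mean_i = sum(intensities) / n
        abs_dev = sum(abs(i - mean_i) for i in intensities)
        num_merge += abs_dev
        num_meas += sqrt(n / (n - 1)) * abs_dev   # multiplicity-corrected
        num_pim += sqrt(1 / (n - 1)) * abs_dev    # precision-indicating
        denom += sum(intensities)

    if denom == 0:
        raise ValueError("no reflections with multiplicity >= 2")
    return num_merge / denom, num_meas / denom, num_pim / denom


def r_factor(f_obs, f_calc):
    """R = sum(||Fobs| - |Fcalc||) / sum(|Fobs|) over paired amplitudes.

    Applied to the working set this gives R_cryst; applied to the
    test set omitted from refinement it gives R_free.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("amplitude lists must have equal length")
    denom = sum(abs(fo) for fo in f_obs)
    return sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc)) / denom


# Toy data (made up for illustration only).
toy = [((1, 0, 0), 100.0), ((1, 0, 0), 110.0), ((1, 0, 0), 90.0),
       ((0, 1, 0), 50.0), ((0, 1, 0), 60.0)]
r_merge, r_meas, r_pim = merging_r_factors(toy)
```

For any data set with multiplicity of at least 2, the weighting factors give R_pim ≤ R_merge ≤ R_meas, which is the ordering seen in the table.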
Figure 1. Crystal structure and domain architecture. The crystal structure of CA_C2195 from Clostridium acetobutylicum, with the N- and C-termini labeled ‘N’ and ‘C’, reveals three domains: residues 1–55 (blue) and 165–355 (yellow) form the N-terminal metallopeptidase domain, DUF4910; residues 56–164 (grey) form the DUF2172 domain; and residues 356–434 (red) form a C-terminal wHTH domain, HTH_47. The putative active-site residues, Asp195 (red stick) and His189 and His324 (cyan sticks), are bound to a Zn ion from the crystallization condition. Imidazole from the crystallization condition is also bound to the active-site Zn. The lower panel is a linear representation of the domain architecture of CA_C2195.
Figure 2. Metallopeptidase domain structure. The metallopeptidase domain of CA_C2195 (blue) is similar in structure to several other metallopeptidases, for example the Peptidase_M28 family aminopeptidase [PDB:2dea] (orange), with an r.m.s.d. of ~2.3 Å between Cα atoms over the entire domain despite a very low sequence identity of ~17%.
Figure 3. Residue conservation analysis in the metallopeptidase domain. The residues likely involved in activity, Asp195, His189 and His324, have the highest conservation (dark pink, scale 9 in a range of 1 to 9 in CONSURF) across CA_C2195 homologs. The presence of other highly conserved residues around the putative active site suggests that they are also involved in function. The least conserved residues (cyan, scale 1) in CA_C2195 are also visible.
Figure 4. Comparison of the DUF2172 and PA domains. (A) The DUF2172 domain in CA_C2195 (grey, left panel) bears some fold resemblance to the PA (Protease-associated) domain (grey, right panel), which has been observed in a Peptidase_M28 family member [PDB:2ek8], even though there is no discernible sequence identity. Analogous to the proposed role of the PA domain, the DUF2172 domain may form a lid modulating access to the peptidase active site and may also be involved in substrate recognition and specificity. Molecules in the panels are oriented such that the peptidase domains in both superimpose. The active sites in both molecules are shown as cyan sticks and black spheres. (B) A large substructure of the PA domain fold (yellow, left panel) is replaced with a turn of α-helix in DUF2172 (orange, right panel).
Figure 5. Residue conservation analysis in the DUF2172 domain. Highly conserved aromatic residues (dark pink), including Trp70, Tyr98, Tyr127, Tyr131 and Tyr132, may be involved in substrate recognition if this domain has a function associated with substrate interactions.
Figure 6. Residue conservation analysis in the C-terminal wHTH domain. Residues in the C-terminal circularly permuted wHTH domain that might be involved in substrate recognition and specificity, based on their high conservation across CA_C2195 homologs, are shown (residues with the highest conservation are in dark pink).
Figure 7. Comparison of wHTH domains. (A) The circularly permuted wHTH domain observed in CA_C2195 (red, left panel) resembles another circularly permuted wHTH domain present in the structure of a Peptidase_M24 family aminopeptidase [PDB:1boa] (red, right panel), and may be involved in substrate recognition and specificity. (B) The wHTH domain in CA_C2195 (left) is compared to the wHTH domain from Peptidase_M24 [PDB:1boa] (center) and a wHTH domain from a transcription factor [PDB:1cf7] (right), which was one of the proteins most similar in structure to the CA_C2195 wHTH domain. Each domain is colored from the N-terminus (blue) to the C-terminus (red). All domains are in a similar orientation. (C) Topology diagrams for the three domains in (B), in the same order, depicting the arrangement of secondary structure elements and the circular permutation in the CA_C2195 wHTH compared to the transcription factor wHTH. Cylinders represent α-helices, arrows represent β-strands, and the N- and C-termini are labeled.
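The r.m.s.d. quoted in the Figure 2 caption is the root-mean-square distance between equivalent Cα atoms of two superimposed structures. The sketch below shows only that final distance calculation. It assumes the two coordinate lists are already superimposed and paired residue by residue; the superposition and the residue equivalences come from a structural alignment program and are not reproduced here. The function name and the toy coordinates are hypothetical.

```python
from math import sqrt


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square distance (in the input units, e.g. Å) between
    paired, already-superimposed Cα coordinates given as (x, y, z) tuples."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return sqrt(squared / len(coords_a))


# Toy coordinates (made up): every atom pair is exactly 2 Å apart along z.
domain_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
domain_b = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
rmsd = ca_rmsd(domain_a, domain_b)
```

Because the deviations are squared before averaging, a few poorly matching loops can dominate the value, which is one reason an r.m.s.d. of ~2.3 Å over an entire domain is still consistent with a shared fold.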
Gene neighborhood analysis
| GI number | Gene | Locus tag | RefSeq accession | Annotation |
| 15895455 | . | CA_C2186 | NP_348804.1 | Glycosyltransferase |
| 15895456 | spsE | CA_C2187 | NP_348805.1 | N-acetylneuraminic acid synthase + SAF sugar-binding (condenses of phosphoenolpyruvate and |
| 15895457 | . | CA_C2188 | NP_348806.1 | Glycosyltransferase |
| 15895458 | . | CA_C2189 | NP_348807.1 | ATP-grasp amino acyl ligase |
| 15895459 | spsF | CA_C2190 | NP_348808.1 | Sugar phosphate nucleotidyltransferase |
| 15895460 | . | CA_C2192 | NP_348809.1 | Glyoxylase |
| 15895461 | . | CA_C2193 | NP_348810.1 | DUF3880 + Glycosyltransferase |
| 15895462 | . | CA_C2194 | NP_348811.1 | nucleoside-diphosphate sugar epimerase |
| 15895464 | . | CA_C2196 | NP_348813.1 | Methyltransferase + Glycosyltransferase (currently annotated as: MAF_flag10, DUF115) |
| 15895465 | . | CA_C2197 | NP_348814.1 | aminosugar |
| 15895466 | acpA | CA_C2198 | NP_348815.1 | acyl carrier protein |
| 15895467 | . | CA_C2199 | NP_348816.1 | aminosugar |