Ofir Goldenberg, Elana Erez, Guy Nimrod, Nir Ben-Tal.
Abstract
ConSurf-DB is a repository for evolutionary conservation analysis of the proteins of known structure in the Protein Data Bank (PDB). Sequence homologues of each of the PDB entries were collected and aligned using standard methods. The evolutionary conservation of each amino acid position in the alignment was calculated using the Rate4Site algorithm, implemented in the ConSurf web server. The algorithm explicitly takes into account the phylogenetic relations between the aligned proteins and the stochastic nature of the evolutionary process. Rate4Site assigns a conservation level to each position in the multiple sequence alignment using empirical Bayesian inference. Visual inspection of the conservation patterns on the 3D structure often enables the identification of key residues that comprise the functionally important regions of the protein. The repository is updated with the latest PDB entries on a monthly basis and will be rebuilt annually. ConSurf-DB is available online at http://consurfdb.tau.ac.il/
Year: 2008 PMID: 18971256 PMCID: PMC2686473 DOI: 10.1093/nar/gkn822
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Cytochrome c. (A) The conservation coloring profile from the ConSurf-DB repository, mapped onto a space-filling representation of the protein. The conservation coloring scale is shown below. The HEME group, in stick representation, is colored green. (B) The same view as calculated by the ConSurf server using default parameters.
Build statistics for the first full version of ConSurf-DB dated February 2008
| PDB chains | | MSA sizes | |
|---|---|---|---|
| PDB entries processed | 48 091 | Chains with less than 5 homologues (insufficient) | 1348 |
| Total chains found | 117 384 | MSAs created | 29 570 |
| Filtered | | Chains with 5-10 homologues | 859 |
| Chains containing nucleic acids | 8237 | Chains with 11-20 homologues | 1059 |
| Chains of less than 30 residues | 5594 | Chains with 21-50 homologues | 2332 |
| Chains containing more than 15% modifications | 281 | Chains with 51-100 homologues | 7297 |
| Total chains meeting our requirements | 103 272 | Chains with 101-200 homologues | 14 945 |
| Total distinct chains meeting our requirements | 30 918 | Chains with 201-300 homologues | 3078 |
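
The counts in the build-statistics table can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the ConSurf-DB pipeline. The variable names are illustrative, and it assumes that the three filter categories are mutually exclusive and that every created MSA falls into exactly one of the listed size bins.

```python
# Counts copied from the build-statistics table (February 2008 build).
# All names below are hypothetical labels chosen for this sketch.
total_chains = 117_384
filtered = {
    "nucleic_acids": 8_237,
    "less_than_30_residues": 5_594,
    "over_15_percent_modifications": 281,
}
chains_meeting_requirements = 103_272
distinct_chains = 30_918
insufficient_homologues = 1_348  # chains with less than 5 homologues
msas_created = 29_570
msa_size_bins = {
    "5-10": 859,
    "11-20": 1_059,
    "21-50": 2_332,
    "51-100": 7_297,
    "101-200": 14_945,
    "201-300": 3_078,
}

# Assumption: each filtered chain is counted in exactly one category.
remaining_after_filters = total_chains - sum(filtered.values())
# Distinct chains without enough homologues yield no MSA.
msas_expected = distinct_chains - insufficient_homologues
# Assumption: the size bins partition the created MSAs.
binned_total = sum(msa_size_bins.values())

print("filters consistent:  ", remaining_after_filters == chains_meeting_requirements)
print("MSA count consistent:", msas_expected == msas_created)
print("bins consistent:     ", binned_total == msas_created)

# Share of MSAs in each size bin.
for label, count in msa_size_bins.items():
    share = 100 * count / msas_created
    print(f"{label:>7} homologues: {count:6d} ({share:.1f}%)")
```

With the numbers as printed in the table, all three checks hold: the filtered categories account for the difference between total and accepted chains, and the size bins sum to the number of MSAs created.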