Adi Ben Chorin, Gal Masrati, Amit Kessel, Aya Narunsky, Josef Sprinzak, Shlomtzion Lahav, Haim Ashkenazy, Nir Ben-Tal.
Abstract
Patterns observed by examining the evolutionary relationships among proteins of common origin can reveal the structural and functional importance of specific residue positions. In particular, amino acids that are highly conserved (i.e., whose positions evolve at a slower rate than other positions) are likely to be of biological importance, for example, for ligand binding. ConSurf is a bioinformatics tool for accurately estimating the evolutionary rate of each position in a protein family. Here we introduce a new release of ConSurf-DB, a database of precalculated ConSurf evolutionary conservation profiles for proteins of known structure. ConSurf-DB provides high-accuracy estimates of the evolutionary rates of the amino acids in each protein. A reliable estimate of a query protein's evolutionary rates depends on having a sufficiently large number of effective homologues (i.e., nonredundant yet sufficiently similar). With current sequence data, ConSurf-DB covers 82% of the PDB proteins. It will be updated on a regular basis to ensure that coverage remains high and, possibly, increases. Much effort was dedicated to improving the user experience. The repository is available at https://consurfdb.tau.ac.il/.

BROADER AUDIENCE: By comparing a protein to other proteins of similar origin, it is possible to determine the extent to which each amino acid position in the protein evolved slowly or rapidly. A protein's evolutionary profile can provide valuable insights: for example, amino acid positions that are highly conserved (i.e., evolved slowly) are particularly likely to be of structural and/or functional importance, such as for ligand binding and catalysis. We introduce here a new and improved version of ConSurf-DB, a continually updated database that provides precalculated evolutionary profiles of proteins with known structure.
Keywords: ConSurf; ConSurf-DB; binding site; evolutionary conservation; evolutionary rate; functional importance
Year: 2019 PMID: 31702846 PMCID: PMC6933843 DOI: 10.1002/pro.3779
Source DB: PubMed Journal: Protein Sci ISSN: 0961-8368 Impact factor: 6.725
Figure 1. A flowchart of the pipeline used to construct ConSurf‐DB. The pipeline consists of four steps: retrieving PDB entries, homologue detection and building a multiple sequence alignment, estimating evolutionary conservation, and formatting the results.
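To give a feel for what "estimating evolutionary conservation" from a multiple sequence alignment means, here is a minimal sketch. It is not the ConSurf method: ConSurf estimates per-position evolutionary rates, whereas this sketch swaps in per-column Shannon entropy, a much cruder proxy that ignores the phylogeny. The toy alignment and the function names `column_entropy` and `conservation_profile` are illustrative assumptions.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (in bits) of one alignment column, ignoring gaps.

    Lower entropy means less residue variety, i.e. a more conserved column.
    This is a simple stand-in for, not an implementation of, ConSurf's
    evolutionary-rate estimate.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return math.nan  # a column of gaps only carries no information
    total = len(residues)
    counts = Counter(residues)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def conservation_profile(msa: list[str]) -> list[float]:
    """Return the entropy of every column of an aligned set of sequences."""
    if len({len(seq) for seq in msa}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy("".join(col)) for col in zip(*msa)]


# Hypothetical toy alignment (4 sequences, 5 columns), for illustration only.
toy_msa = ["MKVLA", "MKILA", "MRVLG", "MKVLS"]
profile = conservation_profile(toy_msa)
```

In this toy alignment, the first and fourth columns are invariant and so receive the lowest entropy, while the last column, with three different residues, receives the highest.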
Statistics of ConSurf‐DB
| PDB chains | | MSA sizes | |
|---|---|---|---|
| Total chains found | 473,197 | Chains with fewer than 50 homologues | 7,363 |
| Total nonredundant chains found | 108,958 | MSAs created | |
| Filtered | | Chains with 50–100 homologues | 3,238 |
| Chains shorter than 30 amino acids | 7,054 | Chains with 101–200 homologues | 4,978 |
| Chains with large structures | 4,629 | Chains with 201–300 homologues | 81,486 |
| Chains with more than 15% modified residues | 210 | Total chains processed | 89,702 |
| Total chains post‐initial filtration | 389,863 | | |
| Total nonredundant chains post‐initial filtration | 97,065 | | |
Note: Currently, the database covers 89,702 of the 108,958 protein chains in the nonredundant set, that is, 82%.
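The counts in the table can be cross-checked with a little arithmetic, and the stated filtering criteria can be written as simple predicates. The sketch below is an illustration under assumptions, not the pipeline's actual code: the `Chain` class and the function names are hypothetical, and the "large structures" filter is omitted because its threshold is not given here. The figures are consistent with the three filters having been applied to the nonredundant set.

```python
from dataclasses import dataclass

# Thresholds taken from the table rows above.
MIN_LENGTH = 30               # chains shorter than this are filtered out
MAX_MODIFIED_FRACTION = 0.15  # chains with more modified residues are filtered out
MIN_HOMOLOGUES = 50           # chains with fewer homologues are not processed


@dataclass
class Chain:
    """Hypothetical minimal record for one PDB chain."""
    length: int
    modified_residues: int
    homologues: int


def passes_initial_filter(chain: Chain) -> bool:
    """Apply the length and modified-residue criteria from the table."""
    if chain.length < MIN_LENGTH:
        return False
    return chain.modified_residues / chain.length <= MAX_MODIFIED_FRACTION


def has_enough_homologues(chain: Chain) -> bool:
    """A chain is processed only if its MSA has at least 50 homologues."""
    return chain.homologues >= MIN_HOMOLOGUES


# Cross-check of the table, using the counts exactly as reported.
nonredundant = 108_958
filtered_out = 7_054 + 4_629 + 210          # short, large, heavily modified
post_filter = nonredundant - filtered_out   # nonredundant chains post-filtration
processed = post_filter - 7_363             # minus chains with < 50 homologues
by_msa_size = 3_238 + 4_978 + 81_486        # the three MSA size classes
coverage = processed / nonredundant

print(f"post-filter: {post_filter:,}  processed: {processed:,}  "
      f"coverage: {coverage:.0%}")
```

The subtraction reproduces the reported 97,065 post-filtration chains and 89,702 processed chains, and the three MSA size classes sum to the same processed total.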
Figure 2. Conservation of catalytic and specificity‐determining positions (SDPs) in the active site of Or‐AT (PDB entry 2oat). (a) Ornithine‐aminotransferase, colored by conservation grade and shown in surface representation, together with the inhibitor–cofactor (pyridoxal phosphate) conjugate, colored by atom type and shown as spheres. (b) The catalytic and suspected specificity‐determining positions of ornithine‐aminotransferase are shown as sticks and colored by conservation grade. For clarity, the backbone of the enzyme is not shown.
Figure 3. The conservation pattern of an antibody (PDB entry 1igt; http://firstglance.jmol.org/fg.htm?mol=1igt). A cartoon representation of an antibody colored according to evolutionary conservation. The constant and hypervariable regions in the structure are annotated. The antigen‐binding region (CDR loops) is shown as spheres.