Abstract
An algorithm is described for automated building of side chains in an electron-density map once a main-chain model is built and for alignment of the protein sequence to the map. The procedure is based on a comparison of electron density at the expected side-chain positions with electron-density templates. The templates are constructed from average amino-acid side-chain densities in 574 refined protein structures. For each contiguous segment of main chain, a matrix with entries corresponding to an estimate of the probability that each of the 20 amino acids is located at each position of the main-chain model is obtained. The probability that this segment corresponds to each possible alignment with the sequence of the protein is estimated using a Bayesian approach and high-confidence matches are kept. Once side-chain identities are determined, the most probable rotamer for each side chain is built into the model. The automated procedure has been implemented in the RESOLVE software. Combined with automated main-chain model building, the procedure produces a preliminary model suitable for refinement and extension by an experienced crystallographer.
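The alignment step can be illustrated with a minimal sketch. It is not the RESOLVE implementation and does not reproduce the paper's equations: it assumes that positions along a segment are independent, that every alignment offset is equally likely a priori, and that the per-position probabilities are already available as dictionaries. The function names, the `1e-12` probability floor and the `0.95` confidence threshold are all hypothetical choices made for illustration.

```python
import math


def alignment_posteriors(prob_matrix, sequence):
    """Estimate a posterior probability for each offset of a segment in a sequence.

    prob_matrix[i][aa] is the estimated probability that amino acid `aa`
    (one-letter code) occupies position i of the main-chain segment.
    Assumptions (not taken from the paper): positions are independent and
    the prior over offsets is uniform.
    """
    n = len(prob_matrix)
    log_scores = {}
    for offset in range(len(sequence) - n + 1):
        # Sum of log-probabilities; a small floor avoids log(0).
        log_scores[offset] = sum(
            math.log(max(prob_matrix[i].get(sequence[offset + i], 0.0), 1e-12))
            for i in range(n)
        )
    if not log_scores:
        return {}
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_scores.values())
    weights = {k: math.exp(v - peak) for k, v in log_scores.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def confident_alignments(posteriors, threshold=0.95):
    """Keep only offsets whose posterior exceeds a (hypothetical) threshold."""
    return {k: p for k, p in posteriors.items() if p >= threshold}


# Toy example: a two-residue segment whose density favors Val then Leu.
toy_matrix = [
    {"V": 0.7, "A": 0.1, "G": 0.1, "L": 0.1},
    {"L": 0.8, "V": 0.1, "K": 0.1},
]
toy_posteriors = alignment_posteriors(toy_matrix, "GAVLK")
best_offset = max(toy_posteriors, key=toy_posteriors.get)
```

In this toy case the offset that places the segment on the residues `VL` receives nearly all of the posterior weight, so it is the only one that `confident_alignments` would keep.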
Year: 2002 PMID: 12499538 PMCID: PMC2745879 DOI: 10.1107/s0907444902018048
Source DB: PubMed Journal: Acta Crystallogr D Biol Crystallogr ISSN: 0907-4449
Figure 1. Fraction of correct amino-acid side-chain assignments as a function of the probability estimated from (1). For each residue in the main-chain models for the eight structures listed in Table 1, the relative probabilities for each of the possible side chains were obtained using (1). The correct side chains were identified as the nearest amino acid in the refined model of each structure. The fraction of correct amino-acid side-chain assignments is tabulated as a function of the probability estimates.
Figure 2. Fraction of correct fragment alignments as a function of the probabilities estimated from (2). For each main-chain fragment built, the sub-fragment with the highest weighted Z score was identified as described in §2. All alignments of this sub-fragment with the protein sequence were considered and the relative probabilities of each alignment were estimated with (2). An alignment was considered correct if the residue numbers of 90% of the residues in the fragment matched those of the nearest amino acid in the refined model.
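The correctness criterion used for Figure 2 can be written as a short check. This is a sketch under stated assumptions: the caption's "90% of the residues" is read as "at least 90%", and the function name and its inputs (two equal-length lists of residue numbers, one from the alignment and one from the nearest residues in the refined model) are hypothetical.

```python
def alignment_is_correct(assigned_numbers, reference_numbers, min_fraction=0.9):
    """Return True if enough assigned residue numbers match the reference.

    assigned_numbers: residue numbers given to the fragment by the alignment.
    reference_numbers: residue numbers of the nearest amino acids in the
        refined model, in the same order.
    min_fraction: assumed reading of the 90% criterion as "at least 90%".
    """
    if not assigned_numbers or len(assigned_numbers) != len(reference_numbers):
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(a == r for a, r in zip(assigned_numbers, reference_numbers))
    return matches / len(assigned_numbers) >= min_fraction
```

For a ten-residue fragment, nine matching residue numbers would pass this check and eight would not.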
Table 1. Test structures for which side-chain models have been built with RESOLVE
| Structure | Resolution (Å) | Figure of merit | Residues in refined model | Main chain built (%) | Side chains built (%) | Correct alignment (%) | Side-chain mean coordinate error (Å) | Side-chain r.m.s. coordinate error (Å) |
|---|---|---|---|---|---|---|---|---|
| Gene 5 protein (Skinner | 2.6 | 0.62 | 87 | 61 | 11 | 100 | 1.2 | 1.4 |
| Granulocyte-stimulating factor (Rozwarski | 3.5 | 0.70 | 242 | 50 | 0 | N/A | 0 | 0 |
| Initiation factor 5A (Peat | 2.1 | 0.85 | 136 | 84 | 84 | 99 | 1.3 | 1.8 |
| β-Catenin (Huber | 2.7 | 0.72 | 455 | 81 | 62 | 100 | 1.2 | 1.7 |
| NDP kinase (Pédelacq | 2.6 | 0.56 | 556 (3 × 186) | 56 | 37 | 98 | 1.2 | 1.6 |
| Hypothetical ( | 2.6 | 0.58 | 494 (2 × 247) | 79 | 75 | 98 | 1.3 | 2.0 |
| Red fluorescent protein (Yarbrough | 2.5 | 0.91 | 936 (4 × 234) | 88 | 88 | 99 | 1.2 | 1.8 |
| 2-Aminoethylphosphonate (AEP) transaminase (Chen | 2.6 | 0.84 | 2232 (6 × 372) | 85 | 81 | 99 | 1.3 | 1.8 |
Figure 3. Effect of resolution on model building of IF5A. The phases and amplitudes for IF5A (Peat et al., 1998) after density modification were truncated at varying resolutions and the resulting maps were used for automated main-chain and side-chain model building. (a) Percentage of the main chain (closed circles) and the side chains (open circles) in the refined structure that were built. (b) R.m.s. coordinate error for main-chain (closed circles) and side-chain (open circles) atoms. Side-chain atoms include Cβ.