Jelena Čalyševa1,2, Mauno Vihinen3.
Abstract
BACKGROUND: Amino acid substitutions caused by DNA nucleotide replacements are frequently disease-causing because they affect functionally important sites. If the substituting amino acid does not fit into the protein, it causes structural alterations that are often harmful. Clashes between amino acids cause local or global structural changes. Testing the structural compatibility of variations has been difficult because no dedicated method could handle the vast amounts of variation data produced by next-generation sequencing technologies.
Keywords: Amino acid substitution; Side chain rotamers; Structural clashes; Variation interpretation
Year: 2017 PMID: 29187139 PMCID: PMC5707825 DOI: 10.1186/s12859-017-1947-7
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 The scheme of the method to identify amino acid substitutions causing clashes. Using a PDB file as input, the program iterates through all positions of interest in the structure, makes assumptions and performs calculations for every substitution of interest, and reports whether each amino acid substitution causes clashes in the structure
Fig. 2 An example of a clash between atoms caused by an amino acid substitution. Substitution of Leu98 (white) by Glu (top) in the SH2D1A protein (PDB id 1D4W) causes no clashes with the surrounding residues, while substitution by Arg (bottom) causes clashes with Ile84 and Tyr29 (indicated by circles)
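The clash test outlined in the two captions above can be illustrated with a minimal sketch. This is not the published implementation: the atom representation, the function names `find_clashes` and `substitution_clashes`, the 0.4 Å overlap tolerance, the use of Bondi van der Waals radii, and the rule that a substitution counts as clashing only when every candidate rotamer clashes are all assumptions made for illustration. The coordinates are invented toy values, not taken from any PDB entry.

```python
import math

# Bondi van der Waals radii in angstroms. Using this radii set is an assumption.
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}


def find_clashes(side_chain, environment, tolerance=0.4):
    """Return (new_atom, env_atom, distance) for every overlapping atom pair.

    Each atom is a tuple (label, element, (x, y, z)). A pair is flagged when
    the interatomic distance is shorter than the sum of the two radii minus
    `tolerance` (a hypothetical cutoff, not the published one).
    """
    clashes = []
    for label_a, elem_a, xyz_a in side_chain:
        for label_b, elem_b, xyz_b in environment:
            limit = VDW_RADII[elem_a] + VDW_RADII[elem_b] - tolerance
            distance = math.dist(xyz_a, xyz_b)
            if distance < limit:
                clashes.append((label_a, label_b, round(distance, 2)))
    return clashes


def substitution_clashes(rotamers, environment, tolerance=0.4):
    """Assumed rule: the substitution clashes only if no rotamer fits."""
    return bool(rotamers) and all(
        find_clashes(rotamer, environment, tolerance) for rotamer in rotamers
    )


# Toy data: two surrounding atoms and two candidate rotamers of one new atom.
environment = [("env:CD1", "C", (2.0, 0.0, 0.0)), ("env:OH", "O", (5.0, 0.0, 0.0))]
rotamer_a = [("new:NZ", "N", (0.0, 0.0, 0.0))]  # 2.0 A from env:CD1, overlaps
rotamer_b = [("new:NZ", "N", (0.0, 4.0, 0.0))]  # far from both atoms, fits

print(find_clashes(rotamer_a, environment))
print(substitution_clashes([rotamer_a, rotamer_b], environment))
```

In this toy case the first rotamer overlaps one surrounding atom and the second fits, so under the assumed rule the substitution would be reported as tolerated. A real run would read atom coordinates from the PDB file and build side chain conformations from a rotamer library.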
Number of predicted clashes by amino acid types in PDB structures that tolerate substitutions
| Amino acid | PON-SC number | PON-SC (%) | SCWRL+Probe number | SCWRL+Probe (%) | Both^a number | Both^a (%) | Total^b |
|---|---|---|---|---|---|---|---|
| Alanine | 0 | 0 | 0 | 0 | 0 | 0 | 1165 |
| Arginine | 86 | 25.52 | 11 | 3.26 | 15 | 4.45 | 337 |
| Asparagine | 37 | 8.22 | 112 | 24.89 | 7 | 1.56 | 450 |
| Aspartic acid | 38 | 8.35 | 125 | 27.47 | 13 | 2.85 | 455 |
| Cysteine | 0 | 0 | 4 | 1.32 | 0 | 0 | 304 |
| Glutamic acid | 92 | 18.70 | 42 | 8.54 | 42 | 8.54 | 492 |
| Glutamine | 42 | 15 | 31 | 11.07 | 22 | 7.86 | 280 |
| Glycine | 0 | 0 | 0 | 0 | 0 | 0 | 393 |
| Histidine | 74 | 21.70 | 45 | 13.20 | 38 | 11.14 | 341 |
| Isoleucine | 94 | 33.45 | 38 | 13.52 | 42 | 14.95 | 281 |
| Leucine | 96 | 25.26 | 39 | 10.26 | 27 | 7.11 | 380 |
| Lysine | 53 | 19.41 | 6 | 2.20 | 5 | 1.83 | 273 |
| Methionine | 93 | 32.63 | 11 | 3.86 | 10 | 3.51 | 285 |
| Phenylalanine | 86 | 18.86 | 118 | 25.88 | 95 | 20.83 | 456 |
| Proline | 90 | 76.92 | 0 | 0 | 0 | 0 | 117 |
| Serine | 5 | 0.89 | 4 | 0.71 | 0 | 0 | 561 |
| Threonine | 78 | 24.68 | 30 | 9.49 | 28 | 8.86 | 316 |
| Tryptophan | 54 | 29.19 | 29 | 15.68 | 58 | 31.35 | 185 |
| Tyrosine | 110 | 30.05 | 76 | 20.77 | 71 | 19.40 | 366 |
| Valine | 122 | 34.08 | 41 | 11.45 | 42 | 11.73 | 358 |
^a Does not include cases listed in the PON-SC and SCWRL+Probe columns. ^b Total number of substitutions in the dataset.
Validation of the method performance
| Study | TP | FP | TN | FN | Total | NPV | PPV | Sensitivity | Specificity | Accuracy | MCC |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CD40LG | 9 | 1 | 18 | 4 | 32 | 0.75 | 0.93 | 0.69 | 0.95 | 0.82 | 0.66 |
| SH2 | 13 | 25 | 46 | 15 | 99 | 0.54 | 0.57 | 0.46 | 0.65 | 0.55 | 0.11 |
| ELANE | 16 | 1 | 2 | 4 | 23 | 0.77 | 0.70 | 0.80 | 0.65 | 0.73 | 0.46 |
| TP53 | 27 | 19 | 102 | 16 | 164 | 0.69 | 0.80 | 0.63 | 0.84 | 0.74 | 0.48 |
| CANCER | 7 | 5 | 16 | 3 | 31 | 0.72 | 0.75 | 0.71 | 0.76 | 0.74 | 0.47 |
| Total/Average | 72 | 51 | 184 | 42 | 349 | 0.69 | 0.74 | 0.66 | 0.77 | 0.71 | 0.43 |
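The measures in the validation table can be recomputed from the TP, FP, TN and FN counts. The sketch below is hedged: sensitivity and specificity follow their standard definitions, but the class-balanced (normalised) form used for NPV, PPV, accuracy and MCC is an assumption inferred from the reported numbers, since this excerpt does not state how they were calculated. The function name `performance` is hypothetical. Under that assumption the CD40LG row is reproduced to two decimals; other rows may differ in the last digit.

```python
import math


def performance(tp, fp, tn, fn):
    """Compute class-balanced performance measures from confusion counts.

    Assumption: predictive values, accuracy and MCC are calculated after
    rescaling the positive and negative classes to equal size.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Normalised counts: each class is rescaled to a total of 1.
    n_tp, n_fn = sensitivity, 1 - sensitivity
    n_tn, n_fp = specificity, 1 - specificity
    ppv = n_tp / (n_tp + n_fp)
    npv = n_tn / (n_tn + n_fn)
    accuracy = (n_tp + n_tn) / 2
    mcc = (n_tp * n_tn - n_fp * n_fn) / math.sqrt(
        (n_tp + n_fp) * (n_tp + n_fn) * (n_tn + n_fp) * (n_tn + n_fn)
    )
    values = {
        "NPV": npv,
        "PPV": ppv,
        "Sensitivity": sensitivity,
        "Specificity": specificity,
        "Accuracy": accuracy,
        "MCC": mcc,
    }
    return {name: round(value, 2) for name, value in values.items()}


# CD40LG row: TP = 9, FP = 1, TN = 18, FN = 4
print(performance(9, 1, 18, 4))
# {'NPV': 0.75, 'PPV': 0.93, 'Sensitivity': 0.69, 'Specificity': 0.95, 'Accuracy': 0.82, 'MCC': 0.66}
```

Balancing the classes keeps a dataset dominated by tolerated or by harmful substitutions from inflating the predictive values, which may be why the tabulated PPV for CD40LG (0.93) differs from the raw ratio 9/10.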