Broto Chakrabarty, Nita Parekh.
Abstract
BACKGROUND: Tandem repetition of structural motifs in proteins is frequently observed across all forms of life. The topology of the repeating unit and its frequency of occurrence are associated with a wide range of structural and functional roles in diverse proteins, and defects in repeat proteins have been associated with a number of diseases. It is thus desirable to accurately identify the specific repeat type and its copy number. Weak evolutionary constraints on repeat units, together with insertions/deletions between them, make their identification difficult at the sequence level, so structure-based approaches are desired. The proposed graph spectral approach represents the protein structure as a graph in order to detect one of the most frequently observed structural repeats, the Ankyrin repeat.
Year: 2014 PMID: 25547411 PMCID: PMC4307672 DOI: 10.1186/s12859-014-0440-9
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Ankyrin repeat motif. (a) The second copy of the ANK structural motif in designed protein 1N0R. (b) Schematic diagram showing the secondary structure arrangement in the ANK motif. (c) The principal eigenvector of the adjacency matrix plotted for the ANK motif in (a).
Figure 2. Designed protein 1N0R (chain A). (a) 3-D structure, and (b) protein contact network.
Figure 3. Plot of the principal eigenvector of the adjacency matrix (A) for designed protein 1N0R. (a) The principal eigenvector of the adjacency matrix (A) for designed protein 1N0R. The start and end of each repeat are indicated by dotted and solid lines, respectively. (b) Overlap of the A profile for the repeat regions. The points in different shapes correspond to the secondary structure elements.
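The graph representation referred to in these captions can be illustrated with a minimal sketch. It assumes that residues are nodes located at their Cα coordinates and that two residues are connected when they lie within a distance cutoff. The 7 Å default, the function names, and the toy coordinates are illustrative assumptions, not values taken from the source. The principal eigenvector is obtained by power iteration, a generic technique chosen here only to keep the example free of external dependencies.

```python
import math

def contact_matrix(coords, cutoff=7.0):
    """Adjacency matrix: 1 if two points lie within `cutoff` (assumed value), else 0."""
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i][j] = adj[j][i] = 1
    return adj

def principal_eigenvector(adj, iterations=500, tol=1e-10):
    """Eigenvector of the largest eigenvalue of `adj`, by power iteration.

    Iterating with (A + I) instead of A leaves the eigenvectors unchanged
    but avoids oscillation on bipartite graphs such as a simple chain.
    """
    n = len(adj)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        nxt = [vec[i] + sum(adj[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        nxt = [x / norm for x in nxt]
        if max(abs(a - b) for a, b in zip(nxt, vec)) < tol:
            return nxt
        vec = nxt
    return vec

# Toy input (hypothetical): five points on a line, 3.8 apart,
# so with cutoff=4.0 only consecutive points are in contact.
coords = [(3.8 * i, 0.0, 0.0) for i in range(5)]
profile = principal_eigenvector(contact_matrix(coords, cutoff=4.0))
print([round(x, 3) for x in profile])
```

For this toy chain the profile is symmetric and peaks at the central node. On a real structure the same per-residue profile would be plotted against residue number, as in panel (a) of the figure.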
Characteristic features of Ankyrin repeat proteins
| | | | | | | |
|---|---|---|---|---|---|---|
| | 12 | 8 | 21 | 34 | 15 | 49 |
| | 3 | 1 | 4 | 13 | 5 | 19 |
| | 7.7 | 2.4 | 9.8 | 20 | 8.6 | 31.2 |
Variations in the length of the secondary structure elements and in the distance between two peaks in the A profile of the ANK motif are summarized for a dataset of 58 non-redundant protein structures.
Prediction of repeat regions for a representative set of 15 proteins compared with UniProt annotation, RADAR and ConSole output
| | | | | | |
|---|---|---|---|---|---|
| 3ANK | 1N0Q (A) | - | 3-35, 36–68, 69-92 | 21-52 | 3-33, 34–67, 68-93 |
| 4ANK | 1N0R (A) | - | 1-33, 34–66, 67–99, 100-125 | 21-52, 53-84 | 1-33, 34–66, 67–100, 101-126 |
| Protein phosphatase 1 regulatory subunit 12A | 1S70 (B) | 39-68, 72–101, 105–134, 138–164, 198–227, 231-260 | 47-73, 77–106, 110–139, 203–232, 236-265 | 44-74, 75–105, 106–136, 137–167, 175–205, 206–236, 237-267 | 36-71, 72–104, 105–138, 198–231, 232-267 |
| Mouse GABP α/β domain | 1AWC (B) | 5-34, 37–66, 70–99, 103–132, 136-166 | 13-34, 40–67, 73–100, 106-133 | 16-47, 48–79, 80-111 | 5-36, 37–69, 70–103, 104–136, 137-157 |
| TRPV6 Ankyrin repeat domain | 2RFA (A) | 44-74, 78–107, 116–145, 162–191, 195–236, 238-267 | 46-91, 93–129, 164-208 | 117-132 | 44-77, 78–114, 116–142, 162–194, 195–237, 238-265 |
| Yeast Nas6p complex with proteasome subunit, rpt3 | 2DZN (A) | 1-30, 35–64, 71–100, 106–135, 139–168, 173-203 | 5-37, 38–70, 74–106, 109–141, 142-175 | 13-44, 45–76, 77–108, 112–143, 144–175, 176-207 | 3-34, 35–70, 71–105, 106–138, 139–172, 173–207, 208-228 |
| D34 of human Ankyrin-R | 1N11 (A) | 403-432, 436–465, 469–498, 502–531, 535–564, 568–597, 601–630, 634–663, 667–696, 700–729, 733–762, 766-795 | 406-431, 438–458, 471–487, 504–530, 537–563, 570–596, 603–629, 636–662, 669–695, 702–728, 735-761 | 415-446, 447–478, 479–510, 511–542, 543–574, 575–606, 607–638, 639–670, 671–702, 703–734, 735–766, 767-802 | 405-432, 436–458, 469–487, 503–535, 536–567, 568–600, 601–633, 634–666, 667–699, 700–733, 734–766, 767-796 |
| Human gankyrin | 1UOH (A) | 3-36, 37–69, 70–102, 103–135, 136–168, 169–201, 202-226 | 49-78, 82–111, 115–144, 148–177, 181-210 | 23-53, 54–84, 85–115, 116–146, 147–177, 178-208 | 5-38, 39–71, 72–105, 106–137, 138–170, 171–204, 205-226 |
| P53-53BP2 complex | 1YCS (B) | 958-990, 991-1023 | 959-987, 992-1020 | 337-352, 398-413 | 926-957, 958–990, 991–1024, 1025-1067 |
| Tumor Suppressor P15(INK4B) | 1D9S (A) | 5-34, 38–66, 71–100, 104-130 | 24-44, 56–86, 88-119 | 78-111 | 71-104, 105-129 |
| Notch | 2F8Y (A) | 1927-1956, 1960–1990, 1994–2023, 2027–2056, 2060-2089 | 1884-1930, 1931–1963, 1964–1997, 1998–2030, 2031–2063, 2064-2096 | 1919-1950, 1951–1982, 1983–2014, 2015–2046, 2047–2078, 2079-2110 | 1928-1960, 1961–1994, 1995–2027, 2028–2060, 2061–2094, 2095-2122 |
| S. cerevisiae Swi6 Ankyrin repeat fragment | 1SW6 (A) | 318-346, 347–383, 384–469, 470–498, 499-514 | 348-391, 467-507 | 316-323, 324–331, 470-477 | - |
| Human Osteoclast Stimulating Factor | 3EHQ (A) | 72-101, 105–135, 139-168 | 94-126, 128-159 | 83-114, 115–147, 148-180 | 72-104, 105–138, 139-177 |
| Ankyrin repeat domain of Huntingtin interacting protein 14 | 3EU9 (A) | 89-118, 123–155, 156–188, 189–219, 224-253 | 64-90, 94–123, 128–157, 161–190, 194–225, 229-252 | 70-101, 102–133, 134–165, 166–197, 198–229, 230-261 | 57-88, 89–122, 123–156, 157–189, 190–223, 224–257, 258-281 |
| ANKRA | 3SO8 (A) | 148-180, 181–213, 214–246, 247–279, 280-313 | 34-89, 100-155 | 29-60, 61–92, 93–124, 125-156 | 149-180, 181–213, 214–246, 247–280, 281-310 |
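How closely two sets of repeat boundaries agree can be quantified with a simple interval overlap. The sketch below is one possible measure (Jaccard overlap of inclusive residue ranges), not a measure stated in the source. It uses the first and last range columns of the 3EHQ row purely as example input, and the names `list_a`, `list_b`, `overlap_fraction`, and `best_matches` are hypothetical.

```python
def overlap_fraction(a, b):
    """Jaccard overlap of two inclusive residue ranges given as (start, end)."""
    inter = min(a[1], b[1]) - max(a[0], b[0]) + 1
    if inter <= 0:
        return 0.0
    union = (a[1] - a[0] + 1) + (b[1] - b[0] + 1) - inter
    return inter / union

def best_matches(reference, predicted):
    """Pair each reference range with the predicted range that overlaps it most."""
    results = []
    for ref in reference:
        best = max(predicted, key=lambda p: overlap_fraction(ref, p), default=None)
        score = overlap_fraction(ref, best) if best is not None else 0.0
        results.append((ref, best, score))
    return results

# Ranges copied from the 3EHQ row of the table (first and last range columns).
list_a = [(72, 101), (105, 135), (139, 168)]
list_b = [(72, 104), (105, 138), (139, 177)]

for ref, match, score in best_matches(list_a, list_b):
    print(ref, match, round(score, 2))
```

A score of 1.0 means identical boundaries and 0.0 means no shared residues. Ranges that share a start residue but differ in length, as in this row, score below 1.0 in proportion to the extra residues.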
Figure 4. MSA of the predicted repeat regions for 1N0R. (a) Predicted by the proposed approach, (b) RADAR output, and (c) ConSole output.
Figure 5. Natural Ankyrin repeat protein 3EHQ (chain A). (a) 3-D structure, and (b) the eigenvector components corresponding to the largest eigenvalue of the adjacency matrix (A).
Figure 6. MSA of the repeat regions in protein 3EHQ. (a) Predicted by the proposed approach, (b) annotated in the UniProt database, and (c) predicted by ConSole.
Figure 7. Natural Ankyrin repeat protein 3EU9 (chain A). (a) 3-D structure. (b) Plot of the principal eigenvector of the adjacency matrix. (c)-(d) Structural alignment of the extra Ankyrin repeat copy predicted in 3EU9 (shown in blue) with a repeat copy of designed protein 1N0R (shown in red).
Figure 8. MSA of the repeat regions in protein 3EU9. (a) Predicted by the proposed approach, and (b) annotated in the UniProt database.
Figure 9. Secondary structure representation of Ankyrin repeat protein 1D9S (chain A) from PDBsum.
Performance of the proposed approach
| | | | | | |
|---|---|---|---|---|---|
| 122 | 0 | 3 | 245 | 0.976 | 1 |

| | | | | | |
|---|---|---|---|---|---|
| | 515 | 67 | 69 | 0.88 | 0.88 |
| | 419 | 109 | 165 | 0.72 | 0.79 |
| | 395 | 63 | 189 | 0.68 | 0.86 |
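The ratios in the performance table can be reproduced from the counts with the standard definitions of sensitivity, specificity, and precision. The sketch below assumes that the counts are, in order, true positives, false positives, and false negatives, with true negatives as the fourth value of the first row. This reading is an assumption, adopted because it reproduces every tabulated ratio; the function names are illustrative.

```python
def sensitivity(tp, fn):
    """Fraction of actual positives that were detected: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of actual negatives that were rejected: TN / (TN + FP)."""
    return tn / (tn + fp)

def precision(tp, fp):
    """Fraction of predictions that were correct: TP / (TP + FP)."""
    return tp / (tp + fp)

# First row, read as TP=122, FP=0, FN=3, TN=245 (assumed column order).
print(round(sensitivity(122, 3), 3), specificity(245, 0))

# Remaining rows, read as (TP, FP, FN) (assumed column order).
for tp, fp, fn in [(515, 67, 69), (419, 109, 165), (395, 63, 189)]:
    print(round(sensitivity(tp, fn), 2), round(precision(tp, fp), 2))
```

Under this reading, TP + FN equals 584 in each of the three rows, which is consistent with all three being evaluated against the same set of reference repeats.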
Figure 10. Predicted Ankyrin repeat protein 1OUV (chain A). (a) Secondary structure representation from PDBsum. (b) Structural alignment of the predicted ANK repeat copy (shown in blue) with a repeat copy of designed ANK protein 1N0R (shown in orange). (c) A plot with dotted and solid lines showing the start and end of the predicted ANK boundaries.
Example proteins with binding sites in the predicted Ankyrin repeat region
| | | | | |
|---|---|---|---|---|
| 3HWT | Q9UGP5 | Homo sapiens | 257-291, 292-319 | DNA |
| 1FO3 | Q9UKM7 | Homo sapiens | 285-319, 324–381, 397–444, 458-504 | Kifunensine and Sulphate |
| 1KRF | P31723 | Penicillium citrinum | 263-308, 325-381 | Kifunensine, N-Acetyl-D-Glucosamine and Mannose |
| 2IQC | Q9NPI8 | Homo sapiens | 253-299, 300-341 | Hg (Mercury) |
| 3Q0P* | Q14671 | Homo sapiens | 894-923, 929–959, 965-995 | RNA |
| 3K4E* | Q07807 | S. cerevisiae | 539-576, 584-613 | RNA |
| 3V71* | O44169 | C. elegans | 203-230, 240-270 | RNA |
| 4F42* | P07174 | R. norvegicus | 19-46, 60-85 | MNB |
| 2VTB | Q84KJ5 | A. thaliana | 378-408, 412-445 | FAD and MHF |
| 3FY4 | O48652 | A. thaliana | 343-378, 379-417 | FAD, IMD and MES |
| 4LCT | P45432 | A. thaliana | 221-252, 262-294 | Sulphate |
*Multi-repeat protein in which the predicted repeat region overlaps another repeat annotated in UniProt.
Figure 11. Prediction on modelled structures. (a) Integrin-linked protein kinase (UniProt Id: Q99J82). The repeat boundaries of the five Ankyrin motifs predicted by AnkPred (shown in different colours) are in good agreement with the five copies annotated in UniProt. (b) ANKRD protein (UniProt Id: Q7Z3H0). In this case only 3 Ankyrin motifs are annotated in UniProt (intermediate copies), while AnkPred predicts two additional copies on either side.
Figure 12. Proteins of other structural repeat families. (a)-(d) 3-D structures: (a) 2C2L, chain A (TPR); (b) 3SL9, chain A (ARM); (c) 1D0B, chain A (LRR); (d) 1U6D, chain X (KELCH). (e)-(h) The A plot for the respective proteins. (i)-(l) Overlap of the A profile of the repeat regions in the respective proteins.