Zexian Liu, Fang Yuan, Jian Ren, Jun Cao, Yanhong Zhou, Qing Yang, Yu Xue.
Abstract
The anaphase-promoting complex/cyclosome (APC/C), an E3 ubiquitin ligase incorporated with Cdh1 and/or Cdc20, recognizes and interacts with specific substrates and orchestrates cell cycle events by targeting proteins for proteasomal degradation. Experimental identification of APC/C substrates depends largely on the discovery of APC/C recognition motifs, e.g., the D-box and KEN-box. Although a number of stringent or loosely defined motifs have been proposed, these motif patterns are of limited use because of their insufficient predictive power. We report the development of a novel software package, GPS-ARM, for the prediction of D-boxes and KEN-boxes in proteins. Using experimentally identified D-boxes and KEN-boxes as the training data sets, a previously developed GPS (Group-based Prediction System) algorithm was adopted. By extensive evaluation and comparison, the performance of GPS-ARM was found to be much better than that of simple motifs. With this tool, we predicted 4,841 potential D-boxes in 3,832 proteins and 1,632 potential KEN-boxes in 1,403 proteins from H. sapiens, while further statistical analysis suggested that both the D-box and KEN-box proteins are involved in a broad spectrum of biological processes beyond the cell cycle. In addition, with the co-localization information, we predicted hundreds of mitosis-specific APC/C substrates with high confidence. As the first computational tool for the prediction of APC/C-mediated degradation, GPS-ARM provides useful information for further experimental investigations. GPS-ARM is freely accessible for academic researchers at: http://arm.biocuckoo.org.
Year: 2012 PMID: 22479614 PMCID: PMC3315528 DOI: 10.1371/journal.pone.0034370
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. The sequence logos of ARM(2, 6) and ARM(8, 15) were generated by the HMM-Logo (LogoMat-M) for the (A) D-box and (B) KEN-box, respectively.
Figure 2. Screen snapshot of the GPS-ARM 1.0 software.
The default thresholds were chosen for the D-box (high) and KEN-box (low). As an example, the prediction results for the human centromere protein F/CENP-F (UniProt ID: P49454) are shown.
Figure 3. The prediction performance of GPS-ARM 1.0.
The LOO validation and 4-, 6-, 8- and 10-fold cross-validations were performed for (A) the D-box and (B) the KEN-box.
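The validation schemes named in this caption can be sketched as index splits. The following is a minimal illustration only, assuming a shuffled partition into k folds; the function names `kfold_indices`, `train_test_splits` and `loo_splits` and the `seed` parameter are hypothetical and are not taken from GPS-ARM.

```python
import random


def kfold_indices(n_items, k, seed=0):
    """Partition indices 0..n_items-1 into k folds of near-equal size."""
    if not 1 < k <= n_items:
        raise ValueError("k must satisfy 1 < k <= n_items")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_items, k, seed=0):
    """Yield (train, test) index lists; each fold is held out exactly once."""
    folds = kfold_indices(n_items, k, seed)
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


def loo_splits(n_items):
    """Leave-one-out validation is the special case k == n_items."""
    return train_test_splits(n_items, n_items)


if __name__ == "__main__":
    for k in (4, 6, 8, 10):
        sizes = [len(test) for _, test in train_test_splits(60, k)]
        print(k, sizes)
```

In this sketch, LOO validation holds out a single site per round, whereas the n-fold schemes hold out roughly 1/n of the data per round.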
Performance evaluation and comparison of the GPS-ARM with known motifs.
| Method | Threshold | Ac (%) | Pr (%) | Sn (%) | Sp (%) | MCC |
|---|---|---|---|---|---|---|
| D-box (GPS-ARM) | High | 87.29 | 82.46 | 63.51 | 95.39 | 0.6463 |
| | Medium | 80.76 | 61.54 | 64.86 | 86.18 | 0.5018 |
| | Low | 76.63 | 53.26 | 66.22 | 80.18 | 0.4346 |
| | Sp matched to Motif-D1 | 81.10 | 95.24 | 27.03 | 99.54 | 0.4471 |
| KEN-box (GPS-ARM) | High | 86.67 | 100.00 | 81.82 | 100.00 | 0.7385 |
| | Medium | 91.67 | 100.00 | 88.64 | 100.00 | 0.8218 |
| | Low | 95.00 | 100.00 | 93.18 | 100.00 | 0.8858 |
| Motif-D1 | - | 77.32 | 90.00 | 12.16 | 99.54 | 0.2797 |
| Motif-D2 | - | 75.95 | 56.25 | 24.32 | 93.55 | 0.2488 |
| Motif-D3 | - | 76.29 | 53.25 | 55.41 | 83.41 | 0.3832 |
| Motif-KEN | - | 38.33 | 100.00 | 15.91 | 100.00 | 0.2192 |
For the construction of the GPS-ARM software package, the three thresholds of high, medium and low were selected for D-box and KEN-box, respectively.
Motif-D1, RXXLXX-I/V-XN [3];
Motif-D2, RXXLXXXXN [4], [5], [16];
Motif-D3, RXXLXX-L/I/V/M [30];
Motif-KEN, KENXXX-N/D [3], [22];
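
The four simple motifs listed above can be written as regular expressions. This is a sketch under stated assumptions: X is read as any residue and a slash-separated group such as I/V as a set of alternatives; the `MOTIFS` dictionary, the `scan_motifs` function and the toy sequence are hypothetical and are not part of GPS-ARM.

```python
import re

# Regular-expression reading of the motif definitions (assumed notation:
# X = any residue, "I/V" = one residue from the listed alternatives).
MOTIFS = {
    "Motif-D1": "R..L..[IV].N",
    "Motif-D2": "R..L....N",
    "Motif-D3": "R..L..[LIVM]",
    "Motif-KEN": "KEN...[ND]",
}


def scan_motifs(sequence, motifs=MOTIFS):
    """Return (motif name, 1-based start, matched segment) for every hit."""
    sequence = sequence.upper()
    hits = []
    for name, pattern in motifs.items():
        # The lookahead wrapper lets overlapping occurrences be reported.
        for match in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((name, match.start() + 1, match.group(1)))
    return hits


if __name__ == "__main__":
    toy_sequence = "MARAALGDIGNKENSPLDAA"  # made-up sequence, not a real protein
    for hit in scan_motifs(toy_sequence):
        print(hit)
```

A scan of this kind flags every occurrence of the pattern regardless of context, which is consistent with the low sensitivity or low precision reported for the simple motifs in the table.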
For comparison, we fixed the Sp value of GPS-ARM so as to be identical with Motif-D1.
The predicted D-boxes and KEN-boxes in five eukaryotic organisms.
| Method | | | | | | H. sapiens | Total |
|---|---|---|---|---|---|---|---|
| | Box | 12,814 | 7,555 | 9,520 | 51,065 | 63,018 | 143,972 |
| | Pro. | 4,723 | 2,519 | 2,569 | 13,503 | 16,477 | 39,791 |
| | Box | 118 | 51 | 39 | 216 | 247 | 671 |
| | Pro. | 117 | 51 | 39 | 213 | 244 | 664 |
| | Box | 817 | 415 | 445 | 1,746 | 2,123 | 5,546 |
| | Pro. | 732 | 374 | 395 | 1,581 | 1,916 | 4,998 |
| | Box | 3,154 | 1,849 | 2,442 | 12,478 | 15,138 | 35,061 |
| | Pro. | 2,207 | 1,192 | 1,411 | 7,523 | 8,963 | 21,296 |
| D-box (GPS-ARM) | Box | 1,104 | 635 | 815 | 4,022 | 4,841 | 11,417 |
| | Pro. | 958 | 517 | 638 | 3,221 | 3,832 | 9,166 |
| | Box | 1,045 | 362 | 387 | 1,966 | 2,683 | 6,443 |
| | Pro. | 891 | 331 | 340 | 1,668 | 2,206 | 5,436 |
| | Box | 143 | 42 | 35 | 159 | 222 | 601 |
| | Pro. | 138 | 42 | 34 | 157 | 220 | 591 |
| KEN-box (GPS-ARM) | Box | 641 | 227 | 241 | 1,191 | 1,632 | 3,932 |
| | Pro. | 571 | 216 | 217 | 1,052 | 1,403 | 3,459 |
Box, the number of the predicted boxes;
Pro., the number of the predicted D-box or KEN-box proteins.
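
The distinction between the two counts can be illustrated with a short sketch: Box counts every predicted site, whereas Pro. counts each protein once however many sites it carries, so Box is never smaller than Pro. The function name `summarize_predictions` and the protein identifiers are hypothetical.

```python
def summarize_predictions(hits):
    """Count predicted boxes and distinct proteins.

    hits: iterable of (protein_id, position) pairs, one per predicted box.
    """
    unique_hits = set(hits)  # guard against duplicated records
    return {
        "Box": len(unique_hits),
        "Pro.": len({protein_id for protein_id, _ in unique_hits}),
    }


if __name__ == "__main__":
    # Made-up identifiers and positions, for illustration only.
    example = [("PROT_A", 10), ("PROT_A", 250), ("PROT_B", 33)]
    print(summarize_predictions(example))
```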
Statistical analysis of the functional abundance and diversity of the D-box and the KEN-box proteins in H. sapiens.
| Description of GO term | D- or KEN-box Num. | D- or KEN-box Per. | Proteome Num. | Proteome Per. | E-ratio | p-value |
|---|---|---|---|---|---|---|
| D-box proteins | | | | | | |
| Regulation of small GTPase mediated signal transduction (GO:0051056) | 70 | 2.02% | 169 | 0.92% | 2.18 | 1.19E-11 |
| Protein phosphorylation (GO:0006468) | 116 | 3.34% | 371 | 2.03% | 1.65 | 7.27E-09 |
| Regulation of Rho protein signal transduction (GO:0035023) | 35 | 1.01% | 72 | 0.39% | 2.56 | 1.20E-08 |
| Microtubule-based movement (GO:0007018) | 39 | 1.12% | 93 | 0.51% | 2.21 | 2.74E-07 |
| Axon guidance (GO:0007411) | 93 | 2.68% | 303 | 1.66% | 1.62 | 5.54E-07 |
| Intracellular signal transduction (GO:0035556) | 82 | 2.36% | 269 | 1.47% | 1.61 | 3.40E-06 |
| G2/M transition of mitotic cell cycle (GO:0000086) | 41 | 1.18% | 110 | 0.60% | 1.96 | 5.28E-06 |
| Mitotic cell cycle (GO:0000278) | 90 | 2.59% | 306 | 1.67% | 1.55 | 5.95E-06 |
| Intracellular protein kinase cascade (GO:0007243) | 35 | 1.01% | 89 | 0.49% | 2.07 | 6.38E-06 |
| Cell adhesion (GO:0007155) | 141 | 4.06% | 546 | 2.99% | 1.36 | 4.11E-05 |
| Positive regulation of Rho GTPase activity (GO:0032321) | 13 | 0.37% | 23 | 0.13% | 2.98 | 6.84E-05 |
| Peptidyl-serine phosphorylation (GO:0018105) | 20 | 0.58% | 45 | 0.25% | 2.34 | 8.14E-05 |
| Mitotic metaphase/anaphase transition (GO:0007091) | 9 | 0.26% | 13 | 0.07% | 3.65 | 1.08E-04 |
| Nerve growth factor receptor signaling pathway (GO:0048011) | 63 | 1.82% | 215 | 1.18% | 1.54 | 1.56E-04 |
| Regulation of glucose transport (GO:0010827) | 15 | 0.43% | 31 | 0.17% | 2.55 | 1.98E-04 |
| KEN-box proteins | | | | | | |
| Cell cycle (GO:0007049) | 71 | 5.56% | 416 | 2.28% | 2.44 | 1.64E-12 |
| Cell division (GO:0051301) | 50 | 3.92% | 267 | 1.46% | 2.68 | 1.20E-10 |
| Mitotic cell cycle (GO:0000278) | 54 | 4.23% | 306 | 1.67% | 2.53 | 2.21E-10 |
| M phase of mitotic cell cycle (GO:0000087) | 26 | 2.04% | 93 | 0.51% | 4.00 | 5.69E-10 |
| Microtubule-based movement (GO:0007018) | 24 | 1.88% | 93 | 0.51% | 3.70 | 1.48E-08 |
| Flavonoid metabolic process (GO:0009812) | 8 | 0.63% | 11 | 0.06% | 10.41 | 7.55E-08 |
| Mitosis (GO:0007067) | 33 | 2.59% | 181 | 0.99% | 2.61 | 3.22E-07 |
| Golgi to plasma membrane protein transport (GO:0043001) | 6 | 0.47% | 7 | 0.04% | 12.27 | 7.55E-07 |
| Flavone metabolic process (GO:0051552) | 5 | 0.39% | 5 | 0.03% | 14.32 | 1.65E-06 |
| Sulfation (GO:0051923) | 6 | 0.47% | 8 | 0.04% | 10.74 | 2.84E-06 |
| Mitotic prometaphase (GO:0000236) | 19 | 1.49% | 85 | 0.47% | 3.20 | 4.81E-06 |
| Cell adhesion (GO:0007155) | 64 | 5.02% | 546 | 2.99% | 1.68 | 3.08E-05 |
| G2/M transition of mitotic cell cycle (GO:0000086) | 20 | 1.57% | 110 | 0.60% | 2.60 | 6.78E-05 |
| Axon guidance (GO:0007411) | 40 | 3.13% | 303 | 1.66% | 1.89 | 7.58E-05 |
| Protein targeting to lysosome (GO:0006622) | 5 | 0.39% | 8 | 0.04% | 8.95 | 7.72E-05 |
The top 15 most over-represented biological processes are shown.
Num., the number of proteins annotated;
Per., the proportion of proteins annotated;
E-ratio, enrichment ratio, the D-box or KEN-box proportion divided by the human proteome proportion.
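
The enrichment ratio defined above can be sketched as follows. The statistical test behind the final column is not stated in this section; the hypergeometric upper tail shown here is one common choice for over-representation analysis and is an assumption, as are the function names and the numbers in the example.

```python
from math import comb


def enrichment_ratio(subset_hits, subset_total, proteome_hits, proteome_total):
    """Proportion annotated in the subset divided by the proteome proportion."""
    return (subset_hits / subset_total) / (proteome_hits / proteome_total)


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for n proteins drawn without replacement from N proteins,
    K of which carry the annotation (assumed test, not confirmed by the source)."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


if __name__ == "__main__":
    # Hypothetical counts: 10 of 100 subset proteins versus 50 of 1,000 overall.
    print(enrichment_ratio(10, 100, 50, 1000))
    print(hypergeom_upper_tail(10, 100, 50, 1000))
```

Dividing the rounded percentages in the table can differ slightly from the tabulated E-ratio (for example, 2.02/0.92 is about 2.20 against a tabulated 2.18), presumably because the published ratios were computed from unrounded proportions.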
Statistical results of the potential D-box and KEN-box substrates predicted from the MiCroKit proteins.
| Organism | MiCroKit proteins | D-box Pro. | D-box Box | D-box E-ratio | D-box p-value | KEN-box Pro. | KEN-box Box | KEN-box E-ratio | KEN-box p-value |
|---|---|---|---|---|---|---|---|---|---|
| | 266 | 74 | 93 | 1.92 | 5.93E-09 | 58 | 70 | 2.53 | 1.31E-11 |
| | 95 | 30 | 37 | 2.04 | 5.24E-05 | 21 | 23 | 3.41 | 3.55E-07 |
| | 112 | 50 | 80 | 2.19 | 3.19E-09 | 30 | 35 | 3.86 | 2.67E-11 |
| | 132 | 52 | 83 | 2.00 | 1.22E-07 | 24 | 32 | 2.83 | 3.17E-06 |
| | 677 | 215 | 315 | 1.68 | 3.25E-16 | 101 | 138 | 2.15 | 1.51E-13 |
| Total | 1,282 | 421 | 608 | 1.78 | 5.50E-36 | 234 | 298 | 2.62 | 1.18E-42 |
Pro., the number of predicted D-box or KEN-box proteins;
Box, the number of predicted D-boxes or KEN-boxes;
E-ratio, enrichment ratio, the MiCroKit D-box or KEN-box proportion in comparison with the proteomic proportion.