Le-Le Hu, Tao Huang, Yu-Dong Cai, Kuo-Chen Chou.
Abstract
Determining which body fluids a secreted protein can enter is important for protein function annotation and disease biomarker discovery. In this study, we developed a network-based method to predict the body fluids into which human proteins can be secreted. On a newly constructed benchmark dataset of 529 human secreted proteins, the jackknife-test accuracy of the top-ranked (most likely) body fluid predicted by our method was 79.02%, significantly higher than the success rate of a random guess (29.36%). The likelihood that the four top-ranked predicted body fluids contain all the true body fluids into which a protein can be secreted is 62.94%. The method was further tested on two independent datasets: one contains 57 proteins that can be secreted into blood, and the other contains 61 proteins that can be secreted into plasma/serum and are possible biomarkers associated with various cancers. Of the 57 proteins in the first dataset, 55 were correctly predicted as blood-secreted proteins. Of the 61 proteins in the second dataset, 58 were predicted to be most likely secreted into plasma/serum. These encouraging results indicate that the network-based prediction method is quite promising. It is anticipated that the method will benefit both basic research and drug development.
Year: 2011 PMID: 21829572 PMCID: PMC3146524 DOI: 10.1371/journal.pone.0022989
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
A breakdown of the 529 human secreted proteins in the training dataset according to the 11 different types of body fluids into which they can be secreted.
| Type | Body fluid | Number of proteins in dataset |
| 1 | Amniotic fluid | 192 |
| 2 | Bronchoalveolar lavage fluid | 65 |
| 3 | Cerebrospinal fluid | 204 |
| 4 | Milk | 71 |
| 5 | Nipple aspiration fluid | 37 |
| 6 | Plasma/Serum | 418 |
| 7 | Saliva | 175 |
| 8 | Seminal fluid | 155 |
| 9 | Synovial fluid | 63 |
| 10 | Tear | 84 |
| 11 | Urine | 244 |
| Sum | | 1,708 |
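The sum in the last row exceeds 529 because a protein that can be secreted into several body fluids is counted once per fluid. A minimal Python sketch of that tally, using the counts from the table (the variable names are illustrative, not from the paper):

```python
# Counts copied from the table above: body fluid -> number of proteins.
counts = {
    "Amniotic fluid": 192,
    "Bronchoalveolar lavage fluid": 65,
    "Cerebrospinal fluid": 204,
    "Milk": 71,
    "Nipple aspiration fluid": 37,
    "Plasma/Serum": 418,
    "Saliva": 175,
    "Seminal fluid": 155,
    "Synovial fluid": 63,
    "Tear": 84,
    "Urine": 244,
}
n_proteins = 529  # size of the benchmark dataset

# Each protein contributes one count per body fluid it is annotated with.
total_labels = sum(counts.values())
mean_fluids_per_protein = total_labels / n_proteins

print(total_labels)                        # 1708
print(round(mean_fluids_per_protein, 2))   # 3.23
```

On these numbers each protein carries a little over three body fluid annotations on average, which is consistent with evaluating predictions by order rather than by a single label.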
Figure 1. The numbers of proteins that are secreted in different types of body fluids.
See Table 1 for the definition of the numerical codes used here for the body fluid types.
Figure 2. Jackknife cross-validation accuracies for all 11 prediction orders, obtained by the network-based method on the 529 human secreted proteins.
Interactions of peptidoglycan recognition protein 1 (UniProt accession O75594) with its neighbor proteins in the PPI network.
| Protein A | Body fluid type number | Protein B | Body fluid type number | Interaction confidence |
| O75594 | 6, 7, 11 | P61626 | 1, 2, 3, 4, 6, 7, 8, 10, 11 | 0.532 |
| O75594 | 6, 7, 11 | O15263 | 7 | 0.501 |
| O75594 | 6, 7, 11 | P05231 | 6 | 0.300 |
| O75594 | 6, 7, 11 | P13500 | 6 | 0.291 |
| O75594 | 6, 7, 11 | P60022 | 6, 7, 11 | 0.291 |
| O75594 | 6, 7, 11 | P01350 | 6 | 0.286 |
| O75594 | 6, 7, 11 | P78380 | 11 | 0.279 |
| O75594 | 6, 7, 11 | P07492 | 8 | 0.257 |
| O75594 | 6, 7, 11 | P02743 | 3, 6, 7, 8, 9, 10, 11 | 0.249 |
| O75594 | 6, 7, 11 | P05120 | 6, 7, 10 | 0.243 |
| O75594 | 6, 7, 11 | P35858 | 1, 3, 6, 9, 11 | 0.235 |
| O75594 | 6, 7, 11 | P49913 | 1, 6, 7, 8, 11 | 0.232 |
| O75594 | 6, 7, 11 | P01375 | 6 | 0.227 |
| O75594 | 6, 7, 11 | Q13410 | 4, 5 | 0.221 |
| O75594 | 6, 7, 11 | P48023 | 6 | 0.218 |
| O75594 | 6, 7, 11 | P19883 | 6 | 0.207 |
| O75594 | 6, 7, 11 | P05814 | 3, 4, 5 | 0.196 |
| O75594 | 6, 7, 11 | P11226 | 6 | 0.191 |
| O75594 | 6, 7, 11 | Q14116 | 6 | 0.162 |
| O75594 | 6, 7, 11 | P13236 | 6 | 0.156 |
| O75594 | 6, 7, 11 | P02788 | 1, 3, 5, 6, 7, 8, 10, 11 | 0.154 |
| O75594 | 6, 7, 11 | P13501 | 6 | 0.154 |
| O75594 | 6, 7, 11 | P13591 | 3, 6, 11 | 0.154 |
See Table 1 for the definition of the body fluid type numbers.
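To illustrate how a neighbor table like this one can be turned into a ranked list of body fluids, the sketch below scores each fluid type by summing the interaction confidences of the neighbors annotated with it. The summation rule and the function name `rank_fluids` are assumptions made for illustration; this record does not state the scoring rule the paper actually uses.

```python
from collections import defaultdict

# (neighbor accession, body fluid types of the neighbor, interaction confidence),
# copied from the table above.
neighbors = [
    ("P61626", [1, 2, 3, 4, 6, 7, 8, 10, 11], 0.532),
    ("O15263", [7], 0.501),
    ("P05231", [6], 0.300),
    ("P13500", [6], 0.291),
    ("P60022", [6, 7, 11], 0.291),
    ("P01350", [6], 0.286),
    ("P78380", [11], 0.279),
    ("P07492", [8], 0.257),
    ("P02743", [3, 6, 7, 8, 9, 10, 11], 0.249),
    ("P05120", [6, 7, 10], 0.243),
    ("P35858", [1, 3, 6, 9, 11], 0.235),
    ("P49913", [1, 6, 7, 8, 11], 0.232),
    ("P01375", [6], 0.227),
    ("Q13410", [4, 5], 0.221),
    ("P48023", [6], 0.218),
    ("P19883", [6], 0.207),
    ("P05814", [3, 4, 5], 0.196),
    ("P11226", [6], 0.191),
    ("Q14116", [6], 0.162),
    ("P13236", [6], 0.156),
    ("P02788", [1, 3, 5, 6, 7, 8, 10, 11], 0.154),
    ("P13501", [6], 0.154),
    ("P13591", [3, 6, 11], 0.154),
]


def rank_fluids(neighbors, n_types=11):
    """Rank body fluid types for a query protein from its PPI neighbors.

    ASSUMPTION: a fluid's score is the sum of the confidences of all
    neighbors annotated with it. This is an illustrative rule only.
    """
    scores = defaultdict(float)
    for _accession, fluids, confidence in neighbors:
        for fluid in fluids:
            scores[fluid] += confidence
    # Highest score first; ties are broken by the smaller type number.
    return sorted(range(1, n_types + 1), key=lambda f: (-scores[f], f))


ranking = rank_fluids(neighbors)
print(ranking[:3])
```

Under this assumed rule the three highest-ranked types for O75594 are 6, 7 and 11 (plasma/serum, saliva and urine), which are the fluids listed for it in the table.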
The prediction accuracies with 11 different orders for the 57 blood-secreted proteins by the network-based method, with order 1 corresponding to the most likely prediction and order 11 the least likely prediction.
| Order | Accuracy (%) |
| 1 | 96.49 |
| 2 | 3.51 |
| 3 | 0 |
| 4 | 0 |
| 5 | 0 |
| 6 | 0 |
| 7 | 0 |
| 8 | 0 |
| 9 | 0 |
| 10 | 0 |
| 11 | 0 |
The prediction accuracies with 11 different orders for the 61 marker proteins by the network-based method, with order 1 corresponding to the most likely prediction and order 11 the least likely prediction.
| Order | Accuracy (%) |
| 1 | 95.08 |
| 2 | 3.28 |
| 3 | 1.64 |
| 4 | 0 |
| 5 | 0 |
| 6 | 0 |
| 7 | 0 |
| 8 | 0 |
| 9 | 0 |
| 10 | 0 |
| 11 | 0 |
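For the two independent datasets each protein has a single target fluid (blood or plasma/serum), so the accuracy at order k can be read as the percentage of proteins whose target fluid was ranked at position k. The sketch below reproduces both tables under that reading. The helper `order_accuracies` is hypothetical, and the per-protein ranks are reconstructed from the reported counts (55 of 57 and 58 of 61 at order 1) and percentages rather than taken from the paper's data.

```python
from collections import Counter


def order_accuracies(true_label_ranks, n_orders=11):
    """Percentage of proteins whose target fluid sits at each order (1-based)."""
    tally = Counter(true_label_ranks)
    n = len(true_label_ranks)
    return [round(100 * tally[k] / n, 2) for k in range(1, n_orders + 1)]


# ASSUMPTION: ranks inferred from the tables, not the original predictions.
blood_ranks = [1] * 55 + [2] * 2              # 57 blood-secreted proteins
marker_ranks = [1] * 58 + [2] * 2 + [3] * 1   # 61 cancer marker proteins

print(order_accuracies(blood_ranks)[:3])    # [96.49, 3.51, 0.0]
print(order_accuracies(marker_ranks)[:4])   # [95.08, 3.28, 1.64, 0.0]
```

Because every protein contributes to exactly one order here, the percentages in each table add up to 100, apart from rounding.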