Dan Shao, Lan Huang, Yan Wang, Kai He, Xueteng Cui, Yao Wang, Qin Ma, Juan Cui.
Abstract
MOTIVATION: Human proteins that are secreted into different body fluids from various cells can be promising disease indicators. Modern proteomics research, empowered by both qualitative and quantitative profiling techniques, has made great progress in protein discovery in various human fluids. However, because of the large numbers of proteins and diverse modifications present in the fluids, as well as the technical limits of major proteomics platforms (e.g., mass spectrometry), different experimental studies often produce large discrepancies. As a result, a comprehensive proteomics landscape across major human fluids has not been well determined.
Year: 2021 PMID: 34398224 PMCID: PMC8696095 DOI: 10.1093/bioinformatics/btab545
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. The distribution of the 12 types of body fluids analyzed in this study
Fig. 2. The architecture of DeepSec, which takes PSI-profiles based on protein sequences as input, extracts features through a CNN, classifies with a BGRU followed by a fully connected dense layer, and outputs the probability that a protein is secreted
Fig. 3. The forward and backward GRUs, which capture possible long-range dependencies between the input sequence and the predicted class
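To make the bidirectional idea in the caption above concrete, the following is a minimal sketch of a bidirectional GRU in plain Python. It is not the DeepSec implementation: the weight names (`Wz`, `Uz`, `bz`, and so on), the toy sizes, the random initialization, and the choice to concatenate only the two final hidden states are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def gru_step(params, x, h):
    """One GRU update with update gate z and reset gate r."""
    z = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(params["Wz"], x), matvec(params["Uz"], h), params["bz"])]
    r = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(params["Wr"], x), matvec(params["Ur"], h), params["br"])]
    gated_h = [ri * hi for ri, hi in zip(r, h)]
    cand = [math.tanh(a + b + c) for a, b, c in
            zip(matvec(params["Wh"], x), matvec(params["Uh"], gated_h), params["bh"])]
    # Some formulations swap the roles of z and (1 - z); this is one common form.
    return [zi * hi + (1.0 - zi) * ci for zi, hi, ci in zip(z, h, cand)]


def run_gru(params, sequence, hidden_size):
    """Run a GRU over the sequence and return the final hidden state."""
    h = [0.0] * hidden_size
    for x in sequence:
        h = gru_step(params, x, h)
    return h


def bidirectional_gru(fwd, bwd, sequence, hidden_size):
    """Concatenate the final states of a forward pass and a backward pass."""
    h_fwd = run_gru(fwd, sequence, hidden_size)
    h_bwd = run_gru(bwd, list(reversed(sequence)), hidden_size)
    return h_fwd + h_bwd


def random_params(input_size, hidden_size, rng):
    """Hypothetical random weights, standing in for trained parameters."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    params = {}
    for gate in "zrh":
        params["W" + gate] = mat(hidden_size, input_size)
        params["U" + gate] = mat(hidden_size, hidden_size)
        params["b" + gate] = [0.0] * hidden_size
    return params


rng = random.Random(0)
INPUT_SIZE, HIDDEN_SIZE = 4, 3  # toy sizes, not taken from the paper
fwd = random_params(INPUT_SIZE, HIDDEN_SIZE, rng)
bwd = random_params(INPUT_SIZE, HIDDEN_SIZE, rng)
sequence = [[rng.random() for _ in range(INPUT_SIZE)] for _ in range(5)]
features = bidirectional_gru(fwd, bwd, sequence, HIDDEN_SIZE)
print(len(features))
```

Because the backward GRU reads the sequence from the last position to the first, the concatenated vector summarizes context from both directions, which is what allows a downstream dense layer to use dependencies that span the whole sequence.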
Performance evaluation on 12 body fluids based on the testing dataset, grouped by evaluation measure
| Body fluids | Accuracy | Recall | Precision | F-measure | MCC | AUC |
|---|---|---|---|---|---|---|
| Blood |  |  | 0.868200 |  |  |  |
| Saliva | 0.824650 | 0.810522 | 0.835108 | 0.797251 | 0.643117 | 0.898319 |
| Urine | 0.845857 | 0.805017 | 0.883817 | 0.834207 | 0.692125 | 0.918341 |
| Cerebrospinal fluid | 0.835470 | 0.667881 | 0.931376 | 0.747156 | 0.637556 | 0.900955 |
| Seminal fluid | 0.821891 | 0.834073 |  | 0.819841 | 0.644204 | 0.894597 |
| Amniotic fluid | 0.828476 | 0.747795 | 0.889995 | 0.790455 | 0.649091 | 0.905148 |
| Tear fluid | 0.830080 | 0.572529 | 0.926077 | 0.646611 | 0.545096 | 0.856645 |
| Bronchoalveolar lavage fluid | 0.859257 |  |  |  |  | 0.857458 |
| Milk |  | 0.677016 | 0.885655 | 0.719173 | 0.580265 | 0.871808 |
| Nipple aspirate fluid | 0.816795 | 0.565002 | 0.956554 | 0.687654 | 0.594120 | 0.887423 |
| Pleural effusion | 0.837814 | 0.571637 | 0.922045 | 0.62887 | 0.530781 |  |
| Sputum | 0.823529 | 0.822089 | 0.824375 | 0.775122 | 0.633483 | 0.891406 |
Note: The highest scores are in bold, and the lowest scores are underlined.
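The five threshold-based measures in the table can be derived from a binary confusion matrix. The sketch below assumes the standard textbook definitions; the record does not state the exact formulas or averaging used in the study, so reported values need not match this calculation. The function name `binary_metrics` and the counts are hypothetical.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard measures from true/false positive and negative counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only, not taken from the paper.
scores = binary_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in scores.items():
    print(f"{name}: {value:.6f}")
```

MCC uses all four cells of the confusion matrix, which is why it can be low even when precision is high, a pattern visible in rows where recall is much lower than precision.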
Performance evaluation on 12 body fluids based on all datasets, grouped by evaluation measure
| Body fluids | Accuracy | Recall | Precision | F-measure | MCC | AUC |
|---|---|---|---|---|---|---|
| Blood |  |  | 0.855263 |  |  |  |
| Saliva | 0.769029 | 0.666667 | 0.844749 | 0.710526 | 0.522890 | 0.855485 |
| Urine | 0.825711 | 0.810127 | 0.840196 | 0.817456 | 0.650832 | 0.899171 |
| Cerebrospinal fluid | 0.819790 | 0.700608 | 0.887924 | 0.738782 | 0.603938 | 0.869243 |
| Seminal fluid | 0.782139 | 0.802020 |  | 0.781496 | 0.565294 | 0.853547 |
| Amniotic fluid | 0.806870 | 0.839506 | 0.781965 | 0.790041 | 0.616044 | 0.893354 |
| Tear fluid | 0.800349 | 0.443730 | 0.933014 | 0.546535 | 0.446769 |  |
| Bronchoalveolar lavage fluid | 0.832130 |  |  |  |  | 0.792717 |
| Milk |  | 0.661253 | 0.824742 | 0.669014 | 0.488595 | 0.810696 |
| Nipple aspirate fluid | 0.787149 | 0.592342 | 0.895131 | 0.664981 | 0.520782 | 0.837818 |
| Pleural effusion | 0.824978 | 0.439560 | 0.946759 | 0.546697 | 0.467328 | 0.823116 |
| Sputum | 0.781225 | 0.759140 | 0.794192 | 0.719674 | 0.543051 | 0.849145 |
Note: The highest scores are in bold, and the lowest scores are underlined.
Fig. 4. Human proteins predicted to be secreted in 12 body fluids, obtained by screening all human proteins reported in Swiss-Prot. The orange bars show the number of proteins predicted from all human proteins in Swiss-Prot, and the blue bars show the experimentally identified proteins
Fig. 5. The ROC curves for body-fluid protein prediction by DeepSec versus other models in 12 kinds of body fluids on the testing datasets
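The area under a ROC curve, reported as AUC in the tables, equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition. The function name `roc_auc` and the labels and scores are hypothetical, and nothing here reflects how the study computed its curves.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = secreted) and predicted probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

The pairwise loop is quadratic in the number of examples, which is fine for a small illustration; a rank-based computation gives the same value more efficiently on large datasets.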
Prediction performance of various model architectures evaluated based on testing dataset and all datasets
| Measures | CNN (testing dataset) | BGRU (testing dataset) | DeepSec (testing dataset) | CNN (all datasets) | BGRU (all datasets) | DeepSec (all datasets) |
|---|---|---|---|---|---|---|
| Accuracy | 0.806068 | 0.855145 |  | 0.825292 | 0.811404 |  |
| Recall | 0.782884 |  | 0.872120 | 0.777778 |  | 0.887427 |
| Precision |  | 0.687112 | 0.868200 |  | 0.691520 | 0.855263 |
| F-measure | 0.858212 | 0.904143 |  | 0.816577 | 0.831593 |  |
| MCC | 0.587026 | 0.608209 |  | 0.653542 | 0.64152 |  |
| AUC | 0.912392 | 0.897955 |  | 0.906867 | 0.903960 |  |
Note: The highest scores are in bold.
Fig. 6. The ROC curves of various model architectures. (a) Evaluation on the testing dataset. (b) Evaluation on all datasets
Fig. 7. The significant differential expression between kidney cancer and control samples, including up- and down-regulated results