Cong Wang, Yabing Hai, Xiaoqing Liu, Nanfang Liu, Yuhua Yao, Pingan He, Qi Dai.
Abstract
Discriminating high-risk types of human papillomaviruses (HPVs) plays an important role in the diagnosis and treatment of cervical cancer. Several computational methods based on protein sequence and structure information have recently been proposed, but the information carried by related proteins has not been used until now. In this paper, we propose using the protein "sequence space" to explore this information and use it to predict high-risk types of HPVs. The proposed method was tested on 68 samples with known risk types and 4 samples with unknown risk types and was compared with the available approaches. The results show that the proposed method achieved the best performance among all the evaluated methods, with an accuracy of 95.59% and an F1-score of 90.91%, which indicates that the protein "sequence space" could potentially be used to improve the prediction of high-risk types of HPVs.
Year: 2015 PMID: 25972913 PMCID: PMC4418008 DOI: 10.1155/2015/756345
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.238
Star sets of 20 amino acids based on PAM250 substitution matrix.
| Matrix | Star set | | | | |
|---|---|---|---|---|---|
| PAM250 | {AGPST} | {C} | {DEGHNQ} | {EDHNQ} | {FILY} |
| | {GADS} | {HDENQR} | {IFLMV} | {KNQR} | {LFIMV} |
| | {MILV} | {NDEHKQS} | {PAS} | {QDEHKNR} | {RHKQW} |
| | {SAGNPT} | {TAS} | {VILM} | {WR} | {YF} |
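The star sets in the table above can be held in a small lookup structure. The sketch below is only an illustration: it assumes that the first letter of each set is the amino acid the set belongs to (the sets appear in alphabetical order of their first letter), and the names `STAR_SETS` and `in_star_set` are hypothetical, not taken from the paper.

```python
# Star sets transcribed from the PAM250 table, keyed by the first letter of
# each set (assumed here to be the amino acid the set is built around).
STAR_SETS = {
    "A": "AGPST", "C": "C", "D": "DEGHNQ", "E": "EDHNQ", "F": "FILY",
    "G": "GADS", "H": "HDENQR", "I": "IFLMV", "K": "KNQR", "L": "LFIMV",
    "M": "MILV", "N": "NDEHKQS", "P": "PAS", "Q": "QDEHKNR", "R": "RHKQW",
    "S": "SAGNPT", "T": "TAS", "V": "VILM", "W": "WR", "Y": "YF",
}


def in_star_set(center, residue):
    """Return True if `residue` is a member of the star set of `center`."""
    return residue in STAR_SETS[center]


if __name__ == "__main__":
    print(in_star_set("D", "E"))  # E is listed in {DEGHNQ}
    print(in_star_set("C", "A"))  # {C} contains only C itself
```

With this layout, checking whether one residue lies in the star set of another is a single dictionary lookup followed by a substring test.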
Figure 1. Comparison of the prediction accuracy of each class, the overall accuracy, and the F1-score for all the early and late proteins. The substitution matrices on the x-axis are BLOSUM 40, BLOSUM 45, BLOSUM 62, BLOSUM 80, BLOSUM 100, PAM 40, PAM 80, PAM 120, PAM 200, and PAM 250.
Comparison of the real risk types (REAL) and the prediction results using the proposed approach.
| Types | Real | Predicted | Types | Real | Predicted | Types | Real | Predicted | Types | Real | Predicted |
|---|---|---|---|---|---|---|---|---|---|---|---|
| HPV 39 | High | High | HPV 7 | Low | Low | HPV 34 | Low | Low | HPV 50 | Low | Low |
|  |  |  |  |  |  | HPV 44 | Low | Low | HPV 5 | Low | Low |
| HPV 33 | High | High | HPV 73 | Low | Low | HPV 43 | Low | Low | HPV 20 | Low | Low |
| HPV 51 | High | High | HPV 6 | Low | Low | HPV 32 | Low | Low | HPV 23 | Low | Low |
| HPV 16 | High | High | HPV 27 | Low | Low | HPV 24 | Low | Low | HPV 19 | Low | Low |
| HPV 56 | High | High | HPV 13 | Low | Low | HPV 8 | Low | Low | HPV 47 | Low | Low |
| HPV 18 | High | High | HPV 55 | Low | Low | HPV 48 | Low | Low | HPV 22 | Low | Low |
| HPV 59 | High | High | HPV 2 | Low | Low | HPV 12 | Low | Low | HPV 25 | Low | Low |
| HPV 52 | High | High | HPV 10 | Low | Low | HPV 49 | Low | Low | HPV 9 | Low | Low |
| HPV 35 | High | High | HPV 42 | Low | Low | HPV 15 | Low | Low | HPV 36 | Low | Low |
| HPV 68 | High | High | HPV 28 | Low | Low | HPV 21 | Low | Low | HPV 41 | Low | Low |
| HPV 58 | High | High | HPV 40 | Low | Low | HPV 4 | Low | Low | HPV 63 | Low | Low |
| HPV 31 | High | High | HPV 3 | Low | Low | HPV 65 | Low | Low | HPV 1 | Low | Low |
|  |  |  | HPV 11 | Low | Low | HPV 37 | Low | Low | HPV 80 | Low | Low |
| HPV 45 | High | High | HPV 29 | Low | Low | HPV 38 | Low | Low | HPV 77 | Low | Low |
| HPV 61 | High | High | HPV 74 | Low | Low | HPV 60 | Low | Low | HPV 76 | Low | Low |
| HPV 67 | High | High | HPV 53 | Low | Low | HPV 17 | Low | Low | HPV 75 | Low | Low |
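The accuracy and F1-score quoted in the abstract can be checked against the table above, which retains 65 entries: 15 High/High and 50 Low/Low. The sketch below assumes that the remaining 3 of the 68 samples are misclassifications. This is a back-calculation, not a confusion matrix reported by the source, and the split of those 3 errors into false positives and false negatives is hypothetical (neither metric depends on it).

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the high-risk class."""
    return 2 * tp / (2 * tp + fp + fn)


# Counts read off the table: 15 High/High entries and 50 Low/Low entries.
tp, tn = 15, 50
# Assumption: the other 3 samples are errors. The 1/2 split is hypothetical;
# only the sum fp + fn enters either formula.
fp, fn = 1, 2

print(f"accuracy = {accuracy(tp, tn, fp, fn):.2%}")  # accuracy = 95.59%
print(f"F1-score = {f1_score(tp, fp, fn):.2%}")      # F1-score = 90.91%
```

Under these assumptions, 65/68 and 30/33 reproduce the reported 95.59% and 90.91%.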
Figure 2. Comparison of overall accuracy and F1-score of all the evaluated prediction methods for HPV high-risk viral types.
Prediction results for the HPVs of unknown type, obtained with the proposed method and with the available methods.
| Types | Mismatch | Linear | Gap | Genetic | PseAAC | Ensemble | This paper |
|---|---|---|---|---|---|---|---|
| HPV 26 | Low | Low | High | Low | High | High | High |
| HPV 54 | Low | Low | Low | Low | Low | Low | Low |
| HPV 57 | Low | Low | Low | Low | Low | Low | Low |
| HPV 70 | High | High | High | Low | Low | High | High |
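One way to read the table above is to count, for each HPV type, how many of the six earlier methods give the same call as the proposed method. The sketch below only tabulates the values already shown; the names `PREDICTIONS`, `agreement`, and `agreement_counts` are illustrative and not part of the paper.

```python
# Column order of the six earlier methods, as in the table header.
METHODS = ["Mismatch", "Linear", "Gap", "Genetic", "PseAAC", "Ensemble"]

# For each HPV type: (calls of the six earlier methods, call of this paper).
PREDICTIONS = {
    "HPV 26": (["Low", "Low", "High", "Low", "High", "High"], "High"),
    "HPV 54": (["Low", "Low", "Low", "Low", "Low", "Low"], "Low"),
    "HPV 57": (["Low", "Low", "Low", "Low", "Low", "Low"], "Low"),
    "HPV 70": (["High", "High", "High", "Low", "Low", "High"], "High"),
}


def agreement(others, ours):
    """Number of earlier methods whose call matches the proposed method."""
    return sum(call == ours for call in others)


agreement_counts = {
    hpv: agreement(others, ours) for hpv, (others, ours) in PREDICTIONS.items()
}

for hpv, count in agreement_counts.items():
    print(f"{hpv}: {count} of {len(METHODS)} earlier methods agree")
```

The earlier methods are unanimous on HPV 54 and HPV 57 and split on HPV 26 and HPV 70, which is where the proposed method's calls are most informative.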