Shiva Pirhadi, Keivan Maghooli, Niloofar Yousefi Moteghaed, Masoud Garshasbi, Seyed Jalaleddin Mousavirad.
Abstract
BACKGROUND: Mass spectrometry is a method for identifying proteins and can be used to distinguish proteins in healthy samples from those in nonhealthy samples. This study used high-resolution mass spectrometry data for ovarian cancer. Diagnostic and monitoring tests are usually evaluated by their sensitivity and specificity; the aim of this study is therefore to compare the mass spectra of healthy and cancerous samples in order to find a set of biomarkers or indicators with reasonable sensitivity and specificity.
Keywords: Biomarker discovery; imperialist competitive algorithm; mass spectrometry high-throughput proteomics data; ovarian cancer
Year: 2021 PMID: 34268099 PMCID: PMC8253319 DOI: 10.4103/jmss.JMSS_20_20
Source DB: PubMed Journal: J Med Signals Sens ISSN: 2228-7477
Figure 1. Imperialist competitive algorithm flowchart[38]
Figure 2. Flowchart of the proposed algorithm
Figure 3. Control and cancerous spectra together with (a) the absolute t-test values; (b) the entropy values; (c) the Bhattacharyya distance values
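The filter scores plotted in Figure 3 can be illustrated with a minimal sketch. It assumes a Welch-type t-statistic and the closed-form Bhattacharyya distance between two univariate Gaussians; the source does not state which variants were used, and the function names and toy intensities below are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Absolute Welch t-statistic for one feature (assumed variant)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return abs(mean(a) - mean(b)) / se


def bhattacharyya_gaussian(a, b):
    """Bhattacharyya distance, assuming each class is Gaussian in this feature."""
    va, vb = variance(a), variance(b)
    mean_term = 0.25 * (mean(a) - mean(b)) ** 2 / (va + vb)
    var_term = 0.5 * math.log((va + vb) / (2.0 * math.sqrt(va * vb)))
    return mean_term + var_term


def rank_features(cancer, control, score, top_k):
    """Return the indices of the top_k features with the highest score.

    cancer and control are lists of spectra; each spectrum is a list of
    intensities, one per M/Z position.
    """
    scored = []
    for j in range(len(cancer[0])):
        a = [spectrum[j] for spectrum in cancer]
        b = [spectrum[j] for spectrum in control]
        scored.append((score(a, b), j))
    scored.sort(reverse=True)
    return [j for _, j in scored[:top_k]]


# Hypothetical toy spectra with three M/Z positions; only position 1 differs
# clearly between the two groups.
cancer = [[0.10, 0.90, 0.30], [0.12, 1.10, 0.28], [0.09, 1.00, 0.33]]
control = [[0.11, 0.20, 0.31], [0.10, 0.25, 0.29], [0.13, 0.15, 0.32]]

top_t = rank_features(cancer, control, welch_t, top_k=1)
top_b = rank_features(cancer, control, bhattacharyya_gaussian, top_k=1)
```

In a filter-based pipeline of this kind, the top-ranked positions (200 or 400 in the parameter table) would then be passed to the search stage.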
Imperialist competitive algorithm parameters in filter-based methods
| Methods | Number of features | nPop | nImp | Decades | β | pRevolution | ζ |
|---|---|---|---|---|---|---|---|
| t-test | 200 | 30 | 8 | 20 | 0.3 | 0.3 | 0.1 |
| Entropy | 200 | 30 | 5 | 20 | 0.1 | 0.1 | 0.1 |
| Bhattacharyya | 400 | 40 | 4 | 15 | 0.1 | 0.1 | 0.1 |
ICA – Imperialist competitive algorithm
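As a sketch of how these parameters are typically used, the snippet below stores the three configurations and computes the total cost of an empire in the standard formulation of the algorithm, where ζ weights the mean cost of the colonies. The field names, the reading of nPop as the total number of countries, and the attribution of the row with 8 imperialists to the t-test filter are assumptions; the cost values are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class ICAParams:
    n_features: int      # features kept by the filter stage
    n_pop: int           # nPop: number of countries (assumed to include imperialists)
    n_imp: int           # nImp: number of initial imperialists
    decades: int         # number of iterations
    beta: float          # assimilation coefficient
    p_revolution: float  # revolution probability
    zeta: float          # weight of the colonies' mean cost


T_TEST = ICAParams(200, 30, 8, 20, 0.3, 0.3, 0.1)
ENTROPY = ICAParams(200, 30, 5, 20, 0.1, 0.1, 0.1)
BHATTACHARYYA = ICAParams(400, 40, 4, 15, 0.1, 0.1, 0.1)


def total_empire_cost(imperialist_cost, colony_costs, zeta):
    """Total cost = imperialist cost + zeta * mean cost of its colonies."""
    if not colony_costs:
        return imperialist_cost
    return imperialist_cost + zeta * mean(colony_costs)


# Colonies available to be divided among the initial empires.
n_colonies = T_TEST.n_pop - T_TEST.n_imp

# Hypothetical costs for one empire with three colonies.
example_cost = total_empire_cost(0.02, [0.05, 0.07, 0.03], T_TEST.zeta)
```

With a small ζ such as 0.1, the total cost of an empire is dominated by its imperialist, which is the usual choice in this formulation.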
Figure 4. Best and mean values of the cost function in the (a) first, (b) second, and (c) third implementations. The horizontal axis indicates the decade, the solid line shows the best cost, and the dotted line shows the mean cost
Number of features selected in each of the three runs of the imperialist competitive algorithm, and the number of features common to all three runs
| Methods | #features from 1st run of ICA | #features from 2nd run of ICA | #features from 3rd run of ICA | Features common to all three runs of ICA |
|---|---|---|---|---|
| t-test-ICA | 104 | 113 | 111 | 31 |
| Entropy-ICA | 114 | 97 | 98 | 28 |
| Bhattacharyya-ICA | 199 | 194 | 201 | 44 |
ICA – Imperialist competitive algorithm
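The last column of this table is the size of the intersection of the three selected feature sets. A minimal sketch, with hypothetical feature indices standing in for the selected M/Z positions:

```python
from functools import reduce


def common_features(runs):
    """Return the sorted features that appear in every run."""
    return sorted(reduce(set.intersection, (set(run) for run in runs)))


# Hypothetical feature indices selected in three independent runs.
runs = [
    [3, 17, 42, 58, 101],
    [17, 42, 58, 77],
    [5, 17, 42, 101],
]

common = common_features(runs)
```

Keeping only the features that survive every run is a simple way to reduce the effect of the random initialization of a stochastic search.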
Results obtained from imperialist competitive algorithm implementation on ovarian cancer dataset
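| Methods | Run | #decade | Best cost | Mean cost | #empires |
|---|---|---|---|---|---|
| t-test-ICA | 1 | 1 | 100 | 96.77 | 8 |
| t-test-ICA | 1 | 20 | 100 | 100 | 4 |
| t-test-ICA | 2 | 1 | 100 | 96.77 | 8 |
| t-test-ICA | 2 | 20 | 100 | 100 | 4 |
| t-test-ICA | 3 | 1 | 100 | 96 | 8 |
| t-test-ICA | 3 | 20 | 100 | 100 | 3 |
| Entropy-ICA | 1 | 1 | 100 | 100 | 5 |
| Entropy-ICA | 1 | 20 | 100 | 100 | 5 |
| Entropy-ICA | 2 | 1 | 100 | 100 | 5 |
| Entropy-ICA | 2 | 20 | 100 | 100 | 5 |
| Entropy-ICA | 3 | 1 | 100 | 100 | 5 |
| Entropy-ICA | 3 | 20 | 100 | 100 | 4 |
| Bhattacharyya-ICA | 1 | 1 | 100 | 100 | 4 |
| Bhattacharyya-ICA | 1 | 15 | 100 | 100 | 3 |
| Bhattacharyya-ICA | 2 | 1 | 100 | 100 | 4 |
| Bhattacharyya-ICA | 2 | 15 | 100 | 100 | 4 |
| Bhattacharyya-ICA | 3 | 1 | 100 | 100 | 4 |
| Bhattacharyya-ICA | 3 | 15 | 100 | 100 | 4 |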
ICA – Imperialist competitive algorithm
Results obtained from applying the decision tree on ovarian cancer dataset
| Important M/Z values | #rules for cancer group | #rules for control group | C5 accuracy (%) | #features in the input data |
|---|---|---|---|---|
| 1034.163, 8607.152, 7063.890, 8708.408 | 2 | 3 | 98.15 | 104 |
| 845.042, 8711.485, 8607.152, 7065.738, 8713.531 | 2 | 4 | 97.22 | 113 |
| 1036.285, 8022.877, 8100.848, 1072.332, 1006.425, 8604.093, 7065.738, 844.722 | 2 | 5 | 99.54 | 111 |
| 6856.641, 8794.786, 8212.033, 845.042, 1290.083, 8607.152, 4310.126, 8553.187 | 3 | 4 | 98.15 | 114 |
| 8794.786, 8603.073, 6834.813, 8213.03, 4310.126, 1056.195 | 3 | 4 | 99.07 | 97 |
| 8600.015, 8794.786, 8621.435, 845.042, 4310.126, 4003.314 | 3 | 4 | 98.61 | 98 |
| 1006.774, 8025.832, 845.042, 8710.459, 8603.073, 7065.738 | 3 | 5 | 99.07 | 199 |
| 6859.372, 1006.425, 8708.408, 1939.120, 8604.093, 7065.738, 6850.271, 1079.182 | 3 | 4 | 99.07 | 194 |
| 8522.717, 7063.890, 8607.152, 7174.257, 1049.775 | 3 | 3 | 99.54 | 201 |
Rules obtained from C5 algorithm related to ovarian cancer dataset
| Rule |
|---|
| 1. If 7065.738 ≤ 0.073 and 8603.073 ≤ −0.034 then control |
| 2. If 1006.774 > 0.018 and 8603.073 ≤ −0.034 then control |
| 3. If 1006.425 ≤ 0.028 and 7065.738 > 0.073 then cancer |
| 4. If 845.042 > −0.013 and 8607.152 ≤ −0.038 then control |
| 5. If 8603.073 ≤ −0.034 then control |
| 6. If 8708.408 > 0.078 then control |
| 7. If 8607.152 ≤ −0.038 then control |
| 8. If 7063.890 > −0.037 and 8607.152 > −0.038 and 8708.408 ≤ 0.078 then cancer |
| 9. If 845.042 > 0.045 then control |
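To show how such rules can be checked against a spectrum, the sketch below encodes rules 3, 7, and 8 as lists of threshold conditions and reports which of them fire for one sample. M/Z values are written with a decimal point and used as dictionary keys; the data structure, the function name, and the sample intensities are hypothetical. Because the rules appear to come from several separately trained trees, the sketch only lists the rules that fire and does not combine them into a single classifier.

```python
import operator

OPS = {"<=": operator.le, ">": operator.gt}

# (name, [(M/Z key, comparison, threshold), ...], predicted class)
RULES = [
    ("rule 3", [("1006.425", "<=", 0.028), ("7065.738", ">", 0.073)], "cancer"),
    ("rule 7", [("8607.152", "<=", -0.038)], "control"),
    ("rule 8", [("7063.890", ">", -0.037),
                ("8607.152", ">", -0.038),
                ("8708.408", "<=", 0.078)], "cancer"),
]


def fired_rules(sample, rules=RULES):
    """Return (name, class) for every rule whose conditions all hold."""
    fired = []
    for name, conditions, label in rules:
        if all(OPS[op](sample[mz], threshold) for mz, op, threshold in conditions):
            fired.append((name, label))
    return fired


# Hypothetical normalized intensities for one sample.
sample = {
    "1006.425": 0.010,
    "7063.890": 0.020,
    "7065.738": 0.100,
    "8607.152": 0.000,
    "8708.408": 0.050,
}

result = fired_rules(sample)
```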
Association rules obtained with generalized rule induction on the ovarian cancer dataset
| Rules | Confidence (%) | Support (%) |
|---|---|---|
| If 7065.738 > −0.032 and 8601.034 > −0.022 then cancer | 100 | 43.06 |
| If 1034.516 < 0.002 and 4310.126 > −0.021 then cancer | 100 | 42.59 |
| If 7065.738 > −0.032 and 8602.053 > −0.021 then cancer | 100 | 42.59 |
| If 7065.738 > −0.032 and 1078.821 < 0.021 and 4303.634 > −0.025 then cancer | 100 | 40.74 |
| If 8618.373 > −0.017 and 4302.912 > −0.011 then cancer | 100 | 39.81 |
| If 4300.49 > −0.020 and 4302.912 > −0.008 then cancer | 100 | 39.81 |
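The two percentages in this table can be computed as in the following sketch. It assumes that support is the share of all samples matching the antecedent and that confidence is the share of those matching samples that belong to the predicted class; the source does not define either term, and the toy samples are hypothetical.

```python
def support_and_confidence(samples, labels, antecedent, consequent):
    """Return (support %, confidence %) of the rule: antecedent -> consequent."""
    covered = [label for s, label in zip(samples, labels) if antecedent(s)]
    if not covered:
        return 0.0, 0.0
    support = 100.0 * len(covered) / len(samples)
    confidence = 100.0 * sum(label == consequent for label in covered) / len(covered)
    return support, confidence


def first_rule(s):
    """Antecedent of the first rule in the table (M/Z keys use a decimal point)."""
    return s["7065.738"] > -0.032 and s["8601.034"] > -0.022


# Hypothetical normalized intensities for five samples.
samples = [
    {"7065.738": 0.050, "8601.034": 0.010},
    {"7065.738": 0.020, "8601.034": 0.000},
    {"7065.738": -0.100, "8601.034": 0.030},
    {"7065.738": -0.050, "8601.034": -0.040},
    {"7065.738": 0.010, "8601.034": -0.060},
]
labels = ["cancer", "cancer", "control", "control", "control"]

support, confidence = support_and_confidence(samples, labels, first_rule, "cancer")
```

Under these definitions, a rule with 100% confidence and roughly 40% support correctly labels every sample it covers but covers fewer than half of the dataset, so several such rules are needed together.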
Figure 5. Heatmap of the average spectra of the cancerous (top) and control (bottom) groups. Biomarkers obtained from the C5 method are marked with red triangles
Figure 6. M/Z values of frequently selected biomarkers. (a) 861.094 and 862.076; (b) 3428.817; (c) 7053.731 and 7054.654. The horizontal axis shows the M/Z values and the vertical axis shows their intensity
Figure 7. Biomarkers obtained after the implementation of the C5 algorithm. Top row, from left to right: 845.04, 1006.43, and 7063.89; bottom row, from left to right: 7065.74, 8603.07, 8607.15, and 8708.41. The average control and cancerous spectra are plotted in red and blue, respectively
Figure 8. M/Z values in the antecedent part of some of the association rules (the consequent of all these rules is the cancer class): (a) 4302.912 and 4300.749; (b) 8618.373 and 4302.912; (c) 4310.126 and 1034.516; (d) 1078.821 and 7065.738
Figure 9. The 842.042 value in the average cancerous (top) and control (bottom) spectra. The intensity of this biomarker differs greatly between cancerous and healthy samples
Figure 10. Intensity at M/Z = 845.042 in all cancerous (top) and control (bottom) samples
Comparison of the sensitivity and specificity of our proposed algorithms with several studies on high-resolution ovarian cancer data
| Authors/year/classification type | Cross validation | Using the main features | Number of features | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|---|
| Yu | 1000 independent k-fold (k=2,…10) | No | 3382 | 97.38 | 93.30 |
| Wu | 10-fold | Yes | 100 | 93.9 | 93.23 |
| Tang | 5-fold | No | 1964 | 99.50 | 99.16 |
| Liu | 2-fold | No | 247-949 | 98.45-99.55 | 95.69-97.01 |
| Wu | 10-fold | No | 215 | 92.98 | 88.97 |
| Cui | 10-fold | Yes | 371 | 98.16 | - |
| Our proposed algorithm as t-test-ICA-C5 KNN | 10-fold | Yes | 104 | 97.52 | 98.94 |
| | | | 113 | 96.69 | 97.89 |
| | | | 111 | 100 | 98.94 |
| Our proposed algorithm as Entropy-ICA-C5 KNN | 10-fold | Yes | 114 | 98.34 | 97.89 |
| | | | 97 | 98.34 | 100 |
| | | | 98 | 99.17 | 99.89 |
| Our proposed algorithm as Bhattacharyya-ICA-C5 KNN | 10-fold | Yes | 199 | 99.17 | 98.94 |
| | | | 194 | 100 | 97.89 |
| | | | 201 | 99.17 | 100 |
ICA – Imperialist competitive algorithm
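For reference, the two metrics compared in this table follow directly from the confusion counts: sensitivity is the share of cancerous samples classified as cancerous, and specificity is the share of control samples classified as control. The sketch below uses hypothetical labels and predictions, not the study's data.

```python
def sensitivity_specificity(y_true, y_pred, positive="cancer"):
    """Return (sensitivity %, specificity %) for a two-class problem."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical example: 10 cancerous samples (9 detected) and 8 controls
# (all classified correctly).
y_true = ["cancer"] * 10 + ["control"] * 8
y_pred = ["cancer"] * 9 + ["control"] + ["control"] * 8

sens, spec = sensitivity_specificity(y_true, y_pred)
```

Under k-fold cross validation, these counts are usually accumulated over the held-out folds before the two ratios are taken.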