Emili Besalú, Jesus Vicente De Julián-Ortiz.
Abstract
The Superposing Significant Interaction Rules (SSIR) method is a combinatorial procedure that deals with symbolic descriptors of samples. It ranks a series of samples that are classified into two classes. The method selects preferential descriptors and, with them, generates rules that build the ranking by means of a simple voting procedure. Here, two application examples are provided. In both cases, binary or multilevel strings encoding gene expressions are used as descriptors. It is shown how the SSIR procedure is useful for ranking series of patient transcription data to diagnose two types of cancer (leukemia and prostate cancer), obtaining Area Under Receiver Operating Characteristic (AU-ROC) values of 0.95 (leukemia prediction) and 0.80-0.90 (prostate). The preferential descriptors selected here are specific gene expressions, which is potentially useful for pointing to possible key genes.
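The abstract does not spell out how rules are scored, so the following is a minimal sketch under stated assumptions rather than the authors' implementation. A rule is taken to be a set of `order` descriptors with required values, its significance is assumed to be a one-sided hypergeometric tail probability, and every significant rule casts one vote for each sample it selects. The function names, the threshold `p_max`, and the toy data are hypothetical.

```python
from itertools import combinations, product
from math import comb


def hypergeom_tail(N, K, n, k):
    """P(X >= k) when n of N samples are drawn and K of the N are class 1."""
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return total / comb(N, n)


def matches(row, cols, values):
    """True if the sample carries the required value at every rule descriptor."""
    return all(row[c] == v for c, v in zip(cols, values))


def significant_rules(X, y, order=1, p_max=0.05):
    """Enumerate all rules of the given order; keep those with p <= p_max (assumed test)."""
    N, K = len(y), sum(y)
    rules = []
    for cols in combinations(range(len(X[0])), order):
        levels = [sorted({row[c] for row in X}) for c in cols]
        for values in product(*levels):
            hits = [i for i, row in enumerate(X) if matches(row, cols, values)]
            if not hits:
                continue
            k = sum(y[i] for i in hits)  # class-1 samples selected by the rule
            if hypergeom_tail(N, K, len(hits), k) <= p_max:
                rules.append((cols, values))
    return rules


def votes(X, rules):
    """One vote per significant rule that selects the sample; more votes rank higher."""
    return [sum(matches(row, cols, values) for cols, values in rules) for row in X]


# Hypothetical toy data: 6 samples, 3 binary descriptors, class labels 1/0.
X = [[1, 0, 1], [1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1], [0, 1, 0]]
y = [1, 1, 1, 0, 0, 0]

rules = significant_rules(X, y, order=1, p_max=0.1)
print(rules)            # [((0,), (1,))]
print(votes(X, rules))  # [1, 1, 1, 0, 0, 0]
```

In this toy case only the rule "descriptor 0 equals 1" passes the threshold (p = 1/20), so the three class-1 samples receive one vote each and rank first. Tightening `p_max` reduces the number of voting rules, which is the trade-off explored in the tables of this record.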
Keywords: SSIR method; cancer; gene expressions; leukemia; multilevel fingerprints; prostate cancer; ranking
Year: 2020 PMID: 32911598 PMCID: PMC7564041 DOI: 10.3390/biom10091293
Source DB: PubMed Journal: Biomolecules ISSN: 2218-273X
Figure 1. How a rule of order 2 selects over a set of 9 individuals represented by 5 descriptors. See the text for details.
Figure 2. Receiver Operating Characteristic (ROC) curve for the external set of Golub and coworkers; rules are of order 2.
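For readers who want to relate a vote-based ranking to the AU-ROC values quoted in this record, the area can be computed directly from scores and class labels. The sketch below uses the standard pairwise (Mann-Whitney) formulation, with ties counted as one half; the function name `au_roc` and the example scores are hypothetical and are not taken from the paper.

```python
def au_roc(scores, labels):
    """Probability that a random class-1 sample outscores a random class-0 sample."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    # Each (positive, negative) pair counts 1 if ordered correctly, 0.5 if tied.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical vote counts for four samples, two of each class.
print(au_roc([3, 2, 2, 0], [1, 1, 0, 0]))  # 0.875
```

Because vote counts are integers, ties are common, and the half-credit convention for ties matters more here than it would for continuous scores.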
Figure 3. Results of the randomization test for the external set of Golub and coworkers. The reduced set of descriptors was considered, the rules are of order 2, and p = 0.001.
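The record does not describe how the randomization test was run. A common form, sketched here as an assumption, shuffles the class labels many times and asks how often the shuffled data score at least as well as the real labels. The helper names, the `mean_gap` statistic, and the example scores are hypothetical; the authors may have re-derived the rules for every shuffle, which this simplified version does not do.

```python
import random


def permutation_p_value(statistic, scores, labels, n_perm=1000, seed=0):
    """Share of label shuffles whose statistic is at least the observed one."""
    rng = random.Random(seed)          # fixed seed keeps the result reproducible
    observed = statistic(scores, labels)
    shuffled = list(labels)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # class sizes stay the same, pairing is broken
        if statistic(scores, shuffled) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least + 1) / (n_perm + 1)


def mean_gap(scores, labels):
    """Illustrative statistic: mean score of class 1 minus mean score of class 0."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)


# Hypothetical, perfectly separated vote counts for 5 + 5 samples.
scores = [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
p = permutation_p_value(mean_gap, scores, labels)
print(p)
```

Any ranking statistic with the signature `statistic(scores, labels)` can be passed in, including an AU-ROC function.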
AU-ROC values for subset 1 using rules of order 1: the threshold p-value was varied to show its effect on the AU-ROC of the overall fit and of the leave-one-out (L1O) cross-validation test. See the text for more details.
| Threshold p-value | Rules Selected | Overall Fit | L1O Performance |
|---|---|---|---|
| 10⁻² | 1236 | 0.812 | 0.635 |
| 10⁻³ | 274 | 0.880 | 0.743 |
| 10⁻⁴ | 66 | 0.924 | 0.793 |
| 10⁻⁵ | 27 | 0.932 | 0.855 |
| 10⁻⁶ | 8 | 0.875 | 0.822 |
| 10⁻⁷ | 5 | 0.862 | 0.824 |
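The L1O values are consistently lower than the overall-fit values, which is expected if each sample is scored by a model built without it. The loop below is a minimal sketch of that idea, assuming the usual leave-one-out protocol. `fit_frequency_gap` and `score_weighted` are hypothetical stand-ins for the rule selection and voting steps, not the SSIR procedure itself, and the toy data are invented.

```python
def leave_one_out_scores(X, y, fit, score):
    """Score every sample with a model fitted on all the other samples."""
    held_out = []
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])  # sample i is excluded
        held_out.append(score(model, X[i]))
    return held_out


def fit_frequency_gap(X, y):
    """Stand-in model: per-descriptor frequency in class 1 minus frequency in class 0."""
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    weights = []
    for c in range(len(X[0])):
        f_pos = sum(row[c] for row, l in zip(X, y) if l == 1) / n_pos
        f_neg = sum(row[c] for row, l in zip(X, y) if l == 0) / n_neg
        weights.append(f_pos - f_neg)
    return weights


def score_weighted(weights, row):
    """Stand-in score: weighted sum of the sample's binary descriptors."""
    return sum(w * v for w, v in zip(weights, row))


# Hypothetical toy data: 6 samples, 3 binary descriptors, class labels 1/0.
X = [[1, 0, 1], [1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1], [0, 1, 0]]
y = [1, 1, 1, 0, 0, 0]
l1o = leave_one_out_scores(X, y, fit_frequency_gap, score_weighted)
print(l1o)
```

The held-out scores can then be ranked and summarized with AU-ROC in the same way as the scores from a fit on all samples.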
AU-ROC values for subset 2 using rules of order 1: the threshold p-value was varied to show its effect on the AU-ROC of the overall fit and of the leave-one-out (L1O) cross-validation test. See the text for more details.
| Threshold p-value | Rules Selected | Overall Fit | L1O Performance |
|---|---|---|---|
| 10⁻² | 262 | 0.781 | 0.637 |
| 10⁻³ | 59 | 0.840 | 0.718 |
| 10⁻⁴ | 20 | 0.899 | 0.734 |
| 10⁻⁵ | 7 | 0.875 | 0.804 |
| 10⁻⁶ | 3 | 0.860 | 0.801 |
| 10⁻⁷ | 3 | 0.860 | 0.750 |
AU-ROC values for subset 2 using rules of order 2: the threshold p-value was varied to show its effect on the AU-ROC of the overall fit and of the leave-one-out (L1O) cross-validation test. See the text for more details.
| Threshold p-value | Rules Selected | Training Fit | L1O Performance |
|---|---|---|---|
| 10⁻³ | 176,608 | 0.900 | 0.743 |
| 10⁻⁴ | 43,193 | 0.936 | 0.800 |
| 10⁻⁵ | 13,276 | 0.945 | 0.854 |
| 10⁻⁶ | 4712 | 0.948 | 0.872 |
| 10⁻⁷ | 1501 | 0.947 | 0.854 |
| 10⁻⁸ | 459 | 0.960 | 0.838 |
| 10⁻⁹ | 88 | 0.955 | 0.861 |
| 10⁻¹⁰ | 21 | 0.962 | 0.872 |
| 10⁻¹¹ | 10 | 0.946 | 0.923 |
| 10⁻¹² | 4 | 0.922 | 0.795 |
| 10⁻¹³ | 1 | 0.864 | 0.803 |
| 10⁻¹⁴ | 1 | 0.864 | 0.470 |