André Altmann, Michal Rosen-Zvi, Mattia Prosperi, Ehud Aharoni, Hani Neuvirth, Eugen Schülter, Joachim Büch, Daniel Struck, Yardena Peres, Francesca Incardona, Anders Sönnerborg, Rolf Kaiser, Maurizio Zazzi, Thomas Lengauer.
Abstract
BACKGROUND: Analysis of the viral genome for drug resistance mutations is state-of-the-art for guiding treatment selection for human immunodeficiency virus type 1 (HIV-1)-infected patients. These mutations alter the structure of viral target proteins and reduce or in the worst case completely inhibit the effect of antiretroviral compounds while maintaining the ability for effective replication. Modern anti-HIV-1 regimens comprise multiple drugs in order to prevent or at least delay the development of resistance mutations. However, commonly used HIV-1 genotype interpretation systems provide only classifications for single drugs. The EuResist initiative has collected data from about 18,500 patients to train three classifiers for predicting response to combination antiretroviral therapy, given the viral genotype and further information. In this work we compare different classifier fusion methods for combining the individual classifiers.
Year: 2008 PMID: 18941628 PMCID: PMC2565127 DOI: 10.1371/journal.pone.0003470
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of the EuResist Integrated Database (release 11/2007) and training and test set.
| | Patients | Sequences | VL measurements | Therapies | Successes | Failures |
| EIDB | 18,467 | 22,006 | 240,795 | 64,864 | - | - |
| Labeled Therapies | 8,223 | 3,492 | 40,498 | 20,249 | 13,935 | 6,314 |
| Training Set | 2,389 | 2,722 | 5,444 | 2,722 | 1,822 | 900 |
| Test Set | 297 | 301 | 602 | 301 | 202 | 99 |
The table displays the number of Patients, Sequences, VL measurements, and Therapies for the complete EuResist Integrated Database (EIDB) and the set of therapies that could be labeled with the definition. 469 of the sequences associated with all labeled therapies belong to historic genotypes and are not directly associated with a therapy change. Moreover, detailed information on training set and test set (comprising labeled therapies with an associated sequence) is given.
Results for the individual classifiers on training set and test set.
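| Engine | Minimal feature set: AUC, Train | Minimal feature set: AUC, Test | Minimal feature set: Accuracy, Train | Minimal feature set: Accuracy, Test | Maximal feature set: AUC, Train | Maximal feature set: AUC, Test | Maximal feature set: Accuracy, Train | Maximal feature set: Accuracy, Test |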
| GD | 0.747 (0.027) | 0.744 | 0.745 (0.024) | 0.724 | 0.768 (0.025) | 0.760 | 0.752 (0.028) | 0.757 |
| ME | 0.758 (0.019) | 0.745 | 0.748 (0.031) | 0.757 | 0.762 (0.021) | 0.742 | 0.754 (0.030) | 0.757 |
| EV | 0.766 (0.030) | 0.768 | 0.754 (0.031) | 0.748 | 0.789 (0.023) | 0.804 | 0.780 (0.032) | 0.751 |
The table displays the performance, measured in AUC and Accuracy, achieved by the individual classifiers on the training set (using 10-fold cross validation; standard deviation in brackets) and the test set using different feature sets.
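The two performance measures reported in the table can be illustrated with a minimal sketch. It assumes each classifier outputs a success probability per therapy and that labels are coded 1 (success) and 0 (failure). The function names and the 0.5 decision threshold used for accuracy are assumptions made for illustration and are not taken from the source.

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    Equals the probability that a randomly chosen success receives a
    higher score than a randomly chosen failure; ties count one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one success and one failure")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(scores, labels, threshold=0.5):
    """Fraction of therapies whose thresholded prediction matches the label.

    The threshold of 0.5 is an assumption, not a value from the source.
    """
    hits = sum((s >= threshold) == (y == 1) for s, y in zip(scores, labels))
    return hits / len(labels)
```

AUC depends only on how the therapies are ranked, whereas accuracy also depends on the chosen threshold, which is one reason the two measures can order the classifiers differently.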
Results for the combined classifiers on training and test set.
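| Method | Minimal feature set: AUC, Train | Minimal feature set: AUC, Test | Minimal feature set: Accuracy, Train | Minimal feature set: Accuracy, Test | Maximal feature set: AUC, Train | Maximal feature set: AUC, Test | Maximal feature set: Accuracy, Train | Maximal feature set: Accuracy, Test |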
| Single Best | 0.766 (0.030) | 0.768 | 0.754 (0.031) | 0.748 | 0.789 (0.023) | 0.804 | 0.780 (0.032) | 0.751 |
| Oracle | 0.914 (0.015) | 0.911 | 0.842 (0.025) | 0.844 | 0.917 (0.013) | 0.920 | 0.850 (0.022) | 0.860 |
| Min | 0.771 (0.020) | 0.765 | 0.746 (0.027) | 0.761 | 0.792 (0.021) | 0.793 | 0.760 (0.030) | 0.764 |
| Max | 0.760 (0.023) | 0.765 | 0.742 (0.030) | 0.731 | 0.779 (0.021) | 0.779 | 0.757 (0.030) | 0.741 |
| Median | 0.773 (0.020) | 0.766 | 0.759 (0.027) | 0.766 | 0.789 (0.029) | 0.786 | 0.768 (0.029) | 0.761 |
| Mean | 0.777 (0.020) | 0.772 | 0.760 (0.024) | 0.744 | 0.794 (0.019) | 0.793 | 0.780 (0.028) | 0.781 |
| Majority | 0.683 (0.023) | 0.660 | 0.759 (0.027) | 0.738 | 0.697 (0.027) | 0.683 | 0.768 (0.029) | 0.761 |
| QDA | 0.771 (0.020) | 0.763 | 0.755 (0.031) | 0.738 | 0.790 (0.022) | 0.794 | 0.769 (0.027) | 0.764 |
| Logistic Regression | 0.778 (0.021) | 0.774 | 0.762 (0.028) | 0.744 | 0.798 (0.020) | 0.805 | 0.781 (0.030) | 0.771 |
| Decision Trees | 0.718 (0.044) | 0.741 | 0.748 (0.032) | 0.757 | 0.722 (0.033) | 0.678 | 0.777 (0.032) | 0.757 |
| Naïve Bayes | 0.732 (0.027) | 0.740 | 0.759 (0.027) | 0.738 | 0.752 (0.028) | 0.753 | 0.768 (0.029) | 0.761 |
| Decision Templates | 0.777 (0.021) | 0.774 | 0.755 (0.027) | 0.754 | 0.796 (0.019) | 0.797 | 0.766 (0.026) | 0.767 |
| Dempster-Shafer | 0.777 (0.021) | 0.772 | 0.755 (0.024) | 0.754 | 0.796 (0.019) | 0.796 | 0.767 (0.026) | 0.764 |
| Clustering | 0.775 (0.019) | 0.773 | 0.758 (0.029) | 0.741 | 0.797 (0.018) | 0.800 | 0.783 (0.028) | 0.784 |
| Local Accuracy | 0.777 (0.020) | 0.771 | 0.761 (0.025) | 0.741 | 0.795 (0.019) | 0.791 | 0.781 (0.029) | 0.777 |
| Feature | 0.750 (0.026) | 0.747 | 0.745 (0.029) | 0.751 | 0.786 (0.021) | 0.779 | 0.780 (0.029) | 0.767 |
The table summarizes the results achieved by the different combination approaches on the training set (10-fold cross validation; standard deviation in brackets) and the test set. The reference methods are Single Best and Oracle, the non-trainable combiners are named according to their function, the meta-classifiers according to the statistical learning methods. Decision Templates, Dempster-Shafer, Clustering and Local Accuracy are the methods described in detail in the Methods section. Feature is the combination on the feature level.
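The non-trainable combiners named after their function (Min, Max, Median, Mean, Majority) can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes each individual classifier returns a success probability for a therapy, and the function name `fuse` and the 0.5 voting threshold are hypothetical.

```python
from statistics import mean, median


def fuse(probs, rule, threshold=0.5):
    """Combine per-classifier success probabilities for one therapy.

    probs: success probabilities, one per individual classifier.
    rule:  one of "min", "max", "median", "mean", "majority".
    """
    if rule == "min":
        return min(probs)
    if rule == "max":
        return max(probs)
    if rule == "median":
        return median(probs)
    if rule == "mean":
        return mean(probs)
    if rule == "majority":
        # Each classifier votes "success" if its probability reaches the
        # (assumed) threshold; the output is a hard 0/1 decision.
        votes = sum(p >= threshold for p in probs)
        return 1.0 if votes > len(probs) / 2 else 0.0
    raise ValueError(f"unknown rule: {rule}")
```

Under this reading, the majority vote yields only a binary output and therefore a very coarse ranking of therapies, which would be consistent with its lower AUC in the table compared with the other non-trainable rules.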
Figure 1. Improvement in AUC of combination methods compared to the single best and single worst classifiers.
The figure displays the improvement in AUC of all combination methods over the single best (blue bars) and single worst (red bars) classifiers on the training set (upper panel) and the test set (lower panel). Significance of the improvement on the training set was computed with a one-sided paired Wilcoxon test. Solidly colored bars indicate significant improvements (at a 0.05 p-value threshold), as opposed to lightly shaded bars for non-significant improvements. On the test set no p-values could be computed.
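A one-sided paired Wilcoxon signed-rank test of this kind can be sketched as below. The sketch assumes the pairs are the per-fold AUC values of a combination method and of a reference classifier from the 10-fold cross validation; that pairing, the function name, and the use of the exact (enumeration-based) null distribution are assumptions, since the caption does not state which variant of the test was used.

```python
from itertools import product


def wilcoxon_greater(x, y):
    """Exact one-sided paired Wilcoxon signed-rank test of H1: x > y.

    Returns (w_plus, p_value). Zero differences are dropped and tied
    absolute differences receive midranks.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    if not diffs:
        raise ValueError("all paired differences are zero")
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        midrank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    # Test statistic: sum of ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Under H0 every sign pattern is equally likely; enumerate all 2^n.
    count = sum(
        sum(r for r, s in zip(ranks, signs) if s) >= w_plus
        for signs in product((0, 1), repeat=len(ranks))
    )
    return w_plus, count / 2 ** len(ranks)
```

With 10 paired folds, the smallest attainable exact p-value is 1/1024, reached when the combination method beats the reference in every fold. A single test set yields only one AUC pair per comparison, so no such test can be applied there.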
Figure 2. Learning curves for the individual classifiers, the mean combiner, and the combination on feature level.
The figure shows the development of the mean AUC on the test set depending on the amount of available training data for the individual classifiers, the mean combiner, and the combination on the feature level using the minimal feature set. Error bars indicate the standard deviation over 10 repetitions.
Results on the (un)cleaned test set when individual classifiers are trained on the (un)cleaned training set.
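| Engine | Minimal feature set, cleaned: none | Minimal feature set, cleaned: only train | Minimal feature set, cleaned: only test | Minimal feature set, cleaned: both | Maximal feature set, cleaned: none | Maximal feature set, cleaned: only train | Maximal feature set, cleaned: only test | Maximal feature set, cleaned: both |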
| GD | 0.744 | 0.738 | 0.784 | 0.786 | 0.760 | 0.747 | 0.808 | 0.806 |
| ME | 0.745 | 0.739 | 0.770 | 0.771 | 0.742 | 0.757 | 0.808 | 0.810 |
| EV | 0.768 | 0.776 | 0.811 | 0.824 | 0.804 | 0.812 | 0.846 | 0.855 |
| Mean | 0.772 | 0.767 | 0.812 | 0.814 | 0.793 | 0.791 | 0.849 | 0.849 |
| Feature | 0.747 | 0.754 | 0.797 | 0.808 | 0.779 | 0.787 | 0.832 | 0.842 |
The table summarizes the results, measured in AUC, for the individual classifiers, the mean combiner, and the combination on the feature level when retrained on the (un)cleaned training set and tested on the (un)cleaned test set. Cleaned refers to the removal of ambiguous failing instances.
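The cleaning step can be sketched as a simple filter. The caption does not define "ambiguous"; the sketch assumes, following the failure subgroups distinguished in the Figure 3 caption, that an ambiguous failure is a failing therapy with at least one VL measurement below 500 during the regimen. The record layout (`label`, `vl`) and both function names are hypothetical.

```python
def is_ambiguous_failure(label, vl_measurements, cutoff=500):
    """True for a failing therapy (label 0) with any VL value below cutoff.

    This definition of "ambiguous" is an assumption, not taken from the
    table caption.
    """
    return label == 0 and any(vl < cutoff for vl in vl_measurements)


def clean(therapies, cutoff=500):
    """Return the therapies that are not ambiguous failures."""
    return [
        t for t in therapies
        if not is_ambiguous_failure(t["label"], t["vl"], cutoff)
    ]
```

Applying `clean` to the training set only, the test set only, or both would correspond to the "only train", "only test", and "both" columns of the table.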
AUC for the combined engines on training set and test set with the ambiguous cases removed from test set and training set or test set only.
| Method | Minimal feature set: Train | Minimal feature set: Test | Maximal feature set: Train | Maximal feature set: Test |
| removed from test only | ||||
| Single Best | 0.809 (0.021) | 0.811 | 0.839 (0.017) | 0.847 |
| Oracle | 0.935 (0.012) | 0.936 | 0.945 (0.014) | 0.950 |
| Min | 0.817 (0.019) | 0.807 | 0.847 (0.022) | 0.848 |
| Max | 0.807 (0.024) | 0.810 | 0.832 (0.018) | 0.824 |
| Median | 0.820 (0.020) | 0.810 | 0.844 (0.021) | 0.835 |
| Mean | 0.823 (0.019) | 0.816 | 0.850 (0.019) | 0.847 |
| Logistic Regression | 0.824 (0.019) | 0.816 | 0.852 (0.017) | 0.856 |
| Decision Templates | 0.823 (0.018) | 0.818 | 0.851 (0.019) | 0.850 |
| Clustering | 0.822 (0.020) | 0.808 | 0.852 (0.017) | 0.850 |
| Local Accuracy | 0.823 (0.019) | 0.813 | 0.850 (0.019) | 0.844 |
| removed from train and test | ||||
| Logistic Regression | 0.825 (0.019) | 0.816 | 0.852 (0.017) | 0.856 |
| Decision Templates | 0.823 (0.018) | 0.818 | 0.851 (0.019) | 0.850 |
| Clustering | 0.822 (0.021) | 0.796 | 0.852 (0.017) | 0.844 |
| Local Accuracy | 0.823 (0.019) | 0.813 | 0.850 (0.019) | 0.843 |
The table displays the results, measured in AUC, on training set (10-fold cross validation; standard deviation in brackets) and test set for a selection of combination approaches when trained on the (un)cleaned training set. For computation of the AUC the ambiguous cases were always removed.
Figure 3. Distribution of predicted success probabilities.
Distribution of the predicted success for all successful therapies (blue solid), all failing therapies (red solid), failing therapies with at least one VL measure below 500 during the regimen (red dashed), and failing therapies with all VL measures above 500 (red dotted) of the mean combiner using the minimal feature set.