Yuan Wu, Xiaoqian Jiang, Jihoon Kim, Lucila Ohno-Machado.
Abstract
We proposed the I-spline Smoothing approach for calibrating predictive models by solving a nonlinear monotone regression problem. We took advantage of I-spline properties to obtain globally optimal solutions while keeping the computational cost low. Numerical studies on three data sets provided empirical evidence that I-spline Smoothing improves calibration (by factors of 1.6, 1.4, and 1.4 on the three data sets, relative to the average of the competing methods: Binning, Platt Scaling, Isotonic Regression, Monotone Spline Smoothing, and Smooth Isotonic Regression) without degrading discrimination.
Year: 2012 PMID: 22779048 PMCID: PMC3392066
Source DB: PubMed Journal: AMIA Jt Summits Transl Sci Proc
Generalized gradient projection algorithm to solve I-spline Smoothing.
| Step (formulas not recoverable from the source are marked […]) |
|---|
| The execution guarantees that […] |
| Replace […] |
| If the […] If there is at least one […] |
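The formulas of the algorithm are not available in this text, so the following is not the authors' generalized gradient projection method. It is a minimal sketch of the general idea of a monotone fit under non-negativity constraints, under several assumptions: a least-squares loss, a basis of linear ramps (the simplest, degree-1 I-splines) plus a constant term, and plain projected gradient descent. The names `basis`, `fit_monotone`, and `calibrate`, the knots, the step size, and the toy data are all hypothetical.

```python
def basis(x, knots):
    """Constant term plus one linear ramp (degree-1 I-spline) per knot interval."""
    ramps = [min(1.0, max(0.0, (x - lo) / (hi - lo)))
             for lo, hi in zip(knots[:-1], knots[1:])]
    return [1.0] + ramps


def fit_monotone(scores, labels, knots, steps=5000, lr=0.05):
    """Least-squares fit with coefficients kept non-negative by projection."""
    rows = [basis(s, knots) for s in scores]
    n, k = len(rows), len(knots)  # k = 1 constant + (len(knots) - 1) ramps
    coef = [0.0] * k
    for _ in range(steps):
        resid = [sum(c * b for c, b in zip(coef, row)) - y
                 for row, y in zip(rows, labels)]
        grad = [2.0 / n * sum(r * row[j] for r, row in zip(resid, rows))
                for j in range(k)]
        # Gradient step, then projection onto the feasible set {coef >= 0}.
        coef = [max(0.0, c - lr * g) for c, g in zip(coef, grad)]
    return coef


def calibrate(score, coef, knots):
    """Non-decreasing map from a raw score to a value in [0, 1]."""
    return min(1.0, sum(c * b for c, b in zip(coef, basis(score, knots))))


# Toy data (made up for illustration): raw scores and binary outcomes.
knots = [0.0, 0.25, 0.5, 0.75, 1.0]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1, 1]
coef = fit_monotone(scores, labels, knots)
print([round(calibrate(i / 10, coef, knots), 2) for i in range(11)])
```

Because every basis function is non-decreasing and every coefficient is non-negative, the fitted calibration function is monotone by construction, whatever the data look like.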
Summary of the data sets used in the experiments.
| Data set | Feature dimension | Sample size | Training / Test ratio | Note |
|---|---|---|---|---|
| GSE2034 | 15 | 209 | 6 / 4 | Breast cancer data set from the NCBI Gene Expression Omnibus (GEO), used to construct a decision support system for predicting breast cancer recurrence from extracted gene expression features. We followed Osl et al |
| Edin (MI) | 48 | 1,253 | 6 / 4 | These data contain clinical and electrocardiographic information about 500 patients with and without myocardial infarction (MI) admitted with chest pain to an emergency department in Sheffield, England. The study aimed to determine which, and how many, data items are required to construct a decision support system for early diagnosis of acute myocardial infarction. |
| PIMATR | 8 | 768 | 6 / 4 | Pima Indians Diabetes data set from the National Institute of Diabetes and Digestive and Kidney Diseases. The population lives near Phoenix, Arizona, USA, and all patients are females at least 21 years old of Pima Indian heritage. |
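To make the 6 / 4 training / test ratio concrete, here is a small sketch of how such a split could be produced. How the authors actually partitioned the data (random or not, stratified or not) is not stated here, so the shuffled split, the seed, and the helper name `split_6_4` are assumptions.

```python
import random


def split_6_4(records, seed=0):
    """Shuffle a copy of `records` and cut it 60% training / 40% test."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 6 // 10  # integer arithmetic, rounds down
    return shuffled[:cut], shuffled[cut:]


# Sample sizes taken from the table; the records are just row indices.
for name, n in [("GSE2034", 209), ("Edin (MI)", 1253), ("PIMATR", 768)]:
    train, test = split_6_4(range(n))
    print(name, len(train), len(test))
```

With this rounding, the three data sets yield 125 / 84, 751 / 502, and 460 / 308 training / test records respectively.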
Figure 1: Calibration functions of four existing approaches, Binning, Platt Scaling (PS), Isotonic Regression (IR), and Smooth Isotonic Regression (SIR), and of our proposed method, I-spline Smoothing (IS).
Figure 2: AUCs and Hosmer-Lemeshow (HL) test results of all five methods on the three data sets.
Performance of the different models on the three data sets. Each cell appears to give (AUC) / (HL-test p-value), the two measures named in the Figure 2 caption.
| Data set | Logistic Regression (LR) | Platt Scaling (PS) | Isotonic Regression (IR) | Smooth Isotonic Regression (SIR) | I-spline Smoothing (IS) |
|---|---|---|---|---|---|
| GSE2034 | (0.81±0.04) / (0.28) | (0.81±0.04) / (0.44) | (0.80±0.05) / (0.17) | (0.80±0.05) / (0.16) | (0.81±0.04) / (0.43) |
| Edin (MI) | (0.89±0.02) / (0.14) | (0.89±0.02) / (0.09) | (0.89±0.02) / (0.31) | (0.89±0.02) / (0.30) | (0.89±0.02) / (0.29) |
| PIMATR | (0.82±0.05) / (0.57) | (0.82±0.05) / (0.66) | (0.80±0.05) / (0.37) | (0.81±0.05) / (0.32) | (0.82±0.05) / (0.73) |
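For readers unfamiliar with the two measures, the sketch below shows one common way to compute an AUC (discrimination) and a Hosmer-Lemeshow statistic with its p-value (calibration). It is an illustration only: the authors' exact grouping and degrees of freedom are not given here, so the equal-size groups, the conventional `groups - 2` degrees of freedom, the helper names, and the toy data are assumptions.

```python
import math


def auc(scores, labels):
    """Chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow(probs, labels, groups=10):
    """HL chi-square statistic over equal-size groups of sorted predictions."""
    pairs = sorted(zip(probs, labels))
    n, stat = len(pairs), 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        mean_p = expected / len(chunk)
        variance = len(chunk) * mean_p * (1.0 - mean_p)
        if variance > 0.0:
            stat += (observed - expected) ** 2 / variance
    return stat


def chi2_sf_even(x, df):
    """Chi-square survival function; this closed form holds for even df only."""
    assert df > 0 and df % 2 == 0
    half, term, total = x / 2.0, 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Toy predictions and outcomes (made up for illustration).
probs = [0.05 * i + 0.025 for i in range(20)]
labels = [0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1]
stat = hosmer_lemeshow(probs, labels, groups=10)
p_value = chi2_sf_even(stat, 10 - 2)
print(round(auc(probs, labels), 3), round(stat, 3), round(p_value, 3))
```

A larger HL p-value means less evidence of miscalibration, whereas a higher AUC means better discrimination, so the two numbers should be read in opposite senses from a hypothesis-testing point of view.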
Summary of popular calibration approaches.
| Approach | Monotonic | Non-parametric | Non-exponential complexity | Continuous |
|---|---|---|---|---|
| Binning | x | x | | |
| Platt scaling | x | x | | |
| Isotonic Regression | x | x | x | |
| Smooth Isotonic Regression | x | x | x | x |
| Monotone Spline Smoothing | x | x | x | |
| I-spline Smoothing | x | x | x | x |
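The monotonicity property in the table can be illustrated with two of the simpler approaches. The sketch below contrasts equal-width binning, which can produce a calibration map that goes up and down, with isotonic regression computed by the pool-adjacent-violators procedure, which cannot. The equal-width bins, the helper names `binning_rates` and `pav`, and the toy data are assumptions made for illustration, not the configuration used in the experiments.

```python
def binning_rates(scores, labels, n_bins=5):
    """Observed positive rate per equal-width score bin (None if a bin is empty)."""
    pos, tot = [0] * n_bins, [0] * n_bins
    for s, y in zip(scores, labels):
        b = min(int(s * n_bins), n_bins - 1)
        pos[b] += y
        tot[b] += 1
    return [p / t if t else None for p, t in zip(pos, tot)]


def pav(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block is [mean, count]
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while the last two blocks violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, c2 = blocks.pop()
            m1, c1 = blocks.pop()
            blocks.append([(m1 * c1 + m2 * c2) / (c1 + c2), c1 + c2])
    fitted = []
    for mean, count in blocks:
        fitted.extend([mean] * count)
    return fitted


# Toy data (made up): scores already sorted in increasing order.
scores = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
labels = [0, 0, 1, 1, 0, 0, 1, 1, 1, 1]
binned = binning_rates(scores, labels)
isotonic = pav(labels)
print(binned)
print([round(v, 2) for v in isotonic])
```

On this toy data the binned rates dip in the middle bin, while the isotonic fit pools the offending observations into a flat segment. Both maps are step functions, which is the discontinuity that the smooth variants in the table aim to remove.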