Lin Chen, Ji-Ting Jia, Qiong Zhang, Wan-Yu Deng, Wei Wei.
Abstract
We propose a simple online learning algorithm designed especially for high-dimensional data. The algorithm, referred to as the online sequential projection vector machine (OSPVM), derives from the projection vector machine and can learn from data in one-by-one or chunk-by-chunk mode. In OSPVM, data centering, dimension reduction, and neural network training are integrated seamlessly. In particular, the model parameters, including (1) the projection vectors for dimension reduction, (2) the input weights, biases, and output weights, and (3) the number of hidden nodes, can be updated simultaneously. Moreover, only one parameter, the number of hidden nodes, needs to be determined manually, which makes the algorithm easy to use in real applications. OSPVM was compared on various high-dimensional classification problems against other fast online algorithms, including the budgeted stochastic gradient descent (BSGD) approach, the adaptive multihyperplane machine (AMM), the primal estimated subgradient solver (Pegasos), the online sequential extreme learning machine (OSELM), and SVD + OSELM (feature selection based on SVD is performed before OSELM). The results demonstrate the superior generalization performance and efficiency of OSPVM.
Year: 2016 PMID: 27143958 PMCID: PMC4838813 DOI: 10.1155/2016/5197932
Source DB: PubMed Journal: Comput Intell Neurosci
Algorithm 1: OSPVM algorithm.
The specifications of the benchmark problems.
| Dataset | #Training set | #Testing set | #Attributes | #Classes |
|---|---|---|---|---|
| Face | 200 | 200 | 1600 | 10 |
| Secom | 1254 | 313 | 590 | 2 |
| Arcene | 400 | 500 | 10000 | 2 |
| Dexter | 1400 | 1200 | 20000 | 2 |
| Multi.fea. | 400 | 1600 | 650 | 10 |
| News20 | 3993 | 15935 | 62061 | 20 |
| Sector | 3207 | 6412 | 55197 | 105 |
Comparison of OSPVM, Batch-PVM, BSGD, AMM, and Pegasos.
| Dataset | Algorithms | Nodes (θ) | Training time (s) | Testing time (s) | Training accuracy | Testing accuracy |
|---|---|---|---|---|---|---|
| Face | OSPVM (40, 16-by-16) | 51 (0.96) | 1.50 s | 0.0004 s | 99.89% | 92.87% |
| | OSPVM (40, 1-by-1) | 43 (0.99) | 13.07 s | 0.0005 s | 99.20% | 91.20% |
| | Batch-PVM | 65 | 0.460 s | 0.0005 s | 99.81% | 92.30% |
| | SVD + BSGD | 200 | 1.542 s | 0.0835 s | 99.92% | 91.63% |
| | SVD + AMM Online | 200 | 1.990 s | 0.0300 s | 99.82% | 88.75% |
| | SVD + Pegasos | — | 1.530 s | 0.0240 s | 99.11% | 86.38% |
| Secom | OSPVM (40, 16-by-16) | 61 (0.96) | 1.67 s | 0.007 s | 94.08% | 93.14% |
| | OSPVM (40, 1-by-1) | 16 (0.96) | 4.01 s | 0.0004 s | 94.14% | 93.3% |
| | Batch-PVM | 60 | 0.525 s | 0.0073 s | 93.37% | 93.35% |
| | SVD + BSGD | 100 | 1.801 s | 0.0083 s | 95.12% | 93.13% |
| | SVD + AMM Online | 100 | 12.19 s | 0.031 s | 94.11% | 87.87% |
| | SVD + Pegasos | — | 1.660 s | 0.026 s | 93.16% | 89.12% |
| Arcene | OSPVM (40, 16-by-16) | 106 (0.96) | 61.17 s | 0.0005 s | 95.88% | 90.50% |
| | OSPVM (40, 1-by-1) | 39 (0.96) | 130.6 s | 0.0004 s | 93.5% | 86.7% |
| | Batch-PVM | 85 | 5.06 s | 0.00038 s | 94.63% | 90.80% |
| | SVD + BSGD | 200 | 65.22 s | 0.0335 s | 95.92% | 90.43% |
| | SVD + AMM Online | 200 | 81.69 s | 0.06 s | 94.89% | 87.75% |
| | SVD + Pegasos | — | 56.41 s | 0.044 s | 94.42% | 86.31% |
| Dexter | OSPVM (40, 16-by-16) | 176 (0.96) | 131.1 s | 0.004 s | 97.88% | 92.25% |
| | OSPVM (40, 1-by-1) | 86 (0.96) | 619.3 s | 0.004 s | 96.0% | 91.20% |
| | Batch-PVM | 160 | 10.36 s | 0.005 s | 98.38% | 91.25% |
| | SVD + BSGD | 200 | 148.54 s | 0.003 s | 97.98% | 92.63% |
| | SVD + AMM Online | 200 | 178.19 s | 0.003 s | 96.81% | 89.95% |
| | SVD + Pegasos | — | 119.40 s | 0.004 s | 95.87% | 87.36% |
| Multi.fea. | OSPVM (40, 16-by-16) | 55 (0.96) | 4.93 s | 0.0053 s | 98.16% | 94.40% |
| | OSPVM (40, 1-by-1) | 38 (0.96) | 13.4 s | 0.0047 s | 96.6% | 93.4% |
| | Batch-PVM | 160 | 1.83 s | 0.0192 s | 99.98% | 95.67% |
| | SVD + BSGD | 200 | 5.54 s | 0.0095 s | 98.42% | 94.63% |
| | SVD + AMM Online | 200 | 10.79 s | 0.03 s | 99.82% | 92.15% |
| | SVD + Pegasos | — | 4.46 s | 0.034 s | 99.82% | 91.88% |
| News20 | OSPVM (40, 16-by-16) | 1110 (0.96) | 1283 s | 19.8 s | 85.26% | 83.10% |
| | OSPVM (40, 1-by-1) | 1100 (0.96) | 1949 s | 19.9 s | 85.6% | 83.14% |
| | Batch-PVM | 1000 | 1060 s | 19.2 s | 84.89% | 83.12% |
| | SVD + BSGD | 1200 | 2289 s | 18.6 s | 83.52% | 82.33% |
| | SVD + AMM Online | 1200 | 2679 s | 21.3 s | 83.83% | 82.25% |
| | SVD + Pegasos | — | 1679 s | 19.2 s | 83.22% | 81.81% |
| Sector | OSPVM (40, 16-by-16) | 130 (0.96) | 10.12 s | 0.20 s | 88.86% | 78.40% |
| | OSPVM (40, 1-by-1) | 150 (0.96) | 18.4 s | 0.21 s | 86.6% | 79.04% |
| | Batch-PVM | 160 | 2.13 s | 0.21 s | 87.98% | 79.01% |
| | SVD + BSGD | 200 | 7.53 s | 0.34 s | 87.44% | 76.68% |
| | SVD + AMM Online | 200 | 12.69 s | 0.33 s | 86.81% | 76.65% |
| | SVD + Pegasos | — | 6.45 s | 0.34 s | 86.12% | 75.88% |
Note: since OSPVM is equivalent to PVM rather than an approximation, if it has the same experimental setting (same number of hidden nodes and same training and testing splits), OSPVM and PVM would obtain the same performance (training accuracy and testing accuracy).
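The kind of exact equivalence described in this note can be illustrated with the recursive least-squares update that OSELM-style methods use for output weights. The following is a minimal sketch on synthetic data, not the OSPVM algorithm itself: the function names, the random hidden layer, and the sizes are all assumptions made for illustration. It shows that updating output weights one sample at a time, or 16 samples at a time, reproduces the batch least-squares solution up to floating-point error.

```python
import numpy as np

def batch_output_weights(H, T):
    """Least-squares output weights computed from all data at once."""
    return np.linalg.solve(H.T @ H, H.T @ T)

def sequential_output_weights(H, T, n_init, chunk_size):
    """Same solution obtained chunk by chunk via recursive least squares."""
    H0, T0 = H[:n_init], T[:n_init]
    P = np.linalg.inv(H0.T @ H0)      # inverse Gram matrix of the initial block
    beta = P @ H0.T @ T0              # initial output weights
    for start in range(n_init, H.shape[0], chunk_size):
        Hk = H[start:start + chunk_size]
        Tk = T[start:start + chunk_size]
        # Woodbury identity: update P without re-inverting the full Gram matrix
        K = np.linalg.inv(np.eye(Hk.shape[0]) + Hk @ P @ Hk.T)
        P = P - P @ Hk.T @ K @ Hk @ P
        # Correct the weights using the prediction error on the new chunk
        beta = beta + P @ Hk.T @ (Tk - Hk @ beta)
    return beta

# Synthetic stand-in for hidden-layer activations (hypothetical sizes)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
W = rng.normal(size=(10, 8))
b = rng.normal(size=8)
H = np.tanh(X @ W + b)                # 200 samples, 8 hidden nodes
T = rng.normal(size=(200, 3))         # 3 output targets

beta_batch = batch_output_weights(H, T)
beta_1 = sequential_output_weights(H, T, n_init=20, chunk_size=1)
beta_16 = sequential_output_weights(H, T, n_init=20, chunk_size=16)

print("max |1-by-1 - batch|  :", np.abs(beta_1 - beta_batch).max())
print("max |16-by-16 - batch|:", np.abs(beta_16 - beta_batch).max())
```

The initial block must contain at least as many samples as hidden nodes so that its Gram matrix is invertible; the value 20 here is an arbitrary choice for the sketch.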
t value and significance level of OSPVM versus BSGD, AMM, and Pegasos.
| | SVD + BSGD (88.78%) | SVD + AMM (86.47%) | SVD + Pegasos (85.53%) |
|---|---|---|---|
| OSPVM (16-by-16) (89.23%) | | | |
| OSPVM (1-by-1) (88.18%) | | | |
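The percentages in the header of the table above appear to be testing accuracies averaged over the seven benchmark datasets. As a rough illustration of how such a comparison could be computed, the sketch below applies a paired t statistic to the 16-by-16 OSPVM testing accuracies and the three baselines, using the values reported in the comparison table. The choice of a paired test and the helper name `paired_t` are assumptions; the extracted text does not state which variant of the t-test was used.

```python
import math
import statistics

# Testing accuracies (%) in dataset order:
# Face, Secom, Arcene, Dexter, Multi.fea., News20, Sector
ospvm_16 = [92.87, 93.14, 90.50, 92.25, 94.40, 83.10, 78.40]
baselines = {
    "SVD + BSGD":    [91.63, 93.13, 90.43, 92.63, 94.63, 82.33, 76.68],
    "SVD + AMM":     [88.75, 87.87, 87.75, 89.95, 92.15, 82.25, 76.65],
    "SVD + Pegasos": [86.38, 89.12, 86.31, 87.36, 91.88, 81.81, 75.88],
}

def paired_t(a, b):
    """Paired t statistic (assumed test variant) for matched samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / std_err

mean_ospvm = statistics.mean(ospvm_16)
results = {}
for name, accs in baselines.items():
    results[name] = (statistics.mean(accs), paired_t(ospvm_16, accs))
    print(f"{name}: mean = {results[name][0]:.2f}%, t = {results[name][1]:.3f}")
print(f"OSPVM (16-by-16): mean = {mean_ospvm:.2f}%")
```

With seven datasets the statistic has 6 degrees of freedom. A p-value would come from the Student t distribution, for instance via `scipy.stats.ttest_rel`, which is left out here to keep the sketch dependency-free.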
Comparison of training and testing accuracy (in %) (one-by-one).
| Dataset | SVD + OSELM training accuracy | SVD + OSELM testing accuracy | OSPVM training accuracy | OSPVM testing accuracy | OSELM training accuracy | OSELM testing accuracy |
|---|---|---|---|---|---|---|
| Face | 99.8% | 91.0% | 99.2% | 91.2% | 98.1% | 88.5% |
| Secom | 93.3% | 93.0% | 94.14% | 93.3% | 93.2% | 92.4% |
| Arcene | 93.0% | 83.0% | 93.5% | 86.7% | 86.1% | 81.1% |
| Dexter | 95.7% | 91.4% | 96.0% | 91.2% | 75.6% | 86.2% |
| Multi.fea. | 99.0% | 92.8% | 96.6% | 93.4% | 96.5% | 93.0% |
| News20 | 85.12% | 82.9% | 85.6% | 83.14% | 85.5% | 83.0% |
| Sector | 89.11% | 77.8% | 88.6% | 79.04% | 89.1% | 78.1% |
t value and significance level of OSPVM versus SVD + OSELM (1-by-1) and OSELM (1-by-1).
| | SVD + OSELM (1-by-1) (86.73%) | OSELM (1-by-1) (86.04%) |
|---|---|---|
| OSPVM (1-by-1) (88.18%) | | |
Comparison of training and testing time (in seconds) (one-by-one).
| Dataset | SVD + OSELM training time | SVD + OSELM testing time | OSPVM training time | OSPVM testing time | OSELM training time | OSELM testing time |
|---|---|---|---|---|---|---|
| Face | 22.40 s | 0.0006 s | 13.07 s | 0.0005 s | 0.156 s | 0.035 s |
| Secom | 7.809 s | 0.015 s | 4.010 s | 0.0004 s | 0.346 s | 0.029 s |
| Arcene | 131.5 s | 0.0004 s | 130.6 s | 0.0004 s | 4.390 s | 0.337 s |
| Dexter | 619.3 s | 0.001 s | 519.8 s | 0.0006 s | 9.218 s | 0.281 s |
| Multi.fea. | 13.51 s | 0.042 s | 13.40 s | 0.0167 s | 1.164 s | 0.097 s |
| News20 | 1987 s | 19.1 s | 1949 s | 19.9 s | 611 s | 19.7 s |
| Sector | 18.79 s | 0.22 s | 18.4 s | 0.21 s | 3.34 s | 0.39 s |
The number of hidden nodes (one-by-one).
| Dataset | SVD + OSELM (#target dimensions) | SVD + OSELM (#hidden nodes) | OSPVM | OSELM |
|---|---|---|---|---|
| Face | 43 | 60 | 43 | 72 |
| Secom | 16 | 60 | 16 | 72 |
| Arcene | 39 | 110 | 39 | 160 |
| Dexter | 86 | 170 | 86 | 200 |
| Multi.fea. | 38 | 180 | 38 | 160 |
| News20 | 780 | 1200 | 1100 | 1200 |
| Sector | 90 | 150 | 150 | 250 |
Comparison of training and testing accuracy (in %) (16-by-16).
| Dataset | SVD + OSELM training accuracy | SVD + OSELM testing accuracy | OSPVM training accuracy | OSPVM testing accuracy | OSELM training accuracy | OSELM testing accuracy |
|---|---|---|---|---|---|---|
| Face | 99.82% | 91.50% | 99.89% | 92.07% | 98.22% | 87.7% |
| Secom | 93.34% | 93.36% | 94.08% | 93.14% | 93.32% | 93.3% |
| Arcene | 93.63% | 89.90% | 95.88% | 90.50% | 94.1% | 89.7% |
| Dexter | 89.75% | 91.90% | 97.88% | 92.25% | 72.6% | 88.5% |
| Multi.fea. | 99.48% | 93.49% | 98.16% | 94.40% | 96.78% | 93.0% |
| News20 | 86.11% | 83.09% | 85.26% | 83.10% | 85.24% | 81.0% |
| Sector | 89.18% | 78.19% | 88.86% | 78.40% | 88.78% | 76.20% |
Comparison of training and testing time (in seconds) (16-by-16).
| Dataset | SVD + OSELM training time | SVD + OSELM testing time | OSPVM training time | OSPVM testing time | OSELM training time | OSELM testing time |
|---|---|---|---|---|---|---|
| Face | 1.58 s | 0.0005 s | 1.5 s | 0.0004 s | 0.078 s | 0.035 s |
| Secom | 1.85 s | 0.018 s | 1.67 s | 0.007 s | 0.061 s | 0.040 s |
| Arcene | 61.4 s | 0.0008 s | 61.17 s | 0.0005 s | 2.03 s | 0.55 s |
| Dexter | 135.7 s | 0.0006 s | 131.1 s | 0.0004 s | 4.88 s | 0.718 s |
| Multi.fea. | 5.15 s | 0.0218 s | 4.93 s | 0.0053 s | 0.26 s | 0.098 s |
| News20 | 1283 s | 19.8 s | 1283 s | 19.8 s | 0.26 s | 0.098 s |
| Sector | 10.7 s | 0.21 s | 10.12 s | 0.20 s | 5.26 s | 0.38 s |
t value and significance level (p) of OSPVM versus SVD + OSELM (16-by-16) and OSELM (16-by-16).
| | SVD + OSELM (16-by-16) (88.7%) | OSELM (16-by-16) (87.05%) |
|---|---|---|
| OSPVM (16-by-16) (89.23%) | | |
The number of hidden nodes (16-by-16).
| Dataset | SVD + OSELM (#target dimensions) | SVD + OSELM (#hidden nodes) | OSPVM | OSELM |
|---|---|---|---|---|
| Face | 62 | 60 | 62 | 72 |
| Secom | 54 | 60 | 54 | 72 |
| Arcene | 110 | 110 | 106 | 300 |
| Dexter | 176 | 170 | 176 | 400 |
| Multi.fea. | 55 | 180 | 55 | 160 |
| News20 | 780 | 1200 | 1100 | 1200 |
| Sector | 90 | 150 | 150 | 250 |
Figure 1: Adaptive model updating with respect to new samples (on Face dataset).
Training and testing accuracy of PVM and OSPVM with the same hidden nodes.
| Dataset | Algorithms | #Hidden nodes | Training accuracy | Testing accuracy |
|---|---|---|---|---|
| Face | OSPVM (1-by-1 and 16-by-16) | 65 | 99.81% | 92.30% |
| | Batch-PVM | 65 | 99.81% | 92.30% |
| Secom | OSPVM (1-by-1 and 16-by-16) | 60 | 93.37% | 93.35% |
| | Batch-PVM | 60 | 93.37% | 93.35% |
| Arcene | OSPVM (1-by-1 and 16-by-16) | 85 | 94.63% | 90.80% |
| | Batch-PVM | 85 | 94.63% | 90.80% |
| Dexter | OSPVM (1-by-1 and 16-by-16) | 160 | 98.38% | 91.25% |
| | Batch-PVM | 160 | 98.38% | 91.25% |
| Multi.fea. | OSPVM (1-by-1 and 16-by-16) | 160 | 99.98% | 95.67% |
| | Batch-PVM | 160 | 99.98% | 95.67% |
| News20 | OSPVM (1-by-1 and 16-by-16) | 1000 | 84.89% | 83.12% |
| | Batch-PVM | 1000 | 84.89% | 83.12% |
| Sector | OSPVM (1-by-1 and 16-by-16) | 160 | 87.98% | 79.01% |
| | Batch-PVM | 160 | 87.98% | 79.01% |
Figure 2: The influence of the mean update on OSPVM.
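Data centering in an online setting requires the sample mean to be refreshed as each new sample or chunk arrives. The sketch below shows the standard running-mean update for a chunk of samples; it is a generic illustration under the assumption that the mean is tracked incrementally, and the function name `update_mean` and the toy data are hypothetical rather than taken from the paper.

```python
def update_mean(mean, n_seen, chunk):
    """Fold a chunk of samples into a running per-feature mean.

    mean   : current mean vector (list of floats)
    n_seen : number of samples already folded into `mean`
    chunk  : list of new samples, each a list with one value per feature
    """
    k = len(chunk)
    total = n_seen + k
    # Per-feature sum over the new chunk
    chunk_sum = [sum(column) for column in zip(*chunk)]
    # new_mean = (n_seen * mean + chunk_sum) / total, written as a correction
    new_mean = [m + (s - k * m) / total for m, s in zip(mean, chunk_sum)]
    return new_mean, total

# Toy stream of five two-dimensional samples, consumed in chunks of two
stream = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
running_mean, count = [0.0, 0.0], 0
for start in range(0, len(stream), 2):
    running_mean, count = update_mean(running_mean, count, stream[start:start + 2])

# Batch mean over the full stream, for comparison
batch_mean = [sum(column) / len(stream) for column in zip(*stream)]
print(running_mean, batch_mean)
```

The same function covers the one-by-one case by passing chunks of length one, so no samples need to be stored once they have been folded into the mean.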