Noelia Vállez, Oscar Deniz, Gloria Bueno.
Abstract
Automatic detection systems usually require large and representative training datasets in order to obtain good detection and false positive rates. Training datasets are such that the positive set has few samples and/or the negative set should represent anything except the object of interest. In this respect, the negative set typically contains orders of magnitude more images than the positive set. However, imbalanced training databases lead to biased classifiers. In this paper, we focus our attention on a negative sample selection method to properly balance the training data for cascade detectors. The method is based on the selection of the most informative false positive samples generated in one stage to feed the next stage. The results show that the proposed cascade detector with sample selection obtains on average better partial AUC and smaller standard deviation than the other compared cascade detectors.
Year: 2015 PMID: 26197221 PMCID: PMC4510611 DOI: 10.1371/journal.pone.0133059
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
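The negative-selection idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy stump classifier in `train_stage`, the balancing budget (one negative per positive), and in particular the reading of "most informative" as the false positives with the largest acceptance margin. The paper's actual stage classifier and selection criterion are not given in this record.

```python
import random

def train_stage(pos, neg):
    """Toy stage (assumption): pick the single-feature threshold that accepts
    every positive and rejects the most of the current negatives."""
    best = None
    for f in range(len(pos[0])):
        thr = min(p[f] for p in pos)            # lowest value seen among positives
        rejected = sum(1 for n in neg if n[f] < thr)
        if best is None or rejected > best[2]:
            best = (f, thr, rejected)
    return best[0], best[1]

def accepts(stage, x):
    f, thr = stage
    return x[f] >= thr

def margin(stage, x):
    """How far inside the accept region a sample lies for this stage."""
    f, thr = stage
    return x[f] - thr

def train_cascade(pos, neg_pool, n_stages=3, seed=0):
    rng = random.Random(seed)
    budget = len(pos)                           # keep the two classes balanced
    neg = rng.sample(neg_pool, min(budget, len(neg_pool)))
    cascade = []
    for _ in range(n_stages):
        stage = train_stage(pos, neg)
        cascade.append(stage)
        # False positives: pool negatives accepted by every stage built so far.
        fps = [x for x in neg_pool if all(accepts(s, x) for s in cascade)]
        if not fps:
            break
        # ASSUMPTION: "most informative" = largest margin in the latest stage.
        fps.sort(key=lambda x: margin(stage, x), reverse=True)
        neg = fps[:budget]                      # these feed the next stage
    return cascade

if __name__ == "__main__":
    pos = [(5, 5), (6, 7), (7, 6)]
    neg_pool = [(1, 9), (9, 1), (2, 8), (8, 2), (0, 0), (6, 1)]
    print(train_cascade(pos, neg_pool))
```

Capping the negatives at `budget` per stage is what keeps each stage's training set balanced even when the negative pool is orders of magnitude larger than the positive set.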
Fig 1. Viola-Jones cascade framework.
Fig 2. Proposed cascade framework training process.
Proposed cascade training algorithm.
Fig 3. Image samples from the datasets used.
Similar images from the CBCL dataset (a-d), similar images from the INRIA dataset (e-h), and images from the mammography dataset (i-l). Images in the first two columns contain objects, while the rest are negative samples.
Fig 4. Base Haar-like rectangle features.
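As a rough illustration of how a rectangle feature of this kind is usually evaluated, the sketch below computes a two-rectangle (left minus right) feature with an integral image. The function names, the window layout, and the sign convention are assumptions for illustration and are not taken from the paper.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the w-by-h rectangle whose top-left corner is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def haar_two_rect_horizontal(ii, x, y, w, h):
    """Left half minus right half of a w-by-h window (w assumed even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)

if __name__ == "__main__":
    img = [[1, 1, 5, 5],
           [1, 1, 5, 5]]
    ii = integral_image(img)
    print(haar_two_rect_horizontal(ii, 0, 0, 4, 2))  # dark left, bright right
```

Once the integral image is built, any rectangle sum costs four lookups, which is why features of this type are cheap enough to evaluate in every stage of a cascade.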
First-order statistics.
| Feature | Definition |
|---|---|
| 1. Mean | |
| 2. Mode | |
| 3. Variance | |
| 4. 1st quartile | |
| 5. 2nd quartile | |
| 6. 3rd quartile | |
| 7. Interquartile Range | #6 - #4 |
| 8. Minimum | |
| 9. Maximum | |
| 10. Value Range | |
| 11. Entropy | |
| 12. Asymmetry | |
| 13. Kurtosis | |
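
The thirteen first-order features named in the table can be sketched with textbook definitions, as below. These formulas are assumptions: the population variance, the inclusive quantile method, the base-2 histogram entropy, and the non-excess kurtosis are common choices, and the paper's own definitions are not available in this record.

```python
import math
import statistics
from collections import Counter

def first_order_stats(pixels):
    """First-order statistics of a flat list of gray levels (assumed definitions)."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((v - mean) ** 2 for v in pixels) / n       # population variance
    std = math.sqrt(var)
    q1, q2, q3 = statistics.quantiles(pixels, n=4, method="inclusive")
    hist = Counter(pixels)
    entropy = -sum((c / n) * math.log2(c / n) for c in hist.values())
    m3 = sum((v - mean) ** 3 for v in pixels) / n        # third central moment
    m4 = sum((v - mean) ** 4 for v in pixels) / n        # fourth central moment
    return {
        "mean": mean,
        "mode": hist.most_common(1)[0][0],               # ties: first value seen
        "variance": var,
        "q1": q1, "q2": q2, "q3": q3,
        "iqr": q3 - q1,                                  # feature 6 minus feature 4
        "min": min(pixels),
        "max": max(pixels),
        "value_range": max(pixels) - min(pixels),
        "entropy": entropy,
        "asymmetry": m3 / std ** 3 if std > 0 else 0.0,  # skewness
        "kurtosis": m4 / var ** 2 if var > 0 else 0.0,   # non-excess kurtosis
    }

if __name__ == "__main__":
    print(first_order_stats([1, 2, 2, 3, 4, 4, 4, 8]))
```

The function takes the pixels of one detection window as a flat list, so a 2-D window would be flattened first.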
Second-order statistics.
| Feature | Definition |
|---|---|
| 1. Energy | |
| 2. Correlation | |
| 3. Contrast | |
| 4. Variance | |
| 5. Sum Average | |
| 6. Sum Entropy | |
| 7. Sum Variance | |
| 8. Entropy | |
| 9. Difference Variance | |
| 10. Difference Entropy | |
| 11. Correlation Inf. 1 | |
| 12. Correlation Inf. 2 | |
| 13. Homogeneity 1 | |
| 14. Homogeneity 2 | |
| 15. Cluster Shade | |
| 16. Cluster Prominence | |
| 17. Autocorrelation | |
| 18. Dissimilarity | |
| 19. Maximum Probability | |
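
Features with these names are conventionally computed from a gray-level co-occurrence matrix (GLCM). The sketch below builds a normalized, symmetric GLCM for one pixel offset and derives a small subset of the listed features. The offset, the symmetric counting, the base-2 logarithm, and the squared-difference form of homogeneity are assumptions; the paper's exact definitions are not recoverable from this record.

```python
import math

def glcm(img, dx=1, dy=0, levels=None, symmetric=True):
    """Normalized co-occurrence matrix of gray levels for offset (dx, dy).
    Assumes integer gray levels and at least one valid pixel pair."""
    if levels is None:
        levels = max(max(row) for row in img) + 1
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(img), len(img[0])
    for y in range(h):
        for x in range(w):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < w and 0 <= y2 < h:
                i, j = img[y][x], img[y2][x2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1      # count the pair in both directions
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]

def glcm_features(p):
    """A subset of the listed second-order features (assumed definitions)."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "energy": sum(v * v for _, _, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "dissimilarity": sum(abs(i - j) * v for i, j, v in cells),
        "max_probability": max(v for _, _, v in cells),
    }

if __name__ == "__main__":
    img = [[0, 0, 1],
           [0, 1, 1]]
    print(glcm_features(glcm(img)))
```

In practice the gray levels are usually quantized to a small number of bins before building the matrix, since the matrix size grows with the square of the number of levels.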
Fig 5. Two ROCs, A and B, with the same AUC but different pAUC.
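The situation in this figure can be reproduced numerically. The sketch below integrates a piecewise-linear ROC curve up to a false positive rate limit with the trapezoidal rule. The two curves and the limit of 0.1 are made-up illustrative values, not data from the paper, and the paper's own FPR range for the pAUC is not stated in this record.

```python
def partial_auc(points, max_fpr):
    """Trapezoidal area under ROC points (fpr, tpr) for fpr in [0, max_fpr].
    Points must be sorted by increasing fpr and start at (0, 0)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Cut the segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area

if __name__ == "__main__":
    # Hypothetical curves: A rises steeply at low FPR, B rises later.
    roc_a = [(0.0, 0.0), (0.1, 0.6), (1.0, 1.0)]
    roc_b = [(0.0, 0.0), (0.4, 0.9), (1.0, 1.0)]
    for name, roc in (("A", roc_a), ("B", roc_b)):
        print(name, partial_auc(roc, 1.0), partial_auc(roc, 0.1))
```

Both hypothetical curves have a full AUC of 0.75, but over the low-FPR region curve A accumulates more area than curve B, which is the operating range that matters for a detector scanning many negative windows.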
pAUC results.
| | Haar-like | Statistical | Haar-like | Statistical | Haar-like | Statistical |
|---|---|---|---|---|---|---|
| pAUC | 0.3833 | 0.3666 | 0.3521 | 0.3727 | 0.2816 | 0.3384 |
| σ | 0.000 | 0.0032 | 0.0020 | 0.0023 | 0.0032 | 0.0027 |
| pAUC | 0.4842 | | 0.4359 | 0.4860 | 0.3458 | 0.3972 |
| σ | 0.0051 | 0.0061 | 0.0116 | 0.0093 | 0.0109 | 0.0123 |
| pAUC | 0.4364 | 0.4264 | 0.3940 | 0.4458 | | 0.3950 |
| σ | 0.0591 | 0.0468 | 0.0648 | 0.0406 | 0.0125 | 0.0386 |
| pAUC | | 0.4788 | | | 0.3398 | |
| σ | 0.0044 | 0.0054 | 0.0089 | 0.0063 | 0.0102 | 0.0113 |
Results obtained from applying the different cascade detectors over the different dataset and feature set combinations. The table shows the average pAUC values with their corresponding standard deviation σ. The best pAUC for each pair of database and feature set is highlighted in bold.
Fig 6. Average pAUC results of the different cascade detectors using the six dataset and feature set combinations.
The lines on top of the bars represent the standard deviation.