| Literature DB >> 33286263 |
Adrian Carballal1,2, Carlos Fernandez-Lozano1,2, Nereida Rodriguez-Fernandez1,3, Iria Santos1,3, Juan Romero1,3.
Abstract
Estimating the visual complexity of an image, in terms of impact or aesthetic preference, has broad applicability in areas such as psychology and marketing. To this end, fields such as Computer Vision have focused on identifying features and computational models that yield satisfactory results. This paper studies the application of recent ML models using input images evaluated by humans and characterized by features related to visual complexity. According to the experiments carried out, one of these methods, Correlation by Genetic Search (CGS), based on the search for minimal sets of features that maximize the correlation of the model with the input data, predicted human ratings of image visual complexity better than any other model referenced to date in terms of correlation, RMSE, and the minimum number of features required by the model. In addition, the variability of these metrics was studied after eliminating images considered outliers in previous studies, confirming the robustness of the method when selecting the most important variables for the prediction.
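The core idea behind CGS, as described in the abstract, is a genetic search over feature subsets whose fitness rewards correlation with the human ratings while favoring small subsets. A minimal sketch of that idea is shown below; the fitness function (least-squares fit plus a size penalty), the operator choices, and all parameter values are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def fitness(mask, X, y, size_penalty=0.01):
    """Correlation of a least-squares fit on the selected features,
    penalized by subset size (hypothetical fitness; the paper's exact
    formulation may differ)."""
    if not mask.any():
        return -np.inf
    Xs = np.c_[np.ones(X.shape[0]), X[:, mask]]  # add intercept column
    w, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    r = np.corrcoef(Xs @ w, y)[0, 1]
    if np.isnan(r):
        return -np.inf
    return r - size_penalty * mask.sum()

def genetic_search(X, y, pop_size=20, generations=50, rng=None):
    """Evolve boolean feature masks: keep the fitter half, refill the
    population with one-point crossover plus bit-flip mutation."""
    rng = np.random.default_rng(rng)
    n = X.shape[1]
    pop = rng.random((pop_size, n)) < 0.5  # random initial masks
    for _ in range(generations):
        scores = np.array([fitness(m, X, y) for m in pop])
        elite = pop[np.argsort(scores)[::-1][: pop_size // 2]]
        children = []
        for _ in range(pop_size - len(elite)):
            a, b = elite[rng.integers(len(elite), size=2)]
            cut = rng.integers(1, n)  # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            children.append(child ^ (rng.random(n) < 0.05))  # mutate
        pop = np.vstack([elite] + children)
    scores = np.array([fitness(m, X, y) for m in pop])
    return pop[np.argmax(scores)]
```

The returned boolean mask selects the feature subset whose fitted predictions correlate best with the ratings, which mirrors the abstract's description of CGS as maximizing correlation with a minimum set of features.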
Keywords: compression error; correlation; human-computer interaction; machine learning; psychiatry and psychology; visual complexity; visual stimuli
Year: 2020 PMID: 33286263 PMCID: PMC7516971 DOI: 10.3390/e22040488
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. Examples of stimuli of each category.
Figure 2. Example of the end of the evolutionary process. The example consists of 20 individuals and an average threshold of 0.05 (red line). Each individual is labeled in the graph as Ind_X (where X is the number of the individual). The shade of green darkens with more iterations. Once the average difference falls below the pre-set threshold value (0.05 in this case), the iterations end and the evolutionary process terminates.
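The stopping rule the caption describes can be sketched as a small convergence check. Interpreting "average difference" as the mean absolute change in the population's fitness between consecutive generations is an assumption; the threshold of 0.05 comes from the caption.

```python
import numpy as np

def has_converged(prev_fitness, curr_fitness, threshold=0.05):
    """Stop the evolutionary process when the mean absolute change in
    fitness between consecutive generations drops below the threshold
    (one reading of the caption's 'average difference')."""
    diff = np.abs(np.asarray(curr_fitness) - np.asarray(prev_fitness))
    return float(diff.mean()) < threshold
```

Under this interpretation, the loop in the evolutionary process would evaluate `has_converged` after each generation and terminate as soon as it returns `True`.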
Figure 3. Results obtained in the experiments carried out with different ML models: Elastic Net (ENET), Feature Selection Multiple Kernel Learning (FSMKL), Partial Least Squares Regression (PLS), Random Forest (RF), Random Forest–Recursive Feature Elimination (RF-RFE), Support Vector Machine–Radial (SVM-Radial), and Correlation by Genetic Search (CGS). The behavior of each model is shown on two different datasets: the dataset with outliers removed and the complete, unprocessed set.
Figure 4. Statistical comparison among the four ML models (RF, RF-RFE, SVM-Radial and CGS) that obtained the best results during the analysis phase. Panel (a) shows the comparison of results once the outliers are eliminated from the dataset; panel (b) shows the statistical comparison with the complete, unprocessed dataset.
Set of features used by CGS, ordered by average appearances. Values are the average appearances across 50 independent runs of 10-fold CV, using all images (With Outliers) and after eliminating previously identified outliers (Without Outliers). All features are named using the terminology proposed in [20].
| Feature | Avg Appearances (With Outliers) | Avg Appearances (Without Outliers) |
|---|---|---|
| Rank(NoFilter(V), M) | 100.00% | 100.00% |
| Size(NoFilter(H), M) | 99.60% | 98.80% |
| JPEG(Canny(S), Low) | 81.20% | 81.80% |
| Size(NoFilter(H+CS), M) | 80.20% | 81.00% |
| STD(Canny(S), value) | 80.20% | 81.20% |
| Fractal(NoFilter(H), High) | 79.60% | 81.00% |
| JPEG(NoFilter(H), High) | 66.20% | 66.40% |
| Box-Counting(Canny(V), M) | 62.00% | 63.40% |
| JPEG(Canny(S), Medium) | 61.00% | 58.60% |
| JPEG(Canny(S), High) | 58.60% | 58.00% |
| Size(Canny(H+CS), M) | 56.20% | 58.00% |
a Features previously identified by Machado et al. [20] as the best individual features for solving the problem.
b Features previously identified by Fernandez-Lozano et al. [21] as the best individual features for solving the problem.