Baixi Xing1,2, Kejun Zhang2, Lekai Zhang1,2, Xinda Wu2, Huahao Si3, Hui Zhang2, Kaili Zhu2, Shouqian Sun2.
Abstract
Aesthetic perception is a human instinct that is responsive to multimedia stimuli. Giving computers the ability to assess human sensory and perceptual experience of aesthetics is a well-recognized need for the intelligent design industry and multimedia intelligence study. In this work, we constructed a novel database for the aesthetic evaluation of design, using 2,918 images collected from the archives of two major design awards, and we also present a method of aesthetic evaluation that uses machine learning algorithms. Reviewers' ratings of the design works are set as the ground-truth annotations for the dataset. Furthermore, multiple image features are extracted and fused. The experimental results demonstrate the validity of the proposed approach. Primary screening using aesthetic computing can serve as an intelligent assistant for various design evaluations and can reduce misjudgment in art and design review caused by visual aesthetic fatigue after a long period of viewing. The study of computational aesthetic evaluation can improve the efficiency of design review, and it is of great significance to the exploration of aesthetic recognition and to application development.
Year: 2020 PMID: 31961909 PMCID: PMC6974033 DOI: 10.1371/journal.pone.0227754
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Multimedia aesthetic-aware modeling approaches.
| Reference | Features | Classifier/Method | Descriptors | Dataset | Annotators | Results |
|---|---|---|---|---|---|---|
| Tian, | Image | DCNNs | Aesthetic | AVA | N/A | Kendall’s tau-b: 0.487 |
| Liao, | Image | Statistics | Balance | 26000 | N/A | Output scores: |
| Sheng, | Image | Multi-patch | Aesthetic | AVA | N/A | Catalyst: 83.03% |
| Qian, | Image (SIFT, | Crowdsourced | Image | POI | 7387 Users | POI summarization |
| Ren, | Image | Active learning | Aesthetic | FLICKR- | Amazon | Direct score prediction: 0.039 |
| Kucer, | Hand-crafted features | VGG-16 | Mean aesthetic | HB | N/A | Accuracy: 81.95% |
| Chen, | Image | Fuzzy-rule-based | Aesthetic rating | Webpage | N/A | Better predictive ability than linear regression model |
| Maity, | Image, text, | SVM | Aesthetic | 250 images | 83 Users/ | Image RMSE: 0.68 |
| Persada, | Information | Kansei engineering | Feasibility | 20 | 47 Students | Feasibility of use |
| Wu, | Pattern Image | Cloud model | N/A | N/A | N/A | Pattern generation |
| Zhang, | Image | Stack GAN | Aesthetic | 4000 | 10 Students | Image generation |
| Erdem, | Image (composition, texture, line) | LSBoost | High/Low | AVA | N/A | MSE: 0.394 |
| Brain, | Image | Ralph’s model of | Score of | Images of | N/A | Visually harmonious |
| Wong, | Image | SVM | High/Low | 3161 | N/A | Accuracy: 78.8% |
| Su, | Image (color, | Adaboost | High/Low | DP | N/A | Accuracy: 92.06% |
| Lovato, | Image | LASSO Regressor | High/Low | Flickr | 200 | Accuracy: 96% |
| Zhang, | Visual graphic | Embedded | High/Low | AVA | N/A | Probabilistic: |
| Tarvainen, | Visual | Extreme Learning | Aesthetic | Movie | 73 Viewers | Prediction deviation ratio: 1.19 |
| Temel, | Image (SIFT, CN, DOG, DOSA) | GIST, GMM, DOG | Aesthetic score | AVA | N/A | SIFT: 75.5% |
| Wu, | Image (structural, local, global visual features) | SVM, SSVM, | Aesthetic | Webpage | N/A | Testing Errors: |
| Lu, | Image (global, local | RDCNN | Aesthetic | AVA | N/A | Accuracy: 75.41% |
| Jin, | Image | DCNNs | Low/High | AVA | N/A | MSE: 97.54% |
| Lee, | Image | DCNNs | Low/High | AVA | N/A | Accuracy: 81.02% |
| Liu, | Deep GSP image | SDAL | Human gaze | Millions | 200,000 Users | Consistency: 93% |
| Wang, | Image | DNN | Low/High | AVA | N/A | AVA- Accuracy: 76.9% |
| Tong, | Geometrical | DCNNs | Pleasant/ | 4240 faces | N/A | Proved that |
| Fu, | Image | DCNNs (VGG-16, | Low/High | AVA | N/A | Accuracy: 90.01% |
| Murray, | Image | SVMs | 60 categories | AVA | Hundreds of | mAP: 53.85% |
| Meng, | Image | MobileNet, VGG, | Aesthetic score | AVA | N/A | Accuracy: 79.38% |
| Sidhu, | Image (HSV, RGB, | Regression Models | Beauty/liking | 480 | 598 | Adjusted R2: 0.13 |
(LSBoost: Least-squares Boosting; BAG: Bagged Tree Ensembles; RF: Random Forests; POI: Place of Interest; SDAL: Semi-supervised Deep Active Learning; DNN: Deep Neural Networks; CNN: Convolutional Neural Networks; BDE: Boundary Displacement Error; mAP: mean Average Precision; Adjusted R2: Adjusted R-squared)
Fig 1. Architecture of the VGG-19 network.
Each plane is a feature map.
Architecture and feature extraction process of VGG-19 for aesthetic-aware modeling.
| Layer name | Layers | Output size |
|---|---|---|
| Conv1 | Conv, 3x3, 64 × 2, Max pool | [224,224,64], [112,112,64] |
| Conv2_x | Conv, 3x3, 128 × 2, Max pool | [112,112,128], [56,56,128] |
| Conv3_x | Conv, 3x3, 256 × 4, Max pool | [56,56,256], [28,28,256] |
| Conv4_x | Conv, 3x3, 512 × 4, Max pool | [28,28,512], [14,14,512] |
| Conv5_x | Conv, 3x3, 512 × 4, Max pool | [14,14,512], [7,7,512] |
| Dense3_x | Flatten, fc × 3, fc × 3 | [25088], [1000], [64] |
| Output | 3d fc, Softmax | [3] |
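
The output sizes in the table above can be reproduced with a short shape trace. This is a minimal sketch, not the authors' code: it assumes a 224 × 224 input, size-preserving 3x3 convolutions (stride 1, padding 1), and 2x2 max pooling with stride 2, none of which the table states explicitly. The names `VGG19_BLOCKS`, `trace_vgg19_shapes`, and `flattened_size` are hypothetical.

```python
# Sketch: trace feature-map shapes through the five VGG-19 convolutional blocks.
# Assumptions (not stated in the table): 3x3 convolutions use stride 1 with
# padding 1, so they keep height and width; each max pool is 2x2 with stride 2.

# (number of 3x3 conv layers, output channels) per block, as listed in the table
VGG19_BLOCKS = [(2, 64), (2, 128), (4, 256), (4, 512), (4, 512)]


def trace_vgg19_shapes(height=224, width=224):
    """Return [(conv_shape, pooled_shape), ...], one pair per block."""
    shapes = []
    for _num_convs, channels in VGG19_BLOCKS:
        conv_shape = (height, width, channels)    # padded convs keep H and W
        height, width = height // 2, width // 2   # 2x2 max pool halves H and W
        shapes.append((conv_shape, (height, width, channels)))
    return shapes


def flattened_size(shape):
    """Number of values after flattening an (H, W, C) feature map."""
    h, w, c = shape
    return h * w * c


if __name__ == "__main__":
    shapes = trace_vgg19_shapes()
    for i, (conv_shape, pooled_shape) in enumerate(shapes, start=1):
        print(f"Conv{i}: {list(conv_shape)} -> {list(pooled_shape)}")
    print("Flatten:", flattened_size(shapes[-1][1]))
```

Under these assumptions the last pooled map is 7 × 7 × 512, which flattens to the 25088 values listed in the Dense3_x row.
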
Architecture and feature extraction process of ResNet-50 for aesthetic-aware modeling.
| Layer name | Layers | Output size |
|---|---|---|
| Conv1 | Conv, 7x7, 64, stride 2; Max pool, 3x3, stride 2 | [112,112,64], [56,56,64] |
| Conv2_x | | [56,56,256] |
| Conv3_x | | [28,28,512] |
| Conv4_x | | [14,14,1024] |
| Conv5_x | | [7,7,2048] |
| Dense3_x | Average Pool, fc × 3, Dropout | [512], [128], [64] |
| Output | 3d fc, Softmax | [3] |
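
The spatial sizes in the ResNet-50 table follow from the standard convolution output-size formula, floor((n + 2p - k) / s) + 1. The sketch below is illustrative only: the padding values (3 for the 7x7 convolution, 1 for 3x3 operations) and the choice to model each stage's downsampling as a single 3x3 stride-2 step are assumptions, since the table lists only kernel sizes, strides, and output sizes. `conv_out` and `trace_resnet50_shapes` are hypothetical names.

```python
# Sketch: reproduce the ResNet-50 output sizes from the output-size formula.
# Assumptions: 224x224 input, padding 3 for the 7x7 conv, padding 1 for 3x3 ops,
# and one stride-2 step at the start of Conv3_x, Conv4_x and Conv5_x.


def conv_out(size, kernel, stride, padding):
    """Output length of a convolution or pooling along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_resnet50_shapes(size=224):
    """Return a dict mapping stage name to its (H, W, C) output shape."""
    shapes = {}
    size = conv_out(size, kernel=7, stride=2, padding=3)
    shapes["Conv1"] = (size, size, 64)
    size = conv_out(size, kernel=3, stride=2, padding=1)
    shapes["Max pool"] = (size, size, 64)
    stages = [
        ("Conv2_x", 256, 1),   # no further downsampling after the max pool
        ("Conv3_x", 512, 2),
        ("Conv4_x", 1024, 2),
        ("Conv5_x", 2048, 2),
    ]
    for name, channels, stride in stages:
        size = conv_out(size, kernel=3, stride=stride, padding=1)
        shapes[name] = (size, size, channels)
    return shapes


if __name__ == "__main__":
    for name, shape in trace_resnet50_shapes().items():
        print(f"{name}: {list(shape)}")
```

With these settings the trace ends at a 7 × 7 × 2048 feature map, matching the Conv5_x row before average pooling.
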
Fig 2. Architecture of the ResNet-50 network.
Each plane is a feature map.
Fig 3. Experimental procedure for aesthetic-aware modeling, based on design award datasets.
Evaluation items for the Electronic Home Applicants Design Awards and the Electronic Tools Design Awards.
| No. | Items | Description |
|---|---|---|
| 1 | Innovation | Innovative appearance and functions: novelty of the design, novelty of appearance, incorporation of new technology or new materials, and suitability as part of a new way of life. |
| 2 | Feasibility | Market value and feasibility: design concept is suitable for mass production at reasonable cost. |
| 3 | Environmental aspect | The design uses material that is not environmentally damaging and conserves energy in its use. |
| 4 | Harmonious color design | Harmonious selection and combination of colors in the product design. |
| 5 | Layout presentation quality | The design layout conforms to the aesthetic requirements in color and structure. |
| 6 | Comprehensive evaluation | An overall impression score, which gives the jurors a chance to express their own preferences and to encourage products that they consider interesting and promising, despite shortcomings in certain aspects. |
Fig 4. (a) Layouts of the top 20 design works from the Electronic Tools Design Awards; (b) layouts of 20 eliminated design works from the same award, for comparison. Only part of each layout is shown because of copyright protection.
Aesthetic-aware classification accuracy of the dataset from the Electronic Home Applicants Design Awards.
| Algor. | ACC | Parameters |
|---|---|---|
| Liblinear | 57.01% | L2-regularized L2-loss |
| LibSVM | 59.01% | C = 1, |
| RBFNetworks | 63.06% | minStdDev 0.1, |
| RSS- | 59.97% | RSS: |
| VGG-19 | 70.03% | lr = 0.001, |
| | | lr = 0.001, |
Aesthetic-aware modeling verification on the Electronic Tools Design Awards dataset.
| Algor. | ACC | Parameters |
|---|---|---|
| Liblinear | 59.57% | L2-regularized L2-loss |
| LibSVM | 59.31% | C = 2, |
| RBFNetworks | 61.19% | minStdDev 0.1, |
| RSS- | 59.31% | RSS:subSpaceSize 0.5, |
| VGG-19 | 68.36% | lr = 0.001, |
| | | lr = 0.001, |
Best features selected by CfsSubsetEvaluation with the BestFirst search method.
| Dataset | Type | Number of features |
|---|---|---|
| | HSV | 10 |
| | HIST | 1 |
| | LBP | 5 |
| | HSV | 8 |
| | HIST | 17 |
| | LBP | 1 |
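
Correlation-based feature subset selection (CFS) scores a subset of k features by k·r_cf / sqrt(k + k(k-1)·r_ff), where r_cf is the mean feature-to-class correlation and r_ff is the mean feature-to-feature correlation. The sketch below illustrates that merit on toy data. It is not the procedure used in the study: it assumes absolute Pearson correlation as the correlation measure and uses a plain greedy forward search, whereas a best-first search can also backtrack. All function names and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def cfs_merit(features, target, subset):
    """CFS merit of `subset` (indices into `features`, a list of feature columns)."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(features[j], target)) for j in subset) / k
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = (sum(abs(pearson(features[a], features[b])) for a, b in pairs) / len(pairs)
            if pairs else 0.0)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(features, target):
    """Greedy forward search: add the best feature while the merit strictly improves."""
    selected, best = [], 0.0
    remaining = list(range(len(features)))
    while remaining:
        j = max(remaining, key=lambda c: cfs_merit(features, target, selected + [c]))
        merit = cfs_merit(features, target, selected + [j])
        if merit <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = merit
    return selected, best


if __name__ == "__main__":
    target = [0, 0, 1, 1]
    features = [
        [0, 0, 1, 1],  # informative
        [1, 1, 3, 3],  # redundant: perfectly correlated with feature 0
        [0, 1, 0, 1],  # uncorrelated with the target
    ]
    print(greedy_cfs(features, target))
```

In this toy case only the first feature is kept: the redundant copy leaves the merit unchanged and the uncorrelated feature lowers it, which is the behavior the merit formula is designed to produce.
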
Aesthetic-aware classification accuracy comparison using VGG-19 and ResNet-50.
| Dataset | Class | VGG-19 | ResNet-50 |
|---|---|---|---|
| | Eliminated | 61.78% | 66.67% |
| | Nominees | 93.33% | 93.70% |
| | Eliminated | 67.03% | 75.95% |
| | Nominees | 83.59% | 84.19% |
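
Per-class figures like those in the table above can be computed from true and predicted labels as follows. This is a minimal sketch that assumes each entry is the fraction of samples of a class that were classified correctly; the source does not spell out the definition. The function name and the example labels are hypothetical and are not data from the study.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly classified samples for each true class."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / total[label] for label in total}


if __name__ == "__main__":
    # Hypothetical labels for illustration only.
    y_true = ["Nominees"] * 4 + ["Eliminated"] * 2
    y_pred = ["Nominees", "Nominees", "Nominees", "Eliminated",
              "Eliminated", "Nominees"]
    for label, acc in per_class_accuracy(y_true, y_pred).items():
        print(f"{label}: {acc:.2%}")
```

Reporting accuracy per class rather than a single overall number makes it visible when one class, here the eliminated works, is recognized less reliably than the other.
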
Fig 5. Loss during the ResNet-50 training process for the Electronic Home Applicants Design Awards dataset.
(Fig (a): Nominees classification accuracy; Fig (b): Eliminated classification accuracy).
Fig 6. Loss during the ResNet-50 training process for the Electronic Tools Design Awards dataset.
(Fig (a): Nominees classification accuracy; Fig (b): Eliminated classification accuracy).