Effat Jalaeian Zaferani¹, Mohammad Teshnehlab¹, Amirreza Khodadadian², Clemens Heitzinger³,⁴, Mansour Vali¹, Nima Noii⁵, Thomas Wick².
Abstract
In this work, we propose a method for automatic hyper-parameter tuning of the stacked asymmetric auto-encoder. Previous work showed that deep learning can extract personality perception from speech, but the hyper-parameters were tuned by trial and error, which is time-consuming and requires machine learning expertise. Obtaining suitable hyper-parameter values is therefore challenging and limits the use of deep learning. To address this challenge, researchers have applied optimization methods. Despite some successes, the large number of deep learning hyper-parameters makes the search space very large, which increases the probability of getting stuck in local optima. Researchers have therefore also focused on improving global optimization methods. In this regard, we propose a novel global optimization method based on the cultural algorithm, the multi-island approach, and the concept of parallelism to search this large space efficiently. We first evaluated our method on three well-known optimization benchmarks and compared the results with recently published papers. The results indicate that the proposed method converges faster because it can escape from local optima, and that the precision of the results improves dramatically. We then applied our method to optimize five hyper-parameters of an asymmetric auto-encoder for automatic personality perception. Since inappropriate hyper-parameters lead the network to over-fitting or under-fitting, we used a novel cost function to prevent both. The unweighted average recall (accuracy) improved by 6.52% (9.54%) compared to our previous work, and the results compare favorably with other published personality perception works.
Keywords: big five personality traits; cultural algorithm; deep learning; hyper-parameter optimization; personality perception
Year: 2022 PMID: 36015967 PMCID: PMC9413006 DOI: 10.3390/s22166206
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
The 130 LLD features, including 65 LLD and 65 ΔLLD features [46].
| LLD | Group |
|---|---|
| **4 Energy Related LLD** | |
| Sum of Auditory Spectrum (Loudness) | Prosodic |
| Sum of RASTA-Style Filtered Auditory Spectrum | Prosodic |
| RMS Energy, Zero-Crossing Rate | Prosodic |
| **55 Spectral LLD** | |
| RASTA-Style Auditory Spectrum, Bands 1–26 (0–8 kHz) | Spectral |
| MFCC 1–14 | Cepstral |
| Spectral Energy 250–650 Hz, 1 k–4 kHz | Spectral |
| Spectral Roll Off Point 0.25, 0.50, 0.75, 0.90 | Spectral |
| Spectral Flux, Centroid, Entropy, Slope, Harmonicity | Spectral |
| Spectral Psychoacoustic Sharpness | Spectral |
| Spectral Variance, Skewness, Kurtosis | Spectral |
| **6 Voicing Related LLD** | |
| F0 (SHS & Viterbi Smoothing) | Prosodic |
| Probability of Voicing | Sound Quality |
| Log. HNR, Jitter (Local, Delta), Shimmer (Local) | Sound Quality |
Figure 1. Flowchart of the MIC algorithm.
Figure 2. Schematic of the asymmetric auto-encoder [24].
Figure 3. Flowchart of SAAE hyper-parameter optimization.
Figure 4. Converting the auto-encoder to two RBMs for tuning the initial weights of the encoder and decoder layers.
Description of Three Benchmark Functions.
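| Name | Formula | Range | |
|---|---|---|---|
| Rastrigin | $f(x) = 10n + \sum_{i=1}^{n} \left[ x_i^2 - 10\cos(2\pi x_i) \right]$ | | |
| Ackley | $f(x) = -20\exp\left(-0.2\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}\right) - \exp\left(\frac{1}{n}\sum_{i=1}^{n}\cos(2\pi x_i)\right) + 20 + e$ | | |
| Griewank | $f(x) = \frac{1}{4000}\sum_{i=1}^{n} x_i^2 - \prod_{i=1}^{n}\cos\left(\frac{x_i}{\sqrt{i}}\right) + 1$ | | 0 |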
Figure 5. Benchmark functions: (A) Rastrigin, (B) Ackley, and (C) Griewank.
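The three benchmarks can be sketched in Python as follows. This is a minimal illustration that assumes the standard textbook definitions of the Rastrigin, Ackley, and Griewank functions; the function names and the Ackley constants (20, 0.2, 2π) are assumptions for illustration, and the authors' exact variants and search ranges may differ.

```python
import math
from typing import Sequence


def rastrigin(x: Sequence[float]) -> float:
    """Standard Rastrigin function (assumed textbook form)."""
    return 10.0 * len(x) + sum(
        xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) for xi in x
    )


def ackley(x: Sequence[float]) -> float:
    """Standard Ackley function with the usual constants a=20, b=0.2, c=2*pi (assumed)."""
    n = len(x)
    mean_sq = sum(xi * xi for xi in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * xi) for xi in x) / n
    return (
        -20.0 * math.exp(-0.2 * math.sqrt(mean_sq))
        - math.exp(mean_cos)
        + 20.0
        + math.e
    )


def griewank(x: Sequence[float]) -> float:
    """Standard Griewank function (assumed textbook form)."""
    sum_term = sum(xi * xi for xi in x) / 4000.0
    prod_term = math.prod(
        math.cos(xi / math.sqrt(i)) for i, xi in enumerate(x, start=1)
    )
    return sum_term - prod_term + 1.0


# Evaluate each benchmark at the origin in 10 and 30 dimensions.
for dim in (10, 30):
    origin = [0.0] * dim
    print(dim, rastrigin(origin), ackley(origin), griewank(origin))
```

All three functions have a global minimum of 0 at the origin. In floating-point arithmetic the Ackley value at the origin is a tiny nonzero number rather than exactly 0, because of rounding in the `exp` and `e` terms.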
The Results of MIC Compared with Traditional GA, DE, PSO, and ES in Three Benchmark Functions (10D and 30D).
| Benchmark Function | Optimization Algorithm | AvI (10D) | AvP (10D) | SI (10D) | BOP (10D) | SD (10D) | SR % (10D) | AvI (30D) | AvP (30D) | SI (30D) | BOP (30D) | SD (30D) | SR % (30D) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Rastrigin | MIC by LM | 83.4 | 6.7 × 10⁻⁵ | 61 | 7.2 × 10⁻⁵ | 4.7 × 10⁻⁵ | 100 | 307.1 | 7.8 × 10⁻⁴ | 254 | 4.8 × 10⁻⁴ | 2.4 × 10⁻⁴ | 100 |
| Rastrigin | MIC by EM | 120.5 | 7.1 × 10⁻⁵ | 101 | 8.8 × 10⁻⁶ | 5.8 × 10⁻⁵ | 100 | 321.5 | 7.4 × 10⁻⁴ | 187 | 1.8 × 10⁻⁴ | 5.6 × 10⁻⁴ | 100 |
| Rastrigin | MIC by MM | 572.5 | 9.9 × 10⁻³ | 131 | 4.9 × 10⁻³ | 4.3 × 10⁻³ | 100 | 2000 | 0.46 | 2000 | 6.1 × 10⁻⁴ | 1.32 | 60 |
| Rastrigin | GA | 617.1 | 1.2 × 10⁻⁵ | 324 | 6.8 × 10⁻⁴ | 4.6 × 10⁻³ | 40 | 1178.4 | 1.34 | 926 | 6.8 × 10⁻⁴ | 3.20 | 50 |
| Rastrigin | DE | 1000 | 0.22 | 1000 | 0.14 | 0.47 | 0 | 2000 | 4.72 | 2000 | 0.10 | 2.55 | 0 |
| Rastrigin | ES | 1000 | 3.48 | 1000 | 1.94 | 2.39 | 0 | 2000 | 37.4 | 2000 | 24.7 | 14.4 | 0 |
| Rastrigin | PSO | 985.7 | 0.82 | 857 | 6.8 × 10⁻⁴ | 0.71 | 20 | 2000 | 10.9 | 2000 | 3.13 | 5.60 | 0 |
| Ackley | MIC by LM | 467.2 | 4.4 × 10⁻¹⁵ | 355 | 4.4 × 10⁻¹⁵ | 0 | 100 | 1039.6 | 4.4 × 10⁻¹⁵ | 956 | 4.4 × 10⁻¹⁵ | 0 | 100 |
| Ackley | MIC by EM | 788.4 | 4.4 × 10⁻¹⁵ | 462 | 4.4 × 10⁻¹⁵ | 0 | 100 | 1154.8 | 3.1 × 10⁻¹⁴ | 937 | 4.4 × 10⁻¹⁵ | 1.9 × 10⁻¹⁵ | 100 |
| Ackley | MIC by MM | 725.2 | 1.4 × 10⁻⁹ | 324 | 3.5 × 10⁻¹⁰ | 9.2 × 10⁻¹⁰ | 100 | 2000 | 2.48 | 2000 | 2.24 | 0.20 | 0 |
| Ackley | GA | 957.5 | 7.3 × 10⁻² | 565 | 7.2 × 10⁻³ | 1.58 | 70 | 1895.1 | 2.86 | 951 | 0.01 | 1.53 | 10 |
| Ackley | DE | 1000 | 1.69 | 1000 | 1.24 | 4.47 | 0 | 2000 | 5.35 | 2000 | 4.34 | 0.67 | 0 |
| Ackley | ES | 1000 | 4.96 | 1000 | 3.20 | 3.09 | 0 | 2000 | 5.43 | 2000 | 5.23 | 1.9 × 10⁻¹ | 0 |
| Ackley | PSO | 557.5 | 8.6 × 10⁻⁴ | 344 | 6.2 × 10⁻⁴ | 4.1 × 10⁻⁴ | 100 | 839.5 | 4.5 × 10⁻³ | 162 | 8.8 × 10⁻⁴ | 7.9 × 10⁻³ | 100 |
| Griewank | MIC by LM | 154.7 | 6.2 × 10⁻¹⁴ | 38 | 1.2 × 10⁻¹⁴ | 2.8 × 10⁻¹⁴ | 100 | 106 | 8.4 × 10⁻¹⁴ | 92 | 8.7 × 10⁻¹⁴ | 4.4 × 10⁻¹⁴ | 100 |
| Griewank | MIC by EM | 171.2 | 8.4 × 10⁻¹⁴ | 43 | 6.4 × 10⁻¹⁴ | 1.5 × 10⁻¹⁴ | 100 | 489 | 1.1 × 10⁻¹³ | 94 | 9.1 × 10⁻¹⁴ | 2.5 × 10⁻¹⁴ | 100 |
| Griewank | MIC by MM | 775.4 | 3.3 × 10⁻¹³ | 146 | 9.6 × 10⁻¹⁴ | 3.2 × 10⁻¹³ | 100 | 2000 | 0.27 | 2000 | 9.1 × 10⁻¹³ | 0.16 | 20 |
| Griewank | GA | 909.8 | 0.09 | 84 | 0.8 × 10⁻³ | 0.18 | 10 | 2000 | 0.21 | 2000 | 0.09 | 1.1 × 10⁻¹ | 0 |
| Griewank | DE | 337.4 | 9.1 × 10⁻³ | 44 | 7.3 × 10⁻³ | 1.1 × 10⁻³ | 100 | 993 | 0.01 | 588 | 7.9 × 10⁻³ | 8.8 × 10⁻¹ | 70 |
| Griewank | ES | 1000 | 0.36 | 1000 | 1.2 × 10⁻¹ | 0.20 | 0 | 2000 | 0.81 | 2000 | 0.76 | 0.19 | 0 |
| Griewank | PSO | 555.2 | 0.02 | 258 | 6.6 × 10⁻² | 2.8 × 10⁻² | 30 | 2000 | 0.77 | 2000 | 0.37 | 3.1 × 10⁻² | 0 |
Comparison with Other Published Methods in 30D. N/A means not available.
| Methods | Rastrigin AvP | Rastrigin SD | Rastrigin SR % | Ackley AvP | Ackley SD | Ackley SR % | Griewank AvP | Griewank SD | Griewank SR % |
|---|---|---|---|---|---|---|---|---|---|
| Xin Zhao et al., 2022 | 2.1 × 10⁻¹³ | 4.1 × 10⁻¹⁴ | 100 | 8.2 × 10⁻¹⁵ | 1.3 × 10⁻¹⁵ | 100 | 3.78 × 10⁻¹³ | 1.7 × 10⁻¹³ | 100 |
| Chentoufi et al., 2021 | 0.99 | 1.31 | 100 | 1.0 × 10⁻¹⁵ | 6.4 × 10⁻¹⁶ | 43 | 8.3 × 10⁻⁴ | 5.4 × 10⁻⁴ | 67 |
| MIC_LM | 7.8 × 10⁻⁴ | 2.4 × 10⁻⁴ | 100 | 4.4 × 10⁻¹⁵ | 0 | 100 | 8.4 × 10⁻¹⁴ | 4.4 × 10⁻¹⁴ | 100 |
| MIC_EM | 7.4 × 10⁻⁴ | 5.6 × 10⁻⁴ | 100 | 3.1 × 10⁻¹⁴ | 1.9 × 10⁻¹⁵ | 100 | 1.1 × 10⁻¹³ | 2.5 × 10⁻¹⁴ | 100 |
| MIC_MM | 0.46 | 1.32 | 60 | 2.48 | 0.20 | 0 | 0.27 | 0.16 | 20 |
Comparison Results of Our Proposed Method with Other Works in the SPC Dataset in Terms of UA Recall % (Accuracy %).
| Methods | Neu. | Ext. | Ope. | Agr. | Con. |
|---|---|---|---|---|---|
| Mohammadi et al., 2010 | N/A (63) | N/A (76.3) | N/A (57.9) | N/A (63) | N/A (72) |
| Mohammadi et al., 2012 | N/A (65.9) | N/A (73.5) | N/A (60.1) | N/A (63.1) | N/A (71.3) |
| Chastagnol et al., 2012 | 58 (N/A) | 75.5 (N/A) | 73.4 (N/A) | 65 (N/A) | 62.2 (N/A) |
| Mohammadi et al., 2015 | N/A (66.1) | N/A (71.4) | N/A (58.6) | N/A (58.8) | N/A (72.5) |
| Solera-Urena et al., 2017 | 65.1 (64.7) | 75 (75.1) | 59.1 (58.2) | 60.3 (60.2) | 75.7 (75.6) |
| Carbonneau et al., 2017 | 70.8 (N/A) | 75.2 (N/A) | 56.3 (N/A) | 64.9 (N/A) | 63.8 (N/A) |
| Zhen-Tao Liu et al., 2020 | N/A (69.2) | N/A (76.3) | N/A (74.7) | N/A (65.3) | N/A (73.3) |
| Our previous work, 2021 | 77.1 (76.9) | 76.6 (72.9) | 81.2 (70.4) | 80.7 (68.7) | 78.5 (69.5) |
| Proposed method | 89.8 (80.5) | 82.2 (83.4) | 87.1 (84.7) | 85.8 (76.2) | 81.8 (72.6) |
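To make the two reported metrics concrete, the following Python sketch contrasts accuracy with unweighted average (UA) recall on a toy imbalanced example, then averages the UA recall values of the last two rows of the table above over the five traits. The function names and the toy labels are hypothetical. Reading the reported 6.52% UA recall improvement as a difference of trait-averaged values is an assumption inferred from the table, not something the source states.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def unweighted_average_recall(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable]
) -> float:
    """Mean of per-class recalls, with every class weighted equally."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy example (hypothetical labels): a classifier that always predicts the
# majority class looks good on accuracy but poor on UA recall.
toy_true = [1] * 8 + [0] * 2
toy_pred = [1] * 10
print(accuracy(toy_true, toy_pred))                   # 0.8
print(unweighted_average_recall(toy_true, toy_pred))  # 0.5

# UA recall (%) per trait (Neu., Ext., Ope., Agr., Con.) from the table.
previous_ua = [77.1, 76.6, 81.2, 80.7, 78.5]
proposed_ua = [89.8, 82.2, 87.1, 85.8, 81.8]

# Assumed interpretation: improvement = difference of trait-averaged UA recall.
ua_gain = sum(proposed_ua) / len(proposed_ua) - sum(previous_ua) / len(previous_ua)
print(round(ua_gain, 2))  # 6.52
```

Under this interpretation, the trait-averaged UA recall rises from 78.82% to 85.34%, a gain of 6.52 percentage points.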