Harin Lee, Daniel Müllensiefen.
Abstract
To date, tests that measure individual differences in the ability to perceive musical timbre are scarce in the published literature. The lack of such a tool limits research on how timbre, a primary attribute of sound, is perceived and processed among individuals. The current paper describes the development of the Timbre Perception Test (TPT), in which participants use a slider to reproduce heard auditory stimuli that vary along three important dimensions of timbre: envelope, spectral flux, and spectral centroid. With a sample of 95 participants, the TPT was calibrated, validated against measures of related abilities, and examined for its reliability. The results indicate that a short version (8 minutes) of the TPT has good explanatory support from a factor analysis model, acceptable internal reliability (α = .69, ωt = .70), good test-retest reliability (r = .79), and substantial correlations with self-reported general musical sophistication (ρ = .63) and pitch discrimination (ρ = .56), as well as somewhat lower correlations with duration discrimination (ρ = .27) and musical instrument discrimination abilities (ρ = .33). Overall, the TPT represents a robust tool to measure an individual's timbre perception ability. Furthermore, the use of sliders to perform a reproductive task has been shown to be an effective approach in threshold testing. The current version of the TPT is openly available for research purposes.
Keywords: Gold-MSI; Musical abilities; Musical assessment; Psychoacoustics; Timbre perception
Year: 2020 PMID: 32529570 PMCID: PMC7536169 DOI: 10.3758/s13414-020-02058-3
Source DB: PubMed Journal: Atten Percept Psychophys ISSN: 1943-3921 Impact factor: 2.199
Fig. 1 The layout of the TPT (left) and its testing dimensions (right). The graphics for the testing dimensions show how the reproduction tone is manipulated when the slider is positioned at ‘0’ (far left) or at ‘100’ (far right). Envelope represents the rise and fall time in amplitude; Spectral Flux represents the alignment of harmonics, which sound more consonant when aligned in phase; Spectral Centroid represents the filtered frequency area in the frequency spectrum. (Colour figure online)
Parameters of the three subtasks of TPT with theoretical slider range from 0 to 100
| | Envelope: Attack (ms) | Envelope: Decay (ms) | Spectral Flux: 4th harmonic | Spectral Flux: 5th harmonic | Spectral Flux: 6th harmonic | Spectral Flux: 7th harmonic | Spectral Centroid (Hz) |
|---|---|---|---|---|---|---|---|
| Slider range (0–100) | 5–291 | 50–5 | 3.0–3.3 | 4.0–3.7 | 5.0–5.2 | 6.0–5.8 | 600–1000 |
| Link function | Log base of 1.03 | Log base of 1.03 | Linear | Linear | Linear | Linear | X² |

Note. Attack and decay, the 4th & 5th harmonics, and the 6th & 7th harmonics each form a pair with an inversely proportional relationship. X = slider value/100. ‘Link function’ describes the relationship between the physical parameters of the sounds and the slider scale of 0–100
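The link functions in the table can be illustrated with a short sketch. The formulas below are assumptions inferred from the tabulated ranges, not the authors' published implementation: the Spectral Centroid is taken as 600 + 400·X² Hz (which reproduces the 700 Hz reference at slider position 50), and the Spectral Flux harmonics are taken as linear interpolations between their tabulated endpoints. The Envelope mapping is omitted because its exact logarithmic form cannot be recovered from the table alone. All function names are hypothetical.

```python
def slider_to_centroid_hz(slider):
    """Assumed quadratic link: 600 Hz at slider 0, 1000 Hz at slider 100."""
    x = slider / 100.0  # X = slider value / 100, as in the table note
    return 600.0 + 400.0 * x ** 2


def slider_to_flux_ratios(slider):
    """Assumed linear link between the tabulated endpoints of each harmonic.

    Returns the frequency ratios of the 4th, 5th, 6th and 7th harmonics.
    Paired harmonics move in opposite directions.
    """
    x = slider / 100.0
    return (3.0 + 0.3 * x, 4.0 - 0.3 * x, 5.0 + 0.2 * x, 6.0 - 0.2 * x)


def slider_to_flux_beta(slider):
    """Mean absolute deviation of the four ratios from their integer values."""
    integers = (3.0, 4.0, 5.0, 6.0)
    ratios = slider_to_flux_ratios(slider)
    return sum(abs(r - i) for r, i in zip(ratios, integers)) / 4.0


if __name__ == "__main__":
    print(round(slider_to_centroid_hz(50)))   # reference frequency check
    print(round(slider_to_flux_beta(100), 3)) # upper end of the flux range
```

Under these assumptions the mean deviation at slider position 100 is (0.3 + 0.3 + 0.2 + 0.2)/4 = 0.25, which agrees with the 0–0.25 range quoted for the Spectral Flux subtask.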
Fig. 2 Spearman’s correlations between six score variables of the TPT. The size of blue circles represents the magnitude of the correlations, and crossed circles represent statistically nonsignificant pairs at a threshold of p = .05. Mat = matching variant of subtask; Mem = memory variant of subtask. (Colour figure online)
Spearman’s correlations of the TPT with the convergent validity measures
| | Gold-MSI: G0 | Gold-MSI: G1 | Gold-MSI: G2 | Gold-MSI: G3 | Gold-MSI: G4 | Gold-MSI: G5 | PSYCHOACOUSTICS¹: Pitch | PSYCHOACOUSTICS¹: Duration | PSYCHOACOUSTICS¹: Profile | PROMS |
|---|---|---|---|---|---|---|---|---|---|---|
| Match Envelope | .43*** | .38*** | .46*** | .40*** | .33** | .47*** | .41*** | .19 | .00 | .13 |
| Memory Envelope | .49*** | .28** | .45*** | .31** | .26* | .44*** | .38*** | .19 | .06 | .18 |
| Match Flux | .39*** | .42*** | .51*** | .33** | .36*** | .48*** | .43*** | .11 | .16 | .26* |
| Memory Flux | .23* | .36*** | .39*** | .14 | .20 | .34** | .28* | .08 | .08 | .13 |
| Match Centroid | .30** | .42*** | .36*** | .34** | .26** | .37* | .40*** | .27* | .07 | .33** |
| Memory Centroid | .20 | .26* | .27** | .11 | .12 | .27** | .22* | .28* | .07 | .19 |
| Match Total | .50*** | .54*** | .61*** | .50*** | .42*** | .60*** | .54*** | .28* | .14 | .36** |
| Memory Total | .50*** | .47*** | .59*** | .30** | .30** | .56*** | .49*** | .22* | .12 | .25* |
| Overall Score | .52*** | .56*** | .64*** | .45*** | .40*** | .62*** | .56*** | .27* | .15 | .33** |

Note. G0 = Active Engagement; G1 = Perceptual Abilities; G2 = Musical Training; G3 = Singing Abilities; G4 = Emotions; G5 = General Sophistication
¹Thresholds of the tests from PSYCHOACOUSTICS were calculated by taking the average of blocks converted into log values
*p < .05. **p < .01. ***p < .001. Significance levels are adjusted according to Benjamini and Hochberg (1995)
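The significance levels in the table above are adjusted following Benjamini and Hochberg (1995). As a minimal sketch of that step-up procedure (not the authors' analysis code; the function name and the example p-values are hypothetical), adjusted p-values can be computed as follows:

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    For the p-value of rank k (1 = smallest) among m tests, the raw
    adjustment is p * m / k. Walking from the largest rank down and
    keeping a running minimum enforces monotonicity.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values for illustration only.
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"raw p = {p:.3f}, adjusted p = {q:.3f}")
```

A correlation would then be starred according to where its adjusted value falls relative to .05, .01, and .001.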
Fig. 3 (a) Trial-by-trial correlations between number of TPT match trials and Gold-MSI General Sophistication (G5). (b) Trial-by-trial correlations between number of TPT memory trials and Gold-MSI General Sophistication (G5). Note. X symbol represents significance at p < .05. (Colour figure online)
Absolute distance and corresponding acoustical thresholds of testing dimensions of TPT
| | Match: Envelope | Match: Spectral Flux | Match: Spectral Centroid | Memory: Envelope | Memory: Spectral Flux | Memory: Spectral Centroid |
|---|---|---|---|---|---|---|
| Mean abs slider distance (SD) | 9.65 (10.23) | 17.93 (16.98) | 20.02 (15.64) | 14.68 (13.09) | 23.24 (17.54) | 23.69 (17.87) |
| Mean acoustical threshold (SD) | 14.41 ms (18.28 ms) | 0.0500 β ¹ (0.0391 β) | 68.47 Hz ² (63.89 Hz) | 18.81 ms (22.31 ms) | 0.0581 β (0.0439 β) | 95.24 Hz (74.53 Hz) |

¹β = arithmetic mean deviation in ratio of four harmonics from their original whole number integer. Absolute distance is calculated as |target value − position of slider| with a theoretical slider range of 0–100
²Reference frequency is 700 Hz at slider position = 50
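The absolute distance defined in the note above, |target value − position of slider|, is straightforward to compute per trial and then summarise. The sketch below is illustrative only: the function names and the response values are hypothetical, and the targets are simply the slider targets listed for one subtask later in the document.

```python
from statistics import mean, stdev


def absolute_slider_distance(target, response):
    """Absolute distance between target and response on the 0-100 slider."""
    return abs(target - response)


def summarise_distances(targets, responses):
    """Return (mean, sample SD) of the per-trial absolute slider distances."""
    distances = [
        absolute_slider_distance(t, r) for t, r in zip(targets, responses)
    ]
    return mean(distances), stdev(distances)


if __name__ == "__main__":
    targets = [77, 42, 94, 60, 6]    # slider targets of five items
    responses = [70, 50, 90, 61, 20] # hypothetical participant responses
    m, sd = summarise_distances(targets, responses)
    print(f"mean abs distance = {m:.2f}, SD = {sd:.2f}")
```

Smaller distances indicate more accurate reproduction of the target sound.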
Descriptive statistics of the TPT scores, thresholds of three tests from PSYCHOACOUSTICS, PROMS (Timbre) score, and Gold-MSI
| | Mean | SD | Median | Min. | Max. | N |
|---|---|---|---|---|---|---|
| Envelope (match) | 3.668 | 1.090 | 3.750 | 1.000 | 6.000 | 95 |
| Envelope (memory) | 3.663 | 0.800 | 3.750 | 1.700 | 5.300 | 95 |
| Spectral Flux (match) | 3.628 | 0.940 | 3.500 | 1.500 | 5.750 | 95 |
| Spectral Flux (memory) | 3.611 | 0.619 | 3.650 | 1.500 | 5.100 | 95 |
| Spectral Centroid (match) | 3.632 | 0.973 | 3.600 | 1.200 | 5.400 | 95 |
| Spectral Centroid (memory) | 3.599 | 0.688 | 3.500 | 1.900 | 5.200 | 95 |
| Match Total | 3.642 | 0.792 | 3.692 | 1.692 | 5.385 | 95 |
| Memory Total | 3.624 | 0.443 | 3.650 | 2.533 | 4.567 | 95 |
| Overall Score | 3.629 | 0.501 | 3.640 | 4.567 | 4.814 | 95 |
| Pitch Discrimination (Hz) | 9.68 | 12.80 | 4.41 | 1.01 | 60.76 | 104 |
| Duration Discrimination (ms) | 39.44 | 26.69 | 32.13 | 10.54 | 203.77 | 103 |
| Profile Analysis (dB)¹ | 4.04 | 2.52 | 3.51 | 0.99 | 16.45 | 100 |
| Timbre subtest | 11.32 | 2.17 | 11.00 | 6.00 | 17.00 | 104 |
| Active Engagement (G0) | 42.47 | 9.76 | 44.00 | 17.00 | 61.00 | 104 |
| Perceptual Abilities (G1) | 48.60 | 8.99 | 48.00 | 23.00 | 63.00 | 104 |
| Musical Training (G2) | 26.96 | 12.34 | 27.00 | 7.00 | 49.00 | 104 |
| Singing Abilities (G3) | 34.21 | 4.85 | 34.00 | 23.00 | 42.00 | 104 |
| Emotions (G4) | 31.98 | 8.59 | 32.00 | 11.00 | 48.00 | 104 |
| General Sophistication (G5)² | 82.16 | 21.84 | 84.00 | 36.00 | 123.00 | 104 |

¹Level of increase in sound intensity of the 3rd harmonic
²General Sophistication (G5) is the composite score of all subscales (G0–G4) of the Gold-MSI
Spearman’s correlation between PROMS Timbre subtest, Gold-MSI, and tests from PSYCHOACOUSTICS
| | Gold-MSI: G0 | Gold-MSI: G1 | Gold-MSI: G2 | Gold-MSI: G3 | Gold-MSI: G4 | Gold-MSI: G5 | PSYCHOACOUSTICS: Pitch | PSYCHOACOUSTICS: Duration | PSYCHOACOUSTICS: Profile |
|---|---|---|---|---|---|---|---|---|---|
| PROMS (Timbre) | .30** | .37*** | .38*** | .33** | .35** | .42*** | .31** | .28** | .18 |

G0 = Active Engagement; G1 = Perceptual Abilities; G2 = Musical Training; G3 = Singing Abilities; G4 = Emotions; G5 = General Sophistication. *p < .05, **p < .01, ***p < .001. Significance levels are adjusted according to Benjamini and Hochberg (1995)
Spearman’s correlation between tests from PSYCHOACOUSTICS toolbox and Gold-MSI.
| PSYCHOACOUSTICS test | Gold-MSI: G0 | Gold-MSI: G1 | Gold-MSI: G2 | Gold-MSI: G3 | Gold-MSI: G4 | Gold-MSI: G5 |
|---|---|---|---|---|---|---|
| Pitch | .44*** | .36*** | .56*** | .33** | .19 | .47*** |
| Duration | .31** | .18 | .22* | .36** | .18 | .24* |
| Profile Analysis | .10 | .18 | .28** | .17 | .19 | .24* |

G0 = Active Engagement; G1 = Perceptual Abilities; G2 = Musical Training; G3 = Singing Abilities; G4 = Emotions; G5 = General Sophistication. *p < .05, **p < .01, ***p < .001. Significance levels are adjusted according to Benjamini and Hochberg (1995)
Assigned bin scores by the lower and upper boundaries of six bins
| Parameter manipulated | Target | Bin score¹: 6 | Bin score: 5 | Bin score: 4 | Bin score: 3 | Bin score: 2 | Bin score: 1 |
|---|---|---|---|---|---|---|---|
| Envelope² (slider range of 100 = 20–291 ms) | | | | | | | |
| Slider value | 42 | NA³ | | | | | |
| Attack time (ms) | 52.0 | | | | | | |
| Slider value | 6 | 0–2 | 3–6 | 7–11 | 12–18 | 19–29 | 30–94 |
| Attack time (ms) | 22.8 | 0.0–1.1 | 1.5–3.5 | 4.1–6.8 | 7.6–12 | 13–24 | 25–268 |
| Slider value | 60 | 0–1 | 2 | 3 | 4–6 | 7–11 | 12–60 |
| Attack time (ms) | 92.8 | 0.0–3.6 | 5.0–5.4 | 7.4–8.1 | 9.8–17 | 16–34 | 26–198 |
| Slider value | 24 | 0–2 | 3–5 | 6–8 | 9–11 | 12–18 | 19–76 |
| Attack time (ms) | 35.3 | 0.0–1.8 | 2.6–4.8 | 4.9–8.1 | 7.0–12 | 9.1–21 | 13–256 |
| Slider value | 77 | 0–1 | 2 | 3–4 | 5–9 | 10–19 | 20–77 |
| Attack time (ms) | 150 | 0.0–4.4 | 8.0–9.0 | 12–18 | 20–44 | 37–109 | 65–145 |
| Spectral Flux (slider range of 100 = 0–0.25 of mean change in ratio of four harmonics) | | | | | | | |
| Slider value | 60 | 0–5 | 6–10 | 11–12 | 13–20 | 21–28 | 29–60 |
| β⁴ | 0.15 | 0.000–0.013 | 0.015–0.025 | 0.028–0.030 | 0.033–0.050 | 0.053–0.070 | 0.073–0.150 |
| Slider value | 24 | NA³ | | | | | |
| β | 0.06 | | | | | | |
| Slider value | 77 | 0–5 | 6–11 | 12–15 | 16–23 | 24–33 | 34–77 |
| β | 0.19 | 0.000–0.010 | 0.013–0.030 | 0.028–0.040 | 0.038–0.060 | 0.058–0.080 | 0.083–0.190 |
| Slider value | 42 | 0–3 | 4–7 | 8 | 9–15 | 16–21 | 22–58 |
| β | 0.11 | 0.000–0.013 | 0.015–0.023 | 0.015–0.025 | 0.018–0.043 | 0.035–0.058 | 0.060–0.140 |
| Slider value | 94 | 0–16 | 17–24 | 25–33 | 34–44 | 45–48 | 49–94 |
| β | 0.24 | 0.000–0.040 | 0.043–0.060 | 0.063–0.083 | 0.085–0.110 | 0.113–0.120 | 0.123–0.240 |
| Spectral Centroid (slider range of 100 = 600–1000 Hz) | | | | | | | |
| Slider value | 77 | 0–2 | 3–4 | 5–8 | 9–13 | 14–26 | 27–77 |
| Frequency (Hz) | 837 | 0–14 | 18–25 | 30–52 | 52–87 | 78–109 | 93–237 |
| Slider value | 42 | 0–7 | 8 | 9–12 | 13–19 | 20–31 | 32–58 |
| Frequency (Hz) | 671 | 0–25 | 25–29 | 27–46 | 37–78 | 52–142 | 67–329 |
| Slider value | 94 | 0–2 | 3–5 | 6–7 | 8–11 | 12–19 | 20–94 |
| Frequency (Hz) | 953 | 0–16 | 22–39 | 43–50 | 57–77 | 84–128 | 134–353 |
| Slider value | 60 | 0–4 | 5–7 | 8–10 | 11–13 | 14–22 | 23–60 |
| Frequency (Hz) | 744 | 0–20 | 23–36 | 36–52 | 48–69 | 59–125 | 89–256 |
| Slider value | 6 | 0–14 | 15–25 | 26–38 | 39–44 | 45–51 | 52–94 |
| Frequency (Hz) | 601 | 0–15 | 17–37 | 40–76 | 80–99 | 103–129 | 134–399 |

¹Boundaries of bin categories are represented as absolute distance from the target of slider value (with a theoretical range 0–100) and corresponding acoustic parameter
²Although both log attack and decay were manipulated in the Envelope subtask, only the attack threshold is reported because this was the parameter of interest
³NA, item excluded due to more than 30% of participants not moving the slider
⁴β = arithmetic mean deviation in ratio of four harmonics from their original whole number integers
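The bin table can be read as a lookup from a trial's absolute slider distance to a score between 6 (closest to the target) and 1 (furthest). The following is a minimal sketch of that lookup, assuming integer slider distances. It uses the boundaries tabulated for the Spectral Centroid item with target 77, and the function and constant names are hypothetical rather than taken from the TPT implementation.

```python
from bisect import bisect_left

# Upper boundaries (absolute slider distance) of bins scored 6, 5, 4, 3, 2, 1
# for the Spectral Centroid item with slider target 77, copied from the table.
CENTROID_TARGET_77_UPPER_BOUNDS = [2, 4, 8, 13, 26, 77]


def bin_score(abs_distance, upper_bounds):
    """Map an absolute slider distance to a bin score from 6 down to 1.

    upper_bounds must be sorted ascending, one entry per bin, starting
    with the bin that earns the highest score.
    """
    if abs_distance < 0 or abs_distance > upper_bounds[-1]:
        raise ValueError("distance outside the range covered by the bins")
    # bisect_left returns the index of the first bin whose upper boundary
    # is >= abs_distance, so index 0 earns score 6, index 1 score 5, etc.
    return len(upper_bounds) - bisect_left(upper_bounds, abs_distance)


if __name__ == "__main__":
    for distance in (0, 3, 10, 30):
        score = bin_score(distance, CENTROID_TARGET_77_UPPER_BOUNDS)
        print(f"distance {distance:>2} -> score {score}")
```

Because each item has its own boundaries, a full scorer would hold one such list per item and skip the items marked NA.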