Kongmeng Liew, Yukiko Uchida, Igor de Almeida.
Abstract
BACKGROUND: Preferences for music can be represented through music features. The prevalence of music streaming has allowed music feature information to be consolidated by service providers like Spotify. In this paper, we demonstrate that machine learning classification on cultural market membership (Taiwanese, Japanese, American) by music features reveals variations in popular music across these markets.
Keywords: Culture; Machine Learning; Music; Psychology; Spotify
Year: 2021 PMID: 34435096 PMCID: PMC8356657 DOI: 10.7717/peerj-cs.642
Source DB: PubMed Journal: PeerJ Comput Sci ISSN: 2376-5992
A list of song-level audio features obtained from Spotify for our analyses.
| Audio feature | Description |
|---|---|
| Duration | The duration of the music in milliseconds (ms). |
| Mode | Whether the melody of a track is in a major or minor key. |
| Acousticness | A confidence measure on whether a song is acoustic. |
| Danceability | The suitability of a song for dancing. This is based on several musical features, such as tempo, rhythmic stability, regularity, and beat strength. |
| Energy | A measure of the intensity and activity of a song as perceptually loud, fast, or noisy. This is based on several musical and spectral features, such as dynamic range, loudness, timbre, onset rate, and entropy. |
| Instrumentalness | A confidence measure of whether a song contains no vocals. |
| Liveness | A confidence measure on the presence of audiences in the recording. |
| Loudness | The overall intensity of the song in decibels (dBFS). |
| Speechiness | A confidence measure on the presence of spoken words in a song. |
| Valence | An estimate of whether a song conveys positive or negative affect. |
| Tempo | The estimated main tempo of a song. |
Note:
Audio features refer to the music features as listed on the Spotify API, followed by a brief description of each feature. More information on the features is available at: https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/.
Medians (L/U quantiles) and missing data for musical features (excluding mode), and release year.
| Feature | Chinese: Median (L/U) | Chinese: Missing | Japanese: Median (L/U) | Japanese: Missing | Western: Median (L/U) | Western: Missing |
|---|---|---|---|---|---|---|
| Danceability | 0.56 (0.46/0.66) | 3 | 0.56 (0.45/0.67) | 4 | 0.66 (0.55/0.76) | 1 |
| Energy | 0.46 (0.34/0.63) | 3 | 0.76 (0.51/0.90) | 4 | 0.69 (0.54/0.82) | 0 |
| Loudness | −8.9 (−11.1/−6.9) | 3 | −6.2 (−9.3/−4.3) | 4 | −6.8 (−8.9/−5.2) | 0 |
| Speechiness | 0.04 (0.03/0.05) | 3 | 0.05 (0.04/0.09) | 4 | 0.08 (0.04/0.24) | 0 |
| Acousticness | 0.54 (0.21/0.76) | 3 | 0.11 (0.01/0.49) | 4 | 0.09 (0.02/0.29) | 1 |
| Instrumentalness | 1.4E−6 (0.00/1.1E−4) | 3 | 1.0E−4 (0.00/0.36) | 4 | 0.00 (0.00/0.0006) | 1 |
| Liveness | 0.13 (0.10/0.21) | 3 | 0.14 (0.10/0.29) | 4 | 0.14 (0.10/0.29) | 1 |
| Valence | 0.37 (0.24/0.57) | 3 | 0.52 (0.31/0.71) | 4 | 0.51 (0.33/0.69) | 1 |
| Tempo | 122.7 (100.0/138.1) | 3 | 123.7 (99.0/144.0) | 4 | 119.7 (96.6/136.2) | 0 |
| Duration (ms) | 240,213 (205,586/272,586) | 3 | 236,840 (187,266/281,573) | 4 | 215,640 (184,727/250,693) | 0 |
| Release year | 2011 (2005/2015) | 148 | 2015 (2010/2018) | 100 | 2016 (2012/2018) | 0 |
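Summary statistics of this kind can be computed per feature and market with standard-library Python. The sketch below assumes that "L/U quantiles" refers to the lower and upper quartiles and that tracks with missing values are excluded before summarising. The `summarize` helper and the sample scores are hypothetical and are not taken from the study's data.

```python
from statistics import median, quantiles

def summarize(values):
    """Return (median, lower quartile, upper quartile, n_missing) for one feature."""
    present = [v for v in values if v is not None]
    n_missing = len(values) - len(present)
    # quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = quantiles(present, n=4, method="inclusive")
    return median(present), q1, q3, n_missing

# Hypothetical danceability scores; None marks a track without feature data.
danceability = [0.46, 0.56, None, 0.66, 0.50, 0.60]
med, lower, upper, missing = summarize(danceability)
print(f"{med:.2f} ({lower:.2f}/{upper:.2f}), missing = {missing}")
```

Different quartile conventions (for example `method="exclusive"`) give slightly different cut points on small samples, so the choice should match whatever software produced the original table.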
Figure 1: Number of songs (in our data) by year for Japanese (JP), English (US) and Chinese (ZH) medium songs.
Comparison of feature importance between the GBDT and MLP multiclass models.
| Feature | RFI (GBDT) | PFI (MLP) |
|---|---|---|
| Speechiness | 24.3 | 1.26 |
| Loudness | 7.4 | 1.15 |
| Instrumentalness | 15.6 | 1.19 |
| Acousticness | 16.5 | 1.14 |
| Energy | 17.5 | 1.12 |
| Mode name | 0.1 | 1.00 |
| Duration | 8.4 | 1.12 |
| Danceability | 5.9 | 1.06 |
| Valence | 1.8 | 1.03 |
| Tempo | 2.2 | 1.01 |
| Liveness | 0.2 | 1.00 |
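Because the two measures are on different scales, one simple way to compare them is by rank rather than by raw value. The sketch below transcribes the values from the table above and lists the three top-ranked features under each measure; the `top_k` helper is a hypothetical name introduced only for illustration.

```python
# Values transcribed from the table above: feature -> (RFI from GBDT, PFI from MLP).
importance = {
    "Speechiness": (24.3, 1.26),
    "Loudness": (7.4, 1.15),
    "Instrumentalness": (15.6, 1.19),
    "Acousticness": (16.5, 1.14),
    "Energy": (17.5, 1.12),
    "Mode name": (0.1, 1.00),
    "Duration": (8.4, 1.12),
    "Danceability": (5.9, 1.06),
    "Valence": (1.8, 1.03),
    "Tempo": (2.2, 1.01),
    "Liveness": (0.2, 1.00),
}

def top_k(index, k=3):
    """Names of the k features with the largest score in column `index`."""
    ranked = sorted(importance, key=lambda name: importance[name][index], reverse=True)
    return ranked[:k]

top_rfi = top_k(0)  # ranked by RFI (GBDT)
top_pfi = top_k(1)  # ranked by PFI (MLP)
shared = [name for name in top_rfi if name in top_pfi]
rfi_total = sum(rfi for rfi, _ in importance.values())

print("Top 3 by RFI:", top_rfi)
print("Top 3 by PFI:", top_pfi)
print("Shared:", shared)
print(f"RFI total: {rfi_total:.1f}")
```

On these values, speechiness ranks first under both measures, while the second and third positions differ between the two models. The RFI column sums to roughly 100, which suggests it is expressed as a percentage share.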
Comparison of feature importance measures for respective binomial classification models.
| Feature | Chinese–Japanese: RFI (GBDT) | Chinese–Japanese: PFI (MLP) | Chinese–English: RFI (GBDT) | Chinese–English: PFI (MLP) | Japanese–English: RFI (GBDT) | Japanese–English: PFI (MLP) |
|---|---|---|---|---|---|---|
| Speechiness | 4.8 | 1.05 | | | | |
| Loudness | 14.1 | | 3.0 | 1.19 | 4.7 | 1.04 |
| Instrumentalness | 26.0 | 1.22 | 2.8 | 1.08 | 8.5 | 1.105^ |
| Acousticness | 31.1 | 1.16 | 34.1 | 1.35 | 2.4 | 1.04 |
| Energy | | 1.07 | 4.7 | 1.09 | 16.0 | 1.114^ |
| Mode name | 0.0 | 1.00 | 0.0 | 1.01 | 0.2 | 1.00 |
| Duration | 9.3 | 1.08 | 4.3 | 1.07 | 12.9 | 1.02 |
| Danceability | 1.2 | 1.01 | 1.6 | 1.06 | 10.4 | 1.110^ |
| Valence | 0.7 | 1.06 | 1.2 | 1.02 | 3.3 | 1.003 |
| Tempo | 1.4 | 1.01 | 1.5 | 1.02 | 1.9 | 1.00 |
| Liveness | 0.0 | 1.00 | 0.2 | 1.00 | 0.1 | 1.00 |
Note:
While scales between RFI and PFI are not equivalent, both measure model-specific feature importance relative to other features: the higher the score, the larger the importance within the model. Features with highest importance are in bold. PFIs were reported with two decimal places, but we used three decimal places for PFIs denoted by ‘^’. This was to identify the 2nd most important feature for the PDP.
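To illustrate how a permutation-based score can behave, the sketch below assumes a ratio form of PFI: the model's error after shuffling one feature divided by its baseline error, so that 1.00 means the feature has no effect. This form would be consistent with the PFI values near 1.00 reported above, but the exact formula is not stated here, so treat it as an assumption. The toy data, the threshold classifier, and all function names are hypothetical.

```python
import random

def error_rate(predict, rows, labels):
    """Share of rows whose predicted label differs from the true label."""
    wrong = sum(predict(row) != label for row, label in zip(rows, labels))
    return wrong / len(rows)

def permutation_importance(predict, rows, labels, column, repeats=20, seed=0):
    """Mean ratio of permuted error to baseline error for one feature column."""
    rng = random.Random(seed)
    baseline = error_rate(predict, rows, labels)
    if baseline == 0:
        raise ValueError("baseline error is zero; the ratio is undefined")
    ratios = []
    for _ in range(repeats):
        # Shuffle one column across rows, leaving all other columns intact.
        shuffled = [row[column] for row in rows]
        rng.shuffle(shuffled)
        permuted = [row[:column] + [value] + row[column + 1:]
                    for row, value in zip(rows, shuffled)]
        ratios.append(error_rate(predict, permuted, labels) / baseline)
    return sum(ratios) / len(ratios)

# Hypothetical toy data: each row is [speechiness, liveness].
rows = [[0.03, 0.10], [0.04, 0.30], [0.05, 0.12], [0.04, 0.20], [0.06, 0.15],
        [0.24, 0.11], [0.30, 0.25], [0.18, 0.14], [0.22, 0.29], [0.05, 0.13]]
labels = ["JP"] * 5 + ["US"] * 5

def predict(row):
    """Toy classifier (an assumption): uses speechiness only, ignores liveness."""
    return "US" if row[0] > 0.1 else "JP"

speechiness_pfi = permutation_importance(predict, rows, labels, column=0)
liveness_pfi = permutation_importance(predict, rows, labels, column=1)
print(f"speechiness PFI = {speechiness_pfi:.2f}, liveness PFI = {liveness_pfi:.2f}")
```

Because the toy classifier never reads liveness, shuffling that column leaves the error unchanged and the ratio stays at exactly 1, mirroring the 1.00 entries in the tables. Shuffling speechiness raises the error, so its ratio exceeds 1.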
Figure 2: PDPs of top two most important features in each model on the probability of classification.
The positive class is indicated at the top of the Y-axis. For example, in the Japanese–English GBDT model (top right) for the speechiness feature, the decreasing trend indicates that the higher the speechiness score, the lower the probability of classification (of a song) as being Japanese (i.e., higher probability of being English), in a fairly linear fashion.
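The reading of a partial dependence plot described above can be made concrete with a minimal sketch: for each grid value of one feature, that feature is overwritten in every row and the model's predicted probabilities are averaged. The logistic scoring function below is a hypothetical stand-in, not the GBDT or MLP from this study, and the rows are illustrative `[speechiness, energy]` pairs.

```python
import math

def partial_dependence(predict_proba, rows, column, grid):
    """Average predicted probability when `column` is forced to each grid value."""
    curve = []
    for value in grid:
        probs = [predict_proba(row[:column] + [value] + row[column + 1:])
                 for row in rows]
        curve.append(sum(probs) / len(probs))
    return curve

def prob_japanese(row):
    """Hypothetical model: probability of 'Japanese' falls as speechiness rises."""
    speechiness, energy = row
    return 1.0 / (1.0 + math.exp(-(1.0 - 8.0 * speechiness + 0.5 * energy)))

# Illustrative rows of [speechiness, energy]; not the study's data.
rows = [[0.04, 0.76], [0.05, 0.51], [0.09, 0.90], [0.08, 0.69], [0.24, 0.54]]
grid = [0.0, 0.1, 0.2, 0.3, 0.4]

curve = partial_dependence(prob_japanese, rows, column=0, grid=grid)
for value, prob in zip(grid, curve):
    print(f"speechiness={value:.1f} -> P(Japanese)={prob:.3f}")
```

A curve that decreases along the grid corresponds to the downward trend described for speechiness: higher values lower the averaged probability of the positive class.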
Figure 3: Variation and stability of features (medians) across cultures from 2000 to 2020.
Note that scales differ between features.