Florian B Pokorny, Maximilian Schmitt, Mathias Egger, Katrin D Bartl-Pokorny, Dajie Zhang, Björn W Schuller, Peter B Marschik.
Abstract
Fragile X syndrome (FXS) and Rett syndrome (RTT) are developmental disorders currently not diagnosed before toddlerhood. Even though speech-language deficits are among the key symptoms of both conditions, little is known about infant vocalisation acoustics for an automatic earlier identification of affected individuals. To bridge this gap, we applied intelligent audio analysis methodology to a compact dataset of 4454 home-recorded vocalisations of 3 individuals with FXS and 3 individuals with RTT aged 6 to 11 months, as well as 6 age- and gender-matched typically developing controls (TD). On the basis of a standardised set of 88 acoustic features, we trained linear kernel support vector machines to evaluate the feasibility of automatic classification of (a) FXS vs TD, (b) RTT vs TD, (c) atypical development (FXS+RTT) vs TD, and (d) FXS vs RTT vs TD. In paradigms (a)-(c), all infants were correctly classified; in paradigm (d), 9 of 12 were so. Spectral/cepstral and energy-related features were most relevant for classification across all paradigms. Despite the small sample size, this study reveals new insights into early vocalisation characteristics in FXS and RTT, and provides technical underpinnings for a future earlier identification of affected individuals, enabling earlier intervention and family counselling.
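The four classification paradigms named in the abstract can be read as four different relabellings of the same three diagnostic groups. The following is a minimal sketch of that relabelling only; the names `PARADIGMS` and `paradigm_labels` are hypothetical, and the choice to keep all TD samples in paradigms (a) and (b) is an assumption, since the abstract does not state which controls entered the two-group comparisons.

```python
# Mapping from diagnostic group to class label for each paradigm (a)-(d).
# Groups missing from a mapping do not take part in that paradigm.
PARADIGMS = {
    "a": {"FXS": "FXS", "TD": "TD"},                # FXS vs TD
    "b": {"RTT": "RTT", "TD": "TD"},                # RTT vs TD
    "c": {"FXS": "AD", "RTT": "AD", "TD": "TD"},    # atypical development vs TD
    "d": {"FXS": "FXS", "RTT": "RTT", "TD": "TD"},  # three-class problem
}


def paradigm_labels(groups, paradigm):
    """Return (sample_index, class_label) pairs for samples used in a paradigm.

    Assumption: every TD sample is kept in paradigms (a) and (b).
    """
    mapping = PARADIGMS[paradigm]
    return [(i, mapping[g]) for i, g in enumerate(groups) if g in mapping]


# Hypothetical group labels of four vocalisations
groups = ["FXS", "TD", "RTT", "TD"]
labels_a = paradigm_labels(groups, "a")
labels_c = paradigm_labels(groups, "c")
```

In this toy input, paradigm (a) drops the RTT sample, while paradigm (c) merges FXS and RTT into a single AD class.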
Year: 2022 PMID: 35922535 PMCID: PMC9349308 DOI: 10.1038/s41598-022-17203-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Table 1. Available audio-video (AV) duration in format hh:mm:ss (two-digit hour number:two-digit minute number:two-digit second number; seconds rounded down to integer values; sums calculated on the basis of exact durations) as well as month-wise and overall number of included vocalisations per participant, per group, and in total.
AD, atypical development; FXS, fragile X syndrome; RTT, Rett syndrome; TD, typical development; ♀, female; ♂, male; colour code for month-wise numbers of vocalisations: greyscale proportional to the total number of vocalisations per line between white = 0 % of vocalisations and black = 100 % of vocalisations.
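The duration convention in the caption above (seconds rounded down, sums taken over exact durations) means that a displayed sum can differ from the sum of the displayed values. A minimal sketch with made-up durations illustrates this; `format_hms` is a hypothetical helper, not a function from the study.

```python
import math


def format_hms(seconds: float) -> str:
    """Format a duration as hh:mm:ss, rounding seconds down to an integer."""
    total = math.floor(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


# Hypothetical exact durations (in seconds) of three recordings
durations = [59.6, 59.6, 59.6]

# Each recording is displayed with its seconds rounded down.
per_recording = [format_hms(d) for d in durations]

# Sum on the basis of exact durations (the convention stated in the caption).
total_exact = format_hms(sum(durations))

# Sum of the already rounded values, shown only for comparison.
total_of_floored = format_hms(sum(math.floor(d) for d in durations))
```

Here each recording is shown as 00:00:59, the sum over exact durations is 00:02:58, and the sum of the rounded values would be 00:02:57.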
Table 2. Paradigm-wise (a)–(d) best system configuration regarding audio normalisation, feature normalisation, and training partition upsampling, as well as the respective classification results in the form of class-specific numbers of (in)correctly assigned vocalisations (confusion matrices) and unweighted average recall (UAR) for both the vocalisation-wise and the infant-wise evaluation scenario.
UAR values are rounded to three decimal places. AD, atypical development; C, support vector machine kernel complexity parameter; FXS, fragile X syndrome; n, number of participants; RTT, Rett syndrome; TD, typical development; ♀, female; ♂, male; colour code for confusion matrices: greyscale proportional to the class-specific number of vocalisations between white = 0 % of class-specific vocalisations and black = 100 % of class-specific vocalisations.
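UAR is the mean of the class-specific recalls, so each class counts equally regardless of how many vocalisations it contains. The sketch below computes UAR from a confusion matrix and derives infant-wise decisions from vocalisation-wise ones. The function names and all numbers are hypothetical, and the majority vote is an assumption: the text here does not say how infant-wise decisions were obtained.

```python
from collections import Counter


def unweighted_average_recall(confusion):
    """Mean of per-class recalls; confusion[true_label][predicted_label] = count."""
    recalls = []
    for true_label, row in confusion.items():
        total = sum(row.values())
        if total == 0:
            continue  # a class without samples has no defined recall
        recalls.append(row.get(true_label, 0) / total)
    return sum(recalls) / len(recalls)


def infant_wise_decisions(predictions):
    """Majority vote over (infant_id, predicted_label) pairs (assumed rule).

    Ties go to the label that was seen first for that infant.
    """
    votes = {}
    for infant_id, label in predictions:
        votes.setdefault(infant_id, Counter())[label] += 1
    return {infant: counts.most_common(1)[0][0] for infant, counts in votes.items()}


# Hypothetical, imbalanced vocalisation-wise confusion matrix
confusion = {
    "FXS": {"FXS": 80, "TD": 20},
    "TD": {"FXS": 100, "TD": 900},
}
uar = round(unweighted_average_recall(confusion), 3)

# Hypothetical vocalisation-wise predictions for two infants
predictions = [("i1", "FXS"), ("i1", "FXS"), ("i1", "TD"), ("i2", "TD"), ("i2", "TD")]
decisions = infant_wise_decisions(predictions)
```

With these numbers the recalls are 0.8 and 0.9, giving a UAR of 0.85, whereas plain accuracy (980 of 1100) would be dominated by the larger TD class.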
Table 3. Paradigm-wise (a)–(d) top ten acoustic features according to the ascending mean (given below the respective feature name outside the brackets) of cross-validation iteration-wise (mean of sub-models) feature ranks (given below the respective feature name within the brackets) for both the vocalisation-wise and the infant-wise evaluation scenario.
Mean ranks are rounded to two decimal places. A, amplitude; AD, atypical development/atypically developing; B, bandwidth; , fundamental frequency; , second and third vocal formant; FXS, fragile X syndrome; HNR, harmonics-to-noise ratio; idx, index; , equivalent sound level; , first, second, third, and fourth Mel-frequency cepstral coefficient; pctl, percentile; pctlr, percentile range; RTT, Rett syndrome; , standard deviation normalised by the arithmetic mean; TD, typical development; URs, unvoiced regions; VRs, voiced regions; ♀, female; ♂, male; colour code for features: light grey = frequency-related feature, middle grey = spectral/cepstral feature, dark grey = energy/amplitude-related feature.
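The ordering described in the caption above (ascending mean of iteration-wise feature ranks, rounded to two decimal places) can be sketched as follows. The feature names and rank values are made up, `top_features_by_mean_rank` is a hypothetical helper, and how the per-iteration ranks are derived from the trained models is not covered here.

```python
def top_features_by_mean_rank(ranks_per_iteration, k=10):
    """Order features by ascending mean rank across cross-validation iterations.

    ranks_per_iteration: list of dicts mapping feature name -> rank (1 = best),
    one dict per cross-validation iteration, all with the same keys.
    """
    features = list(ranks_per_iteration[0].keys())
    n_iterations = len(ranks_per_iteration)
    mean_ranks = {
        f: sum(iteration[f] for iteration in ranks_per_iteration) / n_iterations
        for f in features
    }
    ordered = sorted(mean_ranks.items(), key=lambda item: item[1])
    return [(f, round(m, 2)) for f, m in ordered[:k]]


# Hypothetical ranks of three features in three cross-validation iterations
ranks = [
    {"feat_a": 1, "feat_b": 2, "feat_c": 3},
    {"feat_a": 2, "feat_b": 1, "feat_c": 3},
    {"feat_a": 1, "feat_b": 3, "feat_c": 2},
]
top_two = top_features_by_mean_rank(ranks, k=2)
```

For this toy input the mean ranks are 1.33, 2.0, and 2.67, so `feat_a` and `feat_b` form the top two.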
Figure 1. Vocalisation distributions for the paradigm FXS vs RTT vs TD within the space of the three best ranked acoustic features for (a) the vocalisation-wise and (b) the infant-wise evaluation scenario, respectively (see Table 3), in the best system configuration, i.e., without audio and feature normalisation (see Table 2). dBp, power converted to decibel with ; FXS, fragile X syndrome; , equivalent sound level; , second Mel-frequency cepstral coefficient; pctlr, percentile range; RTT, Rett syndrome; TD, typical development; URs, unvoiced regions; VRs, voiced regions; *Real measurement unit not existent as feature values refer to the amplitude of the digital audio signal.