Seyedeh Zahra Asghari, Sajjad Farashi, Saeid Bashirian, Ensiyeh Jenabi.
Abstract
In this systematic review, we analyzed and evaluated the findings of studies on prosodic features of the vocal productions of people with autism spectrum disorder (ASD) in order to identify the statistically significant, most consistently confirmed prosodic differences that distinguish people with ASD from typically developing (TD) individuals. Three major databases (Web of Science, PubMed and Scopus) were searched using suitable keywords. Results for prosodic features such as mean pitch, pitch range and variability, speech rate, intensity and voice duration were extracted from eligible studies. The pooled standardized mean difference (SMD) between ASD and control groups was extracted or calculated. Between-study heterogeneity was evaluated using the I² statistic and Cochran's Q test. Publication bias was assessed using funnel plots, and its significance was evaluated using Egger's and Begg's tests. Thirty-nine eligible studies were retrieved (910 and 850 participants in the ASD and control groups, respectively). The meta-analysis showed that the ASD group had a significantly larger mean pitch (SMD = −0.4, 95% CI [−0.70, −0.10]), larger pitch range (SMD = −0.78, 95% CI [−1.34, −0.21]), longer voice duration (SMD = −0.43, 95% CI [−0.72, −0.15]), and larger pitch variability (SMD = −0.46, 95% CI [−0.84, −0.08]) compared with the typically developing control group. However, no significant differences in pitch standard deviation, voice intensity or speech rate were found between groups. Chronological age of participants and voice elicitation task were two sources of between-study heterogeneity. No publication bias was observed (p > 0.05). Mean pitch, pitch range, pitch variability and voice duration were identified as the prosodic features that reliably distinguish people with ASD from TD individuals.
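The pooling and heterogeneity steps named in the abstract (pooled SMD, Cochran's Q, I²) can be sketched as follows. This is a minimal illustration, assuming a DerSimonian-Laird random-effects model; the abstract does not state which pooling model the authors used, and the function name `pool_random_effects` is hypothetical. The inputs are per-study SMDs and their variances, which are not listed in this record.

```python
import math

def pool_random_effects(effects, variances):
    """Pool per-study SMDs with a DerSimonian-Laird random-effects model.

    Assumption: this model is chosen for illustration only; the source
    does not say which estimator was used.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # I^2: percentage of total variation attributed to between-study heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Between-study variance tau^2 (method of moments, truncated at zero)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each study's variance
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "smd": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }
```

When all studies report the same effect, Q and I² are zero and the result reduces to the fixed-effect estimate; as the effects spread apart, tau² grows and the pooled confidence interval widens.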
Year: 2021 PMID: 34845298 PMCID: PMC8630064 DOI: 10.1038/s41598-021-02487-6
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Flow diagram of the search procedure according to the PRISMA guidelines.
Summary of included studies.
| Study (authors, year, ref) | ASD: n, n male, age | TD: n, n male, age | Voice elicitation | Measurements | Outcomes |
|---|---|---|---|---|---|
| Arciuli and Bailey (2019) | 20, 18, 7.38 ± 1.55 | 20, 18, 7.21 ± 1.78 | Picture-naming strategy | Pairwise variability index (PVI) | Stress contrastivity: ASD < TD |
| Arciuli et al. (2020) | 16, 13, 5.73 | 16, 3, 4.65 | Picture-naming targets | Duration, F0, intensity of the first two vowels for PVI | Acoustic analyses indicated no statistically significant group differences in PVIs |
| Bone et al. (2016) | 95, 75, 8.8 ± 2.6 | 81, 56, 8.3 ± 2.5 | Narration | Pitch dynamics, rate of speech, prosodic attributes, and turn-taking | Prosodic variability increases in interactions with higher-severity ASD; pitch variability: ASD > TD |
| | 41, 32, 5 ± 1.1 | 42, 17, 5.1 ± 0.7 | Picture-naming task | Long-term average spectrum and pitch variability | ASD spectrum was shallower and showed less harmonic structure; pitch range: ASD > TD |
| | 12, 10, 0.365 ± 0.073 | 11, 9, 0.309 ± 0.115 | Infants' and mothers' voice productions extracted from family home movies | Mean duration and pitch | ASD infants' productions did not differ in duration or pitch; however, ASD infants produced less complex modulated productions |
| Chan and To (2016) | 19, 19, 25.72 ± 3.63 | 19, 19, 25.50 ± 3.21 | Recording of narrative production | F0, pitch variability and the total number and the type of sentence-final particles from narrative samples | Pitch range: ASD > TD; F0: ASD > TD; pitch variability: ASD > TD |
| Choi and Lee (2019) | 17, NR, 8.23 ± 1.55 | 34, NR, 8.27 ± 1.725 | Conversation samples | Voice intensity variation, prosody, pitch | Intensity, pitch, and intonation change: ASD > TD |
| | 12, 12, 23.2 ± 6.6 | 6, 6, 26.3 ± 4.0 | Verbal responses | Overall range-fall (the difference between the peak and the proceeding lowest pitch value) | The high language functioning ASD (HASD) group had a higher pitch range and the moderate language functioning group a lower pitch range compared with TD; higher range-fall for HASD |
| Demouy et al. (2011) | 12, 10, 9.75 ± 3.5 | 12, NR, NR | Language assessment tasks | Sentence duration | Sentence duration for all sentence types (descending, falling, rising and floating): ASD > TD |
| | 24, 16, 12.31 ± 2.32 | 22, 15, 12.21 ± 2.64 | An instrument designed to assess prosody performance in children | Acoustic measures of prosody | Utterance duration, pitch range, pitch variance and mean pitch: ASD > TD |
| | 21, 19, 13.58 ± 2.10 | 21, 19, 13.24 ± 2.09 | A cartoon for eliciting narratives and gestures | Standard deviation of F0, average fundamental frequency across the entire narrative | F0: ASD > TD; pitch variability: ASD > TD |
| Drimalla et al. (2020) | 37, 19, 36.89 | 43, 21, 33.14 | Conversation between the participant and an actress | Prosodic features for each frame: F0, jitter (pitch perturbations), shimmer (amplitude perturbations) and root-mean-square energy | F0: ASD > TD |
| Esposito and Venuti (2009) | 10, 5, 1.4 ± 0.125 | 10, 5, 1 ± 0.07 | Cry observation codes | Duration | Longer screaming duration for ASD |
| | 12, 10, 8.58 ± 0.51 | 17, 10, 8.35 ± 0.49 | PEPS-C test for assessing the receptive and expressive prosodic skills of children | Duration, pitch (range, mean, maximum, and minimum), and intensity (mean, maximum, and minimum) | Voice duration, pitch range, mean pitch, maximum pitch: ASD > TD |
| | 4, 4, 4–17 | 4, 4, 4–17 | Declarative and question sentences | Mean duration and pitch range | Longer voice duration in ASD group |
| | 16, NR, 12.33 ± 2.25 | 15, NR, 12.58 ± 3.08 | Picture-naming task | Intensity and duration of speech | Utterance duration: ASD > TD; no statistical difference in intensity |
| Hubbard et al. (2017) | 15, 15, 27 (21–42) | 15, 15, 21 (18–26) | Evoked elicitation procedure for prosodic production in different emotional contexts | F0 range and voice intensity | Intensity and F0 range: ASD > TD |
| | 9, 6, 14.5 | 10, 9, 14.5 | Repeat type recorded contents with different intonation | Frequency, amplitude, and duration measurements of recorded speech | ASD exhibited lower pitch peak location accuracy compared with TD; pitch range: ASD > TD |
| Hudenko et al. (2009) | 15, 13, 9.1 ± 0.77 | 15, 13, 9 ± 0.7 | Laugh elicitation | Duration, F0, F0 variability | No acoustic measure differed significantly, except for the comparisons between voiced and unvoiced laughter |
| | 20, 14, 28.9 | 20, 3, 21.8 | Communication task | Pitch analysis | F0 range: ASD < TD |
| Lehnert-LeHouillier et al. (2020) | 12, 3, 12.14 ± 1.84 | 12, 3, 12.23 ± 1.89 | Conversation | Acoustic analysis of a goal-directed conversation task, conversational F0 range | F0 range: ASD > TD |
| Patel et al. (2020) | 55, 45, 16.57 ± 6.62 | 39, 19, 18.99 ± 5.21 | Narration elicitation using a wordless picture book | Mean, range and standard deviation of F0, speech rate, speech rhythm using normalized PVI | F0 variability: ASD > TD |
| Lyakso et al. (2016) | 25, x, 5–14 | 60, NR, NR | Emotional speech, spontaneous speech, and the repetition of words | Pitch values, max and min values of pitch, pitch range, formants frequency, energy and duration of recorded voice and speech | Pitch values of spontaneous speech: ASD > TD |
| Nadig and Mulligan (2017) | 9, 1, 5.72 ± 1.00 | 9, 5, 3.065 ± 0.59 | Audio stimuli | Mullen scales of early learning for assessing cognitive functioning for receptive and expressive language | ASD and TD groups were not significantly different in repetition accuracy; the ASD group had a higher score for accurate repetition of four syllables |
| | 15, 13, 11 ± 0.791 | 13, 11, 11 ± 2 | Conversation task | Pitch range | Pitch range: ASD > TD |
| Nadig and Shaw (2015) | 15, 12, 5.5 ± 1.42 | 11, 2, 5.66 ± 1.9 | Describe a target object | Amplitude, duration and mean pitch | Intensity: ASD < TD; duration: ASD > TD |
| | 20, 15, 7.9 ± 0.7 | 21, 10, 7.9 ± 0.1 | Picture-naming task | F0 and pitch | Pitch variability: ASD > TD |
| Nayak et al. (2019) | 16, 11, 7–18 | 27, 16, 7–18 | General communication | Mean pitch, pitch range, and the standard deviation of pitch | Pitch variability: ASD < TD |
| Ochi et al. (2019) | 62, 62, 26.9 ± 7.0 | 17, 17, 29.6 ± 7.0 | General conversation | log | Standard deviation of intensity: ASD < TD |
| Olivati et al. (2017) | 19, 19, 13.37 ± 6.12 | 19, 19, NR | Speech-language pathology screening for vocal quality, speech chain, comprehension of simple and complex orders | F0, intensity and duration of recorded voices | Maximum and minimum intensity and distance between maximum and minimum F0 frequencies: ASD > TD; duration: ASD > TD |
| Paul et al. (2008) | 46, 43, 13.2 ± 4.4 | 20, 17, 7.91–27.42 | Constrained production (imitation) | Duration | Stressed syllable duration: ASD < TD |
| Patel et al. (2020) | 55, 45, 16.57 ± 6.62 | 39, 19, 18.99 ± 5.21 | Narration | Mean pitch, speech rate | Speech rate: ASD < TD |
| | 10, 5, 12.12 ± 0.89 | 9, 5, 11.95 ± 0.84 | Mother–infant social interaction | Mean F0, pitch range and intensity | No significant differences were found between groups |
| | 30, 26, 10.57 ± 1.6 | 30, 22, 10.60 ± 2 | Conversation | Pitch and intensity | Mean vocal intensity: ASD < TD |
| | 15, 14, 6.25 ± 1.5 | 10, 9, 7.3 ± 2 | Spontaneous speech task | Pitch and pitch range | Pitch, pitch range: ASD > TD |
| Sheinkopf et al. (2012) | 21, 15, 0.5 ± 0.5 | 18, 8, 0.5 ± 0.5 | Audio–video recordings of participants at 6 months of age and identification of cry episodes | F0 and phonation | F0 for cry: ASD > TD |
| Unwin et al. (2017) | 22, 18, 1 | 27, 12, 1 | | F0, amplitude, first and second formants (F1, F2), cry duration | Cry duration: ASD < TD |
| Van Santen et al. (2010) | 22, NR, 6.35 ± 1.02 | 22, NR, 6.57 ± 1.29 | Lexical stress task | F0, amplitude and duration | F0: ASD > TD during lexical stress task |
| Wehrle et al. (2020) | 14, 10, 42.5 ± 7.8 | 14, 11, 37.3 ± 8 | Semi-spontaneous speech in the form of task-oriented dialogues | Pitch range, mean F0 | ASD group shows a more melodic or singsongy intonation style |
The studies shown in bold were also included in the most recent previous meta-analysis, by Fusaroli et al. [35].
NR: not reported.
Figure 2. Forest plot for mean pitch value measure. The negative sign shows that the mean pitch value is larger for ASD individuals as compared with TD individuals.
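The sign convention in the Figure 2 caption implies that each study's effect is computed as the TD mean minus the ASD mean, so a higher ASD mean gives a negative SMD. A minimal sketch of a per-study SMD under that convention is shown below. It assumes Hedges' g with the usual small-sample correction; the source does not say whether Cohen's d or Hedges' g was used, the function name `smd_td_minus_asd` is hypothetical, and the numbers in the usage note are invented for illustration rather than taken from any included study.

```python
import math

def smd_td_minus_asd(mean_asd, sd_asd, n_asd, mean_td, sd_td, n_td):
    """Return (g, variance) for one study, signed as TD minus ASD.

    Assumption: Hedges' g is used here for illustration; the source
    does not specify the effect-size estimator.
    """
    df = n_asd + n_td - 2
    # Pooled standard deviation of the two groups
    pooled_sd = math.sqrt(
        ((n_asd - 1) * sd_asd ** 2 + (n_td - 1) * sd_td ** 2) / df
    )
    d = (mean_td - mean_asd) / pooled_sd  # negative when the ASD mean is larger
    j = 1.0 - 3.0 / (4.0 * df - 1.0)      # Hedges' small-sample correction factor
    g = j * d
    # Large-sample approximation to the variance of d, rescaled for g
    var_d = (n_asd + n_td) / (n_asd * n_td) + d ** 2 / (2.0 * (n_asd + n_td))
    return g, j ** 2 * var_d
```

For example, hypothetical groups of 20 speakers each with a mean F0 of 220 Hz (ASD) and 205 Hz (TD) and a common standard deviation of 30 Hz give a negative effect of roughly half a standard deviation, matching the direction described in the caption.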
Subgroup analyses for mean pitch difference between ASD and TD groups. The elicitation tasks and the age of participants were confounding factors.
| Subgroup | Pooled SMD | Heterogeneity (%) | p-value |
|---|---|---|---|
| Task type | |||
| Narration | − 0.41 (95% CI [− 0.77, − 0.05]) | 23.00 | 0.268 |
| Conversation | − 0.28 (95% CI [− 0.85, 0.29]) | 80.70 | < 0.001 |
| Focus | − 0.79 (95% CI [− 1.26, − 0.05]) | 0.00 | 0.915 |
| Cry | − 0.58 (95% CI [− 2.48, 1.31]) | 71.7 | 0.029 |
| Age of ASD participants | |||
| Infancy (age ≤ 2) | − 0.58 (95% CI [− 2.48, 1.31]) | 85.70 | 0.008 |
| Childhood (age: 2–11) | − 0.30 (95% CI [− 0.76, 0.15]) | 63.1 | 0.019 |
| Adolescence (age: 12–18) | − 0.14 (95% CI [− 0.49, 0.21]) | 0.00 | 0.718 |
| Adulthood (age > 20) | − 0.94 (95% CI [− 1.36, − 0.52]) | 40.70 | 0.185 |
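Whether a subgroup's pooled SMD differs significantly from zero can be read off its confidence interval: a 95% CI that excludes zero corresponds to a two-sided p below 0.05. The sketch below backs out the standard error, z statistic and p-value from a reported SMD and CI, assuming a symmetric normal-approximation interval (an assumption, since the source does not describe how the intervals were built). The function name `p_from_ci` is hypothetical; the three rows are copied from the mean pitch table above. This p-value concerns the pooled effect and should not be confused with the table's p-value column, which appears to accompany the heterogeneity statistic.

```python
import math

def p_from_ci(smd, lower, upper, z_crit=1.96):
    """Recover (z, two-sided p) from an SMD and its 95% CI.

    Assumes a symmetric, normal-approximation confidence interval.
    """
    se = (upper - lower) / (2.0 * z_crit)      # CI half-width divided by 1.96
    z = smd / se
    p = math.erfc(abs(z) / math.sqrt(2.0))     # two-sided normal tail probability
    return z, p

# Rows taken from the mean pitch subgroup table
rows = {
    "Narration": (-0.41, -0.77, -0.05),
    "Conversation": (-0.28, -0.85, 0.29),
    "Adulthood": (-0.94, -1.36, -0.52),
}

for name, (smd, lo, hi) in rows.items():
    z, p = p_from_ci(smd, lo, hi)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{name}: z = {z:.2f}, p = {p:.4f} ({verdict} at 0.05)")
```

Under these assumptions the narration and adulthood subgroups come out significant and the conversation subgroup does not, in line with whether each interval crosses zero.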
Subgroup analyses for the difference of pitch standard deviation between ASD and TD groups. The elicitation tasks and the age of participants were confounding factors.
| Subgroup | Pooled SMD | Heterogeneity (%) | p-value |
|---|---|---|---|
| Task type | |||
| Narration | − 0.14 (95% CI [− 1.14, 0.85]) | 82.6 | < 0.001 |
| Conversation | − 0.16 (95% CI [− 0.75, 0.42]) | 43.7 | 0.169 |
| Focus | − 0.11 (95% CI [− 1.11, 0.89]) | 92.1 | < 0.001 |
| Crying | 0.56 (95% CI [− 0.68, 1.80]) | 58.0 | 0.123 |
| Age of ASD participants | |||
| Infancy (age ≤ 2) | 0.21 (95% CI [− 0.54, 0.96]) | 65.2 | 0.023 |
| Childhood (age: 2–11) | − 0.05 (95% CI [− 0.87, 0.76]) | 90.8 | < 0.001 |
Subgroup analysis for pitch range difference between ASD and TD groups. The elicitation tasks and the age of participants were confounding factors.
| Subgroup | Pooled SMD | Heterogeneity (%) | p-value |
|---|---|---|---|
| Task type | |||
| Narration | − 0.58 (95% CI [− 0.94, − 0.22]) | 91.4 | < 0.001 |
| Conversation | − 0.69 (95% CI [− 1.46, 0]) | 80.7 | < 0.001 |
| Focus | − 1.00 (95% CI [− 2.25, 0.24]) | 57.2 | 0.097 |
| Cry | No study was found | ||
| Age of ASD participants | |||
| Infancy (age ≤ 2) | No study was found | ||
| Childhood (age: 2–11) | − 1.15 (95% CI [− 2.67, 0.37]) | 96.4 | < 0.001 |
| Adolescence (age: 12–18) | − 0.74 (95% CI [− 1.06, − 0.42]) | 0.00 | 0.935 |
| Adulthood (age > 20) | − 0.37 (95% CI [− 1.04, 0.29]) | 72.6 | < 0.001 |
Subgroup analyses for the difference of pitch variability between ASD and TD groups. The voice elicitation tasks and the age of participants were confounding factors.
| Subgroup | Pooled SMD | Heterogeneity (%) | p-value |
|---|---|---|---|
| Task type | |||
| Narration | − 0.41 (95% CI [− 0.81, − 0.01]) | 53.5 | 0.154 |
| Conversation | − 0.525 (95% CI [− 1.06, 0.01]) | 75.5 | < 0.001 |
| Focus | − 0.62 (95% CI [− 1.39, 0.16]) | 94.0 | < 0.001 |
| Cry | 0.56 (95% CI [− 0.68, 1.80]) | 58.0 | 0.123 |
| Age of ASD participants | |||
| Infancy (age ≤ 2) | 0.21 (95% CI [− 0.54, 0.96]) | 65.3 | 0.021 |
| Childhood (age: 2–11) | − 0.58 (95% CI [− 1.36, 0.19]) | 94.4 | < 0.001 |
| Adolescence (age: 12–18) | − 0.73 (95% CI [− 1.02, − 0.45]) | 0.0 | 0.971 |
| Adulthood (age > 20) | − 0.42 (95% CI [− 0.96, 0.13]) | 68.1 | 0.008 |
Subgroup analyses for voice intensity difference between ASD and TD groups. The elicitation tasks and the age of participants were confounding factors.
| Subgroup | Pooled SMD | Heterogeneity (%) | p-value |
|---|---|---|---|
| Task type | |||
| Narration | Only one study was available | ||
| Conversation | − 0.07 [− 0.94, 0.8] | 90.6 | < 0.001 |
| Focus | − 0.24 [− 0.85, 0.38] | 57.2 | 0.097 |
| Cry | − 0.19 [− 0.56, 0.18] | 0.0 | 0.926 |
| Age of ASD participants | |||
| Infancy (age ≤ 2) | − 0.34 [− 0.7, 0.02] | 13.7 | 0.327 |
| Childhood (age: 2–11) | 0.29 [− 0.53, 1.1] | 85.8 | < 0.001 |
| Adolescence (age: 12–18) | Only one study was available | ||
| Adulthood (age > 20) | 0.27 [− 0.93, 1.47] | 86.4 | < 0.001 |
Figure 3. Forest plot for the subgroup meta-analysis of the difference of voice duration between ASD and TD groups. The confounding factor for this analysis was the type of voice elicitation task.
Figure 4. Forest plot for the subgroup meta-analysis of the difference of voice duration between ASD and TD groups. The confounding factor for this analysis was the age span of participants.
Results of the publication bias assessment using Begg's and Egger's tests for the included studies, by acoustic measure.
| Measure | Begg's test: p value | Begg's test: Z value | Egger's test: p value | Egger's test: bias | Egger's test: 95% CI for bias |
|---|---|---|---|---|---|
| Pitch range | 0.091 | 1.69 | 0.062 | − 5.31 | [− 10.92, 0.29] |
| Duration | 0.118 | 1.56 | 0.053 | − 3.24 | [− 6.53, 0.046] |
| Intensity | 0.324 | 0.99 | 0.144 | − 4.95 | [− 11.85, 1.94] |
| Mean pitch | 0.928 | 0.09 | 0.932 | 0.17 | [− 4.01, 4.34] |
| Pitch standard deviation | 0.583 | 0.55 | 0.219 | 3.19 | [− 2.19, 8.57] |
| Pitch variability | 0.668 | 0.43 | 0.399 | − 1.67 | [− 5.66, 2.32] |
| Speech rate | 0.734 | 0.34 | 0.653 | 1.72 | [− 12.43, 15.78] |
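The Begg's test columns can be cross-checked against each other, because a Z value maps to a two-sided p-value through the standard normal distribution, p = 2(1 − Φ(|Z|)). The sketch below recomputes p from each reported Z and compares it with the reported p. It assumes the reported p-values are two-sided normal approximations, which the source does not state explicitly; the helper name `two_sided_p` is hypothetical, and the (Z, p) pairs are copied from the table above.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# (Z value, reported p value) for Begg's test, copied from the table
begg = {
    "Pitch range": (1.69, 0.091),
    "Duration": (1.56, 0.118),
    "Intensity": (0.99, 0.324),
    "Mean pitch": (0.09, 0.928),
    "Pitch standard deviation": (0.55, 0.583),
    "Pitch variability": (0.43, 0.668),
    "Speech rate": (0.34, 0.734),
}

for measure, (z, reported) in begg.items():
    computed = two_sided_p(z)
    print(f"{measure}: Z = {z:.2f}, computed p = {computed:.3f}, reported p = {reported:.3f}")
```

Small discrepancies in the third decimal place are expected, since the Z values in the table are themselves rounded to two decimals.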