Alejandrina Cristia, Yasuyo Minagawa, Emmanuel Dupoux.
Abstract
In the adult brain, speech can recruit a brain network that overlaps with, but is not identical to, the one involved in perceiving non-linguistic vocalizations. Using the same stimuli that had been presented to human 4-month-olds and adults, as well as adult macaques, we sought to shed light on the cortical networks engaged when human newborns process diverse vocalization types. Near infrared spectroscopy was used to register the response of 40 newborns' perisylvian regions when stimulated with speech, human and macaque emotional vocalizations, as well as auditory controls where the formant structure was destroyed but the long-term spectrum was retained. Left fronto-temporal and parietal regions were significantly activated in the comparison of stimulation versus rest, with unclear selectivity in cortical activation. These results for the newborn brain are qualitatively and quantitatively compared with previous work on newborns, older human infants, adult humans, and adult macaques.
Year: 2014 PMID: 25517997 PMCID: PMC4269432 DOI: 10.1371/journal.pone.0115162
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Examples of acoustic characteristics of the stimuli.
The top left is an example from the emotion stimuli; the middle left, from monkey vocalizations; and the bottom left, from speech. The three examples on the right are the scrambled counterparts of the corresponding stimulus on the left.
Figure 2. Location of the 10 shallow channels (1–10) and 4 deep ones (4a,b and 7a,b) on a model of a newborn's head.
See Fig. 4 for an estimation of the point of maximal sensitivity for each channel.
Figure 4. Channels responding significantly to auditory stimulation.
The three channels indicated had significant activation in the analysis where all stimulation was declared together against the silent baseline.
Figure 3. Average time course of the hemoglobin changes following auditory stimulation in individual channels revealed by a general linear model.
The trace in red represents oxyHb, in blue deoxyHb; the error bars indicate standard error (over participants). The zero level or baseline is defined as the intercept of the linear model. The black dotted lines show the standard HRF model convolved with the average duration of stimulation, scaled to the maximum average concentration. The scale as well as the timing of stimulation (green box) are shown in the reference axes.
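The dotted reference curve described in this caption (an HRF model convolved with the stimulation duration, then scaled to a maximum) can be sketched as follows. This is only an illustration under stated assumptions: the double-gamma shape parameters (6 and 16, undershoot ratio 1/6) are a commonly used adult-style default and are not taken from this record, the 10-unit stimulation duration is read from the procedure table and assumed to be in seconds, and `predicted_trace` is a hypothetical helper name.

```python
import math

def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)

def double_gamma_hrf(t, peak_shape=6.0, undershoot_shape=16.0, ratio=1.0 / 6.0):
    """Assumed canonical HRF: a positive lobe minus a smaller, later undershoot."""
    return gamma_pdf(t, peak_shape) - ratio * gamma_pdf(t, undershoot_shape)

def predicted_trace(stim_s=10.0, total_s=40.0, dt=0.1, target_max=1.0):
    """Convolve a boxcar of length stim_s with the HRF and rescale its peak."""
    n = int(round(total_s / dt))
    n_on = int(round(stim_s / dt))  # number of samples with stimulation on
    hrf = [double_gamma_hrf(i * dt) for i in range(n)]
    trace = []
    for i in range(n):
        # Discrete convolution: the boxcar equals 1 for the first n_on samples.
        acc = sum(hrf[i - j] for j in range(min(i + 1, n_on)))
        trace.append(acc * dt)
    peak = max(trace)
    times = [i * dt for i in range(n)]
    # Scale so the maximum equals target_max (e.g. the largest observed average).
    return times, [target_max * v / peak for v in trace]

times, trace = predicted_trace()
peak_time = times[trace.index(max(trace))]
```

In practice `target_max` would be set to the maximum average concentration in the channel being plotted, so the model curve and the data share a scale.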
Channels activated in the sound versus silence contrast.
| All conditions versus silence | |||||||||||||
| Left | | | | | | | Right | | | | | | |
| Ch | β | SE β | t | N | p unc | p cor | Ch | β | SE β | t | N | p unc | p cor |
| 1 | 0.01 | 0.004 | 2.25 | 39 | 0.031 | 0.41 | 3 | 0.013 | 0.005 | 2.73 | 37 | 0.01 | 0.16 |
| 4 * | 0.017 | 0.005 | 3.18 | 39 | 0.003 | 0.05 | 6 | 0.02 | 0.008 | 2.63 | 38 | 0.012 | 0.19 |
| 5 | 0.013 | 0.005 | 2.45 | 39 | 0.019 | 0.29 | 7 | 0.012 | 0.005 | 2.51 | 40 | 0.017 | 0.25 |
| 7 | 0.018 | 0.006 | 2.76 | 40 | 0.009 | 0.15 | 9 | 0.013 | 0.005 | 2.38 | 39 | 0.022 | 0.33 |
| 8 * | 0.018 | 0.005 | 3.53 | 39 | 0.001 | 0.02 | |||||||
| 10 | 0.011 | 0.004 | 2.58 | 36 | 0.014 | 0.22 | |||||||
| 4b * | 0.016 | 0.004 | 3.55 | 39 | 0.001 | 0.02 | |||||||
Ch = channel (channels with a number followed by a subscript are deep); β = beta recovered from the GLM, in mM·mm; SE β = standard error of β; t = t value; N = number of children contributing data for that channel and condition; p unc = uncorrected p value; p cor = p value corrected through resampling. Only channels whose β value differed significantly from zero at p ≤ 0.05 uncorrected, for the relevant condition, are shown. Channels whose β remained significant after correction through resampling are marked with *.
Channels activated in the five conditions.
| Left | | | | | | | Right | | | | | | |
| Ch | β | SE β | t | N | p unc | p cor | Ch | β | SE β | t | N | p unc | p cor |
| Macaque calls | |||||||||||||
| 5 | 0.022 | 0.01 | 2.17 | 33 | 0.03 | 0.54 | |||||||
| 4b | 0.016 | 0.008 | 2.09 | 32 | 0.05 | 0.61 | |||||||
| Human emotional vocalizations | |||||||||||||
| 2 | 0.03 | 0.011 | 2.7 | 34 | 0.01 | 0.22 | |||||||
| 4 | 0.031 | 0.01 | 3.03 | 32 | 0 | 0.11 | |||||||
| 8 | 0.02 | 0.01 | 2.02 | 33 | 0.05 | 0.69 | |||||||
| Native speech | |||||||||||||
| 7 | 0.022 | 0.009 | 2.37 | 34 | 0.02 | 0.38 | 7b | 0.025 | 0.009 | 2.65 | 33 | 0.013 | 0.23 |
| Foreign speech | |||||||||||||
| 7 | 0.025 | 0.012 | 2 | 34 | 0.05 | 0.71 | 4a | 0.025 | 0.008 | 2.93 | 31 | 0.006 | 0.13 |
| Scrambled auditory control | |||||||||||||
| 4a | 0.023 | 0.009 | 2.64 | 34 | 0.01 | 0.23 | 7 | 0.031 | 0.013 | 2.4 | 33 | 0.022 | 0.35 |
| 7a | 0.032 | 0.011 | 2.93 | 32 | 0.006 | 0.12 | |||||||
Ch = channel (channels with a number followed by a subscript are deep); β = beta recovered from the GLM, in mM·mm; SE β = standard error of β; t = t value; N = number of children contributing data for that channel and condition; p unc = uncorrected p value; p cor = p value corrected through resampling. Only channels whose β value differed significantly from zero at p ≤ 0.05 uncorrected, for the relevant condition, are shown.
Comparison between the newborn results reported here, and published results from human 4-month-olds [11], adult humans and adult macaques [12].
| Contrast | Newborns | 4-month-olds | Adults | Macaques |
| Sound vs. Silence | * L4 L8 L4b | * Lc Rc | ||
| Scrambled vs. Silence | # L4a R7 R7a | # Lc | * LS STG STS | * STG STS F |
| Macaque vs. Silence | # L5 L4b | * L and R b-d | ||
| Human Emotion vs. Silence | # L2 L4 L8 | # Rc | ||
| Native vs. Silence | # L7 R7b | * La Lc Ld | ||
| Foreign vs. Silence | # L7 R4a | # Lc Ld | ||
| Native vs. Foreign | None | Lc Rc | * STS, IFG | (–) |
| Human Emotion vs. Macaque | # -R3 | | # STS | None |
| Native vs. Human Emotion | # -L2 R1 R7b | | # STS STG IFG | None |
| Native vs. Macaque | # R7b | | # STG STS IFG | None |
| Foreign vs. Human Emotion | None | | * STG STS | (–) |
| Foreign vs. Macaque | None | | * STG STS | (–) |
The symbol indicates the significance level: * at a corrected level (through resampling in the present work, FDR in the 4-month-old study, and FWE in the adult work) and # at an uncorrected level (.05 for the infant work, .005 for the adult work). 'None' indicates no significant result at p<.05 uncorrected for the infants, and at p<.001 for the macaques. The Native vs. Foreign contrast in the 4-month-olds was significant at uncorrected p = .05 in an ANOVA including both channels. A negative sign indicates that a difference was counter to the stated order (e.g., more activation for macaque calls than human emotional sounds in the newborns). Empty cells were not reported; those with (–) involve stimuli not presented to that population. The channels in [11] have been renamed a through c here for ease of reference. LS = lateral sulcus, STS = superior temporal sulcus, STG = superior temporal gyrus, F = frontal regions, IFG = inferior frontal gyrus. All activations in adult monkeys and humans were present in both hemispheres, with various degrees of dominance (not represented here).
Figure 5. Approximate localization of channels in the 4-month-old study (in blue) overlaid on our shallow (in green) and deep (in pink) channels.
Comparison of amount of data between the newborn results reported here, and previous comparable infant work.
| First author | Sample size | Min trial N | Max trial N | Max Tot |
| Shultz* | 24 | 2 | 5 | 120 |
| Arimitsu | 17 | 7 | 7 | 119 |
| Gervain (exp. 1) | 22 | | 14 | 308 |
| Gervain (exp. 2) | 22 | | 14 | 308 |
| Kotilahti | 13 | | 64 | 832 |
| May | 20 | | 7 | 140 |
| Nishida | 10 | 5 | 10 | 100 |
| Peña | 14 | | 10 | 140 |
| Sato | 17 | 2 | 6 | 102 |
| Shukla | 25 | 10 | 30 | 750 |
| Minagawa | 28 | 10 | 24 | 672 |
| Present study | 39 | 4 | 8 | 312 |
Min/Max trial N stands for the minimum and maximum number of stimulation blocks or trials. Max Tot is calculated by multiplying the sample size by the maximum number of trials. White cells indicate unavailable information. Except for Shultz*, all studies focused on newborns and used fNIRS (see main text for details).
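As a quick consistency check on the Max Tot definition in this legend (sample size multiplied by the maximum number of trials), the table values can be recomputed. The sketch below simply transcribes the rows above; the dictionary layout and the function name `max_total` are illustrative choices, not part of the original work.

```python
# (sample size, max trial N, reported Max Tot), transcribed from the table above
studies = {
    "Shultz": (24, 5, 120),
    "Arimitsu": (17, 7, 119),
    "Gervain (exp. 1)": (22, 14, 308),
    "Gervain (exp. 2)": (22, 14, 308),
    "Kotilahti": (13, 64, 832),
    "May": (20, 7, 140),
    "Nishida": (10, 10, 100),
    "Peña": (14, 10, 140),
    "Sato": (17, 6, 102),
    "Shukla": (25, 30, 750),
    "Minagawa": (28, 24, 672),
    "Present study": (39, 8, 312),
}

def max_total(sample_size, max_trials):
    """Max Tot as defined in the legend: sample size times maximum trial count."""
    return sample_size * max_trials

# Collect any row whose reported Max Tot disagrees with the recomputed value.
mismatches = [
    name for name, (n, max_trials, reported) in studies.items()
    if max_total(n, max_trials) != reported
]
```

An empty `mismatches` list means every reported Max Tot agrees with the stated rule, which also confirms that the single trial count given for some studies is the maximum rather than the minimum.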
Comparison between the newborn results reported here, and previous newborn selectivity results (i.e., dissimilarity in strength of responses across conditions).
| First author | Condition 1 | Condition 2 | Localization | Effect Size |
| Arimitsu | phonemic change | no phonemic change | left temporal | 1.323 |
| Gervain (exp. 1) | ABB (syllables) | ABC (syllables) | left temporal | 0.814 |
| Shukla | full sentences | backwards sentences | left temporal | 0.575 |
| Minagawa | fast-changing tones | slow-changing tones | right perisylvian | 0.582 |
| Present study | brief phrases | scrambled | right perisylvian | −0.344 |
Localization and channel number provide information on the one channel whose data was used for the effect size. In all cases, the effect size is Cohen's d indicating selectivity of responses, in the sense of greater reaction to Condition 1 than Condition 2. For comparability with the previous research, the effect size for our study comes from the native speech versus scrambled controls. Please see main text for details on how effect sizes were selected and calculated.
Comparison between the newborn results reported here, and previous newborn speech sensitivity results (i.e., strength of responses).
| First author | Stimulation | Localization | Channel number | Effect Size |
| Kotilahti | full sentences | left | max in 8 | 1.184 |
| May | LPF full sentences | left | average of 6 | 0.242 |
| Nishida | full sentences | left perisylvian | average of 7 | 1.153 |
| Minagawa | fast-changing tones | left temporal | 1 | 0.979 |
| Present study | brief phrases | left temporal | 1 | 0.419 |
Localization and channel number provide information on the channel(s) whose data was used for the effect size. In all cases, the effect size is Cohen's d indicating the size of the response to speech against a baseline of silence. LPF stands for low-pass filtered. Please see main text for details on how effect sizes were selected and calculated.
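For readers who want to relate these effect sizes to the GLM tables, one conventional way to obtain a one-sample Cohen's d is to divide the mean response by its standard deviation, recovering the standard deviation from the standard error as SD = SE × √N. Whether the authors computed their value exactly this way is an assumption here; the main text is the authority. The sketch applies the formula to the values listed for left channel 7 under Native speech (β = 0.022, SE β = 0.009, N = 34); the function name is hypothetical.

```python
import math

def cohens_d_one_sample(beta, se_beta, n):
    """One-sample Cohen's d: mean divided by SD, with SD recovered as SE * sqrt(N)."""
    sd = se_beta * math.sqrt(n)
    return beta / sd

# Values from the Native speech row, left channel 7, of the per-condition table.
d_native_left7 = cohens_d_one_sample(beta=0.022, se_beta=0.009, n=34)
d_rounded = round(d_native_left7, 3)
```

With these rounded inputs the result rounds to 0.419, the same value listed for the present study in the table above, which suggests (but does not prove) that this channel and this formula underlie that entry.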
Comparison of procedure between the newborn results reported here, and previous comparable infant work.
| First author | Duration | Min rest | Max rest |
| Shultz* | 20 | 12 | |
| Arimitsu | 15 | 15 | |
| Gervain (exp. 1) | 18 | 25 | 35 |
| Gervain (exp. 2) | 18 | 25 | 35 |
| Kotilahti | 5 | 15 | |
| May | 19 | 25 | 35 |
| Nishida | 15 | ||
| Peña | 15 | 25 | 35 |
| Sato | 10 | 20 | 30 |
| Shukla | 15 | 25 | 35 |
| Present study | 10 | 8 | 16 |
Duration is the average stimulation duration. Min and Max rest indicate the minimum and maximum duration of the silence following a block. When only the minimum rest duration is noted, only the average, and not the range, was reported. The other white cells indicate unavailable information. Except for Shultz*, all studies focused on newborns and used fNIRS (see main text for details).