Cyril R Pernet, Phil McAleer, Marianne Latinus, Krzysztof J Gorgolewski, Ian Charest, Patricia E G Bestelmeyer, Rebecca H Watson, David Fleming, Frances Crabbe, Mitchell Valdes-Sosa, Pascal Belin.
Abstract
fMRI studies increasingly examine functions and properties of non-primary areas of human auditory cortex. However, there is currently no standardized localization procedure to reliably identify specific areas across individuals, comparable to the standard 'localizers' available in the visual domain. Here we present an fMRI 'voice localizer' scan allowing rapid and reliable localization of the voice-sensitive 'temporal voice areas' (TVA) of human auditory cortex. We describe results obtained using this standardized localizer scan in a large cohort of normal adult subjects. Most participants (94%) showed bilateral patches of significantly greater response to vocal than non-vocal sounds along the superior temporal sulcus/gyrus (STS/STG). Individual activation patterns, although reproducible, showed high inter-individual variability in precise anatomical location. Cluster analysis of individual peaks from the large cohort highlighted three bilateral clusters of voice-sensitivity, or "voice patches", along posterior (TVAp), mid (TVAm) and anterior (TVAa) STS/STG, respectively. A series of extra-temporal areas including bilateral inferior prefrontal cortex and amygdalae showed small but reliable voice-sensitivity as part of a large-scale cerebral voice network. Stimuli for the voice localizer scan and probabilistic maps in MNI space are available for download.
Keywords: Amygdala; Auditory cortex; Functional magnetic resonance imaging; Inferior prefrontal cortex; Superior temporal gyrus; Superior temporal sulcus; Voice
Year: 2015 PMID: 26116964 PMCID: PMC4768083 DOI: 10.1016/j.neuroimage.2015.06.050
Source DB: PubMed Journal: Neuroimage ISSN: 1053-8119 Impact factor: 6.556
Fig. 1 Voice localizer design.
The top panel shows the first 7 blocks of the design, with spectrograms (upper part; x-axis: time, y-axis: frequency, 0–11.025 kHz) and waveforms (lower part) of the 8-s blocks of non-vocal sounds, vocal sounds and silence. The first block starts 2 s after experiment onset, which leaves a 2-s scanning period without stimulation for sparse sampling designs (TR = 10 s); the design is also suitable for continuous scanning (TR = 2 s). The bottom panel shows the full design over time (voice in blue, non-voice in green) with the (non-convolved) blocks indicated. The right-hand side shows the random-effects results (FWE 5%) for voice and non-voice stimuli separately, which are then contrasted to reveal the TVA.
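The block timing described in this caption can be sketched as follows. This is a minimal illustration, not the authors' stimulus script: it assumes that each 8-s block is followed by a 2-s gap (so that blocks repeat every 10 s, matching the sparse-sampling TR), and the function name `block_onsets` and the block order in the example are hypothetical.

```python
def block_onsets(labels, first_onset=2.0, period=10.0, duration=8.0):
    """Return (label, onset_s, offset_s) for each block.

    Assumptions (not stated explicitly in the caption): blocks repeat
    every `period` seconds, i.e. an 8-s block plus a 2-s gap.
    """
    schedule = []
    for i, label in enumerate(labels):
        onset = first_onset + i * period
        schedule.append((label, onset, onset + duration))
    return schedule


# Hypothetical block order, for illustration only.
example = block_onsets(["non-vocal", "vocal", "silence"])
for label, onset, offset in example:
    print(f"{label}: {onset:.0f}-{offset:.0f} s")
```

Under these assumptions the first block occupies 2–10 s, leaving the first 2 s free for image acquisition in a sparse design.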
Stimuli characteristics. The amplitude range corresponds to the distance between the lowest and highest peaks in the time domain. The frequency peak indicates where the maximum energy was located in the frequency spectrum.
| | Durations (s) | Nb of stimuli | Amplitude range (dB) | Frequency peak (Hz) |
|---|---|---|---|---|
| Vocal sounds | 8 s blocks | 1 blocks of 3 | −5 | 660 |
| Emotionally neutral | 1.8 s | 39 | −5 | 659 |
| Emotionally loaded | 2 s | 27 | −3 | 859 |
| Speech | 0.65 s | 31 | −7 | 464 |
| Animals | 1.5 s | 29 | −6 | 1368 |
| Natural | 1.75 s | 18 | −5 | 397 |
| Man-made | 1.4 s | 41 | −6 | 1528 |
| Music | 1.55 s | 10 | −6 | 421 |
Fig. 2 Random effects analysis in 218 individuals.
The top panel shows the whole-brain voxels at which a t-test of the vocal vs. non-vocal difference in BOLD parameter estimates across the n = 218 subjects yields significant values (p < 0.05, FWE corrected), together with the corresponding functional connectivity over the 18 selected ROIs. Data are projected (i) on the inflated cortical mesh surfaces created using Freesurfer version 4.0.1 from an average of 27 T1 scans of the same subject, and available within SPM, and (ii) on slices of the 152 EPI templates. Note the large undifferentiated cluster of significant voxels in the temporal lobes without clear maxima, and the involvement of many extra-temporal structures. The bottom panel shows data from 6 subjects (3 atypical and 3 typical) with the TVA from the RFX analysis outlined.
RFX results. The table shows (i) MNI coordinates of local maxima (peaks separated by more than 8 mm, 3 peaks max listed per cluster), (ii) the corresponding t-value of the vocal > non-vocal contrast at that location (height threshold t(1, 217) = 4.79, p < 0.05 FWE corrected), (iii) the cluster size, (iv) the anatomical regions within the clusters, and (v) percentage signal change with bootstrap 95% confidence intervals for each ROI: TVA anterior, mid, and posterior (33 voxels each), the IFG ventral (left: 231 voxels, right: 71 voxels) and medial (left: 515 voxels, right: 1039 voxels), the precentral gyrus (left: 106 voxels, right: 393 voxels), the thalamus (left: 108 voxels, right: 227 voxels), amygdalae (left: 96 voxels, right: 95 voxels) and olivary nuclei (left: 67 voxels, right: 92 voxels).
| MNI coordinates (x y z) | t-value | Cluster size | Labeling | Mean PSC difference and 95% CI |
|---|---|---|---|---|
| −60 −12 2 | 25.06 | 4256 | Left superior temporal | TVAa 0.0039 [0.0036 0.0043] |
| −62 −22 4 | 23.64 | | | |
| −56 2 −10 | 18.80 | | | |
| 60 −14 0 | 25.24 | 4802 | Right superior temporal | TVAa 0.0035 [0.0031 0.0038] |
| 60 −26 0 | 24.08 | | | |
| 60 0 −4 | 21.24 | | | |
| −50 −6 46 | 11.30 | 108 | Left precentral gyrus | 0.0011 [0.0009 0.0014] |
| −40 28 −2 | 10 | 233 | Left inferior frontal gyrus ventral | 0.0011 [0.0008 0.0014] |
| 20 −8 −12 | 9.85 | 84 | Right amygdala | 0.0009 [0.0007 0.001] |
| 26 0 −18 | 5.73 | | | |
| −46 14 24 | 9.60 | 500 | Left inferior frontal gyrus medial | 0.0013 [0.001 0.0016] |
| 12 −14 8 | 8.50 | 419 | Left thalamus | 0.00072 [0.0005 0.0009] |
| 14 −4 10 | 6.56 | | | |
| 8 −4 2 | 6.48 | | | |
| 14 −26 −6 | 8.20 | 92 | Right pons (olivary nucleus) | 0.001 [0.0007 0.0013] |
| 6 −32 0 | 5.53 | | | |
| −14 −26 −6 | 6.82 | 65 | Left pons (olivary nucleus) | 0.0008 [0.0005 0.001] |
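The bootstrap 95% confidence intervals reported for the mean percentage signal change (PSC) difference can be illustrated with a percentile bootstrap. This is a minimal sketch under stated assumptions: the record does not say which bootstrap variant was used, the function name `bootstrap_ci_mean` is hypothetical, and the per-subject values are simulated rather than taken from the study.

```python
import random
import statistics


def bootstrap_ci_mean(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean.

    Resamples subjects with replacement, recomputes the mean each time,
    and reads the interval off the sorted bootstrap means.
    """
    rng = random.Random(seed)
    n = len(values)
    means = []
    for _ in range(n_boot):
        resample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    lo_idx = max(0, round((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, round((1 - alpha / 2) * n_boot) - 1)
    return statistics.fmean(values), means[lo_idx], means[hi_idx]


# Simulated per-subject PSC differences (hypothetical values, n = 218).
sim = random.Random(1)
psc = [sim.gauss(0.004, 0.003) for _ in range(218)]
mean_psc, ci_low, ci_high = bootstrap_ci_mean(psc)
print(f"mean = {mean_psc:.4f}, 95% CI = [{ci_low:.4f} {ci_high:.4f}]")
```

Resampling whole subjects keeps the interval tied to between-subject variability, which is the relevant source of uncertainty for a group-level mean.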
Fig. 3 Probability maps of the TVA.
The top panel illustrates the thresholding procedure: starting from the GLM output, the SPM-T map is thresholded using a Gaussian–Gamma mixture model (the best model is selected using the Bayesian Information Criterion; BIC), yielding a thresholded uncorrected map (Uncorr-T). This map is then corrected for multiple comparisons using topological FDR, as implemented in SPM, yielding a corrected map (corr-T). The average of all these maps across subjects forms the probability map shown below. The middle and bottom panels show the probability of activations at the individual level. Data are projected (i) on the inflated cortical mesh surfaces created using Freesurfer version 4.0.1 from an average of 27 T1 scans of the same subject, and available within SPM, and (ii) on slices of the 152 EPI templates.
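The final averaging step can be sketched as below. The example assumes that each subject's corrected map has been binarized (1 = voxel survives thresholding, 0 = it does not) and flattened to a list of voxels; both choices, and the name `probability_map`, are assumptions made for illustration. The mixture-model thresholding and topological FDR steps are not reproduced here.

```python
def probability_map(binary_maps):
    """Average binarized per-subject maps voxel by voxel.

    Each entry of the result is the fraction of subjects whose
    thresholded map includes that voxel.
    """
    n_subjects = len(binary_maps)
    n_voxels = len(binary_maps[0])
    if any(len(m) != n_voxels for m in binary_maps):
        raise ValueError("all maps must have the same number of voxels")
    return [
        sum(m[v] for m in binary_maps) / n_subjects
        for v in range(n_voxels)
    ]


# Toy data: 4 hypothetical subjects, 5 voxels each.
maps = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
prob = probability_map(maps)
print(prob)  # [1.0, 0.5, 0.0, 0.25, 0.0]
```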
Fig. 4 Cluster analysis.
Shown on a 3D template of the gray/white interface is the density map representing the number of individual voice > non-voice activation peaks within 3-mm disks on the cortical surface (cf. Methods). The density map reveals three main clusters of voice sensitivity in each hemisphere along a voice-sensitive zone of cortex extending from posterior STS to mid-STS/STG to anterior STG. The cluster with the greatest peak density is in right pSTS, consistent with individual images (cf. Fig. 2).
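The idea behind the density map, counting individual peaks that fall near each location, can be sketched as follows. This is a simplification: the caption describes 3-mm disks on the cortical surface, whereas the sketch uses straight-line (Euclidean) distance in MNI space and treats 3 mm as a radius. The function name `peak_density` and the individual peak coordinates are hypothetical; the two centers are the right TVAp and TVAm coordinates from the table of voice patches.

```python
import math


def peak_density(peaks, centers, radius_mm=3.0):
    """Count, for each center, the peaks within `radius_mm` of it.

    Uses Euclidean distance in MNI space as a stand-in for distance
    measured along the cortical surface.
    """
    return [
        sum(1 for p in peaks if math.dist(p, c) <= radius_mm)
        for c in centers
    ]


# Hypothetical individual peaks (MNI x, y, z in mm), for illustration only.
peaks = [
    (42, -35, 3),
    (43, -36, 4),
    (44, -35, 3),
    (53, -18, -3),
    (60, -14, 0),
]
centers = [(42, -35, 3), (53, -18, -3)]  # right TVAp, right TVAm
density = peak_density(peaks, centers)
print(density)  # [3, 1]
```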
MNI coordinates of TVA peak clusters or ‘voice patches’ observed in the cluster analysis (Fig. 4).
| MNI coordinates (x y z) | Voice patch | Labeling |
|---|---|---|
| 42 −35 3 | Right TVAp | Right middle/posterior superior temporal gyrus |
| 53 −18 −3 | Right TVAm | Middle superior temporal sulcus/gyrus |
| 55 −2 −7 | Right TVAa | Anterior superior temporal sulcus |
| −46 −38 2 | Left TVAp | Middle/posterior superior temporal gyrus |
| −55 −18 −3 | Left TVAm | Middle superior temporal gyrus |
| −55 −8 −3 | Left TVAa | Anterior superior temporal sulcus |
Fig. 5 Test–retest reliability.
Intra-class correlation (ICC) coefficients obtained from the test–retest analysis of ten individual subjects, projected onto the standard inflated cortical surface generated by Caret (Van Essen et al., 2001) and displayed as a color map without (top) and with (bottom) thresholding. Note the very high reliability of voice-sensitive activity along the STS bilaterally.
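For readers unfamiliar with the ICC, the sketch below computes one common variant at a single voxel from a subjects × sessions table. The caption does not state which ICC form was used, so the choice of the two-way, consistency, single-measure form (often written ICC(3,1)) is an assumption, as are the function name `icc_consistency` and the toy values.

```python
def icc_consistency(scores):
    """ICC(3,1): two-way model, consistency, single measures.

    `scores` is a list of rows, one per subject, each holding that
    subject's value in every session.
    """
    n = len(scores)        # subjects
    k = len(scores[0])     # sessions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Toy test-retest data: session 2 equals session 1 plus a constant shift,
# so subjects keep their rank and spacing and consistency is perfect.
example_icc = icc_consistency([[1, 2], [2, 3], [3, 4]])
print(round(example_icc, 6))  # 1.0
```

A consistency-type ICC ignores a uniform shift between sessions; an absolute-agreement form would penalize it.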