Christian O Häusler1,2, Simon B Eickhoff3,4, Michael Hanke3,4.
Abstract
The "parahippocampal place area" (PPA) in the human ventral visual stream exhibits increased hemodynamic activity correlated with the perception of landscape photos compared to faces or objects. Here, we investigate the perception of scene-related, spatial information embedded in two naturalistic stimuli. The same 14 participants watched a Hollywood movie and listened to its audio-description as part of the open-data resource studyforrest.org. We model hemodynamic activity based on annotations of selected stimulus features, and compare results to a block-design visual localizer. On a group level, increased activation correlating with visual spatial information occurring in the movie overlaps with a traditionally localized PPA. Activation correlating with semantic spatial information occurring in the audio-description is more restricted to the anterior PPA. On an individual level, we find significant bilateral activity in the PPA of nine individuals and unilateral activity in one individual. Results suggest that activation in the PPA generalizes to spatial information embedded in a movie and an auditory narrative, and may call for considering a functional subdivision of the PPA.
Year: 2022 PMID: 35365659 PMCID: PMC8975992 DOI: 10.1038/s41597-022-01250-4
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 8.501
Clusters (Z-threshold Z > 3.4; p < 0.05, cluster-corrected) of the primary t-contrast for the audio-visual movie comparing cuts to a setting depicted for the first time with cuts within a recurring setting (vse_new > vpe_old), sorted by size.
| Voxels | p (cluster-corrected) | Z-max | Max X (MNI) | Max Y (MNI) | Max Z (MNI) | CoG X (MNI) | CoG Y (MNI) | CoG Z (MNI) | Structure |
|---|---|---|---|---|---|---|---|---|---|
| 3003 | <0.00001 | 5.31 | 22.5 | −45.5 | −12 | 4.53 | −63.3 | −3.7 | r. lingual g.; r. cuneal c., intracalcarine c., bilaterally occipital fusiform g., precuneus, temporal fusiform c., posterior parahippocampal c. |
| 154 | <0.00001 | 4.46 | −35 | −83 | 28 | −32.8 | −86.2 | 21.4 | l. superior lateral occipital c. |
| 121 | <0.00001 | 4.65 | 25 | −80.5 | 25.5 | 23.7 | −83.8 | 25.4 | r. superior lateral occipital c. |
The first brain structure given contains the voxel with the maximum Z-value, followed by brain structures from posterior to anterior, and partially covered areas (l.: left; r.: right; c.: cortex; g.: gyrus; CoG: Center of Gravity).
Fig. 1 Mixed-effects group-level (N = 14) clusters (Z > 3.4; p < 0.05, cluster-corrected) of activity correlated with the processing of spatial information. The results of the audio-description’s primary t-contrast (blue), which compares geometry-related nouns spoken by the narrator to non-spatial nouns (geo, groom > all non-spatial categories), are overlaid on the movie’s primary t-contrast (red), which compares cuts to a setting depicted for the first time to cuts within a recurring setting (vse_new > vpe_old). (a) Results as brain slices on top of the MNI152 T1-weighted head template, with the acquisition field-of-view for the audio-description study highlighted. For comparison, the black outline depicts the union of the individual PPA localizations reported by Sengupta et al.[24], spatially smoothed with a Gaussian kernel with a full width at half maximum (FWHM) of 2.0 mm. (b) Results projected onto the reconstructed surface of the MNI152 T1-weighted brain template. After projection, the union of individual PPA localizations was spatially smoothed by a Gaussian kernel with FWHM of 2.0 mm.
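The caption specifies the smoothing kernel by its FWHM, whereas many image-processing routines expect the Gaussian's standard deviation. The following minimal sketch shows the standard conversion, sigma = FWHM / (2 * sqrt(2 * ln 2)); the function name `fwhm_to_sigma` is hypothetical and not taken from the study's analysis code.

```python
import math


def fwhm_to_sigma(fwhm_mm: float) -> float:
    """Convert a Gaussian kernel's FWHM to its standard deviation (same units)."""
    # For a Gaussian, FWHM = 2 * sqrt(2 * ln 2) * sigma.
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


# The 2.0 mm FWHM kernel named in the caption corresponds to roughly 0.85 mm sigma.
sigma_mm = fwhm_to_sigma(2.0)
print(round(sigma_mm, 3))  # prints 0.849
```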
Clusters (Z-threshold Z > 3.4; p < 0.05, cluster-corrected) of the primary t-contrast for the audio-description comparing geometry-related nouns to non-spatial nouns spoken by the audio-description’s narrator (geo, groom > all non-geo), sorted by size.
| Voxels | p (cluster-corrected) | Z-max | Max X (MNI) | Max Y (MNI) | Max Z (MNI) | CoG X (MNI) | CoG Y (MNI) | CoG Z (MNI) | Structure |
|---|---|---|---|---|---|---|---|---|---|
| 188 | <0.00001 | 4.48 | −17.5 | −65.5 | 25.5 | −14.7 | −59.1 | 15.2 | l. precuneus |
| 164 | <0.00001 | 4.47 | 17.5 | −58 | 23 | 15.6 | −55.6 | 16 | r. precuneus |
| 83 | 0.00013 | 4.48 | 27.5 | −43 | −17 | 27.2 | −41.1 | −14 | r. occipito-temporal fusiform c.; posterior parahippocampal g. |
| 73 | 0.00031 | 3.93 | −22.5 | −43 | −12 | −23.9 | −43.6 | −11.2 | l. lingual g.; occipito-temporal fusiform g., posterior parahippocampal c. |
| 63 | 0.00082 | 4.1 | 40 | −75.5 | 30.5 | 40.9 | −76.3 | 28.6 | r. superior lateral occipital c. |
| 37 | 0.0129 | 4.24 | −37.5 | −78 | 33 | −38.4 | −79.5 | 28.9 | l. superior lateral occipital c. |
The first brain structure given contains the voxel with the maximum Z-value, followed by brain structures from posterior to anterior, and partially covered areas (l.: left; r.: right; c.: cortex; g.: gyrus; CoG: Center of Gravity).
Fig. 2 Fixed-effects individual-level GLM results (Z > 3.4; p < 0.05, cluster-corrected). Individual brains are aligned via non-linear transformation to a study-specific T2* group template that is co-registered to the MNI152 template with an affine transformation (12 degrees of freedom). The results of the audio-description’s primary t-contrast (blue), which compares geometry-related nouns to non-geometry-related nouns spoken by the narrator (geo, groom > all non-geo), are overlaid on the movie’s primary t-contrast (red), which compares cuts to a setting depicted for the first time with cuts within a recurring setting (vse_new > vpe_old). Black: outline of participant-specific PPA(s) reported by Sengupta et al.[24]. Light gray: the audio-description’s field of view[23]. To facilitate comparisons across participants, we chose the same horizontal slice (x = −11) for all participants, as this slice depicts voxels of significant clusters in almost all participants. The figure does not show voxels of the left cluster of the movie stimulus in sub-09 and sub-18, or voxels of the right cluster of the movie stimulus in sub-15.
Categories and criteria to categorize the nouns spoken by the audio-description’s narrator.
| Category | Criteria | Examples |
|---|---|---|
| body | trunk of the body; possibly clothed | back, hip, shoulder; jacket, dress, shirt |
| bodypart | limbs | arm, finger, leg, toe |
| face | face or parts of it | face, ear, nose, mouth |
| female | female person | nurse, mother, woman |
| females | female persons | women |
| fname | female name | Jenny |
| furniture | movable furniture (insides & outsides) | bench, bed, table, chair |
| geo | immobile landmarks | building, tree, street, alley, meadow, cornfield |
| groom | rooms & locales, or geometry-defining elements | living room; wall, door, window, floor |
| head | non-face parts of the head; worn headgear | head, hair, ear, neck; helmet |
| male | male person | man, father, soldier |
| males | male persons | boys, opponents |
| mname | male name | Bubba, Kennedy |
| object | countable entity with firm boundaries | telephone, car |
| objects | countable entities | wheels, plants |
| persons | concrete persons of unknown sex | hippies, patients |
| setting_new | a setting occurring for the first time | on a “bridge”, on an “alley”, on “campus” |
| setting_rec | a recurring setting | at the “bus stop” |
Examples are given in English. Some of these initial 18 noun categories were pooled, resulting in 11 event categories that served as the basis for building the regressors of the GLM (see Table 3).
Fig. 3 Bland-Altman plots for individual participants. The x-axes show the means of two spatially corresponding voxels in the unthresholded Z-map of the audio-description’s primary contrast and the unthresholded Z-map of the visual localizer (KDE plot on the top). The y-axes show the difference of the two voxels (localizer minus audio-description; KDE plot on the right). The overlays depict voxels spatially constrained to the temporal and occipital cortex (gray; based on the probabilistic Jülich Histological Atlas[74,75]), PPA overlap of all participants (blue), and individual PPA(s) (red).
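The two axes of a Bland-Altman plot can be illustrated with a minimal sketch. It assumes the two Z-maps are already flattened into lists of spatially corresponding voxel values; the function name `bland_altman_coordinates` and the toy values are hypothetical and not data from the study.

```python
def bland_altman_coordinates(localizer_z, audio_z):
    """Return (means, differences) for pairs of spatially corresponding voxels.

    x-axis: mean of the two Z-values of each voxel pair.
    y-axis: difference, localizer minus audio-description (as in the caption).
    """
    if len(localizer_z) != len(audio_z):
        raise ValueError("both maps must contain the same number of voxels")
    means = [(a + b) / 2.0 for a, b in zip(localizer_z, audio_z)]
    diffs = [a - b for a, b in zip(localizer_z, audio_z)]
    return means, diffs


# Toy Z-values for three voxels (illustrative only).
localizer = [4.0, 2.0, -1.0]
audio = [2.0, 2.0, 1.0]
means, diffs = bland_altman_coordinates(localizer, audio)
print(means)  # prints [3.0, 2.0, 0.0]
print(diffs)  # prints [2.0, 0.0, -2.0]
```

Voxels for which both maps agree lie near y = 0; a systematic offset in the differences would indicate that one map yields higher Z-values overall.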
Overview of event categories of the audio-visual movie and the audio-description.
| Label | Description | All | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|---|
| vse_new | change of the camera position to a setting not depicted before | 96 | 11 | 14 | 17 | 4 | 17 | 9 | 21 | 3 |
| vse_old | change of the camera position to a recurring setting | 90 | 7 | 11 | 3 | 7 | 7 | 23 | 15 | 17 |
| vlo_ch | change of the camera position to another locale within the same setting | 89 | 10 | 31 | 2 | 23 | 0 | 4 | 18 | 1 |
| vpe_new | change of the camera position within a locale not depicted before | 386 | 31 | 38 | 72 | 90 | 89 | 33 | 24 | 9 |
| vpe_old | change of the camera position within a recurring locale | 208 | 25 | 61 | 0 | 13 | 1 | 32 | 29 | 47 |
| vno_cut | frames within a continuous movie shot | 148 | 30 | 13 | 0 | 21 | 15 | 27 | 9 | 17 |
| fg_av_ger_lr | left-right luminance difference | 180k | 22k | 22k | 22k | 24k | 23k | 22k | 27k | 16k |
| fg_av_ger_lrdiff | left-right volume difference | 180k | 22k | 22k | 22k | 24k | 23k | 22k | 27k | 16k |
| fg_av_ger_ml | mean luminance | 180k | 22k | 22k | 22k | 24k | 23k | 22k | 27k | 16k |
| fg_av_ger_pd | perceptual difference | 180k | 22k | 22k | 22k | 24k | 23k | 22k | 27k | 16k |
| fg_av_ger_rms | root mean square volume | 180k | 22k | 22k | 22k | 24k | 23k | 22k | 27k | 16k |
| fg_av_ger_ud | upper-lower luminance difference | 180k | 22k | 22k | 22k | 24k | 23k | 22k | 27k | 16k |
| body | trunk of the body; overlaid clothes | 66 | 6 | 12 | 7 | 12 | 2 | 9 | 13 | 5 |
| bpart | limbs and trousers | 69 | 9 | 8 | 6 | 13 | 5 | 7 | 11 | 10 |
| fahead | face or head (parts) | 83 | 12 | 11 | 10 | 5 | 9 | 13 | 12 | 11 |
| furn | moveable furniture (insides & outsides) | 50 | 8 | 5 | 2 | 5 | 7 | 10 | 7 | 6 |
| geo | immobile landmarks | 125 | 16 | 17 | 11 | 32 | 0 | 15 | 18 | 16 |
| groom | rooms & locales or geometry-defining elements | 105 | 12 | 11 | 8 | 5 | 8 | 25 | 28 | 8 |
| object | countable entities with firm boundaries | 284 | 39 | 34 | 27 | 44 | 29 | 42 | 32 | 37 |
| se_new | a setting occurring for the first time | 86 | 11 | 15 | 12 | 4 | 15 | 10 | 16 | 3 |
| se_old | a recurring setting | 37 | 2 | 5 | 1 | 4 | 2 | 9 | 8 | 6 |
| sex_f | female person(s), name | 108 | 14 | 22 | 6 | 6 | 13 | 10 | 23 | 14 |
| sex_m | male person(s), name | 403 | 41 | 68 | 38 | 102 | 45 | 42 | 42 | 25 |
| fg_ad_lrdiff | left-right volume difference | 180k | 22k | 22k | 21k | 24k | 23k | 21k | 27k | 16k |
| fg_ad_rms | root mean square volume | 180k | 22k | 22k | 21k | 24k | 23k | 21k | 27k | 16k |
Event categories of the movie are based on an annotation of cuts and depicted locations. Event categories of the audio-description are based on an annotation of nouns spoken by the audio-description’s narrator (see Table 4). Some of the audio-description’s event categories listed here (sex_f; sex_m; fahead, object) were created by pooling some categories of the original annotation of nouns (female, females, fname; male, males, mname; face, head; object, objects). Respective event counts are given for the whole stimulus (All) and the segments that were used for the eight sessions of fMRI scanning. Event counts for frame-based features are reported in units of a thousand.
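The pooling described above can be sketched as a simple lookup. The mapping covers only the pooled categories named in the caption; the pass-through for all other labels is an assumption made for illustration (it does not reproduce label renames such as those between the two tables), and the names `POOLING` and `pool_category` are hypothetical.

```python
from collections import Counter

# Pooled categories as listed in the caption: original noun category -> event category.
POOLING = {
    "female": "sex_f", "females": "sex_f", "fname": "sex_f",
    "male": "sex_m", "males": "sex_m", "mname": "sex_m",
    "face": "fahead", "head": "fahead",
    "object": "object", "objects": "object",
}


def pool_category(noun_category: str) -> str:
    """Map an annotated noun category to its pooled event category.

    Assumption: categories not mentioned in the caption pass through unchanged.
    """
    return POOLING.get(noun_category, noun_category)


# Toy annotation sequence (illustrative only, not taken from the stimulus).
annotated = ["female", "fname", "males", "head", "geo"]
counts = Counter(pool_category(c) for c in annotated)
print(dict(counts))  # prints {'sex_f': 2, 'sex_m': 1, 'fahead': 1, 'geo': 1}
```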
Fig. 4 Pearson correlation coefficients of model response time series used as regressors in the GLM analysis of the audio-description (blue; see Table 3 for a description) and audio-visual movie (red; see Table 3). Values are rounded to the nearest tenth. The correlation between the two stimuli’s root mean square volume and between their left-right difference in volume yielded the highest correlation values (fg_ad_rms and fg_av_ger_rms, r = 0.7635; fg_ad_lrdiff and fg_av_ger_lrdiff, r = 0.7749).
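For reference, the Pearson correlation between two regressor time series can be computed as in the following minimal sketch. It uses only the standard library; the function name `pearson_r` and the toy series are hypothetical and do not reproduce the study's regressors.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long time series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length with at least 2 samples")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy series: the second is a linear function of the first, the third is reversed.
series_a = [0.0, 1.0, 2.0, 3.0]
series_b = [1.0, 3.0, 5.0, 7.0]
series_c = [3.0, 2.0, 1.0, 0.0]
print(round(pearson_r(series_a, series_b), 1))  # prints 1.0
print(round(pearson_r(series_a, series_c), 1))  # prints -1.0
```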
Computed contrasts for the analysis of the movie and the audio description, and their respective purpose.
| Nr. | Contrast | Purpose |
|---|---|---|
| 1* | vse_new > vpe_old | spatial processing |
| 2 | vse_new, vpe_new > vse_old, vpe_old | spatial processing |
| 3 | vse_new > vse_old | spatial processing |
| 4 | vse_new > vse_old, vpe_old | spatial processing |
| 5 | vse_new, vpe_new > vpe_old | spatial processing |
| 6 | vno_cut > vse_new | control |
| 7 | vno_cut > vse_old | control |
| 8 | vno_cut > vse_new, vse_old | control |
| 9 | vno_cut > vpe_new, vpe_old | control |
| 10 | se_new > se_old | control (absent narrator) |
| 1* | geo, groom > non-spatial | spatial processing |
| 2 | geo, groom, se_new > non-spatial | spatial processing |
| 3 | groom, se_new, se_old > non-spatial | spatial processing |
| 4 | geo > non-spatial | spatial processing |
| 5 | groom > non-spatial | spatial processing |
| 6 | se_new > non-spatial | spatial processing |
| 7 | se_new, se_old > non-spatial | spatial processing |
| 8 | se_new > se_old | spatial processing |
| 9 | vse_new > vpe_old | control (absent visual cuts) |
| 10 | vse_new, vpe_new > vse_old, vpe_old | control (absent visual cuts) |
| 11 | vse_new > vse_old | control (absent visual cuts) |
| 12 | vse_new > vse_old, vpe_old | control (absent visual cuts) |
| 13 | vse_new, vpe_new > vpe_old | control (absent visual cuts) |
The primary contrasts are marked with an asterisk. non-spatial refers to the event categories body, bodypart, fahead, object, sex_f, sex_m. An explanation of all event categories can be found in Table 3.
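A contrast such as "geo, groom > non-spatial" can be expressed as a weight vector over the GLM regressors. The sketch below is illustrative only: the regressor order and the balanced weighting (each side scaled so that positive weights sum to 1 and negative weights sum to −1) are assumptions, not details confirmed by the source, and `build_contrast` is a hypothetical helper. The label `bpart` follows Table 3 for the category that the note above calls bodypart.

```python
# Audio-description event categories from Table 3 (order is an assumption).
REGRESSORS = [
    "body", "bpart", "fahead", "furn", "geo", "groom",
    "object", "se_new", "se_old", "sex_f", "sex_m",
]

# Non-spatial categories as listed in the table note.
NON_SPATIAL = ["body", "bpart", "fahead", "object", "sex_f", "sex_m"]


def build_contrast(positive, negative, regressors=REGRESSORS):
    """Return a dict of contrast weights for 'positive > negative'.

    Assumption: each side is averaged, so the weights sum to zero.
    Regressors on neither side receive a weight of 0.
    """
    weights = {name: 0.0 for name in regressors}
    for name in positive:
        weights[name] = 1.0 / len(positive)
    for name in negative:
        weights[name] = -1.0 / len(negative)
    return weights


# Primary audio-description contrast: geo, groom > non-spatial.
primary = build_contrast(["geo", "groom"], NON_SPATIAL)
print(primary["geo"], primary["groom"], primary["furn"])  # prints 0.5 0.5 0.0
```

Averaging each side keeps the contrast interpretable as a difference of mean responses, regardless of how many categories are pooled on either side.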