Corticolimbic dysfunction during facial and prosodic emotional recognition in first-episode psychosis patients and individuals at ultra-high risk.

Huai-Hsuan Tseng¹, Jonathan P Roiser², Gemma Modinos³, Irina Falkenberg⁴, Carly Samson³, Philip McGuire³, Paul Allen⁵.

Abstract

Emotional processing dysfunction is widely reported in patients with chronic schizophrenia and first-episode psychosis (FEP), and has been linked to functional abnormalities of corticolimbic regions. However, corticolimbic dysfunction is less studied in people at ultra-high risk for psychosis (UHR), particularly during processing prosodic voices. We examined corticolimbic response during an emotion recognition task in 18 UHR participants and compared them with 18 FEP patients and 21 healthy controls (HC). Emotional recognition accuracy and corticolimbic response were measured during functional magnetic resonance imaging (fMRI) using emotional dynamic facial and prosodic voice stimuli. Relative to HC, both UHR and FEP groups showed impaired overall emotion recognition accuracy. Whilst during face trials, both UHR and FEP groups did not show significant differences in brain activation relative to HC, during voice trials, FEP patients showed reduced activation across corticolimbic networks including the amygdala. UHR participants showed a trend for increased response in the caudate nucleus during the processing of emotionally valenced prosodic voices relative to HC. The results indicate that corticolimbic dysfunction seen in FEP patients is also present, albeit to a lesser extent, in an UHR cohort, and may represent a neural substrate for emotional processing difficulties prior to the onset of florid psychosis.

Year: 2016 PMID: 27747152 PMCID: PMC5053033 DOI: 10.1016/j.nicl.2016.09.006

Source DB: PubMed Journal: Neuroimage Clin ISSN: 2213-1582 Impact factor: 4.881

Introduction

Emotional processing deficits are widely reported in patients with schizophrenia and first-episode psychosis (FEP). Experimental studies, using both emotional faces and prosodic voice stimuli, report robust emotion recognition deficits in patients with schizophrenia and FEP (Edwards et al., 2002, Edwards et al., 2001, Thompson et al., 2012, Tseng et al., 2013). As in established schizophrenia, deficits in facial and prosodic emotion recognition have also been demonstrated in UHR populations (Addington et al., 2012, Amminger et al., 2012a, Amminger et al., 2012b, Thompson et al., 2012) indicating that impairments in emotional recognition and processing are already apparent in the prodromal phase of the illness. This is consistent with the emotional dysfunction, in the form of anxiety and affective symptoms, that is common in people who are at ultra-high risk (UHR) for developing psychosis (Yung et al., 2003). Dysfunction in brain regions important for emotional processing may be associated with vulnerability for developing the illness and may exist before the onset of florid psychosis (Barbour et al., 2012, Bediou et al., 2007, Eack et al., 2010, Habel et al., 2004). Neuroimaging studies in patients with schizophrenia and FEP (Lee et al., 2002, Li et al., 2010, Pinkham et al., 2005, Reske et al., 2009) have identified impairments during facial and prosodic emotional processing in cortical and limbic structures, including: the fusiform gyrus (FG) for facial expressions; superior temporal gyrus (STG) for vocal prosodies; amygdala; anterior cingulate gyrus and ventral and medial prefrontal cortex for both (Bach et al., 2009, Gur et al., 2002, Hempel et al., 2003, Li et al., 2010, Mitchell and Crow, 2005, Mitchell et al., 2004, Williams et al., 2004). Dysfunction in these regions is thought to account for patients' characteristic disturbances in facial and prosodic emotional processing and recognition (Gur et al., 2002, Williams et al., 2004). 
A multi-stage model of emotional perception and recognition has been proposed by Wildgruber et al. (2009). The model postulates that in an initial sensory processing stage, the FG and STG extract basic features from visual and speech input (stage 1). This emotional information is then conveyed to higher order emotional processing areas (i.e. amygdala, parahippocampal area, inferior frontal cortex) for evaluation (stage 2). Neuroimaging findings in chronic schizophrenia and FEP patients suggest neural dysfunction is present at both stages of the putative model (Leitman et al., 2007, Li et al., 2010, Mitchell et al., 2004), i.e. in emotional perception and evaluation regions. Structural and functional abnormalities in these regions have also been reported in UHR populations (Broome et al., 2009, Fusar-Poli et al., 2011, Fusar-Poli et al., 2007, Mechelli et al., 2011, Seiferth et al., 2008, Smieskova et al., 2010, Tognin et al., 2014). Furthermore, previous functional imaging studies in UHR cohorts examining emotional processing explicitly, using face stimuli, reported altered activation in primary sensory regions (i.e. lingual, fusiform, and middle occipital gyri) and in the prefrontal cortex relative to healthy controls, but not always in the amygdala (Seiferth et al., 2008, Wolf et al., 2015). These findings imply that emotional processing dysfunction in UHR participants may arise from the initial information decoding stage in sensory areas prior to engagement of the amygdala. However, so far the evidence is equivocal.
The incentive salience hypothesis proposes that increased firing of dopaminergic neurons in the striatum enhances the salience of irrelevant stimuli in patients with schizophrenia, including emotion-laden stimuli (Heinz and Schlagenhauf, 2010, Howes et al., 2009, Kapur, 2003, Roiser et al., 2013). It has been repeatedly demonstrated that striatal dopaminergic activity, including dopamine synthesis capacity and stress-induced dopamine release, is increased in the early phase of the illness, including FEP (Bonoldi and Howes, 2013, Ellison-Wright et al., 2008, Mizrahi et al., 2012) and UHR stages (Egerton et al., 2013, Howes et al., 2009, Mizrahi et al., 2012). These alterations are mainly observed in the dorsal striatum. Furthermore, increased resting perfusion, a marker of neural activity (Allen et al., 2015), and altered connectivity (Dandash et al., 2014) have been reported in the dorsal striatum (especially the caudate) in UHR cohorts. Altered striatal function observed in FEP and UHR individuals may contribute to altered salience responses (Roiser et al., 2013), including responses to emotion-laden stimuli (Winton-Brown et al., 2014). The dorsal striatum, especially the caudate nucleus, has been shown to modulate frontolimbic connections during valence-specific emotional processing (Diwadkar et al., 2012, Kotz et al., 2015), particularly in response to unpleasant stimuli (Carretie et al., 2009). Misattribution of emotionally salient stimuli has been reported in patients with schizophrenia during emotional processing (Cohen and Minor, 2010). Together, these findings suggest that altered striatal function in psychosis contributes to the valence misattribution of emotion-laden stimuli. It is not clear, however, whether altered striatal function affects emotional valence judgment in UHR and FEP individuals.
We investigated the neural correlates of emotion recognition in UHR and FEP subjects using both dynamic facial and prosodic voice stimuli. In addition to the emotional face stimuli used in previous studies in UHR populations (Diwadkar et al., 2012, Seiferth et al., 2008), we included prosodic voice stimuli, as an impaired capability to extract non-verbal emotional information from language is widely reported in schizophrenia (Bach et al., 2009, Edwards et al., 2002, Edwards et al., 2001, Kucharska-Pietura et al., 2005, Leitman et al., 2007) and high-risk populations (Addington et al., 2012, Amminger et al., 2012a, Amminger et al., 2012b). We predicted that (1) relative to HC, FEP patients would show reduced recognition accuracy for both facial and prosodic voice stimuli across emotions, and that this would be associated with decreased activation throughout corticolimbic regions involved in both sensory (i.e. FG, STG) and higher order emotional processes (i.e. amygdala and prefrontal cortex). We additionally predicted (2) that relative to HC, UHR participants would show reduced recognition accuracy and functional alterations in this corticolimbic network, particularly in cortical sensory regions (i.e. FG and STG), but to a lesser extent than that seen in FEP patients. Finally, given the role of the caudate nucleus in the processing of negative emotional stimuli (Carretie et al., 2009), we explored bilateral caudate regions and predicted that (3) UHR and FEP participants would show increased activation in this region relative to HC during emotional valence judgment.

Methods

Participants

All participants were between 18 and 40 years of age. Eighteen UHR participants were recruited from Outreach and Support in South London (OASIS) (Broome et al., 2005). The UHR state was defined according to the Personal Assessment and Crisis Evaluation (PACE) criteria (Yung et al., 1998) and confirmed using the Comprehensive Assessment of At Risk Mental States (CAARMS) scale (Yung et al., 2008). In brief, UHR participants met at least one of the following criteria: a) attenuated psychotic symptoms; b) brief limited intermittent psychosis; or c) a significant decline in cognitive and social functioning over the past year, together with either schizotypal personality disorder or a first degree relative with a psychotic disorder. One of the UHR participants was taking atypical antipsychotic medication (the chlorpromazine equivalent was 100 mg/day). Eighteen FEP patients were recruited to the study through South London and Maudsley early intervention clinics (http://www.slam.nhs.uk). FEP was operationally defined as ‘first treatment contact’ plus an ICD-10 diagnosis of psychosis (codes F20–F29 and F30–F33) (World Health Organization, 1992a). The clinical diagnosis was validated by administering the Schedules for Clinical Assessment in Neuropsychiatry (SCAN, World Health Organization, 1992b), and the clinical states were in partial remission. Ten of the FEP participants were taking atypical antipsychotic medication (all using second generation antipsychotics, the chlorpromazine equivalent in those FEP participants who were taking antipsychotic medications was 186.66 ± 118.84 mg/day). Twenty-one gender-matched healthy control (HC) participants were recruited via advertisements from the same geographical areas as UHR/FEP participants. No HC participants met criteria for a DSM-IV-TR psychiatric disorder, fulfilled the PACE criteria for prodromal symptoms, or had a first-degree family history of psychiatric disorders. One HC was excluded due to incomplete data collection. 
Exclusion criteria for all subjects included a history of neurological disorder, prior head trauma resulting in loss of consciousness and/or hospitalisation, or any contraindications to exposure to a magnetic field (e.g. metal implants, or pregnancy). Any participants reporting excessive use of alcohol (> 21 units per week for men and > 14 units per week for women) or recent recreational drug use (use of cannabis, stimulants, hallucinogens, or opiates in the two weeks prior to the fMRI scan) were excluded. None of the participants had received a DSM-IV-TR diagnosis for substance abuse or dependence. Written informed consent was obtained from all participants after detailed explanation of the study protocol. Ethical approval for the study was granted by the UK National Research Ethics Service Committee London – Bromley (reference number: 11/LO/0623).

Clinical and neurocognitive assessment

Participants' demographic and clinical data and estimated IQ scores are presented in Table 1. IQ was assessed using the Wide Range Achievement Test Revised (WRAT-R) (Jastak and Wilkinson, 1984). Symptoms in UHR and FEP participants were assessed with the Positive and Negative Syndrome Scale (PANSS) (Kay et al., 1987). The Comprehensive Assessment of At Risk Mental States (CAARMS) (Yung et al., 2005) was administered to UHR and HC participants (Table 1). Lifetime cannabis use was determined from self-reported frequency and classified into four levels: 1-experimental, 2-occasional, 3-moderate, and 4-severe use, while non-users were coded as 0.

Table 1

Demographic information for participants across diagnostic group and statistical analysis. Means are followed by the standard deviations.

	HC (n = 21)	UHR (n = 18)	FEP (n = 18)	F/χ²	p value (post hoc)
Age (years)	22.91 ± 3.79	24.44 ± 4.12	27.72 ± 5.36	5.86	0.005 (FEP > HC)
Gender	8M:13F	10M:8F	13M:5F	4.57	0.10
Laterality	21R:1L	17R:1L	17R:1L	0.17	0.99
Years of education	16.71 ± 2.10	14.89 ± 1.94	14.78 ± 3.98	3.03	0.06
Cannabis use	0.76 ± 0.83	2.28 ± 1.02	1.39 ± 1.29	10.10	< 0.001 (HC = FEP < UHR)
Verbal IQ WRAT-R (SS)	110.33 ± 9.78	99.06 ± 15.61	92.11 ± 15.39	8.47	0.001 (HC > UHR = FEP)
PANSS total	–	53.88 ± 11.03	54.56 ± 13.79	0.25	0.88
PANSS positive	–	12.60 ± 2.92	13.47 ± 5.29	0.39	0.54
PANSS negative	–	14.39 ± 6.24	13.17 ± 5.45	0.04	0.85
PANSS general	–	26.73 ± 5.35	27.00 ± 7.36	0.00	0.98
CAARMS total	2.33 ± 3.81	36.29 ± 18.29	–	69.13	< 0.001
CAARMS positive	0.57 ± 1.08	7.72 ± 4.87	–	42.97	< 0.001
CAARMS emotion	0.05 ± 0.22	2.50 ± 3.02	–	13.87	0.001

HC = healthy controls; UHR = individuals at ultra-high risk state for psychosis; FEP = individuals with first episode psychosis; M = males; F = females; R = predominantly right handed; L = predominantly left handed; WRAT-R (SS) = Wide Range Achievement Test Revised (Standardized Score); PANSS = Positive and Negative Syndrome Scale.

MRI acquisition and processing

Functional images were acquired using a 1.5T MRI scanner (Signa, LX-GE, Milwaukee, USA) at the Institute of Psychiatry, King's College London, UK, using the following parameters: TR = 3000 ms, TE = 40 ms, flip angle = 90°, slice thickness = 2.5 mm with 0.5 mm gap, field of view = 24 cm² and a 64 × 64 matrix. In total, 46 axial slices parallel to the anterior commissure–posterior commissure (AC-PC) line were collected for each participant. Four hundred and twenty-seven image volumes were acquired during the task in each participant. Structural data were acquired using a three-dimensional T1-weighted FSPGR sequence (voxel size: 1 × 1 × 1 mm³, field of view: 280, 146 slices, TR = 11.092 ms, TE = 4.87 ms, TI = 300 ms, α = 18°) for coregistration purposes.
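A few quantities follow directly from the functional acquisition parameters above and can be checked with simple arithmetic. The sketch below is illustrative only: it assumes the field of view is 24 cm along each in-plane axis and that the 0.5 mm gap applies between adjacent slices, neither of which is spelled out in the text.

```python
# Derived acquisition quantities from the stated functional parameters.
TR_S = 3.0                 # repetition time (TR = 3000 ms)
N_VOLUMES = 427            # volumes acquired during the task
N_SLICES = 46
SLICE_THICKNESS_MM = 2.5
SLICE_GAP_MM = 0.5
FOV_MM = 240.0             # assumption: 24 cm field of view per in-plane axis
MATRIX = 64                # 64 x 64 acquisition matrix

# Total functional acquisition time for the task.
task_duration_s = N_VOLUMES * TR_S

# In-plane voxel edge length.
in_plane_voxel_mm = FOV_MM / MATRIX

# Coverage along the slice axis: all slices plus the gaps between them.
slab_coverage_mm = N_SLICES * SLICE_THICKNESS_MM + (N_SLICES - 1) * SLICE_GAP_MM

print(f"task duration: {task_duration_s:.0f} s ({task_duration_s / 60:.2f} min)")
print(f"in-plane voxel size: {in_plane_voxel_mm:.2f} mm")
print(f"slice-axis coverage: {slab_coverage_mm:.1f} mm")
```

Under these assumptions the task lasts 1281 s (about 21 minutes), with in-plane voxels of 3.75 mm.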

Emotional recognition paradigm

We used emotional stimuli with dynamic and continuous change in facial geometric configuration (Platt et al., 2010) and with vocal prosodic characteristics (Nowicki and Duke, 1994) validated in previous studies. The details for both facial and prosodic tasks are described in the Supplementary materials. There were 96 dynamic face trials (happy, sad, fearful and neutral) and 96 high-intensity voice trials (happy, sad, fearful and their low-intensity comparisons) of variable duration (the mean trial duration was 4.2 ± 1.37 s). Dynamic emotional stimuli were created with Abrosoft Fantamorph software (version 4.0). Photographs were morphed from neutral to the target emotion with increasing intensity within 25 frames during the ‘morph’. A one-second inter-stimulus interval in which a fixation-cross was presented in the centre of the screen followed each stimulus. During the emotion recognition task, face and voice trials were presented interspersed in a pseudo-random order and arranged into two sessions. Face stimuli were presented on a projection screen. A fixation cross was presented during the voice trials. Participants were instructed to choose between four emotional categories (happy, sad, fearful, and neutral) via a button box as quickly as possible before the voice and/or video clips ended. After the morphing face and voices stopped a black screen with fixation cross was presented until the end of the trial. During the task, participants' response accuracy was recorded.

Data analysis

Behavioural data analyses

Clinical and demographic data were analysed using chi-square tests (gender, handedness) and analyses of variance (ANOVA) (other demographic and clinical data) in IBM SPSS 19. Separate analyses were performed for the face trials and the voice trials. Accuracy scores were analysed using repeated measures analyses of covariance (RM-ANCOVAs) with age included as a covariate of no interest. Emotional category (happy, sad, fearful, neutral) was entered as the within-subjects variable. Diagnostic group was entered as the between-subjects variable. In addition, we explored the frequency of valence misrecognition between groups i.e. positive emotion (happy dynamic faces and prosodic voices trials) misrecognised as negative emotion (sad or fearful dynamic faces and prosodic voices trials). Following the detection of significant main effects or interactions, post-hoc t-tests or F-tests were employed and inferences were made at p < 0.05.
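As a worked illustration of the chi-square test mentioned above, the following sketch recomputes the Pearson statistic for the gender distribution from the counts given in Table 1. It is not the authors' SPSS procedure; the function name is hypothetical, and the closed-form p-value used here is valid only for 2 degrees of freedom.

```python
import math

# Gender counts (male, female) per group, copied from Table 1.
observed = {"HC": (8, 13), "UHR": (10, 8), "FEP": (13, 5)}


def pearson_chi_square(table):
    """Return (chi2, dof) for a dict of equal-length count rows."""
    rows = list(table.values())
    n_cols = len(rows[0])
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(r[j] for r in rows) for j in range(n_cols)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for row, row_total in zip(rows, row_totals):
        for j, obs in enumerate(row):
            expected = row_total * col_totals[j] / grand_total
            chi2 += (obs - expected) ** 2 / expected

    dof = (len(rows) - 1) * (n_cols - 1)
    return chi2, dof


chi2, dof = pearson_chi_square(observed)

# With exactly 2 degrees of freedom, the chi-square survival
# function reduces to exp(-x / 2).
p_value = math.exp(-chi2 / 2) if dof == 2 else None

print(f"chi2 = {chi2:.3f}, dof = {dof}, p = {p_value:.3f}")
```

The statistic comes out close to the 4.57 listed in Table 1, with a p-value of about 0.10; small differences in the last digit may reflect rounding or the specific statistic reported by the software.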

Functional MRI analyses

Functional images were pre-processed using SPM8 software (http://www.fil.ion.ucl.ac.uk/spm) running under Matlab 7.1 (Math Works, Natick, MA, USA). The full preprocessing procedures are detailed in Supplementary materials. Images of both sessions were realigned to the obtained structural image. The remaining images were then realigned to the first image of their respective session and resliced with sinc interpolation. Movement parameters were calculated and images with excessive movement (> 1.5 mm of translation and 1 degree of rotation in any axis) and the adjacent images were examined and removed if the image was corrupted. Interpolation between the images adjacent to the corrupted images was performed to replace the removed images. Subjects who had > 10% of data corrupted were considered as having excessive movements and were excluded from the subsequent analyses. One UHR participant and one FEP participant were thus excluded. Images were segmented and spatially normalized (Friston et al., 1995) to a standard MNI-305 template using nonlinear-basis functions and spatially smoothed with a Gaussian kernel 8-mm full width at half maximum isotropic. A standard event-related first-level analysis of regional responses was performed; onset times (i.e. of the onset of the facial expressions or voice clips) and associated durations were convolved with a canonical haemodynamic response function. To exclude low frequency drifts a high-pass filter was applied using a set of discrete cosine basis functions with a cutoff of 128 s, and an AR(1) model was applied to account for temporal auto-correlation intrinsic to the fMRI time-series. The movement parameters were entered as separate regressors of no-interest in the first level analyses. 
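The movement-based exclusion rule described above can be sketched as follows. This is a simplified illustration rather than the authors' pipeline: it assumes a volume is flagged when either the translation or the rotation limit is exceeded on any axis, and it omits the visual inspection of adjacent images and the interpolation step. The function names and the toy motion values are hypothetical.

```python
def flag_corrupted_volumes(translations_mm, rotations_deg,
                           max_translation_mm=1.5, max_rotation_deg=1.0):
    """Return indices of volumes whose motion exceeds either limit.

    translations_mm and rotations_deg hold one (x, y, z) tuple per volume.
    """
    flagged = []
    for i, (trans, rot) in enumerate(zip(translations_mm, rotations_deg)):
        too_far = any(abs(v) > max_translation_mm for v in trans)
        too_rotated = any(abs(v) > max_rotation_deg for v in rot)
        if too_far or too_rotated:
            flagged.append(i)
    return flagged


def exclude_participant(n_flagged, n_volumes, max_fraction=0.10):
    """True when more than 10% of a participant's volumes are flagged."""
    return n_flagged / n_volumes > max_fraction


# Hypothetical toy run of four volumes.
translations = [(0.1, 0.0, 0.2), (1.8, 0.1, 0.0), (0.2, 0.3, 0.1), (0.0, 0.0, 0.0)]
rotations = [(0.1, 0.0, 0.0), (0.2, 0.0, 0.1), (0.0, 1.4, 0.0), (0.0, 0.0, 0.0)]

flagged = flag_corrupted_volumes(translations, rotations)
print(flagged)  # volumes 1 (translation) and 2 (rotation) are flagged

# With the 427 volumes acquired per participant, 42 flagged volumes stay
# under the 10% limit, whereas 43 exceed it.
print(exclude_participant(42, 427), exclude_participant(43, 427))
```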
For first level statistical analysis, ten experimental regressors were defined: 1) Happy Face; 2) Sad Face; 3) Fearful Face; 4) Neutral Face; 5) High-intensity Happy Voice; 6) Low-intensity Happy Voice; 7) High-intensity Sad Voice; 8) Low-intensity Sad Voice; 9) High-intensity Fearful Voice; and 10) Low-intensity Fearful Voice. Five first level contrasts of interest were then computed for dynamic face and prosodic voice stimuli: 1) Happy – Comparison; 2) Sad – Comparison; 3) Fearful – Comparison; 4) All emotions – Comparison; and 5) Positive emotion (Happy) – negative emotion (Fearful + Sad). For dynamic face contrasts, neutral faces acted as the comparison condition. For prosodic voice trials, since the validated stimulus set (DANVA-2-AP) did not contain neutral voice stimuli, low-intensity voice trials served as the comparison condition (i.e. high versus low intensity for the same prosodic emotion). Second-level analyses were performed using two approaches: whole brain voxel-wise analyses for exploration of the effect of emotional processing, and region of interest (ROI) analyses (Li et al., 2010, Witteman et al., 2012) based on a visual and auditory emotional processing model in schizophrenia (Tseng et al., 2015). Whole brain analysis was performed using ANCOVA and independent samples t-tests (conducted within the SPM ANCOVA framework) with age as a covariate of no interest. Other confounding factors (i.e. IQ, cannabis use and antipsychotic use) were not included in the analyses as supplementary correlational analyses between these factors and peak activation in all ROIs were non-significant within each group of participants. Statistical inferences were made at p < 0.05 after FWE cluster-level correction for multiple comparisons. 
When the group comparison omnibus F-contrast did not reach significance, additional exploratory pair-wise analyses were performed to compare FEP versus HC, UHR versus HC and FEP versus UHR, with a corrected threshold at p < 0.017 (Bonferroni correction for 3 contrasts), except for the exploratory hypothesis examining bilateral caudate regions during emotional valence judgment. For ROI analyses, a search sphere with a radius of 16 mm (twice the smoothing kernel) was applied to the centre of each ROI using the small volume correction function in SPM8 (described below). Coordinates were described according to the standard Montreal Neurological Institute (MNI) system. Four ROIs, identified in a meta-analysis (Li et al., 2010) of facial emotion recognition studies, were used to examine group effects during dynamic face trials. These were the bilateral FG (left, −39, −65, −13; right, 40, −52, −14), left amygdala (−21, −7, −8) and right lentiform nucleus (22, −3, −5). For the prosodic voice trials, primary facial decoding areas (bilateral FG) were replaced by primary prosodic decoding areas (bilateral STG; left, −62, −22, 1; right, 49, −23, 6); coordinates were selected from the meta-analysis by Witteman et al. (2012). To test our valence-specific hypothesis, ROIs in the caudate nucleus were chosen in the left (−18, −2, 24) and right caudate body (16, 4, 18) (Carretie et al., 2009). Spheres were then constructed in the MarsBaR toolbox for SPM (Brett et al., 2002). A single inclusive mask containing all ROIs was applied, and statistical inferences were made at p < 0.05 with FWE correction for multiple comparisons at the voxel-level after applying small volume correction (SVC).
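To make the ROI definition concrete, the sketch below checks whether a peak coordinate lies inside the 16 mm search sphere around an ROI centre. It is a geometric illustration only, not a reimplementation of SPM's small volume correction or of MarsBaR; the function and dictionary names are hypothetical. The centres are those listed above, and the example peaks are taken from the coordinates reported in the results (Fig. 1 and Table 3).

```python
import math

SMOOTHING_FWHM_MM = 8.0
SPHERE_RADIUS_MM = 2 * SMOOTHING_FWHM_MM  # 16 mm, twice the smoothing kernel

# ROI centres in MNI space, as listed in the text.
ROI_CENTRES = {
    "left FG": (-39, -65, -13),
    "right FG": (40, -52, -14),
    "left amygdala": (-21, -7, -8),
    "right lentiform": (22, -3, -5),
    "left STG": (-62, -22, 1),
    "right STG": (49, -23, 6),
    "left caudate body": (-18, -2, 24),
    "right caudate body": (16, 4, 18),
}


def distance_to_centre_mm(peak, roi_name):
    """Euclidean distance (mm) between a peak and the named ROI centre."""
    return math.dist(peak, ROI_CENTRES[roi_name])


def in_search_sphere(peak, roi_name):
    """True when the peak falls within the 16 mm sphere of the ROI."""
    return distance_to_centre_mm(peak, roi_name) <= SPHERE_RADIUS_MM


# Peak coordinates reported in the results.
reported_peaks = [
    ("right FG", (36, -50, -6)),
    ("left amygdala", (-32, -2, -18)),
    ("left amygdala", (-20, -2, -14)),
    ("left STG", (-54, -26, -10)),
]

for roi, peak in reported_peaks:
    d = distance_to_centre_mm(peak, roi)
    print(f"{roi}: peak {peak} is {d:.2f} mm from centre, inside={in_search_sphere(peak, roi)}")
```

All four reported peaks fall within 16 mm of their respective centres, which is consistent with their being reported under small volume correction.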

Results

Demographic and clinical data

Demographic and clinical data for each group are reported in Table 1. There were significant age differences (F = 5.86, df = (2, 54), p = 0.005) with FEP patients being older than HC. There were significant estimated IQ differences (F = 8.47, df = (2, 54), p = 0.001) with HC showing higher IQ scores than UHR and FEP. There were also significant differences in cannabis use (F = 10.10, df = (2, 54), p < 0.001) with UHR having more cannabis use experience than HC and FEP.
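The one-way ANOVA for age can be approximately reconstructed from the group sizes, means and standard deviations in Table 1. The sketch below is a plausibility check under the assumption that the tabulated values are sample means and sample standard deviations; it is not the authors' SPSS analysis, and the function name is hypothetical.

```python
def anova_from_summary(groups):
    """One-way ANOVA from (n, mean, sd) tuples.

    Returns (F, df_between, df_within).
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n

    # Between-group and within-group sums of squares.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)

    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Age (years) for HC, UHR and FEP, copied from Table 1 as (n, mean, sd).
age = [(21, 22.91, 3.79), (18, 24.44, 4.12), (18, 27.72, 5.36)]

f_stat, df_between, df_within = anova_from_summary(age)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

The degrees of freedom match the reported (2, 54), and the F value lands within rounding distance of the reported 5.86, as expected when working from rounded summary statistics.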

Dynamic face trials

Recognition accuracy

Mean accuracy scores are shown in Fig. 1(A). There was a significant main effect of emotional category (F = 14.86, df = (3, 159), p < 0.001). Across all participants, recognition accuracy was greater for happy relative to sad (t = 8.22, df = 56, p < 0.001) and fearful trials (t = 6.62, df = 56, p < 0.001). There was a trend towards an effect of diagnostic group (F = 2.65, df = (2, 53), p = 0.08) with HC showing greater accuracy than FEP patients across all emotional conditions (post-hoc pairwise comparison: HC > FEP, F = 5.74, df = (1, 36), p = 0.022). The group × emotional category interaction was non-significant (F = 0.42, df = (6, 159), p = 0.87).

Fig. 1

Dynamic face trials. (A) Graph showing mean accuracy for group by emotional category. (B) Statistical Parametric Map (SPM) showing activation differences (HC > UHR, FEP; p = 0.04 and 0.04, respectively) within right FG. The effects did not survive after correction for multiple comparisons. The left side of the brain is on the left side of the image. (C) Graph showing peak BOLD activation level in right FG for each group during emotional dynamic faces contrasted against neutral dynamic faces, MNI coordinates (36, −50, −6). HC: healthy control group. UHR: ultra-high risk group. FEP: first-episode psychosis group.

The effect of diagnostic group for misrecognition of positive emotional faces (i.e. happy) as negative (i.e. sad or fearful) was also non-significant (F = 1.49, df = (2, 53), p = 0.24).

Functional MRI

The main effect of task (emotional > comparison trials) is reported in the Supplementary material. The main effect of group (emotional > neutral trials) was non-significant for both whole-brain and ROI analyses (Table 2). However, exploratory pair-wise group tests (conducted within the SPM ANCOVA framework) revealed trends towards significance in the right FG. In this region both FEP (t = 3.62, df = 51, p = 0.04 SVC) and UHR groups (t = 3.62, df = 51, p = 0.04 SVC) (Table 2, Fig. 1(B)) showed reduced activation during emotional face trials relative to HC. However, the effects did not survive after correction for multiple comparisons (corrected threshold p < 0.017). The difference between UHR and FEP groups did not approach significance. The group × valence (positive vs. negative emotions) interaction was non-significant in the caudate ROI.
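The corrected threshold applied to the exploratory pair-wise tests is a plain Bonferroni division of the 0.05 alpha level by the three group contrasts. A minimal sketch, using only the p-values quoted in the paragraph above (the helper name and the 0.01 example are hypothetical):

```python
ALPHA = 0.05
N_PAIRWISE_CONTRASTS = 3  # FEP vs HC, UHR vs HC, FEP vs UHR

bonferroni_threshold = ALPHA / N_PAIRWISE_CONTRASTS


def survives_correction(p_value, threshold=bonferroni_threshold):
    """True when a p-value falls below the Bonferroni-corrected threshold."""
    return p_value < threshold


# Small-volume-corrected p-values reported for the right FG.
reported = {"FEP < HC": 0.04, "UHR < HC": 0.04}

print(f"corrected threshold: {bonferroni_threshold:.3f}")
for contrast, p in reported.items():
    print(f"{contrast}: p = {p}, survives = {survives_correction(p)}")
```

Both right FG effects fall below 0.05 but above the corrected threshold of roughly 0.017, which is why they are described as trends.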

Table 2

Dynamic face trials.

Whole brain voxel-wise analyses and ROI analyses using small volume correction for dynamic faces. Results reported for whole brain F-tests and ROI analyses are FWE corrected at the voxel level, p < 0.05.

ANCOVA group contrasts for Dynamic Face trials	No. of voxels	x	y	z	Maximum F values	Z	p value
1) F-test (whole-brain)
	No clusters reach threshold
2) F-test (ROIs)
	No voxels survive correction

All p-values reported for ROI analyses are FWE corrected at the voxel level.

Prosodic voice trials

Mean accuracy scores for prosodic voice trials are shown in Fig. 2(A). The main effect of emotional category was non-significant (F = 1.028, df = (3, 159), p = 0.38). The main effect of diagnostic group was significant (F = 7.96, df = (3, 159), p = 0.001) across all emotional categories, driven by greater overall accuracy in HC relative to UHR (F = 4.30, df = (1, 36), p = 0.045) and FEP (F = 23.69, df = (1, 36), p < 0.001) groups. The group × emotional category interaction (F = 1.035, df = (6, 159), p = 0.41) and group × valence interaction for misrecognition were non-significant (F = 2.22, df = (2, 53), p = 0.11).

Fig. 2

Prosodic voice trials. (A) Graph showing mean accuracy for each group by emotional category. Comparison refers to low-intensity voices. (B) SPM showing group activation differences (HC > FEP). The left side of the brain is on the left side of the image. (C) Graph showing peak BOLD activation level for each group during high intensity prosodic voices contrasted against low intensity prosodic voices in regions showing pair-wise differences between HC and FEP. HC: healthy control group. UHR: ultra-high risk group. FEP: first-episode psychosis group.

The main effect of task (high intensity prosodic voices > low intensity prosodic voices) is reported in Supplementary material. Whole-brain analysis revealed a trend towards a significant group effect in the left amygdala extending to left insula (Table 3, Fig. 2(B)). Pair-wise group comparisons revealed reduced activation in FEP patients relative to HC in the left amygdala, STG, medial orbital frontal gyrus, lingual gyrus and left angular gyrus (whole brain corrected, cluster-level) (Table 3). The difference between UHR and FEP groups was non-significant. ROI analyses confirmed the group effect in left amygdala (F = 15.08, df = (2, 51), p = 0.009; SVC) and in the left STG (F = 12.79, df = (2, 51), p = 0.03 SVC; see Table 3 and Fig. 2(C)). Pair-wise group tests also showed that the FEP patients had significantly lower activation than the HC group in the left amygdala (t = 5.46, df = 51, p = 0.002 SVC) and left STG (t = 5.05, df = 51, p = 0.003 SVC; see Table 3).

Table 3

Prosodic voice trials.

Whole brain voxel-wise analyses and ROIs analyses using small volume correction for prosodic voices. Results reported for whole brain F-tests and ROI analyses are FWE corrected at the voxel level, p < 0.05. Results reported for whole brain t-tests are FWE corrected at the cluster level, p < 0.05; clusters formed at p < 0.001 (minimum cluster size = 293).

ANCOVA group contrasts for Prosodic Voice trials	No. of voxels	x	y	z	Maximum F values	Z	p value
1) F-test (whole-brain)
	No clusters reach threshold
2) F-test (ROIs)
Left amygdala	4	−32	−2	−18	15.08	4.34	0.009
Left amygdala	4	−20	−2	−14	12.57	3.97	0.036
Left superior temporal gyrus	4	−54	−26	−10	12.79	4.00	0.032

All p-values reported for voxel-wise whole brain group analyses are FWE corrected at the voxel level, p < 0.05.

All p-values reported for whole brain pair-wise comparisons are FWE corrected at the cluster level, p < 0.05; cluster size ≥ 293. Clusters are formed at p < 0.001, uncorrected.

All p-values reported for ROI analyses are FWE corrected at the voxel level, p < 0.05.

The group × valence interaction was non-significant (F = 9.31, df = (2, 51), p = 0.17) in the caudate ROI. Pair-wise group comparison showed a trend towards a group × valence interaction between HC and UHR in the left dorsal caudate. In this region, HC showed greater activation for negative relative to positive emotions but the opposite pattern was seen in UHR participants (positive > negative) (t = 4.31, df = 51, p = 0.02; see Fig. 3). However, this did not survive a Bonferroni corrected threshold of p < 0.017. The group × valence interactions between FEP and HC groups, and between FEP and UHR groups in the caudate ROI were non-significant (Table 4).

Fig. 3

ROI analyses of left caudate nucleus body showing peak BOLD activation level for positive > negative prosodic voices in UHR > HC. HC: healthy control group. UHR: ultra-high risk group.

Table 4

ROI analyses of caudate area.

Caudate ROIs analyses for valence-specific hypothesis. Results are FWE corrected at the voxel level, p < 0.05.

ROIs	x	y	z	Maximum T values	Z	p value (FWE corrected)
Positive valence Faces ≥ Negative valence Faces, UHR > HC
	No voxels survive correction

Positive valence Voices ≥ Negative valence Voices, UHR > HC
Left caudate body	−18	−2	24	4.31	3.96	0.02

All p-values reported for ROI analyses are FWE corrected at the voxel level.

Discussion

We investigated the neural correlates of emotional processing in response to emotional stimuli in two sensory modalities (visual, auditory) in FEP patients, UHR and HC individuals. In line with our hypothesis, FEP patients showed reduced recognition accuracy compared to HC during dynamic face and prosodic voice trials. During dynamic face trials, overall recognition accuracy for UHR participants was intermediate to HC and FEP patients but did not differ significantly from either group. During prosodic voice trials, however, UHR participants showed significantly reduced recognition accuracy relative to HC, while significantly reduced accuracy for fearful voice trials in UHR was observed (Fig. 2(A)). Of note, the vocal fear recognition rate was relatively low, which might reflect its higher ambiguity (i.e. lower accuracy and longer reaction times, see Edwards et al., 2002, Tseng et al., 2013) and thus susceptible to both time-urgent design (requiring participant to respond as soon as possible before the clip ended) and the background noise during image acquisition. Nevertheless, these factors affected all three groups and the results remain consistent with previous studies (Amminger et al., 2012a, Hoekert et al., 2007, Kohler et al., 2010, Thompson et al., 2012) that show impaired emotion recognition in people at clinical high-risk for schizophrenia before the full expression of psychotic illness. The majority of previous studies in schizophrenia (Lee et al., 2002, Li et al., 2010, Pinkham et al., 2005, Reske et al., 2009) report reduced amygdala activity in patients with schizophrenia. However, contrary to previous findings and our hypotheses, we did not find a significant group difference in brain activation during dynamic face trials. Using a relatively low magnetic field scanner (1.5T rather than 3T) may have contributed to the lack of differences in activation between groups. 
We used a low-field scanner to mitigate the loud noise generated by high-field scanners, which may interfere with the processing of acoustic stimuli. Nevertheless, the lack of activation differences in response to emotional face stimuli is consistent with a small number of previous studies that did not find clear amygdala activation differences in schizophrenia relative to healthy controls (Sachs et al., 2012, Swart et al., 2013). Our findings may suggest that relatively intact facial emotional processing is also seen in early and prodromal stages of psychosis. Although the group comparison of brain activation during dynamic face trials did not reach statistical significance, exploratory pair-wise ROI analysis of functional MRI data showed a non-significant trend towards reduced right FG activation in FEP patients relative to HC. During prosodic voice trials, a more widespread pattern of reduced activation was apparent in FEP patients relative to HC, involving both sensory (STG) and emotional processing regions (amygdala and medial orbital prefrontal cortex), and also left temporal-parietal-limbic regions, including left MTG, left insula, and left thalamus. Dysfunction in cortical and limbic brain regions that are involved in sensory processing (i.e. FG, STG), information relaying and modulation (i.e. basal ganglia/caudate), and higher order emotional processes (i.e. amygdala and prefrontal cortex) in patients with schizophrenia has been established robustly (Lee et al., 2002, Li et al., 2010, Tian et al., 2011). Our findings complement those of schizophrenia studies, and confirm previous studies reporting functional changes in these corticolimbic regions, involved in both early decoding and emotional recognition/interpretation, in FEP populations (Reske et al., 2009, Reske et al., 2007).
In FEP patients, the deactivation (negative contrast estimates) indicates greater activation to neutral than to emotional stimuli in corticolimbic regions (Figs. 1(C) and 2(C)), and could be interpreted as hyperactivation to neutral or subtle emotional stimuli, as previously reported in schizophrenia populations (Aleman and Kahn, 2005, Hall et al., 2008, Modinos et al., 2015, Seiferth et al., 2008). This functional change would be consistent with the notion that non-emotional information is more salient in FEP and at-risk states. Such abnormalities are thought to contribute to the social cognition and social functioning deficits apparent in emerging psychotic disorders (Amminger et al., 2012b). Relative to HC, despite showing a non-significant trend for reduced activation in the sensory cortex (right FG) during emotional versus neutral dynamic face trials, UHR participants did not show significant differences in activation in either the face or the voice modality. Similarly, in those areas showing decreased activation in the FEP group during prosodic voice trials, UHR participants showed a BOLD response intermediate between HC and FEP, although the difference between HC and UHR did not reach statistical significance. These subtle task-related functional changes in the brain in UHR participants are consistent with previous studies (Dutt et al., 2015). We speculate that these trends may reflect early subtle changes in primary sensory emotional processing regions, which may manifest in vulnerability states before the full-blown onset of psychosis. However, this requires testing in a larger sample. Interestingly, although the majority of studies in patients with schizophrenia report corticolimbic dysfunction during the presentation of emotional face stimuli, our findings in FEP patients suggest dysfunction that is more evident during the presentation of prosodic voice stimuli than during the presentation of facial stimuli.
The reasons for this are not entirely clear, but it is possible that, because the emotional information carried in prosodic voices delivers more subtle interpersonal social cues than faces, voice stimuli may provide a more sensitive method to investigate functional alterations related to emotional recognition deficits in FEP and UHR cohorts. Our results support the findings of previous studies that reported reduced accuracy for prosodic emotional recognition in FEP and UHR groups (Amminger et al., 2012a) and suggest that prosodic, rather than facial, emotional stimuli may be better able to reveal the subtle emotional processing deficits associated with early psychosis and vulnerability states. The FG and the STG have been hypothesized to extract facial features and acoustic properties from visual and speech input, respectively, during stage 1 of the model proposed by Wildgruber et al. (2009). Although the current findings do not unequivocally support an impairment in these primary sensory processing areas in UHR, early subtle changes may be present at the initial perceptual stage and may affect emotion recognition accuracy. By contrast, during both dynamic face and prosodic voice trials, activation in corticolimbic regions (i.e. amygdala and OFC) involved in emotional recognition and interpretation was not significantly reduced in UHR participants relative to HC. This supports the view that corticolimbic hypoactivation (particularly in the amygdala) during the processing of emotion is related to the disease, rather than to vulnerability states (Rasetti et al., 2009), and is consistent with previous neuroimaging studies in UHR cohorts that also failed to detect amygdala dysfunction in the context of emotional recognition (Seiferth et al., 2008). In line with our exploratory hypothesis, UHR participants showed a trend towards an interaction in the caudate nucleus relative to HC when processing the valence of prosodic voices.
In HC, caudate activation was greater for negative relative to positive valence trials, consistent with findings from a previous study by Carretie et al. (2009). The opposite pattern of activation was seen in UHR participants, suggesting altered caudate function during emotional processing. Altered striatal activation in UHR populations has been reported previously during a salience processing task (Roiser et al., 2013) and may be related to elevated dopamine synthesis capacity in the associative striatum (Howes et al., 2009). Inappropriate activation in the caudate during the presentation of emotional stimuli could result in confusion regarding the salience and/or valence of emotional stimuli, although this was not seen at a behavioural level in UHR participants.
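
As an illustrative sketch only, the group-by-valence interaction in the caudate described above can be written as a difference of differences. The notation is introduced here for exposition and is not taken from the study's analysis pipeline: $\bar{\beta}$ is a hypothetical group-mean contrast estimate for the caudate ROI, with "neg" and "pos" indexing negative and positive valence voice trials.

```latex
\Delta_{\text{interaction}}
  = \left(\bar{\beta}^{\,\text{UHR}}_{\text{neg}} - \bar{\beta}^{\,\text{UHR}}_{\text{pos}}\right)
  - \left(\bar{\beta}^{\,\text{HC}}_{\text{neg}} - \bar{\beta}^{\,\text{HC}}_{\text{pos}}\right)
```

Under this convention, the pattern described (negative greater than positive in HC, and the reverse in UHR) corresponds to a positive second bracket and a negative first bracket, so that $\Delta_{\text{interaction}} < 0$. A value of zero would indicate that valence modulates caudate response equally in the two groups.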

Limitations

The main limitation of the present study is the relatively small sample size: our findings will need to be replicated in larger FEP and UHR cohorts. The age difference between HC and FEP was another limitation; however, we included age as a covariate in all analyses to address this issue. Several potential confounding factors need further discussion. First, the three groups were not matched for estimated pre-morbid IQ, and our experimental task required explicit emotion recognition under a time constraint, which may have been cognitively demanding (Phan et al., 2002). However, to our knowledge, although emotion recognition performance may be associated with specific cognitive deficits (Bryson et al., 1997), there is no evidence that general intelligence significantly affects emotional processing (Coan and Allen, 2007). Furthermore, we chose not to control for IQ in the main analysis since this may remove important variance (Edwards et al., 2002) between groups, as low IQ is a phenotypic characteristic of psychosis (Mesholam-Gately et al., 2009). Supplementary correlation analyses showed that the peak activation in all ROIs did not correlate with IQ within any of the groups, suggesting that IQ was not a major confounding factor for emotional processing. A further limitation is that one UHR participant and a number of FEP patients who participated in the study were taking low doses of antipsychotic medication. Although most studies in patients with schizophrenia suggest that medication is not a major confounding influence on emotional recognition accuracy (Fusar-Poli et al., 2007, Navari and Dazzan, 2009), the influence of antipsychotic medication on hemodynamic responses during emotional processing remains unclear. Nevertheless, supplementary correlation analyses showed that the peak activation in all ROIs did not correlate with chlorpromazine equivalent dose in our FEP participants.
Likewise, the higher lifetime experience of cannabis use in the UHR cohort, relative to both FEP and HC groups, is also a potential confounder, given that chronic heavy cannabis use may affect emotional recognition accuracy (Hindocha et al., 2014, Platt et al., 2010). However, none of the participants in our UHR cohort reported concurrent heavy cannabis use or met the DSM-IV diagnostic criteria for either cannabis abuse or dependence. Supplementary correlation analyses showed that the peak activation in all ROIs did not correlate with cannabis use within any of the groups, supporting the view that cannabis use did not affect the results. Another potential limitation is the use of low-intensity emotional prosodic stimuli, rather than neutral ones, as the contrast condition. It is arguable that the contrast of high versus low intensity for voice stimuli may reflect an intensity or arousal effect rather than emotion itself. In our study design it is not possible to differentiate these two effects. However, as intensity is an important dimension of emotional information, these contrasts should still evoke the neural correlates of emotional processing, independent of their low-level acoustic properties (Ethofer et al., 2006).

Conclusions

In summary, FEP patients showed emotional recognition deficits and functional alterations in corticolimbic regions consistent with deficits across a multi-stage emotional processing model, mainly in the voice modality. By contrast, while UHR participants also showed emotional recognition deficits behaviourally, we only observed a trend towards an interaction in the neural processing of emotional stimuli in the caudate nucleus, with a non-significant decrease in activation in early sensory processing regions. Our results highlight the need to investigate behavioural and neural vulnerability biomarkers in psychosis-prone high-risk populations in larger samples, to expand the etiological understanding of psychosis, and consequently to provide insights for preventive strategies. Future longitudinal studies are needed to fully understand the chronology of emotional and corticolimbic dysfunction in the development of psychosis.

Financial disclosure

This work was supported by a Brain & Behavior Research Foundation NARSAD Independent Investigator Grant to PA, and a NARSAD Young Investigator Grant to GM (#21200, G.M., Lieber Investigator). IF was supported by the G.A. Lienert Foundation, Adolf-Schmidtmann-Foundation, FAZIT-Foundation and the German Academic Exchange Service. The funding agencies did not have any role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript.

Declaration of interests

The authors have declared that no competing interests exist.

72 in total

1. Facial emotion perception and fusiform gyrus volume in first episode schizophrenia.

Authors: Amy Pinkham; David Penn; Bethany Wangelin; Diana Perkins; Guido Gerig; Hongbin Gu; Jeffrey Lieberman
Journal: Schizophr Res Date: 2005-08-25 Impact factor: 4.939

2. Disordered corticolimbic interactions during affective processing in children and adolescents at risk for schizophrenia revealed by functional magnetic resonance imaging and dynamic causal modeling.

Authors: Vaibhav A Diwadkar; Sunali Wadehra; Patrick Pruitt; Matcheri S Keshavan; Usha Rajan; Caroline Zajac-Benitez; Simon B Eickhoff
Journal: Arch Gen Psychiatry Date: 2012-03

3. Increased stress-induced dopamine release in psychosis.

Authors: Romina Mizrahi; Jean Addington; Pablo M Rusjan; Ivonne Suridjan; Alvina Ng; Isabelle Boileau; Jens C Pruessner; Gary Remington; Sylvain Houle; Alan A Wilson
Journal: Biol Psychiatry Date: 2011-11-30 Impact factor: 13.382

Review 4. Strange feelings: do amygdala abnormalities dysregulate the emotional brain in schizophrenia?

Authors: André Aleman; René S Kahn
Journal: Prog Neurobiol Date: 2005-12-13 Impact factor: 11.685

5. Facial and vocal affect perception in people at ultra-high risk of psychosis, first-episode schizophrenia and healthy controls.

Authors: G Paul Amminger; Miriam R Schäfer; Claudia M Klier; Monika Schlögelhofer; Nilufar Mossaheb; Andrew Thompson; Andreas Bechdolf; Kelly Allott; Patrick D McGorry; Barnaby Nelson
Journal: Early Interv Psychiatry Date: 2012-06-01 Impact factor: 2.732

Review 6. Right hemisphere language functions and schizophrenia: the forgotten hemisphere?

Authors: Rachel L C Mitchell; Tim J Crow
Journal: Brain Date: 2005-03-02 Impact factor: 13.501

7. Stability of emotional dysfunctions? A long-term fMRI study in first-episode schizophrenia.

Authors: Martina Reske; Thilo Kellermann; Ute Habel; N Jon Shah; Volker Backes; Martina von Wilmsdorff; Tony Stöcker; Wolfgang Gaebel; Frank Schneider
Journal: J Psychiatr Res Date: 2007-04-27 Impact factor: 4.791

8. Fusiform gyrus volume reduction in first-episode schizophrenia: a magnetic resonance imaging study.

Authors: Chang Uk Lee; Martha E Shenton; Dean F Salisbury; Kiyoto Kasai; Toshiaki Onitsuka; Chandlee C Dickey; Deborah Yurgelun-Todd; Ron Kikinis; Ferenc A Jolesz; Robert W McCarley
Journal: Arch Gen Psychiatry Date: 2002-09

Review 9. Do antipsychotic drugs affect brain structure? A systematic and critical review of MRI findings.

Authors: S Navari; P Dazzan
Journal: Psychol Med Date: 2009-04-02 Impact factor: 7.723

10. Social cognition in clinical "at risk" for psychosis and first episode psychosis populations.

Authors: Andrew Thompson; Alicia Papas; Cali Bartholomeusz; Kelly Allott; G Paul Amminger; Barnaby Nelson; Stephen Wood; Alison Yung
Journal: Schizophr Res Date: 2012-09-05 Impact factor: 4.939
