
Auditory processing in the human cortex: An intracranial electrophysiology perspective.

Abstract

OBJECTIVE: Direct electrophysiological recordings in epilepsy patients offer an opportunity to study human auditory cortical processing with unprecedented spatiotemporal resolution. This review highlights recent intracranial studies of human auditory cortex and focuses on its basic response properties as well as modulation of cortical activity during the performance of active behavioral tasks.
DATA SOURCES: Literature review.
REVIEW METHODS: A review of the literature was conducted to summarize the functional organization of human auditory and auditory-related cortex as revealed using intracranial recordings.
RESULTS: The tonotopically organized core auditory cortex within the posteromedial portion of Heschl's gyrus represents spectrotemporal features of sounds with high temporal precision and short response latencies. At this level of processing, high gamma (70-150 Hz) activity is minimally modulated by task demands. Non-core cortex on the lateral surface of the superior temporal gyrus also maintains representation of stimulus acoustic features and, for speech, subserves transformation of acoustic inputs into phonemic representations. High gamma responses in this region are modulated by task requirements. Prefrontal cortex exhibits complex response patterns, related to stimulus intelligibility and task relevance. At this level of auditory processing, activity is strongly modulated by task requirements and reflects behavioral performance.
CONCLUSIONS: Direct recordings from the human brain reveal hierarchical organization of sound processing within auditory and auditory-related cortex.
LEVEL OF EVIDENCE: Level V.

Keywords: Electrocorticography; Heschl's gyrus; high gamma; superior temporal gyrus

Year: 2017 PMID: 28894834 PMCID: PMC5562943 DOI: 10.1002/lio2.73

Source DB: PubMed Journal: Laryngoscope Investig Otolaryngol ISSN: 2378-8038

INTRODUCTION

Auditory cortex in humans occupies the dorsal and lateral surface of the superior temporal gyrus (STG) (Fig. 1). The dorsal surface, termed the superior temporal plane, is buried deep within the Sylvian fissure. Heschl's gyrus (HG) is a major anatomical landmark within the superior temporal plane, oriented obliquely in a posteromedial‐to‐anterolateral direction. HG is bounded anteromedially by the anterior transverse sulcus extending from the circular sulcus, and posterolaterally by Heschl's sulcus, which extends onto the lateral surface of the STG as the transverse temporal sulcus (TTS).

Figure 1

Approximate location of the auditory cortex on the lateral surface and the superior temporal plane in the human brain (top and bottom panels, respectively). The area comprising the core region is shaded in red; areas comprising non‐core regions (putative belt and parabelt areas) are shaded in blue. ATS = anterior temporal sulcus; HG = Heschl's gyrus; HS = Heschl's sulcus; IS = intermediate sulcus; SF = Sylvian fissure; STG = superior temporal gyrus; STS = superior temporal sulcus; TG2 = second transverse gyrus; TTS = transverse temporal sulcus. Modified from Nourski & Brugge.39

On a gross anatomical level, the superior temporal plane exhibits considerable complexity. It is one of the most highly folded regions in the human brain.1 There is a great deal of anatomical variability across individuals and between hemispheres in the same individual.2, 3, 4, 5, 6, 7 While a single HG per hemisphere is observed most frequently, there may be a partially duplicated HG, two completely duplicated gyri, or even three gyri within any single superior temporal plane.3, 4, 5, 6, 8 Further, gross anatomical landmarks are not entirely predictive of underlying cytoarchitecture and functional organization.2, 4, 9, 10 All these factors make human auditory cortex challenging to study.

Experimental animal models have proven extremely valuable in delineating basic organizational principles of auditory cortex (reviewed by Hackett11). In non‐human primates, including marmosets and macaques, auditory cortex has been hierarchically delineated into core (primary and primary‐like), belt, and parabelt fields.
Currently it is not known how exactly this core–belt–parabelt model is reflected in the organization of human auditory cortex.12 Multiple studies have localized human core auditory cortex to the posteromedial portion of HG (approximately two thirds).1, 9, 10, 13, 14 In cases of HG duplication, core auditory cortex is identified in the most anterior gyrus within the superior temporal plane according to cytoarchitectonic criteria.2, 3, 9, 10 However, functional identification with neuroimaging methods shows that core auditory cortex can span both divisions of HG.6 While these studies provide a definition of core auditory cortex in the human, the specific organization of non‐core auditory fields, including homologs of auditory belt and parabelt areas, is at present unclear. Non‐invasive neuroimaging methods (electro‐ and magnetoencephalography, functional magnetic resonance imaging) have contributed greatly to the current understanding of human auditory cortex function. However, the ability of these methods to elucidate the detailed organization of auditory cortex is limited by their spatiotemporal resolution. Moreover, sources of activity inside the brain cannot be unambiguously localized from surface recordings with currently available neuroelectric and neuromagnetic methods. Recording electrophysiological activity directly from the human brain, including auditory and auditory‐related cortex, is possible in neurosurgical patients. Direct intracranial recordings (electrocorticography, ECoG) are made using electrodes implanted in patients’ brains for clinical reasons, usually to localize a potentially resectable seizure focus in medically intractable epilepsy. This provides a unique research opportunity to study the brain with high resolution in time (milliseconds) and space (millimeters).15, 16, 17

METHODS

ECoG allows for simultaneous recording from multiple regions of human auditory cortex. Implanted multicontact electrodes come in a variety of form factors and include penetrating depth electrodes and surface arrays. Penetrating depth electrodes can target HG.18, 19 This provides clinically important coverage of the superior temporal plane and allows for investigation of core auditory cortex and surrounding non‐core areas. Surface arrays that are placed subdurally provide coverage of auditory cortex on the lateral surface of the STG.20, 21 Anatomical reconstruction of electrode locations following implantation is critically important for interpretation of recorded electrophysiological data. Accurate anatomical reconstruction is carried out by co‐registration of pre‐ and post‐implantation structural imaging data using local anatomical landmarks and is aided by intraoperative photography.17, 19 Pooling data from multiple subjects can be done by projecting the electrode locations into a common brain coordinate space (e.g., Montreal Neurological Institute MNI305), and mapping the locations onto a standard brain. This process is complicated by a relatively small number of subjects in any given study, limited coverage of the auditory cortex provided by electrode arrays, and the aforementioned structural complexity and inter‐individual variability. In order to address these issues, statistical techniques such as linear mixed effects models are developed to account for subject differences and anatomical variables.22 Cortical activity spans a broad range of frequencies. This includes relatively low‐frequency activity evoked by the stimulus that is phase‐locked to its temporal features. Such phase‐locked activity can be visualized and measured as the averaged evoked potential (AEP). 
On the other end of the spectrum, higher‐frequency activity, including that in the high gamma frequency range (>70 Hz), has been shown to be crucial for auditory cortical processing.21, 23, 24, 25 Studies in non‐human primates26, 27 and humans28 have established the high gamma band as a surrogate for unit activity. In contrast to low‐frequency cortical activity, activity in gamma (30–70 Hz) and high gamma bands that is modulated (“induced”) by the stimulus is often not phase‐locked to it.21, 23, 29, 30 It has been shown that the absolute magnitude of cortical activity decreases with frequency, following a power law, regardless of whether a stimulus is presented or not.31, 32 This requires specialized analysis techniques that typically include filtering and rectification of the recorded ECoG signal to extract analytic amplitude or power within the specified ECoG frequency band. This is then followed by normalization to a pre‐stimulus baseline.17, 21 The resultant event‐related band power (ERBP) reflects changes in ECoG power within a particular frequency band (such as high gamma) that occur upon stimulus presentation relative to power in a pre‐stimulus reference interval.

Understanding basic electrophysiological properties of human auditory cortex is achieved through a systematic study of responses to simple acoustic stimuli (e.g., pure tones and click trains) and complex sounds including speech. Characterization of onset response latencies makes it possible to test hypotheses about the flow of information across core and non‐core regions within auditory cortex. Differential contributions of these regions to the higher auditory functions (“making sense of sound”33) are studied using active listening experimental paradigms. The present review highlights recent intracranial studies of basic response properties of human auditory cortex as well as modulation of auditory cortical activity during the performance of active behavioral tasks.
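The band‐power extraction outlined in this section (band‐limiting the signal, taking analytic power, and normalizing to a pre‐stimulus baseline) can be illustrated with a minimal sketch. It is not the pipeline of any study cited here: the FFT‐based brick‐wall filter, the decibel scaling, the synthetic data, and the function names `band_power` and `erbp_db` are all assumptions made for illustration.

```python
import numpy as np

def band_power(trials, fs, f_lo, f_hi):
    """Band-limited analytic power per trial; trials has shape (n_trials, n_samples)."""
    n = trials.shape[-1]
    freqs = np.fft.fftfreq(n, d=1.0 / fs)
    spectrum = np.fft.fft(trials, axis=-1)
    # Keep positive frequencies inside the band and double them, so that the
    # inverse FFT yields the band-limited analytic signal (assumed filter design).
    keep = (freqs >= f_lo) & (freqs <= f_hi)
    analytic = np.fft.ifft(spectrum * (2.0 * keep), axis=-1)
    return np.abs(analytic) ** 2

def erbp_db(trials, fs, f_lo, f_hi, baseline):
    """Trial-averaged band power in dB relative to the mean power in `baseline` (a slice)."""
    power = band_power(trials, fs, f_lo, f_hi).mean(axis=0)
    return 10.0 * np.log10(power / power[baseline].mean())

# Synthetic example (hypothetical data): 50 one-second trials sampled at 1 kHz,
# "stimulus" onset at 500 ms, and a random-phase 100 Hz burst from 500 to 800 ms.
rng = np.random.default_rng(0)
fs, n_trials, n_samples = 1000, 50, 1000
t = np.arange(n_samples) / fs
trials = rng.standard_normal((n_trials, n_samples))
burst = (t >= 0.5) & (t < 0.8)
for trial in trials:  # each row is a view, so the in-place update modifies `trials`
    phase = rng.uniform(0.0, 2.0 * np.pi)
    trial[burst] += 2.0 * np.sin(2.0 * np.pi * 100.0 * t[burst] + phase)

# High gamma (70-150 Hz) ERBP, normalized to a 100-400 ms pre-stimulus window.
erbp = erbp_db(trials, fs, 70.0, 150.0, baseline=slice(100, 400))
```

Because the burst has a different phase on every trial, it would largely cancel in the averaged evoked potential, yet it remains visible here because power is computed on single trials before averaging.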

Basic response properties of human auditory cortex

Tonotopy–the orderly spatial arrangement of neurons tuned to different sound frequencies – is one of the fundamental organizational principles of the auditory system34. Hybrid depth electrodes implanted into HG allow for recording of local field potentials that reflect activity of local neural populations, as well as single‐unit activity35 (Fig. 2A, top panel). Using this approach, a high‐to‐low frequency tuning gradient has been described at the single cell level within posteromedial HG36 (Fig. 2A, bottom panel). Further, the tuning of these HG neurons is exquisitely fine (narrow), so that they distinguish sound frequencies differing by just ∼1/6‐1/18 octave37. This fine orderly tuning for sound frequency is consistent with the interpretation of posteromedial HG as core auditory cortex.
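To give a sense of scale for this tuning, a frequency step of 1/n octave above a reference frequency f equals f × (2^(1/n) − 1). The short sketch below evaluates this at a 1 kHz reference; the reference frequency and the function name `octave_step_hz` are illustrative choices, not values taken from the cited study.

```python
def octave_step_hz(f_ref_hz, octave_fraction):
    """Frequency difference (Hz) between f_ref_hz and a tone octave_fraction octaves above it."""
    return f_ref_hz * (2.0 ** octave_fraction - 1.0)

coarse = octave_step_hz(1000.0, 1.0 / 6.0)   # roughly 122 Hz
fine = octave_step_hz(1000.0, 1.0 / 18.0)    # roughly 39 Hz
```

At a 1 kHz reference, the reported tuning therefore corresponds to frequency differences on the order of 40 to 120 Hz.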

Figure 2

Core and non‐core auditory cortical regions exhibit spectral organization. A: Single unit best frequency data recorded from Heschl's gyrus (HG). Top: peristimulus time histogram for an exemplary unit depicting responses to a best‐frequency pure tone stimulus. Stimulus is schematically shown in gray. Bottom: Tracing of superior temporal plane showing the locations of clinical and microwire hybrid depth electrode contacts implanted in HG (large and small circles, respectively). Mean best frequencies for units recorded at three sites within core auditory cortex are indicated (N.R. = no response to pure tone stimuli). Modified from Howard et al.36 B: High gamma responses to pure tones recorded from superior temporal gyrus (STG). Top left: location of the 96‐contact subdural electrode grid implanted over perisylvian cortex, including STG (a and b denote the locations of two sites whose responses are detailed in the right panel). Bottom left: high gamma event‐related band power (ERBP) in response to tone stimuli presented at different frequencies (color bars), simultaneously recorded from the 96‐contact grid (individual plots). High gamma ERBP was averaged within the time window of 100–150 ms after stimulus onset. Right: time‐frequency analysis of cortical activity elicited by pure‐tone stimuli between 0.25 and 8 kHz (top to bottom), recorded from two sites on the STG (left and right columns). Stimulus is schematically shown in gray above time‐frequency plots. Modified from Nourski et al.38

Pure tones also elicit robust frequency‐selective high gamma responses on lateral STG (non‐core auditory cortical areas)38. In contrast to posteromedial HG, lateral STG does not exhibit a clear topographic selectivity gradient for sound frequency. For example, in Figure 2B, site ‘a’, located on the lateral surface of the STG immediately adjacent to the TTS, exhibits the strongest high gamma responses to the 500 Hz pure tones. An adjacent site ‘b’ responds best to 4‐8 kHz stimuli.
In this case, sites that respond best to high frequencies surround a low‐frequency responsive region in a “mirror‐image” pattern (see Fig. 2B, bottom left panel). While such spectral organization is seen in some subjects, lateral STG is more often characterized by more complex, clustered response patterns, wherein low‐ and high‐frequency tones maximally activate different sites. Classification analysis, however, consistently reveals high accuracy for pure tone discrimination based on patterns of high gamma activity recorded from this region38. These findings indicate that high gamma activity on the lateral STG contains sufficient information to differentiate pure tone stimuli, and this region, though often considered tertiary (parabelt) auditory cortex, possibly includes a relatively early non‐core area (e.g. lateral belt) that maintains a topographic representation of sound frequency. Cortical processing of temporal sound features plays a major role in perception of environmental sounds, including speech. Repeated or periodic non‐speech stimuli, such as bursts of sinusoidal amplitude‐modulated noise or sequences (trains) of clicks offer convenient tools to study the representation of temporal sound features in the human auditory cortex39. Depending on click repetition rate within a sequence, click trains elicit distinct percepts, from “events” (at rates below 8‐10 Hz), to “flutter” (between ∼10‐40 Hz), to “pitch” (above ∼30‐40 Hz)40, 41. Additionally, repetitive stimuli presented at rates between ∼20‐300 Hz have a buzzing perceptual quality, which is referred to as “roughness”42, 43. These distinct perceptual classes are reflected in patterns of activity within core auditory cortex, as illustrated in Figure 3A. At low presentation rates, each click elicits a distinct multi‐phasic evoked potential and a burst of high gamma activity, essentially representing the clicks as separate events. 
As the rate increases, and the train is perceived as having a fluctuation‐like or “flutter” quality, the evoked potentials overlap, giving rise to increases in phase‐locked power at the frequency corresponding to the repetition rate – the frequency‐following response (FFR). At even higher rates, when the percept of pitch emerges, an induced (non‐phase‐locked) response in gamma and high gamma bands becomes evident in addition to the sustained FFR. This emergence of induced high gamma activity in responses to temporally regular stimuli has been considered a physiological correlate of pitch44. On the other hand, the presence of the FFR in core auditory cortical activity and its dissipation at higher repetition rates parallels the perceptual boundaries of roughness24, 39 (see also Fishman et al.45).
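The distinction drawn above between phase‐locked activity (the FFR) and induced, non‐phase‐locked activity can be illustrated with inter‐trial phase coherence (ITPC), a standard measure of phase consistency across trials. This is a sketch on synthetic data; the choice of ITPC, the function name `itpc_at`, and all signal parameters are assumptions for illustration and are not taken from the studies reviewed here.

```python
import numpy as np

def itpc_at(trials, fs, freq_hz):
    """Inter-trial phase coherence (0 to 1) at the FFT bin nearest freq_hz."""
    n = trials.shape[-1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    k = np.argmin(np.abs(freqs - freq_hz))
    coeffs = np.fft.rfft(trials, axis=-1)[:, k]
    # Discard amplitude, keep phase, and average the unit vectors across trials.
    return np.abs(np.mean(coeffs / np.abs(coeffs)))

# Hypothetical data: 60 one-second trials at 1 kHz containing a 40 Hz component
# with identical phase on every trial and a 100 Hz component with random phase.
rng = np.random.default_rng(1)
fs, n_trials, n_samples = 1000, 60, 1000
t = np.arange(n_samples) / fs
trials = rng.standard_normal((n_trials, n_samples))
for trial in trials:  # rows are views, so in-place addition modifies `trials`
    trial += np.sin(2.0 * np.pi * 40.0 * t)
    trial += np.sin(2.0 * np.pi * 100.0 * t + rng.uniform(0.0, 2.0 * np.pi))

itpc_locked = itpc_at(trials, fs, 40.0)    # close to 1: phase-locked
itpc_induced = itpc_at(trials, fs, 100.0)  # low: power present, phase inconsistent
```

Both components carry the same single‐trial power; only the phase‐locked one survives averaging across trials, which is why evoked and induced activity need separate measures.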

Figure 3

Auditory cortical regions differ in their capacity to track temporal modulations. A: Responses to 1 s click trains from core auditory cortex. Top: location of three exemplary recording sites: a core cortex site in medial Heschl's gyrus (HG), a site in a non‐core field on lateral HG and a superior temporal gyrus (STG) site (marked by a, b, and c, respectively). Bottom: averaged evoked potentials (AEPs) and event‐related band power (ERBP) obtained from site a in response to click trains presented at rates between 4 and 128 Hz (top to bottom rows). Stars indicate off‐response complexes. Arrowheads indicate driving frequencies and their harmonics at which increases in phase‐locked ERBP are seen. Stimulus schematics are shown above each ERBP plot. Modified from Brugge et al.24 B: AEPs and ERBP obtained from the three sites in response to 160 ms click trains presented at rates between 25 and 200 Hz (top to bottom rows). Modified from Nourski et al.49 C: Locations of physiologically defined posteromedial and anterolateral HG sites (red and blue symbols, respectively) in six subjects (different symbol shapes), plotted in Montreal Neurological Institute (MNI) coordinate space and projected onto the FreeSurfer average template brain. Recording sites were assigned to the posteromedial HG region of interest based on the presence of a frequency‐following response (FFR) to 100 Hz click trains and short‐latency (<20 ms) average evoked potential components. Modified from Nourski et al.50

The FFR, which represents neural phase locking to the temporal regularity of the stimulus, provides a useful physiological marker for cortical field delineation46, 47, 48. A comparison between response profiles of three auditory cortical sites is shown in Figure 3B. Intracranial electrophysiology studies using click train stimuli indicate that human auditory core cortex (site ‘a’ in Fig. 3B) can phase‐lock up to at least 200 Hz, non‐core cortex on STG (site ‘c’) has a more limited phase‐locking capacity, while anterolateral HG (site ‘b’) exhibits little‐to‐no phase locking24, 49. Pooling anatomical data across subjects (Fig. 3C) demonstrates that field delineation within HG based on physiological response properties in individual subjects translates into anatomically distinct regions on the population (across‐subject) level50. This supports the reliability of the physiology‐based operational definitions of posteromedial (core) and anterolateral (non‐core) HG cortex24, 47, 50. On the lateral STG, spatial distribution of phase‐locked responses to click trains is more variable, although a common organizational feature in individual subjects is that sites that feature FFRs typically cluster around the TTS49.

Simple non‐speech sounds such as pure tones or click trains also help to better understand patterns of neural activity elicited by speech. Auditory core cortex simultaneously represents temporal features of speech over multiple time scales, including its syllabic structure, voicing of stop consonants and voice fundamental frequency (F 0) (Fig. 4) (see also Arnal et al.51). Figure 4A shows examples of responses to two speech sentences recorded from an electrode implanted in posteromedial HG. The temporal speech envelope is dominated by its syllabic structure (∼200‐250 ms long segments that correspond to individual syllables). Auditory core represents this relatively slowly time‐varying signal both in the temporal cadence of the evoked activity (black plots) and in the modulation of high gamma activity (color plots). The latter pattern (“tracking” of speech envelope by high gamma activity) is preserved even when speech is time‐compressed (accelerated) up to five times its normal rate, rendering it unintelligible52.

Figure 4

Temporal sound features of speech are represented in core auditory cortex. A: Representation of speech temporal envelope. Stimulus waveforms (gray) and temporal envelopes (black) of the two speech sentences are shown above averaged evoked potential (AEP) waveforms and event‐related band power (ERBP) time‐frequency plots. B: Representation of stop consonant voicing. Left column: waveforms corresponding to the first 200 ms following each initial stop consonant (top to bottom) in the two sentences shown in (A). Middle and right column: AEP and high gamma (70–150 Hz) ERBP responses, respectively, that correspond to each initial stop consonant. Red and blue plots correspond to voiced and unvoiced consonants, respectively. Gray horizontal lines represent the voice‐onset time. C: Phase locking to voice F 0. Stimulus waveforms (black) of the two speech syllables /had/, spoken by a male and a female talker (upper and lower waveforms, respectively) are shown above ERBP time‐frequency plots with superimposed pitch contours (black curves). Modified from Steinschneider et al.54

Finer‐grain temporal features, such as voicing onset of stop consonants (tens of milliseconds for unvoiced stops), are also represented in both the AEP and high gamma activity at the level of core cortex53, 54. This is illustrated in Figure 4B, where the same two sentences (see Fig. 4A) are segmented into individual words. Here, words beginning with a voiced stop consonant (red waveforms) tend to elicit initial responses with a single peak, while words beginning with an unvoiced stop (blue waveforms) elicit double‐peaked responses. Finally, consistent with results of click train studies, auditory core cortex can exhibit FFRs to the voice F 0. While FFRs are reliably observed in response to male talkers with relatively low F 0, this is not typically the case for female talkers with higher F 0s (Fig. 4C).
Intracranial electrophysiology studies demonstrate that neural representations of basic spectrotemporal features of speech in auditory core cortex are remarkably similar in non‐human primates and humans53, 54 (see also Reser & Rosa57). The similarity in representation of speech at this cortical level is present despite the vastly different experience with language between humans and non‐human primate species. This indicates that activity within auditory core cortex is likely dominated by general auditory, rather than language‐specific, processing. Non‐core auditory areas on the lateral STG exhibit a somewhat more limited capacity for isomorphic representation of speech temporal acoustic features. Speech envelope‐following responses have been described at this processing stage58, 59, 60, paralleling non‐invasive studies (reviewed by Peelle & Davis61). Voicing of stop consonants is reflected in the temporal patterns of the AEP recorded from the lateral STG62. There is, however, no evidence for reliable FFRs to the voice F 0 at this level of cortical processing62, 63. Focal electrical stimulation of STG via subdural electrodes differentially affects the discrimination of vowels and consonants, disrupting the former, but not the latter64. This suggests that consonants and vowels at the level of STG are represented as distinct perceptual phenomena. Stimulus sets that are sufficiently rich in their spectral content and temporal modulations (such as spoken words or sentences) can be used to define spectrotemporal receptive fields (STRFs) of individual cortical sites. STRF‐based analysis65, 66 can describe the tuning properties of individual cortical sites and characterize spectrotemporal organization of human auditory cortex, predict responses to novel stimuli and, conversely, reconstruct the presented stimuli from patterns of cortical activity28, 67, 68, 69. 
Speech‐derived STRFs have been used by Hullett et al.69 to characterize sensitivity of the STG to spectrotemporal modulations. A posterior‐to‐anterior gradient was identified along the length of the STG, wherein posterior STG was tuned for fast temporal and low spectral modulation, while anterior STG represented slow temporal and high spectral modulation. Examination of spectral organization using this approach yielded similar results to those obtained using pure‐tone stimuli38, wherein spectral tuning maps in some subjects were reminiscent of a high‐low‐high mirror‐image pattern, but more often lacked clear frequency gradients69. Using STRF‐based reconstruction models, Pasley et al.67 were able to decode speech spectrograms from patterns of cortical high gamma activity. Cortical sites that contributed the most to reconstruction were localized to the posterolateral portion of STG. Speech synthesis based on reconstruction of speech spectrograms from auditory cortical high gamma activity using linear models is feasible in real time70. Mesgarani et al.68 characterized distributed spatiotemporal patterns of high gamma activity on the STG during listening to speech. Speech‐responsive sites on the STG were found to be selective to specific phoneme groups. Phonemic feature selectivity at this level of cortical processing was proposed to result from neural tuning to signature spectrotemporal cues.

The latency of response onset is a basic electrophysiological property that permits inferences regarding the relationship of different auditory cortical areas within the core‐belt‐parabelt processing hierarchy13, 71. Results obtained in non‐human primates predict that core areas would exhibit the shortest onset latencies, while non‐core areas would be characterized by progressively longer latencies72, 73, 74. Simultaneous recording from HG and lateral STG allows for systematic analysis of response latencies22 (Fig. 5).
Onset latencies of high gamma responses to a speech syllable /da/ increase systematically along HG (sites ‘a’‐‘d’), and are the longest in its anterolateral portion. However, latencies on lateral STG (sites ‘e’‐‘h’) are shorter than those in anterolateral HG. Warping individual electrode locations from individual subjects into a common brain space reveals increases in latency along and across HG, a U‐shaped latency distribution along STG and longer latencies in the anterolateral third of the HG compared to middle portion of the STG (Fig. 5B). The latter two findings provide further evidence for the hypothesis that a portion of STG represents a non‐core field that is relatively early in the processing hierarchy.
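One simple way to operationalize an onset latency is a threshold‐crossing rule applied to the trial‐averaged ERBP. The sketch below is a hypothetical criterion, not the method used in the cited study: the threshold of three baseline standard deviations, the requirement of 10 consecutive supra‐threshold samples, the function name `onset_latency_ms`, and the synthetic waveforms are all assumptions for illustration.

```python
import numpy as np

def onset_latency_ms(erbp, times_ms, n_sd=3.0, min_run=10):
    """Return the first post-stimulus time at which `erbp` stays above
    baseline mean + n_sd * SD for min_run consecutive samples, or None."""
    baseline = erbp[times_ms < 0]
    threshold = baseline.mean() + n_sd * baseline.std()
    run = 0
    for i in np.flatnonzero(times_ms >= 0):
        run = run + 1 if erbp[i] > threshold else 0
        if run == min_run:
            # Report the start of the supra-threshold run, not its end.
            return float(times_ms[i - min_run + 1])
    return None

# Synthetic, noise-free example: a small ripple plus a 5 dB step at 40 ms.
times_ms = np.arange(-200, 500)  # 1 ms resolution, stimulus onset at 0 ms
ripple = 0.2 * np.sin(2.0 * np.pi * times_ms / 50.0)
response = ripple + np.where(times_ms >= 40, 5.0, 0.0)

latency = onset_latency_ms(response, times_ms)   # 40.0
no_response = onset_latency_ms(ripple, times_ms)  # None
```

Requiring a run of consecutive samples guards against isolated threshold crossings; with real data, the values of `n_sd` and `min_run` trade sensitivity against false early detections.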

Figure 5

Onset response latencies on lateral superior temporal gyrus (STG) are shorter than in anterolateral Heschl's gyrus (HG). A: Data from a representative subject. Left: location of recording contacts in the superior temporal plane (top) and over perisylvian cortex (bottom). Right: High gamma event‐related band power (ERBP) recorded from eight representative sites (a through h) in response to speech syllable /da/. Thick lines and gray shading correspond to cross‐trial mean ERBP and its 95% confidence interval, respectively. Arrows indicate measured high gamma onset response latency. B: Model predictions of onset response latencies to the syllable /da/ in HG (top) and lateral STG (bottom). FreeSurfer template brain is shown on the left. Predictions are based on data from 11 subjects and are bound by the convex envelopes of locations of responsive cortical sites. Modified from Nourski et al.22

Studies that simultaneously examined both relatively low (<20 Hz) and higher frequency auditory cortical activity revealed differences in the ways different ECoG bands contribute to representation of stimulus acoustic attributes and sound perception52, 75, 76, 77. Speech processing within the human auditory cortex is subserved by oscillatory activity on multiple time scales75, 76. The relative function of slow and fast cortical oscillations has been shown to exhibit hemispheric asymmetry and hierarchical organization75. High gamma activity featured increasing left lateralization along the auditory cortical hierarchy (from core auditory cortex in posteromedial HG to non‐core cortex on the lateral STG). Transition from evoked (phase‐locked to the stimulus) to induced (non‐phase‐locked) activity along the auditory cortical hierarchy was also more prominent in the left hemisphere.
Phase‐amplitude coupling – amplitude modulation of faster (25‐45 Hz) oscillatory activity by the phase of the slower (4‐8 Hz) activity at the level of STG, was interpreted as transition from acoustic feature tracking to abstract phonological processing. The study of Morillon et al.75 supports a model wherein the ongoing speech signal is simultaneously parsed by left‐lateralized low gamma and right‐lateralized high theta oscillations, with left auditory cortex extracting faster phonemic features, and right auditory cortex parsing speech information at a syllabic rate. Fontolan et al.76 analyzed cross‐frequency regional coupling and showed that reciprocal functional connectivity between core auditory cortex and STG is dominated by activity in distinct ECoG frequency bands, with top‐down and bottom‐up information transfer subserved by relatively low (<40 Hz) and high (>40 Hz) ECoG bands, respectively. Modulation of local gamma activity by the phase of distant delta (1‐3 Hz) activity in the top‐down direction exhibited left hemisphere dominance. Further, directional information transfer was not continuous, but alternated between bottom‐up and top‐down directions at a rate of 1‐3 Hz. Taken together, the studies of Morillon et al.75 and Fontolan et al.76 indicate that both different ECoG frequencies and temporal windows are used for directional information transfer in the human auditory cortex, and that this process is asymmetric between the two hemispheres.
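Phase‐amplitude coupling of the kind described above (amplitude of 25‐45 Hz activity modulated by the phase of 4‐8 Hz activity) can be quantified in several ways. The sketch below uses a normalized mean vector length, one common choice; it is an assumption here and not necessarily the measure used by Morillon et al. or Fontolan et al. The FFT‐based filter, the function names `analytic_band` and `pac_mvl`, and the synthetic signals are likewise illustrative.

```python
import numpy as np

def analytic_band(x, fs, f_lo, f_hi):
    """Band-limited analytic signal via an FFT brick-wall filter (assumed design)."""
    freqs = np.fft.fftfreq(x.size, d=1.0 / fs)
    keep = (freqs >= f_lo) & (freqs <= f_hi)
    return np.fft.ifft(np.fft.fft(x) * (2.0 * keep))

def pac_mvl(x, fs, phase_band, amp_band):
    """Normalized mean vector length: near 0 without coupling, larger with coupling."""
    phase = np.angle(analytic_band(x, fs, *phase_band))
    amp = np.abs(analytic_band(x, fs, *amp_band))
    return np.abs(np.mean(amp * np.exp(1j * phase))) / np.mean(amp)

# Hypothetical signals: 10 s at 500 Hz with a 6 Hz slow rhythm and a 35 Hz fast rhythm.
rng = np.random.default_rng(2)
fs = 500
t = np.arange(10 * fs) / fs
slow_phase = 2.0 * np.pi * 6.0 * t
slow = np.cos(slow_phase)
fast = np.cos(2.0 * np.pi * 35.0 * t)
noise = 0.1 * rng.standard_normal(t.size)

# Coupled: fast amplitude peaks at the crest of the slow rhythm.
coupled = slow + 0.5 * (1.0 + np.cos(slow_phase)) * fast + noise
# Uncoupled: same rhythms, constant fast amplitude.
uncoupled = slow + 0.5 * fast + noise

pac_coupled = pac_mvl(coupled, fs, (4.0, 8.0), (25.0, 45.0))
pac_uncoupled = pac_mvl(uncoupled, fs, (4.0, 8.0), (25.0, 45.0))
```

The amplitude band is wide enough to contain the modulation sidebands (35 ± 6 Hz); a band narrower than twice the slow frequency would filter the coupling out before it could be measured.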

Task‐related modulation of auditory cortical activity

Using active listening paradigms such as auditory target detection, it is possible to characterize the extent to which auditory processing across different cortical areas is affected by task demands.78, 79, 80, 81, 82 In the study of Chang et al.,78 modest enhancement of high gamma activity was observed on the lateral surface of the superior temporal gyrus (STG) in response to target syllable stimuli in a phonemic categorization task. In the studies of Steinschneider et al.79 and Nourski et al.,80 subjects listened to words and tones presented in random order in a number of target detection tasks that required acoustic, phonemic, and semantic processing. The tasks included tone detection (“press the button whenever you hear a tone”) and semantic categorization (e.g., “press the button whenever you hear an animal word”). The experiment design allowed for comparisons of responses to the same class of stimuli (e.g., all instances of the words /cat/ and /dog/) presented in different experimental contexts (Fig. 6A). Early activity in posteromedial HG (site ‘a’) is minimally affected by the task, yet can feature target‐specific increases in late activity that follow the behavioral response (button press). Anterolateral HG (site ‘b’), in contrast, can exhibit target‐specific responses that precede the behavioral response. Activity on STG (site ‘c’) is strongly modulated by the task. Responses to non‐target words are greater during semantic categorization than tone detection and are further enhanced to target words. Auditory‐related cortex on supramarginal gyrus (site ‘d’) and prefrontal cortex in inferior and middle frontal gyri (IFG and MFG; sites ‘e’ and ‘f’, respectively) exhibit the most complex patterns, and often respond selectively to target stimuli prior to the behavioral response.
When responses to target stimuli are divided into trials that yielded fast behavioral responses, slow responses, or were missed altogether, the magnitude of STG high gamma responses and the timing and magnitude of prefrontal high gamma activity reflect the behavioral performance (i.e., reaction times).79, 80 These findings are corroborated by a related study that examined responses to pure tone stimuli in a target detection task.81

Figure 6

Task demands have differential effects on auditory and auditory‐related cortical areas. A: Responses to speech stimuli /cat/ and /dog/ are shown for representative left hemisphere recording sites (a through e) in posteromedial, anterolateral Heschl's gyrus (HG), superior temporal gyrus (STG), supramarginal gyrus (SMG), inferior and middle frontal gyrus (IFG, MFG). Colors (blue, green, and red) represent different task conditions. Lines and shaded areas represent mean high gamma event‐related band power (ERBP) and its standard error, respectively. Gray box denotes stimulus duration (300 ms). Horizontal box plot denotes the timing of behavioral responses to the target stimuli (median, 10th, 25th, 75th, and 90th percentiles). Modified from Steinschneider et al.79 B: Spatial distribution of task and target effects (blue and yellow symbols, respectively). Data from 10 cerebral hemispheres in 9 subjects are plotted in Montreal Neurological Institute (MNI) coordinate space and projected onto FreeSurfer average template brain. Left and right panels depict 212 sites in the left hemisphere and 205 sites in the right hemisphere, respectively. Open symbols indicate sites that exhibited significant high gamma responses to the word stimuli, but did not exhibit either task or target effect. Letters a through e denote projections of the recording sites presented in panel A onto the average template brain. Modified from Nourski et al.81

Analysis of data from multiple subjects demonstrates that the task effect (responses to non‐target words in a control tone detection task vs. semantic categorization tasks) is most commonly localized to the STG81 (Fig. 6B). Target effect (target vs. non‐target words in a semantic task), on the other hand, is more prominent in surrounding auditory‐related areas, including middle temporal and supramarginal gyri, as well as IFG and MFG. Both task and target effects are more prominent in the left hemisphere than in the right.
Taken together, these findings support left‐lateralized hierarchical organization of speech processing at the cortical level, wherein acoustic, phonemic, and semantic processing are primarily subserved by core, non‐core, and auditory‐related cortex, respectively.

Tracking of the speech temporal envelope by non‐core auditory cortex, introduced in the previous section, is also affected by task and attention. When the speech signal consists of two concurrent talkers, high gamma activity on the STG emphasizes spectrotemporal features of the speech of the talker to whom the listener is attending.83, 84, 85 Activity on STG thus does not merely reflect acoustic properties of speech, but also relates strongly to its perceived aspects, including attentional focus. This suggests that non‐core auditory cortex plays a key role in complex listening situations, such as a “cocktail party” environment.

Activity in anterolateral HG and STG is attenuated when subjects hear self‐generated speech compared to listening to recorded speech56, 63, 86 or hearing an interlocutor's speech during a conversation.50 These observations support the idea that suppression of cortical activity to self‐initiated speech is an emergent property of human non‐core auditory cortex.

Auditory‐related cortex within anterior temporal lobe exhibits further specialization during verbal communication. Low‐frequency (3–5 and 8–12 Hz bands) oscillatory activity within anterior temporal lobe exhibits differential response patterns during conversations with different interlocutors (life partners vs. attending physicians), representing a neural signature of communication that emerges at the level of auditory‐related cortex.87

Differential involvement of cortical regions in processing of auditory stimuli that are spectrally impoverished (as encountered by cochlear implant users) can be studied in normal‐hearing subjects using noise‐vocoded speech.88, 89 Ongoing research indicates that spectrally impoverished speech is processed hierarchically across cortical regions.90 The intelligibility of noise‐vocoded speech increases with the amount of spectral information available to the listener.91, 92 When noise‐vocoded speech is presented in a two‐alternative forced‐choice task, activity in core auditory cortex is comparable across stimulus conditions (site ‘a’ in Fig. 7). In contrast, anterolateral HG responds selectively to the natural (unprocessed) sounds (site ‘b’ in Fig. 7). STG exhibits a variety of response patterns, with responses becoming progressively more selective for clear speech in more anterior regions (sites ‘c’–‘e’ in Fig. 7). In the frontal lobe, activity in IFG and MFG is strongly affected by the stimulus condition. Some sites respond selectively to all intelligible, rather than just clear, stimuli (sites ‘f’ and ‘g’ in Fig. 7), while others appear to respond preferentially to vocoded but not natural speech, perhaps reflecting increased difficulty and effort (site ‘h’ in Fig. 7). Taken together, these findings further illustrate the tiered organization of auditory cortical processing within and beyond human auditory cortex.

Figure 7

Responses to spectrally degraded speech are affected by intelligibility and task difficulty. Left: Locations of representative recording sites in Heschl's gyrus (HG), on superior temporal gyrus (STG), inferior and middle frontal gyrus (IFG, MFG) in three subjects. Right: Event‐related band power (ERBP) time‐frequency plots depicting responses to noise‐vocoded (1 and 4 bands) and natural speech stimuli /aba/, recorded from sites a through h (top to bottom). Stimulus spectrograms are shown on top.
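The event‐related band power (ERBP) measure used in Figures 6 and 7 is not defined in this text. As a rough sketch, assuming ERBP is expressed in decibels relative to mean power in a prestimulus baseline window (a common convention, but an assumption here), trial‐averaged ERBP and its standard error could be computed as below. The function names and the power values are hypothetical, and the band‐pass filtering that would produce the high gamma power envelope is assumed to have been done upstream.

```python
from math import log10, sqrt
from statistics import mean, stdev


def erbp_db(power, baseline_samples):
    """Express band power in dB relative to the mean prestimulus power.

    Assumes the first `baseline_samples` values of `power` precede stimulus
    onset; this normalization is an assumption, not the source's definition.
    """
    ref = mean(power[:baseline_samples])
    return [10.0 * log10(p / ref) for p in power]


def mean_and_sem(trials_db):
    """Average ERBP across trials at each time point, with standard error."""
    n = len(trials_db)
    avg, sem = [], []
    for samples in zip(*trials_db):
        avg.append(mean(samples))
        sem.append(stdev(samples) / sqrt(n) if n > 1 else 0.0)
    return avg, sem


# Hypothetical high gamma power envelopes (arbitrary units) for two trials;
# the first two samples of each trial form the prestimulus baseline.
trials_power = [
    [1.0, 1.0, 10.0, 100.0],
    [2.0, 2.0, 20.0, 2.0],
]
trials_db = [erbp_db(p, baseline_samples=2) for p in trials_power]
avg, sem = mean_and_sem(trials_db)
print("mean ERBP (dB):", [round(v, 2) for v in avg])
print("SEM (dB):", [round(v, 2) for v in sem])
```

Normalizing each trial to its own baseline before averaging puts sites and trials with different absolute power levels on a common scale, which is what allows traces from different recording sites to be compared.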

SUMMARY

Direct recordings from the human brain reveal hierarchical organization of sound processing within auditory and auditory‐related cortex. The tonotopically organized core auditory cortex (posteromedial HG) represents spectrotemporal features of sounds with high temporal precision and short response latencies. At this level of processing, activity is minimally modulated by task, context or attention level. While non‐core cortex on lateral STG also represents stimulus acoustic features, its activity is modulated by task requirements. Finally, auditory‐related prefrontal areas (IFG and MFG) exhibit complex response patterns, related to stimulus intelligibility and task relevance. Responses in these cortical regions are strongly modulated by task requirements and correlate with behavioral performance.

