Ventral and dorsal streams in the evolution of speech and language.

Abstract

The brains of humans and old-world monkeys show a great deal of anatomical similarity. The auditory cortical system, for instance, is organized into a ventral and a dorsal pathway in both species. A fundamental question with regard to the evolution of speech and language (as well as music) is whether human and monkey brains show principal differences in their organization (e.g., new pathways appearing as a result of a single mutation), or whether species differences are of a more subtle, quantitative nature. There is little doubt about a similar role of the ventral auditory pathway in both humans and monkeys in the decoding of spectrally complex sounds, which some authors have referred to as auditory object recognition. This includes the decoding of speech sounds ("speech perception") and their ultimate linking to meaning in humans. The originally presumed role of the auditory dorsal pathway in spatial processing, by analogy to the visual dorsal pathway, has recently been conceptualized into a more general role in sensorimotor integration and control. Specifically for speech, the dorsal processing stream plays a role in speech production as well as categorization of phonemes during on-line processing of speech.

Keywords: brain connectivity; cerebral cortex; communication sounds; human; internal models; macaque monkey; music; speech

Year: 2012 PMID: 22615693 PMCID: PMC3351753 DOI: 10.3389/fnevo.2012.00007

Source DB: PubMed Journal: Front Evol Neurosci ISSN: 1663-070X

From an auditory point of view, spoken language starts with the processing of complex auditory signals. Physiological recordings in non-human primates suggest that neurons already at the secondary stage of processing along the auditory cortical pathway (the lateral belt areas) can show a preference for species-specific communication calls (Rauschecker et al., 1995). This response tuning is generated by convergence of input from lower-order neurons that respond to simple sounds like tones, frequency-modulated sweeps, or band-passed noise bursts. Neurons are sensitive to highly specific combinations of such inputs, and combining signals in a non-linear conjunctive AND-logic leads to the existence of neurons that respond specifically to certain types of calls. There is no reason to believe that the human auditory cortex does not contain similar neurons with combination sensitivity and a similar hierarchy from rather simple to more complex neurons, whose incidence increases from primary auditory cortex to more anterior regions of the superior temporal lobe (Rauschecker, 1998; Rauschecker and Tian, 2000). Indeed, early studies of human auditory cortex with functional magnetic resonance imaging (fMRI) have shown that primary auditory cortex responds best to tones, while at the next stage, the equivalent of the lateral belt in the monkey, band-passed noise bursts are more effective stimuli (Wessinger et al., 2001). Further along the antero-ventral pathway, cortical regions are selectively activated by words and intelligible speech sounds (Binder et al., 2000; Scott et al., 2000). This hierarchical organization of the auditory ventral stream with regard to speech-sound processing was recently corroborated with more refined techniques (Chevillet et al., 2011b). 
Furthermore, a meta-analysis of more than 100 neuroimaging studies of human speech processing has demonstrated that cortical regions in the mid-STG near the human lateral belt are sensitive to phonemes; farther afield in anterior STG, words are processed; finally, in the most anterior locations of STS, short phrases lead to selective activation (DeWitt and Rauschecker, 2012). Invariant representation of sounds is another important step toward establishing a usable system for auditory communication, such as speech. There is evidence that invariances are formed along the antero-ventral stream as well (DeWitt and Rauschecker, 2012). However, other reports have found that premotor regions may be involved too (e.g., Chevillet et al., 2011a; Lee et al., 2012). It appears possible, therefore, that invariances are formed in different ways: once on the basis of spectro-temporal information, which is pooled along the frequency domain in the sense of an OR-logic within the auditory ventral stream; and independently in the domain of motor gestures, which are formed originally for speech production, but are invoked during the processing of speech as well. The same is almost inevitably true for the processing of other complex sounds that can be classified into discrete categories (Leaver and Rauschecker, 2010). Such auditory objects are also represented in anterior regions of the STG, but premotor cortex participates in their encoding as long as they can be produced and thus invoke a motor code. Monkeys are naturally handicapped by their less sophisticated vocal apparatus, which limits their vocal repertoire and their capacity to mimic sounds. The involvement of the dorsal pathway (including premotor regions) in the processing and categorization of self-produced sounds will, therefore, have to be tested by other means (Remedios et al., 2009). 
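The two logical operations described above (conjunctive AND-logic for combination sensitivity, and OR-logic pooling across frequency for invariance) can be illustrated with a toy sketch. It is not a model of actual cortical neurons: the feature names (`fm_sweep`, `bp_noise`), the frequency bands, the threshold, and the choice of `min` and `max` as stand-ins for AND and OR are all assumptions made for illustration only.

```python
# Toy illustration only: all names, bands, and thresholds are hypothetical.

def and_unit(inputs, threshold=0.5):
    """Conjunctive (AND-like) unit: silent unless every input exceeds threshold."""
    if all(x > threshold for x in inputs):
        return min(inputs)  # response limited by the weakest component
    return 0.0

def or_unit(inputs):
    """Pooling (OR-like) unit: responds if any one of its inputs responds."""
    return max(inputs, default=0.0)

def call_detector(features, band):
    """Combination-sensitive unit for one frequency band.

    `features` maps (component, band) to an activation in [0, 1].
    The unit needs BOTH an FM sweep and a band-passed noise burst in its band.
    """
    fm = features.get(("fm_sweep", band), 0.0)
    noise = features.get(("bp_noise", band), 0.0)
    return and_unit([fm, noise])

def invariant_detector(features, bands):
    """Pools band-specific detectors so the response tolerates frequency shifts."""
    return or_unit([call_detector(features, band) for band in bands])

BANDS = ["low", "mid", "high"]

full_call_low = {("fm_sweep", "low"): 0.9, ("bp_noise", "low"): 0.8}
full_call_high = {("fm_sweep", "high"): 0.7, ("bp_noise", "high"): 0.9}
partial_call = {("fm_sweep", "low"): 0.9}  # one component missing
mixed_bands = {("fm_sweep", "low"): 0.9, ("bp_noise", "high"): 0.9}

for name, stimulus in [
    ("full call, low band", full_call_low),
    ("full call, high band", full_call_high),
    ("partial call", partial_call),
    ("components in different bands", mixed_bands),
]:
    print(name, invariant_detector(stimulus, BANDS))
```

In this sketch the pooled unit responds to the complete call in either frequency band, but not to a single component or to components that fall into different bands, which is the qualitative behavior the AND-then-OR scheme is meant to capture.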
The involvement of the dorsal auditory pathway, including premotor and inferior parietal regions, in the encoding and representation of temporally extended sounds (or sound sequences) became especially evident when imagery of musical melodies was investigated (Leaver et al., 2009). During the learning of such sequences, the basal ganglia were actively engaged, whereas after these sequences became highly familiar, the same sequences activated more and more prefrontal areas. It appears, therefore, that the basal ganglia are responsible for the concatenation of sequential auditory information or formation of “chunks,” which represent information about conditional probabilities for one sound being followed by another. Once the chunks have been formed, they are once again stored in prefrontal regions. A similar chunking process occurs with cued sequences of learned finger movements (Koechlin and Jubault, 2006). This process involves prefrontal cortex near Broca's area and has, therefore, been compared with models of language (Hagoort, 2005), redefining Broca's area in terms of chunking (“unification”) of semantic, syntactic, and phonological information. Thus, the role of the dorsal stream can be conceptualized into one of sensorimotor integration and control and applies to all kinds of sequential stimuli, even beyond the auditory domain. Specifically for speech, the dorsal processing stream plays a role in speech production as well as categorization of phonemes during on-line processing of speech (Rauschecker and Scott, 2009; Rauschecker, 2011; Figure 1). The former role conforms to the classical idea of an “efference copy” or feed-forward model and allows for fast and efficient on-line control of speech production. By contrast, the latter function can be formalized as an inverse model during real-time speech processing, creating the affordances of the speech signal in a Gibsonian sense (Gibson, 1966; Rauschecker, 2005).
Both functions require a (direct or indirect) connection between sensory and motor cortical structures of the brain, whereby subcortical structures (e.g., the basal ganglia) provide an additional link setting up transitional probabilities during associative learning of sound sequences.
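The notion of "chunks" defined by conditional (transitional) probabilities between successive sounds can be made concrete with a small statistical sketch. It is purely illustrative and makes no claim about how the basal ganglia or prefrontal cortex compute these quantities: the note sequences, the function names, and the boundary threshold of 0.6 are hypothetical choices.

```python
from collections import Counter, defaultdict

def transitional_probabilities(sequences):
    """Estimate P(next sound | current sound) from example sequences."""
    pair_counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            pair_counts[current][following] += 1
    probs = {}
    for current, followers in pair_counts.items():
        total = sum(followers.values())
        probs[current] = {nxt: n / total for nxt, n in followers.items()}
    return probs

def segment_into_chunks(seq, probs, boundary_below=0.6):
    """Start a new chunk wherever the transition probability is low.

    `boundary_below` is an arbitrary, hypothetical threshold.
    """
    if not seq:
        return []
    chunks = [[seq[0]]]
    for current, following in zip(seq, seq[1:]):
        p = probs.get(current, {}).get(following, 0.0)
        if p < boundary_below:
            chunks.append([following])   # unpredictable transition: boundary
        else:
            chunks[-1].append(following)  # predictable transition: same chunk
    return chunks

# Hypothetical training "melodies" built from two motifs, C-D-E and G-A-B.
training = [
    list("CDEGAB"),
    list("GABCDE"),
    list("CDECDE"),
    list("GABGAB"),
]

probs = transitional_probabilities(training)
chunks = segment_into_chunks(list("CDEGAB"), probs)
print(chunks)
```

Within a motif, each sound is always followed by the same successor in this toy data, whereas transitions between motifs are less predictable, so the segmentation recovers the two motifs as separate chunks.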

Figure 1

Ventral and dorsal streams for the processing of complex sounds in the primate brain: (A) in the rhesus monkey [modified from Rauschecker and Tian (2000)]; (B) in the human [simplified from Rauschecker and Scott (2009)]. The ventral stream (in green) plays a general role in auditory object recognition, including perception of vocalizations and speech. The dorsal stream (in red) pivots around inferior/posterior parietal cortex, where a quick sketch of sensory event information is compared with an efference copy of motor plans (dashed lines). Thus, the dorsal stream plays a general role in sensorimotor integration and control. In clockwise fashion, starting out from auditory cortex, the processing loop performs as a forward model: object information, such as vocalizations and speech, is decoded in the antero-ventral stream all the way to category-invariant inferior frontal cortex (IFC, or VLPFC in monkeys) and transformed into articulatory representations (DLPFC or ventral PMC). Frontal activations are transmitted to the IPL and pST, where they are compared with auditory and other sensory information. AC, auditory cortex; AL, antero-lateral area; CL, caudo-lateral area; STS, superior temporal sulcus; IFC, inferior frontal cortex; DLPFC, VLPFC, dorsolateral and ventrolateral prefrontal cortex; PMC, premotor cortex; IPL, inferior parietal lobule; IPS, inferior parietal sulcus; CS, central sulcus; pST, posterior superior temporal region. [Composite figure adapted, with permission, from Rauschecker (2011)].

Comparing human and monkey brain connectivity along the dorsal stream, there may be quantitative differences in the strengths of these connections, but there does not seem to be a difference in principle (Frey et al., 2008).
Similarly, in the ventral stream, the fine-grain organization of cortical areas and the fine-tuning of its neuronal elements may be richer in humans than in monkeys, providing humans with a perceptual network for the detection of more subtle differences in the acoustic signal. The decisive distinction between humans and monkeys may, however, lie in a third component where ventral and dorsal streams converge and interact: the prefrontal network. With its own hierarchical organization it provides the substrate for recursive processing of nested sequences, as they are typical for human grammatical language structures (Friederici, 2004). Again, however, this emergent new ability of humans may be based on a quantitative rather than principal difference in human and monkey brain organization, which ties in the existing strengths of both ventral and dorsal processing streams with fronto-parietal networks underlying working memory.

To test the real evolutionary similarity of human and monkey ventral and dorsal streams, two things have to happen in future studies:

(1) Connectivity studies in both species have to investigate in great detail which areas are connected. This will establish a greater amount of homology than other approaches, especially when the same techniques of structural and functional imaging are utilized. While anatomical tracer studies in monkeys will remain the gold standard (Romanski et al., 1999; Petrides and Pandya, 2009; Hackett, 2011), non-invasive fiber tractography using MRI-based technology will gain increasing importance as its resolution improves, because the exact same approach can be used in both species. Early attempts using diffusion tensor imaging (DTI) have had insufficient power to resolve crossing fibers within a single voxel or disentangle fibers with crossing trajectories (Catani et al., 2005; Croxson et al., 2005; Anwander et al., 2007; Rilling et al., 2008). Such studies have, therefore, remained inconclusive with regard to monkey-human homologies in language evolution. High-angular-resolution techniques, such as diffusion spectrum imaging (DSI), have been utilized successfully in humans (e.g., Frey et al., 2008) and in monkeys (Schmahmann et al., 2007; Wedeen et al., 2008). Cross-validation studies of autoradiographic tract tracing and DSI in monkeys have shown a remarkable concordance of results between tracer studies and DSI (Schmahmann et al., 2007). However, further improvements in resolution and reductions in scan time are certainly needed and possible before DSI studies can become routine. Functional studies based on blood-oxygenation-level-dependent (BOLD) responses are feasible in both species as well (Petkov et al., 2006) and can elucidate connectivity to a certain extent. Microstimulation techniques as another approach to analyze connectivity (Kikuchi et al., 2008), on the other hand, are limited to animal studies.

(2) Behavioral monkey studies have to be designed that test the above concepts and go beyond traditional models. “What” and “where” processing are still characteristic for the two streams, but as generalized models are developed (Rauschecker and Scott, 2009; Rauschecker, 2011), more appropriate monkey studies have to follow. These studies have to focus on the computational transformations that occur between the various processing stages rather than merely the connectivity describing different anatomical pathways.

Conflict of interest statement

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

29 in total

1. Dual streams of auditory afferents target multiple domains in the primate prefrontal cortex.

Authors: L M Romanski; B Tian; J Fritz; M Mishkin; P S Goldman-Rakic; J P Rauschecker
Journal: Nat Neurosci Date: 1999-12 Impact factor: 24.884

2. Hierarchical organization of the human auditory cortex revealed by functional magnetic resonance imaging.

Authors: C M Wessinger; J VanMeter; B Tian; J Van Lare; J Pekar; J P Rauschecker
Journal: J Cogn Neurosci Date: 2001-01-01 Impact factor: 3.225

3. Mechanisms and streams for processing of "what" and "where" in auditory cortex.

Authors: J P Rauschecker; B Tian
Journal: Proc Natl Acad Sci U S A Date: 2000-10-24 Impact factor: 11.205

4. Identification of a pathway for intelligible speech in the left temporal lobe.

Authors: S K Scott; C C Blank; S Rosen; R J Wise
Journal: Brain Date: 2000-12 Impact factor: 13.501

5. Categorical speech processing in Broca's area: an fMRI study using multivariate pattern-based analysis.

Authors: Yune-Sang Lee; Peter Turkeltaub; Richard Granger; Rajeev D S Raizada
Journal: J Neurosci Date: 2012-03-14 Impact factor: 6.167

6. Cortical processing of complex sounds. (Review)

Authors: J P Rauschecker
Journal: Curr Opin Neurobiol Date: 1998-08 Impact factor: 6.627

7. Processing of complex sounds in the macaque nonprimary auditory cortex.

Authors: J P Rauschecker; B Tian; M Hauser
Journal: Science Date: 1995-04-07 Impact factor: 47.728

8. Human temporal lobe activation by speech and nonspeech sounds.

Authors: J R Binder; J A Frost; T A Hammeke; P S Bellgowan; J A Springer; J N Kaufman; E T Possing
Journal: Cereb Cortex Date: 2000-05 Impact factor: 5.357

9. Distinct parietal and temporal pathways to the homologues of Broca's area in the monkey.

Authors: Michael Petrides; Deepak N Pandya
Journal: PLoS Biol Date: 2009-08-11 Impact factor: 8.029

10. Brain activation during anticipation of sound sequences.

Authors: Amber M Leaver; Jennifer Van Lare; Brandon Zielinski; Andrea R Halpern; Josef P Rauschecker
Journal: J Neurosci Date: 2009-02-25 Impact factor: 6.167
