Baishen Liang1,2, Yi Du1,2.
Abstract
In tonal languages such as Chinese, lexical tone serves as a phonemic feature that determines word meaning. At the same time, it resembles prosody in its suprasegmental pitch variations and larynx-based articulation. This important yet mixed nature of lexical tone has prompted considerable research, but no consensus has been reached on its functional neuroanatomy. This meta-analysis aimed to uncover the neural network of lexical tone perception in comparison with those of phoneme and prosody in a unified framework. Independent Activation Likelihood Estimation meta-analyses were conducted for different linguistic elements: lexical tone by native tonal language speakers, lexical tone by non-tonal language speakers, phoneme, word-level prosody, and sentence-level prosody. Results showed that lexical tone and prosody studies demonstrated more extensive activations in the right than the left auditory cortex, whereas the opposite pattern was found for phoneme studies. Only tonal language speakers consistently recruited the left anterior superior temporal gyrus (STG) for processing lexical tone, an area implicated in phoneme processing and word-form recognition. Moreover, an anterior-lateral to posterior-medial gradient of activation as a function of element timescale was revealed in the right STG, in which the activation for lexical tone lay between that for phoneme and that for prosody. Another topological pattern was shown in the left precentral gyrus (preCG), with the activation for lexical tone overlapping that for prosody but ventral to that for phoneme. These findings provide evidence that the neural network for lexical tone perception is a hybrid of those for phoneme and prosody.
That is, resembling prosody, lexical tone perception, regardless of language experience, involved the right auditory cortex, with activation localized between sites engaged by phonemic and prosodic processing, suggesting a hierarchical organization of representations in the right auditory cortex. For tonal language speakers, lexical tone additionally engaged the left STG lexical mapping network, consistent with the phonemic representation. Similarly, when processing lexical tone, only tonal language speakers engaged the left preCG site implicated in prosody perception, consistent with tonal language speakers having stronger articulatory representations for lexical tone in the laryngeal sensorimotor network. A dynamic dual-stream model for lexical tone perception was proposed and discussed.
Keywords: lexical tone; meta-analysis; neuroimaging; phoneme; prosody; speech perception
Year: 2018 PMID: 30087589 PMCID: PMC6066585 DOI: 10.3389/fnins.2018.00495
Source DB: PubMed Journal: Front Neurosci ISSN: 1662-453X Impact factor: 4.677
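The Activation Likelihood Estimation (ALE) step named in the abstract can be illustrated with a minimal sketch. This is a simplified textbook form of ALE, not the authors' pipeline: each experiment's foci are blurred into a modeled activation (MA) map, and the ALE value at a voxel is the probability that at least one experiment activates it. The foci, the grid, `fwhm_mm`, and `peak_prob` are all hypothetical choices made for illustration.

```python
import math


def modeled_activation(foci, grid, fwhm_mm=10.0, peak_prob=0.5):
    """MA map for one experiment: at each grid point, the largest
    Gaussian-weighted value contributed by any of its foci.
    fwhm_mm and peak_prob are illustrative assumptions."""
    sigma = fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    ma = []
    for point in grid:
        best = 0.0
        for focus in foci:
            d2 = sum((p - f) ** 2 for p, f in zip(point, focus))
            best = max(best, peak_prob * math.exp(-d2 / (2.0 * sigma ** 2)))
        ma.append(best)
    return ma


def ale_map(experiments, grid, fwhm_mm=10.0, peak_prob=0.5):
    """ALE at each grid point: 1 - product over experiments of (1 - MA)."""
    ma_maps = [modeled_activation(f, grid, fwhm_mm, peak_prob) for f in experiments]
    ale = []
    for i in range(len(grid)):
        not_active = 1.0
        for ma in ma_maps:
            not_active *= 1.0 - ma[i]
        ale.append(1.0 - not_active)
    return ale


# Hypothetical toy data: a line of grid points along x (in mm) and three
# experiments with one focus each. Two foci lie close together.
grid = [(x, 0, 0) for x in range(-10, 22, 2)]
experiments = [[(10, 0, 0)], [(8, 0, 0)], [(-6, 0, 0)]]
ale = ale_map(experiments, grid)

near_two_foci = ale[grid.index((10, 0, 0))]
near_one_focus = ale[grid.index((-6, 0, 0))]
print(near_two_foci > near_one_focus)
```

The point of the sketch is that convergence across experiments, rather than the strength of any single focus, drives the ALE value: the location where two experiments report nearby foci scores higher than the location supported by one experiment.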
A summary of different speech elements.
| Linguistic element | Subtype | Segmental or suprasegmental | Acoustic cue | Place of articulation | Function |
| --- | --- | --- | --- | --- | --- |
| Lexical tone (tone phoneme) | | Suprasegmental | Level and contour of the fundamental frequency | Larynx | Determine lexical meaning |
| Segmental phoneme | Consonant | Segmental | Voice onset time and formant transitions | Tongue, lips, teeth, palate | Determine lexical meaning |
| | Vowel | Segmental | Regions of the 1st and 2nd formants | Tongue, lips | Determine lexical meaning |
| Prosody | Word prosody | Suprasegmental | Level and contour of the fundamental frequency | Larynx | Pragmatic (intonation, stress, rhythm, emotion) |
| | Sentence prosody | Suprasegmental | Level and contour of the fundamental frequency | Larynx | Pragmatic (intonation, stress, rhythm, emotion) |
Figure 1. Procedure of selection. Papers selected from PubMed and previous meta-analysis were screened manually following the criteria. Contrasts from the selected papers were grouped into five categories, each of which was then entered into a meta-analysis independently. Note that the number of studies entered into the analysis was smaller than the sum of contrasts across conditions, because one selected study may contain more than one contrast.
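The counting point in the Figure 1 note can be made concrete with a small sketch. The study labels and condition assignments below are hypothetical placeholders, not the studies actually included; the only purpose is to show why the number of unique studies is smaller than the summed number of contrasts.

```python
from collections import Counter

# Hypothetical (study, condition) pairs. One study can contribute
# contrasts to several conditions, or several to the same condition.
contrasts = [
    ("study_A", "tonal tone"),
    ("study_A", "non-tonal tone"),
    ("study_A", "phoneme"),
    ("study_B", "word prosody"),
    ("study_B", "sentence prosody"),
    ("study_C", "sentence prosody"),
]

per_condition = Counter(condition for _, condition in contrasts)
n_contrasts = sum(per_condition.values())
n_studies = len({study for study, _ in contrasts})

print(dict(per_condition))
print(n_studies, n_contrasts)
```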
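Details of studies recruited in the meta-analysis.
| Study | Task | Control condition | Participants: N/native language (sex, age) | Source of foci | No. of foci |
| --- | --- | --- | --- | --- | --- |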
| Gandour et al., | Discrimination judgement of Thai tones | Silence | 5/Thai (3F, mean 25.2 yrs) | Table 2 | 5 |
| Gandour et al., | Discrimination judgement of Thai tones | Silence | 5/Chinese (2F, mean 25.4 yrs) | Table 2 | 6 |
| Gandour et al., | Discrimination judgement of Thai tones | Passive listening to hums | 10/Thai (5F, mean 25.8 yrs) | Table 2 | 1 |
| Gandour et al., | Discrimination judgement of Mandarin tones | Passive listening to hums | 10/Mandarin (5F, mean 27.3 yrs) | Table 3 | 2 |
| Hsieh et al., | Discrimination judgement of Mandarin tones | Passive listening to speech contour | 10/Mandarin (4F, mean 24.9 yrs) | Table 3 | 5 |
| Klein et al., | Discrimination judgement of Mandarin tones | Silence | 12/Mandarin (6F, not provided) | Table 1 | 13 |
| Li et al., | Matching judgement of Mandarin tones | Syllable discrimination | 12/Mandarin (6F, 23–32 yrs) | Table 2 | 5 |
| Li et al., | Matching judgement of Mandarin tones (random position) | Matching judgement of Mandarin tones (fixed position) | 12/Mandarin (6F, 23–32 yrs) | Table 2 | 5 |
| Nan and Friederici, | Tone congruity judgment of Mandarin phrases | Tone congruity judgment of musical phrases | 18/Mandarin (18F, 20.8 yrs) | Table 1 | 6 |
| Wong et al., | Discrimination judgement of Mandarin tones | Passive listening to Mandarin words | 7/Mandarin (0F, 18–32 yrs) | Table 1 | 14 |
| Zhang et al., | Discrimination judgement of Cantonese tones (deviant tones) | Discrimination judgement of Mandarin tones (same tones) | 19/Cantonese (12F, 19.6–24.4 yrs) | Table 4 | 6 |
| Zhang et al., | Discrimination judgement of Cantonese tones | Discrimination judgement of musical tones | 11/Cantonese (9F, 18.8–28.8 yrs) | Table 3 | 1 |
| Gandour et al., | Discrimination judgement of Thai tones | Silence | 5/English (2F, mean 24.6 yrs) | Table 2 | 6 |
| Gandour et al., | Discrimination judgement of Mandarin tones | Passive listening to hums | 10/English (5F, mean 26 yrs) | Table 3 | 5 |
| Hsieh et al., | Discrimination judgement of Mandarin tones | Passive listening to speech contour | 10/English (5F, mean 25.6 yrs) | Table 3 | 3 |
| Klein et al., | Discrimination judgement of Mandarin tones | Silence | 12/English (6F, not provided) | Table 1 | 17 |
| Wang et al., | Identification of Mandarin tones | Rest | 6/English (4F, not provided) | Table 3 | 7 |
| Wong et al., | Discrimination judgement of Mandarin tones | Passive listening to Mandarin words | 7/English (0F, 18–27 yrs) | Table 2 | 25 |
| Wong et al., | Discrimination judgement of Mandarin tones | Discrimination judgement of sinusoids | 17/English (10F, 18–26 yrs) | Table 2 | 2 |
| Burton and Small, | Discrimination judgement of English phoneme | Discrimination judgement of tone | 10/not provided (8F, 20–50 yrs) | Table 3 | 4 |
| Chevillet et al., | Discrimination judgement of between-category phonemes | Discrimination judgement of within-category phonemes | 14/English (6F, 18–32 yrs) | Table 1 | 16 |
| Gandour et al., | Discrimination judgement of Thai consonants | Rest | 5/Thai (3F, mean 25.2 yrs) | Table 4 | 6 |
| Gandour et al., | Discrimination judgement of Thai consonants | Rest | 5/Chinese (2F, mean 25.4 yrs) | Table 4 | 5 |
| Gandour et al., | Discrimination judgement of Thai consonants | Rest | 5/English (2F, mean 24.6 yrs) | Table 4 | 6 |
| Hsieh et al., | Discrimination judgement of Chinese consonants | Passive listening to filtered speech contour | 10/Chinese (4F, mean 24.9 yrs) | Table 4 | 5 |
| Hsieh et al., | Discrimination judgement of Chinese consonants | Passive listening to filtered speech contour | 10/English (5F, mean 25.6 yrs) | Table 4 | 3 |
| Obleser et al., | Discrimination judgement of German vowels | Noise | 13/not provided (5F, 26–36 yrs) | Table 2 | 5 |
| LoCasto et al., | Discrimination judgement of consonants | Discrimination judgement of tones | 20/English (10F, 22–47 yrs) | Table 3 | 13 |
| Rimol et al., | Discrimination judgement of Norwegian consonants | Noise | 17/not provided (0F, 20–28 yrs) | Table 2 | 1 |
| Rogers and Davis, | Discrimination judgement of consonants | Rest | 24/not provided (14F, 18–45 yrs) | Table 1 | 6 |
| Wolmetz et al., | Discrimination judgement of between-category phonemes | Discrimination judgement of within-category phonemes | 8/not provided (6F, 19–27 yrs) | Table 2 | 14 |
| Zaehle et al., | Discrimination judgement of consonants | Discrimination judgement of non-speech stimuli | 16/Swiss-German (not provided, 22–36 yrs) | Table 1 | 6 |
| Zatorre et al., | Discrimination judgement of phonemes | Passive listening to noise | 10/not provided (6F, not provided) | Table 6 | 6 |
| Bach et al., | Processing of emotional word (various tasks) | Processing of neutral word (various tasks) | 16/not provided (8F, 22.1–29.9 yrs) | Table 1 | 9 |
| Belyk and Brown, | Emotion judgement of mono-syllables | Rest | 16/not provided (10F, not provided) | Table 2 | 26 |
| Brück et al., | Identification of emotional words | Identification of neutral words | 24/not provided (12F, 19–33 yrs) | Table 2 | 4 |
| Ethofer et al., | Processing of emotional words (various tasks) | Processing of neutral words (various tasks) | 24/not provided (12F, mean 26.3 yrs) | Table 2 | 9 |
| Frühholz et al., | Processing of emotional words (various tasks) | Processing of neutral words (various tasks) | 17/French (14F, 20–38 yrs) | SI Table 2 | 7 |
| Gandour et al., | Discrimination judgement of intonation of one syllable pair | Discrimination judgement of lexical tone of one syllable pair | 10/Chinese (10F, not provided) | Table 2 | 3 |
| Imaizumi et al., | Discrimination judgement of emotional words | Mean reformatted MRI | 6/not provided (not provided, 18–25 yrs) | Table 2 | 12 |
| Kanske and Kotz, | Negative words (sound location discrimination) | Neutral words (sound location discrimination) | 23/German (10F, mean 25.1 yrs) | Table 2 | 3 |
| Klein et al., | Discrimination judgement of word prosodies | Discrimination judgement of phonemes | 24/German (12F, mean 28.2 yrs) | Table 3 | 6 |
| Kreitewolf et al., | Discrimination judgement of word intonations | Discrimination judgement of speaker genders | 17/not provided (9F, 22–34 yrs) | Table 1 | 15 |
| Mothes-Lasch et al., | Angry bi-syllabic nouns (unrelated task) | Neutral bi-syllabic nouns | 28/not provided (21F, 18–34 yrs) | Results | 1 |
| Péron et al., | Emotion judgement of emotional pseudo-words | Emotion judgement of neutral pseudo-words | 15/French (12F, mean 25.12 yrs) | Table 1 | 14 |
| Quadflieg et al., | Processing of emotional words (various tasks) | Processing of neutral words (various tasks) | 12/not provided (6F, mean 23.25 yrs) | Table 3 | 11 |
| Sammler et al., | Linguistic prosody judgement of words | Phoneme judgement of words | 23/English (10F, 24.3–27.1 yrs) | Table 1 | 16 |
| Sander et al., | Angry pseudo-words (gender discrimination) | Neutral pseudo-words (gender discrimination) | 15/not provided (7F, 19.8–29 yrs) | Table 1 | 8 |
| Alba-Ferrara et al., | Classification of emotional prosodies (emotional) | Classification of emotional prosodies (neutral) | 19/not provided (0F, 18–51 yrs) | Table 1 | 12 |
| Beaucousin et al., | Categorization of emotional sentences | Categorization of sentence grammar | 23/French (12F, 20.7–26.7 yrs) | Table 2 | 23 |
| Beaucousin et al., | Classification of natural emotional sentences | Classification of artificial non-emotional sentences | 23/French (12F, 20.3–26.3 yrs) | Table 3 | 20 |
| Buchanan et al., | Detection of emotional word targets | Detection of emotional phoneme targets | 10/not provided (0F, 22–40 yrs) | Table 1 | 3 |
| Castelluccio et al., | Angry prosody sentences (unrelated judgement) | Neutral prosody sentences (unrelated judgement) | 8/English (5F, 18–30 yrs) | Table 1 | 8 |
| Doherty et al., | Intonation judgement of sentences (question) | Intonation judgement of sentences (statement) | 11/English (7F, 18–26 yrs) | Table 1 | 6 |
| Escoffier et al., | Judgement of emotional prosodies | Judgement of musical prosodies | 16/not provided (7F, 18–26 yrs) | Table 2 | 5 |
| Ethofer et al., | Judgement of emotional prosodies | Judgement of emotional word contents | 24/German (13F, mean 24.4 yrs) | Table 1 | 3 |
| Ethofer et al., | Emotional prosody (speaker gender judgement) | Neutral prosody (speaker gender judgement) | 22/not provided (13F, 18.6–34 yrs) | Table 1 | 2 |
| Gandour et al., | Judgement of intonations | Passive listening to speech | 10/Chinese (5F, mean 26.1 yrs) | Table 2 | 8 |
| Gandour et al., | Judgement of intonations | Passive listening to speech | 10/English (5F, mean 28 yrs) | Table 2 | 15 |
| Gandour et al., | Judgement of emotions | Passive listening to speech | 10/Chinese (5F, mean 26.1 yrs) | Table 2 | 7 |
| Gandour et al., | Discrimination judgement of intonations | Passive listening to speech | 10/Chinese (5F, mean 27.3 yrs) | Table 3 | 4 |
| Gandour et al., | Discrimination judgement of intonations | Passive listening to speech | 10/English (5F, mean 26 yrs) | Table 3 | 5 |
| Gandour et al., | Discrimination judgement of intonations | Discrimination judgement of lexical tones | 10/Chinese (10F, not provided) | Table 2 | 2 |
| George et al., | Emotion judgement of sentences | Active listening to sentences | 13/not provided (5F, mean 28.5 yrs) | Table | 2 |
| Heisterueber et al., | Discrimination judgement of suprasegmental/prosodic elements | Discrimination judgement of segmental/phonetic elements | 25/German (9F, mean 28.8 yrs) | Table 3 | 15 |
| Kotz et al., | Emotion judgement of emotional sentences | Emotion judgement of neutral sentences | 12/German (8F, 22–29 yrs) | Table 3 | 10 |
| Kreitewolf et al., | Discrimination judgement of sentence intonations | Discrimination judgement of verbs in sentences | 17/not provided (10F, 20–29 yrs) | Table 1 | 22 |
| Kristensen et al., | Sentences with focused stress (semantic judgement task) | Sentences without focused stress (semantic judgement task) | 24/Dutch (18F, 18–24 yrs) | Table 5 | 22 |
| Leitman et al., | Emotion judgement of emotional sentences | Emotion judgement of neutral sentences | 19/not provided (0F, 23–33 yrs) | Table 2 | 14 |
| Mitchell and Ross, | Emotion judgement of emotional sentences | Rest | 16/not provided (13F, 18–35 yrs) | Table 1 | 11 |
| Perrone-Bertolotti et al., | Sentences with focused stress (unrelated judgement task) | Sentences without focused stress (unrelated judgement) | 24/French (12F, 19–34 yrs) | Table 2 | 10 |
| Rota et al., | Judgement of emotional prosodic sentences | Rest | 10/German (0F, 24–38 yrs) | Table 1 | 9 |
| Wildgruber et al., | Identification of emotional sentences | Rest | 10/not provided (5F, 21–33 yrs) | Table 1 | 17 |
Figure 2. Activation foci from selected contrasts. Red, blue, green, violet, and yellow dots represent foci from tonal tone, non-tonal tone, phoneme, word prosody, and sentence prosody, respectively. Across conditions, foci were widely distributed in bilateral temporal, frontal, and parietal regions and the cerebellum.
Figure 3. Convergence of activations in each condition (uncorrected p < 0.001, minimum cluster = 540 mm³). (A–E) Regions consistently activated by the perception of tonal tone, non-tonal tone, phoneme, word prosody, and sentence prosody, respectively. IPL, inferior parietal lobule; MFG, middle frontal gyrus; MTG, middle temporal gyrus; preCG, precentral gyrus; STG, superior temporal gyrus.
Brain regions consistently activated in each condition (uncorrected p < 0.001, minimum cluster = 540 mm³).
| R Superior Temporal Gyrus | 22 | 58 | −24 | 4 | 2.63 | 2,440 | 0.50 |
| L Superior Temporal Gyrus | 41 | −58 | −18 | 8 | 1.54 | 2,104 | 0.42 |
| R Cerebellum | NA | 2 | −64 | −26 | 1.49 | 1,960 | 0.50 |
| L Medial Frontal Gyrus | 6 | 0 | 18 | 44 | 1.31 | 1,936 | 0.50 |
| L Precentral Gyrus | 9 | −40 | 4 | 32 | 1.86 | 1,832 | 0.42 |
| R Superior Temporal Gyrus | 22 | 56 | −4 | 0 | 1.30 | 1,064 | 0.33 |
| R Superior Temporal Gyrus | 41 | 56 | −26 | 10 | 2.07 | 1,808 | 0.57 |
| L Superior Temporal Gyrus | 22 | −56 | −16 | 0 | 2.17 | 4,376 | 0.79 |
| R Superior Temporal Gyrus | 22 | 60 | −18 | 2 | 1.69 | 2,256 | 0.57 |
| L Precentral Gyrus | 6 | −38 | 0 | 42 | 1.21 | 576 | 0.21 |
| R Precentral Gyrus | 44 | 46 | 10 | 10 | 1.58 | 3,648 | 0.60 |
| R Superior Temporal Gyrus | 22 | 48 | −22 | 4 | 1.92 | 2,512 | 0.53 |
| L Putamen | NA | −24 | 10 | 6 | 1.49 | 1,080 | 0.27 |
| L Amygdala | NA | −20 | −10 | −12 | 1.63 | 752 | 0.27 |
| R Superior Temporal Gyrus | 22 | 46 | −36 | 4 | 2.92 | 4,208 | 0.56 |
| R Middle Frontal Gyrus | 9 | 48 | 14 | 30 | 3.06 | 2,976 | 0.44 |
| L Medial Frontal Gyrus | 6 | 0 | 14 | 48 | 2.43 | 2,840 | 0.40 |
| L Middle Temporal Gyrus | 21 | −56 | −28 | 2 | 2.25 | 2,744 | 0.36 |
| L Precentral Gyrus | 6 | −42 | 4 | 34 | 2.38 | 2,048 | 0.32 |
| R Inferior Parietal Lobule | 40 | 34 | −54 | 44 | 3.11 | 1,744 | 0.32 |
| L Inferior Parietal Lobule | 40 | −34 | −52 | 34 | 2.34 | 1,224 | 0.28 |
| L Middle Temporal Gyrus | 21 | −54 | −46 | 6 | 1.75 | 1,152 | 0.24 |
| R Superior Temporal Gyrus | 38 | 52 | −2 | −6 | 1.63 | 808 | 0.16 |
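The minimum-cluster criterion used for the table above (540 mm³) can be sketched as a cluster-extent filter. This is a minimal illustration, not the software the authors used: the 2 × 2 × 2 mm voxel size (8 mm³ per voxel), the face-connectivity rule, and the toy voxel sets are assumptions, and `clusters_above_extent` is a hypothetical helper name.

```python
from collections import deque


def clusters_above_extent(active_voxels, voxel_volume_mm3, min_cluster_mm3):
    """Group supra-threshold voxels into face-connected clusters and keep
    only clusters whose total volume reaches min_cluster_mm3."""
    remaining = set(active_voxels)
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                  (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    kept = []
    while remaining:
        seed = remaining.pop()
        cluster = [seed]
        queue = deque([seed])
        while queue:  # breadth-first flood fill from the seed voxel
            x, y, z = queue.popleft()
            for dx, dy, dz in neighbours:
                candidate = (x + dx, y + dy, z + dz)
                if candidate in remaining:
                    remaining.remove(candidate)
                    cluster.append(candidate)
                    queue.append(candidate)
        if len(cluster) * voxel_volume_mm3 >= min_cluster_mm3:
            kept.append(sorted(cluster))
    return kept


# Hypothetical supra-threshold voxels (grid indices, not mm coordinates):
# a 5 x 5 x 3 blob (75 voxels = 600 mm^3) and a 2 x 2 x 2 blob (64 mm^3).
big_blob = [(x, y, z) for x in range(5) for y in range(5) for z in range(3)]
small_blob = [(x, y, z) for x in range(10, 12) for y in range(2) for z in range(2)]

kept = clusters_above_extent(big_blob + small_blob,
                             voxel_volume_mm3=8.0,
                             min_cluster_mm3=540.0)
print(len(kept), [len(c) for c in kept])
```

Under the assumed voxel size, a cluster needs at least 68 voxels to survive, so only the larger blob is retained.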
Figure 4. Surface and 3D rendering maps showing overlaid ALE statistics for all conditions. (A,B) Uncorrected p < 0.001, minimum cluster = 540 mm³. (C,D) FDR-corrected p < 0.05, minimum cluster = 100 mm³. AMY, amygdala; CB, cerebellum; IPL, inferior parietal lobule; MFG, middle frontal gyrus; MeFG, medial frontal gyrus; MTG, middle temporal gyrus; preCG, precentral gyrus; PUT, putamen; STG, superior temporal gyrus.
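The difference between the uncorrected and FDR-corrected thresholds in Figure 4 can be illustrated with a short sketch of the Benjamini-Hochberg step-up procedure. The source does not say which FDR variant was applied, so Benjamini-Hochberg is an assumption here, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per p-value: True if it survives FDR control
    at level q under the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k such that p_(k) <= q * k / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Hypothetical voxel-wise p-values.
p_values = [0.039, 0.001, 0.216, 0.008, 0.041]

uncorrected = [p < 0.05 for p in p_values]
fdr = benjamini_hochberg(p_values, q=0.05)
print(sum(uncorrected), sum(fdr))
```

In this toy case four values pass an uncorrected p < 0.05 cut but only two survive FDR control, which shows why an FDR-corrected threshold can yield a different map than an uncorrected one at the same nominal level.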
Figure 5. Conjunction and contrast maps between tonal tone and other conditions (uncorrected p < 0.001, minimum cluster = 100 mm³). (A–D) Comparisons of tonal tone with non-tonal tone, phoneme, word prosody, and sentence prosody, respectively. Red: regions uniquely recruited in tonal tone compared with one of the other conditions; yellow: regions coactivated in tonal tone and one of the other conditions; blue: regions specifically engaged in one of the other conditions compared with tonal tone. IFG, inferior frontal gyrus; IPL, inferior parietal lobule; MTG, middle temporal gyrus; preCG, precentral gyrus; STG, superior temporal gyrus; TTG, transverse temporal gyrus.
Co-activated regions for tonal tone and other conditions based on uncorrected ALE results (uncorrected p < 0.001, minimum cluster = 100 mm³).
| R Superior Temporal Gyrus | 41 | 56 | −24 | 8 | 1.64 | 992 |
| L Superior Temporal Gyrus | 41 | −56 | −18 | 4 | 1.39 | 1,112 |
| R Superior Temporal Gyrus | 41 | 60 | −20 | 2 | 1.64 | 944 |
| R Superior Temporal Gyrus | 41 | 52 | −22 | 6 | 1.31 | 736 |
| L Medial Frontal Gyrus | 6 | 0 | 18 | 44 | 1.31 | 1,360 |
| L Precentral Gyrus | 9 | −40 | 4 | 32 | 1.86 | 872 |
| R Superior Temporal Gyrus | 22 | 54 | −28 | 6 | 1.30 | 504 |
| R Superior Temporal Gyrus | 22 | 56 | −4 | 0 | 1.30 | 400 |
| L Superior Temporal Gyrus | 41 | −58 | −22 | 8 | 1.28 | 264 |
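A co-activation (conjunction) map of the kind summarized in the table above is commonly computed as the voxel-wise minimum of two thresholded maps. The source does not state how its conjunction was derived, so the minimum-statistic rule below is an assumption, and the map values are hypothetical.

```python
def conjunction(map_a, map_b):
    """Voxel-wise minimum of two thresholded maps.
    A value of 0.0 marks a voxel that did not pass the threshold,
    so the result is non-zero only where both maps are non-zero."""
    if len(map_a) != len(map_b):
        raise ValueError("maps must cover the same voxels")
    return [min(a, b) for a, b in zip(map_a, map_b)]


# Hypothetical thresholded values for four voxels in two conditions.
tonal_tone = [0.0, 1.3, 1.9, 0.0]
other_condition = [2.4, 2.4, 0.0, 0.0]

shared = conjunction(tonal_tone, other_condition)
print(shared)
```

Only the second voxel is supra-threshold in both conditions, and it carries the smaller of the two values. The condition-specific regions in a contrast map require a separate statistical comparison and are not obtained from this minimum.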
Brain regions revealed by contrasting tonal tone with other conditions based on uncorrected ALE results (uncorrected p < 0.001, minimum cluster = 100 mm³).
| None | | | | | | |
| R Superior Temporal Gyrus | 22 | 62 | −32 | 10 | 2.54 | 728 |
| R Cerebellum | NA | 2 | −66 | −20 | 2.40 | 1,256 |
| R Superior Temporal Gyrus | 41 | 56 | −26 | 12 | 2.41 | 856 |
| L Middle Frontal Gyrus | 9 | −42 | 12 | 32 | 2.04 | 184 |
| L Middle Temporal Gyrus | 21 | −62 | −4 | −4 | 1.98 | 296 |
| L Cerebellum | NA | 2 | −64 | −21 | 2.63 | 1,904 |
| L Inferior Frontal Gyrus | 9 | −40 | 6 | 26 | 2.57 | 1,176 |
| L Medial Frontal Gyrus | 8 | −4 | 18 | 46 | 2.52 | 1,112 |
| R Superior Temporal Gyrus | 22 | 62 | −26 | 2 | 2.48 | 784 |
| L Transverse Temporal Gyrus | 42 | −59 | −16 | 12 | 2.38 | 712 |
| R Inferior Frontal Gyrus | 44 | 45 | 13 | 16 | 3.29 | 2,704 |
| L Parahippocampal Gyrus | NA | −28 | −7 | −16 | 2.48 | 688 |
| R Superior Temporal Gyrus | 22 | 44 | −22 | 4 | 1.82 | 240 |
| R Cerebellum | NA | 2 | −67 | −21 | 3.54 | 1,704 |
| R Superior Temporal Gyrus | 22 | 60 | −21 | 8 | 3.72 | 1,632 |
| L Superior Temporal Gyrus | 22 | −54 | −12 | 3 | 2.62 | 1,192 |
| R Inferior Frontal Gyrus | 45 | 48 | 23 | 23 | 3.89 | 2,760 |
| R Middle Temporal Gyrus | 22 | 48 | −44 | 5 | 2.77 | 1,920 |
| L Middle Temporal Gyrus | 22 | −53 | −42 | 6 | 2.99 | 1,152 |
| R Inferior Parietal Lobule | 40 | 34 | −48 | 42 | 2.48 | 1,104 |
| L Superior Temporal Gyrus | 41 | −44 | −32 | 8 | 3.01 | 856 |
| L Inferior Parietal Lobule | 40 | −36 | −44 | 44 | 2.05 | 568 |
Figure 6. Dual-stream model for speech perception. (A) An example of a Chinese sentence composed of various linguistic elements (phoneme, lexical tone, word-level prosody, and sentence-level prosody). The spectrogram and pitch contours of each lexical tone and prosody were extracted from the sentence spoken by a male native Mandarin speaker. Notably, lexical tone can span a single vowel, double vowels (e.g., /ai/), triple vowels, or a vowel plus nasal consonant (e.g., /an/), although it is marked on a single vowel in Pinyin. (B) Dual-stream model for speech perception in which the ventral stream is involved in spectrotemporal analysis and the dorsal stream is responsible for sensorimotor integration. Ventral stream: gradient representations of different linguistic elements in the bilateral superior temporal gyrus (STG). Dorsal stream: topological representations of phoneme, tonal tone, and sentence prosody in the left precentral gyrus (preCG), corresponding to different places of articulation.