Tsuei-Ju Tracy Hsieh, Ichiro Kuriki, I-Ping Chen, Yumiko Muto, Rumi Tokunaga, Satoshi Shioiri.
Abstract
Previous claims about the number of color categories and corresponding basic color terms (BCTs) in modern Mandarin Chinese remain irreconcilable, mainly because of a lack of objective, statistically grounded evaluation of the basicness of color terms. The present study therefore applied k-means cluster analysis to native Mandarin Chinese speakers' color naming data for 330 color chips similar to those used in the World Color Survey. Results confirmed that there are 11 basic color categories among modern Mandarin speakers in Taiwan, one corresponding to each basic color term. Results also showed that observers overwhelmingly agreed in their use of Mandarin color terms, including those that had yielded ambiguous results in previous studies (gray, brown, pink, and orange). The distribution of color categories on the World Color Survey chart shows significant cross-language similarity with American English and Japanese data. The motif analysis and group mutual information analysis suggest that Mandarin color terms used in Taiwan describe very similar categories and hence communicate color information about as precisely as those in Japanese and American English. These results show that three languages of fundamentally different cultures and histories have very similar basic color terms.
Year: 2020 PMID: 33196769 PMCID: PMC7671860 DOI: 10.1167/jov.20.12.6
Source DB: PubMed Journal: J Vis ISSN: 1534-7362 Impact factor: 2.240
Previous studies of the number of basic color terms (BCTs) in modern Mandarin Chinese.
| Number of BCTs | BCTs | English translation | Method |
|---|---|---|---|
| 6 | 紅 hóng, 黃 huáng, 綠 lǜ, 藍 lán, 黑 hēi, 白 bái | red, yellow, green, blue, black, white | Color naming task with Chinese immigrants in the US |
| 11 | 紅 hóng, 橙 chéng, 黃 huáng, 綠 lǜ, 藍 lán, 桃 táo, 紫 zǐ, 褐 hé, 灰 huī, 黑 hēi, 白 bái | red, orange, yellow, green, blue, pink, purple, brown, gray, black, white | Free-naming method and color naming task with Taiwanese participants |
| 11 | 紅 hóng, 橘 jú, 黃 huáng, 綠 lǜ, 藍 lán, 粉紅 fěnhóng, 紫 zǐ, 棕 zōng, 灰 huī, 黑 hēi, 白 bái | red, orange, yellow, green, blue, pink, purple, brown, gray, black, white | Free color naming task with 40 Taiwanese participants |
| 11 | 紅 hóng, 橘 jú, 黃 huáng, 綠 lǜ, 藍 lán, 粉紅 fěnhóng, 紫 zǐ, 咖啡 kāfēi, 灰 huī, 黑 hēi, 白 bái | red, orange, yellow, green, blue, pink, purple, brown, gray, black, white | Free-naming method and color naming task with Taiwanese participants |
| 8 | 紅 hóng, 黃 huáng, 綠 lǜ, 藍 lán, 紫 zǐ, 灰 huī, 黑 hēi, 白 bái | red, yellow, green, blue, purple, gray, black, white | Cumulative word frequency for the modern era from a Chinese corpus |
| 9 | 紅 hóng, 黃 huáng, 綠 lǜ, 藍 lán, 粉紅 fěnhóng, 紫 zǐ, 灰 huī, 黑 hēi, 白 bái | red, yellow, green, blue, pink, purple, gray, black, white | Free-naming method and color naming task with participants from Northeast China |
| 8 | 紅 hóng, 棕 zōng or/and 褐 hé, 黑 hēi, 橙 chéng, 灰 huī, 黃 huáng | red, brown, green, black, orange, gray, yellow, pink | Task matching 32 terms to color chips with Taiwanese participants |
Figure 2. Gap-statistic analysis for k = 2 to 24. (A) Traces of gap-statistic values for 10,000 calculations. The horizontal and vertical axes represent the k-values tested and the gap-statistic value for each k, respectively. (B) Histogram of the selected number of clusters (k), defined in each calculation as the smallest k before the gap-statistic value fell from a positive value to a negative value. The most frequent number of clusters was k = 8.
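The selection rule described for panel (B) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that each calculation yields one trace of gap-statistic values indexed by k, and that the selected k is the last positive value immediately before the trace first turns negative. The function names and the toy traces are hypothetical.

```python
from collections import Counter

def select_k(trace, k_min=2):
    """Return the k whose value is positive right before the trace first turns negative.

    trace[i] holds the gap-statistic value for k = k_min + i.
    Returns None when no positive-to-negative transition occurs.
    """
    for i in range(len(trace) - 1):
        if trace[i] > 0 and trace[i + 1] < 0:
            return k_min + i
    return None

def histogram_of_k(traces, k_min=2):
    """Count how often each k is selected across repeated calculations."""
    picks = (select_k(t, k_min) for t in traces)
    return Counter(k for k in picks if k is not None)

# Made-up traces for illustration only (not data from the study).
toy_traces = [
    [0.5, 0.4, 0.3, -0.1, 0.2],   # k = 2..6, turns negative after k = 4
    [0.2, -0.3, 0.1, 0.4, -0.2],  # turns negative after k = 2
    [0.6, 0.1, 0.2, -0.4, -0.1],  # turns negative after k = 4
]
counts = histogram_of_k(toy_traces)
most_frequent_k = counts.most_common(1)[0][0]
```

With real traces, `most_frequent_k` would play the role of the mode reported in the histogram of panel (B).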
Figure 3. (A) Eight clusters derived by k-means clustering analysis of the data from 41 participants. The brightness of each cluster represents the consistency across participants. (B) Arrangement of the Munsell color chip set used in the World Color Survey (Kay et al., 2011); chip positions correspond to the positions within each small square of panel (A). (C) Consistency index for repeated application of k-means clustering. The horizontal axis shows the number of clusters, and the vertical axis shows the consistency index (see text for details). Error bars indicate 95% confidence intervals after 1000 calculations for each k. The index was maximum at k = 8.
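For readers unfamiliar with the clustering step, the sketch below shows plain k-means (Lloyd's algorithm) on toy naming patterns. It rests on an assumption that this page does not confirm: each pattern is coded as a binary vector over chips (1 where a term was applied to a chip). The `kmeans` function, the six-chip toy vectors, and the seed are hypothetical, and the consistency index of panel (C) is not reproduced here.

```python
import random

def kmeans(vectors, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with squared Euclidean distance."""
    rng = random.Random(seed)
    # Start from k vectors drawn at random from the data.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [None] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: attach each vector to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy naming patterns over 6 chips (illustrative, not study data).
patterns = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
labels, centroids = kmeans(patterns, k=2)
```

Each centroid is then a per-chip average of its member patterns, which is one plausible way to read the brightness in panel (A) as agreement across participants.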
Figure 4. Comparison among Mandarin (Taiwan), English, and Japanese data sets, clustered under the same condition as the Mandarin data: k = 8. (A) The colored clusters show regions of over 80% consensus. (B) Outlines indicate the boundaries of clusters for corresponding categories in panel A (80% consensus) in the Mandarin (yellow), English (cyan), and Japanese (purple) data.
Figure 5. Sixteen optimal categories derived after pooling data from three languages. The category labels at the top of each panel are given in English, Japanese, and Mandarin, from left to right. “—” is used when no naming data glossed to that color category were found in the language. The eight categories on the left correspond to BCTs, the eight on the right to non-BCTs.
Figure 6. Motif analysis result. Rows show three types of color-naming system (motifs). (A) Consensus areas of color categories across participants. (B) Most frequent responses. (C) Areas meeting the over-80% consensus criterion. Arrows show clusters that are unique to each motif. Motif 1 is a universal BCT type, Motif 2 has Green-Blue-Mizu clusters (Kuriki et al., 2017), and Motif 3 has Green-Blue-Teal clusters. (D) Fraction of speakers of each language contained in each of the three motifs. (E) Fraction of speakers in each language classified into each of the three motifs. See main text for precise numbers.
Figure 7. Analysis of the use of gray and brown terms using 330 color chips. The arrangement of colors is the same as in Figure 3B; 10 achromatic chips are now included in the leftmost column. Clusters represent the results of k-means analysis, and each cluster is labeled with its most frequently used name. Cluster contours for achromatic terms are outlined with a broken line for better visibility. The “灰 (huī; gray)” cluster (left middle) is confined to achromatic color chips (leftmost column). The “白 (bái; white)” cluster includes chromatic chips in the top row (highest lightness), and the “黑 (hēi; black)” cluster includes four chromatic chips in the bottom row (lowest lightness). These trends are also found in English (Lindsey & Brown, 2014) and Japanese (Kuriki et al., 2017) and are part of the typical distribution of white and black categories.
Figure 1. (A) Histogram of the numbers of color names used by Mandarin speakers in Taiwan. (B) Rank order plot (Zipf chart) of our Mandarin Chinese data. The horizontal and vertical axes represent logarithms of rank order and populations, respectively.
Color terms in Chinese characters, pinyin (pronunciation), English translations, and number of participants using them (N).
| Rank | Chinese characters | Pronunciation (pinyin) | English | N |
|---|---|---|---|---|
| 1 | 紅 | hóng | Red | 41 |
| 2 | 綠 | lǜ | Green | 41 |
| 3 | 藍 | lán | Blue | 41 |
| 4 | 黃 | huáng | Yellow | 41 |
| 5 | 紫 | zǐ | Purple | 41 |
| 6 | 橘 | jú | Orange | 41 |
| 7 | 灰 | huī | Gray | 41 |
| 8 | 咖啡 | kāfēi | Brown | 41 |
| 9 | 白 | bái | White | 40 |
| 10 | 粉紅 | fěnhóng | Pink | 40 |
| 11 | 黑 | hēi | Black | 39 |
| 12 | 棕 | zōng | Brown | 10 |
| 13 | 桃紅 | táohóng | Pink | 6 |
| 14 | 土 | tǔ | Mud | 3 |
| 15 | 皮膚・膚 | pífū, fū | Skin | 3 |
| 16 | 褐 | hé | Brown | 2 |
| 17 | 澄 | chéng | Clear | 1 |
| 18 | 茶 | chá | Brown | 1 |
| 19 | 乳白 | rǔbái | Milk | 1 |
| 20 | 墨 | mò | Ink | 1 |
| 21 | 湛 | zhàn | Pearly | 1 |
| 22 | 橙 | chéng | Orange | 1 |
| 23 | 青蘋果 | qīng pínguǒ | Green apple | 1 |
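The N column above is enough to rebuild a rank-order (Zipf-style) plot like the one described for Figure 1B. The sketch below is a hedged illustration: it assumes that "populations" in that caption refers to N and that base-10 logarithms are used, neither of which this page states. The variable names are hypothetical.

```python
import math

# N values copied from the table above, in rank order (ranks 1 to 23).
n_by_rank = [41, 41, 41, 41, 41, 41, 41, 41, 40, 40, 39,
             10, 6, 3, 3, 2, 1, 1, 1, 1, 1, 1, 1]

# (log10 rank, log10 N) pairs, i.e. the points of a log-log rank plot.
points = [(math.log10(rank), math.log10(n))
          for rank, n in enumerate(n_by_rank, start=1)]

# Size of the fall in log10 N between each pair of consecutive ranks.
drops = [points[i][1] - points[i + 1][1] for i in range(len(points) - 1)]

# Rank just before the steepest fall (ranks are 1-based).
rank_before_drop = max(range(len(drops)), key=drops.__getitem__) + 1
```

On these numbers the steepest fall lies between rank 11 (N = 39) and rank 12 (N = 10), which lines up with the 11 basic color categories reported in the abstract.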
Figure 8. Analysis of category overlaps. (A) Relative frequency of outcomes, over 10,000 random samplings, in which a color chip called x = “粉紅 (fěnhóng; pink)” by at least one of the 10 participants was also called y = “紅 (hóng; red)” by at least one other participant. Histograms of the overlap ratio for x = red, y = pink in English and x = aka, y = pink in Japanese (from Figure 8 of Kuriki et al., 2017) are shown by dotted lines. (B) Results of the same analysis for the blue and green categories (x = “藍 (lán; blue),” y = “綠 (lǜ; green)” in Mandarin (Taiwan); x = ao, y = midori in Japanese) for the three language groups.
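One way to read the overlap analysis is sketched below. It is an interpretation, not the authors' procedure: it assumes that each random sampling draws 10 participants without replacement, that each participant gives one name per chip, and that the overlap ratio is the share of chips called x by someone in the sample that someone else in the sample called y. The function names, the toy naming data, and the choice to return 0.0 when no chip is called x are all hypothetical.

```python
import random

def overlap_ratio(naming, sample, x, y):
    """Share of chips called x within `sample` that another sampled participant called y.

    naming[p][chip] is the single term participant p gave to that chip.
    """
    called_x = 0
    also_y = 0
    for chip in range(len(naming[0])):
        names = {naming[p][chip] for p in sample}
        if x in names:
            called_x += 1
            # One name per participant, so a y response comes from someone else.
            if y in names:
                also_y += 1
    return also_y / called_x if called_x else 0.0

def overlap_distribution(naming, x, y, group_size=10, n_draws=10_000, seed=0):
    """Overlap ratios over repeated random samples of participants."""
    rng = random.Random(seed)
    participants = range(len(naming))
    return [overlap_ratio(naming, rng.sample(participants, group_size), x, y)
            for _ in range(n_draws)]

# Toy data: 3 participants x 4 chips (illustrative, not study data).
toy_naming = [
    ["pink", "pink", "red", "blue"],
    ["pink", "red", "red", "blue"],
    ["red", "red", "red", "green"],
]
ratios = overlap_distribution(toy_naming, "pink", "red", group_size=2, n_draws=50)
```

A histogram of `ratios` would then correspond to the distributions shown in the figure, with x and y swapped in for the term pair of each language.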
Figure 9. Frequency rank of color terms in the Academia Sinica Balanced Corpus of Modern Chinese (colored bars), the Academia Sinica Tagged Corpus of Early Mandarin Chinese (gray bars), and the Academia Sinica Ancient Chinese Corpus (white bars). The X-axis presents the frequency rank of color terms based on the results of the modern corpus, whereas the Y-axis (log scale) presents the count of each color term.