
Cross-linguistic similarity norms for Japanese-English translation equivalents.

David Allen, Kathy Conklin.

Abstract

Formal and semantic overlap across languages plays an important role in bilingual language processing systems. In the present study, Japanese (first language; L1)-English (second language; L2) bilinguals rated 193 Japanese-English word pairs, including cognates and noncognates, in terms of phonological and semantic similarity. We show that the degree of cross-linguistic overlap varies, such that words can be more or less "cognate," in terms of their phonological and semantic overlap. Bilinguals also translated these words in both directions (L1-L2 and L2-L1), providing a measure of translation equivalency. Notably, we reveal for the first time that Japanese-English cognates are "special," in the sense that they are usually translated using one English term (e.g., コール /kooru/ is always translated as "call"), but the English word is translated into a greater variety of Japanese words. This difference in translation equivalency likely extends to other non-etymologically related, different-script languages in which cognates are all loanwords (e.g., Korean-English). Norming data were also collected for L1 age of acquisition, L1 concreteness, and L2 familiarity, because such information had been unavailable for the item set. Additional information on L1/L2 word frequency, L1/L2 number of senses, and L1/L2 word length and number of syllables is also provided. Finally, correlations and characteristics of the cognate and noncognate items are detailed, so as to provide a complete overview of the lexical and semantic characteristics of the stimuli. This creates a comprehensive bilingual data set for these different-script languages and should be of use in bilingual word recognition and spoken language research.


Year:  2014        PMID: 24046264      PMCID: PMC4030127          DOI: 10.3758/s13428-013-0389-z

Source DB:  PubMed          Journal:  Behav Res Methods        ISSN: 1554-351X


Words within a language can have formal (phonological [P] and orthographic [O]) and/or semantic (S) overlap (e.g., bat/bat [+P, +O, −S], tear/tear [−P, +O, −S], break/brake [+P, −O, −S], couch/sofa [−P, −O, +S]). Importantly, research has shown that such overlap can increase activation and speed processing, or create competition and slow processing (Balota, Cortese, Sergent-Marshall, Spieler, & Yap, 2004; Jared, McRae, & Seidenberg, 1990; McClelland & Rumelhart, 1981). Words in different languages can also have such overlap. For example, the English word ball and the Japanese word ボール /booru/ overlap in terms of S, although not all of the senses of English ball are associated with the Japanese word /booru/. This issue of degree of S overlap is a crucial aspect of the present research and is discussed in more depth later. The two words overlap a great deal in P, although there is no O overlap, as the two languages are written in different scripts. Such overlap, or what we will refer to in the present research as cross-linguistic similarity, plays an important role in bilingual language processing (Allen & Conklin, 2013; Dijkstra, Miwa, Brummelhuis, Sappeli, & Baayen, 2010). Moreover, in the present research we show that this overlap is perceived by bilinguals to be continuous in nature. In the literature on bilingual word processing, words that share both form and meaning are usually referred to as cognates (Dijkstra, 2007). This is because, until recently, most of the research has investigated the processing of European languages (e.g., Catalan, Dutch, English, French, German, Italian, and Spanish). Thus, when words had form and meaning overlap (e.g., English night, French nuit, German nacht), this was in fact due to the modern words having a common historical root (e.g., Latin nocte); therefore, they were cognates. In the case of the Japanese word ボール /booru/ “ball,” it is more appropriately called a loanword/borrowing. 
However, it is not the historical origin of such words in a language, but rather their cross-linguistic similarity, that influences their processing. Thus, for the purposes of this article, we will refer to any cross-linguistic word pairs that share form and meaning as cognates. Although Japanese–English cognates can easily be identified by simply considering the overlap of form and meaning across languages, a much more precise definition of “cognateness” can be determined by the use of bilingual measures of perceived similarity. This is discussed further in the following sections. Cognates have been central to psycholinguistic research into bilingual language processing. Traditionally, bilinguals have been referred to by such classic distinctions as “compound” and “coordinate” by scholars such as Weinreich (1953). However, in the psycholinguistic literature, bilinguals are characterized by their proficiency in both languages. Thus, if a native speaker of Japanese also speaks English as a second language to some degree of proficiency, the speaker can be referred to as bilingual. An important question about bilingual language processing has been whether bilinguals can selectively activate a single language or whether both of their languages are activated nonselectively. In other words, when processing one language, is it possible to turn the other language “off”? Cognates have provided an ideal way to investigate this question, because they have a great deal of formal and S overlap. When bilinguals perform a task such as lexical decision, in which all words are presented in one language, cross-linguistic overlap should only influence processing if language activation is nonselective. That is, if both languages are activated during single-language processing, cognates should facilitate processing. 
Alternatively, if only a single language is activated during language processing, shared cross-linguistic S–O–P features of cognates should not influence processing relative to words that have no S–O–P overlap. A considerable amount of research has shown that bilingual word recognition is fundamentally nonselective in nature (e.g., Dijkstra & Van Heuven, 2002; Van Heuven, Dijkstra, & Grainger, 1998; see Dijkstra, 2007, for a review). When bilinguals use a second language, cognates have been shown to speed their responses, relative to matched noncognate controls, in a wide variety of tasks, such as word naming (Schwartz, Kroll, & Diaz, 2007), word translation (Christoffels, de Groot, & Kroll, 2006), and lexical decision (Dijkstra, Grainger, & Van Heuven, 1999). Moreover, similar results have been reported for languages that differ in script (e.g., lexical decision with Hebrew–English, Gollan, Forster, & Frost, 1997; picture naming with Japanese–English, Hoshino & Kroll, 2008; masked priming lexical decision with Japanese–English, Nakayama, Sears, Hino, & Lupker, 2012; or masked priming lexical decision and word naming with Korean–English, Kim & Davis, 2003). Typically, in such studies cognates and noncognates are matched on important characteristics such as frequency, length, phonological onset, and phonological neighborhood size (for naming) or orthographic neighborhood size (for lexical decision). Although cognates typically speed responses in L2 tasks such as lexical decision, picture naming, and translation, Dijkstra et al. (1999, 2010) showed that in language decision tasks, in which bilinguals had to decide whether the targets were Dutch or English, cognates were inhibited relative to noncognates. Thus, for tasks in which cross-linguistic similarity is disadvantageous, as in language decision, cognates can actually slow processing. 
Even in sentence processing tasks, in which semantic and syntactic constraints may be more likely to induce language-selective processing, cognate facilitation has been observed relative to noncognates (e.g., Schwartz & Kroll, 2006; Van Assche, Drieghe, Duyck, Welvaert, & Hartsuiker, 2011; Van Assche, Duyck, Hartsuiker, & Diependaele, 2009; Van Hell & de Groot, 2008). Although cognate effects are typically more prominent in the L2 than in the L1 for unbalanced bilinguals (i.e., bilinguals who are not equally proficient in both languages, and typically are more proficient in the L1), due to the boosted activation of the more dominant L1, L2 cognate effects have also been observed in the L1 (Duñabeitia, Perea, & Carreiras, 2010; Van Assche et al., 2009). Thus, even when bilinguals are more dominant in an L1, it is still possible to observe cross-linguistic similarity effects in both L1 and L2 processing. Although much of the previous research into bilingual processing has defined cognates and noncognates as being dichotomous, a growing number of studies have reported that bilinguals are sensitive to the degree of similarity, above and beyond a simple binary distinction (e.g., Allen & Conklin, 2013; Dijkstra et al., 2010; Van Assche et al., 2011; Van Assche et al., 2009). Using mixed-effects modeling with multiple independent variables, these studies have revealed that continuous measures of cross-linguistic similarity are indeed predictive of bilinguals’ responses in L2 tasks. Most relevant for the present study, Allen and Conklin found that Japanese–English bilinguals responded to English words faster in lexical decision, depending on the degree of P similarity between the English and Japanese words. 
For example, whereas both bus バス /basu/ and radio ラジオ /rajio/ are cognates and were responded to more quickly than noncognate matched controls, bus was rated as being more phonologically similar to バス /basu/ than radio was to ラジオ /rajio/, and bus was responded to significantly more quickly than radio. This study used mixed-effects modeling with multiple predictors, including word length and word frequency. Because such predictors are correlated with each other and also with P similarity, residualization was used to orthogonalize the predictors prior to model fitting. Collinearity was removed between all correlated variables, and then the residuals of these predictors were used to predict response times (RTs). The orthogonalized predictors showed that length, word frequency, and P similarity accounted for significant, but independent, portions of the variance in RTs. These facilitatory effects of P similarity were observed in L2 English lexical decision and picture naming with Japanese–English bilinguals. In addition, S similarity was shown to be an important predictor of responses to cognates in picture naming, with more semantically similar cognates being responded to more quickly than less semantically similar cognates. In an English lexical-decision experiment, S similarity had the reverse effect, relative to picture naming: Less semantically similar word pairs were responded to faster, due to such items having more senses, which apparently boosted activation of the lexical representation, leading to speeded responses relative to more semantically similar words (which tend to have fewer senses). These results highlight the importance of task effects in language processing, and they also underscore the importance of continuous measures of cross-linguistic similarity as crucial indicators of bilinguals’ processing performance. 
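The residualization step described above can be illustrated with a minimal sketch. It assumes simulated data and ordinary least squares via NumPy rather than the mixed-effects models used in that study; the helper `residualize`, the variable names (`length`, `log_freq`, `p_sim`), and all generated values are hypothetical.

```python
import numpy as np

def residualize(target, *predictors):
    """Return the part of `target` that is not linearly explained by `predictors`."""
    # Design matrix: intercept column plus one column per predictor.
    X = np.column_stack([np.ones(len(target)), *predictors])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ beta

# Simulated, deliberately correlated predictors (hypothetical values).
rng = np.random.default_rng(0)
n = 200
length = rng.integers(3, 11, size=n).astype(float)
log_freq = 5.0 - 0.3 * length + rng.normal(0, 0.5, size=n)
p_sim = 1.0 + 0.4 * log_freq + rng.normal(0, 0.5, size=n)

# Remove from each predictor the variance it shares with the earlier ones.
log_freq_resid = residualize(log_freq, length)
p_sim_resid = residualize(p_sim, length, log_freq)

raw_r = np.corrcoef(p_sim, log_freq)[0, 1]
resid_r = np.corrcoef(p_sim_resid, log_freq_resid)[0, 1]
print(f"correlation before residualization: {raw_r:.2f}")
print(f"correlation after residualization:  {resid_r:.2e}")
```

After this step the residualized predictors are uncorrelated by construction, so each coefficient in a subsequent RT model reflects variance unique to that predictor. The order of residualization is a design choice: whichever variable is residualized last gives up all shared variance to the others.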
Despite the importance of cross-linguistic measures of word similarity in bilingual language processing research, to our knowledge only one previous study has collected bilingual measures and made them available to researchers. Tokowicz, Kroll, de Groot, and van Hell (2002) conducted a large-scale study on 1,003 word pairs with Dutch–English bilinguals who rated translation equivalents for cross-linguistic O, P, and S similarity. They also elicited translations in order to determine translation equivalency and to assess the number of translations that a word has. Bilinguals translated words in both directions (i.e., from L1 to L2 and from L2 to L1), and this was used to determine whether a word has one or more translations in each language, as the number of translations that a word has has been shown to influence bilingual processing. The number of translations also can provide a metric of the amount of S overlap between words in two languages: If a word that has a number of senses in one language is translated into a single word in the other language, then both words are likely to be used in similar contexts in the two languages. However, if a word in one language has multiple senses that lead to different translations in the other language, that word is likely to be translated into more than one word. Thus, the words will be used in a variety of contexts, and likely will have less complete S overlap. An important implication of this is that when bilinguals translate a word presented in isolation, for many words multiple translations are likely to be given. Therefore, determining a “best” translation for most words is problematic. Providing a number-of-translations measure, on the other hand, provides information about the likelihood that bilinguals will be considering single or multiple translations for any particular word. 
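The number-of-translations measure described above amounts to counting the distinct responses given for each stimulus in each translation direction. The following is a minimal sketch, assuming invented response data: only the コール/call pairing is taken from the abstract, the Japanese responses listed for call are hypothetical, and the helper `count_translations` is not part of any published tool. Screening responses for accuracy is not modeled here.

```python
from collections import defaultdict

def count_translations(responses):
    """Map each stimulus word to the number of distinct translations given for it."""
    distinct = defaultdict(set)
    for stimulus, translation in responses:
        # Normalize whitespace and case so that "Call" and "call" count once.
        distinct[stimulus].add(translation.strip().lower())
    return {word: len(forms) for word, forms in distinct.items()}

# Hypothetical L1-to-L2 responses from three participants.
l1_to_l2 = [("コール", "call"), ("コール", "call"), ("コール", "Call")]

# Hypothetical L2-to-L1 responses from four participants.
l2_to_l1 = [("call", "コール"), ("call", "呼ぶ"), ("call", "電話する"), ("call", "呼ぶ")]

print(count_translations(l1_to_l2))
print(count_translations(l2_to_l1))
```

A word pair with a count of one in both directions is a candidate for near-complete S overlap, whereas an asymmetry between the two directions suggests that one word covers more senses than the other.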
For researchers interested in L2 and bilingual language processing, it is critical to have norms for cross-linguistic S similarity in order to control for the influence of the “other” language during language-processing tasks. Moreover, bilingual ratings may be more suitable measures of S overlap than are dictionary measures of the number of meanings/senses in each language, because dictionaries vary greatly in their methods of quantifying meanings/senses, and they also reflect the total number of senses that exist in the language, as opposed to those known by the average bilingual (see Gernsbacher, 1984, for a similar argument). To our knowledge, currently no measures of cross-linguistic similarity are available for languages other than Dutch–English. The present study thus provides cross-linguistic norming data for Japanese–English translations. Research into Japanese–English bilingual processing is particularly important, not only because relatively little bilingual research has focused on languages that differ in script (in comparison to research on same-script languages), but also because of the importance of English in Japanese society. Compulsory education and tertiary institutions place a strong emphasis on language education, and English is the most widely learned second language in Japan. The Japanese language has many thousands of loanwords borrowed from English that have entered the language since Japan’s ending of its self-imposed isolation in the mid-nineteenth century. Many of these loanwords are in regular and general use; however, the majority are reserved for technical and academic uses. The proportion of loanwords in the 5th edition of the Koujien (1998), a comprehensive Japanese dictionary, is 10.2 %, which equals around 23,000 word entries (Kawaguchi & Tsunoda, 2005; cited in Igarashi, 2008). Moreover, around 90 % of loanwords are borrowed from English (Shinnouchi, 2000). 
Therefore, a better understanding of how Japanese–English bilinguals process these loanwords/cognates is an important area for research.1 The primary goal of the present study was thus to provide a range of cross-linguistic similarity measures of Japanese–English translation equivalents. To this end, ratings were collected to assess P and S similarity. Also, participants were asked to translate words in order to provide estimates of the number of translations and meanings that were known by the bilinguals. A second aim was to collect additional norming data that will be critical for designing experiments to investigate bilingual processing, yet that are not publicly available for all of the Japanese words in this study. Because most studies have focused on high-frequency words (e.g., Yokokawa, 2009, investigated the top 3,000 words in the British National Corpus), there are few measures for many of the cognates that are ubiquitous in the Japanese language, but that tend to be of lower frequency. Thus, information about the perceived age of acquisition (AoA) and concreteness of L1 Japanese words was collected. Concreteness is particularly useful for researchers, as it is typically highly correlated with grammatical class, such that verbs tend to refer to abstract events or actions, whereas nouns often refer to concrete objects as well as to abstract entities. Although grammatical class is problematic as a norming measure, because many words can be read as either verbs or nouns (e.g., call, run, telephone), concreteness can be used as a measure of the intrinsic S properties of the item, which may include both the verbal and nominal uses of items. In addition, we collected bilingual ratings of L2 (English) word familiarity. 
To create a more complete set of information about Japanese–English cognates that could be useful to researchers, we also provide the following information in the present database: word length, number of English senses (WordNet), number of Japanese senses (Meikyo Japanese Dictionary, 2008 edition), English word frequency (Balota et al., 2007), and Japanese word frequency (Amano & Kondo, 2000). Finally, a set of descriptive statistics is presented, as well as a correlation analysis of the ratings and the collected measures.

Method

Participants

One hundred and sixty-six first- and second-year undergraduate university students participated in the present research. The participants were recruited from two Japanese universities: the University of Tokyo and Waseda University. All participants were enrolled in English language courses in one of the two institutions. All recruitment and participation procedures for the studies reported in this article were approved by the ethics committee at the School of English, University of Nottingham. All participants received course credit for taking part, and no participant took part in more than one study. All of the participants were native Japanese speakers who had studied English prior to their university education. In order to qualify for the rating studies, all participants confirmed that they considered themselves native Japanese speakers, who had lived in Japan for the majority of their lives and had received their education in Japan. Thus, L1 proficiency data were not collected, as all participants were at the native-speaker level. Details about the participants, as well as the number of participants in each study, are shown in Table 1. Participants were asked to rate their own perceived English language proficiency in reading, writing, speaking, and listening on a scale of 0–10, with 0 being no ability at all and 10 being native-speaker-level ability. The scores from each component for each participant were averaged in order to calculate an overall proficiency score.
Table 1

Number and mean age of participants; age at which they began learning English; time learning English; their proficiencies in reading, writing, speaking, and listening; and overall proficiency scores

Measure | P and S Rating (Concrete Items) | P and S Rating (Abstract Items) | Number of Translations Task | English (L2) Word Familiarity Rating | Concreteness Rating | Age-of-Acquisition Rating
Number of participants | 33 | 36 | 38 | 19 | 18 | 22
Age | 20.4 (4.5) | 20.2 (4.4) | 18.8 (0.8) | 18.4 (0.5) | 20.6 (4.4) | 19.1 (0.9)
Age began learning L2 (a) | 11–15 years | 11–15 years | 11–15 years | 11–15 years | |
Time learning L2 (b) | 5–9 years | 5–9 years | 3–7 years | 3–7 years | |
L2 reading proficiency | 6.8 (1.1) | 6.2 (1.6) | 6.2 (1.2) | 6.7 (1.5) | |
L2 writing proficiency | 5.3 (1.3) | 4.8 (1.6) | 5.1 (1.5) | 5.0 (1.6) | |
L2 speaking proficiency | 4.2 (2.0) | 3.8 (1.7) | 4.1 (1.7) | 4.4 (1.8) | |
L2 listening proficiency | 5.4 (2.2) | 4.7 (1.9) | 4.6 (1.9) | 5.4 (1.9) | |
Overall L2 proficiency (c) | 5.4 (1.3) | 4.9 (1.4) | 5.0 (0.9) | 5.4 (1.0) | |

Standard deviations appear in parentheses. (a) “Age began learning” is derived from self-selected categories (0, 1–5, 6–10, 11–16, 17–21, and 21 years or above), and the data provided above are the mode responses for the participants. (b) “Time learning L2” is simply “Age” minus “Age began learning.” (c) “Overall proficiency” is the mean of reading, writing, speaking, and listening proficiency measures.
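The note on overall proficiency can be checked against the group means in Table 1. The sketch below assumes that the four sets of L2 values belong to the first four tasks (an inference from the age data, since the extracted table does not mark which columns are empty) and that every participant supplied all four skill ratings, in which case the mean of the group means equals the group mean of the per-participant averages. The helper `overall` is hypothetical.

```python
# Group means from Table 1: (reading, writing, speaking, listening).
skill_means = {
    "P and S rating (concrete items)": (6.8, 5.3, 4.2, 5.4),
    "P and S rating (abstract items)": (6.2, 4.8, 3.8, 4.7),
    "Number of translations task": (6.2, 5.1, 4.1, 4.6),
    "English (L2) word familiarity rating": (6.7, 5.0, 4.4, 5.4),
}

# Overall L2 proficiency as reported in Table 1.
reported_overall = {
    "P and S rating (concrete items)": 5.4,
    "P and S rating (abstract items)": 4.9,
    "Number of translations task": 5.0,
    "English (L2) word familiarity rating": 5.4,
}

def overall(scores):
    """Unweighted mean of the four skill ratings."""
    return sum(scores) / len(scores)

for task, scores in skill_means.items():
    # Inputs are rounded to one decimal, so small discrepancies are expected.
    print(f"{task}: computed {overall(scores):.3f}, reported {reported_overall[task]}")
```

Each computed value falls within rounding distance of the reported overall score, which is consistent with the table note.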


Stimuli and apparatus

A total of 198 words were selected for the study. Our aim was to collect ratings for both concrete and abstract words in order to create a more representative stimulus set that can be used in a variety of tasks, such as picture naming (which typically uses concrete nouns) and comprehension tasks (which may include both concrete and abstract words). Moreover, because bilingual studies often make use of cognates due to their unique characteristics of having both formal and S similarity, approximately half of the words in the database were cognates. Loanwords in Japanese are all written in a separate script, katakana, making it relatively easy to determine “cognate” status. The cognates were all loanwords in Japanese that the authors determined shared obvious P and S similarity with their English translations. It was not necessary to do more than this, as the ratings themselves will show how similar the words are across languages. Five items were removed from analyses because they were not the same grammatical class (e.g., expect (verb) and 期待 /kitai/ (noun) “expectation”). This reduced the total number of items in the final study to 193. The concrete cognate and noncognate items (n = 94; cognate = 48; noncognate = 46) were selected from Nishimoto, Miyawaki, Ueda, Une, and Takahashi’s (2005) picture naming norming study. By selecting items from Nishimoto et al.’s study, which were taken from Snodgrass and Vanderwart’s (1980) picture naming norms in English (also see Szekely et al., 2004), the stimuli are suitable for research in both English and Japanese languages.2 Because English loanwords in Japanese are often low-frequency and to ensure that participants would know the items (i.e., that they are lexicalized in Japanese), all of the abstract cognate and noncognate words (n = 99; cognate = 50; noncognate = 49) were selected from a high-frequency word list derived from a 330-million-word Japanese Web corpus (Kilgariff, Rychly, Smrz, & Tugwell, 2004). 
For the cognates, two professional Japanese–English translators confirmed that the Japanese and English words were translation equivalents, although the translation was not always the most likely translation (e.g., the English word call has many possible translations, with the cognate コール /kooru/ being one of them). For the similarity rating task, the items were randomized and compiled into lists for P and S ratings. An additional 20 nontranslation filler pairs were added to the S rating task to encourage use of the full scale for similarity ratings. More specifically, because all item pairs are translation equivalents, they would be rated as similar to some degree across languages; in order to get participants to use the “completely different” end of the scale, non-translation-equivalents (e.g., door–雨 /ame/ “rain” in Japanese) were also included in the S rating task. Fillers were not necessary in the P similarity part of the study, as the use of both cognate and noncognate pairs ensures that the full scale will be utilized. The set of materials and ratings is provided in Appendix A (and in the supplemental materials). The filler items were removed from the analysis and were not used in any of the other tasks reported here. Two groups of participants completed the rating studies, one for concrete items and another for abstract items.
Table 5

Cross-linguistic and norming measures data (P and S similarity, Age-of-acquisition, L2 familiarity, concreteness)

English name | Japanese name | Transcription | Katakana reading | Picture naming | Cognate status | Japanese (L1) age-of-acquisition | English (L2) familiarity | Concreteness (L1) | No. Trans L1-L2 | No. Trans L2-L1 | NoM L1 | NoM L2 | Phonological similarity | Semantic similarity
accessアクセスakusesuアクセスNoC4.54.83.415113.03.9
acidsanサンNoNC4.83.54.013111.24.5
aid援助enjoエンジョNoNC4.43.32.957121.44.1
armudeウデYesNC4.03.65.412121.04.2
arrow矢印yajirushiヤジルシYesNC3.43.44.441211.03.3
ashtray灰皿haizaraハイザラYesNC4.22.05.412211.14.1
balloon風船fuusenフウセンYesNC3.63.35.6112121.04.7
bananaバナナbananaバナナYesC3.34.05.711113.44.9
bedベッドbeddoベッドYesC3.94.85.711113.74.7
benchベンチbenchiベンチYesC3.74.15.612114.24.1
bicycle自転車jitenshaジテンシャYesNC4.04.15.721111.04.7
blank空白kuuhakuクウハクNoNC4.64.14.323111.13.7
bonehoneホネYesNC4.13.24.911111.04.6
bricksレンガrengaレンガYesNC3.42.95.723211.13.7
broomほうきhoukiホウキYesNC3.42.75.531111.02.3
brushブラシburashiブラシYesC3.54.05.412123.14.2
busバスbasuバスYesC3.83.15.811114.14.7
bustmuneムネNoNC4.33.24.923111.24.2
buttonボタンbotanボタンYesC3.84.15.611113.04.2
cakeケーキkeekiケーキYesC3.54.55.421112.64.2
callコールkooruコールNoC4.64.72.714123.03.6
camelラクダrakudaラクダYesNC3.23.55.712111.14.1
careケアkeaケアNoC4.64.12.818123.94.1
careerキャリアkyariaキャリアNoC4.43.73.115122.93.6
carrot人参ninjinニンジンYesNC3.63.16.111111.04.8
caseケースkeesuケースNoC4.34.53.716123.33.8
caution注意chuuiチュウイNoNC4.33.62.543221.14.1
cherryさくらんぼsakuranboサクランボYesNC2.94.05.912111.04.1
chimneyエントツentotsuエントツYesNC3.02.15.311111.03.7
classクラスkurasuクラスNoC4.04.84.322113.53.7
classicクラシックkurashikkuクラシックNoC4.44.34.212123.73.7
clearクリアーkuriaaクリアーNoC4.44.22.616133.73.6
clue手がかりtegakariテガカリNoNC4.33.62.832111.13.9
coolクールkuuruクールNoC4.34.62.714123.93.8
coralサンゴsangoサンゴNoNC4.22.85.311111.23.7
coreコアkoaコアNoC4.93.52.614113.74.1
courseコースkoosuコースNoC4.34.53.413123.94.5
cowushiウシYesNC3.83.35.732211.04.6
crime犯罪hanzaiハンザイNoNC4.33.53.433211.14.6
crossクロスkurosuクロスNoC4.33.73.214123.54.1
cure治るnaoruナオルNoNC4.24.03.354121.13.7
curtainカーテンkaatenカーテンYesC3.33.95.511113.44.5
cycleサイクルsaikuruサイクルNoC4.64.02.515123.34.0
deerシカshikaシカYesNC3.53.15.821111.04.7
demand要求youkyuuヨウキュウNoNC4.64.02.963211.14.3
desktsukueツクエNoNC4.44.65.711111.04.4
dolphinイルカirukaイルカYesNC3.03.45.711111.04.9
doorドアdoaドアYesC3.65.45.412113.64.6
dressドレスdoresuドレスYesC3.84.35.414133.24.1
dresserたんすtansuタンスYesNC3.02.85.653131.03.1
eagleワシwashiワシYesNC3.53.05.732111.24.5
elephantzouゾウYesNC3.43.25.811111.04.9
exit出口deguchiデグチNoNC4.34.23.912121.14.1
fail失敗shippaiシッパイNoNC4.74.02.542111.14.1
find見つけるmitsukeruミツケルNoNC4.04.63.132121.14.3
firm会社kaishaカイシャNoNC4.03.63.835131.22.9
fishsakanaサカナYesNC4.14.64.311111.04.6
flaghataハタYesNC4.33.85.211111.04.3
fluteフルートfuruutoフルートYesC3.94.05.711113.84.1
foolバカbakaバカNoNC3.43.72.732111.04.1
footashiアシYesNC3.54.35.621211.03.8
forkフォークfookuフォークYesC3.13.85.621213.94.1
foxキツネkitsuneキツネYesNC3.13.55.611111.04.6
frogカエルkaeruカエルYesNC3.13.15.811111.04.6
frontmaeマエNoNC4.04.43.543321.14.3
fuel燃料nenryouネンリョウNoNC4.73.63.732221.14.5
fund資金shikinシキンNoNC4.53.53.655111.14.1
future将来shouraiショウライNoNC4.44.82.412111.14.4
genreジャンルjanruジャンルNoC4.43.22.632112.54.1
giraffeキリンkirinキリンYesNC3.03.06.111111.04.3
glassグラスgurasuグラスNoC3.74.45.313123.74.1
goalゴールgooruゴールNoC3.74.63.914123.64.6
goatヤギyagiヤギYesNC2.82.65.813111.04.5
gorillaゴリラgoriraゴリラYesC2.83.65.911113.34.7
grapesブドウbudouブドウYesNC3.03.45.811111.04.6
guitarギターgitaaギターYesC3.74.35.611113.14.9
hammockハンモックhanmokkuハンモックYesC4.02.65.311113.35.0
hangerハンガーhangaaハンガーYesC3.14.45.721213.64.1
hate憎むnikumuニクムNoNC4.94.02.413111.14.1
headatamaアタマNoNC3.84.75.211111.14.1
heartハートhaatoハートYesC4.44.53.313123.23.7
helicopterヘリコプターherikoputaaヘリコプターYesC3.64.05.811113.44.9
helmetヘルメットherumettoヘルメットYesC3.43.75.411113.64.9
hope希望kibouキボウNoNC4.94.91.914121.34.1
ideal理想risouリソウNoNC4.63.52.221111.24.1
ironアイロンaironアイロンYesC3.63.65.712123.22.8
jarつぼtsuboツボNoNC4.42.65.213313.64.6
joint関節kansetsuカンセツNoNC4.43.04.515121.13.1
jokeジョークjookuジョークNoC4.54.53.412113.34.6
jury陪審baishinバイシンNoNC5.42.64.112111.14.0
kangarooカンガルーkangaruuカンガルーYesC3.33.65.911113.54.9
kickキックkikkuキックNoC3.44.54.412113.94.6
kissキスkisuキスNoC4.04.34.412113.64.7
ladderはしごhashigoハシゴYesNC3.22.85.213111.04.1
learn習うnarauナラウNoNC3.94.53.112111.14.4
lefthidariヒダリNoNC3.64.34.312121.14.4
lemonレモンremonレモンYesC3.14.25.811123.44.9
lessonレッスンressunレッスンNoC4.25.13.815113.54.3
lionライオンraionライオンYesC3.03.85.612123.74.9
lipskuchibiruクチビルYesNC3.63.75.511111.04.7
loanローンroonローンNoC4.93.73.716214.03.9
lobsterザリガニzariganiザリガニYesNC3.03.15.825111.03.5
localローカルrookaruローカルNoC4.34.83.413113.43.9
loose緩いyuruiユルイNoNC4.43.82.836111.23.9
luckyラッキーrakkiiラッキーNoC4.05.12.413123.84.3
matter物事monogotoモノゴトNoC4.64.13.043211.13.6
maze迷路meiroメイロNoNC3.92.24.622121.12.9
moraleモラルmoraruモラルNoC4.63.22.715112.94.1
nakedhadakaハダカNoNC3.43.14.923111.04.3
necklaceネクレスnekuresuネクレスYesC3.94.55.513113.64.8
normal普通futsuuフツウNoNC4.14.42.142111.14.4
nosehanaハナYesNC3.94.15.511111.14.9
past過去kakoカコNoNC4.64.12.222111.04.3
peanutピーナツpiinatsuピーナツYesC3.34.05.812113.34.9
pelicanペリカンperikanペリカンYesC3.32.75.811113.44.8
pencil鉛筆enpitsuエンピツYesNC3.63.95.511111.14.8
penguinペンギンpenginペンギンYesC3.13.75.811113.35.0
pigbutaブタYesNC3.63.55.711111.14.6
pipeパイプpaipuパイプYesC4.03.15.014113.74.4
place場所bashoバショNoNC4.24.63.351121.04.3
plain明白meihakuメイハクNoNC4.93.62.4312211.03.3
poolプールpuuruプールYesC3.83.65.513133.64.3
prison刑務所keimushoケイムショNoNC4.43.15.423121.24.5
profit利益riekiリエキNoNC4.53.32.942211.04.5
pyramidピラミッドpiramiddoピラミッドYesC3.73.25.313113.44.6
rabbitウサギusagiウサギYesNC3.03.76.112111.05.0
raceレースreesuレースNoC4.54.24.413113.23.8
radioラジオrajioラジオYesC3.94.25.411122.54.6
rainameアメNoNC3.64.55.112111.34.1
rankランクrankuランクNoC4.14.43.217113.74.1
realリアルriaruリアルNoC4.54.32.514123.34.2
regularレギュラーregyuraaレギュラーNoC4.24.63.215123.33.7
releaseリリースririisuリリースNoC4.63.73.117113.73.9
rentalレンタルrentaruレンタルNoC4.24.53.918123.34.4
returnリターンritaanリターンNoC4.34.23.218113.44.3
ring指輪yubiwaユビワYesNC3.94.05.315121.14.0
rocketロケットrokettoロケットYesC3.83.55.211133.64.5
rollロールrooruロールNoC3.93.63.126213.24.0
ruleルールruuruルールNoC4.14.53.117113.54.3
sailor水兵suiheiスイヘイNoNC5.32.75.229121.14.1
scaleスケールsukeeruスケールNoC4.53.62.815113.34.0
scissorsはさみhasamiハサミYesNC3.32.75.711141.14.7
scoreスコアsukoaスコアNoC4.54.23.414113.74.3
screenスクリーンsukuriinスクリーンNoC4.34.54.912123.84.4
screwネジnejiネジYesNC3.23.13.014111.03.5
senseセンスsensuセンスNoC4.34.82.714123.83.8
shareシェアsheaシェアNoC4.44.75.414123.94.0
sharkサメsameサメYesNC3.13.15.222111.14.9
shirtワイシャツwaishatsuワイシャツYesC3.84.42.721121.73.4
shockショックshokkuショックNoC4.44.45.415113.74.3
showショーshooショーNoC4.24.43.615113.44.0
showerシャワーshawaaシャワーNoC3.64.55.312123.64.1
signサインsainサインNoC4.44.64.028224.03.3
singleシングルshinguruシングルNoC4.34.33.116123.24.1
sizeサイズsaizuサイズNoC4.04.73.312123.74.3
skiスキーsukiiスキーYesC4.04.15.611114.14.6
skillスキルsukiruスキルNoC4.64.53.214113.54.3
skirtスカートsukaatoスカートYesC3.54.05.611113.24.8
slipperスリッパsurippaスリッパYesC3.33.55.612123.44.2
slowスローsurooスローNoC4.24.32.712113.64.1
smell香りkaoriカオリNoNC4.23.72.942211.04.0
snakeヘビhebiヘビYesNC3.23.45.611121.14.9
snowman雪だるまyukidarumaユキダルマYesNC2.83.05.213111.04.6
sock靴下kutsushitaクツシタYesNC3.53.05.812121.04.6
solid固体kotaiコタイNoNC4.73.03.234111.24.1
spoonスプーンsupuunスプーンYesC2.84.45.913124.34.8
strokeなでるnaderuナデルNoNC3.53.24.227111.02.3
styleスタイルsutairuスタイルNoC4.54.52.318143.53.9
swan白鳥hakuchouハクチョウYesNC3.73.15.611111.04.6
tank戦車senshaセンシャYesNC4.13.65.534211.03.4
taskタスクtasukuタスクNoC4.64.12.914123.93.9
telephone電話denwaデンワYesNC4.04.35.141221.04.9
televisionテレビterebiテレビYesC3.64.45.711112.44.9
tentテントtentoテントYesC3.73.45.211113.94.8
tigerトラtoraトラYesNC3.23.45.711111.74.8
toasterトースターtoosutaaトースターYesC3.53.65.811113.54.7
tomatoトマトtomatoトマトYesC3.84.25.911112.84.9
tractorトラクターtorakutaaトラクターYesC3.72.75.211113.64.8
trapワナwanaワナNoNC4.03.53.911111.04.3
truckトラックtorakkuトラックYesC4.13.65.611213.53.9
trumpetトランペットtoranpettoトランペットYesC3.63.55.812113.75.0
turtleカメkameカメYesNC3.43.35.711111.04.8
umbrellakasaカサYesNC3.33.75.611111.04.9
vestベストbesutoベストYesC3.83.43.612213.44.2
view眺めnagameナガメNoNC4.64.23.2311211.04.1
violinバイオリンbairorinバイオリンYesC3.73.75.711123.34.8
wake覚めるsameruサメルNoNC4.23.62.823111.04.1
warm暖かいatatakaiアタタカイNoNC4.14.03.323111.04.6
waste浪費rouhiロウヒNoNC5.03.63.225221.04.0
wolfオオカミookamiオオカミYesNC3.83.05.411121.14.9
workワークwaakuワークNoC4.74.83.112113.64.0
youthユースyuusuユースNoC4.93.53.423213.43.9
zebraシマウマshimaumaシマウマYesNC2.93.15.711121.04.7

English name, Japanese name, the item pair in English and Japanese; Transcription, the roman letter transcription of the Japanese word; Katakana reading, the transcription of the Japanese word into katakana for confirming the phonetic reading; Picture naming, whether or not the item was selected from Nishimoto et al.'s (2005) picture naming study (if not, the item was selected from a high-frequency wordlist derived from a large web corpus; Kilgariff et al., 2004); Cognate status, whether the item can be termed a cognate (C) or noncognate (NC), based on its obvious phonological and semantic similarity in English and Japanese; Japanese (L1) age-of-acquisition, the mean rating for age of acquisition gained from 22 bilinguals using the following scale: (1) 0-2 years, (2) 3-4 years, (3) 5-6 years, (4) 7-8 years, (5) 9-10 years, (6) 11-12 years, and (7) 13 years or later; English (L2) familiarity, the mean rating gained from 19 bilinguals using a scale of 1-7: response categories ranged from very unfamiliar (1) to very familiar (7); Concreteness (L1), the mean concreteness rating for Japanese words gained from 18 bilinguals using a scale of 1-7: response categories ranged from very abstract (1) to very concrete (7); No. Trans L1-L2, the total number of different accurate English translations given for the Japanese word; No. Trans L2-L1, the total number of different accurate Japanese translations given for the English word; NoML1, the total number of different meanings translated for the L2 (English) word into the L1 (Japanese), e.g., only one meaning of the English word class was translated into Japanese; NoML2, the total number of different meanings translated for the L1 (Japanese) word into the L2 (English), e.g., only one meaning of the Japanese word クラス /kurasu/ 'class' was translated into English; Phonological similarity, the mean phonological similarity rating for item pairs using a scale from 1 (completely different) to 5 (identical); Semantic similarity, the mean semantic similarity rating for item pairs using a scale from 1 (completely different) to 5 (identical).

Procedure

All participants completed informed consent forms prior to beginning the experimental procedure. All surveys were administered using an online survey tool (www.surveymonkey.com). Fifteen participants were removed from the tasks due to incomplete responses or misunderstanding of the task. The total number of participants included in the tasks is shown in Table 1.

P and S similarity rating

Each item was rated on a 5-point scale ranging from 1 (completely different) to 5 (identical). A 5-point scale was used instead of the typical 7-point scale because, in a pilot study using a 7-point scale, the participants found it difficult to discriminate between some of the levels (e.g., between 5 and 6, or between 2 and 3). Instructions were provided in Japanese to ensure understanding of the task. A brief explanation and examples were provided at the beginning of each survey. For the P similarity-rating task, participants were asked to decide how similar the word pairs sounded on the basis of their intuition and were encouraged to say the words aloud if necessary to help them decide. The examples provided for the P similarity task included band–バンド (/bando/), stress–ストレス (/sutoresu/), and bird–鳥 (/tori/), which were rated as similar/very similar (4–5), somewhat similar/similar (3–4), and very different/different (1–2), respectively. For the S similarity-rating task, participants were asked to decide how similar in meaning the words in each pair were. The instructions asked participants to consider differences in senses shared and not shared between the languages, and also differences in use between the two languages. They were told not to use a dictionary, but to complete the task on the basis of their intuition (i.e., their knowledge of the words). The examples provided were triangle–三角 (/sankaku/), fan–扇子 (/sensu/), and clock–壁 (/kabe/ “wall” in Japanese), which were rated as very similar (5), somewhat similar (3), and very different (1), respectively. 
Additional explanatory text was included to make clear the basis for the ratings of the examples: The words triangle–三角 have one meaning that is almost identical in both languages, thus having considerable S similarity; fan has a range of meanings in English, whereas 扇子 in Japanese has only one meaning that is similar to that of a (hand-held) fan, therefore they have some S similarity, but also differ in some senses; finally, clock and 壁 do not share word meanings, and therefore these words have no S similarity. To ensure that all parts of the scale were used, 20 nontranslation equivalents were included in the stimulus list. (All nontranslation equivalents were rated as 1, or completely different, in terms of S similarity). All Chinese characters that may have been unknown to the participants were transcribed in the hiragana phonetic script. Because of the large number of items that required ratings for both P and S similarity, and the likelihood of “survey fatigue,” each participant rated half of the words for each type of similarity, but no participant rated a pair of words for both types of similarity. Each individual item was rated for both P and S similarity by between 16 and 18 different participants.

Number of translations task

Because bidirectional translation data are desirable for bilingual research, two lists were created with half of the items being translated from the L2 to the L1, and the other half being translated from L1 to L2. These lists were counter-balanced across participants and items were presented in random order; each item was only translated once (i.e., either from L2 to L1, or from L1 to L2) by an individual participant. Participants were asked to think of the first translation that comes to mind for each item and to enter that word in the space provided. Instructions were in Japanese and examples were provided in both forward and backward translation tasks; these examples included both cognates and noncognates, and were reversed for each language direction: for instance, L1 to L2, 鳥 (/tori/)–bird, ストレス (/sutoresu/)–stress; and L2 to L1, bird–鳥, stress–ストレス.

Age-of-acquisition rating

Participants were asked to rate Japanese words on a scale of 1–7 indicating the age at which they had learned the words in Japanese: The seven response categories included (1) 0–2 years, (2) 3–4 years, (3) 5–6 years, (4) 7–8 years, (5) 9–10 years, (6) 11–12 years, and (7) 13 years or later. Participants were asked to focus on when they acquired knowledge of the word itself rather than the written form, as this may vary depending on the script (i.e., kana or kanji). Instructions were in Japanese and an example provided for respondents was お母さん (/okaasan/ “mother”) whose meaning would be learned between the ages of 0–2 years, whereas its written form would typically be acquired between 3–6 years, with the kana form preceding the kanji form.

Concreteness rating

Participants were asked to rate Japanese word items on a scale of 1–7: response categories ranged from very abstract (1) to very concrete (7). Participants were asked to consider whether an item was easily pictured in their mind, making it concrete, or whether it was difficult to picture, in which case it was more abstract. No examples were provided with this task.

L2 familiarity rating

Participants were asked to rate English word items on a scale of 1–7: response categories ranged from very unfamiliar (1) to very familiar (7). Participants were asked to consider how often they use the words in speaking and writing and also in reading and listening. Instructions were in English and examples were provided (signature and abolish are not used every day, whereas book may well be). A clarification was made to consider the words only in English, not loanwords in Japanese (e.g., サッカー /sakkaa/, “soccer”). Because participants were asked to focus on their use of the words, this familiarity survey is similar to a subjective frequency survey (e.g., Gernsbacher, 1984).

Results and discussion

In this section we first describe the cross-linguistic measures (P and S similarity, number of translations), followed by the norming data (AoA, concreteness, L2 familiarity), and finally the additional data that we are including in the data set (L1/L2 frequency, L1/L2 number of senses, and L1/L2 word length [number of characters/number of syllables]). The descriptive statistics of all cross-linguistic, norming, and additional data are presented in Table 2. In what follows we make a distinction between cognate and noncognate items for the purposes of illustrating the characteristics of the stimuli. This distinction is based on both the script used (i.e., katakana for cognates and hiragana/kanji for noncognates) and the obvious P and S similarity between the words. However, the cross-linguistic similarity ratings provided in this research will allow for more precise measurements of “cognateness” in future empirical Japanese–English bilingual studies. We provide the Bayes factors (using the Jeffreys–Zellner–Siow prior, which places a Cauchy distribution on effect size) for comparisons between means of cognate and noncognate words (Rouder, Speckman, Sun, Morey, & Iverson, 2009). As reported here, Bayes factors below 1 indicate evidence against the null hypothesis of no difference, and values above 1 indicate evidence for it. The advantage of using a Bayes factor over a traditional t-test is that the factor represents both the presence and strength of an effect (see Jeffreys, 1961).
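As a rough illustration of what such a Bayes factor involves, the following sketch evaluates the integral form of the JZS Bayes factor for a two-sample t test given by Rouder et al. (2009), with a unit-scale Cauchy prior on effect size. It is not the authors' code (the software they used is not stated here); the function name `jzs_bf01`, the midpoint-rule integration, and the t values and group sizes in the usage lines are assumptions made for illustration.

```python
import math


def jzs_bf01(t, n1, n2, steps=20000):
    """Bayes factor BF01 (null over alternative) for a two-sample t test.

    Follows the integral form in Rouder et al. (2009) with a unit-scale
    Cauchy prior on effect size. Values below 1 favour a difference.
    """
    nu = n1 + n2 - 2                # degrees of freedom
    n_eff = n1 * n2 / (n1 + n2)     # effective sample size for two groups

    # Likelihood term under the null hypothesis.
    null_like = (1 + t * t / nu) ** (-(nu + 1) / 2)

    def integrand(g):
        # Likelihood term given g, weighted by the prior density on g.
        scale = 1 + n_eff * g
        like = scale ** -0.5 * (1 + t * t / (scale * nu)) ** (-(nu + 1) / 2)
        prior = (2 * math.pi) ** -0.5 * g ** -1.5 * math.exp(-1 / (2 * g))
        return like * prior

    # Midpoint rule after substituting g = u / (1 - u), which maps
    # (0, infinity) onto (0, 1); dg = du / (1 - u)^2.
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) / steps
        g = u / (1 - u)
        total += integrand(g) / (1 - u) ** 2
    alt_like = total / steps

    return null_like / alt_like


# Hypothetical t values and group sizes, not taken from the data set.
bf_no_difference = jzs_bf01(0.0, 50, 50)
bf_large_difference = jzs_bf01(5.0, 50, 50)
```

With these made-up inputs, the first call returns a value above 1 (the data favor the null) and the second a value far below 1, which is the same direction of interpretation as the Bayes factors listed in Table 2.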
Table 2

Descriptive statistics (range, mean, standard deviation) of all ratings and additional standardization measures for all items; means (and standard deviation) for cognate and noncognate items; and Bayes factor for the comparison of cognate and noncognate means

| Measure | Range | Mean (SD) | Cognate Mean (SD) | Noncognate Mean (SD) | Bayes Factor (BF) |
|---|---|---|---|---|---|
| Mean P similarity ratings | 1.0–4.3 | 2.3 (1.2) | 3.5 (0.4) | 1.1 (0.3) | <0.001** |
| Mean S similarity ratings | 2.3–5.0 | 4.2 (0.5) | 4.3 (0.4) | 4.2 (0.5) | 5.41 |
| Number of translations: L1–L2 translation | 1.0–6.0 | 1.6 (1.0) | 1.1 (0.3) | 2.0 (1.3) | <0.001** |
| Number of translations: L2–L1 translation | 1.0–12.0 | 2.8 (2.1) | 3.0 (2.1) | 2.5 (2.0) | 2.86 |
| Number of meanings: L1–L2 translation | 1.0–3.0 | 1.2 (0.4) | 1.1 (0.3) | 1.2 (0.5) | 0.53 |
| Number of meanings: L2–L1 translation | 1.0–4.0 | 1.4 (0.6) | 1.5 (0.6) | 1.3 (0.6) | 2.12 |
| Mean AoA ratings | 2.8–5.4 | 3.8 (0.6) | 4.0 (0.5) | 3.9 (0.6) | 5.85 |
| Mean concreteness ratings | 1.9–6.1 | 4.5 (1.2) | 4.4 (1.2) | 4.5 (1.3) | 7.22 |
| Mean L2 familiarity ratings | 2.0–5.4 | 3.8 (0.7) | 4.1 (0.5) | 3.6 (0.6) | <0.001** |
| Japanese word frequency (raw) | 0.0ᵃ–156,283 | 7,012.6 (19,013.6) | 6,706.7 (18,768.0) | 10,421.2 (25,027.3) | 0.18 |
| Log-transformed Japanese word frequency | 0.0–12.0 | 7.3 (1.7) | 7.3 (1.6) | 7.7 (1.7) | 0.03* |
| English word frequency (per million) | 1.1–861.4 | 70.3 (139.6) | 80.0 (151.2) | 60.6 (126.4) | 5.66 |
| Log-transformed English word frequency | 0.1–6.8 | 3.2 (1.5) | 3.2 (1.5) | 3.1 (1.3) | 6.57 |
| Japanese number of senses | 1.0–15.0 | 2.1 (1.9) | 1.9 (1.3) | 2.4 (2.3) | 2.09 |
| English number of senses | 1.0–45.0 | 8.0 (7.4) | 8.6 (8.3) | 7.2 (6.4) | 2.86 |
| Japanese word length (morae) | 2.0–6.0 | 3.3 (0.9) | 3.5 (0.9) | 3.1 (0.8) | 0.02* |
| English word length (letters) | 3.0–10 | 5.1 (1.3) | 5.3 (1.4) | 5.0 (1.3) | 4.12 |
| English word length (syllables) | 1.0–4.0 | 1.5 (0.7) | 1.6 (0.8) | 1.5 (0.6) | 4.81 |

*Very strong evidence against H0. **Decisive evidence against H0 (see Jeffreys, 1961; Kass & Raftery, 1995). ᵃOne item (シェアー /sheaa/ share) was not found in the Amano and Kondo (2000) corpus; hence, a single zero frequency is included in the data set.


P and S similarity

Respondents used all parts of the scale in both the P and S similarity rating tasks (Fig. 1). Cognate items were clearly distinguishable from noncognates on the basis of P ratings, with a Bayes factor that suggests a decisive rejection of the null hypothesis (BF < 0.001). S similarity ratings clustered toward the high end of the scale, indicating that items were mainly rated as being highly semantically similar across languages (Fig. 2; note that nontranslation fillers were removed from the analysis). The S ratings showed no difference between cognates and noncognates (BF = 5.41). This was expected, as the primary distinction between cognates and other translation equivalents is that cognates share both form and meaning, whereas noncognate translation equivalents share only meaning. This finding is consistent with that of Tokowicz et al. (2002), who found a similar result for Dutch–English translations, and thus refutes the assumption made by Van Hell and de Groot (1998) that cognates are more likely to share meaning because they share formal features. The present study shows that for languages that differ in script, formal (P) similarity does not make it more likely that words will share a greater amount of S similarity across languages.3
Fig. 1

Distribution of mean P similarity ratings for all items. The x-axis shows the mean ratings on a 5-point scale, with 1 being completely different and 5 being identical. The y-axis shows the number of translation pairs that fall into each mean rating band

Fig. 2

Distribution of mean S similarity ratings for all items (nontranslation filler items were removed from the S similarity task)


Number of translations

Two professional Japanese–English translators determined the accuracy of translations in both directions (L1–L2, L2–L1). Correct translations were then coded as either the expected translation or an alternative translation. The expected translation was the one provided by Nishimoto et al. (2005) for concrete items (the picture naming stimuli) or the translation assigned in the initial item selection stage (e.g., ball–ボール). The number of distinct meanings provided as translations was also determined and added to the database. (We did not count verb uses of nouns, adjectival uses of nouns, and so on, as different meanings. Also, where meanings were not easily distinguishable, such as in the case of find and locate for 見つける /mitsukeru/ in Japanese, they were treated as the same meaning; thus, our number of meanings measure is somewhat conservative, as only distinct meanings were coded as being different.) Additional data for the translation task are included in a separate sheet in the database (see the supplemental materials). Table 2 shows the descriptive statistics for the translation tasks in both directions. As expected, there were more errors when translating from the L1 into the L2 than when translating from the L2 into the L1 (11.5% vs. 8.4%). More information on error rates for translations of items in each direction is provided in the supplemental materials for this article. Also, the mean number of translations and the mean number of meanings provided were smaller when translating into the L2 than when translating into the L1 (mean translations: 1.6 vs. 2.8; mean meanings: 1.2 vs. 1.4). 
Interestingly, when comparing the number of translations of cognates versus noncognates across the two tasks, one difference emerged: When cognates are translated from Japanese to English, there is usually only one translation (M = 1.1, SD = 0.3), which is the English cognate (e.g., クラス /kurasu/ is translated as class); however, when translating the same cognates from English to Japanese, there is a greater range of translations (M = 3.0, SD = 2.1), which may or may not include the Japanese cognate translation. Furthermore, for concrete items such as television, which have only one translation in Japanese, these are translated using the Japanese cognate form (テレビ /terebi/); however, more abstract words, such as class and other verbs, which can have multiple translations in Japanese, are translated using multiple Japanese words (e.g., クラス /kurasu/, 学級 /gakkyuu/, or 等級 /toukyuu/). The Bayes factor (BF < 0.001) indicates decisive evidence for a difference between the mean numbers of translations for cognates and noncognates in the L1-to-L2 direction, indicating that when bilinguals translate cognates into English, they use significantly fewer translations than when translating noncognates into English. Noncognates had more than one translation on average, regardless of direction (L1 – L2, M = 2.0, SD = 1.3; L2 – L1, M = 2.5, SD = 1.8).
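The number-of-translations measure described above can be sketched as a count of distinct accurate responses per item and direction. The record layout, the function name `count_translations`, and the sample responses below are hypothetical; only the class / クラス / 学級 / 等級 example comes from the text.

```python
from collections import defaultdict


def count_translations(responses):
    """Count distinct accurate translations per item and direction.

    `responses` is an iterable of (item, direction, translation, accurate)
    tuples; this record layout is an assumption for illustration.
    """
    distinct = defaultdict(set)
    for item, direction, translation, accurate in responses:
        if accurate:  # only translations judged correct are counted
            distinct[(item, direction)].add(translation.strip().lower())
    return {key: len(values) for key, values in distinct.items()}


# Made-up responses, loosely modelled on the class / クラス example.
responses = [
    ("class", "L1-L2", "class", True),
    ("class", "L1-L2", "Class", True),   # same word, different capitalization
    ("class", "L2-L1", "クラス", True),
    ("class", "L2-L1", "学級", True),
    ("class", "L2-L1", "等級", True),
    ("class", "L2-L1", "机", False),      # judged inaccurate, so ignored
]
counts = count_translations(responses)
```

In this toy input the cognate receives one translation into English but three into Japanese, which is the asymmetry the paragraph above describes.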

Age of acquisition (AoA)

All parts of the scale were used, although few participants placed words in the earliest category (0–2 years). The mean AoA was 3.8, which is between the third and fourth categories (5–6 and 7–8 years; SD = 0.6; Table 2). There was no difference in AoA ratings between cognate and noncognate items. To test the reliability of the ratings, they were compared with Nishimoto et al.’s (2005) Japanese AoA ratings for picture stimuli and with Kuperman, Stadthagen-Gonzalez, and Brysbaert’s (2012) AoA ratings for English words. Only the items that existed in both the present and previous data sets could be subject to this analysis. Because Nishimoto et al.’s ratings focused on picture stimuli, only the concrete items occurred in both data sets. Kuperman et al.’s data set, however, is much larger and covers most of the concrete and abstract words in the present data set. Correlations for the Japanese picture stimulus items were modest (r = 0.26, CI = 0.04, 0.46) but stronger for the English word AoA ratings (r = 0.47, CI = 0.35, 0.58). The weaker correlation between our AoA ratings and Nishimoto et al.’s ratings may reflect the difference in task requirements. In Nishimoto et al., participants rated the AoA for the concepts depicted in the picture stimuli, whereas our measure reflects the acquisition of word knowledge, which may be acquired later than conceptual knowledge. In sum, the AoA ratings appear most comparable to those collected from English native speakers by Kuperman et al. AoA for words thus appears to have some overlap across languages.

Concreteness

All parts of the scale were used showing that the stimuli included a variety of concrete and abstract words (M = 4.5, SD = 1.2). There was no difference in concreteness ratings between cognate and noncognate items. Correlations with concreteness and imageability ratings for those items that could be cross-referenced (n = 76) taken from the MRC database (Coltheart, 1981) revealed strong correlations (r = 0.91, CI = 0.86, 0.94, and r = 0.84, CI = 0.76, 0.90, respectively), indicating that the present concreteness ratings collected with Japanese speakers are highly comparable to those collected with English speakers.
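The text does not say how the confidence intervals for the correlations were computed. One common choice is the Fisher z-transformation, sketched below; treating it as the method used here is an assumption, although it does reproduce the interval reported above for r = 0.91 with n = 76. The function name `pearson_ci` is illustrative.

```python
import math
from statistics import NormalDist


def pearson_ci(r, n, level=0.95):
    """Confidence interval for a Pearson correlation via Fisher's z."""
    z = math.atanh(r)                        # Fisher z-transform of r
    se = 1 / math.sqrt(n - 3)                # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


low, high = pearson_ci(0.91, 76)
print(f"{low:.2f}, {high:.2f}")  # prints "0.86, 0.94"
```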

English (L2) familiarity

All parts of the scale were used. The Bayes factor (BF < 0.001) for familiarity ratings for cognate and noncognate words provides clear evidence of a difference between the means, with the cognate familiarity (M = 4.1, SD = 0.5) being considerably higher than the noncognate familiarity (M = 3.6, SD = 0.6). This shows that English words that are cognate with Japanese were rated as significantly more familiar than those that are noncognate (see Yokokawa, 2009, for a similar finding). To test the reliability of these ratings, they were compared with Yokokawa’s L2 familiarity ratings for visually presented English words collected from Japanese learners of English. The correlation was high (r = 0.77, CI = 0.68, 0.84), suggesting that the present ratings are a comparable and reliable resource. Typically, norming data are collected from monolingual groups for use in monolingual studies. Such data can also be used as measures of one of a bilingual’s languages. However, a bilingual’s language processing system is not simply a combination of two monolingual systems (Grosjean, 1989). Research shows that a bilingual does not process language by accessing one lexicon exclusively depending on the language being used (Dijkstra, 2007). Instead, nonselective access in language processing by bilinguals suggests that cross-linguistic activation influences performance a great deal in a wide variety of language tasks (Dijkstra, 2007). Here we show that in an L2 rating task, a bilingual’s first language (the nontarget language) can modulate responses, demonstrating cross-linguistic influences in tasks that are not response-speed-dependent (i.e., for which RT is not the primary dependent variable). Thus, when researchers collect L2 norming data, such as familiarity, from bilinguals, they must consider the impact of cross-linguistic influences on such ratings. The present L2 word familiarity measure therefore incorporates bilingual participants’ familiarity with both of their languages. 
Although it is primarily a measure of L2 familiarity, this is clearly influenced by the L1 (as evidenced by the significantly higher familiarity ratings for cognate translations, which share form and meaning with the L1, than for noncognates, which only share meaning). Therefore, it is likely that this measure will be particularly predictive of bilinguals’ responses in word recognition tasks in the L2, at least for the particular sample population (i.e., mid-proficiency Japanese–English bilinguals). Because cross-linguistic influences tend to be more prominent in the weaker language (L2) than the dominant language (L1; Dijkstra & Van Heuven, 2002), the measure may be most predictive in L2 word recognition and/or production tasks. Moreover, this bilingual measure of L2 familiarity should be more predictive of word recognition responses for Japanese–English bilinguals than a monolingual measure of English word familiarity.

Additional data for items

Japanese word frequency

Word frequency in Japanese was taken from the Amano and Kondo (2000) database, which consists of word frequencies from all issues of the Asahi Japanese newspaper between 1985 and 1998 (see Appendix B and the supplemental materials for all additional data for items). The corpus has a total type frequency of 341,771 morphemic units and a total token frequency of 287,792,797 morphemic units (cf. Tamaoka & Makioka, 2009). When Japanese words were used in more than one script (e.g., camel–ラクダ/駱駝 /rakuda/), the frequencies of the word in each script were totaled. When words had more than one reading (e.g., head–頭, in which the Japanese as a stand-alone noun is read /atama/ and when used in a compound it is pronounced /gashira/ or /tou/), only the frequency of the stand-alone noun was used. Descriptive statistics for raw frequencies are provided in Table 2. We could not provide occurrences per million, as we only had the token count for morphemic units, which overestimates the actual number of “words” (which often have two or more morphemes) in the corpus. Log-transformed frequencies, which increase normality and reduce random variance, are also provided. The Bayes factor (BF = 0.03) provides very strong evidence for a difference between cognate and noncognate log-transformed word frequencies, though there was less evidence for such a difference using the raw frequencies (BF = 0.18). Thus, although our cognates were selected from a high-frequency wordlist of katakana loanwords in Japanese, they are still lower in frequency than the noncognates in the present sample. This may partially be due to the fact that cognates tend to have one borrowed meaning (i.e., few senses). This is especially true for borrowed verbs, adjectives, and adverbs, as native words often exist, and the borrowed words fill narrow lexical gaps. The implication of this is that it is difficult for researchers to match cognate and noncognate items in languages in which the cognates are all borrowed words. 
Therefore, mixed-effects modeling, which can account for multiple continuous variables such as word frequency and number of senses as well as P and S similarity, might be most suitable for analyses with Japanese–English cognates.
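The base of the log transform is not stated in the text. The ranges in Table 2 are consistent with a natural logarithm (an inference from the table, not a stated fact), as the sketch below checks. How the single zero-frequency item was handled is also not stated; mapping it to 0.0 is an assumption that matches the reported minimum. The function name `log_frequency` is illustrative.

```python
import math


def log_frequency(freq):
    """Natural log of a frequency; a zero count is mapped to 0.0 (assumed)."""
    return math.log(freq) if freq > 0 else 0.0


# Range endpoints from Table 2, rounded to one decimal place.
print(round(log_frequency(156283), 1))  # Japanese raw maximum -> 12.0
print(round(log_frequency(861.4), 1))   # English per-million maximum -> 6.8
print(round(log_frequency(1.1), 1))     # English per-million minimum -> 0.1
```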

English word frequency

Word frequency per million words in English was taken from the SUBTLEX corpus of film and television subtitles (Brysbaert & New, 2009), available from the English Lexicon Project (Balota et al., 2007). Log-transformed frequencies, which increase normality and reduce random variance (Baayen, 2008), are also provided (logSUBTLEX). There was no difference in English word frequency or log-transformed frequency for cognate and noncognate items. In addition to the frequencies from the subtitles corpus (SUBTLEX) for English and the newspaper corpus (Amano & Kondo, 2000) for Japanese, we provide an additional set of corpus frequencies taken from large web corpora for each language. These two corpora were obtained from the Sketch Engine website (www.sketchengine.co.uk; Kilgariff et al., 2004); the English corpus (ukWaC) contains 1,318,612,719 words, and the Japanese corpus (JpWaC) contains 333,246,192 words. The advantage of using these corpora is that they are comparable in terms of their derivation: Both are derived from the Web, specifically from shopping and commercial websites, blogs, and discussion forums. The ukWaC log frequencies correlate strongly with the SUBTLEX log frequencies (r = 0.78, CI = 0.72, 0.83), and the JpWaC log frequencies correlate strongly with the log frequencies from the Japanese newspaper corpus (r = 0.71, CI = 0.63, 0.77). The two Web corpora also correlated with each other (r = 0.70, CI = 0.60, 0.75) to a much higher degree than the English subtitles and Japanese newspaper corpora did (r = 0.35, CI = 0.22, 0.47). Thus, whereas within-language corpus correlations are strong for both languages, the Web corpora correlate better across languages, suggesting that they draw on similar kinds of text. These corpora may therefore also prove to be valuable resources for studies of Japanese–English bilingual language processing. 
All log-transformed frequencies are provided in Appendix B (and the supplemental materials).

Number of English senses

The total number of senses regardless of class (verb, noun, etc.) was taken from the online version of WordNet (Princeton University, 2010). There was no difference in the numbers of English word senses between cognate and noncognate items.

Number of Japanese senses

The total number of senses for Japanese words was taken from the Meikyo Kokugo Jiten (Meikyo Japanese Dictionary, 2008 edition). In four cases, the Japanese loanword was not listed as a single entry (i.e., only as a compound entry) in the selected dictionary; therefore, the number of senses for these items was taken from a second dictionary, Koujien, 6th edition (2008), in which the items were listed as single entries. As with the number of English senses, there was no difference between cognates and noncognates in terms of the number of Japanese senses.

Japanese word length

Japanese word length was calculated as the total number of morae in each word. A mora is the basic rhythmic unit in Japanese, roughly corresponding to a syllable. For example, 魚 /sakana/ “fish” is written in kanji (Sino-Japanese characters) and contains three morae, which can be visualized by transcribing the word using the phonetic script, hiragana: さかな (/sa/, /ka/, /na/). On the other hand, カンガルー /kangaruu/ “kangaroo” is written in katakana, which is used for writing loanwords, and contains five morae in Japanese (/ka/, /n/, /ga/, /ru/, and /u/), even though the English word contains only three syllables. This exemplifies how the Japanese phonemic system determines the resulting phonetic constitution of the borrowed word, while also briefly illustrating the use of the three scripts of the Japanese language. The Bayes factor (BF = 0.02) provides very strong evidence for a difference in the numbers of morae in cognate and noncognate words, such that the former were longer on average. This is not surprising given that loanwords, which are rephonalized into Japanese from English, tend to be longer than native Japanese words, which typically contain 2–4 morae.
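The mora count can be approximated directly from a kana transcription, such as the katakana readings supplied with the items. The sketch below is an assumption about how one might automate the count, not the authors' procedure: every kana character counts as one mora (including ン, the small ッ, and the long-vowel mark ー), except the small kana that combine with the preceding character (as in チョ). The name `count_morae` is illustrative.

```python
# Small kana that attach to the preceding kana and do not add a mora.
SMALL_KANA = set("ゃゅょぁぃぅぇぉゎャュョァィゥェォヮ")


def count_morae(kana):
    """Count morae in a word written in hiragana or katakana."""
    return sum(1 for ch in kana if ch not in SMALL_KANA)


print(count_morae("さかな"))      # 3, as in the 魚 /sakana/ example
print(count_morae("カンガルー"))  # 5, as in the カンガルー example
print(count_morae("ハクチョウ"))  # 4: ハ, ク, チョ, ウ
```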

English word length

The numbers of letters and syllables in each English translation were used as two separate measures of English word length. As expected, the word lengths did not differ for the English translations of cognates and noncognates, whether we looked at the number of letters or the number of syllables in each word.

Correlations between ratings and collected measures

S similarity

It is likely that the S similarity is related to other lexical–semantic characteristics, such as concreteness or number of translations. In order to assess whether this is the case, a number of predictors were selected for a correlation analysis with the S similarity measure derived in this study: number of translations (in both directions), number of meanings translated (in both directions), concreteness, number of senses in Japanese and English, and P similarity (Table 3). First, S similarity was strongly negatively correlated with the number of translations measures in the L1 to L2 direction (r = − 0.29, CI = − 0.41, –0.16) and in the L2 to L1 direction (r = − 0.41, CI = − 0.52, –0.29). This shows that as the number of translations increases, S similarity decreases, which is similar to Tokowicz et al.’s (2002) finding for Dutch–English translations. Second, S similarity was negatively correlated with the number of meanings translated in the L1 to L2 direction (r = − 0.20, CI = − 0.33, – 0.06) and less so in the L2 to L1 direction (r = − 0.14, CI = − 0.28, 0.00). Again, the negative correlation shows that words translated with more meanings were rated as less semantically similar across languages. The number of translations measures (L1–L2, L2–L1) were not strongly correlated (r = 0.13, CI = −0.01, 0.27); this was also the case for the number of meanings (r = − 0.09, CI = − 0.23, 0.05). This reflects the fact that the degree of S knowledge varies across languages, with participants having a greater knowledge of S characteristics of words in the L1 relative to the L2. Third, concreteness was highly correlated with S similarity (r = 0.40, CI = 0.27, 0.51), such that the more concrete the words were rated, the more semantically similar across languages they are (this is similar to Tokowicz et al., 2002). 
Fourth, S similarity was highly negatively correlated with the number of English senses (r = −0.31, CI = −0.43, −0.18) but much less so with the number of Japanese senses (r = −0.10, CI = −0.24, 0.04). The discrepancy may well be due to the different degrees of sense disambiguation in the English and Japanese sources (WordNet vs. Meikyo Japanese Dictionary), the former tending to provide many senses, and the latter tending to be more conservative. Nevertheless, the two measures of number of senses were strongly correlated (r = 0.37, CI = 0.24, 0.49). Taken together, the numbers of translations, meanings, and senses, along with concreteness, appear to be important S characteristics that determine cross-linguistic S similarity.
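Table 3

Intercorrelations among factors

| Factor | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| 1. Semantic similarity | | −0.29 | −0.41 | −0.20 | −0.14 | 0.40 | −0.10 | −0.31 | −0.09 |
| 2. Number of translations from L1 to L2 | | | 0.13 | 0.01 | 0.42 | −0.31 | 0.10 | 0.05 | −0.43 |
| 3. Number of translations from L2 to L1 | | | | 0.22 | 0.12 | −0.51 | 0.11 | 0.37 | 0.13 |
| 4. Number of meanings of items translated into L2 | | | | | −0.09 | −0.13 | 0.10 | 0.22 | 0.12 |
| 5. Number of meanings of items translated into L1 | | | | | | −0.11 | 0.18 | 0.01 | −0.11 |
| 6. Concreteness | | | | | | | −0.20 | −0.37 | −0.04 |
| 7. Number of senses in Japanese | | | | | | | | 0.37 | −0.09 |
| 8. Number of senses in English | | | | | | | | | 0.12 |
| 9. Phonological similarity | | | | | | | | | |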
In addition, the role of P similarity was explored in order to determine whether there was any relationship between it and S similarity; however, the two similarity measures were not strongly correlated (r = −0.09, CI = −0.23, 0.05). This is consistent with Tokowicz et al. (2002), who reported a similar finding for Dutch–English translations. Interestingly, P similarity was highly correlated with the number of translations in the L1–L2 direction (r = −0.43, CI = −0.54, −0.31) but much less so in the L2–L1 direction (r = 0.13, CI = −0.01, 0.27). This shows that more phonologically similar items (i.e., cognates) had fewer translations in the L2 than did phonologically dissimilar items (i.e., noncognates); for example, コール /kooru/ “call” is usually translated into English using the cognate translation only (i.e., call). Finally, P similarity was not correlated with the numbers of meanings in the L2 (r = 0.12, CI = −0.02, 0.26) or in the L1 (r = −0.11, CI = −0.25, 0.03), which demonstrates that although fewer different translations were provided for cognates than for noncognates in the L1–L2 direction, the numbers of meanings provided did not differ depending on cognateness or direction of translation. The present study is the first to report this interesting difference in the numbers of translations for language pairs whose cognates do not share etymological origins but are instead loanwords. This characteristic of borrowed words is also likely to be observable in language pairs such as Korean–English. Thus, when bilinguals translate Korean loanwords into English, they are likely to use a single translation, but this will not be the case when translating from English into Korean. To illustrate, the English word style can be translated into various Korean words: 스타일 /sutail/, 모양 /moyang/, 품격 /pumkyek/, or 문체 /munche/. 
However, when translating the Korean loanword 스타일 /sutail/, Korean–English bilinguals will use only the English word style. Because the number of translations influences bilingual processing, this feature of loanwords in such languages is thus important for understanding bilingual processing mechanisms.
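The directional asymmetry in the number of translations can be made concrete by counting the distinct responses given for each source word. The sketch below is illustrative only: the response lists are hypothetical, built from the Korean example in the text rather than from collected data, and `number_of_translations` is an assumed helper name, not a function from the study.

```python
from collections import defaultdict

def number_of_translations(responses):
    """Count the distinct translations produced for each source word.

    `responses` is an iterable of (source_word, translation) pairs,
    one pair per participant response.
    """
    distinct = defaultdict(set)
    for source, translation in responses:
        # Normalize case and whitespace so "Style" and "style" count once.
        distinct[source].add(translation.strip().lower())
    return {source: len(words) for source, words in distinct.items()}

# Hypothetical responses based on the example in the text (not real data).
l2_to_l1 = [("style", "스타일"), ("style", "모양"), ("style", "품격"),
            ("style", "문체"), ("style", "스타일")]
l1_to_l2 = [("스타일", "style"), ("스타일", "Style"), ("스타일", "style")]

print(number_of_translations(l2_to_l1))  # English word into Korean
print(number_of_translations(l1_to_l2))  # Korean loanword into English
```

With these illustrative responses the English word receives four distinct translations, whereas the loanword receives one, which is the pattern described for loanword cognates.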

P similarity and cognates

In most research to date, words have been dichotomized as cognate or noncognate on the basis of the degree of formal and S overlap. However, as we have shown here, words that are typically classed as cognate can vary in terms of their cross-linguistic P overlap. Because formal overlap across languages has been shown to influence processing in bilingual tasks, both as a dichotomous “cognate status” variable (Hoshino & Kroll, 2008; Taft, 2002) and as continuous measures of P and/or O overlap (Allen & Conklin, 2013; Dijkstra et al., 2010), it is crucial to investigate the role of overlap in bilingual processing. As can be seen in Table 4, P similarity was highly correlated with cognate status (r = −0.96, CI = −0.97, −0.95), showing that the two measures capture much the same characteristic. The almost complete correlation between P similarity and cognate status demonstrates just how well P similarity can categorize items as either cognate or noncognate. Importantly, because bilinguals have been shown in this research to be sensitive to the degree of P similarity between translations across languages, as opposed to simply knowing that words are either cognate or noncognate, P similarity is a superior measure of bilinguals’ actual word knowledge and thus should prove to be a more valid measure of bilingual performance in tasks that investigate cross-linguistic processes.
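A correlation between a continuous rating and a two-level category is a point-biserial correlation, which is simply Pearson's r computed with the category coded numerically. The sketch below uses hypothetical ratings and an assumed coding (0 = cognate, 1 = noncognate); neither the values nor the coding comes from the study. It illustrates how the two measures can correlate almost perfectly while the ratings still vary within the cognate category.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation; with a 0/1 variable this is the point-biserial r."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical P similarity ratings (arbitrary scale), five items per category.
p_similarity = [4.8, 4.5, 4.1, 3.6, 3.2, 1.3, 1.1, 1.2, 1.0, 1.4]
# Assumed coding: 0 = cognate, 1 = noncognate.
cognate_status = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

r = pearson_r(p_similarity, cognate_status)

# Range of ratings within the cognate category, which the dichotomy discards.
cognate_ratings = [p for p, s in zip(p_similarity, cognate_status) if s == 0]
spread = max(cognate_ratings) - min(cognate_ratings)

print(f"point-biserial r = {r:.2f}, spread within cognates = {spread:.1f}")
```

The sign of r depends only on which category is coded as the higher number, so a strongly negative value here means that cognates receive the higher similarity ratings.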
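Table 4

Intercorrelations among factors

| Factor | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1. Phonological similarity | | −0.96 | −0.24 | 0.06 | 0.24 | −0.05 | −0.02 |
| 2. Cognate status | | | 0.25 | −0.06 | −0.25 | −0.09 | −0.08 |
| 3. Log-transformed Japanese word frequency | | | | 0.35 | −0.24 | −0.30 | −0.11 |
| 4. Log-transformed English word frequency | | | | | −0.30 | −0.42 | −0.42 |
| 5. Japanese word length (mora) | | | | | | 0.45 | 0.39 |
| 6. English word length (letters) | | | | | | | 0.78 |
| 7. English word length (syllables) | | | | | | | |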
Also, whereas P similarity was correlated with Japanese log word frequency (r = −0.24, CI = −0.37, −0.10), it was not correlated with English log word frequency (r = 0.06, CI = −0.08, 0.20). The same pattern is apparent for cognate status and the two log word frequency measures. This highlights the fact that in Japanese, cognates are typically of lower frequency than noncognates, even though we specifically selected items from a high-frequency word list in Japanese. Finally, whereas neither the number of English letters nor the number of English syllables was correlated with P similarity, the number of mora in Japanese was (r = −0.25, CI = −0.38, −0.11). This highlights the fact that Japanese cognates, which are loanwords from English, tend to have a greater number of mora than native Japanese words (i.e., noncognates).

Conclusions

The goal of this study was to provide cross-linguistic norming data for Japanese–English translation equivalents, which will be a useful resource for researchers of bilingual processing of Japanese and English languages. This is the first study to provide such rich resources for languages that differ in script. The data may be used for norming items for use in production tasks such as picture naming (see also Nishimoto et al., 2005; Szekely et al., 2004), word naming and translation, and also comprehension tasks, such as lexical decision, sentence-context reading studies, and studies using progressive de-masking techniques or the masked priming paradigm (e.g., Nakayama et al., 2012). In addition, we highlight a number of important features of cross-linguistic similarity for Japanese–English translations. First, we showed that P similarity ratings are varied for translation equivalents and distinguish between cognates and noncognates as well as within the cognates category. Thus, P similarity is more likely to reflect the processing mechanisms of bilinguals than a dichotomous all or nothing categorization of similarity, even though cognate status and P similarity are very highly correlated predictors. Second, we showed that although S similarity ratings do not differ significantly for cognate and noncognate items (contra the assumptions of Van Hell & de Groot, 1998), the number of translations varies by direction. Specifically, when Japanese loanwords are translated into English, one translation is unanimously preferred. However, when English words that have loanword equivalents in Japanese are translated, bilinguals use not only the Japanese loanwords but other words as well. This interesting feature may well be present in other languages that borrow from English but do not share its etymological origins, such as Korean–English and Chinese–English. 
Such knowledge is crucial for selecting stimuli for experiments that test theories of bilingual processing and representation. We also provided measures of standardization that are not freely available for all of the Japanese items in the present study (age of acquisition and concreteness) and bilingual norming data for English word familiarity. In the L2 familiarity study we observed language transfer effects that resulted in English cognates receiving higher familiarity ratings than noncognates, which is likely due to the effect of cross-linguistic similarity. This further stresses the important role of cross-linguistic similarity in offline, as well as online, tasks. Finally, additional information (frequency, number of senses, and word length) was provided. Cognates tend to be lower in L1 frequency and longer in the number of Japanese characters (or morae), whereas these factors are no different for cognates and noncognates in the L2 (English). To deal with these inherent differences between Japanese cognates and noncognates, bilingual research that uses cognates might benefit from the use of mixed-effects modeling, as this method can account for multiple continuous variables, such as frequency, length, and number of senses, as well as the researchers’ particular variables of interest. All in all, the present data set provides the richest cross-linguistic lexical resource currently available for bilingual studies with different-script languages.
Table 6

Additional measures data (Word frequencies, word length, number of senses)

English name | Japanese name | SBTLWF | logSBTLWF | AK | logAK | UkWac | logUK | JpWac | logJP | Length | Syll | Mora | ENoS | JNoS
accessアクセス31.73.4619017.5542427212.962965310.3062482
acid10.02.304316.073283010.4042188.3542254
aid援助13.92.633509810.478752411.38188249.8431361
arm65.44.1856568.644171710.64110399.3131285
arrow矢印7.82.062135.3693309.1412397.1252421
ashtray灰皿3.31.183645.905646.345606.3372411
balloon風船8.72.168046.6988379.094276.0672442
bananaバナナ10.72.3712647.1457748.6627707.9363321
bedベッド187.15.2345718.439116911.4294279.15313132
benchベンチ9.72.2729407.99152469.6328177.9451391
bicycle自転車6.61.8978098.9693519.14140779.5573421
blank空白9.72.2726277.87185519.8318947.5551482
bone26.13.2663128.753226810.3883659.0341266
bricksレンガ3.91.3714887.3190209.1110957.0061321
broomほうき4.81.563945.9816917.435246.2651351
brushブラシ14.22.653855.95127809.469886.90513141
busバス74.24.31118019.389418211.452900210.2831271
bust27.63.3211375511.6455108.61153239.64412107
buttonボタン28.33.3425587.857251611.19117249.3762392
cakeケーキ45.13.819426.85185359.8377458.9541341
callコール861.46.7630218.0126980512.5141618.33413414
camelラクダ5.01.615426.3028377.953625.8952312
careケア485.36.1817307.4636514612.8182449.02412112
careerキャリア45.23.8115257.3313519511.8187839.0862433
carrot人参3.81.3418127.5039028.279366.8462442
caseケース282.45.642258710.0349569913.112732310.22413222
caution注意5.21.64123089.42136909.523433810.4472353
cherryさくらんぼ13.62.617086.5643018.37152.7162551
chimneyエントツ4.21.436356.4579248.986876.5372421
classクラス117.44.7799279.2023439912.362451810.1151392
classicクラシック16.22.7817297.466085711.0252378.5672552
clearクリアー171.85.159546.8629597412.608146.70514454
clue手がかり17.62.8713167.18109139.3019607.5841432
coolクール195.95.281935.265310810.8841698.34413112
coralサンゴ2.40.8610786.9869628.853785.9352352
coreコア9.82.281955.278262011.3250988.54412115
courseコース487.26.19107289.2862041813.342342110.06613134
cow25.53.2445858.43111459.32114459.3531241
crime犯罪71.24.27126949.4510019611.512614610.1751421
crossクロス55.04.012585.558484011.3545708.43513163
cure治る20.83.0420357.62177729.7915747.3641358
curtainカーテン10.32.339626.8771988.8825137.8372432
cycleサイクル5.91.7713967.246946311.1543198.37524113
deerシカ8.72.162485.51137479.5324827.8241211
demand要求17.12.844760710.7710581711.573135110.35625111
desk43.93.7840918.322759310.2365078.7841312
dolphinイルカ2.81.028916.7933868.1317557.4772321
doorドア292.15.6845578.4211922711.69187559.8441251
dressドレス87.24.475866.373035810.3230378.02513211
dresserたんす3.61.278696.7710816.994366.0872371
eagleワシ11.52.4413047.1741518.3315407.3452261
elephant11.42.4317327.4675648.9324317.8083221
exit出口15.62.7520617.632835310.2539978.2942351
fail失敗24.63.20143149.574684410.752242210.02414111
find見つける831.06.72128779.4666957413.4165688.79414182
firm会社35.33.5612116211.709387611.4512429111.73413141
fish83.54.4272058.8810030211.52143419.5741361
flag17.52.8633988.132631810.183075.73412134
fluteフルート2.10.756106.4157048.6510636.9751441
foolバカ89.34.4917277.4597739.1988879.0941277
foot64.94.17199209.907286111.203417910.444121410
forkフォーク8.82.1814067.2588459.0920107.6141392
foxキツネ21.63.078676.7784769.047016.55313102
frogカエル11.82.477946.6835958.1922347.7141341
front181.65.2015628311.9620347112.2224098012.395121315
fuel燃料17.22.8477828.966864611.14110559.3141451
fund資金10.62.365982911.007247311.193081810.3441391
future将来103.54.643500710.4638303812.863582810.4962472
genreジャンル1.10.0615997.38207669.9493169.1452341
giraffeキリン1.50.408916.799756.888936.7972312
glassグラス60.74.116706.517589611.2435788.18513123
goalゴール16.82.8264558.779887211.5096859.1841343
goatヤギ10.52.354146.0347108.468896.7941241
gorillaゴリラ5.61.713485.8512497.135536.3273311
grapesブドウ3.91.3712817.1646558.458446.7461321
guitarギター15.62.7513177.184837810.7982909.0262311
hammockハンモック1.40.33513.939396.841364.9172521
hangerハンガー1.40.302255.4211387.044646.1462421
hate憎む214.65.376486.472835810.255816.3641321
head371.55.92212919.9723615112.375785910.9741349
heartハート244.25.504246.0518864312.1539278.28513103
helicopterヘリコプター15.82.7649028.50106469.2714987.31104611
helmetヘルメット9.52.2511357.0391289.1211277.0362522
hope希望320.65.772391510.0821531012.283162210.3641392
ideal理想7.31.9947448.469164011.43134749.5152351
ironアイロン17.92.894066.014229510.657916.6742461
jarつぼ8.32.127306.5960008.707036.5662476
joint関節27.63.3213747.238561211.3651008.54514131
jokeジョーク73.04.295846.37150619.6221297.6641461
jury陪審42.83.769216.83138159.5319177.5642421
kangarooカンガルー2.30.841064.6610656.974536.1283511
kickキック73.44.303945.982882010.2722067.70413142
kissキス121.24.806476.4780979.0036278.2041261
ladderはしご9.32.226096.41134139.506146.4262342
learn習う118.64.7820227.6116450712.0110226.9351361
left484.56.18134349.5145649113.03216519.98413246
lemonレモン12.02.49844.4386579.0718967.5552351
lessonレッスン32.23.477736.653486510.4676388.9462442
lionライオン15.42.737576.6380538.9923657.7742441
lips31.23.448686.77123909.4232148.0841451
loanローン19.92.9940798.316424011.0795679.1741331
lobsterザリガニ7.31.991284.8524467.803585.8872422
localローカル41.73.734776.1780759813.6044328.4052451
loose緩い41.83.738746.772993910.314666.14513186
luckyラッキー143.54.973175.763706710.5237728.2452431
matter物事370.65.9212527.1319900912.2065978.7962471
maze迷路2.60.945096.2360958.729516.8641321
moraleモラル4.11.4216127.3970578.8628577.9662321
naked39.33.6716177.39105319.2634198.1452354
necklaceネクレス9.82.284566.1220957.657326.6082511
normal普通70.44.2576678.9413650111.824950910.8162352
nose69.84.2420507.632357310.07107359.28412102
past過去123.84.824072610.6128872812.574308410.6741264
peanutピーナツ12.42.511815.2019637.581204.7962551
pelicanペリカン1.80.56954.556146.421765.1773411
pencil鉛筆9.92.2913577.2181389.0016947.4362451
penguinペンギン2.91.063955.9815977.3814507.2872411
pig39.13.6714327.27114649.3550908.5431292
pipeパイプ19.42.9648338.482229010.0128277.95414104
place場所602.76.402695110.2071159613.486722711.12512303
plain明白21.83.085156.243861310.5636138.19514111
poolプール47.03.8532988.106592311.1059058.6841374
prison刑務所66.04.1927627.924707510.7642558.3662421
profit利益11.02.393506410.464523910.723202110.3762342
pyramidピラミッド4.01.397136.5741718.3420577.6373571
rabbitウサギ20.93.0415357.34104459.2521757.6862341
raceレース61.94.1381479.0113270711.80102949.24413102
radioラジオ77.24.3586969.079522011.46130139.475351
rain48.93.89122419.415065010.832456610.1141243
rankランク8.52.1429637.99200699.9147168.46413131
realリアル442.86.092055.3231006412.6476448.94413131
regularレギュラー33.93.528676.7714149011.8621157.66735173
releaseリリース36.33.59724.2812679011.75107889.29724222
rentalレンタル4.81.5713247.19209169.9545848.4362441
returnリターン91.74.521855.2224375512.4021537.67624293
ring指輪92.84.5310766.985505310.9222197.70413154
rocketロケット11.82.4737148.2288029.0844328.4062471
rollロール63.34.151835.213685110.5132658.09413334
ruleルール48.13.8794819.168176011.31194199.8741351
sailor水兵12.42.522395.4838698.262495.5262431
scaleスケール9.52.2510236.9310071911.5237478.23514183
scissorsはさみ6.71.9013697.2229868.007536.6282343
scoreスコア30.43.4212577.145997411.0048108.48513182
screenスクリーン23.43.1513057.1711769311.6851728.5565164
screwネジ37.53.625766.3687859.0813847.23512102
senseセンス131.84.8810576.9620019312.2157358.6551392
shareシェア69.54.240NA16551312.0257228.65512101
sharkサメ15.02.714576.1244918.417606.635251
shirtワイシャツ46.43.846976.55145299.583785.9351421
shockショック28.83.3658258.673005210.3188449.09513173
showショー488.46.1926437.8837508812.8388999.09413162
showerシャワー41.13.729036.813330410.4136088.19613111
signサイン133.34.8929567.9911062011.6170178.86413202
singleシングル72.14.2811967.0926744212.5044768.41624109
sizeサイズ46.13.8321647.6819052912.16169259.7441391
skiスキー8.12.0954448.60195149.8869128.8431322
skillスキル7.92.07233.144336110.6894519.155127
skirtスカート10.02.3011247.0244148.3923757.7751472
slipperスリッパ2.20.801915.256606.49313.4372421
slowスロー76.04.331074.676211811.0433868.13413111
smell香り83.14.4235408.17191649.8687969.08513102
snakeヘビ22.43.1112117.1066618.8024627.8151281
snowman雪だるま1.90.643575.884696.153825.9572511
sock靴下9.02.196456.4717937.498726.7741431
solid固体19.62.973645.905211410.8641358.33523181
spoonスプーン7.62.034206.0451998.5614267.2651452
strokeなでる13.12.575926.382576310.161745.16613163
styleスタイル30.13.4039628.2816046311.99162919.70514123
swan白鳥6.81.927936.6822657.7312237.1141441
tank戦車25.63.2434958.162957310.2926797.8941371
taskタスク12.72.54122.489427811.4523427.7641342
telephone電話32.43.486128911.0210844511.597212611.1993322
televisionテレビ33.93.526063611.018318411.335219810.86104321
tentテント17.52.8627197.91147639.6029437.9941321
tigerトラ18.52.9214947.3147448.4647318.4652222
toasterトースター3.91.36564.0315207.331895.2472521
tomatoトマト5.91.7717857.4973628.9039018.2763321
tractorトラクター3.71.324936.2042198.356116.4272521
trapワナ23.83.178676.77139509.545406.29412162
truckトラック72.94.2995439.16126509.4510926711.6051431
trumpetトランペット4.11.423175.7652768.577736.6572641
turtleカメ17.02.843995.9928677.9631268.0562231
umbrella7.52.0117067.4497229.1834708.1583242
vestベスト5.61.7247368.4628307.95105869.2741372
view眺め38.53.652945.6834015912.74100989.22413131
violinバイオリン4.81.5616137.3977738.9616557.4163511
wake覚める105.24.6610236.93201549.919766.8841394
warm暖かい52.13.9547948.486698011.1139918.29415136
waste浪費53.33.978536.7511228611.6315047.32513161
wolfオオカミ20.33.018956.8049968.5217547.4741462
workワーク798.06.682125.36161554214.3086839.07413241
youthユース16.82.825206.256588011.1028057.9451361
zebraシマウマ2.50.92744.3015437.341575.0652411

English name, Japanese name, the item pair in English and Japanese; SBTLWF, the word frequency per million of the English item taken from the SUBTLEX corpus (Brysbaert & New, 2009); logSBTLWF, the log-transformed SBTLWF word frequency; AK, the raw word frequency of the Japanese item taken from a corpus of Japanese newspaper articles (Amano & Kondo, 2000); logAK, the log-transformed AK word frequency; UkWac, the raw word frequency of the English item taken from a 1-billion-word web corpus (Kilgariff et al., 2004); logUK, the log-transformed UkWac word frequency; JpWac, the raw word frequency of the Japanese item taken from a 300-million-word web corpus (Kilgariff et al., 2004); logJP, the log-transformed JpWac word frequency; Length, the number of letters in the English word; Syll, the number of syllables in the English word; Mora, the number of mora in the Japanese word; ENoS, the total number of senses of the English word taken from WordNet (Princeton, 1990); JNoS, the total number of senses of the Japanese word, taken from Meikyou dictionary (2008).
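The note does not state the base of the log transformation. The values in the table appear to be consistent with the natural logarithm, which the sketch below checks against one row. The raw and log values are my reading of the "access" row (31.7 and 3.46, 1901 and 7.55, 424272 and 12.96, 29653 and 10.30); because the columns run together in this copy of the table, that reading is itself an assumption.

```python
import math

# (raw frequency, reported log value) pairs as read from the "access" row.
access_row = {
    "SBTLWF": (31.7, 3.46),
    "AK": (1901, 7.55),
    "UkWac": (424272, 12.96),
    "JpWac": (29653, 10.30),
}

for measure, (raw, reported) in access_row.items():
    natural_log = math.log(raw)  # math.log with one argument is base e
    print(f"{measure}: ln({raw}) = {natural_log:.2f}, reported {reported:.2f}")
```

A base-10 logarithm would give about 1.50 for a frequency of 31.7, so the reported 3.46 fits base e rather than base 10.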

1.  Number-of-translation norms for Dutch-English translation pairs: a new tool for examining language production.

Authors:  Natasha Tokowicz; Judith F Kroll; Annette M B de Groot; Janet G van Hell
Journal:  Behav Res Methods Instrum Comput       Date:  2002-08

2.  Cognate effects in picture naming: does cross-language activation survive a change of script?

Authors:  Noriko Hoshino; Judith F Kroll
Journal:  Cognition       Date:  2007-03-23

3.  Sentence context modulates visual word recognition and translation in bilinguals.

Authors:  Janet G van Hell; Annette M B de Groot
Journal:  Acta Psychol (Amst)       Date:  2008-05-16

4.  Moving beyond Kucera and Francis: a critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English.

Authors:  Marc Brysbaert; Boris New
Journal:  Behav Res Methods       Date:  2009-11

5.  Translation priming with different scripts: masked priming with cognates and noncognates in Hebrew-English bilinguals.

Authors:  T H Gollan; K I Forster; R Frost
Journal:  J Exp Psychol Learn Mem Cogn       Date:  1997-09

6.  A standardized set of 260 pictures: norms for name agreement, image agreement, familiarity, and visual complexity.

Authors:  J G Snodgrass; M Vanderwart
Journal:  J Exp Psychol Hum Learn       Date:  1980-03

7.  The English Lexicon Project.

Authors:  David A Balota; Melvin J Yap; Michael J Cortese; Keith A Hutchison; Brett Kessler; Bjorn Loftis; James H Neely; Douglas L Nelson; Greg B Simpson; Rebecca Treiman
Journal:  Behav Res Methods       Date:  2007-08

8.  Orthographic processing of polysyllabic words by native and nonnative English speakers.

Authors:  Marcus Taft
Journal:  Brain Lang       Date:  2002 Apr-Jun

9.  Japanese mental syllabary and effects of mora, syllable, bi-mora and word frequencies on Japanese speech production.

Authors:  Katsuo Tamaoka; Shogo Makioka
Journal:  Lang Speech       Date:  2009

10.  Cross-linguistic similarity and task demands in Japanese-English bilingual processing.

Authors:  David B Allen; Kathy Conklin
Journal:  PLoS One       Date:  2013-08-28
