Neurolinguistics Research Advancing Development of a Direct-Speech Brain-Computer Interface.

Ciaran Cooney¹, Raffaella Folli², Damien Coyle³.

Abstract

A direct-speech brain-computer interface (DS-BCI) acquires neural signals corresponding to imagined speech, then processes and decodes these signals to produce a linguistic output in the form of phonemes, words, or sentences. Recent research has shown the potential of neurolinguistics to enhance decoding approaches to imagined speech with the inclusion of semantics and phonology in experimental procedures. As neurolinguistics research findings are beginning to be incorporated within the scope of DS-BCI research, it is our view that a thorough understanding of imagined speech, and its relationship with overt speech, must be considered an integral feature of research in this field. With a focus on imagined speech, we provide a review of the most important neurolinguistics research informing the field of DS-BCI and suggest how this research may be utilized to improve current experimental protocols and decoding techniques. Our review of the literature supports a cross-disciplinary approach to DS-BCI research, in which neurolinguistics concepts and methods are utilized to aid development of a naturalistic mode of communication.

Keywords: Cognitive Neuroscience; Computer Science; Hardware Interface

Year: 2018 PMID: 30296666 PMCID: PMC6174918 DOI: 10.1016/j.isci.2018.09.016

Source DB: PubMed Journal: iScience ISSN: 2589-0042

Seeking a Naturalistic Form of Communication through Direct-Speech Brain-Computer Interface

A direct-speech brain-computer interface (DS-BCI) is one that captures and decodes neural signals corresponding directly to speech production, enabling a naturalistic mode of communication (Iljina et al., 2017). Such a system has the potential to transform the lives of patients with severe motor dysfunction, including pathologies such as amyotrophic lateral sclerosis resulting in locked-in syndrome. Loss of verbal communication has a profound effect on those affected, with loss of social interaction and the potential for isolation. In parallel with this personal degeneration, a caregiver faces a more difficult challenge in ascertaining the needs of the patient. These factors have played a crucial role in driving the development of DS-BCIs (Brumberg et al., 2011, Oken et al., 2014). It is our view that development of a functional DS-BCI must be predicated on imagined speech (see section “Imagined Speech: A Special Case of Speech” for a detailed description) as the communicative modality. However, several other types of speech have been utilized in experiments referenced throughout this text, making it important to define their meanings. Table 1 categorizes the different types of speech typically used in DS-BCI experimentation. Three types of speech are presented, namely, overt (Blakely et al., 2008), intended (Guenther et al., 2009), and imagined (D'Zmura et al., 2009), and these are subcategorized according to whether the speech is being produced or perceived by a subject. Overt speech production results in an audible output that can be heard by the person speaking and by others within range of the sounds produced. Intended speech is the term used when a person tries to speak but does not have the capacity to produce an audible output. Imagined speech is the internal pronunciation of words without any audible output or associated movement. These are types of speech production and possible methods of communication with DS-BCI.
However, several studies have used decoding approaches applied to the neural correlates of speech perception as evidence for the potential of decoding speech processes for communication (Di Liberto et al., 2015, Wang et al., 2018). We consider it to be extremely important to distinguish speech perception studies from speech production studies and to be aware that the “speech” in these studies refers to different phenomena. In perception studies, the speech being considered is the stimulus provided by the experimenter. The corresponding response of the subject, typically in the auditory cortex, is the neural activity being decoded. This differs greatly from the study of speech production in which the subject is actively producing phones, words, or sentences, whether prompted or unprompted, with neural correlates typically corresponding to brain regions associated with speech production. Although speech perception studies are important for DS-BCI research, this review is primarily concerned with speech production and, in particular, imagined speech production.

Table 1

Categorization of Types of Speech Typically Used in DS-BCI Experiments

	Production	Perception
Overt	Fully articulated speech with audible output	Active or passive hearing of audible speech (one's own speech or from another source)
Intended	Intention to produce overt speech but without the capacity to produce audible output	Perception of one's own intended speech production
Imagined	Internal pronunciation of words, independent of movement and without any audible output	Perception of one's own imagined speech production

A DS-BCI consists of several important stages (see Figure 1). The stages depicted in Figures 1B–1G have each been extensively covered in the literature (Blakely et al., 2008, Guenther et al., 2009; reviewed in Bocquelet et al., 2017). However, there is relatively little consideration of the difficulty in modeling the first of these stages (Figure 1A), namely, imagined speech production, during which a participant articulates words internally without any motor movement. Neurolinguistics research is providing insight into the cognitive function, phenomenology, and neurobiology of speech production in general (Hickok, 2014) and imagined speech in particular (Alderson-Day and Fernyhough, 2015, Perrone-Bertolotti et al., 2014), and it is our view that these insights should be utilized within DS-BCI research. We concur with the arguments expressed by Iljina et al. (2017) that, given the complexity of speech production processes, combining research from the fields of BCI and neurolinguistics must be seen as an important approach for those seeking to capture and decode the phenomena.

Figure 1

Seeking a Naturalistic Form of Communication through Direct-Speech BCI

(A) DS-BCI is a system that decodes neural signals (e.g., electroencephalography [EEG] or electrocorticography [ECoG]) (B) corresponding to imagined speech (A). Recorded signals are processed to facilitate maximal information extraction and improvement of signal-to-noise ratio (C). The feature extraction (D) and classification (E) stages compute the most discriminative information in the recorded signals and classify them as a part of speech. The output of a DS-BCI system is a textual representation of the imagined speech (F) and auditory representation, which can be used for both communication and feedback (G). In this example, the user actively produces the words “I am thirsty!” with imagined speech. The signals acquired are temporally aligned with each word to facilitate feature extraction and classification. The system produces two outputs: a text printout of the imagined speech words being produced and a synthesized audio output, i.e., “I am thirsty!”
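
The processing, feature extraction, and classification stages described in this caption (C to E) can be illustrated with a toy example. The sketch below is not the method of any study reviewed here: it runs on synthetic single-channel trials, and the sampling rate, the two class labels (`word_a`, `word_b`), their frequency signatures, and the nearest-centroid classifier are all assumptions chosen only to keep the example self-contained.

```python
import math
import random

FS = 128.0        # assumed sampling rate in Hz (illustrative only)
N_SAMPLES = 256   # samples per synthetic trial (2 s at FS)
# Hypothetical class signatures; real imagined-speech signals are not this simple.
CLASS_FREQS = {"word_a": 8.0, "word_b": 20.0}


def make_trial(rng, label, noise_sd=1.0):
    """Synthetic trial: a class-specific sinusoid plus Gaussian noise."""
    f = CLASS_FREQS[label]
    return [math.sin(2 * math.pi * f * t / FS) + rng.gauss(0.0, noise_sd)
            for t in range(N_SAMPLES)]


def preprocess(trial):
    """Stand-in for stage C: subtract the mean to remove any DC offset."""
    mean = sum(trial) / len(trial)
    return [x - mean for x in trial]


def power_at(trial, freq_hz):
    """Normalized power at one frequency via projection onto sine and cosine."""
    re = sum(x * math.cos(2 * math.pi * freq_hz * t / FS) for t, x in enumerate(trial))
    im = sum(x * math.sin(2 * math.pi * freq_hz * t / FS) for t, x in enumerate(trial))
    return (re * re + im * im) / (len(trial) ** 2)


def extract_features(trial):
    """Stand-in for stage D: log power at each candidate frequency (ascending)."""
    return [math.log(power_at(trial, f) + 1e-12) for f in sorted(CLASS_FREQS.values())]


def fit_centroids(feature_rows, labels):
    """Stand-in for stage E (training): mean feature vector per class."""
    centroids = {}
    for label in set(labels):
        rows = [r for r, lab in zip(feature_rows, labels) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, features):
    """Stand-in for stage E (decoding): choose the class with the nearest centroid."""
    return min(centroids, key=lambda lab: math.dist(centroids[lab], features))


def run_demo(seed=0, n_per_class=20):
    """Train on one synthetic set, test on another, and return pairwise accuracy."""
    rng = random.Random(seed)

    def dataset():
        labels = [lab for lab in CLASS_FREQS for _ in range(n_per_class)]
        feats = [extract_features(preprocess(make_trial(rng, lab))) for lab in labels]
        return feats, labels

    train_x, train_y = dataset()
    test_x, test_y = dataset()
    centroids = fit_centroids(train_x, train_y)
    hits = sum(predict(centroids, x) == y for x, y in zip(test_x, test_y))
    return hits / len(test_y)


accuracy = run_demo()
print(f"pairwise accuracy on synthetic trials: {accuracy:.2f}")
```

Because the two synthetic classes are built to be easily separable, the accuracy printed here says nothing about real EEG or ECoG data; the example only shows how the stages connect.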

Imagined speech is the internal pronunciation of words without any motor movement or acoustic output (Torres-García et al., 2016) (see section “Imagined Speech: A Special Case of Speech”). Related, and overlapping, terminology for imagined speech includes self-talk, sub-vocal/covert speech, internal dialogue/monologue, sub-vocalization, utterance, self-verbalization, and self-statement (Morin and Michaud, 2007). However, for the purposes of performing controlled experiments in the field of DS-BCI, it is necessary to maintain a consistent terminology and description of the phenomena (see section “Imagined Speech: A Special Case of Speech”). Although imagined and overt speech are not identical, there is overlap between imagined and overt speech production, and imagined speech has become an alternative neuro-paradigm for communicative BCI (D'Zmura et al., 2009, DaSalla et al., 2009, Deng et al., 2010).
Such a system differs from other types of communicative BCIs (Chaudhary et al., 2017, Pandarinath et al., 2017) in that it relies on tapping directly into a person's speech production processes, rather than using some unrelated neural activity as the method of communication. Several DS-BCI studies have used neurolinguistics approaches within their experimental procedures (González-Castañeda et al., 2017, Kim et al., 2013, Wang et al., 2011, Zhao and Rudzicz, 2015). In general, the approaches used have been to design a constrained dictionary of words categorized according to their relative semantic or phonological relationships. The basic principle underpinning this approach is that the categorical features of a word may aid decoding accuracy in imagined speech. There is some evidence that this is a valid approach to take, particularly in relation to semantic categorization, which has received greater attention in the literature. Studies examining the feasibility of decoding semantic information from neural signals have shown that semantic category can be predicted from brain activity (Kim et al., 2013, Wang et al., 2011). However, further research is required to determine the true potential of neurolinguistics research in relation to the neurobiology of imagined speech and the structured processes underlying speech production, to inform DS-BCI research. Here, we review trends in DS-BCI research, and the current understanding of speech production processes, with an emphasis on imagined speech. We consider the potential implications of attempting to harness neurolinguistics concepts and the limitations of working directly with imagined speech. An argument is presented that effective research in the field of DS-BCI should incorporate neurolinguistics research and a thorough understanding of imagined speech where possible to aid the development of a naturalistic mode of communication.

Trends in DS-BCI

The development of a “silent” interface has long been an active area of research to enable users to communicate without audible articulation of their speech. Several modalities have been developed to facilitate such communication through movement-independent BCI, including BCI-spellers (e.g., D'albis et al., 2012), BCIs based on steady-state visually evoked potential (e.g., Bin et al., 2009), and BCIs based on motor imagery (e.g., Tabar and Halici, 2017a) (see AlSaleh et al., 2016, Tabar and Halici, 2017b for reviews). There are numerous forms that these silent interfaces have taken to provide a more naturalistic, language-based mode of communication, including ultrasound imaging of lip profiles (Denby et al., 2006) and word recognition using magnetic implants and sensors (Gilbert et al., 2010). However, approaches such as these require active motor skills that can be readily utilized as the communicative modality and are therefore not movement-independent BCIs. The utility of BCI as a mode for language-based communication has been noted by researchers for many years (Denby et al., 2006, Donchin et al., 2000), with the concept for a DS-BCI being a movement-independent BCI based on neural activity corresponding directly to imagined speech production processes. However, the possibility of developing a BCI predicated purely on imagined speech has only recently begun to gather momentum (Ikeda et al., 2014, Yoshimura et al., 2016, Nguyen et al., 2017) as researchers have revealed promising results in attempts to classify units of imagined speech (González-Castañeda et al., 2017, Martin et al., 2014, Pei et al., 2011a, Yoshimura et al., 2016, Zhao and Rudzicz, 2015). There have been several incarnations of DS-BCIs, including a wireless BCI for real-time speech synthesis (Guenther et al., 2009) and a concept for continuous speech recognition (Herff et al., 2017). 
The current stream of DS-BCI research indicates a trend toward improved classification of imagined speech units for decoded brain activity (González-Castañeda et al., 2017, Martin et al., 2014) and the development of methodologies for continuous decoding of imagined speech (Brumberg et al., 2016). There have also been recent developments in the classification of the neural correlates of speech perception (Di Liberto et al., 2015, Wang et al., 2018), one of which demonstrates real-time classification of auditory sentences from neural activity (Moses et al., 2018). Although this research is vital for the implementation of a closed-loop DS-BCI, it is important that results from speech perception studies are assessed independently of speech production studies, as the neural activity corresponding to each cannot be assumed to have similar properties. There have been notable successes in attempts to improve the decoding of language content directly from neural activity. The neural correlates of vowels and consonants (Idrees and Farooq, 2016, Pei et al., 2011b, Yoshimura et al., 2016), phonemes (Brumberg et al., 2011, Leuthardt et al., 2011), syllables (Deng et al., 2010), whole words (González-Castañeda et al., 2017, Martin et al., 2016), and even sentences (Herff et al., 2015) have all been evaluated using advanced decoding algorithms. Decoding of discrete units of speech, single vowels, for example, has been a popular experimental paradigm in DS-BCI to date (Ikeda et al., 2014, Rezazadeh Sereshkeh et al., 2017a). Sereshkeh et al. (Rezazadeh Sereshkeh et al., 2017a) presented evidence suggesting that it is possible to classify units of imagined speech from electroencephalogram (EEG), reporting 63.2% ± 6.4% accuracy for pairwise classification tasks. Other studies have shown that decoding accuracies of vowels and consonants were similar for both overt and imagined speech (Pei et al., 2011a).
Elsewhere, linguistic content has been harnessed to aid discrimination of both overt and imagined speech, with phonology (Zhao and Rudzicz, 2015), semantics (Kim et al., 2013), and syntax (Herff et al., 2015) each showing some potential to aid classification in DS-BCI. Figure 2, and the corresponding data in Table 2, categorizes DS-BCI studies according to recording technique and the type of speech being investigated. The time period for this analysis begins with the study of Blakely et al. (2008), because this is the first study based on the BCI paradigm depicted, and runs through to 2018. Criteria for inclusion in this analysis are those studies using typical recording techniques (EEG, electrocorticogram [ECoG], micro-arrays, functional magnetic resonance imaging [fMRI], and functional near-infrared spectroscopy [fNIRS]) to decode speech production (overt, imagined, intended), but not speech perception, directly from neural activity. Studies utilizing speech imagery or imagined hearing have been excluded, as we do not consider these modalities to be representative of the speech production required of a DS-BCI. The cross-sectional data (Figure 2A) indicate that studies have favored two recording techniques and two types of speech. EEG and ECoG are clearly the dominant recording techniques, having been used in 16 and 20 studies, respectively (Figure 2B), the likely reason being the high temporal resolution (milliseconds) they both possess, particularly in comparison with imaging techniques such as fMRI (with temporal resolution on the order of seconds). This high temporal resolution is required to capture the dynamic processes associated with speech production (Herff et al., 2016).
As a non-invasive recording technique, EEG makes recruitment of experimental participants easier, but the greater spatial resolution of ECoG renders it a better candidate for decoding imagined speech signals when participants are made available as a result of treatment of pre-existing medical conditions (e.g., epilepsy) (Martin et al., 2016). Although microelectrode arrays have shown good performance in fields such as neuromotor prostheses (e.g., Hochberg et al., 2012), relatively few studies have utilized them for recording the spiking activity of single or multiple units (SU or MU), i.e., neurons, during imagined speech. However, the SU or MU offer the required signal specificity to improve imagined speech decoding processes given the success in movement and movement intention decoding (Bouton et al., 2016).

Figure 2

Direct-Speech BCI Studies Categorized According to Recording Techniques and Types of Speech

(A) is a cross-categorization of DS-BCI studies according to the recording techniques applied and the types of speech being investigated. The time period for this analysis begins with the study of Blakely et al. (2008), because this is the first study based on the BCI paradigm depicted and runs to 2018. Criteria for inclusion in this analysis are those studies using said recording techniques to decode speech production (overt, imagined, and intended) directly from neural activity. EEG and ECoG are the most often used recording approaches. High temporal resolution is an important feature of both. Although micro-electrodes do offer high spatial and temporal resolution, their use is not always possible or appropriate. Overt speech has been used as a proxy for imagined speech, or in comparative studies. The behavioral difficulty of studying imagined speech is, at least in part, a reason for this trend. The two bar graphs (B) show the distribution of measurement techniques and of types of speech used across all studies. ECoG is utilized in a total of twenty studies and EEG in a total of sixteen.

See Table 2.

Table 2

Overview of DS-BCI Studies Attempting to Decode Speech from Neural Activity

Reference	Recording Technique	Type of Speech	Experimental Paradigm
Blakely et al., 2008	Micro-electrode	Overt	Phoneme pronunciation
D'Zmura et al., 2009	EEG	Imagined	Imagined speech of two syllables spoken in one of three rhythms
Guenther et al., 2009	Micro-electrode	Intended	Vowel production involving movement from a central vowel location to one of three peripheral vowel locations
Porbadnigk et al., 2009	EEG	Imagined	Five words, presented in block, sequential, or random order
Brigham and Kumar, 2010	EEG	Imagined	Imagined speech of two syllables, /ba/ and /ku/ at two rhythms
Deng et al., 2010	EEG	Imagined	Imagined speech of two syllables spoken in one of three rhythms
Kellis et al., 2010	Micro-electrode	Overt	Repetition of one of ten words
Brumberg et al., 2011	Micro-electrode	Intended	Intended production of 38 American English phonemes
Chi et al., 2011	EEG	Imagined	Generation of five types of phonemes that differ in their manner of vocal articulation
Leuthardt et al., 2011	ECoG	Overt/Imagined	Overt and imagined phoneme articulation
Pei et al., 2011a	ECoG	Overt/Imagined	Overt and imagined repetition of 36 monosyllabic words
Wang et al., 2011	ECoG	Overt	Three language tasks based on picture naming
Pei et al., 2011b	ECoG	Overt/Imagined	Word repetition using overt or covert speech in response to visual or auditory stimuli
Derix et al., 2012	ECoG	Overt	Spontaneous speech in non-experimental setup
Herff et al., 2012	fNIRS	Overt/Imagined	Utterances produced in auditory, silent, and imagined speech
Zhang et al., 2012	ECoG	Overt	Articulation of Chinese sentences
Kim et al., 2013	EEG	Overt/Imagined	Speech of monosyllabic Korean words representing two categories of meaning (number and face)
Bouchard and Chang, 2014	ECoG	Overt	Reading of consonant-vowel syllables
Derix et al., 2014	ECoG	Overt	Spontaneous speech in non-experimental setup
Ikeda et al., 2014	ECoG	Imagined	Imagined speech production of three Japanese vowels
Kanas et al., 2014	ECoG	Overt	Two-syllable repetition tasks
Martin et al., 2014	ECoG	Overt/Imagined	Overt and covert reading of short stories
Mugler et al., 2014a	ECoG	Overt	Overt speech used to identify different phonemes by where they place in different words
Mugler et al., 2014b	ECoG	Overt	Overt speech used to identify different phonemes by where they place in different words
Song and Sepulveda, 2014	EEG	Overt/Imagined	High tone production in overt, inhibited, and imagined speech
Herff et al., 2015	ECoG	Overt	Reading from well-known texts
Iqbal et al., 2015a	EEG	Imagined	Imagined speech of vowels /a/ and /u/, and no action
Iqbal et al., 2015b	EEG	Imagined	Imagined speech of vowels /a/ and /u/, and no action
Lotte et al., 2015	ECoG	Overt	Reading from well-known texts
Zhao and Rudzicz, 2015	EEG	Overt/Imagined	Imagined speech production of seven phonemes and two pairs of phonologically similar words
Herff et al., 2016	ECoG	Overt	Recitation of a presented sentence
Martin et al., 2016	ECoG	Overt/Imagined	Overt and imagined speech production of words selected to maximize variability of number of syllables and semantic category
Yoshimura et al., 2016	EEG/fMRI	Imagined	Imagined speech production of Japanese vowels /a/ and /i/
González-Castañeda et al., 2017	EEG	Imagined	Imagined speech production of five Spanish words
Nguyen et al., 2017	EEG	Imagined	Imagined speech of short words, long words, and vowels
Ramsey et al., 2017	ECoG	Overt	Overt speech production of four phonemes
Rezazadeh Sereshkeh et al., 2017a	EEG	Imagined	Imagined speech repetition of the words "yes" or "no"
Rezazadeh Sereshkeh et al., 2017b	EEG	Imagined	Imagined speech repetition of the words "yes" or "no"
Fargier et al., 2018	EEG	Overt	Overt word production corresponding to presented pictures
Hashim et al., 2018	EEG	Imagined	Imagined speech word production
Ibayashi et al., 2018	ECoG	Overt	Overt speech of 15 Japanese syllables
Livezey et al., 2018	ECoG	Overt	Overt speech of 57 different consonant-vowel syllables

It is clear from the data presented in Figure 2 that overt speech production is heavily utilized in experimental trials. Overt speech is included in a total of 26 studies (17 solely overt and 9 alongside imagined speech) (Figure 2B). There are several reasons for this trend, including the lack of behavioral verification associated with imagined speech, whereby it is difficult to confirm whether experimental tasks have been performed correctly, and the lower amplitude of EEG/ECoG signals it produces (Palmer et al., 2001, Shuster and Lemieux, 2005).
Despite lower amplitude signals, there is evidence to suggest that EEG can provide considerable information on imagined speech that can be utilized for a DS-BCI (D'Zmura et al., 2009). Attempts to decode continuous overt speech have been made (Herff et al., 2015), and it is anticipated that further developments may enable adaptation of this approach for imagined speech. As stated, the use of overt speech is prevalent in DS-BCI research. However, if a truly naturalistic form of communication is to be achieved using imagined speech, then a thorough understanding of the phenomenon is required.
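
The study counts quoted in this section can be rechecked directly from Table 2. The sketch below transcribes the Recording Technique and Type of Speech columns of that table and tallies them; the variable names are illustrative. One assumption is worth flagging: the EEG figure of 16 is reproduced only if the combined EEG/fMRI study (Yoshimura et al., 2016) is counted as its own category rather than as an EEG study.

```python
from collections import Counter

# (reference, recording technique, type of speech), transcribed from Table 2.
TABLE_2 = [
    ("Blakely et al., 2008", "Micro-electrode", "Overt"),
    ("D'Zmura et al., 2009", "EEG", "Imagined"),
    ("Guenther et al., 2009", "Micro-electrode", "Intended"),
    ("Porbadnigk et al., 2009", "EEG", "Imagined"),
    ("Brigham and Kumar, 2010", "EEG", "Imagined"),
    ("Deng et al., 2010", "EEG", "Imagined"),
    ("Kellis et al., 2010", "Micro-electrode", "Overt"),
    ("Brumberg et al., 2011", "Micro-electrode", "Intended"),
    ("Chi et al., 2011", "EEG", "Imagined"),
    ("Leuthardt et al., 2011", "ECoG", "Overt/Imagined"),
    ("Pei et al., 2011a", "ECoG", "Overt/Imagined"),
    ("Wang et al., 2011", "ECoG", "Overt"),
    ("Pei et al., 2011b", "ECoG", "Overt/Imagined"),
    ("Derix et al., 2012", "ECoG", "Overt"),
    ("Herff et al., 2012", "fNIRS", "Overt/Imagined"),
    ("Zhang et al., 2012", "ECoG", "Overt"),
    ("Kim et al., 2013", "EEG", "Overt/Imagined"),
    ("Bouchard and Chang, 2014", "ECoG", "Overt"),
    ("Derix et al., 2014", "ECoG", "Overt"),
    ("Ikeda et al., 2014", "ECoG", "Imagined"),
    ("Kanas et al., 2014", "ECoG", "Overt"),
    ("Martin et al., 2014", "ECoG", "Overt/Imagined"),
    ("Mugler et al., 2014a", "ECoG", "Overt"),
    ("Mugler et al., 2014b", "ECoG", "Overt"),
    ("Song and Sepulveda, 2014", "EEG", "Overt/Imagined"),
    ("Herff et al., 2015", "ECoG", "Overt"),
    ("Iqbal et al., 2015a", "EEG", "Imagined"),
    ("Iqbal et al., 2015b", "EEG", "Imagined"),
    ("Lotte et al., 2015", "ECoG", "Overt"),
    ("Zhao and Rudzicz, 2015", "EEG", "Overt/Imagined"),
    ("Herff et al., 2016", "ECoG", "Overt"),
    ("Martin et al., 2016", "ECoG", "Overt/Imagined"),
    ("Yoshimura et al., 2016", "EEG/fMRI", "Imagined"),
    ("González-Castañeda et al., 2017", "EEG", "Imagined"),
    ("Nguyen et al., 2017", "EEG", "Imagined"),
    ("Ramsey et al., 2017", "ECoG", "Overt"),
    ("Rezazadeh Sereshkeh et al., 2017a", "EEG", "Imagined"),
    ("Rezazadeh Sereshkeh et al., 2017b", "EEG", "Imagined"),
    ("Fargier et al., 2018", "EEG", "Overt"),
    ("Hashim et al., 2018", "EEG", "Imagined"),
    ("Ibayashi et al., 2018", "ECoG", "Overt"),
    ("Livezey et al., 2018", "ECoG", "Overt"),
]

# Tally each column; combined labels such as "EEG/fMRI" stay separate categories.
technique_counts = Counter(tech for _, tech, _ in TABLE_2)
speech_counts = Counter(speech for _, _, speech in TABLE_2)

# Studies that include overt speech, alone or alongside imagined speech.
overt_total = sum(n for speech, n in speech_counts.items() if "Overt" in speech)

print("By recording technique:", dict(technique_counts))
print("By type of speech:", dict(speech_counts))
print("Studies including overt speech:", overt_total)
```

Under that counting convention, the tallies agree with the figures given in the text: 20 ECoG studies, 16 EEG studies, and 26 studies involving overt speech (17 solely overt and 9 alongside imagined speech).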

Imagined Speech: A Special Case of Speech

The Phenomena of Imagined Speech

As mentioned earlier, many definitions for imagined speech are present in the literature (Alderson-Day and Fernyhough, 2015, Hirshorn and Thompson-Schill, 2006), one of which refers to it as the internal pronunciation of words without emitting sounds or making facial movements (Torres-García et al., 2016). Research has demonstrated that imagined speech involves many cognitive functions, including learning (Alderson-Day and Fernyhough, 2015), task production (Dolcos and Albarracin, 2014), and memory (Perrone-Bertolotti et al., 2014). Despite its central position in everyday life, imagined speech has been the subject of relatively little research. Behavioral evidence has indicated that imagined speech is provided by the motor system's prediction of sensory actions (corollary discharge) (Scott et al., 2013) and it has been suggested that imagined speech is produced in much the same way as overt speech, without the motor-based articulation that generates auditory output (Oppenheim and Dell, 2010). Martínez-Manrique and Vicente (2015) support an “activity” view of imagined speech, in which the phenomenon does not have a “proper function” in cognition but has simply inherited its suite of functions from overt speech. Other studies have characterized imagined speech as the basis for rehearsal in short-term memory (Baddeley et al., 1975) and as having a phonological influence in reading and writing (Oppenheim and Dell, 2008). Further studies concur with these findings, suggesting that inner rehearsal is a central tenet of imagined speech within the phonological loop, i.e., the temporary storage of information in short-term memory (Perrone-Bertolotti et al., 2014), and that imagined speech may interact with working memory to enhance the encoding of new material (Marvel and Desmond, 2012).
It has been suggested that imagined speech serves a regulatory role in social speech communication, meaning that it is utilized in overt speech communications (speaking and listening), as well as being implicated as part of a covert articulatory planning process within the speech-motor processing paradigm (see Price [2012] for review). It has been proposed that imagined speech may be used to generally represent, maintain, and organize task-relevant information and conscious thoughts (Dolcos and Albarracin, 2014). Although imagined speech is not normally associated with executive control processes, the role of imagined speech in task switching, for example, switching attention across multiple arithmetic problems, has been studied (Emerson and Miyake, 2003). The difficulties associated with studying imagined speech in experimental research have led to the use of overt speech as a proxy for the phenomenon in DS-BCI research (e.g., Martin et al., 2014, Pei et al., 2011b). Therefore, it is useful to have a clear picture of the relationship between the two types of speech.

The Relationship between Overt and Imagined Speech Production

The relationship between overt speech and imagined speech has been extensively debated (Brocklehurst and Corley, 2011, Corley et al., 2011, Oppenheim and Dell, 2010, Oppenheim and Dell, 2008), although at present there is no definitive position on the precise nature of this relationship. Here, we present the evidence for a close relationship between overt and imagined speech, before considering the ways in which the two differ. Finally, we discuss the implications of this relationship for DS-BCI research. It has been posited that imagined speech is a truncated form of overt speech, in that the stages of production are the same for both, before the articulatory effects associated with overt speech (Oppenheim and Dell, 2010). Subjective accounts of imagined speech indicate that it resembles overt speech in tempo, pitch, and rhythm (MacKay, 1992) and studies have found that imagined speech retains deep-lying features such as lexical and semantic information (Oppenheim and Dell, 2008). The motor simulation hypothesis places overt and imagined speech on a continuum, on which linguistic mechanisms and physiological correlates are shared (Perrone-Bertolotti et al., 2014), albeit with features attenuated in imagined speech (Alderson-Day and Fernyhough, 2015). Importantly, the motor simulation hypothesis assumes that imagined speech necessarily includes fully specified articulatory detail (e.g., Levelt, 1989), merely lacking observable sound and movement. Phonemic similarity (in which mistaken phonemes are replaced with similar phonemes) has been observed with similar magnitudes for both overt and imagined speech production (Brocklehurst and Corley, 2011), and further findings suggest that imagined speech is specified at the sub-phonemic level and that its process of production must be similar to that of overt speech (Corley et al., 2011). 
The implication here is that imagined speech does contain much of the featural richness associated with overt speech, a view fully compatible with evidence that phonological representations are fully encoded in imagined speech. Imagined speech has been considered part of an overall speech production system, in which it is used for predictive simulation or “forward models” of linguistic representations, suggesting that it is produced in much the same way as overt speech, minus overt articulation (Levelt et al., 1999). There is considerable overlap between the neurobiology of overt and imagined speech (Marvel and Desmond, 2012), with neural activations in typical left-hemispheric language regions, in general, being associated with both (Basho et al., 2007, Huang et al., 2002, McGuire et al., 1996a, Palmer et al., 2001) (see section “The Neuroanatomy of Imagined Speech”). Activation of Broca's area during imagined speech indicates that this typical language region is associated with its production and is consistent with results from functional imaging studies examining silent articulation (Paulesu et al., 1993). fMRI results have shown activation of the supplementary motor area (SMA), inferior frontal gyrus (IFG), and insula during phonological processing of imagined and overt speech (Aleman et al., 2005). Furthering current understanding of the neuroanatomy and neural correlates of imagined speech production is an important aspect of research in this field. Although they suggest that there is significant overlap between overt and imagined speech, Oppenheim and Dell (2008) also advise that imagined speech is impoverished at the featural level and thus abstract and underspecified. It has been suggested that imagined speech is often attenuated at the surface level, lacking phonological (Oppenheim and Dell, 2008) or phonetic (Wheeldon and Levelt, 1995) detail.
Countering the view that imagined speech is intrinsically similar to overt speech, the abstraction hypothesis contends that imagined speech is produced as a consequence of activation of abstract linguistic representations (e.g., Indefrey and Levelt, 2004). The theory states that imagined speech is activated before the speaker retrieves any articulatory information and therefore should not require any motor activations. There are several arguments in favor of the abstraction view (summarized in Oppenheim and Dell, 2010), the first of which is that imagined speech is produced faster than overt speech, suggesting that imagined speech is abbreviated in some respect (e.g., MacKay, 1992) and thus lacks the articulatory properties associated with overt speech. Another argument is that attenuated activity in language-related brain regions during imagined speech indicates that the processes of production are not as complete as in overt speech. The third argument presented is that imagined speech does not require articulatory abilities and so articulation is not required for complete use of imagined speech. The authors also observe that articulatory suppression does not necessarily eliminate imagined speech. Moreover, imagined speech does not (necessarily) translate to overt speech performance. Theoretically, were overt and imagined speech to involve similar planning processes, then it would be reasonable to expect practice of an utterance in one form of speech to improve performance in the other. However, evidence has indicated that this is not the case (Corley et al., 2011). Alternatively, the flexible abstraction hypothesis states that there is a single form of imagined speech, which is represented at the phonemic-selection level (Oppenheim and Dell, 2010). 
The hypothesis states that representations can be modulated by articulation to include more explicit features, and the authors suggest that cases in which imagined speech appears to have phonological features may be caused by participants deploying a form of imagined speech involving a greater degree of articulation. The flexible abstraction hypothesis suggests that imagined speech may fail to involve articulatory representations but it can incorporate lower-level articulatory planning when speakers silently articulate. The surface-impoverished hypothesis states that imagined speech is impoverished at the surface level, having weaker lower-level representation (e.g., featural level), and the deep-impoverished hypothesis states that imagined speech represents sounds and gestures but not higher level information (Oppenheim and Dell, 2008). Imagined speech may be formed as a featurally abstract forward model (Pickering and Garrod, 2013), and phonological features may be experienced as a result of the sensory prediction created (Scott et al., 2013). Imagined speech may also vary depending on cognitive and emotional conditions, causing changes between abstract and concrete forms (Fernyhough, 2004). As stated earlier, neuroanatomical overlap between regions associated with overt and imagined speech has been observed. Nevertheless, there are significant differences in brain activity between the two processes (e.g., Basho et al., 2007). For example, fMRI has discovered that imagined speech elicits greater activation in several areas of the brain (e.g., Basho et al., 2007) and a lesion symptom mapping (LSM) study of patients with aphasia showed that participants with poor overt speech retained relatively strong imagined speech in comparison (Stark et al., 2017), suggesting a dissociation of the cognitive mechanisms generating overt and imagined speech. 
Previous work with aphasics, indicating that imagined speech abilities were more affected by lesions to the left pars opercularis than overt speech production, led Geva, Jones et al. (Geva et al., 2011b) to state that imagined speech cannot be assumed to be overt speech without a motor component. For further information on the neurobiology of imagined speech, see section “The Neuroanatomy of Imagined Speech.” Perrone-Bertolotti et al. (2014) astutely observe that the variance in results between overt and imagined speech experiments may, at least partially, be explained by the different speech tasks involved in the studies. Word repetition, object naming, verb generation, etc., all require different speech production processes and thus engage different areas of the brain. It is also conceivable that differences between the two types of speech could be put down to participants being better able to perceive certain types of error in overt speech. Perrone-Bertolotti et al. (2014) also suggest that differing results may indicate that imagined speech consists of flexible subtypes or levels and that the experimental paradigm may be partially responsible for the differences observed between the two types of speech. Clearly, there is no definitive description of the precise relationship between overt and imagined speech, and this is a subject that requires further elucidation from neurolinguistics research. We agree with Martínez-Manrique and Vicente (2015) that a comprehensive view of imagined speech will require precise models of linguistic production and comprehension and a cognitive account will require more data than is currently available. Therefore, we must also agree with Geva, Jones et al. (Geva et al., 2011b) that overt speech cannot simply be assumed to be a reliable substitute for imagined speech. 
It is our contention, in relation to DS-BCIs, that it is not possible to reliably infer performance in an imagined speech paradigm from results obtained during overt speech experiments. This is not to say that there is no value in overt speech paradigms, and given that there is much overlap in the linguistic theory and neurobiology associated with both, there is certainly a lot to be gained from such experiments. However, as the communicative paradigm for an eventual operational DS-BCI is imagined speech, we must emphasize the importance of utilizing this modality, when possible, in experimental protocols.

The Neuroanatomy of Imagined Speech

Alderson-Day and Fernyhough (2015) suggest that a prima facie assumption about the neural correlates of imagined speech might be that they closely resemble an attenuated version of the neural activity associated with overt speech. There is evidence that activation in Broca area, SMA, and parts of the prefrontal cortex occurs during both overt and imagined speech (see Price, 2012 for review). Studies have shown that overt and imagined speech do produce similar neural activations, with the exception of certain motor-related activity associated with overt speech (Palmer et al., 2001), and that the blood-oxygen-level-dependent response measured from fMRI recordings was greater during overt than during imagined speech (Shuster and Lemieux, 2005). However, the neuroanatomy of imagined speech has been shown to differ from that of overt speech (e.g., Basho et al., 2007). In the context of DS-BCI development, it is important to identify the regions specifically correlated with imagined speech, that is, regions whose activity is independent of movement (and therefore not a product of overt speech production) and independent of stimuli (and therefore not a product of speech perception). Reports on the anatomical underpinnings of imagined speech have consistently implicated the left inferior frontal gyrus (LIFG) as the anatomical basis for the phenomenon (Aleman et al., 2005, McGuire et al., 1996a, McGuire et al., 1996b, Shergill et al., 2002) (see Figure 3 [Berwick et al., 2013]). Positron emission tomography (PET) has attributed LIFG activation to imagined speech during sentence and single-word production (McGuire et al., 1996b), and fMRI was used to observe LIFG activation during imagined sentence production (Shergill and Bullmore, 2001, Shergill et al., 2002). In the second of these fMRI studies (Shergill et al., 2002), the LIFG, along with other regions, was associated with increased activation corresponding to increased rates of imagined speech production. 
The region has also been associated with increased activation during dialogic, in comparison with monologic, imagined speech (Alderson-Day et al., 2015). Morin and Michaud (2007) note that the LIFG exhibits functional heterogeneity, observing that its most anterior parts (Brodmann area [BA]45) are involved in word retrieval and their associated meanings, whereas the posterior part (BA46/47) specializes in accessing words through an articulatory code (Paulesu et al., 1997). It has been observed that task-elicited imagined speech results in increased activation in the LIFG, in comparison with spontaneous imagined speech (Hurlburt et al., 2016). The authors suggest that activation of LIFG during task-elicited imagined speech may be a reflection of elicitation tasks rather than the speech itself, as the LIFG is thought to be integral to planning and execution of hierarchical sequences.

Figure 3

Neuroanatomical Regions Associated with Imagined Speech Production

The diagram depicts brain regions typically associated with language function in the left hemisphere (Berwick et al., 2013), with each of the numbered sections indicating one of the Brodmann areas (BA). The IFG, which includes BA44 and BA45, is the most common region associated with imagined speech production. Single word and sentence production both activate the IFG, and the region is thought to be associated with word retrieval and associated meanings (BA45). Both the STG and MTG have been implicated in imagined speech studies as relating to the phonological loop and to production of dialogic imagined speech. The dorsal pathways between BA44 and the posterior superior temporal cortex (pSTC) support core syntactic processes. The ventral pathways, including between BA45 and the temporal cortex (TC), support processing of semantic and conceptual information. Reprinted with permission from Berwick et al. 2013, copyright 2013, Elsevier.

Among regions most often observed as corresponding to imagined speech production are SMA (Shergill and Bullmore, 2001, Shergill et al., 2002), insula (Aleman et al., 2005), premotor cortex (McGuire et al., 1996a), STG, and middle temporal gyrus (MTG) (Shuster and Lemieux, 2005). The SMA, left precentral gyrus, and right inferior parietal lobe are all associated with increased activation at slower rates of imagined speech production (Shergill et al., 2002). The SMA has also been associated with sentence-repetition tasks (Shergill and Bullmore, 2001) and phonological processing during imagined speech (Aleman et al., 2005). The insula has been implicated in multiple studies reporting on imagined word production (Aleman et al., 2005, Hubbard, 2010, McGuire et al., 1996a, Shergill and Bullmore, 2001) but may not be representative of imagined speech given that it is often associated with imagined hearing (see later discussion) and overt speech. 
However, Shuster and Lemieux (2005) observed that many studies that have failed to report involvement of the insula in speech production have typically used only imagined or silently articulated speech (Wildgruber et al., 2001). Increased activation has been observed in the left MTG and STG during the production of multisyllabic words in imagined speech trials (Shuster and Lemieux, 2005), and the posterior STG has been implicated in metric stress evaluation in the phonological loop (Aleman et al., 2005) (see Figure 3). Interestingly, the left MTG and STG are often associated with increased activity during trials involving imagined hearing or dialogic imagined speech (see Alderson-Day and Fernyhough, 2015 for review). This type of task, in which a participant is asked to imagine hearing speech in another person's voice, is thought to rely on memory for phonological information (Alderson-Day and Fernyhough, 2015) and to activate the primary auditory cortex (Heschl gyrus) (Hurlburt et al., 2016). Other findings indicate that dialogic imagined speech draws from a range of regions beyond a typical left-sided perisylvian language network, including the right IFG, right MTG, and the right STG/STS (Alderson-Day et al., 2015). The precuneus, posterior cingulate, left insula, and cerebellum are also implicated. The dorsal pathways between BA44 and the posterior superior temporal cortex subserve higher-order hierarchical sequences and thus support core syntactic processes (Friederici, 2018), whereas the ventral pathways, including between BA45 and the temporal cortex, support processing of semantic and conceptual information (Berwick et al., 2013). Hurlburt, Heavey, and Kelsey (Hurlburt et al., 2013) state that both production and perception of imagined speech exhibit activations in regions such as the IFG, SMA, insula, and posterior STG (Hubbard, 2010, Price, 2012). 
Although there certainly appears to be overlap between imagined speech and imagined hearing, they are, in general, anatomically separable. Imagined speech is typically associated with left-hemispheric regions, including the LIFG, insula, and STG (McGuire et al., 1996a), whereas imagined hearing corresponds to a bilateral network with the activation of SMA, posterior parietal cortex, STG, and MTG (Zatorre and Halpern, 2005). It has been suggested that differences between the two conditions may be the result of additional motor elements of imagined speech, which involve the deployment of a somatosensory forward model (Tian and Poeppel, 2013). Concerns have been raised surrounding the ecological validity of findings on the neural components of imagined speech (Alderson-Day and Fernyhough, 2015). Paradigms are often simple word or sentence-repetition tasks, ignoring the complexity of imagined speech (Jones and Fernyhough, 2007). Although experiments such as these are a common approach in language studies, it is our view that further studies examining spontaneously produced speech (Derix et al., 2014, Derix et al., 2012, Ruescher et al., 2013) and imagined speech (Hurlburt et al., 2016) are required to provide greater elucidation of the neural underpinnings of the phenomenon. It is also important to note that, as well as general activations associated with imagined speech production, processing of complex lexical, phonological, semantic (Basho et al., 2007), or word retrieval (Hirshorn and Thompson-Schill, 2006) tasks corresponds to additional activity in the inferior frontal cortex (IFC) of the left hemisphere. We concur with Bocquelet et al. (2017) that neuroanatomical findings indicate that high-level processing of imagined speech requires left-lateralization. 
Information on the neuroanatomical regions associated with imagined speech production is enhanced by consideration of the characteristics of the corresponding neural activations and, in particular, the frequency bands that may provide the most discriminable content. Activations in the beta band above Broca area and the frontal cortex have been associated with imagined speech production (Rezazadeh Sereshkeh et al., 2017b). In one study, increased activity was observed in EEG channels located close to Broca area in the frequency range of 20–30 Hz, whereas activity in Wernicke area appeared primarily below 15 Hz (Nguyen et al., 2017). This may indicate that separate frequency bands contain information relating to different speech production processes. In the same study, the authors use evidence from the classification of short versus long words to suggest that differences in the complexity of words could create discriminative features across frequency bands. In an imagined speech yes/no classification task, no discriminative difference was detected in the delta, theta, alpha, and mu rhythms. However, in the higher frequency ranges (beta and gamma), a discriminative pattern was associated with typical left-sided speech regions (Rezazadeh Sereshkeh et al., 2017b). MEG measurements obtained during a silent reading task showed event-related desynchronization in the alpha and beta bands over Broca area (Goto et al., 2011). The results of an ECoG study into imagined speech vowel articulation suggested that signals in the alpha (8–13 Hz) and beta (14–30 Hz) bands over Broca area may contain information about the articulatory code of single vowels but not about segmentation of a phoneme sequence (Ikeda et al., 2014). Clearly, the recording technique employed impacts the frequency ranges that can be analyzed. 
For example, a study that filtered imagined speech EEG data between 3 and 20 Hz found considerable energy in the alpha band (8–14 Hz) (Deng et al., 2010), whereas ECoG has allowed researchers to obtain features from the high gamma (70–150 Hz) band (Martin et al., 2016), which is useful for its association with spike rate and local field potential and its reliable tracking of rapid neural fluctuations during speech perception and production (e.g., Pei et al., 2011a). It is our view that this information on the important frequency bands associated with imagined speech can aid decoding approaches in future research. However, it is also important that further research in this area is undertaken so that a detailed and accurate picture of the spatial-temporal-spectral correlates of imagined speech is developed. In the next section, we extend our analysis on the neuroanatomical underpinning of imagined speech to include the current understanding of speech production processes and the anatomical regions of interest they correspond to.
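
As an illustration of how the frequency bands discussed above might be turned into decoding features, the following minimal Python sketch computes mean spectral power per band for a single channel. It is not taken from any of the cited studies. The names `BANDS` and `band_powers`, the 500 Hz sampling rate, and the synthetic test signal are assumptions made for this example, and a naive discrete Fourier transform from the standard library stands in for the optimized spectral estimators a real pipeline would probably use. The alpha and beta edges follow the ranges quoted from Ikeda et al. (2014), and the high gamma edges follow Martin et al. (2016).

```python
import cmath
import math

# Band edges in Hz, taken from the ranges quoted in the text above.
BANDS = {
    "alpha": (8.0, 13.0),
    "beta": (14.0, 30.0),
    "high_gamma": (70.0, 150.0),
}


def band_powers(signal, fs, bands=BANDS):
    """Return the mean spectral power in each band for one channel.

    Uses a naive O(n^2) DFT so the sketch has no third-party dependencies.
    Only the frequency bins that fall inside a band are evaluated.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC offset
    collected = {name: [] for name in bands}
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        hits = [name for name, (lo, hi) in bands.items() if lo <= freq <= hi]
        if not hits:
            continue
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2 / n
        for name in hits:
            collected[name].append(power)
    return {
        name: (sum(values) / len(values) if values else 0.0)
        for name, values in collected.items()
    }


# Hypothetical one-second recording: a strong 20 Hz (beta) component
# plus a weaker 100 Hz (high gamma) component.
fs = 500.0
signal = [
    math.sin(2 * math.pi * 20 * t / fs) + 0.2 * math.sin(2 * math.pi * 100 * t / fs)
    for t in range(500)
]
powers = band_powers(signal, fs)
dominant = max(powers, key=powers.get)
```

In this synthetic case the beta band carries the most power. With real recordings, the usable bands would depend on the acquisition method, as noted above: a high gamma feature of this kind presupposes a recording technique such as ECoG that resolves those frequencies.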

How Is (Imagined) Speech Produced?

Models of Speech Production

It is a matter of consensus in psycholinguistic research that speech production is planned across multiple hierarchically organized levels of analysis (Hickok, 2012) and that word production involves at least two stages of processing: a lexical and a phonological stage (Levelt et al., 1999) (Figure 4B). Models of speech production can differ in terms of the number of distinct stages involved (Hickok, 2014, Hickok, 2012, Levelt, 1999, Levelt et al., 1999), but there is general agreement that it involves a staged, hierarchical process with a temporal structure, as indicated by the models in Figure 4.

Figure 4

Speech Production Models with Estimated Time Courses

Although models can differ in the number of components, there is general agreement that speech production is a staged, hierarchical process with a temporal structure, as indicated in the diagram. In (A), estimated time courses associated with the stages of production are provided in milliseconds (ms) (Indefrey, 2011) along with a production model containing two major components. These are the word (lemma) level and the phonological level (Hickok, 2012). In (B), a more detailed model depicts several different phases in the production process (Levelt et al., 1999). The initial stage is conceptual preparation, where a message to be expressed is formulated and a lexical concept produced. Next is lexical selection, in which a word or lemma is retrieved for use. Following selection of a lemma, the morphological stage bridges between the conceptual domain and the phonological, or articulatory, domain. A word is then encoded in syllabic form before being encoded in phonetic form, from which the audible output is produced. In (C), a truncated version of the model in (B) is presented to highlight the stages of production corresponding to imagined speech. The estimated time courses end with the phonological encoding/syllabification stage. (A) is adapted with permission from Hickok 2012, copyright 2012, Springer Nature. (B) is adapted with permission from Levelt et al. 1999, copyright 1999, Cambridge University Press. *upper boundary.

According to Levelt (1999), spoken word production includes lexical selection, lemma retrieval, and morphological and phonological code retrieval and is completed with articulation (Figure 4A). Models of speech production typically begin with an input from the conceptual system, i.e., the message to be expressed (Levelt, 1999). This is then mapped to a corresponding lexical representation, encoding properties such as grammatical features but not a phonological form. 
Following selection of a lemma, the morphological stage bridges the gap between the conceptual domain and the phonological or articulatory domain. Phonetic encoding and articulation, seen in Figure 4A, are stages of the speech production process concerned with acoustic output. The speech production models, as stated here, are based primarily on work in the fields of motor control and psycholinguistics, and it has been noted that linguistic models are currently constrained by the need for further developments in neuroscience (Hickok, 2012). EEG studies have been used to study the time courses associated with the processing stages in word production (see Indefrey, 2011 for review). Following analysis of several event-related potential studies, Indefrey (2011) presented the following estimated onset times and durations for overt speech production: conceptual preparation (0–200 ms), lemma retrieval (200–275 ms), phonological code retrieval (275 ms onset), syllabification (355 ms onset; 20 ms per phoneme, 50–55 ms per syllable), phonetic encoding (455 ms onset), and articulation (600 ms) (Figure 4A). Although this research is based on overt speech, and the articulation stage is not relevant, the estimated timings can be informative for DS-BCI researchers seeking to target a specific stage of the production process during signal decoding. Language production involves multiple levels of representation, and this modular system incorporates various sub-systems, i.e., semantics, syntax, and phonology. Different brain regions in the left and right hemispheres have been identified as supporting these language functions, with syntactic processing supported by networks involving the temporal cortex and inferior frontal cortex, and less lateralized temporo-frontal networks subserving semantic processing (see Friederici, 2011). 
In discussing Hebbian theory, Pulvermüller (1999) considers whether lexical or semantic distinctions reflect differences that are biologically real, using it to explain the observation that word meanings can be mapped to different cortical regions, for example. This results in words that are distinguished on the basis of linguistic criteria being represented differently in the brain. Investigations into the neural correlates of language function and competence commonly employ functional imaging approaches (see Indefrey and Levelt, 2004), as well as LSM, to determine the links between linguistic pathologies and corresponding lesion sites in aphasics (Bates et al., 2003). Linguistic research can be considered within the context of several modular domains, four of which (semantics, lexical access, syntax, and phonology) are discussed in the following sections.
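
The estimated onset times from Indefrey (2011) quoted earlier in this section could be used to select the portion of a trial that corresponds to a given production stage. The following Python sketch shows one possible way to do this and should be read as an illustration under stated assumptions rather than an established method. The names `STAGE_ONSETS_MS` and `stage_window_samples` are hypothetical. Treating the onset of the next stage as the end of the current one is an assumption, because only onsets are given for the later stages. Time zero is assumed to be the cue or stimulus onset of the trial.

```python
# Estimated stage onsets in milliseconds, as quoted from Indefrey (2011).
# These estimates were derived for overt speech production.
STAGE_ONSETS_MS = [
    ("conceptual preparation", 0),
    ("lemma retrieval", 200),
    ("phonological code retrieval", 275),
    ("syllabification", 355),
    ("phonetic encoding", 455),
    ("articulation", 600),
]


def stage_window_samples(stage, fs_hz):
    """Return (start, stop) sample indices for a stage, relative to trial onset.

    Assumption: a stage ends where the next stage begins. No window is
    returned for the final stage because no end time is given for it.
    """
    names = [name for name, _ in STAGE_ONSETS_MS]
    index = names.index(stage)  # raises ValueError for an unknown stage
    if index == len(names) - 1:
        raise ValueError("no end time is available for the final stage")
    start_ms = STAGE_ONSETS_MS[index][1]
    stop_ms = STAGE_ONSETS_MS[index + 1][1]
    return round(start_ms * fs_hz / 1000), round(stop_ms * fs_hz / 1000)


# Hypothetical one-second epoch sampled at 1000 Hz (one sample per ms).
fs_hz = 1000
epoch = list(range(1000))
start, stop = stage_window_samples("lemma retrieval", fs_hz)
lemma_segment = epoch[start:stop]
```

Because the timings are averages across studies of overt speech, windows selected in this way would at best be a starting point for imagined speech data and would need to be validated against the recordings themselves.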

Semantics and the Meaning of Words

Semantic knowledge has been referred to as the ability to assign and use the meaning of words, relying on both stored semantic knowledge and executive control to enable semantic activation in line with goals and constraints (Whitney et al., 2012). The term semantics refers to the meaning of a word or collection of words. In the models of speech production in Figure 4, semantic information forms part of the conceptual stage in which a message to be expressed is conceived. This conceptual stage precedes lexical selection, syntactic encoding, and phonological encoding, with the process leading up to selection of a lexical concept referred to as “conceptual preparation.” Mapping between the semantic concept to be expressed and a lexical formulation of this message is not a simple one-to-one process, as there are often multiple ways to refer to a single concept (e.g., a car may be referred to as a vehicle, saloon, or motorcar) (Levelt et al., 1999). Semantic comprehension studies indicate that semantic operations are normally slower to develop and longer lasting than syntactic operations (Piñango et al., 2006) and thus accommodate slower lexical activation than syntactic dependencies (Love et al., 2008). However, it cannot simply be assumed that the relationship between semantic and syntactic comprehension is mirrored in speech production processes. One study has posited the possibility of an intermediate layer between semantics and phonology owing to the arbitrary nature of the mapping from meaning to sounds, i.e., words with similar meanings do not tend to have similar sounds associated (Lambon Ralph et al., 2002), and the Hebbian associationist model predicts that semantic differences between word categories generate patterns of neural activity reflective of those differences (Pulvermüller, 1999). 
For example, naming of living versus inanimate objects was more strongly correlated with integrity of the middle temporal cortex, whereas both categories showed significant overlap in the frontal cortex (Henseler et al., 2014). In addition, large parts of the IFG appear to be involved in semantic differentiation of verbs versus nouns. Activation in the LIFG is typically exhibited when difficult semantic relationships, such as the meaning of ambiguous words (e.g., words such as break, light, and head have multiple meanings) within a sentence, need to be parsed. These difficult relationships may be weak or unusual associations, an increased number of response options, or competition among potential targets in a semantic network (Badre et al., 2005). Although many neuroimaging studies have concentrated on the LIFG as the basis for semantic processing and control, other studies show that damage to a wide distribution of brain regions results in impairment of semantic control (Whitney et al., 2012). The orbital IFG exhibited higher correlation with the semantic differentiation of nouns, whereas a more posterior, triangular/opercular part of the IFG was associated with the impaired differentiation of verbs. Results from action word studies have indicated that semantic processing can engage many different cortical areas, with Pulvermüller (2005) stating that this contradicts the view that processing of meaning is concentrated in a single cortical location. Moreover, it has been demonstrated that word class distinctions can be made in relation to different types of action words (Hauk et al., 2004), with different cortical activations associated with the muscles used to perform a given action, the complexity of the movement, and the number of muscles involved (Pulvermüller, 1999).

Lexical Access Maps Meaning to Words

Lexical access is the process that facilitates access to the words retained in memory that are required for language production. Dell, Martin, and Schwartz (Dell et al., 2007) present a two-step model of lexical access in which a network consists of a semantic layer connected to words and words connected to a phoneme layer. Word retrieval begins when the semantic features of an intended word are activated. This activation proceeds through the network, resulting in the selection of the most active word from a grammatical category. A phonological retrieval stage begins with the activation of this selected word. Lexical access affects the fluency and speed at which speech is produced. For example, it has been shown that function words (i.e., contributing to syntax/grammar) are accessed faster than content words (i.e., contributing to information/meaning), independent of perceptual characteristics (Segalowitz and Lane, 2004). Another factor influencing lexical fluency is the frequency with which a word is used (Mohr et al., 1996). In a picture-naming paradigm, participants displayed quicker response times in object-naming tasks than they did in action-naming tasks, leading the authors to posit that the process of mapping between the picture and the name itself appears to differ between lexical categories, namely, nouns versus verbs (Szekely et al., 2005). Other evidence taken from studies involving patients with aphasia has shown that the mental lexicon distinguishes grammatical classes (Benetello et al., 2016). There are several brain regions associated with word production during lexical selection. 
Indefrey and Levelt (2004) reviewed 82 functional imaging studies of single word production, identifying 11 regions in the left hemisphere (posterior IFG, ventral precentral gyrus, SMA, mid- and posterior STG and MTG, posterior temporal fusiform gyrus, anterior insula, thalamus, and medial cerebellum) and four in the right (mid-STG, medial and lateral cerebellum and SMA) involved in core processes of word production. Other functional imaging studies have demonstrated that lexical-semantic knowledge is stored in the temporal lobe (Vigneau et al., 2006) and that the region can operate as a lexical interface linking phonological and semantic information in a sound-to-meaning interface (Hickok and Poeppel, 2007). Elsewhere, the left MTG has been found to associate with lexical selection (Indefrey and Levelt, 2004). The spatiotemporal dynamics of word retrieval, including lexical selection, are not well understood, but Riès et al. (2017) have shown that activation of word representations and their selection temporally co-occur and that a widespread network of overlapping brain regions is associated. The variety of brain regions implicated in word production suggests that there is potential for exploiting semantics, syntax, and phonology to activate different regions during imagined speech production to maximize the separability of brain activations for DS-BCI.
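
The two-step structure of lexical access described at the start of this section (semantic features activate words, and the selected word then activates its phonemes) can be pictured with a toy example. The Python sketch below is a deliberately simplified illustration and not an implementation of the model of Dell et al. (2007). The miniature vocabulary, the connection weights, and the names `SEMANTIC_TO_WORD`, `WORD_TO_PHONEMES`, and `retrieve` are all invented for this example. It omits the grammatical-category constraint on selection and any noise, decay, or feedback that a full model would be expected to include.

```python
# Hypothetical semantic-feature -> word connections with invented weights.
SEMANTIC_TO_WORD = {
    "animal": {"cat": 1.0, "dog": 1.0, "rat": 1.0},
    "feline": {"cat": 1.0},
    "pet": {"cat": 1.0, "dog": 1.0},
}

# Hypothetical word -> phoneme connections (informal phoneme labels).
WORD_TO_PHONEMES = {
    "cat": ["k", "ae", "t"],
    "dog": ["d", "o", "g"],
    "rat": ["r", "ae", "t"],
}


def retrieve(features):
    """Toy two-step retrieval: select a word, then retrieve its phonemes.

    Step 1: sum the activation each word receives from the active semantic
            features and select the most active word.
    Step 2: activate the phonemes connected to the selected word.
    """
    activation = {}
    for feature in features:
        for word, weight in SEMANTIC_TO_WORD.get(feature, {}).items():
            activation[word] = activation.get(word, 0.0) + weight
    if not activation:
        raise ValueError("no word received any activation")
    selected = max(activation, key=activation.get)
    return selected, WORD_TO_PHONEMES[selected]


word, phonemes = retrieve(["animal", "feline", "pet"])
```

Here "cat" receives activation from all three features and is selected ahead of "dog" and "rat", after which its phonemes are retrieved. The separation into two steps is the point of the sketch: word selection is completed before phonological retrieval begins.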

The Hierarchical Structure of Syntax

Contemporary linguistic theories contend that syntactic and sentential representations are complex sets of hierarchically organized syntactic categories and that the relationships between categories in this hierarchy determine the different aspects of propositional meaning (see Zaccarella and Friederici [2016] for a neurobiological review of syntactic hierarchies). During syntactic encoding, a conceptual message is linguistically encoded by retrieval of corresponding words from the lexicon and grammatical ordering of these words (Indefrey et al., 2001). Stored syntactic information, such as word class, is used to compute a structure that specifies the relationships between words in a sentence, e.g., order and inflection. It has been proposed (Frazier, 1987), and countered (Friederici, 2002), that there is an isolated syntactic processing mechanism that has no relation to semantics or other non-syntactic information. It has been stated that syntactic encoding in speech production exhibits close temporal overlap with other processes (Indefrey et al., 2001) and that brain activations in the frontotemporal language network have indicated that syntactic processing occurs before semantic processing but that these processes are not isolated mechanisms (Friederici, 2002). Syntactic processing is specifically associated with BA44, located in the posterior portion of Broca area in the LIFG and its white matter connection to the posterior temporal cortex (Friederici, 2018). A functional imaging study has provided evidence that hierarchical syntactic conditions localized in the ventral portion of BA44 (Zaccarella and Friederici, 2015). In contrast, activations corresponding to processing of two-word sentences without syntactic hierarchy were associated with the frontal operculum/anterior insula. Love et al. (2008) provide evidence that the left IFC supports syntactic processing because it sustains the requisite lexical activation speed needed for the real-time formation of a syntactic dependency. Elsewhere, PET has been used to identify both sentence-level and local syntactic encoding of speech in the Rolandic operculum, adjacent to Broca area (Indefrey et al., 2001).

The Internal Phonological Speech Code

Within psycholinguistic theory the assumption exists that speech articulation is preceded by an internal abstract speech code (Wheeldon and Levelt, 1995). In speech production, a word can have different intonation, duration, and amplitude, leading to the proposal that each linguistic unit has a phonological representation encoding features unique to that unit. Phonological representations are categorical and consist of discrete timeless segments (Wheeldon and Levelt, 1995). Models differ as to the timing and order at which phonemes are assigned to a phonological structure. Following the syntactic computation phase, stored information on the sounds of words is retrieved as “phonological codes.” These are then transformed to produce an executable code, i.e., speech (Indefrey et al., 2001). It has been proposed that phonological word representation is accessed from Broca area and compiled into segments of syllables (Indefrey and Levelt, 2004). Other studies indicate that the posterior middle and inferior portions of the temporal lobes are linked to phonological and semantic processing (see Hickok and Poeppel, 2007). Another suggestion (Edwards et al., 2010) is that speech production is enabled through verbal/phonological working memory using the dorsal stream areas implicated in speech perception and phonological working memory (e.g., Hickok and Poeppel, 2007). It has been suggested that phonological encoding exhibits correlation with the superior temporal sulcus (STS) (Llorens et al., 2011), whereas the authors of one study linked the IFG and STS gamma band responses (>40 Hz) to the phonological retrieval processes and imagined speech production, using intracranial EEG recordings (Mainy et al., 2008). 
Although it is well known that lemma selection begins earlier than phonological encoding, it seems that there is some temporal overlap between the two activations (Sedivy, 2014) and it is possible that phonologically similar words are represented by overlapping cell assemblies sharing a single perisylvian region (Pulvermüller, 1999). It is possible for a phonological word form to have two meanings (e.g., the noun/verb dichotomy of the/to beat), and it has been suggested that there must be an underlying mechanism for realizing the exclusive-or relationship between the two. The review of the literature presented in the earlier sections provides the basis for our discussion on the role of linguistics within the framework of DS-BCI research. This discussion is presented in the next section.

An Enhanced Role for Linguistics in BCI Research

Overt speech is a rich tapestry of sound, pitch, rhythm, structure, and meaning, and studies have shown that imagined speech retains many of these articulatory characteristics (Alderson-Day and Fernyhough, 2015, Scott et al., 2013). It is one of the great challenges of DS-BCI research to represent this communicative richness through the modality of a BCI. With this goal in mind, improvements to experimental protocol have been suggested, including the use of a vocabulary of words with semantic meaning to improve discrimination between words and a normalization of word length to mitigate the high variance of this feature (Porbadnigk et al., 2009). We advocate the use of novel experimental design to enhance effective elicitation of imagined speech and improve discriminability between phonemes, words, and sentences. Further investigation into the neurological and neuroanatomical underpinnings of imagined speech production and the development of a more concrete understanding of the information contained within different frequency bands at different brain foci are also required. The importance of consistency in the way imagined speech is produced by experimental participants and the effect of providing them with a thorough understanding of what is meant by imagined speech production are additional areas for investigation that may improve the robustness of experimentation. In the following subsections, we extend the work of Iljina et al. (2017) by highlighting three key areas where BCI research can benefit from findings in the field of neurolinguistics.

Incorporating the Structure of Speech Production Processing

The sheer complexity of the neural mechanisms underpinning speech is one of the primary factors causing resistance to the development of a DS-BCI. In comparison with many of the previous incarnations of communicative BCI (Chaudhary et al., 2017, Pandarinath et al., 2017), the character of the modality of interaction, i.e., imagined speech, is still a relatively poorly understood phenomenon. In relation to DS-BCIs, the following question has been put forward: when does semantic, phonological, or syntactic processing occur (Iljina et al., 2017)? The analysis of Indefrey (2011) provides some insight into the relative timings associated with the stages of speech production (see Figure 4) and indicates that it may be possible to target decoding of semantic information at an earlier stage than the phonological representation. The temporal sequence of these processes is an important consideration for BCI researchers seeking to extract meaning from imagined speech, but there are opposing views to navigate. One of these is a sequential model in which word production involves a series of separate stages from semantic concept through word retrieval and phonological articulation (Levelt et al., 1999). Alternative models hypothesize a parallel architecture in which neurolinguistic processes occur simultaneously (Jackendoff, 2007). Whichever of these models is correct, it must be incorporated into the DS-BCI paradigm. The speech production process as depicted in section “How is (Imagined) Speech Produced?” offers a staged process with the potential to be mined for more targeted decoding approaches. Models of speech processing, for example, have proposed that accessing the phonological representation of a word releases two kinds of information: a frame that specifies the structure of a word and phonemes to fill slots in this structure (Dell, 1988, Levelt, 1992). 
An interesting operation referred to as gap filling (Love et al., 2008) has been observed in studies of lexical priming whereby the meaning of a displaced constituent is activated when it is first encountered in a sentence and then reactivated at a site indexed by a trace. Consider the following sentence as an example: “(The boy) that the horse chased (t) is tall.” In a case like this, activation is present for “boy” and again at the gap indexed by “t,” where there is no phonologically realized word. Crucially, there is no activation before the word “chased,” indicating that the activation for “boy” at the gap is not residual activation but the result of reactivation (Love et al., 2008). This may have important implications for the development of a DS-BCI that decodes continuous imagined speech from brain activity, as the neurological basis of syntax requires a complex series of operations not simply based on surface word order. Understanding of the widely distributed brain regions associated with semantic and syntactic processing and speech production (as discussed in sections “Semantics and the Meaning of Words” and “The Hierarchical Structure of Syntax”) should be harnessed along with enhanced methods for eliciting imagined speech, to improve the decoding accuracy of DS-BCIs. Herff et al. (2017) have shown that continuous speech is represented as a sequence of phones within the brain and is thus a legitimate target for DS-BCI research. Following this, it seems reasonable to suggest that concatenation of imagined speech units can be used to produce words and sentences. Perrone-Bertolotti et al. (2014) discuss concerns over the way imagined speech manifests itself and how personal agency or lack thereof leads to different forms of imagined speech. The more active form, described as “deliberate covert production of speech,” is consciously generated speech and the target of DS-BCI research. 
However, a less deliberate manifestation known as “verbal mind wandering” can occur spontaneously. Despite not being the direct target of DS-BCIs, this second state of imagined speech may influence the performance of such a device or even activate communication when none was intended.
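The suggestion above that imagined speech units might be concatenated to produce words and sentences can be made concrete with a small sketch. It assumes, purely for illustration, that a decoder has already produced a clean sequence of phone labels; the pronunciation lexicon, the phone symbols, and the function name `phones_to_words` are hypothetical and are not taken from any of the studies cited here. The sketch segments a phone sequence into words by dynamic programming over lexicon entries, which is only one of several ways such a concatenation step could be realized.

```python
"""Toy concatenation of decoded phones into words via lexicon lookup.

The lexicon and phone symbols are invented for illustration; a real
system would have to cope with noisy, probabilistic decoder output.
"""

# Hypothetical pronunciation lexicon: phone tuple -> word
LEXICON = {
    ("DH", "AH"): "the",
    ("B", "OY"): "boy",
    ("IH", "Z"): "is",
    ("T", "AO", "L"): "tall",
}
MAX_WORD_LEN = max(len(phones) for phones in LEXICON)


def phones_to_words(phones):
    """Segment a phone sequence into lexicon words.

    Returns a list of words covering the whole sequence, or None if no
    complete segmentation exists under the lexicon.
    """
    n = len(phones)
    # best[i] holds one word sequence that exactly covers phones[:i], if any.
    best = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            if best[start] is None:
                continue
            word = LEXICON.get(tuple(phones[start:end]))
            if word is not None:
                best[end] = best[start] + [word]
                break
    return best[n]


if __name__ == "__main__":
    decoded = ["DH", "AH", "B", "OY", "IH", "Z", "T", "AO", "L"]
    print(phones_to_words(decoded))
```

The sketch operates on surface phone order alone. As the gap-filling example indicates, syntactic processing is not simply a function of surface word order, so a step of this kind could at most be one component of a continuous-speech decoder.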

Leveraging Neurolinguistics Concepts to Improve Discriminability

The ability to effectively discriminate between neural recordings is an essential component of any BCI, and it is a particularly complicated challenge in relation to DS-BCIs, given the complex and dynamic processes of speech production. Decoding brain activity corresponding to imagined speech is an exceptional challenge, given the dense vocabulary and the volume of potential semantic combinations that humans possess. In section “Semantics and the Meaning of Words,” evidence is presented linking different semantic categories to different lesion foci, and semantic categorization of words appears to be a promising method for improving classification from a constrained lexicon. Content words, i.e., words with rich semantic meaning (e.g., words referring to tastes, sensations, sounds, or motor activities), have been associated with distinct regions of the brain and may enable classification of words based on semantic criteria (Pulvermüller, 1999). Although this may appear to be a somewhat contrived method for improving accuracy, this approach can help elucidate the degree to which semantic categorization contributes to differentiation between words (Wang et al., 2011). Categorical differences between words can induce significantly different brain activity, and this variance may be an aid to classification. For example, action words (e.g., kick, throw, blink) can have the effect of activating brain regions actually involved in carrying out the activity (Hauk et al., 2004). Similarly, words corresponding to touch may include significant activation in the somatosensory cortices and sound words may cause increased activation in bilateral auditory cortices (Pulvermüller, 1999). 
Imagined speech's close association with working memory (Marvel and Desmond, 2012), the range of articulatory forms it can take (Alderson-Day et al., 2015, Deng et al., 2010), and the different neural activations it exhibits in relation to overt speech (Basho et al., 2007) contribute to making imagined speech extremely difficult to decode effectively. Methods employed in neurolinguistics can help DS-BCI researchers improve cuing and elicitation techniques, making it easier to determine precisely what is being decoded from brain activity. This may take the form of semantic or phonological priming, as suggested earlier, or improvement of experimental protocols to ensure participants are clear on what is expected from them. It may also be possible to protect against unwanted noise in the data, for example, via articulatory suppression. The previously stated proposal that each linguistic unit has a unique phonological representation (section “The Internal Phonological Speech Code”) is a potential avenue for improving imagined speech discriminability (Zhao and Rudzicz, 2015). Clearly, if the assertion of a unique phonological code is correct, this would be a primary target of DS-BCI decoding approaches, as a single representation corresponding to a single word or phoneme would make those approaches easier to implement, given that the prior stages in the speech production process may not be required. It is the recommendation of this review that further investigation into the potential phonological discriminability of units of imagined speech is pursued. Although much of the research to date into a possible DS-BCI has focused on discrete linguistic units, i.e., vowels, consonants, it has been suggested that the neural substrates responsible for the representation of phonemes may differ depending on whether they are processed as part of a sequence or processed alone (Ikeda et al., 2014). 
Di Liberto, O'Sullivan, and Lalor (Di Liberto et al., 2015) lament the lack of research present in the literature regarding the parsing and processing of continuous speech. However, the difficulty of experimentation with imagined speech and the impracticality of attempting to decode continuous speech, at a time when decoding discrete units of speech is still enormously challenging, has meant that to date the majority of studies have focused on discrete units of speech in the development of decoding strategies. If progress is to be made using these approaches, the anatomical information summarized in sections “Imagined Speech: A Special Case of Speech” and “How is (Imagined) Speech Produced?” will be important for informing decoding strategies. Targeting regions of interest specific to speech production may be a promising approach to the development of a DS-BCI (Guenther et al., 2009), particularly considering that speech processing is a highly distributed operation with semantics, lexical access, syntax, and phonology, all correlated to different regions. Although we agree with Bocquelet et al. (2017) that the LIFG is clearly implicated in imagined speech production, and a promising candidate for DS-BCI research, we think it is important to consider a wider, and probably bilateral, network where the distributed connectivity predicted by Hebbian theory is accounted for. The evidence presented here indicates a wide cortical network associated with different linguistic categories and stages of the speech production process. It is our assertion that a complete picture of the neuroanatomical correlates of imagined speech will provide greater opportunities for effective discriminability.

Mitigating the Limitations of Experimental Methodology

Progress toward a DS-BCI is dependent on the effectiveness of future research methodologies and on novel approaches to system development. It has been noted that researchers seeking to distinguish word classes from neural activation should consider the effect of word length, word frequency, emotional properties of the stimuli, word repetition, priming, and syntactic and semantic context when designing experiments (Pulvermüller, 1999). The same author also warns of the possible unintended effects of presenting words in sentences or word strings, because the neurophysiological response is a complex blend of the semantic and syntactic interactions of the given words. One of the difficulties associated with the development of a DS-BCI is inferring from experimental participants that the required tasks have been performed (Geva et al., 2011b). The lack of behavioral output from participants has meant that researchers have been faced with a choice of whether to accept assertions that a given task has been correctly undertaken, to design their experimental procedure in a manner that will elicit the required imagined speech activity (Geva et al., 2011a), or to merge their imagined speech protocols with an overt action in an attempt at cross-verification (Oppenheim and Dell, 2008). Limitations to the scope of empirical study in the case of imagined speech have prompted the development of methods for indirect study of the phenomenon (Filik and Barber, 2011, Oppenheim and Dell, 2008). Alderson-Day and Fernyhough (2015) present recent methodological advances in the field, including imagined speech inducement and inhibition, as a means of studying its effects. Neuroimaging studies into the nature of imagined speech have often asked participants to simply articulate some words or sentences in imagined speech or to imagine speech with different characteristics. 
A danger associated with these studies is the lack of ecological validity in eliciting imagined speech (Alderson-Day and Fernyhough, 2015) and the failure of researchers to acknowledge the possibility that imagined speech is present during baseline assessments (Jones and Fernyhough, 2007). A technique known as articulatory suppression might provide some assistance in ameliorating this issue (Miyake et al., 2004). The evidence presented in section “The Phenomena of Imagined Speech” indicated variation in the phenomena of imagined speech, both in terms of how it is activated and how it is perceived. Studies have shown that imagined speech is not generally understood in the same way by participants and can vary widely in its phenomenology (Alderson-Day and Fernyhough, 2015). It is the job of the DS-BCI researcher to ensure that each participant is well informed before engaging in experimentation. The methodology employed by Geva, Jones, et al. (Geva et al., 2011b) may be an interesting avenue for exploration in DS-BCI research. Their use of rhyming words and/or homophones is commonly applied in linguistics (Badre et al., 2005, Filik and Barber, 2011) to allow researchers to know whether participants are using imagined speech or resorting to other linguistic/cognitive strategies. For example, “might” and “mite” are homophones, whereas “ear” and “oar” are not. These are tasks that could not be solved by orthography alone and thus require the use of imagined speech. Research methodology using overt speech to represent imagined speech within experimental paradigms is flawed, at least to some degree. Overt speech–trained models, for example, are an active research area, but it must be understood that neural representations of overt and imagined speech are not identical (Chakrabarti et al., 2015). 
Hubbard (2010) reflects that differences in experimental results between overt and imagined speech may simply be a function of a participant's ability to self-monitor and report accurately. There is general agreement that overt speech engages greater activation across a broader network of the brain than imagined speech, with areas including the mesial temporal lobe and sub-cortical structures (Kielar et al., 2011). Owing to some notable differences observed from neural responses in overt and imagined conditions, inferences drawn from language processing studies should be considered with caution (Llorens et al., 2011). However, Iljina et al. (2017) believe that the body of research presented on both overt and imagined speech supports the premise of being able to decode expressive language from neuronal processes as well as translation of findings from overt to imagined speech. Experimental results can be negatively affected by experimental conditions, and an alternative approach to improving the robustness of results in relation to speech production and communicative interaction is the use of non-experimental, “real-world” speech (Derix et al., 2014, Derix et al., 2012, Ruescher et al., 2013). Spontaneous language can reflect mental states and thus constitutes a fundamental link between externally observable behavior and internal cognitive processes (Derix et al., 2014). Using their methodology, in which simultaneous ECoG and digital video recordings are used to identify periods of spontaneous communication between interlocutors, the group cited earlier has conducted studies based on concepts developed in psycholinguistic research into spontaneously spoken language. The authors highlight the importance of study paradigms in which real-world situations can be investigated in a way not possible under strict experimental procedures. 
They present the use of stimuli such as naturalistic texts, recordings of interacting individuals, and virtual reality simulations as associated methods being employed elsewhere (Derix et al., 2014). In a series of studies, the research team used their methodology to study the neuronal processes related to real-life communication in a non-experimental scenario (Derix et al., 2014, Derix et al., 2012, Ruescher et al., 2013). This involved a technique for identifying time periods in which patients were involved in conversation with either partners or physicians (Derix et al., 2012). Extracted epochs consisted of periods of natural, uninstructed conversation, with the results indicating that the choice of linguistic and non-linguistic behaviors depends on whom a person is speaking with. The authors suggest that such meta-information may have utility in BCI applications aimed at restoration of expressive speech. Although non-experimental conditions do facilitate the study of spontaneous speech, it is important to acknowledge, as the research team has, that participants' behavior may be moderated by the knowledge that they are under surveillance, and therefore not completely natural (Derix et al., 2014). However, we agree with Iljina et al. (2017) that a thorough understanding of brain activity during real-world speech is required for the development of truly naturalistic DS-BCI. As indicated throughout this review, there are several ways in which DS-BCI research can benefit from neurolinguistics research advances. Understanding the phenomena of imagined speech and individual speech processes is crucial, but looking toward neurolinguistics to enhance experimental methodology and interpretation of results is also advocated here. 
Other avenues exist for exploration of improvements to the performance of DS-BCIs, including signal acquisition and advanced classification algorithms, but it would be wrong to ignore the potential utility of cross-disciplinary research in neurolinguistics and DS-BCI.

Concluding Remarks

Development of a DS-BCI is an extremely challenging undertaking. It is the assertion of this review that a cross-disciplinary approach must be taken to advance the field toward a naturalistic form of communication. Here, we advocate the integration of neurolinguistics within the DS-BCI paradigm for the improvement of experimental methodology and to aid approaches to the decoding of neural signals. Insights into the nature of imagined speech and speech production processes can inform research practices, whereas methodological approaches common in linguistics can help improve procedural robustness in studies involving imagined speech. Clearly, there is no definitive description of the phenomena of imagined speech. Independently depicted as a truncated form of overt speech, as showing greater activation in several brain regions than overt speech and as having attenuated features in comparison with overt speech, imagined speech is still relatively poorly understood. Continuing research into imagined speech from a neurolinguistics perspective will be vital for DS-BCI. Imagined speech manifests itself in different forms, whether that be through active or passive generation of imagined speech; through accent, rhythm, or pitch; or through conversational or single-speaker scenarios. That being the case, future research in this field must make it abundantly clear to experimental participants precisely what is being asked of them. The field of neurolinguistics can help inform DS-BCI research on methods for targeting the imagined speech content required. Not unrelated to this is the potential for additional information to be encoded in the neural recordings extracted during periods of imagined speech production. Working memory and imagined speech appear to be intrinsically linked, and imagined speech trials are susceptible to influence from the auditory or visual cues presented. 
It is therefore important that experimental methodologies and decoding approaches mitigate this unwanted content where possible. This review has shown that DS-BCI is concerned not only with the phenomena of imagined speech and how it differs from overt speech but also with the neuroanatomy and specific processes involved in the production of speech. Speech production is a temporal process with a hierarchical structure, and it is clear that it cannot be considered a single function localized in a single brain region. Evidence has been presented from neurolinguistics research to indicate that different systems of speech production, such as semantics and syntax, operate at distinct time periods (sometimes overlapping) across a distributed network of brain regions and that these systems activate patterns of brain activity that may be useful for approaches to decoding imagined speech. A fully functioning DS-BCI may, at present, seem a long way off, and it may appear that there are more pressing concerns, such as improving signal acquisition, for the field to focus on. However, it is our contention that it would be remiss to ignore the field of neurolinguistics in DS-BCI research, given the potential benefits it can offer in the short term and the high probability that it will be required in the longer-term development of a naturalistic mode of communication.

115 in total

Review 1. Words in the brain's language.

Authors: F Pulvermüller
Journal: Behav Brain Sci Date: 1999-04 Impact factor: 12.579

2. Voxel-based lesion-symptom mapping.

Authors: Elizabeth Bates; Stephen M Wilson; Ayse Pinar Saygin; Frederic Dick; Martin I Sereno; Robert T Knight; Nina F Dronkers
Journal: Nat Neurosci Date: 2003-05 Impact factor: 24.884

3. Frequency-dependent spatiotemporal distribution of cerebral oscillatory changes during silent reading: a magnetoencephalograhic group analysis.

Authors: Tetsu Goto; Masayuki Hirata; Yuka Umekawa; Takufumi Yanagisawa; Morris Shayne; Youichi Saitoh; Haruhiko Kishima; Shirou Yorifuji; Toshiki Yoshimine
Journal: Neuroimage Date: 2010-08-19 Impact factor: 6.556

4. Low-Frequency Cortical Entrainment to Speech Reflects Phoneme-Level Processing.

Authors: Giovanni M Di Liberto; James A O'Sullivan; Edmund C Lalor
Journal: Curr Biol Date: 2015-09-24 Impact factor: 10.834

5. Effects of generation mode in fMRI adaptations of semantic fluency: paced production and overt speech.

Authors: Surina Basho; Erica D Palmer; Miguel A Rubio; Beverly Wulfeck; Ralph-Axel Müller
Journal: Neuropsychologia Date: 2007-01-16 Impact factor: 3.139

6. Role of the left inferior frontal gyrus in covert word retrieval: neural correlates of switching during verbal fluency.

Authors: Elizabeth A Hirshorn; Sharon L Thompson-Schill
Journal: Neuropsychologia Date: 2006-05-24 Impact factor: 3.139

7. From storage to manipulation: How the neural correlates of verbal working memory reflect varying demands on inner speech.

Authors: Cherie L Marvel; John E Desmond
Journal: Brain Lang Date: 2011-09-01 Impact factor: 2.381

8. High performance communication by people with paralysis using an intracortical brain-computer interface.

Authors: Chethan Pandarinath; Paul Nuyujukian; Christine H Blabe; Brittany L Sorice; Jad Saab; Francis R Willett; Leigh R Hochberg; Krishna V Shenoy; Jaimie M Henderson
Journal: Elife Date: 2017-02-21 Impact factor: 8.140

9. Using the electrocorticographic speech network to control a brain-computer interface in humans.

Authors: Eric C Leuthardt; Charles Gaona; Mohit Sharma; Nicholas Szrama; Jarod Roland; Zac Freudenberg; Jamie Solis; Jonathan Breshears; Gerwin Schalk
Journal: J Neural Eng Date: 2011-04-07 Impact factor: 5.379

10. Restoring cortical control of functional movement in a human with quadriplegia.

Authors: Chad E Bouton; Ammar Shaikhouni; Nicholas V Annetta; Marcia A Bockbrader; David A Friedenberg; Dylan M Nielson; Gaurav Sharma; Per B Sederberg; Bradley C Glenn; W Jerry Mysiw; Austin G Morgan; Milind Deogaonkar; Ali R Rezai
Journal: Nature Date: 2016-04-13 Impact factor: 49.962
