Hannah Maslen, Stephen Rainey.
Abstract
Implantable brain-computer interfaces (BCIs) are being developed to restore speech capacity for those who are unable to speak. Patients with locked-in syndrome or amyotrophic lateral sclerosis may be able to use covert speech - vividly imagining saying something without actual vocalisation - to trigger neurally controlled systems capable of synthesising speech. User control has been identified as particularly pressing for this type of BCI. The incorporation of machine learning and statistical language models into the decoding process introduces a contribution to (or 'shaping of') the output that is beyond the user's control. Whilst this type of 'shared control' of BCI action is not unique to speech BCIs, the automated shaping of what a user 'says' has a particularly acute ethical dimension, which may differ from parallel concerns surrounding automation in movement BCIs. This paper provides an analysis of the control afforded to the user of a speech BCI of the sort under development, as well as the relationships between accuracy, control, and the user's ownership of the speech produced. Through comparing speech BCIs with BCIs for movement, we argue that, whilst goal selection is the more significant locus of control for the user of a movement BCI, control over process will be more significant for the user of the speech BCI. The design of the speech BCI may therefore have to trade off some possible efficiency gains afforded by automation in order to preserve the guidance control necessary for users to express themselves in ways they prefer. We consider the implications for the speech BCI user's responsibility for produced outputs and their ownership of token outputs. We argue that these are distinct assessments. Ownership of synthetic speech concerns whether the content of the output sufficiently represents the user, rather than their morally relevant, causal role in producing that output.
Keywords: Brain-computer interfaces; Control; Neuroprosthetics; Ownership; Responsibility; Speech
Year: 2020 PMID: 34722130 PMCID: PMC8550345 DOI: 10.1007/s13347-019-00389-0
Source DB: PubMed Journal: Philos Technol ISSN: 2210-5433
Fig. 1 This illustration shows essential parts of the process from input of neural signals to speech output. Neural signals, in this case correlates of covert speech, are recorded and prepared for processing. The actual covert speech activity of the user constitutes a variable input to this stage, but not a random one, as only specific signals from known brain areas are recorded (e.g. spectral features, articulatory motor data). This allows processing of the signals in terms of how given signals correspond with probable speech outputs and with positions of the lips, tongue, velum, etc. This is bolstered with a model of language that further constrains processing according to rules (such as the likelihood of one syllable following another). Likely syllable combinations can be predicted from the language model. Combinations of syllables, forming words, can also be predicted based on learning from the actual covert speech activity of a user: predictions based on 'fixed' rules of a language can be adapted according to the actual covert speech behaviour of a user. Machine learning can predict likely speech outputs based on prior speech outputs, as well as a language model. This processed neural and language information can then be input to a vocoder in order to output synthetic speech.
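The caption's point that a language model "further constrains processing according to rules (such as likelihood of one syllable following another)" can be made concrete with a minimal sketch. The code below is not from the paper: the syllable inventory, bigram probabilities, and decoder likelihoods are all invented for illustration. It shows, in the spirit of Fig. 1, how a Viterbi-style decoder combines per-frame signal likelihoods with a bigram syllable model, so that an ambiguous neural frame is resolved by the language model rather than by the user alone - the 'shaping' the abstract discusses.

```python
import math

# Hypothetical syllable inventory and bigram language model
# P(next syllable | previous syllable). Values are invented.
SYLLABLES = ["ba", "da", "ga"]
BIGRAM = {
    "ba": {"ba": 0.2, "da": 0.5, "ga": 0.3},
    "da": {"ba": 0.4, "da": 0.1, "ga": 0.5},
    "ga": {"ba": 0.3, "da": 0.6, "ga": 0.1},
}

def decode(frame_likelihoods):
    """Viterbi decoding over syllables.

    frame_likelihoods[t][s] stands in for the neural decoder's
    estimate P(signal at time t | syllable s).
    """
    # Initialise with the first frame's decoder scores (log domain).
    scores = {s: math.log(frame_likelihoods[0][s]) for s in SYLLABLES}
    backpointers = []
    for frame in frame_likelihoods[1:]:
        new_scores, pointers = {}, {}
        for s in SYLLABLES:
            # Best predecessor under decoder + language-model scores.
            prev, best = max(
                ((p, scores[p] + math.log(BIGRAM[p][s])) for p in SYLLABLES),
                key=lambda pair: pair[1],
            )
            new_scores[s] = best + math.log(frame[s])
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back the most probable syllable sequence.
    last = max(scores, key=scores.get)
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return list(reversed(path))

frames = [
    {"ba": 0.7, "da": 0.2, "ga": 0.1},  # decoder strongly favours "ba"
    {"ba": 0.3, "da": 0.4, "ga": 0.3},  # ambiguous frame: the LM tips the balance
    {"ba": 0.1, "da": 0.2, "ga": 0.7},  # decoder strongly favours "ga"
]
print(decode(frames))  # -> ['ba', 'da', 'ga']
```

Note that in the middle, near-ambiguous frame the output is co-determined by the bigram prior, not by the neural signal alone; this is precisely the locus of the 'shared control' the paper analyses.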