Guy H Wilson1, Sergey D Stavisky2,3,4, Francis R Willett2,4,5, Donald T Avansino2, Jessica N Kelemen6, Leigh R Hochberg6,7,8,9, Jaimie M Henderson2,3, Shaul Druckmann3,10, Krishna V Shenoy3,4,5,10,11.

1. Neurosciences Graduate Program, Stanford University, Stanford, CA, United States of America.
2. Department of Neurosurgery, Stanford University, Stanford, CA, United States of America.
3. Wu Tsai Neurosciences Institute and Bio-X Institute, Stanford University, Stanford, CA, United States of America.
4. Department of Electrical Engineering, Stanford University, Stanford, CA, United States of America.
5. Howard Hughes Medical Institute at Stanford University, Stanford, CA, United States of America.
6. Department of Neurology, Harvard Medical School, Boston, MA, United States of America.
7. Center for Neurotechnology and Neurorecovery, Dept. of Neurology, Massachusetts General Hospital, Boston, MA, United States of America.
8. VA RR&D Center for Neurorestoration and Neurotechnology, Rehabilitation R&D Service, Providence VA Medical Center, Providence, RI, United States of America.
9. Carney Institute for Brain Science and School of Engineering, Brown University, Providence, RI, United States of America.
10. Department of Neurobiology, Stanford University, Stanford, CA, United States of America.
11. Department of Bioengineering, Stanford University, Stanford, CA, United States of America.
Abstract
OBJECTIVE: To evaluate the potential of intracortical electrode array signals for brain-computer interfaces (BCIs) to restore lost speech, we measured the performance of decoders trained to discriminate a comprehensive basis set of 39 English phonemes and to synthesize speech sounds via a neural pattern-matching method. We decoded neural correlates of spoken-out-loud words in the 'hand knob' area of precentral gyrus, a step toward the eventual goal of decoding attempted speech from ventral speech areas in patients who are unable to speak.

APPROACH: Neural and audio data were recorded while two BrainGate2 pilot clinical trial participants, each with two chronically implanted 96-electrode arrays, spoke 420 different words that broadly sampled English phonemes. Phoneme onsets were identified from the audio recordings, and phoneme identities were then classified from neural features consisting of each electrode's binned action potential counts or high-frequency local field potential power. Speech synthesis was performed using the 'Brain-to-Speech' pattern-matching method. We also examined two potential confounds specific to decoding overt speech: acoustic contamination of neural signals and systematic differences in labeling different phonemes' onset times.

MAIN RESULTS: A linear decoder achieved up to 29.3% classification accuracy (chance = 6%) across 39 phonemes, while a recurrent neural network (RNN) classifier achieved 33.9% accuracy. Parameter sweeps indicated that performance did not saturate when adding more electrodes or more training data, and that accuracy improved when the decoder used the time-varying structure of the data. Microphonic contamination and phoneme onset labeling differences modestly inflated decoding accuracy, but these effects could be mitigated by acoustic artifact subtraction and by using a neural speech onset marker, respectively.

SIGNIFICANCE: The ability to decode speech using intracortical electrode array signals from a nontraditional speech area suggests that placing electrode arrays in ventral speech areas is a promising direction for speech BCIs.
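To make the decoding setup concrete, the sketch below shows a linear phoneme classifier operating on binned neural features, run on synthetic data. The feature layout (192 electrodes, a fixed number of time bins around phoneme onset), the balanced trial counts, and the choice of scikit-learn's LogisticRegression as the "linear decoder" are illustrative assumptions, not the authors' actual pipeline or preprocessing.

```python
# Minimal sketch of the phoneme-classification setup summarized above, on
# synthetic data. Feature layout, bin sizes, and the LogisticRegression
# decoder are assumptions for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)

n_phonemes = 39        # size of the phoneme label set
trials_per_class = 20  # assumed; yields 780 synthetic phoneme instances
n_electrodes = 192     # two 96-electrode arrays
n_bins = 10            # e.g. 50 ms bins spanning 500 ms around onset (assumed)

# Synthetic binned spike counts (trials x electrodes x time bins) and labels.
y = np.repeat(np.arange(n_phonemes), trials_per_class)
X = rng.poisson(lam=2.0, size=(y.size, n_electrodes, n_bins)).astype(float)

# Concatenating the time bins lets the linear model exploit time-varying
# structure rather than a single averaged firing-rate window.
X_flat = X.reshape(y.size, -1)

clf = LogisticRegression(max_iter=2000)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
accuracy = cross_val_score(clf, X_flat, y, cv=cv).mean()

# On random features this stays near chance (~1/39 for balanced classes;
# empirical chance levels can differ, e.g. the ~6% figure quoted above).
print(f"cross-validated accuracy: {accuracy:.3f}")
```

Replacing the flattened linear model with a recurrent network that reads the per-bin features sequentially would be the analogue of the RNN comparison reported in the abstract.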