David E Thompson, Stefanie Blain-Moraes, Jane E Huggins.
Abstract
A large number of incommensurable metrics are currently used to report the performance of brain-computer interfaces (BCIs) used for augmentative and alternative communication (AAC). The lack of standard metrics precludes the comparison of different BCI-based AAC systems, hindering rapid growth and development of this technology. This paper presents a review of the metrics that have been used to report performance of BCIs used for AAC from January 2005 to January 2012. We distinguish between Level 1 metrics used to report performance at the output of the BCI Control Module, which translates brain signals into logical control output, and Level 2 metrics at the Selection Enhancement Module, which translates logical control to semantic control. We recommend that: (1) the commensurate metrics Mutual Information or Information Transfer Rate (ITR) be used to report Level 1 BCI performance, as these metrics represent information throughput, which is of interest in BCIs for AAC; (2) the BCI-Utility metric be used to report Level 2 BCI performance, as it is capable of handling all current methods of improving BCI performance; (3) these metrics be supplemented by information specific to each unique BCI configuration; and (4) studies involving Selection Enhancement Modules report performance at both Level 1 and Level 2 in the BCI system. Following these recommendations will enable efficient comparison between both BCI Control and Selection Enhancement Modules, accelerating research and development of BCI-based AAC systems.
Year: 2013 PMID: 23680020 PMCID: PMC3662584 DOI: 10.1186/1475-925X-12-43
Source DB: PubMed Journal: Biomed Eng Online ISSN: 1475-925X Impact factor: 2.819
Figure 1. Architecture of a BCI-based AAC system comprising two modules: (1) a BCI Control Module that translates brain signals into logical control outputs and (2) a Selection Enhancement Module that translates logical control to semantic control. Performance of BCI-based AAC systems can be measured at three levels (labeled Level 1, Level 2, Level 3) within this architecture; each level of measurement is currently assessed by a variety of often incommensurable performance metrics.
Metrics used in the literature from January 2005 – January 2012 to report performance of communication-based BCIs
| Metric(s) reported | Number of articles | References |
| Accuracy | 38 | [ |
| Accuracy and information transfer rate (ITR) | 16 | [ |
| Information transfer rate (ITR) | 7 | [ |
| True and false positives | 1 | [ |
| Accuracy and written symbol rate (WSR) | 1 | [ |
| Accuracy and speed | 1 | [ |
| Accuracy and mutual information | 1 | [ |
| Accuracy and number of errors | 1 | [ |
| Accuracy and selections per minute | 1 | [ |
| Accuracy, bit rate, selections per minute, output characters per minute | 1 | [ |
| Characters per minute | 1 | [ |
| Accuracy, information transfer rate (ITR), NASA task load index, QUEST 2.0 | 1 | [ |
Comparison of common Level 1 BCI-based AAC performance metrics
| Accuracy/Error Rate | | ✓ | | ✓ | |
| Cohen’s Kappa | | ✓ | | ✓ | |
| Confusion Matrix | A matrix with intended (true) outputs as rows, actual outputs as columns, and the number of occurrences in the intersections. | | ✓ | ✓ | |
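| Mutual Information | $I(X;Y) = \sum_{x}\sum_{y} p(x,y)\log_2\frac{p(x,y)}{p(x)\,p(y)}$, or the formulation in [ | ✓ | ✓ | ✓ | | |
| Information Transfer Rate (ITR) | $\frac{1}{c}\left[\log_2 N + P\log_2 P + (1-P)\log_2\frac{1-P}{N-1}\right]$ | ✓ | ✓ | ✓ | |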
P: probability of correct selection; N: number of choices; p(x): marginal distribution of X; p(x,y): joint distribution of X and Y; c: time per selection.
Check marks indicate that the metric fulfills the evaluation criterion.
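As a rough illustration of the two recommended Level 1 metrics, the sketch below computes ITR and mutual information using the symbols defined in the table legend (P, N, c, p(x), p(x,y)). It assumes the widely used Wolpaw formulation of ITR and estimates mutual information from the counts in a confusion matrix; the function names are hypothetical and are not taken from the paper.

```python
import math


def wolpaw_bits_per_selection(p, n):
    """Bits per selection for accuracy p over n equally likely choices.

    Assumes the Wolpaw formulation: errors are spread uniformly over
    the n - 1 incorrect choices.
    """
    if n < 2 or not 0.0 <= p <= 1.0:
        raise ValueError("need n >= 2 and 0 <= p <= 1")
    bits = math.log2(n)
    if p > 0.0:
        bits += p * math.log2(p)
    if p < 1.0:
        bits += (1.0 - p) * math.log2((1.0 - p) / (n - 1))
    return bits


def itr_bits_per_minute(p, n, c_seconds):
    """ITR in bits/min, where c_seconds is the time per selection."""
    return wolpaw_bits_per_selection(p, n) * 60.0 / c_seconds


def mutual_information(confusion):
    """Mutual information in bits/selection from a confusion matrix.

    Rows are intended outputs, columns are actual outputs, and each
    entry is a count, matching the table's confusion-matrix description.
    """
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(col) for col in zip(*confusion)]
    mi = 0.0
    for i, row in enumerate(confusion):
        for j, count in enumerate(row):
            if count:  # zero cells contribute nothing to the sum
                p_xy = count / total
                # p(x,y) / (p(x) p(y)) simplifies to count * total / (row * col)
                ratio = count * total / (row_sums[i] * col_sums[j])
                mi += p_xy * math.log2(ratio)
    return mi
```

With these definitions, chance-level accuracy (P = 1/N) gives an ITR of zero. Including the time between selections in `c_seconds` lowers the reported ITR, so the timing convention should be stated alongside the value.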
Comparison of common Level 2 BCI-Based AAC performance metrics
| Written symbol rate (WSR) | | | | | ✓ | |
| Practical bit rate (PBR) | | | | ✓ | ✓ | |
| Extended confusion matrix (ECM) | Confusion matrix, as in Table | ✓ | | ✓ | ✓ | |
| EffSYS | ✓ | | ✓ | | ✓ | |
| EffSYS’ | | | ✓ | ✓ | ✓ | |
| Output characters per minute (OCM) | ✓ | ✓ | ✓ | | ✓ | |
| BCI-Utility metric | ✓ | ✓ | ✓ | ✓ | * | |
c: time per selection; N: number of choices; ITR: information transfer rate; P: probability of correct selection.
*: Not currently practical, but can be with future work.
Check marks indicate that the metric fulfills the evaluation criterion.
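The formula column of this table did not survive extraction, so the sketch below shows how three of the scalar Level 2 metrics are commonly written in the BCI literature, using the legend's symbols (c, N, ITR, P). These forms are assumptions recalled from the original publications of each metric, not reproductions of the table, and the function names are hypothetical. All three assume a speller in which every error must be undone by one additional correct selection.

```python
import math


def practical_bit_rate(itr, p):
    """PBR: ITR discounted for error correction; zero when p <= 0.5."""
    return itr * (2.0 * p - 1.0) if p > 0.5 else 0.0


def written_symbol_rate(bits_per_selection, n, c_minutes):
    """WSR in symbols/min from bits per selection and n choices.

    SR = B / log2(N) is the symbol rate per selection; the rate is
    zero when SR <= 0.5.
    """
    sr = bits_per_selection / math.log2(n)
    return (2.0 * sr - 1.0) / c_minutes if sr > 0.5 else 0.0


def bci_utility(p, n, c_minutes):
    """BCI-Utility in bits/min for n choices, one of which is backspace.

    Each correct selection conveys log2(n - 1) bits; each error costs
    one extra selection, giving an expected gain of (2p - 1) per
    selection. Zero when p <= 0.5.
    """
    if p <= 0.5:
        return 0.0
    return (2.0 * p - 1.0) * math.log2(n - 1) / c_minutes


def bci_utility_chars_per_minute(p, c_minutes):
    """BCI-Utility expressed as correct characters per minute."""
    return (2.0 * p - 1.0) / c_minutes if p > 0.5 else 0.0
```

For example, at P = 0.75 and c = 0.5 minutes per selection, `bci_utility_chars_per_minute` yields one correct character per minute: half of the selections are, on net, spent making and repairing errors.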
Figure 2. Comparison of performance of Level 2 metrics. a) Comparison of each scalar Level 2 metric on data from a P300 copy-spelling task with correction (75 sentences from 22 users), sorted by OCM. All metrics were converted into characters per minute (e.g., f(EffSYS) and f(EffSYS’) represent the metric multiplied by the character output rate). b) The ECM for the first observation presented in a), in a format required by checkerboard-style spellers. Note the sparsity of the matrix, even though data from fifteen minutes of BCI use are included. For practicality, the 74 ECMs corresponding to the other observations are not presented.
Figure 3. Example of an augmented Level 1 performance metric for a P300-speller BCI. Both the ITR and the accuracy are reported with respect to time, enabling comparison with other BCI Control Modules. Note that the ITR calculation includes the time between selections.