Kevin Q Shan (1), Evgueniy V Lubenov (1), Athanassios G Siapas (2)
(1) Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States; Division of Engineering and Applied Science, California Institute of Technology, Pasadena, United States.
(2) Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States; Division of Engineering and Applied Science, California Institute of Technology, Pasadena, United States. Electronic address: thanos@caltech.edu.
Abstract
BACKGROUND: Chronic extracellular recordings are a powerful tool for systems neuroscience, but spike sorting remains a challenge. A common approach is to fit a generative model, such as a mixture of Gaussians, to the observed spike data. Even if non-parametric methods are used for spike sorting, such generative models provide a quantitative measure of unit isolation quality, which is crucial for subsequent interpretation of the sorted spike trains.
NEW METHOD: We present a spike sorting strategy that models the data as a mixture of drifting t-distributions. This model captures two important features of chronic extracellular recordings (cluster drift over time and heavy tails in the distribution of spikes) and offers improved robustness to outliers.
RESULTS: We evaluate this model on several thousand hours of chronic tetrode recordings and show that it fits the empirical data substantially better than a mixture of Gaussians. We also provide a software implementation that can re-fit long datasets in a few seconds, enabling interactive clustering of chronic recordings.
COMPARISON WITH EXISTING METHODS: We identify three common failure modes of spike sorting methods that assume stationarity and evaluate their impact given the empirically observed cluster drift in chronic recordings. Using hybrid ground truth datasets, we also demonstrate that our model-based estimate of misclassification error is more accurate than previous unit isolation metrics.
CONCLUSIONS: The mixture of drifting t-distributions model enables efficient spike sorting of long datasets and provides an accurate measure of unit isolation quality over a wide range of conditions.
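The robustness claim for heavy-tailed models can be illustrated with a small sketch. This is not the authors' implementation and it does not model drift or mixtures: it only fits the location of a single multivariate t-distribution by EM, using the standard weights u_i = (nu + d) / (nu + delta_i^2), and compares the result with the plain sample mean (the Gaussian maximum-likelihood estimate). The function names, the fixed identity scale matrix, the choice nu = 5, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def t_weights(x, mu, cov, nu):
    """EM weights u_i = (nu + d) / (nu + delta_i^2) for a multivariate t.

    delta_i^2 is the squared Mahalanobis distance of point i from mu.
    Points far from the center receive weights close to zero.
    """
    d = x.shape[1]
    diff = x - mu
    delta2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    return (nu + d) / (nu + delta2)

def robust_mean(x, nu=5.0, n_iter=50):
    """EM estimate of the t-distribution location.

    Assumption for this sketch: nu and the scale matrix (identity)
    are held fixed, so only the location is updated.
    """
    mu = x.mean(axis=0)
    cov = np.eye(x.shape[1])
    for _ in range(n_iter):
        u = t_weights(x, mu, cov, nu)
        mu = (u[:, None] * x).sum(axis=0) / u.sum()
    return mu

# Synthetic 2-D "spike features": one cluster at the origin plus
# a small group of distant outliers (about 5% of the points).
rng = np.random.default_rng(0)
spikes = rng.normal(0.0, 1.0, size=(500, 2))
outliers = rng.normal(20.0, 1.0, size=(25, 2))
data = np.vstack([spikes, outliers])

gaussian_mean = data.mean(axis=0)  # pulled toward the outliers
t_mean = robust_mean(data)         # stays near the true center at (0, 0)

print("sample mean:", gaussian_mean)
print("t-weighted mean:", t_mean)
```

Because the weights shrink as the Mahalanobis distance grows, the outliers contribute very little to the t-based estimate, whereas they shift the sample mean noticeably.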