Luis Guerra, Laura M McGarry, Víctor Robles, Concha Bielza, Pedro Larrañaga, Rafael Yuste.
Abstract
In the study of neural circuits, it is essential to discern the different neuronal cell types that build the circuit. Traditionally, neuronal cell types have been classified using qualitative descriptors. More recently, several attempts have been made to classify neurons quantitatively, using unsupervised clustering methods. While useful, these algorithms do not take advantage of prior information known to the investigator, which could improve the classification task. For neocortical GABAergic interneurons, the problem of discriminating among different cell types is particularly difficult, and better methods are needed to perform objective classifications. Here we explore the use of supervised classification algorithms to classify neurons based on their morphological features, using a database of 128 pyramidal cells and 199 interneurons from mouse neocortex. To evaluate the performance of different algorithms we used, as a "benchmark," the task of automatically distinguishing between pyramidal cells and interneurons, defining "ground truth" by the presence or absence of an apical dendrite. We compared hierarchical clustering with a battery of different supervised classification algorithms, finding that supervised classifications outperformed hierarchical clustering. In addition, selecting subsets of distinguishing features enhanced the classification accuracy for both sets of algorithms. The analysis of selected variables indicates that dendritic features were more useful than somatic and axonal morphological variables for distinguishing pyramidal cells from interneurons. We conclude that supervised classification algorithms are better matched to the general problem of distinguishing neuronal cell types when some information on these cell groups (in our case, whether a cell is a pyramidal cell or an interneuron) is known a priori.
As a spin-off of this methodological study, we provide several methods to automatically distinguish neocortical pyramidal cells from interneurons based on their morphologies.
Year: 2011 PMID: 21154911 PMCID: PMC3058840 DOI: 10.1002/dneu.20809
Source DB: PubMed Journal: Dev Neurobiol ISSN: 1932-8451 Impact factor: 3.964
Figure 1. “Benchmark” task: distinguishing between GABAergic interneurons and pyramidal cells. Representative basket (A) and pyramidal (B) cells from mouse neocortex. Axonal arbor in blue and dendritic tree in red. Data examples obtained from http://www.columbia.edu/cu/biology/faculty/yuste/databases.html.
Figure 2. Graphical representation of a hierarchical clustering (dendrogram). Interneurons are labeled in red and pyramidal cells in blue. Note that although there are two major clusters, representing mostly interneurons and pyramidal cells respectively, there are many misplaced neurons in this type of unsupervised classification.
Figure 3. Examples of the models obtained from the supervised classification algorithms used in this study. A: Partial naïve Bayes model. For each class label and feature, mean and standard deviation (std. dev.) are shown. B: Partial classification tree model obtained from the C4.5 algorithm. C: Projection of the data in 2D. In k-nn, each instance is classified based on the class labels of its k nearest neighbors; this algorithm does not build a model. D: Partial multilayer perceptron model. A neural network is built with an input layer, an output layer, and several hidden layers. E: Graphical representation of the logistic function, the basis of the logistic regression model.
Results Obtained with Hierarchical Clustering Using Ward's Method
| Hierarchical Clustering | | Accuracy (%) | # |
|---|---|---|---|
| No FSS | | 59.33 | 65 |
| PCA | PC | 59.02 | 6 |
| | Original features | 66.77 | 10 |
| Filter | Forward | 77.68 | 10 |
| | Backward | 71.25 | 17 |
| | Genetic | 79.82 | 16 |
PC uses the first six principal components, whereas “Original features” uses the original features with correlation greater than 0.7 with the first six principal components. The number of features used (#) is also indicated.
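Ward's method, as used for the hierarchical clustering above, merges at each step the pair of clusters whose fusion least increases the total within-cluster variance. A minimal sketch on made-up 2-D "morphology" vectors (not the paper's dataset or implementation):

```python
# Agglomerative clustering with Ward's linkage on toy data (hypothetical
# points, not the paper's 65 morphological features). Ward's criterion:
#   delta(A, B) = |A|*|B| / (|A| + |B|) * ||centroid(A) - centroid(B)||^2

def centroid(cluster):
    n, dim = len(cluster), len(cluster[0])
    return [sum(p[d] for p in cluster) / n for d in range(dim)]

def ward_delta(a, b):
    ca, cb = centroid(a), centroid(b)
    dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist2

def ward_clustering(points, n_clusters):
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = ward_delta(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge cheapest pair
        del clusters[j]
    return clusters

# Two well-separated toy groups; Ward should recover them.
pts = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
groups = ward_clustering(pts, 2)
```

Cutting the merge sequence at two clusters corresponds to reading the dendrogram of Figure 2 at its top split.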
Results Obtained Using Naïve Bayes (NB)
| NB | | Accuracy (%) | # |
|---|---|---|---|
| No FSS | | 80.73 ± 10.44 | 65 |
| Filter | Forward | 79.82 ± 9.86 | 10 |
| | Backward | 79.51 ± 9.74 | 17 |
| | Genetic | 80.43 ± 7.07 | 16 |
| Wrapper | Forward | | |
| | Backward | 83.18 ± 9.12 | 50 |
| | Genetic | 83.49 ± 8.55 | 23 |
Values correspond to the accuracy of each model, i.e., the mean ± standard deviation (percentage) over the 10 values estimated using 10-fold cross-validation. The number of features used (#) is indicated as before. Bold face indicates models with no statistically significant difference from the highest-accuracy supervised model.
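A Gaussian naïve Bayes model stores, per class and per feature, a mean and standard deviation (cf. Figure 3A), and the tables report mean ± standard deviation of accuracy across cross-validation folds. A self-contained sketch on toy, clearly separable data (hypothetical features and units, not the paper's data; 5 folds here rather than 10 to fit the toy sample size):

```python
# Gaussian naive Bayes with k-fold cross-validation, reporting
# mean +/- std accuracy as in the result tables (illustrative only).
import math, statistics

def fit_nb(X, y):
    """For each class and feature, store (mean, std)."""
    model = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        cols = list(zip(*rows))
        model[label] = [(statistics.mean(c), statistics.pstdev(c) or 1e-9)
                        for c in cols]
    return model

def predict_nb(model, x):
    def log_lik(params):  # log of the product of Gaussian densities
        return sum(-math.log(sd) - (xi - mu) ** 2 / (2 * sd ** 2)
                   for xi, (mu, sd) in zip(x, params))
    return max(model, key=lambda lab: log_lik(model[lab]))

def cross_validate(X, y, k=5):
    folds = [list(range(i, len(X), k)) for i in range(k)]
    accs = []
    for fold in folds:
        train = [i for i in range(len(X)) if i not in fold]
        model = fit_nb([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict_nb(model, X[i]) == y[i] for i in fold)
        accs.append(100.0 * hits / len(fold))
    return statistics.mean(accs), statistics.stdev(accs)

# Toy features, e.g. (dendritic measure, somatic measure) in made-up units.
X = [(1.0, 2.0), (1.1, 1.9), (0.9, 2.1), (1.2, 2.2), (1.0, 1.8),
     (5.0, 6.0), (5.2, 5.9), (4.8, 6.1), (5.1, 6.2), (4.9, 5.8)]
y = ["interneuron"] * 5 + ["pyramidal"] * 5
mean_acc, std_acc = cross_validate(X, y, k=5)
```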
Results Obtained Using the Decision Tree C4.5
| C4.5 | | Accuracy (%) | # |
|---|---|---|---|
| No FSS | | 84.40 ± 3.84 | 65 |
| Filter | Forward | 82.26 ± 7.17 | 9 |
| | Backward | | |
| | Genetic | 81.65 ± 7.24 | 6 |
| Wrapper | Forward | 86.85 ± 5.29 | 7 |
| | Backward | | |
| | Genetic | | |
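C4.5 grows a classification tree (cf. Figure 3B) by choosing at each node the split with the highest gain ratio: information gain normalized by the split's own entropy. A sketch of that criterion alone, on a made-up binary feature (the paper used the full C4.5 algorithm, not this fragment):

```python
# Gain-ratio split criterion as in C4.5 (illustrative, toy data).
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(values, labels):
    """Gain ratio of a categorical feature with respect to class labels."""
    n = len(labels)
    base = entropy(labels)
    split_info, cond = 0.0, 0.0
    for v in set(values):
        subset = [lab for val, lab in zip(values, labels) if val == v]
        p = len(subset) / n
        cond += p * entropy(subset)          # conditional entropy after split
        split_info -= p * math.log2(p)       # entropy of the split itself
    return (base - cond) / split_info if split_info else 0.0

# Hypothetical binary feature "has apical dendrite" vs. class label:
feature = ["yes", "yes", "yes", "no", "no", "no"]
labels = ["pyramidal", "pyramidal", "pyramidal",
          "interneuron", "interneuron", "interneuron"]
print(gain_ratio(feature, labels))  # perfect split -> 1.0
```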
Results Obtained Using K-nn (with K = 5)
| 5-nn | | Accuracy (%) | # |
|---|---|---|---|
| No FSS | | 83.18 ± 7.15 | 65 |
| Filter | Forward | 83.79 ± 9.55 | 10 |
| | Backward | 84.71 ± 6.03 | 17 |
| | Genetic | 85.01 ± 5.60 | 16 |
| Wrapper | Forward | | |
| | Backward | 86.85 ± 6.26 | 51 |
| | Genetic | | |
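As noted for Figure 3C, k-nn builds no model: each instance takes the majority class label of its k nearest training instances. A minimal sketch with k = 5, on toy vectors rather than the paper's 65 morphological variables:

```python
# k-nearest-neighbour classification (k = 5, as in the table above);
# toy feature vectors, hypothetical class labels.
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=5):
    # Sort training points by Euclidean distance to the query point.
    dists = sorted((math.dist(x, xt), yt) for xt, yt in zip(train_X, train_y))
    # Majority vote among the k closest labels.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

train_X = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (0.15, 0.1), (0.05, 0.2),
           (5.0, 5.1), (5.1, 4.9), (4.9, 5.0), (5.2, 5.2), (5.0, 4.8)]
train_y = ["interneuron"] * 5 + ["pyramidal"] * 5
print(knn_predict(train_X, train_y, (0.1, 0.1)))  # -> interneuron
```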
Results Obtained Using Multilayer Perceptron (MLP)
| MLP | | Accuracy (%) | # |
|---|---|---|---|
| No FSS | | | |
| Filter | Forward | 82.57 ± 9.54 | 10 |
| | Backward | 87.77 ± 6.36 | 17 |
| | Genetic | 82.26 ± 9.17 | 16 |
| Wrapper | Forward | | |
| | Backward | | |
| | Genetic | 87.46 ± 6.26 | 37 |
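A multilayer perceptron (cf. Figure 3D) passes the feature vector through one or more hidden layers of weighted sums with a nonlinearity, producing a class probability at the output. A forward-pass sketch with made-up weights (not trained on the paper's data):

```python
# Forward pass through a one-hidden-layer perceptron: 2 inputs ->
# 2 hidden units -> 1 output (probability of "pyramidal"). Weights
# are illustrative, not learned.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    # Hidden layer: weighted sum of inputs plus bias, then sigmoid.
    hidden = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
              for w, b in zip(w_hidden, b_hidden)]
    # Output layer: weighted sum of hidden activations, then sigmoid.
    return sigmoid(sum(wo * h for wo, h in zip(w_out, hidden)) + b_out)

w_hidden = [[1.0, -1.0], [-0.5, 0.5]]  # hypothetical weights
b_hidden = [0.0, 0.1]
w_out, b_out = [2.0, -2.0], -0.1
p = mlp_forward([0.8, 0.2], w_hidden, b_hidden, w_out, b_out)
```

In practice the weights would be fitted by backpropagation, which is omitted here for brevity.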
Results Obtained Using Logistic Regression (LR)
| LR | | Accuracy (%) | # |
|---|---|---|---|
| No FSS | | 82.26 ± 7.36 | 65 |
| Filter | Forward | 82.26 ± 9.82 | 10 |
| | Backward | 85.63 ± 8.56 | 17 |
| | Genetic | 83.49 ± 9.45 | 16 |
| Wrapper | Forward | | |
| | Backward | 84.71 ± 7.54 | 59 |
| | Genetic | | |
Details as before.
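Logistic regression maps a weighted sum of the features through the logistic function of Figure 3E to obtain a class probability. A sketch fitting a one-feature model by gradient descent on toy, separable data (illustrative; not the paper's fitting procedure or data):

```python
# Logistic regression on a toy 1-D feature, fitted by batch gradient
# descent on the log-loss. Labels are 0/1 (e.g. interneuron/pyramidal).
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of the log-loss
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

# Hypothetical 1-D feature values; class 0 for small, class 1 for large.
xs = [0.0, 0.5, 1.0, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
```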
Results of Wilcoxon Signed-Rank Test
| FSS | | Algorithm | p-value |
|---|---|---|---|
| No FSS | | MLP | 0.091 |
| Filter | Backward | C4.5 | 0.095 |
| Wrapper | Forward | NB | 0.095 |
| | | 5-nn | 0.220 |
| | | MLP | 0.063 |
| | | LR | 0.053 |
| | Backward | C4.5 | 0.077 |
| | | MLP | 0.115 |
| | Genetic | C4.5 | 0.052 |
| | | 5-nn | 0.052 |
| | | LR | – |
Models that do not reject the null hypothesis (p-value greater than 0.05), i.e., with no statistically significant difference from the highest-accuracy model, are listed.
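The Wilcoxon signed-rank test compares two models through the signed ranks of their paired per-fold accuracy differences. A sketch computing the W statistic on made-up fold accuracies (not the paper's fold results); the p-values in the table would then come from exact tables or a normal approximation:

```python
# Wilcoxon signed-rank statistic W on paired per-fold accuracies
# (hypothetical numbers, for illustration only).

def wilcoxon_w(a, b):
    diffs = [x - y for x, y in zip(a, b) if x != y]  # drop zero differences
    ranked = sorted(diffs, key=abs)
    # Assign average ranks to tied absolute differences.
    ranks = {}
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and abs(ranked[j]) == abs(ranked[i]):
            j += 1
        avg = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        ranks[abs(ranked[i])] = avg
        i = j
    w_plus = sum(ranks[abs(d)] for d in diffs if d > 0)
    w_minus = sum(ranks[abs(d)] for d in diffs if d < 0)
    return min(w_plus, w_minus)

acc_a = [84.0, 86.0, 82.0, 88.0, 85.0]  # hypothetical model A folds
acc_b = [80.0, 83.0, 81.0, 84.0, 85.0]  # hypothetical model B folds
print(wilcoxon_w(acc_a, acc_b))  # -> 0 (all non-zero differences favour A)
```

A small W (relative to the number of non-zero differences) leads to rejecting the null hypothesis of equal performance.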