ZhanDong Li, Deling Wang, HuiPing Liao, ShiQi Zhang, Wei Guo, Lei Chen, Lin Lu, Tao Huang, Yu-Dong Cai.
Abstract
In mammals, the cerebellum plays an important role in movement control. Cellular research reveals that the cerebellum contains a variety of cell subtypes, including Golgi, granule, interneuron, and unipolar brush cells. The functional characteristics of cerebellar cells differ considerably among mammalian species, potentially reflecting the development and evolution of the nervous system. In this study, we aimed to identify the transcriptional differences between human and mouse cerebellum in four cerebellar cell subtypes by using single-cell sequencing data and machine learning methods. Single-cell sequencing profiles of 321,387 cells were used, covering four cell types: Golgi (5,048, 1.57%), granule (250,307, 77.88%), interneuron (60,526, 18.83%), and unipolar brush (5,506, 1.72%) cells. Our results showed that, with gene expression profiles as features, the optimal classification models achieved very high or even perfect performance for Golgi, granule, interneuron, and unipolar brush cells, suggesting a remarkable difference between the genomic profiles of human and mouse. Furthermore, a group of related genes and rules contributing to the classification was identified, which might help deepen the understanding of cerebellar cell heterogeneity and evolution.
Keywords: cerebellum; gene expression pattern; golgi cells; granule cells; interneuron cells; machine learning method; unipolar brush cells
Year: 2022 PMID: 35309141 PMCID: PMC8930846 DOI: 10.3389/fgene.2022.857851
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
FIGURE 1. Overview of the design. Cells of four types from mouse and human cerebellum constitute four datasets, in which each cell is represented by its single-cell expression profile. The profiles are analyzed by the Boruta and minimum redundancy maximum relevance (mRMR) feature selection methods in sequence, yielding one mRMR feature list per dataset. Each list is then used in incremental feature selection (IFS), which incorporates classification algorithms, the synthetic minority oversampling technique (SMOTE), and ten-fold cross-validation, to extract significant individual genes and combined-gene rules.
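The ten-fold cross-validation step named in the overview can be illustrated with a minimal sketch. The function name `stratified_folds`, the toy labels, and the choice to stratify by species are assumptions made for illustration; the source does not say how the folds were built.

```python
import random
from collections import defaultdict

def stratified_folds(labels, n_folds=10, seed=0):
    """Split sample indices into n_folds folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across folds.
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds

# Toy labels (not the study data): 40 mouse cells and 10 human cells.
labels = ["mouse"] * 40 + ["human"] * 10
folds = stratified_folds(labels)
for test_fold in folds:
    held_out = set(test_fold)
    train = [i for i in range(len(labels)) if i not in held_out]
    # A classifier would be fitted on `train` and scored on `test_fold` here.
```

Each fold serves once as the held-out test set while the remaining nine folds are used for training.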
Breakdown of 4 cell sample datasets.
| Cell type | Number of mouse cells | Number of human cells | Total number of cells | Number of gene features |
|---|---|---|---|---|
| Golgi cell | 3,989 | 1,059 | 5,048 | 14,512 |
| Granule cell | 119,972 | 130,335 | 250,307 | 23,422 |
| Interneuron cell | 45,555 | 14,971 | 60,526 | 23,203 |
| Unipolar brush cell | 1,613 | 3,893 | 5,506 | 13,456 |
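The table shows that the two species are unevenly represented in most datasets (for example, 3,989 mouse versus 1,059 human Golgi cells), which is the situation SMOTE is designed for. The following is a minimal sketch of the interpolation idea behind SMOTE, not the implementation used in the study; the function name `smote`, the neighbour count `k`, and the toy two-gene vectors are assumptions.

```python
import math
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        neighbours = sorted(
            (s for s in minority if s is not base),
            key=lambda s: math.dist(base, s),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic

# Toy two-gene expression vectors standing in for the minority class.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_cells = smote(minority, n_new=6, k=2)
```

Because every synthetic sample lies on a segment between two real minority samples, the new points stay inside the region already occupied by that class. If the Golgi dataset were oversampled to parity, 2,930 synthetic human cells would be needed (3,989 minus 1,059); whether the study balanced the classes exactly is not stated in this record.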
FIGURE 2. IFS curves of decision tree and random forest on the datasets of four cerebellum cell types. (A) Curves on the dataset for Golgi cells. (B) Curves on the dataset for granule cells. (C) Curves on the dataset for interneuron cells. (D) Curves on the dataset for unipolar brush cells.
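An IFS curve of this kind plots classifier performance against the number of top-ranked features used. A minimal sketch of the loop follows; `incremental_feature_selection`, the `evaluate` callback, the toy gene names, and the tie-breaking rule (prefer the smaller subset) are all assumptions for illustration, not details taken from the study.

```python
def incremental_feature_selection(ranked_features, evaluate, step=1):
    """Score the top-k features for k = step, 2*step, ... and keep the best k.

    `evaluate` is a caller-supplied function (hypothetical here) that trains
    and cross-validates a classifier on the given features and returns a score.
    """
    curve = []
    for k in range(step, len(ranked_features) + 1, step):
        subset = ranked_features[:k]
        curve.append((k, evaluate(subset)))
    # Highest score wins; ties go to the smaller feature subset.
    best_k, best_score = max(curve, key=lambda p: (p[1], -p[0]))
    return ranked_features[:best_k], best_score, curve

# Toy stand-in for "train + cross-validate": the score saturates once both
# informative genes are included.
def toy_evaluate(subset):
    informative = {"geneA", "geneC"}
    return round(0.5 + 0.25 * len(informative & set(subset)), 2)

ranked = ["geneA", "geneB", "geneC", "geneD"]
selected, score, curve = incremental_feature_selection(ranked, toy_evaluate)
```

The list `curve` holds the (number of features, score) pairs that such a figure would plot, and `selected` is the feature subset at the chosen point of the curve.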
Performance of optimal classifiers on four datasets using different classification algorithms.
| Cell type | Classification algorithm | Number of features | MCC |
|---|---|---|---|
| Golgi cell | Decision tree | 34 | 0.99642 |
| Golgi cell | Random forest | 518 | 1.00000 |
| Granule cell | Decision tree | 5 | 1.00000 |
| Granule cell | Random forest | 2 | 1.00000 |
| Interneuron cell | Decision tree | 1 | 0.99996 |
| Interneuron cell | Random forest | 100 | 1.00000 |
| Unipolar brush cell | Decision tree | 28 | 0.99606 |
| Unipolar brush cell | Random forest | 28 | 1.00000 |
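The MCC column reports the Matthews correlation coefficient, which for a binary problem is computed from the four confusion-matrix counts. A minimal sketch is given below; the counts in the usage lines are illustrative only and are not taken from the study.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom

# Illustrative counts, not study results.
perfect = mcc(tp=50, tn=50, fp=0, fn=0)    # every cell classified correctly
imperfect = mcc(tp=45, tn=40, fp=10, fn=5)
```

MCC ranges from -1 to 1, and a value of 1 requires zero false positives and zero false negatives, which is why it is a demanding measure on imbalanced datasets such as these.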
FIGURE 3. Performance measurements of the optimal decision tree and random forest classifiers on the datasets of four cerebellum cell types. (A) Measurements on the dataset for Golgi cells. (B) Measurements on the dataset for granule cells. (C) Measurements on the dataset for interneuron cells. (D) Measurements on the dataset for unipolar brush cells.
FIGURE 4. Box plots of MCC values yielded by classifiers with randomly selected gene features on the datasets of four cerebellum cell types. (A) Box plots on the dataset for Golgi cells. (B) Box plots on the dataset for granule cells. (C) Box plots on the dataset for interneuron cells. (D) Box plots on the dataset for unipolar brush cells.
Feature list of important genes based on mRMR ranking.
| Cell type | mRMR rank | Gene |
|---|---|---|
| Golgi cell | 1 | Lingo2 |
| Golgi cell | 3 | Ube3a |
| Golgi cell | 5 | Nlgn1 |
| Granule cell | 1 | Ralyl |
| Granule cell | 3 | Fgf14 |
| Interneuron cell | 1 | Malat1 |
| Interneuron cell | 2 | Ank2 |
| Interneuron cell | 3 | Nrxn3 |
| Unipolar brush cell | 1 | Pde1a |
| Unipolar brush cell | 7 | Rgs6 |
Classification rules generated by the decision tree (DT).
| Cell type | Index | Rule | Label |
|---|---|---|---|
| Golgi cell | 1 | (Lingo2 > 2,924.455) and (Lrp1b ≤ 1,185.155) and (Upk3b ≤ 180.656) | Negative |
| Golgi cell | 2 | (Lingo2 ≤ 2,924.455) and (Pla2g3 ≤ 108.329) and (Nrxn1 ≤ 5,364.268) and (Ube3a ≤ 2,781.385) | Positive |
| Golgi cell | 3 | (Lingo2 ≤ 2,924.455) and (Pla2g3 > 108.329) | Negative |
| Golgi cell | 4 | (Lingo2 > 2,924.455) and (Thsd7b > 1,185.155) | Positive |
| Golgi cell | 5 | (Lingo2 ≤ 2,924.455) and (Pla2g3 ≤ 108.329) and (Nrxn1 ≤ 5,364.268) and (Ube3a > 2,781.385) and (Fstl5 ≤ 539.288) | Negative |
| Golgi cell | 6 | (Lingo2 ≤ 2,924.455) and (Pla2g3 ≤ 108.329) and (Nrxn1 > 5,364.268) | Negative |
| Golgi cell | 7 | (Lingo2 ≤ 2,924.455) and (Pla2g3 ≤ 108.329) and (Nrxn1 ≤ 5,364.268) and (Ube3a > 2,781.385) and (Fstl5 > 539.288) | Positive |
| Golgi cell | 8 | (Lingo2 > 2,924.455) and (Thsd7b ≤ 1,185.155) and (Upk3b > 180.656) | Positive |
| Granule cell | 1 | Malat1 ≤ 3,654.637 | Positive |
| Granule cell | 2 | Malat1 > 3,654.637 | Negative |
| Interneuron cell | 1 | Malat1 > 945.180 | Negative |
| Interneuron cell | 2 | Malat1 ≤ 945.180 | Positive |
| Unipolar brush cell | 1 | (Ccdc85a ≤ 152.189) and (Rgs6 ≤ 4,444.267) and (Kcnd2 ≤ 5,474.120) and (Cdh12 ≤ 1,975.607) and (Fgf14 ≤ 7,328.168) | Positive |
| Unipolar brush cell | 2 | (Ccdc85a > 152.189) and (Hsp90aa1 ≤ 1,532.738) | Negative |
| Unipolar brush cell | 3 | (Ccdc85a ≤ 152.189) and (Rgs6 > 4,444.267) and (Aff3 ≤ 1,376.276) | Negative |
| Unipolar brush cell | 4 | (Ccdc85a ≤ 152.189) and (Rgs6 ≤ 4,444.267) and (Kcnd2 > 5,474.120) and (Cblb ≤ 284.311) | Negative |
| Unipolar brush cell | 5 | (Ccdc85a ≤ 152.189) and (Rgs6 ≤ 4,444.267) and (Kcnd2 ≤ 5,474.120) and (Cdh12 ≤ 1,975.607) and (Fgf14 > 7,328.168) and (Kcnd2 ≤ 2,874.838) | Positive |
| Unipolar brush cell | 6 | (Ccdc85a > 152.189) and (Hsp90aa1 > 1,532.738) | Positive |
| Unipolar brush cell | 7 | (Ccdc85a ≤ 152.189) and (Rgs6 > 4,444.267) and (Aff3 > 1,376.276) | Positive |
| Unipolar brush cell | 8 | (Ccdc85a ≤ 152.189) and (Rgs6 ≤ 4,444.267) and (Kcnd2 ≤ 5,474.120) and (Cdh12 ≤ 1,975.607) and (Fgf14 > 7,328.168) and (Kcnd2 > 2,874.838) | Negative |
| Unipolar brush cell | 9 | (Ccdc85a ≤ 152.189) and (Rgs6 ≤ 4,444.267) and (Kcnd2 > 5,474.120) and (Cblb > 284.311) | Positive |
| Unipolar brush cell | 10 | (Ccdc85a ≤ 152.189) and (Rgs6 ≤ 4,444.267) and (Kcnd2 ≤ 5,474.120) and (Cdh12 > 1,975.607) and (Pde1a > 2,364.066) | Positive |
| Unipolar brush cell | 11 | (Ccdc85a ≤ 152.189) and (Rgs6 ≤ 4,444.267) and (Kcnd2 ≤ 5,474.120) and (Cdh12 > 1,975.607) and (Pde1a ≤ 2,364.066) | Negative |
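Rules of this form can be applied mechanically to an expression profile. The sketch below encodes the two interneuron rules and two of the Golgi rules (3 and 4) from the table, with thresholds rounded to three decimals. The data layout, the `classify` helper, and the example expression values are assumptions for illustration; this record does not state which species the Positive and Negative labels denote, so the labels are returned as printed.

```python
import operator

# Each rule is (conditions, label); a condition is (gene, comparator, threshold).
LE, GT = operator.le, operator.gt

INTERNEURON_RULES = [
    ([("Malat1", GT, 945.180)], "Negative"),  # rule 1
    ([("Malat1", LE, 945.180)], "Positive"),  # rule 2
]

GOLGI_RULES_SUBSET = [
    ([("Lingo2", LE, 2924.455), ("Pla2g3", GT, 108.329)], "Negative"),   # rule 3
    ([("Lingo2", GT, 2924.455), ("Thsd7b", GT, 1185.155)], "Positive"),  # rule 4
]

def classify(profile, rules):
    """Return the label of the first rule whose conditions all hold, else None."""
    for conditions, label in rules:
        if all(cmp(profile[gene], thr) for gene, cmp, thr in conditions):
            return label
    return None

# Hypothetical expression values, for illustration only.
cell = {"Malat1": 500.0, "Lingo2": 3000.0, "Thsd7b": 1500.0, "Pla2g3": 50.0}
interneuron_label = classify(cell, INTERNEURON_RULES)
golgi_label = classify(cell, GOLGI_RULES_SUBSET)
```

Because the rules come from the leaves of a single decision tree, a complete rule set is mutually exclusive and exhaustive; `classify` can return `None` here only because `GOLGI_RULES_SUBSET` is deliberately partial.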