Dongfang Xu, Rong Chen
Abstract
In neural decoding, a behavioral variable is often generated by manual annotation, and the annotated labels can contain extensive label noise, leading to poor model generalizability. Tackling the label noise problem in neural decoding can improve model generalizability and robustness. We use a deep neural network-based sample reweighting method to tackle this problem. The proposed method reweights training samples by using a small, clean validation dataset to guide learning. We evaluated the sample reweighting method on simulated neural activity data and on calcium imaging data from the anterior lateral motor cortex. For the simulated data, the proposed method accurately predicted the behavioral variable even when 36 percent of the samples in the training dataset were mislabeled. For the anterior lateral motor cortex study, the proposed method predicted trial types with an F1 score of around 0.85 even when 48 percent of the training samples were mislabeled.
Keywords: anterior lateral motor cortex; deep neural networks; neural decoding; noisy label; sample reweighting method
Year: 2022 PMID: 35874318 PMCID: PMC9296819 DOI: 10.3389/fncom.2022.913617
Source DB: PubMed Journal: Front Comput Neurosci ISSN: 1662-5188 Impact factor: 3.387
Sample Reweighting Algorithm

Input: initial model parameters θ0, noisy training dataset Ω_train, clean validation dataset Ω_val, learning rate α
Output: trained model parameters θ

1: for each training iteration t do
2:   Sample a mini-batch {(x_i, y_i)}, i = 1…n, from Ω_train
3:   Sample a mini-batch {(x_j, y_j)}, j = 1…m, from Ω_val
4:   Train a model and predict: ŷ_i = f(x_i; θ_t)
5:   Calculate loss with per-sample perturbations ε_i: ℓ(ε) = Σ_i ε_i L(ŷ_i, y_i)
6:   Update model parameters: θ̂_t(ε) = θ_t − α ∇_θ ℓ(ε)
7:   Predict samples in the validation dataset: ŷ_j = f(x_j; θ̂_t(ε))
8:   Calculate validation loss: ℓ_v = (1/m) Σ_j L(ŷ_j, y_j)
9:   Rectify weighting: ω̃_i = max(−∂ℓ_v/∂ε_i |_{ε=0}, 0)
10:  Normalize weighting: ω_i = ω̃_i / (Σ_k ω̃_k + δ), with a small δ to avoid division by zero
11:  Calculate weighted training loss: ℓ_ω = Σ_i ω_i L(ŷ_i, y_i)
12:  Update model parameters: θ_{t+1} = θ_t − α ∇_θ ℓ_ω
13: end for
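The reweighting step can be sketched in closed form for a simple model. The snippet below, a minimal sketch only, assumes a logistic-regression model so that the per-sample training gradients and the meta-gradient in step 9 reduce to a dot product between each sample's gradient and the validation-loss gradient; `reweight_step` and its arguments are illustrative names, not from the paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reweight_step(theta, X, y, Xv, yv, lr=0.1):
    """One validation-guided reweighting step for logistic regression.

    X, y   -- training mini-batch with possibly noisy labels
    Xv, yv -- small, clean validation mini-batch
    """
    # Per-sample gradients of the training loss at the current theta.
    p = sigmoid(X @ theta)
    g_train = (p - y)[:, None] * X                    # shape (n, d)
    # Gradient of the clean validation loss at the current theta.
    pv = sigmoid(Xv @ theta)
    g_val = ((pv - yv)[:, None] * Xv).mean(axis=0)    # shape (d,)
    # Step 9: a sample's weight is the alignment of its gradient with the
    # validation gradient; negative alignments (harmful samples) are clipped.
    w = np.maximum(g_train @ g_val, 0.0)
    # Step 10: normalize so the weights sum to one (guard against all-zero).
    w = w / (w.sum() + 1e-12)
    # Steps 11-12: weighted gradient step on the training mini-batch.
    theta_new = theta - lr * (w[:, None] * g_train).sum(axis=0)
    return theta_new, w
```

A mislabeled training sample whose gradient opposes the clean validation gradient receives a weight of zero, which is the behavior the boxplots in Figures 2 and 7 illustrate for the full method.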
The training dataset of the simulated data study.

| Dataset | Sample size | Class 1 samples | Class 2 samples | Label flipping rate (%) | Noise level (%) |
|---|---|---|---|---|---|
| 1 | 600 | 464 | 136 | 100 | 36.3 |
| 2 | 600 | 460 | 140 | 80 | 29.6 |
| 3 | 600 | 456 | 144 | 60 | 23.0 |
| 4 | 600 | 460 | 140 | 40 | 13.0 |
| 5 | 600 | 461 | 139 | 20 | 7.3 |
Figure 1. The model's performance on the simulation datasets at different label noise levels. (A) F1 scores of baselines 1 and 2 and the sample reweighting method. (B) Balanced accuracies of baselines 1 and 2 and the sample reweighting method.
Figure 2. The boxplots of weights for the correctly-labeled and mislabeled samples in the simulation training dataset with noisy labels (the noise level is 36.3%). **** denotes that weights of the correctly-labeled samples were significantly different from those of mislabeled samples (p < 0.0001).
Figure 3. The effects of the sample size of the validation data. (A) shows the averages (gray bars) and standard deviations (black lines) of the F1 score. (B) shows the averages (white bars) and standard deviations (black lines) of the balanced accuracy.
Figure 4. The effects of the mixed training datasets. The pink squares and black diamonds denote the balanced accuracy and F1 score, respectively.
Figure 5. The ALM data. The upper panel is for trial type “left” and the bottom panel is for trial type “right”.
The ALM study.

| Dataset | Sample size | Left trials | Right trials | Label flipping rate | Noise level |
|---|---|---|---|---|---|
| Training 1 | 853 | 446 | 407 | 100% | 52.3% |
| Training 2 | 853 | 446 | 407 | 80% | 41.3% |
| Training 3 | 853 | 446 | 407 | 60% | 31.0% |
| Training 4 | 853 | 446 | 407 | 40% | 21.1% |
| Training 5 | 853 | 446 | 407 | 20% | 9.9% |
| Validation | 66 | 33 | 33 | 0 (clean) | 0 |
| Test | 103 | 53 | 50 | 0 (clean) | 0 |
Figure 6. The test performances on the ALM data. (A) F1 scores of baselines 1 and 2 and the sample reweighting method. (B) Balanced accuracies of baselines 1 and 2 and the sample reweighting method.
Figure 7. The boxplots of weights for the correctly-labeled and mislabeled samples in the ALM training dataset (the noise level is 52.3%). **** denotes that weights of the correctly-labeled samples were significantly different from those of mislabeled samples (p < 0.0001).