Adaptive Checkpoint Adjoint Method for Gradient Estimation in Neural ODE.

Authors: Juntang Zhuang¹; Nicha Dvornek¹,²; Xiaoxiao Li¹; Sekhar Tatikonda³; Xenophon Papademetris¹,²,⁴; James Duncan¹,²,⁴

Abstract

Neural ordinary differential equations (NODEs) have recently attracted increasing attention; however, their empirical performance on benchmark tasks (e.g., image classification) is significantly inferior to that of discrete-layer models. We demonstrate that one explanation for this poorer performance is the inaccuracy of existing gradient estimation methods: the adjoint method has numerical errors in reverse-mode integration, while the naive method back-propagates directly through ODE solvers but suffers from a redundantly deep computation graph when searching for the optimal stepsize. We propose the Adaptive Checkpoint Adjoint (ACA) method. In automatic differentiation, ACA applies a trajectory checkpoint strategy that records the forward-mode trajectory and reuses it as the reverse-mode trajectory to guarantee accuracy; ACA deletes redundant components to obtain shallow computation graphs; and ACA supports adaptive solvers. On image classification tasks, compared with the adjoint and naive methods, ACA achieves half the error rate in half the training time; a NODE trained with ACA outperforms ResNet in both accuracy and test-retest reliability. On time-series modeling, ACA outperforms competing methods. Finally, in an example of the three-body problem, we show that a NODE with ACA can incorporate physical knowledge to achieve better accuracy. We provide a PyTorch implementation of ACA: https://github.com/juntang-zhuang/torch-ACA.
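
The checkpoint idea in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' torch-ACA code: the scalar ODE dz/dt = a·z, the Heun-Euler embedded pair, the step-size controller (halve on reject, grow by 1.5 on accept), and every function name below are assumptions chosen for illustration. The forward pass keeps only the accepted steps as checkpoints (state and step size) and discards the rejected step-size trials. The backward pass then replays each accepted step in reverse with hand-written local derivatives, treating the step sizes as constants.

```python
import math

# Hypothetical toy dynamics dz/dt = f(z, a) = a * z and its partial derivatives.
def f(z, a):
    return a * z

def f_z(z, a):
    return a

def f_a(z, a):
    return z

def heun_step(z, a, h):
    """One Heun step plus an embedded Euler error estimate."""
    k1 = f(z, a)
    k2 = f(z + h * k1, a)
    z_next = z + 0.5 * h * (k1 + k2)
    err = abs(0.5 * h * (k2 - k1))  # |Heun result - Euler result|
    return z_next, err

def forward(z0, a, t0, t1, h0=0.1, tol=1e-6):
    """Adaptive solve that records only accepted steps as checkpoints."""
    t, z, h = t0, z0, h0
    checkpoints = []  # (state before the step, accepted step size)
    while t1 - t > 1e-12:
        h = min(h, t1 - t)
        z_next, err = heun_step(z, a, h)
        if err <= tol:
            checkpoints.append((z, h))  # keep the accepted step
            t, z = t + h, z_next
            h *= 1.5
        else:
            h *= 0.5  # rejected trial: nothing is stored for the backward pass
    return z, checkpoints

def backward(checkpoints, a, grad_out=1.0):
    """Reverse sweep over the checkpoints; step sizes are treated as constants."""
    lam = grad_out  # dL/dz at the end of the current step
    grad_a = 0.0
    for z, h in reversed(checkpoints):
        k1 = f(z, a)
        u = z + h * k1
        dk1_dz, dk1_da = f_z(z, a), f_a(z, a)
        dk2_dz = f_z(u, a) * (1.0 + h * dk1_dz)
        dk2_da = f_z(u, a) * h * dk1_da + f_a(u, a)
        dstep_dz = 1.0 + 0.5 * h * (dk1_dz + dk2_dz)
        dstep_da = 0.5 * h * (dk1_da + dk2_da)
        grad_a += lam * dstep_da
        lam *= dstep_dz
    return lam, grad_a  # dL/dz0, dL/da

def replay(z0, a, steps):
    """Re-run the solve with a fixed list of step sizes (used for checking)."""
    z = z0
    for h in steps:
        z, _ = heun_step(z, a, h)
    return z

z0, a, T = 1.0, -0.5, 2.0
zT, ckpts = forward(z0, a, 0.0, T)
grad_z0, grad_a = backward(ckpts, a)  # gradients of L = z(T)
```

Because the backward sweep reuses the stored forward states instead of re-integrating in reverse time, the resulting gradient should match the derivative of the discretized forward solve. For this toy problem it can also be compared with the analytic values dz(T)/dz0 = exp(a·T) and dz(T)/da = T·z0·exp(a·T).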

Year: 2020 PMID: 34308361 PMCID: PMC8299461

Source DB: PubMed Journal: Proc Mach Learn Res
