Mohamad Wehbi, Daniel Luge, Tim Hamann, Jens Barth, Peter Kaempf, Dario Zanca, Bjoern M Eskofier.
Abstract
Efficient handwriting trajectory reconstruction (TR) requires specific writing surfaces for detecting the movements of digital pens. Although several motion-based solutions have been developed to remove the necessity of writing surfaces, most are based on classical sensor fusion methods and are limited to tracing single strokes because sensor errors accumulate over time. In this work, we present an approach that maps the movements of an IMU-enhanced digital pen to relative displacement data. Training data is collected by means of a tablet. We propose several pre-processing and data-preparation methods to synchronize data between the pen and the tablet, which operate at different sampling rates, and train a convolutional neural network (CNN) to reconstruct multiple strokes without the need for writing segmentation or post-processing correction of the predicted trajectory. The proposed system learns the relative displacement of the pen tip over time from the recorded raw sensor data, achieving a normalized error rate of 0.176 relative to the unit-scaled tablet ground-truth (GT) trajectory. To test the effectiveness of the approach, we train a neural network for character recognition from the reconstructed trajectories, which achieved a character error rate of 19.51%. Finally, a joint model that makes use of both the IMU data and the generated trajectories is implemented, which outperforms the sensor-only recognition approach by 0.75%.
Keywords: convolutional neural network; digital pen; handwriting recognition; inertial measurement unit; sensor-based deep learning; trajectory reconstruction
Year: 2022 PMID: 35891027 PMCID: PMC9318904 DOI: 10.3390/s22145347
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.847
Figure 1. A visual summary of the workflow of our paper. We develop an end-to-end neural network model for handwriting trajectory reconstruction using data collected by writing with a digital pen on a tablet. The system was also tested on paper writing by developing a text recognition model that uses IMU data and the generated relative trajectory data.
Summary of related work on trajectory reconstruction systems. Gyr and Acc represent gyroscope and accelerometer, respectively.
| Authors | Strokes | Segmentation | Calibration | TR Method | Recognition Method | Recognition Type |
|---|---|---|---|---|---|---|
| [ | Single | - | Yes | Gyr single integration | Apple Newton recognizer | Words |
| [ | Single | - | Yes | Acc double integration | Hidden Markov models | Characters |
| [ | Single | - | Yes | Acc double integration | Hidden Markov models | Characters |
| [ | Single | - | Yes | Acc double integration | Vista tablet PC recognizer | Digits |
| [ | Multi | Yes | Yes | Acc double integration | Google IME recognizer | Characters |
| Proposed Model | Multi | No | No | Neural networks | Neural networks | Words |
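The double-integration approaches summarized above are limited to single strokes because any constant sensor bias grows quadratically in position after two integrations. A minimal illustration of this drift (the sampling rate and bias value are hypothetical, not from the paper):

```python
# Double-integrating a constant accelerometer bias shows quadratic position drift.
def double_integrate(acc, dt):
    """Naive Euler double integration: acceleration -> velocity -> position."""
    vel, pos = 0.0, 0.0
    positions = []
    for a in acc:
        vel += a * dt
        pos += vel * dt
        positions.append(pos)
    return positions

dt = 0.01      # 100 Hz sampling (assumed for illustration)
bias = 0.05    # constant accelerometer bias in m/s^2 (hypothetical)
pos = double_integrate([bias] * 200, dt)   # 2 s of biased data

# Position error after t seconds of a constant bias b grows like b * t^2 / 2,
# which is why classical methods need per-stroke calibration or resets.
print(round(pos[-1], 4))
```

After only two seconds the toy pen tip has drifted by roughly `bias * t^2 / 2` = 0.1 m, motivating the paper's learned, integration-free mapping.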
Figure 2. Sensor distribution in the Digipen.
Figure 3. Wacom-compatible writing tip on the Digipen (bottom-left pen) and the recording app (right).
Figure 4. Tablet data upsampling via complete-label interpolation and stroke-based interpolation.
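Because the tablet ground truth and the pen's IMU stream run at different sampling rates, the tablet coordinates must be upsampled onto the pen's timestamps. A sketch of the stroke-based variant using linear interpolation (timestamps and rates below are illustrative assumptions, not the paper's values):

```python
import numpy as np

def upsample_stroke(t_tablet, xy_tablet, t_pen):
    """Linearly interpolate one stroke's tablet coordinates onto pen timestamps."""
    x = np.interp(t_pen, t_tablet, xy_tablet[:, 0])
    y = np.interp(t_pen, t_tablet, xy_tablet[:, 1])
    return np.stack([x, y], axis=1)

# Toy stroke: tablet sampled every 30 ms, pen IMU every 10 ms (assumed rates).
t_tablet = np.array([0.00, 0.03, 0.06])
xy = np.array([[0.0, 0.0], [3.0, 3.0], [6.0, 0.0]])
t_pen = np.arange(0.0, 0.061, 0.01)

up = upsample_stroke(t_tablet, xy, t_pen)
print(up.shape)   # one interpolated (x, y) pair per pen timestamp
```

Interpolating each stroke separately, as Figure 4 contrasts with complete-label interpolation, avoids fabricating pen-tip movement across pen lifts between strokes.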
Figure 5. Absolute coordinates of the character ‘B’ (left), and the calculated relative displacement vectors of the same character (right).
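The conversion illustrated in Figure 5, from absolute coordinates to the relative displacement vectors the CNN regresses, is a first difference; the trajectory is recovered up to its unknown start point by cumulative summation. A minimal sketch on a toy stroke (not the character 'B'):

```python
import numpy as np

# Absolute pen-tip coordinates for a toy stroke.
abs_xy = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 3.0], [4.0, 1.0]])

# Relative displacement vectors: the regression target learned from raw sensor data.
rel = np.diff(abs_xy, axis=0)

# Recover the trajectory (up to the start point) by cumulative summation.
recovered = np.vstack([abs_xy[:1], abs_xy[0] + np.cumsum(rel, axis=0)])

print(rel.tolist())
```

Predicting displacements rather than absolute positions makes the target translation-invariant, which is what allows reconstruction without a known writing surface.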
Detailed description of the convolutional neural network used in this study.
| Layer | Hyperparameters | # of Parameters |
|---|---|---|
| 1D convolution | Filters: 256, Kernel-size: 3 | 7936 |
| Batch normalization | Momentum: 0.99, Epsilon: 0.001 | 1024 |
| Dropout | Rate: 0.3 | 0 |
| 1D convolution | Filters: 256, Kernel-size: 3 | 196,864 |
| Batch normalization | Momentum: 0.99, Epsilon: 0.001 | 1024 |
| Dropout | Rate: 0.3 | 0 |
| 1D convolution | Filters: 256, Kernel-size: 3 | 196,864 |
| Batch normalization | Momentum: 0.99, Epsilon: 0.001 | 1024 |
| Dropout | Rate: 0.3 | 0 |
| TimeDistributed (fully connected) | Units: 2 | 514 |
| Total |  | 405,250 |
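The parameter counts in the table can be verified arithmetically. In the sketch below, the 10 input channels are inferred from the first layer's count (3 × 10 × 256 + 256 = 7936) and are not stated in this record:

```python
def conv1d_params(in_ch, filters, kernel):
    """Weights plus biases of a 1D convolution layer."""
    return kernel * in_ch * filters + filters

def batchnorm_params(channels):
    """Gamma, beta, moving mean, and moving variance per channel."""
    return 4 * channels

def dense_params(in_units, out_units):
    """Weights plus biases of a fully connected layer."""
    return in_units * out_units + out_units

in_channels = 10   # inferred from 7936 = 3 * 10 * 256 + 256
counts = [
    conv1d_params(in_channels, 256, 3),   # 7936
    batchnorm_params(256),                # 1024
    conv1d_params(256, 256, 3),           # 196,864
    batchnorm_params(256),                # 1024
    conv1d_params(256, 256, 3),           # 196,864
    batchnorm_params(256),                # 1024
    dense_params(256, 2),                 # 514 (TimeDistributed adds no weights)
]
print(sum(counts))   # 405,250, matching the table's total
```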
User-tablet data recordings split into leave-one-user-out folds.
| Folds | # of Training Samples | # of Test Samples |
|---|---|---|
| 1 | 1774 | 334 |
| 2 | 1608 | 500 |
| 3 | 1941 | 167 |
| 4 | 1718 | 390 |
| 5 | 1875 | 233 |
| 6 | 1624 | 484 |
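The tablet recordings are split leave-one-user-out, so each fold's test set contains only the held-out writer's samples. A minimal sketch of such a split (the user/recording data below is a toy example, not the paper's six users):

```python
def leave_one_user_out(samples):
    """Yield (held_out_user, train_indices, test_indices), one fold per user."""
    users = sorted({u for u, _ in samples})
    for held_out in users:
        train = [i for i, (u, _) in enumerate(samples) if u != held_out]
        test = [i for i, (u, _) in enumerate(samples) if u == held_out]
        yield held_out, train, test

# Toy data: (user_id, recording) pairs for three users with two recordings each.
samples = [(u, f"rec{r}") for u in range(1, 4) for r in range(2)]

folds = list(leave_one_user_out(samples))
print(len(folds))   # one fold per user
```

Splitting by user rather than by sample explains the uneven train/test sizes in the table: each fold's test size is simply that writer's recording count.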
User paper data recordings split into five user-independent folds.
| Folds | # of Training Samples | # of Test Samples |
|---|---|---|
| 1 | 24,087 | 3874 |
| 2 | 22,877 | 5084 |
| 3 | 22,896 | 5065 |
| 4 | 22,900 | 5061 |
| 5 | 22,896 | 5065 |
Normalized TR error rates over the different users.
|  | 1 | 2 | 3 | 4 | 5 | 6 | Mean |
|---|---|---|---|---|---|---|---|
| Label | 0.1649 | 0.1734 | 0.1928 | 0.1764 | 0.1964 | 0.1864 |  |
| Stroke | 0.1633 | 0.2154 | 0.1634 | 0.1726 |  |  |  |
| Chunks | 0.2285 | 0.1654 |  | 0.1838 | 0.1772 |  |  |
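The abstract reports a normalized TR error of 0.176 relative to the unit-scaled tablet GT trajectory. The exact normalization is not spelled out in this record; a plausible sketch scales deviations by the ground-truth trajectory's extent (toy trajectories, hypothetical metric definition):

```python
import numpy as np

def normalized_tr_error(pred, gt):
    """Mean Euclidean deviation after scaling by the GT trajectory's extent.

    Assumption: "unit-scaled" means dividing by the largest coordinate
    range of the ground truth; the paper's precise definition may differ.
    """
    scale = np.ptp(gt, axis=0).max()   # largest GT extent across x and y
    return float(np.mean(np.linalg.norm((pred - gt) / scale, axis=1)))

# Toy ground truth and a prediction offset by half a unit at each point.
gt = np.array([[0.0, 0.0], [5.0, 0.0], [10.0, 5.0]])
pred = gt + np.array([[0.5, 0.0], [0.0, 0.5], [0.5, 0.0]])

print(round(normalized_tr_error(pred, gt), 3))
```

Normalizing by the GT extent makes errors comparable across writers with different handwriting sizes, which is what allows a single per-user error table like the one above.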
Character error rates over the five folds of the paper data using IMU data and generated trajectories.
|  | 1 | 2 | 3 | 4 | 5 | Mean |
|---|---|---|---|---|---|---|
| IMU | 14.9484 | 27.3908 | 11.5514 | 12.9280 | 13.5233 | 16.0683 |
| Label | 20.7685 | 30.1311 | 19.4653 | 18.8399 | 16.1897 | 21.0789 |
| Stroke | 20.3729 | 28.7688 | 18.1958 | 14.8872 | 19.8676 |  |
| Chunks |  | 17.3706 |  |  |  |  |
| Joint IMU-Chunks | 15.2440 | 25.3947 | 11.2754 | 12.3797 | 12.2769 | 15.3141 |
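The character error rates above follow the standard definition: edit distance between hypothesis and reference, divided by the reference length. A minimal sketch (the example strings are illustrative, not from the paper's data):

```python
def levenshtein(a, b):
    """Edit distance between strings a and b via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def cer(hypothesis, reference):
    """Character error rate in percent: edit distance over reference length."""
    return 100.0 * levenshtein(hypothesis, reference) / len(reference)

print(round(cer("handwritng", "handwriting"), 2))   # one missing character
```

Under this metric, the joint model's mean of 15.3141 versus the IMU-only 16.0683 gives exactly the 0.75% improvement quoted in the abstract.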
Figure 6. Trajectory recovery results when writing on tablet, including the ground truth trajectory collected by the tablet.
Figure 7. Trajectory recovery results when writing on paper.