Handwriting Recognition Based on 3D Accelerometer Data by Deep Learning
Abstract
1. Introduction
- A proprietary dataset of 3D accelerometer data corresponding to multi-stroke, freestyle handwritten lowercase letters and digits. Unlike previous approaches, we do not impose restrictions on handwriting style or on the number and order of strokes.
- Three neural network architectures (CNN, LSTM, and CNN-LSTM) are proposed. In the last architecture, a CNN extracts features that encode the global characteristics of the raw 3D accelerometer data, and an LSTM performs sequence processing and classification.
2. Related Work
Algorithm | Methodology | Limitations |
---|---|---|
Digital Pen [12] | 3D accelerometer signals are converted to an image, which is recognized by a neural network | Ten digits written in a special single stroke font |
WIMU-Based Hand Motion Analysis [13] | Movement and attitude features are extracted from motion sensor and magnetometer signals and recognition is completed by DTW | English lowercase letters and digits written in a special single stroke font |
Accelerometer-Based Digital Pen [14] | 3D accelerometer signals are recognized by a PNN | Ten digits written in a special single stroke font |
Air writing [15] | 3D accelerometer and gyroscope signals are recognized by HMMs | English uppercase letters and words |
Gyroscope-equipped smartphone [4] | 3D gyroscope signals are recognized by stepwise lower-bounded dynamic time warping | English lowercase letters written with a smartphone grabbed as a pen |
Marker-based Air writing [17] | Handwriting is captured from motion of a marker in a video and recognition is completed by a CNN | Ten digits written in a single stroke |
PhonePoint Pen [18] | Basic strokes are detected from 3D accelerometer signals by correlation with templates and handwritten characters are recognized by juxtaposition of basic strokes | English letters and digits written using basic strokes, smartphone grabbed as a pen |
Deep Fisher Discriminant Learning [19] | 3D accelerometer and gyroscope signals are recognized as hand gestures by an F-BiGRU | Six uppercase English letters and six digits written in a predefined stroke ordering |
Motion data from smartwatch [20] | Features are extracted from accelerometer and gyroscope signals and letter recognition is done by DTW | English uppercase letters written on a whiteboard |
SHOW [22] | Features are extracted from accelerometer and gyroscope signals and recognition is tested with seven machine learning algorithms | English letters and digits written on a horizontal surface with the elbow as support point |
MotionHacker [23] | After preprocessing and segmentation, features are extracted from accelerometer and gyroscope signals and letter recognition is performed by random forest classifier | Demonstration of motion sensors-based eavesdropping on handwriting |
AirScript [7] | Recognition is completed by fusing a CNN and two GRU networks, which take as input an image derived from 2-DifViz features, post-processed 2-DifViz features, and standardized raw data, respectively | Ten digits written in the air |
Finger Writing with Smartwatch [25] | Energy, posture, motion shape and motion variation features are extracted from accelerometer and gyroscope signals, and three classifiers are tested for recognition (Naive Bayes, linear regression and decision trees) | English lowercase letters written on a surface |
Trajectory-Based Air Writing [32] | The handwriting trajectory of the fingertip is captured with a video camera and recognition is completed by a CNN and an LSTM | Ten digits written with a predefined stroke ordering |
Air Writing with Interpolation [33] | Motion sensor data are interpolated and then recognized by a 2D-CNN | Uses datasets from other works |
3. CNN and LSTM for Sequence Recognition
3.1. Convolutional Neural Networks
3.1.1. Convolutional Layer
3.1.2. Activation Function
3.1.3. Pooling Layer
3.1.4. Fully Connected Layer (Dense)
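As a compact illustration of the building blocks described in Sections 3.1.1–3.1.4, the following sketch applies a 1D convolution, an activation, a pooling stage, and a dense classification layer to a single 3-axis accelerometer window. The layer sizes here are illustrative only and are not the configuration used in Section 3.3.

```python
# Illustrative only: one pass of the CNN building blocks over a dummy
# accelerometer window of 116 samples x 3 axes (x, y, z).
import numpy as np
from tensorflow.keras import layers

window = np.random.randn(1, 116, 3).astype("float32")   # (batch, time steps, axes)

x = layers.Conv1D(filters=16, kernel_size=3)(window)    # 3.1.1 convolutional layer
x = layers.Activation("relu")(x)                         # 3.1.2 activation function
x = layers.MaxPooling1D(pool_size=2)(x)                  # 3.1.3 pooling layer
x = layers.Flatten()(x)
probs = layers.Dense(36, activation="softmax")(x)        # 3.1.4 dense layer, 36 classes

print(probs.shape)  # (1, 36): probabilities for 26 lowercase letters + 10 digits
```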
3.2. Long Short-Term Memory Neural Networks
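For completeness, the gate-based cell update that characterizes LSTM networks [35] can be written in its commonly used form (a reminder; the symbols in the paper's own derivation may differ), where σ is the logistic sigmoid and ⊙ is element-wise multiplication:

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) &\quad& \text{(input gate)}\\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)}\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell state)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state)}\\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```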
3.3. Implemented Architecture
4. Experiments and Numerical Evaluation
4.1. Hardware and Data Collection
4.2. Evaluation and Results
4.3. Comparison with the State-of-the-Art
4.4. Computational Time Analysis
5. Conclusions and Future Works
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Kim, J.; Sin, B.K. Online handwriting recognition. In Handbook of Document Image Processing and Recognition; Springer: London, UK, 2014; pp. 887–915.
- Zhang, Q.; Wang, D.; Zhao, R.; Yu, Y. MyoSign. In Proceedings of the 24th International Conference on Intelligent User Interfaces—IUI ’19, Marina del Ray, CA, USA, 17–20 March 2019; ACM Press: New York, NY, USA, 2019; pp. 650–660.
- Ignatov, A. Real-time human activity recognition from accelerometer data using Convolutional Neural Networks. Appl. Soft Comput. J. 2018, 62, 915–922.
- Kim, D.W.; Lee, J.; Lim, H.; Seo, J.; Kang, B.Y. Efficient dynamic time warping for 3D handwriting recognition using gyroscope equipped smartphones. Expert Syst. Appl. 2014, 41, 5180–5189.
- Mannini, A.; Intille, S. Classifier Personalization for Activity Recognition using Wrist Accelerometers. IEEE J. Biomed. Health Inf. 2018, 23, 1585–1594.
- Garcia-Ceja, E.; Uddin, M.Z.; Torresen, J. Classification of Recurrence Plots’ Distance Matrices with a Convolutional Neural Network for Activity Recognition. Procedia Comput. Sci. 2018, 130, 157–163.
- Dash, A.; Sahu, A.; Shringi, R.; Gamboa, J.; Afzal, M.Z.; Malik, M.I.; Dengel, A.; Ahmed, S. AirScript—Creating Documents in Air. In Proceedings of the 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan, 9–15 November 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 908–913.
- Saha, S.; Saha, N. A Lightning fast approach to classify Bangla Handwritten Characters and Numerals using newly structured Deep Neural Network. Procedia Comput. Sci. 2018, 132, 1760–1770.
- Abdulhussain, S.H.; Mahmmod, B.M.; Naser, M.A.; Alsabah, M.Q.; Ali, R.; Al-Haddad, S.A.R. A Robust Handwritten Numeral Recognition Using Hybrid Orthogonal Polynomials and Moments. Sensors 2021, 21, 1999.
- Rani, L.; Sahoo, A.K.; Sarangi, P.K.; Yadav, C.S.; Rath, B.P. Feature Extraction and Dimensionality Reduction Models for Printed Numerals Recognition. In Proceedings of the 2022 9th International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 23–25 March 2022; pp. 798–801.
- Lawhern, V.J.; Solon, A.J.; Waytowich, N.R.; Gordon, S.M.; Hung, C.P.; Lance, B.J. EEGNet: A compact convolutional neural network for EEG-based brain–computer interfaces. J. Neural Eng. 2018, 15, 056013.
- Ghosh, D.; Goyal, S.; Kumar, R. Digital pen to convert handwritten trajectory to image for digit recognition. In Advances in Communication, Devices and Networking; Bera, R., Sarkar, S.K., Chakraborty, S., Eds.; Springer: Singapore, 2018; pp. 923–932.
- Patil, S.; Kim, D.; Park, S.; Chai, Y. Handwriting Recognition in Free Space Using WIMU-Based Hand Motion Analysis. J. Sens. 2016, 2016, 3692876.
- Wang, J.S.; Chuang, F.C. An Accelerometer-Based Digital Pen With a Trajectory Recognition Algorithm for Handwritten Digit and Gesture Recognition. IEEE Trans. Ind. Electron. 2012, 59, 2998–3007.
- Amma, C.; Georgi, M.; Schultz, T. Airwriting: A wearable handwriting recognition system. Pers. Ubiquitous Comput. 2014, 18, 191–203.
- Wijewickrama, R.; Maiti, A.; Jadliwala, M. deWristified. In Proceedings of the 12th Conference on Security and Privacy in Wireless and Mobile Networks—WiSec ’19, Miami, FL, USA, 15–17 May 2019; ACM Press: New York, NY, USA, 2019; pp. 49–59.
- Roy, P.; Ghosh, S.; Pal, U. A CNN Based Framework for Unistroke Numeral Recognition in Air-Writing. In Proceedings of the 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), Niagara Falls, NY, USA, 5–8 August 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 404–409.
- Agrawal, S.; Constandache, I.; Gaonkar, S.; Roy Choudhury, R.; Caves, K.; DeRuyter, F. Using mobile phones to write in air. In Proceedings of the MobiSys ’11, the 9th International Conference on Mobile Systems, Applications, and Services, Washington, DC, USA, 28 June–1 July 2011; pp. 15–28.
- Li, C.; Xie, C.; Zhang, B.; Chen, C.; Han, J. Deep Fisher discriminant learning for mobile hand gesture recognition. Pattern Recognit. 2018, 77, 276–288.
- Ardüser, L.; Bissig, P.; Brandes, P.; Wattenhofer, R. Recognizing text using motion data from a smartwatch. In Proceedings of the 2016 IEEE International Conference on Pervasive Computing and Communication Workshops, PerCom Workshops, Sydney, Australia, 14–18 March 2016.
- Kwon, M.C.; Park, G.; Choi, S. Smartwatch user interface implementation using CNN-based gesture pattern recognition. Sensors 2018, 18, 2997.
- Lin, X.; Chen, Y.; Chang, X.W.; Liu, X.; Wang, X. SHOW: Smart Handwriting on Watches. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2017, 151, 23.
- Xia, Q.; Hong, F.; Feng, Y.; Guo, Z. MotionHacker: Motion sensor based eavesdropping on handwriting via smartwatch. In Proceedings of the INFOCOM 2018—IEEE Conference on Computer Communications Workshops, Honolulu, HI, USA, 15–19 April 2018; pp. 468–473.
- Rahagiyanto, A.; Basuki, A.; Sigit, R.; Anwar, A.; Zikky, M. Hand Gesture Classification for Sign Language Using Artificial Neural Network. In Proceedings of the 2017 21st International Computer Science and Engineering Conference (ICSEC), Bangkok, Thailand, 15–18 November 2017; Volume 6, pp. 205–209.
- Xu, C.; Pathak, P.H.; Mohapatra, P. Finger-writing with Smartwatch: A Case for Finger and Hand. In Proceedings of the International Workshop on Mobile Computing Systems and Applications, Santa Fe, NM, USA, 12–13 February 2015; pp. 9–14.
- Varkey, J.P.; Pompili, D.; Walls, T.A. Erratum to: Human motion recognition using a wireless Sensor-Based wearable system. Pers. Ubiquitous Comput. 2012, 16, 897–910.
- Jalloul, N.; Poree, F.; Viardot, G.; Hostis, P.L.; Carrault, G. Activity Recognition Using Complex Network Analysis. IEEE J. Biomed. Health Inf. 2018, 22, 989–1000.
- Ojagh, S.; Cauteruccio, F.; Terracina, G.; Liang, S.H. Enhanced air quality prediction by edge-based spatiotemporal data preprocessing. Comput. Electr. Eng. 2021, 96, 107572.
- Sa-nguannarm, P.; Elbasani, E.; Kim, B.; Kim, E.H.; Kim, J.D. Experimentation of human activity recognition by using accelerometer data based on LSTM. In Advanced Multimedia and Ubiquitous Engineering; Park, J.J., Loia, V., Pan, Y., Sung, Y., Eds.; Springer: Singapore, 2021; pp. 83–89.
- Livieris, I.E.; Pintelas, E.; Pintelas, P. A CNN–LSTM model for gold price time-series forecasting. Neural Comput. Appl. 2020, 32, 17351–17360.
- Elmaz, F.; Eyckerman, R.; Casteels, W.; Latré, S.; Hellinckx, P. CNN-LSTM architecture for predictive indoor temperature modeling. Build. Environ. 2021, 206, 108327.
- Alam, M.S.; Kwon, K.C.; Alam, M.A.; Abbass, M.Y.; Imtiaz, S.M.; Kim, N. Trajectory-Based Air-Writing Recognition Using Deep Neural Network and Depth Sensor. Sensors 2020, 20, 376.
- Abir, F.A.; Siam, M.A.; Sayeed, A.; Hasan, M.A.M.; Shin, J. Deep Learning Based Air-Writing Recognition with the Choice of Proper Interpolation Technique. Sensors 2021, 21, 8407.
- LeCun, Y.; Bengio, Y. Convolutional networks for images, speech, and time series. Handb. Brain Theory Neural Netw. 1995, 3361, 1995.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-Based Learning Applied to Document Recognition. Proc. IEEE 1998, 86, 2278–2324.
- Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
- Sukhbaatar, S.; Bruna, J.; Paluri, M.; Bourdev, L.; Fergus, R. Training Convolutional Networks with Noisy Labels. In Proceedings of the ICLR 2015, San Diego, CA, USA, 7–9 May 2015; pp. 1–11.
- Zhang, Z.; Sabuncu, M.R. Generalized cross entropy loss for training deep neural networks with noisy labels. Adv. Neural Inf. Process. Syst. 2018, 31, 8778–8788.
- Wu, J.; Pan, G.; Zhang, D.; Qi, G.; Li, S. Gesture recognition with a 3-D accelerometer. In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2009; pp. 25–38.
Layer (Type) | Output Shape | Parameters |
---|---|---|
Input | (None, 116, 3) | 0 |
Conv1D | (None, 114, 16) | 160 |
Dropout | (None, 114, 16) | 0 |
Conv1D | (None, 112, 16) | 784 |
Dropout | (None, 112, 16) | 0 |
MaxPooling | (None, 56, 16) | 0 |
Conv1D | (None, 54, 32) | 1568 |
Dropout | (None, 54, 32) | 0 |
Flatten | (None, 1728) | 0 |
Dense | (None, 36) | 62,244 |
Total | 64,756 |
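The CNN in the table above can be reproduced in Keras along the lines of the following sketch; kernel sizes, pool size, and filter counts follow from the listed output shapes and parameter counts, while the dropout rates and activation functions are assumptions (the table does not specify them).

```python
# Sketch of the CNN model; dropout rates and ReLU activations are assumed.
from tensorflow.keras import layers, models

cnn = models.Sequential([
    layers.Input(shape=(116, 3)),              # 116 accelerometer samples x 3 axes
    layers.Conv1D(16, 3, activation="relu"),   # -> (114, 16), 160 parameters
    layers.Dropout(0.5),                       # rate assumed
    layers.Conv1D(16, 3, activation="relu"),   # -> (112, 16), 784 parameters
    layers.Dropout(0.5),
    layers.MaxPooling1D(2),                    # -> (56, 16)
    layers.Conv1D(32, 3, activation="relu"),   # -> (54, 32), 1568 parameters
    layers.Dropout(0.5),
    layers.Flatten(),                          # -> 1728 features
    layers.Dense(36, activation="softmax"),    # -> 36 classes, 62,244 parameters
])
cnn.summary()  # 64,756 trainable parameters in total, as in the table
```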
Layer (Type) | Output Shape | Parameters |
---|---|---|
Input | (None, 116, 3) | 0 |
LSTM | (None, 116, 250) | 254,000 |
Flatten | (None, 29000) | 0 |
Dense | (None, 36) | 1,044,036 |
Total | 1,298,036 |
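Likewise, the pure LSTM model can be sketched as a single 250-unit LSTM that returns the full output sequence, followed by flattening and a 36-way softmax (the output activation is an assumption):

```python
# Sketch of the LSTM model; the softmax output activation is assumed.
from tensorflow.keras import layers, models

lstm_net = models.Sequential([
    layers.Input(shape=(116, 3)),
    layers.LSTM(250, return_sequences=True),   # -> (116, 250), 254,000 parameters
    layers.Flatten(),                          # -> 29,000 features
    layers.Dense(36, activation="softmax"),    # 1,044,036 parameters
])
lstm_net.summary()  # 1,298,036 trainable parameters in total, as in the table
```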
Layer (Type) | Output Shape | Parameters |
---|---|---|
Input | (None, 116, 3) | 0 |
Conv1D | (None, 114, 16) | 160 |
Dropout | (None, 114, 16) | 0 |
Conv1D | (None, 112, 16) | 784 |
Dropout | (None, 112, 16) | 0 |
MaxPooling | (None, 56, 16) | 0 |
Conv1D | (None, 54, 32) | 1568 |
Dropout | (None, 54, 32) | 0 |
LSTM | (None, 250) | 283,000 |
Dense | (None, 36) | 9036 |
Total | 294,548 |
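The hybrid CNN-LSTM keeps the convolutional front end of the CNN model and replaces its flatten/dense head with a 250-unit LSTM that returns only its last hidden state, followed by the 36-way softmax. As before, dropout rates and activation functions are assumptions:

```python
# Sketch of the CNN-LSTM hybrid; dropout rates and activations are assumed.
from tensorflow.keras import layers, models

cnn_lstm = models.Sequential([
    layers.Input(shape=(116, 3)),
    layers.Conv1D(16, 3, activation="relu"),
    layers.Dropout(0.5),
    layers.Conv1D(16, 3, activation="relu"),
    layers.Dropout(0.5),
    layers.MaxPooling1D(2),
    layers.Conv1D(32, 3, activation="relu"),
    layers.Dropout(0.5),
    layers.LSTM(250),                          # last hidden state only, 283,000 parameters
    layers.Dense(36, activation="softmax"),    # 9036 parameters
])
cnn_lstm.summary()  # 294,548 trainable parameters in total, as in the table
```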
Method | Accuracy [%] | Precision | F-Measure |
---|---|---|---|
DTW, Kim et al. [4] | 95.00 | − | − |
LR, Xu et al. [25] | 94.60 | − | − |
DTW, Wang and Chuang [14] | 98.00 | − | − |
F-BiLSTM, Li et al. [19] | 98.04 | − | − |
F-BiGRU, Wu et al. [40] | 99.15 | − | − |
FDSVN, Patil et al. [13] | 95.21 | − | − |
CNN | 99.45 | 0.97 | 0.97 |
LSTM | 99.68 | 0.99 | 0.99 |
CNN-LSTM | 99.55 | 0.98 | 0.98 |
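The precision and F-measure entries for the proposed models could be obtained from the test-set predictions along the following lines; macro averaging over the 36 classes is assumed here, since the averaging scheme is not stated in the table.

```python
# Sketch of the metric computation; macro averaging is an assumption.
from sklearn.metrics import accuracy_score, precision_score, f1_score

def report(y_true, y_pred):
    """Print accuracy [%], precision, and F-measure for integer class labels."""
    acc = 100.0 * accuracy_score(y_true, y_pred)
    prec = precision_score(y_true, y_pred, average="macro")
    f1 = f1_score(y_true, y_pred, average="macro")
    print(f"Accuracy {acc:.2f}%  Precision {prec:.2f}  F-measure {f1:.2f}")
```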
Architecture | Training [min] | Testing [s] |
---|---|---|
DTW, Kim et al. [4] | − | 0.9 |
LR, Xu et al. [25] | − | − |
DTW, Wang and Chuang [14] | 16.156 | − |
F-BiLSTM, Li et al. [19] | − | − |
F-BiGRU, Wu et al. [40] | − | − |
FDSVN, Patil et al. [13] | − | − |
CNN | 5.66 | 0.0009 |
LSTM | 155.58 | 0.0263 |
CNN-LSTM | 78.48 | 0.0123 |
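If the testing column is read as the average time to classify a single character, it could be measured with a protocol such as the one below (a sketch only; the actual timing methodology is not described in the table):

```python
# Sketch of per-sample inference timing; the warm-up call and repeat count
# are assumptions of this illustration.
import time

def mean_inference_time(model, x_test, repeats=100):
    """Return the average number of seconds per single-sample prediction."""
    sample = x_test[:1]
    model.predict(sample, verbose=0)           # warm-up to exclude graph build time
    start = time.perf_counter()
    for _ in range(repeats):
        model.predict(sample, verbose=0)
    return (time.perf_counter() - start) / repeats
```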