Important Features Selection and Classification of Adult and Child from Handwriting Using Machine Learning Methods

Abstract: The classification of different age groups, such as adult and child, based on handwriting is very important due to its various applications in many different fields. In forensics, handwriting classification helps investigators focus on a certain category of writers. This paper proposes a machine-learning (ML)-based approach for automatically classifying people as adults or children based on their handwritten data. This study utilized two types of handwritten databases, handwritten text and handwritten pattern, which were collected using a pen tablet. The handwritten text database had 57 subjects (adult: 26 vs. child: 31). Each subject (adult or child) wrote the same 30 words using Japanese hiragana characters. The handwritten pattern database had 81 subjects (adult: 42 and child: 39). Each subject (adult or child) drew four different lines, namely zigzag lines (trace condition and predict condition) and periodic lines (trace condition and predict condition), and repeated these line tasks three times. Handwriting classification of adult and child was performed in three steps: (i) feature extraction; (ii) feature selection; and (iii) classification. We extracted 30 features from both the handwritten text and handwritten pattern datasets. The most efficient features were selected using the sequential forward floating selection (SFFS) method and the optimal parameters were selected. Then two ML-based approaches, namely, support vector machine (SVM) and random forest (RF), were applied to classify adult and child. Our findings showed that RF produced up to 93.5% accuracy for the handwritten text database and 89.8% accuracy for the handwritten pattern database. We hope that this study will provide evidence of the possibility of classifying adult and child based on handwritten text and handwritten pattern data.


Introduction
In recent years, the field of handwriting has attracted interest from various perspectives, such as biometrics [1] and the medical field [2]. In addition, handwritten characters can be obtained from a variety of sources such as paper documents, images, touch screens, and other devices. This makes the data easy to collect and suitable for classification. Furthermore, since handwriting is something that everyone uses every day in school, collecting it places little stress on participants. There are few studies on handwriting classification for adults and children; most related studies address face recognition [3] or the classification of age groups [4], age, gender, and nationality [5], gender [6,7], gender and handedness [8], alcohol detection [9], and Parkinson's disease (PD) [10] based on handwriting images.
There are two types of handwriting data: offline and online. Input data collected using a scanner are called "offline", whereas input data obtained from the movement of a pen tip are called "online" [11]. In our research work, we used online handwriting databases. A single writer's handwriting may be unique or differ slightly between samples, but the handwriting of a child and that of an adult are expected to differ consistently. Most forensic handwriting analysis is based on the inspection of specific character shapes, character ligatures, size, pen lift, pen pressure, speed, letter spacing, etc., to identify a suspected person. Age group detection can narrow the search before the actual suspected person is identified in forensic analysis, because it gives additional evidence about the suspected person's age. Currently, there are many applications of handwriting recognition, for example, signature authentication used in industrial applications [2], authentication in criminal investigations in a court of justice [12,13], document examinations [14], and so on. The most difficult aspects of handwriting identification are distortions and pattern variations; therefore, feature extraction is of supreme importance. Handwritten forensic analysis or handwriting recognition using machine learning (ML) algorithms can be a great solution to classify adults and children based on their handwritten text and handwritten pattern.
Ahmad et al. (2004) proposed support vector machine (SVM) with several kernels for online handwriting recognition [15]. They showed that at the character level, the SVM recognition rate was dramatically better due to margin maximization in the decision function. The only problem with this algorithm was that storing the large number of support vectors obtained from a huge training set of characters requires a larger memory size. Babu et al. (2014) proposed k-nearest neighbors (k-NN) for recognizing handwritten digits based on structural features, which does not require thinning operations or size normalization [16]. Ramzan et al. (2018) implemented neural networks (NN) and their variants to recognize handwritten digits; their survey details existing techniques for handwritten digit recognition (HWDR) [17]. Baldominos et al. (2019) [18] surveyed handwritten character recognition with convolutional neural networks (CNNs), distinguishing works that use data augmentation from works that use the original dataset out of the box [19,20]. They provided an extensive and up-to-date survey of the MNIST and EMNIST datasets and of the lowest error rates achieved on them.
Poon et al. (2019) [2] applied logistic regression to predict PD based on handwriting recognition. They utilized the publicly available PD database and extracted secondary kinematic handwriting features from the dataset. Handwriting is thus studied not only for personal identification but also in the medical field. The limitations of their proposed model were the small sample size of the dataset and the lack of control in the study design. Beyond these general limitations of handwriting recognition, Japanese handwritten character recognition is complex due to the various types of writing styles and characters and the confusion among similar characters. One of the major causes of the inefficient classification of Japanese characters is the large number of letters. Many methods have been developed to recognize Japanese handwriting as text images for several applications; however, there are few studies on the classification of adults and children based on Japanese handwriting. Nisimura et al. (2004) [21] suggested a discriminating strategy based on statistical learning and extracted linguistic features from speech or voice data to classify adults and children. They applied SVM and found that it performed with better classification accuracy than the Gaussian mixture model [22]. The disadvantage of this strategy is that some traits are common to both labels. Makihara et al. (2010) [23] proposed a method to classify gender and age using video-based gait feature analysis with a large-scale multiview gait database. They adopted the k-NN classifier to classify gender and age. They used three databases (HumanID, Soton, and CASIA) that contained over 100 subjects. These datasets have their particular limitations, such as the small view images in the HumanID dataset, the single view images in the Soton dataset, and most subjects in the CASIA dataset being in their 20s or 30s. Faghel-Soubeyrand classified adult and child based on faces [24].
In this study, we propose a new approach for the classification of adults and children based on their handwritten text and handwritten pattern. Our proposed method achieves more than 89% classification accuracy, which suggests that adults and children can be distinguished reliably from handwritten characters.
The organization of this paper is as follows: Section 2 presents the materials and methods, including the proposed ML-based framework, a description of the datasets, feature extraction, feature selection, and the classifiers along with their performance evaluation metrics. The experimental results and discussion are presented in Section 3. Finally, the conclusion is given in Section 4.

Materials and Methods
In this section, we summarize the proposed ML-based framework. Next, the two databases used in this research work are described. We also describe feature extraction, feature selection, and the two classification methods along with their performance evaluation metrics.

Proposed ML-Based Framework
The goal of this work is to propose an ML-based model for predicting adult and child based on their handwritten texts and handwritten patterns. The proposed ML-based framework is presented in Figure 1. First, we divide the handwriting (text and pattern) dataset into two parts: a training set and a test set. We take 80% of the dataset for the training phase and the remaining 20% for the testing phase. The second step is to preprocess the handwriting data. After preprocessing, we extract 30 features and then select an optimal subset of the features using sequential forward floating selection (SFFS). We apply two ML-based algorithms, SVM and RF, for the classification of adult and child. We tune the hyperparameters of the classifiers (SVM and RF) using a grid search method and train the SVM- and RF-based classifiers with a five-fold cross-validation protocol. After training, the classifiers (SVM and RF) are used in the testing phase for the classification of adult and child. Accuracy, recall, precision, f1-score, and area under the curve (AUC) are used to evaluate the performance of the classifiers.
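The 80%/20% division described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: it assumes that the split is stratified by class and made at the subject level, neither of which is stated in the text, and the helper name `stratified_split` is hypothetical. In Scikit-learn, `train_test_split` with `test_size=0.2` and the `stratify` argument offers comparable behaviour.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), keeping the class ratio similar in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # At least one test sample per class, otherwise about 20% of the class.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Class sizes of the handwritten text database: 26 adults and 31 children.
labels = ["adult"] * 26 + ["child"] * 31
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 46 11
```

Splitting by subject rather than by individual word sample keeps all samples of one writer on the same side of the split; whether the original experiments did so is not specified.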

Device for Data Collection
Handwriting data were recorded using a pen tablet system (Cintiq Pro 16, Wacom Co., Ltd., Saitama, Japan). The tablet was connected to a laptop PC running Windows 10. Figure 2 illustrates the coordinates of the parameters generated by the pen tablet. The screen size of the pen tablet was 15.6 inches, and the resolution was 2560 × 1440 pixels.

Handwritten Text
We developed a new dataset to evaluate our proposed method, in which adult and child handwriting-based text data were collected using a pen tablet. A total of 57 participants were recruited for this work, consisting of 26 adults (aged 19-59 years) and 31 children (aged 12-13 years). Each subject (child or adult) was asked to write the same 30 words (tasks) using only hiragana characters on the pen tablet with a dedicated stylus pen. Each word contains a minimum of 2 characters and a maximum of 7 characters. A summary of the handwritten text dataset is given in Table 1.

Handwritten Pattern
Handwriting-based pattern data were also collected from 81 subjects using a pen tablet system. The dataset had 39 children and 42 adults. In this study, we adopted two patterns. One was a continuous zigzag line, essentially a continuous set of triangles without a base. The other was a continuous periodic line (PL) pattern, in which squares and triangles without a base were repeated sequentially. The trace and predict conditions were used for each pattern. Each subject was asked to draw these four patterns on the pen tablet using a dedicated stylus pen, and each drawing pattern was repeated 3 times. In the trace condition, the subject traced over the sample line; in the predict condition, the subject memorized the sample and then drew it on a blank sheet. The zigzag lines for both conditions are presented in Figure 3a, and the PL lines for both conditions are presented in Figure 3b. The data were collected separately for the zigzag line and the PL line: data were recorded for 30 s, followed by a 20 s rest, and this cycle was repeated until six recordings were collected. The rest intervals were included to let the brain rest. A summary of the handwritten pattern dataset is given in Table 2.

Feature Extraction
The handwriting data contained six pieces of information: the time of writing, pen pressure, x-coordinate and y-coordinate of the writing position, angle of the horizontal component of the pen, and angle of the vertical component of the pen. To classify adults and children based on their handwriting, 30 feature parameters were evaluated for each task. These feature parameters only required the localization of primary features of handwritten text images, namely, the width, height, speed, peak, different types of grip angle, and various types of pressure, which are given in detail in Table 3.

Table 3. Extracted feature names and their description.

SN   Feature        Description
...
3    Length         The total length of the drawing
4    Velocity       ...
...
11   GripAngleSDW   SD of grip angle values for the entire drawing task (Horizontal)
12   GripAngleSDL   SD of grip angle values for the entire drawing task (Vertical)
13   PressureMean   Mean of recorded pressure values for the entire task
14   PressureSD     SD of recorded pressure values for the entire task
15   PCAvgPos       Mean increase in pressure between two time points
16   PCSDPos        SD of increase in pressure between two time points
17   PCMax          The maximum increase in handwriting pressure between two time points
18   PCAvgNeg       Mean decrease in pressure between two time points
19   PCSDNeg        SD of decrease in pressure between two time points
20   PCMin          Maximum reduction in handwriting pressure between two time points
21   Error          ...
...
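To make the feature definitions more concrete, the sketch below computes a few of the listed features (Length, PressureMean, PressureSD, and the pressure-change features) from a sequence of pen samples. It is an illustration under assumptions, not the authors' implementation: the sample layout `(time, x, y, pressure)`, the use of the population standard deviation, the sign convention for PCAvgNeg and PCMin (negative values), and the function name `extract_features` are all hypothetical. Time-based features are omitted because their exact definitions are not given here.

```python
import math
import statistics


def extract_features(samples):
    """samples: list of (time, x, y, pressure) tuples in recording order."""
    xs = [s[1] for s in samples]
    ys = [s[2] for s in samples]
    pressures = [s[3] for s in samples]

    # Length: sum of Euclidean distances between consecutive pen positions.
    length = sum(
        math.hypot(x1 - x0, y1 - y0)
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
    )

    # Pressure change between two consecutive time points.
    changes = [p1 - p0 for p0, p1 in zip(pressures, pressures[1:])]
    increases = [c for c in changes if c > 0]
    decreases = [c for c in changes if c < 0]

    def mean_or_zero(values):
        return statistics.mean(values) if values else 0.0

    def sd_or_zero(values):
        return statistics.pstdev(values) if values else 0.0

    return {
        "Length": length,
        "PressureMean": statistics.mean(pressures),
        "PressureSD": statistics.pstdev(pressures),
        "PCAvgPos": mean_or_zero(increases),
        "PCSDPos": sd_or_zero(increases),
        "PCMax": max(increases, default=0.0),
        "PCAvgNeg": mean_or_zero(decreases),
        "PCSDNeg": sd_or_zero(decreases),
        "PCMin": min(decreases, default=0.0),  # largest reduction (most negative)
    }


# Four made-up pen samples: (time in s, x, y, normalized pressure).
samples = [(0.00, 0, 0, 0.2), (0.01, 3, 4, 0.5), (0.02, 6, 8, 0.4), (0.03, 6, 8, 0.8)]
features = extract_features(samples)
print(features["Length"])  # 10.0
```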

Feature Normalization
Feature normalization rescales each feature to a common scale so that features with large numeric ranges do not dominate the classifier. Mathematically, it is defined as follows:
$$z = \frac{X - \mu}{\sigma}$$
where X is the original feature vector, µ is the mean of that feature vector, and σ is its standard deviation. After normalization, z has a mean of 0 and a standard deviation of 1.
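As a small worked example of standardization with the mean µ and standard deviation σ of a feature, the sketch below normalizes one made-up feature column. The use of the population standard deviation is an assumption (it matches the default behaviour of Scikit-learn's `StandardScaler`), and the values are invented for illustration. Note that standardized values are centred on 0 with unit standard deviation, so individual values can be negative or larger than 1.

```python
import statistics


def z_score(values):
    """Standardize a list of numbers to zero mean and unit standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population SD (assumption)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Made-up values of one feature (for example, mean pen pressure) for five samples.
pressure_mean = [0.2, 0.4, 0.6, 0.8, 1.0]
z = z_score(pressure_mean)
print([round(v, 3) for v in z])
```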

Feature Selection
Feature selection is the process of removing irrelevant features to improve the efficiency of the model. We used SFFS for feature selection, which is an extension of sequential forward selection (SFS), to reduce the initial d-dimensional feature space to a k-dimensional feature subspace (k < d) [25]. Let $Y = \{y_1, y_2, \ldots, y_d\}$ be the set of all features and $X_k = \{x_j \mid j = 1, 2, \ldots, k;\ x_j \in Y\}$, where $k \in \{0, 1, 2, \ldots, d\}$, be a subset of Y. We start the algorithm with $X_0 = \emptyset$, k = 0. The steps of SFS are described as follows:
Step 1: $x^+ = \arg\max J(X_k + x)$, where $x \in Y - X_k$, J is an evaluation index, and $x^+$ is the feature that gives the highest evaluation when it is added.
Step 2: $X_{k+1} = X_k + x^+$. The feature with the highest evaluation is added to the subset.
Step 3: $k = k + 1$.
Step 4: Steps 1 to 3 are iterated until k reaches the specified number; the resulting $X_k$ is the set of the most appropriate features.
SFFS performs Steps 1 to 3 of SFS and adds a process that searches for features to be deleted. At first, Steps 1 to 3 are performed starting from $X_0 = \emptyset$, k = 0, as in SFS.
Step 5: $x^- = \arg\max J(X_k - x)$, where $x \in X_k$ and $x^-$ is the feature whose deletion gives the best performance.
In the forward steps, we add the feature that best improves the performance of the feature subset. In the deletion step, $x^-$ is removed only if its removal improves the performance of the resulting subset, that is, only if $J(X_k - x^-) > J(X_k)$. In this study, the SequentialFeatureSelector in the mlxtend library was used [26].
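The floating search can be illustrated with a short, self-contained sketch. It is not the mlxtend implementation: it follows the classic formulation in which a feature is removed only when the reduced subset beats the best subset of that size seen so far (which guarantees termination), and the criterion `toy_criterion`, the feature names A to D, and their scores are invented for the example. In practice J would be a cross-validated accuracy, as obtained with mlxtend's `SequentialFeatureSelector` using `forward=True` and `floating=True`.

```python
def sffs(features, criterion, k_target):
    """Sequential forward floating selection for a criterion J to be maximized."""
    if not 0 < k_target <= len(features):
        raise ValueError("k_target must be between 1 and the number of features")
    selected = []
    best_score = {}  # subset size -> best criterion value seen at that size

    while len(selected) < k_target:
        # Forward step: add the feature x+ that maximizes J(X_k + x).
        candidates = [f for f in features if f not in selected]
        x_plus = max(candidates, key=lambda f: criterion(selected + [f]))
        selected = selected + [x_plus]
        size = len(selected)
        best_score[size] = max(best_score.get(size, float("-inf")), criterion(selected))

        # Floating step: remove x- while doing so beats the best smaller subset.
        while len(selected) > 2:
            x_minus = max(selected, key=lambda f: criterion([g for g in selected if g != f]))
            reduced = [g for g in selected if g != x_minus]
            if criterion(reduced) > best_score[len(reduced)]:
                selected = reduced
                best_score[len(reduced)] = criterion(reduced)
            else:
                break
    return selected


# Hypothetical criterion: A looks best alone, but B and C work well together.
WEIGHTS = {"A": 5, "B": 4, "C": 4, "D": 1}


def toy_criterion(subset):
    chosen = set(subset)
    score = sum(WEIGHTS[f] for f in chosen)
    if {"B", "C"} <= chosen:
        score += 3  # B and C are complementary
    if "A" in chosen and "B" in chosen:
        score -= 3  # A is redundant with B
    if "A" in chosen and "C" in chosen:
        score -= 3  # A is redundant with C
    return score


result = sffs(["A", "B", "C", "D"], toy_criterion, k_target=3)
print(result, toy_criterion(result))  # ['B', 'C', 'D'] 12
```

With this toy criterion, plain forward selection would keep A and end with {A, B, C} (score 10), whereas the floating step discards A once B and C are both present and reaches {B, C, D} (score 12).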
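Classifiers

Support Vector Machine
Support vector machine (SVM) [27,28] is a supervised learning method that is used for both classification and regression problems. In this study, we implemented SVM with Scikit-learn support vector classification (SVC) [29]. SVM separates the classes with the hyperplane that has the largest margin to the nearest training data points of each class. A highly accurate model can be obtained with a small amount of data, and the accuracy of identification can be kept even when the number of features increases. The main objective of SVM is to find the hyperplane in the feature space that best separates the classes, which, in its standard soft-margin dual form, requires solving the following constrained problem:
$$\max_{\alpha} \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j)$$
Subject to
$$\sum_{i=1}^{n} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C, \quad i = 1, \ldots, n$$
The final discriminant function takes the following form:
$$f(x) = \operatorname{sign}\left( \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b \right)$$
where $\alpha_i$ are the Lagrange multipliers, $y_i \in \{-1, +1\}$ are the class labels, K is the kernel function, C is the cost parameter, and b is the bias term.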
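For illustration, the discriminant function of a trained kernel SVM, f(x) = sign(Σ α_i y_i K(x_i, x) + b), can be evaluated as in the sketch below. The RBF kernel, the two support vectors, their coefficients, the bias, and the value of gamma are all made-up values chosen for the example, and the mapping of +1 and -1 to particular classes is arbitrary; in Scikit-learn the corresponding quantities of a fitted `SVC` are exposed as `support_vectors_`, `dual_coef_`, and `intercept_`.

```python
import math


def rbf_kernel(u, v, gamma):
    """K(u, v) = exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def svm_decision(x, support_vectors, dual_coefs, bias, gamma):
    """Sum of alpha_i * y_i * K(x_i, x) over the support vectors, plus the bias b."""
    return sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    ) + bias


def svm_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Sign of the decision value: +1 or -1."""
    return 1 if svm_decision(x, support_vectors, dual_coefs, bias, gamma) >= 0 else -1


# Hypothetical trained model with one support vector per class.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
dual_coefs = [-1.0, 1.0]  # alpha_i * y_i
bias, gamma = 0.0, 0.5

print(svm_predict((2.0, 1.5), support_vectors, dual_coefs, bias, gamma))  # 1
print(svm_predict((0.1, 0.0), support_vectors, dual_coefs, bias, gamma))  # -1
```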

Random Forest
Random forest (RF) [30] is a type of ensemble learning used for classification and regression. It is a model in which decision trees are built in parallel and predictions are made by taking the majority vote of the outputs of the individual trees. Random selection of features enables fast training and prediction even for high-dimensional features, and random selection of training data makes the model robust against noise. Therefore, it is possible to build a good overall model. In this study, we implemented RF with the RandomForestClassifier in Scikit-learn [29].
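The two sources of randomness and the majority vote can be sketched as follows. This is a deliberately simplified illustration, not Scikit-learn's algorithm: each "tree" is reduced to a one-feature rule that assigns the class with the nearest mean, and the data, the number of learners, and the function names are invented for the example.

```python
import random
from collections import Counter


def train_stump(rows, rng):
    """Fit a one-feature nearest-class-mean rule on a bootstrap sample of rows."""
    sample = [rng.choice(rows) for _ in rows]   # random selection of training data
    feature = rng.randrange(len(rows[0][0]))    # random selection of one feature
    means = {}
    for label in {lab for _, lab in sample}:
        values = [feats[feature] for feats, lab in sample if lab == label]
        means[label] = sum(values) / len(values)

    def predict(feats):
        # Assign the class whose mean on the chosen feature is closest.
        return min(means, key=lambda label: abs(feats[feature] - means[label]))

    return predict


def forest_predict(learners, feats):
    """Majority vote over the predictions of the individual learners."""
    votes = Counter(learner(feats) for learner in learners)
    return votes.most_common(1)[0][0]


# Made-up two-feature data in which the two classes are clearly separated.
rows = [
    ((8.0, 9.0), "adult"), ((9.5, 8.5), "adult"), ((10.0, 9.0), "adult"), ((8.5, 10.0), "adult"),
    ((1.0, 2.0), "child"), ((2.5, 1.5), "child"), ((3.0, 3.0), "child"), ((1.5, 2.5), "child"),
]
rng = random.Random(0)
learners = [train_stump(rows, rng) for _ in range(25)]
print(forest_predict(learners, (9.0, 9.0)), forest_predict(learners, (2.0, 2.0)))
```

A real random forest grows full decision trees and samples a feature subset at every split; the voting step is the same.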

Performance Evaluation Metrics
To evaluate the performance of the classification model, we adopted five evaluation metrics: classification accuracy (ACC), recall (Rec), precision (Pre), f1-score, and AUC. Accuracy, recall, precision, and f1-score are computed from the numbers of true positives ($t_p$), false positives ($f_p$), true negatives ($t_n$), and false negatives ($f_n$) as follows:
$$\mathrm{ACC}\ (\%) = \frac{t_p + t_n}{t_p + f_p + t_n + f_n} \times 100 \tag{5}$$
$$\mathrm{Rec}\ (\%) = \frac{t_p}{t_p + f_n} \times 100 \tag{6}$$
$$\mathrm{Pre}\ (\%) = \frac{t_p}{t_p + f_p} \times 100 \tag{7}$$
$$\text{f1-score}\ (\%) = \frac{2 \times (\mathrm{Pre} \times \mathrm{Rec})}{\mathrm{Pre} + \mathrm{Rec}} \times 100 \tag{8}$$
In Equation (8), Pre and Rec are taken as fractions rather than percentages.
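As a quick numerical check of these definitions, the sketch below computes the four count-based metrics from a confusion matrix. The counts are invented for illustration and do not correspond to any result in this paper; the function name is hypothetical, and the f1-score is computed from precision and recall expressed as fractions before conversion to a percentage.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, recall, precision, and f1-score as percentages."""
    total = tp + fp + tn + fn
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "ACC": (tp + tn) / total * 100,
        "Rec": recall * 100,
        "Pre": precision * 100,
        "f1": f1 * 100,
    }


# Made-up confusion-matrix counts.
metrics = classification_metrics(tp=45, fp=5, tn=40, fn=10)
print({name: round(value, 1) for name, value in metrics.items()})
# {'ACC': 85.0, 'Rec': 81.8, 'Pre': 90.0, 'f1': 85.7}
```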

Experimental Setup
To perform the classification of adult and child, 80% of the dataset was used as the training set and 20% as the testing set. For all statistical analyses, Python version 3.9 and Scikit-learn version 1.0.2 were used. We used Windows 10 21H1 (build 19043.1151) 64-bit with an Intel(R) Core(TM) i5-10400 processor and 16 GB of RAM.

Baseline Characteristics of Adult and Child
The baseline characteristics of adults and children for the handwritten text and pattern datasets are presented in Table 4. For the handwritten text dataset, the proportions of adults and children were 45.6% and 54.4%, respectively. Among them, 42.7% of adults and 59.3% of children were female. The average ages of adults and children for the handwritten text dataset were 27.3 ± 10.5 and 12.5 ± 0.3 years, respectively.
For the handwritten pattern dataset, the average ages of adults and children were 23.9 ± 4.9 and 11.8 ± 1.6 years, respectively. The overall proportion of females was 59.3%. Approximately 64.6% of adults and 35.4% of children were female. Age and gender (except gender for the handwritten text data) were significantly associated with the adult and child groups for both the handwritten text and pattern datasets (p-value < 0.05).

Hyperparameter Tuning
For the classification tasks, we set the following hyperparameters for SVM as cost ... We implemented a grid search algorithm to tune these hyperparameters and chose the hyperparameter values that provided the highest classification accuracy.
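A grid search evaluated with five-fold cross-validation can be sketched as below. The parameter names `C` and `gamma`, the candidate values, and the scoring function `toy_fold_score` are assumptions made for the example (the actual grids are not listed here), and `toy_fold_score` merely stands in for "train on the training folds and return the accuracy on the validation fold". Scikit-learn's `GridSearchCV` with a `param_grid` and `cv=5` provides this functionality for real estimators.

```python
import itertools
import math


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs for k interleaved folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, val_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, val_idx


def grid_search(param_grid, fold_score, n_samples, k=5):
    """Return (best_params, best_mean_score) over all parameter combinations."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [fold_score(params, tr, va) for tr, va in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


def toy_fold_score(params, train_idx, val_idx):
    """Hypothetical stand-in for a validation accuracy; peaks at C=10, gamma=0.01."""
    penalty = abs(math.log10(params["C"]) - 1) + abs(math.log10(params["gamma"]) + 2)
    return 1.0 - 0.1 * penalty


grid = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]}
best_params, best_score = grid_search(grid, toy_fold_score, n_samples=57)
print(best_params)  # {'C': 10, 'gamma': 0.01}
```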

Experiment-1: Evaluation for Handwritten Text Dataset
In this experiment, we used different types of handwritten texts and extracted various types of features from each task. We applied SVM and RF classifiers to classify adult and child and calculated the classification accuracy. We used 30 hiragana words and extracted the 30 features explained in Table 3. Table 5 shows the performance scores of SVM and RF for the best feature combinations of the handwritten text dataset. SVM with the RBF kernel produced a classification accuracy of 87.7% for the combination of 15 selected features out of 30 features. Moreover, SVM produced 92.4% recall, 85.9% precision, 89.1% f1-score, and 0.919 AUC for the selected 15 features, whereas the RF classifier achieved an excellent classification accuracy of 93.5% along with 95.7% recall, 92.2% precision, 93.9% f1-score, and 0.983 AUC for the combination of 18 selected features. Therefore, RF achieved better performance than SVM. SFFS selected 15 features with the SVM classifier and 18 features with the RF classifier. A total of 11 features were common to the two selections; they are shown in Figure 4 and listed in Table 6. These 11 common features were used as input features, and we then applied the SVM and RF classifiers to distinguish adults from children.
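Taking the features common to two selections amounts to a set intersection. The short sketch below illustrates this with invented selections; the feature names are borrowed from Table 3, but the two lists are hypothetical and are not the subsets actually selected in this study.

```python
# Hypothetical feature subsets selected by SFFS with each classifier.
svm_selected = ["Length", "PressureMean", "PressureSD", "PCMax", "GripAngleSDW"]
rf_selected = ["Length", "PressureSD", "PCMax", "PCMin", "GripAngleSDL", "PCAvgPos"]

# Keep the order of the first list while testing membership in the second.
rf_set = set(rf_selected)
common = [name for name in svm_selected if name in rf_set]
print(common)  # ['Length', 'PressureSD', 'PCMax']
```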
The performance scores of the SVM and RF classifiers for the 11 common features are shown in Table 7. SVM with the RBF kernel provided 87.4% accuracy, 90.8% recall, 86.6% precision, 88.7% f1-score, and 0.947 AUC, whereas RF gave 91.5% accuracy, 93.0% recall, 91.5% precision, 92.3% f1-score, and 0.967 AUC. Finally, we may conclude that RF had better performance scores than SVM for the prediction of adult and child for the handwritten text dataset.

Experiment-2: Evaluation for Handwritten Pattern Dataset
To evaluate our proposed model, we used the handwritten pattern dataset and obtained a classification accuracy of up to 89.8%. In this section, we performed two experiments. First, the best combination of features was identified using SFFS-based RF and SVM classifiers. We chose the feature combination at which the classification model provided the highest classification accuracy. The classification accuracy of RF and SVM for the handwritten pattern dataset is presented in Table 8. For the trace of the zigzag line, RF produced 83.3% classification accuracy for the combination of 19 selected features, whereas SVM produced 71.4% accuracy for the combination of 26 selected features. For the prediction of the zigzag line, the RF classifier obtained the highest classification accuracy of 85.7% for a combination of 13 selected features, whereas SVM provided 75.5% classification accuracy for 3 selected features. For the prediction and trace of the PL line, RF achieved 73.5% classification accuracy for the combination of 24 selected features, whereas SVM achieved 79.6% accuracy for 7 selected features and 87.7% accuracy for 12 selected features. For the combination of all handwritten patterns, the RF classifier provided a good classification accuracy of 85.6% with 25 features, whereas SVM provided 82.1% classification accuracy with 28 features. Therefore, RF achieved better classification accuracy (89.8%) than SVM for the prediction of the PL line. The second experiment was to take the common features from the two best combinations of feature sets and apply the two classifiers for the prediction of adult and child.
The number of selected common features was 18 for the trace of the zigzag line, 2 for the prediction of the zigzag line, 7 for the trace of the PL line, 9 for the prediction of the PL line, and 23 for all handwritten patterns (zigzag and PL lines), as shown in Figure 5; the corresponding list of selected common features is presented in Table 9. The classification accuracies of RF and SVM for these common features are presented in Table 10. The RF classifier provided higher classification accuracies than SVM: 79.5%, 73.4%, 83.6%, and 89.8% for the trace and prediction of the zigzag line and the trace and prediction of the PL line, respectively. On the other hand, the SVM classifier provided 85.1% accuracy for all handwritten patterns, whereas the RF classifier gave 84.1% accuracy. The recall, precision, f1-score, and AUC of RF and SVM for the common features of the handwritten pattern dataset are presented in Table 11. The RF classifier achieved comparatively better performance than SVM for all types of lines. The RF classifier provided a higher recall of 87.0%, precision of 93.1%, f1-score of 90.0%, and AUC of 0.903 for the prediction of the PL line, whereas SVM gave 83.8% recall and 0.811 AUC. Table 11 shows that the highest performance scores were achieved by RF for the four types of lines with all handwritten patterns. Finally, we can say that in our experiment, RF performed better than SVM.

Comparison of Our Proposed Method with the Existing Method
The comparison of the classification accuracy of our proposed method with existing methods in the literature is presented in Table 12. Guimaraes et al. (2017) [4] applied different ML algorithms such as multilayer perceptron (MLP), deep convolutional neural network (DCNN), decision tree (DT), RF, and SVM for the classification of adult and teenager age groups based on sentences. They collected 7000 sentences for the classification of age groups (teenager vs. adult). They showed that DCNN had a better performance and obtained 95.0% precision. Rizwan et al. (2021) [31] proposed a novel method for the classification of human age. They extracted features using interior angle formulation, anthropometric model, cranio-facial development, wrinkle detection, and heat maps. The best combination of feature sets was selected using SFS. They adopted CNN to classify human age and achieved 94.6% classification accuracy. Özkan and Turan (2018) [32] proposed a deep learning (DL) algorithm for the classification of people based on their age. They divided the people into 12 classes using age groups and collected 18,000 images. They took 10% of the images for testing and the rest of the images for training. They showed that the DL model can correctly classify people into different age groups and achieved 78.5% classification accuracy. The authors of [34] investigated a novel biometric method for the classification of human age. For classification, RF, SVM, linear regression (LR), ridge regression (RR), polynomial regression (PR), and ANN were used. They collected data from a total of 837 subjects aged 6-60 years to evaluate the proposed biometric system. They showed that RF produced the highest classification accuracy of 92.0%. Voice is also used for user authentication and identification. Voiceprints were used in various forensic approaches to classify age, gender, and language. Reade et al. (2015) [35] conducted a study for the classification of adult, child, and senior using a face image dataset. They extracted features using HOG, local binary pattern, and active appearance model. They adopted k-NN, SVM, and GB algorithms for the classification of adult, child, and senior and achieved 82.0% classification accuracy. Tin (2012) [36] applied PCA for the classification of age using face images and produced the highest classification accuracy of 92.5%. Our proposed SFFS with RF (SFFS-RF) model produced higher accuracy compared to SFFS with SVM (SFFS-SVM) to classify adult and child based on their handwritten text and handwritten pattern.

Conclusions
The purpose of this study was to clarify changes in the development of handwritten text and pattern between adult and child. Online handwritten text and pattern datasets were collected using a pen tablet system. We utilized SFFS for feature selection and adopted two classification algorithms, RF and SVM, for the classification of adult and child. We selected the common features from the SFFS-RF and SFFS-SVM classifiers and then also applied the RF and SVM classifiers to these features for the classification of adult and child. Our proposed SFFS with RF classifier produced 93.5% accuracy for 18 features on the handwritten text dataset and 89.8% accuracy for 9 features on the handwritten pattern dataset. After identifying the common features, SFFS-RF also produced 91.5% and 87.7% classification accuracy for the handwritten text and handwritten pattern datasets, respectively. We hope that this study will provide evidence of the possibility of classifying adults and children based on their handwritten text and handwritten pattern data. If we can find out the age range between adult and child, that will help our model to produce an estimated performance accuracy.

Funding: This work was supported by the Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research (KAKENHI), Japan (Grant Numbers JP20K11892, awarded to Jungpil Shin, and JP21H00891, awarded to Akira Yasumura).

Institutional Review Board Statement:
The study was conducted in accordance with the Declaration of Helsinki, and approved by The Ethics Committee of the Graduate School of Social and Cultural Sciences of Kumamoto University (approval number: 45, approved on: 25 May 2021).