Meta-Analyzing the Writing Process of Structural Language to Develop New Writing Analysis Elements

Abstract: Handwriting is a basis of communication, and a writer is often identified through their handwriting characteristics. In clinical practice, static elements of handwriting are evaluated and scored, which may result in subjective judgments of health status. In this study, we investigate the dynamic spatial information produced when writing Hangul, present a way to analyze Hangul writing characteristics, and build new writing analysis elements for this structural language. The sample characters comprised 14 consonants and 10 vowels. The cloud of line distribution (COLD) method was used to visualize on-stroke characteristics when writing each character. If the written character showed a straight line (the angle of the letter being 0), the feature distribution appeared on the x-axis of the polar domain. If the written character had many kinks (the angle of the letter being −90 or 90), the feature distribution appeared on the y-axis of the polar domain. In-air movement was visualized using principal component analysis (PCA), and typical in-air movement had an annular shape, which might be used as a new element in handwriting analysis. This study shows the possibility of using such a tool for the writing analysis of structural languages.


Introduction
Text is the basis of communication and involves sophisticated activities for describing and expressing thoughts. Writing is a visual representation of the integration of perception, cognitive processing, and motor planning and execution [1][2][3][4]. Writing is also a high-dimensional function that involves various individual characteristics and is performed through continuous behavioral changes. Therefore, it is widely understood as a process characterized by spatial and kinetic parameters, rather than a mere product of hand and finger movements [5,6]. Accordingly, pattern and handwriting recognition studies are being conducted to analyze individual cursive characteristics for learning, document analysis, criminal investigation, signature verification, language translation, and disease prediction [7][8][9][10][11][12]. Writing is an important task in childhood development because writing-related difficulties in children are likely to negatively affect academic achievement and cause social and emotional problems, which highlights the importance of early screening, intervention, and examination [13][14][15]. A disadvantage of clinical evaluation is that accurate analysis is difficult to achieve, because scoring systems focus on motor factors or on evaluation methods that allow evaluators to interpret results subjectively [4,16]. However, even when writers [...] language therapy, and Korean linguistics and language education. In this study, we aimed to present basic data on Hangul cursive writing characteristics and new writing analysis elements by utilizing the three-dimensional movement and dynamic spatial information that appear when writing basic consonants on tablets.
Appl. Sci. 2020, 10, x FOR PEER REVIEW

Subjects
All 24 participants (age: 22.5 ± 2.35 years) were diagnosed as being free of neurological conditions. The test sheet contained the 24 basic consonants and vowels of Hangul. The consonants were arranged in order on the upper line, and the vowels in order on the lower line (Figure 3) [4,33]. Each compartment of the test sheet measured 1.7 × 1.7 cm.

Tablet
Using a tablet (Wacom Cintiq 13HD, Wacom Co., Saitama, Japan) and MovAlyzeR software (NeuroScript, LLC, Tempe, AZ, USA), the writing test was rendered on the tablet screen at the same size as A4 paper. Coordinates were saved as soon as the tablet pen touched the screen, and in-air coordinates were also collected over time. After separating in-air pen movements (in the air) from on-stroke pen movements (on the tablet screen), the data were extracted by calculating the size, pressure (maximum 1024) [34], in-air and on-stroke times, and velocity at a sampling rate of 200 Hz.
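The separation of on-stroke and in-air movement by pen pressure can be sketched as follows. This is a minimal illustration, not the MovAlyzeR implementation: the `Sample` record, the function names, and the use of raw tablet coordinate units are assumptions made for the example; only the 200 Hz sampling rate and the pressure-based split come from the text.

```python
from dataclasses import dataclass
from math import hypot

SAMPLE_RATE_HZ = 200  # sampling rate reported in the text


@dataclass
class Sample:
    """One pen sample (hypothetical record layout)."""
    x: float
    y: float
    pressure: int  # 0 while the pen is in the air, above 0 on the screen


def split_segments(samples):
    """Group consecutive samples into ('on_stroke' | 'in_air', [samples]) runs."""
    segments = []
    for s in samples:
        kind = "on_stroke" if s.pressure > 0 else "in_air"
        if segments and segments[-1][0] == kind:
            segments[-1][1].append(s)
        else:
            segments.append((kind, [s]))
    return segments


def duration_s(segment_samples):
    """Duration of a run in seconds, one sample per 1/200 s."""
    return len(segment_samples) / SAMPLE_RATE_HZ


def mean_speed(segment_samples):
    """Mean speed in coordinate units per second along the sampled path."""
    if len(segment_samples) < 2:
        return 0.0
    path = sum(
        hypot(b.x - a.x, b.y - a.y)
        for a, b in zip(segment_samples, segment_samples[1:])
    )
    return path * SAMPLE_RATE_HZ / (len(segment_samples) - 1)
```

Each run can then be summarized separately, for example by summing `duration_s` over all in-air runs to obtain the total in-air time of a writer.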

Cloud of Line Distribution (COLD)
COLD is a method that finds dominant points on a curve and expresses the characteristics of the lines between them in the polar domain [17,35]. In this study, on-stroke and in-air movements were separated by writing pressure using the developed system. To analyze the distribution pattern of static elements, only samples with a pressure above zero were recognized and displayed on the screen, because the pressure is above zero only while the writer is writing on the screen. The consonants and vowels were then cropped to 70 × 70 pixels and stored as images. The cropped images underwent Otsu binarization, and the boundaries of the letters were found with the Canny edge detector, which is not sensitive to noise and extracts strong edges [19].
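The binarization step mentioned above can be illustrated with a small pure-Python version of Otsu's method, which picks the gray level that maximizes the between-class variance. This is a sketch under assumptions: the function names are hypothetical, the image is given as a flat list of 8-bit gray values, and treating values above the threshold as 1 is a choice made for the example. The Canny edge step is not shown.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance
    when pixels are split into {<= t} and {> t}."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below t
    sum0 = 0  # sum of gray values at or below t
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no pixels in the lower class yet
        w1 = total - w0
        if w1 == 0:
            break  # upper class is empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, t):
    """Map each pixel to 1 if it is brighter than t, else 0."""
    return [1 if p > t else 0 for p in pixels]
```

For dark ink on a light background, the stroke pixels are those mapped to 0; the polarity would be inverted for light strokes on a dark canvas.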
Along a letter's boundary, each vector was denoted v and its length l. The dominant-point coordinates $a = \{P_i(x_i, y_i) \mid i = 1, 2, 3, \dots, n\}$ were obtained using Equation (1). In this study, the number of dominant points was expressed as n, the coordinates were $(x_i, y_i)$, and the number was set to eight.
The coordinates $(\theta, \gamma)$ were calculated based on Equations (2) and (3) by setting the image angle $\theta$ and line segment $\gamma$ for representation in the polar domain. In addition, line segments with k = 1 were calculated and used to clearly show the character characteristics in this study [8,17]. Later, the distribution pattern was saved as a 420 × 560 TIF file.
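Equations (2) and (3) are not reproduced in this text, so the following is a hedged sketch of the polar representation rather than the authors' exact formulas. It assumes the usual COLD form, in which the line between dominant points $P_i$ and $P_{i+k}$ has angle $\theta = \arctan(\Delta y / \Delta x)$ and length $\gamma = \sqrt{\Delta x^2 + \Delta y^2}$. The function names, the non-cyclic pairing of points, and the mapping of vertical lines to $\pi/2$ are assumptions of the example.

```python
from math import atan, cos, hypot, pi, sin


def cold_features(points, k=1):
    """Return (theta, gamma) for each pair (P_i, P_{i+k}) of dominant points.

    theta lies in (-pi/2, pi/2]; gamma is the non-negative segment length.
    """
    feats = []
    for (x0, y0), (x1, y1) in zip(points, points[k:]):
        dx, dy = x1 - x0, y1 - y0
        gamma = hypot(dx, dy)
        if gamma == 0:
            continue  # coincident points define no line
        # Vertical segments have no finite slope; map them to pi/2 (assumption).
        theta = pi / 2 if dx == 0 else atan(dy / dx)
        feats.append((theta, gamma))
    return feats


def to_plot_xy(theta, gamma):
    """Convert a polar feature to Cartesian plot coordinates."""
    return gamma * cos(theta), gamma * sin(theta)
```

With this convention a horizontal segment lands on the x-axis of the plot, a vertical segment on the y-axis, and every point has a non-negative x-coordinate, which matches the right-skewed appearance of the distributions.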

Principal Component Analysis (PCA)
As the number of features in a dataset increases, so does its dimension, and the volume of the data space grows exponentially. The data therefore become sparse, and the distance between data points also increases; this is called the curse of dimensionality. As a technique to address this, PCA converts samples from a high-dimensional space into a low-dimensional space of linearly uncorrelated components, with the aim of finding a new basis that preserves the distribution of the original data as much as possible [36][37][38][39]. In this study, only coordinates with a pressure of zero were used to analyze movement in space, because the pressure is zero while the pen moves in the air. Mean centering, Equation (4), was applied to the raw data X so that the first component represented the direction of maximum variance when performing PCA.
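Equations (4) and (5) are referenced but not reproduced in this text. As a sketch only, and assuming the standard PCA formulation with X an m × n matrix whose rows are samples, the steps can be written as follows. The $1/(m-1)$ normalization and the symbols $\tilde{X}$, $\bar{x}$, $E$, and $Y$ are assumptions of this sketch.

```latex
% Mean centering (assumed form of Equation (4))
\bar{x} = \frac{1}{m} X^{\top} \mathbf{1}_m, \qquad
\tilde{X} = X - \mathbf{1}_m \bar{x}^{\top}

% Covariance matrix and its eigen-decomposition (assumed form of Equation (5))
C = \frac{1}{m-1} \tilde{X}^{\top} \tilde{X}, \qquad
C \vec{e}_j = \lambda_j \vec{e}_j, \quad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n

% Projection onto the first d eigenvectors and the fraction of variance retained
Y = \tilde{X} E, \quad E = [\vec{e}_1, \dots, \vec{e}_d], \qquad
\frac{\sum_{j=1}^{d} \lambda_j}{\sum_{j=1}^{n} \lambda_j}
```

For the one-dimensional reduction of three-dimensional pen movement, n = 3 and d = 1, so the retained fraction is $\lambda_1 / (\lambda_1 + \lambda_2 + \lambda_3)$.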
After calculating the eigenvalues $\lambda$ and eigenvectors $\vec{e}$ of the covariance matrix C, Equation (5), of the m × n data, we took the inner product of the eigenvectors and X to obtain a new, mutually orthogonal basis and the transformed data [36].

COLD Features
The COLD results were skewed to the right because the range of the arctangent is $-\pi/2 < \tan^{-1} < \pi/2$ and the line segments were taken as absolute values. The absolute value was justified because the orientation of the length did not need to be considered, and a distinct difference was visually observable when the distribution characteristics were brought together. In the polar domain, features were densely distributed along the x-axis when the angle differed little, and along the y-axis when the angle differed significantly. Furthermore, short line segments were represented by points at the center of the polar domain, while long line segments were represented by points far from the center. ㅡ is a typical example of a line pattern with little difference in angles. Meanwhile, ㅣ is a typical example of a short pattern, although there are many differences in angles. In addition, ㅅ is a typical example of a character with a characteristically symmetrical form, for which the COLD results also showed angles and lengths symmetrical about zero degrees on the x-axis. ㅇ and ㅎ are circular characters whose angles vary widely, but whose point-to-point distances are short, which makes them representative examples of a short pattern. These results explain the general distribution patterns of on-stroke movements for each consonant and vowel. If the COLD results of other writers differ from the distribution patterns of this study, this indicates that the characters were written with different shapes, curved strokes, stair-like forms, or unusually short or long strokes. As such, one distribution pattern identifies the character characteristics of the writer and allows for an analysis of the individual.

PCA Features
Figure 5 presents the results of 10 PCAs. Each color represents one writer's results. In Figure 5, the distribution of data is divided into two sides. This is because writing generally proceeds from left to right, as shown in Figure 3. After writing all 14 consonants, participants changed lines, so the y-coordinate changed greatly and the pen moved from right to left before writing resumed. Therefore, the upper distribution (Figure 6b) is the result for consonants, and the lower one (Figure 6c) is the distribution for vowels. Here, the x-axis is the row number, and the y-axis is the result of PCA. With PCA, the larger the variance, the wider the spread of the dots along the axis, and the more information is available. Reducing three-dimensional motion to one dimension may cause information loss, but in this study an average of 97.59% of the data was preserved, which enabled the use of reliable data. A short stay in the air produced a gently sloped brown line relative to the baseline, while a long stay in the air produced a steep pink line (Figure 6a). In-air movement was also observed in this case (Figure 6b,c), where the turquoise data show a ring form and the pink data do not. The ring shape appeared because, in Hangul, the point where a stroke ends is lower than the starting point, and the movement from the end point to the starting point was constant. Thus, when the PCA results showed no ring figure, either the pen had been moved somewhere else or there was hesitation near the point at which the stroke ends. Figure 6b,c show a void in the middle of the pink line, indicating that neither hesitation in writing nor in-air movement was detected. This also means that the pen was taken away from the tablet and then brought back to resume writing. This allows writers to see whether their concentration and movement remain constant while writing and to visually check the time spent writing.
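The reduction of three-dimensional in-air coordinates to a single principal component, together with the share of variance that the component retains, can be sketched as follows. This is an illustration under assumptions, not the authors' code: the function names are hypothetical, the share of variance is taken here as the interpretation of the preserved percentage, and power iteration is swapped in for a full eigen-decomposition so that the example needs only the standard library.

```python
from math import sqrt


def mean_center(rows):
    """Subtract the column means from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of mean-centered rows."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_component(cov, iterations=200):
    """Dominant eigenvector and eigenvalue by power iteration.

    Assumes the start vector is not orthogonal to the dominant eigenvector.
    """
    d = len(cov)
    v = [1.0 / sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # zero covariance: no direction to find
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the eigenvalue estimate for unit v.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return v, eigenvalue


def project_to_1d(rows):
    """Return (scores on the first component, fraction of variance retained)."""
    centered = mean_center(rows)
    cov = covariance(centered)
    v, lam = first_component(cov)
    scores = [sum(a * b for a, b in zip(row, v)) for row in centered]
    total = sum(cov[i][i] for i in range(len(cov)))  # trace = total variance
    return scores, (lam / total if total else 0.0)
```

Plotting the returned scores against the row number gives a curve of the kind described above, and the second return value is the per-writer quantity that would be averaged to report how much variance the first component retains.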

Discussion
As the basis of sophisticated communication, writing is a visual representation of integrated perception, cognitive processing, and motor planning and execution. Writing is also a high-dimensional function in which a person's unique characteristics are revealed through continuous behavioral changes. Handwriting activates specific areas of the brain, which allows it to predict reading and mathematical abilities in addition to writing abilities [26]. Therefore, detailed, sophisticated handwriting analysis is needed; however, evaluators and examiners currently rely on a paper test sheet in a defined environment. Furthermore, clinical writing is assessed using a scoring system that judges the form, presentation, and writing ability of a writing sample without providing any specific measurements. This highlights the need for tool development in the area of assessment and intervention. Such a tool requires careful analysis and cannot be overlooked in the interest of proper intervention for individuals who stand to benefit [40,41].
In this study, on-stroke COLD characteristics and in-air PCA, both derived from the dynamic information obtained from tablet writing, were examined to report on their potential as writing patterns and analysis elements. The experiment was conducted on 24 adults (age: 22.5 ± 2.35) who did not have neurological conditions. Judging only from the morphological appearance in the COLD distribution, ㄱ, ㄴ, ㅏ, ㅓ, ㅗ, and ㅜ all consist of two lines, one horizontal and one vertical, and have similar patterns. In addition, if the sequence of strokes is not considered in the actual writing experiment, ㅓ and ㅜ can be written similarly to ㄱ, because ㅓ and ㅜ are written in one stroke. In the case of ㅗ and ㅜ, the horizontal line takes a longer form because the ㅡ line is longer. Therefore, the dots were concentrated on the x-axis rather than the y-axis. ㄷ, ㅑ, ㅕ, and ㅠ each add one line to the preceding characters, and were visually confirmed as having a similar pattern. ㅑ and ㅕ have a slightly curved shape, rather than a horizontal one, because the two horizontal strokes are written at angles that are not parallel to the horizontal line. Thus, the distribution of points is symmetrically spread around the x-axis. ㅋ differs from ㄱ by one stroke, yet the patterns did not look similar because of differences in the order, length, and angle of the strokes. Therefore, ㅋ could produce a distribution of points in the center rather than along the x- and y-axes. ㅛ also adds a stroke to ㅗ, while the ㅡ section was often short when only the vowels were written, with each writer showing a visible difference in stroke order and length. This explains why the distribution along the y-axis is shorter and why the points for ㅛ are slightly spread rather than concentrated on the x-axis. ㅁ and ㅂ are similar consonants, but the distribution results showed visible differences. With respect to ㅁ, there were many scribbles where writers wrote in single strokes or in the form of upside-down triangles, such as ㅇ. Thus, the horizontal and vertical stroke features were found in the polar domain because there were various types of writing. ㄹ and ㅇ had many variations in stroke order, so various distribution results were visible without showing a direction. ㅌ and ㅍ have many handwriting formats that do not take the stroke sequence into account, and each writer can have a different style. Therefore, different patterns can be identified visually. ㅅ, ㅈ, and ㅊ show more pronounced angle variations in their strokes than other consonants. ㅅ had a long, visible stroke at the change in angle. Thus, we could see a shape symmetrical about the x-axis rather than a distribution along both the x-axis and y-axis. In the case of ㅊ and ㅎ, the characters are made by adding strokes to ㅈ and ㅇ, respectively. Each writer used different strokes, which led to results that differed from those of other consonants. This means that the changes in angle and length were different for each writer, and various distributions around the center could be seen. The relationship between the speeds in the vertical and horizontal directions depended on both the writer and the letter. This can be used to provide additional information and improve the system's performance during handwriting analysis.
Since handwriting is associated with several movements and motor skills, pauses and hesitations within a series of handwriting acts influence writing ability [42,43]. Indeed, skilled writers' in-air movements are less fragile, and little movement in space can be seen. However, unskilled writers' in-air movements (for example, the movements of patients with writing disorders, dyslexia, and Parkinson's disease) can be observed as inconsistent and brittle [29]. This study analyzed in-air movements that were obtained in real time through the visualization of PCA. Skilled writers showed an annular appearance, whereas unskilled writers who were unable to concentrate, who hesitated, or who removed their hands from the tablet and rewrote produced S-shaped or empty appearances. Overall, this study's results suggest that in-air analysis is possible through morphological patterns and the visualization of PCA, and that this can be used as a new element of handwriting analysis.
The development of digital devices has allowed new systems to be used at various learning sites, thus increasing the need for new analytical elements that provide a thorough assessment of both users' learning abilities and performance procedures. Studies that aim to fulfill these needs are underway [44][45][46]. Bonneton-Botté et al. [22] and Neumann [47] report on tablets applied to children's learning, objectifying the added value of the digital learning environment. Meanwhile, Rosenblum et al. [48] and Vessio [49] have indicated that Parkinson's disease patients' on-stroke and in-air handwriting features could provide meaningful applications for cost-effective, rapid, and reliable medical diagnoses. Moreover, Sesa-Nogueras [50] has suggested that in-air movements can offer as much information as on-stroke movements and can also be used in recognition. In addition, Drotár [29] showed that Parkinson's disease can be assessed by analyzing on-surface and in-air movements, and Moetesum et al. [51] showed the same using dynamic information of writing. Healthy individuals and patients can be distinguished because patients with Parkinson's disease produce irregular, non-smooth Archimedean spiral patterns whose size and shape gradually change. Therefore, it is believed that this study's results, namely the on-stroke COLD pattern and in-air PCA, will be the basis for writing analysis. Furthermore, since Hangul is uncomplicated and concisely expressed, with a clear visual distinction when compared to other languages, it can be used as basic data for structural language. Because it consists of basic strokes in the form of straight lines, circles, and squares, it can also be used in data and analysis systems for learning evaluations based on basic shapes. Other applications include medical diagnosis and rehabilitation process analysis using, for instance, Parkinson's disease patients' writing, as well as write-through analysis.

Conclusions
As the most sophisticated communication skill, writing is an evaluation factor that measures children's development and visual-motor integration in adults; however, at present, it is assessed only through static feature elements and evaluators' subjective interpretations. In addition, as learning with digital devices increases, new analytical elements are needed for tools and methods that analyze writing digitally. Therefore, in this study, objective and consistent data were obtained using tablets to present basic data on the writing characteristics of Hangul, a structural language, and new writing analysis elements. Static elements were then analyzed using COLD, and movements during writing were analyzed using PCA. First, COLD was used to visually identify the characteristics of each character. Characters consisting of horizontal and vertical lines were distributed on the x- and y-axes, respectively. However, for letters with several kinks or variations in angle, such as ㅇ or ㅎ, the distribution patterns were independent of the axes. In addition, the shorter the line segments, the more the dots were concentrated around the center of the polar domain, whereas longer lines were scattered far from it. Next, in the PCA results, the shorter the time spent in in-air movement, the gentler the slope relative to the baseline, and the longer the time, the steeper the slope. In addition, writing movements in space generally showed ring-shaped features. This study has shown that visual analysis of patterns in static elements and the evaluation of dynamic elements of movement are possible using tablets. In addition, a new analytical element was presented as a digital writing evaluation tool. In future work, we intend to study structural letters, words, and sentences that combine consonants and vowels. We also plan to develop a system that accumulates data as big data in order to examine writing disorder interventions and assessments, and rehabilitation or learning progress.