Article

Graph Convolutional Network with Agent Attention for Recognizing Digital Ink Chinese Characters Written by International Students

Huafen Xu and Xiwen Zhang

1 College of Computer Science and Engineering, North China Institute of Science and Technology, Langfang 065201, China
2 College of Information Science, Beijing Language and Culture University, Beijing 100083, China
* Authors to whom correspondence should be addressed.
Information 2025, 16(9), 729; https://doi.org/10.3390/info16090729
Submission received: 25 July 2025 / Revised: 20 August 2025 / Accepted: 22 August 2025 / Published: 25 August 2025
(This article belongs to the Section Artificial Intelligence)

Abstract

Digital ink Chinese characters (DICCs) written by international students often contain various errors and irregularities, making their recognition a highly challenging pattern recognition problem. This paper designs a graph convolutional network with agent attention (GCNAA) for recognizing DICCs written by international students. Each sampling point is treated as a vertex in a graph, and the connections between adjacent sampling points within the same stroke serve as edges, forming a Chinese character graph structure. The GCNAA processes this graph-structured data and is implemented by stacking Block modules. In each Block module, the graph agent attention module not only models the global context between graph nodes but also reduces computational complexity, shortens training time, and accelerates inference. The graph convolution block module models the local adjacency structure of the graph by aggregating local geometric information from neighboring nodes, while graph pooling is employed to learn multi-resolution features. Finally, the Softmax function generates the prediction results. Experiments on public datasets, including CASIA-OLHWDB1.0-1.2, SCUT-COUCH2009 GB1&GB2, and HIT-OR3C-ONLINE, demonstrate that the GCNAA performs well even on large-category datasets, showing strong generalization ability and robustness. The recognition accuracy for DICCs written by international students reaches 98.7%. Accurate and efficient handwritten Chinese character recognition technology can provide a solid technical foundation for computer-assisted Chinese character writing for international students, thereby promoting the development of international Chinese character education.

1. Introduction

With the international promotion of the Chinese language, Chinese language learning is receiving growing emphasis, and the number of foreign students who study Chinese as a second language (hereinafter "international students") is steadily increasing. Unlike the linear arrangement of letters in alphabetic scripts such as Pinyin, Chinese characters, with their complex structure and semantic characteristics, have become a key difficulty in international Chinese education. When learning Chinese characters, international students often unconsciously apply the thinking patterns and writing habits of their native language, which leads to deviations from standard writing in stroke order, direction, and structure, as well as overall structural imbalance and non-standard stroke forms. They are also prone to extra or missing strokes, improperly connected strokes, broken strokes, and other issues during writing. Moreover, students from different cultural backgrounds write in different styles, which further increases the difficulty of recognition. Figure 1 shows some samples of digital ink Chinese characters (DICCs) written by international students.
The DICCs collected with digital paper and pen, including the coordinates, pressure values, time information, and stroke information of the sampling points, provide a rich source of data for digital ink Chinese character recognition (DICCR) for international students. Most existing DICCR methods focus on native Chinese speakers. To better serve the international student community, it is essential to develop a DICCR method tailored to the writing habits and frequent errors of international students. Such a method not only enhances recognition accuracy but also improves the quality and efficiency of Chinese language education, thereby further facilitating the global dissemination and development of Chinese.
Currently popular deep learning methods for DICC recognition treat DICCs as still images or as written trajectories, using two-dimensional convolutional neural networks (2D CNNs) [1,2,3] and recurrent neural networks (RNNs) [4,5,6], respectively. Two-dimensional CNNs gradually learn hierarchical features, from sampling points to the entire character, through multiple layers of convolution and pooling, which makes them naturally free of stroke-direction and stroke-order constraints. However, they fail to account for the time-dependent features of DICCs and incur relatively high storage and computational costs. RNNs are widely used for processing temporal data: they retain information from previous moments through hidden states and can thus model the dynamics of the writing process. However, traditional RNNs suffer from vanishing and exploding gradients, making long-range dependencies difficult to handle. To address this, Hochreiter and Schmidhuber proposed LSTM [7], whose gating units manage information flow, overcoming the vanishing gradients of traditional RNNs and effectively alleviating the long-term dependency problem. Still, the sequential processing of RNN and LSTM architectures limits parallelization, making long sequences inefficient to process and constraining stroke order/direction flexibility.
The underlying recognition mechanisms of 2D CNNs and RNNs differ fundamentally from human cognition, relying mainly on low-level relationship modeling: 2D CNNs learn low-level spatial correlations between adjacent pixels, while RNNs capture local patterns of adjacent points in a sequence. Neither approach models the geometric semantics of characters directly, resulting in limited interpretability. Addressing this, recent advances in graph neural networks [8,9,10] have enabled novel solutions. Notably, Gan et al. [11] introduced a skeletal graph representation for Chinese characters, developing the PyGT framework that integrates Transformer [12] and GCN [8] architectures. This structure-aware computational paradigm more closely approximates human cognitive processes in character recognition. Although PyGT uses Transformers to capture global information across graph nodes well, Transformers use Softmax attention [12], which computes the similarity between every query–key pair; its computational complexity is quadratic in the number of tokens, resulting in high computational cost and long inference time for PyGT.
We present GCNAA, an innovative graph convolutional network with agent attention for Chinese character recognition. GCNAA learns the inherent graph structure of DICCs, automatically ignores irrelevant information such as writing style and stroke order, and achieves high-precision DICCR. The agent attention mechanism in GCNAA represents a generalized linear attention method that not only significantly improves computational efficiency compared to the Softmax attention used in PyGT, but also maintains the ability to understand global context. This mechanism reduces computational complexity, thereby shortening the time required for model training and accelerating the inference process. Our core contributions include the following:
(1)
We propose representing DICCs of international students as graphs, which not only model the geometric structure of Chinese characters but also include the temporal and spatial information of sampling points.
(2)
For the first time, GCNAA is proposed to implement DICCR for international students. GCNAA has low computational complexity, fast inference, and a high recognition rate for international students' DICCs, even those containing writing errors and non-standard forms.
(3)
GCNs have previously been applied to the 3755-category first-level Chinese character recognition task, but their effectiveness has not been validated on larger-category tasks, such as recognizing the 6763 first- and second-level Chinese characters. This paper experimentally verifies the generalization performance of the GCNAA on 6763-category datasets using the publicly available datasets CASIA-OLHWDB1.0-1.2, SCUT-COUCH2009 GB1&GB2, and HIT-OR3C-ONLINE.

2. Related Work

DICCR represents a significant branch of pattern recognition research, with extensive utilization across multiple domains including mobile computing systems, human–machine interaction interfaces, and intelligent education platforms.

2.1. DICCR

Traditional recognition methods made great progress in preprocessing [13,14], feature extraction [15,16], stroke extraction [17], structure matching [18], statistical classification [19,20,21], and other aspects [22,23]. Jin et al. [24] extracted 8-directional features to form the original feature vector, applied LDA for dimensionality reduction, and reported benchmark results of a state-of-the-art recognition method on the SCUT-COUCH2009 dataset. Liu and colleagues [25] conducted a comprehensive evaluation of leading DICCR approaches on the OLHWDB1.0 and OLHWDB1.1 datasets, systematically assessing various normalization techniques, feature extraction methodologies, and classification algorithms. Qu et al. [26] developed a locality-sensitive sparse representation method for optimized prototype classification (LSROPC) in in-air handwritten Chinese character recognition. Bai and colleagues [17] proposed a DICCR method based on stroke names and whole-character structure; in 2020, they improved this method by employing a Hidden Conditional Random Field (HCRF) to classify stroke symbols [18]. Traditional recognition methods rely mainly on handcrafted features, which require strong domain expertise. Following the rapid progress in deep learning architectures, DICCR has evolved from traditional methods to deep learning, with significant improvements in recognition performance. According to the type of data processed, current deep learning approaches for DICCR fall into three methodological paradigms.
(1)
Image-based 2D-CNN method
The 2D-CNN method converts DICCs into images and treats Chinese character recognition as image classification [3,27], which loses the temporal information contained in DICCs. Yang et al. [2] and Zhang et al. [1] used domain-specific knowledge to extract feature images (such as eight-directional feature maps, path signatures, etc.) to fully utilize spatial and temporal information, significantly improving recognition performance. However, this method requires complex domain-specific knowledge, while also increasing computational costs and storage space requirements.
(2)
Sequence-based RNN and 1D-CNN methods
The recurrent neural network (RNN) method processes DICCs directly, thereby better utilizing the temporal and spatial information contained in the sequence data without requiring any domain knowledge to extract feature images, and it has achieved good results in DICCR. Zhang et al. [5] used LSTM and GRU networks to model temporal dependencies and achieved the best recognition level at the time on the ICDAR-2013 dataset. However, due to the serial computation of RNNs, long sequences are processed slowly. Gan et al. [28] and Xu et al. [29] proposed one-dimensional convolutional neural network (1D-CNN) methods, which compute faster than RNNs and can process long sequences in parallel. However, these methods all rely on temporal information and cannot achieve stroke-order freedom.
(3)
Graph-based GNN method
A graph can fully represent the two-dimensional structure of Chinese characters, and traditional graph-based methods describe DICCs as relational data structures such as attributed relational graphs. Zheng et al. [30] proposed a fuzzy attributed relational graph (FARG) to describe DICCs and achieved good results. However, the similarity measures defined on attributed relational graphs lack a statistical basis, and graph matching is complex, slow, weak in anti-interference ability, and difficult to train. With the rapid development of graph neural networks [8,9,10], Gan et al. [11] depicted Chinese characters with a skeletal graph structure and combined Transformers and GCNs in the PyGT architecture for Chinese character recognition, achieving good results. Although the Transformer's global attention mechanism has good expressive power, its high computational cost leads to long inference times in PyGT.

2.2. Attention Mechanism

The attention mechanism, widely adopted in NLP and computer vision, improves model performance by dynamically weighting important input regions during processing.
The Transformer, introduced by Vaswani et al. [12], revolutionized deep learning with its self-attention mechanism: it completely abandons RNNs and CNNs, relying on multi-head self-attention, positional encoding, and residual connections to achieve parallel global modeling. However, integrating the Transformer and self-attention mechanisms into vision poses significant challenges.
Contemporary Transformer architectures typically employ the Softmax attention mechanism [12], where attention weights are computed from query–key similarity, resulting in computational complexity quadratic in the token count. When deployed in vision tasks, the full-receptive-field Softmax attention mechanism suffers from extremely high computational overhead. In response, recent research [31,32,33,34] has focused on efficient attention mechanisms that reduce computational costs. For example, the Swin Transformer [33] narrows the receptive field by constraining self-attention computation within a local window, and PVT [32] adopts a sparse attention strategy that alleviates computational pressure by reducing the number of keys and values. Although these methods have achieved certain results, they weaken the modeling of long-range dependencies to varying degrees, and their performance remains inferior to global self-attention. Han et al. [35] therefore proposed a new attention paradigm, agent attention, which balances representational ability with computational efficiency.
This paper proposes the GCNAA (graph convolutional network with agent attention) method for Chinese character recognition by combining agent attention with graph convolution. GCNAA is a human-like learning method that abstracts the inherent graphic structure of Chinese characters, automatically ignores irrelevant information such as writing style and stroke order, and achieves high-precision handwritten character recognition. GCNAA uses agent attention to simplify the computation and reduce the model's computational complexity while still maintaining an understanding of the global context of the input data, which is crucial for capturing global dependencies in the graph and understanding complex patterns.

3. Constructing Skeleton Graphs of DICCs

Skeleton graphs of DICCs are constructed according to the trajectory information of DICCs. The specific method is as follows.
(1)
Extract coordinates from the original data to obtain a sequence of length N, $[[x_1, y_1, s_1], [x_2, y_2, s_2], \ldots, [x_N, y_N, s_N]]$, in which $x_i$ and $y_i$ are the x and y coordinates of sampling point $i$, and $s_i$ is the stroke to which sampling point $i$ belongs.
(2)
Normalize coordinates. While keeping the original aspect ratio, the coordinates of the sampling points are normalized into a standard coordinate system [5].
(3)
Resampling. The original trajectory sequence is resampled into a new sequence of equidistant points along the path. The coordinates (x, y) of each new point are calculated by linear interpolation, and its stroke marker is inherited directly from the preceding original point.
(4)
Build skeleton graphs. The skeleton graph is created by traversing the sampling points, taking the sampling points of a Chinese character as the vertices of the skeleton graph and the connecting lines between adjacent sampling points as its edges.
(5)
Merge nearby graph vertices.
Merging reduces the number of vertices in the new skeleton graph while letting each vertex carry more information, which keeps the original structure of the Chinese character intact yet effectively lowers computational demands. The procedure for merging nearby graph vertices is as follows.
Divide all vertices in the graph into multiple subsets based on a distance threshold. Specifically, traverse each vertex $P_i$; if $P_i$ has not yet been added to any subset, place it in a new subset. Then traverse every vertex $P_j$ after $P_i$ and compute the distance between $P_i$ and $P_j$; if it is less than the threshold, add $P_j$ to the subset that $P_i$ belongs to.
Generate a new skeleton graph from the resulting subsets. Specifically, traverse each subset and generate a new vertex: the mean of all vertices in the subset becomes a vertex of the new skeleton graph. Next, traverse every edge of the original graph. If the two vertices connected by an edge map to different subsets $i$ and $j$, and $[i, j]$ is not yet in the edge set of the new skeleton graph, add $[i, j]$ to that edge set; if the two vertices map to the same subset, ignore the edge.
The steps for constructing the skeleton graph of the Chinese character “啊” are presented in Figure 2.
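To make the construction procedure concrete, the following is a minimal NumPy sketch of steps (3)–(5): resampling, graph building, and vertex merging. The function names, the resampling step length, and the merge threshold are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def resample(points, step=0.05):
    """Step (3): resample a normalized trajectory [[x, y, s], ...] into
    equidistant points; (x, y) are linearly interpolated along each stroke,
    and the stroke id s is inherited from the preceding original point."""
    out = [list(points[0])]
    acc = 0.0                                  # distance since last emitted point
    for p, q in zip(points[:-1], points[1:]):
        p = list(p)
        if p[2] != q[2]:                       # pen-up: next stroke starts at q
            out.append(list(q))
            acc = 0.0
            continue
        seg = float(np.hypot(q[0] - p[0], q[1] - p[1]))
        while acc + seg >= step:
            t = (step - acc) / seg
            p = [p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]), p[2]]
            out.append(p)
            seg -= step - acc                  # distance left from new p to q
            acc = 0.0
        acc += seg
    return np.array(out)

def build_skeleton_graph(points):
    """Step (4): sampling points are vertices; consecutive points of the same
    stroke are joined by an edge."""
    edges = [(i, i + 1) for i in range(len(points) - 1)
             if points[i][2] == points[i + 1][2]]
    return points[:, :2], edges

def merge_vertices(verts, edges, thresh=0.03):
    """Step (5): greedily group vertices within `thresh` of a seed vertex,
    replace each group by its mean, and rewire edges between groups."""
    subset = -np.ones(len(verts), dtype=int)
    n_subsets = 0
    for i, v in enumerate(verts):
        if subset[i] >= 0:
            continue
        subset[i] = n_subsets
        for j in range(i + 1, len(verts)):
            if subset[j] < 0 and np.linalg.norm(verts[j] - v) < thresh:
                subset[j] = n_subsets
        n_subsets += 1
    new_verts = np.stack([verts[subset == k].mean(0) for k in range(n_subsets)])
    new_edges = {tuple(sorted((int(subset[a]), int(subset[b]))))
                 for a, b in edges if subset[a] != subset[b]}
    return new_verts, sorted(new_edges)
```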

4. Graph Convolutional Network with Agent Attention

The graph convolutional network with agent attention (GCNAA) processes the graph-structured data. It takes the skeleton graphs of DICCs as input and produces prediction results through the Softmax function. GCNAA learns from the graph data, extracting the features and patterns in the DICC graphs, and finally classifies the DICCs. The overall framework of GCNAA is shown in Figure 3.
The GCNAA is realized by stacking Block modules repeatedly. The two basic components of the Block module are graph agent attention and graph convolution. The graph agent attention component captures the global relationship between the nodes of a graph. The graph convolution component realizes the modeling of the local adjacency structure of a graph by aggregating the local geometric information from neighboring nodes. By capturing the global relationship and local geometric structure between nodes, the Block module can better model the characteristics of graphs. Finally, multi-resolution features are learned by graph pooling.

4.1. Graph Agent Attention

Graph agent attention captures global relationships by facilitating interaction among all graph nodes. It adopts a generalized linear attention mechanism whose computational efficiency is much higher than that of the widely used Softmax attention, while retaining the ability to model global context. The main idea is to extend the triple (Q, K, V) to a quadruple (Q, A, K, V), introducing an additional set of agent vectors A into the traditional attention module. The agent vectors act as intermediaries for the query vectors Q: they first gather information from the keys K and values V and then broadcast the aggregated information back to Q. Because there are far fewer agent vectors than query vectors, this design retains global context modeling while substantially improving computational efficiency over conventional Softmax attention. Specifically, for the m-th Block module, let $X_m \in \mathbb{R}^{N \times D}$ denote the node feature matrix, where N is the number of graph nodes and D the feature dimension. An FC layer takes $X_m$ as input and generates queries $Q_m \in \mathbb{R}^{N \times D}$, keys $K_m \in \mathbb{R}^{N \times D}$, and values $V_m \in \mathbb{R}^{N \times D}$; the node features are then pooled to obtain the agents $A_m \in \mathbb{R}^{n \times D}$, with $n \ll N$. The final multi-head graph attention output is then obtained as
$$o_m = \xi\!\left(\frac{Q_m A_m^{\top}}{\sqrt{D}}\right)\xi\!\left(\frac{A_m K_m^{\top}}{\sqrt{D}}\right)V_m$$
Namely,
$$o_m = \underbrace{\mathrm{Attn}\Big(Q_m,\, A_m,\, \underbrace{\mathrm{Attn}(A_m, K_m, V_m)}_{\text{Agent Aggregation}}\Big)}_{\text{Agent Broadcast}}$$
where $\xi$ denotes the row-wise Softmax function. While the graph agent attention module effectively captures global information across all nodes, it lacks the capacity to encode the spatial topological relationships between nodes.
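The following PyTorch sketch shows a single-head version of this computation. The number of agent tokens n and the average pooling used to derive the agents are illustrative assumptions in the spirit of agent attention [35]; the paper does not specify its exact pooling.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAgentAttention(nn.Module):
    """Single-head sketch of the two formulas above: aggregate, then broadcast."""

    def __init__(self, dim, num_agents=16):
        super().__init__()
        self.scale = dim ** -0.5
        self.num_agents = num_agents
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                    # x: (N, D) node features
        q, k, v = self.q(x), self.k(x), self.v(x)
        # Agents A: pool the N query tokens down to n << N tokens.
        a = F.adaptive_avg_pool1d(q.t().unsqueeze(0), self.num_agents)
        a = a.squeeze(0).t()                 # (n, D)
        # Agent aggregation: Attn(A, K, V) -> (n, D)
        agg = torch.softmax(a @ k.t() * self.scale, dim=-1) @ v
        # Agent broadcast: Attn(Q, A, agg) -> (N, D)
        out = torch.softmax(q @ a.t() * self.scale, dim=-1) @ agg
        return self.proj(out)
```

Because both attention maps are only N × n and n × N, the cost is O(NnD) rather than the O(N²D) of full Softmax attention.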

4.2. Graph Convolution

The graph convolution captures local graph topology by aggregating features from adjacent nodes. On an irregular graph structure $G$, the convolution at a node $v_i$ can be expressed as
$$f^{\,m+1}(v_i) = f^{\,m}(v_i) + \frac{1}{|N(v_i)|}\sum_{v_j \in N(v_i)} g\big(u(i,j)\big)\, f^{\,m}(v_j)$$
where $f^m(v_i)$ denotes the features of node $v_i$ at the m-th graph convolution layer, $N(v_i)$ denotes the neighbors of node $v_i$, and $u(i,j) = p_j - p_i$ denotes the coordinate offset of $v_j$ relative to $v_i$. $g$ is the graph convolution kernel. In our setting, the Gaussian mixture graph kernel [8] is used to approximate the graph convolution kernel as a function of the local geometry $u$, i.e.,
$$g(u) = \exp\!\left(-\tfrac{1}{2}(u-\mu)^{\top}\Sigma^{-1}(u-\mu)\right)$$
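A minimal PyTorch sketch of this operation is given below. Collapsing the K Gaussian kernels into a single mixture weight with one shared linear map is a simplification of the MoNet-style formulation [8]; the kernel count and diagonal covariance are assumptions.

```python
import torch
import torch.nn as nn

class GaussianGraphConv(nn.Module):
    """Sketch of the residual aggregation above: neighbor features weighted by
    Gaussian kernels over coordinate offsets u(i, j) = p_j - p_i."""

    def __init__(self, dim, num_kernels=8):
        super().__init__()
        self.mu = nn.Parameter(torch.randn(num_kernels, 2) * 0.1)   # kernel means
        # Diagonal Sigma^-1; positivity is not enforced in this sketch.
        self.inv_sigma = nn.Parameter(torch.ones(num_kernels, 2))
        self.lin = nn.Linear(dim, dim)

    def forward(self, x, edge_index, pos):
        # x: (N, D); edge_index: (2, E), both directions for undirected graphs;
        # pos: (N, 2) vertex coordinates.
        i, j = edge_index
        u = pos[j] - pos[i]                                          # (E, 2)
        diff = u.unsqueeze(1) - self.mu                              # (E, K, 2)
        g = torch.exp(-0.5 * (diff ** 2 * self.inv_sigma).sum(-1))  # Gaussian kernel
        w = g.mean(dim=-1, keepdim=True)                             # mix K kernels
        agg = torch.zeros_like(x).index_add_(0, i, w * self.lin(x[j]))
        deg = torch.zeros(x.size(0), 1, device=x.device)
        deg.index_add_(0, i, torch.ones_like(w))                     # |N(v_i)|
        return x + agg / deg.clamp(min=1)                            # residual update
```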

4.3. Graph Pooling

Inspired by the design of CNNs, GCNAA integrates a graph pooling module into the Block module and gradually aggregates fine-grained stroke nodes into higher-level structural units (such as radicals and components) through multiple rounds of pooling, simulating the human cognitive process from strokes to whole characters. Graph pooling alleviates the over-smoothing problem of deep GCNs and significantly reduces computing cost. It is based on the topological structure of the graph, which is decomposed into K disjoint clusters $V_1, \ldots, V_K$ [36]; coarse-grained graphs are then created from these K clusters. The pooling process of the Chinese character “典” is shown in Figure 4.
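The sketch below shows how one pooling round can coarsen the graph once a partition into K disjoint clusters is available. The cluster assignment itself is assumed to come from an external multilevel-cut routine in the style of [36]; computing it is outside this sketch.

```python
import torch

def graph_pool(x, pos, edge_index, cluster, num_clusters):
    """Average features/coordinates per cluster and rewire inter-cluster edges.

    x: (N, D) node features; pos: (N, 2) coordinates; edge_index: (2, E);
    cluster: (N,) long tensor with values in [0, num_clusters).
    """
    ones = torch.ones(x.size(0), 1, device=x.device)
    count = torch.zeros(num_clusters, 1, device=x.device)
    count.index_add_(0, cluster, ones)
    count = count.clamp(min=1)
    pooled_x = torch.zeros(num_clusters, x.size(1), device=x.device)
    pooled_x.index_add_(0, cluster, x)
    pooled_pos = torch.zeros(num_clusters, 2, device=x.device)
    pooled_pos.index_add_(0, cluster, pos)
    pooled_x, pooled_pos = pooled_x / count, pooled_pos / count
    # Keep a single edge between each pair of distinct clusters.
    ci, cj = cluster[edge_index[0]], cluster[edge_index[1]]
    keep = ci != cj
    coarse_edges = torch.unique(torch.stack([ci[keep], cj[keep]]), dim=1)
    return pooled_x, pooled_pos, coarse_edges
```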

5. Experiments

The experiments systematically evaluate GCNAA from three aspects: its classification performance on the international student dataset, its classification performance on super-multi-class datasets, and a comparison between GCNAA and PyGT. Moving from practical application (the international student dataset) to technical challenge (super-multi-class datasets) and then to horizontal comparison (with PyGT), the evaluation comprehensively covers GCNAA's advantages in efficiency, accuracy, and computational complexity.

5.1. Datasets

The datasets used in the experiments include DICCs written by international students, CASIA-OLHWDB1.0-1.2, SCUT-COUCH2009 GB1 & GB2, and HIT-OR3C-ONLINE.
The DICCs written by international students [37] cover 525 categories of Chinese characters with 31,734 samples. The samples exhibit many problems, such as extra strokes, missing strokes, broken strokes, incomplete strokes, non-standard structures and components, missing components, alterations, and wrong stroke order.
CASIA-OLHWDB [38] is an unconstrained online handwriting database established by the National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences. It includes three isolated-character datasets: OLHWDB1.0, OLHWDB1.1, and OLHWDB1.2. Their statistics are shown in Table 1.
The CASIA-OLHWDB1.0-1.2 dataset combines all the training data of OLHWDB1.0, OLHWDB1.1, and OLHWDB1.2 into one large training set and their three test sets into one large test set. It contains the 3755 first-level and 3008 second-level Chinese characters of the GB2312-80 standard [39], 6763 Chinese characters in total. The training set contains 2,844,672 Chinese character samples, and the test set contains 711,123.
The SCUT-COUCH2009 dataset [24], established by the South China University of Technology, consists of 11 vocabulary subsets. We use the GB1 and GB2 subsets, denoted SCUT-COUCH2009 GB1 & GB2, which contain the 6763 Chinese characters of the GB2312-80 standard.
The HIT-OR3C dataset, established by the Harbin Institute of Technology, consists of four character subsets (GB1, GB2, Digit, Letter) and one document subset, where GB1 and GB2 denote the two corresponding subsets of GB2312-80. We used the GB1 and GB2 subsets in our experiments.

5.2. Implementation

The complete architecture is implemented in PyTorch 2.0.1+cu118, with three configurations of GCNAA of increasing complexity: GCNAA-S (small), GCNAA-M (medium), and GCNAA-L (large). The model configurations (where D is the node feature dimension and N the number of graph agent attention heads) are detailed in Table 2. All models are trained with the Adam optimizer [40] by minimizing an L2-regularized cross-entropy loss. We employ a dropout rate of 0.2 and an initial learning rate of 0.002, which decays by a factor of 0.1 when the validation loss plateaus. Training terminates when the model converges.
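As a reference, a hedged sketch of this training recipe is shown below. The weight-decay value (standing in for the L2 regularization term), the plateau patience, and the batching of graphs are assumptions; the 0.2 dropout is assumed to live inside the model itself.

```python
import torch
import torch.nn as nn

def train(model, train_loader, val_loader, max_epochs=200, device="cuda"):
    model.to(device)
    # Adam with weight decay standing in for the L2-regularized loss.
    optimizer = torch.optim.Adam(model.parameters(), lr=2e-3, weight_decay=1e-4)
    # Decay the learning rate by 0.1 when the validation loss plateaus.
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="min", factor=0.1, patience=5)
    criterion = nn.CrossEntropyLoss()
    for epoch in range(max_epochs):
        model.train()
        for graphs, labels in train_loader:
            graphs, labels = graphs.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(graphs), labels)
            loss.backward()
            optimizer.step()
        # Validation loss drives the decay-on-plateau schedule.
        model.eval()
        val_loss, n = 0.0, 0
        with torch.no_grad():
            for graphs, labels in val_loader:
                graphs, labels = graphs.to(device), labels.to(device)
                val_loss += criterion(model(graphs), labels).item() * labels.size(0)
                n += labels.size(0)
        scheduler.step(val_loss / n)
```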

5.3. Experimental Results

5.3.1. The Classification Effect of GCNAA on DICCs for International Students

We trained the GCNAA-S, GCNAA-M, and GCNAA-L networks on DICCs from international students and observed their classification accuracy. The validation accuracy is shown in Figure 5. Despite the stroke errors in international students' DICCs, including extra or missing strokes, incomplete strokes, wrong stroke direction, and stroke-order errors, the geometric structure underlying the DICC graphs remains largely unchanged from that of standard Chinese characters, and GCNAA relies on this geometric structure for classification. All three networks therefore classify international students' DICCs well, reaching over 97.5% accuracy. GCNAA-S is slightly lower, GCNAA-M is in the middle, and GCNAA-L is the highest. Taking into account hardware resources, dataset size, and its performance on the international student DICC dataset, GCNAA-L was ultimately chosen.
In general, the larger the dataset, the better the model performs, because a larger dataset provides richer information and more samples from which the model can learn, helping it capture more useful features and improving its generalization ability. Directly training the GCNAA-L model on international students' DICCs with a batch size of 512 achieved a maximum accuracy of 97.82%. Pretraining GCNAA on the publicly available CASIA-OLHWDB1.0 dataset further improves its classification performance on international students' DICCs. Table 3 presents the recognition results of various methods on the CASIA-OLHWDB1.0 dataset; as shown, our model outperforms the other approaches in accuracy.
After pretraining on CASIA-OLHWDB1.0, the model was trained on DICCs from international students, achieving a maximum validation accuracy of 98.53%, an improvement of 0.71 percentage points. The training loss and validation accuracy are shown in Figure 6 and Figure 7. After pretraining GCNAA-L on additional datasets and fine-tuning on international students' DICCs, the classification accuracies on the international student DICC test set are shown in Table 4. When the dataset is limited, using pretrained models is thus an efficient small-sample learning strategy.
A comparison of recognition approaches on DICCs for international students is shown in Table 5. The hierarchical model [17] and improved hierarchical model [18] proposed by Bai et al. divide the dataset into five subsets by stroke count: 1–3, 4–6, 7–9, 10–12, and 13+. The classification accuracies of the hierarchical model on these five subsets are 72.73%, 86.56%, 92.55%, 85.74%, and 65.56%, respectively; those of the improved hierarchical model are 79.2%, 94.29%, 97.91%, 93.42%, and 71.34%. As Table 5 shows, our proposed GCNAA method has the highest recognition rate.

5.3.2. The Classification Effect of GCNAA on Super-Large-Category Datasets

Although GCNs have been successfully applied to many classification tasks [41], these are often tasks with a small number of categories on small datasets. Studies exist that use GCNs for the 3755-category first-level Chinese character recognition task [11], but their effectiveness has not been verified on larger-category tasks, such as recognizing the 6763 first- and second-level Chinese characters. Using CASIA-OLHWDB1.0-1.2, SCUT-COUCH2009 GB1 & GB2, and HIT-OR3C-ONLINE, we verify the performance of GCNAA on 6763-category Chinese character datasets.
Super-large-category datasets contain more training data. Smaller batch sizes introduce higher gradient stochasticity, which may cause training instability and convergence difficulties. Within a certain range, increasing the batch size helps convergence stability and improves training speed, especially with parallelization. Empirically, training works well when the batch size is set to 1024. The validation accuracy is shown in Figure 8.
As can be observed from Figure 8, GCNAA demonstrates strong performance across all five datasets, including CASIA-OLHWDB1.0, CASIA-OLHWDB1.0-1.2, SCUT-COUCH2009 GB1, SCUT-COUCH2009 GB1&GB2, and HIT-OR3C-ONLINE. Among these, CASIA-OLHWDB1.0 and SCUT-COUCH2009 GB1 contain 3740 and 3755 categories of Level-1 Chinese characters, respectively, while CASIA-OLHWDB1.0-1.2, SCUT-COUCH2009 GB1&GB2, and HIT-OR3C-ONLINE include both Level-1 and Level-2 Chinese characters, totaling 6763 categories. Despite the significant variation in the number of categories across datasets, GCNAA achieves even higher recognition accuracy on datasets with more categories: CASIA-OLHWDB1.0-1.2 shows higher accuracy than CASIA-OLHWDB1.0, and SCUT-COUCH2009 GB1&GB2 outperforms SCUT-COUCH2009 GB1. It is particularly noteworthy that the accuracy on HIT-OR3C-ONLINE is the highest among all datasets. These results fully validate the effectiveness and strong generalization capability of GCNAA in large-scale multi-category Chinese character recognition tasks.

5.3.3. The Comparison of GCNAA and PyGT

The agent attention mechanism in GCNAA is a generalized linear attention method. Compared with the Softmax attention used in PyGT [11], it not only significantly improves computational efficiency but also maintains the ability to understand global context. This mechanism reduces computational complexity, thereby shortening training time and accelerating inference. The training behavior of GCNAA and PyGT on the public datasets CASIA-OLHWDB1.0 and SCUT-COUCH2009 GB1 is compared in Figure 9, Figure 10, Figure 11, Figure 12, Figure 13 and Figure 14; the former covers 3740 first-level Chinese character categories and the latter 3755. Experimental results show that GCNAA has slightly lower validation accuracy than PyGT on the CASIA-OLHWDB1.0 dataset, while both perform excellently on SCUT-COUCH2009 GB1. Notably, within the same time budget, GCNAA achieves higher validation accuracy and lower training loss, showing a better balance between efficiency and effectiveness.
To comprehensively evaluate the classification performance of GCNAA on larger-category datasets, we chose three datasets, CASIA-OLHWDB1.0-1.2, SCUT-COUCH2009 GB1 & GB2, and HIT-OR3C-ONLINE, for further experiments. These datasets not only cover more Chinese character categories but also provide more complex and diverse writing styles, enabling a more comprehensive assessment of the model's generalization capability and robustness. The experimental results (Figure 15, Figure 16, Figure 17, Figure 18, Figure 19, Figure 20, Figure 21, Figure 22 and Figure 23) demonstrate that GCNAA achieves both lower training loss and higher validation accuracy. This is because GCNAA adopts agent attention, which reduces computational complexity and accelerates inference: each epoch takes less time to train while reaching higher accuracy. Although faced with more categories and more complex samples, GCNAA maintains high classification accuracy and, under the same training time or resources, achieves a better balance between efficiency and effectiveness.
Through experimental comparison, it is found that GCNAA has the following significant advantages:
(1)
High computational efficiency
GCNAA adopts agent attention. Compared with traditional Softmax attention, agent attention uses a generalized linear attention mechanism, which reduces computation and improves processing speed. This means results can be obtained faster during training and inference, especially on large-scale datasets.
(2)
Global context modeling
Although the computational process is simplified, agent attention can still maintain an understanding of the global context in the input data. This is very important for capturing the global dependencies of graph nodes and understanding complex patterns.
(3)
Reduced computational complexity
Because of its design, GCNAA reduces the computational complexity of the overall model, which helps to reduce the required computing resources, such as memory and processor time, so that the model can be deployed in a wider range of hardware environments, including those with limited computing power.
(4)
Shorter training time
Due to the improvement of computational efficiency and the reduction in computational complexity, the model using agent attention can complete training in a shorter time. This is very helpful for fast iterative model development and responding to new data or requirement changes.
(5)
Faster inference
Beyond the training stage, agent attention also optimizes the model's inference process, enabling faster predictions or decisions in practical applications and improving the real-time performance of the system.
(6)
Better balance
GCNAA simultaneously attains higher validation accuracy and lower training loss, showing that the mechanism achieves a better balance between efficiency and effectiveness: it is not only fast but also accurate, which is especially critical for application scenarios that demand high performance.

5.3.4. Prediction Result Analysis

The model accurately recognizes standard handwritten Chinese characters while demonstrating strong robustness to erroneous and irregular writing, including extra or missing strokes, broken strokes, component omissions, alterations, non-standard strokes, and structural imbalance (as shown in Figure 24).
To further validate the effectiveness of the proposed method, we conducted a systematic analysis of mispredicted samples and categorized the error causes into the following three types:
(1)
Labeling errors: cases where the handwritten character is ‘水’ but is incorrectly labeled as ‘云’ make correct recognition inherently impossible for such samples.
(2)
Classification errors occur when non-standard strokes, incorrect stroke positional relationships, missing strokes, and similar issues make a handwritten character morphologically more similar to another standard character.
For example, the character “八” may resemble “儿” when its “㇏” is written irregularly, as shown in Figure 25a.
In “几”, the “丿” and the “㇈” should ideally meet at their starting points. However, if the “丿” crosses through the horizontal segment of the “㇈”, the character becomes visually similar to “九”, as shown in Figure 25b.
When the stroke “丨” in “午” protrudes excessively at the top, the character morphology approaches that of “牛”, as shown in Figure 25c.
(3)
The inherent similarity among Chinese characters. Chinese characters possess intrinsic visual similarities that pose unique challenges for recognition systems, for example, “夏” and “复”.

6. Conclusions

In this paper, we first propose representing international students' DICCs as skeleton graphs, because graphs better retain the essential features of Chinese characters. Second, a novel GCNAA is proposed to realize DICCR for international students. Specifically, graph agent attention in GCNAA models the global context between graph nodes, the graph convolution block captures the local context between neighboring nodes, and graph pooling enables the learning of multi-resolution graph features. Experiments show that GCNAA achieves not only a high recognition rate but also high computational efficiency and fast training and inference on the open datasets CASIA-OLHWDB, SCUT-COUCH2009 GB1 & GB2, and HIT-OR3C-ONLINE, as well as on the international student DICC dataset, making GCNAA an effective method for international students' DICCR. GCNAA can accurately recognize Chinese characters containing errors and non-standard writing and can help international students better understand writing norms. It lays a technical foundation for improving the quality of international Chinese character teaching and further promoting the spread and development of Chinese worldwide.

Author Contributions

H.X.: conceptualization, methodology, software, validation, investigation, data curation, writing—original draft preparation, writing—review and editing, visualization, project administration, funding acquisition. X.Z.: conceptualization, methodology, resources, supervision, investigation. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by “the Fundamental Research Funds for the Central Universities” grant number 3142024038; “Research on the Construction of Innovation and Entrepreneurship Education Practice Platform in the Scenario of Engineering Education Accreditation + AI” grant number YDJG202524.

Acknowledgments

Author Huafen Xu thanks Xiwen Zhang for his mentorship and for providing high-performance computational resources. On a personal note, Huafen Xu is indebted to her family for their understanding and motivation.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Zhang, X.-Y.; Bengio, Y.; Liu, C.-L. Online and offline handwritten Chinese character recognition: A comprehensive study and new benchmark. Pattern Recognit. 2017, 61, 348–360.
2. Yang, W.; Jin, L.; Tao, D.; Xie, Z.; Feng, Z. DropSample: A new training method to enhance deep convolutional neural networks for large-scale unconstrained handwritten Chinese character recognition. Pattern Recognit. 2016, 58, 190–203.
3. Xu, H.; Zhang, X. A Residual Network with Multi-Scale Dilated Convolutions for Enhanced Recognition of Digital Ink Chinese Characters by Non-Native Writers. Int. J. Knowl. Innov. Stud. 2024, 2, 130–146.
4. Zhang, J.; Zhu, Y.; Du, J.; Dai, L. Trajectory-based Radical Analysis Network for Online Handwritten Chinese Character Recognition. In Proceedings of the 2018 24th International Conference on Pattern Recognition (ICPR), Beijing, China, 20–24 August 2018; IEEE: New York, NY, USA, 2018; pp. 3681–3686.
5. Zhang, X.-Y.; Yin, F.; Zhang, Y.-M.; Liu, C.-L.; Bengio, Y. Drawing and Recognizing Chinese Characters with Recurrent Neural Network. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 849–862.
6. Ren, H.; Wang, W.; Liu, C. Recognizing online handwritten Chinese characters using RNNs with new computing architectures. Pattern Recognit. 2019, 93, 179–192.
7. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
8. Monti, F.; Boscaini, D.; Masci, J.; Rodola, E.; Svoboda, J.; Bronstein, M.M. Geometric Deep Learning on Graphs and Manifolds Using Mixture Model CNNs. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; IEEE: New York, NY, USA, 2017; pp. 5425–5434.
9. Fey, M.; Lenssen, J.E.; Weichert, F.; Muller, H. SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; IEEE: New York, NY, USA, 2018; pp. 869–877.
10. Dwivedi, V.P.; Joshi, C.K.; Luu, A.T.; Laurent, T.; Bengio, Y.; Bresson, X. Benchmarking Graph Neural Networks. J. Mach. Learn. Res. 2022, 23, 1–48.
11. Gan, J.; Chen, Y.; Hu, B.; Leng, J.; Wang, W.; Gao, X. Characters as graphs: Interpretable handwritten Chinese character recognition via Pyramid Graph Transformer. Pattern Recognit. 2023, 137, 109317.
12. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is All you Need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017.
13. Liu, C.-L.; Marukawa, K. Pseudo two-dimensional shape normalization methods for handwritten Chinese character recognition. Pattern Recognit. 2005, 38, 2242–2255.
14. Ding, K.; Deng, G.; Jin, L. An Investigation of Imaginary Stroke Techinique for Cursive Online Handwriting Chinese Character Recognition. In Proceedings of the 2009 10th International Conference on Document Analysis and Recognition, Barcelona, Spain, 26–29 July 2009; IEEE: New York, NY, USA, 2009; pp. 531–535.
15. Graham, B. Sparse arrays of signatures for online character recognition. arXiv 2013, arXiv:1308.0371.
16. Bai, Z.; Huo, Q. A study on the use of 8-directional features for online handwritten Chinese character recognition. In Proceedings of the Eighth International Conference on Document Analysis and Recognition (ICDAR'05), Seoul, Republic of Korea, 31 August–1 September 2005; IEEE: New York, NY, USA, 2005; Volume 1, pp. 262–266.
17. Bai, H.; Zhang, X. Recognizing Chinese Characters in Digital Ink from Non-Native Language Writers Using Hierarchical Models. In Proceedings of the Second International Workshop on Pattern Recognition, Singapore, 2017; Jiang, X., Arai, M., Chen, G., Eds.; p. 104430A.
18. Bai, H.; Zhang, X.-W. Improved hierarchical models for non-native Chinese handwriting recognition using hidden conditional random fields. In Proceedings of the Fifth International Workshop on Pattern Recognition, Chengdu, China, 5–7 June 2020; Jiang, X., Zhang, C., Song, Y., Eds.; SPIE: Bellingham, WA, USA, 2020; p. 9.
19. Long, T.; Jin, L. Building compact MQDF classifier for large character set recognition by subspace distribution sharing. Pattern Recognit. 2008, 41, 2916–2925.
20. Kimura, F.; Takashina, K.; Tsuruoka, S.; Miyake, Y. Modified quadratic discriminant functions and the application to Chinese character recognition. IEEE Trans. Pattern Anal. Mach. Intell. 1987, 9, 149–153.
21. Kim, H.J.; Kim, K.H.; Kim, S.K.; Lee, J.K. On-line recognition of handwritten Chinese characters based on hidden Markov models. Pattern Recognit. 1997, 30, 1489–1500.
22. Liu, C.-L.; Jaeger, S.; Nakagawa, M. Online recognition of Chinese characters: The state-of-the-art. IEEE Trans. Pattern Anal. Mach. Intell. 2004, 26, 198–213.
23. Fujisawa, H. Forty years of research in character and document recognition—An industrial perspective. Pattern Recognit. 2008, 41, 2435–2446.
24. Jin, L.; Gao, Y.; Liu, G.; Li, Y.; Ding, K. SCUT-COUCH2009—A comprehensive online unconstrained Chinese handwriting database and benchmark evaluation. Int. J. Doc. Anal. Recognit. (IJDAR) 2011, 14, 53–64.
25. Liu, C.-L.; Yin, F.; Wang, D.-H.; Wang, Q.-F. Online and offline handwritten Chinese character recognition: Benchmarking on new databases. Pattern Recognit. 2013, 46, 155–162.
26. Qu, X.; Wang, W.; Lu, K.; Zhou, J. In-air handwritten Chinese character recognition with locality-sensitive sparse representation toward optimized prototype classifier. Pattern Recognit. 2018, 78, 267–276.
27. Ciresan, D.; Meier, U.; Schmidhuber, J. Multi-column deep neural networks for image classification. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; IEEE: New York, NY, USA, 2012; pp. 3642–3649.
28. Gan, J.; Wang, W.; Lu, K. A new perspective: Recognizing online handwritten Chinese characters via 1-dimensional CNN. Inf. Sci. 2019, 478, 375–390.
29. Xu, H.; Zhang, X. Recognizing Digital Ink Chinese Characters Written by International Students Using a Residual Network with 1-Dimensional Dilated Convolution. Information 2024, 15, 531.
30. Zheng, J.; Ding, X.; Wu, Y. Recognizing on-line handwritten Chinese character via FARG matching. In Proceedings of the Fourth International Conference on Document Analysis and Recognition, Ulm, Germany, 18–20 August 1997; IEEE Computer Society: Washington, DC, USA, 1997; Volume 2, pp. 621–624.
31. Touvron, H.; Cord, M.; Douze, M.; Massa, F.; Sablayrolles, A.; Jégou, H. Training data-efficient image transformers & distillation through attention. arXiv 2021.
32. Wang, W.; Xie, E.; Li, X.; Fan, D.-P.; Song, K.; Liang, D.; Lu, T.; Luo, P.; Shao, L. Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions. arXiv 2021.
33. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. arXiv 2021.
34. Xia, Z.; Pan, X.; Song, S.; Li, L.E.; Huang, G. Vision Transformer with Deformable Attention. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 4794–4803.
35. Han, D.; Ye, T.; Han, Y.; Xia, Z.; Pan, S.; Wan, P.; Song, S.; Huang, G. Agent Attention: On the Integration of Softmax and Linear Attention. arXiv 2024, arXiv:2312.08874.
36. Dhillon, I.S.; Guan, Y.; Kulis, B. Weighted Graph Cuts without Eigenvectors: A Multilevel Approach. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29, 1944–1957.
37. Bai, H. Study on Digital Ink Characters Stroke Error Extraction by Beginning Learners of Chinese as a Foreign Language. Ph.D. Thesis, Beijing Language and Culture University, Beijing, China, 2018.
38. Liu, C.-L.; Yin, F.; Wang, D.-H.; Wang, Q.-F. CASIA Online and Offline Chinese Handwriting Databases. In Proceedings of the 2011 International Conference on Document Analysis and Recognition, Beijing, China, 18–21 September 2011; IEEE: New York, NY, USA, 2011; pp. 37–41.
39. GB 2312-80; Information Technology—Chinese Character Coded Character Set for Information Interchange—Basic Set. Standardization Administration of China (SAC): Beijing, China, 1980.
40. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2017, arXiv:1412.6980.
41. Guo, Y.; Wang, H.; Hu, Q.; Liu, H.; Liu, L.; Bennamoun, M. Deep Learning for 3D Point Clouds: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 4338–4364.
Figure 1. Examples of DICCs for international students.
Figure 2. The steps for constructing the skeleton graph of the Chinese character “啊”. (a) Extract coordinates. (b) Coordinate normalization. (c) Resampling. (d) Skeleton graph. (e) Skeleton graph after merging nearby graph vertices.
Figure 3. Overall framework of GCNAA.
Figure 4. The pooling process of the Chinese character “典”.
Figure 5. Comparison of verification accuracy of GCNAA-S, GCNAA-M, and GCNAA-L on DICCs for international students.
Figure 6. Comparison of training losses.
Figure 7. Comparison of verification accuracy.
Figure 8. Comparison of verification accuracy of GCNAA on the five datasets.
Figure 9. Verification accuracy comparisons for each epoch for CASIA-OLHWDB1.0.
Figure 10. Verification accuracy comparisons in relative time for CASIA-OLHWDB1.0.
Figure 11. Comparison of training loss in relative time for CASIA-OLHWDB1.0.
Figure 12. Verification accuracy comparisons for each epoch for SCUT-COUCH2009 GB1.
Figure 13. Verification accuracy comparisons in relative time for SCUT-COUCH2009 GB1.
Figure 14. Comparison of training loss in relative time for SCUT-COUCH2009 GB1.
Figure 15. Verification accuracy comparisons for each epoch for CASIA-OLHWDB1.0-1.2.
Figure 16. Verification accuracy comparison in relative time for CASIA-OLHWDB1.0-1.2.
Figure 17. Comparison of training loss in relative time for CASIA-OLHWDB1.0-1.2.
Figure 18. Verification accuracy comparison of each epoch for SCUT-COUCH2009 GB1 & GB2.
Figure 19. Comparison of verification accuracy in relative time for SCUT-COUCH2009 GB1 & GB2.
Figure 20. Comparison of training loss in relative time for SCUT-COUCH2009 GB1 & GB2.
Figure 21. Verification accuracy comparison of each epoch for HIT-OR3C-ONLINE.
Figure 22. Comparison of verification accuracy in relative time for HIT-OR3C-ONLINE.
Figure 23. Comparison of training loss in relative time for HIT-OR3C-ONLINE.
Figure 24. Correctly predicted samples.
Figure 25. Incorrectly predicted samples. (a) Handwritten Chinese character “八”. (b) Handwritten Chinese character “几”. (c) Handwritten Chinese character “午”.
Table 1. The statistics of OLHWDB1.0, OLHWDB1.1, and OLHWDB1.2.

Dataset      Total Writers   Total Classes   Total Samples   Chinese Character Classes   Chinese Character Samples
OLHWDB1.0    420             4037            1,694,741       3866                        1,622,935
OLHWDB1.1    300             3926            1,174,364       3755                        1,123,132
OLHWDB1.2    300             3490            1,042,912       3319                        991,731
Table 2. Detailed configurations of GCNAA-S, GCNAA-M, and GCNAA-L.

Block    GCNAA-S               GCNAA-M               GCNAA-L
Block1   [D = 48, N = 2] × 2   [D = 64, N = 2] × 2   [D = 72, N = 2] × 2
Block2   [D = 72, N = 4] × 2   [D = 96, N = 4] × 2   [D = 108, N = 4] × 3
Block3   [D = 108, N = 6] × 2  [D = 144, N = 6] × 3  [D = 168, N = 6] × 3
Block4   [D = 160, N = 8] × 2  [D = 200, N = 8] × 2  [D = 256, N = 8] × 2
Table 3. Comparison of recognition results for various approaches on the CASIA-OLHWDB1.0 dataset.

Method                              Acc. (%)
Traditional approach of MQDF [25]   95.28
MCDNN [27]                          94.39
DropSample-DCNN [2]                 96.93
1-D ResNetDC [29]                   96.2
GCNAA-L (ours)                      97.47
Table 4. Classification accuracies of GCNAA-L after pretraining.

Networks   Extra Data                 Test Data                                  Acc. (%)
GCNAA-L    ×                          DICCs written by international students    97.82
GCNAA-L    +CASIA-OLHWDB1.0           DICCs written by international students    98.53
GCNAA-L    +SCUT-COUCH2009 GB1        DICCs written by international students    98.59
GCNAA-L    +CASIA-OLHWDB1.0-1.2       DICCs written by international students    98.70
GCNAA-L    +HIT-OR3C                  DICCs written by international students    98.59
GCNAA-L    +SCUT-COUCH2009 GB1&GB2    DICCs written by international students    98.67
Table 5. Comparison of recognition approaches based on DICCs for international students.

Approach                            Accuracy
Hierarchical model [17]             72.73%, 86.56%, 92.55%, 85.74%, 65.56%
Improved hierarchical models [18]   79.2%, 94.29%, 97.91%, 93.42%, 71.34%
1-D ResNetDC [29]                   97.2%
ResNetDC [3]                        98.5%
GCNAA (ours)                        98.7%