Recognition of Urdu Handwritten Characters Using Convolutional Neural Network

Abstract: In the area of pattern recognition and pattern matching, methods based on deep learning models have recently attracted many researchers by achieving excellent performance. In this paper, we propose the use of a convolutional neural network to recognize multifont offline Urdu handwritten characters in an unconstrained environment. We also propose a novel dataset of Urdu handwritten characters, since there is no publicly available dataset of this kind. A series of experiments is performed on our proposed dataset. The accuracy achieved for character recognition is among the best when compared with the results reported in the literature for the same task.


Introduction
In the field of pattern recognition and computer vision research, handwritten text recognition is regarded as one of the most challenging tasks. The cursive nature of text, the shape similarity of individual characters, and the variety of writing styles are some of the key issues that make recognition difficult. High accuracy rates are reported in the literature for recognizing isolated words and characters in printed text; however, there is still a need for an efficient recognition system that performs comparably well on handwritten text [1][2][3][4][5]. Urdu is one of the cursive languages that is widely spoken and written in South Asia, including Pakistan, India, Bangladesh, Afghanistan, etc. [6]. Optical character recognition (OCR) of Urdu script started in late 2000, and the first work on Urdu OCR was published in 2004. The literature review shows that there has been less research effort in Urdu handwritten text recognition than in the recognition of the scripts of other languages [4][7][8][9]. Furthermore, a few Urdu OCR systems for printed text are commercially available [10,11], but no system is available for recognition of Urdu handwritten text to date. It is pertinent to mention that in the field of computer vision and pattern recognition, handwritten text recognition is termed ICR (intelligent character recognition), while analysis of printed text is known as OCR (optical character recognition). In the remainder of the text, we use ICR for handwritten text recognition.
Several machine learning models, such as SVM (support vector machines) [12], NB (naive Bayes) [13], and ANN (artificial neural networks) [14,15], have been applied to the analysis of Urdu handwritten text, and these approaches have proved competitive in analyzing text images. Many researchers recommend using CNN (convolutional neural networks) [16,17] to extract information from images containing text. The notable work reported in [18][19][20] concluded that CNN is one of the most commonly used DNNs (deep neural networks) in image processing for complex tasks such as pattern matching and pattern analysis. CNN is applicable to a data corpus at either the word or character level, without any prior knowledge of the syntactic (or semantic) structures of the language, and it is used in a variety of data science tasks ranging from computer vision to speech recognition. The reason behind the successful use of DNNs is that these models learn the mathematical relationship between the given input and the output, regardless of whether that relationship is linear or non-linear. Moreover, information in a DNN moves through the underlying layers, which calculate the probability of each output. This capability makes DNNs reliable and efficient models for the tasks mentioned earlier. Furthermore, the ability of deep learning models to extract and identify distinctive features plays a key role in producing reliable results, and these approaches have been shown to be competitive with traditional models. The literature on Urdu handwritten text recognition [21][22][23][24][25][26][27] also recommends deep network models for obtaining better results in less time.
To the best of our knowledge, and with few exceptions, this paper is one of the first to report the results of applying CNN to classify Urdu handwritten characters. Our contribution in this work is to show that a deep CNN does not require knowledge of each character individually when it is trained on a large-scale dataset. A practical application of our proposed recognition system is to help children who are learning to write Urdu characters and numerals, since the system can correctly classify the characters they write.
The paper is organized as follows: Section 2 gives a brief introduction to the Urdu script. A detailed literature review is given in Section 3. Our proposed dataset is explained in Section 4. In Section 5, the proposed model is explained in detail, and experimental results are shown in Section 6. Finally, future work and conclusion are given in Section 7.

Urdu Script
Urdu is the national language of Pakistan and is also one of its two official languages [6] (the other being English). It is widely spoken and understood as a second language by a majority of the people of Pakistan [28,29] and is increasingly being adopted as a first language by people living in urban areas of Pakistan.
Urdu script is written from right to left, while numerals are written from left to right; for this reason, Urdu is considered one of the bidirectional languages [6]. Urdu script consists of 38 basic letters and 10 numerals, as shown in Figure 1. This character set is also considered a superset of some other Urdu-script-based languages, e.g., Arabic contains 28 and Persian contains 32 characters [30]. Furthermore, the Urdu script contains some additional characters to express Hindi phonemes. The Hindi and Urdu languages [30] share the same phonology and differ only in their written script. All Urdu-script-based languages such as Arabic and Persian have some unique characteristics: (i) the script of these languages is written from right to left in cursive style, and (ii) the script of these languages is context sensitive, i.e., written in the form of ligatures, each of which is a combination of one or more characters. Due to this context sensitivity, most of the characters have different shapes depending on their position and the adjoining character in the word [7]. The connectivity of characters [31] has enriched the Urdu vocabulary with almost 24,000 ligatures.

Literature Review
In this section, we aim to assess in detail the use of different approaches in the recognition of Urdu handwritten characters. In general, we categorized the tasks and issues related to character level analysis into two subsections: (i) Urdu handwritten character recognition and (ii) Urdu handwritten numerals' recognition.

Urdu Handwritten Character Recognition
Handwritten text recognition at the character level is a challenging task because of the large number of variations in writing styles (even from a single author). The literature on character-level recognition of Urdu script shows that the artificial neural network (ANN) and its variants are widely used. An ANN [32] is a collection of nodes (also known as artificial neurons) linked with each other. These links transmit signals from one neuron to another within the network. Each neuron processes the signals it receives and then propagates the result to the neurons connected in subsequent layers. The structure of the ANN may be affected by the kind of information flowing through it, because a neural network usually trains itself using the input and labeled output.
The problem of developing a generic ICR that can resolve the issues associated with any language is challenging, since different languages exhibit different characteristic features and such a system is therefore difficult to generalize. In order to overcome this problem, a novel approach was proposed in [33] exploring how the character set of any language can be represented by primitive geometrical strokes. One of the promising features of the approach is that the recognizer (an artificial neural network) has to be trained only once. The character set is represented as geometrical strokes in an XML file, which is used to train the neural network once rather than separately for each word in the language. Figure 2 shows a set of thirteen basic geometrical strokes. For evaluation purposes, a set of 25 handwritten Urdu text samples was tested, achieving a success rate of 75-80%. One of the limitations of this approach is that it does not apply to words having dots and diacritics.
Because the character (or alphabet) set is large, there is inherent similarity among some major strokes, as shown in Figure 2. This similarity is one of the main causes of incorrect recognition of Urdu handwritten text. With this in mind, the authors of [21] divided the Urdu character set into four groups according to the number of strokes, as shown in Figure 3. The authors performed online Urdu ICR considering single-stroke characters only. Some novel features (shown in Figure 4) were extracted and then fed to three different classifiers, namely the back propagation neural network (BPNN), the probabilistic neural network (PNN), and the correlation-based classifier. The proposed approach was tested on 85 instances of single-stroke characters taken from 35 writers of different age groups. The results showed that the PNN classifier achieved a higher accuracy of 95% as compared to the other two classifiers. Unlike BPNN, PNN-based classifiers require no initial training. This is the reason PNN-based classifiers achieved higher accuracy than BPNN.
For isolated character recognition, the authors in [22] proposed a technique that builds the feature vector by analyzing the primary and secondary strokes made while writing Urdu characters in isolated form. The stroke features used to train the classifier included the following: the diagonal length of the bounding box; the sine-cosine angle ratio of the bounding box diagonal; the displacement between the first and last points while tracing the bounding box; the corresponding sine-cosine ratio of the angle between the first and last points; the total length (in pixels) of the primary stroke; and the total angle traversed. A linear classifier was applied to a dataset of five samples of each of the 38 Urdu characters, i.e., a total of 190 characters provided by two different writers who could write Urdu characters smoothly. The classifier recognized the characters with an error rate of almost 6%, because some characters share quite similar shapes (see Figure 5) and were not correctly recognized.
Similar work was reported in [23], which considered the initial half forms of different Urdu characters. In this work, only those characters were considered that change their shape according to their position and context in a word. Figure 6 depicts Urdu characters in their initial half forms, classified by the number of strokes. Almost 100 native Urdu writers and speakers were invited to write in Urdu script. The writers were provided with a stylus and digitizing tablet, yielding a dataset of 3600 instances of Urdu letters in the initial half form. A combination of multilevel one-dimensional wavelet analysis with the Daubechies wavelet [34][35][36] was applied to extract features from these instances. Several neural networks with different configurations were trained for recognition purposes. Among these networks, BPNN provided a maximum recognition rate of 92%.
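The stroke features listed above for [22] can be illustrated with a short sketch. This is not the implementation from [22]: the input format (a list of (x, y) pen positions in drawing order), the function name `stroke_features`, and the choice to report sine and cosine separately instead of a single ratio are all assumptions made for illustration.

```python
import math

def stroke_features(points):
    """Geometric features of one pen stroke (hypothetical helper).

    `points` is assumed to be a list of (x, y) tuples in drawing order.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    box_w = max(xs) - min(xs)
    box_h = max(ys) - min(ys)
    diagonal = math.hypot(box_w, box_h)

    # Displacement between the first and last points of the stroke.
    dx = points[-1][0] - points[0][0]
    dy = points[-1][1] - points[0][1]
    displacement = math.hypot(dx, dy)

    # Total path length: sum of the segment lengths.
    path_length = sum(
        math.hypot(b[0] - a[0], b[1] - a[1])
        for a, b in zip(points, points[1:])
    )

    # Total angle traversed: sum of absolute heading changes.
    total_turning = 0.0
    for a, b, c in zip(points, points[1:], points[2:]):
        h1 = math.atan2(b[1] - a[1], b[0] - a[0])
        h2 = math.atan2(c[1] - b[1], c[0] - b[0])
        turn = (h2 - h1 + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi)
        total_turning += abs(turn)

    return {
        "diagonal_length": diagonal,
        "diagonal_sin": box_h / diagonal if diagonal else 0.0,
        "diagonal_cos": box_w / diagonal if diagonal else 0.0,
        "displacement": displacement,
        "displacement_sin": dy / displacement if displacement else 0.0,
        "displacement_cos": dx / displacement if displacement else 0.0,
        "path_length": path_length,
        "total_turning": total_turning,
    }
```

For an L-shaped stroke such as `[(0, 0), (3, 0), (3, 4)]`, the bounding-box diagonal and the end-to-end displacement are both 5, the path length is 7, and the total turning is a quarter turn.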
The MDLSTM (multidimensional long short-term memory) neural network is one of the RNNs (recurrent neural networks) that is used for sequence learning and segmentation in multidimensional environments [37][38][39]. This model was used for the first time for Urdu script recognition in the work of [26]. One of the promising features of the model is that it can scan the input image in all four directions, thus reducing the chance of ambiguity. For evaluation purposes, the UPTI (Urdu Printed Text Image) dataset [40] was used, which contains 10,000 scanned images of both Urdu handwritten and printed text. MDLSTM is a supervised technique; therefore, each input sample in the dataset is tagged and labeled with appropriate information. The dataset was divided according to the following ratio: 68% for training and 16% each for validation and testing. In order to evaluate the accuracy of the proposed approach, the Levenshtein edit distance [41] was computed between the output text and the baseline results. The approach achieved an accuracy of 94.04%, compared with the 88.94% and 89% reported in the works of [42,43], respectively. Table 1 shows a comparison of the proposed approach on the UPTI dataset [40] with other techniques.
Figure 6. Classification of the initial half forms on the basis of the number of strokes [23].
Promising work was reported in [46], in which Urdu handwritten text was recognized using the UNHD (Urdu Nastaliq Handwritten Dataset) dataset [47]. This dataset can be accessed publicly at https://sites.google.com/site/researchonUrdulanguage1/databases (UNHD Database). The dataset contains 312,000 words (including both Urdu script and Urdu numerals) written on a total of 10,000 lines by 500 writers of different age groups. The writers were directed to write on white pages of size A4. Each writer was provided with six blank pages labeled with the author ID and the page number. One of the samples of written pages is shown in Figure 7. Furthermore, in order to maintain uniformity in the data, the writers were asked to copy the provided printed text. In order to recognize the text, a one-dimensional bidirectional long short-term memory (BLSTM)-based approach was proposed; it is based on an RNN (recurrent neural network), which is capable of retaining previous sequence information. For evaluation purposes, the dataset was divided into 50% for training, 30% for validation, and 20% for testing. The approach achieved a 6-8 percent error rate, which, as proposed by the authors, can be improved using a two-dimensional BLSTM.
Table 2 gives a summary of the accuracy reported on common datasets in the Urdu domain. In [27], the authors proposed a novel approach for character-level recognition of Urdu text written in Nastaliq font by combining CNN (convolutional neural network) and MDLSTM. In the first phase, CNN was deployed to extract the characteristic features, which were then fed to MDLSTM in the second phase. This approach outperformed the state-of-the-art systems on the UPTI dataset. Table 3 shows the comparison of Urdu recognition on the UPTI dataset.
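The edit-distance-based evaluation mentioned above for the MDLSTM work can be illustrated as follows. The Levenshtein distance itself is standard; the way it is turned into an accuracy figure here (one minus the distance divided by the reference length) is an assumption, since the exact normalization used in [26] is not stated in this section, and `character_accuracy` is a hypothetical helper name.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]

def character_accuracy(predicted, reference):
    """Assumed normalization: 1 - distance / len(reference), clipped at 0."""
    if not reference:
        return 1.0 if not predicted else 0.0
    return max(0.0, 1.0 - levenshtein(predicted, reference) / len(reference))
```

Because the functions operate on Python strings, they apply equally to Urdu text, where each Unicode code point is treated as one symbol.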

Urdu Numeral Recognition
It is quite easy for a human being to recognize handwritten numerals, but a computer system needs an intelligent approach based on machine learning algorithms developed for this kind of job. The writing stroke, length, width, orientation, and other geometrical features tend to change when the same numeral is written, even by the same author. These different writing styles may introduce shape variations of Urdu numerals that break the stroke primitives and also change their topology. These issues make Urdu handwritten numeral recognition one of the active research areas in the field of image processing. Unfortunately, there is no commercially available standard dataset of Urdu numerals. Due to this lack of resources, researchers have developed their own datasets and reported results on them. This section covers some notable work related to handwritten numeral recognition in the Urdu domain.
In [51], different transformations of the Daubechies wavelet [34][35][36] were applied for feature extraction from a dataset of about 2150 samples of handwritten Urdu numerals. For evaluation purposes, 2000 samples were used for training the neural network and 150 instances for testing. In order to decompose the images into different frequency bands, both low-pass and high-pass filtering were applied at each phase of the Daubechies wavelet [34][35][36] filtering. For classification purposes, BPNN was used and achieved an average recognition rate of 92.05%, as shown in Figure 8. In [52,53], the authors presented the similarities and dissimilarities between Urdu and Arabic script with respect to the recognition of handwritten numeric data. A hybrid technique of HMM and fuzzy rules was used to recognize the handwritten numerals of both Arabic and Urdu script. The dataset was prepared by inviting 30 trained users to write both the Urdu and Arabic numerals, and 900 samples were collected in total. The system obtained 97%, 96%, and 97.8% recognition rates using the fuzzy rule, HMM, and the hybrid approach, respectively. The authors also concluded that separating numerals from Urdu text in handwritten text is still a challenging issue due to shape similarity, e.g., the first character of Urdu script (Alif) and the Urdu numeral one have exactly the same shape. A new algorithm was proposed in [54] to preprocess the complex input and preserve the shape of the actual input. Fuzzy association rules were used to link secondary strokes with their respective primary strokes. Different classifiers such as the hidden Markov model (HMM), fuzzy logic, the k-nearest neighbor (KNN), hybrid fuzzy HMM, hybrid KNN fuzzy, and the convolutional neural network (CNN) were used for the classification. Statistical tests were applied to find the significance of the classifiers' results.
Similarly, a newly-developed OCR algorithm was introduced in the work reported in [55] that used a semi-supervised multi-level clustering for categorization of the ligatures. Classification was performed using four machine learning techniques, i.e., decision trees, linear discriminant analysis, naive Bayes, and k-nearest neighbor (k-NN). The system was implemented, and the results showed 62, 61, 73, and 9% accuracy for the decision tree, linear discriminant analysis, naive Bayes, and k-NN, respectively.
In a very recent work [56], the authors presented a simple and robust line segmentation algorithm for Urdu handwritten and printed text. In the proposed line segmentation algorithm, a modified header and a baseline detection method were used. This technique depends purely on a pixel-counting approach, which efficiently segments Urdu handwritten and printed text lines along with skew detection. The handwritten and printed Urdu text datasets were manually generated for evaluating the algorithm. The handwritten dataset consisted of 80 pages having 687 handwritten Urdu text lines, and the printed dataset consisted of 48 pages having 495 printed text lines. The algorithm performed significantly well on printed documents and on handwritten Urdu text documents with well-separated lines, and moderately well on documents containing overlapping words.
The literature related to Urdu text recognition at the character level indicates that the ANN outperformed other machine learning approaches. The character recognition systems based on ANN were applicable not only to Latin script, but also to handwritten cursive characters of the Arabic-based script. We present a novel CNN-based approach to recognize Urdu handwritten characters that embeds both pixel- and geometrical-based features. The geometrical features were extracted for each text image using a hybrid of connected-components labeling [57] and the upper-lower profile [58]. The upper-lower profile works by dividing the image into four columns, detecting the position of both the first and last black pixels in each column, and providing the bounding box covering the area of interest. The extracted features are then combined with pixel-based features into a feature vector, which is processed by our proposed model (discussed in the subsequent section) in order to recognize and classify characters using test sets of variable size and an invariant font.
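The upper-lower profile step described above can be sketched in a few lines. This is one possible reading of the description, not the procedure from [58]: the image is assumed to be a binary list of rows (1 for a black pixel, 0 for background), "columns" are interpreted as four vertical bands of equal width, and the function names are hypothetical.

```python
def upper_lower_profile(image, bands=4):
    """Topmost and bottommost ink rows in each vertical band.

    Returns a list of (top_row, bottom_row) tuples, or None for a band
    that contains no black pixel.
    """
    height = len(image)
    width = len(image[0])
    profile = []
    for k in range(bands):
        c0 = k * width // bands
        c1 = (k + 1) * width // bands
        ink_rows = [r for r in range(height) if any(image[r][c0:c1])]
        profile.append((ink_rows[0], ink_rows[-1]) if ink_rows else None)
    return profile

def bounding_box(image):
    """Smallest (top, left, bottom, right) box containing all black pixels."""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for c in range(len(image[0])) if any(row[c] for row in image)]
    if not rows:
        return None
    return (rows[0], cols[0], rows[-1], cols[-1])
```

On a small 4 × 8 test image with ink only in the left half, the two left bands report their top and bottom ink rows while the two right bands report `None`, and the bounding box encloses only the inked region.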

Our Dataset
In research activities related to data science, a data-rich dataset plays a key role in generating correct results. A precise and established dataset leads to the correct evaluation of the mathematical models that are implemented on it. Furthermore, each data-science domain needs a standard dataset in order to achieve benchmark results. While performing our experimental work, we found that there was no publicly available standard dataset, as mentioned earlier. In order to bridge this gap, we developed a novel dataset of Urdu handwritten isolated characters and numerals. Our dataset contained 800 images of each of the 38 Urdu characters and 10 numerals. The dataset was built by inviting 500 native Urdu speakers from different social groups. Each author was directed to write both the Urdu characters and numerals in his or her own handwriting in Nastaliq font in a column, as shown in Figure 9. Since the dataset was not collected from a single writer, there was little chance of overfitting the classification model to one writing style. It is pertinent to mention that the numeral part of this dataset was also used successfully for visualization in our work [59]. The ground truth and information about the authors of our proposed dataset, e.g., age, gender, hand preference while writing (left hand, right hand, or both), physical impairment (if any), and profession, were also recorded in a suitable XML-based repository. After dataset collection, the text pages were scanned on a flatbed scanner at 300 dpi and segmented manually into images of 28 by 28 pixels for each Urdu character and numeral. As mentioned earlier, the dataset consisted of 800 × 10 = 8000 numeral images and 800 × 38 = 30,400 Urdu character images. We plan to increase the number of authors to 1000 later in order to enrich the dataset by adding as many variations of handwriting as possible. Upon completion of the dataset, we will make it publicly available to researchers.
In the case of Urdu characters, we grouped the characters by shape similarity rather than by the number of strokes (the stroke-based grouping is shown in Figure 3). Based on this observation, we divided the Urdu characters into 12 groups, as shown in Figure 10. Our grouping therefore differs from the stroke-count-based grouping reported in [21].

Proposed Model
The block diagram of the proposed Urdu handwritten character classification is shown in Figure 11. The proposed recognition technique relies on a convolutional neural network (CNN) model with a feature-mapped output layer. When classifying Urdu numerals, the model assigns the given input to one of 10 classes. Similarly, when classifying Urdu characters, the same model assigns the given character to one of 12 classes (see Figure 10). The different phases of our proposed model are described in the following subsections.

Preprocessing
In this phase, the images of our proposed dataset went through some preprocessing steps to prepare the data for further processing. First, the images were processed to remove noise using the algorithms reported in the noteworthy work [60]. Then, we converted the images to gray-scale and resized them to 28 × 28 pixels while keeping the aspect ratio locked.
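The gray-scale conversion and aspect-preserving resize can be sketched as below. This is a minimal pure-Python illustration and not the authors' pipeline: the luminance weights (the common 0.299/0.587/0.114 weighting), nearest-neighbor sampling, and centering the result on a white 28 × 28 canvas are all assumptions, and the noise-removal step from [60] is not reproduced.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of 0-255 gray values."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]

def resize_keep_aspect(gray, size=28, background=255):
    """Scale the longer side to `size` and pad the shorter side.

    Nearest-neighbor sampling; the scaled image is centered on a
    size x size canvas filled with `background` (white by assumption).
    """
    h, w = len(gray), len(gray[0])
    scale = size / max(h, w)
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))
    canvas = [[background] * size for _ in range(size)]
    top = (size - new_h) // 2
    left = (size - new_w) // 2
    for r in range(new_h):
        src_r = min(h - 1, int(r / scale))
        for c in range(new_w):
            src_c = min(w - 1, int(c / scale))
            canvas[top + r][left + c] = gray[src_r][src_c]
    return canvas
```

A tall 56 × 28 input, for example, is scaled to 28 × 14 and centered horizontally, so the character keeps its proportions instead of being stretched to fill the square.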

Feature Extraction
Along with pixel-based data of images, each Urdu handwritten character was processed in order to extract the structural/geometrical features like the width of the character, the height of the character, the aspect ratio of the text image, the number of horizontal and vertical lines in the image, the number and position of loops and arcs, etc. These features were then embedded with the pixel-based data of the image in order to obtain accurate results in the classification. The structural features of the Urdu handwritten characters are shown in Figure 12.
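How structural features can be combined with pixel data into one feature vector is sketched below. Only three of the features named above (width, height, and aspect ratio of the character) are computed; counting lines, loops, and arcs is left out. The binarization threshold of 128, the scaling of pixels to [0, 1], and simple concatenation as the way of "embedding" are assumptions for illustration.

```python
def geometric_features(binary):
    """Width, height, and aspect ratio of the inked region (1 = ink)."""
    rows = [r for r, row in enumerate(binary) if any(row)]
    cols = [c for c in range(len(binary[0])) if any(row[c] for row in binary)]
    if not rows:
        return [0.0, 0.0, 0.0]
    height = rows[-1] - rows[0] + 1
    width = cols[-1] - cols[0] + 1
    return [float(width), float(height), width / height]

def build_feature_vector(gray, threshold=128):
    """Flattened, scaled pixels followed by the geometric features."""
    binary = [[1 if px < threshold else 0 for px in row] for row in gray]
    pixels = [px / 255.0 for row in gray for px in row]
    return pixels + geometric_features(binary)
```

For a 28 × 28 image, this yields a vector of 784 pixel values followed by three geometric values.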

Convolutional Neural Network
The architecture of a CNN is quite different from that of a conventional neural network model. In a conventional neural network, input values are transformed by passing through a series of hidden layers. Every layer is made up of a set of neurons, where each neuron is fully connected to all neurons in the previous layer. The reason behind the better performance of CNNs is that these networks capture the inherent properties of images [61]. This significant feature of CNN gave us the confidence to use it in the analysis of our proposed dataset.
We have our proposed dataset of Urdu handwritten numerals (31K images with 10 labels (0-9)), with the size of each image being 32 × 32 pixels, which is fed into the CNN model. In our model, the first layer was a 2D convolution layer with a 5 × 5 kernel. This layer slides the kernel over every pixel of the input image. The output of this layer was a feature map of size 28 × 28. After that, the structural features were embedded to build a feature vector. Each convolution output was passed through the ReLU (rectified linear unit) activation function [62]. The ReLU function handles the vanishing gradient problem better than the sigmoid function [63]. Furthermore, ReLU is considered to mimic the thresholding behavior of biological neurons. Finally, the fully-connected layer at the end classifies the given input. The model was tested for Urdu characters and numerals separately. The internal functioning of the CNN used in our experimental work is depicted in Figure 13.
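The relationship between the input size, the 5 × 5 kernel, and the 28 × 28 feature map follows from the standard convolution output-size formula, shown here together with ReLU. A stride of 1 and zero padding are assumptions, since the text does not state them.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial size of a convolution output along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

def relu(x):
    """Rectified linear unit: passes positive values, zeroes the rest."""
    return max(0.0, x)

# A 32 x 32 input with a 5 x 5 kernel, stride 1, and no padding gives 28 x 28.
print(conv_output_size(32, 5))
# A 28 x 28 input would need a padding of 2 to keep the same 28 x 28 size.
print(conv_output_size(28, 5, padding=2))
```

Under these assumptions, a 32 × 32 input without padding and a 28 × 28 input with a padding of 2 both produce the stated 28 × 28 feature map, whereas a 28 × 28 input without padding would shrink to 24 × 24.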

Experimental Setup and Results
In order to evaluate the accuracy of our proposed model (see Figure 11), we used CNN to classify both the Urdu handwritten characters and numerals in two different experiments. In order to classify the Urdu handwritten numerals, three quarters of the 8000 numeral images were selected as training data and one quarter as test data. The same proportion was used for classifying the Urdu handwritten character dataset in the second set of experiments. While training the CNN, which had four convolutional layers in both experiments, we considered the learning rate, the number of hidden neurons, and the batch size as parameters. It was observed from the results that CNN performance improved as the network scale increased, with one major drawback: overfitting arising from the longer training time. On the other hand, it is possible to reach the optimal state of the model by tuning the batch size. As a rule of thumb, the model can no longer be trained effectively once the batch size exceeds a certain value [64]. Furthermore, the batch size also depended on the available memory. The effect of both the batch size and learning rate on accuracy is shown in Figure 14. In order to avoid the issues mentioned above, we trained the model in a controlled environment using a momentum value of 0.8. This specific value of momentum helped with obtaining the optimal results. In order to reach the optimal state of the network, we had to increase the number of convolutional cores gradually, since increasing them all at once would cause overfitting. In the case of batch size, relatively large values were needed to approximate the global gradient. We chose a learning rate of 0.0025 with a batch size of 132.
The confusion matrices with efficiency graphs using different numbers of hidden neurons for the Urdu handwritten numerals are shown in Figure 15, with an average accuracy rate of 98.03%. The diagonal values show the classification accuracy of each individual Urdu numeral, while the overall accuracy achieved in each experiment is highlighted in a box (bottom right) in the corresponding confusion matrix. Furthermore, the off-diagonal cells correspond to incorrectly classified observations. Both the number of observations and the percentage of the total number of observations are shown in each cell. The column on the far right of the matrix shows the percentages of all the correctly and incorrectly predicted examples belonging to each class.
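The quantities read off a confusion matrix, as described above, can be computed as in the following sketch. The orientation (rows for the true class, columns for the predicted class) is an assumption; the figures in this paper may use the transposed layout.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix

def per_class_accuracy(matrix):
    """Diagonal count divided by the row total, for each true class."""
    return [
        row[i] / sum(row) if sum(row) else 0.0
        for i, row in enumerate(matrix)
    ]

def overall_accuracy(matrix):
    """Sum of the diagonal divided by the total number of samples."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0
```

The diagonal entries count correct predictions, every off-diagonal entry counts one kind of confusion between two classes, and the overall accuracy corresponds to the value shown in the bottom-right box of each matrix.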
It is clear from the performance graphs that the number of epochs needed to achieve higher accuracy increased with the number of hidden neurons. Here, the hidden neurons do not represent the number of classes. In practice, the number of hidden neurons in each new hidden layer equals the number of connections to be made. As also shown in the structure of the CNN (Figure 13), the internal layers may comprise different numbers of hidden neurons. These neurons help in capturing the features of the input image as deeply as possible. It is pertinent to mention that adding neurons may increase the complexity, but it helps achieve a higher accuracy rate. The output classes here were the labels of the Urdu handwritten characters, i.e., 0-9 in the case of numeral recognition.
The same set of experiments was performed with the dataset of Urdu handwritten characters. In this experiment, the model of CNN was modified by increasing the number of outputs from 10 to 12 since the Urdu characters were grouped into 12 classes based on the shape similarity (see Figure 10). Similarly, the same set of parameters with different values was applied for this set of experiments. We chose a learning rate of 0.08 with a starting batch size of 40. The confusion matrices with efficiency graphs using the different number of hidden neurons for the Urdu handwritten numerals and isolated characters are shown in Figures 15 and 16, respectively. Our final test accuracy was around 98.3% for Urdu numerals and 96.04% for Urdu characters. Table 4 depicts the comparison of our proposed approach with other techniques for the same task.
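The training settings reported above (a 3:1 train/test split, batch sizes of 132 and 40, learning rates of 0.0025 and 0.08, and a momentum of 0.8) can be made concrete with a small sketch. The update rule shown is the classical momentum form of stochastic gradient descent, which is an assumption: the text does not state which momentum variant or framework was used, and the helper names are hypothetical.

```python
import math

def split_counts(total, train_fraction=0.75):
    """Number of training and test samples for a given split fraction."""
    train = int(total * train_fraction)
    return train, total - train

def batches_per_epoch(num_samples, batch_size):
    """Number of mini-batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

def sgd_momentum_step(weights, grads, velocity, lr=0.0025, momentum=0.8):
    """One classical-momentum update: v = momentum * v - lr * g; w = w + v."""
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity

# Numerals: 8000 images split 3:1, trained with a batch size of 132.
train_n, test_n = split_counts(8000)
print(train_n, test_n, batches_per_epoch(train_n, 132))
```

With these numbers, the numeral experiment uses 6000 training and 2000 test images, and the character experiment (30,400 images) uses 22,800 and 7600.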
It is noteworthy that the results shown in blocks (last two red-colored rows of Figure 16) were quite similar regardless of the number of hidden neurons in the case of Urdu handwritten character classification. The experiments were also performed using variations of the n-fold cross-validation approach in order to avoid any confusion regarding the ratio of training and testing data. The confusion matrices for Urdu handwritten numerals are given in Tables 5 and 6, showing average accuracies of 92.7% and 95.6% using 10-fold and 8-fold cross-validation, respectively. Similarly, Tables 7 and 8 show the results for Urdu handwritten characters (shown in groups in Figure 10) with 10-fold and 8-fold cross-validation.
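The n-fold cross-validation procedure mentioned above partitions the data into n folds and uses each fold once as the test set. The sketch below generates the index sets only; it assumes the samples have already been shuffled, and the function name is hypothetical.

```python
def k_fold_indices(num_samples, k):
    """Return a list of (train_indices, test_indices) pairs, one per fold.

    Fold sizes differ by at most one when num_samples is not divisible by k.
    """
    indices = list(range(num_samples))
    sizes = [num_samples // k + (1 if i < num_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds
```

For the 8000 numeral images, 10-fold cross-validation gives test folds of 800 images each and 8-fold cross-validation gives test folds of 1000 images each; the reported accuracy is then typically the average over the folds.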
Overall, it was observed that our proposed model of CNN showed better predictive accuracy compared with other classification models.
Table 4 (last row). Our proposed approach: CNN; pixel- and geometrical-based features; 98.3%.
Table 5. Confusion matrix of n-fold cross-validation (10-fold) results for Urdu handwritten numeral classification.
Table 7. Confusion matrix of n-fold cross-validation (10-fold) results for Urdu handwritten characters classification (groups 1-12 against the group each was classified as).

Conclusions
In this paper, we made use of CNN (convolutional neural network) in recognizing and classifying Urdu handwritten characters. We also generated a novel dataset of Urdu handwritten characters and numerals. While performing experiments on our proposed dataset using CNN, we compared the results of different approaches in order to propose recommendations based on parameter tuning. The application of CNN to the classification of Urdu handwritten characters provides a platform for developing applications that help children at the beginner level learn how to write Urdu characters and numerals correctly. Furthermore, the Urdu domain still lacks the standard data resources needed to generate benchmark results.
In the field of machine learning, deep CNNs have brought a revolutionary change by providing quite efficient results in comparison with conventional approaches. However, there are also some open issues, such as the lack of knowledge of how to determine the number of layers and the number of hidden neurons in each layer. Furthermore, a large-scale dataset is required to check the validity and efficiency of deep network models. Therefore, in our experiment, we had to train the CNN with many samples. In addition, finding a set of optimal parameters that minimizes errors is also a research issue. Moreover, our proposed classifier can be compared with other neural network models such as two-dimensional BLSTM or bidirectional LSTM. Similarly, more complex future tasks, such as recognizing characters in rotated, mirrored, and noisy images, could benefit from extracting novel features. We also plan to develop a system that recognizes individual Urdu characters rather than groups of characters. Since data science is continuously providing multifaceted large-scale datasets, it is essential to design and develop more efficient CNN models that are cost effective in the utilization of resources like memory, computational bandwidth, etc.
According to Table 4, our proposed model was significantly better than the approaches used in the related literature in terms of the number of parameters and the amount of calculation. Furthermore, our proposed model was quite efficient (in terms of accuracy) and effective at performing the recognition and classification since it provided better accuracy in the minimum time as compared with the others, and it is suitable for developing a learning application for children on mobile phones.