Atom Search Optimization with Deep Learning Enabled Arabic Sign Language Recognition for Speaking and Hearing Disability Persons

Sign language plays a crucial role in the lives of people with hearing and speaking disabilities, who can send messages via hand gesture movement. Arabic Sign Language (ASL) recognition is a very difficult task because of its high complexity and increasing intraclass similarity. Sign language may be utilized for the communication of sentences, letters, or words using diverse hand signs. Such communication helps to bridge the communication gap between people with hearing impairment and other people, and also makes it easier for people with hearing impairment to express their opinions. Recently, a large number of studies have been devoted to developing systems capable of classifying the signs of different sign languages into the given classes. Therefore, this study designs an atom search optimization with a deep convolutional autoencoder-enabled sign language recognition (ASODCAE-SLR) model for persons with speaking and hearing disabilities. The presented ASODCAE-SLR technique mainly aims to assist their communication via the SLR process. To accomplish this, the ASODCAE-SLR technique initially pre-processes the input frames by a weighted average filtering approach. In addition, the ASODCAE-SLR technique employs a capsule network (CapsNet) feature extractor to produce a collection of feature vectors. For the recognition of sign language, the DCAE model is exploited in the study. At the final stage, the ASO algorithm is utilized as a hyperparameter optimizer, which in turn increases the efficacy of the DCAE model. The experimental validation of the ASODCAE-SLR model is carried out using the Arabic Sign Language dataset. The simulation analysis exhibits the enhanced performance of the ASODCAE-SLR model compared to existing models.


Introduction
Communication is the major component of interpersonal relationships; it acts as an important connection between individuals and characterizes human existence. Additionally, it is a prominent basis for promoting the growth of the human population. Communication is classified into verbal and nonverbal forms, and its core is the exchange of data between a sender and a receiver [1]. Within nonverbal communication, spontaneous and disguised spontaneous forms are distinguished: the former is an intentional communication arising from a motivating emotional state, and the latter an instinctive yet strategic one.
One line of prior work proposed an algorithm to accomplish the detection of 20 sentences and 50 words. Its segmentation technique divides whole-sentence signals into word units; a DL method then identifies each word element and reconstructs the sentences in reverse. Khan et al. [13] aimed to illustrate a user-friendly method for Bangla SL-to-text conversion via a CNN and personalized ROI segmentation. By utilizing the ROI selection approach, the technique shows improved performance compared to traditional methodologies.
In [14], the authors developed an SL fingerspelling alphabet detection technique using image processing, supervised deep learning, and machine learning. Specifically, twenty-four alphabet symbols are formed by different combinations of static gestures (excluding the two motion gestures, J and Z). Local binary pattern (LBP) and histogram of oriented gradients (HOG) features of every gesture are extracted from the training images. Next, a multi-class support vector machine (SVM) is employed to train on the extracted dataset. Mannan et al. [15] applied a deep convolutional neural network (DCNN) for ASL alphabet detection to resolve ASL detection problems. The efficiency of the DCNN method depends on the quantity of the given data; for this purpose, a data augmentation method was employed to expand the size of the training dataset from the current dataset.
Sharma et al. [16] introduced a DCNN method for recognizing different symbols in ISL, covering thirty-five classes. These classes comprise cropped images of hand gestures. Different from other feature selection-based models, a DCNN has the benefit of automated feature extraction during training, known as end-to-end learning. A lightweight transfer learning (TL) structure makes the model training faster and provides 100% accuracy. Furthermore, a web-based method was proposed that can simply decode the symbols. In [17], a novel architecture was proposed for signer-independent SL detection with different DL components, encompassing a deep recurrent neural network (DRNN), hand semantic segmentation, and hand-shape feature representation. Hand-shape features are extracted by a single-layer convolutional self-organizing map (CSOM) rather than by relying on TL from a pretrained DCNN. Then, the series of extracted feature vectors is recognized using a deep BiLSTM-RNN.
Though several ML and DL models for sign language recognition are available in the literature, there is still a need to enhance classification performance. Owing to the continual deepening of models, the number of parameters of DL models also increases quickly, which results in model overfitting. Since the trial-and-error method for hyperparameter tuning is a tedious and error-prone process, metaheuristic algorithms can be applied. Therefore, in this work, we employ the ASO algorithm for the parameter selection of the DCAE model.

The Proposed Model
In this study, a new ASODCAE-SLR technique has been developed for recognizing sign languages to assist the communication of persons with speaking and hearing disabilities. The ASODCAE-SLR technique initially pre-processes the input frames by a weighted average filtering approach. Next, it employs a CapsNet feature extractor to produce a collection of feature vectors. To identify and classify sign language, ASO with the DCAE model is exploited in the study.

Image Pre-Processing
The ASODCAE-SLR technique initially pre-processes the input frames by a weighted average filtering approach. The weighted average filter is designed to suppress noise and enhance spatial-domain features efficiently [9]. The filter W_η is defined as an η × η matrix, where η is an odd number. Each element of the matrix is determined by the distance between its position and the center of the matrix, as given in Equation (1); the center element is w_{(η+1)/2,(η+1)/2} = 2/η². The presented filter preserves edges while suppressing speckle noise better than other filters such as the mean filter, and maintains the continuity of images.
Given I_1, I_2 ∈ R^{N_r × N_c}, convolving each image with W_η yields the two images I_1^w(η) = I_1 * W_η and I_2^w(η) = I_2 * W_η, where * signifies the 2D convolution operation.
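The filtering step above can be sketched in NumPy. The exact distance-based weighting of [9] is not fully recoverable from the text, so the inverse-distance weights and edge-replicating padding below are assumptions; the odd kernel size and the normalization follow the description:

```python
import numpy as np

def weighted_average_kernel(eta):
    # Build the eta x eta filter W_eta; eta must be odd (cf. Equation (1)).
    assert eta % 2 == 1, "eta must be odd"
    c = eta // 2
    yy, xx = np.mgrid[0:eta, 0:eta]
    dist = np.sqrt((yy - c) ** 2 + (xx - c) ** 2)
    w = 1.0 / (1.0 + dist)   # assumed: weight decays with distance from center
    return w / w.sum()       # normalize so the weights sum to one

def weighted_average_filter(img, eta):
    # Convolve the image with W_eta, replicating edges to preserve size.
    w = weighted_average_kernel(eta)
    p = eta // 2
    padded = np.pad(img, p, mode="edge")
    out = np.empty(img.shape, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = (padded[i:i + eta, j:j + eta] * w).sum()
    return out
```

Because the weights sum to one, a constant region passes through unchanged, which is the sense in which the filter smooths noise while keeping image continuity.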

Feature Extraction: CapsNet Model
Next to pre-processing, the ASODCAE-SLR technique employs the CapsNet model to generate feature vectors. A major benefit of CapsNets is that they retain more concrete features, which can be interpreted to understand what, and how, the network is learning. A CapsNet is able to encode spatial information and differentiate among several poses, textures, and orientations [18]. A capsule is a group of neurons; each capsule has an activity vector associated with it that captures several instantiation parameters for recognizing a certain kind of object or object part. The length of the vector represents the probability of the presence of that object, and its orientation encodes the generalized pose. These vectors are passed from lower-layer capsules to upper-level capsules, with coupling coefficients between the capsule layers. When the forecast of a lower-level capsule agrees with the outcome of the current capsule, the value of the coupling coefficient between them increases, as computed with a softmax function. In particular, when the current capsule detects a tight cluster of lower-level predictions, strongly indicating the occurrence of an object, its output probability is high; this mechanism is known as routing by agreement. Figure 1 depicts the framework of the CapsNet method.

Initially, the prediction vector (Equation (2)) is calculated as

û_{j|i} = W_{ij} u_i, (2)

where û_{j|i} is the prediction vector for the upper-level capsule j, and W_{ij} and u_i denote the weight matrix and the output vector of capsule i in the lower layer, respectively. This captures spatial connections and interactions between objects and their sub-objects. In Equation (3), the coupling coefficients are computed with the softmax function, depending on the degree of agreement between capsules in neighboring layers:

c_{ij} = exp(b_{ij}) / Σ_k exp(b_{ik}), (3)

where b_ij signifies the log prior between the two capsules, initialized to zero, and k indexes the capsules. The input vector s_j to the jth upper-layer capsule, a weighted sum of the û_{j|i} vectors learned by the routing technique, is computed as

s_j = Σ_i c_{ij} û_{j|i}. (4)

Lastly, a squashing function that integrates squashing and unit scaling (Equation (5)) is applied to confine the length of the result to the range between zero and one, so it can be read as a probability:

v_j = (||s_j||² / (1 + ||s_j||²)) · (s_j / ||s_j||). (5)

The loss function (Equation (6)) is attached to the capsules of the final layer:

L_k = T_k · max(0, m⁺ − ||v_k||)² + λ(1 − T_k) · max(0, ||v_k|| − m⁻)², (6)

where m⁺ and m⁻ are fixed to 0.9 and 0.1, respectively.
Here T_k is 1 for the correct label and 0 otherwise, and λ is a constant whose value is 0.5. The first term contributes for correct labels and the second term for incorrect labels: if T_k is 1, the second term becomes 0, and if T_k is 0, the first term becomes 0. Accordingly, the loss L_k is 0 for a correct prediction with ||v_k|| greater than 0.9, and non-zero otherwise.
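Equations (2)-(6) amount to the standard dynamic-routing-by-agreement procedure; a minimal NumPy sketch under that assumption (the capsule counts, dimensions, and three routing iterations are illustrative choices, not values from the paper):

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    # Eq. (5): scale vector length into (0, 1) while keeping orientation.
    n2 = (s ** 2).sum(axis=axis, keepdims=True)
    return (n2 / (1.0 + n2)) * s / np.sqrt(n2 + eps)

def routing(u_hat, iters=3):
    # u_hat[i, j, :] = prediction of lower capsule i for upper capsule j (Eq. (2)).
    n_lower, n_upper, _ = u_hat.shape
    b = np.zeros((n_lower, n_upper))                          # log priors, init to zero
    for _ in range(iters):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # Eq. (3): softmax
        s = (c[..., None] * u_hat).sum(axis=0)                # Eq. (4): weighted sum
        v = squash(s)                                         # Eq. (5)
        b += (u_hat * v[None]).sum(axis=-1)                   # agreement raises coupling
    return v

def margin_loss(v, T, m_pos=0.9, m_neg=0.1, lam=0.5):
    # Eq. (6): penalize short vectors for correct classes and
    # long vectors for incorrect classes.
    lengths = np.sqrt((v ** 2).sum(axis=-1))
    L = T * np.maximum(0.0, m_pos - lengths) ** 2 \
        + lam * (1 - T) * np.maximum(0.0, lengths - m_neg) ** 2
    return L.sum()
```

Note how the agreement update on `b` implements routing by agreement: a lower capsule whose prediction aligns with the upper capsule's output gets a larger coupling coefficient on the next pass.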

Sign Language Recognition: DCAE Model
To identify and classify sign language, the DCAE model is exploited in this study. An AE is a conventional DNN structure that uses its own input as the label: the network attempts to reconstruct its input during learning [19], and in doing so it generates and automatically extracts a representative feature in a suitable number of iterations. This kind of network is created by stacking deep layers in AE form, consisting of two major parts, an encoder and a decoder. A DCAE is a kind of AE that applies convolution layers to exploit the inner structure of an image. In a CAE, the weights are shared among all locations within every feature map, thereby reducing parameter redundancy and preserving spatial locality. For extracting deep features, let D, W, and H be the depth, width, and height of the dataset, respectively, and n the pixel count. For every member of the set X, image patches of size 7 × 7 × D are extracted, where χ_j denotes the central pixel. Consequently, the set X is characterized as image patches, and every patch x*_i is given to the encoder block. For an input x*_i, the hidden mapping of the kth feature map is

h^k = σ(x*_i * W^k + b^k), (7)

where b^k refers to the bias, σ denotes an activation function, and the symbol * corresponds to the 2D convolution. The reconstruction is attained by the following expression:

y = σ(Σ_k h^k * W̃^k + b̃), (8)

where there is one bias b̃ per input channel, h denotes the set of latent feature maps, W̃ corresponds to flipping the weights W over both dimensions, and y denotes the predicted (reconstructed) value. In order to determine the parameter vector describing the complete DCAE architecture, one minimizes the following cost function:

E = (1/2n) Σ_{i=1}^{n} ||x*_i − y_i||². (9)

Minimizing this function requires evaluating the gradient of the cost with respect to the convolution kernels W, W̃ and the biases b, b̃, where δh and δy denote the deltas of the hidden state and the reconstruction, respectively.
Then, the weights are updated by the optimization methodology. Finally, the DCAE parameters are obtained once the loss function converges. The output feature map of the encoder block is regarded as the deep feature. In this study, batch normalization (BN) is employed to tackle the internal covariate shift phenomenon and to enhance the efficiency of the network by normalizing the layer inputs via re-centering and re-scaling. BN helps to increase accuracy and speeds up learning.
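A minimal NumPy sketch of a single-layer convolutional autoencoder forward pass in the spirit of Equations (7)-(9); the 3x3 kernel, sigmoid activation, and single feature map are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

def conv2d(x, w, pad):
    # "Same" 2D convolution: zero-pad, then slide the flipped kernel.
    k = w.shape[0]
    xp = np.pad(x, pad)
    wf = w[::-1, ::-1]                 # true convolution flips the kernel
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = (xp[i:i + k, j:j + k] * wf).sum()
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dcae_forward(x, W, b, b2):
    k = W.shape[0]
    # Eq. (7): encoder feature map h = sigma(x * W + b).
    h = sigmoid(conv2d(x, W, k // 2) + b)
    # Eq. (8): decode with the weights flipped over both dimensions.
    y = sigmoid(conv2d(h, W[::-1, ::-1], k // 2) + b2)
    # Eq. (9): mean squared reconstruction cost.
    cost = ((x - y) ** 2).mean() / 2.0
    return h, y, cost
```

Sharing the same (flipped) kernel between encoder and decoder is what keeps the parameter count low and preserves spatial locality, as described above.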

Hyperparameter Tuning: ASO Algorithm
In this study, the ASO algorithm is exploited to fine-tune the hyperparameter values of the DCAE model. The ASO technique mathematically models molecular dynamics: the position of each atom in the search space represents a solution, and each atom is affected by its mass [20]. ASO begins the optimization by creating a group of random atoms in an N-dimensional space. Afterward, the solution of each atom is evaluated against the objective function. Atoms update their position and velocity in every iteration, and the position of the best atom is updated in every iteration. The velocity of an atom is a function of its acceleration, and following Newton's second law, the acceleration is estimated as the ratio of the forces exerted on the atom to its mass. The mass of the ith atom at iteration t, m_i(t), is computed by the following formulas:

M_i(t) = exp(−(Fit_i(t) − Fit_Best(t)) / (Fit_Worst(t) − Fit_Best(t))), (11)
m_i(t) = M_i(t) / Σ_{j=1}^{N} M_j(t), (12)

where Fit_Best(t) and Fit_Worst(t) signify the best and worst objective values at the tth iteration, and Fit_i(t) is the objective value of the ith atom at the tth iteration. For minimization problems, Fit_Best(t) and Fit_Worst(t) are taken as the minimum and maximum of the Fit_i(t), respectively. In each iteration, the number of neighbors with which each atom interacts is defined using Equation (16):

K(t) = N − (N − 2) · √(t/T), (16)

in which T defines the total number of iterations of the technique, in other words, the lifetime of the system. As noted, the parameter K is a function of time, decreasing gradually over the iterations. The forces exerted on each atom comprise two kinds: interaction forces and internal constraint forces.
The interaction force is determined using the Lennard-Jones potential model, and the internal constraint force, which relates to the bond-length potential and varies with the distance between each atom and the best atom, is computed using Equations (17) and (18), respectively.
where F and G define the interaction and internal constraint forces, respectively, rand_j depicts a random number between 0 and 1, and K_Best refers to the subset of the atom population containing the K atoms with the best objective values. Additionally, x_d^best(t) demonstrates the position of the best atom at the tth iteration in the dth dimension, λ(t) illustrates the Lagrangian coefficient, α stands for the depth coefficient, and β implies the weight coefficient. Figure 2 illustrates the flowchart of the ASO technique.
Next, the acceleration of atom i in dimension d at iteration t is computed as in Equation (19):

a_i^d(t) = F_i^d(t)/m_i(t) + G_i^d(t)/m_i(t). (19)

The last step in every iteration updates the atom's velocity and position with the following formulas:

v_i^d(t + 1) = rand_i^d · v_i^d(t) + a_i^d(t), (20)
x_i^d(t + 1) = x_i^d(t) + v_i^d(t + 1). (21)

These updates and computations are repeated until the termination condition is met. Finally, the position and objective value of the best atom are taken as the best estimate of the solution to the problem.
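The full ASO update with its Lennard-Jones interaction term is intricate; the sketch below is a deliberately simplified variant, tested on the sphere function. It keeps the mass computation (Eqs. (11)-(12)), the shrinking neighbor count K(t) (Eq. (16)), and the velocity/position updates (Eqs. (20)-(21)), but approximates the interaction force by a random-weighted pull toward the K best atoms and the constraint force by a pull toward the best atom; the coefficients `alpha` and `beta` and the stability scaling are assumptions:

```python
import numpy as np

def aso_minimize(f, dim, n_atoms=20, iters=60, bounds=(-2.0, 2.0),
                 alpha=0.8, beta=0.2, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_atoms, dim))      # initial atom positions
    v = np.zeros((n_atoms, dim))                 # initial velocities
    best_x, best_f = x[0].copy(), np.inf
    for t in range(1, iters + 1):
        fit = np.array([f(xi) for xi in x])
        if fit.min() < best_f:                   # track the best atom ever seen
            best_f = float(fit.min())
            best_x = x[fit.argmin()].copy()
        # Eqs. (11)-(12): mass from normalized fitness (better atoms are heavier).
        M = np.exp(-(fit - fit.min()) / (fit.max() - fit.min() + 1e-12))
        m = M / M.sum()
        # Eq. (16): neighbor count K shrinks as the iterations proceed.
        K = max(2, int(n_atoms - (n_atoms - 2) * np.sqrt(t / iters)))
        kbest = x[np.argsort(fit)[:K]]
        # Simplified interaction force (stand-in for the Lennard-Jones term):
        # a random-weighted pull toward the K best atoms.
        F = (rng.random((K, n_atoms, dim)) * (kbest[:, None] - x[None])).mean(0)
        # Simplified constraint force (cf. Eq. (18)): pull toward the best atom.
        G = beta * (best_x - x)
        # Eq. (19): acceleration = force / mass (scaled here for stability).
        a = (alpha * (1 - (t - 1) / iters) * F + G) / (n_atoms * m[:, None])
        # Eqs. (20)-(21): velocity and position updates.
        v = rng.random((n_atoms, dim)) * v + a
        x = np.clip(x + v, lo, hi)
    return best_x, best_f

best_x, best_f = aso_minimize(lambda z: float((z ** 2).sum()), dim=2)
```

On the sphere function this typically drives the population toward the origin; in the paper's setting, `f` would instead evaluate a DCAE configuration and return its classification error rate.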
The ASO approach derives a fitness function to achieve enhanced classification performance. It assigns a positive value to signify the quality of a candidate solution. The minimization of the classification error rate is considered as the fitness function in this study, i.e., the fitness of a candidate hyperparameter vector is the fraction of misclassified samples it produces.
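The fitness evaluation reduces to the classification error rate; a small sketch (the label arrays are illustrative):

```python
import numpy as np

def fitness(y_true, y_pred):
    # Classification error rate: fraction of misclassified samples.
    # Lower is better, matching the minimization objective of ASO.
    return float(np.mean(np.asarray(y_true) != np.asarray(y_pred)))

print(fitness([1, 2, 3, 4], [1, 2, 0, 4]))  # 0.25
```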

Result Analysis
The proposed model is simulated using the Python 3.6.5 tool on a PC with an i5-8600K CPU, GeForce GTX 1050 Ti 4 GB GPU, 16 GB RAM, 250 GB SSD, and 1 TB HDD. The parameter settings are as follows: learning rate: 0.01, dropout: 0.5, batch size: 5, epoch count: 50, and activation: ReLU. This section inspects the sign language recognition outcomes of the ASODCAE-SLR model using the Arabic Sign Language dataset. In this study, a total of 1100 samples under 11 class labels are used. Table 1 gives a detailed description of the dataset.

The confusion matrix generated by the ASODCAE-SLR model on the entire dataset is demonstrated in Figure 3. The figure shows that the ASODCAE-SLR model has accurately recognized all 11 class labels on the entire dataset.
Table 2 reports the sign language recognition outcomes of the ASODCAE-SLR model on the entire dataset. The ASODCAE-SLR model has recognized samples under class 1 with accu_y, prec_n, reca_l, F1_score, and Jaccard index of 99.27%, 96.94%, 95%, 95.96%, and 99.23%. Additionally, it has recognized samples under class 2 with accu_y, prec_n, reca_l, F1_score, and Jaccard index of 99.27%, 96.94%, 93%, 95.88%, and 92.08%. In line with this, it has recognized samples under class 3 with accu_y, prec_n, reca_l, F1_score, and Jaccard index of 99.64%, 98%, 98%, 98%, and 96.08%. Next, it has recognized samples under class 4 with accu_y, prec_n, reca_l, F1_score, and Jaccard index of 99%, 92.38%, 97%, 94.63%, and 89.81%.
The confusion matrix generated by the ASODCAE-SLR approach on 70% of the training (TR) data is displayed in Figure 4. The figure shows that the model has accurately recognized all 11 class labels on 70% of the TR data.
Table 3 illustrates the sign language recognition outcomes of the ASODCAE-SLR methodology on 70% of the TR data. The technique has recognized samples under class 1 with accu_y, prec_n, reca_l, F1_score, and Jaccard index of 99.35%, 97.10%, 95.71%, 96.40%, and 93.06%. Additionally, it has recognized samples under class 2 with accu_y, prec_n, reca_l, F1_score, and Jaccard index of 99.09%, 98.48%, 91.55%, 94.89%, and 90.28%. Similarly, it has recognized samples under class 3 with accu_y, prec_n, reca_l, F1_score, and Jaccard index of 99.48%, 96.97%, 96.97%, 96.97%, and 94.12%. At last, it has recognized samples under class 4 with accu_y, prec_n, reca_l, F1_score, and Jaccard index of 98.96%, 93.51%, 96%, 94.74%, and 90%.
The confusion matrix generated by the ASODCAE-SLR approach on 30% of the testing (TS) data is represented in Figure 5. The figure shows that the technique has accurately recognized all 11 class labels on 30% of the TS dataset.
Table 4 demonstrates the sign language recognition outcomes of the ASODCAE-SLR technique on 30% of the TS data. The approach has recognized samples under class 1 with accu_y, prec_n, reca_l, F1_score, and Jaccard index of 99.09%, 96.55%, 93.33%, 94.92%, and 90.32%.
Furthermore, the methodology has recognized samples under class 2 with accu_y, prec_n, reca_l, F1_score, and Jaccard index of 99.70%, 100%, 96.55%, 98.25%, and 96.55%. In addition, it has recognized samples under class 3 with accu_y, prec_n, reca_l, F1_score, and Jaccard index of 100%, 100%, 100%, 100%, and 100%. Afterward, it recognized samples under class 4 with accu_y, prec_n, reca_l, F1_score, and Jaccard index of 99.09%, 89.29%, 100%, 94.34%, and 89.29%.
The training accuracy (TRA) and validation accuracy (VLA) acquired by the ASODCAE-SLR approach on the test dataset are shown in Figure 6. The experimental result states that the technique has achieved improved values of TRA and VLA; in particular, the VLA appeared greater than the TRA.
The training loss (TRL) and validation loss (VLL) accomplished by the ASODCAE-SLR system on the test dataset are depicted in Figure 7. The experimental result reveals that the approach has obtained minimal values of TRL and VLL; notably, the VLL is lower than the TRL.
A clear precision-recall examination of the ASODCAE-SLR algorithm on the test dataset is illustrated in Figure 8. The figure shows that the approach has resulted in higher precision-recall values under all classes.
A detailed ROC analysis of the ASODCAE-SLR system on the test dataset is illustrated in Figure 9. The outcomes demonstrate the capability of the algorithm in categorizing distinct classes on the test dataset.
At last, comprehensive comparative results of the ASODCAE-SLR model with recent models are given in Figure 10. The GRU, RNN, and BiLSTM models report slightly enhanced classification performance, whereas the LSTM model shows reasonable classification performance. Moreover, the LSTM-GRU model accomplishes near-optimal performance. However, the obtained values imply that the ASODCAE-SLR model has accomplished improved performance over the other models.

Conclusions
In this study, a new ASODCAE-SLR technique has been developed for recognizing sign languages to assist the communication of persons with speaking and hearing disabilities. The ASODCAE-SLR technique initially pre-processes the input frames by a weighted average filtering approach. Next, it employs a CapsNet feature extractor to produce a collection of feature vectors. To identify and classify sign language, the DCAE model is exploited in the study. At the final stage, the ASO algorithm is utilized as a hyperparameter optimizer, which in turn increases the efficacy of the DCAE model. The experimental validation of the ASODCAE-SLR model is carried out using the Arabic Sign Language dataset. The simulation analysis exhibits the enhanced performance of the ASODCAE-SLR model compared to existing models. Therefore, the proposed model can be employed to assist communication between persons with hearing and speaking disabilities and other people. The proposed model can be extended to sign board recognition in real-time applications. In the future, the performance of the proposed model can be tested on a real-time large-scale dataset. In addition, a fusion of DL models can be derived to boost the SL recognition performance.

Data Availability Statement: Data sharing is not applicable to this article as no datasets were generated during the current study.