Artificial Intelligence Driven Biomedical Image Classification for Robust Rheumatoid Arthritis Classification

Recently, artificial intelligence (AI) including machine learning (ML) and deep learning (DL) models has been commonly employed for the automated disease diagnosis process. AI in biological and biomedical imaging is an emerging area and will be a future trend in the field. At the same time, biomedical images can be used for the classification of Rheumatoid arthritis (RA) diseases. RA is an autoimmune illness that affects the musculoskeletal system causing systemic, inflammatory and chronic effects. The disease frequently becomes progressive and decreases physical function, causing articular damage, suffering, and fatigue. After a time, RA causes harm to the cartilage of the joints and bones, weakens the tendons and joints, and finally causes joint destruction. Sensors (thermal infrared camera sensor, accelerometers and wearable sensors) are more commonly employed to collect data for RA. This study develops an Automated Rheumatoid Arthritis Classification using an Arithmetic Optimization Algorithm with Deep Learning (ARAC-AOADL) model. The goal of the presented ARAC-AOADL technique lies in the classification of health disorders depending upon RA and orthopaedics. Primarily, the presented ARAC-AOADL technique pre-processes the input images by median filtering (MF) technique. Then, the ARAC-AOADL technique uses AOA with an enhanced capsule network (ECN) model to produce feature vectors. For RA classification, the ARAC-AOADL technique uses a multi-kernel extreme learning machine (MKELM) model. The experimental result analysis of the ARAC-AOADL technique on a benchmark dataset reported a maximum accuracy of 98.57%. Therefore, the ARAC-AOADL technique can be employed for accurate and timely RA classification.


Introduction
Biomedical imaging acts as a vital part in the domain of biology and biomedicine, offering data related to the structural and functional mechanism of cells and the human body. Biological and biomedical imaging comprises microscopy, molecular imaging, pathological imaging, optical coherence tomography, nuclear medicine, ultrasound imaging, X-ray radiography, computed tomography, magnetic resonance imaging, and so on. The typical medical phenotype is labelled as non-infectious persistent polyarticular swelling, specifically of small joints, that causes progressive joint destruction and deformity and bone erosion, i.e., rheumatoid arthritis (RA) [1]. However, persistent periarticular synovitis, regardless of the immunopathogenesis, was linked to joint erosion and destruction, so regardless of the immune trigger a similar medical phenotype arises [2]. A logical extension of this view is that clinically different cases are linked with the medical RA phenotype. An unfulfilled necessity occurs in the translational setting for developing a robust pattern for assessing, diagnosing and prognosing patients affected with polyarthritis characteristic of early RA, particularly in recent times where the key role of autoinflammation or innate immunity is effectively detected in other chronic inflammatory illnesses [3]. Here, a new classification is proposed for the full medical disease spectrum of RA by utilizing the paradigmatic shift that occurred, in addition to the description of autoimmunity against citrullinated antigens in numerous RA cases. This immunological disease continuum method of inflammation in RA has consequences for therapeutic techniques [4,5].
Particularly viable applications of the ensemble ML technique include hospital-based applications, smart home control, on-request information systems, monitoring systems, outpatient care and mobile game communication interfaces [6]. Additionally, machine learning (ML) can be used in the development and evaluation of electroencephalogram (EEG)-related brain activity measured using a biosensor [7]. With the developments in on-body wearable sensors and sensor technology, the ML technique will perform effectively in RA disease categorization. A wearable glove-related IoT sensor has been used for identifying RA diseases. Likewise, thermal camera sensors have been used in several analyses to monitor temperature disparities in finger joints [8]. ML includes different algorithms, procedures and techniques for finding limited associations within particular data and for producing tools that combine prescription, prediction or description. ML has been introduced into many clinical domains and has proven very accurate in classifying and identifying different illnesses [9]. ML is a commonly used method for enhancing medical services and disease diagnosis as medical data grows in various medical fields. Studies of effective applications of artificial intelligence (AI), which includes deep learning (DL) and ML methods, have seen an exponential growth in healthcare and medical fields [10]. Such techniques are critical in offering high-quality care to patients with RA.
This study develops an Automated Rheumatoid Arthritis Classification using an Arithmetic Optimization Algorithm with Deep Learning (ARAC-AOADL) model. The goal of the presented ARAC-AOADL technique lies in the classification of health disorders depending upon RA and orthopaedics. Primarily, the presented ARAC-AOADL technique pre-processes the input images by median filtering (MF) technique. Then, the ARAC-AOADL technique uses AOA with an enhanced capsule network (ECN) model to produce feature vectors. For RA classification, the ARAC-AOADL technique uses a multi-kernel extreme learning machine (MKELM) model. The experimental results analysis of the ARAC-AOADL technique is tested on two medical datasets and the results are examined under several aspects.

Related Works
Lim et al. [11] modelled a new feature engineering technique compiling potentially functional coding haplotypes (pfcHap), including ML feature selection to detect biologically meaningful, probably causative genetic factors, that considers effective SNP-SNP interactions in the pfcHap to optimally forecast the methotrexate (MTX) response in RA patients. Ahalya et al. [12] used modified pre-trained CNN techniques to produce an automated patch-related classification of hand X-ray images. A customized CNN technique was then developed for automated feature extraction and classification of hand X-ray images and for comparing the efficiency of CNN techniques with linear and non-linear kernels. Finally, they classified normal and RA cases by employing ML methods on a hand-crafted feature fusion (SIFT and customized CNN features).
Yang et al. [13] introduced a grading technique to estimate and detect the texture and geometric features of bone erosion and synovium thickening. This study will use the metrics and texture features of ROI in a dissimilar way to previous studies in this area. The segmented outcomes were examined for the extraction of three quantitative geometric variables, which are integrated with GLCM statistic texture features to describe the ultrasonic image of metacarpophalangeal RA. Tang et al. [14] modelled an automated RA grading technique leveraging DCNN for assistance in medical assessment. Here, the input is the Gray-scale ultrasound images of finger joints whereas the output is the RA grading outcomes. The authors executed data augmentation for increasing the training samples count. The authors pre-trained the GoogLeNet on ImageNet as an extracting feature and then fine-tuned them.
In [15], the classification of clinical disorders related to RA and orthopaedics datasets utilizing ensemble techniques was presented. The RA data was collected from a study of WBC classification by utilizing features derived from lymphocyte images obtained through a digital microscope. The orthopaedic dataset is a benchmark dataset for this work, since it poses the same classification problem with some numerical features. In this study, three ensemble techniques, random subspace, bagging and Adaboost, were used. Such ensemble techniques use RF and kNN as the base learners. In [16], for enhancement of disease risk evaluation, ML and matrix factorization methods were combined to find significant and implicit risk factors. A new structure was modelled that can successfully evaluate early disease risks, and RA was employed as a case study. This structure has three main phases: early disease risk assessment, data preprocessing, and risk factor optimization. This was the first study combining ML and matrix factorization for disease risk evaluation implemented with nation-wide and longitudinal medical diagnostic databases. Andreu-Perez et al. [17] formulated a technique that could produce optimally grained actigraphies for capturing the effect of the disease on the daily actions of patients. A study of processing techniques related to ML and DL was offered.

The Proposed Model
In this study, we have developed a new ARAC-AOADL technique for accurate RA classification, which helps to identify the health disorders depending upon RA and orthopaedics. The presented ARAC-AOADL technique encompasses MF-based noise removal, ECN feature extraction, AOA hyperparameter tuning and MKELM classification. The working of the ARAC-AOADL technique is depicted in Figure 1.

Noise Filtering Technique
Primarily, the presented ARAC-AOADL technique pre-processes the input images by the MF technique to remove the noise within them. MF is one of the techniques applied for noise removal in medical images. The main idea behind MF is to sort the pixel values of each neighborhood in increasing order, select the median of the sorted values, and replace the central pixel with that median, where C indicates the central location of the neighborhood in the image. Here, the MF was executed for digital noise removal from the input images, with a filter mask of size 3 × 3.
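The filtering step described above can be illustrated with a minimal sketch. It assumes a grayscale image stored as a nested list; the function name `median_filter` and the border handling (using only the neighbors that fall inside the image) are illustrative assumptions, not details given in the text.

```python
from statistics import median


def median_filter(image, size=3):
    """Apply a size x size median filter to a 2-D list of pixel values.

    Assumption: at the borders, only neighbors inside the image are used.
    """
    half = size // 2
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the neighborhood centered on location (r, c).
            window = [
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            # Replace the central pixel with the median of the sorted values.
            out[r][c] = median(window)
    return out


# Hypothetical 3 x 3 patch with one impulse-noise pixel in the center.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
filtered = median_filter(noisy, size=3)
```

In this toy patch the isolated bright pixel is replaced by the value of its neighbors, which is the behavior that makes MF suitable for impulse noise.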

Feature Extraction Using Optimal ECN Model
In this study, the ARAC-AOADL technique uses AOA with the ECN model to produce the feature vector. In the presented technique, the split pixel set of the images can be labelled as a collection of nerve cells corresponding to the capsule [18]. Consider Y_i ∈ [healthy, tumor] as the i-th output capsule and w_ij as the weight matrix. The detection vector ŷ_(ij) describes the output of the j-th parent capsule as predicted by the i-th capsule, and the pixel range is applied to evaluate the weight quantity. The weight is increased if the value decreases or the pixel belongs to the positive group. A softmax over the logits b_ij gives the coupling coefficient c_ij between a capsule in the preceding layer and each potential parent capsule, where b_ij is the log prior probability that the i-th capsule in the preceding layer routes to the j-th capsule in the succeeding layer. The "routing-by-agreement" methodology is executed on these logits for the capsules in all the layers.

The outputs of the previous layer are the key elements for computing the input of the j-th parent capsule. The length of the resulting vector is compressed into (0, 1) by a non-linear function called squashing, which uses a small constant ε = 10^−7. The capsules of the subsequent layer are obtained from the squashed vectors. The entire capsule classifier is trained with a margin loss (Loss_k) for each class capsule k of the capsule network.

For the hyperparameter selection process, the AOA is exploited. AOA mainly replicates the use of arithmetic operators in solving arithmetic problems. Arithmetic is the part of mathematics that deals with numbers and the properties of operations on them. An arithmetic operator is a symbol that performs one of the four fundamental arithmetic operations. During optimization, these operators are utilized for selecting the best solution from the candidates.
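The capsule computations described earlier in this section (softmax coupling, weighted input to the parent capsule, squashing, and routing-by-agreement) can be sketched as below. This follows the standard dynamic-routing formulation and is an assumption about the ECN rather than its confirmed implementation; the function names, the three routing iterations, the placement of ε inside the square root, and the toy prediction vectors are all hypothetical.

```python
import math

EPS = 1e-7  # small constant from the text; its exact placement is an assumption


def squash(s):
    """Shrink vector s so its length lies in (0, 1) while keeping its direction."""
    sq_norm = sum(x * x for x in s)
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm + EPS)
    return [scale * x for x in s]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(b - m) for b in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(y_hat, iterations=3):
    """Routing-by-agreement.

    y_hat[i][j] is the prediction vector of child capsule i for parent capsule j.
    Returns one squashed output vector per parent capsule.
    """
    n_child, n_parent = len(y_hat), len(y_hat[0])
    dim = len(y_hat[0][0])
    b = [[0.0] * n_parent for _ in range(n_child)]  # routing logits b_ij
    v = []
    for _ in range(iterations):
        c = [softmax(row) for row in b]  # coupling coefficients c_ij
        v = []
        for j in range(n_parent):
            # Weighted sum of the predictions forms the input of parent j.
            s_j = [
                sum(c[i][j] * y_hat[i][j][d] for i in range(n_child))
                for d in range(dim)
            ]
            v.append(squash(s_j))
        for i in range(n_child):
            for j in range(n_parent):
                # Agreement (dot product) raises the logit for that route.
                b[i][j] += sum(a * w for a, w in zip(y_hat[i][j], v[j]))
    return v


# Hypothetical predictions: two child capsules, two parents (healthy, tumor).
y_hat = [
    [[1.0, 0.0], [0.1, 0.0]],
    [[0.9, 0.1], [0.0, 0.1]],
]
outputs = route(y_hat)
lengths = [math.sqrt(sum(x * x for x in vec)) for vec in outputs]
```

The length of each output vector can be read as the confidence for that class, which is what a margin loss on the class capsules would then penalize or reward.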
The optimization technique is based on two main processes: exploration and exploitation. First, the search space of candidate solutions is covered extensively so that the method can break out of search stagnation. Then, the region around promising solutions is searched more deeply.
The set of candidate solutions is generated randomly in the initial stage of the AOA.
Before updating the solutions, the AOA selects the search phase in accordance with the value of the Math Optimizer Accelerated (MOA) function, which is calculated by the following equation:

$$\mathrm{MOA}(C\_Iter) = Min + C\_Iter \times \left(\frac{Max - Min}{M\_Iter}\right) \quad (9)$$

In Equation (9), MOA(C_Iter) indicates the function value in the C_Iter-th iteration; C_Iter represents the present iteration; M_Iter denotes the maximal iteration count; Min and Max are the minimal and maximal values of the accelerated function.
The exploration phase in the AOA method is realized mainly by the Division (D) and Multiplication (M) operators [19]. In mathematical computation, these two operators produce widely dispersed values, which gives a wide-ranging coverage of candidate solutions. The location of a candidate solution is therefore changed considerably in the exploration procedure. In the update rule, χ_i,j(C_Iter + 1) denotes the j-th location of the i-th solution in the (C_Iter + 1)-th iteration; ε indicates a small value; UB and LB indicate the upper and lower bounds of the location of the candidate solution; µ is a control parameter for regulating the exploration stage, set to 0.5; and MOP signifies the math optimizer probability of the AOA, in which α defines the accuracy of exploitation over the iterations, α = 5. The exploitation phase depends mainly on the Subtraction (S) and Addition (A) operators, which produce low dispersion, so the candidate solution undergoes a deep search with a larger probability of approaching the optimal solution [20]. In the exploitation step, the candidate solution is updated with these two operators. The AOA method supports adaptive transitions between the exploration and exploitation stages, which retain the best solution while keeping a diversity of possible solutions for a wide-ranging search, as illustrated in Algorithm 1.
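The exploration and exploitation updates described above can be sketched as follows, assuming the update rules of the original AOA formulation. The function names, the MOA range (Min = 0.2, Max = 1), the population size, the random switches between operators, and the toy objective are assumptions for illustration; in the paper's setting the objective would instead score a set of ECN hyperparameters.

```python
import random


def moa(c_iter, m_iter, mn=0.2, mx=1.0):
    """Math Optimizer Accelerated value; the Min/Max defaults are assumptions."""
    return mn + c_iter * (mx - mn) / m_iter


def mop(c_iter, m_iter, alpha=5):
    """Math Optimizer Probability, with alpha = 5 as stated in the text."""
    return 1.0 - (c_iter ** (1.0 / alpha)) / (m_iter ** (1.0 / alpha))


def aoa_minimize(objective, lb, ub, n_solutions=20, m_iter=100,
                 mu=0.5, eps=1e-12, seed=0):
    """Minimize `objective` over the box [lb, ub] with a basic AOA loop."""
    rng = random.Random(seed)
    dim = len(lb)
    # Random initial candidate solutions inside the bounds.
    pop = [[rng.uniform(lb[j], ub[j]) for j in range(dim)]
           for _ in range(n_solutions)]
    best = min(pop, key=objective)[:]
    best_val = objective(best)
    for c_iter in range(1, m_iter + 1):
        accel, prob = moa(c_iter, m_iter), mop(c_iter, m_iter)
        for i in range(n_solutions):
            for j in range(dim):
                step = (ub[j] - lb[j]) * mu + lb[j]
                r_phase, r_op = rng.random(), rng.random()
                if r_phase > accel:
                    # Exploration: Division or Multiplication operator.
                    if r_op < 0.5:
                        x = best[j] / (prob + eps) * step
                    else:
                        x = best[j] * prob * step
                else:
                    # Exploitation: Subtraction or Addition operator.
                    if r_op < 0.5:
                        x = best[j] - prob * step
                    else:
                        x = best[j] + prob * step
                pop[i][j] = min(max(x, lb[j]), ub[j])  # keep within bounds
            val = objective(pop[i])
            if val < best_val:
                best, best_val = pop[i][:], val
    return best, best_val


def toy_objective(x):
    """Hypothetical stand-in for a validation error over two hyperparameters."""
    return (x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2


best, best_val = aoa_minimize(toy_objective, lb=[0.0, 0.0], ub=[1.0, 1.0])
```

Because MOA grows with the iteration count, early iterations favor the dispersive D and M operators and later iterations favor the finer S and A operators, which is the adaptive transition the text describes.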

RA Classification Model
For RA classification, the ARAC-AOADL technique employed the MKELM model. The standard KELM is a single-kernel model. The structure of the standard ELM is shown in Figure 2. Since each kernel function provides its own similarity measure between sample points, the efficacy of a kernel function depends considerably on the dataset at hand. The input data has a large number of features, an irregular distribution of instances caused by class imbalance, and a high-dimensional feature space. Utilizing a single kernel to process the input dataset cannot solve these problems effectively. A kernel function is regarded as a global or local kernel function depending on its rotation or translation invariance [21]. The global kernel function is better at extracting global features, and the local kernel function is better at extracting the local features of instances. In multi-kernel learning, the optimal kernel is regarded as a linear combination of a group of base kernels, and the combination coefficients and the classification parameters are learned jointly using margin maximization. The polynomial kernel and the RBF are global and local kernel functions, respectively, with good efficacy.
To balance classification efficiency and generalization capability, the MKELM is built as a linear combination of the polynomial kernel and the RBF, where λ (0 < λ < 1) denotes the weighting coefficient of the linear combination.
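A minimal sketch of such a combined kernel is given below. It assumes that λ weights the polynomial kernel and (1 − λ) weights the RBF, which the text does not specify; the function names, the polynomial offset `c`, and the RBF width `sigma` are likewise illustrative assumptions. The degree d = 2 follows the value stated for the polynomial kernel.

```python
import math


def poly_kernel(x, z, d=2, c=1.0):
    """Polynomial (global) kernel of degree d; the offset c is an assumption."""
    return (sum(a * b for a, b in zip(x, z)) + c) ** d


def rbf_kernel(x, z, sigma=1.0):
    """RBF (local) kernel; the width sigma is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def multi_kernel(x, z, lam=0.5, d=2, sigma=1.0):
    """Linear combination of the two kernels with weight 0 < lam < 1."""
    return lam * poly_kernel(x, z, d=d) + (1.0 - lam) * rbf_kernel(x, z, sigma=sigma)


# Hypothetical feature vectors.
a, b = [1.0, 0.0], [0.5, 0.5]
k_aa = multi_kernel(a, a)
k_ab = multi_kernel(a, b)
```

A convex combination of two valid kernels is itself a valid kernel, so the combined function can be used wherever a single kernel would be.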
From the expression, d is fixed to two, as the dimension of the polynomial space is n^d; once the sample size reaches the thousands and the index is three, the dimension reaches 1 billion, and the computation of the inner product generates a dimension disaster [22].
Eventually, the resulting objective of MKELM was defined by the following expression:
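For orientation, the standard kernel ELM has a closed-form output function, shown here under the assumption that the MKELM follows it with the combined kernel K. The symbols C (regularization constant), Ω (kernel matrix over the N training samples), and T (target matrix) come from the standard KELM formulation and are not defined in the text above.

```latex
f(x) = \begin{bmatrix} K(x, x_1) & \cdots & K(x, x_N) \end{bmatrix}
       \left( \frac{I}{C} + \Omega \right)^{-1} T,
\qquad \Omega_{pq} = K(x_p, x_q)
```

Under this form, training reduces to solving one regularized linear system, which is what makes ELM-type classifiers fast to fit.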

Experimental Validation
The proposed model is simulated using Python 3.6.5 on a PC with an i5-8600k, GeForce 1050Ti 4GB, 16GB RAM, 250GB SSD, and 1TB HDD. The parameter settings are given as follows: learning rate: 0.01, dropout: 0.5, batch size: 5, epoch count: 50, and activation: ReLU. The experimental validation of the ARAC-AOADL model is performed on a benchmark dataset from the Kaggle repository [23]. The dataset holds 310 samples with three classes, as given in Table 1.

Figure 3 shows the confusion matrices of the ARAC-AOADL model on various training (TR) and testing (TS) data. The figure highlights that the ARAC-AOADL model has reached effective RA classification results.

Table 2 offers the detailed classifier results of the ARAC-AOADL model on 80% of TR data and 20% of TS data. On 80% of TR data, the ARAC-AOADL model has recognized Hernia class samples with accuracy of 96.77%, precision of 93.88%, recall of 90.20%, F-score of 92%, and AUC score of 94.34%. In addition, the ARAC-AOADL model has categorized Normal class samples with accuracy of 97.58%, precision of 96.15%, recall of 96.15%, F-score of 96.15%, and AUC score of 97.19%. Moreover, the ARAC-AOADL model has reached average accuracy of 97.04%, precision of 95.30%, recall of 94.61%, F-score of 94.94%, and AUC score of 96.11%.

On 20% of TS data, the ARAC-AOADL method has recognized Hernia class samples with accuracy of 98.39%, precision of 90%, recall of 100.00%, F-score of 94.74%, and AUC score of 99.06%. Furthermore, the ARAC-AOADL technique has categorized Normal class samples with accuracy of 98.39%, precision of 100%, recall of 95.45%, F-score of 97.67%, and AUC score of 97.73%. In addition, the ARAC-AOADL approach has reached average accuracy of 97.85%, precision of 95.59%, recall of 97.41%, F-score of 96.40%, and AUC score of 97.85%.

Table 3 shows the overall RA classification results on 70% of TR data and 30% of TS data. Figure 4 demonstrates the RA classification outcomes of the ARAC-AOADL technique on 70% of TR data. The ARAC-AOADL system has recognized Hernia class samples with accuracy of 98.62%, precision of 97.67%, recall of 95.45%, F-score of 96.55%, and AUC score of 97.44%. Furthermore, the ARAC-AOADL method has categorized Normal class samples with accuracy of 94.47%, precision of 89.19%, recall of 94.29%, F-score of 91.67%, and AUC score of 94.42%. Besides, the ARAC-AOADL technique has obtained average accuracy of 96.01%, precision of 94.29%, recall of 94.31%, F-score of 94.27%, and AUC score of 95.57%.

The training accuracy (TRAC) and validation accuracy (VAAC) of the ARAC-AOADL model are given in Figure 4. The results show that the ARAC-AOADL model has reached maximum TRAC and VAAC values, and that the VAAC is higher than the TRAC.
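The per-class scores reported in this section are one-vs-rest measures that can be derived from a confusion matrix, as sketched below. The function name is hypothetical, and the 3 × 3 matrix is not taken from Figure 3: it is an illustrative matrix over 62 samples (20% of 310) chosen so that the Hernia and Normal rows reproduce the 20% TS scores quoted above, with the third class left unnamed.

```python
def one_vs_rest_metrics(confusion, k):
    """Return (accuracy, precision, recall, F-score) for class index k.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp
    fp = sum(confusion[i][k] for i in range(n)) - tp
    tn = total - tp - fn - fp
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return accuracy, precision, recall, f_score


# Illustrative matrix (assumption); rows and columns: Hernia, third class, Normal.
confusion = [
    [9, 0, 0],
    [1, 30, 0],
    [0, 1, 21],
]
hernia = [round(100 * m, 2) for m in one_vs_rest_metrics(confusion, 0)]
normal = [round(100 * m, 2) for m in one_vs_rest_metrics(confusion, 2)]
```

Averaging each measure over the three classes gives the macro-averaged figures of the kind reported as "average" in the tables.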
In Figure 5, a clear training loss (TRAL) and validation loss (VALL) of the ARAC-AOADL model is reported. The figure reported that the ARAC-AOADL model has reached minimal values of TRAL and VALL.
To highlight the enhanced performance of the ARAC-AOADL model, a comparison study is made in Table 4 and Figure 6. An extensive comparison study of the presented ARAC-AOADL model with existing ML models in terms of accuracy and F-score is provided in Figure 6. The experimental values show that the ARAC-AOADL model achieves effective classification performance. For instance, based on accuracy, the ARAC-AOADL model has offered a higher accuracy of 98.57%, whereas the Bagging, Adaboost, DT, and Subspace-k-NN random models have attained lower accuracy of 94.89%, 89.37%, 92.64%, and 97.50%, respectively. Furthermore, based on F-score, the ARAC-AOADL technique has provided a maximum F-score of 97.67%, while the Bagging, Adaboost, DT, and Subspace-k-NN random techniques have accomplished lower F-scores of 95.23%, 89.21%, 94.68%, and 96.82%, respectively.
A comprehensive analysis of the proposed ARAC-AOADL methodology with current ML models with respect to precision and recall is given in Figure 7. The experimental values demonstrate that the ARAC-AOADL approach achieves effective classification performance. For example, based on precision, the ARAC-AOADL technique has provided a maximum precision of 98.22%, while the Bagging, Adaboost, DT, and Subspace-k-NN random models have accomplished lower precision of 94.89%, 89.37%, 92.64%, and 97.50%, respectively. Furthermore, based on recall, the ARAC-AOADL technique has given a maximum recall of 97.67%, while the Bagging, Adaboost, DT, and Subspace-k-NN random approaches have accomplished lower recall of 95.40%, 90.01%, 94.73%, and 97.02%, respectively.

After examining the detailed results, the proposed model has gained enhanced performance with maximum accuracy of 98.57%, precision of 98.22%, recall of 97.21%, and F-score of 97.67%. The enhanced performance of the proposed model is due to the inclusion of the AOA-based hyperparameter tuning process. Since trial-and-error hyperparameter selection is not an effective process, the AOA-based optimal hyperparameter tuning helps to accomplish enhanced RA classification performance. Therefore, the proposed model can be employed for precise RA classification, which enables detection of health disorders based on RA and orthopaedics.

Conclusions
In this study, we have developed a new ARAC-AOADL technique for accurate RA classification, which helps to identify health disorders depending upon RA and orthopaedics. Primarily, the presented ARAC-AOADL technique pre-processes the input images by the MF technique. Then, the ARAC-AOADL technique uses AOA with the ECN model to produce feature vectors. For RA classification, the ARAC-AOADL technique employed the MKELM model. The experimental result analysis of the ARAC-AOADL technique is tested on two medical datasets and the results are inspected under several aspects. The simulation results ensured the enhancements of the ARAC-AOADL technique in terms of different measures. In future, we can extend the ARAC-AOADL technique by hybrid DL classification models.

Data Availability Statement: Data sharing not applicable to this article as no datasets were generated during the current study.