A Robust Algorithm for Classification and Diagnosis of Brain Disease Using Local Linear Approximation and Generalized Autoregressive Conditional Heteroscedasticity Model

Detecting tumor regions has an influence on the better treatment of brain tumors. With existing algorithms, tumors are difficult to diagnose reliably at an early stage. In this paper, we introduce a new robust algorithm using three methods for the classification of brain disease. The first method is Wavelet-Generalized Autoregressive Conditional Heteroscedasticity-K-Nearest Neighbor (W-GARCH-KNN). The Two-Dimensional Discrete Wavelet Transform (2D-DWT) is applied to the input images. The sub-band wavelet coefficients are modeled using the GARCH model. The features of the GARCH model are taken as the main feature vector. The second method is the Developed Wavelet-GARCH-KNN (D-WGK), which solves the incompatibility of the WGK method with the low-pass sub-band. The third method is the Wavelet Local Linear Approximation (LLA)-KNN, in which LLA is used for modeling the wavelet sub-bands. The extracted features were applied separately to distinguish normal images from brain tumors based on classification methods. Classification was also performed for the diagnosis of tumor types. The empirical results showed that the proposed algorithm obtained a high classification rate and performed better than recently introduced algorithms while requiring a smaller number of classification features. According to the results, the Low-Low sub-bands are not compatible with the GARCH model; this limitation is overcome with the use of homomorphic filtering. The results showed that the presented Local Linear Approximation (LLA) method was better than the GARCH model for modeling wavelet sub-bands.


Introduction
Electromagnetic imaging techniques provide valuable information about the human body. One of these methods is Magnetic Resonance Imaging (MRI) of the brain [1]. A major and growing area of research in medical engineering involves machine-based diagnostic tools that allow quicker and easier inference and can be a great help to physicians in clinical medicine. Therefore, in recent years, mathematical methods have attracted much attention for the analysis of neural network data [2]. Brain images are an interesting subject for the application of mathematical methods to the diagnosis of brain disorders [3]. MRI can be used to examine the status of the brain tissue and to discover whether or not a disease is present [4]. In MRI imaging, the patient is exposed to a strong with brain tumors and 18 patients with Mild Cognitive Impairment (MCI); eight remained stable in a three-year follow-up, and 15 were healthy individuals. The classification was also improved by limiting the analysis to the left-brain hemisphere. Devanand, et al. [23], using morphometric mapping of MRI, evaluated the local changes of the hippocampus and entorhinal cortex in predicting the conversion from MCI to AD. In the MCI model, Cox regression models for the time to conversion were built for converters to AD (n = 31) and 99 non-converted controls, controlling for age, sex, and education. In Zöllner, et al. [24], the performance of feature reduction methods such as the Pearson correlation coefficient, principal component analysis, and independent component analysis in the classification of glioma was analyzed using a support vector machine classifier.
Afshar, et al. [25] studied classification using CapsNets for the detection of brain tumors in order to present a developed architecture with higher accuracy. Their findings indicated that the presented method could outperform Convolutional Neural Networks (CNNs). Mohan and Subashini [26] provided a clinical study of brain tumor imaging related to gliomas, covering the related segmentation and classification methods. Huang, et al. [27] proposed an algorithm based on the rough set method, presenting a hybrid method with the use of FCM. Initially, the feature table was set based on the FCM clustering values. Then, the relationship among features provided similarity criteria in each cluster.
In this paper, we present three algorithms, named WGK, D-WGK, and WLK. The first method is Wavelet-GARCH-KNN (WGK). In this method, we first used a two-stage 2D-DWT to decompose the input images into wavelet sub-bands. The resulting wavelet coefficients were the features for classification. Then, the GARCH model was used for feature extraction on the first-stage sub-bands HH1, HL1, and LH1 and the second-stage sub-bands HH2, HL2, and LH2. Because of the incompatibility of the Low-Low (LL) sub-band with the GARCH model, this sub-band was ignored [28]. To reduce the number of features, the PCA and PCA + LDA methods were then used, and the brain lesions were classified from the extracted features via the KNN method. The results are presented in the results section. The second method is named Developed Wavelet-GARCH-KNN (D-WGK). In this method, we overcame the limitation of the WGK algorithm by using homomorphic filtering before the wavelet transformation, so that the LL2 sub-band could participate in the GARCH model. Then, similarly to the WGK method, the KNN method was used for the classification of brain tumors. The third method is Wavelet-LLA-KNN (WLK). In this method, all sub-bands of the wavelet decomposition were modeled with the LLA algorithm. The remaining part of the third method is similar to the WGK and D-WGK methods.

Image Processing
Modern technology allows digital images to be analyzed and stored [29]. To get better results, it is sometimes necessary to make changes to these images. These changes have three main purposes: processing, analysis, and image perception. For this reason, computer image processing systems have been developed to perform these operations with better speed and accuracy. In these systems, four major processes occur: pre-processing, image quality enhancement, image transformation, and classification and segmentation. In these methods, mathematical rules are implemented on computers to simulate elements of human vision, and this aspect of image analysis is used for specific purposes. Computer Vision is the analysis of images in various scientific branches such as medicine, engineering, molecular imaging, astronautics, security, etc. Modern digital technology has made it possible to manipulate multidimensional signals with systems ranging from simple digital circuits to multiple parallel computers [30,31].
where $f(x)$ is the input signal as a vector, and $\varphi_{j_0,k}(x)$ and $\psi_{j,k}(x)$ are the scaling function and wavelet function, respectively. Here $x = 0, 1, \ldots, M-1$, $j = 0, 1, \ldots, J-1$, and $k = 0, 1, 2, \ldots, M-1$, where $M = 2^J$ is the number of samples to be transformed, $J$ is the number of transformation levels, and $j_0$ is an arbitrary starting scale. The expansion is a series of crisp numbers; it is also called the discrete wavelet transform of $f(x)$. The discrete function $f(x)$ can be written as a weighted summation of the wavelets $\psi_{j,k}(x)$ and the scaling functions $\varphi_{j_0,k}(x)$, as shown in Equation (1). In this equation, $W_\varphi(j_0, k)$ and $W_\psi(j, k)$ are the approximation coefficients and detail coefficients, respectively. The expansion coefficients are given by Equations (2) and (3).
Figure 1 shows a two-step wavelet transformation that generates four sub-bands, where $\psi^H$, $\psi^V$, and $\psi^D$ indicate variations along the horizontal, vertical, and diagonal edges, respectively. In this diagram, 2↓ denotes downsampling by a factor of two. 2D-DWT can be implemented with digital filters and downsamplers. The other sub-bands are generated with discrete 2D scaling functions, with the use of the 1D-FWT on $f(x, y)$ [32]. For the computation of the DWT coefficients, we should consider the multiresolution refinement equations, Equations (6) and (7), where $h_\varphi$ and $h_\psi$ are the scaling vector and wavelet vector, respectively. $h_\varphi$ and $h_\psi$ may be considered as weights for the summations of Equations (6) and (7). With the inclusion of Equations (6) and (7) into Equations (2) and (3), the following equations result.
The scaling and wavelet coefficients at a certain scale $j$ may be obtained via the convolution of the scaling coefficients of the next scale $j + 1$ (with finer detail) with the order-reversed scaling and wavelet vectors $h_\varphi(-n)$ and $h_\psi(-n)$. Based on Figure 1, the results of the first level of transformation for the columns of an input image are as follows: Generally, one 2D scaling function $\varphi(x, y)$ and three 2D wavelets $\psi^H(x, y)$, $\psi^V(x, y)$, and $\psi^D(x, y)$ are required, each generated from a 1D scaling function $\varphi$ and its related wavelet $\psi$ [20].
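The filter-and-downsample scheme described above can be sketched for one decomposition level as follows. This is a minimal illustration that assumes the Haar wavelet for brevity; the wavelet family actually used in the experiments is not stated here, and the function names `haar_1d` and `haar_2d` are hypothetical. Sub-bands are named by the filter applied along rows first and columns second, which is one of several conventions in use.

```python
import math


def haar_1d(signal):
    """One analysis step: low-pass and high-pass filtering, then downsampling by 2."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def transpose(matrix):
    return [list(row) for row in zip(*matrix)]


def split_columns(matrix):
    """Apply the 1D step to every column and return (low, high) halves."""
    low_cols, high_cols = [], []
    for column in transpose(matrix):
        a, d = haar_1d(column)
        low_cols.append(a)
        high_cols.append(d)
    return transpose(low_cols), transpose(high_cols)


def haar_2d(image):
    """One level of a separable 2D-DWT: rows first, then columns."""
    low_rows, high_rows = [], []
    for row in image:
        a, d = haar_1d(row)
        low_rows.append(a)
        high_rows.append(d)
    ll, lh = split_columns(low_rows)    # low-pass along rows
    hl, hh = split_columns(high_rows)   # high-pass along rows
    return ll, lh, hl, hh


if __name__ == "__main__":
    image = [[3.0] * 4 for _ in range(4)]
    ll, lh, hl, hh = haar_2d(image)
    print(ll)  # a constant image puts all of its energy in the LL sub-band
```

A second decomposition level is obtained by calling `haar_2d` again on the returned `ll` sub-band.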

Generalized Autoregressive Conditional Heteroscedasticity
Bollerslev was the first researcher to develop the GARCH method [33]. Heteroscedasticity can be considered as a variance that varies over time, for example, a volatility. Conditional implies an immediate dependence on past observations, and autoregressive describes a feedback mechanism that incorporates past data into the present. GARCH models are statistical methods that are most common in economics. Engle [34] presented the Autoregressive Conditional Heteroscedasticity (ARCH) process, in which the conditional variance changes over time as a function of past errors while the unconditional variance remains constant. The GARCH process (Algorithm 1) is a general form of ARCH and is a time series modeling technique that uses past variances to predict future variances.
Algorithm 1. GARCH
1: Input: $y_t$, P, Q, dist
2: Output: $a_i$, $\varepsilon_t$
3: Step 1: Estimate AR(q):
4: $y_t = a_0 + a_1 y_{t-1} + \cdots + a_q y_{t-q} + \varepsilon_t$
5: Step 2: Compute and plot the autocorrelations of $\varepsilon^2$ by:
Step 3: null hypothesis states that there are no ARCH or GARCH errors
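To make the idea of predicting future variances from past ones concrete, the following sketch evaluates the standard GARCH(1, 1) variance recursion and the sample autocorrelation of squared residuals mentioned in Step 2. It is an illustration only: the parameter names `omega`, `alpha`, and `beta` follow the usual textbook notation rather than this paper, the parameters are taken as given instead of being estimated, and initializing with the sample variance is an assumed choice.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variances: s2[t] = omega + alpha * e[t-1]**2 + beta * s2[t-1]."""
    n = len(residuals)
    # Assumed initialisation: the sample variance of the (zero-mean) residuals.
    sigma2 = [sum(e * e for e in residuals) / n]
    for t in range(1, n):
        sigma2.append(omega + alpha * residuals[t - 1] ** 2 + beta * sigma2[t - 1])
    return sigma2


def squared_autocorrelation(residuals, lag):
    """Sample autocorrelation of the squared residuals at the given lag."""
    squared = [e * e for e in residuals]
    mean = sum(squared) / len(squared)
    denominator = sum((v - mean) ** 2 for v in squared)
    numerator = sum(
        (squared[t] - mean) * (squared[t - lag] - mean)
        for t in range(lag, len(squared))
    )
    return numerator / denominator


if __name__ == "__main__":
    eps = [1.0, -2.0, 0.5, 1.5]
    print(garch11_variance(eps, omega=0.1, alpha=0.2, beta=0.7))
    print(squared_autocorrelation(eps, lag=1))
```

Large autocorrelations of the squared residuals at nonzero lags would suggest that the variance is not constant over time, which is the situation the GARCH model is designed for.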

Local Linear Approximation
The Local Linear Approximation (LLA) is calculated following [35]. In this method, the first and second derivatives are estimated so as to fit a function to the observed data.
Let x have three values x(1), x(2), and x(3). An LLA for the first derivative of x at x(2) is calculated as the mean of the two slopes between x(1) and x(2) and between x(2) and x(3). The derivatives can then be calculated from the matrix $x^{(3)}$, whose rows hold three successive measures, and stored in a matrix y of the same order as $x^{(3)}$, where the kth row of y is given by Equation (20). The first column of y is the value of x at the moment of measurement indexed in the second column of $x^{(3)}$, and the second and third columns of y are the approximated first and second derivatives, respectively, at that same moment of measurement. In this case, τ = 1 since x(1), x(2), and x(3) are successive measures, and ∆t is the time interval between the measures. Other combinations (for instance x(1), x(3), and x(5)) can be calculated by substituting τ = 2 into Equation (20).
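The description above can be sketched in a few lines. The first derivative is the mean of the two neighboring slopes, which simplifies to a central difference; the second derivative below is the difference of those two slopes divided by the step, which is the usual LLA form and is assumed here rather than copied from the paper. The function name `lla_rows` is hypothetical.

```python
def lla_rows(x, tau=1, dt=1.0):
    """Return rows (value, first derivative, second derivative) for each triple
    x[k], x[k + tau], x[k + 2 * tau] of a sequence x."""
    rows = []
    for k in range(len(x) - 2 * tau):
        left, middle, right = x[k], x[k + tau], x[k + 2 * tau]
        step = tau * dt
        slope_left = (middle - left) / step
        slope_right = (right - middle) / step
        first = (slope_left + slope_right) / 2.0    # mean of the two slopes
        second = (slope_right - slope_left) / step  # change of slope per unit time
        rows.append((middle, first, second))
    return rows


if __name__ == "__main__":
    samples = [float(t * t) for t in range(6)]  # x(t) = t^2, so x' = 2t and x'' = 2
    for row in lla_rows(samples, tau=1):
        print(row)
```

For a quadratic signal the estimates are exact, which makes it a convenient sanity check; with τ = 2 the same function uses every other sample, as in the x(1), x(3), x(5) case.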

K-Nearest Neighbour Algorithm
KNN is a simple form of machine learning [31,36]. In this algorithm, an object is classified according to the classes of its k (∈ N+) nearest neighbors [37]. The similarity of each neighbor to the object is used as the weight of that neighbor's class. When several of the k nearest neighbors share a category, their per-neighbor weights for that category are added together, and the resulting weighted sum is used as the likelihood score of the candidate category. A ranked list of categories is thus obtained for the test object. By thresholding these scores, binary category assignments are obtained.
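The weighted voting described above can be sketched as follows. This is a minimal illustration, not the classifier configuration used in the experiments: the similarity weight is assumed to be the inverse Euclidean distance, and the names `knn_scores` and `knn_classify` as well as the toy feature vectors are hypothetical.

```python
import math
from collections import defaultdict


def knn_scores(train, query, k):
    """train is a list of (feature_vector, label) pairs.
    Returns a dict mapping each label among the k nearest neighbors
    to the sum of its neighbors' similarity weights."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    scores = defaultdict(float)
    for features, label in nearest:
        # Assumed similarity: inverse distance; the small constant avoids division by zero.
        scores[label] += 1.0 / (math.dist(features, query) + 1e-9)
    return dict(scores)


def knn_classify(train, query, k=1):
    """Pick the category with the highest summed weight."""
    scores = knn_scores(train, query, k)
    return max(scores, key=scores.get)


if __name__ == "__main__":
    train = [
        ((0.0, 0.0), "normal"),
        ((0.0, 1.0), "normal"),
        ((5.0, 5.0), "tumor"),
        ((6.0, 5.0), "tumor"),
    ]
    print(knn_classify(train, (0.2, 0.1), k=3))
    print(knn_classify(train, (5.4, 5.0), k=1))
```

Returning the full score dictionary from `knn_scores` keeps the ranked list of candidate categories available, so a threshold can be applied to the scores instead of taking only the maximum.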

Proposed Method
In this paper, we aim to use mathematical methods to diagnose brain diseases. We implemented three methods for the classification and diagnosis of brain tumors (Algorithm 2). The first method is Wavelet-GARCH-KNN (WGK). In this method, we first used a two-stage 2D-DWT to decompose the input images into wavelet sub-bands. The obtained wavelet coefficients are the features for classification. Then, the GARCH model was used for feature extraction on the first-stage sub-bands HH1, HL1, and LH1 and the second-stage sub-bands HH2, HL2, and LH2. Because of the incompatibility of the LL sub-band with the GARCH model, this sub-band was ignored. To reduce the number of features, the PCA and PCA + LDA methods were then used, and the brain lesions were classified from the extracted features with the KNN method. The results are presented in the results section.
The second method is named Developed Wavelet-GARCH-KNN (D-WGK). In this method, we overcame the limitation of the WGK algorithm by using homomorphic filtering before the wavelet transformation, so that the LL2 sub-band could participate in the GARCH model. Then, similarly to the WGK method, the KNN method was used for the classification of brain tumors.
The third method is Wavelet-LLA-KNN (WLK). In this method, all sub-bands of the wavelet decomposition were modeled with the LLA algorithm. The remaining part of the third method is similar to the WGK and D-WGK methods. The results of each algorithm are presented in the sections below. The structure of the proposed model in this study is shown in Figure 2.
Step 1: Wavelet decomposition for all images
5: Step 2: Calculate GARCH parameters for sub-bands of high-frequency detail of (HH1, HL1, LH1, HL2, LH2)
6: Step 3: Normalization of features
7: Step 4: Feature reduction using PCA and PCA+LDA
8: Step 5: Classification of Features using KNN
9: Case 2: D-WGK
10: Step 1: Apply homomorphic filtering for all images
11: Step 2: Wavelet decomposition for all images
12: Step 2: Calculate GARCH parameters for all sub-bands of high-frequency detail of (HH1, HL1, LH1, HL2, LH2, LL2)
13: Step 3: Normalization of features
14: Step 4: Feature reduction using PCA and PCA+LDA
15: Step

Datasets
In this paper, we used seven brain diseases to implement and test the presented methods: Alzheimer's, Alzheimer plus visual agnosia, Glioma, Huntington, Meningioma, Pick, and Sarcoma. Together with normal brain images, the dataset includes 240 MRI images from the Harvard medical school website. All images are T2-weighted MR brain images in the axial plane and have 256 × 256 pixels. These images were saved in different folders and studied separately; after feature extraction, they were aggregated into a single code folder.

Two-dimensional Discrete Wavelet Transforms (2D-DWT)
In this paper, we used 2D-DWT to separate the sub-bands of images. In this transformation, input images of 256 × 256 pixels are decomposed into first-stage sub-bands of 131 × 131 and second-stage sub-bands of 69 × 69. An example of a wavelet transformation is shown in Figure 3. Additionally, for the second method, we needed homomorphic filtering. The results of the wavelet decomposition when using homomorphic filtering (σ = 5) are shown in Figure 4. The top images show the original image without (with) the filtering, and the first and second transformations are shown on the left and right sides of the images, respectively. According to the literature, the GARCH model is not compatible with LL2 sub-bands [38]. This situation is clearly visible in Figure 3 (LL2). Because all the brain sections of the images were almost within the GARCH model in (1, 1), we did not find the model coefficient to be significant for the GARCH (1, 1) model. In this paper, we overcame this limitation and made the LL2 sub-band compatible with the GARCH (1, 1) model. To do so, we applied homomorphic filtering to the original image and then performed the 2D-DWT on it. With this method, we increased the contrast of the LL2 sub-band, as can be seen in Figure 4 (LL2).
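The sub-band sizes quoted above follow from simple arithmetic. With a non-periodic signal extension, one DWT level applied to N samples with a filter of length L yields floor((N + L − 1) / 2) coefficients per dimension. The sketch below assumes a filter length of 8, which is an inference from the reported sizes and not a value stated in the text; the function names are hypothetical.

```python
def dwt_output_length(n, filter_length):
    """Coefficients per dimension after one DWT level with signal extension."""
    return (n + filter_length - 1) // 2


def subband_sizes(n, filter_length, levels):
    """Side length of the sub-bands at each successive decomposition level."""
    sizes = []
    for _ in range(levels):
        n = dwt_output_length(n, filter_length)
        sizes.append(n)
    return sizes


if __name__ == "__main__":
    # Assumed filter length of 8 taps, chosen because it reproduces the reported sizes.
    print(subband_sizes(256, filter_length=8, levels=2))  # [131, 69]
```

With a 2-tap Haar filter the same formula would give 128 and 64, so sizes of 131 and 69 indicate a longer filter together with boundary extension.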

Feature Reduction
In this section, we used nonconvulsive status epilepticus (NCSE) to extract features and classify them. With this method, we can classify the features in two classification states: two-class and eight-class. In the two-class state, we classify the features into two classes to diagnose normal and abnormal MRI images. Using this state, we can distinguish the brain images of patients from those of healthy subjects. Moreover, using the eight-class state, we can classify the brain images into seven different disease classes in addition to normal brain images.
In this paper, we studied different methods to reduce features, consisting of:
WGK: Using GARCH without LL2 + PCA
WGK: Using GARCH without LL2 + PCA + LDA
D-WGK: Using Homomorphic filtering + GARCH with LL2 + PCA
The results are depicted in Figures 5-8. Figure 5 shows the feature reduction plots used to find the best number of features. In this figure, we used two methods, 2D-DWT and GARCH (1, 1), with and without the LL2 sub-band. The result shows that the model improves when LL2 is added to the GARCH method. Furthermore, the results of the PCA method show that we can use 14 features for the classification of images. This enhancement is also shown in Figure 6 for the two-class state. In this state, the method is improved as well, and the number of features decreases from 6 to 5. This can speed up the classification and increase its accuracy because all sub-bands of the 2D-DWT are used for classification with fewer features.
Figures 7 and 8 show the feature reduction results for the presented LLA and GARCH models using the PCA and PCA + LDA methods. In the eight-class state (Figure 7), the best number of features for the GARCH + PCA method was 20, which decreased to 10 features for GARCH + PCA + LDA. Using the presented LLA + PCA method, the number of features decreased to 7. Furthermore, for the LLA + PCA + LDA method, the best number of features is three. The results showed that the last presented method decreased the number of features to 3, which makes it well suited to feature reduction.
For the two-class state (Figure 8), this reduction is conspicuous. The separation between the classes is high, which indicates the ability of LLA + PCA + LDA to increase inter-class distances and decrease intra-class distances.
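One common way to quantify the trade-off between inter-class and intra-class distances is a Fisher-style ratio, sketched below for a single projected feature. This is an illustrative measure under stated assumptions and not a computation reported in this paper; the function name `fisher_ratio` and the sample values are hypothetical.

```python
def fisher_ratio(class_a, class_b):
    """Squared distance between class means divided by the summed class variances.
    Larger values mean better-separated classes."""
    mean_a = sum(class_a) / len(class_a)
    mean_b = sum(class_b) / len(class_b)
    var_a = sum((v - mean_a) ** 2 for v in class_a) / len(class_a)
    var_b = sum((v - mean_b) ** 2 for v in class_b) / len(class_b)
    return (mean_a - mean_b) ** 2 / (var_a + var_b)


if __name__ == "__main__":
    normal = [0.0, 1.0, 2.0]
    far_abnormal = [10.0, 11.0, 12.0]
    near_abnormal = [2.0, 3.0, 4.0]
    print(fisher_ratio(normal, far_abnormal))   # well separated
    print(fisher_ratio(normal, near_abnormal))  # overlapping
```

LDA chooses its projection to maximize a ratio of this kind, which is consistent with the clearer class separation observed when LDA is applied after PCA.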

The Classification Results
In this paper, we used the k-Nearest Neighbor (KNN) method to classify the input features. KNN is a non-parametric method used in data mining, machine learning, and pattern recognition. The KNN algorithm is one of the ten most used algorithms in machine learning and data mining projects in industry. It can be used for both classification and regression problems; however, it is most often used for classification.
The value of K in the KNN method is one of the influential parameters in classification. The mean classification accuracy was determined for different values of K, which was increased from 1 to 11 in steps of two for both states. The results are depicted in Figure 9. The results show that, for K ≤ 5, the classifier has good efficiency. Furthermore, the accuracy of the LLA method with the KNN classifier is greater than that of the GARCH method.
In statistics, the sensitivity and specificity indicators are used to evaluate the result of a binary (two-class) classification. When the data can be divided into positive and negative groups, the accuracy of a test that divides the data into these two categories can be measured and described using the sensitivity and specificity indicators. Sensitivity is the proportion of positive cases that the test correctly marks as positive. Specificity is the proportion of negative cases that the test correctly marks as negative. Counting the true positive (TP), true negative (TN), false positive (FP), and false negative (FN) cases, the sensitivity is the number of TP cases divided by the sum of TP and FN cases. In a similar way, the specificity is the number of TN cases divided by the sum of FP and TN cases.
Other classification criteria, such as precision, accuracy, and fall-out, are defined in Equations (23)-(25).
$$\text{precision} = \frac{TP}{TP + FP} \tag{23}$$
$$\text{accuracy (ACC)} = \frac{TP + TN}{TP + TN + FP + FN} \tag{24}$$
$$\text{fall-out (FPR)} = \frac{FP}{FP + TN} \tag{25}$$
The results of the classification using the LLA method are shown in Tables 1 and 2. Table 1 shows the results of the classification using the presented methods. The maximum accuracy belongs to the presented LLA method with feature extraction by the combination of the PCA and LDA methods, and the minimum belongs to the GARCH method with PCA feature extraction. Therefore, ranking the presented methods by maximum sensitivity and minimum fall-out gives PCA + LDA (WLK), PCA (WLK), PCA + LDA (WGK), and PCA (WGK) (see Figure 10). This shows that the LLA method is better than the GARCH model in terms of robustness, sensitivity, and accuracy. Moreover, the combination of PCA and LDA produces better results than PCA alone. One of the main reasons that the GARCH model has not produced a good model is the incompatibility of this method with some images. Table 2 shows the results of the classification in the eight-class state, and the results show acceptable performance for most diseases. The diagnosis of Pick and Sarcoma is somewhat less accurate than that of the others; this is because of the complex images of these diseases.
Figure 11 shows the confusion matrix of the presented model of the hybrid PCA, LDA, and LLA methods for the diagnosis of normal and abnormal images. The main diagonal of the matrix shows the number of images detected correctly. Of the 210 abnormal images, 18 (8.57%) were recognized as normal, while 192 (91.43%) were diagnosed correctly. All the normal images were detected, and 92.5% (accuracy) of all images were correctly classified, while 7.5% were incorrectly classified.
Figure 12 shows the confusion matrix of the presented method for the classification in eight classes. The lower row of the matrix shows the percentage of each disease that was detected correctly (sensitivity). The maximum detection percentage belonged to normal images, and Huntington and Meningioma came second. However, only 93.3% of Sarcomas were diagnosed correctly. In the end, 92.5% (accuracy) of all images were classified in the proper class, while 7.3% of them could not be recognized and were incorrectly classified. The red cells show the incorrect choices. In each column, the sum of the elements equals the number of images of each disease. For example, for Alzheimer's (first column), 28 images (of 30) were diagnosed correctly; however, two images were classified into the Alzheimer plus category.
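As a worked check of the two-class figures reported for Figure 11, the sketch below computes the metrics from the confusion-matrix counts. It treats abnormal images as the positive class and takes the number of normal images as 240 − 210 = 30, which is inferred from the dataset size rather than stated explicitly; the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, tn, fp, fn):
    """Standard two-class metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "fall_out": fp / (fp + tn),
    }


if __name__ == "__main__":
    # 192 of 210 abnormal images detected, 18 missed; all 30 normal images detected.
    metrics = binary_metrics(tp=192, tn=30, fp=0, fn=18)
    for name, value in metrics.items():
        print(f"{name}: {100 * value:.2f}%")
```

With these counts the sensitivity is 91.43% and the accuracy is 92.5%, matching the percentages quoted in the text, while the specificity is 100% and the fall-out is 0% because no normal image was misclassified.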

The Complexity Analysis
The proposed method combines five major components, so we analyze the complexity of each. The complexity of PCA is $O(\min(p^3, n^3))$, where $p$ is the number of features and $n$ is the number of data points (here the image size, 256 × 256) [40]. The complexity of the LDA method for feature extraction is $O(np^2)$ if $n > p$ and $O(p^3)$ otherwise. The complexity of the 2D-DWT is $O(4Mn^2 \log_2 n)$, where $M$ is the number of vanishing moments of the mother wavelet used. The complexity of the GARCH (1, 1) method depends on the autocorrelation complexity and is $O(n)$, where, in this case, $n$ is 256 × 256. The complexity of the LLA method is $O(n'n)$, where $n'$ is the order of the derivative used in the method; in this case, $n' = 2$. The complexity of the KNN method is $O(npk)$, where, in this case, $k = 1$. Therefore, the complexity of the presented method is as follows: PCA (GARCH) is $O(\min(p^3, n^3) + n)$, PCA + LDA (GARCH) is $O(\min(p^3, n^3) + np^2 + n)$, PCA (LLA) is $O(\min(p^3, n^3) + 2n)$, and PCA + LDA (LLA) is $O(\min(p^3, n^3) + np^2 + 2n)$. The LLA group is therefore somewhat more complex than the GARCH group, differing by an additional term of $n$; however, its results are remarkable, and it is compatible with all of the images.
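To make the comparison between the four configurations concrete, the following minimal sketch evaluates the dominant terms of the expressions above for a 256 × 256 image. It is only an illustration: the function names are hypothetical, constants hidden by the O-notation are ignored, and the feature count p = 20 is an assumed value applied to both groups for the sake of comparison.

```python
def pca_cost(p, n):
    """Dominant PCA term: min(p^3, n^3)."""
    return min(p ** 3, n ** 3)


def lda_cost(p, n):
    """Dominant LDA term: n * p^2 if n > p, otherwise p^3."""
    return n * p ** 2 if n > p else p ** 3


def modeling_cost(n, model, derivative_order=2):
    """GARCH(1, 1) is linear in n; LLA costs n' * n, with n' the derivative order."""
    if model == "GARCH":
        return n
    if model == "LLA":
        return derivative_order * n
    raise ValueError(f"unknown model: {model}")


def pipeline_cost(p, n, model, use_lda):
    """Sum of the dominant terms for one configuration (PCA, optional LDA, model)."""
    total = pca_cost(p, n) + modeling_cost(n, model)
    if use_lda:
        total += lda_cost(p, n)
    return total


n = 256 * 256  # number of data points, as in the text
p = 20         # assumed number of features, for illustration only

for model in ("GARCH", "LLA"):
    for use_lda in (False, True):
        label = f"PCA{' + LDA' if use_lda else ''} ({model})"
        print(label, pipeline_cost(p, n, model, use_lda))
```

Under these assumptions, the LLA configurations exceed the corresponding GARCH configurations by exactly n operations, and when LDA is included the $np^2$ term dominates both groups.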

Conclusions
In this paper, a hybrid algorithm for diagnosing brain disease from MRI images was presented. First, a two-level 2D-DWT of the input images was computed, and the wavelet sub-band coefficients were modeled using the GARCH and LLA models. We carried out five studies in this paper. After applying the 2D-DWT and separating the image into six sub-bands, we modeled the sub-bands with GARCH (1, 1) without using the Low-Low sub-band of the second wavelet level (the WGK method). Because this sub-band was incompatible with the GARCH (1, 1) method, we applied homomorphic filtering before the 2D-DWT to overcome this limitation (the D-WGK method). The results showed that, with homomorphic filtering, the LL2 sub-band, which carries the most image information, could be used in the GARCH (1, 1) method with high performance. Moreover, we used the LLA method to model the 2D-DWT sub-bands, using all of the sub-bands to extract features (the WLK method). The results showed that the LLA method reduced the number of features from 20 to 3. We then classified the images using the KNN method. The results demonstrated the high accuracy and robustness of the presented methods and showed that the WLK method was better than the WGK and D-WGK methods in terms of robustness, sensitivity, and accuracy. Furthermore, the hybrid of PCA and LDA produced better results than PCA alone. One of the main reasons the GARCH model did not produce a good model is its incompatibility with some images; we overcame this problem in D-WGK with homomorphic filtering. The results of the eight-class classification (diagnosis of disease type) showed acceptable performance for most diseases. The diagnosis of Pick and Sarcoma was somewhat less accurate than that of the others because of the complexity of the images of these diseases. Of the abnormal images, 8.57% were recognized as normal, whereas 91.43% were diagnosed correctly. The normal images were detected with 92.5% accuracy. The highest detection rate belonged to normal images, followed by Huntington and Meningioma. Nevertheless, 93.3% of sarcomas were classified correctly. Overall, 92.5% of all images were assigned to their proper class, with a 7.3% error. Future work should focus on enlarging the dataset used for the diagnosis of brain tumors. Furthermore, the method could be applied to MRI images of other diseases, such as breast cancer and prostate cancer. Deep learning methods could also be combined with this feature extraction method, which could increase processing speed and accuracy.

Funding: The funding sources had no involvement in the study design, collection, analysis or interpretation of data, writing of the manuscript or in the decision to submit the manuscript for publication.