Cucumber Leaf Diseases Recognition Using Multi Level Deep Entropy-ELM Feature Selection

Abstract: Agriculture has become an immense area of research and is regarded as a key application area of computer vision. In the agriculture field, image processing plays a primary role. Cucumber is an important vegetable, and its production in Pakistan is higher than that of other vegetables because of its use in salads. However, cucumber diseases such as angular leaf spot, anthracnose, blight, downy mildew, and powdery mildew widely decrease the quality and quantity of the crop. Lately, numerous methods have been proposed for the identification and classification of diseases. Early detection and treatment of plant diseases is important to prevent a disastrous decrease in yields. Many classification techniques have been proposed, but they still face challenges such as noise, redundant features, and the extraction of relevant features. In this work, an automated framework is proposed for cucumber leaf disease classification using deep learning and best feature selection. In the proposed framework, an augmentation technique is first applied to the original images to create more training data from existing samples and to handle the problem of the imbalanced dataset. Then two different phases are utilized. In the first phase, four pre-trained models are fine-tuned and the best of them is selected based on accuracy. Features are extracted from the selected fine-tuned model and refined through the Entropy-ELM technique. In the second phase, the features of all four fine-tuned models are fused, the Entropy-ELM technique is applied, and the result is fused with the features selected in phase 1. Finally, the fused features are passed to machine learning classifiers for the final classification. The experimental process is conducted on five different datasets. On these datasets, the best-achieved accuracy is 98.4%. The proposed framework is evaluated at each step and compared with some recent techniques; the comparison shows that the proposed method obtains improved performance.


Introduction
Agriculture is one of the most important research topics globally nowadays [1]. Agriculture is a significant source of income, and the economy of a country depends on the quality and yield of its crops [2]. Cucumber is an important vegetable; during the year 2020, the global cucumber planting area was around 2.25 million hectares and the global cucumber [...] leaf diseases under complex backgrounds. They fused DeepLabV3+ and U-Net models instead of using a single network. In the first step, DeepLabV3+ was used to segment the leaves from the images. Then the diseased area was segmented using U-Net. The fused models gave better accuracy than that reported for the individual models. The researchers in [25] introduced a model for the identification of crop diseases in real-world images. The proposed trilinear convolutional neural network utilized bilinear pooling. In the laboratory environment, the technique achieved 99.99% accuracy, and in the real-world environment, the obtained accuracy was 84.11%. Kianat et al. [7] proposed a hybrid system for the recognition of cucumber diseases. In the pre-processing step, data augmentation was applied using different angles to increase the image count in the dataset. In this step, contrast stretching was also performed to visually improve the images. The features were extracted using binary robust invariant scalable keypoints (BRISK), histogram of oriented gradients (HOG), and features from accelerated segment test (FAST). Initially, the irrelevant features were eliminated using the probability distribution-based entropy (PDbE) technique. Then the features were fused using the serial-based method, and the Manhattan distance-controlled entropy (MDcE) method was implemented to select the robust features. The proposed model achieved a maximum accuracy of 93.5%. These techniques faced the major challenge of irrelevant feature extraction, which researchers have tried to resolve through feature selection techniques [26].
Visual inspection of crops has traditionally been carried out by farmers and agriculture experts. This evaluation process is exhausting, time-consuming, and highly subjective. The development of computer vision systems to identify, recognize, and classify disease-affected crops keeps humans out of the equation, allowing for unbiased, accurate disease-infection decisions [1]. An automatic classification system consists of various steps, as mentioned above. Preprocessing is an important step whose aim is to remove noise and improve the quality of the original images, which later helps in extracting important features. The features extracted from the refined images are used for the training of deep learning models that are further employed for feature extraction and classification. The key problems considered in this work are: (i) training a deep learning model on an imbalanced dataset biases the predictions toward the classes with more samples; (ii) disease spots and background objects differ in appearance; (iii) the shape, color, texture, and origin of the disease vary; (iv) irrelevant and redundant features are extracted; and (v) the best features must be chosen for classification.
In this article, our major focus is to design an automated computerized method for cucumber leaf disease recognition using deep learning and Entropy-ELM-based best feature selection. Recent methods first identify the infected region and then use it for feature extraction; however, errors in the identification step lead to the extraction of irrelevant features, which later reduces the classification accuracy. Our major contributions are: (i) Four operations, namely horizontal flip, vertical flip, rotation by 45, and rotation by 60, are implemented for data augmentation. Later, four deep learning models are fine-tuned and trained on the augmented dataset. (ii) Deep learning features are extracted from the average pooling layer instead of the fully connected layer. The extracted deep features are passed to the Softmax classifier and the accuracies are compared. Based on the accuracy value, the fine-tuned Densenet201 model is selected for the rest of the process. Moreover, the features of all fine-tuned models are fused using a new parallel approach. (iii) An Entropy-ELM-based best feature selection technique is proposed. The proposed technique is applied to both the Densenet201 feature vector and the fused vector, which are later serially fused for the final classification. (iv) To determine which step of the proposed framework performs better, a comparison is made among all intermediate steps.
The rest of the manuscript is organized as follows: the proposed methodology, which includes augmentation of the dataset, deep learning-based feature extraction, and Entropy-ELM-based best feature selection, is presented in Section 2. Results are discussed in Section 3 with the help of tables and graphs. Finally, the conclusion of the manuscript is given in Section 4.

Proposed Methodology
In this work, an automated framework is proposed for cucumber leaf disease recognition using deep learning and Entropy-ELM-based best feature selection. The proposed framework is illustrated in Figure 1. As the figure shows, an initial augmentation step is applied to the original images to create more training data. Then two different phases are utilized. In the first phase, four pre-trained deep models are fine-tuned and the best of them is selected based on accuracy. Features are extracted from the selected fine-tuned model and refined through the Entropy-ELM technique. In the second phase, the features of all four fine-tuned models are fused, the Entropy-ELM technique is applied, and the result is fused with the features selected in phase 1. Finally, the fused features are classified using machine learning classifiers for the final output.

Dataset Collection and Augmentation
The experiments were performed on a publicly available dataset named the Cucumber leaf diseases scan dataset [27]. This dataset consists of six different diseases: anthracnose, powdery mildew, downy mildew, angular spot, mosaic, and blight. Sample images are illustrated in Figure 2. Each class originally has 100 to 150 images, which is not enough to train a deep learning model. Therefore, we designed a simple algorithm (Algorithm 1) for data augmentation that includes four operations: horizontal flip, vertical flip, rotate 45, and rotate 60. This algorithm is applied to each cucumber disease class and increases the number of images to 2000 in each class. In the later steps, this augmented dataset is utilized for the training of deep models.

Deep Learning Architecture
Four pre-trained deep learning models are employed in this work for feature extraction: VGG16, ResNet50, ResNet101, and DenseNet201. As shown in Figure 1, all selected models are first fine-tuned and then trained through transfer learning using the augmented dataset. A brief description of each deep model is given below.
VGG16 [28] is a pre-trained model created by the Visual Geometry Group, a group of students and faculty focused on computer vision at Oxford University. This model is considered to be one of the best computer vision models. A distinctive feature of VGG16 is that, rather than having numerous hyper-parameters, it consistently uses convolutional layers (CL) of 3 × 3 filters with stride 1 and the same padding, together with max-pooling layers (MPL) of 2 × 2 filters with stride 2. VGG16 follows this arrangement of convolutional and max-pooling layers throughout the entire architecture. At the end, VGG16 has three fully connected layers followed by a Softmax for output. Because VGG16 has 16 layers with weights, it has the name VGG16. This model was originally trained on the ImageNet dataset with 1000 object classes. The prediction of this model is made by the Softmax layer, defined as $\sigma(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{C} e^{z_j}}$ for $i = 1, \dots, C$, where $z$ is the vector of class scores and $C$ is the number of classes.
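As an illustration of the Softmax prediction step described above, the following minimal Python sketch converts a vector of raw class scores into probabilities. The helper name `softmax` and the score values are hypothetical and are not taken from the trained model.

```python
import math

def softmax(scores):
    """Map raw class scores to probabilities that sum to 1."""
    # Subtracting the maximum score improves numerical stability
    # and does not change the resulting probabilities.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for six disease classes (illustrative values only).
scores = [2.0, 1.0, 0.1, -1.0, 0.5, 0.0]
probs = softmax(scores)
predicted_class = probs.index(max(probs))  # index of the most probable class
```

The class with the largest score always receives the largest probability, so the predicted label is simply the index of the maximum.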

Algorithm 1: (Data Augmentation)
Step 1: Input Original Database
Step 2: Consider First Disease Class
Step 3: Count Images of Step 2 (Selected Disease Class)
Step 4: For i = 1 to Total Images of each Class
  - Horizontal Flip and Image Write
  - Vertical Flip and Image Write
  - Rotate 45 and Image Write
  - Rotate 60 and Image Write
Step 5: Repeat Step 2, 3, and 4 for the Rest of the Disease Classes
End
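The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration using NumPy arrays as images; it is not the authors' MATLAB implementation. The function names, the nearest-neighbour rotation, the zero fill for pixels rotated in from outside the image, and the in-memory dictionary standing in for the image database are all assumptions, and writing the images to disk is omitted.

```python
import numpy as np

def rotate_nearest(image, angle_deg):
    """Rotate an H x W (x C) array about its centre, keeping the original size.

    Uses inverse mapping with nearest-neighbour sampling; output pixels whose
    source falls outside the image are left as zeros.
    """
    h, w = image.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    ys, xs = np.mgrid[0:h, 0:w]
    # For every output pixel, locate the source pixel it is copied from.
    src_x = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    src_y = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    src_x = np.rint(src_x).astype(int)
    src_y = np.rint(src_y).astype(int)
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(image)
    out[ys[valid], xs[valid]] = image[src_y[valid], src_x[valid]]
    return out

def augment(image):
    """Return the four augmented copies listed in Step 4 of Algorithm 1."""
    return {
        "hflip": np.fliplr(image),
        "vflip": np.flipud(image),
        "rot45": rotate_nearest(image, 45),
        "rot60": rotate_nearest(image, 60),
    }

def augment_dataset(dataset):
    """dataset maps a class name to a list of images (Steps 2 to 5)."""
    augmented = {}
    for class_name, images in dataset.items():
        out = []
        for img in images:
            out.append(img)                      # keep the original image
            out.extend(augment(img).values())    # add its four variants
        augmented[class_name] = out
    return augmented
```

Each pass of this sketch yields five images per original (the original plus four variants).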

ResNet [29], also known as a deep residual network, has proved to perform with great accuracy and efficiency in deep architectures by creating a more direct pathway for the transmission of data through the network. In such deep systems, the degradation problem arises as the number of network layers increases: the accuracy saturates and then drops quickly. Backpropagation does not encounter the vanishing gradient problem when working with ResNet. A residual network has "shortcut connections" that run in parallel to the regular convolutional layers, which helps the network learn global features. Through the shortcut connection, the input x is added to the output of the weight layers that it skips. These shortcut connections allow the network to bypass layers that are not beneficial during training. Hence, the effective number of layers is adapted, which results in rapid training.
Mathematically, the output H(x) can be expressed as $H(x) = F(x) + x$. A residual mapping is used to train the weight layers, which is expressed as $F(x) = H(x) - x$. The function F(x) signifies the stacked nonlinear weight layers. ResNet50 begins with a 7 × 7 convolutional layer with 64 kernels. It also includes 16 residual blocks and has about 23 million trainable parameters.
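The relation between H(x) and the residual mapping F(x) can be illustrated with a small NumPy sketch. The two dense weight matrices stand in for the stacked nonlinear weight layers; real ResNet blocks use convolutions, batch normalization, and a final activation, all of which are omitted here. The names `residual_block`, `w1`, and `w2` are hypothetical.

```python
import numpy as np

def relu(z):
    """Element-wise rectified linear unit."""
    return np.maximum(z, 0.0)

def residual_block(x, w1, w2):
    """Compute H(x) = F(x) + x, where F is two stacked weight layers."""
    f = relu(x @ w1) @ w2   # F(x): the residual mapping learned by the weights
    return f + x            # shortcut connection adds the input back

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))          # 4 samples, 8 features (toy sizes)
w1 = rng.standard_normal((8, 8)) * 0.1
w2 = rng.standard_normal((8, 8)) * 0.1

h = residual_block(x, w1, w2)
# With zero weights F(x) vanishes and the block reduces to the identity.
h_identity = residual_block(x, np.zeros((8, 8)), np.zeros((8, 8)))
```

The identity case shows why a block that is not beneficial can effectively be skipped: driving F(x) to zero leaves the input unchanged.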
The ResNet101 model utilizes residual links through which the gradients can flow directly backward, preventing them from becoming zero after the application of the chain rule. There are 104 convolutional layers altogether in ResNet101. It comprises 33 blocks of layers in total, and 29 of these blocks use the output of the previous block directly, which is the residual connection defined above. These residuals are used as the first operand of the summation operator at the end of every block to obtain the input of the following block. The remaining 4 blocks take the output of the previous block and apply it to a CL with a filter size of 1 × 1 and a stride of 1, followed by a batch normalization layer, which performs the normalization operation; the resulting output is sent to the summation operator at the output of that block. Mathematically, the working of this model is defined as follows:
Densenet-201 [30] is a convolutional neural network that is 201 layers deep. In this model, each layer receives the feature maps of all preceding layers, so the network can be thinner and more compact, resulting in fewer channels. The number of extra channels added by each layer is the growth rate k. As a result, the model has better computational and memory efficiency. The transition layers between two contiguous dense blocks consist of a 1 × 1 convolution followed by 2 × 2 average pooling. Within a dense block, feature map sizes are uniform, allowing them to be readily concatenated. A global average pooling is performed after the last dense block, and then a softmax classifier is added. The error signal can be transmitted more directly to earlier layers. As earlier layers can get direct supervision from the final classification layer, this is a form of implicit deep supervision.
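The effect of the growth rate k on the number of channels can be worked through with a few lines of Python. The block sizes (6, 12, 48, 32), the growth rate of 32, the 64 initial channels, and the halving of channels in each transition layer are the values of the standard DenseNet-201 configuration; they are not stated in this paper and are used here as assumptions.

```python
def densenet_channels(block_sizes, growth_rate, init_channels, compression=0.5):
    """Return the channel count at the output of each dense block."""
    channels = init_channels
    trace = []
    for i, num_layers in enumerate(block_sizes):
        # Every layer in a dense block appends `growth_rate` feature maps.
        channels += num_layers * growth_rate
        trace.append(channels)
        # A transition layer follows every block except the last one.
        if i < len(block_sizes) - 1:
            channels = int(channels * compression)
    return trace

trace = densenet_channels((6, 12, 48, 32), growth_rate=32, init_channels=64)
feature_length = trace[-1]  # length of the vector after global average pooling
```

Under these assumptions the trace is 256, 512, 1792, and 1920 channels, so the global average pooling layer yields a 1920-dimensional feature vector per image.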

Transfer Learning Based Feature Extraction
Transfer learning (TL) is the process of reusing a pre-trained model for a new task [31], as illustrated in Figure 3. The ImageNet dataset was used as the source dataset of the pre-trained models. Each pre-trained model is fine-tuned, and its knowledge is transferred through the TL concept. Finally, the new fine-tuned model is trained on the augmented cucumber dataset and is then utilized for feature extraction. The features are extracted from deep layers: FC7 for VGG16 and the average pooling layer for ResNet50, ResNet101, and Densenet201. Several hyperparameters are employed during the training process: a learning rate of 0.0001, a maximum of 200 epochs, a mini-batch size of 16, and the sigmoid activation function.
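To give a sense of the training budget implied by these hyperparameters, the short sketch below computes the number of mini-batch iterations. It assumes that all six classes contain exactly 2000 augmented images and that 70% of the augmented set is used for training; the dictionary keys are hypothetical names, not settings of a particular toolbox.

```python
import math

# Hyperparameters as reported in the text.
hyperparams = {
    "learning_rate": 1e-4,
    "max_epochs": 200,
    "mini_batch_size": 16,
}

# Assumed dataset size: 6 classes x 2000 augmented images, 70% for training.
num_classes = 6
images_per_class = 2000
train_percent = 70

train_images = num_classes * images_per_class * train_percent // 100
iters_per_epoch = math.ceil(train_images / hyperparams["mini_batch_size"])
total_iters = iters_per_epoch * hyperparams["max_epochs"]
```

Under these assumptions, 8400 training images give 525 iterations per epoch and 105000 iterations over 200 epochs for each fine-tuned model.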

Entropy-ELM Based Features Selection and Parallel Fusion
Feature selection is an important and active research topic [32]. The main purpose of feature selection is to increase the system accuracy and minimize the computational time by selecting only the most important features [33]. In this work, a new technique named Entropy-ELM is proposed for best feature selection. The proposed technique works in the following steps: (i) the entropy of the input vector is computed; (ii) based on the entropy value, a threshold function is employed that returns two vectors, one containing the features that fulfill the threshold value (selected) and one containing those that do not (not selected); (iii) ELM [34] is employed as a fitness function, with the features that passed the threshold as its input. The entropy formulation and the threshold function are defined in Equations (9)-(13), and the details of this selection process are given in Algorithm 2.
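The three steps above can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the authors' formulation: the per-feature entropy is taken to be the Shannon entropy of a 10-bin histogram, the threshold is assumed to be the mean entropy, and the ELM is a basic single-hidden-layer network with random tanh units and pseudo-inverse output weights. All function names and the synthetic data are hypothetical, and only a single selection pass is shown.

```python
import numpy as np

def feature_entropy(column, bins=10):
    """Shannon entropy (bits) of one feature, estimated from a histogram."""
    counts, _ = np.histogram(column, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def entropy_select(features, bins=10):
    """Steps (i)-(ii): score every feature and keep those at or above the mean."""
    scores = np.array([feature_entropy(features[:, j], bins)
                       for j in range(features.shape[1])])
    mask = scores >= scores.mean()   # assumed threshold function
    return mask, scores

def elm_fitness(x_train, y_train, x_test, y_test, hidden=50, seed=0):
    """Step (iii): train a basic ELM and return its test accuracy."""
    rng = np.random.default_rng(seed)
    n_classes = int(max(y_train.max(), y_test.max())) + 1
    w = rng.standard_normal((x_train.shape[1], hidden))  # random, never trained
    b = rng.standard_normal(hidden)
    h_train = np.tanh(x_train @ w + b)
    targets = np.eye(n_classes)[y_train]                 # one-hot labels
    beta = np.linalg.pinv(h_train) @ targets             # closed-form output weights
    pred = (np.tanh(x_test @ w + b) @ beta).argmax(axis=1)
    return float((pred == y_test).mean())

# Synthetic stand-in for deep features: 200 samples, 2 classes, 6 features.
rng = np.random.default_rng(42)
labels = np.repeat([0, 1], 100)
informative = rng.standard_normal((200, 5)) + 2.0 * (2 * labels[:, None] - 1)
constant = np.ones((200, 1))             # zero-entropy feature
features = np.hstack([informative, constant])

mask, scores = entropy_select(features)
selected = features[:, mask]

order = rng.permutation(200)
train, test = order[:140], order[140:]
accuracy = elm_fitness(selected[train], labels[train],
                       selected[test], labels[test])
```

In this toy setting the constant column has zero entropy and is discarded, while the ELM accuracy on the remaining columns plays the role of the fitness value that Algorithm 2 evaluates.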

Algorithm 2: (Entropy-ELM)
Step 1: Input Feature Vector N × K // K is the length of features
Step 2: For i = 1 to N
Step 3: Compute Entropy through Equations (9)-(12)
Step 4: Define Threshold Function as Equation (13)
Step 5: Check Fitness through ELM
Step 6: Evaluate the Accuracy
Step 7: Repeat Steps 2-6 until the accuracy reaches its highest value
End
Output: Selected Feature Vector

Finally, the parallel fusion approach is adopted to obtain the fused feature vector. This approach is based on the following three steps. In the first step, the maximum-length feature vector is determined. We have two feature vectors $X$ and $X_1$, whose sizes are $N \times K$ and $N \times K_1$, respectively, where $K \in X$ and $K_1 \in X_1$. In the second step, the entropy value is computed and padding is performed for the shorter feature vector. In the third step, the correlation is computed among the $K$ and $K_1$ features for the final fusion. The fused vector is finally utilized for classification through supervised learning classifiers.
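The three fusion steps can be sketched as follows. The exact fusion rule is not recoverable from the text, so this sketch makes two explicit assumptions: the padding constant is passed in as a parameter `fill` (the text suggests a value derived from the entropy), and the final fused vector is the element-wise maximum of the two aligned matrices, which is one common form of parallel fusion. The function names are hypothetical.

```python
import numpy as np

def pad_to_length(x, length, fill):
    """Right-pad the feature axis of an N x K matrix with a constant."""
    n, k = x.shape
    if k >= length:
        return x.astype(float)
    pad = np.full((n, length - k), fill, dtype=float)
    return np.hstack([x.astype(float), pad])

def column_correlation(a, b):
    """Pearson correlation of aligned columns; 0 where a column is constant."""
    a_c = a - a.mean(axis=0)
    b_c = b - b.mean(axis=0)
    denom = np.sqrt((a_c ** 2).sum(axis=0) * (b_c ** 2).sum(axis=0))
    corr = np.zeros(a.shape[1])
    nonzero = denom > 0
    corr[nonzero] = (a_c * b_c).sum(axis=0)[nonzero] / denom[nonzero]
    return corr

def parallel_fuse(x, x1, fill=0.0):
    """Fuse two feature matrices of sizes N x K and N x K1."""
    length = max(x.shape[1], x1.shape[1])   # step 1: maximum length
    x_p = pad_to_length(x, length, fill)    # step 2: pad the shorter matrix
    x1_p = pad_to_length(x1, length, fill)
    corr = column_correlation(x_p, x1_p)    # step 3: correlation of K and K1
    fused = np.maximum(x_p, x1_p)           # assumed fusion rule
    return fused, corr

x = np.array([[1.0, 2.0, 3.0], [2.0, 4.0, 1.0], [3.0, 6.0, 2.0]])
x1 = np.array([[2.0], [4.0], [6.0]])
fused, corr = parallel_fuse(x, x1)
```

The fused matrix keeps the length of the longer input, which is the property that distinguishes parallel fusion from serial concatenation.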

Experimental Results
The proposed framework is evaluated on the selected cucumber dataset with a split ratio of 70:15:15, meaning that 70% of the images are utilized to train the model, 15% for testing, and 15% for validation. We combined the testing and validation images and performed testing on the resulting 30%. All the experimental results are computed with K-fold cross-validation, where K is 10. Several classifiers are implemented, as listed in Table 1. The performance of each classifier is computed through several measures: recall rate, precision rate, F1-Score, accuracy, and time. All simulations of the framework are conducted in MATLAB 2021a (Simulink) on a personal desktop.
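The performance measures listed above can all be derived from a confusion matrix, as in the following plain-Python sketch. The paper does not state how per-class values are averaged, so macro averaging (the unweighted mean over classes) and an F1-Score computed from the averaged precision and recall are assumptions here; the function name and the example matrix are hypothetical.

```python
def classification_report(confusion):
    """confusion[i][j] is the number of class-i samples predicted as class j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    recalls, precisions = [], []
    for i in range(n):
        actual = sum(confusion[i])                          # row sum
        predicted = sum(confusion[r][i] for r in range(n))  # column sum
        recalls.append(confusion[i][i] / actual if actual else 0.0)
        precisions.append(confusion[i][i] / predicted if predicted else 0.0)
    recall = sum(recalls) / n          # macro-averaged recall
    precision = sum(precisions) / n    # macro-averaged precision
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": correct / total, "recall": recall,
            "precision": precision, "f1": f1}

# Hypothetical two-class confusion matrix (illustrative counts only).
report = classification_report([[8, 2], [1, 9]])
```

The diagonal of the confusion matrix holds the correct predictions, which is why the recall rate of each class can be read directly from the confusion matrices shown in the figures.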

Results
The detailed experimental process of the proposed framework is presented in this section. The results are computed using the following steps: (i) classification using the originally collected dataset on the fine-tuned pre-trained models; (ii) classification using the augmented dataset on the fine-tuned deep models and selection of the best deep model for further processing; (iii) refinement of the best deep model features using the new Entropy-ELM technique; (iv) fusion of the fine-tuned deep model features (augmented dataset); and (v) fusion of the features of both steps using a parallel approach.

Results on Original Cucumber Dataset
The results of the proposed method on the original cucumber dataset are given in Table 2. In this table, accuracy is computed for each fine-tuned deep model using the original dataset. Fine-tuned VGG16 (F-VGG16) obtained a maximum accuracy of 56.9% on the MG SVM classifier. The fine-tuned ResNet50 and ResNet101 obtained best accuracies of 58.7% and 55.1% on Cubic SVM and Quadratic SVM, respectively. The fine-tuned Densenet201 deep model obtained an accuracy of 61.9% on Quadratic SVM. These results indicate that the originally collected dataset has several issues, such as class imbalance and insufficient training data. On these data, the fine-tuned Densenet201 gives better results than the other models for all classifiers.

Results on Augmented Cucumber Dataset
The experimental results of the fine-tuned VGG16 pre-trained model after augmentation are given in Table 3. The best-obtained accuracy is 93.8% on Cubic SVM, with recall and precision rates of 93.84% and 93.92%, respectively. The second-best accuracy is 93.6%, achieved on Quadratic SVM, with recall and precision rates of 93.66% and 93.72%, respectively. The execution time of Linear SVM is lower than that of the rest of the classifiers.
The classification accuracy of the fine-tuned ResNet50 on the augmented dataset is given in Table 4. The highest accuracy in this table is 94.6% on Cubic SVM, with recall and precision rates of 94.36% and 94.46%, respectively. The second-best accuracy is 94.4%, obtained on Quadratic SVM, with recall and precision rates of 94.26% and 94.36%, respectively. As in the fine-tuned VGG16 experiment, the execution time is also noted; here the Quadratic SVM executed faster than the rest of the classifiers.
The experimental results of the fine-tuned ResNet101 pre-trained model are given in Table 5. The best-obtained accuracy of 97.7% was achieved on Cubic SVM, with recall and precision rates of 97.7% and 97.7%, respectively. The second-best accuracy is 97.2% on Quadratic SVM, with recall and precision rates of 97.24% and 97.32%, respectively. In this experiment, the Linear SVM executed faster than the rest of the selected classifiers.
The classification results of the fine-tuned Densenet201 pre-trained model are given in Table 6. In this table, the best-obtained accuracy is 98.4% on Cubic SVM, with recall and precision rates of 98.44% and 98.5%, respectively. Figure 4 illustrates the confusion matrix of Cubic SVM, which was utilized for the verification of the recall rate. The second-best accuracy is 97.4%, achieved on Quadratic SVM. The computation time of each classifier is also noted, and the minimum time is 302 s for LSVM.
In the first comparison, between the original and augmented datasets, the accuracy obtained on the augmented dataset is significantly better. In the second comparison, the fine-tuned DenseNet201 model achieved better results than VGG16, ResNet50, and ResNet101. Based on this analysis, the fine-tuned DenseNet201 is selected for the rest of the experiments.
The proposed Entropy-ELM feature selection technique is applied to the features of the fine-tuned deep learning model that was selected based on its better accuracy. The results are given in Table 7, which shows a best accuracy of 98% on Cubic SVM. The other computed measures are a recall rate of 98.02%, a precision rate of 97.98%, and an F1-Score of 98%. The recall rate of Cubic SVM can also be verified through the confusion matrix illustrated in Figure 5. Compared to the results given in Table 6, the accuracy is slightly reduced, but a large change occurred in the computation time. The time is also plotted in Figure 6 (FDenseNet201 and Dense Entropy-ELM).
After the selection of the best dense features, in the next step the features of all fine-tuned deep models are fused using the proposed parallel approach. The results of this experiment are given in Table 8. The best accuracy in this table is 98.2% on Cubic SVM, with recall and precision rates of 97.92% and 98.12%, respectively. Figure 7 illustrates the confusion matrix that can be utilized for the verification of the recall rate. The time of each classifier is also noted and plotted in Figure 6 (Fusion Entropy-ELM). In comparison with the results of Tables 6 and 7, the overall accuracy is improved, but the time is higher than in the Dense Entropy-ELM step.
Finally, the features of Dense Entropy-ELM and Fusion Entropy-ELM are fused using the proposed parallel approach, and the results are given in Table 9. This table presents the best-obtained accuracy of 98.50% on Cubic SVM. The noted precision rate is 98.30%, the recall rate is 98.36%, and the F1-Score is 98.48%. The second-best accuracy is 97.5% on Quadratic SVM. The recall rate of Cubic SVM can be verified through the confusion matrix plotted in Figure 8. This figure shows the correct prediction rate of each class on the diagonal. Comparing the results of this experiment with all previous experiments, it is clear that the accuracy is improved and the computational time is significantly reduced.
Figure 1 shows the proposed framework, which includes a few important steps, and illustrates the importance of the data augmentation step. The results without data augmentation have lower accuracy than the results obtained after data augmentation. Moreover, the selection of important features, which are later fused through a parallel approach, improves the accuracy. This step not only improves the classification accuracy but also reduces the computational time, as plotted in Figure 6. This figure clearly shows that the final fusion step significantly reduces the computational time compared with the rest of the steps on all classifiers.

Discussion
Finally, the accuracy of the proposed framework is compared with recent state-of-the-art (SOTA) techniques, as given in Table 10. The methods listed in this table were published between 2017 and 2022, and all of them used the same leaf dataset. The best recent accuracies were 98.08% and 96.50%, achieved by Khan et al. [13] and Hussain et al. [24], respectively. Among the other methods, Lin et al. [35] achieved an accuracy of 96.08% on the same dataset. The proposed framework achieved an accuracy of 98.48%, which is an improvement over the SOTA techniques.

Conclusions
Agriculture is an active topic of research. In agriculture, deep learning has shown significant success over the last decade in the recognition of plant diseases. In this article, a deep learning and Entropy-ELM-based framework is proposed for the recognition of cucumber leaf diseases. In the proposed framework, four pre-trained deep models are trained and one of them is selected based on accuracy; its features are then refined by selecting the best features with the proposed Entropy-ELM technique. In the second step, the features of all pre-trained models are fused and the feature selection technique is applied. Finally, the features of both steps are fused and classification is performed. The proposed framework is tested on an augmented cucumber leaf dataset and achieved an accuracy of 98.48%. Comparison with existing techniques showed that the proposed framework obtained improved results. From the results, it is concluded that the augmentation process improves the recognition accuracy but also increases the computational time, which was the first limitation of this framework; therefore, a feature selection technique is proposed to maintain the accuracy and reduce the computational time. Through the feature selection and fusion process, important information is obtained that later improves the classification accuracy. Another limitation of this work is that a few features are discarded during the selection process. In the future, the EfficientNet deep model will be implemented and features will be refined through the Butterfly metaheuristic algorithm instead of the heuristic search approach [20]. Moreover, reinforcement learning and graph CNNs will be applied and refined through feature selection algorithms for better results [38][39][40][41][42].