Multiple Ocular Disease Diagnosis Using Fundus Images Based on Multi-Label Deep Learning Classiﬁcation

: Designing computer-aided diagnosis (CAD) systems that can automatically detect ocular diseases (ODs) has become an active research ﬁeld in the health domain. Although the human eye might have more than one OD simultaneously, most existing systems are designed to detect speciﬁc eye diseases. Therefore, it is crucial to develop new CAD systems that can detect multiple ODs simultaneously. This paper presents a novel multi-label convolutional neural network (ML-CNN) system based on ML classiﬁcation (MLC) to diagnose various ODs from color fundus images. The proposed ML-CNN-based system consists of three main phases: the preprocessing phase, which includes normalization and augmentation using several transformation processes, the modeling phase, and the prediction phase. The proposed ML-CNN consists of three convolution (CONV) layers and one max pooling (MP) layer. Then, two CONV layers are performed, followed by one MP and dropout (DO). After that, one ﬂatten layer is performed, followed by one fully connected (FC) layer. We added another DO once again, and ﬁnally, one FC layer with 45 nodes is performed. The system outputs the probabilities of all 45 diseases in each image. We validated the model by using cross-validation (CV) and measured the performance by ﬁve different metrics: accuracy (ACC), recall, precision, Dice similarity coefﬁcient (DSC), and area under the curve (AUC). The results are 94.3%, 80%, 91.5%, 99%, and 96.7%, respectively. The comparisons with the existing built-in models, such as MobileNetV2, DenseNet201, SeResNext50, InceptionV3, and InceptionresNetv2, demonstrate the superiority of the proposed ML-CNN model.


Introduction
Vision-threatening ocular diseases (ODs), such as age-related macular degeneration (AMD), diabetic retinopathy (DR), cataracts, uncorrected refractive errors, and trachoma, have become remarkably common over the past two decades. A recent world report on vision from the World Health Organization (WHO) demonstrated that visually impaired persons worldwide exceed 2.2 billion. At least 45% of these cases could have been prevented or are yet to be addressed [1]. Trachoma, cataracts, and uncorrected refractive errors (e.g., myopia, astigmatism, hypermetropia, and presbyopia) are three leading causes of blindness and vision impairment. According to the WHO, more than 153 million people are visually impaired due to uncorrected refractive errors, almost 18 million people are bilaterally blind from cataracts, and approximately 10.6 million are diagnosed with trachoma [2]. Moreover, studies showed that AMD is the most common cause of blindness, particularly in developed countries, as it accounts for 8.7% (i.e., 3 million people) of all blindness worldwide. The Several different ODs can co-occur in the human eye. Therefore, a model needs to be optimized to diagnose the various ODs types based on multi-label classification (ML-C). ML-C means that each training example is associated with more than one label [25]. Prediction of a single label is the goal of typical classification tasks. However, it might be required to predict the likelihood across several class labels at the same time where the classes are generally assumed to be mutually exclusive. In ML-C, it is necessary to simultaneously output zero or more labels for each input sample [26]. Many studies were conducted to detect the various ODs, but a few utilized ML-C of the ODs as one patient can have more than one type of eye disease. Therefore, optimizing the models that support the concept of ML-C is very important. On the other hand, many studies merely apply the existing universal models in classifying ODs rather than building new or optimizing current architectures. That lessens the classification accuracy because the model can perform accurately in one field while it cannot be applied to the other. This paper presents a novel CAD system for detecting various ODs using ML-C. We utilize a multi-label (ML) benchmark dataset, the retinal fundus multi-disease image dataset (RFMiD) [27], that contains 45 classes of fundus retinal images. Most of the images in the utilized dataset may include more than one disease simultaneously, as shown in Figure 1. Table 1 clarifies the different ODs shown in Figure 1, along with their labels and definitions. The proposed CAD system starts with preprocessing steps that include the normalization and augmentation by some transformation processes. After that, we train the divided data using our proposed CNN model. Then, we test the unknown images from different diseases. The results of the proposed CAD system are probabilities of the ODs. We utilize different evaluation metrics and compare the proposed system with various state-of-the-art schemes to demonstrate its efficiency and reliability. The main contributions of this work are summarized in the following points: • We propose an ML-CAD framework based on DL for simultaneous diagnosis of interleaved ODs from color fundus images. • The effectiveness of the proposed framework is verified utilizing a recent publicly available ML dataset (RFMiD) that contains a wide variety of challenging ocular diseases. • We compare the performance of the proposed framework, utilizing five different measures, with similar frameworks and built-in models. The experimental results illustrate the practicality and superiority of the proposed framework. • Compared to existing multiple ocular diseases frameworks that can detect at most ten ODs, the proposed framework can detect more than twenty-nine ODs.
The rest of this paper is organized into five sections. Section 2 presents the related works. It discusses the current limitations and highlights the main directions and solutions included in the proposed system to overcome the current shortcomings. Section 3 explains the detailed phases and techniques utilized in the proposed CAD framework. Section 4 describes the different experiments conducted and presents the findings. Section 5 introduces the discussion and provides a comparative analytical study of the proposed CAD system and other state-of-the-art techniques. Finally, Section 6 presents the conclusion of the work and findings and highlights future research directions.  Tiny pebbles of debris that build up over time.
Myopia [35] MYA Objects in the distance appear blurred while close objects are often seen clearly.
Degenerative changes in the choroid, sclera. The eye is too long or the cornea is more curved. The anatomic sequelae of atrophy of the anterior visual pathway with the loss of retinal ganglion cells.
Pale yellow discoloration of the optic disc and absence of many small vessels.

Tessellation [38] TSLN
A common characteristic of myopic eyes and a clinical marker for the development of retinochoroidal changes.
It appears due to the thinning of RPE and choriocapillaris. The choroidal vessels are visible due to the reduced density of the pigments.

Related Work
This section reviews the existing diagnosis systems for ODs. We discuss the current limitations and highlight the main directions and remedies suggested in the proposed system to overcome the current shortcomings. For example, He et al. [8] presented a CAD system using a dense correlated network (DCNet) to classify color fundus images. They used a public dataset (ODIR 2019) based on seven types of ODs using ML-C. The authors utilized two fully connected layers with rectified linear unit (ReLU) activation in one of them. One dense layer is 512, and the other is 8 as the number of the output categories of the ODs. They employed an ML soft margin loss function. The main advantage of their method is that it can be used in multi-modal image analysis. However, they could not compare their model with other existing works because their method is patient-based, and the other studies are image-based. They shared the same backbone CNN to extract features from the right and left eyes to reduce the computation complexity. Hence, they could not handle the unbalanced distribution of patient cases.
Wang et al. [9] utilized transfer learning to extract features of the color fundus images and then utilized ML-C based on problem transformation. The authors utilized a multi-label dataset with eight labels. They utilized histogram equalization on gray and colored images. Then, they applied two classification models to the two images sets. Finally, they obtained the average of the sigmoid output probabilities from the two models. The main limitation of their work is the low network performance because of the variety of uncommon ODs categorized in the label "other diseases" in the utilized dataset. In addition, their system suffers from the data imbalance problem due to the limited data in some disease categories. Hence, some of the specific features learned are unknown.
Cheng et al. [39] used a graphical convolution network (GCN) to detect eight DR lesions (laser scars, drusen, cup disc ratio, hemorrhages, retinal arteriosclerosis, microaneurysms, hard and soft exudates) from color fundus images. They utilized ResNet-101, then two convolutional (CONV) layers with a 3 × 3 kernel, stride 2, and adaptive max pooling for feature extraction. Their model's accuracy (ACC) and receiver operating characteristic (ROC) values showed better detection results for laser scars, drusen, and hemorrhage lesions. In contrast, their system had poor detection ability for microaneurysms, soft exudates, and hard exudates. This is mainly because microaneurysms appeared as small red spots in the retinal capillaries. Thus, the model could not distinguish microaneurysms from the background of the fundus images. On the other hand, soft and hard exudate lesions often accompanied multiple other fundus lesions simultaneously, which made it difficult for the model to extract the features of all fundus lesions.
Dipu et al. [40] utilized transfer learning to detect eight ODs from the ODIR2019 dataset. They compared the performance of some state-of-the-art DL networks, such as Resnet-34, EfficientNet, MobileNetV2, and VGG-16. The authors trained the cutting-edge on the utilized dataset and reported the results. They evaluated the models' performance by estimating the ACC. They ordered the models due to the resulting ACC into VGG-16, Resnet-34, MobileNetV2, and EfficientNet. However, the authors did not build a new model to detect the ODs. Moreover, calculating only ACC was not enough to estimate the model performance.
Choi et al. [41] proposed a CAD system using random forest transfer learning based on VGG-19. They utilized a small dataset in order to detect ten OD categories. They observed that the ACC increases by decreasing the classes that need to be detected to three. On the contrary, when they increased the categories to ten, the ACC decreased with a difference of about 30%. The authors tried to use an ensemble classifier with transfer learning and found a slight increase in ACC of 5.5%. Although the authors made augmentation, they could not achieve good performance because of the data imbalance.
Diaz Pinto et al. [23] utilized DL to detect glaucoma from color fundus images. First, they cropped the images around the optic disc. Then, other transformation processes were applied, such as random rotations, zooming by a range between 0 and 0.2, and horizontal and vertical flipping. They utilized VGG16, VGG19, InceptionV3, ResNet50, and Xception. Each architecture was followed by global average pooling. They used the softmax classifier. The authors used stochastic gradient descent (SGD) for updating weights. They set epochs to 100 and 250. The batch size was set to 8, the learning rate (LR) was set to 10 −4 , and the momentum rate was 0.9. The fine-tuning performance was decreased when testing the CNNs on databases that were different from those used for training.
Tan et al. [42] classified AMD by using a CNN. Their proposed model consists of seven CONVs with a 3 × 3 kernel, four max-pooling (MP) (2 × 2), and three fully connected (FC) layers. The proposed model was fully automatic so that no hand-crafted feature extraction or selection was required. Moreover, no classifier was required as they involved meticulous engineering in designing a feature extraction module that could extract highly distinctive features for classification. The proposed model can be installed in a cloud system. On the other hand, the authors highlighted the issues of their model, as the overall diagnostic performance of their CNN model was poor and would improve with more extensive data. Furthermore, the proposed model suffered from convergence and overfitting problems. In addition, the training of the CNN model was slow and intensive. A summary of the most recent studies (published from 2018 to 2021) is shown in Table 2.
From the previous review of the current studies conducted recently, we can conclude their main limitations in diagnosing multiple ODs based on the ML-C concept as follows: • Increasing the number of classes decreases the model performance, mainly if the number of training samples is not sufficiently large; • Some systems are conservative and cannot be applied in the real world because of the imbalanced and/or insufficient datasets; • Some models suffer from overfitting; • The overall performance of the ML-C based models is minor compared to ML-or binary classification-based models because of overlapping labels.
To overcome the previous limitations, we propose a novel ML CAD system to accurately diagnose the different ODs from various ML color fundus images. We apply some augmentation processes to enlarge the training dataset and avoid overfitting and data imbalance issues. On the other hand, data augmentation improves the model performance. We utilize basic augmentation for preserving the labels after transformation. Moreover, we suggest a novel CNN model to extract feature maps (FM) of the augmented images. We customized the different hyper-parameters of the CONV layers, dense layers, dropout (DO), stride, kernel, filters, MP, optimizer, regularization, LR, loss function, number of epochs, batch size, and classifier to provide precise results. The proposed model can report the probability of each disease of the 45 ODs in each image. Finally, we evaluated the system performance using six different metrics and compared it with many current systems and models.

The Proposed ML CAD System
This section gives a detailed explanation of the proposed multiple OD ML-based CAD system. The proposed system consists of three phases. It starts by supplying the preprocessing phase with the ML dataset [27]. This phase aims to scale the images to a standard size and apply some transformation processes, such as vertical and horizontal flipping, rotation, and brightness, contrast, hue, and saturation adjustments. The normalized preprocessed images are then fed to the proposed CNN model in the feature extraction and ML-C phase. Finally, the prediction is obtained by testing the proposed CNN model. Figure 2 shows the proposed ML CAD system. We present below the three phases of the proposed system in detail.

Preprocessing
The preprocessing stage consists of two main phases, which are image resizing and data augmentation. These phases are discussed in the following subsections.
Image Resizing. All images are resized to a standard size of 128 × 128 pixels. Although using the actual size could be helpful in learning, resizing the input images is essential to save memory and space and reduce the training time.
Data augmentation. Each image in the utilized dataset could contain multiple diseases out of forty-five different ODs. However, this dataset is imbalanced because the number of normal samples could greatly outnumber the samples of some abnormal classes (diseases). For instance, while the number of normal images (i.e., negative samples) in which no ODs appear in the training dataset is 401, the number of images in which some diseases appear, such as cotton-wool spots (CWS) and choroidal folds (CF), are less than 10. Thus, data augmentation methods should be utilized to increase the number of images in which rare diseases appear and reduce the positive-negative class imbalance. Different transformation methods [43] can be applied to the image dataset to enlarge the dataset, solve class imbalances, and avoid overfitting. In this work, we utilized simple, safe, and manageable augmentation methods to preserve labels and encourage the model to learn more general features. The safety of a data augmentation method refers to its likelihood of preserving the label post-transformation. Precisely, we used different augmentation methods, including horizontal and vertical flipping, rotation, brightness change, saturation change, and hue change, to images of rare diseases to ensure that each label in the dataset occurs at least 100 times. Table 3 lists the utilized augmentation methods and the corresponding parameters' values, and Figure 3 shows an example of an augmented image to 10 images. The number of training samples increased from 1920 to 4784 after data augmentation.

Feature Extraction and ML-C
The CNN general architecture is made up of CONV, pooling (PO), and FC layers [44,45]. Each component may at least consist of one layer. Different mapping functions and regulatory units, such as batch normalization (BN) and DO, are also included in the architecture to optimize the performance and avoid overfitting. The CONV operation picks up distinct features from the input color fundus image. It is performed with kernel filters to generate feature maps (FMs). Some down-samplings are included in the architecture, such as stride and PO. A stride is the number of units the filter slides upon for CONV and MP. On the other hand, each successive FM would get smaller after the CONV process without zero paddings. At last, the CONV layer output is passed to a non-linear activation function (AF), such as sigmoid and ReLU.
Pooling [46] regulates the CNN complexity and ensures the fixed size of the output. It decreases the learnable parameters and reduces FM size. Moreover, it increases the generalization and reduces overfitting. Pooling in CNNs can be MP or global pooling (GP), among other alternatives. The MP operation takes the highest value from each kernel. Finally, in FC (dense layer) [47], the output of CONV and PO layers is flattened and transformed into a 1D array. The weight connects each input with each output. The output nodes have the same number of classes.
It is necessary to focus on the arrangement of all components in CNN architecture. This arrangement plays a vital role in building new architectures and achieving the needed performance. Therefore, the proposed ML-CNN model consists of three CONV layers with 32 filters and a kernel size of 3 × 3, and the AF is ReLU. Then, the MP layer with a 2 × 2 kernal is performed, followed by DO with 0.25 for regularization. Again, two CONV layers with 64 filters are performed, followed by one MP (2 × 2) and DO with 0.25. After that, one flatten layer is performed, followed by one FC layer with 512 neurons. We add another DO with 0.5. Finally, one FC layer with 45 nodes is performed. The total trainable parameters are 29,589,613. Figure 4 demonstrates in detail the layers of the proposed ML-CNN model, and Table 4 shows the proposed ML-CNN architecture. The configurations of the hyper-parameters utilized in the proposed ML-CNN model are presented in Table 5. The optimizer is the SGD with LR equal to 0.01, decay equal to 10 −6 , and momentum rate equal to 0.9. The batch size is 32, the loss function is categorical cross-entropy, the AF of the final FC layer is sigmoid, and the number of epochs is 50.  We utilize the sigmoid function at the end of the ML-CNN model classifier to transform the raw output values into probabilities, which is the final understandable format. Moreover, the sigmoid function is suitable in the ML-C problem [48] because since there is more than one "right answer", the outputs are not mutually exclusive. The probabilities produced by the sigmoid function are independent and are not constrained to sum to one. In addition, the sigmoid function allows us to have a high probability for all OD classes, some of them or none. For example, when classifying ODs in the color fundus image, the image might contain DR and/or RS, or none of those defects.

The Prediction
The last layer of the proposed model is a fully connected layer that consists of forty-five output neurons. The output of this layer provides the probabilities of the 45 ODs in the utilized RFMiD dataset. Based on the output probabilities, the model can predict whether each OD appears in the image presented to the model. For each label (disease), if the corresponding probability is larger than a preset cut-off value (0.50), our model confirms the presence of this disease in the presented image. Figure 5 shows an example of an image presented to the proposed model along with the top 10 ODs with the highest probabilities. It can be observed that only the top three ODs are confirmed by the model based on the preset cut-off value. The proposed model is validated using the 10-fold cross-validation technique to reduce overfitting. The training set is split into k (=10) smaller subsets. The proposed ML-CNN model is trained using k − 1 of the folds as the training set of data, and the resulting model is validated on the remaining part of the data. The output of the model and validation using five different metrics for evaluating the performance of the proposed model are discussed in detail in the next section.

Experimental Results
This section gives a detailed description of the dataset and explains the utilized performance metrics. Finally, we present the results obtained from applying the proposed ML-CNN model.
RFMiD is an ML dataset where each image contains multiple ODs. It consists of 3200 images, including normal and abnormal cases (669 images are normal, and the rest are abnormal). The abnormal images include 45 diseases or classes in which DR appears in 632 images, MH appears in 523 images, and ODC appears in 445 images. The remaining diseases appear in smaller numbers of images. The full list of diseases and the number of images in which each disease appears is shown in Table 6. In addition, the complete name of each disease can be found in [27]. All images were captured with three fundus cameras, namely, TOPCON 3D OCT-2000, TOPCON TRC-NW300 with a distance of 40.7 mm and a 45 • field of view (FOV), and Kowa VX-10 with a distance of 39 mm and a 50 • FOV. The images are in various resolutions, such as 2144 × 1424, 4288 × 2848 and 2048 × 1536. The utilized dataset contains a CSV file that includes the image ID, disease risk (presence of disease/abnormality), and 45 classes of ODs, which are found in the color fundus images. An image is assigned the value '0', in the CSV sheet, if it does not have a specific disease (out of the 45 diseases) and labeled by '1' otherwise. For instance, if the image includes DR, RS, and LS, but does not include the remaining ODs, only columns representing DR, RS, and LS, for this image, will be assigned the value '1' in the CSV sheet.

The Performance Measures
We utilized five different metrics to evaluate the performance of the proposed ML-CAD system, namely, SEN/recall, the Dice similarity coefficient (DSC), accuracy (ACC), area under the ROC curve (AUC), and the positive predictive value (PPV), which are defined in Equations (1)-(5) [49].

The Results
The proposed framework was implemented using python 3.7 on the free cloud service "Google Colab". We utilized the popular TensorFlow 2.4 machine learning library. We also utilized the open-source Python library OpenCV and "roboflow.ai" for the preprocessing steps and the DL Python open-source library "TFLearn" for classification. We ran all experiments using a local machine with a core i5/2. After we built the proposed ML-CNN model, we had to improve its performance by customizing the various hyperparameters such as the optimizer and its LR, L 1 , and L 2 regularization, number of epochs, and batch size (BS). We evaluated the model's performance using the five performance measures described in the previous subsection for each tested combination to find the best combination of hyper-parameters. We split the dataset using the built-in library "sklearn" into 90% for the training set and 10% for the testing set. We assigned a random state (seeds) to be 114 and the shuffle to true. Table 7 shows the hyper-parameter optimization experiments of the proposed ML-CNN model.
The hyper-parameters have been customized in the network based on the experiments. We have considered the relationship between the loss function and learning rate, optimizer type, calculating stride, dropout values, and the number of epochs. The learning rate controls how quickly the model is adapted to the problem. Smaller learning rates require more epochs given the smaller changes made to the weights each update, whereas larger learning rates result in rapid changes and require fewer training epochs. From Table 7, we can observe that the hyper-parameters are 20 epochs, 4 BS, the Adam optimizer with (LR = 10 −5 ), and 0.001 for L 1 regularization. This achieved 100% for ACC, 100% for DSC, 94% for AUC, 67% for precision and recall curve. Figures 6 and 7 show the training and validation ACC and loss. Figure 8 shows the ROC curve for all the 45 classes of the utilized dataset. The figure shows that the micro average ROC curve area for all 45 classes or ODs is 94%. Figure 9 shows the precision and recall (PR) curve for all 45 classes of the utilized dataset. The figure shows that the micro average PR curve area for all 45 classes or ODs is 67%.   The proposed model outputs the images with a title that informs the physician of the predicted ODs by the system and the actual labels or ground truth (GT) of that image. Figure 10 shows an example of output predictions and the actual GTs. The image in (a) has TSLN disease, but it is predicted that it has DR, TSLN, and MH ODs. On the other hand, the image in (b) has the DR and LS diseases simultaneously, but it is predicted to have TSLN, DR, and MH ODs. Now, we demonstrate the comparison between the proposed ML-CNN model and the other state-of-the-art models, such as MobileNetV2, DenseNet201, SeResNext50, Xception, InceptionV3, and InceptionresNetv2, through the five performance metrics defined above. Table 8 shows the resulting averages of ACC, SEN, PREC, DSC, Loss, and AUC for Mo-bileNetV2, DenseNet201, SeResNext50, Xception, InceptionV3, InceptionresNetv2, and the proposed ML-CNN by using dataset split and 10-fold cross-validation techniques. We can observe that the proposed system with a 10-fold cross-validation technique is the best in SEN, PREC, and AUC.      Figure 11 shows the training and validation of ACC and loss of utilizing the Incep-tionV3 model on the same utilized dataset in our ML-CNN model. Figure 12 shows the training and validation of ACC and loss of utilizing the mobileNetV2 model on the same utilized dataset in our ML-CNN model. Figure 13 shows the training and validation of ACC and loss of utilizing the DenseNet201 model on the same utilized dataset in our ML-CNN model. Figure 14 shows the training and validation of ACC and loss of utilizing the seResNext model on the same utilized dataset in our ML-CNN model. Figure 15 shows the training and validation of ACC and loss of utilizing the Xception model on the same utilized dataset in our ML-CNN model. Figure 16 shows the training and validation of ACC and loss of utilizing the InceptionResNetV2 model on the same utilized dataset in the ML-CNN model.

Discussion
This section provides detailed analytical comparisons between the proposed ML-CNN system and the other studies reported in the literature for detecting various ODs. In the system proposed by Choi et al. [41], it was observed that increasing the classes to be predicted in MLC decreases the ACC. They achieved 30.5% for ACC of classification of ten classes. However, when they predicted only three classes, the ACC increased to 72.8%. Wang et al. [9] achieved 90% for ACC to make MLC of only eight classes from the ODIR2019 dataset by using the EfficientNet model, while Dipu et al. [40] achieved 93.82% for ACC in 2021. We achieved 100%, 100%, 94%, 81%, and 67% for ACC, DSC, AUC, PREC, and SEN, respectively. Apart from PREC and SEN, our system outperforms the system proposed by Wang et al. [9]. In the proposed ML-CNN, the AUC is higher than what is achieved by He et al. [8] by 1%. On the other hand, the proposed ML-CNN system achieved 94.3%, 80%, 91.5%, 99%, and 96.7% for ACC, SEN, PREC, DSC, and AUC, respectively, by using 10-fold CV.
Although ML-C suffers from the overlapped classes and is not mutually exclusive, it enables flexible (soft) classification. Each image may include more than one or two ODs simultaneously. If ML-C cannot predict all ODs, the patient is at risk. On the contrary, binary (hard) classification predicts only the presence/absence of the disease. ML-C gives the probability of occurring the ODs in each case. Our results are good enough compared to the other works that employ the ML-C concept to detect various ODs from color fundus images, such as Dipu et al. [40], He et al. [8], Wang et al. [9], Cheng et al. [39], and Choi et al. [41], as shown in Figure 17.
The k-fold CV reduces the overfitting but does not completely eliminate overfitting. Therefore, we will split data manually in the future. The limitations of our system are that it falls into overfitting in some epochs, and SEN/recall could be relatively low. In the case of increasing the epochs, the recall increases, and the ACC decreases.

Conclusions
ODs threaten human health. They can result from infections in other body parts or be indicators of any other disease. We developed a new ML-CNN system that depends on ML-C to diagnose various ODs from color fundus images. We utilized an ML dataset that includes 45 different diseases. After applying the augmentation processes required to enlarge the dataset, we split the dataset to 90% for training and 10% for testing. We validated the system using five performance measures: ACC, SEN, PREC, DSC, and AUC. We utilized a k-fold CV with different values of k to verify the obtained result. Specifically, we applied the 2-fold, 5-fold, and 10-fold CV. Furthermore, we compared the proposed ML-CNN system and other previously proposed systems to diagnose multiple ODs. Additionally, we compared our system with other built-in models by applying them to the same dataset. The proposed system gives promising performance for future research. In the future, we intend to establish a large and balanced ML dataset and split it manually to give us the flexibility to eliminate overfitting.