1. Introduction
The novel coronavirus disease, known as COVID-19, has spread around the world on a massive scale. Many patients with pneumonia were initially detected in the Wuhan region of China around December 2019 [1, 2]; at first, the illness belonged to an unknown category. The symptoms found in several patients resembled those of severe acute respiratory syndrome (SARS). The Chinese Center for Disease Control and Prevention (CCDC) detected a new type of coronavirus (nCoV) in throat swab samples collected from several patients in Wuhan. Eventually, as many as 692,201,275 people were infected with coronavirus, with more than 6,903,321 deaths and 664,525,952 recoveries worldwide as of July 2023 [3]. Because this was a challenging condition, academics and medical experts adopted real-time reverse transcription polymerase chain reaction (RT-PCR) to evaluate and identify COVID-19 [4]. In this procedure, the reverse transcription step converts the viral RNA in a patient’s sample into complementary DNA. The DNA is then subjected to PCR so that it can be amplified before it is evaluated. Because coronavirus is an RNA virus, this reverse transcription step is needed before PCR can detect it. The results acquired using PCR kits were often delayed because of the increase in demand for COVID-19 tests. Moreover, the results generated by PCR kits are not fully dependable because of the incidence of false-negative (FN) outputs that arise from their composition [4].
Pneumonia is a potentially fatal respiratory illness that affects around 7% of the world’s population each year [5]. It can progress severely within a short period because fluid steadily accumulates inside the lungs, with an effect that ultimately resembles drowning. The bacteria and other microorganisms that cause pneumonia induce an inflammatory response in the air sacs of the lungs, called alveoli [6]. When microorganisms begin to multiply within the lungs, leukocytes trigger inflammation in the sacs to combat the bacteria and fungi that are responsible for the infection. As a result, the region of the lungs affected by pneumonia becomes filled with infected fluid, which leads to breathing difficulty as well as coughing and fever. A person can die from pneumonia if the early stages are not treated with appropriate medication [7].
Individuals who contract COVID-19 may experience the following signs and symptoms: fever, cough, loss of taste or smell, sore throat, chest pain, and dyspnea [8]. Patients with pneumonia, pneumothorax, pulmonary tuberculosis, or lung cancer (LC) frequently exhibit the same symptoms. Because of this, it can be challenging for medical professionals to identify COVID-19 [9]. Many researchers and professionals in the medical field need a technique with which to accurately diagnose COVID-19 [10]. It was determined that X-ray imaging analysis was the best method for identifying coronavirus. Chest radiography, also known as chest X-ray imaging, is the cheapest and most common medical imaging procedure used today. Even in low-income countries, contemporary digital radiography devices are fairly affordable. Therefore, health professionals make extensive use of this method to identify and analyze potentially lethal diseases such as tuberculosis (TB), pneumonia, and LC. X-ray images of a patient’s chest can provide a great deal of information about their medical history. Despite this, accurately identifying COVID-19 from X-ray images remains a demanding task for medical professionals. In chest X-rays, overlapping tissue structures significantly increase the difficulty of making an accurate diagnosis. Because of this, a human diagnostician may have difficulty detecting COVID-19 when the contrast between the lesion and the neighboring tissues is relatively small or when the lesion overlaps the ribs. It is not always easy to distinguish COVID-19, pneumothorax, pneumonia, LC, or TB from chest X-rays, even for an experienced medical professional. Therefore, the automated identification of these disorders from chest radiographs by means of AI, and more specifically machine learning models, may provide a solution to this problem [11]. Reports that DL and transfer learning algorithms outperform humans in diagnostic evaluations have generated considerable excitement. In the field of AI, both approaches, namely, deep learning and transfer learning, provide a framework where previously acquired knowledge can be used to address new but related tasks significantly more efficiently and effectively [12]. In previous studies, machine learning approaches to COVID-19 detection have commonly relied on the transfer learning method.
DL models have fundamentally altered the process of illness classification, which has presented medical practitioners with a new opportunity [13, 14]. Most of the successful CNN-based medical diagnosis systems have been used in the segmentation and identification of brain and breast tumors [15], the detection of cancer cells, the treatment of chest infections [16], and the diagnosis of cancer in individual cells [17]. Accordingly, CNNs have become a popular model in COVID-19 detection. Fukushima first suggested the CNN architecture in 1988 [18]. However, because of the limitations of the computing hardware required to train the network, it was not widely deployed. In the 1990s, LeCun et al. successfully solved the handwritten digit classification problem by using CNNs and a gradient-based learning method [19]. Subsequently, researchers improved CNNs further and obtained state-of-the-art results in numerous recognition tasks. CNNs are superior to traditional ML in several ways, including their structural similarity to the natural visual processing system, their ability to learn and extract abstractions of 2D features, and their high level of optimization in processing both 2D and 3D images. The max-pooling layer of a CNN is effective at absorbing shape variations. Additionally, compared to fully connected networks of similar size, CNNs have far fewer parameters because they are made up of sparse connections with tied weights. CNNs are mostly trained with a gradient-based learning approach and suffer less from the vanishing gradient problem. Because the gradient-based approach trains the network as a whole to directly minimize an error criterion, CNNs are able to produce highly optimized weights. Several studies have adopted CNNs to solve the problem of automatic detection of COVID-19. For example, Hammoudi et al. [20] implemented a hybrid CNN to classify four categories of lung condition: bacterial, COVID-19, viral, and normal cases. They reported that their model achieved 95.72% accuracy.
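The parameter saving that comes from shared weights can be illustrated with a small calculation. The sketch below is illustrative only: the 224 × 224 single-channel input, the 3 × 3 kernel, and the 32 output units are assumed values and are not taken from any of the cited studies.

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights of a square-kernel convolution plus one bias per output channel."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


def dense_params(in_features, out_features):
    """Weights of a fully connected layer plus one bias per output unit."""
    return in_features * out_features + out_features


# Assumed example input: one 224 x 224 grayscale chest X-ray.
height, width, channels = 224, 224, 1

conv_count = conv2d_params(channels, 32, 3)                  # 32 shared 3x3 filters
dense_count = dense_params(height * width * channels, 32)    # 32 units, one weight per pixel

print(conv_count)   # 320
print(dense_count)  # 1605664
```

The convolutional layer reuses the same nine weights per filter at every image position, which is why its parameter count does not depend on the image size.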
Deep learning (DL) rose to prominence in the last decade with AlexNet [21, 22]. AlexNet became one of the most popular DL techniques due to its tremendous achievements in computer science applications such as image processing. AlexNet is built on the working mechanisms of CNNs. After the emergence of AlexNet, various DL architectures appeared, including VGG-16, GoogLeNet, ResNet, DenseNet, MobileNet, etc. These architectures commonly use stacks of many convolutional and max-pooling layers, followed by fully connected and SoftMax layers at the end. Other DL models are based on recurrent neural networks (RNNs), including long short-term memory (LSTM) and gated recurrent units (GRUs), or on auto-encoders (AEs) [23, 24]. DL approaches have also achieved excellent performance in cyber security [23, 25] and e-commerce [26]. However, the majority of the DL models above have a large number of parameters, which means high computation costs and long training times.
DL is one of the most popular machine learning approaches for detecting COVID-19. One of the most popular families of DL models is VGG, named after the Visual Geometry Group. VGG has several variants, including VGG-16 and VGG-19. The VGG architecture is built from blocks of two or more convolutional layers, each followed by the ReLU activation function. Each block ends with max pooling, and the last block is linked to several fully connected layers. ReLU is used throughout, and the final layer adopts SoftMax to produce the classification result. One study adopted VGG-16 to automatically detect COVID-19 [27]. This study considered four classes of datasets, including normal, COVID-19, bacteria, and pneumonia images. According to the evaluation metrics, the VGG-16 model achieved 87.49% accuracy.
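To give a sense of the size of such a network, the following sketch counts the trainable parameters of the standard VGG-16 layout (configuration D of the original VGG paper). The 224 × 224 RGB input, the two 4096-unit fully connected layers, and the 1000-class output are the usual ImageNet settings and are assumptions here; they are not the configuration used in the cited COVID-19 study.

```python
# Standard VGG-16 layout: numbers are output channels of 3x3 convolutions,
# "M" marks a 2x2 max-pooling step that halves the spatial size.
VGG16_CONFIG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                512, 512, 512, "M", 512, 512, 512, "M"]


def vgg_parameter_count(config, in_channels=3, input_size=224,
                        fc_sizes=(4096, 4096), num_classes=1000):
    """Count weights and biases of a VGG-style network."""
    total = 0
    channels = in_channels
    size = input_size
    for item in config:
        if item == "M":
            size //= 2                                  # pooling has no parameters
        else:
            total += 3 * 3 * channels * item + item     # 3x3 kernels plus biases
            channels = item
    features = channels * size * size                   # flattened feature map
    for width in (*fc_sizes, num_classes):
        total += features * width + width               # fully connected layer
        features = width
    return total


print(vgg_parameter_count(VGG16_CONFIG))                 # 138357544
print(vgg_parameter_count(VGG16_CONFIG, num_classes=3))  # 134272835
```

Most of the parameters sit in the first fully connected layer, so replacing the 1000-class head with a three-class head changes the total only slightly.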
DenseNet is a convolutional neural network architecture proposed by Gao Huang et al. in 2017 [14]. Specifically, the output of each layer is connected to all of the subsequent layers within a dense block. Consequently, the network exhibits a high level of interlayer connection, which justifies its designation as DenseNet. This design is effective at promoting feature reuse and achieves a significant reduction in the number of network parameters. The DenseNet architecture is composed of multiple dense blocks, with transition blocks placed between two consecutive dense blocks. The DenseNet model provides strong results in COVID-19 classification. One study implemented a DenseNet model on X-ray images of lungs that were normal, had COVID-19, or had pneumonia [28]; according to the experimental report, this DenseNet model achieved 96.25% accuracy.
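The dense connectivity pattern can be made concrete with a short calculation. In the sketch below, each layer in a dense block receives the concatenation of the block input and the outputs of all preceding layers; the initial channel count of 64, the growth rate of 32, and the block length of 6 are assumed example values, not settings from the cited study.

```python
def dense_block_input_channels(initial_channels, growth_rate, num_layers):
    """Number of input channels seen by each layer of one dense block.

    Layer i receives the block input plus the growth_rate feature maps
    produced by each of the i layers before it.
    """
    return [initial_channels + growth_rate * i for i in range(num_layers)]


def dense_block_connections(num_layers):
    """Direct connections in a dense block of num_layers layers: L(L+1)/2."""
    return num_layers * (num_layers + 1) // 2


inputs = dense_block_input_channels(64, 32, 6)
print(inputs)                      # [64, 96, 128, 160, 192, 224]
print(inputs[-1] + 32)             # 256 channels leave the block
print(dense_block_connections(6))  # 21
```

Because every layer adds only a small, fixed number of feature maps, the layers can stay narrow, which is where the parameter saving comes from.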
The explanations above describe the achievements of DL algorithms. Some studies developed hybrid multi-layer DL models, and the majority succeeded in achieving effective COVID-19 detection. However, these existing models incur high computation costs in the transfer learning process. Moreover, their detection error exceeded 3%. To address this, we propose a novel DL classification model called AMIKOMNET, which was designed specifically for this research project. Our proposed model combines an AE, a subclass of DL algorithms, with a CNN that has a minimal number of layers and parameters, and it is enhanced with the AdaBoost algorithm. The development of AMIKOMNET aims to reduce detection error and computation cost by using a low number of parameters. Chest X-ray image datasets were used to evaluate the effectiveness of AMIKOMNET. The datasets include images of normal patients, COVID-19 patients, and pneumonia patients. The proposed model is a hybrid that chains its three essential algorithms, the AE, the CNN, and AdaBoost, in a serial mechanism.
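To make the boosting idea concrete, the following is a minimal sketch of the standard binary AdaBoost procedure with one-dimensional threshold classifiers (decision stumps) as weak learners. It is not the AMIKOMNET implementation and does not reproduce the AE and CNN stages; the toy data, the function names, and the choice of stumps are assumptions made purely for illustration.

```python
import math


def train_adaboost(xs, ys, rounds):
    """Binary AdaBoost on scalar inputs; ys must contain +1 / -1 labels."""
    n = len(xs)
    weights = [1.0 / n] * n
    thresholds = sorted(set(xs))
    ensemble = []  # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        best = None
        # Pick the stump with the lowest weighted error.
        for t in thresholds:
            for polarity in (1, -1):
                preds = [polarity if x <= t else -polarity for x in xs]
                err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
                if best is None or err < best[0]:
                    best = (err, t, polarity, preds)
        err, t, polarity, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)       # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)     # vote of this stump
        ensemble.append((alpha, t, polarity))
        # Increase the weight of misclassified samples, then renormalize.
        weights = [w * math.exp(-alpha * y * p)
                   for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * (pol if x <= t else -pol) for alpha, t, pol in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single threshold can separate.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = train_adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs] == ys)  # True
```

Each individual stump misclassifies at least three of the ten points, yet the weighted vote of three stumps fits all of them, which is the error-reduction effect that boosting is used for.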
In summary, there are three main contributions from this study:
A novel, hybrid method involving a CNN and AE with a minimum number of layers and parameters that is used to effectively learn different classes of lung disease X-ray images and classify them;
A novel, hybrid model aimed at reducing error classification by using the hybridization of an AE, CNN, and AdaBoost;
A novel image analysis model that uses heat map patterns as a result of applying Grad-CAM. This application is essential in determining the focus location impact of pneumonia or COVID-19.
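For reference, the heat map mentioned in the last contribution follows the standard Grad-CAM definition, restated here as a sketch. The symbols are the conventional ones and are assumptions with respect to this paper's own notation: $A^k$ is the $k$-th feature map of the chosen convolutional layer, $y^c$ is the score for class $c$ before SoftMax, and $Z$ is the number of spatial positions in the feature map.

```latex
\alpha_k^{c} = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^{c}}{\partial A_{ij}^{k}},
\qquad
L_{\text{Grad-CAM}}^{c} = \operatorname{ReLU}\!\left( \sum_{k} \alpha_k^{c} A^{k} \right)
```

The weights $\alpha_k^{c}$ measure how strongly each feature map influences the score of class $c$, and the ReLU keeps only the regions that contribute positively to that class.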
2. Literature Review
The classification of respiratory system problems draws on multiple types of medical imaging, i.e., MRI scans, CT scans, and X-rays. The most essential tool applied to this material is DL. Numerous investigations have shown that COVID-19 can be detected using X-rays, saving time and effort for medical personnel. However, detecting COVID-19 at an early stage has remained difficult in current studies. In this section, we discuss several of the most significant and pertinent publications on AI techniques used in lung disease detection, for instance, COVID-19, pneumonia, pneumothorax, lung cancer, and tuberculosis. A comparison of state-of-the-art techniques is given in Table 1. In summary, the majority of the studies address essential problems in COVID-19 detection, such as a small amount of training data, inaccurate detection results, long computation times, and the use of several types of data, commonly termed multimodal.
Many medical studies involving DL algorithms have been conducted in recent decades. DL algorithms have succeeded in handling several medical imaging problems. A review by Moorthy and Gandhi [29] shows that DL has played an essential role in five areas of medical imaging: segmentation, classification, detection, registration, and image enhancement. For classification tasks, the majority of researchers employ a CNN variant, such as a deep CNN. Several studies have used traditional machine learning algorithms instead. For example, the authors of [30] used random forest (RF) to classify COVID-19 and non-COVID-19 samples, on the grounds that adopting a DL algorithm would require high computation costs. Their experiment showed that RF achieved an accuracy of 87.9%, a sensitivity of 90.7%, and a specificity of 83.3%. One study compared DL models for COVID-19 classification [31], including several generic DL models such as ResNet, VGG19, DenseNet, and Inception; Inceptionv3 achieved 98% accuracy. Another study aimed to reduce detection error in COVID-19 identification by using DenseNet [28]. This research aimed to improve COVID-19 detection even with a small dataset. The feature extraction process relied on ImageNet, and the classification algorithm was DenseNet; the experimental report shows that DenseNet achieved an accuracy of 96.25%, a precision of 96.28%, a recall of 96.29%, and a specificity of 96.21%, yet the model required high levels of computation to learn the datasets.
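Because the studies reviewed in this section report accuracy, sensitivity (recall), specificity, precision, and F1-score, it may help to recall how these metrics follow from a binary confusion matrix. The sketch below uses the standard definitions; the counts in the example are invented for illustration and do not come from any cited study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    tp/fn: positive cases predicted positive/negative,
    tn/fp: negative cases predicted negative/positive.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # also called sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": recall,
        "specificity": tn / (tn + fp),
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical test set: 100 COVID-19 images and 100 non-COVID-19 images.
metrics = classification_metrics(tp=90, fp=20, tn=80, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that sensitivity and recall are the same quantity, so a study that reports both should report the same value for each.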
Imbalanced datasets are a major problem in COVID-19 detection, because training on imbalanced data can make the classification results inaccurate. One study implemented GAN to generate synthetic data [32]. The researchers adopted VGG16 as the core algorithm for the classification task. The experiment achieved an accuracy of 94.74%, a sensitivity of 92.86%, a specificity of 87.50%, and an F1-score of 96%. Another study used a parallel hybrid of InceptionV3 and VGG16 to reduce detection error. The experimental results show an accuracy of 98%, a recall of 98%, and a precision of 98%. This research achieved strong results, but the model, again, incurs high computation costs. A similar hybrid model using VGG16 and VGG19 also aimed to reduce detection error. The model achieved an accuracy of 92%, a precision of 93%, a recall of 92%, a specificity of 77.77%, a sensitivity of 98.68%, and an F1-score of 92%. Another hybrid model using parallel MobileNet aimed to reduce detection error [33]. The researchers reported an accuracy of 93.4%, a precision of 93.48%, and a recall of 92.7%. This research used CT scan images, with a training process of 100 epochs, in contrast to another study that used swab test data instead. From a computation cost point of view, the latter model requires very little computation because it does not involve image data. Its authors incorporated principal component analysis into the feature extraction process and combined it with several traditional machine learning algorithms, such as naïve Bayes, random forest, support vector regression, and linear regression. The experimental results show that naïve Bayes achieved the best performance, with an accuracy of 83%, a recall of 82.7%, and a precision of 62.6%. The shortcoming of this research is that the performance of the model is very low.
A difficult problem in COVID-19 detection is dataset bias, where image datasets are collected from several different hospital sources. The problem arises from similarities between classes among the image datasets. Research that aimed to solve this problem has been conducted [34]. The proposed model implements advanced texture feature extraction with GLCM (grey-level co-occurrence matrix), GLDM (grey-level difference matrix), and wavelet transform, and a CNN is responsible for classifying COVID-19. The experimental results show an accuracy of 92%, a recall of 93%, a precision of 87%, and an F1-score of 89%.
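As an illustration of the texture features mentioned above, the sketch below builds a grey-level co-occurrence matrix (GLCM) for a tiny image and derives the contrast feature from it. It uses the common textbook definition with a single horizontal offset and no symmetrization; the 4 × 4 image with four grey levels is an invented example, and the cited study may have used different offsets, grey-level counts, or features.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often grey level i has grey level j at offset (dy, dx)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """GLCM contrast: sum of p(i, j) * (i - j)^2 over the normalized matrix."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(count * (i - j) ** 2
                   for i, row in enumerate(matrix)
                   for j, count in enumerate(row))
    return weighted / total


# Invented 4 x 4 image with grey levels 0..3.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
matrix = glcm(image, levels=4)
for row in matrix:
    print(row)
print(contrast(matrix))
```

Features such as contrast summarize how often neighboring pixels differ in intensity, and vectors of such features can then be passed to a classifier.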
Table 1.
Previous work regarding DL models for COVID-19 detection using X-ray in binary class datasets: (1) COVID-19; (2) pneumonia; (3) healthy/normal; (4) pneumonia bacteria; (5) pneumonia viral; (6) tuberculosis; (7) lung cancer; (8) no finding; (9) influenza; (10) SARS.
Ref. | Year | Classes | Algorithm | Accuracy |
---|---|---|---|---|
[30] | 2021 | 1, 2 | Random Forest | 87.9% |
[28] | 2021 | 1, 3 | DenseNet DL | 96.25% |
[31] | 2021 | 1, 3 | Custom CNN | 94.5% |
[32] | 2024 | 1, 3 | GAN and VGG16 | 96.55% |
[35] | 2024 | 1, 3 | Inceptionv3 and VGG16 | 99% |
[36] | 2024 | 1, 3 | Hybrid VGG16 and VGG19 | 92% |
[33] | 2024 | 1, 3 | Parallel MobileNet | 96% |
[37] | 2024 | 1, 3 | ResNet50 | 97.19% |
[38] | 2024 | 1, 10 | SVM, Naïve Bayes, Random Forest | 94% |
[34] | 2024 | 1, 5 | Modified CNN | 92% |
[39] | 2024 | 1, 3 | Ensemble KNN, SVM | 87% |
Many researchers have addressed multiclass lung disease detection and have identified algorithms that enhance the classification task. These studies are summarized in Table 2. For example, a model called BDCNet, which uses a CNN enhancement algorithm, was recently created by the authors of the current study for COVID-19 detection [40]. We used three classes of lung diseases, including COVID-19, pneumonia, and LC. We adopted one DL model in the context of a traditional CNN and used evaluation metrics to observe model performance; this model was compared against several modern DL architectures, such as ResNet-50 and VGG-19. The BDCNet model outperformed the competing techniques, with an accuracy of 99.10%. Another study involved 15,043 X-ray images in total, including 6012 images of pneumonia, 180 images of COVID-19 cases, and 8851 images of healthy cases. Its experimental report showed an accuracy of 88.9% [41]. The architecture of the model is based on an 18-layer residual convolutional network, which performs categorization after a CNN model determines the most salient features in the initial step. An anomaly module was applied in the final stage to identify the model scores. A total of 1531 X-ray images were used in the experiment, 100 of which were positive for COVID-19 and the rest of which showed pneumonia infection. The model achieved a precision of 70.65% and a recall of 96% for COVID-19 classification.
The limited amount of data in this area has become a serious problem. GAN is a popular algorithm for generating synthetic data automatically. One study used it to augment a limited dataset [42]. The experiment involved 307 images from COVID-19 datasets. Several DL algorithms were involved in this study, including AlexNet, GoogLeNet, and ResNet. The experimental results demonstrate that the combination of GAN and ResNet performed better than the other DL models. This model achieved an accuracy of 81.48%, a precision of 88.10%, a recall of 81.48%, and an F1-score of 84.66%. Other researchers proposed a model aimed at the same problem of limited data. They adopted a patch-based CNN to extract information from the image datasets, and they used a statistical analysis biomarker, with ResNet18 as the classification engine. The training process involved a model pretrained on ImageNet. The model achieved 88.9% accuracy, 83.4% precision, 85.9% recall, and 96.4% specificity.
Detection error in COVID-19 classification is still a serious problem. One study tried to solve it with a hybrid approach, which has become popular even though it incurs very high computation costs; specifically, it used ensemble learning with InceptionResNet, ResNet50, and MobileNet [43]. This ensemble achieved 95.09% accuracy, 94.43% sensitivity, 98.31% precision, and 94.84% F1-score. Another study sought to enhance COVID-19 classification by highlighting spatial areas as regions of interest (ROI) in X-rays [27]. The classification algorithm in this research used attention with VGG16. The experiment achieved 79.56% accuracy. The drawback of research that uses a limited amount of data is that performance becomes inaccurate. In later work, researchers turned to GAN to produce more data. Researchers proposed models that use GAN to generate synthetic datasets [44, 45]. The majority of DL variants require huge datasets to learn, and GAN has proven effective in generating data. A GAN variant, Unet-GAN, which is based on the Unet model, was used. The researchers selected DenseNet as the classifier engine. The experimental results show an accuracy of 98.59%, a precision of 98.33%, a recall of 98.68%, a specificity of 99.30%, and an F1-score of 98.50%.
One of several strategies to increase the effectiveness of the classification task is preprocessing enhancement. Researchers [46] used a CNN modified with a residual block as the classifier engine. The experiment achieved a specificity of 93.33% and an F1-score of 93.07%. Another study adopted point-of-care ultrasound (POCUS) images [47], using a CNN as the classifier engine. The experiment covered three classes of lung diseases, including COVID-19. The results show that the model was highly effective, with 99.76% accuracy, 99.89% specificity, 99.87% sensitivity, and 99.75% F1-score. The drawback of this approach is that such datasets are not widely used in medical applications.
The majority of existing models use DL, which typically incurs high computation costs and long training times. One study [48] proposed a model that aimed to lower the computation cost. This model used contrast-limited adaptive histogram equalization (CLAHE) for contrast enhancement because the majority of X-ray images are of low quality. The authors combined the contrast-enhanced images with traditional machine learning algorithms, such as naïve Bayes, KNN, decision tree, and SVM. Naïve Bayes performed best in this study, with 99.01% accuracy, 97.77% precision, 100% recall, and 98.87% F1-score. A hybrid model using DL algorithms was presented in [49]. The authors used a serial hybrid model of ResNet, VGG, and fusion. Their research involved eight classes of lung diseases, and they adopted a GAN algorithm to handle imbalanced datasets. The experimental results show an accuracy of 92.88%, a sensitivity of 82.37%, a precision of 89.56%, and an F1-score of 85.82%. A hybrid multimodal dataset that included X-rays, CT scans, and sounds was used in [50]. A CNN was responsible for the classification task, and the experimental results showed an accuracy of 99.01%. Image data enhancement also plays an essential role in improving classification, for example, through region proposal networks that detect chest cavity regions of interest (ROI). One such model achieved 98.10% accuracy, 98.13% precision, 99.03% specificity, 98.30% recall, and 98.12% F1-score.
Table 2.
Previous work regarding DL models for COVID-19 detection using X-rays in multiclass datasets: (1) COVID-19; (2) pneumonia; (3) healthy/normal; (4) pneumonia bacteria; (5) pneumonia viral; (6) tuberculosis; (7) lung cancer; (8) no finding; (9) influenza; (10) SARS.
Ref. | Year | Class Dataset | Algorithm | Test |
---|---|---|---|---|
[40] | 2021 | 1, 2, 3, 7 | Modification layer VGG19 | 99.10% |
[42] | 2020 | 1, 3, 4, 5 | GAN, GoogLeNet, ResNet18, AlexNet | 80.56% |
[51] | 2020 | 1, 2, 3 | Xception+Resnet50 | 91.40% |
[52] | 2020 | 1, 3, 4, 5, 6 | Hybrid CNN | 88.90% |
[53] | 2020 | 1, 2, 3 | CNN and MobileNet | 96.78% |
[54] | 2020 | 1, 2, 4 | Inception V3 | 76.0% |
[55] | 2020 | 1, 2, 3 | Resnet and SVM | 95.33% |
[43] | 2021 | 1, 3, 4, 5 | Inception, ResNet, MobileNet | 94.84% |
[40] | 2021 | 1, 2, 3, 7 | UUNet++ | 95.24% |
[27] | 2021 | 1, 2, 8 | Attention and VGG16 | 87.49% |
[56] | 2020 | 1, 2, 3 | Enhanced CNN | 93.3% |
[44] | 2023 | 1, 2, 3 | GAN and CNN | 98.78% |
[45] | 2022 | 1, 2, 3 | GAN and CNN | 99.2% |
[46] | 2021 | 1, 2, 3, 9 | Annotation and CNN | 96.7% |
[47] | 2024 | 1, 2, 3 | Enhance Xception | 99.7% |
[48] | 2024 | 1, 2, 3 | CLAHE and Naïve Bayes, SVM | 98% |
[49] | 2024 | 1, 3, 4, 9, 10, 6 | Multimodal, VGG19, ResNet | 95.97% |
[50] | 2024 | 1, 2, 3, 4, 5, 6, 7, 8, 9 | Multimodal and CNN | 99.1% |
[57] | 2024 | 1, 3, 8 | MLP-BiLSTM | 98.10% |
Apostolopoulos et al. [53] proposed a COVID-19 detection model. They utilized a MobileNet-V2 model in addition to a CNN model on two datasets. The first dataset has a total of 1427 images, of which 224 are positive for COVID-19, 700 show bacterial infections, and the remaining images are normal. The second dataset contains a comparable number of COVID-19, pneumonia, and normal images. Compared to a traditional CNN model, the MobileNet-V2 model achieved 96.78% accuracy and 98.6% recall. Tsiknakis et al. [54] proposed a new model for the classification of COVID-19 by employing a pretrained model known as InceptionV3. This method was trained on 572 cases of bacterial and viral infections, 122 images of COVID-19, and 150 images of normal cases, and it attained 76% accuracy. In another study, 381 X-ray images were fed into nine distinct pretrained models. The authors used SVM and ResNet-50 to extract features, from which COVID-19 was then detected. The ResNet-50 model achieved 95.33% accuracy and 95.33% F1-score.
A CNN-based model called CovXNet was proposed for COVID-19 image classification [58]. Features were extracted using the CNN model, and depth-wise convolution was examined; this made automatic identification possible. The CovXNet model uses a gradient-based selective localization technique, which was created by utilizing X-ray images of both healthy individuals and individuals with pneumonia in order to identify COVID-19. For normal and COVID-19 cases, the model yielded an accuracy of 97.4%; for all other cases, including pneumonia and bacterial and viral infections, it achieved a detection accuracy of 90.2%. Horry et al. [59] used four well-known transfer learning classifiers to classify images of COVID-19 infection. A total of 60,798 images were used to validate the model: 115 images showed COVID-19-positive X-rays, 60,361 images were normal, and 322 images showed pneumonia infection. In the experiment, they compared the four types of transfer learning. According to the results, VGG-19 performed the best, with 81% accuracy.
A study by Kermany tried to increase accuracy in COVID-19 and pneumonia detection by using a pretrained Inception-V3 model. The objective of the study was to diagnose pneumonia based on X-ray images [60]. The experimental results demonstrated that the diagnostic accuracy of the model was 92.8%, and its recall was 93.2%. A study by Wang [61] analyzed localization methods for detecting pneumonia infection in X-ray images, aiming to discover the exact location of the disease. According to the results of their experiment, the area under the curve (AUC) for the pneumonia class was 63.3%. In addition, Rajpurkar [62] used a 121-layer CNN and obtained an AUC of 0.768. They named their model CheXNet, and it was analyzed and verified using the publicly accessible “chest X-ray” image collection, which has 112,120 frontal chest images. In their study on the classification of chest diseases, Malik et al. [50] used a CNN with two branches. Their model detected the pneumonia class with an AUC of 0.776. Disease diagnosis based on radiograph images was investigated, and imaging of the chest area was used to apply a segmentation process to a variety of bodily organs.
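Since several of the studies above are compared by AUC, a short sketch of how the area under the ROC curve can be computed may be useful. It uses the standard rank interpretation of AUC: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels and scores are invented for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class (e.g., pneumonia), 0 for the negative class.
    scores: predicted probability or confidence for the positive class.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))


print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so values such as 0.768 and 0.776 indicate moderate discrimination.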
A study by Wang examined the signs of pneumonia associated with the lethal coronavirus (COVID-19), which was first discovered in Wuhan in people with severe acute respiratory infections. The number of people infected with COVID-19 was very hard to obtain; however, these data are necessary for understanding how the disease might spread in the future [63]. During the pandemic, pathogen testing in the laboratory was a viable choice; nevertheless, the testing procedure was time-consuming and frequently produced false-negative (FN) results. Because of this, screening for the condition was carried out using various forms of medical imaging, such as CT scans and MRI scans, in conjunction with DL techniques. An AUC of 89.5% was reached during the preliminary screening, along with a specificity of 88% and a recall of 87%.
Stephen et al. used a CNN model to diagnose people with pneumonia based on chest X-ray images taken from a database. The proposed CNN models were trained from scratch to extract the significant and dominant features from X-ray images to detect pneumonia. This model tackled the problem of medical imaging for a large pneumonia dataset and produced significant results [64]. For classification, other methodologies rely on handcrafted and pretrained methods to achieve a remarkable level of performance. They trained their model with information from multichannel images to imitate the clinical monitoring procedure. A model proposed by Wang et al. [63] applied a regression mechanism with DL to automatically screen for pneumonia. First, they improved their ability to screen for pneumonia by extracting visual cues from multiple image modalities, which allowed them to better detect the illness. Second, they studied new structures of the chest by using a recurrent convolutional neural network (RCNN), which can automatically extract numerous image features simultaneously from multichannel image slices. Compared to the previous baseline, an RCNN with a ResNet structure, the suggested model improved accuracy by 2.3% and sensitivity by 3.1%.
Janizek et al. [
65] described the emergence of pneumonia due to several factors, such as fungi, viruses, and also bacteria. Taking chest X-rays of patients is a standard diagnostic procedure that is utilized in the evaluation and treatment of pneumonia. In detecting pneumonia from medical images, a certain type of CNN model was used, which is built on pretrained layers, such as Inception, VGG32, Exception, Efficient, ResNet, and InceptionRestnet, and it attained an extremely high level of accuracy. For chest X-rays and computed tomography (CT) of COVID-19 infection, Dansana et al. [
66] evaluated pretrained convolutional networks, for instance, VGG-19, Inception-v2, and DT. They reported that VGG-19 outperformed the other methods in classifying the radiographs and CT scan images of people infected with COVID-19, with a diagnosis accuracy of 91%. Moreover, Soin [
55] and Chouhan et al. [
67] introduced a novel DL framework for health professionals to use in diagnosing pneumonia. They applied several pretrained neural networks to chest X-rays to extract the dominant features and then evaluated the classification accuracy of these features. They then assembled a final network from the pretrained models, which allowed them to reach the highest diagnostic accuracy. In their research, Waheed et al. [
68] created a model based on an auxiliary classifier generative adversarial network (ACGAN), called COVIDGAN, to produce synthetic chest X-ray images. According to the experimental reports, their model improved accuracy over a traditional CNN in pneumonia classification tasks, and they claimed that it offers a more efficient and useful tool for supporting radiologists.
A classification ML model called COVIDNET was proposed by Wang et al. [
56]. Their research goal was to enhance the classification algorithm. The dataset categories included COVID-19-infected and normal lungs. Their experiment used a dataset of 13,975 images collected from 13,870 patient cases, with a CNN as the main ML algorithm for the classification task. COVIDNET was compared with previous work based on VGG-16 and ResNet-50 and outperformed both: COVIDNET achieved 93.3% accuracy, whereas VGG-16 achieved 83.30% and ResNet-50 achieved 90.6%. Furthermore, another model for COVID-19 classification using DL was proposed by Zhang et al. [
41]. They used a generative adversarial network (GAN) to enhance dataset preprocessing by generating augmented data. They trained the model for multiclass classification, covering COVID-19, normal, and pneumonia cases. The main ML algorithm was an enhanced CNN architecture. According to the evaluation report, their model achieved 98.78% accuracy.
A model based on pretrained VGG-16, ResNet, Inception, and MobileNet networks was proposed by Uddin et al. [
31], where only two classes were included: normal patients and COVID-19 patients. They implemented an enhanced CNN model as the main algorithm for the classification task. The experimental results demonstrated that their model achieved 97% accuracy. A model that uses a novel GAN to generate synthetic data for augmentation, combined with an enhanced CNN, was proposed by Gulakala et al. [
45]. They applied it to a multiclass task covering COVID-19, healthy lungs, and lungs with pneumonia. The experimental results show that the hybrid of the GAN and the novel CNN architecture achieved 99.2% accuracy. A model combining annotation with a CNN framework was introduced by Liang et al. [
46]. The annotation step is intended to increase detection performance. The annotation model was built for a specific application in this study, with a CNN handling the classification task. The evaluation report showed that this model achieved a 96.72% F1-score and 99.33% specificity.
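Because the studies above report different metrics (accuracy, F1-score, specificity), it may help to recall how the last two follow from a binary confusion matrix. The sketch below is illustrative only; the function name and the counts are hypothetical and are not taken from any cited study.

```python
def f1_and_specificity(tp, fp, tn, fn):
    """Return (F1-score, specificity) from binary confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    # Specificity is the true-negative rate: healthy cases correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return f1, specificity


# Hypothetical counts, chosen only for illustration.
f1, spec = f1_and_specificity(tp=90, fp=5, tn=95, fn=10)
print(f"F1 = {f1:.4f}, specificity = {spec:.4f}")
```

A high specificity can coexist with a lower F1-score when false negatives dominate, which is why the two figures are usually reported together.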
Another study employed a GAN to generate COVID-19 datasets automatically, integrating it with VGG16 for the classification task of detecting COVID-19 [
32]. Here, the GAN was responsible for handling the imbalance between normal and COVID-19 X-ray images. Using the GAN to generate synthetic data improved upon a previous deep CNN model. According to the experimental report, GAN and VGG16 achieved 96.55% accuracy. Similarly building on VGG16, another study proposed an algorithm that integrated two DL models [
35]. Their model consists of Inceptionv3 and VGG16. The objective of their study was to improve on previous work by addressing detection errors due to low-quality X-ray images, low classification accuracy, and overfitting. The study covers two classes, normal and COVID-19. The dataset contains 121 X-ray images of COVID-19 patients and 122 X-ray images of normal patients. According to the experimental report, the combination of Inceptionv3 and VGG16 achieved 98% accuracy and outperformed previous DL-based work, including VGG16, MobileNet, ResNet50, and DenseNet models.
A study to enhance the COVID-19 classification task was proposed by Prince et al. [
48]. Most machine learning methods require a huge amount of training data, and the training process is computationally expensive and time-consuming. To address this problem, they applied contrast enhancement using contrast-limited adaptive histogram equalization (CLAHE). After image processing with CLAHE, they transformed the images to the YCrCb color space. The classification characteristic vector that was utilized is a balanced regional binary spectrum based on reflection (Cr) and YCb. Several traditional classifiers were considered, including naïve Bayes, decision trees, logistic regression, nearest neighbor, and SVM. The proposed method was applied to a binary task with normal and COVID-19 cases. According to the experimental report, their model achieved 99% accuracy, with naïve Bayes performing best among the classifiers. The shortcoming of this research is that the experiment did not consider a multiclass classification task.
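The two preprocessing steps mentioned above can be sketched in plain Python. This is a simplified illustration, not the pipeline of the cited study: `clipped_equalize` applies contrast-limited equalization to a single tile only (full CLAHE splits the image into tiles and interpolates between them), `rgb_to_ycrcb` uses the common BT.601 full-range formulas, and all function names and parameter values are assumptions made for this example.

```python
def rgb_to_ycrcb(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cr, Cb), BT.601 full range."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = (r - y) * 0.713 + 128
    cb = (b - y) * 0.564 + 128

    def clamp(v):
        return max(0, min(255, int(round(v))))

    return clamp(y), clamp(cr), clamp(cb)


def clipped_equalize(pixels, clip_limit=4.0, levels=256):
    """Contrast-limited equalization of one tile (flat list of 0..255 ints)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Clip every bin at clip_limit times the mean bin height.
    limit = max(1, int(clip_limit * len(pixels) / levels))
    excess = sum(max(0, h - limit) for h in hist)
    hist = [min(h, limit) for h in hist]
    # Spread the clipped excess evenly over all bins, then accumulate.
    bonus = excess / levels
    cdf, total = [], 0.0
    for h in hist:
        total += h + bonus
        cdf.append(total)
    # Map each grey level through the normalized cumulative distribution.
    lut = [int(round((levels - 1) * c / cdf[-1])) for c in cdf]
    return [lut[p] for p in pixels]


# Hypothetical low-contrast tile: grey levels squeezed into 100..120.
tile = [100] * 60 + [110] * 30 + [120] * 10
enhanced = clipped_equalize(tile, clip_limit=40.0)
print(min(tile), max(tile), "->", min(enhanced), max(enhanced))
print(rgb_to_ycrcb(128, 128, 128))
```

A lower `clip_limit` flattens the histogram more aggressively before equalization, which limits noise amplification at the cost of a weaker contrast gain.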
Even though most of this work has succeeded in handling some of the obstacles during the COVID-19 pandemic, COVID-19 detection still faces several shortcomings. Detection error is still a relevant problem in this area of research. One study aimed to enhance the COVID-19 classification task by using hybrid deep learning, as proposed by Abdullah et al. [
36]. The proposed model used a hybrid of two deep learning networks, VGG16 and VGG19, running in parallel. The experiment considered two classes, normal and COVID-19. The dataset contains 2413 X-ray images of COVID-19 patients and 6807 images of normal patients. According to the experimental report, the proposed method achieved 92% accuracy. The proposed model was also compared with several state-of-the-art models, including VGG16, VGG19, EfficientNet, and ResNet. They claimed the hybrid model of VGG16 and VGG19 was superior to these deep learning models. However, hybridizing deep learning models requires millions of parameters, which raises the computation cost.
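To give a sense of the parameter cost of such hybrids, the following sketch counts the trainable parameters of the standard ImageNet configurations of VGG16 and VGG19 (224 × 224 RGB input, three fully connected layers, 1000 output classes). These are the reference architectures, assumed here for illustration; the cited hybrid may share or replace the classifier heads, so its actual count may differ.

```python
# Convolution widths per block for the standard VGG configurations;
# "M" marks a 2x2 max-pooling layer, which has no parameters.
VGG16 = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
         512, 512, 512, "M", 512, 512, 512, "M"]
VGG19 = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
         512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]


def count_vgg_params(cfg, in_channels=3, input_size=224,
                     fc_units=(4096, 4096, 1000)):
    """Count weights and biases of a VGG-style network."""
    total, channels, size = 0, in_channels, input_size
    for layer in cfg:
        if layer == "M":
            size //= 2  # pooling halves the spatial resolution
        else:
            # 3x3 kernel weights plus one bias per output channel
            total += (3 * 3 * channels + 1) * layer
            channels = layer
    features = channels * size * size  # flattened input to the first FC layer
    for units in fc_units:
        total += (features + 1) * units
        features = units
    return total


p16 = count_vgg_params(VGG16)
p19 = count_vgg_params(VGG19)
print(p16)        # 138357544
print(p19)        # 143667240
print(p16 + p19)  # 282024784
```

Under these assumptions, running both backbones in parallel with separate heads involves roughly 282 million parameters, most of them in the fully connected layers.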
The review above shows that existing deep learning approaches face the shortcomings of high detection error, high computation cost, image dataset bias, and a lack of image datasets. From a medical diagnosis point of view, detection errors cannot be tolerated. Hybrid deep learning models bring several problems, including high computation costs: deep learning requires many stacked layers, which is very costly from a computing perspective. Additionally, a very large number of layers increases the risk of overfitting. Therefore, reducing detection error remains an open avenue of research. In this study, we integrate two traditional deep learning models, an AE and a CNN, with the aim of enhancing the effectiveness of the classification task. To the best of our knowledge, AE and traditional CNNs involve fewer parameters; the number of parameters can be seen in
Table 2,
Table 3 and
Table 4. The adoption of AdaBoost is expected to increase classification performance in COVID-19 detection. Applying this to multiclass datasets is essential for observing how such models perform across several classes of lung disease.
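As background on the boosting idea, the following minimal sketch implements binary AdaBoost with threshold stumps on a one-dimensional feature. It is a textbook illustration under simplifying assumptions (two classes, labels +1 and -1, a toy dataset) and not the multiclass pipeline built on AE and CNN features in this study; all names and data are hypothetical.

```python
import math


def train_adaboost(xs, ys, rounds=3):
    """Binary AdaBoost with threshold stumps; labels must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    thresholds = sorted(set(xs))
    ensemble = []  # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        best = None
        # Pick the stump with the lowest weighted error.
        for t in thresholds:
            for polarity in (1, -1):
                preds = [polarity if x >= t else -polarity for x in xs]
                err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
                if best is None or err < best[0]:
                    best = (err, t, polarity, preds)
        err, t, polarity, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, polarity))
        # Up-weight misclassified samples, down-weight the rest, renormalize.
        weights = [w * math.exp(-alpha * y * p)
                   for w, y, p in zip(weights, ys, preds)]
        norm = sum(weights)
        weights = [w / norm for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (pol if x >= t else -pol) for a, t, pol in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single threshold can separate.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = train_adaboost(xs, ys, rounds=3)
print([predict(model, x) for x in xs])
```

Each round concentrates weight on the samples the previous stumps misclassified, so a weighted vote of weak learners can fit a pattern that none of them fits alone.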