1. Introduction
The virus that causes Coronavirus Disease 2019 (COVID-19) was formally named Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) by the International Committee on Taxonomy of Viruses [1]. According to early studies, SARS-CoV-2 was originally transmitted from bats to humans through unknown intermediate animals [2,3]. Most patients exhibit symptoms such as fever, dry cough, headache, chest pain, and shortness of breath [4,5], although some patients remain asymptomatic [6,7]. For diagnosing COVID-19, reverse transcription polymerase chain reaction (RT-PCR) testing of a throat swab is the gold standard [8,9,10]. Typically, a chest X-ray (CXR) is essential for high-risk patients or Patients Under Investigation (PUI) to detect lung infection. The CXR examination is widely considered the gold standard for diagnosing pneumonia [11]. The severity of COVID-19 is exacerbated when patients have risk factors such as older age, high blood pressure, and chronic respiratory disease [12]. The causes of pneumonia include viruses, bacteria, and fungi [13].
Deep learning (DL) serves as an advanced tool for image analysis, object classification, and pattern recognition. DL models can identify and classify objects within images by automatically learning complex patterns and representations directly from raw input data, including patterns and features that are difficult to extract manually from large datasets. Consequently, these models can be leveraged to develop computer-aided diagnosis (CADx) systems that analyze medical information effectively. With the assistance of artificial intelligence algorithms, CADx systems provide physicians and other healthcare professionals with supporting information, and they reduce clinical workload by efficiently analyzing large amounts of medical data. However, CADx systems are designed to assist physicians rather than replace them. Final diagnostic decisions should always be made by qualified healthcare professionals, drawing on their clinical knowledge, patient history, and other relevant information.
According to previous studies, CADx may help clinicians identify COVID-19 pneumonia using CXR images [14,15], and CADx of medical images is becoming increasingly viable. State-of-the-art DL approaches have shown promising results in identifying CXR images of COVID-19 infections [16,17,18]. Chowdhury et al. [19] proposed a transfer learning approach for binary classification (normal and COVID-19) and three-class classification (normal, viral pneumonia, and COVID-19). The binary classification achieved 99.7% accuracy and sensitivity, while the three-class classification obtained 97.4% accuracy, sensitivity, and F-measure. Afshar et al. [20] proposed the COVID-CAPS modeling framework based on capsule networks, which achieved accuracy and sensitivity of 95.7% and 90.0%, respectively. Wang et al. [21] introduced COVID-Net, a deep convolutional neural network (CNN) design tailored to classify COVID-19, non-COVID-19, and healthy cases. COVID-Net achieved an accuracy of 93.3% and a sensitivity of 93.3%. Ahmad et al. [22] proposed a hybrid ensemble model for identifying COVID-19 infections in CXR images, in which pre-trained MobileNet and InceptionV3 models were fine-tuned. The authors performed four-fold cross-validation and reported accuracy, precision, and specificity of 94.2%, 89.9%, and 88.3%, respectively.
Furthermore, Khuzani, Heidari, and Shariati [23] used a dimensionality reduction method to generate a set of optimal features from CXR images and developed a COVID-classifier that distinguishes COVID-19 cases from non-COVID-19 cases. The proposed COVID-classifier achieved an accuracy of 94.05%. Yang et al. [24] used VGG16 to detect multiclass COVID-19 CXR and CT-scan images. The accuracy of the three-class classification of CXR images was 97%. Hong et al. [25] proposed a lightweight CNN, MGMADS-CNN, that classifies COVID-19 CXR images with an accuracy of 96.75%. Nishio et al. [26] developed an EfficientNet-based model to classify COVID-19 pneumonia, non-COVID-19 pneumonia, and healthy CXR images, with a three-class classification accuracy of 86.67%. According to Table 1, DL and pre-trained models dominate research on COVID-19 detection from CXR images. However, several studies used small and unbalanced datasets to develop models, which raises potential problems with generalization and bias. Small datasets may restrict a model’s ability to capture the full variability of cases, and unbalanced datasets can bias performance towards the majority class. Even with the prevalence of pre-trained models, researchers should pay attention to the size and balance of their datasets to ensure strong and unbiased model performance in detecting COVID-19.
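The effect of class imbalance on reported accuracy can be illustrated with a small worked example. The sketch below uses made-up label counts (90 healthy and 10 COVID-19 images, chosen only for illustration and not taken from any cited study) and a trivial predictor that always outputs the majority class. The helper names `accuracy` and `recall_per_class` are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_per_class(y_true, y_pred):
    """Sensitivity (recall) computed separately for every true class."""
    result = {}
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        result[c] = sum(y_pred[i] == c for i in idx) / len(idx)
    return result


# Made-up, deliberately unbalanced labels (illustration only).
y_true = ["healthy"] * 90 + ["covid"] * 10
# A trivial "model" that always predicts the majority class.
y_pred = ["healthy"] * 100

acc = accuracy(y_true, y_pred)                       # 0.9
recalls = recall_per_class(y_true, y_pred)           # covid: 0.0, healthy: 1.0
balanced_acc = sum(recalls.values()) / len(recalls)  # 0.5

print(acc, recalls, balanced_acc)
```

In this toy case the overall accuracy is 90% even though no COVID-19 image is detected, whereas the mean per-class sensitivity is only 50%. This is why accuracy reported on an unbalanced dataset should be read together with per-class metrics.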
Our retrospective study mainly aimed to classify COVID-19, non-COVID-19, and healthy CXR images using a stacked CNN model. Only a few studies have used multi-stage automated detection for classifying COVID-19 based on CXR images. Here, multi-stage automated classification (MSTAC) of COVID-19 CXR images was realized by stacking two CNN-based binary classification models. The first model differentiates healthy and unhealthy CXR images. All CXR images classified as unhealthy are then passed to the second model, which classifies them as COVID-19 or non-COVID-19. We compared MSTAC with a CNN model that performs three-class classification of COVID-19 pneumonia, non-COVID-19, and healthy CXR images.
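The two-stage decision logic described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `mstac_predict`, the 0.5 decision threshold, and the convention that each stage returns a single probability are assumptions, and the two stage models are replaced by stub functions so that the example runs without any trained network.

```python
from typing import Callable

# Assumption: each stage is any callable mapping an image to a probability in [0, 1].
Stage = Callable[[object], float]


def mstac_predict(image, stage1_unhealthy_prob: Stage,
                  stage2_covid_prob: Stage, threshold: float = 0.5) -> str:
    """Cascade two binary classifiers into a three-class decision."""
    # Stage 1: healthy vs. unhealthy.
    if stage1_unhealthy_prob(image) < threshold:
        return "healthy"
    # Stage 2 is evaluated only for images flagged as unhealthy.
    if stage2_covid_prob(image) >= threshold:
        return "COVID-19"
    return "non-COVID-19"


# Stub "models" (hypothetical): they read precomputed scores from a dict
# instead of running a CNN on pixel data.
def stage1_stub(image):
    return image["p_unhealthy"]


def stage2_stub(image):
    return image["p_covid"]


examples = [
    {"p_unhealthy": 0.1, "p_covid": 0.9},  # stage 1 says healthy; stage 2 is skipped
    {"p_unhealthy": 0.8, "p_covid": 0.7},
    {"p_unhealthy": 0.8, "p_covid": 0.2},
]
predictions = [mstac_predict(x, stage1_stub, stage2_stub) for x in examples]
print(predictions)  # ['healthy', 'COVID-19', 'non-COVID-19']
```

One consequence of this design is that an unhealthy image misclassified as healthy in the first stage never reaches the second stage, so the sensitivity of the first model bounds the sensitivity of the whole cascade for COVID-19 and non-COVID-19 cases.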
The following are the main contributions of our study that aim to address these issues:
- (1) We propose the Multi-Stage Automated Classification (MSTAC) for COVID-19 detection using two stacked CNN-based binary classification models and a large dataset of CXR images.
- (2) The first model for MSTAC (stage-1) is a DL model that classifies between healthy and unhealthy CXR images.
- (3) The second model for MSTAC (stage-2) is a DL model that classifies between COVID-19 and non-COVID-19 CXR images.
- (4) The proposed MSTAC model stacks stage-1 and stage-2. This stacking allows MSTAC to identify three classes of CXR images: COVID-19, non-COVID-19, and healthy.
- (5) MSTAC achieves high accuracy in identifying COVID-19 pneumonia cases from CXR images. This proposed model enhances the efficiency and accuracy of COVID-19 detection.
- (6) We also compare MSTAC with previous studies that classify COVID-19 pneumonia, non-COVID-19, and healthy cases. This comparison demonstrates the promising results of MSTAC in COVID-19 classification and its effectiveness for medical image analysis and diagnosis.
Overall, the MSTAC model aims to contribute to the ongoing effort against the COVID-19 pandemic.
The remainder of this study is organized as follows: The methodology and proposed models are discussed in Section 2. The experimental findings are presented in Section 3. Section 4 provides the discussion. Finally, Section 5 presents the conclusion.
4. Discussion
This retrospective study developed MSTAC, a model designed for the challenging task of three-class classification: distinguishing between COVID-19 pneumonia, non-COVID-19 pneumonia, and healthy cases through the analysis of CXR images. Stacking multiple binary models was a detailed process that demanded considerable effort in reconstructing the development set and training each model. The proposed CNN was efficient, converging in 20 epochs, each lasting approximately 18 s. The principal outcome of our investigation is the diagnostic accuracy achieved by MSTAC, which exceeded 97% across the three classes. In the field of deep learning, numerous studies have investigated a range of preprocessing techniques, feature extraction methodologies, and classification approaches, among which customized CNNs and ensemble learning have emerged as pivotal strategies. Given the gravity of the risks associated with misdiagnoses, our primary objective was to build a model that excels in accuracy.
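One plausible way to reconstruct the development set for the two binary models is to relabel the three-class annotations, as sketched below. This is an assumption made for illustration and is not the procedure documented by the authors: the function name `build_stage_sets` is hypothetical, and only the class size of 1050 images per class is taken from the dataset description in this study.

```python
from collections import Counter


def build_stage_sets(labels):
    """Derive two binary label sets from three-class labels.

    Returns (stage1, stage2), each a list of (image_index, binary_label).
    """
    # Stage 1 keeps every image and merges both disease classes into "unhealthy".
    stage1 = [(i, "healthy" if y == "healthy" else "unhealthy")
              for i, y in enumerate(labels)]
    # Stage 2 keeps only the unhealthy images with their original labels.
    stage2 = [(i, y) for i, y in enumerate(labels) if y != "healthy"]
    return stage1, stage2


# Three-class labels with 1050 images per class (3150 in total).
labels = ["COVID-19"] * 1050 + ["non-COVID-19"] * 1050 + ["healthy"] * 1050

stage1, stage2 = build_stage_sets(labels)
stage1_counts = Counter(y for _, y in stage1)
stage2_counts = Counter(y for _, y in stage2)
print(stage1_counts)
print(stage2_counts)
```

Under this naive relabeling, the stage-2 set stays balanced, but the stage-1 set contains twice as many unhealthy images as healthy ones. How this ratio was handled during training is not stated here, so any subsampling or class weighting would be a further assumption.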
Table 6 compares the performance of MSTAC with that reported in previous studies. With an accuracy exceeding 95%, MSTAC ranks among the top-performing models in the field. We observed that our approach of stacking multiple binary models achieved better performance than studies that used datasets with a similar number of CXR images. In terms of three-class classification, it outperforms the accuracy reported in several notable studies, including Wang et al. [21], Khuzani et al. [23], Yang et al. [24], and Nishio et al. [26]. Another strength of MSTAC is its competitive sensitivity of 97.30%, which indicates that it correctly identifies true positive cases across all three classes. This sensitivity exceeds that of several existing models, highlighting MSTAC’s robustness in detecting COVID-19, non-COVID-19 pneumonia, and healthy CXR images. Compared to the transfer learning techniques used by Yang et al. [24] and Nishio et al. [26], MSTAC demonstrates notably higher performance. Additionally, the robustness of the MSTAC model is strongly influenced by the dataset size. The use of a well-balanced dataset, comprising 1050 images for each class and totaling 3150 CXR images, provides a more comprehensive and diverse set for both training and evaluation. This stands in clear contrast to studies employing smaller datasets, such as that of Khuzani et al. [23], and supports MSTAC’s ability to handle diverse features when detecting COVID-19.
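A single sensitivity value for a three-class problem is commonly obtained by macro-averaging the per-class sensitivities. The sketch below shows that calculation on a made-up confusion matrix. The counts are illustrative only and are not the results of this study, and whether the reported sensitivity was computed by macro-averaging is an assumption.

```python
def per_class_sensitivity(cm):
    """cm[i][j] = number of images of true class i predicted as class j."""
    return [row[i] / sum(row) for i, row in enumerate(cm)]


classes = ["COVID-19", "non-COVID-19", "healthy"]

# Made-up confusion matrix with 50 test images per class (illustration only).
cm = [
    [48, 1, 1],   # true COVID-19
    [2, 47, 1],   # true non-COVID-19
    [1, 4, 45],   # true healthy
]

sens = per_class_sensitivity(cm)
macro_sensitivity = sum(sens) / len(sens)
accuracy = sum(cm[i][i] for i in range(len(cm))) / sum(sum(row) for row in cm)

for name, s in zip(classes, sens):
    print(f"{name}: {s:.2f}")
print(f"macro sensitivity: {macro_sensitivity:.4f}, accuracy: {accuracy:.4f}")
```

When every class has the same number of test images, as in this toy matrix, the macro-averaged sensitivity equals the overall accuracy. On a balanced dataset the two figures are therefore expected to be close, and the per-class values show which class lowers the average.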
The MSTAC architecture is intentionally designed for optimal performance in detecting COVID-19. The two-stage classification process first categorizes cases as healthy or unhealthy and then classifies the unhealthy cases as COVID-19 or non-COVID-19. The stacking approach integrates the strengths of the individual models, which allows for a comprehensive analysis of the three distinct classes within CXR images.
The clinical importance of MSTAC extends beyond its technical strengths. The model’s potential for high accuracy and efficiency promises to significantly reduce misdiagnoses, thereby minimizing risks for patients. The efficient diagnostic workflow streamlined by MSTAC shows potential for integration into current healthcare systems. This is especially important in environments where rapid and accurate COVID-19 detection is crucial.
While MSTAC has demonstrated considerable success, its accuracy for healthy cases is lower than its performance in identifying COVID-19 and non-COVID-19 cases, which is a limitation. Future work could improve the identification of healthy cases by examining more detailed features, using advanced training methods, and possibly including more data sources.
Our distinctive stacking model technique, which has played a crucial role in the success of MSTAC, explores a less-traveled path in the field of CXR image analysis for detecting COVID-19. The simplicity and efficiency of our approach allow it to utilize existing resources without the need for extensive processing. These attributes position MSTAC as a promising system in the realm of diagnostic tools for respiratory conditions.
5. Conclusions
The MSTAC model offers a promising method for accurately and efficiently detecting COVID-19 from CXR images. By using stacked CNN models and a two-stage classification strategy, MSTAC achieves high accuracy, sensitivity, precision, F1-score, and AUC, which suggests its potential as an effective tool for COVID-19 diagnosis. The main contribution of our study is a three-class classification system that can distinguish COVID-19, non-COVID-19, and healthy cases. Additionally, we validated MSTAC on publicly available datasets to ensure transparency and reproducibility. Comparisons with traditional CNN models underscore MSTAC’s superiority in COVID-19 detection.
In the future, we will implement image data augmentation techniques on CXR images to enhance classification accuracy. Additionally, our current study focused on classifying three categories. However, our future plans involve expanding the dataset and creating a model capable of classifying four classes. This will involve incorporating a new binary classification model for distinguishing between viral and bacterial pneumonia.
Furthermore, our future objectives include developing a classification model that utilizes a pretrained model to identify abnormal regions in CXR images. Lastly, we aim to create a prototype of a web-based CADx system. This system could bring the technique into clinical settings as a tool that supports medical staff in improving COVID-19 detection accuracy. At the core of this system is the diagnosis of CXR images uploaded by users, which MSTAC classifies as healthy, non-COVID-19 pneumonia, or COVID-19 cases.