PneumoniaNet: Automated Detection and Classification of Pediatric Pneumonia Using Chest X-ray Images and CNN Approach

Abstract: Pneumonia is an inflammation of the lung parenchyma caused by a variety of infectious microorganisms and non-infective agents. All age groups can be affected; however, fragile groups are more susceptible than others. Radiological images such as Chest X-ray (CXR) images enable early detection and prompt action; a typical CXR of this disease is characterized by a radiopaque, seemingly solid appearance at the affected parts of the lung, where inflammatory exudate has replaced the air in the alveoli. Early and accurate detection of pneumonia is crucial to avoid fatal ramifications, particularly in children and seniors. In this paper, we propose a novel 50-layer Convolutional Neural Network (CNN)-based architecture that outperforms state-of-the-art models. The suggested framework is trained using 5852 CXR images and statistically tested using five-fold cross-validation. The model distinguishes between three classes, viral, bacterial, and normal, with 99.7% ± 0.2 accuracy, 99.74% ± 0.1 sensitivity, and a 0.9812 Area Under the Curve (AUC). The results are promising, and the new architecture can be used to recognize pneumonia early, cost-effectively, and with high accuracy, especially in remote areas that lack proper access to expert radiologists, thereby reducing pneumonia-caused mortality rates.


Introduction
Pneumonia is a leading cause of death in children under five years of age, taking a life every 39 seconds [1], accounting for 15% of all deaths in children under five years and being responsible for 808,694 deaths in 2017 [2]. It is an acute Lower Respiratory Tract (LRT) disease that creates inflammation in the lung parenchyma and can be caused by a variety of infective organisms such as bacteria, viruses, and fungi, as well as non-infective substances, for instance, sterile gastric contents. Individuals from any age group may be affected; however, in most cases, the cause is specific to a particular group. In children presenting with pneumonia, symptoms typically include fever, cough with or without sputum and with or without difficulty breathing, fatigue, and retraction of the chest during inhalation. Currently, the diagnostic criteria for pneumonia are based on clinical presentation, findings on Chest X-ray (CXR), culture and sensitivity from throat swabs or sputum sampling, and blood samples. This disease is preventable, especially through vaccination, and, since it is treatable as well, early diagnosis of pneumonia plays a significant role in preventing complications.
According to the World Health Organization (WHO), Acute Respiratory Infections (ARI) are the worst communicable disease amongst children, and an additional 18 million healthcare workers will be needed by 2030 to prevent, diagnose, and treat pneumonia [2]. In 2021, the Centers for Disease Control and Prevention (CDC) in the United States estimated that the number of emergency department visits with pneumonia as the primary diagnosis was 1.5 million, and the number of deaths was 43,881 [3].
Even though Chest X-rays (CXRs) have a weaker resolution than Magnetic Resonance Imaging (MRI) or Computerized Tomography (CT) scans, they can be used to assess multiple conditions such as cardiomegaly, pneumonia, pneumothorax, and atelectasis. Diagnosing pneumonia from radiographs is highly subjective and depends on the knowledge and expertise of the radiologist. It is easier to diagnose pneumonia using high-resolution MRI and CT scans; however, most radiologists use CXRs owing to the quicker turnaround and cost-effectiveness of the modality. On a typical radiograph, pneumonia is marked by radio-opacities or white spots in the airways, particularly in the alveoli, indicating the presence of inflammatory exudate. These radiological findings may challenge a novice radiologist, leading to false positives and false negatives, because other diseases mimic these signs. Figure 1 shows samples of the CXR images utilized in this study, classified as normal, bacterial pneumonia, and viral pneumonia from the pediatric group. Recently, Artificial Intelligence (AI) has been employed to automatically detect findings consistent with pneumonia in radiographic images. The availability of labelled CXR datasets, combined with massive and relatively cheap computing power, has made Deep Learning (DL) methods the best-known and most widely used tools for detecting and classifying medical images in general and pneumonia in particular. These systems are promising and can match human doctors' accuracy in detecting multiple diseases [4]. Instead of using pretrained models and transfer learning, this paper proposes a novel, simple Deep Learning structure that can detect and distinguish between three classes of pediatric pneumonia using Chest X-ray (CXR) images. The new architecture distinguishes between viral, bacterial, and normal with a very high accuracy of 99.7%.
The proposed architecture's performance far exceeds that of the state-of-the-art models mentioned in the literature.
The model uses a multi-layered Convolutional Neural Network (CNN) that automatically extracts features from the radiographic images and correlates them with each pneumonia category with high accuracy. Computer Aided Diagnostic (CAD) systems can be used to eliminate radiologist subjectivity when diagnosing pneumonia. They can effectively be used to confirm clinical findings, as well as in countries or remote areas that lack resources, particularly radiological expertise. Unlike previous research [5][6][7] that focused on transfer learning and traditional machine learning techniques, this study proposes a novel model that can differentiate between normal, bacterial pneumonia, and viral pneumonia. For this purpose, we used the well-known Guangzhou Women and Children's Medical Center (GWCMC) dataset together with familiar data augmentation techniques. The rest of this paper is organized as follows: Section 2 describes the relevant literature; Section 3 describes the proposed model in detail, the data used in this study, our proposed methods, and the training procedure; Section 4 presents the experimental results; Section 5 discusses the results of this study; Section 6 presents the conclusions, followed by the references.

Background and State-of-the-Art Research
A study of the literature reveals that many attempts have been made to use Artificial Intelligence (AI) and Deep Learning techniques to detect the presence or absence of pneumonia (a binary classification problem) [5][6][7][8][9]. However, fewer studies have applied traditional machine learning [10][11][12] or Deep Learning (DL), particularly Convolutional Neural Networks (CNNs), to classify pneumonia according to its etiological origin (bacterial or viral).
In 2018, Rajaraman et al. [13] evaluated the performance of different customized CNN architectures in identifying pneumonia and distinguishing between viral and bacterial types in 5232 pediatric chest radiographs. The authors evaluated the performance of Sequential CNN, Inception CNN, Residual CNN, and VGG16, and used a novel visualization technique to define the Region of Interest (ROI). Customized VGG16 outperformed the surveyed models and achieved an accuracy of 96.2% in detecting pneumonia as well as an accuracy of 91.8% in differentiating between viral and bacterial types.
Rahman et al. [14] attempted to automatically diagnose different classes of pneumonia using 5247 CXR images from the Kaggle pneumonia dataset. Using transfer learning, they analyzed the performance of four popular pretrained models, AlexNet, ResNet18, DenseNet201, and SqueezeNet, and found that DenseNet201 outperformed all other models, achieving an accuracy of 98% in detecting pneumonia and 93.3% in differentiating between the two etiological variants. With a similar objective, Polat et al. [15] used two different CNNs, a binary CNN and a triple CNN, to detect pneumonia in 5840 pediatric CXR images. The CNNs were trained using the Walsh function to properly extract features from digital chest radiographs; a minimum distance classifier was then used for classification. They conducted three different parametric studies and found that their proposed method achieved an accuracy of 100% in detecting pneumonia, 92% in distinguishing between two types of pneumonia, and 90% in distinguishing features pertaining to normal, bacterial, or viral pneumonia.
In 2021, Alqudah et al. [16] used a modified CNN framework to distinguish bacterial and viral pneumonia from normal CXRs in 5852 images. The framework consisted of two stages: first, a CNN was used for feature extraction; then, K-Nearest Neighbor (KNN) and Support Vector Machine (SVM) classifiers were utilized. They built two hybrid models, CNN-KNN and CNN-SVM, while also utilizing a 10-fold cross-validation methodology. The former hybrid model achieved an accuracy of 94.03%, and the latter achieved an accuracy of 93.9%. Another study [17] deployed a pretrained Xception model with data augmentation to solve the multiclass classification problem; the model classified 5840 images from the Mendeley website and achieved an accuracy of 82.69%.
Lastly, various studies employed deep learning with Chest X-ray (CXR) images to detect pneumonia. For example, Harsh Agrawal used pre-processing techniques as an initial step before applying the ResNet50 v2 deep learning structure, which led to improving the detection accuracy of pneumonia in CXR images to 96% [18]. Alquran et al. exploited texture features and traditional machine learning algorithms to classify Chest X-rays (CXR) into three classes, pneumonia regardless of its source (viral or bacterial), COVID-19, and normal chest images; they obtained a 93.1% accuracy across all classes [19]. Rajasenbagam et al. also utilized deep learning to detect pneumonia infection using Chest X-ray (CXR) images; their proposed CNN was trained on 12,000 CXR images and achieved a 99.34% accuracy on test images, outperforming existing CNNs such as AlexNet, VGG16Net, and InceptionNet [20].

Materials and Methods
The proposed recognition approach consists of four main stages: the first stage loads and resizes the whole dataset; the second stage splits the dataset into training, validation, and testing sub-datasets; the third stage trains and validates the PneumoniaNet CNN model using the training and validation datasets; and the final stage tests PneumoniaNet using the testing dataset. Figure 2 shows a flow diagram of the proposed methodology.

Pneumonia Dataset
This study utilized a pediatric CXR image dataset from the Guangzhou Women and Children's Medical Center (GWCMC), which was published online by Kermany et al. [21]. The dataset contains a total of 5852 Anterior-Posterior (AP) CXR images from pediatric patients between one and five years of age. Of the total, 4097 (70%) were used for training and 1755 (30%) for testing. Table 1 shows the distribution of the dataset images into normal, bacterial, and viral pneumonia.
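The paper does not publish code (the authors worked in MATLAB); as an illustrative Python sketch, where the function name, seed, and the choice of rounding up are our own assumptions, the 70/30 split can be reproduced as:

```python
import math
import random

def split_dataset(items, train_frac=0.70, seed=42):
    """Shuffle (image, label) pairs and split them into train/test subsets."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    # Rounding up reproduces the 4097/1755 counts reported for 5852 images.
    n_train = math.ceil(len(shuffled) * train_frac)
    return shuffled[:n_train], shuffled[n_train:]
```

For 5852 items this yields 4097 training and 1755 testing samples, matching the counts reported above.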

Data Pre-Processing and Augmentation
All the images in the dataset were utilized; they were preprocessed and resized from 1024 × 1024 pixels to 256 × 256 pixels as the proposed PneumoniaNet requires, and neither low-quality nor low-resolution images were excluded. To prevent overfitting, some noise was added to the dataset; it is well known that adding noise to the inputs of a neural network can, in some situations, significantly improve a model's generalization capability [22,23]. Moreover, adding noise acts as a form of dataset augmentation. Furthermore, other augmentation techniques were also used. Since not all augmentation approaches are suitable for X-ray images, we first resized the images to 256 × 256 and then applied five augmentation techniques: random horizontal and vertical flips (to handle pneumonia signs on either side of the chest X-ray), random horizontal and vertical shear (to obtain deeper relations among pixels), and random rotation [24,25].
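A minimal, dependency-light sketch of two of these augmentations (flips plus additive input noise; shear and rotation are omitted for brevity, and all parameter values here are illustrative assumptions, not the paper's settings):

```python
import numpy as np

def augment(image, rng):
    """Apply label-preserving transforms of the kind described above:
    random horizontal/vertical flips plus additive Gaussian input noise."""
    out = image.astype(np.float32)
    if rng.random() < 0.5:
        out = np.fliplr(out)  # pneumonia can appear on either side of the chest
    if rng.random() < 0.5:
        out = np.flipud(out)
    # Mild zero-mean noise acts as regularization / extra augmentation.
    out = out + rng.normal(0.0, 5.0, size=out.shape)
    return np.clip(out, 0.0, 255.0)
```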

Proposed Architecture (PneumoniaNet)
Deep learning is one of the most powerful and state-of-the-art technologies, inspired by the deep neuronal structure of the human brain [26] and characterized by numerous hidden layers that allow the extraction and abstraction of features at different levels. Modern deep learning traces back to a method proposed in 2006 [27], greedy layer-wise training, which trains the layers of a deep network one after the other. It is a form of unsupervised learning that uses unlabeled data, and, because it is effective and powerful, it has been chosen as a training algorithm for many deep learning networks. The most powerful, efficient, and widely used deep network is the CNN, which includes multiple hidden layers that perform convolution and subsampling to extract low- and high-level features from the input data, whether one-dimensional or two-dimensional [28]. Basically, a CNN consists of six types of layers (input, convolution, ReLU, fully connected, classification, and output), and the arrangement and ordering of these layers is crucial, since they must extract fine details from the input data [26,27]. In general, CNNs show high performance in various sectors, especially in the biomedical field and computer vision, as well as other disciplines [29,30].
In this study, the proposed 50-layer CNN architecture is utilized to classify the input images into three classes, as shown in Figure 3. This architecture decreases the number of layers compared to similar pretrained networks usually used with transfer learning techniques, i.e., 201 layers in DenseNet201, 101 layers in ResNet-101, and 144 layers in GoogLeNet, to only 50 layers. Reducing the number of layers shortens the time required for training and for computing class probabilities for new input images, in addition to reducing the computing resources required to run the system. Table 2 shows detailed information about the layers in the proposed CNN architecture. From Figure 3 and Table 2, we can see that the proposed model (PneumoniaNet) has three blocks for feature extraction. These blocks are the core of the model and are targeted towards extracting both deep and general features and combining them to obtain the most discriminant features.

The proposed PneumoniaNet model is unique because it combines deep features, extracted by three consecutive convolution layers separated by ReLU and batch normalization layers, with general features extracted by a single convolution layer followed by batch normalization. This is known as the x-block technique, and it allows the CNN model to use both global patterns and minor changes in the Chest X-ray (CXR) images. Moreover, PneumoniaNet improves the flow of information and gradients through the network, making the optimization of very deep networks tractable. It also strengthens feature propagation, encourages feature reuse and combination, and substantially reduces the number of parameters. The network weights and biases are initialized using Glorot initialization, which sets each weight to a small Gaussian value with zero mean. Finally, the network is trained end-to-end.
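The Glorot rule named above can be sketched as follows; the paper only names the initializer, so the normal-variant formula and the fan-in/fan-out accounting below are standard but assumed:

```python
import numpy as np

def glorot_normal(fan_in, fan_out, rng):
    """Glorot (Xavier) initialization: a zero-mean Gaussian whose variance
    2 / (fan_in + fan_out) keeps activation scales stable across layers."""
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_out, fan_in))

# Example: a 3x3 convolution with 32 input and 64 output channels has
# fan_in = 3*3*32 = 288 and fan_out = 3*3*64 = 576.
```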
The main difference between the proposed PneumoniaNet model and ResNet50 is that ResNet50 adds each block's input to its output (the residual blocks), so that layers learn with reference to their inputs rather than as unreferenced mappings; the main difference between PneumoniaNet and the VGG model is that VGG uses a deep sequential structure of very small 3 × 3 receptive filters.

K-Fold Cross-Validation
In general, evaluating any machine learning or deep learning model can be tricky due to variations in the size of the dataset used. Usually, machine learning engineers split the dataset into training and testing sets with some ratio, use the training set to train the model and the testing set to test it, and then evaluate the model's performance using the accuracy metric [31]. However, this method is not very reliable, as the accuracy obtained on one test set can be very different from that obtained on a different test set. K-fold Cross-Validation addresses this problem by dividing the data into folds and ensuring that each fold is used as a testing set at some point. Figure 4 shows a block diagram of K-fold cross-validation [32].
In K-fold cross-validation, a given dataset is split into K folds (groups), and each fold is used as a testing set at some point. For example, with K = 10 (10-fold cross-validation), the dataset is split into 10 folds. In the first iteration, the first fold is used to test the model and the rest are used to train it; in the second iteration, the second fold serves as the testing set while the rest serve as the training set. This process is repeated until each of the 10 folds has been used as the testing set [33].
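As an illustrative pure-Python sketch of this procedure (not the authors' MATLAB code), the fold bookkeeping looks like:

```python
def kfold_indices(n_samples, k):
    """Partition sample indices into k folds; each iteration yields one fold
    as the testing set and the remaining k-1 folds as the training set."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train_idx, test_idx
```

Every sample appears in exactly one testing fold, so the k accuracy scores together cover the whole dataset.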

Running Environment
All the experiments were conducted on a desktop computer with Microsoft Windows, running an Intel Core i7-6700/3.4 GHz processor, 16 GB of RAM, and a 500 GB hard disk drive (HDD), using MATLAB 2020b. To test the proposed model, we performed a five-fold methodology, and, for each fold's training, we used the Adam optimizer and the cross-entropy loss function [33,34]. The initial learning rate was 0.001, and, using this value, the proposed model was trained for 100 epochs per fold.
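For reference, the categorical cross-entropy loss minimized by Adam can be sketched in numpy as follows (an illustrative definition, not the authors' implementation):

```python
import numpy as np

def cross_entropy(probs, labels, eps=1e-12):
    """Mean categorical cross-entropy: the negative log of the probability
    assigned to the true class, averaged over the batch."""
    p_true = probs[np.arange(len(labels)), labels]
    return float(-np.log(p_true + eps).mean())  # eps guards against log(0)
```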

Results
By creating a 50-layer CNN, this study has tackled a harder problem than simply detecting the presence or absence of pneumonia (a binary classification problem) [35,36]. The suggested architecture was trained to discriminate between normal CXR images and those showing viral or bacterial pneumonia (a multiclass problem), and the model appears to have learned to solve this problem effectively and efficiently, succeeding in extracting the features that correlate with each specific class. Figure 5 plots the average accuracy and average loss against epochs; the best results were obtained by the proposed PneumoniaNet network both in terms of loss values and accuracy. A graphical representation of the classifier performance (Figure 6) shows the PneumoniaNet multiclass confusion matrix, where the rows represent the predictions and the columns represent the actual classes.
The figure shows the numbers of correctly and wrongly classified images, and it is clear that the proposed model managed to discriminate all three classes with 99.7% accuracy; moreover, the error represented by the False Positives and False Negatives is only about 0.3%. Aside from accuracy, the model sensitivity is 0.9974, specificity is 0.9985, and precision is 0.9970. Figure 7 presents the Receiver Operating Characteristic (ROC) curve for the proposed model, a very important tool for evaluating the performance of any classifier, showing the tradeoff between sensitivity and specificity. The curve is very close to the upper left corner, and the AUC is nearly one (0.9812), indicating high performance in discriminating between all three classes. The curve also shows that the proposed model's capability to differentiate normal, bacterial, and viral pneumonia is almost identical across classes. Class Activation Mapping (CAM) helps visualize the regions that the model used to extract the underlying features uniquely associated with each class, identifying the areas within an image that contributed to the identification of each class.
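To make the metric definitions concrete, per-class sensitivity, specificity, and precision can be derived from a confusion matrix laid out as in Figure 6 (rows = predictions, columns = actual classes); the counts in this sketch are illustrative, not the paper's:

```python
import numpy as np

def per_class_metrics(cm):
    """Derive sensitivity, specificity, and precision for each class from a
    square confusion matrix with rows = predicted and columns = actual."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fp = cm.sum(axis=1) - tp  # predicted as the class but actually another
    fn = cm.sum(axis=0) - tp  # actually the class but predicted as another
    tn = cm.sum() - tp - fp - fn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }
```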
Figure 8 presents the Class Activation Mapping (CAM) for all three classes under consideration; these images show the heat map superimposed on the original CXR, highlighting the discriminative regions of maximum activation. It also shows how the trained model localized the class-specific Region of Interest (ROI) corresponding to the appropriate pneumonia label to make predictions. From Figures 6 and 7, we can note that the proposed PneumoniaNet is one of the few models targeted at differentiating pneumonia types, viral or bacterial, with very high efficiency. The proposed PneumoniaNet may also encourage a shift towards three-class classification instead of simple two-class classification, which is comparatively easy given the visual difference between normal and pneumonia images, whereas differentiating between viral and bacterial pneumonia is a difficult task even for expert radiologists.

Discussion
Previous research [13,14,17] focused on transfer learning and pretrained models; only two groups [15,16] attempted to build a model from scratch or to modify an existing model for detecting pneumonia, as this study did. Most researchers [13-15] used the GWCMC dataset that this research used; however, Rahman et al. [14] used a Kaggle dataset, and Madhubala et al. [17] used images from Mendeley. Unlike this study, a small number of researchers [13,15] did not employ data augmentation. Table 3 compares the proposed model's performance with the most recent state-of-the-art. All the models in the table use CXR images, and three of them use the same dataset as this research. It is clear from the table that PneumoniaNet outperforms all other models on every important performance metric. The new model's accuracy is the highest at 99.72%, more than five percentage points above the second-best accuracy of 94.03% reported by Alqudah et al. in 2021 [16]. Because accuracy by itself can be misleading, other performance metrics of interest, namely sensitivity, specificity, precision, and AUC, are also reported, and they too are better than those described in the literature. Sensitivity is the most important metric in medical applications because it shows the percentage of correctly identified positives, and Table 3 shows that the proposed model's recall is 99.74%. Table 3 also makes clear that all previous work [13-17] achieved lower accuracy, sensitivity, precision, and F1 score than the suggested method; moreover, their dataset sizes are smaller than the one used in this paper. The proposed model's structure is simple, which means it converges quickly and does not require much computing power. On the other hand, the suggested model's generalization capability is not on par with that of pretrained models: unlike pretrained models that were trained on millions of images, the proposed architecture used only 5852 CXR images for training, which may not be sufficient to cover all image features inherent to pneumonia and to create a reliable CNN model with high accuracy.
Lastly, the authors of the previous research did not provide enough detail to perform a comprehensive assessment: they did not describe the approach they used to validate their data, and it is unclear whether they used cross-validation as this study did.
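The five-fold validation used in this study is commonly implemented as a stratified split, so that each fold preserves the class proportions of the full dataset. A minimal stand-in for such a splitter (the label counts below are illustrative, not the paper's dataset):

```python
import numpy as np

def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds while preserving class proportions,
    a minimal stand-in for library routines such as StratifiedKFold."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    folds = [[] for _ in range(k)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)   # all samples of this class
        rng.shuffle(idx)
        # deal this class's samples out across the k folds
        for i, chunk in enumerate(np.array_split(idx, k)):
            folds[i].extend(chunk.tolist())
    return [np.array(sorted(f)) for f in folds]

# toy 3-class labels: 10 normal, 10 bacterial, 10 viral
labels = [0] * 10 + [1] * 10 + [2] * 10
folds = stratified_folds(labels, k=5)
```

Each fold serves once as the test set while the remaining folds train the model; the reported mean ± standard deviation (e.g., 99.7% ± 0.2 accuracy) is then computed over the five per-fold scores.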
To study the proposed system's complexity, the average time required to train per fold and to generate a decision about a CXR image was calculated and is shown in Table 4. The table shows that the suggested architecture converges quickly and is very efficient for real-time classification of CXR images. The results of this research are nearly perfect, which makes the method more trustworthy and dependable. Finally, the proposed model will have an impact on medicine: medical staff in rural areas can detect pediatric pneumonia quickly, cost-effectively, and with high accuracy using the suggested model. Quick and accurate detection of pneumonia can mitigate its fatal complications, particularly in seniors and children. The proposed model can help alleviate the variability and subjectivity of CXR interpretation, and it can assist novice radiologists in remote areas that lack expert radiologists in making the right decision. The next step is to build a mobile application that airports can use to discriminate pneumonia using CXR images.
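Per-image decision time of the kind reported in Table 4 is typically measured by averaging wall-clock time over repeated prediction calls. A minimal sketch, where `toy_predict` is a hypothetical stand-in for the trained network:

```python
import time
import numpy as np

def time_inference(predict_fn, batch, n_runs=20):
    """Average wall-clock seconds per image over n_runs prediction calls,
    a simple way to estimate real-time classification cost."""
    start = time.perf_counter()
    for _ in range(n_runs):
        predict_fn(batch)
    elapsed = time.perf_counter() - start
    return elapsed / (n_runs * len(batch))

# hypothetical stand-in for a trained model: a fixed linear scorer
rng = np.random.default_rng(0)
w = rng.random((224 * 224, 3))
def toy_predict(batch):
    return batch.reshape(len(batch), -1) @ w   # per-class scores

images = rng.random((4, 224, 224))             # four dummy 224x224 "CXR" inputs
t_per_image = time_inference(toy_predict, images)
```

For a fair comparison, warm-up runs and consistent batch sizes matter, since the first call often pays one-time initialisation costs.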

Conclusions
Pneumonia is a preventable and treatable communicable disease and a leading cause of death, especially among children. Early detection helps achieve quicker access to proper treatment and reduces the ramifications of the disease. PneumoniaNet is a novel deep learning-based model that uses CXR images to distinguish normal radiographic images from those with features consistent with viral or bacterial pneumonia in the pediatric group aged one to five years, with 99.72% accuracy, 99.74% sensitivity, 99.85% specificity, 99.7% precision, and 0.9812 AUC. The proposed model outperforms state-of-the-art architectures on the existing performance metrics. The promising results of this new system will help radiologists in rural areas with limited resources detect and identify pneumonia quickly and cost-effectively, improving the health of the global population, especially children, by reducing pneumonia-related morbidity and mortality rates. In the future, we plan to build a more complex system capable of calculating the area of pneumonia and detecting its position accurately.

Data Availability Statement: The dataset analyzed during the current study was derived from the following public domain resource. Available online: https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia (accessed on 4 November 2020).