A Histogram-Based Low-Complexity Approach for the Effective Detection of COVID-19 Disease from CT and X-ray Images

The global COVID-19 pandemic has certainly posed one of the most difficult challenges for researchers in the current century. The development of an automatic diagnostic tool, able to detect the disease in its early stage, could undoubtedly offer a great advantage in the battle against the pandemic. In this regard, most research efforts have been focused on the application of Deep Learning (DL) techniques to chest images, including traditional chest X-rays (CXRs) and Computed Tomography (CT) scans. Although these approaches have demonstrated their effectiveness in detecting the COVID-19 disease, they have a huge computational complexity and require large datasets for training. In addition, large collections of COVID-19 CXRs and CT scans may not be available to researchers. To this end, in this paper, we propose an approach based on the evaluation of the histogram of a common class of images that is considered as the target. A suitable inter-histogram distance measures how far the histogram evaluated on a test image is from this target histogram: if this distance is greater than a threshold, the test image is labeled as an anomaly, i.e., the scan belongs to a patient affected by the COVID-19 disease. Extensive experimental results and comparisons with some benchmark state-of-the-art methods support the effectiveness of the developed approach and demonstrate that, at least when the images of the considered datasets are homogeneous enough (i.e., only a few outliers are present), it is not really necessary to resort to complex-to-implement DL techniques in order to attain an effective detection of the COVID-19 disease. Despite the simplicity of the proposed approach, all the considered metrics (i.e., accuracy, precision, recall, and F-measure) attain a value of 1.0 on the selected datasets, a result comparable to the corresponding state-of-the-art DNN approaches, but with a remarkable computational simplicity.


Introduction
The novel Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is the cause of one of the worst pandemics of this century: the Coronavirus Disease 2019 (or, simply, COVID-19) [1]. COVID-19 is responsible for illness in the respiratory system, with common symptoms such as fever and cough, and can also lead to severe pneumonia, an infection that causes a severe inflammation in the lungs' air sacs, which are responsible for the oxygen exchange [2].
However, many studies have highlighted that the Novel COVID-19 Pneumonia (NCP) is different from other viral (Common) Pneumonia (CP) [3]. In this regard, some works have shown that cases of NCP tend to affect the entire lung, unlike common diseases that are limited to small regions [3,4]. Pneumonia caused by COVID-19 shows a typical hazy patch on the outer edges of the lungs.
Due to the seriousness of the possible consequences of COVID-19, early detection of the disease is vital [2]. Currently, COVID-19 screening is commonly based on the reverse transcription Polymerase Chain Reaction (rRT-PCR) test.
In this paper, we instead propose a histogram-based approach: the histograms of a number of reference scans belonging to a common class (the target) are averaged to produce a target histogram. This histogram is then used as a reference in the inference phase: for each unknown test scan, the related test histogram is obtained, and a suitable measurement of the distance between the test and target histograms is evaluated. If this distance is below a certain threshold, the test image is classified as belonging to the target class; otherwise, it is considered an anomaly and labeled as NCP.
Overall, the main contributions of this paper may be summarized as follows:
• We propose a histogram-based technique to automatically detect the COVID-19 disease from CXR and CT images. The histogram evaluation is a simple and very fast operation. In addition, very few images are needed to construct a target histogram, in contrast to DNNs, which have to use a huge number of scans to be trained.
• We investigate different inter-histogram distances to evaluate how far an unknown scan is from the target one. These distances are used to label the test images as normal (reference class) or anomalous (NCP), depending on whether they are smaller or greater than suitably set thresholds. Although all the proposed distances can be computed at the same computational cost, we expect some of them to work better in practice.
• We evaluate numerical results on benchmark CXR and CT datasets available in the open literature and compare the proposed approach to other state-of-the-art DNN-based architectures. We observed that the proposed approach, although very simple to implement, is able to obtain excellent results.
The main insight, deduced from the numerical results of the proposed approach, is that, at least when the images of the considered datasets are homogeneous enough (i.e., only a few outliers are present), it is not really necessary to resort to complex-to-implement DNNs in order to attain an effective detection of the COVID-19 disease. The rest of the paper is organized as follows. Section 2 presents the recent literature on the topic. Section 3 describes the proposed approach in terms of both the used methodology and the inter-histogram distances, and introduces the experimental setup. Section 4 presents the obtained numerical results and performance comparisons, and provides a discussion of the proposed idea. Finally, Section 5 concludes the paper and outlines some future research.

Related Work
During the last two decades, researchers have focused great attention on the automatic classification of medical images, and, to that end, a plethora of different techniques have been employed [29]. Although some interpretations are still made by visual inspection of the obtained scans, Computer-Aided Detection or Computer-Aided Diagnosis (CAD) facilities help radiologists detect lesions on chest X-rays [29]. However, in its early stage of usage, CAD could be defined solely as a computer analysis tool for image data. Over time, research efforts transformed CAD into an automatic diagnostic tool [30].
The majority of the approaches proposed in the literature, principally devoted to the diagnosis of lung cancer [31], exploit low-level handcrafted features [32]. These approaches are founded on texture-based features [33], edge-based features [34,35], graph mining [36], and color-based features [37–39]. Specifically, these works exploit different methodologies to segment the images into regions of interest and build up masking operations that label them as either infected or non-infected lungs.
In particular, like the proposed approach, the works in References [40–42] exploit histogram-based information to characterize and/or classify medical images. In the past, histograms, due to the simplicity of their computation and the accuracy of the produced results, have been widely used for object recognition [43–46], identification [47], and/or mining from images [27,28]. Among the applications to medical images, References [41,42] used a distance-based approach for detecting anomalies in images. Specifically, Reference [41] evaluates the histogram of a 3D volume of neuronal cells and fits it with a Gaussian function, in order to define different clusters in the images. Similar to the approach proposed in this paper, Reference [42] performs automatic segmentation of cell nuclei by exploiting an adaptive (local) threshold computed on the basis of the mean and standard deviation of the pixel gray-level values. However, unlike the proposed approach, these works do not directly employ histograms to detect images of infected patients.
Many works have focused on two important classes of medical images, i.e., CXRs and CT scans. These studies analyze such images to detect tuberculosis [48–50] and lung cancer [51–54]. However, CXRs and CT scans can also be useful to detect other important diseases, such as viral or bacterial pneumonia [55,56].
In this regard, CXRs and CT scans have been massively used for the detection of the COVID-19 disease. Although the pandemic is quite recent, several contributions are available in the literature. However, even if there are some sparse works about the manual screening of such images [57–61], almost all of the state-of-the-art approaches rely on Machine Learning (ML) [62] and Deep Learning (DL) [63,64] techniques. ML, in fact, could become a helpful and potentially powerful tool for large-scale COVID-19 screening [13,65], and the author of Reference [3] has recently hypothesized that DL techniques applied to CT scans could become the first alternative screening test to the rRT-PCR in the near future. Motivated by this expectation, in the last year, DL has been successfully applied to CXRs [15,20,66,67], CT scans [56,68–70], or both [18,71]. Since it is challenging to summarize all the available literature in a single paper, we point the reader to some useful reviews regarding the application of DL techniques to COVID-19 detection on CXRs [72], CT scans [73,74], and both [65,75]. A systematic review on the detection of COVID-19 using chest radiographs and CT scans, highlighting the strengths and weaknesses of several different approaches, can be found in Reference [76]. There are also some surveys of the related available datasets [77,78].
An overview of these works points out that DL approaches, mainly based on the supervised paradigm, can be divided into two main families: those based on segmentation and those that perform the classification task directly. The segmentation-based approaches are usually founded on U-Net-type architectures that identify the relevant parts of the CXRs/CT scans and perform classification focusing only on these sections [56,79–84]. The second family of approaches, instead, addresses the binary classification problem of COVID/Non-COVID images [20,69,70,85–88] and utilizes deep Convolutional Neural Networks (CNNs) and their variants, including VGG16, InceptionV3, ResNet, and DenseNet.
Finally, we point out that there are a few approaches that exploit the unsupervised paradigm, such as the use of deep auto-encoders [89–92]. Specifically, these works rely on deep or stacked autoencoders to automatically extract a set of meaningful features and then use softmax classifiers on top to distinguish images of infected lungs from healthy ones.
Among DL approaches, there exist some papers that exploit image histograms [93–99]. Specifically, the work in Reference [93], after pre-processing with a median filter, extracts meaningful features from cropped regions by using Histograms of Oriented Gradients (HOGs) and uses a final classification stage with a feed-forward neural network. Reference [94] introduces a pre-processing step based on the Contrast Limited Adaptive Histogram Equalization (CLAHE) idea to improve the contrast and generate an enhanced CXR image. The authors of Reference [95] propose a statistical histogram-based method for the pre-categorization of skin lesions: the image histogram is used to check the image contrast variations and classify the images into high- and low-contrast ones. Similarly, Reference [98] uses histogram thresholding techniques to perform image clustering and then proposes an image segmentation method for skin lesion delineation. The authors of Reference [96] propose ImHistNet, a deep neural network for end-to-end texture-based image classification based on a variant of the learnable histogram introduced in Reference [100] for pattern detection in pictures. The work in Reference [97], similar to our approach, extracts features from suitable inter-histogram distances but feeds these features into several (supervised) classifiers. Similarly, Reference [99] uses histograms to extract meaningful features, along with other statistical characterizations, and then performs a final classification through an artificial neural network. Moreover, the works in References [101,102] exploit histogram-based enhancement techniques and metaheuristic approaches in Magnetic Resonance Imaging (MRI). Specifically, Reference [101] introduces MedGA, an image enhancement method based on Genetic Algorithms (GAs) that improves the threshold selection for the segmentation of the region of interest.
MedGA uses a pre-processing stage for the enhancement of MR images based on some nonlinear histogram equalization techniques. Similarly, the work in Reference [102] proposes a Particle Swarm Optimization (PSO) texture-based histogram equalization technique to enhance the contrast of MRI brain images and find an optimal threshold for the segmentation. The obtained results surpass those of standard histogram equalization and those presented in Reference [101]. Finally, the recent work in Reference [103] also exploits DL segmentation-based histogram and threshold analysis in chest CT scans for differentiating healthy lungs from infected ones. Specifically, the proposed method automatically derives histogram-based imaging biomarkers from chest CT images and provides a Hounsfield Unit (HU) threshold analysis to assess differences in those biomarkers between lung-healthy individuals and those affected by atypical pneumonia.
Overall, although these last papers exploit the histogram in their approaches, they are substantially different from our work, since they exploit the histogram for: (i) enhancing the image quality; (ii) adjusting the contrast; and (iii) extracting features to be used by a final classifier. Moreover, only a couple of these papers focus on COVID-19 disease. Our approach, on the other hand, directly uses the histogram information to measure the distance from a target histogram and then decides whether or not an unknown CXR or CT image is related to a patient who is infected with COVID-19. This makes our approach an unsupervised method, which exhibits very limited implementation complexity.
A synoptic overview of the related work is provided in Table 1, which summarizes the main approaches pursued by the referenced papers.

Proposed Approach
The proposed approach is based on evaluating a suitable distance between the histogram of an unknown image and a target histogram, which is obtained by averaging a number of histograms belonging to a target class (i.e., common pneumonia or normal). Afterward, if this distance is greater than a threshold, the unknown image is labeled as an anomaly, i.e., it presents the COVID-19 disease; otherwise, it is labeled as the target class.

The Evaluation of the Target Histogram
Let X = {X_k}, with k = 1, . . . , N_T, be a set composed of N_T target images (i.e., belonging to the normal or CP class). Since both CT and CXR scans are grayscale images, the k-th input datum X_k is modeled as an M × N matrix, where M and N are the numbers of rows and columns, respectively. The pixel intensity values have been normalized into the range [0, 1].
For each of the N_T images, the related histogram h_k, with k = 1, . . . , N_T, is computed over N_bin bins. In this paper, we choose the (normalized) histogram as the statistical representation of the target, principally due to its simplicity and efficiency of computation. A histogram is an estimate of the probability distribution obtained by partitioning the range of values into a sequence of N_bin equally spaced intervals (called bins) and counting how many values fall into each interval. The histogram is then normalized so that it sums to one.
The number N_bin of bins used to build up the histogram should be chosen as a trade-off between the numerical stability of the distance measurement and its discriminating capability. Although this number turned out not to be critical for the performance of the proposed approach, we found that an optimized setting that guarantees non-empty bins is 50 bins, i.e., N_bin = 50.
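As an illustration, the per-image normalized histogram described above can be computed in a few lines of Python (a minimal sketch assuming NumPy; the function name and the synthetic image are our own and not taken from the paper):

```python
import numpy as np

def normalized_histogram(image, n_bin=50):
    """Normalized histogram of a grayscale image with intensities in [0, 1].

    The intensity range is partitioned into n_bin equally spaced bins; the
    resulting counts are normalized so that the histogram sums to one.
    """
    counts, _ = np.histogram(image, bins=n_bin, range=(0.0, 1.0))
    return counts / counts.sum()

# Synthetic 64x64 "scan" standing in for a normalized CT/CXR slice
rng = np.random.default_rng(0)
img = rng.random((64, 64))
h = normalized_histogram(img)
print(h.shape, float(h.sum()))
```

Here, N_bin = 50 matches the setting selected in the paper; any image already normalized to [0, 1] can be passed in place of the synthetic one.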
The target histogram h̄ is, hence, evaluated as the average of the h_k computed for each image, over the N_T histograms:

h̄ = (1/N_T) Σ_{k=1}^{N_T} h_k.    (1)

Alternatively, since the histogram evaluation is a nonlinear operation, we have also tested averaging all the N_T target images first and then evaluating the histogram of the resulting mean image. However, this procedure gives rise to inadequate results, as detailed in Section 4.5.
In this phase, we also evaluate the distance between the just-computed target histogram h̄ and the histogram h_k of every single k-th reference scan, by using suitable probability dissimilarity measurements (introduced in the next subsection). From these distances, we compute the mean d_m and the standard deviation σ_d, in order to conveniently set a suitable threshold TH used during the test phase to discriminate a reference scan from an anomaly. The idea is that the statistical distance of an anomalous scan should be greater than that of a reference one. Therefore, the threshold TH is set equal to the mean distance d_m plus a term depending on the standard deviation. Mathematically, we set the threshold TH as follows:

TH = d_m + η σ_d,    (2)

where η is a suitable constant. In the case of CT scans, we set η = 2, while, for CXRs, η = 1.4 provides good results.

During the inference phase, the test histogram of an unknown image is evaluated and compared to the target one, according to the same distance measurement used previously. If the distance between the test and target histograms rises beyond the threshold TH, the underlying image is marked as COVID (anomaly); otherwise, it is marked as the reference class. The proposed idea is shown in Figure 1, which depicts both the evaluation phase of the target histogram and the inference phase, where an unknown image goes through the deduction process.
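The whole target-evaluation and inference pipeline of Equations (1) and (2) can be sketched as follows (a hedged sketch using the cosine distance and synthetic stand-in data; the seed, image sizes, and function names are illustrative assumptions, not the paper's datasets):

```python
import numpy as np

def normalized_histogram(image, n_bin=50):
    counts, _ = np.histogram(image, bins=n_bin, range=(0.0, 1.0))
    return counts / counts.sum()

def cosine_distance(p, q):
    # 1 - cosine similarity; lies in [0, 1] for non-negative histograms
    return 1.0 - np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q))

# ----- Target-evaluation phase -----
rng = np.random.default_rng(1)
# Synthetic reference "scans": pixel intensities concentrated around 0.5
targets = [rng.normal(0.5, 0.1, (64, 64)).clip(0, 1) for _ in range(100)]
hists = np.array([normalized_histogram(x) for x in targets])
h_target = hists.mean(axis=0)            # Eq. (1): average of per-image histograms

dists = np.array([cosine_distance(h_target, hk) for hk in hists])
d_m, sigma_d = dists.mean(), dists.std()
eta = 2.0                                # value used for CT scans in the paper
TH = d_m + eta * sigma_d                 # Eq. (2)

# ----- Inference phase -----
def classify(image):
    d = cosine_distance(h_target, normalized_histogram(image))
    return "anomaly (NCP)" if d > TH else "reference"

# A uniform-noise "scan" has a histogram far from the concentrated target
print(classify(rng.random((64, 64))))
```

In the paper, the reference scans are CP or normal CT/CXR images and the anomalies are NCP scans; the synthetic distributions here only mimic that separation.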

The Considered Inter-Histogram Distances
Evaluation of the dissimilarity between the target histogram and that obtained from an unknown test image is a highly important issue.
In the literature, the similarity between two histograms is evaluated through several distance measurements over the underlying distributions [104,105]. For the aims of this paper, after denoting by p and q the two N_bin-dimensional vectors representing the involved histogram distributions, defined over the set of bin indices I = {1, . . . , N_bin}, we have selected the following four distances.

1. Cosine distance: it is formally defined as:

d_cos(p, q) = 1 − (Σ_{i∈I} p_i q_i) / (sqrt(Σ_{i∈I} p_i²) · sqrt(Σ_{i∈I} q_i²)),    (3)

and it normally ranges in the interval [0, 2]. However, since we are considering probabilities, each bin value is non-negative (i.e., p_i, q_i ≥ 0, for all i), so that the distance in (3) is limited to the interval [0, 1]. A distance equal to zero means that the two histograms are identical, while a distance equal to one denotes orthogonal histograms.

2. Kullback–Leibler (KL) divergence [106]: it is defined as:

d_KL(p ∥ q) = Σ_{i∈I} p_i log(p_i / q_i),    (4)

where p_i and q_i are the values of the histograms p and q in the i-th bin, respectively. By definition, the contribution of the i-th term in the summation in (4) is zero when p_i vanishes. The KL divergence is always non-negative, and it is zero when the two distributions are equal.

3. Bhattacharyya distance [107]: it is defined as:

d_B(p, q) = −ln(Σ_{i∈I} sqrt(p_i q_i)).    (5)

The Bhattacharyya distance, like the KL divergence, is always non-negative and vanishes when the two distributions are equal.

4. χ² distance: it is defined as:

d_χ²(p, q) = Σ_{i∈I} (p_i − q_i)² / (p_i + q_i).    (6)

The χ² distance is also a non-negative measure.
These distances, except the cosine one, have been normalized by the number N_bin of used bins, in order to render them independent of the N_bin setting. The Bhattacharyya distance is widely used in several applications, such as image processing, and, unlike the KL divergence, has the advantage of being insensitive to zeros in the distributions.
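Under the definitions above, the four distances can be sketched in NumPy as follows (a minimal sketch; the small EPS guard against empty bins and the function names are our own additions, and the χ² distance is written in its symmetric form):

```python
import numpy as np

EPS = 1e-12  # our own guard against log(0) / division by zero in empty bins

def cosine_distance(p, q):
    # Eq. (3): not normalized by the number of bins
    return 1.0 - np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q))

def kl_divergence(p, q, n_bin=50):
    # Eq. (4): terms with p_i = 0 contribute zero by definition
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / (q[mask] + EPS))) / n_bin

def bhattacharyya_distance(p, q, n_bin=50):
    # Eq. (5): insensitive to zeros of the distributions
    return -np.log(np.sum(np.sqrt(p * q)) + EPS) / n_bin

def chi2_distance(p, q, n_bin=50):
    # Eq. (6), in its symmetric form
    return np.sum((p - q) ** 2 / (p + q + EPS)) / n_bin

# Identical histograms are at (numerically) zero distance under every measure
p = np.full(50, 1.0 / 50)
print(cosine_distance(p, p), kl_divergence(p, p),
      bhattacharyya_distance(p, p), chi2_distance(p, p))
```

As in the paper, all distances except the cosine one are divided by N_bin, so that their values do not depend on the chosen number of bins.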
In the proposed approach, if the distance (chosen among the cosine, KL, Bhattacharyya, and χ² ones) between the target and a test histogram is above the threshold TH set in Equation (2), then the image under test is classified as COVID-19 (NCP); otherwise, it is classified as the target class (normal or CP, depending on the used training set).

The Considered Datasets
In this work, we use two kinds of chest medical images: the chest CT scans and the traditional CXR images.
Regarding the CT scans, we have selected the COVIDx CT-2A dataset (it can be downloaded from https://www.kaggle.com/hgunraj/covidxct, accessed on 20 May 2021), which has been constructed by collecting a number of open data sources [108,109] and comprises 194,922 CT slices from 3745 patients. The scans of the dataset belong to three classes: novel coronavirus pneumonia due to SARS-CoV-2 viral infection (NCP), common pneumonia (CP), and normal (N) controls (i.e., images from non-infected individuals). For the NCP and CP CT volumes, the slices marked as containing lung abnormalities were leveraged. Moreover, all the CT volumes contain the background, in order to avoid model biases. An example of a representative image for each class is reported in the first row of Figure 2. In order to stress the effectiveness of the proposed approach in view of the numerical comparisons presented later, for the CXRs, we selected a highly imbalanced dataset that contains few COVID images. This is representative of the actual situation of public datasets, and it has been selected "ad hoc" to check the effectiveness of the proposed approach with respect to the corresponding supervised state-of-the-art approaches based on deep learning, which usually need a large amount of well-balanced training data.
Specifically, for the CXRs, we have selected the COVID-XRay-5K dataset (it can be downloaded from https://github.com/shervinmin/DeepCovid, accessed on 20 May 2021), which has been constructed by collecting data from two publicly available sources [15]. The downloadable COVID-Xray-5k dataset is already split into training and test sets, containing 2084 training and 3100 test images. However, the training set that can actually be downloaded from the web URL is composed of only 580 Non-COVID and 84 COVID images, while the test set is composed of 3000 Non-COVID and 100 COVID images.
An example of a representative image for each class is reported in the second row of Figure 2.
Since the COVID-19 disease causes pneumonia, from both datasets, we have selected only pneumonia images as the Non-COVID target class. In addition, for validating the proposed approach, we also consider normal images as targets. Furthermore, from the CT dataset, we have randomly selected 3500 images from both the Pneumonia and COVID classes for evaluating the target and 500 images from both classes for testing. For the CXR dataset, we used all the 580 images available for the Non-COVID class and the 84 images available for the COVID class to evaluate the target, while we selected 100 pneumonia images and all the available 97 COVID images for the test set. A summary of the used datasets is provided in Table 2. The same target/test set sizes are used for the normal class. In the numerical results, we then select N_T target images from those available.

Built-Up Simulation Environment
All the simulations have been implemented in a Python environment using the open-source, end-to-end deep learning platform TensorFlow 2 through the Keras API. Simulations have been performed on a PC equipped with an Intel Core i7-4500U 2.4 GHz processor, 16 GB of RAM, and the Windows 10 operating system.

The Considered Performance Metrics
In a binary classification problem, we are interested in classifying items belonging to a positive class (P) versus a negative one (N). Therefore, there are four basic combinations of actual data category and assigned output category, namely: True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN). These four counts are usually arranged in a bi-dimensional matrix layout, called the Confusion Matrix (CM), which allows a simple visualization of the performance of a binary classification algorithm. Specifically, each column of the CM represents the instances of a predicted class, while each row represents the instances of an actual class. Moreover, combining the previous four numbers into some powerful indicators provides a valid tool to quantitatively measure the performance of a classification algorithm [110]. Among all the possible combinations, in this paper, we focus on the accuracy, precision, recall, and F-measure metrics, whose formal definitions are briefly reported in Table 3. Accuracy is the ratio of correctly identified instances to their total number. Precision is the ratio of relevant instances among the retrieved instances, while recall is the ratio of the total amount of relevant instances that were actually retrieved. Finally, precision and recall can be combined into a single measurement, called the F-measure, mathematically defined as their harmonic mean.
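The four metrics of Table 3 follow directly from the confusion-matrix counts; a minimal sketch (the function name and the example counts are illustrative, not results from the paper):

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F-measure from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also the TP rate used by the ROC curve
    f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f_measure

# Hypothetical example: 95 of 100 positives and 98 of 100 negatives correct
acc, prec, rec, f1 = binary_metrics(tp=95, fp=2, tn=98, fn=5)
print(acc, prec, rec, f1)
```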
The state-of-the-art approaches have also been evaluated by the Area Under the Receiver Operating Characteristic (ROC) Curve, abbreviated as AUC. The closer the AUC is to one, the better the classifier performance. The ROC curve, which is a graphical representation of the performance of a binary classifier, is obtained by plotting the TP rate on the y-axis against the FP rate on the x-axis for increasing values of the decision threshold. The TP rate (formally coincident with the recall metric) is the ratio of the number of TPs to the total number of positive examples, while the FP rate is the ratio of the number of FPs to the total number of negative examples. Their definitions can be found in the last two rows of Table 3.

Results and Discussion
In this section, we provide the numerical results obtained from the proposed approach on the two considered datasets. The performance has been evaluated by considering Non-COVID scans as the reference images for evaluating the target histograms. The Non-COVID class has been formed from images related to patients infected by common viral pneumonia (CP) or from healthy subjects (N). The performance comparisons with three state-of-the-art deep architectures are also presented. In the case of the CT dataset, the test set used in the experiments is composed of 500 CT scans belonging to the new coronavirus pneumonia (NCP) class and 500 CT scans belonging to the reference class (CP or N). In the case of the CXR dataset, instead, the test set is composed of 100 CXRs belonging to the NCP class and 97 CXRs belonging to the reference class (CP or N). To provide a clear graphical representation of the obtained distances, the test instances have been fed into the proposed algorithm in this order: first, the NCP scans, and then the reference CP or N ones.

Evaluation of the Proposed Approach
In the first set of experiments, we investigate the choice of the inter-histogram distances of Section 3.1.2 on both datasets. In this experiment, we use the CP class as the reference. The results have been obtained by selecting N_T = 500 reference images and N_bin = 50 histogram bins. Moreover, the η parameter in (2) has been set to η = 2 for the CT dataset and to η = 1.4 for the CXRs, respectively. Table 4 summarizes the results obtained by the proposed approach in terms of the accuracy, precision, recall, and F-measure metrics, introduced in Section 3.4 and defined in Table 3.
The results of Table 4 support the effectiveness of the proposed approach. In fact, all of the considered metrics reach high values and, interestingly enough, they reach the top result of 100% with some distance measurements. Table 4 also reports the mean d_m, the standard deviation σ_d, and the related threshold TH obtained from Equation (2). The second column of this table shows that the actual value of σ_d depends on the selected inter-histogram distance. From a careful examination of the rows of Table 4, we can draw three main considerations:
• although all the considered distances provide good results for the CT dataset, only the cosine distance is able to reach 100%;
• in the case of the CXR dataset, all the considered distances obtain an accuracy of 100%; and
• for both datasets, the cosine distance provides the lowest values of the standard deviation σ_d.
The results provided in Table 4 support the use of the cosine distance, which works to advantage compared to the other distance measurements. In fact, the cosine distance is able to capture the "spatial" information provided by histograms, since it effectively measures their similarity, regardless of the single bin values. Motivated by these considerations, in the following tests and comparisons, we use the cosine distance. In order to give visual insights into the results of Table 4 and justify the top 100% accuracy, Figure 3 shows the numerically evaluated profiles of the obtained cosine, KL, and Bhattacharyya distances. Since the χ² distance behaves like the latter two distances, it is not explicitly depicted in the paper. This figure clearly shows the effectiveness of the proposed approach. In fact, we can see that the NCP CT scans (the first 500 bars in the left panels of Figure 3) are much more distant from the target than the corresponding reference images (the last 500 bars). Similar conclusions apply to the CXRs (the first 100 bars and the last 97 bars in the right panels of Figure 3, respectively). The difference between the classes is about one order of magnitude for the cosine distance, while it is reduced for the KL divergence and the Bhattacharyya distance. These last two cases also show a larger variance of the obtained distances with respect to the cosine distance, as already highlighted by the second column of Table 4. However, Figure 3 also underlines that the differences between the considered distances are more limited in the CXR case than in the CT case.
As a final consideration on these first results, both Table 4 and Figure 3 confirm that a lower value of the standard deviation σ_d makes the performance more robust with respect to any outliers possibly present in the tested dataset.
In order to validate the proposed approach, we repeat the first experiment by using the normal class (N), related to scans of healthy subjects, as the reference. The results on both datasets are shown in Table 5, which shows that using the normal class as the reference provides metrics very similar to those of Table 4. Moreover, once again, with the proposed cosine distance, all the considered performance metrics are unit-valued.

Sensitivity of the Proposed Approach to the Parameter Settings
The next test concerns the setting of the number N_bin of bins used in the histogram evaluation. As already mentioned, this number should attain a suitable trade-off between the numerical stability of the distance measurement and its discriminating capability. Table 6 summarizes the results obtained using the cosine distance in Equation (3). In this test, we again use N_T = 500 target images, while the constant η in Equation (2) has been set to η = 2 for the CT dataset and to η = 1.4 for the CXRs, respectively. Table 6 shows that, for the CXR dataset, the performance of the proposed approach is quite independent of the choice of the number N_bin of histogram bins, since it remains quite stable. However, its effect is more noticeable for the CT dataset. Although the standard deviation σ_d decreases with a smaller number of bins, the best performance in terms of all the considered metrics is obtained with a number of bins between 50 and 100 in the case of the CT dataset. Hence, in order to obtain the best performance on both datasets and maintain a limited computational complexity, we definitively select N_bin = 50.
We also evaluate the robustness of the proposed approach with respect to the number N_T of target images. To this end, we evaluate the target histogram by averaging N_T histogram representations of the corresponding scans. Table 7 summarizes the results obtained using the cosine distance in (3). In this test, we again used N_bin = 50 histogram bins, while the threshold constant in (2) has been set to η = 2 for the CT dataset and to η = 1.4 for the CXR one, respectively. Table 7 shows that, especially for the CT dataset, the performance of the proposed approach is quite insensitive to the choice of the number N_T of target images over a large interval: the performance remains unchanged, while the standard deviation tends to decrease as the number of used images increases, since the target is obtained by averaging them. However, a gradual degradation in performance can be seen when the number N_T of target images is strongly reduced. On the basis of these results, we used 500 target images, in order to keep the standard deviation appropriately low.
The last test on the sensitivity of the proposed approach concerns the choice of the η parameter in Equation (2). To this end, we vary η over the interval [0, 5] with a step size of 0.1, for a total of 51 different values, and we graphically report the corresponding accuracy (in percentage). Figure 4 shows the results for both the considered datasets. These results have been evaluated using the cosine distance, N_T = 500 target images, and N_bin = 50 histogram bins. Figure 4 clearly shows that there is a range of η values that yields an accuracy of 100%, while the performance rapidly degrades for vanishing η (which means that the threshold in Equation (2) is set on the basis of the mean distance d_m only) or for excessively large η values. Figure 4 also shows that this range is not independent of the used dataset, since its extent for the CT dataset is larger than that for the CXR one. In any case, the range producing the top accuracy is sufficiently wide to confirm the robustness of the proposed approach. In order to guarantee the broadest margin, we set η at approximately the middle of the range, i.e., η = 2 for the CT dataset and η = 1.4 for the CXR dataset.
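The decision rule can be sketched as follows. Here we assume that the adaptive threshold in Equation (2) has the form d_m + η·σ_d, as suggested by the observation that a vanishing η reduces the threshold to the mean distance d_m alone; the function names are illustrative:

```python
import numpy as np

def cosine_distance(h1, h2):
    """Cosine distance between two histograms (Equation (3) style)."""
    return 1.0 - np.dot(h1, h2) / (np.linalg.norm(h1) * np.linalg.norm(h2))

def is_anomaly(h_test, h_target, d_m, sigma_d, eta=2.0):
    """Flag the test scan as anomalous (i.e., COVID-19) when its
    distance from the target histogram exceeds the adaptive
    threshold d_m + eta * sigma_d (assumed form of Equation (2))."""
    return cosine_distance(h_test, h_target) > d_m + eta * sigma_d
```

Here d_m and sigma_d would be the mean and standard deviation of the distances between the target histogram and the individual target-class histograms.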

Performance Comparison with an Alternative Histogram-Based Benchmark Approach
In order to numerically validate the chosen order of the mathematical operations used to compute the target histogram shown in Figure 1, we test the proposed approach against an alternative method of evaluation. Different from the proposed methodology of Figure 1, in this experiment we first average all the N_T = 500 target images and then evaluate the histogram of the resulting mean image. This alternative, sketched in Figure 5, could affect the performance, since the histogram operation is nonlinear, so averaging and histogram evaluation do not commute. The inference phase remains unchanged with respect to our original approach. Once again, we use N_bin = 50 histogram bins, and the results are obtained using the cosine distance in Equation (3). The numerical results obtained with this alternative are reported in Table 8 for both the considered datasets. An examination of the rows of this table clearly demonstrates that the alternative does not perform well, since all the reported metrics are poorer than the excellent results of the proposed scheme shown in Figure 3. Hence, we choose to: (i) first compute each single histogram; and then (ii) evaluate the target by averaging all of them, as shown in the top part of Figure 1.

Table 8. Numerical results obtained by using the target histogram evaluated as in Figure 5. The results have been obtained by using the cosine distance, N_T = 500 reference images, and N_bin = 50 histogram bins.
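The alternative (Figure 5) ordering, averaging the raw images first and only then taking a single histogram, can be sketched as follows (a minimal NumPy illustration with an illustrative function name):

```python
import numpy as np

def target_histogram_alt(images, n_bin=50):
    """Alternative (Figure 5) ordering: average the raw images
    first, then take one histogram of the mean image. Because the
    histogram operation is nonlinear, this generally differs from
    averaging the per-image histograms (Figure 1)."""
    mean_img = np.mean(np.stack(images).astype(float), axis=0)
    h, _ = np.histogram(mean_img, bins=n_bin, range=(0, 255))
    return h / h.sum()
```

A simple example of the non-equivalence: averaging an all-black and an all-white image concentrates all the histogram mass in the mid-gray bin, whereas averaging the two per-image histograms keeps the mass split between the extreme bins.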

In this subsection, we show some comparisons with other state-of-the-art benchmark solutions. Specifically, we consider some well-known feed-forward deep networks from the literature, i.e., AlexNet [111], GoogLeNet [112], and ResNet18 [113]. In this regard, we note that AlexNet is composed of a cascade of five convolutional layers followed by three (dense) fully connected layers, while GoogLeNet is more complex, since it is much deeper, being constructed by stacking three convolutional layers, nine inception modules, and two dense layers. An inception module is a particular layer obtained by concatenating several convolution operations with different filter sizes together with a max pooling operation. Finally, ResNet18 is composed of 18 layers ending with a dense layer with a softmax activation. Different from the previous networks, the central part of ResNet18 is a deep stack of residual units: each is composed of two convolutional layers (without a pooling layer), with Batch Normalization and ReLU activation, using 3 × 3 kernels and preserving the spatial dimensions.
Since these architectures are of the supervised type, they have been trained by using both the CP and NCP classes in the training set. The training has been performed by using the Adam algorithm [114] with its default values (β_1 = 0.9, β_2 = 0.999, and ε = 10^−7), a batch size N_b = 16, and a learning rate set to µ = 10^−6. The training has been executed for a total of 60 epochs.
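For concreteness, a single Adam update with the reported hyperparameters can be sketched in NumPy as follows (a one-parameter illustration of the optimizer rule, not the actual training code, which relies on standard deep learning frameworks):

```python
import numpy as np

def adam_step(theta, grad, m, v, t,
              lr=1e-6, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update (Kingma & Ba) with the hyperparameters
    reported in the text: beta_1 = 0.9, beta_2 = 0.999,
    eps = 1e-7, and learning rate mu = 1e-6."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

On the first step, the bias-corrected update has magnitude close to the learning rate itself, which is why a very small µ yields slow but stable convergence over the 60 training epochs.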
The results provided by these state-of-the-art supervised DNNs are shown in Table 9. These results point out that AlexNet performs worse than our approach, since it reaches only 71% and 93% accuracy for the CT and CXR datasets, respectively, and is also inferior in terms of the other performance metrics. The performance of GoogLeNet matches that of the proposed approach on the CT dataset but is worse on the CXR dataset. This lower performance in the case of CXR images is due to the limited number of training images used during the learning phase. However, we remark that the high performance of GoogLeNet on CT is obtained by a deep architecture that uses a huge number of free parameters compared to the proposed approach, as shown in Table 10, which reports, for completeness, the number of trainable parameters and the training time (in minutes) for all the considered architectures and datasets. In our approach, we use only one free parameter, namely, the adaptive threshold in Equation (2). Once again, this consideration, along with the trade-off shown in Table 10, supports the actual effectiveness of the proposed methodology.

Table 10. Computational complexity of the tested models ("M" stands for millions of parameters). The training time, in minutes, refers to datasets composed of images of size 300 × 200 pixels for the CT dataset and 320 × 390 pixels for the CXR dataset. The number of images is presented in Table 2.

Performance Robustness of the Considered Approaches
The aim of this last subsection is to evaluate the performance robustness of the proposed histogram-based approach, compared to the considered state-of-the-art DNN architectures.
In order to provide a visual interpretation of the results shown in Table 9, and to check how robust these solutions are with respect to the testing images, Figure 6 shows, for both the considered datasets, the confusion matrices obtained by the state-of-the-art (supervised) DNNs.
The confusion matrices in Figure 6 clearly show the poor robustness of the DNN approaches. They illustrate that the main confusion concerns the Non-COVID class, for which, especially in the case of CT scans, many instances are erroneously classified.
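For reference, the binary confusion matrices of Figure 6 can be computed with a few lines of code (a minimal sketch; the label convention is ours, with 0 denoting the Non-COVID class and 1 the COVID class):

```python
import numpy as np

def confusion_matrix(y_true, y_pred):
    """2x2 confusion matrix for binary labels:
    rows = true class, columns = predicted class
    (0 = Non-COVID, 1 = COVID)."""
    cm = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm
```

Large off-diagonal counts in the Non-COVID row correspond to the misclassified Non-COVID instances discussed above.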

Limitations of the Proposed Approach
In this subsection, we outline some limitations of the proposed approach with respect to the DL-based ones.
We underline that the DNN methods can produce a spectrum of posterior probabilities that may be, in turn, used as an index of the reliability of the taken decision. This is made possible by the prior learning performed by such approaches. In our approach, instead, the reliability is directly measured by the value of the inter-histogram distance used in the discrimination phase. Although this distance may be used as a (coarse) reliability index, it is nevertheless not as informative as the corresponding spectrum of posterior probabilities, nor is it related to that spectrum in a direct way.
As highlighted in the previous numerical results, we have identified some datasets on which the proposed approach is able to attain unit-valued accuracy. However, the performance obtained on other datasets could be less satisfactory, since the effectiveness of the histogram comparison depends on the quality of the scanned images and on the acquisition procedure. DNNs, due to their prior learning phase, are indeed expected to be less sensitive to the image characteristics.
To recap, DNN-based approaches are expected to be more flexible, powerful, and generalizable than the proposed approach, but at the cost of larger datasets, higher computational resources, and longer training times. Having stressed this, we point out that the goal of this paper is not the introduction of a general methodology but, rather, to show that, under some operating conditions, a simple and computationally efficient approach can produce good results.

Conclusions and Hints for Future Research
In this paper, we investigate whether a traditional histogram-based approach can be used for the detection of COVID-19 from CT and CXR scans of infected lungs. Specifically, we propose a histogram-based approach to detecting the new coronavirus pneumonia from CT and CXR scans. Since the number of these images is not high, we evaluate a target histogram on a reference class (i.e., normal or common pneumonia). A suitable inter-histogram distance is then used to evaluate how far this target histogram is from the corresponding histogram evaluated on an unknown test scan: if this distance is above a threshold, the test image is classified as an anomaly, i.e., as affected by the COVID-19 disease; otherwise, it is assigned to the target class. Numerical results, evaluated on two open-source benchmark datasets, demonstrate the effectiveness of the proposed approach, since it attains the top (unit) value of all the considered performance metrics (i.e., accuracy, precision, recall, and F-measure), comparable to the corresponding state-of-the-art DNN approaches, but with a limited computational complexity.
In a nutshell, the main lesson stemming from the reported performance comparisons is that, at least when the images in the considered datasets are homogeneous enough (i.e., few outliers are present, so that the standard deviations σ_d of the corresponding normalized inter-histogram distances are limited to at most 0.1), there is no real need to resort to complex-to-implement DNNs in order to attain a reliable detection of the COVID-19 disease. The proposed histogram-based approach attains, indeed, very good detection performance, comparable with that of DNNs, but at an (extremely) reduced implementation complexity and training time (see Table 10).
In future works, we aim at extending our methodology to types of medical images other than CT and CXR, and/or to different diseases. We expect, in fact, that automatic screening by means of pathological images can greatly benefit from the simplicity of our methodology, in terms of both the resulting accuracy and the prediction time. To this end, it could be interesting to investigate, as a second line of research, better-performing inter-histogram distances, such as the Earth Mover's Distance (EMD) [115], which measures how much work it would take to transform one histogram shape into another. A third line of future research can be addressed towards the use of Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs) for generating additional examples in the case of new variants of COVID-19, in order to quickly enable the automatic discrimination of the corresponding scans without waiting for the construction of a sufficiently large dataset. Finally, a fourth line of future research can focus on the implementation of the proposed methodology atop distributed Cloud/Fog Computing platforms [116,117], in order to produce fast and reliable clinical responses by exploiting the low-delay (and, possibly, multi-antenna empowered [118,119]) capability of the supporting broadband wireless access networks [120].

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following main abbreviations are used in this paper: