Recommender System for the Efficient Treatment of COVID-19 Using a Convolutional Neural Network Model and Image Similarity

Background: Hospitals face a significant problem meeting patients' medical needs during epidemics, especially when the number of patients increases rapidly, as seen during the recent COVID-19 pandemic. This study designs a treatment recommender system (RS) for the efficient management of human capital and resources such as doctors, medicines, and equipment in hospitals. We hypothesize that a deep learning framework, when combined with search paradigms in an image framework, can make the RS very efficient. Methodology: This study uses a convolutional neural network (CNN) model for feature extraction from chest X-ray images and discovers the most similar patients: given an input image, it queries the hospital database for patients with similar chest X-ray images, using a similarity metric to compute the similarity between images. Results: This methodology recommends the doctors, medicines, and resources associated with similar past patients to a COVID-19 patient being admitted to the hospital. The performance of the proposed RS is verified with five different feature extraction CNN models and four similarity measures. The proposed RS with a ResNet-50 CNN feature extraction model and Maxwell–Boltzmann similarity is found to be a proper framework for treatment recommendation, with a mean average precision of more than 0.90 for threshold similarities in the range of 0.7 to 0.9 and an average highest cosine similarity of more than 0.95. Conclusions: Overall, an RS with a CNN model and image similarity is proven to be an efficient tool for the proper management of resources during the peak period of pandemics and can be adopted in clinical settings.


Introduction
The SARS-CoV-2 coronavirus was first discovered and reported in Wuhan, China, in 2019 and has spread globally, causing a health hazard [1][2][3]. On 30 January 2020, the World Health Organization labelled the outbreak a Public Health Emergency of International Concern, and on 11 March 2020, it was declared a pandemic. COVID-19 has varied effects on different people. The majority of infected patients experience mild to moderate symptoms and do not require hospitalization. Fever, exhaustion, cough, and a loss of taste or smell are all common COVID-19 symptoms [4]. Loss of smell, confusion, trouble breathing or shortness of breath, and chest discomfort are some of the major symptoms that lead to serious pneumonia in both lungs [1,[4][5][6]. COVID-19 pneumonia is a serious infection with a high mortality rate. The signs of a COVID-19 infection progressing into dangerous pneumonia include a fast pulse, dyspnea, confusion, rapid breathing, heavy sweating, and pulmonary embolism [7,8]. It induces serious lung inflammation, as seen in lung microscopy [9]. It puts strain on the cells and tissue that cover the lungs' air sacs. The oxygen for breathing is collected and supplied to the bloodstream through these sacs. Due to injury, tissue breaks off and blocks the lungs [10]. The sacs' walls might thicken, making breathing extremely difficult.
The most prevalent method of diagnosing individuals with respiratory disorders is chest radiography imaging [11][12][13]. At the beginning of COVID-19, a chest radiography image appeared normal, but it gradually altered in a fashion that may be associated with pneumonia or acute respiratory distress syndrome (ARDS) [11]. Figure 1 depicts the progression of chest X-ray images for a 45-year-old person infected with COVID-19 [11] (reproduced with permission). Roughly 15% of COVID-19 patients require hospitalization and oxygen therapy. Approximately 5% of people develop serious infections and require a ventilator.
During the peak period of infection transmission, having enough oxygen and ventilators is also a major challenge for hospitals [14,15]. As a result, hospitals and medical practitioners are under a lot of stress trying to deal with critical patients who have been admitted to hospitals [16]. They concentrate on providing good care to individuals who are hospitalized so that the mortality rate can be lowered and the patients can recover quickly. However, hospitals' capability to provide adequate treatments to hospitalized patients is sometimes limited by the availability of doctors and resources.


Background Literature
Many researchers have presented various models employing traditional machine learning approaches in the past for the identification of COVID-19 using radiography images [2,28]. Zimmerman et al. [29] reviewed many cardiovascular uses of machine learning algorithms, as well as their applications to COVID-19 diagnosis and therapy. The authors in refs. [30,31] proposed image analysis tools to classify lung infection in COVID-19 based on chest X-ray images and claimed that artificial intelligence (AI) methods have the potential to improve diagnostic efficiency and accuracy when reading portable chest X-rays. In ref. [19], the authors established an ensemble framework of five classifiers, namely K-nearest neighbors (KNN), naive Bayes, decision tree, support vector machine (SVM), and an artificial neural network, for the detection of COVID-19 using chest X-ray images. Ref. [32] describes a method for detecting SARS-CoV-2 precursor-miRNAs (pre-miRNAs) that aids in the identification of specific ribonucleic acids (RNAs). The method employs an artificial neural network and proposes a model with an estimated accuracy of 98.24%. The proposed method would be useful in identifying RNA target regions and improving recognition of the SARS-CoV-2 genome sequence in order to design oligonucleotide-based drugs against the virus's genetic structure.
Due to the unprecedented benefits of a deep CNN in image processing, it has been successfully utilized by various researchers for the identification and accurate diagnosis of COVID-19. In ref. [20], the authors proposed a deep learning (DL) model for the detection of COVID-19 by annotating computed tomography (CT) and X-ray chest images of patients. In ref. [33], various DL models such as ResNet-152, VGG-16, ResNet-50, and DenseNet-121 were applied to radiographic medical images for the identification of COVID-19 and were compared and analyzed. To overcome the lack of information and enhance the training time, the authors also applied transfer learning (TL) techniques to the proposed system. A voting-based approach using DL for the identification of COVID-19 was proposed in ref. [34]. The proposed method makes use of CT scan chest images of patients and utilizes a voting mechanism to classify a CT scan image of a new patient. Various DL algorithms for identifying COVID-19 infections from lung ultrasound imaging were reviewed and compared by the authors in ref. [35]. The proposed method adopts four pre-trained DL models, namely InceptionV3, VGG-19, Xception, and ResNet50, for the classification of a lung ultrasound image. In ref. [36], the authors compared the results of using CNNs pre-trained with ML-based classification algorithms. The major purpose of this research was to see how CNN-extracted features affect the construction of COVID-19 and non-COVID-19 classifiers. The usefulness of DL algorithms for the detection of COVID-19 using chest X-ray images is demonstrated in ref. [37]. The proposed approach was implemented using 15 different pre-trained CNN models, and VGG-19 showed a maximum classification accuracy of 89.3%. In ref. [38], an object detection approach using DL for the identification of COVID-19 in chest X-ray images was presented. The suggested method claims a sensitivity of 94.92% and a specificity of 92%.
Many kinds of research have also been conducted in the past using image segmentation, image regrouping, and other hybrid techniques for accurate diagnosis of COVID-19 [39]. In ref. [40], the authors proposed an innovative model using multiple segmentation methods on CT scan chest images to determine the area of pulmonary parenchyma by identifying pulmonary infiltrates (PIs) and ground-glass opacity (GGO). In ref. [41], the authors proposed a hybrid model for the detection of COVID-19 using feature extraction and image segmentation techniques to improve the classification accuracy in the detection of COVID-19. In ref. [42], a hybrid approach of feature extraction and CNN on chest X-ray images for the detection of COVID-19 using a histogram-oriented gradient (HOG) algorithm and watershed segmentation methodology was proposed. This proposed hybrid technique showed satisfactory results in the detection of COVID-19 with an accuracy of 99.49%, sensitivity of 93.65%, and specificity of 95.7%. In ref. [43], the authors came up with a new way to determine COVID-19 in images of chest X-rays using image segmentation and image regrouping. The proposed approach was found to outperform the existing models for the identification of COVID-19 in terms of classification accuracy with a lower amount of training data. In ref. [44], the transfer learning technique was used in conjunction with image augmentation to train and validate several pretrained deep Convolutional Neural Networks (CNNs). The networks were trained to classify two different schemes: (i) normal and COVID-19 pneumonia and (ii) normal, viral, and COVID-19 pneumonia with and without image augmentation. The classification accuracy, precision, sensitivity, and specificity for both schemes were 99.7%, 99.7%, 99.7%, and 99.55% and 97.9%, 97.95%, 97.9%, and 98.8%, respectively. The high accuracy of this computer-aided diagnostic tool can significantly improve the speed and accuracy of COVID-19 diagnosis. 
A systematic and unified approach for lung segmentation and COVID-19 localization with infection quantification from CXR images was proposed in ref. [45] for accurate COVID-19 diagnosis. The proposed method demonstrated exceptional COVID-19 detection performance, with sensitivity and specificity values exceeding 99%.
RS has also been useful in combating the COVID-19 pandemic by making recommendations such as medical therapies for self-care [46], wearable gadgets to prevent the COVID-19 outbreak [47], and unreported people to reduce infection rates by contact tracing [48], among others. An RS based on image content was proposed in ref. [25] that employed a random forest classifier to determine the product's class or category in the first phase and employed the JPEG coefficients measure to extract the feature vectors of the photos in the second phase to generate recommendations using feature vector similarity. A neural network-based framework for product selection based on a specific input query image was provided by ref. [26]. The suggested system employed a neural network to classify the supplied input query image, followed by another neural network that used the Jaccard similarity measure to find the most comparable product image to that input image. In ref. [27], the authors developed a two-stage DL framework using a neural network classifier and a ranking algorithm for recommending fashion images based on similar input images. Traditional RS frequently faces a significant challenge in learning relevant features of both users and images in big social networks with sparse relationships between users and images, as well as the widely different visual contents of images. Refs. [49][50][51] presented a strategy for solving this data sparsity problem in content-based and collaborative filtering RS by importing additional latent information to identify users' probable preferences.
The majority of previous research in RS based on computer vision was conducted for the e-commerce domain, with only a few works carried out for the healthcare domain, according to the literature. It was also revealed from the literature that image similarity is one of the successful techniques used for designing RS in computer vision. Furthermore, the efficacy of computer vision in RS in providing solutions for combating the COVID-19 pandemic has yet to be investigated. In this context, we suggest a health recommender system (HRS) that uses image similarity and collaborative filtering to provide treatment suggestions for COVID-19.


Recommender System
RS is a software program that aids in personalization for users and is based on the principle of information filtering [52]. RS has the following formal definition: let P represent the set of all users, and Q represent the set of all items that can be recommended. Let t be a utility function that measures the usefulness of item q to user p, i.e., t: P × Q → S, where S is an ordered set. The item q′ ∈ Q that maximizes the user's utility is then recommended for each user p ∈ P. More formally, ∀p ∈ P, q′p = arg max q∈Q t(p, q). RS can be broadly divided into four types, as shown in Figure 3. Content-based RS or cognitive RS provides recommendations based on a comparison of the items' content with a user profile [53,54]. Collaborative RS collects preferences or taste information from the collaborating users to produce automatic predictions regarding the user's interests [55,56]. Memory-based and model-based are the two different categories of collaborative RS. A memory-based collaborative RS makes use of all the data to provide recommendations based on user or item similarity, whereas a model-based collaborative filtering RS entails creating a model from all of the data in order to detect similarities between users or items for recommendation purposes. Hybrid RS combines two or more recommendation algorithms in different ways to take advantage of their different strengths [57,58].
A knowledge-based RS intelligently filters a group of targets to fulfil the user's preferences. It assists in overcoming the difficulties of both collaborative and content-based RSs [59,60].
An RS used in health applications to analyze patients' digital data and filter out the best information according to their profile is known as a health recommender system (HRS) [61,62]. HRS can be thought of as a decision-making system that plays a big role in society by advising patients on suitable disease treatments and doctors on good disease diagnoses.
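The arg-max formulation above can be illustrated with a toy utility function (the user and item names below are purely illustrative, not from the paper):

```python
# Toy utility function t(p, q) stored as a nested dict: users -> items -> utility.
utility = {
    "p1": {"q1": 0.2, "q2": 0.9, "q3": 0.5},
    "p2": {"q1": 0.7, "q2": 0.1, "q3": 0.6},
}

def recommend(user):
    """Return q' = arg max over q in Q of t(p, q) for the given user p."""
    items = utility[user]
    return max(items, key=items.get)

print(recommend("p1"))  # q2 (highest utility 0.9 for user p1)
print(recommend("p2"))  # q1 (highest utility 0.7 for user p2)
```

A full RS would estimate the unknown entries of this utility function (e.g., via collaborative filtering) before taking the arg max.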


Convolutional Neural Network
A CNN is a powerful DL tool for image processing and recognition [63]. The fundamental architecture of a CNN contains three distinct types of layers: convolutional, pooling, and fully connected, as shown in Figure 4.


Convolutional Layer
The basic layer of a CNN is the convolutional layer, which has the responsibility of extracting features and recognizing patterns in input images. The CNN extracts low-level and high-level features by passing images from the training dataset through a filter comprised of feature maps and kernels [23]. The convolutional layer's output can be expressed as Equation (1):

q_r^s(m, n) = Σ_x Σ_y p_c(x, y) · g_r^s(u, v), with u = m − x, v = n − y    (1)

where q_r^s(m, n) is the convolution layer output, and p_c(x, y) is an element of the input image tensor p_c, multiplied element-wise by the g_r^s(u, v) index of the sth convolutional kernel g_r^s of the rth layer.
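As an illustrative sketch (pure NumPy, toy values; not the paper's implementation), a single-channel "valid" convolution can be written as below. It is implemented in the cross-correlation form that deep learning frameworks actually use (the kernel is not flipped):

```python
import numpy as np

def conv2d(image, kernel):
    """Naive single-channel 'valid' convolution (cross-correlation form)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for m in range(out.shape[0]):
        for n in range(out.shape[1]):
            # Element-wise product of the image patch and the kernel, summed.
            out[m, n] = np.sum(image[m:m+kh, n:n+kw] * kernel)
    return out

image = np.array([[1., 2., 3.],
                  [4., 5., 6.],
                  [7., 8., 9.]])
kernel = np.ones((2, 2)) / 4.0   # 2x2 averaging kernel
print(conv2d(image, kernel))     # [[3. 4.] [6. 7.]]
```

Real CNN layers additionally sum over input channels and learn the kernel weights during training.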

Pooling Layer
The pooling layer, or the down-sampling layer, gathers comparable data in the vicinity of the feature layer and generates the dominating response inside this layer. The pooling process aids in the extraction of a group of features that are invariant to small translational shifts, and it can be defined as Equation (2):

Y_r^s = g_p(Q_r^s)    (2)

where, for the sth input feature map Q_r^s, Y_r^s conveys the pooled feature map of the rth layer and g_p(·) defines the type of pooling operation.
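A common choice for g_p is max pooling. The following NumPy sketch (toy values, illustrative only) applies a 2×2 max pool with stride 2:

```python
import numpy as np

def max_pool2d(fmap, size=2):
    """Max pooling with a square window and stride equal to the window size."""
    h, w = fmap.shape
    h2, w2 = h // size, w // size
    # Reshape into (h2, size, w2, size) blocks, then take the max per block.
    return fmap[:h2*size, :w2*size].reshape(h2, size, w2, size).max(axis=(1, 3))

fmap = np.array([[1, 3, 2, 0],
                 [4, 6, 5, 1],
                 [7, 2, 9, 8],
                 [0, 1, 3, 4]])
print(max_pool2d(fmap))  # [[6 5] [7 9]]
```

Each output element is the dominating response of its 2×2 neighborhood, halving the spatial resolution of the feature map.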

Fully Connected Layer
The fully connected layer is utilized for classification at the end of the CNN network. This layer takes the features that have been collected at different stages of the network as the input and then analyses and compares those features to the results from all the other layers.

Activation Function
The activation function modifies the weighted sum input of one node for a given layer and uses it to activate that node for a certain input. The activation function assists in the learning of feature patterns by acting as a decision function. ReLU is one of the most widely used activation functions due to its ability to handle the gradient problem in CNN models. Mathematically, the ReLU activation function can be defined as shown in Equation (3):

f(x) = max(0, x)    (3)
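Equation (3) applied element-wise can be sketched in one line of NumPy:

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: negative activations are clipped to zero."""
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5, 3.0])))  # negatives become 0
```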

Batch Normalization
Batch normalization is used for the normalization of the output of the preceding layers, which can assist with issues such as internal covariance shifts in feature maps. The batch normalization of a transformed feature map Q_r^s can be defined as shown in Equation (4):

B_r^s = (Q_r^s − µ_b) / √(σ_b² + ε)    (4)

where B_r^s denotes the normalized feature map, and Q_r^s represents the input feature map. The mean and the variance of the feature map are represented by µ_b and σ_b², respectively. ε is used to deal with the numerical instability caused by division by zero.
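A minimal NumPy sketch of Equation (4) (toy batch, illustrative only; trained batch-norm layers additionally learn a scale and shift):

```python
import numpy as np

def batch_norm(q, eps=1e-5):
    """Normalize a feature map to zero mean and unit variance, per Eq. (4)."""
    mu = q.mean()       # batch mean, mu_b
    var = q.var()       # batch variance, sigma_b^2
    return (q - mu) / np.sqrt(var + eps)

batch = np.array([[1.0, 2.0], [3.0, 4.0]])
normalized = batch_norm(batch)
print(normalized.mean(), normalized.std())  # mean ~ 0, std ~ 1
```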

Dropout and Flatten Layer
Dropout is a technique for adding regularization to a CNN network, which establishes generalization by omitting some connections at random. After removing some random connections, the network design with the lowest weight value is chosen as a close approximation of all the suggested networks. The Flatten layer transforms the pooled feature map into a one-dimensional array that is passed as a single feature vector to the next layer.
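The two operations can be sketched together in NumPy (illustrative only; in practice dropout is active during training and the surviving activations are rescaled):

```python
import numpy as np

rng = np.random.default_rng(0)
pooled = rng.random((4, 4))              # a pooled feature map
mask = rng.random(pooled.shape) > 0.5    # dropout: randomly zero ~50% of units
dropped = pooled * mask
flat = dropped.flatten()                 # flatten to a 1-D feature vector
print(flat.shape)                        # (16,)
```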

Feature Extraction Methods
CNNs are widely used in computer vision for feature extraction because they can discover relevant features from images without requiring human interaction and are computationally efficient. There are various CNN models for the feature extraction process. In this study, we tested the performance of our proposed system with two CNN architectures, namely the residual neural network (ResNet) and the visual geometry group (VGG) network. We used three different versions of ResNet, namely ResNet-50, ResNet-101, and ResNet-152, and two versions of VGG, namely VGG-16 and VGG-19. The detailed architectures of both ResNet and VGG are described in the following subsections.

ResNet
ResNet is an artificial neural network that can solve the problem of training very deep networks using residual blocks [64]. Several COVID-19-related studies have used ResNet or hybrid ResNet-based models [65][66][67]. The basic architecture of a ResNet network is shown in Figure 5.
A ResNet model with these residual blocks is shown in Figure 6. A direct connection in the ResNet model can skip some layers and is known as a "skip connection", which is the heart of the model. The model produces a different output due to this skip connection. When the connection is not skipped, the input X is multiplied by the weights of the following layer, and a bias term is added to this. Therefore, Equation (5) can be used to describe the model's output function:

H(X) = f(w·X + b) = f(X)    (5)

With the skip connection, the layer instead outputs H(X) = f(X) + X, so the block only needs to learn the residual mapping.
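The effect of the skip connection can be sketched in NumPy (toy weights, illustrative only):

```python
import numpy as np

def residual_block(x, weight, bias):
    """Residual output f(X) + X: the skip connection adds the input back."""
    f_x = np.maximum(0, weight @ x + bias)  # f(X) = ReLU(W X + b)
    return f_x + x                          # skip connection

x = np.array([1.0, -2.0, 3.0])
weight = np.eye(3) * 0.5
bias = np.zeros(3)
print(residual_block(x, weight, bias))  # f(X) = [0.5, 0, 1.5], plus X
```

Because the identity path is always present, gradients can flow directly through X even when f(X) saturates, which is what makes very deep ResNets trainable.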


Double or triple-layer skips with nonlinearities (ReLU) and batch normalization are used in most ResNet models [64]. An additional weight matrix can be utilized, and such models are termed "Highway Nets". We used three variations of ResNet, namely ResNet-50 [68], ResNet-101 [69], and ResNet-152 [70].

VGG Net
VGG Net is a traditional CNN model composed of blocks, each consisting of 2D convolution and max pooling layers. The basic architecture of a VGG Net is shown in Figure 7. It was created to improve the performance of a CNN model by increasing the depth of the model. VGG-16 and VGG-19, which comprise 16 and 19 weight layers, respectively, are the two versions available. The VGG Net architecture serves as the foundation for cutting-edge object recognition models. The VGG Net, which was created as a deep neural network (DNN), outperforms baselines on a variety of tasks and datasets. Small convolutional filters are used to build the VGG network. We used the two versions, VGG-16 [71] and VGG-19 [72], to test our proposed system.


Similarity Measures
The similarity measure is a means of determining how closely data samples are related. It plays an important role in computer vision by aiding in the comparison of two images by determining their feature vector similarity [64,74]. The proposed model uses the cosine similarity measure to compute the similarity between two feature vectors to find the most similar images to the input image, which are further utilized for the recommendation process.

Cosine Similarity Measure
The similarity between two vectors U and V using cosine similarity can be calculated as follows:

cos(U, V) = (U · V) / (‖U‖ ‖V‖)

where U and V represent the two feature vectors. For non-negative feature vectors, the cosine similarity is measured on a scale of 0 to 1, with 0 representing no similarity and 1 representing 100% similarity. All the other values in the range [0, 1] show the equivalent percentage of similarity.
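The cosine formula above can be computed directly in NumPy (toy vectors, illustrative only):

```python
import numpy as np

def cosine_similarity(u, v):
    """cos(U, V) = (U . V) / (||U|| * ||V||)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

u = np.array([1.0, 2.0, 3.0])
print(cosine_similarity(u, u))                          # identical vectors -> 1.0
print(cosine_similarity(u, np.array([2.0, 4.0, 6.0])))  # parallel vectors -> 1.0
print(round(cosine_similarity(u, np.array([3.0, 0.0, 1.0])), 3))  # ~0.507
```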

Euclidean Distance Similarity Measure
The Euclidean similarity between image vectors U and V is derived from their Euclidean distance:

d(U, V) = √( Σ_i (u_i − v_i)² )

The distance is then converted into a similarity score. The Euclidean similarity is also measured on a scale of 0 to 1, with 0 representing no similarity and 1 representing 100% similarity. All other values in the [0, 1] range reflect the equivalent percentage of similarity.
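A minimal sketch of this measure is below. Note that the 1/(1 + d) mapping from distance to a [0, 1] similarity is an assumption, one common convention, since the paper does not spell out its exact mapping:

```python
import numpy as np

def euclidean_similarity(u, v):
    """Euclidean distance mapped to [0, 1]; 1/(1+d) is an assumed convention."""
    d = np.linalg.norm(np.asarray(u) - np.asarray(v))
    return 1.0 / (1.0 + d)

print(euclidean_similarity([1.0, 2.0], [1.0, 2.0]))            # identical -> 1.0
print(round(euclidean_similarity([0.0, 0.0], [3.0, 4.0]), 3))  # d = 5 -> ~0.167
```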

Jaccard Similarity Measure
Jaccard similarity is a popular proximity measurement that is used to determine the similarity between two objects. The Jaccard similarity is calculated by dividing the number of observations in both sets by the number of observations in either set. It is also graded on a scale of 0 to 1, with 0 indicating no similarity and 1 indicating complete similarity. All other values in the [0, 1] range correspond to the equivalent percentage of similarity. The similarity between two vectors U and V using Jaccard similarity can be calculated as:

J(U, V) = |U ∩ V| / |U ∪ V|

Maxwell–Boltzmann Similarity Measure

The Maxwell–Boltzmann similarity is a popular similarity measure for document classification and clustering [75]. It is calculated using the overall distribution of feature values and the total number of nonzero features found in the documents. The difference D between the two documents is computed from the following quantities: tnz, the total number of nonzero attributes; tnz_u, the total number of features of U having nonzero items; tnz_v, the total number of features of V having nonzero items; λ, with 0 < λ < 1; k, which denotes features; and q, which denotes the number of present–absent pairs.
The Maxwell–Boltzmann similarity is then calculated from the value of D.
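The Jaccard computation above can be sketched for binary (present/absent) feature vectors (toy values, illustrative only):

```python
import numpy as np

def jaccard_similarity(u, v):
    """J(U, V) = |U intersect V| / |U union V| for binary feature vectors."""
    u, v = np.asarray(u, bool), np.asarray(v, bool)
    union = np.logical_or(u, v).sum()
    if union == 0:
        return 1.0  # convention: two all-zero vectors are treated as identical
    return float(np.logical_and(u, v).sum() / union)

print(round(jaccard_similarity([1, 1, 0, 1], [1, 0, 0, 1]), 3))  # 2/3 ~ 0.667
```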

Proposed Model
In general, computer vision-based RSs are based upon the assumption that a user submits or picks an image of a product, and the user is then provided with similar products/images [25]. The proposed method slightly deviates from this assumption as it extracts features from past COVID-19 patients' chest X-ray images and recommends some metadata information related to treatment alternatives based on these images.
The proposed framework is aimed at providing emergency solutions to hospitals during the COVID-19 pandemic using the information of past COVID-19 patients who have successfully recovered and been discharged from the hospital. Therefore, it also assumes that the hospital database used to implement the proposed RS contains the chest X-ray images of recovered COVID-19 patients along with the metadata (associated information), such as the doctors who investigated each patient and the medicines and resources (ICU, oxygen mask, and ventilator) provided to them. The architecture of the overall system for the proposed RS is shown in Figure 8 and consists of two major phases: fine-tuning CNN models for feature learning (Phase 1) and recommendation (Phase 2). Phase 1 of the system is an offline system, while Phase 2 is an online system.
The overall algorithm of the proposed framework is provided in Algorithm 1 and illustrates the basic workflow of the proposed system. The system takes the chest X-ray image of a new patient as the input and recommends doctors, medicines, and hospital resources as the output. It uses a CNN model to extract the feature vectors of the input chest X-ray image and of all the chest X-ray images of past COVID-19 patients stored in the hospital database. It then uses a similarity measure to find the COVID-19 patients most similar to the new patient and utilizes the metadata associated with them for the recommendations, as represented in the testing protocol (steps 5 to 11 of the algorithm). The pre-processing and training of the chest X-ray images are explained in the training protocol (steps 1 to 4 of the algorithm).

Algorithm 1: The Overall Algorithm of the Proposed System
Training Protocol for Feature Extraction Using Deep Learning
1. Obtain the chest X-ray images of COVID-19 patients as training data.
2. Randomly crop those chest X-ray images to 224 × 224 and randomly rotate them by 30°.
3. Input the transformed chest X-ray images obtained in step 2 into the CNN classifier for fine-tuning and begin the training of the model.
4. When training is completed, extract the desired output-layer features and save the model.
Testing Protocol Using a Similarity Measure
5. Obtain the chest X-ray images from the database of previous COVID-19 cases.
6. Resize the chest X-ray images from the COVID-19 database to 225 × 225 and perform a centre crop of 224 × 224.
7. Extract and store the feature vectors of the chest X-ray images from the database using the pre-trained CNN model.
8. Calculate the similarity of the query image feature vector with all the stored database feature vectors.
9. Find the top-k similar feature vectors in the database, where k is a positive integer.
10. Retrieve the chest X-ray images, with their records of metadata, from the database, corresponding to the top-k similar feature vectors obtained in step 9.
11. Recommend the doctors, medicines, and resources present in the retrieved metadata records to the new patient as the output.
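The testing protocol (steps 5 to 11) can be sketched as follows, assuming the feature vectors have already been extracted and using cosine similarity for ranking; the database structure, metadata fields, and function names are illustrative assumptions, not the paper's implementation:

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def recommend(query_vec, database, k=5):
    """Rank past patients by similarity to the query chest X-ray features
    and return the metadata (doctors, medicines, resources) of the top-k."""
    ranked = sorted(database, key=lambda rec: cosine(query_vec, rec[0]), reverse=True)
    return [meta for _, meta in ranked[:k]]

# Hypothetical hospital database: (feature vector, metadata) per recovered patient.
db = [
    ([0.9, 0.1], {"doctor": "D1", "medicine": "M1", "resource": "oxygen mask"}),
    ([0.1, 0.9], {"doctor": "D2", "medicine": "M2", "resource": "ICU"}),
]
print(recommend([0.8, 0.2], db, k=1))  # metadata of the most similar patient (D1)
```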

(A) Phase 1 (Offline System): Fine Tuning for Feature Learning
In phase 1, the proposed method learns to extract infection features from COVID-19 patients' chest X-ray images based on image characteristics. A CNN model is trained to learn these features by classifying these chest X-ray images into respective lung condition categories (one of which should be COVID-19) as present in the training data. The architecture of a CNN model consists of two components: (1) feature vector extractor and (2) classifier [24,76], as shown in Figure 9.
Several convolution layers are followed by max pooling and an activation function in the feature extraction process. Typically, the classifier is made up of fully connected layers. The proposed approach uses a fine-tuning method, which is commonly used in radiology research. It involves not only replacing the pre-trained model's fully connected layers with a fresh set to retrain them on the given dataset but also using backpropagation to fine-tune some of the layers in the pre-trained convolutional base. The binary cross-entropy loss function was used for optimization in the training of the CNN models [67,77]. The binary cross-entropy loss can be defined by the following equation:

L = −(1/N) Σ_{i=1}^{N} [y_i log(p_i) + (1 − y_i) log(1 − p_i)]

where N is the number of training samples, y_i is the ground-truth label, and p_i is the predicted probability. During training, the ReLU activation function and its variants are also used because they can solve the problem of vanishing gradients, which often occurs in CNN models. Before training the model for feature learning, the proposed method applies specific image transformations or augmentations, as shown in phase 1 [78,79]. This allows the model to be more adaptable to the large variation in the region of interest (the lungs) within the image, with less emphasis on its location, orientation, and size. Models trained with data transformations are more likely to improve a CNN's performance on image datasets and make it more general. In this phase, any efficient CNN model, such as VGG or ResNet, may be trained.
The trained model weights are then saved, and the fine-tuned convolutional base is then employed in phase 2 to extract features. Steps 1 to 4 of the proposed algorithm shown in Figure 9 describe phase 1.
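The binary cross-entropy loss used for fine-tuning can be written out numerically as follows (a minimal sketch of the loss itself, not the training code):

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy: -(1/N) * sum(y*log(p) + (1-y)*log(1-p)).

    `eps` clips predictions away from 0 and 1 to keep the logs finite."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return -total / len(y_true)

print(binary_cross_entropy([1, 0], [0.9, 0.1]))  # confident, correct predictions → low loss (≈ 0.105)
```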

(B) Phase 2 (Online System): Recommendation
Phase 2 of the proposed approach is used for providing recommendations based on the features obtained from X-ray images using the fine-tuned convolutional base from phase 1, which acts as a feature extractor in phase 2. The metadata associated with each image in the database is utilized to provide recommendations such as doctors, medicines, and resources. For recommendation, the system utilizes similar patients from the database who have the same type of infection in the chest due to COVID-19 as that of the patient corresponding to the input query chest X-ray image. In doctor recommendation, it recommends doctors who have already successfully treated similar patients to the patient corresponding to the input query chest X-ray image. In medicine recommendation, the system recommends medicines that have already been consumed by previously recovered patients who had similar chest infections. For resource recommendations, it recommends emergency resources such as oxygen masks, ventilators, and ICU if required by the patient in the future so that the hospital can arrange those resources beforehand. Phase 2 of the proposed method is again divided into two sub-phases: (1) feature vector extraction and (2) similarity-based retrieval.

Feature Vector Extraction
The elements or patterns of an object in an image that help to identify it are called "features". Feature extraction is a step in the dimensionality reduction process that divides and reduces a large collection of raw data into smaller groupings. It helps to extract the most useful information from higher-dimensional data by choosing and merging variables into features, hence minimizing the amount of data. These features are easy to use while still describing the actual dataset uniquely and accurately.
CNNs excel in extracting complex features in the form of feature vectors that depict the image in great detail, learning task-specific features while being extremely efficient [80]. Therefore, the proposed method uses CNN-based feature extractors obtained from phase 1 to extract features of the infection present inside the chest X-ray images of COVID-19 patients.
Feature vector extraction is applied both to the input query image and to the chest X-ray images of COVID-19 patients present in the hospital database. Steps 5 to 10 of the proposed algorithm describe the feature vector extraction process. The extracted feature vectors are then exploited for similarity-based retrieval.
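The step from a convolutional feature map to a compact feature vector can be illustrated with global average pooling, the pooling typically used by ResNet-style backbones before the classifier head (a simplified, framework-free sketch; the actual extractor is the fine-tuned CNN base from phase 1):

```python
def global_average_pool(feature_map):
    """Collapse a C x H x W feature map (nested lists) into a C-dimensional
    feature vector by averaging each channel's spatial activations."""
    vector = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        vector.append(sum(values) / len(values))
    return vector

# Two 2x2 channels collapse to a 2-dimensional feature vector.
fmap = [[[1.0, 2.0], [3.0, 4.0]],
        [[0.0, 0.0], [0.0, 4.0]]]
print(global_average_pool(fmap))  # → [2.5, 1.0]
```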

Similarity-Based Retrieval
The extracted feature vectors of the input query image and the chest X-ray images of recovered COVID-19 patients present in the database obtained from the previous step are further utilized to retrieve similar images for a given input query image. The system utilizes the cosine similarity measure to find the top-k similar patients for a given query patient, where k is a positive integer. The system further utilizes those top-k similar patients to provide various recommendations such as doctors, medicines, and resources to the patient corresponding to the given input chest X-ray image. The doctors, medicines, and resources allotted to those similar patients are recommended to the query patient. Steps 11 to 14 summarize the workflow of the proposed system (Figure 10). In the results section, we present the results for the two hypotheses of our proposed system.


Experimental Protocol
To verify the efficacy of the proposed approach, the experimental environment, dataset description, pre-processing of the datasets, and the related results of the experiments are discussed in this section.


Experimental Environment
The details of the computing resources used for the implementation of the proposed system are shown in Table 1.

Dataset Description
We employed two datasets, containing the chest X-ray images of COVID-19 patients, for the implementation and performance evaluation of our proposed model. The detailed descriptions of the datasets are provided in Table 2. The "Dataset for Training and Verification (DFTV)" was used to train the CNN models and to analyze the models' performance. It was split into training, validation, and test sets using the K5 protocol in the ratio of 8:1:1 before training the models. In total, 16,932 images were used for training, and 2116 images were used for testing. For the performance analysis of our proposed recommendation model, all the images of the COVID class in this dataset were also taken separately and split into five subsets of equal size, namely DFTV-1, DFTV-2, DFTV-3, DFTV-4, and DFTV-5. The second dataset, "Dataset for Cross Verification (DFCV)", was used for cross-verification of the system's performance on completely new data unseen by the CNN models. It was also split into five subsets of equal size, namely DFCV-1, DFCV-2, DFCV-3, DFCV-4, and DFCV-5.
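The 8:1:1 split described above can be sketched as follows; the shuffling seed and the list-based representation of the image collection are illustrative assumptions:

```python
import random

def split_8_1_1(images, seed=0):
    """Shuffle a collection of images and split it into train/validation/test
    sets in the ratio 8:1:1."""
    items = list(images)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = int(n * 0.8)
    n_val = int(n * 0.1)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

train, val, test = split_8_1_1(range(100))
print(len(train), len(val), len(test))  # → 80 10 10
```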

Data Pre-Processing
In phase 1 of the proposed approach, the images underwent certain image transformations, as mentioned in Step 2 of the algorithm. First, the images were randomly cropped and resized to 224 × 224 and then randomly rotated by 30 degrees before going into the CNN model for training. In phase 2 of the approach, the query and the database images were pre-processed before feature extraction took place. The chest X-ray images were first resized to 225 × 225 and then cropped to size 224 × 224, facilitating the input to ResNet and VGG architectures.
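The resize-then-centre-crop step can be illustrated for the crop arithmetic alone (a sketch on a nested-list "image"; a real pipeline would use an image library for both the resize and the crop):

```python
def center_crop(image, size):
    """Centre-crop a 2-D image (list of rows) to size x size,
    e.g. 225 x 225 → 224 x 224 as in the pre-processing step."""
    h, w = len(image), len(image[0])
    top = (h - size) // 2
    left = (w - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]

# A 5x5 toy image cropped to its central 3x3 region.
img = [[r * 5 + c for c in range(5)] for r in range(5)]
cropped = center_crop(img, 3)
print(len(cropped), len(cropped[0]))  # → 3 3
```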

Results and Performance Evaluation
The results of the proposed system were compartmentalized based on the system's two phases. The performance of offline (Phase-1) and online (Phase-2) systems was assessed using different CNN models and similarity measures.

Results
The results were obtained for the two phases of the proposed system. In phase 1, the results were determined by fine-tuning the CNN model. In phase 2, the results were recorded and obtained from the recommendation process. The results were obtained by considering both the DFTV and DFCV datasets.
Phase 1: Fine Tuning for Feature Learning-Offline System
In phase 1 of the proposed system, the CNN model was fine-tuned, and the model was saved for further use in phase 2. Training was optimized using the stochastic gradient descent (SGD) optimizer with a binary cross-entropy loss function on the DFTV dataset. Figure 11 depicts the results of these fine-tuned CNN models on the test split of the DFTV dataset. The metrics used in the results are defined in the following equations.
Precision = TP/(TP + FP), Recall = TP/(TP + FN), F1-score = 2 × Precision × Recall/(Precision + Recall)

where TP (true positive) is when the model correctly predicts the positive class, TN (true negative) is when the model correctly predicts the negative class, FP (false positive) is when the model incorrectly predicts the positive class, and FN (false negative) is when the model incorrectly predicts the negative class.
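From the TP/FP/FN counts, the reported metrics follow directly (a minimal sketch):

```python
def precision_recall_f1(tp, fp, fn):
    """Precision = TP/(TP+FP), Recall = TP/(TP+FN),
    F1 = harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

print(precision_recall_f1(90, 10, 10))  # precision = recall = F1 ≈ 0.9
```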
From Figure 11, it is found that the weighted precision, recall, and F1-score of all the CNN models are between 0.90 and 0.95. The average weighted precision, recall, and F1-score of all five CNN models are 0.938, 0.936, and 0.936, respectively.

Figure 11. Performance evaluation of the fine-tuned CNN models.

Phase 2: Recommendation-Online System
The final weights of the fine-tuned CNN model obtained from phase 1 were used in phase 2 for feature extraction. The weights were used for the feature extraction of both the query image and the database images. The similarity of the feature vector corresponding to the input query image was determined with respect to all the feature vectors corresponding to the database images. Four similarity measures, namely cosine similarity, Euclidean distance similarity, Jaccard similarity, and Maxwell-Boltzmann similarity, were considered for evaluating the performance of the proposed system. The results were obtained by taking all the images of the COVID-19 class present in each dataset. For each dataset, 80% of these images were treated as hospital database images, i.e., chest X-ray images of past recovered COVID-19 patients of the hospital, whose feature vectors were extracted and stored in the backend, and the remaining 20% of the images were treated as new input query images, i.e., chest X-ray images of new patients. For each query image, the similarity with every database image was calculated, and the top-k images from the hospital database having the highest similarity were retrieved, where k is the number of recommendations. A "threshold value (T)" of similarity was decided to identify relevant similar images for each query image. A retrieved database image was considered relevant when it had a similarity greater than or equal to the threshold value, as defined in Equation (18):

Relevant recommendation = retrieved database image with similarity ≥ T.    (18)

Phase 2: Recommendation-Online System
The final weights of the fine-tuned CNN model obtained from phase 1 were used in phase 2 for feature extraction. The weights were used for the feature extraction of both the query image and the database images. The similarity of the feature vector corresponding Figure 11. Performance evaluation of the fine-tuned CNN models.
We varied this threshold value between 0.7 and 0.95 to analyze different scenarios. This threshold value represents the minimum similarity of image features in the chest X-ray needed for the recommended medicines, doctors, and resources to be considered valid. This threshold value may be fixed after consulting a medical professional for practical use.
For the input query set, the average of the highest similarity corresponding to the most similar image (top-1) concerning each query was calculated and was referred to as the average highest similarity (AHS) of our proposed method, as shown in Appendix A. Tables A1 and A2 depict the average highest similarity as observed on various datasets using different similarity measures and CNN models for feature extraction.
The performance of the different similarity measures can be analyzed from the graphs shown in Figure 12. From Figure 12, it is observed that the mean AHS over all the datasets is maximum for the Maxwell-Boltzmann similarity, and the performance of the cosine similarity measure is close to that of the Maxwell-Boltzmann similarity. The composite means and standard deviations of the AHS over all the datasets for all the models are represented in Figure 13. The performance of the different CNN models was further analyzed considering the Maxwell-Boltzmann similarity.
The mean average precision (MAP@k) metric was used to evaluate the performance of phase 2 (online system) of the proposed RS and is defined in Equation (19):

MAP@k = (1/N) Σ_{i=1}^{N} (1/k) Σ_{m=1}^{k} P(m) · rel(m)    (19)

where N denotes the total number of users or the length of the input query set, k denotes the number of recommendations made by the recommender system, and P(m) denotes the precision up to the first m recommendations. rel(m) is a relevance indicator function for each recommended item, returning 1 if the mth item is a relevant recommended chest X-ray image with a similarity higher than the threshold value T, and 0 otherwise. To check the performance of our proposed RS, we determined the MAP@k for k = 5 and k = 10. The values obtained for MAP@k for k = 5 and k = 10 using the five CNN feature extraction models are listed in Appendix A and are shown in Tables A3 and A4, respectively. The performance of the models was analyzed through the graphs represented in Figures 14 and 15. From the graphs, it was observed that the performance of the proposed RS varies according to the different feature extraction methods through the different CNN models. The proposed RS implemented with the ResNet-50 feature extraction model provided the highest MAP@k with k = 5 and k = 10 for all the datasets at higher threshold values of similarity. The proposed RS with the ResNet-50 feature extraction CNN model had the highest MAP of more than 0.90 for threshold similarities in the range of 0.7 to 0.9. Therefore, it confirmed the first part of the hypothesis, namely that the performance of the proposed RS depends upon the feature extraction technique through CNN models. It was also found that this framework provides better performance for the DFCV, which follows the performance obtained from DFTV.

Figure 14. Mean average precision @k graphs for DFTV datasets for k = 5 and k = 10.
Figure 15. Mean average precision @k graphs for DFCV datasets for k = 5 and k = 10.
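One common form of the MAP@k computation, consistent with the definitions of P(m) and rel(m) given above and using the threshold T for relevance, is sketched below; the 1/k normalization of each query's average precision is an assumption:

```python
def average_precision_at_k(similarities, threshold, k):
    """AP@k for one query: mean over the top-k positions of
    P(m) * rel(m), where rel(m) = 1 if similarity >= threshold."""
    total, relevant = 0.0, 0
    for m, sim in enumerate(similarities[:k], start=1):
        rel = 1 if sim >= threshold else 0
        relevant += rel
        total += (relevant / m) * rel
    return total / k

def map_at_k(all_similarities, threshold, k):
    """MAP@k: mean of AP@k over all N queries."""
    return sum(average_precision_at_k(s, threshold, k)
               for s in all_similarities) / len(all_similarities)

# Two queries with the similarities of their top-3 retrieved images, T = 0.8.
queries = [[0.95, 0.9, 0.7], [0.85, 0.6, 0.9]]
print(map_at_k(queries, threshold=0.8, k=3))  # ≈ 0.611
```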
To analyze the effect of the similarity measures on the performance of the proposed system, we computed the MAP@5 for both the DFTV and DFCV datasets using the ResNet-50 CNN model, as it was the best-performing model for our datasets. The results obtained are shown in Table 3, from which it is observed that the MAP@5 was at its maximum using the Maxwell-Boltzmann similarity. Hence, the MAP@k depends upon the similarity measure used for similarity computation, which confirms the other part of the hypothesis of our proposed system. We also validated our two hypotheses in the performance evaluation section.

Performance Evaluation
The proposed study used two performance metrics, (i) the ROC curve and (ii) the figure of merit (FoM), to validate the performance of the proposed system. The ROC curves for the performance of the different CNN models are shown in Figure 16. The ROC curve represents the ability of the CNN models in feature extraction so that the predicted value of the recommended image matches the gold standard. The performance of the CNN models was analyzed with the area under the curve (AUC) and the corresponding p-value, as shown in Table 4. The ResNet-50 model was found to outperform the other CNN models with an AUC greater than 0.98 (p < 0.0001). The performance of the CNN models was also analyzed through the FoM. The FoM is defined as the central tendency of the error and can be written as

FoM = (1 − W/N) × 100

where W is the number of images incorrectly classified according to the ground truth (GT), and N is the total number of images present in the test sample. We also determined the FoM values considering the four similarity measures while keeping the CNN model fixed. We considered ResNet-50, the best-performing CNN model observed from the previous results. The FoM values obtained using ResNet-50 and the four similarity measures are shown in Table 6. It was found that the FoM was at its maximum for the Maxwell-Boltzmann similarity and varied according to the similarity measure used in the system. This result validates the second part of our hypothesis, namely that the performance of the proposed RS depends upon the similarity measure used for similarity computation.

Table 7 shows the time consumed by the proposed RS with each of the five CNN feature extraction models, expressed in seconds as the average of multiple runs. The working setup used to conduct the experimentation is shown in Table 1. The running time of the proposed RS was calculated as the time required for the feature extraction of an input query image supplied to the proposed RS, its similarity calculation with all the images in the hospital database, and the retrieval of the top-k similar images. The average running times of the proposed RS implemented with the different CNN feature extraction models are compared in Figure 17, where the bars represent the average running time of the RS with each CNN model. It was observed that the running time of the proposed RS depends primarily upon the size of the hospital database and is also affected by the type of feature extraction model.


Statistical Tests
The proposed study performed the validations of the two hypotheses designed for the proposed system. To assess the system's reliability and stability, the standard Mann-Whitney, paired t-test, and Wilcoxon tests were used. When the distribution was not normal, the Wilcoxon test was used instead of the paired t-test to determine whether there was sufficient evidence to support the hypothesis. MedCalc software (Ostend, Belgium) was used for the statistical analysis. To validate the system proposed in the study, we provided all of the MAP@k values for k = 5 and k = 10 against various models of RS with different CNNs. The results of the Mann-Whitney, paired t-test, and Wilcoxon test are shown in Table 8.

Principal Findings
The test was carried out on 20,000 COVID-19 patients' chest X-ray images. The following similarity measures were compared to select the best one for the system based on the AHS value: (i) cosine similarity, (ii) Maxwell-Boltzmann similarity, (iii) Euclidean similarity, and (iv) Jaccard similarity. With a similarity value of more than 94%, the Maxwell-Boltzmann similarity outperformed all other similarity measures. The performance of the proposed RS was validated using the following CNN models: (i) ResNet-50, (ii) ResNet-101, (iii) ResNet-152, (iv) VGG-16, and (v) VGG-19. The performance of the CNN models was validated using parameters such as the ROC curve and the FoM value. The AUC and p-values obtained from the ROC curve indicate the ability of the CNN models to correctly predict the GT of the input image. The ResNet-50 model was found to outperform the other CNN models with an AUC greater than 0.98 (p < 0.0001). The performance of the CNN models was also analyzed through the FoM, defined as the central tendency of the error. The ResNet-50 CNN model was found to have a maximum FoM value of 98.38. The performance of the similarity measures was also validated using the FoM value, and the Maxwell-Boltzmann similarity outperformed the other three similarity measures. The overall performance of the proposed RS was evaluated using MAP@k, which was determined using the different CNN models for threshold similarities in the range of 0.7 to 0.95. The proposed RS with the ResNet-50 CNN model showed the best result, with MAP@k values of 0.98014 and 0.98861 for k = 5 and k = 10, respectively. Finally, the system recommended metadata information regarding hospital resources to a new COVID-19 patient admitted to the hospital based on his or her chest X-ray image.

Benchmarking
We considered various papers related to RSs based on image similarity in our benchmarking strategy, including Ullah et al. [17], Chen et al. [18], Tuinhof et al. [19], and Geng et al. [40]. In ref. [17], an RS based on image content was proposed and divided into two phases. The RS used a random forest classifier in the first phase to determine the product's class or category. The system then used the JPEG coefficients measure to extract the feature vectors of the photos, which were then used to provide recommendations based on feature vector similarity in the second phase. The method produced correct recommendations with a 98% accuracy rate, indicating its efficacy in real-world applications. Ref. [18] provided a neural network-based framework for product selection based on a specific input query image. A neural network was used in the proposed system to classify the supplied input query image, followed by another neural network that used the Jaccard similarity measure to determine the most comparable product image to that input image. The approach had a classification accuracy of 0.5. It offered quick and accurate online purchasing assistance and recommended products with a similarity of more than 0.5. Ref. [19] described a two-stage deep learning framework for recommending fashion images based on similar input images. The authors proposed using a neural network classifier as a data-driven, visually aware feature extractor. The data were then fed into ranking algorithms, which generated suggestions based on similarities. The method was validated using a publicly available fashion dataset. The proposed framework, when combined with other types of content-based recommendation systems, can improve the system's stability and effectiveness. Ref. [40] proposed a framework for combining an RS with visual product attributes by employing a deep architecture and a series of convolution operations that result in the overlapping of edges and blobs in images. The benchmarking table for the proposed study is shown in Table 9. Ref. [82] presented an RS framework that uses chest X-ray images to predict whether a person needs COVID-19 testing. It implemented the same datasets used by the proposed method but with a different objective. None of these studies proposed any hypothesis for their systems.
In contrast, we proposed two hypotheses for our system and evaluated and validated them in the results and performance evaluation sections, respectively.

Special Note on Searching for RS
An RS works on the principle of information filtering, and the search strategy plays an important role in finding relevant items to produce efficient and useful recommendations. The proposed RS utilizes image similarity to find the chest X-ray images with infections most similar to the chest X-ray image of a new COVID-19 patient. Although CNN models play a vital role in producing accurate feature vectors, the quality of the recommendation mainly depends on the similarity measure: a measure that yields high similarity values for truly related images produces more accurate recommendations. The four similarity measures considered for this study were analyzed based on AHS. In this study, the AHS was determined by averaging the similarity value of the most similar database image for every input image in the test set. The similarity measure with the highest AHS was adopted for the RS. The performance of the proposed RS was determined in terms of MAP@k for a top-k recommendation. To identify relevant similar images for each query image, a "threshold value (T)" of similarity was also considered in the system: a retrieved database image was considered relevant when its similarity was greater than or equal to the threshold value. This threshold value was found to affect the overall performance of the system in terms of MAP@k for a top-k recommendation.
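To make this evaluation procedure concrete, the sketch below shows how AHS and threshold-based MAP@k could be computed over CNN feature vectors. It is a minimal illustration assuming cosine similarity and hypothetical function names; it is not the authors' implementation.

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two CNN feature vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def average_highest_similarity(test_features, db_features):
    # AHS: for each test image, take its best similarity against the
    # database, then average these best values over the whole test set.
    best = [max(cosine_similarity(q, d) for d in db_features)
            for q in test_features]
    return sum(best) / len(best)

def map_at_k(rankings, k, threshold):
    # rankings: one list per query, holding the similarity scores of the
    # retrieved images in ranked order. An image is "relevant" when its
    # similarity is >= the threshold value T.
    aps = []
    for scores in rankings:
        hits, precisions = 0, []
        for rank, s in enumerate(scores[:k], start=1):
            if s >= threshold:
                hits += 1
                precisions.append(hits / rank)   # precision at this rank
        aps.append(sum(precisions) / hits if hits else 0.0)
    return sum(aps) / len(aps)                   # mean average precision
```

Raising the threshold shrinks the set of images counted as relevant, which is why MAP@k varies with T as described above.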
The input images in both the training and testing sets were large, with many pixels to process, and the method we adopted reduced the computational complexity. The similarity measure strategy was fast and low in complexity, one reason being that no special optimization protocol or iteration was required. This simplicity, speed, and low complexity are advantages over direct image comparison. Note that the top-n similar images obtained from the similarity computation were used for the recommendation.
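As an illustration of this retrieval step, the following sketch ranks precomputed database feature vectors by cosine similarity to a query vector and returns the metadata (doctors, medicines, resources) of the top-n matches above the threshold. The function and variable names are hypothetical; this is a sketch of the general approach, not the paper's code.

```python
import numpy as np

def recommend_top_n(query_vec, db_vecs, patient_metadata, n=5, threshold=0.7):
    # Normalize once so that plain dot products equal cosine similarities.
    q = query_vec / np.linalg.norm(query_vec)
    db = db_vecs / np.linalg.norm(db_vecs, axis=1, keepdims=True)
    sims = db @ q                       # cosine similarity to every database image
    order = np.argsort(-sims)[:n]      # indices of the top-n most similar images
    # Keep only matches at or above the relevance threshold and return their
    # metadata, which drives the final recommendation.
    return [(int(i), float(sims[i]), patient_metadata[i])
            for i in order if sims[i] >= threshold]
```

Because the feature vectors are extracted once and compared with a single matrix product, no iterative optimization is needed at query time, which matches the low-complexity argument above.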

Strengths, Weaknesses, and Extensions
The proposed method shows that an RS using a CNN for feature extraction and a similarity measure can be an efficient tool for producing recommendations in the healthcare domain. The recommendations can be utilized for the proper allocation of doctors, medicine, and hospital resources to new patients. This study also proposed two hypotheses and evaluated and validated them in the paper.
The results of the current pilot study are encouraging. However, because the proposed RS does not include a denoising step, the quality of the recommendations may be affected by noise in the chest X-ray images. Denoising can be conducted either offline or online. Denoising is computationally expensive; offline denoising therefore adds little burden to the system, whereas online denoising would require suitable hardware support. The low resolution of chest X-ray images may also affect the quality of recommendations. Due to the limited number of images available for similarity calculation, a small database may result in incorrect recommendations, while a large database may result in longer training times. While the study used basic ResNet-based systems, this can be extended to hybrid ResNet systems [83,84].
In the future, more sophisticated feature extraction techniques could be applied by fusing different deep-learning models to achieve more accurate recommendations. Better similarity measures can be explored to increase the efficiency of the proposed system. The system could also be enhanced by applying segmentation techniques to make it more robust, and it can be extended to cloud settings and big data platforms.

Conclusions
Through this study, we offered an RS for treating COVID-19 patients based on chest X-ray images. The proposed RS was divided into two phases. In phase 1, the system fine-tuned the CNN models later used for feature extraction. In phase 2, the fine-tuned CNN model extracted features from both the chest X-ray of a new COVID-19 patient and the chest X-rays of COVID-19 patients in the hospital database who had already been treated successfully. The top-k images most similar to the input query image of the new patient were determined and then utilized for the recommendation. The proposed RS recommends doctors, medicines, and resources for new COVID-19 patients according to the metadata of these similar patients.
The proposed RS implemented with the ResNet-50 feature extraction CNN model provides the highest MAP@k with k = 5 (top-5) and k = 10 (top-10) for all the datasets at higher similarity thresholds. The proposed RS with the ResNet-50 CNN feature extraction model was found to be a proper framework for treatment recommendation, with a mean average precision (MAP) of more than 0.90 for threshold similarities in the range of 0.7 to 0.9. The hypotheses of the proposed study were validated using various parameters. The proposed RS assumes that the hospital database contains related metadata, such as information about the doctors, medicines, and resources allocated to a patient. The major limitation of our proposed system is that we did not consider related physiological parameters, such as sugar level, blood pressure, and other associated parameters, that may affect the condition of a COVID-19 patient with a similar chest infection. In the future, the proposed RS can be enhanced by considering these parameters for better recommendations.