Fast Image Super-Resolution Using Particle Swarm Optimization-Based Convolutional Neural Networks

Image super-resolution based on convolutional neural networks (CNN) is an active topic in image processing. However, image super-resolution faces significant challenges in practical applications, and improving its performance on lightweight architectures is important for real-time super-resolution. In this paper, a joint algorithm consisting of a modified particle swarm optimization (SMCPSO) and fast super-resolution convolutional neural networks (FSRCNN) is proposed. In addition, a mutation mechanism for particle swarm optimization (PSO) was designed. Specifically, the SMCPSO algorithm was introduced to optimize the weights and bias of the CNNs, and the aggregation degree of the particles was adjusted adaptively by the mutation mechanism to ensure the global searching ability of the particles and the diversity of the population. The results showed that SMCPSO-FSRCNN achieved the most significant improvement, about 4.84% better than the FSRCNN model, on the BSD100 data set at a scale factor of 2. In addition, a chest X-ray super-resolution image classification experiment was conducted, and the experimental results demonstrated that the reconstruction ability of this model could improve the classification accuracy by 13.46%; in particular, the precision and recall rate for COVID-19 were improved by 45.3% and 6.92%, respectively.


Introduction
Single-image super-resolution (SR) refers to the reconstruction of low-resolution (LR) images to recreate high-resolution (HR) images as realistic as possible [1]. It has a promising future in medical imaging, remote sensing, visual surveillance, etc. [2][3][4]. SR approaches used to be based on interpolation [5] and degradation [6] models. Currently, learning-based methods [1,7] have received wide attention, among which deep learning models have shown powerful performance in image SR [8].
Convolutional neural networks (CNN) are widely used in image SR models. Many CNN-based methods attempt to achieve better reconstruction performance by using deeper networks [9,10]. These have shown powerful performance in image SR; however, their high computing costs make them inconvenient for real-time problems. Dong et al. [11] proposed a lightweight model called FSRCNN, which has performance comparable to SRCNN-EX [1] while being up to 40 times faster. Studies have further explored methods to improve the quality of SR images generated by FSRCNN. The FSRCNN [11] model trains its CNN with stochastic gradient descent (SGD) [12]. However, the optimization problem for SR is non-convex and thus sensitive to the initial location. This property may cause traditional neural network models to easily fall into a local optimum [13,14]. Particle swarm optimization (PSO) [15] has become a commonly used optimization method for training neural networks because of its simple rules and high search speed [16,17]. Kennedy et al. [18] used PSO to optimize the weights of a feedforward neural network, proposing the first combination of PSO and a neural network. Dong et al. [14] presented a modified PSO combined with an information entropy function to optimize the weights and bias of a backpropagation neural network; the results showed that the joint algorithm had better accuracy and stability. Tu et al. [13] proposed an evolutionary convolutional neural network, which uses ModPSO and the backpropagation algorithm to train convolutional neural networks and avoid models falling into local minima. This is the most advanced attempt to date; however, that study did not optimize the particle swarm in terms of population diversity.
In order to further explore the effect of the PSO algorithm on CNN training and SR image quality, in this paper, a CNN training method based on the PSO algorithm was constructed; that is, the PSO algorithm was used to optimize the CNN network parameters. In addition, to address known issues of PSO, the mutation of particles with high similarity is proposed according to the cosine similarity between particles, with the mutation probability decreasing linearly with the number of iterations. The cosine similarity mutation strategy re-initializes aggregated particles according to their cosine similarity, which maintains a better distribution of the particle swarm over the solution space. Finally, the model was used to perform SR on low-resolution chest X-ray (CXR) images and analyze its impact on the diagnosis of pneumonia. The CXR image classification experiment showed that although the hybrid model was trained on a 91-image dataset, it could still super-resolve CXR images effectively and enhance the accuracy of their classification.

Deep Learning for SR
Dong et al. [1] first proposed the use of a CNN model for image SR, called SRCNN. It is a lightweight model with a three-layer construction but demonstrates advanced reconstruction quality. It preprocesses the images using bicubic interpolation and then reconstructs SR images through the nonlinear mapping of a three-layer convolutional neural network.
At that time, SRCNN was superior to all other reconstruction methods, but its interpolation preprocessing led to excessive computation when processing large images [11,19]. To speed up the process, Dong et al. [11] proposed the fast super-resolution convolutional neural network (FSRCNN) model, which is 40 times faster than the SRCNN-EX model. In the FSRCNN model, the mean square error (MSE) is used as the cost function, which is formulated as:

L(θ) = (1/n) Σ_{i=1}^{n} ‖F(Y_i; θ) − X_i‖²

where X_i and Y_i are an HR-LR training data pair, and F(Y_i; θ) denotes the neural network output with parameters θ. The goal of the stochastic gradient descent algorithm is to drive this loss toward 0. Its iteration formula is:

θ_{k+1} = θ_k − α ∂L/∂θ_k

where α represents the step size of each update, typically 0.01, 0.001, or 0.0001.
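The SGD update above can be sketched on a toy problem. This is a minimal illustration only: the network output F(Y; θ) is stood in for by a linear model, and all names and hyperparameters here are illustrative, not FSRCNN's actual configuration.

```python
import numpy as np

def mse_loss(theta, Y, X):
    """Mean squared error between predictions F(Y; theta) and HR targets X."""
    pred = Y @ theta                      # stand-in for the network output
    return np.mean((pred - X) ** 2)

def sgd_step(theta, Y, X, alpha=0.05):
    """One SGD step: theta <- theta - alpha * dL/dtheta."""
    pred = Y @ theta
    grad = 2.0 * Y.T @ (pred - X) / len(X)
    return theta - alpha * grad

rng = np.random.default_rng(0)
Y = rng.normal(size=(64, 4))              # stand-in "LR" inputs
true_theta = np.array([1.0, -2.0, 0.5, 3.0])
X = Y @ true_theta                        # stand-in "HR" targets
theta = np.zeros(4)
for _ in range(500):
    theta = sgd_step(theta, Y, X)
print(mse_loss(theta, Y, X))              # loss shrinks toward 0
```

With a convex stand-in loss, SGD converges regardless of the start; the point of the paper is that the real, non-convex SR loss does depend on the initial θ.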

PSO Based on Centroid Opposition-Based Learning
PSO is an interaction-based optimization algorithm imitating the preying behavior of bird flocks [15]. Each particle represents a group of weights and biases, and the optimal solution is obtained by iteratively searching the solution space. The particles update their positions under two influences: the individual optimal solution (pbest) and the global optimal solution of the group (gbest). The update formulas of PSO are as follows:

v_i^{k+1} = ω v_i^k + c_1 r_1 (pbest_i − x_i^k) + c_2 r_2 (gbest − x_i^k)
x_i^{k+1} = x_i^k + v_i^{k+1},  i = 1, 2, …, N

where N is the number of particles, v_i^k and x_i^k are the velocity and position of the i-th particle at the k-th iteration, respectively, ω denotes the inertia weight, which reflects the effect of the previous velocity on the current velocity, c_1 and c_2 are acceleration factors, usually given as two real numbers, and r_1 and r_2 are random numbers drawn from the interval (0, 1).
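A minimal sketch of these velocity/position updates, minimizing the sphere function; the constants (ω = 0.7, c1 = c2 = 1.5) and dimensions are illustrative choices, not the paper's settings.

```python
import numpy as np

def pso(fitness, dim=5, n_particles=30, iters=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Standard PSO: track per-particle pbest and a global gbest."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, dim))        # positions
    v = np.zeros_like(x)                              # velocities
    pbest = x.copy()
    pbest_val = np.array([fitness(p) for p in x])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # v <- w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        vals = np.array([fitness(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, pbest_val.min()

best, val = pso(lambda p: np.sum(p ** 2))
print(val)   # close to 0 for the sphere function
```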
Opposition-based learning (OBL) [20] has been proved to be an effective means of improving particle swarm optimization algorithms [21]. The central idea of OBL is to improve the optimization ability by simultaneously searching a solution and its corresponding opposite solution in the solution space. Centroid opposition-based learning (COBL) [22] uses the centroid of the swarm when calculating the positions of the opposite solutions, exploiting the swarm's experience to improve the search efficiency of the particle swarm. Assuming that (X_1, X_2, …, X_n) are the locations of n particles, the centroid is calculated as follows:

M = (X_1 + X_2 + … + X_n)/n

The opposite solution based on the centroid of the swarm can be formulated as:

X̃_i = 2M − X_i

The opposite solution exists in a search space with dynamic boundaries, which are expressed as:

a_j = min_i(x_{i,j}),  b_j = max_i(x_{i,j})

If the opposite solution exceeds the dynamic boundary, it is recalculated according to the following equation:

X̃_{i,j} = rand(a_j, b_j),  if X̃_{i,j} < a_j or X̃_{i,j} > b_j

By comparing the current solution with the opposite solution, the better one is selected.
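The COBL step above can be sketched as follows: each particle is reflected through the swarm centroid, out-of-bound coordinates are resampled inside the dynamic bounds, and the better of each (particle, opposite) pair is kept. Function names are illustrative.

```python
import numpy as np

def cobl_opposites(X, rng):
    """X: (n, d) particle positions. Returns centroid-opposite positions."""
    M = X.mean(axis=0)                    # swarm centroid
    opp = 2.0 * M - X                     # reflect each particle through M
    a = X.min(axis=0)                     # dynamic lower bound per dimension
    b = X.max(axis=0)                     # dynamic upper bound per dimension
    out = (opp < a) | (opp > b)
    # recompute out-of-bound coordinates uniformly inside [a, b]
    rand = rng.uniform(a, b, size=X.shape)
    return np.where(out, rand, opp)

rng = np.random.default_rng(1)
X = rng.uniform(-2, 2, (6, 3))
opp = cobl_opposites(X, rng)
# keep whichever of (particle, opposite) has the better fitness
f = lambda P: np.sum(P ** 2, axis=1)
keep = np.where((f(opp) < f(X))[:, None], opp, X)
```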

Cosine Similarity Variation Strategy
In the late phase of COBL, the algorithm can become trapped in a local optimum due to extreme population aggregation. Cosine similarity is mainly used to measure the difference between two individuals; here, it is used to quantitatively describe the aggregation degree of the particles and the population, according to the formula:

cos(x_i, gbest) = (x_i · gbest)/(‖x_i‖ ‖gbest‖)

where gbest represents the global optimal solution of the current iteration, and x_i denotes the location of the i-th particle. In order to improve the search efficiency, particles whose cosine similarity is greater than the average cosine similarity of the population are re-initialized to maintain the diversity of the particles. The average cosine similarity is computed from the cosine similarity between each particle and gbest, and the mutation region is defined accordingly, as indicated by the red dotted line in Figure 1. The region whose cosine similarity to gbest is greater than the average similarity is the mutation region, and particles in this region have a certain probability of being randomly re-initialized. The region whose cosine similarity to gbest is less than the average similarity is the non-mutation region, and particles in this region continue the iterative optimization.
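The mutation-region test can be sketched as follows. This is an illustrative implementation under the assumption that the similarity is measured against gbest; the threshold is the population's mean similarity, as described above.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two position vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def mutation_mask(X, gbest):
    """True for particles in the mutation region (too similar to gbest)."""
    sims = np.array([cosine_sim(x, gbest) for x in X])
    return sims > sims.mean(), sims

rng = np.random.default_rng(2)
gbest = np.array([1.0, 1.0, 1.0])
X = np.vstack([gbest + rng.normal(scale=0.05, size=3),  # aggregated particle
               rng.normal(size=3),                       # scattered particles
               rng.normal(size=3),
               rng.normal(size=3)])
mask, sims = mutation_mask(X, gbest)
# the aggregated particle lands in the mutation region; scattered ones mostly do not
```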
In addition, the cosine similarity of the population is affected by some extreme particles, which may lead to a higher mutation rate. However, a higher mutation rate may be detrimental to the later convergence of the algorithm. To address this issue, a mutation factor δ is introduced, which reduces the probability of mutation from 50% to 5% as the number of iterations increases. The mutation factor can be defined as:

δ_k = M_start − (M_start − M_end) · k / T_max

where M start and M end represent the initial and final mutation probability, respectively, k indicates the number of current iterations, and T max denotes the limit number of iterations.
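The linearly decreasing mutation factor can be sketched directly from these definitions (M_start = 0.5, M_end = 0.05, as stated above; the function name is illustrative):

```python
def mutation_factor(k, t_max, m_start=0.5, m_end=0.05):
    """delta_k = M_start - (M_start - M_end) * k / T_max."""
    return m_start - (m_start - m_end) * k / t_max

# a particle in the mutation region is re-initialized when a uniform random
# draw falls below delta_k, so early iterations mutate more often
print(mutation_factor(0, 10000))      # 0.5 at the start
print(mutation_factor(10000, 10000))  # falls to M_end at the end
```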

FSRCNN Model Based on SMCPSO
In our implementation, we utilized the SMCPSO method to initialize the weights and bias of the FSRCNN model. The MSE is defined as the fitness function of SMCPSO, and the dimension of the particles equals the number of parameters to be learned in the FSRCNN network. Figure 2 illustrates the flowchart of the joint algorithm. The weights and bias of the FSRCNN model correspond to the dimensions of a particle; the number of particles was set to 50, so each particle represented a set of possible weights and biases of the FSRCNN model. The number of iterations was set to 10,000. Every 100 iterations, each particle whose cosine similarity to the optimal particle was greater than the average cosine similarity and whose random draw was less than the mutation factor was re-initialized. Each iteration also checks whether the centroid-opposite of a particle is a better solution and, if so, replaces the particle with its opposite. When the iteration stop condition is reached, the particle swarm training ends. The value of each dimension of the optimal particle corresponds to a weight or bias of the FSRCNN model; the SGD algorithm was then used to optimize the weights and bias of the model until training was completed. The SGD algorithm is greatly affected by the initial position; therefore, PSO can provide an ideal initial position for SGD. Specifically, PSO is used to search for desirable weights and bias of the CNN as the initial parameters of the SGD algorithm. Higher accuracy can be achieved through this joint training method.
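As a hypothetical end-to-end sketch of this joint scheme, PSO supplies the starting point and SGD refines it. All names are illustrative, and the FSRCNN loss is replaced by a toy quadratic so the example stays self-contained; it shows only the hand-off from swarm search to gradient descent, not the paper's full SMCPSO.

```python
import numpy as np

def loss(theta):
    return float(np.sum((theta - 3.0) ** 2))   # toy stand-in for the MSE

def grad(theta):
    return 2.0 * (theta - 3.0)

def pso_init(dim=4, n=20, iters=100, seed=0):
    """Coarse PSO search for a good SGD starting point."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-10, 10, (n, dim))
    v = np.zeros_like(x)
    pbest, pval = x.copy(), np.array([loss(p) for p in x])
    gbest = pbest[pval.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((n, dim)), rng.random((n, dim))
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
        x = x + v
        vals = np.array([loss(p) for p in x])
        better = vals < pval
        pbest[better], pval[better] = x[better], vals[better]
        gbest = pbest[pval.argmin()].copy()
    return gbest

theta = pso_init()                 # PSO sets the initial position...
for _ in range(200):               # ...then SGD fine-tunes from there
    theta = theta - 0.05 * grad(theta)
print(loss(theta))                 # near 0
```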

Classification of Pneumonia
Deep convolutional neural networks are widely used in medical diagnosis. ResNet34 [23] adopts a residual network structure to achieve a good balance between classification accuracy and network complexity; therefore, it was selected as the diagnostic classifier for pneumonia. Since the publicly available pneumonia data set is not large, transfer learning was used to train the model: ResNet34 [23] was initialized with ImageNet weights, and the fully connected layer was modified to fit the four categories of the experimental data set.
Five indexes, i.e., accuracy, precision, sensitivity, F1 score, and specificity [24], were used to evaluate the classification results of ResNet34 [23]. For class i, with TP, TN, FP, and FN denoting the true positives, true negatives, false positives, and false negatives, respectively, the calculation formulas are as follows:

accuracy_i = (TP + TN)/(TP + TN + FP + FN)
precision_i = TP/(TP + FP)
sensitivity_i = TP/(TP + FN)
F1_i = 2 · precision_i · sensitivity_i/(precision_i + sensitivity_i)
specificity_i = TN/(TN + FP)
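These per-class (one-vs-rest) metrics can be computed from a confusion matrix; the function name and the toy 2-class matrix below are illustrative.

```python
import numpy as np

def class_metrics(cm, i):
    """cm: (K, K) confusion matrix, rows = true class, cols = predicted."""
    tp = cm[i, i]
    fn = cm[i, :].sum() - tp            # class-i samples predicted as others
    fp = cm[:, i].sum() - tp            # other samples predicted as class i
    tn = cm.sum() - tp - fn - fp
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / cm.sum(),
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
        "specificity": tn / (tn + fp),
    }

cm = np.array([[50, 5],                 # toy 2-class confusion matrix
               [10, 35]])
m = class_metrics(cm, 0)
print(m["accuracy"])                    # (50 + 35) / 100 = 0.85
```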

Improved Particle Swarm Optimization
For the sake of evaluating the search capability of the proposed SMCPSO algorithm, 15 multimodal test functions (F6-F20), recommended by the IEEE Congress on Evolutionary Computation (CEC) 2013 [25], were used to test the algorithm. The optimal solutions of the F6-F20 functions are −900, −800, −700, −600, −500, −400, −300, −200, −100, 100, 200, 300, 400, 500, and 600, respectively. In the experiment, the population size was N = 40, the maximum number of function evaluations was 10,000, and the problem dimension was D = 30. The standard PSO, COBL, and SMCPSO algorithms were each run in 20 independent tests. Table 1 records the best value, worst value, mean value, standard deviation, mean absolute error, and run time of each algorithm. The results in Table 1 show that SMCPSO could find better solutions for the F6-F20 multimodal test functions. Especially for the F6, F10, F14, and F19 functions, the optimization of PSO and COBL was not satisfactory, but SMCPSO found a solution extraordinarily close to the global optimum. As shown in Figure 3, SMCPSO demonstrated improved search ability for most test functions, and only a slightly lower ability compared to COBL in the initial stage. This is because, in the early stage of the SMCPSO search, the particles with high cosine similarity to gbest mutated during the optimization process, which was not conducive to early convergence; however, this made the algorithm less likely to be trapped by a local optimum. As the number of function evaluations increased, the SMCPSO algorithm showed a better global search ability than the standard PSO and COBL models on most multimodal optimization problems.
Figure 3. Errors between the best value found at each iteration and the true optimum (plotted as log(f(x) − f(x*)), where f(x) is the current best solution and f(x*) the true optimum; the smaller the error, the better the algorithm performance) for the standard PSO, COBL, and SMCPSO methods on (a) f6, (b) f8, (c) f10, (d) f12, (e) f14, (f) f16, (g) f18, (h) f20.

FSRCNN Model Based on SMCPSO
Consistent with the work on SRCNN and FSRCNN, a 91-image dataset [26] was used for training, and Set5 [27], Set14 [28], the BSD100 data set [29], and the Urban100 data set [30] were used for testing. The peak signal-to-noise ratio (PSNR) [31] and the structural similarity index (SSIM) [32] were employed to evaluate the quality of the images:

PSNR = 10 log10(L² / MSE)

where L is the maximum pixel value (255 for 8-bit images) and MSE is the mean squared error between images X and Y, and

SSIM(X, Y) = ((2 u_X u_Y + C_1)(2 σ_XY + C_2)) / ((u_X² + u_Y² + C_1)(σ_X² + σ_Y² + C_2))

where u_X and u_Y represent the mean values of images X and Y, respectively, and were used for the estimation of image brightness; σ_X and σ_Y denote the standard deviations of images X and Y, respectively, and were used for the estimation of contrast; σ_XY represents the covariance of images X and Y and was used for the measurement of structural similarity. C_1 and C_2 are constants that prevent the denominator from being 0.
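PSNR and SSIM can be sketched as below. Note that this SSIM is a single-window, whole-image simplification for illustration; library implementations (and the paper's evaluation, presumably) use local sliding windows.

```python
import numpy as np

def psnr(x, y, peak=255.0):
    """Peak signal-to-noise ratio in dB for images with max value `peak`."""
    mse = np.mean((x.astype(float) - y.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def ssim_global(x, y, peak=255.0):
    """Whole-image SSIM (single window) with the usual C1, C2 constants."""
    x, y = x.astype(float), y.astype(float)
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    ux, uy = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - ux) * (y - uy)).mean()
    return ((2 * ux * uy + c1) * (2 * cov + c2)) / \
           ((ux ** 2 + uy ** 2 + c1) * (vx + vy + c2))

rng = np.random.default_rng(0)
hr = rng.integers(0, 256, (32, 32)).astype(float)
sr = hr + rng.normal(scale=5.0, size=hr.shape)   # mildly degraded copy
print(psnr(hr, sr))          # high PSNR for small noise
print(ssim_global(hr, hr))   # identical images give SSIM = 1
```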

SMCPSO was used to optimize FSRCNN and compared with SGD optimization; their performances were compared on Set5 [27]. As shown in Figure 4a,b, compared with the SGD method, the images generated by the SMCPSO-SGD method show better PSNR [31] and SSIM [32] values, and the improvement of PSNR and SSIM during SMCPSO-SGD training was relatively stable. Figure 4. In (a,b), the PSNR and SSIM of images generated during SGD and SMCPSO-SGD training are compared.
The red curve is smoother than the black curve, and the PSNR and SSIM of the images are higher in each epoch, indicating better quality of the generated images. In (c,d), the training loss and evaluation loss of the red curve are always lower; the black curve declines steadily in (c) but fluctuates greatly in (d), while the red curve declines steadily throughout, indicating that SMCPSO helps SGD train better.
In Figure 4c, the training loss of the SMCPSO-SGD method converges quickly with a small error; in Figure 4d, the evaluation loss of the SMCPSO-SGD method declines stably with a lower error. These results show that SMCPSO can help the SGD algorithm find better solutions faster.
To verify the universality of the algorithm, image quality was evaluated using four general test sets. Table 2 reports the test results of SMCPSO-SGD on the general test sets Set5 [27], Set14 [28], BSD100 [29], and Urban100 [30]. The results indicate that the model trained with SMCPSO performed best in terms of PSNR and SSIM at all scales. On the BSD100 dataset with a scale factor of 2, the improvement of the SMCPSO-SGD model over the SGD model was the most significant, about 1.54 dB. This suggests that the SMCPSO optimization algorithm can help models generate more realistic images. Table 2. PSNR and SSIM of the bicubic, SGD, and SMCPSO-SGD models on the test sets Set5 [27], Set14 [28], BSD100 [29], and Urban100 [30].

Classification Evaluation
To confirm the effectiveness of the generated image details, we conducted an experiment on chest X-ray (CXR) SR image classification. The data set we used was from Kaggle (available at: https://www.kaggle.com/datasets/tawsifurrahman/covid19-radiography-database (accessed on 19 March 2022)) [33]. It contains a total of 21,165 CXR images, divided into four categories. The "COVID-19" and "Viral Pneumonia" classes were data-augmented to balance the training set. Table 3 summarizes the number of images per class used for training, validation, and testing. CXR LR images for each category were obtained from the CXR HR images by down-sampling. Then, the CXR LR, CXR SR, and CXR HR images were fed into the ResNet34 [23] classification network. The implementation scheme is shown in Figure 5. Figure 5. (The SMCPSO-SGD network was first used to convert LR X-rays into HR X-rays, then ResNet34 was used to classify the HR X-ray images.)
In order to evaluate the reliability of the restored details, the ResNet34 [23] classification network was used to classify the LR, SR, and HR CXR images. Five performance indicators, i.e., accuracy, precision, sensitivity, F1 score, and specificity [24], were used to evaluate the classification results. Table 4 shows the evaluation results. Figure 6 displays the CXR HR, LR, and SR images; the CXR SR image was recovered from the CXR LR image using the SMCPSO-SGD model. In terms of visual perception, the images processed by SMCPSO-SGD looked better and achieved higher PSNR and SSIM scores. Although the given CXR LR images had lost a large amount of valid information, the model still helped to achieve a better visual effect. Figure 7 shows the confusion matrices of the CXR HR, LR, and SR image classification for subsequent diagnosis using the ResNet34 model. It was found that the CXR SR images improved the diagnostic accuracy for COVID-19, lung opacity, normal, and viral pneumonia from 80.89% to 90.03%, 39.10% to 80.53%, 65.75% to 90.58%, and 0.00% to 11.09%, respectively.
In order to further evaluate the comprehensive performance of the algorithm, the five indexes of accuracy, precision, sensitivity, F1 score, and specificity were used to evaluate the classification results. As can be seen in Table 4, the CXR LR image classification accuracy was the lowest, indicating that the LR images contained very little useful information conducive to classification, while the SMCPSO-SGD model led to improvements in all five evaluation indicators, increasing the CXR LR image classification accuracy to values close to those obtained with the CXR HR images. The results reveal that the model could recover the images' real and useful details well, which is beneficial to the diagnosis of pneumonia. Figure 7. Classification confusion matrices for the three kinds of images. (The vertical axis represents the true category, the horizontal axis represents the predicted category, the ratio matrix on the right represents the accuracy for each of the four categories, and the lower ratio matrix represents the recall rate for each of the four categories.) (A) Classification results for the CXR LR images. Since the CXR LR images were obtained from the CXR HR images by down-sampling, there was a loss of information, resulting in the worst classification.
(B) Classification results for the CXR SR images. The SMCPSO-SGD model was used to process the CXR LR images, and the details of the CXR LR images could be restored. The generated CXR SR images effectively improved the number of correct classifications for all four categories. (C) Classification results for the CXR HR images. The CXR HR images are the most primitive images; therefore, it is important to classify them correctly.

Conclusions
A training method for convolutional neural networks based on an improved particle swarm optimization algorithm was proposed. It was shown that the joint training method could improve the efficiency and accuracy of the model. A mutation strategy was proposed and shown to prevent premature convergence and improve the optimization ability of the particle swarm optimization algorithm. Moreover, the chest X-ray image classification experiment demonstrated that the proposed model can reconstruct useful information for pneumonia recognition.
Author Contributions: Conceptualization, C.Z. and A.X.; methodology, C.Z.; software, C.Z.; validation, C.Z. and A.X.; formal analysis, C.Z.; investigation, A.X.; resources, A.X.; data curation, C.Z.; writing-original draft preparation, C.Z. and A.X.; writing-review and editing, A.X.; visualization, C.Z. and A.X.; supervision, A.X.; project administration, A.X.; funding acquisition, A.X. All authors have read and agreed to the published version of the manuscript.

Funding: This research received no external funding.