An Agile Super-Resolution Network via Intelligent Path Selection

Abstract: In edge computing environments, limited storage and computational resources pose significant challenges to complex super-resolution network models. To address these challenges, we propose an agile super-resolution network via intelligent path selection (ASRN) that utilizes a policy network for dynamic path selection, thereby optimizing the inference process of super-resolution network models. Its primary objective is to substantially reduce the computational burden while preserving super-resolution quality as far as possible. To achieve this goal, a dedicated reward function is proposed to guide the policy network towards identifying optimal policies. The proposed ASRN not only streamlines the inference process but also boosts inference speed on edge devices with only minimal loss in the quality of super-resolution images. Extensive experiments across multiple datasets confirm that ASRN accelerates inference while keeping performance degradation minimal. Additionally, we explore the broad applicability and practical value of ASRN in various edge computing scenarios, indicating its potential in this rapidly evolving domain.


Introduction
Super-resolution technology [1][2][3] is particularly crucial in real-world applications such as urban traffic monitoring, medical imaging, and satellite imaging. It enhances not only the resolution of images but also their quality, providing clear and accurate visual information for vehicle identification and traffic flow monitoring and ensuring traffic safety. However, traditional super-resolution network models often require substantial computational resources, presenting a major issue in resource-limited edge computing environments. For example, the Internet of Things (IoT) is a typical scenario wherein limitations on computational resources and storage are even more stringent. Therefore, effectively reducing the computational complexity and inference time of these models without sacrificing image super-resolution quality becomes an urgent problem.
To alleviate this problem, we propose an agile super-resolution network via intelligent path selection (ASRN). ASRN incorporates a dynamic path selection mechanism and a policy network to optimize the inference process of super-resolution network models intelligently. The motivation of our method is that ResNet [4] has some redundancies [5][6][7]: removing some layers would not cause severe performance degradation. Inspired by this, we propose to skip some layers in the network to reduce the computational complexity. We assign different inference paths for various input data. The characteristic of our method is that it can dynamically choose the optimal inference paths in the network based on input data and available computational resources on edge devices. We carefully designed a reward mechanism for this purpose. It balances the complexity of the network structure with the performance of specific tasks, enhancing the efficiency of the inference process while minimizing degradation in super-resolution quality.
We explore the applicability of ASRN in various edge computing application scenarios, particularly focusing on its effectiveness in executing super-resolution tasks in resource-limited environments. Compared to traditional lightweight model techniques, such as pruning [8][9][10], quantization [11][12][13], low-rank factorization [14][15][16], and knowledge distillation [17][18][19], ASRN shows greater flexibility and adaptability. Specifically, it maintains efficient operation under resource-limited conditions and can restore model performance by simply tuning one hyperparameter once the edge devices' resources are improved. The main contributions of this research are as follows:
• We propose an agile super-resolution network via intelligent path selection (ASRN) for edge computing environments. ASRN adopts a dynamic path selection mechanism that utilizes a policy network to tailor computational pathways based on the real-time data.
• We introduce a reward mechanism in ASRN that evaluates the policy network's decisions. By comprehensively assessing the overall performance of the model and the effectiveness of the current policy, it directs the policy network towards optimal choices, marking an advancement for super-resolution applications in edge computing scenarios.
• Our extensive experiments across a variety of datasets confirm the effectiveness of the proposed ASRN. In particular, on the Div2k dataset [20], we reduced the average number of residual blocks by 15.88% and the computational complexity (FLOPS) by 15.68% while maintaining performance close to the baseline.

Super-Resolution Technology
The development of super-resolution technology commenced with early interpolation-based methods, which primarily utilized linear techniques [21,22] to enhance image resolution. However, these methods often led to blurred images that lacked detail. The field experienced a significant transition with the advent of deep learning, particularly with the introduction of convolutional neural networks (CNNs) [23]. CNNs revolutionized super-resolution by enabling more complex and accurate image reconstruction, significantly improving the quality of upsampled images. This era also saw the integration of advanced techniques such as generative adversarial networks (GANs) [24], which introduced a competitive aspect to model training, resulting in sharper and more realistic images. Attention mechanisms [25,26], another significant advancement, allowed models to focus on specific image regions, enhancing detail where most needed. Ongoing advancements in the field are characterized by the exploration of novel deep learning architectures [27] and loss functions [28] that are aimed at enhancing the precision of super-resolution outputs.

Deep Learning Applications in Edge Computing
Edge computing presents a unique set of challenges for deep learning applications, primarily due to limitations in computational capacity and memory. The focus has thus shifted towards developing models that are not only lightweight but also capable of achieving real-time performance. This is particularly critical for applications like video surveillance, autonomous driving, and real-time data analysis. Recent progress in this domain focuses on training models that are efficient in terms of both size and computational speed without compromising performance. Techniques like model quantization and network pruning [29,30] have been pivotal to achieving these goals. There is also an increasing trend towards designing custom hardware accelerators [31,32] that are specifically optimized for running deep learning models in resource-limited environments. This progress makes edge computing a viable platform for advanced deep learning applications.

Lightweight Model Techniques
Efficient deployment of deep learning models on edge devices necessitates the reduction of their computational resource requirements without loss in performance. Lightweight model techniques are pivotal to achieving this by compressing and accelerating models while retaining their performance. Each technique adopts a unique approach to tackle the challenges of limited resources:
Model Pruning: Pruning is a technique that reduces the complexity of neural networks by removing less important parameters or connections. It effectively reduces the model size and computational load without significantly impacting performance. Existing pruning methods include structured pruning (removing entire neurons, channels, or layers) [33][34][35] and unstructured pruning (eliminating individual weights) [36]. Dynamic pruning, which adapts network complexity in real time based on input data, is also promising.
Quantization: Quantization involves reducing the bit-size of model parameters, thereby lowering the model's storage and computational demands. This process often converts floating-point parameters into fixed-point formats. The latest trends include mixed precision quantization [37,38], which applies different bit-widths to different parts of the network. Integration with other model compression techniques, such as pruning [39], enhances overall efficiency.
Knowledge Distillation: Knowledge distillation is a compression technique whereby a smaller model (student) learns to replicate the behavior of a larger model (teacher). The student model captures the essential information from the teacher, resulting in a compact yet effective version. Beyond the classic teacher-student setup, mutual learning [40,41] among multiple networks and cross-modal distillation [42,43] have been explored. These methods allow leveraging diverse data modalities and unlabeled data to improve the student model's generalization.
Low-Rank Factorization: Low-rank factorization involves reducing the number of parameters in a model by decomposing large weight matrices into lower-rank approximations. This technique is particularly effective for reducing redundancies in convolutional layers. The focus is on tensor decomposition methods like CANDECOMP/PARAFAC (CP) [44,45] and Tucker decompositions [46,47] for convolutional layers. These approaches maintain model performance while significantly reducing the parameter count.
Compared to existing lightweight model techniques, our proposed ASRN method demonstrates significant advantages. By adjusting a single hyperparameter, ASRN allows for flexible control over model performance, enabling efficient super-resolution processing in edge computing environments. Long et al. [48] also propose a dynamic path selection method by considering both the inference speed and the PSNR metric. This approach, however, overlooks the texture and structural integrity of the images evaluated by the SSIM metric. The proposed ASRN improves on the work of Long et al. by taking the SSIM of the images into consideration when choosing the inference path, thereby improving super-resolution quality. Our method achieves balanced optimization by considering not only the pixel fidelity but also the perceptual quality, thereby rectifying the bias and enriching the model's applicability to real-world scenarios. Unlike one-time optimization techniques, ASRN supports dynamic adjustment, enabling the model to recover or further optimize its performance based on changes in available computational resources. This flexibility and reversibility, together with a more comprehensive evaluation through PSNR and SSIM, are particularly valuable in edge computing scenarios faced with resource limitations and varying demands.

Overall Framework
This paper presents a network architecture designed to address the challenges of super-resolution tasks in edge computing environments. As illustrated in Figure 1, our network architecture comprises three key components: the backbone network, the policy network, and the reward mechanism. These components work collaboratively to enhance the efficiency and effectiveness of super-resolution processing, particularly in scenarios with limited computational resources.

Policy Network
In the proposed ASRN, the policy network plays a crucial role. Drawing on insights from previous studies analyzing ResNet, we recognized that skipping certain blocks within the network enhances inference efficiency without substantially affecting performance. This understanding has laid a vital foundation for our policy network design. The primary task of the policy network involves generating decision policies based on input data to determine the network architecture or operations to be executed during the inference process. Acting as an intelligent decision-maker, the policy network effectively selects the optimal inference paths according to the characteristics of the input data and the current computational resource limitations, thereby accelerating the inference process.
The policy network generates its policy as $m = f_p(x; w)$, where $f_p$ represents the policy network parameterized by weights $w$, and $m$ denotes the output policy corresponding to the input image $x$. We have carefully designed a lightweight policy network with far fewer parameters than the backbone network, ensuring minimal computational cost. Different from the sampling policies in traditional reinforcement learning [49], the policy in our method is generated based on a k-dimensional Bernoulli distribution [50,51] over a binary vector $u$, where each element of $u$ represents the decision to execute or skip the corresponding network block. With the help of the policy network, ASRN can flexibly adjust the network structure based on the complexity of the input images. This dynamic adjustment mechanism not only enhances the model's inference speed but also ensures the quality of super-resolution images in resource-limited edge computing environments.
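The sampling step described above can be sketched in a few lines. This is a minimal illustration, assuming that the policy output `m` holds one execution probability per block and that the k Bernoulli factors are independent; the function names and the example probabilities are hypothetical and do not come from the paper.

```python
import math
import random


def sample_policy(m, rng):
    """Draw a binary policy u with u[i] ~ Bernoulli(m[i]) (1 = execute, 0 = skip)."""
    return [1 if rng.random() < p else 0 for p in m]


def log_prob(u, m):
    """Log-probability of u under independent Bernoulli(m[i]) factors."""
    return sum(math.log(p if ui == 1 else 1.0 - p) for ui, p in zip(u, m))


rng = random.Random(0)
m = [0.9, 0.2, 0.7, 0.5]  # hypothetical policy-network outputs for k = 4 blocks
u = sample_policy(m, rng)
```

The log-probability is the quantity a policy-gradient method differentiates, which is why it is kept as a separate helper.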

Design Principles of the Policy Network
The policy network in ASRN is grounded in deep reinforcement learning principles and continuously refines decision quality through iterative learning and optimization. Adopting a lightweight architecture (specifically, a simplified ResNet variant with three blocks) substantially reduces the parameter count compared to the backbone network, ensuring effective decision-making without overburdening the inference process.
This pivotal component of ASRN comprises three ResNet blocks that are responsible for processing low-resolution images and generating a binary vector representing the policy. The length of this binary vector matches the number of blocks in the backbone network, and the vector activates specific ResNet blocks during super-resolution, streamlining the inference path.

Policy Generation Process
Policy generation is based on a k-dimensional Bernoulli distribution that is calculated using Equations (2) and (3). Each element of the policy vector represents the decision to execute or skip a corresponding network block. This method enables the policy network to dynamically adjust the inference path according to different input data characteristics and resource limitations.

Collaboration of the Policy Network with the Backbone Network
The policy network does not operate independently but works closely with the backbone network. It intelligently adjusts the execution path of the backbone network based on the complexity of the input data. For instance, for relatively simple images, the policy network may choose a shorter path, skipping some unnecessary network blocks to accelerate processing.
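The interaction between the two networks can be illustrated with a toy forward pass. The sketch below assumes that a skipped residual block reduces to the identity mapping, so the input simply passes through; the scalar lambdas stand in for convolutional residual branches and are purely hypothetical.

```python
def run_backbone(x, blocks, u):
    """Apply residual blocks in order, skipping block i when u[i] == 0."""
    executed = 0
    for block, keep in zip(blocks, u):
        if keep:
            x = x + block(x)  # residual connection: identity plus branch output
            executed += 1
        # a skipped block acts as the identity, so x is left unchanged
    return x, executed


# Toy scalar "branches" standing in for the residual blocks of the backbone.
blocks = [lambda v: 0.5 * v, lambda v: 1.0, lambda v: -0.25 * v]
y, n = run_backbone(2.0, blocks, [1, 0, 1])
```

Because each skipped block costs nothing, the number of executed blocks returned here is the quantity that the reward mechanism tries to keep small.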

Adaptability to Application Scenarios
We further explore the performance of the policy network in different application scenarios, such as processing high-resolution traffic surveillance images on resource-limited edge devices. In these scenarios, the policy network effectively adapts to various challenges, such as limited computational power and urgent inference time requirements.
Through this in-depth analysis, we demonstrate the key role of the policy network within the ASRN framework and how it supports the efficient execution of super-resolution tasks. This comprehensive and detailed discussion highlights the innovativeness and practical value of our research.

Reward Mechanism
To optimize the training process of the policy network, we employed reinforcement learning methods. In this process, the policy network makes decisions at each step of inference based on the actions chosen by the current policy. The performance of these decisions is evaluated through a carefully designed reward mechanism. The significance of the reward mechanism lies in its direct guidance for the policy network to choose optimal operations that simultaneously enhance inference speed and maintain super-resolution quality. Through this continuous optimization process, the policy network progressively becomes more intelligent and capable of generating increasingly effective inference policies.
The reward function is defined piecewise for a backbone network with k residual blocks. In it, $u$ is a policy vector composed of binary values, where 1 represents the retention of the corresponding residual block and 0 indicates skipping it. The dimension of the vector $u$ is k, which is the total number of residual blocks in the network. The expression $(|u|/k)^2$, with $|u|$ the number of retained blocks, quantifies the degree to which individual blocks are incorporated into the overall network architecture. The variables s and p represent the performance evaluation results of the backbone network after applying the policy: specifically, they are the structural similarity index (SSIM) and the peak signal-to-noise ratio (PSNR), respectively, where 0 < s < 1. The variable t is a critical hyperparameter that represents a threshold value we set to determine the application of rewards or penalties. Its specific value varies based on the employed evaluation method and the range of p (PSNR). When p − t > 0, it indicates that the applied policy is effective, and thus, we provide a reward. This reward is directly proportional to the performance evaluation results after using the policy and inversely proportional to the number of residual blocks used. That is, better performance and fewer blocks used lead to larger rewards. Conversely, when p − t ≤ 0, it indicates that the policy is not effective, and we impose a penalty using the parameter γ.
In this way, the reward mechanism enables the policy network to more accurately adjust inference paths for samples of varying complexity, optimizing the performance and efficiency of the entire network.
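To make the two branches of the reward concrete, the sketch below implements one plausible form. It is an assumption, not the paper's exact expression: the reward branch is taken to be s · (1 − (|u|/k)²), and the penalty branch returns γ directly as a negative constant. The default values t = 30 and γ = −10 are the settings reported in the parameter sensitivity analysis.

```python
def reward(u, s, p, t=30.0, gamma=-10.0):
    """Illustrative reward for a binary policy u (1 = keep block, 0 = skip).

    s is the SSIM score (0 < s < 1), p the PSNR, t the PSNR threshold,
    and gamma the penalty. The reward-branch formula is an assumed form.
    """
    k = len(u)
    usage = (sum(u) / k) ** 2  # (|u| / k)^2, the squared fraction of blocks used
    if p - t > 0:
        # Assumed combination: grows with quality (s), shrinks with block usage.
        return s * (1.0 - usage)
    # Policy judged ineffective: return the penalty (assumed to be gamma itself).
    return gamma


half = reward([1] * 16 + [0] * 16, s=0.9, p=35.0)     # half of 32 blocks kept
quarter = reward([1] * 8 + [0] * 24, s=0.9, p=35.0)   # a quarter of 32 blocks kept
failed = reward([1] * 8 + [0] * 24, s=0.9, p=25.0)    # PSNR below the threshold
```

Under this assumed form, using fewer blocks raises the reward only as long as the PSNR stays above the threshold; once it drops below, the penalty dominates regardless of how many blocks were skipped.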

Optimization of the Policy Network
In the proposed ASRN, special attention was given to the optimization of the policy network. Employing reinforcement learning methods, the policy network generates specific policies for each test sample with the aim of enhancing inference efficiency while maintaining super-resolution quality.
The optimization objective in Equation (5) is formulated to maximize the expected reward, which is expressed as $J(\theta) = \mathbb{E}_{\pi_\theta}[R(s, a)]$, where $J(\theta)$ represents the optimization objective with respect to policy parameters $\theta$, $R(s, a)$ denotes the reward for state s and action a (here s is the state, not the SSIM score used in the reward function), and $\pi_\theta$ is the policy under parameters $\theta$. This formulation guides the policy network to efficiently manage computational resources while enhancing or maintaining the quality of super-resolution. In accordance with the principles of reinforcement learning, the policy network updates its strategy based on the feedback loop of actions and rewards, iteratively improving its path selection decisions. This mechanism is akin to the exploration-exploitation trade-off, where the policy network explores various inference paths, learns from their performance outcomes, and exploits the knowledge to make more efficient decisions over time.

Optimization Objective
Our objective in this study is to maximize the expected value J to derive the optimal policy for the backbone model, that is, to find the policy parameters under which the expected reward is largest. This objective is in line with our goal of finding the most effective policy for super-resolution tasks. Through this methodological approach, the policy network continuously learns and improves across iterations, enabling the generation of more effective inference paths for the backbone network.

Application of Gradient Optimization Techniques
To optimize Equation (5), we employed gradient optimization techniques, as referenced in [52]. This involved substituting Equation (2) into Equation (5) to derive the optimization formulation for J. However, due to the non-differentiability of Equation (6), we resorted to the Monte Carlo sampling method [53,54] as an approximation technique to estimate the gradient of J. This approximation was achieved by using all available samples within a given mini-batch. The resulting gradient estimate is given by Equation (6).
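For reference, the standard score-function (REINFORCE) estimator for a factorized Bernoulli policy takes the form below. This is a sketch consistent with the description above rather than a reproduction of Equation (6): the mini-batch size $B$, the superscript $(b)$ indexing samples, and the shorthand $R(u)$ for the reward are notational assumptions.

```latex
\nabla_w J
  = \mathbb{E}_{u \sim \pi_w}\!\left[ R(u)\, \nabla_w \log \pi_w(u \mid x) \right]
  \approx \frac{1}{B} \sum_{b=1}^{B} R\!\left(u^{(b)}\right)
    \nabla_w \sum_{i=1}^{k}
    \log\!\left[ m_i^{(b)} u_i^{(b)} + \left(1 - m_i^{(b)}\right)\left(1 - u_i^{(b)}\right) \right]
```

For binary $u_i$, the bracketed term equals $m_i$ when the block is executed and $1 - m_i$ when it is skipped, so the inner sum is the log-probability of the sampled policy.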

Policy for Reducing Variance
While the gradient approximation is unbiased, it is prone to significant variance, as noted in [49]. To mitigate this issue, we introduced a self-critical baseline $R(\hat{u}, p)$ as a technique for variance reduction. This approach leads to the reformulation of Equation (6) as Equation (7), the gradient with the self-critical baseline subtracted from the reward. In Equation (7), $\hat{u}$ represents the most likely policy under the current policy probabilities $m$: the binary variable $\hat{u}_i = 1$ when $m_i > 0.5$, and $\hat{u}_i = 0$ when $m_i \le 0.5$. This reformulation helps to reduce the variance of the gradient estimation, thereby enhancing the reliability of the optimization process.
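A single-sample version of the self-critical estimate can be sketched as follows. The sketch assumes that the probabilities m are the sigmoid of per-block logits, in which case the gradient of the Bernoulli log-probability with respect to logit i is u_i − m_i. The function names and the toy reward are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def self_critical_gradient(logits, reward_fn, rng):
    """One-sample policy-gradient estimate with a self-critical baseline."""
    m = [sigmoid(z) for z in logits]
    u = [1 if rng.random() < p else 0 for p in m]   # sampled policy
    u_hat = [1 if p > 0.5 else 0 for p in m]        # greedy (most likely) policy
    advantage = reward_fn(u) - reward_fn(u_hat)     # reward minus baseline
    # d/dz_i log pi(u | x) = u_i - m_i for a sigmoid-parameterized Bernoulli
    grad = [advantage * (ui - p) for ui, p in zip(u, m)]
    return grad, u, u_hat


def toy_reward(u):
    """Toy stand-in for the reward: favors policies that use fewer blocks."""
    return 1.0 - (sum(u) / len(u)) ** 2


grad, u, u_hat = self_critical_gradient([2.0, -1.0, 0.5], toy_reward, random.Random(0))
```

When the sampled policy coincides with the greedy one, the advantage is zero and the update vanishes, so only samples that differ from the current best guess move the policy.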

Incentive Mechanism for Policy Exploration
To encourage the exploration of more optimal policies by the policy network and reduce the risk of policy saturation, we introduced the parameter α. This parameter is used to adjust the range of the policy vector, producing a bounded vector $m'$ that stays within the interval [1 − α, α]. Such an adjustment is crucial to maintain the policy network's exploratory capabilities while preventing it from straying too far from desirable policies.
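One mapping that satisfies the stated interval is the affine rescaling m′ = α·m + (1 − α)·(1 − m). The sketch below assumes this form, and the value α = 0.8 is purely illustrative; neither is confirmed by the text above.

```python
def bound_policy(m, alpha=0.8):
    """Rescale probabilities in [0, 1] into [1 - alpha, alpha] (assumed affine form)."""
    return [alpha * p + (1.0 - alpha) * (1.0 - p) for p in m]


bounded = bound_policy([0.0, 0.5, 1.0])
```

With this mapping, a fully saturated probability of 0 or 1 becomes 1 − α or α, so every block keeps a nonzero chance of being either executed or skipped during training.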

Parameter Sensitivity Analysis
We evaluated the sensitivity of the ASRN model on the Set5 dataset to the reward function parameters γ (gamma) and t (threshold) and report the results in Table 1. The optimal settings of γ = −10 and t = 30 yield the highest PSNR of 37.450 while using only 26 blocks, indicating that precise parameter tuning can enhance super-resolution quality. In contrast, extreme values like γ = −100 with t = 100 or t = 10, despite maintaining competitive PSNR levels, require using more blocks, reducing network efficiency.
Intermediate values such as γ = −50 and t = 30 achieve a PSNR of 37.380 and use 27 blocks, showing the sensitivity of the ASRN model to its reward function parameters and emphasizing the importance of careful calibration to strike an optimal balance between super-resolution quality and computational efficiency.
In summary, sensitivity analysis, as depicted in Table 1, is crucial for optimizing the performance of the ASRN model, particularly in resource-constrained edge computing environments, and enables a strategic balance between high-quality super-resolution and efficient computational usage. By employing these optimization policies, ASRN demonstrates strong performance and efficiency across various datasets and application scenarios. This section has explored the optimization process of the policy network and underscored its contribution to enhancing the efficiency of the super-resolution network model. The balance achieved by this mechanism facilitates the discovery of effective inference policies, thereby augmenting the model's overall robustness and adaptability in different computing environments.

Experiments
In this section, we provide a detailed account of the integration of the policy network into the EDSR [55] backbone network and assess the effectiveness of our approach across five different datasets. Our primary goal is to significantly reduce the inference time of the network while maintaining super-resolution performance, with a particular focus on edge devices limited by storage and computational resources.
This comprehensive evaluation aims to demonstrate the practical applicability and efficiency of our method in diverse scenarios, highlighting its potential in addressing the challenges faced in edge computing environments, where resource optimization is crucial.

Dataset
The Div2k dataset is a widely used benchmark in super-resolution research and comprises 800 high-quality training images and 100 validation images. Different super-resolution tasks share some similarities in pixel statistics. Therefore, based on the philosophy of transfer learning, we initialize the parameters through a model pretrained on the Div2k training set (just like initializing the classification models through a model pretrained on ImageNet [56]). To comprehensively assess the effectiveness and adaptability of our method, tests were conducted not only on the Div2k validation set but also on four additional benchmark datasets: Set5 [57], Set14 [58], B100 [59], and Urban100 [60]. These experiments were designed to validate the generalization ability of the policy network across different image characteristics and real-world scenarios.

Network Architecture Components
We selected the EDSR network as the backbone for our super-resolution task; the EDSR network consists of a head, body, and tail, with the body comprising 32 residual blocks. The network model was trained from scratch using the first 800 images from the Div2k dataset, ensuring that the model could adequately learn and adapt to a variety of image features.
For the policy network, we use a ResNet with three blocks (equivalently, ResNet-8), with the aim of minimizing the computational overhead introduced. This lightweight design allows the policy network to effectively support the backbone network without becoming a computational bottleneck. Its smaller scale compared to the backbone network ensures that it plays a supportive role in our overall approach, allowing us to allocate more computational resources to the actual super-resolution task while benefiting from the guidance provided by the policy network. During training, we employ the Adam optimizer with a learning rate of $1 \times 10^{-4}$ and betas of (0.9, 0.999). To enhance convergence and stability, step decay on the learning rate is utilized, with a decay interval of 200 epochs and a decay factor of 0.5. Notably, the policy network's training integrates reinforcement learning, with rewards derived from the backbone network's output, ensuring alignment between super-resolution quality and computational efficiency.
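The step-decay schedule described above can be written out directly. This is a minimal sketch, assuming that the decay is applied at exact multiples of the interval (the convention of common step schedulers); the function name is hypothetical.

```python
def learning_rate(epoch, base_lr=1e-4, step=200, factor=0.5):
    """Step decay: multiply the base rate by `factor` once every `step` epochs."""
    return base_lr * factor ** (epoch // step)


schedule = {epoch: learning_rate(epoch) for epoch in (0, 199, 200, 400, 600)}
```

Under these settings the learning rate stays at 1e-4 for the first 200 epochs and is halved at each subsequent interval.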
Next, we will present the experimental results on these datasets and analyze the performance of our method on different datasets and with different settings. Specifically, we will focus on discussing the adaptability of the policy network to various scenarios and its ability to balance inference speed and image quality.

Balancing Speed and Quality
In this study, we focused particularly on balancing the speed and image quality of the super-resolution model. To this end, we compared the performance of our ASRN model with the original EDSR model and other popular super-resolution models such as A+ [61], SRCNN [62], VDSR [63], and SRResNet [64].

Performance Comparison Analysis
Figures 2-4 illustrate the dynamic selection process employed by our policy network across different datasets, highlighting its capacity to adjust the computational depth in accordance with the complexity of input images. This process emphasizes the policy network's adaptability, ensuring computational efficiency and maintaining super-resolution quality. For simpler images, fewer neural network blocks are required, whereas complex images necessitate a more extensive computational effort. Further insights are provided in Tables 2-4, which detail the FLOPS reduction for representative examples, demonstrating our approach's effectiveness at varying degrees of model simplification. This approach underscores our model's flexibility in balancing the demand for high-quality super-resolution against the constraints of limited computational resources.
Table 5 shows an average enhancement in inference speed of 9.41% to 15.93% across the datasets without any significant alteration in image quality following super-resolution processing. This observation leads us to further explore the comparative performance of different super-resolution models.
From Table 6, it is observed that our ASRN model consistently surpasses the other compared models with regard to super-resolution performance across all datasets. Despite a marginal decrement in performance metrics relative to the baseline, the model significantly reduces computational complexity (FLOPS), thereby directly shortening the inference time. This reduction in computational overhead demonstrates an efficient way to execute super-resolution tasks and highlights the model's role in edge computing settings, where computational resources and storage capacities are limited. Such outcomes offer new possibilities for efficiently handling super-resolution tasks in environments that demand swift and resource-conscious processing.
Our analyses further reveal a direct correlation between image complexity and the computational efficiency achieved by ASRN. Specifically, Figures 5-7 illustrate that simpler images necessitate fewer processing blocks, while more complex images require a greater number. This adaptive behavior underscores ASRN's capacity to dynamically adjust its processing policy according to the image's complexity, ensuring efficient resource utilization and faster inference across varied scenarios.

Scalability Testing
We conducted scalability tests of ASRN on edge computing devices with different computational capabilities. On a Texas Instruments MSP432P401R, ASRN reduced the average inference time from 156.490 s to 146.159 s, while on an ARM Cortex-M7, the inference time decreased from 62.596 s to 58.464 s, a reduction of about 6.6% in each case. As the data in Tables 7 and 8 show, ASRN scales across different edge devices, which further validates its effectiveness in diverse edge computing environments and its potential to enhance super-resolution tasks across a variety of computing resources. The experimental results demonstrate the effectiveness of the policy network for optimizing resource usage. With minimal or no degradation in performance, ASRN achieved a reduction in computational load and storage requirements. This provides a practical solution for super-resolution applications on edge devices, especially in scenarios with strict requirements for inference speed and efficiency.
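The relative savings implied by these timings can be checked with a short calculation. The helper name below is hypothetical; the four timing values are the ones reported above.

```python
def relative_reduction(before, after):
    """Fraction of the original inference time that was saved."""
    return (before - after) / before


msp432 = relative_reduction(156.490, 146.159)    # Texas Instruments MSP432P401R
cortex_m7 = relative_reduction(62.596, 58.464)   # ARM Cortex-M7
```

Both devices show nearly the same relative saving even though their absolute inference times differ by a factor of about 2.5, which is what one would expect if the saving comes from the skipped blocks rather than from device-specific effects.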
Furthermore, our work offers valuable insights for future research and applications in the field of edge computing, particularly for resource optimization and real-time data processing. ASRN demonstrates the potential of super-resolution technology in edge computing environments.
In summary, our research has not only made significant technical progress but also provides important references and directions for future studies and applications in similar fields.

Conclusions
In this paper, we proposed an agile super-resolution network via intelligent path selection (ASRN): an efficient super-resolution model tailored to edge computing environments. ASRN aims to significantly reduce the inference time of super-resolution network models on edge devices while maintaining high-quality performance. By incorporating a policy network, ASRN dynamically selects the most efficient inference paths based on input data and available computational resources. The key to our approach is the reward function, which refines the decision-making process by evaluating the effectiveness of chosen paths, thus optimizing both the speed and quality of super-resolution outcomes.
Our research not only demonstrates the effectiveness of the policy network for handling super-resolution tasks but also reveals its potential for accelerating inference processes across various edge device applications. The technical progress made by ASRN offers fresh perspectives and possibilities for future research and practical applications in this rapidly evolving field.

Figure 1. Overview of ASRN: combines a backbone and a policy network with a specialized reward function to selectively execute neural network blocks during inference.

Figure 2. Comparative computational analysis of our approach against the baseline on the B100 dataset. The blue and orange lines represent the baseline and our method, respectively, with the x-axis showing image indices and the y-axis showing the corresponding FLOPS.

Figure 5. Visualization of representative images in B100 passing through different numbers of block units.Without the guidance of the path selection policy, each sample needs to go through the complete path in the network, i.e., 32 blocks.

Figure 6. Visualization of representative images in Urban100 passing through different numbers of block units. Without the guidance of the path selection policy, each sample needs to go through the complete path in the network, i.e., 32 blocks.

Figure 7. Visualization of representative images in Div2k passing through different numbers of block units.Without the guidance of the path selection policy, each sample needs to go through the complete path in the network, i.e., 32 blocks.

Table 1. Sensitivity analysis of reward function parameters.

Table 5. Analysis of our method regarding block usage and FLOPS across various datasets.