Symmetric Enhancement of Visual Clarity through a Multi-Scale Dilated Residual Recurrent Network Approach for Image Deraining

Abstract: Images captured during rainy days present the challenge of maintaining a symmetrical balance between foreground elements (like rain streaks) and the background scenery. The interplay between these rain-obscured images is reminiscent of the principle of symmetry, where one element, the rain streak, overshadows or disrupts the visual quality of the entire image. The challenge lies not just in eradicating the rain streaks but in ensuring the background is symmetrically restored to its original clarity. Recently, numerous deraining algorithms that employ deep learning techniques have been proposed, demonstrating promising results. Yet, achieving a perfect symmetrical balance by effectively removing rain streaks from a diverse set of images, while also symmetrically restoring the background details, is a monumental task. To address this issue, we introduce an image-deraining algorithm that leverages multi-scale dilated residual recurrent networks. The algorithm begins by utilizing convolutional activation layers to symmetrically process both the foreground and background features. Then, to ensure the symmetrical dissemination of the characteristics of rain streaks and the background, it employs long short-term memory networks in conjunction with gated recurrent units across various stages. The algorithm then incorporates dilated residual blocks (DRB), composed of dilated convolutions with three distinct dilation factors. This integration expands the receptive field, facilitating the extraction of deep, multi-scale features of both the rain streaks and background information. Furthermore, considering the complex and diverse nature of rain streaks, a channel attention (CA) mechanism is incorporated to capture richer image features and enhance the model's performance. Ultimately, convolutional layers are employed to fuse the image features, resulting in a derained image. An evaluation encompassing seven benchmark datasets, assessed using five quality metrics against various conventional and modern algorithms, confirms the robustness and flexibility of our approach.


Introduction
Rainy weather is a prevalent natural phenomenon. During such conditions, outdoor images often suffer from quality degradation because rain refracts light and obscures background objects [1]. This results in image blur, distortion, and loss of detail. These issues significantly affect subsequent image processing and analysis [2], which are vital to various computer vision systems [3], such as autonomous driving [4,5] and road surveillance [6]. As a result, the development of image-deraining algorithms has gained substantial attention from researchers globally.
In recent years, single image-deraining algorithms [7] have predominantly been classified into two categories: model-driven and data-driven. Conventional image-deraining techniques are primarily model-driven, relying on image decomposition, sparse coding, and priors based on Gaussian mixture models. However, since traditional methods establish models for specific rain streaks, they struggle to remove complex and diverse rain streaks. With the advent of convolutional neural networks (CNNs), generative adversarial networks (GANs), and semi-supervised and unsupervised learning techniques, image-deraining algorithms have shifted toward data-driven strategies that make extensive use of deep learning.
Among typical model-driven deraining algorithms, Li et al. [8] used bilateral filters to decompose a rainy image into high-frequency and low-frequency images. Afterward, they employed dictionary learning and sparse coding to remove rain streaks from the high-frequency image and, finally, combined it with the low-frequency information to obtain the derained image. However, this method relies heavily on the bilateral filter preprocessing, resulting in blurred background details. Jiang et al. [9] used a Gaussian mixture model (GMM) to model the rain layer and the background layer by calculating the distribution of rain streaks of different angles and shapes, thus achieving deraining. However, this algorithm can only effectively remove rain streaks in light rain and struggles with heavy or sudden rain.
Among data-driven deraining algorithms, Fu et al. [10] proposed DerainNet, based on CNNs, to extract features and achieve deraining. Furthermore, drawing on the residual network [11], they proposed the deep detail network to reduce the mapping range from input to output, making the learning process easier. Li et al. [12] used a recalibration network to progressively remove rain streaks at different stages and obtain a clean background image. Zhang et al. [13] applied the GAN [14,15] to image deraining and used an ensemble residual perceptual classifier to adapt to the rainwater density information. Although deep learning algorithms perform significantly better than traditional ones, some issues remain. The size and direction of the rain streaks are often ignored, which leaves residual rain, and the inability to distinguish rain streaks from background textures causes background details to be lost during rain removal.
To address the aforementioned issues, this paper proposes an image-deraining algorithm based on multi-scale dilated residual recurrent networks. The algorithm employs a dilated residual network (DRN) to extract multi-scale features and utilizes dilated convolution (DC) with different dilation rates to accomplish multi-scale rain streak removal. Between adjacent stages of rain streak removal, long short-term memory (LSTM) networks and gated recurrent unit (GRU) networks convey the mapping relationships of the rain streaks and the background, respectively. This ensures that both the rain streak information and the background detail information are extracted, so that deraining is achieved progressively and rain-free images are obtained.
The main contributions of this paper are as follows:

• We developed an image-deraining algorithm based on multi-scale dilated residual recurrent networks. The algorithm not only effectively eliminates rain streaks from images but also restores the intricate details of the background.

• We deployed convolutional activation layers (CAL) at the initial stage of the algorithm to extract the elementary features. Subsequently, we employed a combination of long short-term memory networks and gated recurrent units, which enabled an effective propagation of both the rain streaks' characteristics and the background details across different stages of the process.

• We incorporated DRB, composed of DC with three distinct dilation factors, to expand the receptive field and facilitate the extraction of the deep, multi-scale features of both the rain streaks and the background information. Additionally, we added a CA mechanism to capture richer image features and enhance the model's performance given the complex and diverse nature of rain streaks.

• We performed a comprehensive evaluation of the approach using seven benchmark datasets, assessed using five quality metrics against eighteen conventional and modern algorithms, verifying the robustness and flexibility of the proposed method.

Related Work
Traditional deraining methods regard rainy images as a combination of background and rain streaks. Kang et al. [16] decomposed images into high- and low-frequency components and, through dictionary learning and sparse coding, further learned rain streak information. Although this method effectively removed light rain streaks, it led to background blurring. Chen et al. [17] introduced exclusivity into sparse coding, separating the background layer from the nonlinear combination of the rain and background layers, thus achieving deraining. While this method retained a clean background, rain residuals remained. Li et al. [12] treated deraining as an image optimization process. By constructing iterative layers, the rain streaks were progressively removed from the background layer. Using prior information from a specific rain layer, the background texture details were removed from the rain streak layer. However, due to the diversity in the direction of the rain streaks, this method could not remove varied rain streaks and suffered from rain residuals. Li et al. [16] utilized a GMM to simulate the rain and background layers and, by constraining information, dynamically learned different rain streaks to achieve deraining. Kim et al. [18] constructed a robust learning-based framework that unfolds in three steps: first, image decomposition is accomplished using guided filters; next, a frequency-based haze and rain removal network is applied; and, finally, the image is restored based on an atmospheric scattering model using the predicted transmission maps and the rain-removed images.
In recent years, many researchers have utilized neural networks for image deraining. Fu et al. [19] proposed a deep detail network, which uses guided filters to decompose the image into detail and base layers and inputs the detail layer into the CNN for deraining. Yang et al. [20] proposed a network based on DC for the joint detection and removal of rain streaks, which detects the location of the rain streaks and uses a recursive framework for deraining, but suffers from the loss of background details. Yu et al. [21] introduced PHMNet, designed for single image deraining and built upon a two-branch, coarse-to-fine framework. Notably, they created a hybrid-modulated module within the two-branch structure, specifically designed to integrate and modulate the features of the rain-free layers and rain streaks. Li et al. [22] proposed a squeeze-and-excitation recursive network that takes into account the size of the rain streaks and removes them in stages through squeeze-and-excitation modules. Jiang et al. [23] introduced a multi-scale progressive fusion network that removes multi-scale rain streaks through a pyramid structure but results in blurry image backgrounds. Tang et al. [24] utilized dilation to consolidate contextual information effectively while preserving spatial resolution. Subsequently, they used a gated subnetwork to combine the intermediate features across various levels. To enhance the learning and application of the rain streaks, they embedded an LSTM module to create a link between the different recurrences, facilitating the transfer of knowledge about rain streaks from earlier stages to subsequent ones. Chen et al. [25] proposed an end-to-end multi-scale hourglass fusion network that accurately captures rain streak features through multi-scale extraction, hierarchical distillation, and information aggregation. Huang et al. [26] put forward a rain removal method underpinned by directional gradient priors, aiming to preserve the structural information of the original rain image to the maximum extent while effectively eliminating rain streaks. Initially, to address the issue of residual rain streaks, they constructed two directional gradient regularization terms on top of the sparse convolutional coding model to constrain the directional information of the rain streaks. Subsequently, they designed a multi-scale dictionary for convolutional sparse coding, which was incorporated into the rain layer coding within the directional gradient prior terms, to detect rain streaks of varying widths.
Zhang et al. [14] proposed a deraining network based on generative adversarial networks. Son et al. [27] proposed a two-stage network. The first stage creates low-resolution facial images, effectively removing heavy rain to enhance visibility. This is achieved through an interpretable image degradation network, specifically designed to predict physical parameters such as rain streaks, transmission maps, and atmospheric light. The second stage reconstructs high-resolution facial images from the low-resolution outputs generated in the first stage. To facilitate this, they utilize facial component-guided adversarial learning, which notably amplifies the expression of the facial structures but results in darker images. To address this issue, Cao et al. [28] proposed a gated multi-scale feature fusion two-stage density-aware network, designing three discriminators targeting color, gradients, and gray levels for the targeted removal of artifacts. Wei et al. [29] proposed a deraining network that first incorporates an attention mechanism into the generator. This directs the rain-removal process to focus primarily on areas near the rain line, which, in turn, helps preserve the background details. Second, a multi-scale discriminator is utilized, discriminating the produced image across different scales to enhance its quality. Lastly, they introduce a perceptual-consistency loss and an internal feature perceptual loss to diminish the artificial features in the generated image, thereby making it more visually convincing and realistic.

Proposed Method
This paper proposes a multi-scale dilated residual recurrent network for image deraining. Our methodology involves classifying rain into distinct layers and progressively eliminating them through a cyclical process. We use LSTM and GRU networks to transmit the relationships among the pixel points across different stages of the process and employ a DRN to extract the features at multiple scales.
The LSTM network is utilized to maintain a continuity of rain streak information across various stages, facilitating the removal of the different types of rain streaks. It takes inputs from the input convolutional layer, the rain streak layer from the preceding stage, and the background layer from the previous stage. This network is characterized by five gating components, including input, forget, and output gates, which modulate the flow of information. The GRU network is used to capture the dependencies over varying time scales, which is crucial in extracting the background information over successive stages. At each stage, it ingests the shallow features from the input convolutional layer, outputs from the background recurrent layer of the preceding stage, and outputs from the rain streak recurrent layer of the current stage. It has two primary gates, a reset gate and an update gate, which control how previous and current information are combined.
The DRN plays a significant role in identifying and eliminating rain streaks across various scales due to its capacity to capture multi-scale contextual information. We use dilated residual blocks, each comprising three dilated convolution activation layers with varied dilation rates, to extract multi-scale features effectively. Moreover, CA is incorporated into the LSTM network to bolster the extraction of rain streak information. This mechanism prioritizes certain aspects of the input data while reducing the emphasis on others, helping the model to concentrate on the salient features. Finally, we employ a loss function to assess the performance of the algorithm. We use the negative structural similarity (SSIM) as the network's loss function, which takes into account luminance, contrast, and structural metrics that align more closely with human visual perception.
We believe that our model takes a comprehensive approach to image deraining by maintaining the integrity of the background details while effectively removing the rain streaks.

Network Structure
Owing to the variety in both the size and direction of rain streaks, this paper classifies rain into distinct layers and employs a cyclical approach to progressively eliminate rain streaks of varying sizes and orientations. Since LSTM and GRU networks are capable of transmitting the relationships among the pixel points across different stages, and DC is adept at extracting features at multiple scales, this paper introduces an image-deraining algorithm based on a multi-scale dilated residual recurrent network. The overall structure is illustrated in Figure 1.
At every stage of the process, a convolutional activation layer is first utilized to extract the rudimentary features of the rain streaks. To make use of the valuable data from the preceding stage, an LSTM network is employed to establish the connections between distant pixel points, thereby guiding the subsequent stage of rain removal and facilitating the extraction of the multi-directional rain streak features. Following this, a dilated residual network is used to uncover the deep multi-scale features. This network is composed of three stacked DRB, each featuring three dilated CAL with varied dilation rates. This structure ensures the comprehensive extraction of the rain streaks across multiple scales. Building on this foundation, CA is applied to augment the extraction of the rain streak information. Ultimately, a convolutional layer is deployed to merge the rain streak features.
The background texture information is retrieved with an approach similar to that used for the rain streak features. The background layer employs a GRU network to relay information from one stage to the adjacent one. Additionally, given the richer texture information within the background layer, its dilated residual network incorporates five DRB. Within each stage, the LSTM and GRU networks manage the interaction between the rain streaks and the background. Collectively, the framework leverages a multi-scale dilated residual recurrent network to achieve deraining while maintaining the integrity of the background details. The fundamental structure of each stage is depicted in Figure 2.

Derain Model
In the field of image deraining, the prevailing models generally treat a rain image as the combination of two distinct constituents: the rain streaks and the background. The rain streaks are described by $n$ tiers of data, denoted as $x_r$, while the background is described by an equivalent $n$ tiers of data, denoted as $x_b$. The distributional characteristics of the rain determine how these two components are combined. Assuming that the rain streaks at a uniform depth exhibit a degree of homogeneity in terms of size and direction, they can be aggregated into single layers. The specific model is as follows:
$$y = \sum_{t=1}^{n} x_r^{t} + \sum_{t=1}^{n} x_b^{t} \tag{1}$$
In Equation (1), $y$ represents the observed rain image that we aim to analyze. The first term on the right side of the equation, $\sum_{t=1}^{n} x_r^{t}$, represents the rain streaks in the image, modeled as tiers of data $x_r^{t}$, where $t$ ranges from 1 to $n$. The second term, $\sum_{t=1}^{n} x_b^{t}$, represents the background of the image, which is also broken down into $n$ tiers of data, $x_b^{t}$. The overall aim of our model, therefore, is to decompose the observed rain image $y$ into these two components. This decomposition is the first step in image deraining, a process that removes the rain streaks from the image to recover a clean and clear background.

LSTM
The LSTM network, a type of recurrent neural network, is particularly well suited to processing rainy images owing to its ability to remember and propagate information over long sequences, which is crucial in tracking and removing rain streaks over successive stages. At the $t$-th stage, the LSTM network is fed with inputs from three sources: the input convolutional layer $u_r^{t}$, the rain streak layer from the preceding $(t-1)$-th stage $h_r^{t-1}$, and the background layer from the previous stage $h_b^{t-1}$. This configuration enables the LSTM network to maintain a continuity of rain streak information across various stages, thereby facilitating the removal of the different types of rain streaks. The LSTM network is characterized by five gating components: an input gate $i_r^{t}$, a forget gate $f_r^{t}$, a gating mechanism $g_r^{t}$, a cell unit $c_r^{t}$, and an output gate $o_r^{t}$. The input gate modulates the amount of information that is permitted to enter the cell unit, the forget gate determines how much information is retained from the preceding stage, and the output gate regulates the amount of information that is released from the cell unit. In the equations that define these gates, the symbol $*$ denotes 2D convolution, while $\sigma$ signifies the sigmoid function. The elements $W$ and $b$ correspond to the convolution matrix and the bias vector, respectively. The operator $\odot$ indicates element-wise multiplication. The term $h_r^{t}$ refers to the output of the rain streak recurrent layer at the $t$-th stage.
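As a reading aid, the gates described above can be written in the form of a standard convolutional LSTM. This is only a sketch under stated assumptions, not necessarily the exact formulation used here: the channel-wise concatenation $[\cdot]$ of the three inputs, the $\tanh$ activations, and the weight names $W_{ri}, W_{rf}, W_{rg}, W_{ro}$ are assumptions introduced for illustration.

```latex
\begin{aligned}
i_r^{t} &= \sigma\left(W_{ri} * [u_r^{t},\, h_r^{t-1},\, h_b^{t-1}] + b_{ri}\right) \\
f_r^{t} &= \sigma\left(W_{rf} * [u_r^{t},\, h_r^{t-1},\, h_b^{t-1}] + b_{rf}\right) \\
g_r^{t} &= \tanh\left(W_{rg} * [u_r^{t},\, h_r^{t-1},\, h_b^{t-1}] + b_{rg}\right) \\
o_r^{t} &= \sigma\left(W_{ro} * [u_r^{t},\, h_r^{t-1},\, h_b^{t-1}] + b_{ro}\right) \\
c_r^{t} &= f_r^{t} \odot c_r^{t-1} + i_r^{t} \odot g_r^{t} \\
h_r^{t} &= o_r^{t} \odot \tanh\left(c_r^{t}\right)
\end{aligned}
```

In this form, the forget gate scales the cell state carried over from stage $t-1$, the input gate scales the new candidate $g_r^{t}$, and the output gate scales what leaves the cell as $h_r^{t}$, which matches the roles described in the text.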

Gated Recurrent Unit
The GRU network, a variant of the recurrent neural network, is particularly adept at this task due to its capacity to capture dependencies over varying time scales, which is crucial in extracting background information over successive stages. A notable advantage of the GRU network is its reduced parameter count compared to the LSTM network, which makes it a more computationally efficient choice for transmitting information in the background layer.
At the $t$-th stage, the GRU network ingests the shallow features from the input convolutional layer $u_r^{t}$, outputs from the background recurrent layer of the preceding $(t-1)$-th stage $h_b^{t-1}$, and outputs from the rain streak recurrent layer of the current $t$-th stage $h_r^{t}$. This methodology enables the effective extraction of background information across multiple stages. The architecture of the GRU network includes a reset gate $r_b^{t}$ and an update gate $z_b^{t}$. The reset gate determines the degree to which information from the previous stage is combined with the input information before entering the cell unit. Conversely, the update gate controls the proportion of the hidden state information retained during the current stage. In the equations that define these gates, $\hat{h}_b^{t}$ represents the output of the hidden layer, and $h_b^{t}$ represents the output of the background recurrent layer at the $t$-th stage.
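For orientation, the reset and update gates described above can be written in the form of a standard convolutional GRU. This is a sketch under stated assumptions rather than the definitive formulation: the concatenation $[\cdot]$ of the inputs, the $\tanh$ activation, the weight names $W_{bz}, W_{br}, W_{bh}$, and the convention that $z_b^{t}$ weights the new candidate (some formulations swap $z_b^{t}$ and $1 - z_b^{t}$) are all assumptions.

```latex
\begin{aligned}
z_b^{t} &= \sigma\left(W_{bz} * [u_r^{t},\, h_b^{t-1},\, h_r^{t}] + b_{bz}\right) \\
r_b^{t} &= \sigma\left(W_{br} * [u_r^{t},\, h_b^{t-1},\, h_r^{t}] + b_{br}\right) \\
\hat{h}_b^{t} &= \tanh\left(W_{bh} * [u_r^{t},\, r_b^{t} \odot h_b^{t-1},\, h_r^{t}] + b_{bh}\right) \\
h_b^{t} &= \left(1 - z_b^{t}\right) \odot h_b^{t-1} + z_b^{t} \odot \hat{h}_b^{t}
\end{aligned}
```

Compared with an LSTM, this form has no separate cell state and one fewer gate, which is where the reduced parameter count comes from.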
In our ablation study, we evaluated the performance of a dual GRU that utilized residual mapping. The results demonstrated a quantitative improvement. However, the rain streak layer extracted by the dual GRU tends to incorporate certain textural contents from the background image. As a result, when too much is subtracted from the rainy image, the derained image may appear overly smooth. To advance our methodology, we implemented LSTM and GRU modules for the purpose of extracting rain streaks and predicting a clean background image, respectively. In the formulation at the $t$-th stage, $F_r$ and $F_x$ denote two interconnected modules, i.e., LSTM and GRU, utilized for the extraction of the rain streak layer and the clean background image layer, respectively. As illustrated in Figure 3, LSTM and GRU are employed to propagate the deep features throughout the rain streak and background image layers, respectively. In our study, the interaction between LSTM and GRU potentially enhances the efficiency of the deraining process. To this end, as illustrated in Figure 4, we suggest fostering an interaction between LSTM and GRU by propagating their hidden states across the stages. In Figure 4, the hidden state $h_t$ within the LSTM is propagated across the stages to enhance rain streak extraction. Concurrently, this state is also fed into the GRU, and the process is reciprocated. Specifically, the LSTM within $F_r$ at stage $t$ accepts the features from the previous input layer, which is represented as follows: $z_r^{t} = f_r\left(y, r^{t-1}\right)$.
Figure 4. A detailed view of the LSTM and GRU, showing how the hidden states $h_t$ and $h_r$ are not only propagated through the LSTM layer X and GRU layer R (as denoted by dashed lines) but also create an interaction between layer R and layer X (illustrated by solid lines). Note that the term 'Conv' here signifies convolutional matrices and bias vectors.

Dilated Residual Network
In the specific application of rain streak removal, the DRN exhibits significant utility. Given the considerable variation in size, shape, and orientation of the rain streaks within an image, they present a multi-scale challenge. The DRN's capacity to capture multi-scale contextual information allows it to effectively identify and eliminate the rain streaks across the various scales. Moreover, the residual connections inherent to the DRN help preserve the intricate details of the original image, ensuring that the rain streak removal process does not inadvertently compromise the image's quality.
The classic residual network (CRN) [30] consists of residual blocks, with each block utilizing convolutional layers that share the same kernel size. Its fundamental structure is depicted in Figure 5. While the CRN has been widely utilized in deraining tasks, the homogeneity of the convolutional kernel (CK) falls short in extracting the multi-scale features of the diverse rain streaks. DC extends standard convolution with a kernel size of $k$ by introducing $d - 1$ gaps between adjacent elements of the CK, where $d$ is the dilation factor. This modification expands the effective kernel size to $d(k - 1) + 1$ without adding parameters, thereby enlarging the receptive field [31]. Accordingly, this study introduces a DRN designed to possess a receptive field commensurate with the varying sizes of the rain streaks. This network is assembled by iteratively stacking DRB. Each of these blocks comprises three 3 × 3 DC activation layers with dilation factors of 1, 5, and 10, respectively. Through the use of DC, the receptive field of the network is enlarged, facilitating the effective extraction of the multi-scale features. Figure 6 shows the structure of the DRB.
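The effective kernel size formula $d(k - 1) + 1$ can be checked numerically for the three dilation factors used in a DRB. The short sketch below is illustrative only; the function names are hypothetical, and the stacked receptive field assumes that the three layers are applied in sequence with stride 1, which the text does not state explicitly.

```python
def effective_kernel_size(k, d):
    """Effective size of a k x k kernel with dilation factor d: d * (k - 1) + 1."""
    return d * (k - 1) + 1


def stacked_receptive_field(k, dilations):
    """Receptive field of stride-1 dilated convolutions applied one after another.

    Assumption: the layers are sequential; each layer widens the field by
    (effective kernel size - 1) pixels.
    """
    rf = 1
    for d in dilations:
        rf += effective_kernel_size(k, d) - 1
    return rf


dilations = [1, 5, 10]  # dilation factors of one DRB
sizes = [effective_kernel_size(3, d) for d in dilations]
print(sizes)                                   # [3, 11, 21]
print(stacked_receptive_field(3, dilations))   # 33
```

Under these assumptions, a single block already covers a 33 × 33 neighborhood while using only three 3 × 3 kernels, which illustrates why dilation is an inexpensive way to reach streaks of different sizes.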

Channel Attention
The attention mechanism, a notable advancement in deep learning, has proven to be a pivotal factor in improving the efficacy of many models. It operates by selectively prioritizing certain facets of the input data while diminishing the emphasis on others, thereby emulating the human cognitive process of allocating 'attention'. This mechanism allows the model to assign different weights to different segments of the input, so that it concentrates more on the salient features and less on the irrelevant ones.
In light of the rich information contained in rain streaks, a CA mechanism is integrated into the LSTM network to bolster the extraction of the rain streak information. In Figure 1, CA is placed right after the DRN to further extract the rain pattern information. To apply CA in our specific case, depthwise separable convolution (DSC), which consists of a depthwise convolution followed by a pointwise convolution, is utilized. The depthwise convolution extracts features within each channel separately, while the pointwise convolution combines the features across channels. Consequently, the attention mechanism employs DSC with a kernel size of 3 to distill the feature information from the rain streak channel. Within the CA module, the input features are first fused with information from the channel domain via DSC. This is followed by the integration of the global features through the average pooling layer. Ultimately, the spatial features are generated using a conventional convolution layer. The architecture of this mechanism is illustrated in Figure 7.
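To make the benefit of DSC concrete, the following sketch compares the parameter count of a standard convolution with that of a depthwise separable one. The kernel size of 3 comes from the text; the channel width of 32 and the function names are hypothetical values chosen only for illustration.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights plus biases of a standard k x k convolution."""
    return k * k * c_in * c_out + c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights plus biases of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in + c_in      # one k x k filter per input channel
    pointwise = c_in * c_out + c_out     # 1 x 1 conv that mixes the channels
    return depthwise + pointwise


channels = 32  # hypothetical channel width, not taken from the paper
print(standard_conv_params(channels, channels, 3))        # 9248
print(depthwise_separable_params(channels, channels, 3))  # 1376
```

For this assumed width, the separable form needs roughly one seventh of the parameters of a standard 3 × 3 convolution, while still filtering each channel and then combining the channels.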

Loss Function
In machine learning and deep learning, a loss function measures how well an algorithm models the provided data. If the algorithm's predictions diverge significantly from the actual results, the loss function yields a high value. Through optimization, the value of the loss function is progressively minimized, thereby enhancing the performance of the model.
In deep learning, the mean squared error (MSE) is frequently employed as a loss function. However, its application tends to induce background blurring in images. The SSIM, on the other hand, considers luminance, contrast, and structural metrics, which are more in line with human visual perception. In light of these considerations, this paper employs the negative SSIM as the network's loss function. For natural images, SSIM values typically lie between 0 and 1; a value closer to 1 indicates that the restored image closely resembles the original, signifying a superior result. The loss function is defined as follows:
$$\mathcal{L} = -\mathrm{SSIM}\left(x^{t}, x^{gt}\right)$$
where $x^{t}$ represents the image after rain removal and $x^{gt}$ represents the ground truth image without rain. The SSIM is computed as
$$\mathrm{SSIM}\left(x^{t}, x^{gt}\right) = \frac{\left(2\mu_{x^{t}}\mu_{x^{gt}} + c_1\right)\left(2\sigma_{x^{t}x^{gt}} + c_2\right)}{\left(\mu_{x^{t}}^{2} + \mu_{x^{gt}}^{2} + c_1\right)\left(\sigma_{x^{t}}^{2} + \sigma_{x^{gt}}^{2} + c_2\right)}$$
where $\mu_{x^{t}}$ represents the mean of $x^{t}$; $\mu_{x^{gt}}$ signifies the mean of $x^{gt}$; $\sigma_{x^{t}}^{2}$ denotes the variance of $x^{t}$; $\sigma_{x^{gt}}^{2}$ represents the variance of $x^{gt}$; $\sigma_{x^{t}x^{gt}}$ denotes the covariance of $x^{t}$ and $x^{gt}$; and $c_1$ and $c_2$ are small constants that keep the fraction stable when the denominator is close to zero.
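The negative SSIM loss can be illustrated with a small, dependency-free sketch. It is a simplification under stated assumptions: the statistics are computed globally over a flattened image instead of over local sliding windows, as practical SSIM implementations usually do, and the constants $c_1 = (0.01 L)^2$ and $c_2 = (0.03 L)^2$ for a data range $L$ are commonly used defaults, not values reported in this paper. The function names are hypothetical.

```python
def ssim_global(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """SSIM from global statistics of two equally sized, flattened images."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (k1 * data_range) ** 2  # assumed default constant
    c2 = (k2 * data_range) ** 2  # assumed default constant
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def negative_ssim_loss(derained, ground_truth):
    """Loss to minimize: the negative SSIM of the derained image and the ground truth."""
    return -ssim_global(derained, ground_truth)


ground_truth = [0.1, 0.5, 0.9, 0.3]   # toy pixel values in [0, 1]
derained = [0.2, 0.6, 1.0, 0.4]       # same structure, brighter by 0.1
print(negative_ssim_loss(ground_truth, ground_truth))
print(negative_ssim_loss(derained, ground_truth))
```

A perfect restoration gives a loss of about −1, and any deviation in mean, variance, or covariance raises the loss toward 0, so minimizing it pushes the derained image toward the ground truth.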

Experiment and Result Analysis
In this study, the rain removal algorithm based on a multi-scale dilated residual recurrent network is compared experimentally with several representative algorithms. These comparisons are conducted across different datasets, providing a comprehensive evaluation of the algorithm's performance in diverse scenarios. This experimental design allows for a robust assessment of how effectively the proposed and prior algorithms remove rain streaks from images.

Network Configuration
The proposed method is implemented in the PyTorch framework, a popular open-source machine learning library. Training is conducted on a personal computer equipped with an Intel Core i7 CPU operating at 3.6 GHz and 32 GB of RAM. For GPU-accelerated computation, an NVIDIA TITAN Xp graphics card is employed. This hardware configuration provides the computational power necessary to effectively train and evaluate the proposed model. During the training phase, images are randomly cropped from the dataset to 256 × 256 pixels. The model is trained with the Adam optimization algorithm for 100 epochs. The learning rate is initially set at 0.001 and is subsequently multiplied by a factor of 0.2 at the 31st, 51st, and 81st epochs.
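The learning rate schedule described above can be sketched as a small step function. The sketch assumes that "a factor of 0.2" means the rate is multiplied by 0.2 at each milestone, that epochs are counted from 1, and that each reduction takes effect from the start of the listed epoch; the function name is hypothetical.

```python
def learning_rate(epoch, base_lr=1e-3, milestones=(31, 51, 81), gamma=0.2):
    """Learning rate in effect during a given 1-indexed epoch."""
    drops = sum(1 for m in milestones if epoch >= m)  # milestones already reached
    return base_lr * gamma ** drops


for epoch in (1, 30, 31, 51, 81, 100):
    print(epoch, learning_rate(epoch))
```

Under these assumptions, the rate steps from 0.001 to 0.0002, then to 0.00004, and finally to 0.000008 for the last twenty epochs.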

Datasets
The rain removal algorithm is trained on two distinct datasets: Rain100L [20] and Rain100H [20]. Rain100L comprises 300 sets of training images and 200 sets of test images. The images within this dataset are characterized by relatively sparse rain streaks. Conversely, Rain100H is composed of 1700 sets of training images and 200 sets of test images. This dataset is distinguished by the presence of rain streaks in five unique directions, resulting in a comparatively denser distribution of streaks. In addition to these datasets, a test dataset, Rain128, is created by randomly selecting a diverse range of images from the Rain800 [14] dataset. This selection of datasets supports a comprehensive and robust evaluation of the rain removal algorithm's performance.
Furthermore, to substantiate the effectiveness of our model, we have incorporated real-world rainy image datasets, i.e., SPA-Data [32], Real147 [33], RIS [34], and RID [34]. These images serve solely for evaluation, providing a practical context in which to assess the performance of our model. The use of real-world images helps demonstrate the model's potential applicability and robustness in handling real-world deraining tasks.

Quantitative Metrics
Quality assessment is a vital component of image processing and analysis, providing a quantitative evaluation of an algorithm's effectiveness in preserving or improving image quality. A range of metrics has been devised for this purpose, each with its own strengths and limitations. These metrics can be broadly divided into two categories: no-reference algorithms, which do not require a reference image for comparison, and full-reference algorithms, which compare the processed image with an original, unaltered reference image. In this study, we use the following quality metrics for evaluation: SSIM [35], VP-NIQE [36], PSNR [37], SSEQ [38], and LPIPS [39]. SSIM analyzes the similarity of the corresponding images in terms of illumination, structure, and contrast. PSNR calculates the peak signal-to-noise ratio in decibels between two images. LPIPS, a CNN-based metric, assesses image quality using perceptual patch similarity. VP-NIQE [36] evaluates the image quality and predicts image error by integrating the fidelity and naturalness measurements of the natural images. The SSEQ model conceptually integrates structure and texture similarity and is capable of assessing the quality of a distorted image across multiple distortion categories.
Generally, higher SSIM and PSNR values indicate superior visual quality in the enhanced results. Conversely, lower SSEQ, LPIPS, and VP-NIQE values indicate less color distortion and a more pleasing perceptual effect.
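As a concrete illustration of the PSNR metric mentioned above, the following sketch computes it from the mean squared error using the standard definition PSNR = 10 · log10(MAX² / MSE). It is a simplified example and not the evaluation code used in this study: the function names are hypothetical, images are assumed to be flattened into lists of pixel intensities, and a peak value of 255 (8-bit images) is assumed.

```python
import math


def mse(reference, test):
    """Mean squared error between two equally sized pixel sequences."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in decibels (higher means closer images)."""
    err = mse(reference, test)
    if err == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / err)


if __name__ == "__main__":
    clean = [100, 100, 100, 100]
    derained = [110, 90, 100, 100]
    print(f"PSNR: {psnr(clean, derained):.2f} dB")
```

Because PSNR depends only on the pixel-wise error, it is usually reported together with a structural measure such as SSIM, as is done here.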

Ablation Study
To determine the effectiveness of each component of the proposed deraining network, we carried out a set of ablation studies. Each study is designed to assess the network's performance when one or more of its integral modules are removed or replaced. In particular, different modules are used for the rain streak recurrent layer and the background recurrent layer. Three configurations are considered:

• The first configuration utilizes LSTM for the rain pattern recurrent layer and GRU for the background recurrent layer (denoted as LSTM+GRU).
• The second configuration applies GRU for the rain pattern recurrent layer and LSTM for the background recurrent layer (denoted as GRU+LSTM).
• The third configuration employs GRU for both layers (denoted as GRU+GRU).
We executed the ablation experiments on the Rain200H dataset, keeping the configuration consistent across experiments to ensure a fair evaluation. The empirical results show that the configuration using LSTM for the rain pattern recurrent layer and GRU for the background recurrent layer performs best. The comparative results of these configurations are presented in Figure 8.

Analysis of Loss Function
In deep learning-based deraining methods, the training phase is crucial to the effectiveness of the model. During this phase, the model learns to differentiate between the rain streaks and the actual content of the image, with the goal of removing the rain streaks while retaining the original details of the image. Loss functions drive this learning process by measuring the discrepancy between the model's output and the ground truth image, and training aims to minimize this discrepancy. Because the choice of loss function can significantly influence the deraining model's performance, we evaluated three loss functions: negative SSIM, MSE, and a combination of MSE and SSIM (denoted MSE+SSIM). These loss functions were chosen for their known effectiveness in image processing tasks. SSIM measures the similarity between two images, which makes it well suited to ensuring that the model's output closely resembles the ground truth image. MSE is popular for its simplicity and efficacy in measuring the average squared difference between the estimated and actual values. The combined loss aims to exploit the strengths of both. The comparison of these loss functions and their impact on the model's performance is detailed in Figure 9. The SSIM loss offered superior performance compared to MSE and the combined MSE+SSIM loss, based on the lower loss value it produced and the more visually appealing deraining results. Hence, we selected SSIM as the primary loss function of our model. We believe this selection allows our model to generate more accurate and visually pleasing deraining results.
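The three loss functions compared above can be sketched as follows. This is a simplified illustration under stated assumptions rather than the losses used in training: SSIM is computed once over the whole image (a single global window) instead of over local windows, pixel values are assumed to lie in [0, 1], images are flattened into lists, and the `weight` that balances the two terms in the combined loss is a hypothetical parameter, since the text does not specify how MSE and SSIM are combined. The constants follow the usual SSIM definition, C1 = (0.01 L)² and C2 = (0.03 L)², where L is the data range.

```python
def _mean(values):
    return sum(values) / len(values)


def global_ssim(x, y, data_range=1.0):
    """SSIM computed over the whole image as one window (simplification)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = _mean(x), _mean(y)
    vx = _mean([(a - mx) ** 2 for a in x])                   # variance of x
    vy = _mean([(b - my) ** 2 for b in y])                   # variance of y
    cov = _mean([(a - mx) * (b - my) for a, b in zip(x, y)])  # covariance
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx * mx + my * my + c1) * (vx + vy + c2)
    return numerator / denominator


def mse_loss(output, target):
    """Average squared difference between output and ground truth."""
    return _mean([(o - t) ** 2 for o, t in zip(output, target)])


def negative_ssim_loss(output, target):
    """Minimizing -SSIM maximizes structural similarity."""
    return -global_ssim(output, target)


def combined_loss(output, target, weight=1.0):
    """MSE plus weighted negative SSIM; `weight` is a hypothetical knob."""
    return mse_loss(output, target) + weight * negative_ssim_loss(output, target)


if __name__ == "__main__":
    target = [0.1, 0.5, 0.9, 0.3]
    output = [0.2, 0.5, 0.8, 0.3]
    print("negative SSIM:", negative_ssim_loss(output, target))
    print("MSE:", mse_loss(output, target))
    print("combined:", combined_loss(output, target))
```

The negative SSIM loss is bounded below by -1, which is reached when the output matches the ground truth exactly, whereas MSE is bounded below by 0; loss values from different functions are therefore not directly comparable in magnitude.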

Quantitative Analysis
In this section, we conduct a comprehensive set of experiments to objectively assess the performance of the proposed model against previously established models. We use SSIM and PSNR as evaluation metrics, given their widespread acceptance in assessing the quality of image processing results. The proposed algorithm is benchmarked against a selection of state-of-the-art algorithms, including DID-MDN [13], ResGuideNet [40], and SSIR [33]. The algorithms are tested on two classic synthetic datasets, Rain100H and Rain100L, which are widely used for benchmarking, and on a more generalized test dataset, Rain128. The outcomes are presented in Table 1, where the performance of the proposed algorithm is highlighted in bold.
Table 1 reveals that the proposed model outperforms other algorithms on the Rain100H and Rain100L datasets in terms of both PSNR and SSIM. Specifically, on the Rain100H dataset, the algorithm improves the PSNR values relative to the DID-MDN [13], ResGuideNet [40], SSIR [33], NJS [41], CID [42], and TRNR [43] algorithms. It also increases the SSIM values relative to the DID-MDN [13], ResGuideNet [40], SSIR [33], and TRNR [43] algorithms. Moreover, the performance of recent research [42] is also better in terms of PSNR and SSIM. In conclusion, the algorithm proposed in this paper achieves substantial enhancements in the removal of rain streaks and the restoration of fine details. While the PSNR of the algorithm is marginally lower than that of TRNR [43] on the synthetic Rain128 dataset, subsequent experiments indicate that our model performs markedly better than TRNR [43] in deraining real-world images.
To further compare the proposed model with prior models, an additional experiment is conducted. This experiment involves a quantitative analysis using three metrics: PSNR, SSIM, and LPIPS. The comparative performance of several deraining algorithms is shown in Figure 10. As depicted in Figure 10, the proposed algorithm outperforms several established algorithms, including DGSM [44], ResGuideNet [40], DJRDR [20], RSECAN [22], SSIR [33], DHCN [45], DPNet [46], and SIDBRN [47], in terms of the PSNR values on the Rain100L and Rain100H datasets. The SSIM and LPIPS values of the proposed model also show significant improvements. This consistent advantage across multiple metrics and datasets underscores the robustness of the proposed model and its potential for practical application in image deraining.

Qualitative Analysis of Synthetic Rain Images
This section examines the effectiveness of the model on synthetic images. To assess the performance of the algorithm developed in this study, we chose four algorithms against which to compare its rain removal efficacy. The test used images from the Rain800 [14] dataset, with the results depicted in Figures 11 and 12. Under torrential rain conditions, the DGSM [44], ResGuideNet [40], and DJRDR [20] algorithms tend to remove the background along with the rain, as they are unable to distinguish between the rain streaks and background textures, resulting in a significant loss of the background details. The DGSM [44] algorithm, constrained by its single scale, is ineffective in accurately restoring the edge details. The ResGuideNet [40] algorithm also leaves behind some rain streaks. In contrast, our proposed algorithm is capable of distinguishing the rain streaks from the background textures, efficiently removing the rain and essentially restoring the background details, thereby improving the visual quality. In cases where the rain streaks are accumulated, the DGSM [44] and SIDBRN [47] algorithms exhibit noticeable rain remnants. While the DJRDR [20] algorithm manages to remove most of the rain streaks, some still remain. Our algorithm, however, not only removes the accumulated rain streaks effectively but also maintains the image's background details, resulting in superior visual outcomes.

Qualitative Analysis of Real Rain Images
To validate the applicability and effectiveness of the proposed method in real-world scenarios, we conducted an experiment using real rainy images randomly taken from the SPA-Data [32] dataset. The results in Figure 13 show that while prior methods can remove most of the rain obstruction, they also remove useful texture details in the background. In addition, these methods remove large raindrops only incompletely. As evidenced in Figure 13, the image processed by RSN [48] exhibits obvious rain streak residues. Moreover, both the DJRDR [20] method and the RSECAN [22] method suffer from excessive deraining, leading to some details becoming blurred and distorted. The second-to-last column in the visual results reveals that the MSPFN [23] method has a limited ability to remove dense rain streaks. In contrast, our proposed method removes most of the rain streaks while effectively maintaining the color and details of the background.
To further substantiate the robustness of our proposed method, we selected random real rainy images from the SPA-Data dataset for a subjective comparison. The rain patterns in the SPA-Data dataset are relatively sparse, and most learning-based methods demonstrate good deraining effects. However, as shown in Figure 13, the visual results of RSN [48] are low in contrast. Moreover, the RSECAN [22] and DJRDR [20] methods leave behind some rain streak residues, and the MSPFN [23] method exhibits an excessive smoothing problem and a restricted capacity to eliminate densely packed rain streaks. In contrast, our proposed method demonstrates commendable efficacy in removing rain streaks and preserving background details, further underscoring its robustness and potential for practical application. It is important to note that while the deraining algorithm proposed in this study achieved remarkable results on the real rainy image dataset, the complexity of the backgrounds in rainy images is highly variable. There can therefore be significant differences between rainy images captured in various settings, such as rural fields versus urban streets, in terms of background composition and rain streak patterns. This implies that the choice of training datasets for real rainy images can impact the performance of the final model across different scenarios. The deraining results show that the proposed algorithm removes a substantial proportion of rain streaks in real rainy images, compared to the conventional and learning-based algorithms depicted in Figures 12 and 13, and that it does so with minimal loss of detail and minimal color distortion. In addition, the proposed algorithm demonstrates a notable enhancement in deraining efficacy on real rainy images and exhibits a distinct advantage in both rain streak elimination and background preservation.
Moreover, we provide another two full-reference deep quality measures for comparison in order to assess the deraining performance of the proposed and prior models. The first is visual perception nature image quality evaluation (VP-NIQE) [36], a recent perceptual image error metric that can predict the image quality score and simulate the top-down structure of the human visual system in image perception. The second, SSEQ [38], offers accurate forecasts of human quality ratings on both textures and natural pictures, as well as resistance to minor geometric aberrations. We evaluate these two metrics on three datasets, namely Real147 [33], RIS [34], and RID [34]. The quantitative results are reported in Table 2. The results show that our method achieves the best performance in both metrics, which further illustrates the superiority of the proposed method.

Strengths and Weaknesses
The image-deraining algorithm proposed in this article demonstrates a number of significant strengths. By addressing the complex challenges associated with the removal of diverse rain streaks and the preservation of background details, it exhibits both robustness and flexibility. The approach effectively preserves and recovers the background details while eliminating various rain streaks. The integrity of the original image is maintained, a feature often compromised in other deraining methods. Importantly, the algorithm achieves a balance between the preservation of the background details and the removal of the rain streaks, which are often seen as competing objectives in this field. The use of a multi-scale dilated residual recurrent network, the integration of deep and multi-scale features, and the application of convolutional layers to produce the final derained image all contribute to its superior performance. The algorithm's ability to discern and extract features of both the rain streaks and background content is key to its success. However, as with all methods, some areas require further investigation and improvement. In more challenging scenarios, such as heavy rain or fog, the algorithm's performance needs to be enhanced. In future research, we aim to explore the potential of incorporating more advanced attention mechanisms and transformers to further enhance the model's discerning abilities. By investigating the application of this algorithm to other related tasks, such as image denoising or dehazing, we hope to broaden the scope and applicability of our research.

Conclusions
This article addresses the complex challenges associated with the removal of diverse rain streaks, particularly given the limitations of using a single scale. It also tackles the issue of the background details becoming blurred in images following the rain removal process. To address these challenges, we proposed an image-deraining algorithm that leverages a multi-scale dilated residual recurrent network. The algorithm first uses convolutional activation layers to distill the elementary features. It then deploys LSTM in conjunction with GRU. This combination propagates the attributes of the rain streaks and the background across various stages, thereby capturing the temporal dependencies and spatial correlations within the image. Subsequently, the algorithm integrates dilated residual blocks (DRB), composed of dilated convolutions (DC) with three distinct dilation factors. This integration expands the receptive field of the model, which facilitates the extraction of deep and multi-scale features and allows the algorithm to more effectively discern and extract the features of both the rain streaks and background contents. This multi-stage, multi-scale approach allows the algorithm to capture a wide range of features, from low-level details to high-level patterns. This comprehensive feature extraction is key to the algorithm's ability to separate the rain streaks from the background, thereby improving the quality of the derained images. Ultimately, the deraining results are integrated using convolutional layers to produce the final derained image. An extensive evaluation was conducted using ten benchmark datasets and twelve quality metrics to assess the performance of twenty-two conventional and modern algorithms. The results of this evaluation demonstrate that our approach exhibits both robustness and flexibility, outperforming other methods under a variety of conditions. A key strength of our approach is its ability to preserve and recover the background details in images while simultaneously eliminating various rain streaks. This capability is crucial as it ensures that the integrity of the original image is maintained, a factor that is often compromised in other deraining methods. The preservation of the background details and the removal of the rain streaks are often seen as competing objectives, with improvements in one area frequently leading to compromises in the other. However, our approach achieves a balance between these two objectives, demonstrating its effectiveness and potential for practical application.
Looking ahead, future research will focus on further improving the algorithm's performance, particularly in more challenging scenarios, such as heavy rain or fog. Additionally, we aim to explore the potential of incorporating more advanced attention mechanisms and multi-scale approaches to further enhance the model's ability to distinguish between heavy rain streaks and the background details. We also plan to investigate the application of this algorithm to other related tasks, such as image denoising or dehazing, thereby broadening the scope and applicability of our research.

Figure 1. Overall structure of the multi-scale dilated residual recurrent network.

Figure 2. Architecture of the multi-scale dilated residual recurrent network at stage t.

Figure 3. Structure of LSTM and GRU modules at stage t.

Figure 7. Structure of channel attention module.

Figure 8. Results of the ablation study.

Figure 9. Comparison results of different loss functions in terms of PSNR and SSIM: (a) loss function comparison using PSNR; (b) loss function comparison using SSIM. The SSIM loss function achieved better results than MSE and the combination of MSE and SSIM.

Figure 11.Visual results obtained by proposed and different prior deraining models using synthetic rain images.

Figure 12.Visual results obtained by proposed and different prior deraining models using synthetic rain images.

Figure 13.Visual results obtained by proposed and different prior deraining models using real rainy images.

Table 1. Performance of various deraining algorithms on the test datasets, measured by PSNR and SSIM.

Table 2. Quantitative analysis of prior models and ours in terms of SSEQ and VP-NIQE.