1. Introduction
The use of unmanned aerial vehicles (UAVs) for facility safety inspections has significantly transformed various industries, including civil engineering. UAV-based inspection offers cost-effectiveness and increased operational efficiency compared with conventional human-based safety inspections. Structural monitoring, which relies on reliable technological methods to assess infrastructure conditions, is a crucial practice for ensuring the long-term serviceability of targeted structures. Manual inspection by human workers, a widely employed approach for monitoring structures over the years, entails evaluating the condition of a structure based on subjective judgment. However, this method presents several challenges: it exposes inspectors to hazardous environments, consumes substantial time and financial resources, and yields results that may not be fully dependable.
In recent years, the active use of UAVs for structure monitoring has emerged as a promising solution to these limitations. This method employs UAVs equipped with high-resolution vision sensors or cameras to capture detailed images of a structure’s exterior. The UAV can be guided along a predefined flight path or controlled manually to conduct the monitoring operation effectively. Owing to these advantages, numerous studies have investigated the use of UAV-captured images for bridge inspection [1,2,3,4]. Depending on the type of camera mounted on the UAV, research has targeted different types of damage, such as crack detection using RGB images and the quantification of deterioration, such as concrete spalling and delamination, using thermal images [5,6]. Furthermore, comprehensive frameworks covering the pre-inspection to post-inspection phases of bridge assessment have been proposed [7,8]. Despite these efforts, challenges remain in addressing the quality of images captured in dynamic environments. In particular, image quality degradation due to environmental factors such as wind or low lighting has been highlighted as a hurdle that UAV-based bridge inspection technologies need to overcome. Among quality degradation phenomena such as noise, blur, low lighting, and defocusing, motion blur is difficult to overcome through post-processing. Especially when capturing large-scale structures such as bridges, operators tend to choose the shortest flight path to minimize operational time, because the available flight time is limited. As a result, environmental factors such as vibrations or wind acting on rapidly moving UAVs have been identified as issues that directly affect image quality.
To address the motion blur problem, some researchers have focused on the detection and removal of blurry images. To identify blurred areas within an image, earlier methods predominantly assessed the sharpness of image edges [9] or calculated the gradient magnitude [10]. Alternatively, Su et al. [11] adopted a distinctive approach to blur detection by combining several local characteristics that become apparent in the presence of blur, including the power spectrum slope, gradient histogram span, and maximum saturation. This approach also contributed substantially to image restoration by classifying blur into two distinct types, motion blur and out-of-focus blur, on the basis of image patches. Another method, proposed by Bang et al. [12], compares blur metric values derived from adjacent frames by applying moving averages. However, these techniques still depend on threshold settings, which can be difficult to determine precisely. In addition to blur detection, deblurring techniques aimed at directly improving the quality of blurry images have been actively studied. Most non-uniform deblurring methods start from the assumption that the observed blurred image (B) results from the convolution of an underlying sharp image (I) with a blur kernel (K), which is determined by a motion field.
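The convolution model and the gradient-magnitude cue described above can be illustrated with a minimal NumPy sketch. It is not taken from any of the cited works: the horizontal box kernel, the circular (FFT-based) convolution, and the helper names `linear_motion_kernel`, `blur`, and `mean_gradient_magnitude` are all assumptions chosen for illustration, and a spatially uniform kernel is used even though real UAV blur is generally non-uniform.

```python
import numpy as np

def linear_motion_kernel(length):
    """Hypothetical kernel K: uniform horizontal motion over `length` pixels."""
    return np.full((1, length), 1.0 / length)

def blur(sharp, kernel):
    """Compute B = I * K as a circular convolution via the FFT (an assumption)."""
    padded = np.zeros(sharp.shape, dtype=float)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    # Centre the kernel on the origin so the blurred image is not shifted.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(sharp) * np.fft.fft2(padded)))

def mean_gradient_magnitude(image):
    """Simple sharpness score: average gradient magnitude over the image."""
    gy, gx = np.gradient(image.astype(float))
    return float(np.mean(np.hypot(gx, gy)))

# Toy image I: bright surface with a one-pixel-wide dark vertical "crack".
sharp = np.ones((64, 64))
sharp[:, 32] = 0.0

blurred = blur(sharp, linear_motion_kernel(9))
score_sharp = mean_gradient_magnitude(sharp)
score_blurred = mean_gradient_magnitude(blurred)
print(score_sharp, score_blurred)
```

In this toy case the blurred image scores lower than the sharp one. A detector built on such a score would still need a cut-off value to separate the two, which is the threshold dependence noted above.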
Image deblurring approaches can be classified into blind and non-blind deconvolution methods. Non-blind deconvolution assumes prior knowledge of the blur kernels present in an image, whereas blind deconvolution is performed without any additional information on the blur kernels. Early work predominantly focused on non-blind deconvolution, often relying on algorithms such as the Richardson–Lucy algorithm, the Wiener filter, or the Tikhonov filter [13,14]. More recently, blind deconvolution approaches have been developed to handle situations in which the blur kernel is unknown. Gupta et al. [15] estimated the spatially non-uniform blur kernel resulting from camera vibration and deconvolved the image using a motion density function. Nonetheless, there is room for further refinement in the estimation of the blur kernel. Tai et al. [16] focused on spatially varying camera motion blur and proposed a projective motion deblurring model based on the Richardson–Lucy algorithm. However, their approach requires knowledge of a predefined camera motion path, which constrains its practical applicability. Sieberth et al. [17] introduced two deblurring methods, one based on the Fourier approach and the other on an edge-shifting technique. While both methods yielded outstanding results in deblurring aerial images, they were limited by the need for precise transformation parameters. Additionally, the edge-shifting approach faces challenges in detecting complex crack patterns. The non-uniform blind deblurring algorithms mentioned above deblur images proficiently. However, their efficacy hinges on a multitude of problem-specific parameters and configurations, including internal camera parameters, external motion functions, thresholds, and termination criteria. Consequently, these algorithms are difficult to implement and generalize in real-world scenarios. An early analysis examined the impact of motion blur in UAV-based images on feature matching using SURF and brute-force matching [17]. The results revealed that even minor camera displacements leading to image blur could have a significant adverse effect on image processing. Furthermore, related studies showed that the quality of images captured by UAVs directly influences the outcomes of visual inspections of structures [8]. They demonstrated that excluding blurry images prior to photogrammetric processing could greatly enhance feature detection and reconstruction. Additionally, they observed that motion blur caused by camera shake reduced image sharpness and decreased the accuracy of crack detection.
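To make the non-blind setting discussed above concrete, the following sketch applies a frequency-domain Wiener-type filter when the blur kernel is known exactly. It is an illustration under stated assumptions rather than the method of any cited work: the blur is spatially uniform and circular, the data are noise-free, and the constant `reg` is a hypothetical stand-in for the noise-to-signal ratio. The helper names `kernel_otf` and `wiener_deconvolve` are also illustrative.

```python
import numpy as np

def kernel_otf(kernel, shape):
    """Zero-pad the kernel to the image size, centre it, and return its FFT."""
    padded = np.zeros(shape)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(padded)

def wiener_deconvolve(blurred, kernel, reg=1e-3):
    """Non-blind restoration: the kernel is assumed to be known exactly."""
    H = kernel_otf(kernel, blurred.shape)
    G = np.fft.fft2(blurred)
    # reg is a hypothetical regularisation constant (assumed noise-to-signal ratio).
    F_hat = np.conj(H) * G / (np.abs(H) ** 2 + reg)
    return np.real(np.fft.ifft2(F_hat))

rng = np.random.default_rng(0)
sharp = rng.random((32, 32))          # stand-in for a sharp image
kernel = np.full((1, 5), 0.2)         # assumed 5-pixel horizontal motion blur

# Simulate the blurred observation with the same circular convolution model.
blurred = np.real(np.fft.ifft2(np.fft.fft2(sharp) * kernel_otf(kernel, sharp.shape)))
restored = wiener_deconvolve(blurred, kernel)

mse_blurred = float(np.mean((blurred - sharp) ** 2))
mse_restored = float(np.mean((restored - sharp) ** 2))
print(mse_blurred, mse_restored)
```

The restoration only works this well because the kernel passed to `wiener_deconvolve` matches the one that produced the blur. In the blind setting that kernel must itself be estimated, which is where the parameter sensitivity described above enters.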
Recently, owing to progress in deep learning, learning-based deblurring techniques have received significant attention. One such approach uses convolutional neural networks (CNNs) to estimate the blur kernel. Research has been conducted to develop a CNN capable of predicting blur kernels at the patch level in order to remove non-uniform blur from images [18]. Furthermore, fully convolutional networks (FCNs) have been explored for image deblurring through the estimation of motion flow [19]. Another approach uses multi-scale CNNs to deblur images without explicitly estimating the blur kernels [20,21]. Similarly, generative adversarial networks (GANs), particularly the DeblurGAN model [22,23], have shown promising results in image deblurring with reduced computation time, without relying on explicit blur kernel estimation. UAV-based crack images present an interesting case for analysis. Because many cracks are hairline-thin, they are easily distorted by blurring, which decreases the accuracy of crack detection. Nevertheless, the deblurring techniques mentioned above have not been specifically tested on UAV-based images of cracks, leaving room for improvement in this domain.
Liu et al. [24] conducted a study aimed at removing blur from crack images captured by UAVs using a DeblurGAN model. Given the difficulty of obtaining corresponding image pairs in real-world scenarios, blurry images generated artificially through motion blur simulation were employed. Notably, this research focused on the domain of crack images, using an existing DeblurGAN model as the generator network and the VGG16 network [25] as the discriminator network. This approach, distinct from previous studies, yielded impressive deblurring results and represented a significant advance in crack identification. Nevertheless, using artificially blurred images from motion blur simulation and clear images from the same frame as data pairs is limited in its ability to capture the full characteristics of real-world blurring.
In this study, the main goal is to minimize blurring during bridge monitoring using UAVs. The characteristics of blur in images taken in a static state differ from those of motion blur in images acquired from UAVs in terms of the shape and size of the blur kernel mentioned earlier. Technological solutions, such as UAV speed control, camera vibration control, and sufficient illumination, exist for the suppression of motion blur affected by multiplicative artifacts. However, more effective solutions are required, because motion blur can still occur during filming owing to flight time constraints and instability imposed by the UAV battery limit, flight path deviation in areas where the GPS signal is shaded, and errors in the shooting angle.
Therefore, this study applies GAN-based image deblurring networks to the UAV image domain and differs from existing studies in that it addresses the problem through image post-processing rather than a hardware approach. Typically, employing a GAN for image deblurring requires a dataset of paired sharp and blurred images [26]. However, for images captured by UAVs, reference images are often unavailable, so blurred images must be synthesized artificially by combining consecutive frames [27]. Nevertheless, this process may introduce discrepancies between the artificially synthesized blur characteristics and those occurring naturally. To address this, this paper connects a module that learns deblurring with a module that learns blurring characteristics and verifies the applicability of this combination in the UAV image domain.
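The frame-combination idea mentioned above can be sketched as follows. This section does not specify how the frames are combined, so the sketch assumes a plain temporal average of an odd number of consecutive frames, with the middle frame taken as the sharp counterpart. The function name `synthesize_blur_pair` and the toy data are hypothetical.

```python
import numpy as np

def synthesize_blur_pair(frames):
    """Build one (blurred, sharp) training pair from consecutive frames.

    frames: array of shape (T, H, W) with T odd. Averaging is an assumption;
    the source only states that consecutive frames are combined.
    """
    frames = np.asarray(frames, dtype=float)
    if frames.ndim != 3 or frames.shape[0] % 2 == 0:
        raise ValueError("expected an odd number of frames with shape (T, H, W)")
    blurred = frames.mean(axis=0)          # temporal average mimics exposure
    sharp = frames[frames.shape[0] // 2]   # middle frame as the sharp reference
    return blurred, sharp

# Toy sequence: a bright dot moving one pixel to the right per frame.
frames = np.zeros((5, 8, 8))
for t in range(5):
    frames[t, 4, 1 + t] = 1.0

blurred, sharp = synthesize_blur_pair(frames)
print(blurred[4])   # the dot is smeared over five pixels
print(sharp[4])     # the dot sits at a single pixel
```

The smear produced this way follows the sampled frame positions exactly, whereas a real exposure integrates continuous motion together with sensor effects. This is one source of the discrepancy between synthesized and natural blur noted above.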
In other words, this paper makes the following contributions. First, recognizing the domain difference between artificially synthesized blurry images and actually captured blurry images, it generates synthesized blurry images that closely resemble real-world blurry images. Second, it trains the model using these realistic synthesized blurry images paired with actual sharp images. Third, it employs the trained GAN model to remove blur from UAV images used for bridge inspection and validates its effectiveness using image quality metrics and a deep learning model for object detection.
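This section does not name the image quality metrics used for validation. As one commonly used example of a full-reference metric, the sketch below computes the peak signal-to-noise ratio (PSNR) between a reference image and an estimate. The choice of PSNR, the function name `psnr`, and the assumed intensity range of [0, 1] are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB; max_value is the assumed peak intensity."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")   # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

# Toy check: a constant error of 0.1 on a [0, 1] scale gives an MSE of 0.01.
reference = np.zeros((4, 4))
estimate = np.full((4, 4), 0.1)
value = psnr(reference, estimate)
print(round(value, 6))
```

A full-reference metric of this kind requires a sharp ground-truth image, which is often unavailable for real UAV footage. This is one reason a downstream measure such as the object detection model mentioned above is a useful complement.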