Evaluation and Selection of Video Stabilization Techniques for UAV-Based Active Infrared Thermography Application

Unmanned Aerial Vehicles (UAVs) that can fly around an aircraft carrying several sensors, e.g., thermal and optical cameras, to inspect the parts of interest without removing them can significantly reduce inspection time and cost. One of the main challenges in UAV-based active InfraRed Thermography (IRT) inspection is the UAV's unexpected motions. Since active thermography is mainly concerned with the analysis of thermal sequences, unexpected motions can disturb the thermal profiling and cause data misinterpretation, especially in an automated processing pipeline for such inspections. Additionally, in scenarios where post-analysis is to be performed by an inspector, the UAV's unexpected motions can increase the risk of human error, data misinterpretation, and incorrect characterization of possible defects. Therefore, post-processing is required to minimize or eliminate such undesired motions using digital video stabilization techniques. A number of video stabilization algorithms are readily available; however, selecting the best-suited one is itself challenging. This paper therefore evaluates video stabilization algorithms for mitigating undesired UAV motion and proposes a simple method to find the best-suited stabilization algorithm as a fundamental first step towards a fully operational UAV-IRT inspection system.


General UAV Applications
The use of Unmanned Aerial Vehicles (UAVs) for the remote inspection of large and/or difficult-to-access areas has grown significantly in the last few years thanks to their flexibility of movement and their ability to carry multiple sensors. Constant technological evolution has made UAVs more affordable, easier, and safer to deploy. Moreover, recent developments have produced a variety of sensors for UAV applications with low weight, low power consumption, and improved performance, allowing multiple sensors to be flown at the same time. The UAV-related scientific literature is extensive and covers a wide variety of applications, including precision agriculture [1], traffic analysis [2], 3D mapping/modeling [3], archeological exploration [4], surveillance [5], public safety [6], and mining and air pollution monitoring [7]. The list is vast and rapidly growing. In some cases, compared to traditional technologies, UAV-based survey systems offer better image spatial resolution (e.g., compared to satellites), and/or are much faster (e.g., compared to ground surveys in remote areas). Moreover, UAV flight operation is becoming more automated, and image processing and data fusion tools are continuously evolving [8].

Passive Thermography and UAV Applications
UAVs and Infrared Thermography (IRT) are a perfect match for the contactless survey of thermal phenomena over large and/or otherwise inaccessible areas, and for easing the inspection of large structures that are difficult to access. The use of UAV-IRT systems has been explored for numerous applications. In most cases, the passive approach has been employed, i.e., the observation of thermal phenomena without the use of external energy stimulation, with the assumption that the features of interest (plants, building materials, photovoltaic cells, people, etc.) will naturally produce thermal gradients that can be isolated from the background. This assumption is valid in many cases, such as building inspection [9] (e.g., detection of thermal bridges, air leakages, moisture, or humidity); precision agriculture [10] (e.g., monitoring nutrient levels or lack of water in crop fields); quality assessment of large structures [7] (e.g., inspection of photovoltaic panel farms and wind turbines); or surveillance applications [11] (e.g., people or animal tracking).
Heat transfer is a complex and transient phenomenon that depends on a combination of factors. This can be used advantageously in some cases, such as for detecting certain anomalies (e.g., porosity, fracturing and weathering of rocks, soil slopes, landslide hazard, etc.) [12]. In other cases, anomalies could be missed if the inspection is not performed at the "correct" time. Building inspection is a good example of this, where the temperature of building materials fluctuates with day/night and seasonal cycles, and also with the weather. Solar Loading Thermography (SLT) [13] exploits the periodic solar irradiation (day and night fluctuations) to retrieve in-depth information about surface and sub-surface anomalies (e.g., cracks, delaminations, thermal bridges, etc.). However, it does so at the expense of a long acquisition time, at least 24 h, to obtain a complete "view" of the thermal behavior of different materials.
Another example of passive thermography is the detection of water ingress in honeycomb aircraft structures after landing [14], based on the principle that, if water is present inside the honeycomb cells, it takes longer to warm up (thaw) after landing than other materials (aluminum, Nomex, composites) and appears as cold spots in thermal images.
A different situation is encountered when the feature of interest is at approximately the same temperature as its background. This is the situation found during the inspection of aeronautical components during production or in-service, where typical anomalies can be difficult to detect visually (e.g., cracks, impact damage) and/or can be situated at a certain depth inside the materials (e.g., delaminations, internal damage, liquid ingress). In such cases, the active thermography approach is better suited as explained in the following section.

Active Thermography and UAV Applications
In the case of structures where the features of interest are under the surface and there is no naturally occurring thermal gradient, the passive approach is seldom useful. In those cases, it is far more practical and effective to stimulate the structures to be inspected with a controlled energy source and to use data processing to improve the results [15].
On one hand, the inspection of large structures with the purpose of defect detection and characterization (i.e., determination of the size, depth, or thermo-physical properties) requires careful control of the input energy (duration, waveform type) and data recording (frame rate, time window) to exploit the relationship between the heat transfer rate and the appearance of possible sub-surface anomalies (shallow defects appear earlier and with higher thermal contrast than deep defects). On the other hand, classical IRT experimental setups allow the inspection of relatively small surfaces at a time (the larger the area, the lower the spatial resolution). A map of the complete inspected surface can be reconstructed from several individual inspections (i.e., mosaicking) [16].
Alternatively, large structures can be inspected using a dynamic configuration such as Line Scan Thermography (LST) [17], where the camera and the heat source move in tandem (the camera records thermograms right after heating) with respect to the surface of the component, which is normally static while being inspected. This can be performed, for example, by mounting a camera and a source on a robot or a 2-axis actuator. LST allows inspecting large and/or complex-shaped components faster than classical static IRT. It is an excellent option for quality control during production. For in-service inspection, the ideal situation would be to inspect an aircraft without the need to remove any component. An LST system would require in this case a huge robotic arm or several smaller robots properly installed and distributed to cover all the areas that need to be inspected. Alternatively, a dynamic system moving "freely" around the aircraft and inspecting all the areas of interest in a fast and effective manner can be conceived. This is where a UAV-based active IRT system becomes interesting. Mavromatidis et al. investigated the use of UAVs with active thermography for the inspection and estimation of thermophysical properties of building materials [18]. The authors demonstrated the feasibility of a flash-based system and pointed out the need either to improve UAV stability during the acquisition or to develop stabilization post-processing methods.
Although unexpected motions like sudden spikes may have little to no impact on the detection of large and/or shallow damage, detection of small and/or deeper damage often requires further processing. This is illustrated in Figure 1, in which the raw temperature data sequence is processed by pulsed phase thermography or PPT [19] (pixel by pixel through time) to obtain phase profiles that are put together to reconstruct phase images (phasegrams), which significantly improves defect detection. Any undesirable motion can disturb the pixel-wise alignment of consecutive frames, adding errors to the analysis of the temperature evolution used for damage detection, which is often already noisy. Therefore, video stabilization is required as a first step to mitigate any undesired motions prior to the application of a signal processing technique (e.g., PPT as exemplified in Figure 1), thereby improving the damage detection capabilities of the UAV-based active thermography inspection technique.
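To illustrate why pixel-wise alignment matters for PPT-style processing, the following minimal sketch computes a phase image by taking a discrete Fourier transform of each pixel's temperature history. It is only an illustration: the array layout (frames, rows, columns), the function name `phasegram`, the choice of frequency bin, and the synthetic cooling curves are assumptions and are not taken from [19].

```python
import numpy as np

def phasegram(sequence, freq_bin=1):
    """Return the phase image (radians) at one frequency bin.

    sequence: array of shape (n_frames, rows, cols) of temperature values.
    """
    seq = np.asarray(sequence, dtype=float)
    spectrum = np.fft.rfft(seq, axis=0)   # FFT of each pixel's time history
    return np.angle(spectrum[freq_bin])   # per-pixel phase at the chosen bin

# Synthetic example (assumed data): all pixels cool exponentially,
# and one pixel cools more slowly, mimicking a sub-surface anomaly.
t = np.arange(64)
seq = np.exp(-t / 10.0)[:, None, None] * np.ones((1, 4, 4))
seq[:, 2, 2] = np.exp(-t / 20.0)
phase = phasegram(seq)
```

Because each phase value is built from a single pixel's history, a frame that is shifted by UAV motion mixes the histories of neighboring pixels, which is the reason stabilization is applied before this step.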

Video Stabilization for UAV Applications
Video stabilization methods are primarily based on either mechanical/optical or digital techniques. In mechanical/optical stabilization, the camera motion is detected and measured by internal sensors such as accelerometers and gyroscopes. Motion compensation is done by mechanical/optical means, i.e., by using a microcontroller to direct small linear motors to move the image sensor, or optically by shifting the lens [20]. Mechanical/optical stabilization is usually built in as a part of the camera system. Digital video stabilization, on the other hand, estimates the motion between two consecutive frames and shifts the frames to compensate for the undesired motion. The advantages of digital image stabilization are that there are no moving components and that different algorithms can be applied to improve the stabilization.
Digital image stabilization for UAV applications is not new. Shen et al. used a block matching technique with polynomial smoothing for stabilization [21]. Wang et al. used corner point detection and matching with a cubic spline for smoothing [22]. In both cases, translations and rotation were used for evaluation. Hong et al. provided a multiresolution video stabilization algorithm based on the Scale Invariant Feature Transform (SIFT) and the Haar wavelet decomposition algorithm; processing time and accuracy were used to quantify the improvement [23]. Rahmanair et al. used Speeded-up Robust Features (SURF) for motion estimation and a Kalman filter to stabilize unstable UAV videos used to detect moving objects [24]. Walha et al. used SIFT and a Kalman filter with a median filter for smoothing and stabilization [25]. Zhou and Ansari [26] compared SIFT and SURF for motion estimation between frames and used Motion Vector Integration (MVI) with adaption damping proposed in [27] for smoothing. Marcenaro et al. used grid and feature-based methods to estimate motion between two consecutive frames [28].
To quantify the quality of stabilized videos, Peak Signal to Noise Ratio (PSNR) and Interframe Transformation Fidelity (ITF) were used by Walha, Zhou, Marcenaro, and others [27][28][29][30][31][32], just to name a few. PSNR and ITF are image quality measurements based on Mean Squared Error (MSE), which is a pixel-by-pixel comparison of two images and does not take into account any changes in luminance and contrast, which are expected to vary due to heating and cooling of the specimen during an active thermography inspection. To evaluate the performance of the video stabilization algorithms, Multi-Scale Structural Similarity (MS-SSIM) is used in this work instead of the commonly used PSNR and ITF. MS-SSIM was chosen because it takes into account luminance, contrast, and structural information between two images and compares them at different scales, where at each additional scale the images are passed through a low-pass filter and down-sampled by 2 from the previous scale, providing a more advanced image quality measure [31] than PSNR and ITF.
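The sensitivity of MSE-based measures to brightness changes can be seen with a small sketch. The code below uses the usual definitions (PSNR from the MSE of two images, ITF as the mean PSNR over consecutive frame pairs); the function names, the 8-bit peak value of 255, and the synthetic "heating" sequence are assumptions made for illustration.

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """Peak Signal to Noise Ratio in dB between two equally sized images."""
    mse = np.mean((np.asarray(a, float) - np.asarray(b, float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def itf(frames, peak=255.0):
    """Interframe Transformation Fidelity: mean PSNR of consecutive frames."""
    pairs = zip(frames[:-1], frames[1:])
    return float(np.mean([psnr(f0, f1, peak) for f0, f1 in pairs]))

# Assumed test data: a perfectly static scene whose brightness rises by
# 5 grey levels per frame, as uniform heating might cause.
rng = np.random.default_rng(0)
base = rng.integers(0, 200, size=(32, 32)).astype(float)
heating = [base + 5.0 * k for k in range(4)]
static = [base.copy() for _ in range(4)]

itf_heating = itf(heating)   # finite, although nothing has moved
itf_static = itf(static)     # infinite: frames are identical
```

In this example the scene is perfectly stable in both sequences, yet the heated one receives a lower ITF, which is the behavior that motivates a structure-aware measure for active thermography.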

Experimental Setup
Two sets of experiments were performed, where the thermal and optical videos in both experiments were acquired from a DJI Matrice 210 RTK UAV equipped with a Zenmuse X4S (FC6510) optical camera and a Zenmuse XT thermal camera. In the first experiment, the UAV was flown over three carbon fiber specimens that were flat, curved, and trapezoidal-shaped, as shown in Figure 2. The UAV was navigated manually by an experienced pilot in an indoor facility, maintaining a height of approximately 1.5 m above the specimen while acquiring optical and thermal footage at 1920 × 1080 pixels and 720 × 480 pixels, respectively, for approximately 25 s. The optical and thermal videos were acquired at 24 frames per second (fps) and 30 fps, respectively; however, the thermal videos were downsampled to 24 fps to match the optical videos' frame rate for future image registration purposes. The first experimental data set was used to develop and validate the video stabilization and selection method.
The second experiment was conducted on a Nomex honeycomb core carbon-fiber skin sandwich aircraft part containing manually crafted undersurface defects (holes) at different depths with various shapes and sizes, which are provided in Table 1. The aircraft part was inspected using UAV-based active thermography in an indoor environment using the same UAV and camera setup as the first experiment. The specimen was constantly heated using two halogen flash lamps, as shown in Figure 3. The UAV was navigated manually by an expert pilot, following a predefined flight pattern at three different altitudes of 1.5, 2, and 3 m above the specimen. Only the thermal videos were processed, as the purpose of this experimental set was to evaluate a preliminary drone-based active thermography inspection technique, as well as to further validate the video stabilization algorithm and selection method.

Methodology
In this paper, a comparative analysis of various smoothing techniques is conducted to develop a method for finding the most suitable stabilization option for minimizing the effect of undesired UAV motions. For this purpose, the video stabilization process pipeline described in Figure 4 was implemented in Python 3.7.6 with the OpenCV library, based on the flow suggested by Thakur [32].
Sensors 2021, 21, x FOR PEER REVIEW 6 of 16
The process flow shown in Figure 4 can be broken down into seven major steps. First, the desired number of strongest corners or features, shown as (×) in Figure 5, are extracted from frame (f_i) using the Shi-Tomasi method [33]. Second, the extracted features from frame (f_i) are matched and tracked in the consecutive frame (f_i+1) using Lucas-Kanade optical flow [34], as shown by the dotted lines connecting the features in Figure 5. The number of features is selected such that no frames are skipped. For the first experimental set, 100 features could be reliably tracked for the optical videos as compared to 50 features for the thermal videos; this is the result of the lower spatial resolution of the thermal camera, which captures fewer details than the optical one, as can also be seen in Figure 5. For both optical and thermal videos, a minimum distance of 20 pixels was set to minimize feature clustering around a single or a few strong features. As for the second experiment, 40 features could be reliably tracked with a minimum feature distance of 20 pixels for all three heights. Third, an affine transformation matrix is constructed from the features' movement to find the overall inter-frame motion in the x-direction, y-direction, and rotation between frames (f_i) and (f_i+1). Fourth, all the inter-frame motions are compiled to retrieve the global trajectory of the UAV in the x-direction, y-direction, and rotation. Fifth, different algorithms are used for smoothing the global trajectory. Sixth, individual frames are shifted based on the difference between the original trajectory and the smoothed global trajectory. Seventh, a stabilized video is constructed from the shifted frames.
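The third, fourth, and sixth steps can be sketched with NumPy alone, assuming that the matched feature coordinates from the first two steps are already available. This is a simplified illustration and not the implementation used in this work: the function names are hypothetical, the inter-frame motion is restricted to rotation plus translation and solved in closed form by least squares, and the correction rule (inter-frame motion plus the difference between the smoothed and original trajectory) is one common choice assumed here.

```python
import numpy as np

def estimate_motion(pts_prev, pts_next):
    """Least-squares rotation + translation mapping pts_prev onto pts_next.

    Both inputs are (N, 2) arrays of matched feature coordinates.
    Returns (dx, dy, dtheta) with dtheta in radians.
    """
    p = np.asarray(pts_prev, dtype=float)
    q = np.asarray(pts_next, dtype=float)
    pc, qc = p - p.mean(axis=0), q - q.mean(axis=0)
    # Closed-form optimal 2D rotation angle for centered point sets.
    dtheta = np.arctan2(np.sum(pc[:, 0] * qc[:, 1] - pc[:, 1] * qc[:, 0]),
                        np.sum(pc[:, 0] * qc[:, 0] + pc[:, 1] * qc[:, 1]))
    c, s = np.cos(dtheta), np.sin(dtheta)
    rot = np.array([[c, -s], [s, c]])
    dx, dy = q.mean(axis=0) - rot @ p.mean(axis=0)
    return dx, dy, dtheta

def global_trajectory(motions):
    """Step 4: cumulative sum of inter-frame (dx, dy, dtheta) rows."""
    return np.cumsum(np.asarray(motions, dtype=float), axis=0)

def corrected_motions(motions, smooth):
    """Step 6: per-frame motion after applying trajectory smoothing.

    smooth is any callable mapping a trajectory array to its smoothed version.
    """
    motions = np.asarray(motions, dtype=float)
    trajectory = global_trajectory(motions)
    return motions + (smooth(trajectory) - trajectory)

# Assumed example: 50 points rotated by 0.02 rad and shifted by (3, -2) px.
rng = np.random.default_rng(1)
pts = rng.uniform(0, 100, size=(50, 2))
c, s = np.cos(0.02), np.sin(0.02)
moved = pts @ np.array([[c, -s], [s, c]]).T + np.array([3.0, -2.0])
dx, dy, dtheta = estimate_motion(pts, moved)
```

Each corrected (dx, dy, dtheta) row would then be turned back into a 2 × 3 transformation matrix and used to warp the corresponding frame.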
Shifting a frame for stabilization so that it fits within the desired video size generates undefined regions of black pixels known as Blank Borders (BB), such as the one shown in Figure 6. For simplicity, the stabilized optical and thermal frames from both experimental sets were enlarged to 130% of their initial size to minimize BB in this work.
The BB was calculated for each frame by first converting the color image into grayscale, and then into a binary image by setting the threshold to zero. The binary image was used to count the black pixels, and this count was divided by the total number of pixels to obtain the BB of each frame. The average BB was then calculated to indicate the overall BB content of the stabilized video.
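The blank-border calculation can be sketched as follows. The function names are hypothetical, and a plain channel average stands in for the grayscale conversion, which is an assumption; any conversion that maps pure black to zero gives the same count.

```python
import numpy as np

def blank_border_fraction(frame):
    """Fraction of exactly black pixels in one stabilized frame."""
    frame = np.asarray(frame)
    # Channel average used as grayscale conversion (assumption).
    gray = frame.mean(axis=2) if frame.ndim == 3 else frame
    binary = gray > 0            # threshold at zero: True = image content
    return 1.0 - binary.mean()   # share of pixels that are black

def average_blank_border(frames):
    """Mean blank-border fraction over all frames of a video."""
    return float(np.mean([blank_border_fraction(f) for f in frames]))

# Assumed example: a 100 x 200 color frame shifted right by 20 columns,
# leaving a black strip that covers 10% of the image.
frame = np.full((100, 200, 3), 128, dtype=np.uint8)
frame[:, :20, :] = 0
bb = blank_border_fraction(frame)
```

Note that with a zero threshold any genuinely black scene pixel is also counted as border, so the value is an upper estimate when the scene itself contains pure black.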
Several smoothing techniques, namely Simple Moving Average (SMA) [35], Exponential Moving Average (EMA) [35], Gaussian Filter (GF) [36], Linear Regression (LR) [37], Support Vector Regression-Linear Regression (SVR-LR) [38], and a low-pass Butterworth filter (LBW) [39], were used to stabilize both optical and thermal videos. For SMA and EMA, three different window sizes of 1 s (second), 3 s, and 5 s were used. Since both optical and thermal videos were processed at 24 frames per second (fps), 1 s, 3 s, and 5 s refer to window sizes of 24, 72, and 120 frames, respectively. Similarly, for GF, standard deviations of 24, 72, and 120 frames were used and are referred to as GF-1 s, GF-3 s, and GF-5 s, respectively. Default settings were chosen for SVR-LR and LR. As for the LBW, a 5th-order filter with a cut-off frequency of 1 Hz was selected to remove any high-frequency motion. These represent only a handful of smoothing techniques, with limited parameter settings. Since the focus of this work is to develop a methodology for selecting an optimal smoothing algorithm for stabilizing UAV-based active IRT videos, these algorithms are deemed sufficient to demonstrate the concept.
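Three of the smoothing techniques (SMA, EMA, and GF) can be sketched with the window sizes and standard deviations stated for 24 fps video. The details below are assumptions rather than the settings used in this work: the moving average is centered with edge padding, the EMA uses the common smoothing factor 2 / (window + 1), the Gaussian kernel is truncated at four standard deviations, and the test trajectory is synthetic.

```python
import numpy as np

FPS = 24  # both optical and thermal videos were processed at 24 fps

def sma(x, window):
    """Centered simple moving average with edge padding (assumption)."""
    x = np.asarray(x, dtype=float)
    left = window // 2
    padded = np.pad(x, (left, window - 1 - left), mode="edge")
    return np.convolve(padded, np.ones(window) / window, mode="valid")

def ema(x, window):
    """Exponential moving average, alpha = 2 / (window + 1) (assumption)."""
    x = np.asarray(x, dtype=float)
    alpha = 2.0 / (window + 1)
    out = np.empty_like(x)
    out[0] = x[0]
    for i in range(1, len(x)):
        out[i] = alpha * x[i] + (1.0 - alpha) * out[i - 1]
    return out

def gaussian(x, sigma, truncate=4.0):
    """Gaussian filter with standard deviation sigma given in frames."""
    x = np.asarray(x, dtype=float)
    radius = int(truncate * sigma + 0.5)
    k = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (k / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(x, radius, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

# Synthetic x-trajectory (assumed): slow drift plus jitter, 25 s at 24 fps.
rng = np.random.default_rng(0)
n = 600
traj = np.linspace(0.0, 50.0, n) + rng.normal(0.0, 2.0, n)

smoothed = {f"SMA-{s} s": sma(traj, s * FPS) for s in (1, 3, 5)}
smoothed.update({f"EMA-{s} s": ema(traj, s * FPS) for s in (1, 3, 5)})
smoothed.update({f"GF-{s} s": gaussian(traj, s * FPS) for s in (1, 3, 5)})
```

Each function returns a trajectory of the same length as its input, so the difference between the smoothed and original trajectory can be applied frame by frame.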

Results
For the first experimental run, typical extracted motions of the original and the stabilized frames are shown in Figures 7 and 8 for the optical and thermal videos, respectively. From Figure 7, it can be seen that the stabilization algorithm managed to reduce the vibration in both the x and y directions for the optical video. As for the stabilization results of the thermal video shown in Figure 8, sudden spikes can be seen around frame number 225, which were due to the thermal video being paused for approximately one second while undergoing automatic Flat Field Correction (FFC) [40]. FFC is performed during power-up and periodically during operation to compensate for errors that may have built up during operation. FFC requires a shutter or similar uniform temperature device to cover the camera field of view [41]. The pause during FFC can be seen between frames 203 and 226, where the x and y translations were zero, followed by a sudden spike (both are highlighted by vertical lines in Figure 8). During this self-calibration period, the thermal camera did not acquire any new frames; however, the UAV continued to fly on its trajectory. The acquisition resumed upon completion of the FFC process. As can be seen from Figure 8, translations and rotation can be prone to outliers caused by sudden shifts or by too few tracked features between two consecutive frames. To mitigate any errors due to outliers, motions are characterized using an Upper Bound (UB) and a Lower Bound (LB), which are calculated using Tukey's fence method, where data outside of LB and UB are considered outliers [42]. The overall performance of the smoothing techniques for the optical and thermal videos from the first experiment is summarized in Tables 2 and 3, respectively.

As for the stabilization algorithm results of the thermal video shown in Figure 8, sudden spikes can be seen around frame number 225, which was due to the thermal video being paused while undergoing automatic Flat Field Correction (FFC) for approximately one second [40]. FFC is performed during power up and periodically during operation to compensate for errors, which may have built-up during operation. FFC requires a shutter or similar uniform temperature device to cover the camera field of view [41]. Pausing of the thermal video, while undergoing FFC can be witnessed between frame number 203 to 226, where the x and y-translations were zero during pausing, followed by a sudden spike (both are highlighted by vertical lines in Figure 8). During this self-calibration period, the thermal camera did not acquire any new frames; however, the UAV continued to fly on its trajectory. The acquisition restarted again upon completion of the FFC process. As can be witnessed from Figure 8, translations and rotation can be prone to outliers caused by sudden shift or lack of enough tracking features between two consecutive frames. To mitigate any errors due to outliers, motions are characterized using Upper Bound (UB) and Lower Bound (LB), which are calculated using Tukey's fence method, where data outside of LB and UB are considered as outliers [42]. The overall performance of the smoothing techniques for both optical and thermal videos from the first experiment are summarized in Tables 2 and 3, respectively. From the summary of the first experimental set provided in Tables 2 and 3, it can be seen that most of the stabilization algorithms worked well for both optical and thermal videos when average MS-SSIM was used for comparison, i.e., an increase in average MS-SSIM when compared with the original average MS-SSIM. 
When the actual range of translation was used for comparison, the stabilization worked equally well for both optical and thermal videos: some of the algorithms reduced the Range of Motion (RoM), defined as the Upper Bound (UB) minus the Lower Bound (LB). No significant rotations were present in either the optical or the thermal videos. It can also be noted that the outlier shown in Figure 8, caused by the temporary pause in thermal video acquisition for auto-calibration, did not affect the RoM. As for the BB, the larger the SMA and EMA windows and the GF standard deviations, the higher the BB in the stabilized video. From Tables 2 and 3, it can also be noticed that some algorithms performed well when average MS-SSIM was used for comparison, while others performed well when RoM and BB were used for comparison. Thus, it is evident that, in addition to an image quality measure, other parameters should be considered: the reduction in RoM, which indicates how much of the unwanted motion has been removed, and the blank border, which indicates how far the frames were shifted for stabilization and therefore how much information was preserved or lost.
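The RoM calculation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the conventional Tukey multiplier of 1.5 times the interquartile range, and the function names and the translation trace `dx` are hypothetical.

```python
import numpy as np

def tukey_fences(values, k=1.5):
    """Return (LB, UB) Tukey fences. k=1.5 is the conventional multiplier (assumed)."""
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def range_of_motion(values, k=1.5):
    """RoM taken as UB minus LB, following the definition in the text."""
    lb, ub = tukey_fences(values, k)
    return ub - lb

# Hypothetical frame-to-frame x-translations (pixels) with one FFC-like spike.
dx = np.array([0.5, -0.4, 0.3, -0.2, 0.1, 0.0, -0.1, 0.2, 25.0, -0.3])
lb, ub = tukey_fences(dx)
outliers = dx[(dx < lb) | (dx > ub)]  # samples flagged as outliers
rom = range_of_motion(dx)
```

Because the fences depend only on the quartiles, a single large spike such as the one following an FFC pause is flagged as an outlier and has little influence on the resulting RoM.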
The method proposed in this work takes additional features into account to provide a single metric for evaluating the overall outcome of different stabilization algorithms. The proposed method for identifying the best overall algorithm is a weighted Overall Stabilization Metric (OSM). The OSM is based on the Range of MS-SSIM (RoMS_SSIM), defined as the maximum MS-SSIM minus the minimum MS-SSIM (max MS-SSIM − min MS-SSIM). Additional terms in the OSM are the reduction in the RoM, which is what the stabilization algorithm tries to minimize, and the average BB content, which indicates how much of the information in the frames is preserved (0 means no BB, and 1 means that the entire video contains only black pixels). In the OSM expression, ori refers to the original video and stab to the stabilized video. RoM is the Range of Motion and is calculated individually for x-translation, y-translation, and rotation. W_MS_SSIM, W_i (i = x, y, rot), and W_BB are the weights associated with RoMS_SSIM, the individual motions, and BB, respectively. For simplicity, equal weights are assigned for both experimental sets: out of a total weight score of 100, 33.3, 33.3, and 33.4 were assigned to W_MS_SSIM, W_BB, and W_x,y,rot, respectively. Since there was minimal to no rotation in both experimental sets, W_rot was set to zero, whereas W_x and W_y were each assigned 16.7. These weights can be adjusted by the user according to what is considered important. For example, if minimizing BB is important, the user can assign a higher weight to BB, and vice versa. Results of the OSM for the first experiment are presented in Table 4 for both optical and thermal videos, where OSM values greater than zero signify overall improvement in the stabilized video compared to the original. The greater the OSM value, the better the stabilization algorithm.
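To make the weighting concrete, the sketch below shows one plausible way of combining the three terms. The functional form is an assumption made for illustration and is not the published equation: it uses relative reductions in the MS-SSIM range and in each RoM, and subtracts the weighted BB content so that an unchanged video with no BB scores zero. All function names, dictionary keys, and input values are hypothetical; only the weights are taken from the text.

```python
def relative_reduction(original, stabilized):
    """Fractional reduction of a quantity; positive when the stabilized value is smaller."""
    return (original - stabilized) / original

def overall_stabilization_metric(ori, stab, weights):
    """Assumed OSM form: weighted relative reductions minus weighted BB content."""
    score = weights["ms_ssim"] * relative_reduction(
        ori["ms_ssim_range"], stab["ms_ssim_range"]
    )
    for axis in ("x", "y", "rot"):
        w = weights[axis]
        if w == 0:
            continue  # skip unweighted motions (e.g., negligible rotation)
        score += w * relative_reduction(ori[f"rom_{axis}"], stab[f"rom_{axis}"])
    score -= weights["bb"] * stab["bb"]  # BB in [0, 1] penalizes lost content
    return score

# Weights as stated in the text (sum to 100, rotation weight set to zero).
weights = {"ms_ssim": 33.3, "bb": 33.3, "x": 16.7, "y": 16.7, "rot": 0.0}

# Hypothetical measurements for one smoothing technique.
ori = {"ms_ssim_range": 0.4, "rom_x": 10.0, "rom_y": 8.0, "rom_rot": 0.0}
stab = {"ms_ssim_range": 0.2, "rom_x": 5.0, "rom_y": 4.0, "rom_rot": 0.0, "bb": 0.1}
osm = overall_stabilization_metric(ori, stab, weights)
```

Under this assumed form, a technique that enlarges the RoM or produces substantial BB receives a negative contribution, which is consistent with OSM values below zero indicating a stabilized video that is worse than the original.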
From Table 4, it can be seen that for the first experiment most of the smoothing techniques applied to the optical video produced an overall improvement in the stabilized video (OSM greater than zero); however, some made it worse (OSM less than zero). For the thermal video, all the smoothing techniques produced an overall improvement. The lowest-performing algorithm, with the lowest OSM, was EMA-5s for both optical and thermal videos (highlighted in Table 4). Closer inspection of the summary in Table 2 for the optical video shows that the lowest-performing algorithm significantly increased the RoM instead of reducing it and produced significant BB. Conversely, the best-performing algorithms, with the highest OSM, were GF-5s for the optical video (highlighted in Table 4) and LR for the thermal video; these had the opposite effect, namely a reduction in RoM and in the range of MS-SSIM, together with little to no BB.
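For readers unfamiliar with the smoothing techniques compared here, the following sketch illustrates how a simple moving average (SMA), an exponential moving average (EMA), and a Gaussian filter (GF) might be applied to a cumulative motion trajectory. It is an illustration under stated assumptions rather than the implementation used in this work: the window and standard deviation are given in frames, the EMA smoothing factor is taken as 2/(window + 1), the Gaussian kernel is truncated at three standard deviations, and the trajectory is synthetic.

```python
import numpy as np

def sma(signal, window):
    """Simple moving average with edge padding so the output keeps the input length."""
    kernel = np.ones(window) / window
    left = window // 2
    padded = np.pad(signal, (left, window - 1 - left), mode="edge")
    return np.convolve(padded, kernel, mode="valid")

def ema(signal, window):
    """Exponential moving average; alpha = 2 / (window + 1) is an assumed convention."""
    alpha = 2.0 / (window + 1)
    out = np.empty(len(signal), dtype=float)
    out[0] = signal[0]
    for i in range(1, len(signal)):
        out[i] = alpha * signal[i] + (1 - alpha) * out[i - 1]
    return out

def gaussian_filter(signal, sigma):
    """Gaussian smoothing with the kernel truncated at 3 sigma (assumed)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(signal, radius, mode="edge")
    return np.convolve(padded, kernel, mode="valid")

# Synthetic frame-to-frame x-translations and the resulting cumulative trajectory.
rng = np.random.default_rng(0)
dx = rng.normal(0.0, 1.0, size=300)
trajectory = np.cumsum(dx)

smoothed = {
    "SMA": sma(trajectory, 15),
    "EMA": ema(trajectory, 15),
    "GF": gaussian_filter(trajectory, 5.0),
}
# Per-frame correction that would move each frame onto the smoothed path.
corrections = {name: s - trajectory for name, s in smoothed.items()}
```

Larger windows or standard deviations pull the smoothed path further from the raw trajectory, so the per-frame corrections grow; this is one way to understand why stronger smoothing tends to increase BB.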
The video stabilization and selection methods were further evaluated on the second experimental set, along with a preliminary demonstration of a UAV-based active thermography inspection technique, in which the UAV was flown above an aircraft part at three different heights of 1.5, 2, and 3 m while acquiring optical and thermal videos. Figure 9 shows optical and thermal frames taken at the different heights; the damage (drilled holes) can be seen in the thermal frames but not in the optical frames, demonstrating UAV-based active thermography for detecting damage. The smoothing techniques were evaluated using the developed OSM approach with the same weight scores as in the first experimental set. As mentioned earlier, only the thermal video from the second experimental set was analyzed, and the outcome is provided in Table 5. For brevity, only the OSMs are provided.
From Table 5, it can be seen that SVR-LR provided the best overall results, i.e., the highest OSM value, when the UAV was flown at heights of 1.5 and 3 m above the specimen, whereas LR provided the best results when the UAV was flown at 2 m (highlighted in Table 5). This also highlights that, since UAV motions are unpredictable, a single smoothing technique may not always provide the best results, even for similar applications.
Results of the stabilization in x- and y-translation are provided in Figures 10 and 11, respectively, for the best-performing algorithms at the different heights. An outlier can be seen in the x-translation (Figure 10, left); since Tukey's fence method [42] is adopted for the OSM, outliers like this have no effect on the OSM calculation. No significant improvement can be noted in the x-translation (Figure 10); however, for the y-translation (Figure 11), the stabilization algorithm reduced the overall range of motion, which was also observed during the first experiment, as shown in Figures 7 and 8.

Discussion
The video stabilization and selection method presented in this paper was developed for UAV-based active thermography inspection, where it was deemed essential for improving damage detection capabilities. The video stabilization method was based on the one suggested by Thakur [32]. Several new measures are proposed for quantifying the improvement of the stabilized videos. First, instead of PSNR and ITF, the range of MS-SSIM is used, because MS-SSIM considers luminance, contrast, and structural information between two images and compares them at different scales, providing a more advanced image quality measure. Additionally, active thermography relies on processing the temperature evolution over time of the acquired image sequence; the MS-SSIM range therefore provides a better indication of how close or far apart all the images are from one another. The smaller the range of MS-SSIM, the more similar the images in the video are to one another, and hence the better the stabilization algorithm. Second, the RoM is included, which is the difference between the Tukey's fence UB and LB; it captures the reduction in overall motion while excluding outliers, providing a robust way to quantify stabilization. Third, the BB is included, which is created by excessive shifting of the frames during stabilization; it offers a quantitative indication of how much information is retained or lost due to shifting.
It was also found that the majority of the research in the literature was conducted on a single stabilization algorithm or a handful of them, with improvements quantified using individual comparison metrics such as translations, rotations, processing time, accuracy, PSNR, ITF, etc. In this work, the range of MS-SSIM, the reduction in undesired motion, and the BB were all included in a single quality metric to provide a complete evaluation of various stabilization algorithms. The stabilization and selection methods were applied to optical and thermal videos from two different experimental sets, the outcomes of which are provided in the Results section. The algorithm with the highest OSM was selected as the best performing. The enhancements achieved by the higher-scoring algorithms appear as a significant reduction in the RoM, lower MS-SSIM ranges, and low BB, all of which signify overall improvement in the stabilized videos.
The OSM calculation includes BB, which can be further reduced by enlarging the stabilized frame size; however, this can result in information loss, and care must therefore be taken. BBs can also be filled using techniques such as mosaicking [43], finding a match in the neighboring frames [44], interpolating matching sharper pixels from the neighboring frames for stitching [45], etc. These filling techniques are applicable to optical videos but may not be applicable to thermal videos used for pulse/step-heating based active thermography inspection, because such inspection relies on the pixel temperature evolution over time, which is expected to vary between neighboring frames. The OSM presented in this paper captures the important aspects of video stabilization and provides a simple means of evaluating different algorithms to identify the best one. The user can also add processing time to the OSM if it is deemed critical.
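As an illustration of how the average BB content could be measured, the sketch below counts the fraction of pixels at or below a threshold across all frames, giving 0 for no BB and 1 for an entirely black video. It is a simplified sketch with assumptions: blank pixels are taken to be exactly zero, the function name is hypothetical, and the test frames are synthetic.

```python
import numpy as np

def blank_border_fraction(frames, threshold=0):
    """Average fraction of blank pixels over a video.

    frames: array of shape (n_frames, height, width), grayscale.
    Pixels with values <= threshold are counted as blank (assumption).
    """
    frames = np.asarray(frames)
    return float(np.mean(frames <= threshold))

# Synthetic video: 3 frames of 4 x 10 pixels, each with a 2-column blank strip
# such as would be left by a horizontal shift during stabilization.
frames = np.ones((3, 4, 10))
frames[:, :, :2] = 0
bb = blank_border_fraction(frames)
```

In real thermal or optical frames, scene pixels can legitimately be zero-valued, so a more reliable variant would warp an all-ones mask with the same transform applied to each frame and count the zeros in the warped mask instead.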
It can also be noted that in the second experiment the specimen was heated constantly using halogen lamps for active thermography inspection; the damage can therefore be seen in the thermal frames, although its depth and size are difficult to determine. To detect and size damage at different depths, infrared thermography post-processing techniques can be employed [46], in which the pixel-wise thermal evolution is processed using techniques such as pulsed phase thermography, shown in Figure 1 [19], thermographic signal reconstruction (TSR) [47], TSR combined with 1st and 2nd derivative approaches [48], principal component analysis [49], etc. Methods for stabilizing videos and selecting an optimal stabilization algorithm, such as the one presented in this paper, are required for such applications to reduce or eliminate pixel shifts that contribute to post-processing errors. Continuation of this work includes developing a UAV-based pulse/step-heating inspection technique and applying the methods developed here to stabilize the videos and select the optimal smoothing techniques.

Conclusions
An Unmanned Aerial Vehicle (UAV) based inspection system that can move "freely" around an aircraft to inspect all the areas of interest in a fast and effective manner can have a significant impact in reducing inspection time and cost. However, UAV inspection is challenging because the UAV carrying the optical and thermal cameras is subject to vibration and undesired motion. To reduce such undesired motion, a digital video stabilization technique, along with a methodology for selecting the best smoothing technique, is presented in this paper. The stabilization method is based on finding the motion between two consecutive frames using a feature-based approach. To evaluate the performance of the video stabilization algorithms, Multi-Scale Structural Similarity (MS-SSIM), reduction in undesired motion, and Blank Border (BB) were used. Some algorithms performed better when MS-SSIM was used for comparison, while others performed better when the range of motion and BB were used. Instead of using three different comparison metrics, a simple weighted Overall Stabilization Metric (OSM), based on the reduction in the range of MS-SSIM and of motion as well as the average BB content, was proposed for an overall evaluation of the stabilization algorithms. The stabilization and selection methods were evaluated on two different experimental sets. The first experimental set was used to develop and test the methodologies, whereas the second experiment was conducted to demonstrate a UAV-based active thermography technique and to evaluate the methods developed for stabilizing the videos and selecting the best smoothing techniques. The OSM showed that different smoothing techniques produced different stabilization results: some improved the videos, and some made them worse. The highest OSM was used as the criterion for finding the best-suited algorithm. The highest-scoring smoothing techniques all had a low range of MS-SSIM and of motion, as well as low BB content, all of which are characteristics of better overall stabilization. Therefore, the method presented in this paper provides a simple means of stabilizing videos and evaluating different stabilization algorithms to select the one best suited to the application, which is a fundamental first step towards developing a fully operational UAV-based active thermography inspection system.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.