UAV Video-Based Approach to Identify Damaged Trees in Windthrow Areas

Abstract: Disturbances in forest ecosystems are expected to increase by the end of the twenty-first century. An understanding of these disturbed areas is critical to defining management measures that improve forest resilience. While some studies emphasize the importance of quick salvage logging, others emphasize the importance of deadwood for biodiversity. Unmanned aerial vehicle (UAV) remote sensing plays an important role in acquiring information in these areas through the structure-from-motion (SfM) photogrammetry process. However, the technique faces challenges stemming from the fundamental principle of SfM photogrammetry as a passive optical method. In this study, we investigated a UAV video-based technology called full motion video (FMV) to identify fallen and snapped trees in a windthrow area. We compared the performance of FMV and an orthomosaic, created by the SfM photogrammetry process, in manually identifying fallen and snapped trees, using a ground survey as a reference. The results showed that FMV was able to identify both types of damaged trees because video delivers better context awareness than the orthomosaic, although with lower position accuracy. In addition to its simpler processing, FMV technology showed great potential to support the interpretation of conventional UAV remote sensing analysis and ground surveys, providing forest managers with fast and reliable information about damaged trees in windthrow areas.


Introduction
In Eastern Asia, typhoons are one of the main natural hazards affecting the forest ecosystem [1,2]. With the frequency of intense tropical cyclones predicted to increase by the end of the twenty-first century [3], an expansion of forest ecosystem disturbance is also expected. Understanding the ecological resilience of forest ecosystems to natural and human impact is critical for identifying the optimum management measures [4][5][6]. While some studies emphasize the importance of quick salvage logging to dampen insect outbreaks in windthrow areas [7,8], other studies emphasize the ecological importance of the deadwood caused by natural disturbances [9][10][11][12], and the importance of individual deadwood management to benefit the biodiversity of disturbed areas [13,14].
The development of remote sensing with different sensors onboard satellites, aircraft, and unmanned aerial vehicles (UAVs) has brought many tools and techniques for managing areas affected by natural disturbances [15,16], enabling the acquisition of remotely sensed data to monitor disturbed areas. Recently, UAVs have been playing an important role in remote sensing because of their ability to capture a variety of very high-resolution datasets at any time [17,18]. A widely used UAV remote sensing technique is structure-from-motion (SfM) photogrammetry [19], which enables the creation of two-dimensional (2D) and three-dimensional (3D) datasets to analyze areas affected by natural disasters [20,21].
However, UAV SfM photogrammetry faces challenges such as long processing times, difficulty visualizing high-resolution point clouds in GIS, poor reproduction of complex scenes such as forests and steep terrain, susceptibility to lighting conditions, and a single viewing angle for orthomosaics [20][22][23][24]. Some of these challenges relate to the fundamental principles of SfM photogrammetry as a passive optical method [19]. Goodbody et al. [25] stated that while digital aerial photogrammetry plays an important role in forest inventory frameworks in a variety of forested environments due to its high accuracy and lower cost compared to other technologies (i.e., lidar), further research and development of acquisition parameters, image-matching algorithms, and point cloud processing workflows are needed to establish digital aerial photogrammetry as a logical data source for forest management. Lidar is an option to overcome some limitations of UAV SfM photogrammetry, such as the complicated and unreliable matching process, especially when dealing with significant depth variation [26], but it is still expensive, requiring highly skilled personnel and heavy computational processing [27,28].
Another way to overcome some of the limitations of SfM photogrammetry is aerial videography; video streams combined with GIS have been used for forest fire prevention [29] and to assess forest damage caused by hurricanes [30]. The development of video and GIS technology brought about a technology called full motion video (FMV), which automatically combines video with GIS through a multiplexing process, generating a spatially aware video. FMV also provides telestration capabilities: feature data can be analyzed and edited inside the video, automatically generating features inside the GIS [31]. This technology is being used to assess remotely sensed satellite data [32], and by the military for intelligence, surveillance, and reconnaissance [33,34].
Considering the importance of managing individual deadwood, in this study we investigated the usage of FMV technology to identify fallen trees (i.e., uprooted trees and segments of downed trunks) and snapped trees in a windthrow area. Specifically, the feature data created from FMV and orthomosaic (produced by the UAV SfM photogrammetry process) were compared with a ground survey as a reference, to identify the strengths and weaknesses of the FMV technology in monitoring damaged trees in a windthrow area.

Study Area
In September 2004, Typhoon Songda (no. 18) hit northern Japan and destroyed 369.6 km² of forests. Of the total windthrow area, 30% occurred around Chitose City and Tomakomai City in Hokkaido, Japan [35]. For this study, we selected an area of 0.37 ha inside a management unit of the national forest in Chitose City, located at 42°45′43.9″ N, 141°30′03.3″ E at 150 m of altitude (Figure 1).
The topography of the study area was flat, with the soil composed of volcanic ash and pumice, and annual temperature and precipitation averages of 7.1 °C and 1384 mm, respectively. The dominant tree species of the natural forest were Abies sachalinensis (F. Schmidt) Mast. and Quercus crispula Blume. After the typhoon, no human intervention was conducted; thus, during the data collection, the deadwood and vegetation were found to have recovered during the years since the windthrow occurrence [36].

Data Acquisition
The data for this study were collected on 7 December 2021, 17 years after Typhoon Songda struck. The aerial data (still images and video) were taken using the DJI Phantom RTK UAV, with a 1-inch CMOS RGB sensor delivering images of 5472 × 3648 pixels and 4K (4096 × 2160 pixels) video [37]. The UAV was also equipped with a built-in real-time kinematic (RTK) system connected to the ICHIMILL virtual reference station service provided by SoftBank Japan [38] to improve the position and altitude accuracy of the aircraft [39].
To create the FMV-compliant data, the UAV was flown using the Site Scan LE application for iPad [40]. This application was necessary to convert the geospatial metadata generated by the UAV to MISB standards [41] so it could be combined with the video file in the multiplexing process. The flight was performed at 30 m above the ground and automatically followed a predefined route, with the gimbal angle set at 20 degrees and the video set at 4K resolution at 24 frames per second.
Apart from the video, a total of 145 images were taken at 30 m above the ground, with both overlap and sidelap at 80%, to create an orthomosaic. To improve the orthomosaic accuracy, 4 ground control points were placed at the corners of the study site (Figure 1c), and the position of each ground control point was collected using the DG-PRO1RWS RTK system (RTK system), delivering centimeter-level accuracy [42].
A ground survey was also conducted on the same day. Because the high density of recovering juvenile trees [43] blocked the way, it was not possible to take samples of all fallen and snapped trees from the whole study area. The sample positions of fallen and snapped trees were taken in accessible areas using the RTK system, which corresponded to around 78% of the total area (Appendix A, Figure A1); for each fallen tree, two GNSS coordinates were taken (one at each end of a fallen tree), and for each snapped tree, one GNSS coordinate was taken. To understand the influence of the characteristics of snapped trees on their identification, the height and diameter of each snapped tree were measured from the photos taken on the ground survey with a reference pole.

Data Processing
The processing workflow is shown in Figure 2. We used three different sources to identify fallen and snapped trees in the study area: FMV, orthomosaic, and the ground survey.

Full Motion Video Processing
To create the FMV-compliant data, we combined the video with the metadata generated by the Site Scan LE application on the iPad, using the Video Multiplexer tool in the Image Analyst extension for ArcGIS Pro 2.8 [44]. The video was converted into full HD (1920 × 1080 pixels) resolution to improve playback inside ArcGIS Pro, following ESRI's recommendation [45]. Additionally, to align the video footprint in GIS, some adjustments to correct the UAV flight altitude data in the geospatial video log files had to be made according to the parameters supplied by ESRI [46].
After combining the video with the metadata, we visually interpreted the whole study area throughout the video, frame by frame. The feature data were created inside the video, automatically generating feature data inside the GIS (Figure 3). One feature line was created for each fallen tree, and one feature point for each snapped tree.

SfM Photogrammetry Processing
To create the orthomosaic, we used the SfM technique [47] in Agisoft Metashape [48]. Combining all 145 images with the 4 ground control points, we generated an orthomosaic with a spatial resolution of 0.793 cm per pixel and a horizontal accuracy of 0.77 cm. Through visual interpretation of the generated orthomosaic, we manually created feature lines to identify fallen trees in the whole study area. For snapped trees, identification was not possible since only the tops of the snapped trees could be seen in the orthomosaic.
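As a rough cross-check of the reported spatial resolution, the ground sampling distance (GSD) can be estimated from the flight altitude and the camera geometry. The sensor width and focal length below are assumed nominal values for a 1-inch CMOS UAV camera, not figures taken from the study; a minimal sketch:

```python
# Rough GSD estimate; sensor width and focal length are assumed
# nominal values for a 1-inch CMOS UAV camera, not study figures.
sensor_width_mm = 13.2   # assumed sensor width
focal_length_mm = 8.8    # assumed lens focal length
image_width_px = 5472    # image width reported in the study
flight_height_m = 30.0   # flight altitude above ground

# GSD = (flight height * sensor width) / (focal length * image width);
# the millimeter units cancel, leaving meters per pixel.
gsd_cm = (flight_height_m * sensor_width_mm) / (focal_length_mm * image_width_px) * 100
```

With these assumed values, the estimate lands near 0.8 cm per pixel, the same order as the 0.793 cm per pixel obtained after SfM processing.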
In addition, a classification map was also created from the orthomosaic to examine how the ground surface (considering the above view) affected the identification of fallen and snapped trees in the windthrow area. The classification map was divided into 3 different classes: vegetation with leaves, vegetation without leaves, and non-vegetation. The vegetation with leaves class consisted mostly of coniferous trees, while the vegetation without leaves class consisted of deciduous trees and shrubs. The non-vegetation class consisted of areas that were exposing everything on the ground, such as soil and deadwood.

Ground Survey Processing
After collecting the GNSS coordinates of the fallen and snapped trees with the RTK system in the field, we imported the data into ArcGIS Pro and converted the coordinates into feature data. For fallen trees, the coordinates located at each end of a fallen tree were connected, creating a feature line. For snapped trees, the coordinates were converted into feature points with centimeter-level accuracy [42].
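The conversion step above can be sketched in a few lines. The coordinates and helper names here are hypothetical; in the study, the conversion was done inside ArcGIS Pro:

```python
# Hypothetical sketch: surveyed GNSS coordinates -> simple WKT features.
def fallen_tree_line(start, end):
    """Two endpoint coordinates (x, y) of one fallen tree -> WKT line."""
    return f"LINESTRING ({start[0]} {start[1]}, {end[0]} {end[1]})"

def snapped_tree_point(pos):
    """One coordinate (x, y) of one snapped tree -> WKT point."""
    return f"POINT ({pos[0]} {pos[1]})"

# Example with made-up projected coordinates (meters):
line = fallen_tree_line((1000.0, 2000.0), (1008.5, 2003.2))
point = snapped_tree_point((1004.1, 1998.7))
```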

Comparison
To compare the feature data extracted by the three types of processing (FMV, orthomosaic, and ground survey), pairs of fallen and snapped tree features were manually identified through visual interpretation using the ground survey as a reference. Damaged tree features paired between FMV and ground survey, or between orthomosaic and ground survey, were defined as matched, while non-paired features from the ground survey were defined as unmatched. In this study, position accuracy was defined by the distance from a feature determined in FMV or the orthomosaic to the corresponding feature in the ground survey, as explained in detail below: the longer the distance, the lower the position accuracy.
For fallen trees, the visual identification of pairs was mainly based on their position and orientation. We matched pairs between FMV and ground survey, and between orthomosaic and ground survey. For position accuracy, using the ground survey as a reference, a center point for each feature line was determined and the distance between the center points of matched pairs was measured. The lengths of the feature lines acquired by FMV, orthomosaic, and ground survey were also compared to examine the characteristics of the feature data extracted by each type of processing.
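The position-accuracy measure for fallen trees can be sketched as follows: each feature line is reduced to its midpoint, and the distance between the midpoints of a matched pair is measured. Coordinates here are hypothetical, in a projected metric system:

```python
import math

def midpoint(line):
    """Midpoint of a feature line given as ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = line
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def pair_distance(line_a, line_b):
    """Distance between the midpoints of a matched pair of lines."""
    (xa, ya), (xb, yb) = midpoint(line_a), midpoint(line_b)
    return math.hypot(xb - xa, yb - ya)

# Hypothetical matched pair (projected coordinates, meters):
fmv_line = ((0.0, 0.0), (10.0, 0.0))      # digitized from FMV
ground_line = ((1.0, 2.0), (11.0, 2.0))   # from the ground survey
d = pair_distance(fmv_line, ground_line)  # sqrt(5) ≈ 2.24 m
```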
For snapped trees, we defined the pairs considering the feature data position. We only identified pairs between FMV and ground survey since it was not possible to identify snapped trees from the orthomosaic. For position accuracy, we measured the distance between matched feature points between FMV and the ground survey. We also compared the physical characteristics (height and diameter) to understand the difference between matched and unmatched pairs.
To examine the influence of the ground surface on the identification of fallen and snapped trees through FMV and orthomosaic, a 0.25 m buffer was created around each fallen or snapped tree; according to Morimoto et al. [49], the average trunk diameter in the same study area was 0.5 m. Inside each buffer, the percentages of vegetation with leaves, vegetation without leaves, and non-vegetation were calculated from the classification map generated from the orthomosaic (Figure 4). This was necessary since vegetation and branches frequently hide fallen and snapped trees when viewed from above [24]. To test the differences in ground surface conditions between matched and unmatched fallen and snapped trees, we fitted generalized linear models with a beta distribution and logit link function [50]. When p < 0.05, we considered the difference significant. All data analyses were conducted in R v.4.2.0 [51] using "betareg" v.3.1.4 for the generalized linear models [50].
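The buffer-coverage step can be illustrated with a small sketch: given a classified grid (one land-cover label per cell), the class proportions inside a 0.25 m radius around a tree position are counted. The grid, cell size, and class labels are hypothetical stand-ins for the orthomosaic classification map:

```python
# Hypothetical classification grid: one land-cover label per cell.
CELL = 0.05  # assumed cell size in meters

def buffer_proportions(grid, center, radius=0.25):
    """Class proportions inside a circular buffer around `center` (x, y)."""
    counts, total = {}, 0
    for row, cells in enumerate(grid):
        for col, label in enumerate(cells):
            x, y = col * CELL, row * CELL
            if (x - center[0]) ** 2 + (y - center[1]) ** 2 <= radius ** 2:
                counts[label] = counts.get(label, 0) + 1
                total += 1
    return {label: n / total for label, n in counts.items()}

# A 1 m x 1 m toy grid: upper half "leaves", lower half "bare".
grid = [["leaves"] * 20 for _ in range(10)] + [["bare"] * 20 for _ in range(10)]
props = buffer_proportions(grid, center=(0.5, 0.5))  # proportions sum to 1
```

These proportions are the response values that the beta-regression models compare between matched and unmatched trees.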

Results

Fallen Trees
Through FMV, a total of 111 fallen trees were identified, while through the orthomosaic and the ground survey, 202 and 105 fallen trees were identified, respectively. Between the FMV and ground survey, 76 fallen trees were matched, while 35 fallen trees identified by FMV and 29 identified by the ground survey remained unmatched. Between the orthomosaic and ground survey, 87 fallen trees were matched, while 115 and 18 remained unmatched in the orthomosaic and ground survey, respectively. Figure 5 shows the matched and unmatched numbers of fallen trees identified by FMV and ground survey, and by the orthomosaic and ground survey. The ground surface conditions of matched and unmatched fallen trees (FMV and orthomosaic) are shown in Figure 7, with the respective p-values (Table 1).

In general, the results from FMV for matched and unmatched fallen trees were similar in all three classes (Figure 7a). By comparison, the difference between matched and unmatched fallen trees was larger for the orthomosaic (Figure 7b).

Snapped Trees
The coverage proportions of vegetation with leaves, vegetation without leaves, and non-vegetation for FMV are shown in Figure 10, with the respective p-values (Table 2). Because of the small number of samples, the variance between matched and unmatched snapped trees was high, and the p-values showed no significant differences.

Discussion
With the study conducted in December, when deciduous trees have no leaves, FMV technology was suitable for identifying damaged trees in a windthrow area due to the ability of video to deliver better context awareness: views of the same point from different angles provide more opportunities to find damaged trees underneath the canopies [24]. Although delivering lower position accuracy compared to the orthomosaic, FMV was capable of identifying fallen trees even when vegetation with leaves and vegetation without leaves covered them. The identification of snapped trees was also possible through FMV, unlike the orthomosaic, from which snapped trees could not be identified.

Performance of FMV and Orthomosaic for Fallen Trees Identification
In both FMV and orthomosaic, we found more fallen trees than in the ground survey (Figure 5). This happened for two main reasons: the aerial data covered the whole study area, whereas the ground survey covered only the accessible portion [52]; and because of the presence of vegetation with leaves, the orthomosaic identified one single fallen tree as multiple fallen trees (Appendix A, Figure A2).
For FMV, the graph in Figure 7a showed no differences in the three classes between matched and unmatched trees, evidencing that the environment did not have a significant influence on the identification of fallen trees. The camera angle and the different perspectives from the same target throughout the frames helped in the identification of fallen trees even with the presence of vegetation with leaves and vegetation without leaves.
For the orthomosaic, the graph in Figure 7b showed a larger difference in the vegetation with leaves and non-vegetation classes between matched and unmatched fallen trees compared to FMV. Apart from a lower average of non-vegetation, the higher amount of vegetation with leaves for unmatched trees showed that the fallen trees were partially or fully covered, so that one single fallen tree could be identified as multiple fallen trees (Appendix A, Figure A2). This resulted in a higher number of fallen trees with a shorter average length: 6.96 (s.d. 3.21) m for the orthomosaic compared to 10.01 (s.d. 3.33) m for the ground survey (Figure 6).
Overall, for fallen tree identification, the ability of video to deliver more context awareness compared to the orthomosaic [31,53] shows the potential of FMV for identifying fallen trees in areas with vegetation coverage, while only visible trees could be identified in orthomosaics [54]. Although the frame movement delivered better context awareness, it was also a hindrance to identifying fallen trees: since the frame is always moving, its position is also moving, generating misalignment between some frames [55]. This led to a lower position accuracy compared to the orthomosaic.

Performance of FMV for Snapped Tree Identification
The ability of FMV to show the same snapped tree in different frames (since the video is moving) made it possible to identify snapped trees through video [55]. Although the video movement made identification possible, the position accuracy of snapped trees was similar to that of fallen trees (2.58 (s.d. 1.88) m for fallen trees and 2.31 (s.d. 0.61) m for snapped trees). This also resulted from the misalignment between the continually moving video frames.
Matched snapped trees were taller and thicker than unmatched ones (Figure 9); consequently, shorter and thinner snapped trees were assumed to be harder to identify. Physical characteristics were not the only variables affecting identification; the presence of vegetation without leaves was also a hindrance to the identification of snapped trees due to its similarity to standing tree branches.
The combination of shorter and thinner snapped trees with the presence of vegetation without leaves (branches of deciduous trees) made snapped trees difficult to identify in windthrow areas, due to the similarity between tree branches and snapped trees. Despite higher averages of vegetation with leaves for matched snapped trees, the color contrast between a snapped tree and green vegetation made this class less of a hindrance to identification (Appendix A, Figure A3).

FMV Advantages and Limitations for Damaged Tree Identification
While FMV delivered lower position accuracy compared to the orthomosaic, it was sufficient to calculate the number of damaged trees per unit area. In addition, since an RTK UAV was used in this study, the data taken from FMV yielded better accuracy (around 3 m) than common handheld GNSS devices, which generally vary between 5 and 10 m under favorable conditions [56]. Another limitation of FMV was observed in the identification of short and thin snapped trees, but larger segments of deadwood, which remain in the stand longer and play an important role in forest ecosystems [57], could be identified using FMV.
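The per-area figure mentioned above is a direct count-to-density conversion; using the FMV fallen-tree count and the study-area size reported earlier, a minimal sketch:

```python
# Count-to-density conversion; numbers taken from the study
# (111 fallen trees identified by FMV in a 0.37 ha plot).
area_ha = 0.37
fallen_trees_fmv = 111
density_per_ha = fallen_trees_fmv / area_ha
print(f"{density_per_ha:.0f} fallen trees per hectare")  # prints "300 fallen trees per hectare"
```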
In contrast to the orthomosaic, FMV was able to identify snapped trees. FMV showed a simpler workflow and faster processing time compared with the orthomosaic, mainly because the data can be analyzed by simply combining the metadata with the video. Thus, the FMV method allows quick assessment of individual damaged trees, generating fast and accurate information for forest managers to take quick action, which is key in deciding the management of disturbed areas [7,8]. Furthermore, FMV technology also showed great potential to improve and support the interpretation of remotely sensed data and ground surveys, due to the enhanced context awareness provided by the video. This context awareness could open up new possibilities for monitoring damaged trees in forested areas with complex vegetation and rich understory.
Overall, FMV proved to be a powerful tool in the forest disaster management process due to its simple workflow, accuracy, and quick results, even with the presence of vegetation, providing detailed information on damaged trees in windthrow areas to identify optimum management measures. New studies combining this technology with other techniques, such as object detection through deep learning, are encouraged to automatically detect damaged trees in windthrow areas.