Shoreline Detection Accuracy from Video Monitoring Systems

Video monitoring has become an indispensable tool to understand beach processes. However, the accuracy of image-derived measurements has been taken for granted despite its dependence on the calibration process and camera movements. An easy-to-implement, self-fed image stabilization algorithm is proposed to correct for camera movements. Georeferenced images were generated from the stabilized images using only one calibration. To assess the performance of the stabilization algorithm, a second set of georeferenced images was created from unstabilized images following the accepted practice of using several calibrations. Shorelines were extracted from the images and corrected to the 0 m contour using the measured water level and the computed run-up. Image-derived corrected shorelines were validated with one hundred beach profile surveys measured over a period of four years along a 1.1 km beach stretch. Simultaneous high-frequency field data of images and beach surveys are uncommon and allow the accuracy of seasonal changes and long-term trends to be assessed. Errors in shoreline position do not increase in time, suggesting that the proposed stabilization algorithm does not propagate errors despite the ever-evolving vegetation in the images. Image stabilization reduces the error in shoreline position by 40 percent, with a larger impact at increasing distance from the camera. Furthermore, the algorithm improves the accuracy of long-term trends by one order of magnitude (0.01 m/year vs. 0.25 m/year).


Introduction
Video cameras have been used to monitor the beach since the eighties with the development of the ARGUS system [1]. Since then, other systems have been developed such as SIRENA [2], BEACHKEEPER [3], or COSMOS [4], among others. Despite the recent rise of drones [5] and satellite imagery [6,7] for coastal monitoring, the balance between the temporal and spatial resolution offered by fixed cameras during extended periods of time (years) can still provide unique information given the dynamic beach behavior.
Since the introduction of fixed video platforms in the nearshore community, considerable advances in the understanding of coastal morphodynamics have been made. Lippmann and Holman [8] demonstrated that time-averaged images show elongated white breaker lines parallel to the coast that approximate the position of submerged bars and the shoreline. This finding triggered the study of sandbar and shoreline dynamics. Tracking breaker lines in multi-year data sets revealed that bars and shorelines respond to incident waves by migrating in the onshore and offshore directions (see, e.g., in [9-13]), by developing or straightening out patterns (see, e.g., in [12-15]), and by rotating clockwise/counterclockwise (see, e.g., in [10,16,17]). Furthermore, images have been used to study dynamics on shorter timescales, including the formation of finger bars [18,19], rip currents [20,21], and groundwater seepage on beaches [22,23].
Pixel coordinates need to be transformed to UTM coordinates to allow the quantification of coastal features. Usually, an intrinsic calibration is done to correct the lens deformation and then an extrinsic calibration is performed (finding the camera position and view angles) by relating the pixel position of discernible objects to UTM coordinates [24]. Both calibrations change in time due to thermal expansion and wind action [1] as well as human manipulation during maintenance. There are three paths to deal with these movements: (i) assume that the calibration parameters do not change during a period of time (see, e.g., in [18]), (ii) find the calibration parameters in each image (see, e.g., in [25]), and (iii) stabilize each image based on a reference one (see, e.g., in [26]).
The methods developed to correct beach camera movements can be divided into those that model the effect of thermal expansion and wind (see, e.g., in [25]) and those that track the movement of features between two images (see, e.g., in [1,26-28]). The first approach has the advantage of working blindly, at the expense of accuracy, and requires non-standard data such as cloud coverage. In addition, the method requires a large number of manual calibrations (approximately 400 per camera). The second one is more accurate given an adequate field of view with high contrast areas, but algorithms are often restricted to a specific type of scene. The first to automatically detect image displacements in a beach environment were Holman and Stanley [1]. Small regions containing fixed objects were defined in a reference image. Then, the pixel displacements of these regions in a new image were computed using a template matching method. Similarly, Pearre and Puleo [29] obtained the camera displacements by defining regions of interest, mainly composed of fixed objects with high contrast, in a control image and finding the correlation peak around a certain pixel distance in the next image. Vousdoukas et al. [27] found common features between images applying the Speeded-Up Robust Features (SURF) algorithm [30]. SURF consists of a detector and a descriptor. The detector finds the position of interest points such as corners, blobs, and T-junctions. The descriptor represents the neighborhood of every interest point by a feature vector. Although not required by SURF, Vousdoukas et al. [27] excluded interest points that did not correspond to fixed objects. In case SURF found few common interest points between images (fewer than 10), it was assumed that there was no camera displacement. However, the percentage of occurrence was not provided. Rodriguez-Padilla et al. [26] combined a sub-pixel cross-correlation algorithm with a Canny edge detector to track corners and salients of buildings in predefined areas. The method proved to be robust under variable lighting conditions, achieving success in about 90% of the images over a 5-year period. More recently, Simarro et al. [28] used the ORB algorithm [31] as detector/descriptor. In order to handle a wide range of lighting conditions, they compared an image to a basis of images (e.g., 8 instead of 1). As a result, the algorithm had a high success rate (90%) in a beach with a large presence of buildings over a 12-year period. The success rate was lower (40%) in a beach with few fixed objects and a large presence of vegetation over a 4-year period.
In general, studies aimed at correcting camera movements have been restricted to beaches with discernible fixed objects and have relied upon their own outputs to evaluate their performance. As a result, long-term errors in image-retrieved metrics (e.g., shorelines) related to stabilization have not been evaluated, and the range of applicability of these methods remains vague for users.
The aim of this work is to present an image stabilization algorithm capable of solving camera movements at a vegetated beach and to evaluate its performance by comparing image-derived shorelines with a 4-year data set of in situ measurements. The study site is described in Section 2. The field data, stabilization algorithm, and shoreline retrieval are presented in Section 3. The results, presented in Section 4, include a comparison between the stabilized images and the resulting planview images using multiple calibrations, as well as a sensitivity analysis of the parameters that correct shorelines to the 0-m contour and the long-term performance. Discussion points are raised in Section 5 and a list of conclusions is provided in Section 6.

Study Site
Sisal beach is located on a low-lying barrier island on the northwestern coast of the Yucatan Peninsula (Mexico), facing the Gulf of Mexico (Figure 1c), and west-east oriented (80° with respect to the North, see Figure 1b). The beach is dissipative with a shore-parallel bar system and a mean grain size of 0.3 mm [32], and is bounded by the port's jetty and the Sisal pier (Figure 1a). Local sea breezes and cold front passages control the mean wave climate and winter wave climate, respectively. The sea breezes generate short-period NE waves, inducing a persistent westward alongshore circulation [33] and a significant (30,000-60,000 m³/year) net westward sediment flux [34]. Field data from the period 2015-2020 show beach progradation at a rate of ∼6 m/year close (east) to the jetty and lower rates farther away from the structure [35,36]. The cold fronts, also known as Central American Cold Surge (CACS) events and occurring in fall-winter, generate northerly swell which is efficiently dissipated by the wide and shallow continental shelf [37]. The cold fronts induce a net alongshore circulation towards the east [33] as well as important cross-shore transport [38]. Occasional wind squalls generating energetic short-period waves and significant storm surge (∼1 m), locally called turbonadas, occur during spring months. The tropical cyclone season at the study area is from June to November; however, no landfalls occurred during the study period. Microtidal conditions with diurnal dominance and a spring tidal range of 0.75 m [39] characterize the study area. The sea level has a seasonal modulation with maximum/minimum values in fall/summer months related to the variability of along-shelf currents on the western Gulf of Mexico and inverse barometer contribution [40].

Field Data
In situ field observations have been collected in the study area by means of both monitoring programs and systems. Figure 1a shows the location of survey transects, camera system, tide gauge, and wave gauge. Information about the data acquisition and analysis is provided below.

Beach Monitoring
Beach elevation profiles have been measured along a 2 km stretch (20 cross-shore transects with a spacing of 100 m, see Figure 1a) since May 2015 using a Leica differential GPS system in RTK mode. A reference station has been installed permanently on top of a building close to the jetty. The rover is carried on a backpack along each of the transects from the dune to a water depth of about 1.5 m, measuring the position at 1 Hz, which results in a spatial resolution of approximately 0.5 m. A ground control point (GCP) is measured at the beginning and at the end of each survey to correct the antenna height. Measurements with a horizontal and vertical error higher than 0.05 m are discarded. For this work, 100 surveys from 8 May 2015 to 26 June 2019 were used (see red marks in Figure 2e), each consisting of 12 cross-shore profiles (see red lines in Figure 1a). During the first year the profiles were measured every week and later every 15 days [35]. The bed elevation, z, was referenced to the MEX97 geoid. The shoreline position, defined as the position of the 0-m elevation contour, was obtained for each survey at each transect and is referred to throughout the paper as the measured shoreline. Furthermore, the foreshore slope was computed for −0.5 < z < 0.5 m.
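As an illustration of how the measured shoreline and the foreshore slope can be obtained from a single surveyed profile, a minimal sketch is given below. It assumes that the cross-shore distance increases seaward, that the 0-m contour is located by linear interpolation, and that the slope is a least-squares fit over −0.5 < z < 0.5 m. The function names and the sample profile are hypothetical and do not reproduce the processing code used in this work.

```python
def zero_contour_position(x, z, level=0.0):
    """Cross-shore position (m) where the profile first crosses `level`.

    x: cross-shore distances (m), assumed to increase seaward.
    z: bed elevations (m) at those distances.
    Uses linear interpolation between the two bracketing points.
    """
    for i in range(len(x) - 1):
        z0, z1 = z[i] - level, z[i + 1] - level
        if z0 == 0.0:
            return x[i]
        if z0 * z1 < 0.0:
            return x[i] + (-z0) * (x[i + 1] - x[i]) / (z1 - z0)
    if z and z[-1] == level:
        return x[-1]
    return None  # the profile never reaches the requested contour


def foreshore_slope(x, z, z_min=-0.5, z_max=0.5):
    """Least-squares slope magnitude for points with z_min < z < z_max."""
    pts = [(xi, zi) for xi, zi in zip(x, z) if z_min < zi < z_max]
    if len(pts) < 2:
        return None
    n = len(pts)
    mean_x = sum(p[0] for p in pts) / n
    mean_z = sum(p[1] for p in pts) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in pts)
    if sxx == 0.0:
        return None
    sxz = sum((p[0] - mean_x) * (p[1] - mean_z) for p in pts)
    return abs(sxz / sxx)


# Synthetic profile, for illustration only
x_profile = [0.0, 10.0, 20.0, 30.0, 40.0]
z_profile = [2.0, 1.0, 0.25, -0.25, -1.0]
shoreline_x = zero_contour_position(x_profile, z_profile)
slope = foreshore_slope(x_profile, z_profile)
```

For the synthetic profile, the 0-m contour falls midway between the 20 m and 30 m points and the fitted slope is 0.05.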

Environmental Conditions
The sea water level z_gauge, including the astronomical tide and storm surge, was acquired every minute with an ultrasonic tidal gauge installed inside the Port of Sisal (Figures 1a and 2d). Offshore significant wave height H_s, peak period T_p, and peak direction θ_p were obtained by a bottom-mounted ADCP deployed 10 km offshore at a water depth of 10 m (Figures 1b and 2a-c). Gaps in the wave records were filled with reanalysis information from Wave Watch III, whereas a gap in the tide record was filled with data from a tide gauge located 40 km east of Sisal. This information was employed to obtain the total water elevation, which combines the measured water level z_gauge with the wave-induced contributions, where z_w is the wave setup and S is the swash height. Both z_w and S correspond to the parameterizations for dissipative beaches of Stockdon et al. [41].
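A minimal sketch of how the wave-induced terms could be evaluated is given next. It rests on several assumptions that are not stated in the text: the coefficients 0.016 (setup) and 0.046 (swash) are the dissipative-beach values as commonly quoted from Stockdon et al. [41] and should be verified against that reference; the measured H_s and T_p are used directly as deep-water values; and the terms are combined as the gauge level plus the 2% run-up, R2 = 1.1 (z_w + S/2). All function names are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration (m/s^2)


def deep_water_wavelength(tp):
    """Deep-water wavelength L0 (m) from linear theory for peak period tp (s)."""
    return G * tp ** 2 / (2.0 * math.pi)


def stockdon_dissipative(hs, tp):
    """Return (setup, swash) in metres for dissipative conditions.

    Coefficients are the values commonly quoted from Stockdon et al.;
    check them against the original reference before any real use.
    """
    scale = math.sqrt(hs * deep_water_wavelength(tp))
    return 0.016 * scale, 0.046 * scale


def total_water_level(z_gauge, hs, tp):
    """ASSUMED combination: gauge level plus R2 = 1.1 * (setup + swash / 2)."""
    setup, swash = stockdon_dissipative(hs, tp)
    return z_gauge + 1.1 * (setup + swash / 2.0)


# Illustrative values only: 0.2 m water level, 1 m waves with an 8 s peak period
z_total = total_water_level(0.2, 1.0, 8.0)
```

With these illustrative inputs the wave-induced contribution is roughly 0.43 m, which is larger than the measured water level itself.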

Video Imagery
Sisal beach has been video-monitored since September 2012 with five cameras placed at 43 m height, located 300 m east of the jetty, and 100 m inland (Figure 1a). The camera system was controlled by the SIRENA software [2], covering a beach stretch of 1.7 km (from the jetty to the pier, see Figure 1a) and delivering snapshot, time exposure, and standard deviation images every 30 min (over 10 min at 7.5 Hz). In this study, only time exposure images from camera 1 were used (C1 in Figure 1a), given that C1 covers 75 percent of the shoreline and is the most sensitive (out of the five cameras) to thermal expansion and wind stresses.
Figure 2. Significant wave height (a), peak period (b), wave direction (c), water level measured inside the port (d), and run up two percent (e). Red marks indicate the topographic measurement dates. The data in dark blue correspond to measurements. The data in light blue correspond to Wave Watch III and a tide gauge located 40 km east of Sisal.

Stabilization Algorithm
The motivation for stabilizing images is to use a single calibration, over the whole study period, to create images where pixels have an equidistant separation in meters (planviews). The image stabilization process is performed on oblique time exposure images and consists of three steps: (i) select a seed image and obtain its calibration parameters, (ii) create a database of monthly high-quality reference images (Figure 3b), and (iii) stabilize any image of interest (Figure 3c). The stabilization of an image requires defining a reference image. Steps ii and iii use the same algorithm (Figure 3a) but select the reference differently.
Step ii updates the reference image backward and forward in time using the seed image as starting point.
Step iii uses the closest image from the monthly reference images.
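The two ways of selecting a reference image can be sketched as follows. This is an illustrative outline under the assumption that each image is identified only by its date; the function names and the sample dates (other than the seed date) are hypothetical.

```python
from datetime import date


def chain_order(seed, monthly_dates):
    """Step ii: list of (image, reference) pairs for the monthly database.

    Each monthly image is stabilized against the previously stabilized one,
    moving away from the seed forward in time and then backward in time.
    """
    later = sorted(d for d in monthly_dates if d > seed)
    earlier = sorted((d for d in monthly_dates if d < seed), reverse=True)
    pairs = []
    for sequence in (later, earlier):
        reference = seed
        for image in sequence:
            pairs.append((image, reference))
            reference = image  # the newly stabilized image becomes the reference
    return pairs


def closest_reference(target, reference_dates):
    """Step iii: pick the monthly reference image closest in time to `target`."""
    return min(reference_dates, key=lambda d: abs((d - target).days))


seed = date(2018, 8, 12)
monthly = [date(2018, 6, 10), date(2018, 7, 15), date(2018, 9, 9), date(2018, 10, 14)]
pairs = chain_order(seed, monthly)
reference = closest_reference(date(2018, 9, 20), monthly + [seed])
```

In this example the September image is stabilized against the seed, the October image against the September image, and likewise backward in time for July and June.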
The requirement of creating a database of reference images (step ii) is related to the configuration of Sisal C1 images, which is representative of coastal video monitoring images and might therefore be common at other sites: the presence of ever-changing vegetation, water, and morphology. These ever-changing characteristics cover more than 90 percent of the field of view (FOV) while fixed structures cover less than 3 percent of the image. Therefore, continuous updating of the reference image is necessary. Otherwise, images with a large time gap (in the order of seasons) become too different to find matching characteristics. Furthermore, a single low-quality reference image can propagate an error. To overcome that problem, a trusted database of stabilized images was created by selecting one good quality image per month. The seed image is used as reference for the stabilization of the first monthly image (after the seed). This newly stabilized image is used as reference for the second monthly image, and so on. This same process is applied backwards for monthly images that occur before the seed image.
The image stabilization algorithm (Figure 3a) starts with defining the unstabilized image to be corrected and its closest stabilized image in time from the database of monthly high-quality reference images. Two regions of interest (ROI) are evaluated separately in the following steps because the camera orientation heavily favored feature detection near the camera, which in turn introduced a bias in the stabilization of the entire field of view. Characteristic features are then detected for both ROIs independently in the two images using the SURF algorithm [30], and the descriptors of the features are extracted and matched between images for each ROI. A lower threshold was required to detect features farther away (ROI 2) in order to balance the number of features throughout the image.
Afterwards, a homography was optimized so that the pixel positions of the unstabilized features coincide with those of the reference features. If a sufficient number of features complied with this condition in each ROI, the geometric transformation was saved. Due to the random nature of the feature detection process, different geometric transformations can be obtained using the same set of images. Therefore, the stabilization routine was repeated n times (∼20) to compute a mean transformation. If an insufficient number of transformations (<5) was found due to a low-quality image, the stabilization routine was repeated with lower thresholds.
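The averaging of repeated transformations can be illustrated with the sketch below. The text only states that a mean transformation is computed; averaging the 3 × 3 homography matrices element by element after scaling each so that its last entry equals 1 is an assumption made here, which is reasonable only for the small camera motions considered. Feature detection and homography estimation are not shown, and the function names are hypothetical.

```python
def normalize(h):
    """Scale a 3x3 homography (list of lists) so that h[2][2] equals 1."""
    scale = h[2][2]
    return [[value / scale for value in row] for row in h]


def mean_transformation(transforms, min_count=5):
    """Element-wise mean of the normalized homographies.

    Returns None when fewer than `min_count` transformations were found,
    in which case the caller would repeat the routine with lower thresholds.
    """
    if len(transforms) < min_count:
        return None
    normalized = [normalize(h) for h in transforms]
    n = len(normalized)
    return [[sum(h[i][j] for h in normalized) / n for j in range(3)]
            for i in range(3)]


def apply_homography(h, u, v):
    """Map pixel (u, v) through homography h and return the new (u, v)."""
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    return ((h[0][0] * u + h[0][1] * v + h[0][2]) / w,
            (h[1][0] * u + h[1][1] * v + h[1][2]) / w)


# Six synthetic runs that differ only in the horizontal shift (1 to 6 pixels)
runs = [[[1.0, 0.0, float(t)], [0.0, 1.0, 2.0], [0.0, 0.0, 1.0]]
        for t in range(1, 7)]
h_mean = mean_transformation(runs)
shifted = apply_homography(h_mean, 10.0, 20.0)
```

For the six synthetic runs the mean horizontal shift is 3.5 pixels, so the pixel (10, 20) maps to (13.5, 22).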

Image Data Sets
The one-hundred analyzed images correspond with the dates of the topographic surveys (96 were taken within an hour of the survey). Four image datasets were compared: (1) oblique non-stabilized images (NSI), (2) oblique stabilized images (SI), (3) planviews created from oblique SI using only one calibration, and (4) planviews created from oblique NSI updating calibrations upon visible movements of the FOV.
Planviews were created by performing a rectification with the ULISES software [42], which transforms the image coordinate system to a UTM-based coordinate system using four ground control points (GCPs) and five horizon points (red points in Figure 4a). Then, the coordinates were translated such that the tower position became (0, 0) (to accelerate computations and improve readability, see Figure 4c). Every planview was generated with a 0.5 m pixel size. The real pixel resolution along the shoreline, defined as the one-pixel projected distance change, was divided into its row and column components (Figure 4b).
The camera orientation favored the performance on the column component, showing resolutions of half a meter at the pier. The performance on the row component was worse, particularly in the x projection with a resolution of 18 m at the pier.

Shoreline Detection and Correction
The image-derived shorelines were manually detected in the planviews. Here, the detected shoreline was considered to be the run-up limit, visible in the images as the last pixel of the blurred white area (Figure 5a,c). Unlike the in situ measured shorelines (defined as the 0 m elevation position, see Section 3.1.1), the detected shoreline positions had to be corrected to the 0 m elevation contour by estimating the cross-shore offset. This offset was computed from the average foreshore slope of each field survey (each correction used a different slope) and the total water level (Section 3.1.2). As an example, the detected shoreline and its water-level correction are shown for 25 August 2015 in Figure 5b. The differences with the measured shoreline are shown in Figure 5d. From now on, the image-derived shorelines will take into account the cross-shore offset and will be referred to as SI/NSI shorelines.
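The cross-shore offset can be illustrated with a short sketch. It assumes a planar foreshore, so that the horizontal distance between the run-up limit and the 0 m contour equals the total water level divided by the foreshore slope, and a cross-shore axis that increases seaward. The function name and the numbers are hypothetical.

```python
def corrected_shoreline(x_detected, total_water_level, foreshore_slope):
    """Shift the detected run-up limit to the 0 m contour.

    x_detected: cross-shore position of the detected shoreline (m),
                measured along an axis assumed to increase seaward.
    total_water_level: elevation of the run-up limit above 0 m (m).
    foreshore_slope: average foreshore slope (tan beta), assumed planar.
    """
    if foreshore_slope <= 0.0:
        raise ValueError("foreshore slope must be positive")
    offset = total_water_level / foreshore_slope
    return x_detected + offset


# Illustrative values only: 0.4 m water level on a 0.08 slope gives a 5 m offset
x_zero = corrected_shoreline(50.0, 0.4, 0.08)
```

A positive total water level moves the corrected shoreline seaward of the detected one, and a negative level moves it landward.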

Raw Assessment
The stabilization algorithm found a valid geometric transformation with the maximum quality threshold in 80 images and with a lower threshold in 20 images. The position of the pier corner was manually digitized in the oblique images (both NSI and SI) as a means to evaluate the performance of the stabilization algorithm (Figure 6). The cameras were reoriented in August 2018, resulting in a 120-pixel deviation of the pier corner. For a fair comparison, this deviation was added as a constant after August 2018 in the non-stabilized set (see green circles in Figure 6), while the stabilized set did not require additional corrections. While the pier position in NSI varies within a 6-pixel range, the pier position in SI remains within one pixel from the original position (except for five images) and has a mean column/row pixel deviation of 0.35/0.25 pixels. After reorientation (August 2018), the pier position in SI is less stable, with a mean column/row deviation of 0.5/0.7 pixels, suggesting an accuracy loss when correcting a large FOV difference. The effect of the stabilization process on solving the camera movements can be appreciated by averaging the image datasets, which results in a blurred pier for NSI and a well-defined pier for SI (Figure 6d,e). For fairness, the NSI average image was computed until August 2018, before the camera reorientation.
The discontinuities found in the manually digitized position of the pier corner were used to define calibration updates of the NSI dataset to create NSI planviews. In total, six calibrations were used (July 2015, July 2016, February 2017, January 2018, and August 2018; see red-green shades in Figure 6a,b). Only one calibration (seed image 12 August 2018) was needed to create the SI planviews.

Shoreline Assessment
The algorithm performance was assessed by comparing the two data sets of image-derived shorelines to measured shorelines (Figure 7). The camera covers a beach section of ∼1.1 km corresponding to transects P4 to P15 (red lines in Figure 1a) with P4/P15 being ∼1380/290 m away from the camera.
The shoreline Root Mean Squared Error (RMSE) values show that, in general, SI shorelines had higher accuracy than the NSI shorelines with an average RMSE of 1 m and 1.4 m, respectively. Individually, NSI shorelines have a lower RMSE in 28 images with an average of 0.85 m (1 m for SI). According to the RMSE, shorelines were classified as excellent (<1 m), good (1-2 m), bad (2-3 m), and poor (>3 m; Figure 7b). The image stabilization process increased the number of excellent shorelines (60 for SI versus 39 for NSI) and reduced the number of bad and poor shorelines (5 for SI versus 11 for NSI).
Figure 8 shows the evolution of the shoreline position, close (profile P15) and far (profile P4) from the camera, referenced to the initial in situ measured shoreline. The progradation of P15 during the entire study period is well captured by both data sets with small variations around the in situ measured shoreline (Figure 8a). Both data sets present significant deviations during winter 2016/2017 and the shoreline position change is overestimated in NSI during the second half of 2018. The shoreline position at P4 is stable on interannual timescales and shows seasonal variability, located more seaward in spring and shoreward in autumn (Figure 8b). In this case, the NSI shoreline behavior is more erratic than the SI shoreline behavior. In general, the SI shoreline position follows the measured positions, with certain abrupt deviations, but it captures the shoreline seasonality. The NSI shoreline positions show large artificial variations throughout the study period, capturing the seasonality to a lesser extent (Figure 8). The bias of SI and NSI shorelines from in situ measurements (Figure 8c,d) clearly shows that accuracies are lower far from the camera: the mean absolute bias for SI/NSI is 0.75/0.96 m at P15 and 0.96/1.8 m at P4. The RMSE is 1.2 m for SI and 1.5 m for NSI at P15, whereas it is 1.75 m for SI and 3.5 m for NSI at P4.
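The RMSE and the quality classes described above can be computed as in the following sketch. The class limits follow the text; how a value that falls exactly on a limit (1, 2, or 3 m) is assigned is an assumption, and the function names and sample positions are hypothetical.

```python
import math


def rmse(estimated, measured):
    """Root mean squared error between two equally long lists of positions (m)."""
    if len(estimated) != len(measured) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - m) ** 2 for e, m in zip(estimated, measured)]
    return math.sqrt(sum(squared) / len(squared))


def classify(rmse_value):
    """Quality class of a shoreline: excellent, good, bad, or poor."""
    if rmse_value < 1.0:
        return "excellent"
    if rmse_value < 2.0:
        return "good"
    if rmse_value <= 3.0:
        return "bad"
    return "poor"


# Synthetic shoreline positions at three transects, for illustration only
image_positions = [1.0, 2.0, 3.0]
survey_positions = [1.0, 2.0, 5.0]
error = rmse(image_positions, survey_positions)
label = classify(error)
```

In this synthetic case a single 2 m deviation at one of three transects yields an RMSE of about 1.15 m, which falls in the good class.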
Shoreline long-term trends are computed for each beach transect by fitting a line to each shoreline position time series. The behavior of shoreline tendencies along the beach is well represented in both data sets, with large accretion in the western profiles (P10 to P15) and relatively stable profiles closer to the pier (P4 to P7, see Figure 9). The SI shorelines improve the determination of the shoreline trends with a mean absolute bias of 0.01 m/year compared to 0.25 m/year for NSI. In line with the bias in shoreline position, bias in shoreline tendency for NSI increases with distance from the camera.
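A minimal sketch of the trend computation is given below: an ordinary least-squares line is fitted to a shoreline position time series and its slope is taken as the trend in m/year. The function name and the synthetic series are hypothetical.

```python
def linear_trend(t_years, positions):
    """Least-squares slope (m/year) of shoreline position against time.

    t_years: survey times expressed in years.
    positions: shoreline positions (m) at those times.
    """
    n = len(t_years)
    if n < 2 or n != len(positions):
        raise ValueError("need at least two samples of equal length")
    mean_t = sum(t_years) / n
    mean_p = sum(positions) / n
    sxx = sum((t - mean_t) ** 2 for t in t_years)
    if sxx == 0.0:
        raise ValueError("all samples share the same time")
    sxy = sum((t - mean_t) * (p - mean_p) for t, p in zip(t_years, positions))
    return sxy / sxx


# Synthetic series advancing 6 m each year, for illustration only
trend = linear_trend([0.0, 1.0, 2.0, 3.0, 4.0], [10.0, 16.0, 22.0, 28.0, 34.0])
```

The bias in the trend at a transect is then the difference between the slope obtained from the image-derived series and the slope obtained from the measured series.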

Error Sources
The water level varies due to tides and waves, influencing the detected shoreline position. The former is measured in this study but the latter had to be inferred. Although not shown here, we followed the methodology of Aarninkhof et al. [43] and tested other parametrizations [41,44,45]. For Sisal beach, the best results were obtained using the dissipative beach parametrization of Stockdon et al. [41].
The time-averaged RMSE of the shoreline position without water-level correction is 1.9 m. Taking the tide into account reduces the error to 1.6 m, and adding the wave contribution further reduces the error to 1.0 m. Other sources of error are the image stabilization, shoreline detection, and alongshore variability of beach slope and water level. These are related to three characteristics: image quality, shoreline clarity, and shoreline sinuosity. The SURF algorithm (feature detection) and, therefore, the movement stabilization are influenced by the image quality and by illumination variations. Given that the surveys were taken at approximately the same hour, the effect of the illumination was reduced. However, images were not filtered by quality, so that diverse field conditions were captured. Seventy-three out of one hundred images were of high quality, twenty-four were slightly blurred (fog or salt on the camera case glass), and three had very poor visibility over large parts of the image (Figure 10). For this reason, the proposed algorithm includes checks to ensure that the transformations are being solved properly. Then, the level of shoreline clarity (a combination of sand color, fog, wave conditions, and wrack) influenced shoreline detection. Fifty-seven shorelines were identified with high confidence, thirty-nine were clear enough, and four required guessing in several beach sections (Figure 10). Finally, alongshore shoreline/bar variability produces alongshore variability in the wave-induced water level, which was not taken into account. For example, beach cusps focus swash motion in the valleys, and megacusps induce alongshore breaking variability that influences wave setup at the shoreline. Only 40 shorelines were considered smooth, 38 were irregular, and 22 had pronounced undulations (Figure 10).
With the introduced stabilization technique, the number of images with large RMSE (above 2 m) was reduced from 22 in the NSI dataset to 5 in the SI dataset (Figure 7). These five values occur during low shoreline visibility and in the presence of large shoreline undulations (megacusps and crescentic bars), suggesting that the stabilization algorithm is robust enough to handle challenging field conditions. Similar behavior occurs for the 18 images with an RMSE above 1.5 m, of which 10 images had a high degree of sinuosity, 5 had poor clarity, and 3 were optimal images but with large amounts of algae debris. This indicates that the largest shoreline inaccuracies arise mostly from alongshore variability and the shoreline detection process rather than from the stabilization process.

Relevance
The long-term trends computed with the stabilized and non-stabilized data sets have an accuracy of 0.01 m/year and 0.25 m/year, respectively. The analyzed camera moves around a point, which results in positive and negative pixel biases that compensate, to a certain extent, the camera movement in the long term. A systematic bias in a given direction (such as the period of 17 January to 18 July, see Figure 6) can have a larger impact but can be mitigated by defining new calibrations. Despite these strategies to improve the accuracy in non-stabilized images, the stabilization still improves the accuracy of the long-term trend by one order of magnitude. This level of accuracy might not be necessary for certain objectives, e.g., understanding large-scale dynamics of the system. However, there are other benefits besides accuracy. The multiple calibration process, which involves manually selecting control points and ensuring that the new planviews generate consistent measurements, is user dependent and monotonous, which adds another source of uncertainty. Therefore, the extra initial effort to stabilize the camera movement for long-term trends is justified even if the user does not require the accuracy improvement. Furthermore, applying an adequate water level correction can be as important.
The movement correction becomes more relevant when studying events or seasonality, especially with increasing distance from the camera. For instance, transect P4, located 1300 m away from the camera, presents large seasonal variations, in particular during 2016 and summer-autumn 2018 (Figure 8). While SI shorelines mostly follow the measured shoreline fluctuations, NSI shorelines exhibit unreal events of accretion and erosion (e.g., during 2016, Figure 8b) or disguise the erosion and recovery cycles (e.g., summer-autumn 2018, Figure 8b).

Methods Applicability
The correction of camera movements at beach video monitoring stations around the world has improved over the last years [25,26,28]. The algorithm presented in this work is aimed at handling vegetated beaches with few fixed objects. Next, the applicability of methods capable of handling this type of scene is discussed.
Bouvier et al. [25] predicted camera movements based on environmental conditions. Their method should work independently of the scene. However, its implementation at a different site is challenging, given the data requirements, and its accuracy seems to be lower than that of other studies. Furthermore, systematic deviations (see 2017-2018 in Figure 6b) not related to climate conditions cannot be solved this way. In spite of these limitations, their method is the preferred alternative in a field of view that only captures water.
Simarro et al. [28] used the ORB algorithm, which relies on a corner detector to find keypoints [31]. Instead of comparing one image against another, they compared it against a basis of images (8 instead of 1). This provides great flexibility to the algorithm and increases its autonomous performance. They analyzed two types of scenes: with buildings (90% success rate) and with vegetation and fixed objects (40% success rate). The latter is similar to our scene. Our method uses the SURF algorithm, which relies on a blob detector [30]. A possible explanation for the lower success rate is that the corner detector might not be as suitable for vegetated scenes as it is for building scenes. Another explanation is that changes in vegetation might not be captured by the basis of images. For this particular case, the basis of images might need to be extended to also cover a wide range of months and years. In fact, at Sisal beach SURF does not find enough matching features between two images that are six months apart. This suggests that, in a natural environment that is in constant change, the reference image has to be updated at a time frame characteristic of the vegetation evolution. In our case, a monthly update was enough to detect matching features without losing accuracy.

Conclusions
An unsupervised algorithm to stabilize the movement of beach video monitoring systems has been proposed. The algorithm is capable of solving camera movements in spite of challenging conditions, e.g., blurry images capturing energetic conditions and ever-evolving dune vegetation.
The performance of the algorithm is evaluated by assessing the accuracy of image-derived shorelines up to 1380 m from the camera, using imagery of a single camera looking along a beach stretch of 1.1 km and 4 years of beach profile in situ measurements (a total of 100 surveys). The high-frequency field data available (beach surveys, tide gauge, and ADCP measurements) allow the accuracy of seasonal changes and long-term trends to be determined. In general, image stabilization increases the accuracy of shoreline measurements by reducing the root mean squared error in shoreline position by 40% compared to a traditional approach. Stabilization becomes more important at shorter time scales and with increasing distance from the camera. Without stabilization, seasonal variability was not captured in 2016/2017 and, in some events, the images suggest non-existent erosion/accretion. Long-term trends along the beach are well captured with and without stabilization, but their accuracy differed by one order of magnitude (0.01 m/year vs. 0.25 m/year). Furthermore, accounting for water level variations is as important as stabilization to track morphological changes. At the studied microtidal site, taking into account the tide reduces the error by 20% and including wave-induced elevations reduces the error an additional 40%.