1. Introduction
Wave-breaking is the main mechanism of wave energy dissipation, which plays a crucial role in the energy balance of surface waves. Whitecaps are formed when breaking waves entrain air at the surface, forming a submerged bubble plume that appears as a patch of highly reflective foam at the sea surface. Whitecaps enhance the transport of gases across the air–sea interface [1,2], are an important source of primary marine aerosols [3], and alter the ocean albedo [4,5].
At present, whitecap observation methods include the use of monocular cameras [6,7,8,9,10], stereo cameras [11], infrared cameras [12,13,14], and satellites [15,16,17,18], among other methods. Monocular cameras have been widely used to detect the whitecap coverage W over the past 20 years due to their low cost and high flexibility. The most widely used whitecap detection method based on the use of a monocular camera is the automated whitecap extraction (AWE) method based on the percentage increase in the number of pixels (PIP) function proposed by Callaghan [19]. Compared with the earlier algorithms [20,21,22], Callaghan’s method can automatically obtain the threshold for every single grayscale image if whitecaps exist in the picture, while earlier algorithms prefer to use one threshold when analyzing a short video. Inspired by AWE, adaptive thresholding segmentation (ATS) [23] has been proposed as an improved global threshold segmentation method, which reduces the multiple time derivation and smoothing operations used in AWE, effectively improving the speed of whitecap detection. More recently, the use of deep learning methods for whitecap analysis has been discussed [24,25,26]; however, some of them are only concerned with wave-breaking events, not the whitecap itself.
Whitecaps have particular spatial and temporal characteristics. For further understanding of the movement of a single whitecap and the statistical properties of whitecaps, whitecap detection in continuous sequences is particularly important. In order to obtain convergent whitecap coverage, hundreds of images within 20 min need to be available [19], as well as continuous motion tracking. However, due to the optical characteristics of cameras, images captured under varying illumination conditions may differ, resulting in significant errors. AWE and ATS detect whitecaps accurately under uniform illumination; under uneven illumination, however, a global threshold leads to misjudgment. In order to retrieve whitecaps automatically in uneven lighting conditions, IBCV has been proposed [23]. This method maximizes the inter-class variance when obtaining the segmentation threshold. Because IBCV is built on a different idea from the other methods, it performs better only under uneven illumination and is inapplicable under other conditions. Liu [27] introduced a pre-processing method, the top-hat transform, which suppresses the background pixels under uneven lighting in a single image, allowing whitecap pixels to be enhanced. An adaptive thresholding method has also been applied [28] for whitecap detection: the image was split into 64 × 64 overlapping sub-images, the Otsu method was used to obtain the optimal threshold for every sub-image, and a contour identification step was used to distinguish actual whitecap contours. A semantic whitecap extraction model has also been proposed [26] to eliminate the erroneous detections that arise under uneven illumination.
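To make the idea of maximizing the inter-class variance concrete, the following is a minimal sketch of the classical Otsu threshold applied to a flat list of 8-bit gray values. It is an illustration only and is not the implementation used in [23] or [28]; the function name `otsu_threshold` and the toy pixel values are assumptions introduced here.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the inter-class variance
    between the classes {p <= t} and {p > t}."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower (background) class
    sum0 = 0  # sum of gray values in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break             # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Inter-class variance up to the constant factor 1 / total**2
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Toy example (hypothetical values): dark sea surface plus a few bright pixels
toy_pixels = [10] * 90 + [200] * 10
toy_t = otsu_threshold(toy_pixels)
toy_bright = [p for p in toy_pixels if p > toy_t]
```

In a sub-image scheme such as the one described for [28], a function of this kind would be called once per tile rather than once per image, so that each tile receives its own threshold.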
In addition to uneven illumination, sun glints can also influence the captured image, serving as the main source of light pollution. In most analyses, these images are discarded. The easiest solution is to add a polarizer to the camera, which removes certain incident light [29]. However, most whitecap images are derived from ship-borne cameras or fixed cameras on offshore platforms, where wave-induced vibrations may come from any direction and cannot be avoided, so fixed polarizers are not effective at all times. Careful positioning of the camera can help to avoid contamination from sun glints and from uneven illumination caused by sky reflection [19]. For long-term monitoring, however, frequent camera adjustments increase costs and reduce automation: the analysis of whitecap coverage requires removing the perspective effect, a change of angle also changes the field of view, and the camera’s extrinsic parameter matrix must be re-calculated for the new angle [30]. With the movement of clouds and the sun, light pollution also occurs in short-term videos, so accurate whitecap detection under light pollution conditions is generally necessary.
Sun glints occur in many imagery methods applied to sea surface monitoring. Two sun glint removal techniques are commonly used: (1) a radiative transfer model coupled with a statistical model of surface water, in order to predict water-leaving reflectance [31,32]; and (2) using near-infrared (NIR) wavelengths, which exhibit maximum absorption and minimal water-leaving radiance over clear waters [33,34], as a proxy for the amount of sun glint in a pixel, as well as for finding the spatial variation of glint intensity across the image. However, these methods require additional equipment to measure radiance and are most often used with satellite images, which typically have a spatial resolution coarser than 100 m per pixel, making them unsuitable for the high-spatial-resolution scenes of offshore structure-based or ship-borne cameras. A sun glint correction method for a UAV platform has been proposed [35], and simultaneous multi-channel polarimetric cameras have been shown to minimize the influence of sun glints in a detailed analysis of sea surface polarization patterns under different sea states and solar zenith angles [36]; however, these approaches also require additional equipment.
It should be emphasized that our goal is not to remove sun glints but, instead, to reliably detect whitecaps even in highly light-polluted images. As the whitecap alters the albedo of the sea surface, in most cases, the light pollution area will not contain the whitecap; conversely, the whitecap area will not have obvious light pollution. This is why the whitecap can still be observed by the naked eye in the presence of light pollution. In the absence of additional equipment, the difference in spatial and temporal properties between sun glints and whitecaps can be utilized. Contours with tracked time shorter than 2/3 second have been directly removed in an earlier study [
37], where sun glints may behave similarly to whitecaps when the wave is slight or moderate. A UNet-based sun glint and whitecap separation method [
38] has been proposed, but no detailed comparison is available. Distinguishing whitecaps and light pollution using statistical methods is acceptable in some cases; for example, using the average grayscale value
X and standard deviation value of the abnormal pixels
[
39] to find a new threshold, where the abnormal pixels are those which have values higher than a manually determined threshold. The new threshold can be calculated as
, as determined by experience, and whitecaps should have a higher grayscale value than the new threshold. This method shares the same idea as an efficient method to separate the diffuse and specular reflection components in a single image [
40]; however, this method requires a significant difference in brightness between the whitecap and the sun glint in the image, which is not always fully satisfied in images.
Although existing whitecap detection algorithms can achieve automatic detection under ideal conditions, it is still challenging to detect whitecaps reliably under varying illumination conditions. Therefore, we propose an automated whitecap detection method for complex illumination conditions with varying levels of light pollution. First, the automated whitecap detection algorithm is improved on the basis of existing approaches, allowing the proposed method to obtain all abnormal pixels in a single image, including whitecaps and light pollution. Then, the sea surface abnormal image sequence and a down-sampled abnormal image sequence are used for optical flow trajectory analysis in order to remove misjudgments due to illumination effects.
The remainder of this manuscript is organized as follows: Section 2 describes the materials and methods. Section 3 provides the results. Finally, in Section 4, we detail our findings in the discussion.
3. Results
Light pollution in an image varies considerably with the illumination conditions. According to the behavior and shape of the light pollution area, the performance of the proposed method under various conditions was analyzed. As mentioned above, we used videos converted to WCS. For convenience, we selected a fixed rectangular area in each video as our region of interest (ROI); the ROI did not change within a video but may differ between videos.
3.1. Results for AD
In previous methods, images that were not pre-processed were used for whitecap detection, leading to performance that varies greatly under different illumination conditions. Here, we compare the accuracy of whitecap detection under various illumination conditions.
As shown in
Figure 9, the AWE, ATS, and sub-image Otsu methods were selected for comparison with our method.
The source video of these four images was used to confirm the exact location of the whitecaps. We selected five volunteers to mark whitecaps and used the average value as the real
W. The comparison between
W obtained by different methods is shown in
Table 1. In the first row of the figure, every method could obtain most of the whitecap areas. For AWE, the brightness of the water surface close to the whitecap is relatively high, resulting in obvious misjudgments. However, in both the ATS and sub-image Otsu (SOtsu) methods, there are missed detections. It should be noted that in
Table 1, although ATS has the best result in the first case, this is because the areas it falsely detected offset the whitecaps it missed. The proposed method does not miss any whitecap; only the lower-intensity edges of the whitecaps are missed.
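As a minimal sketch of how a reference W of this kind can be formed, the snippet below computes the whitecap coverage of a binary mask as the fraction of flagged ROI pixels and averages it over several manual annotations. The function names and the tiny 2 × 4 masks are hypothetical and serve only to illustrate the arithmetic; they are not the annotation data used in this study.

```python
def whitecap_coverage(mask):
    """Fraction of ROI pixels flagged as whitecap (mask entries are 0 or 1)."""
    n_pixels = sum(len(row) for row in mask)
    n_white = sum(sum(row) for row in mask)
    return n_white / n_pixels


def reference_coverage(annotator_masks):
    """Average W over several manual annotations of the same image."""
    values = [whitecap_coverage(m) for m in annotator_masks]
    return sum(values) / len(values)


# Two hypothetical annotations of the same 2 x 4 ROI
mask_a = [[0, 1, 1, 0],
          [0, 0, 1, 0]]   # 3 of 8 pixels marked
mask_b = [[0, 1, 0, 0],
          [0, 0, 1, 0]]   # 2 of 8 pixels marked
w_ref = reference_coverage([mask_a, mask_b])
```

Averaging W rather than the masks themselves means that two annotations can agree on coverage while disagreeing on location, which is the same offsetting effect noted for ATS in the first case.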
In the second row, there is almost no bright area on the water surface caused by illumination, and there is a whitecap with a significant difference in brightness from the overall water surface. In this case, all methods obtained the whitecap accurately, although AWE still obtained some misjudgments and SOtsu missed a small whitecap. Whether from the image or from the detection results, the effect of AWE and our proposed method is basically the same in this case.
In the third row, the brightness difference between the whitecap and the water surface was significantly smaller than in the other rows. For AWE, although most of the whitecaps were obtained, many bright areas on the water surface were also misjudged. The threshold obtained by ATS, however, is higher than that of AWE, and only a small part of the whitecap was obtained. The ATS threshold is higher than the AWE threshold in every case, which is why the false-positive area with ATS is smaller than that of AWE. The small brightness difference caused the SOtsu method to fail. In this case, where the brightness difference between the whitecap and the water surface is small, our method identifies the whitecap more accurately thanks to reasonable pre-processing.
The last row shows the existence of many sun glints and high-brightness areas on the sea surface. This condition is outside the scope for which the other methods were designed, so their ability to detect abnormal areas cannot be compared fairly here; it is nevertheless a condition that occurs frequently. It is shown here because minimizing the detected high-brightness area on the sea surface helps to speed up the subsequent analyses. As can be seen in both
Figure 9 and
Table 1, our method obtains the smallest abnormal sea surface area. The processing time of these methods is compared in
Section 3.6.2. According to the above analysis, the improved abnormal detection method proposed in this study can accurately detect whitecap and light pollution areas under different illumination conditions.
3.2. Results for WS without Sun Glint
We selected videos without light pollution conditions and applied our method in order to detect whitecaps. The original images, the optical flow trajectories under
and
, and the whitecap detection results are depicted in
Figure 10.
In the figure, the first three and last three columns are from two different videos, which we denote by and , respectively. In the two cases, due to the different illumination conditions, the color of the sea surface is quite different, and whitecaps with different lifetimes have significant intensity differences in the same image. In , the features found were concentrated on whitecaps, with a few features on the sea surface; that is, small areas on the sea surface had relatively high brightness after the top-hat transform and were identified as abnormal areas instead of being suppressed as the background. As shown in the figure, their trajectories were not well-preserved, and thus, those features were not expected to influence further analysis. Considering the feature points of the whitecap, the number of feature points is proportional to its size; that is, the larger the whitecap, the higher the number of feature points. The whitecap marked in was undergoing a transition from the generating stage to the active stage, so the whitecap area increased significantly, and feature points inside the whitecap also gradually increased. In , W is much higher than in , leading to many more feature points. All of the features were within whitecaps, as the difference between the whitecaps and the water surface was distinct. It should be noted that the trajectory colors carry no meaning: random colors were selected for drawing only to make individual trajectories easier to tell apart.
For each whitecap, a stable optical flow trajectory can be maintained at two sampling rates. Thus, the corresponding
and
have high similarity. Ideally, the trajectories should be exactly the same at both sampling rates, as every feature point is exactly the same. However, in the actual iterative tracking of feature points, different feature points may be discarded or retained at different sampling rates. Therefore, we may use a spatially neighboring feature instead of the identical feature. In the Shi–Tomasi corner detection method, we set the maximum number of detected corners to 100 and set the minimum distance between every two corners to 5 pixels. Both settings are a compromise between detection accuracy and processing speed. Both neighbor feature trajectories of
and
are shown in
Figure 11, where
and
can be found in the third and sixth columns. We applied correlation analysis to the sub-trajectories in the
x- and
y-axes, respectively, where, in
, the subscripts
x and
y represent the respective sub-trajectory axes.
appears as solid lines, while dashed lines represent
. Changes in
are not as frequent as those in
, which is expected as
. However,
and
were found to be highly correlated after linear interpolation. In addition, for a certain trajectory, too few trajectory points or the loss of features in the trajectory iterative process at
may lead to the interruption of whitecap tracking. Our solution to this was as follows: once a confirmed whitecap trajectory was found, the corresponding
was treated as the trajectory of the whitecap, which continued to be tracked until
was lost. This is also due to consideration of the analysis speed, as if all trajectories are compared in every frame, the efficiency would be reduced. As shown in
Figure 10, all whitecaps were detected in
, while a few small whitecaps in
were rejected, as the whitecap features were lost due to low quality.
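The comparison of sub-trajectories after linear interpolation can be sketched as follows. This is a simplified illustration under stated assumptions, not the code used in this study: both trajectories are assumed to start at the same frame, the low-rate trajectory is assumed to hold one point every `step` frames, and the names `interpolate`, `pearson`, and `axis_correlations` are introduced here for illustration.

```python
import math


def interpolate(times, values, query_times):
    """Piecewise-linear interpolation of (times, values) at query_times.
    times must be strictly increasing; query_times must be non-decreasing
    and lie within [times[0], times[-1]]."""
    out = []
    j = 0
    for q in query_times:
        while j < len(times) - 2 and q > times[j + 1]:
            j += 1
        t0, t1 = times[j], times[j + 1]
        w = (q - t0) / (t1 - t0)
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return out


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # constant series: correlation undefined, treated as 0 here
    return cov / math.sqrt(va * vb)


def axis_correlations(high, low, step):
    """high: [(x, y), ...] with one point per frame.
    low: [(x, y), ...] with one point every `step` frames (at least 2 points).
    Returns (r_x, r_y) over the frames covered by both trajectories."""
    low_times = [i * step for i in range(len(low))]
    last = min(len(high) - 1, low_times[-1])
    query = list(range(last + 1))
    rs = []
    for axis in (0, 1):
        low_interp = interpolate(low_times, [p[axis] for p in low], query)
        high_axis = [p[axis] for p in high[: last + 1]]
        rs.append(pearson(high_axis, low_interp))
    return tuple(rs)


# Hypothetical whitecap-like case: the same steady motion seen at both rates
high_wc = [(2.0 * i, 10.0 + 0.5 * i) for i in range(11)]
low_wc = high_wc[::5]
r_wc = axis_correlations(high_wc, low_wc, step=5)

# Hypothetical glint-like case: the low-rate y motion runs the opposite way
high_gl = [(float(i), float(i)) for i in range(11)]
low_gl = [(0.0, 10.0), (5.0, 5.0), (10.0, 0.0)]
r_gl = axis_correlations(high_gl, low_gl, step=5)
```

In the first toy case both coefficients are close to 1, whereas in the second the y-axis coefficient is negative, which mirrors the kind of contrast between whitecap and sun glint trajectories discussed in the following sections.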
3.3. Results for WS with Random Sun Glint
In this section, the video shown in
Figure 8 was used for analysis. We used images under the WCS, while the images in
Figure 8 are under the PCS. In these images, except for the whitecap, there are many sun glint pixels, which appeared and disappeared randomly, such that a stable optical flow trajectory would not exist for light pollution features. The images converted to WCS, trajectories, and whitecap detection results are provided in
Figure 12. The longest trajectory lengths produced by the whitecaps and sun glints in the corresponding column are recorded in
Table 2.
First, we examined the optical flow trajectory of the whitecap. In the first column of
Figure 12, the whitecap is in the generating stage. Obviously, there are no
and
, which could be considered as neighbor feature trajectories. More precisely, the feature trajectory of the whitecap at
has not yet been established at this time. As the whitecap moves, it evolves into the active stage. In the second and third columns, with the update of the trajectory, stable
and
appear, and the length of the trajectory increases with time, as shown in
Table 2, until the maximum length of 25 is reached, which indicates the existence of stable trajectories. In the last two columns, the whitecap gradually evolves from the active stage to the mature stage. In
, trajectories from the active stage to the mature stage are preserved. The rapid evolution of the whitecap is hard to track at
, and thus, loss of features and re-establishment of trajectories occurs, while
is much shorter than
in the same column. However, this does not affect the analysis, as only the last segment of trajectories at the same time period was used in the calculation of
, and we only utilized
, which has more points than interpolated
. We assumed that interpolated
has
a points and
has
b points, where
, and we used the last
a points of interpolated
and
. What may be misleading is why the first column did not find
and
, which satisfy the judgment criteria, but a whitecap was still marked in
Figure 12d. When we marked the whitecap in images, we assumed that we found trajectories satisfying our judgment criteria at time
t, and
has
b points. We started processing from the image at the first point of the trajectory; that is, the image captured at time
, as the number of video frames is
and the contour containing the trajectory point is the whitecap.
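The use of only the last segment of the two trajectories can be sketched as below. Because the relation between a and b is not reproduced here, the sketch simply keeps the most recent n points of both sequences, where n is the shorter length; this choice and the name `align_last_segment` are assumptions made for illustration.

```python
def align_last_segment(interp_low, high):
    """Keep only the most recent n points of both trajectories,
    where n is the length of the shorter one."""
    n = min(len(interp_low), len(high))
    if n == 0:
        return [], []
    return interp_low[-n:], high[-n:]


# Hypothetical x-coordinates: 3 interpolated low-rate points, 5 high-rate points
seg_low, seg_high = align_last_segment([1, 2, 3], [10, 20, 30, 40, 50])
```

Restricting the comparison to the shared, most recent time window means that an earlier loss and re-establishment of features at one sampling rate does not enter the correlation.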
Regarding the sun glint points, as the sun glint basically appears at random locations in such cases as in
Figure 12b, except for the whitecap
, there are some features brought by sun glints, some of which create optical flow trajectories of a certain length. We counted the length of the longest trajectory produced by whitecaps and by sun glints in
Figure 12.
Using images to describe the trajectory length is not intuitive; we describe it in more detail from the perspective of the maximum trajectory length. Based on the data in
Table 2, in the first four images,
is gradually increased from 4 to 25 at
and from 2 to 6 at
.
is not stable, which also proves the random appearance of the sun glints because the length of a stable feature trajectory should be gradually accumulated. It can also be seen in
Figure 12 that the trajectories of sun glints in different images vary greatly. In the fifth image,
also decreases at a low sampling rate due to the transition from the active stage to the mature stage of the whitecap, but increases again in the next image. However, there is still no significant regularity in
. It is also worth mentioning that the
of the fourth picture at
reaches 17; at a sampling rate of 25 Hz, its duration of 17/25 = 0.68 s exceeds 0.67 s. According to the previous detection method [
37], this area will be treated as a whitecap. Such a case is analyzed in
Section 3.4.
3.4. Results for WS with Sun Glint in Certain Shapes
The condition considered in the previous section is a relatively simple case of sun glints. In more complex cases, due to the peaks and troughs of the waves, the sun glints appear to have a specific slope on the sea surface and persist with the movement of the waves. Under these conditions, the behavior and trajectories of sun glints are shown in
Figure 13.
The case shown in the first three columns is denoted by , while that in the last three columns is denoted by . The marked sun glint area is similar to a whitecap in the mature stage in , but its intensity changes in a wide range, and the shape change is quite different from that of a whitecap. There is a stable trajectory at , but its corresponding does not exist. This is because, as the sampling rate decreases, the feature points that can be continuously found at are already regarded as error points in , as the feature quality threshold designed in the method will reject features with lower similarity. Hence, there are less feature points at than at . Inside the red box of , the sun glints and mature whitecaps are very hard to distinguish. On one hand, this is due to the image degradation caused by the conversion to WCS. On the other hand, the two do present very similar behaviors. The whitecap is marked in the yellow box; under the optical flow trajectory at a certain sampling rate, both whitecaps and sun glints exhibit similar persistent trajectories.
The number of trajectories and the trajectory lengths in the marked area in
were compared at
and
, as shown in
Figure 14a. In the figure, the length of the trajectories in the marked area at
is longer, which is reasonable: about half of
were longer than 10, and about a quarter of
were longer than 16. In addition, at
, a quarter of
had more than 5 points. For a trajectory with
a feature points at sampling rate
f, the tracked time
can be calculated as
. In this condition, regardless of whether we use
or
to track the feature, if the feature lasts longer than 2/3 s, the method using tracked time [
37] would fail. In the following content, we will also call it the single optical flow method, abbreviated as SOF. Correspondingly, the whitecap separation method proposed could be summarized as a method of judgment using the correlation coefficient of optical flow trajectory, which would be abbreviated as OFTC.
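The tracked-time argument can be checked with a few lines of arithmetic. The sketch below assumes that the tracked time of a trajectory with a points at sampling rate f is a/f, which is consistent with the 17-point, 25 Hz example in Section 3.3, and that a tracked-time-only rule keeps a contour when this time is at least 2/3 s. The function names are hypothetical.

```python
from fractions import Fraction

# Threshold reported for the earlier tracked-time study [37], in seconds
MIN_TRACKED_TIME = Fraction(2, 3)


def tracked_time(n_points, rate_hz):
    """Duration covered by n_points samples at rate_hz (assumed t = a / f)."""
    return Fraction(n_points, rate_hz)


def sof_keeps(n_points, rate_hz):
    """True if a tracked-time-only rule would keep the contour as a whitecap."""
    return tracked_time(n_points, rate_hz) >= MIN_TRACKED_TIME


# At 25 Hz, 17 points last 0.68 s and pass the rule; 16 points (0.64 s) do not
glint_17 = sof_keeps(17, 25)
glint_16 = sof_keeps(16, 25)
```

Under this assumption, any sun glint trajectory of 17 or more points at 25 Hz is kept by the tracked-time rule, which is why a sizeable share of long trajectories in the marked area is problematic for SOF.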
Figure 14.
(
a) Histogram of trajectory point numbers. (
b) Whitecap neighbor feature optical flow trajectories of
. (
c) Sun glint neighbor feature optical flow trajectories of
. (
d) Sun glint neighbor feature optical flow trajectories of images in
Figure 15.
The whitecap trajectory and the sun glint trajectory of
are shown in the figure. The correlation coefficients of the whitecap trajectory are 0.9828 and 0.9747, respectively, while the correlation coefficients of the sun glint trajectory are 0.7162 and −0.6462, respectively. The detection result of
can be found in
Figure 16b.
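To illustrate how such coefficients can separate the two cases, the sketch below applies a simple rule that requires both axis coefficients to exceed a cut-off. The cut-off value of 0.9 and the function name `is_whitecap` are hypothetical choices for this example; the actual judgment criteria are defined in Section 2 and are not reproduced here.

```python
R_MIN = 0.9  # hypothetical cut-off, chosen only for this illustration


def is_whitecap(r_x, r_y, r_min=R_MIN):
    """Accept a trajectory pair only if both axis correlations are high."""
    return r_x >= r_min and r_y >= r_min


# Coefficients reported above for the whitecap and the sun glint trajectories
whitecap_ok = is_whitecap(0.9828, 0.9747)
glint_ok = is_whitecap(0.7162, -0.6462)
```

With these reported values, any cut-off between roughly 0.72 and 0.97 gives the same outcome, so the example is not sensitive to the exact number chosen.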
We believe that using optical flow trajectories at a single sampling frequency, although it exploits temporal and spatial information, still lacks a reliable judgment criterion. The results in the previous section and this section show that even a sun glint area can have a stable optical flow trajectory, and that correlation coefficient analysis at two sampling rates can solve this problem to a certain extent. Combining
Figure 13 and
Table 3, although there is no whitecap in
, there are many light pollution areas. Obvious false detections could be found under ATS or AD. Furthermore, using SOF to remove false positives cannot remove the sun glint areas in the red box of
. This is also the result of
Figure 14a, as a considerable part of the feature points have long-duration optical flow trajectories. In
, the images show that the trajectories of the whitecaps and the sun glint areas are very similar, so the SOF method cannot separate them reliably in the actual detection results.
Under WCS, the image is obviously degraded. In order to better demonstrate the sun glint with certain shape conditions, the videos under PCS were selected for the analysis, in which it is easier to distinguish between sun glints and whitecaps. In
Figure 15, it is important to note that there was no whitecap in these images.
We also divided the images under PCS into two cases,
and
, corresponding to the first three columns and the last three columns of
Figure 15; the areas falsely detected by the tracked-time method are marked in
Figure 15d. Stable optical flow trajectories can be observed at both sampling rates. We selected the last three images with more obvious trajectories, whose correlation analysis is shown in
Figure 14d. The correlation coefficients were
= 0.9733 and
= 0.4886, respectively. The optical flow method itself depends on the brightness, and the brightness of the sun glint area varies greatly; thus, neighbor features could be found and stable trajectories were built, but the feature trajectories at the two sampling rates were very different. In the results, this manifests as optical flow trajectories that last for a long time, causing significant errors. However, such errors can be removed by the proposed judgment criterion 1.
Similar to under WCS, we also compared the detection results under PCS. Due to the absence of degradation and the larger field of view, using PCS images is more likely to generate neighbor light pollution feature points on the sea surface and produce stable trajectories. The ATS and AD detection results of
can be seen in
Figure 9, but in
Figure 9, we used the third image of
, and the result of AD is therefore higher than for the second image used here. Even though AD marked the smallest possible areas, SOF still produced significant false positives. The optical flow at two sampling rates gives more abundant information, allowing the OFTC method to handle this condition better.
3.5. Ablation Study
It is obvious that, in the presence of sun glints, it is essential to use the optical flow method to remove misjudgments. Regardless of whether there is a whitecap in the image, only using the abnormal detection method may lead to obvious misjudgments. The images that were analyzed before are shown in
Figure 16a. The top-hat transform results and the abnormal detection results indicate that the proposed abnormal detection method can identify abnormal points in the image. If the misjudgment removal method is not used, we cannot retrieve a reliable
W.
In
Figure 16b, most of the abnormal pixels originate from light pollution. After processing with the proposed method,
W can be obtained correctly. The same result can be found in
Figure 12.
Even in the absence of sun glints, as mentioned above, sea surface areas with relatively high brightness in the image may also be considered abnormal areas. If the bright sea surface area is large, it can be removed using various uneven illumination correction methods; a smaller area, however, looks similar to a whitecap in the image and is likely to be considered an abnormal area by a threshold segmentation method. In this case, an accurate
W cannot be obtained with only a single image; the results of such cases are shown in
Figure 17b. Such conditions occur very frequently. In a sea surface scene, such conditions usually last for several hours on sunny days, which is very unfavorable for long-term whitecap detection. Fortunately, it is simpler to use optical flow processing under sun glint conditions as, for sun glint points, even if the brightness of the point changes due to motion, the brightness is always much higher than the brightness of most points in the image, making the trajectory more likely to be found at high sampling rates. The general high-brightness area is only slightly brighter than other points. As time goes by, this area is not likely to exist in the following images, and the features are lost.
Considering the detection results shown in the second row in
Figure 9, for example, all whitecaps were found and there were no high-brightness sea surface areas or sun glints in the image. Therefore, the whitecap detection can be accomplished without the use of subsequent processing. The trajectories after the optical flow process are shown in
Figure 17c. With
Figure 10, we found that the detection results would not be affected if we added the optical flow method. The process is not required under such conditions; however, there is no good way to automatically determine whether the image contains sun glints and high-brightness sea surface areas. According to a previous statistical analysis of whitecaps [
51], we can estimate the approximate range of
W at the current wind speed based on statistical information. However, this method is still not stable because there may still be a notable difference from the empirical curve in actual situations.
3.6. Comparison with Previous Method
3.6.1. Accuracy Comparison
Due to in situ storage limitations at sea and the filtering of available video data, our data are mainly derived from
in the range of 5–10 m/s; this can be observed from
Figure 1b. In the existing dataset, firstly, we consider the use of videos without light pollution to obtain the widely used 20-min whitecap coverage
W [
19,
52,
53]. Both
W obtained using ATS and using AD are displayed. We used ATS and AD in the same dataset, and the results showed that the differences were not significant. In
Figure 18, we added the empirical curve proposed by Scanlon [
52], and most of the
W we obtained is below the convex curve. Although we cannot confirm that the statistical characteristics of whitecaps in the current sea area are the same as those in Scanlon’s work, we could use the curve as a rough distinction criterion. In our dataset, the value of
W should, most of the time, be lower than that obtained from the curve at a certain
or close to it.
Because the camera does not point toward the nadir, part of a whitecap will inevitably be obscured at certain moments by a wave crest that is closer to the camera, which may result in a smaller
W. In our verification, we found that this phenomenon has a very limited impact on the results. On the one hand, we used a 20-min average of the whitecap coverage, and errors in a small number of images are tolerable. On the other hand, the occlusion is not particularly severe in the part of the original image that we used, as shown by the orange box in
Figure 2b.
Furthermore, we selected the videos under different light pollution conditions and plotted
W obtained by (1) the SOF method, (2) the proposed OFTC method, and (3) the whitecap extraction method using a single image in
Figure 18. It is obvious that in the results without using video data,
W has a large error due to the influence of illumination and has little relationship with wind speed. The tracked-time (SOF) method can obtain results close to those of the optical flow correlation coefficient (OFTC) method in some cases, such as case A in the figure. However, in case B, the SOF method still cannot obtain reliable results. This has much to do with the shape of the light pollution. As previously introduced in
Section 3.3 and
Section 3.4, case A and case B are obtained from videos with different light pollution shapes, and if the shape of a sun glint area is similar to a whitecap shape, the performance of SOF will be greatly degraded. In the results of case B, the
W obtained by the SOF method is also several times higher than the statistical data. Since the ordinate of
Figure 18 is logarithmic, a slightly larger value also represents a difference of several times, not to mention that the SOF results in case B are significantly far from the statistical value. While the results obtained by the OFTC method are relatively more reliable, our proposed method can guarantee almost the same accuracy as previous methods in the case of random light pollution and can still stably extract the whitecap of the sea surface in the case of specific-shaped light pollution.
3.6.2. Processing Time Comparison
According to the above analysis, we believe that, under various illumination conditions and when only video data are available, the proposed post-processing method should be preferred over relying on single-image information alone. Videos under different illumination conditions were selected for the processing-time analysis. All code was written in Python.
Figure 19 shows the processing times of the abnormal detection method and the abnormal detection method + whitecap separation method compared to those of methods available in the literature.
Due to the multiple differentiation and smoothing operations of AWE, its efficiency is relatively low, while ATS, the sub-image Otsu method, and the proposed abnormal detection method can be processed quickly. Once the image size is fixed, the processing time of these methods does not depend on the image content and is therefore essentially constant. However, after adding post-processing, the processing speed is significantly reduced. First, acquiring optical flow trajectories at two sampling rates takes additional time. In addition, the neighbor feature correspondence and correlation are calculated after the trajectories are obtained, which further reduces the processing speed. The video we selected includes sea states with relatively low
W, relatively high
W, whitecaps and sun glint points, as well as whitecaps and large sun glint areas, corresponding to frames 0–500, 500–1000, 1000–1500, and >1500, respectively. Images from this video are shown above, for example, in
Figure 8 and
Figure 9. In these cases, the number of feature points in the images differs, resulting in different numbers of feature trajectories that need to be judged. Consequently, the processing time after 500 frames was significantly longer than that before 500 frames. Except for the case of relatively low
W, the images contain many abnormal areas that produce many feature points, especially in the case of large sun glint areas; therefore, the required processing time increased significantly. In the first stage of processing (about 25 frames), AD+WS ran significantly faster than in the subsequent frames. This is because frames are buffered so that a whitecap not detected in the current frame can still be taken into account, which ensures that no whitecap goes undetected. During the first 25 frames, the program only needs to store the images in the queue; after 25 frames, it performs enqueue and dequeue operations and detects the whitecap(s) in each frame, which takes considerably more time. Overall, however, the time required for AD+WS is acceptable, and the detection can be completed in real time.
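The warm-up behavior described above can be sketched with a fixed-length queue. This is a minimal illustration rather than our implementation: the function name `process_stream`, the placeholder `detect` callable, and the toy integer frames are assumptions introduced here, and the window of 25 simply follows the "about 25 frames" stated in the text.

```python
from collections import deque


def process_stream(frames, detect, window=25):
    """Buffer frames in a fixed-length queue and run detection once it is full.

    `detect` is a placeholder callable that receives the buffered frames;
    it stands in for the (much more expensive) whitecap separation step.
    """
    buffer = deque(maxlen=window)  # appending to a full deque drops the oldest frame
    results = []
    for frame in frames:
        buffer.append(frame)       # enqueue (and implicit dequeue once full)
        if len(buffer) < window:
            continue               # warm-up: store only, no detection yet
        results.append(detect(list(buffer)))
    return results


# Toy usage: frames are integers and "detection" only reports the buffer span.
spans = process_stream(range(30), detect=lambda buf: (buf[0], buf[-1]))
```

With 30 toy frames, no detection is run while the queue is filling, and each later frame triggers one detection on the most recent 25 frames, which is consistent with the cheaper first stage and the higher per-frame cost afterwards.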
4. Discussion
In previous work on whitecap detection, the inefficiency of manual marking led to research on the automated detection of whitecaps. Notably, in the automated extraction process, failure to correct for light pollution can lead to significant detection errors. Currently popular single-image post-processing approaches are generally based on contours, but these rely on strong prior knowledge. From the data in this study, we can state that whitecap shapes differ markedly; therefore, using contours to determine the presence of a whitecap is difficult. Another type of method is based on brightness differences. Although there is a certain brightness difference between light pollution and some whitecaps, whitecaps at different stages of their lifetime also differ markedly in brightness. In general, the information that can be used to separate whitecaps from light pollution areas in a single image is very limited, and even manual marking is difficult under such conditions. Furthermore, previous studies considered few illumination conditions, so whether those methods can be used under various illumination conditions could not be established.
Therefore, the detection of whitecaps should not be limited to a single image; instead, the presence or absence of whitecaps should be confirmed using video data. Whitecaps are an important part of the air–sea exchange, and they can last on the sea surface for some time. Moreover, whitecaps are displaced with the waves and currents, which is significantly different from the sun glints we observed. Manual marking can often identify whitecaps accurately in a video sequence, but when the video sequence is not available and the illumination pollution is strong, recognizing whitecaps can be difficult even for humans. Considering the movement of a whitecap, a method using the tracked time to judge the whitecap was proposed, but judging by time alone can lead to some long-lasting light-pollution areas also being marked as whitecaps.
The abnormal detection method proposed in the study was motivated by previous research [
19,
23,
27,
43]. Using the
channel makes our method more sensitive to points with specular reflection in the image, and the top-hat transform makes it possible to suppress larger bright background regions. Furthermore, a new histogram-based threshold determination method is used, which simplifies the search process and improves the robustness of the approach compared with previous methods that require differentiation and smoothing operations. Across the illumination conditions tested, the proposed abnormal detection method detected whitecaps more accurately than previous whitecap detection methods in images without light pollution, and it determined abnormal points more accurately under strong illumination pollution. In images containing light pollution, our method could detect all areas that may contain whitecaps. Compared with previous whitecap detection methods, we reduced the candidate area as much as possible to improve the speed of subsequent analysis.
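The background-suppression effect of the top-hat transform can be illustrated on a single image row. The sketch below is a one-dimensional, pure-Python illustration under assumed values (the row intensities and the structuring-element length of 5 are hypothetical); in practice a two-dimensional transform from an image-processing library, such as OpenCV's `cv2.morphologyEx` with `cv2.MORPH_TOPHAT`, would typically be applied to the image channel.

```python
def erode(signal, size):
    """Sliding-window minimum with a flat structuring element of odd length `size`."""
    half, n = size // 2, len(signal)
    return [min(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def dilate(signal, size):
    """Sliding-window maximum with a flat structuring element of odd length `size`."""
    half, n = size // 2, len(signal)
    return [max(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def white_tophat(signal, size):
    """Signal minus its morphological opening (erosion followed by dilation)."""
    opened = dilate(erode(signal, size), size)
    return [s - o for s, o in zip(signal, opened)]


# Hypothetical image row: dark sea (10), a broad bright background (50) that is
# wider than the structuring element, and one narrow bright spot (90) on top of it.
row = [10] * 5 + [50] * 11 + [10] * 5
row[10] = 90

tophat = white_tophat(row, size=5)
```

Because the broad plateau is wider than the structuring element, it survives the opening and cancels in the subtraction, whereas the narrow spot is removed by the opening and therefore remains in the top-hat output.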
In the analysis of whitecaps under different illumination conditions, the post-processing method proposed in this study does not consider the shape of the contour; it judges whether a contour is a whitecap only by assessing the correlation of the motion trajectories of neighboring feature points under the high and low sampling rates, which makes the proposed method much less dependent on experience. Compared with the tracked time method, the illumination pollution component can be removed more accurately. In addition, unlike previous research, this method requires a high video sampling rate, while the low sampling rate used in this paper is the same as the sampling rate in most previous studies. It should be noted that this study is still based on prior knowledge, namely that the randomness of sun glints appearing and disappearing leads to different feature trajectories for the same sun glint features at different sampling rates, whereas whitecap features are spatially and temporally continuous.
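The idea of comparing trajectories obtained at two sampling rates can be sketched as follows. This is a simplified illustration and not the correlation measure defined earlier in the paper: the function names, the vector form of the correlation coefficient, the subsampling step of 5, and the toy trajectories are all assumptions made here for clarity.

```python
import math


def centred(track):
    """Subtract the mean position from a trajectory given as [(x, y), ...]."""
    n = len(track)
    mx = sum(x for x, _ in track) / n
    my = sum(y for _, y in track) / n
    return [(x - mx, y - my) for x, y in track]


def track_correlation(high_rate_track, low_rate_track, step):
    """Correlation between a high-rate track, subsampled every `step` frames,
    and the track of the same feature obtained at the low sampling rate."""
    sub = high_rate_track[::step][:len(low_rate_track)]
    low = low_rate_track[:len(sub)]
    a, b = centred(sub), centred(low)
    num = sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(a, b))
    den = math.sqrt(sum(ax * ax + ay * ay for ax, ay in a)
                    * sum(bx * bx + by * by for bx, by in b))
    return num / den if den else 0.0


# Toy whitecap: smooth drift, so the low-rate track follows the same path.
high = [(1.0 * i, 0.5 * i) for i in range(21)]
low_whitecap = [(0.0, 0.0), (5.0, 2.5), (10.0, 5.0), (15.0, 7.5), (20.0, 10.0)]
# Toy sun glint: at the low rate the tracker jumps between unrelated bright points.
low_glint = [(0.0, 0.0), (3.0, 4.0), (-2.0, 1.0), (4.0, -3.0), (1.0, 2.0)]

r_whitecap = track_correlation(high, low_whitecap, step=5)
r_glint = track_correlation(high, low_glint, step=5)
```

In this toy example, the continuous whitecap trajectory gives a correlation close to 1, while the randomly jumping glint trajectory gives a value close to 0; any decision threshold between the two would have to be chosen from real data.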
The method proposed in this study also has limitations. First, for a large
W, the abnormal detection method will fail, as also observed for previous whitecap detection methods based on thresholds retrieved from histograms. In post-processing, illumination pollution can be removed in most cases; however, some sun glint areas have highly uniform brightness, and their features persist for a long time in a manner closely related to the current wave shape and sea state, which makes them difficult to remove. In future work, we will focus on solving this problem. It is also worth mentioning that analyzing the movement of whitecaps with computer-vision motion detection methods is a problem worthy of study, which can be used to distinguish the lifetime of a certain whitecap [
14,
46,
50,
54], and deepen the understanding of the role of whitecaps in different lifetimes [
52,
55,
56]. The method proposed in this study, while removing illumination pollution, also yields many trajectories of whitecap movement over time. Tracking specific whitecaps and analyzing the movement of whitecaps in different stages could be a main research direction in future work.