Machine Learning Beach Attendance Forecast Modelling from Automatic Video-Derived Counting

Castelle, Bruno; Carayon, David; Dehez, Jeoffrey; Liquet, Sylvain; Marieu, Vincent; Sénéchal, Nadia; Lyser, Sandrine; Savy, Jean-Philippe; Barneix, Stéphanie

doi:10.3390/jmse13061181


by Bruno Castelle ¹,*, David Carayon ², Jeoffrey Dehez ², Sylvain Liquet ³, Vincent Marieu ¹, Nadia Sénéchal ¹, Sandrine Lyser ², Jean-Philippe Savy ⁴ and Stéphanie Barneix ⁴

¹ Univ. Bordeaux, CNRS, Bordeaux INP, EPOC, UMR 5805, 33615 Pessac, France

² INRAE Nouvelle Aquitaine, Cestas-Gazinet, 33140 Villenave-d’Ornon, France

³ Météo-France, 31055 Toulouse, France

⁴ SMGBL, 40660 Messanges, France

* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2025, 13(6), 1181; https://doi.org/10.3390/jmse13061181

Submission received: 9 May 2025 / Revised: 11 June 2025 / Accepted: 13 June 2025 / Published: 17 June 2025

(This article belongs to the Section Coastal Engineering)


Abstract

Accurate predictions of beach user numbers are important for coastal management, resource allocation, and minimising safety risks, especially when considering surf-zone hazards. The present work applies an XGBoost model to predict beach attendance from automatically video-derived data, incorporating input variables such as weather, waves, tide, and time (e.g., day hour, weekday). This approach is applied to data collected from Biscarrosse Beach during the summer of 2023, where beach attendance varied significantly (from 0 to 2031 individuals). Results indicate that the optimal XGBoost model achieved high predictive accuracy, with a coefficient of determination ($R^{2}$) of 0.97 and an RMSE of 70.4 users, using daily mean weather data, tide and time as input variables, i.e., disregarding wave data. The model skilfully captures both day-to-day and hourly variability in attendance, with time of day (hour) and daily mean air temperature being the most influential variables. An XGBoost model using only daily mean temperature and hour of the day even shows good predictive accuracy ($R^{2} = 0.90$). The study emphasises the importance of daily mean weather data over instantaneous measurements, as beach users tend to plan visits based on forecasts. This model offers reliable, computationally inexpensive, and high-frequency (e.g., every 10 min) beach user predictions which, combined with existing surf-zone hazard forecast models, can be used to anticipate life risk at the beach.

Keywords:

beach attendance forecasting; XGBoost modelling; video-derived beach user data; weather and tidal influences; coastal safety management

1. Introduction

Beaches are popular recreational destinations that attract millions of visitors annually [1]. Measurement and prediction of beach attendance are important for effective coastal management and public safety [2]. Understanding attendance patterns provides valuable insights into the temporal and spatial distribution of beach users, which, in turn, informs resource allocation, environmental impact assessments, and economic evaluations of coastal regions [3,4]. In addition, from a beach safety perspective, the number of beach users likely to enter the water and expose themselves to physical hazards in the surf zone is key. By integrating attendance data with real-time hazard monitoring, beach safety programmes can optimise lifeguard deployment, enhance public awareness campaigns, and reduce overall risk to beach users.

As described in [5], the life risk on the beach is determined by the combination of the number of people exposed and the severity of the life-threatening physical hazards present. Consequently, the level of life risk can potentially be modelled indirectly by assessing both the intensity of physical hazards and human exposure [5,6]. The dominant hazards in the surf zone are rip currents [7,8,9,10] and shore-break waves [11], both of which are well-documented contributors to injuries, rescues, and fatalities (e.g., refs. [12,13,14]). Rip currents, in particular, account for a significant proportion of rescues performed by lifeguards worldwide [15,16,17], while shore-break waves are notorious for causing severe spinal injuries and other trauma [13,18,19,20]. Considerable progress has been made in understanding and modelling surf-zone hazards. For instance, advanced hydrodynamic models (e.g., refs. [21,22]), data-driven techniques [23,24], and simple physics-based or semi-empirical models [25,26] now enable the prediction of rip current hazards, and even shore-break wave hazards [26], with increasing accuracy. Comparatively, while the physical hazard component of life risk at the beach has been extensively studied, beach attendance has received less attention.

It is widely acknowledged that beach attendance at a given site is significantly influenced by weather conditions, as evidenced by the disproportionate number of surf-zone injuries occurring on warm, sunny, and light-wind days [6,14,27,28,29]. Ocean variables, such as tidal stage and wave height, may also play a role in beach attendance, but this remains largely unexplored. Attendance is further influenced by temporal factors, such as weekdays versus holidays [30,31]. However, most of these findings are derived from injury reports, which reflect a disproportionate number of incidents rather than providing direct measures of attendance. This limitation arises primarily due to the challenges of quantitatively measuring beach attendance. Visual counting, conducted by lifeguards or other organisations (e.g., refs. [14,27,32]), is labour-intensive and subject to significant uncertainties. Other data sources, such as visual counting from aerial images [33] or analysis of social media activity [34], can supplement attendance estimates. However, the most promising method is the use of fixed video monitoring systems combined with advanced image processing techniques. For instance, ref. [35] applied red-band contrast enhancement, binary conversion using Otsu’s method, and a connectivity algorithm to count objects in a defined beach area. These counts were then calibrated against manual observations to estimate attendance using site-specific regression equations. Similar image processing techniques (e.g., ref. [36]) can be employed to derive attendance estimates from individual video station snapshots. Nonetheless, such approaches are highly sensitive to weather conditions and require labour-intensive manual calibration. In contrast, artificial intelligence (AI) offers an automated alternative for detecting and counting beach users, as recently demonstrated by [37,38,39].

While AI-based methods for estimating beach attendance have been validated at various sites, to the best of our knowledge, they have not yet been applied to develop, test and validate beach attendance forecast models. Given the non-linear relationships between beach attendance and environmental conditions, data-driven approaches are expected to provide robust predictions of video-derived attendance data. Among the approaches suitable for this purpose, XGBoost (eXtreme Gradient Boosting) stands out as a powerful and scalable machine learning algorithm designed for supervised learning tasks [40], with demonstrated success in coastal applications (e.g., refs. [41,42]). As an advanced implementation of gradient boosting, XGBoost builds an ensemble of decision trees sequentially, enhancing their performance by minimising error and maximising predictive accuracy. Its robustness, computational efficiency, and ability to handle complex, non-linear relationships make it particularly well-suited for predicting beach attendance from environmental data. By incorporating diverse features such as weather, oceanic, and temporal conditions, XGBoost has the potential to both unravel the respective importance of the input variables and provide reliable forecasts of beach attendance.

The main objectives of this study are (1) to develop and validate a machine learning model that accurately forecasts beach attendance, and (2) to identify which environmental variables, such as weather, wave conditions, tides, and temporal factors (e.g., time of day, day of the week), are most critical for accurate predictions. The novelty lies in combining automated deep learning-based counting with XGBoost regression modelling to predict beach attendance, capturing both intra-day and inter-day variability with unprecedented accuracy. Section 2.1 describes the study area, the video stations used, and the lifeguard beach user count estimates of beach attendance during the summer of 2022. Section 2.2 details the application of automatic video-derived beach attendance estimation and its validation with lifeguard estimates. In Section 2.3, these data are combined with environmental variables to build an XGBoost model for predicting beach attendance, along with an assessment of the model’s skill and the influence of different input variables. Results are discussed in Section 4, and conclusions are drawn in Section 5. We demonstrate the effectiveness of an XGBoost model trained on automatic video-derived beach user counts to accurately predict beach attendance. Notably, the hour of the day and daily mean air temperature emerged as the most influential factors, with a simplified model using only these two inputs still achieving good predictive skill. These findings highlight the potential of machine learning for real-time beach attendance forecasting, aiding in coastal management and safety planning.

2. Materials and Methods

2.1. Study Sites

2.1.1. The Beaches of Southwest France: General Settings

The area of interest covers the sandy beaches of southwestern France (Figure 1a), spanning from the Adour estuary in the south to the Gironde estuary in the north. This 230 km stretch of coastline, interrupted by the large Arcachon Inlet, which separates the Landes coast in the south from the Gironde coast in the north, predominantly consists of relatively straight sandy beaches backed by a large coastal dune system [43,44]. The coast faces ocean waves generated in the North Atlantic Ocean, with the summer mean significant wave height slightly exceeding 1 m. It is a meso-macrotidal coast, with a mean tidal range of approximately 3 m. Each year, around 4–5 million tourists, mainly from France and other European countries, visit the Gironde and Landes coasts to enjoy the beaches [45]. This region is also home to some of the best surfing beaches in Europe, attracting surfers of all skill levels. The area is characterised by a small number of coastal resorts offering direct beach access, while most beaches are reached via coastal dune tracks leading from inland car parks. Some of these access points lead to beaches patrolled by lifeguards and featuring designated bathing zones, while many others are located on unpatrolled stretches of coast, several kilometres from lifeguard stations.

2.1.2. Video-Monitoring Sites of Biscarrosse and Vielle Saint-Girons

Two video-monitored sites were used in this study: the coastal resort of Biscarrosse, which offers several direct beach access points (Figure 1b), and La Lette Blanche, a remote patrolled beach accessible only via a single coastal dune track connected to an inland parking area (Figure 1c). Biscarrosse Beach has been video-monitored since 2007, primarily to study morphological changes [46,47]. Various video systems have been implemented over the years, with notable camera malfunctions resulting in the absence of video data or partial coverage for several short to long (years) periods. Since 2022, a full 180° station has been installed, providing almost continuous video data. Although post-processed products include Timex and variance images, for the present work, which focuses on counting people on the beach, we used only snapshot images taken every 10 min. Figure 2 shows some snapshot examples, illustrating qualitatively how beach attendance can vary largely throughout the day or from one day to another during summer. On relatively warm, sunny, light-wind days, beach attendance can range from relatively low in the morning (Figure 2a) to high in the afternoon (Figure 2b). Conversely, on a cool, rainy, and windy day, the beach can be almost empty even in the afternoon (Figure 2c). The Biscarrosse video data is used herein to train the XGBoost model for predicting beach attendance during the summer of 2023.

Contrary to Biscarrosse Beach, the video station at La Lette Blanche is not a permanent installation. This 180° video monitoring system was implemented as part of a multidisciplinary beach safety field experiment during the summer of 2022, encompassing beach user surveys, surf-zone drifter measurements, topographic surveys, lifeguard assessments of surf-zone hazards and beach user counts, as well as the monitoring of environmental conditions. For more details on this beach safety experiment, the reader is referred to [14,48]. During this experiment, on each patrolled day, the lifeguard chief (or co-chief on the lifeguard chief’s days off, i.e., two days a week) was asked to provide an hourly estimate of the total number of beach users during patrolling hours from 11 AM to 7 PM. The video data from La Lette Blanche are used in this study to validate the automatic beach user counting from the video system against the lifeguard estimates throughout the entire month of August 2022, when the video station operated continuously.

2.1.3. Environmental Data

Environmental data were required as input variables for the XGBoost model predicting the number of beach users at Biscarrosse Beach during the summer of 2023. We assumed that, in summer, the total number of beach users at a given time is influenced by weather and marine conditions, in addition to the weekday, time of day, and, hypothetically, the month (July or August). For marine conditions, in line with [26] for physical hazard forecast modelling, we used the numerical wave hindcast MFWAM, implemented and used by Météo-France for operational sea state forecasting. This model nests a high-resolution WaveWatch 3 wave model [49] along the Atlantic coast of France, with unstructured grid resolution decreasing to approximately 200 m at the coast. Modelled wave conditions were extracted at approximately 10 m depth in front of Biscarrosse Beach, showing very good skill compared to measurements at a nearby buoy [26]. The significant wave height $H_{s}$, peak wave period $T_{p}$, and direction of wave incidence $\theta$ were the three hourly variables used to describe wave conditions at Biscarrosse Beach. Astronomical tide elevation $\zeta$ was estimated every 10 min using the TPXO9 (version 5) 1/30°-resolution atlas [50] at the grid point closest to Biscarrosse Beach. Weather data were collected at the Biscarrosse Météo-France station. Hourly data of air temperature (T), precipitation (P), mean wind speed (W), mean wind direction ($\alpha$), and insolation (I) were used to estimate weather conditions at Biscarrosse Beach. The time series for all these environmental variables for the summer of 2023 are presented later in the paper.

2.2. Automatic Video-Derived Beach Attendance Estimation

2.2.1. Method

To automatically count the number of beach users, we used snapshot images captured by the video stations at Biscarrosse and La Lette Blanche. Although improvements in the architecture and training procedures of computer vision techniques could be applied (e.g., ref. [51]), we aimed to use an off-the-shelf deep learning model for ease of replicability across other stations. The full-resolution video station images, i.e., 7551 × 1416 pixels for the Biscarrosse station and 8160 × 3616 pixels for the La Lette Blanche station, are, however, much larger than the images typically used to train such models. Therefore, the snapshot images were subdivided into smaller images, referred to as vignettes (see Figure 3a,b). For each vignette, the YOLOv7 (You Only Look Once version 7, [52]) algorithm was employed to detect and count individuals on the beach (Figure 3c). YOLOv7 is a state-of-the-art object detection model that uses deep learning to simultaneously classify and localise multiple objects within an image. The model is known for its high accuracy in detecting objects, including small and partially obscured individuals, making it particularly suitable for crowded environments like beaches.

Overall, 152 and 112 vignettes were designed for the Biscarrosse and La Lette Blanche video stations, respectively (Figure 3a,b). We used the pre-trained model yolov7-e6e.pt with a confidence threshold of 0.1 and an input image size set to 1280 pixels to enhance detection accuracy. Once the number of beach users in each vignette was counted, the counts were summed for each snapshot image to estimate the total number of beach users at that time. By further aggregating the counts from all images, the time series of beach user data was constructed. As shown in Figure 3a,b, the vignettes did not cover the entire alongshore length captured by the camera due to poor resolution at the edges of the image, which made it impossible to visually detect people, and consequently, for YOLOv7 to detect them. This also applied to the most offshore part of the beach, where, during spring low tide, people far out on the beach could not be recognised. However, the covered area was still sufficient to count the majority of people on the beach at both stations.
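The per-snapshot aggregation described above (count within each vignette, then sum over vignettes) can be sketched as follows. This is a minimal illustration, not the authors' code: the toy image, the vignette boxes, and the `toy_detector` function are hypothetical stand-ins, and in practice the detector would be an object detection model such as the YOLOv7 model named above.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels
Image = List[List[int]]          # toy image: rows of pixel values

def crop(image: Image, box: Box) -> Image:
    """Cut one vignette out of the full-resolution snapshot."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]

def count_snapshot(image: Image,
                   vignettes: Sequence[Box],
                   count_people: Callable[[Image], int]) -> int:
    """Sum the per-vignette detections to get one count per snapshot."""
    return sum(count_people(crop(image, box)) for box in vignettes)

def toy_detector(vignette: Image) -> int:
    """Hypothetical stand-in for a real detector: a pixel equal to 1 is a person."""
    return sum(1 for row in vignette for px in row if px == 1)

# Toy 8 x 4 "snapshot" with three people, split into two non-overlapping vignettes.
image = [[0] * 8 for _ in range(4)]
image[1][1] = 1
image[2][5] = 1
image[3][7] = 1
vignettes = [(0, 0, 4, 4), (4, 0, 8, 4)]

total = count_snapshot(image, vignettes, toy_detector)
```

The sketch assumes non-overlapping vignettes; if vignettes overlapped, a person in the overlap would be counted twice unless duplicates were removed.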

2.2.2. Validation with Lifeguard Estimates

Figure 4 shows the comparison between the beach user count from lifeguard estimates and the automatic object detection from the video system at La Lette Blanche during August 2022. The results show that the video-derived counting accurately reproduces the temporal patterns of beach users, with both crowded and almost empty days well captured by the algorithm (Figure 4a). In addition, the algorithm effectively captures the variability in the number of beach users throughout the patrolling hours, typically showing an increasing number of beach users in the morning, followed by stabilisation or even a decrease during midday around lunchtime, before increasing again in the afternoon and decreasing again around 5–6 PM. Furthermore, outside of the patrolling hours, the video monitoring captures the low beach attendance in the early morning, as well as the increase in attendance in the late hours of the day, which will be discussed further in the paper (Section 4).

However, despite the reasonable correlation (coefficient of determination $r^{2} = 0.56$), the video-derived count systematically underestimates the beach user count estimated by the lifeguards. This is particularly true for the busiest days, as seen, for instance, in the afternoons in Figure 4. This discrepancy is due to two primary reasons. (1) Manual counting from a few images indicates that the lifeguards provide rough estimates, meaning they do not attempt to count people individually or in groups of 50, for example. This is also why the lifeguard counts almost systematically miss the decrease in beach attendance during lunchtime, which is well captured by the video-derived count. (2) During the busiest days, beach users tend to spread along the shoreline to have more space. As a result, they often move beyond the alongshore limits of the vignettes (Figure 3a). This explains why the number of beach users is underestimated during the busiest hours. Although corrections were applied (polynomial regression), they only slightly improved the correlation. Given the small improvement and the fact that the correction may differ at Biscarrosse due to multiple direct access points to the beach, it was decided to proceed with the uncorrected video-derived beach user count at Biscarrosse to develop the XGBoost forecast model in the next section. This will also be discussed further in the paper (Section 4).
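A correlation and a systematic bias measure different things, which is why both are reported above. The sketch below computes the squared Pearson correlation and the mean bias between two count series. The numbers are invented for illustration only and are not the La Lette Blanche data.

```python
import math

def pearson_r(x, y):
    """Linear (Pearson) correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Invented hourly counts (NOT measured data): the video count falls
# increasingly short of the lifeguard estimate as the beach gets busier.
lifeguard = [50, 100, 200, 400, 600]
video = [45, 90, 160, 300, 420]

r2 = pearson_r(lifeguard, video) ** 2
mean_bias = sum(v - g for v, g in zip(video, lifeguard)) / len(video)
```

In this toy case the two series are strongly correlated while the mean bias is clearly negative, so a high $r^{2}$ alone does not rule out a systematic underestimation.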

2.3. Beach Attendance Forecast Model

2.3.1. Data

We used the exact same approach as described in Section 2.2 to automatically count beach users at Biscarrosse Beach during July and August 2023. The resulting time series is shown in Figure 5i, which illustrates a large variability in beach attendance, N, ranging from crowded days with instantaneous video-derived beach user counts exceeding 2000 on August 9, to almost empty days, such as August 10. The input variables for predicting the number of beach users, N, consisted of the environmental data described in Section 2.1.3, specifically, air temperature (T), precipitation (P), mean wind speed (W), mean wind direction ($\alpha$), insolation (I), significant wave height ($H_{s}$), peak wave period ($T_{p}$), direction of wave incidence ($\theta$), and astronomical tide elevation ($\zeta$). Additionally, since the decision to go to the beach is largely influenced by the day’s overall weather conditions (and, for example, by average wave conditions for surfers), rather than the specific conditions at the moment the beach user is counted, daily averages of meteorological and hydrodynamic variables were also included as input variables for the model. The corresponding daily mean values (from 11 AM to 7 PM) for all the variables described above are denoted using the $|\cdot|$ notation ($|H_{s}|$, $|T_{p}|$, $|\theta|$, $|T|$, $|P|$, $|W|$, and $|I|$), except for the astronomical tide elevation ($\zeta$), for which the corresponding daily mean variable is the tide range ($|TR|$). The time series for all these variables are shown in Figure 5.
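The daily aggregation described above can be sketched as follows. Two points are assumptions made for this illustration and are not stated in the text: the 11 AM to 7 PM window is treated as inclusive at both ends, and the tide range is taken as the difference between the highest and lowest tide elevation of the calendar day. All sample values are invented.

```python
from collections import defaultdict
from datetime import datetime

def daily_mean(records, start_hour=11, end_hour=19):
    """Mean of (timestamp, value) samples per day, restricted to the daytime
    window. Inclusive window bounds are an assumption of this sketch."""
    buckets = defaultdict(list)
    for ts, value in records:
        hour = ts.hour + ts.minute / 60
        if start_hour <= hour <= end_hour:
            buckets[ts.date()].append(value)
    return {day: sum(v) / len(v) for day, v in buckets.items()}

def daily_tide_range(records):
    """Daily tide range, assumed here to be max minus min elevation per day."""
    buckets = defaultdict(list)
    for ts, zeta in records:
        buckets[ts.date()].append(zeta)
    return {day: max(v) - min(v) for day, v in buckets.items()}

# Invented samples for a single day (NOT measured data).
temperature = [(datetime(2023, 8, 9, h), t)
               for h, t in [(10, 18.0), (11, 20.0), (15, 26.0), (19, 23.0), (20, 19.0)]]
tide = [(datetime(2023, 8, 9, h), z)
        for h, z in [(3, -1.5), (9, 1.4), (15, -1.2), (21, 1.1)]]

mean_T = daily_mean(temperature)    # only the 11:00, 15:00 and 19:00 samples are used
tide_range = daily_tide_range(tide)
```

Every 10 min sample of a given day would then carry the same daily value, alongside its own instantaneous measurements.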

Additional features capturing temporal patterns and periodic variations in the input data were also considered, including $Month$, the calendar month (July = 7, August = 8); $Day$, the day of the week, where values range from 1 (Monday) to 7 (Sunday); and $Hour$, the hour of the day (e.g., $Hour = 14.5$ means 2:30 PM). This approach was intended to enable the model to understand time-based dependencies, as illustrated in Figure 4 for La Lette Blanche beach.
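These three temporal features follow directly from a snapshot timestamp. A minimal sketch, in which the function name `time_features` is hypothetical and not taken from the study:

```python
from datetime import datetime

def time_features(ts: datetime) -> dict:
    """Encode a timestamp with the Month, Day and Hour conventions given above."""
    return {
        "Month": ts.month,                  # July = 7, August = 8
        "Day": ts.isoweekday(),             # 1 (Monday) to 7 (Sunday)
        "Hour": ts.hour + ts.minute / 60,   # decimal hour, e.g. 14.5 for 2:30 PM
    }

features = time_features(datetime(2023, 8, 9, 14, 30))
```

For 9 August 2023 at 2:30 PM, a Wednesday, this yields Month = 8, Day = 3 and Hour = 14.5.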

Figure 6 shows the correlation matrix for all environmental variables described above, excluding the temporal variables, to better understand the linear relationships between them and the beach user count N. The matrix reveals that, among the weather variables, T exhibits the strongest positive linear correlation with N, meaning that, unsurprisingly, beach user count increases with warmer weather (Figure 6a). Linear relationships are much weaker for the other weather input variables, as well as for most of the hydrodynamic variables. Interestingly, a moderate negative linear correlation is observed between the beach user count N and significant wave height $H_{s}$. This is due to the fact that $H_{s}$ and T are negatively correlated ($R = -0.41$, Figure 6a), meaning that days with lower waves tend to be associated with warmer temperatures. Similar patterns are found when using daily mean input variables (Figure 6b), with generally stronger linear correlations with N, except for the daily mean air temperature $|T|$.

A weak but statistically significant linear (Pearson) correlation is found between beach user count N and $Hour$ ($R = 0.31$, not shown in Figure 6). However, the hour of the day has a strong, non-linear influence on beach user count. This is further illustrated in Figure 6c, which shows the averaged hourly distribution of video-derived beach user counts at Biscarrosse Beach during July and August 2023. The pattern demonstrates a clear and expected increase in beach users throughout the day, with a slight reduction in the rate of increase around lunchtime, followed by a peak around 4–5 PM during the warmest hours of the day. The number of beach users then rapidly decreases, with a slight increase during the last hour of daylight.

2.3.2. XGBoost Modelling

XGBoost is well-known for its ability to achieve high predictive accuracy while mitigating overfitting and addressing computational constraints, making it a popular choice for predictive modelling. To maximise predictive accuracy and minimise errors in this regression task, an XGBoost model was implemented to predict instantaneous beach attendance. Normalising the input data is crucial for gradient-based algorithms like XGBoost, as it helps to accelerate convergence and improve model performance. Consequently, all input variables and the target variable, as described in the previous sections, were normalised.

An extensive hyperparameter tuning strategy was used in order to generate an optimal model. The tuning process used a grid search approach to optimise key model parameters, ensuring a robust balance between bias and variance. The parameter grid included six hyperparameters: max_depth ([3, 5]), controlling the complexity of individual decision trees; learning_rate ([0.01, 0.05]), determining the step size for updates during training; sub_sample ([0.8, 1.0]), specifying the fraction of samples used for tree construction; colsample_bytree ([0.8, 1.0]), dictating the proportion of features considered for each split; and regularisation terms reg_alpha ([0, 1]) and reg_lambda ([1, 2]), which prevent overfitting by penalising large coefficients. By enumerating all possible parameter combinations via the Cartesian product, the model was rigorously evaluated using 5-fold cross-validation. For each combination, the root-mean-squared error (RMSE) was computed on the validation folds. To enhance efficiency and prevent overfitting, early stopping was employed, halting training when the validation performance plateaued for 10 consecutive rounds. The optimal parameter configuration, which minimised the mean validation RMSE, was selected for final model training. Using this configuration, the XGBoost model was trained on the entire dataset (dtrain) for 100 boosting rounds. This comprehensive approach to hyperparameter tuning ensured the model’s ability to capture complex relationships in the data while maintaining generalisability.
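The grid enumeration and selection step can be sketched with the standard library alone. The grid values are those listed above, and `subsample` is the spelling used by the XGBoost library for the parameter written sub_sample in the text. The scoring function is a hypothetical placeholder: in the actual workflow it would return the mean validation RMSE from 5-fold cross-validation with early stopping, which is not reproduced here.

```python
from itertools import product

# Parameter grid as listed in the text.
param_grid = {
    "max_depth": [3, 5],
    "learning_rate": [0.01, 0.05],
    "subsample": [0.8, 1.0],
    "colsample_bytree": [0.8, 1.0],
    "reg_alpha": [0, 1],
    "reg_lambda": [1, 2],
}

def enumerate_grid(grid):
    """All parameter combinations, obtained via the Cartesian product."""
    names = list(grid)
    return [dict(zip(names, values))
            for values in product(*(grid[name] for name in names))]

def select_best(candidates, cv_rmse):
    """Keep the combination with the lowest mean validation RMSE."""
    return min(candidates, key=cv_rmse)

def placeholder_cv_rmse(params):
    """Hypothetical stand-in for a cross-validated RMSE; NOT a real result."""
    return abs(params["max_depth"] - 5) + abs(params["learning_rate"] - 0.05)

candidates = enumerate_grid(param_grid)
best = select_best(candidates, placeholder_cv_rmse)
```

With two values for each of six hyperparameters, the grid contains 2^6 = 64 combinations, so an exhaustive search stays inexpensive.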

The resulting coefficient of determination $R^{2}$ and root-mean-square error (RMSE), computed on the denormalised data, were systematically evaluated for each set of input variables. Additionally, feature importance (F score) was calculated to assess the relative contribution of each input variable to the predictive performance of the XGBoost model, with the F score indicating the number of times the input variable appears as a decision node in the trees of the XGBoost model. These metrics were important for interpreting and understanding the underlying relationships within the dataset. They also guided the selection of an optimal model by helping to strike a balance between maximising predictive quality and minimising the number of input variables.
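Computing skill metrics on denormalised data means mapping predictions back to beach user counts before comparing them with observations. The sketch below assumes min-max scaling and the usual $R^{2} = 1 - SS_{res}/SS_{tot}$ definition. Neither choice is specified in the text, and the numbers are invented.

```python
import math

def denormalise(values, lo, hi):
    """Invert min-max scaling (the scaling scheme is an assumption of this sketch)."""
    return [v * (hi - lo) + lo for v in values]

def rmse(observed, predicted):
    """Root-mean-square error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r2_score(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

# Invented example (NOT study results): counts scaled with lo = 0, hi = 2000.
observed = [0, 500, 1000, 2000]
predicted_normalised = [0.05, 0.25, 0.45, 0.95]
predicted = denormalise(predicted_normalised, 0, 2000)

error = rmse(observed, predicted)
skill = r2_score(observed, predicted)
```

Reporting RMSE after denormalisation keeps it in units of beach users, which makes it directly comparable with the observed range of counts.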

3. Results

The optimal model, balancing model skill and the number of input parameters, was found by excluding wave variables and using daily mean weather variables instead of instantaneous weather data, and the daily tide range instead of instantaneous tidal elevation. Specifically, the model uses $|T|$, $|P|$, $|W|$, $|\alpha|$, $|I|$, $|TR|$, $Month$, $Day$, and $Hour$ as input variables. Figure 7 shows the time series of the video-derived beach user count N, as well as the simulated count from the optimal model. Results show very good agreement between the model and the data, with a coefficient of determination $R^{2} = 0.97$ and an RMSE of 70.4 beach users (Figure 8a). As shown in Figure 7, the model captures day-to-day variability very well, ranging from crowded days (e.g., August 9) to almost empty days (e.g., August 3), during warm, sunny, and light-wind conditions, and colder, rainy, and windy days, respectively. Additionally, the model accurately captures key features throughout the day, such as the slight decrease in beach user count around lunchtime, as well as the substantial increase in beach users at the end of the day during the warmest days (Figure 7b).

As shown in Figure 8b, the most important variable in predicting beach user count N through the optimal XGBoost model is by far $Hour$, determined by how much $Hour$ contributes to reducing the model’s loss function during training. $Hour$ has an F score, i.e., the number of times the input variable appears as a decision node in the trees of the XGBoost model, more than twice as large as that of the second most important input variable, $|T|$, the daily mean air temperature. The importance of the weather-related input variables, in decreasing order, is daily mean air temperature $|T|$, insolation $|I|$, wind speed $|W|$, wind direction $|\alpha|$, and precipitation $|P|$. Surprisingly, the daily tide range $|TR|$ is more important to model skill than the weather variables $|\alpha|$ and $|P|$, which will be discussed in Section 4.
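The F score as defined above, i.e., the number of times a variable appears as a decision node, can be illustrated on toy trees. The nested-dictionary tree layout and the feature names below are hypothetical and do not reflect how the XGBoost library stores its trees.

```python
from collections import Counter

def count_splits(node, counts):
    """Recursively count how often each feature is used as a decision node."""
    if "leaf" in node:
        return
    counts[node["feature"]] += 1
    count_splits(node["left"], counts)
    count_splits(node["right"], counts)

def f_scores(trees):
    """Split-count importance summed over all trees of the ensemble."""
    counts = Counter()
    for tree in trees:
        count_splits(tree, counts)
    return counts

# Two toy trees (hypothetical structure, NOT the fitted model).
tree_1 = {"feature": "Hour",
          "left": {"feature": "T_daily", "left": {"leaf": 0.1}, "right": {"leaf": 0.4}},
          "right": {"feature": "Hour", "left": {"leaf": 0.9}, "right": {"leaf": 0.3}}}
tree_2 = {"feature": "Hour",
          "left": {"leaf": -0.1},
          "right": {"feature": "I_daily", "left": {"leaf": 0.0}, "right": {"leaf": 0.2}}}

scores = f_scores([tree_1, tree_2])
```

A split count of this kind rewards variables that are split on often, such as a continuous time-of-day variable, and it is a different quantity from the loss reduction achieved by those splits.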

Although only the optimal model results are shown here, other models demonstrate fair to very good skill with a limited number of variables. For instance, using an XGBoost model with solely $Hour$ and $|T|$ as input variables, i.e., the most important variables in the optimal model (Figure 8b), still results in an accurate model ($R^{2} = 0.90$, RMSE = 128.2). This is illustrated in Figure 9, which shows that such an XGBoost model tends to slightly underestimate the most crowded days. In addition, on certain days the model may overestimate the beach user count by a factor of more than two (see, for instance, 2 August 2023, in Figure 9a). Such bias typically occurs on warm but cloudy days with strong westerly winds and/or rain. The importance of the key input variable $Hour$ is confirmed by the largely decreased model skill when removing it from the input variables ($R^{2} = 0.28$, RMSE = 340.7). In contrast, removing $|T|$ only slightly decreases model skill ($R^{2} = 0.96$, RMSE = 75.7). This is because, as shown in Figure 6b, all the daily mean weather variables show statistically significant correlations with $|T|$, indicating that the XGBoost model optimised without $|T|$, but with all the other weather variables, still indirectly accounts for the daily mean air temperature. Furthermore, an XGBoost model based solely on $Hour$ shows poor to fair skill ($R^{2} = 0.53$, RMSE = 275.1). This model essentially generates, for all days, regardless of weather, wave, and tide conditions, the time series of the average beach user count throughout the day shown in Figure 6c. Lastly, the optimised model with hourly weather data instead of daily mean weather data, while still using $|TR|$, $Month$, $Day$, and $Hour$ as other input variables, is skilful ($R^{2} = 0.94$, RMSE = 94.6). These results show that the hour of the day and the daily mean temperature are the two key parameters for forecasting beach user count, with the other variables mostly marginally increasing model skill.

4. Discussion

4.1. Environmental Controls on Beach User Count

The optimised XGBoost approach used in this study demonstrates strong predictive skill ($R^{2} = 0.97$, RMSE = 70.4) in modelling beach user count at a beach in southwest France during the summer of 2023. In line with previous research indicating a higher number of incidents and rescues, and/or more people at the beach on warm, sunny, and light-wind days (e.g., refs. [6,14,27,28,29]), weather variables (air temperature, cloud cover, and wind conditions) were found to be critical in modelling the video-derived beach user count. In particular, the daily mean air temperature $|T|$ was the most important weather variable for predicting instantaneous beach user count N, followed in decreasing order by cloud cover (through $|I|$), wind speed $|W|$, wind direction $|\alpha|$, and precipitation $|P|$ (Figure 8b). This dominance of air temperature was not systematically observed in previous studies. For instance, in the Netherlands, precipitation was found to have a more significant influence than other weather variables [53], while in Spain, sunshine hours, of which $|I|$ is a proxy, were identified as the most important factor [54]. It is important to note, however, that in this context, “importance” does not refer to a statistical relationship, such as the linear Pearson correlation shown in Figure 6a,b, but rather to the number of times a given input variable appears as a decision node in the trees of the optimised XGBoost model.
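The split-count notion of importance described above can be illustrated without any library. The sketch below walks a small, hand-written ensemble of decision trees and counts how often each feature is used as a decision node. The tree layout, thresholds and leaf values are hypothetical and only mimic the idea behind the F score of Figure 8b; to our understanding this is the quantity the XGBoost Python package reports as the `weight` importance type.

```python
from collections import Counter


def count_splits(node, counter):
    """Recursively count the decision nodes that split on each feature."""
    if "leaf" in node:  # terminal node: carries a prediction, no split
        return
    counter[node["feature"]] += 1
    count_splits(node["left"], counter)
    count_splits(node["right"], counter)


def split_count_importance(trees):
    """Return {feature: number of decision nodes using it} over all trees."""
    counter = Counter()
    for tree in trees:
        count_splits(tree, counter)
    return dict(counter)


# Two toy trees (hypothetical thresholds and leaf values).
toy_trees = [
    {"feature": "Hour", "threshold": 11,
     "left": {"leaf": 50.0},
     "right": {"feature": "T", "threshold": 22.0,
               "left": {"leaf": 300.0},
               "right": {"leaf": 900.0}}},
    {"feature": "Hour", "threshold": 19,
     "left": {"feature": "Hour", "threshold": 13,
              "left": {"leaf": 20.0},
              "right": {"leaf": 60.0}},
     "right": {"leaf": -40.0}},
]

importance = split_count_importance(toy_trees)
print(sorted(importance.items(), key=lambda kv: -kv[1]))
```

A feature that needs many thresholds to describe a strongly non-linear response, as Hour does here, accumulates a high count even if its linear correlation with the target is weak, which is why this ranking should not be read as a correlation ranking.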

Marine conditions were not found to impact the XGBoost model’s skill, with the notable exception of tide range $|TR|$ (Figure 8b). This is an unexpected result, and to our knowledge, no previous studies have reported such a finding. Furthermore, results from surveys of beach users in southwest France [48] indicate that tide conditions are not a factor influencing an individual’s decision to go to the beach. We suspect that the slight influence of tide range $|TR|$ is the result of a minor bias in the total beach user count derived from the video, as tide variations affect the width of the dry beach. At spring low tide, beach users positioned at the far reach of the camera’s field of view may be more difficult to identify, potentially introducing a small bias in the data. We also demonstrate that daily mean weather data are more influential than instantaneous weather data. This is unsurprising, given that the decision to visit the beach is often based on checking the daily forecast, e.g., approximately 80% of beach users in [54] reported doing so. The time of day (Hour) was the most influential input variable, with the XGBoost model effectively capturing the non-linear relationship between Hour and beach user count (N), which dominates the distribution of N throughout the day (Figure 6a). Largely thanks to the use of video-derived data, which provide high-frequency data (every 10 min) extending well beyond patrolling hours, the XGBoost model developed here can also simulate the gradual increase in beach users in the early morning, the relative decrease or stabilisation during lunch time, the sharp drop in the late afternoon, as well as the slight increase around sunset. To the best of the authors’ knowledge, this is the first beach user count prediction model to capture such a detailed range of variability and with such accuracy.
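Daily mean weather inputs of the kind discussed above, which the Figure 5 caption defines as averages over 11 AM to 7 PM, can be derived from hourly records as sketched below. This is a minimal illustration under stated assumptions: the function name `daily_window_mean`, the inclusive treatment of the 7 PM sample and the temperature values are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def daily_window_mean(samples, start_hour=11, end_hour=19):
    """Average values per calendar day over start_hour <= hour <= end_hour.

    samples: iterable of (datetime, value) pairs, e.g. hourly air temperature.
    Including the end hour is an assumption of this sketch.
    """
    sums = defaultdict(float)
    sizes = defaultdict(int)
    for stamp, value in samples:
        if start_hour <= stamp.hour <= end_hour:
            sums[stamp.date()] += value
            sizes[stamp.date()] += 1
    return {day: sums[day] / sizes[day] for day in sums}


# Toy hourly temperatures (hypothetical values, in degrees Celsius).
samples = [
    (datetime(2023, 7, 14, 8), 17.0),   # outside the window, ignored in the mean
    (datetime(2023, 7, 14, 11), 22.0),
    (datetime(2023, 7, 14, 15), 26.0),
    (datetime(2023, 7, 14, 19), 24.0),
    (datetime(2023, 7, 15, 12), 19.0),
    (datetime(2023, 7, 15, 16), 21.0),
]

daily_T = daily_window_mean(samples)

# Every record of a given day then carries the same daily value next to its hour.
features = [(stamp.hour, daily_T[stamp.date()]) for stamp, _ in samples
            if stamp.date() in daily_T]
print(daily_T)
```

Because the daily value is constant within a day, the within-day variability of the prediction has to come from Hour, while the daily mean sets the overall level of attendance for that day.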

4.2. Limitations

Our approach has certain limitations, primarily related to the video-derived beach user count methodology. To provide an easily replicable framework, unlike previous studies that retrained AI models for specific beach scenes [37,38], we used an existing model (YOLOv7) to automatically detect beach users from small vignettes cropped from 10-minute spaced snapshot images captured by a 180° video station. While this approach provides reasonable estimates of beach user count, it is likely less accurate than site-specific trained models. A significant limitation of video-based beach user estimation on open beaches is pixel resolution at the far edges of the camera’s field of view, where even visual detection of beach users is impossible. This issue was more pronounced at the La Lette Blanche video station compared to Biscarrosse due to greater image distortion (Figure 3), which rapidly decreased pixel resolution with increasing distance from the camera alongshore. Consequently, on the most crowded days, when beach users tended to move further away from the beach entrance and thus beyond the optimal camera view, the video-derived beach user count was significantly underestimated at La Lette Blanche (for example, N saturating at approximately 600 in Figure 4a). This issue was far less pronounced at Biscarrosse, as evidenced by the large day-to-day variability of N and the absence of saturation (Figure 7a). Additionally, people in the water, typically in chest-high water, were not identified by the model, leading to an underestimation of total beach users on days with high bathing activity. Despite these limitations, which are difficult to fully address, the video-derived N was deemed sufficiently accurate for training our beach user count forecast model at Biscarrosse Beach.

4.3. Future Research

The optimised XGBoost model for the summer of 2023 at Biscarrosse Beach demonstrates strong predictive skill. However, generalising this model across time and/or space requires caution. Firstly, Biscarrosse Beach is a coastal resort beach, where beach use is likely different from that of more remote beaches accessible only via coastal dune tracks leading from inland car parks. An inspection of the differences in beach user count between Biscarrosse Beach and La Lette Blanche, based on a limited period of common data coverage and biased beach user count estimates on crowded days at La Lette Blanche, nevertheless reveals large similarities. Further applications of the XGBoost model should focus on remote beaches, where a low-distortion camera system should be installed to improve beach user count estimation. Such an approach would help to better understand differences in beach use, develop distinct models depending on beach type, and explore whether the relative importance of different input variables (weather and marine conditions) varies from one beach to another. A key future development will be the extension of the model to include the shoulder seasons, typically from April to October. Each year, severe drowning incidents occur during these periods when beaches are unpatrolled. As a preliminary analysis, an XGBoost model was optimised for Biscarrosse Beach from 1 April to 31 October 2023, using the same input variables as the summer 2023 model. This model still demonstrates good skill ($R^{2} = 0.92$, RMSE = 80.5; see Figure 10), although summer peaks in beach user count are systematically underestimated. Interestingly, while Hour and daily mean air temperature $|T|$ remain the most important input variables, they are followed by Month (i.e., seasonality), and Day (i.e., weekends) gains importance. Given the complexity of beach user count patterns during the shoulder seasons, the development of an all-year-round beach attendance model is part of future work. This model will be trained using multiple years of video-derived beach user count data, in order to cover a representative range of environmental conditions, and will also need to incorporate public holidays as an input variable.
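The calendar inputs mentioned above (Month, Day, Hour) and the proposed public-holiday input can all be derived from the snapshot timestamp. The sketch below shows one possible encoding; the function name `time_features`, the decimal-hour and ISO-weekday conventions and the `Holiday` flag are assumptions for illustration, not the encoding used in the study, and the holiday set lists only two French national holidays.

```python
from datetime import date, datetime

# Illustrative subset of French public holidays in 2023 (not exhaustive).
PUBLIC_HOLIDAYS = {date(2023, 7, 14), date(2023, 8, 15)}


def time_features(stamp, holidays=PUBLIC_HOLIDAYS):
    """Derive calendar inputs from one snapshot timestamp."""
    return {
        "Month": stamp.month,                       # 1-12, carries seasonality
        "Day": stamp.isoweekday(),                  # 1 = Monday ... 7 = Sunday
        "Hour": stamp.hour + stamp.minute / 60.0,   # decimal hour of the day
        "Holiday": int(stamp.date() in holidays),   # proposed extra input
    }


example = time_features(datetime(2023, 8, 15, 16, 30))
print(example)
```

A binary flag is the simplest choice; a multi-year model might instead need to distinguish school holidays or long weekends, which would require a richer calendar than the one assumed here.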

Understanding and predicting the temporal and spatial distribution of beach users, and thus the probability that a beach user enters the water and encounters physical hazards in the surf zone if present, can help optimise lifeguard deployment, enhance public awareness campaigns, and reduce the overall risk to beach users. Combined with surf zone hazard models (rip currents and shore-break waves) under development for the southwest coast of France [26], this approach has the potential to predict, a few days ahead, at high temporal resolution (e.g., 10 min) and within seconds, all components of the life risk at the beach. While the optimised model shows very good skill ($R^{2} = 0.97$, RMSE = 70.4), using only the time of day (Hour) and daily mean air temperature ($|T|$) still provides good skill ($R^{2} = 0.90$, RMSE = 128.2), indicating that accurate beach user count forecasts can be made based on simple weather forecasts. The number of beach users can differ significantly from the number of individuals in the water, and to reliably link beach crowds with water entry (exposure), bathing rates must be considered. Ref. [27] estimated that, on average, 45% of visitors at Southern California beaches enter the water, varying seasonally from 26% in winter to 54% in summer. Wave conditions can also affect bathing rates, with, for instance, large shore-break waves ($H_{s} > 2.5$ m) potentially deterring swimmers [6]. Similarly, refs. [48,55] showed that weather and ocean conditions influence risk perception and water entry. Although refining exposure predictions, including bathing rates, remains an area for future research, this study already provides valuable beach user count forecasts.
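The link between beach user count and exposure can be written as a simple product of the count and a bathing rate. The sketch below applies the Southern California rates quoted from ref. [27] to the peak count reported for Biscarrosse in summer 2023 (2031 users). Transferring Californian rates to a French Atlantic beach is purely an assumption made for the arithmetic, and the function name `expected_bathers` is hypothetical.

```python
# Bathing rates reported in ref. [27] for Southern California beaches.
BATHING_RATE = {"winter": 0.26, "summer": 0.54, "annual_mean": 0.45}


def expected_bathers(beach_user_count, rate):
    """Rough exposure estimate: users on the beach times the share entering the water."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be a fraction between 0 and 1")
    return beach_user_count * rate


# Peak count observed at Biscarrosse in summer 2023, combined with an
# assumed (Californian) summer bathing rate.
peak_exposure = expected_bathers(2031, BATHING_RATE["summer"])
print(round(peak_exposure))
```

A fixed rate ignores the dependence of water entry on waves and weather noted in the text, so a usable exposure model would need a rate that varies with those conditions.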

5. Conclusions

An XGBoost model was successfully used to predict beach attendance based on video-derived data, and driven by environmental and temporal variables. The model’s high predictive skill ($R^{2} = 0.97$, RMSE = 70.4) allows capturing the complex patterns of beach user behaviour. By combining weather conditions, tidal data, and time-related variables such as the time of day and day of the week, the model accurately forecasts beach attendance throughout the summer of 2023 at Biscarrosse Beach. This achievement highlights the potential of machine learning techniques for providing reliable, real-time predictions. The importance of daily mean weather data in predicting beach attendance is highlighted, as visitors often make decisions based on weather forecasts. Notably, the model identified time of day as the most influential factor, reflecting the non-linear relationship between attendance and hourly variation. Weather variables, particularly air temperature, had significant impacts on attendance predictions. Integrating attendance data with surf-zone hazard models, such as rip currents and shore-break waves, shows potential for optimising lifeguard deployment, improving public awareness campaigns, and reducing the overall risk to beach users. While the model demonstrated high accuracy, future research should focus on refining exposure predictions by considering bathing rates and more detailed hazard assessments.

Author Contributions

Conceptualisation, B.C.; data curation, B.C.; formal analysis, B.C. and D.C.; funding acquisition, J.D. and B.C.; investigation, B.C.; methodology, B.C. and D.C.; resources, S.L. (Sylvain Liquet), V.M., N.S., J.-P.S. and S.B.; software, B.C.; validation, B.C. and D.C.; visualisation, B.C.; writing (original draft), B.C.; writing (review and editing), all authors. All authors have read and agreed to the published version of the manuscript.

Funding

This study received financial support from Project SWYM (Surf zone hazards, recreational beach use and Water safetY Management in a changing climate) and CORALi (PSGAR) both funded by Région Nouvelle-Aquitaine and the French government in the framework of the University of Bordeaux’s IdEx “Investments for the Future” program/RRI Tackling Global Change. Additional funding was provided by ANR-22-EXIR-0004 (Project IRICOT/PEPR IRIMA).

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of INRAE (Project SWYM) on 22 July 2024.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

We warmly thank the SMGBL (Syndicat Mixte de Gestion des Baignades Landaises), and particularly Stéphanie Barneix and the La Lette Blanche lifeguards who were on duty during the summer of 2022. We are also thankful to the Vielle Saint-Girons council for providing technical support and access to the lifeguard facilities. We thank Météo-France for providing weather data from Capbreton station through their RADOME (Réseau d’Acquisition de Données d’Observations Météorologiques Etendu) automatic weather station network.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Lucrezi, S.; van der Walt, M.F. Beachgoers’ perceptions of sandy beach conditions: Demographic and attitudinal influences, and the implications for beach ecosystem management. J. Coast. Conserv. 2016, 20, 81–96.
2. Morgan, D. Counting Beach Visitors: Tools, Methods and Management Applications. In Beach Management Tools - Concepts, Methodologies and Case Studies; Botero, C.M., Cervantes, O., Finkl, C.W., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 561–577.
3. Ghermandi, A.; Nunes, P.A. A global map of coastal recreation values: Results from a spatially explicit meta-analysis. Ecol. Econ. 2013, 86, 1–15.
4. Houston, J.R. The economic value of America’s beaches. Shore & Beach 2024, 92, 33–43.
5. Stokes, C.; Masselink, G.; Revie, M.; Scott, T.; Purves, D.; Walters, T. Application of multiple linear regression and Bayesian belief network approaches to model life risk to beach users in the UK. Ocean. Coast. Manag. 2017, 139, 12–23.
6. de Korte, E.; Castelle, B.; Tellier, E. A Bayesian network approach to modelling rip-current drownings and shore-break wave injuries. Nat. Hazards Earth Syst. Sci. 2021, 21, 2075–2091.
7. MacMahan, J.H.; Thornton, E.B.; Reniers, A.J.H.M. Rip current review. Coast. Eng. 2006, 53, 191–208.
8. Dalrymple, R.A.; MacMahan, J.H.; Reniers, A.J.; Nelko, V. Rip Currents. Annu. Rev. Fluid Mech. 2011, 43, 551–581.
9. Castelle, B.; Scott, T.; Brander, R.; McCarroll, R. Rip current types, circulation and hazard. Earth Sci. Rev. 2016, 163, 1–21.
10. Houser, C.; Wernette, P.; Trimble, S.; Locknick, S. 11 - Rip currents. In Sandy Beach Morphodynamics; Jackson, D.W., Short, A.D., Eds.; Elsevier: Amsterdam, The Netherlands, 2020; pp. 255–276.
11. Chang, S.K.Y.; Tominaga, G.T.; Wong, J.H.; Weldon, E.J.; Kaan, K.T. Risk factors for water sports-related cervical spine injuries. J. Trauma 2006, 60, 1041–1046.
12. Barlas, B.; Beji, S. Rip current fatalities on the Black Sea beaches of Istanbul and effects of cultural aspects in shaping the incidents. Nat. Hazards 2016, 80, 811–821.
13. Puleo, J.; Hutschenreuter, K.; Cowan, P.; Carey, W.; Arford-Granholm, M.; McKenna, K. Delaware surf zone injuries and associated environmental conditions. Nat. Hazards 2016, 81, 845–867.
14. Castelle, B.; Dehez, J.; Savy, J.P.; Marieu, V.; Lyser, S.; Bujan, S.; Carayon, D.; Brander, R. Environmental controls on lifeguard-estimated surf-zone hazards, beach crowds, and resulting life risk at a high-energy sandy beach in southwest France. Nat. Hazards: J. Int. Soc. Prev. Mitig. Nat. Hazards 2024, 120, 1557–1576.
15. Dusek, G.; Seim, H.; Hanson, J.L.; Elder, D.H. Analysis of Rip Current Rescues at Kill Devil Hills, North Carolina. In Rip Currents; Taylor & Francis Group: Boca Raton, FL, USA, 2011.
16. Brighton, B.; Sherker, S.; Brander, R.; Thompson, M.; Bradstreet, A. Rip current related drowning deaths and rescues in Australia 2004–2011. Nat. Hazards Earth Syst. Sci. 2013, 13, 1069–1075.
17. Brewster, B.C.; Gould, R.E.; Brander, R.W. Estimations of rip current rescues and drowning in the United States. Nat. Hazards Earth Syst. Sci. 2019, 19, 389–397.
18. Castelle, B.; Brander, R.; Tellier, E.; Simonnet, B.; Scott, T.; McCarroll, J.; Campagne, J.M.; Cavailhes, T.; Lechevrel, P. Surf zone hazards and injuries on beaches in SW France. Nat. Hazards 2018, 93, 1317–1335.
19. Robbles, L. Cervical spine injuries in ocean bathers: Wave-related accidents. Neurosurgery 2006, 58, 920–923.
20. Thom, O.; Roberts, K.; Leggat, P.A.; Devine, S.; Peden, A.E.; Franklin, R.C. Cervical spine injuries occurring at the beach: Epidemiology, mechanism of injury and risk factors. BMC Public Health 2022, 22, 1404.
21. Austin, M.; Scott, T.M.; Brown, J.W.; Brown, J.A.; MacMahan, J.H.; Masselink, G.; Russell, P. Temporal observations of rip current circulation on a macro-tidal beach. Cont. Shelf. Res. 2010, 30, 1149–1165.
22. Stokes, C.; Poate, T.; Masselink, G.; Scott, T.; Instance, S. New insights into combined surfzone, embayment, and estuarine bathing hazards. Nat. Hazards Earth Syst. Sci. 2024, 24, 4049–4074.
23. Dusek, G.; Seim, H. Rip Current Intensity Estimates from Lifeguard Observations. J. Coast. Res. 2012, 29, 505–518.
24. Dusek, G.; Seim, H. A probabilistic rip current forecast model. J. Coast. Res. 2013, 29, 909–925.
25. Casper, A.; Nuss, E.S.; Baker, C.M.; Moulton, M.; Dusek, G. Assessing NOAA Rip-Current Hazard Likelihood Predictions: Comparison with Lifeguard Observations and Parameterizations of Bathymetric and Transient Rip-Current Types. Weather. Forecast. 2024, 39, 1045–1063.
26. Castelle, B.; Dehez, J.; Savy, J.P.; Liquet, S.; Carayon, D. Physics-based forecast modelling of rip-current and shore-break wave hazards. Nat. Hazards Earth Syst. Sci. Discuss. 2024. [preprint], in review.
27. Dwight, R.H.; Brinks, M.V.; SharavanaKumar, G.; Semenza, J.C. Beach attendance and bathing rates for Southern California beaches. Ocean Coast. Manage. 2007, 50, 847–858.
28. Ibarra, E. The use of webcam images to determine tourist-climate aptitude: Favourable weather types for sun and beach tourism on the Alicante coast (Spain). Int. J. Biometeorol. 2011, 55, 373–385.
29. Coombes, E.; Jones, A.P.; Bateman, I.; Tratalos, J.; Gill, J.; Showler, D.; Watkinson, A.; Sutherland, W. Spatial and temporal modeling of beach use: A case study of east Anglia, UK. Coast. Manag. 2009, 37, 94–115.
30. Kane, B.; Zajchowski, C.A.; Allen, T.R.; McLeod, G.; Allen, N.H. Is it safer at the beach? Spatial and temporal analyses of beachgoer behaviors during the COVID-19 pandemic. Ocean. Coast. Manag. 2021, 205, 105533.
31. Tellier, E.; Simonnet, B.; Gil-Jardiné, C.; Lerouge-Bailhache, M.; Castelle, B.; Salmi, R. Predicting drowning from sea and weather forecasts: Development and validation of a model on surf beaches of southwestern France. Inj. Prev. 2022, 28, 16–22.
32. Bustos, M.L.; Zilio, M.I.; Ferrelli, F.; Piccolo, M.C.; Perillo, G.M.; Van Waarde, G.; Manstretta, G.M.M. Tourism in the COVID-19 context in mesotidal beaches: Carrying capacity for the 2020/2021 summer season in Pehuén Co, Argentina. Ocean. Coast. Manag. 2021, 206, 105584.
33. Provost, E.J.; Coleman, M.A.; Butcher, P.A.; Colefax, A.; Schlacher, T.A.; Bishop, M.J.; Connolly, R.M.; Gilby, B.L.; Henderson, C.J.; Jones, A.; et al. Quantifying human use of sandy shores with aerial remote sensing technology: The sky is not the limit. Ocean. Coast. Manag. 2021, 211, 105750.
34. Teles da Mota, V.; Pickering, C.; Chauvenet, A. Popularity of Australian beaches: Insights from social media images for coastal management. Ocean. Coast. Manag. 2022, 217, 106018.
35. Guillén, J.; García-Olivares, A.; Ojeda, E.; Osorio, A.; Chic, O.; González, R. Long-term quantification of beach users using video monitoring. J. Coast. Res. 2008, 24, 1612–1619.
36. Balouin, Y.; Rey-Valette, H.; Picand, P.A. Automatic assessment and analysis of beach attendance using video images at the Lido of Sète beach, France. Ocean. Coast. Manag. 2014, 102, 114–122.
37. Domingo, M.C. Deep Learning and Internet of Things for Beach Monitoring: An Experimental Study of Beach Attendance Prediction at Castelldefels Beach. Appl. Sci. 2021, 11, 10735.
38. Johnston-González, R.; Adarraga, E.; Coca, O.; Correa, M.; de la Hoz, E.; Legarda, G.; Navarro, J.; Ramírez, M.; Rozo, A.; Ricaurte-Villota, C. Artificial intelligence for beach monitoring: An experimental study of beach attendance at El Rodadero, Colombia. Ocean. Coast. Manag. 2024, 253, 107159.
39. Sempere-Tortosa, M.; Toledo, I.; Marcos-Jorquera, D.; Carbonell, D.; Gilart-Iglesias, V.; Aragonés, L. A new occupancy index model based on artificial vision for enhancing beach management. J. Environ. Manag. 2024, 370, 122675.
40. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the KDD ’16: 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 13–17 August 2016; pp. 785–794.
41. Uddin, M.G.; Nash, S.; Rahman, A.; Olbert, A.I. A comprehensive method for improvement of water quality index (WQI) models for coastal water quality assessment. Water Res. 2022, 219, 118532.
42. Hasan, M.H.; Ahmed, A.; Nafee, K.; Hossen, M.A. Use of machine learning algorithms to assess flood susceptibility in the coastal area of Bangladesh. Ocean. Coast. Manag. 2023, 236, 106503.
43. Bossard, V.; Nicolae Lerma, A. Geomorphologic characteristics and evolution of managed dunes on the South West Coast of France. Geomorphology 2020, 367, 107312.
44. Nicolae Lerma, A.; Castelle, B.; Marieu, V.; Robinet, A.; Bulteau, T.; Bernon, N.; Mallet, C. Decadal beach-dune profile monitoring along a 230-km high-energy sandy coast: Aquitaine, southwest France. Appl. Geogr. 2022, 139, 102645.
45. Brumaud, S. Saison Touristique 2015 en Aquitaine: La Fréquentation des Hôtels et Campings au Beau Fixe (in French); INSEE Analyses Nouvelle-Aquitaine: Poitiers, France, 2016.
46. Senechal, N.; Coco, G.; Castelle, B.; Marieu, V. Storm impact on the seasonal shoreline dynamics of a meso- to macrotidal open sandy beach (Biscarrosse, France). Geomorphology 2015, 228, 448–461.
47. Angnuureng, D.B.; Almar, R.; Senechal, N.; Castelle, B.; Addo, K.A.; Marieu, V.; Ranasinghe, R. Shoreline resilience to individual storms and storm clusters on a meso-macrotidal barred beach. Geomorphology 2017, 290, 265–276.
48. Dehez, J.; Lyser, S.; Castelle, B.; Brander, R.W.; Peden, A.E.; Savy, J.P. Investigating beachgoer’s perception of coastal bathing risks in southwest France. Nat. Hazards 2024, 120, 13209–13230.
49. Tolman, H.L.; Balasubramaniyan, B.; Burroughs, L.D.; Chalikov, D.; Chao, Y.Y.; Chen, H.S.; Gerald, V.M. Development and Implementation of Wind-Generated Ocean Surface Wave Models at NCEP. Weather. Forecast. 2002, 17, 311–333.
50. Egbert, G.D.; Erofeeva, S.Y. Efficient Inverse Modeling of Barotropic Ocean Tides. J. Atmos. Ocean. Technol. 2002, 19, 183–204.
51. Zhou, J.; Yang, D.; Song, T.; Ye, Y.; Zhang, X.; Song, Y. Improved YOLOv7 models based on modulated deformable convolution and swin transformer for object detection in fisheye images. Image Vis. Comput. 2024, 144, 104966.
52. Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 7464–7475.
53. Moreno, A.; Amelung, B.; Santamarta, L. Linking Beach Recreation to Weather Conditions: A Case Study in Zandvoort, Netherlands. Tour. Mar. Environ. 2008, 5, 111–119.
54. R.-Toubes, D.; Araújo-Vila, N.; Fraiz-Brea, J.A. Influence of Weather on the Behaviour of Tourists in a Beach Destination. Atmosphere 2020, 11, 121.
55. Dehez, J.; Lyser, S.; Castelle, B. Predicting individual’s decision to enter the water at a high-energy recreational surf beach in France. Inj. Prev. 2025.

Figure 1. (a) Location map of the southwest coast of France, along with an aerial view of the two video-monitored sites used in this study: (b) the coastal resort of Biscarrosse, which offers several direct beach access points (photo: Observatoire de la Côte de Nouvelle-Aquitaine, OCNA), and (c) La Lette Blanche, a remote patrolled beach accessible only via a single coastal dune track connected to an inland parking area (photo: V. Marieu).

Figure 2. Snapshot images from the Biscarrosse video station showing contrasting beach attendance levels on a warm, sunny, and light-wind day: (a) in the morning and (b) in the afternoon, and on (c) the afternoon of a cool, rainy, and windy day. In all panels, local date and time are provided in the top-right corner.

Figure 3. Example of video station image snapshot and vignette for (a) La Lette Blanche and (b) Biscarrosse. (c) YOLOv7 beach user detection on a given vignette at La Lette Blanche.

Figure 4. Time series of beach user count from hourly lifeguard estimates (black) and from video at 10-minute intervals (blue) during (a) August and (b,c) zoomed onto two 3-day periods. (d) Lifeguard versus video beach user count with the black line showing the linear fit. In (a–c), light-grey areas indicate unpatrolled hours.

Figure 5. Time series of (a–h) model input environmental variables to predict (i) the video-derived beach user count N: (a) air temperature (T), (b) precipitation (P), (c) mean wind speed (W), (d) insolation (I), (e) significant wave height $H_{s}$, (f) peak wave period $T_{p}$, (g) direction of wave incidence $\theta$ and (h) astronomical tide elevation $\zeta$. In (a–g), the red circles indicate the daily mean (11 AM–7 PM) denoted as $|\cdot|$ in the text, while in (h), they indicate the daily astronomical tide range $|TR|$.

Figure 6. Correlation matrix for (a) all the environmental variables and (b) all the daily mean environmental variables. (c) Time series of averaged hourly beach user count N with the vertical bar indicating the standard deviation.

Figure 7. Time series of video-derived (blue) and modelled (red) beach user count based on the optimal XGBoost model for (a) the entire time series and (b,c) zoomed onto two 4-day periods. The optimal XGBoost model uses $|T|$, $|P|$, $|W|$, $|\alpha|$, $|I|$, $|TR|$, Month, Day and Hour as input variables.

Figure 8. (a) Video-derived versus modelled beach user count, with the dashed black line showing the 1:1 relationship and (b) model feature (input variable) importance ordered from the most important Hour to the least important Day for the optimal XGBoost model, where the F score is the count of how many times a given feature appears as a decision node in the trees of the XGBoost model.

Figure 9. (a) Time series of video-derived (blue) and modelled (red) beach user count based on an optimised XGBoost model using only $|T|$ and Hour as input variables in July and August 2023 and (b) corresponding video-derived versus modelled beach user count, with the dashed black line showing the 1:1 relationship.

Figure 10. (a) Time series of video-derived (blue) and modelled (red) beach user count based on an optimised XGBoost model using $|T|$, $|P|$, $|W|$, $|\alpha|$, $|I|$, $|TR|$, Month, Day and Hour as input variables including the shoulder seasons and (b) corresponding video-derived versus modelled beach user count, with the dashed black line showing the 1:1 relationship.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
