Article
Peer-Review Record

Drone Insights: Unveiling Beach Usage through AI-Powered People Counting

Drones 2024, 8(10), 579; https://doi.org/10.3390/drones8100579
by César Herrera 1,*, Rod M. Connolly 1, Jasmine A. Rasmussen 1, Gerrard McNamara 1, Thomas P. Murray 1, Sebastian Lopez-Marcano 1, Matthew Moore 2, Max D. Campbell 1 and Fernando Alvarez 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 5 September 2024 / Revised: 4 October 2024 / Accepted: 9 October 2024 / Published: 13 October 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Dear Authors,

The manuscript "Drone Insights: Unveiling Beach Usage through AI-Powered People Counting" has been reviewed, and my comments are as follows.

1. The style of the abstract is not correct.

2. Some content is needed to explain the reasons for the incorrect detections.

3. How do you define urban beaches? What is the difference from natural sandy beaches?

4. It is not clear how people under shelters are detected.

5. Can the study be extended to a larger scale, and why? What about data availability?

6. Can the study be enhanced or changed to satellite imagery?

7. How can information derived from satellite data, such as coastline, bathymetry, and coastal slope, be combined?

Monitoring beach topography and nearshore bathymetry using spaceborne remote sensing: A review.

Estimating coastal slope of sandy beach from ICESat-2: a case study in Texas

 

Author Response

We thank the reviewer for their feedback and constructive suggestions.

Comment 1: The style of the abstract is not correct.

Response 1: We agree. While preparing the abstract using the journal template, we accidentally left in the indicative numbers for the suggested abstract sections. We have removed these numbers from the abstract on page 1, lines 12, 13, 17, and 23.

 

Comment 2: Some content is needed to explain the reasons for the incorrect detections.

Response 2:

Thanks for pointing this out. To explain the origin of inaccurate detections in high-performing models, we have added the following text to the Results for Model Performance (page 7, lines 268-285):

“All classes in the Land-Water model performed well, with precision and recall over 90% (Table 2). For the Usage model, classes with the lowest number of ground-truths performed worse than classes with more ground-truth labels. However, precision and recall for all classes were robust at over 80%, with the classes people resting, runners/walkers, surfers, and shelters performing particularly well. Furthermore, mAP50 for the Land-Water and Usage models was 0.55 and 0.54 respectively, which is on par with performance from state-of-the-art models trained on very large datasets [47]. Incorrect detections can occur even in well-performing models due to several factors associated with the model architecture and data. False positive and false negative instances can arise from errors in the regressor or classifier components of the model architecture (e.g., for anchor-box architectures, see Miller et al. 2022a). Inter-class misclassification can also occur due to unusual viewpoints, similarity in class characteristics, relative size of objects, and the type of background (Hoiem et al. 2012; Miller et al. 2022b). In fact, we observed slightly higher misclassification between two pairs of semantically similar classes: Runners/Walkers and Anglers, and Surfers and Kite Surfers. The top-down viewpoint of our surveys marginally disfavoured people sitting cross-legged in the Resting category because of their small relative size. Nonetheless, the performance metrics of the Land-Water and Usage models were satisfactory, and the number of incorrect detections was within an acceptable range.”

 

Comment 3: How do you define urban beaches? What is the difference from natural sandy beaches?

Response 3: We recognize that the term “urban beaches” could be subject to various semantic interpretations. Our definition of an urban beach is a beach in a well-developed area that undergoes high use and is of high social, cultural, environmental, and economic value to an urbanized region. To improve text clarity, we have changed the sentence on page 1, lines 39-40:

 “Therefore, obtaining accurate information about human use of sandy beaches in well-developed urban areas is a key component of beach management”.

Comment 4: It is not clear how people under shelters are detected.

Response 4: Detecting people under shelters was not feasible in our study, and we acknowledge this as a limitation. We have described our approach for estimating the number of people under shelters in the Methods section. Our strategy involved conducting four field studies at four locations known for high shelter usage to estimate the mean number of people typically present under shelters. This mean count was then used as a correction factor to adjust the total number of people based on the number of shelters observed. To improve clarity, we have renamed this section in the Methods as 'Accounting for People Under Beach Shelters' (page 5, line 182).
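The correction described in this response amounts to a simple linear adjustment. Below is a minimal sketch of that idea in Python, with hypothetical variable names and an illustrative (not the study's published) mean-occupancy value:

```python
# A minimal sketch of the shelter correction described above.
# The function name and the example numbers are illustrative only;
# the real correction factor comes from the authors' four field studies.

def corrected_people_count(visible_people: int,
                           n_shelters: int,
                           mean_people_per_shelter: float) -> float:
    """Adjust a drone-survey count for people hidden under shelters."""
    return visible_people + n_shelters * mean_people_per_shelter

# Example with made-up numbers: 950 visible people, 40 detected shelters,
# and an assumed mean of 2.3 occupants per shelter.
print(corrected_people_count(950, 40, 2.3))  # -> 1042.0
```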

 

Comment 5: Can the study be extended to a larger scale, and why? What about data availability?

Response 5: We appreciate the reviewer’s encouragement to consider how our method can be scaled up. Yes, we believe this method can be extended to a larger scale with adequate considerations. The main reasons enabling scalability are that (1) drone surveys and AI are very cost-effective, and (2) costs are typically high during the initial development of protocols and pipelines but do not necessarily increase linearly when extended to a larger area. We have included a new paragraph in the manuscript that describes how and why this study can be extended. We also discuss how existing data (e.g., from long-term lifeguard monitoring) can help inform drone surveys. The new paragraph on page 14, lines 435-445, reads:

“Drone surveys and AI offer significant advantages over manual counting, particularly in scalability. Contrary to other counting methods, where costs increase in proportion to the scale of deployment, drone surveys and AI can exhibit economies of scale, where initial costs may be high but the marginal cost of scaling up diminishes. For instance, besides personnel and expertise costs, initial investments in AI, for model training and pipeline development, are high but are one-time efforts that enable the models to handle larger datasets efficiently. However, researchers and managers must monitor for data drift, which could impact model performance if the new data differ from the training data. Extending drone surveys and AI to a larger area would be facilitated by the availability of data on beach usage patterns in the larger areas. In the absence of such data, drone surveys must be designed and scheduled to learn dynamically from ongoing surveys.”

 

Comment 6: Can the study be enhanced or changed to satellite imagery?

Comment 7: How can information derived from satellite data, such as coastline, bathymetry, and coastal slope, be combined?

  • Monitoring beach topography and nearshore bathymetry using spaceborne remote sensing: A review.
  • Estimating coastal slope of sandy beach from ICESat-2: a case study in Texas

Response 6 & 7: Satellite data can indeed provide complementary information to help explain observed usage patterns. For example, it can assist by (1) estimating the beach area available to beachgoers, depending on beach topography, slope, and tidal range; (2) indicating wave-breaking quality in the surf zone based on local bathymetry and weather conditions; and (3) evaluating beach morpho-hydrodynamics. However, there are practical challenges associated with these use cases.

For instance, calculating beach slope is not well defined or standardized in the literature, as it can be measured from different reference points, such as from the dune or berm to the shoreline (mean sea level) or low-tide bar. Given that our surveys were conducted over a wide range of dates, times, and locations, we were unable to incorporate slope data (e.g., from sources such as http://coastsat.wrl.unsw.edu.au/, described in Vos et al., “Beach-Face Slope Dataset for Australia,” Earth System Science Data 2022, 14, 1345–1357, doi:10.5194/essd-14-1345-2022) and tidal information to account for their effects on people counts. That said, we do not believe that water-based activities or most land-based activities were significantly affected by coastline, bathymetry, or slope at the scale of our study. However, by introducing the methods used in this study, we hope to lay the groundwork for future research that explores innovative ways to incorporate complementary data sources to further explain beach usage patterns. This is indeed a limitation of our study, which we have acknowledged on page 17, lines 513-529; we have also included the suggested references so others can draw inspiration for future possibilities. The new paragraph reads:

“Satellite data could have provided complementary information to further explain some of the observed beach usage patterns. For example, satellite data can be used to (1) estimate the area of the beach available to visitors based on topography, slope, and tidal range; (2) provide insights into wave-breaking quality in the surf zone, which depends on local bathymetry and weather conditions; and (3) evaluate beach morpho-hydrodynamics (Ma et al. 2023; Salameh et al. 2019). However, there are practical challenges associated with incorporating such data. Calculating beach slope, and therefore beach width affected by tides, is not well standardized across the literature, as it can be measured from different points of reference, such as from the dune or berm to the shoreline (mean sea level) or low-tide bar. Given the large spatial scale and the wide range of times and dates of our drone surveys, we were unable to use consistent slope and tide data from sources such as CoastSat (Vos et al., 2022). Consequently, our study does not account for the potential effects of these factors on beachgoer counts. Despite this limitation, we do not believe that beach slope, bathymetry, or tidal range would have significantly impacted most water-based and land-based activities in our analysis. Nevertheless, future studies could incorporate satellite data and other complementary methods to provide further insights into how such environmental variables influence beach usage patterns.”

Reviewer 2 Report

Comments and Suggestions for Authors

The study presents a novel method for assessing beach use through the combined use of drones and artificial intelligence. A total of 507 drone surveys were conducted at 28 beaches on the Gold Coast, Australia, covering 30 km of coastline across different seasons, times of day, and environmental conditions. Two AI models were employed: one to count people on land and in the water, and another to identify types of use. The method proved to be a scalable and cost-effective solution for long-term beach use assessment compared to traditional methods.

1. The study does not provide information on the length of stay or the movement patterns of beachgoers, which could be valuable for understanding beach usage dynamics.

2. While the authors mention the use of a deep learning single-shot detector model, they could provide more specific details about the model architecture and hyperparameters. Additionally, a more elaborate explanation of the Bayesian GLM, including the choice of priors and model validation techniques, could enhance the clarity of the statistical analysis.

3. The assertion that the method supports cases for increased funding or targeted investment could benefit from more concrete examples or projections based on the identified usage patterns and trends. 

In general, it is an interesting and well-organized study; the presentation has an adequate structure, and the methodological process can be improved in the aspects mentioned.

Author Response

We thank the reviewer for their feedback and constructive suggestions.

Comment 1: The study does not provide information on the length of stay or the movement patterns of beachgoers, which could be valuable for understanding beach usage dynamics.

Response 1: Agree. We did not consider the length of stay or movement patterns of beachgoers. This is a limitation of our study, as explained on page 7, line 279. For this reason, we emphasize that our annual estimates mirrored the methodology currently employed by lifeguards, where counts are done at three distinct times throughout the day and are assumed to be independent. Thus, the underlying assumption is that the average beachgoer’s length of stay is less than three to four hours, which is within the range found for beach visitation in Australia and elsewhere (Sardá et al. 2009 on the Mediterranean; Maguire et al. 2011 in Australia; Raybould and Lazarow 2009 on the Gold Coast). However, we acknowledge that knowing the length of stay would allow an understanding of visitation patterns and an estimate of the number of unique visitors. We have added a new paragraph to the Discussion pointing out how length of stay could provide additional information for managers. The new paragraph on page 17, lines 530-537, reads:

“Future studies could incorporate the quantification of travel time, visitation frequency, and length-of-stay patterns among beachgoers. Understanding where visitors are travelling from, the number of beach visits, and the average length of stay would enable the estimation of complementary management indicators, such as the total number of unique visitors and the economic impact on the region, and a deeper understanding of the factors driving beachgoer behaviour (West & Bayne 2002). This additional data could also prove valuable for estimating the economic losses following catastrophic events, such as a pandemic, that lead to reduced visitation (e.g., English et al. 2018).”

 

Comment 2: While the authors mention the use of a deep learning single-shot detector model, they could provide more specific details about the model architecture and hyperparameters. Additionally, a more elaborate explanation of the Bayesian GLM, including the choice of priors and model validation techniques, could enhance the clarity of the statistical analysis.

Response 2: Agree. We have accordingly revised the description of the deep learning and Bayesian GLM methods and included additional details, as well as referring to the GitHub repository associated with this manuscript, where the hyperparameters and Bayesian notebooks can be found.

Thus, for the deep learning model, we have included the model architecture and referenced the hyperparameters file on page 4, lines 140-152. The paragraph reads:

“For object detection we employed a deep learning single-shot detector model (YOLOv5, Jocher et al. 2020), consisting of a CSP-Darknet53 backbone, a Spatial Pyramid Pooling-Fast (SPPF) neck, and a YOLOv3 head trained on the COCO dataset [42–44] (Wang et al. 2020). Model initialization used pre-trained weights, followed by custom dataset training with augmentation for improved performance and generalizability. To prevent overfitting, we implemented early stopping (Morgan and Bourlard 1989; Prechelt 2002) and evaluated the model on an evaluation dataset [45]. Model optimization involved fine-tuning hyperparameters and adjusting confidence thresholds for each target class. To mitigate over-tuning, a single test run on an independent dataset was conducted after satisfactory hyperparameter adjustments. All performance metrics are reported against the testing dataset (Table 2). The Land-Water model incorporated a tracking module for position-based tracking and re-identification of people and shelters [46]. All hyperparameters used for training can be found in the repository: https://github.com/globalwetlands/BeachAI.”
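For readers unfamiliar with this workflow, a minimal inference sketch follows. It uses the generic YOLOv5 torch.hub API rather than the authors' training pipeline, and the class names and per-class confidence thresholds are illustrative assumptions only:

```python
# Minimal YOLOv5 inference sketch using the public torch.hub API.
# Not the authors' pipeline; class names and thresholds are hypothetical.
import torch

model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
model.conf = 0.10  # low global threshold; class-specific filtering below

# Hypothetical per-class confidence thresholds tuned on an evaluation set.
CLASS_THRESHOLDS = {"person": 0.40, "umbrella": 0.55}

results = model("beach_frame.jpg")  # one frame extracted from a drone video
df = results.pandas().xyxy[0]       # columns: xmin..ymax, confidence, name

# Keep only detections that clear their class-specific threshold.
keep = df.apply(lambda r: r["confidence"] >= CLASS_THRESHOLDS.get(r["name"], 1.0),
                axis=1)
print(df[keep][["name", "confidence"]])
```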

For the statistical model, we have specified that the full description of the model and choice of priors can be found in the supplementary materials (page 6, lines 214-216), and the model Python notebooks are available in the GitHub repository associated with this manuscript (page 6, line 216). We had already mentioned some of the model validation techniques we used, i.e., prior and posterior predictive checks. Furthermore, we have added additional information about model validation, including model selection using the Watanabe-Akaike Information Criterion (WAIC), and model fitting checks via Effective Sample Size (ESS) and diagnostics of posterior convergence. The modified paragraph describing the Bayesian GLM reads (page 6, lines 206-235):

“We used a multi-level Bayesian Generalized Linear Model (GLM) with weak priors to assess the importance of explanatory variables on people count estimates from drone surveys [50]. As we employed two distinct deep learning models to detect multiple categories, each category from each detection model was studied independently. Thus, for each category and for the total people count, GLMs were fitted to extract pertinent information from count patterns and predict annual use considering relevant explanatory variables. We used a Negative Binomial distribution with a log link function for modelling people counts (S4). This distribution was suitable given the mean-to-variance relationship in the dataset. The full description of the statistical model and choice of priors is given in the supplementary material (S4), and the model, built in Python, can be found in the repository: https://github.com/globalwetlands/BeachAI. We assessed variable importance by quantifying posterior means and credible intervals of coefficients for each predictor. Thus, we recovered the parameters associated with each predictor in the model. In addition to the weather variables described previously, we explored the importance of location, compartment, season, and time of day. We allowed random intercepts and slopes for location, compartment, and season. For all other variables we originally hypothesised that changes in their group levels would only be expected in their baseline effects on the response variable (i.e., changes in intercept), but upon further exploration and on a case-by-case basis, random slopes were allowed. From these analyses, and for variables with predictive power, we extracted the GLM predictor parameters that allow us to compute the probability of observing specific people counts for combinations of explanatory variables. These parameters were critical for conducting temporal extrapolation from individual survey estimates over the entire day, season, and year. Thus, our predicted annual estimates were constructed by resampling the parameters’ distributions (n = 4,000) and predicting over 365 days. In addition, when a weather variable exhibited high predictive power, we produced 100 estimates over its range of observed values. We conducted prior and posterior predictive checks, including diagnostics of posterior trace convergence for evaluating the adequacy of priors and models, together with Effective Sample Size (ESS) and the Watanabe-Akaike Information Criterion (WAIC). Predictors that exhibited high correlation with each other were dropped from statistical models.”
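To make this model description concrete, here is a minimal multilevel negative-binomial GLM sketch in PyMC with simulated data. It assumes hypothetical variable names and simplified priors; the authors' actual model and priors are in their supplementary material (S4) and GitHub repository:

```python
# Sketch of a multilevel negative-binomial GLM with a log link (PyMC).
# Data, priors, and structure are simplified stand-ins, not the study's model.
import numpy as np
import pymc as pm

rng = np.random.default_rng(1)
n, n_loc = 200, 10
location = rng.integers(0, n_loc, n)   # location index for each survey
temperature = rng.normal(0, 1, n)      # one standardized weather predictor
counts = rng.poisson(50, n)            # placeholder people counts

with pm.Model() as glm:
    # Weakly informative priors
    mu_a = pm.Normal("mu_a", 0.0, 2.0)
    sigma_a = pm.HalfNormal("sigma_a", 1.0)
    a_loc = pm.Normal("a_loc", mu_a, sigma_a, shape=n_loc)  # random intercepts
    b_temp = pm.Normal("b_temp", 0.0, 1.0)                  # fixed slope
    alpha = pm.Exponential("alpha", 1.0)                    # NB dispersion

    mu = pm.math.exp(a_loc[location] + b_temp * temperature)  # log link
    pm.NegativeBinomial("y", mu=mu, alpha=alpha, observed=counts)

    idata = pm.sample(1000, tune=1000, target_accept=0.9)
```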

 

Comment 3: The assertion that the method supports cases for increased funding or targeted investment could benefit from more concrete examples or projections based on the identified usage patterns and trends.

Response 3: We agree with this comment, and accordingly we have added the following paragraph to the discussion on page 16-17 lines 501-512:

“The City of Gold Coast (local council) Ocean Beaches Strategy - End of Life Review stated that an increasing population and the consequential changes to the use of beach amenities were among the significant challenges for managing these natural assets into the future (OBS 2023). The work presented here supports a better understanding of these challenges by providing a more accurate tool to understand beach visitation and beach usage. Additionally, the City publishes an annual State of the Beaches Report, which aims to provide an overview of Gold Coast beaches and their visitors, uses, and facilities, while demonstrating the City of Gold Coast’s role in coastal management (City of Gold Coast, 2024). The State of the Beaches report includes beach visitation counts per compartment, with a breakdown between swimming, surfing, and craft activities. The methodology presented here enhances these management tools by providing more accurate data to support the allocation of funds to beach protection, assets, and amenities.”

Reviewer 3 Report

Comments and Suggestions for Authors

This paper is generally fine; it is clearly put together and described, but some changes are needed in the methods and results in particular. The text itself is quite uneven; I think different authors have put together the different sections. The intro is ok and to the point. The methods need some more work; the section on model performance needs to be moved to the results (as you are presenting results from it). The results section itself is very brief and you do not actually present any of your data. What numbers did you count? This is not even stated for your different beaches/times/dates. The basic results are completely missing. Extrapolation of numbers to a single year is very problematic. Line 396 says you did the surveys only over 5 months, but then you extrapolate to a whole year in Table 3. This is not great; you need to be much more critical here, and errors are not even mentioned. No mention is made of tidal state - this will determine beach width and therefore the area available for people. You also only talk about total numbers (e.g. Fig 3), whereas DENSITY of people is actually better and more meaningful, so please calculate this. Overall, the paper is ok, but please note the weak areas listed above that need focusing on. This therefore requires some moderate revisions.

I have also indicated on the attached pdf a few areas that need some more clarity. Some of the refs are also missing information. 

Comments for author File: Comments.pdf

Author Response

We appreciate the encouraging and constructive feedback from the reviewer and feel that we have been able to address all the concerns raised by the reviewer (see responses below).

Comment 1: This paper is generally fine, it is clearly put together and described, but some changes are needed in the methods and results in particular.

Response 1: Thanks.

Comment 2: The text itself is quite uneven, I think different authors have put together the different sections.

Response 2: Thank you for this observation. Although all authors contributed to various sections of the manuscript, the lead author was responsible for writing and editing the entire document. We have carefully revised the manuscript to ensure consistency in writing style throughout the text.

Comment 3: Intro is ok and to the point.

Response 3: Thanks.

 

Comment 4: Methods need some more work, the section on model performance needs to be moved to the results (but you are presenting results from it).

Response 4: We agree. Regarding the methods, we have added additional details on the detection model architecture we employed, and we have referenced and made available the model hyperparameters used for training in the GitHub repository associated with the manuscript. We have specified that the full description of the statistical model and choice of priors can be found in the supplementary material, and we have made the Python notebooks with the statistical models available in the GitHub repository. We have added new references where needed. These modifications have been made on page 4, lines 140-152, and page 6, lines 206-235. The new paragraphs read:

“For object detection we employed a deep learning single-shot detector model (YOLOv5, Jocher et al. 2020), consisting of a CSP-Darknet53 backbone, a Spatial Pyramid Pooling-Fast (SPPF) neck, and a YOLOv3 head trained on the COCO dataset [42–44] (Wang et al. 2020). Model initialization used pre-trained weights, followed by custom dataset training with augmentation for improved performance and generalizability. To prevent overfitting, we implemented early stopping (Morgan and Bourlard 1989; Prechelt 2002) and evaluated the model on an evaluation dataset [45]. Model optimization involved fine-tuning hyperparameters and adjusting confidence thresholds for each target class. To mitigate over-tuning, a single test run on an independent dataset was conducted after satisfactory hyperparameter adjustments. All performance metrics are reported against the testing dataset (Table 2). The Land-Water model incorporated a tracking module for position-based tracking and re-identification of people and shelters [46]. All hyperparameters used for training can be found in the repository: https://github.com/globalwetlands/BeachAI.”

and

“We used a multi-level Bayesian Generalized Linear Model (GLM) with weak priors to assess the importance of explanatory variables on people count estimates from drone surveys [50]. As we employed two distinct deep learning models to detect multiple categories, each category from each detection model was studied independently. Thus, for each category and for the total people count, GLMs were fitted to extract pertinent information from count patterns and predict annual use considering relevant explanatory variables. We used a Negative Binomial distribution with a log link function for modelling people counts (S4). This distribution was suitable given the mean-to-variance relationship in the dataset. The full description of the statistical model and choice of priors is given in the supplementary material (S4), and the model, built in Python, can be found in the repository: https://github.com/globalwetlands/BeachAI. We assessed variable importance by quantifying posterior means and credible intervals of coefficients for each predictor. Thus, we recovered the parameters associated with each predictor in the model. In addition to the weather variables described previously, we explored the importance of location, compartment, season, and time of day. We allowed random intercepts and slopes for location, compartment, and season. For all other variables we originally hypothesised that changes in their group levels would only be expected in their baseline effects on the response variable (i.e., changes in intercept), but upon further exploration and on a case-by-case basis, random slopes were allowed. From these analyses, and for variables with predictive power, we extracted the GLM predictor parameters that allow us to compute the probability of observing specific people counts for combinations of explanatory variables. These parameters were critical for conducting temporal extrapolation from individual survey estimates over the entire day, season, and year. Thus, our predicted annual estimates were constructed by resampling the parameters’ distributions (n = 4,000) and predicting over 365 days. In addition, when a weather variable exhibited high predictive power, we produced 100 estimates over its range of observed values. We conducted prior and posterior predictive checks, including diagnostics of posterior trace convergence for evaluating the adequacy of priors and models, together with Effective Sample Size (ESS) and the Watanabe-Akaike Information Criterion (WAIC). Predictors that exhibited high correlation with each other were dropped from statistical models.”

Regarding model performance, we originally considered that results from model detection were not that interesting for the audience. However, we acknowledge that deep learning detection models and their application in cases such as our study are novel and deserve to be highlighted in the text. Thus, we have followed the reviewer’s suggestion and moved the model performance and model prediction sections to the Results. We have also added text explaining the reasons behind imperfect detections. The new text in the Results reads (page 7, lines 267-285):

“All classes in the Land-Water model performed well, with precision and recall over 90% (Table 2). For the Usage model, classes with the lowest number of ground-truths performed worse than classes with more ground-truth labels. However, precision and recall for all classes were robust at over 80%, with the classes people resting, runners/walkers, surfers, and shelters performing particularly well. Furthermore, mAP50 for the Land-Water and Usage models was 0.55 and 0.54 respectively, which is on par with performance from state-of-the-art models trained on very large datasets [47]. Incorrect detections can occur even in well-performing models due to several factors associated with the model architecture and data. False positive and false negative instances can arise from errors in the regressor or classifier components of the model architecture (e.g., for anchor-box architectures, see Miller et al. 2022a). Inter-class misclassification can also occur due to unusual viewpoints, similarity in class characteristics, relative size of objects, and the type of background (Hoiem et al. 2012; Miller et al. 2022b). In fact, we observed slightly higher misclassification between two pairs of semantically similar classes: Runners/Walkers and Anglers, and Surfers and Kite Surfers. The top-down viewpoint of our surveys marginally disfavoured people sitting cross-legged in the Resting category because of their small relative size. Nonetheless, the performance metrics of the Land-Water and Usage models were satisfactory, and the number of incorrect detections was within an acceptable range.”

Comment 5: The results section itself is very brief and you do not actually present any of your data. What numbers did you count? This is not even stated for your different beaches/times/dates. The basic results are completely missing.

Response 5: While a detailed description of the drone survey results could provide additional insights, given our extensive sampling effort, the majority of beach usage patterns have been captured and thoroughly described in the annual estimates. To avoid redundancy, we have focused on explaining variations across beaches, time of day, days of the week, and seasons through these estimates. We believe the study strikes a balance between presenting a novel methodology and offering a comprehensive account of the methods and results. Our aim is to highlight the broader applicability of this methodology, rather than overemphasize patterns specific to our study area. For those interested in local trends, we have created an interactive dashboard that allows stakeholders to explore results at the beach level in detail (https://fishid.shinyapps.io/beachai-dashboard/). That said, we have added a new paragraph that provides key summary statistics and patterns derived from the drone survey counts. The new paragraph (page 8, lines 290-300) reads:

“Drone surveys: All of our drone surveys detected individuals, with an average count of 1,572 ± 2,445 (standard deviation). During low and medium seasons, we observed a higher number of people on weekends, but the pattern reversed during peak season, except in Burleigh Heads and Kurrawa (compartments 15 and 20, respectively; Figure S5). The increased number of beachgoers on weekdays during the peak season suggests that this period attracts more tourists, who visit the beaches outside of typical working hours. Other patterns in beach usage observed in our drone surveys are well represented in the annual estimates. These patterns, which have been captured and described in detail through the modelling, are discussed in the following sections to avoid redundancy in the results.”

 

Comment 6: Extrapolation of numbers to a single year is very problematic. Line 396 says you did the surveys only over 5 months but then you extrapolate to a whole year in Table 3. This is not great, you need to be much more critical here, errors are not even mentioned.

Response 6: Correct, we conducted surveys over only five months. Unlike a traditional time series forecast, which predicts future values based solely on temporal trends, our model was designed to estimate visitation by incorporating seasonal effects explicitly. Based on historical data (the long-term lifeguard dataset), three seasons were defined: low, medium, and high. We also calculated how many high-, medium-, and low-season days are expected annually. This approach allowed us to produce an annual estimate of visitation based on seasonal variability and the effects of day-of-week, time-of-day, and beach location patterns, which are critical in explaining beach usage dynamics. Critically, the five-month period we surveyed includes all three seasons. We have added two sentences to the Methods section to clarify our approach. The new sentences on page 7, lines 261-265, read:

“Furthermore, unlike a traditional time series forecast, which predicts future values based on temporal trends, our model was designed to estimate visitation by incorporating seasonal effects explicitly. Thus, we used historical data to calculate how many high, medium, and low season days are expected annually.”
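The annual-extrapolation step can be illustrated with a short sketch. The season-day counts and posterior draws below are made-up stand-ins; the real pipeline resamples the fitted GLM's parameter distributions (n = 4,000) and predicts over 365 days:

```python
# Illustrative annual extrapolation from season-specific posterior draws.
# SEASON_DAYS and the posterior draws are fabricated for demonstration only.
import numpy as np

rng = np.random.default_rng(42)

SEASON_DAYS = {"low": 180, "medium": 120, "high": 65}  # assumed days per year

# Stand-in for posterior draws of the expected daily count in each season.
posterior_daily_mean = {
    "low": rng.normal(800, 80, 4000),
    "medium": rng.normal(1500, 120, 4000),
    "high": rng.normal(3200, 300, 4000),
}

# One annual total per posterior draw -> a distribution of annual estimates.
annual_draws = sum(SEASON_DAYS[s] * posterior_daily_mean[s] for s in SEASON_DAYS)
print(f"annual estimate: {annual_draws.mean():,.0f} ± {annual_draws.std():,.0f}")
```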

We have also included our error estimate in Table 3, as already mentioned on pages 12-13, lines 390-394. The new table reads:

 

|                     | Lifeguard observed (2022)  | Drone survey projection (2022-2023) |
|---------------------|----------------------------|-------------------------------------|
| Total people count  | 16,489,292 (unknown error) | 34,080,959 ± SE 3.7 million         |
| Land to water ratio | 1.59                       | 1.21                                |

Comment 7: No mention is made of tidal state - this will determine beach width and therefore area available for people.

Response 7: We agree that various factors, including tidal state and beach slope, determine beach width and therefore the area available for people. During preliminary analysis, we evaluated the effect of beach width on people counts. Beach width is measured monthly by the City of Gold Coast along cross-shore profile lines spaced 400 m apart. We did not find any effect of beach width. However, we do not discount potential effects of beach width variations at smaller temporal scales, such as those produced by tides. Nonetheless, we believe some of the activities carried out on Gold Coast beaches, such as water activities, might not be affected by beach width. Given the above, we did not calculate tide-driven beach width for each drone survey and location. This could be done in future work. However, standardized ways to measure beach slope, a key component interacting with tides that defines beach width, must first be established. As described by Vos et al. 2022:

"As identified in a previous validation at eight diverse sites (Vos et al., 2020), this dataset provides a good estimate of the “typical” or long-term average beach-face slope obtained from 20 years of Landsat imagery. It is recognised, however, that the beach-face slope can vary quite substantially through time, particularly for microtidal intermediate beaches (such as those found in SE Australia), where the beach often rapidly transitions between morphodynamic beach states (Wright and Short, 1984). While estimating this temporal variability is challenging when using the described method with the historic Landsat data (as undertaken here), new satellite remote sensing capabilities may make this a future possibility. For example, by combining satellite missions such as Landsat and Sentinel-2 (5 d revisit at the Equator) with Planet's CubeSat imagery (Kelly and Gontz, 2019), it might be possible to significantly increase the sampling frequency of shoreline observations. This higher-frequency data would enable the use of a narrower time window in which beach-face slopes are estimated, potentially opening up the possibility of estimating the temporal variability in beach-face slopes at different timescales (e.g. inter-annual and seasonal)."

We have added a new paragraph to the Discussion that explains this limitation of our study and discusses how assessing beach width could provide additional information to managers. The new paragraph on page 17, lines 513-529, reads:

“Satellite data could have provided complementary information to further explain some of the observed beach usage patterns. For example, satellite data can be used to (1) estimate the area of the beach available to visitors based on topography, slope, and tidal range; (2) provide insights into wave-breaking quality in the surf zone, which depends on local bathymetry and weather conditions; and (3) evaluate beach morpho-hydrodynamics (Ma et al. 2023; Salameh et al. 2019). However, there are practical challenges associated with incorporating such data. Calculating beach slope, and therefore beach width affected by tides, is not well standardized across the literature, as it can be measured from different points of reference, such as from the dune or berm to the shoreline (mean sea level) or low-tide bar. Given the large spatial scale and the wide range of times and dates of our drone surveys, we were unable to use consistent slope and tide data from sources such as CoastSat (Vos et al., 2022). Consequently, our study does not account for the potential effects of these factors on beachgoer counts. Despite this limitation, we do not believe that beach slope, bathymetry, or tidal range would have significantly impacted most water-based and land-based activities in our analysis. Nevertheless, future studies could incorporate satellite data and other complementary methods to provide further insights into how such environmental variables influence beach usage patterns.”

 

Comment 8: You also only talk about total numbers (e.g fig 3) whereas DENSITY of people is actually better and more meaningful, so please calculate this.

Response 8: We recognize the value of calculating density, as it can facilitate comparisons across beaches of varying sizes. However, we believe density is not the most meaningful metric for our study. Our analysis was shaped by the needs of the City of Gold Coast, which sought to quantify the absolute number of visitors, not their distribution per area. While density can standardize beach usage, it assumes that infrastructure (e.g., lifeguard towers) scales proportionately with beach size, which is not the case. For instance, as shown in Figure 1, lifeguard towers are not distributed based on area, meaning a lower density on larger beaches could obscure the need for more resources. In contrast, focusing on total counts better informs decisions about infrastructure and safety measures, ensuring resources match actual visitor numbers rather than potentially misleading density figures.

Comment 9: Overall, the paper is ok but please note these weak areas listed above that need focusing on. This therefore requires some moderate revisions. 

Response 9: Thanks. We have addressed the majority of the reviewer’s concerns by incorporating additional information where appropriate. In instances where we did not fully implement suggested changes, we have provided clear justifications for our decisions. This was done to ensure that our study remains focused and aligned with its original objectives. We appreciate the reviewer’s valuable input and trust that our responses and revisions will clarify any remaining issues.

Comment 10: I have also indicated on the attached pdf a few areas that need some more clarity. Some of the refs are also missing information.

Response 10: We have made changes throughout the manuscript based on the reviewer’s suggestions.

  • Page 1 lines 12, 13, 17, and 23 – we removed numbers from abstract.
  • Page 1 line 43 – we included the edits suggested by the reviewer, so the new sentence reads:

 “Assessing the number of beach visitors is a challenging task because it requires a reliable and consistent source of visitor counts over space and time, characterised by a defined and constant level of precision and accuracy.”

  • Page 3 line 98-99, figure 1 – we included an inset map of Australia in Figure 1 and we indicated the geographical position of Gold Coast in this inset.
  • Page 3 line 103 – we clarified that while Gold Coast beaches are divided into 29 compartments, we only surveyed 24 of them in the current study. The new sentence reads:

 “We conducted 507 aerial surveys across 38 beaches (24 out of 29 compartments, Figure 1) from December 2022 to April 2023 (Table S1), …”

  • To the question “were the lifeguard towers georeferenced?”: yes, lifeguard towers were georeferenced. We included a new sentence on page 6, line 237, that reads:

“Lifeguard towers’ geographical positions were obtained from the City of Gold Coast.”

  • To the suggestion “you need to express training and evaluation as a proportion - this is usually 80-20”: we agree that expressing data splits as proportions is useful for the reader. We have added proportions to the manuscript on page 4, line 129. The new sentence reads (an illustrative split sketch follows the quoted sentence below):

“Ground-truth labels were manually added to images from drone videos (Figure 1c), with 35,289 total labels split across training (51%), evaluation (9%), and testing (40%) datasets (Table 1).”
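As a rough illustration of such a split (not the authors' code), the reported 51/9/40 proportions can be produced with a two-stage split:

```python
# Two-stage split approximating the reported 51% / 9% / 40% proportions.
# The image list is a placeholder; label-level stratification is omitted.
from sklearn.model_selection import train_test_split

images = [f"frame_{i:05d}.jpg" for i in range(35_289)]  # placeholder items

# First carve off the 40% test set, then split the remaining 60% into
# 51% train and 9% evaluation overall (0.09 / 0.60 = 15% of the remainder).
train_eval, test = train_test_split(images, test_size=0.40, random_state=0)
train, evaluation = train_test_split(train_eval, test_size=0.09 / 0.60,
                                     random_state=0)
print(len(train), len(evaluation), len(test))  # ~51% / ~9% / 40% of total
```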

  • For addressing the comment “Not clear what this [Shelter] is”, we have added a description of shelters to the table caption. The revised table title and description read (page 4, line 137):

“Table 1. Ground-truth labels for the two detection models. The Shelters category refers to umbrellas, gazebos, and cabanas used by beachgoers for protection from the sun.” An entire section of the Methods is devoted to explaining the challenges of counting people under shelters.

  • For addressing the comment “Not clear what this [early stopping] is”, we have added two seminal references explaining this technique, which seeks to avoid overfitting the model during training. The new references are (page 4, line 145):
  1. Morgan, N.; Bourlard, H. Generalization and Parameter Estimation in Feedforward Nets: Some Experiments. Advances in neural information processing systems 1989.
  2. Prechelt, L. Early Stopping - But When? In Neural Networks: Tricks of the Trade; Springer Berlin Heidelberg: Berlin, Heidelberg, 2002; pp. 55–69.
  • For addressing the comment “you need to present these equations a little better, as given in line 157”, we have modified how the equations are introduced. For clarity and simplicity, we have kept the paragraph that describes each performance metric and introduces the equations. Equation titles were removed. The revised section is on pages 4-5, lines 153-169 (a small sketch of these metrics follows the quoted paragraph below):

“Model performance assessment utilized three metrics: precision, recall, and F1-score, calculated per class by comparing predictions to ground-truth labels (i.e., manual bounding boxes on beachgoers). Precision balances true positives (TP) against false positives (FP; equation 1), while recall measures the model’s ability to recover ground-truths (equation 2). The F1-score, a weighted average of precision and recall, is calculated using equation 3. Overall model performance is reported as Mean Average Precision (mAP50; equation 4), a metric considering precision, recall, and the Intersection over Union (IoU) method for bounding box overlap between prediction and ground-truth.”
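The three per-class metrics reduce to a few lines of arithmetic. A minimal sketch with made-up counts follows (mAP50 additionally requires IoU matching across confidence thresholds and is omitted):

```python
# Per-class detection metrics from hypothetical TP/FP/FN tallies.

def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp)        # equation 1

def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn)        # equation 2

def f1(p: float, r: float) -> float:
    return 2 * p * r / (p + r)   # equation 3: harmonic mean of p and r

# Example for one class with made-up counts.
p, r = precision(tp=92, fp=8), recall(tp=92, fn=10)
print(round(p, 3), round(r, 3), round(f1(p, r), 3))  # 0.92 0.902 0.911
```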

  • For addressing the comment “this section is results [Model Performance] - MOVE IT”, we have moved the model training and model prediction sections, including Table 2, to the Results, now on pages 7-8, lines 267-288.
  • We did not implement the reviewer’s suggestion on page 5, line 185 (i.e., adding “s” to shelter), as the word is used as a non-count noun referring to the general concept of protection from the sun, rather than to multiple distinct items.
  • On page 6, line 221, we accepted the reviewer’s edits. The new sentence reads:

“For all other variables we originally hypothesised that changes in their group levels would only be expected in their baseline effects on the response variable (i.e., changes in intercept), but upon further exploration and on a case-by-case basis, random slopes were allowed.”

  • For addressing the following comments on the page 7 Results section:
  1. “right here you need to present your actual RESULTS of the beach user numbers. You cannot extrapolate to annual levels until you first present the actual data”
  2. “there is very little here, MUCH MORE IS NEEDED”
  3. “say something about the numbers on individual beaches, which are most/least popular? highest and lowest densities? which had the most/least swimmers? surfers?”

  we have added a new paragraph of summary results on page 8, lines 290-300 (see Response 5 above), presenting counts from the drone surveys and patterns across beaches, times, and seasons.
  • Regarding the comment on Figure 2, page 8 “DENSITY would be a much better representation of the data”, we sustain our argument from above: “We recognize the value of calculating density, as it can facilitate comparisons across beaches of varying sizes. However, we believe density is not the most meaningful metric for our study. Our analysis was shaped by the needs of the City of Gold Coast, which sought to quantify the absolute number of visitors, not their distribution per area. While density can standardize beach usage, it assumes that infrastructure (e.g., lifeguard towers) scales proportionately with beach size, which is not the case. For instance, as shown in Figure 1, lifeguard towers are not distributed based on area, meaning a lower density on larger beaches could obscure the need for more resources. In contrast, focusing on total counts better informs decisions about infrastructure and safety measures, ensuring resources match actual visitor numbers rather than potentially misleading density figures.”
  • Regarding the comment “you mean dependent variables” on page 9, line 325, we agree with the reviewer. We have accordingly changed the sentence to improve clarity. The new sentence reads:

“We found strong evidence of interacting dependent variables.”

On page 9, line 329, we accepted the reviewer’s edit, and the new sentence reads: “However, the people count is marginally different during weekdays across all seasons, …”

  • For addressing the comment “refs needed here” on page 9, line 343, referring to people’s preference to be near lifeguard assistance, we have included the following reference:
  1. Blackwell, B.D.; Tisdell, C.A. The Marginal Values of Lifesavers and Lifeguards to Beach Users in Australia and the United States. Economic Analysis and Policy 2010, 40, 209–227

This reference evaluated willingness to pay as a proxy for beachgoer preferences, including an explicit evaluation of willingness to pay for extra lifeguard services. Safe bathing areas were preferred, although other factors such as crowding, level of income, and convenience (distance to accommodation) affect beachgoers’ beach selection.

  • For addressing the comment “this is really not clear, present the actual data first”, we have added a new paragraph to the Results (page 8, lines 291-300).
  • For addressing the comment “not clear what messy and clean mean, explain” on page 11, line 327, we have added definitions of clean and messy conditions to the description of Figure 5. The new figure description reads (page 12, lines 362-364):

“Figure 5: Swell condition effect on surfer count. Surfer estimates derived from the Usage model and GLM for a typical compartment. Clean conditions refer to smooth, well-formed, and consistent waves, while messy conditions refer to choppy and irregular waves.”

  • Regarding the comment “REMEMBER that extrapolation to an annual timescale will increase the error”, we agree that our annualised calculation could increase uncertainty in our estimate. We have added a sentence that explicitly indicates this to the reader. The sentence reads (page 12, line 367):

“Our annualized estimation approach, while robust, may introduce some additional uncertainty due to variations in seasonality and temporal factors. However, the significant differences between our estimates and those provided by lifeguards extend beyond the bounds of this uncertainty, suggesting that our method captures more comprehensive spatial and temporal coverage, leading to a higher visitation estimate.”

  • For addressing the comment “no idea what this is”, we have added an explanation of Jacob’s method alongside the existing reference to Jacob’s work. The new sentence reads (page 12, line 372):

“Manual counting is influenced by various factors, including weather conditions, the level of crowd-counting expertise, and the chosen counting method (e.g., Jacob’s method, a technique developed for counting crowds [51]; individual counting; aggregated temporal counting).”

  • For addressing the comment “source of the data? did you get this from the lifeguard organisation? if so, state in the methods”, we have included the source of the data in the table description. Data sources were already described in the Methods. The new Table 3 title and description on page 12, line 390, read:

“Table 3. Comparison between people count estimates from the lifeguard dataset and the annual projection from drone surveys. Lifeguard data were obtained for the 2022 calendar year (source: City of Gold Coast lifeguards and weekend-volunteer Surf Life Saving Queensland lifeguards). We encourage caution when comparing these results because of the different sampling methods; please refer to the text below.”

  • For addressing the comment “you need to drill down into different beach users and spatial/temporal patterns, this is MISSING from the discussion”, we have added a new paragraph to this end. The new paragraph on pages 13-14, lines 413-434, reads:

“Our findings reveal notable interactions between season, day of the week, and time of day on beach usage, emphasizing the need for dynamic management strategies that adapt to these temporal patterns. The observed increase in visitor numbers throughout the day during low and medium seasons, especially on weekends (Figure 4), suggests that resource allocation such as lifeguard services and beach amenities could be optimized by anticipating higher afternoon activity. The reversal of this trend during peak season weekends, where counts were lower in the afternoon (Figure 4), highlights the potential influence of factors such as overcrowding or alternative recreational opportunities, which require further investigation. The strong effects of weather variables on activity-specific beach usage provide additional insights (Figures 5 and 6). For example, clean surf conditions lead to significantly higher surfer counts, indicating that surf forecasting could be leveraged to anticipate crowd sizes. Although weather conditions such as temperature and wind speed had measurable effects on beach usage, limitations in our dataset, particularly the low number of surveys during medium rain probability, restrict our ability to generalize trends for all weather conditions. Future studies should extend data collection across a full year to improve model robustness and better understand the nuances of weather-beachgoer interactions. The spatial and temporal specificity of our findings underscores their broader applicability to other coastal regions. By accounting for both seasonality and environmental conditions, beach managers could refine staffing and resource deployment strategies to enhance safety and visitor satisfaction. However, caution should be exercised when extrapolating these results to other contexts, as local factors such as infrastructure and beach morphology could lead to different patterns.”

  • Regarding the comment “this contradicts what you have done here, see table 3” we disagree. As mentioned above: "Unlike a traditional time series forecast, which predicts future values based solely on temporal trends, our model was designed to estimate visitation by incorporating seasonal effects explicitly. We used historical data (long-term lifeguard dataset) to calculate how many high, medium, and low season days are expected annually. This approach allowed us to produce an annual estimate of visitation based on seasonal variability and the effects of day-of-week, time-of-day, and beach locations patterns, which are critical in explaining beach usage dynamics."
  • For addressing the comment “CHECK THE REFS”, we have revised all references and ensured that all reference details are properly listed.

 

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Dear authors,

I think the current form can be accepted after the revision of the manuscript.

 

 
