Detecting Multi-Decadal Changes in Seagrass Cover in Tauranga Harbour, New Zealand, Using Landsat Imagery and Boosting Ensemble Classification Techniques

Abstract: Seagrass provides a wide range of essential ecosystem services, supports climate change mitigation, and contributes to blue carbon sequestration. This resource, however, is undergoing significant declines across the globe, and there is an urgent need to develop change detection techniques appropriate to the scale of loss and applicable to the complex coastal marine environment. Our work aimed to develop remote-sensing-based techniques for detection of changes between 1990 and 2019 in the area of seagrass meadows in Tauranga Harbour, New Zealand. Four state-of-the-art machine-learning models, Random Forest (RF), Support Vector Machine (SVM), Extreme Gradient Boost (XGB), and CatBoost (CB), were evaluated for classification of seagrass cover (presence/absence) in a Landsat 8 image from 2019, using near-concurrent Ground-Truth Points (GTPs). We then used the most accurate of these models, CB, with historic Landsat imagery supported by classified aerial photographs to estimate change in cover over time. The CB model produced the highest accuracies (precision, recall, and F1 scores of 0.94, 0.96, and 0.95, respectively). We were able to use Landsat imagery to document the trajectory and spatial distribution of an approximately 50% reduction in seagrass area, from 2237 ha to 1184 ha, between the years 1990–2019. Our illustration of change detection of seagrass in Tauranga Harbour suggests that machine-learning techniques, coupled with historic satellite imagery, offer potential for evaluation of historic as well as ongoing seagrass dynamics.


Introduction
Seagrass provides a number of valuable ecosystem services in coastal areas, including primary production, biogenic habitat production, water filtering, wave energy attenuation, and sediment trapping [1,2]. In recent years, blue carbon, including seagrass meadows, has been acknowledged as an important service for climate change mitigation because of its value in the sequestration of carbon [3,4]. Seagrass meadows, however, have declined and degraded across most regions in the world, a change largely attributed to anthropogenic effects [5][6][7].
The destruction of seagrass leads to the loss of various ecosystem services [7,8] and threatens the stability [6] and long-term livelihood of fishermen in coastal areas [9,10]. Therefore, an accurate and rapid technique to inventory this resource is in high demand [5,11,12], to contribute baseline data for the evaluation of coastal ecosystem dynamics, establishment of marine protected areas, and functional zoning suited to local conditions. Where this can include a historic perspective, it can provide a comprehensive understanding of system change.
Several attempts at mapping and monitoring seagrass meadows using different satellite sensors and approaches have been reported [12]. Change of seagrass cover has been assessed using RapidEye [13], Indian satellite image (IRS LISS IV) [14], WorldView-2, IKONOS, Quickbird-2 [15], and Landsat [16][17][18][19] in various parts of the globe including the Mediterranean, the USA, Australia, and Malaysia. The temporal range of these attempts is constrained by the various platform launch dates, and typically ranges from 5 to around 25 years. Few efforts have attempted a longer-term change detection (30-40 years) of seagrass, and accuracy assessment has not frequently been reported for such long-term change detection. The reasons for this may relate to a deficiency of ground-truth data against which to evaluate older satellite scenes, and to the fact that imagery used to develop robust classification models for seagrass meadows in variably submerged conditions must be captured at optimal times for traditional classification procedures to be applied.
In recent years, machine learning (ML) has been emerging as an effective approach in various classification tasks, including for seagrass mapping [12,20]. ML provides improvements over the traditional Earth observation (EO) data classification approaches, to deal better with the challenges of mixed habitat, coarse spatial resolution of satellite imagery, and water column and atmospheric interference in coastal habitats [20][21][22]. Advantages of ML models are their use of non-parametric approaches, requiring no assumptions of normal distribution of input data, effective use of noisy data, and capability for multiple feature extraction [23][24][25][26]. The application of ML techniques to multitemporal satellite data, gathered from different satellite platforms, may therefore improve the overall accuracy of the classification result and enhance the reliability of seagrass change assessment. A range of different ensemble-based supervised classification techniques, such as boosting and bagging approaches [21][22][23][24], have been considered and tested in the literature for this type of task [27,28]. The most important differences between the bagging and boosting methods come from the approaches to the creation of training and testing datasets, and how the bagging and boosting methods deal with weak learners during the learning process [29,30]. Despite the potential for improved classification accuracy in suboptimal datasets, these approaches have not yet been fully implemented for seagrass change detection [12]. We are aware of only a single study, using Random Forest (RF) classification, for mapping the change of seagrass cover [13]. In the case study reported by these authors, the performance of the model was unstable and the accuracy varied among acquired scenes. 
Here we test the performance of a range of ML models, both boosting and bagging methods, with a time-series of satellite images, to compare their performances for assessment of seagrass cover and long-term change in Tauranga Harbour. Our goal is to improve the accuracy of tools for seagrass mapping and change detection.
Landsat time-series data were selected for the current study as the longest available time series and as freely available satellite remotely sensed resources. Landsat has operated since 1972 and provides continuous, homogeneous input data up to the most current Landsat 8 operational land imager (OLI) in orbit [31]. The Landsat multitemporal data has been used previously for several long-term change detection tasks [12,32] with the combination of long-term acquisition, medium spatial resolution, and the high quality of atmosphere-corrected products cited as important attributes. The spatial resolution has been retained as 30 m through eight generations (Landsat 1-Landsat 8); however, the radiometric resolution has been improved from 8 bit to 12 bit, leading to a better recognition of surface objects [33]. In addition, Landsat imagery includes blue, green, and red wavebands, which are the most appropriate for underwater resource mapping [34][35][36], but have not yet been evaluated for long-term seagrass change detection [12]. Thus, our work attempts to fill a gap in the current literature by assessing the performance of historic Landsat imagery, coupled with various machine-learning boosting and bagging models implemented in an open-source environment, in mapping changes in seagrass extent in a tidally inundated environment.
We employed two well-known models (Support Vector Machine (SVM) and Random Forests (RF)) and two novel techniques (Extreme Gradient Boosting (XGB) and CatBoost (CB)) for the classification of seagrass meadows in Tauranga Harbour from Landsat imagery, and for detecting change across 29 years. The results demonstrated that the novel classification method CB was successful in describing the dynamics of change in seagrass in the study site as well as contributing baseline data for further assessment of change.

Study Site
We selected Tauranga Harbour (North Island, New Zealand) as the study site (Figure 1), due to its large size (201 km² in surface area [37]), variation in water depth (from 0 m when exposed to 20 m in deep channels [38]), widely distributed but patchy seagrass cover, and the availability of historic ground-truth information. The tidal regime is semidiurnal, with a range of 0.2-2.1 m, and the estuary has an average water residence time of 3-8 days [39]. Zostera muelleri is the only species of seagrass, occurring primarily in the intertidal parts of the harbor [20,37]. The growth rate of Z. muelleri is optimal at 12 practical salinity units (psu) [40] and 27-33 °C [41,42]. It attains its highest biomass in the austral summer and declines gradually over the winter, reaching a minimum cover in early spring [43]. Flowering and seed production of Z. muelleri is rare in New Zealand; reproduction is primarily vegetative and patch dynamics are correspondingly slow [44,45]. Seagrass is primarily intertidal in the estuary and, based on bathymetry and tidal predictions [38] at the time of the Landsat image acquisition, water depths ranged between 0.0-1.5 m in the locations where seagrass was present.
In recent decades, Tauranga Harbour has been increasingly influenced by agricultural activities in the northern part (between 37.44° S and 37.54° S) and urban development in the southern part (between 37.62° S and 37.72° S). Episodic high loadings of sediment have been recorded and have resulted in the accumulation of sediment and high turbidity over the autumn and winter seasons [46,47]. Changes in the sedimentary environment have been implicated in negative impacts on the growth of seagrass [48,49], though other factors may also be involved. Available maps of seagrass in 1959, 1996, and 2011 derived from manual classification of aerial photography provided a resource for model validation [37].

Satellite Image Acquisition
Landsat images were downloaded from the GLOVIS website [50] for the years 1990, 2001, 2011, 2014, and 2019 (Table 1) at process level 1 (pixel value in digital number), and in the projection of WGS-84 UTM 60S. Landsat images were selected so that: (1) the acquisition time coincided as closely as possible with low tide at the study site; (2) cloud cover was as low as possible; and (3) the acquisition month was similar among scenes. In practice, we selected scenes that ranged 1-2 months around March (Table 2).

Field Survey Data
A field survey was undertaken from 1-7 April 2019 (Figure 1) in the intertidal areas of the harbor. At low tide, the boundary of seagrass meadows was delimited using a Global Positioning System (GPS) Garmin Etrex 30 with an accuracy of ±2 m. Other substrata recorded during the field survey were bare sand and muddy sand. Macroalgae were neither detected from our field survey nor mentioned in previous mapping reports [37,51].
Ground-Truth Points (GTPs), which were the base points to make the regions of interest (ROIs) for given classes, were recorded by following the boundary between seagrass meadows and non-vegetated areas. A total of 4315 GTPs were recorded for seagrass distribution, and 237 GTPs for other substrata in the harbor.

Ground-Truth Historical Scenes
Before 2019, no GTPs from field surveying were available; therefore, we used aerial and Google Earth images (Table 2) and published documents [37,51] to identify regions of interest (ROIs), within which we were able to determine seagrass presence/absence with sufficient confidence to develop the models and to evaluate the accuracy of the hindcast seagrass maps. High-resolution aerial imagery exists from the years 2011 and 2014, and cloud-free, near-low-tide Landsat scenes, from February 2011 and March 2014, could be found that coincided with these. However, for the Landsat scenes in 1990 and 2001, aerial images were only available with a gap of 1-2 years. These included aerial images in 1991-1992 (monochrome and colour) and 2003 (colour). We found Google Earth images (identified as Landsat/Copernicus images in the Google Earth application) for both December 1990 and December 2001, which were in the austral summer and were close to the acquisition time of the Landsat scenes in April 1990 (austral autumn) and March 2001 (austral summer). Due to concerns over circularity of use of Landsat data, we used both Google Earth and aerial images to select the ROIs for Landsat scenes in 1990 and 2001, ensuring that ROIs were only used where both sources showed seagrass present. We considered that the slow dynamics of seagrass patches in Tauranga Harbour [44,45] made this approach robust.

Development of Seagrass Maps and Detection of Change
Our method of seagrass change detection using Landsat images involved four steps, concluding with (4) identifying the changes of area and spatial distribution. Due to the deficiency of field data in the past, a binary classification (seagrass and non-seagrass) was adopted to deliver the most consistent change detection.

Atmospheric Correction
An atmospheric correction for all Landsat scenes was conducted using ACOLITE, in the Python™ environment (Table A1, Appendix A) [52]. The original pixel values in physical digital number were converted to surface reflectance. Atmospherically corrected surface reflectance for pixels (limited by the study boundary, Figure 1) for the ρBlue (ρw482), the ρGreen (ρw561), and the ρRed (ρw654) bands were retained for all Landsat scenes in the years 1990, 2001, 2011, 2014, and 2019 for further processing steps. In the years 2014 and 2019, when Landsat 8 images were available, the coastal aerosol band (ρw443) was used, together with the ρBlue, ρGreen, and ρRed bands. The selected bands were used as independent variables in ML model prediction of the presence/absence of seagrass.
Due to inconsistency between the tidal status and the acquisition time of Landsat images, our study site was considered to contain both exposed and submerged areas. Therefore, the near infrared (ρNIR) band, which attenuates rapidly in water, was not used in the analysis. A water column correction was not employed for water pixels in Tauranga Harbour, since the water depth and water optical characteristics (i.e., attenuation coefficient of the solar radiance in the water column) were unavailable for the historic scenes (1990, 2001, 2011).

Application of Machine-Learning Algorithms
Hyper-Parameter Tuning for Selected Machine-Learning Models
Machine-learning models comprise several hyperparameters (i.e., the parameters that control the learning process during the implementation of ML models), which often need to be optimized (i.e., by the process of tuning) to find the combination that achieves the best classification performance. The hyperparameters of the RF, the SVM, the XGB, and the CB models were tuned using a grid search with threefold cross-validation in the scikit-learn library [53]. The hyperparameters for each of the models were maintained during the training and the testing phases (Table A2, Appendix A).
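The grid search with threefold cross-validation described above can be illustrated with a minimal, library-free sketch. Everything in it is an assumption made for illustration: the k-nearest-neighbour classifier stands in for the RF, SVM, XGB, and CB models, the three-band reflectance values are synthetic, and the helper names (`knn_predict`, `k_fold_indices`, `cross_val_accuracy`, `grid_search`) are hypothetical. It is not the scikit-learn configuration used in this study.

```python
import random
from itertools import product


def knn_predict(train_x, train_y, sample, k, weighted):
    """Label one sample by a vote among its k nearest training samples."""
    neighbours = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, sample)) ** 0.5, label)
        for row, label in zip(train_x, train_y)
    )[:k]
    votes = {}
    for dist, label in neighbours:
        weight = 1.0 / (dist + 1e-9) if weighted else 1.0
        votes[label] = votes.get(label, 0.0) + weight
    return max(votes, key=votes.get)


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle sample indices once and deal them into n_folds folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def cross_val_accuracy(x, y, params, n_folds=3):
    """Mean held-out accuracy over the folds for one hyperparameter set."""
    scores = []
    for test_idx in k_fold_indices(len(x), n_folds):
        held_out = set(test_idx)
        train_x = [x[i] for i in range(len(x)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        hits = sum(
            knn_predict(train_x, train_y, x[i], **params) == y[i]
            for i in test_idx
        )
        scores.append(hits / len(test_idx))
    return sum(scores) / n_folds


def grid_search(x, y, grid, n_folds=3):
    """Try every combination in the grid and keep the best-scoring one."""
    best_params, best_score = None, -1.0
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = cross_val_accuracy(x, y, params, n_folds)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Synthetic three-band reflectance (assumed values): class 1 is darker.
rng = random.Random(42)
x = [[rng.gauss(0.04, 0.01) for _ in range(3)] for _ in range(60)]
x += [[rng.gauss(0.08, 0.01) for _ in range(3)] for _ in range(60)]
y = [1] * 60 + [0] * 60

grid = {"k": [1, 3, 5, 7], "weighted": [False, True]}
best_params, best_score = grid_search(x, y, grid)
print(best_params, round(best_score, 3))
```

The same folds are reused for every candidate so that differences in score reflect the hyperparameters rather than the partitioning. The selected combination would then be held fixed during training and testing.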

Theoretical Background of the Machine-Learning Algorithms Used
Random Forests
Random Forests (RF) [54] is perhaps the most popular machine-learning model for both classification and regression problems in remote sensing [55]. It is an ensemble bagging method, which uses a bootstrap sampling approach to build the training and the testing data and a voting method to select the most accurate decision from a large group of input decision trees. The RF model is a nonparametric method that is insensitive to the data's distribution, reducing the overfitting. The RF technique supplies various hyperparameters for tuning; however, the large number of parameters in the model results in slow optimization.

Support Vector Machine
Support Vector Machine (SVM) [56,57] supports linear, polynomial, and radial basis function (RBF) kernels and can be adapted to various linear or non-linear data types. It has relatively few tunable hyperparameters, but performance is still relatively slow when dealing with a large dataset. The SVM model uses a hyperplane to find the separation space among the classes, with the most typical rules being: (i) better segregation of data; (ii) maximization of the distance between the closest data points and the hyperplane. Despite accurate prediction and robustness to outliers, the SVM technique is not effective on overlapping classes or noisy datasets.

Extreme Gradient Boost
Extreme Gradient Boost (XGB) [17] is different from Gradient Boosting as it uses a more regularized model, which reduces over-fitting and results in a higher prediction accuracy. In the regularized gradient boosting mode, a selection of L1 or L2 regularization can be made to adapt the model to suit input data. Similar to other boosting models, the XGB technique supports various hyperparameters that are tuned using a grid search or genetic algorithm (GA) [58].

CatBoost
CatBoost (CB) was introduced in 2018 [59] for classification, regression, and ranking tasks. It can handle both categorical and numerical data types. Using ordered boosting on decision trees, a permutation-driven derivation from classic boosting, the CB yields a fast and reliable performance, even with a small dataset. The model itself produces robust predictive results with default hyperparameters, reducing the requirement of tuning, and its novel gradient boosting scheme results in less overfitting.

Comparison of ML Algorithms for Seagrass Mapping Using the Landsat Image Taken in 2019
Four ML models, SVM, RF, XGB, and CB, were compared for seagrass mapping using the Landsat image from May 2019 and near-synchronous GTPs collected in April 2019 to identify the regions of interest (hereafter referred to as ROIs-2019) known to belong to either the seagrass or the non-seagrass class. The 1-month gap between the acquisition date of the Landsat image and the field survey date is acceptable because weather conditions were stable (i.e., no extreme weather phenomena) [60] and seagrass dynamics are slow in the study site [44,45]. A dataset of pixel reflectance values was extracted from ROIs-2019 and its corresponding Landsat image (dataset DS5, Table A3, Appendix A) and split randomly into 60% for the training and 40% for the testing of selected ML models. The best model was selected as the model with the highest accuracy and F1 score. The best ML model identified using the 2019 data was applied for mapping of seagrass using Landsat images from 1990, 2001, 2011, and 2014 (see Table 1

Change Detection
Change detection was conducted using the standard confusion matrix tool in the SAGA GIS [62]. The confusion matrix analyzed the changes between pairs of classified maps (years 1990-2011 and 1990-2019), which were reported in the map as seagrass loss (seagrass to non-seagrass), seagrass recovery (non-seagrass to seagrass), and unchanging seagrass.
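The pairwise comparison of classified maps can be sketched as a simple cross-tabulation. This is an illustrative sketch only and does not reproduce the SAGA GIS tool: the two small binary grids are invented, the function names are hypothetical, and the area conversion assumes the nominal 30 m Landsat pixel (0.09 ha).

```python
from collections import Counter

# Assumption: nominal 30 m x 30 m Landsat pixel, i.e. 0.09 ha per pixel.
PIXEL_AREA_HA = 30 * 30 / 10_000

# Change class for each (earlier, later) pair; 1 = seagrass, 0 = non-seagrass.
LABELS = {
    (1, 0): "seagrass loss",
    (0, 1): "seagrass recovery",
    (1, 1): "unchanging seagrass",
    (0, 0): "unchanging non-seagrass",
}


def change_matrix(map_early, map_late):
    """Count pixels in each change class between two co-registered maps."""
    if len(map_early) != len(map_late):
        raise ValueError("maps must have the same number of rows")
    counts = Counter()
    for row_early, row_late in zip(map_early, map_late):
        if len(row_early) != len(row_late):
            raise ValueError("maps must have the same number of columns")
        for before, after in zip(row_early, row_late):
            counts[LABELS[(before, after)]] += 1
    return counts


def to_hectares(counts):
    """Convert pixel counts to areas in hectares."""
    return {label: n * PIXEL_AREA_HA for label, n in counts.items()}


# Invented 3 x 4 example maps, for illustration only.
map_early = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
]
map_late = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]

counts = change_matrix(map_early, map_late)
areas = to_hectares(counts)
for label in LABELS.values():
    print(f"{label}: {counts[label]} px, {areas.get(label, 0.0):.2f} ha")
```

Summing the loss and recovery areas gives the net change in seagrass extent between the two dates, while the per-pixel labels provide the spatial distribution of that change.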

Evaluation Criteria
We employed standard metrics for the evaluation of the classification skill: accuracy, Kappa coefficient (κ), Kendall's tau coefficient (τ), precision, recall, and F1 (Equations (1)-(6)). These were applied independently to the five datasets listed in Table A3. In addition, the nonparametric McNemar test was used to assess the statistical significance of the differences in overall accuracy among the selected models in this research. The test was executed in a Python™ environment using the mlxtend library [64]. The chi-square value (χ²) was calculated from Equation (7) with Edwards' continuity correction.
in which FN denotes false negatives and FP denotes false positives.
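As a hedged illustration of these criteria, the sketch below computes accuracy, precision, recall, F1, and Cohen's kappa from the usual confusion-matrix counts, and the McNemar chi-square with Edwards' continuity correction from the discordant predictions of two models. The formulas are the standard textbook definitions and are assumed, not confirmed, to match Equations (1)-(7); Kendall's tau is omitted; the labels are invented; and the function names are hypothetical. The study itself used the mlxtend library for the McNemar test.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return TP, TN, FP, FN for a binary classification."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    """Standard skill metrics; assumes both classes occur and are predicted."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "kappa": kappa}


def mcnemar_edwards(y_true, pred_a, pred_b):
    """McNemar chi-square with Edwards' continuity correction and p-value."""
    triples = list(zip(y_true, pred_a, pred_b))
    only_a_right = sum(a == t and b != t for t, a, b in triples)
    only_b_right = sum(a != t and b == t for t, a, b in triples)
    discordant = only_a_right + only_b_right
    if discordant == 0:
        return 0.0, 1.0
    chi2 = (abs(only_a_right - only_b_right) - 1) ** 2 / discordant
    # Survival function of the chi-square distribution with 1 d.o.f.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented labels for illustration (1 = seagrass, 0 = non-seagrass).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
pred_a = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
pred_b = [1, 1, 0, 0, 1, 0, 0, 1, 0, 1]

metrics = classification_metrics(y_true, pred_a)
chi2, p_value = mcnemar_edwards(y_true, pred_a, pred_b)
print({name: round(value, 3) for name, value in metrics.items()})
print(round(chi2, 3), round(p_value, 3))
```

Only the samples on which the two models disagree enter the McNemar statistic, which is why it tests a difference in accuracy between models evaluated on the same test set.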

Performance of the RF, SVM, XGB, and CB Models Using Landsat Image and GTPs for 2019 Data
Of the four machine-learning models applied to the 2019 data, the CB model outperformed all others, with the F1 score reaching 0.95 (Table 3) and the κ and τ coefficients reaching 0.92 (Table A4, Appendix A). The difference between models was statistically significant (McNemar's test, Table 4) with the exception of the XGB and RF models. The CB model required a longer computation time (3.71 s) than the RF model (0.33 s), the XGB model (0.15 s), and the SVM model (0.04 s). The RF and XGB techniques showed an equivalent performance (Table 3) with an F1 score of 0.93, while the SVM model underperformed the other models with an F1 score of 0.91. All models tested were able to classify seagrass from other bottom types in the harbor with a precision exceeding 0.89, but the highest precision was again from the CB model. Despite a similar F1 score, the XGB model gained a higher precision than the RF technique.

Seagrass Change Detection from 1990-2019
The CB technique was then used to make classification maps for the years 1990, 2001, 2011, and 2014 (Figure 3). Our results indicated a performance across all metrics that was equivalent to that in the 2019 case, with accuracy and F1 scores over 95% for the binary classification of seagrass and non-seagrass (Table 5 and Table A4, Appendix A). The time series shows that the seagrass meadow area decreased from 2237 ha in 1990 to 1184 ha in 2019, though not monotonically (Figures 3 and 4). A downward trend from 1990 (2237 ha) to 2001 (2035 ha) was followed by a recovery in 2011 (to 2380 ha), followed by a second decline to 1184 ha in 2019 (Figure 4a). Different trends, though all with an overall decline to 2019, were discovered in the northern (Figure 4b

Discussion
In this investigation, we have demonstrated the successful use of machine-learning approaches to classify seagrass in Landsat images of Tauranga Harbour, and to use this classification to detect changes in seagrass cover over a period of 29 years. Due to the relative paucity of field validation data for most of the time series in this analysis, we only tested a binary classification (seagrass and non-seagrass classes), but the four machine-learning models, RF, XGB, SVM, and CB, were all capable of detecting seagrass from other bottom types with high precision and recall scores. Previously, the RF and the SVM models have been tested for seagrass classification [65] and the RF for seagrass change detection in the Mediterranean [13]. These previous attempts have produced accuracies from 76-98% for Posidonia oceanica [13,65] and 32-62% for Cymodocea nodosa using higher resolution RapidEye imagery [13], both lower than were achieved in this study using the CB technique. P. oceanica and C. nodosa are structurally similar to Z. muelleri and would seem likely to offer a similar target. This suggests that the use of state-of-the-art ML models with optimized hyperparameters is an important factor contributing to the high-precision classification of seagrass presence/absence. Both the XGB and the CB techniques have proven to be promising candidates for a range of classification [58][66][67][68] and regression [69][70][71] problems but have not previously been applied to seagrasses, or to any other semi-submerged targets, so it is not clear if this is a general performance advantage in this type of application.
Other advantages over previous studies may, however, exist in Tauranga Harbour. Specifically, Z. muelleri occurs as monospecific meadows, without a substantial presence of macroalgae, which can degrade classifications [20,37], and where the reflectance value of seagrass is considerably different from the other common bottom types (sand, muddy sand, deep water). In addition, we were able to use cloud-free Landsat scenes, with atmospheric correction using ACOLITE, which has been designed for the aquatic application of Landsat imagery, which likely reduced the uncertainty of atmospheric impact and derived a higher quality of corrected surface reflectance [72].
In this study, the two boosting techniques (XGB, CB) and the bagging technique (RF) outperformed the more traditional SVM method. The SVM model does not work well with noisy data, where unclear margins exist between classes [73]. Such fuzzy margins were observed at the study site at the overlap between seagrass and non-seagrass (sand, muddy sand) classes, where the distinction between present and absent was gradual. This likely resulted in the relatively poor performance of the SVM model. The boosting techniques XGB (0.93) and CB (0.94) show slightly higher precision than the bagging RF (0.92), which might have resulted from the advancement in decision-tree growth of the boosting techniques. Unlike the RF model, which builds independent decision trees from bootstrapped samples, the boosting XGB and CB models sequentially grow new trees using the residual information of previous trees, which allows each new learner to correct the errors of the previous trees by fitting their residuals. For the final prediction of a classification task, the bagging RF takes a majority vote from all decision trees, while the boosting techniques, such as XGB and CB, use a weighted majority vote, which potentially results in a higher precision of a class prediction.
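The sequential, residual-driven tree growth that distinguishes boosting from bagging can be illustrated with a deliberately simplified sketch. It implements plain least-squares gradient boosting with one-split trees (stumps) on a single feature. It is not XGB or CB: there is no regularization and no ordered boosting, and the reflectance values, labels, learning rate, and function names are all assumptions chosen for illustration.

```python
def fit_stump(xs, residuals):
    """Best single-threshold split of a 1-D feature under squared error.

    Assumes xs holds at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Grow stumps one after another, each fitted to the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [
            p + learning_rate * stump_predict(stump, x)
            for x, p in zip(xs, predictions)
        ]
    return base, learning_rate, stumps


def boosted_score(model, x):
    """Sum of the base value and all scaled stump outputs."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


def accuracy(model, xs, ys):
    hits = sum((boosted_score(model, x) > 0.5) == bool(y) for x, y in zip(xs, ys))
    return hits / len(xs)


# Invented one-band example: class 1 occupies a middle range of values,
# so no single threshold separates the classes.
xs = [0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09]
ys = [0, 0, 1, 1, 1, 1, 0, 0]

one_round = boost(xs, ys, n_rounds=1)
many_rounds = boost(xs, ys, n_rounds=20)
print(accuracy(one_round, xs, ys), accuracy(many_rounds, xs, ys))
```

A single stump cannot separate a class that occupies an interval of the feature, but the sum of stumps fitted to successive residuals can. A bagging ensemble would instead fit each tree independently to a bootstrap sample and combine them by an unweighted vote.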
Given the classification skill metrics, the CB is the best candidate for the mapping and change detection of seagrass in the study site. The CB is also amongst the latest emerging algorithms developed in the computer vision and pattern recognition fields (released in 2018); is easy to tune, with fewer hyperparameters than the RF and the XGB techniques; and uses symmetric trees, which potentially results in faster optimization and prediction [59]. The CB model differs from the rest of the boosting algorithm family by using ordered boosting on a random permutation of the given dataset, which prevents prediction shift and alleviates overfitting in model prediction. The outperformance of the CB over other ML models has been reported for mangrove total carbon estimation [74], various testing datasets [59], and forest aboveground biomass [75], which supports the reliability and the capability of the CB implementation for seagrass mapping in our study. Our accurate long-term (29-year) change detection of seagrass meadows using the CB machine-learning model in Tauranga Harbour is a significant advance in the classification and monitoring of seagrass ecosystems using multispectral, remotely sensed data.
Our analysis has confirmed a general declining trend of seagrass cover in Tauranga Harbour reported previously [37] using aerial photography. In absolute terms, Park (2011) reported 2744 ha in March 2011, close to our estimate of 2380 ha at that time. Also, like Park, our analysis was able to resolve areas within the estuary where the greatest loss has occurred between assessments. We specifically noted that the seagrass loss was initially focused in the northern and southern parts of the harbor. High flux of sediment was recorded into the northern part, due to agricultural intensification, and the southern part, due to urban development, particularly after 2011 [47] and may explain the long-term decline of seagrass in those areas. The potential impact of agricultural and urban developments in the northern and the southern parts is supported by the observation that recovery was only observed in the central part of the harbor (Figure 3, year 2011, and Figure 5). Another potential factor contributing to long-term loss of seagrass is the grazing of black swans, which has previously been linked to variations in seagrass cover in the southern harbor [37,76]. Further analysis is required to develop a detailed explanation on the dynamics of seagrass meadows in Tauranga Harbour.
Here, we advocate the use of novel and advanced ML models, in combination with multitemporal Landsat images, to obtain a long-term, historic series of observations on seagrass dynamics that will continue to be supported into the future through ongoing developments of the Landsat series. The proposed method potentially provides a low-cost, high-precision classification tool that can be extended to other estuaries with similar target conditions. While aerial photography and very high spatial resolution (VHR) satellite images have higher spatial resolution than Landsat images, they come at a high cost, and spatial coverage can be limited. Currently, Landsat is the most suitable satellite image resource for any long-term change detection due to its long time in service. A 30 m spatial resolution was found suitable to support a binary classification of seagrass in this study, and accuracy was unaffected by the small changes in spectral information that have accompanied the incremental changes in Landsat optical sensors. The most recent generation, Landsat 8, with an improvement in radiometric resolution up to 16 bit (in the level 1 product), compared to the 8 bit in previous generations, and the addition of a coastal aerosol band, has good potential for accurate detection of the dynamics of seagrass. An improvement of spectral and radiometric resolution in Landsat 9 (scheduled for launching in 2021) is expected to provide continuity into the future monitoring of seagrass [77]. For a short-term observation of seagrass change, our proposed methods for seagrass classification are also potentially applicable to a wide range of VHR images (Quickbird, Ikonos, Unmanned Aerial Vehicle (UAV)) with consideration of the trade-off among the spatial coverage of the study site, the spatial resolution of the image, and the available budget.
The open-source approach is another significant advantage of our proposed methodology. The Python environment provides an excellent option for the end users to apply the novel machine-learning algorithms and remote-sensing data processing platform to support accurate mapping and estimation of the blue carbon budget of seagrass ecosystem [78]. Most commercial software only provides a limited number of processing and classification algorithms, with few, older ML options (e.g., SVM) and has a high license cost. Our proposed methods are more flexible, free of charge, and offer a high efficiency for mapping the dynamics of seagrass meadows in the complex coastal marine environment.
Despite a successful application of the CB model for seagrass classification and change detection, this research still comes with limitations. Since we used a supervised classification technique, both classification and validation require an independent assessment of seagrass cover in at least part of the remote image, to provide the ROIs that allow the training and validation steps. In addition, the seasonal growth of seagrass in temperate waters, and its intertidal habit, raise the uncertainty of change detection between various time points unless imagery is available at the same time of year and under similar tidal conditions. The offset between the time of Landsat image acquisition and the tidal regime (Table 1) is unavoidable in the study site; however, we consider that it is unlikely to significantly affect classification accuracy. In Tauranga Harbour, seagrass meadows are distributed in the intertidal regions at a water depth ranging from 0 m (exposed) to a maximum of 1.5 m (at high tide) [37,51]. The ρBlue, ρGreen, and ρRed bands have nominal maximum penetration depths of 15, 10, and 5 m, respectively [34], and while moderate, but variable, coastal turbidity in the harbor will increase the attenuation rate, the maximum immersion depth of 1.5 m suggests that the reflectance signatures in these spectral bands are highly likely to have been influenced by seagrass. The average vertical attenuation rate of the downwelling radiation within the 400-700 nm band in Tauranga Harbour is 0.40 m⁻¹ (range 0.16-0.98 m⁻¹) [79], and these authors found that 65% of incident radiation reached the estuary floor at 1.2 m depth. Again, this suggests that water clarity is sufficient to ensure that, even at maximum water depth, seagrass will contribute to the reflectance spectrum detected by the satellite. As with all satellite-based remote sensing, a cloud-free view is required, which constrains use of this technology.
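The light-availability argument above can be checked with a short worked calculation. The sketch assumes simple exponential (Beer-Lambert) attenuation of downwelling irradiance with a depth-independent coefficient, which is a simplification, and it considers only the one-way path from the surface to the seabed. The function name is hypothetical; the coefficients and depths are those quoted in the text.

```python
import math


def fraction_remaining(kd_per_m, depth_m):
    """Fraction of surface irradiance left at depth, assuming exp(-Kd * z)."""
    return math.exp(-kd_per_m * depth_m)


# Attenuation coefficients (m^-1) and depths (m) quoted in the text.
coefficients = {"minimum": 0.16, "average": 0.40, "maximum": 0.98}
depths = (1.2, 1.5)

table = {
    (name, depth): fraction_remaining(kd, depth)
    for name, kd in coefficients.items()
    for depth in depths
}
for (name, depth), fraction in table.items():
    print(f"Kd {name} at {depth} m: {100 * fraction:.0f}% of surface light")
```

Under these assumptions the average coefficient leaves roughly 62% of surface light at 1.2 m, of the same order as the 65% reported in [79], and more than half at the 1.5 m maximum depth. Even the most turbid value quoted leaves a non-negligible fraction at 1.5 m.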
To compensate for the limitation, we attempted to select all Landsat images acquired in the growing season of seagrass in Tauranga Harbour (austral summer and autumn) and at low tide, but this further constrains the availability of verifiable Landsat imagery for seagrass cover estimation. Further research focusing on expanding the novel approach used in the current study for long-term change detection of seagrass meadows is underway.

Conclusions
In this research, we used the novel machine-learning model CatBoost (CB) and other well-known ML models (RF, SVM, and XGB) for seagrass cover classification (present/absent), using Landsat satellite imagery, in Tauranga Harbour, New Zealand. Our results showed a high level of accuracy for all approaches, but the CB model outperformed the other selected models, with precision, recall, and F1 scores of 0.94, 0.96, and 0.95 respectively.
We then applied the CB technique to multispectral Landsat data for the detection of change in seagrass cover over a 29-year period between 1990 and 2019 in Tauranga Harbour. The change detection analysis determined an overarching declining trend of seagrass cover in Tauranga Harbour, with approximately 50% loss over the 29-year period (from 2237 ha in 1990 to 1184 ha in 2019); these results concurred with a study using aerial imaging. Seagrass was lost in the far northern and southern areas of the harbor during the first part of this time, then more gradually from the central region. This analysis of change using Landsat images combined with the CB model demonstrates the value of historic satellite imagery and machine learning for accurate documentation of the change over time in this difficult-to-quantify coastal vegetation.