Deep Learning of High-Resolution Aerial Imagery for Coastal Marsh Change Detection: A Comparative Study

Abstract: Deep learning techniques are increasingly recognized as effective image classifiers. Despite their successful performance in past studies, their accuracies have varied in complex environments when compared with popularly applied machine learning classifiers. This study explores the feasibility of using a U-Net deep learning architecture to classify bi-temporal, high-resolution, county-scale aerial images to determine the spatial extent and changes of land cover classes that directly or indirectly impact tidal marsh. The image set used in the analysis is a collection of 1-m resolution National Agriculture Imagery Program (NAIP) tiles from 2009 and 2019, covering Beaufort County, South Carolina. The U-Net CNN classification results were compared with two machine learning classifiers, random trees (RT) and support vector machine (SVM). The results revealed a significant advantage in overall accuracy for the U-Net classifier (92.4%) over the SVM (81.6%) and RT (75.7%) classifiers. From the perspective of a GIS analyst or coastal manager, the U-Net classifier is now an easily accessible and powerful tool for mapping large areas. Change detection analysis indicated little areal change in marsh extent, though increased land development throughout the county has the potential to negatively impact the health of the marshes. Future work should explore applying the constructed U-Net classifier to coastal environments in large geographic areas, while also incorporating other data sources (e.g., LIDAR and multispectral data) to enhance classification accuracy.


Introduction
Machine learning (ML) algorithms have become commonplace in remote sensing data analysis [1][2][3][4][5][6][7][8]. The successful use of ML for a variety of GIS and remote sensing applications has led to the implementation of these methods, often based on support vector machine (SVM) and random forest (RF) statistical methods, in GIScience software packages that can be used by non-technical investigators. The tools are readily available for supervised classifications in particular [9,10]. Numerous studies have supported the use of machine learning over traditional, i.e., statistically based, classifiers, such as maximum likelihood methods, with SVM often performing the best [11][12][13][14]. ML classifiers have now been established across the professional community as reliable tools for mapping that do not require extensive machine learning or programming experience.
Advancements within the past ten years have led to a new division within machine learning. Deep learning (DL) is a learning approach designed to mimic the function of the human brain in the form of neural networks [15]. An advanced subsection of ML, DL is able to perform artificial intelligence functions when given extensive training resources. The recent popularity and success of DL in other disciplines and applications, such as speech recognition [16] and medical image recognition [17], has led to the rise of its use in remote sensing applications. While citing other reviews of DL applications in remote sensing, [18][19][20] gave a comprehensive review by describing different model types of DL in remote sensing. The authors also found in their meta-analysis of the subject that, as of the publication of their article, there were 221 peer-reviewed articles and 181 conference papers or proceedings pertaining to remote sensing and DL. It is clear that the use cases of DL in remote sensing applications are increasing rapidly.
As applications of DL classifiers in remote sensing become more established, the algorithms and tools using DL for image classification are being made available as user-friendly graphical user interface (GUI) tools in commonly used GIS software, just like the ML tools. Though geospatial software companies tout their DL tools as both user-friendly and robust, it is often difficult to adjust the tools to the user's desired specifications. Nevertheless, just as the ubiquity of ML GUI tools opened the use of ML to users without a background in programming and ML, DL tools are also now available to all. Application-driven researchers, managers, and GIScience professionals may have a difficult time choosing a tool or judging its appropriateness for a particular use case.
A growing literature base has begun extensively testing DL classifiers against traditional ML classifiers in a variety of environments and with several data types and sizes [21][22][23][24]. The goal of these studies is to identify the best performing classifier by comparing the results of the classifiers on the same datasets, usually using the same or similar training and validation data. The results of these studies have, thus far, been inconclusive regarding which classifier (DL or ML) performs best in many environments, though the currently available literature can provide guidance for professionals looking to use such tools for particular applications (e.g., land use/land cover classification, vegetation cover, and coral reef habitat classification) [25][26][27]. Certainly, each tool should be selected based on how well it answers the research question. However, few studies have used the GUI tools developed for less technically inclined researchers by the large GIS software companies.
The present study seeks to identify the best performing classifier (among three effective and commonly used DL and ML classifiers) for mapping land use/land cover (LU/LC) using a large, complex, county-wide dataset for a coastal county. Large, high-resolution imagery datasets of coastal areas that include complex land cover can be more difficult to process and classify, depending on the research question, methods, image quality, and field data. High-resolution imagery can introduce a salt-and-pepper effect, due to intra-class variations. Knowing which classifier performs best in this type of environment and with imagery of these specifications will especially benefit coastal managers and practitioners. This study specifically uses the ML and DL tools embedded in Esri ArcGIS Pro 2.8.1.
Three effective and commonly used ML and DL classifiers are compared in this study. The U-Net convolutional neural network (CNN) is a DL algorithm that was originally created for biomedical image segmentation but has since been used for remote sensing image classification applications [28,29]. The architecture of the network can be divided into two halves, an encoding or "contracting" side and a decoding or "expansive" side, that give the architecture its "U" shape. The U-Net algorithm has shown success in classifying coastal wetlands using remotely sensed imagery in previous studies [30,31]. SVM and RF classifiers are commonly used ML classifiers for remote sensing analysis. The SVM classifier is a supervised classification method based on statistical learning theory and was developed in the computer science community in the 1990s [32] (p. 337). It is now commonly used in remote sensing research [21,33,34]. SVM classifiers are beneficial because they can handle small training samples, and the training samples do not need to be normally distributed. SVM classifiers can also handle non-linear class boundaries and multiple classes. The RF classifier is a supervised classification method based on the random forest statistical method [32] (p. 311). A series of decisions is made based on the statistical makeup of the classes, as well as the image overall. The decisions branch out together and form what look like tree branches. When the entire image is classified, many instances of classification are performed on subsets of the data, thereby creating many decision trees [35]. The most frequent tree output is used as the overall classification. Using multiple trees is meant to mitigate overfitting to the training samples provided by the user. The "best" classifier will be determined by comparing the time costs for classifying the imagery and the overall accuracy (OA) results.
Upon determining the most effective classifier, a case study demonstrating its use is conducted with the large, county-wide dataset to detect changes in LU/LC over a ten-year period that can affect the extensive marsh environment across a large county in South Carolina, USA. While direct modification of the marsh environment for the sake of development is important to map, indirect impacts, such as pollution, excessive nutrient and sewage inputs, other upstream development, and freshwater diversions coming from coastal communities, may have lasting negative effects on the health of our important coastal wetlands [36][37][38][39].
Section 2 of this study outlines the materials and methods of the experiment and case study. This includes a description of the study area, the data used, how the classifiers were trained and applied, and the change detection analysis that was conducted. Section 3 presents the results from the comparison and case study. Results of this case study provide insights for coastal managers to better monitor and adaptively manage marsh health. Finally, the results are discussed in the context of the current literature base, followed by a brief conclusion.

Approaches for Classifier Comparison Experiment
A general workflow of this study is represented in Figure 1. The experiment was performed by classifying the same large, county-wide, high-resolution aerial image using the three different classifiers: RF, SVM, and DL U-Net. The mathematical and coding implementations of these classifiers were left as designed by the Esri development team in ArcGIS Pro 2.8.1, in order to best represent what is available to coastal managers and scientists with access to these common tools. Object-oriented classifiers (OOC) were used for the experiment because of their ability to mitigate some of the high-resolution, intra-class detail and the salt-and-pepper phenomenon that often occurs with pixel-based classifications [40]. While noise can still be found in the objects, object parameters can smooth out much of the salt-and-pepper effect. OOC have been shown to have better accuracy than pixel-based classifiers in a variety of environments, including salt marsh and other LU/LC classes found in our study area [41][42][43]. The U-Net classifier performed the semantic segmentation without the user input of a segmented image. For the SVM and RF classifications, the base image was segmented using different properties. Spectral detail was given the highest importance, with spatial detail coming second. In the range of 1.0 to 20.0, spectral detail was set to 18.5, while spatial detail was set to 8, in an effort to smooth out the image. Minimum segment size was set to 5 pixels (5 × 5 m), so as to accommodate smaller buildings and patches of marsh vegetation.
Each classifier required certain input parameters beyond the images and training data (described in Section 2.2). To train the U-Net DL classifier, the training AOIs were used as inputs in ArcGIS Pro's Train Deep Learning Model tool. To perform the training, the entire raster image was segmented into 7169 tiles with dimensions of 256 × 256 pixels. The training data were embedded, as well. The tool performed all of this in the software, with nothing more than a few clicks. Once the classifier was trained using the training data, all of the small tiles were input into the U-Net classifier and classified one at a time, before being mosaicked together again to create the whole classified image. The final output was a classified image raster. The U-Net architecture includes a series of down- and up-sampling steps, resulting in a network with the appearance of the letter "U". In the first half of the architecture, sometimes referred to as the encoder, the features of the input image are extracted by 3 × 3 convolution layers, each followed by a ReLU activation function and a 2 × 2 maximum pooling operation. In the second half of the network, often referred to as the decoder, deconvolution occurs to restore the image back to the original resolution. Finally, a 1 × 1 convolution kernel is used in the final output layer [28].
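The tile-by-tile workflow described above (split the raster into fixed-size tiles, classify each tile, then mosaic the results) can be illustrated with a minimal sketch. This is not the ArcGIS Pro implementation: the function names are hypothetical, the tiles are assumed to be non-overlapping, and `classify_tile` is a placeholder threshold standing in for a trained U-Net.

```python
def split_into_tiles(image, tile=256):
    """Yield (row, col, block) for non-overlapping tile x tile blocks.

    Edge tiles may be smaller than `tile` (assumption: no padding or overlap).
    """
    rows, cols = len(image), len(image[0])
    for r in range(0, rows, tile):
        for c in range(0, cols, tile):
            yield r, c, [line[c:c + tile] for line in image[r:r + tile]]


def classify_tile(block):
    """Placeholder for a trained model: label each pixel 1 if bright, else 0."""
    return [[1 if value >= 128 else 0 for value in line] for line in block]


def mosaic(tiles, rows, cols):
    """Reassemble classified tiles into one raster of the original size."""
    out = [[None] * cols for _ in range(rows)]
    for r, c, block in tiles:
        for dr, line in enumerate(block):
            out[r + dr][c:c + len(line)] = line
    return out


# Toy 6 x 10 single-band "image" and a small tile size for demonstration.
image = [[(r * 7 + c * 13) % 256 for c in range(10)] for r in range(6)]
classified_tiles = [(r, c, classify_tile(b))
                    for r, c, b in split_into_tiles(image, tile=4)]
result = mosaic(classified_tiles, rows=6, cols=10)
```

Because the placeholder classifier works pixel by pixel, the mosaicked result equals the result of classifying the whole image at once. A real CNN uses spatial context, so tile edges can behave differently, which is one reason production tools may add padding or overlap.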
The simple idea behind the SVM classifier can be viewed in Figure 2, which is based on the maximum margin classifier [32]. Here, a maximal margin hyperplane is computed, where the hyperplane separates two classes and is the furthest from any of the training data. The observations that fall on the boundary of the margin, the "slab" surrounding the hyperplane, are the support vectors. However, most natural datasets cannot be separated as cleanly as the example. Support vector machine classifiers allow a nonlinear decision boundary to separate the classes. For the experiment, the SVM classifier required the input of the segmented image, training samples, and classification scheme. Only a single parameter was required, a maximum of 500 samples per class, which limits the number of training samples that can be used for each class. The parameter was set to the default given by ArcGIS Pro 2.8.1. Once the inputs had been used to train the SVM classifier, the trained classifier was applied to the entire county-wide image. The final output was a classified image raster.
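To make the margin idea concrete, the following minimal sketch computes the margin of a candidate hyperplane and identifies the points that sit on its boundary. It does not fit an SVM: the two candidate hyperplanes, the toy points, and the function names are all illustrative assumptions, and the sketch only compares candidates rather than searching for the maximal margin.

```python
import math


def signed_distances(w, b, points, labels):
    """Distance of each point from the hyperplane w.x + b = 0.

    A positive distance means the point lies on the side matching its label
    (labels are +1 or -1).
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
            for x, y in zip(points, labels)]


def margin_and_support(w, b, points, labels, tol=1e-9):
    """Return the margin (smallest distance) and the points that attain it."""
    dists = signed_distances(w, b, points, labels)
    margin = min(dists)
    support = [p for p, d in zip(points, dists) if abs(d - margin) <= tol]
    return margin, support


# Toy, linearly separable data (hypothetical values).
points = [(3, 3), (5, 4), (1, 1), (0, 1)]
labels = [+1, +1, -1, -1]

margin_a, support_a = margin_and_support((1, 1), -4, points, labels)
margin_b, support_b = margin_and_support((1, 0), -2, points, labels)
best = "A" if margin_a > margin_b else "B"
```

Both candidates separate the classes, but candidate A leaves a wider margin, so a maximal margin classifier would prefer it over candidate B.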
In ArcGIS Pro, the random trees classifier is a supervised classification method based on the random forest statistical method [32] (p. 311). A series of decisions is made based on the statistical makeup of the classes, as well as the image overall. The decisions branch out together and form what look like tree branches (Figure 3). When the entire image is classified, many instances of classification are performed on subsets of the data, thereby creating many decision trees [35]. It is called a random forest method because each classification is made from a random subset of training pixels, selected from the overall image, and the final classification is based on the most frequent output among the trees. Using multiple trees is meant to mitigate overfitting to the training samples provided by the user. The RF classification followed a method similar to that of the SVM classification. The same inputs were required to train the RF, though the required parameters were different. The RF classifier was trained using the following parameters: 120 maximum trees, a maximum tree depth of 30, and 1000 as the maximum number of samples per class. Each of these parameters limited the size of the forest during the classifier training, while seeking to maintain a high level of accuracy.
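The voting step described above, in which the final class is the most frequent output among the trees, can be sketched as follows. The per-tree votes are hypothetical values rather than outputs of the ArcGIS Pro tool, and the tie-breaking rule (first class encountered wins) is an assumption of this sketch.

```python
from collections import Counter


def majority_vote(tree_votes):
    """Return the class predicted by the largest number of trees.

    Ties are resolved in favour of the class that appears first in the list.
    """
    return Counter(tree_votes).most_common(1)[0][0]


# Hypothetical votes from five trees for two image segments.
votes_by_segment = {
    "segment_1": ["marsh", "marsh", "water", "marsh", "forest"],
    "segment_2": ["urban", "bare ground", "urban", "urban", "grass"],
}
final_classes = {segment: majority_vote(votes)
                 for segment, votes in votes_by_segment.items()}
```

Because a single tree's error is outvoted when most trees agree, the ensemble is less sensitive to the quirks of any one random training subset.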
All three classification methods were trained using the same training sample data and applied to the same aerial image composite from 2019. The final results were three classified image rasters for comparison.

Accuracy Assessment
Accuracy assessment metrics were used to compare the accuracy of the three classifiers. A confusion matrix was calculated using ArcGIS Pro's Compute Confusion Matrix tool, where the producer's accuracy, user's accuracy, overall accuracy, and kappa were computed. The producer's accuracy is the total number of pixels classified correctly for a class, divided by the total number of pixels in that class, as determined from the ground truthing data. The user's accuracy is the total number of pixels correctly classified into a class, divided by the total number of pixels classified into that class. An overall accuracy (OA) percentage was also calculated:
$$\mathrm{OA} = \frac{\sum_{i=1}^{k} x_{ii}}{N}$$
where $x_{ii}$ is the number of pixels classified correctly for class $i$, $k$ is the number of classes, and $N$ is the total number of pixels being assessed. Kappa analysis is a multivariate technique for accuracy assessment, first published in a remote sensing journal in 1983 [44]. Kappa is similar to overall accuracy, as a measure of the accuracy of the entire classification, but each considers slightly different information. A kappa estimate ($\hat{K}$) was determined as [45]:
$$\hat{K} = \frac{N \sum_{i=1}^{k} x_{ii} - \sum_{i=1}^{k} \left( x_{i+} \, x_{+i} \right)}{N^{2} - \sum_{i=1}^{k} \left( x_{i+} \, x_{+i} \right)}$$
where $N$ is the total number of samples, $k$ is the number of rows in the confusion matrix, $x_{ii}$ is the number of observations in row $i$ and column $i$, and $x_{i+}$ and $x_{+i}$ are the marginal totals for row $i$ and column $i$.
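The metrics defined above can be computed directly from a confusion matrix, as in the following minimal sketch. The orientation (rows are classified labels, columns are reference labels) and the example counts are assumptions for illustration, not values from this study or a statement of the ArcGIS Pro layout.

```python
def accuracy_metrics(matrix):
    """Compute accuracy metrics from a square confusion matrix.

    matrix[i][j] is the number of samples classified as class i (row)
    whose reference class is j (column).
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(k))
    row_totals = [sum(matrix[i]) for i in range(k)]                        # x_{i+}
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]   # x_{+i}

    # User's accuracy: correct / all samples classified into the class.
    users = [matrix[i][i] / row_totals[i] for i in range(k)]
    # Producer's accuracy: correct / all reference samples of the class.
    producers = [matrix[i][i] / col_totals[i] for i in range(k)]

    overall = diagonal / n
    chance = sum(r * c for r, c in zip(row_totals, col_totals))
    kappa = (n * diagonal - chance) / (n * n - chance)
    return {"overall": overall, "kappa": kappa,
            "users": users, "producers": producers}


# Hypothetical two-class matrix with 100 validation samples.
example = [[45, 5],
           [10, 40]]
metrics = accuracy_metrics(example)
```

For this example the overall accuracy is 0.85, while kappa is 0.70, which shows how kappa discounts the agreement expected from the marginal totals alone.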

Case Study

Study Area
Beaufort County is one of South Carolina's populous counties, nestled on the southern coast of the state (Figure 4). From 2010 to 2019, the population of Beaufort County grew from 162,233 to 192,122, an increase of 18.4% [46]. It ranks as the wealthiest county in the state, with a median household income of USD 68,377. Beaufort County is home to half of the state's salt marshes [47]. According to [48], South Carolina salt marshes and coastal wetlands provide services in the four ecosystem service categories: provisioning, regulating, cultural, and supporting. Each of these categories, though not all of its services are marketed, provides valuable resources to coastal communities. For example, the salt marshes serve as a nursery habitat to many species, especially shrimp. South Carolina's commercial fishing industry, which relies upon these environments, contributes USD 42 million annually to the state economy [49]. Other services, such as flood protection, carbon sequestration, filtration, and tourism, all contribute to the enormous value these marshes have to the South Carolina coast. The predominant species of marsh vegetation is Spartina alterniflora. Juncus roemerianus is also commonly found. The county is home to a diversity of land cover types, including wetlands, forests, large water bodies, extensive housing and commercial developments, and agriculture. As the county's population continues to grow in a dynamic and complex coastal environment, monitoring change using accurate classification methods is critical for future planning and for measuring trends in socioeconomic and ecological health. Despite extensive regulations to abate the environmental impacts of development on Beaufort County's salt marsh, community stakeholders continue to voice concerns for the health of the marsh [50] (p. 33). While no substantial evidence of marsh loss was cited, additional insights from the document state that the lack of monitoring in Beaufort County is a detriment to our understanding of how the marsh is being affected [50] (p. 34). Furthermore, the aforementioned Beaufort County comprehensive plan does not address marsh migration in the face of sea level rise [51]. In addition to comparing the competency of the DL algorithm with that of other ML classifiers in mapping a large, complex, county-wide image, this study seeks to fill a gap in the understanding of land use/land cover extent changes in the area that may be directly or indirectly impacting marsh health.

Data
The aerial imagery used in this study was collected by the National Agriculture Imagery Program, or NAIP. This program began in 2002 and is administered by the U.S. Department of Agriculture (USDA) Farm Service Agency to collect aerial imagery during growing seasons. The digital sensors used for NAIP imagery, though not apparent in the metadata provided with the imagery, meet rigid calibration specifications [52]. NAIP imagery is generally collected at a 1 m spatial resolution (50-60 cm in some areas) across the conterminous United States.
For this study, NAIP images of Beaufort County, acquired in 2009 and 2019, were used (Table 1). The imagery varies in the month collected. For the 2009 NAIP imagery, each tile in the orthomosaic was collected between 16 and 25 April 2009. The 2009 imagery is a traditional, true-color orthomosaic, with a 1 m spatial resolution, captured by a Leica Geosystems ADS40-SH52 sensor (sensor numbers 30028 and 30045). The 2019 imagery was collected between 29 August and 23 September 2019. The 2019 flights resulted in true-color imagery with a 60 cm spatial resolution from a Leica Geosystems ADS100 model sensor (sensor numbers 10530 and 10552). The pixel size was resampled to 1 m to match the 2009 image. The tide varied between the images, and even within an image, due to the flight times of the tiles that make up each image. In general, the 2009 image shows higher tides, with much of the lower marsh slightly inundated. The National Wetland Inventory (NWI) shapefile for South Carolina was used to mask out deep water bodies, while retaining marsh areas [53]. After masking, the imagery was reduced to a smaller size and became more manageable for classification. Both images were transformed into the NAD83 (2011) UTM 17N coordinate system.
[54]. The level 3 classes included mudflat, marsh vegetation, forest, roads, buildings, agriculture, grassland, water, shadows, dry bare ground, and wet bare ground (Table 2). Several classes were combined in our level 2 classes, in order to leave seven predominant classes that described the general LU/LC in the study area. For example, mudflat and marsh vegetation were combined into the marsh class, roads and buildings into the urban class, and dry and wet bare ground into a single bare ground class. It was determined that the agriculture and grass classes were significantly confused, and they were, therefore, combined, due to their similarities. These combinations were made to identify general environments and limit unnecessary misclassifications. A final set of level 1 classes was determined by combining bare ground, urban, and grass/agriculture into a developed class that was used as a proxy for general LU/LC changes that can impact marsh health. Bare ground is included in development because construction sites and pre-construction sites across the county are typically bare ground. Grass is included as a development class because it represents a loss of natural forested coastal area. Many parks and yards are part of developments, and this is where the grass is found. Further, while agricultural use may be impactful if the farmer is using certain chemicals, the same could be said for large grass areas, where added nutrients can eventually reach the wetlands through runoff. Other classes, such as water, shadows, forest, and marsh, remained separate in the level 1 classification. In the end, the forest and development classes were deemed the most important for determining changes that would affect marshes. Aside from determining the actual marsh changes (i.e., development on marsh or marsh gain through marsh restoration), the changes in forested land and the increase in development were used as an indicator of how the marsh may be affected.
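The three-level class hierarchy described above can be written as two lookup tables, as in the following minimal sketch. The groupings follow the text, but the exact label strings and the helper name `to_level1` are illustrative assumptions.

```python
# Level 3 (detailed) classes mapped to the seven level 2 classes.
LEVEL3_TO_LEVEL2 = {
    "mudflat": "marsh",
    "marsh vegetation": "marsh",
    "roads": "urban",
    "buildings": "urban",
    "dry bare ground": "bare ground",
    "wet bare ground": "bare ground",
    "agriculture": "grass/agriculture",
    "grassland": "grass/agriculture",
    "forest": "forest",
    "water": "water",
    "shadows": "shadows",
}

# Level 2 classes mapped to level 1, where three classes form "developed".
LEVEL2_TO_LEVEL1 = {
    "bare ground": "developed",
    "urban": "developed",
    "grass/agriculture": "developed",
    "marsh": "marsh",
    "forest": "forest",
    "water": "water",
    "shadows": "shadows",
}


def to_level1(level3_class):
    """Collapse a level 3 label to its level 1 class via level 2."""
    return LEVEL2_TO_LEVEL1[LEVEL3_TO_LEVEL2[level3_class]]


level1_of_roads = to_level1("roads")
```

Keeping the two mappings separate makes it possible to report accuracy at level 2 while running the change analysis at level 1 from the same classified raster.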
Areas of interest (AOIs) for training and validation samples were manually digitized from the 1 m NAIP imagery, based on expert knowledge of the study area and field visits within the last two years. Training samples were gathered visually from the imagery in collaboration with the Beaufort County Mapping and Applications Director, who has held the position since 1995. A portion of the open-source building footprints layer, provided by Microsoft, was used as AOIs for the building class [55,56]. While the dataset was produced in 2018, the individual tiles or scenes throughout the imagery were collected on a wide variety of dates. Therefore, a Beaufort County building footprints subset was thoroughly examined before being used as ancillary training and validation data. Over 100 training AOIs were generated for each level 3 class, though not all polygons were equal in size. As a result, several classes had fewer AOIs but very large areas for training the classifiers. Since the study area was exceptionally large (nearly 2400 km²), there was ample space to define a large number of training and validation samples (Table 3).
An accuracy assessment was performed for each classified image using validation AOIs generated in a similar manner to the training AOIs. Validation AOIs were a fraction of the total AOIs for any given class. AOIs were combined for accuracy assessment to reflect the level 2 classes, which were of more interest than the individual level 3 classes. Regardless of the number of validation AOIs, a stratified random sample of 1500 validation points was generated within the given validation AOIs for each image (Figure 5). The validation points were then used to calculate the accuracy assessment metrics, as described in Section 2.1.
While mapping a marsh class alone gives us direct information on actual changes in the marsh, many indirect impacts from nearby land use/land cover changes have been documented [37,57]. Because of these documented impacts, we decided to map all classes to suggest and discuss what changes may potentially occur if development trends continue.
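The stratified random sampling of validation points mentioned above can be sketched as follows. The allocation rule (points proportional to each stratum's area, with largest-remainder rounding), the pixel counts, and the function names are assumptions of this sketch; the text does not state how the 1500 points were distributed among classes.

```python
import random


def allocate_points(areas, total):
    """Split `total` points among strata in proportion to their areas.

    Uses largest-remainder rounding so the counts sum exactly to `total`.
    """
    total_area = sum(areas.values())
    exact = {name: total * area / total_area for name, area in areas.items()}
    counts = {name: int(value) for name, value in exact.items()}
    leftover = total - sum(counts.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


def stratified_sample(pixels_by_class, total, seed=0):
    """Draw a random sample of pixel ids from each class stratum."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    areas = {name: len(pixels) for name, pixels in pixels_by_class.items()}
    counts = allocate_points(areas, total)
    return {name: rng.sample(pixels_by_class[name], counts[name])
            for name in counts}


# Hypothetical validation strata, identified by pixel ids.
strata = {
    "marsh": list(range(0, 4000)),
    "forest": list(range(4000, 9000)),
    "developed": list(range(9000, 10000)),
}
sample = stratified_sample(strata, total=1500)
sizes = {name: len(points) for name, points in sample.items()}
```

With these hypothetical areas, the 1500 points are split 600, 750, and 150 among the marsh, forest, and developed strata.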
Pixels that changed from any particular class to the shadow class, or from the shadow class to another class, were disregarded for this change analysis. The pixels of interest for this study were the pixels that changed from the marsh or forest class to any other class, but particularly to the development class. The pixels that experienced these changes were mapped and visually analyzed to determine impacts and assess potential future impacts.
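The from-to comparison described above can be sketched as a cross-tabulation of two classified rasters that skips any pixel labelled as shadow on either date. The toy rasters, class labels, and function names are hypothetical; the area conversion assumes 1 m × 1 m pixels, as in the resampled NAIP imagery.

```python
from collections import Counter

PIXEL_AREA_HA = 1.0 / 10_000  # one 1 m x 1 m pixel is 1 m^2 = 0.0001 ha


def change_matrix(before, after, ignore=("shadow",)):
    """Count pixels for every (class_before, class_after) pair.

    Pixels whose class is in `ignore` on either date are disregarded.
    """
    counts = Counter()
    for row_before, row_after in zip(before, after):
        for b, a in zip(row_before, row_after):
            if b in ignore or a in ignore:
                continue
            counts[(b, a)] += 1
    return counts


def changed_area_ha(counts, source, target):
    """Area in hectares that moved from `source` to `target`."""
    return counts[(source, target)] * PIXEL_AREA_HA


# Toy 3 x 3 level 1 rasters for the two dates.
before = [["forest", "forest", "marsh"],
          ["forest", "shadow", "marsh"],
          ["water", "developed", "marsh"]]
after = [["developed", "forest", "marsh"],
         ["developed", "forest", "shadow"],
         ["water", "developed", "marsh"]]

counts = change_matrix(before, after)
forest_to_developed_ha = changed_area_ha(counts, "forest", "developed")
```

Here two pixels change from forest to developed, and the two shadow-affected pixels are excluded from the tally entirely.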

Comparison of Model Performance and Accuracies
Processing time is an important factor in processing large imagery. Computational costs depend on the data being processed, as well as the computational abilities of the machine being used. Here, a Dell Inspiron 5680 with a 6-core Intel i7 CPU, 16 GB of RAM, and an NVIDIA GTX 1060 3 GB GPU was used to process each classification. As noted in Table 4, the U-Net classifier required the least training time, but the most time to apply the trained model to the image. Training the U-Net classifier required 2 h and 59 min, at least 1 h and 30 min faster than the two other classifiers. However, the classification of the image itself took 43 h and 23 min, for a total of 46 h and 22 min. This length of classification is not a common finding, however, as U-Net classifiers have been found to be faster than many others in remote sensing applications [30,31,58]. The authors suggest the extra time required to complete the classification was due to the machine specifications, the size of the dataset, and the methods by which the tiles were classified and, subsequently, mosaicked together. The SVM classifier required 4 h 52 min for training and then 30 min to classify the image. The RF classifier was trained in 4 h 29 min and applied to the image in approximately 23 min. The computational times and classification accuracies are reported in Table 4.
The overall accuracies of the three classifiers for the 2019 image ranged from 75.74% to 92.38%, with the U-Net classifier performing the best (Table 4). This study found that the object-based ML classifiers did not perform as well, though certain classes performed well (Tables 5-7). The forest class was consistently classified with high user and producer accuracy (89-99%). Other classes varied, based on the classifier. For example, the SVM and RT classifiers correctly classified marsh at least 70% of the time, while the U-Net classifier had a user accuracy of 84.41% and a producer accuracy of 93.52%. Interestingly, both ML classifiers confused marsh with the urban class the most. It is proposed that the high-resolution imagery and complex environment lead to high intra-class variability, making it difficult for the ML classifiers to separate the classes.
After the U-Net classifier was found to perform the best, it was applied to the 2009 NAIP dataset, as well. The overall accuracy of the 2009 image classification was 85.28%, with a Kappa statistic of 80.45%. While many of the same classes performed remarkably well across the two sets of imagery, the tidal ranges within the 2009 image seem to have proved difficult for the U-Net classifier. The producer's accuracy for the marsh class was a low 65.18%, though the user's accuracy was 96.90%. The marsh areas were often confused with the water or agriculture/grass classes. The agriculture/grass class was often confused, as well. It is suggested that this was due to the image collection during peak biomass, when S. alterniflora is at its greenest and most like an agricultural product or grass. Outside of the marsh and bare ground classes, which were confused with urban areas, all other classes resulted in a producer accuracy of at least 90.0%.
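The training and classification times reported at the start of this section can be totalled with a short calculation, which reproduces the 46 h 22 min figure quoted for U-Net. The durations are taken from the text (the RF classification time is approximate there); the helper names are illustrative.

```python
def to_minutes(hours, minutes):
    """Convert an (hours, minutes) pair to total minutes."""
    return hours * 60 + minutes


def format_duration(total_minutes):
    """Format total minutes as 'H h MM min'."""
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours} h {minutes:02d} min"


# (training time, classification time) as (hours, minutes) pairs from the text.
TIMES = {
    "U-Net": ((2, 59), (43, 23)),
    "SVM": ((4, 52), (0, 30)),
    "RF": ((4, 29), (0, 23)),  # classification time is approximate
}

totals = {name: to_minutes(*train) + to_minutes(*classify)
          for name, (train, classify) in TIMES.items()}
formatted = {name: format_duration(minutes) for name, minutes in totals.items()}
fastest_overall = min(totals, key=totals.get)
```

On these figures, U-Net trains fastest but has by far the longest end-to-end time, while the two ML classifiers finish in roughly five hours each.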

Comparison of Classification Results
The DL U-Net classifier was able to navigate the complexities of the environment better than the ML classifiers and achieved a higher accuracy. Evidence of this assertion can be found in three subset areas with classification challenges. Figure 6 shows a forested and agricultural area in northern Beaufort County. It is classified reasonably well by U-Net (Figure 6A), which shows very little salt-and-pepper effect in the classification. However, SVM and RF (Figure 6B,C) misclassified the forest as water, marsh, or shadow, depending on the hue of the green space. Portions of the marsh on the western edge of the image were confused with bare ground by RF, though the U-Net and SVM classifiers generally recognized them as marsh. SVM also misclassified small portions of the marsh area as urban area. U-Net struggled with wet areas in and around inland water bodies, classifying the surrounding vegetation as marsh.
In another subset area under development (Figure 7), U-Net (Figure 7A) once again classified the bare ground areas correctly, along with the extensive suburban areas. SVM similarly classified most of the bare ground and urban areas correctly. However, RF misclassified the bare ground areas as urban areas. Another difficulty for each classifier was differentiating some wetland areas and ponds, in neighborhoods and golf courses, from the marsh. SVM and U-Net occasionally misclassified those areas, while RF struggled the most. Nearly every inland water body was misclassified as marsh by the RF classifier.
It is well known that marsh extent is difficult to map, especially when the tidal range varies throughout the imagery. These tidal discrepancies made classifying the marsh difficult for each classifier in this study (Figure 8). Marsh was sometimes misclassified as water, urban, bare ground, and even grass. If images had been collected during low tide conditions across the entire study area, classifications could have been more accurate. Water hues, ranging from blue to algae-ridden green, made classifications of water and grass difficult, as well. Ancillary information, such as texture, RGB-based indices, or even a DEM, might assist in differentiating some of the more troublesome classes. Some inland areas in and around small ponds were also misclassified as marsh. This is similar to RF, which struggled to differentiate the water class from the marsh class in nearly every inland pond (as shown in Figure 7C).

Coastal Development and Impact to Marshes
After the DL U-Net classifier was applied to both the 2019 and 2009 images, changes between the two dates were assessed visually and, as far as possible, quantitatively. The 2009 image classification struggled to classify marsh correctly in some areas, assigning some marsh pixels to development (i.e., the urban, grass/agriculture, or bare ground classes). Consequently, several areas marked as marsh loss or gain were, in fact, classifier errors. These areas were visually inspected. Figure 9C,D shows two areas where actual change did occur and some marsh vegetation was lost. An overall marsh system loss was estimated at 3300 ha. However, because errors were detected extensively throughout the 2009 marsh class in particular, the quantitative assessment of marsh loss cannot be completely trusted. To reiterate the issues described above, the producer's accuracy of the marsh class in the 2009 image was only 65%, the lowest of all the classes. Marsh was misclassified as urban because of sun glint and sometimes as agriculture because of its greenness. We used expert visual observation to determine that there was very little true marsh loss over the ten-year period. There were a few areas where marsh vegetation extent expanded between dates (Figure 9A,B), but there were no detected areas where the marsh vegetation or mudflat were directly affected by development in the marsh system. Tidal levels, which varied throughout both images, made classifications and comparisons difficult. Despite the use of a separate class for submerged or underwater marsh, these areas are where many of the marsh misclassifications occurred.
Large areas of development expansion were detected across the county. This study indicates that approximately 7102.74 ha of forest were lost to other land cover classes (e.g., urban, bare ground, and grass/agriculture). For the purposes of the case study, any forest lost to the level 1 development class was deemed development. Development in northern Beaufort County included small areas of urban and large areas of agricultural development. Southern Beaufort County saw the greatest amount of urban growth. Figure 10 shows the areas of development across the county. On the other hand, some areas that were urban in 2009 were naturalized over the following 10 years. Some agricultural and urban areas from 2009 were overtaken by shrubs and small trees over the ten-year period. These areas were then often classified as forest and counted as lost developed land.
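The producer's accuracy reported above for the 2009 marsh class (65%) is the share of reference samples of a class that the classifier labeled correctly. The following is a minimal sketch of how producer's, user's, and overall accuracy could be derived from a confusion matrix. The matrix, class list, and function names are hypothetical and chosen only for illustration; the counts are not the study's data.

```python
# Hypothetical confusion matrix: rows = reference class, columns = predicted class.
# The counts are illustrative only and are NOT taken from the study.
CLASSES = ["marsh", "development", "forest"]
CONFUSION = [
    [65, 30, 5],   # reference marsh
    [4, 90, 6],    # reference development
    [2, 8, 90],    # reference forest
]


def producers_accuracy(matrix, class_index):
    """Correctly labeled reference samples / all reference samples of the class."""
    row_total = sum(matrix[class_index])
    return matrix[class_index][class_index] / row_total if row_total else 0.0


def users_accuracy(matrix, class_index):
    """Correctly labeled samples / all samples predicted as the class."""
    column_total = sum(row[class_index] for row in matrix)
    return matrix[class_index][class_index] / column_total if column_total else 0.0


def overall_accuracy(matrix):
    """Sum of the diagonal / total number of samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


for index, name in enumerate(CLASSES):
    pa = producers_accuracy(CONFUSION, index)
    ua = users_accuracy(CONFUSION, index)
    print(f"{name}: producer's accuracy = {pa:.2f}, user's accuracy = {ua:.2f}")
print(f"overall accuracy = {overall_accuracy(CONFUSION):.2f}")
```

In this illustrative matrix, 35 of 100 reference marsh samples are assigned to other classes, so the marsh producer's accuracy is low even though the overall accuracy remains comparatively high. This is the same pattern that makes a marsh change estimate unreliable when the overall accuracy looks acceptable.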

Discussion
The DL, SVM, and RT classification results compared favorably with other large-area DL mapping studies that included wetlands and other complex land cover classes. For large-scale wetland mapping across Alberta, Canada, ref. [59] achieved an 80.2% OA using a deep CNN. In a study comparing RF, SVM, and three deep learning classifiers for classifying wetlands using hyperspatial-resolution imagery from small unmanned aerial systems, ref. [60] found that the DL classifiers performed better than the SVM and RF classifiers, especially when the training sample counts were high. The RF and SVM classifiers resulted in OAs as high as 65% and 67%, respectively, whereas the DL classifiers resulted in OAs of 76% to 84%. Similarly, our results support assertions made by [61] that CNNs can outperform RF classifiers. Specifically, U-Net has been shown to outperform SVM and RF classifications for wetland mapping using Sentinel-2 10 m imagery. Ref. [31] found that the SVM and RT classifiers only achieved OAs of 50.5% and 46.4%, respectively, while the U-Net classifier regularly reached at least 85%, depending on the optimizing function used. Our study suggests that the higher-resolution NAIP imagery includes enough spatial detail to improve the OA to the levels we observed (U-Net = 92.4%, SVM = 81.6%, and RT = 75.7%). Using data of similar spatial resolution from the WorldView-3 satellite to classify forested wetlands with a DL CNN, ref. [62] found an accuracy level similar to that of our study (92%) when only the optical imagery was used.
In this study, all three classifiers showed a fair amount of competency in classifying large, complex aerial image mosaics. Other applications where these classifiers, especially the highest-performing DL U-Net classifier, might be of use include tree cover mapping, disaster assessment using imagery acquired directly after a storm event or natural disaster, sUAS imagery classification, and species-level mapping. The classifiers can be considered adequate for these purposes because the results of this study indicate that high-resolution imagery can be processed quickly and with high accuracy. In some of these examples, time can be an important factor; when that is the case, the SVM and RT classifiers have been shown, in this study, to provide adequate results quickly (2-3 h), even for a large, 2400 km² study area.
Several challenges were faced when classifying the coastal tidal marsh in Beaufort County for this study. Accounting for the tide, and water levels in general, is a significant challenge when using remotely-sensed imagery to map coastal wetlands, including coastal tidal marshes [63]. This was particularly evident in this study: even within the NAIP imagery for a single county, tide levels differed significantly across the tiles that made up the image. This was one of the major difficulties in classifying the marsh. Other environmental conditions, such as cloud cover and shadows cast by tall objects (like buildings, trees, and water towers), obscured the target wetlands, complicated spectral signatures, and made optical imagery difficult to interpret or use [64]. Plant phenology was also a factor in image classification. Peak biomass conditions are best for modeling plant health characteristics, such as biomass, and can be beneficial in mapping certain coastal wetland species [65,66]. The 2009 imagery was taken in April, which is at the beginning stages of growth and greening up for S. alterniflora, the dominant marsh grass in Beaufort County. The 2019 imagery was taken in late August and early September, which is within the peak biomass period for S. alterniflora [67]. All three classifiers were more successful at classifying the marsh class in the 2019 imagery than the U-Net classifier was in the 2009 image. We propose that plant phenology, along with tide levels throughout each image, was a significant factor in these results.
NAIP datasets provide high-resolution aerial imagery with great potential for vegetation mapping, in particular when acquired during leaf-on conditions. While we found a fair amount of success mapping various classes, including the marsh class, the NAIP RGB imagery alone was not sufficient to overcome all of the complexities of the coastal wetland environment. To better classify the coastal tidal marsh, particularly the vegetation and mudflat, it would be beneficial to incorporate ancillary remote sensing data. This process, called data fusion, can be used to better describe and classify wetlands [68]. Data fusion can be performed at the pixel, feature, and decision levels. Ref. [69], as described in [70], found that land cover classification could be improved by fusing multispectral data with radar data. While the increase in overall accuracy (OA) was small, some subclasses improved, while others decreased slightly in classification accuracy. Ref. [71] fused multispectral imagery with LiDAR-derived elevation datasets to map peatlands in Canada, achieving a 76.4% OA, as opposed to only 65.8% with the RGB and IR bands alone. Data fusion can therefore provide better information for decision makers. For example, the addition of a NIR band or a vegetation index, such as the normalized difference vegetation index (NDVI), can provide greater discrimination between marsh vegetation and mudflat, as well as between marsh vegetation and other vegetation classes. Elevation data, derived from LiDAR or other sources, can improve the separation of trees, agriculture, and grasses from the marsh grasses and mudflats.
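As an illustration of pixel-level fusion with a vegetation index, the sketch below appends a NIR value and the NDVI, defined as (NIR - Red) / (NIR + Red), to an RGB pixel. It is a simplified example under stated assumptions and not the workflow used in this study: the function names and the pixel values are hypothetical, and real imagery would be processed as whole raster bands.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red).

    Returns 0.0 when both bands are zero to avoid division by zero.
    """
    denominator = nir + red
    return (nir - red) / denominator if denominator else 0.0


def fuse_pixel(rgb, nir):
    """Pixel-level fusion: extend an (R, G, B) tuple with NIR and NDVI features."""
    red, green, blue = rgb
    return (red, green, blue, nir, ndvi(nir, red))


# Hypothetical digital numbers, chosen only to illustrate the contrast:
# green vegetation reflects strongly in the NIR, bare mudflat much less so.
marsh_vegetation = fuse_pixel((40, 90, 35), nir=200)
mudflat = fuse_pixel((110, 100, 90), nir=120)

print(f"marsh vegetation NDVI = {marsh_vegetation[4]:.3f}")
print(f"mudflat NDVI = {mudflat[4]:.3f}")
```

With these illustrative values, the vegetation pixel receives a much higher NDVI than the mudflat pixel, which is the additional separability that the fused feature vector would offer a classifier beyond the RGB bands.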
Potential biases and errors in this study may have been introduced through the selection of training and validation AOIs by the researchers. This potential bias was mitigated by involving multiple long-term residents of the county, who interpreted the aerial imagery and selected the training and validation polygons based on extensive local knowledge.
Future work should incorporate ancillary remotely-sensed data into the classification process to further increase classification accuracy. As previously stated, other spectral bands and indices, elevation data, and imagery from other scales (e.g., small unmanned aerial systems) should be examined to produce a data fusion product of potentially higher accuracy. Results from the LU/LC classifications can be used as input into models for other phenomena, such as water quality [72], which is another element that can impact marsh health. Other available deep learning pixel classifiers, such as DeepLabv3, should also be investigated in further trials. These methods could be further validated through application and testing in similar coastal environments.

Conclusions
This study compared DL with traditional ML classifiers, based on the classification of high-resolution imagery over an entire coastal county using GUI applications from ArcGIS Pro 2.8.1. Our case study then used the LU/LC maps from 2019 and 2009 to detect salt marsh change patterns over a 10-year period. Results indicated that a U-Net DL classifier significantly outperformed the other classifiers for the classification of a complex, high-resolution, county-wide dataset in terms of OA (92.4%, as opposed to 81.6% by SVM and 75.7% by RT). DL algorithms, now available to any coastal manager or GIS analyst with access to Esri's ArcGIS Pro, showed high applicability to large-area mapping. Using computational resources commonly available to coastal managers and professional GIS analysts, the U-Net classification required a longer time to classify the large dataset (46 h in total vs. 5.33 h and 4.83 h). Because comparably long processing times were not reported in the literature on other U-Net classifiers, we believe the time required for classification was a function of the large dataset, computational resources, and DL model structure. Our study focused on DL and ML classifiers from the perspective of the environmental or coastal manager. Findings indicate a bright future for DL and ML LU/LC classification for large-area mapping, even for those without extensive programming and DL or ML backgrounds. Our case study demonstrated the power of using these tools for change detection, showing large areas of development over a 10-year period across the county, which may have an impact on marsh health. Further research is needed to validate these findings and to test similar methods across similarly complex coastal environments. Additional ancillary remote sensing data, including multispectral and hyperspectral imagery, LiDAR, and RADAR, can be integrated to improve classification accuracy.

Figure 1. A general workflow for the experiment and case study. First, training data was used to classify the 2019 image using the three different classifiers (U-Net, SVM, and RF). Accuracy assessment was used to determine the most effective classifier. The 2009 image was classified using the best performing classifier and then compared with the 2019 image for change detection analysis.

Figure 2. Basic SVM, in the form of a maximal margin classifier. X1 and X2 are hypothetical measurements and, in our case, would be pixel values.

Figure 4. Map of the United States, with Beaufort County (highlighted in red), within South Carolina (highlighted in blue), and its NAIP image, acquired in 2019.

Figure 5. Distribution of sample points used for accuracy assessment across Beaufort County.
Following the classification of both the 2019 and 2009 images of Beaufort County, a change detection analysis was conducted using the change detection tool in ArcGIS Pro v.2.8.1. The tool requires an input of a series of maps or images and computes a change detection map and change matrix in return. For the change detection analysis, a final classification map, with the combined classes, was created. The water, shadows, marsh, and forest classes remained intact, but the agriculture/grass, bare ground, and urban classes were combined into a class called development. Areas of change were assessed based on the numbers of pixels that changed from a particular class to another. Pixel counts were multiplied by the 1 m × 1 m pixel size to determine the approximate area in square meters. Further conversion from m² to ha was accomplished by multiplying by 0.0001. While mapping a marsh class alone gives us direct information on actual changes in the marsh, many indirect impacts from nearby land use/land cover changes have been documented [37,57]. Because of these documented impacts, we decided to map all classes to suggest and discuss what changes may potentially occur if development trends continue.
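The class merging, change matrix, and area conversion described above can be sketched in a few lines. This is an illustrative, pure-Python approximation and not the implementation of the ArcGIS Pro tool: the function names, the merge table, and the tiny flattened "rasters" are hypothetical, while the 1 m × 1 m pixel size and the 0.0001 ha per m² factor come from the text.

```python
from collections import Counter

PIXEL_AREA_M2 = 1.0 * 1.0  # 1 m x 1 m NAIP pixels, as stated in the text
M2_TO_HA = 0.0001          # 1 m² = 0.0001 ha

# Classes merged into "development"; all other classes are kept as they are.
MERGE = {
    "agriculture/grass": "development",
    "bare ground": "development",
    "urban": "development",
}


def combine(label):
    """Map a class label to its combined class used for change detection."""
    return MERGE.get(label, label)


def change_matrix(labels_before, labels_after):
    """Count pixels for every (from_class, to_class) pair after merging classes."""
    if len(labels_before) != len(labels_after):
        raise ValueError("both classifications must cover the same pixels")
    return Counter(
        (combine(before), combine(after))
        for before, after in zip(labels_before, labels_after)
    )


def pixels_to_ha(pixel_count):
    """Convert a pixel count to hectares via the pixel area in m²."""
    return pixel_count * PIXEL_AREA_M2 * M2_TO_HA


# Hypothetical flattened classifications of the same six pixels in 2009 and 2019.
labels_2009 = ["forest", "forest", "marsh", "urban", "forest", "water"]
labels_2019 = ["urban", "bare ground", "marsh", "forest", "forest", "water"]

matrix = change_matrix(labels_2009, labels_2019)
for (from_class, to_class), count in sorted(matrix.items()):
    print(f"{from_class} -> {to_class}: {count} px = {pixels_to_ha(count):.4f} ha")
```

Under this conversion, a forest-to-development total of 7102.74 ha would correspond to roughly 71 million 1 m pixels, which gives a sense of the data volume behind the county-scale change matrix.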

Figure 6. A mixed-use area classified by (A) U-Net, (B) SVM, and (C) random forest. (D) is the NAIP image of this subset.

Figure 7. A developing area classified by (A) U-Net, (B) SVM, and (C) random forest. (D) is the NAIP image of this subset.

Figure 8. A marsh area classified by (A) U-Net, (B) SVM, and (C) random forest. (D) is the NAIP image of this subset.

Figure 9. Examples of marsh vegetation gain (A,B) and loss (C,D) from 2009 to 2019.

Figure 10. Forested areas lost to development across Beaufort County in 2009-2019.

Table 2. Classes used in this study at classification levels 1, 2, and 3.

Table 3. Number of training and validation samples (area in ha).

Table 4. Computational time and classification accuracies.

Table 5. Accuracy for the predominant Level 2 classes (U-Net).

Table 7. Accuracy for the predominant Level 2 classes (RF).