2.3. Classifier Evaluation and Community Prediction for the Study Area
Creating the Classification Training Set: For each of the seven plant community types plus mud flats and open water, a set of field GPS points was acquired using a Garmin 60Cx (Garmin International, Inc., Olathe, KS, USA) with 1–7 m accuracy. To accommodate this positional uncertainty at our scale of interest, we collected points only in plant community patches greater than 14 m in diameter and more than 7 m from the patch edge. In addition to field training points, we digitized training samples using a digital stereoplotter (DAT/EM Systems International, Anchorage, AK, USA) with 2009 aerial photography and the ArcMap 10.3.1 (ESRI 2014) basemap. A total of 17,166 training points were used in the classification. The number of points per class ranged from 557 (buttonwood/glycophyte community) to 4408 (halophyte prairie community). The number of points varied because classes with more variable spectral signatures needed more training points, and because some communities, such as the buttonwood/glycophyte community, were rare. During the ground survey, we documented community types at the GPS points with pictures in each cardinal direction, plus zenith and nadir, creating a 2012–2015 photographic vegetation database (4356 pictures at 730 points; Figure 1). This database provided temporal and spatial photographic documentation of the study site and was used to select the additional digitized training points.
Satellite Data and Image Processing: We used WorldView-2 (WV2) satellite data acquired at the end of the wet (11 December 2011) and dry (13 April 2013) seasons. The WV2 imagery has a 2 m spatial resolution (2 × 2 m pixels) and eight spectral bands (coastal: 400–450 nm; blue: 450–510 nm; green: 510–580 nm; yellow: 585–625 nm; red: 630–690 nm; red edge: 705–745 nm; near-IR1: 770–895 nm; and near-IR2: 860–1040 nm). We chose images from the end of the wet and dry seasons because the vegetation response to each season is at its maximum at those times. No major storm event affected the vegetation in the area between December 2011 and April 2013; the wet and dry seasons of the different years were therefore assumed to be comparable. Both images were geometrically corrected using the orthorectification module in ENVI version 5.2 (Exelis Visual Information Solutions, Boulder, CO, USA). Root mean square errors for the December 2011 and April 2013 images were 5.9 m and 5.0 m, respectively, relative to the 2009 stereo aerial photography reference data, and 4.5 m and 4.8 m, respectively, relative to the ArcGIS basemap.
Images were atmospherically corrected using the Fast Line-of-sight Atmospheric Analysis of Hypercubes (FLAASH) module in ENVI version 5.2 (Exelis Visual Information Solutions, Boulder, CO, USA). The selected atmospheric model was “Tropical”, and the aerosol model was “Maritime” for both images, with the initial visibility parameter set to 80 km for the April 2013 image and 40 km for the December 2011 image.
For each band of the two images, we generated local texture variables using a 3 × 3 kernel. The local texture variables we calculated for the kernel were reflectance mean, range, and standard deviation. In addition, we calculated the Normalized Difference Vegetation Index (NDVI) for each image. Since elevation differences were expected to provide valuable information on plant community presence, a digital terrain model (DTM) was included. The LiDAR-derived DTM, acquired in 2007–2008 and gridded at 5 ft spatial resolution by the Florida Division of Emergency Management, had a vertical accuracy of 0.18 m at the 95% confidence level [39]; to match the resolution of WV2, the DTM was resampled to 2 m resolution using nearest-neighbor interpolation.
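The texture variables and NDVI described above can be illustrated with a minimal sketch. It is written in plain Python purely for illustration (the study itself used ENVI and R), and several details are assumptions rather than facts from the text: the function names `texture_3x3` and `ndvi`, the handling of edge pixels, the use of a population rather than a sample standard deviation, and which of the two near-infrared bands enters the NDVI.

```python
from statistics import mean, pstdev


def ndvi(nir, red):
    """NDVI = (NIR - red) / (NIR + red); returns 0.0 when the sum is zero."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def texture_3x3(band):
    """Return (mean, range, sd) grids for a 2-D list of reflectance values.

    Edge pixels use only the neighbours that fall inside the image.
    This is an assumption; the text does not say how edges were handled.
    """
    rows, cols = len(band), len(band[0])
    t_mean = [[0.0] * cols for _ in range(rows)]
    t_range = [[0.0] * cols for _ in range(rows)]
    t_sd = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the values in the 3 x 3 window centred on (r, c).
            window = [band[i][j]
                      for i in range(max(0, r - 1), min(rows, r + 2))
                      for j in range(max(0, c - 1), min(cols, c + 2))]
            t_mean[r][c] = mean(window)
            t_range[r][c] = max(window) - min(window)
            # Population SD (assumption); a sample SD is equally plausible.
            t_sd[r][c] = pstdev(window)
    return t_mean, t_range, t_sd


# Hypothetical 3 x 3 reflectance patch for one band.
patch = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
patch_mean, patch_range, patch_sd = texture_3x3(patch)
```

Applied to each of the eight bands, such a function yields three texture layers per band, which are stacked with the band reflectance, the NDVI layer, and the resampled DTM before training samples are extracted.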
Feature Set Evaluation: To assess differences in model-based overall accuracy between wet-season and dry-season data, and between each single season and the combined bi-seasonal data, we constructed three feature sets for optimal feature selection using the wrapper method [40]. The three feature sets were (1) bi-seasonal data, with 33 features per season plus the DTM, totaling 67 features; (2) wet-season data only; and (3) dry-season data only, with each single-season set comprising 33 features plus the DTM, totaling 34 features (Table 1). The classifier we chose was the random forest [41], which has been applied successfully in many remote sensing applications [42
For each training sample, variables for the three feature sets were extracted. Within each feature set, secondary feature selection was performed and feature importance was assessed using the bootstrapping and cross-validation procedures built into the random forest algorithm as implemented in the R package caret [44]. The parameters set for the random forest procedure in caret were the number of trees, the number of randomly selected features considered at each node (“mtry”), and the number of cross-validation folds. The number of trees was fixed at 1000. The “mtry” parameter was tuned over the range from 2 up to the number of features available in each feature set (e.g., 2 to 34 for a single-season feature set with 34 features and 2 to 67 for the bi-seasonal set), with the evaluation criterion set to “accuracy”. The number of folds for cross-validation was set to 10.
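The tuning scheme in this paragraph amounts to a grid search over “mtry” scored by 10-fold cross-validation. The sketch below shows only that bookkeeping in plain Python; it is not the caret implementation, which handles these steps internally. The names `kfold_indices`, `tune_mtry`, and `cv_accuracy` are hypothetical, and `cv_accuracy` stands in for whatever routine fits a 1000-tree forest on the training folds and returns accuracy on the held-out fold.

```python
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k roughly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def tune_mtry(n_features, n_samples, cv_accuracy, k=10, seed=0):
    """Return (best_mtry, mean_accuracy) over mtry = 2 .. n_features.

    cv_accuracy(mtry, train_idx, test_idx) is a hypothetical callable that
    fits a forest on train_idx and returns its accuracy on test_idx.
    """
    folds = kfold_indices(n_samples, k, seed)
    best = None
    for mtry in range(2, n_features + 1):
        scores = []
        for i, test_idx in enumerate(folds):
            # Every fold except fold i is used for training.
            train_idx = [s for j, fold in enumerate(folds) if j != i
                         for s in fold]
            scores.append(cv_accuracy(mtry, train_idx, test_idx))
        avg = sum(scores) / k
        # Keep the first mtry that attains the highest mean accuracy.
        if best is None or avg > best[1]:
            best = (mtry, avg)
    return best
```

For a single-season feature set, `tune_mtry(34, n_samples, cv_accuracy)` would evaluate 33 candidate values; for the bi-seasonal set, `tune_mtry(67, ...)` would evaluate 66.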
Classifier performance for the three sets was assessed with the overall, model-based, out-of-bag (OOB) error estimates obtained with 10-fold cross-validation [45]. We evaluated which communities benefited most from bi- versus single-season data and which features were most effective within each feature set. The criterion for feature importance was the mean decrease in accuracy, as reported in the output of the random forest algorithm.
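The idea behind the accuracy-based importance criterion can be sketched with a simplified, model-agnostic permutation test: shuffle one feature at a time and record how much accuracy drops. This is an illustration under stated assumptions, not the random forest's own computation, which permutes features on the out-of-bag samples of each tree. The function names and the `predict` callable are hypothetical.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows for which predict(row) equals the true label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn.

    X is a list of rows (lists of feature values); predict maps a row to a
    class label. Larger values indicate more important features.
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this one column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:]
                      for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances
```

A feature that the classifier never uses yields an importance of zero, because permuting it leaves every prediction unchanged.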
Design-Based Map Accuracy: The final map was predicted from the feature set that provided the highest model-based overall accuracy. Cloud-covered areas in the satellite data were masked by manually digitizing cloud and cloud-shadow polygons. For the cloud-masked areas, only the satellite image that did not contain clouds and the DTM were used to predict the final class label. Cloud and cloud shadow covered 0.9% of the dry-season image and 0.6% of the wet-season image. For the final map, a minimum mapping unit of 20 m² was chosen. Pixel clumps for each community that were below the 20 m² threshold (nine-pixel neighborhood) were reclassified using a majority rule for a 3 × 3 moving window; clump gaps were filled iteratively from the outside in until all pixels were assigned a class label. Accuracy of the final map was evaluated with a design-based, stratified random-sampling method in which the number of samples was calculated from a multinomial distribution, assuming a target accuracy of 85% at a 95% confidence level [46]. Stratifying by community type, 53 pixels per community were randomly selected for evaluation. Because the two invasive species, S. terebinthifolius and C. asiatica, had the smallest area and were difficult to distinguish from aerial photography, we combined those two categories for the accuracy assessment. We used a digital stereoplotter (DAT/EM Systems International, Anchorage, AK, USA) with 2009 aerial photography and the ArcMap 10.3.1 (ESRI 2014) basemap to label each of the randomly sampled reference pixels without prior knowledge of the model classification. Pixels that we could not identify in either of the aerial photography sources were verified in the field.
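The outside-in gap filling used to enforce the minimum mapping unit can be sketched as follows. The example assumes that pixels belonging to undersized clumps have already been identified and set to `None`; clump detection itself is not shown. The function name `fill_gaps` and the tie-breaking rule (the first label encountered in scan order wins) are assumptions, since the text does not describe how ties were resolved.

```python
from collections import Counter


def fill_gaps(grid):
    """Fill None cells with the majority label of their 3 x 3 neighbours.

    Each pass labels only those gaps that touch at least one labelled pixel,
    so gaps are filled from the outside in. Returns a new grid.
    """
    rows, cols = len(grid), len(grid[0])
    grid = [row[:] for row in grid]  # work on a copy
    while any(v is None for row in grid for v in row):
        updates = {}
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] is not None:
                    continue
                neighbours = [grid[i][j]
                              for i in range(max(0, r - 1), min(rows, r + 2))
                              for j in range(max(0, c - 1), min(cols, c + 2))
                              if grid[i][j] is not None]
                if neighbours:
                    updates[(r, c)] = Counter(neighbours).most_common(1)[0][0]
        if not updates:
            raise ValueError("grid contains no labelled pixels")
        # Apply all updates at once so a pass only sees the previous state.
        for (r, c), label in updates.items():
            grid[r][c] = label
    return grid


# Hypothetical 4 x 4 map with a removed four-pixel clump in the centre.
example = [["A", "A", "A", "A"],
           ["A", None, None, "B"],
           ["A", None, None, "B"],
           ["A", "A", "A", "B"]]
filled = fill_gaps(example)
```

Collecting the updates before applying them keeps each pass independent of scan order within the pass, which is one reasonable reading of "iteratively from the outside in".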
To quantify the accuracy of the map and estimate the area of each community type across the 71 km² area, we evaluated the error matrix of the map labels versus the reference pixel labels. Using the reference data of the random samples and the mapped class proportions, the areal cover estimates for each class were adjusted to eliminate bias that can be attributed to classification errors [48]. Confidence intervals for the error-adjusted area estimates were calculated to quantify the sampling variability of the estimated area using the procedures recommended by Olofsson et al. [48] and Stehman [49]. Mapped proportions of each class and error-corrected areal estimates were then used to adjust overall, user’s, and producer’s accuracies with standard errors and 95% confidence intervals [48]. All data processing and analyses were performed in R [50] using the packages raster [51], rgdal [52], and caret [44
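The error-adjusted area estimation described in this paragraph can be sketched with the standard stratified estimators, in which each cell of the error matrix is reweighted by the mapped area proportion of its map class. The sketch is in plain Python for illustration only (the study used R). The function name `error_adjusted_areas`, the two-class error matrix, and the mapped proportions in the usage lines are hypothetical; only the 71 km² total comes from the text.

```python
from math import sqrt


def error_adjusted_areas(counts, weights, total_area):
    """Stratified accuracy and area estimates from a sample error matrix.

    counts[i][j]: number of sample pixels mapped as class i whose reference
                  label is class j.
    weights[i]:   mapped area proportion of class i (the weights sum to 1).
    """
    q = len(counts)
    n_row = [sum(row) for row in counts]
    # Estimated area proportions: p_ij = W_i * n_ij / n_i.
    p = [[weights[i] * counts[i][j] / n_row[i] for j in range(q)]
         for i in range(q)]
    # Column totals give the error-adjusted proportion of each reference class.
    p_ref = [sum(p[i][j] for i in range(q)) for j in range(q)]
    se = []
    for j in range(q):
        var = 0.0
        for i in range(q):
            frac = counts[i][j] / n_row[i]
            var += weights[i] ** 2 * frac * (1 - frac) / (n_row[i] - 1)
        se.append(sqrt(var))
    return {
        "overall": sum(p[k][k] for k in range(q)),
        "users": [counts[k][k] / n_row[k] for k in range(q)],
        "producers": [p[k][k] / p_ref[k] for k in range(q)],
        "area": [total_area * v for v in p_ref],
        # Half-width of the 95% confidence interval for each area estimate.
        "area_ci95": [1.96 * total_area * s for s in se],
    }


# Hypothetical two-class example over a 71 km^2 area.
result = error_adjusted_areas(counts=[[45, 5], [10, 40]],
                              weights=[0.3, 0.7],
                              total_area=71.0)
```

Because the adjusted proportions sum to one, the error-adjusted class areas always sum to the total mapped area, which is a useful sanity check on any implementation.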