Harnessing Machine Learning Techniques for Mapping Aquaculture Waterbodies in Bangladesh

: Aquaculture in Bangladesh has grown dramatically in an unplanned manner in the past few decades, becoming a major contributor to the rural economy in many parts of the country. National systems for the collection of statistics have been unable to keep pace with these rapid changes, and more accurate, up to date information is needed to inform policymakers. Using Sentinel-2 top of atmosphere reﬂectance data within Google Earth Engine, we proposed six different strategies for improving ﬁshpond detection as the existing techniques seem unreliable. These techniques include: (1) identiﬁcation of the best time period for image collection, (2) testing the buffer size for threshold optimization, (3) determining the best combination of image reducer and water-identifying indices, (4) introduction of a convolution ﬁlter to enhance edge-detection, (5) evaluating the impact of ground truthing data on machine learning algorithm training, and (6) identifying the best machine learning classiﬁer. Each enhancement builds on the previous one to develop a comprehensive improvement strategy called the enhanced method for ﬁshpond detection. We compared the results of each improvement strategy to known ground truthing ﬁshponds as the metric of success. For machine learning classiﬁers, we compared the precision, recall, and F1 score to determine the quality of results. Among four machine learning methods studied here, the classiﬁcation and regression trees performed the best with a precision of 0.738, recall of 0.827, and F1 score of 0.780. Overall, the proposed strategies enhanced ﬁshpond area detection in all districts within the study area. Author Contributions: Conceptualization, A.P.N., H.F. and J.S.H.-S.; methodology, H.F. and A.P.N.; software, H.F., J.K. and I.K.; validation, R.E. and M.M.H.; investigation, N.M. and J.S.H.-S.; data curation, H.F. and J.K.; writing—original draft preparation, H.F.; writing—review and editing, A.P.N., N.M. and B.B.; supervision, A.P.N. and N.M.; administration, A.P.N.


Introduction
Aquaculture (farming fish and other aquatic animals) is growing at an extremely rapid pace, increasing around 6.5 times, from 13 to 85 million tons over the past thirty years [1], and is forecast to continue growing rapidly in coming decades [2]. Most of this growth has occurred in Asia, which accounts of 89% global aquaculture production [1]. Aquaculture farm expansion in Asia has been concentrated mainly in water abundant deltaic regions [3]. The growth of aquaculture since the 1990s has been particularly rapid in Bangladesh, induced by demand caused by rising incomes, urbanization, a growing population, and declining supplies of fish from capture fisheries. This development has occurred in a largely The seven districts are primarily made up of cultivated land, artificial surfaces, water bodies, and wetlands (Table S1). Gopalganj, Bagerhat, and Jessore have the highest percentages of cultivated land (all over 65%). Satkhira has the highest percentage of water bodies at 39%-even higher than their cultivated land. Bhola is one of the few districts For this research, we focus on aquaculture in enclosed waterbodies (ponds and ghers), and exclude farming taking place in stocked natural waterbodies, pens, and cages [6]. The study region excludes the Sundarbans National Park in the southern parts of Satkhira, Khulna, and Bagerhat. The land use for the study region was calculated from data from GlobeLand30 [29]. The study region is made up of 62% cultivated land (9966 km 2 ), 23% artificial surfaces (3661 km 2 ), 13% water bodies (2136 km 2 ), and 2% wetlands (296 km 2 ). Forest, shrubland, grassland, and bare land each encompass less than 1% of the total land use ( Figure S1).
The seven districts are primarily made up of cultivated land, artificial surfaces, water bodies, and wetlands (Table S1). Gopalganj, Bagerhat, and Jessore have the highest percentages of cultivated land (all over 65%). Satkhira has the highest percentage of water bodies at 39%-even higher than their cultivated land. Bhola is one of the few districts with wetlands as a land cover and has the highest percentage of wetlands with 13%.
The study area falls into three different climate subregions: the south-eastern zone, south-western zone, and south-central zone [30]. The south-eastern zone includes Bhola and the southern part of Barisal. This region sees high rainfall from May to September. Northern Barisal, Gopalganj, Southern Khulna, and Southern Satkhira fall within the south-central zone. This area has less rain on average than the south-eastern zone. The south-western zone is comprised of northern Khulna, Northern Satkhira, and Jessore and has the lowest levels of rainfall on average out of the three zones [31]. The study area has four seasons throughout the year that are described as winter (December to February), pre-monsoon (March to May), monsoon (June to September), and post-monsoon (October and November) [30]. The heavy rainfall of the monsoon season makes it challenging to obtain could-free satellite images during that season. The districts within the study region see anywhere from 1200 to 2800 mm of rainfall during the monsoon season [32]. The south-eastern zone has a slightly longer rainy season with higher precipitation from May to September [31].

Overview of the Base Method
As described in the Introduction section, the Yu et al. [23] method was selected as a base method for detecting fishponds for the study area. In general, only a small number of changes (e.g., temporal and spatial extend of satellite imageries) to the original code were needed to fit the method for the study area. The base method comprises six main steps: data acquisition, water index calculation, initial mask, threshold optimization, majority vote, and fishpond identification ( Figure 2).
Step 1. Data Acquisition: The process starts with combining data from Sentinel-2 Level-1C (top of atmosphere reflectance) from 1 January 2020 to 31 December 2020. Sentinel-2 collects multi-spectral data with spatial resolutions from 10 to 60 m. Sentinel-2 Level-1C bands range from visible to near-infrared and short-wave infrared. The images from these dates are then filtered for cloud coverage, keeping only those images with less than 10% cloud pixels. We obtained 311 images for our study region during 2020 that include any image that covers any part of our study region. This is one of the major differences with the original Yu et al. [23] study, as they only obtained four images for their region of interest during the study period of 2016. In addition, the size of our study region is just under 33 times larger than the Yu et al. [23] study area. During the 2020 study period, no images were available between mid-June and the end of September due to the monsoon occurrence. We identified the cloud coverage using the QA60 band available in the Sentinel-2 imagery. This band provides 60 m spatial resolution masks for opaque and cirrus clouds. Opaque clouds are determined with a thresholding technique using the blue (B2), cirrus (B10), SWIR1 (B11) and SWIR2 (B12) bands to avoid snow/cloud confusion. Cirrus clouds, which are high-altitude, thin, and semi-transparent clouds, are determined using the blue band and the B10 band corresponding to a high atmospheric absorption. In addition, morphological operations (i.e., erosion, dilation) are performed to limit false detections. We used the cloud masks to remove the corresponding pixels from the water Remote Sens. 2021, 13, 4890 5 of 26 detection analysis. In Figure S2 we present the cloud distribution for the single-day images employed in this study.
Step 2. Water Index Calculation: To identify where water is located within the study area, three different water indexes were used. These indexes analyze different wavelengths of reflected light to determine the presence of water. Using the images left after the cloud filter, we calculated the three water indexes: the normalized difference water index (NDWI) [33], the modified NDWI (MNDWI) [34], and the automated water extraction index with no shadow (AWEI) [35].
Step 3. Initial Mask: Previous studies showed that MNDWI performed the best among the three water indices [36][37][38] and also recommended by Yu et al. [23] method; the mask is created from this index. The MNDWI is filtered by the 'greater than' function. If an MNDWI pixel value is greater than 0, the function returns a value of 1. The filtered MNDWI images are reduced into one image using the allNonZero reducer command in the Google Earth Engine (GEE). The reducer assigns the pixel values of 0 or 1 based on all values obtained from every image in the collection at a specific location. If every value at that pixel location was a non-zero, then the final image has a pixel value of 1 at that location. If there is even one 0 value in any image at that location, then the final image has a pixel value of 0. This creates an initial binary mask that is converted to polygons. The polygons are enlarged with a buffer of 5 pixels to capture values of pixels that surround bodies of water. The polygons are then rasterized, creating a new image called MNDWI Mask (MM). The MM is then applied to the NDWI, original MNDWI, and AWEI image collections.
Step 4. Threshold Optimization: This initiates a process called Otsu segmentation which optimizes the threshold between foreground and background in images (water and notwater) [39]. The segmentation process creates a histogram of water index values from the locations the MM provided. Otsu segmentation performs best when the histogram is bimodal with a buffer of 5 pixels, the two modes being water and non-water. The threshold between the two modes is found using the sum of squares and is the optimal balance of low-intraclass variance and high inter-class variance. The Otsu segmentation process occurs within each index's image collection separately.
Step 5. Majority Vote: The optimized NDWI, MNDWI, and AWEI image collections after Otsu segmentation are then reduced into three images (one for each index) using the allNonZero reducer. This creates three separate images that show where each water index identified water in every image during the time period. The three reduced index images are then reduced to one image through the mode/majority vote method. The mode method sees what value is present at that pixel location most often and assigns the final image that value. The final combined image depicts where water is in the majority of indexes and also where water is located in every image. This image is then converted into polygons and these polygons are given object-based features (OBFs). The OBFs include the isoperimetric quotient, solidity, patch fractal dimensions, convexity, and square pixel metric. The convexity, solidity, and path fractal dimensions describe basic geometric characteristics of the pond and how complex it may be [40]. The iso-perimetric quotient measures how similar an object is to compact shapes like circles [41]. Lastly, the square pixel metric analysis for the shape convexity and is similar to the iso-perimetric quotient [42].
Step 6. Fishpond Detection: Two machine learning techniques are utilized following the Yu et al. [23] recommendations: classification and regression trees (CART) and logistic regression (LR). These two algorithms were run in R Studio utilizing the caret package [43]. The algorithms are trained by analyzing the calculated OBFs of the ground truthing data. The base method determined algorithm performance by using five-fold cross-validation.
The base method used images from the GEE platform to determine what defined a fishpond visually. The fishponds were then manually digitized for their study area to train their algorithms. In addition, non-fishpond water bodies were identified by removing digitized lakes from the Tibetan Plateau.  Step 1. Data Acquisition: The process starts with combining data from Sentinel-2 Level-1C (top of atmosphere reflectance) from 1 January 2020 to 31 December 2020. Sentinel-2 collects multi-spectral data with spatial resolutions from 10 to 60 m. Sentinel-2 Level-1C bands range from visible to near-infrared and short-wave infrared. The images from these dates are then filtered for cloud coverage, keeping only those images with less than 10% cloud pixels. We obtained 311 images for our study region during 2020 that include any

Proposed Improvements
We identified Yu et al. [23] method as having the most promising and applicable methods for our region of interest in southern Bangladesh, but with some modifications. From initial surveys, we found that many of the assumptions in the Yu et al. [23] paper do not apply to the south-west and south-central regions of Bangladesh (Table 1).
In order to address these shortcomings, the following changes were proposed to improve the base method (1) to identify a period of time that all types of fishponds are filled with water, (2) to modify the buffer size to improve differentiating fishponds from their surroundings, (3) to change the image reducer and water index combination to enhance waterbody detection, (4) to introduce a convolution filter for edge detection, and (5) to utilize support vector machine (SVM) and random forest (RF) as two additional machine learning techniques, and (6) utilize ground truthing data for better algorithm training.

Base Method Assumptions
Shortcoming of the Base Method and Alternative Approach

Improvement Number
Fishponds are filled with water year-round Some fishpond types in Bangladesh may dry out for a portion of the year (e.g., homestead ponds). Other fishponds may be planted with rice for part of the year (e.g., Gher). Therefore, assuming all fishponds are filled with water all year is not correct. Instead of focusing on areas that have water for an entire year, we should focus on time periods when each type of fishpond holds water.
1, 3 Fishponds are surrounded by non-water Bangladesh's landscape is very diverse, and in many areas, different types of land use can be found around fishponds such as trees, rice paddies, buildings. In some regions, fishponds are very close with very narrow boundaries (<10 m apart, which is the highest resolution of imageries used here). This assumption impacts the buffer size for the MM.

2, 4
Fishponds are easy to detect visually from images. Non-fishponds water bodies from another region were selected for algorithm training.
There is a high probability that small fishponds cannot be detected correctly through visual observation of satellite imageries. The preferred method is to use ground truthing fishpond collections from the study region that are diverse in size, shape, type, and surrounding areas.

5
CART and LR are the preferred machine learning techniques for this application Support vector machine and random forest are identified as promising methods [44] for differentiating between fishponds and non-fishponds classes.
6 Improvement 1: As described earlier, the shortcoming of the base method in assuming year-round fishponds impoundment is not valid in our study region as many fishponds are dried for a certain part of the year. In addition, using many images for the entire year for a large study area, such as the one here, can significantly increase the computational time or prohibit the implementation of the base method for a large study area.
In order to address this shortcoming, we identified a period of time in which all types of fishponds are filled with water. For this period, two sets of analyses were performed. First using all images with less than 10% cloud cover over the span of one month. Secondly, in order to further reduce the computational needs for data processing in our large study region, a single image for each district during the month of interest was selected for the second set of analyses.
Using images for a single month, relevant satellite images were obtained for the period of 11 October to 13 November of 2020 instead of the full year. Therefore, the total number of images was reduced from 311 for the whole year observation to 43 for one month observation. For the second scenario, one single day during the month was selected. Since our study area is very large compared to the base method, each district has a different day that falls between 11 October and 13 November. The images chosen were based on visual inspection and clarity, such as cloud coverage and contrast between water and non-water. The days chosen for each district are shown in Table 2. Improvement 2: The base method assumes that fishponds are surrounded by non-water pixels, but this is not the case in some of the districts in our study region. The Satkhira and Khulna districts, for example, have fishponds and agricultural fields that are very close together with dikes and buffers that are smaller than 10 m. We tested buffer sizes of 0, 1, 3, and 5 pixels to identify the best size as the buffer in the base method is a very important step in creating the MM. The MM is a binary raster that is one of the inputs into the Otsu segmentation function. Where the MM has a value of 1, the Otsu segmentation function creates a histogram of index values at those locations. The buffer selects an additional region around the areas where the MNDWI has identified water that is assumed to be land. The histogram created by Otsu should be bimodal, where one peak represents water and the other represents non-water. Having a bimodal histogram makes the threshold values from Otsu more accurate. The potential issue with a buffer of 5 pixels is that it assumes that the water bodies identified by MNDWI during the MM process are surrounded by 50 m of land, trees, or buildings. This is not the case in many of the districts in our study region. In areas where aquaculture ponds are being used, they tend to be very close together, with only a dike separating them. Dikes are much smaller than 50 m; some are even smaller than the 10 m pixel that we can see. Fishponds surrounded by other water bodies could change the histogram from bimodal to unimodal since most pixels identified by the MM are water. We wanted to see if the buffer size impacts the histogram and threshold value from the Otsu segmentation. If that threshold impacts how much water we identify and whether they have clear boundaries for a better classification. Improvement 3: The base method functions under the assumption that water is present in fishponds throughout the year and in every image. To implement this assumption, they utilize an all-or-nothing reducer called allNonZero and combine all three water index images. We know that some fishponds are drained for maintenance or may not be visible due to an imaging error (e.g., sensor issues, platform issues, or angle issues). With the all-or-nothing style of reducer, the base method removes too many potential fishponds and under detects. Therefore, we tested an image reducer that would take the value of what is present in the majority of images instead of all.
The allNonZero reducer in the base method labels final image pixels with a value of 0 (or non-water) if even one image does not have water at that location. This is a very strict reducer that eliminates too much water during the pre-processing and establishes a lower bound estimate. It does not take into account potential cloud interference or the periods of time when ponds may be dried or have rice within them. The allNonZero reducer is used two times in the base method. The first instance is in creating the MM when combining all the MNDWI images. The second instance is when combining the calculated index images into one summary image for each index (Step 5: majority vote). The first instance of the allNonZero reducer remains the same since we want very strict water locations to create the MM. As an improvement, we replaced the second instance of the allNonZero reducer with the mode reducer. The mode reducer, instead of requiring water to be present for every image, requires that water be present for most of the images. The mode reducer allows for more water pixels to be identified in the final classification.
The base method combines images from all three water-identifying indexes together to create one final summary image. We assumed this muddles the shape and location of identified water bodies, decreasing the overall amount of detected water. To improve upon this, we tested four different index combinations to evaluate what performed best in each district. The combinations are the base method, AWEI individually, NDWI individually, and MNDWI individually. This test shows whether combining the three indexes negatively impacts the fishpond identification results or filters out likely non-fishpond water bodies. Improvement 4: Increasing the number of water pixels, whether from reducing the time frame, changing the buffer, or using the mode reducer, causes many water bodies to merge, forming much larger polygons. This problem is exacerbated by the issue of dikes and other dividers commonly less than 10 m wide, much smaller than the best resolution images that are used in this study. These large polygons have a much higher chance of being labeled as a non-fishpond by the classifier because of their size and odd shapes. One way of Remote Sens. 2021, 13, 4890 9 of 26 breaking up the polygons is by using light detection and ranging (LiDAR) digital elevation models (DEM) to determine where raised boundaries are [45]. Unfortunately, these data do not exist in our study region. Instead, to try to reduce the size of these polygons, we introduced a Laplacian 5 × 5 convolution filter to smooth the images [46]. The convolution filter was applied to the best index or index combination image from Improvement 3 of each district and enhanced the differences between water and non-water. The convolution filter is used in ArcGIS Pro 2.4. We chose Laplacian 5 × 5 as the filter type since it visually performed better than the other filter types and has Gaussian smoothing built into the kernel (e.g., After the convolution filter is used, we put the smoothed index image through Otsu segmentation with the MM mask. The process follows the same as the previously improved base method. This improvement aims to provide the Otsu segmentation process with a clearer index image than the one provided by the base method. Improvement 5: The base method assumes that fishponds are easy to detect visually from standard RGB images, but that may depend more on when the images were captured and what surrounds the fishpond. Many fishponds in Bangladesh are hard to detect visually because they are surrounded by trees or many rice paddies and they have different fishpond layouts. In the study region, four types of fishponds were identified that include gher with and without rice, commercial ponds, and homestead ponds. All four types of fishponds vary in average size, typical shape, and period with water, all highlighted in Table 3. Gher without rice is, on average, much larger than all the other fishpond types. Homestead ponds and gher without rice can have round and irregular shapes, whereas gher with rice and commercial ponds typically are rectangular. The period in which each of these fishpond types is filled with water varies as well. Commercial and homestead ponds are typically permanent ponds that have water year-round. Gher can have water or not because of their design to cultivate both fish and crops throughout the year. All of these differences between fishpond types make it very difficult to locate a diverse range of fishponds visually from satellite images. Table 3. Average size, shape, and period with water for gher with and without rice, commercial ponds, and homestead ponds.

Characteristic Gher with Rice Gher without Rice Commercial Pond Homestead Pond
Average size (m 2 ) 1620-2020 4047 1620 810 The base method utilizes historical GEE images to visually find and trace fishponds. The traced fishponds are not directly used for training, but instead, they used the shape of the polygon of water identified at the traced locations through the base method for training. For their non-fishpond data, they manually trace water bodies from a Tibetan plateau for algorithm training using the Joint Research Centre (JRC) Yearly Water Classification dataset that has a 30-m spatial resolution [47]. Tibet is north of Bangladesh and has a very different elevation and climate.
In comparison, for this study, we used 1728 ground truthing polygons provided by surveyors in Bangladesh to validate the base method's performance and the proposed improvements. The ground truthing data consists of 996 fishponds and 741 non-fishpond waterbodies (Tables S2 and S3). Using ground truthing data additionally makes validation results reliable instead of trusting that the waterbody chosen visually was a fishpond or not. The ground truthing data was created by manually drawing borders around the known fishponds and non-fishponds in GEE. This results in the training dataset having smoother edges than what the water identification process will be. This is because the water identification process takes the shape of pixel groupings that appear blocky with hard edges. For this simplification of shape to not impact the results, we used the 'Simplify Polygon' function in ArcGIS to simplify the polygons identified through Steps 1-4. Improvement 6: The two machine learning algorithms used in the base method are CART and LR due to their wide use and applications in addition of being transcribed into GEE. In addition to these algorithms, SVM [48] and RF [49] were identified as promising methods and also recommended by Maxwell et al. (2018), considering many factors such as training data, requirements, and computational cost. SVM was selected because it is useful for finding an optimal boundary between two classes [44] and performs reliably when trained with a smaller dataset [50]. RF is a larger combination of many decision trees and is easy to optimize, so it is reasonable to compare its results with the CART method [51].

Overview of the Base Method
To compare the performance of the different fishpond detection algorithms, we used several criteria, including: (1) The number of ground truthing fishponds that are correctly identified; (2) The percentage of the ground truthing area that is correctly identified by the classifier; (3) The number of fishponds classified; (4) The recall: Recall is the ratio of true positive fishponds to the total of all known fishponds, whether true positive or false negative Equation (1); (5) The precision: Precision is the ratio of true positive fishponds to all identified fishponds, whether true positive or false positive Equation (2); (6) The F1 score: The F1 score Equation (3) is the harmonic mean of the recall and precision and is used to provide a more insightful characteristic of performance than the arithmetic mean [52].
The range for recall, precision, and F1 score is 0.0 to 1.0, with 1.0 being the highest score for all of them.
To determine whether the proposed improvements significantly impacted the results, we compared the areas that are correctly identified between the base method and the proposed improvement method against the ground truthing fishpond areas. The comparison was made using a confidence interval for the mean relative error between the identified area and the known ground truthing area [53].

Base Method Performance Evaluation within the Study Area in Detecting Waterbodies and Fishponds
The performance of the base method was evaluated based on: (1) how well it identified water bodies and (2) how well it classified those water bodies as fishponds. Figure 3 shows the percentage of the ground truthing area that are correctly identified at preclassifier (waterbody identification) and post-classifier (fishpond identification) stages. First, concerning detecting water areas within the known fishponds, the highest detection was in Bagerhat and had 9.4% of ground truthing fishpond area overlap. The second highest was Gopalganj with 5.9%, but the rest of the districts saw less than 1% ground truth fishpond area detection. Second, regarding the applicability of classifiers to differentiate between fishponds and non-fishponds, the LR method was identified more ground truth fishponds correctly than CART. The performance post-classifier will always be the same or less than the waterbody identification since the classifiers are only applied to the areas identified as water. Bagerhat had the highest correct classification out of the seven districts.
Most of the other districts had correct classifications under 1%, except Gopalganj had around 5% classification after the LR classifier (Table S3). The CART classification true positives were lower than the LR, with the highest being Bagerhat's 5% ground truth area identification. In Bagerhat, the base method using the LR classifier correctly identified 16 ground truthing fishponds out of 235 (9% of the total ground truthing fishpond area) ( Figure 3). Having said that, still the performance of the Base Method, even in the Bagerhat, is low.
Remote Sens. 2021, 13, x FOR PEER REVIEW 12 of 28 area) (Figure 3). Having said that, still the performance of the Base Method, even in the Bagerhat, is low. Figure 3. Percentage of the ground truthing area that were correctly identified as waterbodies, classified as fishpond using logistic regression method, and classified using the classification and regression trees method for different districts within the study area.
Meanwhile, the Bagerhat district is the only one of the seven in which the results are significant enough to visualize. Therefore, in Figure 4, we visually compared the performance of the classifiers that were used in the base method. While CART correctly identifies fewer ground truthing fishponds ( Figure 4C), LR overclassifies water bodies as fishponds ( Figure 4B). Further notable in Figure 4 are the large polygons that are identified as water and how those polygons are classified by LR and CART. LR tends to classify the large polygons as water, whereas CART is more conservative and classifies them as nonfishponds. The polygons are much larger than the ground truthing fishponds in the area, and in some cases, overlap the ground truthing fishponds. These large polygons make individual fishpond detection difficult since they typically encompass multiple fishponds or several agricultural lands (e.g., rice paddies).   Meanwhile, the Bagerhat district is the only one of the seven in which the results are significant enough to visualize. Therefore, in Figure 4, we visually compared the performance of the classifiers that were used in the base method. While CART correctly identifies fewer ground truthing fishponds ( Figure 4C), LR overclassifies water bodies as fishponds ( Figure 4B). Further notable in Figure 4 are the large polygons that are identified as water and how those polygons are classified by LR and CART. LR tends to classify the large polygons as water, whereas CART is more conservative and classifies them as non-fishponds. The polygons are much larger than the ground truthing fishponds in the area, and in some cases, overlap the ground truthing fishponds. These large polygons make individual fishpond detection difficult since they typically encompass multiple fishponds or several agricultural lands (e.g., rice paddies).

Identifying the Best Period of Image Collection for Detecting Fishponds (Improvement 1)
We tested one year, one month, and one day time periods for image collection to evaluate how the number of images impacts the threshold optimization results. The one year period combined information for 311 images while one month lowered the number of images to 43. These totals are for the study area as a whole and not for each specific district. Table 4 compares the waterbody identification results from each time period. Rows 1, 3, and 5 highlight the ground truth fishpond area percentage identified and how it improves with the shorter time for every district. The district of Bhola performed the worst out of all districts with 0% water identification for both year and month periods and 3% for the day. The poor performance in Bhola could be explained by the low numbers of ground truthing data (27 locations) we have for the district. Generally, the fewer ground truthing fishponds we have, the harder it is to identify any of them specifically. Bagerhat had the best water identification during the one-month period with 52% of the ground truthing fishpond area identified, but Jessore had the best performance in a single day with 91% area identification. The districts of Barisal, Gopalganj, Khulna, and Satkhira all saw significant improvement over the base method, but performed worse than the Bagerhat and Jessore values (Tables S4-S10). All seven districts improved water identification when we changed the time period from a year to either a month or a day. The single-day images had the highest results out of the three periods. To ensure that the best time period is a day, we used both single day and single month for Improvements #2 and #3. Figure 3. Percentage of the ground truthing area that were correctly identified as waterbodies, clas-sified as fishpond using logistic regression method, and classified using the classification and regression trees method for different districts within the study area.
Meanwhile, the Bagerhat district is the only one of the seven in which the results are significant enough to visualize. Therefore, in Figure 4, we visually compared the performance of the classifiers that were used in the base method. While CART correctly identifies fewer ground truthing fishponds ( Figure 4C), LR overclassifies water bodies as fishponds ( Figure 4B). Further notable in Figure 4 are the large polygons that are identified as water and how those polygons are classified by LR and CART. LR tends to classify the large polygons as water, whereas CART is more conservative and classifies them as nonfishponds. The polygons are much larger than the ground truthing fishponds in the area, and in some cases, overlap the ground truthing fishponds. These large polygons make individual fishpond detection difficult since they typically encompass multiple fishponds or several agricultural lands (e.g., rice paddies).   The time period of image collection proved to be a large area for improvement for identifying water. Changing the period from the full 2020 year to just 11 October to 13 November 2020 proved to be effective in identifying more water than the base method. The base method utilizes the allNonZero reducer in GEE to identify water that is present year-round. A longer time period means more images to compare and more chances for water to not be present. All districts saw an improvement in water detection from the base method and from the month period when using an image for one day. Figure 5 shows the drastic difference in water detection between the three periods, but also how small boundaries between ponds and agriculture can blend together to form large polygons. Figure 5 also compares a location in which we had good water detection, Jessore, to an area with very poor water detection, Bhola.
method and from the month period when using an image for one day. Figure 5 shows the drastic difference in water detection between the three periods, but also how small boundaries between ponds and agriculture can blend together to form large polygons. Figure 5 also compares a location in which we had good water detection, Jessore, to an area with very poor water detection, Bhola.

Testing the Buffer Size for Threshold Optimization (Improvement 2)
The buffer size impacts the amount of water identified by the Threshold Optimization process. The larger the buffer, the more pixel values are included in determining the threshold value between water and non-water. Having the best possible threshold will significantly improve upon water and fishpond identification. We tested buffer sizes of 5, 3, 1, and 0 pixels within the MM for both single day and one month time periods. The water identification pre-classifier for Bagerhat with a buffer size of 5 was 67% for one day, but that dropped to 58% with a buffer size of 0 (Table S11). This was the same trend across all districts for both time frames except for Bhola (Tables S11-S17). None of the attempted methods have resulted in any success in Bhola higher than the 3% of ground truthing area identified pre-classifier achieved from the 5-pixel buffer for a single day. The results showed that the buffer of five pixels performed better than any other size buffer for both time periods. Figure 6 shows the histogram of NDWI values at the MM location for Khulna. The final results for each district are from using the base method, in which all three indices are combined. Khulna was chosen for the example since it had a noticeable change from the buffer size and NDWI is one of the most used water indices. The histograms for 0-and 1-pixel buffers have large peaks between −0.2 and −0.1 that represent non-water pixels. As the buffer size increases, the second peak around 0.3 to 0.4 increases considerably. With NDWI, it is commonly interpreted that values above 0.3 are water bodies and values between 0 and 0.3 may be water bodies [54]. These histograms show that with the increased buffer size, water pixels become more prominent. The 5-pixel buffer may work best because it includes more pixels in total. The more pixels included in the histogram, the more reflective the histogram is to the land use in the district. Additionally, the 5-pixel buffer works better in districts with small fishpond boundaries because it provides more opportunities for endmember pixels, or pure water pixels, to be accounted for.

Testing the Buffer Size for Threshold Optimization (Improvement 2)
The buffer size impacts the amount of water identified by the Threshold Optimization process. The larger the buffer, the more pixel values are included in determining the threshold value between water and non-water. Having the best possible threshold will significantly improve upon water and fishpond identification. We tested buffer sizes of 5, 3, 1, and 0 pixels within the MM for both single day and one month time periods. The water identification pre-classifier for Bagerhat with a buffer size of 5 was 67% for one day, but that dropped to 58% with a buffer size of 0 (Table S11). This was the same trend across all districts for both time frames except for Bhola (Tables S11-S17). None of the attempted methods have resulted in any success in Bhola higher than the 3% of ground truthing area identified pre-classifier achieved from the 5-pixel buffer for a single day. The results showed that the buffer of five pixels performed better than any other size buffer for both time periods. Figure 6 shows the histogram of NDWI values at the MM location for Khulna. The final results for each district are from using the base method, in which all three indices are combined. Khulna was chosen for the example since it had a noticeable change from the buffer size and NDWI is one of the most used water indices. The histograms for 0-and 1pixel buffers have large peaks between −0.2 and −0.1 that represent non-water pixels. As the buffer size increases, the second peak around 0.3 to 0.4 increases considerably. With NDWI, it is commonly interpreted that values above 0.3 are water bodies and values between 0 and 0.3 may be water bodies [54]. These histograms show that with the increased buffer size, water pixels become more prominent. The 5-pixel buffer may work best because it includes more pixels in total. The more pixels included in the histogram, the more reflective the histogram is to the land use in the district. Additionally, the 5-pixel buffer works better in districts with small fishpond boundaries because it provides more opportunities for endmember pixels, or pure water pixels, to be accounted for.

Determining the Combination of Image Reducer and Water-Identifying Index to Improve Waterbody and Fishpond Detection (Improvement 3)
Both the image reducer and water-identifying indices impact the amount of water identified for classifying. The base method resulted in too few of the ground truthing fishponds being identified. Therefore, we hypothesized that determining the best combination of image reducer and indices will significantly improve the overall detections.
To increase the amount of water identified, we tested two image reducers (i.e., mode and allNonZero) and four different water-identifying index combinations (i.e., AWEI, MNDWI, NDWI, and combination of AWEI; MNDWI; and NDWI). The mode reducer increased the area percentage of ground-fishponds identified for all districts (Tables S18-S24) compared to the allNonZero reducer (Tables S25-S31). The exception to this is Bhola, in which most of the results were 0% except for the NDWI index in which the Mode reducer identified 7%. Figure 7 shows the comparison of the combined indexes, AWEI, MNDWI, and NDWI in correctly identifying the ground truth fishpond areas for each district. The combination of all three indexes that represents the base method did not perform the best for any district. Figure 7 also shows that the MNDWI by itself also does not perform the best for any district (Tables S32-S38).

Determining the Combination of Image Reducer and Water-Identifying Index to Improve Waterbody and Fishpond Detection (Improvement 3)
Both the image reducer and water-identifying indices impact the amount of water identified for classifying. The base method resulted in too few of the ground truthing fishponds being identified. Therefore, we hypothesized that determining the best combination of image reducer and indices will significantly improve the overall detections. To increase the amount of water identified, we tested two image reducers (i.e., mode and allNonZero) and four different water-identifying index combinations (i.e., AWEI, MNDWI, NDWI, and combination of AWEI; MNDWI; and NDWI). The mode reducer increased the area percentage of ground-fishponds identified for all districts (Tables S18-S24) compared to the allNonZero reducer (Tables S25-S31). The exception to this is Bhola, in which most of the results were 0% except for the NDWI index in which the Mode reducer identified 7%. Figure 7 shows the comparison of the combined indexes, AWEI, MNDWI, and NDWI in correctly identifying the ground truth fishpond areas for each district. The combination of all three indexes that represents the base method did not perform the best for any district. Figure 7 also shows that the MNDWI by itself also does not perform the best for any district (Tables S32-S38).

Implementing Edge Detection with a Convolution Filter to Improve Fishpond Boundary Detection (Improvement 4)
Considering the results from Improvement 1 through 3, the best combination of time of image collection, buffer size for the threshold optimization, and water index for each district was identified. Two major combinations are as follows: (1) single day, 5-pixel buffer, with AWEI index for Bagerhat, Gopalganj, Jessore, and Khulna and (2) single day, 5-pixel buffer, with NDWI index for Barisal, Bhola, and Satkhira (Table S39). Since the best combination for every district entails using the single day image, the reducer type is no longer relevant. A potential reason why MNDWI does not perform to the same standards as NDWI and AWEI is that MNDWI was created to differentiate water from built-up land since built-up land can have a very similar NDWI value to water [55]. This is not necessarily an issue in our study region since there is not significant built-up land present. In

Implementing Edge Detection with a Convolution Filter to Improve Fishpond Boundary Detection (Improvement 4)
Considering the results from Improvement 1 through 3, the best combination of time of image collection, buffer size for the threshold optimization, and water index for each district was identified. Two major combinations are as follows: (1) single day, 5-pixel buffer, with AWEI index for Bagerhat, Gopalganj, Jessore, and Khulna and (2) single day, 5-pixel buffer, with NDWI index for Barisal, Bhola, and Satkhira (Table S39). Since the best combination for every district entails using the single day image, the reducer type is no longer relevant. A potential reason why MNDWI does not perform to the same standards as NDWI and AWEI is that MNDWI was created to differentiate water from built-up land since built-up land can have a very similar NDWI value to water [55]. This is not necessarily an issue in our study region since there is not significant built-up land present. In fact, most of the representative fishponds (ground truth) are not located near significant developed land. Our preliminary literature review also supported this assumption that is most likely the case with the majority of fishponds in the country. In addition to this, MNDWI uses spectral bands that have 20 m spatial resolution that may decrease the accuracy of its results.
In the next step, we apply a Laplacian 5 × 5 convolution filter for edge detection to identify the best image combination for each district. The purpose of introducing edge detection is to begin with an enhanced image that highlights sharp differences in pixel values. The difference in pixel values shows where boundaries are located around different plots of land. The convolution filter emphasized the differences in the select index images to identify the boundaries around water. Figure S4 shows the results of both the Laplacian 3 × 3 and Laplacian 5 × 5 filters. The 5 × 5 filter is much more sensitive to change than the 3 × 3 filter is due to having a larger range of values than the 3 × 3. This can be seen in the legend of Figure S3. The spectrum of index values widens after the filter has been applied, allowing for differences to become more apparent numerically. For example, the original range of NDWI values in Barisal is −0.50 to 0.65. After using the filter, the range lengthens from −14.83 to 8.95. Visually, the filter highlights boundaries but does make many of the pixels become a very similar grey color ( Figure 8B).
In the next step, we apply a Laplacian 5 × 5 convolution filter for edge detection to identify the best image combination for each district. The purpose of introducing edge detection is to begin with an enhanced image that highlights sharp differences in pixel values. The difference in pixel values shows where boundaries are located around different plots of land. The convolution filter emphasized the differences in the select index images to identify the boundaries around water. Figure S4 shows the results of both the Laplacian 3 × 3 and Laplacian 5 × 5 filters. The 5 × 5 filter is much more sensitive to change than the 3 × 3 filter is due to having a larger range of values than the 3 × 3. This can be seen in the legend of Figure S3. The spectrum of index values widens after the filter has been applied, allowing for differences to become more apparent numerically. For example, the original range of NDWI values in Barisal is −0.50 to 0.65. After using the filter, the range lengthens from −14.83 to 8.95. Visually, the filter highlights boundaries but does make many of the pixels become a very similar grey color ( Figure 8B). Bagerhat, Bhola, Gopalganj, Jessore, and Khulna had all their ground truthing fishponds identified before classification (Table S40). Jessore performed the best with edge detection seeing all 113 ground truthing fishponds and 91.5% of the area correctly identified before any classifier was used. Despite Jessore's high performance, the water body identification pre-classifier decreased with adding edge-detection from 95% to 91.5%. Satkhira experienced a similar result from adding edge-detection with the ground truth area identified decreasing from 91% to 82%. Oppositely, Gopalganj, Barisal, and Bhola all increased their ground truthing fishpond area identification at the pre-classifier stage from 54% to 68%, 26% to 58%, and 20% to 57%, respectively. The same district divide was seen with the post-classifier results as well. Barisal and Bhola saw an increase in their postclassifier area identification for both LR and CART. The ground truthing fishpond area percentage for Jessore decreased from 94.3% to 91.4% with LR and from 3% to 0% with CART using edge-detection. Satkhira saw a slight increase from 0.3% to 5% for the CART classifier, but its LR dropped from 93% to 56% ground truth area identification. Bagerhat, Bhola, Gopalganj, Jessore, and Khulna had all their ground truthing fishponds identified before classification (Table S40). Jessore performed the best with edge detection seeing all 113 ground truthing fishponds and 91.5% of the area correctly identified before any classifier was used. Despite Jessore's high performance, the water body identification pre-classifier decreased with adding edge-detection from 95% to 91.5%. Satkhira experienced a similar result from adding edge-detection with the ground truth area identified decreasing from 91% to 82%. Oppositely, Gopalganj, Barisal, and Bhola all increased their ground truthing fishpond area identification at the pre-classifier stage from 54% to 68%, 26% to 58%, and 20% to 57%, respectively. The same district divide was seen with the post-classifier results as well. Barisal and Bhola saw an increase in their post-classifier area identification for both LR and CART. The ground truthing fishpond area percentage for Jessore decreased from 94.3% to 91.4% with LR and from 3% to 0% with CART using edge-detection. Satkhira saw a slight increase from 0.3% to 5% for the CART classifier, but its LR dropped from 93% to 56% ground truth area identification.

Evaluating the Impact of Ground Truthing Data on Machine Learning Training (Improvement 5)
To train machine learning classifiers, the base method manually traced suspected fishponds from historical satellite images. The water bodies they traced may have only been representative of one type of fishpond because homestead ponds and gher with rice are very difficult to identify as fishponds. To improve the machine learning classifier training, we used detailed ground truthing data for all four types of ponds described previously. In addition to this, we had ground truthing data for non-fishpond water bodies that would provide juxtaposing training data. Table 5 compares the performance of the machine learning classifiers when they are trained with ground truthing data versus historical imagery that was used in the Yu et al. [23]. The recall significantly increased from 0.538 to 0.989 for LR and from 0.495 to 0.827 for CART. This shows that the ground truthing data decrease the number of false negatives. The precision score decreased from 0.788 for LR and 0.773 for CART to 0.594 and 0.738, respectively, when using the ground truthing data for machine learning training. A decrease in precision implies an increase in the number of false positives. In addition to LR and CART, RF and SVM are widely recognized classifiers and can be used for many different research applications that require the differentiation of two or more classes [54][55][56]. To improve the classification results, we added RF and SVM since they are identified as promising classifiers [44]. Overall, the LR and SVM classifiers identified more ground truthing fishponds both in terms of area percentage and total number than the CART and RF (Tables S41 and S42). Figure 9 shows how the results of the RF and CART are similar to each other as well as how similar the results from LR and SVM are. The district with the best performance is Jessore, with both LR and SVM classifying 91% of the ground truthing fishpond area and all 113 ground truthing fishponds correctly. Jessore is also where there is the most drastic difference in performance by classifiers was observed. This could be due to the predominant presence of large water polygons in Jessore. LR and SVM tend to classify the large polygons are fishponds, whereas RF and CART do not. The RF and CART classifiers in Jessore performed the worst out of any district with 3% ground truthing fishpond area identification and 3 of 113 fishponds correctly identified. Introducing two additional classifier types yielded similar results to the base method. Table 6 compares the precision, recall, and F1 scores for each of the four classifier types we used.    The total number of classified fishponds for each district also supports the conclusion that CART is the best middle ground (Tables S41 and S42). Figure 10 shows the classifier results from LR, CART, RF, and SVM in Eastern Gopalganj as compared to the ground truth fishponds in that location.

Determining the Extent to Which the Improvements Works
We calculated the mean relative error for four different scenarios to determine whether the six proposed improvements considerably enhanced the capability of the base method for detecting fishponds. The scenarios that are considered here are the base method before classification, the enhanced method (considering all improvements) before classification, the base method using LR, and the enhanced method using CART. Table 7 outlines these statistics. The mean relative difference closest to 0 is the enhanced method without classifier (lower and upper limits of −34.4% and −31.3%). This means that the enhanced method is regularly 31-34% under detecting the ground truthing area. The other three scenarios had mean relative differences very close to −100%. The enhanced method CART mean relative difference was better than for the base method pre-and post-LR. This shows that the enhanced method without classification considerably improves upon the base method water identification.

Determining the Extent to Which the Improvements Works
We calculated the mean relative error for four different scenarios to determine whether the six proposed improvements considerably enhanced the capability of the base method for detecting fishponds. The scenarios that are considered here are the base method before classification, the enhanced method (considering all improvements) before classification, the base method using LR, and the enhanced method using CART. Table 7 outlines these statistics. The mean relative difference closest to 0 is the enhanced method without classifier (lower and upper limits of −34.4% and −31.3%). This means that the enhanced method is regularly 31-34% under detecting the ground truthing area. The other three scenarios had mean relative differences very close to −100%. The enhanced method CART mean relative difference was better than for the base method pre-and post-LR. This shows that the enhanced method without classification considerably improves upon the base method water identification. Table 7. The lower and upper limits for the mean relative difference of the based and enhanced methods performances before and after a machine learning classifier.

Scenarios
Mean

High-Resolution Imagery Impacts on Results
Having access to high-resolution imagery can improve the results of the enhanced method significantly. A WorldView-2 image was obtained for an area of 45 km 2 in Jamalnagar, Satkhira to analyze the performance of the enhanced method. The high-resolution image was used to identify the boundaries around all land plots and intersected those boundaries with the final enhanced method for Satkhira. Figure 11 compares the boundary identification from the WorldView-2 image, the base method water identification, the enhanced method water identification, and the intersect of the WorldView-2 boundaries and the enhanced method. We applied the CART classification to the intersect of WorldView-2 and the enhanced method to compare it to the enhanced method fishpond identification alone. There are 47 ground truth fishponds in this region. The enhanced method correctly identified 9 of the 47 whereas the intersect correctly identified 29 of the 47 fishponds. In addition, the area of fishponds correctly identified was doubled with the introduction of the high-resolution boundaries. The improvement in fishpond and area identification in one small region shows that having access to high-resolution imagery would refine the results of the enhanced method even further. In summary, using high-resolution imagery would significantly improve boundary detection around fishponds and provide more accurate water locations.

Implications of the Base Method
The threshold optimization (Step 4) was created to pinpoint the water bodies most likely to be fishponds. For the small study area that the base method was applied to, this technique works very well, especially since the layout of their study area has clear divided water bodies. However, even broken into the seven districts in our study area, the pixel selection technique does not perform to the same standard. The underlying assumptions of the base method are not valid for our study area and have to be adjusted to make accurate fishpond identification possible.
Overall, the base method is under-identifying water bodies before the classifier is even applied. In addition, the base method's performance in detecting fishponds is very low, the highest of all districts being 9%, and can be enhanced significantly. In the following sections, we will evaluate the effectiveness of proposed improvements on waterbody and fishpond detections in a very large and diverse study area.

Implications of the Base Method
The threshold optimization (Step 4) was created to pinpoint the water bodies most likely to be fishponds. For the small study area that the base method was applied to, this technique works very well, especially since the layout of their study area has clear divided water bodies. However, even broken into the seven districts in our study area, the pixel selection technique does not perform to the same standard. The underlying assumptions of the base method are not valid for our study area and have to be adjusted to make accurate fishpond identification possible.
Overall, the base method is under-identifying water bodies before the classifier is even applied. In addition, the base method's performance in detecting fishponds is very low, the highest of all districts being 9%, and can be enhanced significantly. In the following sections, we will evaluate the effectiveness of proposed improvements on waterbody and fishpond detections in a very large and diverse study area.

Single Day as Best Time Period
The method performs much better when using a shorter time frame with fewer images. The best period proved to be the single day. We do not expect that the day chosen will drastically impact results, but the image must have low cloud coverage (e.g., less than 10%) and it must be during a time period when we are fully confident fishponds contain water. Because of these guidelines and Bangladesh's climate, we are only able to obtain a few images that fit these standards.

Image Reducer and Index Combination Comparisions
Testing the mode reducer on the month-long period resulted in less water identification than the single day (Tables S18-S24), but more than the same period using the allNonZero reducer (Tables S25-S31). Meanwhile, the mode reducer did improve water identification, but since single-day images only have utilized one image, the reducer has no impact on its performance.
MNDWI and AWEI performed the best out of the three indices. Using the indices alone significantly improved upon the base method combination of the three indices. Both AWEI and MNDWI were created to better differentiate water from built-up land since built-up land has similar reflectance and absorption to water. Meanwhile, the introduction of the cloud removal technique (i.e., QA60) had minimal impact (<0.21%) on the total fishponds identified.

Introducing Edge Detection with a Laplacian Convolution Filter
The results of applying the Laplacian 5 × 5 convolution filter were mixed; however, every district saw a large increase in the overall number of fishponds identified, with some now seeing over 100,000 fishponds (Table S40). The filter did succeed in dividing up large polygons of identified water, hence the large increase in overall ponds identified. The cause for why some districts saw a decrease in their overlapping area between ground truth fishponds and identified water can be attributed to the division of these polygons. The Laplacian filter identifies significant changes in neighboring pixel values, but the filter still is being used on 10 m pixels. With some 10 m pixels now being labeled as a boundary, this removes area from the water.

Creating Training Datasets from Ground Truthing Surveys and Identifying the Best Classifier for Fishpond Classification
The training datasets created from our ground truthing surveys improved both the recall and the F1 score for both LR and CART classifiers. The increases in recall and F1 score show that utilizing ground truthing data improves the performance of both the LR and CART classifiers, but the decrease in precision increases the number of false positives compared to the base method.
The F1 scores are all very similar to each other, with the range from lowest value (0.706) to highest (0.780) being only 0.074. The LR recall and precision results are interesting because it has the worst precision, but the best recall meaning the LR is very good at preventing false negatives, but the worst at preventing false positives. RF has the exact opposite results with the highest precision and the lowest recall. Both LR and RF have lower F1 scores. CART and SVM fall between LR and RF for precision and recall, but because of their consistency, they have the two higher F1 scores. From these scores, we can conclude that LR is over-classifying fishponds and RF is under-classifying fishponds. When looking at the results in Figure 10, we can conclude that since SVM performed very similarly to LR, it must also be over-classifying. In addition, since RF typically identifies less than CART and RF under-classifies, CART can be the best final option.
Since the data shows that LR is over-classifying, we can conclude SVM must also be since SVM classifies 3 to 5 times as many fishponds as LR does. This conclusion is also backed by the fact that LR and SVM are very similar classifiers, so it is realistic that they would behave similarly [57]. The CART classifier numbers always fall below LR and above RF levels. This could also support our previous finding that CART provides the best classification out of the four.

Land Use and Ground Truth Fishpond Characteristics to Explain Trends in Results
All seven districts within the study region are very different from each other. This makes creating one over-arching method very difficult. We looked at the land use surrounding the ground truth fishponds to see if there was a trend in districts that performed better/worse than others. To do this, we tested buffer sizes of 10, 50, and 100 m around ground truth fishponds in each district separately (Tables S43-S45). The predominant land use types in all districts were cultivated land, water bodies, and artificial surfaces. Bhola and Barisal had the lowest percentage of ground truthing fishpond area identified before the machine learning classifier. These two districts had the highest percentage of artificial surfaces, with Bhola seeing 73% and Barisal having just under 50% with the 100 m buffer. All other districts had artificial surfaces as 26% or less of their land use at 100 m. Water and artificial surfaces may have very similar reflectance values [58]. In this instance, since there was a very high percentage of artificial surfaces and a very low percentage of water, the indexes may be identifying artificial surfaces instead of water during Otsu segmentation.
The median size ground truthing fishpond also varied significantly by district (Table S46). The district with the smallest median size ground truthing fishpond was Bhola (713 m 2 ), followed by Barisal (833 m 2 ), then Satkhira (1252 m 2 ). All other districts had median ground truthing fishponds larger than 2200 m 2 . The combination of having the least and smallest ground truthing fishponds may be why the improved methods performed the worst in Bhola. Barisal had more ground truth fishponds with 168, but they were additionally much smaller than other districts. This could explain why the enhanced method performed worse in Barisal than other districts.

Conclusions
Aquaculture is becoming increasingly important in regions that rely heavily on fish for food. Identifying where aquaculture is occurring is critical in understanding the industry, how it operates, and the areas it can improve in. With this information, we can estimate total production, inform policy, or analyze trends in aquaculture locations.
This study showed that the approach taken for identifying fishponds must be tailored to the characteristics of the fishponds, the fishponds' surrounding areas, and the satellite data being used. The best approach for this is to test the time period for image capture, the buffer size for image optimization with Otsu Segmentation, the combination of wateridentifying indexes, the image reducer type, and the machine learning classifier type. Utilizing edge-detection techniques and high-resolution imageries can improve the overall waterbody detection. For South-west and South-central Bangladesh, limiting the number of images for a specific time period results in an increased number of waterbodies detected. One reason for this is that additional errors can be introduced when combining multiple images together if an image does not capture the period with water. The buffer size for threshold optimization of water and non-water had an enormous impact on the results. The 5-pixel buffer performed the best for our study region. This proves that including more pixels in image thresholding results in better thresholds altogether, especially when the shape and size of the fishponds change throughout the year. Of the three water indices, NDWI and AWEI performed the best for the region, but this varies by location and land use/cover characteristics. The mode reducer did allow for more waterbodies to be identified than the allNonZero image reducer when working with more than one image for the study region. The image reducer type, in the end, did not matter for the enhanced method since one image is usually selected for fishpond detection. The training dataset for machine learning plays an important role in identifying the best classifier. Having diverse and representative ground truthing fishponds will improve the machine learning classifiers.
Some of the challenges that were observed in this study included (1) the size of the study area: Due to the size of the study area and the limitation of the GEE platform computational power, many of the tasks proposed in this study could not be executed for the entire region together. One way to address this issue was to divide the study area into seven districts and apply various tasks separately for each district. Meanwhile, as the computational power is increasing exponentially, it is expected that in the near future, all of these analyses can be performed within GEE without the need for dividing the study area. (2) Threshold optimization: In the original base method, one threshold value for water detection was obtained for the study region; however, our results showed that this approach does not work in a large region. Therefore, in this study, we optimized the threshold values based on the water index values at the district level. One alternative is to train the system based on the region's physiographical and climatological characteristics instead of its political boundaries. Using the agroclimatological zone is one alternative but is not a good representation of the fishpond distribution and attributes. (3) Highresolution imagery: Our study showed that using high-resolution imagery can significantly improve the overall performance of the enhanced method in detecting fishponds. Having high-resolution imagery would be very beneficial in identifying the boundaries between fishponds and breaking up any large polygons. As the cost for high-resolution imagery decreases, these six strategies will become increasingly more useful. (4) Rainy season: The rainy and monsoon season in Bangladesh is from June through mid-October. This limits the availability of high-quality images with low cloud coverage. In fact, our study showed that the viable images for this study are mainly available from January to May and October to December. One solution can be the combined usage of SAR and optical imaging in identifying small waterbodies. Ultimately, verification of the enhanced method in other regions is crucial to examine the reliability of the proposed method. The use of a cloud mask paired with image composites through machine learning could also enhance future results by removing any cloud pixels entirely.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/rs13234890/s1, Figure S1. Land use map of the study region (adapted from (China Ministry of Natural Resources [29]); Figure S2. Distribution of ground truthing data locations by type, Table S1. Land cover as percentage of area for each district within the study region (China Ministry of Natural Resources [29]); Table S2. Numbers of ground truthing data fishponds and non-fishponds by district;  Table S17. Buffer size comparison for Satkhira for both month and single day; Table S18. Mode reducer applied for one-month period in Bagerhat comparing water-identifying index combinations; Table S19. Mode reducer applied for one-month period in Barisal comparing water-identifying index combinations; Table S20. Mode reducer applied for one-month period in Bhola comparing water-identifying index combinations; Table S21. Mode reducer applied for one-month period in Gopalganj comparing water-identifying index combinations; Table S22. Mode reducer applied for one-month period in Jessore comparing water-identifying index combinations; Table S23. Mode reducer applied for one-month period in Khulna comparing wateridentifying index combinations; Table S24. Mode reducer applied for one-month period in Satkhira comparing water-identifying index combinations; Table S25. allNonZero reducer applied for onemonth period in Bagerhat comparing water-identifying index combinations; Table S26. allNonZero reducer applied for one-month period in Barisal comparing water-identifying index combinations; Table S27. allNonZero reducer applied for one-month period in Bhola comparing water-identifying index combinations; Table S28. allNonZero reducer applied for one-month period in Gopalganj comparing water-identifying index combinations; Table S29. allNonZero reducer applied for onemonth period in Jessore comparing water-identifying index combinations; Table S30. allNonZero reducer applied for one-month period in Khulna comparing water-identifying index combinations; Table S31. allNonZero reducer applied for one-month period in Satkhira comparing water-identifying index combinations; Table S32. Water-identifying index combination comparison for Bagerhat using single-day time period; Table S33. Water-identifying index combination comparison for Barisal using single-day time period; Table S34. Water-identifying index combination comparison for Bhola using single-day time period; Table S35. Water-identifying index combination comparison for Gopalganj using single-day time period; Table S36. Water-identifying index combination comparison for Jessore using single-day time period; Table S37. Water-identifying index combination comparison for Khulna using single-day time period; Table S38. Water-identifying index combination comparison for Satkhira using single-day time period; Table S39. Best results of the index tests for each of the seven districts; Table S40. Results from adding the Laplacian 5 × 5 convolution filter for each district; Table S41. Comparison of LR, CART, RF, and SVM results when applied to ground truthing fishpond data for Bagerhat, Barisal, Bhola, and Gopalganj during one day with that district's best performing water-identifying index; Table S42. Comparison of LR, CART, RF, and SVM results when applied to ground truthing fishpond data for Jessore, Khulna, and Satkhira during one day with that district's best performing water-identifying index; Table S43. Land use as percent of total area for 10 m buffer around ground truth fishpond locations for each district; Table S44. Land use as percent of total area for 50 m buffer around ground truth fishpond locations for each district; Table S45. Land use as percent of total area for 100 m buffer around ground truth fishpond locations for each district; Table S46. Median ground truth fishpond size for each district.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.