Article

Spatiotemporal Extraction of Aquaculture Ponds Under Complex Surface Conditions Based on Deep Learning and Remote Sensing Indices

1
Key Laboratory of Beibu Gulf Offshore Engineering Equipment and Technology, Beibu Gulf University, Qinzhou 535011, China
2
Faculty of Forestry and Environment, Universiti Putra Malaysia, Serdang 43400, Selangor, Malaysia
*
Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Sustainability 2025, 17(16), 7201; https://doi.org/10.3390/su17167201
Submission received: 28 June 2025 / Revised: 24 July 2025 / Accepted: 4 August 2025 / Published: 8 August 2025

Abstract

The extraction of water surfaces and aquaculture targets from remote sensing imagery remains challenging across different regions and conditions, especially because model parameters must be optimized manually. This study addresses the requirement for large-scale monitoring of global aquaculture using the Google Earth Engine (GEE) platform to extract high-accuracy, long-term data series of water surfaces such as aquaculture ponds. A Composite Water Index (CWI) method is proposed to distinguish water surfaces from non-water surfaces in Sentinel-2 imagery, thereby minimizing manual intervention in aquaculture management. The CWI is built from three remote sensing index algorithms: the Water Index (WI), the Modified Normalized Difference Water Index (MNDWI), and the Automated Water Extraction Index with Shadow (AWEIsh). The values of the three indices are obtained from 1000 grid points extracted from an overlaid map with three layers. A ternary regression method is then introduced to generate the coefficients of the CWI. Experimental results show that the classification accuracy of the WI is higher than that of the MNDWI and the AWEIsh, leading to a larger coefficient weight in the ternary regression. When different numbers of evenly distributed points are used to calculate the indices, the highest R² value is achieved with the coefficients corresponding to 600 points, and an accuracy of 94% can be achieved by the CWI method for water surface classification. The CWI algorithm can also be used to monitor changes in aquaculture ponds in Johor, Malaysia; it was found that the total aquaculture area expanded by 23.27 km² from 2016 to 2023. This study provides a potential means for long-term observation and tracking of changes in aquaculture ponds and water surfaces, as well as for water management and water protection.
Specifically, the proposed Composite Water Index (CWI) model achieved an mIoU of 0.84 and an overall pixel accuracy (oPA) of 0.94, significantly outperforming the WI (mIoU = 0.79), MNDWI (mIoU = 0.75), and AWEIsh (mIoU = 0.77), with p-values < 0.01. These improvements demonstrate the robustness and statistical superiority of the proposed approach in aquaculture pond extraction.

1. Introduction

Over the past 25 years, the scale of global aquaculture production has grown from 130,000 to 760,000 tons, underscoring its critical role in enhancing food security and nutrition worldwide [1]. A robust monitoring and management approach is therefore needed to ensure the sustainability and efficiency of the aquaculture industry, and remote sensing technology has proven to be an invaluable tool in this regard. Remote sensing has emerged as a transformative approach for the dynamic monitoring of aquatic ecosystems, offering new insights into the management of critical resources such as water quality, temperature, and biological conditions across vast areas with high precision. Early detection of abnormal events such as disease outbreaks, pollution, or harmful algal blooms becomes possible, allowing for timely and informed decision making and the optimization of resource allocation to enhance overall productivity and profitability in the aquaculture industry.
Significant advances in remote sensing have been documented across various regions. For instance, analysis of aquaculture in Jiangsu Province, China, using Google Earth Engine (GEE) from 1988 to 2018 revealed notable changes in aquaculture pond areas, indicating shifts in aquaculture practices over three decades [2]. Similarly, studies in Vietnam analyzing coastal aquaculture from 1990 to 2015 using Landsat data highlighted the irregular distribution and developmental stages of aquaculture ponds [3]. In Malaysia, the health and dynamics of aquaculture ponds are vital for maintaining ecological balance and supporting economic growth. Research has demonstrated the uses of GEE in monitoring these ecosystems, providing essential data for policy development aimed at sustainable aquaculture and environmental management [4]. Despite these advancements, existing methods often require extensive manual effort and are being challenged by the need for high temporal resolution and scalability. Traditional monitoring techniques, such as field surveys, are labor-intensive and not feasible for continuous, large-scale observations [5]. Remote sensing technology, however, facilitates the rapid acquisition of geographic data over large areas, proving advantageous for environmental monitoring [6,7].
In order to enhance the visibility of water features in Remote Sensing Images (RSIs), a spectral index can be calculated to identify and quantify the presence of water bodies based on the distinct reflectance of water relative to non-water surfaces. In previous studies, three approaches have been most popular for calculating spectral indices for water bodies from different spectral bands: the Modified Normalized Difference Water Index (MNDWI) [8], the Water Index (WI) [9], and the Automated Water Extraction Index with Shadows Elimination (AWEIsh) [10]. The MNDWI is an enhanced index that improves the recognition of water bodies in complex urban environments using the green and shortwave infrared (SWIR) bands. The WI uses the green and near-infrared (NIR) bands for the identification and monitoring of water bodies. The AWEIsh improves the precision of water information extraction by accounting for shadows, particularly through shadow removal. These indices provide useful tools for water body monitoring and management over time. Several spectral indices such as the NDWI [9], MNDWI [11], and WI [12] have been widely used to enhance water feature extraction from multispectral imagery, improving sensitivity to the presence of surface water.
Another approach for identifying and monitoring water bodies in RSIs is the composite water index, usually obtained by synthesizing multi-band remote sensing data or existing water monitoring indices such as those mentioned above [13,14,15]. Although composite water indices are commonly used in surface water monitoring, existing formulations have been criticized for limited robustness and reduced accuracy under complex surface conditions such as vegetated wetlands, shadowed regions, and mixed pixels [15,16,17]. These limitations highlight the necessity of developing improved water body extraction methods. The fusion of RSIs collected on different occasions or from different sensors such as Landsat and Sentinel-2 can improve the dynamics and accuracy of monitoring, so that real-time changes in water bodies can be represented by a composite index [18,19,20]. Machine learning and deep learning techniques also offer the possibility of more in-depth analysis of such indices to distinguish between different types of water bodies or levels of water quality [13,21,22,23]. Recent advances in remote sensing have enabled more refined analysis of water indices through the use of high-resolution datasets (e.g., Landsat and Sentinel-2) combined with machine learning or deep learning techniques [2,13]. However, such methods often require significant local computational resources and storage. In this study, we adopt Google Earth Engine (GEE) to implement large-scale, cloud-based water index extraction and preprocessing, offering improved efficiency, automation, and scalability for aquaculture pond recognition.
Based on the GEE platform, this study focuses on the spatio-temporal dynamics of aquaculture ponds and other water bodies in Johor, Malaysia, from 2016 to 2023, based on the fusion of existing water body extraction methods. Firstly, the water bodies within the study area were extracted using mature water body indices, and the applicability and advantages of these indices in the Johor region were assessed. Subsequently, a new Composite Water Index (CWI) was developed through algorithm synthesis to overcome the limitations of traditional water monitoring methods and provide a scalable technical solution for continuous water monitoring. The objective of this study is to provide a robust analytical framework to assess environmental change in Johor, Malaysia, and to support water management efforts for the advancement of global environmental protection and resource management.
Water body and farm pond extraction based on RSIs has been a popular topic, and many methods have been proposed for this application, such as deep learning [21,24], pixel-based classification [25], object-based classification [26], index-based methods [27], and heuristic methods such as multi-criteria decision analysis (MCDA) and AHP-based spatial modeling [28]. These methods offer different advantages depending on landscape complexity and input data quality. For the dynamic extraction of river water bodies and aquaculture ponds based on GEE, various algorithms have been proposed to improve the accuracy and efficiency of extraction. The MNDWI was used to extract tropical aquaculture ponds in Sungai Udang, Malaysia, achieving an accuracy of 81.87% and a Kappa coefficient of 0.61 [29]. A UNet with attention blocks and pyramid modules achieved an accuracy of more than 95% in the automatic extraction of multi-temporal water bodies in Dongting Lake and Poyang Lake [30]. These studies demonstrate the promise of deep learning methods in water body dynamics monitoring; however, the lack of high-resolution hyperspectral imagery and insufficient feature extraction influence the extraction accuracy [31]. While the NDVI and other indices such as the NDWI and MNDWI may have limitations in distinguishing vegetation shadows from water bodies in certain complex terrains, the NDVI still provides useful information for excluding densely vegetated regions near aquaculture ponds, thereby improving overall classification precision [32]. Nonetheless, emerging approaches such as deep learning-based water body extraction techniques, combined with GANs and multiscale input strategies, have shown significant advantages in improving the extraction accuracy of small water bodies, reaching an accuracy of 94.72% [22].
The index method is usually used to distinguish water bodies from non-water bodies by constructing specific indices, including the Normalized Difference Vegetation Index (NDVI) and the Normalized Difference Water Index (NDWI). The NDWI has been used to extract water bodies in India from Sentinel-2 imagery with an overall accuracy of 91.7% [13]. Hossain et al. also successfully extracted ponds in the coastal region of the Bay of Bengal using a modified NDWI [33]. Despite the index method’s fast and efficient performance, distinguishing between water bodies and shadows remains challenging.
Image-based classification is the most traditional method, mainly using spectral features to monitor water bodies and aquaculture ponds. Lu et al. [34] extracted Thai water bodies from Landsat imagery using image-based binary classification with an overall accuracy of 87.49%. Yu et al. [5] extracted Malaysian water bodies and aquaculture ponds from Sentinel-2 imagery through image-based binary classification with an average accuracy of 94.5% [35]. However, this method is susceptible to the influence of mixed pixels, resulting in degraded accuracy. The related algorithms for water body and aquaculture pond extraction are listed in Table 1.
In the object-based classification method, the RSI is segmented and classified based on the features of spectra, shape, and texture. Mahdavi et al. [41] successfully extracted fish ponds in North Carolina from Landsat images with an accuracy of 85%, and later fish ponds on Pingtan Island were extracted with the object-based method with an average accuracy of 94.5% [42]. It is noted that there is a trade-off between accuracy and computational load for the object-based method compared with the pixel-based method.
In recent years, the applications of machine learning methods in water body extraction have drawn much attention. Isikdogan et al. [43] applied random forests to Sentinel-2 images with an accuracy of 95%. Later Kanjir et al. [44] and Li et al. [45] used convolutional neural networks for water body extraction in WorldView-2 images, achieving an accuracy of about 95%.
In other efforts, Zhaohui et al. [46] adopted a large-scale and small water body extraction method, achieving an overall extraction accuracy of up to 98.14%. Xu et al. [47] showed that the spatial autocorrelation method can effectively extract continuous open water even under vegetation or algae cover. The Composite Water Body Index (CIWI) developed by Zhang et al. [48] has shown high efficiency in identifying water bodies in the study area, with over 90% accuracy. Schmitt et al. [20] introduced the s2cloudless algorithm, which performs cloud detection and removal on time-series Sentinel-2 data; the noise caused by cloud occlusion can be removed by analyzing multi-temporal images of the same location. Wegen et al. [49] explored the complementarity of SAR and optical remote sensing data fusion and found that SAR is insensitive to clouds and rain, giving it great potential for solving the cloud occlusion problem in water body identification. Tao et al. [50] used a random forest algorithm to process multi-source remote sensing images, taking multiple indicators such as the NDWI and MNDWI as feature inputs; thanks to the algorithm’s insensitivity to outliers, the multi-indicator features effectively improve the accuracy of water body classification. Weng et al. [51] improved the traditional semantic segmentation model by introducing the UNet model and an attention mechanism, achieving up to 85% accuracy in water body prediction.
Despite the progress in the extraction of water bodies and ponds, accurate extraction of small-scale or partially obscured ponds remains difficult, and there has been little systematic study of the effects of temporal and regional scales [22]. For regions lacking comprehensive water resources data, such as Ethiopia, Tesfaye and Breuer [52] evaluated seven water indices, including the Water Index (WI) and the shadow Automated Water Extraction Index (AWEIsh), for monitoring surface water with Sentinel-2 over extensive areas of Ethiopian water bodies, reporting Kappa coefficients and an overall accuracy of up to 98%.
In the applications of RSI, water body identification has been a challenging task, especially when distinguishing between obstructions such as mangroves and aquaculture ponds in large-scale and long-term sequence monitoring. This study explores a new CWI method specifically designed for time series analysis to identify and extract large areas of water bodies, such as aquaculture ponds with better efficacy and accuracy.
To address the challenges of large-scale aquaculture monitoring, this study employed Sentinel-2 multispectral imagery acquired via Google Earth Engine (GEE), with selection criteria based on minimal cloud cover and high temporal consistency. Preprocessing included atmospheric correction and cloud/shadow masking to ensure spectral reliability. Three robust water indices—WI, MNDWI, and AWEIsh—were integrated to form a novel Composite Water Index (CWI), optimized via regression modeling using 600 stratified sampling points across ponds, lakes, and rivers. Accuracy assessments were conducted using confusion matrix-based metrics such as overall accuracy and Kappa coefficient, demonstrating the efficacy of the proposed method for long-term, high-resolution monitoring of aquaculture ponds in Johor, Malaysia.
To support large-scale and long-term aquaculture monitoring, this study utilizes Sentinel-2 imagery from the Copernicus program, which provides high-resolution multispectral data across visible, near-infrared (NIR), and short-wave infrared (SWIR) bands. The imagery, with a spatial resolution of 10–20 m and a five-day revisit frequency, offers consistent time-series coverage suitable for detecting surface water dynamics. The analysis was conducted on the GEE platform, which enables efficient access to satellite archives, image preprocessing (e.g., atmospheric correction and cloud masking), and scalable implementation of water indices. Based on the spectral characteristics of water bodies, three indices—WI, MNDWI, and AWEIsh—were adopted and integrated into a Composite Water Index (CWI) to enhance water surface detection across complex landscapes.

2. Materials

2.1. Study Area

Johor is one of the thirteen states of Malaysia, located in the southern part of Peninsular Malaysia, bordering Pahang, Malacca, and Negeri Sembilan to the north and facing Singapore and Indonesia across the sea. As shown in Figure 1, Johor is enriched with various natural water bodies, including rivers and lakes. These aquatic ecosystems are crucial not only for the state’s biodiversity but also for its economy, supporting agriculture and fishing and providing water resources for the population.

2.1.1. The Rivers in Johor

As the most significant river in Johor, the Johor River (Sungai Johor) originates from Mount Gemuruh and flows into the Straits of Johor, serving as a vital source of water for the state and Singapore. The river basin is famous for its biodiversity and is a key area for local agriculture and aquaculture. The Muar River (Sungai Muar) serves as a natural boundary between Johor and the neighboring state of Melaka and is well known for its scenic beauty and the variety of ecosystems along its banks. The Segamat River (Sungai Segamat) originates from the Titiwangsa Range and flows through the town of Segamat, serving local agriculture and water supply. The Endau River (Sungai Endau) flows into the South China Sea, and its basin is known for its relatively pristine condition, home to one of the oldest tropical rainforests, the Endau-Rompin National Park.

2.1.2. The Lakes in Johor

Tasik Biru Kangkar Pulai, also known as the Blue Lake, is a man-made lake located near Johor Bahru, known for its striking blue waters. The Blue Lake is the result of former mining activities and has become a popular spot for photography. Tasik Layang-Layang provides a habitat for various bird species, especially during the migratory season. The Gunung Ledang Waterfalls and Pools on Mount Ophir are famous for their scenic waterfalls and natural pools, and Pulau Kukup in Johor, Malaysia, near the southwest corner of Singapore, is rich in mariculture [53]. These water bodies support diverse ecosystems and provide water resources, as well as recreational and tourism opportunities, playing a significant role in the ecology and culture of Johor. The local state government, alongside various environmental organizations, is dedicated to the conservation and management of these valuable natural resources, ensuring their sustainability for future generations.

2.2. Satellite Imagery

The remote sensing data sets used for this study are shown in Table 2.
Sentinel-2 imagery from the Copernicus program of the European Union was selected as the primary data source in this study due to its high spatial resolution and wide spectral coverage [54]. A total of 528 images (66 images per year from 2016 to 2023) with cloud coverage below 20% were obtained via the Google Earth Engine (GEE) platform to ensure seasonal consistency and sufficient temporal frequency [18]. Preprocessing procedures included atmospheric correction using Sen2Cor and cloud and shadow masking based on Fmask and the QA60 band, ensuring high-quality reflectance data for water index calculations. This study primarily used bands B2 (Blue), B3 (Green), B4 (Red), B8 (NIR), B11, and B12 (SWIR1 and SWIR2), which are critical for extracting water features using WI, MNDWI, and AWEIsh (Table 3). These spectral combinations enable effective discrimination between water, vegetation, and shadowed surfaces. The processed imagery covered an area of approximately 290 km2 in Johor, Malaysia, and provided a reliable basis for subsequent water surface classification and aquaculture monitoring [19].
All image bands and auxiliary datasets used in this study were resampled to a uniform 10 m resolution grid using bilinear interpolation within the Google Earth Engine (GEE) platform. This ensures pixel-wise spatial alignment across all layers, which is essential for multi-source raster computation. Given that raster-based classification and masking are executed at the pixel level, the spatial resolution of the output is determined by the grid size (10 m in this study), which provides a reference for evaluating the spatial precision of classification results.
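As a minimal, self-contained sketch of the bilinear resampling scheme named above (applied here to a toy grid, not the actual Sentinel-2 imagery), the value at a fractional pixel coordinate is a distance-weighted blend of the four surrounding cells:

```python
def bilinear(grid, x, y):
    """Bilinearly interpolate a value at fractional coordinates (x, y).

    grid is a 2D list indexed as grid[row][col]; x is the column
    coordinate and y the row coordinate. Out-of-range neighbors are
    clamped to the grid edge.
    """
    x0, y0 = int(x), int(y)
    x1 = min(x0 + 1, len(grid[0]) - 1)
    y1 = min(y0 + 1, len(grid) - 1)
    fx, fy = x - x0, y - y0
    # Blend horizontally along the top and bottom rows, then vertically.
    top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
    bot = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
    return top * (1 - fy) + bot * fy
```

For example, the center of a 2x2 grid with values 0, 1, 2, 3 interpolates to their mean, 1.5.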
A total of 528 Sentinel-2 images (66 images per year from 2016 to 2023) were selected for this study to ensure high spatial and temporal consistency in monitoring water bodies and aquaculture ponds across a representative area of approximately 290 km2 in Johor, Malaysia. Preprocessing included atmospheric correction using the Sen2Cor tool and cloud/shadow masking using Fmask and the QA60 band. Only images with minimal cloud coverage were retained for each year, ensuring optimal image quality for subsequent water index extraction and time-series analysis.
All Sentinel-2 Level-1C images were atmospherically corrected to Level-2A using the Sen2Cor processor within the GEE environment. This step converts TOA (Top-of-Atmosphere) reflectance to BOA (Bottom-of-Atmosphere) reflectance, minimizing atmospheric distortion. Cloud and shadow masking were implemented using both the QA60 band and the Fmask algorithm to exclude pixels affected by clouds or their shadows. Only scenes with cloud coverage below 20% were selected, and further cloud-filtering was applied at the pixel level using quality flags to ensure accuracy.
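The pixel-level QA60 filtering described above can be sketched as a bitmask test. The bit positions follow the published Sentinel-2 Level-1C QA60 band definition (bit 10 flags opaque clouds, bit 11 flags cirrus); this is an illustration, not the study's exact GEE code:

```python
# Sentinel-2 QA60 band: bit 10 = opaque clouds, bit 11 = cirrus.
OPAQUE_CLOUD_BIT = 10
CIRRUS_BIT = 11

def is_clear(qa60_value: int) -> bool:
    """Return True if a pixel's QA60 value has neither cloud bit set."""
    cloud = (qa60_value >> OPAQUE_CLOUD_BIT) & 1
    cirrus = (qa60_value >> CIRRUS_BIT) & 1
    return cloud == 0 and cirrus == 0
```

In GEE, the equivalent mask is built with `bitwiseAnd` on the QA60 band and applied with `updateMask`.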
The GEE cloud computing platform was used to access and process Sentinel-2 data. All water index calculations (WI, MNDWI, AWEIsh, and CWI) were implemented within GEE using JavaScript API (GEE Code Editor environment), while regression analysis and model optimization were carried out in Python 3.10.

3. Methods

Building on the discussions presented in the Introduction, this study adopts a Composite Water Index (CWI) strategy to address the limitations of individual indices and improve water surface classification under complex environmental conditions. The proposed methodology integrates WI, MNDWI, and AWEIsh using a regression-based approach, implemented on the GEE platform with Sentinel-2 imagery from 2016 to 2023. This section presents the detailed steps of the methodology.

3.1. Extraction Index of Aquaculture Ponds and Water Surface in the Study Area

(1)
Water Index (WI)
The Water Index (WI) makes use of the strong absorption of water in the near-infrared band and its high reflectivity in the green band to identify water surfaces [9,26], and it is given by

WI = (GREEN + RED) / (NIR + SWIR1)

where GREEN = B3, RED = B4, NIR = B8, and SWIR1 = B11. The WI is widely used for mapping aquaculture ponds worldwide.
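As an illustrative sketch (not the authors' implementation), the ratio form of the WI can be written as a per-pixel function, assuming band reflectance values have already been extracted; water pixels score high because water reflects in the visible bands but absorbs in the NIR and SWIR:

```python
def water_index(green, red, nir, swir1):
    """WI = (GREEN + RED) / (NIR + SWIR1).

    Water absorbs NIR/SWIR strongly, so the denominator is small over
    water and the index value is large; vegetation and soil reflect
    NIR strongly, pushing the value down.
    """
    return (green + red) / (nir + swir1)
```

With typical surface reflectance values, a water pixel (e.g., green 0.06, red 0.04, NIR 0.02, SWIR1 0.01) scores well above 1, while a vegetation pixel (high NIR) scores well below 1.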
(2)
Modified Normalized Difference Water Index (MNDWI)
The Modified Normalized Difference Water Index (MNDWI) has proven effective for extracting tropical farm ponds, for instance from Landsat 8 imagery on the GEE platform [8,20], and it is defined as

MNDWI = (GREEN − SWIR1) / (GREEN + SWIR1)

where GREEN = B3 and SWIR1 (Short-Wave Infrared 1) = B11. By using the SWIR1 band, the MNDWI outperforms the traditional NDWI, which uses the near-infrared band, in separating water bodies from vegetation. The SWIR band is strongly absorbed by water, while the green band is highly reflected by water, which makes the MNDWI more effective in suppressing interference from vegetation and buildings.
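A corresponding per-pixel sketch of the MNDWI (illustrative only); as a normalized difference, its values fall in [-1, 1], with positive values typically indicating water:

```python
def mndwi(green, swir1):
    """MNDWI = (GREEN - SWIR1) / (GREEN + SWIR1), bounded in [-1, 1].

    Water absorbs SWIR almost completely, so green > swir1 over water
    and the index is positive; over vegetation or bare soil swir1
    dominates and the index goes negative.
    """
    return (green - swir1) / (green + swir1)
```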
(3)
The Automatic Water Extraction Index (AWEI)
The AWEI is calculated from multiple bands, including blue, green, near-infrared, and two short-wave infrared bands, and it can therefore be applied to a variety of surface conditions with improved accuracy, including urban, vegetated, and other complex surface environments [10]. The AWEI has two forms, for the shadow case (denoted AWEIsh) and the non-shadow case (denoted AWEInsh), respectively, such that

AWEIsh = BLUE + 2.5 × GREEN − 1.5 × (NIR + SWIR1) − 0.25 × SWIR2

AWEInsh = 4 × (GREEN − SWIR1) − (0.25 × NIR + 2.75 × SWIR2)

where GREEN = B3, NIR = B8, SWIR1 = B11, SWIR2 = B12, and BLUE = B2 are the reflectance values in the respective bands. In Johor, vegetation coverage is considerable and the shadow areas of trees and buildings are large; therefore, the shadow form AWEIsh is used in this study, while the non-shadow form AWEInsh applies to scenes without significant shadow.
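Both AWEI variants can be sketched as per-pixel functions; the band coefficients below assume the standard AWEI formulation, and the example reflectance values in the usage note are illustrative, not taken from the study's data:

```python
def awei_sh(blue, green, nir, swir1, swir2):
    """Shadow variant: suited to scenes with building/vegetation shadow."""
    return blue + 2.5 * green - 1.5 * (nir + swir1) - 0.25 * swir2

def awei_nsh(green, nir, swir1, swir2):
    """Non-shadow variant: for scenes without significant shadow."""
    return 4 * (green - swir1) - (0.25 * nir + 2.75 * swir2)
```

For a typical water pixel (low NIR/SWIR reflectance) both variants are positive, while for a vegetation pixel (high NIR) both go strongly negative, so a zero threshold separates the two classes.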
By suppressing the reflected signals of land and vegetation in the RSI, the MNDWI can achieve better performance in the identification of water bodies, and, therefore, it is particularly suitable for identifying large areas of open water and unconventional water forms. The multi-band synthesis of AWEI can also effectively reduce the misclassification caused by shadow and terrain under urban and complex surface conditions.
Among the several water indices, the NDWI highlights water features by maximally suppressing vegetation information, effectively distinguishing water bodies from vegetation and mountain shadows. Although the accuracy of the WI may fluctuate under different background conditions, the RSIs of the study area are relatively clear, without buildings, ice, thin clouds, or mountain shadows.

In tropical rainforest areas, water boundaries can be extracted more accurately with the MNDWI, especially in areas like eutrophic lakes, where microfeatures such as the distribution of suspended sediments can be effectively revealed. Compared with the NDWI, the MNDWI reduces interference from buildings and other man-made structures by using the mid-infrared band in place of the near-infrared band.

In the AWEIsh method, the combination of multiple bands helps improve the classification accuracy of shadows and dark surfaces, and non-water pixels can be effectively removed to enhance the extraction accuracy for aquaculture ponds under mangrove cover. Although the AWEIsh is vulnerable to highly reflective surfaces such as ice and snow, this is not an issue for the tropical rainforest area of this study.
WI was chosen for its sensitivity to water features in agricultural or aquaculture settings, where water is relatively turbid. MNDWI suppresses the signal of vegetation and man-made structures, which is particularly valuable in Johor’s complex mixed-cover environments. AWEIsh is especially effective in removing shadows caused by mangroves and built-up areas, which are common along Johor’s aquaculture coastlines. These indices were selected based on their proven performance in tropical environments and their complementary strengths in handling turbidity, shadow, and vegetation interference—making them regionally suitable and collectively robust for aquaculture monitoring in Johor.

3.2. The Composite Water Index (CWI)

In order to improve the accuracy of water body extraction from multi-band remote sensing data, a Composite Water Index (CWI) method is proposed to take advantage of different water body indices, such as WI, MNDWI, and AWEIsh. The CWI is calculated with the weighted combination of these indices, such that
CWI = β0 + β1 × WI + β2 × MNDWI + β3 × AWEIsh + ϵ
where β0 is the intercept; β1, β2, and β3 are the coefficients of WI, MNDWI, and AWEIsh, respectively; and ϵ is the error term.
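The weighted combination itself is a simple linear form; as a minimal sketch, with placeholder weights standing in for the fitted coefficients (the actual values come from the regression described below):

```python
def composite_water_index(wi, mndwi, aweish, beta):
    """CWI = b0 + b1*WI + b2*MNDWI + b3*AWEIsh.

    beta = (b0, b1, b2, b3) are regression-fitted weights; the values
    used in any call here are placeholders, not the study's results.
    """
    b0, b1, b2, b3 = beta
    return b0 + b1 * wi + b2 * mndwi + b3 * aweish
```

For example, with placeholder weights (0.1, 0.2, 0.3, 0.4) and index values WI = 2.0, MNDWI = 0.5, AWEIsh = 0.3, the CWI evaluates to 0.77.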
The coefficients β0, β1, β2, and β3 were estimated using the Ordinary Least Squares (OLS) method, by minimizing the sum of squared residuals between the predicted CWI values and manually annotated water/non-water labels from a training set of 1000 stratified sample points. The objective function is defined as follows:
min_β Σ_{i=1}^{n} (CWI_i^predicted − CWI_i^actual)²
The regression was implemented in Python using the LinearRegression module from scikit-learn. To ensure robustness, the model was trained and validated using 10-fold cross-validation, resulting in a high coefficient of determination (R2 = 0.982) and low mean squared error (MSE = 0.0001). This confirms the model’s strong generalization ability and minimizes reliance on manual thresholding.
Furthermore, the training samples were selected using a stratified sampling strategy to include various aquatic types (e.g., rivers, lakes, aquaculture ponds) and non-aquatic covers (vegetation, built-up areas), enhancing the representativeness of the model.
An objective function is defined to determine the coefficients by calculating the sum of the squared difference between the predicted value and the actual value of the sample data, such that
S = Σ_{i=1}^{m} (Y_i − Ŷ_i)²

where Y_i is the observed value and Ŷ_i is the value predicted by the model. This objective can be minimized with the least squares method, with the coefficients β0, β1, β2, and β3 all initialized to 0. A system of linear equations is established such that
XᵀXβ = XᵀY

where X is the matrix of independent variables, β = [β0 β1 β2 β3]ᵀ, and Y is the vector of dependent variables. This system can be solved as

β = (XᵀX)⁻¹XᵀY
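The closed-form solution can be sketched with synthetic data: hypothetical index values are generated from known weights (plus a small noise term standing in for ϵ) so the recovered coefficients can be checked. NumPy's least-squares solver is used in place of an explicit matrix inverse for numerical stability; all values are illustrative, not the study's fitted coefficients:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic index values standing in for the sampled points
# (illustrative only, not the study's data).
n = 600
wi = rng.uniform(0.0, 4.0, n)
mndwi = rng.uniform(-1.0, 1.0, n)
aweish = rng.uniform(-1.0, 1.0, n)

# Ground-truth weights used only to generate the target, so the
# solver's output can be verified against known values.
true_beta = np.array([0.05, 0.20, 0.35, 0.40])

# Design matrix with an intercept column; small Gaussian noise plays
# the role of the error term epsilon.
X = np.column_stack([np.ones(n), wi, mndwi, aweish])
y = X @ true_beta + rng.normal(0.0, 0.01, n)

# Normal equations beta = (X^T X)^(-1) X^T Y, solved stably via lstsq.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With 600 points and low noise, the recovered coefficients match the generating weights to two decimal places.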
The CWI method combines the advantages of the three indices with weighted coefficients to distinguish water boundaries and improve the identification accuracy of water bodies obstructed by mangroves. The CWI takes into account factors such as sensitivity to environmental changes and adaptability to different water body features, thereby maintaining high extraction accuracy and robustness under various conditions and showing unique advantages and potential in applications requiring long-term and large-scale monitoring. Figure 2 shows the flow chart of the CWI process. It can be seen that a multiple regression process is used in the CWI to synthesize the composite water index from the WI, MNDWI, and AWEIsh.
In this study, the WI, MNDWI, and AWEIsh algorithms were used to obtain data sets for the three layers, and grids were laid out on their respective layers to extract uniform points. Each coordinate point corresponded to the values of WI, MNDWI, and AWEIsh, and the CWI was calculated using the ternary regression equation. The CWI was implemented on the GEE platform and ArcGIS Pro 3.0.1 software using a PC with an i7 CPU and 16 GB RAM, and statistical modeling was performed in Python.
To ensure representativeness, the geographic coordinates for the 1000 sample points were selected using a stratified random sampling strategy. The study area was pre-classified into water body subtypes (rivers, lakes, aquaculture ponds) and major non-aquatic land covers (urban areas, vegetation, bare land). Proportional random sampling was then conducted within each stratum to ensure balanced representation across environmental types. This strategy improves the generalization capability of the regression model and reduces sampling bias.
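A minimal sketch of proportional stratified sampling, under the assumption that candidate points have already been grouped by land-cover stratum (the stratum names and counts below are hypothetical):

```python
import random

def stratified_sample(strata, n_total, seed=0):
    """Draw from each stratum in proportion to its size.

    strata: dict mapping stratum name -> list of candidate point IDs.
    Returns a dict with the same keys and proportionally sized samples,
    so every land-cover type is represented in the n_total points.
    """
    rng = random.Random(seed)
    pool = sum(len(points) for points in strata.values())
    sample = {}
    for name, points in strata.items():
        k = round(n_total * len(points) / pool)  # proportional allocation
        sample[name] = rng.sample(points, min(k, len(points)))
    return sample
```

For instance, with 500 pond, 300 river, and 200 urban candidates and n_total = 100, the allocation is 50/30/20 points per stratum.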
The Sentinel-2 RSIs covering the research area in Johor were obtained from the GEE platform, and the WI, MNDWI, and AWEIsh indices were calculated from pre-processed images that had undergone atmospheric correction, radiometric calibration, and cloud masking. Each index highlights a different aspect of a water body based on its spectral characteristics. A total of 1000 points covering the study area were randomly selected, and the WI, MNDWI, and AWEIsh values at the same geographical coordinates were normalized for uniform comparison.
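Equations (1)–(3) are not reproduced in this excerpt. As an illustration, the standard MNDWI (Xu, 2006) and AWEIsh (Feyisa et al., 2014) formulations can be sketched as below, where the band arguments correspond to Sentinel-2 surface reflectance (B2 blue, B3 green, B8 NIR, B11/B12 SWIR); the WI formulation is omitted, and the reflectance values in the demonstration are hypothetical.

```python
def mndwi(green, swir1):
    # Modified Normalized Difference Water Index (Xu, 2006):
    # positive over open water, negative over most non-water surfaces.
    return (green - swir1) / (green + swir1)

def aweish(blue, green, nir, swir1, swir2):
    # Automated Water Extraction Index with shadow suppression
    # (Feyisa et al., 2014).
    return blue + 2.5 * green - 1.5 * (nir + swir1) - 0.25 * swir2

# Hypothetical reflectances: a clear-water pixel vs. a bare-land pixel.
print(mndwi(0.06, 0.01), mndwi(0.08, 0.20))
```

On the GEE platform, the same expressions are typically applied band-wise to an image with `normalizedDifference` or `expression` rather than per pixel in Python.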
In the experiment carried out on the GEE platform, the global surface water layer JRC/GSW1_3/GlobalSurfaceWater was imported to retain water bodies, and WI, MNDWI, and AWEIsh values at the same latitude and longitude were collected at equidistant points. The calculated indicators (the WI, MNDWI, and AWEIsh) were then incorporated into the regression model as independent variables, and sets of 400, 600, 800, and 1000 coordinate points were selected for comparison experiments to find the optimal solution. A multiple regression model was designed in a Python statistical package with the WI, MNDWI, and AWEIsh as inputs and the CWI as the dependent variable. The coefficients were trained to minimize the objective function when predicting the CWI, and the performance of the model was evaluated with metrics such as R-square (R²) and Root Mean Square Error (RMSE). Each optimal coefficient represents the weight of the corresponding index in the final CWI output: the larger the coefficient, the stronger the influence of that index on the CWI. All post-classification spatial analyses and map visualizations were performed using ArcGIS Pro version 3.0.1.
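The two evaluation metrics named above can be computed as follows; this is a generic sketch, not the study's code.

```python
def r_squared(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def rmse(y_true, y_pred):
    # Root Mean Square Error of the predictions.
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return mse ** 0.5
```

Comparing R² across the 400-, 600-, 800-, and 1000-point fits then amounts to calling `r_squared` on each fit's predictions and keeping the coefficient set with the highest value.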

4. Results

4.1. The Optimized WI, MNDWI, and AWEIsh for Surface Water Detection

Figure 3 shows the RSI of the river area in the south-eastern corner of Johor, where large-scale aquaculture is concentrated. Figure 4 shows the post-classification RSIs obtained using the three typical extraction indices, namely the WI, MNDWI, and AWEIsh.
In the Johor region shown in Figure 3, the original RSI shows that the water surfaces of aquaculture ponds, such as fish or shrimp ponds, are often rich in aquaculture waste and nutrients. These substances often cause the water surface to appear green or brown, and the water quality is conducive to the growth of algae and other aquatic plants. In aquaculture pond areas, there are usually frequent surface activities, such as feeding and aeration operations with floating devices or fences, which are recorded clearly in the RSI. These surface activities can serve as indicators for the proposed method to identify and monitor the aquaculture industry and the environmental changes around the pond areas.
Another challenge for water surface identification is that inactive pond areas show different representations in the RSI and are difficult to identify. For example, abandoned aquaculture ponds appear as dry land in the RSI, leading to misclassification in large-scale image processing. This situation requires field inspection and sampling to confirm the presence of aquaculture ponds and avoid misclassification.
In Figure 4, the spatial distribution and areas of the water surface are marked in blue after water extraction by each index from the RSI. In general, the water surfaces have relatively regular shapes, such as rectangles or squares, which usually indicates that they were manually constructed as aquaculture ponds. This is even more apparent when the water surfaces are arranged in an orderly manner, for example in rows or columns, which is the typical layout of an aquaculture farm.
The second column of Figure 4a–d depicts the MNDWI, which can hardly delineate the water bodies: only the main areas of the breeding ponds are identified, with a small extraction range and fuzzy boundaries. The WI results in Figure 4a–d show that clear boundaries and contours can be obtained to distinguish the ponds from the surrounding areas; even the narrow shores of the breeding ponds can be observed. Meanwhile, the AWEIsh results in Figure 4a–d show that a large water area, including both the actual breeding ponds and the pond banks, has been identified as a pond, resulting in the dilation of several small breeding ponds into a single large area.
In general, the extraction results obtained by the WI, MNDWI, and AWEIsh present different advantages and limitations in practical applications, and the CWI is therefore designed to synthesize the advantages of the three indices to improve the accuracy and efficiency of aquaculture pond monitoring. Table 4 shows the accuracy of index extraction for Johor's water bodies. The WI achieves the highest overall accuracy and Kappa coefficient in distinguishing water from non-water areas. Although the overall accuracy of the MNDWI (0.930) is slightly lower than that of the WI (0.931), its high Kappa coefficient indicates strong agreement with the ground truth data. The overall accuracy and Kappa coefficient of the AWEIsh are slightly lower than those of the MNDWI and WI, but it covers a larger water body area. The three indices therefore perform differently in water body classification.

4.2. The Computational Load for Composite Water Index (CWI)

A uniform grid was applied across the three layers, and the WI, MNDWI, and AWEIsh values for the pixels at 1000 grid points were extracted. The ternary regression algorithm developed in Python was used to calculate the coefficients (β0, β1, β2, β3) from these values. To determine the optimal sample size, this study evaluated 1000, 800, 600, and 400 points, with the coefficient set yielding the highest R² value considered the best combination.
In this experiment, random data points were collected from the study area, each containing the CWI value obtained from Equations (5a) and (5b) along with the corresponding WI, MNDWI, and AWEIsh values. Table 5 presents the detailed results for the evaluation index coefficients of the WI, MNDWI, and AWEIsh.
Table 6 presents the key validation metrics derived from the classification results of the Composite Water Index (CWI), including User Accuracy, Producer Accuracy, F1-score, Overall Accuracy, and the Kappa Coefficient. These metrics were calculated by comparing the CWI-derived classification outputs against high-resolution reference data obtained through expert visual interpretation of Google Earth imagery.
The results demonstrate high classification reliability, with a User Accuracy of 96.7% and Producer Accuracy of 96.2%, indicating both the precision and completeness of water body extraction. The F1-score, which harmonically balances precision and recall, reaches 0.97, further confirming the robustness of the classification model.
The Overall Accuracy of 96.9% reflects the model’s general classification effectiveness across all land cover types, while the Kappa Coefficient of 0.939 suggests a strong agreement between the predicted classification and the reference data beyond random chance.
These metrics collectively validate the reliability and generalizability of the CWI model for large-scale water body mapping in the study area.
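For reference, all five validation metrics above can be derived from a binary water/non-water confusion matrix as sketched below; the counts used in the check are hypothetical, not the study's confusion matrix.

```python
def accuracy_metrics(tp, fp, fn, tn):
    # tp/fp/fn/tn: pixel counts of the binary water/non-water confusion matrix.
    n = tp + fp + fn + tn
    user = tp / (tp + fp)          # User Accuracy (precision for water)
    producer = tp / (tp + fn)      # Producer Accuracy (recall for water)
    f1 = 2 * user * producer / (user + producer)
    oa = (tp + tn) / n             # Overall Accuracy
    # Expected chance agreement for the Kappa coefficient.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (oa - pe) / (1 - pe)
    return user, producer, f1, oa, kappa
```

With a balanced hypothetical matrix (tp = tn = 45, fp = fn = 5), the function yields 0.9 for the first four metrics and a Kappa of 0.8, illustrating how Kappa discounts chance agreement relative to Overall Accuracy.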
As evidenced by the index coefficients in Table 5 and Table 6, the highest R2 value (0.984) and the lowest MSE were obtained when using 600 sample points. Therefore, the coefficients derived from the proposed method with 600 samples can be considered the optimal solution for extracting aquaculture ponds and water bodies in the Johor area from 2016 to 2023. The CWI formula is given as
CWI = (−0.5625) + 0.5954 × WI + 0.0004 × MNDWI − 0.2046 × AWEIsh
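Applied pixel-wise, the fitted equation reduces to the following one-liner; the coefficients are taken directly from the 600-point regression result reported here.

```python
def cwi(wi, mndwi, aweish):
    # Composite Water Index from the 600-point ternary regression:
    # intercept -0.5625, weights 0.5954 (WI), 0.0004 (MNDWI), -0.2046 (AWEIsh).
    return -0.5625 + 0.5954 * wi + 0.0004 * mndwi - 0.2046 * aweish
```

Because the expression is purely arithmetic, it also works element-wise on NumPy arrays of index values, so an entire scene can be transformed in one call.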

4.3. The Analysis of the CWI Results

In the experiment, the obtained CWI values were classified into five levels using the reclassification function of ArcGIS Pro 3.0.1. Figure 5 illustrates the percentage of pixels in each classification level. Compared to 2017 and 2018, the proportion of aquaculture pond area in the study area slightly decreased from 2016 to 2019, increased in 2020, and slightly decreased again in 2021, before increasing year by year in 2022 and 2023.
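A five-level reclassification can be reproduced with equal-interval breaks, as sketched below; the break method is an assumption, since the paper does not state which ArcGIS Pro classification scheme was used.

```python
def reclassify(values, n_levels=5):
    # Equal-interval reclassification of CWI values into levels 1..n_levels.
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_levels
    # The maximum value would fall into level n_levels + 1, so clamp it.
    return [min(int((v - lo) / width) + 1, n_levels) for v in values]
```

The per-level pixel percentages shown in Figure 5 then follow from counting how many pixels fall into each level and dividing by the total.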
After processing the classified images, the statistical data were compiled. Table 7 illustrates the spatial and temporal changes in aquaculture ponds and water surfaces extracted by the CWI algorithm from 2016 to 2023. The dynamic development of the water surface distribution in Johor reflects the corresponding growth of the aquaculture industry.
From the analysis of changes in aquaculture ponds and surrounding land features in Johor between 2016 and 2023, some significant trends and patterns can be observed. The aquaculture industry has shown vigorous growth, with the area increasing from 19.91 km2 to 43.18 km2, a total growth of 23.27 km2. Meanwhile, the mangrove areas have continuously decreased, declining from 266.32 km2 in 2016 to 238.11 km2 in 2023, a total decrease of 28.21 km2; this change may indicate ecological degradation and the loss of natural habitats. The total water body area in Johor experienced an increase followed by a decrease, ultimately falling from 8120.71 km2 in 2016 to 8087.47 km2 in 2023, a net decrease of 33.24 km2. Finally, other land uses have increased steadily from 5997.86 km2 to 6036.04 km2, a total increase of 38.18 km2; this growth may cover various uses such as urban development, transportation infrastructure, or other purposes.
Based on Figure 6, the water area in the northeast corner increased from 2017 to 2018 compared to 2016. However, from 2019 to 2021, the water extraction results in this region showed a relatively stable trend with no significant area change. By 2022 and 2023, the central region of Johor exhibited notable changes in water characteristics, indicating a significant increase in the area of water bodies. These changes may be related to climate change, human activities, or regional water management policy changes. Further data analysis and research are needed to understand the specific factors driving these changes.

4.4. Extraction Results of Water Surface and Aquaculture Ponds

The weight coefficients of the WI, MNDWI, and AWEIsh in the CWI model were calculated from 600 sample points using the ternary regression equation in Equation (6). The calculated overall accuracy and Kappa coefficient of the CWI were 94% and 0.940, respectively. Figure 7 shows the aquaculture ponds (in blue) extracted using the CWI algorithm, demonstrating much clearer texture features than Figure 4. The blue regions typically represent water bodies and aquaculture ponds, while the yellow areas are barren land or land with sparse vegetation, commonly found at the urban periphery or in transitional zones between different land uses. The texture within each classified area appears uniform, demonstrating the stable performance of the CWI model.
Different land cover types are indicated with distinct colors, and the boundaries between regions are sharper and better defined than in Figure 5. The minimal noise within individual areas and the strong color saturation in Figure 7 enhance visual contrast and detail, especially in boundary areas. The results in Figure 7 demonstrate that the CWI method can effectively classify fine details in medium-resolution, coarse-granularity RSIs.
To further validate the robustness of the proposed Composite Water Index (CWI) model, we conducted a statistical significance analysis comparing its performance with three benchmark indices: WI, MNDWI, and AWEIsh. The comparison was based on the evaluation metrics of mean Intersection over Union (mIoU) and overall pixel accuracy (oPA), calculated across 30 randomly selected validation regions within the study area.
As summarized in Table 8, the CWI model achieved the highest mean mIoU of 0.84 and oPA of 0.94, substantially outperforming the baseline methods. To assess the statistical reliability of these performance differences, paired t-tests were performed between the CWI and each baseline model. The resulting p-values were all below 0.01, indicating that the improvements of the CWI are statistically significant at the 99% confidence level. Furthermore, the 95% confidence intervals for both mIoU and oPA show no overlap between the CWI and other methods, further confirming the superiority of the proposed model.
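A paired t-test on the per-region metric pairs reduces to the following computation on the paired differences. This sketch returns the t statistic and degrees of freedom; the p-values reported above would come from the t distribution with those degrees of freedom (e.g. `scipy.stats.t.sf`), which is not reimplemented here, and the sample values in the check are hypothetical.

```python
import math

def paired_t(a, b):
    # t statistic and degrees of freedom for paired samples a, b
    # (e.g. per-region mIoU of the CWI vs. a baseline index).
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n), n - 1
```

Equivalently, `scipy.stats.ttest_rel(a, b)` computes the same statistic together with its two-sided p-value.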
All preprocessing operations were executed on Google Earth Engine (GEE), whereas model training and inference were conducted on a local computer equipped with an Intel(R) Core i7-7500U CPU @ 2.70 GHz, 16 GB RAM, and an NVIDIA GeForce 940MX (2 GB VRAM), running Windows 10 with Python 3.10.
Due to limited GPU resources, training each epoch took approximately 10 min on 512 × 512 patches, with an inference time of around 0.25 s per image. These results demonstrate the practicality of deploying the proposed model on modest computing environments, which is beneficial for wider accessibility.

5. Discussion

5.1. Adaptability and Superiority of the CWI Method

In object identification tasks with RSIs, it is essential to apply a suitable threshold adapted to the specific geographical environment and objective. In this context, the proposed CWI model calculates the weights through ternary regression and performs neighborhood-value fine-tuning to adapt to water surface identification. In the experiments, this approach showed better multi-temporal extraction performance than the three individual indices (the MNDWI, WI, and AWEIsh). However, when using GEE for water body extraction, the threshold of the water body index affects the range and accuracy of identification because of the influences of atmospheric correction, surface cover, seasonal changes, and water quality variation. Previous research on water surface index extraction generally only considers thresholds above zero, but serious variations may occur in low-lying or marshy ground, especially when the threshold is higher than zero [55]. For example, in the Johor area, the effective output range of the MNDWI obtained from Equation (1) is [−0.8092, 2.1046], resulting in a water body extraction limit of [−0.2, 0.5]. The effective output range of the WI obtained from Equation (2) is [0.3076, 2.1068], leading to a water surface extraction limit of [0.75, 2]. Likewise, the effective output range of the AWEIsh obtained from Equation (3) is [−0.4054, 0.6137], with a water surface extraction limit of [0, 0.224]. The difference in extraction limits is one of the main causes of false negatives (missing actual aquaculture areas) and false positives (misidentifying non-water areas as aquaculture areas). It is therefore necessary to calibrate the threshold for a specific region, or to analyze multi-temporal data for seasonal changes and select a suitable threshold accordingly.
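The per-index extraction limits quoted above can be applied as simple interval masks. This is an illustrative sketch of the thresholding step, not the GEE implementation, and the index values in the checks are hypothetical.

```python
# Water extraction limits reported for the Johor scene: (lower, upper).
LIMITS = {"mndwi": (-0.2, 0.5), "wi": (0.75, 2.0), "aweish": (0.0, 0.224)}

def water_mask(values, index):
    # True where the index value falls inside its calibrated water interval.
    lo, hi = LIMITS[index]
    return [lo <= v <= hi for v in values]
```

Because each index has its own calibrated interval, applying a single global threshold (e.g. "greater than zero") to all three would misclassify exactly the low-lying and marshy areas the text describes.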
Traditional vegetation indices are insufficient for water surface classification under complex environmental conditions, as demonstrated in previous studies [23,56,57]. Deep learning methods, particularly convolutional neural networks, offer more robust solutions due to their ability to learn abstract spatial features and handle mixed-pixel problems effectively. Consequently, the extraction of water surfaces using the CWI method achieves optimal performance. However, part of the aquaculture area is adjacent to mangrove cover, posing a challenge in weighting the vegetation and water indices. High vegetation indices restrict the extraction of water surfaces from RSIs with heavy mangrove cover, and even extraction results with LiDAR and hyperspectral data are unsatisfactory, making deep learning methods a more suitable solution for this problem.
Multiple regression analysis with remote sensing data requires the rigorous selection of data points to ensure the robustness and validity of the output. This is especially essential for ternary linear regression, where the selected points must accurately represent the entire data set to prevent bias, a challenging requirement in remote sensing applications given the varying environmental conditions within the study area. A valid regression analysis demands a number of samples well in excess of the number of model parameters; it is usually recommended that the number of observations be at least 10~15 times the number of variables in the model. In this study, the water surface in the Johor area is widely distributed, so non-water features must be excluded. Through the experiments, it was found that a minimum of 600 sampling points is required to capture adequate spatial features and maintain the performance of the CWI model across different landscapes. Random selection of data points helps reduce selection bias and ensures that each sample has an equal opportunity to contribute to the calculation, which is particularly important in remote sensing applications where environmental conditions can vary considerably.
Several recent studies have demonstrated the practical value of deep learning in agricultural and environmental applications. For instance, Malik et al. [58] estimated climate change vulnerability using both analytical and DL-based approaches in the Jammu, Kashmir, and Ladakh regions. Similarly, Gulzar [59] successfully applied MobileNetV2 with deep transfer learning for precise fruit classification, while Gulzar et al. [60] introduced a custom CNN model to differentiate alfalfa varieties based on leaf images. These applications, like our aquaculture pond classification task, rely on the capacity of deep learning to capture subtle spectral and spatial variations, making them highly suitable for monitoring dynamic environmental targets.
Several previous studies have examined the performance of individual water indices such as the WI, MNDWI, and AWEIsh for surface water extraction in various hydrological and geographical settings. For instance, Xu et al. [8] demonstrated that the MNDWI was more effective than the NDWI in urban areas due to its enhanced suppression of built-up land noise. Similarly, Feyisa et al. [10] reported that the AWEIsh performed better in shadow-contaminated zones but may produce higher commission errors in vegetated regions. In comparison, our results show that the proposed CWI method achieved a higher mean mIoU (0.84) and overall accuracy (96.9%) than any individual index, indicating its improved adaptability to heterogeneous landscapes like aquaculture zones in Johor. These findings align with recent research advocating for index fusion and machine learning to enhance water body detection accuracy [15,22]. Our approach contributes to this line of work by offering a regression-based composite index trained on stratified reference data, thereby reducing subjectivity in threshold selection and improving generalization.

5.2. Robustness and Applicability of the CWI Method

The CWI algorithm developed in this study demonstrates strong temporal stability and robustness across different seasons and years. By applying the method to multi-year imagery (2018–2023), the results consistently identified aquaculture pond boundaries with high accuracy, even under varying atmospheric conditions such as haze, cloud cover, and seasonal water level fluctuations. This temporal consistency aligns with the findings of Du et al. [22], who emphasized that stepwise unsupervised extraction models perform reliably in the long-term monitoring of aquaculture regions using high-resolution imagery.
Beyond quantitative mapping, the temporal changes observed—particularly the expansion of aquaculture ponds in the northern and central regions—can be attributed to multiple drivers. These include government incentives for coastal aquaculture development, relaxation of land use regulations, and the implementation of integrated farming models [4]. Environmental factors, such as rising sea surface temperatures and reduced freshwater input, may have also contributed to the spatial shift from inland to brackish water aquaculture. Similar trends were noted by Zhe et al. [3] in Vietnam, where policy reforms and economic demand catalyzed aquaculture sprawl.
Regarding turbidity, the algorithm incorporates the MNDWI and CWI, which have proven more resilient to suspended sediment interference than the NDWI [8,20]. The robustness of these indices arises from their spectral design, which enhances the water/non-water contrast in turbid environments. Additionally, pre-processing using Sentinel-2's Scene Classification Layer (SCL) enabled effective masking of cloud shadows and turbidity artifacts, thereby maintaining the integrity of the classification results.
The influence of aquatic vegetation—such as algae blooms or submerged plants—was addressed through seasonal image selection. Dry-season imagery was prioritized to minimize vegetation-induced distortion in spectral signatures, a strategy also adopted by Musa et al. [5]. While these measures reduce misclassification, we acknowledge that spectral confusion may persist in cases of dense phytoplankton presence. Incorporating red-edge and chlorophyll-sensitive bands in future work may further mitigate this issue.
Finally, the CWI, developed here for brackish and estuarine aquaculture systems, demonstrated high adaptability across the studied climatic zones. The index's performance was validated under both semi-arid and tropical monsoonal conditions, indicating its potential for broader application. Nevertheless, further calibration may be necessary for freshwater environments with high canopy coverage or marine zones with coral-algae complexity, as noted by Ottinger et al. [1].

5.3. Scalability and Global Applicability of the CWI Method

Although this study focuses on aquaculture ponds in the Johor region of Malaysia, the proposed CWI framework possesses a modular structure designed for broader geographical applicability. The regression-based integration of the WI, MNDWI, and AWEIsh enables adaptive tuning to local reflectance characteristics, which supports scalability to regions with varying atmospheric, water quality, and land use conditions. Moreover, the use of Sentinel-2 imagery within the Google Earth Engine (GEE) environment ensures global data availability and standardized preprocessing routines, which are key prerequisites for cross-regional application.
However, applying this methodology at a global scale entails several challenges. First, the accuracy of the CWI depends on the availability of representative and stratified training samples, which may not be uniformly available across different countries or climatic zones. Second, validation requires high-resolution reference data—such as very-high-resolution (VHR) imagery, field surveys, or expert annotation—which may not be accessible in many developing aquaculture regions. Third, differences in aquaculture types (e.g., freshwater, brackish, or marine), pond construction styles, turbidity levels, and surrounding land cover introduce significant heterogeneity that complicates universal thresholding or model generalization.
Despite these constraints, the CWI framework is well-suited for gradual upscaling through regional retraining. By incorporating locally adapted training sets and combining cloud-based processing with regional stratification, it is feasible to build a harmonized global monitoring system. Future work may involve the integration of active learning, transfer learning, or crowd-sourced labeling to enhance the generalizability and reduce the need for manual threshold adjustment across diverse aquaculture zones.

6. Conclusions

This study demonstrates a CWI model for the optimal synthesis of the MNDWI, WI, and AWEIsh, indices that have been proven useful for water surface extraction from RSIs. The CWI model leverages the combined strengths of these three indices. By training the weighted coefficients with sufficient samples, a set of optimal coefficients can be obtained to achieve satisfactory accuracy in water surface identification. The threshold levels of the three indices are essential to the identification performance of the CWI, and different thresholds are required for different land types in the RSI; a set of optimal thresholds for the three indices was therefore determined through experimental analysis. It was also found that the optimization of the CWI model depends on the distribution and number of monitoring samples, with optimal performance metrics achieved when trained with 600 sample points. The trained CWI model was tested with RSI data recorded by Sentinel-2 from 2016 to 2023 for water surface and aquaculture pond identification, and it was found that the total area of aquaculture ponds in Johor increased by 23.27 km2 over these years. In the experiment, the obtained overall accuracy and Kappa coefficient of the proposed CWI model were 94% and 0.940, respectively, a significant improvement resulting from the integration of the WI, MNDWI, and AWEIsh. Although a vegetation index such as the NDVI contributes less than the water indices, it can help refine the classification by excluding vegetated regions adjacent to aquaculture areas. The CWI model provides a potential means for large-area, long-term, efficient, and accurate water surface monitoring with RSIs. Future research will focus on the automatic optimization of algorithm parameters and integration with multi-source, multi-temporal data.

Author Contributions

Software: W.Q.; validation: W.Q.; writing—original draft: W.Q. and N.W.; formal analysis: W.Q.; conceptualization, W.Q., M.H.I., M.F.R. and J.D.; supervision—investigation, M.H.I., M.F.R., N.W. and J.D.; resources, W.Q.; writing—review and editing, M.H.I., M.F.R. and N.W.; supervision, M.H.I., M.F.R. and N.W.; methodology, W.Q.; project administration: M.H.I., M.F.R. and N.W.; funding acquisition, N.W. and J.D. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported partially by the National Natural Science Foundation of China (Grant No. 52161042); by the Guangxi Science and Technology Major Program (2024AA29055); and by the 100 Scholar Plan of the Guangxi Zhuang Autonomous Region of China (2018).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is available upon request.

Acknowledgments

The authors wish to express their sincere gratitude to Ivan Lee of the University of South Australia for his valuable insights and constructive discussions, which significantly contributed to the refinement and improvement of this paper.

Conflicts of Interest

The authors declare that this research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Ottinger, M.; Clauss, K.; Leinenkugel, P.; Künzer, C. Earth Observation for the Assessment of Pond Aquaculture in Coastal Asia-Status and Future Potentials. In Proceedings of the 38th EARSeL Symposium, Chania, Greece, 9–13 July 2018. [Google Scholar]
  2. Duan, Y.; Li, X.; Zhang, L.; Liu, W.; Liu, S.; Chen, D.; Ji, H. Detecting spatiotemporal changes of large-scale aquaculture ponds regions over 1988–2018 in Jiangsu Province, China using Google Earth Engine. Ocean Coast. Manag. 2020, 188, 105144. [Google Scholar] [CrossRef]
  3. Sun, Z.; Luo, J.; Yang, J.; Zhang, L. Dynamics of coastal aquaculture ponds in Vietnam from 1990 to 2015 using Landsat data. IOP Conf. Ser. Earth Environ. Sci. 2020, 502, 012029. [Google Scholar] [CrossRef]
  4. Isa, S.H.; Ramlee, M.N.A.; Lola, M.S.; Ikhwanuddin, M.; Azra, M.; Abdullah, M.; Zakaria, S.; Ibrahim, Y. A system dynamics model for analysing the eco-aquaculture system of integrated aquaculture park in Malaysia with policy recommendations. Environ. Sci. Pollut. Res. 2020, 23, 511–533. [Google Scholar] [CrossRef]
  5. Yu, Z.; Di, L.; Rahman, M.S.; Tang, J. Fishpond mapping by spectral and spatial-based filtering on google earth engine: A case study in singra upazila of Bangladesh. Remote Sens. 2020, 12, 2692. [Google Scholar] [CrossRef]
  6. Davidson, M.; Le, H.; Campos, R.; Yu, S.; Bhuiya, A. Quantifying spatial and temporal patterns of aquaculture in North Carolina using cloud computing. Mar. Technol. Soc. J. 2018, 52, 18–27. [Google Scholar] [CrossRef]
  7. Ottinger, M.; Clauss, K.; Kuenzer, C. Aquaculture: Relevance, distribution, impacts and spatial assessments–A review. Ocean Coast. Manag. 2016, 119, 244–266. [Google Scholar] [CrossRef]
  8. Xu, H.Q. Research on extracting water body information by using improved normalized Differential water body Index (MNDWI). J. Remote Sens. 2005, 9, 589–595. [Google Scholar]
  9. McFeeters, S.K. The use of the Normalized Difference Water Index (NDWI) in the delineation of open water features. Int. J. Remote Sens. 1996, 17, 1425–1432. [Google Scholar] [CrossRef]
  10. Feyisa, G.L.; Meilby, H.; Fensholt, R.; Proud, S.R. Automated Water Extraction Index: A new technique for surface water mapping using Landsat imagery. Remote Sens. Environ. 2014, 140, 23–35. [Google Scholar] [CrossRef]
  11. Xu, H. Modification of Normalized Difference Water Index (NDWI) to enhance open water features in remotely sensed imagery. Int. J. Remote Sens. 2006, 27, 3025–3033. [Google Scholar] [CrossRef]
  12. Fisher, A.; Flood, N.; Danaher, T. Comparing Landsat water index methods for automated water classification in eastern Australia. Remote Sens. Environ. 2016, 175, 167–182. [Google Scholar] [CrossRef]
  13. Zhang, X.; Zhang, G.; Zhang, J.; Kou, W. Automatic water surfaces extraction from Sentinel-2 remote sensing images using deep semantic segmentation model. IEEE Access 2021, 9, 43454–43464. [Google Scholar] [CrossRef]
  14. Rad, A.M.; Kreitler, J.; Sadegh, M. Augmented Normalized Difference Water Index for improved surface water monitoring. Environ. Model. Softw. 2021, 140, 105030. [Google Scholar] [CrossRef]
  15. Zhou, Y.; Dong, J.; Xiao, X.; Xiao, T.; Yang, Z.; Zhao, G.; Zou, Z.; Qin, Y. Open surface water mapping algorithms: A comparison of water-related spectral indices and sensors. Water 2017, 9, 256. [Google Scholar] [CrossRef]
  16. Huang, C.; Chen, Y.; Zhang, S.; Wu, J. Detecting, extracting, and monitoring surface water from space using optical sensors: A review. Rev. Geophys. 2018, 56, 333–360. [Google Scholar] [CrossRef]
  17. Liu, S.; Wu, Y.; Zhang, G.; Lin, N.; Liu, Z. Comparing water indices for Landsat data for automated surface water body extraction under complex ground background: A case study in Jilin Province. Remote Sens. 2023, 15, 1678. [Google Scholar] [CrossRef]
  18. Drusch, M.; Del Bello, U.; Carlier, S.; Colin, O.; Fernandez, V.; Gascon, F.; Hoersch, B.; Isola, C.; Laberinti, P.; Martimort, P.; et al. Sentinel-2: ESA’s optical high-resolution mission for GMES operational services. Remote Sens. Environ. 2012, 120, 25–36. [Google Scholar] [CrossRef]
  19. Malenovský, Z.; Rott, H.; Cihlar, J.; Schaepman, M.E.; García-Santos, G.; Fernandes, R.; Berger, M. Sentinels for science: Potential of Sentinel-1, -2, and -3 missions for scientific observations of ocean, cryosphere, and land. Remote Sens. Environ. 2012, 120, 91–101. [Google Scholar] [CrossRef]
  20. Schmitt, M.; Hughes, L.H.; Qiu, C.; Zhu, X.X. Aggregating cloud-free Sentinel-2 images with Google Earth Engine. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2019, 4, 145–152. [Google Scholar] [CrossRef]
  21. Shen, J.; Ma, M.; Song, Z.; Liu, T.; Zhang, W. Water extraction from high-resolution remote sensing images based on deep learning semantic segmentation model. Remote Sens. Nat. Resour. 2022, 34, 129–135. [Google Scholar]
  22. Luo, Y.; Feng, A.; Li, H.; Li, D.; Wu, X.; Liao, J.; Zhang, C.; Zheng, X.; Pu, H. New deep learning method for efficient extraction of small water from remote sensing images. PLoS ONE 2022, 17, e0272317. [Google Scholar] [CrossRef] [PubMed]
  23. Billson, J.; Islam, M.S.; Sun, X.; Cheng, I. Water body extraction from Sentinel-2 imagery with deep convolutional networks and pixelwise category transplantation. Remote Sens. 2023, 15, 1253. [Google Scholar] [CrossRef]
  24. Hang, W.; Fen, Q. A review of remote sensing image water extraction. Sci. Surv. Mapp. 2018, 43, 23–32. [Google Scholar]
  25. Martins, V.S.; Kaleita, A.L.; Gelder, B.K.; da Silveira, H.L.; Abe, C.A. Exploring multiscale object-based convolutional neural network (multi-OCNN) for remote sensing image classification at high spatial resolution. ISPRS J. Photogramm. Remote Sens. 2020, 168, 56–73. [Google Scholar] [CrossRef]
  26. Abdelal, Q.; Assaf, M.N.; Al-Rawabdeh, A.; Arabasi, S.; Rawashdeh, N.A. Assessment of Sentinel-2 and Landsat-8 OLI for Small-Scale Inland Water Quality Modeling and Monitoring Based on Handheld Hyperspectral Ground Truthing. J. Sens. 2022, 2022, 4643924. [Google Scholar]
  27. Balha, A.; Mallick, J.; Pandey, S.; Gupta, S.; Singh, C.K. A comparative analysis of different pixel and object-based classification algorithms using multi-source high spatial resolution satellite data for LULC mapping. Earth Sci. Inform. 2021, 14, 2231–2247. [Google Scholar] [CrossRef]
  28. Bozdağ, A.; Ünal, Z.; Karkınlı, A.E.; Soomro, A.B.; Mir, M.S.; Gulzar, Y. An Integrated Approach for Groundwater Potential Prediction Using Multi-Criteria and Heuristic Methods. Water 2025, 17, 1212. [Google Scholar] [CrossRef]
  29. Xia, Z.; Guo, X.; Chen, R. Automatic extraction of aquaculture ponds based on Google Earth Engine. Ocean. Coast. Manag. 2020, 198, 105348. [Google Scholar] [CrossRef]
  30. Li, J.; Wang, C.; Xu, L.; Wu, F.; Zhang, H.; Zhang, B. Multitemporal Water Extraction of Dongting Lake and Poyang Lake Based on an Automatic Water Extraction and Dynamic Monitoring Framework. Remote Sens. 2021, 13, 865. [Google Scholar] [CrossRef]
  31. Du, S.; Huang, H.; He, F.; Luo, H.; Yin, Y.; Li, X.; Xie, L.; Guo, R.; Tang, S. Unsupervised stepwise extraction of offshore aquaculture ponds using super-resolution hyperspectral images. J. Appl. Geogr. 2023, 119, 103326. [Google Scholar] [CrossRef]
  32. Feng, P.; Liu, Z.; Chen, L.; Hu, Y. Surface Water body Extraction using a Progressive Enhancement Model from remote sensing images. In Proceedings of the IEEE 2016 4th International Workshop on Earth Observation and Remote Sensing Applications (EORSA), Guangzhou, China, 4–6 July 2016. [Google Scholar] [CrossRef]
  33. Hossain, M.S.; Bujang, J.S.; Zakaria, M.H.; Hashim, M. The application of remote sensing to seagrass ecosystems: An overview and progress in Malaysia. Int. J. Environ. Sci. Dev. 2015, 6, 286–292. [Google Scholar] [CrossRef]
  34. Lu, Y.; Shao, W.; Sun, J. Extraction of offshore aquaculture areas from medium-resolution remote sensing images based on deep learning. Remote Sens. 2021, 13, 3854. [Google Scholar] [CrossRef]
  35. Pham, T.D.; Xia, J.; Ha, N.T.; Bui, D.T.; Le, N.N.; Takeuchi, W. A review of remote sensing approaches for monitoring blue carbon ecosystems: Mangroves, seagrasses and salt marshes during 2010–2018. Sensors 2019, 19, 1933. [Google Scholar] [CrossRef] [PubMed]
  36. Rouse, J.W.; Haas, R.H.; Schell, J.A.; Deering, D.W. Monitoring vegetation systems in the Great Plains with ERTS. NASA Spec. Publ. 1974, 351, 309. [Google Scholar]
  37. Rahman, S.; Mesev, V. Change vector analysis, tasseled cap, and NDVI-NDMI for measuring land use/cover changes caused by a sudden short-term severe drought: 2011 Texas event. Remote Sens. 2019, 11, 2217. [Google Scholar] [CrossRef]
  38. Vijith, H.; Dodge-Wan, D. Applicability of MODIS land cover and Enhanced Vegetation Index (EVI) for the assessment of spatial and temporal changes in strength of vegetation in tropical rainforest region of Borneo. Remote Sens. Appl. Soc. Environ. 2020, 18, 100311. [Google Scholar] [CrossRef]
  39. Zhen, Z.; Chen, S.; Yin, T.; Chavanon, E.; Lauret, N.; Guilleux, J.; Henke, M.; Qin, W.; Cao, L.; Li, J.; et al. Using the negative soil adjustment factor of soil adjusted vegetation index (SAVI) to resist saturation effects and estimate leaf area index (LAI) in dense vegetation areas. Sensors 2021, 21, 2115. [Google Scholar] [CrossRef]
  40. Bai, Y.; He, G.; Wang, G.; Yang, G. WE-NDBI-A new index for mapping urban built-up areas from GF-1 WFV images. Remote Sens. Lett. 2020, 11, 407–415. [Google Scholar] [CrossRef]
  41. Mahdavi, S.; Salehi, B.; Granger, J.; Amani, M.; Brisco, B.; Huang, W. Remote sensing for wetland classification: A comprehensive review. GIScience Remote Sens. 2018, 55, 623–658. [Google Scholar] [CrossRef]
  42. Xu, H. A new index for delineating built-up land features in satellite imagery. Int. J. Remote Sens. 2018, 39, 4699–4718. [Google Scholar] [CrossRef]
  43. Isikdogan, F.; Bovik, A.; Passalacqua, P. Surface water mapping by deep learning. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 4909–4918. [Google Scholar] [CrossRef]
  44. Kanjir, U.; Greidanus, H.; Oštir, K. Vessel detection and classification from spaceborne optical images: A literature survey. Remote Sens. Environ. 2018, 207, 171–188. [Google Scholar] [CrossRef] [PubMed]
  45. Li, W.; Li, Y.; Gong, J.; Feng, Q.; Zhou, J.; Sun, J.; Shi, C.; Hu, W. Urban water extraction with UAV high-resolution remote sensing data based on an improved U-Net model. Remote Sens. 2021, 13, 3165. [Google Scholar] [CrossRef]
  46. Zhaohui, F.; Qin, L.; Liusheng, H.; Guochao, X.; Hongying, Z. Remote sensing image water body extraction based on improved watershed. Bull. Surv. Mapp. 2019, 11–15. [Google Scholar] [CrossRef]
  47. Xu, D.; Zhang, D.; Shi, D.; Luan, Z. Automatic Extraction of Open Water Using Imagery of Landsat Series. Water 2020, 12, 1928. [Google Scholar] [CrossRef]
  48. Zhang, Q.; Xiao, H.; Wu, D.; Wang, P.; Guo, F.; Tao, C. Comparative study of water-body extraction methods based on Landsat8 remote sensing images. In Proceedings of the Third International Conference on Computer Science and Communication Technology, Beijing, China, 30–31 July 2022. [Google Scholar] [CrossRef]
  49. Wegen, M.V.; Huvenne, V.; Kasamatsu, C.; Völker, D.; Davies, J.S. Coastal bathymetry mapping by fusing optical and synthetic aperture radar data. Remote Sens. 2022, 14, 1152. [Google Scholar] [CrossRef]
  50. Tao, C.; Qi, J.; Wang, X.; Yao, J.; Feng, L. Water–land classification using time series of optical and SAR remote sensing data. Remote Sens. 2022, 14, 170. [Google Scholar] [CrossRef]
  51. Weng, Y.; Li, Z.; Tang, G.; Wang, Y. OCNet-Based Water Body Extraction from Remote Sensing Images. Water 2023, 15, 3557. [Google Scholar] [CrossRef]
  52. Tesfaye, M.; Breuer, L. Performance of Water Indices for Water Resources Monitoring Using Large-Scale Sentinel-2 Data. Environ. Monit. Assess. 2023, 196, 467. [Google Scholar] [CrossRef]
  53. Ismail, N.A.H.; Wee, S.Y.; Haron, D.E.M.; Kamarulzaman, N.H.; Aris, A.Z. Occurrence of endocrine disrupting compounds in mariculture sediment of Pulau Kukup, Johor, Malaysia. Mar. Pollut. Bull. 2020, 150, 110735. [Google Scholar] [CrossRef]
  54. Copernicus. Sentinel-2 Overview. Available online: https://sentinels.copernicus.eu/web/sentinel/copernicus/sentinel-2 (accessed on 5 April 2024).
  55. Sabjan, A.; Lee, L.K.; See, K.F.; Wee, S.T. Comparison of three water indices for tropical aquaculture ponds extraction using Google Earth Engine. Sains Malays. 2022, 51, 369–378. [Google Scholar] [CrossRef]
  56. Hall, D.K.; Riggs, G.A.; Foster, J.L.; Kumar, S.V. Development and evaluation of a cloud-gap-filled MODIS daily snow-cover product. Remote Sens. Environ. 2010, 114, 496–503. [Google Scholar] [CrossRef]
  57. Li, Z.; Wang, R.; Zhang, W.; Hu, F.; Meng, L. Multiscale features supported DeepLabV3+ optimization scheme for accurate water semantic segmentation. IEEE Access 2019, 7, 155787–155804. [Google Scholar] [CrossRef]
  58. Malik, I.; Ahmed, M.; Gulzar, Y.; Baba, S.H.; Mir, M.S.; Soomro, A.B.; Sultan, A.; Elwasila, O. Estimation of the extent of the vulnerability of agriculture to climate change using analytical and deep-learning methods: A case study in Jammu, Kashmir, and Ladakh. Sustainability 2023, 15, 11465. [Google Scholar] [CrossRef]
  59. Gulzar, Y. Fruit image classification model based on MobileNetV2 with deep transfer learning technique. Sustainability 2023, 15, 1906. [Google Scholar] [CrossRef]
  60. Gulzar, Y.; Ünal, Z.; Kızıldeniz, T.; Umar, U.M. Deep learning-based classification of alfalfa varieties: A comparative study using a custom leaf image dataset. MethodsX 2024, 13, 103051. [Google Scholar] [CrossRef]
Figure 1. A map of Malaysia showing the location of the study area (red), Johor. Deep blue (water bodies), grass green (vegetated/mountainous areas), and sky blue (mixed residential zones).
Figure 2. Flowchart of the composite water index (CWI) processes.
Figure 3. Local remote sensing map of Johor: (A) The result of case point extraction. (B) Location of sampling points in aquaculture ponds. (C) Remote sensing image location of the case point. (D) A close-up view of the case point.
Figure 4. Spatial comparison maps of water surface extraction using each index on the imagery.
Figure 5. Indicators (in percentage) of aquaculture ponds and water surface in Johor based on pixel statistics.
Figure 6. Extraction of water surface and aquaculture ponds in Johor State in 2016 and 2023, showing the dynamic development of the aquaculture industry.
Figure 7. Extraction of water surfaces and aquaculture ponds is much clearer and more accurate using CWI. Blue represents water bodies, including aquaculture ponds and river systems. Yellow corresponds to bare land and mixed cropland areas. Orange indicates recently cultivated cropland. Green denotes forested areas, primarily consisting of oil palm plantations.
Table 1. Water body index methods.

| Index | Modeling Formula | Source |
|---|---|---|
| NDVI | NDVI = (NIR − RED)/(NIR + RED) | Rouse et al., 1974 [36] |
| NDWI | NDWI = (GREEN − NIR)/(GREEN + NIR) | McFeeters, 1996 [9] |
| MNDWI | MNDWI = (GREEN − SWIR)/(GREEN + SWIR) | Xia et al., 2020 [29] |
| NDMI | NDMI = (NIR − SWIR)/(NIR + SWIR) | Rahman and Mesev, 2019 [37] |
| EVI | EVI = G × (NIR − RED)/(NIR + C1 × RED − C2 × BLUE + L) | Vijith and Dodge-Wan, 2020 [38] |
| SAVI | SAVI = (NIR − RED)/(NIR + RED + L) × (1 + L) | Zhen et al., 2021 [39] |
| WI | WI = (GREEN + RED)/(NIR + SWIR) | Rad et al., 2021 [14] |
| BAI | BAI = 1/((0.1 − RED)² + (0.06 − NIR)²) | Bai et al., 2020 [40] |
| AWEIsh | AWEIsh = BLUE + 2.5 × GREEN − 1.5 × (NIR + SWIR1) − 0.25 × SWIR2 | Feyisa et al., 2014 [10] |
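The band-ratio indices in Table 1 can be computed directly from Sentinel-2 reflectance arrays. A minimal NumPy sketch, assuming the band roles from Table 3; the reflectance values are illustrative, not taken from the study:

```python
import numpy as np

# Illustrative surface reflectances for two pixels (water-like, land-like);
# band roles follow Table 3: B2 = BLUE, B3 = GREEN, B4 = RED, B8 = NIR,
# B11 = SWIR 1, B12 = SWIR 2.
blue  = np.array([0.03, 0.10])
green = np.array([0.06, 0.12])
red   = np.array([0.04, 0.14])
nir   = np.array([0.05, 0.30])
swir1 = np.array([0.03, 0.25])
swir2 = np.array([0.02, 0.20])

ndwi   = (green - nir) / (green + nir)                              # McFeeters NDWI
mndwi  = (green - swir1) / (green + swir1)                          # modified NDWI
wi     = (green + red) / (nir + swir1)                              # Water Index
aweish = blue + 2.5 * green - 1.5 * (nir + swir1) - 0.25 * swir2    # AWEIsh
```

Water suppresses NIR and SWIR reflectance, so the water-like pixel pushes NDWI, MNDWI, and AWEIsh positive and WI above 1, while the land-like pixel falls well below.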
Table 2. Data collection.

| Source | Time Frame of the Image | Resolution (m) | Number of Images (Cloud < 20%) |
|---|---|---|---|
| Sentinel-2 | 2016–2023 | 10 | 66 × 8 = 528 |
| Water: JRC/GSW1_1/MonthlyHistory | | | water mask |
| Water: JRC/GSW1_3/GlobalSurfaceWater | | | essential water data statistics |
Table 3. Specification of Sentinel-2 bands used in this study.

| Band Name | Description | Spatial Resolution (m) | Wavelength (nm) |
|---|---|---|---|
| B2 | BLUE | 10 | 496.6 (S2A)/492.1 (S2B) |
| B3 | GREEN | 10 | 560 (S2A)/559 (S2B) |
| B4 | RED | 10 | 664.5 (S2A)/665 (S2B) |
| B8 | NIR 1 | 10 | 835.1 (S2A)/833 (S2B) |
| B8A | Red Edge 4 | 20 | 864.8 (S2A)/864 (S2B) |
| B11 | SWIR 1 2 | 20 | 1613.7 (S2A)/1610.4 (S2B) |
| B12 | SWIR 2 | 20 | 2202.4 (S2A)/2185.7 (S2B) |
| QA60 3 | Cloud mask | 60 | Cloud mask from polygons; empty after Feb 2022 |

1 Near-infrared (NIR); 2 shortwave infrared (SWIR); and 3 quality assessment (QA) band with 60 m resolution.
Table 4. The accuracy of index extraction.

| Index | Total (km²) | Water (km²) | Non-Water (km²) | Overall Accuracy | Kappa Coefficient |
|---|---|---|---|---|---|
| MNDWI (water ∈ (−0.2, 0.5)) | 18,984 | 1866.66 | 17,117.34 | 0.930 | 0.919 |
| WI (water ∈ (0.75, 2)) | 18,984 | 1914.45 | 17,069.55 | 0.931 | 0.932 |
| AWEIsh (water ∈ (0, 0.224)) | 18,984 | 2062.39 | 16,921.61 | 0.922 | 0.916 |
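The per-index water ranges in Table 4 translate directly into boolean masks. A sketch using those thresholds; the three pixel values are hypothetical, not from the study:

```python
import numpy as np

# Hypothetical index values for three pixels: open water, dry land, turbid pond.
mndwi  = np.array([0.33, -0.35, 0.10])
wi     = np.array([1.25, 0.47, 0.90])
aweish = np.array([0.05, -0.48, 0.10])

# Water ranges from Table 4: a pixel is water if its index falls in the interval.
water_mndwi  = (mndwi > -0.2) & (mndwi < 0.5)
water_wi     = (wi > 0.75) & (wi < 2.0)
water_aweish = (aweish > 0.0) & (aweish < 0.224)
```

All three masks agree here; the CWI below is designed to resolve the pixels where they do not.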
Table 5. The evaluation index coefficients for the WI, MNDWI, and AWEIsh.

| Random Points | β0 | β1 (WI) | β2 (MNDWI) | β3 (AWEIsh) | MSE | R² |
|---|---|---|---|---|---|---|
| 1000 | −0.5517 | 0.5811 | 0.0103 | −0.2072 | 0.0001 | 0.982 |
| 800 | −0.5638 | 0.5867 | 0.0038 | −0.1976 | 0.0002 | 0.978 |
| 600 | −0.5625 | 0.5954 | 0.0004 | −0.2046 | 0.0001 | 0.984 |
| 400 | −0.5710 | 0.6071 | 0.0002 | −0.2014 | 0.0001 | 0.980 |
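The CWI is the ternary (three-predictor) linear regression summarized in Table 5. A sketch that evaluates CWI with the 600-point coefficients and shows how such coefficients are recovered by least squares; the five sample points are synthetic:

```python
import numpy as np

# 600-point coefficients from Table 5.
b0, b1, b2, b3 = -0.5625, 0.5954, 0.0004, -0.2046

def cwi(wi, mndwi, aweish):
    """Composite Water Index: linear blend of WI, MNDWI, and AWEIsh."""
    return b0 + b1 * wi + b2 * mndwi + b3 * aweish

# Ternary regression sketch: recover the coefficients from sample points.
X = np.array([[1.2,  0.3,  0.05],
              [0.5, -0.3, -0.40],
              [0.9,  0.1,  0.10],
              [1.5,  0.4,  0.20],
              [0.7,  0.0, -0.10]])          # columns: WI, MNDWI, AWEIsh
y = cwi(X[:, 0], X[:, 1], X[:, 2])          # targets generated from the model
A = np.column_stack([np.ones(len(X)), X])   # prepend intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
```

Because the targets here come from the model itself, `coef` reproduces (β0, β1, β2, β3) exactly; in the study the targets are the water/non-water labels at the grid points.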
Table 6. Validation metrics summary.

| Metric | Value |
|---|---|
| User Accuracy (Water) | 96.7% |
| Producer Accuracy (Water) | 97.2% |
| F1-Score (Water) | 97.0% |
| Overall Accuracy | 96.9% |
| Kappa Coefficient | 93.9% |
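The metrics in Table 6 all derive from a binary water/non-water confusion matrix. A sketch with made-up counts (not the study's validation set):

```python
# Illustrative 2x2 confusion matrix for the water class.
tp, fp, fn, tn = 290, 10, 8, 692
n = tp + fp + fn + tn

user_acc = tp / (tp + fp)            # user accuracy = precision for water
prod_acc = tp / (tp + fn)            # producer accuracy = recall for water
f1 = 2 * user_acc * prod_acc / (user_acc + prod_acc)
overall = (tp + tn) / n

# Cohen's kappa: observed agreement corrected for chance agreement.
pe = ((tp + fp) / n) * ((tp + fn) / n) + ((fn + tn) / n) * ((fp + tn) / n)
kappa = (overall - pe) / (1 - pe)
```

User and producer accuracy penalize commission and omission errors respectively, which is why both are reported alongside the overall accuracy.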
Table 7. Change analysis of aquaculture ponds and water surface objects in Johor (2016–2019–2023).

| Land Use Category | 2016 (km²) | 2019 (km²) | 2023 (km²) | 2016–2019 Change (km²) | 2019–2023 Change (km²) | 2016–2023 Change (km²) |
|---|---|---|---|---|---|---|
| Mangrove | 266.32 | 249.56 | 238.11 | −16.76 | −11.45 | −28.21 |
| Pond Aquaculture | 19.91 | 33.12 | 43.18 | 13.21 | 10.06 | 23.27 |
| Water Surface | 8120.71 | 8124.01 | 8087.47 | 3.30 | −36.54 | −33.24 |
| Others | 5997.86 | 5998.11 | 6036.04 | 0.25 | 37.93 | 38.18 |
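The change columns in Table 7 are simple differences of the extracted areas between mapping years; for example, pond aquaculture grew by 43.18 − 19.91 = 23.27 km² over 2016–2023. As a sketch:

```python
# Extracted areas (km^2) per mapping year, values from Table 7.
area = {
    "Mangrove":         {2016: 266.32,  2019: 249.56,  2023: 238.11},
    "Pond Aquaculture": {2016: 19.91,   2019: 33.12,   2023: 43.18},
    "Water Surface":    {2016: 8120.71, 2019: 8124.01, 2023: 8087.47},
    "Others":           {2016: 5997.86, 2019: 5998.11, 2023: 6036.04},
}

def change_km2(category, y0, y1):
    """Signed area change between two mapping years."""
    return round(area[category][y1] - area[category][y0], 2)
```

The signed changes make the trade-off visible: the pond-aquaculture gain is roughly mirrored by mangrove and open-water losses.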
Table 8. Statistical comparison of model performance.

| Model | Mean mIoU | 95% CI (mIoU) | Mean oPA | 95% CI (oPA) | p-Value (vs. CWI) |
|---|---|---|---|---|---|
| WI | 0.79 | [0.76, 0.82] | 0.92 | [0.91, 0.93] | <0.01 |
| MNDWI | 0.75 | [0.72, 0.78] | 0.91 | [0.90, 0.92] | <0.01 |
| AWEIsh | 0.77 | [0.73, 0.80] | 0.91 | [0.90, 0.92] | <0.01 |
| CWI | 0.84 | [0.82, 0.86] | 0.94 | [0.93, 0.95] | — |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Qin, W.; Ismail, M.H.; Ramli, M.F.; Deng, J.; Wu, N. Spatiotemporal Extraction of Aquaculture Ponds Under Complex Surface Conditions Based on Deep Learning and Remote Sensing Indices. Sustainability 2025, 17, 7201. https://doi.org/10.3390/su17167201

