Article

Deep Learning Framework for Atmospheric Correction and Chlorophyll-a Estimation from Landsat-8 Images over the Inland Waters of Northern Vietnam

1 Faculty of Geography, Hanoi National University of Education, Hanoi 100000, Vietnam
2 Department of Geomatics, National Cheng-Kung University, Tainan 70101, Taiwan
3 Faculty of Geology, VNU Hanoi University of Science, Hanoi 100000, Vietnam
* Author to whom correspondence should be addressed.
Water 2026, 18(4), 498; https://doi.org/10.3390/w18040498
Submission received: 10 January 2026 / Revised: 10 February 2026 / Accepted: 13 February 2026 / Published: 16 February 2026

Abstract

Chlorophyll-a (Chl-a), a proxy for phytoplankton biomass, is an important indicator for monitoring the trophic states of inland waters. This study proposes a comprehensive framework that uses two convolutional neural networks (CNNs), one for atmospheric correction (AC; ConvNet-AC) and one for Chl-a estimation (ConvNet-CHL), in the eutrophic lakes of Hanoi city (Vietnam) using Landsat-8 images. Satellite-based Chl-a retrieval algorithms have been established based on water remote sensing reflectance ($R_{rs}(\lambda)$). However, existing AC models often struggle to extract $R_{rs}(\lambda)$ accurately due to the complex optical properties of turbid lakes, leading to significant errors in Chl-a retrieval. In this study, a total of 45,764 $R_{rs}(\lambda)$ and 13,561 Chl-a samples are synthesized using radiative transfer AC and regional Chl-a retrieval algorithms to address the scarcity of in situ measurements. A two-stage training strategy combined with hyperparameter tuning is used to automatically optimize the architecture of both networks. Model validation and testing are performed using a subset of synthesized data and an in situ dataset. In the comparative analysis, numerous AC approaches, including Atmospheric Correction for OLI “lite”, Case-2 Regional Coast Color, Image Correction for Atmospheric Effects, Landsat-8 Surface Reflectance Code, QUick Atmospheric Correction, and Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH), and the existing regional Chl-a retrieval algorithm are implemented. Results indicate that ConvNet-AC achieves an average $R^2$ = 0.72 and RMSE = 0.0024 sr⁻¹ for $R_{rs}(\lambda)$ prediction across five spectral bands, outperforming other AC candidates. ConvNet-CHL achieves $R^2$ = 0.73 and RMSE = 40.40 mg·m⁻³ for Chl-a estimation within a range between 50 mg·m⁻³ and 300 mg·m⁻³, representing a 43% improvement over the existing regional Chl-a retrieval algorithm with RMSE = 71.99 mg·m⁻³. Furthermore, the proposed framework successfully captures the spatial and seasonal patterns of the Chl-a concentration distributions. These results demonstrate the effectiveness of integrating CNN-based AC and Chl-a retrieval, which offers a robust and transferable solution for monitoring inland water quality with limited ground-truth data.

1. Introduction

Eutrophication is a critical environmental challenge threatening inland waters worldwide [1,2,3]. This excessive nutrient enrichment process causes harmful algal blooms (HABs) that degrade water quality, negatively influence aquatic ecosystems, and pose significant risks to public health [4,5]. In Vietnam, inland waters, including lakes, reservoirs, and rivers, are undergoing accelerated eutrophication driven by rapid urbanization, untreated wastewater discharge, and climate change [6,7,8]. Severe recurrent mass fish mortalities have been recorded in important lakes in Hanoi, such as Lake Ho Tay, Lake Linh Dam, and Lake Suoi Hai [9,10]. Chlorophyll-a (Chl-a) concentration serves as a key proxy for phytoplankton biomass [11,12,13] and is widely recognized as a core component of the Carlson Trophic State Index [14]. Although traditional field sampling methods have obtained Chl-a concentrations through laboratory analysis, providing accurate point-specific measurements, they are costly, labor-intensive, and time-consuming, offering information only at discrete sampling locations and times with limited insight into the spatial–temporal variations in the Chl-a concentrations across the entire lake surfaces [15,16].
Over the past decades, multispectral satellite imagery has become instrumental in monitoring Chl-a concentrations, enabling synoptic observations across entire water surfaces and continuous temporal coverage essential for evaluating eutrophication dynamics [17,18,19]. Chl-a has distinct spectral characteristics, including strong absorption in the blue (λ ≈ 440 nm) and red (λ ≈ 675 nm) regions and a reflectance peak near λ ≈ 700 nm in the visible and near-infrared (NIR) bands [11]. Previous studies have established empirical [20] and analytical approaches [21] linking water remote sensing reflectance ($R_{rs}(\lambda)$), inherent optical properties (IOPs), and Chl-a concentrations based on these properties. However, inland waters are typically classified as optically complex Case-2 waters due to the presence of colored dissolved organic matter (CDOM) and suspended sediments that vary independently of phytoplankton [22], making accurate Chl-a estimation a challenging task [23].
At present, Chl-a retrieval algorithms are divided into two main categories: (i) empirical and semi-empirical methods that exploit statistical relationships between $R_{rs}(\lambda)$ and Chl-a and (ii) analytical methods based on radiative transfer models and IOPs to retrieve Chl-a concentrations. In recent years, machine learning (ML) methods, including traditional algorithms such as random forest [25] and gradient boosting machine [26], have demonstrated remarkable capability in capturing complex, non-linear relationships between $R_{rs}(\lambda)$ and Chl-a concentrations [24]. More recently, deep learning methods, which form a specialized branch of machine learning and include Artificial Neural Networks [27,28] and Convolutional Neural Networks (CNNs) [29,30,31,32], have demonstrated superior capability in automatically learning hierarchical feature representations directly from spectral data without requiring manual feature engineering. However, the performance of these advanced models is highly dependent on input spectral accuracy [33,34]. When atmospheric effects distort input spectra, even advanced models produce inaccurate results. Therefore, an accurate AC method is a critical prerequisite for reliable Chl-a retrieval from satellite imagery [35].
Atmospheric correction (AC) is important for water quality monitoring because water signals are typically weak relative to atmospheric contributions [36]. The primary objective of AC is to extract satellite-derived $R_{rs}(\lambda)$ that closely match in situ $R_{rs}(\lambda)$. Satellite sensors receive signals reflected not only from water surfaces but also from atmospheric constituents along the path between the Earth's surface and the sensor [36]. Gas molecules, water vapor, and aerosols induce scattering and absorption, significantly altering the magnitude and spectral shape of the water-leaving signal. Extensive evidence demonstrates that an inadequate AC method results in underestimation or overestimation of the Chl-a concentrations [35]. AC methods can be broadly classified into three main categories: physics-based, image-based, and ML methods [37]. Physics-based methods use radiative transfer models (RTMs), such as 6S [38] and MODTRAN [39], to simulate atmospheric absorption and scattering and remove these effects from the at-sensor signal. These methods require extensive atmospheric input data, including aerosol optical thickness (AOT), water vapor content, and gas concentration profiles, along with computationally intensive calculations. Image-based methods use internal image information (e.g., dark object assumptions) to estimate atmospheric contributions. The dark object subtraction algorithm assumes the presence of nearly dark objects in the image with near-zero reflectance to estimate atmospheric scattering [40]. Although image-based approaches offer simplicity and computational efficiency without requiring ancillary meteorological data, they show limited accuracy under complex atmospheric conditions. These challenges are further amplified in tropical regions, such as Vietnam, where high atmospheric water vapor content, variable aerosol loading from agricultural burning and urban emissions, and frequent cloud cover during monsoon seasons complicate AC procedures. Performance evaluations of AC algorithms in previous studies reveal that no single AC method performs effectively across all spectral bands [35,41], indicating that further methodological development remains necessary.
Recently, ML algorithms have shown considerable promise as alternatives for AC of satellite images [42,43,44]. The key concept is to use ML models to learn the direct relationship between top-of-atmosphere (TOA) reflectance and bottom-of-atmosphere spectra without explicitly modeling atmospheric parameters. ML methods offer two main advantages: they require no ancillary atmospheric parameters, and they process images quickly because no complex physical equations need to be solved [42,45]. However, ML methods also face challenges, particularly the need for sufficiently large and diverse training datasets spanning a wide range of atmospheric conditions to ensure robust generalization [35]. Although Chl-a retrieval algorithms and AC models for satellite-based applications have been topics of discussion for several decades, most previous studies have focused either on improving Chl-a retrieval algorithms from atmospherically corrected images (existing Level-2 products) [34,46] or on developing improved AC models for satellite images [47], treating these two components as separate sequential processes. This separation allows errors from the AC stage to propagate and amplify in subsequent Chl-a estimation, yet few studies have developed unified frameworks that jointly optimize both components to minimize such error propagation [48]. Moreover, conventional AC methods rely on assumptions regarding aerosol types and water optical properties that are frequently violated in optically complex tropical inland waters [37], and region-specific AC approaches optimized for these challenging environments remain scarce. These challenges are further compounded by the limited availability of in situ $R_{rs}(\lambda)$ and Chl-a measurements in tropical developing regions, which poses significant difficulties for training robust data-driven models.
This study aims to develop an integrated CNN framework that optimizes the AC and Chl-a estimation processes. Among the ML approaches, CNNs are particularly well suited for these tasks due to their ability to capture spatial–spectral features and learn hierarchical representations from multi-spectral satellite images, offering advantages over pixel-wise methods, such as random forest or gradient boosting, that do not exploit spatial context [29,49]. This integrated approach leverages the advantages of ML methods at both the AC and Chl-a retrieval stages. Unlike conventional workflows that rely on generic AC processors not specifically designed for inland waters, both CNN models in this framework are optimized for the optical characteristics of turbid tropical inland waters, potentially reducing uncertainties propagated from the AC stage to Chl-a estimation. To address the limited availability of in situ measurements in tropical inland waters, this study uses a synthetic training data generation strategy based on established AC and Chl-a algorithms. The specific objectives of this study were: (1) to develop a CNN-based AC model that converts satellite TOA reflectance into $R_{rs}(\lambda)$ spectra closely matching field measurements; (2) to establish a CNN-based model for accurate Chl-a concentration estimation; and (3) to validate the capabilities of the integrated CNN framework in capturing the spatial and seasonal variations in Chl-a in inland waters of Northern Vietnam.

2. Study Area and Data

2.1. Study Area

Seven inland lakes and reservoirs were selected, primarily located in Northern Vietnam, spanning a wide range of sizes and exhibiting diverse trophic states. The study area is characterized by a tropical monsoon climate with distinct wet (May–October) and dry (November–April) seasons. The locations and their corresponding trophic states are illustrated in Figure 1. These lakes play a crucial role in regulating rainwater, mitigating urban inundation and flooding (Lake Ho Tay, Lake Linh Dam), and supporting aquaculture activities (Lake Suoi Hai). However, the ecosystems of these lakes are increasingly threatened by overfishing and the rapid urbanization of Hanoi city. Consequently, lake water quality has experienced severe degradation. For instance, several incidents of mass fish deaths were reported in Lake Ho Tay and Lake Linh Dam between 2016 and 2022, drawing significant public attention to the deterioration of water quality. Therefore, continuous monitoring of the water quality of these lakes is essential for the ecological health and landscape value of urban inland lakes in Vietnam.

2.2. Datasets

2.2.1. Landsat 8-OLI Collection-2 Level-1 Images

The Landsat-8 satellite, launched in February 2013, is equipped with the Operational Land Imager (OLI) sensor and provides freely accessible remotely sensed data through the United States Geological Survey (USGS) portal. The OLI sensor acquires nine spectral bands, comprising eight multi-spectral bands and one panchromatic band at 30 and 15 m spatial resolutions, respectively. Landsat-8 has a revisit cycle of 16 days, making it suitable for long-term monitoring of the dynamics of optically active water constituents, which typically vary on seasonal timescales.
In this study, a collection of 17 Landsat-8 Collection-2 Level-1 (hereafter L8/C2-L1) scenes was downloaded as 16-bit digital numbers (DNs) through the USGS portal (https://earthexplorer.usgs.gov/, accessed on 25 May 2025). Scene selection followed quality criteria to ensure data reliability for model development and validation. First, satellite acquisition dates were selected to coincide with in situ sampling campaigns, ensuring temporal consistency between satellite observations and field measurements. Second, only scenes with cloud cover less than 10% over the study area were included in the analysis. At the pixel level, the Landsat Collection 2 Level-1 Quality Assessment (QA) band was used to identify and exclude pixels affected by cloud, cloud shadow, cirrus, or adjacent cloud contamination. These quality control procedures ensured that only high-quality, cloud-free pixels were used for subsequent atmospheric correction and model training. Atmospheric correction algorithms were implemented using the Sentinel Application Platform (SNAP 12.0.0) and ENVI 5.3 software. In line with previous studies, this study used only the visible-to-NIR spectral range, with wavelength λ between 430 nm and 880 nm, because of its sensitivity to Chl-a concentration.
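The pixel-level screening described above can be sketched as a bitmask test on the QA band. This is an illustration rather than the processing code used in the study: the bit positions are assumed to follow the Collection 2 QA_PIXEL layout (bit 1 dilated cloud, bit 2 cirrus, bit 3 cloud, bit 4 cloud shadow) and should be checked against the USGS product documentation, and the function name is hypothetical.

```python
# Assumed Collection 2 QA_PIXEL bit positions; verify against the USGS guide.
DILATED_CLOUD_BIT = 1   # "adjacent cloud" in the text
CIRRUS_BIT = 2
CLOUD_BIT = 3
CLOUD_SHADOW_BIT = 4

# Any pixel with one of these bits set is rejected.
REJECT_MASK = (
    (1 << DILATED_CLOUD_BIT)
    | (1 << CIRRUS_BIT)
    | (1 << CLOUD_BIT)
    | (1 << CLOUD_SHADOW_BIT)
)


def is_usable_pixel(qa_value: int) -> bool:
    """Return True when none of the rejected QA flags is set."""
    return (qa_value & REJECT_MASK) == 0


# Three hypothetical QA values: no rejected flags, cloud, shadow + dilated cloud.
qa_samples = [0b0000000011000000, 0b0000000000001000, 0b0000000000010010]
usable = [is_usable_pixel(q) for q in qa_samples]
```

Only pixels for which the test returns True would be passed on to atmospheric correction and model training.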

2.2.2. Field-Trip Measurements

This section describes two datasets of in situ $R_{rs}(\lambda)$ and Chl-a measurements, which were used to develop two CNN models. The first model, ConvNet-AC, was designed for AC, while the second model, ConvNet-CHL, was developed for Chl-a retrieval.
a. In situ spectral measurements
A total of 81 in situ $R_{rs}(\lambda)$ measurements were distributed across the seven selected lakes (Figure 1). The in situ spectra were measured within a 2 h window before and after the Landsat-8 satellite overpass. Field campaigns were conducted by teams from Hanoi National University of Education and Vietnam National University–Hanoi University of Science from 2016 to 2019 using a Spectra Vista Corporation (SVC) GER1500 spectroradiometer. The above-water spectra were measured at locations 50–60 m away from the lake or reservoir shorelines to avoid mixed land–water pixels in the satellite images. Water-leaving radiance was recorded across the spectral range of 350–1050 nm with a bandwidth of 1.5 nm, with the instrument positioned approximately 1 m above the water surface. These reflectance samples were primarily used to calibrate, validate, and further test the proposed ConvNet-AC model.
b. Field-trip Chl-a measurements
Four field campaigns were conducted on Lake Ho Tay, Lake Linh Dam and Lake Suoi Hai from June 2016 to August 2019. On the boat, lake water was collected at a depth of 0.5 m using a Van Dorn water sampler and immediately stored in 1 L cleaned, dark colored bottles in a portable cooler. A set of 69 water samples was analyzed in a professional laboratory to determine the Chl-a concentrations following the American Public Health Association (APHA) method [50]. The descriptive statistics of the measured Chl-a concentrations across the three lakes are presented in Table 1. Moreover, Secchi disk depth (SD) measurements were concurrently obtained at each sampling location to ensure sufficient water depth and avoid shallow areas to minimize bottom effects on water reflectance.

3. Methodology

3.1. CNN-Based AC Model (ConvNet-AC)

3.1.1. ConvNet-AC Input Features

a. Top-of-Atmosphere Reflectance
The first input to ConvNet-AC was the TOA reflectance derived from the Landsat-8 OLI Collection 2 Level-1 image. The conversion from DNs to TOA reflectance was performed using the radiometric rescaling coefficients provided in the image metadata file, following the standard procedure described in the Landsat 8 Data Users Handbook [51]. The TOA reflectance without sun angle correction ($\rho_{\lambda}'$) was calculated as follows:
$$\rho_{\lambda}' = M_{\rho} \times Q_{cal} + A_{\rho},$$
$$\rho_{\lambda} = \frac{\rho_{\lambda}'}{\cos(\theta_{SZA})},$$
where $Q_{cal}$ is the quantized pixel value (DN), $M_{\rho}$ is the band-specific multiplicative rescaling factor, and $A_{\rho}$ is the band-specific additive rescaling factor. Thereafter, the sun angle correction was applied to obtain the final TOA reflectance ($\rho_{\lambda}$), where $\theta_{SZA}$ is the solar zenith angle extracted from the satellite image metadata. This conversion was applied to all examined spectral bands.
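As a minimal sketch of this two-step conversion, the function below applies the linear rescaling and then the sun angle correction. The function name and the numerical values are illustrative assumptions; in practice the rescaling factors and the solar zenith angle are read from the scene metadata.

```python
import math


def dn_to_toa_reflectance(q_cal, m_rho, a_rho, sza_deg):
    """Convert a Level-1 DN to sun-angle-corrected TOA reflectance."""
    # Step 1: linear rescaling gives TOA reflectance without sun angle correction.
    rho_prime = m_rho * q_cal + a_rho
    # Step 2: divide by the cosine of the solar zenith angle.
    return rho_prime / math.cos(math.radians(sza_deg))


# Illustrative inputs only; real values come from the metadata file.
rho = dn_to_toa_reflectance(q_cal=10000, m_rho=2.0e-5, a_rho=-0.1, sza_deg=30.0)
```

With these example inputs the uncorrected reflectance is 0.1, and the sun angle correction raises it because the cosine of the solar zenith angle is below one.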
b. AOT
AOT is a critical input for AC of satellite imagery, regardless of whether the workflow is a physics-based, semi-empirical, or ML-based approach. AOT quantifies the degree to which aerosols attenuate light transmission through absorption or scattering. Accurate integration of AOT data can significantly reduce surface reflectance errors in the visible bands. This study adopted AOT data derived from the Image Correction for Atmospheric Effects (iCOR) processor implemented in SNAP software (version 12.0.0) [52]. iCOR implements an adapted version of a land-based AOT retrieval algorithm that exploits the spectral variability of land pixels within an image to estimate AOT at λ ≈ 550 nm. The study area, characterized by heterogeneous land cover surrounding the lakes, satisfied the spatial variability requirements of this method and did not violate any of its assumptions. Afterward, the iCOR-derived AOT data were normalized using min–max normalization to produce values ranging between zero and one, ensuring consistent input scaling for the neural network.
c. Water Vapor
AC of Landsat satellite imagery involves converting the TOA reflectance to water remote sensing reflectance by removing the effects of atmospheric gases and aerosols. Among these atmospheric constituents, water vapor is especially influential due to its strong and highly variable absorption characteristics in specific wavelength regions, particularly in the NIR and shortwave-infrared (SWIR) bands. Inadequate compensation for water vapor absorption can result in underestimation of surface reflectance, especially in water-vapor-sensitive bands. This study utilized column water vapor data provided by the U.S. National Centers for Environmental Prediction, which are included as an ancillary layer in the Landsat-8 Provisional Aquatic Reflectance (L8PAR) products [53]. Additional information about this product can be found in the L8PAR Algorithm Description Document [54]. The water vapor values were normalized using min–max scaling to ensure consistent input ranges for the ConvNet-AC model, similar to AOT.
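The min–max scaling applied to the AOT and water vapor inputs can be sketched as follows. The text does not state whether the minimum and maximum were taken per scene or over the whole training set, so the helper below simply scales whatever list it is given; the function name and sample values are hypothetical.

```python
def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant input carries no contrast; map it to zero by convention.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical AOT values for three pixels.
aot_scaled = min_max_normalize([0.2, 0.5, 0.8])
```

For inference on new scenes, the same minimum and maximum used during training would normally be reused so that inputs stay on a consistent scale.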
d. Sun-sensor geometry angles
Sun-sensor geometry significantly influences the accuracy of atmospherically corrected products due to its effects on atmospheric path length and bidirectional reflectance distribution. Three key angular parameters were incorporated into ConvNet-AC: solar zenith angle (SZA), viewing zenith angle (VZA), and relative azimuth angle (RAA). These parameters jointly controlled the atmospheric path lengths of incoming solar radiation and outgoing water-leaving radiance, affecting Rayleigh scattering and aerosol correction accuracy. Per-pixel solar and sensor geometry angles were provided for Landsat-8 Collection 2 products as separate angle coefficient files. The SZA and VZA were extracted directly from these files, while the RAA was calculated as the absolute difference between the solar azimuth angle (SAA) and the sensor azimuth angle (VAA) [38]:
$$\mathrm{RAA} = |\mathrm{SAA} - \mathrm{VAA}|$$
When RAA exceeded $180^{\circ}$, it was adjusted as $\mathrm{RAA}_{adj} = 360^{\circ} - \mathrm{RAA}$ to constrain values between $0^{\circ}$ and $180^{\circ}$. All angular inputs were normalized to the range between zero and one to maintain consistent scaling across all ConvNet-AC inputs.
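The relative azimuth calculation and its wrap-around adjustment can be written compactly as below. The division by 180 is one plausible way to normalize RAA to the zero-to-one range and is an assumption, since the normalization bounds are not given in the text; the function name is hypothetical.

```python
def relative_azimuth(saa_deg, vaa_deg):
    """Relative azimuth angle in degrees, constrained to [0, 180]."""
    raa = abs(saa_deg - vaa_deg)
    # Differences above 180 degrees are folded back into the 0-180 range.
    return 360.0 - raa if raa > 180.0 else raa


raa = relative_azimuth(350.0, 10.0)   # raw difference 340 degrees folds to 20
raa_normalized = raa / 180.0          # assumed scaling to the zero-to-one range
```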

3.1.2. ConvNet-AC Training Data Generation

The ConvNet-AC model contained a substantial number of trainable parameters, necessitating a large and diverse training dataset of $R_{rs}(\lambda)$ spectra to mitigate overfitting and achieve robust generalization across various optical water conditions. However, the in situ dataset collected in this study contained only 81 $R_{rs}(\lambda)$ spectra, of which 43 were used for training and 38 for testing. To overcome this data scarcity, we generated synthetic training labels by applying a physics-based AC algorithm to L8/C2-L1 imagery. Although the iCOR algorithm was used in this study, other state-of-the-art AC approaches designed for inland and coastal waters, such as Atmospheric Correction for OLI “lite” (ACOLITE) and Case-2 Regional Coast Color (C2RCC), can serve as alternatives. These methods use radiative transfer modeling to simulate atmospheric effects, including aerosol scattering, molecular absorption, and sun-sensor geometric configurations, making them suitable for generating training data despite limitations in absolute accuracy over highly turbid waters. To avoid the overfitting that direct training on so few spectra would cause, we adopted a knowledge distillation approach in which a teacher model, the iCOR algorithm, generated pseudo-labels for model pre-training. The iCOR algorithm was applied to eight L8/C2-L1 scenes covering the study lakes, producing 15,664 synthetic iCOR-derived $R_{rs}(\lambda)$ samples. These scenes were selected to represent the diverse atmospheric and water conditions encountered between 2013 and 2020. The 43 in situ training spectra were expanded through oversampling to 30,100 samples to prevent the model from prematurely converging to potentially biased synthetic data during subsequent fine-tuning. This oversampling factor was empirically determined to achieve balanced representation between synthetic and ground-truth data during the fine-tuning stage, ensuring that mini-batches contained sufficient in situ samples for effective gradient updates. Although simple oversampling does not introduce additional variability, it acts as a weighting strategy that emphasizes the accurate ground-truth measurements during calibration. The combined dataset of 45,764 samples, comprising 30,100 oversampled in situ and 15,664 iCOR-derived $R_{rs}(\lambda)$ samples, was used for model training, with the two sources weighted differently. During mini-batch training, in situ and iCOR samples were selected with probabilities proportional to $1/n_{R_{rs}}^{in\,situ}$ and $1/n_{R_{rs}}^{iCOR}$, respectively.
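The sample counts and the inverse-size weighting can be checked with a short sketch. The counts are taken from the text; the sampling function is a hypothetical illustration of weighting each sample by the inverse of its source size, not the training code used in the study, and the toy pool sizes are arbitrary.

```python
import random

N_IN_SITU_TRAIN = 43            # in situ training spectra (from the text)
N_IN_SITU_OVERSAMPLED = 30_100  # after oversampling (from the text)
N_ICOR = 15_664                 # iCOR-derived synthetic spectra (from the text)

replication_factor = N_IN_SITU_OVERSAMPLED // N_IN_SITU_TRAIN  # 700
combined_size = N_IN_SITU_OVERSAMPLED + N_ICOR                 # 45764


def draw_mini_batch(in_situ, icor, batch_size, rng):
    """Draw a batch in which every sample is weighted by 1/n of its source."""
    pool = list(in_situ) + list(icor)
    weights = [1.0 / len(in_situ)] * len(in_situ) + [1.0 / len(icor)] * len(icor)
    return rng.choices(pool, weights=weights, k=batch_size)


# Toy pools of unequal size: each source still fills about half of the batch.
rng = random.Random(0)
batch = draw_mini_batch(["in_situ"] * 4, ["icor"] * 12, 1000, rng)
in_situ_share = batch.count("in_situ") / len(batch)
```

Because the weights sum to one within each source, the expected share of each source in a mini-batch is equal regardless of how many samples that source contributes.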

3.1.3. ConvNet-AC Model Architecture

The ConvNet-AC model is a CNN designed for AC of satellite-derived TOA reflectance to obtain accurate $R_{rs}(\lambda)$ values. The architecture is composed of two main components: a convolutional subnetwork for spectral feature extraction and a fully connected subnetwork for $R_{rs}(\lambda)$ estimation. In the convolutional subnetwork, the input consists of TOA reflectance at five wavelengths λ = 443, 482, 561, 655, and 865 nm, within a 3 × 3 spatial window centered on the target pixel. These spectral inputs are processed through convolutional layers, using filters with a 1 × 1 × 3 kernel size to extract spectral features while maintaining spatial information. Thereafter, the output feature maps are flattened and concatenated with additional auxiliary inputs: (i) geometry-related angles, (ii) AOT, and (iii) water vapor content. The concatenated feature vector is passed through fully connected layers to estimate the $R_{rs}(\lambda)$ values at five output wavelengths λ = 443, 482, 561, 655, and 865 nm.

3.2. CNN-Based Chl-a Retrieval (ConvNet-CHL)

3.2.1. ConvNet-CHL Inputs and Outputs

The inputs of the ConvNet-CHL model are the five $R_{rs}(\lambda)$ values at λ = 443, 482, 561, 655, and 865 nm produced by ConvNet-AC, and its output is the Chl-a concentration. ConvNet-CHL consists of two subnetworks: a convolutional subnetwork followed by a fully connected subnetwork, which estimates the Chl-a concentration in Lake Ho Tay, Lake Linh Dam, and Lake Suoi Hai.

3.2.2. ConvNet-CHL Training Data Generation

The ConvNet-CHL model contained numerous trainable parameters, requiring a substantial dataset for effective training to learn robust relationships between $R_{rs}(\lambda)$ and Chl-a concentrations. However, only 69 in situ Chl-a measurements were available, collected during field campaigns in 2016 and 2019 across three lakes (Suoi Hai, Linh Dam, and Ho Tay), representing insufficient samples for deep learning training. These samples were divided into training (n = 60) and validation (n = 9) subsets. To compensate for the limited ground-truth data and improve model generalization, we generated a large dataset of synthetic Chl-a samples using the green-to-blue band ratio algorithm (hereafter HaGrB) developed by Ha et al. [9] for Hanoi urban lakes. The HaGrB algorithm is a regional Chl-a retrieval model specifically calibrated for hypereutrophic urban lakes in Hanoi, expressed as:
$$\text{Chl-a} = 28.1 \times e^{0.72 \times \mathrm{GrB}}$$
$$\mathrm{GrB} = \frac{R_{rs}(561)}{R_{rs}(482)}$$
where GrB represents the green-to-blue band ratio, and $R_{rs}(561)$ and $R_{rs}(482)$ denote the remote sensing reflectance at green (λ = 561 nm) and blue (λ = 482 nm) wavelengths, respectively. This algorithm was originally calibrated using FLAASH-corrected Landsat-8 Collection-1 reflectance and in situ Chl-a measurements from seven urban lakes in Hanoi [9]. The selection of the green-to-blue band ratio, rather than NIR/red algorithms commonly used for turbid waters, was based on two considerations. Firstly, Landsat-8 OLI lacks a red-edge band near λ = 700–710 nm required for effective NIR/red algorithms. Secondly, the HaGrB algorithm demonstrated superior performance over NIR-based approaches for the hypereutrophic conditions characteristic of our study lakes, where phytoplankton biomass dominates the optical signal [9].
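The HaGrB relationship is simple enough to express as a short function, shown here as a sketch using the coefficients quoted above. The function name and the example reflectance values are hypothetical.

```python
import math


def hagrb_chl_a(rrs_561, rrs_482):
    """Estimate Chl-a (mg per cubic metre) from the green-to-blue band ratio."""
    grb = rrs_561 / rrs_482
    return 28.1 * math.exp(0.72 * grb)


# Hypothetical reflectances giving a green-to-blue ratio of 2.
chl = hagrb_chl_a(rrs_561=0.02, rrs_482=0.01)
```

Because the ratio sits in the exponent, small errors in the atmospherically corrected green or blue reflectance translate into multiplicative errors in the Chl-a estimate, which is one reason the quality of the AC stage matters for this algorithm.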
Although Landsat-8 Collection 1 data were discontinued for scenes acquired after 2020, the algorithm’s coefficients remain applicable to atmospherically corrected reflectance. The application of the HaGrB algorithm to Landsat-8 scenes atmospherically corrected using ConvNet-AC yielded 13,561 synthetic Chl-a estimates ranging from 50 mg·m⁻³ to 310 mg·m⁻³. To balance the contributions of in situ and synthetic data and prevent model bias toward algorithm-derived values, we used a binning strategy based on Chl-a concentration. The concentration range was divided into 13 bins of 20 mg·m⁻³ each. The 60 in situ training samples were augmented through oversampling and then merged with the synthetic Chl-a samples, resulting in approximately 1500 samples per bin and a balanced representation across all concentration ranges (Figure 2). The training strategy used a weighted sampling approach, where the probabilities of selecting in situ and synthetic samples in each mini-batch were proportional to $1/n_{Chl}^{in\,situ}$ and $1/n_{Chl}^{HaGrB}$, respectively. In total, the training dataset consisted of 39,964 samples. The overall workflow of the proposed CNN framework is illustrated in Figure 3.
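The concentration binning can be illustrated with the sketch below, which assigns each Chl-a value to one of the 13 bins of 20 mg·m⁻³ between 50 and 310 mg·m⁻³ and then resamples every occupied bin to a common size. Resampling with replacement to a fixed per-bin count is an assumption made for illustration, as are the function names and the sample values; the text does not specify how the per-bin target was reached.

```python
import random

BIN_START = 50.0   # lower edge of the first bin (mg per cubic metre)
BIN_WIDTH = 20.0
N_BINS = 13        # 13 bins of 20 cover the range 50 to 310


def bin_index(chl):
    """Index of the concentration bin, clamped to the valid range."""
    idx = int((chl - BIN_START) // BIN_WIDTH)
    return min(max(idx, 0), N_BINS - 1)


def balance_bins(samples, per_bin, rng):
    """Resample each occupied bin (with replacement) to the same size."""
    bins = {}
    for chl in samples:
        bins.setdefault(bin_index(chl), []).append(chl)
    balanced = []
    for idx in sorted(bins):
        balanced.extend(rng.choices(bins[idx], k=per_bin))
    return balanced


rng = random.Random(0)
balanced = balance_bins([55.0, 60.0, 150.0, 300.0], per_bin=5, rng=rng)
```

In this toy example three bins are occupied, so the balanced set contains 15 values, five from each bin.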

3.3. Loss Function and Optimization

The training objective for ConvNet-AC and ConvNet-CHL was formulated as a regression problem minimizing the mean square error (MSE) between predictions and ground-truth measurements [55]. The loss function for AC was defined as follows:
$$Loss_{R_{rs}} = \frac{1}{m}\sum_{k=1}^{m}\left(R_{rs,pred}^{k} - R_{rs,meas}^{k}\right)^{2}$$
The corresponding loss function for Chl-a retrieval was expressed as follows:
$$Loss_{Chl} = \frac{1}{n}\sum_{i=1}^{n}\left(Chl_{pred}^{i} - Chl_{meas}^{i}\right)^{2}$$
where $R_{rs,pred}^{k}$ and $R_{rs,meas}^{k}$ represent the predicted and measured $R_{rs}(\lambda)$ for the kth training spectrum, and m denotes the number of training spectra; $Chl_{pred}^{i}$ and $Chl_{meas}^{i}$ represent the predicted and measured Chl-a concentrations for the ith training sample, and n denotes the number of training Chl-a samples. The Adam optimizer, which uses adaptive learning rates and momentum, was selected to minimize the loss functions [56].

3.4. Hyperparameter Optimization

Hyperparameter optimization remains a critical challenge in deep learning applications for remote sensing, as the selection of network hyperparameters significantly influences model performance. To address this difficulty, we used the Hyperband algorithm to automatically optimize the network architectures of the ConvNet-AC and ConvNet-CHL models. Hyperband uses a principled early-stopping strategy based on successive halving. This approach begins with multiple random hyperparameter configurations and progressively eliminates poor performers while allocating additional resources to promising candidates [57]. This bandit-based approach significantly reduces computational cost while maintaining optimization quality. Specifically, we optimized four critical hyperparameters for both networks: the number of convolutional filters, the number of fully connected layers, the number of neurons in each fully connected layer, and the dropout rate. The Hyperband optimization enables automatic architectural refinement without manual trial-and-error processes, supporting reproducibility and minimizing human bias in model design decisions.
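The successive halving step at the core of Hyperband can be sketched as follows. This is only the inner elimination loop under simplifying assumptions: full Hyperband runs several such brackets with different trade-offs between the number of configurations and the budget per configuration. The toy evaluation function, the reduction factor of 3, and the dropout grid are hypothetical and stand in for real model training.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Keep the best 1/eta of the configurations each round, raising the budget."""
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Lower validation loss is better.
        ranked = sorted(survivors, key=lambda cfg: evaluate(cfg, budget))
        survivors = ranked[: max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]


def toy_validation_loss(cfg, budget):
    """Stand-in for training: loss is lowest at dropout 0.3 and falls with budget."""
    return abs(cfg["dropout"] - 0.3) + 1.0 / budget


candidates = [{"dropout": d / 10} for d in range(9)]   # 0.0, 0.1, ..., 0.8
best = successive_halving(candidates, toy_validation_loss)
```

With nine candidates and a reduction factor of 3, the loop keeps three configurations after the first round and one after the second, so most of the budget is spent on the most promising settings.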
After the hyperparameter optimization, the fine-tuned ConvNet-AC and ConvNet-CHL were obtained (Figure 4 and Figure 5, respectively). The fine-tuned ConvNet-AC architecture comprises a convolutional subnetwork with two convolutional layers, each containing 16 filters with a 1 × 1 × 3 kernel size, designed to extract spectral features from the five input TOA reflectance bands. The output feature maps are flattened and concatenated with five auxiliary inputs, then processed through three fully connected layers, each with 384 neurons, followed by an output layer that produces five $R_{rs}(\lambda)$ values corresponding to the input spectral bands. The ConvNet-CHL architecture takes the five-band $R_{rs}(\lambda)$ outputs from ConvNet-AC as inputs and uses a convolutional and fully connected structure for Chl-a retrieval. The convolutional subnetwork consists of a single layer with 10 filters (1 × 1 × 3 kernel) that extracts spectral features from a 3 × 3 spatial neighborhood. After flattening the resulting 270-element feature vector, two fully connected layers with 480 neurons each regress the final Chl-a concentration. Both models were optimized using the Adam optimizer with a learning rate of 0.0001.
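The 270-element feature vector reported for ConvNet-CHL is consistent with a convolution that slides a length-3 kernel along the five-band spectral axis without padding, leaving the 3 × 3 spatial window untouched. The sketch below checks that arithmetic. The unpadded-convolution assumption and the function name are ours, and the ConvNet-AC figure it produces is an inference under the same assumption, not a value stated in the text.

```python
def flattened_conv_features(window, n_bands, kernel_bands, n_filters, n_layers):
    """Flattened size after stacked 1 x 1 x k spectral convolutions (no padding)."""
    spectral = n_bands
    for _ in range(n_layers):
        spectral = spectral - kernel_bands + 1   # unpadded convolution shrinks the axis
        if spectral < 1:
            raise ValueError("kernel is longer than the remaining spectral axis")
    return window * window * spectral * n_filters


# ConvNet-CHL: one layer, 10 filters, 3 x 3 window, five bands.
chl_features = flattened_conv_features(3, 5, 3, 10, 1)
# ConvNet-AC: two layers of 16 filters; inferred, not stated in the text.
ac_features = flattened_conv_features(3, 5, 3, 16, 2)
ac_dense_input = ac_features + 5   # plus the five auxiliary inputs
```

Under this assumption the ConvNet-CHL count reproduces the stated 270 elements (3 × 3 × 3 × 10), which supports reading the 1 × 1 × 3 kernel as acting along the spectral dimension only.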

3.5. Accuracy Assessment

We compared the model-derived $R_{rs}(\lambda)$ with the corresponding in situ $R_{rs}(\lambda)$ measurements across five visible-to-near-infrared bands to quantitatively assess the performance of the ConvNet-AC model in reconstructing surface reflectance. The evaluation metrics followed the standardized accuracy assessment protocol established in the Atmospheric Correction Intercomparison Exercise (ACIX-Aqua) [35]: the coefficient of determination ($R^2$), which measures the goodness of fit of the AC model; the root mean square error (RMSE), which represents the overall difference between the predicted and in situ $R_{rs}(\lambda)$; the bias, which evaluates the systematic error in the inversion algorithm; and the mean absolute percentage error (MAPE), which expresses the error relative to the in situ values.
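$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_{obs,i} - y_{pred,i} \right)^2}{\sum_{i=1}^{n} \left( y_{obs,i} - \overline{y_{obs}} \right)^2}$$
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left( y_{pred,i} - y_{obs,i} \right)^2}{n}}$$
$$\mathrm{Bias} = \frac{\sum_{i=1}^{n} \left( y_{pred,i} - y_{obs,i} \right)}{n}$$
$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_{obs,i} - y_{pred,i}}{y_{obs,i}} \right| \times 100\%$$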
where $y_{obs,i}$ and $y_{pred,i}$ represent the observed (in situ) and predicted (model-derived) values for the $i$th sample, respectively; $\overline{y_{obs}}$ is the mean of the observed values; and $n$ is the total number of samples. $R^2$ indicates the proportion of variance in the observed values explained by the model, with values close to one indicating a better fit. RMSE quantifies the average magnitude of prediction errors in the same units as the target variable. Bias measures the systematic over- or underestimation, with positive values indicating overestimation. MAPE expresses prediction accuracy as a percentage, facilitating interpretation across different scales. The same evaluation framework was applied to assess the ConvNet-CHL model performance in estimating Chl-a concentration. However, only $R^2$ and RMSE were selected for that evaluation.
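The four metrics can be computed directly from paired observed and predicted values. The following is a minimal pure-Python sketch of the definitions above, not the authors' evaluation code; the function name `accuracy_metrics` and the three sample values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math

def accuracy_metrics(y_obs, y_pred):
    """Return R^2, RMSE, bias, and MAPE (in %) for paired observations and predictions."""
    n = len(y_obs)
    mean_obs = sum(y_obs) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))  # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in y_obs)           # total sum of squares
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "bias": sum(p - o for o, p in zip(y_obs, y_pred)) / n,  # positive = overestimation
        "mape": 100 * sum(abs((o - p) / o) for o, p in zip(y_obs, y_pred)) / n,
    }

# Hypothetical Chl-a values in mg·m−3 (illustrative only, not data from this study).
observed = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
metrics = accuracy_metrics(observed, predicted)
print(metrics)
```

In this toy example the over- and underestimates cancel, so the bias is zero even though the RMSE is about 8.16 mg·m−3 and the MAPE is 5%, which is why bias is reported alongside the magnitude-based metrics rather than in place of them.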

4. Results and Discussion

4.1. Results of AC of the Satellite Images

Figure 6 compares the model-derived $R_{rs}(\lambda)$ with the corresponding in situ $R_{rs}(\lambda)$ measurements across the five visible-to-near-infrared bands in terms of $R^2$, RMSE, bias, and MAPE. ConvNet-AC achieved its highest accuracy at Band 1, with an $R^2$ of 0.80 and an RMSE of 0.0002 sr−1, and a moderate decline in performance was observed toward longer wavelengths, with $R^2$ values of 0.78, 0.72, 0.70, and 0.58 for Bands 2–5, respectively. Nevertheless, the model retained a reasonable level of predictive accuracy across all bands, even under the low reflectance and increased atmospheric interference typically present at long wavelengths, which highlights its robustness and adaptability in modeling multi-band spectral information. Bias values remained negligible throughout, indicating no systematic over- or underestimation by the model.

4.2. Comparison of Performance Between ConvNet-AC and Other AC Candidates

ConvNet-AC was compared against six state-of-the-art AC algorithms: ACOLITE, C2RCC, FLAASH, iCOR, Landsat 8 Surface Reflectance (L8SR), and QUick Atmospheric Correction (QUAC). Figure 7, Figure 8 and Figure 9 present the retrieval accuracy of $R_{rs}(\lambda)$ across five Landsat-8 spectral bands using the six AC methods. The results demonstrate that ConvNet-AC consistently outperformed the other AC methods across all examined bands, achieving higher $R^2$ and lower RMSE values. At the blue bands ( λ = 443 and 482 nm), ConvNet-AC achieved $R^2$ values of 0.80 and 0.78, respectively, far exceeding the best-performing alternative methods (FLAASH at λ = 443 nm: $R^2$ = 0.10; QUAC at λ = 482 nm: $R^2$ = 0.15). The corresponding RMSE values for ConvNet-AC at these wavelengths were 0.0002 sr−1 and 0.0013 sr−1, indicating improved accuracy in the short-wavelength region where atmospheric scattering is most pronounced, with MAPE values also remaining below 0.4%. In the green band ( λ = 561 nm), ConvNet-AC maintained strong performance. The red band ( λ = 655 nm) exhibited a similar pattern, with ConvNet-AC achieving $R^2$ = 0.70 and RMSE = 0.0032 sr−1, while traditional methods yielded $R^2$ ⩽ 0.2, highlighting the challenges that these methods face in accurately correcting atmospheric effects at long visible wavelengths. The NIR band ( λ = 865 nm) proved most challenging for all methods due to the minimal water-leaving signal and high atmospheric contribution. Nevertheless, ConvNet-AC achieved $R^2$ = 0.58 and RMSE = 0.0021 sr−1, maintaining better performance than the other AC approaches (best $R^2$ = 0.12 by iCOR). In particular, ConvNet-AC exhibited near-zero bias across all bands (0.0 to 0.0006), indicating minimal systematic error, whereas traditional methods displayed variable bias ranging from −0.010 to 0.0187.
Among the examined methods, QUAC and C2RCC performed better than ACOLITE, FLAASH, iCOR, and L8SR.
The performance gap between ConvNet-AC and the conventional AC algorithms can be attributed to fundamental limitations of the latter when applied to Vietnamese inland lakes, which present particularly challenging optical conditions. These water bodies are characterized by high concentrations of suspended sediments, phytoplankton, and CDOM, resulting in optically complex Case-2 waters with significant reflectance extending into the NIR and SWIR regions. Furthermore, a number of inland lakes in Vietnam are relatively small owing to rapid urbanization and are situated close to land surfaces, making them susceptible to adjacency effects, in which light scattered from adjacent terrain contaminates the water signal. The conventional AC methods rely on specific assumptions that are violated under these conditions. Dark-pixel algorithms, such as iCOR and ACOLITE, assume near-zero water reflectance in the SWIR or NIR bands to estimate aerosol contributions. However, turbid or algal-bloom waters in Vietnamese lakes exhibit non-zero reflectance in these bands. Consequently, these algorithms overestimate aerosol thickness and subtract excessive signal, resulting in negative or biased water-leaving reflectance. Meanwhile, FLAASH and QUAC identify the darkest pixels in an image to infer aerosol properties. When water represents the darkest surface in the scene, these methods incorrectly treat it as a dark reference object and underestimate aerosol effects, yielding unrealistically low reflectance values. C2RCC utilizes a neural network trained on synthetic Case-2 water optical properties; however, its performance is constrained by the representativeness of its training dataset, which may not adequately capture the specific optical characteristics of Vietnamese inland waters.
The standard Landsat-8 Collection-2 Level-2 surface reflectance product applies a land-oriented aerosol correction algorithm (LaSRC) that is not optimized for water bodies, frequently producing negative reflectance values over water and exhibiting large uncertainties in the blue bands. By contrast, ConvNet-AC overcomes these limitations through its data-driven approach, which learns complex AC mappings directly from diverse training data without relying on fixed assumptions about water optical properties or aerosol models. The model dynamically adapts to variable conditions by explicitly incorporating atmospheric parameters (AOT, water vapor) and viewing geometry as auxiliary inputs. The knowledge distillation and transfer learning strategy using simulated data from physics-based models (iCOR) enables the network to generalize across a wide range of atmospheric and water optical conditions representative of Vietnamese inland waters, thereby achieving robust performance where other AC methods fail.

4.3. Performance Evaluation of ConvNet-CHL and Green–Blue (HaGrB) Band Ratio

The comparative evaluation of ConvNet-CHL and the HaGrB band ratio revealed clear differences in predictive accuracy (Figure 10). The ConvNet-CHL model showed strong agreement with in situ Chl-a concentrations, yielding an $R^2$ of 0.73, an RMSE of 40.40 mg·m−3, and a MAPE of 24.34%, whereas the HaGrB algorithm exhibited lower predictive performance, with a low $R^2$ value, a high RMSE of 71.99 mg·m−3, and a MAPE of 147.82%. The scatter plot further illustrates that the HaGrB algorithm overestimated the Chl-a values, particularly in the low Chl-a range of 50–100 mg·m−3. This behavior can be attributed to the limited adaptability of the algorithm to the varying optical water types across the three lakes and to its underlying assumptions. Furthermore, the green–blue band ratio is typically applied to optically simple Case-1 waters, such as coastal or ocean waters, where water color is dominated by phytoplankton, and has limited applicability in Case-2 environments, such as eutrophic lakes, where turbidity, suspended sediments, and CDOM contribute significantly to water-leaving reflectance. In these inland waters, algorithms that incorporate red and near-infrared bands are often preferred. The improved performance of the ConvNet-CHL model can be attributed to its ability to learn complex spectral and spatial relationships. The convolutional layers use local connectivity and weight sharing, which reduces the number of trainable parameters, improves computational efficiency, and allows the network to learn patterns across adjacent pixels. By considering information from neighboring pixels, CNNs can mitigate noise and account for local optical properties. Moreover, CNN-based models are well suited to performing non-linear regression directly on high-dimensional spectral data.
Such results underline the capacity of CNNs to generalize across varying trophic states and highlight their suitability for operational Chl-a retrieval in eutrophic lakes.

4.4. Spatial–Temporal Maps of Chl-a Concentration

The ConvNet-CHL model was applied to Landsat-8/9 OLI images acquired between 2016 and 2025 to map the spatiotemporal variations in the Chl-a concentrations in Lake Ho Tay. The Chl-a levels varied across different parts of the lake, with the highest concentrations, exceeding 300 mg·m−3, most frequently observed in the northern part (Figure 11). The near-shore Chl-a concentrations were consistently higher than those at the lake center, likely resulting from the intensive anthropogenic activities in densely populated areas surrounding the lake. These areas include well-known cultural sites, such as Tran Quoc, Tay Ho, and Quan Thanh Pagodas, along with numerous tourist attractions and restaurants. Untreated and inadequately treated wastewater discharged directly from these sources significantly increases nutrient loading, consequently elevating the Chl-a concentrations [6].
The spatial distribution of Chl-a concentrations also exhibits notable seasonal variability between the dry and the rainy seasons. The rainy season, extending from May to October, is characterized by notably low Chl-a concentrations, as demonstrated by observations on 16 May 2016, 28 June 2020, and 7 October 2016. During that period, frequent heavy rainfall and occasional tropical storms enhance water flushing and dilute accumulated nutrients, resulting in reduced Chl-a concentrations across the lake. By contrast, the dry season (November to April) exhibits substantially elevated Chl-a concentrations, as clearly shown in the maps for 10 December 2016, 9 March 2025, and 28 April 2025. The limited precipitation during that period reduces flushing and dilution effects, allowing nutrients, particularly nitrogen and phosphorus, to remain concentrated within the lake for extended periods [58]. This nutrient accumulation frequently triggers severe algal blooms. Incidents of mass fish mortality in Lake Ho Tay have been documented during these periods, with over 200 tonnes of dead fish collected in October 2016 due to oxygen depletion. Additionally, the absence of rainfall and diminished inflow during dry months significantly limit water renewal [59]. Despite the generally low air temperatures during the dry season, daytime temperatures in Hanoi city commonly remain within the optimal range of approximately 20 °C to 30 °C, conditions that are highly conducive to algal growth [60]. Furthermore, urban lakes, such as Ho Tay, are typically shallow and surrounded by dense urban buildings, which obstruct wind flow and impede effective water mixing. This situation exacerbates thermal stratification and restricts oxygen exchange between water layers, further promoting eutrophic conditions [59].
The Carlson Trophic State Index calculated from the Chl-a concentrations consistently exceeds 80 throughout the year, signifying persistent hypereutrophic conditions and frequent algal bloom occurrences [14]. The Chl-a map generated on 28 April 2025 demonstrates the model’s capability in detecting anthropogenic effects, as this date coincides with an extended national holiday period, Reunification Day and International Workers’ Day, during which the lake experiences substantially increased pressure from recreational activities, temple visits, and elevated tourist traffic. The Chl-a distribution map for this date reveals severely elevated concentrations exceeding 300 mg·m−3 across the entire water body.
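The link between Chl-a concentration and the trophic state index can be illustrated with the widely used Chl-a form of Carlson's index, TSI = 9.81 ln(Chl-a) + 30.6, with Chl-a in mg·m−3 (equivalent to µg/L). Whether the authors used exactly this form is an assumption; the article states only that the index was calculated from the Chl-a concentrations following [14]. The function names below are introduced for illustration.

```python
import math

def carlson_tsi_chl(chl_a):
    """Carlson trophic state index from Chl-a in mg·m−3 (assumed standard form)."""
    return 9.81 * math.log(chl_a) + 30.6

def chl_for_tsi(tsi):
    """Inverse relation: Chl-a concentration (mg·m−3) that yields a given TSI."""
    return math.exp((tsi - 30.6) / 9.81)

# Chl-a levels spanning the 50-300 mg·m−3 range discussed in the article.
for chl in (50.0, 150.0, 300.0):
    print(chl, round(carlson_tsi_chl(chl), 1))

# Chl-a concentration above which the index exceeds 80.
print(round(chl_for_tsi(80.0), 1))
```

Under this form of the index, a concentration of 300 mg·m−3 corresponds to a TSI of roughly 86.6, and the index exceeds 80 once Chl-a rises above roughly 154 mg·m−3.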

4.5. Model Applicability and Transferability

The practical value of the proposed CNN framework extends beyond the specific study sites, encompassing potential applications to different sensors and geographical regions. The proposed CNN framework is expected to be applicable to Landsat-9 OLI-2 images without retraining, given the near-identical spectral response functions and radiometric characteristics between Landsat-8 OLI and Landsat-9 OLI-2 [61]. Xu et al. [62] demonstrated that the surface reflectance differences between the two sensors were consistent across most spectral bands, with $R^2$ values exceeding 0.95. This high radiometric consistency enables the trained ConvNet-AC and ConvNet-CHL models to be seamlessly applied to Landsat-9 data, allowing the integration of both satellites for operational water quality monitoring.
The combination of Landsat-8 and Landsat-9 reduces the effective revisit interval from 16 days to 8 days. This doubling of temporal resolution significantly enhances the capability for near-real-time water quality monitoring, enabling frequent detection of rapid changes, such as algal bloom development and post-rainfall turbidity events, and seasonal eutrophication dynamics. With regard to the eutrophic lakes in Northern Vietnam, where the Chl-a concentrations can substantially fluctuate within days due to monsoon-driven nutrient loading, the eight-day revisit interval provides a more practical monitoring frequency compared with single-satellite observations. Furthermore, the increased temporal density improves the probability of acquiring cloud-free imagery, which is particularly advantageous in tropical regions characterized by persistent cloud cover during the rainy season.
The geographical applicability of the proposed framework primarily depends on the similarity of the optical water types between training and target sites. The study lakes (Ho Tay, Linh Dam, and Suoi Hai) represent optically complex Case-2 waters characterized by moderate-to-high turbidity and eutrophic-to-hypertrophic conditions. The application to lakes with similar optical properties in tropical and subtropical Asia is expected to yield reasonable results without extensive retraining. However, waters with substantially different bio-optical characteristics, such as clear oligotrophic lakes with low phytoplankton biomass, will likely require site-specific model calibration or transfer learning approaches.

5. Conclusions and Future Works

This study developed and validated an integrated end-to-end CNN framework for Chl-a retrieval from L8/C2-L1 imagery in tropical inland waters, addressing the critical challenge of error propagation between AC and bio-optical parameter estimation, particularly Chl-a concentration. The proposed framework comprises ConvNet-AC for AC and ConvNet-CHL for Chl-a retrieval, and it outperformed existing methods when applied to seven eutrophic water bodies in Northern Vietnam. The ConvNet-AC model achieved robust AC performance with an average $R^2$ of 0.72 and RMSE of 0.0024 sr−1 across five spectral bands at wavelengths λ = 443, 482, 561, 655, and 865 nm, outperforming six state-of-the-art AC algorithms: ACOLITE, C2RCC, FLAASH, iCOR, L8SR, and QUAC. In terms of Chl-a retrieval, ConvNet-CHL achieved $R^2$ = 0.73 and RMSE = 40.40 mg·m−3, representing a 43.8% improvement in RMSE over the regional green-to-blue band ratio (HaGrB) algorithm previously developed for this study area.
Moreover, the proposed models successfully captured the spatiotemporal dynamics of the Chl-a concentrations in Lake Ho Tay using L8/C2-L1 imagery acquired from 2016 to 2025, revealing hypereutrophic conditions with distinct seasonal patterns characterized by high concentrations during dry seasons due to reduced flushing and nutrient accumulation. The knowledge distillation approach, combining limited in situ measurements with physics-based simulations and data augmentation, proved effective in overcoming the data scarcity common in tropical regions. This methodology enabled the CNN models to generalize across diverse optical water types despite limited training on in situ $R_{rs}(\lambda)$ spectra and Chl-a measurements, demonstrating the potential for transfer learning in data-limited environments.
However, several limitations should be acknowledged in this study. The reliance on synthetic training data may introduce biases that affect model performance under extreme conditions not well represented in the training set. The spatial coverage, limited to seven lakes in Northern Vietnam, may restrict the model’s applicability to other tropical regions with different optical properties. Despite these limitations, the proposed framework establishes a foundation for operational water quality monitoring in data-scarce tropical regions. Future research should focus on expanding the geographic and temporal coverage of validation datasets, testing model transferability across diverse water body types, integrating higher-resolution satellite sensors such as Sentinel-2 MSI to address spatial resolution constraints, and extending the framework to other water quality parameters, such as total suspended solids and CDOM.
This integrated CNN framework represents a significant advancement in operational water quality monitoring for tropical inland waters, providing water resource managers with a reliable tool for assessing eutrophication status. At the local level, the framework offers substantial practical value for environmental management authorities by generating spatially continuous Chl-a concentration maps from freely available Landsat imagery. Urban lake managers in Hanoi can utilize this cost-effective monitoring tool to identify pollution hotspots, prioritize pollution control actions, and assess whether implemented treatment measures are working effectively. Beyond local applications, the approach is particularly valuable for developing countries where traditional monitoring infrastructure is limited, but satellite data are freely available. Such scalable and accurate remote sensing solutions become increasingly critical for sustainable water management as climate change and urbanization continue to threaten freshwater resources globally.

Author Contributions

Conceptualization, L.T.D., M.V.N. and C.-H.L.; methodology, L.T.D., M.V.N. and C.-H.L.; software, L.T.D., M.V.N. and D.H.D.; validation, C.-H.L., H.T.T.N. and T.P.T.N.; formal analysis, L.T.D., M.V.N., D.H.D. and C.Q.N.; investigation, C.Q.N. and T.P.T.N.; data curation, H.T.T.N.; writing—original draft preparation, L.T.D. and M.V.N.; writing—review and editing, L.T.D., H.T.T.N., C.-H.L., D.H.D. and C.Q.N.; visualization, H.T.T.N. and M.V.N.; supervision, M.V.N.; project administration, L.T.D. and M.V.N.; funding acquisition, L.T.D. and M.V.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Vietnam Ministry of Education and Training, grant number B2024-SPH-15.

Data Availability Statement

The original contributions presented in the study are included in the article. For any further inquiries, please contact the corresponding author.

Acknowledgments

The authors would like to acknowledge Landsat-8 satellite imagery support from USGS. We further appreciate reviews provided by three anonymous reviewers.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Dokulil, M.T.; Teubner, K. Eutrophication and Climate Change: Present Situation and Future Scenarios. In Eutrophication: Causes, Consequences and Control; Springer: Basingstoke, UK, 2011; pp. 1–16.
  2. Smith, V.H.; Tilman, G.D.; Nekola, J.C. Eutrophication: Impacts of excess nutrient inputs on freshwater, marine, and terrestrial ecosystems. Environ. Pollut. 1999, 100, 179–196.
  3. Wang, S.L.; Li, J.S.; Zhang, B.; Spyrakos, E.; Tyler, A.N.; Shen, Q.; Zhang, F.F.; Kutser, T.; Lehmann, M.K.; Wu, Y.H.; et al. Trophic state assessment of global inland waters using a MODIS-derived Forel-Ule index. Remote Sens. Environ. 2018, 217, 444–460.
  4. Heisler, J.; Glibert, P.M.; Burkholder, J.M.; Anderson, D.M.; Cochlan, W.; Dennison, W.C.; Dortch, Q.; Gobler, C.J.; Heil, C.A.; Humphries, E.; et al. Eutrophication and harmful algal blooms: A scientific consensus. Harmful Algae 2008, 8, 3–13.
  5. Paerl, H.W.; Huisman, J. Climate change: A catalyst for global expansion of harmful cyanobacterial blooms. Environ. Microbiol. Rep. 2009, 1, 27–37.
  6. Ha, N.T.T.; Dzung, D.T.; Hang, H.T.T.; Huy, T.Q.; Tu, N.N. Water quality assessment and eutrophic classification of Hanoi lakes using different indices. Vietnam J. Agric. Sci. 2021, 4, 1229–1240.
  7. Nguyen, T.L.; Pham, T.H.T.; Luong, T.P.; Vu, T.H.; Nguyen, T.T.H.; Pham, Q.V. Using Sentinel-2B Imagery to Estimate the Eutrophication Level of Linh Dam Lake, Hoang Mai District, Hanoi. VNU J. Sci. Earth Environ. Sci. 2019, 35, 88–96.
  8. Vinh, P.Q.; Ha, N.T.T.; Thao, N.T.P.; Linh, N.T.; Oanh, L.; Phuong, L.T.; Huyen, N.T.T. Monitoring the trophic state of shallow urban lakes using Landsat 8/OLI data: A case study of lakes in Hanoi (Vietnam). Front. Earth Sci. 2022, 19, 25–40.
  9. Ha, N.T.T.; Koike, K.; Nhuan, M.T.; Canh, B.D.; Thao, N.T.P.; Parsons, M. Landsat 8/OLI Two Bands Ratio Algorithm for Chlorophyll-A Concentration Mapping in Hypertrophic Waters: An Application to West Lake in Hanoi (Vietnam). IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 4919–4929.
  10. Linh, N.T.; Ha, N.T.T.; Thao, N.T.P.; Vinh, P.Q. Assessing trophic status of Suoi Hai Reservoir using Carlson’s Trophic State Index. Vietnam J. Earth Sci. 2021, 43, 509–523.
  11. Gons, H.J. Optical teledetection of chlorophyll a in turbid inland waters. Environ. Sci. Technol. 1999, 33, 1127–1132.
  12. Santos, V.O.; Guimaraes, B.; Neto, I.E.L.; de Souza, F.D.; Rocha, P.A.C.; Thé, J.V.; Gharabaghi, B. Chlorophyll-a Estimation in 149 Tropical Semi-Arid Reservoirs Using Remote Sensing Data and Six Machine Learning Methods. Remote Sens. 2024, 16, 1870.
  13. Synan, H.E.; Howes, B.L.; Sampieri, S.; Lohrenz, S.E. Water Quality Monitoring Using Landsat 8 OLI in Pleasant Bay, Massachusetts, USA. Remote Sens. 2025, 17, 638.
  14. Carlson, R.E.; Simpson, J. A Coordinator’s Guide to Volunteer Lake Monitoring Methods; North American Lake Management Society: Madison, WI, USA, 1996; 96p.
  15. Amieva, J.F.; Oxoli, D.; Brovelli, M.A. Machine and Deep Learning Regression of Chlorophyll-a Concentrations in Lakes Using PRISMA Satellite Hyperspectral Imagery. Remote Sens. 2023, 15, 5385.
  16. Yu, J.; Zhang, Z.H.; Lin, Y.; Zhang, Y.G.; Ye, Q.; Zhou, X.F.; Wang, H.T.; Qu, M.Z.; Ren, W.W. Optimal Hyperspectral Characteristic Parameters Construction and Concentration Retrieval for Inland Water Chlorophyll-a Under Different Motion States. Remote Sens. 2024, 16, 4323.
  17. Gholizadeh, M.H.; Melesse, A.M.; Reddi, L. A Comprehensive Review on Water Quality Parameters Estimation Using Remote Sensing Techniques. Sensors 2016, 16, 1298.
  18. Matthews, M.W. A current review of empirical procedures of remote sensing in inland and near-coastal transitional waters. Int. J. Remote Sens. 2011, 32, 6855–6899.
  19. Ritchie, J.C.; Zimba, P.V.; Everitt, J.H. Remote sensing techniques to assess water quality. Photogramm. Eng. Remote Sens. 2003, 69, 695–704.
  20. O’Reilly, J.E.; Maritorena, S.; Mitchell, B.G.; Siegel, D.A.; Carder, K.L.; Garver, S.A.; Kahru, M.; McClain, C. Ocean color chlorophyll algorithms for SeaWiFS. J. Geophys. Res.-Ocean. 1998, 103, 24937–24953.
  21. Lee, Z.P.; Carder, K.L.; Arnone, R.A. Deriving inherent optical properties from water color: A multiband quasi-analytical algorithm for optically deep waters. Appl. Opt. 2002, 41, 5755–5772.
  22. Morel, A.; Prieur, L. Analysis of Variations in Ocean Color. Limnol. Oceanogr. 1977, 22, 709–722.
  23. Odermatt, D.; Gitelson, A.; Brando, V.E.; Schaepman, M. Review of constituent retrieval in optically deep and complex waters from satellite imagery. Remote Sens. Environ. 2012, 118, 116–126.
  24. Sagan, V.; Peterson, K.T.; Maimaitijiang, M.; Sidike, P.; Sloan, J.; Greeling, B.A.; Maalouf, S.; Adams, C. Monitoring inland water quality using remote sensing: Potential and limitations of spectral indices, bio-optical simulations, machine learning, and cloud computing. Earth-Sci. Rev. 2020, 205, 103187.
  25. Chusnah, W.N.; Chu, H.J. Estimating chlorophyll-a concentrations in tropical reservoirs from band-ratio machine learning models. Remote Sens. Appl.-Soc. Environ. 2022, 25, 100678.
  26. Cao, Z.G.; Ma, R.H.; Duan, H.T.; Pahlevan, N.; Melack, J.; Shen, M.; Xue, K. A machine learning approach to estimate chlorophyll-a from Landsat-8 measurements in inland lakes. Remote Sens. Environ. 2020, 248, 111974.
  27. Beal, M.R.W.; Özdogan, M.; Block, P.J. A Machine Learning and Remote Sensing-Based Model for Algae Pigment and Dissolved Oxygen Retrieval on a Small Inland Lake. Water Resour. Res. 2024, 60, e2023WR035744.
  28. Shamloo, A.; Sima, S. Investigating the potential of remote sensing-based machine-learning algorithms to model Secchi-disk depth, total phosphorus, and chlorophyll-a in Lake Urmia. J. Great Lakes Res. 2024, 50, 102370.
  29. Pyo, J.; Duan, H.; Baek, S.; Kim, M.S.; Jeon, T.; Kwon, Y.S.; Lee, H.; Cho, K.H. A convolutional neural network regression for quantifying cyanobacteria using hyperspectral imagery. Remote Sens. Environ. 2019, 233, 111350.
  30. Syariz, M.A.; Lin, C.H.; Nguyen, M.V.; Jaelani, L.M.; Blanco, A.C. WaterNet: A Convolutional Neural Network for Chlorophyll-a Concentration Retrieval. Remote Sens. 2020, 12, 1966.
  31. Zeng, Y.; Liang, T.; Fan, D.; He, H. A Novel Algorithm for the Retrieval of Chlorophyll a in Marine Environments Using Deep Learning. Water 2023, 15, 3864.
  32. Jeong, B.; Lee, S.; Heo, J.; Lee, J.; Lee, M.J. Deep Learning-Based Retrieval of Chlorophyll-a in Lakes Using Sentinel-1 and Sentinel-2 Satellite Imagery. Water 2025, 17, 1718.
  33. Ilori, C.O.; Pahlevan, N.; Knudby, A. Analyzing Performances of Different Atmospheric Correction Techniques for Landsat 8: Application for Coastal Remote Sensing. Remote Sens. 2019, 11, 469.
  34. Pahlevan, N.; Smith, B.; Schalles, J.; Binding, C.; Cao, Z.G.; Ma, R.H.; Alikas, K.; Kangro, K.; Gurlin, D.; Hà, N.; et al. Seamless retrievals of chlorophyll-a from Sentinel-2 (MSI) and Sentinel-3 (OLCI) in inland and coastal waters: A machine-learning approach. Remote Sens. Environ. 2020, 240, 111604.
  35. Pahlevan, N.; Mangin, A.; Balasubramanian, S.V.; Smith, B.; Alikas, K.; Arai, K.; Barbosa, C.; Bélanger, S.; Binding, C.; Bresciani, M.; et al. ACIX-Aqua: A global assessment of atmospheric correction methods for Landsat-8 and Sentinel-2 over lakes, rivers, and coastal waters. Remote Sens. Environ. 2021, 258, 112366.
  36. Gordon, H.R. Evolution of Ocean Color Atmospheric Correction: 1970–2005. Remote Sens. 2021, 13, 5051.
  37. Chen, J.Y.; Chen, S.S.; Fu, R.; Li, D.; Jiang, H.; Wang, C.Y.; Peng, Y.S.; Jia, K.; Hicks, B.J. Remote Sensing Big Data for Water Environment Monitoring: Current Status, Challenges, and Future Prospects. Earths Future 2022, 10, e2021EF002289.
  38. Vermote, E.F.; Tanre, D.; Deuze, J.L.; Herman, M.; Morcrette, J.J. Second Simulation of the Satellite Signal in the Solar Spectrum, 6S: An overview. IEEE Trans. Geosci. Remote Sens. 1997, 35, 675–686.
  39. Berk, A.; Anderson, G.P.; Bernstein, L.S.; Acharya, P.K.; Dothe, H.; Matthew, M.W.; Adler-Golden, S.M.; Chetwynd, J.H., Jr.; Richtsmeier, S.C.; Pukall, B. MODTRAN4 radiative transfer modeling for atmospheric correction. In Proceedings of the Optical Spectroscopic Techniques and Instrumentation for Atmospheric and Space Research III; SPIE: Bellingham, WA, USA, 1999; Volume 3756, pp. 348–353.
  40. Chavez, P.S. An Improved Dark-Object Subtraction Technique for Atmospheric Scattering Correction of Multispectral Data. Remote Sens. Environ. 1988, 24, 459–479.
  41. Doxani, G.; Vermote, E.; Roger, J.C.; Gascon, F.; Adriaensen, S.; Frantz, D.; Hagolle, O.; Hollstein, A.; Kirches, G.; Li, F.Q.; et al. Atmospheric Correction Inter-Comparison Exercise. Remote Sens. 2018, 10, 352.
  42. Fan, Y.Z.; Li, W.; Chen, N.; Ahn, J.H.; Park, Y.J.; Kratzer, S.; Schroeder, T.; Ishizaka, J.; Chang, R.; Stamnes, K. OC-SMART: A machine learning based data analysis platform for satellite ocean color sensors. Remote Sens. Environ. 2021, 253, 112236.
  43. Zhao, X.; Ma, Y.; Xiao, Y.F.; Liu, J.Q.; Ding, J.; Ye, X.M.; Liu, R.J. Atmospheric correction algorithm based on deep learning with spatial-spectral feature constraints for broadband optical satellites: Examples from the HY-1C Coastal Zone Imager. ISPRS J. Photogramm. Remote Sens. 2023, 205, 147–162.
  44. Shah, M.; Raval, M.S.; Divakaran, S.; Dhar, D.; Parmar, H. A spatio-temporal deep learning model for enhanced atmospheric correction. Model. Earth Syst. Environ. 2025, 11, 3.
  45. Shah, M.T.; Raval, M.S.; Divakaran, S.; Dhar, D.; Parmar, H. SAAC-Net: Deep neural network-based model for atmospheric correction in remote sensing. Int. J. Remote Sens. 2023, 44, 7365–7389.
  46. Neil, C.; Spyrakos, E.; Hunter, P.D.; Tyler, A.N. A global approach for chlorophyll-a retrieval across optically complex inland waters based on optical water types. Remote Sens. Environ. 2019, 229, 159–178.
  47. Vanhellemont, Q.; Ruddick, K. Atmospheric correction of metre-scale optical satellite data for inland and coastal water applications. Remote Sens. Environ. 2018, 216, 586–597.
  48. Pyo, J.; Duan, H.; Ligaray, M.; Kim, M.; Baek, S.; Kwon, Y.S.; Lee, H.; Kang, T.; Kim, K.; Cha, Y.; et al. An Integrative Remote Sensing Application of Stacked Autoencoder for Atmospheric Correction and Cyanobacteria Estimation Using Hyperspectral Imagery. Remote Sens. 2020, 12, 1073.
  49. Li, Y.; Zhang, H.K.; Shen, Q. Spectral-Spatial Classification of Hyperspectral Imagery with 3D Convolutional Neural Network. Remote Sens. 2017, 9, 67.
  50. American Public Health Association; American Water Works Association; Water Environment Federation. Standard Methods for the Examination of Water and Wastewater, 24th ed.; American Public Health Association: Washington, DC, USA, 2022.
  51. U.S. Geological Survey. Landsat 8 (L8) Data Users Handbook; Technical Report LSDS-1574; U.S. Geological Survey: Reston, VA, USA, 2019.
  52. De Keukelaere, L.; Sterckx, S.; Adriaensen, S.; Knaeps, E.; Reusen, I.; Giardino, C.; Bresciani, M.; Hunter, P.; Neil, C.; Van der Zande, D.; et al. Atmospheric correction of Landsat-8/OLI and Sentinel-2/MSI data using iCOR algorithm: Validation for coastal and inland waters. Eur. J. Remote Sens. 2018, 51, 525–542.
  53. Pahlevan, N.; Schott, J.R.; Franz, B.A.; Zibordi, G.; Markham, B.; Bailey, S.; Schaaf, C.B.; Ondrusek, M.; Greb, S.; Strait, C.M. Landsat 8 remote sensing reflectance (Rrs) products: Evaluations, intercomparisons, and enhancements. Remote Sens. Environ. 2017, 190, 289–301.
  54. U.S. Geological Survey. Landsat 8-9 Collection 2 Level-2 Provisional Aquatic Reflectance Algorithm Description Document; U.S. Geological Survey: Reston, VA, USA, 2024.
  55. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; The MIT Press: Cambridge, MA, USA, 2016.
  56. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980v9. [Google Scholar] [CrossRef]
  57. Li, L.; Jamieson, K.; DeSalvo, G.; Rostamizadeh, A.; Talwalkar, A. Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization. J. Mach. Learn. Res. 2018, 18, 1–52. [Google Scholar]
  58. Bricker, S.B.; Longstaff, B.; Dennison, W.; Jones, A.; Boicourt, K.; Wicks, C.; Woerner, J. Effects of nutrient enrichment in the nation’s estuaries: A decade of change. Harmful Algae 2008, 8, 21–32.
  59. Ta, D.T.; Bui, Q.L. Analysis of seasonal variations of factors affecting algal growth in a lake in Hanoi using the eutrophication model. J. Irrig. Sci. Environ. 2019, 64, 60–68.
  60. Singh, S.P.; Singh, P. Effect of temperature and light on the growth of algae species: A review. Renew. Sustain. Energy Rev. 2015, 50, 431–444.
  61. U.S. Geological Survey. Landsat 8-9 Calibration and Validation Algorithm Description Document; Technical Report LSDS-1747; U.S. Geological Survey: Reston, VA, USA, 2025.
  62. Xu, H.Q.; Ren, M.J.; Lin, M.J. Cross-comparison of Landsat-8 and Landsat-9 data: A three-level approach based on underfly images. GIScience Remote Sens. 2024, 61, 2318071.
Figure 1. Locations of the seven study lakes showing their trophic states and in situ sampling points. The green triangles denote the measured $R_{rs}(\lambda)$ sites, and the blue squares denote the locations with $R_{rs}(\lambda)$ and Chl-a measurements. Field photographs illustrate water color characteristics of the studied lakes: (a) Lake Ho Tay; (b) Lake Ba Be; (c) Lake Bay Mau; and (d) Lake Suoi Hai.
Figure 2. Distribution of the training dataset for ConvNet-CHL, combining synthetic Chl-a values and augmented in situ Chl-a measurements across concentrations from 50 mg·m⁻³ to 300 mg·m⁻³.
Figure 3. Overall workflow of the proposed CNN-based framework for atmospheric correction and chlorophyll-a estimation from Landsat-8 imagery.
Figure 4. The fine-tuned ConvNet-AC network comprises convolutional and fully connected subnetworks. Its inputs are TOA reflectance, sun-sensor geometry angles, AOT, and water vapor, and its outputs are five $R_{rs}(\lambda)$ values.
Figure 5. Integrated CNN framework architecture comprising ConvNet-AC (AC) and ConvNet-CHL (Chl-a retrieval), showing the sequential processing from TOA reflectance to Chl-a concentration.
Figure 6. Scatter plots comparing ConvNet-AC predictions with in situ measurements across five Landsat-8 spectral bands. Each panel shows one spectral band with the linear regression (solid line), the 1:1 reference (dashed line), and the regression equation.
Figure 7. Performance comparison of ConvNet-AC against six other atmospheric correction methods across five Landsat-8 spectral bands: (left) coefficient of determination ($R^2$) and (right) root-mean-square error (RMSE).
Figure 8. Scatter plots comparing satellite-derived and in situ $R_{rs}(\lambda)$ across five Landsat-8 spectral bands for three AC methods: ACOLITE (top), C2RCC (middle), and FLAASH (bottom).
Figure 9. Scatter plots comparing satellite-derived and in situ $R_{rs}(\lambda)$ across five Landsat-8 spectral bands for three AC methods: iCOR (top), L8SR (middle), and QUAC (bottom).
Figure 10. Scatter plots of predicted and measured Chl-a concentrations for (left) ConvNet-CHL and (right) the regional green-to-blue band ratio (HaGrB) algorithm.
Figure 11. Spatiotemporal distribution of Chl-a concentrations retrieved by ConvNet-CHL from Landsat-8 OLI imagery over Lake Ho Tay (2016–2025), overlaid on a false-color composite (RGB: NIR–red–green) background. The Chl-a maps illustrate seasonal variability in phytoplankton biomass across the lake. The color scale ranges from 50 to 300 mg·m⁻³, with higher concentrations (red) typically observed in spring and summer months.
Table 1. Descriptive statistics of Chl-a concentrations (mg·m⁻³) measured in three eutrophic study lakes.

| Lake | Date | N | Min. | Max. | Mean | St. Dev. |
|---|---|---|---|---|---|---|
| Ho Tay | 1 June 2016 | 8 | 53.70 | 85.0 | 70.9 | 12.9 |
| Ho Tay | 13 August 2019 | 13 | 186.6 | 288.4 | 223.8 | 35.1 |
| Linh Dam | 1 April 2017 | 20 | 50.0 | 106.2 | 77.0 | 17.8 |
| Suoi Hai | 30 September 2019 | 28 | 70.0 | 135.0 | 90.2 | 12.2 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Nguyen, M.V.; Duong, L.T.; Lin, C.-H.; Nguyen, H.T.T.; Nguyen, C.Q.; Dinh, D.H.; Nguyen, T.P.T. Deep Learning Framework for Atmospheric Correction and Chlorophyll-a Estimation from Landsat-8 Images over the Inland Waters of Northern Vietnam. Water 2026, 18, 498. https://doi.org/10.3390/w18040498
