1. Introduction
Inland aquatic ecosystems play a crucial role in supporting human well-being, providing ecosystem services that encompass multiple aspects, including the supply of drinking water resources [1]. However, these water bodies face multiple anthropogenic pressures, including climate change, and are under severe threat globally [2,3]. Phytoplankton abundance is a key indicator of the trophic status and water quality of inland waters [4]. In recent decades, the intensification of eutrophication coupled with global warming has substantially increased the frequency and severity of phytoplankton blooms [5,6]. The increase in occurrence of harmful algal blooms has disrupted water quality and ecological stability in many inland water systems [7,8]. Therefore, the capability to rigorously monitor and accurately quantify phytoplankton dynamics is of critical importance for protecting aquatic ecosystems and sustaining human well-being.
Chlorophyll-a (Chl-a) is ubiquitously present in phytoplankton and is therefore widely used as a proxy for phytoplankton biomass [9,10]. It serves as a fundamental parameter in aquatic ecology, environmental monitoring, and water resource management [11]. Traditionally, in situ sampling provides accurate measurements of water quality parameters at specific sampling sites. However, such methods are time-consuming, labor-intensive, and costly, thereby limiting their applicability for large-scale monitoring [12,13]. With the advantages of broad spatial coverage and continuous temporal observation, remote sensing technology has gradually become a mainstream approach for characterizing the spatial distribution and temporal dynamics of water quality constituents [14,15]. Common satellite sensors used for the remote estimation of Chl-a concentration include the Ocean and Land Colour Instrument (OLCI), the Medium Resolution Imaging Spectrometer (MERIS), Landsat TM/ETM+/OLI, and the Moderate Resolution Imaging Spectroradiometer (MODIS) [16].
Nevertheless, the vertical and horizontal migration of phytoplankton is influenced by factors such as light conditions and hydrological dynamics, causing Chl-a concentrations to fluctuate significantly over the course of a day or even within a few hours [17,18]. Spaceborne remote sensing is constrained by satellite revisit cycles and cloud contamination, resulting in limited availability of high-quality imagery that can capture the rapid spatiotemporal variations in phytoplankton blooms in inland waters [19]. Previous studies have reported that in some regions, only about 24% of Landsat images acquired throughout the year are usable, and from April to July, the peak bloom period, clear images are almost unavailable [20]. Consequently, satellite-based remote sensing cannot always provide timely and accurate water quality information due to factors such as cloud cover, precipitation, and relatively coarse spatial resolution [21,22]. This limitation is particularly critical for locally sensitive areas, such as drinking-water sources, where real-time or emergency monitoring is required.
Unmanned aerial vehicle (UAV) remote sensing, characterized by high spatial–temporal resolution and flexible image acquisition, provides an efficient and cost-effective means for monitoring the water quality of small and medium-sized inland water bodies such as lakes, rivers, and reservoirs [23]. By bridging the gap between ground-based sampling and satellite observations, UAV remote sensing enables detailed spatial characterization and timely assessment of water quality dynamics, thereby supporting precise water management and pollution control efforts [24,25]. In recent years, numerous studies have utilized UAV systems to investigate phytoplankton dynamics in small inland waters. UAV platforms equipped with RGB, multispectral, or hyperspectral sensors have been successfully employed to retrieve Chl-a concentrations in various aquatic environments [26]. Among these, hyperspectral sensors, owing to their fine spectral resolution on the order of several nanometers, have demonstrated superior accuracy in water quality retrieval, particularly in complex inland water systems [27,28].
UAVs have been widely applied in various emergency monitoring scenarios. However, most existing studies focus on geological disasters, wildfires, and floods, while relatively few have addressed water quality or algal bloom emergencies [29,30]. Algal bloom emergency monitoring refers to a rapid-response environmental procedure initiated when a sudden bloom occurs or when early signs suggest an impending outbreak. This type of monitoring requires a quick assessment of bloom distribution in key areas within a limited timeframe. It enables rapid response, real-time tracking, and scientific risk evaluation, providing essential data support for subsequent warning, management, and mitigation actions. UAVs are particularly well-suited for this purpose. Their ability to operate flexibly at low altitudes allows them to capture high-spatial-resolution images at appropriate temporal frequencies. These advantages make UAV-based remote sensing one of the most effective tools for emergency monitoring of inland water quality and algal bloom events. Nevertheless, UAV imagery alone cannot accurately quantify water quality parameters without temporally matched in situ sampling data. Moreover, the use of conventional manned sampling vessels is restricted in some protected drinking-water source areas. In this context, integrating UAV remote sensing with uncrewed surface vessel (USV)-based in situ sampling becomes essential. USVs can be remotely operated to collect water samples in protected, hazardous, or otherwise inaccessible areas while minimizing disturbance to sensitive aquatic environments. When combined with UAV remote sensing, the UAV–USV framework enables near-synchronous acquisition of hyperspectral imagery and in situ water quality measurements. By improving model calibration under small-sample conditions, the integration of UAV-based remote sensing with synchronized USV sampling enhances the reliability of rapid Chl-a assessment, thereby providing an effective technical solution for emergency monitoring of inland algal bloom events.
The emergence of artificial intelligence (AI) technologies has significantly transformed water quality monitoring using remote sensing data. These data-driven approaches are capable of processing large volumes of data and capturing the complex nonlinear relationships between remote sensing signals and water quality parameters [31,32]. As a core branch of AI, machine learning has been widely employed for the retrieval of Chl-a concentration in water bodies. For instance, in irrigation ponds located in Higashihiroshima, Japan, the iterative stepwise elimination partial least squares (ISE–PLS) regression method was used to retrieve Chl-a and total suspended solids [33]. In a case study in Hong Kong, researchers employed multiple machine learning algorithms to estimate the concentrations of suspended solids, Chl-a, and turbidity [34]. In Hubei Province, UAV-based high-frequency hyperspectral observations of typical inland waters were used to construct a Chl-a retrieval model using XGBoost combined with feature selection techniques, enabling the analysis of diel variations in Chl-a and their driving factors [35]. Similarly, in the Maozhou River of Guangdong Province, a novel Bayesian probabilistic neural network (BPNN) was developed to quantitatively predict several water quality parameters, including phosphorus, nitrogen, chemical oxygen demand (COD), biochemical oxygen demand (BOD), and Chl-a [36]. This model was successfully applied to UAV-based hyperspectral imagery for large-scale water quality monitoring and pollution source tracking, yielding interpretable and significant results. Previous studies have shown that among various machine learning algorithms for Chl-a retrieval, the random forest (RF) model is the most widely used, achieving an average coefficient of determination (R2) of approximately 0.7 and demonstrating robust performance (R2 > 0.7) on training datasets [10]. However, it is important to note that the “black-box” nature of specific AI algorithms limits their interpretability, hindering a deeper understanding of the underlying processes [37]. Furthermore, when models are overly complex or trained on limited datasets, overfitting may occur, thereby reducing their predictive performance on unseen data [38].
Therefore, this study focuses on the strategically significant Longhu Reservoir in Jinjiang, Fujian Province, as the study area and develops and validates a rapid Chl-a retrieval framework based on a cooperative UAV–USV hyperspectral monitoring system. First, a UAV–USV collaborative sampling scheme was designed to enable high-temporal-resolution acquisition of hyperspectral imagery and in situ Chl-a measurements within a three-day window. Subsequently, an efficient and practical method for correcting stripe noise in UAV hyperspectral imagery was proposed to optimize the preprocessing results. On this basis, a data-driven feature selection strategy was employed to systematically compare the retrieval performance of four machine learning models—the RF algorithm, the back propagation (BP) neural network algorithm, the particle swarm optimization-based least squares support vector machine (PSO–LSSVM) algorithm, and partial least squares (PLS) regression. The spatial distribution characteristics of Chl-a concentration were also analyzed. The overall objective of this study is to establish a rapid-deployment emergency monitoring framework for algal blooms by integrating UAV-based hyperspectral imagery with an unmanned sampling system and machine learning algorithms. This framework is expected to provide technical support for Chl-a retrieval in inland waters and contribute to the development of an integrated ground–aerial–space water environment monitoring network.
3. Results
3.1. De-Striping
During the stripe correction process, to prevent sun glint interference from biasing the column-wise median estimation, a sun glint detection mechanism based on the Sun Glint Aware Restoration (SUGAR) algorithm was incorporated [57]. Detected sun glint regions were masked to eliminate their influence on the stripe correction results. Additionally, the reference baseline for stripe correction was derived from the flight strip acquired during the middle phase of the UAV’s flight mission. By applying a unified correction to all flight strips, this procedure aimed to minimize subtle variations in water-body irradiance caused by changing illumination conditions during flight.
As shown in Figure 5a, the original UAV image exhibits a noticeable brightness attenuation at the edges perpendicular to the flight direction, where the edge regions appear significantly darker than the image center. In addition, dark stripe noise of varying widths appears intermittently across the image, resulting in local distortion. From the 3D data view in Figure 5b, it can be further observed that the DN values of the water body decrease more sharply near the image edges. After processing with the median-based stripe correction method, the overall image becomes more uniform in tone, and color consistency is significantly improved (Figure 5c). The dark stripes and edge attenuation artifacts are effectively removed, and the water body features are well restored, as also verified in the 3D data view shown in Figure 5d. These results indicate that the proposed median correction approach can effectively suppress sensor-induced noise. The proposed method corrects both stripe noise and edge attenuation without requiring a white reference panel [41]. Compared with the structure-guided one-way transformation approach, this method avoids complex parameter tuning while substantially reducing image processing time [58].
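The column-wise median correction described above can be illustrated with a minimal sketch. It assumes a multiplicative gain per cross-track column, a Boolean glint mask supplied by a separate detector, and a scalar `reference_level` taken from the reference flight strip; the function names and the synthetic image are hypothetical and are not the exact procedure used in this study.

```python
import numpy as np


def column_gains(band, glint_mask, reference_level):
    """Per-column multiplicative gains from glint-masked column medians."""
    masked = np.where(glint_mask, np.nan, band.astype(float))
    medians = np.nanmedian(masked, axis=0)          # one median per cross-track column
    gains = np.ones_like(medians)
    usable = np.isfinite(medians) & (medians > 0)   # leave fully masked columns unchanged
    gains[usable] = reference_level / medians[usable]
    return gains


def destripe(band, glint_mask, reference_level):
    """Scale every column so that its masked median matches the reference level."""
    return band * column_gains(band, glint_mask, reference_level)[np.newaxis, :]


# Synthetic single-band strip: uniform water (DN = 100) with edge attenuation,
# one dark stripe, and a few glint pixels (all values are illustrative).
rows, cols = 50, 20
profile = 1.0 - 0.4 * np.linspace(-1, 1, cols) ** 2   # darker toward both edges
profile[7] *= 0.6                                      # a dark stripe in column 7
band = 100.0 * np.tile(profile, (rows, 1))
glint = np.zeros((rows, cols), dtype=bool)
glint[10:13, 5] = True
band[glint] = 1000.0

reference_level = 100.0   # assumed median water DN of the reference strip
corrected = destripe(band, glint, reference_level)
```

Because the gain is estimated from the median of unmasked pixels only, the bright glint pixels in column 5 do not bias the correction of that column.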
3.2. Data Analysis of Longhu Reservoir Samples
Spectral reflectance data at each sampling point were extracted from the preprocessed UAV-based hyperspectral images. Since the spectral bands beyond 800 nm approached near-zero values, a total of 198 bands within the range of 396.05 nm to 805.15 nm were selected for spectral smoothing, and continuous spectral curves were plotted (Figure 6a). Considering that variations in illumination conditions occurred during UAV data acquisition, the original spectral reflectance values exhibited certain fluctuations. Therefore, all spectral curves were subjected to min–max normalization to facilitate visualization. After normalization, the spectral distributions became more compact, highlighting the distinctive spectral shape characteristics.
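As an illustration of the normalization step, the following sketch rescales each spectrum to the range [0, 1]. It assumes that normalization is applied per spectrum (row-wise); the text does not state whether a per-spectrum or a global scaling was used, and the example values are hypothetical.

```python
import numpy as np


def min_max_normalize(spectra):
    """Rescale each spectrum (row) to [0, 1]; flat spectra are mapped to zeros."""
    spectra = np.asarray(spectra, dtype=float)
    lo = spectra.min(axis=1, keepdims=True)
    hi = spectra.max(axis=1, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)   # avoid division by zero
    return (spectra - lo) / span


# Two hypothetical spectra with the same shape but different brightness levels.
base = np.array([0.02, 0.05, 0.08, 0.04, 0.01])
spectra = np.vstack([base, 1.5 * base])
normalized = min_max_normalize(spectra)
```

Under this assumption, spectra that differ only by an illumination-related scale factor collapse onto the same normalized curve, which is why the normalized curves appear more compact.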
The original spectral curves exhibit an overall increasing trend in water surface reflectance within the 400–550 nm range, with a peak at approximately 550 nm. As illustrated in the normalized spectra (Figure 6b), the water body with the highest Chl-a concentration (yellow curve) exhibits a pronounced and steep reflectance peak in the green band. In contrast, the water body with the lowest Chl-a concentration (purple curve) shows a more gradual variation in reflectance across this wavelength range. Around 670 nm, all spectra display a typical red absorption trough, caused by the strong absorption of red light by Chl-a within phytoplankton cells. Beyond 700 nm, the spectral reflectance decreases rapidly, with two minor reflectance peaks observed at approximately 740 nm and 760 nm. Subsequently, due to strong water absorption and scattering by suspended particles, the reflectance approaches zero. Although variations existed in sampling time and spatial location, the overall spectral patterns across all sites remained consistent. The subtle differences observed in the spectral features were primarily attributed to variations in Chl-a concentration.
A total of 30 water samples were collected in this study, and the laboratory analysis results are shown in Figure 7. Due to field constraints, only one sample was obtained at 10:00 on 24 March 2025, which was treated as the mean value for that time period. The Chl-a concentrations exhibited strong spatial consistency among different sampling sites within the same observation period. However, they showed pronounced temporal heterogeneity across different time periods, with a coefficient of variation reaching 41.86%. Overall, the Chl-a concentration displayed a clear increasing trend during the three-day observation period. Detailed descriptive statistics are presented in Table 2.
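For reference, the coefficient of variation is the standard deviation divided by the mean, expressed as a percentage. The sketch below assumes the sample (n − 1) standard deviation and uses hypothetical period means, not the measured values.

```python
import statistics


def coefficient_of_variation(values):
    """Coefficient of variation in percent, using the sample standard deviation."""
    return 100.0 * statistics.stdev(values) / statistics.fmean(values)


# Hypothetical mean Chl-a concentrations (ug/L) for five observation periods.
period_means = [6.0, 8.0, 10.0, 12.0, 14.0]
cv_percent = coefficient_of_variation(period_means)
```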
3.3. Feature Selection Results
The correlation analysis between the original spectral reflectance and Chl-a concentration is illustrated in Figure 8. The analysis indicates that the correlations across most wavelengths are generally weak, with correlation coefficients ranging from 0.2 to 0.3. Such weak correlations may be attributed to residual noise and environmental influences that persisted even after radiometric correction, leading to variability in the reflectance data. The correlation curve reveals that the relationship between reflectance and Chl-a concentration is relatively stronger in the green band (approximately 500–600 nm) and the red-edge region (approximately 680–750 nm). The relatively strong correlation suggests that variations in Chl-a can be effectively captured and indicated by these spectral bands.
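A band-wise correlation curve of this kind can be computed as in the sketch below, which assumes the Pearson correlation coefficient and a samples-by-bands reflectance matrix; the two-band example data are hypothetical.

```python
import numpy as np


def band_correlations(reflectance, chl):
    """Pearson correlation between each band (column) and Chl-a concentration."""
    X = np.asarray(reflectance, dtype=float)
    y = np.asarray(chl, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    r = np.zeros(X.shape[1])
    ok = denom > 0                         # constant bands keep r = 0
    r[ok] = (Xc[:, ok] * yc[:, None]).sum(axis=0) / denom[ok]
    return r


# Hypothetical data: band 0 rises linearly with Chl-a, band 1 falls linearly.
chl = np.array([2.0, 5.0, 8.0, 11.0, 19.0])
reflectance = np.column_stack([0.02 + 0.01 * chl, 0.10 - 0.002 * chl])
r = band_correlations(reflectance, chl)
```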
Using a data-driven approach, all possible three-band combinations within the 400–800 nm range were systematically traversed and computed. A total of 1456 combinations with correlation coefficients higher than 0.7 were initially selected. The occurrence frequency of each band within these preferred combinations was then analyzed (Figure 8). The results show that in the 400–600 nm range, the frequency distribution of bands closely follows their correlation trend with Chl-a. In contrast, the high-frequency occurrence of bands beyond 700 nm primarily reflects the data-driven algorithm’s selection of synergistic effects among specific band combinations. To further minimize multicollinearity and enhance model stability, the variance inflation factor (VIF) was applied to screen these combinations, ultimately yielding nine candidate three-band indices with low redundancy. Considering both their correlation strength and optical interpretability in aquatic environments, two optimal combinations (560–732–772 nm and 531–760–732 nm) were selected as joint input features for subsequent machine-learning modeling. These combinations not only include green wavelengths that are highly sensitive to Chl-a concentration in this study but also align physically with the core wavelength configurations of traditional three-band models, where bands near 730 nm and 760 nm are typically used to correct for suspended matter scattering and water absorption effects.
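The traversal, frequency count, and VIF screening can be sketched as follows. Several assumptions are made here: the index takes the classic three-band form (1/R1 − 1/R2) × R3, combinations are ordered, and the threshold is applied to the absolute Pearson correlation. All function names and the synthetic data are hypothetical, and the VIF is computed from the diagonal of the inverse correlation matrix.

```python
import itertools
from collections import Counter

import numpy as np


def three_band_index(R, i, j, k):
    """Assumed classic three-band form, evaluated per sample."""
    return (1.0 / R[:, i] - 1.0 / R[:, j]) * R[:, k]


def pearson_r(x, y):
    x = x - x.mean()
    y = y - y.mean()
    denom = np.sqrt((x @ x) * (y @ y))
    return float(x @ y / denom) if denom > 0 else 0.0


def search_three_band(R, chl, threshold=0.7):
    """Return ((i, j, k), r) for every ordered band triple with |r| above threshold."""
    hits = []
    for i, j, k in itertools.permutations(range(R.shape[1]), 3):
        r = pearson_r(three_band_index(R, i, j, k), chl)
        if abs(r) > threshold:
            hits.append(((i, j, k), r))
    return hits


def band_frequency(hits):
    """Count how often each band index appears in the retained combinations."""
    return Counter(band for combo, _ in hits for band in combo)


def vif(features):
    """Variance inflation factor of each column (diagonal of inverse correlation matrix)."""
    corr = np.corrcoef(features, rowvar=False)
    return np.diag(np.linalg.inv(corr))


# Synthetic example: 30 samples, 5 bands; Chl-a is built from bands (0, 3, 4).
rng = np.random.default_rng(0)
R = rng.uniform(0.02, 0.10, size=(30, 5))
chl = 5.0 + 0.5 * three_band_index(R, 0, 3, 4)

hits = search_three_band(R, chl)
best_combo, best_r = max(hits, key=lambda h: h[1])
freq = band_frequency(hits)
candidates = np.column_stack([three_band_index(R, 0, 3, 4),
                              three_band_index(R, 1, 2, 4)])
vif_values = vif(candidates)
```

With 198 bands the exhaustive loop covers several million ordered triples, so a vectorized implementation would be preferable in practice; the loop is kept here for readability.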
3.4. Machine Learning Results
In this study, Chl-a concentration prediction models were developed using the RF algorithm, the BP neural network, PSO–LSSVM, and PLS regression. The effectiveness of these models was evaluated based on the performance metrics of the training and testing datasets (Figure 9).
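The three performance metrics reported below can be computed as in this short sketch. It assumes that R2 is defined as one minus the ratio of the residual to the total sum of squares (some studies instead report the squared correlation); the sample values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2 (1 - SS_res / SS_tot), RMSE, and MAE for paired observations."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


# Hypothetical measured and predicted Chl-a concentrations (ug/L).
metrics = regression_metrics([2.0, 4.0, 6.0, 8.0], [3.0, 4.0, 6.0, 7.0])
```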
The RF algorithm, as an ensemble learning approach, has gained widespread attention for its high training efficiency and stable predictive accuracy, particularly demonstrating strong robustness in small-sample datasets. In this study, the RF model achieved R2 = 0.824, RMSE = 1.656, and MAE = 1.414 on the training set. On the testing set, the corresponding values were R2 = 0.768, RMSE = 2.111, and MAE = 2.008. These results indicate that the model maintained high fitting accuracy and generalization capability during both training and testing stages, without any sign of severe overfitting. The mean residuals were close to zero, implying that the predictions were overall unbiased, with the errors being narrowly distributed and free from systematic deviation.
In terms of parameter configuration, setting max_depth = 4 effectively constrained the growth depth of each decision tree, thereby preventing individual trees from overfitting—a crucial consideration for small-sample datasets. Although a shallower tree depth may slightly reduce the fitting capacity, it significantly enhances the model’s generalization performance. Additionally, the relatively flexible settings of min_samples_split = 3 and min_samples_leaf = 1 allowed the model to capture more detailed data patterns within a limited number of samples. At the same time, the ensemble mechanism of RF helped offset the potential noise introduced by individual trees. The remarkable superiority of the RF algorithm in predicting Chl-a concentration from small-sample datasets in this study primarily stems from its ensemble learning strategy and intrinsic stability. Specifically, RF employs bootstrap sampling (bagging) to generate multiple decision trees and integrates their outputs, which effectively reduces model variance and mitigates the risks of overfitting in small-sample contexts.
Furthermore, the algorithm’s out-of-bag error estimation mechanism enables the assessment of generalization performance without the need for an independent validation set—an essential advantage when dealing with limited data. In this study, the minor discrepancy between training and testing performance (a reduction of only 0.056 in R2) further demonstrates that RF achieved a well-balanced trade-off between bias and variance through ensemble averaging, leading to improved stability and reliability under small-sample conditions. Compared with other machine learning algorithms, RF also exhibits advantages in computational efficiency and interpretability when applied to small datasets. The model’s fast training speed and high accuracy are evidenced by its strong training performance (R2 = 0.824) and satisfactory testing accuracy (R2 = 0.768). Such results confirm that RF is a reliable and efficient approach for retrieving water quality parameters—particularly for estimating Chl-a concentration—when data acquisition is constrained. With its straightforward parameter tuning and consistent output, the RF model provides a robust methodological reference for subsequent studies in remote sensing-based water quality assessment.
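A minimal scikit-learn sketch of an RF regressor with the hyperparameters stated above (max_depth = 4, min_samples_split = 3, min_samples_leaf = 1) is given below. The number of trees, the train/test split ratio, the random seeds, and the synthetic data are assumptions made for illustration only; they are not taken from this study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for 30 samples with two three-band index features.
rng = np.random.default_rng(42)
X = rng.uniform(-1.0, 1.0, size=(30, 2))
y = 10.0 + 4.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(0.0, 0.5, size=30)

# Assumed 70/30 split; the split used in the study is not stated here.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

rf = RandomForestRegressor(
    n_estimators=200,        # assumed value
    max_depth=4,             # limits tree depth to curb overfitting
    min_samples_split=3,
    min_samples_leaf=1,
    oob_score=True,          # out-of-bag estimate of generalization (R2)
    random_state=0,
)
rf.fit(X_train, y_train)

train_r2 = rf.score(X_train, y_train)
test_r2 = rf.score(X_test, y_test)
oob_r2 = rf.oob_score_
```

Comparing `train_r2`, `oob_r2`, and `test_r2` gives a quick indication of how much the fit degrades on data that individual trees did not see during training.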
The BP neural network, through its multilayer nonlinear transformation structure, is capable of learning the complex mapping relationships between input features and Chl-a concentration. The model achieved R2 = 0.731, RMSE = 2.048, and MAE = 1.622 on the training set, while the testing set yielded R2 = 0.715, RMSE = 2.339, and MAE = 2.172. These results indicate that the BP neural network also exhibits satisfactory fitting performance in predicting Chl-a concentration under small-sample conditions. Although its overall performance was slightly lower than that of the RF model, it still demonstrated considerable predictive capability. The network employed a two-hidden-layer architecture (32–16 nodes) and used the ReLU activation function to enhance nonlinear representation. A dropout mechanism (rate = 0.2) was introduced to prevent overfitting. The two-layer structure provided an appropriate model capacity under limited sample conditions, ensuring sufficient feature extraction while controlling the number of parameters. The application of the ReLU activation function effectively alleviated the vanishing gradient problem, while the dropout mechanism improved the model’s generalization ability. The primary difference between the BP neural network and the RF model lies in their learning mechanisms. The BP neural network relies on gradient-descent-based optimization, which can easily become trapped in local minima when training on small-sample datasets. In contrast, RF aggregates multiple decision trees through ensemble learning, inherently offering greater stability and resistance to overfitting.
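To make the model capacity concrete, the sketch below builds the forward pass of a 32–16 network with ReLU activations and dropout (rate = 0.2) in NumPy. It assumes two input features, dropout applied after each hidden layer, and He-style random initialization; training by backpropagation is omitted, and all names are hypothetical.

```python
import numpy as np


def init_params(n_in, hidden=(32, 16), seed=0):
    """Weights and biases for n_in -> 32 -> 16 -> 1 (He-style initialization)."""
    rng = np.random.default_rng(seed)
    sizes = (n_in, *hidden, 1)
    return [
        (rng.normal(0.0, np.sqrt(2.0 / a), size=(a, b)), np.zeros(b))
        for a, b in zip(sizes[:-1], sizes[1:])
    ]


def forward(params, X, dropout=0.2, training=False, rng=None):
    """Forward pass; inverted dropout is active only when training is True."""
    if rng is None:
        rng = np.random.default_rng()
    h = np.asarray(X, dtype=float)
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)                 # ReLU
        if training and dropout > 0:
            keep = rng.random(h.shape) >= dropout      # drop about 20% of units
            h = h * keep / (1.0 - dropout)
    W, b = params[-1]
    return (h @ W + b).ravel()                         # linear output layer


def n_parameters(params):
    return sum(W.size + b.size for W, b in params)


params = init_params(n_in=2)
total = n_parameters(params)          # (2*32 + 32) + (32*16 + 16) + (16 + 1)
prediction = forward(params, np.zeros((5, 2)))
```

Under these assumptions the network has 641 trainable parameters, which is already large relative to 30 samples and helps explain why dropout was considered necessary.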
However, in this study, the PSO–LSSVM and PLS models exhibited relatively poor performance in retrieving Chl-a concentration. The PSO–LSSVM model achieved R2 = 0.612, RMSE = 2.538, and MAE = 2.038 on the training set, and R2 = 0.600, RMSE = 2.222, and MAE = 2.034 on the testing set. Similarly, the PLS model achieved R2 = 0.626, RMSE = 2.572, and MAE = 2.105 for the training set, and R2 = 0.624, RMSE = 2.059, and MAE = 1.716 for the testing set. Notably, the performance of the PSO–LSSVM model was far inferior to that reported in previous studies for predicting total suspended matter (R2 > 0.95). This performance degradation may be attributed to the generally low Chl-a concentration in the study area. Under such low-concentration conditions, both LSSVM and PLS models face significant limitations. Support vector machine-based models struggle to capture weak spectral responses associated with low concentrations of Chl-a. Additionally, the optical characteristics of inland and coastal waters are often complex, with suspended solids and colored dissolved organic matter (CDOM) introducing strong spectral interference. Although LSSVM employs kernel functions for nonlinear feature mapping, its capacity to disentangle mixed spectral signals is limited. Similarly, while PLS mitigates multicollinearity through latent component extraction, it still faces challenges in effectively distinguishing the spectral contributions of target parameters from those of interfering substances.
3.5. Inversion Results of Chl-a
Figure 10 presents the spatial distribution of Chl-a concentration in the study area, retrieved from UAV hyperspectral imagery using the RF algorithm and visualized with the natural breaks classification method. The UAV images of the water body were mosaicked from multiple flight strips. Due to the limitations of the image stitching algorithm in the software developed by Hangzhou Colorspectrum Technology Co., Ltd., noticeable mosaic artifacts remain visible in the final composite image. To minimize the spatial mismatch between image pixels and in situ sampling points, a 9 × 9 spatial median filter was applied to each band independently. For each pixel, the median DN value was calculated over a 9 × 9 neighborhood of pixels in the same band. This processing effectively reduced surface reflection noise and improved the visual clarity of the retrieved Chl-a distribution. Additionally, some regular short-strip noise patterns are visible in the image. Their exact origin remains uncertain but is presumed to be related to either surface wave interference caused by strong winds or inherent sensor noise from the hyperspectral imaging system. Therefore, the retrieval results in this study primarily reflect the overall spatial distribution patterns of Chl-a concentration across the study area. The predicted value of individual pixels should not be interpreted as an accurate representation of local concentration due to residual uncertainties and potential image artifacts.
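The band-wise 9 × 9 median filter can be sketched with NumPy as follows. The cube layout (bands, rows, columns) and the reflect padding at the image borders are assumptions, since the border handling is not specified in the text; the synthetic cube is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter_per_band(cube, size=9):
    """Apply a size x size spatial median filter to each band of a (bands, rows, cols) cube."""
    pad = size // 2
    padded = np.pad(cube, ((0, 0), (pad, pad), (pad, pad)), mode="reflect")
    # windows has shape (bands, rows, cols, size, size); no mixing across bands.
    windows = sliding_window_view(padded, (size, size), axis=(1, 2))
    return np.median(windows, axis=(-2, -1))


# Synthetic cube: uniform water signal with one isolated bright pixel (e.g., a reflection).
cube = np.full((3, 20, 20), 5.0)
cube[1, 10, 10] = 1000.0
filtered = median_filter_per_band(cube)
```

An isolated outlier occupies only one of the 81 values in each window, so it does not affect the median, whereas features wider than about half the window are preserved.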
According to the retrieval results, the Chl-a concentration in the study area ranged from 5.83 μg/L to 16.01 μg/L, which lies within the range of the in situ measurements (2–19 μg/L). It is noteworthy that high Chl-a concentrations were mainly distributed in nearshore areas, where the retrieved values exhibited a systematic overestimation compared with field observations. Based on field survey data, this anomaly is likely related to the widespread presence of emergent vegetation along the shoreline. The roots and stems of emergent plants disturb bottom sediments, increasing the concentration of suspended particulates in the water and thereby enhancing scattering in the red-edge spectral region. Meanwhile, the decomposition of fallen plant material releases large amounts of dissolved organic matter, intensifying the spectral absorption effects of CDOM. The combined influence of suspended matter and CDOM alters the optical properties of nearshore waters, leading to a systematic overestimation of Chl-a concentrations by retrieval models constructed primarily from green and red-edge bands.
Overall, most Chl-a concentrations in the study area were around 10 μg/L, meeting the water quality standards for Longhu Reservoir as a drinking water source and indicating a generally good level of water cleanliness. However, a distinctly bounded anomalous patch appeared in the central area of the lake. In the retrieval map, this region is represented in green, indicating a Chl-a concentration markedly higher than that of the surrounding waters, with a few small blue patches embedded inside representing even higher concentrations. The central region of the lake is typically a weak hydrodynamic zone, characterized by low water flow velocity, limited vertical mixing, and poor water exchange capacity. Under such conditions, algal communities tend to accumulate locally, leading to a continuous increase in Chl-a concentration in this region. In addition, a nutrient accumulation and re-release effect may occur in the central lake area. On one hand, nitrogen, phosphorus, and other nutrients from surrounding terrestrial sources can be transported via surface runoff or groundwater toward the lake center, gradually accumulating under low-flow conditions. On the other hand, sediments at the lake bottom may release adsorbed nutrients under low-oxygen or reducing conditions, providing a sustained nutrient source for algal growth. The combined effects of these two mechanisms promote the local elevation of Chl-a concentration in the central region of the lake.
4. Discussion
The UAV–USV collaborative system, coupled with a machine learning model proposed in this study, enables the rapid and accurate retrieval of Chl-a concentration in a small-scale area. Although numerous studies have focused on using UAV-mounted hyperspectral sensors for retrieving inland water quality, the greater significance of this research lies in providing a feasible emergency monitoring pathway for the development of an integrated ground–aerial–space water monitoring system for inland waters (Figure 11).
Against the backdrop of increasingly frequent eutrophication and algal bloom events, establishing an emergency monitoring system with high timeliness and precision holds significant scientific and practical value. The integrated ground–aerial–space observation system achieves full spatiotemporal coverage, from macroscopic to microscopic scales and from routine monitoring to real-time response, through the multi-tiered coordination of satellites, UAVs, and ground-based monitoring [59]. Within this framework, satellite remote sensing undertakes routine and large-scale monitoring tasks, enabling long-term tracking of water quality trends at the watershed scale. Its data are easily accessible and provide essential background information for large-scale water environment management. Ground-based monitoring, on the other hand, provides reliable calibration references for satellite and aerial retrieval results through in situ sampling and water quality measurements, thereby enhancing the generalization capability of models under varying water conditions.
In contrast, UAV-based remote sensing serves as the critical intermediary in this system. It compensates for the limited temporal resolution and revisit frequency of satellites while extending the spatial representativeness of ground-based measurements, enabling flexible, sub-meter-level regional water quality monitoring. As a result, UAV remote sensing becomes the core component of the emergency monitoring framework. The UAV–USV collaborative system developed in this study represents a practical implementation of this integrated framework. It can rapidly establish high-precision water quality models for key water areas, providing immediate data support for sudden algal bloom events.
This emergency monitoring system offers three notable advantages. First, it enables minute-level synchronized data collection between UAVs and USVs. USVs are increasingly used in water quality monitoring due to their autonomy, safety, and flexibility. Coordinating USVs with UAVs allows monitoring of areas that are difficult to access, fragile, or protected. Moreover, Chl-a concentrations can fluctuate significantly on an hourly scale (with a coefficient of variation of 41.86% in this study). If sampling and remote sensing imaging are not synchronized, the temporal and spatial consistency of the retrieval model is directly affected. Traditional manual boat-based sampling is inefficient for large water bodies, and sampling times often differ from UAV data acquisition by several hours. By coordinating UAV and USV operations, this study not only improves sampling efficiency but also ensures that model training data are closely matched in time, thereby improving the reliability of the retrieval results.
Second, the sampling scheme combines high efficiency with emergency responsiveness. For small water bodies, UAV docking systems can provide routine monitoring. In emergencies, UAV monitoring focuses on selected key areas, replacing traditional full-area scanning of large water bodies. This approach significantly improves operational efficiency and reduces costs while still capturing dynamic changes in algal bloom risk zones. Building on this strategy, the study successfully monitored the anomalous spatial distribution of Chl-a concentration at the intake of the Jinjiang–Kinmen Water Supply Project, providing a scientific foundation for water quality management in Longhu Reservoir (Figure 10).
Third, the model construction and data processing workflow are highly time-efficient, supporting rapid model development and emergency applications. During a three-day favorable weather window (with actual sampling over two and a half days), high-frequency monitoring enabled the establishment of a hyperspectral retrieval model for Chl-a. The results can be directly applied to real emergency scenarios. Even when suitable observation days are limited, sufficient data can be obtained to reflect the spatial distribution of water quality parameters, providing critical support for rapid response to sudden water environment events. In this study, all hyperspectral image preprocessing, feature extraction, and modeling work can be completed on the same day as sampling. The stripe removal method based on median correction achieved good results, even without the use of white balance calibration (Figure 5). The lightweight machine learning model structure is suitable for small-sample datasets, avoiding the complex parameter tuning and overfitting risks typical of deep learning models. Training and thematic mapping can thus be completed rapidly with acceptable accuracy (test R² = 0.768), enabling a true “same-day sampling, same-day mapping” workflow. Furthermore, the constructed model can be directly reused in subsequent monitoring tasks, supporting rapid predictions and long-term applications. In summary, this study not only provides a feasible technical approach for the rapid retrieval of Chl-a concentration in water bodies but also offers a practical reference for developing an integrated ground–aerial emergency monitoring system.
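To make the idea of median-based stripe correction concrete, the following is a minimal sketch of one common variant: each image column is shifted so that its median matches the median of all column medians. This is an assumption about how such a correction could work, not the exact procedure used in this study; the function name `destripe_band` and the toy values are hypothetical.

```python
from statistics import median


def destripe_band(band):
    """Remove column-wise stripes from one band, given as a list of rows.

    Each column is offset so that its median equals the median of all
    column medians. Assumes stripes are additive and aligned with columns.
    """
    n_cols = len(band[0])
    col_medians = [median(row[c] for row in band) for c in range(n_cols)]
    target = median(col_medians)
    offsets = [m - target for m in col_medians]
    return [[row[c] - offsets[c] for c in range(n_cols)] for row in band]


# Toy 3x3 band in which the middle column carries an additive offset of 2.
striped = [
    [10.0, 12.0, 10.0],
    [11.0, 13.0, 11.0],
    [12.0, 14.0, 12.0],
]
clean = destripe_band(striped)
```

In this toy case the offset column is brought back in line with its neighbours while the row-to-row gradient is preserved. Real pushbroom data may require a multiplicative rather than additive correction, applied band by band.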
However, several limitations still exist in this framework. First, although hyperspectral imagery provides abundant spectral information, extracting effective spectral features remains a challenge. While band combinations can effectively capture water quality characteristics, their robustness is limited, and their accuracy varies among different water bodies, leading to inconsistencies across case studies. Traditional models driven by data iteration can identify highly correlated band combinations, yet they are essentially data-driven and lack physical interpretability. Moreover, common spectral preprocessing methods, such as min–max normalization and first-order differentiation, were not discussed in detail in this study. Future work could focus on discovering more discriminative spectral features or developing novel spectral indices to enhance model input quality.
Second, the availability and data quality of UAV-based hyperspectral imagery remain substantially constrained by environmental conditions and technical limitations. Meteorological factors constitute the primary obstacles to effective data acquisition: rainfall and strong winds can interrupt flight missions, while cloud shadows introduce significant noise into spectral data. Statistical analysis indicates that in 2024, only about half of the days in the study area met the basic flight requirements (no rainfall and wind speeds not exceeding Beaufort scale 3), and only two-thirds of these flyable days offered at least a three-day favorable weather window (Figure 12). This greatly reduces the temporal availability of UAV-based monitoring. In addition to weather constraints, the subsequent data processing workflow, particularly the image mosaicking stage, also presents substantial challenges. Only one of the UAV images acquired in this study yielded a mosaicking result of acceptable quality due to the limitations of the current mosaicking algorithms. Future research could focus on optimizing or even independently developing mosaicking algorithms to improve image registration and fusion accuracy. With improved mosaicking, multi-temporal high-frequency water quality monitoring (e.g., four observation phases within a single day) can be carried out in target areas. With the aid of high-spatial-resolution data, the spatiotemporal heterogeneity in water quality at the hourly scale can be analyzed more precisely.
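The flyable-day statistic discussed above can be sketched as follows. The sketch assumes one reading of the criteria: a day is flyable if it has zero rainfall and a wind speed of at most 5.4 m/s (the upper bound of Beaufort force 3), and a flyable day "offers a window" if it belongs to a run of at least three consecutive flyable days. Which wind measure is used (daily mean or maximum), the function names, and the sample values are all hypothetical.

```python
BEAUFORT_3_MAX_MS = 5.4  # upper wind-speed bound of Beaufort force 3, in m/s


def is_flyable(rain_mm, wind_ms):
    """A day is flyable if there is no rainfall and wind is at most Beaufort 3."""
    return rain_mm == 0 and wind_ms <= BEAUFORT_3_MAX_MS


def days_in_windows(flyable, min_len=3):
    """Flag flyable days that belong to a run of >= min_len consecutive flyable days."""
    in_window = [False] * len(flyable)
    run_start = None
    # A trailing False closes any run that reaches the end of the series.
    for i, ok in enumerate(list(flyable) + [False]):
        if ok and run_start is None:
            run_start = i
        elif not ok and run_start is not None:
            if i - run_start >= min_len:
                for j in range(run_start, i):
                    in_window[j] = True
            run_start = None
    return in_window


# Hypothetical daily records: (rainfall in mm, wind speed in m/s).
weather = [(0, 2.0), (0, 3.0), (0, 6.0), (0, 1.0), (0, 2.0), (0, 4.0), (5, 1.0)]
flyable = [is_flyable(rain, wind) for rain, wind in weather]
in_window = days_in_windows(flyable)
flyable_share = sum(flyable) / len(flyable)
window_share = sum(in_window) / sum(flyable)
```

Here `flyable_share` corresponds to the fraction of days meeting the basic flight requirements, and `window_share` to the fraction of flyable days that fall inside a window of at least three days.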
Furthermore, developing an integrated, end-to-end processing algorithm is of great significance. Automating the entire workflow—from raw UAV data input to final map generation—would greatly improve the efficiency and consistency of data processing. Finally, this study focuses solely on the retrieval modeling of Chl-a concentration. Future research could extend the proposed approach to other key water quality parameters, such as suspended solids, nitrogen, phosphorus, and colored dissolved organic matter. Validation across a wider range of water body types would further assess the applicability and robustness of the proposed technology.
5. Conclusions
This study established a UAV–USV collaborative framework, integrated with machine learning, to rapidly estimate Chl-a concentrations in a subtropical drinking-water reservoir. Over the course of a three-day intensive campaign, the system successfully collected 30 temporally synchronized UAV–USV sample pairs. Due to the significant short-term variability in Chl-a concentrations (CV = 40%), the imaging–sampling time deviations were controlled within 5 min to ensure high temporal consistency.
Preprocessing and feature extraction effectively enhanced the quality of the hyperspectral data. A two-stage band selection procedure identified key spectral combinations centered on the green and red-edge regions, which showed strong correlations with measured Chl-a concentrations. Among the four algorithms tested, the RF model demonstrated the best predictive performance and generalization capacity. It achieved an R² of 0.824 and RMSE of 1.656 μg/L for training, and an R² of 0.768 and RMSE of 2.111 μg/L for testing. The small performance gap between the training and testing datasets suggests stable behavior, even under limited sample conditions. When applied to UAV imagery, the RF model generated Chl-a concentration maps ranging from 5.83 to 16.01 μg/L, which closely aligned with in situ measurements (2–19 μg/L). The spatial patterns revealed the hydrodynamic features of the reservoir, with higher concentrations observed in weakly mixed central zones.
The integrated UAV–USV workflow demonstrated strong operational efficiency, enabling multiple daily observations. It also supported same-day sampling, modeling, and mapping, an essential capability for emergency water-quality assessment. These findings confirm the feasibility and effectiveness of using UAV hyperspectral data, synchronized USV sampling, and lightweight ensemble learning models for rapid Chl-a retrieval in small inland water bodies. Future research should focus on improving hyperspectral mosaicking, automating the processing pipeline, and extending the model to other water-quality parameters and broader spatiotemporal scales. This would enhance the robustness and scalability of the proposed monitoring framework.