Article

Prediction of Water Quality Parameters in the Paraopeba River Basin Using Remote Sensing Products and Machine Learning

by Rafael Luís Silva Dias 1,*, Ricardo Santos Silva Amorim 1, Demetrius David da Silva 1, Elpídio Inácio Fernandes-Filho 2, Gustavo Vieira Veloso 2 and Ronam Henrique Fonseca Macedo 3
1 Department of Agricultural Engineering, Universidade Federal de Viçosa, Viçosa 36570-900, MG, Brazil
2 Department of Soil and Plant Nutrition, Universidade Federal de Viçosa, Viçosa 36570-900, MG, Brazil
3 Department of Civil Engineering, Universidade Federal de Viçosa, Viçosa 36570-900, MG, Brazil
* Author to whom correspondence should be addressed.
Sensors 2026, 26(1), 18; https://doi.org/10.3390/s26010018
Submission received: 9 November 2025 / Revised: 12 December 2025 / Accepted: 16 December 2025 / Published: 19 December 2025
(This article belongs to the Section Sensing and Imaging)

Abstract

Monitoring surface water quality is essential for assessing water resources and identifying their quality patterns. Traditional monitoring methods, based on conventional point-sampling stations, are reliable but costly and limited in frequency and spatial coverage. These constraints hinder the ability to evaluate water quality parameters at the temporal and spatial scales required to detect the effects of extreme events on aquatic systems. Satellite imagery offers a viable complementary alternative to enhance the temporal and spatial monitoring scales of traditional assessment methods. However, limitations related to spectral, spatial, temporal, and/or radiometric resolution still pose significant challenges to prediction accuracy. This study aimed to propose a methodology for predicting optically active and inactive water quality parameters in lotic and lentic environments using remote-sensing data and machine-learning techniques. Three remote-sensing datasets were organized and evaluated: (i) data extracted from Sentinel-2 imagery; (ii) data obtained from raw PlanetScope (PS) imagery; and (iii) data from PS imagery normalized using the methodology developed by Dias. Data on water quality parameters were collected from 24 monitoring stations located along the Paraopeba River channel and the Três Marias Reservoir, covering the period from 2016 to 2023. Four machine-learning algorithms were applied to predict water quality parameters: Random Forest, k-Nearest Neighbors, Support Vector Machines with Radial Basis Function Kernel, and Cubist. Model performance was evaluated using four statistical metrics: root-mean-square error, mean absolute error, Lin′s concordance correlation coefficient, and the coefficient of determination. Models based on normalized PS data achieved the best performance in parameter estimation. Additionally, decision-tree-based algorithms showed superior generalization capability, outperforming the other models tested. 
The proposed methodology proved suitable for this type of analysis, confirming not only the applicability of PS data but also providing relevant insights for its use in diverse environmental-monitoring applications.

1. Introduction

Continental surface water bodies, such as rivers, lakes, reservoirs, and streams, are essential sources of potable water and support multiple uses, including recreation, industry, agriculture, energy generation, transportation, and fishing [1,2]. They also play a fundamental role in maintaining biodiversity and regulating hydrological flow [3]. Therefore, continuous monitoring of water quantity and quality is crucial for understanding natural and anthropogenic processes that influence aquatic systems and for supporting conservation actions [4].
Conventional monitoring of water quality, based on discrete point sampling, presents important limitations, especially during extreme events. Low sampling frequency and the limited spatial distribution of monitoring stations hinder the detection of spatiotemporal variability in water quality, particularly along lake and reservoir margins or following intense rainfall [5,6]. In such situations, continuous and spatially detailed monitoring becomes essential to complement traditional approaches and to more accurately assess the impacts of events such as dam failures and heavy precipitation.
Remote sensing techniques have emerged as effective alternatives for the continuous monitoring of optically active water quality parameters because of their broad spatial coverage, adequate temporal resolution, and favorable cost–benefit ratio. These techniques have been widely applied to estimate chlorophyll-a (Chla), turbidity (T), colored dissolved organic matter (CDOM), total suspended solids (TSS), and Secchi disk depth (SDD) [7,8,9,10].
In contrast, relatively few studies have focused on predicting non-optically active parameters such as phosphorus (P), nitrogen (N), chemical oxygen demand (COD), dissolved oxygen (DO), and iron (Fe) [11,12]. These parameters exhibit weak or no relationships with spectral bands, which limits their direct detection by orbital sensors [13,14,15]. Consequently, their prediction relies on indirect approaches that explore correlations with optically active parameters, environmental variables, and machine-learning algorithms [12,16,17].
Machine-learning algorithms have become increasingly prominent in water-quality modeling due to their ability to represent highly nonlinear relationships among environmental variables [18]. These methods have been applied across a variety of monitoring contexts, often outperforming traditional approaches for several water-quality parameters [19,20]. Recent studies, for example, demonstrate that backpropagation neural networks can accurately estimate indicators such as chemical oxygen demand (COD), permanganate index, total nitrogen (TN), and total phosphorus (TP) [21,22]. In addition to neural networks, techniques such as decision trees, support vector machines, random forests, and other supervised learning algorithms have shown broad applicability in aquatic systems [12,17,23,24].
These methodological limitations are compounded by constraints inherent to current orbital platforms such as Landsat, MODIS, and Sentinel-2 (S2), which have moderate-to-low spatial resolutions ranging from 10 to 1000 m. Furthermore, the temporal resolutions of Landsat and S2A/2B satellites are considered moderate—16 and 10 days, respectively [25,26]. As a result, their application for remote detection of water quality in small reservoirs, embayments of large reservoirs, and narrow rivers—where higher-frequency imagery is needed—is limited [27,28].
To overcome such limitations, Planet has deployed a large number of nanosatellites known as Doves. These Doves form a CubeSat 3U (10 × 10 × 30 cm) constellation capable of daily revisit to the same target on Earth′s surface. Since 2016, these satellites have provided imagery with 3.7 m spatial resolution and four spectral bands (B1—Blue: 420–530 nm; B2—Green: 500–590 nm; B3—Red: 610–700 nm; and B4—near-infrared [NIR]: 780–860 nm) [29]. They weigh approximately 4 kg, enabling much faster production and launch compared with traditional satellites. They also do not require a dedicated launch vehicle, as they can be delivered to orbit as secondary payloads [30].
Recognizing these methodological gaps in the literature, this study offers an innovative contribution by integrating the prediction of optically active and inactive water-quality parameters in both lotic and lentic environments, an aspect that remains underexplored, particularly in highly variable contexts such as post-disaster conditions. This approach is strengthened by the inclusion of systematic comparisons among different orbital products, including Sentinel-2, PlanetScope, and radiometrically normalized PlanetScope, which enables the assessment of model predictive performance and the relevance of the specific characteristics of each orbital dataset. Accordingly, the objective of this study is to propose a methodology for predicting optically active and inactive water-quality parameters in lotic and lentic environments using remote-sensing data and machine-learning techniques.

2. Materials and Methods

2.1. Study Setting

To apply and evaluate the proposed methodology, we selected an area located in the central region of the state of Minas Gerais, Brazil (Figure 1), encompassing the Paraopeba River Basin and the Três Marias Reservoir. According to IBGE [31], the Paraopeba River is one of the main tributaries of the São Francisco River, extending 510 km and supplying the Três Marias Reservoir after flowing through 48 municipalities in Minas Gerais.
The Paraopeba River Basin has an average altitude of 720 m and is characterized by predominantly strongly undulating to mountainous terrain [32]. The predominant soil classes in the basin are Red Latosols, Red-Yellow Latosols, Haplic Cambisols, and Humic Cambisols. According to Alvares et al. [33], the climate in the region is classified as tropical Aw, with two well-defined seasons: a rainy summer from October to March and a dry winter from April to September.
This basin was chosen due to its considerable diversity of soils, topography, vegetation cover, and the presence of continental water bodies. Additionally, on 25 January 2019, the Paraopeba River Basin was impacted by one of the most severe socio-environmental disasters in Brazil: the failure of the tailings dam (B-I) at the Córrego do Feijão mine, located in the city of Brumadinho. According to the Government of Minas Gerais [34], approximately 12 million m³ of tailings were released, of which an estimated 2 million m³ remained in the former B-I area; 7.8 million m³ were deposited along the Ferro-Carvão stream channel until its confluence with the Paraopeba River; and the remaining 2.2 million m³ reached the main Paraopeba channel.

2.2. Hydrological Data

For this study, we used water-quality parameter time-series data (2016–2023) from 24 monitoring stations distributed along the Paraopeba River channel and the Três Marias Reservoir, including station SF054, located downstream of the dam failure (Figure 1 and Table 1). These data included turbidity, TSS, Chla, P, N, COD, DO, and Fe.
Data were obtained from the institutional repository of the Instituto Mineiro de Gestão das Águas (IGAM) (http://repositorioigam.meioambiente.mg.gov.br (accessed on 15 December 2025)), the governmental agency responsible for water-resource monitoring in Minas Gerais. In the basic network, sampling campaigns were conducted quarterly until December 2018. However, after the dam collapse in January 2019, sampling at stations located along the Paraopeba River became monthly. In addition, monitoring records from stations located in the Três Marias Reservoir, under the responsibility of Companhia Energética de Minas Gerais (Cemig), were incorporated into the database.

2.3. Remote Sensing Data Acquisition and Processing

To predict water quality parameters, we selected images acquired by the multispectral instrument (MSI) onboard the S2A and S2B satellites as well as PlanetScope (PS) imagery from the three generations of Dove nanosatellites: Dove Classic, Dove-R, and Super Dove. Additionally, PS imagery was normalized according to the methodology proposed by Dias [36]. The following subsections describe the characteristics of each product and the processing applied.

2.3.1. Multispectral Instrument/Sentinel-2

In this study, we used 10 spectral bands from the MSI carried on the S2A and S2B platforms. According to Müller-Wilm [37], the multispectral bands are distributed across different electromagnetic ranges: three visible bands (B2–Blue [490 nm], B3–Green [560 nm], and B4–Red [665 nm]); one NIR band (B8–NIR [842 nm]); four red-edge bands (B5 [705 nm], B6 [740 nm], B7 [783 nm], and B8a [865 nm]); and two shortwave-infrared bands (B11–SWIR [1610 nm] and B12 [2190 nm]). The visible and NIR bands have 10 m spatial resolution, whereas the red-edge and shortwave-infrared bands have 20 m resolution.
S2 scenes were obtained from the Copernicus Open Access Hub (https://dataspace.copernicus.eu, accessed on 15 March 2024). We selected Level-1C top-of-atmosphere reflectance products with no cloud cover and with a maximum temporal difference of two days relative to the in situ sampling dates (2016 to 2023).
All preprocessing of S2A and S2B imagery was carried out by the authors and included the following steps: (i) band stacking for each acquisition date; (ii) mosaicking of scenes acquired on the same day; (iii) atmospheric correction using ACOLITE [38], which removes attenuation effects caused by molecular and aerosol scattering and by absorption from water vapor, ozone, oxygen, and carbon dioxide [39]; (iv) computation of spectral indices (Table 2); and (v) extraction and compilation of reflectance values at the monitoring stations.

2.3.2. PlanetScope Sensor

PS imagery is acquired by a constellation of nanosatellites with a CubeSat 3U form factor (10 × 10 × 30 cm), designed and operated by the private company Planet. At present, Planet operates more than 180 Dove nanosatellites that provide daily imagery of Earth's surface with high spatial resolution (about 3.7 m). Scenes are captured in four spectral bands: B1–Blue (420–530 nm), B2–Green (500–590 nm), B3–Red (610–700 nm), and B4–NIR (780–860 nm). According to Planet Team [61], PS images are delivered orthorectified in the Universal Transverse Mercator projection and geometrically corrected, with about 10 m positional accuracy.
PS scenes were downloaded via the application programming interface (API) available from https://www.planet.com/explorer (accessed on 23 August 2024). We selected cloud-free scenes from the three available sensor generations (Dove Classic, Dove-R, and Super Dove) with acquisition dates coinciding with the water-sampling days between 2016 and 2023. Although PS images are not open access, they can be obtained at no cost through university affiliation by enrolling in the Planet Education and Research program (https://go.planet.com/research, accessed on 1 May 2025).
Preprocessing steps for PS imagery included (i) mosaicking acquisition strips; (ii) calculating spectral indices (Table 3); and (iii) extracting and tabulating data for the water-quality monitoring points.

2.3.3. Normalized PlanetScope Sensor

For the normalized PS dataset, the same PlanetScope sensor data presented in the previous section were used. However, the PS imagery was normalized following the methodology proposed by Dias [36], who developed a procedure to correct radiometric inconsistencies in the PlanetScope constellation′s temporal image series using machine-learning models calibrated with synchronous samples of pseudo-invariant pixels extracted from paired PlanetScope and Sentinel-2 scenes. As a result, the normalized dataset exhibited more stable and comparable temporal series.

2.3.4. Climate Hazards Group Infrared Precipitation with Stations Data

In addition to PS and S2 imagery, we used daily precipitation estimates from the Climate Hazards Group Infrared Precipitation with Stations (CHIRPS) dataset [63]. CHIRPS is a reanalysis product that combines rain-gauge observations with satellite-derived precipitation estimates. It provides global daily data since 1981 at about 0.05° spatial resolution (about 5 km) [64].
Data preparation involved: first, delineating the upstream contributing area for each selected station; second, computing the mean of pixel values within that area; and third, downloading precipitation for the sampling day plus the preceding 14 days (a 15-day window). Accumulated precipitation was then computed in 24 h steps up to 360 h. These accumulated values were added to the database as 15 independent variables for predicting water quality parameters. All image and data preprocessing steps were performed in the R programming language [65].
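The accumulation step can be illustrated with a short sketch. The study's processing was done in R; the Python below is only an illustration. The function name `accumulated_precipitation` and the ordering of the input (sampling day first, then progressively earlier days) are assumptions made for this example.

```python
def accumulated_precipitation(daily_mm):
    """Turn 15 daily basin-mean totals into 15 accumulated predictors.

    daily_mm is assumed to be ordered from the sampling day (index 0)
    back to 14 days earlier. Keys of the result are window lengths in hours.
    """
    if len(daily_mm) != 15:
        raise ValueError("expected 15 daily values (sampling day + 14 prior days)")
    totals = {}
    running = 0.0
    for day, value in enumerate(daily_mm, start=1):
        running += value              # add one more 24 h step
        totals[24 * day] = running    # 24 h, 48 h, ..., 360 h
    return totals


# Hypothetical daily means (mm) for one station's contributing area.
example = [0.0, 5.0, 12.5, 0.0, 3.0] + [1.0] * 10
acc = accumulated_precipitation(example)
```

Each key of `acc` corresponds to one of the 15 precipitation covariates, from the 24 h total to the 360 h total.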

2.3.5. Acquisition of Reflectance Values

After image preprocessing, we refined and extracted reflectance samples. Surface-water motion can produce direct (sunglint) and diffuse (skyglint) reflection of solar radiation, which markedly affects the spectral response of samples and can overestimate reflectance [66,67]. To reduce this effect in the modeling, pixels with reflectance > 0.6 were assigned NoData values [10].
Reflectance extraction considered two scenarios: (i) For S2 imagery, single-pixel values were extracted. To avoid shoreline interference in water pixels, we selected only stations located on river reaches with a minimum channel width of 30 m (equivalent to three S2 pixels). (ii) For PS imagery, a 3 × 3-pixel window was used, and the mean reflectance over that window was extracted.
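A minimal sketch of the two extraction scenarios follows, written in Python for illustration only. The helper names are hypothetical, and one detail is an assumption: masked (NoData) pixels are skipped when averaging the PS window, which the text does not state explicitly.

```python
GLINT_THRESHOLD = 0.6  # reflectance above this is treated as glint (NoData)


def mask_glint(value, threshold=GLINT_THRESHOLD):
    """Return None (NoData) for missing or glint-contaminated reflectance."""
    if value is None or value > threshold:
        return None
    return value


def window_mean(band, row, col, half=1):
    """Mean of valid pixels in a (2*half+1) x (2*half+1) window.

    Assumption: masked pixels are skipped rather than invalidating the window.
    """
    valid = []
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            if 0 <= r < len(band) and 0 <= c < len(band[0]):
                v = mask_glint(band[r][c])
                if v is not None:
                    valid.append(v)
    return sum(valid) / len(valid) if valid else None


# Toy 3 x 3 reflectance patch with one glint-affected pixel at the center.
band = [
    [0.05, 0.06, 0.07],
    [0.05, 0.90, 0.06],
    [0.04, 0.05, 0.06],
]
ps_value = window_mean(band, 1, 1)   # PS scenario: 3 x 3 window mean
s2_value = mask_glint(band[1][1])    # S2 scenario: single pixel
```

In this toy patch the single-pixel extraction yields NoData, while the window mean is computed from the eight remaining pixels.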

2.4. Modeling of Water-Quality Parameters Using Machine-Learning Methods

To identify the remote-sensing product most suitable for predicting water quality parameters, modeling was performed using three datasets: (i) S2 imagery; (ii) PS imagery; and (iii) normalized PS imagery, following the methodology proposed by Dias [36].
The structured database was stratified and then randomly split into two subsets: 75% for training and 25% for testing. Stratification ensured sample representativeness according to two main criteria: (i) temporal proportion—training and test sets preserved the proportion of data before and after the dam failure; and (ii) climatic distribution—data were proportionally divided between wet and dry seasons, preserving the same 75/25 proportion within each season, with randomization applied only within each stratum.
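The stratified split described above can be sketched as follows. This is an illustrative Python version, not the authors' R code; the field names `period` and `season`, the function name, and the rounding rule for stratum sizes are assumptions.

```python
import random
from collections import defaultdict


def stratified_split(records, keys=("period", "season"), train_frac=0.75, seed=0):
    """Split records 75/25 inside each stratum defined by `keys`."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[tuple(rec[k] for k in keys)].append(rec)

    train, test = [], []
    for members in strata.values():
        members = members[:]            # copy so the input is not reordered
        rng.shuffle(members)            # randomize only within the stratum
        cut = round(train_frac * len(members))
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


# Toy database: 8 samples for each combination of period and season.
combos = [(p, s) for p in ("pre", "post") for s in ("wet", "dry") for _ in range(8)]
records = [{"id": i, "period": p, "season": s} for i, (p, s) in enumerate(combos)]
train, test = stratified_split(records)
```

Because shuffling happens inside each stratum, both subsets keep the same proportions of pre/post-failure and wet/dry samples as the full database.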
Model performance metrics were computed as averages over 100 repetitions for both training and test sets. This procedure is effective for assessing algorithm performance and helps identify problematic samples or outliers in the datasets [10,68,69].
Figure 2 presents a flowchart of the three steps used in the implemented modeling: (i) selecting the optimal set of covariates for each algorithm by removing highly correlated variables and those with lower relevance for training; (ii) training models using the selected variables for each algorithm; and (iii) evaluating model performance on a dataset distinct from that used for training.

2.4.1. Covariate Selection

Covariate selection is a modeling step that aims to identify the smallest subset of original covariates capable of representing the modeled phenomenon/process while minimizing redundancy. It is used to reduce feature-space dimensionality, remove noisy covariates, and increase model parsimony [10,70,71].
First, covariate variance was assessed, and variables with zero or very low variability were removed based on the criteria defined by [72]. This assessment was performed with the nearZeroVar function from the caret package [73,74].
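As a rough illustration of this kind of filter, the Python sketch below flags a covariate using a frequency ratio and a percentage of unique values. The thresholds (95/5 and 10%) are what we understand to be the defaults of nearZeroVar and should be treated as assumptions; the function name is hypothetical and the code is not the caret implementation.

```python
from collections import Counter


def near_zero_variance(values, freq_cut=95 / 5, unique_cut=10.0):
    """Flag a covariate with zero or near-zero variance.

    freq_ratio: count of the most common value / count of the second most common.
    percent_unique: distinct values as a percentage of the sample size.
    """
    counts = Counter(values).most_common()
    if len(counts) == 1:
        return True                      # a single distinct value: zero variance
    freq_ratio = counts[0][1] / counts[1][1]
    percent_unique = 100.0 * len(counts) / len(values)
    return freq_ratio > freq_cut and percent_unique < unique_cut


flag_like = [0] * 99 + [1]               # dominated by one value
reflectance = [i / 100 for i in range(100)]   # all values distinct
```

Here `near_zero_variance(flag_like)` is true (ratio 99, 2% unique), whereas a continuous reflectance covariate is retained.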
Next, Spearman's correlation coefficient was computed [75]. For pairs of covariates with correlation ≥ 0.95, the variable with the largest absolute correlation with the remaining variables was removed.
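The correlation filter can be sketched in plain Python as below. This is an illustration under stated assumptions: "largest absolute correlation with the remaining variables" is interpreted as the largest mean absolute Spearman correlation, and the function and column names are hypothetical.

```python
def ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def drop_correlated(columns, cutoff=0.95):
    """Iteratively drop one member of each pair with |rho| >= cutoff."""
    names = list(columns)
    corr = {a: {b: abs(spearman(columns[a], columns[b])) for b in names} for a in names}
    kept = names[:]

    def mean_corr(v):
        others = [o for o in kept if o != v]
        return sum(corr[v][o] for o in others) / len(others)

    while True:
        pairs = [(a, b) for i, a in enumerate(kept) for b in kept[i + 1:]
                 if corr[a][b] >= cutoff]
        if not pairs:
            return kept
        a, b = max(pairs, key=lambda p: corr[p[0]][p[1]])
        kept.remove(a if mean_corr(a) >= mean_corr(b) else b)


columns = {
    "red": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "red_like": [1, 2, 3, 4, 5, 6, 7, 8, 10, 9],   # nearly the same ordering as "red"
    "nir": [3, 1, 6, 2, 5, 4, 9, 7, 10, 8],
}
kept = drop_correlated(columns)
```

In this toy example "red" and "red_like" exceed the cutoff, and "red_like" is dropped because it is, on average, more correlated with the remaining covariates.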
Finally, the importance-based removal of covariates was performed through the recursive feature elimination (RFE) procedure implemented in the caret package, which discards variables that contribute least to the model according to the algorithm-specific importance measures [10,76,77,78,79,80]. The division into training and testing sets was performed prior to the application of the recursive feature elimination (RFE) procedure.
After applying RFE, the optimal covariate set was obtained for each algorithm and used in the subsequent modeling steps. For modeling water quality parameters, the predictors comprised individual bands, band ratios, spectral indices (Table 2 and Table 3), precipitation data (accumulated from 2 to 15 days prior to sampling), and image-acquisition period (before/after the dam failure and hydrological season).

2.4.2. Selection of Machine-Learning Models

To predict the concentration of water quality parameters, we employed the following algorithms: random forest (RF) [81], support vector machines with a radial basis function kernel (SVM-RBF) [82], kernel k-nearest neighbors (KKNN) [83], and cubist [84]. These algorithms were chosen because they represent distinct families of modeling approaches, providing a comprehensive assessment of the relationships within the data. This set includes: (i) tree-based ensemble methods (RF), (ii) kernel-based methods capable of capturing complex nonlinear relationships (SVM-RBF), (iii) instance-based learning algorithms (KKNN), and (iv) hybrid rule-based models that combine decision trees with linear regression components (Cubist). Such methodological diversity enables the exploration of different response patterns and follows established recommendations in the literature for environmental and limnological modeling, ensuring robustness and comparability across approaches [74,85].
RF and SVM-RBF are widely used to predict water quality parameters [2,86,87]. RF builds an ensemble of N regression trees, and the final prediction is the average over all trees. As a tree-based approach, RF is a nonparametric algorithm [81,88].
SVM-RBF allows predictions with a tolerable error controlled by the support vectors and governed by the hyperparameter C (cost) [82]. Like RF, SVM-RBF is nonparametric and becomes a nonlinear regression method by using a nonlinear kernel function [89].
KKNN is also kernel-based and identifies the k training points closest to a new sample using a distance metric such as Minkowski distance, a general form of Euclidean and Manhattan distances [90]. This nonparametric model assigns distance-weighted contributions so that nearer neighbors receive larger weights, avoiding explicit assumptions about underlying data distributions [91,92,93].
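The idea of distance-weighted neighbors can be illustrated with a small Python sketch. Note the simplification: KKNN applies kernel functions to distances, whereas this example swaps in plain inverse-distance weights; function names and data are hypothetical.

```python
def minkowski(a, b, p=2):
    """Minkowski distance; p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def weighted_knn_predict(train_x, train_y, query, k=3, p=2):
    """Predict as the inverse-distance-weighted mean of the k nearest targets."""
    dists = sorted((minkowski(x, query, p), y) for x, y in zip(train_x, train_y))
    nearest = dists[:k]
    for d, y in nearest:
        if d == 0:
            return y                      # exact match: return its target directly
    weights = [1 / d for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


# Toy one-covariate example (e.g., a band ratio against turbidity).
train_x = [[0.0], [1.0], [3.0], [10.0]]
train_y = [10.0, 20.0, 40.0, 100.0]
pred = weighted_knn_predict(train_x, train_y, [2.0], k=2)
```

With `k=2` the two nearest samples are equidistant from the query, so they contribute equally and the far sample at 10.0 has no influence.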
Cubist is a tree-based regression algorithm that combines a decision-tree structure with linear models fit within each terminal region [84]. It builds a set of decision rules to partition the attribute space and then applies local linear regression within each region, yielding a flexible and interpretable representation of relationships in data [94,95,96].
A more detailed description of these algorithms can be found in Kuhn and Johnson [85] and Murphy [74].
During training, each model's internal hyperparameters were tuned using repeated cross-validation with 10 folds and 10 repetitions, applied to each algorithm's tuning grid and testing 5 values of each hyperparameter (tuneLength). Hyperparameters are algorithm-specific configuration options that influence model behavior and predictive accuracy. Each learning method relies on its own set of hyperparameters, and in this work we optimized the following: committees and neighbors for Cubist; kmax, distance, and kernel for KKNN; mtry for RF; and sigma and C for SVM-RBF.
The tuning process was carried out automatically through the train function in the caret package [73]. This function performs a structured exploration of the user-defined hyperparameter space. When minimum and maximum values are provided for each parameter, train constructs an evenly spaced grid (typically composed of five candidate values per hyperparameter) covering the specified range. The algorithm is then fitted for every combination in this grid and assessed using the selected resampling strategy (e.g., cross-validation). The configuration that maximizes the chosen performance metric is retained as the optimal set of hyperparameters.
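To make the size of this search concrete, the sketch below builds an evenly spaced grid and the resampling splits in Python. It mirrors the description above rather than caret's internals; the parameter ranges, function names, and fold-assignment scheme are assumptions for illustration.

```python
import random
from itertools import product


def even_grid(low, high, length=5):
    """`length` evenly spaced candidate values between low and high."""
    step = (high - low) / (length - 1)
    return [low + i * step for i in range(length)]


def tuning_grid(ranges, length=5):
    """All combinations of candidate values, one dict per configuration."""
    names = list(ranges)
    axes = [even_grid(*ranges[name], length) for name in names]
    return [dict(zip(names, combo)) for combo in product(*axes)]


def repeated_kfold(n, folds=10, repeats=10, seed=0):
    """Yield (repeat, fold, held-out indices) for repeated k-fold CV."""
    rng = random.Random(seed)
    for rep in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)                  # new random partition per repetition
        for fold in range(folds):
            yield rep, fold, idx[fold::folds]


# Hypothetical search ranges for the two SVM-RBF hyperparameters.
grid = tuning_grid({"sigma": (0.01, 0.05), "C": (1.0, 5.0)})
splits = list(repeated_kfold(50))
n_fits = len(grid) * len(splits)
```

With two hyperparameters and five values each, 25 configurations are evaluated on 100 resamples, i.e., 2500 model fits per tuning run in this toy setting.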
In this study, hyperparameter selection was guided by Lin's concordance correlation coefficient (CCC), which served as the optimization criterion. Initial values and search ranges followed the caret developers' recommendations (see the manual, Chapter “Available Models”: https://topepo.github.io/caret/available-models.html (accessed on 15 January 2025)). Final optimized hyperparameters are shown in Table 4.
The processes of importance-based variable removal (RFE), model training, and performance evaluation were repeated 100 times. This repeated-resampling strategy enables assessing the ability of the algorithms to handle varying training subsets and to produce robust predictive results [97,98]. Model performance metrics for both training and testing sets were then computed as the mean values across the 100 repetitions. This approach enhances the reliability of performance estimates and facilitates the identification of potentially problematic observations or outliers within the datasets [10,68,69].

2.4.3. Model Evaluation

To evaluate model performance, predictions were compared with observations from the water quality monitoring stations in the study area (Table 1) using the following statistical metrics: root-mean-square error (RMSE; Equation (1)), mean absolute error (MAE; Equation (2)), Lin′s concordance correlation coefficient (CCC; Equation (3)), and the coefficient of determination (R2; Equation (4)). This set of metrics was chosen to capture complementary facets of performance [99,100,101].
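$$\mathrm{RMSE} = \left[\frac{1}{n}\sum_{i=1}^{n}\left(P_i - O_i\right)^{2}\right]^{1/2} \tag{1}$$
$$\mathrm{MAE} = \frac{\sum_{i=1}^{n}\left|P_i - O_i\right|}{n} \tag{2}$$
$$\mathrm{CCC} = \frac{2\,r\,\sqrt{V_r}\,\sqrt{V_R}}{\left(\bar{P} - \bar{O}\right)^{2} + V_r + V_R} \tag{3}$$
$$R^{2} = \frac{\sum_{i=1}^{n}\left(P_i - \bar{O}\right)^{2}}{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)^{2}} \tag{4}$$
where $P_i$ are model-predicted values; $O_i$ are observed values; $\bar{O}$ is the mean of observed values; $\bar{P}$ is the mean of predicted values; $r$ is the Pearson correlation coefficient between predicted and observed values; $V_r$ and $V_R$ are the variances of predicted and observed values, respectively; and $n$ is the number of observation pairs.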
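For readers who want to check the metrics numerically, a minimal Python sketch of Equations (1)–(4) follows. It is an illustration, not the authors' code: population variances (dividing by n) are assumed, and CCC is computed through the identity that 2·r times the product of the two standard deviations equals twice the covariance. The R2 function follows Equation (4) as printed.

```python
def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Population variance (divides by n); an assumption of this sketch."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def rmse(pred, obs):
    return (sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)) ** 0.5


def mae(pred, obs):
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def ccc(pred, obs):
    """Lin's concordance correlation coefficient."""
    mp, mo = mean(pred), mean(obs)
    cov = sum((p - mp) * (o - mo) for p, o in zip(pred, obs)) / len(obs)
    return 2 * cov / ((mp - mo) ** 2 + variance(pred) + variance(obs))


def r2(pred, obs):
    """Coefficient of determination in the form of Equation (4)."""
    mo = mean(obs)
    return sum((p - mo) ** 2 for p in pred) / sum((o - mo) ** 2 for o in obs)


obs = [1.0, 2.0, 3.0, 4.0]
perfect = [1.0, 2.0, 3.0, 4.0]
biased = [2.0, 3.0, 4.0, 5.0]     # perfectly correlated but shifted by +1
```

The `biased` case shows why CCC complements correlation-type measures: the predictions are perfectly correlated with the observations, yet the constant offset lowers CCC to 5/7.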
RMSE squares the difference between predicted and observed values, penalizing large errors more than small ones, and is therefore sensitive to outliers [102]. MAE measures the average magnitude of errors using the absolute difference [103]. CCC quantifies the proximity of the fitted relationship to the 45-degree identity line [101]. R2 represents the proportion of variance explained by the model [104]. Because RMSE and MAE share the variable′s units, they facilitate error interpretation. Models with lower RMSE and MAE were considered more accurate [105]. Following Altman [106], CCC and R2 can be interpreted as moderate (0.5–0.7), strong (0.7–0.9), and very strong (>0.9).
In addition to these metrics, RMSE (Equation (5)) and MAE (Equation (6)) were also computed for a null model. The null model predicts each parameter using the training-set mean, returning a single average for numeric outcomes. Models performing similarly to or worse than the null model are poorly rated. Model selection for a given parameter should favor cases in which RMSE and MAE are lower than those of the null model, indicating gains from the machine-learning approach [69].
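$$\mathrm{NULL\_RMSE} = \left[\frac{1}{n}\sum_{i=1}^{n}\left(\bar{O}_{mt} - O_{iT}\right)^{2}\right]^{1/2} \tag{5}$$
$$\mathrm{NULL\_MAE} = \frac{\sum_{i=1}^{n}\left|\bar{O}_{mt} - O_{iT}\right|}{n} \tag{6}$$
where $\bar{O}_{mt}$ is the mean of the training samples, $O_{iT}$ are the test-set observations, and $n$ is the number of test samples (loop size).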
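The null-model baseline can be sketched in a few lines of Python. This is an illustration only; the function names and the toy values are hypothetical.

```python
def null_model_errors(train_obs, test_obs):
    """RMSE and MAE obtained by predicting the training mean for every test sample."""
    baseline = sum(train_obs) / len(train_obs)
    n = len(test_obs)
    null_rmse = (sum((baseline - o) ** 2 for o in test_obs) / n) ** 0.5
    null_mae = sum(abs(baseline - o) for o in test_obs) / n
    return null_rmse, null_mae


def beats_null(model_rmse, model_mae, null_rmse, null_mae):
    """True only when both errors are lower than those of the null model."""
    return model_rmse < null_rmse and model_mae < null_mae


# Toy turbidity-like values: training mean is 4.0.
null_rmse, null_mae = null_model_errors([2.0, 4.0, 6.0], [3.0, 5.0, 7.0])
```

A candidate model would be retained for a parameter only if `beats_null` returns true for its test-set errors.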

3. Results and Discussion

Table 5 presents the descriptive statistics of the water quality parameters used in this study, revealing large gaps between minimum and maximum values—i.e., a wide range in the observed measurements. In addition, the parameter means are generally closer to the minima, indicating right-skewed distributions with long upper tails. Such skewness is common in environmental datasets, where measurements cluster at low to moderate levels but rare extreme events stretch the upper tail [105].
Figure 3 ranks the most important covariates for the models used to predict water quality parameters across the three datasets (S2, PS, and normalized PS). Covariate-selection procedures substantially reduced the number of predictors, from an initial 151 to ten variables per model—an adequate and parsimonious set for modeling. These findings agree with Muñoz-Romero et al. [70] and Stevens et al. [76], who showed that reducing model complexity lowers computational costs and improves robustness and predictive performance.
Except for the NIR bands (B8 for S2 and B4 for PS), indices and band ratios dominate as the most important covariates compared with individual bands. This corroborates Sestini [107] and Lillesand, Kiefer, and Chipman [108], who showed that combining spectral bands via indices and ratios enhances discrimination of subtle spectral differences among targets, whereas individual bands tend to capture only more evident variations—making ratio-based indices more effective for identifying specific spectral features of natural objects or phenomena.
The NIR bands (B8 in Sentinel-2 and B4 in PlanetScope) stand out as the most influential predictors. This result is expected, since NIR reflectance responds strongly to increases in suspended particles, directly influencing the prediction of turbidity, TSS, and other optically active parameters. Even for optically inactive parameters, the NIR band provides indirect information because many chemical components are correlated with sedimentary and hydrodynamic processes, particularly in a post-disaster context where sediment mobilization is intensified.
Spectral indices and band ratios such as GLI, Iron, and NDTI also exhibit high importance. Their superior performance stems from their ability to highlight subtle variations in the water′s spectral response while reducing interference associated with illumination, solar geometry, and atmospheric variability. The Iron index, in particular, consistently appears among the most relevant predictors, reflecting the presence of mineral-rich particulate material that characterizes much of the sediment dynamics in the basin after the disaster. These indices provide a more stable and discriminative spectral signal than individual bands, contributing strongly to the prediction of optically active parameters and, indirectly, to optically inactive ones.
Precipitation-derived variables from the CHIRPS product also appear consistently among the ten most important predictors across all sensors. This behavior reflects the direct relationship between accumulated rainfall, increased surface runoff, sediment transport, and nutrient loading. Precipitation further exerts strong influence on sediment resuspension, especially in lotic environments, altering the optical properties of the water column and, consequently, the spectral response captured by the sensors. In reservoirs, these effects are more attenuated due to longer residence times and lower hydrodynamic energy, which explains the differences observed in model performance between lotic and lentic systems.
In addition to these direct effects on optically active parameters, precipitation also contributes to the prediction of optically inactive parameters through indirect relationships. Rainfall events intensify hydrological processes that mobilize organic matter, nutrients, and sediments, thereby altering optical variables such as turbidity, TSS, and indices sensitive to particulate material. Although these inactive parameters do not exhibit distinct spectral signatures, their variations are associated with these processes, enabling machine-learning models to estimate them indirectly.
Table 6 reports, for MSI/S2 data, the performance metrics for the machine-learning models used to predict water quality parameters in the Paraopeba River Basin.
The results demonstrate the superior robustness of tree-based algorithms, particularly RF and Cubist, when compared with KKNN and SVM-RBF. RF achieved the highest performance for five of the eight parameters, while Cubist ranked within the top two for six parameters. Both models produced the lowest prediction errors (RMSE and MAE) and the highest R2 and CCC values, reinforcing the ability of tree-based methods to represent nonlinear relationships and capture multiscale interactions among environmental and hydrological covariates [81]. These findings align with previous studies that emphasize the adaptability of ensemble-based approaches under conditions of high optical and hydrological heterogeneity [104,109,110].
At the parameter level, Turbidity and TSS exhibited the strongest generalization capacity, with CCC values close to 0.82 and 0.72 and R2 values ranging from 0.75 to 0.59, accompanied by low RMSE and MAE. Both variables are optically active and strongly governed by suspended-sediment dynamics, which enhances their detectability across multisensor imagery. In contrast, Fe, P, DO, COD, and N showed limited predictive performance (CCC ≈ 0.44–0.31; R2 ≈ 0.27–0.15), reflecting their weak or indirect spectral signatures and their sensitivity to short-term hydrological fluctuations. For Chla, all algorithms performed poorly; even the best model (SVM-RBF; CCC = 0.12; R2 = 0.05) produced a test-set RMSE higher than the null model.
These results are consistent with the well-known physical–optical limitations of these parameters. DO, Nitrogen, and COD are not optically active and therefore do not exhibit direct spectral signatures detectable by orbital sensors. Their estimation relies on indirect relationships with covariates, which naturally limits model accuracy [67]. In the case of Chla, although characteristic absorption bands exist, its detection in rivers is strongly hindered by low pigment concentrations, high turbidity, and spectral overlap with TSS and CDOM [111,112,113]. These conditions are particularly relevant in the study area, where turbidity remains elevated due to the Brumadinho dam failure, reducing the effective optical depth and weakening the Chla signal.
Overall, there was no evidence of overfitting, as training and test results were concordant. Except for Chla, all parameters showed gains over the null model in RMSE and MAE: RMSE improvements ranged from 47.38% (T) to 6.75% (N); MAE improvements ranged from 61.12% (T) to 7.85% (N). For Chla, no advantage over the null model was observed for RMSE; however, MAE improved by 8.21%.
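The percentage gains over the null model quoted above can be reproduced with a calculation of the following form. This is a sketch under the assumption that the null model predicts the mean of the training observations for every test sample; the function names and the numbers are illustrative only.

```python
import math
from statistics import mean


def rmse(obs, pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(obs, pred)))


def mae(obs, pred):
    """Mean absolute error between observed and predicted values."""
    return mean(abs(o - p) for o, p in zip(obs, pred))


def gain_over_null(obs_train, obs_test, pred_test):
    """Percentage reduction in RMSE and MAE relative to a mean-only null model.

    Positive values mean the model beats the null model; negative values mean
    it does worse.
    """
    null_pred = [mean(obs_train)] * len(obs_test)
    null_rmse, null_mae = rmse(obs_test, null_pred), mae(obs_test, null_pred)
    return {
        "rmse_gain_pct": 100.0 * (null_rmse - rmse(obs_test, pred_test)) / null_rmse,
        "mae_gain_pct": 100.0 * (null_mae - mae(obs_test, pred_test)) / null_mae,
    }


# Hypothetical values; not data from the study.
result = gain_over_null(obs_train=[10, 20, 30], obs_test=[10, 30], pred_test=[12, 27])
print(result)
```

Because RMSE penalizes large errors more heavily than MAE, a model can show a small MAE gain while still failing to beat the null model in RMSE, which is the pattern reported for Chla.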
Table 7 reports, for PS data, the performance metrics for the machine-learning models used to predict water quality parameters in the Paraopeba River Basin.
Table 7 indicates a clear dominance of tree-based models, with Cubist and RF consistently ranking among the top two performers for all eight parameters derived from PS data. For Turbidity, TSS, Fe, and P, CCC values ranged from 0.878 to 0.513 and R2 values from 0.796 to 0.337, accompanied by low RMSE and MAE. The close agreement between CCC and R2 further reinforces the internal consistency and robustness of the modeling framework [36].
The comparison of RMSE and MAE across training and test sets shows only minor discrepancies, suggesting a low risk of overfitting. As reported in Table 7, RMSE values improved by 52.65% to 11.82% relative to the null model, while MAE improved by 66.04% to 13.82%. As with the MSI/S2 results, Chla was the only parameter for which the model did not outperform the null model in RMSE, with an RMSE performance 9.85% below that of the mean-based null model and a marginal MAE improvement of 7.28%. This reinforces the known difficulty of retrieving Chla from PS imagery in highly turbid and optically complex environments.
Table 8 reports, for normalized PS data, the performance metrics used to evaluate the machine-learning models applied to predicting water quality parameters in the Paraopeba River Basin. RF and Cubist ranked among the two best models for most parameters; the exceptions were Fe, P, and Chla, for which these algorithms were outperformed by SVM-RBF and KKNN.
Analyzing model performance by parameter, the models for turbidity, TSS, Fe, P, and DO showed good generalization, with CCC values between 0.918 and 0.553. Corresponding R2 values ranged from 0.848 to 0.39. The strong agreement between these two indices is an important indicator of the robustness of the applied methodology.
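To make the relationship between the two indices concrete, the sketch below computes Lin's concordance correlation coefficient alongside R2. It assumes that R2 is taken as the squared Pearson correlation between observed and predicted values; if R2 is defined differently in the modeling workflow, the comparison changes accordingly. The data are illustrative.

```python
from statistics import mean


def _moments(x, y):
    """Means, population variances, and population covariance of two series."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, sxx, syy, sxy


def lin_ccc(obs, pred):
    """Lin's CCC: 2*s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2)."""
    mx, my, sxx, syy, sxy = _moments(obs, pred)
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)


def r_squared(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    _, _, sxx, syy, sxy = _moments(obs, pred)
    return sxy ** 2 / (sxx * syy)


# Hypothetical example: predictions perfectly correlated with the observations
# but shifted upward by one unit.
obs = [1, 2, 3, 4, 5]
pred = [2, 3, 4, 5, 6]
print(lin_ccc(obs, pred), r_squared(obs, pred))
```

A constant bias leaves the correlation untouched but lowers CCC, because CCC measures departure from the 1:1 line. When the two indices move together, as reported here, systematic bias and scale errors are likely small.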
The RMSE and MAE values were low and in good agreement. Relative to the RMSE and MAE of the null model, all evaluated parameters showed real improvements, with RMSE gains ranging from 59.95% to 13.98% and MAE gains ranging from 68.77% to 15.04%.
Overall, with the exception of the Chla models derived from S2 and PS datasets, all developed models (Table 6, Table 7 and Table 8) achieved MAE and RMSE values lower than those of the null model, which constitutes the minimum statistical benchmark for acceptable predictive skill [69,85]. This systematic reduction in error metrics indicates that the proposed modeling framework provides a demonstrably superior predictive capability compared with the use of simple mean-based estimates.
The Chla parameter presented the lowest CCC and R2 values, indicating significant difficulty for the algorithms to generalize across the three analyzed datasets. From an optical perspective, Chla is often affected by the presence of other Optically Active Components (OACs), such as TSS and colored dissolved organic matter (CDOM) [111,112,113]. In this context, it is noteworthy that the study area was impacted by a mining tailings dam failure, which led to high TSS levels in the datasets used for modeling. This significantly contributed to the poor performance of the machine learning models in predicting the Chla parameter.
Additionally, the characteristics of the predominant soils in the region, classified by Embrapa [114] as Red Latosols, Red-Yellow Latosols, Haplic Cambisols, and Humic Cambisols, directly influence water color and, consequently, its spectral response.
Table 9 presents the performance statistics of the best-performing machine learning models for each evaluated parameter under the three distinct approaches. Figure 4 complements this information by displaying scatter plots of predicted versus observed values along a 1:1 line, allowing a visual assessment of prediction accuracy.
Table 9 and Figure 4 show that the dataset derived from normalized PS images achieved the best results across all evaluated parameters. Models for turbidity, TSS, Fe, P, and DO presented CCC values between 0.92 and 0.55, while R2 values ranged from 0.85 to 0.39. The MAE and RMSE values were lower than the thresholds established by the null models. For the parameters COD, N, and Chla, although the results were better than those obtained with the other two datasets, the CCC values remained below 0.50, varying between 0.45 and 0.26, while R2 ranged from 0.30 to 0.11.
These patterns and value magnitudes are consistent with the findings of Gao et al. [115], who predicted non-optically active parameters using S2 data. This behavior is expected because the predictive capability arises from indirect relationships with optically active constituents and with short-term hydrological dynamics captured by the CHIRPS precipitation covariates. These factors co-vary due to sediment resuspension, nutrient transport, and seasonal changes in streamflow, allowing the models to identify nonlinear environmental patterns.
The results demonstrate the superior performance of the models developed using the PS dataset—both raw and normalized—compared with the S2 data across all analyzed parameters. This advantage arises directly from the characteristics of the PS sensor, such as its high spatial resolution (3.7 m), which enables the detection of finer-scale features, and its daily temporal resolution, which allows for a more accurate characterization of aquatic variability. These findings suggest that PS data offer significant advantages for water quality modeling, especially in complex aquatic systems where spatial and temporal variability is critical.
The MSI/S2 sensor, despite being widely used, has notable limitations in narrow water bodies (<30 m wide), where its 10 m spatial resolution induces spectral mixing errors. Moreover, its reflectance is highly sensitive to external interferences (riverbed, riparian vegetation, and sediments), as reported by Barbosa et al. [67], Greb et al. [116], and Isidro et al. [117], which reduces its accuracy in more complex aquatic systems.
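A rough worked example of the mixed-pixel problem, using the pixel sizes quoted in the text (10 m for MSI/S2 and 3.7 m for PS). It is a geometric sketch only, assuming a straight channel that crosses the pixel grid at right angles with an arbitrary grid offset; the function name is hypothetical and real channels will behave less favorably.

```python
import math


def guaranteed_pure_pixels(channel_width_m: float, pixel_size_m: float) -> int:
    """Lower bound on pixels lying wholly inside the channel along one cross-section.

    With an arbitrary grid offset, up to one pixel's worth of width is lost to
    the pixels that straddle the two banks, hence the "- 1".
    """
    if channel_width_m <= 0 or pixel_size_m <= 0:
        raise ValueError("width and pixel size must be positive")
    return max(0, math.floor(channel_width_m / pixel_size_m) - 1)


# A 30 m wide reach seen at the two pixel sizes quoted in the text.
for label, pixel_size in (("MSI/S2", 10.0), ("PS", 3.7)):
    print(label, guaranteed_pure_pixels(30.0, pixel_size))
```

Under these assumptions a 30 m reach is guaranteed only two unmixed 10 m pixels across its width but seven at 3.7 m, which leaves far more room to discard bank-contaminated pixels before extracting reflectance.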
In summary, when analyzing the characteristics of each sensor across the three evaluated datasets, the results indicate that the models developed using normalized PS data achieved the best performance, surpassing those based on raw PS and MSI/S2 data. This superior performance suggests that normalized PS data are more suitable for predicting water quality parameters, particularly turbidity, TSS, Fe, and P.
Figure 5 and Figure 6 present the predicted and observed values of turbidity, TSS, Fe, P, DO, COD, N, and Chla parameters modeled using normalized PS data for lentic and lotic environments. Overall, due to environmental characteristics, lentic systems exhibited lower dispersion, with values closely grouped. In contrast, lotic environments displayed greater dispersion across all parameters.
Analyzing the statistical indices reveals that turbidity, TSS, COD, and N showed only minor variations in model performance between lentic and lotic environments. Conversely, Fe, P, DO, and Chla exhibited more pronounced differences between the two environments. In lentic systems, turbidity, TSS, and DO achieved the best performances, with CCC values ranging from 0.97 to 0.72 and R2 values between 0.95 and 0.65. Meanwhile, Fe, P, COD, N, and Chla had CCC values between 0.43 and 0.12 and R2 between 0.28 and 0.04. In lotic environments, turbidity, TSS, and Fe stood out, with CCC values ranging from 0.91 to 0.68 and R2 from 0.83 to 0.52. The parameters DO, P, COD, N, and Chla, however, displayed lower CCC and R2 values, ranging from 0.48 to 0.19 and 0.31 to 0.06, respectively.
The methodology proposed in this study proved robust for assessing water quality using a historical series of PS images. The comparative analysis between PS data and those from a well-established constellation such as S2 demonstrated that PS imagery can provide valuable information despite its lower spectral resolution and the inherent radiometric differences among sensors. Although PS presents certain limitations, it shows strong potential for monitoring water-quality parameters in inland waters, particularly when radiometric normalization is applied. Normalization enhances radiometric consistency across different PS sensor generations (Dove Classic, Dove-R, and Super Dove), reducing calibration discrepancies and improving the temporal comparability of images acquired by different satellites. It also mitigates sensor-specific noise, including variations in gain, offset, and illumination, resulting in more stable and reliable spectral indices that are less susceptible to radiometric distortions.
The proposed methodology not only validates the use of PS data to predict water-quality parameters, but also offers relevant contributions to integrating different data sources to improve predictive accuracy. Based on the evidence presented, Planet's nanosatellites are a promising tool for environmental monitoring, particularly in contexts that demand continuous, large-scale observations, opening new possibilities for water-resource management and for understanding environmental impacts.
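The gain and offset corrections mentioned above can be illustrated with a generic linear relative normalization: a least-squares line is fitted between paired reflectances from the sensor to be corrected and a reference sensor, then applied band by band. This is a simplified sketch and not the specific procedure of [36]; the function names and the paired values are hypothetical.

```python
from statistics import mean


def fit_gain_offset(source, reference):
    """Least-squares fit of reference = gain * source + offset for one band."""
    mx, my = mean(source), mean(reference)
    sxx = sum((x - mx) ** 2 for x in source)
    if sxx == 0:
        raise ValueError("source values must not all be identical")
    sxy = sum((x - mx) * (y - my) for x, y in zip(source, reference))
    gain = sxy / sxx
    return gain, my - gain * mx


def normalize(values, gain, offset):
    """Apply the fitted linear correction to source-sensor reflectances."""
    return [gain * v + offset for v in values]


# Hypothetical paired reflectances over radiometrically stable targets:
# `ps_band` from the sensor to be normalized, `s2_band` from the reference.
ps_band = [0.10, 0.20, 0.30, 0.40]
s2_band = [0.13, 0.24, 0.35, 0.46]

gain, offset = fit_gain_offset(ps_band, s2_band)
print(round(gain, 3), round(offset, 3))
print([round(v, 3) for v in normalize(ps_band, gain, offset)])
```

Fitting one gain and offset per band and per image is the simplest design; it corrects linear calibration differences between satellites but cannot remove nonlinear or spatially varying effects.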
It is important to emphasize that, although PS data stand out for their high spatial resolution and frequent temporal coverage, these images are not free, unlike those provided by the MSI/Sentinel-2 sensor. For this reason, the use of PS imagery requires a careful cost–benefit assessment to determine whether the financial investment is compatible with the monitoring objectives. In the context of this study, the use of PS data was made possible through the Planet Education and Research Program, which justified their inclusion in the analysis. However, in professional applications, the acquisition cost must be weighed against the specific requirements of each monitoring project, considering whether the advantages offered by PS outweigh the free alternative provided by S2.
MSI/S2 imagery, in turn, also presents limitations, such as lower spatial resolution compared with PS, which may hinder the detection of small-scale targets or phenomena. In addition, frequent cloud cover and lower temporal revisit in some regions can limit the applicability of S2 data in studies that require both high spatial and temporal resolution.

4. Conclusions

Based on the results, we conclude:
  1. The methodology that combines orbital remote-sensing data with machine-learning techniques is suitable for predicting turbidity, TSS, Fe, P, and DO, but shows limitations for N, COD, and Chla.
  2. The NIR band was a key covariate in all approaches (B8 for S2, B4 for PS). In addition, spectral indices, band ratios, and CHIRPS precipitation products exhibited strong predictive value for estimating water-quality parameters.
  3. To improve model performance with PS data, radiometric normalization of the imagery is essential.
  4. Tree-based models, particularly RF and Cubist, were more robust and higher-performing than KKNN and SVM-RBF.

Author Contributions

Conceptualization, Methodology, Formal Analysis, Visualization, Writing – original draft, R.L.S.D.; Methodology, Supervision, Conceptualization, Writing – revision and editing, R.S.S.A.; Supervision, Conceptualization, Writing – revision and editing, D.D.d.S.; Methodology, Formal Analysis, Conceptualization, Visualization, E.I.F.-F.; Methodology, Conceptualization, Data Curation, G.V.V.; Research, Data Curation, R.H.F.M. All authors have read and agreed to the published version of the manuscript.

Funding

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brazil (CAPES)—Finance Code 001, the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), and the Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG), grant number APQ-01957-22.

Data Availability Statement

The field survey data that support the findings of this study are available from the corresponding author, R.L.S.D., upon reasonable request.

Acknowledgments

We thank the anonymous reviewers for their careful reading and comments, which helped us to improve the manuscript quality. We thank the Department of Agricultural Engineering (DEA) and the Center of Reference in Water Resources (CRRH) of the Universidade Federal de Viçosa for supporting the researchers.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ogashawara, I.; Mishra, D.R.; Gitelson, A.A. Remote Sensing of Inland Waters: Background and Current State-of-the-Art; Elsevier Inc.: Amsterdam, The Netherlands, 2017; ISBN 9780128046548.
  2. Saberioon, M.; Brom, J.; Nedbal, V.; Souček, P.; Císař, P. Chlorophyll-a and Total Suspended Solids Retrieval and Mapping Using Sentinel-2A and Machine Learning for Inland Waters. Ecol. Indic. 2020, 113, 106236.
  3. Barbosa, G.R. Introdução Ao Sistema de Informações Geográficas. Available online: https://www.kufunda.net/publicdocs/sig-bd-jai.pdf (accessed on 15 December 2025).
  4. Arango, J.G.; Nairn, R.W. Prediction of Optical and Non-Optical Water Quality Parameters in Oligotrophic and Eutrophic Aquatic Systems Using a Small Unmanned Aerial System. Drones 2020, 4, 1.
  5. Gholizadeh, M.H.; Melesse, A.M.; Reddi, L. A Comprehensive Review on Water Quality Parameters Estimation Using Remote Sensing Techniques. Sensors 2016, 16, 1298.
  6. Winston, R.J.; Dorsey, J.D.; Hunt, W.F. Quantifying Volume Reduction and Peak Flow Mitigation for Three Bioretention Cells in Clay Soils in Northeast Ohio. Sci. Total Environ. 2016, 553, 83–95.
  7. de Aragão, R.; Cruz, M.A.S.; Correia, E.C.d.O.; Machado, L.F.M.; de Figueiredo, E.E. Impacto Do Uso Do Solo Pelo Aumento Da Densidade Populacional Sobre o Escoamento Numa Área Urbana Do Nordeste Brasileiro via Geotecnologias e Modelagem Hidrológica. Rev. Bras. Geogr. Fís. 2017, 10, 543–557.
  8. Bonansea, M.; Rodriguez, M.C.; Pinotti, L.; Ferrero, S. Using Multi-Temporal Landsat Imagery and Linear Mixed Models for Assessing Water Quality Parameters in Río Tercero Reservoir (Argentina). Remote Sens. Environ. 2015, 158, 28–41.
  9. Cui, Y.; Yan, Z.; Wang, J.; Hao, S.; Liu, Y. Deep Learning–Based Remote Sensing Estimation of Water Transparency in Shallow Lakes by Combining Landsat 8 and Sentinel 2 Images. Environ. Sci. Pollut. Res. 2021, 29, 4401–4413.
  10. Dias, R.L.S.; da Silva, D.D.; Fernandes-Filho, E.I.; do Amaral, C.H.; dos Santos, E.P.; Marques, J.F.; Veloso, G.V. Machine Learning Models Applied to TSS Estimation in a Reservoir Using Multispectral Sensor Onboard to RPA. Ecol. Inform. 2021, 65, 101414.
  11. Sagan, V.; Peterson, K.T.; Maimaitijiang, M.; Sidike, P.; Sloan, J.; Greeling, B.A.; Maalouf, S.; Adams, C. Monitoring Inland Water Quality Using Remote Sensing: Potential and Limitations of Spectral Indices, Bio-Optical Simulations, Machine Learning, and Cloud Computing. Earth Sci. Rev. 2020, 205, 103187.
  12. Tian, S.; Guo, H.; Xu, W.; Zhu, X.; Wang, B.; Zeng, Q.; Mai, Y.; Huang, J.J. Remote Sensing Retrieval of Inland Water Quality Parameters Using Sentinel-2 and Multiple Machine Learning Algorithms. Environ. Sci. Pollut. Res. 2023, 30, 18617–18630.
  13. Ferdous, J.; Rahman, M.T.U. Developing an Empirical Model from Landsat Data Series for Monitoring Water Salinity in Coastal Bangladesh. J. Environ. Manag. 2020, 255, 109861.
  14. Swain, R.; Sahoo, B. Mapping of Heavy Metal Pollution in River Water at Daily Time-Scale Using Spatio-Temporal Fusion of MODIS-Aqua and Landsat Satellite Imageries. J. Environ. Manag. 2017, 192, 1–14.
  15. Xiong, Y.; Ran, Y.; Zhao, S.; Zhao, H.; Tian, Q. Remotely Assessing and Monitoring Coastal and Inland Water Quality in China: Progress, Challenges and Outlook. Crit. Rev. Environ. Sci. Technol. 2020, 50, 1266–1302.
  16. Liu, H.; Yu, T.; Hu, B.; Hou, X.; Zhang, Z.; Liu, X.; Liu, J.; Wang, X.; Zhong, J.; Tan, Z.; et al. Uav-Borne Hyperspectral Imaging Remote Sensing System Based on Acousto-Optic Tunable Filter for Water Quality Monitoring. Remote Sens. 2021, 13, 4069.
  17. El Ouali, A.; El Hafyani, M.; Roubil, A.; Lahrach, A.; Essahlaoui, A.; Hamid, F.E.; Muzirafuti, A.; Paraforos, D.S.; Lanza, S.; Randazzo, G. Modeling and Spatiotemporal Mapping of Water Quality through Remote Sensing Techniques: A Case Study of the Hassan Addakhil Dam. Appl. Sci. 2021, 11, 9297.
  18. Peterson, K.T.; Sagan, V.; Sidike, P.; Hasenmueller, E.A.; Sloan, J.J.; Knouft, J.H. Machine Learning-Based Ensemble Prediction of Water-Quality Variables Using Feature-Level and Decision-Level Fusion with Proximal Remote Sensing. Photogramm. Eng. Remote Sens. 2019, 85, 269–280.
  19. Peterson, K.T.; Sagan, V.; Sloan, J.J. Deep Learning-Based Water Quality Estimation and Anomaly Detection Using Landsat-8/Sentinel-2 Virtual Constellation and Cloud Computing. GISci. Remote Sens. 2020, 57, 510–525.
  20. Zhu, M.; Wang, J.; Yang, X.; Zhang, Y.; Zhang, L.; Ren, H.; Wu, B.; Ye, L. A Review of the Application of Machine Learning in Water Quality Evaluation. Eco Environ. Health 2022, 1, 107–116.
  21. Gao, Y.; Gao, J.; Yin, H.; Liu, C.; Xia, T.; Wang, J.; Huang, Q. Remote Sensing Estimation of the Total Phosphorus Concentration in a Large Lake Using Band Combinations and Regional Multivariate Statistical Modeling Techniques. J. Environ. Manag. 2015, 151, 33–43.
  22. Sun, X.; Zhang, Y.; Shi, K.; Zhang, Y.; Li, N.; Wang, W.; Huang, X.; Qin, B. Monitoring Water Quality Using Proximal Remote Sensing Technology. Sci. Total Environ. 2022, 803, 149805.
  23. Shen, Q.; Xing, X.; Yao, Y.; Wang, M.; Liu, S.; Li, J.; Zhang, B. Estimation of Suspended Matter Concentration in Manwan Reservoir, Lancang River Using Remotely Sensed Small Satellite Constellation for Environment and Disaster Monitoring and Forecasting (HJ-1A/1B), Charge Coupled Device (CCD) Data. Int. J. Remote Sens. 2021, 42, 5236–5256.
  24. Mathew, M.M.; Srinivasa Rao, N.; Mandla, V.R. Development of Regression Equation to Study the Total Nitrogen, Total Phosphorus and Suspended Sediment Using Remote Sensing Data in Gujarat and Maharashtra Coast of India. J. Coast. Conserv. 2017, 21, 917–927.
  25. ESA Guia de Missão Sentinel 2 2023. Available online: https://sentinels.copernicus.eu/documents/247904/685211/Sentinel-2_User_Handbook (accessed on 2 December 2024).
  26. NASA Detalhes Da Missão Do Landsat 8 2023. Available online: https://science.nasa.gov/mission/landsat-8/ (accessed on 5 December 2024).
  27. Nguyen, U.N.T.; Pham, L.T.H.; Dang, T.D. An Automatic Water Detection Approach Using Landsat 8 OLI and Google Earth Engine Cloud Computing to Map Lakes and Reservoirs in New Zealand. Environ. Monit. Assess. 2019, 191, 235.
  28. Ansper, A.; Alikas, K. Retrieval of Chlorophyll a from Sentinel-2 MSI Data for the European Union Water Framework Directive Reporting Purposes. Remote Sens. 2018, 11, 64.
  29. Planet Team Planet Surface Reflectance Product v2. Planet Labs Inc., 2020; pp. 1–18. Available online: https://assets.planet.com/marketing/PDF/Planet_Surface_Reflectance_Technical_White_Paper.pdf (accessed on 30 November 2020).
  30. Crusan, J.; Galica, C. NASA’s CubeSat Launch Initiative: Enabling Broad Access to Space. Acta Astronaut. 2019, 157, 51–60.
  31. IBGE Instituto Brasileiro de Geografia e Estatística. Available online: www.cidades.ibge.gov.br (accessed on 14 May 2025).
  32. EMBRAPA Sumula Da X Reunião Técnica de Levantamentos de Solos (SNLCS, Série Miscelânia, 1). Serviço Nac. Levant. e Conserv. Solos 1979. Available online: https://www.infoteca.cnptia.embrapa.br/infoteca/handle/doc/327212 (accessed on 10 May 2025).
  33. Alvares, C.A.; Stape, L.; Sentelhas, P.C.; Gonc, L.D.M.; Sparovek, G. Koppen’s Climate Classification Map for Brazil. Meteorol. Z. 2014, 22, 711–728. [Google Scholar] [CrossRef]
  34. Gerais, G.d.E.d.M. Histórico Do Rompimento Das Barragens Da Vale Na Mina Córrego Do Feijão. Available online: https://www.mg.gov.br/pro-brumadinho/pagina/historico-do-rompimento-das-barragens-da-vale-na-mina-corrego-do-feijao (accessed on 13 December 2024).
  35. IGAM Instituto Mineiro de Gestão Das Águas. Água Superficial 2021. Available online: https://igam.mg.gov.br/w/monitoramento-de-qualidade-das-aguas (accessed on 15 May 2025).
  36. Dias, R.L.S.; Amorim, R.S.S.; da Silva, D.D.; Fernandes-Filho, E.I.; Veloso, G.V.; Macedo, R.H.F. Relative Radiometric Normalization for the PlanetScope Nanosatellite Constellation Based on Sentinel-2 Images. Remote Sens. 2024, 16, 4047.
  37. Müller-Wilm, U. Sentinel-2 MSI—Level-2A Prototype Processor Installation and User Manual; Special Publication ESA SP; European Space Agency (ESA): Paris, France, 2016; Volume 49, pp. 1–51.
  38. Vanhellemont, Q.; Ruddick, K. Acolite for Sentinel-2: Aquatic Applications of MSI Imagery. In Proceedings of the 2016 ESA Living Planet Symposium, Prague, Czech Republic, 9–13 May 2016; pp. 9–13.
  39. Gao, B.; Montes, M.J.; Davis, C.O.; Goetz, A.F.H. Atmospheric Correction Algorithms for Hyperspectral Remote Sensing Data of Land and Ocean. Remote Sens. Environ. 2009, 113, S17–S24.
  40. Feyisa, G.L.; Meilby, H.; Fensholt, R.; Proud, S.R. Automated Water Extraction Index: A New Technique for Surface Water Mapping Using Landsat Imagery. Remote Sens. Environ. 2014, 140, 23–35.
  41. Mukherjee, N.R.; Samuel, C. Assessment of the Temporal Variations of Surface Water Bodies in and around Chennai Using Landsat Imagery. Indian J. Sci. Technol. 2016, 9, 1–7.
  42. Li, H.; Liu, Q. Comparison of NDBI and NDVI as Indicators of Surface Urban Heat Island Effect in MODIS Imagery. In Proceedings of the International Conference on Earth Observation Data Processing and Analysis, Wuhan, China, 28–30 December 2008; Volume 7285.
  43. Huete, A.R. A Soil-Adjusted Vegetation Index (SAVI). Remote Sens. Environ. 1988, 25, 295–309.
  44. Henrich, V.; Götze, E.; Jung, A.; Sandow, C.; Thürkow, D.; Gläßer, C. Development of an Online Indices Database: Motivation, Concept and Implementation. In Proceedings of the 6th EARSeL Imaging Spectroscopy SIG Workshop Innovative Tool for Scientific and Commercial Environment Applications, Tel Aviv, Israel, 16–18 March 2009; pp. 16–18.
  45. Rowan, L.C.; Mars, J.C. Lithologic Mapping in the Mountain Pass, California Area Using Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) Data. Remote Sens. Environ. 2003, 84, 350–366.
  46. Van Deventer, A.P.; Ward, A.D.; Gowda, P.H.; Lyon, J.G. Using Thematic Mapper Data to Identify Contrasting Soil Plains and Tillage Practices. Photogramm. Eng. Remote Sens. 1997, 63, 87–93.
  47. Zhang, K.; Thapa, B.; Ross, M.; Gann, D. Remote Sensing of Seasonal Changes and Disturbances in Mangrove Forest: A Case Study from South Florida. Ecosphere 2016, 7, e01366.
  48. Wilson, E.H.; Sader, S.A. Detection of Forest Harvest Type Using Multiple Dates of Landsat TM Imagery. Remote Sens. Environ. 2002, 80, 385–396.
  49. Hewson, R.D.; Cudahy, T.J.; Huntington, J.F. Geologic and Alteration Mapping at Mt Fitton, South Australia, Using ASTER Satellite-Borne Data. In Proceedings of the IGARSS 2001. Scanning the Present and Resolving the Future. Proceedings. IEEE 2001 International Geoscience and Remote Sensing Symposium (Cat. No.01CH37217), Sydney, NSW, Australia, 9–13 July 2001; Volume 2, pp. 724–726.
  50. Bousbih, S.; Zribi, M.; Pelletier, C.; Gorrab, A.; Lili-Chabaane, Z.; Baghdadi, N.; Ben Aissa, N.; Mougenot, B. Soil Texture Estimation Using Radar and Optical Data from Sentinel-1 and Sentinel-2. Remote Sens. 2019, 11, 1520.
  51. Toming, K.; Kutser, T.; Laas, A.; Sepp, M.; Paavel, B.; Nõges, T. First Experiences in Mapping Lakewater Quality Parameters with Sentinel-2 MSI Imagery. Remote Sens. 2016, 8, 640.
  52. Tucker, C.J. Red and Photographic Infrared Linear Combinations for Monitoring Vegetation. Remote Sens. Environ. 1979, 8, 127–150.
  53. McFeeters, S.K. The Use of the Normalized Difference Water Index (NDWI) in the Delineation of Open Water Features. Int. J. Remote Sens. 1996, 17, 1425–1432.
  54. Polidorio, A.M.; Imai, N.N.; Tommaselli, A.M.G. Índice Indicador de Corpos d’água Para Imagens Multiespectrais. In Proceedings of the I Simpósio de Ciências Geodésicas e Tecnologias da Geoinformação (I SIMGEO), Recife, Brazil, 1–3 September 2004; Volume 9.
  55. Clevers, J.; De Jong, S.M.; Epema, G.F.; Van Der Meer, F.D.; Bakker, W.H.; Skidmore, A.K.; Scholte, K.H. Derivation of the Red Edge Index Using the MERIS Standard Band Setting. Int. J. Remote Sens. 2002, 23, 3169–3184.
  56. Costa, L.; Nunes, L.; Ampatzidis, Y. A New Visible Band Index (VNDVI) for Estimating NDVI Values on RGB Images Utilizing Genetic Algorithms. Comput. Electron. Agric. 2020, 172, 105334.
  57. Kaufman, Y.J.; Tanre, D. Atmospherically Resistant Vegetation Index (ARVI) for EOS-MODIS. IEEE Trans. Geosci. Remote Sens. 1992, 30, 261–270.
  58. Barnes, J.D.; Balaguer, L.; Manrique, E.; Elvira, S.; Davison, A.W. A Reappraisal of the Use of DMSO for the Extraction and Determination of Chlorophylls a and b in Lichens and Higher Plants. Environ. Exp. Bot. 1992, 32, 85–100.
  59. Gitelson, A.A.; Kaufman, Y.J.; Merzlyak, M.N. Use of a Green Channel in Remote Sensing of Global Vegetation from EOS-MODIS. Remote Sens. Environ. 1996, 58, 289–298.
  60. Pizani, F.M.C.; Maillard, P. Um Índice De Turbidez Para Águas Relativamente Claras. In Proceedings of the XX Simpósio Brasileiro de Sensoriamento Remoto, Florianópolis, Brazil, 2–5 April 2023; pp. 668–671.
  61. Planet Team. Planet Application Program Interface: In Space for Life on Earth; Planet Labs Inc.: San Francisco, CA, USA, 2021; pp. 1–96.
  62. Lacaux, J.P.; Tourre, Y.M.; Vignolles, C.; Ndione, J.A.; Lafaye, M. Classification of Ponds from High-Spatial Resolution Remote Sensing: Application to Rift Valley Fever Epidemics in Senegal. Remote Sens. Environ. 2007, 106, 66–74.
  63. Funk, C.; Peterson, P.; Landsfeld, M.; Pedreros, D.; Verdin, J.; Shukla, S.; Husak, G.; Rowland, J.; Harrison, L.; Hoell, A. The Climate Hazards Infrared Precipitation with Stations—A New Environmental Record for Monitoring Extremes. Sci. Data 2015, 2, 150066.
  64. CHIRPS. CHIRPS: Rainfall Estimates from Rain Gauge and Satellite Observations|Climate Hazards Center—UC Santa Barbara. Available online: https://www.chc.ucsb.edu/data/chirps (accessed on 23 July 2023).
  65. R Core Team. R: A Language and Environment for Statistical Computing, Version 3.3.1; R Foundation for Statistical Computing: Vienna, Austria, 2020.
  66. Carvalho, M.L.S.; Cabús, R.C. Eficiência Da Luz Solar Refletida e Desempenho de Dispositivos de Sombreamento. Ambient. Construído 2020, 20, 191–209.
  67. Barbosa, C.C.F.; Novo, E.M.L.M.; Martins, V.S. Introdução ao Sensoriamento Remoto de Sistemas Aquáticos; Instituto Nacional de Pesquisas Espaciais (INPE): São José dos Campos, Brazil, 2019; Volume 1, ISBN 978-85-17-00095-9.
  68. Neogi, S.; Dauwels, J. Factored Latent-Dynamic Conditional Random Fields for Single and Multi-Label Sequence Modeling. Pattern Recognit. 2022, 122, 108236.
  69. Mello, D.C.D.; Veloso, G.V.; Lana, M.G.D.; Mello, F.A.D.O.; Poppiel, R.R.; Cabrero, D.R.O.; Di Raimo, L.A.D.L.; Schaefer, C.E.G.R.; Filho, E.I.F.; Leite, E.P.; et al. A New Methodological Framework for Geophysical Sensor Combinations Associated with Machine Learning Algorithms to Understand Soil Attributes. Geosci. Model Dev. 2022, 15, 1219–1246.
  70. Muñoz-Romero, S.; Gorostiaga, A.; Soguero-Ruiz, C.; Mora-Jiménez, I.; Rojo-Álvarez, J.L. Informative Variable Identifier: Expanding Interpretability in Feature Selection. Pattern Recognit. 2020, 98, 107077.
  71. Reunanen, J. Overfitting in Making Comparisons between Variable Selection Methods. J. Mach. Learn. Res. 2003, 3, 1371–1382.
  72. da Silveira, V.A.; Veloso, G.V.; de Paula, H.B.; dos Santos, A.R.; Schaefer, C.E.G.R.; Fernandes-Filho, E.I.; Francelino, M.R. Modeling and Mapping of Inselberg Habitats for Environmental Conservation in the Atlantic Forest and Caatinga Domains, Brazil. Environ. Adv. 2022, 8, 100209.
  73. Kuhn, M. Caret: Classification and Regression Training. 2020. Available online: https://cran.r-project.org/web/packages/caret/index.html (accessed on 15 December 2025).
  74. Murphy, K.P. Machine Learning: A Probabilistic Perspective. In Chance Encounters: Probability in Education; Springer: Dordrecht, The Netherlands, 2013; p. 1098. [Google Scholar]
  75. Lee Rodgers, J.; Nicewander, W.A. Thirteen Ways to Look at the Correlation Coefficient. Am. Stat. 1988, 42, 59–66. [Google Scholar] [CrossRef]
  76. Stevens, A.; Nocita, M.; Tóth, G.; Montanarella, L.; van Wesemael, B. Prediction of Soil Organic Carbon at the European Scale by Visible and Near InfraRed Reflectance Spectroscopy. PLoS ONE 2013, 8, e66409. [Google Scholar] [CrossRef]
  77. Ghosh, A.; Joshi, P.K. A Comparison of Selected Classification Algorithms for Mapping Bamboo Patches in Lower Gangetic Plains Using Very High Resolution WorldView 2 Imagery. Int. J. Appl. Earth Obs. Geoinf. 2014, 26, 298–311. [Google Scholar] [CrossRef]
  78. Meyer, H.; Reudenbach, C.; Hengl, T.; Katurji, M.; Nauss, T. Improving Performance of Spatio-Temporal Machine Learning Models Using Forward Feature Selection and Target-Oriented Validation. Environ. Model. Softw. 2018, 101, 1–9. [Google Scholar] [CrossRef]
  79. Meyer, H.; Lehnert, L.W.; Wang, Y.; Reudenbach, C.; Nauss, T.; Bendix, J. From Local Spectral Measurements to Maps of Vegetation Cover and Biomass on the Qinghai-Tibet-Plateau: Do We Need Hyperspectral Information? Int. J. Appl. Earth Obs. Geoinf. 2017, 55, 21–31. [Google Scholar] [CrossRef]
  80. Gomes, L.C.; Faria, R.M.; de Souza, E.; Veloso, G.V.; Schaefer, C.E.G.R.; Fernandes Filho, E.I. Modelling and Mapping Soil Organic Carbon Stocks in Brazil. Geoderma 2019, 340, 337–350. [Google Scholar] [CrossRef]
  81. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 27. [Google Scholar] [CrossRef]
  82. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 25. [Google Scholar] [CrossRef]
  83. Zhang, M.L.; Zhou, Z.H. ML-KNN: A Lazy Learning Approach to Multi-Label Learning. Pattern Recognit. 2007, 40, 2038–2048. [Google Scholar] [CrossRef]
  84. Adams, A.; Sterling, L. AI ’92. In Proceedings of the 5th Australian Joint Conference on Artificial Intelligence, Hobart, Tasmania, 16–18 November 1992; 408p. [Google Scholar]
  85. Kuhn, M.; Johnson, K. Applied Predictive Modeling; Springer: Berlin/Heidelberg, Germany, 2013; ISBN 9781461468486. [Google Scholar]
  86. Tiyasha, T.; Tung, T.M.; Bhagat, S.K.; Tan, M.L.; Jawad, A.H.; Mohtar, W.H.M.W.; Yaseen, Z.M. Functionalization of Remote Sensing and On-Site Data for Simulating Surface Water Dissolved Oxygen: Development of Hybrid Tree-Based Artificial Intelligence Models. Mar. Pollut. Bull. 2021, 170, 112639. [Google Scholar] [CrossRef] [PubMed]
  87. Du, C.; Wang, Q.; Li, Y.; Lyu, H.; Zhu, L.; Zheng, Z.; Wen, S.; Liu, G.; Guo, Y. Estimation of Total Phosphorus Concentration Using a Water Classification Method in Inland Water. Int. J. Appl. Earth Obs. Geoinf. 2018, 71, 29–42. [Google Scholar] [CrossRef]
  88. Ferreira, R.G.; da Silva, D.D.; Elesbon, A.A.A.; Fernandes-Filho, E.I.; Veloso, G.V.; Fraga, M.d.S.; Ferreira, L.B. Machine Learning Models for Streamflow Regionalization in a Tropical Watershed. J. Environ. Manag. 2021, 280, 111713. [Google Scholar] [CrossRef]
  89. Mountrakis, G.; Im, J.; Ogole, C. Support Vector Machines in Remote Sensing: A Review. ISPRS J. Photogramm. Remote Sens. 2011, 66, 247–259. [Google Scholar] [CrossRef]
  90. Cover, T.; Hart, P. Nearest Neighbor Pattern Classification. IEEE Trans. Inf. Theory 1967, 13, 21–27. [Google Scholar] [CrossRef]
  91. Abdulrahman, S.A.; Khalifa, W.; Roushdy, M.; Salem, A.-B.M. Comparative Study for 8 Computational Intelligence Algorithms for Human Identification. Comput. Sci. Rev. 2020, 36, 100237. [Google Scholar] [CrossRef]
  92. Dudani, S.A. The Distance-Weighted k-Nearest-Neighbor Rule. IEEE Trans. Syst. Man. Cybern. 1976, SMC-6, 325–327. [Google Scholar] [CrossRef]
  93. Shepard, D. A Two-Dimensional Interpolation Function for Irregularly-Spaced Data. In Proceedings of the 1968 23rd ACM National Conference, Las Vegas, NV, USA, 27–29 August 1968; pp. 517–524. [Google Scholar]
  94. Noi, P.T.; Degener, J.; Kappas, M. Comparison of Multiple Linear Regression, Cubist Regression, and Random Forest Algorithms to Estimate Daily Air Surface Temperature from Dynamic Combinations of MODIS LST Data. Remote Sens. 2017, 9, 398. [Google Scholar] [CrossRef]
  95. Houborg, R.; McCabe, M.F. A Hybrid Training Approach for Leaf Area Index Estimation via Cubist and Random Forests Machine-Learning. ISPRS J. Photogramm. Remote Sens. 2018, 135, 173–188. [Google Scholar] [CrossRef]
  96. Hafeez, S.; Wong, M.S.; Ho, H.C.; Nazeer, M.; Nichol, J.; Abbas, S.; Tang, D.; Lee, K.H.; Pun, L. Comparison of Machine Learning Algorithms for Retrieval of Water Quality Indicators in Case-II Waters: A Case Study of Hong Kong. Remote Sens. 2019, 11, 617. [Google Scholar] [CrossRef]
  97. de Mello, D.C.; Francelino, M.R.; Moquedace, C.M.; Baldi, C.G.O.; Silva, L.V.; Siqueira, R.G.; Veloso, G.V.; Fernandes-Filho, E.I.; Thomazini, A.; Demattê, J.A.M. Global Warming May Turn Ice-Free Areas of Maritime and Peninsular Antarctica into Potential Soil Organic Carbon Sinks. Commun. Earth Environ. 2025, 6, 143. [Google Scholar] [CrossRef]
  98. Kennedy, J.B.; Neville, A.M. Basic Statistical Methods for Engineers and Scientists; HarperCollins Publishers: New York, NY, USA, 1986. [Google Scholar]
  99. Arlot, S.; Celisse, A. A Survey of Cross-Validation Procedures for Model Selection. Stat. Surv. 2010, 4, 40–79. [Google Scholar] [CrossRef]
  100. Lin, L.I.-K. A Concordance Correlation Coefficient to Evaluate Reproducibility. Biometrics 1989, 45, 255. [Google Scholar] [CrossRef] [PubMed]
  101. Chai, T.; Draxler, R.R. Root Mean Square Error (RMSE) or Mean Absolute Error (MAE)?—Arguments against Avoiding RMSE in the Literature. Geosci. Model Dev. 2014, 7, 1247–1250. [Google Scholar] [CrossRef]
  102. Willmott, C.J.; Matsuura, K. Advantages of the Mean Absolute Error (MAE) over the Root Mean Square Error (RMSE) in Assessing Average Model Performance. Clim. Res. 2005, 30, 79–82. [Google Scholar] [CrossRef]
  103. Morse-McNabb, E.M.; Hasan, M.F.; Karunaratne, S. A Multi-Variable Sentinel-2 Random Forest Machine Learning Model Approach to Predicting Perennial Ryegrass Biomass in Commercial Dairy Farms in Southeast Australia. Remote Sens. 2023, 15, 2915. [Google Scholar] [CrossRef]
  104. Bruce, A.; Bruce, P. Estatística Prática Para Cientistas de Dados; Alta Books: Rio de Janeiro, Brazil, 2019; ISBN 8550810800. [Google Scholar]
  105. Wilks, D.S. Statistical Methods in the Atmospheric Sciences; Academic Press: Cambridge, MA, USA, 2011; ISBN 0123850231. [Google Scholar]
  106. Altman, D.G. Practical Statistics for Medical Research; Chapman and Hall/CRC, 1990; ISBN 0429258585. [Google Scholar]
  107. Sestini, M.F. Variáveis Geomorfológicas no Estudo de Deslizamentos em Caraguatatuba-SP Utilizando Imagens TM-Landsat e SIG; Instituto Nacional de Pesquisas Espaciais: São José dos Campos, SP, Brazil, 1999. [Google Scholar]
  108. Lillesand, T.; Kiefer, R.W.; Chipman, J. Remote Sensing and Image Interpretation; John Wiley & Sons: Hoboken, NJ, USA, 2015; ISBN 111834328X. [Google Scholar]
  109. Tyralis, H.; Papacharalampous, G.; Langousis, A. A Brief Review of Random Forests for Water Scientists and Practitioners and Their Recent History in Water Resources. Water 2019, 11, 910. [Google Scholar] [CrossRef]
  110. Csillik, O.; Kumar, P.; Mascaro, J.; O’Shea, T.; Asner, G.P. Monitoring Tropical Forest Carbon Stocks and Emissions Using Planet Satellite Data. Sci. Rep. 2019, 9, 17831. [Google Scholar] [CrossRef]
  111. Bertone, E.; Ajmar, A.; Giulio, F.; Dunn, R.J.K.; Nicholas, J.; Doriean, C.; Bennett, W.W.; Purandare, J. Satellite-Based Estimation of Total Suspended Solids and Chlorophyll-a Concentrations for the Gold Coast Broadwater, Australia. Mar. Pollut. Bull. 2024, 201, 116217. [Google Scholar] [CrossRef]
  112. Dall’Olmo, G.; Gitelson, A.A.; Rundquist, D.C.; Leavitt, B.; Barrow, T.; Holz, J.C. Assessing the Potential of SeaWiFS and MODIS for Estimating Chlorophyll Concentration in Turbid Productive Waters Using Red and Near-Infrared Bands. Remote Sens. Environ. 2005, 96, 176–187. [Google Scholar] [CrossRef]
  113. Kutser, T.; Pierson, D.C.; Kallio, K.Y.; Reinart, A.; Sobek, S. Mapping Lake CDOM by Satellite Remote Sensing. Remote Sens. Environ. 2005, 94, 535–540. [Google Scholar] [CrossRef]
  114. EMBRAPA Sistema Brasileiro de Classificação de Solos; Centro Nacional de Pesquisa de Solos: Rio de Janeiro, Brazil, 2006; p. 306.
  115. Gao, L.; Shangguan, Y.; Sun, Z.; Shen, Q.; Shi, Z. Estimation of Non-Optically Active Water Quality Parameters in Zhejiang Province Based on Machine Learning. Remote Sens. 2024, 16, 514. [Google Scholar] [CrossRef]
  116. Greb, S.; Dekker, A.G.; Binding, C.; Bernard, S.; Brockmann, C.; DiGiacomo, P.; Griffith, D.; Groom, S.; Hestir, E.; Hunter, P. Earth Observations in Support of Global Water Quality Monitoring; Reports of the International Ocean Colour Coordinating Group, No. 17; International Ocean Colour Coordinating Group: Dartmouth, NS, Canada, 2018. [Google Scholar]
  117. Isidro, C.M.; McIntyre, N.; Lechner, A.M.; Callow, I. Quantifying Suspended Solids in Small Rivers Using Satellite Data. Sci. Total Environ. 2018, 634, 1554–1562. [Google Scholar] [CrossRef]
Figure 1. Location of the 24 water-quality monitoring stations distributed along the Paraopeba River Basin and the Três Marias Reservoir in the state of Minas Gerais, Brazil.
Figure 2. Workflow of the methodology used to predict water-quality parameters from machine-learning algorithms and remote-sensing data. CCC = Lin′s concordance correlation coefficient; CHIRPS = Climate Hazards Group InfraRed Precipitation with Station; R2 = coefficient of determination; RMSE = root mean square error; MAE = mean absolute error.
Figure 3. Ranking of the most important covariates for modeling water-quality parameters across the three datasets. Covariates include spectral bands (for example, B3, B4, B8), band ratios (for example, B3/B2, B2/B4), spectral indices (such as GLI, NDTI, ARVI), and environmental variables such as accumulated precipitation prior to the image date (for example, “Rain 15” indicates 15-day rainfall accumulation). ARVI means Atmospherically Resistant Vegetation Index, GLI means Green Leaf Index, and NDTI means Normalized Difference Turbidity Index.
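A covariate such as "Rain 15" can be illustrated with a short sketch. The function name, the daily series, and the choice to exclude the image date from the window are assumptions made for illustration; the caption only states that rainfall is accumulated over the days prior to the image date.

```python
from datetime import date, timedelta

def accumulated_rain(daily_mm, image_date, window_days):
    """Sum daily rainfall (mm) over the `window_days` days before `image_date`.

    Assumptions: the image date itself is excluded, and days missing
    from `daily_mm` contribute 0 mm.
    """
    total = 0.0
    for offset in range(1, window_days + 1):
        total += daily_mm.get(image_date - timedelta(days=offset), 0.0)
    return total

# Made-up daily series: 1 mm on 1 Jan 2020, 2 mm on 2 Jan, ..., 20 mm on 20 Jan.
daily_mm = {date(2020, 1, d): float(d) for d in range(1, 21)}

# Hypothetical image acquired on 21 Jan 2020.
rain_15 = accumulated_rain(daily_mm, date(2020, 1, 21), 15)  # days 6..20 Jan
rain_5 = accumulated_rain(daily_mm, date(2020, 1, 21), 5)    # days 16..20 Jan
print(rain_15, rain_5)
```

In practice the daily values would come from a gridded product such as CHIRPS sampled at each station; that extraction step is not shown here.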
Figure 4. Comparison between predicted and observed reflectance values from the best-performing machine learning models for turbidity (Turb), total suspended solids (TSS), iron (Fe), phosphorus (P), dissolved oxygen (DO), chemical oxygen demand (COD), nitrogen (N), and chlorophyll-a (Chla) across the three analyzed datasets (S2, PS, and normalized PS).
Figure 5. Scatter plots of predicted and observed values for turbidity (Turb), total suspended solids (TSS), iron (Fe), and phosphorus (P) modeled using normalized PlanetScope (PS) data for lentic and lotic environments.
Figure 6. Scatter plots of predicted and observed values for dissolved oxygen (DO), chemical oxygen demand (COD), nitrogen (N), and chlorophyll-a (Chla) modeled using normalized PlanetScope data for lentic and lotic environments. CCC = Lin′s concordance correlation coefficient; MAE = mean absolute error; RMSE = root-mean-square error; R2 = coefficient of determination; TSS = total suspended solids; Turb = Turbidity.
Table 1. Identification, geographic coordinates, and operating agencies of the water-quality monitoring stations used in this study.
| Station | City | Operator | Longitude * | Latitude * |
|---|---|---|---|---|
| BP036 | Brumadinho | IGAM | 591,481.65 | 7,766,154.53 |
| BP068 | Mário Campos | IGAM | 582,550.80 | 7,777,238.56 |
| BP070 | Betim | IGAM | 577,842.67 | 7,783,718.41 |
| BP072 | Betim | IGAM | 571,826.23 | 7,794,515.43 |
| BP077 | Papagaios | IGAM | 549,063.79 | 7,862,981.60 |
| BP078 | Curvelo | IGAM | 530,977.92 | 7,880,408.76 |
| BP082 | Esmeraldas | IGAM | 554,515.30 | 7,824,940.83 |
| BP083 | Papagaios | IGAM | 549,278.61 | 7,858,233.86 |
| BP087 | Curvelo | IGAM | 528,445.63 | 7,896,560.69 |
| BP093 | Brumadinho | IGAM | 587,971.75 | 7,770,652.15 |
| BP099 | Felixlândia | IGAM | 521,656.61 | 7,914,060.68 |
| BPE2 | Brumadinho | IGAM | 582,099.27 | 7,773,420.49 |
| BPE6 | Felixlândia | IGAM | 498,228.23 | 7,918,958.61 |
| BPE7 | Abaeté | IGAM | 475,069.83 | 7,906,716.13 |
| BPE8 | Três Marias | IGAM | 469,656.80 | 7,954,599.84 |
| SF011 | Biquinhas | IGAM | 446,188.74 | 7,945,567.26 |
| SF054 | Três Marias | IGAM | 473,291.93 | 7,988,911.71 |
| TM15 | Abaeté | Cemig | 478,888.56 | 7,911,539.85 |
| TM20 | Pompéu | Cemig | 487,036.46 | 7,910,337.95 |
| TM25 | Pompéu | Cemig | 486,013.70 | 7,917,367.25 |
| TM30 | Morada Nova de Minas | Cemig | 472,184.16 | 7,947,127.11 |
| TM35 | Morada Nova de Minas | Cemig | 461,488.37 | 7,960,155.51 |
| TM40 | Morada Nova de Minas | Cemig | 454,887.44 | 7,955,627.32 |
| TM45 | Três Marias | Cemig | 473,237.48 | 7,988,849.00 |
* Datum: SIRGAS 2000 Universal Transverse Mercator 23S. Cemig = Companhia Energética de Minas Gerais; IGAM = Instituto Mineiro de Gestão das Águas. Source: [35].
Table 2. Sentinel-2 spectral indices used as covariates in this study, with their respective equations and bibliographic sources.
| Index | Equation | References |
|---|---|---|
| Aweish | (B2 + [2.5 × B3]) − (1.5 × [B8 + B12]) − (0.25 × B12) | [40] |
| WRI | (B3 + B4)/(B8 + B12) | [41] |
| SCI | (B11 − B8)/(B11 + B8) | [42] |
| SAVI | ([B8 − B4]/[B8 + B4 + 0.5]) × 1.5 | [43] |
| FE_SI | B12/B11 | [44] |
| FE_OX | B11/B8 | [44] |
| FE2 | (B12/B8) + (B3/B4) | [45] |
| FE3 | (B12/B8) + (B3/B4) | [45] |
| NDTI | (B11 − B12)/(B11 + B12) | [46] |
| NDMI | (B8 − B11)/(B8 + B11) | [47] |
| NDMI2 | (B4 − B8a)/(B4 + B8a) | [48] |
| Iron | B4/B2 | [49] |
| Clay | B11/B12 | [50] |
| Toming | B3/B4 | [51] |
| NDVI | (B4 − B8)/(B4 + B8) | [52] |
| NDWI | (B3 − B8)/(B3 + B8) | [53] |
| IIA | (4 × B8)/(B3 + [4 × B8]) | [54] |
| IredEdge | (B3 + B4)/2 | [55] |
| GLI | (2 × B3 − B4 − B2)/(2 × B3 + B4 + B2) | [56] |
| ARVI | (B8 − [2 × B4 − B2])/(B8 + [2 × B4] + B2) | [57] |
| NPQI | (B2 − B3)/(B2 + B3) | [58] |
| GNDVI | (B5 − B2)/(B5 + B2) | [59] |
| NDVIVIS | (B2 − B3)/(B2 + B3) | [56] |
| AWEI | 4 × (B3 − B12) − (0.25 × B8a + 2.75 × B11) | [40] |
| ITBDN | (B3 − B2)/(B3 + B2) | [60] |
B corresponds to the Sentinel-2 spectral band used.
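As a minimal sketch of how such covariates might be derived for a single pixel, the snippet below evaluates three of the Table 2 equations (NDTI, NDWI, and GLI) exactly as they are written in the table. The function names, the zero-denominator handling, and the reflectance values are illustrative assumptions and are not taken from the study.

```python
import math

def normalized_difference(a, b):
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else math.nan

def ndti(b11, b12):
    # Table 2: NDTI = (B11 - B12) / (B11 + B12)
    return normalized_difference(b11, b12)

def ndwi(b3, b8):
    # Table 2: NDWI = (B3 - B8) / (B3 + B8)
    return normalized_difference(b3, b8)

def gli(b2, b3, b4):
    # Table 2: GLI = (2*B3 - B4 - B2) / (2*B3 + B4 + B2)
    denom = 2 * b3 + b4 + b2
    return (2 * b3 - b4 - b2) / denom if denom != 0 else math.nan

# Hypothetical surface reflectances for one water pixel (made-up values).
pixel = {"B2": 0.04, "B3": 0.06, "B4": 0.05, "B8": 0.02, "B11": 0.012, "B12": 0.008}

indices = {
    "NDTI": ndti(pixel["B11"], pixel["B12"]),
    "NDWI": ndwi(pixel["B3"], pixel["B8"]),
    "GLI": gli(pixel["B2"], pixel["B3"], pixel["B4"]),
}
print(indices)
```

The same helper covers every normalized-difference entry in the table; only the band pair changes.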
Table 3. PlanetScope spectral indices used as covariates in this study, with their respective equations and bibliographic sources.
| Index | Equation | References |
|---|---|---|
| SAVI | ((B4 − B3)/(B4 + B3 + 0.5)) × 1.5 | [43] |
| NDTI_VIS | (B3 − B2)/(B3 + B2) | [62] |
| NDMI2 | (B3 − B4)/(B3 + B4) | [47] |
| Iron | B3/B1 | [48] |
| Toming | B2/B3 | [51] |
| NDVI | (B3 − B4)/(B3 + B4) | [52] |
| NDWI | (B2 − B4)/(B2 + B4) | [53] |
| HA | (4 × B4)/(B2 + (4 × B4)) | [54] |
| IredEdge | (B2 + B3)/2 | [55] |
| GLI | (2 × B2 − B3 − B1)/(2 × B2 + B3 + B1) | [56] |
| ARVI | (B4 − (2 × B3 − B1))/(B4 + (2 × B3) + B1) | [57] |
| NPQI | (B1 − B2)/(B1 + B2) | [58] |
| GNDVI | (B4 − B1)/(B4 + B1) | [59] |
| NDVIVIS | (B1 − B2)/(B1 + B2) | [56] |
| ITBDN | (B2 − B1)/(B2 + B1) | [60] |
B corresponds to the PlanetScope spectral band used.
Table 4. Optimized hyperparameters for each developed model.
ParameterCubist (Committees/Neighbors)KKNN (Kmax/Distance/Kernel)RF (mtry)SVM-RBF(Sigma/C)
PlanetScope
Chla20/57/2/optimal100.02427/4
COD20/55/2/optimal618.5751/4
Fe1/55/2/optimal20.50783/2
N10/513/2/optimal20.04940/1
DO1/95/2/optimal1816.45009/4
TP10/95/2/optimal600.00942/4
TSS10/97/2/optimal20.03363/2
Turbidity10/513/2/optimal50.008999/2
PlanetScope (normalized)
Chla20/57/2/optimal100.02427/4
COD20/55/2/optimal618.5853/4
Fe1/55/2/optimal20.50783/2
N10/513/2/optimal20.04940/1
DO1/95/2/optimal1816.50009/4
TP10/95/2/optimal600.00942/4
TSS10/97/2/optimal20.03363/2
Turbidity10/513/2/optimal50.008999/2
Sentinel-2
Chla1/55/2/optimal20.02736/4
COD1/97/2/optimal60.03702/4
Fe20/55/2/optimal20.10574/4
N20/97/2/optimal180.05874/4
DO20/55/2/optimal50.05790/4
TP20/511/2/optimal110.11027/4
TSS10/95/2/optimal20.01212/2
Turbidity20/95/2/optimal150.00732/4
Chla = chlorophyll-a; COD = chemical oxygen demand; DO = dissolved oxygen; Fe = iron; N = nitrogen; P = phosphorus; TSS = total suspended solids.
Table 5. Descriptive statistics of water quality parameters for the two datasets used in the modeling.
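| Sensor | Statistics | Chla | COD | Fe | P | N | DO | TSS | Turb |
|---|---|---|---|---|---|---|---|---|---|
| MSI/Sentinel-2 | Maximum | 8.460 | 57.400 | 5.241 | 0.40 | 1.240 | 14.800 | 528.00 | 809.00 |
| | Minimum | 0.270 | 5.000 | 0.011 | 0.010 | 0.100 | 2.100 | 2.00 | 0.50 |
| | Range | 8.190 | 52.4 | 5.230 | 0.390 | 1.140 | 12.700 | 526.00 | 808.50 |
| | Mean | 1.932 | 17.661 | 0.317 | 0.072 | 0.406 | 7.481 | 31.33 | 34.08 |
| | Median | 1.340 | 17.000 | 0.174 | 0.050 | 0.380 | 7.700 | 12.00 | 11.20 |
| | Std. Deviation | 3.718 | 9.090 | 0.544 | 0.080 | 0.238 | 1.445 | 58.20 | 77.18 |
| PlanetScope | Maximum | 8.69 | 59.000 | 21.846 | 0.480 | 1.310 | 14.800 | 716.00 | 982.00 |
| | Minimum | 0.800 | 5.000 | 0.023 | 0.010 | 0.100 | 2.100 | 2.00 | 0.53 |
| | Range | 7.89 | 54.000 | 21.823 | 0.470 | 1.210 | 12.700 | 714.00 | 981.47 |
| | Mean | 2.202 | 20.777 | 0.677 | 0.098 | 0.431 | 7.481 | 78.83 | 87.93 |
| | Median | 1.600 | 20.000 | 0.253 | 0.060 | 0.380 | 7.500 | 29.00 | 30.45 |
| | Std. Deviation | 2.851 | 10.818 | 1.937 | 0.092 | 0.258 | 1.300 | 136.32 | 144.53 |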
Chla = chlorophyll-a; COD = chemical oxygen demand; DO = dissolved oxygen; Fe = iron; N = nitrogen; P = phosphorus; TSS = total suspended solids; Turb = Turbidity.
Table 6. Performance metrics (training/test) for Cubist, KKNN, RF, and SVM-RBF in predicting water quality parameters using Sentinel-2 data.
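| Parameter | Model | Training RMSE | Training MAE | Training R2 | Training CCC | Test RMSE | Test MAE | Test R2 | Test CCC | NULL RMSE | NULL MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Turb | Cubist | 35.369 | 14.468 | 0.787 | 0.823 | 42.738 | 15.210 | 0.739 | 0.813 | 74.451 | 37.619 |
| | KKNN | 36.486 | 14.856 | 0.769 | 0.795 | 41.701 | 15.016 | 0.735 | 0.799 | | |
| | RF | 33.953 | 14.322 | 0.785 | 0.824 | 39.176 | 14.625 | 0.750 | 0.824 | | |
| | SVM-RBF | 39.347 | 16.527 | 0.746 | 0.761 | 48.495 | 17.575 | 0.656 | 0.709 | | |
| TSS | Cubist | 36.185 | 17.066 | 0.648 | 0.703 | 40.720 | 17.572 | 0.560 | 0.701 | 57.012 | 29.762 |
| | KKNN | 33.785 | 16.411 | 0.658 | 0.715 | 37.482 | 16.527 | 0.596 | 0.727 | | |
| | RF | 34.250 | 16.800 | 0.643 | 0.709 | 37.878 | 16.881 | 0.585 | 0.719 | | |
| | SVM-RBF | 34.652 | 16.899 | 0.640 | 0.674 | 40.932 | 17.780 | 0.533 | 0.630 | | |
| Fe | Cubist | 0.422 | 0.208 | 0.369 | 0.490 | 0.474 | 0.213 | 0.278 | 0.449 | 0.519 | 0.269 |
| | KKNN | 0.438 | 0.215 | 0.338 | 0.466 | 0.482 | 0.219 | 0.242 | 0.418 | | |
| | RF | 0.453 | 0.221 | 0.299 | 0.417 | 0.496 | 0.225 | 0.202 | 0.371 | | |
| | SVM-RBF | 0.420 | 0.201 | 0.341 | 0.429 | 0.465 | 0.201 | 0.241 | 0.342 | | |
| P | Cubist | 0.067 | 0.042 | 0.308 | 0.474 | 0.071 | 0.043 | 0.230 | 0.420 | 0.077 | 0.055 |
| | KKNN | 0.068 | 0.043 | 0.300 | 0.471 | 0.072 | 0.045 | 0.210 | 0.408 | | |
| | RF | 0.065 | 0.042 | 0.325 | 0.478 | 0.069 | 0.043 | 0.249 | 0.427 | | |
| | SVM-RBF | 0.067 | 0.040 | 0.299 | 0.420 | 0.070 | 0.040 | 0.228 | 0.371 | | |
| DO | Cubist | 1.320 | 0.928 | 0.216 | 0.380 | 1.395 | 0.942 | 0.175 | 0.360 | 1.457 | 1.006 |
| | KKNN | 1.328 | 0.927 | 0.215 | 0.378 | 1.467 | 0.985 | 0.133 | 0.318 | | |
| | RF | 1.218 | 0.836 | 0.288 | 0.419 | 1.325 | 0.864 | 0.203 | 0.361 | | |
| | SVM-RBF | 1.267 | 0.857 | 0.232 | 0.377 | 1.382 | 0.899 | 0.152 | 0.316 | | |
| COD | Cubist | 8.438 | 6.634 | 0.192 | 0.351 | 8.641 | 6.746 | 0.139 | 0.313 | 8.978 | 7.471 |
| | KKNN | 8.869 | 7.067 | 0.146 | 0.295 | 9.124 | 7.200 | 0.093 | 0.252 | | |
| | RF | 7.969 | 6.340 | 0.247 | 0.381 | 8.026 | 6.366 | 0.212 | 0.369 | | |
| | SVM-RBF | 8.566 | 6.579 | 0.165 | 0.317 | 8.850 | 6.799 | 0.106 | 0.267 | | |
| N | Cubist | 0.226 | 0.180 | 0.172 | 0.322 | 0.234 | 0.184 | 0.119 | 0.288 | 0.237 | 0.191 |
| | KKNN | 0.232 | 0.185 | 0.134 | 0.273 | 0.241 | 0.191 | 0.080 | 0.227 | | |
| | RF | 0.213 | 0.171 | 0.220 | 0.352 | 0.221 | 0.176 | 0.159 | 0.313 | | |
| | SVM-RBF | 0.231 | 0.183 | 0.137 | 0.280 | 0.241 | 0.190 | 0.082 | 0.234 | | |
| Chla | Cubist | 3.406 | 1.520 | 0.106 | 0.156 | 4.231 | 1.548 | 0.047 | 0.069 | 3.031 | 1.364 |
| | KKNN | 2.989 | 1.390 | 0.149 | 0.183 | 3.524 | 1.397 | 0.075 | 0.113 | | |
| | RF | 3.026 | 1.468 | 0.126 | 0.157 | 3.501 | 1.457 | 0.044 | 0.077 | | |
| | SVM-RBF | 2.546 | 1.247 | 0.162 | 0.238 | 3.079 | 1.252 | 0.055 | 0.123 | | |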
CCC = Lin′s concordance correlation coefficient; Chla = chlorophyll-a; COD = chemical oxygen demand; DO = dissolved oxygen; Fe = iron; KKNN = kernel k-nearest neighbors; MAE = mean absolute error; N = nitrogen; NULL MAE = mean absolute error of the null model; NULL RMSE = root-mean-square error of the null model; P = phosphorus; RF = random forest; RMSE = root-mean-square error; R2 = coefficient of determination; SVM-RBF = support vector machine with radial basis function; TSS = total suspended solids; Turb = Turbidity. Note: Rows in bold and shaded represent the best-performing models for each parameter.
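The four metrics and the null-model baseline reported in these tables can be sketched in a few lines. This is only an illustration under stated assumptions: R2 is taken as the squared Pearson correlation, CCC follows Lin's formula with population (1/n) moments, and the null model is assumed to predict the mean of the reference observations. The sample values are made up.

```python
import math

def rmse(obs, pred):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mae(obs, pred):
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def _moments(x, y):
    """Means, population variances, and population covariance of x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, vx, vy, cxy

def r2(obs, pred):
    # Assumption: R2 is the squared Pearson correlation.
    _, _, vx, vy, cxy = _moments(obs, pred)
    return cxy ** 2 / (vx * vy)

def ccc(obs, pred):
    # Lin's concordance correlation coefficient.
    mx, my, vx, vy, cxy = _moments(obs, pred)
    return 2 * cxy / (vx + vy + (mx - my) ** 2)

def null_predictions(reference_obs, n):
    # Assumption: the null model always predicts the mean observation.
    mean = sum(reference_obs) / len(reference_obs)
    return [mean] * n

# Made-up observations and predictions with a constant bias of +1.
obs = [2.0, 4.0, 6.0, 8.0]
pred = [3.0, 5.0, 7.0, 9.0]
null = null_predictions(obs, len(obs))

print(rmse(obs, pred), mae(obs, pred), r2(obs, pred), ccc(obs, pred))
print(rmse(obs, null), mae(obs, null))
```

With this toy input R2 equals 1 while CCC stays below 1, because CCC penalizes the constant offset that correlation ignores. That difference is one reason to report both metrics.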
Table 7. Performance metrics (training/test) for Cubist, KKNN, RF, and SVM-RBF in predicting water quality parameters using PlanetScope data.
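| Parameter | Model | Training RMSE | Training MAE | Training R2 | Training CCC | Test RMSE | Test MAE | Test R2 | Test CCC | NULL RMSE | NULL MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Turb | Cubist | 65.317 | 33.266 | 0.825 | 0.878 | 66.664 | 32.608 | 0.796 | 0.878 | 140.817 | 96.021 |
| | KKNN | 67.917 | 34.881 | 0.803 | 0.861 | 76.514 | 36.639 | 0.726 | 0.835 | | |
| | RF | 65.770 | 33.961 | 0.823 | 0.874 | 67.549 | 33.323 | 0.790 | 0.873 | | |
| | SVM-RBF | 66.982 | 35.895 | 0.815 | 0.855 | 74.055 | 36.702 | 0.747 | 0.830 | | |
| TSS | Cubist | 76.373 | 38.527 | 0.734 | 0.804 | 88.274 | 40.492 | 0.648 | 0.768 | 136.236 | 86.864 |
| | KKNN | 74.968 | 38.483 | 0.724 | 0.798 | 87.313 | 41.017 | 0.627 | 0.758 | | |
| | RF | 74.350 | 38.863 | 0.737 | 0.806 | 83.945 | 40.388 | 0.659 | 0.774 | | |
| | SVM-RBF | 73.039 | 38.875 | 0.737 | 0.791 | 84.009 | 40.907 | 0.657 | 0.750 | | |
| Fe | Cubist | 1.126 | 0.465 | 0.549 | 0.605 | 1.282 | 0.456 | 0.591 | 0.672 | 1.866 | 0.792 |
| | KKNN | 1.224 | 0.446 | 0.557 | 0.605 | 1.367 | 0.441 | 0.558 | 0.671 | | |
| | RF | 1.408 | 0.615 | 0.369 | 0.421 | 1.642 | 0.629 | 0.315 | 0.408 | | |
| | SVM-RBF | 1.172 | 0.507 | 0.549 | 0.573 | 1.449 | 0.534 | 0.505 | 0.541 | | |
| P | Cubist | 0.077 | 0.053 | 0.342 | 0.513 | 0.078 | 0.053 | 0.303 | 0.505 | 0.091 | 0.071 |
| | KKNN | 0.078 | 0.053 | 0.324 | 0.498 | 0.079 | 0.053 | 0.283 | 0.487 | | |
| | RF | 0.075 | 0.053 | 0.357 | 0.494 | 0.075 | 0.052 | 0.327 | 0.494 | | |
| | SVM-RBF | 0.075 | 0.049 | 0.362 | 0.510 | 0.076 | 0.048 | 0.337 | 0.513 | | |
| DO | Cubist | 1.130 | 0.788 | 0.252 | 0.367 | 1.139 | 0.781 | 0.294 | 0.476 | 1.304 | 0.852 |
| | KKNN | 1.159 | 0.798 | 0.257 | 0.385 | 1.198 | 0.807 | 0.270 | 0.488 | | |
| | RF | 1.117 | 0.751 | 0.249 | 0.337 | 1.145 | 0.750 | 0.246 | 0.373 | | |
| | SVM-RBF | 1.139 | 0.789 | 0.231 | 0.347 | 1.176 | 0.805 | 0.241 | 0.422 | | |
| COD | Cubist | 10.324 | 8.028 | 0.162 | 0.303 | 10.601 | 8.163 | 0.127 | 0.296 | 10.838 | 8.487 |
| | KKNN | 10.281 | 7.989 | 0.183 | 0.344 | 10.689 | 8.191 | 0.135 | 0.320 | | |
| | RF | 9.378 | 7.403 | 0.263 | 0.400 | 9.681 | 7.564 | 0.222 | 0.389 | | |
| | SVM-RBF | 9.904 | 7.699 | 0.209 | 0.360 | 10.397 | 8.013 | 0.146 | 0.322 | | |
| N | Cubist | 0.226 | 0.174 | 0.265 | 0.417 | 0.236 | 0.179 | 0.202 | 0.386 | 0.257 | 0.202 |
| | KKNN | 0.236 | 0.181 | 0.223 | 0.391 | 0.248 | 0.189 | 0.157 | 0.350 | | |
| | RF | 0.221 | 0.172 | 0.279 | 0.409 | 0.227 | 0.174 | 0.233 | 0.391 | | |
| | SVM-RBF | 0.223 | 0.171 | 0.277 | 0.432 | 0.235 | 0.179 | 0.211 | 0.394 | | |
| Chla | Cubist | 2.705 | 1.448 | 0.145 | 0.242 | 3.010 | 1.425 | 0.079 | 0.179 | 2.650 | 1.485 |
| | KKNN | 2.758 | 1.429 | 0.167 | 0.267 | 2.911 | 1.377 | 0.124 | 0.257 | | |
| | RF | 2.392 | 1.360 | 0.159 | 0.270 | 2.711 | 1.361 | 0.071 | 0.183 | | |
| | SVM-RBF | 2.235 | 1.192 | 0.217 | 0.337 | 2.577 | 1.212 | 0.117 | 0.235 | | |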
CCC = Lin′s concordance correlation coefficient; Chla = chlorophyll-a; COD = chemical oxygen demand; DO = dissolved oxygen; Fe = iron; KKNN = kernel k-nearest neighbors; MAE = mean absolute error; N = nitrogen; NULL MAE = mean absolute error of the null model; NULL RMSE = root-mean-square error of the null model; P = phosphorus; RF = random forest; RMSE = root-mean-square error; R2 = coefficient of determination; SVM-RBF = support vector machine with radial basis function; TSS = total suspended solids; Turb = Turbidity. Note: Rows in bold and shaded represent the best-performing models for each parameter.
Table 8. Statistical performance metrics used to evaluate the Cubist, KKNN, RF, and SVM-RBF models for predicting water quality parameters using normalized PlanetScope data.
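| Parameter | Model | Training RMSE | Training MAE | Training R2 | Training CCC | Test RMSE | Test MAE | Test R2 | Test CCC | NULL RMSE | NULL MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Turb | Cubist | 55.939 | 29.051 | 0.852 | 0.927 | 56.389 | 29.982 | 0.848 | 0.918 | 140.817 | 96.021 |
| | KKNN | 66.395 | 35.054 | 0.812 | 0.870 | 72.223 | 35.917 | 0.761 | 0.856 | | |
| | RF | 66.144 | 34.668 | 0.821 | 0.873 | 68.845 | 34.304 | 0.784 | 0.869 | | |
| | SVM-RBF | 67.443 | 34.888 | 0.816 | 0.854 | 71.007 | 34.627 | 0.771 | 0.843 | | |
| TSS | Cubist | 78.835 | 39.065 | 0.722 | 0.793 | 93.638 | 38.546 | 0.614 | 0.740 | 136.236 | 86.864 |
| | KKNN | 76.613 | 39.054 | 0.713 | 0.794 | 90.293 | 41.962 | 0.608 | 0.745 | | |
| | RF | 69.697 | 36.685 | 0.750 | 0.850 | 70.189 | 37.233 | 0.747 | 0.848 | | |
| | SVM-RBF | 75.733 | 39.432 | 0.712 | 0.774 | 91.637 | 43.119 | 0.585 | 0.701 | | |
| Fe | Cubist | 1.197 | 0.489 | 0.522 | 0.577 | 1.345 | 0.487 | 0.550 | 0.653 | 1.866 | 0.792 |
| | KKNN | 1.211 | 0.478 | 0.528 | 0.589 | 1.084 | 0.424 | 0.656 | 0.764 | | |
| | RF | 1.375 | 0.588 | 0.409 | 0.463 | 1.608 | 0.609 | 0.369 | 0.476 | | |
| | SVM-RBF | 1.223 | 0.525 | 0.506 | 0.530 | 1.458 | 0.537 | 0.488 | 0.531 | | |
| P | Cubist | 0.089 | 0.053 | 0.309 | 0.469 | 0.090 | 0.058 | 0.264 | 0.456 | 0.091 | 0.071 |
| | KKNN | 0.078 | 0.066 | 0.307 | 0.472 | 0.087 | 0.055 | 0.258 | 0.457 | | |
| | RF | 0.076 | 0.051 | 0.335 | 0.477 | 0.078 | 0.054 | 0.293 | 0.468 | | |
| | SVM-RBF | 0.076 | 0.048 | 0.389 | 0.557 | 0.073 | 0.046 | 0.390 | 0.553 | | |
| DO | Cubist | 1.150 | 0.798 | 0.238 | 0.353 | 1.017 | 0.704 | 0.392 | 0.557 | 1.304 | 0.852 |
| | KKNN | 1.258 | 0.852 | 0.175 | 0.264 | 1.179 | 0.780 | 0.195 | 0.370 | | |
| | RF | 1.099 | 0.739 | 0.269 | 0.381 | 1.114 | 0.708 | 0.271 | 0.402 | | |
| | SVM-RBF | 1.209 | 0.815 | 0.156 | 0.265 | 1.259 | 0.829 | 0.127 | 0.257 | | |
| COD | Cubist | 10.085 | 7.906 | 0.195 | 0.345 | 9.599 | 7.470 | 0.214 | 0.381 | 10.838 | 8.487 |
| | KKNN | 10.354 | 8.172 | 0.184 | 0.347 | 9.392 | 7.344 | 0.247 | 0.422 | | |
| | RF | 9.252 | 7.289 | 0.280 | 0.414 | 9.041 | 7.140 | 0.302 | 0.445 | | |
| | SVM-RBF | 9.867 | 7.827 | 0.209 | 0.349 | 9.234 | 7.265 | 0.273 | 0.401 | | |
| N | Cubist | 0.226 | 0.175 | 0.258 | 0.404 | 0.228 | 0.174 | 0.228 | 0.390 | 0.257 | 0.202 |
| | KKNN | 0.240 | 0.187 | 0.193 | 0.351 | 0.233 | 0.180 | 0.201 | 0.373 | | |
| | RF | 0.220 | 0.171 | 0.285 | 0.425 | 0.221 | 0.172 | 0.272 | 0.414 | | |
| | SVM-RBF | 0.229 | 0.178 | 0.240 | 0.378 | 0.232 | 0.178 | 0.212 | 0.383 | | |
| Chla | Cubist | 2.690 | 1.461 | 0.129 | 0.211 | 1.923 | 1.221 | 0.024 | 0.150 | 2.650 | 1.485 |
| | KKNN | 2.621 | 1.472 | 0.133 | 0.221 | 1.704 | 1.162 | 0.070 | 0.251 | | |
| | RF | 2.392 | 1.373 | 0.150 | 0.255 | 1.604 | 1.169 | 0.094 | 0.271 | | |
| | SVM-RBF | 2.336 | 1.310 | 0.140 | 0.239 | 1.490 | 1.064 | 0.115 | 0.264 | | |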
CCC = Lin′s concordance correlation coefficient; Chla = chlorophyll-a; COD = chemical oxygen demand; DO = dissolved oxygen; Fe = iron; KKNN = kernel k-nearest neighbors; MAE = mean absolute error; N = nitrogen; NULL MAE = mean absolute error of the null model; NULL RMSE = root-mean-square error of the null model; P = phosphorus; RF = random forest; RMSE = root-mean-square error; R2 = coefficient of determination; SVM-RBF = support vector machine with radial basis function; TSS = total suspended solids; Turb = Turbidity. Note: Rows in bold and shaded represent the best-performing models for each parameter.
Table 9. Statistical performance metrics used to evaluate the best-performing models for water quality parameters using Sentinel-2, PlanetScope, and normalized PlanetScope data.
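| Parameter | Data | Models | RMSE | MAE | R2 | CCC | NULL RMSE | NULL MAE |
|---|---|---|---|---|---|---|---|---|
| Turb | S2 | RF | 39.176 | 14.626 | 0.75 | 0.82 | 74.451 | 37.619 |
| | PS | Cubist | 66.664 | 32.609 | 0.80 | 0.88 | 140.817 | 96.021 |
| | PS Norm | Cubist | 56.389 | 29.982 | 0.85 | 0.92 | 140.817 | 96.021 |
| TSS | S2 | KKNN | 37.482 | 16.528 | 0.60 | 0.73 | 57.012 | 29.762 |
| | PS | RF | 83.945 | 40.389 | 0.66 | 0.77 | 136.236 | 86.864 |
| | PS Norm | RF | 70.189 | 37.233 | 0.75 | 0.85 | 136.236 | 86.864 |
| Fe | S2 | Cubist | 0.474 | 0.213 | 0.28 | 0.45 | 0.519 | 0.269 |
| | PS | Cubist | 1.282 | 0.456 | 0.59 | 0.67 | 1.866 | 0.792 |
| | PS Norm | KKNN | 1.084 | 0.424 | 0.66 | 0.76 | 1.866 | 0.792 |
| P | S2 | SVM-RBF | 0.0690 | 0.0430 | 0.25 | 0.43 | 0.077 | 0.055 |
| | PS | SVM-RBF | 0.076 | 0.049 | 0.34 | 0.51 | 0.091 | 0.071 |
| | PS Norm | SVM-RBF | 0.073 | 0.046 | 0.39 | 0.55 | 0.091 | 0.071 |
| DO | S2 | RF | 1.325 | 0.864 | 0.20 | 0.36 | 1.457 | 1.006 |
| | PS | Cubist | 1.139 | 0.781 | 0.29 | 0.48 | 1.304 | 0.852 |
| | PS Norm | Cubist | 1.017 | 0.704 | 0.39 | 0.56 | 1.304 | 0.852 |
| COD | S2 | RF | 8.026 | 6.367 | 0.21 | 0.37 | 8.978 | 7.471 |
| | PS | RF | 9.681 | 7.565 | 0.22 | 0.39 | 10.838 | 8.487 |
| | PS Norm | RF | 9.041 | 7.140 | 0.30 | 0.45 | 10.838 | 8.487 |
| N | S2 | RF | 0.221 | 0.177 | 0.16 | 0.31 | 0.237 | 0.191 |
| | PS | RF | 0.227 | 0.174 | 0.233 | 0.391 | 0.257 | 0.202 |
| | PS Norm | RF | 0.221 | 0.172 | 0.27 | 0.41 | 0.257 | 0.202 |
| Chla | S2 | SVM-RBF | 3.079 | 1.253 | 0.05 | 0.12 | 3.031 | 1.364 |
| | PS | KKNN | 2.911 | 1.378 | 0.12 | 0.26 | 2.650 | 1.485 |
| | PS Norm | SVM-RBF | 1.490 | 1.064 | 0.11 | 0.26 | 2.650 | 1.485 |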
CCC = Lin′s concordance correlation coefficient; Chla = chlorophyll-a; COD = chemical oxygen demand; DO = dissolved oxygen; Fe = iron; KKNN = kernel k-nearest neighbors; MAE = mean absolute error; N = nitrogen; NULL MAE = mean absolute error of the null model; NULL RMSE = root-mean-square error of the null model; P = phosphorus; RF = random forest; RMSE = root-mean-square error; R2 = coefficient of determination; SVM-RBF = support vector machine with radial basis function; TSS = total suspended solids; Turb = Turbidity. Note: Rows in bold indicate the best-performing model for each parameter.
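One way to read the NULL RMSE column is as a baseline against which each model's error is compared. The sketch below computes the relative RMSE reduction, 1 − RMSE/NULL RMSE, for three rows copied from Table 9. The function name and the choice of this particular ratio are illustrative assumptions; the table itself reports only the raw values.

```python
def rmse_reduction(model_rmse, null_rmse):
    """Fraction of the null-model RMSE removed by the model (negative = worse than null)."""
    return 1.0 - model_rmse / null_rmse

# (model RMSE, NULL RMSE) pairs copied from Table 9.
table9 = {
    ("Turb", "PS Norm"): (56.389, 140.817),
    ("Chla", "PS Norm"): (1.490, 2.650),
    ("Chla", "S2"): (3.079, 3.031),
}

reductions = {key: rmse_reduction(*values) for key, values in table9.items()}

for (parameter, data), value in reductions.items():
    print(f"{parameter} ({data}): {value:+.1%}")
```

A positive value means the model's error is lower than that of the null model. For the Sentinel-2 chlorophyll-a row the value is slightly negative, since its RMSE (3.079) exceeds the null RMSE (3.031).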
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Dias, R.L.S.; Amorim, R.S.S.; da Silva, D.D.; Fernandes-Filho, E.I.; Veloso, G.V.; Macedo, R.H.F. Prediction of Water Quality Parameters in the Paraopeba River Basin Using Remote Sensing Products and Machine Learning. Sensors 2026, 26, 18. https://doi.org/10.3390/s26010018