Remote Sens. 2017, 9(3), 264

Retrieval of Chlorophyll-a and Total Suspended Solids Using Iterative Stepwise Elimination Partial Least Squares (ISE-PLS) Regression Based on Field Hyperspectral Measurements in Irrigation Ponds in Higashihiroshima, Japan
Graduate School of Engineering, Hiroshima University, 1-4-1 Kagamiyama, Higashihiroshima, Hiroshima 739-8527, Japan
Social Sciences Division, Japan International Research Center for Agricultural Sciences (JIRCAS), 1-1 Ohwashi, Tsukuba, Ibaraki 305-8686, Japan
Graduate School for International Development and Cooperation (IDEC), Hiroshima University, 1-5-1 Kagamiyama, Higashihiroshima, Hiroshima 739-8529, Japan
Author to whom correspondence should be addressed.
Academic Editors: Yunlin Zhang, Claudia Giardino, Linhai Li, Deepak R. Mishra and Prasad S. Thenkabail
Received: 16 December 2016 / Accepted: 9 March 2017 / Published: 13 March 2017


Concentrations of chlorophyll-a (Chl-a) and total suspended solids (TSS) are significant parameters used to assess water quality. The objective of this study is to establish a quantitative model for estimating the Chl-a and the TSS concentrations in irrigation ponds in Higashihiroshima, Japan, using field hyperspectral measurements and statistical analysis. Field experiments were conducted in six ponds and spectral readings for Chl-a and TSS were obtained from six field observations in 2014. For statistical approaches, we used two spectral indices, the ratio spectral index (RSI) and the normalized difference spectral index (NDSI), and a partial least squares (PLS) regression. The predictive abilities were compared using the coefficient of determination (R2), the root mean squared error of cross validation (RMSECV) and the residual predictive deviation (RPD). Overall, iterative stepwise elimination based on PLS (ISE–PLS), using the first derivative reflectance (FDR), showed the best predictive accuracy, for both Chl-a (R2 = 0.98, RMSECV = 6.15, RPD = 7.44) and TSS (R2 = 0.97, RMSECV = 1.91, RPD = 6.64). The important wavebands for estimating Chl-a (16.97% of all wavebands) and TSS (8.38% of all wavebands) were selected by ISE–PLS from all 501 wavebands over the 400–900 nm range. These findings suggest that ISE–PLS based on field hyperspectral measurements can be used to estimate water Chl-a and TSS concentrations in irrigation ponds.
Keywords: chlorophyll-a; hyperspectral; irrigation ponds; partial least squares regression; total suspended solids

1. Introduction

Agriculture is by far the greatest water consumer in the world, and consequently, a major cause of water pollution. The primary pollutants from agriculture are excess nutrients and pesticides [1]. In agricultural activity, non-point source pollution, such as irrigation water and surface runoff water containing fertilizer from farmland, contributes to excessive nutrient concentrations [2]. Meanwhile, excess nutrients that cause eutrophication, hypoxia and algal blooms in surface water bodies and coastal areas contribute to the primary global water quality problem [1]. Eutrophication has become a widespread matter of concern during the past 50 years, especially in coastal and inland waters [3].
The chlorophyll-a (Chl-a) concentration in water is the most widely applied parameter to assess the water quality status of lakes, particularly with respect to their trophic quality [4]. Since Chl-a is the primary photosynthetic pigment of all plant life [5], the concentration of Chl-a indicates phytoplankton biomass and eutrophication in lakes [6]. The concentration of total suspended solids (TSS) is another commonly used indicator for water quality assessment [7]. TSS consists of organic and inorganic materials suspended in the water [8]. Increased TSS decreases light transmission through the water [9] and therefore reduces light availability to phytoplankton, resulting in a decrease in phytoplankton primary production [10].
However, traditional water quality monitoring requires in situ measurements and sampling, then returning the samples to the laboratory to measure water quality indicators (e.g., Chl-a and TSS), which is costly and time consuming [11]. Remote sensing makes it possible to monitor the state of the globe routinely, and is cost effective and useful, with the benefits of its passive nature and wide spatial coverage [12]. Earlier studies have demonstrated several algorithms developed for satellite sensors to estimate ocean and coastal water quality parameters, such as the Chl-a algorithm OC3, created for the moderate resolution imaging spectroradiometer (MODIS) data, and OC4, created for sea-viewing wide field-of-view sensor (SeaWiFS) data [13]. The geostationary ocean color imager (GOCI) also shows good performance, using the linear combination index (LCI) method to monitor Chl-a [14]. Further, a three-band semi-analytical reflectance model, originally developed by Gitelson et al. (2003) [15], and a normalized difference chlorophyll index (NDCI) [16], both performed well for assessing Chl-a in turbid productive water [16,17,18]. For estimating TSS concentrations, an algorithm with a single wavelength created for MODIS and medium spectral resolution imaging spectrometer (MERIS) data has been proved to be satisfactory [19].
Unlike ocean and coastal water, inland water usually has a smaller surface area and more complicated spectral features, especially irrigation ponds, which are often impacted by human use such as agricultural activities. Consequently, inland water quality monitoring places higher requirements on both the temporal and spatial resolution of satellite sensor data; hence currently used satellite sensors often have limited practical applicability in assessing relatively small inland water bodies. Since Landsat and other multispectral sensors have a limited number of wavebands, finding more informative wavebands to improve the performance of water quality estimation is necessary. With respect to in situ measurements, a two-band ratio approach, for example the ratio spectral index (RSI), has performed well for estimating Chl-a concentrations in inland waters [18,20,21], especially using the ratio of near-infrared (NIR) to red wavebands, such as the reflectance ratio of 705 nm to 670 nm used by Han et al. (1997) [22]. Normalized difference spectral indices (NDSI) are another type of spectral index frequently used to select the optimum bands for spectral analysis. Similar studies have mainly focused on retrieving vegetation parameters [23,24,25], calculating the optimum bands from combinations of all available bands in the hyperspectral range. Retrieving water quality parameters requires a similarly broad approach.
Partial least squares (PLS) regression, which was developed by Wold (1966) [26], is widely used to extract valuable information for spectroscopic analysis. PLS regression uses all available wavebands without multi-collinearity issues. The eigenvectors of the explanatory variables are manipulated such that the corresponding scores (latent variables) not only explain the variance of the explanatory variables (wavebands) themselves, but also are highly correlated with the response variables (Chl-a and TSS) [27]. However, PLS is considered limited because it treats each wavelength as independent, which incorporates noise from non-informative wavelengths [28]. There is increasing evidence that wavelength selection can affect the performance of PLS analysis [29], since wavelength selection for PLS models is performed to eliminate uninformative variables and choose the variables that contribute the most to the predictive ability of the calibration model [30]. Iterative stepwise elimination PLS (ISE–PLS), developed by Boggia et al. (1997) [31], combines PLS regression and the most useful information from hundreds of wavebands into the first several factors [32]. This method was developed to eliminate useless wavebands in PLS analysis.
The objective of this study is to develop models to estimate Chl-a and TSS using in situ spectral reflectance data and statistical approaches. We used several regression analyses including (a) a simple linear regression at each waveband of reflectance and the first derivative reflectance (FDR) to explore informative wavelength regions for Chl-a and TSS estimation; (b) all available two-band combination spectral indices (RSI and NDSI); and (c) a PLS regression using original reflectance and FDR datasets. In the PLS analyses, the predictive ability of ISE–PLS was compared with that of a standard full spectrum PLS (FS–PLS) and the spectral indices (RSI and NDSI).

2. Study Area

The study area is located in Higashihiroshima, Japan, as shown in Figure 1. Higashihiroshima is a core city in the central region of Hiroshima Prefecture, with a total area of 635.32 km2 covering nearly 7.5% of the prefecture’s total area. Paddy fields, totalling 36.8 km2, cover 14.9% of the Hiroshima Prefecture. Consequently, Higashihiroshima has the largest rice production of the 86 cities, towns and villages in Hiroshima Prefecture [33]. The city has an estimated population of 183,834 people, and its population density was 289.36 people per km2 in 2011. The number of irrigation ponds in Hiroshima Prefecture is approximately 21,000, the second largest number in Japan; a quarter of the total irrigation ponds in Japan are in Higashihiroshima, the average beneficiary area is 3.36 ha, and the average number of beneficiary farmhouses is approximately 9 [34]. The monthly mean temperature ranges from a minimum of 2.2 °C in January to a maximum of 25.8 °C in August, and the monthly precipitation ranges from a minimum of 43.3 mm in December to a maximum of 232.1 mm in July. To assess changes in water quality status and environments, six ponds, including both eutrophic and non-eutrophic ponds, were selected for this study. Descriptions of the six ponds are listed in Table 1.

3. Materials and Methods

3.1. Measurement of Water Surface Reflectance

Measurements of water surface reflectance were performed using an ASD FieldSpec HandHeld-2 spectrometer (ASD Inc., Boulder, CO, USA) with a spectral range of 325–1075 nm and a probe field angle of 10°. Spectral readings were taken approximately 1 m above the water surface between 10:30 and 13:00 on days with clear skies. Surveys were conducted six times between 3 January 2014 and 28 June 2014. From these surveys, a total of 36 datasets were obtained.
With respect to the spectral data, the ranges 325–399 nm and 901–1075 nm from each spectrum were identified as noise and removed. Subsequently, spectral data were smoothed using a moving and normalized Gaussian filter with a sigma (standard deviation) of 2.5. The FDR was also computed and compared with the original reflectance.
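The preprocessing described above can be sketched as follows. This is a minimal illustration, not the authors' Matlab code: the kernel truncation at three sigma, the renormalization at the spectrum edges, the central-difference derivative, and the synthetic linear spectrum are all assumptions made for the example. Only the 400–900 nm range at 1 nm steps and sigma = 2.5 come from the text.

```python
import math

def gaussian_kernel(sigma, radius=None):
    """Return a normalized 1-D Gaussian kernel (weights sum to 1)."""
    if radius is None:
        radius = int(math.ceil(3 * sigma))  # assumption: truncate at 3 sigma
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(spectrum, sigma=2.5):
    """Moving Gaussian filter; weights are renormalized near the edges."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    out = []
    for i in range(len(spectrum)):
        acc, wsum = 0.0, 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < len(spectrum):
                acc += w * spectrum[j]
                wsum += w
        out.append(acc / wsum)
    return out

def first_derivative(spectrum, step_nm=1.0):
    """Central differences inside, one-sided differences at both ends."""
    n = len(spectrum)
    d = [0.0] * n
    d[0] = (spectrum[1] - spectrum[0]) / step_nm
    d[-1] = (spectrum[-1] - spectrum[-2]) / step_nm
    for i in range(1, n - 1):
        d[i] = (spectrum[i + 1] - spectrum[i - 1]) / (2 * step_nm)
    return d

# Synthetic spectrum (hypothetical values) on the 400-900 nm grid, 501 bands.
wavelengths = list(range(400, 901))
reflectance = [0.02 + 0.0001 * (w - 400) for w in wavelengths]
smoothed = smooth(reflectance)
fdr = first_derivative(smoothed)
```

For a linear spectrum such as this one, smoothing leaves the interior unchanged and the FDR is constant there, which is a convenient sanity check before applying the functions to measured spectra.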

3.2. Water Sampling and Chemical Analysis

The water sampling sites were consistent with the spectral reflectance measurements. Immediately after measurement of spectral reflectance, water samples were collected into two 1 L containers. The samples were maintained at constant temperature and protected from light until they were received at the laboratory for analysis.
Chl-a and TSS concentrations were determined at the laboratory of the Graduate School for International Development and Cooperation (IDEC), Hiroshima University, Japan. Chl-a was extracted using 90% acetone, the absorption of Chl-a was measured by a spectrophotometer (UVmini-1240, SHIMADZU Co., Kyoto, Japan) and pigment concentration was calculated using the equations from UNESCO. To measure the TSS, the water sample was filtered using 47 mm diameter GF/F filters. The filters were weighed before and after drying with an oven drier (SANYO Electric Co., Moriguchi, Osaka, Japan) at 105 °C for two hours. The TSS contents were quantified by the difference in the weight of the filter paper before and after filtration.

3.3. Ratio Spectral Index and Normalized Difference Spectral Indices

Spectral indices were computed for all waveband combinations to select the optimum two-band combination. The aim of spectral indices is to construct a mathematical combination of spectral wavebands to enhance information content with respect to the parameter under study [35]. Moreover, normalization in the NDSI is effective at cancelling atmospheric disturbance or other sources of error, while enhancing and standardizing the spectral response to the observed targets [23].
For this study, two of the most commonly used spectral indices (RSI and NDSI) were calculated using the reflectance dataset. The forms to express them are as follows:
$$\mathrm{RSI}(i, j) = \frac{R_i}{R_j}$$
$$\mathrm{NDSI}(i, j) = \frac{R_i - R_j}{R_i + R_j}$$
where Ri and Rj are the intensity values for bands i and j, respectively. Using both indices, the optimum wavebands from all combinations of two separate wavelengths were obtained.
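The exhaustive two-band search can be sketched as below. This is an illustrative sketch only: the function names, the use of the squared Pearson correlation as the selection criterion, and the tiny four-band dataset are assumptions for the example, and the code assumes strictly positive reflectance so that the denominators are never zero.

```python
def r_squared(x, y):
    """R^2 of a simple linear regression of y on x (squared Pearson r)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)

def rsi(ri, rj):
    """Ratio spectral index for bands i and j."""
    return ri / rj

def ndsi(ri, rj):
    """Normalized difference spectral index for bands i and j."""
    return (ri - rj) / (ri + rj)

def best_band_pair(spectra, target, index_fn):
    """Try every ordered pair (i, j), i != j, and keep the highest R^2.

    spectra: list of samples, each a list of band reflectances.
    Returns (r2, i, j) with band positions as list indices.
    """
    n_bands = len(spectra[0])
    best = (-1.0, None, None)
    for i in range(n_bands):
        for j in range(n_bands):
            if i == j:
                continue
            values = [index_fn(s[i], s[j]) for s in spectra]
            r2 = r_squared(values, target)
            if r2 > best[0]:
                best = (r2, i, j)
    return best

# Hypothetical toy data: 5 samples x 4 bands; the target equals band2 / band1.
spectra = [
    [0.05, 0.10, 0.10, 0.20],
    [0.07, 0.12, 0.24, 0.20],
    [0.04, 0.11, 0.33, 0.21],
    [0.06, 0.13, 0.52, 0.19],
    [0.05, 0.09, 0.45, 0.20],
]
target = [1.0, 2.0, 3.0, 4.0, 5.0]
best_rsi = best_band_pair(spectra, target, rsi)
best_ndsi = best_band_pair(spectra, target, ndsi)
```

With 501 wavebands the same loop evaluates 501 × 500 ordered pairs per index, which is still cheap for 36 samples.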

3.4. Full Spectrum Partial Least Squares Regression

We performed PLS regression to estimate Chl-a and TSS concentrations using the reflectance and FDR datasets (n = 36). The standard FS–PLS regression equation is as follows:
$$y = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_i x_i + \varepsilon$$
where the response variable y is a vector of the water quality parameters (Chl-a and TSS), the predictor variables x1 to xi are surface reflectance or FDR values for spectral bands 1 to i (400, 401, …, 900 nm), respectively, β1 to βi are the estimated weighted regression coefficients, and ε is the error vector. The latent variables were introduced to simplify the relationship between response variables and predictor variables. To determine the optimal number of latent variables (NLV) while avoiding overfitting of the model, leave-one-out (LOO) cross validation was performed, and the NLV giving the minimum root mean squared error of cross validation (RMSECV) was chosen. The RMSECV is calculated as follows:
$$\mathrm{RMSECV} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - y_p)^2}{n}}$$
where yi and yp represent the measured and predicted water quality parameters (Chl-a and TSS) for sample i, and n is the number of samples in the dataset (n = 36).
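The leave-one-out procedure behind the RMSECV can be sketched generically. In the sketch below, a one-variable least-squares line stands in for the PLS model purely to keep the example short; that substitution, the function names, and the three-point dataset are assumptions for illustration.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares line; a stand-in for the PLS calibration step."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx

def predict_line(model, x_new):
    slope, intercept = model
    return slope * x_new + intercept

def loo_rmsecv(x, y, fit, predict):
    """Leave each sample out once, refit, predict it, and pool the errors."""
    n = len(y)
    squared_error = 0.0
    for i in range(n):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(x_train, y_train)
        y_p = predict(model, x[i])
        squared_error += (y[i] - y_p) ** 2
    return math.sqrt(squared_error / n)

# Hypothetical three-sample example.
x = [0.0, 1.0, 2.0]
y = [0.0, 1.0, 3.0]
rmsecv_value = loo_rmsecv(x, y, fit_line, predict_line)
```

Because each prediction comes from a model that never saw the held-out sample, the RMSECV is typically larger than the calibration error, which is why it is useful for choosing the NLV.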

3.5. Iterative Stepwise Elimination Partial Least Squares Regression

The ISE–PLS is a model-wise technique [31], which is based on the wavelengths selection function of the ISE method. To improve the performance of the PLS model, the optimum wavelengths with good predictive ability are selected for model calibration. The wavelengths elimination process depends on the importance of the predictors (zi), described as follows:
$$z_i = \frac{|\beta_i| \, s_i}{\sum_{i=1}^{I} |\beta_i| \, s_i}$$
where βi is the regression coefficient and si is the standard deviation of the predictor, both corresponding to the predictor variable of waveband i, and I is the number of predictor variables in the current model.
Initially, all available wavebands (501 bands, 400–900 nm) are used to develop the PLS regression model. The variables are then ranked from most to least important according to zi; in other words, zi represents the weight of each variable. The least important variable is eliminated and the PLS model is recalibrated with the remaining predictor variables [36]. The model building procedure is repeated, and in each cycle the predictor variable with the minimum importance (i.e., the least informative wavelength) is eliminated, until the final variable is eliminated. To determine the optimum number of wavelengths to include in the final model, LOO cross validation is conducted after each calibration. The final model, with the maximum predictive ability, is the one with the minimum value of RMSECV [37].
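The elimination loop described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the authors' implementation: the PLS1 fit uses the NIPALS algorithm, the number of latent variables is held fixed (capped by the number of remaining bands) rather than re-optimized at every step, the sample standard deviation is used for si, and the 8 × 4 dataset is hypothetical.

```python
import math

def pls1_fit(X, y, n_comp):
    """Fit single-response PLS by NIPALS; returns means and components."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    f = [v - y_mean for v in y]
    comps = []
    for _ in range(min(n_comp, p, n - 1)):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # residual y is no longer correlated with X
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        if tt < 1e-12:
            break
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps

def pls1_predict(model, x):
    x_mean, y_mean, comps = model
    e = [a - m for a, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in comps:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat

def coefficients(model, p):
    """Regression coefficients, read off by probing one unit step per band."""
    x_mean, y_mean, _ = model
    beta = []
    for j in range(p):
        x = list(x_mean)
        x[j] += 1.0
        beta.append(pls1_predict(model, x) - y_mean)
    return beta

def rmsecv(X, y, n_comp):
    sq = 0.0
    for i in range(len(y)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_comp)
        sq += (y[i] - pls1_predict(model, X[i])) ** 2
    return math.sqrt(sq / len(y))

def importance(beta, X):
    """z_i = |beta_i| s_i / sum(|beta_i| s_i), with s_i the sample SD."""
    n = len(X)
    raw = []
    for j, b in enumerate(beta):
        col = [row[j] for row in X]
        m = sum(col) / n
        sd = math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
        raw.append(abs(b) * sd)
    total = sum(raw)
    if total == 0:
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]

def ise_pls(X, y, n_comp):
    """Drop the least important band each cycle; keep the lowest-RMSECV set."""
    active = list(range(len(X[0])))
    best_err, best_set = float("inf"), None
    while active:
        Xa = [[row[j] for j in active] for row in X]
        err = rmsecv(Xa, y, n_comp)
        if err < best_err:
            best_err, best_set = err, list(active)
        if len(active) == 1:
            break
        z = importance(coefficients(pls1_fit(Xa, y, n_comp), len(active)), Xa)
        active.pop(z.index(min(z)))
    return best_set, best_err

# Hypothetical data: 8 samples x 4 bands; y depends on bands 0 and 2 only.
X = [[0.10, 0.50, 0.30, 0.21], [0.20, 0.40, 0.10, 0.19],
     [0.30, 0.55, 0.40, 0.22], [0.40, 0.45, 0.20, 0.18],
     [0.50, 0.52, 0.50, 0.20], [0.60, 0.48, 0.30, 0.23],
     [0.70, 0.51, 0.60, 0.17], [0.80, 0.47, 0.20, 0.21]]
y = [2 * row[0] - row[2] for row in X]
selected, error = ise_pls(X, y, n_comp=2)
```

Since the full-spectrum model is the first candidate evaluated, the returned RMSECV can never exceed that of the full-spectrum model under the same settings. A production version would also re-select the NLV by cross validation in each cycle.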

3.6. Evaluation of Predictive Ability

The coefficient of determination (R2) and RMSECV were selected as indices to evaluate the accuracy of the FS–PLS and ISE–PLS calibration models using LOO cross validation. High R2 and low RMSECV values indicate the best model for predicting Chl-a and TSS concentrations. In addition, the residual predictive deviation (RPD) was used to evaluate the predictive ability of the models; it is defined as the ratio of the standard deviation (SD) of the reference data to the RMSECV [38]. To judge the performance of the calibration models, the target RPD was at least 3 for agricultural applications; RPD values between 2 and 3 indicate a model with good prediction ability, 1.5 < RPD < 2 indicates an intermediate model needing some improvement, and RPD < 1.5 indicates that the model has poor prediction ability [39].
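The RPD calculation and the interpretation bands above can be written as a short helper. This is a sketch under assumptions: the sample (n − 1) standard deviation is used, the boundary values 1.5, 2 and 3 are assigned to the higher class, and the function names and class labels are invented for the example.

```python
import statistics

def rpd(reference_values, rmsecv):
    """Residual predictive deviation: SD of reference data over RMSECV."""
    return statistics.stdev(reference_values) / rmsecv

def classify_rpd(value):
    """Map an RPD value to the interpretation bands quoted in the text."""
    if value >= 3.0:
        return "meets target (RPD >= 3)"
    if value >= 2.0:
        return "good"
    if value >= 1.5:
        return "intermediate"
    return "poor"

# Hypothetical reference concentrations and cross-validation error.
example_rpd = rpd([1.0, 2.0, 3.0, 4.0, 5.0], rmsecv=0.5)
example_class = classify_rpd(example_rpd)
```

Because the RPD scales the error by the spread of the reference data, it allows models for variables with different units, such as Chl-a in μg/L and TSS in mg/L, to be compared on the same footing.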
All data handling and linear regression analyses were performed using Matlab software ver. 8.6 (MathWorks, Sherborn, MA, USA).

4. Results

4.1. Chl-a and TSS Concentrations in Irrigation Ponds

Descriptive statistics are shown in Table 2, including the sampling date, the number of samples, the minimum (Min), the maximum (Max), the mean, the standard deviation (SD) and the coefficient of variation (CV). In total, 36 samples were collected from six irrigation ponds in six sets of field measurements (3 January, 19 January, 24 March, 9 April, 24 May, and 28 June in 2014). Field samples (n = 36) provided a wide range of both Chl-a (SD = 46.1 μg/L, CV = 2.0) and TSS (SD = 12.8 mg/L, CV = 1.65). In the datasets, Chl-a ranged from 0 to 169.5 μg/L, and TSS ranged from 0.1 to 53 mg/L, which indicates that this study covers various water quality conditions from different ponds.

4.2. Comparison of Simple Linear Regression Models

In this study, several simple linear regression models were constructed, and their accuracy was compared with that of the PLS method. As shown in Table 3, distinct bands were selected as the optimal bands with respect to accuracy for all models. In the models that used the Gaussian-smoothed water surface reflectance and FDR, the 730 nm and 705 nm wavebands, respectively, were selected to estimate Chl-a concentration (R2 = 0.14 and 0.54), based on the linear correlation coefficient shown in Figure 2; 722 nm and 704 nm were selected to estimate TSS (R2 = 0.05 and 0.46). Figure 2 shows the correlation coefficient (r) between reflectance/FDR and Chl-a/TSS at each waveband. FDR clearly improved the correlation with Chl-a and TSS; moreover, spectral reflectance and absorption features were also enhanced (Figure 2b). An NIR/red algorithm developed by Han et al. (1997) [22] was introduced for comparison with the wavebands and accuracy of the RSI. The NIR/red model showed a higher R2 and lower RMSE than the single waveband models. However, based on the regression between the reflectance of each waveband and Chl-a and TSS, the RSI model selected the R719/R662 ratio as the best band combination, which enhanced the performance of the ratio model and gave the highest R2 value of 0.72 for Chl-a. The R717/R630 ratio was the best band combination for TSS, with an R2 of 0.52 (Figure 3a,b). A three-band semi-analytical algorithm for estimating Chl-a concentration was applied, as a previous study suggested [40], and the optimal wavebands of the model were tuned according to the optical properties of the water bodies. Bands 660, 703, and 740 nm were finally selected for the three-band model, with an R2 of 0.71 and an RMSE of 29.32.
The NDCI, another algorithm introduced in a previous study, was evaluated using the remote sensing reflectance (Rrs) at an absorption peak of 665 nm (Rrs665), which is closely related to absorption by Chl-a pigments, and at a reflectance peak of 708 nm (Rrs708), which is sensitive to variations in Chl-a concentration in water; it gave an R2 of 0.60 and an RMSE of 28.82. For the NDSI model, bands 719 and 663 nm were the best combination for estimating Chl-a (R2 = 0.64), and bands 704 and 698 nm were the best combination for TSS (R2 = 0.55) (Figure 3c,d). The results showed the lowest RMSECV in the RSI model for Chl-a (24.14) and in the NDSI model for TSS (8.48). Among the models, the RSI or the NDSI showed higher R2 values and lower RMSECV values than those of the two types of single-band models in the estimation of both Chl-a and TSS.
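For reference, the two literature indices compared in this section can be sketched as follows. The text does not spell out their formulas, so the forms below are the ones commonly given in the literature for the three-band model, [1/R(λ1) − 1/R(λ2)] × R(λ3), and for the NDCI, (Rrs708 − Rrs665)/(Rrs708 + Rrs665); treat them as assumptions here. The function names and reflectance values are hypothetical, and either index still has to be regressed against measured Chl-a to obtain concentrations.

```python
def three_band_index(r1, r2, r3):
    """Three-band index: (1/R(l1) - 1/R(l2)) * R(l3).

    In this study the tuned positions were l1 = 660, l2 = 703, l3 = 740 nm.
    """
    return (1.0 / r1 - 1.0 / r2) * r3

def ndci(rrs708, rrs665):
    """Normalized difference chlorophyll index from Rrs at 708 and 665 nm."""
    return (rrs708 - rrs665) / (rrs708 + rrs665)

# Hypothetical reflectance values for a single spectrum.
tb_value = three_band_index(r1=0.04, r2=0.05, r3=0.02)
ndci_value = ndci(rrs708=0.06, rrs665=0.04)
```

The NDCI has the same algebraic form as the NDSI with the band positions fixed in advance, which is why the exhaustive NDSI search can match or exceed it on a given dataset.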

4.3. FS–PLS and ISE–PLS Models

Calibration and cross validation results between reflectance/FDR spectra and Chl-a/TSS using FS–PLS and ISE–PLS are shown in Table 4. The results showed that the optimum NLV ranged between 4 and 8 in FS–PLS and between 5 and 11 in ISE–PLS, which was determined by the LOO cross validation based on the lowest RMSECV. The RPD ranged between 1.22 and 1.32 (low accuracy) in FS–PLS and between 1.45 and 7.44 (excellent accuracy) in ISE–PLS. In particular, the selected number of wavebands and the percentage to full spectrum (that is, selected wavebands number/all (n = 501) × 100%) were calculated to evaluate the informative wavebands for ISE–PLS. Results showed the selected wavebands number ranged between 9 and 85, and the percent ratio ranged between 1.80 and 16.97. Overall, for Chl-a, ISE–PLS using FDR showed the highest R2, highest RPD, and lowest RMSECV (R2 = 0.98, RMSECV = 6.15, RPD = 7.44); NLV = 11, and 85 wavebands were selected. Similarly, with respect to TSS, ISE–PLS using FDR showed the highest R2, highest RPD and lowest RMSECV (R2 = 0.97, RMSECV = 1.91, RPD = 6.64); NLV = 11, and 42 wavebands were selected.
The relations between observed and predicted Chl-a and TSS are shown in Figure 4. The data in this figure were used to evaluate goodness of fit in the FS–PLS and ISE–PLS models. Comparisons between the FS–PLS and ISE–PLS models are presented in combination with the R2 and RMSE from the cross validation listed in Table 4. For Chl-a, the ISE–PLS using FDR showed a higher R2 and lower RMSECV. The scatter distribution also showed a better linear relation, which can be judged by the red dots clustered along the 1:1 line in Figure 4b. Similarly, with respect to TSS, the ISE–PLS model using FDR showed better results than the others (Figure 4d, red dots). However, both red and green dots clustered vertically, particularly in Figure 4a,c, showing a large variation in the predicted values and nearly no variation in the observed values, indicating that many observed Chl-a and TSS samples had low concentrations. This vertical clustering also indicates that FS–PLS and ISE–PLS using reflectance had lower predictive abilities than those using FDR.
The selected wavebands in ISE–PLS using the reflectance and FDR datasets are shown in Figure 5. In the reflectance dataset, the selected wavebands were primarily in the red wavelengths (650–680 nm) for Chl-a. For TSS, the selected wavebands were in green wavelengths (560 nm), red wavelengths (620–630 nm) and red-edge wavelengths (720 nm). In the FDR dataset, a cluster of wavebands was selected in the red region (670–680 nm, 690–710 nm) for Chl-a, and wavebands were also selected from other regions: blue (around 410 nm), green (around 490 nm, 510 nm), red (around 603 nm, 615 nm), and the NIR region between 820 nm and 900 nm. Similarly, more wavebands were selected for TSS using FDR than using reflectance, especially in the red (around 620 nm, 680 nm, and 700 nm) and NIR (around 730 nm) regions.

5. Discussion

5.1. Evaluation of the Predictive Abilities of Simple Linear Regression Models

In the present study, models established using single wavebands and two-waveband combinations were compared with PLS. For single waveband models, FDR showed a better R2 than smoothed reflectance for both Chl-a and TSS, indicating that accuracy can be improved by enhancing the absorption and reflectance features of the smoothed reflectance. However, all single waveband models showed poor accuracy for estimating both Chl-a and TSS concentrations. According to previous research, a single band within 670–750 nm is better at determining TSS concentrations [19], especially in turbid water. In this research, however, single bands showed no predictive ability, and simple linear regression using two-waveband combinations showed poor accuracy for TSS, which may indicate that TSS is difficult to detect using single-band or two-band combinations in relatively clear water; as shown in our results, most observed TSS values were low. The three-band model was successfully used for Chl-a retrieval in turbid water bodies [18,40]. In this research, the optimal spectral bands selected by iterative band tuning are in accord with previous research [17]; however, although the model shows a considerable R2, the relatively high RMSE may indicate low accuracy, which may be attributed to different compositions of optically active constituents (Chl-a, tripton, CDOM) [40]. The NDCI is a special case of the NDSI: the two NDCI bands are determined by the reflectance peak and the spectral absorption peak, and normalizing the reflectance of the two bands can eliminate uncertainties in the estimation of Rrs [16]. By comparison, the NDSI (R2 = 0.64, RMSE = 27.19) improved slightly on the NDCI (R2 = 0.60, RMSE = 28.82), which may indicate that the combination of wavebands at 719 and 663 nm in the NDSI better reflects the Chl-a variations in this research area.
Among all tested combinations of the RSI and the NDSI, the best R2 values were obtained using the NIR waveband (719 nm) and the red region (662 nm for the RSI, 663 nm for the NDSI) to estimate Chl-a concentrations, which agrees with the findings of other research. In most available research on the measurement of chlorophyll content in water, the absorption trough is located near 670 nm, caused by absorption of Chl-a [22,41], and the reflectance peak is located near 710 nm, caused by the fluorescence of Chl-a [42,43,44]. On account of these characteristics, two-waveband models, particularly the NIR/red ratio, have been widely used for Chl-a retrieval, and a variety of algorithms have been based mainly on the ratio of the reflectance peak (about 710 nm) to the reflectance trough (about 670 nm) [22,45]. Similarly, in the present study two wavebands, from the NIR and red regions respectively, were selected by the NDSI, which is consistent with the reflectance characteristics of the water body.

5.2. Evaluation of the Predictive Abilities of FS–PLS and ISE–PLS

As we expected, the PLS models exhibited better predictive abilities than models that use single wavebands or the index-based (RSI and NDSI) approaches, which shows that the PLS method is potentially useful for retrieving inland water quality parameters [46,47]. In our PLS analyses, ISE–PLS models using the FDR dataset showed higher R2 and lower RMSECV values than those using the reflectance dataset. These results are consistent with the research of Han and Rundquist (1997) [22], who noted that FDR was better correlated with chlorophyll concentration than raw reflectance, and that random noise and the effects of suspended matter could be reduced by FDR [46]. After eliminating outliers and useless predictors, ISE–PLS calibrated models with greater potential than FS–PLS for both Chl-a and TSS, using the wavelengths relevant to water quality. As a consequence, predictive ability was further enhanced, which is reflected in the evaluation indices. PLS-based waveband selection greatly improved predictions for both Chl-a (R2 from 0.43 to 0.98, RMSECV from 35.15 to 6.15, RPD from 1.32 to 7.44) and TSS (R2 from 0.40 to 0.97, RMSECV from 9.98 to 1.91, RPD from 1.27 to 6.64). The improved performance of PLS models combined with wavelength selection is also supported by previous research [29,36,48]. However, the R2 for Chl-a using ISE–PLS reached 0.98, a result that does not rule out the possibility of overfitting; therefore, this issue should be the subject of additional research and validation.

5.3. Importance of Selected Wavebands in ISE–PLS

Our results showed that ISE–PLS selected 16.97% of all available wavelengths for predicting Chl-a and 8.38% for predicting TSS, which indicates that less than 20% of the waveband information from field hyperspectral data contributed to the prediction of the water quality parameters (Chl-a and TSS) and over 80% was redundant. In the reflectance dataset, wavebands primarily in the red wavelengths were selected: between 630 and 710 nm for Chl-a; for TSS, 560 nm, 620–630 nm, and 720 nm. In the FDR dataset, the selected wavebands for estimating both Chl-a and TSS involved more regions than in the reflectance dataset. Nevertheless, similar wavelengths in the visible and NIR regions were selected: blue (410 nm), green (approximately 490 nm, 510 nm), and red (approximately 603 nm, 615 nm) for Chl-a; and blue (approximately 420 nm), green (approximately 500 nm), red (approximately 620 nm, 680 nm and 700 nm), and NIR (approximately 730 nm) for TSS. Intensive absorption by Chl-a resulted in reflectance troughs around 440 and 670 nm (Figure 5a) [49]. Low absorption by algal pigments or scattering by phytoplankton cells and inorganic suspended materials might cause the reflectance peak near 570 nm [41]. The reflectance spectrum peak near 700 nm had a strong correlation with Chl-a concentration [42,50,51]. Several previous studies of inland water quality also showed that these wavelengths have the potential to predict Chl-a and TSS concentrations [52,53,54]. This study provides evidence that the ISE–PLS model may be considered as a unified approach for remote quantification of constituent concentrations in water quality assessment.
Using this method, more informative wavebands can be selected from hundreds of hyperspectral wavebands, which indicates that ISE–PLS could enhance accuracy and efficiency when hyperspectral satellite sensors with high temporal and spatial resolution are used in the future to monitor the water quality of relatively small inland water bodies.

6. Conclusions

The present study develops models for estimating Chl-a and TSS concentrations in irrigation ponds using water surface reflectance spectral data. Our results show that PLS regression analysis has high potential for predicting Chl-a and TSS based on field hyperspectral measurements, and that ISE waveband selection in combination with PLS regression analysis can enhance predictive ability. Chl-a and TSS concentrations were estimated with high accuracy using ISE–PLS, which explained 98% of the variance for Chl-a and 97% of the variance for TSS. The important wavebands for estimating Chl-a and TSS using ISE–PLS represented 16.97% and 8.38%, respectively, of all 501 wavebands over the 400–900 nm range. The selected wavebands approximately match the absorption peaks published by previous researchers. Compared to the estimation of water quality parameters by satellite sensors such as MODIS, ISE–PLS selected more informative wavebands, especially the wavelength at approximately 700 nm. These results provide useful insights for future analyses on the assessment of water quality in irrigation ponds, especially when using satellite imagery.


This study was supported by the Environmental Research and Technology Development Fund (S9) of the Ministry of the Environment, Japan, and JSPS KAKENHI (24560623, 15K14041, 16H05631).

Author Contributions

Yuji Sakuno and Kensuke Kawamura designed this study and the fieldwork; Xinyan Fan, Zhe Gong, Jihyun Lim, and Zuomin Wang performed the fieldwork; Zuomin Wang carried out the laboratory analysis, analyzed the data and wrote the manuscript; Zuomin Wang, Kensuke Kawamura and Yuji Sakuno revised the paper.

Conflicts of Interest

The authors declare no conflicts of interest.


References

  1. Mateo-Sagasta, J.; Burke, J. SOLAW Background Thematic Report—TR08; FAO: Rome, Italy, 2010.
  2. Yang, X.E.; Wu, X.; Hao, H.L.; He, Z.L. Mechanisms and assessment of water eutrophication. J. Zhejiang Univ. Sci. B 2008, 9, 197–209.
  3. Rönnberg, C.; Bonsdorff, E. Baltic Sea eutrophication: Area-specific ecological consequences. Hydrobiologia 2004, 514, 227–241.
  4. World Health Organization (WHO). Guidelines for Drinking-Water Quality, 4th ed.; WHO: Geneva, Switzerland, 2011.
  5. Latif, Z.; Tasneem, M.A.; Javed, T.; Butt, S.; Fazil, M.; Ali, M.; Sajjad, M.I. Evaluation of Water-Quality by Chlorophyll and Dissolved Oxygen. Water Resour. South Present Scenar. Future Prospect. 2003, 7, 123–135.
  6. Lu, F.; Chen, Z.; Liu, W.; Shao, H. Modeling chlorophyll-a concentrations using an artificial neural network for precisely eco-restoring lake basin. Ecol. Eng. 2016, 95, 422–429.
  7. Sikorska, A.E.; Del Giudice, D.; Banasik, K.; Rieckermann, J. The value of streamflow data in improving TSS predictions—Bayesian multi-objective calibration. J. Hydrol. 2015, 530, 241–254.
  8. Fondriest Environmental, Inc. Turbidity, Total Suspended Solids and Water Clarity; Fundamentals of Environmental Measurements, 2014. Available online: (accessed on 3 November 2016).
  9. Bash, J. Effects of Turbidity and Suspended Solids on Salmonids; Center for Streamside Studies, University of Washington: Seattle, WA, USA, 2001; p. 74.
  10. Davies-Colley, R.J.; Smith, D.G. Turbidity, suspended sediment, and water clarity: A review. J. Am. Water Resour. Assoc. 2001, 37, 1085–1101.
  11. Shafique, N.A.; Fulk, F.; Autrey, B.C.; Flotemersch, J. Hyperspectral Remote Sensing of Water Quality Parameters for Large Rivers in the Ohio River Basin. In Proceedings of the First Interagency Conference on Research in the Watersheds, USDA Agricultural Research Service, Washington, DC, USA, 27–30 October 2003.
  12. Voutilainen, A.; Pyhälahti, T.; Kallio, K.Y.; Pulliainen, J.; Haario, H.; Kaipio, J.P. A filtering approach for estimating lake water quality from remote sensing data. Int. J. Appl. Earth Obs. Geoinform. 2007, 9, 50–64.
  13. O’Reilly, J.E.; Maritorena, S.; Mitchell, B.G.; Siegel, D.A.; Carder, K.L.; Garver, S.A.; Kahru, M.; McClain, C. Ocean color chlorophyll algorithms for SeaWiFS. J. Geophys. Res. 1998, 103, 24937–24953.
  14. Sakuno, Y.; Makio, K.; Koike, K.; Maung-Saw-Htoo-Thaw; Kitahara, S. Chlorophyll-a estimation in Tachibana bay by data Fusion of GOCI and MODIS using linear combination index algorithm. Adv. Remote Sens. 2013, 2, 292–296.
  15. Gitelson, A.A.; Gritz, U.; Merzlyak, M.N. Relationships between leaf chlorophyll content and spectral reflectance and algorithms for nondestructive chlorophyll assessment in higher plant leaves. J. Plant Physiol. 2003, 160, 271–282.
  16. Mishra, S.; Mishra, D.R. Normalized difference chlorophyll index: A novel model for remote estimation of chlorophyll-a concentration in turbid productive waters. Remote Sens. Environ. 2012, 117, 394–406.
  17. Dall’Olmo, G.; Gitelson, A.A.; Rundquist, D.C. Towards a unified approach for remote estimation of chlorophyll-a in both terrestrial vegetation and turbid productive waters. Geophys. Res. Lett. 2003, 30, 1938.
  18. Gitelson, A.A.; Dall’Olmo, G.; Moses, W.; Rundquist, D.C.; Barrow, T.; Fisher, T.R.; Gurlin, D.; Holz, J. A simple semi-analytical model for remote estimation of chlorophyll-a in turbid waters: Validation. Remote Sens. Environ. 2008, 112, 3582–3593.
  19. Nechad, B.; Ruddick, K.G.; Park, Y. Calibration and validation of a generic multisensor algorithm for mapping of total suspended matter in turbid waters. Remote Sens. Environ. 2010, 114, 854–866.
  20. Pulliainen, J.; Kallio, K.; Eloheimo, K.; Koponen, S.; Servomaa, H.; Hannonen, T.; Tauriainen, S.; Hallikainen, M. A semi-operative approach to lake water quality retrieval from remote sensing data. Sci. Total Environ. 2001, 268, 79–93.
  21. Dall’Olmo, G.; Gitelson, A.A.; Rundquist, D.C.; Leavitt, B.; Barrow, T.; Holz, J.C. Assessing the potential of SeaWiFS and MODIS for estimating chlorophyll concentration in turbid productive waters using red and near-infrared bands. Remote Sens. Environ. 2005, 96, 176–187.
  22. Han, L.; Rundquist, D.C. Comparison of NIR/RED ratio and first derivative of reflectance in estimating algal-chlorophyll concentration: A case study in a turbid reservoir. Remote Sens. Environ. 1997, 62, 253–261.
  23. Inoue, Y.; Peñuelas, J.; Miyata, A.; Mano, M. Normalized difference spectral indices for estimating photosynthetic efficiency and capacity at a canopy scale derived from hyperspectral and CO2 flux measurements in rice. Remote Sens. Environ. 2008, 112, 156–172.
  24. Stagakis, S.; Markos, N.; Sykioti, O.; Kyparissis, A. Monitoring canopy biophysical and biochemical parameters in ecosystem scale using satellite hyperspectral imagery: An application on a Phlomis fruticosa Mediterranean ecosystem using multiangular CHRIS/PROBA observations. Remote Sens. Environ. 2010, 114, 977–994.
  25. Inoue, Y.; Sakaiya, E.; Zhu, Y.; Takahashi, W. Diagnostic mapping of canopy nitrogen content in rice based on hyperspectral measurements. Remote Sens. Environ. 2012, 126, 210–221.
  26. Wold, H. Estimation of Principal Components and Related Models by Iterative Least Squares. In Multivariate Analysis; Krishnaiah, P.R., Ed.; Academic Press: New York, NY, USA, 1966; pp. 391–420.
  27. Song, K.; Li, L.; Li, S.; Tedesco, L.; Duan, H.; Li, Z.; Shi, K.; Du, J.; Zhao, Y.; Shao, T. Using partial least squares-artificial neural network for inversion of inland water chlorophyll-a. IEEE Trans. Geosci. Remote Sens. 2014, 52, 1502–1517.
  28. Ghasemi, J.; Niazi, A. Genetic-algorithm-based wavelength selection in multicomponent spectrophotometric determination by PLS: Application on copper and zinc mixture. Talanta 2003, 59, 311–317.
  29. Kawamura, K.; Watanabe, N.; Sakanoue, S.; Inoue, Y. Estimating forage biomass and quality in a mixed sown pasture based on partial least squares regression with waveband selection. Grassl. Sci. 2008, 54, 131–145.
  30. Swierenga, H.; Groot, P.J.; Weijer, A.P.; Derksen, M.W.J.; Buydens, L.M.C. Improvement of PLS model transferability by robust wavelength selection. Chemom. Intell. Lab. Syst. 1998, 41, 237–248.
  31. Boggia, R.; Forina, M.; Fossa, P.; Mosti, L. Chemometric study and validation strategies in the structure-activity relationships of new class of cardiotonic agents. Quant. Struct.-Act. Relatsh. 1997, 16, 201–213.
  32. Kawamura, K.; Watanabe, N.; Sakanoue, S.; Lee, H.; Inoue, Y.; Odagawa, S. Testing genetic algorithm as a tool to select relevant wavebands from field hyperspectral data for estimating pasture mass and quality in a mixed sown pasture using partial least squares regression. Grassl. Sci. 2010, 56, 205–216.
  33. Derbalah, A.S.H.; Nakatani, N.; Sakugawa, H. Distribution, seasonal pattern, flux and contamination source of pesticides and nonylphenol residues in Kurose River water, Higashi–Hiroshima, Japan. Geochem. J. 2003, 37, 217–232.
  34. Abe, H.; Shinohara, S. A study on irrigation ponds in Higashihiroshima: A statistical approach. J. Fac. Appl. Biol. Sci. Hiroshima Univ. 1996, 35, 27–34.
  35. Stratoulias, V.; Heino, T.I.; Michon, F. Lin-28 regulates oogenesis and muscle formation in Drosophila melanogaster. PLoS ONE 2014, 9, e101141.
  36. Forina, M.; Lanteri, S.; Oliveros, M.; Millan, C.P. Selection of useful predictors in multivariate calibration. Anal. Bioanal. Chem. 2004, 380, 397–418.
  37. D’Archivio, A.A.; Maggi, M.A.; Ruggieri, F. Modelling of UPLC behaviour of acylcarnitines by quantitative structure–retention relationships. J. Pharm. Biomed. Anal. 2014, 96, 224–230.
  38. Williams, P.C. Implementation of Near-Infrared Technology. In Near-Infrared Technology in the Agricultural and Food Industries, 2nd ed.; Williams, P.C., Norris, K., Eds.; Association of Cereal Chemists Inc.: Eagan, MN, USA, 2001; pp. 145–169.
  39. D’Acqui, L.P.; Pucci, A.; Janik, L.J. Soil properties prediction of western Mediterranean islands with similar climatic environments by means of mid-infrared diffuse reflectance spectroscopy. Eur. J. Soil Sci. 2010, 61, 865–876.
  40. Gitelson, A.A.; Schalles, J.F.; Hladik, C.M. Remote chlorophyll-a retrieval in turbid, productive estuaries: Chesapeake bay case study. Remote Sens. Environ. 2007, 109, 464–472.
  41. Huang, Y.; Jiang, D.; Zhuang, D.; Fu, J. Evaluation of hyperspectral indices for chlorophyll-a concentration estimation in Tangxun Lake (Wuhan, China). Int. J. Environ. Res. Public Health 2010, 7, 2437–2451.
  42. Gitelson, A.A. The peak near 700 nm on radiance spectra of algae and water: Relationships of its magnitude and position with chlorophyll concentration. Int. J. Remote Sens. 1992, 13, 3367–3373.
  43. Bennet, J.; Bogorad, L. Complementary chromatic adaptation in a filamentous blue-green alga. J. Cell Biol. 1973, 58, 419–435.
  44. Ma, R.H.; Ma, X.D.; Dai, J.F. Hyperspectral Feature Analysis of Chlorophyll a and Suspended Solids Using Field Measurements from Taihu Lake, Eastern China. Hydrol. Sci. J. 2007, 52, 808–824.
  45. Mittenzwey, K.H.; Breitwieser, S.; Penig, J.; Gitelson, A.A.; Dubovitzkii, G.; Garbusov, G.; Ullrich, S.; Vobach, V.; Müller, A. Fluorescence and reflectance for the in-situ determination of some quality parameters of surface waters. Acta Hydrochim. Hydrobiol. 1991, 19, 1–15.
  46. Song, K.; Li, L.; Tedesco, L.P.; Li, S.; Duan, H.; Liu, D.; Hall, B.E.; Du, J.; Li, Z.; Shi, K.; et al. Remote estimation of chlorophyll-a in turbid inland waters: Three-band model versus GA-PLS model. Remote Sens. Environ. 2013, 136, 342–357.
  47. Ryan, K.; Ali, K. Application of a partial least-squares regression model to retrieve chlorophyll-a concentrations in coastal waters using hyper-spectral data. Ocean Sci. J. 2016, 51, 209–221.
  48. Chen, D.; Cai, W.; Shao, X. Representative subset selection in modified iterative predictor weighting (mIPW)-PLS models for parsimonious multivariate calibration. Chemom. Intell. Lab. Syst. 2007, 87, 312–318.
  49. Yacobi, Y.Z.; Moses, W.J.; Kaganovsky, S.; Sulimani, B.; Leavitt, B.C.; Gitelson, A.A. NIR-red reflectance-based algorithms for chlorophyll-a estimation in mesotrophic inland and coastal waters: Lake Kinneret case study. Water Res. 2011, 45, 2428–2436.
  50. Vasilkov, A.; Kopelevich, O. Reasons for the appearance of the maximum near 700 nm in the radiance spectrum emitted by the ocean layer. Oceanology 1982, 22, 697–701.
  51. Gitelson, A.; Garbuzov, G.; Szilagyi, F.; Mittenzwey, K.; Karnieli, A.; Kaiser, A. Quantitative remote sensing methods for real-time monitoring of inland waters quality. Int. J. Remote Sens. 1993, 14, 1269–1295.
  52. Hu, Z.; Liu, H.; Zhu, L.; Lin, F. Quantitative inversion model of water chlorophyll-a based on spectral analysis. Procedia Environ. Sci. 2011, 10, 523–528.
  53. Thiemann, S.; Kaufmann, H. Determination of chlorophyll content and trophic state of lakes using field spectrometer and IRS-1C satellite data in the Mecklenburg Lake District, Germany. Remote Sens. Environ. 2000, 73, 227–235.
  54. Gons, H.J. Optical teledetection of chlorophyll a in turbid inland waters. Environ. Sci. Technol. 1999, 33, 1127–1132.
Figure 1. Locations of Higashihiroshima and the six irrigation ponds used in this study.
Figure 2. Correlation coefficients (r) between the water quality parameters (Chl-a and TSS) and the spectral data at each wavelength: (a) reflectance; (b) FDR.
Figure 3. Distributions of R2 for two-waveband combinations using RSI: (a) Chl-a; (b) TSS; and NDSI: (c) Chl-a; (d) TSS.
Figure 4. Relations between measured and cross-validated prediction values using FS–PLS and ISE–PLS for Chl-a: (a) reflectance; (b) FDR; and for TSS: (c) reflectance; (d) FDR.
Figure 5. Selected wavebands in ISE–PLS using reflectance or FDR datasets (n = 36) to estimate: (a) and (c) Chl-a; (b) and (d) TSS. Green bars = Chl-a; red bars = TSS.
Table 1. The six irrigation ponds in the study.
No. | Name of pond | Alt. (m) | Depth (m) | Area (ha) | Coordinate
1 | Nanatsu-ike | 245 | 2.3 | 8.1 | 34°26′06.46″N 132°41′39.69″E
2 | Shitami-Oike | 221 | 1.5 | 2.5 | 34°24′28.56″N 132°42′22.09″E
3 | Okuda-Oike | 228 | 3.3 | 2.9 | 34°24′25.24″N 132°43′43.16″E
4 | Yamanaka-ike | 231 | 2.6 | 1.2 | 34°24′14.15″N 132°43′12.21″E
5 | Yamanakaike-kamiike | 231 | 1.1 | 0.1 | 34°24′15.29″N 132°43′14.45″E
6 | Budou-ike | 210 | 1.6 | 1 | 34°24′02.78″N 132°42′45.89″E
Table 2. Descriptive statistics for the Chl-a and TSS concentrations.
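Date | n | Chl-a (μg/L) Min | Max | Mean | SD | CV | TSS (mg/L) Min | Max | Mean | SD | CV
3 January 2014 | 6 | 0.1 | 98.7 | 20.7 | 39. | | | | | |
19 January 2014 | 6 | 0.1 | 169.5 | 36. | | | | | | |
24 March 2014 | 6 | 0 | 169.1 | 36.8 | 67.3 | 1.8 | 0.4 | 38. | | |
9 April 2014 | 6 | 0.5 | 48.5 | 8.7 | 19.5 | 2.2 | 0.5 | 33.5 | 6.5 | 13.2 | 2.0
24 May 2014 | 6 | 0.9 | 37. | | | | | | | |
28 June 2014 | 6 | 1.6 | 133.9 | 27. | | | | | | |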
SD = standard deviation; CV = coefficient of variation; n = number of samples.
Table 3. Regression models used to estimate Chl-a and TSS concentrations with two spectral data types (reflectance and FDR) and two spectral indices (RSI and NDSI).
Parameter | Spectral index | Model | R2 | RMSE
Chl-a | Reflectance | Chl-a = 0.0004 × R730 + 0.0396 | 0.14 | 51.00
Chl-a | FDR | Chl-a = 1 × 10^−5 × R705 − 0.0004 | 0.54 | 51.01
Chl-a | NIR/red (Han et al. (1997) [22]) | Chl-a = 94.748 × R705/R670 − 88.897 | 0.60 | 28.78
Chl-a | Three-band (Gitelson et al. (2003) [15]) | Chl-a = 0.0036 × (R660^−1 − R703^−1) × R740 − 0.0665 | 0.71 | 29.32
Chl-a | NDCI (Mishra et al. (2012) [16]) | Chl-a = 253.16 × (Rrs708 − Rrs665)/(Rrs708 + Rrs665) + 36.535 | 0.60 | 28.82
Chl-a | RSI | Chl-a = 119.27 × R719/R662 − 88.052 | 0.72 | 24.14
Chl-a | NDSI | Chl-a = 253.16 × (R719 − R663)/(R719 + R663) + 36.535 | 0.64 | 27.19
TSS | Reflectance | TSS = 0.0009 × R722 + 0.0501 | 0.05 | 14.81
TSS | FDR | TSS = 5 × 10^−5 × R704 − 0.0003 | 0.46 | 14.83
TSS | RSI | TSS = 31.419 × R717/R630 − 17.913 | 0.52 | 8.73
TSS | NDSI | TSS = 300.45 × (R704 − R698)/(R704 + R698) + 6.3868 | 0.55 | 8.48
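As an illustration of how the two-band index models in Table 3 are evaluated, the following Python sketch computes RSI and NDSI from the reflectance at two wavelengths and applies the RSI model for Chl-a and the NDSI model for TSS. The function names and the reflectance values are hypothetical; only the coefficients and wavelengths are taken from Table 3. Both indices depend only on the ratio of the two reflectances, so they do not change when both bands are scaled by the same factor.

```python
def rsi(r_i, r_j):
    """Ratio spectral index of reflectances at two wavebands: R_i / R_j."""
    return r_i / r_j


def ndsi(r_i, r_j):
    """Normalized difference spectral index: (R_i - R_j) / (R_i + R_j)."""
    return (r_i - r_j) / (r_i + r_j)


def chl_a_from_rsi(r719, r662):
    """Chl-a (ug/L) from the Table 3 RSI model: 119.27 * R719/R662 - 88.052."""
    return 119.27 * rsi(r719, r662) - 88.052


def tss_from_ndsi(r704, r698):
    """TSS (mg/L) from the Table 3 NDSI model: 300.45 * NDSI(R704, R698) + 6.3868."""
    return 300.45 * ndsi(r704, r698) + 6.3868


# Hypothetical reflectance values, chosen only to exercise the functions.
chl_a = chl_a_from_rsi(r719=0.05, r662=0.04)
tss = tss_from_ndsi(r704=0.051, r698=0.050)
print(f"Chl-a estimate: {chl_a:.2f} ug/L")
print(f"TSS estimate: {tss:.2f} mg/L")
```

These models were fitted to the six ponds of this study, so the coefficients should not be expected to transfer to other water bodies without recalibration.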
Table 4. Optimum NLV, R2 and RMSECV using the LOO method in FS–PLS and in ISE–PLS using the entire dataset (n = 36), with the residual predictive deviation, the number of selected wavebands and the percent ratio with respect to the full spectrum (i = 501).
Parameter | Spectral data type | Regression | Calibration | Cross validation | Selected wavebands (number) | Selected wavebands (%)
FDR = first derivative reflectance; NLV = number of latent variables; RMSEC = root mean square error from calibration; RMSECV = root mean square error from cross validation; RPD = the residual predictive deviation.
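The cross-validation statistics named in Table 4 can be sketched in Python as follows. The sketch assumes the common definitions: RMSECV is the root mean square of the leave-one-out (LOO) prediction errors, and RPD is the standard deviation of the measured values divided by RMSECV. Using the sample standard deviation (n − 1 denominator) is an assumption, as is the univariate least-squares line, which stands in for the PLS model only to keep the example self-contained. All names and data are hypothetical.

```python
import math
from statistics import mean, stdev


def fit_line(x, y):
    """Ordinary least squares for y = a * x + b (a stand-in for the PLS model)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


def loo_predictions(x, y):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for k in range(len(x)):
        x_train = x[:k] + x[k + 1:]
        y_train = y[:k] + y[k + 1:]
        a, b = fit_line(x_train, y_train)
        preds.append(a * x[k] + b)
    return preds


def rmsecv(measured, predicted):
    """Root mean square error of the cross-validated predictions."""
    return math.sqrt(mean([(m - p) ** 2 for m, p in zip(measured, predicted)]))


def rpd(measured, predicted):
    """Residual predictive deviation: SD of measured values / RMSECV."""
    return stdev(measured) / rmsecv(measured, predicted)


# Hypothetical predictor (e.g., a spectral index) and measured concentrations.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
cv_pred = loo_predictions(x, y)
print(f"RMSECV = {rmsecv(y, cv_pred):.3f}, RPD = {rpd(y, cv_pred):.2f}")
```

Because RPD scales the prediction error by the spread of the measured data, it allows models for parameters with different units and ranges, such as Chl-a and TSS, to be compared on a common footing.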
Remote Sens. EISSN 2072-4292, Published by MDPI AG, Basel, Switzerland