Development of the Statistical Model for Monitoring Salinization in the Mekong Delta of Vietnam Using Remote Sensing Data and In-Situ Measurements

This article presents a methodology for developing a statistical model to monitor salinity intrusion in the Mekong Delta by integrating satellite imagery with in-situ measurements. We used Landsat-8 Operational Land Imager (OLI) and Thermal Infrared Sensor (TIRS) data to establish the relationship between planetary reflectance and ground-measured salinity during the 2014 dry season. Three spectral bands (blue, green, red) and the principal component band were used to identify the most suitable model. The selected model, an exponential function of the principal component band, correlated well with the ground-measured data (R² > 0.8). Simulation of the salinity distribution along the river shows that the 4 g/L salt boundary intrudes more than 50 km from the estuary into the inner field. The model can help managers plan adaptation and response measures for salinity intrusion in both the estuaries and the inner fields of the Mekong Delta.


Introduction
The Mekong Delta is a major agricultural area that contributes to exports as well as to the food supply of Vietnam. However, this low-lying area, located in the downstream Mekong basin, is heavily affected by sea level rise and climate change. These pressures have made drought and salinization more severe, and conditions are expected to worsen in the near future [1].
The traditional measurement of water salinity requires in-situ sampling, which is a costly and time-consuming effort. Because of these limitations, it is impractical to cover the whole water body or obtain frequent repeat samples at a site. This difficulty in achieving successive water salinity sampling becomes a barrier to water salinity monitoring and forecasting [2].
Remote sensing techniques have the potential to overcome these limitations by providing an alternative means of studying and monitoring water salinity over a wide range of both temporal and spatial scales. Several studies have confirmed that remote sensing can meet the demand for the large sample sizes required for water salinity studies conducted on the watershed scale [3]. Hence, a significant amount of research has been conducted to develop remote sensing methods and indices. These methods range from semi-empirical techniques to analytical methods for estimating and producing quantitative water salinity maps [4].
This research aimed to assess the use of remote sensing data for estimating water salinity in the Mekong Delta using statistical techniques. The work was based on developing an algorithm to estimate water salinity from the band reflectance of the Landsat-8 Operational Land Imager (OLI) and Thermal Infrared Sensor (TIRS).

Satellite and Ground Measured Data
We used five scenes of Landsat-8 OLI and TIRS multispectral image data acquired on five dates in 2014 (22 February, 10 March, 26 March, 27 April, 13 May). Each scene covered 185 km × 180 km, and the spatial resolution of the spectral bands used was 30 m. The satellite data were downloaded from the United States Geological Survey in GeoTIFF format with Level 1T correction. Level 1T is the standard terrain correction, which provides systematic radiometric and geometric accuracy by using ground control points and topographic accuracy by using a digital elevation model. The Landsat-8 OLI and TIRS scenes were selected for the least cloud cover on the sampling days. In total, we used ground-measured data from eleven sampling locations in the study area, collected from February to May 2014, to develop and validate models based on the planetary reflectance of the Landsat-8 scenes. The sampling locations and the dates of the ground-measured data and Landsat-8 scenes are given in Table 1.
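The article refers to planetary reflectance without showing how it was derived from the Level 1T digital numbers. As a minimal sketch, assuming the standard USGS rescaling for Landsat-8 OLI (a multiplicative and an additive factor per band, followed by a sun elevation correction), the conversion for a single pixel might look as follows. The function name and the numeric values are illustrative; in practice the factors and the sun elevation would be read from the scene's metadata file.

```python
import math


def toa_reflectance(dn, mult, add, sun_elevation_deg):
    """Convert a Level-1 digital number to sun-angle-corrected
    top-of-atmosphere (planetary) reflectance."""
    # Rescale the quantized pixel value to uncorrected reflectance.
    rho_prime = mult * dn + add
    # Correct for the local sun elevation angle (given in degrees).
    return rho_prime / math.sin(math.radians(sun_elevation_deg))


# Hypothetical inputs for illustration only (not taken from the article's scenes).
example = toa_reflectance(dn=10000, mult=2.0e-5, add=-0.1, sun_elevation_deg=60.0)
```

The same function can be applied pixel by pixel to each band before the band values are extracted at the sampling locations.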

Image Processing
The satellite scenes were processed to prepare them for analysis. The operations applied are briefly explained in the following sub-sections:

Normalized Difference Vegetation Index (NDVI)
We calculated the normalized difference vegetation index (NDVI), a measure of vegetation greenness, using Equation (3) [5]:

$$\mathrm{NDVI} = \frac{\rho_{NIR} - \rho_{Red}}{\rho_{NIR} + \rho_{Red}} \qquad (3)$$

where $\rho_{NIR}$ is the reflectance of the NIR band and $\rho_{Red}$ is the reflectance of the red band. NDVI was calculated over the study area for all Landsat-8 scenes. Negative NDVI values (between −1 and 0) indicated the presence of water in a pixel, whereas positive NDVI values indicated possible contamination by other land uses [6]. Where a sampling-site pixel had a positive NDVI value, we used the reflectance value of the nearest neighboring water pixel instead.
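The NDVI screening and the nearest-water-pixel substitution described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the grids are small nested lists of made-up reflectance values, distance is measured in pixel units, and the helper names are hypothetical.

```python
import math


def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    denom = nir + red
    # Guard against division by zero for empty (no-data) pixels.
    return (nir - red) / denom if denom != 0 else 0.0


def nearest_water_pixel(ndvi_grid, row, col):
    """Return (row, col) of the closest pixel with negative NDVI, or None.
    If the site pixel itself is water, it is returned unchanged."""
    best, best_dist = None, math.inf
    for r, line in enumerate(ndvi_grid):
        for c, value in enumerate(line):
            if value < 0:
                dist = math.hypot(r - row, c - col)
                if dist < best_dist:
                    best, best_dist = (r, c), dist
    return best


# Made-up 3 x 3 reflectance grids for illustration only.
nir = [[0.30, 0.28, 0.05], [0.25, 0.06, 0.04], [0.05, 0.04, 0.03]]
red = [[0.10, 0.09, 0.08], [0.08, 0.09, 0.07], [0.08, 0.07, 0.06]]
ndvi_grid = [[ndvi(n, r) for n, r in zip(nir_row, red_row)]
             for nir_row, red_row in zip(nir, red)]

# The site at (0, 0) has positive NDVI, so a substitute water pixel is looked up.
substitute = nearest_water_pixel(ndvi_grid, 0, 0)
```

A full-scene version would restrict the search to a small window around each sampling site rather than scanning every pixel.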

Investigating the Correlation between the Landsat-8 Satellite Image Data and the In-Situ Salinity Measurements
We investigated the correlation between the spectral bands of the Landsat-8 images (except the panchromatic and cirrus bands) and the in-situ salinity measurements by calculating the Pearson correlation coefficient (Table 2). Table 2 shows that the coastal aerosol, blue, green and red bands correlate well with the in-situ salinity measurements (r = 0.68 to 0.83). These spectral bands can therefore be used to develop the model for water salinity.

Table 2. Pearson correlation coefficient (r) between the spectral bands of Landsat-8 images (except the panchromatic and cirrus bands) and the in-situ salinity measurements.
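The band-by-band correlation screening can be sketched with a plain implementation of the Pearson coefficient. The reflectance and salinity values below are invented for illustration and are not the article's data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented example: reflectance of one band and salinity (g/L) at five sites.
band_reflectance = [0.08, 0.10, 0.11, 0.13, 0.15]
salinity = [2.1, 3.8, 4.0, 6.5, 9.2]
r = pearson_r(band_reflectance, salinity)
```

Repeating the call for each band gives one coefficient per band, which is the kind of summary reported in Table 2.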

Pearson Correlation Coefficient (r)
Principal component analysis (PCA) creates principal components (PCs) that remove redundant, overlapping information among the image bands and highlight the spectral characteristics of different objects on the Earth's surface [7]. We also investigated the correlation between the principal components derived from the original Landsat-8 image bands (bands 1, 2, 3 and 4) and the in-situ salinity measurements. The first principal component (PC1) correlated most strongly with the in-situ salinity measurements (r = 0.86) (Table 3).
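To make the idea of a first principal component concrete, the sketch below computes PC1 from a small table of band values. It is a stand-in, not the procedure used in the article, which does not state its software or whether PCA was run on whole images or on sampled pixels. Here the covariance matrix of the samples is formed and its dominant eigenvector is found by power iteration, chosen only to keep the example dependency-free; the function name is hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """rows: list of samples, each a list of band reflectances.
    Returns (unit loading vector, PC1 score for each sample).
    Assumes the bands are not all constant."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    centered = [[r[j] - means[j] for j in range(k)] for r in rows]
    # Sample covariance matrix of the bands.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(k)]
           for i in range(k)]
    # Power iteration converges to the dominant eigenvector of cov.
    vec = [1.0] * k
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # Project each centered sample onto the loading vector.
    scores = [sum(c[j] * vec[j] for j in range(k)) for c in centered]
    return vec, scores


# Tiny two-band example with perfectly correlated bands (illustrative only).
loadings, pc1_scores = first_principal_component([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
```

The sign of a principal component is arbitrary, so a library implementation may return the same axis with loadings and scores negated.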

Developing Models of Water Salinity from Planetary Reflectance of Landsat-8 Data
We used regression analysis to develop eight empirical models that estimate salinity as a function of the blue, green and red bands and of PC1. We used 40 data records (paired Landsat-8 and ground data) to develop the models relating salinity to planetary reflectance. The remaining 15 data records were used to validate the best models by calculating the root mean square error (RMSE). In all models, the in-situ salinity measurement was the dependent variable, and the spectral bands and the PC1 band were the independent variables. The strength of the relationship between planetary reflectance and in-situ salinity, for both model development and validation, was assessed using the coefficient of determination (R²) and RMSE. On the basis of the R² and RMSE values, we identified the significant empirical models for salinity.
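The calibration and validation workflow can be sketched for an exponential model of the form salinity = a · exp(b · x), where x is a band or PC1 value. The article does not say how its exponential models were fitted; this sketch assumes ordinary least squares on the logarithm of salinity, which is one common choice. All data are synthetic and the function names are hypothetical.

```python
import math


def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by least squares on ln(y); every y must be > 0."""
    n = len(x)
    log_y = [math.log(v) for v in y]
    mean_x = sum(x) / n
    mean_ly = sum(log_y) / n
    # Slope and intercept of the straight line ln(y) = ln(a) + b * x.
    b = (sum((xi - mean_x) * (li - mean_ly) for xi, li in zip(x, log_y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = math.exp(mean_ly - b * mean_x)
    return a, b


def predict(a, b, x):
    return [a * math.exp(b * xi) for xi in x]


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted))
                     / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination of predictions against observations."""
    mean_o = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Synthetic calibration and validation records (illustration only).
x_cal = [0.0, 1.0, 2.0, 3.0]
y_cal = [2.0 * math.exp(0.5 * v) for v in x_cal]
x_val = [0.5, 1.5, 2.5]
y_val = [2.0 * math.exp(0.5 * v) for v in x_val]

a, b = fit_exponential(x_cal, y_cal)
val_rmse = rmse(y_val, predict(a, b, x_val))
val_r2 = r_squared(y_val, predict(a, b, x_val))
```

Fitting in log space minimizes relative rather than absolute errors, so a nonlinear least-squares fit on the untransformed salinity values could give somewhat different coefficients.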

Spatial Analysis for the Study Area
The NDVI was calculated for the study area to extract water pixels and remove pixels contaminated by other land use types. We selected the most suitable empirical model for salinity on the basis of R² values. The selected salinity model was applied to the Landsat-8 scenes to obtain the spatial distribution of salinity across the study area. The salinity values were divided into four classes: (1) low salinity: less than 4 g/L; (2) moderate salinity: 4-7 g/L; (3) high salinity: 7-10 g/L; (4) severe salinity: greater than 10 g/L.
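The four-class scheme can be expressed as a small lookup function. The article does not say which class the shared boundary values belong to, so this sketch assumes that 4 g/L counts as moderate, 7 g/L as high and 10 g/L as high; the function name and the sample values are illustrative.

```python
def salinity_class(salinity_g_per_l):
    """Map a salinity value in g/L to one of the four classes.
    Boundary handling (4 -> moderate, 7 -> high, 10 -> high) is an assumption."""
    if salinity_g_per_l < 4:
        return "low"
    if salinity_g_per_l < 7:
        return "moderate"
    if salinity_g_per_l <= 10:
        return "high"
    return "severe"


# Classify a few predicted salinity values (made-up numbers).
predicted = [1.2, 4.0, 6.9, 8.5, 12.3]
classes = [salinity_class(v) for v in predicted]
```

Applying the function to every water pixel of a predicted salinity grid yields a class map of the kind described above.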

Empirical Models for Determining Salinity Classes
We developed eight empirical models for determining salinity, as given in Table 4. These empirical models can be used to obtain the spatial distribution of salinity from the planetary reflectance of the bands. Among these significant models, R² was higher (0.73-0.82) for the models with an exponential function (models one, two, three and five in Table 4) and lower (0.62-0.74) for the models with a linear function (models four, six, seven and eight in Table 4). The best model used the principal component PC1 (model one in Table 4). The scatter plot of model one is shown in Figure 2. The figure shows that the selected model, an exponential function of the PC1 band, gave the best correlation with the ground-measured data, with R² = 0.8195 and RMSE = 1.592. Figure 3 shows that the predicted salinity values closely matched the observed ones. The coefficient of determination for the regression between measured and predicted salinity was 0.8948, with RMSE = 0.802. The field and calculated salinity data are therefore in good agreement, indicating a good fit.

Application of Model for Spatial Analysis
We applied the best model (model one in Table 4) to map the spatial distribution of water salinity across the study area. Figure 4 shows the four salinity classes in February 2014: (1) low salinity: less than 4 g/L; (2) moderate salinity: 4-7 g/L; (3) high salinity: 7-10 g/L; (4) severe salinity: greater than 10 g/L. In February 2014, the 4 g/L salt boundary intruded more than 50 km into the Cua Tieu, Cua Dai and Cua Ham Luong estuaries. Table 5 shows the intrusion length of the 4 g/L salt boundary into the inner field from eight large estuaries in the study area.

Conclusions
The purpose of this research was to combine in-situ salinity measurements with Landsat-8 OLI and TIRS data to derive water salinity in the Mekong River Delta of Vietnam for sustainable management. Correlation and regression models were developed between the salinity measurements and the reflectance of the image data. The research indicated that, despite variation in the statistical response, a good correlation existed between measured salinity and the Landsat-8 data in the Mekong River Delta of Vietnam. The efficiency of the proposed algorithms was evaluated using the coefficient of determination and RMSE. Because there was little variation between the Landsat-8 predicted values and the measured concentrations, this research showed that Landsat-8 data are useful for quantifying water salinity dynamics and preparing digital maps. The models and methods used in this research to retrieve water salinity from Landsat-8 are promising and can be applied routinely to the Mekong Delta for sustainable water quality monitoring.