Satellite Soil Moisture Validation Using Hydrological SWAT Model: A Case Study of Puerto Rico, USA

Soil moisture sits at the interface between land and atmosphere, where it influences water and energy fluxes, and it is therefore of significant importance in hydrological modelling and environmental processes. Recent advances in acquiring soil moisture from satellites, and in using it effectively, provide an alternative to conventional soil moisture measurement methods. In this study, an attempt is made to apply the physically based, distributed-parameter Soil and Water Assessment Tool (SWAT) to validate Advanced Microwave Scanning Radiometer 2 (AMSR2) soil moisture in parts of Puerto Rico. For this, calibration is performed for the years 2010 to 2012 against observed discharge at two gauged sites, Rio Guanajibo and Rio Grande de Añasco in Puerto Rico, and validation is performed against the observed streamflow for the year 2013 using the AMSR2 soil moisture. Moreover, the SWAT and AMSR2 soil moisture outcomes are compared on a monthly basis. The model capability and performance in simulating streamflow are evaluated using statistical measures. The results indicate a negligible difference between SWAT soil moisture and AMSR2 soil moisture for streamflow estimation. Finally, the model results show a satisfactory agreement between observed and simulated streamflow.


Introduction
Soil moisture, the water stored for a limited period in the top layer of soil, is important in operational hydrology. Accurate soil moisture information is crucial for climatology, water resource management, agriculture, and flood forecasting [1]. Conventional in situ soil moisture measurement is accurate, but the high temporal and spatial variability of soil moisture limits the information it can provide at local, regional, and global scales [2]. Recently developed satellite retrieval techniques based on microwave remote sensing make quantitative assessment of soil moisture feasible at large scale. However, conventional in situ information is still necessary to assess the accuracy of satellite soil moisture [3][4][5][6]. Furthermore, integration of these two datasets is used to achieve substantial accuracy. Soil moisture measured by satellite remote sensing relies on estimates from active or passive electromagnetic radiation [7].
Several active and passive missions have retrieved soil moisture, mainly Soil Moisture and Ocean Salinity (SMOS) [8], Soil Moisture Active Passive (SMAP) [9], Advanced Scatterometer (ASCAT) [10], European Remote Sensing (ERS) [11], Advanced Microwave Scanning Radiometer for the Earth Observing System (AMSR-E), and Advanced Microwave Scanning Radiometer 2 (AMSR2) [12,13]. The Global Change Observation Mission First-Water (GCOM-W1), launched on 17 May 2012, carries AMSR2, an improved design of the AMSR-E instrument on NASA's Aqua satellite with an enhanced reflector, C-band frequency, and an improved calibration system. Before direct application, the data have to be corrected by removing the gaps from the product. Xiao et al. [14] applied a data assimilation algorithm to fill the gaps in soil moisture data from the Microwave Radiometer Imager (MWRI), AMSR-E, and AMSR2 datasets. Their study shows that the assimilation algorithm efficiently regenerates spatial and temporal soil moisture time series.
Recently, many studies have surveyed the capability of AMSR2 soil moisture for hydrological and climatic applications. Because of its large spatial coverage, the product may give faster and better outcomes for various hydrological models. Wu et al. [15] made a comprehensive appraisal of AMSR2 against in situ soil moisture. The study demonstrated a close correspondence between AMSR2 and in situ soil moisture estimates and thus offered a better understanding of the suitability and reliability of the AMSR2 soil moisture product. Zhang et al. [16] analyzed AMSR2 and SMAP products against in situ measurements and found good agreement with the in situ measurements and a stable pattern for capturing the spatial distribution of surface soil moisture. Brocca et al. [17] briefly reviewed techniques to monitor soil moisture for hydrological applications and described the use of in situ and satellite soil moisture data for improving hydrological predictions. Zhuo et al. [18] explored the advantages of satellite soil moisture in hydrological modeling and suggested an approach to the evaluation, representation, and compatibility of satellite soil moisture in a hydrological model. They further recommended modifying the hydrological model to make it compatible with variations in real field soil moisture.
Hydrological models are built mainly on the water cycle and are broadly applied to achieve long-term sustainability in hydrological projects. Streamflow is the major element of the water cycle that needs reliable estimation for resolving quantity and quality problems in water resource projects. Soil moisture is an essential parameter in a hydrological model because it controls the streamflow estimate, but routine soil moisture measurement is tedious and inconvenient over larger areas. Satellite soil moisture is therefore critical for improving the understanding and ability of hydrological models over larger areas.
In the present study, SWAT is used to simulate streamflow in the Rio Guanajibo and Rio Grande de Añasco watersheds. The model simulates streamflow on a monthly basis for the period 2008 to 2013. This study provides a hydrological comparison between SWAT-generated and AMSR2-retrieved soil moisture data and performs calibration and validation for streamflow using AMSR2 soil moisture. Finally, the study evaluates the SWAT performance using the coefficient of determination (R²).

Study Area
The study areas, Rio Guanajibo and Rio Grande de Añasco, are located in the western portion of the main island of Puerto Rico, USA (Figure 1). The watersheds lie between the northeast corner at 18°19′14.16″ N, 66°40′11.28″ W and the southwest corner at 17°58′5.88″ N, 67°11′20.04″ W, and cover areas of 310.53 km² and 252.52 km², respectively. In general, both sites receive an annual average precipitation of 1687 mm. The annual average maximum and minimum temperatures are 32 °C and 20 °C, and the elevation ranges from −2 m to 370 m above mean sea level. Evergreen forest is the dominant land use, followed by herbaceous and hay, in both watersheds.

Data Sources
The SWAT interface for both study areas is created using input from various sources: a Digital Elevation Model (DEM) of 30 m resolution obtained from the United States Geological Survey (USGS) SRTM (http://srtm.csi.cgiar.org/) (Figure 2a); land use and land cover from the National Land Cover Database (NLCD) (http://databasin.org/datasets/) (Figure 2b); and the Soil Survey Geographic Database (SSURGO) soil map from the Natural Resources Conservation Service (NRCS) (http://websoilsurvey.nrcs.usda.gov/) (Figure 2c). Daily rainfall and temperature station data for the period 2008 to 2013 are collected from the National Climatic Data Center (NCDC) (http://www.ncdc.noaa.gov/); solar radiation, relative humidity, and wind speed are simulated by the model from its weather generator database. Watershed outlet locations and daily observed streamflow are collected from USGS gauge stations (https://waterwatch.usgs.gov/). Remotely sensed satellite soil moisture data are collected from the Japan Aerospace Exploration Agency (JAXA) (http://global.jaxa.jp/). The soil moisture is retrieved from AMSR2 on board GCOM-W1, which was launched in May 2012. The AMSR2 soil moisture data are available on a daily or monthly basis with a spatial resolution of 0.1° and 0.25° (10/25 km).

Model Application
SWAT is a physically based, semi-distributed, basin-scale model that uses inputs such as DEM, soil, land use, and climatic data for hydrological and climatological modeling on a daily or monthly basis [19][20][21]. Major model components include weather, hydrology, soil properties, plant growth, nutrients, and land management practices. The first stage of modeling involves watershed delineation. The delineated watershed is further subdivided into hydrologic response units (HRUs), each of which is a unique combination of land use, soil, and slope. Each HRU in the model responds differently to precipitation and temperature input [22]. In this study, each watershed is treated as a single basin to avoid complexity and to match the spatial resolution of AMSR2 soil moisture. The conversion from sub-basins to a single basin is made by changing the threshold limit in the model, while the number of HRUs generated by the model remains the same. The SWAT process flow chart is shown in Figure 3.
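The idea of an HRU as a unique combination of land use, soil, and slope can be illustrated with a minimal sketch. It assumes three co-registered integer class grids and ignores the area thresholds that SWAT applies during HRU definition; the function name `delineate_hrus` and the class codes are hypothetical and are not part of SWAT.

```python
import numpy as np

def delineate_hrus(land_use, soil, slope_class):
    """Return the unique (land use, soil, slope) combinations found in
    three co-registered integer grids, with the area fraction of each."""
    cells = np.stack(
        [np.ravel(land_use), np.ravel(soil), np.ravel(slope_class)], axis=1
    )
    # Each distinct row is one candidate HRU; counts give its cell total.
    combos, counts = np.unique(cells, axis=0, return_counts=True)
    return combos, counts / counts.sum()

# Hypothetical 2 x 2 watershed with made-up class codes.
land_use = np.array([[1, 1], [2, 2]])
soil = np.array([[10, 10], [10, 20]])
slope_class = np.array([[1, 1], [1, 1]])
combos, fractions = delineate_hrus(land_use, soil, slope_class)
```

In this toy grid, three distinct combinations are found, so the watershed would be represented by three HRUs regardless of whether it is split into sub-basins.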

Streamflow Generation
The hydrological cycle simulated by SWAT is based on the water balance equation, which is evaluated for each HRU:

$$SW_t = SW_0 + \sum_{i=1}^{t} \left( R_{day} - Q_{surf} - ET_a - W_{seep} - Q_{gw} \right)$$

where $t$ is the time in days, $SW_t$ is the final soil water content, $SW_0$ is the initial soil water content, $R_{day}$ is the amount of precipitation, $Q_{surf}$ is the amount of surface runoff, $ET_a$ is the amount of evapotranspiration, $W_{seep}$ is the amount of percolation, and $Q_{gw}$ is the amount of return flow. The results of the various physical processes are then aggregated from the HRU scale to the basin level. The Soil Conservation Service Curve Number method (SCS-CN) is used to calculate streamflow, infiltration, and canopy storage [23,24]. Lateral flow is simulated using a kinematic storage routing technique, whereas return flow is simulated by considering a shallow aquifer. In SWAT, the Muskingum method is used for channel routing and potential evapotranspiration (PET) is estimated using the Penman-Monteith method [25]. The SWAT output includes the water balance for each watershed and the flow at the outlet point.
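As a minimal sketch of how the terms defined above combine, the following assumes the standard daily form SW_t = SW_0 + Σ (R_day − Q_surf − ET_a − W_seep − Q_gw), with all quantities in mm. The function name and the two-day input values are hypothetical and serve only to illustrate the bookkeeping; they are not SWAT output.

```python
def soil_water_balance(sw0, r_day, q_surf, et_a, w_seep, q_gw):
    """Accumulate the daily water balance for one HRU.

    sw0 is the initial soil water content (mm); the remaining arguments
    are equal-length sequences of daily amounts (mm). Returns the soil
    water content at the end of each day.
    """
    sw = sw0
    series = []
    for r, q, et, w, g in zip(r_day, q_surf, et_a, w_seep, q_gw):
        # Precipitation adds water; runoff, evapotranspiration,
        # percolation, and return flow remove it.
        sw += r - q - et - w - g
        series.append(sw)
    return series

# Hypothetical two-day example: one wet day followed by one dry day.
daily_sw = soil_water_balance(
    sw0=100.0,
    r_day=[10.0, 0.0],
    q_surf=[2.0, 0.0],
    et_a=[3.0, 3.0],
    w_seep=[1.0, 1.0],
    q_gw=[0.5, 0.5],
)
```

The basin-level result would then follow from an area-weighted aggregation of the HRU values.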

Bias Correction
In Puerto Rico, in situ data are collected by the Soil Climate Analysis Network (SCAN), run by NRCS. Measurements of soil moisture and temperature at different depths, along with liquid precipitation, solar radiation, relative humidity, etc., are provided through the Snow Survey and Water Supply Forecasting Program (SSWSF). These data are compared with the data derived from the AMSR2 sensor on board GCOM-W1. The satellite data are originally in volumetric percent (v/v %) and are converted to millimeters so that both datasets are expressed as a depth [26]. To retrieve the satellite data at the coordinates of the in situ data, the minimum distance between the ground station and the satellite grid points is calculated from the latitude and longitude of the ground station and the geographical coordinates included in the original satellite data [27], using the Euclidean distance method (EDM). Once retrieved, the satellite data are compared against the ground station data through a variety of statistical methods, which reveal a very noticeable discrepancy between the in situ and satellite data. To account for this, the bias between the two datasets is removed using a simple cumulative distribution function (CDF) matching approach that uses the mean and standard deviation of both the ground station data and the satellite data:

$$\theta = \mu_{ground} + \frac{\sigma_{ground}}{\sigma_{sat}} \left( \theta_{sat} - \mu_{sat} \right)$$

where $\theta$ is the final corrected soil moisture, $\mu_{ground}$ is the mean in situ soil moisture, $\sigma_{sat}$ is the standard deviation of satellite soil moisture, $\sigma_{ground}$ is the standard deviation of in situ soil moisture, $\theta_{sat}$ is the satellite soil moisture, and $\mu_{sat}$ is the mean satellite soil moisture. The corrected data are then plotted against the original in situ data, showing that the corrected satellite data match the in situ data much more closely than the uncorrected data.
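The three steps described above (unit conversion, nearest grid point selection, and bias removal) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the correction is assumed to take the usual mean and standard deviation matching form θ = μ_ground + (σ_ground / σ_sat)(θ_sat − μ_sat), the layer depth used for the unit conversion is a hypothetical input, and all function names are invented for the example.

```python
import numpy as np

def vv_percent_to_mm(theta_percent, layer_depth_mm):
    """Convert volumetric soil moisture (v/v %) to a water depth in mm
    for a soil layer of the given thickness (an assumed input)."""
    return np.asarray(theta_percent, dtype=float) / 100.0 * layer_depth_mm

def nearest_pixel(station_lat, station_lon, pixel_lats, pixel_lons):
    """Index of the satellite grid point with the smallest Euclidean
    distance (in degrees) to the ground station."""
    d = np.hypot(
        np.asarray(pixel_lats, dtype=float) - station_lat,
        np.asarray(pixel_lons, dtype=float) - station_lon,
    )
    return int(np.argmin(d))

def match_mean_std(sat, ground):
    """Rescale the satellite series so that its mean and standard
    deviation equal those of the in situ series."""
    sat = np.asarray(sat, dtype=float)
    ground = np.asarray(ground, dtype=float)
    scale = ground.std() / sat.std()
    return ground.mean() + scale * (sat - sat.mean())

# Hypothetical monthly values in mm, for illustration only.
sat_mm = vv_percent_to_mm([20.0, 25.0, 30.0, 22.0], layer_depth_mm=50.0)
ground_mm = np.array([14.0, 18.0, 21.0, 15.0])
corrected = match_mean_std(sat_mm, ground_mm)
```

Euclidean distance in degrees is a reasonable shortcut over a small island domain; over larger regions a great-circle distance would be the safer choice.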

Model Run
The model is run from 2008 to 2013 for both watersheds. The years 2008 and 2009 are used as a warm-up period, which is necessary to initialize the model; results from these years are not considered in the model prediction. A further run, covering 2010 to 2013, is used for calibration and validation, taking parameter sensitivity into account. Finally, the model performance is evaluated using the coefficient of determination (R²), which indicates the strength of the relationship between observed and model-simulated values:

$$R^2 = \frac{\left[ \sum_{i=1}^{n} (Q_i - \bar{Q})(S_i - \bar{S}) \right]^2}{\sum_{i=1}^{n} (Q_i - \bar{Q})^2 \, \sum_{i=1}^{n} (S_i - \bar{S})^2}$$

where $n$ is the number of time steps, $Q_i$ is the observed streamflow at time $i$, $S_i$ is the simulated streamflow at time $i$, $\bar{Q}$ is the mean observed streamflow, and $\bar{S}$ is the mean simulated streamflow [28].
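For reference, the coefficient of determination can be computed from paired observed and simulated series as in the short sketch below. It assumes the squared-correlation form built from the symbols defined above (Q_i, S_i, and their means); the function name and the sample values are hypothetical.

```python
import numpy as np

def coefficient_of_determination(observed, simulated):
    """Squared Pearson correlation between observed and simulated flow."""
    q = np.asarray(observed, dtype=float)
    s = np.asarray(simulated, dtype=float)
    dq = q - q.mean()  # deviations of observed flow from its mean
    ds = s - s.mean()  # deviations of simulated flow from its mean
    return float((dq @ ds) ** 2 / ((dq @ dq) * (ds @ ds)))

# Hypothetical monthly streamflow values, for illustration only.
r2 = coefficient_of_determination([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
```

Because this measure depends only on the linear relationship, a simulation that is consistently too high or too low can still score well, which is worth keeping in mind when reading the values reported below.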

Model Sensitivity Analysis
Sensitivity analysis is the process of determining the rate of change in model output with respect to changes in model input. It is performed before calibration to reduce parameter uncertainty. SWAT-CUP (SWAT Calibration and Uncertainty Programs) with the Sequential Uncertainty Fitting algorithm (SUFI-2) is used for sensitivity analysis, calibration, and validation of the model run [29,30]. Global sensitivity analysis is performed for the given set of parameters by regressing the Latin hypercube generated parameters against the objective function value. The next step is calibration, which is an effort to better parameterize a model to a given set of local conditions, thereby reducing prediction uncertainty [31][32][33][34]. Thirteen parameters are included in the calibration, as shown in Table 1. The fitted value for each parameter is obtained by iterating on the previous parameter range. The relative sensitivity analysis is performed for both study areas, Rio Guanajibo and Rio Grande de Añasco. The p value is used to measure the significance of the sensitivity: a larger p value indicates a less sensitive parameter, whereas a value close or equal to zero indicates a more sensitive one [35]. ALPHA_BNK, CN2, ALPHA_BF, GW_DELAY, and GW_REVAP are the most sensitive parameters in the Rio Guanajibo basin, whereas ALPHA_BNK, GW_DELAY, CN2, and GW_REVAP are the most sensitive parameters in the Rio Grande de Añasco basin. The most sensitive parameters drive the streamflow calibration, and further changes in the remaining parameter values have no major effect on streamflow. The parameter names and values are listed in Table 2.
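The regression-based global sensitivity described above can be illustrated with a minimal sketch. It does not reproduce SWAT-CUP: the sampler, the toy objective function, and the parameter ranges are all hypothetical, and only the t-statistic of each regression coefficient is computed. The p value would follow from comparing that statistic with Student's t distribution, a step omitted here to keep the example dependent on NumPy alone.

```python
import numpy as np

def latin_hypercube(n_samples, bounds, rng):
    """Latin hypercube sample: each parameter range is split into
    n_samples equal strata and every stratum is used exactly once."""
    bounds = np.asarray(bounds, dtype=float)  # shape (n_params, 2)
    n_params = bounds.shape[0]
    # Random position inside each stratum, on the unit interval.
    u = (np.arange(n_samples)[:, None]
         + rng.random((n_samples, n_params))) / n_samples
    # Shuffle the strata independently for every parameter.
    for j in range(n_params):
        u[:, j] = rng.permutation(u[:, j])
    return bounds[:, 0] + u * (bounds[:, 1] - bounds[:, 0])

def regression_t_stats(params, objective):
    """Regress the objective on the parameters (with an intercept) and
    return the t-statistic of each parameter coefficient."""
    n, k = params.shape
    X = np.column_stack([np.ones(n), params])
    beta, *_ = np.linalg.lstsq(X, objective, rcond=None)
    resid = objective - X @ beta
    sigma2 = resid @ resid / (n - k - 1)       # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)      # coefficient covariance
    return beta[1:] / np.sqrt(np.diag(cov)[1:])

# Toy objective: strongly driven by parameter 0, barely by parameter 1.
rng = np.random.default_rng(0)
samples = latin_hypercube(200, [(0.0, 1.0), (0.0, 1.0)], rng)
objective = 3.0 * samples[:, 0] + 0.01 * samples[:, 1] + rng.normal(0.0, 0.1, 200)
t_stats = regression_t_stats(samples, objective)
```

A large absolute t-statistic corresponds to a small p value and therefore to a more sensitive parameter, which is the ranking criterion used in this section.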

Calibration and Validation
The calibration is performed with 1000 simulations using data from January 2008 to December 2012, of which 2008 to 2009 is used as a warm-up period and is not considered in the evaluation of the model prediction. In the preliminary assessment, the simulated streamflow compared fairly well with the observed streamflow on a monthly basis. The range of each parameter is adjusted according to the sensitivity analysis to match observed and simulated streamflow. The calibration yielded coefficients of determination of R² = 0.69 and R² = 0.73 for the Rio Guanajibo and Rio Grande de Añasco watersheds, respectively, as shown in Figure 4. The temporal comparison between SWAT and AMSR2 satellite soil moisture is shown in Figure 5. In the Rio Guanajibo watershed, AMSR2 soil moisture is higher than SWAT soil moisture in some months, whereas in the Rio Grande de Añasco watershed the AMSR2 value is lower than SWAT soil moisture, which indicates a significant bias between AMSR2 and SWAT soil moisture. This difference is due to the change made in the SWAT soil moisture depth to match the AMSR2 soil moisture depth.
The same range of calibration parameters is used to perform validation for two different cases using data from the year 2013. In the first case, model-generated soil moisture values are used directly for the validation. The R² value estimated by the model is 0.60 for the Rio Guanajibo watershed and 0.58 for the Rio Grande de Añasco watershed, as shown in Figure 6. In the second case, AMSR2 values replace the model-generated values before being entered into the SWAT (HRU) layer. This produces another model run that is then input into SUFI-2 for validation. The R² values obtained with AMSR2 data are 0.58 and 0.57 for the two watersheds and are shown in Figure 6. The comparison of validation results reveals that both SWAT and AMSR2 soil moisture values have a low influence on the estimation of streamflow. Although a bias is seen in the soil moisture, its influence on streamflow is small because of the parameter sensitivity: soil moisture is less sensitive than the other parameters in both watersheds and therefore has less influence on streamflow. In some periods, the observed streamflow does not match the SWAT streamflow because of the uncertainty involved either in the model prediction or in climatic data such as precipitation and temperature, resulting in a reduced R² value. However, the estimated R² values signify an acceptable correlation between observed and estimated streamflow under varying land use, topography, and climatic conditions.

Streamflow Assessment
For both watersheds, the streamflow assessment is based on the entire simulation period. In Rio Guanajibo, the model estimates a mean annual rainfall of 2660.2 mm, PET of 1201.4 mm, evapotranspiration (ET) of 894.3 mm, water yield of 1702.57 mm, and a baseflow to total flow ratio of 71%. In Rio Grande de Añasco, the model estimates a mean annual rainfall of 2660.2 mm, PET of 1226.7 mm, ET of 1022.6 mm, water yield of 1496.47 mm, and a baseflow to total flow ratio of 78%. Both watersheds have the same soil type and land use classification, which resulted in high water yield values. The high ET values are due to the forest-dominated land use in the watersheds. Forest cover is lower in Rio Guanajibo than in Rio Grande de Añasco, which explains the lower ET value in the Rio Guanajibo watershed.

Conclusions
This study presents the applicability of SWAT for calibration and validation with AMSR2 soil moisture in parts of Puerto Rico. The methodology evaluates the assimilation of AMSR2 soil moisture in SWAT. However, there is still scope for improving the agreement between observed and simulated streamflow in the Rio Guanajibo and Rio Grande de Añasco watersheds. The sensitivity analysis for streamflow calibration showed variations within the parameter ranges that had been initialized for model calibration. It indicated that the parameters ALPHA_BNK, CN2, ALPHA_BF, GW_REVAP, and GW_DELAY are sensitive and have a great impact on streamflow. The SUFI-2 procedure tries to minimize the difference between observed and simulated streamflow data. Results could be further enhanced by running the model on daily weather data and satellite soil moisture data over a longer period of time. The overall effect of AMSR2 soil moisture in SWAT is negligible, suggesting that the use of AMSR2 soil moisture will be effective when soil moisture is the most sensitive parameter in SWAT. The soil moisture replacement technique improved model efficiency and considered a larger temporal variation. Finally, the results revealed that SWAT combined with AMSR2 soil moisture is capable of simulating streamflow and could be effectively applied in hydrological modeling.

Figure 1. Location of the Rio Guanajibo and Rio Grande de Añasco sub-basins in Puerto Rico, USA. The study area showing stream networks, outlet points, and basin boundary.

Figure 2. Input data for SWAT: (a) DEM, with elevation in the area ranging from 3 m to 1183 m; (b) LULC classifications in the area; (c) soil type.

Figure 3. Flow chart of methodology for streamflow measurement using SWAT in Puerto Rico, USA.

Table 1. SWAT parameters including initial and fitted values for streamflow calibration.

Note to Table 1: The changes in the parameter values are applied as follows: (r) relative change, where the existing parameter value is multiplied by (1 + fitted value); (v) replacement, where the existing parameter value is replaced by the fitted value; (a) absolute change, where the fitted value is added to the existing parameter value.
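The three ways of applying a fitted value (relative, replacement, and absolute) can be written out as a short sketch. The function name and the numeric values are hypothetical and are chosen only to show the arithmetic; they are not the fitted values of this study.

```python
def apply_change(existing, fitted, method):
    """Apply a fitted calibration value to an existing parameter value.

    method is "r" (relative), "v" (replace), or "a" (absolute).
    """
    if method == "r":
        # Relative change: scale the existing value by (1 + fitted).
        return existing * (1.0 + fitted)
    if method == "v":
        # Replacement: the fitted value becomes the parameter value.
        return fitted
    if method == "a":
        # Absolute change: add the fitted value to the existing value.
        return existing + fitted
    raise ValueError(f"unknown change method: {method!r}")

# Hypothetical examples: lower a curve number of 80 by 10 %,
# replace a recession constant, and add 5 days to a delay of 31 days.
cn2 = apply_change(80.0, -0.1, "r")
alpha_bf = apply_change(0.048, 0.5, "v")
gw_delay = apply_change(31.0, 5.0, "a")
```

The relative method preserves the spatial pattern of a parameter across HRUs, whereas replacement assigns the same value everywhere.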

Table 2. Global sensitivity analysis for 13 parameters in the Rio Guanajibo and Rio Grande de Añasco watersheds. A smaller p value indicates greater parameter sensitivity.