An Application of Improved MODIS-Based Potential Evapotranspiration Estimates in a Humid Tropic Brantas Watershed—Implications for Agricultural Water Management

: The reliance on native MODIS-16 PET potential evapotranspiration (PET) in scarce-data-driven areas is growing in support among ecohydrological studies, yet information about its performance is limited or unknown as validation studies are mostly concentrated in developed countries. This study aimed to assess its performance at the monthly level using four ground measurements in a tropical watershed system with complex topography, applying a machine learning artiﬁcial neural network (ANN) to improve the estimates, and using the ANN-adjusted MODIS-16 PET to characterize the spatio-temporal patterns of PET in the Brantas watershed, as well as to understand the monthly patterns of water deﬁciency in areas under eight different vegetation covers. The results showed that the native MODIS-16 PET experienced overestimation with an RMSE of 37–66 mm/month and NRSME of up to 33%. The performance decreased in drier periods. The ANN-based adjustment using only one variable showed improved estimates with a reduction of RSME to only 14 mm and lower than 10% NRMSE. Sari-temporal patterns of PET in the Brantas watershed showed that the PET characteristics were not uniform. The southern part of the Brantas watershed has areas with relatively lower PET that are, thus, more prone to water deﬁciency. Complex topography and climate gradients within the watershed apparently became the multi-controllers of PET variations. The difference in vegetation cover also inﬂuenced the magnitudes of water deﬁciency. ensembles


Introduction
Evapotranspiration is frequently considered to be one of climatic variables greatly influencing biodiversity, geographical ecology, climatic hazards, and land productivity [1][2][3][4]. In high rainfall-energy regions such as the tropics, evapotranspiration is often examined to better understand hydrological responses, crop and irrigation management, and tropical forest characteristics [5][6][7]. Due to this importance, evapotranspiration has been quantified in varying measurement types such as evapotranspiration reference (ETo), actual evapotranspiration (AET), and potential evapotranspiration (PET). PET refers to the amount of water moving to the atmosphere by land or water, representing the combined evapotranspiration from varying components-soil and plants-and signifying the reversed process of precipitation [8]. The initial purpose of PET development was to indicate the maximum amount of the evaporated water, however, in its recent development, PET has experienced varying adjustment and algorithms. Recent comprehensive literature about PET and its history, development, and algorithms is exemplified in a study by Xiang et al., 2020 [9]. In earlier studies, PET has been assessed primarily for meteorological and hydrological purposes [10][11][12]. In recent periods, the application has been expanded to other areas such as crop management, drought monitoring, and the use artificial intelligence for estimating remotely sensed PET data [13][14][15][16].
Due to this importance, climatic variable monitoring related to potential evapotranspiration has become standard in field climatic stations. The potential evapotranspiration data is traditionally calculated using extensive climatic measurements obtained from these stations [17,18]. However, such parameters are often unavailable mostly due to costly sensors, high operational expenses, and lack of skillful manpower behind the station management. Consequently, coverage of the evapotranspiration data is highly confined by coverage of weather stations measuring all parameters needed. In order to obtain PET for the unmonitored areas, estimation has commonly been applied through varying methods such as interpolation from weather stations such as arithmetic mean, Thiessen polygon, and geo-statistics [19,20]. Unfortunately, these approaches require extensive and representative weather stations observation points. Failure to address this requirement means estimates often suffer large errors and high uncertainty.
Efforts have been made to provide more accurate evapotranspiration datasets. Advancement in machine learning have been extended to provide better estimates of evapotranspiration at varying scales from hourly to monthly [21][22][23]. Some of early studies employed either feed-forward or feed-back propagation modes with pure artificial neural network (ANN) models [22,24,25]. Later studies advanced the algorithms by using more complex models and hybrid approaches [26][27][28]. Based on the variables used in the models, there have been variations among studies. Some studies involved a number of variables mostly related to climatic variables such solar radiation, temperature, relative humidity, heat flux, and wind speed [26,29], which are generally unavailable in most areas. These models are considered to have more physical understanding due to the relationship of climatic variables to the evapotranspiration process. However, some others relied on simpler variables for practicality [30,31]. These might not necessarily link to the physical process as the nature of a neural network does not require a physical understanding. Despite varying approaches used in these studies, one commonality is the resulting high performance of the models, confirming the potential use of neural networks in estimation of evapotranspiration.
The advancement of remote-sensing technologies has sparked the increasingly operational satellite data offering varying climatic parameters of various scales. Satellite-based estimates offer a potential for providing data needed in support of different applications such as drought monitoring, crop management, and water-balance assessment [15,32,33]. However, in spite of the benefit in obtaining complete data coverage, satellite-based PET estimates are subject to a degree of error. An understanding of estimate errors is essential in that it allows users to recognize the most suitable products and incorporate the errors in the sequential analysis [34,35].
Accurate PET data are important for supporting agricultural and water-resource planning as well as for understanding the impacts of climate on the earth systems. There have been varying PET gridded data available globally at varying scales, such as the monthly Global Land Evaporation Amsterdam model GLEAM, at 0.25 • resolution [36]; the process-based land surface evapotranspiration/heat flux, P-LSH at 0.5 • [37] and the 8-day and monthly Moderate Resolution Imaging Spectroradiometer Evapotranspiration Data Set (MODIS-16) [38], available at 500m resolution. Today, MODIS-based PET is the only available dataset with fine spatial and temporal resolution suitable for small-scale applications. Compared to other climatic parameters, evaluation of satellite-based potential evapotranspiration products is far more limited. Several studies exemplify the work on assessing MODIS-based evapotranspiration such as in USA, Australia, Europe, Africa, and East Asia [39][40][41][42]. These studies demonstrate that performance of MODIS can vary by site with differing potential causes. This finding triggers the need of a representative evaluation assessment in varying geographic areas to improve the usability of remote-sensing-based PET products. To the best of our knowledge, at this point we cannot find peer-reviewed evaluation studies of MODIS evapotranspiration over humid region Indonesia. Most Asiabased studies are concentrated in China, which is known for its extensive weather data.
Indonesia is a large archipelagic country with the typical influence of monsoon in most regions. The presence of massive mountainous ranges and tropical forests, together with intensified rainfall energy, marks the importance of climatic setting to the country. Despite its declining trend due to industry development, agriculture remains the primary sector and land-use [43] with varying water resources from irrigation to dominantly rainfed farming [44]. The alarming climate uncertainty has increased the vulnerability of the rainfed farming system [45]. Given this, information on evapotranspiration characteristics has become more urgently required. Such quantification of PET spatio-temporal patterns in Indonesia's agricultural farming systems has rarely been examined due to the lack of available weather data with sufficient parameters for estimating PET magnitudes. Increasingly, reports on water-resource-related issues have called for representative climatic data including PET with sufficient spatio-temporal coverage for varying purposes such as drought modelling, crop management, and water supply assessment. This study was therefore aimed at: 1.
validating the monthly satellite-based MODIS potential evapotranspiration using weather data; 2.
applying an artificial neural network (ANN) to improve estimation of monthly PET; 3.
evaluating the adjusted monthly MODIS-16 PET in characterizing the spatio-temporal pattern of watershed water budget; 4.
applying the adjusted monthly MODIS-16 PET to describe patterns of PET under differing agricultural land-uses.

Study Area
As one of the primary watersheds in Indonesia, Brantas has gained important recognition for its economic and ecological functions. The more than 20 million dwellers in Brantas are approximately 15% of total population of the most densely populated province East Java [46]. Laid within the tropical region, Brantas exhibits a significant monsoonal influence with long-term annual average rainfall ranging from 1200 to 3600 mm with evident orographic influences. Seven mountain ranges mark the topographic complexity within the watershed to create an elevation gradient from 0 m above sea level to 3600 m asl. Brantas, in spite of its increasing urbanization trends, remains a watershed dominated by agricultural land-use. Data from global land-uses show that in 2019, more than 35% of the watershed was occupied by agricultural land-from wetland (rice farming) to dryland farming systems such as crops, trees, gardens, horticulture, and production forests. In recent decades, Brantas has undergone a decline in its agricultural roles in terms of production, labor, and share of total domestic products from agricultural sectors [47]. Recently, in Java, there have been increasing reports of issues related to agricultural productivity ranging from elevated runoff to intensified drought occurrence [48,49]. Brantas is controlled by varying physical factors-from climatic to geomorphological ( Figure 1). With dominant landforms of volcanic from the middle to upper watershed, to alluvial soil on the lowland, and a minor portion of karst and fluvio-marine, Brantas exhibits varying soil classes ranging from very fine soils to young coarse soils.

Data Acquisition and Accuracy Assessment
We derived monthly MODIS-16 PET using a Google Engine Platform. MODIS-16 PET is one of MODIS-16 PET products that come in varying bands and 500 m resolution. We focused on the monthly products to represent water-balance characteristics. The MODIS-16 PET gridded dataset refers to a potential evapotranspiration measured by adopting the Penman-Monteith formula, which was the first globally available ET dataset regularly spanning from 2001 and which is causing increasing concern in scientific community. It was generated under all sky conditions for seamless spatio-temporal ET data over vegetated areas [41,50]. The MODIS-16 PET algorithm was revised from a previous one based on the Penman-Monteith formula, as below: where PET i , PET o are the observed PET by the weather station and estimated PET by MODIS-16 PET, and R 2 refers to the coefficient of determination of MODIS-16 PET to ground PET.

Setting the ANN Architecture
In recent years, efforts to improve evapotranspiration estimates have relied on physical models or only the use of machine learning [16,51,52]. The first is often hindered by the needs of extensive data parameters not easily available, while the latter might be subject to inability to provide a physical understanding and can fail due to anomalies or extreme condition. The MODIS-16 PET was developed by incorporating varying parameters, which, with its algorithm development, consider the vegetative structure such as plant canopy, as described [50]. Recent studies attempted to improve the quality by incorporating varying parameters such as leaf area index, wind, and other climatic parameters such as temperature, solar radiation, and wind [52,53], and some studies concluded that the use of remote-sensing data can suit the need of regional models in scarce-data-driven areas [54]. Initially, we included four parameters-elevation, temperature, and normalized difference vegetation index as well as the MODIS-16 PET as the input layer and ground-PET as the output layer with the architecture, as below ( Figure 2): We tested the feed-backward propagation method with two, three, and four hidden layers. We selected the linear model as the training function and used the sigmoid as the transfer function. With respect to the proportion of data, there was no strict agreement about the allocation for training, testing, or validation. In this study, we evaluated differing compositions-60:20:20 and 70:15:15 training testing-as these two have apparently been commonly used in varying studies [16,55,56]. We used r, RMSE, and NRMSE as the performance evaluation. The selected model was taken from the 70:15:15 due to its slightly better performance than that of the 60:20:20 dataset. We applied the Levenberg-Marquardt (LM) training algorithm, a second-order optimization technique that has been extensively employed in evapotranspiration modeling using artificial neural networks due to its wellknown fast convergence which means that it needs lesser learning cycles [24]. Two hidden layers were selected in the final due to better accuracy results, and the stopping criterion was when the validation performance failed to improve. We selected the best performance and applied the validated architecture to the monthly MODIS-16 PET data.

Characterizing Water-Balance Patterns at Watershed Level
The monthly ANN-adjusted PET were then inputted to map the spatial pattern of monthly water balance. The monthly water balance was simply derived from the incorporation of monthly rainfall for the corresponding MODIS-16 PET pixels. We adopted FAO's simplified approach [15] to indicate the climatic water balance (CWB) as the indicator of water loss by drought conditions that can be summarized as: where CWB i refers to the i-month climatic water balance as a result of total i-month precipitation and total i-month potential evapotranspiration as estimated by MODIS-16 PET. The i-month precipitation was calculated from screened daily filtered rainfall from 201 stations available across the watershed. To obtain gridded monthly data, the monthly total rainfall was estimated from the interpolated data from 201 stations into 500 m-grid rainfall. To map the duration of the water-loss-affected periods, we calculated the CWBi for 12 months, and applied a 0-1 code for every pixel, where 1 refers to a condition where PETi is higher than Pi or climatically deficit, and 0 is the non-deficit condition. We summed up the 12 months within a year, and the summation result indicated the number of months where a pixel suffered from water deficit.
To understand the trends of PET in the study area for the last 20 years, we aggregated the PET into annual PET to obtain total PET for every year, which resulted in 20 images of total annual PET. The images were then stacked for further processing using the Mann-Kendall time-series raster analysis of images, treating every single pixel as a representation of a trend in a particular point. The Mann-Kendall test is a non-parametric test and a frequently used method for assessing trends in time-series remote-sensing raster datasets [57,58]. This test in combination with Sen's slope estimator allows more precise detection of spatial variability of such trends and has been widely used to monitor the dynamics of ecological variables [59][60][61]. The computation was run using R platform, modifying the protocols for time-series raster image analysis from Pironkova et al., 2018 [58]. The per-pixel time-series analysis outputs included Mann-Kendall tau statistic raster, a corresponding p-value raster, and a Sen's slope raster signifying the time-series features of each pixel [58]. The Mann-Kendall tau pixel values range from −1 to +1, meaning an upward trend for the positive values and a downward trend for the negative ones. In addition, when the trend detected appeared to be linear, the strength of the trend was estimated by a Sen's slope value. This serves as an alternative to the parametric regression line and quantifies the rate of change for every pixel [58]. Numerically, the Sen's slope is defined as the median value of all possible slopes between two observation points in the given period as follows: where Q is the slope, x j and x i are the values at time j and i, and N is the number of periods, showing the number of observations per year [61]. Likewise, the p-value raster indicates the probability estimates and the trend's significance-the smaller the value, the higher the statistical confidence.

Characterizing Water-Balance Patterns under Five Different Agricultural Land-Uses
As described, Brantas, despite its increasing trend in urbanization, is a watershed dominated by agricultural land-uses. As of today, there has been limited information about water-balance patterns for typical agricultural land-use in the Brantas watershed. We selected five relatively large areas (~2 km 2 ) of homogeneous agricultural land-use in verified locations-teak complex, rice-field, sugarcane, Taman Hutan Raya Forest, and bromo-savannah shrubs. The last two sites are ecologically important land-uses that are prone to drought and forest fires [54,62]. These land-uses represent typical agricultural land-use and important ecological land-use types which are beyond the coverage of the nearest weather stations. Long-term monthly average PET and precipitation (2001-2020) were calculated, and comparative assessment was performed.

MOD-16 ET Validation Performance and an ANN-Model for Improving PET Estimation in the Brantas Watershed
Utilizing all monthly data from six weather stations, the performance of MODIS-16 PET is summarized in Figure 3 and Table 1. Overall, the validation results were satisfactory. In both dry and wet seasons, the performance was relatively similar, resulting in normalized RMSE (NRMSE) from 20 to 33% and RMSE ranging from 37 to 66 mm. The performance was poorer in the dry season than in the wet season. Figure 3A shows the distribution density of monthly PET from two sources revealing the potential source of discrepancy. In both dry and wet seasons, the native MODIS-16 PET experienced considerable errors, mostly due to inability in estimating the PET lower than 150 mm. Yet, at higher monthly PET magnitudes, the native MODIS-16 PET tended to be overestimated. Figure 3B shows the larger deviation in the 1:1 scatter plot after the PET exceeded 150 mm.   The use of machine learning through an ANN model has improved the estimation of MODIS-16 PET. Using only the native MODIS-16 PET and average 8-day temperature from MODIS's land surface temperature, the performance rose to r value ranges of 0.92-0.94 ( Figure 4). This was the best performance achieved with a sampling design of 70:15:15. The other models employing additional variables such as NDVI and rainfall were unable to obtain the best performance quality.  Table 2 further summarizes the performance measures of each dataset. All datasets showed good performance with MAE ranges from 11 to 13 mm at the monthly level, with NRSME values all below 10% of the given data range. This result supports the opportunity to use the modified gridded data for spatio-temporal assessment at a larger scale, i.e., at a watershed scale. We examined the monthly pattern of PET from each weather station and compared these with the native MODIS-16 PET and the modified MODIS-16 PET resulting from the ANN above. Figure 5 shows that the monthly variation of the PET pattern was captured well by the ANN-modified MODIS-16 PET (PET_MOD ANN), which closely aligned with the PET from weather stations (PET measured). In four locations, the PET_MOD ANN values were in a good agreement with the measured PET and were able to suppress the deviation that became larger in drier months (March to October).

Spatio-Temporal Patterns of Watershed Deficiency 2001-2020
The validated ANN architecture was then employed to all gridded PET data from MODIS-16 PET from 2001 to 2020 to examine the spatio-temporal variations of monthly PET within the Brantas watershed. It appears from Figure 6 that the watershed had been experiencing a fluctuated condition regarding the component of water-balance deficiency.
The inter-annual variations of total PET were observed from the 2001 to 2020 datasets. The CWB maps from 2001 to 2020 indicate that in each year, the watershed might suffer a deficiency state where the total P (precipitation) is lower than the total PET. Year 2012 was the year where the watershed experienced a prolonged deficiency as marked by most orange to red (red means an area experiencing 12-month deficiency) areas within the watershed. The condition was almost uniform with less severe areas being mostly the lowland areas. In 2019, almost all areas within the watershed experienced shorter deficiency as marked by the blue to green pixels. The spatial pattern of the water deficiency maps in general shows that mid-to-south areas were more susceptible to longer water deficiency periods. Assessment of long-term datasets reveals the spatial patterns of PET characteristics in the Brantas watershed. The PET variations in the Brantas watershed are spatially linked to the topographic setting (Figure 7). The areas in mountain regions in Figure 7A-C are areas with relatively lowest PET for all categories-annual and seasonal periods. These patterns suggest the controls of topography and other climate variables in the dynamics of PET. Brantas is a watershed with complex topography. Seven mountain complexes within the watershed have shaped the gradients in elevations and orographic precipitations. This, in part, explains the spatial variations in PET. Figure 7B,C signify the seasonal pattern of PET where dry season is marked with elevated PET rates. The monthly average of PET in dry season was nearly 30% higher than the PET in wet season. This can be attributed to the role of elevated temperature during the dry season leading to the intensified evapotranspiration. Likewise, the intensified clouds developed in the wet season suggest elevated humidity and suppression of the intensity of water leaving to the atmosphere.

Dynamics of Evapotranspiration and Watershed Deficiency 2001-2020
The long-term PET as observed by the Mann-Kendall test in Figure 8 shows the spatial variability of trend directions and trend significance during the last 20 years. Figure 8A shows that most pixels pose positive Sen's slope values (upward trends). The areas with negative Sen's slope were mostly found in the mid-to southern part of the watershed, which are climatically wetter with higher precipitation. The significance values of such trends in Figure 8B shows that, for the last 20 years, only a small part of the region (approximately 30% of the area) observed statistically significant trend (lower p-value ≤ 0.05) in evapotranspiration within the watershed. Annual classification of water deficiency periods from 2001 to 2020 was shown in Figure 9. Here, we mapped the trends of areas experiencing the water deficiency at differing periods. Overall, a short deficiency period (3-6 months) dominated the watershed in early periods (2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008) and (2014)(2015)(2016)(2017)(2018)(2019)(2020)(2021). Only in the years 2009-2013 was the watershed dominated by a prolonged deficiency (6-9 months). It should be noted that the number of pixels used in this analysis was affected by the number of valid pixels, which might have affected the calculation of the proportion of areas affected.

Monthly Water-Balance Patterns under Nine Different Vegetated Land-Uses
The Brantas watershed has, for a long time, been an agriculturally dominated watershed. Despite its increasing urbanization, the role of agricultural lands is still important. Currently, information about PET patterns that are crop-specific is unavailable. Figure 10 explores the use of the ANN-modified PET to examine the monthly patterns of water balance from varying vegetated areas. Overall, the deficit periods lasted for 4 up to 6 months, with PET usually peaks in September. Under eight different covers, there were no distinct patterns of water deficiency. Month 5 (May) was usually the start of deficiency, and this ended in month 10. The discrepancies among the covers were mainly due to the differing rain patterns and magnitudes. For example, the dryland forest (DRF) experienced the smallest area under intersection between PET and P, indicating the least severe deficiency. This was similar to the areas under high tree cover (industrial estate/HTI). On the other hand, the deficit curve areas were generally larger on the sites under sparse vegetation such as dryland agriculture (mixed crops), horticulture (vegetables), savanna, and teak. The last two covers represent sparse vegetation and some deciduous periods.
The sites selected in Figure 10 were sites with a long history of land-use types. The land-uses are historically permanent and some of these have multiple growing seasons such as those horticulture, rice-farming, and dryland farming. The monthly patterns of water balance in Figure 10 could be beneficial for identifying the critical periods of water supply, and hence be supporting to the implementation of scheduled irrigation programs.

Discussion
The performance of the native MODIS-16 PET data in the study area was somewhat unsatisfactory with monthly RMSE of 37 to 54 mm/month. This was below the values in several studies examining the monthly MOD-16 PET, for example, the validated MOD-16 PET in China with a monthly RMSE of 26 mm/month [63]. The use of machine learning through ANN-modeling was shown in this study to provide improved estimates of the PET values with the decrease in RMSE. The improved estimates are considered very good in comparison with other ANN-based estimates in other places such as in China with RMSE of 18-19 mm [16]. The fact that the use of NDVI did not result in as noticeable an improvement as temperature might suggest that the MOD-16 PET algorithm is not as sensitive to variation in vegetative cover in tropical regions as the temperature. The use of fewer explanatory variables in this study was able to produce satisfactory results which was beneficial for the data-stricken areas in tropical regions. With this result, the next challenge is how to downgrade such estimates to be suitable for assessment at the field level.
In four locations, the native MOD-16 PET show overestimation, which became larger in drier months. This can be attributed to two potential causes. The first is common in the tropics-dry seasons exhibit hotter days and stimulate elevated evapotranspiration. The second is potentially due to the nature of the MODIS-16 PET algorithm. The monthly PET from MODIS-16 PET was derived using Penman-Monteith-based algorithm supported by the use of MODIS-LAI [50]. A study in East Asia [64] highlighted the role of no-to-small cloud fraction areas in the overestimation of MODIS-16 ET, due to the overestimation of LAI in these areas. This overestimation is more severe in tropical regions than in higher latitude areas. The use of machine learning in improving PET estimates and the use of fewer variables with finer spatial resolution could make the gridded MODIS-16 PET more usable for more detailed analysis.
The annual spatial pattern of CWB using the adjusted MODIS-16 PET and gridded ground rainfall data shows that the BRB is not climatically uniform. This suggests that controls on water deficiency can vary across the watershed. The southern part of BRB is generally more prone to such deficiency. Climatically, these regions are also relatively drier areas. The southern parts are also marked by landform transitions resulting in convex areas where, once water is evaporated, the water in the surroundings is dragged to these sites, leaving a lower PET. In this study, the modified PET was used to examine the monthly pattern of areas under eight different vegetated covers in 2020. The eight areas in Figure 10 were selected to be areas of the permanent land-use/land-cover type. The differences in deficiency pattern were not significant among the land-use/land-cover types. In ricefarming systems, the monthly pattern of PET is beneficial in helping allocating water irrigation. With three growing seasons, the needs for water supply become more critical. Identification of spatial water-deficiency patterns helps prioritizing the water allocation to the areas where cropping patterns are more frequent.

Conclusions
The native MODIS-16 PET data are useful to provide spatio-temporal gridded information. With this dataset, assessment of water requirements at a watershed regional level could be made more accurate. However, ground validation from four tropical sites shows that MODIS-16 PET experience overestimation, which is more severe in drier months. The use of ANN machine learning with only one temperature variable with comparable spatial resolution could improve the estimation and reduced the RMSE in all sites. Further use of the adjusted gridded PET could help identify the spatial pattern of PET and water-related deficiencies. Temporal patterns of PET at eight differing vegetated covers revealed the differences in water deficiency periods. In this study, overall performance of monthly evapotranspiration was satisfactory, however, improvement of the model should be considered for future directions. This could include efforts for downscaling to daily or weekly temporal resolution, and to a finer spatial resolution. In terms of the algorithm development, the use of a more advanced neural network modeling approach should be considered, provided that there is advancement in computation resources. All these would be beneficial in leveraging the potential of publicly available remote-sensing data for supporting water resources, especially in scarce-data-driven regions.

Conflicts of Interest:
The authors declare no conflict of interest.