An LSWI-Based Method for Mapping Irrigated Areas in China Using Moderate-Resolution Satellite Data

: Accurate spatial information about irrigation is crucial to a variety of applications, such as water resources management, water exchange between the land surface and atmosphere, climate change, hydrological cycle, food security, and agricultural planning. Our study proposes a new method for extracting cropland irrigation information using statistical data, mean annual precipitation and Moderate Resolution Imaging Spectroradiometer (MODIS) land cover type data and surface reﬂectance data. The approach is based on comparing the land surface water index (LSWI) of cropland pixels to that of adjacent forest pixels with similar normalized di ﬀ erence vegetation index (NDVI). In our study, we validated the approach over mainland China with 612 reference samples (231 irrigated and 381 non-irrigated) and found the accuracy of 62.09%. Validation with statistical data also showed that our method explained 86.67 and 58.87% of the spatial variation in irrigated area at the provincial and prefecture levels, respectively. We further compared our new map to existing datasets of FAO / UF, IWMI, Zhu and statistical data, and found a good agreement with the irrigated area distribution from Zhu’s dataset. Results show that our method is an e ﬀ ective method apply to mapping irrigated regions and monitoring their yearly changes. Because the method does not depend on training samples, it can be easily repeated to other regions.


Introduction
Irrigation plays a vital role in increasing global grain output [1], especially in regions lacking fresh water. In the past 40 years, global crop production has doubled, cropland has increased by 12%, and irrigation has expanded [2]. Although only 18% of the world's arable land is irrigated [3], 40% of the global grain output comes from irrigated agriculture [4]. Irrigated agriculture accounts for the primary consumption in fresh water resources, using more than 70% of groundwater, rivers and lakes [5]. Irrigation information is very important for a wide range of studies, including water exchange between the atmosphere and the land surface [6,7], agriculture water requirements and supply [8], allocation of water resources between agriculture and ecosystems [9][10][11], hydrological processes [12], environmenta concerns such as soil quality depletion [13,14], agriculture-climate interactions and feedback [15,16]. Therefore, accurate information on irrigation distribution is an important step in monitoring cropland yields and water resources [17]. Accurate mapping of China's irrigated areas will not only help the Chinese government better assess future food and water security issues, but also help guide future policies aimed at mitigation or adaptation to climate change.

Preprocessing of the MODIS Data and Precipitation Data
The MODIS images were downloaded from website of the Land Processes Distributed Active Archive Center (LPDAAC). Land surface water index (LSWI) (bands 2 and 6) and Normalized Difference Vegetation Index (NDVI) (bands 1 and 2) were calculated using MODIS eight-day surface reflectance products (MOD09A1). The Savitzky-Golay filter was used to smooth the noise in the vegetation index time series, especially the noise related to atmospheric variability and cloud contamination [65]. Land cover types were determined based on the Global vegetation classification system of the International Geosphere-Biosphere Programme included in the MCD12Q1 products. The gridded daily precipitation dataset was from Yuan et al. (2015) [66]. This dataset was on the basis of the meteorological observations data of 735 meteorological stations in the National Climate Center of the China Meteorological Administration (CMA), and the gridded climate dataset (10 × 10 km) is interpolated by using the method of thin plate smoothing splines [66]. We resampled the precipitation dataset to match with the MODIS data.

Preprocessing of the MODIS Data and Precipitation Data
The MODIS images were downloaded from website of the Land Processes Distributed Active Archive Center (LPDAAC). Land surface water index (LSWI) (bands 2 and 6) and Normalized Difference Vegetation Index (NDVI) (bands 1 and 2) were calculated using MODIS eight-day surface reflectance products (MOD09A1). The Savitzky-Golay filter was used to smooth the noise in the vegetation index time series, especially the noise related to atmospheric variability and cloud contamination [65]. Land cover types were determined based on the Global vegetation classification system of the International Geosphere-Biosphere Programme included in the MCD12Q1 products. The gridded daily precipitation dataset was from Yuan et al. (2015) [66]. This dataset was on the basis of the meteorological observations data of 735 meteorological stations in the National Climate Center of the China Meteorological Administration (CMA), and the gridded climate dataset (10 × 10 km) is interpolated by using the method of thin plate smoothing splines [66]. We resampled the precipitation dataset to match with the MODIS data. Figure 1 shows the spatial distribution of the validation samples in mainland China. For our study we collected 612 validation samples that were from three sources. First, the soil moisture and crop growth dataset was provided by the China Meteorological Data Sharing Service System (CMDSSS). Sites were used from this source, were a total of 214 non-irrigated and 158 irrigated samples. Second, they were field surveys, which were carried out from July to August of 2016 in mainland China-76 non-irrigated samples and 46 irrigated samples. Third, it was collected by Google Earth, with the irrigation information on large, irrigated regions. We collected 612 evaluation samples (231 irrigated samples and 381 non-irrigated samples). At all sampling sites, we carried out a survey and recorded on cropland irrigation times, fertilizer use and pesticide, well depth, and crop yield.

Validation and Comparison
In our study, regression analysis is done using computer software (SPSS Statistics 17.0.1). A central part of the regression output of such packages is a summary of the foregoing information in an Analysis of Variance, or ANOVA table. We matched the ground records listed in Section 2.3 with remote sensing data and analyze them. In addition, we contrasted the estimated irrigation areas from our method with statistical data at prefecture and province scales. The data were collected from the National Bureau of Statistics of China [64]. We contrasted our new irrigation map with three widely used irrigation datasets: Zhu datasets, FAO/UF, and IWMI. We resampled our new irrigation map and IWMI datasets to the same spatial resolution with FAO/UF datasets.

Methodology
The principle of our method was to determine irrigated areas by comparing the canopy vegetation moisture index of a cropland site with the nearby natural vegetation (i.e., forests). Our method was based upon two fundamental assumptions: (1) temporal soil moisture of irrigated croplands is higher than in non-irrigated croplands or natural vegetation (i.e., forests) [23,67]. In general, at the non-irrigated croplands sites or natural vegetation, precipitation is the only soil moisture source, and soil moisture will keep decreasing between precipitation events, which may lead to relatively low levels of soil moisture. (2) Differences in moisture between croplands and adjacent forests decrease as the precipitation increases. The less precipitation a site receives, the larger the difference between cropland and adjacent forests in moisture, which is caused by increased artificial irrigation for the need of higher yield.
To validate the first assumption, LSWI was used as an indicator of soil moisture conditions in our study, and we calculated the mean value of LSWI (LSWI C ) during the growing season at each cropland pixel to show the condition of land surface moisture. The calculation formula of LSWI is: where ρ swir and ρ nir represent short wave infrared and red reflectances, respectively [68]. We calculated LSWI C from the 201st to the 241st day at all 612 investigated cropland samples, including 231 irrigated and 381 non-irrigated samples, and contrasted LSWI C with those of nearby forests (LSWI F ). Moreover, because there is a significant positive correlation relationship between LSWI and NDVI [22,69], we compared the LSWI of cropland and forest with the same NDVI values to rule out the influence of NDVI [68]. The land cover type data was from MCD12Q1. At a given cropland pixel, we selected the nearest 30 forest pixels to contrast, which the mean NDVI (NDVI F ) value is equal to NDVI of croplands (NDVI C ) (i.e., |NDVI C -NDVI F |< 0.05 × NDVI C ). Then, we computed the mean LSWI value (201st to 241st day) of nearby forests (LSWI F ), and contrasted LSWI C with all investigated cropland sites to test the first assumption.

of 15
To test our second assumption, in every province, we selected and ranked all pixels by descending the LSWI differences between LSWI F and LSWI C (LSWI Diff =LSWI C − LSWI F ). The pixels near the front in the sort had bigger LSWI Diff values, and hence a larger probability of being recognized as irrigated cropland pixel. We used statistical data at province-level to determine the LSWI Diff thresholds through calculate the number (N) of pixels with the biggest LSWI Diff as irrigated pixels. The LSWI Diff value of the N th is called the threshold (LSWI Diff0 ) for distinguishing the irrigated pixels and non-irrigated pixels. We used 16 provinces (half the number of Chinese provinces) to test the relationship between mean annual precipitation and LSWI Diff0 raised by our second assumption. In addition, we used the remaining 15 provinces (other half the number of Chinese provinces) to validate this relationship. The results for testing our two assumptions were showed in Figures 2 and 3. If our assumptions were verified that there is a significant correlation relationship between mean annual precipitation and LSWI Diff0 , we can deduce LSWI Diff0 for every province based on this correlation equation. After the LSWI Diff0 was calculated, we can judge the cropland pixels which LSWI Diff value is bigger than LSWI Diff0 as irrigated pixels. We examined the second assumption proposed in the methodology section based on statistics data of irrigation and the mean annual precipitation. A significant negative linear relationship was found between the mean annual precipitation and LSWIDiff0 thresholds (the LSWI difference between the nearby forest and irrigated cropland) (p < 0.01) (Figure 3). The LSWIDiff0 thresholds decreased with the increasing mean annual precipitation, indicating that the larger LSWI differences occurred in dry regions, which confirmed our second assumption. Hence, we used the regression to calculate the thresholds (LSWIDiff0) for all provinces using mean annual precipitation.   Based on our two assumption, we established the following six steps to select irrigated cropland pixels: Step (1) We calculated the mean LSWI (LSWI C ) and NDVI (NDVI C ) values of each cropland pixel from day 201 to day 241 of the year, which correspond to the peak of growing season of the year.
Step (3) Within a given province, we sorted all pixels with LSWI Diff by descending order. The pixels with larger LSWI Diff values were more likely to be irrigated. Statistical data of irrigation areas at province-level from the National Bureau of Statistics were used to calculate the LSWI Diff thresholds. The N pixels with the largest LSWI Diff were selected as irrigated; we determined whether the total area of the N pixels equated to the recorded statistical area of irrigation for a given city.
Step (4) We selected half of the provinces identified in Step 3 to determine the LSWI thresholds. We investigated the correlations between the LSWI Diff thresholds and mean annual precipitation, and found a significant negative linear relationship between LSWI Diff thresholds and mean annual precipitation.
Step (5) Using the appropriate LSWI Diff0 thresholds in other half provinces to verify our regression equation. Then, we used mean annual precipitation as input parameter for the regression equation to deduce the threshold value (LSWI Diff0 ) for every pixel. If LSWI Diff is greater than the deduced threshold (LSWI Diff0 ), then the pixel is considered to be irrigated.
Step (6) These five steps were repeated until the total pixels of mainland China were compared and identified. The pixels of all irrigated areas were combined to obtain the irrigated map of the study area.
We checked the first assumption that LSWI values of the irrigated cropland pixels were bigger than those of the nearby natural forest vegetation pixels. Therefore, we compared LSWI values of the investigated sampling sites with the nearby forests pixels that have the equivalent NDVI values with the cropland pixels. There was an obvious LSWI difference between the irrigated and non-irrigated cropland pixels compared to the nearby natural forest vegetation pixels (ANOVA test, p < 0.01) (Figure 2a).
We examined the second assumption proposed in the methodology section based on statistics data of irrigation and the mean annual precipitation. A significant negative linear relationship was found between the mean annual precipitation and LSWI Diff0 thresholds (the LSWI difference between the nearby forest and irrigated cropland) (p < 0.01) (Figure 3). The LSWI Diff0 thresholds decreased with the increasing mean annual precipitation, indicating that the larger LSWI differences occurred in dry regions, which confirmed our second assumption. Hence, we used the regression to calculate the thresholds (LSWI Diff0 ) for all provinces using mean annual precipitation.

Results
We used the statistical data to validate our method at the prefecture and the province levels. The results show that our new method can accurately identify irrigated areas compared to the statistical data at the two spatial scales (i.e., the prefecture and the provincial levels) (Figure 4). The coefficients of determination (R 2 ) between the estimated irrigated areas in our method and the statistical data areas were 0.59 ( Figure 4b) and 0.87 (Figure 4a) at the prefecture and province levels, respectively. Estimated irrigated areas were highly consistent with the statistical data over the 16 provinces used to build the regression relation between LSWI Diff0 and the mean annual precipitation, as shown in Figure 3a. Furthermore, the agreement between estimated areas and statistical data was also found over the remaining 16 provinces (Figure 4a), indicating a good relationship between irrigated areas estimated by our method and the statistical data.  Figure 4a represent the 16 provinces used for building the relationship between LSWI Diff0 and mean annual precipitation, and the red dots correspond to the remaining 15 provinces used for verification. Statistical area refers to the irrigated area in a given administrative unit (province or prefecture) that were obtained from the statistical yearbooks.
Based on the remote sensing canopy moisture index, we produced a map of spatial distribution of irrigated regions in mainland China ( Figure 5). Irrigated regions are distributed mainly in three alluvial plains (the middle and lower reaches of the Yangtze River Plain, the North China Plain, and the Northeast Plain) and valleys along five rivers (Yangtze River Basin, Liaohe River Basin, Yellow River Basin, Haihe River Basin, Huaihe River Basin, Songhua River Basin). The irrigated area of these three alluvial plains accounts for most of the irrigated area of China. The National Bureau of Statistics of China [64] reported 654,387 km 2 of irrigated area in China 2016, while we estimated 604,344 km 2 ( Table 1) [64]. The overall irrigated area in China was underestimated by 7.64%. provinces used to build the regression relation between LSWIDiff0 and the mean annual precipitation, as shown in Figure 3a. Furthermore, the agreement between estimated areas and statistical data was also found over the remaining 16 provinces (Figure 4a), indicating a good relationship between irrigated areas estimated by our method and the statistical data.  (Table 1) [64]. The overall irrigated area in China was underestimated by 7.64%.   To evaluate the accuracy of our method, we made comparisons with three other irrigation datasets: the irrigation map derived by Zhu et al. (2015) [70] (see method) and two global irrigation datasets (IWMI and FAO/UF). In total, the overall accuracies are 41.18, 58.98, 61.76, and 62.09% in FAO/UF, IWMI, Zhu's dataset and our map ( Table 2). To compare the datasets, we calculated the irrigated areas at the provincial level from four irrigation maps. Figure 6 indicates the scatter plots for inter-comparison of four maps and statistical data. The two scatter plots show that the map obtained from our study is more similar to Zhu's map than FAO/UF map and IWMI map and is more consistent with the statistical data of irrigated area at provincial level. To further compare similarity/dissimilarity among the four maps, we aggregated them at the prefecture-level (Figure 7). This figure shows that both FAO/UF map and our map show irrigation patterns with similar types of spatial distribution. The Hetao plain, the Northeast plain, the North China plain, and Northwestern plain of Xinjiang province have the largest irrigated fields in China. Southeastern and southwestern China have the smallest irrigated fields in China. The largest discrepancy between irrigation distribution derived by our map and the FAO/UF map is in Heilongjiang, Xinjiang, Jiangsu, Henan, and Liaoning Province. To further compare similarity/dissimilarity among the four maps, we aggregated them at the prefecture-level (Figure 7). This figure shows that both FAO/UF map and our map show irrigation patterns with similar types of spatial distribution. The Hetao plain, the Northeast plain, the North China plain, and Northwestern plain of Xinjiang province have the largest irrigated fields in China. Southeastern and southwestern China have the smallest irrigated fields in China. The largest discrepancy between irrigation distribution derived by our map and the FAO/UF map is in Heilongjiang, Xinjiang, Jiangsu, Henan, and Liaoning Province.

Discussion
Mapping irrigation distribution is challenging because the water source for croplands may be represented by precipitation, irrigation, or both. Based on phenological characteristics, this research developed a new method to identify irrigation by analyzing MODIS, precipitation and statistical data. In our study, natural vegetation forest was used as a reference to compare the LSWI of cropland and the nearby forests with equal NDVI. Our method is especially suitable for China, where nearly 70% of the land area are mountains, plateaus, and hills. We discovered the regression coefficient relationship between the LSWI Diff0 thresholds (the difference in LSWI between nearby forests and irrigated cropland) and mean annual precipitation. Our method is based on this relationship to identify irrigated regions and the regression coefficients we used in mainland China are shown in Figure 3. These regression coefficients may change with different climatic backgrounds. Therefore, if the method is extended to other regions, the relationship between LSWI Diff0 and mean annual precipitation needs to be redefined.
However, there are also two weaknesses in our method. First, the method does not take into account the differences in water storage between forests and croplands. In general, the nearby forests turn green earlier than the fields [71,72]. Thus, there may be differences in soil moisture between cropland and forest, since forests that were reforested earlier can store more precipitation in the soil [73,74]. In addition, previous studies have shown that plants in different ecosystems have different transpiration capacities and water retention capacities under the same leaf area index (LAI) [75,76]. The differences of transpiration between croplands and forest plants may lead to the differences in soil moisture.
Second, it is clear that our approach is most likely to perform better in arid regions than in humid regions, where rainfall is plentiful and the supplemental irrigation may be the main form of irrigation [70]. Supplemental irrigation usually takes place in a short period of crop growth and development, and temporal spectral differences of the additional soil moisture brought by irrigation are not obvious. However, as irrigation contributes to higher crop yields, irrigation requirements are closely related to climatic conditions. Our method used precipitation data and time series of NDVI and LSWI to estimate the presence of irrigation in each cropland pixel. We emphasized the level of mean annual precipitation, which aims at reflecting irrigation needs, and average LSWI conditions, to reflect the moisture index differences between irrigated cropland and nearby non-irrigated forests. In humid regions, our method may perform worse than the traditional classification methods, but it still provides an alternative irrigation assessment method.
Our method presents some advantages. Our method does not rely on field investigation and is easily scaled and applied to a larger region. In our method, irrigated pixels can be automatically identified just through the MODIS land-cover product, avoiding visual interpretation [70,77,78] of high-resolution images [79] and much field work [80,81]. This advantage may substantially improve the repeatability and applicability of our method. Standardization of the procedure allows it to be easily repeated in other similar areas all over the world and used to map real-time irrigation areas as long as the latest remote sensing data can be obtained.

Conclusions
In our study, we developed a new method for identifying irrigated areas in mainland China, based on the regression between the mean annual precipitation and the LSWI difference between cropland and the adjacent forest pixels, which were identified from the MODIS land cover datasets. We calculated the regression relation between the threshold of irrigation and mean annual precipitation at provincial levels. Validation of the irrigation map derived by our method and statistical data at provincial and prefecture levels indicated that the approach was reliable. We further collected 612 reference samples (231 irrigated and 381 non-irrigated samples) to validate our new irrigation map for mainland China at the pixel level. The overall accuracy of our new map is 62.09%. Based on statistical data, validation indicated that our method explained 86.67 and 58.87% of the spatial variation in irrigated areas at the provincial and prefecture levels, respectively. Results suggest that our method is an effective and promising tool for mapping irrigated regions. Moreover, the accurate mapping of irrigated regions, especially in arid and semi-arid regions, will be beneficial for studying the effects of human activities on agroecosystems and their relationships with global changes.