Entropy-Based Estimation of Moisture Content of the Top 10-m Unsaturated Soil for the Badain Jaran Desert in Northwestern China

Estimation of soil moisture distribution in desert regions is challenged by the deep unsaturated zone and the extreme natural environment. In this study, an entropy-based method, consisting of information entropy, the principle of maximum entropy (PME), solutions to PME with constraints, and the determination of parameters, is used to estimate the soil moisture distribution in the 10 m deep vadose zone of a desert region. First, the soil moisture distribution is described as a scaled probability density function (PDF), which is solved by PME with the constraints of normalization, known arithmetic mean, and known geometric mean; the solution is the general form of the gamma distribution. A constant arithmetic mean is justified by the stable average recharge rate at the thousand-year scale, and an approximately constant geometric mean is justified by the low flow rate (about 1 cm per year). Next, the parameters of the scaled gamma PDF are determined from local environmental factors such as terrain and vegetation: multivariate linear equations are established to quantify the relationship between the parameters and the environmental factors on the basis of nineteen random soil moisture depth profiles, through the application of fuzzy mathematics. Finally, the accuracy is tested using the correlation coefficient (CC) and the relative error. The method yields a CC larger than 0.9 in more than half of the profiles and larger than 0.8 in most; the relative errors are less than 30% in most profiles and can be lower than 15% when the parameters are fitted appropriately. This study therefore provides an alternative method to estimate the soil moisture distribution in the top 0–10 m layer of the Badain Jaran Desert from local terrain and vegetation factors instead of drilling sand samples. The method would be useful in desert regions with extreme natural conditions, since these environmental factors can be obtained from remote sensing data. Meanwhile, it should be borne in mind that the method is challenged in humid regions, where more intensive and frequent precipitation and denser vegetation cover make the system much more complex.


Introduction
Soil moisture is a crucial factor in hydrologic, geomorphic, and pedogenic processes [1,2]. Moreover, it influences the partition of available energy between latent and sensible heat, and the magnitude of the net radiation absorbed by the soil surface [3]. Under extremely dry conditions, both the number and the size of perennial plant species are limited by the availability of soil water [4,5].
Hence, it is necessary to know the soil moisture content in desert regions. A wide range of methods are used to measure soil moisture, including mass soil water content by the drying method, and volumetric moisture content by neutron probes, time domain reflectometry (TDR) probes, and remote sensing. However, these methods are challenged when measuring soil moisture in deep desert vadose zones. Soil moisture estimates are obtained within the top 5 cm by passive remote sensing [6], at somewhat greater depths by active sensors, and to a depth of tens of centimeters by TDR, whereas the vadose zone in desert regions is usually tens or hundreds of meters deep. Measuring the mass moisture content is limited by the large amount of work involved as well as by the difficult natural conditions.
In recent years, entropy-based methods have been used to estimate soil moisture in the vadose zone. Information content and complexity were used in the simulation of soil water fluxes [7]. Al-Hamdan and Cruise estimated the soil moisture profile to a depth of nearly 50 cm on the basis of PME [8]. Some infiltration equations were derived from the principle of maximum entropy (PME) with the Shannon entropy, or information entropy [9], and soil moisture movement was described on the basis of PME with the Tsallis entropy [10]. The soil moisture distribution was also estimated in an irrigated field in North Central Alabama, USA, based on PME [11]. These studies assumed that the vertical soil moisture distribution curve is a scaled probability density function (PDF) that follows PME. The PDF was then solved using Lagrange multipliers with the constraints of normalization and known mean.
The soil water distribution in a vertical profile usually exhibits three phases: just after rainfall (wet), a short time after rainfall, and a long time after rainfall (dry) [9], as illustrated in Figure 1a. However, different characteristics are observed in the 10 m deep vadose zone of desert regions. Generally, soil moisture profiles in desert areas should be regarded as being in the dry phase because of the intense potential evaporation and scarce precipitation. Previous studies revealed that soil moisture profiles in the deep vadose zone of the Badain Jaran Desert in northwestern China fall into three subcategories: an increase followed by a decrease, an increase followed by a stable trend, and profiles affected by the groundwater level [12,13], as shown in Figure 1b. The first two types are the typical moisture distributions in this region, corresponding to the shallow-soil distributions a short time and a long time after rainfall, respectively. Therefore, these soil moisture distributions could be calculated by a method based on entropy theory. Singh considered the soil moisture distribution a short time after rainfall (the lognormal-type distribution) to consist of two parts: the dry case and the wet case [10]. That solution is constrained only by limited known information about the PDF (normalization and known mean), and the result is an exponentially shaped curve. Further information is required to solve for a soil moisture distribution curve that resembles a two-parameter probability distribution, such as the gamma, lognormal, or normal distribution.
Fortunately, in desert regions the soil moisture distribution is a record of both precipitation and recharge history [14]. For example, a 10 m deep soil moisture profile is an archive of climate change over the past 1000 years [15]. Hence, for a given sand texture and given local environmental factors, the total soil moisture in the deep vadose zone of desert regions can be regarded as constant, since the difference between precipitation and evaporation is almost constant at the thousand-year scale. In addition, the geometric mean of the soil moisture curve is approximately constant because the flow rate is extremely slow in desert regions; for example, soil water recharge to groundwater is about 1–1.3 mm per year in the Badain Jaran Desert [16]. Hence, the PDF-based distribution of soil moisture can be treated as a whole instead of as two parts in this study area. Moreover, the soil moisture distribution is determined by the soil texture, local environmental factors such as terrain and vegetation, and climatic factors [17]. In the deep desert vadose zone, the texture is determined by sand deposition and can be considered constant at a small spatial scale; the vegetation is constrained by the soil moisture content in the top ten meters, or even down to dozens of meters; and the climatic conditions are relatively uniform and mostly affected by the terrain. Hence, the system controlling soil moisture in desert regions is simpler than that in humid regions, and establishing the relationship between the soil moisture distribution and local surface controlling factors should be easier to achieve. A good correlation between the soil moisture distribution and local terrain and vegetation was revealed by geostatistical analysis in the Badain Jaran Desert [12,13].
Therefore, the objectives of this study are to: (1) treat the soil moisture distribution in the deep vadose zone of the Badain Jaran Desert as a single scaled probability distribution instead of two parts, and solve the distribution on the basis of PME with the constraints of normalization, known arithmetic mean, and known geometric mean; and (2) determine the parameters of the scaled PDF from local land surface factors such as terrain and vegetation through the application of fuzzy mathematics, and test the performance of this entropy-based method.

Study Area
The Badain Jaran Desert is located in the northwestern part of the Alashan Plateau of western Inner Mongolia (39°20′ N to 41°30′ N and 100° E to 104° E; Figure 2), and covers an area of some 49,000 km² [18]. It contains the third largest dune field in China, and includes the highest megadunes on Earth. The dunes are interspersed with lakes that occur in many low-lying areas throughout the desert and that vary in size, shape, and salinity. From southeast to northwest the elevation gradually decreases from approximately 1800 m asl to 1000 m asl.
The climate in the Badain Jaran Desert is of an extreme continental type, with hot summers and cold winters. Daytime temperatures in the summer months range up to 40 °C, while mean monthly temperatures fall to −10 °C in January, and sub-zero minimum temperatures prevail for most of the year. The southeastern Badain Jaran Desert is near the current northern extent of the East Asian monsoon, which provides the primary source of precipitation, 70% of which falls from July to September. Rain falls on 10–35 days per year. Cold and dry continental air masses from the prevailing westerly winds dominate the region in winter. The mean annual precipitation measured at the meteorological station nearest to the present study area (Zhongqanzi Station, 20 km southeast of the study area) was 84 mm from 1956 to 1999 and was highly variable (coefficient of variation = 0.39). In contrast, the potential evaporation from surface water is 2600 mm·year⁻¹ [19]. Average precipitation decreases significantly from south to north, declining to about 50 mm·year⁻¹ at Wentugaole, near the border between China and Mongolia [20], due to the progressively declining influence of monsoonal moisture. Orographic effects result in slightly higher rainfall rates in the Yabulai Mountains (150 mm·year⁻¹) in the southeastern part of the desert. The mean annual wind speed ranges from 2.8 to 4.6 m·s⁻¹ and increases from south to north, with the strongest winds in April and May.

Information Entropy
Information entropy, or Shannon's entropy, of a random variable $X$ with the PDF $f(x)$ is defined as the negative expectation of the logarithm of $f(x)$ [21], usually denoted as $H(x)$:
$$H(x) = -\int_a^b f(x)\,\ln f(x)\,dx \tag{1}$$
subject to the normalization of $f(x)$:
$$\int_a^b f(x)\,dx = 1 \tag{2}$$
where $a$ is the minimum of variable $X$ and $b$ is the maximum of variable $X$.
The entropy of a single discrete random variable $X$ is a measure of its average uncertainty, which is expressed by Equation (3):
$$H(X) = -\sum_{i} p(x_i)\,\log p(x_i) \tag{3}$$
where $X$ represents a random variable taking the values $x_i$ with probability mass function $p(x_i) = \Pr(X = x_i)$. Note that $p \log p = 0$ if $p = 0$.
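As a small numerical illustration of the discrete entropy above, the following sketch evaluates it for three made-up probability vectors (the vectors and the function name `shannon_entropy` are illustrative assumptions, not data from this study). It shows that a uniform distribution carries the largest uncertainty and a certain outcome carries none.

```python
import math

def shannon_entropy(probs, base=math.e):
    """Discrete Shannon entropy H = -sum p_i log p_i, taking 0 log 0 as 0."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Terms with p = 0 are skipped, which implements the 0 log 0 = 0 convention.
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Illustrative probability vectors (assumed for this example only).
uniform = [0.25, 0.25, 0.25, 0.25]
peaked = [0.7, 0.1, 0.1, 0.1]
certain = [1.0, 0.0, 0.0, 0.0]

h_uniform_bits = shannon_entropy(uniform, base=2)
h_peaked_bits = shannon_entropy(peaked, base=2)
h_certain_bits = shannon_entropy(certain, base=2)
```

With base 2 the result is in bits; the natural logarithm used in the continuous definition gives nats.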

Principle of Maximum Entropy and Probability Distribution
The principle of maximum entropy (PME) is used as a constructive criterion for setting up the least biased probability distribution based on partial knowledge [22,23]. If no information is available other than the given statistical constraints, the distribution based on PME is the least biased with respect to the unavailable information. Therefore, the resulting probability distributions differ considerably with different constraints or known information. For the PDF $f(x)$, the first constraint is the normalization, shown as Equation (2). Apart from that, the mean, variance, or other information can be considered as known information for specific conditions. These constraints are denoted as $g_i(x)$, which can be expressed as follows:
$$\int_a^b g_i(x)\,f(x)\,dx = \overline{g_i}, \quad i = 1, 2, \ldots, n \tag{4}$$
where $\overline{g_i}$ is the known expectation of $g_i(x)$. To solve for $f(x)$ under the constraints of Equations (2) and (4), one simple method is to use Lagrange multipliers and find the extremum. The Lagrange function $L$ can be defined as Equation (5):
$$L = -\int_a^b f(x)\,\ln f(x)\,dx + \lambda_0\left[\int_a^b f(x)\,dx - 1\right] + \sum_{i=1}^{n} \lambda_i\left[\int_a^b g_i(x)\,f(x)\,dx - \overline{g_i}\right] \tag{5}$$
where $\lambda_0, \lambda_1, \lambda_2, \ldots, \lambda_n$ are Lagrange multipliers. As the information entropy takes its maximum value, the partial derivative of $L$ with respect to $f(x)$ is equal to zero, as shown in Equations (6)–(8):
$$\frac{\partial L}{\partial f(x)} = 0 \tag{6}$$
$$-\ln f(x) - 1 + \lambda_0 + \sum_{i=1}^{n} \lambda_i\,g_i(x) = 0 \tag{7}$$
$$f(x) = \exp\left(\lambda_0 - 1 + \sum_{i=1}^{n} \lambda_i\,g_i(x)\right) \tag{8}$$
Equation (8) is the general PDF of the random variable $X$ solved by PME with the constraints of Equations (2) and (4). More details on the mathematical proof of the probability distribution based on PME can be found in Conrad [24].
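As a worked illustration of how the constraints fix the multipliers, consider the simplest case mentioned in the Introduction: normalization plus a known mean $\mu$ on $[0, \infty)$. The sketch below assumes the sign convention in which the multipliers enter the Lagrange function with a plus sign, so that the stationary solution has the form $\exp(\lambda_0 - 1 + \lambda_1 x)$; this convention is chosen here to match the factor $\exp(\lambda_0 - 1)$ used for the fitted parameters, and the symbol $\mu$ is introduced only for this example.

```latex
% PME with normalization and known mean only, on x in [0, \infty)
f(x) = \exp(\lambda_0 - 1 + \lambda_1 x), \qquad \lambda_1 < 0
% Normalization:
\int_0^{\infty} f(x)\,dx = \frac{e^{\lambda_0 - 1}}{-\lambda_1} = 1
  \;\Rightarrow\; e^{\lambda_0 - 1} = -\lambda_1
% Known mean:
\int_0^{\infty} x\,f(x)\,dx = \frac{e^{\lambda_0 - 1}}{\lambda_1^{2}} = -\frac{1}{\lambda_1} = \mu
  \;\Rightarrow\; \lambda_1 = -\frac{1}{\mu}
% Result: the exponential distribution
f(x) = \frac{1}{\mu}\,e^{-x/\mu}
```

The result is the exponentially shaped curve; a second shape-controlling constraint is needed to obtain a two-parameter distribution with an interior maximum.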

Principle of Maximum Entropy for Soil Moisture Distribution in Deep Desert Vadose Zone
The relationship between soil moisture and vertical soil depth can be written as:
$$\theta_z = f(z) \tag{9}$$
where $\theta_z$ is the soil moisture at depth $z$.
The accumulation of soil water is:
$$\int_0^{\infty} f(z)\,dz = C \tag{10}$$
where $C$ is the total soil water in a profile. Equation (10) can also be transformed as:
$$\int_0^{\infty} \frac{f(z)}{C}\,dz = 1 \tag{11}$$
Hence the scaled vertical soil moisture distribution $f(z)/C$ can be considered as a PDF. The information entropy can be calculated as:
$$H = -\int_0^{\infty} \frac{f(z)}{C}\,\ln\frac{f(z)}{C}\,dz$$
As the soil water can be regarded as being in an equilibrium state, the function $f(z)/C$ obeys the principle of maximum entropy, and the constraints on $f(z)/C$ are: (1) the normalization; (2) a given arithmetic mean, since the average recharge rate or the average recharge depth at the millennium scale is constant; and (3) a geometric mean that can be considered constant, since the flow rate is close to zero and the soil moisture can be approximately regarded as being in an equilibrium state in desert regions. These constraints are given by Equations (11), (14) and (15):
$$\int_0^{\infty} z\,\frac{f(z)}{C}\,dz = \mu_z \tag{14}$$
$$\int_0^{\infty} \ln z\,\frac{f(z)}{C}\,dz = \ln \nu_z \tag{15}$$
where $\mu_z$ and $\nu_z$ are the arithmetic mean and geometric mean of the soil moisture distribution, respectively. The Lagrange function $L$ is then constructed with the constraints of Equations (11), (14) and (15):
$$L = -\int_0^{\infty} \frac{f(z)}{C}\,\ln\frac{f(z)}{C}\,dz + \lambda_0\left[\int_0^{\infty} \frac{f(z)}{C}\,dz - 1\right] + \lambda_1\left[\int_0^{\infty} z\,\frac{f(z)}{C}\,dz - \mu_z\right] + \lambda_2\left[\int_0^{\infty} \ln z\,\frac{f(z)}{C}\,dz - \ln \nu_z\right] \tag{16}$$
Setting $\partial L / \partial f(z) = 0$ gives the probability density function $f(z)/C$:
$$\frac{f(z)}{C} = \exp\left(\lambda_0 - 1 + \lambda_1 z + \lambda_2 \ln z\right) \tag{17}$$
Substituting Equation (17) into Equations (11), (14) and (15), we can obtain the values of $\lambda_0$, $\lambda_1$, and $\lambda_2$. Equation (17) can also be transformed as Equation (18):
$$\frac{f(z)}{C} = e^{\lambda_0 - 1}\,e^{\lambda_1 z}\,z^{\lambda_2} \tag{18}$$
Equation (18) is the general form of a gamma distribution. The soil moisture distribution based on a scaled PDF $f(z)/C$ with different parameters is shown in Figure 3.
In the Badain Jaran Desert, a one-centimeter-thick layer of the sand profile usually records the recharge from one year of precipitation [15]. In other words, the average flow rate of soil water movement is about 1 cm per year. This indicates that the shape of the soil moisture profile is almost constant at the scale of decades, so the geometric mean of the profile can be considered constant. Meanwhile, the arithmetic mean can also be considered constant, since the average recharge rate is constant at the thousand-year scale. Hence, the two constraints, a constant arithmetic mean and a constant geometric mean, are reasonable for determining the soil moisture distribution in desert regions at the scale of a few decades. The PDF of the gamma distribution also displays the same pattern as the soil moisture profiles in desert regions, as shown in Figures 3 and 4, respectively.
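To make the gamma form of Equation (18) concrete, the following minimal sketch evaluates the scaled profile for one set of parameters and locates its maximum. The parameter values, the depth grid, and the function names are made up for illustration and are not taken from Table 3; the scale factor is assumed to be absorbed into $\exp(\lambda_0 - 1)$.

```python
import numpy as np

def scaled_gamma_profile(z, lam0, lam1, lam2):
    """Evaluate exp(lam0 - 1) * exp(lam1 * z) * z**lam2 for depths z > 0."""
    z = np.asarray(z, dtype=float)
    return np.exp(lam0 - 1.0) * np.exp(lam1 * z) * z ** lam2

def profile_mode(lam1, lam2):
    """Depth of maximum moisture, from d/dz (lam1 * z + lam2 * ln z) = 0."""
    if lam1 >= 0 or lam2 <= 0:
        raise ValueError("an interior maximum needs lam1 < 0 and lam2 > 0")
    return -lam2 / lam1

# Illustrative (assumed) parameters; in gamma terms, shape = lam2 + 1, rate = -lam1.
lam0, lam1, lam2 = 1.5, -0.8, 1.2
depths = np.arange(0.25, 10.01, 0.25)            # depth grid in metres (assumed)
theta = scaled_gamma_profile(depths, lam0, lam1, lam2)
z_peak = profile_mode(lam1, lam2)                # analytical depth of the maximum
z_peak_grid = float(depths[np.argmax(theta)])    # same maximum located on the grid
```

Here $\lambda_1$ controls how fast moisture decays below the maximum and $\lambda_2$ controls how steeply it rises near the surface, which is why the same functional form can describe both the "increase then decrease" and the "increase then stable" profiles.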

Parameterization
In practice, the soil moisture distribution is measured only to a limited depth, which leads to a substantial difference between the real parameters (both the arithmetic mean and the geometric mean) and the values estimated from samples, especially for profiles in which soil moisture increases and then remains stable. Even worse, both the arithmetic mean and the geometric mean are difficult to obtain to a depth of 10 m without drilling samples. Fortunately, previous studies showed that the soil moisture distribution is correlated with local environmental factors in this region [12,13]. Therefore, the parameters can be estimated from the local environmental factors through the application of fuzzy mathematics. In this study, multiple linear equations are used to determine the parameters of the scaled gamma PDF.
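The effect of a limited sampling depth on the two means can be illustrated numerically. The sketch below uses a synthetic, slowly decaying gamma-shaped profile (an assumption for illustration, not measured data) and compares the arithmetic and geometric means computed over the whole profile with those computed from the top 10 m only; the helper names are hypothetical.

```python
import numpy as np

def _trapezoid(y, x):
    """Trapezoidal rule, written out so the example does not depend on the NumPy version."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def profile_means(z, theta):
    """Arithmetic and geometric mean of depth, weighting each depth by theta(z)."""
    z = np.asarray(z, dtype=float)
    theta = np.asarray(theta, dtype=float)
    total = _trapezoid(theta, z)                          # total soil water C
    mu = _trapezoid(z * theta, z) / total                 # arithmetic mean
    nu = float(np.exp(_trapezoid(np.log(z) * theta, z) / total))  # geometric mean
    return mu, nu

# Synthetic profile (assumed): theta ~ z * exp(-0.2 z), a gamma shape with mean depth 10 m.
z_full = np.linspace(0.01, 200.0, 200001)
theta_full = z_full * np.exp(-0.2 * z_full)
top10 = z_full <= 10.0                                    # what a 10 m auger hole would sample

mu_full, nu_full = profile_means(z_full, theta_full)
mu_10, nu_10 = profile_means(z_full[top10], theta_full[top10])
```

For this synthetic profile the means estimated from the top 10 m are clearly smaller than those of the full profile, which is the kind of bias that motivates estimating the parameters from environmental factors instead.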

Soil Moisture Data
Nineteen random deep soil moisture profiles are used to test the simulated results. Sand samples were collected in June 2005 and September 2007 from the southeastern part of the Badain Jaran Desert, where dunes and lakes are densely interspersed (Figure 2). Samples in the unsaturated zone were obtained to a depth of about 10 m using a 50 cm hollow-stem hand auger (Dormer Engineering, Murwillumbah, Australia) with interchangeable 1.5 m aluminum rods. Bulk sediment samples of approximately 500 g were collected at intervals of 0.25 m in the top 3 m and 0.50 m from 3 to 10 m. Samples were homogenized over the sampled interval and immediately sealed in polyethylene bags, with care taken to avoid moisture loss. Locations and elevations were recorded with a Garmin GPS, as shown in Table 1. Moisture contents were determined gravimetrically after drying overnight at 110 °C. These data have been partly reported in previous studies [12,13,19,25–27].

Data Preparation for Multiple Linear Regressions
Before the multiple linear regressions, the local terrain and vegetation data were encoded to quantify them. For certain parameters, categories were more meaningful than the actual measured values. Thus, the slope orientation was encoded using eight equal 45° categories, moving clockwise from north (the first category is centered on an azimuth of 0° and thus spans azimuths from 337.5° to 22.5°), following the method of Qiu and Zhang [17]. The sine of half the slope orientation was used to quantify its value, as the sine increases from 0° to 90° and then decreases from 90° to 180°. The constant 1 was added to calibrate the orientation value. The vegetation cover was classified into five categories: 0 represents none, 1 very sparse, 2 sparse, 3 sparse to moderate, and 4 moderate to relatively dense. The slope, relative elevation above the nearest lake, and distance to the nearest lake were measured at each sample site (Table 1), and the tangent was used to quantify the slope of the dunes. All environmental variables were normalized before the multiple regressions.
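The encoding steps above can be sketched as follows. This is one possible reading of the description, with several assumptions: the orientation value is taken as the sine of half the sector-center azimuth plus 1, the normalization is taken to be min-max scaling (the text does not name the method), and all function names are hypothetical.

```python
import math

def orientation_category(azimuth_deg):
    """Index 0..7 of the 45-degree sector; sector 0 is centred on north (337.5 to 22.5)."""
    return int(((azimuth_deg % 360.0) + 22.5) // 45.0) % 8

def orientation_value(azimuth_deg):
    """Assumed encoding: sin(half of the sector-centre azimuth) + 1."""
    centre = orientation_category(azimuth_deg) * 45.0
    return math.sin(math.radians(centre / 2.0)) + 1.0

def slope_value(slope_deg):
    """Slope quantified by its tangent."""
    return math.tan(math.radians(slope_deg))

def min_max_normalize(values):
    """Assumed normalization: rescale a list of values linearly to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Under this reading a north-facing slope is encoded as 1 and a south-facing slope as 2, with east- and west-facing slopes in between, so the encoded value varies smoothly around the compass except at the north boundary.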

Solving Equations
Multiple linear regressions are used to relate the parameters of the scaled gamma PDF to the local environmental factors. In addition to the best fit obtained by least squares, the weights of the environmental factors for the parameters $\lambda_0$, $\lambda_1$, and $\lambda_2$ are fitted by two methods: one based on the environmental factors of all profiles (FAP), and one that leaves the profile itself out and uses the factors of the other profiles (FOP). The former uses the same weight vector for all 19 profiles, while the latter needs 19 weight vectors, one for each profile. Here, the mean, standard deviation (STD), and coefficient of variation (CV) are used to describe the weights calculated by the second method, as shown in Table 2. More details are listed in the Appendix (Tables A1–A3 correspond to $\lambda_0$, $\lambda_1$, and $\lambda_2$, respectively). Comparing the weight vectors estimated by the two methods shows that, in general, there is little difference between the weights estimated from all profiles and the mean of the leave-one-out weights. However, a larger CV is observed for relative elevation, distance to lake, and thickness of the dry layer, which are the sensitive factors affecting the soil moisture distribution according to detrended canonical correspondence analysis (DCCA) [12,13]. Once the weights of the environmental factors are determined, the parameters of the soil moisture distribution can be calculated by multiplying the encoded environmental variable matrix by the weight vectors listed in Table 2 and Tables A1–A3. Table 3 presents the parameters $\exp(\lambda_0 - 1)$, $\lambda_1$, and $\lambda_2$ fitted by the three methods mentioned above.
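The difference between the FAP and FOP calibrations can be sketched with ordinary least squares. The example below uses synthetic data (19 profiles, six factors, random values) and a regression without an intercept; the data, the number of factors, the absence of an intercept, and the function names are all assumptions for illustration and do not reproduce Table 2 or Tables A1–A3.

```python
import numpy as np

def fit_weights(X, y):
    """Least-squares weight vector w minimising ||X w - y||."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def fap_predict(X, y):
    """FAP: one weight vector fitted on all profiles and applied to all of them."""
    return X @ fit_weights(X, y)

def fop_predict(X, y):
    """FOP: for each profile, fit on the other profiles and predict the one left out."""
    n_profiles, n_factors = X.shape
    predictions = np.empty(n_profiles)
    weights = np.empty((n_profiles, n_factors))
    for i in range(n_profiles):
        others = np.arange(n_profiles) != i
        weights[i] = fit_weights(X[others], y[others])
        predictions[i] = X[i] @ weights[i]
    return predictions, weights

# Synthetic stand-in for the encoded, normalized factors and one fitted parameter.
rng = np.random.default_rng(0)
X = rng.random((19, 6))
true_w = np.array([0.5, -0.3, 0.2, 0.8, -0.6, 0.4])
y = X @ true_w + 0.05 * rng.standard_normal(19)

fap = fap_predict(X, y)
fop, fop_weights = fop_predict(X, y)

# Summary of the 19 leave-one-out weight vectors: mean, STD and CV per factor.
weight_mean = fop_weights.mean(axis=0)
weight_std = fop_weights.std(axis=0, ddof=1)
weight_cv = weight_std / np.abs(weight_mean)

sse_fap = float(np.sum((fap - y) ** 2))
sse_fop = float(np.sum((fop - y) ** 2))
```

In least squares, the error on a left-out profile is never smaller than the in-sample error on that profile, so the FOP errors give the more conservative picture of how the model would perform at an unsampled site.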

Simulation Results
On the basis of the input parameters for each profile shown in Table 3, the soil moisture profiles are estimated by substituting these parameters into the scaled gamma PDF. Figure 4 presents the measured data and the simulated results obtained with the best-fitted parameters, with parameters calibrated using the environmental factors of all profiles (FAP), and with parameters calibrated by leaving the profile itself out (FOP).

Correlation Analysis
First, the correlation coefficients between the measured data and the simulated results are used to evaluate the performance of the entropy-based model. The results are shown in Table 4.
Theoretically, the best-fitted results display a good linear correlation with the measured values. Thirteen profiles show correlation coefficients larger than 0.9, four profiles between 0.8 and 0.9, and two profiles about 0.7. The profiles SC and SU, which show the weakest correlation between the measured values and the best gamma PDF fit, both exhibit a substantial change of sand texture: a higher clay content appears where the soil moisture increases again. Therefore, the scaled PDF is appropriate for about ninety percent (17 of 19) of the soil moisture profiles in the deep vadose zone of the Badain Jaran Desert, and the two exceptions are caused by the change of sand texture.
The results fitted on the basis of local environmental factors from all 19 profiles also perform well in terms of correlation: ten profiles show coefficients larger than 0.9, four profiles between 0.8 and 0.9, two profiles between 0.75 and 0.8, and three profiles between 0.56 and 0.75. The results based on the other 18 profiles are mostly close to those based on all nineteen profiles: eight profiles display correlation coefficients larger than 0.9, four profiles between 0.8 and 0.9, five profiles about 0.7 or higher, and two profiles (SC and SU) between 0.5 and 0.7. Hence, the simulated results display a good correlation with the measured values.

Error Analysis
The results of the error analysis are also shown in Table 4. The relative error is used to quantify the performance of the simulated results. First, the best-fitted results display a small error between the simulated and measured values. The average relative errors are less than 0.15 in fifteen profiles and between 0.15 and 0.3 in four profiles. This indicates that the method can provide good precision for most of the soil moisture profiles.
The results fitted on the basis of local environmental factors also perform well. Most of the relative errors are less than 30%, and the maximum average error in a profile is less than 50%, although several profiles with a good correlation coefficient exhibit a high relative error, such as SE, SP and ES. In the top ten meters of desert regions, this precision is acceptable given that it is obtained from local environmental factors instead of drilling and drying samples.
Comparing the errors based on the parameters estimated by the two methods, we find that the model performs slightly better when the parameters are determined from the environmental factors of all profiles than when they are fitted by leaving the profile itself out. The reason is that the former includes each profile's own information in the calibration (autocorrelation), whereas the latter is more objective because the validation is performed on data that were not used in the calibration. Hence, we concentrate on discussing the errors from the second method and analyzing the main driving factors.
Generally, the model calibrated by the leave-one-out (FOP) method delivers acceptable accuracy, although the error might be close to 50% in a couple of profiles. Nine profiles show high accuracy, with errors less than 20%; six profiles show a middle level, with errors between 20% and 30%; and poor performance (30%–40%) is seen in four profiles: SC, SE, SP and ES. From the estimated parameters of the profiles in which the model performs poorly, we find the following: (1) for SC, the cause is the change of soil texture, with a higher clay content making the soil moisture pattern inconsistent with that of the gamma distribution; (2) SE, SP and ES, however, can show good precision when fitted appropriately; (3) profile SE exhibits a good CC but a large relative error, which means that $\lambda_1$ and $\lambda_2$ are estimated appropriately, while $\lambda_0$, the factor that scales the PDF, is higher than the real value; (4) for ES, the middle-level CC indicates that the error is not caused by $\lambda_0$ alone, and the main cause is in fact a larger $\lambda_1$ (Table 3); (5) finally, all three estimated parameters of SP differ from the best-fitted values: $\lambda_0$ and $\lambda_1$ are larger, and $\lambda_2$ is smaller. This means that the estimated soil moisture is more concentrated around the mode than in the measured profile. Combined with our previous studies based on detrended canonical correspondence analysis (DCCA), the dominant factor is the thickness of the dry sand layer near the surface for SE and ES, and the distance to the lake for SP [12,13]. Compared with the other environmental factors, the thickness of the dry sand layer near the surface can be affected by low-intensity precipitation, which might lead to the model performing poorly in some profiles. SP is located near a lake that has dried up, and the measured distance is to another lake located beyond the divide of the dried lake.
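The two evaluation metrics can be sketched as below, together with a toy case showing why a profile can have a good CC but a large relative error when only the scale factor is off. The relative error is assumed here to be the mean of the pointwise absolute relative errors (the exact definition used for Table 4 is not restated in the text), and the numbers are made up.

```python
import numpy as np

def correlation_coefficient(measured, simulated):
    """Pearson correlation coefficient between measured and simulated values."""
    return float(np.corrcoef(measured, simulated)[0, 1])

def mean_relative_error(measured, simulated):
    """Assumed definition: mean of |simulated - measured| / |measured|."""
    measured = np.asarray(measured, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    return float(np.mean(np.abs(simulated - measured) / np.abs(measured)))

# Toy profile (assumed values): the simulation has the right shape but a 30% larger scale,
# as would happen if only the scale factor exp(lam0 - 1) were overestimated.
measured = np.array([1.0, 2.0, 3.0, 2.5, 2.0])
simulated = 1.3 * measured

cc = correlation_coefficient(measured, simulated)
mre = mean_relative_error(measured, simulated)
```

Because the correlation coefficient is insensitive to a constant scale factor, it stays at 1 here while the relative error is 30%, which is why the two metrics are read together.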

Applicability of the Model
In the study area, the Badain Jaran Desert of northwestern China, this method performs well in estimating the soil moisture distribution in most profiles, with the exception of a few profiles with an abrupt change of texture, such as SC and SU. To simulate soil moisture profiles from local land surface factors instead of sampling sand vertically, three requirements should be met.
The first is that the soil moisture distribution can be considered stable over a specific period. Here, the soil moisture distribution in the top ten meters of the desert region meets this requirement, since the soil water moves at a very low flow rate (about 1 cm/year) and the ten-meter sand profile represents a one-thousand-year recharge record. The former indicates that the shape of the soil moisture profile is almost constant, so the geometric mean of the profile can be considered constant. The latter indicates that the arithmetic mean is constant, as the average recharge rate is constant at the thousand-year scale. Hence, the two constraints, a constant arithmetic mean and a constant geometric mean, are necessary to determine the pattern of the soil moisture distribution.
The second requirement is a gradual change of soil texture and soil moisture. As presented above, an abrupt change of texture leads to poor model performance in profiles such as SC and SU. The reason is that the PDF of the gamma distribution decreases monotonically after the mode. Fortunately, most of the soil moisture profiles in the desert region exhibit the same trend, which makes the method applicable there.
The third requirement is that the weight vector of the environmental factors is determined from sufficient measured soil moisture profiles. In this study, nineteen soil moisture profiles are used to calibrate the parameters through the application of fuzzy mathematics. In a desert region, the influence of rainfall on soil moisture over a short period is mostly small; hence the relationships can be established simply by linear equations.
In other regions, however, the soil moisture distribution is not static. For example, under humid climatic conditions, the vertical soil moisture distribution varies substantially at a daily or even an hourly scale, because much more intensive and frequent precipitation causes the temporal and spatial patterns of soil moisture to vary substantially. In contrast, in the deep vadose zone of desert regions the soil moisture distribution is determined mainly by the long-term climatic conditions, and precipitation events have little effect on it at a daily or monthly scale. Therefore, the assumption of a static state is appropriate for arid desert regions, while it is challenged in humid regions.

Conclusions
From the establishment of the model based on information entropy, the parameter determination, and the precision tests, the conclusions of this work are as follows:
1. The soil moisture distribution curve in the deep vadose zone of the Badain Jaran Desert can be described as a scaled PDF. The function is solved by PME with the constraints of normalization, known arithmetic mean, and known geometric mean, and the vertical soil moisture distribution curve is a scaled PDF of the gamma distribution with the general form $f(z) = e^{\lambda_0 - 1} e^{\lambda_1 z} z^{\lambda_2}$.

2. The parameters of the soil moisture distribution are estimated from local land surface environmental factors such as terrain and vegetation: the theoretical parameters are first estimated by least squares fitting, and linear equations are then used to describe the relationship between the environmental factors and the best-fitted parameters. The coefficients of the environmental factors are obtained by solving these equations.

3. The simulated results show a good correlation with the measured values and an acceptable precision. The correlation coefficient is larger than 0.9 in more than half of the profiles and larger than 0.8 in most of them; the relative errors are smaller than 30% in most of the profiles and can be less than 15% when the parameters are fitted appropriately.
Therefore, a simple alternative method is established to estimate the soil moisture distribution in the deep vadose zone of desert regions based on local land surface environmental factors, and this method would be useful because these factors can be obtained from remote sensing data. Meanwhile, it should be borne in mind that this method is applicable in desert regions but challenged in humid and semi-humid regions. The reason is that the soil moisture distribution in desert regions is determined mainly by the long-term climatic conditions and local environmental factors, whereas in humid regions it is determined by short-term climatic conditions, especially more intensive and frequent precipitation.
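The fitting and accuracy steps summarized above can be sketched as follows. Because ln f(z) = (λ0 − 1) + λ1 z + λ2 ln z is linear in the parameters, the sketch fits them by linear least squares in log space; this is a simplifying assumption and may differ from the least squares fitting used in the study. The relative error is taken here as the mean absolute relative deviation, which is also an assumption, and the profile is synthetic, generated from hypothetical parameters.

```python
import numpy as np

def scaled_gamma(z, lam0, lam1, lam2):
    """General form f(z) = exp(lam0 - 1) * exp(lam1 * z) * z**lam2."""
    return np.exp(lam0 - 1.0) * np.exp(lam1 * z) * z ** lam2

def fit_log_linear(z, theta):
    """Fit (lam0, lam1, lam2) from ln(theta) = (lam0 - 1) + lam1*z + lam2*ln(z)."""
    A = np.column_stack([np.ones_like(z), z, np.log(z)])
    c, *_ = np.linalg.lstsq(A, np.log(theta), rcond=None)
    return c[0] + 1.0, c[1], c[2]

def correlation_coefficient(obs, sim):
    """Pearson correlation between measured and simulated values."""
    return float(np.corrcoef(obs, sim)[0, 1])

def mean_relative_error(obs, sim):
    """Mean of |sim - obs| / |obs| (assumed definition of relative error)."""
    return float(np.mean(np.abs(sim - obs) / np.abs(obs)))

# Synthetic, noise-free profile from hypothetical parameters (not study data)
true_params = (1.0, -0.3, 0.6)
z = np.linspace(0.25, 10.0, 40)            # sampling depths in metres
theta_obs = scaled_gamma(z, *true_params)  # stands in for measured moisture

fitted = fit_log_linear(z, theta_obs)
theta_sim = scaled_gamma(z, *fitted)

cc = correlation_coefficient(theta_obs, theta_sim)
rel_err = mean_relative_error(theta_obs, theta_sim)
```

With real, noisy measurements the log-space fit weights small moisture values more heavily than a direct fit would, so the two approaches can give somewhat different parameters.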

Figure 2. Location of the study area and spatial distribution of the sampling sites. Site 1, Wulanjilin; Site 2, Nuoertu; Site 3, Sayinwusu; Site 4, Baoritaolegai. The remote sensing image is from Google Earth, and the coordinates of the sampling sites use the WGS-1984 system for consistency with Google Earth.

Figure 3. Illustration of soil moisture distribution based on a scaled PDF solved by PME with the constraints of normalization, known arithmetic mean and known geometric mean.

Figure 4. Measured and simulated soil moisture distributions in the deep vadose zone of the Badain Jaran Desert. Circles are the measured soil moisture content; the black line is fitted by least squares, the red line is fitted using all 19 profiles, and the blue line is fitted using the other 18 profiles.

(a) Theoretical parameters. The theoretical value of the parameters of each profile is the value obtained by least squares fitting, denoted as Y.
(b) The variable matrix consists of the normalized environmental factors, which were encoded by the specific rules mentioned above.
(c) The coefficients, or weights, of these environmental factors are obtained by solving the linear equations.

Table 1. Local environmental factors of typical random soil moisture profiles.

Table 2. The weight of environmental factors for parameters of soil moisture distribution.

Table 3. Parameter estimation of soil moisture distribution as a scaled PDF of the gamma distribution through different fittings.

Table 4. Performance of the entropy-based method under different fittings.

Table A2. The weight of local environmental factors for parameter $\lambda_1$.

Table A3. The weight of local environmental factors for parameter $\lambda_2$.