2.1. Soil Sample Preparation and Rewetting Experiment
In 2016, black soil samples were collected in Qiqihar (126°40′18.71″E, 47°37′18.28″N, Heilongjiang Province). In 2018, loessial soil, forest brown soil, and agricultural brown soil samples were collected in Changchun (125°24′9.26″E, 43°47′8.27″N; 125°26′34.3″E, 43°47′6.2″N; and 125°25′36.86″E, 43°47′15.71″N, respectively; Jilin Province). The major land types are forestland and farmland. For each kind of soil, all samples came from the same sampling site, so each kind of soil was assumed to have uniform soil particle properties (i.e., mineral composition, organic matter, nutrients, etc.), and the influence of slight differences in organic matter, etc., on the reflectance spectra was ignored. The granulometric analysis and the mineralogical composition of the four soils are shown in Table 2. The collected soil samples were air-dried and crushed to pass through a 1-mm sieve so that stones, roots, and vegetation litter were removed.
The soil water content in this paper refers to the weight (gravimetric) water content, i.e., the ratio of the weight of water in the soil to the weight of dry soil. Prior to the rewetting experiment, all of the soil samples were oven-dried at 105 °C for 24 h to remove soil water. Approximately 100 g of oven-dried soil for each sample was weighed in the laboratory using a scale (accuracy = 0.01 g) and then placed in a petri dish. To prepare samples along a moisture gradient, the samples were wetted with different amounts of water. Water was sprayed onto each soil sample while stirring so that the soil and water were fully mixed. After spraying, each soil sample was placed in a tightly sealed bag and kept for 24 h so that the soil could fully absorb the water (Figure 1). The SM was calculated from the amount of water added. As a result, black soil, loessial soil, forest brown soil, and agricultural brown soil samples were prepared at 15, 14, 15, and 16 different SM levels, respectively.
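The calculation of SM from the added water can be sketched as follows. This is a minimal illustration of the weight-ratio definition given above; the function names and the example masses are hypothetical, and the sketch assumes that no water is lost to evaporation during mixing or storage.

```python
def gravimetric_water_content(water_mass_g, dry_soil_mass_g):
    """Ratio of the weight of water to the weight of oven-dried soil (g/g)."""
    if dry_soil_mass_g <= 0:
        raise ValueError("dry soil mass must be positive")
    if water_mass_g < 0:
        raise ValueError("water mass cannot be negative")
    return water_mass_g / dry_soil_mass_g


def water_to_add(target_sm, dry_soil_mass_g):
    """Mass of water (g) needed to bring dry soil to a target SM (g/g)."""
    if target_sm < 0:
        raise ValueError("target SM cannot be negative")
    return target_sm * dry_soil_mass_g


# Hypothetical example: 5 g of water sprayed onto 100 g of oven-dried soil.
sm = gravimetric_water_content(5.0, 100.0)
print(f"SM = {sm:.3f} g/g")
```

The inverse helper, water_to_add, gives the amount of water to spray for a planned SM level.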
2.2. Spectral Measurement and Pre-Processing
The hyperspectral reflectance data were acquired in a dark room using an ASD FieldSpec 3 portable spectrometer (Analytical Spectral Devices, Boulder, CO, USA). The main geometric parameters of the spectrometer set-up are illustrated in Figure 2. A 50-W halogen lamp was the only light source, and it was set at a 30° incident angle to reduce the shadow effect caused by soil roughness. The lamp was placed 10 cm away from the petri dish; the probe was mounted vertically about 5 cm above the dish, and the field of view of the probe was 1°. The soil depth was 1 cm. To obtain the absolute reflectance, the reflectance was standardized using a white Spectralon reference panel [34]. The arithmetic average of 10 spectral curves collected from each soil sample was taken as the reflectance spectrum of that sample.
Before the original spectral data were exported, a splice correction was applied using ViewSpec™ software (version 6.0.0, ASD Inc., Longmont, CO, USA) to remove the discontinuities around 1000 nm and 1800 nm. Each spectrum was then trimmed to the 470–2400 nm range. To reduce noise, robust locally weighted scatterplot smoothing (RLOWESS) was applied to the original reflectance spectra.
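The averaging and trimming steps can be sketched as below. This is a minimal illustration on synthetic data, not the processing chain actually used: the 350–2500 nm range with 1-nm sampling, the array layout, and the function names are assumptions, and the splice correction and RLOWESS smoothing are not reproduced.

```python
import numpy as np


def mean_spectrum(scans):
    """Arithmetic average over repeated scans; rows are scans, columns are bands."""
    scans = np.asarray(scans, dtype=float)
    return scans.mean(axis=0)


def trim(wavelengths, reflectance, lo=470.0, hi=2400.0):
    """Keep only the bands whose wavelength lies in [lo, hi] nm."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    reflectance = np.asarray(reflectance, dtype=float)
    keep = (wavelengths >= lo) & (wavelengths <= hi)
    return wavelengths[keep], reflectance[keep]


# Synthetic stand-in for 10 scans of one sample (assumed 1-nm sampling).
wl = np.arange(350, 2501)
rng = np.random.default_rng(0)
scans = 0.3 + 0.01 * rng.standard_normal((10, wl.size))

avg = mean_spectrum(scans)
wl_t, r_t = trim(wl, avg)
print(wl_t[0], wl_t[-1], r_t.size)
```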
2.3. KM Model
The KM model [35] describes radiative transfer in an absorbing and scattering medium in terms of a downward and an upward light flux (I and J, respectively) propagating perpendicular to the layer (Figure 3). The model assumes that (i) the layer has an infinite lateral extension (so that edge effects can be neglected); (ii) the light-absorbing and light-scattering particles are uniformly distributed in the layer; (iii) particle dimensions are much smaller than the layer thickness, d; and (iv) the whole layer is homogeneously illuminated with a monochromatic diffuse light source [36,37].
The KM model consists of two differential equations describing the light fluxes, I and J, at a given wavelength, λ (nm), and at a depth in the layer, z (cm), with a light absorption coefficient, k (cm−1), and a light scattering coefficient, s (cm−1):
\[ \frac{dI}{dz} = -(k + s)\,I + s\,J \tag{1} \]
\[ \frac{dJ}{dz} = (k + s)\,J - s\,I \tag{2} \]
By analytically solving these equations, the reflectance, R, can be obtained [38]:
where
;
With increasing layer thickness, d, the reflectance approaches the infinite reflectance value, R∞, which is used in diffuse reflectance spectroscopy because a further increase of the sample thickness does not affect the measured signal. In this case, the calculation of the reflectance in Equation (3) can be drastically simplified:
\[ R_\infty = 1 + \frac{k}{s} - \sqrt{\left(\frac{k}{s}\right)^{2} + 2\,\frac{k}{s}} \tag{4} \]
By solving this equation for k/s, one gets the so-called Kubelka–Munk function:
\[ \frac{k}{s} = \frac{(1 - R_\infty)^{2}}{2 R_\infty} \tag{5} \]
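The relation between the infinite reflectance and the ratio of absorption to scattering can be checked numerically. The sketch below assumes the standard closed forms of the Kubelka–Munk theory, k/s = (1 − R∞)²/(2R∞) and its inverse R∞ = 1 + k/s − √((k/s)² + 2k/s); the function names are illustrative.

```python
import math


def km_function(r_inf):
    """Kubelka-Munk function: k/s from the infinite reflectance (0 < R <= 1)."""
    if not 0.0 < r_inf <= 1.0:
        raise ValueError("infinite reflectance must be in (0, 1]")
    return (1.0 - r_inf) ** 2 / (2.0 * r_inf)


def r_infinite(k_over_s):
    """Infinite reflectance from the absorption-to-scattering ratio k/s >= 0."""
    if k_over_s < 0.0:
        raise ValueError("k/s cannot be negative")
    return 1.0 + k_over_s - math.sqrt(k_over_s ** 2 + 2.0 * k_over_s)


# Round trip: reflectance -> k/s -> reflectance.
for r in (0.05, 0.3, 0.9):
    ratio = km_function(r)
    print(f"R_inf = {r:.2f}  k/s = {ratio:.4f}  back = {r_infinite(ratio):.4f}")
```

A darker sample (lower R∞) corresponds to a larger k/s, which is consistent with stronger absorption in wetter soil.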
2.4. SM Retrieval Model
For wet soil, reflectance, which is related to SM, mainly depends on diffuse scattering [39]. The relationship can be expressed as:
where
is the Fresnel reflectance for diffuse light that exits the material and transits a thin layer–air interface at the material surface. In general,
is a function of the surface roughness, refractive index, and scattering angles. It is often assumed to be approximately equal to
or treated as a constant [
39].
is the Fresnel reflectance for light incident in air upon the target surface [
21]:
where n_w is the refractive index of water (≈1.33), and n_a is the refractive index of air (≈1).
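As a numeric check on the magnitude of the Fresnel reflectance at an air–water interface, the sketch below assumes normal incidence, for which the reflectance is ((n_w − n_a)/(n_w + n_a))². This is an assumption made for illustration and may differ from the exact form of Equation (7), for example if the incidence angle is taken into account. The parameter names are illustrative.

```python
def fresnel_normal_incidence(n_water=1.33, n_air=1.0):
    """Fresnel reflectance at normal incidence (assumed form, unpolarized light)."""
    if n_water <= 0 or n_air <= 0:
        raise ValueError("refractive indices must be positive")
    return ((n_water - n_air) / (n_water + n_air)) ** 2


r_surface = fresnel_normal_incidence()
print(f"Fresnel reflectance (normal incidence) = {r_surface:.4f}")
```

Under this assumption, only about 2% of the incident light is reflected at the water surface.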
Equation (6) can be rearranged as:
Combining Equations (5) and (8) yields:
The variables R∞ and k/s can be expressed in terms of each other using Equations (4) and (5). The absorption and scattering coefficients are both affected by soil water content; thus, in the following modeling process, both will be treated as remotely sensible variables for estimating the SM.
Equation (9) shows that reflectance R is affected by the absorption and scattering coefficients (k and s) of the soil, which are functions of the soil particle characteristics (i.e., mineral composition, organic matter, nutrients, etc.) and the soil water content. A widely used and commonly accepted assumption is that the absorption and scattering coefficients of a mixed medium can be treated as sums of the absorption and scattering coefficients of its components, weighted by their proportions [38,40,41]. Given this assumption, we can describe the k and s of the soil surface as:
where
is the soil water content,
and
are the absorption coefficients of solid and water, and
and
are the scattering coefficients of solid and water, respectively. The optical properties of soil water are different from pure water, as it contains not only pure water, but also dissolved organic matter and ions in addition to suspended particles, and the water itself is partially bound to the soil [
26]. When the soil water content is
, the absorption and scattering coefficients of the soil, which are denoted as
and
, can be written as:
Equations (10) and (11) can be written as:
Combining Equations (14), (15), and (5) yields:
The absorption and scattering coefficients of a soil sample, whether dry or wet, can be directly measured. Nonetheless, a more convenient and practical algorithm is one in which the numerator and denominator on the right side of Equation (16) are simultaneously divided by the scattering coefficient
:
with:
where
is the reflectance of the soil in which the water content is
.
Because water in the soil absorbs strongly, its scattering is very weak and can be neglected compared with the scattering of the water-bearing soil. Thus,
= 0 [
15], the model contains only one unknown parameter, and Equation (17) can be simplified to:
For remote-sensing applications, one needs to retrieve the soil water content from the reflectance data. For such an application, Equation (21) can be solved explicitly for soil water content as:
with:
2.5. Calibration and Validation
Common data set partitioning methods include random sampling, Kennard–Stone (KS), sample set partitioning based on joint x–y distance (SPXY), and the concentration gradient method. The concentration gradient method was used in this paper.
Different sorts of soil were similarly treated as follows: the whole set (n = 14, 15, or 16) was sorted in ascending order according to the SM level; one sample was selected as , and we used a stratified sampling approach to separate the samples into four strata with three or four intervals, and one sample was selected from each stratum as an independent validation set. The remaining samples were selected as a calibration set.
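The stratified selection can be sketched as follows. This is a minimal illustration under stated assumptions: the samples to be partitioned are sorted by SM and split into four nearly equal strata, and the middle sample of each stratum is taken for validation. The text does not state which sample of each stratum was chosen, so the middle-sample rule, the function name, and the example SM values are hypothetical.

```python
import numpy as np


def concentration_gradient_split(sm_values, n_strata=4):
    """Return (calibration, validation) index arrays for SM-sorted strata."""
    sm_values = np.asarray(sm_values, dtype=float)
    if sm_values.size < n_strata:
        raise ValueError("need at least one sample per stratum")
    order = np.argsort(sm_values)              # indices in ascending SM order
    strata = np.array_split(order, n_strata)   # strata sizes differ by at most one
    # Assumed rule: take the middle sample of each stratum for validation.
    validation = np.array([s[len(s) // 2] for s in strata])
    calibration = np.setdiff1d(order, validation)
    return calibration, validation


# Hypothetical SM levels (g/g) for 14 samples of one soil.
sm_levels = np.linspace(0.02, 0.40, 14)
cal_idx, val_idx = concentration_gradient_split(sm_levels)
print("validation:", val_idx, "calibration size:", cal_idx.size)
```

With 13, 14, or 15 samples, this split gives strata of three or four samples each.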
The root mean square error of prediction (RMSEP), R², and the ratio of performance to deviation (RPD) between the predicted and measured SM were selected to evaluate the model performance. Generally, a larger R² and RPD and a smaller RMSEP indicate a better model. RPD values were interpreted in five classes: RPD < 1.4 indicated unacceptable models/predictions; 1.4 ≤ RPD < 1.8 indicated fair models/predictions; 1.8 ≤ RPD < 2.0 indicated good models/predictions; 2.0 ≤ RPD < 2.5 indicated very good models/predictions; and RPD ≥ 2.5 indicated excellent models/predictions [42,43]. All of the data analyses were carried out in MATLAB R2014b (The MathWorks Inc., Natick, MA, USA).
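The evaluation metrics can be sketched as below. The definitions are common ones and are assumptions here, since the text does not give formulas: R² is computed as 1 − SS_res/SS_tot, and RPD as the sample standard deviation of the measured SM divided by the RMSEP. The function names and the example values are hypothetical; the class thresholds follow the five classes listed above.

```python
import numpy as np


def rmsep(measured, predicted):
    """Root mean square error of prediction."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - measured) ** 2)))


def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((measured - predicted) ** 2)
    ss_tot = np.sum((measured - measured.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def rpd(measured, predicted):
    """Sample standard deviation of measured values over RMSEP (assumed definition)."""
    return float(np.std(np.asarray(measured, dtype=float), ddof=1)
                 / rmsep(measured, predicted))


def rpd_class(value):
    """Map an RPD value to the five interpretation classes."""
    if value < 1.4:
        return "unacceptable"
    if value < 1.8:
        return "fair"
    if value < 2.0:
        return "good"
    if value < 2.5:
        return "very good"
    return "excellent"


# Hypothetical measured and predicted SM values (g/g).
measured = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.37]
print(rmsep(measured, predicted), r_squared(measured, predicted),
      rpd_class(rpd(measured, predicted)))
```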