www.mdpi.com/journal/Remote Sensing Article Soil Line Influences on Two-Band Vegetation Indices and Vegetation Isolines: A Numerical Study

Influences of soil line variations on two-band vegetation indices (VIs) and their vegetation isolines in red and near-infrared (NIR) reflectance space are investigated based on recently derived relationships between the relative variations of VIs with variations of the soil line parameters in the accompanying paper by Yoshioka et al. (1). The soil line influences are first demonstrated numerically in terms of variations of vegetation isolines and VI values along with the isolines. A hypothetical case is then analyzed by assuming the discrepancies between the general and regional soil lines for a Southern Brazil area reported elsewhere. The results indicate the validity of our analytical approach for the evaluation of soil line influences and the applicability for adjustment of VI errors using external data sources of soil reflectance spectra.


Introduction
The soil line is a linear relationship between the red and near-infrared (NIR) reflectance of bare soils as characterized by its slope and offset [2,3]. This information has intensively been used for characterization of the earth's surface for retrievals of vegetation biophysical parameters [4][5][6][7][8][9][10][11][12] and soil surface status [13][14][15][16][17][18] from remotely sensed reflectance spectra. Several studies have indicated that the soil moisture and organic matter contents relate to changes in soil reflectance along the soil line [2,14,[18][19][20]. It has also been shown that the distance away from the soil line in red-NIR reflectance space correlates strongly to various biophysical parameters such as leaf area index (LAI) and fraction of green vegetation, and has been used as the basis for several types of VIs [3,[21][22][23][24].
Numerous studies have been conducted to establish the concept of the soil line from both empirical and theoretical points of view [2,3,7,8,25]. The existence of a general soil line was indicated by [3] which represents a pattern of reflectance spectra covering a wide range of soil types and conditions. Although a somewhat contradictory conclusion was reached [2], the existence of a single soil line for a particular soil type was common to those studies [2,3].
A notable fact is that the soil line parameters vary significantly with soil type [2,15,[26][27][28][29][30]. For example, Jasinski and Eagleson [20] demonstrated the existence of three unique soil lines depending on soil type, moisture content, and roughness. Baret et al. [2] clearly showed that the soil line parameters depend on the combined variations of soil surface status characterized by its roughness and moisture. The soil line parameters also vary with spectral band positions and the bandwidths of the multispectral sensor [25,[31][32][33]. These results suggest that the optimum soil line should be determined for each sensor and on a region by region basis [15].
Retrievals of biophysical parameters through spectral vegetation indices (VIs) heavily rely on the strong correlations between the VIs and the parameters of interests. However, the relatively large contributions of bare soil signals to the observed VI value deteriorates the accuracy in estimating biophysical parameters based on such relationships [28,34]. In order to minimize the soil noise, various VI forms have been proposed [3, 6, 10-12, 35, 36] including the soil-adjusted VI [10]. Although a prior knowledge of the soil line parameters are required for the VIs, such information is not often available, and when the information is not available, a general soil line is assumed. In such a case, the discrepancy between the general soil line and true soil line of the target region becomes a source of error in biophysical parameter retrievals. Therefore, the influence of the soil line parameters on the VIs should be clarified prior to the adaptation of regionally/individually determined soil lines.
Our previous work [1] introduced the derivations of soil line influences on two-band VIs and the vegetation isolines. Variations of the VIs were derived analytically based on the vegetation isoline equations [37,38] by employing the slope and offset of the soil line as parameters. It was found that the slope and offset of the vegetation isolines can be written only as a function of the soil line parameters. It was also found that the values of the VIs are correlated linearly with the soil line parameters. Since the derivation was the focus of discussion in our previous work, the validation and demonstration of those findings have yet to be performed. Numerical studies are required for this purpose. Moreover, the adaptability of the derived expressions to the evaluation of the soil line influence needs to be demonstrated with case studies.
The objectives of this paper are twofold. The first objective is to validate the derived expressions [1]. In [1] we have showed that soil line variation influences on VIs almost linearly. It needs to be validated and confirmed to fulfill the gap between the theoretical work and practical applications of satellite data processing. For this purpose we conduct a set of numerical simulations with a radiative transfer model of a vegetation canopy and the estimated soil line parameters from a set of spectra sampled over southeastern Brazil from Landsat 5 Thematic Mapper (TM) [39,40]. The linearity of soil line influence is numerically validated (in section 5.) The second objective is to demonstrate our technique to quantify the differences (errors) on the two-band VIs (e.g., normalized difference VI (NDVI)) caused by variations of the soil line parameters. We conduct a hypothetical case study for southern Brazil [39,40] as well.
In the following (Section 2), we first show a sample of soil line variations based on reflectance data sampled over southern Brazil [39,40], and then obtain a set of soil line parameters from those data. In Section 4, the variations of the vegetation isolines and the VIs are demonstrated numerically based on the obtained soil line parameters. In Section 5, the variations of the NDVI, as an example, are shown and analyzed with the analytical formulations derived in [1] as a function of the perturbation variables related to the soil line parameters (for the purpose of validating the linearity shown by our previous work.) Finally (Section 6), we present a hypothetical case study and demonstrate an evaluation technique of soil line influences on the VI.

Sample Spectra of Bare Soil Pixels and Soil Lines
The soil line represents a robust relationship between the red and NIR reflectances from a single soil type, which is modeled by where a and b are soil line parameters, and R sR and R sN are the red and NIR reflectances of a target soil, respectively. A general soil line of (a, b) = (1.166, 0.042) was reported by Huete et al. [7] determined from 20 soils with a wide range in physical and chemical properties and measured with a radiometer mimicking Landsat-MSS bands. Galvao and Vitorello [25] reported a general soil line of (a, b) = (1.36, −0.0117) from a database [30,41,42] covering the 14 most common soil types in the southeastern Brazil and with Landsat-TM bands. In the examples, the soil lines of individual soil types were also summarized, which clearly indicated significant variations of slopes and NIR-intercepts.
In this study, we used a set of soil spectra taken from a Landsat-TM scene of southeastern Brazil [39,40] to demonstrate our technique of evaluating NDVI variations due to changes in soil line parameters. The size of our target region is approximately 2,800 km 2 , including a wide variety of soil types. The spectral data set included a total of 173 bare-soil pixels covering seven different soil types, of which the soil properties are summarized in Table 1 (Soil Survey Staff [43]). Digital numbers of the extracted pixels were converted into top-of-the-atmosphere reflectances and then corrected for atmospheric effects, including Rayleigh scattering and ozone absorption (detailed in [39]), using the 6S code [44]. Figure 1a,b shows the scatter plot of soil spectra in red-NIR reflectance space and the linear-fitting results of these data for each soil type, respectively. Table 2 summarizes the soil line parameters (a and b), the goodness of fit (r 2 ), and the minimum and maximum values of red and NIR reflectances. The minimum and maximum slopes among these samples were 1.16 and 0.88, respectively, and the minimum and maximum values of NIR intercepts were 0.0213 and -0.0300, respectively. Instead of using these soil lines directly, four extreme cases were assumed by combining the minimum and maximum values among these data ( Table 3). The four pairs of (a, b) were (1. 16 TE  TH  THU  RP1  TR2  TU  TQ  Table 2. Soil line parameters (slope a, NIR-intercept b) and coefficient of determination (r 2 ).    Table 1.

Vegetation Isoline Equations and Numerical Procedures
The vegetation isoline equations are obtained by assuming the following canopy model. We first assume that a portion of the target area is covered with homogeneous vegetation canopy [45]. A parameter called the fraction of green cover, β, is defined as a fraction of the target area covered by the vegetation canopy at a particular combination of illumination and viewing geometries. We discriminate two types of LAI in this study, a local LAI and total LAI. The local LAI is defined as the LAI within an area covered by green vegetation, which is referred to as "LAI" throughout this paper. The total LAI (average LAI value over a pixel) is then defined as the product of the local LAI and the fraction of green cover.
The radiative transfer problem in the covered portion of a pixel is simulated independently by modeling a horizontally infinite homogeneous canopy. The SAIL model [46,47] was used for this purpose in this study. The total reflectance from the two parts of fractional area are, therefore, obtained by a weighted sum of the reflectance from the covered and uncovered portions with the fraction of green cover as their weights.
The vegetation isoline equation is a linear relationship between the red and NIR reflectances for a constant canopy under identical viewing and illumination geometries. The relationship was derived [37,38] as with the following definitions, where T vλ is the geometric mean of downward and upward transmittance of the canopy layer, and ρ vR and ρ vN are the "pure" canopy reflectance computed by assuming a perfect absorber underneath the canopy layer. The isoline equation represents spectral variations caused by variations of the background soil. The details of the derivation and the explanations of the variables in the equations are provided in [37,38]. The vegetation isolines were numerically obtained by a technique also described in [37,38]. The average transmittance and the zero-th order interaction term (pure-canopy reflectance) were obtained by conducting two simulations at each LAI value using the SAIL canopy model [46,47]. The procedure consists of four steps; (1) preparation of input data, (2) the first simulation with a perfect absorber as background, (3) the second simulation with an arbitrary background, and (4) construction of the vegetation isoline parameters γ and D. Vegetation isolines were then obtained by combining these isoline parameters and the soil line parameters. Looking into the term D closely, it is the difference of D N and aγD R , as shown in Equation (3d). Noting the fact that D N is a linear function of b with a positive factor, k N , D increases monotonically with b. On the contrary, D decreases monotonically with a, since the second term of Equation (3d) is also a linear function of (−a) with a positive factor, γβρ vR . Therefore, a and b are negatively correlated to the variations in D.

Variations of Vegetation Isolines
In the same figure, the variations of vegetation isolines for a constant LAI value become smaller at higher LAI, and indicates the relatively smaller influence of soil line variations on vegetation isolines at larger LAI values due to the lower transmittance.

Variations of VIs
Our next focus is on the soil line influences on the VIs. We plotted three VIs (Figure 4a-c) based on the sets of reflectance spectra, to the vegetation isolines shown in Figure 3. We chose NDVI (Figure 4a), SAVI ( Figure 4b) and DVI (Figure 4c) for the purpose of the numerical demonstration. Although the L factor of SAVI depends on the soil line parameters, determination of this factor requires information about vegetation isoline at LAI = 1 prior to the optimization process (which is most likely not available in real application.) A fixed value 0.5 is assumed for L in this study.
The effects of soil line parameters are clearly seen in the figures. Although the magnitudes and trends of each line among the indices are different, two things are in common. First, the order and shape of the soil line variations are transferred to the VIs. Second, the effects of the soil line variations become smaller as the LAI value becomes larger. The latter phenomenon is due to the relatively smaller soil effects for a denser canopy. The first trend indicates that the variations of the VIs show a similar trend to the soil line parameters. This motivated us to derive and quantify the variations of VIs in terms of the soil line parameters in our previous work [1]. There are clear trends that govern the effects of soil line variations on the VIs, which are the focus of our analyses in the next section. The focus of this section is the variational patterns of the VIs by the three parameters, ∆a, ∆b, and ∆R sR . In our previous work, a generalized form of the two-band VIs was assumed, where p i , q i , and r i for i = 1, 2 are parameters that characterize a VI model equation. For example, Equation (4) will be the NDVI when one sets the parameters p 1 , p 2 , and q 2 as unity, q 2 as −1, and r 1 , and r 2 as zero. Then, the normalized value of ∆V by V was derived as, E i represents the relative difference of the numerator (when i = 1) or the denominator (when i = 2) of the VI model equation upon a soil line change. Equation (5) is the relative variation of the VIs against the soil line variation. The numerical results are presented based on Equations (5) to (8).

Variations of ∆V /V
Relative variations of the VI were obtained as a function of ∆a and ∆b for ∆R sR at -0.03, 0.0, and 0.03, and R sR at 0.1, 0.2, and 0.3, resulting in nine cases. The values for R sR were chosen to cover entire range of soil reflectance shown in Figure 1(a). Note that the derivation of ∆V /V has been done by assuming a small variation in R sR . Thus, for ∆R sR , the variation that is one order smaller than the maximum of R sR was assumed in the simulation. The selected ∆R sR corresponded to a 3% reduction of soil reflectance in the red band, no-variation, and a 3% increment, respectively. In order to determine ∆a and ∆b, we first obtained the average soil line of the four extreme cases, of which the slope and NIR-intercept are 1.05 and -0.005, respectively. The maximum and minim values of ∆a and ∆b are then assumed as differences from the average soil line. The ranges of ∆a and ∆b are −0.11 < ∆a < 0.11 and −0.025 < ∆b < 0.025, respectively (summarized in Table 3). We chose four discrete values of LAI, 0.25,0.5, 1.0 and 2.0 at which the soil line variations are induced on the VI. A full green cover (β = 1.0) was assumed during the simulations. Vegetation indices written as a ratio of the two linear functions of reflectances can be analyzed by setting the coefficients in Equation (4). In this study, NDVI is used to demonstrate and analyze the influences of the soil line. Figure 5 shows the contour plot of ∆V /V as a function of ∆a and ∆b for all combinations of R sR (three values) and ∆R sR (three values as well). These figures indicate four trends in NDVI variations, (1) the slope of the contour is linear and negative, (2) the slope of the contour stays the same for constant R sR , (3) NDVI variation becomes larger when the soil becomes darker (a smaller value of ∆D sR ), and (4) the maximum variations of NDVI occurs at the maximum values of ∆a and ∆b when ∆R sR is negative (-0.03), or at the minimum values of ∆a and ∆b when ∆R sR is positive (0.03). When the soil brightness stays the same (∆R sR = 0), the NDVI variation becomes maximum when both ∆a and ∆b are minimum. In the next subsection, we explain and validate these four trends based on the derived expressions of our previous work [1]. Figure 6 shows similar plots for the different LAI values. The larger LAI gives smaller magnitude of variations in the VIs than the case with a smaller LAI, simply because a thicker canopy has relatively smaller influences of the soil brightness.

Iso-Plane of Constant ∆V /V
∆V /V can be approximated by a linear function of perturbation variables [1], where , k N and k R are all positive for any VIs, while w q is negative for NDVI. Therefore, ∆V /V increases with ∆a and ∆b almost linearly, which explains the pattern in Figure 5. Considering ∆R sR , the factor (aγ + w q /w p ) could be either positive and negative, depending on the choice of VI and the value of LAI. Since (aγ + w q /w p ) is always negative for all the cases we simulated in this study, ∆V /V decreases with ∆R sR . The relationship among the three perturbations for a constant ∆V /V (iso-plane) was also derived in a slightly different form in [1] as the analytical representation of the isolines of Figure 5 (lines of constant ∆V /V ). Defining C as a constant value of ∆V /V , the iso-plane equation of a constant ∆V /V becomes [1], Since the red reflectance is always larger than D R (Equation (3f)), the factor of ∆a is always negative, which explains the negative slope of all the contours in Figure 5a-i. The validity of the derived expression is also inferable by considering a particular case in Figure 5. When ∆R sR = 0, the relationship between ∆a and ∆b, which induce no variation in ∆V /V , can be obtained by setting both C and ∆R sR to zero simultaneously [1], The slope of the contour for this particular case is identical to the value of soil reflectance with a negative sign. Since R sR is always positive, the slope of the contour is always negative, which agrees with the results in Figure 5: the slope of contour lines which go through the point corresponding to ∆V = 0 in the cases of ∆R sR = 0 is equal to −R sR . Those results also indicate that if the soil line parameters, a and b, are shifted toward the opposite directions (positive and negative directions), the effects of these shifts would cancel out. And the amount of changes in a and b that result in no-variation in the VI should follow Equation (11) with C = 0, or Equation (12) for the case of ∆R sR = 0. In order to confirm this analysis, we show similar plots at three values of R sR = 0.1, 0.2 and 0.3, for a fixed LAI = 1 and ∆R sR = 0 in Figure 7. In the figure, the slope of the contour decreases as R sR increases, confirming the above analysis. We also noticed that the absolute value of the variations in ∆V /V is larger at a higher R sR . It is simply due to a relatively larger influence of the soil when it is brighter.

Hypothetical Case Study
Galvao and Vitorello [25] reported the general soil line of southeastern Brazil with slope and NIR intercept of 1.36 and -0.0117, respectively, for the TM band configuration. These soil line parameters were determined based on a database [41] that included the 14 most common soil types in the region. However, a localized soil line over a specific area should be somewhat different from the general (regional) soil line. Therefore, these differences would cause a bias error in the retrievals of biophysical parameters with the use of VIs.
In such a case, the influence of the soil line parameters can be analyzed with the derived expressions in [1]. In this subsection, we quantify and characterize the bias errors on NDVI values induced by the discrepancies between the general and local soil lines. In summary, the following scenario is assumed.
The NDVI is used for the estimation of a biophysical parameter based on the relationship between the two. Prior to the estimation, the relationship is determined from the simulated reflectance of a canopy-soil system with the general soil line by Galvao and Vitorello [25]. At the same time, it is also known that the localized soil line is identical to the one by Demattê et al. [40].
The values of the following parameters were needed to conduct the analysis, (1) the parameters of the vegetation isoline (for determinations of k N and k R ), (2) the parameters of the general soil line, a and b, in [25], (3) the parameters of the local soil lines for the target area, and (4) the range of the red reflectance for each target soil (it can be simply a parameter if it is not a variable). From (4), an average value of the soil reflectance is estimated as a reference. From (2) and (3), we can estimate discrepancies in a and b of the target soil from the parameters of the general soil line. These parameters are summarized in Table  4. The bias errors of the NDVI computed from Equations (5) through (8) are summarized in Table 5 for LAI values of 0.25 and 1.0. The NDVI differences are larger at lower LAI values, as expected, caused by relatively larger contribution of soil reflectance at lower LAI. In most of the cases, the positive perturbations of ∆R sR resulted in the larger differences of the NDVI, due to the negative values of D a for all the cases (Table  4). Overall, the bias error is expected to be −15 to −30 % at LAI= 0.25, and −10 to +5 % at LAI= 1.0. In this hypothetical case, the NDVI adjusments based on the external data source (e.g., field data and regional soil map) would result in up to 30 % improvement in the bias error.
The baseline of NDVI is one-to-one line in the red-NIR reflectance space that is different from actual soil line in general. This difference in soil line assumption is known to be a source of variation. Moreover, a fundamental issue of the soil line influence on VI and any other algorithms comes from the discrepancy between assumed vegetation isoline in retrieval model (such as VI model equation) and true vegetation isoline (with site-specific soil line). Therefore, as confirmed from Table 5, influence of soil line variation remains at higher LAI, which also becomes a source of bias error in retrieval of biophysical parameters.

Remarks
This study focused on numerical demonstrations of the soil line influences on vegetation isolines and VIs derived in our previous work [1]. We performed a systematic evaluation of such effects based on prior knowledge of the soil line differences. Starting from an assumption of soil line variations, the variations of the vegetation isolines and the NDVI under four different soil lines were demonstrated numerically based on the derived expressions in [1] as a function of the perturbations in the soil line slope and offset.
Note that we made an assumption of small variations in the soil line parameters and soil reflectance in red band during the derivations in [1]. The results obtained in this study are, thus, limited to such cases. Also, the form of VIs considered in this work covers a limited type of VI which is represented by a ratio of weighted sum of the two reflectances. This limitation needs to be extended to cover VIs with three bands as well as nonlinear forms in the numerator and denominator of the index formulation.
The derived expressions in [1] were then applied to the evaluation of the NDVI bias errors by analyzing a hypothetical case study. The VI variations were induced by assuming discrepancies between a general soil line and the localized soil lines. In the case study, we assumed a general soil line reported in [25] and a set of the localized soil lines reported in [41]. The results indicate that the bias error in the NDVI would reach up to 30 percent of variations due to the differences in the soil line parameters. The results also remind us about the known fact of soil influences on NDVI such that much caution is needed when one assumes a general soil line for the estimation of biophysical parameters if the target area is regional scale (or smaller). Although the estimation of biophysical parameters through the NDVI with the use of a general soil line may induce bias errors, the results also suggest that external sources of data, such as regional soil maps, would improve accuracies significantly. Note that the determination of a general soil line is a controversial issue due to difficulty in characterizing and validating its generality. In this study, we chose the general soil line reported in [25] for our case study, the one which is determined from the scene used in this study can be a choice of general soil line as well.
The soil line parameters are also a function of sensor bandpass configurations (bandwidth and band position). Thus, the estimation of biophysical parameters through VIs is also influenced by the choice of sensors. In this context, our work can also be applicable to the analysis of inter-sensor data continuity and compatibility. However, for such applications, one would also need to take into account the differences in the vegetation isoline parameters (γ and D) across sensors, in addition to the differences of soil line parameters.