Quantitative Retrieval of Organic Soil Properties from Visible Near-Infrared Shortwave Infrared (Vis-NIR-SWIR) Spectroscopy Using Fractal-Based Feature Extraction

Visible and near-infrared diffuse reflectance spectroscopy has been demonstrated to be a fast and cheap tool for estimating a large number of chemical and physical soil properties, and effective features extracted from spectra are crucial to correlating with these properties. We adopt a novel methodology for feature extraction of soil spectroscopy based on fractal geometry. The spectrum can be divided into multiple segments with different step–window pairs. For each segmented spectral curve, the fractal dimension value was calculated using variation estimators with power indices 0.5, 1.0 and 2.0. Thus, the fractal feature can be generated by multiplying the fractal dimension value with spectral energy. To assess and compare the performance of new generated features, we took advantage of organic soil samples from the large-scale European Land Use/Land Cover Area Frame Survey (LUCAS). Gradient-boosting regression models built using XGBoost library with soil spectral library were developed to estimate N, pH and soil organic carbon (SOC) contents. Features generated by a variogram estimator performed better than two other estimators and the principal component analysis (PCA). The estimation results for SOC were coefficient of determination (R2) = 0.85, root mean square error (RMSE) = 56.7 g/kg, the ratio of percent deviation (RPD) = 2.59; for pH: R2 = 0.82, RMSE = 0.49 g/kg, RPD = 2.31; and for N: R2 = 0.77, RMSE = 3.01 g/kg, RPD = 2.09. Even better results could be achieved when fractal features were combined with PCA components. Fractal features generated by the proposed method can improve estimation accuracies of soil properties and simultaneously maintain the original spectral curve shape.


Introduction
Quantitative assessment of soil properties using visible near-infrared shortwave infrared (Vis-NIR-SWIR) spectroscopy has been demonstrated as a fast and non-destructive method [1][2][3][4][5][6].Over the past 30 years, numerous soil physical and chemical properties, such as soil texture, soil organic carbon (SOC), cationic exchange capacity (CEC), total nitrogen (N) and exchangeable potassium (K), have been investigated using the spectroscopic approach based on various multivariate statistics and machine learning approaches [7][8][9][10][11], and outcomes were applied in soil contamination, soil degradation, environmental monitoring and precision agriculture [6,[12][13][14].As one of the attractive advantages, soil spectra can be recorded at points or by imaging from different platforms [1,15].The technique is mainly used in the laboratory, where soil samples are prepared and measured under controlled conditions, and it can be considered as an alternative to traditional analytical techniques.Portable Vis-NIR-SWIR spectrometers allow measurements operated directly in situ.Although the estimation accuracy is lower when compared to results achieved in the laboratory due to uncontrollable environmental factors in the field, in situ proximal sensing improves the efficiency of soil data collection by avoiding tedious sampling and preparation procedures [16].Sensors can also operate from high above, termed as air-or spaceborne imaging spectroscopy [17][18][19].However, there are still some limitations with respect to the application of imaging spectroscopy to the field of soil analysis, especially when vegetation is present.They have already shown the potential to map and quantify soil properties [20,21].With upcoming spaceborne sensors, like the Environmental Mapping and Analysis Program (EnMAP) from Germany and the Hyperspectral Infrared Imager (HyspIRI) from the USA, imaging spectroscopy provides the opportunity to map soil properties at regional and global scales at comparatively low costs.
Reflectance spectra of soil can be viewed as cumulative properties that reflect inherent spectral behaviour of soil components, and can be used to quantify these components simultaneously [5].However, due to the complexity of scattering effects caused by soil structure and/or specific constituents, the absorption wavelengths are largely overlapping and result in complex absorption patterns [4].Besides, soil spectra often tend to have a very high dimensionality.For example, each spectrum in the Land Use/Land Cover Area Frame Survey (LUCAS) [22] soil spectral library has 4200 Vis-NIR-SWIR absorbance measurements, while the Africa Soil Information Service (AfSIS) [23] soil spectra has more than 3000 mid-infrared absorbance measurements.The LUCAS Project aims to sample and analyse the main properties of topsoils across Europe, and the AfSIS Project aims to narrow the sub-Saharan soil information gap and to provide a consistent baseline for monitoring soil ecosystem services.Laboratory spectroscopy was used in both projects.High-dimensional data often contain redundant information and increase computation complexity.In high-dimensional space, spectral similarities are diminished.It has been proven that most of the data are concentrated in the corners of a high dimensional space and the model's accuracy tends to firstly improve and then decline with an increase of features, which is also known as the curse of dimensionality or Hughes phenomenon [24][25][26].Therefore, simply relying on different multivariate statistics in raw feature space is not enough, and methods to reduce the dimensionality and extract information from the spectra that can be better correlated with soil properties of interest should be investigated.
Feature extraction has been proved to be successful in imaging-spectroscopy classification [24,[27][28][29][30].The high-dimensional spectral data can be projected to a lower dimensional space with feature extraction methods, without actually losing significant information.Reduced features may increase the separation between spectrally similar classes and the classification model can perform well with a reduced number of features.In soil spectroscopy, a common approach is principal component analysis (PCA).In [31], PCA was used to reduce the Vis-NIR-SWIR data with more than 2000 wavelengths to a few components, the first component of which accounting for the largest variance.Also, soil information contents of the spectra consisted of PCA components, and a predictive spatial model was developed across Australia.Effective information can also be extracted with wavelet analysis [32].It can substantially reduce the factors outside the parameters to the spectrum directly or indirectly.PCA and local linear embedding (LLE) have, in a comparative way, been exploited for soil spectral distance and similarity in projected space [33].LLE is a nonlinear dimensionality reduction method [34,35].It can identify the underlying structure of a manifold, while PCA maps faraway data points to nearby points in the plane.The results indicate that the distances computed in the raw space have comparatively lower performance than the ones computed in low reduced spaces.Methods using PCA and LLE with Mahalanobis distance outperformed other approaches.It can be seen that an effective feature extraction method has the potential to explore the intrinsic structure of spectra, and does not only reduce the data redundancy but also improves estimation accuracy [36].
Knowing how to effectively extract features from the spectra is crucial for a successful soil-spectral quantitative model.Studies focused on feature extraction from soil Vis-NIR-SWIR spectra are still limited.In this paper, we adopt a novel approach of fractal features based on fractal geometry using variation estimators with the different power indices 0.5, 1.0 and 2.0, which can be termed as rodogram, madogram and variogram, respectively.The concept of fractal dimension was introduced by [37,38] to reduce the dimensionality of imaging spectroscopy data.Kriti Mukherjee [24,39] proposed a method to generate multiple fractal-based features from imaging spectroscopy data and then further compared the performance of fractal-based dimensionality reduction using Sevcik's, power spectrum and variogram methods with conventional methods like PCA, minimum noise fraction (MNF), independent component analysis (ICA) and decision boundary feature extraction (DBFE) methods.They concluded that the classification accuracy is similar but the computational complexity is reduced.The aims of the present study are to explore fractal-based feature extraction from soil spectra and to examine its performance on the estimation of SOC, N and pH contents with soil Vis-NIR-SWIR diffuse reflectance spectra.Features generated by the fractal method were compared to PCA-transformed components, and then these two kinds of features were combined to quantify soil properties using a gradient-boosting regression method.The proposed method is further compared with partial least squares (PLS) regression, which is a frequently adopted method for the quantification of soil properties.

The LUCAS Topsoil Database
As part of Land Use/Land Cover Area Frame Survey, approximately 20,000 geo-referenced topsoil samples were collected and analysed for the 25 European Union member states [22,40].Stratified random sampling was applied to collect around 0.5 kg of topsoil (0-20 cm) [41].The collected samples can be classified as mineral and organic soils based on the extremely diverse spectral response.The LUCAS topsoil dataset is obtained from the Joint Research Centre (JRC) and can be used for non-commercial purposes [22].In this paper, the proposed feature extraction method was tested using the LUCAS organic soil samples, the distribution of which was explored in ArcGIS 10.4 and can be seen in Figure 1.Knowing how to effectively extract features from the spectra is crucial for a successful soilspectral quantitative model.Studies focused on feature extraction from soil Vis-NIR-SWIR spectra are still limited.In this paper, we adopt a novel approach of fractal features based on fractal geometry using variation estimators with the different power indices 0.5, 1.0 and 2.0, which can be termed as rodogram, madogram and variogram, respectively.The concept of fractal dimension was introduced by [37,38] to reduce the dimensionality of imaging spectroscopy data.Kriti Mukherjee [24,39] proposed a method to generate multiple fractal-based features from imaging spectroscopy data and then further compared the performance of fractal-based dimensionality reduction using Sevcik's, power spectrum and variogram methods with conventional methods like PCA, minimum noise fraction (MNF), independent component analysis (ICA) and decision boundary feature extraction (DBFE) methods.They concluded that the classification accuracy is similar but the computational complexity is reduced.The aims of the present study are to explore fractal-based feature extraction from soil spectra and to examine its performance on the estimation of SOC, N and pH contents with soil Vis-NIR-SWIR diffuse reflectance spectra.Features generated by the fractal method were compared to PCA-transformed components, and then these two kinds of features were combined to quantify soil properties using a gradient-boosting regression method.The proposed method is further compared with partial least squares (PLS) regression, which is a frequently adopted method for the quantification of soil properties.

The LUCAS Topsoil Database
As part of Land Use/Land Cover Area Frame Survey, approximately 20,000 geo-referenced topsoil samples were collected and analysed for the 25 European Union member states [22,40].Stratified random sampling was applied to collect around 0.5 kg of topsoil (0-20 cm) [41].The collected samples can be classified as mineral and organic soils based on the extremely diverse spectral response.The LUCAS topsoil dataset is obtained from the Joint Research Centre (JRC) and can be used for non-commercial purposes [22].In this paper, the proposed feature extraction method was tested using the LUCAS organic soil samples, the distribution of which was explored in ArcGIS 10.4 and can be seen in Figure 1.The Vis-NIR-SWIR soil spectra were measured using a FOSS XDS Rapid Content Analyser (FOSS NIRSystems Inc., Denmark) [22], operating in the 400-2500 nm wavelength range, with 0.5 nm spectral resolution.Organic soil spectra were pre-processed by removing the data at wavelengths of 400-500 nm that showed instrumental artefacts, transformation of absorbance (A) spectra into reflectance (1/10 A ) spectra, continuum removal, Savitzky-Golay filter with a window size of 50, second order polynomial and first derivative.Thirteen soil properties have been analysed in a central laboratory [22], including the percentage of coarse fragments, particle size distribution (% clay, silt and sand content), pH (in CaCl 2 and H 2 O), soil organic carbon (g/kg), carbonate content (g/kg), phosphorous content (mg/kg), total nitrogen content (g/kg), extractable potassium content (mg/kg), and cation exchange capacity (cmol(+)/kg).Three key soil fertility properties, soil organic carbon (SOC), total nitrogen content (N) and pH in CaCl 2 (pH), were selected as our studied properties.

Concept of Fractal Dimension
Fractal dimension is a robust method for describing natural or man-made fractals having the fundamental feature known and referred to as self-similarity [42].Within the fractal lies another copy of the same fractal, smaller but complete.If we have a strictly self-similar fractal which can be decomposed into N pieces, each of which is a copy of the original fractal scaled by a factor of S, then, where D is the Hausdorff Dimension.D is a non-integer number, describing how the irregular structure of objects and/or phenomena is replicated in an iterative way from small to large scales.Anything that appears random and irregular can be a fractal, strictly or statistically, including the soil Vis-NIR-SWIR spectrum, which cannot be defined by any mathematical equation and is therefore considered as an irregular curve.There are numerous methods which have been developed for fractal dimension estimation, including box-count [43], variogram [44], power spectrum [24] and spectral [45] methods.

Variation Method for Fractal Dimension
The variogram estimator is widely used in the determination of the fractal dimension and it is known for its ease of use [46].By sampling a large number of pairs of points along the spectral curve and computing the differences in their reflectance values, the fractal dimension is easily derived from the log-log plot of variogram and lags.X u and X t+u are two reflectance values located at points u and t+u, and these two points are separated by the lag of t.The variogram can be calculated as the mean sum of squares of all differences between pairs of values with a given distance divided by two.
The variogram estimator is a stochastic process with stationary increments as half times the expectation of the square of an increment at lag t, and a generalisation of the variation estimator can be obtained with different order p of a stochastic process [47]: where p = 1.0, it represents the madogram, which instead of calculating squares of the differences takes the absolute values.Where p = 1/2, the rodogram is derived by calculating the square root of absolute differences.Fractal dimension is estimated using the slope (θ) of the corresponding log-log regression plot of γ p (t) and t, as shown in Figure 2.

Fractal Feature Generation
Fractal features are generated by multiplying spectral energy with the corresponding fractal dimension.As the fractal dimension can be calculated using the whole curve or only part of the curve, the spectrum can be segmented into several parts and each part corresponds to a new fractal feature.For a soil spectral curve, a common approach is to evenly divide the whole curve into a desired number of segments [48], which means the step and window size are the same.In this study, we explored the effect of different combinations of step and window sizes on generated fractal features.The final feature number Nf can be calculated as: is the number of raw spectral measurements, is the value of step size and W is the value of moving window size.W is obtained by multiplying scale and step value.It should be pointed out that the scale here is not the scaling factor for fractal dimension.When the window size is equal to the scale size, the fractal dimension of the spectral segment is calculated using reflectance values within the same wavelength window.The window size is often defined as larger than the step size, which means segments of the same spectral curve are overlapping.Step size is defined as 100.0 nm and moving window size as 200.0 nm, as shown in Figure 3, which means = 200 and w = 400 (the spectral resolution is 0.5 nm in our case).New fractal features can be generated when the wavelength window moves along the spectral curve at step 100.0 nm.With the increase of the step size, the final fractal feature number (Nf) correspondingly decreases, which can be used as a means of dimension reduction.
For a certain scale value, s, Nf numbers of fractal dimension values can be obtained by moving along the spectral curve at step size p.For each segment, the number of points are marked as n and can be calculated by Equation (5).The reflectance value as Zj (j = 1, 2,…,n) and the corresponding fractal dimension value can be calculated according to Equation (4) as Dm (m = 1, 2,…,Nf), and fractal features at scale s by: where E is the spectral energy and can be derived from the following equation:

Fractal Feature Generation
Fractal features are generated by multiplying spectral energy with the corresponding fractal dimension.As the fractal dimension can be calculated using the whole curve or only part of the curve, the spectrum can be segmented into several parts and each part corresponds to a new fractal feature.For a soil spectral curve, a common approach is to evenly divide the whole curve into a desired number of segments [48], which means the step and window size are the same.In this study, we explored the effect of different combinations of step and window sizes on generated fractal features.The final feature number N f can be calculated as: N r is the number of raw spectral measurements, P is the value of step size and W is the value of moving window size.W is obtained by multiplying scale and step value.It should be pointed out that the scale here is not the scaling factor for fractal dimension.When the window size is equal to the scale size, the fractal dimension of the spectral segment is calculated using reflectance values within the same wavelength window.The window size is often defined as larger than the step size, which means segments of the same spectral curve are overlapping.
Step size is defined as 100.0 nm and moving window size as 200.0 nm, as shown in Figure 3, which means P = 200 and w = 400 (the spectral resolution is 0.5 nm in our case).New fractal features can be generated when the wavelength window moves along the spectral curve at step 100.0 nm.With the increase of the step size, the final fractal feature number (N f ) correspondingly decreases, which can be used as a means of dimension reduction.For a certain scale value, s, N f numbers of fractal dimension values can be obtained by moving along the spectral curve at step size p.For each segment, the number of points are marked as n and can be calculated by Equation (5).The reflectance value as Z j (j = 1, 2, . . ., n) and the corresponding fractal dimension value can be calculated according to Equation (4) as D m (m = 1, 2, . . ., N f ), and fractal features at scale s by: where E m is the spectral energy and can be derived from the following equation:

Gradient-Boosting Regression Model
Soil spectroscopy quantitatively correlates with soil properties, which supposes that fitting a regression model with features extracted from spectra will have good predictive accuracies with respect to soil continuous properties.Gradient-boosting is a highly effective and widely used machine-learning approach [49].Gradient-boosting develops an ensemble of tree-based models by training each of the trees in the ensemble on different labels and then combining the trees.It can produce robust and interpretable procedures for both regression and classification.For a regression problem where the objective is to maximize the coefficient of determination (R 2 ) or to minimize the root mean square error (RMSE), each successive tree is trained on the errors left over by the collection of earlier trees.XGBoost is a scalable and flexible gradient-boosting library [50][51][52], which is adopted to build the soil spectral quantitative model in our study.XGBoost uses more regularised model formalisation to control over-fitting, which gives it better performance.Mathematically, the model can be viewed as: where is the number of trees, f is a function in the functional space , and is the set of all possible regression trees.Therefore, the objective of optimization can be written as: where ( , ) is the training loss function, and Ω( ) is the regularization term.The goal of XGBoost model is to minimize ( ).

Evaluation
For each soil property, the soil spectral quantitative model was developed on a random sample of two-thirds of the selected soil samples using the gradient-boosting regression method.The calibrations were tested by predicting the soil properties on validation data sets composed of the remaining one-third of the organic soil samples.No samples were omitted from the analysis, nor the calibration or validation data sets.The model accuracies were evaluated on estimated and measured soil SOC, N and pH values using RMSE, R 2 and the ratio of percent deviation (RPD).

Gradient-Boosting Regression Model
Soil spectroscopy quantitatively correlates with soil properties, which supposes that fitting a regression model with features extracted from spectra will have good predictive accuracies with respect to soil continuous properties.Gradient-boosting is a highly effective and widely used machine-learning approach [49].Gradient-boosting develops an ensemble of tree-based models by training each of the trees in the ensemble on different labels and then combining the trees.It can produce robust and interpretable procedures for both regression and classification.For a regression problem where the objective is to maximize the coefficient of determination (R 2 ) or to minimize the root mean square error (RMSE), each successive tree is trained on the errors left over by the collection of earlier trees.XGBoost is a scalable and flexible gradient-boosting library [50][51][52], which is adopted to build the soil spectral quantitative model in our study.XGBoost uses more regularised model formalisation to control over-fitting, which gives it better performance.Mathematically, the model can be viewed as: where K is the number of trees, f is a function in the functional space F, and F is the set of all possible regression trees.Therefore, the objective of optimization can be written as: where l(y i , ŷi ) is the training loss function, and Ω( f k ) is the regularization term.The goal of XGBoost model is to minimize obj(θ).

Evaluation
For each soil property, the soil spectral quantitative model was developed on a random sample of two-thirds of the selected soil samples using the gradient-boosting regression method.The calibrations were tested by predicting the soil properties on validation data sets composed of the remaining one-third of the organic soil samples.No samples were omitted from the analysis, nor the calibration or validation data sets.The model accuracies were evaluated on estimated and measured soil SOC, N and pH values using RMSE, R 2 and the ratio of percent deviation (RPD).
where n is the number of validation samples, y is the measured values, y is the mean of the measured values, and ŷ is the estimated values.RPD is the ratio of the standard deviation (SD) of the calibration data to the RMSE of the validation data

Fractal Features for Soil Spectroscopy
For a single soil Vis-NIR-SWIR spectrum, the fractal dimension can be calculated by Equation ( 4).Before extracting fractal features from soil spectra, we first examined the relationship between soil properties and the corresponding fractal dimension.Spectral values of soil are relatively low and the curve appears smoother compared with other objects like vegetation.Thus, the resulting fractal dimension values are comparatively low.Since the fractal dimension is derived from the slope of the regression line obtained from the log-log plot of γ p (t) and lag t, one problem is how many lag increments are necessary to produce reliable results.Theoretically only a minimum of two points is necessary to make such a plot [46].However, the results of such an analysis tend to not be reliable or representative.In this study, the value of lag increments was set as 5, and the Pearson correlations of soil properties and fractal dimensions are shown in Table 1.The Pearson is a standardized covariance and ranges from −1 to +1, which indicates a perfect negative (−1) or positive (+1) linear relationship respectively.A value of zero is not related to the independency between the two variables, it only suggests no linear association.It can be seen that SOC, N and pH have negative relationships with fractal dimension.SOC and N have similar correlations with fractal dimension.Among these three estimators, the variogram-based fractal dimension calculation method achieved the best correlation between fractal dimension values and soil properties SOC (correlation coefficient (r) = −0.54),N (r = −0.50)and pH (r = −0.12).An intact spectrum can be divided into multiple segments, overlapping or non-overlapping.Each segment is corresponding to a fractal feature.When step size and window size are respectively set to 2.5 nm and 50.0 nm, a total number of 791 fractal features can be derived by rodogram, madogram or variogram methods, resulting the original spectral dimension reduced from 4000 to 791.In order to make a proper comparison between the generated fractal feature-based curve and the raw spectral Remote Sens. 2016, 8, 1035 8 of 18 curve, the centre wavelength value of the spectral segment is assigned to the fractal feature as the corresponding "wavelength number".
A great advantage of fractal-based feature extraction is that the curve shape of fractal features is similar to the shape of raw spectrum, which makes it possible to apply methods like continuum removal (CR) not only to the raw spectrum but also to the fractal-based "spectrum".The organic soil samples can be divided into four groups according to the content of SOC.Average spectral reflectance and continuum removal reflectance of LUCAS organic soil samples were computed by SOC classes (Figure 4A).For fractal features, average fractal energy and continuum removal responses of organic soil samples were also computed and shown in Figure 4B-D.The highest SOC class that was above 480 g/kg showed the highest mean reflectance in wavelength range from 1000.0 nm to 2000.0 nm, which is consistent with observations in the literature [4].The continuum removal reflectance showed a strong correlation with SOC content at a wavelength of near 600.0 nm.The difference between raw spectral curve and fractal feature curve was not obvious from the view of shape.Fractal features showed shallow absorption peak in proportion for SOC classes at a wavelength of 600.0 nm.The fractal energy values were larger than reflectance values, as the former were multiplied by spectral energy and fractal dimension, which was supposed to be larger than 1.0.
Remote Sens. 2016, 8, 1035 8 of 19 removal (CR) not only to the raw spectrum but also to the fractal-based "spectrum".The organic soil samples can be divided into four groups according to the content of SOC.Average spectral reflectance and continuum removal reflectance of LUCAS organic soil samples were computed by SOC classes (Figure 4A).For fractal features, average fractal energy and continuum removal responses of organic soil samples were also computed and shown in Figure 4B-D.The highest SOC class that was above 480 g/kg showed the highest mean reflectance in wavelength range from 1000.0 nm to 2000.0 nm, which is consistent with observations in the literature [4].The continuum removal reflectance showed a strong correlation with SOC content at a wavelength of near 600.0 nm.The difference between raw spectral curve and fractal feature curve was not obvious from the view of shape.Fractal features showed shallow absorption peak in proportion for SOC classes at a wavelength of 600.0 nm.The fractal energy values were larger than reflectance values, as the former were multiplied by spectral energy and fractal dimension, which was supposed to be larger than 1.0.To demonstrate the effects of step and window size on extracted fractal features, the combinations of the two parameters were tested.When the step size was fixed at 2.5 nm, a series of fractal feature curves were derived by defining window sizes as 15.0 nm, 35.0 nm, 55.0 nm, 75.0 nm and 95.0 nm.With the increase of window size, fractal energies correspondingly increased and the shapes of fractal features were also gradually exaggerated, as shown in Figure 5A.The number for fractal features derived at different window sizes were equal but less than raw spectral features.When the window size was fixed at 50.0 nm and step size increased from 10.0 to 50.0 nm at an interval of 7.5 nm, the number of fractal features was non-linearly decreased from 196 to 40 as shown in Figure 5B.To demonstrate the effects of step and window size on extracted fractal features, the combinations of the two parameters were tested.When the step size was fixed at 2.5 nm, a series of fractal feature curves were derived by defining window sizes as 15.0 nm, 35.0 nm, 55.0 nm, 75.0 nm and 95.0 nm.With the increase of window size, fractal energies correspondingly increased and the shapes of fractal features were also gradually exaggerated, as shown in Figure 5A.The number for fractal features derived at different window sizes were equal but less than raw spectral features.When the window size was fixed at 50.0 nm and step size increased from 10.0 to 50.0 nm at an interval of 7.5 nm, the number of fractal features was non-linearly decreased from 196 to 40 as shown in Figure 5B.

Effects of Different Step and Window Size on Extracted Fractal Features
For further analysis about effects of step and window size on the relationship between fractal features and soil properties, a matrix of step-window pairs was generated by defining step size ranging from 2.5 nm to 50.0 nm at an interval of 2.5 nm and window size ranging from 10.0 nm to 100.0 nm at an interval of 5.0 nm.For each pair of these two parameters, fractal features were derived according to Equation (6).A gradient-boosting regression model using the XGBoost tool was built on a random sample of two-thirds of organic soil samples, and then applied to the estimation of each sample from the validation dataset.Pre-processing methods for soil spectra could also be applied to new fractal features because of the shape similarity between fractal features and the raw spectral curve.For example, fractal features were smoothed by use of Savitzky-Golay filter.R 2 derived by step-window pairs for SOC using rodogram, madogram and variogram methods are shown in Figure 6(A2-A4) respectively, as is the case for N and pH in Figure 6B,C.For a comparable study, the regression model was also applied to raw spectral values and PCA-transformed data.
Taking advantage of fractal features, models developed for SOC estimation achieved comparably good results, R 2 varies from 0.64 to 0.83 (rodogram), 0.70 to 0.84 (madogram) and 0.72 to 0.84 (variogram).For pH, R 2 varies from 0.61 to 0.80 (rodogram), 0.63 to 0.80 (madogram) and 0.63 to 0.82 (variogram.However, for N there is comparatively less accuracy.R 2 varies from 0.52 to 0.74 (rodogram), 0.53 to 0.75 (madogram) and 0.55 to 0.76 (variogram).Models with raw spectra were developed by evenly selecting desired number of spectral measurements.The Hughes phenomenon can be seen well in models built with raw spectra.R 2 increased first and then declined with the increase of feature numbers.It can be seen that models with raw spectra had the poorest performance.For SOC and N, fractal features outperformed PCA-transformed features and raw spectra.Fractal features for pH achieved similar accuracies compared to PCA-transformed features.

Effects of Different Step and Window Size on Extracted Fractal Features
For further analysis about effects of step and window size on the relationship between fractal features and soil properties, a matrix of step-window pairs was generated by defining step size ranging from 2.5 nm to 50.0 nm at an interval of 2.5 nm and window size ranging from 10.0 nm to 100.0 nm at an interval of 5.0 nm.For each pair of these two parameters, fractal features were derived according to Equation (6).A gradient-boosting regression model using the XGBoost tool was built on a random sample of two-thirds of organic soil samples, and then applied to the estimation of each sample from the validation dataset.Pre-processing methods for soil spectra could also be applied to new fractal features because of the shape similarity between fractal features and the raw spectral curve.For example, fractal features were smoothed by use of Savitzky-Golay filter.R 2 derived by step-window pairs for SOC using rodogram, madogram and variogram methods are shown in Figure 6(A2-A4) respectively, as is the case for N and pH in Figure 6B,C.For a comparable study, the regression model was also applied to raw spectral values and PCA-transformed data.
Taking advantage of fractal features, models developed for SOC estimation achieved comparably good results, R 2 varies from 0.64 to 0.83 (rodogram), 0.70 to 0.84 (madogram) and 0.72 to 0.84 (variogram).For pH, R 2 varies from 0.61 to 0.80 (rodogram), 0.63 to 0.80 (madogram) and 0.63 to 0.82 (variogram.However, for N there is comparatively less accuracy.R 2 varies from 0.52 to 0.74 (rodogram), 0.53 to 0.75 (madogram) and 0.55 to 0.76 (variogram).Models with raw spectra were developed by evenly selecting desired number of spectral measurements.The Hughes phenomenon can be seen well in models built with raw spectra.R 2 increased first and then declined with the increase of feature numbers.It can be seen that models with raw spectra had the poorest performance.For SOC and N, fractal features outperformed PCA-transformed features and raw spectra.Fractal features for pH achieved similar accuracies compared to PCA-transformed features.Figure 6.Gradient-boosting regression modelling accuracies for SOC, N and pH.(A1), (B1) and (C1) were with principal component analysis (PCA)-transformed features and raw spectra; (A2), (B2) and (C2) were with fractal features derived by the rodogram method with various step-scale pairs.(A3), (B3) and (C3) were with fractal features derived by the madogram method with various step-window pairs.(A4), (B4) and (C4) were with fractal features derived by the variogram method with various step-scale pairs.

Modelling Soil Properties with Fractal Features
Window sizes and step sizes adopted to optimize the gradient-boosting regression model can be seen in Section 3.2.Fractal feature numbers approximately ranged from 40 to 800.The optimal pairs of step-window sizes for SOC, N and pH can be seen in Table 2.For each gradient-boosting regression model built with XGBoost library, the maximum tree depth was 4 and maximum number of trees was 100.R 2 was used as the evaluation metric for validation data.
The best trade-off between step and window size for SOC (R 2 = 0.851, RMSE = 56.7 g/kg, RPD = 2.59) was 2.5 nm for the former and 105.0 nm for the latter with variogram estimator.The best performance step-window sizes for N (R 2 = 0.776, RMSE = 3.01 g/kg, RPD = 2.09) were step size at 2.5 nm and window size at 65.0 nm with the variogram estimator.The best performance step-window size for N (R 2 = 0.822, RMSE = 0.49, RPD = 2.31) were step size at 7.5 nm and window size at 45.0 nm with the variogram estimator.From Table 2, it can be seen that fractal-based feature extraction methods tend were with principal component analysis (PCA)-transformed features and raw spectra; (A2), (B2) and (C2) were with fractal features derived by the rodogram method with various step-scale pairs.(A3), (B3) and (C3) were with fractal features derived by the madogram method with various step-window pairs.(A4), (B4) and (C4) were with fractal features derived by the variogram method with various step-scale pairs.

Modelling Soil Properties with Fractal Features
Window sizes and step sizes adopted to optimize the gradient-boosting regression model can be seen in Section 3.2.Fractal feature numbers approximately ranged from 40 to 800.The optimal pairs of step-window sizes for SOC, N and pH can be seen in Table 2.For each gradient-boosting regression model built with XGBoost library, the maximum tree depth was 4 and maximum number of trees was 100.R 2 was used as the evaluation metric for validation data.
The best trade-off between step and window size for SOC (R 2 = 0.851, RMSE = 56.7 g/kg, RPD = 2.59) was 2.5 nm for the former and 105.0 nm for the latter with variogram estimator.The best performance step-window sizes for N (R 2 = 0.776, RMSE = 3.01 g/kg, RPD = 2.09) were step size at 2.5 nm and window size at 65.0 nm with the variogram estimator.The best performance step-window size for N (R 2 = 0.822, RMSE = 0.49, RPD = 2.31) were step size at 7.5 nm and window size at 45.0 nm with the variogram estimator.From Table 2, it can be seen that fractal-based feature extraction methods tend to keep a much larger number of features compared to PCA.To achieve similar performance of PCA, fractal-based approaches need to retain ~200 features, such as 190 for SOC (R 2 = 0.819, RMSE = 62.49g/kg, RPD = 2.34) where step size and window size were respectively 10.0 nm and 105.0 nm, 128 features for N (R 2 = 0.736, RMSE = 3.26 g/kg, RPD = 1.92)where step size and window size were respectively 15.0 nm and 135.0 nm, and 131 features for pH (R 2 = 0.807, RMSE = 0.50, RPD = 2.22) where step size and window size were respectively 15.0 nm and 50.0 nm.
In real-world examples, there are many ways to extract features from a dataset.Often it is beneficial to combine several methods to obtain good performance.To assess whether predictive accuracy could be enhanced by integrating multiple features, the first 30 PCA components were combined with fractal features and then ingested into the gradient-boosting regression model.Combined features showed better performance when applied for the estimation of all three soil properties, SOC (R 2 = 0.86, RMSE = 55.16 g/kg, RPD = 2.7), N (R 2 = 0.78, RMSE = 2.96 g/kg, RPD = 2.19) and pH (R 2 = 0.85, RMSE = 0.44, RPD = 2.59), as shown in Figure 7.In real-world examples, there are many ways to extract features from a dataset.Often it is beneficial to combine several methods to obtain good performance.To assess whether predictive accuracy could be enhanced by integrating multiple features, the first 30 PCA components were combined with fractal features and then ingested into the gradient-boosting regression model.Combined features showed better performance when applied for the estimation of all three soil properties, SOC (R 2 = 0.86, RMSE = 55.16 g/kg, RPD = 2.7), N (R 2 = 0.78, RMSE = 2.96 g/kg, RPD = 2.19) and pH (R 2 = 0.85, RMSE = 0.44, RPD = 2.59), as shown in Figure 7.The quantitative method proposed in the paper can be viewed as a combination of fractal features and gradient-boosting regression (method C), and it achieved the best performance (R 2 = 0.851) for the estimation of SOC contents of these three methods.We also applied methods A and B to the estimation of N and pH contents.For N, the same case applied; method C showed the highest R 2 .Although method A (PLS regression) achieved the best performance for the estimation of pH contents, when focusing on extracted features, fractal features had similar performance compared with PLS components, the R 2 for method C being 0.821 and for method B, 0.823.The only difference between these two methods was the ingested features.The results are summarised in Table 3, and it can be seen that fractal features can achieve similar or even better results compared with PLS components.

The Importance of Fractal Dimension for Soil Spectra
The correlations between fractal dimension and soil properties were assessed by means of Pearson correlation analysis when the fractal dimension calculation was applied to the whole spectrum.Significant negative correlations for SOC (r = −0.54)and N (r = −0.50)with the fractal dimension were found, which means that values of SOC and N could have effects on the shape of soil spectra and therefore diagnostic wavelengths exist for SOC and N.In [55] an absorption peak centred at 600 nm was observed, which seems to be related to SOC content.At 2100 nm, there was an absorption determined by N content.In [4] the authors also highlighted that wavelengths of around 1100, 1600, 1700-1800, 2000, and 2200-2400 nm have been identified as being particularly important for SOC and N estimation.
The pH showed a very weak correlation with the fractal dimension (r = −0.12),which could be caused by a lower direct spectral response to soil pH [4].It has to be pointed out that the weak correlation between pH and fractal dimension does not mean that soil spectra cannot be used to quantify soil pH values, but means that the variation of soil pH values does not significantly The quantitative method proposed in the paper can be viewed as a combination of fractal features and gradient-boosting regression (method C), and it achieved the best performance (R 2 = 0.851) for the estimation of SOC contents of these three methods.We also applied methods A and B to the estimation of N and pH contents.For N, the same case applied; method C showed the highest R 2 .Although method A (PLS regression) achieved the best performance for the estimation of pH contents, when focusing on extracted features, fractal features had similar performance compared with PLS components, the R 2 for method C being 0.821 and for method B, 0.823.The only difference between these two methods was the ingested features.The results are summarised in Table 3, and it can be seen that fractal features can achieve similar or even better results compared with PLS components.

The Importance of Fractal Dimension for Soil Spectra
The correlations between fractal dimension and soil properties were assessed by means of Pearson correlation analysis when the fractal dimension calculation was applied to the whole spectrum.Significant negative correlations for SOC (r = −0.54)and N (r = −0.50)with the fractal dimension were found, which means that values of SOC and N could have effects on the shape of soil spectra and therefore diagnostic wavelengths exist for SOC and N.In [55] an absorption peak centred at 600 nm was observed, which seems to be related to SOC content.At 2100 nm, there was an absorption determined by N content.In [4] the authors also highlighted that wavelengths of around 1100, 1600, 1700-1800, 2000, and 2200-2400 nm have been identified as being particularly important for SOC and N estimation.
The pH showed a very weak correlation with the fractal dimension (r = −0.12),which could be caused by a lower direct spectral response to soil pH [4].It has to be pointed out that the weak correlation between pH and fractal dimension does not mean that soil spectra cannot be used to quantify soil pH values, but means that the variation of soil pH values does not significantly contribute to the smoothness or roughness of the spectral curve.Soil pH value can still be well estimated in the laboratory or in the field [55,56] using raw spectral data, which might be due to the mutual effect of spectrally active soil constituents such as organic matter and clay [57].It also can be seen that the Pearson correlation between fractal dimension and soil properties has a positive relationship with the performance of fractal features.

Modelling Soil Properties with Fractal Features
Three methods for the fractal dimension calculation and further feature extraction were studied in this paper.The results demonstrate that the variogram estimator had slightly better performance than the madogram estimator when applied to fractal feature generation for soil property estimation, and methods using these two estimators achieved better R 2 than the method using the rodogram estimator.In [58] the classification achieved better results with texture layers derived from the madogram.Since the madogram estimator calculates the sum of the absolute value of the semivariance for all observed lags, it yields a softer effect on the presence of outliers compared to the variogram estimator.However, in our study, soil spectra were well pre-processed by the Savitzky-Golay filter and generated fractal features.Fractal features generated by these three estimators have a similar curve shape and achieved very close estimation accuracies for tested soil properties.
Step-window pairs have significant impact on estimation accuracies of soil properties.When the window size is fixed, accuracies are decreased with the increase of step size.However, when the step size is fixed, accuracies are prone to ascend slightly and then clearly descend.A higher R 2 was found to be located at the bottom of the step-window matrix.However, there is no guarantee as to which step-window pair is the best parameter for soil property estimation.Therefore, a hyper-parameter optimisation method should be adopted for each of the soil properties.
In general, fractal features achieved better results compared to PCA-transformed features and raw spectra.This demonstrates that by taking advantage of fractal information encoded in the soil spectral shape, soil properties can be estimated in a better way.Besides, when raw data are transformed or projected via PCA, measurement units and shape are lost.However, fractal-based feature extraction is prone to retaining much larger number of features compared to PCA.To achieve similar performance, the fractal-based approach needs ~200 feature numbers while PCA only needs ~30.When compared with PLS components, fractal features also had better performance for the estimation of OC and N contents.However, there is no conflict between common feature extraction practices with the proposed fractal method.When integrating different kinds of features, like PCA-transformed features and fractal features, the performance is expected to be improved for the retrieval of soil properties.

Conclusions
Data acquisition with Vis-NIR-SWIR spectroscopy is relatively easy, and a wide range of soil properties can be analysed within a comparatively short time with relatively little effort for sample preparation.Soil spectroscopy has recently been identified as a method that has the potential to rapidly estimate soil properties.Many soil-spectral libraries are already built at regional, continental or even global scales.Various multivariate statistics methods have been successfully adopted to explore the relationship between soil spectra and soil physical/chemical properties.However, few studies are focused on feature extraction from measured soil spectra, which is also crucial to correlating spectra with soil properties.
The present study presents a novel methodology for feature extraction based on fractal geometry.Each Vis-NIR-SWIR spectrum can be divided into multiple segments by defining the moving window size and the step size.For each segmented spectral curve, the fractal dimension value was calculated using variation estimators.Fractal features, generated by multiplying the fractal dimension value with spectral energy, were further combined with PCA-transformed features, and the gradient-boosting regression model achieved good performance with respect to the retrieval of SOC (R 2 = 0.86, RMSE = 55.16 g/kg, RPD = 2.7), N (R 2 = 0.78, RMSE = 2.96 g/kg, RPD = 2.19) and pH (R 2 = 0.85, RMSE = 0.44, RPD = 2.59).Fractal analysis can be functionalised as an approach to examine the relationship between soil spectra and soil properties, which can characterise statistical self-similarity and further quantify the irregularity of soil spectra [47].Fractal features, by taking advantage of fractal information encoded in the shape of soil spectral curve, can reflect the impact of various properties on soil spectra except when the properties have less direct spectral response.In this case, fractal features can still be functioned to quantify the corresponding soil property, however, they not perform as well.Fractal features performed well when ingested into quantitative soil spectroscopic models, and the proposed fractal method can not only reduce the dimensionality in the original space, but also simultaneously maintain the spectral shape, which means that methods for raw spectra can also be applied to extracted fractal features, for example, calibrating soil properties using PLS regression with fractal features.

Figure 1 .
Figure 1.Distribution of organic soil samples in the Land Use/Land Cover Area Frame Survey (LUCAS) topsoil database.Colours indicate amounts of soil organic carbon (SOC) content.

Figure 1 .
Figure 1.Distribution of organic soil samples in the Land Use/Land Cover Area Frame Survey (LUCAS) topsoil database.Colours indicate amounts of soil organic carbon (SOC) content.

Figure 2 .
Figure 2. Illustration of fractal dimension calculation.(A) is the spectral curve and (B) is the corresponding log-log plot of variogram and lags and the fitted regression line.

Figure 2 .
Figure 2. Illustration of fractal dimension calculation.(A) is the spectral curve and (B) is the corresponding log-log plot of variogram and lags and the fitted regression line.

Figure 3 .
Figure 3. Illustration of the meaning of step and window size for multiple fractal feature generation.(step size = 100.0nm, window size = 200.0nm).

Figure 3 .
Figure 3. Illustration of the meaning of step and window size for multiple fractal feature generation.(step size = 100.0nm, window size = 200.0nm).
[53].An RPD <1.0 indicates a very poor model and its use is not recommended; an RPD between 1.0 and 1.4 indicates a poor model where only high and low values are distinguishable; an RPD between 1.4 and 1.8 indicates a fair model which may be used for assessment and correlation; RPD values between 1.8 and 2.0 indicate a good model where quantitative predictions are possible; an RPD between 2.0 and 2.5 indicates a very good, quantitative model, and an RPD >2.5 indicates an excellent model.

Figure 4 .
Figure 4. (A) Average spectral reflectance and continuum removal reflectance of LUCAS organic soil samples computed by SOC classes.(B-D) Average fractal energy and continuum removal responses of organic soil samples computed by SOC classes using rodogram, madogram and variogram estimators respectively.The central wavelength number of the corresponding spectral segment is assigned to the fractal feature.

Figure 4 .
Figure 4. (A) Average spectral reflectance and continuum removal reflectance of LUCAS organic soil samples computed by SOC classes.(B-D) Average fractal energy and continuum removal responses of organic soil samples computed by SOC classes using rodogram, madogram and variogram estimators respectively.The central wavelength number of the corresponding spectral segment is assigned to the fractal feature.

Figure 5 .
Figure 5.The effect of step and window size on generated fractal features.(A )are fractal feature curves when window sizes were at 15.0-95.0nm (step size fixed at 2.5 nm); (B) is the number of fractal features when step sizes were increased from 10.0 to 50.0 nm (window size fixed at 50.0 nm).

Figure 5 .
Figure 5.The effect of step and window size on generated fractal features.(A)are fractal feature curves when window sizes were at 15.0-95.0nm (step size fixed at 2.5 nm); (B) is the number of fractal features when step sizes were increased from 10.0 to 50.0 nm (window size fixed at 50.0 nm).
Remote Sens. 2016, 8, 1035 11 of 19 to keep a much larger number of features compared to PCA.To achieve similar performance of PCA, fractal-based approaches need to retain ~200 features, such as 190 for SOC (R 2 = 0.819, RMSE = 62.49g/kg, RPD = 2.34) where step size and window size were respectively 10.0 nm and 105.0 nm, 128 features for N (R 2 = 0.736, RMSE = 3.26 g/kg, RPD = 1.92)where step size and window size were respectively 15.0 nm and 135.0 nm, and 131 features for pH (R 2 = 0.807, RMSE = 0.50, RPD = 2.22) where step size and window size were respectively 15.0 nm and 50.0 nm.

Figure 9 .
Figure 9.The gradient-boosting regression model with PLS components for the estimation of OC contents.

Figure 9 .
Figure 9.The gradient-boosting regression model with PLS components for the estimation of OC contents.

Table 1 .
Pearson correlation coefficients between soil properties and fractal dimensions calculated by rodogram, madogram and variogram estimators.

Table 3 .
Comparison of three methods for the quantitative retrieval of soil properties.

Table 3 .
Comparison of three methods for the quantitative retrieval of soil properties.