Assessment of Soil Quality under Different Soil Management Strategies: Combined Use of Statistical Approaches to Select the Most Informative Soil Physico-Chemical Indicators

: Assessment of soil quality under different management practices is crucial for sustainable agricultural production and natural resource use. In this study, different statistical methods (principal component analysis, PCA; stepwise discriminant analysis, SDA; partial least squares regression with VIP statistics, PLSR) were applied to identify the variables that most discriminated soil status under minimum tillage and no-tillage. Data collected in 2015 from a long-term ﬁeld experiment on durum wheat ( Triticum durum Desf.) were used and twenty soil indicators (chemical, physical and biological) were quantiﬁed for the upper soil layer (0–0.20 m). The long-term iteration of different management strategies affected soil quality, showing greater bulk density, relative ﬁeld capacity (RFC), organic and extractable carbon contents (TOC and TEC) and exchangeable potassium under no-tillage. PCA and SDA conﬁrmed these results and underlined also the role of available phosphorous and organic carbon fractions as variables that most discriminated the treatments investigated. PLSR, including information on plant response (grain yield and protein content), selected, as the most important variables, plant nutrients, soil physical quality indicators, pH and exchangeable cations. The research showed the effectiveness of combining variable selection methods to summarize information deriving from multivariate datasets and improving the understanding of the system investigated. The statistical approaches compared provided different results in terms of variables selected and the ranking of the selected variables. The combined use of the three methods allowed the selection of a smaller number of variables (TOC, TEC, Olsen P, water extractable nitrogen, RFC, macroporosity, air capacity), which were able to provide a clear discrimination between the treatments compared, as shown by the PCA carried out on the reduced dataset. The presence of a response variable in PLSR considerably drove the feature selection process.


Introduction
Soil is increasingly recognized as major provider of ecosystem services [1,2]. Food and biomass production, climate change mitigation through carbon cycling and carbon sequestration, prevention of land degradation, water purification and supply are some of the services supported, regulated and provided by soils [3,4]. In addition, soil quality has strong implications on human health, both by producing safe and nutritious food and protecting from environmental pollution, thus demonstrating its important role in both society and the environment [5,6]. These evidences have raised the awareness that In particular, VIP statistics are able to summarize the contribution of both predictors and response variables [37].
Principal component analysis (PCA) is a widespread procedure designed to summarize large datasets of correlated variables into a reduced number of components bearing the greatest part of the original information [38]. Variable weights or loadings of the retained components are useful to identify the variables that most contribute to each selected PC and investigate their relationships. Variables correlations and stepwise and canonical discriminant analysis (SDA and CDA) have been adopted in previous studies [15,33,39]. In particular, SDA allows the selection of variables which can best differentiate treatments or classes.
In a study aimed to assess the effects of different management practices on soil physical quality (SPQ) in long-term field experiments, Castellini et al. [15] applied correlation analysis, PCA and SDA to select key soil physical quality indicators. The results highlighted the complementary and supplementary role of the three methods applied and the importance of simultaneously using different approaches to gain a complete understanding of the processes investigated. De Paul Obade et al. [35], after a preliminary comparison of different multivariate approaches, used PLSR to define a standardized soil quality index for different soil conditions, natural vegetation or woodlands and different soil management, conventional tillage and no-tillage. Soil organic carbon, bulk density, carbon-nitrogen ratio and electrical conductivity were identified as the major variables influencing soil quality status and the index provided a comprehensive evaluation of the management investigated. Shukla et al. [24] used factor analysis to identify appropriate soil quality indicators under different land use and management practices. Soil organic carbon was the most dominant measured attribute; other key soil attributes were mainly physical indicators-i.e., field water capacity, air-filled porosity, bulk density-and pH. Pulido Moncada et al. [34] applied decision trees and assessed soil quality using visual soil assessment in the field and a limited number of physical and chemical soil properties.
Regardless of the approach used, little research is available on the comparison and combined use of different statistical methods for selection of soil indicators. Therefore, the main objectives of this study were to (i) identify the most suitable variables for discriminating soil status under different soil management strategies and (ii) compare the performance of three multivariate statistical approaches to select soil indicators. Data collected in 2015 from a long-term field experiment investigating the effects of two soil management strategies on durum wheat (Triticum durum Desf.) in a Mediterranean area were used. The performance of PCA was compared to SDA and PLSR using VIP statistics (PLS-VIP) for feature selection.

Study Area and Long-Term Field Experiment
The study was carried out on a dataset collected from a long-term field experiment, performed at the experimental farm of the Council for Agricultural Research and Economics (CREA-AA) in Southern Italy (Foggia, 41 • 27 03 N, 15 • 30 06 E).
The climate of the study area is classified as "accentuated thermomediterranean" [40], with temperatures that may fall below 0 • C in winter and exceed 40 • C in summer. Rainfall is unevenly distributed throughout the year and is mostly concentrated in the winter months, with a long-term annual average of 550 mm [41]. The soil is clay of alluvial origin (42.7 and 29.6% of clay and sand, respectively), classified as fine, mesic, Typic Chromoxerert [42].
The rain-fed durum wheat monoculture tillage trial, established in 2002 and still ongoing, compares the effects of two soil management strategies (i.e., minimum tillage, MT, and no-tillage, NT) on durum wheat yield response and on soil fertility. Minimum tillage consists of a two-layer soil tillage at 40 cm depth (i.e., a chisel and rotary tiller combination) performed in autumn before durum wheat sowing. No-tillage consists of a direct sowing of durum wheat after a chemical weeding treatment. Treatments are arranged Appl. Sci. 2021, 11, 5099 4 of 17 in a randomized complete block design with three replicates and unit plots of 500 m 2 size. For both treatments, in September straw was chopped into 10-to 15-cm lengths and spread back onto the plot. Weeding with glyphosate was carried out in early November on NT plots. Sowing was performed for both treatments at the end of November with a seeder for direct sowing, appropriately equipped with shaped blades. All other agronomic techniques (fertilization, pest control and weed management during crop growth) were carried out uniformly for the two soil management compared. At harvesting, yield was measured at each soil location within a subarea of 1 × 1 m. Further information on plot management can be found in Castellini et al. [15].

Soil Sampling and Laboratory Measurements
Undisturbed soil cores were collected at wheat heading (April 2015) within each experimental unit of the RCBD experimental design in 4 sub-replicate locations, for a total of 24 observations. Soil samples were collected at 0-0.20 m depth. The following indicators were quantified on the soil samples: (i) chemical indicators: total organic carbon and total nitrogen (TOC and N), alkaliextractable carbon (TEC), humic and fulvic acid carbon (HA_FA), water extractable nitrogen and organic carbon (WEN and WEOC), Olsen available phosphorus (P_Olsen), exchangeable cations (Ca 2+ , Mg 2+ , Na + , K + ), pH and electrical conductivity (EC); (ii) physical indicators: texture, dry bulk density (BD), macroporosity (P MAC ), air capacity (AC) and relative field capacity (RFC); (iii) biological indicator: microbial biomass carbon quantified with the fumigationextraction method [43].
WEOC and WEN, which are indicators for labile organic C and N pools, were extracted from field-moist soil samples according to the protocol reported in Armenise et al. [28]. TOC, TEC, HA_FA and total N were quantified on dried and 2-mm sieved samples, following protocols reported in Ferrara et al. [17]. In detail, for TOC quantification, soil samples were ground to a fine powder (0.5 mm) using an agate ball mill. TEC was obtained by 0.1 M NaOH + 0.1 M Na 4 P 2 O 7 extraction at 65 • C for 48 h. Humic and fulvic acids were fractionated by acidification to pH = 2.0 with H 2 SO 4 . The purification of FA from non-humic substances was carried out by adsorption onto polyvinylpyrrolidone columns. Total organic carbon in soil samples, as well as C fractions in alkali extracts and C and N fractions in water extracts, were quantified through dry combustion using a TOC Vario Select analyzer (Elementar, Lomazzo, Germany), which conducts a catalytic combustion of the sample at high temperatures in air environment. Total N was quantified according to the Kjeldahl procedure.
Both BD and some points of soil water retention curve, namely the relationship between volumetric soil water content (θ) and water pressure head (h), were determined using undisturbed soil cores. Specifically, stainless steel rings with sharp edges (8 cm inner diameter; 5 cm height) were used to determine soil BD and water retention curve at high pressure heads (h ≥ 120 cm). A disturbed soil sample was collected close to the undisturbed sample collection points to determine the water retention curve at low pressure heads (h ≤ 330 cm). The θ values were determined on each undisturbed soil core by a hanging water column apparatus [44] for h values ranging from −5 to −120 cm, and on repacked soil cores by pressure plate method [45] for h values ranging from −330 to −15,300 cm [46]. The soil water retention function was obtained fitting the experimental data with the van Genuchten model [47], and a set of capacitive indicators that give into account of water/air availability were estimated: macroporosity (P MAC = θ s − θ m ) (cm 3 cm −3 ), air capacity (AC = θ s − θ FC ) (cm 3 cm −3 ) and relative field capacity (RFC = θ FC /θ s ) (dimensionless), where θ s , θ m , θ FC are the volumetric water contents corresponding to a pressure head of 0, −10 and −100 cm, respectively. Evaluation of soil physical quality (SPQ) was carried out according to classifications gathered from literature [14]. In particular, the SPQ was considered optimal when 0.9 ≤ BD ≤1.2 g cm −3 , P MAC ≥ 0.07 cm 3 cm −3 , AC ≥ 0.14 cm 3 cm −3 and 0.6 ≤ RFC ≤ 0.7. Supplementary information on the described indicators can be found in Castellini et al. [15].

Data Analysis
Descriptive statistics were computed to summarize the main features of data distribution of response variables to be used in the regressive approach. In addition, variables were tested for normality, using Shapiro-Wilk and Kolmogorov-Smirnov tests, and for heteroscedasticity by soil management with Levene homogeneity of variance test. Data distribution and presence of heteroscedasticity were also examined for soil variables.
The set of twenty soil variables was first analyzed through a nested analysis of variance (ANOVA) considering replicates within plots as pseudo-replicates. Then, data were analyzed using Principal Component Analysis (PCA), Stepwise Discriminant Analysis (SDA) and Partial Least Square Regression (PLSR) with Variable Importance for Projection (VIP) statistics.
PCA was applied to the correlation matrix of the soil variables in order to obtain few new components explaining most of the variation of the original variables. The principal components (PCs) that explained cumulatively a high percentage of the total variance and had an eigenvalue greater than one (Kaiser criterion) were retained. Together with eigenvalue, percentage of variation explained by the single component was taken into account, considering the threshold of 5% suggested by Wander and Bollero [48]. Variable loadings were examined. Within each PC, only highly weighted loadings, defined as having absolute values within 10% of the highest loading [49], were considered and signs were examined to investigate relationships among selected variables.
SDA was applied to identify the variables enabling maximum discrimination among the compared classes (soil management treatments). The Wilks' lambda statistic was used as multivariate measure of separability [50]. The use of SDA requires that a set of assumptions should be checked, among which normality of data distributions, homoscedasticity and not complete redundancy of considered variables. However, a moderate departure from such assumptions does not affect seriously analysis outcomes as shown by a large literature concerning SDA application [51][52][53]. SDA was carried out using the STEPWISE algorithm of STEPDISC procedure of SAS/STAT [54]; significance level to entry and to stay was set at 0.05.
PCA and SDA were first performed separately on the set of chemical variables (plus carbon of the microbial biomass) and physical variables, and then were carried out on the whole dataset of indicators.
PLSR was carried out on mean-centered and variance-scaled data of predictors and response variables. In this study, predictors were represented by soil chemical, physical and biological properties; response variables were grain yield and grain protein content, as integrated indicators of quantitative and qualitative crop response to different soil management conditions. The optimal number of factors to be retained in the model was based on the minimum predicted residual sum of squares statistics (PRESS) [54]. In this study, both approaches were used in running PLSR: PLS1 and PLS2 [55,56]. In PLS1, PLSR was carried out using grain yield or protein content as single response variable; in PLS2, grain yield and protein content were considered simultaneously as multiple response variables.
VIP statistics were used for variable selection [57]. Since the average of squared VIP scores equals 1 [58], only the variables with a VIP score greater than 1 are generally considered significant. However, thresholds between 0.83 and 1.21 have also been suggested [32]. A value of 0.8 is also considered by Wold [59] for retaining or deleting variables. The soil variables with the highest VIP values were selected. PLS regression was performed using PLS procedure of SAS/STAT [54].

Preliminary Statistical Analysis
Preliminary statistical analysis carried out on grain yield and protein content, response variables in PLSR, showed close mean and median values and coefficient of skewness and kurtosis equal or lower than 0.5. These results were confirmed by normality tests indicating for both variables a not significant deviation from normal distribution (P = 0.5198 and P = 0.0970 for the Shapiro-Wilk test, for grain yield and protein content, respectively). Variances were homogeneous over management treatments for both variables according to Levene test (P = 0.4128 for grain yield and P = 0.2432 for protein content). Average values of grain yield and protein content were respectively 5.247 Mg ha −1 and 13.39 g 100 g −1 .
Soil variable distributions did not significantly deviate (microbial biomass C, TOC, TEC, HA_FA, N, Olsen P, pH, K + , Mg 2+ , Na + , BD, AC, RFC, clay and sand) or only slightly deviated (EC, Ca 2+ , P MAC ) from normal distribution, except for WEOC and WEN. Homoscedasticity was also observed in the larger part of the cases (except for WEN and P MAC ). For this reason, data were analyzed for all the variables in the original scale.

Analysis of Variance
Different soil management strategies significantly affected both total organic and total extractable carbon (TOC and TEC), and water extractable nitrogen (WEN) concentrations, with higher TOC and TEC content observed in the upper soil layer under more conservative soil management (Table 1). In particular, an average TOC value of 21.87 g kg −1 was recorded under no-tillage (NT) in comparison to 17.45 g kg −1 under minimum tillage (MT) management. A greater significant concentration of exchangeable potassium (K + ) was also observed in untilled soils (Table 1). Under minimum tillage, higher WEN concentrations were observed. * and ** indicate respectively differences at P ≤ 0.05 and P ≤ 0.01. Means followed by different letters are significantly different according to the SNK test (P = 0.05). Pr(> F) indicates the probability value (p-value) to determine whether to reject the null hypothesis. WEOC = water extractable organic carbon; WEN = water extractable nitrogen; C_biomass = microbial biomass carbon; TOC = total organic carbon; TEC = alkali-extractable carbon; HA_FA = humic and fulvic acid carbon; N = total nitrogen; P_Olsen = Olsen available phosphorus; EC = electrical conductivity; Ca 2+ , Mg 2+ , Na + , K + = exchangeable cations. (n = 24).
Soil management also affected physical properties with a significantly greater bulk density (BD) in untilled soils and, as a consequence, lower air capacity (AC), indicating a tendency to soil compaction ( Table 2). Macroporosity (P MAC ) confirmed this trend, although not significant differences were recorded. Relative field capacity (RFC), that gives an account of the balance between water capacity and air capacity of the soil (in other words, it is an index of the relative importance of meso-micropores to total porosity), was higher in NT than in MT, suggesting major potential risks of anaerobic conditions due to reduced presence of air in the soil porosity. * and ** indicate, respectively, differences at P ≤ 0.05 and P ≤ 0.01. Means followed by different letters are significantly different according to the SNK test (P = 0.05). Pr(> F) indicates the probability value (p-value) to determine whether to reject the null hypothesis. BD = dry bulk density; P MAC = macroporosity; AC = air capacity; RFC = relative field capacity. (n = 22 for BD, PMAC, AC, RFC. n = 24 for clay and sand).
According to Reynolds et al. [14], optimal and intermediate values were observed under MT for bulk density (0.9-1.2 g cm −3 , optimal range), air capacity (0.10-0.14 cm 3 cm −3 , intermediate range) and macroporosity (0.04-0.07 cm 3 cm −3 , intermediate range) and values slightly over than the optimal threshold for RFC (0.6-0.7, optimal range). Except for BD, values recorded under NT were all indicative of limited aeration conditions.
The long-term iteration of the different management strategies slightly affected grain yield response (P = 0.0603), with average values of 5.45 and 5.04 Mg ha −1 recorded under no-tillage and minimum tillage, respectively. No significant effect of the management compared was instead observed for grain protein content.

Principal Component Analysis
PCA was first performed separately on the set of chemical and physical variables, and then carried out on the whole dataset.
In the analysis of the chemical indicators, the first three components (PCs) explained about 65.16% of total variance, whereas in the analysis of the physical indicators, the first two PCs were able to explain 80.12% of total variance (Table 3). Table 3. Eigenvalues and variance explained by the first five principal components (PCs) of the analysis carried out on the set of (a) chemical and biological indicators (14 variables) and (b) physical indicators (6 variables).  The score plots of the first two components showed that both chemical and physical variables were able to discriminate the different soil management compared (Figure 1). The inspection of the loadings of the first PCs highlighted that no-tilled soils (NT) were characterized by a greater TOC and TEC, together with exchangeable K + content, and by a larger bulk density and RFC, whereas a lower P MAC and AC were detected. Appl. Sci. 2021, 11, x FOR PEER REVIEW 8 of 18 The analysis on the whole set of soil indicators (chemical, physical and biological) confirmed these results with an even clearer treatment discrimination and the same highly weighted variables showing the greatest loadings (Tables 4 and 5; Figure 2).  The analysis on the whole set of soil indicators (chemical, physical and biological) confirmed these results with an even clearer treatment discrimination and the same highly weighted variables showing the greatest loadings (Tables 4 and 5; Figure 2).   * indicates the significance of the variable loadings. The sign of variable loadings indicates the positive or negative correlation between the variables and the principal component. WEOC = water extractable organic carbon; WEN = water extractable nitrogen; C_biomass = microbial biomass carbon; TOC = total organic carbon; TEC = alkali-extractable carbon; HA_FA = humic and fulvic acid carbon; N = total nitrogen; P_Olsen = Olsen available phosphorus; EC = electrical conductivity; Ca 2+ , Mg 2+ , Na + , K + = exchangeable cations. BD = dry bulk density; PMAC = macroporosity; AC = air capacity; RFC = relative field capacity. In detail, the first four PCs explained cumulatively about 72.54% of total variance (Table 4). In the first PC (41.21% of total variance) the highly weighted variables were TOC and TEC among chemical variables, whereas RFC and AC, followed by PMAC, for the hydrological soil parameters (Table 5). Slightly under the threshold of 10% of the highest factor loading [50], there were exchangeable K + and bulk density and, at a lower extent, WEN. In the second PC (12.61%), available P and humic and fulvic acids showed the highest loadings, with microbial biomass carbon and pH slightly under the threshold. In the third (10.65%) and fourth (8.07%) components, exchangeable Ca 2+ and N were selected, respectively ( Table 5). The inspection of these results showed that the first PC summarized main findings of the analysis of variance. The second component highlighted instead the role of available nutrients (P) and some carbon fractions (humic and fulvic acids C and microbial biomass C), adding, in this way, further elements to explain differences observed in the experimental conditions. In detail, the first four PCs explained cumulatively about 72.54% of total variance (Table 4). In the first PC (41.21% of total variance) the highly weighted variables were TOC and TEC among chemical variables, whereas RFC and AC, followed by P MAC , for the hydrological soil parameters (Table 5). Slightly under the threshold of 10% of the highest factor loading [50], there were exchangeable K + and bulk density and, at a lower extent, WEN. In the second PC (12.61%), available P and humic and fulvic acids showed the highest loadings, with microbial biomass carbon and pH slightly under the threshold. In the third (10.65%) and fourth (8.07%) components, exchangeable Ca 2+ and N were selected, respectively ( Table 5). The inspection of these results showed that the first PC summarized main findings of the analysis of variance. The second component highlighted instead the role of available nutrients (P) and some carbon fractions (humic and fulvic acids C and microbial biomass C), adding, in this way, further elements to explain differences observed in the experimental conditions.

Stepwise Discriminant Analysis
SDA was first performed separately on the set of chemical and physical variables, and then carried out on the whole dataset.
Variables enabling maximum discrimination among treatments for the analysis carried out on the dataset of chemical and biological indicators were TOC (P < 0.0001) and HA_FA (P = 0.0004), followed at a lower extent by Olsen P (P = 0.0111), EC (P = 0.0258), WEOC (P = 0.0397) and TEC (P = 0.05). In the analysis carried out on the set of physical indicators, the variables selected were RFC (P < 0.0001) and clay (P = 0.037).
Finally, the analysis carried out on the whole dataset summarized the results obtained, selecting TOC (P < 0.0001), RFC (P < 0.0001) and WEOC (P = 0.003) as variables enabling maximum discrimination among treatments (Table 6). Both PCA and SDA, as well as univariate analysis of variance, returned TOC and RFC among the most influential variables both on the set of chemical and physical data analyzed separately as well as on the whole dataset. In addition, WEOC was selected among the most discriminating variables in the SDA carried out on the whole dataset of soil indicators. These findings are in agreement with results reported by previous studies in the selection of the most relevant indicators for assessing soil quality status.

Partial Least Squares Regression and VIP Statistics
PLSR was applied to gain further information in the selection of the most informative variables to describe the effects of the two soil management practices compared. The method was applied to the whole set of soil indicators and considering grain yield and grain protein content as single response variables (PLS1) and as multiple response variables (PLS2).
When wheat grain yield was considered as single response variable, the first two factors accounted cumulatively for 47.20% of total variance in predictors and 48.17% in the response variable. VIP coefficient profiles (Figure 3a) showed that the highest values were recorded for Olsen P and P MAC , followed by TOC, TEC, pH and Mg 2+ (for the chemical variables) and by clay, RFC and AC (for the physical variables).
The use of grain protein content as response variable further modified the rank of indicators selected. The first two factors accounted cumulatively for 49.54% of total variance in predictors and 23.59% in response. The greatest PLS-VIP statistics values were recorded for N and Olsen P, followed by WEN and sand, thus indicating the contribution of macro-elements, with particular regard to N (total and labile form), in affecting grain protein concentration (Figure 3b). In addition, TOC, TEC, Ca 2+ and pH showed important contribution.
The inspection of both VIP profiles (Figure 3a,b) underlined the role of physical soil indicators (P MAC , AC and RFC), which were selected as important variables in both analyses.
Finally, PLSR was carried out considering simultaneously, as response variables, the grain yield and protein content (PLS2). The first two factors extracted accounted cumulatively for 48.56% of total variation in predictors and 27.16% in the response variable. From the inspection of the VIP profile (Figure 3c), the role of available nutrients (Olsen P and, secondary, WEN) and hydrological variables (P MAC , AC and RFC) in explaining plant response emerged again. These variables were followed by soil carbon contents (TOC and TEC), pH and exchangeable cations.
The inclusion of a response variable in PLSR-yield, protein content or both-significantly modified the rank of soil indicators selected, giving greater emphasis to plant macronutrients (available P and N), particularly when the grain protein content was used as dependent variable. These results underline the importance of using combined approaches to explore the data and interpret the behavior of the system investigated.
Finally, for a further exploratory purpose, PCA was carried out on the variables more frequently selected with the combined use of SDA, PCA and PLSR, namely TOC, TEC, Olsen P, WEN, RFC, P MAC and AC. Figure 4 shows that the seven variables were able to discriminate the treatments under investigation. Additionally, a clear improvement in the percentage of variance explained by the first two components (89.34%) was observed in comparison to the analysis performed on the whole set of soil indicators.   proaches to explore the data and interpret the behavior of the system investigated. Finally, for a further exploratory purpose, PCA was carried out on the variables more frequently selected with the combined use of SDA, PCA and PLSR, namely TOC, TEC, Olsen P, WEN, RFC, PMAC and AC. Figure 4 shows that the seven variables were able to discriminate the treatments under investigation. Additionally, a clear improvement in the percentage of variance explained by the first two components (89.34%) was observed in comparison to the analysis performed on the whole set of soil indicators.

Discussion
Long-term experiments can be considered valuable research infrastructures, which enable the long-term study and monitoring of the effects of agricultural strategies or management scenarios. Among agricultural practices, soil tillage strategies are major determinants of soil status and quality. By directly acting on soil physical properties-modifying porosity (total pore space and pore size distribution), soil aggregate size and stability, hydraulic conductivity, soil tillage modifies air-water capacity relationships and thus induces changes in soil organic carbon dynamics, nutrient cycling and solute transport [60].

Discussion
Long-term experiments can be considered valuable research infrastructures, which enable the long-term study and monitoring of the effects of agricultural strategies or management scenarios. Among agricultural practices, soil tillage strategies are major determinants of soil status and quality. By directly acting on soil physical properties-modifying porosity (total pore space and pore size distribution), soil aggregate size and stability, hydraulic conductivity, soil tillage modifies air-water capacity relationships and thus induces changes in soil organic carbon dynamics, nutrient cycling and solute transport [60]. Significant effects can usually be observed when stable or near-stable conditions are established, after transition periods [61]. In this study, the long-term iteration (over about 13 years) of two soil management strategies-minimum tillage and no-tillage-considerably affected certain physical and chemical properties of the upper soil layer in the system investigated.
Analysis of variance showed that significantly greater bulk density (BD) and relative field capacity (RFC) values and, consequently, lower air capacity (AC), were recorded under no-tillage soil management. Since RFC is the ratio between the water content at the field capacity and that at water saturation, relatively higher RFC values highlight a reduced availability for soil air. In accordance with previous results on fine-textured soils [62], soil physical quality was thus indicative of lower aeration conditions under no-tillage.
The long-term no-tillage soil management also enhanced total organic and extractable carbon contents (TOC and TEC). The behavior observed is in agreement with several studies [19,63,64], since the reduced soil disturbance reduces the turnover of soil aggregates favoring the accumulation and stabilization of organic matter within micro-and macroaggregates, thus leading to a net gain of soil carbon. Significantly different values of C and N of the microbial biomass, bulk density, hydraulic conductivity and average weight of soil aggregates were also observed by Sharma et al. [49] under conventional and no-tillage in a long-term field experiment. Laudicina et al. [65], comparing the effect of different cropping systems (wheat/wheat and wheat/bean) and most used tillage managements (conventional, dual layer and no-tillage), in a long-term field experiment on soil organic C pools (total and extractable organic C, microbial biomass C, basal respiration), observed that tillage management affected the soil organic C stored in the first 15 cm of soil more than cropping system. No-tillage caused a 3.6 Mg ha −1 increase of C content in a wheat/faba rotation, and an increase of 5.6 Mg ha −1 in wheat monoculture after 19 years [65]. A greater exchangeable K + content was also observed under no-tillage. PCA and SDA confirmed and summarized results of analysis of variance but also underlined the role of organic carbon fractions -humic and fulvic acids and WEOC-, and available P as main sources of variability in describing the data (PCA), and as the variables that most discriminated the treatments compared (SDA). In a study assessing the suitability of different labile C fractions as soil quality indicators, Bongiorno et al. [66] found that WEOC (dissolved organic C) was not sensitive to soil tillage, unlike the particulate organic C; in any case, WEOC content highly depends on environmental conditions and soil sampling time [66]. In accordance with the present study, López-Fando and Pardo [67] measured higher concentrations of exchangeable K + and available P under no tillage compared to minimum tillage, at a soil depth of 0-20 cm. Similarly, Martin-Rueda et al. [68] found greater concentrations of plant-available K and P in surface soil (0-20 cm) under no-tillage compared to minimum tillage system, but no difference was observed between the two soil management systems at higher soil depths. The accumulation of available P, K and other nutrients in surface soil layers under no-tillage is usually ascribed to the decomposition of organic matter (which is more abundant in no-tilled soils) and to the accumulation of mineral fertilizers in topsoil [69]. The tillage system also influences the relations occurring between plant roots, soil and microorganisms at the rhizosphere level. After long-term no tillage, a higher activity of alkaline phosphatase and acid phosphatase was measured by Balota et al. [70]. These two enzymes, which can be released both by plant roots and soil microorganisms, are involved in the release of labile P from the organic pools. Moreover, the organic acids exuded by plant roots and/or released through organic matter decomposition could compete with P for the binding sites on soil particles, thus enhancing P availability [69].
Finally, PLSR, by considering simultaneously soil indicators and plant response (grain yield, protein content or both), selected as important variables the mineral nutrients (available P, and both total and water extractable N), particularly when grain protein content was used as dependent variable, together with soil physical quality indicators (P MAC , AC, RFC), pH and exchangeable cations. To this regard, VIP statistics, being a weighted sum of the squares of PLS X-score coefficients for the retained components, with the weights calculated from the amount of dependent variable (Y) variance explained by each retained component [33,37], were able to summarize the contribution of both predictors and response variables. Thus, the contribution of mineral nutrients in determining plant response was also highlighted. Results of both PCA and SDA, as well as of univariate analysis of variance, returned TOC and RFC among the most influential variables both on the set of chemical and physical indicators analyzed separately as well as on the whole dataset. Previous studies highlighted the role of RFC among soil indicators in assessing soil physical quality. This variable was able to summarize part of the information given by AC and P MAC and, supported also by BD and plant available water content, showed the highest discriminating capability of the soil and crop residues management strategies compared [15]. TOC selection is important because is a necessary input for soil structural quality indicators [71]. This variable is also indicative of the soil chemical quality, being positively related with soil CEC and nutrient retention capacity. Moreover, TOC can be also indicative of the soil biological quality, due to its key role in maintaining the soil trophic relations and stimulating both the plant and soil microbial activity. Shukla et al. [24], selecting key soil indicators by means of factor analysis, concluded that TOC was the most dominant measured soil attribute as soil quality indicator for the two soil depths investigated and suggested its use for monitoring soil quality changes [24,34].
Overall, the inspection of VIP profiles, together with the results of all methods compared, underlined the role of physical soil indicators (P MAC , AC and RFC), which were selected as important variables in all the analyses performed. This reinforces the idea that such soil capacitive indicators (e.g., RFC) can be suggested for several practical applications including, for example, the estimation of optimal rate of amendments in laboratory, before use in the field.
In the assessment of soil quality modifications as effect of agronomic management, the number and type of the indicators considered are strictly related to the aim of the study and to the spatial scale investigated [72]. Our research was performed at field scale and had a methodological and exploratory aim, focused on comparing the strength and ability of different statistical approaches to extract crucial information from soil data and gain an improved understanding of the system investigated. In any case, the combined approach described in this study can be applied to the analysis of different datasets and conditions, including also different spatial scales.

Conclusions
This methodological study shows the effectiveness of using variable selection methods to summarize the information deriving from multivariate datasets and improving the understanding and interpretability of the system investigated. The results also highlight the importance of simultaneously using different approaches because they may provide different and complementary information. The statistical approaches compared provided different results in terms of variables selection and ranking of the selected variables. The presence of a response variable, in the regressive technique, significantly drove the feature selection process. In particular, the inclusion of yield or protein content, as response variables in PLSR, modified the rank of selected soil indicators, giving greater emphasis to plant nutrients, particularly when the grain protein content was considered. The variables more frequently selected with the combined use of the three methods (TOC, TEC, Olsen P, WEN, RFC, P MAC , AC) were able to provide a clear discrimination between the treatments compared, as shown by the PCA carried out on the reduced dataset. Results finally emphasize the role of multi-year datasets which are invaluable tools to explore benefits and limits of different methodologies and management practices. Funding: The authors would like to thank the EU and MIUR for funding the present methodological contribution, in the frame of the collaborative international consortium DESERT financed under the ERA-NET Cofund WaterWorks2014 Call; this ERA-NET is an integral part of the 2015 Joint Activities developed by the Water Challenges for a Changing World Joint Programme Initiative (Water JPI). The work was also supported by the projects "BIOTILLAGE, Approcci innovativi per il miglioramento delle performances ambientali e produttive dei sistemi cerealicoli no-tillage", financed by PSR-Basilicata 2007-2013 and PRIMA Fundation, call 2019-Section 1-GA n.1912 "Research-based participatory approaches for adopting Conservation Agriculture in the Mediterranean Area-CAMA" project.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.