Teasing Apart Silvopasture System Components Using Machine Learning for Optimization

: Silvopasture systems combine tree and livestock production to minimize market risk and enhance ecological services. Our objective was to explore and develop a method for identifying driving factors linked to productivity in a silvopastoral system using machine learning. A multi-variable approach was used to detect factors that affect system-level output (i.e., plant production (tree and forage), soil factors, and animal response based on grazing preference). Variables from a three-year (2017–2019) grazing study, including forage, tree, soil, and terrain attribute parameters, were analyzed. Hierarchical variable clustering and random forest model selected 10 important variables for each of four major clusters. A stepwise multiple linear regression and regression tree approach was used to predict cattle grazing hours per animal unit (h ha − 1 AU − 1 ) using 40 variables (10 per cluster) selected from 130 total variables. Overall, the variable ranking method selected more weighted variables for systems-level analysis. The regression tree performed better than stepwise linear regression for interpreting factor-level effects on animal grazing preference. Cattle were more likely to graze forage on soils with Cd levels <0.04 mg kg − 1 (126% greater grazing hours per AU), soil Cr <0.098 mg kg − 1 (108%), and a SAGA wetness index of <2.7 (57%). Cattle also preferred grazing (88%) native grasses compared to orchardgrass ( Dactylis glomerata L.). The result shows water ﬂow within the landscape position (wetness index), and associated metals distribution may be used as an indicator of animal grazing preference. Overall, soil nutrient distribution patterns drove grazing response, although animal grazing preference was also inﬂuenced by aboveground (forage and tree), soil, and landscape attributes. Machine learning approaches helped explain pasture use and overall drivers of grazing preference in a multifunctional system.


Introduction
Silvopastoral systems combine agroforestry and pasture/livestock to maximize ecosystem services and mitigate risk by diversifying markets. It is a complex system, with factors such as soil, topography, tree species, and forage species interacting to influence net primary productivity, as well as ecosystem services [1][2][3][4]. Topography, characterized by terrain features, controls the spatial distribution of soil water and associated nutrients, thus affecting the quality and quantity of forage production, as well as grazing preference spatially and temporally. Classical statistical analysis often fails to capture the effect of nonlinear factors and their interaction once the number of variables increases in the model. A multi-variable approach that is not affected by interactions and linearity requirements can help to identify important variables in system-level analyses. Therefore, this study aims to develop a framework for identifying the driving factors affecting net productivity in silvopastoral systems.
The primary focus of statistical analysis is to make population inferences using the samples, while machine learning is used for its predictive capability of algorithms [5]. To better understand factors and their interactions in complex systems such as silvopastures, machine learning may be useful, but it is not widely used in agricultural data analysis. Machine learning automates the process of data classification, clustering, and dimension reduction by allowing data to learn from itself [6]. Machine learning, which is a subset of artificial intelligence [7], allows the selection of a few important subsets of variables, a process known as feature selection [8]. Studies on how selected variables improve the predictability of the machine learning algorithms [9] may help develop more accurate prediction capability of different machine learning classifiers while minimizing the computing time and storage space requirement. Studies on how each of those selected variables improves the system-level statistical inference are, however, a less explored area. With fewer selected variables, exploration of such statistical relationships becomes easier.
Separating machine learning and statistical inferences into two distinct groups is still a debate [10], as the methodology employed in machine learning approaches uses some statistical measure during feature selection. However, with the increased volume, speed, and processing capability of big data, techniques are needed to use those data in inferential statistics. There are some concerns on using machine learning predicted value as observed data for downstream statistical analysis [11], but less a concern on selecting important variables [9]. Once important variables are selected, they can be interpreted based on machine learning output or using some conventional statistical approach. A commonly used approach is to fit a model using linear regression and analyze and interpret the output. The main limitation of linear regression, however, is the assumption of linearity between dependent and independent variables. Additionally, the intercorrelation of independent variables makes multiple linear regression (MLR) output interpretations difficult [12]. Ref [13] discussed that much of the MLR use is inappropriate. In a silvopasture system, ref [14] used a machine learning algorithm called random forest (RF) to build a model between topography and soil nutrient distribution. Topography was characterized by 12 terrain attributes derived from a 1 m digital elevation model (DEM), and nutrient contents were spatially mapped for the study site. The relationship between topography and nutrient distribution is just one component of the complex silvopasture system. The overall productivity of the system depends on topography and additional interacting factors such as tree species, grass species, and animal grazing preference.
The aim of this paper is to use the predictive capability of machine learning for systems-level statistical interpretations. In this paper, we present a novel approach called "variable rankings" to select few important variables using the machine learning RF method. Then, we develop machine learning approach classification and regression tree (CART) and MLR model to predict animal grazing preference using the selected variables and compare the model output with the analysis of variance (ANOVA) for the interpretation. We hypothesize that the machine learning model will have a better cause-effect relationship with the animal grazing preference compared to the linear regression model. Therefore, the objective of this paper is to develop machine learning approaches to select important variables in a silvopasture system and to explore how system components interact to affect animal response in such systems.

Site and Experiment Description
The study was located on a 4.25-hectare silvopasture site (Figure 1)  W). Information on previous site history is described by [15][16][17]. Briefly, soil in most of the experimental area is mapped as Captina silt loam (fine-silty, siliceous, active, mesic Typic Fragiudults), Pickwick silt loam (fine-silty, mixed, semiactive, thermic Typic Paleudults), Johnsburg silt loam (fine-silty, mixed, active, mesic Aquic Fragiudults), and Nixa cherty silt loam (loamy-skeletal, siliceous, active, mesic Glossic Fragiudults; Soil Survey Staff, 2019b). A dissimilar inclusion is also present at this site that is lower in elevation and was not captured in the mapping unit [14]. Sixteen rows of three tree species, Northern red oak (Quercus rubra L.), eastern black walnut (Juglans nigra L.), and pecan (Carya illinoinensis Wangenh. K. Koch), were established in 1999-2000 at 15 m row spacing. The eastern black walnut trees were replaced with three tree species; American sycamore (Platanus occidentalis L.), cottonwood (Populus deltoides W. Bartram ex Marshall), and pitch/loblolly pine (Pinus rigida/Pinus taeda) in 2014. Tree row alleys (4.57 m wide) were planted with two forage species; a cool-season species (orchardgrass (Dactylis glomerata L., var. Tekapo)) in fall 2015 and a native warm-season mix (8:1:1 big bluestem (Andropogon gerardii Vitman), little bluestem (Schizachyrium scoparium (Michx. Nash) and indiangrass (Sorghastrum nutans L.)) in spring of 2016. Poultry litter fertility treatment (fertilized at 84 kg N ha −1 vs. a zero rate (control)) was employed per forage treatment in the spring of 2017, 2018, and 2019. A low elevation area with high soil moisture content within the site was delineated as aquic (wet) treatment while the rest of the area was designated as udic (dry) treatment. The site was grazed by heifers (Bos taurus L.) at 2.2 animal units (AU) (2017 and 2018) and 2.4 AU (2019) per hectare during the summer. In summary, this three-year (2017-2019) silvopastoral grazing study evaluated grass treatments (orchardgrass and a native grass mix), tree species (cottonwood, oak, pecan, pine, and sycamore), soil fertility (poultry litter and a control), and soil moisture regimes (udic and aquic).

Sampling and Processing for Silvopasture Variables
Soil water content (0.33, 1, 3, and 15 bar) and temperature measurements were recorded every 4 h and logged on a Decagon EM50 data logger (METER Group, Pullman, WA) from May to July in 2017-2019. There were 34 sensors total at the site (17 locations and 2 depths). These data were processed to obtain daily average measurement values. Photosynthetically active radiation (PAR) and leaf area index (LAI) were measured every 10 to 15 days using an AccuPAR LP-80 ceptometer (METER, Pullman, WA) throughout the study period. Soil samples (per grass treatment, fertility, and soil moisture regime) were collected in triplicate at the 0 to 15 cm depth each year during the study period (2017-2019). Total C and N were determined using the combustion method, while soil organic matter was determined using the weight loss on ignition (LOI) method. Other soil parameters; soil texture (modified hydrometer method; sand, silt, and clay in percentage), pH and EC (1:10 soil:water extract), soil nutrient concentration (Mehlich-3 method using 1:10 soil:extractant; P, K, Mg, S, Ca, Na, B, Mo, Ni, Al, Se, Fe, As, Cu, Zn, Mn, Pb, Cr, Ti, Cd, and Co) were analyzed on soil samples following appropriate procedures as described in [18,19]. On each plot, soil bulk density was measured using a 4.8 cm diameter core [20].
A grazing exclosure of 4 m 2 area was secured per treatment combination (and replicated thrice) to measure accumulated forage, which was measured by clipping 0.75 m 2 of uncut forage to a 6 cm stubble height four times during the grazing season. Outside the Figure 1. Study site with tree species labels (oak, pine, cottonwood, sycamore, and pecans) and area representing the individual tree polygons. Tree polygons extend southbound to identify grass species treatments (planted in tree alleys) associated with each individual tree. Figure 1. Study site with tree species labels (oak, pine, cottonwood, sycamore, and pecans) and area representing the individual tree polygons. Tree polygons extend southbound to identify grass species treatments (planted in tree alleys) associated with each individual tree.

Sampling and Processing for Silvopasture Variables
Soil water content (0.33, 1, 3, and 15 bar) and temperature measurements were recorded every 4 h and logged on a Decagon EM50 data logger (METER Group, Pullman, WA) from May to July in 2017-2019. There were 34 sensors total at the site (17 locations and 2 depths). These data were processed to obtain daily average measurement values. Photosynthetically active radiation (PAR) and leaf area index (LAI) were measured every 10 to 15 days using an AccuPAR LP-80 ceptometer (METER, Pullman, WA) throughout the study period. Soil samples (per grass treatment, fertility, and soil moisture regime) were collected in triplicate at the 0 to 15 cm depth each year during the study period (2017-2019). Total C and N were determined using the combustion method, while soil organic matter was determined using the weight loss on ignition (LOI) method. Other soil parameters; soil texture (modified hydrometer method; sand, silt, and clay in percentage), pH and EC (1:10 soil:water extract), soil nutrient concentration (Mehlich-3 method using 1:10 soil:extractant; P, K, Mg, S, Ca, Na, B, Mo, Ni, Al, Se, Fe, As, Cu, Zn, Mn, Pb, Cr, Ti, Cd, and Co) were analyzed on soil samples following appropriate procedures as described in [18,19]. On each plot, soil bulk density was measured using a 4.8 cm diameter core [20].
A grazing exclosure of 4 m 2 area was secured per treatment combination (and replicated thrice) to measure accumulated forage, which was measured by clipping 0.75 m 2 of uncut forage to a 6 cm stubble height four times during the grazing season. Outside the exclosure (grazed area), available forage was also measured on each sampling date. Forage nutrient content and quality parameters (C, N, lignin, acid detergent fiber (ADF), neutral detergent fiber (NDF), hemicellulose, total ash, water-soluble carbohydrate, forage nutrients, and metals (P, K, Mg, S, Ca, Na, B, Mo, Ni, Al, Se, Fe, As, Cu, Zn, Mn, Pb, Cr, Ti, Cd, and Co)) were determined in both available and accumulated forage biomass as described in [21,22]. Nutrient removal (N, P, and K removal) by forage (both available and accumulated forage) was calculated based on nutrient content and forage yield measurement.
Tree parameters such as diameter at breast height (DBH; diameter at 137 cm above soil level) and tree height measured for each individual tree during 2017-2019 were averaged for the analysis. Each heifer and its grazing activity were recorded using GPS collars (Model 3300LR, Lotek Wireless Inc., Newmarket, ON) during the summer grazing period (2017-2019). These point data represent a grazing event for each 15 min when animals' heads are down for more than 75% of the time during this interval. Study site terrain attributes; elevation (m), aspect ( • ), flow accumulation (FlowAccum, n), slope-length factor (LSFactor, m); midslope position (MidSlope, index), multi-resolution ridge top flatness index (MRRTF, index), multi-resolution valley bottom flatness index (MRVBF, index), normalized height (NormHt, index), slope percent (SlopePer, %), slope height (SlopeHt, m), system for automated geoscientific analysis wetness index (SAGAWI, index), valley depth (ValleyDep, m), soil depth (soildepth, cm), and altitude above channel network (VDistChn, m) were derived from 1m × 1 m DEM as described in Adhikari et al. (2018). [14] used these 1m × 1 m raster images of terrain parameters to classify four similarly behaving areas within the study sites called topographic functional units (TFU), as well as to help explain soil microbial linkages to production zones [23,24].

Data Preprocessing for Analysis
Individual trees in the study site were assigned a unique ID. Each tree ID was correlated with the treatment information (tree species, forage species, fertility, wetness) based on its location within the study site. Forage grass treatments were planted between the tree alleys; hence non-overlapping tree polygons were created ( Figure 1). Soil and forage parameter values that were collected at each treatment combination level (tree species, forage species, fertility, wetness) were also assigned to each tree ID based on their treatment information. A data set without treatment information but with the geo-referenced location was assigned to each treatment combination after evaluating their position with respect to the tree polygons. Terrain attribute values were extracted per tree polygon using "raster" [25] packages in R [26]. Terrain attributes per tree polygons were averaged for numeric variables, while the most frequent value (mode) was used to extract TFU information per polygon.
For each year, the number of grazing data points were counted per polygon with the collar ID (individual animal) information. Each collar GPS point represents a 15 min Soil Syst. 2021, 5, 41 5 of 16 grazing interval. Collar GPS data recorded outside of the study area and/or with a dilution of precision greater than four was removed in preprocessing, as was all tilt data with tilt percentage less than 70% in the previous five minutes. Thereafter, the total number of grazing events per polygon was converted to total grazing hours per hectare per AU using polygon area and number of animals (collar ID) information. A combined data set (tree, forage, soil, terrain attributes, and grazing events) containing a total of 130 variables was then created. Tree species, forage species, fertility (poultry manure), wetness, and TFU were categorical variables, while the remaining 125 variables were numerical. Silvopasture systems are diverse and complex, where several interacting factors influence the system output. Clustering variables with similar information allows for the identification of important variables to further explore how each of those factors influences the system output. Hierarchical variable clustering (HVC) involves calculating dissimilarity between variables and applying a clustering algorithm iteratively until similar variables group together. The goal is to maximize intra-cluster similarity and minimize inter-cluster similarity. This measure of similarity is defined by the first principal component of the cluster and its correlation/association with the variables within the cluster [27]. Variables are then assigned to those clusters with the highest association values. The algorithm stops once there is no more change in the variable partition to the clusters. Dendrograms visualize the hierarchy of the clusters and help to decide/choose the final number of clusters. The cluster stability is evaluated by bootstrap resampling and then applying the Rand index [28] that penalizes false positive and false negative decisions on the generated clusters.
Variable clustering was performed using the "ClustOfVar" [27] package in R. The approach in this package uses the HVC algorithm for both quantitative and qualitative variables to group them in clusters based on their similarity and association within the cluster. For qualitative variables, correlation ratios of the variable with the first principal component of the cluster were used, while for quantitative variables, squared Pearson correlation coefficients were used. After the clustering algorithm was applied to all 130 variables, a cluster dendrogram was produced. Dendrograms help visualize how each of the variables was grouped and linked together. Four major clusters ( Figure 2) were selected based on visual inspection of the cluster dendrogram [27]. Finally, variables per cluster, as well as the score of each variable loading with the cluster (indicating the strength of association), were generated (Table 1).

Variable of Importance Using Random Forest Model
Random forest [29] is a non-parametric, machine learning approach where ensembles of classification and regression tree predictions are averaged to make the final prediction model [30]. The CART is a recursive binary split method of input space. In the case of regression predictions, CART decides the best split node by minimizing the sum of squared deviations between each response and the mean predicted value for the node. This approach is useful when there are large numbers of explanatory variables because of its capacity to handle nonlinear relationships and interactions among predictor variables. Individual trees are grown by randomly resampling (with replacement) the training data (called bootstrap sample), as well as using the subset of explanatory variables at a time, which is called the random subspace method. Bagging [30] is the process of the aggregating model developed by the bootstrap sample. Out-of-bag (OOB) mean square error (MSE)of the RF model is the mean prediction error calculated from the training data set when each particular observation was not present in the bootstrap sample. The random forest algorithm uses two criteria to assess the importance of the selected variables. First, important variables in the models are ranked by an estimate of the decrease in predictive accuracy when the variable is not included in the prediction model. The second measure is ranking variables by their contribution to node (splitting node) purity by using the random subset of variables. Averaged over all regression trees, a decrease in residual sum of square (RSS) by the variables in question allows measurement of this indicator.
The RF models were developed to predict each of the 125 quantitative variables (response variable, y) separately using the R package "randomForest" [31]. For each of the RF model response variable (y), the remaining 129 variables (130 variables less response variable) were used as explanatory variables.
where y i is the ith response variable and X −i is the matrix of 129 variables other than the ith response variable. Random forest parameters: nTree (number of trees to grow each time) was set to 500, and mTry (number of variables needed at each node/split) was set to 43. Each RF model produced important variables based on increases in MSE and increases in node purity. Variables were ranked highest to lowest based on these two criteria. A column with the average ranking of these two criteria was also developed. Next, the twenty most important variables for each model were selected using: (1) MSE ranking, (2) node purity ranking, and (3) average ranking of MSE and node purity. Each RF model then grouped response variables into one of the four clusters. Finally, for each group (cluster), the 10 most frequently occurring ranked variables were selected (Figures 2 and 3). Variables selected using this new approach, "variable ranking" was compared with the standard hierarchical clustering output. where is the ith response variable and − is the matrix of 129 variables other than the ith response variable. Random forest parameters: nTree (number of trees to grow each time) was set to 500, and mTry (number of variables needed at each node/split) was set to 43.
Each RF model produced important variables based on increases in MSE and increases in node purity. Variables were ranked highest to lowest based on these two criteria. A column with the average ranking of these two criteria was also developed. Next, the twenty most important variables for each model were selected using: (1) MSE ranking, (2) node purity ranking, and (3) average ranking of MSE and node purity. Each RF model then grouped response variables into one of the four clusters. Finally, for each group (cluster), the 10 most frequently occurring ranked variables were selected (Figures 2 and 3). Variables selected using this new approach, "variable ranking" was compared with the standard hierarchical clustering output.  Table 1 for the variable description.  Table 1 for the variable description.

Animal Grazing Preference Modeling Using Variables Selected by RF-Based Variable Ranking Method
Selected variables based on the averaged ranking method were used to predict animal grazing preference. Summer active grazing hour per ha per AU (labeled as "grz_hr_ha") was averaged for all 3 years (2017, 2018, and 2019) for this study. The random forest variable selection method described previously resulted in 10 variables per cluster that were important for system-level analysis. These 40 variables (10 per cluster) were selected to model grazing preference using two approaches: MLR method and the CART method. Outputs from these two methods were compared for the grazing preference interpretation. For MLR, a stepwise regression using both backward and forward selection methods was employed to select important variables using the "MASS" package [32] in R. A final MLR model was thus summarized for each selected variable with its coefficient and variance inflation ratio (VIF) to assess if some of these variables still contained redundant information. For CART, a regression tree for the grazing hour was developed using all 40 variables selected by the average ranking method. The R package "rpart" [33] was used for the CART model. The default parameter controls of the "rpart" package were used to develop grazing hour prediction models, where a minimum number of observations needed to attempt a split node was set to 20, the minimum number of observations in any terminal node/leaf was set to 7, complexity parameter (for additional split, R 2 value must increase by) to 0.01, and 10-fold cross-validation was used. The algorithm develops fully grown trees at first and then prunes the trees based on control parameters for the final model output.

Grouping Variables Together Using Hierarchical Clustering Method
A total of 130 variables were clustered into four broad groups ( Table 1). The majority of variables were in cluster one, along with nearly all terrain features, indicating terrain features are an important factor driving the majority of variability in the silvopasture data set. Tree species and the terrain feature SAGAWI were most strongly correlated with this cluster, along with other terrain features (normalized height, slope percent, slope height, soil depth, MRVBF, and hillshade). Among other strongly correlated variables in this group were tree coverage area as represented by area_m 2 , soil water content (at 1 bar), and volumetric water content. In summary, this cluster represented terrain features, soil water content, and tree species and tree area (coverage area) of the site. The second group (cluster 2) was composed primarily of soil parameters and was strongly correlated with soil metals such as soil Cd, Cr, Pb, Ti, Cu, As, Fe, and Al (Table 1). Soil texture and soil pH were also grouped within the second cluster.
The third and fourth groups were composed of forage parameters. Two sets of forage samples, accumulated biomass (labeled as "bio") and available forage mass (labeled as "avl"), were distinctly separated within these two groups. Forage nutrient content (i.e., N, P, and K) and fertilizer application were strongly correlated in cluster three, while forage structural parameters such as NDF, hemicellulose, lignin, and forage species were correlated with cluster four.

Important Variables Using Random Forest Method
In general, strongly correlated variables for each cluster also appeared as important variables based on RF models ( Figure 3 and Table 1). For example, tree species, SAGAWI, and NormHt variables that were strongly correlated with cluster one also appeared as important variables (repeatedly appeared in RF models) for this cluster. The most strongly related variables per cluster (Table 1), for example, tree species (SPECIES, cluster 1), soil Cd (cluster 2), P removal by accumulated biomass (cluster 3), and NDF content of accumulated biomass (cluster 4), were selected by both node purity and MSE method (Figure 3). However, there were differences in selected variables based on MSE and node purity method. Variable selection using either MSE or node purity resulted in 8/10 same variables for cluster one (tree coverage area, elevation, normalized height, SAGAWI, slope height, soil depth, tree species, and altitude above channel network; Figure 3A), 7/10 same variable for cluster four (available forage Ca, Cu, Mg, and S content, accumulated biomass C, hemicellulose, and NDF content; Figure 3D) and 6/10 same variables for cluster two (soil Cd, Cr, Mn, Ni, Pb, and Zn content; Figure 3B). MSE and node purity method selected only 3/10 same variables in cluster three (N removal by available forage, accumulated biomass Cu content, and P removal by accumulated biomass; Figure 3C). The node purity algorithm selects variables to minimize the sum of squares based on the splitting of each node, while MSE selects variables based on predictive accuracy. To capture the importance of both criteria, an averaged ranking (MSE ranking and node purity ranking) method was also used to select the variables (Figure 3).
Using the average method (average of MSE and node purity ranking), the majority of explanatory variables selected to model response variables in each cluster were from the same cluster. For example, except for soil Al, all other selected variables for cluster 1 ( Figure 3A) were from the same cluster (Table 1). Similarly, except clay and tree species, all other variables selected for cluster 2 were from the same cluster. Tree species appeared in both clusters (cluster 1 and cluster 2, Figure 3A,B), indicating it is one of the important variables to explain several system-level processes and interrelationships. Variables selected in cluster 3 were entirely from the same cluster. For cluster 4, except Cd content in available forage mass, all other variables selected were from the same cluster. Although several of the selected variables were from the same cluster, the RF variable ranking method showed that variables that were weakly correlated with respective clusters (variables showing <0.5 scores in Table 1) were important enough to explain some system-level interrelationships.

Linear Regression-Based Interpretation of Selected Variables for Animal Grazing Preference
Summer grazing hour was highly variable with a mean of 77.7 and standard deviation of 56 h ha −1 AU −1 but indicated grazing preference varied by tree species (Table 2). However, stepwise linear regression was unable to determine the importance of tree parameters (species, area coverage) on grazing pressure. Variables were grouped into terrain attributes, soil parameters, and forage parameters ( Table 2). Based on the linear regression coefficient, grazing hour decreased (negative coefficients) with slope height, soil Ni, soil Cr, soil Mn, and accumulated biomass Cu and P content. Similarly, grazing hour increased (positive coefficients) with SAGA wetness index, normalized height, soil Cd, soil Fe, and N removal, Fe, C, and Ca content of available forage mass.
For this study, based on a digital elevation model and random forest algorithm, SAGAWI, ValleyDep, and SlopeHt were the main contributing terrain attributes ex-plaining soil nutrient spatial distribution and dynamics. Similar important terrain at-tribute variables (SlopeHt, SAGAWI, and NormHt), as well as additional soil and for-age parameters, were important for explaining animal grazing preferences. Note: * indicates coefficients were significantly different from the intercept at p ≤ 0.05.

CART-Based Interpretation of the Selected Variables for Grazing Preference
Soil metals Cd and Cr, and terrain attributes SAGA wetness index and tree coverage area were important variables for identifying factors affecting grazing preference, as selected by the CART method ( Figure 4). Soil Cd appeared decisive to split the whole data set into two groups of animal grazing hour: 70 and 161 h ha −1 AU −1 . Fewer grazing hours (71 h ha −1 AU −1 ) were associated with soil Cd ≥ 0.035 mg kg −1, and the majority of data (92% data) were under this group. The inference drawn from this result and the linear regression model output (Table 2) were different. The positive coefficient of soil Cd in linear regression suggested grazing hour increased with higher soil Cd, while the CART model showed the opposite result. We further explored soil Cd and its relationship with forage parameters (both accumulated forage biomass and available forage, data not shown), and results suggested that forage yield, NDF, and ADF decreased with increasing soil Cd contents at this site. Soil Cd itself was influenced by experimental factors (tree species, fertilizer application, wetness, and grass treatments) and was usually higher in the aquic (wet), un-fertilized areas where orchardgrass was grown (Table 3). Therefore, animals preferred grazing native grasses (53.9 vs. 101 h ha −1 AU −1 ) and udic (dry) areas (51.8 vs. 103.2 h ha −1 AU −1 ), where soil Cd content was significantly lower (Table 3).
Terrain attribute SAGA wetness index was another important variable selected by the CART method to explain animal grazing preference (Figure 4). This variable appeared repetitively on the CART node to delineate animal grazing preference. Higher SAGAWI values infer pixels where water would accumulate following a rainfall event indicating lower elevation area within the study site with comparatively more wetness. In general, a higher value (more wetness) was associated with fewer animal grazing hours. On average, sites with the SAGAWI ≥2.7 showed 138 h ha −1 AU −1 grazing hour compared to 217 h ha −1 AU −1 when SAGAWI was less than 2.7. Again, compared to a linear regression (Table 2) approach where SAGAWI showed a positive coefficient value, the CART interpretations were different and more realistic.
the CART method to explain animal grazing preference (Figure 4). This variable appeared repetitively on the CART node to delineate animal grazing preference. Higher SAGAWI values infer pixels where water would accumulate following a rainfall event indicating lower elevation area within the study site with comparatively more wetness. In general, a higher value (more wetness) was associated with fewer animal grazing hours. On average, sites with the SAGAWI ≥2.7 showed 138 h ha −1 AU −1 grazing hour compared to 217 h ha −1 AU −1 when SAGAWI was less than 2.7. Again, compared to a linear regression (Table 2) approach where SAGAWI showed a positive coefficient value, the CART interpretations were different and more realistic.  Table 1 for the variable description. Soil Cr was another statistically significant variable selected by CART and stepwise linear regression methods. Similar to soil Cd, higher soil Cr (≥0.09 mg kg −1 ) was associated with fewer animal grazing hours (Figure 4; 38 vs. 79 h ha −1 AU −1 ). The linear regression ( Table 2) method also showed a negative slope coefficient for the soil Cr to predict animal grazing hours. Similar to soil Cd, Cr was higher in wet, un-fertilized, and areas where orchardgrass was grown ( Table 3).
The tree coverage area represents the areal area coverage of each individual tree within the study site. Higher area coverage represents larger trees compared to lower area coverage. Area coverage appeared in the CART approach as an important variable ( Figure 4); however, it did not appear in the stepwise linear regression method (Table 2). Animals preferred grazing (160 vs. 75 h ha −1 AU −1 ) where trees were larger (tree coverage 140 m 2 or more). Table 3 shows that tree coverage was not affected by any factor other than tree species. Larger trees (greater diameter at breast height) within the sites were pecan (132 m 2 ), and animals preferred grazing (103 h ha −1 AU −1 ) on these sites. However, results also indicated that smaller trees did not always result in fewer grazing hours. Pine trees, for example, were the smallest trees on the study site (28.8 m 2 tree coverage), and animals preferred grazing there compared to sycamore (Table 3; 80.8 vs. 58.5 h ha −1 AU −1 ). Other factors that appeared in the CART approach, such as silt content, soil depth, slope height, and hemicellulose, were determinants for further variation in animal grazing preference.

Discussion
The authors used a variable ranking method to select important variables for the system-level analysis. Compared to cluster analysis, some of the selected variables (by variable ranking method) came from different cluster groups. Although not all, most of the strongly correlated variables (Table 1; variables with higher loading score) were selected by our variable ranking approach (Figure 3). This result helped to conclude that the new approach was comparable to HVC for variable selection and helpful to reduce a large number of variables into a few important ones for system-level analysis.
Selected variables were used to develop a grazing preference model using a machine learning approach CART and MLR. As discussed by [12,13], the limitation of MLR due to linearity assumption and autocorrelation (correlation between independent variable) was evident. We used variables with VIF <10 to select less correlated variables during MLR (Table 2), still the interpretation of MLR output was different from the machine learning approach (CART), probably due to linearity limitation. For example, with higher soil Cd content and SAGAWI, animal grazing hours increased (Table 2) in MLR while it decreased in CART (Figure 4). The interaction of SAGAWI with other variables was evident on the CART model, as it appeared repeatedly with different coefficient values ( Figure 4) on different nodes. Factor-level analysis and comparison of model coefficient (MLR and CART) with ANOVA showed that CART performed better than MLR for explaining the relationship of selected variables with the response variable grazing preference.
Animal grazing was linked primarily with soil Cd, soil Cr, tree coverage area (areal coverage), and the terrain feature SAGAWI. Cattle preferred grazing locations with lower SAGAWI (drier areas), native forages, and lower soil Cd and Cr contents. A comparison with ANOVA results ( Table 3) also confirmed that these drier sites (with lower SAGAWI, soil Cd, and soil Cr contents) had greater animal grazing hours.
In summary, cattle grazing preference is influenced by the interaction of plant, animal, and environmental factors [34]. Cattle respond to these factors by selecting plants or plant parts with the most desirable nutritive value [35]. We found soil metals Cd, Cr, and terrain attributes SAGA wetness index to be the most important factors influencing animal grazing preference in our study. [14] reported SAGAWI as one of the important terrain attributes determining soil nutrient distribution on this site. Cattle grazing preference and its relationship with the lower metal content of the site is related to the water movement due to landscape position, as indicated by the SAGA wetness index. Greater soil metals (Cd and Cr) were associated with lower elevation areas within the site. Several studies report landscape hydrology as a function of terrain attributes [36][37][38][39] that ultimately drives the nutrient and metal distribution. Soil Cd increased with SAGAWI (regression coefficient, b = 0.009) and soil depth (b = 0.001), while it decreased with the elevation (b = −0.003) within the site (DNS). Soil Cr showed a similar relationship with SAGAWI, soil depth, and elevation, indicating the occurrence of these metals was likely more of a function of terrain attributes and concentration of these metals were higher on lower elevation and depressional area. [40] also reported that higher soil Cd was associated with lower elevation topographic areas within a Northern Great Plains study site. Forage yield (both accumulated biomass and available forage mass) decreased with increasing soil Cd and Cr content in our study (DNS); hence, lower animal grazing hours are expected with less forage availability on these depression areas.

Conclusions
Silvopastoral system components, including grass and tree species, and their interaction with terrain-associated factors require a multi-variable approach to understand system-level functionality. We hypothesize that the machine learning approach will help to improve both variable selection and factor-level interpretation for a complex system with multiple interacting variables. A three-year (2017-2019) silvopastoral grazing study was used to evaluate a novel approach called "variable rankings" to select most important variables using the machine learning RF method. Results showed that this new approach was comparable to HVC for variable selection and was helpful in reducing a large number of variables into a few important ones for system-level analysis. Selected variables were then used to develop an animal grazing preference model using machine learning (CART) and MLR approach. Machine learning model CART showed a better cause-effect relationship with the animal grazing preference compared to the MLR. Based on the CART result, animal grazing was linked primarily by soil Cd, soil Cr, tree coverage area, and the terrain feature SAGAWI. Cattle preferred grazing locations with lower SAGAWI (drier areas), native forage mixture, and lower soil Cd and Cr contents. A comparison with ANOVA results also confirmed that these drier sites (with lower SAGAWI, soil Cd, and soil Cr contents) had greater animal grazing hours. A machine learning-based variable ranking approach used in this study helped to subset important variables that can be used for statistical inference. In conclusion, animal grazing preference was influenced by tree species (larger pecan trees preferred), grass treatments (native grass preferred), soil, and landscape attributes, and may help explain pasture use. Machine learning is a viable approach for identifying systems-level drivers of a complex silvopasture system.

Data Availability Statement:
The data presented in this study are available on request. Acknowledgments: Trade names or commercial products mentioned in this article are solely for the purpose of providing specific information and do not infer either recommendation or endorsement by the U.S. Department of Agriculture. Field management by Robert Rhein with the University of Arkansas Animal Science Department and Taylor Adams with the USDA-ARS is gratefully acknowledged.