Multi-Factor Diagnostic and Recommendation System for Boron in Neutral and Acidic Soils

: Despite its inconveniences, the most recognized method to extract boron from soils is that of hot water extraction (B HW ), which is used for diagnostics and recommendations. However, the Mehlich-3 (M3) method is widely used to extract and diagnose several elements at once (P, K, Ca, Mg, Al, B, Cu, Zn, Fe, and Mn) and is well adapted to routine analyses. The objective of our study was to develop a soil diagnostic and recommendation system for boron as a function of measured B M3 (and other interacting elements), crop type, and spreading methods. This system is based on three databases from either the international literature or the chemical characterization of acidic-to-neutral soils typical from Qu é bec (Canada). The ﬁrst database came from the characterization of 365 samples typical of Qu é bec soils; it has been used to predict, by the AutoML (Automatic Machine Learnig) supervised learning algorithm, B M3 as a function of a set of parameters from the following: B HW , pH W , organic carbon (OC), Ca M3, K M3 , and Mg M3 . Depending on the parameters used, the R 2 between the measured and observed B M3 varied from 0.36 to 0.99. This database allowed us to deﬁne two classiﬁcations for soil boron diagnostics and fertility evaluation. The Cate–Nelson analysis for these two models allowed us to deﬁne three boron fertility classes: Low, medium and high; that is 0.00–0.23, 0.23–0.58, and 0.58–3.70 mg B kg − 1 , respectively, for B HW , and 0.00–0.65, 0.65–1.03, and 1.03–12.70 mg B kg − 1 , respectively, for B M3 . The third database was extracted from 130 yield responses to increasing levels of boron; it was used to deﬁne a recommendation model for boron, based on AutoML, as a function of B M3 , pH W , the crop boron requirement (medium, high), and the type of spreading (broadcast, sidedress, foliar spraying). This model resulted in an R 2 of 0.63.


Introduction
Fertile soils allow for root development and provide for the plant extraction of water and nutrients [1]. It is therefore important to evaluate agricultural soil potential using analytical tests for soil diagnostics. Despite the fact that the extraction of trace elements by plants is low, these elements are limiting for crop yields [2]. The characterizing and dosing of trace elements depend on analytical artefacts, since a change in process or a low contamination may greatly affect results [3]. In Québec, fertilization grids were mainly developed for major nutrients [4]. Soil chemical fertility diagnostics for most nutrients (P, K, Ca, Mg, Cu, Zn and Mn) are done after a Mehlich-3 solution extraction. However, some trace elements, such as boron, are not on this list since there are no calibrations with yield. For boron, most calibrations with crop yield are based on hot water extraction (B HW ). Until recently, the B HW method was widely used, since it is efficient in detecting and analyzing low soil boron concentration by colorimetry [5]. For alfalfa, Gestring and Soltanpour [6] compared three boron extraction methods (saturation, ammonium bicarbonate-DTPA, and hot water), and found that hot water extraction was the most related to yield [6]. However, cooling time and additional manipulations

Materials and Methods
To obtain boron fertilizer recommendations from B M3 and other soil chemical properties, the method has been divided into three phases ( Figure 1); that is, to develop: (i) Algorithms predicting B M3 (database 1), (ii) agronomical response models to both crop fertilizer and soil boron content (database 2), and (iii) algorithms to predict recommended boron rates in case of deficiency (database 3).

Algorithms for Prediction of Boron Mehlich-3 (B M3 )
To develop these algorithms, we used a database of 365 soil samples (database 1). These samples were taken from about 80 Québec farms dedicated to various productions: Forage crops, field crops, vegetables, and forestry (Table 1).  [23]. Soil samples were air-dried and sieved with 2 mm sieves. Soil pH was determined with a 1:1 soil/solution (vol/vol). Organic carbon (OC) was determined by the Walkley-Black [24] method, and soil texture was determined by the method of Day [25].
The estimation of B M3 was performed by supervised learning using the h2o's AutoML [28] algorithm within the R software (Version 3.0.2) [29]. This algorithm allowed us to predict B M3 from B HW and a combination of the other predicting variables: pH W, OC , K M3, Ca M3, and Mg M3 . The partial dependence of predicted B M3 on each predicting variable was determined with h2o's varimp function, and the correlation was determined between predicted and laboratory measured B M3 . Additional statistical analyses were done with Excel to determine the means, ranges, and standard deviations of the main soil sample parameters.

Boron Agronomical Models
Database 2 contains a total of 672 values of B HW and yield (Y 0 : Control yield without boron fertilization; Y max : Maximal yield of the boron fertilization treatment). These data come from 32 published studies (Table 2) and involve about 40 crop types. To define an agronomic model based on B M3 , we used the following predictive variables: pH W , OC, K exchangeable , Ca exchangeable and Mg exchangeable , which were all, or partially, available from the literature. This data came under two forms: Tables and graphs. Data from tables were taken directly, whereas graphical data were converted to numeric values using the Data Thief software [30].  From these 32 studies, we only kept those with B HW and those with a soil pH W from neutral to acidic. From the data of these studies, the crop nominal yield for the control was expressed as a relative yield with respect to the maximal yield.
The critical agronomic thresholds of B HW or B M3 are those for which a significant yield increase will not occur with an additional amount of boron fertilizer [61]. This threshold was determined iteratively according to the Cate-Nelson method [62] using relative yields from all experiments with B HW or B M3 . The Cate-Nelson procedure used identifies critical thresholds with an iterative method that maximizes the sum of squares on the boron axis [63] (B HW or B M3 ) while minimizing, on the yield axis, the number of points in the error quadrants [64]. Two models (RY = function of B HW and RY = function of B M3 ) were developed using the Cate-Nelson procedure of the rcompanion package [65] with the R software. For the model RY = function of B M3 , an additional step was required because all the predictive variables (B HW , pH W , OC, K exchangeable , Ca exchangeable and Mg exchangeable ) were not available for all the studies.

Boron Recommandation Algorithms
Database 3 contains 130 curves of yield response to different application rates of boron. This data come from 25 published studies (Table 3). Each of these curves were examined to determine the boron application rate corresponding to the maximum yield, Rate Maximum yield . To this Rate Maximum yield we associated (i) the crop type corresponding to one of the two categories of boron requirements (medium, high) as recommended by CRAAQ [4] (Centre de Référence en Agriculture et Agroalimentaire du Québec); (ii) the application type of the boron fertilizer (broadcast, sidedress, field application); and (iii) the soil variables B HW , pH W , OC, K, Ca, Mg. Prediction of B M3 was done using AutoML algorithms A2, A3 and A4 ( Figure 1). The prediction of Rate Maximum yield was performed by algorithm AutoML A5 using B M3 , estimated by A2, A3 or A4, crop type, application mode, and soil pH W .

Algorithms for Prediction of Boron Mehlich-3 (B M3 )
The results used by our study show a wide range of soil properties (Table 1). Soil texture varies from coarse to fine with a pH from 2.3 to 7.4. Soil boron content ranges from 0.02 to 3.68 mg B kg −1 for B HW and from 0.08 to 12.7 mg B kg −1 for B M3 . This shows the Mehlich-3 method extracts more boron than hot water, which was also a conclusion of Zbíral and Němec [9] and Shuman et al. [10]. For Québec acidic sandy soils and loamy soils, Tran et al. [13] found that, on average, B M3 is 31% more than B HW . Figure 2 shows simple linear correlations between B M3 and B HW , as well as between predicted and measured B M3 .
The model of Figure 2a shows a very low correlation between B HW and B M3 , with only 36% of the B M3 variation being explained by the model. Low coefficients of determination (R 2 = 0.29; R 2 = 0.45) between B M3 and B HW were also observed by Walworth et al. [75]. With such low correlations, such a model cannot be used to transpose B HW values from the literature to B M3 values in order to develop recommendation models or to diagnose soil boron status. Figure 2b shows the prediction quality criteria of B M3 using the linear regression from Figure 2a. Compared to a perfect prediction of B M3 from B HW , i.e., a slope (m) of 1.0 and an intercept (a) of 0.0, we found that m = 0.63 and that a = 0.36 mg B kg −1 , a value so high that it is close to the average of all 365 samples analyzed (0.40 mg B kg −1 ). Such a weak model does not allow for a reliable conversion from B HW to B M3 . Therefore, instead of simple linear regression, we used the supervised learning algorithm AutoML (Table 4).  The A1 AutoML algorithm predicted B M3 from B HW with an R 2 of 0.63 compared to 0.36 for the linear regression, but intercepts and slopes were similar. Adding predictive variables related to the B M3 extraction improved the predictive capacity ( Table 4). The algorithm A4 performed better with an R 2 and a slope close to one, as well as an intercept close to 0 (Figure 3a).
To improve the conversion from B HW to B M3 , we included in AutoML A4 five soil parameters that are routinely analyzed by soil laboratories. With all these parameters, a multiple linear regression gives R 2 = 0.65, a = 0.23 and m = 0.65 (Figure 3i), which results in a deviation of 35% (1-m) between predicted and measured B M3 . The improvement in B M3 prediction when going from simple regression (Figure 2b) to multiple regression ( Figure 3i) and to AutoML shows the potential of using supervised learning algorithm A4. The A4 AutoML algorithm shows only a slight average deviation of 3%, which is much less than the value of 20% for inter-laboratory error allowed by the CEAEQ [76] (Centre d'expertise en analyse environnementale du Québec) and an intercept close to the detection limit of 0.01 mg kg −1 [77]. It is to be noted that all slopes, either from regression or supervised learning, are less than 1.0, which implies mainly an under-estimation of B M3 .
Algorithms A3 and A4 predicted B M3 with deviations (1-m) less than the 20% error allowed by CEAEQ [76]. These two algorithms allow for a better prediction of B M3 by using several soil variables from which depend the soil boron status. Recent research used machine learning to obtain reliable predictions [78][79][80]. Since in Québec routine soil determination includes elements extractible by Mehlich-3, the pH, and soil organic matter [79], algorithm A4 is the most appropriate to convert B HW to B M3 . The scaled (0-1) importance of the variables used to predict B M3 by A4 is shown in Figure 3b; they are, in order of importance: Ca M3 > pH W > B HW > Mg M3 > OC > K M3 . The influence intervals of each of these predictors are shown in Figure 3c-h. Generally, the value of predicted B M3 increased, up to a limit, with the values of the predictors. Contrarily to what was expected, B HW is not the most important variable in predicting B M3 . The two most important variables are Ca M3 and pH W , which is explained by the composition of the Mehlich-3 solution. In comparison with the official method of ammonium acetate pH 7, the Mehlich-3 pH 3 method extracts approximately the same levels of K and Mg. However, Mehlich-3 extracts more calcium than ammonium acetate, i.e., 17% [80] or 28% [26]. This additional extraction could influence the amount of boron moving from the soil to the Mehlich-3 solution. Besides, this is the most plausible reason to explain that B M3 is higher than B HW [13]. Mehlich [26] included, in his extraction solution, ammonium fluoride (NH 4 F) which induces an interaction between soil pH and that of the solution. Mehlich [26] concluded that soils with a high pH are at risk of calcium precipitation in the form of CaF 2 , which makes the Mehlich-3 solution not relevant for alkaline soils (pH W ≥ 7.5). This is seen in Figure 3c, where the B M3 is dependent of the exchangeable calcium up to a concentration of about 13,000 mg Ca M3 kg −1 . A higher pH W results in less exchangeable calcium and in an increase of anionic boron in the form of B(OH) 4 − by the hydrolytic effect [81]. Cancela et al. [82] found that the Mehlich-3 extraction is more reliable for soils with a low calcium content and which are more acidic. In high pH soils, Tran et al. [13] measured B M3 values 2-5.2 more than B HW ; in addition, the correlation between these two variables was almost zero. Moreover, in Figure 3d, we can observe that B M3 is more affected by pH values superior to 6. For organic carbon, K M3 and Mg M3 (Figure 3f-g) their predictive influences on B M3 are within the ranges usually found in Québec mineral soils, i.e., 0%-10% for OC, 0-500 mg B kg −1 for K M3 , and 0-1800 mg B kg −1 for Mg M3 .

Boron Agronomic Models
Two models were developed: One based on B HW (Figure 4) and another based on predicted B M3 ( Figure 5). The Cate-Nelson procedure applied to these two models allowed for the definition of two critical thresholds: 0.23 and 0.58 mg B HW kg −1 (Figure 4c) and 0.65 and 1.03 mg B M3 kg −1 (Figure 5c). These thresholds correspond to high points of the curve between the sum of squares and the variable concerned [62]. These critical values allowed us to partition soil boron into three fertility classes for diagnostics: Low, medium and high. For B HW , these three classes correspond, respectively, to 0-0.23, 0.23-0.58, and 0.58-3.70 mg B HW kg −1 , whereas for B M3 they are 0-0.65, 0.65-1.03, and 1.03-12.70 mg B M3 kg −1 . The high probabilities of economical response for crops to boron fertilizers are in the true positive quadrants (TP) of low boron contents, i.e., less than 0.23 mg B HW kg −1 or less than 0.65 mg B M3 kg −1 , which correspond to relative yield from 20 to 85% and from 20 to 95%, respectively. The high limits of 85% and 95% for critical relative yield were obtained by minimizing the number of points in the error quadrants: false negative(FN) + false positive (FP) of Figures 4b and 5b for B HW and B M3 , respectively. Contrarily, stable yields are observed in the true negative quadrants (TN) corresponding to low probabilities that yields respond to boron fertilization. In these TN quadrants, relative yield values reach a stability [64] ranging from 85% to 100% for B HW and from 95% to 100% for B M3 (Figures 4a and 5a). Error quadrants (FP) correspond to high relative yields despite pertaining to a low boron fertility class, whereas FN quadrants correspond to low relative yield while being in a high boron fertility class. For Figures 4 and 5, we obtained a robustness R 2 of 82.6 and 69.1%, respectively, corresponding to the probability to make a correct diagnosis for soil boron. For both cases, model robustness was high. These R 2 values are defined as the ratio the number of points in quadrants TP and TN to the total number of points. The specificity TN/(TN + FP) represents the probability to make the correct decision (do not fertilize) with respect to all observations with a yield stability, i.e., a relative yield >85% or >95%, respectively, for the models of   These values of 83.8% and 65.1% represent the probability that high relative yield be associated to soil with a boron content higher to the agronomical critical threshold of 0.23 mg kg −1 for B HW and 0.65 mg kg −1 for B M3 . The sensitivity [TP/(TP + FN)] represents the probability to make a good decision to fertilize with boron with respect to all observations with less relative yields, i.e., a relative yield <85% for the models in Figure 4 or <95% for the models in Figure 5. These sensitivities were 78.7% for the B HW model and 73.5% for the B M3 model. In these cases, the sensitivity is the probability that lower yields occur for soils with a boron content less than the critical agronomical threshold of 0.  FN)] are the probability that crops do not respond to boron fertilization when the soil boron content is more than the critical agronomical threshold. These NPVs are 92.9% for the B HW model ( Figure 4) and 73.0% for the B M3 model ( Figure 5). As can be seen from Figures 4 and 5, the distribution patterns of the points are similar, with small statistical differences from the Cate-Nelson partitioning. The B HW agronomical model shows better statistics (robustness, specificity, sensitivity, PPV, NPV) than the B M3 agronomical model. The better performance of the B HW model is explained by the fact that it is developed from measured B HW values, whereas the B M3 model is based on the results of four algorithms with a predictive power ranging from 65% to 99%. Moreover, the critical relative yield value of 95% for the B M3 model is more limiting than the value of 85% for the B HW model; this causes the B M3 model having more points in the error quadrants.
Since the B M3 method extracts more boron from the soil, the critical agronomical thresholds are higher for B M3 than for B HW . Besides, Datta et al. [83] found, for a same culture and for identical experimental conditions, different critical thresholds with four boron extraction methods. The two critical thresholds that we found (0.65 and 1.03 mg B M3 kg −1 ) are located in the median part of the interval [0.1 to 2.0 mg B HW kg −1 ] [84], below which soil boron fertility is classified as low to medium [85].

Boron Recommendation Model
Usually, a recommendation model for a given element is based solely on the soil content of that element, which implies that yield depends on that element only and that any other fertility indicators or fertilization practices do not have any influence. However, Simard et al. [86] proposed considering both pH W and B M3 for boron recommendations. Additionally, Gupta and Cutcliffe [74] showed the effect of boron spreading type on crop yield. It is also known that boron requirements vary from crop to crop [4]. Therefore, the AutoML A5 algorithm, combining all these factors to predict the boron dose, gave good results, as shown in Figure 6. This figure shows a correlation between observed and predicted Rate Maximum yield , giving an R 2 of 0.63, an intercept of 0.6 kg B ha −1 , and a slope of 0.7. Therefore, 63% of the Rate Maximum yield variance is explained by the model. Usually, recommendation models presented in the literature have very low R 2 values. For a same soil analysis, these models present so much variation that researchers use averages or mathematical expectations to propose recommended rates [87]. The A5 algorithm underestimates Rate Maximum yield , as it shows a mean deviation of 30% between observed and predicted Rate Maximum yield. In addition to the main variable, B M3 , the inclusion of three other predicting variables (pH W , application type and crop boron requirements) in the A5 algorithm significantly improves the boron rate prediction. This improvement is explained by the interaction between B M3 and the three other variables. As shown in Figure 6b, pH W is the second most important predicting variable after B M3 . In Figure 6c, the least influencing interval of B M3 is located between 0.65 and 1.03 mg B M3 kg −1 , which coincides with the critical agronomical values of the model of Figure 5a. This interval is between the region of boron deficiency (B M3 < 0.65 mg kg −1 ) and that of boron sufficiency (B M3 > 1.03 mg kg −1 ). It is within this intermediary interval that we find the most estimation error on the boron recommended dose. Besides, in the proposed agronomical model (Figure 5), the points in the error quadrant FN are mostly concentrated in this interval. Therefore, the boron recommendation is relatively less reliable within that interval than outside of it. Figure 6d shows that for a pH W above 6.0, the required boron amount to reach maximum yield is increased.
This figure also shows that for a range of 3 pH W units (from 4.5 to 7.5), the interval of recommended application rate is from 1 to 2.5 kg ha −1 . This is in concordance with findings of Peterson and Newman [88], who recommend a boron rate of 2.5 higher for a soil of pH W of 7.4 compared to another with a pH W of 4.7. These observations show the fixation of boron added to the soil increase considerably when the soil acidity is neutralized, which implies the application of higher boron rates. While predicting variables B M3 and pH W are numerical, the two others (spreading type and crop boron requirement) are categorical; however, they also have an effect on boron recommended rates. The 130 crop response curves analyzed (Database 3) show that Rate Maximum yield decreases for the following order of application type: Broadcast > sidedress > foliar application. Such a difference due to the application method is explained by the fact that fertilizer losses are greater for broadcasting than for sidedress and for foliar application [89]. With the proposed A5 multi-variable model, boron fertilizer applications can be done with more precision. The precision of this A5 model could be further increased as more data become available.

Summary and Conclusions
In this study, we used three databases to develop a multi-variable system for soil boron diagnostics and recommendation. Database 1 is composed of 365 soil samples representative of the various Québec agricultural regions. This database was used to develop an AutoML algorithm (A4) which successfully converts B HW to B M3 using five covariates (pH W , OC, Ca M3 , Mg M3 and K M3 ). This A4 algorithm allows for a robust prediction of 99% (with only a slight 3% average deviation) and an intercept so small that it is in the same order of magnitude as the detection limit of 0.01 mg kg −1 of B M3 . Database 2 is composed of 672 observations coming completely from published literature and was used to develop two agronomical models, one for B HW and one for B M3 . These two models showed critical thresholds defining the intervals of medium deficiency, that is 0.23 and 0.58 mg kg −1 for B HW and 0.65 and 1.03 mg kg −1 for B M3 . These two intervals discriminate soils with a risk of boron deficiency from those with high potential of boron sufficiency. Database 3 include 130 crop yield response to various boron rates; all data were extracted from the literature. These response curves allowed us to define 130 boron rates corresponding to maximal yields (Rate Maximum yield ). An AutoML supervised learning algorithm, A5, allowed us to relate Rate Maximum yield to B M3 , pH W, fertilizer application type, and boron crop requirement category. With an R 2 of 63%, the performance of algorithm A5 is much greater than those of other recommendation models based solely on soil reserves. Results from this study show the superiority of supervised learning compared to traditional prediction methods, despite the heterogeneity of sources and data. It therefore becomes possible to include boron in the common extractant of Mehlich-3 to diagnose and recommend boron fertilization, especially because this method is well suited to the neutral to acidic soils in Québec.
Author Contributions: D.R., K.L. and G.J. planned and designed the research; D.R. carried out the experiments; D.R. and K.L. wrote the manuscript; J.C. was involved in structuring the research.
Funding: This research did not receive any external funding.