Adaboost-Based Machine Learning Improved the Modeling Robust and Estimation Accuracy of Pear Leaf Nitrogen Concentration by In-Field VIS-NIR Spectroscopy
Sensors 2021, 21, 6260

Different cultivars of pear trees are often planted in one orchard to enhance yield because pear exhibits gametophytic self-incompatibility. Therefore, an accurate and robust modelling method is needed for the non-destructive determination of leaf nitrogen (N) concentration in pear orchards with mixed cultivars. This study proposes a new technique based on in-field visible-near infrared (VIS-NIR) spectroscopy and the Adaboost algorithm initiated with machine learning methods. The performance was evaluated by estimating leaf N concentration for a total of 1285 samples from different cultivars, growth regions, and tree ages, and was compared with traditional techniques, including vegetation indices, partial least squares regression, single support vector regression (SVR), and neural networks (NN). The results demonstrated that the response of leaf reflectance to leaf N concentration was more sensitive to cultivar than to growing region or tree age. Moreover, AdaBoost.RT-BP had the best accuracy in both the training (R² = 0.96, root mean squared error (RMSE) = 1.03 g kg⁻¹) and the test datasets (R² = 0.91, RMSE = 1.29 g kg⁻¹), and was the most robust in repeated experiments. This study provides new insight into monitoring the status of pear trees by in-field VIS-NIR spectroscopy for better N management in heterogeneous pear orchards.


Introduction
Pear has been cultivated in China for at least 3000 years [1,2]. It is currently grown over an area of 1.12 × 10⁶ ha [3]. China's pear production (1.87 × 10⁷ tons) represents 75 percent of the world's total yield [4]. However, over-fertilization with nitrogen (N) and phosphorus, common in pear orchards of North China [5][6][7], has led to low N use efficiency and severe environmental degradation, resulting in accelerated soil acidification, salinization, and water quality impairment [8][9][10]. For steady growth and increased fruit production, the N status of pear trees must be known in a timely manner, so that orchardists can provide the correct amount of N fertilizer, optimize N use efficiency, and avoid N losses [11,12]. Whereas chemical tissue testing for leaf N determination is costly and labor-intensive, the recent development and improvement of spectroscopy techniques provide a rapid, non-destructive method for linking leaf N concentration and spectral signatures [13][14][15][16][17].
There are two broad approaches to modelling hyperspectral data sets: physically based and empirically based [18,19]. Recently, both types of leaf N retrieval methods have expanded into subcategories and combinations thereof [20], which can be classified into five methods (adapted from Berger et al., 2020). In general, a parametric regression method is defined on narrow spectral bands and is then linked to the leaf N concentration through a fitting function [21]; it focuses on the visible and near-infrared spectral domains (400-900 nm). In addition, the use of physically based inversion of radiative transfer models (RTMs), nonlinear nonparametric regression methods of machine learning, and hybrid techniques is increasing. As an example of a physically based RTM, Li et al. (2018) modified the PROSPECT model into N-PROSPECT by replacing the specific absorption coefficients of leaf chlorophylls with those of leaf N concentration, which succeeded in retrieving both leaf and canopy N status [22]. However, this approach was limited when applied across different cultivars. The spectra observed under various field conditions were influenced by extraneous factors, including both leaf structure and the environmental conditions of the workplace [23]. Our previous study compared two methods for modelling the N concentration of pear leaves: the linear nonparametric method of partial least squares regression (PLSR), R² = 0.85, and the parametric regression method of difference vegetation indices (DVI), R² = 0.46.
PLSR is reported to be an effective method for dealing with the high collinearity of near-infrared reflectance spectra [18,24,25]. However, leaf N concentrations and spectra collected in-field from pear trees may vary significantly among cultivars grown in different regions. As a member of the Rosaceae family, pear presents typical gametophytic self-incompatibility. Therefore, different cultivars of pear trees are often cultivated in one orchard to enhance yield and quality [26]. The response of leaf reflectance to leaf N concentration in different pear cultivars has not yet been characterized. Machine learning regression algorithms (nonparametric regression approaches) apply nonlinear transformations to capture the nonlinear relationships between mixed spectroscopic data and target variables [27]. Support vector regression (SVR) and neural networks (NN) are two of the most widely used nonlinear nonparametric methods for estimating foliage biophysical parameters [28][29][30][31][32]. The Adaptive Boosting (Adaboost) algorithm, proposed by Freund (1997), is one of the most successful recognition algorithms in the field of machine learning. It assumes that a combination of weak learners can be "boosted" into an accurate strong learner: it creates a set of weak learners by maintaining a collection of weights over the training data and adaptively adjusting them after each weak learning cycle [33]. Recent research has demonstrated that Adaboost-based machine learning can achieve high accuracy in modelling multi-class imbalanced data compared with regular back-propagation neural networks or convolutional neural networks [34,35]. Owing to its excellent classification performance, Adaboost has been applied in ensemble learning for image recognition, fruit biochemical parameter estimation, and complex change prediction modelling [36][37][38][39].
Based on our previous studies, the objectives of this paper were twofold: (1) to evaluate the effect and relationship of different cultivars, growth regions, and tree ages on pear leaf reflectance; and (2) to apply a highly accurate and robust mixed algorithm for estimating leaf N concentration of different cultivars, growth regions and tree ages in pear orchards.

Study Area
The study was conducted in intensive pear production orchards in four main growing regions in the east, north, southwest, and northwest of China. The location, climate, soil, physical and chemical characteristics, tree age, and yield of the sampled orchards are detailed in Table 1. Pear leaves were sampled in different orchards from five cultivars: 'Kotobuki shinsui' (Pyrus pyrifolia Nakai), 'Huangguan' (P. bretschneideri Rehd.), 'Yali' (P. bretschneideri), 'Yuanhuang' (P. communis), and 'Cuiguan' (P. pyrifolia). Climatic differences (temperature and precipitation) among the eastern, western, northern, and southern regions lead to differences in cultivars and maturity time. For example, Kotobuki shinsui is mainly cultivated in the southern areas because of its relatively large precipitation demand. Huangguan is widely cultivated in mainland China, but its maturity time depends on the effective accumulated temperature. In Gansu Province, Huangguan pear trees blossom in late April, and the fruit is harvested in late September. In Jiangsu Province, however, Huangguan blossoms and is harvested at least one month earlier than in Gansu. Among the six sampled sites, the orchards in Yixing and Pengzhou were relatively young (less than ten years old), and their yields were relatively low. Pear trees in the orchards of Gaochun, Xuzhou, Xinji, and Jingtai were at full productive age (over ten years old), and their yields were higher than those of the young orchards. The application rate of N fertilizers in the six orchards ranged from 0 to 490 kg N ha⁻¹. The N treatments were designed by considering tree age, local soil conditions, and average yields. The N treatments in all regions were located fertilization experiments of the modern agricultural industry technology system, with 2-5 replicates of 2-5 trees each, arranged in 2 alternate tree rows during 2015-2016.
Fruit yields of different cultivars grown in different regions differed owing to cultivar characteristics, local climate, and orchard management. The average yields listed in Table 1 are the averages over the different N managements. Because of the experimental fertilization, both N deficiency and over-application of N often occurred in the same orchard.

Spectra Collection
Nitrogen concentrations in the middle leaves of new shoots from the external side (east, south, west, and north) of the canopy during the 50-80 days after full bloom have been suggested for assessing the tree's N status [40]. In 2015 and 2016, eight to ten leaves per tree were sampled from different cultivars grown in different regions. All leaf samples were collected from multiple plants and were free of insect or fungal infestation. To obtain leaf spectra with a high signal-to-noise ratio, the in-field leaf spectral measurements were conducted using an ASD FieldSpec 3 spectroradiometer (Analytical Spectral Devices, Boulder, CO, USA) fitted with a leaf clip with a black background and a plant probe with an internal stable light source [40]. The FieldSpec 3 covers wavelengths from 350 nm to 2500 nm, with high spectral resolution and resampling accuracy. Before leaf spectra measurement, the leaf clip with the Teflon white standard was applied to set the maximum reflectance (99.9%) condition. The leaf clip with the black background was then used to collect the leaf spectra as the ratio of leaf reflectance to the standard white reflectance. The adaxial leaf surface faced the plant probe. Spectra were collected at two symmetrical points on either side of the leaf vein, and the final leaf spectrum was the average of the spectra at the two points.
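The ratio-and-average step described above can be illustrated with a minimal sketch. In practice the instrument software performs the white-reference normalization; the function names and the three-band signal values below are hypothetical and are used only to show the arithmetic.

```python
def relative_reflectance(leaf_signal, white_signal):
    """Divide the leaf signal by the white-reference signal, band by band."""
    return [leaf / white for leaf, white in zip(leaf_signal, white_signal)]


def mean_spectrum(spectrum_a, spectrum_b):
    """Average two spectra measured on either side of the leaf vein."""
    return [(a + b) / 2 for a, b in zip(spectrum_a, spectrum_b)]


# Made-up raw signals for three bands (illustration only, not measured data).
white = [1000.0, 2000.0, 4000.0]
point_left = relative_reflectance([100.0, 900.0, 2000.0], white)
point_right = relative_reflectance([120.0, 1100.0, 1600.0], white)
leaf_spectrum = mean_spectrum(point_left, point_right)
```

A real spectrum from this instrument would contain one value per wavelength from 350 nm to 2500 nm rather than three.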

Determination of Leaf Nitrogen Concentration
Leaf N concentration on a dry mass basis was determined by the Dumas method using an Elementar Vario Macro CHN analyzer (Elementar Analysensysteme GmbH, Hanau, Germany). After the spectra measurements, the leaves were taken to the laboratory for analysis. The leaf samples were first dried in an oven at 105 °C for 1 h to deactivate enzymes and then at 70 °C for 72 h to remove the water. The central vein of each leaf was removed. The dried mesophyll was finely ground, mixed, and weighed in a tin boat for determination, with acetanilide as the standard.

Sample Division
Considering the large amount of data used in this paper, we chose the k-fold method to perform the cross-test. It is essential that the data in each training and test subset are uniformly distributed and consistent with the original data distribution. Therefore, stratified sampling was adopted to select the training set and the test set. The 1285 collected samples comprised 11 subsets. Two-thirds of each subset was randomly selected as the training set and the rest as the test set. In addition, the stratified random sampling was repeated 20 times to test the uniformity and robustness of the modelling methods.
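The stratified two-thirds split and its 20 repetitions can be sketched as follows. This is an illustration in Python rather than the MATLAB code used in the study; the function name, the seeds, and the three toy subsets are assumptions.

```python
import random


def stratified_split(subset_labels, train_fraction=2 / 3, seed=0):
    """Split sample indices so each subset contributes the same train fraction."""
    rng = random.Random(seed)
    by_subset = {}
    for index, label in enumerate(subset_labels):
        by_subset.setdefault(label, []).append(index)
    train, test = [], []
    for label in sorted(by_subset):
        indices = by_subset[label][:]
        rng.shuffle(indices)  # random selection within the subset
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Toy example with three subsets (the study used 11 subsets and 1285 samples).
labels = ["A"] * 30 + ["B"] * 60 + ["C"] * 9
splits = [stratified_split(labels, seed=s) for s in range(20)]  # 20 repetitions
train_idx, test_idx = splits[0]
```

Because the shuffle happens inside each subset, every repetition keeps the subset proportions of the full data set in both the training and the test set.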

Modelling Methods
In addition to the new machine learning methods, parametric regression methods and a linear nonparametric regression method were also applied and compared. The parametric regression models relate the leaf N concentration to narrowband indices (difference vegetation index, DVI; ratio vegetation index, RVI; and normalized difference vegetation index, NDVI) following the method of Yao et al., 2010 [41]. To simplify the computation and to decrease the collinearity of the leaf spectra, the narrowband vegetation indices were calculated at intervals of 10 nm within the range of 350-2500 nm. All the obtained DVI, RVI, and NDVI values were regressed against the reference leaf N concentration with a linear equation, which yielded the best linear model and its sensitive bands. The linear nonparametric regression method (partial least squares regression) was implemented in MATLAB R2017b (MathWorks, Natick, MA, USA). In addition, we used quadratic loss as the loss function.
The regular neural network (NN) is composed of three layers: (1) an input layer; (2) a hidden layer; and (3) an output layer. The NN's task is to minimize the error between the reference and calculated values by adjusting the layers' weights. In this study, the number of neurons in the input layer was not fixed: we used principal component analysis (PCA) for dimensionality reduction and retained enough principal components to explain 99.99% of the variance. The hidden layer had 14 neurons, and the output layer had one neuron.
Support vector regression (SVR) is a supervised learning algorithm used mainly for regression analysis [42,43]. We used an SVR with the radial basis function (Gaussian) kernel [44], which in its standard form is:
$$K(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right)$$
where x_i and x_j are two input vectors and σ is the kernel width.
In this study, Adaboost was adapted to combine with a regression method to obtain the final high-precision regression model.
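The narrowband index search described at the start of this section can be sketched as below: every pair of candidate bands is turned into an index, the index is regressed linearly against leaf N, and the pair with the highest R² is kept. This is only an illustrative sketch, not the procedure of Yao et al.; the function names and the three-band toy spectra are assumptions, whereas the study scanned bands at 10 nm intervals over 350-2500 nm.

```python
def dvi(r1, r2):
    return r1 - r2


def rvi(r1, r2):
    return r1 / r2


def ndvi(r1, r2):
    return (r1 - r2) / (r1 + r2)


def linear_r2(xs, ys):
    """R² of the least-squares line y = a + b*x (squared Pearson correlation)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy ** 2 / (sxx * syy)


def best_band_pair(spectra, nitrogen, index_fn):
    """Return (best R², (band_i, band_j)) over all ordered band pairs."""
    n_bands = len(spectra[0])
    best = (0.0, None)
    for i in range(n_bands):
        for j in range(n_bands):
            if i == j:
                continue
            values = [index_fn(sample[i], sample[j]) for sample in spectra]
            score = linear_r2(values, nitrogen)
            if score > best[0]:
                best = (score, (i, j))
    return best


# Toy reflectances for four leaves at three bands (illustration only).
spectra = [[0.50, 0.30, 0.10], [0.55, 0.31, 0.12],
           [0.62, 0.29, 0.11], [0.70, 0.30, 0.10]]
# Toy leaf N values constructed to depend linearly on band 0 minus band 2.
nitrogen = [20 + 20 * (s[0] - s[2]) for s in spectra]
score, pair = best_band_pair(spectra, nitrogen, dvi)
```

Plotting the R² of every band pair as a two-dimensional grid gives the kind of contour map commonly used to locate sensitive band combinations.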
The Adaboost algorithm was introduced into the NN and SVR regression modelling procedures to improve their predictive ability. Because the dimensionality of the training data is very high (2151 wavelengths), principal component analysis was used to reduce the dimension of the training and test samples. Consequently, the components satisfy the conditions of AdaBoost, and computational time is saved (see the diagram and detailed calculation steps in Figure S1 and the attached formulas of the Supplementary Materials).
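How many principal components are retained under a variance criterion can be shown with a short sketch. It assumes the eigenvalues of the covariance matrix of the spectra are already available; the function name and the eigenvalues below are hypothetical, and the 99.99% threshold is the one stated for the NN input layer.

```python
def components_for_variance(eigenvalues, threshold=0.9999):
    """Smallest number of leading components whose variance share reaches threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return count
    return len(ordered)


# Made-up eigenvalues (illustration only): variance concentrates in few components.
eigenvalues = [90.0, 9.0, 0.9, 0.095, 0.005]
n_components = components_for_variance(eigenvalues)
```

Because leaf spectra are highly collinear, a small number of components is usually enough to meet such a threshold, which is why the input layer size depends on the data rather than being fixed in advance.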
Moreover, the randomness of the results produced by NN and SVR decreases significantly after many Adaboost iterations. As a result, the outcomes of several independent runs of the mixed method are similar. To test robustness and stability, the hybrid algorithm was run with 20 repetitions [45][46][47]. The computational steps of AdaBoost.RT-BP and Adaboost-SVR can be found in the Supplementary Materials. The NN, SVR, parametric regression methods, and the new machine learning methods were all implemented in MATLAB R2017b (MathWorks, Natick, MA, USA).
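The weight-updating idea behind boosting for regression can be sketched as follows. The sketch follows the commonly described AdaBoost.RT scheme (a relative-error threshold decides which samples count as well predicted) and may differ from the exact steps given in the Supplementary Materials. A weighted regression stump stands in for the BP neural network, and the threshold `phi`, the exponent `power`, and the toy data are all assumptions.

```python
import math


def fit_stump(xs, ys, weights):
    """Weighted one-split regression stump: the stand-in weak learner."""
    best = None
    cuts = sorted(set(xs))
    for low, high in zip(cuts, cuts[1:]):
        thr = (low + high) / 2
        left_w = sum(w for x, w in zip(xs, weights) if x <= thr)
        right_w = sum(w for x, w in zip(xs, weights) if x > thr)
        left_mean = sum(w * y for x, y, w in zip(xs, ys, weights) if x <= thr) / left_w
        right_mean = sum(w * y for x, y, w in zip(xs, ys, weights) if x > thr) / right_w
        loss = sum(w * (y - (left_mean if x <= thr else right_mean)) ** 2
                   for x, y, w in zip(xs, ys, weights))
        if best is None or loss < best[0]:
            best = (loss, thr, left_mean, right_mean)
    _, thr, left_mean, right_mean = best
    return lambda x: left_mean if x <= thr else right_mean


def adaboost_rt(xs, ys, rounds=10, phi=0.05, power=2):
    """Boost weak regressors; phi is the relative-error threshold (assumed value)."""
    n = len(xs)
    dist = [1.0 / n] * n  # weight distribution over the training samples
    learners = []
    for _ in range(rounds):
        model = fit_stump(xs, ys, dist)
        rel_err = [abs((model(x) - y) / y) for x, y in zip(xs, ys)]
        # Error rate: total weight of samples predicted worse than the threshold.
        eps = sum(d for d, e in zip(dist, rel_err) if e > phi)
        eps = min(max(eps, 1e-6), 1 - 1e-6)  # keep the logarithm finite
        beta = eps ** power
        # Shrink the weights of well-predicted samples, then renormalize.
        dist = [d * (beta if e <= phi else 1.0) for d, e in zip(dist, rel_err)]
        total_weight = sum(dist)
        dist = [d / total_weight for d in dist]
        learners.append((math.log(1 / beta), model))
    norm = sum(alpha for alpha, _ in learners)
    return lambda x: sum(alpha * m(x) for alpha, m in learners) / norm


# Toy data: a smooth increasing relation (illustration only).
xs = [i / 10 for i in range(20)]
ys = [20 + 5 * x + 0.5 * x ** 2 for x in xs]
ensemble = adaboost_rt(xs, ys)
predictions = [ensemble(x) for x in xs]
```

The final prediction is a weighted average of the weak learners, so learners with a lower weighted error rate contribute more; replacing the stump with a small neural network gives the general shape of a boosted NN regressor.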
The accuracy and precision of the different models were evaluated by the coefficient of determination (R²) between predicted and chemically determined N concentrations and by the root mean squared error (RMSE). According to the criteria of Saeys et al. (2005), training and test results with an R² value greater than 0.91 are considered excellent, whereas an R² between 0.82 and 0.90 represents a good prediction [48]. RMSE values of the training and test results should be small, indicating predictions close to the measured values. The equations used to calculate these parameters are as follows:
Coefficient of determination:
$$R^2 = 1 - \frac{\sum_{i=1}^{n_{\text{samples}}} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n_{\text{samples}}} (y_i - \bar{y})^2}, \quad \text{in which } \bar{y} = \frac{1}{n_{\text{samples}}} \sum_{i=1}^{n_{\text{samples}}} y_i$$
Root mean squared error:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n_{\text{samples}}} \sum_{i=1}^{n_{\text{samples}}} (y_i - \hat{y}_i)^2}$$
where y_i is the true value of sample i and ŷ_i is the predicted value of sample i.
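The two evaluation metrics can be computed directly from their standard definitions, as in the sketch below. The function names and the four measured and predicted values are made up for illustration.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of y (here g kg⁻¹)."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


# Made-up leaf N values in g kg⁻¹ (illustration only).
measured = [24.0, 26.0, 28.0, 30.0]
predicted = [25.0, 25.0, 29.0, 29.0]
r2_value = r_squared(measured, predicted)   # residual SS = 4, total SS = 20
rmse_value = rmse(measured, predicted)      # every error has magnitude 1
```

Here R² is 0.8 and the RMSE is 1.0 g kg⁻¹, which shows that the two metrics answer different questions: R² is relative to the spread of the measured values, whereas RMSE is an absolute error.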

Leaf N Concentration
Statistics of leaf N concentration at the different sample sites are shown in Table 2. Samples were collected in different years and from different cultivars and regions of mainland China. The average leaf N concentrations of Yali and Kotobuki shinsui were 25.3 and 23.7 g kg⁻¹, respectively, which were significantly lower than those of the other cultivars. Nevertheless, we found no significant difference between the average leaf N concentrations of 5-year Cuiguan trees in Yixing and 8-year Cuiguan trees in Pengzhou. The same tendency was found between the 20-year Huangguan in Xinji and the 17-year Huangguan in Jingtai. Differences in leaf N due to tree age and cultivation region were smaller than those due to cultivar.

Leaf Reflectance Spectra
The leaf reflectance spectra of five pear cultivars with leaf N concentrations of 25.0 g kg⁻¹ and 30.0 g kg⁻¹ were manually selected to compare the differences induced by cultivar (Figure 1). The spectra collected from the different cultivars showed the same general shape as other foliar spectra. However, they differed at certain bands. In detail, for the same difference in leaf N concentration, the relationship between spectra and leaf N concentration differed significantly among cultivars. The spectra of Kotobuki shinsui and Cuiguan differed with leaf N concentration in the visible and near-infrared regions, and those of Huangguan differed in the near-infrared region. However, the leaf spectra of Yali and Yuanhuang did not differ apparently with leaf N concentration in any region.
The correlation coefficients between the leaf N concentration and the leaf spectra of the different cultivars were plotted to better understand inter-cultivar variability (Figure 2). The trends of the correlation coefficients of Kotobuki shinsui, Cuiguan, and Yali were similar, with higher correlations at 550 nm (green peak) and 720 nm (red edge), but the values of the correlation coefficients at the green peak and the red edge varied from one cultivar to another (Figure 2a). In contrast, the trends of the correlation coefficients of Huangguan and Yuanhuang (Figure 2b) were significantly different from those of the three cultivars in Figure 2a. The correlation coefficients of Huangguan in the 850 nm to 1350 nm band were higher than those at other wavelengths, while the leaf spectra of Yuanhuang at 670 nm and 1920 nm presented high correlation values (Figure 2b). The leaf weight per unit area of different cultivars under the same difference in leaf N concentration (Figure S2 in the Supplementary Materials) partially demonstrates that leaf structure differs between cultivars.
Figure 2. Correlation coefficients between leaf nitrogen concentration and the original spectra for different cultivars: (a) Kotobuki shinsui, Cuiguan, and Yali; (b) Huangguan and Yuanhuang. The trends of the correlation coefficients of Kotobuki shinsui, Cuiguan, and Yali were similar to each other, with higher correlations at 550 nm (green peak) and 720 nm (red edge). The trends of the correlation coefficients of Huangguan and Yuanhuang were significantly different from those of the three cultivars above.
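A correlation curve of this kind is obtained by computing, at each wavelength, the Pearson correlation coefficient between reflectance and leaf N across the samples of one cultivar. A minimal sketch is given below; the function names and the two-band toy data are assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_by_band(spectra, nitrogen):
    """One correlation coefficient per band, across all samples."""
    n_bands = len(spectra[0])
    return [pearson([sample[b] for sample in spectra], nitrogen)
            for b in range(n_bands)]


# Toy data: four leaves, two bands (illustration only).
nitrogen = [22.0, 25.0, 28.0, 31.0]
spectra = [[0.20, 0.40], [0.18, 0.42], [0.16, 0.41], [0.14, 0.45]]
correlations = correlation_by_band(spectra, nitrogen)
```

In this toy case the first band falls linearly as leaf N rises, giving a coefficient of -1, while the second band is only loosely related to leaf N. Running the same computation separately for each cultivar yields curves that can be compared band by band.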

Modelling Results
The 1285 collected samples comprised 11 subsets; two-thirds of each subset was randomly selected as the training set and the rest as the test set (Table 3). The parametric regression models relating the leaf N concentration to narrowband indices (difference vegetation index, DVI; ratio vegetation index, RVI; and normalized difference vegetation index, NDVI) by Yao et al. (2010) were used to identify the bands that resulted in high R² values. Contour maps of R² for the linear relationship between the narrowband indices (DVI, RVI, and NDVI) and the leaf N concentration of different cultivars are shown in Figure S3. The R² between leaf N concentration and DVI, RVI, and NDVI ranged from 0.35 to 0.45 in training and from 0.32 to 0.42 in the test. The wavelengths of 2170 nm and 2160 nm gave the highest R² between leaf N concentration and DVI, while the wavelengths of 1720 nm and 580 nm resulted in the highest correlation with RVI and NDVI. DVI had the highest R² among the three vegetation indices. Compared with the vegetation indices, PLSR showed good modelling accuracy during training (R² = 0.85), but its predictive accuracy during the test (R² = 0.76) was lower.
Compared with the single modelling methods of SVR and NN, Adaboost-initiated NN significantly improved the model accuracy in both training and test (Table 4). However, the AdaBoost SVR algorithm performed essentially identically to the standard SVR algorithm, with only limited improvement of the modelling accuracy in the test subset. The test R² of Adaboost combined with NN was above 0.9, which was significantly higher than that of the other methods. AdaBoost.RT-BP had advantages over the other methods and fitted the relationship between leaf reflectance and N concentration across different pear cultivars. The RMSE of the five machine learning methods ranged from 1.03 to 1.57 g kg⁻¹ in training and from 1.29 to 1.78 g kg⁻¹ in the test. The errors of AdaBoost.RT-BP in both the training and test sets were lower than those of the other methods, so AdaBoost.RT-BP had the best modelling accuracy in both sets. To test the stability of the modelling accuracy of the four machine learning methods, 20 random tests were conducted on the stratified random sampling data (Figure 3). A larger interquartile range and more outliers indicate poorer robustness.
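The robustness criterion used for the repeated tests, the interquartile range and the presence of outliers, can be computed as in the sketch below. The 1.5 × IQR fences are the usual box-plot convention and are an assumption here, as are the function name and the made-up R² values; quartile definitions also differ slightly between software packages.

```python
import statistics


def robustness_summary(scores):
    """Return the interquartile range and the box-plot outliers of repeated scores."""
    q1, _, q3 = statistics.quantiles(scores, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [s for s in scores if s < low_fence or s > high_fence]
    return iqr, outliers


# Made-up R² values from six repeated runs of two hypothetical models.
stable_model = [0.90, 0.91, 0.90, 0.92, 0.91, 0.90]
unstable_model = [0.90, 0.91, 0.90, 0.92, 0.91, 0.60]
iqr_stable, outliers_stable = robustness_summary(stable_model)
iqr_unstable, outliers_unstable = robustness_summary(unstable_model)
```

A single poor run widens the interquartile range and appears as an outlier, so the second hypothetical model would be judged less robust even though most of its runs match the first.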
In general, NN showed better stability than SVR because of the lower standard deviation in both R² and errors. Compared with the single modelling methods (SVR, NN), Adaboost initiated in SVR and NN improved the modelling accuracy and significantly reduced the number of low-precision runs in both training and test (Figure 3). The robustness of the SVR and Adaboost-SVR models was not as good as that of NN. In this study, Adaboost iteratively selected several learner instances by maintaining an adaptive weight distribution, which improved the modelling accuracy and robustness of NN over the training examples. Among the four modelling methods, Adaboost combined with NN (AdaBoost.RT-BP) outperformed the others on robustness.
The sample set of five cultivars located in different planting regions was randomly split into a training set (n = 856) and a test set (n = 429), with a split ratio of 2:1. Compared with the other seven modelling methods, the R² between the measured leaf N concentration and the value predicted by the AdaBoost.RT-BP model was above 0.9 in both the training and test sets (Figure 4). Accordingly, this model's root mean square error did not exceed 1.29 g kg⁻¹. The result indicated that the model established by the AdaBoost.RT-BP method is suitable for the non-destructive determination of leaf N concentration across different cultivars and regions in pear orchards.

Leaf Reflectance Responses to Nitrogen Concentration of Different Cultivars
In this study, we analyzed the relationship between leaf reflectance and N concentration for different cultivars from different growing regions. The leaf spectral characteristics of different cultivars with different N concentrations were roughly the same, but for the same difference in leaf N concentration the leaf reflectance varied among cultivars, especially in the near-infrared region. In addition, the leaf weight per unit area of different cultivars at the same leaf N concentration suggests that leaf structure characteristics (leaf thickness) affected by the leaf N concentration may be the reason for the difference in leaf reflectance among cultivars (Figure S2). Our result is consistent with the studies of Li et al. (2018) and Wang et al. (2012) on rice and wheat, which reported that leaf reflectance was more sensitive to cultivar than to growing region [22,49]. Further analysis of the correlation coefficients between leaf reflectance and measured leaf N concentration supported this result. Future work should use leaf images and determinations of chlorophyll concentration or leaf thickness to explain the differences induced by cultivar.

Comparison of Modelling Methods
The wavelengths with the maximum R² response to leaf N concentration were similar to those found in our previous study (2170 nm and 2150 nm), covering a large range of cultivars and nitrogen concentrations. However, the R² is relatively low, and the wavelengths were probably highly correlated according to the result. The modelling accuracy in this study was much lower than that reported for crop N determination by parametric regression models exploiting limited bands of the VIS, red edge, NIR, and short-wave infrared (SWIR) regions [50]. The maximum R² response to leaf N concentration of wheat was in the visible and near-infrared region [41]. Nevertheless, future work should continue to test more candidate indices to reduce the amount of input data. Parametric regression models using limited bands are easily influenced by leaf nitrogen allocation [51]. Recent research has demonstrated that leaf N expressed on a leaf-area basis is more highly correlated with photosynthetic capacity [52,53]. Nevertheless, some studies have emphasized that vegetation indices using the SWIR regions, which respond to leaf N allocated to protein, could improve the modelling accuracy [54]. Consistent with this, our results revealed that leaf N in the pear tree might be allocated more to non-photosynthetic N (such as proteins and structural N), which is more sensitive to the SWIR regions [19]. In addition, PLSR, which was found optimal in our previous study including only one cultivar, did not perform well in the mixed-cultivar setting of the present study.
Regular NN and SVR have been widely used in regression dealing with high-dimensional data. The modelling performance of NN was superior to that of SVR in this study. The nonlinear regression in the SVR modelling procedure is insensitive to random noise [42]. Adaboost is one of the most successful recognition algorithms in machine learning and is based on the idea that a combination of simple learners (obtained by a weak learner) can perform better than any of the simple learners alone [34]. Adaboost iteratively selects several learner instances by maintaining an adaptive weight distribution over the training examples, improving the modelling accuracy and robustness of SVR and NN [35]. Compared with the regular NN, Adaboost combined with NN reduced the RMSE in both training and test. However, the AdaBoost SVR algorithm performed essentially identically to the standard SVR algorithm. The experimental results show that Adaboost SVR did have a better effect in the test subset, but the improvement of the modelling accuracy was small (Table 4). Wickramaratna et al. (2001) demonstrated that the performance of boosting declines if the underlying learner is a strong regression method such as SVR [55]. Among the machine learning methods, Adaboost combined with NN outperformed the others.

Pear Leaf Nitrogen Determination by the Spectral Method
The published modelling methods listed in Table 5 were evaluated for their ability to predict the leaf N concentration of pear trees based on the R² of training and test and the mean relative error of the test. Linear regression has been used to fit the leaf N concentration of 'Rocha' pear trees (Neto et al., 2011) and of the Huanghua pear [12,56]. However, the result of Neto et al. (2011) only demonstrated that SPAD readings ≥33 in leaves sampled at 60-110 DAFB corresponded to an optimum leaf N concentration of ≥20 g kg⁻¹ dry weight. The linear regression models showed unstable predictive ability during the test (Table 5). The vegetation indices (DVI [40]; NDVI [57]) showed similar training R² values and similar sensitive wavelengths of maximum R² for both single cultivars and mixed cultivars (this paper). PLSR can alleviate the high dimensionality of full-band spectral input but was weak when dealing with the problem caused by different cultivars (Table 5). The RMSE of models built on mixed cultivars was generally larger than that of models built on single cultivars. The test R² values of the PLSR models with mixed cultivars were 0.72 [58] and 0.76 (this paper). In contrast, the R² of both training and test of the PLSR model with a single cultivar was above 0.85. Interestingly, the R² of NN showed the opposite pattern to that of PLSR. The test R² values of the NN and AdaBoost.RT-BP models with mixed cultivars were 0.85 and 0.92 (this paper), respectively, whereas the training and test R² of the NN model with a single cultivar were 0.89 and 0.67 [40]. Consequently, PLSR is indicated for modelling a single cultivar, and NN is more suitable for modelling mixed cultivars.

Conclusions
In this study, machine learning methods were applied to model leaf nitrogen concentration in pear orchards with mixed cultivars from in-field visible-near infrared spectroscopy. The results showed that the effect of cultivar on the leaf reflectance of pears was greater than that of growing region and tree age. In addition, among the modelling methods analyzed, AdaBoost.RT-BP performed the best in accuracy and robustness in both the training and test sets. The results of this study provide a new method to assess the N status of pear trees for better N management in pear orchards with mixed cultivars.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/s21186260/s1, Figure S1: The schematic illustration of Adaboost analysis, Attached formulas: The computational steps of the AdaBoost.RT-BP and Adaboost-SVR, Figure S2: The leaf weight per unit area of different cultivars affected by the same leaf nitrogen concentration, Figure S3: Contour maps of R² for the linear relationship between the narrowband indices (DVI, RVI and NDVI) and the leaf N concentration of different cultivars.
Author Contributions: Conception and experimental design were done by Y.X. and J.W.; experiment execution and data collection were done by J.W. and W.X.; J.W., W.X., X.S. and C.D. completed the manuscript writing and revision. All authors have read and agreed to the published version of the manuscript.