Article

Evaluation of Soil Nutrient Status Based on LightGBM Model: An Example of Tobacco Planting Soil in Debao County, Guangxi

1 School of Earth Science and Engineering, Sun Yat-sen University, Zhuhai 519000, China
2 Guangdong Key Laboratory of Geological Process and Mineral Resources Exploration, Zhuhai 519000, China
3 Guangdong Provincial Key Laboratory of Geodynamics and Geohazards, Zhuhai 519000, China
4 Guangdong Vcarbon Testing Technology Co., Ltd., Qiangyuan 511500, China
5 China National Tobacco Corporation Guangxi Corporation, Nanning 530022, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2022, 12(23), 12354; https://doi.org/10.3390/app122312354
Submission received: 21 September 2022 / Revised: 21 November 2022 / Accepted: 23 November 2022 / Published: 2 December 2022
(This article belongs to the Special Issue New Advances and Illustrations in Applied Geochemistry)

Abstract

Soil nutrient status is the foundation of agricultural development. Characterizing soil nutrients and evaluating their status can provide a reference for the development of modern agriculture. LightGBM is a gradient-boosting algorithm that uses a histogram-based decision-tree method to speed up training while maintaining accuracy. Based on a LightGBM model, the main nutrient features and status of tobacco planting soil were analyzed in seven towns in Debao County, Guangxi Province, namely Yantong Town, Longguang Town, Najia Town, Zurong Town, Du’an Town, Dongling Town and Jingde Town. The confusion matrix results show that the accuracy of the LightGBM model is 94.2%, and the eigenvalue analysis shows that available potassium (K) contributes the most to the nutrient status. The soil pH, ranging from 6.1 to 7.8, is favorable for tobacco growth, and the contents of soil organic matter, total nitrogen (N), available phosphorus (P), exchangeable calcium (Ca) and exchangeable magnesium (Mg) are at the appropriate level. Available K and available zinc (Zn) are at a high level, but available boron (B) is slightly insufficient. The nutrient status of 10% of the soil is at an extremely high level, and about 81.03% of the soil is at a medium level or above. The LightGBM model is highly reliable in the automatic evaluation of soil nutrient status: it not only evaluates the soil nutrient status accurately but also reflects the correlation and importance of nutrient factors. Therefore, the LightGBM model is significant for guiding soil cultivation and agricultural production.

1. Introduction

Soil, as a basic environment for crop growth and an important means of agricultural production, is the primary guarantee for the sustainable development of the biosphere [1,2]. The abundance or shortage of soil nutrients greatly affects the quality of crops, which is one of the important factors for the development of planting agriculture [3]. Influenced by landform, climate, altitude and so on, soil nutrients are diverse in different regions [4]. The level of soil nutrients is not only affected by the independent role of nutrient factors but also depends on the comprehensive coordination of various nutrient factors [5]. Therefore, exploring the comprehensive evaluation of soil nutrient status can lead to a deeper understanding of the current nutrient features of soil, which has important guiding significance for farming and fertilization in agricultural areas.
According to previous studies, the evaluation of soil nutrient features and status is mainly based on the comprehensive evaluation of nutrient factors in the study area [6,7]. However, due to the different locations, soil texture, hydrological conditions and suitable crop types of various cultivated land soils, there are no standard methods to evaluate the soil nutrient status [8]. The commonly used methods include principal component analysis, cluster analysis, the fuzzy mathematical model membership function, Nemero comprehensive index and gray correlation analysis [9].
In recent years, machine learning has received wide attention in various fields. As an extension of applied statistics, machine learning is well suited to research and applications in agronomy and geosciences [10]. For example, ensemble algorithms such as random forest and XGBoost are often used to solve classification and regression problems in geochemical research due to their good performance [11,12]. Tian et al. [13] established a random forest model for the automatic evaluation of soil nutrient status and reported objective and accurate results. Tong et al. [14] found that, in the risk assessment of waterlogging in the central cities of the Yangtze River Delta, the XGBoost model gave more accurate assessment and prediction results than traditional research methods. LightGBM is an open-source framework developed by Microsoft, which uses a histogram-based decision-tree algorithm and is often regarded as an improved version of XGBoost. Compared with previous models, LightGBM has the advantages of fast training speed, high accuracy, low memory use and more objective training results, and it is mainly used for classification and regression problems in data analysis [15,16]. Therefore, applying the LightGBM model to the research of soil nutrient status can be expected to yield more objective and accurate evaluation results.
Debao County is one of the main tobacco planting areas in Baise City, which is the largest tobacco planting area in Guangxi Province, with an area of about 8167 ha, accounting for 78% of the tobacco planting area in the province [17,18]. At present, there are few in-depth studies on the evaluation of soil nutrient status and soil nutrient dynamics in Debao County, and the main research methods are traditional. Applying machine learning can evaluate the abundance and deficiency of nutrients in local tobacco planting soil more accurately and objectively. This supports a better local understanding of the current nutrient status of tobacco planting soil and provides a methodological reference for regional soil nutrient status evaluation, so as to promote the development of agriculture.
In this study, data on nine nutrient factors in tobacco planting soil from seven towns in Debao County, namely Yantong Town, Longguang Town, Najia Town, Zurong Town, Du’an Town, Dongling Town and Jingde Town, were preprocessed through principal component analysis and then used as the test set for the LightGBM model. The feasibility of the LightGBM model was verified using the confusion matrix, and the differences in importance among the soil nutrient factors were obtained with eigenvalue analysis. Through the classification and prediction of the LightGBM model, the nutrient status of tobacco planting soil in the study area was evaluated automatically. Therefore, using the LightGBM model to study the nutrient status of tobacco planting soil can provide a scientific reference for the improvement of soil fertility in the local tobacco industry and the rational layout of tobacco planting.

2. Materials and Methods

2.1. Experimental Sites

Debao County (106°10′–107°00′ E, 23°00′–23°40′ N) is located in the southwest of Baise City, Guangxi Province, at an altitude of 200–1000 m. Its climate is warm and wet, and the hydrothermal and sunshine conditions are good across the four seasons, which makes it very suitable for producing high-quality tobacco. The local average annual rainfall is about 1462.5 mm, the average annual temperature is about 19.5 °C and the average annual sunshine duration is about 1325 h [19]. The soil texture types in the study area are mainly clay and loam. The vast area of cultivated land is conducive to agricultural development [20].

2.2. Data Source

According to the planting situation of the tobacco industry in the study area, 290 soil samples were collected from seven towns in Debao County, namely Yantong Town, Longguang Town, Najia Town, Zurong Town, Du’an Town, Dongling Town and Jingde Town (Figure 1). The samples were collected according to the Technical Specification for Soil Testing and Formulated Fertilization [21]. Every 50 acres of contiguous tobacco fields was taken as one sampling unit and sampled with the S-shaped point distribution method. Isolated, small tobacco fields that are not contiguous were each taken as a sampling unit and sampled with the ‘plum blossom’ point distribution method. During soil collection, the topsoil of 0–20 cm in the tillage layer was selected, and each sampling unit had 5–8 sampling points. After natural air drying, impurity removal, grinding, screening and other steps, the nutrient factor indexes were determined and analyzed.

2.3. Evaluation Index and Measurement

Based on the commonly used indicators for soil nutrient status evaluation, nine nutrient factors were selected, measured and analyzed to assess the soil of Debao County: pH value, organic matter, total N, available K, available P, exchangeable Ca, exchangeable Mg, available B and available Zn. All indicators were determined in strict accordance with standard methods (Table 1). The grading standard for the abundance and deficiency of the different nutrient factors (Table 2) follows the Integrated Management of Tobacco Planting Soil and Tobacco Nutrients in China [22].

2.4. Research Methods

2.4.1. LightGBM

LightGBM is a gradient-boosting algorithm based on the GBDT framework, launched by Microsoft in 2017. It is often described as an improvement on XGBoost, with more efficient parallel training, lower memory consumption and more accurate results [30]. As a boosting method, it combines many weak learners (decision trees) into a strong learner, and it adopts a histogram-based decision-tree algorithm. As the tree models are combined, the computational cost is reduced by histogram subtraction (the histogram of one child node is obtained as the difference between the histograms of its parent and its sibling), so that the result is a high-quality tree ensemble that can be used as a classification and prediction model [31].
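The histogram idea described above can be illustrated with a minimal sketch in plain Python. It is not LightGBM's implementation: the equal-width binning, the sample values, the gradients and the function names `bin_edges` and `gradient_histogram` are all assumptions made for illustration, and the real library also accumulates second-order statistics.

```python
from bisect import bisect_right


def bin_edges(values, n_bins):
    """Interior edges of equal-width bins over the observed range (illustrative choice)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + width * i for i in range(1, n_bins)]


def gradient_histogram(values, gradients, edges):
    """Sum the gradients of all samples that fall into each feature bin."""
    hist = [0.0] * (len(edges) + 1)
    for v, g in zip(values, gradients):
        hist[bisect_right(edges, v)] += g
    return hist


# Hypothetical available-K values (mg/kg) and per-sample gradients; not the study's data.
k_values = [120.0, 310.0, 95.0, 270.0, 410.0, 180.0, 350.0, 60.0]
gradients = [0.5, -0.25, 0.75, -0.5, -1.0, 0.25, -0.75, 1.0]

edges = bin_edges(k_values, n_bins=4)
parent = gradient_histogram(k_values, gradients, edges)

# Suppose an earlier split sent the first four samples to the left child.
left = gradient_histogram(k_values[:4], gradients[:4], edges)

# Histogram subtraction: the right child is obtained without rescanning its samples.
right_by_subtraction = [p - l for p, l in zip(parent, left)]
right_direct = gradient_histogram(k_values[4:], gradients[4:], edges)
```

Candidate split points are then evaluated on a handful of bins rather than on every distinct feature value, which is where the savings in time and memory come from.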

2.4.2. LightGBM Model Construction

The LightGBM model training is divided into five steps, namely data collection, feature engineering, model training, cross validation and accuracy evaluation [32]. The data collection is mainly the experimental data of nine nutrient factors, namely pH value, organic matter, total N, available K, available P, exchangeable Ca, exchangeable Mg, available B and available Zn in the soil of the tobacco growing area. The data were preprocessed, and the model database was established after screening and eliminating the abnormal values.
Feature engineering is an important part of the LightGBM model (Figure 2) construction. It is mainly used to classify sample data through feature values that can reflect the nature of classification. In addition, model training and cross validation mainly optimize the model through continuous learning and training. After that, the accuracy of the sample set and the test set was evaluated, and the results were output.
A comprehensive analysis of the nine nutrient factors, namely pH value, organic matter, total N, available K, available P, exchangeable Ca, exchangeable Mg, available B and available Zn, was carried out for 1038 sample points in Baise, Hechi and Hezhou City in the Guangxi Zhuang Autonomous Region. These data were preprocessed with principal component and cluster analysis and used as the training data of the model. The 290 samples from the study area are the test data. During preprocessing, the original data were divided into five grades of tobacco planting soil nutrient status using the class average method of cluster analysis, so the class labels of the confusion matrix are also displayed as five grades during model training [33].
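The labelling step can be sketched as follows, reading the "class average method" as average-linkage agglomerative clustering (an interpretation, not something the text spells out). The composite scores are hypothetical stand-ins for principal component scores, and the function name `average_linkage` is introduced here for illustration only.

```python
def average_linkage(points, n_clusters):
    """Merge the two clusters with the smallest mean pairwise distance until n_clusters remain."""
    clusters = [[p] for p in points]

    def mean_distance(a, b):
        return sum(abs(x - y) for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = mean_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical composite scores, one per soil sample; not the study's data.
scores = [0.1, 0.2, 1.0, 1.1, 2.0, 2.2, 3.1, 3.0, 4.0, 4.3]
groups = average_linkage(scores, n_clusters=5)

# Rank clusters by mean score: the highest mean is labelled grade I, the lowest grade V.
ordered = sorted(groups, key=lambda c: sum(c) / len(c), reverse=True)
grades = {name: sorted(c) for name, c in zip(["I", "II", "III", "IV", "V"], ordered)}
```

The resulting grade labels are what a classifier would then be trained to reproduce from the nine measured nutrient factors.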

3. Results and Analysis

3.1. Analysis of Nutrient Features

As shown in Table 3, the pH of tobacco planting soil in the seven towns of Debao County is generally between 6.1 and 7.8, with an average value of about 6.9, so the soil is generally weakly acidic. The content of soil organic matter is between 12.1 and 49.3 g/kg, with an average value of 27.5 g/kg, which is moderate, while the average contents of organic matter in Najia Town, Zurong Town and Du’an Town are 38.0, 33.5 and 32.5 g/kg, respectively, which are within the high range. The average contents of soil total N, available K and available P are 1250, 270.5 and 23.0 mg/kg, respectively. Among them, the contents of soil total N and available P are moderate, and the content of available K is high.
Figure 3 shows that the exchangeable Ca content in the tobacco planting soil in seven towns of Debao County is relatively high, with more than 47% of the soil samples containing >4 cmol/kg and 22% of the soil samples containing >10 cmol/kg, which is extremely high. The exchangeable Mg content of soil is good, and the content of most samples is between 0.8 and 1.6 cmol/kg, which is basically within the suitable range for planting high-quality tobacco. The number of soil samples with available B content lower than the B deficiency threshold (0.5 mg/kg) [34] accounted for about 35%, and the number of samples with available Zn content higher than the enrichment threshold (1.0 mg/kg) accounted for about 99.5%. Therefore, the soil available Zn content is very high, but the available B content is slightly insufficient.
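The threshold shares reported above can be computed along the following lines. This is a sketch only: the two thresholds are the ones quoted in the text, but the eight sample values and the helper names `share_below` and `share_above` are hypothetical.

```python
B_DEFICIENCY_MG_KG = 0.5   # available B deficiency threshold quoted in the text
ZN_ENRICHMENT_MG_KG = 1.0  # available Zn enrichment threshold quoted in the text


def share_below(values, threshold):
    """Percentage of samples strictly below the threshold."""
    return 100.0 * sum(v < threshold for v in values) / len(values)


def share_above(values, threshold):
    """Percentage of samples strictly above the threshold."""
    return 100.0 * sum(v > threshold for v in values) / len(values)


# Hypothetical measurements (mg/kg) for eight samples; not the study's data.
available_b = [0.32, 0.61, 0.48, 0.75, 0.55, 0.90, 0.44, 0.66]
available_zn = [2.4, 3.1, 1.8, 0.9, 4.2, 2.7, 3.6, 1.5]

b_deficient_pct = share_below(available_b, B_DEFICIENCY_MG_KG)
zn_enriched_pct = share_above(available_zn, ZN_ENRICHMENT_MG_KG)
```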

3.2. Pearson Correlation Analysis

The quality of soil nutrient status is a comprehensive reflection of various nutrient factors, and there is a certain correlation between different soil nutrient factors [8,35]. Pearson correlation analysis of the nine soil nutrient factors in the tobacco planting areas can lead to a better understanding of the relationships between them. The results (Table 4) show that pH value has a significant positive correlation with organic matter, total N, exchangeable Ca and exchangeable Mg and a significant negative correlation with available B. Among them, the correlation coefficient between pH value and exchangeable Ca reaches 0.613. The acidity and alkalinity of the soil are therefore closely related to the concentration of calcium ions: where the pH value is high, the exchangeable Ca content is high.
In addition, there is a significant positive correlation between soil organic matter and total N, with a correlation coefficient of 0.588. The main reason is that the N in the soil mainly exists in the form of organic N, and the organic N mainly comes from the inorganic degradation of organic matter.
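For reference, the Pearson coefficient used in this section can be computed as in the sketch below. The paired pH and exchangeable Ca values are hypothetical and only show the calculation; they are not the values behind Table 4.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical paired measurements; not the study's data.
ph = [6.1, 6.5, 6.8, 7.0, 7.4, 7.8]
exchangeable_ca = [3.2, 4.1, 3.9, 6.0, 8.5, 10.2]  # cmol/kg

r = pearson_r(ph, exchangeable_ca)
```

A value of `r` close to +1 indicates that the two factors rise together, a value close to −1 that one falls as the other rises, and a value near 0 that there is no linear relationship.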

3.3. Evaluation of Soil Nutrient Status by LightGBM Model

3.3.1. Confusion Matrix

A confusion matrix, also known as an error matrix, is a standard format for expressing the accuracy of ensemble algorithm models and a way of judging how well a model classifies [36]. It is a matrix in which rows represent actual classes and columns represent predicted classes. In the calculation process of the LightGBM model, the 1038 samples and 9 nutrient factors from Baise, Hechi and Hezhou City were taken as the data set and split 7:3 into a training sample set (70%) and a validation sample set (30%) for multi-class prediction.
As shown in Figure 4, 127 of the 134 samples in the class I nutrient state were predicted correctly, a validation rate of 94.8%. In classes II, III, IV and V, 34, 37, 61 and 35 samples were predicted correctly, with validation rates of 91.9%, 92.5%, 92.4% and 94.6%, respectively. Of the 312 validation samples across the five nutrient status levels, 294 were assigned to their true classes, an overall accuracy of 94.2%. The model has a small classification error and high classification accuracy.
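A 7:3 split of the 1038 samples can be sketched as below. Rounding the validation size up is an assumption, chosen because it reproduces the 312 validation samples reported in this section; the seed and the function name `split_indices` are likewise hypothetical, and the source does not say how the split was randomized.

```python
import math
import random


def split_indices(n_samples, validation_fraction, seed):
    """Shuffle sample indices and hold out a validation share (size rounded up)."""
    n_val = math.ceil(n_samples * validation_fraction)
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # First n_val shuffled indices form the validation set, the rest the training set.
    return indices[n_val:], indices[:n_val]


train_idx, val_idx = split_indices(1038, 0.3, seed=0)
```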
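The rates quoted above follow directly from the counts, as the short check below shows. The first two calculations use only figures stated in the text; the 3-class matrix is a hypothetical example included to show how overall accuracy is read off the diagonal of a confusion matrix.

```python
def rate(correct, total):
    """Share of correctly classified samples, as a percentage."""
    return 100.0 * correct / total


def confusion_matrix_accuracy(matrix):
    """Overall accuracy of a square confusion matrix (rows actual, columns predicted)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


# Figures quoted in the text.
class_i_rate = rate(127, 134)
overall = rate(127 + 34 + 37 + 61 + 35, 312)

# Hypothetical 3-class matrix; not the matrix in Figure 4.
toy = [[8, 1, 1],
       [0, 9, 1],
       [1, 0, 9]]
toy_accuracy = confusion_matrix_accuracy(toy)
```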

3.3.2. Eigenvalue Analysis

Eigenvalue analysis is an important part of the feature engineering operation in the LightGBM model. Here the "eigenvalues" are the model's feature values, that is, the independent variables, rather than the eigenvalues of a matrix. The analysis reflects how much each independent variable contributes to classifying the samples: the weight calculated for each variable gives its importance to the classification results, which is called the importance of the independent variable [32]. In the LightGBM model, the independent variables are the nine nutrient factors in the nutrient status evaluation. The importance ranking is shown in Figure 5; the larger the value, the higher the importance.
In Figure 5, the eigenvalue score of available K is more than 1000, which is much higher than those of the other eight nutrient factors, indicating that it has the highest importance and the greatest contribution to the evaluation and grading of the nutrient status of tobacco planting soil. The contribution from high to low is available K, available P, organic matter, total N, available B, pH value, available Zn, exchangeable Ca and exchangeable Mg. This indicates that the nutrient factors differ in their weight in the evaluation of the nutrient status of tobacco planting soil, which makes it easier for the model to identify and classify samples. The top six features, available K, available P, organic matter, total N, available B and pH value, are also important nutrient factors in the comprehensive evaluation of the nutrient status of tobacco planting soil. For example, when Guo [37] studied the nutrient status of tobacco planting soil in the Erhai Lake Basin, the nutrient factors selected were available K, available P, organic matter, total N and pH value. Mu [38] showed that available B is one of the trace elements necessary for crop growth and that its abundance affects the development and quality of tobacco. Therefore, the ranking produced by the LightGBM model is consistent with previous research results.
In addition, the contribution of available Zn, exchangeable Ca and exchangeable Mg is low, with feature scores below 200. The quality of soil nutrient status is a comprehensive reflection of various nutrient factors in the soil. Because the LightGBM model takes into account the coexistence and mutual influence of the nine soil nutrient factors, it is reasonable and can obtain more accurate and objective evaluation results of nutrient status.
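The text does not state which importance measure underlies the scores in Figure 5. One common measure for gradient-boosted trees, and LightGBM's documented default (`importance_type='split'`), counts how often each feature is used as a split across all trees. The sketch below illustrates that counting on two invented trees; the tree structure, the leaf placeholder and the function names are hypothetical.

```python
from collections import Counter


def count_splits(node, counts):
    """Walk one tree and count how often each feature is used as a split."""
    if "feature" not in node:  # leaf node
        return
    counts[node["feature"]] += 1
    count_splits(node["left"], counts)
    count_splits(node["right"], counts)


def split_importance(trees):
    """Rank features by total split count over all trees, highest first."""
    counts = Counter()
    for tree in trees:
        count_splits(tree, counts)
    return counts.most_common()


leaf = {"grade": "IV"}  # placeholder leaf
# Two hypothetical trees; the structure is invented for illustration.
trees = [
    {"feature": "available_K",
     "left": {"feature": "available_P", "left": leaf, "right": leaf},
     "right": {"feature": "available_K", "left": leaf, "right": leaf}},
    {"feature": "available_K",
     "left": leaf,
     "right": {"feature": "organic_matter", "left": leaf, "right": leaf}},
]

ranking = split_importance(trees)
```

Under this measure a score above 1000 would mean the feature was chosen for more than 1000 splits, although whether Figure 5 uses split counts or gain is an open question here.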

3.3.3. Nutrient Status Evaluation

A total of 290 tobacco planting soil samples from Debao County, each with nine nutrient factors, were taken as the test sample set for automatic classification of nutrient status by the LightGBM model. The classification levels were the same as in model training, with five grades (Table 5). As shown in Table 5, Grade I indicates that the nutrient status of tobacco planting soil is at a very high level, accounting for 10%. Grades II and III indicate that the nutrient status is high and relatively high, accounting for 7.24% and 17.59%, respectively. Grade IV indicates that the nutrient status of tobacco planting soil is at the medium level, accounting for 46.2%, which is the largest share in the county. Grade V indicates that the nutrient status of tobacco planting soil is at the general level, accounting for 18.97%.
Based on the nutrient status evaluation results of the LightGBM model, the distribution of soil nutrient status in the seven tobacco planting towns in Debao County was obtained using the kriging analysis method (Figure 6). As shown in Table 5 and Figure 6, among the seven tobacco planting towns, soil at an extremely high nutrient level occurs in Yantong Town and Jingde Town, accounting for 0.69% and 9.31%, respectively. The nutrient status of Yantong Town, Du’an Town, Dongling Town and Jingde Town is at a medium level or above, while that of Longguang Town, Najia Town and Zurong Town is at a medium or low level. In general, the soil nutrient status of the seven tobacco planting towns in Debao County is mostly at the medium level or above, accounting for 81.03%, and the low level of nutrient status accounts for 18.97%.
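The shares quoted in this subsection can be cross-checked with a few lines of arithmetic. The sketch below uses only percentages stated in the text and `Decimal` to avoid floating-point rounding; the variable names are introduced here for illustration.

```python
from decimal import Decimal

# Grade shares (%) quoted in the text for the 290 test samples.
grade_share = {
    "I": Decimal("10"),
    "II": Decimal("7.24"),
    "III": Decimal("17.59"),
    "IV": Decimal("46.2"),
    "V": Decimal("18.97"),
}

total = sum(grade_share.values())
medium_or_above = total - grade_share["V"]  # grades I to IV

# Grade I split by town, as quoted: Yantong 0.69 %, Jingde 9.31 %.
grade_i_by_town = Decimal("0.69") + Decimal("9.31")
```

The five grades sum to 100%, grades I to IV give the 81.03% reported as medium level or above, and the two town shares add up to the 10% reported for Grade I.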

4. Discussion

Soil pH directly affects the growth of plants, and weakly acidic soil is more suitable for the growth of tobacco [39,40]. The pH of tobacco planting soil in the seven towns of Debao County is generally at an appropriate level, which is conducive to the development of tobacco agriculture. The contents of soil organic matter, total N, available P, exchangeable Ca and exchangeable Mg are moderate. The average content of soil available K (270.5 mg/kg) is almost five times the national average for tobacco planting soil (57.5 mg/kg). As tobacco is a typical K-loving crop, K has an important impact on the growth, development and quality of flue-cured tobacco [41]. Compared with tobacco planting soil nationally, the high soil K in the tobacco area of Debao County is more conducive to producing high-quality tobacco. The available Zn content of the tobacco planting soil in the study area is at a very high level, and the available B content is slightly insufficient. The application of B trace element fertilizer can be slightly increased to raise the available B content of the local tobacco planting soil.
The nutrient status of tobacco planting soil was evaluated with the LightGBM model. The accuracy of the model is 94.2%, so its classification accuracy is high and its reliability is strong. Available K has the largest feature value, the largest contribution to the evaluation of soil nutrient status and the highest importance in the tobacco planting area. This also corresponds to the high content of available K in the tobacco planting soil in the study area. In the evaluation of nutrient status, Yantong Town, Du’an Town, Dongling Town and Jingde Town have good nutrient status, and the county is generally at or above the medium level, accounting for 81.03%, which indicates good tobacco planting value and is conducive to the development of the tobacco industry. Some soil nutrients are slightly insufficient. In the future, soil nutrients can be improved by formulating reasonable and scientific fertilization measures to provide more fertile soil conditions for tobacco crops.
By applying an ensemble algorithm to the evaluation of soil nutrient status, a multi-class nonlinear mapping between nutrient factors and nutrient status can be established. Regular and clear evaluation indexes improve the classification of training samples and make the final prediction more scientific. In this study, LightGBM, as an upgraded and improved version of XGBoost, obtained more accurate classification and recognition ability through full training. Automatic data processing can make the evaluation results of the nutrient status of tobacco planting soil in the study area more objective and accurate. In addition, due to its reliability and flexibility, the LightGBM model can be widely used in the evaluation of various types of soil nutrient status and fertility, providing more innovative research methods for the development of new agriculture in the 21st century. At the same time, the LightGBM model also provides a new method for research on other issues outside the agricultural field.

5. Conclusions

The tobacco planting soil in Debao County is weakly acidic. The contents of soil organic matter, total N, available P, exchangeable Ca and exchangeable Mg are at the appropriate level, and available K and Zn are at a high level, but available B is slightly insufficient. Therefore, the content of soil available B can be adjusted by artificial means such as fertilization.
Available K contributes the most to the evaluation of the nutrient status of tobacco planting soil in Debao County. The nutrient status of tobacco planting soil is at a good level in Yantong Town, Du’an Town, Dongling Town and Jingde Town, while Longguang Town, Najia Town and Zurong Town are at a medium or low level. In general, the county as a whole has reached a medium level or above, which is conducive to the planting of tobacco crops.
The application of the LightGBM model to the evaluation of soil nutrient status is accurate, reliable and objective. Due to the good stability and wide adaptability of the model, the LightGBM can be widely used to solve other problems in agriculture and geosciences so that researchers can obtain more accurate and realistic results.

Author Contributions

Z.L.: Writing, editing and figure preparation; T.Z., M.Z. and J.G.: Table preparation; W.S.: Funding acquisition, review and supervision; J.Z., D.F. and Y.L.: Review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Science and Technology Program of Guangzhou, grant numbers 202002030184 and 201804010190, the National Key Research and Development Program of China, grant number 2022YFF0801201 and the National Natural Science Foundation of China, grant numbers 42072229 and U1911202.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Kibblewhite, M.G.; Ritz, K.; Swift, M.J. Soil health in agricultural systems. Philos. Trans. R. Soc. B. 2008, 363, 685–701.
  2. Lu, M.Z.; Yang, M.Y.; Yang, Y.R.; Wang, D.; Sheng, L.X. Soil carbon and nutrient sequestration linking to soil aggregate in a temperate fen in Northeast China. Ecol. Indic. 2019, 98, 869–878.
  3. Yageta, Y.; Osbahr, H.; Morimoto, Y.; Clark, J. Comparing farmers’ qualitative evaluation of soil fertility with quantitative soil fertility indicators in Kitui County, Kenya. Geoderma 2019, 344, 153–163.
  4. Meng, B.; Zhou, Y.F.; Yang, L.S.; Peng, G.Z.; Li, J.Q.; Deng, Y. Nutrient Spatial Distribution and Fertility Evaluation of Sugarcane Soils in Menghai County. Soils 2022, 54, 277–284. (In Chinese)
  5. Shen, J.G.; Wang, Z.; Li, D.; Fei, J.H.; Lou, L.; Ma, W.H. The Quality Evaluation of Newly Reclaimed Red Soils in Yuhang District, Hangzhou. Chin. J. Soil Sci. 2018, 49, 55–60. (In Chinese)
  6. Chen, J.; Qu, M.K.; Zhang, J.L.; Xie, E.Z.; Huang, B.; Zhao, Y.C. Soil fertility quality assessment based on geographically weighted principal component analysis (GWPCA) in large-scale areas. Catena 2021, 201, 105197.
  7. Huang, J.; Jiang, D.H.; Deng, Y.S.; Ding, S.W.; Cai, C.F.; Huang, Z.G. Soil Physicochemical Properties and Fertility Evolution of Permanent Gully during Ecological Restoration in Granite Hilly Region of South China. Forests 2021, 12, 510.
  8. Dai, S.X.; Ren, W.J.; Teng, Y.; Chen, M.; Ma, W.T. Basic Physico-chemical Properties and Fertility Comprehensive Evaluation of Main Paddy Soils in Anhui Province. Soils 2018, 50, 66–72. (In Chinese)
  9. Yang, W.X.; Ren, J.X.; Li, Z.Y.; Xu, Y.; Li, Z.L.; He, B.H. Soil Fertility in Karst Regions with Analysis of Principal Component and Fuzzy Synthetic Evaluation. Southwest China J. Agric. Sci. 2019, 32, 1307–1313. (In Chinese)
  10. Hengl, T.; Leenaars, J.G.B.; Shepherd, K.D.; Walsh, M.G.; Heuvelink, G.B.M.; Mamo, T.; Tilahun, H.; Berkhout, E.; Cooper, M.; Fegraus, E.; et al. Soil nutrient maps of Sub-Saharan Africa: Assessment of soil nutrient content at 250 m spatial resolution using machine learning. Nutr. Cycl. Agroecosyst. 2017, 109, 77–102.
  11. Wu, C.; Fang, C.; Wu, X.; Zhu, G. Health-risk Assessment of Arsenic and Groundwater Quality Classification Using Random Forest in the Yanchi Region of Northwest China. Expo. Health 2020, 12, 761–774.
  12. Ma, M.; Zhao, G.; He, B.; Li, Q.; Dong, H.; Wang, S.; Wang, Z. XGBoost-based method for flash flood risk assessment. J. Hydrol. 2021, 598, 126382.
  13. Tian, Y.C.; Xu, M.D. Application of Random forest Pattern Recognition in Soil Fertility Assessment. J. North Univ. China (Nat. Sci. Ed.) 2019, 40, 464–469. (In Chinese)
  14. Tong, J.P.; Zhang, H.Y.; Liu, H.; Huang, J.; Hao, Y. XGBoost model-based risk assessment and influencing factors analysis of waterlogging in core cities of Yangtze River Delta. Water Resour. Hydropower Eng. 2021, 52, 1–11. (In Chinese)
  15. Yan, J.; Xu, Y.T.; Cheng, Q.; Jiang, S.Q.; Wang, Q.; Xiao, Y.J.; Ma, C.; Yan, J.B.; Wang, X.F. LightGBM: Accelerated genomically designed crop breeding through ensemble learning. Genome Biol. 2021, 22, 271.
  16. Shehadeh, A.; Alshboul, O.; Mamlook, R.E.A.; Hamedat, O. Machine learning models for predicting the residual value of heavy construction equipment: An evaluation of modified decision tree, LightGBM, and XGBoost regression. Autom. Constr. 2021, 129, 103827.
  17. Li, Z.L.; Lu, Y.C.; Zhao, L.F.; Fan, D.S.; Wei, Z.; Zhou, W.L.; Huang, L.G.; Huang, Y.; Huang, J.P.; Gu, X.Q.; et al. Evaluation on Tobacco-planting Soil Fertility in Longlin County of Guangxi. Chin. J. Soil Sci. 2020, 51, 1042–1048. (In Chinese)
  18. Gao, H.J.; Wei, Z.; Luo, G.; Lin, B.S. Status and Conservation and Remediation Technology of Tobacco-growing Soils in Baise City. Crop. Res. 2016, 30, 736–740. (In Chinese)
  19. Li, L.Y.; Su, B.L. Present situation and Countermeasures of silkworm production and development in Debao County, Guangxi. South China Agric. 2018, 12, 116–117. (In Chinese)
  20. Li, Q.X. Analysis on climatic conditions of flue-cured tobacco planting in Nandan county. Mid-Low Latit. Mt. Meteorol. 2006, S1, 29–31. (In Chinese)
  21. NY/T 1118-2006; Technical Specification for Soil Testing and Formulated Fertilization. The Ministry of Agriculture of the People’s Republic of China: Beijing, China, 2006. (In Chinese)
  22. Chen, J.H.; Liu, J.L.; Li, Z.H. Soil and Nutrient Status of Tobacco Growing in China. In Integrated Management of Tobacco Planting Soil and Tobacco Nutrients in China; Science Press: Beijing, China, 2008; pp. 39–55. (In Chinese)
  23. NY/T 1377-2007; Determination of pH in Soil. The Ministry of Agriculture of the People’s Republic of China: Beijing, China, 2007. (In Chinese)
  24. NY/T 1121.6-2006; Soil Testing Part 6: Method for Determination of Soil Organic Matter. The Ministry of Agriculture of the People’s Republic of China: Beijing, China, 2006. (In Chinese)
  25. LY/T 1228-2015; Nitrogen Determination Methods of Forest Soils. The State Forestry Administration of the People’s Republic of China: Beijing, China, 2015. (In Chinese)
  26. NY/T 1848-2010; Method for Determination of Ammonium Nitrogen, Available Phosphorus and Rapidly-Available Potassium in Neutrality or Calcareous Soil. The Ministry of Agriculture of the People’s Republic of China: Beijing, China, 2010. (In Chinese)
  27. LY/T 1245-1999; Determination of Exchangeable Calcium and Magnesium in Forest Soil. The State Forestry Administration of the People’s Republic of China: Beijing, China, 1999. (In Chinese)
  28. NY/T 1121.8-2006; Soil Testing Part8: Method for Determination of Soil Available Boron. The Ministry of Agriculture of the People’s Republic of China: Beijing, China, 2006. (In Chinese)
  29. NY/T 890-2004; Determination of Available Zinc, Manganese, iron, Copper in Soil-Extraction with Buffered DTPA Solution. The Ministry of Agriculture of the People’s Republic of China: Beijing, China, 2004. (In Chinese)
  30. Tang, Y.F. Research on Loan Default Prediction Model Based on XGBoost Algorithm and LightGBM Algorithm. Mod. Comput. 2021, 27, 33–37. (In Chinese) [Google Scholar]
  31. Liang, W.Z.; Luo, S.Z.; Zhao, G.Y.; Wu, H. Predicting hard rock pillar stability using GBDT, XGBoost, and LightGBM algorithms. Mathematics 2020, 8, 765. [Google Scholar] [CrossRef]
  32. Liu, X.W.; Huang, W.B.; Jiang, Y.S.; Guo, R.X.; Huang, Y.X.; Song, Q.; Yang, Y. Study of the Classified Identification of the Strong Convective Weathers Based on the LightGBM Algorithm. Plateau Meteorol. 2021, 40, 909–918. (In Chinese) [Google Scholar]
  33. Song, Q.F.; Niu, S.Z.; Chen, Z.G.; Yin, J.; Zhou, S.J.; Cen, C.J. Evaluation of nutrient status in site soil of ancient tea trees in Huaxi on principal component analysis. Acta Agric. Zhejiangensis 2017, 29, 1844–1853. (In Chinese) [Google Scholar]
  34. Hu, H.Z.; Wang, H.J.; Liu, B.F.; Yao, Z.D.; Liu, X.M.; Jiang, C.G.; Niu, Z.X.; Qu, H.F.; Fang, T. Comprehensive Evaluation of Soil Fertility in Panxian Tobacco-Growing Areas of Guizhou Province. Chin. Agric. Sci. Bull. 2012, 28, 109–116. (In Chinese) [Google Scholar]
  35. Song, D.P.; Li, H.; Liu, S.J.; Zou, G.Y.; Liu, D.S. A geostatistic investigation of the comprehensive evaluation of fertility and spatial heterogeneity of forest soil nutrients in hilly and mountainous regions of southern China. Arab J. Geosci. 2019, 12, 292. [Google Scholar] [CrossRef]
  36. Caelen, O. A Bayesian interpretation of the confusion matrix. Ann. Math. Artif. Intel. 2017, 81, 429–450. [Google Scholar] [CrossRef]
  37. Guo, Y.X.; Chen, Y.L.; Miao, Q.; Fan, Z.Y.; Sun, J.W.; Cui, Z.L.; Li, J.Y. Spatial-Temporal Variability of Soil Nutrients and Assessment of Soil Fertility in Erhai Lake Basin. Sci. Agric. Sin. 2022, 55, 1987–1999. (In Chinese) [Google Scholar]
  38. Mu, T.; Lu, X.Q.; Xu, Z.C.; Xie, Y.; Fang, X.; Zhang, S. The relationship between the contents of available boron and available molybdenum in soil with the contents of boron and molybdenum of tobacco leaf in Luoping. Soil Fertil. Sci. China 2017, 44–50. (In Chinese) [Google Scholar]
  39. Yin, Y.Q.; Wei, Z.Y.; He, M.X.; Chen, D.K.; Kong, F. Analysis of soil nutrient status in tobacco planting areas of Nandan County, Guangxi. J. South. Agric. 2010, 41, 147–152. (In Chinese) [Google Scholar]
  40. Jiang, C.Q.; Dong, J.J.; Xu, J.N.; Shen, J.; Xue, B.Y.; Zu, C.L. Effects of Soil Amendment on Soil pH, Plant Growth and Heavy Metal Accumulation of Flue-Cured Tobacco in Acid Soil. Soils 2015, 47, 171–176. (In Chinese) [Google Scholar]
  41. Zhang, C.S.; Kong, F.Y. Isolation and identification of potassium-solubilizing bacteria from tobacco rhizospheric soil and their effect on tobacco plants. Appl. Soil Ecol. 2014, 82, 18–25. [Google Scholar] [CrossRef]
Figure 1. Spatial distribution of tobacco planting soil sampling location in Debao County.
Figure 2. The LightGBM modeling process framework.
Figure 3. Distribution of exchangeable Ca, exchangeable Mg, available B and available Zn in tobacco growing soil.
Figure 4. Prediction results of the LightGBM model confusion matrix.
Figure 5. Eigenvalue score of LightGBM model.
Figure 6. Spatial distribution of soil fertility.
Table 1. Determination indexes and methods of soil nutrients.

| Measurement Index | Determination Method |
|---|---|
| pH value | pH meter electrode method [23] |
| Organic matter | Potassium dichromate titration [24] |
| Total N | Kjeldahl method [25] |
| Available K, P | Combined extraction colorimetric method [26] |
| Exchangeable Ca, Mg | Ammonium acetate exchange atomic absorption spectrophotometry [27] |
| Available B | Boiling water extraction azomethine-H colorimetric method [28] |
| Available Zn | DTPA extraction atomic absorption spectrophotometry [29] |
Table 2. Evaluation criteria of abundance and deficiency for nutrients in tobacco planting soil.

| Evaluation Index | Very Low | Low | Moderate | High | Very High |
|---|---|---|---|---|---|
| pH value | <4.5 | 4.5–5.5 | 5.5–7.0 | 7.0–7.5 | >7.5 |
| Organic matter (g/kg) | <10 | 10–20 | 20–30 | 30–40 | >40 |
| Total N (mg/kg) | <500 | 500–1000 | 1000–2000 | 2000–2500 | >2500 |
| Available K (mg/kg) | <80 | 80–150 | 150–220 | 220–350 | >350 |
| Available P (mg/kg) | <10 | 10–15 | 15–30 | 30–40 | >40 |
| Available B (mg/kg) | <0.3 | 0.3–0.5 | 0.5–1.0 | 1.0–3.0 | >3.0 |
| Available Zn (mg/kg) | <0.3 | 0.3–0.5 | 0.5–1.0 | 1.0–3.0 | >3.0 |
| Exchangeable Ca (cmol/kg) | <2 | 2–4 | 4–6 | 6–10 | >10 |
| Exchangeable Mg (cmol/kg) | <0.4 | 0.4–0.8 | 0.8–1.6 | 1.6–3.2 | >3.2 |
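As an illustration of how the single-index criteria in Table 2 can be applied, the following minimal Python sketch maps one measured value to its grade. It is not the LightGBM model of this study; it only looks up the Table 2 ranges. The dictionary keys and the function name `grade` are hypothetical names chosen for the sketch, and because Table 2 does not state which grade a value lying exactly on a boundary belongs to, the sketch assumes it falls into the higher grade.

```python
from bisect import bisect_right

GRADES = ("Very Low", "Low", "Moderate", "High", "Very High")

# Cut points between adjacent grades, transcribed from Table 2.
# The dictionary keys are hypothetical names chosen for this sketch.
CUT_POINTS = {
    "pH": (4.5, 5.5, 7.0, 7.5),
    "organic_matter_g_kg": (10, 20, 30, 40),
    "total_N_mg_kg": (500, 1000, 2000, 2500),
    "available_K_mg_kg": (80, 150, 220, 350),
    "available_P_mg_kg": (10, 15, 30, 40),
    "available_B_mg_kg": (0.3, 0.5, 1.0, 3.0),
    "available_Zn_mg_kg": (0.3, 0.5, 1.0, 3.0),
    "exchangeable_Ca_cmol_kg": (2, 4, 6, 10),
    "exchangeable_Mg_cmol_kg": (0.4, 0.8, 1.6, 3.2),
}


def grade(index: str, value: float) -> str:
    """Return the Table 2 grade of a single measured value.

    Assumption (not stated in Table 2): a value equal to a cut point
    is assigned to the higher of the two adjacent grades.
    """
    # bisect_right counts how many cut points are <= value,
    # which is exactly the position of the grade in GRADES.
    return GRADES[bisect_right(CUT_POINTS[index], value)]


if __name__ == "__main__":
    # County-wide averages taken from Table 3.
    print(grade("available_K_mg_kg", 270.5))
    print(grade("pH", 6.9))
```

With the county-wide averages of Table 3, the lookup places available K (270.5 mg/kg) in the High grade and pH (6.9) in the Moderate grade.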
Table 3. Soil pH value and organic matter, total N, available K and available P contents in seven tobacco planting towns of Debao County.

| Nutrient Factors | Town | Range Value | Average | Standard Deviation | Variation Coefficient (%) |
|---|---|---|---|---|---|
| pH value | Yantong | 6.4–7.2 | 6.8 | 0.19 | 2.84 |
| | Longguang | 6.3–7.1 | 6.8 | 0.22 | 3.32 |
| | Najia | 6.6–7.4 | 7.2 | 0.21 | 2.99 |
| | Zurong | 6.6–7.5 | 7.2 | 0.22 | 3.03 |
| | Du’an | 7.4–7.8 | 7.6 | 0.10 | 1.27 |
| | Dongling | 7.0–7.6 | 7.4 | 0.19 | 2.55 |
| | Jingde | 6.1–7.3 | 6.7 | 0.26 | 3.87 |
| | Total | 6.1–7.8 | 6.9 | 0.35 | 4.99 |
| Organic matter (g/kg) | Yantong | 12.8–57.0 | 25.2 | 8.14 | 32.34 |
| | Longguang | 23.0–39.4 | 28.8 | 5.64 | 19.58 |
| | Najia | 29.2–49.3 | 38.0 | 4.492 | 11.83 |
| | Zurong | 20.6–48.8 | 33.5 | 7.72 | 23.06 |
| | Du’an | 26.9–41.7 | 32.3 | 4.12 | 12.78 |
| | Dongling | 14.2–44.1 | 24.0 | 9.32 | 38.89 |
| | Jingde | 12.1–39.0 | 25.1 | 5.0 | 19.97 |
| | Total | 12.1–49.3 | 27.5 | 7.37 | 24.99 |
| Total nitrogen (mg/kg) | Yantong | 50–2020 | 930 | 460 | 49.03 |
| | Longguang | 530–2560 | 1420 | 519 | 36.64 |
| | Najia | 990–3380 | 1730 | 671 | 38.68 |
| | Zurong | 610–2620 | 1680 | 500 | 30.28 |
| | Du’an | 760–1520 | 1170 | 177 | 15.20 |
| | Dongling | 190–1860 | 840 | 532 | 63.65 |
| | Jingde | 100–3210 | 970 | 356 | 36.51 |
| | Total | 50–3380 | 1120 | 516 | 41.31 |
| Available K (mg/kg) | Yantong | 37.3–472.5 | 231.1 | 96.38 | 41.71 |
| | Longguang | 48.1–298.2 | 171.2 | 78.09 | 45.61 |
| | Najia | 91.8–298.2 | 183.4 | 53.32 | 29.07 |
| | Zurong | 57.8–298.0 | 206.7 | 63.29 | 30.61 |
| | Du’an | 222.3–364.8 | 299.0 | 47.32 | 15.82 |
| | Dongling | 206.8–379.7 | 287.0 | 60.20 | 20.98 |
| | Jingde | 49.4–504.8 | 308.8 | 81.89 | 26.86 |
| | Total | 37.3–504.8 | 270.5 | 93.60 | 38.84 |
| Available P (mg/kg) | Yantong | 3.8–82.3 | 31.1 | 20.21 | 64.96 |
| | Longguang | 4.6–44.1 | 22.5 | 11.52 | 51.28 |
| | Najia | 6.6–60.9 | 25.8 | 17.35 | 67.38 |
| | Zurong | 9.8–62.5 | 26.1 | 12.38 | 47.46 |
| | Du’an | 14.8–41.9 | 25.1 | 7.75 | 30.89 |
| | Dongling | 9.8–32.1 | 19.6 | 7.18 | 36.68 |
| | Jingde | 4.2–60.0 | 19.4 | 10.88 | 56.01 |
| | Total | 3.8–62.5 | 23.0 | 14.03 | 57.97 |
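The variation coefficient reported in Table 3 is conventionally the standard deviation expressed as a percentage of the mean. The short Python sketch below assumes that standard definition (the table itself does not state the formula) and uses a hypothetical function name. Recomputing from the rounded averages and standard deviations printed in the table can differ slightly from the printed coefficients, which were presumably derived from unrounded data.

```python
def variation_coefficient(mean: float, std_dev: float) -> float:
    """Return the coefficient of variation in percent (100 * SD / mean).

    Assumes the standard definition; Table 3 does not give the formula.
    """
    if mean == 0:
        raise ValueError("the coefficient of variation is undefined for a zero mean")
    return 100.0 * std_dev / mean


if __name__ == "__main__":
    # Available K in Yantong, values as printed in Table 3.
    cv = variation_coefficient(mean=231.1, std_dev=96.38)
    print(f"{cv:.2f} %")
```

For available K in Yantong this gives about 41.7%, in line with the 41.71% listed in the table.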
Table 4. Correlation analysis of tobacco planting soil.

| Nutrient Factors | pH value | Organic matter | Total N | Available K | Available P | Exchangeable Ca | Exchangeable Mg | Available B | Available Zn |
|---|---|---|---|---|---|---|---|---|---|
| pH value | 1 | | | | | | | | |
| Organic matter | 0.341 ** | 1 | | | | | | | |
| Total N | 0.323 ** | 0.588 ** | 1 | | | | | | |
| Available K | −0.107 | −0.133 * | −0.182 ** | 1 | | | | | |
| Available P | 0.092 | 0.201 ** | 0.031 | 0.042 | 1 | | | | |
| Exchangeable Ca | 0.631 ** | 0.525 ** | 0.411 ** | −0.198 ** | 0.078 | 1 | | | |
| Exchangeable Mg | 0.479 ** | 0.387 ** | 0.316 ** | 0.052 | 0.099 | 0.563 ** | 1 | | |
| Available B | −0.356 ** | −0.005 | −0.002 | 0.031 ** | 0.075 | −0.330 ** | −0.082 | 1 | |
| Available Zn | 0.078 | 0.204 ** | 0.213 ** | −0.051 | 0.237 ** | 0.099 | 0.041 | 0.060 | 1 |
Note: * p < 0.05, ** p < 0.01.
Table 5. Evaluation of nutrient status based on the LightGBM model.

Proportion of each nutrient grade (%):

| Town | I Extremely High | II High | III Relatively High | IV Medium | V Low |
|---|---|---|---|---|---|
| Yantong | 0.69 | 1.04 | 4.14 | 3.79 | 6.9 |
| Longguang | 0 | 0 | 0 | 1.72 | 2.41 |
| Najia | 0 | 0 | 0 | 3.45 | 2.76 |
| Zurong | 0 | 0 | 0 | 8.28 | 3.79 |
| Du’an | 0 | 1.03 | 1.72 | 3.1 | 0 |
| Dongling | 0 | 0.34 | 1.04 | 1.72 | 0 |
| Jingde | 9.31 | 4.83 | 10.69 | 24.14 | 3.11 |
| Total | 10 | 7.24 | 17.59 | 46.2 | 18.97 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Liang, Z.; Zou, T.; Gong, J.; Zhou, M.; Shen, W.; Zhang, J.; Fan, D.; Lu, Y. Evaluation of Soil Nutrient Status Based on LightGBM Model: An Example of Tobacco Planting Soil in Debao County, Guangxi. Appl. Sci. 2022, 12, 12354. https://doi.org/10.3390/app122312354

AMA Style

Liang Z, Zou T, Gong J, Zhou M, Shen W, Zhang J, Fan D, Lu Y. Evaluation of Soil Nutrient Status Based on LightGBM Model: An Example of Tobacco Planting Soil in Debao County, Guangxi. Applied Sciences. 2022; 12(23):12354. https://doi.org/10.3390/app122312354

Chicago/Turabian Style

Liang, Zhipeng, Tianxiang Zou, Jialin Gong, Meng Zhou, Wenjie Shen, Jietang Zhang, Dongsheng Fan, and Yanhui Lu. 2022. "Evaluation of Soil Nutrient Status Based on LightGBM Model: An Example of Tobacco Planting Soil in Debao County, Guangxi" Applied Sciences 12, no. 23: 12354. https://doi.org/10.3390/app122312354
