Advances and Developments in Monitoring and Inversion of the Biochemical Information of Crop Nutrients Based on Hyperspectral Technology

Crop nutrient biochemical information (mainly chlorophyll-type pigments and nutrient elements, chiefly nitrogen, phosphorus and potassium) is an important basis for revealing crop growth and development patterns and their relationship with the environment. Hyperspectral technology has been rapidly developed and applied in research on monitoring crop nutrient biochemical information. This paper first describes the theoretical basis of hyperspectral technology for monitoring crop nutrient biochemical information. Then, the research progress of hyperspectral technology in monitoring the nutrient biochemical information of crops in different growth periods or different growth environments is outlined, and the shortcomings of the current technology in these research directions and future research trends are discussed. Finally, the modeling methods for building crop nutrient biochemical information monitoring models from hyperspectral data are systematically outlined, and the effects of different spectral pre-processing methods, effective spectral information extraction methods and modeling algorithms on the accuracy of monitoring models are analyzed. On this basis, the challenges and prospects of hyperspectral technology in monitoring crop nutrient biochemical information are presented, aiming to provide a theoretical basis and technical reference for research on the monitoring and inversion of crop physiological parameters based on hyperspectral technology.


Introduction
The nutrient biochemical information of a crop mainly includes pigments (chlorophyll, carotenoids, anthocyanins, etc.) and nutrients, chiefly the elements nitrogen, phosphorus and potassium [1]. Nitrogen, phosphorus and potassium are the three essential nutrients for plant growth and development, and each performs specific physiological functions in the crop. Nitrogen content is closely related to the photosynthetic efficiency and intensity of plants; phosphorus improves the adaptability of crops to the external environment by regulating their metabolic processes; potassium is the most abundant cation in plant cells and regulates water metabolism, activates enzymes, improves resistance and promotes photosynthesis; chlorophyll and other pigments are essential to crop growth [2] and are important indicators of photosynthetic efficiency, growth and nutrient status, and pest and disease stress. When crops lack certain nutrients or have abnormal pigment content, their external morphological structure and internal physiological functions are affected, resulting in incomplete growth and development and thus reducing both the quality and the yield of crops. Therefore, rapid and quantitative monitoring of the nutrient biochemical information of crops can identify critical growth periods in real time, support the analysis of crop growth information and provide a data basis for precise fertilization and for pest and disease monitoring of farm crops.
To assess the nutritional status of crops and determine their pigment or nutrient content, traditional methods apply chemical analysis to seedling or leaf samples after destructive treatment, including the Kjeldahl method for nitrogen [3], the vanadium-molybdenum yellow method for phosphorus [4], flame photometry for potassium [5] and ethanol leaching for chlorophyll [6]. Although chemical analysis determines crop elemental content with high accuracy, it is time-consuming, cumbersome and requires destructive treatment of crop samples, so it does not allow rapid and convenient detection for crops grown over large areas [7,8]. Spectral imaging technology is a nondestructive, fast and real-time technique [9], which makes it highly applicable for determining the biochemical information of crop nutrients and indirectly monitoring crop growth and stress. Compared with other spectral methods, hyperspectral technology records continuous wavebands and a large amount of data, which makes it possible to better establish the correlation between spectral data and the biochemical information of crop nutrients and thus obtain monitoring models with higher accuracy. In recent years, with the continuous progress of remote sensing and computer technology, hyperspectral technology has been applied in agriculture [10,11], environmental monitoring [12], mineral monitoring [13] and marine remote sensing [14]. In agriculture, hyperspectral-based monitoring and inversion models of crop nutrient biochemical information can quickly and accurately reflect the growth and nutritional status of crops. Therefore, applying composite techniques to analyze the hyperspectral profiles of crops and construct inversion models of crop nutrient biochemical information has become a hot research topic. This paper reviews the research results related to the monitoring of crop nutrient biochemical information based on hyperspectral technology.
First, the technical features of hyperspectral technology in this field are introduced, including the high spectral resolution and large number of bands of hyperspectral data and the ability to obtain crop spectral reflectance information. Then, research on using hyperspectral technology to construct models for monitoring the nutrient biochemical information of crops under different growth conditions is described. The effects of different pre-processing methods, feature band extraction methods, spectral indices and modeling methods on the accuracy of the resulting hyperspectral monitoring models are analyzed. Finally, this paper aims to provide a theoretical basis and technical reference for future research on the monitoring and inversion of crop physiological parameters based on hyperspectral technology.

Reflectance Spectral Properties of Plants
The absorption, reflection and transmission of electromagnetic waves of different wavelengths (or frequencies) vary from substance to substance, and this wavelength-dependent response is called the spectral characteristics of the substance. In addition to the influence of the morphological structure of the crop itself, the spectral characteristics of crop leaves are closely related to the crop's growth environment and its own growth conditions. Soil moisture content, soil nutrients and the degree of pests and diseases suffered by the crop affect its growth and development and change its biochemical composition, so the reflectance spectra of crops with different chlorophyll content, nutrient content, water content and other biochemical components show different patterns in different bands. Numerous studies have shown that the spectral reflectance characteristics of crops are closely related to their morphological structure, pigment content and nutrient content [15,16,17], so it is feasible to monitor the nutrient and biochemical information of crops using hyperspectral techniques. Figure 1 shows a schematic diagram of the reflectance spectral properties of plant leaves [18,19].

Hyperspectral Operation Platform of Biochemical Information Monitoring Model of Crop Nutrients
Hyperspectral data can be divided into imaging hyperspectral three-dimensional data and non-imaging hyperspectral two-dimensional data [20]. Hyperspectral data acquisition equipment is also diverse; common types include handheld hyperspectral acquisition equipment and airborne imaging hyperspectral cameras. Figure 2 shows the operation schematic of different hyperspectral acquisition equipment.

Compared to multispectral data, hyperspectral data have many more continuous bands and can use more spectral information, allowing for a greater variety of vegetation indices and modelling methods. In addition, spectral pre-processing methods are more robust to noise interference over continuous bands [21]. The biochemical information monitoring models of crop nutrients based on hyperspectral technology better reflect the correlation between spectral data and biochemical information of crop nutrients, so the monitoring models have better prediction accuracy [22]. Since the physiological characteristics and biochemical components of the crop determine its spectral response, the required experimental studies can be carried out by means of controlled variable methods. By using a UAV with a hyperspectral camera or another spectral data collection device to collect spectral information from the leaves or other tissues of the crop canopy, and then applying relevant machine learning algorithms to construct regression models, it is possible to monitor the nutrient and biochemical information of crops in different growth states and acquire real-time crop growth conditions.

Single Growth Stage
Crops have different levels of nutrient biochemical information and different nutrient demands at different growth stages. By using hyperspectral technology to collect spectral data of crops at a certain growth stage and then constructing a monitoring model using appropriate data analysis methods, we can quickly and accurately detect the nutrient biochemical information of crops during important growth periods, thus obtaining feedback on possible nutrient stresses during those periods and providing a reference for crop nutrient applications. Ryu et al. [23] used airborne hyperspectral remote sensing technology to collect spectral data and nitrogen content of rice during the heading stage for three consecutive years and established a hyperspectral prediction model for the nitrogen content of rice at the heading stage. Zhou et al. [24] used the FieldSpec field spectrometer produced by ASD to collect leaf spectral data of three different varieties of maize at the six-leaf stage.
The correlation between leaf nitrogen content and spectral reflectance of the three maize varieties (combinations) was studied by setting different levels of nitrogen supply, and the sensitive wavebands for estimating leaf nitrogen content in six-leaf maize were identified. An et al. [25] collected the spectral data of leaves of Red Fuji apple trees using the ASD FieldSpec 3 spectrometer (Analytical Spectral Devices, Inc., Boulder, CO, USA) and established a hyperspectral estimation model of leaf nitrogen content at the end of the fall growth period. Jiang et al. [26] used a PSR-350 portable hyperspectrometer (Yamaha, Shizuoka, Japan) to collect the spectral data of the winter wheat canopy during the flowering period. After analyzing the correlation between spectral reflectance and chlorophyll content, they selected the sensitive wavebands of winter wheat during the flowering period, built regression models between spectral indices and chlorophyll content, and obtained the best inversion model for the canopy chlorophyll content of winter wheat during the flowering period.
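The band-screening step that recurs in the studies above (correlating reflectance at each band with a measured biochemical value and keeping the most sensitive band) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the four wavelengths and all sample values are hypothetical and are not taken from any of the cited studies.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant band carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def most_sensitive_band(wavelengths, spectra, target):
    """Return (wavelength, r) for the band with the largest |r| against target.

    spectra is a list of samples; each sample lists reflectance per wavelength.
    """
    best_wavelength, best_r = None, 0.0
    for j, wavelength in enumerate(wavelengths):
        band = [sample[j] for sample in spectra]
        r = pearson(band, target)
        if abs(r) > abs(best_r):
            best_wavelength, best_r = wavelength, r
    return best_wavelength, best_r


# Hypothetical data: 4 leaf samples, reflectance at 4 wavelengths (nm).
wavelengths = [550, 670, 720, 800]
spectra = [
    [0.12, 0.08, 0.30, 0.45],
    [0.11, 0.06, 0.27, 0.47],
    [0.13, 0.05, 0.25, 0.44],
    [0.10, 0.03, 0.21, 0.46],
]
chlorophyll = [30.0, 36.0, 41.0, 50.0]  # made-up chlorophyll readings

band, r = most_sensitive_band(wavelengths, spectra, chlorophyll)
print(f"most sensitive band: {band} nm, r = {r:.3f}")
```

In practice the same loop would run over hundreds of bands and would usually be followed by a significance test, since a high correlation on a handful of samples can arise by chance.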
Furthermore, the nutrient level of fruit tree petals during the flowering stage reflects the plant's nutritional status and helps predict the flowering quantity and yield of fruit trees. The potassium content of fruit tree petals is a good indicator of their nutritional level, and rapid diagnosis and scientific regulation of potassium nutrition is an important aspect of quality and productivity management of fruit trees. Liu et al. [27] characterized the potassium nutrition of citrus flowers using hyperspectral technology. Using the flower organs of 'Caracalla red-fleshed navel orange' on Hovenia orange rootstock as test material, they established the effective information extraction method, characteristic spectra and optimal prediction model for estimating citrus flower potassium content, laying the foundation for the real-time detection of citrus flower potassium nutrition. Zhu et al. [7] measured the canopy hyperspectrum of apple blossom at the Qixia experimental site using the ASD FieldSpec 3 and correlated the canopy hyperspectral reflectance and its 11 transformed forms with laboratory-measured potassium content. They took the variables with the highest correlation coefficients as independent variables and established a potassium content estimation model using a fuzzy identification algorithm.

Multiple Growth Stages
In fact, the nutritional components of crops also change during their growth and development [28]. Taking citrus as an example, Figure 3 shows the correlation between the spectral data of citrus leaves at different stages and their chlorophyll content [29]. Hyperspectral monitoring models of nutrient biochemical information built for a single growth period do not predict the nutrient status of crops well throughout their entire growth cycle. Therefore, it is of great significance to detect the nutrient biochemical information of crops in multiple growth periods and to conduct cross-growth-period prediction research. To achieve this goal, researchers have used various hyperspectral techniques. Li et al. [30] collected hyperspectral data of winter wheat canopies at different growth stages using the ASD FieldSpec spectrometer and established a relationship between the hyperspectral data and leaf nitrogen content based on the N-PROSAIL model. The results showed that this method can estimate the nutritional status of winter wheat well. Li et al. [31] estimated the leaf nitrogen content of lychee in spring, summer, autumn and winter based on canopy spectral reflectance and obtained the best estimation model for leaf nitrogen content in each growth period through comparative analysis. Huang et al. [32] collected reflectance spectra of citrus during the fruit-picking period and the shoot growth stage and established a phosphorus content prediction model based on 234 samples by combining Partial Least Squares (PLS) and Support Vector Regression (SVR). Yang et al. [33] used the SVC HR-768 portable spectrometer to determine the spectral reflectance and total nitrogen content of pear leaves at the fruit-setting, fruit expansion and fruit ripening periods and then constructed leaf total nitrogen content estimation models for the different growth periods of Korla pear.
The estimation model for the fruit-setting period fitted better and predicted more accurately than the models for the other periods. Zhu et al. [34] selected super hybrid early rice as the research object and used the FieldSpec3 spectrometer produced by ASD to obtain 120 groups of hyperspectral data, chlorophyll content and leaf nitrogen content of rice leaves at the tillering, booting, full heading, filling and mature periods. They used Partial Least Squares Regression (PLSR), Random Forest (RF) and Support Vector Regression (SVR) to construct cross-period prediction models for the leaf nitrogen content and chlorophyll content of early rice leaves.
Currently, research on monitoring and predicting the biochemical information of crop nutrients using hyperspectral techniques is mainly focused on establishing a correlation between the spectral data of crops and their biochemical information during a single growth stage. However, such models are not very versatile and cannot be applied to the entire growth cycle of the crops. Additionally, constructing a model for monitoring the biochemical information of crop nutrients requires a large amount of hyperspectral data, and data collection is limited by the long time span and the difficulty of controlling environmental variables. As a result, there are few studies on monitoring models for the biochemical information content of crop nutrients across multiple growth periods or the entire growth period.
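One way to make the cross-period question concrete is leave-one-stage-out validation: fit a model on all growth stages but one and measure the error on the held-out stage. The sketch below is a simplified illustration, not the procedure of any cited study. It swaps in a one-variable least-squares line for the PLSR, RF and SVR models named above, and the stage names, index values and nitrogen values are all hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(observed, predicted):
    """Root mean square error between observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def leave_one_stage_out(samples):
    """samples: list of (stage, spectral_index, nitrogen). Returns {stage: RMSE}."""
    scores = {}
    for held_out in sorted({stage for stage, _, _ in samples}):
        train = [(x, y) for stage, x, y in samples if stage != held_out]
        test = [(x, y) for stage, x, y in samples if stage == held_out]
        slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
        predictions = [slope * x + intercept for x, _ in test]
        scores[held_out] = rmse([y for _, y in test], predictions)
    return scores


# Hypothetical data: the index-nitrogen relation shifts at maturity.
samples = [
    ("tillering", 0.2, 1.4), ("tillering", 0.4, 1.8),
    ("heading", 0.5, 2.0), ("heading", 0.7, 2.4),
    ("maturity", 0.3, 1.1), ("maturity", 0.6, 1.7),
]
scores = leave_one_stage_out(samples)
for stage, error in scores.items():
    print(f"held-out stage {stage}: RMSE = {error:.3f}")
```

In this toy data set the tillering and heading samples lie on one line while the maturity samples are offset from it, so the held-out error for maturity is clearly nonzero even though the within-stage trend is identical. This is the kind of transfer failure that motivates cross-period modeling.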

Monitoring the Biochemical Information of Crop Nutrients under Environmental Stress
Crops are often subjected to various environmental stresses, such as drought stress, salt stress, heavy metal stress and low or high temperature stress. Different environmental stresses have different effects on the biochemical information of crop nutrients, thereby altering the spectral reflectance of crops. Yuan et al. [35] found that from the jointing stage to the maturity stage, the chlorophyll content of summer maize leaves decreases as the degree of water stress increases. Peng et al. [36] discovered that different salt concentrations (0, 200 and 400 mmol/L NaCl) significantly reduce the chlorophyll content of alkali bulrush, but increase the values of chlorophyll a and chlorophyll b. Sun et al. [37] found through experiments that different water conditions in wheat fields have a significant impact on the growth and development of winter wheat, one important manifestation of which is a significant difference in plant nitrogen content. Guan et al. [38,39] found that with increasing cadmium mass fraction and longer stress time, the total chlorophyll mass fraction first increases and then decreases.
Therefore, it is crucial for agriculture to identify quickly and accurately the type of stress that crops are subjected to, in order to improve their growth environment in a targeted manner. Hyperspectral remote sensing technology can make full use of the spectral characteristics of objects to monitor the biochemical information of crop nutrients under different environmental stresses and to construct corresponding monitoring models, providing a reference for monitoring the degree of environmental stress on crops. Ma et al. [40] set up a field water and fertilizer experiment with irrigation and nitrogen gradients, used the ASD HH2 portable spectrometer to measure the spectral reflectance of the cotton canopy during the cotton growth period, and simultaneously measured the nitrogen content and equivalent water thickness of the canopy. NDSI (570, 500) was selected as the optimal spectral index for modeling, and a hyperspectral monitoring model of nitrogen content in the cotton canopy was constructed, with a prediction RRMSE of 0.18. Xie et al. [41] studied the effect of water stress on the hyperspectral reflectance of the winter wheat canopy and found that important information on the chlorophyll density of winter wheat after water stress was present at wavelengths of 427 nm, 434 nm, 749 nm and 814 nm. Features were extracted from the original spectrum through Correlation Analysis (CA), Partial Least Squares Regression (PLSR) and the Successive Projection Algorithm (SPA), and a hyperspectral estimation model of chlorophyll density was established. Yao et al. [42] constructed estimation models based on the red edge, sensitive bands and spectral indices, and analyzed the correlation between the spectral reflectance of winter wheat under high CO2 concentration and chlorophyll content. The chlorophyll content of winter wheat under high CO2 concentration could be better estimated based on sensitive spectral bands and the Difference Vegetation Index (DVI).
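The spectral indices and the error measure mentioned above can be written out explicitly. The sketch below assumes the commonly used forms NDSI(i, j) = (R_i - R_j) / (R_i + R_j), DVI = R_NIR - R_red and RRMSE = RMSE divided by the mean of the observed values. The source does not spell out these definitions, nor which bands the cited DVI used, so the band choices (800 nm and 670 nm) and all numeric values here are hypothetical.

```python
import math


def ndsi(r_i, r_j):
    """Normalized difference spectral index of two band reflectances (assumed form)."""
    return (r_i - r_j) / (r_i + r_j)


def dvi(r_nir, r_red):
    """Difference vegetation index: near-infrared minus red reflectance (assumed form)."""
    return r_nir - r_red


def rrmse(observed, predicted):
    """Relative RMSE: RMSE divided by the mean of the observed values (assumed form)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return rmse / (sum(observed) / n)


# Hypothetical canopy reflectance keyed by wavelength in nm.
spectrum = {500: 0.04, 570: 0.10, 670: 0.05, 800: 0.45}

index_value = ndsi(spectrum[570], spectrum[500])  # NDSI (570, 500)
dvi_value = dvi(spectrum[800], spectrum[670])     # band choice is an assumption

# Hypothetical observed and predicted nitrogen contents.
observed = [2.0, 3.0, 4.0]
predicted = [2.5, 2.5, 4.0]
error = rrmse(observed, predicted)

print(f"NDSI(570, 500) = {index_value:.4f}")
print(f"DVI = {dvi_value:.2f}")
print(f"RRMSE = {error:.4f}")
```

Because RRMSE is normalized by the mean observation, it allows models for different nutrients or units to be compared on one scale, which a raw RMSE does not.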
Currently, the spectral monitoring of environmental stress in crops relies on binary models, which only determine whether the crop is under stress without taking the degree of stress into account. To improve on this, a model can be developed to monitor the biochemical information of crop nutrients under stress, so that the degree of stress can be determined from the changes in spectral reflectance. Most research on monitoring environmental stresses in crops focuses on soil, temperature and other factors that affect crop survival, with less emphasis on direct monitoring of crop nutrient and biochemical information. In the future, the monitoring of environmental factors and biological parameters can be linked to improve the accuracy of identification and to suit actual agricultural production environments better. This will provide a more effective monitoring tool for crop production and help improve the efficiency and quality of agricultural production.

Monitoring the Biochemical Information of Crop Nutrients under Pest and Disease Stress
Because pathogen species, pest feeding habits and climatic conditions differ, and because crop pests interact with their host crops in various ways, host crops show different physiological responses and changes in the biochemical information of their nutrients, which manifest as damaged leaf structure, decreased pigment content and decreased nitrogen content. Xu et al. [43] found that after black pine and horsetail pine were naturally infested by pine wood nematodes, their chlorophyll content gradually decreased as the disease progressed. Jiang et al. [44] showed that the total nitrogen content gradually decreased with increasing stripe rust stress in winter wheat and was highly significantly correlated with the first-order differential spectra in the 430-518 nm, 534-608 nm, 660-762 nm and 783-893 nm regions. Tian et al. [45] found that the anthocyanin content of apple leaves increased with increasing severity of mosaic disease, that the spectral reflectance of the diseased area of leaves increased significantly throughout the visible region, and that the red edge shifted toward the blue. Chen et al. [46] found that after cotton was infected by the yellow wilt pathogen, the pathogen proliferated or induced a large amount of toxins in the plant, which resulted in blocked water transport within the plant, damaged internal leaf structure and corresponding changes in biochemical components (e.g., chlorophyll, water, etc.). The changes vary with onset time and degree of severity. Figure 4 shows the correlation coefficients between leaf spectra and chlorophyll content of bamboo leaves in moso bamboo with different degrees of insect infestation. In the future, this information can be used to establish more accurate crop pest monitoring models and provide more effective control measures for agricultural production.
After being attacked by fungi, bacteria or pests, crops undergo a series of changes in their nutrient biochemical information. On one hand, the pathogens or pests attack the plant tissues to acquire nutrients, leading to a decrease in nutrient content in the crops. On the other hand, the affected crops may adjust their nutrient allocation strategy by directing more nutrients to the invaded areas, enhancing their ability to resist pathogens or pests. However, this can result in insufficient nutrient supply to other parts or tissues of the crops, affecting their normal growth and development. When crops are under the stress of disease and pest attacks, their spectral characteristics usually show an increase in visible and shortwave infrared spectral reflectance, while the near-infrared spectral reflectance decreases [47]. This change differs significantly from the typical spectral characteristics of green vegetation. It is important to note that the extent of stress imposed by pests and diseases on crops can vary depending on factors such as crop species, growth stage and environmental conditions. Therefore, the spectral characteristics displayed by these changes may also differ.
For instance, when crops are affected by powdery mildew, significant changes occur in their nitrogen and chlorophyll content, which are reflected in the spectral information. During the early growth stage of the crops, there is an increase in hyperspectral reflectance in the visible and infrared regions, while a decrease in reflectance is observed in the visible region during the mature stage. When crops are affected by aphids or moths, significant changes occur in the pigment content of their leaves, which are reflected in the spectral information as an increase in reflectance in the visible region and a decrease in reflectance in the near-infrared region. These changes are more pronounced during the reproductive stage of the crops. Under specific conditions, hyperspectral technology can analyze the changes in the biochemical information of crop nutrients after disease and pest attacks, and even directly locate a specific characteristic band. By constructing a monitoring model for the biochemical information of crop nutrients under disease and pest stress, the relationship between hyperspectral data and crop nutrient biochemical information under such stress can be fitted. This allows for the indirect analysis of the types of diseases and pests affecting crops and the degree of stress they are under. Gao et al. [48] analyzed the hyperspectral characteristics of jujube leaves under different truncated leaf mite damage levels using hyperspectral and chlorophyll content data, and constructed linear regression models based on first-order differential spectra to estimate the chlorophyll content of jujube leaves at each damage level; the best fit was achieved at damage level 0, with R² = 0.810. Li et al. [49] applied hyperspectral imaging to detect the pigment content of citrus red spider-infested leaves and developed a model to predict leaf pigment content based on the best reflectance ratios in the characteristic wavebands, 667/522 nm and 667/647 nm. He et al. [50] obtained wavelet coefficients at different scales by applying a continuous wavelet transform to the canopy spectra of wheat infected with stripe rust; the selected wavelet coefficient features can be used as independent variables for building an inversion model to quantitatively estimate the chlorophyll content of the wheat canopy under stripe rust stress. Kong et al. [51] used the logarithm of the reciprocal of the original spectrum (log(1/R)) to construct a kernel ridge regression (KRR) model for predicting the phosphorus content of soybean leaves inoculated with arbuscular mycorrhizae.
Research on crop pests and diseases has focused more on using hyperspectral techniques to classify and identify affected crops than on monitoring crop nutrient biochemical information under pest and disease stress. However, prevention and early detection are the keys to crop pest control; therefore, modeling and real-time monitoring of crop nutrient biochemical information under pest and disease stress using hyperspectral techniques provide a new way for early detection of crop pests and diseases. Hyperspectral data and nutrient biochemical information measured under pest and disease stress can serve, on the one hand, as a basis for predicting possible pest and disease stress and, on the other hand, for clarifying the growth characteristics and damage level of the stressed crops. However, relevant research results are few and the accuracy of models for analyzing the degree of pest and disease stress is relatively low; therefore, monitoring models of crop nutrient biochemical information under pest and disease stress still need further development.

Construction of Hyperspectral Biochemical Information Monitoring Model of Crop Nutrients
Hyperspectral remote sensing has strong advantages in building crop nutrient and biochemical information monitoring models because of its many wavebands, high spectral resolution and ability to quantitatively resolve fine spectral differences between features. However, the large volume of raw data, the many wavebands and the high redundancy of information also pose great challenges for data processing and analysis [52]. The choice of raw spectral data for modeling, spectral data preprocessing methods, spectral feature extraction methods, spectral indices and modeling methods is of great significance to the construction of crop nutrient biochemical information monitoring models and directly affects model accuracy. Figure 5 is a schematic diagram of the technical framework of the hyperspectral monitoring model for crop nutrients and biochemical information.

Effect of Different Spectral Data Preprocessing Methods on the Accuracy of the Biochemical Information Hyperspectral Monitoring Model of Crop Nutrients
Common preprocessing methods for spectral data include normalization, Standard Normal Variate (SNV) transformation, Multiplicative Scatter Correction (MSC), Fourier Transform (FT), Savitzky-Golay (SG) smoothing, Detrending (DT), Mean Centering, Integer-order Differentiation, Fractional-order Differentiation, baseline correction, Wavelet Transform, etc. Reasonably applying preprocessing methods can effectively eliminate spectral redundancy, noise and other interference factors such as the environment, while highlighting the absorption features of the spectrum. Raw spectral data usually need to be preprocessed before being used as model input, in order to achieve higher prediction accuracy and reduce the workload of parameter adjustment during model tuning. Chen et al. [53] applied logarithmic and first-order derivative transformations to rapeseed canopy reflectance and used a genetic algorithm combined with Partial Least Squares (PLS) to select leaf chlorophyll features in the chlorophyll spectral region. They found that the different spectral preprocessing methods improved the predictive ability of the model. Zhang et al. [54] constructed a hyperspectral monitoring model for nitrogen content in apple leaves and used wavelet packet analysis to decompose the spectral information, obtaining low-frequency full-spectrum signals and denoised full-spectrum signals with different levels of denoising. Table 1 lists some spectral preprocessing methods used in the construction of crop nutrient and biochemical information monitoring models, as well as their effects on the models.
Table 1. Spectral preprocessing methods used in constructing crop nutrient and biochemical information monitoring models and their effects on the models.
| Crop | Biochemical information | Preprocessing method | Effect on the model | Reference |
|  |  |  | It reduces scattering noise and enhances the correlation between spectral data and chlorophyll concentration. | [57] |
| Winter wheat | Nitrogen | Continuum Removal | It effectively separates and highlights the spectral peak and valley features. | [58] |
| Winter wheat | Nitrogen | Fractional Derivative | It enhances the correlation between the red edge band and chlorophyll. | [59] |
| Winter wheat | Chlorophyll | Fractional Derivative, Continuum Removal | It removes spectral information noise and enhances the responsiveness of crop nitrogen and chlorophyll. | [60] |
| Corn | Chlorophyll and Carotenoids | Uninformative Variable Elimination | It reduces the noise in hyperspectral data and eliminates redundant spectral variables. | [61] |
| Corn | Anthocyanin | Fractional Derivative | It effectively reduces the impact of noise on the target signal, highlights spectral feature information and amplifies the details of the original spectral curve. | [62] |
| Cotton | Nitrogen | Fractional Derivative | It improves spectral resolution and provides rich absorption features. | [63] |
Using preprocessing combined with machine learning to process spectral data can effectively eliminate the environmental effects during data acquisition. However, the results of combining different preprocessing methods with machine learning algorithms cannot be predicted in advance, and repeated experiments are necessary to find a combination with higher accuracy. Deng et al. [64] took chlorophyll content in apple leaves as the research object and explored the effects of four spectral preprocessing methods on apple leaf spectral characteristics and chlorophyll content modeling: wavelet packet denoising of reflectance spectra, first-order differencing of reflectance spectra, wavelet packet denoising followed by first-order differencing, and first-order differencing followed by wavelet packet denoising. Results showed that the chlorophyll content prediction model based on first-order differencing followed by wavelet packet denoising had a higher peak signal-to-noise ratio and lower mean square error and maximum error. Rei et al. [65] studied the chlorophyll content of mustard leaves and compared the accuracy of chlorophyll monitoring models built with five preprocessing methods (first-order derivative reflectance, continuum removal, trend elimination, multiplicative scatter correction and standard normal variate transformation) combined with different machine learning algorithms. 
The combination of trend elimination (detrending) preprocessing and the extreme learning machine algorithm was found to be the most effective for estimating chlorophyll content. Li et al. [66] determined the leaf phosphorus content and canopy spectral reflectance of oilseed rape at different growth stages, and applied log(1/R), continuum removal (CR) and first-order differential transformation (FDT) to the original spectra. Results showed that the FDT-PLS model based on sensitive bands was significantly better than the models based on the other spectral transformations. Based on the canopy hyperspectral data and measured chlorophyll content of winter wheat, Li et al. [67] carried out correlation analysis between the measured chlorophyll content and the original spectrum, the fractional-order differential spectrum and the wavelet energy coefficients obtained from the original spectrum by continuous wavelet transform, and selected the fractional-order differential spectrum and wavelet energy coefficients, which correlated better. Stepwise regression analysis, support vector machine, artificial neural network and other methods were then used to construct models for estimating the chlorophyll content of winter wheat, and the optimal combination of algorithms for hyperspectral estimation of winter wheat chlorophyll content in different growth periods was finally obtained.
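Fractional-order differentiation recurs in the studies above, so a brief illustration may help. The sketch below uses the Grünwald-Letnikov approximation, which is one common way to discretize a fractional derivative; it is not necessarily the scheme used in the cited studies. The function name `gl_fractional_derivative` and the `step` parameter (band spacing) are assumptions made for this example.

```python
import numpy as np

def gl_fractional_derivative(spectrum, order, step=1.0):
    """Grunwald-Letnikov approximation of a fractional-order derivative.

    spectrum: 1-D array of reflectance values on an evenly spaced band grid.
    order:    derivative order (0 returns the input, 1 gives a backward difference).
    step:     band spacing (assumed constant; hypothetical parameter).
    """
    y = np.asarray(spectrum, dtype=float)
    n = y.size
    # Recursive GL weights: w_0 = 1, w_k = w_{k-1} * (1 - (order + 1) / k)
    w = np.ones(n)
    for k in range(1, n):
        w[k] = w[k - 1] * (1.0 - (order + 1.0) / k)
    out = np.empty(n)
    for i in range(n):
        # Weighted sum over the current band and all preceding bands
        out[i] = np.dot(w[: i + 1], y[i::-1])
    return out / step ** order
```

With `order=1` the weights collapse to (1, -1, 0, ...), so the result matches the ordinary first-order difference; intermediate orders such as 0.5 interpolate between the raw spectrum and its first derivative.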
Spectra without preprocessing can lead to errors in quantitative analysis and incorrect component predictions, making spectral preprocessing a potentially beneficial tool for improving the estimation accuracy of nutrient biochemical information monitoring models [68]. Spectral data preprocessing plays a crucial role in building nutrient biochemical information models, setting the foundation for later stages such as obtaining spectral characteristic bands and selecting modeling methods. Therefore, more attention should be given to spectral data preprocessing in experimental design. Currently, model construction based on hyperspectral data usually involves using separate preprocessing methods and comparing their effects on model accuracy. Fewer studies are based on models constructed by applying multiple preprocessing methods simultaneously. First, the combined effects of different preprocessing methods on the original spectral data are unclear, as there may be interactions between preprocessing methods that lead to different prediction results compared to using the methods individually. Second, a clear and interpretable model is essential for understanding and explaining results. When multiple preprocessing methods are applied simultaneously, the model may become more complex and harder to interpret. Furthermore, over-optimizing preprocessing methods can lead to overfitting, where the model performs well on training data but poorly on new data. Despite these challenges, the combined application of multiple preprocessing methods still has the potential to enhance model prediction performance. Future research can explore this approach, aiming to obtain preprocessing methods that effectively improve model prediction capabilities while minimizing negative impacts.
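To make two of the preprocessing methods named in this section concrete, the following is a minimal NumPy sketch of SNV and MSC. It assumes a matrix with one spectrum per row; the function names `snv` and `msc` and the optional `reference` argument are choices made for this illustration, not taken from any cited study.

```python
import numpy as np

def snv(spectra):
    """Standard Normal Variate: center and scale each spectrum (row) individually."""
    X = np.asarray(spectra, dtype=float)
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std

def msc(spectra, reference=None):
    """Multiplicative Scatter Correction against a reference spectrum.

    Each spectrum is regressed on the reference (row = slope * reference + intercept)
    and then corrected as (row - intercept) / slope.
    The mean spectrum is used when no reference is supplied.
    """
    X = np.asarray(spectra, dtype=float)
    ref = X.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(X)
    for i, row in enumerate(X):
        slope, intercept = np.polyfit(ref, row, deg=1)
        corrected[i] = (row - intercept) / slope
    return corrected
```

Both transforms remove additive and multiplicative offsets between spectra. Chaining them with a derivative step is possible but, as noted above, the combined effect on model accuracy has to be checked experimentally.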

Effect of Different Feature Extraction Methods on the Precision of the Biochemical Information of Crop Nutrient Monitoring Model Based on Hyperspectral Techniques
Hyperspectral data used for quantitative analysis of crop nutrient biochemical information involve a large number of samples, numerous bands and high correlation between adjacent bands, which results in high information redundancy across the whole spectrum and adversely affects the accuracy of monitoring models. Therefore, suitable methods are needed to reduce the number of variables while extracting, from the original bands, the sensitive bands that contain most of the information needed to predict nutrient biochemical information. This is also an important basis for further research on monitoring models of crop biochemical parameters. There are two main types of methods for extracting effective information from hyperspectral data: one obtains characteristic bands and their combinations by mathematical and statistical analysis or by intelligent optimization algorithms, and the other derives effective information from the original spectral data by constructing spectral indices.

Selection of Appropriate Spectral Feature Band Extraction Method Can Effectively Improve the Accuracy of the Model
The feature band selection method based on mathematical statistics is commonly used in processing hyperspectral data. It helps to select feature bands that are highly correlated with the target variable or have significant differences, thus improving the predictive performance and interpretability of the model. Common feature band selection methods based on mathematical statistics include the Successive Projection Algorithm (SPA), the Correlation Coefficient method (CC), Competitive Adaptive Reweighted Sampling (CARS), Gaussian Process Regression (GPR), etc. Liu et al. [69] obtained the hyperspectral data and chlorophyll content of soybean leaves at the flowering and podding stages and extracted characteristic wavelengths using SPA, CARS and CC. PLS models built with the selected preprocessing method and the characteristic wavelength variables were compared and analyzed. Results showed that inversion with the selected variables was significantly improved compared with the full-wavelength variables. Wang et al. [70] obtained hyperspectral reflectance data and corresponding leaf nitrogen content data at two scales, leaf and canopy, for rice at different growth stages. Correlation analysis of single-band raw spectra and first-order derivative spectra, together with GPR, was then applied to screen nitrogen-sensitive bands at the leaf and canopy scales over the whole growth period of rice. Results showed that the sensitive bands screened by GPR conformed to the patterns of nitrogen content and spectral changes in rice. Yang et al. [71] compared the accuracy of four band selection methods (Monte Carlo-uninformative variable elimination, Random Frog hopping, CARS and Moving Window Partial Least Squares) and proposed a sensitive band selection method combining CARS and CC. A nonlinear regression model was established with the 30 screened bands. The experimental results showed that both the prediction accuracy and modeling accuracy were significantly improved compared with the BP neural network model and the Support Vector Regression model. Although feature band selection methods based on mathematical statistics can improve the predictive performance of models to some extent, they still have limitations in practical applications. Many of these methods assume a linear relationship between features and the target variable, and may fail to capture the relationship effectively when it is nonlinear. Additionally, these methods are highly dependent on data preprocessing, which means that combining them with different preprocessing algorithms can have a significant impact on the model.
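As an illustration of the simplest of these approaches, the Correlation Coefficient method can be sketched in a few lines of NumPy. The sketch assumes a samples-by-bands reflectance matrix and one measured value per sample; the names `band_correlations`, `select_bands` and `top_k` are hypothetical, and bands with constant reflectance are assumed to have been removed beforehand.

```python
import numpy as np

def band_correlations(spectra, target):
    """Pearson correlation between each band (column) and the target variable."""
    X = np.asarray(spectra, dtype=float)
    y = np.asarray(target, dtype=float)
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return (Xc.T @ yc) / denom

def select_bands(spectra, target, top_k=10):
    """Return the indices of the top_k bands with the largest |r|, plus all r values."""
    r = band_correlations(spectra, target)
    order = np.argsort(-np.abs(r))
    return np.sort(order[:top_k]), r
```

Because the ranking uses only linear correlation, it shares the limitation noted above: a band that relates to the target nonlinearly may receive a low score.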
Feature band selection methods based on intelligent optimization algorithms have certain advantages, which help improve the effectiveness of feature extraction when dealing with complex datasets. Common feature band selection methods based on intelligent optimization algorithms include Principal Component Analysis (PCA), Genetic Algorithm (GA), Ant Colony Algorithm (ACA), etc. Intelligent optimization algorithms typically possess strong global search capability, adaptability and generalization ability, enabling them to search for optimal solutions throughout the entire search space. They also dynamically adjust search strategies based on problem characteristics, thus enhancing the efficiency of the model. Guo et al. [72] used PCA to compress and extract the main information from the raw spectral data of rubber seedling leaves, and the 20 principal components obtained could explain 99.993% of the information of the original 2151 wavelengths (350-2500 nm). Cao et al. [73] combined three dimensionality reduction methods (SPA, LASSO and EN) with three regression methods (MLR, MSR and PLSR) to construct nine maize leaf nitrogen inversion models; in a comprehensive comparison, the EN-PLSR model had the best prediction performance for estimating maize leaf nitrogen content. Sun et al. [74] took rice canopy spectral data as the research object and applied PCA to reduce the dimensionality of the original spectral data. The principal components obtained were used as input variables to construct hyperspectral estimation models for leaf SPAD values using stepwise multiple linear regression analysis and Support Vector Regression. They found that the combination of PCA and support vector machine models could better predict leaf SPAD values than models based on the Correlation Coefficient method. 
In comparison to feature band selection methods based on mathematical statistical analysis, intelligent optimization algorithms have advantages in terms of global search capability, adaptability, parallel computing capability, avoiding overfitting and handling multi-objective problems. However, such methods also have limitations, such as high computational complexity and slow convergence speed. Table 2 lists some spectral feature extraction methods used in constructing crop nutrient biochemical monitoring models and their effects on the models.
Table 2. Spectral feature extraction methods used in constructing crop nutrient biochemical monitoring models and their effects on the models.
| Crop | Biochemical information | Feature extraction method | Effect on the model | Reference |
| Cotton | Nitrogen | Successive Projection Algorithm | The collinearity between wavelength variables was eliminated, and the sensitive feature bands were highly correlated with leaf nitrogen content. | [75] |
| Lettuce | Phosphorus | Successive Projection Algorithm, Principal Component Analysis | The dimension of spectral data is reduced, the complexity of the model is reduced and the prediction ability of the model is improved. | [76] |
| Winter wheat | Nitrogen | Discrete Wavelet Transform | Feature extraction of hyperspectral canopy spectra is completed while maintaining the quality of the original spectral information and reducing the spatial dimension of canopy spectral data. | [77] |
| Winter wheat | Chlorophyll | Continuous Wavelet Transform | The characteristic information of chlorophyll content in winter wheat canopy was captured effectively, which improved the prediction accuracy of the model. | [78] |
| Corn | Chlorophyll |  |  |  |
| Apple | Anthocyanin | Variable Importance in Projection (VIP)-PLSR-Akaike Information Criterion | The estimation accuracy and conciseness of the model are guaranteed effectively. | [80] |
The combination of mathematical statistics and intelligent optimization algorithms can improve the accuracy, robustness and efficiency of spectral feature extraction, as well as expand the feature space to adapt to different types of hyperspectral data. Liu et al. [81] focused on winter wheat at different growth stages. They preprocessed the spectral data using wavelet denoising and multiplicative scatter correction, applied PCA to reduce the dimensionality of the data, and then used correlation coefficient analysis to select the optimal combinations of principal components for different growth stages. Multivariate regression models were built to estimate chlorophyll content at different growth stages. Results showed better predictive performance and generalization ability for all models. Liu et al. [61] combined CWT with CARS to estimate nitrogen concentration in potatoes. The results showed that CARS can retain the coefficients with the least information redundancy while eliminating the invalid, strongly interfering information in the wavelet coefficients produced by the continuous wavelet transform, thus improving the stability and accuracy of the model. 
Although the combination of mathematical statistics and intelligent optimization algorithms theoretically improves the accuracy and effectiveness of spectral feature extraction, there are limited practical examples of this composite approach. On one hand, the combination of these two types of algorithms requires careful design and adjustment to ensure their collaboration and exploitation of respective advantages, which may increase the difficulty of algorithm design and implementation and restrict the practical application of this method. On the other hand, this composite approach may lead to a significant increase in computational complexity, slower convergence speed and poorer stability of the model. Slow or unstable convergence speed during the search for optimal feature combinations can affect the practical application of the method. Despite these challenges, the combination of mathematical statistics and intelligent optimization algorithms for spectral feature acquisition still holds certain research value. In practical applications, suitable feature selection methods can be chosen based on specific problems and scenarios, and existing optimization techniques and tools can be employed to overcome these challenges.
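The PCA-based dimensionality reduction described in this section (for example, compressing 2151 wavelengths into 20 components) can be sketched with a singular value decomposition. This is a minimal illustration, not the procedure of any cited study; the function name `pca_scores` and the `variance_to_keep` threshold are assumptions.

```python
import numpy as np

def pca_scores(spectra, variance_to_keep=0.99):
    """Project spectra onto the leading principal components.

    Returns (scores, components, explained), keeping the smallest number of
    components whose cumulative explained variance reaches variance_to_keep.
    """
    X = np.asarray(spectra, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each band
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)          # variance ratio per component
    n_comp = int(np.searchsorted(np.cumsum(explained), variance_to_keep) + 1)
    n_comp = min(n_comp, s.size)
    scores = U[:, :n_comp] * s[:n_comp]          # sample coordinates in PC space
    return scores, Vt[:n_comp], explained[:n_comp]
```

The returned scores can then be passed to a regression model in place of the full band set, and the loadings in `components` indicate which wavelengths dominate each component.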

Appropriate Spectral Index Is Helpful to Improve the Prediction Accuracy of the Model
A spectral index is a spectral parameter obtained by applying a specific algorithm to one or more spectral bands [82]. Compared to single-band spectral information, spectral indices have higher sensitivity and can effectively reduce or eliminate noise caused by environmental backgrounds such as soil and water. They are an indispensable tool in spectral data analysis. Spectral indices can be used to construct qualitative or quantitative monitoring models and have been extensively applied in research areas such as monitoring crop growth indicators [83], pest and disease stress [84,85] and growth conditions [86]. Spectral indices can effectively extract useful information from hyperspectral data. Compared to raw spectral data or specific spectral bands, spectral indices amplify the connection between spectral reflectance and crop nutrient and biochemical information, so models constructed from spectral indices have better predictive performance. Table 3 lists some of the commonly used spectral indices for constructing crop nutrient biochemical information monitoring models. To explore the hyperspectral band combinations that are sensitive to the chlorophyll content of the winter wheat canopy, and to compare the estimation effect of different spectral indices, Luo et al. [87] selected four spectral indices to construct a model for monitoring the chlorophyll content of the wheat canopy based on the original spectrum and the first-order derivative spectrum. Pan et al. [88] compared the correlation between six vegetation indices and apple canopy chlorophyll content, and constructed spectral indices for all two-band combinations in the sensitive wavelength region.
Models developed based on these spectral indices achieved a high level of significance. Xu et al. [89] selected strawberry as the test material, measured the chlorophyll content of strawberry leaves and the spectral reflectance at canopy height, and studied the correlation between them. Results showed that the correlations between chlorophyll content and the vegetation indices DVI, MSAVI, PVI, RDVI, SAVI and TSAVI reached a very significant level, so these indices can be used as characteristic parameters to predict chlorophyll content (Table 3).
The correlation between spectral indices and crop biochemical information differs with crop variety, growth stage, stress condition and the type of biochemical information being inferred, reflecting the combined effects of variations in plant physiological characteristics, changes in growth stage and the impact of stress. To enhance the accuracy and reliability of monitoring models, it is crucial to select spectral indices that are highly correlated with the specific features and attributes of the target. Diao et al. [102] found that the greenness vegetation index (Green NDVI), soil-adjusted ratio vegetation index (SARVI), ratio vegetation index (RVI) and Difference Vegetation Index (DVI) showed highly significant correlation with the partial factor productivity of nitrogen fertilizer (PF-Pn) of wheat at the tillering and flowering, jointing, heading and maturity stages, respectively. Lu et al. [103] found experimentally that the Normalized Difference Spectral Index (NDSI (R1705, R1385)), Ratio Spectral Index (RSI (R1385, R1705)) and Difference Spectral Index (DSI (R1705, R1385)) were well correlated with the leaf potassium content of rice (R² up to 0.68). In their research on a hyperspectral inversion model for rice phosphorus content, Ban et al. [104] found that the leaf phosphorus content of rice was highly correlated with three newly constructed spectral indices: NDSI (R498, R606), RSI (R498, R606) and DSI (R498, R586), with correlation coefficients of 0.913, 0.915 and 0.938, respectively.
Spectral indices with higher correlation have better interpretability in specific application scenarios, enabling them to reflect the spectral variations of the target more accurately and to enhance the inference performance of monitoring models. Therefore, when constructing crop nutrient biochemical information monitoring models based on spectral indices, suitable indices must be selected according to the specific circumstances. Inoue et al. [105] found that spectral indices based on NDSI (R825, R735) or RSI (R825, R735) had high prediction accuracy when modeling and monitoring nitrogen content in rice. Qi et al. [15] found that the optimal spectral indices for determining chlorophyll content in peanut leaves were NDSI (R520, R528), RSI (R748, R561), DSI (R758, R602) and SASI (R753, R624); the coefficients of determination of the models based on NDSI, RSI, DSI and SASI were all greater than 0.65, and the root mean square errors were all less than 2.04.
To improve the prediction accuracy of inversion models of crop nutrient biochemical information, many scholars have proposed new spectral indices by optimizing published ones, in order to reduce background signals or noise, resolve overlapping spectral features and enhance the relationship between spectral data and crop nutrient biochemical information [106]. Liang et al. [107] designed two new spectral indices, FD-NDNI and FD-SRNI, based on the first-order derivatives of reflectance spectra for estimating the nitrogen content of wheat; comparative analysis showed that models based on FD-NDNI and FD-SRNI were more accurate than those based on commonly used indices such as MNDVI (705) and NDNI. Yu et al. [108] simplified the spectral index NAOC, which is based on integral operations, and obtained optimized spectral indices based on dual-band simplification operations. They used the original reflectance spectra R and the mathematically transformed spectra lg R, R^(1/2) and 1/R as the basis for calculating spectral indices, and established three rice leaf SPAD inversion models using PLSR, SVM and BP neural networks, respectively. The coefficients of determination of these models were all greater than 0.79, and the standardized root mean square errors were less than 5.4%. To explore and evaluate the effects of different vegetation indices on the estimation of winter wheat canopy chlorophyll content (CCC), Zhang et al. [59] optimized the published and revised indices using random band combinations from Original Spectra (OS) and first-order differential (FD) spectra. They found that the Three-band Vegetation Index could overcome the limitation that the number of bands places on target information extraction, alleviate the saturation problem of the Two-band Vegetation Index and improve the monitoring accuracy of winter wheat chlorophyll content. 
Overall, the emergence of new spectral indices provides a new approach and methodology for spectral analysis and the study of models for estimating crop nutrient biochemical information. These new indices offer more precise and comprehensive spectral feature information.
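The two-band indices discussed above are usually defined as NDSI(Ri, Rj) = (Ri - Rj)/(Ri + Rj), RSI(Ri, Rj) = Ri/Rj and DSI(Ri, Rj) = Ri - Rj. The sketch below computes them and searches all band pairs for the NDSI most strongly related to a measured trait, in the spirit of the exhaustive two-band searches described in this section. Band positions are column indices into a samples-by-bands matrix, and the function names are hypothetical.

```python
import numpy as np

def two_band_indices(spectra, i, j):
    """NDSI, RSI and DSI for band columns i and j of a samples-by-bands matrix."""
    X = np.asarray(spectra, dtype=float)
    ri, rj = X[:, i], X[:, j]
    return {"NDSI": (ri - rj) / (ri + rj), "RSI": ri / rj, "DSI": ri - rj}

def best_ndsi_pair(spectra, target):
    """Exhaustively search band pairs for the NDSI with the highest squared correlation."""
    X = np.asarray(spectra, dtype=float)
    y = np.asarray(target, dtype=float)
    n_bands = X.shape[1]
    best_r2, best_pair = 0.0, None
    for i in range(n_bands):
        # NDSI(i, j) = -NDSI(j, i), so only pairs with j > i need to be tested
        for j in range(i + 1, n_bands):
            ndsi = (X[:, i] - X[:, j]) / (X[:, i] + X[:, j])
            r2 = np.corrcoef(ndsi, y)[0, 1] ** 2
            if r2 > best_r2:
                best_r2, best_pair = r2, (i, j)
    return best_r2, best_pair
```

The search cost grows with the square of the number of bands, so in practice it is often restricted to a sensitive wavelength region, as in the study by Pan et al. [88].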

Construction of Biochemical Information Monitoring Model of Crop Nutrient Based on Machine Learning Algorithm
Machine learning algorithms are powerful tools for handling hyperspectral remote sensing data. Predicting crop nutrient biochemical information based on hyperspectral data is a regression problem, and common machine learning algorithms include decision tree algorithms, regression algorithms, ensemble algorithms, clustering algorithms and artificial neural networks. These algorithms have been widely used in constructing monitoring models for crop nutrient biochemical information based on hyperspectral data. Prior to applying machine learning algorithms, preprocessing and extraction of relevant spectral information from hyperspectral data are necessary to ensure the reliability of the models. The preprocessed and spectrally informative hyperspectral data is used as input, while the desired crop nutrient biochemical information is used as output. By fitting a regression model using machine learning algorithms, monitoring of crop nutrient biochemical information can be achieved. These models can generally be classified as linear or nonlinear models. When applying machine learning algorithms for regression modeling of crop nutrient biochemical information, it is important to consider model selection and tuning, handling overfitting and underfitting and model evaluation and interpretation, in order to improve model accuracy.
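Model evaluation in the studies reviewed here relies mainly on the coefficient of determination (R²) and the root mean square error (RMSE). A minimal sketch of both metrics is given below, using the usual definitions; the function names are chosen for this illustration only.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error, in the units of the measured variable."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```

To detect overfitting, both metrics should be computed on samples that were not used for fitting.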
Linear regression models have lower computational complexity and are better able to reflect the relationships between variables and their relative importance. They also possess strong robustness. Commonly used algorithms for constructing linear models for crop nutrient biochemical information include Partial Least Squares Regression (PLSR), Multiple Linear Regression (MLR), Stepwise Regression Analysis (SRA) and Multivariate Stepwise Regression (MSR). Li et al. [109] used hyperspectral data of potato leaves under different nitrogen treatments, selected raw spectra and their transformed data in the wavelength range of 430-910 nm, and used PLSR to construct a nitrogen content prediction model of potato leaves based on the nitrogen content of leaves sampled simultaneously in the field. The results showed that PLSR was able to model the nitrogen content of potato leaves even when the independent variables were multicollinear. Ye et al. [110] used PLSR and MLR to construct models for estimating the nitrogen content of apple leaves based on the raw hyperspectral reflectance and first-order derivative reflectance of the leaves. The MLR model based on raw reflectance was better than the PLSR model and the MLR model based on first-order derivative reflectance. Li et al. [111] used MSR to construct a hyperspectral estimation model for the chlorophyll content of moso bamboo under insect damage stress. They found that MSR could further compress the feature wavelengths extracted by the successive projection algorithm, and that the MSR models based on continuum-removed (envelope-removed) spectra and their first-order derivatives, established by combining the successive projection algorithm with the SPXY sample division method, could effectively estimate the chlorophyll content of bamboo leaves. However, linear regression models also have some limitations. 
For example, when there is multicollinearity among multiple independent variables, the linear regression model may become unstable. Additionally, linear regression models are limited to capturing linear relationships between variables and cannot capture more complex nonlinear relationships or interactions. Although linear regression models may not perform as well as other complex machine learning algorithms in certain situations, they are still a useful and reliable modeling tool for handling linearly related data and problems.
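To make the role of PLSR under multicollinearity more concrete, the sketch below implements single-response PLS (PLS1) with the NIPALS deflation scheme and applies it to synthetic spectra whose 30 bands are almost perfectly correlated. The data-generating model, the band loadings, the noise levels and the choice of two latent variables are assumptions made only for illustration; published studies would normally rely on an established chemometrics or machine learning library rather than this hand-written version.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pearson(u, v):
    """Pearson correlation coefficient between two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    du, dv = [a - mu for a in u], [b - mv for b in v]
    return dot(du, dv) / math.sqrt(dot(du, du) * dot(dv, dv))


def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS: extract latent variables one at a time."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centred spectra
    f = [v - y_mean for v in y]                                # centred response
    components = []
    for _ in range(n_components):
        # Weight vector: band-space direction with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:
            break  # nothing left to explain
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]  # latent scores of the samples
        tt = dot(t, t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = dot(f, t) / tt
        # Deflation: remove the part explained by this latent variable.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict one sample by repeating the same projection and deflation."""
    x_mean, y_mean, components = model
    e = [v - mu for v, mu in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat


# Synthetic spectra (assumption): every band follows one latent leaf property,
# so neighbouring bands are strongly collinear.
rng = random.Random(0)
latent = [rng.uniform(0.0, 1.0) for _ in range(40)]
loadings = [0.5 + 0.01 * j for j in range(30)]
spectra = [[z * l + rng.gauss(0.0, 0.01) for l in loadings] for z in latent]
nitrogen = [2.0 + 3.0 * z + rng.gauss(0.0, 0.05) for z in latent]

band_corr = pearson([row[0] for row in spectra], [row[1] for row in spectra])
model = pls1_fit(spectra, nitrogen, n_components=2)
fitted = [pls1_predict(model, row) for row in spectra]
mean_n = sum(nitrogen) / len(nitrogen)
calibration_r2 = 1.0 - (
    sum((a - b) ** 2 for a, b in zip(nitrogen, fitted))
    / sum((a - mean_n) ** 2 for a in nitrogen)
)
print(f"adjacent-band correlation = {band_corr:.3f}")
print(f"calibration R2 with 2 latent variables = {calibration_r2:.3f}")
```

Because the regression is carried out on a few latent scores instead of on the individual bands, no matrix built from the collinear bands has to be inverted, which is the property that makes this family of methods attractive for hyperspectral data.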
Hyperspectral data are spectral data at continuous wavelengths, and data at adjacent wavelengths are highly correlated. Linear models that compress correlated bands into a few latent variables, such as PLSR, can therefore handle the multicollinearity to which hyperspectral data are prone, which improves the accuracy and stability of the model. In addition, the relationship between hyperspectral data and crop nutrient biochemical information contains both linear and nonlinear components. Nonlinear models have limited ability to predict the linear components. However, a large number of studies have shown that, compared with linear models, nonlinear models can further exploit the hidden valid information linking spectral data and crop nutrient biochemical information, thus improving prediction accuracy.
Spectral data typically contain rich information and capture the reflection or absorption characteristics of crop biochemical components, which suggests a direct linear relationship between component concentrations and spectral responses. However, because complex nonlinear relationships may exist among nutrient absorption, transport, metabolism, crop physiological processes and environmental factors, crop nutrient biochemical information may also exhibit nonlinear characteristics. Nonlinear models make more flexible assumptions about data distribution and can capture interactions between independent variables. Compared with linear models, nonlinear models are better able to explore the relationship between spectral data and crop nutrient biochemical information, providing more accurate and detailed predictions [112]. In a hyperspectral inversion study of the chlorophyll content of apple leaves, Liu et al. [113] found that support vector regression predicted chlorophyll content better than polynomial regression. In a hyperspectral inversion study of the chlorophyll content of winter wheat, Wang et al. [114] found that the two nonlinear models, constructed with the gradient boosted regression tree algorithm and a BP neural network, both had higher prediction accuracy than the multiple linear prediction model constructed with ridge regression. Of the two, the gradient boosted regression tree algorithm had the greater advantage in predicting the chlorophyll content of winter wheat at all growth stages. Guo et al. [115] selected bands sensitive to leaf phosphorus content through correlation analysis of leaf phosphorus content and spectral variables. The sensitive bands were used as input variables to predict leaf phosphorus content with MLR, PLSR and BP neural network models. The results showed that the model for predicting the leaf phosphorus content of rubber seedlings that combined the sensitive bands with the BP neural network had the highest prediction accuracy, and it was significantly better than the linear models constructed with the various spectral forms as input variables.
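The following sketch illustrates, on a deliberately simple case, what a BP (backpropagation) neural network does that a straight line cannot: it fits a saturating response. The response curve, the network size (one hidden layer with five tanh units), the learning rate and the number of epochs are all hypothetical choices for illustration and do not reproduce the configuration of any study cited above.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_bp(params, xi):
    """Forward pass: one tanh hidden layer followed by a linear output."""
    w, b, v, c = params
    return sum(vk * math.tanh(wk * xi + bk) for wk, bk, vk in zip(w, b, v)) + c


def train_bp_network(x, y, hidden=5, lr=0.05, epochs=3000, seed=0):
    """Full-batch gradient descent on the mean squared error."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # input -> hidden
    b = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # hidden biases
    v = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # hidden -> output
    c = 0.0                                              # output bias
    n = len(x)
    history = []  # loss measured before each update
    for _ in range(epochs):
        gw, gb, gv, gc = [0.0] * hidden, [0.0] * hidden, [0.0] * hidden, 0.0
        loss = 0.0
        for xi, yi in zip(x, y):
            h = [math.tanh(w[k] * xi + b[k]) for k in range(hidden)]
            err = sum(v[k] * h[k] for k in range(hidden)) + c - yi
            loss += err * err / n
            g = 2.0 * err / n  # derivative of the loss w.r.t. the prediction
            gc += g
            for k in range(hidden):
                gv[k] += g * h[k]
                # Backpropagate through tanh: d tanh(u)/du = 1 - tanh(u)^2.
                dh = g * v[k] * (1.0 - h[k] * h[k])
                gw[k] += dh * xi
                gb[k] += dh
        history.append(loss)
        for k in range(hidden):
            w[k] -= lr * gw[k]
            b[k] -= lr * gb[k]
            v[k] -= lr * gv[k]
        c -= lr * gc
    return (w, b, v, c), history


# Synthetic saturating response (assumption): the biochemical variable rises
# quickly with the spectral index and then levels off.
x = [i / 49 for i in range(50)]
y = [1.0 - math.exp(-4.0 * xi) for xi in x]

params, history = train_bp_network(x, y)
bp_mse = sum((predict_bp(params, xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(x)
intercept, slope = fit_line(x, y)
linear_mse = sum((intercept + slope * xi - yi) ** 2 for xi, yi in zip(x, y)) / len(x)
print(f"linear MSE = {linear_mse:.4f}, BP network MSE = {bp_mse:.4f}")
```

The two errors printed at the end are calibration errors on noise-free data, so they only show fitting capacity; a fair comparison on real spectra would use held-out samples.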
In addition to the algorithms mentioned above, common algorithms for constructing nonlinear models of crop nutrient biochemical information include random forest regression [58], Gaussian process regression [116], K-nearest neighbors [117] and radial basis function neural networks [118]. Although nonlinear regression models have strong predictive capabilities, they also pose challenges and limitations. For example, parameter estimation in nonlinear models can be more difficult and can become stuck in local optima. Additionally, overly complex nonlinear models are prone to overfitting. Therefore, when using nonlinear regression models, it is important to choose the model form carefully, address overfitting and perform appropriate model evaluation and validation.
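One common way to detect and limit overfitting in a nonlinear model is to choose its complexity by cross-validation. The sketch below does this for K-nearest neighbors regression, where the number of neighbors k controls flexibility. The two synthetic predictors, the response function, the candidate values of k and the five-fold interleaved split are assumptions for illustration only.

```python
import math
import random


def knn_predict(train_X, train_y, x, k):
    """Average the responses of the k training samples closest to x."""
    neighbours = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), yi)
        for row, yi in zip(train_X, train_y)
    )
    return sum(yi for _, yi in neighbours[:k]) / k


def training_rmse(X, y, k):
    """Error on the calibration samples themselves (optimistic)."""
    errors = [(knn_predict(X, y, xi, k) - yi) ** 2 for xi, yi in zip(X, y)]
    return math.sqrt(sum(errors) / len(errors))


def cross_validated_rmse(X, y, k, folds=5):
    """Error when every sample is predicted from folds that exclude it."""
    n = len(X)
    squared_error = 0.0
    for fold in range(folds):
        held_out = set(range(fold, n, folds))
        train_X = [X[i] for i in range(n) if i not in held_out]
        train_y = [y[i] for i in range(n) if i not in held_out]
        for i in held_out:
            squared_error += (knn_predict(train_X, train_y, X[i], k) - y[i]) ** 2
    return math.sqrt(squared_error / n)


# Synthetic data (assumption): two spectral features and a nonlinear response.
rng = random.Random(0)
X = [[rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)] for _ in range(60)]
y = [math.sin(3.0 * a) + b * b + rng.gauss(0.0, 0.1) for a, b in X]

candidates = (1, 3, 5, 9, 15)
for k in candidates:
    print(k, round(training_rmse(X, y, k), 3), round(cross_validated_rmse(X, y, k), 3))
best_k = min(candidates, key=lambda k: cross_validated_rmse(X, y, k))
print("k selected by cross-validation:", best_k)
```

With k = 1 every calibration sample is its own nearest neighbor, so the training error is exactly zero even though the cross-validated error is not; this gap is the signature of overfitting that the validation step is meant to expose.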
Both linear and nonlinear regression models have limitations in real-world applications. An ideal solution is to integrate the two types of model so that both the linear and the nonlinear relationships in the data are captured, which increases the flexibility of the model and can improve its fitting capability and prediction accuracy. Such integration is feasible. One approach is to use a generalized linear model that combines linear and nonlinear components. Another is to use the predictions of a nonlinear regression model as input to a linear regression model. The specific methods and steps vary with the problem and the data, so the practical situation and data characteristics must be considered and appropriate model evaluation and validation performed when selecting an approach. Liu et al. [119] collected rice canopy hyperspectral data and chlorophyll content data and developed a hybrid prediction model (GPR-P), based on Gaussian process regression and PLSR, for compensated prediction of the chlorophyll content of rice. The model made full use of the respective advantages of linear and nonlinear models to further improve prediction accuracy and stability. However, integration also has limitations, such as increased complexity, larger data requirements, risk of overfitting, difficulties in parameter selection and reduced interpretability. Currently, most hyperspectral monitoring models for crop nutrient biochemical information rely on a single type of model, and research on combining the two types is limited. Further research in this direction is needed.
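One simple way to combine the two model types is residual compensation: a linear model captures the main trend, and a nonlinear model is then fitted to what the linear model leaves unexplained. The sketch below is an illustration of that general idea only, not the GPR-P model of Liu et al. [119]; a Gaussian-kernel smoother stands in for the nonlinear part, and the synthetic data, the bandwidth of 0.05 and the even/odd calibration split are assumptions.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def kernel_smoother(train_x, train_r, x, bandwidth):
    """Gaussian-kernel weighted average of the residuals observed near x."""
    weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in train_x]
    return sum(w * r for w, r in zip(weights, train_r)) / sum(weights)


def rmse(y_true, y_pred):
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Synthetic data (assumption): a saturating response with measurement noise.
rng = random.Random(1)
x = [i / 59 for i in range(60)]
y = [1.0 - math.exp(-4.0 * xi) + rng.gauss(0.0, 0.02) for xi in x]

# Even-indexed samples calibrate the model, odd-indexed samples validate it.
cal_x, cal_y = x[0::2], y[0::2]
val_x, val_y = x[1::2], y[1::2]

# Step 1: linear component.
intercept, slope = fit_line(cal_x, cal_y)
# Step 2: nonlinear component fitted to the residuals of the linear model.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(cal_x, cal_y)]


def hybrid_predict(xi, bandwidth=0.05):
    linear_part = intercept + slope * xi
    return linear_part + kernel_smoother(cal_x, residuals, xi, bandwidth)


linear_rmse = rmse(val_y, [intercept + slope * xi for xi in val_x])
hybrid_rmse = rmse(val_y, [hybrid_predict(xi) for xi in val_x])
print(f"validation RMSE: linear = {linear_rmse:.3f}, hybrid = {hybrid_rmse:.3f}")
```

The design keeps the linear part interpretable while letting the second stage absorb the curvature, at the cost of an extra parameter (here the bandwidth) that has to be tuned and validated.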

Challenges and Prospects of Hyperspectral Technology in Monitoring Biochemical Information of Crop Nutrients
The application of hyperspectral techniques to monitoring crop nutrient biochemical information occupies an important position in hyperspectral remote sensing for agriculture. However, several challenges still need to be resolved quickly and efficiently before this application can be expanded.

1. Few products for spectral monitoring of crop nutrient biochemical information are available for agricultural production. The purpose of this research is to obtain the nutrient biochemical information of the target crop, select the characteristic spectral bands or band combinations that contribute most to the monitoring model and build relatively low-cost multispectral devices or sensors for use in agricultural production. Although most published models have good predictive capability, few have been translated into products for agricultural practice. This is mainly because different varieties, growth conditions, growth periods and nutrient biochemical parameters of the same crop correspond to different sensitive spectral bands, so the models are not very universal. At the same time, different crops have different growth environments, and the shape and adaptability of the spectral sensors need to be designed to match. Therefore, translating hyperspectral monitoring models of crop nutrient biochemical information into corresponding spectral sensors is commercially very promising.

2. A database of spectral models for monitoring the nutrient and biochemical information of different crops has not yet been established. Because crop species are diverse, growing conditions differ and the equipment used to collect hyperspectral data also varies, the models established by different researchers are not universally applicable. Establishing such a database would, on the one hand, facilitate the exchange of data among researchers studying the same crop species or the same nutrient biochemical information, so that spectral fingerprints with good generality can be identified; on the other hand, it would provide research ideas for scholars studying different crop species or different nutrient biochemical information.

3. At present, hyperspectral monitoring models of crop nutrient and biochemical information are neither dynamic nor real-time. Most of the spectral data used to construct the monitoring models are static statistics. Moreover, monitoring models built by inverting a single nutrient or biochemical parameter do not consider the linkage between different nutrient and biochemical parameters or the interaction mechanism between crop and environment. Therefore, in subsequent research, spectral prediction models can be combined with crop growth mechanism models or remote sensing models to build more efficient and universal dynamic monitoring models, so that trends in nutrient and biochemical information and in crop growth can be judged accurately in each growth cycle.

Conclusions
Building nutrient biochemical information monitoring models based on hyperspectral data can establish the relationship between hyperspectral data, nutrient biochemical information and other plant phenotypes quickly and effectively. This approach is also a research hotspot in the field of precision agriculture. This paper provides an overview and review of academic research on building nutrient biochemical information monitoring models based on hyperspectral data. The main contributions of this paper are as follows: (1) sorting out the main application scenarios of building crop nutrient biochemical information monitoring models based on hyperspectral data in recent years; (2) systematically explaining the advantages, disadvantages and applicability of the methods used in each stage of model construction; and (3) presenting the current challenges and potential future developments. Through this clearer and more comprehensive overview, we aim to provide a reference for future research on building crop nutrient biochemical information monitoring models based on hyperspectral data.
In recent years, numerous researchers in the agricultural field have built hyperspectral remote sensing systems and applied them in monitoring models for crop nutrient biochemical information. These models have been applied effectively to monitoring the nutrient biochemical information of crops at different growth stages or under different growth conditions. However, most of the research focuses on a single growth period and only reports the growth status of the crop at the time of data collection, without predicting future growth. Future research can therefore focus on monitoring nutrient biochemical information across multiple growth stages, which would strengthen the validation of crop growth predictions and improve the reliability of the models. Additionally, although most studies achieve high prediction accuracy, there is limited discussion of whether the models remain applicable in other studies of the same crop. Generality is an important evaluation criterion for models and should be treated as a significant factor during model construction.
A large number of traditional machine learning algorithms have been used in the various stages of model construction. Traditional machine learning techniques generally perform well, but their parameter settings strongly affect regression accuracy, so considerable effort is required for parameter tuning. Researchers have improved classification and recognition algorithms and used different data mining techniques to identify, classify and quantitatively analyze the spectral features of crop nutrient biochemical information. The results demonstrate that hyperspectral technology has good predictive capability in identifying and monitoring the nutrient biochemical content of crops, and they reveal the potential of combining hyperspectral technology with computer algorithms to analyze crop phenotypic information. However, as mentioned above, although the monitoring models predict well after parameter tuning, they are mostly limited to specific application scenarios. It is therefore necessary to enlarge the datasets of nutrient biochemical information for crops of the same type in different growth environments, so that models with stronger generalization and dynamic prediction capabilities can be developed. The development of modeling methods based on the fusion of multiple algorithms is also crucial, as it may provide an important route to better model stability, generality and accuracy.