Construction of a Screening Model for Nitrogen-Efficient Rice Varieties Based on Spectral Data

Han, Honghua; Ji, Yuhang; Dai, Mian; Sun, Chengming

doi:10.3390/agronomy16050540


¹ Business School, Yangzhou University, Yangzhou 225009, China

² Cultivation and Construction Site of National Key Laboratory for Crop Genetics and Physiology in Jiangsu Province, Yangzhou University, Yangzhou 225009, China

³ Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, Yangzhou University, Yangzhou 225009, China

* Author to whom correspondence should be addressed.

Agronomy 2026, 16(5), 540; https://doi.org/10.3390/agronomy16050540

Submission received: 26 January 2026 / Revised: 25 February 2026 / Accepted: 26 February 2026 / Published: 28 February 2026

(This article belongs to the Section Precision and Digital Agriculture)


Abstract

Accurate and efficient screening of nitrogen-efficient rice varieties is crucial for implementing precision agriculture and achieving green and sustainable development. However, traditional screening methods rely on destructive sampling and chemical analysis, which are inefficient and costly, and thus cannot meet the requirements of large-scale breeding applications. Therefore, this study aims to develop a non-invasive, high-throughput screening method for nitrogen efficiency of rice based on unmanned aerial vehicle (UAV) hyperspectral data and machine learning algorithms. Sixty rice varieties were selected as the target, and principal component analysis (PCA) was used to reduce the dimension of seven key agronomic parameters (such as yield, nitrogen utilization rate, etc.). A comprehensive evaluation index for nitrogen utilization efficiency was constructed, and K-means clustering was used to classify the varieties into three categories: nitrogen-efficient, medium-efficient, and low-efficient varieties. On this basis, four machine learning algorithms (decision tree (DT), random forest (RF), support vector machine (SVM), and K-nearest neighbor (KNN)) were used to establish a variety nitrogen efficiency classification model based on spectral indices. The results showed that the indicators constructed based on PCA and clustering could effectively distinguish different nitrogen-efficient varieties; among the four models compared, the DT model achieved the highest overall performance, with an accuracy of 0.75, precision of 0.80, and F1-score of 0.74. This study confirmed the feasibility of combining UAV hyperspectral data with decision tree models, providing a reliable technical solution for the large-scale, rapid, and non-invasive screening of nitrogen-efficient rice varieties.

Keywords: rice; nitrogen-efficient varieties; agronomic parameters; screening; vegetation indices

1. Introduction

As a staple food crop in China, rice is closely linked to daily human life. For decades, rice production in China has focused on cultivating varieties with traits such as lodging resistance, short stature, and high fertilizer tolerance, while continuously increasing fertilizer application [1,2,3]. This approach has contributed to higher yields and enhanced food security. However, it has also led to a series of challenges, including rising input costs, aggravated environmental pollution, and relatively low overall economic benefits, which increasingly impact public health. While yield, quality, or nitrogen use efficiency (NUE) are common targets in rice variety screening, reliance on any single parameter may provide an incomplete assessment of overall performance. Recently, some scholars have begun to explore the relationship between rice quality and yield [4]. Nevertheless, the application of UAV-based spectral technology specifically for classifying rice varieties based on comprehensive nitrogen use efficiency (NUE) types remains relatively limited. Hyperspectral remote sensing, meanwhile, has advanced rapidly and is now widely applied across multiple fields. With the increasing demand for efficiency and sustainability, this technology shows particularly promising prospects in intelligent agriculture [5,6]. Against the backdrop of a growing global population and diminishing arable land, improving rice yield and NUE has become an urgent issue that must be addressed to ensure food security. When selecting nitrogen-efficient rice varieties, an integrated assessment that simultaneously considers yield, quality, and nitrogen response is highly desirable to avoid trade-offs and achieve balanced improvement. Nevertheless, some varieties have demonstrated the ability to achieve high yields through efficient nitrogen utilization [7,8,9].

NUE is a complex trait influenced by numerous factors, including nitrogen uptake efficiency, nitrogen utilization efficiency, nitrogen remobilization, and their interactions with environmental conditions and agronomic management, leading researchers to employ different indicators for screening. Bashir [10] proposed relative tiller number and relative dry matter weight as suitable criteria for identifying nitrogen-efficient rice germplasm. Chen et al. [11] found that nitrogen-efficient varieties exhibited a significantly higher leaf area index at both the heading and maturity stages compared to inefficient varieties, suggesting this parameter as a useful indicator. Wang et al. [12] recommended using tiller number at the early growth stage, along with yield per plant, biomass, and the number of effective panicles throughout the growth cycle, as screening criteria. Furthermore, Chu et al. [13] demonstrated that high-yielding and nitrogen-efficient varieties are characterized by greater aboveground biomass, more grains per panicle, and higher total nitrogen uptake. Collectively, these studies provide valuable selection criteria and research directions for identifying high-yielding, nitrogen-efficient rice varieties. Mi [14] argued that high yield should not be the sole criterion in maize variety screening. They observed that high-yield and high-efficiency varieties exhibited superior performance in terms of ear leaf area at the filling stage, as well as Soil and Plant Analyzer Development (SPAD) values at flowering and physiological maturity, suggesting these traits as viable screening indicators. Similarly, Chen et al. [15] proposed tiller number at the early growth stage as an indicator for nitrogen-efficient rice varieties, while recommending yield per plant, biomass, and number of effective panicles across the entire growth cycle as additional criteria. Aspelund et al. [16] identified SPAD values as a useful parameter for screening nitrogen-efficient wheat. 
Gao et al. [17] emphasized the importance of key growth stages—heading and filling—for dry matter formation, NUE, and yield in wheat, suggesting these stages serve as critical indicators for evaluating NUE. Furthermore, their study indicated that net photosynthetic rate and intercellular CO₂ concentration during the grain-filling stage can objectively reflect characteristics of nitrogen-efficient varieties, thereby providing additional metrics for NUE assessment.

The findings summarized above collectively indicate that no universal standard exists for selecting nitrogen-efficient crop varieties, and that appropriate criteria should be established according to specific crops and cultivation practices. Given the multiplicity of evaluation factors, the use of principal component analysis (PCA) is well justified, as it effectively addresses multicollinearity and information redundancy among multiple indicators. By linearly transforming a set of interrelated agronomic parameters—such as yield, biomass, and nitrogen content—into a few independent principal components, PCA objectively weights these components based on the inherent variance structure of the data. This process minimizes subjective bias and yields a composite score that comprehensively reflects the overall nitrogen efficiency performance of each variety, thereby providing a scientific and reliable quantitative basis for the accurate screening and classification of nitrogen-efficient rice. Building on previous research, this study selected seven agronomic traits of rice—tiller number [18], plant height [19,20], leaf area [21], SPAD value [22,23], biomass [24], nitrogen content [25], and yield [26]—to construct such a comprehensive evaluation index, thus establishing a robust foundation for identifying nitrogen-efficient rice varieties.

Real-time and accurate monitoring of crop nitrogen status is crucial for effective field management and precision breeding [27]. Traditional methods for determining nitrogen levels involve destructive sampling of plant tissues followed by chemical analysis, which are difficult to scale for large-area field monitoring. With the advancement of agricultural remote sensing technology, numerous studies have demonstrated its capability to provide real-time monitoring data and facilitate nitrogen diagnostics in agricultural production.

The primary objective of research on NUE is to identify crop varieties characterized by high nitrogen uptake and utilization efficiency, thereby reducing nitrogen fertilizer input, lowering production costs, and promoting environmentally sustainable agricultural practices. Varieties with high nitrogen uptake efficiency are typically screened under field conditions by controlling nitrogen application; these varieties are identified by their ability to achieve high yield under limited nitrogen input. In contrast, varieties exhibiting high nitrogen utilization efficiency accumulate substantial dry matter biomass under low nitrogen input. However, accurately quantifying the total nitrogen available to crops remains challenging due to the combined contributions of soil nitrogen and externally applied fertilizer. To address this complexity, a common assumption in field studies is that the soil nitrogen supply is relatively uniform across plots receiving the same fertilizer treatment, thereby standardizing experimental conditions. Under this framework, grain yield at harvest is widely adopted as a key indicator for evaluating NUE in rice [28]. Using yield as a key indicator to identify and promote suitable nitrogen-efficient varieties has served as a fundamental approach for researchers worldwide. Han [29] conducted phenotypic characterization and yield analysis of sugar beet genotypes with varying nitrogen efficiency, demonstrating that dry matter production efficiency per unit of nitrogen uptake could serve as a reliable screening criterion for nitrogen-efficient varieties. Similarly, Li et al. [30] compared maize yields under nitrogen-applied and nitrogen-free conditions, classifying 16 varieties into four efficiency types based on mean yield values, thereby identifying high-yielding nitrogen-efficient genotypes.

Traditional screening methods are often time-consuming, labor-intensive, and destructive to crops. In contrast, modern approaches leveraging information technology enable rapid, large-scale, and non-destructive variety screening [31].

To identify fast-growing Chinese fir varieties, Zou et al. [32] acquired hyperspectral images to measure biomass, tree height, and diameter at breast height across different genotypes. After categorizing the varieties into three groups, they extracted vegetation indices and employed Decision Tree, Random Forest, Support Vector Machine, and XGBoost algorithms in Python (v3.8) to process the spectral data and construct classification models. In a study on screening wheat varieties adapted to late sowing, Wang et al. [33] collected multispectral images using a UAV during the overwintering stage of winter wheat. They extracted single-band reflectance and vegetation indices, applied three variable selection methods and four machine learning algorithms (including random forest, support vector machine, artificial neural network, and k-nearest neighbors), and selected the optimal model for estimating canopy SPAD values with high accuracy. This approach provided an effective UAV and machine learning-based method for real-time monitoring of chlorophyll content and screening of late-sown wheat varieties during the overwintering period. However, while their research focused on estimating a single physiological trait for a specific agronomic scenario, the present study aims to classify rice varieties directly into multi-trait-based nitrogen efficiency categories using an integrated suite of vegetation indices, with the goal of developing a more generalizable screening tool. Zhao [34] acquired both visible and multispectral images to investigate the relationship between spectral data and early senescence characteristics in wheat. Their analysis revealed a strong correlation between vegetation indices derived from spectral data and early senescence traits, with the NDVI index exhibiting particularly high inversion accuracy, supporting its use as a characteristic indicator for screening early senescence resistance. 
In a study on drought stress classification in potatoes, Zhang [35] collected leaf spectral data under different drought treatments and evaluated multiple classification models. By comparing their performance, the study identified the model with the highest accuracy as the optimal approach for grading drought severity in potato crops. However, UAV-based spectral screening for nitrogen-efficient variety selection remains limited.

Based on traditional methods for screening nitrogen-efficient rice varieties, this study explores an innovative approach that utilizes drone-acquired hyperspectral imagery. The methodology involves selecting characteristic wavelengths to calculate vegetation indices, which are then incorporated into four classification models alongside clustering labels derived from a comprehensive nitrogen efficiency evaluation index. By comparing the performance of these machine learning models, the optimal classifier was identified to output variety classification results and screen nitrogen-efficient rice varieties. Furthermore, the relationship between vegetation indices and NUE was analyzed based on screening outcomes, enabling the characterization of nitrogen-efficient varieties and identification of their representative spectral features. This research establishes a foundational framework for subsequent studies on nitrogen management and variety selection.

2. Materials and Methods

2.1. Rice Varieties and Cultivation Protocol

2.1.1. Plant Materials

A total of 60 rice varieties were selected as the experimental materials. The selection was aimed at covering a wide range of genetic diversity and agronomic performance, including commercial varieties, historical varieties, and advanced breeding lines, to ensure a representative test sample for evaluating the differences in nitrogen response among different varieties. The names of these varieties and their corresponding field codes are listed in Table 1.

2.1.2. Experimental Field Design

This experiment was conducted in the experimental field of the College of Agriculture, Yangzhou University during the 2022 and 2023 rice growing seasons.

This study was designed as a preliminary high-throughput screening experiment to evaluate the nitrogen response of 60 rice varieties, with each variety grown in a single plot. This “one variety per plot” design is a common and necessary compromise in large-scale phenotypic screening experiments, which aim to identify candidate varieties for more resource-intensive replicated trials in the future.

Importantly, agronomic parameters (e.g., plant height, SPAD, yield components) were not based on single-plant measurements but were derived from multiple individual plants or sub-samples within each plot (as detailed in Section 2.2.1), providing a representative mean value for the variety under that specific plot condition. The spectral data, acquired by UAV, inherently represent an average canopy response of the entire plot. Therefore, the final dataset for each variety consisted of two years of observational data, with each year’s data point being a plot-level mean.

Each plot comprised 10 rows × 18 columns, with one seedling planted per hole, giving a total of 180 basic seedlings per plot. Nitrogen fertilizer was applied at 270 kg·hm⁻², split 5:5 between base fertilizer and tillering fertilizer (each accounting for 50% of the total nitrogen application). Phosphorus and potassium fertilizers were applied as base fertilizer at 150 kg·hm⁻². Field management followed conventional practice. The 60 plots were arranged in a completely randomized design within the experimental field. The detailed layout of this field has been previously published, and the current study utilized the same setup [36].

2.2. Data Acquisition and Processing

2.2.1. Field Data Collection

Agronomic data were collected at key growth stages (seedling, tillering, heading, and maturity) using three approaches. First, tiller number, leaf area, biomass, and plant nitrogen content were estimated non-destructively by integrating image recognition and machine learning-based regression models [37,38]. Second, direct manual measurements were taken: plant height was averaged from three random plants per plot, and SPAD values (chlorophyll content) were averaged from nine measurements across the plant canopy using a SPAD-502Plus meter (Konica Minolta Inc., Osaka, Japan). Finally, yield was quantified destructively from a one-square-meter sample by assessing panicle number, filled/unfilled grains, and 1000-grain weight.

2.2.2. Methodology for Index Construction and Validation

Vegetation Index Formulation

We derived a comprehensive score for nitrogen use efficiency through PCA to synthesize the seven measured agronomic traits. The methodology was threefold. First, the data for all parameters were standardized. Subsequently, PCA was performed in the Python environment using the PCA function from the scikit-learn library, with all parameters calculated using the function’s default settings. Components with eigenvalues exceeding 1 were selected [39,40]. Finally, the comprehensive score was computed as the weighted sum of the selected principal components (F₁, F₂, …, F_m), with their variance contribution rates (λ₁, λ₂, …, λ_m) serving as the weights, as defined by the following equation:

F = \frac{\lambda_{1}}{\lambda_{1} + \lambda_{2} + \dots + \lambda_{m}} F_{1} + \frac{\lambda_{2}}{\lambda_{1} + \lambda_{2} + \dots + \lambda_{m}} F_{2} + \dots + \frac{\lambda_{m}}{\lambda_{1} + \lambda_{2} + \dots + \lambda_{m}} F_{m}

(1)

This score became the definitive metric for assessing overall NUE in the subsequent analysis.
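The scoring procedure described above can be sketched in a few lines. This is a minimal illustration using NumPy rather than the scikit-learn call used in the study; the function name `composite_score` and the random demo matrix are assumptions introduced here, not the authors' code or data. Note that the sign of each principal component is arbitrary, so in practice the components would be oriented with agronomic knowledge before ranking varieties.

```python
import numpy as np

def composite_score(X):
    """Return composite scores F and component weights for a (varieties x traits) array."""
    X = np.asarray(X, dtype=float)
    # Standardize each trait to zero mean and unit variance (z-scores).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Eigen-decomposition of the trait correlation matrix.
    R = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                          # retain components with eigenvalue > 1
    lam = eigvals[keep]
    scores = Z @ eigvecs[:, keep]                 # component scores F_1 ... F_m
    weights = lam / lam.sum()                     # lambda_i / (lambda_1 + ... + lambda_m)
    return scores @ weights, weights

# Hypothetical demo data: 60 varieties x 7 traits (random numbers, not study data).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 7))
F_demo, w_demo = composite_score(X_demo)
```

Because the weights are normalized over the retained components only, they sum to 1, which matches Equation (1).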

Regarding the composition of the comprehensive index (F): As is well known, grain yield is not only a key outcome of nitrogen fertilizer utilization but also a component of the comprehensive index F. By including yield and other agronomic traits, the aim is to construct a comprehensive indicator of the overall field performance under a given nitrogen fertilizer condition. The comprehensive index F is designed as a practical and comprehensive benchmark for screening, aiming to capture varieties that perform well across multiple dimensions (including final yield). Therefore, the subsequent performance category classification (high/medium/low) based on F reflects the comprehensive agronomic value rather than a mechanical division of nitrogen fertilizer utilization efficiency. This approach is in line with the application goals of this study, which is to develop a high-throughput phenotypic analysis tool for selection.

Validation of Index Construction Results

Bartlett’s Test of Sphericity was used to verify that the seven agronomic traits were sufficiently correlated for PCA [41]. The test assesses the degree of correlation among variables and is typically conducted prior to factor analysis to determine whether the variables are suitable for such analysis. It is based on the correlation coefficient matrix, with the null hypothesis that this matrix is an identity matrix, in which all diagonal elements are 1 and all off-diagonal elements are 0. The test statistic is derived from the determinant of the correlation coefficient matrix. If the statistic is large and its associated p-value is below the chosen significance level, the null hypothesis is rejected: the correlation matrix is not an identity matrix, the original variables are correlated, and they are therefore suitable for factor analysis. Conversely, if the null hypothesis is not rejected, the variables are not suitable for factor analysis. In this study, the results of Bartlett’s Test of Sphericity (approximate χ² = 72.315, df = 21, p < 0.001) confirmed the suitability of the data for PCA.
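For readers who want to reproduce the test, the standard form of the statistic is χ² = −(n − 1 − (2p + 5)/6)·ln|R| with p(p − 1)/2 degrees of freedom, where n is the number of samples, p the number of variables, and R the correlation matrix. The sketch below assumes this textbook formula; the function name and the random demo data are illustrative only. With p = 7 traits it yields df = 21, consistent with the value reported above.

```python
import numpy as np

def bartlett_sphericity(X):
    """Return (chi-square statistic, degrees of freedom) for Bartlett's test."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    # slogdet is numerically safer than log(det(R)) for near-singular matrices.
    _, logdet = np.linalg.slogdet(R)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return chi2, df

# Hypothetical demo data: 60 samples x 7 variables (random numbers, not study data).
rng = np.random.default_rng(1)
chi2_demo, df_demo = bartlett_sphericity(rng.normal(size=(60, 7)))
```

The p-value can then be obtained from the upper tail of the chi-square distribution, for example with `scipy.stats.chi2.sf(chi2, df)` if SciPy is available.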

Considerations on PCA Interpretation

It should be noted that PCA is a statistical method based on data. The principal components and their weights obtained reflect the dominant variation patterns within the given dataset containing seven agronomic features, rather than the pre-defined biological importance. This method minimizes subjective bias when constructing a composite index and effectively addresses the problem of multicollinearity. However, the biological interpretation of each component requires careful consideration and should be supported by supplementary agronomic knowledge (as discussed in Section 3.3.2).

2.3. Hyperspectral Image Data Acquisition

Hyperspectral imagery of the experimental plots was acquired using the DJI Matrice 600 Pro unmanned aerial vehicle (UAV) (DJI Innovation, Shenzhen, China) equipped with a GaiaSky-mini2 hyperspectral imaging system (Dualix Spectral Imaging Co., Ltd., Chengdu, China). Data collection was conducted between 10:00 a.m. and 12:00 p.m. under clear, cloudless, and windless conditions. Prior to each flight, the exposure time was calibrated using a diffuse reflectance reference panel. All images were captured manually with the UAV flying at an altitude of 100 m and an image overlap rate of 80%. The specifications of the Matrice 600 Pro UAV and the GaiaSky-mini2 hyperspectral imaging system are summarized in Table 2.

The technical specifications were sourced from the official DJI website (https://www.dji.com/cn/matrice600-pro, accessed on 1 December 2025) and the product manual of the GaiaSky-mini2 hyperspectral imaging system.

2.4. Preprocessing of Hyperspectral Image Data

The raw images acquired by the UAV were processed through a sequential preprocessing workflow. First, lens calibration, reflectance calibration, and atmospheric correction were performed using SpecView software (v3.1). The corrected images were then mosaicked using the HiSpectralStitcher software (Dualix Spectral Imaging Co., Ltd., Chengdu, China), with the output saved in ENVI format. Finally, region of interest (ROI) analysis was conducted for each experimental plot using ENVI software (v5.6). The mean reflectance of each band was extracted from images at four key growth stages—jointing, booting, heading, and maturity—and used as the spectral reflectance for the corresponding plot. These processed data served as the basis for subsequent analysis.
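The ROI step amounts to averaging, band by band, the reflectance of all pixels inside a plot. The study performed this in ENVI; the NumPy sketch below only illustrates the arithmetic, and the array layout (rows, columns, bands), the function name, and the toy image are assumptions made for the example.

```python
import numpy as np

def roi_mean_spectrum(cube, mask):
    """Average the reflectance of all ROI pixels, band by band.

    cube: array of shape (rows, cols, bands) holding calibrated reflectance.
    mask: boolean array of shape (rows, cols), True inside the plot ROI.
    """
    cube = np.asarray(cube, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    # Boolean indexing flattens the selected pixels to (n_pixels, bands).
    return cube[mask].mean(axis=0)

# Toy 2 x 2 image with 3 bands; the ROI covers the two diagonal pixels.
cube_demo = np.arange(12, dtype=float).reshape(2, 2, 3)
mask_demo = np.array([[True, False], [False, True]])
spectrum_demo = roi_mean_spectrum(cube_demo, mask_demo)
print(spectrum_demo)  # [4.5 5.5 6.5]
```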

2.5. Feature Parameter Extraction

Remote-sensing-derived spectral indices have been widely utilized to enhance the monitoring of crop physiological and biochemical responses. These indices have been successfully applied to estimate various agronomic parameters, including SPAD values, nitrogen content, biomass, and leaf area index [42]. Vegetation indices are designed to characterize the spectral reflectance performance of different crop types across specific wavelength bands, enabling the retrieval of crop growth status. The NUE of crops is an inherent physiological characteristic, but its external comprehensive performance is closely related to visible agronomic parameters such as biomass accumulation and chlorophyll content. Therefore, the spectral vegetation indices of these key agronomic parameters theoretically also have the potential to indirectly assess the nitrogen efficiency of varieties.

This association is based on a well-established set of physiological principles. Excellent NUE leads to measurable phenotypic advantages: (1) enhanced chlorophyll synthesis, as nitrogen is a core component of chlorophyll molecules, resulting in increased absorption in the red band and higher reflectance in the NIR band; these changes can be captured by chlorophyll-sensitive indices (such as NDVI and GNDVI); (2) increased biomass accumulation and canopy development, which result from improved photosynthetic efficiency and growth and significantly increase NIR reflectance, forming the basis of biomass-sensitive indices (such as RVI and DVI); (3) optimized canopy nitrogen content and structure, which can usually be detected through subtle changes in the red-edge region and quantified through indices such as NDRE and RECI. Therefore, under a uniform nitrogen input, varieties with high nitrogen use efficiency are expected to exhibit distinctive and superior spectral characteristics in these key vegetation indices. This comprehensive spectral phenotype can serve as a practical and non-destructive indicator of the complex trait of crop NUE, enabling machine learning models to classify varieties based on the spectral characteristics of their canopies.

Building upon previous research, this study selected and summarized a set of representative spectral parameters (Table 3) for subsequent variety screening analysis.
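As an illustration of how such indices are computed from band reflectance, the sketch below uses the commonly cited definitions of the six indices named above. The exact wavelengths and formulas used in the study are those listed in Table 3; the band arguments and reflectance values here are hypothetical.

```python
def vegetation_indices(green, red, red_edge, nir):
    """Compute common vegetation indices from mean band reflectance values."""
    return {
        "NDVI": (nir - red) / (nir + red),              # normalized difference (NIR vs red)
        "GNDVI": (nir - green) / (nir + green),         # green-band variant of NDVI
        "RVI": nir / red,                               # ratio vegetation index
        "DVI": nir - red,                               # difference vegetation index
        "NDRE": (nir - red_edge) / (nir + red_edge),    # red-edge normalized difference
        "RECI": nir / red_edge - 1.0,                   # red-edge chlorophyll index
    }

# Hypothetical plot-level reflectance values (not study data).
vi_demo = vegetation_indices(green=0.08, red=0.05, red_edge=0.20, nir=0.45)
```

A denser, greener canopy lowers red reflectance and raises NIR reflectance, so NDVI, RVI, and DVI all increase together in this toy example.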

2.6. Model Selection

2.6.1. Support Vector Machine

Support Vector Machine (SVM) is a machine learning method developed by Vapnik et al. based on statistical learning theory [62,63]. As a supervised binary classification model, its fundamental principle is to identify an optimal hyperplane that maximizes the margin between the two classes while maintaining classification accuracy. Denoting the optimal hyperplane as H, the data can be separated into two categories. Two parallel hyperplanes, H₁ and H₂, are constructed such that they are equidistant from H and pass through the closest data points of each class. These critical data points located on H₁ and H₂ are referred to as support vectors. Although SVM is inherently a binary classifier, it has been extended to multi-class classification by constructing multiple binary classifiers—typically using a one-versus-one or one-versus-rest strategy—and combining their outputs to determine the final class assignment [64].

2.6.2. Naive Bayes

Naive Bayes is a widely used classification model that represents a modification of the standard Bayesian approach. The key distinction lies in its core assumption of feature independence, which posits that all input features are conditionally independent of each other given the class label. The algorithm operates as follows: first, the prior probability for each class is determined. Then, using Bayes’ theorem, the posterior probability of the feature set belonging to a particular class is computed. Finally, the class with the highest posterior probability is assigned as the predicted category. In simpler terms, for a given sample under specified conditions, the class with the greatest probability of occurrence is identified as the classification outcome.

Assuming there are n variety categories, for a given rice sample X, the classifier predicts its category y by computing the posterior probability P(C_y∣X) using the Bayesian formula:

P(C_{y} | X) = \frac{P(X | C_{y}) P(C_{y})}{\sum_{i = 1}^{n} P(C_{i}) P(X | C_{i})}

(2)
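A small worked example may make Equation (2) concrete. The sketch below assumes Gaussian class-conditional likelihoods for each feature, which is one common choice and is not stated in the text; the class names, means, standard deviations, and the sample are all hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def naive_bayes_posterior(x, priors, means, stds):
    """Posterior P(C | x) for each class under the feature-independence assumption."""
    joint = {}
    for c in priors:
        likelihood = 1.0
        # Independence: P(x | C) is the product of per-feature likelihoods.
        for xi, m, s in zip(x, means[c], stds[c]):
            likelihood *= gaussian_pdf(xi, m, s)
        joint[c] = priors[c] * likelihood
    evidence = sum(joint.values())        # denominator of Equation (2)
    return {c: joint[c] / evidence for c in joint}

# Hypothetical two-feature example (e.g., two vegetation indices).
priors_demo = {"high": 1 / 3, "medium": 1 / 3, "low": 1 / 3}
means_demo = {"high": (0.80, 0.60), "medium": (0.65, 0.45), "low": (0.50, 0.30)}
stds_demo = {c: (0.05, 0.05) for c in priors_demo}
post_demo = naive_bayes_posterior((0.78, 0.58), priors_demo, means_demo, stds_demo)
pred_demo = max(post_demo, key=post_demo.get)
```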

2.6.3. CART Decision Tree

The Classification and Regression Tree (CART) algorithm, introduced by Breiman in 1984 [65,66], operates by recursively identifying optimal feature thresholds within complex, irregularly distributed data to construct a binary tree structure for classification and prediction. During the tree growth process, the algorithm employs the Gini index—a measure derived from economics—as the primary criterion for feature selection and split point determination. The Gini index is defined as follows [67]:

\mathrm{GiniIndex}(h) = 1 - \sum_{j = 1}^{J} P^{2}(j | h)

(3)

P(j | h) = \frac{n_{j}(h)}{n(h)}

(4)

\sum_{j = 1}^{J} P(j | h) = 1

(5)

In the formula, P(j | h) represents the probability of the jth category when a sample is randomly selected from the training set and the test variable value is h; n_{j}(h) is the number of training samples belonging to the jth category when the test variable value is h; n(h) is the total number of training samples with a test variable value of h; and J is the number of categories.
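The Gini index of Equation (3) and its use for scoring a candidate split can be sketched as follows. The helper `split_gini`, which weights the impurity of the two child nodes by their size, reflects the usual CART procedure; the function names and toy labels are assumptions for illustration.

```python
from collections import Counter

def gini_index(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def split_gini(values, labels, threshold):
    """Size-weighted Gini impurity of the two children produced by a threshold split."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    return sum(len(part) / n * gini_index(part) for part in (left, right) if part)

labels_gini_demo = ["a", "a", "b", "b"]
print(gini_index(labels_gini_demo))                      # 0.5
print(split_gini([1, 2, 3, 4], labels_gini_demo, 2))     # 0.0
```

A split that separates the two classes perfectly drives the weighted impurity to zero, which is why the tree prefers it.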

2.6.4. K-Nearest Neighbors

The K-Nearest Neighbors (KNN) algorithm is a widely used supervised learning method known for its simplicity and effectiveness [68]. The algorithm operates by calculating the distance between an unknown sample and all known labeled samples in the feature space, identifying the K closest neighbors, and then classifying or predicting the unknown sample based on the majority class or average value of these neighbors [69]. The selection of the K value is typically based on empirical knowledge and domain-specific considerations. While values of 3 or 5 are commonly used, K may extend up to 100 in certain applications. It is standard practice to choose an odd number for K to prevent ties in the voting process when dealing with binary classification problems.
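The voting rule described above can be written in a few lines of standard-library Python. This is a minimal sketch with Euclidean distance and unweighted majority voting; the function name and the two-feature training points are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    # Pair each training sample's distance to the query with its label, then sort.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training set: two spectral features per variety.
train_X_demo = [(0.80, 0.60), (0.78, 0.58), (0.60, 0.40),
                (0.62, 0.42), (0.40, 0.20), (0.42, 0.22)]
train_y_demo = ["high", "high", "medium", "medium", "low", "low"]
knn_demo = knn_predict(train_X_demo, train_y_demo, (0.79, 0.59), k=3)
print(knn_demo)  # high
```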

2.6.5. K-Means Clustering

The K-means algorithm, introduced by MacQueen in 1967, is an iterative clustering method that partitions data points into K clusters by iteratively assigning data points to the nearest cluster center and updating the centroids until convergence is achieved. The primary objective of the algorithm is to minimize the total within-cluster variance, expressed as the sum of squared errors (SSE) between data points and their corresponding cluster centroids. While K-means is computationally efficient and suitable for large datasets, it is sensitive to the initial selection of cluster centers and may converge to local optima [70].

Definition 1:

Euclidean Distance between Data Points:

d (x, y) = \sqrt{\sum_{i = 1}^{n} {(x_{i} - y_{i})}^{2}}

(6)

Definition 2:

Euclidean Distance between a Data Point and a Cluster Center:

d(x, c_{i}) = \sqrt{\sum_{j = 1}^{n} {(x_{j} - c_{ij})}^{2}}

(7)

where c_{ij} is the jth coordinate of cluster center c_i.

Definition 3:

Sum of Squared Errors (SSE) Formula:

The Sum of Squared Errors (SSE) is defined as the total squared distance between each data point and its assigned cluster centroid, calculated as:

SSE = \sum_{i = 1}^{k} \sum_{x \in C_{i}} {|d(x, c_{i})|}^{2}

(8)

where k is the number of clusters; C_i represents the i-th cluster; c_i is the center of C_i; and x is a data object assigned to C_i.

K-means Algorithm Procedure.

1. For clustering 60 samples into 3 groups, begin by selecting 3 initial cluster centers.
2. Assign each data point to the cluster with the nearest center, based on Euclidean distance.
3. Recalculate the cluster centers by computing the mean of all data points within each cluster.
4. Repeat steps 2 and 3 until the cluster centers stabilize (no further changes) or the maximum number of iterations is reached.

The optimal number of clusters (k) was verified using the elbow method and the silhouette coefficient. The elbow method showed that the rate of decrease in distortion reached a turning point at k = 3, and the average silhouette coefficient was also highest at k = 3. Therefore, k = 3 was taken as the optimal number of clusters. We used the KMeans function from scikit-learn for clustering, with n_clusters = 3 and all other parameters left at their defaults. Finally, the 60 varieties were classified into three categories: nitrogen-efficient, nitrogen-moderate-efficient, and nitrogen-inefficient groups.
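The four steps above can be sketched directly in NumPy. This is an illustrative re-implementation, not the scikit-learn routine used in the study; the initial centers are passed in explicitly (an assumption made so the example is deterministic), and the one-dimensional demo scores are hypothetical.

```python
import numpy as np

def assign(X, centers):
    """Index of the nearest center (Euclidean distance) for every row of X."""
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1)

def kmeans(X, init_centers, max_iter=100):
    """Plain K-means: returns (labels, centers, SSE)."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float)      # step 1: initial centers
    for _ in range(max_iter):
        labels = assign(X, centers)                      # step 2: nearest center
        new_centers = np.array([                         # step 3: recompute means
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):            # step 4: stop when stable
            break
        centers = new_centers
    labels = assign(X, centers)
    sse = float(((X - centers[labels]) ** 2).sum())      # Equation (8)
    return labels, centers, sse

# Hypothetical composite scores for nine varieties forming three obvious groups.
scores_demo = np.array([[-2.0], [-1.9], [-2.1], [0.0], [0.1], [-0.1],
                        [2.0], [2.1], [1.9]])
labels_demo, centers_demo, sse_demo = kmeans(scores_demo,
                                             init_centers=scores_demo[[0, 3, 6]])
```

Running the same function for several values of k and plotting the returned SSE against k gives the elbow curve referred to above; a poor choice of initial centers can leave the algorithm in a local optimum with a higher SSE.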

2.7. Model Evaluation Metrics

The performance of this classification model was evaluated through five standard metrics: accuracy (the proportion of correctly classified samples) [71,72], precision (positive predictive value), F1 score (the harmonic mean of precision and recall), Cohen’s kappa coefficient (the consistency between predicted results and true labels, adjusted for the influence of random errors), and Hamming distance (the proportion of wrongly predicted labels).
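The five metrics can be written out explicitly for a three-class problem. The sketch below is illustrative only: the text does not state how precision and F1 were averaged across classes, so macro-averaging is an assumption here, and the twelve labels are hypothetical rather than the study's validation predictions.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Proportion of correctly classified samples."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def hamming_distance(y_true, y_pred):
    """Proportion of wrongly predicted labels (1 - accuracy for single labels)."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_precision_f1(y_true, y_pred):
    """Per-class precision and F1, averaged with equal class weights (assumed)."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, f1_scores = [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        precisions.append(prec)
        f1_scores.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(precisions) / len(classes), sum(f1_scores) / len(classes)


def cohen_kappa(y_true, y_pred):
    """Observed agreement corrected for the agreement expected by chance."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical labels for 12 samples: 9 correct, 3 wrong.
y_true = ["HNE"] * 4 + ["MNE"] * 4 + ["LNE"] * 4
y_pred = ["HNE", "HNE", "HNE", "MNE",
          "MNE", "MNE", "MNE", "LNE",
          "LNE", "LNE", "LNE", "HNE"]

acc = accuracy(y_true, y_pred)
ham = hamming_distance(y_true, y_pred)
precision, f1 = macro_precision_f1(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(acc, ham, precision, f1, kappa)
```

For single-label classification the Hamming distance equals one minus the accuracy, so the two metrics carry the same information in this setting.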

2.8. Data Processing

Data analysis and visualization were performed using Office 2021, IBM SPSS Statistics 26, PyCharm Community Edition 2020.1.2 x64, and Origin 2022 software. The original data and related code of this experiment have been made publicly available on GitHub (https://github.com/Mr-J-YH/NUE.git, accessed on 1 December 2025).

3. Results and Analysis

3.1. Analysis of Agronomic Parameter Variations Across Varieties

To explore the relationships among different varieties in terms of agronomic traits, this study analyzed the average values of six traits (number of tillers, plant height, SPAD value, leaf area, biomass, and plant nitrogen content) at four key growth stages, as well as the yield data at the maturity stage.

As shown in Table 4, significant differences in agronomic parameters were observed among the 60 rice varieties. Based on the screening criteria for nitrogen-efficient rice varieties outlined in Section 1, seven previously validated agronomic indicators were selected: tiller number, plant height, SPAD, leaf area, biomass, yield, and plant nitrogen content. The coefficients of variation (CV) for these parameters ranged from 7.39% to 20.98%. Yield exhibited the highest CV (20.98%), indicating its importance as a key distinguishing indicator for screening nitrogen-efficient varieties. Leaf area showed the second highest CV (17.42%), while SPAD had the smallest CV (7.39%), suggesting minimal variation in SPAD values among different varieties. Based on the CV values, the degree of phenotypic variation across varieties was highest for yield, followed by leaf area, tiller number, biomass, plant height, plant nitrogen content, and SPAD. This suggests substantial genotypic diversity in these traits within the tested panel, particularly for yield, which provides a rationale for using multi-trait integration in screening.
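The coefficient of variation compared above is the standard deviation expressed as a percentage of the mean. A minimal sketch follows. The use of the sample standard deviation (n − 1 denominator) is an assumption, and the three yield values are hypothetical rather than taken from Table 4.

```python
import statistics


def coefficient_of_variation(values):
    """CV (%) = sample standard deviation / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Hypothetical yields for three varieties (arbitrary units).
example_yields = [8.0, 10.0, 12.0]
cv = coefficient_of_variation(example_yields)
print(f"CV = {cv:.2f}%")
```

Because the CV is dimensionless, it allows traits measured in different units (for example, plant height and SPAD) to be ranked on the same scale, as in the ordering given above.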

3.2. Correlation Analysis of Agronomic Parameters Across Varieties

A correlation analysis of seven agronomic parameters across different rice varieties revealed complex interrelationships (Figure 1). Tiller number showed significant positive correlations with nitrogen content and SPAD value, but significant negative correlations with leaf area and yield, along with a highly significant negative correlation with plant height. Plant height demonstrated a highly significant positive correlation with leaf area but a significant negative correlation with yield. SPAD value was significantly positively correlated with biomass and nitrogen content, yet significantly negatively correlated with leaf area and yield. Leaf area exhibited significant negative correlations with both biomass and nitrogen content, while biomass was significantly negatively correlated with yield. These intricate correlations indicate substantial information overlap among parameters, demonstrating that no single or limited set of indicators can adequately assess NUE. Consequently, PCA was employed to integrate all seven parameters, providing a comprehensive approach for screening nitrogen-efficient rice varieties while overcoming the limitations of single-parameter evaluation.

The significance of the correlation coefficients was assessed using a two-tailed t-test (* p < 0.05, ** p < 0.01). It is noteworthy that with a sample size of n = 60, even moderate correlation coefficients (e.g., |r| > ~0.25) can reach statistical significance at the p < 0.05 level.
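The threshold of roughly 0.25 follows from the t-test for a Pearson correlation, t = r·sqrt(n − 2)/sqrt(1 − r²), solved for r. In the sketch below, the critical t value of 2.002 (two-tailed, p = 0.05, 58 degrees of freedom) is a value from standard t tables, hard-coded as an assumption so that the example needs no statistics library; the function names are illustrative.

```python
import math


def t_statistic(r, n):
    """t statistic for testing a Pearson correlation r from n samples."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


def critical_r(n, t_crit):
    """Smallest |r| that reaches significance for a given critical t."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)


N_VARIETIES = 60
T_CRIT_05 = 2.002  # tabulated two-tailed 5% critical t for 58 df (assumed)

r_crit = critical_r(N_VARIETIES, T_CRIT_05)
print(f"|r| must exceed about {r_crit:.3f} for p < 0.05")
```

With 60 varieties, a correlation that explains under 7% of the variance (r² ≈ 0.065) can therefore still be flagged as significant, which is why the significance markers should be read together with the magnitude of r.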

The color intensity and the numerical values in each cell represent Pearson’s correlation coefficient (“r”) between the row and column traits. Red shades indicate positive correlations, while blue shades indicate negative correlations. Asterisks denote statistical significance levels (*: p < 0.05, **: p < 0.01). The traits include tiller number (TN), plant height (PH), SPAD value (SPAD), leaf area (LA), aboveground biomass (BM), grain yield (YLD), and plant nitrogen content (PNC).

3.3. PCA of Agronomic Parameters Across Varieties

3.3.1. Extraction of Principal Components

As shown in Table 5, the Kaiser–Meyer–Olkin (KMO) measure was 0.504, and Bartlett’s test of sphericity was significant (p < 0.001). According to the standard criteria for PCA suitability (KMO > 0.5 and p < 0.05), the dataset was deemed appropriate for PCA, although the KMO value only marginally exceeded the threshold.

Table 6 presents the communalities after principal component extraction, indicating the proportion of variance explained for each variable. A communality value represents the amount of information retained from the original variable, with higher values (closer to 1) indicating better information preservation. As shown in the table, all communalities exceeded 0.4. Plant height (0.79) and leaf area (0.80) demonstrated the highest communalities, suggesting these variables contribute substantial information to the principal components. In contrast, SPAD (0.49) and biomass (0.49) showed the lowest communalities, indicating relatively weaker information contribution among the analyzed parameters.

As shown in Figure 2, three principal components were retained based on the eigenvalue-greater-than-one criterion, and the varimax rotation method was applied to enhance the interpretability of the factor structure. The eigenvalues for Component 1, Component 2, and Component 3 were 1.876, 1.752, and 1.044, respectively. The variance contribution rates were 26.80% for Component 1, 25.02% for Component 2, and 14.92% for Component 3, resulting in a cumulative contribution rate of 66.74% for the three components. This indicates that nearly two-thirds of the total variance in the original seven agronomic parameters is effectively captured by these three composite dimensions, achieving a meaningful reduction in data complexity while retaining most of the critical information.

Figure 2 shows the eigenvalues and cumulative contribution rates of the components, and the scree plot supports the choice of three components. The first three components account for the majority of the total variance, and the curve shows a clear turning point (the “elbow”) after the third component, beyond which the eigenvalues decrease gradually and level off. This visual criterion reinforces the eigenvalue-greater-than-one criterion: the first three components carry most of the explanatory power, while each subsequent component contributes comparatively little. Therefore, extracting three principal components is statistically reasonable and conceptually suitable for the subsequent clustering and classification analyses.
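The variance contribution rates can be checked against the reported eigenvalues. The sketch assumes that the PCA was run on standardized variables (a correlation matrix), in which case the total variance equals the number of variables (seven) and each rate is the eigenvalue divided by seven. That setting is not stated explicitly in the text and is an assumption here.

```python
# Eigenvalues of the three retained components, as reported in the text.
eigenvalues = [1.876, 1.752, 1.044]
n_variables = 7  # seven agronomic parameters

# Contribution rate (%) of each component = eigenvalue / total variance.
rates = [100 * ev / n_variables for ev in eigenvalues]
cumulative = sum(rates)

for index, rate in enumerate(rates, start=1):
    print(f"Component {index}: {rate:.2f}%")
print(f"Cumulative: {cumulative:.2f}%")
```

Under this assumption the computed rates agree with the reported 26.80%, 25.02%, 14.92%, and 66.74% to within about 0.01 percentage points, a difference attributable to the eigenvalues being rounded to three decimals.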

3.3.2. Calculation of Principal Component Scores

Based on the component score coefficient matrix obtained through principal component analysis (Table 7), this study further revealed the contribution intensity and direction of various agronomic parameters to the principal components. Specifically, in the first principal component, the plant nitrogen content showed the highest positive score coefficient (0.488), indicating that it contributed the most to the formation of the first principal component, while leaf area showed a slight negative contribution (−0.04). In the second principal component, leaf area showed the strongest positive influence (0.50), while the number of tillers showed a significant negative contribution (−0.36). In the third principal component, yield showed the highest positive coefficient (0.79), highlighting its dominant role, while biomass showed a significant negative contribution (−0.34).

These score coefficients not only quantified the driving direction of each variable in the principal component scores but also revealed the relative importance of different agronomic parameters in the comprehensive evaluation system. By analyzing the coefficient patterns, it can be observed that plant nitrogen content and leaf area respectively occupied the dominant positions in the first and second principal components, indicating that nitrogen accumulation and canopy development are two relatively independent physiological dimensions. The prominent performance of yield in the third principal component confirmed its special status as the final output indicator. This hierarchical contribution pattern provided a theoretical basis for the subsequent construction of comprehensive evaluation indicators and explained why a single agronomic parameter cannot comprehensively assess the nitrogen use efficiency of rice: because different parameters represent different aspects of the nitrogen utilization process. It should be noted that, although the availability of nitrogen is physiologically associated with canopy growth, the PCA analysis in this study indicates that, under our experimental conditions, the variation patterns of these trait complexes are statistically separable.

As shown in Table 8, rice varieties with different nitrogen use efficiencies exhibited distinct values across the comprehensive indicators. For the first composite indicator, S51 showed the highest value (1.22), indicating the best performance, while S19 had the lowest value (−1.66), reflecting the poorest performance. For the second composite indicator, S36 achieved the maximum value (1.94), representing the optimal performance, whereas S60 displayed the minimum value (−1.60), indicating the least favorable performance. For the third composite indicator, S52 recorded the highest value (2.57), demonstrating the best performance, while S01 showed the lowest value (−0.25), representing the weakest performance.

3.4. Cluster Analysis of Comprehensive Evaluation Indicators for NUE in Rice

In order to classify the varieties based on the overall agronomic performance of crops under experimental nitrogen fertilizer conditions, we used the K-means algorithm to conduct a cluster analysis on the comprehensive scores (F) obtained through principal component analysis. These scores (Figure 3, labeled as “Comprehensive Evaluation Index of Nitrogen Utilization Efficiency”) represent the data-driven comprehensive results of seven measurement traits and can be used as practical multi-trait indicators for screening. The varieties were classified into three distinct groups: high-nitrogen-efficient (HNE), medium-nitrogen-efficient (MNE), and low-nitrogen-efficient (LNE) varieties. The clustering results identified 22 HNE, 21 MNE, and 17 LNE varieties.

3.5. Construction and Validation of the Variety Screening Model

The screening model for nitrogen-efficient rice varieties used 20 vegetation indices calculated from UAV hyperspectral images as independent variables, and the nitrogen efficiency grades obtained by clustering the comprehensive nitrogen efficiency scores as the dependent variable.

Independence between predictive features and target labels: it is necessary to clearly identify the source of the data used for model training to avoid concerns about circular reasoning. The target label (i.e., nitrogen efficiency grade) is entirely derived from agronomic traits measured in the field through principal component analysis (PCA) and K-means clustering. On the other hand, the predictive features (i.e., vegetation index) are calculated solely from hyperspectral images obtained by drones, representing a completely independent data pattern. Therefore, the machine learning model does not learn to reconstruct the clustering based on PCA; its goal is to discover the relationship between canopy remote sensing spectral phenotypes and the comprehensive performance categories defined by ground truth data. The successful verification of this mapping validates the effectiveness of spectral indices as non-destructive alternative indicators based on ground benchmarks.

Four machine learning methods were employed to construct the variety screening model: Support Vector Machine, K-Nearest Neighbor, Classification Decision Tree, and Naive Bayes. For each model, the samples were randomly divided into a training set and a validation set in an 8:2 ratio; the model was fitted on the training set, and the validation set was used to evaluate the trained model. Five indicators, namely, accuracy, precision, F1-score, Kappa coefficient, and Hamming distance (ham_distance), were selected to evaluate the performance of the rice nitrogen-efficient variety screening model.
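A random 8:2 split of the 60 varieties can be sketched as follows. The seed, the function name, and the absence of stratification by class are assumptions; the text does not specify how the random division was implemented.

```python
import random


def split_train_validation(sample_ids, train_fraction=0.8, seed=42):
    """Shuffle the sample IDs and cut them into training and validation parts."""
    rng = random.Random(seed)  # fixed seed (assumed) for a repeatable split
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Variety identifiers S01 ... S60, following the naming used in Table 8.
variety_ids = [f"S{i:02d}" for i in range(1, 61)]
train_ids, val_ids = split_train_validation(variety_ids)
print(len(train_ids), len(val_ids))
```

With 60 varieties this gives 48 training and 12 validation samples, so each validation sample accounts for about 8.3 percentage points of accuracy.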

Based on the 20 vegetation indices calculated, a nitrogen-efficient variety screening model for rice was constructed using machine learning methods. Figure 4 shows the validation results of these four classification models on the test dataset. The confusion matrix diagrams of the four models clearly demonstrate the differences among the various models. We can observe that the performance of the SVM and CART models is relatively better.

As shown in Figure 5, the four classification models were validated and subsequently evaluated using the calculated performance metrics. Of the five selected evaluation indicators, accuracy, precision, F1-score, and the Kappa coefficient are positive indicators (higher is better), while Hamming distance is a negative indicator (lower is better). To facilitate comprehensive model comparison, the negative indicator (Hamming distance) was inversely normalized, and a composite score was calculated by applying weighted integration to all five metrics.
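The composite scoring can be illustrated with a short sketch. The weights are not given in the text, so equal weights are assumed here, and the inverse normalization is taken to be 1 minus the Hamming distance. The Decision Tree values are those quoted from Table 9, with the raw Hamming distance of 0.25 inferred from the reported accuracy of 0.75.

```python
METRIC_NAMES = ["accuracy", "precision", "f1", "kappa", "hamming_distance"]


def inverse_normalize(hamming_distance):
    """Turn the negative indicator into a positive one (assumed: 1 - value)."""
    return 1.0 - hamming_distance


def composite_score(metrics, weights=None):
    """Weighted sum of the five indicators after inverting Hamming distance."""
    if weights is None:
        # Equal weights are an assumption; the study's weights are not stated.
        weights = {name: 1.0 / len(METRIC_NAMES) for name in METRIC_NAMES}
    values = dict(metrics)
    values["hamming_distance"] = inverse_normalize(metrics["hamming_distance"])
    return sum(weights[name] * values[name] for name in METRIC_NAMES)


decision_tree = {
    "accuracy": 0.75,
    "precision": 0.80,
    "f1": 0.74,
    "kappa": 0.62,
    "hamming_distance": 0.25,  # inferred as 1 - accuracy
}
dt_score = composite_score(decision_tree)
print(f"Decision Tree composite score: {dt_score:.3f}")
```

Since the inverted Hamming distance coincides with accuracy for single-label data, equal weighting effectively counts accuracy twice. A different weighting scheme would change the composite values and could change the ranking of closely matched models.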

As shown in Table 9, the Decision Tree and Support Vector Machine models achieved the highest accuracy (0.75), while Naive Bayes showed the lowest (0.50), indicating the superior classification accuracy of the former two models. Similarly, the Decision Tree and SVM attained the highest F1-scores (0.74), with Naive Bayes again performing the poorest, reflecting better alignment between predicted and expected results for these models. In terms of precision, the Decision Tree led (0.80), significantly outperforming Naive Bayes (0.24), demonstrating its higher reliability in positive class prediction. The Kappa coefficient was highest for the Decision Tree (0.62) and lowest for Naive Bayes (0.24), confirming stronger agreement between actual and predicted classifications for the Decision Tree. After inverse normalization, both SVM and Decision Tree achieved the maximum Hamming distance score (0.75), whereas Naive Bayes scored the minimum (0.5), indicating smaller discrepancies between actual and validated data for SVM and Decision Tree. Based on comprehensive evaluation, the Decision Tree demonstrated the best overall classification performance, followed by SVM, with K-NN showing moderate results and Naive Bayes performing the weakest. Therefore, the Decision Tree was selected as the final classification model.

3.6. Screening Results for Nitrogen-Efficient Rice Varieties

Based on the comprehensive performance evaluation of the four classification models (accuracy, precision, F1-score, Kappa coefficient, and Hamming distance), the decision tree was selected as the optimal classification model for this study, and its final classification results were output. The optimal partition attribute at each node was selected by minimizing the Gini index: that is, the vegetation index that best discriminated between the classes and most reduced class uncertainty was chosen as the branch node, yielding a binary tree. The model first used EVI as the first-level branch feature (lowest Gini index). EVI is treated here as a core index of leaf chlorophyll content (chlorophyll nitrogen accounts for approximately 50% to 70% of total leaf nitrogen), so it can reflect the basic nitrogen reserve of the crop. When EVI > 0.001 (18 samples), chlorophyll content is high and the leaf nitrogen reserve is sufficient, and the variety is assigned to HNE; when EVI ≤ 0.001 (30 samples), chlorophyll synthesis and the leaf nitrogen reserve are insufficient, and the variety belongs to either the MNE or the LNE type. The remaining branches are based on MTVI2, EXG, TVI/NDVI, etc. The complete classification results of the model (training set) are detailed in Figure S1.
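The Gini-based choice of a split threshold can be sketched for a single feature. This is a simplified illustration of the criterion, not the fitted tree in Figure S1. The index values and class labels are hypothetical, and candidate thresholds are taken as midpoints between consecutive sorted values, which is one common convention.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(values, labels, threshold):
    """Size-weighted impurity of the two branches produced by a threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_threshold(values, labels):
    """Midpoint threshold with the lowest weighted Gini impurity."""
    xs = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return min(candidates, key=lambda t: weighted_gini(values, labels, t))


# Hypothetical vegetation-index values and nitrogen-efficiency classes.
index_values = [-0.02, -0.01, 0.0, 0.01, 0.02, 0.03]
classes = ["LNE", "MNE", "MNE", "HNE", "HNE", "HNE"]

threshold = best_threshold(index_values, classes)
print(threshold, round(weighted_gini(index_values, classes, threshold), 4))
```

In this toy example the chosen threshold separates all HNE samples into a pure branch and leaves MNE and LNE mixed in the other, which mirrors the structure of the first-level split described above.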

Table 10 summarizes the final classification results of this model for all 60 tested varieties. From an agronomic perspective, the plausibility of this classification is supported by the following correspondences: the varieties classified as HNE all show higher values of vegetation indices such as EVI, NDVI, and TVI, consistent with favorable field traits such as “high chlorophyll content, strong photosynthetic efficiency, and sufficient nitrogen reserves”; meanwhile, the LNE group generally shows lower NDVI and NLI, consistent with the agronomic characteristics of “slow growth, weak plants, and insufficient nitrogen reserves” in actual cultivation. The model provides a quantitative basis for subsequent variety selection and optimization of field management (such as increasing nitrogen fertilizer application for low-nitrogen-efficiency varieties and reducing it for nitrogen-efficient varieties).

4. Discussion

Previous studies of the relationship between the nitrogen use efficiency (NUE) of rice and agronomic parameters have shown that nitrogen fertilizer management has a significant impact on the growth characteristics and yield of rice. These studies typically indicate that appropriate nitrogen fertilizer management can significantly increase the biomass, SPAD value, number of tillers, and yield of rice [73,74,75,76]. The negative correlation between SPAD and yield reported in the Results section is therefore counterintuitive. Two explanations are possible. First, some varieties absorb excess nitrogen and store it in the leaves (manifested as high SPAD values), a pattern known as luxury consumption, but fail to assimilate this nitrogen effectively and translocate it to the grains for yield formation, leading to high nitrogen content together with a low harvest index. Second, in the later growth stages, the maturing panicles shade the middle and lower leaves, which are the main positions where SPAD was measured. The SPAD readings of these leaves in high-yielding varieties might therefore be underestimated, while low-yielding varieties, with fewer leaves and a sparser canopy whose leaves receive light more evenly, can show higher SPAD values. Thus, a high SPAD value does not directly guarantee high yield. This finding highlights the limitations of using SPAD values alone to assess the overall nitrogen efficiency of varieties, thereby reinforcing the necessity of adopting multi-parameter comprehensive evaluation indicators.

These observed trade-off relationships also extend to other key indicators. Under our experimental conditions, biomass and plant height are also negatively correlated with yield. This pattern is likely to reflect the differentiated strategies among varieties: some varieties invest excessively in vegetative growth at the expense of grain filling, while taller plants may face greater risk of lodging or a less efficient canopy structure. Overall, these complex and sometimes counter-intuitive interactions indicate that screening based solely on a single trait is unreliable. They directly confirm the rationality of our use of the principal component analysis composite index (F) for screening, which can comprehensively reflect the overall performance of varieties by balancing multiple trade-off relationships.

Moreover, it has been noted that excessive use of nitrogen fertilizer may not further increase yield and may even have adverse effects on the environment [77,78,79,80,81]. Consistent with the existing literature, this study confirmed positive associations between NUE and key agronomic parameters including biomass, nitrogen content, and tiller number. The relationship with SPAD, however, was more nuanced, as discussed earlier. Root-related traits were excluded because of the practical difficulty of measuring them directly in the soil, and photosynthetic parameters were omitted because their relationship with NUE remains unclear, which makes them currently unsuitable for inclusion in a composite evaluation index.

The use of vegetation indices derived from UAV remote sensing imagery for assessing NUE in rice represents a non-destructive and efficient agricultural monitoring approach. Previous studies have demonstrated significant correlations between vegetation indices and rice NUE, confirming the feasibility of retrieving nitrogen utilization efficiency from spectral data [82,83]. Nitrogen-efficient rice varieties typically exhibit higher values in key vegetation indices, reflecting their enhanced capacity to absorb and utilize soil nitrogen.

The application of vegetation indices for screening nitrogen-efficient varieties offers several advantages [84,85]: (1) enabling large-scale, non-invasive monitoring without disrupting crop growth; (2) significantly reduced costs compared to traditional methods involving field sampling and laboratory analysis; (3) comprehensive crop assessment beyond nitrogen status, as vegetation indices can reflect multiple agronomic parameters and overall growth conditions. However, limitations exist, including susceptibility to environmental factors and weather conditions. Therefore, integrating multiple vegetation indices with ground-truth data is essential for improving assessment accuracy. Meanwhile, this experiment also has some shortcomings. For instance, only data from above the rice canopy were used, so differences in growth conditions among plots could affect the accuracy of data acquisition. In addition, because of limited land resources, we could not strictly ensure uniform soil nitrogen content across plots; some differences will always remain, and this is one of the main sources of error in this experiment.

Future research based on this study should focus on: (1) investigating the accuracy and applicability of vegetation indices across different climatic conditions and soil types to enhance methodological robustness; (2) developing more precise and comprehensive rice NUE evaluation models by integrating UAV remote sensing, empirical measurements, and crop growth theory; and (3) applying the screening model to the breeding of superior varieties, such as screening rice varieties for drought resistance and flood tolerance.

5. Conclusions

This study successfully constructed and verified a non-destructive screening method for high-nitrogen-efficient rice varieties based on UAV hyperspectral imaging and machine learning. The study demonstrated that by integrating multiple agronomic parameters through principal component analysis (PCA) into a comprehensive evaluation index for nitrogen utilization efficiency, it is possible to more comprehensively and stably assess the nitrogen efficiency of rice varieties, overcoming the limitations of single parameters. The study determined the optimal spectral screening model. Among the four machine learning models compared, the DT model exhibited the best classification performance and interpretability.

Potential application pathways: The decision tree model developed in this study provides practical decision-making tools for crop breeding and field management. In the breeding field, it can be used as a high-throughput screening tool to quickly and non-destructively identify excellent individual plants with nitrogen-efficient potential in the early generations, significantly improving breeding efficiency.

Future prospects: Future research efforts will focus on: (1) verifying and optimizing the universality and robustness of this model in different soil backgrounds; and (2) studying the screening ability of this model for high-nitrogen-efficient varieties at lower nitrogen fertilizer levels, to explore its application potential for fertilizer reduction and efficiency improvement in agricultural fields.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/agronomy16050540/s1, Figure S1: Decision tree flowchart.

Author Contributions

Conceptualization, H.H.; methodology, H.H. and Y.J.; software, H.H. and M.D.; validation, M.D.; formal analysis, Y.J.; investigation, H.H. and Y.J.; resources, C.S.; data curation, H.H., Y.J. and M.D.; writing—original draft preparation, H.H. and Y.J.; writing—review and editing, H.H., Y.J. and C.S.; supervision, C.S.; funding acquisition, C.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Project of Zhongshan Biological Breeding Laboratory (ZSBBL-KY2023-05) and the Key Research and Development Program (Modern Agriculture) of Jiangsu Province (BE2022335, BE2022338).

Data Availability Statement

The original contributions presented in the study are included in the article. Further inquiries should be directed to the corresponding author.

Acknowledgments

We would like to express our sincere gratitude to the editor and the reviewers for their valuable feedback and insightful comments, which have significantly improved the quality of our manuscript. Additionally, we would like to extend our thanks to all contributing authors for their hard work and collaboration throughout the research process. This study would not have been possible without their commitment and expertise. We also utilized AI tools to refine the grammar and structure of the study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

He, X.; Zhu, H.; Shi, A.; Wang, X. Optimizing Nitrogen Fertilizer Management Enhances Rice Yield, Dry Matter, and Nitrogen Use Efficiency. Agronomy 2024, 14, 919.
Luo, M.; Liu, Y.; Li, J.; Gao, T.; Wu, S.; Wu, L.; Lai, X.; Xu, H.; Hu, H.; Ma, Y. Effects of Straw Returning and New Fertilizer Substitution on Rice Growth, Yield, and Soil Properties in the Chaohu Lake Region of China. Plants 2024, 13, 444.
Zhu, H.; He, X.; Wang, X.; Long, P. Increasing Hybrid Rice Yield, Water Productivity, and Nitrogen Use Efficiency: Optimization Strategies for Irrigation and Fertilizer Management. Plants 2024, 13, 1717.
Tian, J.; Ji, G.; Zhang, J.; Luo, D.; Zhang, F.; Li, L.; Jiang, M.; Zhu, D.; Li, M. Evaluation of Rice Quality Storage Stability: From Variety Screening to Trait Identification. Plants 2025, 14, 356.
Wu, W.; Feng, X.; Lu, C. The rise of smart agriculture in China: Current situation and suggestions for further development. Exp. Agric. 2024, 60, e28.
Tan, Y.; Gu, J.; Lu, L.; Zhang, L.; Huang, J.; Pan, L.; Lv, Y.; Wang, Y.; Chen, Y. Hyperspectral Band Selection for Crop Identification and Mapping of Agriculture. Remote Sens. 2025, 17, 663.
Zhang, Y.L.; Fan, J.B.; Wang, D.S.; Shen, Q.R. Genotypic Differences in Grain Yield and Physiological Nitrogen Use Efficiency among Rice Cultivars. Pedosphere 2009, 19, 681–691.
Moll, R.H.; Kamprath, E.J.; Jackson, W.A. Analysis and Interpretation of Factors Which Contribute to Efficiency of Nitrogen Utilization. Agron. J. 1982, 74, 562–564.
Hawkesford, M.J. Reducing the reliance on nitrogen fertilizer for wheat production. J. Cereal Sci. 2014, 59, 276–283.
Bashir, S.S.; Siddiqi, T.O.; Kumar, D.; Ahmad, A. Physio-biochemical, agronomical, and gene expression analysis reveals different responsive approach to low nitrogen in contrasting rice cultivars for nitrogen use efficiency. Mol. Biol. Rep. 2022, 50, 1575–1593.
Dou, Z.; Tang, S.; Li, G.H.; Liu, Z.H.; Ding, C.Q.; Chen, L.; Wang, S.H.; Ding, Y.F. Application of nitrogen fertilizer at heading stage improves rice quality under elevated temperature during grain-filling stage. Crop Sci. 2017, 57, 2183–2192.
Wang, B.; Zhou, G.Y.; Guo, S.Y.; Li, X.H.; Yuan, J.Q.; Hu, A.Y. Improving Nitrogen Use Efficiency in Rice for Sustainable Agriculture: Strategies and Future Perspectives. Life 2022, 12, 1653.
Chu, G.; Chen, S.; Xu, C.M.; Wang, D.Y.; Zhang, X.F. Agronomic and physiological performance of indica/japonica hybrid rice cultivar under low nitrogen conditions. Field Crops Res. 2019, 243, 107625.
Tang, S.; Zhang, H.X.; Liu, W.Z.; Dou, Z.; Zhou, Q.Y.; Chen, W.Z.; Wang, S.H.; Ding, Y.F. Nitrogen fertilizer at heading stage effectively compensates for the deterioration of rice quality by affecting the starch-related properties under elevated temperatures. Food Chem. 2019, 277, 455–462.
Chen, G.; Zhao, G.; Cheng, W.; Zhang, H.; Shi, W. Rice nitrogen use efficiency does not link to ammonia volatilization in paddy fields. Sci. Total Environ. 2020, 741, 140433.
Asplund, L.; Bergkvist, G.; Weih, M. Functional traits associated with nitrogen use efficiency in wheat. Acta Agric. Scand. Sect. B—Soil Plant Sci. 2015, 66, 153–169.
Gao, S.; Zhang, F.; Zhi, Y.; Chen, F.; Xiao, K. The yields, agronomic, and nitrogen use efficiency traits of wheat cultivars in north China under N-sufficient and -deficient conditions. J. Plant Nutr. 2017, 40, 1053–1065.
MacKown, C.T.; Van Sanford, D.A.; Ma, Y.Z. Main Stem Sink Manipulation in Wheat. Plant Physiol. 1989, 89, 597–601.
Liao, J.Q.; Quan, Q.; Ma, F.F.; Peng, J.L.; Niu, S. Plant height bridges hierarchical community responses to nitrogen enrichment. J. Ecol. 2024, 112, 2069–2081.
Yin, X.H.; Hayes, R.M.; McClure, M.A.; Savoy, H.J. Assessment of plant biomass and nitrogen nutrition with plant height in early- to mid-season corn. J. Sci. Food Agric. 2012, 92, 2611–2617.
Zhao, B.; Ata-Ul-Karim, S.T.; Duan, A.W.; Liu, Z.D.; Wang, X.L.; Xiao, J.F.; Liu, Z.G.; Qin, A.Z.; Ning, D.F.; Zhang, W.Q.; et al. Determination of critical nitrogen concentration and dilution curve based on leaf area index for summer maize. Field Crops Res. 2018, 228, 195–203.
Yue, X.L.; Hu, Y.C.; Zhang, H.Z.; Schmidhalter, U. Evaluation of Both SPAD Reading and SPAD Index on Estimating the Plant Nitrogen Status of Winter Wheat. Int. J. Plant Prod. 2020, 14, 67–75.
Li, Y.Y.; Ming, B.; Fan, P.P.; Liu, Y.; Wang, K.R.; Hou, P.; Li, S.K.; Xie, R.Z. Effects of nitrogen application rates on the spatio-temporal variation of leaf SPAD readings on the maize canopy. J. Agric. Sci. 2022, 160, 32–44.
Zhang, H.; Zhao, Q.; Wang, Z.; Wang, L.; Li, X.; Fan, Z.; Zhang, Y.; Li, J.; Gao, X.; Shi, J.; et al. Effects of Nitrogen Fertilizer on Photosynthetic Characteristics, Biomass, and Yield of Wheat under Different Shading Conditions. Agronomy 2021, 11, 1989.
Paleari, L.; Movedi, E.; Vesely, F.M.; Invernizzi, M.; Piva, D. Estimating plant nitrogen content in tomato using a smartphone. Field Crops Res. 2022, 284, 108564.
Ma, Y.J.; Sun, M.L.; Liang, X.L.; Zhang, H.M.; Xiang, J.X.; Zhao, X. Rice Yield and Nitrogen Use Efficiency Under Climate Change: Unraveling Key Drivers with Least Absolute Shrinkage and Selection Operator Regression. Agronomy 2025, 15, 677.
Zhang, X.; Zhang, Y.; Xia, C.; Hou, X.; Zhang, X.; Li, W. Estimation of nitrogen content in maize leaves based on UAV hyperspectral imagery. Remote Sens. Technol. Appl. 2024, 39, 927–939.
Zhang, Y.L.; Fan, J.B.; Duan, Y.H.; Wang, D.S.; Ye, L.T.; Shen, Q.R. Variation and evaluation of nitrogen use efficiency in different rice genotypes. Acta Pedol. Sin. 2008, 45, 267–273.
Han, Z.J.; Cui, J.J.; Pan, H.Y.; Li, Y.W.; Na, M.H.; Song, B.Q.; Zhou, J.C.; Wang, Q.H. Morphological Characteristics of Low-nitrogen and High-efficiency Red Beet. Chin. Agric. Sci. Bull. 2022, 38, 20–29.
Li, Y.; Zhang, J.; Bai, J.; Xu, F.L.; Bo, Q.F.; Yue, S.C. Screening of High-Yield and High Nitrogen Use Efficiency Varieties in Shanxi and Their Accumulation and Distribution of Dry Matter and Nitrogen. J. Maize Sci. 2021, 29, 154–161.
Shen, X.; Chen, C.Q.; Han, D.Z.; Xu, Y.S.; Wang, X.Y.; Zhou, H.Y. A triple-branch hybrid dynamic-static alignment strategy for vision-language tasks. Neural Netw. 2025, 191, 107871.
Zou, X.D.; Liang, A.J.; Wu, B.Z.; Lin, Y.M.; Hong, T.; Li, J. UAV-Based High-Throughput Approach for Fast Growing Cunninghamia lanceolata (Lamb.) Cultivar Screening by Machine Learning. Forests 2019, 10, 815.
Wang, J.J.; Zhou, Q.; Shang, J.L.; Liu, C.; Zhuang, T.X.; Ding, J.J.; Xian, Y.Y.; Huo, Z.Y. UAV- and Machine Learning-Based Retrieval of Wheat SPAD Values at the Overwintering Stage for Variety Screening. Remote Sens. 2021, 13, 5166.
Zhao, P.; Zhao, G.; Jin, J.; Yang, D.D.; Li, Y.; Zhang, Y. Screening of early aging wheat variety based on unmanned aerial vehicle images. J. Seed Ind. Guide 2022, 3, 14–21. [Google Scholar]
Zhang, W. Hyperspectral Technology-Based Drought Level Classification and Screening of Drought-Tolerant Potato Varieties. Master’s Thesis, Northwest A&F University, Xianyang, China, 2023. [Google Scholar]
Yan, L.; Liu, C.; Zain, M.; Cheng, M.; Huo, Z.; Sun, C. Estimation of Rice Protein Content Based on Unmanned Aerial Vehicle Hyperspectral Imaging. Agronomy 2024, 14, 2479. [Google Scholar] [CrossRef]
Wu, F.; Wang, J.C.; Zhou, Y.Z.; Song, X.X.; Ju, C.X.; Sun, C.M.; Liu, T. Estimation of Winter Wheat Tiller Number Based on Optimization of Gradient Vegetation Characteristics. Remote Sens. 2022, 14, 1338. [Google Scholar] [CrossRef]
Liu, T.; Chen, W.; Zhong, X.C.; Zi, Y.; Chen, C.; Wu, W.; Sun, C.M.; Zhu, X.K.; Guo, W.S. Image-Analysis-Based Evaluation of Wheat Growth Status. Crop Sci. 2017, 57, 3227–3238. [Google Scholar] [CrossRef]
Wang, J.; Wang, F. Research on Safety Evaluation at Urban Roadway Traffic Accident. Mech. Manag. Dev. 2006, 1, 121–122. [Google Scholar]
Zhao, Z.H.; Huang, L.H.; Jing, H.T.; Li, W.J. Analysis of National Road Traffic Safety Level Based on Principal Component Analysis. Saf. Secur. 2024, 45, 26–30. [Google Scholar]
Bartlett, M.S. Properties of sufficiency and statistical tests. Proc. R. Soc. Lond. Ser. A—Math. Phys. Sci. 1997, 160, 268–282. [Google Scholar]
Pu, R.; Gong, P. Hyperspectral Remote Sensing and Its Applications; Higher Education Press: Beijing, China, 2000. [Google Scholar]
Liu, H.J.; Bruning, B.; Garnett, T.; Berger, B. The Performances of Hyperspectral Sensors for Proximal Sensing of Nitrogen Levels in Wheat. Sensors 2020, 20, 4550. [Google Scholar] [CrossRef]
Raper, T.B.; Varco, J.J. Canopy-scale wavelength and vegetative index sensitivities to cotton growth parameters and nitrogen status. Precis. Agric. 2015, 16, 62–76. [Google Scholar] [CrossRef]
Gitelson, A.; Merzlyak, M.N. Spectral Reflectance Changes Associated with Autumn Senescence of Aesculus hippocastanum L. and Acer platanoides L. Leaves. Spectral Features and Relation to Chlorophyll Estimation. J. Plant Physiol. 1994, 143, 286–292. [Google Scholar] [CrossRef]
Lobell, D.B.; Azzari, G. Satellite detection of rising maize yield heterogeneity in the US Midwest. Environ. Res. Lett. 2017, 12, 014014. [Google Scholar] [CrossRef]
Yang, W.; Kobayashi, H.; Wang, C.; Kim, Y.; Kimball, J.S.; Duclos, D. A semi-analytical snow-free vegetation index for improving estimation of plant phenology in tundra and grassland ecosystems. Remote Sens. Environ. 2019, 228, 31–44. [Google Scholar] [CrossRef]
Haboudane, D.; Miller, J.R.; Pattey, E.; Zarco-Tejada, P.J.; Strachan, I.B. Hyperspectral vegetation indices and novel algorithms for predicting green LAI of crop canopies: Modeling and validation in the context of precision agriculture. Remote Sens. Environ. 2004, 90, 337–352. [Google Scholar] [CrossRef]
Zarco-Tejada, P.J.; Miller, J.R.; Morales, A.; Berjón, A.; Aguado, A. Hyperspectral indices and model simulation for chlorophyll estimation in open-canopy tree crops. Remote Sens. Environ. 2004, 90, 463–476. [Google Scholar] [CrossRef]
Huete, A.R.; Didan, K.; Miura, T.; Rodriguez, E.P.; Gao, X.; Ferreira, L.G. Overview of the radiometric and biophysical performance of the MODIS vegetation indices. Remote Sens. Environ. 2002, 83, 195–213. [Google Scholar] [CrossRef]
Sims, D.A.; Gamon, J.A. Relationships between leaf pigment content and spectral reflectance across a wide range of species, leaf structures and developmental stages. Remote Sens. Environ. 2002, 81, 337–354. [Google Scholar] [CrossRef]
Gao, Y.P.; Kang, M.D.; He, M.Z.; Sun, Y.; Xu, H. Extraction of desert vegetation coverage based on visible light band information of unmanned aerial vehicle: A case study of Shapotou region. J. Lanzhou Univ. Nat. Sci. Ed. 2018, 54, 770–775. [Google Scholar]
Pu, R.L.; Gong, P.; Yu, Q. Comparative Analysis of EO-1 ALI and Hyperion, and Landsat ETM+ Data for Mapping Forest Crown Closure and Leaf Area Index. Sensors 2008, 8, 3744–3766. [Google Scholar] [CrossRef]
Gao, L.; Yang, G.J.; Wang, B.S.; Yu, H.Y.; Xu, B.; Feng, H.K. Soybean leaf area index retrieval with UAV (unmanned aerial vehicle) remote sensing imagery. Chin. J. Eco-Agric. 2015, 23, 868–876. [Google Scholar]
Li, X.; Xu, X.G.; Bao, Y.; Wang, J.H.; Huang, W.J. Retrieving LAI of Winter Wheat Based on Sensitive Vegetation Index by the Segmentation Method. Sci. Agric. Sin. 2012, 45, 3486–3496. [Google Scholar]
Gitelson, A.A.; Zur, Y.; Chivkunova, O.B.; Merzlyak, M.N. Assessing Carotenoid Content in Plant Leaves with Reflectance Spectroscopy. Photochem. Photobiol. 2002, 75, 272–281. [Google Scholar] [CrossRef]
Hu, J.; Peng, J.; Zhou, Y.; Yang, X.; Tong, L.; Li, L. Quantitative Estimation of Soil Salinity Using UAV-Borne Hyperspectral and Satellite Multispectral Images. Remote Sens. 2019, 11, 736. [Google Scholar] [CrossRef]
Schneider, P.; Roberts, D.A.; Kyriakidis, P.C. A VARI-based relative greenness from MODIS data for computing the Fire Potential Index. Remote Sens. Environ. 2008, 112, 1151–1167. [Google Scholar] [CrossRef]
Liu, L.; Dong, Y.; Huang, W.; Ma, H.; Luo, J.; Zhang, D. Monitoring Wheat Fusarium Head Blight Using Unmanned Aerial Vehicle Hyperspectral Imagery. Remote Sens. 2020, 12, 3811. [Google Scholar] [CrossRef]
Broge, N.H.; Leblanc, E. Comparing prediction power and stability of broadband and hyperspectral vegetation indices for estimation of green leaf area index and canopy chlorophyll density. Remote Sens. Environ. Interdiscip. J. 2001, 76, 156–172. [Google Scholar] [CrossRef]
Gitelson, A.A.; Kaufman, Y.J.; Merzlyak, M.N. Use of a green channel in remote sensing of global vegetation from EOS-MODIS. Remote Sens. Environ. 1996, 58, 289–298. [Google Scholar] [CrossRef]
Liu, C.; He, Q.; Lu, Y.; Yang, K.; Huang, Q.; He, L.; Chen, L.; Meng, S. PSOEMLSSVM forecasting model for the transmission lines icing. J. Electr. Power Sci. Technol. 2020, 35, 131–137. [Google Scholar]
Vapnik, V.N. The Nature of Statistical Learning Theory; Tsinghua University Press: Beijing, China, 2000. [Google Scholar]
Wang, X.; Dong, Y.; Yu, Q.; Geng, N. Survey on Structured Support Vector Machines. Comput. Eng. Appl. 2020, 56, 24–32. [Google Scholar]
Dong, Y.; Nan, Y.; Liu, Z. CART decision tree classification based on multi-features of ETM image: A case study of Yanbian Prefecture. Resour. Dev. Mark. 2011, 27, 116–119. [Google Scholar]
Grajski, K.A.; Breiman, L.; Prisco, V.D. Classification of EEG spatial patterns with a tree-structured methodology: CART. IEEE Trans. Biomed. Eng. 1986, 33, 1076–1086. [Google Scholar] [CrossRef]
Sun, J.; Min, L.; Shu, H.; Zhang, R. Binary environmental Gini coefficient and its application in balanced development. Adv. Appl. Math. 2023, 12, 4273–4284. [Google Scholar] [CrossRef]
Sun, C. Research on KNN Algorithm for Heterogeneous Data Under Non-IID Condition. Master’s Thesis, Qilu University of Technology, Jinan, China, 2021. [Google Scholar]
Géron, A. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow; China Machine Press: Beijing, China, 2020. [Google Scholar]
García, V.; Mollineda, R.A.; Sanchez, J.S. On the k-NN performance in a challenging scenario of imbalance and overlapping. Pattern Anal. Appl. 2008, 11, 269–280. [Google Scholar] [CrossRef]
Shi, J.Y.; Han, D.Z.; Chen, C.Q. KTMN: Knowledge-driven Two-stage Modulation Network for visual question answering. Multimed. Syst. 2024, 30, 350. [Google Scholar] [CrossRef]
Liu, Y.; Jiang, L.; Qi, Q. Online Computation Offloading for Collaborative Space/Aerial-Aided Edge Computing Toward 6G System. IEEE Trans. Veh. Technol. 2024, 73, 2495–2505. [Google Scholar] [CrossRef]
Wang, Z.Q.; Zhang, Y.W.; Beebout, S.S.; Zhang, H.; Liu, L.J.; Yang, J.C.; Zhang, J.H. Grain yield, water and nitrogen use efficiencies of rice as influenced by irrigation regimes and their interaction with nitrogen rates. Field Crops Res. 2016, 193, 54–69. [Google Scholar] [CrossRef]
Jones, D.L.; Willett, V.B. Experimental evaluation of methods to quantify dissolved organic nitrogen (DON) and dissolved organic carbon (DOC) in soil. Soil Biol. Biochem. 2006, 38, 991–999. [Google Scholar] [CrossRef]
Yang, R.; SU, Y.; Wang, T.; Yang, Q. Effect of chemical and organic fertilization on soil carbon and nitrogen accumulation in a newly cultivated farmland. J. Integr. Agric. 2016, 15, 658–666. [Google Scholar] [CrossRef]
Lu, D.; Lu, F.; Pan, J.; Cui, Z.; Zou, C.; Chen, X.; He, M.; Wang, Z. The effects of cultivar and nitrogen management on wheat yield and nitrogen use efficiency in the North China Plain. Field Crops Res. 2015, 171, 157–164. [Google Scholar] [CrossRef]
Wu, L.L.; Yuan, S.; Huang, L.Y.; Sun, F.; Zhu, G.; Li, G.; Fahad, S.; Peng, S.; Wang, F. Physiological mechanisms underlying the high-grain yield and high-nitrogen use efficiency of elite rice varieties under a low rate of nitrogen application in China. Front. Plant Sci. 2016, 7, 1024. [Google Scholar] [CrossRef] [PubMed]
Ju, X.T.; Xing, G.X.; Chen, X.P.; Zhang, X.L.; Zhang, L.J.; Liu, X.J.; Cui, Z.L.; Yin, B.; Christie, P.; Zhu, Z.L.; et al. Reducing environmental risk by improving N management in intensive Chinese agricultural systems. Proc. Natl. Acad. Sci. USA 2009, 106, 3041–3046. [Google Scholar] [CrossRef]
Guo, J.H.; Liu, X.J.; Zhang, Y.; Shen, J.L.; Han, W.X.; Zhang, W.F.; Christie, P.; Goulding, K.; Vitousek, P.; Zhang, F. Significant acidification in major Chinese croplands. Science 2010, 327, 1008–1010. [Google Scholar] [CrossRef]
Wang, H.Y.; Hu, R.F.; Chen, X.X.; Zhoong, X.H.; Zheng, Z.T.; Huang, N.R.; Xue, C.L. Reduction in nitrogen fertilizer use results in increased rice yields and improved environmental protection. Int. J. Agric. Sustain. 2017, 15, 681–692. [Google Scholar] [CrossRef]
Yang, X.Y.; Ren, W.D.; Sun, B.H.; Zhang, S.L. Effects of contrasting soil management regimes on total and labile soil organic carbon fractions in a loess soil in China. Geoderma 2012, 177, 49–56. [Google Scholar] [CrossRef]
Wang, Y.W.; Ma, X.; Tan, S.Y.; Li, J.; Zhang, H. Inverting rice nitrogen content with multimodal data fusion of unmanned aerial vehicle remote sensing and ground observations. Trans. Chin. Soc. Agric. Eng. 2024, 40, 100–109. [Google Scholar]
Xu, T.Y.; Xing, S.M.; Yu, F.H.; Guo, Z.H.; Liu, Y.D. Inversion method of japonica rice canopy nitrogen content based on combination of multiple vegetation indices with BAS-ELM. J. Shenyang Agric. Univ. 2021, 52, 577–585. [Google Scholar]
Ling, Q.H.; Kong, F.M.; Ning, Q.; Wei, Y.; Liu, Z.; Dai, M.Z.; Zhou, Y.; Zhang, Y.Q.; Shi, X.J.; Wang, J. Rice nitrogen nutrition monitoring based on unmanned aerial vehicle multispectral image. Trans. Chin. Soc. Agric. Eng. 2023, 39, 160–170. [Google Scholar]
Che, M.; Wang, H.R.; Xu, X.; Sun, C. PSO-DF: A Hyperspectral Model for Estimating Nitrogen Content in Rice Leaves. Remote Sens. Technol. Appl. 2024, 39, 280–289. [Google Scholar]

Figure 1. Heat map of correlations among the agronomic parameters of different varieties.

Figure 2. Principal component extraction diagram.

Figure 3. Clustering diagram of comprehensive scores of rice varieties with different nitrogen efficiency levels.

Figure 4. Classification model verification.

Figure 5. Comparison of different machine learning models.

Table 1. Names and entry numbers of the tested varieties.

Entry Number	Variety Name	Entry Number	Variety Name	Entry Number	Variety Name
S01	Tai 0206	S21	Lian geng 6	S41	Ning Japonica 1
S02	Taiwan 65	S22	Nan Japonica 5718	S42	Nan Japonica 44
S03	Taiwan 30	S23	Suxiu 867	S43	Wuyun Japonica 30
S04	Taiwan 50	S24	Xu Rice 3	S44	Ningxiang Japonica 9
S05	Nongken 58	S25	Huai Rice 5	S45	Xiangxue Rice 515
S06	Yanfeng 47	S26	Si Rice 301	S46	Wu Japonica 15
S07	Dang Japonica 8	S27	MG7200	S47	Wuxiang Japonica 14
S08	Wan Rice 8	S28	Yan Rice 83006	S48	Zhennuo 19
S09	Zhongzhong Xiangnuo	S29	Huai Rice 13	S49	Su Yunuo
S10	Zhongyan Rice 881	S30	Nan Japonica 45	S50	Xiushui 123
S11	Wuyu Japonica 3	S31	Wuling Japonica 1	S51	Huruan 1212
S12	Lianjia Japonica 1	S32	Yan Rice 10	S52	SNU 19
S13	Lian geng 7	S33	Nan Japonica 9108	S53	Xiushui 110
S14	Yan geng 2	S34	Si Rice 17	S54	Xiushui 114
S15	Zhen Rice 88	S35	Jingxiangyu 1	S55	Jiahe 218
S16	Hua geng 5	S36	Nan Japonica 46	S56	Xiushui 134
S17	Huaiyou geng 2	S37	Wuxiang Japonica 9	S57	Jia 58
S18	Hua geng 6	S38	Guangling Xiangnuo	S58	Changnong Japonica 1
S19	Lian geng 4	S39	Wuyungeng 7	S59	Suxiang Japonica 100
S20	Huai geng 11	S40	Guangling Yougeng	S60	Xiangruanyu

Table 2. Parameters of the DJI Matrice M600 Pro UAV and the GaiaSky-mini2 UAV-borne hyperspectral imager (GaiaChips Co., Ltd., Beijing, China).

Name	Type	Parameter
DJI Matrice M600 Pro UAV	Overall Dimensions	1668 mm × 1518 mm × 727 mm (with propellers, arms, and GPS mast extended, and landing gear installed); 437 mm × 402 mm × 553 mm (with arms and GPS mast folded, and landing gear removed)
	Recommended Maximum Takeoff Weight	15.5 kg
	Maximum Wind Resistance	8 m/s
	Maximum Level Flight Speed	65 km/h (in calm air)
	Powertrain	Motor Model: DJI 6010; Propeller Model: DJI 2170 R
GaiaSky-mini2 UAV-borne hyperspectral imager	Spectral Range	400–1000 nm
	Full-Frame Pixels	1392 × 1040
	Lens	18.5 mm or 23 mm
	Spectral Resolution	3.5 nm ± 0.5 nm
	Aperture (F-number)	F/2.8
	Imaging Mode	Hover-and-Scan Mode

Table 3. Vegetation index and calculation methods.

Vegetation Index	Calculation Formula	References
Normalized Difference Vegetation Index (NDVI)	$(NIR - RED) / (NIR + RED)$	[43]
Normalized Difference Red Edge Index (NDRE)	$(NIR - REDEDGE) / (NIR + REDEDGE)$	[44]
Red Edge Chlorophyll Index (RECI)	$(NIR / REDEDGE) - 1$	[45]
Green Chlorophyll Vegetation Index (GCVI)	$NIR / GREEN - 1$	[46]
Normalized Difference Greenery Index (NDGI)	$(GREEN - RED) / (GREEN + RED)$	[47]
Green Normalized Difference Vegetation Index (GNDVI)	$(NIR - GREEN) / (NIR + GREEN)$	[48]
Leaf Chlorophyll Index (LCI)	$(NIR - REDEDGE) / (NIR + REDEDGE)$	[49]
Difference Vegetation Index (DVI)	$NIR - RED$	[44]
Enhanced Vegetation Index (EVI)	$2.5 \times (NIR - RED) / (NIR + 6 \times RED - 7.5 \times BLUE + 1)$	[50]
Ratio Vegetation Index (RVI)	$NIR / RED$	[51]
Excess Green Index (ExG)	$2 \times GREEN - RED - BLUE$	[52]
Optimized Soil-Adjusted Vegetation Index (OSAVI)	$(NIR - RED) / (NIR + RED + 0.16)$	[53]
Soil-Adjusted Vegetation Index (SAVI)	$(NIR - RED) / (NIR + RED + 0.5) \times 1.5$	[54]
Nitrogen Reflectance Index (NRI)	$(GREEN - RED) / (GREEN + RED)$	[55]
Carotenoid Reflectance Index (CRI)	$1 / RED + 1 / NIR$	[56]
Modified Red Edge Ratio Index (MRERI)	$[(NIR / RED) - 1] / \sqrt{(NIR / RED) + 1}$	[57]
Nonlinear Vegetation Index (NLI)	$(NIR^{2} - RED) / (NIR^{2} + RED)$	[58]
Renormalized Difference Vegetation Index (RDVI)	$(NIR - RED) / \sqrt{(NIR + RED)}$	[59]
Triangular Vegetation Index (TVI)	$\sqrt{(NDVI + 0.5)}$	[60]
Modified Triangular Vegetation Index (MTVI)	$1.5 \times [1.2 \times (NIR - GREEN) - 2.5 \times (RED - GREEN)] / [2 \times (NIR + 1)^{2} - 6 \times NIR + 5 \times \sqrt{RED}]$	[61]

Note: BLUE, GREEN, RED, REDEDGE, and NIR represent the reflectance values in the blue, green, red, red-edge, and near-infrared spectral bands, respectively.
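As a minimal sketch of how several of the Table 3 formulas can be evaluated, the Python function below computes a subset of the indices from five band reflectances. The function name `vegetation_indices` and the sample reflectance values are illustrative assumptions, not part of the study's processing pipeline.

```python
import math

def vegetation_indices(blue, green, red, rededge, nir):
    """Return a subset of the Table 3 indices for one set of band reflectances."""
    ndvi = (nir - red) / (nir + red)
    return {
        "NDVI": ndvi,
        "NDRE": (nir - rededge) / (nir + rededge),
        "RECI": nir / rededge - 1,
        "GCVI": nir / green - 1,
        "GNDVI": (nir - green) / (nir + green),
        "DVI": nir - red,
        "RVI": nir / red,
        "EVI": 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1),
        "ExG": 2 * green - red - blue,
        "OSAVI": (nir - red) / (nir + red + 0.16),
        "SAVI": (nir - red) / (nir + red + 0.5) * 1.5,
        "RDVI": (nir - red) / math.sqrt(nir + red),
        "TVI": math.sqrt(ndvi + 0.5),
    }

# Hypothetical reflectances for a single canopy pixel (illustrative only).
sample = vegetation_indices(blue=0.04, green=0.08, red=0.05, rededge=0.20, nir=0.45)
for name, value in sample.items():
    print(f"{name}: {value:.3f}")
```

In practice the same expressions would be applied per pixel or per plot-averaged spectrum after the hyperspectral bands have been resampled to the five broad bands named in the note.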

Table 4. Differences in agronomic parameters among different varieties.

Agronomic Parameters	Minimum	Maximum	Mean	Standard Deviation	Coefficient of Variation
Tiller Count (individuals)	7.00	15.00	11.02	1.88	17.08%
Plant Height (cm)	69.29	125.38	84.69	8.80	10.39%
SPAD	37.62	58.88	42.40	3.13	7.39%
Leaf Area (cm²)	23.61	62.48	36.82	6.41	17.42%
Biomass (g)	31.84	70.43	47.77	7.31	15.30%
Yield (t/ha)	4.54	15.08	9.67	2.03	20.98%
Plant Nitrogen Content (g/kg)	3.55	5.55	4.58	0.39	8.52%
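The coefficient of variation in Table 4 is the standard deviation divided by the mean, expressed as a percentage. The short sketch below recomputes it from the printed means and standard deviations; the dictionary and function names are illustrative. Because the printed inputs are rounded to two decimals, the recomputed values agree with the tabulated ones only to within a few hundredths of a percentage point.

```python
# (mean, standard deviation) pairs copied from Table 4.
table4 = {
    "Tiller Count": (11.02, 1.88),
    "Plant Height": (84.69, 8.80),
    "SPAD": (42.40, 3.13),
    "Leaf Area": (36.82, 6.41),
    "Biomass": (47.77, 7.31),
    "Yield": (9.67, 2.03),
    "Plant Nitrogen Content": (4.58, 0.39),
}

def coefficient_of_variation(mean, sd):
    """Coefficient of variation in percent: 100 * SD / mean."""
    return sd / mean * 100.0

cv = {name: coefficient_of_variation(m, s) for name, (m, s) in table4.items()}

# List parameters from most to least variable across the varieties.
for name in sorted(cv, key=cv.get, reverse=True):
    print(f"{name}: {cv[name]:.2f}%")
```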

Table 5. KMO and Bartlett tests.

KMO Measure of Sampling Adequacy		0.504
Bartlett’s Test of Sphericity	Approximate Chi-Square	72.315
	Degrees of Freedom	21
	Significance (p)	< 0.001
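For reference, Bartlett's test of sphericity is conventionally computed from the determinant of the correlation matrix $R$ of the $p$ variables. The standard textbook form is given below; $n$ (the number of observations) is not restated in Table 5 and is left symbolic. With the $p = 7$ agronomic parameters, the degrees of freedom match the tabulated value of 21.

```latex
\chi^{2} = -\left[(n - 1) - \frac{2p + 5}{6}\right] \ln\lvert R\rvert,
\qquad
df = \frac{p(p - 1)}{2} = \frac{7 \times 6}{2} = 21
```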

Table 6. Communalities (common factor variance) of the agronomic parameters.

Agronomic Parameters	Initial	Extraction
Tiller Count (individuals)	1	0.64
Plant Height (cm)	1	0.79
SPAD	1	0.49
Leaf Area (cm²)	1	0.80
Biomass (g)	1	0.49
Yield (t/ha)	1	0.75
Plant Nitrogen Content (g/kg)	1	0.73

Table 7. Component score coefficient matrix.

Agronomic Parameters	Component 1	Component 2	Component 3
Tiller Count (individuals)	0.259	−0.36	−0.11
Plant Height (cm)	0.206	0.41	−0.16
SPAD	0.399	−0.04	−0.04
Leaf Area (cm²)	−0.04	0.50	0.04
Biomass (g)	0.226	0.13	−0.34
Yield (t/ha)	0.123	0.03	0.79
Plant Nitrogen Content (g/kg)	0.488	−0.06	0.36
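A component score coefficient matrix such as Table 7 is typically applied to standardized (z-scored) parameters: each component score is the coefficient-weighted sum of the seven z-values. The sketch below illustrates that calculation; the function name and the example variety are hypothetical, and the coefficients are copied from Table 7.

```python
# Component score coefficients copied from Table 7 (columns: components 1 to 3).
COEFFS = {
    "Tiller Count": (0.259, -0.36, -0.11),
    "Plant Height": (0.206, 0.41, -0.16),
    "SPAD": (0.399, -0.04, -0.04),
    "Leaf Area": (-0.04, 0.50, 0.04),
    "Biomass": (0.226, 0.13, -0.34),
    "Yield": (0.123, 0.03, 0.79),
    "Plant Nitrogen Content": (0.488, -0.06, 0.36),
}

def component_scores(z_values):
    """Weighted sums of standardized (z-scored) parameters, one per component."""
    return tuple(
        sum(COEFFS[name][j] * z_values[name] for name in COEFFS)
        for j in range(3)
    )

# Hypothetical variety: one SD above the mean in yield, average otherwise.
z_example = {name: 0.0 for name in COEFFS}
z_example["Yield"] = 1.0
f1, f2, f3 = component_scores(z_example)
print(f1, f2, f3)
```

For this hypothetical variety the third component responds most strongly, reflecting the large yield coefficient in the Component 3 column.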

Table 8. Scores of 60 rice varieties in each principal component.

Variety	F₁	F₂	F₃
S01	0.31	−0.17	−1.95
S02	1.2	−0.19	−0.40
S03	0.37	−0.75	−1.58
S04	−0.06	0.3	−1.35
S05	1	0.22	−1.61
S06	0.86	0.92	0.67
S07	0.36	0.79	−0.65
S08	0.91	0.01	0.74
S09	−0.28	0.06	1.63
S10	0.35	0.12	0.66
S11	−0.98	1.35	−0.51
S12	−0.32	1.08	0.45
S13	−0.44	0.77	0.32
S14	−0.8	0.7	−0.12
S15	−0.39	0.52	0.95
S16	−1.07	−0.72	0.15
S17	0.18	1.05	−0.30
S18	−0.6	0.92	0.74
S19	−1.66	0.64	−1.94
S20	0.39	−0.05	−0.61
S21	−0.02	0.49	−1.19
S22	1.09	0.07	−0.47
S23	−0.95	0.48	0.31
S24	0.14	0.88	−0.59
S25	−0.71	−0.07	1.50
S26	0.61	−0.29	1.36
S27	−1.19	−0.02	−0.25
S28	−1.08	−0.4	−0.17
S29	−0.42	−0.16	−1.03
S30	−0.32	−0.66	−0.02
S31	−0.34	0.71	−0.28
S32	0.66	−0.3	0.08
S33	0.61	0.42	1.92
S34	0.2	−0.53	0.36
S35	0.44	0.5	2.35
S36	1.17	1.94	−0.64
S37	−0.34	1.01	0.61
S38	1.07	0.04	−0.15
S39	−0.02	0.66	1.16
S40	0.23	−0.78	0.07
S41	−1.51	−0.66	0.53
S42	−0.03	−1.32	−0.22
S43	−1.24	−1.35	0.80
S44	−0.86	−1.05	−0.79
S45	−1.19	−0.15	−0.02
S46	0.18	0.73	0.22
S47	0.36	0.04	0.07
S48	0.69	−0.27	0.86
S49	1.12	−1.27	−0.34
S50	0.57	−0.15	−0.82
S51	1.22	−0.9	−1.37
S52	0.73	−0.49	2.57
S53	0.34	−0.18	−1.32
S54	−0.17	0.39	0.08
S55	−0.12	−0.5	−0.34
S56	−0.36	0.31	−0.63
S57	0.32	−1.42	1.13
S58	0.07	−0.52	−0.76
S59	0.01	−1.16	−0.15
S60	−0.28	−1.6	0.27

Table 9. Comprehensive evaluation of classification model.

Evaluation Metrics	SVM	CART	Naive Bayes	KNN
Accuracy (20%)	0.75	0.75	0.50	0.67
F1-Score (20%)	0.74	0.74	0.49	0.67
Precision (20%)	0.75	0.80	0.49	0.71
Kappa (20%)	0.58	0.62	0.24	0.34
Hamming Distance (20%)	0.25	0.25	0.5	0.33
Composite Score (100%)	0.71	0.73	0.44	0.64
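The composite score in Table 9 weights five metrics at 20% each. The sketch below is one plausible reading of that calculation, under the assumption that the Hamming distance (where lower is better) enters as 1 minus its value; the table does not state this conversion explicitly. Under that assumption the sketch yields 0.71, 0.73, and 0.44 for SVM, CART, and Naive Bayes, matching the table; for KNN it yields about 0.61 rather than the tabulated 0.64, so the authors' exact procedure may differ.

```python
# Metric values copied from Table 9.
metrics = {
    "SVM": {"accuracy": 0.75, "f1": 0.74, "precision": 0.75, "kappa": 0.58, "hamming": 0.25},
    "CART": {"accuracy": 0.75, "f1": 0.74, "precision": 0.80, "kappa": 0.62, "hamming": 0.25},
    "Naive Bayes": {"accuracy": 0.50, "f1": 0.49, "precision": 0.49, "kappa": 0.24, "hamming": 0.50},
    "KNN": {"accuracy": 0.67, "f1": 0.67, "precision": 0.71, "kappa": 0.34, "hamming": 0.33},
}

WEIGHT = 0.2  # each of the five metrics carries 20%

def composite_score(m):
    """Equal-weight composite; Hamming distance is inverted (assumption)."""
    parts = [m["accuracy"], m["f1"], m["precision"], m["kappa"], 1.0 - m["hamming"]]
    return sum(WEIGHT * p for p in parts)

scores = {model: composite_score(m) for model, m in metrics.items()}
for model, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: {score:.2f}")
```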

Table 10. Classification of nitrogen efficiency of 60 rice varieties.

Category	Varieties
HNE	Taiwan 65, Taiwan 50, Yanfeng 47, Dang Japonica 8, Wan Rice 8, Zhongzhong Xiangnuo, Zhongyan Rice 881, Lianjia Japonica 1, Zhen Rice 88, Huaiyou geng 2, Hua geng 6, Nan Japonica 5718, Xu Rice 3, Nan Japonica 9108, Jinxiangyu 1, Nan Japonica 46, Wuxiang Japonica 9, Guangling Xiangnuo, Wuyungeng 7, Wu Japonica 15, Zhennuo 19, SNU 19, Xiushui 110
MNE	Nongken 58, Wuyu Japonica 3, Lian geng 7, Yan geng 2, Huai geng 11, Lian geng 6, Suxiu 867, Huai Rice 5, Wuling Japonica 1, Yan Rice 10, Si Rice 17, Guangling Yougeng, Wuxiang Japonica 14, Suyunuo, Xiushui 123, Huruan 1212, Xiushui 114, Xiushui 134, Jia 58
LNE	Tai 0206, Taiwan 30, Hua geng 5, Lian geng 4, Si Rice 301, MG7200, Yan Rice 83006, Huai Rice 13, Nan Japonica 45, Ning Japonica 1, Nan Japonica 44, Wuyun Japonica 30, Ningxiang Japonica 9, Xiangxue Rice 515, Jiahe 218, Changnong Japonica 1, Suxiang Japonica 100, Xiangruanyu
