Article

Relationship Between Hyperspectral Data and Amino Acid Composition in Soybean Genotypes

by Ana Carina da Silva Cândido Seron 1, Dthenifer Cordeiro Santana 1, Izadora Araujo Oliveira 1, Cid Naudi Silva Campos 1, Larissa Pereira Ribeiro Teodoro 1, Elber Vinicius Martins Silva 1, Rafael Felippe Ratke 1, Fábio Henrique Rojo Baio 1, Carlos Antonio da Silva Junior 2 and Paulo Eduardo Teodoro 1,*
1 Department of Agronomy, Plant Production, Universidade Federal de Mato Grosso do Sul, Rodovia MS 306, km. 305, Caixa Postal 112, Chapadão do Sul 79560-000, MS, Brazil
2 Department of Geography, State University of Mato Grosso (UNEMAT), Sinop 78550-000, MT, Brazil
* Author to whom correspondence should be addressed.
AgriEngineering 2025, 7(8), 265; https://doi.org/10.3390/agriengineering7080265
Submission received: 1 July 2025 / Revised: 31 July 2025 / Accepted: 12 August 2025 / Published: 15 August 2025

Abstract

Spectral reflectance of plants can be readily associated with physiological and biochemical parameters. Thus, relating spectral data to amino acid contents in different genetic materials provides an innovative and efficient approach for understanding and managing genetic diversity. This study had two objectives: (I) to differentiate genetic materials according to amino acid contents and spectral reflectance; (II) to establish the relationship between amino acids and spectral bands derived from hyperspectral data. The research was conducted with 32 soybean genetic materials grown in the field during the 2023–2024 crop year. The experimental design involved randomized blocks with four replicates. Leaf spectral data were collected 60 days after plant emergence, when the plants were in full bloom. Three leaf samples were collected from the third fully developed trifoliate leaf, counted from top to bottom, from each plot. The samples were taken to the laboratory, where reflectance readings were obtained using a spectroradiometer covering the 350–2500 nm spectrum. Wavelengths were grouped as means of representative intervals and organized into 28 bands. Subsequently, the leaf samples from each plot were subjected to quantification analyses for 17 amino acids. The soybean genotypes were then subjected to a PCA–K-means analysis to separate them according to their amino acid content and spectral behavior. A correlation network was constructed to investigate the relationships between the spectral variables and the amino acids within each group. The groups formed by the different genetic materials exhibited distinct profiles in both amino acid composition and spectral behavior. Leaf reflectance was effective in distinguishing soybean genotypes according to leaf amino acid content. Specific and high-magnitude associations were found between spectral bands and amino acids. Our findings reveal that spectral reflectance can serve as a reliable, non-destructive indicator of amino acid composition in soybean leaves, supporting advanced phenotyping and selection in breeding programs.

1. Introduction

Soybean (Glycine max (L.) Merrill) stands out as an important source of vegetable protein and oil for animal and human consumption [1]. Another notable feature of this crop is its ability to provide essential amino acids (AAs) to the diet, which cannot be synthesized by the human body [2]. Therefore, it is crucial for plant breeding programs to select soybean genotypes based on their nutritional quality, particularly regarding their essential amino acid content. For example, a genetically modified soybean overexpressing O-acetylserine sulfhydrylase (OASS) has been developed with an increased cysteine content [2,3]; however, other amino acids can also be enhanced in this species.
Spectrophotometry, chemical analyses, mass spectrometry, and liquid chromatography are the most commonly used methods for determining amino acid contents. These techniques stand out for their ability to accurately quantify a wide range of amino acids [4]. Furthermore, the use of hyperspectral data technology can provide important and complementary information related to the biochemical parameters of plants, with the advantage of being a fast, high-throughput method, and, depending on the sensor used, even non-destructive [5,6]. Thus, hyperspectral technology may serve as a valuable tool for amino acid determination in crops of interest, which can be used in the field without removing plant material, enabling the prediction of compounds such as amino acids based on the spectral bands listed in this work.
Hyperspectral sensors are non-destructive and user-friendly technologies that enable rapid analyses, allow for the simultaneous measurement of multiple samples, and can effectively address certain limitations of traditional methods [7]. Comprehensive sensors covering the visible (VIS), near-infrared (NIR), and shortwave infrared (SWIR) regions provide information with varying spatial and spectral resolutions, resulting in more comprehensive datasets and generating a large volume of high-dimensional spectral data [8]. Ref. [9] proposed reducing hyperspectral data to 28 bands, which represent the mean values of significant wavelength intervals and inflection points that are characteristic of known elements, such as carotene a and b, xanthophyll, chlorophyll, and spectral regions associated with chlorophyll and water absorption. The authors reported that within the 400–700 nm wavelength range, corresponding to the visible (VIS) region, reflectance is lower due to light absorption by pigments present in the leaves, especially chlorophyll. A slight increase in reflectance can be observed around 550 nm, which is associated with the predominance of green in the visible spectrum. In the NIR region, spanning 700–1300 nm, reflectance increases significantly, influenced by the internal structure of the leaves, including the cell shape and size, as well as the proportion of intercellular spaces. In the 1300–2500 nm range, corresponding to the SWIR region, there is a gradual reduction in reflectance, reflecting absorption characteristics related to the liquid water content in the leaves.
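The region boundaries quoted above can be turned into a small lookup. The following is a minimal sketch, not code from the study: the function name `spectral_region` is hypothetical, and the half-open treatment of the shared 700 nm and 1300 nm boundaries is an assumption made so that each wavelength maps to exactly one region.

```python
def spectral_region(wavelength_nm: float) -> str:
    """Return the region name for a wavelength, using the ranges quoted in the text."""
    # Assumption: boundaries are half-open, so 700 nm counts as NIR and 1300 nm as SWIR.
    if 400 <= wavelength_nm < 700:
        return "VIS"
    if 700 <= wavelength_nm < 1300:
        return "NIR"
    if 1300 <= wavelength_nm <= 2500:
        return "SWIR"
    return "outside 400-2500 nm"


# Illustrative wavelengths: the green peak and two of the single-wavelength bands.
for wl in (550, 960, 2200):
    print(wl, spectral_region(wl))
```

Under these boundaries, a band centred at 960 nm falls in the NIR, while one at 2200 nm falls in the SWIR.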
Therefore, spectral reflectance can be readily associated with the physiological and biochemical parameters of agricultural crops [10]. Relating spectral data to the amino acid contents in different genetic materials provides an innovative and efficient approach for understanding and managing genetic diversity. Furthermore, the use of this technology will optimize crop management and accelerate plant breeding programs by enabling the identification of genotypes with desirable traits, without the need for detailed chemical analyses of all samples. This integration of biochemical and spectral data represents a powerful tool for precision agriculture, contributing to agricultural sustainability.
From this perspective, relating spectral data to amino acid contents can provide a novel and effective approach for exploring genetic diversity, improving crop management, and supporting plant breeding programs by offering a rapid, non-destructive, and informative method for assessing complex phenotypic traits such as amino acid contents across different genetic materials. This approach would also enable large-scale evaluations of plant populations, thereby reducing the time and resources required for selection. Therefore, the present study aims to provide insights for an innovative approach to determining amino acid contents through hyperspectral sensing. The specific objectives of this study were (I) to differentiate genetic materials based on amino acid contents and spectral reflectance and (II) to establish the relationship between amino acids and spectral bands derived from hyperspectral data.

2. Materials and Methods

2.1. Field Experiment

The evaluations were based on an experiment conducted in the experimental field of the Federal University of Mato Grosso do Sul (UFMS), located in the municipality of Chapadão do Sul, Mato Grosso do Sul, Brazil (latitude 18°41′33″ S, longitude 52°40′45″ W, altitude 810 m), during the 2023–2024 crop year. A randomized complete block design was used, with 32 soybean genetic materials and four replicates. Each experimental unit consisted of four one-meter-long rows, spaced 0.45 m apart, with a seeding density of 15 seeds per linear meter.
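As an illustration of the layout described above, the sketch below randomizes 32 genotypes independently within each of four blocks and works out the plot dimensions implied by the stated row length, spacing, and seeding density. The genotype labels, the function name `rcbd_layout`, the fixed seed, and the convention of computing plot area as rows × row length × row spacing are assumptions for illustration; the study does not report how its field randomization was generated.

```python
import random


def rcbd_layout(genotypes, n_blocks, seed=0):
    """Return one independently shuffled genotype order per block."""
    rng = random.Random(seed)  # fixed seed so the layout is reproducible
    layout = []
    for _ in range(n_blocks):
        order = list(genotypes)
        rng.shuffle(order)  # every genotype appears exactly once in each block
        layout.append(order)
    return layout


genotypes = [f"G{i}" for i in range(1, 33)]  # hypothetical labels G1..G32
layout = rcbd_layout(genotypes, n_blocks=4)

# Plot arithmetic from the stated dimensions.
rows_per_plot, row_length_m, row_spacing_m, seeds_per_m = 4, 1.0, 0.45, 15
plot_area_m2 = rows_per_plot * row_length_m * row_spacing_m  # assumed area convention
seeds_per_plot = int(rows_per_plot * row_length_m * seeds_per_m)
n_plots = len(layout) * len(genotypes)
print(n_plots, plot_area_m2, seeds_per_plot)
```

With these inputs the design has 128 experimental units, each sown with 60 seeds.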
The region in question has a tropical savanna climate (Aw), according to the Köppen–Geiger classification. Figure 1 presents the climatic data for the experimental area, such as the rainfall distribution and average temperature variation during the crop cycle. The soil in this area is classified as a dystrophic Red Latosol, with a clayey texture. In the 0–20 cm depth layer, the following chemical characteristics were obtained: pH in H2O of 6.2; absence of exchangeable aluminum; combined calcium and magnesium contents of 4.31 cmolc dm−3; available phosphorus of 41.3 mg dm−3; potassium of 0.2 cmolc dm−3; organic matter of 19.74 g dm−3; base saturation of 45%; aluminum saturation of 0%; combined bases of 2.3 cmolc dm−3; and cation exchange capacity (CEC) of 5.1 cmolc dm−3.
Before sowing, conventional soil preparation was performed, using plowing followed by harrowing to level the soil. Seeds of all genotypes were treated with a combination of fungicide (pyraclostrobin + thiophanate-methyl) and insecticide (fipronil), applied at a rate of 200 mL of the solution per 100 kg of seeds. Inoculation with bacteria of the genus Bradyrhizobium was carried out directly in the planting furrow, using the manufacturer’s recommended dose. Crop protection applications occurred between 31 October 2024 and 1 February 2025, totaling eleven interventions with different classes of active ingredients. For weed control, the herbicides haloxyfop-p-methyl (2 L ha−1), fluazifop-p-butyl (0.25 L ha−1), and clethodim (0.5 L ha−1) were used. For insect pest management, the insecticides imidacloprid + beta-cyfluthrin (0.5 L ha−1), lufenuron (0.1125 L ha−1), chlorantraniliprole + thiamethoxam (0.1 L ha−1), lambda-cyhalothrin (0.25 L ha−1), Bacillus thuringiensis (0.04 L ha−1), abamectin (0.1 L ha−1), and spinetoram (0.25 L ha−1) were applied. For disease control, the fungicides pyraclostrobin + difenoconazole (0.6 L ha−1) and azoxystrobin + cyproconazole (0.25 L ha−1) were used. Gibberellic acid (0.4 L ha−1) was used as a growth regulator, together with a mineral–paraffinic oil adjuvant (0.025 L ha−1). Applications followed the specific technical recommendations for each active ingredient, aiming to ensure adequate crop development and agronomic performance.

2.2. Hyperspectral Sensing

A spectral analysis of the leaves was performed 60 days after emergence, at full flowering, when the plants exhibit the greatest photosynthetic activity and nutrient uptake. Three leaf samples were collected from the third fully developed trifoliate leaf, counted from top to bottom, from each randomly selected plot. The samples were taken to the spectroscopy laboratory for spectral data acquisition using a FieldSpec 4 Jr spectroradiometer (Analytical Spectral Devices, Boulder, CO, USA), which covers the spectrum from 350 to 2500 nm. The average spectral curve of the soybean leaves is shown schematically in Figure 2. For each block, the equipment was calibrated against a white barium sulfate reference panel. The sensor was connected to a computer, which recorded the data using the proprietary RS3 software, version 6.4.3. After data collection from the leaves, the individual files generated were imported into the ViewSpectroPro software, version 6.2, and exported in .txt format for the subsequent statistical analyses.
The wavelengths were grouped into 28 bands, each representing specific biochemical activities of the plants (Table 1) [9].
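The band reduction can be sketched as averaging reflectance over wavelength intervals. This is a minimal illustration only: the function name `band_means`, the interval names, the interval limits, and the synthetic spectrum are all hypothetical, since the actual 28 definitions are those given in Table 1 and Ref. [9].

```python
from statistics import mean


def band_means(wavelengths, reflectance, intervals):
    """Average reflectance over each (name, start_nm, end_nm) interval, limits inclusive."""
    result = {}
    for name, start, end in intervals:
        values = [r for w, r in zip(wavelengths, reflectance) if start <= w <= end]
        if not values:
            raise ValueError(f"no readings fall inside band {name}")
        result[name] = mean(values)
    return result


# Synthetic spectrum at 1 nm steps; the reflectance values are made up for illustration.
wavelengths = list(range(350, 2501))
reflectance = [0.05 if w < 700 else 0.45 for w in wavelengths]

# Hypothetical intervals, NOT the study's band definitions.
intervals = [("band_a", 650, 684), ("band_b", 701, 730), ("band_c", 960, 960)]
bands = band_means(wavelengths, reflectance, intervals)
print(bands)
```

A single-wavelength band, such as the hypothetical `band_c`, is handled as an interval whose start and end coincide.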

2.3. Amino Acid Quantification

After the spectral analysis, the leaves collected from each plot were oven-dried at 65 °C and ground for amino acid quantification. Seventeen amino acids were analyzed: L-histidine, L-serine, L-arginine, L-glycine, L-aspartic acid, L-glutamic acid, L-threonine, L-alanine, L-proline, L-cystine, L-lysine, L-tyrosine, L-methionine, L-valine, L-isoleucine, L-leucine, and L-phenylalanine. To this end, 50 mg of the material was subjected to acid hydrolysis with 6 N HCl at 110 °C for 24 h. Subsequently, the samples were filtered (0.2 µm), evaporated under vacuum with silica gel for approximately 16 h, redissolved in 20 mM HCl, and derivatized with AQC (6-aminoquinolyl-N-hydroxysuccinimidyl carbamate) using a Waters® kit (Waters Corp., Milford, MA, USA). The mixture was heated to 55 °C for 10 min and analyzed by ultra-performance liquid chromatography (UPLC Acquity, Waters, 1100 series) equipped with a C18 column (1.8 µm, 2.1 × 100 mm). The analyses were conducted in triplicate, with 1 µL injected per sample. The separation of the 17 amino acids was performed using a linear gradient system involving four mobile phases, at a flow rate of 0.7 mL min−1 and a temperature of 49 °C. The compounds were identified by comparing retention times and UV spectra, with confirmation by standard addition.

2.4. Statistical Analyses

The initial analysis consisted of applying the K-means algorithm to group genotypes based on amino acid levels. This algorithm determines the distances between each sample and the cluster centroids, assigning samples to the closest cluster, so that those with similar characteristics are grouped together and differentiated from others. The clusters were visualized using a biplot generated from a principal component analysis (PCA), allowing the separation between the formed groups to be observed. To execute the method, it is necessary to previously define the number of clusters (k); in this study, division into four clusters best represented the data structure [11]. A schematic summary of how the analyses were carried out is shown in Figure 3.
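The assignment and update steps described above can be sketched in plain Python. This is an illustrative implementation of the standard Lloyd iteration, not the authors' code (they report using R). The function names, the random initialization from k samples, the seed, and the toy profiles are assumptions, and the PCA biplot used for visualization is omitted.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd's algorithm: returns (labels, centroids) for a list of equal-length vectors."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]  # start from k sampled points
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the iteration has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Made-up two-variable profiles for six hypothetical genotypes (not study data).
profiles = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [5.0, 5.1], [5.2, 4.9], [4.8, 5.3]]
labels, centroids = kmeans(profiles, k=2)
print(labels)
```

In practice the variables would usually be standardized first so that no single amino acid or band dominates the distances; whether the study did so is not stated.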
Comparative boxplots were created to analyze the means of each amino acid and spectral band, aiming to observe differences among the groups formed. A correlation analysis was performed to assess the relationships among variables within each group, represented by a correlation network in which red lines indicate negative correlations and green lines indicate positive correlations. The thicker highlighted lines represent correlations above 0.6, as defined by the software used to generate the graph [12]. Due to the low correlations observed between amino acids and spectral bands, only significant correlations with absolute values above 0.20 were considered and highlighted.
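The edge rules of the correlation network can be sketched as follows. This is a hedged illustration rather than the software cited in [12]: the function names, the toy data, and the reading of the 0.20 and 0.6 cut-offs as thresholds on the absolute value of r are assumptions, and the significance test applied in the study is not reproduced.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def network_edges(amino_acids, bands, keep=0.20, strong=0.6):
    """List (amino acid, band, r, colour, weight) for pairs with |r| above `keep`."""
    edges = []
    for aa_name, aa_values in amino_acids.items():
        for band_name, band_values in bands.items():
            r = pearson(aa_values, band_values)
            if abs(r) > keep:
                colour = "green" if r > 0 else "red"  # sign convention from the text
                weight = "thick" if abs(r) > strong else "thin"
                edges.append((aa_name, band_name, round(r, 2), colour, weight))
    return edges


# Made-up values for five plots (illustrative only, not data from the study).
amino_acids = {"proline": [1.0, 1.4, 1.9, 2.1, 2.8]}
bands = {"B27": [0.30, 0.27, 0.22, 0.21, 0.15], "B1": [0.05, 0.06, 0.04, 0.06, 0.05]}
edges = network_edges(amino_acids, bands)
print(edges)
```

In this toy example only the proline and B27 pair survives the 0.20 filter, and it is drawn as a thick red edge because the correlation is strongly negative.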
Means were grouped using the Scott–Knott test [13] at a 5% significance level. All statistical analyses were run in R version 4.4.0 [14] using the ggplot2, ExpDes.pt, and metan packages.

3. Results

The 32 soybean genotypes were subjected to a PCA–K-means analysis to separate the genotypes according to their amino acid contents and spectral behavior. This multivariate analysis enabled the separation of the genotypes into four distinct groups (Figure 4). The largest number of genotypes was grouped in cluster 1, totaling 11 genotypes, while cluster 2 contained the fewest genotypes, with five materials.
Aspartic acid, glutamic acid, alanine, arginine, cystine, phenylalanine, glycine, and histidine contents were compared according to the Scott–Knott clustering to observe the highest and lowest levels of each amino acid within each group (Figure 5). Cluster 1 (C1) showed higher levels of aspartic acid, glutamic acid, alanine, cystine, and phenylalanine. Cluster 2 (C2) was characterized by the lowest concentrations of aspartic acid, glutamic acid, alanine, cystine, and phenylalanine, and exhibited higher levels of arginine and glycine. Cluster 3 (C3) presented intermediate levels for aspartic acid, glutamic acid, alanine, arginine, cystine, phenylalanine, glycine, and histidine. Cluster 4 (C4) showed higher levels only for aspartic acid, alanine, and cystine. Histidine levels were similar across all groups.
The amino acids isoleucine, leucine, lysine, proline, tyrosine, threonine, and valine exhibited higher concentrations in the C1 genotypes (Figure 6). C2 showed the lowest levels of isoleucine, leucine, lysine, proline, tyrosine, and valine, but presented an elevated methionine content. C3 exhibited intermediate levels of isoleucine, leucine, proline, serine, threonine, and valine, as well as high levels of lysine and tyrosine. C4 displayed high concentrations of lysine, methionine, tyrosine, and threonine, and intermediate concentrations of isoleucine, leucine, proline, and valine.
Regarding the spectral bands, each group exhibited a specific pattern across bands B1–B14, with C4 showing the highest reflectance in all bands, followed by C1. Overall, C3 had the lowest reflectance, and there was no difference between C2 and C3 in bands B7–B10 (Figure 7). Groups C1 and C2 displayed similar behavior in bands B12 and B13.
For bands B15–B28, cluster C4 continued to exhibit the highest reflectance, followed by C1, C3, and C2. Between B17 and B20, C2 and C3 showed similar spectral behavior. For bands B21–B26 and B28, C1 and C2 displayed the same spectral pattern (Figure 8).
The Pearson correlations exhibited distinct patterns within each group for the relationships between amino acids and spectral bands (Figure 9). Correlations among spectral bands were positive across all groups. In group 1, arginine and proline showed a strong negative correlation, while serine, alanine, tyrosine, threonine, cystine, isoleucine, glutamine, and lysine demonstrated strong positive correlations among themselves. The correlations between spectral bands and amino acids in this group included some noteworthy results with magnitudes above 0.20: phenylalanine showed negative correlations with all 28 bands; aspartic acid showed negative correlations with all bands except B4 and B5; arginine showed negative correlations with bands B21–B26 and B28; glycine showed a negative correlation with B27; lysine showed negative correlations with B7–B10; serine and tyrosine showed negative correlations with B5.
In group 2, more and stronger correlations were observed compared to group 1. Glutamic acid showed positive correlations with bands B12–B14 and B22–B23; alanine showed negative correlations with all bands, with the strongest correlation at B9; arginine reached −0.20 with bands B16 and B17; cystine showed positive correlations with B27 and B28; histidine showed positive correlations with B11–B14 and B20–B26, ranging from 0.21 to 0.39, and a negative correlation with B27; lysine showed negative correlations with B5, B6–B10, and B27; methionine showed positive correlations with B1, B11–B14, B19–B26, and B28; proline also exhibited positive correlations with B1–B11, B14–B18, B24, and B26–B28; serine showed negative correlations with B5 and B7–B10; tyrosine showed negative correlations with B11–B14; and valine showed a correlation of 0.24 with B28.
In group 3, the following correlations among amino acids were observed: aspartic acid showed strong positive correlations with alanine, serine, tyrosine, and threonine; glutamic acid showed strong positive correlations with phenylalanine, isoleucine, leucine, proline, and valine; alanine showed strong positive correlations with aspartic acid, cystine, serine, tyrosine, and threonine; arginine correlated positively with tyrosine; cystine showed strong positive correlations with alanine, histidine, and valine; phenylalanine showed strong positive correlations with glutamic acid, isoleucine, leucine, proline, and valine; histidine showed a strong correlation only with cystine; isoleucine, leucine, and proline showed strong positive correlations with glutamic acid, phenylalanine, proline, and valine; serine showed strong positive correlations with aspartic acid, alanine, tyrosine, and threonine; tyrosine showed strong correlations with aspartic acid, alanine, arginine, histidine, threonine, and serine; threonine showed positive correlations with aspartic acid, serine, tyrosine, and alanine.
Regarding the correlations between amino acids and spectral bands in group 3, aspartic acid showed negative correlations with B2–B11 and B15–B18; glutamic acid showed positive correlations with B2–B10, B15–B16, and B7; alanine was negatively correlated with B2–B10 and B15–B17, and positively with B28; arginine showed a negative correlation with B15 and positive correlations with B23, B26, and B28; histidine showed a positive correlation with B1; isoleucine showed positive correlations with B2–B4, B6–B10, and B15–B16; methionine showed negative correlations with B1–B10, B15–B19, and B27; proline was positively correlated with B2, B4, B6–B8, and B10; serine showed negative correlations with B2–B3, B5–B11, B15–B18, and B27, and positive correlations with B22–B23; tyrosine was negatively correlated with B2, B4, B7–B10, and B15–B16, and positively with B21–B23, B26, and B28; valine showed a positive correlation with B2.
In group 4, strong positive correlations were observed between aspartic acid and alanine; between glutamic acid and isoleucine, leucine, proline, threonine, and valine; and between alanine and valine, tyrosine, and serine. Arginine showed strong negative correlations with isoleucine, leucine, tyrosine, and proline, and a strong positive correlation with methionine. Glutamic acid and cystine both exhibited strong positive correlations with isoleucine, leucine, threonine, and valine. Phenylalanine displayed a strong positive correlation only with leucine. Glycine showed a strong negative correlation with serine. Histidine showed strong positive correlations with tyrosine and threonine. Isoleucine showed strong positive correlations with glutamic acid, cystine, leucine, proline, threonine, and valine. Leucine showed strong positive correlations with glutamic acid, cystine, phenylalanine, isoleucine, proline, threonine, and valine, and a negative correlation with arginine. Methionine showed a strong positive correlation with arginine and a negative correlation with tyrosine. Proline demonstrated strong positive correlations with glutamic acid, isoleucine, and leucine, and a negative correlation with arginine. Serine showed strong correlations with aspartic acid and alanine, and a negative correlation with glycine. Tyrosine showed strong positive correlations with alanine and histidine, and a negative correlation with arginine. Threonine showed strong correlations with glutamic acid, isoleucine, leucine, and valine. Valine showed strong correlations with glutamic acid, alanine, cystine, isoleucine, and threonine.
Regarding the correlations between bands and amino acids, glutamic acid showed correlations with bands B2, B5, B7–B10, B14–B17, and B22–B28. Arginine showed positive correlations above 0.20 with B14–B15 and B27, and a negative correlation with B26. Cystine showed negative correlations with B2, B8–B10, B15–B17, and B27, and positive correlations with B21, B23, B26, and B28. Glycine showed a correlation of −0.33 with B27. Isoleucine and leucine showed negative correlations with B2, B8–B10, and B15–B17, and positive correlations with bands B22–B28. Lysine showed correlations with B23–B25. Methionine showed positive correlations with B12–B14 and B21–B23. Proline showed negative correlations with B2, B3–B4, B7–B10, B15–B17, and B27, and positive correlations with B23–B26 and B28. Tyrosine showed a correlation of −0.30 with B27. Threonine showed negative correlations with B2, B3–B4, B7–B10, B15–B16, and B27, and positive correlations with B12, B13, B21–B26, and B28. Valine showed negative correlations with bands B8–B10, B15–B16, and B27, and positive correlations with B21–B26 and B28.
The groups formed by the different materials analyzed exhibited distinct profiles in both amino acid composition and spectral behavior. In general, materials in group 1 stood out for having the highest contents of most amino acids, including aspartic acid, glutamic acid, alanine, cystine, phenylalanine, isoleucine, leucine, lysine, proline, tyrosine, threonine, and valine, indicating superiority in terms of amino acid composition. Group 2 was characterized by the lowest concentrations of most amino acids, except for elevated levels of arginine, glycine, and methionine. Group 3 presented intermediate levels for most amino acids, except for high contents of lysine and tyrosine. Group 4 was distinguished by higher levels of lysine, methionine, tyrosine, and threonine, with intermediate values for isoleucine, leucine, proline, and valine.
Regarding spectral behavior, C4 exhibited the highest reflectance across all spectral bands (B1–B28), followed by C1. C3 showed the lowest reflectance in most bands, with a similar pattern to C2 in bands B7–B10 and to C1 in bands B21–B26 and B28. C1 and C2 displayed similar behavior in bands B12 and B13.
In terms of correlations, the bands that showed the strongest associations with amino acids, considering both positive and negative correlations, were as follows.
Notable positive correlations:
i. B21 to B28: These bands were positively correlated with amino acids such as glutamic acid, arginine, cystine, isoleucine, leucine, methionine, proline, threonine, and valine.
ii. B12 to B17: These bands were positively associated with glutamic acid, histidine, methionine, proline, and threonine.
iii. B23 and B26: These bands consistently showed positive correlations with arginine, cystine, proline, tyrosine, and valine.
Notable negative correlations:
i. B2 to B10: These bands were most frequently associated with negative correlations with aspartic acid, alanine, arginine, methionine, proline, serine, tyrosine, and valine.
ii. B15 to B18: These bands showed significant negative correlations with aspartic acid, alanine, cystine, methionine, proline, and serine.
iii. B27: This band was notable for negative correlations with glycine, histidine, lysine, methionine, proline, tyrosine, and valine.

4. Discussion

Ref. [15] reported that, among the most important priorities in soybean production and research is the sustainable provision of soybean protein, which can be achieved through the breeding of high-yielding and high-protein varieties using the crop’s genetic resources. Thus, grouping materials based on their amino acid (AA) content enabled the formation of distinct groups, with particular emphasis on the genetic materials in group 1, which in this study proved to be the most promising for supplying amino acids. Grouping genotypes by AA composition allows for the identification of genetic materials with desirable traits for different purposes, such as grain production or forage use, thereby guiding breeding programs toward the selection of targeted materials. It is important to note that all genetic materials were managed identically throughout the entire crop cycle, from fertilization to pest and disease control. This ensures that the materials in group 1 are indeed superior to the others in terms of AA content in their leaves. The other groups were notable for their different amino acids, allowing for the selection of materials according to the specific components of interest to be improved.
Among the amino acids present at higher levels in the materials of group 1 is proline, an important amino acid associated with drought stress tolerance in plants, functioning as an osmoprotectant and playing a key role as an antioxidant [16,17]. Thus, the amino acid content—especially for proline—may reflect the metabolic efficiency, which can be related to productive performance and resistance to stress, particularly abiotic stress.
There is limited information in the literature regarding the application of hyperspectral data to biochemical parameters in plant leaves, especially concerning the relationship between hyperspectral data and amino acid contents in soybean leaves. In our study, a significant relationship was observed between amino acids and spectral bands calculated from hyperspectral data. Amino acid levels reflect the plant’s metabolic health and its interaction with environmental factors, potentially indicating stress conditions, while spectral data capture morphophysiological variations, such as changes in pigment composition. Integrating these datasets provides a more comprehensive understanding of plant physiology.
Overall, bands B21–B28 exhibited the highest numbers of relevant positive correlations with different amino acids, whereas bands B2–B10 and B15–B18 showed more negative correlations. This suggests that these spectral regions have greater potential for distinguishing specific amino acid concentrations in the analyzed samples. Bands B2–B10 cover the 370–500 nm range, B15–B18 span 650–684 nm, and B21–B28 encompass specific NIR and SWIR regions: 701 to 730 nm (B21–B23), with the remaining bands corresponding specifically to 960, 1100, 1400, 1930, and 2200 nm. Ref. [4] reported that the bands most strongly associated with the majority of amino acids in maize leaves were mainly concentrated in the ranges of 505.39–604.95 nm and 651.21–714.10 nm, which is likely due to the influence of various pigments, especially the chlorophyll content. The authors also note that there are relatively few studies on the non-destructive detection of amino acid contents in leaves using spectroscopy, and that further research is needed to demonstrate its feasibility.
In addition to the visible region, the SWIR region also contributes to establishing correlations with different amino acids. Ref. [18] found that the characteristic wavelengths of amino acid nitrogen were primarily distributed in the long-wave near-infrared region. The use of sensors offers potential opportunities not only at the laboratory level, but also through portable imaging systems and satellite-based platforms, enabling the application of this technology in the field. This approach allows for the assessment of plant performance at the tissue level, in individual plants, or across entire crops, utilizing high spectral resolution systems capable of evaluating biochemical and physiological traits in diverse plant populations [19]. Based on this spectral information, the relationship with the amino acid content can help identify genotypes with greater metabolic efficiency, nutritional quality, or stress resilience, thereby accelerating plant breeding programs.
Previous studies have shown that proteins exhibit absorption peaks in the ranges of 1460–1570 nm and 2000–2180 nm [20,21]. This may explain the negative correlations observed between amino acids and the bands within the SWIR region, as higher amino acid contents would be expected to result in lower reflectance. Thus, the use of these spectral ranges enables the non-destructive detection of proteins, which should primarily be conducted using a shortwave infrared (SWIR) system with specific spectral bands [22], as previously mentioned. By relating spectral characteristics to functional groups such as C-H, N-H, and O-H, it is possible to indirectly detect the water content, total nitrogen, free amino acids, caffeine, and theanine [23,24,25]. The spectral characteristics of these functional groups are associated with ranges from the visible to the near-infrared spectrum, indicating that combining visible and near-infrared spectra provides useful information and can improve the accuracy of detecting specific compounds, such as amino acids in samples [8].
Our results reveal that soybean genotypes can be effectively grouped based on their amino acid profiles and spectral reflectance patterns, demonstrating a strong association between biochemical traits and leaf spectral responses. This ability to discriminate genotypes based on amino acid composition opens up relevant options for breeding programs, as it allows the selection of lines with superior nutritional profiles, especially those with higher concentrations of essential amino acids, which are critical for high-quality food and feed formulations.
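The grouping idea can be sketched with a small K-means example. This is only an illustrative sketch under stated assumptions: it clusters standardized variables directly and omits the PCA step used in the study, it initializes the centroids with the first k observations, and the function names (`standardize`, `kmeans`) and the six two-variable observations are hypothetical rather than data or code from this work.

```python
import math

def standardize(rows):
    """Scale each column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero spread; divide by 1.0 instead to avoid 0/0.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids

# Hypothetical genotypes (NOT data from this study):
# [amino acid content, reflectance of one spectral band]
genotypes = [
    [1.0, 0.40], [2.0, 0.20], [1.1, 0.42],
    [2.1, 0.18], [0.9, 0.41], [1.9, 0.21],
]

labels, centroids = kmeans(standardize(genotypes), k=2)
print(labels)
```

Standardizing first keeps variables measured on different scales (contents versus reflectance) from dominating the distance calculation. In practice, the result of K-means depends on the initialization, so several starts are usually compared.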
Reliably correlating hyperspectral data with leaf amino acid contents represents a significant methodological advancement. This approach allows nutritional traits to be inferred from spectral data, reducing the reliance on destructive, time-consuming, and costly laboratory analyses. Large genetic populations can thus be evaluated quickly, non-destructively, and efficiently, optimizing selection in breeding programs.
In addition to the operational gains, the data from this study reinforce the predictive potential of hyperspectral sensing applied to plant physiology, with an emphasis on the protein nutrition of soybean crops, and pave the way for future research on the relationship between the leaf spectrum and leaf amino acid (AA) contents. Adopting this technology can significantly accelerate the identification of promising genotypes by integrating nutritional and productive attributes simultaneously.
This work, therefore, highlights the transformative role of hyperspectral sensing in genetic improvement and the soybean production chain. By enabling the early and accurate selection of genotypes with desirable amino acid profiles, the technology contributes not only to the development of cultivars with higher nutritional value but also to the sustainable intensification of agricultural production.

5. Conclusions

Leaf reflectance was effective in distinguishing soybean genotypes according to the leaf amino acid content. The spectral bands exhibited specific associations with amino acids. Bands B21 to B28 showed the strongest positive relationships with glutamic acid, arginine, cystine, isoleucine, leucine, methionine, proline, threonine, and valine. Bands B12 to B17 were positively correlated with glutamic acid, histidine, methionine, proline, and threonine, while bands B23 and B26 also presented positive correlations with arginine, cystine, proline, tyrosine, and valine. Conversely, negative associations were observed for bands B2 to B10 with aspartic acid, alanine, arginine, methionine, proline, serine, tyrosine, and valine; bands B15 to B18 with aspartic acid, alanine, cystine, methionine, proline, and serine; and band B27 with glycine, histidine, lysine, methionine, proline, tyrosine, and valine.
These findings reinforce that spectral reflectance, especially in specific regions of the spectrum, can serve as a reliable, non-destructive indicator of amino acid composition in soybean leaves, supporting advanced phenotyping and selection in breeding programs.

Author Contributions

Conceptualization, A.C.d.S.C.S. and P.E.T.; methodology, A.C.d.S.C.S. and L.P.R.T.; software, P.E.T. and C.A.d.S.J.; validation, D.C.S., E.V.M.S. and I.A.O.; formal analysis, C.N.S.C.; investigation, F.H.R.B.; resources, L.P.R.T.; data curation, C.A.d.S.J.; writing—original draft preparation, A.C.d.S.C.S., P.E.T. and D.C.S.; writing—review and editing, R.F.R.; visualization, I.A.O. and E.V.M.S.; supervision, F.H.R.B.; project administration, A.C.d.S.C.S., L.P.R.T., P.E.T. and R.F.R.; funding acquisition, C.N.S.C. and F.H.R.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Acknowledgments

The authors would like to thank the Universidade Federal de Mato Grosso do Sul (UFMS), Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)—Grant numbers 308295/2023-4, 309250/2021-8, 306022/2021-4 and 304979/2022-8, and Fundação de Apoio ao Desenvolvimento do Ensino, Ciência e Tecnologia do Estado de Mato Grosso do Sul (FUNDECT) TO numbers 88/2021, 07/2022, 318/2022 and 94/2023, and SIAFEM numbers 30478, 31333, 32242 and 33111. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brazil (CAPES)—Financial Code 001.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Savary, S.; Willocquet, L.; Pethybridge, S.J.; Esker, P.; McRoberts, N.; Nelson, A. The Global Burden of Pathogens and Pests on Major Food Crops. Nat. Ecol. Evol. 2019, 3, 430–439.
  2. Kanchana, P.; Santha, M.L.; Raja, K.D. A Review on Glycine max (L.) Merr. (Soybean). World J. Pharm. Pharm. Sci. 2016, 5, 356–371.
  3. Kim, W.-S.; Chronis, D.; Juergens, M.; Schroeder, A.C.; Hyun, S.W.; Jez, J.M.; Krishnan, H.B. Transgenic Soybean Plants Overexpressing O-Acetylserine Sulfhydrylase Accumulate Enhanced Levels of Cysteine and Bowman–Birk Protease Inhibitor in Seeds. Planta 2012, 235, 13–23.
  4. Shu, M.; Zhou, L.; Chen, H.; Wang, X.; Meng, L.; Ma, Y. Estimation of Amino Acid Contents in Maize Leaves Based on Hyperspectral Imaging. Front. Plant Sci. 2022, 13, 885794.
  5. Li, Z.; Li, Z.; Fairbairn, D.; Li, N.; Xu, B.; Feng, H.; Yang, G. Multi-LUTs Method for Canopy Nitrogen Density Estimation in Winter Wheat by Field and UAV Hyperspectral. Comput. Electron. Agric. 2019, 162, 174–182.
  6. Mao, Z.-H.; Deng, L.; Duan, F.-Z.; Li, X.-J.; Qiao, D.-Y. Angle Effects of Vegetation Indices and the Influence on Prediction of SPAD Values in Soybean and Maize. Int. J. Appl. Earth Obs. Geoinf. 2020, 93, 102198.
  7. Liang, D.; Zhou, Q.; Ling, C.; Gao, L.; Mu, X.; Liao, Z. Research Progress on the Application of Hyperspectral Imaging Techniques in Tea Science. J. Chemom. 2023, 37, e3481.
  8. Long, T.; Tang, X.; Liang, C.; Wu, B.; Huang, B.; Lan, Y.; Xu, H.; Liu, S.; Long, Y. Detecting Bioactive Compound Contents in Dancong Tea Using VNIR-SWIR Hyperspectral Imaging and KRR Model with a Refined Feature Wavelength Method. Food Chem. 2024, 460, 140579.
  9. da Silva Junior, C.A.; Nanni, M.R.; Shakir, M.; Teodoro, P.E.; de Oliveira-Júnior, J.F.; Cezar, E.; de Gois, G.; Lima, M.; Wojciechowski, J.C.; Shiratsuchi, L.S. Soybean Varieties Discrimination Using Non-Imaging Hyperspectral Sensor. Infrared Phys. Technol. 2018, 89, 338–350.
  10. Pandey, P.; Prabhakar, R. An Analysis of Machine Learning Techniques (J48 & AdaBoost)-for Classification. In Proceedings of the 2016 1st India International Conference on Information Processing (IICIP), Delhi, India, 12–14 August 2016; pp. 1–6.
  11. Ahmed, M.; Seraj, R.; Islam, S.M.S. The K-Means Algorithm: A Comprehensive Survey and Performance Evaluation. Electronics 2020, 9, 1295.
  12. Bhering, L.L. Rbio: A Tool for Biometric and Statistical Analysis Using the R Platform. Crop Breed. Appl. Biotechnol. 2017, 17, 187–190.
  13. Scott, A.J.; Knott, M. A Cluster Analysis Method for Grouping Means in the Analysis of Variance. Biometrics 1974, 30, 507–512.
  14. R Core Team. R: A Language and Environment for Statistical Computing; R Core Team: Vienna, Austria, 2013.
  15. Guo, B.; Sun, L.; Jiang, S.; Ren, H.; Sun, R.; Wei, Z.; Hong, H.; Luan, X.; Wang, J.; Wang, X. Soybean Genetic Resources Contributing to Sustainable Protein Production. Theor. Appl. Genet. 2022, 135, 4095–4121.
  16. Teixeira, W.F.; Soares, L.H.; Fagan, E.B.; da Costa Mello, S.; Reichardt, K.; Dourado-Neto, D. Amino Acids as Stress Reducers in Soybean Plant Growth under Different Water-Deficit Conditions. J. Plant Growth Regul. 2020, 39, 905–919.
  17. Gill, R.J. New State Record: Redgum Lerp Psyllid, Glycaspis Brimblecombei. Calif. Plant Pest Dis. Rep. 1998, 17, 7–8.
  18. Huang, H.; Hu, X.; Tian, J.; Jiang, X.; Luo, H.; Huang, D. Rapid Detection of the Reducing Sugar and Amino Acid Nitrogen Contents of Daqu Based on Hyperspectral Imaging. J. Food Compos. Anal. 2021, 101, 103970.
  19. Vergara-Diaz, O.; Vatter, T.; Kefauver, S.C.; Obata, T.; Fernie, A.R.; Araus, J.L. Assessing Durum Wheat Ear and Leaf Metabolomes in the Field through Hyperspectral Data. Plant J. 2020, 102, 615–630.
  20. Shenk, J.S.; Workman, J.J., Jr.; Westerhaus, M.O. Application of NIR Spectroscopy to Agricultural Products. In Handbook of Near-infrared Analysis; CRC Press: Boca Raton, FL, USA, 2007; pp. 365–404.
  21. Chelladurai, V.; Jayas, D.S. Near-Infrared Imaging and Spectroscopy. In Imaging with Electromagnetic Spectrum: Applications in Food and Agriculture; Springer: Berlin/Heidelberg, Germany, 2014; pp. 87–127.
  22. Li, X.; Peng, F.; Wei, Z.; Han, G.; Liu, J. Non-Destructive Detection of Protein Content in Mulberry Leaves by Using Hyperspectral Imaging. Front. Plant Sci. 2023, 14, 1275004.
  23. Hong, T.; Yin, J.-Y.; Nie, S.-P.; Xie, M.-Y. Applications of Infrared Spectroscopy in Polysaccharide Structural Analysis: Progress, Challenge and Perspective. Food Chem. X 2021, 12, 100168.
  24. Shen, S.; Hua, J.; Zhu, H.; Yang, Y.; Deng, Y.; Li, J.; Yuan, H.; Wang, J.; Zhu, J.; Jiang, Y. Rapid and Real-Time Detection of Moisture in Black Tea during Withering Using Micro-near-Infrared Spectroscopy. LWT 2022, 155, 112970.
  25. Rébufa, C.; Pany, I.; Bombarda, I. NIR Spectroscopy for the Quality Control of Moringa Oleifera (Lam.) Leaf Powders: Prediction of Minerals, Protein and Moisture Contents. Food Chem. 2018, 261, 311–321.
Figure 1. Mean temperature and rainfall over the experiment period.
Figure 2. Average spectral signature of soybean leaves.
Figure 3. Graphical summary of the methodological procedures.
Figure 4. PCA–K-means for the separation of groups according to the similarity in amino acid contents and spectral behavior of soybean genotypes. The numbers 1–32 represent the genotypes. The colors red, green, blue, and purple represent groups 1–4, respectively.
Figure 5. Mean comparison of aspartic acid, glutamic acid, alanine, arginine, cystine, phenylalanine, glycine, and histidine contents for each group formed. Different letters indicate significant differences according to the Scott–Knott test at the 5% significance level.
Figure 6. Mean comparison for isoleucine, leucine, lysine, methionine, proline, serine, tyrosine, threonine, and valine contents for each group formed. Different letters indicate significant differences according to the Scott–Knott test at the 5% significance level.
Figure 7. Mean comparison for bands B1–B14 for each group formed. Different letters indicate significant differences according to the Scott–Knott test at the 5% significance level.
Figure 8. Mean comparison for bands B15–B28 for each group formed. Different letters indicate significant differences according to the Scott–Knott test at the 5% significance level.
Figure 9. Pearson correlations between amino acids and spectral data for each cluster formed (C1 to C4).
Table 1. Ranges and average wavelengths (nm) of the 28 selected spectral bands.
Band (No.) | Spectral Range (nm) | Mean Wavelength (nm)
1 | 350–369 | 359.5
2 | 370 | 370.0
3 | 371–419 | 395.0
4 | 420 | 420.0
5 | 421–424 | 422.5
6 | 425 | 425.0
7 | 426–444 | 435.0
8 | 445–475 | 460.0
9 | 480 | 480.0
10 | 481–500 | 490.5
11 | 501–530 | 515.5
12 | 531–539 | 535.0
13 | 540 | 540.0
14 | 541–649 | 595.0
15 | 650 | 650.0
16 | 661–670 | 665.5
17 | 675 | 675.0
18 | 676–684 | 680.0
19 | 685–689 | 687.0
20 | 690–700 | 695.0
21 | 701–709 | 705.0
22 | 710 | 710.0
23 | 711–730 | 720.5
24 | 960 | 960.0
25 | 1100 | 1100.0
26 | 1400 | 1400.0
27 | 1930 | 1930.0
28 | 2200 | 2200.0
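The band definitions in Table 1 can be applied to a full spectrum as in the sketch below. It assumes readings at 1 nm steps stored in a dictionary keyed by integer wavelength, and it assumes that each band value is the simple arithmetic mean of the reflectance over the band's range; the exact averaging procedure used in the study is not detailed here. The names `BANDS` and `band_means` and the synthetic spectrum are hypothetical.

```python
# (band number, first wavelength, last wavelength) in nm, taken from Table 1.
BANDS = [
    (1, 350, 369), (2, 370, 370), (3, 371, 419), (4, 420, 420),
    (5, 421, 424), (6, 425, 425), (7, 426, 444), (8, 445, 475),
    (9, 480, 480), (10, 481, 500), (11, 501, 530), (12, 531, 539),
    (13, 540, 540), (14, 541, 649), (15, 650, 650), (16, 661, 670),
    (17, 675, 675), (18, 676, 684), (19, 685, 689), (20, 690, 700),
    (21, 701, 709), (22, 710, 710), (23, 711, 730), (24, 960, 960),
    (25, 1100, 1100), (26, 1400, 1400), (27, 1930, 1930), (28, 2200, 2200),
]

def band_means(spectrum):
    """Average reflectance per band.

    spectrum: dict mapping integer wavelength (nm) to reflectance.
    Returns a dict such as {"B1": ..., ..., "B28": ...}.
    """
    result = {}
    for band, start, end in BANDS:
        values = [spectrum[wl] for wl in range(start, end + 1) if wl in spectrum]
        if not values:
            raise ValueError(f"no readings available for band B{band}")
        result[f"B{band}"] = sum(values) / len(values)
    return result

# Synthetic spectrum for illustration only (NOT measured data):
# reflectance rises linearly with wavelength across 350-2500 nm.
spectrum = {wl: wl / 10000 for wl in range(350, 2501)}
means = band_means(spectrum)
print(len(means), "bands computed")
```

Because the synthetic reflectance is proportional to wavelength, each band mean equals the band's mean wavelength divided by 10000, which gives a quick check against the third column of Table 1.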
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Seron, A.C.d.S.C.; Santana, D.C.; Oliveira, I.A.; Campos, C.N.S.; Teodoro, L.P.R.; Silva, E.V.M.; Ratke, R.F.; Baio, F.H.R.; da Silva Junior, C.A.; Teodoro, P.E. Relationship Between Hyperspectral Data and Amino Acid Composition in Soybean Genotypes. AgriEngineering 2025, 7, 265. https://doi.org/10.3390/agriengineering7080265
