1. Introduction
In most Asian countries, rice is a staple food with major agricultural, economic, and cultural importance, and this is especially true in Japan. Recent concerns about domestic rice shortages, stemming from climate change and global market fluctuations, have prompted government market interventions, including the release of stockpiled rice. Additionally, premium rice varieties are being traded at high prices, with added value attributed to their origin and variety [
1]. While maintaining brand names provides economic incentives for producers, it also increases the risk of mislabeling and false origin claims.
Since September 2022, Japan has required all processed foods to carry country-of-origin labeling for key ingredients [
2]. This highlights the urgent need for rapid, scientific methods to accurately identify the source of ingredients. Accordingly, various studies have explored combining spectroscopic techniques with machine learning for food authentication. For example, Yang et al. [
3] used terahertz (THz) spectroscopy, whereas Quan et al. [
4] and Gu et al. [
5] employed UV-visible spectroscopy. Fluorescence spectroscopy has also been applied to olive oil, citrus fruits, honey, wine, and tea [
6,
7,
8,
9,
10,
11]. Because it is rapid and highly sensitive, fluorescence fingerprinting has become an effective method for evaluating the quality and authenticity of a variety of agricultural products. Although NIR spectroscopy has also been applied to tea [
12,
13] and peaches [
14], and several studies have demonstrated its applicability to rice using chemometrics and machine learning approaches [
15,
16,
17], few have compared NIR with fluorescence fingerprinting or integrated deep learning–based models across multiple Japanese varieties. This study is intended to fill that gap.
Nevertheless, some progress has been made using Raman spectroscopy and chemometrics for quality assessment and counterfeit detection [
18], EEM spectra to identify geographic origin [
19], and fluorescence spectroscopy and support vector machine (SVM) for high-precision classification [
20,
21]. While research on camellia oil and green tea has been reported [
22,
23,
24,
25], and more recently on oolong and blended teas [
13,
21], very few studies have specifically focused on rice. DNA markers have proven successful in identifying species and varieties of seaweed and other crops, demonstrating high reliability [
26]. Although widely accepted and highly reliable for varietal identification, DNA marker analysis requires specialized equipment, involves higher costs and longer processing times, and is destructive to the sample, which limits its routine use for food authentication. In contrast, combining spectroscopy with machine learning offers a rapid, low-cost alternative that is better suited to routine industrial applications.
Fluorescence and NIR spectroscopy were selected because they provide complementary information: fluorescence reflects specific fluorophores such as amino acids and Maillard reaction products [
6,
7,
8,
9,
10,
11], whereas NIR captures bulk compositional features including water, carbohydrates, proteins, and lipids [
12,
13,
14]. Using both techniques allowed us to directly compare their discrimination ability and to explore the potential of multimodal approaches.
Previously, we identified the origins and varieties of dried kelp [
27] and green tea [
28] by combining fluorescence spectroscopy and a convolutional neural network (CNN). Expanding on this, in the current study, we collected fluorescence and NIR spectroscopic data from five polished rice varieties in Japan (Akitakomachi, Hitomebore, Hinohikari, Koshihikari, and Nanatsuboshi) and constructed identification models using machine learning techniques, namely, CNN, k-nearest neighbors (KNN), random forest (RF), logistic regression (LR), and SVM. These varieties were selected because they are widely cultivated, economically important, and closely related, making accurate identification particularly relevant for authenticity assurance. To our knowledge, no previous study has combined both fluorescence and NIR spectroscopy with machine learning for rice variety authentication. We also developed hybrid models by integrating CNN features with traditional algorithms to improve identification accuracy. Finally, we compared the identification performance of models using NIR spectra with that of models using fluorescence fingerprint data. Collectively, the findings of this study provide a practical tool for food authentication, brand protection, and counterfeit prevention.
2. Materials and Methods
2.1. Target Foods
This study examined five polished rice varieties (Akitakomachi, Hinohikari, Hitomebore, Koshihikari, and Nanatsuboshi;
Table 1) descended from the Koshihikari lineage that are widely cultivated and distributed throughout Japan (
Figure 1), with the largest planting area among the non-glutinous rice varieties harvested in 2023. Three samples of each variety were purchased in 2024 [
29].
The genealogical background of each variety is detailed as follows. Akitakomachi was developed by crossing Koshihikari with Ou 292 and is extensively cultivated throughout the Tohoku region [
30]. Hitomebore originated from a cross between Koshihikari and Hatsuboshi (Aichi 26), and is predominantly grown in Miyagi Prefecture [
31]. Hinohikari was developed by crossing Koganebare (Aichi 40) with Koshihikari, and is widely cultivated across western Japan [
32]. Nanatsuboshi was developed in Hokkaido by crossing Hitomebore with Ku-kei 90242A, followed by an additional cross with Akiho (Soriku 150), and is known for its stable quality under cold climate conditions [
33]. Koshihikari, introduced in 1945, is regarded as a representative Japanese variety and continues to maintain high popularity and market value [
34].
Before spectrofluorometric and NIR spectrometric analysis, all samples were ground for approximately 1 min (30 s × 2 repetitions) in a DR MILLS mill (model: DM-7452; Guangzhou, Guangdong, China). The ground samples were then stored for less than 1 month in airtight containers at room temperature (~25 °C) to prevent degradation from exposure to oxygen and humidity. This preparation process minimized variability during spectroscopic measurements and improved reproducibility and analytical accuracy.
2.2. Acquisition of EEM Spectra
EEM spectral data were acquired for rice flour samples using a calibrated F-7100 fluorescence spectrophotometer (Hitachi High-Tech Science, Hitachi-shi, Ibaraki, Japan). Subsamples of three ground products from different manufacturers per variety were measured, with 100 independent readings for each of the 15 products, totaling 1500 fluorescence spectral data points. Measurements used excitation/emission ranges of 250–550 nm, with intervals of 10 nm (excitation) and 5 nm (emission), at a photomultiplier voltage of 680 V.
Each EEM was saved as a three-dimensional fluorescence matrix (FD3 file), converted to text format (TXT file) via FL Solution 4.2, and then to PNG images using Python. During preprocessing, a ±30 nm band around each wavelength was masked to eliminate scattered light, and fluorescence intensities in all 2D images were rescaled to a common range of 10 to 2630. Images were cropped to 360 × 360 pixels using Python to ensure a common scale and to remove unnecessary white space (
Figure 2). Four-digit sequential numbers (0001–1500) were assigned to the 2D image file names. A CSV file was created to match the file number of each image. The “No.” column contained a four-digit number; the “class” column contained the variety name, and the “class_maker” column contained the variety and manufacturer information. The 2D images and CSV files were used as training and evaluation data for the machine learning models.
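The masking and rescaling steps described above can be sketched as follows. This is a minimal illustration rather than the implementation used in this study. It assumes that the EEM has already been read into a NumPy array with excitation along rows and emission along columns. The function name `preprocess_eem`, the restriction of the mask to the first-order scatter line (emission ≈ excitation), and the use of clipping to impose the 10–2630 intensity range are all assumptions made for the example.

```python
import numpy as np

# Wavelength axes implied by the acquisition settings in Section 2.2.
EX = np.arange(250, 551, 10)   # excitation, 10 nm steps -> 31 values
EM = np.arange(250, 551, 5)    # emission, 5 nm steps -> 61 values

def preprocess_eem(eem, ex=EX, em=EM, half_width=30, vmin=10, vmax=2630):
    """Mask the scatter band and clip intensities to a common range.

    Assumptions (not stated in the text): the masked band is the
    first-order scatter line where emission ~= excitation, masked cells
    are set to vmin, and the 10-2630 range is imposed by clipping.
    """
    eem = np.asarray(eem, dtype=float)
    # Cells with |emission - excitation| <= 30 nm form the scatter band.
    scatter = np.abs(em[None, :] - ex[:, None]) <= half_width
    out = np.clip(eem, vmin, vmax)
    out[scatter] = vmin
    return out

# Random intensities standing in for a measured EEM.
rng = np.random.default_rng(42)
toy = rng.uniform(0, 5000, size=(EX.size, EM.size))
clean = preprocess_eem(toy)
print(clean.shape, clean.min(), clean.max())
```

With these settings, each EEM is a 31 × 61 matrix, and every image shares the same intensity scale before being rendered to PNG.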
2.3. Acquisition of Near-Infrared Spectroscopic Data
NIR spectroscopy measurements were conducted using a compact spectrometer (NIR Meter, Spectra Co-op, Tokyo, Japan) operating within the 900–1700 nm wavelength range.
All NIR spectral data were compiled into a single CSV file for subsequent machine learning model training and validation. Absorbance data were stratified according to the “product” column in the CSV file and divided into training and test datasets at an 8:2 ratio (1200 training samples, 300 test samples). To maintain reproducibility, the random seed was set to 42.
2.4. Developing a Machine Learning Model: Fluorescence Fingerprint Data
Rice varieties were identified using traditional machine learning algorithms (KNN, RF, LR, and SVM) and a deep learning algorithm (CNN). KNN, RF, LR, and SVM were selected as widely used benchmark algorithms in chemometrics and food authenticity research. CNN was also employed because it can directly process fluorescence fingerprint images and extract high-dimensional features. In addition to single-algorithm models, hybrid models were constructed that combined CNN feature extraction with other classifiers. For non-CNN models, 2D images were stratified by the class_maker column and split 8:2 into training (1200) and test (300) sets. The training set was further split into training (960 items) and validation (240 items) data. For both data divisions, the random seed was fixed at random_state = 42 for reproducibility. Additionally, stratified sampling was applied to all splits to avoid dataset bias.
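The two-stage stratified split can be illustrated with scikit-learn's `train_test_split`. The sketch below is not the code used in this study. The label array is a synthetic stand-in for the variety-and-manufacturer column of the CSV file, and the maker codes A, B, and C are hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical labels: 5 varieties x 3 makers x 100 images = 1500 items,
# standing in for the variety-and-manufacturer column of the CSV file.
varieties = ["Akitakomachi", "Hinohikari", "Hitomebore",
             "Koshihikari", "Nanatsuboshi"]
class_maker = np.array(
    [f"{v}_{m}" for v in varieties for m in "ABC" for _ in range(100)])
image_ids = np.arange(1, 1501)   # file numbers 0001-1500

# Stage 1: 8:2 split into training (1200) and test (300) items.
train_ids, test_ids, train_lab, test_lab = train_test_split(
    image_ids, class_maker, test_size=0.2,
    stratify=class_maker, random_state=42)

# Stage 2: 8:2 split of the training items into training (960)
# and validation (240) items.
fit_ids, val_ids, fit_lab, val_lab = train_test_split(
    train_ids, train_lab, test_size=0.2,
    stratify=train_lab, random_state=42)

print(len(fit_ids), len(val_ids), len(test_ids))
```

Because each of the 15 products contributes 100 images, stratification places 64, 16, and 20 images per product in the training, validation, and test sets, respectively.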
2.4.1. Single-Algorithm Models
Fluorescence fingerprint images served as the input data to develop five classification models using distinct algorithms: CNN, KNN, RF, LR, and SVM. All models were implemented using Python.
The CNN was constructed using TensorFlow (ver. 2.18.0) and Keras (ver. 3.9). Its configuration comprised three convolutional layers (Conv2D) with 16, 32, and 64 filters, each with a kernel size of 3 × 3. The ReLU activation function was applied throughout, with L2 regularization (coefficient = 0.001). Each convolutional layer was followed by a 2 × 2 max pooling layer (MaxPool2D). After the convolutional blocks, the model included a fully connected dense layer with 256 units, activated by ReLU and regularized with L2, followed by a dropout layer (rate = 0.4) to prevent overfitting. The output layer consisted of five units corresponding to the classification targets, with a softmax activation function. Training used the Adam optimizer with an initial learning rate of 0.0001, categorical cross-entropy loss, accuracy as the evaluation metric, 100 epochs, and a batch size of 16. A ModelCheckpoint callback saved the network whenever validation accuracy reached a new peak. The CNN-based identification program was run ten times, and the highest-performing model from each run was saved (.keras format), yielding ten sequentially numbered model files.
The KNN, RF, LR, and SVM models were implemented using scikit-learn with standard parameter settings, while key hyperparameters were systematically optimized to maximize test accuracy (e.g., n_neighbors for KNN, n_estimators for RF, and C for LR and SVM). For each algorithm, the smallest parameter value yielding the highest accuracy was adopted. This consistent optimization framework was applied across all algorithms to enable a fair comparison with the CNN-based and hybrid models, which has not been explicitly reported in previous studies.
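The selection rule described above (adopt the smallest parameter value that reaches the highest accuracy) can be written as a short helper. The function name `pick_smallest_best` and the accuracy values are illustrative only and are not results from this study.

```python
def pick_smallest_best(scores, tol=1e-12):
    """Return the smallest parameter value whose accuracy ties the maximum.

    `scores` maps a numeric hyperparameter value to its accuracy.
    """
    best = max(scores.values())
    return min(p for p, acc in scores.items() if best - acc <= tol)

# Made-up accuracies for an n_neighbors sweep (not measured values).
knn_scores = {3: 0.91, 5: 0.94, 7: 0.94, 9: 0.93}
print(pick_smallest_best(knn_scores))  # 5
```

Note that the accuracies compared here are test-set accuracies, as stated in the text, so the selected value is tuned to the same data used for the final evaluation.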
The overall architecture of the CNN model is shown in
Figure 3. It consists of three convolutional layers (Conv2D: 16, 32, and 64 filters, kernel size 3 × 3, ReLU activation) followed by 2 × 2 max pooling layers, a fully connected dense layer (256 units, ReLU, L2 regularization = 0.001), a dropout layer (rate = 0.4), and an output layer with five softmax units corresponding to the rice varieties. This schematic provides a visual summary of the configuration described above, clarifying the structure of the CNN model used in this study.
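For readers who wish to reproduce the architecture, a minimal Keras sketch consistent with this description is given below. It is not the original training script. The number of input channels (three), the default "valid" padding, the layer name `dense_features`, and the checkpoint file name are assumptions, and pixel scaling is omitted because it is not described in the text.

```python
import keras
from keras import layers, regularizers

def build_cnn(input_shape=(360, 360, 3), n_classes=5, l2=0.001):
    """CNN following Section 2.4.1; channels and padding are assumed."""
    reg = regularizers.L2(l2)
    inputs = keras.Input(shape=input_shape)
    x = inputs
    # Three Conv2D blocks (16, 32, 64 filters), each followed by 2 x 2 pooling.
    for n_filters in (16, 32, 64):
        x = layers.Conv2D(n_filters, (3, 3), activation="relu",
                          kernel_regularizer=reg)(x)
        x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Flatten()(x)
    x = layers.Dense(256, activation="relu", kernel_regularizer=reg,
                     name="dense_features")(x)
    x = layers.Dropout(0.4)(x)
    outputs = layers.Dense(n_classes, activation="softmax")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-4),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model

# Save the model only when validation accuracy improves
# (the file name is hypothetical).
checkpoint = keras.callbacks.ModelCheckpoint(
    "best_model_01.keras", monitor="val_accuracy",
    mode="max", save_best_only=True)

# A small input size keeps this demonstration light; the study used 360 x 360.
model = build_cnn(input_shape=(64, 64, 3))
print(model.output_shape)

# Training would then follow the settings in the text, for example:
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=100, batch_size=16, callbacks=[checkpoint])
```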
2.4.2. Hybrid Models
Rice variety identification follows a two-stage process, involving feature extraction and classification, which is typical in image-based deep learning applications. In this study, hybrid models were developed that integrated optimized algorithms from both stages. Specifically, feature extraction was performed using a CNN, while classification leveraged conventional machine learning algorithms (KNN, RF, LR, or SVM), resulting in four hybrid model combinations: CNN+KNN, CNN+RF, CNN+LR, and CNN+SVM.
A trained model (.keras format) with the highest validation accuracy among the CNN-only models was selected. The output vector of the fully connected layer (dense, units = 256, activation = ReLU, and kernel regularizer = L2[0.001]) of the CNN model generated a 256-dimensional feature vector for each image.
The conventional algorithms used for classification were configured under the same conditions as those applied in their independent model. However, key hyperparameters—n_neighbors for KNN, n_estimators for RF, and C (inverse of the regularization term) for LR and SVM—were further optimized to maximize the variety identification accuracy on the test dataset. The most effective parameters identified through this process were subsequently adopted.
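The two stages of a hybrid model can be sketched as follows, using CNN+KNN as the example. This is an illustration under stated assumptions, not the code used in this study. A small untrained network stands in for the selected .keras model, the layer name `dense_features` is hypothetical, and random arrays replace the fluorescence fingerprint images.

```python
import numpy as np
import keras
from keras import layers
from sklearn.neighbors import KNeighborsClassifier

# Stand-in for the trained CNN; in practice the selected .keras file
# would be loaded with keras.models.load_model instead.
inputs = keras.Input(shape=(32, 32, 3))
x = layers.Conv2D(16, (3, 3), activation="relu")(inputs)
x = layers.MaxPooling2D((2, 2))(x)
x = layers.Flatten()(x)
x = layers.Dense(256, activation="relu", name="dense_features")(x)
outputs = layers.Dense(5, activation="softmax")(x)
cnn = keras.Model(inputs, outputs)

# Stage 1: cut the network at the 256-unit dense layer.
extractor = keras.Model(cnn.input, cnn.get_layer("dense_features").output)

# Random arrays standing in for images and variety labels.
rng = np.random.default_rng(42)
x_train, y_train = rng.random((40, 32, 32, 3)), rng.integers(0, 5, 40)
x_test = rng.random((10, 32, 32, 3))

f_train = extractor.predict(x_train, verbose=0)   # one 256-d vector per image
f_test = extractor.predict(x_test, verbose=0)

# Stage 2: a conventional classifier trained on the CNN features.
knn = KNeighborsClassifier(n_neighbors=3).fit(f_train, y_train)
pred = knn.predict(f_test)
print(f_train.shape, pred.shape)
```

Replacing `KNeighborsClassifier` with `RandomForestClassifier`, `LogisticRegression`, or `SVC` gives the other three combinations.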
2.5. Developing Machine Learning Models: Near-Infrared Spectroscopy Data
Identification models for NIR spectroscopy were constructed using KNN, LR, RF, and SVM. All the algorithms were implemented using Python and the scikit-learn library, and the key hyperparameters for each algorithm were optimized through searches.
For KNN, n_neighbors ranged from 3 to 20; for LR and SVM, C ranged from 0.0001 to 10,000; and for RF, n_estimators ranged from 10 to 200. Parameters maximizing test accuracy were selected, and max_iter (LR) was set to 1000, probability (SVM) to True, and random_state to 42 in all applicable cases.
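The search ranges above can be expressed as a small scikit-learn sweep. The sketch below is illustrative only. The step sizes (every integer for n_neighbors, powers of ten for C, and steps of 10 for n_estimators) are assumptions, since only the ranges are reported, and synthetic data replace the NIR absorbance table.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Ranges from the text; the step sizes are assumptions.
GRIDS = {
    "KNN": ("n_neighbors", range(3, 21),
            lambda v: KNeighborsClassifier(n_neighbors=v)),
    "LR": ("C", [10.0 ** k for k in range(-4, 5)],
           lambda v: LogisticRegression(C=v, max_iter=1000, random_state=42)),
    "SVM": ("C", [10.0 ** k for k in range(-4, 5)],
            lambda v: SVC(C=v, probability=True, random_state=42)),
    "RF": ("n_estimators", range(10, 201, 10),
           lambda v: RandomForestClassifier(n_estimators=v, random_state=42)),
}

def sweep(name, x_tr, y_tr, x_te, y_te):
    """Return {parameter value: test accuracy} for one algorithm's grid."""
    _, values, make = GRIDS[name]
    return {v: make(v).fit(x_tr, y_tr).score(x_te, y_te) for v in values}

# Synthetic five-class data standing in for the NIR spectra.
X, y = make_classification(n_samples=300, n_features=20, n_informative=8,
                           n_classes=5, random_state=42)
x_tr, x_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

knn_scores = sweep("KNN", x_tr, y_tr, x_te, y_te)
# Grids are ascending, so max() returns the smallest value among ties.
best = max(knn_scores, key=knn_scores.get)
print(best, knn_scores[best])
```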
4. Discussion
This study demonstrated that rice variety identification achieved high accuracy rates (>0.9) utilizing both fluorescence fingerprints and NIR spectra. Among the models employing NIR spectra, the KNN algorithm attained the highest accuracy (0.9367), surpassing the LR, RF, and SVM models applied to fluorescence fingerprint data. The highest overall accuracy was observed with the CNN model (0.9717) using fluorescence fingerprints as input. While NIR spectra primarily provide absorption information from bulk constituents, such as water, carbohydrates, lipids, and proteins [
35], fluorescence fingerprints offer a two-dimensional matrix of excitation and emission wavelengths [
36], potentially enabling clearer visualization of differences in the chemical properties and abundance of fluorescent compounds within the samples.
Hybrid models that combined a CNN trained on fluorescence fingerprints with conventional algorithms (KNN, RF, LR, and SVM) consistently achieved higher classification accuracy than single-algorithm models. This enhancement is attributed to the 256-dimensional feature vector extracted by the deep learning model (CNN), which was effectively leveraged by conventional classification algorithms to discriminate subtle inter-varietal differences. Notably, the CNN+KNN model yielded the best performance, achieving an average classification accuracy of 0.9817.
Increased fluorescence intensity was detected in 2D images at an excitation of 280 nm and emission of 340 nm across all varieties. Among these, Nanatsuboshi exhibited a particularly strong fluorescence, resulting in precision and recall values of 1.000. This pronounced peak likely contributed to Nanatsuboshi’s high classification accuracy. Similarly, Akitakomachi exhibited robust fluorescence within the same spectral range, demonstrating high classification accuracy with a precision of 1.000 and a recall of 0.9933. In contrast, Hinohikari, Hitomebore, and Koshihikari shared similar fluorescence profiles in this wavelength range, contributing to their mutual misclassification.
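For reference, per-class precision and recall are obtained from a confusion matrix as shown in the sketch below. The three-class matrix is invented purely to illustrate the calculation and does not reproduce any result of this study.

```python
import numpy as np

def precision_recall(cm):
    """Per-class precision and recall from a confusion matrix.

    cm[i, j] = number of samples of true class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    precision = tp / cm.sum(axis=0)   # correct / all predicted as the class
    recall = tp / cm.sum(axis=1)      # correct / all truly in the class
    return precision, recall

# Made-up matrix: class 0 is separated perfectly, whereas
# classes 1 and 2 are partly confused with each other.
cm = [[60, 0, 0],
      [0, 55, 5],
      [0, 3, 57]]
precision, recall = precision_recall(cm)
print(np.round(precision, 4), np.round(recall, 4))
```

A variety with a distinct fluorescence profile behaves like class 0 in this example, whereas varieties with similar profiles lose precision and recall to each other, as classes 1 and 2 do.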
The fluorescence peak at excitation 280 nm and emission 340 nm is likely attributable to tryptophan, a fluorescent amino acid known to emit strongly at these wavelengths [
37]. Prior research has also confirmed the presence of tryptophan in rice flour [
38], supporting the attribution of the observed peak in the current study. An additional peak at excitation 340 nm and emission 440 nm may be ascribed to advanced glycation end products (AGEs), which are fluorescent compounds formed via the non-enzymatic Maillard reaction between proteins and sugars. The observed excitation and emission parameters are consistent with those reported for AGEs [
39]. Given that unheated polished rice was analyzed in this study, the formation of AGEs associated with heat-induced Maillard reaction is presumed to be minimal; however, the presence of trace naturally occurring AGEs cannot be ruled out. Notably, AGEs are generally produced in lower quantities in carbohydrate-based grains relative to food with higher fat and protein content [
40], which is consistent with the relatively weak fluorescence signal attributed to AGEs observed in the present study.
Furthermore, in this study, the CNN model using fluorescence fingerprinting consistently achieved higher discrimination accuracy than the model using NIR spectra. This is likely attributable to the CNN’s capability for extracting detailed patterns from two-dimensional image data using deep-learning algorithms optimized for image processing. The observed image variations are presumed to reflect differences in tryptophan content and other chemical constituents among rice varieties, indicating that fluorescence fingerprinting serves as an effective approach for variety discrimination.
Nevertheless, several limitations of this study should be noted. The dataset consisted of 15 commercially sourced products (three per variety) from a single harvest year (2023), which may limit the generalizability of the models. Because the samples were commercial products, harvest time, milling date, and storage conditions varied among them. These factors may influence compositional changes in rice, such as lipid oxidation and protein degradation during storage, potentially affecting the fluorescence and NIR spectral profiles. To mitigate these variables, future studies should incorporate samples of identical variety and origin, with stringent control over harvest timing and storage periods. Furthermore, as harvest times vary across regions in Japan, seasonality may further affect discrimination accuracy; validation studies that account for seasonal influences are therefore warranted. To improve interpretability, future studies will extract wavelength-specific importance from NIR spectra (e.g., using RF feature importance) and apply visualization techniques such as Grad-CAM to identify influential regions in fluorescence images. Additional methods such as PCA or PLS will also be considered.