Prediction of the Leaf Primordia of Potato Tubers Using Sensor Fusion and Wavelength Selection

The sprouting of potato tubers during storage is a significant problem that hinders obtaining high-quality seeds or fried products. In this study, the potential of fusing data obtained from visible (VIS)/near-infrared (NIR) spectroscopic and hyperspectral imaging systems was investigated to improve the prediction of primordial leaf count, a significant sign of tuber sprouting. Electronic and lab measurements were conducted on whole tubers of Frito Lay 1879 (FL1879) and Russet Norkotah (R.Norkotah) potato cultivars. The interval partial least squares (IPLS) technique was adopted to extract the most effective wavelengths for both systems. Linear regression was performed using partial least squares regression (PLSR), and the best calibration model was chosen using four-fold cross-validation. The prediction models were then obtained using separate test data sets. Prediction results were enhanced compared with those obtained from the individual systems' models. The values of the correlation coefficient (ratio of performance to deviation), or r(RPD), were 0.95(3.01) and 0.96(3.55) for FL1879 and R.Norkotah, respectively, an improvement of 6.7%(35.6%) and 24.7%(136.7%) for FL1879 and R.Norkotah, respectively. The study shows the possibility of building a rapid, noninvasive, and accurate system or device that requires minimal or no sample preparation to track the sprouting activity of stored potato tubers.


Introduction
Recent studies have shown various health-promoting nutritional resources in potato tubers, including protein, dietary fibers, minerals, ascorbic acid, anthocyanins, and antioxidants. Moreover, phenolic compounds, contained in the tuber or the peel, are known for their anti-inflammatory and anticarcinogenic effects on human health [1]. Due to the rapid change of lifestyles towards fast food and ready-to-cook meals, the consumption of potatoes in the United States, especially frozen French fries and chips, has shown a significant increase during the last four decades [2]. The U.S. per capita French fry consumption jumped from 12.93 kg in 1970 to 22.89 kg in 2017 [2]. Hence, maintaining the appropriate degree of tuber quality during handling and storage operations is a major concern for growers and processors, to preserve a high level of marketability.
Storage significantly affects the chemical composition of tubers and subsequent processed products. Potatoes, like other agricultural commodities, continue to perform several postharvest biological processes, among which respiration is an important metabolic process that needs to be controlled during storage to extend the shelf life and reduce the accumulation of sugars [3,4]. Dormancy of potato tubers is the period after harvest during which tubers will not sprout even in the presence of suitable environmental and biochemical conditions. Dormancy usually lasts from several weeks to months, depending on cultivar and storage conditions [5,6]. Following the dormancy period and with warmer temperatures (10-20 °C), sprouts, i.e., the meristematic regions of the tubers (eyes), begin to grow at a low rate that increases until one sprout dominates the others [7]. Sprouting is affected by storage conditions, the cultivar, and the presence of damage. Sprouting has shown a significant impact on the physiological status and age of potatoes during storage [7]. Reducing sugars accumulated during low-temperature storage result in after-frying browning, and excess sucrose content causes an undesirable sweet flavor in fried products [4]. In addition, high levels of reducing sugars and sucrose result in an increase in sprouting [4]. Moreover, unrestrained sprouting increases the respiration rate, which in turn increases further sprouting, physiological age, weight loss, and the levels of glycoalkaloids, which are known to be toxic [8]. Thus, uncontrolled sprouting in storage causes a considerable decline in the marketability of raw and subsequently processed potato products.
Near-infrared (NIR) spectroscopy has been studied for detecting chemical constituents and physical properties of agricultural and food products, as well as in the pharmaceutical, textile, cosmetic, and medical domains [15]. The utilization of NIR technology in the agricultural domain has included the quality evaluation of grains [16,17], fruits, and vegetables [18-20]. More specifically, NIR systems have shown promising results for determining several quality attributes of potatoes. Such properties include specific gravity [21,22], dry matter [23,24], and sugars [24-26]. Rady et al. [25] stated that prediction models of leaf primordia for potato tubers had correlation coefficient (r) values of 0.89 and 0.77 for FL1879 and R.Norkotah, respectively, using a VIS/NIR spectroscopic system in the interactance mode. In the case of the VIS/NIR hyperspectral imaging system, the prediction models yielded r values of 0.47 and 0.43 for FL1879 and R.Norkotah, respectively. Jeong et al. [27] investigated the application of VIS/NIR diffuse reflectance spectroscopy (400-2500 nm) for estimating the sprouting capacity of Atlantic and Superior potato cultivars. The authors stated that the sprouting capacity could be evaluated by measuring the weight of sprouts grown under a standard sprouting method. Thus, sprouting capacity was measured as the weight percentage of sprouts for tubers stored for 30 days in the dark at 20 °C and 90% relative humidity. Results showed a good correlation between lab measurements and predicted sprouting capacity, with r values between 0.87 and 0.97.
The fusion of data acquired from different electronic sensors has been studied for the potential benefits of improving the prediction models of quality attributes of fruits, vegetables, and food products. The data combined from each individual sensor should, however, provide distinguishing and non-redundant information about the measured property. Consequently, the improvement of prediction and classification models can be feasible. Data fusion can be conducted by either concatenating the features from various sensors, then processing them, or by performing feature selection before combining and processing [28].
The fusion of data obtained by stationary and prototype online hyperspectral imaging systems was conducted by Mendoza et al. [29] to improve the prediction capability of firmness and soluble solid content (SSC) for Golden Delicious (GD), Jonagold (JG), and Red Delicious (RD) apple cultivars. Results showed a significant decrease of the standard error of prediction (SEP) values for firmness by 6.6, 16.1, and 13.7% for GD, JG, and RD, respectively. The values of SEP for SSC decreased for GD, JG, and RD by 11.2, 2.3, and 3.0%, respectively. Mendoza et al. [30] examined the fusion of visible and shortwave NIR spectroscopy (400-1100 nm), spectral scattering obtained from hyperspectral imaging (500-1000 nm), acoustic firmness, and bioyield firmness to assess the firmness and SSC of JG, GD, and RD apple cultivars. In that study, fused data improved firmness prediction models by reducing SEP values by 14.6, 20.0, and 7.3% for JG, GD, and RD cultivars, respectively. In the case of SSC prediction models, the fusion of spectroscopic and hyperspectral imaging systems showed a decrease in SEP values by as much as 6.0%.
The data fusion approach has also been investigated with other agricultural products. Integrating electronic tongue (e-tongue) and UV-VIS-NIR spectroscopic data has been applied for determining the botanical origin of honey [31]. Ignat et al. [32] studied the fusion of VIS/NIR spectroscopic data with VIS hyperspectral imaging features, relaxation and ultrasonic data, and color measurements for predicting several maturity indices for bell peppers, including dry matter (DM), TSS, osmotic potential (OP), ascorbic acid (AA), total chlorophylls, carotenoids, the coefficient of elasticity for compression (CEc), and the coefficient of elasticity for rupture (CEr). Results illustrated the improvement of the determination coefficient (R²) for fused data models. Values of R² increased from 0.93 to 0.95 for DM, 0.93 to 0.96 for TSS, 0.79 to 0.83 for AA, 0.87 to 0.90 for OP, 0.60 to 0.77 for total chlorophylls, 0.92 to 0.96 for carotenoids, 0.55 to 0.63 for CEc, and 0.52 to 0.54 for CEr. Several studies were also conducted to improve the evaluation of various quality attributes for fruits and vegetables using data fusion. Such commodities included bell peppers [33,34], tomatoes [35], apples [36-39], eggplants [40], peaches [41,42], and oranges [43].
The main objective of this study was to investigate the potential of combining data obtained from hyperspectral imaging and spectroscopic systems for building calibration and prediction models of leaf primordia of potato tubers during storage.

Raw Materials, Sampling, and Measurement of Primordial Leaf Count
Electronic measurements were conducted on Frito Lay 1879 (FL1879) and Russet Norkotah (R.Norkotah) potato cultivars, used for chipping and baking, respectively. Samples were obtained from a commercial farm in Southwest Michigan, United States. After discarding defective and deteriorated tubers, samples were cleaned and stored at 7 °C for four weeks for periderm maturation [44]. An initial sample of 20 tubers per cultivar was examined first. Tubers were then stored at 7, 10, and 15 °C, and sampled at 20, 80, and 130 days of storage, with 60 tubers per cultivar at each sampling date. A total of 200 tubers were therefore tested for each of the FL1879 and R.Norkotah cultivars. These storage temperatures were chosen to create a broad distribution of leaf primordia, which increases the reliability of the prediction models. The measurements of primordial leaf count (LC) took place as stated in Rady et al. [45].

Electronic Measurements
Whole tubers were electronically scanned using a VIS/NIR spectroscopic system in the interactance mode and a VIS/NIR hyperspectral imaging system in the reflectance mode. To obtain consistent measurements, each tuber was placed such that the light beam struck the middle area of the longitudinal axis. A more detailed explanation of the scanning process for either system can be found in Rady et al. [45].

VIS/NIR Interactance System
The VIS/NIR spectroscopic system in the interactance mode was used to acquire spectral information of the whole tubers. The system, as shown in Figure 1, contained an Ocean Optics spectrometer (model No. USB 4000, Ocean Optics, Inc., Dunedin, FL, United States) connected by a 200 µm diameter fiber optic cable, with a 3648-element linear silicon CCD (charge-coupled device) array, an optical resolution of 0.3 nm (full width at half maximum, or FWHM), and a detection range of 200-1100 nm; a radiometric power supply with a maximum power of 250 W (model No. 68931, Oriel Inst., Irvine, CA, United States); and a light source (model No. 66881, Oriel Inst., Irvine, CA, United States) containing a quartz tungsten halogen lamp with a lens transmittance range of 350-2500 nm. In the interactance mode, light photons illuminate the sample through a probe with a concentric outer illumination ring and an inner receptor. A foam-sealing ring was placed between both components to separate the light ring from the detector [45]. Thus, only the light passing through the sample was measured. Using such a configuration, the incident light forms a circle with a diameter of 24.7 mm. The interactance spectrum for each sample was normalized using a Teflon disc (~25 mm diameter) as a reference material, and the relative interactance was calculated as follows:

$$\text{Relative interactance} = \frac{I_s - I_d}{I_r - I_d}$$

where $I_s$ is the intensity of the reflected light from the sample, $I_r$ is the intensity of the reflected light from the reference material, and $I_d$ is the intensity of the reflected light from the background.
J. Imaging 2019, 5, x FOR PEER REVIEW

VIS/NIR Hyperspectral Imaging System
The main purpose of using a hyperspectral imaging (HSI) system in this study was to capture the diffuse scattered light in the range of 400-1000 nm under the reflectance mode for whole tubers. The system is shown in Figure 2. A fiber optic cable coupled with a lens focusing assembly was used to deliver a broadband light beam of 1.5 mm diameter, making a 15° angle away from the vertical axis and 1.6 mm apart from the scanning line. The sample holder movement was controlled using a step motor, and each sample was scanned 10 times with a distance of 1 mm between two successive scans, which covered a total longitudinal distance of 9 mm along the sample. The acquisition time was set at 200 ms per scan, so the total scanning time for each sample was 2 s. At each scanning line, the spectrograph acquired the spectral information represented by a 256 × 256 pixel image, with spatial and spectral resolutions of 0.2 mm/pixel and 2.35 nm, respectively.

Calculation of the Mean Reflectance Spectra and Wavelength Selection
The average reflectance spectra were calculated for the hyperspectral imaging data, using 256 wavelengths in the range of 400-1000 nm. For each image, the spectra were first averaged over the spatial coordinates. The relative reflectance (RR) spectrum was then calculated as follows:

$$RR = \frac{AS_s - AS_b}{AS_r - AS_b}$$

where $AS_s$, $AS_b$, and $AS_r$ are the average spectra for the sample, background, and reference (Teflon cube), respectively. Wavelength selection was conducted to reduce the number of variables involved in multivariate regression and thereby limit the overfitting associated with relatively high dimensional data, such as spectroscopic data [46]. Wavelength selection techniques therefore improve the robustness of the calibration models and reduce the computational time [47].
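The two steps above (spatial averaging, then normalization against the reference) can be sketched as follows. This is a minimal illustration that assumes the usual dark-corrected ratio, (AS_s - AS_b)/(AS_r - AS_b); the function names and the toy intensity values are hypothetical and are not taken from the study.

```python
from statistics import mean


def mean_spectrum(image):
    """Average a line-scan image over its spatial axis.

    `image` is a list of spatial rows; each row holds one intensity per wavelength.
    """
    n_bands = len(image[0])
    return [mean(row[b] for row in image) for b in range(n_bands)]


def relative_reflectance(sample, reference, background):
    """Assumed form: (AS_s - AS_b) / (AS_r - AS_b), evaluated per wavelength."""
    return [(s - b) / (r - b) for s, r, b in zip(sample, reference, background)]


# Toy images with 2 spatial pixels x 3 wavelengths (hypothetical counts).
sample_img = [[50.0, 60.0, 70.0], [70.0, 80.0, 90.0]]
reference_img = [[110.0, 110.0, 110.0], [110.0, 110.0, 110.0]]
background_img = [[10.0, 10.0, 10.0], [10.0, 10.0, 10.0]]

rr = relative_reflectance(
    mean_spectrum(sample_img),
    mean_spectrum(reference_img),
    mean_spectrum(background_img),
)
print(rr)
```

Subtracting the background from both numerator and denominator removes the detector offset, so the ratio depends only on the sample relative to the reference.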
Interval partial least squares (IPLS) was adopted as a variable selection technique on the data obtained from the spectroscopic and hyperspectral imaging systems, following the results obtained by Rady and Guyer [48]. The IPLS configuration used the forward mode, window widths (W) of one, two, and three variables, and 20 latent variables (LV).
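The forward search over intervals can be sketched as below. This is only an outline of the general idea, not the configuration used in the study: the model is abstracted behind a `score` callable, which in practice would return a cross-validated PLS error for a given set of columns, and the names `make_intervals`, `forward_interval_selection`, and `toy_score` are hypothetical.

```python
def make_intervals(n_vars, width):
    """Split variable indices 0..n_vars-1 into consecutive windows of `width`."""
    return [list(range(s, min(s + width, n_vars))) for s in range(0, n_vars, width)]


def forward_interval_selection(n_vars, width, score):
    """Greedily add the interval that lowers `score` most; stop when none helps."""
    intervals = make_intervals(n_vars, width)
    selected, remaining = [], list(range(len(intervals)))
    best = float("inf")
    while remaining:
        trials = []
        for k in remaining:
            cols = sorted(c for i in selected + [k] for c in intervals[i])
            trials.append((score(cols), k))
        trial_score, k = min(trials)
        if trial_score >= best:  # no remaining interval improves the model
            break
        best = trial_score
        selected.append(k)
        remaining.remove(k)
    return sorted(c for i in selected for c in intervals[i]), best


# Toy stand-in for a cross-validated error: columns 2 and 7 carry the signal,
# and every uninformative column adds a small penalty.
INFORMATIVE = {2, 7}


def toy_score(cols):
    chosen = set(cols)
    return len(INFORMATIVE - chosen) + 0.1 * len(chosen - INFORMATIVE)


cols_w1, err_w1 = forward_interval_selection(10, 1, toy_score)
cols_w2, err_w2 = forward_interval_selection(10, 2, toy_score)
print(cols_w1, cols_w2)
```

In this toy case, a window of one variable selects only the informative columns, whereas a window of two also pulls in their neighbors.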

Data Fusion
After obtaining the most influential wavelengths, data from the spectroscopic and hyperspectral imaging systems were normalized at each wavelength (column) by dividing all values at that wavelength by the maximum value at the same wavelength. For each sample (row), the data obtained from both systems were then concatenated to form the fused data matrix.
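The normalization and concatenation described above can be sketched as follows. The function names and the small matrices are hypothetical; the sketch assumes that both blocks list the same samples in the same order and that every column maximum is positive.

```python
def max_normalize_columns(matrix):
    """Divide every column (wavelength) by its maximum value across samples."""
    col_max = [max(col) for col in zip(*matrix)]
    return [[v / m for v, m in zip(row, col_max)] for row in matrix]


def fuse(spectro, hsi):
    """Normalize each block column-wise, then concatenate them sample by sample."""
    if len(spectro) != len(hsi):
        raise ValueError("both blocks must hold the same samples in the same order")
    a = max_normalize_columns(spectro)
    b = max_normalize_columns(hsi)
    return [row_a + row_b for row_a, row_b in zip(a, b)]


# Three samples: two selected interactance wavelengths, one selected HSI wavelength.
spectro = [[2.0, 4.0], [1.0, 8.0], [4.0, 2.0]]
hsi = [[10.0], [5.0], [20.0]]
fused = fuse(spectro, hsi)
print(fused)
```

Scaling each column by its own maximum puts the two sensors on a comparable range before they share one regression model.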

Partial Least Squares Regression and the Preprocessing of Fused Data
Partial least squares regression (PLSR) was applied on the fused data to build calibration and prediction models. PLSR is a linear regression technique known for handling high dimensional data and overcoming the collinearity problem associated with such types of data [46].
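For readers unfamiliar with PLSR, a single-response PLS (PLS1) fit using the NIPALS deflation scheme can be sketched in plain Python as below. This illustrates the general technique under simplifying assumptions (mean centering only, no scaling); it is not the software or the settings used in the study, and the function names and toy data are hypothetical.

```python
from math import sqrt


def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS: extract latent variables one at a time and deflate."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                 # centered y
    weights, loadings, coefs = [], [], []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left in y that X can explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p_k = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q_k = sum(f[i] * t[i] for i in range(n)) / tt
        E = [[E[i][j] - t[i] * p_k[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q_k * t[i] for i in range(n)]
        weights.append(w)
        loadings.append(p_k)
        coefs.append(q_k)
    return {"x_mean": x_mean, "y_mean": y_mean,
            "weights": weights, "loadings": loadings, "coefs": coefs}


def pls1_predict(model, X):
    """Predict by repeating the same projection and deflation on new rows."""
    preds = []
    for row in X:
        e = [v - m for v, m in zip(row, model["x_mean"])]
        y_hat = model["y_mean"]
        for w, p_k, q_k in zip(model["weights"], model["loadings"], model["coefs"]):
            t = sum(a * b for a, b in zip(e, w))
            y_hat += q_k * t
            e = [a - t * b for a, b in zip(e, p_k)]
        preds.append(y_hat)
    return preds


# Toy data generated from y = 1 + 2*x1 + 3*x2 (hypothetical, noise-free).
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 6.0]]
y = [1 + 2 * a + 3 * b for a, b in X]
model = pls1_fit(X, y, n_components=2)
prediction = pls1_predict(model, [[6.0, 4.0]])[0]
print(prediction)
```

With as many latent variables as independent columns, PLS1 coincides with ordinary least squares, which is why the noise-free toy relation is recovered; with spectral data, far fewer latent variables than wavelengths are normally retained.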
According to Rinnan et al. [49], spectral data contain noisy signals resulting from various electronic sources, and consequently data preprocessing is necessary to reduce such undesirable effects and increase the signal-to-noise ratio. Preprocessing was conducted in two stages. The first stage included, in addition to no preprocessing, smoothing using a first derivative, smoothing using a second derivative, normalization, standard normal variate (SNV), multiplicative scattering correction (MSC), and median centering. The second stage included mean centering, multiplicative scattering correction, and orthogonal signal correction. Numerical transformation was also carried out on the reference data (leaf primordia count) to obtain a more uniform distribution. Logarithmic (base 10) and second degree power transformations were applied, in addition to the non-transformed reference values. The regression analysis was carried out on calibration (80% or 160 tubers) and prediction (20% or 40 tubers) sets of data. To reduce the possibility of overfitting and increase the robustness of calibration models, a four-fold cross-validation technique was implemented on the calibration data set, and the best calibration model was chosen as the one with the minimum root mean square error of calibration for cross validation (RMSECcv). Prediction models were then obtained by applying the optimal calibration models on the separate prediction data sets. A complete layout of the data analysis operations is shown in Figure 3. The best prediction model was chosen based on the values of the correlation coefficient (r), the root mean square error of prediction (RMSEP), and the ratio of the standard deviation to the root mean square error of prediction (RPD).
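Two of the scatter-correction steps named above, SNV and MSC, can be sketched as follows. These are the common textbook forms and are given only as an assumption about what the methods compute: SNV standardizes each spectrum by its own mean and sample standard deviation, and MSC regresses each spectrum on the mean spectrum and removes the fitted offset and slope. The function names and toy spectra are hypothetical.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(v - m) / s for v in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum."""
    ref = [mean(col) for col in zip(*spectra)]
    ref_mean = mean(ref)
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for x in spectra:
        x_mean = mean(x)
        # Least-squares fit x = a + b * ref, then invert it.
        b = sum((r - ref_mean) * (v - x_mean) for r, v in zip(ref, x)) / sxx
        a = x_mean - b * ref_mean
        corrected.append([(v - a) / b for v in x])
    return corrected


# Toy spectra that differ only by an offset and a slope (hypothetical values).
spectra = [[1.0, 2.0, 3.0], [3.0, 5.0, 7.0]]
snv_first = snv(spectra[0])
msc_all = msc(spectra)
print(snv_first, msc_all)
```

Because the two toy spectra are exact linear functions of their mean, MSC maps both onto the same corrected spectrum, which is the behavior the method is designed to approximate for scatter effects.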

Constituent Distribution and Wavelength Selection Results
The minimum, maximum, mean, and standard deviation values of primordial leaf count (LC) were calculated for FL1879 and R.Norkotah cultivars as shown in Table 1. Both cultivars showed close minimum and mean values. The maximum and standard deviation, however, were higher for FL1879, which possibly indicates more sprouting. The average LC values were 13.47 and 12.96 for FL1879 and R.Norkotah, respectively. The standard deviation values were 13.62 for FL1879 and 8.61 for R.Norkotah. The high standard deviations were intentional: relatively high storage temperatures were used to obtain a broad LC range, which helps develop more comprehensive prediction models for LC. Results of wavelength selection shown in Table 2 indicated that the FL1879 spectral data from the interactance system generally yielded the highest number of selected wavelengths among all spectral data. In contrast, the number of selected wavelengths obtained from the hyperspectral imaging for R.Norkotah was higher than that obtained from the interactance system, except for W = 2, at which a similar number of wavelengths was selected for both electronic systems. Moreover, the number of selected wavelengths for the hyperspectral imaging was generally higher in the visible spectrum than in the NIR range for both cultivars. In the case of the interactance system, results showed a higher number of selected wavelengths in the NIR range, especially for R.Norkotah.
Table 2. Number of selected wavelengths using the interval partial least squares (IPLS) technique for primordial leaf count, using data obtained from VIS/NIR interactance and hyperspectral imaging systems for Frito Lay 1879 (FL1879) and Russet Norkotah (R.Norkotah) potato cultivars. Shaded cells show optimal models. Column headers: No. of Wavelengths in the Visible Range; No. of Wavelengths in the NIR Range.

Partial Least Squares Regression Results
To compare the performance of prediction models based on data obtained from individual or fused sensors, we first present in Table 3 the PLSR results using individual systems data for whole tubers of the Frito Lay 1879 (FL1879) and Russet Norkotah (R.Norkotah) potato cultivars.
The best PLSR calibration and prediction models of primordial leaf count for FL1879 and R.Norkotah cultivars using the fused data are shown in Table 4. The optimal models are shown in the shaded cells. In the case of FL1879, the values of r(RPD) of prediction models were 0.95(3.01), 0.91(2.27), and 0.91(2.49) for W = 1, 2, and 3, respectively. In the case of R.Norkotah, the r(RPD) values were 0.96(3.55), 0.95(3.24), and 0.94(2.93) for W = 1, 2, and 3, respectively. The spectral preprocessing methods for the optimal models were first derivative and MSC for FL1879, and second derivative and mean center for R.Norkotah. For both optimal models, the LC values were preprocessed using the power transformation. The relationship between the measured and predicted LC values for FL1879 and R.Norkotah deduced from the optimal prediction models for W = 1 is shown in Figure 4a,b.
* Rcal: correlation coefficient for the calibration model; RMSEcv: root mean square error of calibration, using cross validation for the calibration model; LVs: number of latent variables.
** Rpred: correlation coefficient for the prediction model; RMSEpred: root mean square error of prediction for the prediction model; RPDpred: ratio between the standard deviation and RMSEpred.
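The three figures of merit reported for the prediction models can be computed as in the sketch below. It assumes that the standard deviation in RPD is the sample standard deviation of the reference values in the prediction set, which the text does not state explicitly; the function name and toy values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def prediction_metrics(y_true, y_pred):
    """Return (r, RMSEP, RPD) for measured versus predicted values."""
    n = len(y_true)
    mt, mp = mean(y_true), mean(y_pred)
    cov = sum((a - mt) * (b - mp) for a, b in zip(y_true, y_pred))
    var_t = sum((a - mt) ** 2 for a in y_true)
    var_p = sum((b - mp) ** 2 for b in y_pred)
    r = cov / sqrt(var_t * var_p)                      # Pearson correlation
    rmsep = sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)
    rpd = stdev(y_true) / rmsep                        # assumed: sample SD of references
    return r, rmsep, rpd


# Toy leaf counts (hypothetical values, not measured data).
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
r, rmsep, rpd = prediction_metrics(measured, predicted)
print(round(r, 3), round(rmsep, 3), round(rpd, 2))
```

RPD expresses the prediction error relative to the natural spread of the reference values, so a higher RPD means the model resolves differences between tubers more finely than a constant guess would.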

Discussion
The number of wavelengths selected using the IPLS technique was generally proportional to the window size, especially for the R.Norkotah cultivar with data from either electronic system; this is expected, as the larger the window size is, the higher the number of selected variables [50]. It was also noted that the interactance data for FL1879 required a higher number of selected wavelengths to explain the variation of LC in comparison to R.Norkotah, except when W = 1 for the hyperspectral imaging. Furthermore, the wavelengths selected with a window width of one variable (W = 1), which were nearly the fewest compared with those obtained using W = 2 or W = 3, yielded the optimal prediction models. Such results illustrate that a small window width could eliminate redundant variables that might otherwise be included during the IPLS search. Zhao et al. [51] developed a modified IPLS method for variable selection, and their study showed a general conclusion that with a low window width, the number of selected variables decreased and the root mean square error of prediction (RMSEP) improved. Moreover, Deng et al. [52] compared the number of variables selected using different methods, including synergy interval PLS (siPLS), moving window PLS (MWPLS), and genetic algorithm PLS (GA-PLS). In general, the smaller the window width, the better the performance of the prediction models.
Using fused data from the two systems, the prediction of LC significantly improved for both cultivars. In a previous study by Rady et al. [25], as shown in Table 3, the optimal prediction models using the VIS/NIR interactance system showed r(RPD) values of 0.89(2.22) and 0.77(1.50) for FL1879 and R.Norkotah, respectively. The r(RPD) values obtained from the VIS/NIR hyperspectral imaging system were 0.47(1.14) for FL1879 and 0.43(1.10) for R.Norkotah. Additionally, prediction results obtained from the fused data in this study are comparable to the work conducted by Jeong et al. [27] for estimating potato sprouting using NIR diffuse reflectance data. The latter study had r(RPD) values of 0.94(2.0) for the calibration models using cross validation. In our study, data fusion led to a significant improvement of the prediction performance, evaluated on a separate set of data, in which the fused prediction models yielded r(RPD) values of 0.95(3.01) and 0.96(3.55) for FL1879 and R.Norkotah, respectively. The fusion of the data, along with wavelength selection, has not been investigated before for the sprouting prediction of potatoes.
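The relative improvements reported for the fused models follow directly from the r and RPD values quoted above, taking the interactance system as the better individual baseline. The short check below reproduces that arithmetic; the dictionary layout and function name are hypothetical.

```python
def percent_gain(baseline, fused):
    """Relative improvement of the fused model over the baseline, in percent."""
    return (fused - baseline) / baseline * 100.0


# (r, RPD) for the best individual system (interactance) and for the fused data.
results = {
    "FL1879": {"individual": (0.89, 2.22), "fused": (0.95, 3.01)},
    "R.Norkotah": {"individual": (0.77, 1.50), "fused": (0.96, 3.55)},
}

gains = {
    cultivar: tuple(
        round(percent_gain(b, f), 1)
        for b, f in zip(values["individual"], values["fused"])
    )
    for cultivar, values in results.items()
}
print(gains)
```

The computed gains match the 6.7%(35.6%) and 24.7%(136.7%) figures given in the abstract and conclusions.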
The above results indicate that a robust prediction of the sprouting activity of potato tubers during the storage period is possible using fused data from VIS/NIR spectroscopic and hyperspectral imaging systems. One of the main restrictions on applying hyperspectral imaging systems in on-line sorting and quality inspection processes for food and agricultural products is the relatively long acquisition time. The prediction models obtained in this study, however, were based on selected wavelengths. Thus, the computation time can be decreased by using fewer wavelengths to build a multispectral imaging system.
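A minimal sketch of how the selected wavelengths from the two systems might be combined into one matrix is given here. It assumes feature-level fusion by column-wise concatenation, with each block autoscaled (mean-centered and scaled to unit variance) beforehand; the autoscaling step, the function names, and the toy numbers are assumptions for illustration, and the preprocessing actually applied to the fused data may differ.

```python
from statistics import mean, stdev


def select_columns(spectra, keep):
    """Keep only the selected wavelength columns of a samples-by-wavelengths list."""
    return [[row[j] for j in keep] for row in spectra]


def autoscale(block):
    """Mean-center each column and scale it to unit standard deviation."""
    cols = list(zip(*block))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) or 1.0 for c in cols]  # constant columns are only centered
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in block]


def fuse(interactance, hyperspectral, keep_int, keep_hsi):
    """Concatenate the scaled, selected columns of both systems sample by sample."""
    if len(interactance) != len(hyperspectral):
        raise ValueError("both systems must measure the same tubers")
    a = autoscale(select_columns(interactance, keep_int))
    b = autoscale(select_columns(hyperspectral, keep_hsi))
    return [row_a + row_b for row_a, row_b in zip(a, b)]


# Toy example: 3 tubers, 4 interactance and 3 hyperspectral variables (made up).
inter = [[1.0, 5.0, 2.0, 9.0], [2.0, 5.0, 4.0, 9.0], [3.0, 5.0, 6.0, 9.0]]
hsi = [[0.1, 0.2, 0.3], [0.2, 0.4, 0.3], [0.3, 0.6, 0.3]]
fused = fuse(inter, hsi, keep_int=[0, 2], keep_hsi=[1])
print(len(fused), "samples x", len(fused[0]), "fused variables")
```

Scaling each block before concatenation keeps one system from dominating the regression simply because its raw signal has a larger numeric range. The fused matrix would then be passed to the PLSR calibration in place of either single-system matrix.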

Conclusions
The main objective of this research study was to investigate the potential of utilizing fused data from VIS/NIR spectroscopic and VIS/NIR hyperspectral imaging systems for predicting the primordial leaf count of potatoes. Leaf count is an important factor in assessing the sprouting capability of tubers; thus, continuous observation of such activity during storage is crucial to maintain the appropriate physiological status of tubers, especially those intended for processing or seed. Electronic measurements were performed on whole tubers of FL1879 and R.Norkotah potatoes stored at different temperatures, to simulate real storage conditions and obtain wide ranges of LC. After obtaining the most influential wavelengths from both electronic systems using IPLS, data from both systems were fused. Results obtained from PLSR indicated that the fusion method is a feasible way to considerably improve LC prediction. Compared to the optimal results obtained from the individual systems, values of r(RPD) were boosted by 6.7%(35.6%) and 24.7%(136.7%) for FL1879 and R.Norkotah, respectively, which represents a novel enhancement and application of data fusion for potato sprouting prediction. The results of this study suggest the possibility of developing an electronic system, either portable or stationary, composed of a multispectral imaging unit along with an interactance sensor, to obtain rapid and accurate prediction of the sprouting activity of stored potatoes. However, future steps are still needed to reduce the number of selected wavelengths using different versions of IPLS, such as moving average IPLS, synergy IPLS, backward/forward IPLS, and a genetic algorithm. More cultivars should also be tested, and experiments should be conducted over several growing seasons to improve the robustness and reproducibility of the prediction models.
, contained a Hamamatsu dual mode cooled CCD camera (model No. C4880, Hamamatsu Photonics, Hamamatsu, Japan), an imaging spectrograph directly attached to the CCD camera (ImSpector V10, Spectral Imaging Ltd., Oulu, Finland), a power supply control (model No. 69931, Oriel Instruments Irvine, CA, United States), a digital exposure controller (model No. 68945, Oriel Instruments, Irvine, CA, United States), and a light source (model No. 66881, Oriel Instruments, Irvine, CA, United States) holding a 250 W Quartz Tungsten Halogen lamp and having a lens material transmittance range of 350-2500 nm.
J. Imaging 2019, 5, x FOR PEER REVIEW 5 of 13 represented by a 256 × 256 pixel image, with spatial and spectral resolutions of 0.2 mm/pixel and 2.35 nm, respectively.

Figure 2. Schematic representation of the VIS/NIR hyperspectral reflectance system used to test whole FL1879 and R.Burbank potato cultivars.

2.3.3. Partial Least Squares Regression and the Preprocessing of Fused Data

Figure 3. Flow chart of acquiring data from VIS/NIR spectroscopic and VIS/NIR hyperspectral imaging systems, wavelength selection, preprocessing, and building regression models of leaf primordia count for FL1879 and R.Norkotah potato cultivars.

Figure 4. Relationship between measured and predicted primordial leaf count using combined VIS/NIR interactance spectroscopy and VIS/NIR hyperspectral imaging for (a) Frito Lay1879 and (b) Russet Norkotah.

Table 3. Partial least squares regression (PLSR) results of the primordial leaf count, using data obtained from either VIS/NIR interactance or hyperspectral imaging systems for Frito Lay 1879 (FL1879) and Russet Norkotah (R.Norkotah) potato cultivars. * R_cal: correlation coefficient for the calibration model; RMSE_cv: root mean square error of calibration, using cross validation for the calibration model; LVs: number of latent variables. ** R_pred: correlation coefficient for the prediction model; RMSE_pred: root mean square error of calibration, using cross validation for the prediction model; RPD_pred: ratio between standard deviation and the RMSEP_pred.

Table 4. PLSR results for predicting primordial leaf count using data fused from VIS/NIR interactance and VIS/NIR hyperspectral imaging systems for whole tubers for Frito Lay 1879 (FL1879) and Russet Norkotah cultivars. Optimal results are shaded.
