Performance Improvement of Partial Least Squares Regression Soluble Solid Content Prediction Model Based on Adjusting Distance between Light Source and Spectral Sensor according to Apple Size

Apples are widely cultivated in the Republic of Korea and are preferred by consumers for their sweetness. Soluble solid content (SSC) is measured non-destructively using near-infrared (NIR) spectroscopy; however, the SSC measurement error increases with the change in apple size because the distance between the light source and the NIR sensor is fixed. In this study, spectral characteristics caused by the differences in apple size were investigated. An optimal SSC prediction model applying partial least squares regression (PLSR) to three measurement conditions based on apple size was developed. The three optimal measurement conditions under which the Vis/NIR spectrum is less affected by six apple size levels (Levels I–VI) were selected. The distance from the apple center to the light source and that to the sensor were 135 and 75 mm (Distance 1), 135 and 80 mm (Distance 2), and 125 and 75 mm (Distance 3). The PLSR model applying multiplicative scatter correction pretreatment under Distance 3 measurement conditions showed the best performance for Level IV-sized apples (R²pre = 0.91, RMSEP = 0.508 °Brix). This study shows the possibility of improving the SSC prediction performance of apples by adjusting the distance between the light source and the NIR sensor according to fruit size.


Introduction
In 2020, fruit consumers in the Republic of Korea prioritized fruit quality over price to a greater degree than in 2018 [1,2]. Apples are the most widely cultivated of the country's representative fruits and are popular among consumers [3]. Currently, apple importation is banned in the country. Thus, selecting high-quality local apples is crucial because of the inevitable future competition with imported apples [4,5].
The criteria for selecting apples can be broadly divided into external and internal qualities. The external quality can be classified into size, color, weight, shape, and external defects, whereas the internal quality can be classified into sugar content, acidity, moisture content, and internal defects [6]. Various studies have measured the internal quality of fruits, such as the soluble solid content (SSC), using near-infrared (NIR) spectroscopy. NIR spectroscopy can quickly and non-destructively determine and sort the internal quality of fruits [7]. The technique measures the light transmitted through or reflected from a fruit using an NIR sensor covering the 700-2500 nm wavelength range, and the SSC is estimated from the spectrum through partial least squares regression (PLSR) [8]. Although multiple linear regression (MLR) and principal component regression (PCR) have been used as SSC prediction models, PLSR has been widely used. PLSR finds latent variables that effectively describe concentration changes using both the concentration and spectral data of samples, allowing multiple response variables to be modeled simultaneously while effectively handling multicollinearity and noisy independent variables [9]. NIR spectroscopy for fruit quality analysis is applied in a reflectance mode, which uses the light reflected from the irradiated fruit; a full transmittance mode, which uses the light transmitted through the fruit; and a semi-transmittance mode, which uses light that passes through only a part of the fruit [10]. Recently, the full transmittance mode has been used to sort fruits by SSC because it rapidly measures the overall SSC of the fruit. However, the spectrum in the full transmittance mode varies because of changes in the path length or scattering caused by differences in the sample size [11]. This variation reduces the accuracy of the SSC predictions [12]. Therefore, studies have been conducted to improve the accuracy of PLSR models for fruit SSC prediction, primarily by applying spectral preprocessing to reduce this disturbance [13].
In developing an SSC prediction model, Suh et al. [10] found that the internal reflectance mode and transmittance mode (90°, 180°) are excellent for pear spectroscopy, and the coefficient of determination of the cross-validation (R²cv) and root mean square error of prediction (RMSEP) of the PLSR model without pretreatment were 0.777 and 0.38 °Brix, 0.643 and 0.48 °Brix, respectively. The RMSEP of the multiplicative scatter correction (MSC) pretreatment application model was 0.37-0.57 °Brix in the internal reflectance mode and 0.39-0.51 °Brix in the transmittance mode (90°). Shin [14] investigated the SSC prediction of melons using NIR spectroscopy. Among the various pretreatments performed, range normalization gave the best prediction performance, with an R²cv of 0.755 and RMSEP of 0.89 °Brix. Luo et al. [15] developed a sugar prediction PLSR model for navel oranges using three wavelength bands and five pretreatments. The model with standard normal variate (SNV) pretreatment in the wavelength band of 450-1800 nm showed optimal performance, with an R²v of 0.8514 and RMSE of 1.1649. Kawano et al. [12] non-destructively measured the SSC of satsuma mandarins using an NIR transmittance spectrum, in which the wavelength affected by the fruit diameter was 844 nm. After applying a second-order differential to the value at 844 nm and normalizing the result, the PLSR analysis resulted in an R of 0.989 and SEP of 0.32. Tian et al. [16] investigated the optimal apple SSC prediction through spectroscopic analysis using Vis/near-infrared (Vis/NIR) spectra and pretreatment applications. PLSR was used for model development with a total of 322 apples: the correlation coefficient of the cross-validation (Rcv) was 0.8545, and the root mean square error of the cross-validation (RMSECV) was 0.5730 without pretreatment. Optimal preprocessing was performed with mean normalization and 11-point smoothing, where the correlation coefficient of prediction (Rpre) was 0.8744 and RMSEP was 0.5332.
Although previous studies have improved the accuracy of the SSC prediction model by applying spectral preprocessing, changes in the optical path corresponding to changes in the size of the fruit have not been considered because the locations of the light source and NIR sensor are fixed when measuring the spectrum of the fruit. Because the sizes of fruits vary and the difference in diameter between samples is large, the resulting change in the NIR spectral signal considerably influences the outcome. This phenomenon reduces the accuracy of fruit SSC prediction during high-speed sorting. Therefore, it is necessary to determine the optimal distance between the light source and the NIR sensor for each fruit size.
This study aimed to determine the optimal distance between the light source and NIR sensor based on the apple size for predicting the SSC of apples and to develop a PLSR model for predicting SSC for each apple size based on the determined distance. In particular, the spectral characteristics according to the distance between the light source and the NIR sensor were analyzed based on the apple size, and various forms of spectral preprocessing were applied.

Materials and Methods
In this study, the first experiment (Experiment 1) aimed to select the optimal distances between the light source and the sensor that were less influenced by apple size, while the second experiment (Experiment 2) aimed to develop a PLSR SSC prediction model for each apple size at the selected optimal distances.

Experimental Samples
Apples of the Fuji cultivar (Malus pumila) used in this experiment were purchased from the Chungju Agricultural Products Processing Center (APC), and their sizes were classified using the Korean Agricultural Product Standard Notice (Table 1). The weight, diameter, and height of all samples were measured after purchase, and the average weight, diameter, and height of the apples are listed in Table 2. The apples were stored in a refrigerator at 4 °C and tempered in a laboratory at 20 ± 1 °C for more than 5 h before the spectral signal measurement experiment to reduce the effect of temperature. The samples in Experiment 1 were classified from Levels I to V according to weight, and three apples for each level were used. The average weights were 389.75, 324.62, 291.9, 240.22, and 197.14 g, in descending order of magnitude.
The samples in Experiment 2 were classified from Levels I to VI according to their weights, and 82, 57, 72, 60, 70, and 70 apples from Levels I to VI were used. The average weights were 398.02, 318.41, 258.16, 223.19, 194.55, and 182.28 g, from the largest to the smallest.

Spectra Collection and SSC Measurement
An NIR spectroscopy device was used to measure the Vis/NIR spectra, and it consisted of a light source, a sample fixing part, and a spectral sensor, as shown in Figure 1a. A 12 V, 100 W tungsten-halogen lamp was used as the light source, and the light source was placed at the equatorial position of the apple. The spectral sensor was connected to a spectrometer (USB4000; Ocean Optics, Dunedin, FL, USA) via a fiber optic cable. Spectroscopic measurements were performed after 1 h of light stabilization.
In Experiment 1, the spectra were measured at an integration time of 200 ms, and ten measurements were averaged. The point corresponding to the maximum diameter of the apple was measured three times in transmission, and the three measurements were averaged into one spectrum. In Experiment 2, the spectra were measured at an integration time of 100 ms, and five measurements were averaged. The spectra were measured once in each of four directions (0°, 90°, 180°, and 270°), with the maximum diameter of the apple taken as 0°, and the average spectrum of the four directions was used. To allow rapid measurement with a view to application in an online system, the integration time and the number of averaged measurements were set differently from those in Experiment 1, and the spectrum was measured in four directions to reflect the influence of the apple size and the measurement area of the apple.
After the spectra were measured, the flesh at the four measured directions of each apple was cut and blended in a mixer, and the juice was extracted using a filter. A refractometer (PAL-3; ATAGO, Tokyo, Japan) was used to measure the SSC in the four directions of each apple. The resolution of the refractometer used was 0.1 °Brix at 0.1 °C, and the accuracy was ±0.1 °Brix.
The measured spectrum was configured at intervals of approximately 0.2 nm over a wavelength range of 470 to 1150 nm. The light source and Vis/NIR sensor were placed at the center of the height of the apple in all experiments.

Analysis of Spectral Characteristics and Selection of Appropriate Distance between Light Source and Vis/NIR Sensor (Experiment 1)
In Experiment 1, the spectral characteristics were investigated based on the distances from the apple surface to the light source and to the Vis/NIR sensor. As the size of the apple changed, these distances varied. The spectral signal was measured while changing the positions of the light source and Vis/NIR sensor around the apple to determine how the spectrum changed with these distances. Spectra were measured with the light source at distances of 60, 70, 80, 90, and 100 mm from the apple surface (Figure 2). When the light source was closer than 60 mm, the apple was burned, and when it was farther than 100 mm, the spectral signal was weak; therefore, these distances were excluded. The Vis/NIR sensor was placed at distances of 20, 25, 30, 35, and 40 mm from the apple surface. When the sensor was set closer than 20 mm to the surface of the smallest apple, large apples collided with the Vis/NIR sensor, and the spectral signal weakened when the distance exceeded 40 mm; therefore, these distances were excluded.
The intensity of light at a wavelength of 714.17 nm, which mainly represented the maximum value among the measured wavelengths, was used to analyze the trend in the measured spectra as the distances of the light source and Vis/NIR sensor changed. The coefficient of variation (CV) of this intensity was used to find the distance ranges of the light source and the Vis/NIR sensor within which the signal was least affected by changes in these distances. The CV is a unitless measure that represents the degree of variation with respect to the mean of the population; the lower it is, the more uniform the values are. The CV is the standard deviation (SD) divided by the mean, as expressed in Equation (1).

$$\mathrm{CV}\,(\%) = \frac{\mathrm{SD}}{\mathrm{mean}} \times 100 \qquad (1)$$
The distance ranges of the light source and the Vis/NIR sensor were defined to include all sizes of apples used in the experiment. The maximum diameter difference between the largest and smallest apples used was approximately 30 mm (radius difference of approximately 15 mm); therefore, the distance ranges of the light source and the Vis/NIR sensor were defined at 20 and 15 mm intervals, respectively (Table 3). Figure 3 illustrates the CV calculations for each section. After averaging the CV for each section calculated for each level, the three sections with the lowest values were selected as the appropriate distances between the light source and the Vis/NIR sensor.

Figure 3. The 714.17-nm transmittance intensity value of apples and CV (%) calculation method for each distance range (Level I apples).
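The CV-based selection described above can be illustrated with a minimal sketch. It assumes the population standard deviation and a CV expressed in percent; the intensity values, range labels, and function names are hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Return the CV in percent: standard deviation divided by the mean, times 100."""
    # Assumption: population SD (pstdev); a sample SD would give slightly larger values.
    return pstdev(values) / mean(values) * 100.0


def rank_distance_ranges(cv_by_range):
    """Sort distance-range labels from the lowest to the highest CV averaged over levels."""
    averaged = {label: mean(cvs) for label, cvs in cv_by_range.items()}
    return sorted(averaged, key=averaged.get)


# Hypothetical 714.17-nm intensities (arbitrary counts) measured inside one distance range.
intensities = [1000.0, 1020.0, 980.0, 1010.0, 990.0]
cv_percent = coefficient_of_variation(intensities)

# Hypothetical per-level CVs (%) for three made-up distance ranges.
example_cvs = {"A": [4.0, 5.0, 6.0], "B": [2.0, 3.0, 4.0], "C": [8.0, 9.0, 10.0]}
best_first = rank_distance_ranges(example_cvs)
print(round(cv_percent, 3), best_first)
```

Ranking by the level-averaged CV mirrors the selection rule in the text: the ranges at the front of the list are those in which the signal changes least as the apple size changes.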

Development of Apple SSC Prediction Model (Experiment 2)
The PLSR model was applied to develop an optimal SSC prediction model for each apple size at the appropriate distances between the light source and the Vis/NIR sensor selected in Section 2.3. The dataset was randomly divided into a calibration set and a prediction set at a ratio of 7:3. The model was applied to each of the three distances for each apple size, and the spectrum was measured in four directions for each apple, resulting in the number of spectra within the dataset being four times the number of apples. A calibration model for SSC prediction was developed using the 70% calibration set. Cross-validation was performed, and the performance was verified by applying the model to the remaining 30%, which served as an unknown prediction set. The PLSR model is given by Equation (2):

$$X = TP^{T} + E, \qquad Y = UQ^{T} + F, \qquad U = TB + H \qquad (2)$$

where X is the independent variable (spectral matrix); T is the score matrix that describes X; U is the score matrix that describes the dependent variable Y; P is the loading matrix of the independent variable; Q is the loading matrix of the dependent variable; E, F, and H are residual matrices; and B is the regression coefficient of PLSR [17]. Unscrambler X (v10.4; CAMO SOFTWARE AS, Oslo, Norway) was used to construct the PLSR model.
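As an illustration of the calibration and prediction workflow, the following is a minimal single-response PLSR (PLS1) sketch using the NIPALS algorithm in NumPy. It is not the Unscrambler X implementation used in the study; the synthetic data, the function names `pls1_fit` and `pls1_predict`, and the number of components are assumptions chosen only to make the example self-contained.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit a single-response PLSR model with NIPALS; returns (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean            # mean-centered working copies
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                          # weight vector (covariance with y)
        w = w / np.linalg.norm(w)
        t = Xk @ w                             # X scores
        tt = t @ t
        p = Xk.T @ t / tt                      # X loadings
        c = yk @ t / tt                        # y loading (scalar)
        Xk = Xk - np.outer(t, p)               # deflate X
        yk = yk - c * t                        # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)     # regression vector in the original X space
    return coef, x_mean, y_mean


def pls1_predict(X, model):
    """Predict the response for new rows of X."""
    coef, x_mean, y_mean = model
    return (X - x_mean) @ coef + y_mean


# Synthetic stand-in for spectra (rows) and SSC values; not data from the study.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, 2.0, 3.0]) + 5.0

# Random 7:3 split into calibration and prediction sets.
idx = rng.permutation(len(y))
n_cal = int(round(0.7 * len(y)))
cal, pre = idx[:n_cal], idx[n_cal:]

model = pls1_fit(X[cal], y[cal], n_components=3)
y_pred = pls1_predict(X[pre], model)
```

With real spectra, the number of components would normally be chosen by cross-validation on the calibration set rather than fixed in advance.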

• Spectral preprocessing
In spectroscopic analysis, noise is caused by changes in the optical path because of the size of the fruit, reflected and scattered light, and changes in the state of the spectroscopic equipment. In this experiment, the locations of the light source and the sensor were adjusted to the appropriate distances to reduce this effect; in addition, the remaining noise was minimized by preprocessing the spectra. The maximum normalization, range normalization, mean normalization, standard normal variate (SNV), and MSC methods were used for preprocessing, which was performed using the Unscrambler X 10.4 software.
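Two of the listed pretreatments, SNV and MSC, can be sketched in a few lines of NumPy. This is an illustrative version under common textbook definitions, not the Unscrambler X implementation; the example spectra are synthetic, and the use of the mean spectrum as the MSC reference and of the population standard deviation in SNV are assumptions.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) individually."""
    row_mean = spectra.mean(axis=1, keepdims=True)
    row_std = spectra.std(axis=1, keepdims=True)   # assumption: population SD (ddof=0)
    return (spectra - row_mean) / row_std


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (default: mean spectrum)."""
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, row in enumerate(spectra):
        # Fit row ~ intercept + slope * ref, then remove the offset and the gain.
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected


# Synthetic spectra: one base shape with different gains and offsets (scatter-like effects).
base = np.array([0.2, 0.5, 0.9, 0.4, 0.1])
spectra = np.vstack([1.0 * base, 1.5 * base + 0.1, 0.8 * base - 0.05])

snv_out = snv(spectra)
msc_out = msc(spectra)
```

Because the three synthetic spectra differ only by gain and offset, both corrections collapse them onto a single curve, which is the kind of size-related variation these pretreatments are meant to suppress.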

• Model evaluation
The performance of the SSC prediction model was evaluated using the coefficient of determination of calibration (R²cal), coefficient of determination of prediction (R²pre), RMSEC, and RMSEP. Each metric is expressed as follows:

$$R^{2}_{cal} = 1 - \frac{\sum_{i=1}^{n_c} (y_{pi} - y_{mi})^{2}}{\sum_{i=1}^{n_c} (y_{mi} - y_{mean})^{2}}$$

$$R^{2}_{pre} = 1 - \frac{\sum_{i=1}^{n_p} (y_{pi} - y_{mi})^{2}}{\sum_{i=1}^{n_p} (y_{mi} - y_{mean})^{2}}$$

$$\mathrm{RMSEC} = \sqrt{\frac{1}{n_c} \sum_{i=1}^{n_c} (y_{pi} - y_{mi})^{2}}$$

$$\mathrm{RMSEP} = \sqrt{\frac{1}{n_p} \sum_{i=1}^{n_p} (y_{pi} - y_{mi})^{2}}$$

where y_pi and y_mi are the predicted and measured SSC of the ith apple, respectively, and y_mean is the average value of the calibration set or prediction set. n_c and n_p are the numbers of apples in the calibration and prediction sets, respectively. The closer the R² is to 1 and the closer the RMSE is to 0, the better the model performance.
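The evaluation metrics can be computed as in the following minimal sketch, which assumes the common definitions of the coefficient of determination and root mean square error. The same two functions apply to the calibration set (giving R²cal and RMSEC) and to the prediction set (giving R²pre and RMSEP); the SSC values below are hypothetical.

```python
import math


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def rmse(measured, predicted):
    """Root mean square error between predicted and measured values."""
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical measured and predicted SSC values in °Brix.
measured_ssc = [12.0, 13.0, 14.0, 15.0]
predicted_ssc = [12.5, 12.5, 14.5, 14.5]

r2 = r_squared(measured_ssc, predicted_ssc)
error = rmse(measured_ssc, predicted_ssc)
print(r2, error)
```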

Transmittance Spectral Characteristics according to Apple Size and Light Source and Vis/NIR Sensor Distance (Experiment 1)
The transmittance spectrum was measured at five distances between the apples and the light source and five distances between the apples and the Vis/NIR sensor for five levels (Levels I-V) of apple samples. Figure 4 shows the transmittance spectrum of apples corresponding to Level IV, indicating large absorption rates in the ranges of 640-700 nm and 700-900 nm. Among the wavelengths corresponding to the visible light region, the absorption peak at approximately 675 nm is related to pigment compounds, such as anthocyanin and chlorophyll (a, b), in apple peel [18]. Moreover, absorption peaks at approximately 660, 745, and 840 nm are associated with the third overtone of carotenoids and O-H stretching [19]. Wavelengths of approximately 750, 850, and 895 nm are associated with the third overtone of C-H and H2O [20]. O-H and C-H bonds have been reported to be associated with SSC [21].
After measuring the transmittance spectra of three apples for each of size Levels I to V, the average spectrum was calculated for each level. Figures 5 and 6 show the transmittance intensity (Figure 7) at a wavelength of 714.17 nm, representing the maximum value in the average spectrum for each apple level. Figure 5 illustrates the transmittance intensity against the distance between the light source and the apple surface for each distance between the apple surface and the Vis/NIR sensor. Figure 6 shows the transmittance intensity against the distance between the apple surface and the Vis/NIR sensor for each distance between the light source and the apple surface.
The intensity of the NIR signal decreased as the distance between the light source and the apple increased, as shown in Figure 5. In addition, as the distance between the light source and the apple increased, the difference in the intensity values of the NIR signal was larger for Level V, the smallest apples, than for the apples of the other levels (Level V: 15.2%, Levels I-IV: 2.6-10.0%).
As shown in Figure 6, the intensity of light increased at Levels II, III, and V as the distance between the apple surface and the NIR sensor increased. In addition, as the distance between the apple surface and the NIR sensor increased, the difference in signal strength was larger for Level V, the smallest apples, than for the apples of the other levels (Level V: 11.8%, Levels I-IV: 1.5-7.3%).
Hence, the smaller the apple, the greater the distance between the apple and the light source, and the greater the distance between the apple and the Vis/NIR sensor, the greater the influence on the transmittance signal. In addition, the transmittance signal decreased as the light source moved farther away, whereas it increased as the Vis/NIR sensor moved farther away. Thus, a difference in the transmittance signal appeared when the distances of the light source and the Vis/NIR sensor changed, which we determined would affect the SSC measurement of the apple.

Selection of Appropriate Distance between Light Source and Vis/NIR Sensor According to Changes in Apple Size
The measured intensity values were grouped by the distance ranges of the light source and the Vis/NIR sensor listed in Table 3 to determine the appropriate distance between the light source and the Vis/NIR sensor based on changes in the apple size. The CV was calculated for each level. Table 4 shows the CV of each distance range averaged over the levels, with the ranking running from the lowest CV. As shown in Table 4, when the entire distance range of the light source or the Vis/NIR sensor was used, most results were poor, with the highest CVs. The wider the distance range in which the light source or Vis/NIR sensor was located, the greater the deviation of the measured spectral signals. In addition, regardless of the distance range of the Vis/NIR sensor, the CV was lowest for distance range III of the light source, followed by ranges II and I. This phenomenon possibly occurred because the closer the light source was to the apple, the greater the difference in the amount of transmitted light was. However, regardless of the distance range of the light source, no clear tendency was observed with respect to the distance range of the Vis/NIR sensor. This result indicates that the transmittance spectrum was more affected by the distance range of the light source than by that of the Vis/NIR sensor. The lowest CV appeared when the distance between the light source and the apple surface was 80-100 mm and the distance between the apple surface and the Vis/NIR sensor was 20-35 mm. That is, this distance range showed the least change in the transmittance signal, even when the distances of the light source and the Vis/NIR sensor changed because of the change in the size of the apple. Including this range, three distance ranges with a CV (%) of less than 5 were selected as the appropriate distances between the light source and the Vis/NIR sensor: 80-100 mm between the light source and the apple surface with 20-35 mm between the apple surface and the Vis/NIR sensor; 80-100 mm with 25-40 mm; and 70-90 mm with 20-35 mm.
The selected appropriate distances are ranges measured from the apple surface, so the positions of the light source and the Vis/NIR sensor would change whenever the apple size changes. Therefore, the reference point for the distances of the light source and Vis/NIR sensor was converted from the apple surface to the apple center. Each distance was converted to a fixed distance by adding the radius of the largest apple among the samples (55 mm) to the minimum of the corresponding distance range, such that most apples fell within each distance range regardless of size (Figure 8). Consequently, the selected appropriate distances from the apple center to the light source and to the Vis/NIR sensor are 135 and 75 mm (Distance 1), 135 and 80 mm (Distance 2), and 125 and 75 mm (Distance 3).
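The conversion from surface-based ranges to fixed center-based distances is simple arithmetic, sketched below. The selected ranges and the 55 mm maximum radius come from the text; the function names and the 40 mm radius used for the smallest apple (55 mm minus the approximately 15 mm radius difference) are illustrative assumptions.

```python
MAX_RADIUS_MM = 55  # radius of the largest apple among the samples

# Selected (light source range, sensor range) pairs in mm from the apple surface.
selected_ranges = [
    ((80, 100), (20, 35)),
    ((80, 100), (25, 40)),
    ((70, 90), (20, 35)),
]


def to_center_distance(surface_range, max_radius=MAX_RADIUS_MM):
    """Fixed distance from the apple center: range minimum plus the largest radius."""
    return min(surface_range) + max_radius


def surface_distance(center_distance, apple_radius):
    """Distance from the apple surface for an apple of the given radius."""
    return center_distance - apple_radius


fixed_distances = [
    (to_center_distance(source), to_center_distance(sensor))
    for source, sensor in selected_ranges
]
print(fixed_distances)

# Check that apples with radii of 40-55 mm stay inside the first selected range.
source_mm, sensor_mm = fixed_distances[0]
in_range = all(
    80 <= surface_distance(source_mm, r) <= 100
    and 20 <= surface_distance(sensor_mm, r) <= 35
    for r in range(40, 56)
)
```

Fixing the hardware at the range minimum plus the largest radius means the largest apple sits at the near edge of each range and smaller apples move toward the far edge without leaving it.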

Characteristics According to Size of Apple Sample (Experiment 2)
Figure 9 illustrates the SSC distribution of the 411 apples used in the development of SSC prediction models by apple size according to the appropriate distance (Experiment 2). Table 5 lists the number of apples used in the calibration model and those used in the prediction model, and the mean and SD of SSC by apple level. The apples were randomly divided between the calibration and prediction sets at a ratio of 7:3. Additionally, the spectrum was measured in four directions (0°, 90°, 180°, and 270°), with the maximum diameter of the apple taken as 0°; thus, the number of measured spectra was four times the number of apples. The table also presents the number of spectra. From Levels I to VI, the number of apples used in the model construction was 58, 40, 51, 42, 49, and 49, and the number of apples used in the prediction model was 24, 17, 21, 18, 21, and 21, respectively.
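The per-level sample counts can be cross-checked with a short sketch. The apples per level come from the Experimental Samples section; rounding the 70% calibration share up to the next whole apple is an assumption that happens to reproduce the reported calibration counts, and the function name is hypothetical.

```python
# Apples per level in Experiment 2 (from the Experimental Samples section).
apples_per_level = {"I": 82, "II": 57, "III": 72, "IV": 60, "V": 70, "VI": 70}
DIRECTIONS_PER_APPLE = 4  # spectra measured at 0, 90, 180, and 270 degrees


def split_counts(n_apples):
    """Return (calibration, prediction) counts for a 7:3 split, rounding calibration up."""
    n_cal = (7 * n_apples + 9) // 10   # integer form of ceil(0.7 * n_apples)
    return n_cal, n_apples - n_cal


calibration = [split_counts(n)[0] for n in apples_per_level.values()]
prediction = [split_counts(n)[1] for n in apples_per_level.values()]
total_apples = sum(apples_per_level.values())
total_spectra = total_apples * DIRECTIONS_PER_APPLE
print(calibration, prediction, total_apples, total_spectra)
```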

Development of SSC Prediction Model Based on Distance between Light Source and Vis/NIR Sensor
An SSC prediction model was developed for each apple size using the three previously identified appropriate distances between the light source and the Vis/NIR sensor. These three distances produced the smallest changes in the transmittance spectrum despite changes in apple size. To confirm the effect of apple size on SSC prediction performance, a PLSR model was developed for each level from transmittance spectra measured at each of the three appropriate distances.
Transmittance spectra were measured in four directions (i.e., 0°, 90°, 180°, and 270°) around the region of maximum diameter of each apple. The average of the spectra measured in the four directions was used as the spectrum of each apple. The apple SSC prediction models were developed by applying each of the eight preprocessing types, and the performance of the developed models was verified using unknown samples. Tables 6-11 compare, for Levels I-VI, the results of the SSC prediction models with the preprocessing that gave the best performance at each selected distance. Figures 10 and 11 show the prediction results and regression coefficients of the model that performed best among the three distances. Figure 11 shows that each model has relatively large regression coefficients at wavelengths of 745, 850, and 895 nm, which are related to sugar content. Among the preprocessing methods, the Savitzky-Golay first- and second-order derivatives showed extremely low performance and are not shown in the tables.
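To make the averaging and scatter-correction steps concrete, the sketch below implements direction averaging, SNV, and MSC in plain Python. These are the textbook forms of the two corrections (SNV with the sample standard deviation, MSC regressed against the mean spectrum); the paper does not give its exact implementation, and the toy spectra are hypothetical values chosen so that the four "directions" differ only by a gain and an offset.

```python
from statistics import mean, stdev


def average_spectrum(spectra):
    """Average several spectra (equal-length lists) wavelength by wavelength."""
    return [mean(values) for values in zip(*spectra)]


def snv(spectrum):
    """Standard normal variate: center and scale a spectrum by its own mean and SD."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(x - m) / s for x in spectrum]


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (default: mean spectrum)."""
    ref = average_spectrum(spectra) if reference is None else reference
    ref_mean = mean(ref)
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for spec in spectra:
        spec_mean = mean(spec)
        # Least-squares fit spec ~ intercept + slope * ref
        slope = sum((r - ref_mean) * (x - spec_mean) for r, x in zip(ref, spec)) / sxx
        intercept = spec_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in spec])
    return corrected


# Hypothetical toy data: one underlying spectrum seen with different gain/offset.
base = [0.2, 0.5, 0.9, 0.4]
directions = [
    [1.0 * x + 0.00 for x in base],
    [1.2 * x + 0.10 for x in base],
    [0.8 * x - 0.05 for x in base],
    [1.1 * x + 0.02 for x in base],
]

apple_spectrum = average_spectrum(directions)  # one spectrum per apple
snv_spectra = [snv(s) for s in directions]
msc_spectra = msc(directions)
print(apple_spectrum)
```

In this toy case both corrections collapse the four spectra onto a single curve, which is the kind of size- and path-related scatter that MSC and SNV are intended to reduce before PLSR.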
For Level I, the SSC prediction model that used transmittance spectra measured at Distance 1 with MSC preprocessing showed the best performance among the three distance conditions (Table 6). The R²cal and RMSEC of the calibration model were 0.9 and 0.414, respectively, and R²pre and RMSEP, verified using unknown samples, were 0.68 and 0.769 °Brix, respectively (Figure 10a). Distances 2 and 3 followed in that order of performance, and the optimal preprocessing for these distances was SNV.
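The performance indicators quoted in this section can be computed as in the following sketch. It assumes the common definitions (R² as one minus the ratio of residual to total sum of squares, and RMSE as the root of the mean squared residual); the paper may define R² as a squared correlation instead, and the measured and predicted SSC values below are hypothetical.

```python
from math import sqrt


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1 - ss_res / ss_tot


# Hypothetical SSC values in degrees Brix for a small prediction set.
measured_ssc = [12.0, 13.5, 15.0, 16.5]
predicted_ssc = [12.5, 13.0, 15.5, 16.0]

rmsep = rmse(measured_ssc, predicted_ssc)
r2_pre = r_squared(measured_ssc, predicted_ssc)
print(f"RMSEP = {rmsep:.3f}, R2pre = {r2_pre:.3f}")
```

Applying the same two functions to the calibration set gives R²cal and RMSEC, so a large gap between calibration and prediction values (as for Level I) can be read directly from the paired numbers.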
For Level II, the PLSR SSC prediction model without spectral preprocessing performed best under Distance 2 conditions (Table 7). The R²cal and RMSEC of the calibration model were 0.96 and 0.223, respectively, and the number of factors was 13. In prediction with unknown samples, R²pre and RMSEP were 0.72 and 0.615 °Brix, respectively (Figure 10b). Distances 1 and 3 followed in that order of performance, and in all cases the best results were obtained without preprocessing.
For Level III, the SSC prediction model exhibited the best performance under Distance 1 conditions (Table 8). When SNV preprocessing was applied, the R²cal and RMSEC of the calibration model were 0.99 and 0.142, respectively, and the number of factors was 15. In prediction with unknown samples, R²pre and RMSEP were 0.74 and 0.822 °Brix, respectively (Figure 10c). Distances 2 and 3 followed in that order of performance, with mean normalization and SNV as the optimal preprocessing methods, respectively.
For Level IV, the PLSR SSC prediction model that applied MSC preprocessing under Distance 3 conditions had the highest prediction accuracy (Table 9). The R²cal and RMSEC of the model were 0.99 and 0.195, respectively, and the number of factors was 12. In prediction with unknown samples, R²pre and RMSEP were 0.91 and 0.508 °Brix, respectively, the best performance among all levels (Figure 10d). Distances 1 and 2 followed in that order of performance, with maximum normalization and range normalization as the optimal preprocessing methods, respectively.
For Level V, the SSC prediction model that applied MSC preprocessing at Distance 3 showed the best performance among the three distance conditions (Table 10). The R²cal and RMSEC of the calibration model were 0.90 and 0.487, respectively, and in prediction with unknown samples, R²pre and RMSEP were 0.86 and 0.577 °Brix, respectively (Figure 10e). Distances 1 and 2 followed in that order of performance, with range normalization and mean normalization as the optimal preprocessing methods, respectively.
Finally, for Level VI, the SSC prediction model that applied SNV under Distance 1 showed the best performance (Table 11). For this model, R²cal, RMSEC, and the number of factors were 0.98, 0.154, and 15, respectively, and R²pre and RMSEP were 0.89 and 0.596 °Brix, respectively, in prediction with unknown samples (Figure 10f). Distances 3 and 2 followed in that order of performance, with mean normalization and MSC as the optimal preprocessing methods, respectively.
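The best model reported for each level can be collected in a small lookup table, which makes the overall pattern easier to see. The values below are those quoted in the text for Levels I-VI; the dictionary layout and variable names are illustrative only.

```python
# Best model per level as reported in the text (prediction-set metrics, RMSEP in degrees Brix).
best_models = {
    "I":   {"distance": 1, "preprocessing": "MSC",  "r2_pre": 0.68, "rmsep": 0.769},
    "II":  {"distance": 2, "preprocessing": "none", "r2_pre": 0.72, "rmsep": 0.615},
    "III": {"distance": 1, "preprocessing": "SNV",  "r2_pre": 0.74, "rmsep": 0.822},
    "IV":  {"distance": 3, "preprocessing": "MSC",  "r2_pre": 0.91, "rmsep": 0.508},
    "V":   {"distance": 3, "preprocessing": "MSC",  "r2_pre": 0.86, "rmsep": 0.577},
    "VI":  {"distance": 1, "preprocessing": "SNV",  "r2_pre": 0.89, "rmsep": 0.596},
}

best_by_r2 = max(best_models, key=lambda level: best_models[level]["r2_pre"])
best_by_rmsep = min(best_models, key=lambda level: best_models[level]["rmsep"])


def mean_r2(levels):
    """Average prediction R2 over a group of levels."""
    return sum(best_models[level]["r2_pre"] for level in levels) / len(levels)


mean_r2_low = mean_r2(["I", "II", "III"])
mean_r2_high = mean_r2(["IV", "V", "VI"])

print("best level by R2pre:", best_by_r2)
print("best level by RMSEP:", best_by_rmsep)
print(f"mean R2pre, Levels I-III: {mean_r2_low:.3f}; Levels IV-VI: {mean_r2_high:.3f}")
```

Both criteria point to Level IV at Distance 3 with MSC, and the average prediction R² is clearly higher for Levels IV-VI than for Levels I-III.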
In developing the SSC prediction models while considering apple size, MSC and SNV were the best preprocessing methods overall, and the higher the level, the better the SSC prediction accuracy. In addition, the results of this study show better performance than those of studies that did not consider changes in the distance between the light source and the Vis/NIR sensor according to fruit size. The results are better than those obtained when predicting the SSC of apples using the Vis/NIR (400-1100 nm) spectrum in reflectance mode (R²pre = 0.82 and RMSEP = 0.5766) [22]. In the reflectance method, stray light is generated and the light does not penetrate the entire fruit; thus, transmittance spectroscopy appears to yield better results [19]. This study also showed better results than a study that developed an online Vis/NIR transmittance device and predicted SSC from spectra measured excluding the center of the "Fuji" apple (R²pre = 0.733 and RMSEP = 0.61%) [20]. In a study predicting the SSC of apples with an online NIR semi-transmittance device, apples were divided into three stages considering only the diameter, and SSC was predicted according to the diameter. That study performed best when the apple diameter was 65-75 mm (similar in size to Level VI in this experiment), with Rpre of 0.886 and RMSEP of 0.536% [23]. This shows that the SSC prediction performance may vary depending on the fruit diameter when the positions of the light source and spectroscopic sensor are fixed.
This study showed better results than those of previous studies. These results show that, in apple transmission spectroscopy, adjusting the positions of the light source and NIR sensor according to apple size has a significant impact on SSC prediction performance. In addition, performance was better in Levels IV-VI than in Levels I-III because the amount of transmitted light increased as the apple size decreased, resulting in higher signals. In the future, an online Vis/NIR transmittance spectroscopy device that changes the positions of the light source and Vis/NIR sensor according to apple size and uses a stronger light source will enable the development of a fast, high-performance SSC sorter. In addition, if the distance between the light source and Vis/NIR sensor can be adjusted quickly based on apple size, a combination of one light source and one sensor is expected to yield a non-destructive SSC sorter that outperforms existing sorters for apples of various sizes. However, this study was conducted only on the Fuji cultivar, and additional research is needed to verify whether the approach can be applied to other varieties.

Figure 1. (a) Vis/NIR spectrum measurement system for apples and (b) mark of maximum diameter of apple.

Figure 2. Design of distance between light source, apple, and NIR sensor for NIR signal acquisition.

Figure 3. The 714.17-nm transmittance intensity value of apples and CV (%) calculation method for each distance range (Level I apples).

Figure 4. Vis/NIR spectra of Level IV apple samples.

Figure 5. Transmittance intensity at 714 nm for each distance between light source and apple surface, by distance between the apple surface and NIR sensor. Note: S: distance between apple surface and NIR sensor (mm); L: distance between light source and apple surface (mm).

Figure 6. Transmittance intensity at 714 nm for each distance between apple surface and NIR sensor, by distance between light source and apple surface. Note: S: distance between apple surface and sensor (mm); L: distance between light source and apple surface (mm).

Figure 7. Selection of the highest-intensity wavelength in the apples' spectra.

Figure 8. Converting distance range of light source-apple-sensor to fixed distances.


Figure 9. SSC distribution of apple samples by each level.

Figure 10. Results of validating best SSC prediction models for apple size with unknown samples.

Figure 11. Regression coefficient of the best SSC prediction models for apple size with unknown samples.

Table 1. Size classification of apples in standard specifications of agricultural products.

Table 2. Characteristics of apple samples.

Table 3. Distance range between light source and apple and between apple and Vis/NIR sensor.

Distance between apple and Vis/NIR sensor (mm): Range i, 20-35; Range ii, 25-40; full range, 20-40.
Sensors 2024, 24, x FOR PEER REVIEW
Transmittance Spectral Characteristics According to Apple Size and Light Source and Vis/NIR Sensor Distance (Experiment 1)
... R²pre values are to 1, the lower the RMSEC and RMSEP values. The smaller the difference, the better the model.

Table 4. Average CV (%) for each distance range between light and apple and between apple and Vis/NIR sensor.

Table 5. The number and SSC distribution of apple samples for each level.

Table 6. Level I: best PLSR model results for predicting the SSC of apples at the three optimal distances between the light source and the Vis/NIR sensor.

Table 7. Level II: best PLSR model results for predicting the SSC of apples at the three optimal distances between the light source and the Vis/NIR sensor.

Table 8. Level III: best PLSR model results for predicting the SSC of apples at the three optimal distances between the light source and the Vis/NIR sensor.

Table 9. Level IV: best PLSR model results for predicting the SSC of apples at the three optimal distances between the light source and the Vis/NIR sensor.

Table 10. Level V: best PLSR model results for predicting the SSC of apples at the three optimal distances between the light source and the NIR sensor.

Table 11. Level VI: best PLSR model results for predicting the SSC of apples at the three optimal distances between the light source and the NIR sensor.
