Charting the Future of E-Grocery: An Evaluation of the Use of Digital Imagery as a Sensory Analysis Tool for Fresh Fruits

Abstract: While online grocery shopping has been rapidly expanding over the last several years, online sales of fresh produce have lagged far behind. One of the most significant contributing factors for this lag is the consumer’s inability to assess the quality of produce online. We hypothesized that this could be alleviated by machine vision technology. This study examines perceived sensory attributes derived from digital images of fresh fruits and compares them with sensory attributes obtained from the actual fruit. Digital images of fresh strawberries, cherry tomatoes, grapes, and blueberries were acquired using a high-resolution digital camera. Consumer panelists evaluated the appearance, texture, flavor, and overall eating quality, and also determined purchase decision. Panels ranging from 32 to 40 members (147 in total) also conducted in situ evaluations of the different fruits. The paired t-test indicated that the mean differences between paired image scores and in situ evaluation scores were not statistically significant. The scores obtained for texture and overall eating quality showed some variability, but the scores for quality appearance were remarkably consistent, revealing no difference across the evaluations of the various commodities. The results demonstrated that digital images can be utilized to effectively relay the appearance attributes of fresh produce. This finding is relevant for the industry, as the appropriate construction of real-time images can help to build consumers’ trust in the quality of e-deliveries, nudge consumers to purchase fruits and vegetables, and increase the overall e-commerce acceptance for fresh produce. A discussion of the limitations and opportunities for improving the effectiveness of digital sensory analysis of fresh produce is provided.


Introduction
During the last few years, the adoption of grocery shopping online has been accelerating worldwide. In the US, the grocery product category has been the fastest growing e-commerce segment and it is expected to reach USD 38 billion by the end of 2021 [1]. The COVID-19 pandemic has also catalyzed online sales, reaching USD 106 billion in 2020. Early in the pandemic, it was reported that the online grocery sales increased by 300% [2,3]. It has been estimated that this growth will continue even after the pandemic, though the rate may slow down [4,5].
The adoption of online shopping of fresh produce has been much slower than that of overall groceries. E-commerce involving fresh produce increased significantly during the pandemic. However, it has continued to lag behind the overall growth profile of online grocery shopping [6]. In 2018, up to 46% of surveyed consumers expressed having shopped for groceries online, but only 10% of them bought fresh produce [7].
Horticulturae 2021, 7, 262
Consumers usually purchase fresh produce via retail stores or traditional markets. In these food outlets, they can use all of their senses to evaluate the products, including sight for size and color, touch for firmness, and smell for aroma, before making the decision to purchase [8–12]. Sensory analysis researchers agree that the most important quality leading to purchase is appearance (visual quality) [8,13–15]. It has been widely accepted that if the visual qualities of fresh produce, such as color, shape, and size, are pleasant, consumers are motivated to purchase.
The low adoption rate of e-grocery for fresh produce appeared to be associated with a lack of trust regarding the quality of the product. Consumers were hesitant to purchase fresh produce online as they were uncertain about the product's quality. More than two-thirds of consumers reported that the main reason they did not purchase fresh produce online was their inability to examine it prior to purchasing [16].
To counteract consumer hesitation, retail stores have taken several measures. Amazon Fresh and Whole Foods Market, the largest market players, present each produce item with detailed information to complement its online images: geographic growing origin, cultural practices, size, and food labeling guidance from the FDA [17], such as the nutrition facts and potential health effects [18]. However, the fresh produce section of Amazon's online store has lower consumer satisfaction than the other grocery departments. According to RFG, among online grocery sections, the fresh produce division received the lowest scores in terms of quality [19]. In general, 4 out of 10 people indicated that the produce department fell short of their standards for quality and freshness [8,20–22].
Under the premise that the main reason consumers did not purchase produce online was essentially due to the lack of opportunity to inspect the product, we hypothesized that the concerns of consumers could be reduced if they could see a high-quality image of the actual product they would receive through electronic purchasing.
Given the increased use of smartphones, real-time images appeared to be a readily available means of addressing this challenge. High-quality cameras in cell phones have become the norm, allowing images to be easily and freely produced and disseminated. In fact, digital images have already entered deep into the food culture. Nearly half of the US population takes pictures of their food [23] and 69% of millennials have indicated that they take a picture of their food before they eat [24]. Food was the second most photographed subject on Instagram after selfies [25]. Consumers are accustomed to visual images, and we hypothesized that images provide sufficient elements to convey high visual perception (the brain's ability to receive, interpret, and act upon visual stimuli) to effectively assess the quality of produce.
Visual images have been a part of the food industry for a long time. Images have been routinely used in agricultural resource management, preharvest control (e.g., to monitor germination, plant growth rate, and determine harvest suitability), postharvest quality control (e.g., as an aid to sorting, grading, processing, packaging, and storage), and retail sales. However, their use in retail was usually limited to stock photos on websites.
Traditionally, both the industry and research laboratories have assessed quality of food through sensory evaluations performed by (un)trained panelists and supplemented with instrumental analyses [8,9]. However, these evaluations have been expensive, labor-intensive, and time-consuming. Thus, in principle, if consumers were able to evaluate food through images, then much effort, time, money, and energy could be saved.
Scientific information that analyzes differences in perception of fresh produce between image and in situ evaluation was lacking. To our knowledge, only a few researchers have attempted to use photos as a potential tool for food sensory analyses. The Garitta group [26] compared trained panel sensory evaluation of real products (broccoli) versus the sensory evaluation of the corresponding digital photographs. The analysis of variance showed that there were no significant differences between evaluations of the real broccoli and the corresponding photograph. Brugiapaglia and Destefanis [27] performed sensory evaluation of meat color using photographs and compared it to instrumental measurements. The research showed that photographs were a beneficial tool to evaluate the color of meat. The Chan group [28] assessed consumer perception of doneness from cooked meat and from corresponding photographs of samples. This research concluded that photographs can be used as a valid approach for assessing preferences for meat doneness. These results have suggested that a standard method using electronic images can be used to assess diverse fresh produce.
Therefore, we investigated whether in situ sensory evaluations of fresh produce could be replaced with quality evaluations that rely on real-time images. More specifically, we aimed to determine if images could translate into realistic visual cues that would help consumers to effectively evaluate quality of produce and make purchasing decisions.

Materials and Methods
Figure 1 presents a flow chart that summarizes the sequence of steps and the overall concept of the research methodology and statistical analysis.

Sample Preparation
Four types of produce were evaluated: blueberry, cherry tomato, grape, and strawberry. The produce was purchased from local retail stores. Two hours prior to sensory evaluation, the produce was inspected and sorted for uniform quality (discarding any with visual defects) and prepared at room temperature (approximately 22 °C). The selected samples were washed with potable water and air dried on paper towels. Grapes and blueberries were left whole for the evaluations, while cherry tomatoes and strawberries were cut longitudinally in halves. After preparation, the optical images of each sample were produced. Photographs were composed as follows: on a small white paper tray against a white background, we placed either (i) one whole and one half-cut cherry tomato or strawberry, (ii) five whole blueberries, or (iii) three whole grapes with a stem. The photographs, provided to the panelists on tablets, were presented together with example photos displayed on the Amazon Fresh or Whole Foods Market sites (Figure S1). After each photo was taken, the produce samples were stored in a plastic bag at room temperature until the sensory evaluation was conducted. All procedures followed mandatory safety practices and recommended guidelines.


Digital Images
The photographs were taken with a Nikon D800 digital camera with a 60 mm lens (Nikon Inc., Melville, NY, USA). The camera settings were an aperture of f/20, an ISO sensitivity of 640, and a shutter speed of 1/30 s. Samples were photographed against a uniform white background inside a portable photo shooting tent (Amazon Basics Portable Photo Studio, Amazon, Seattle, WA, USA). Lighting consisted of two 5600 K daylight-balanced light-emitting diodes (LEDs). The camera was held in place through a hole in the top of the tent, with the lens facing downward toward the sample on the floor of the tent. The distance from the lens to the sample was approximately 40 cm. Before capturing the images, we performed a calibration with a standard chart of 24 colored chips (X-Rite ColorChecker® Passport, X-Rite Inc., Grand Rapids, MI, USA). The photographs were then downloaded to a tablet (iPad, Apple, Cupertino, CA, USA) for the evaluation of images without any variation. In total, 40 photographs were selected for each of the fruits (blueberry, cherry tomato, grape, strawberry), for a total of 160 photographs.
[…] Tables S1 and S2, respectively. Fewer than 10% of participating panelists had prior experience buying produce online.

Sensory Evaluation
Prior to the sensory evaluation, consumers were given brief instructions for the evaluation and signed the consent form, in agreement with the USDA-ARS standards. Those who indicated having had fruit allergies in the past were not allowed to participate. The evaluation was conducted in individual booths, under moderate natural light, using electronic ballots on computers equipped with the Compusense Five® software program (Version 5.6, Compusense Inc., Guelph, ON, Canada).
The consumers first evaluated the fruit images and answered questions on three attributes: appearance, texture, and overall eating quality. Each scale was a 15 cm unstructured line anchored with words at both ends. The left end corresponded to "bad" and the right end to "good"; the marked position was later converted to a 0–100 acceptability score by the software program. Subsequently, the consumers evaluated the real sample corresponding to each picture by tasting the produce and answering questions on the same attributes: appearance, texture, and overall eating quality. On unstructured 15 cm line scales labeled at both ends, panelists rated texture firmness at first bite ("soft" to "firm") and the acceptability attributes of appearance, texture, and eating quality ("bad" to "good").
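The conversion from a mark on the 15 cm line to a 0–100 score can be illustrated with a minimal sketch. The text does not describe how the software performed this conversion, so the linear mapping, the function name `line_mark_to_score`, and the example marks below are assumptions made for illustration only.

```python
LINE_LENGTH_CM = 15.0  # length of the unstructured line scale described in the text


def line_mark_to_score(position_cm: float, line_length_cm: float = LINE_LENGTH_CM) -> float:
    """Map a mark on the line (0 = left anchor) to a 0-100 score.

    Assumes a simple linear conversion; the actual software behavior
    is not documented in the text.
    """
    if not 0.0 <= position_cm <= line_length_cm:
        raise ValueError("the mark must lie on the line")
    return 100.0 * position_cm / line_length_cm


# Hypothetical marks (cm from the "bad" anchor), for illustration only.
for mark in (0.0, 7.5, 15.0):
    print(mark, "cm ->", line_mark_to_score(mark))
```

Under this assumption, a mark at the midpoint of the line corresponds to a score of 50.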

Statistical Analysis
Data analysis was performed using the frequency, correlation, and means procedures in SAS® (version 9.4, Cary, NC, USA). Pearson correlation coefficients were calculated to measure the degree of association among sensory attributes. Using matched paired t-tests, the mean differences of paired scores (image vs. in situ evaluations) were compared for the three attributes of appearance, texture, and overall eating quality. Boxplots were generated to visualize the distribution and spread of the score differences. Principal component analysis (PCA) was performed to identify relationships among the seven attributes for each commodity using SigmaPlot (version 13.0, Systat Software, Inc., San Jose, CA, USA).
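As an illustration of the correlation step, the Pearson coefficient between two sets of panel scores could be computed as sketched below. This is not the SAS procedure used in the study; the function name `pearson_r` and the example scores are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a list has no variation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical 0-100 scores from five panelists, for illustration only.
image_appearance = [62.0, 75.0, 48.0, 90.0, 55.0]
real_appearance = [60.0, 78.0, 50.0, 85.0, 58.0]
print(round(pearson_r(image_appearance, real_appearance), 3))
```

A coefficient close to 1 would indicate that panelists who scored the image highly also tended to score the real fruit highly.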

Consumer's Visual Perception of Real Fresh Produce vs. Its Digital Image
Each consumer panelist generated a pair of sensory evaluation scores for a sample: one on the real sample ['real score'], and another on the image of the corresponding sample ['image score']. The statistical analyses focused on the difference within each pair of scores. A small difference would mean that the real score (e.g., food quality) can be predicted from the image score. In this research, the difference was calculated as [difference = image score − real score]. A positive difference meant that the image score was higher than the real score, while a negative difference meant that the real score was higher than the image score. Significance was determined by conducting a matched pair t-test. In the paired t-test, the null hypothesis was that there is no difference between the paired observations, while the alternative hypothesis was that there is a difference. In other words, the null hypothesis was that the mean of the differences between the pairs is equal to zero (µd = 0), while the alternative hypothesis was that the mean of the differences is not zero (µd ≠ 0).
Boxplots were designed to summarize and visualize the descriptive statistics (Figure 2). They revealed that the score difference was approximately zero for both appearance and texture. The score difference for overall eating quality was 4.90, considerably higher than that observed for appearance and texture. The spread of the difference between the sensory score of the image and that of the real produce was smallest for appearance. For this variable we also observed a small standard deviation (10.4), which was about one third of that observed for texture and overall eating quality. The boxplots suggest that texture and eating quality either cannot be predicted from the images alone or varied widely across the samples, as revealed by the large mean difference and standard deviation. The large difference for overall eating quality was somewhat expected, as the eating experience is not really reproducible with an image. The results obtained for texture, however, required further analysis, as texture could to a certain degree be transmitted through an image, depending on the quality of the image and on the product. To verify this, the large variability was assessed for statistical significance. The paired t-tests showed that a significant difference (p = 0.0008) between the two sets of results (image vs. in situ evaluations) was found only for overall eating quality (Table 1). For both appearance and texture, there appears to be no difference between the image and the real produce. This suggests the underutilized power of digital images to transmit visual quality without the need for examination through touching or smelling.
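The paired comparison described above can be sketched as follows, using the sign convention difference = image score − real score. This is a minimal illustration rather than the SAS analysis performed in the study; the function name `paired_t_statistic` and the example scores are hypothetical, and the p-value is omitted because it requires the t distribution from a statistics library.

```python
import math
import statistics


def paired_t_statistic(image_scores, real_scores):
    """Return (mean difference, SD of differences, t statistic, degrees of freedom)."""
    if len(image_scores) != len(real_scores) or len(image_scores) < 2:
        raise ValueError("need two equal-length lists with at least two pairs")
    # Sign convention from the text: difference = image score - real score.
    diffs = [i - r for i, r in zip(image_scores, real_scores)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd_d == 0:
        raise ValueError("t statistic is undefined when all differences are equal")
    t_stat = mean_d / (sd_d / math.sqrt(n))
    return mean_d, sd_d, t_stat, n - 1


# Hypothetical 0-100 scores from six panelists, for illustration only.
image = [70.0, 65.0, 80.0, 55.0, 90.0, 60.0]
real = [68.0, 70.0, 75.0, 58.0, 85.0, 62.0]
mean_d, sd_d, t_stat, df = paired_t_statistic(image, real)
print(f"mean difference={mean_d:.2f}, SD={sd_d:.2f}, t={t_stat:.2f}, df={df}")
```

Under the null hypothesis (µd = 0), the t statistic is compared with a t distribution with n − 1 degrees of freedom.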


Note (Table 1). * There was a significant difference (p < 0.05) for overall eating quality between the mean score perceived in the image (IO) and that of the real fruit (O); the reported p-value was 0.0008.

Appearance
The elements that make up the "appearance" quality variable include the color of produce (color originating from pigments, which is sometimes associated by consumers with a bioactive value), level of freshness and/or spoilage, fruit size, and level of ripeness [13,29–35]. Appearance is commonly the most important attribute affecting the consumer's decision to purchase. A number of studies have shown that people choose fresh produce based essentially on appearance, for which the absence of defects or blemishes, the level of ripeness, and the level of freshness are critical [12,36–38].
For the appearance attribute, the mean of the score differences was near zero and the spread was relatively small across all four commodities. The evaluation of grapes produced the highest mean and the largest spread, but, overall, the difference was not markedly distant from that of the other commodities (Figure 3). The mean differences were 0.16, −0.5, −1.58, and 2.84 for blueberry, strawberry, cherry tomato, and grape, respectively. The corresponding standard deviations were 9.75, 9.75, 10.8, and 11.4. The t-test (Table 2) further showed that for appearance, when combining all commodities, the pairs of image scores (IA) and real scores (A) did not produce significant differences.
The results obtained with the evaluations of appearance suggest that imaging technology is a promising approach for nudging consumers towards online purchasing of fruits and vegetables. Not only was there no significant difference when comparing the means of the combined commodities, but each commodity evaluated separately also showed no significant difference.
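To show how the reported mean differences and standard deviations relate to the paired t statistic, the sketch below computes t from summary statistics. Two assumptions are made: the reported standard deviations are treated as standard deviations of the paired differences, and the per-commodity sample size is set to a hypothetical n = 36, since the text only states that panels ranged from 32 to 40 members. The function name `t_from_summary` is also hypothetical.

```python
import math


def t_from_summary(mean_diff: float, sd_diff: float, n: int) -> float:
    """Paired t statistic computed from the mean and SD of the differences."""
    if n < 2 or sd_diff <= 0:
        raise ValueError("need n >= 2 and a positive standard deviation")
    return mean_diff / (sd_diff / math.sqrt(n))


# Mean differences and standard deviations for appearance, as reported in the text.
reported = {
    "blueberry": (0.16, 9.75),
    "strawberry": (-0.5, 9.75),
    "cherry tomato": (-1.58, 10.8),
    "grape": (2.84, 11.4),
}
ASSUMED_N = 36  # assumption: the text only gives panel sizes of 32 to 40

for fruit, (mean_diff, sd_diff) in reported.items():
    print(fruit, round(t_from_summary(mean_diff, sd_diff, ASSUMED_N), 2))
```

With this assumed sample size, every |t| stays below the two-sided 5% critical value for 35 degrees of freedom (about 2.03), which is in line with the absence of significant differences reported for appearance.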

Texture
In this study we measured the image score for overall texture and compared it to the two texture sub-scores for real fruit: intensity and acceptability of texture. Intensity of texture is defined by the degree of firmness and is measured by the strength needed for the first bite [8,12,14]. When all commodities were combined, the mean difference between the image and real scores was small (0.32), and the standard deviation (22.4) was larger than that of appearance but similar to that of overall eating quality (Figure 4). The paired t-test showed that the comparison of the image score with the real fruit score did not produce any significant difference (Table 1). However, when each commodity was evaluated separately, the results for blueberry and grape showed a statistically significant difference (Table 3).
Table 3. Paired t-test comparing mean texture (IT-TF, firmness intensity) score between the images and real produce based on each commodity. Note. * There were significant differences between IT and TF for blueberry and grape (p < 0.05).
The acceptability of texture is defined by the consumer's emotional response to (or preference for) tactile mouth feel and is measured by the liking of the mouth feel [8,12,14]. In this study, the image score for texture was an imagined perception of the mouth feel, as if the panelists were tasting the real product. Texture scores for real produce were generated after consumers tasted the fruit. Usually, image texture scores were higher than the texture scores given while (or shortly after) eating. When all the commodities were combined, the mean difference between the image and taste evaluation scores was very small (−1.21), and the standard deviation (28.5) was about three times larger than that observed in the evaluations for appearance. Additionally, the standard deviation of acceptability of texture was larger than that of the intensity of texture (Figure 5 and Table 4). Paired t-tests comparing image and real produce scores showed no significant difference. However, when each commodity was evaluated separately, the scores provided for grapes showed a significant difference. The mean difference in this case was −14.98, which was the largest of all attributes (Table 4). The negative mean score indicated that the mouth feel perception of the fruit was better than the (predictive) perception based on the photograph. Based on our observations, we believe this may be due to the panelists' ability to use the tablet to zoom in and enlarge the image, which allows the panelist to detect small spots or scars. To the naked eye those defects are not prominent, but they are on the magnified image. Additionally, the skin of the grape was slightly transparent and light green, which is more visible in the image and may have influenced the visual perception of the grape in the image.
Table 4. Paired t-test comparing acceptability of texture (IT-T) between the images and real produce based on each commodity.

Note (Table 4). * There was a significant difference between IT and T rated by consumers on grape at p < 0.05.

Overall Quality
The score provided for overall eating quality (overall quality), particularly in relation to in situ evaluations, derives essentially from an integrated perception of appearance, texture, and flavor together [8,12,13].
The mean difference for overall eating quality was positive, and it was the largest among all attributes. The spread was broad, with a standard deviation of around 20.9 (Figure 6 and Table 5). The paired t-test showed that the paired scores from the evaluation of the fruit vs. its image did have a statistically significant difference (Table 1). We expected that the image scores would be higher than the real scores. This is because, when the evaluation is performed through the digital image, the only criterion is appearance, complex as it is, since it includes color, size, shape, and surface texture. However, when eating the fruit, the dimension of the perception expands and more criteria are added, including aroma, taste, and texture. The more comprehensive approach usually lowers the evaluation score, and not vice versa [9].
[…] eating quality, further studies are warranted to identify ways to improve perception and predictability when using digital images. One potential option could be to add a written (or verbal) description along with the image to convey what the flavor and aroma are like. Additionally, emotion sensory analyses (e.g., facial expressions, skin changes, electroencephalography) and optimization of the construction of the image (e.g., lighting, area covered by the product, 3D and 360° photography technology) could support a more realistic perception of the real product. Understanding how the consumer inspects the overall quality of a selected product would allow for more effective use of image analysis systems (e.g., MIPAR) [39] to feed machine learning models that predict in situ evaluations.

Relationship among Variables
A multivariable biplot map was drawn to examine the relationships among the variables. The first three principal components accounted for 85.7% of the total variation, with the first two components accounting for 52.6% and 20.3%, respectively (leaving about 12.8% for the third). The map (Figure 7) showed that three attributes (IA, A, IO) were clustered together in the 4th quadrant. The real overall eating attribute (O) lay slightly outside this cluster, although all four attributes were in proximity in the same quadrant. The texture-related attributes (T, IT, and TF) were separate in the 1st quadrant, suggesting no close relationship to the appearance (A) and overall eating (O) variables. As additional information, we provide biplots generated by principal components analysis that show the relationships among variables for each commodity (Figure S2a-d).
Three clustered variables (IA, A, IO) were all highly correlated to each other (r > 0.8), as shown in Table 6. These correlations confirmed the possibility of using digital images for the evaluation of produce appearance. They also revealed that consumers perceived overall quality of food in close association with appearance. The latter, despite some variability, was observed when comparing the overall eating quality of real fruit with that using the image, as evidenced by some distance in the biplot maps. Interestingly, the results with the two closely located variables, image perceived texture (IT) and texture intensity of real fruit (TF), indicated that when consumers were asked to evaluate the texture of produce with the image, they tended to assess for firmness, more than for overall texture acceptability (T). This biplot map suggests that, while using images for evaluating texture is complicated, if firmness is confirmed as the variable consumers are primarily looking for when checking for texture, the use of a written description referring to a scale that denotes different levels of firmness could be evaluated further.
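The pairwise correlations summarized above can be illustrated with a minimal sketch. The scores below are hypothetical and are not data from this study; the variable labels (IA, A, IO, T) follow the abbreviations used in the text, and the helper name `pearson_r` is an assumption introduced only for this illustration. The sketch computes Pearson's r for every pair of attributes and lists the pairs exceeding the r > 0.8 threshold mentioned above.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical panelist scores, for illustration only (not data from the study).
scores = {
    "IA": [72, 65, 80, 58, 90, 77, 68, 84],  # appearance rated from the image
    "A": [70, 68, 78, 60, 88, 75, 70, 82],   # appearance rated on the real fruit
    "IO": [69, 66, 82, 55, 87, 74, 71, 80],  # overall quality rated from the image
    "T": [60, 75, 62, 70, 66, 58, 73, 64],   # texture rated on the real fruit
}

names = list(scores)
# Full correlation matrix stored as a dict keyed by (row, column) labels.
corr = {(a, b): pearson_r(scores[a], scores[b]) for a in names for b in names}

# Print the matrix row by row.
for a in names:
    print(f"{a:>2}", " ".join(f"{corr[(a, b)]:6.2f}" for b in names))

# Report each unordered pair whose correlation exceeds the 0.8 threshold.
strong_pairs = [
    (a, b)
    for i, a in enumerate(names)
    for b in names[i + 1:]
    if corr[(a, b)] > 0.8
]
print("pairs with r > 0.8:", strong_pairs)
```

With real panel data, each list would hold one score per panelist, in the same panelist order across attributes, so that the pairing is preserved.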

Possibility of Using Images in Sensory Evaluations
Consumers purchase fruit and vegetables after assessing their visual appearance. Unlike less perishable foods such as cheese, or processed foods such as canned goods and bakery products, all of which are mass-produced in identical form, fresh produce never yields two identical items. This presents a challenge for consumers who shop for produce remotely. In our study we propose that the use of real-time images may bridge the gap between product variability and quality consistency, building trust in visual cues as a reliable way to evaluate quality variables. Our results are promising but certainly have limitations, and further studies are required to optimize the prediction of quality using images. The application of real-time images in commercial operations may only become the norm if the images represent large batches of the product at a given time (e.g., daily).
In this study we found that the score difference between the image appearance and the appearance of the real produce was insignificant, with only a small variation of the mean. Larger variation was observed in the scores for texture and overall eating quality. The paired t-test further confirmed no significant differences in the paired factors for appearance and texture. Further, statistical analysis showed that there was no significant difference among age and gender groups for appearance and texture (data not shown). Consumers could generate a fairly accurate evaluation of the actual fruit quality from its photograph.
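For readers less familiar with the paired comparison referred to above, the following is a minimal sketch of how a paired t statistic is obtained from image scores and in situ scores given by the same panelists. The scores are hypothetical and are not data from this study, and the function name `paired_t_statistic` is an assumption introduced for illustration. Only the test statistic is computed here; the p-value would be read from a t distribution with n - 1 degrees of freedom, for example with a statistics package such as SciPy (`scipy.stats.ttest_rel`).

```python
import math
import statistics


def paired_t_statistic(image_scores, real_scores):
    """Return (mean difference, SD of differences, t statistic, degrees of freedom)."""
    if len(image_scores) != len(real_scores):
        raise ValueError("paired samples must have equal length")
    # Each difference is image score minus in situ score for the same panelist.
    diffs = [i - r for i, r in zip(image_scores, real_scores)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    # t = mean difference divided by its standard error.
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, sd_diff, t_stat, n - 1


# Hypothetical scores, for illustration only (not data from the study).
image = [72, 65, 80, 58, 90, 77, 68, 84]  # attribute rated from the image
real = [70, 68, 78, 60, 88, 75, 70, 82]   # same attribute rated on the real fruit

mean_diff, sd_diff, t_stat, df = paired_t_statistic(image, real)
print(f"mean diff = {mean_diff:.3f}, SD = {sd_diff:.3f}, t = {t_stat:.3f}, df = {df}")
```

A positive mean difference indicates that the image was scored higher than the real fruit, which matches the sign convention used for the mean differences reported in this section.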
Images have only recently begun to be used in online grocery shops, and the photographs currently displayed do not capture a realistic view of the product batches. In fact, in most cases the same photo is used over a long period, indicating that it serves only for illustration purposes. For those currently purchasing produce online through a third party, the in-store shoppers could provide real-time pictures of products for approval (or consultation) before finalizing the purchase. In the same way, online shoppers could approve purchases after seeing the produce through 3D cameras installed in the warehouse before shipping.
The benefits of utilizing optimized electronic images (including descriptions of the product) appear numerous. The increasing enthusiasm for online produce shopping can provide opportunities for implementing more sustainable logistics systems that help decarbonize the overall purchasing process (or the carry-out of merchandise) by eliminating time constraints, reducing the use of fuel, and making more effective use of environmentally friendly carryout bags and boxes. All of these apply even in scenarios where curbside pick-up predominates. Importantly, solid evidence is starting to emerge, in specific geographical areas, that consumers who shift to online purchasing are more prone to purchase larger quantities of fresh produce [40]. Therefore, it is reasonable to suggest that our results, if implemented by the industry, can contribute in the long run to outcomes with important benefits for the health of the planet and of consumers.

Conclusions
The biggest barrier to online shopping for fresh produce is the inability of consumers to evaluate the actual appearance of the products. This study demonstrated that there is great potential to utilize electronic real-time images to accurately evaluate the quality appearance of fresh produce. The mean difference between the produce image scores and the real produce scores was small, and the paired t-test consistently showed no statistical difference for appearance. The results obtained with texture, however, were mixed, suggesting that further investigation is needed to determine whether optimization of an image and its description could provide a more consistent and narrower difference between virtual and in situ evaluations. Our results can contribute to (i) improving online environments for fresh produce shopping and (ii) delivering new, inexpensive, and less time-consuming ways to conduct sensory analysis for research purposes.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/horticulturae7090262/s1: Figure S1: Images (i) from online shopping markets (provided by Amazon Fresh or Whole Food Markets) and (ii) given to panelists in a tablet; Figure S2a-d: Biplots were generated by principal components analysis; Table S1: Demographic information of participants in sensory evaluation for each commodity; Table S2: The mean scores rated on each attribute by consumer panelists in fruit.