A Deep Learning Method for Yogurt Preferences Prediction Using Sensory Attributes

Abstract: During the development of innovative products, consumer preferences are an essential factor for yogurt producers seeking to improve their market share. A high-performance prediction method helps to clarify the intrinsic relationship between preferences and sensory attributes. In this study, a novel deep learning method is proposed that uses an autoencoder to extract product features from the sensory attributes scored by experts; the extracted sensory features are then regressed on consumer preferences by support vector machine analysis. Model performance analysis, hedonic contour mapping, and feature clustering were implemented to validate the overall learning process. The results showed that the deep learning model achieves an acceptable level of accuracy, and the resulting hedonic maps could be of great help to producers in product design or modification. Finally, hierarchical clustering analysis revealed that, for all three brands of yogurt, low-temperature (4 °C) storage for no more than 4 weeks yields the highest consumer preference.


Introduction
As living standards continue to rise, yogurt is becoming popular among people from all walks of life. Taking the Chinese market as an example, market research by the National Bureau of Statistics of China [1] showed that yogurt sales in China increased from 33 billion yuan in 2010 to 101 billion yuan in 2016, while milk sales increased from 67 billion yuan to 110 billion yuan over the same period. Further, yogurt sales reached 122 billion yuan in 2017, surpassing pure milk sales for the first time. It is estimated that by 2020, yogurt sales in China will reach 190 billion yuan [2].
For consumers of dairy products, the smooth texture and refreshing taste of yogurt are appealing, and the probiotics in yogurt, which may have positive effects on immune, cardiovascular, and metabolic health, can further strengthen consumers' liking for it [3]. The rapidly expanding yogurt market has delivered huge profits to yogurt producers, and catering to consumers' tastes is one of the most important ways for producers to enlarge their market share [4]. Thus, the accurate prediction of consumer preferences is crucial for yogurt producers aiming to attract new consumers and to maintain brand loyalty [5].
Knowledge about consumers' preferences for the sensory qualities of yogurt can guide producers' processing decisions. However, it is not economical to undertake large-scale sensory evaluations and market surveys [6]. Thus, market demand is usually estimated via consumer preference predictions that use sensory attributes as the input. Some previous studies have explored the correlation between sensory attributes and consumer preferences for dairy products. Ares et al. [7] compared different preference mapping techniques and recommended external preference mapping, used by the dairy industry to modify the texture of dulce de leche, as a method that better meets consumer requirements. Zhi et al. [8] established a relationship between product characteristics and overall preferences using partial least squares regression (PLSR) to identify the preferences of consumers from different regions in relation to flavored milk. Castada et al. [9] used principal component analysis (PCA) and found a positive correlation between five sensory attributes and consumer preferences in relation to Swiss cheese.
These studies applied a range of conventional statistical correlation methods in an attempt to predict consumer preferences based on sensory attributes. However, when applied to large volumes of high-dimensional data, these conventional methods may not provide sufficient accuracy, extensibility, and generalization [10]. In addition, prediction errors may arise from issues such as individual differences among evaluators or poor data quality in sensory experiments [11]. To overcome these deficiencies, we propose a high-performance computer-aided model built on more efficient regression algorithms. The core aim of this model is to extract, using a deep learning method, the unique features that reflect the intrinsic characteristics of a yogurt product from sensory evaluation data. These features can be used not only to regress consumer preferences but also to help yogurt producers with product classification, design, processing, and the adjustment of storage conditions.
Deep learning algorithms are used for dimensionality reduction and feature extraction [12]. The autoencoder (AE) approach has proved accurate and efficient for feature extraction and for reducing noise in the original data [13]. Among autoencoders, the general framework using fully connected layers (FCAE) is the most widely used [14]. With the development of machine learning technology, the deep convolutional autoencoder (DCAE) proposed by Cheng et al. [15] is becoming highly recognized and increasingly prevalent. The structure of the DCAE combines a convolutional neural network (CNN) [16] with a simple autoencoder. The convolution and pooling operations in the DCAE enable the model to consider comprehensively the latent features of all the evaluators' scoring habits and to reveal the relationships among the coupled sensory attributes during preference regression [17].
After identifying the key features, a support vector machine (SVM) algorithm [18] can be applied to predict preferences [10]. A previous study [19] showed that SVM was the most practical method in relation to high-dimensional regression problems with small sample sizes. The complete set of SVM regression models has been packaged as an Industrial Internet plug-in. Using this plug-in, the hedonic contour mapping [20,21] of key sensory properties is feasible. Similar to external preference mapping [7], hedonic contour mapping, which is generated by orthogonal experiments [22] implemented by computer programs, can determine an area composed of consumers' favorable sensory attributes. In addition, it has been reported that the sensory qualities of yogurt are closely associated with storage duration and temperature [23]. Therefore, hierarchical clustering [24] of the extracted features is applied to classify consumers' preference levels for the yogurts of different brands stored under various conditions, and thus the optimal storage conditions could be identified.
In this study, the overall process of consumer preference prediction based on sensory evaluation data is introduced, including sensory and consumer analysis, feature extraction by deep learning, and overall preference regression using SVM, hedonic contour mapping, and hierarchical clustering based on sensory features. Yogurt samples are used to provide the dataset for the model and test the feasibility of the overall process.

Sample Preparation
Three newly developed yogurt products of different brands were provided by yogurt producers in China. Each product was stored under 12 conditions combining various storage periods and storage temperatures: 2L, 3L, 4L, 5L, 1R, 2R, 3R, 4R, 5R, 2H, 3H, and 4H, in which the Arabic numerals denote the number of storage weeks, while L, R, and H denote low temperature (4 °C), room temperature (25 °C), and high temperature (38 °C), respectively.
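The condition codes can be decoded mechanically. The following is a minimal sketch, not part of the study's software; the helper name `parse_condition` and the lookup table are illustrative assumptions, while the letters, temperatures, and the list of 12 codes are taken from the text.

```python
import re

# Temperature letters as defined in the text (degrees Celsius).
TEMPERATURE_C = {"L": 4, "R": 25, "H": 38}

# The 12 storage conditions listed in the text.
CONDITIONS = ["2L", "3L", "4L", "5L", "1R", "2R", "3R", "4R", "5R", "2H", "3H", "4H"]


def parse_condition(code):
    """Split a code such as '3L' into (storage weeks, temperature in deg C).

    Hypothetical helper written for illustration only.
    """
    match = re.fullmatch(r"(\d+)([LRH])", code)
    if match is None:
        raise ValueError(f"unrecognized storage code: {code!r}")
    weeks, letter = match.groups()
    return int(weeks), TEMPERATURE_C[letter]


parsed = {code: parse_condition(code) for code in CONDITIONS}

# Three brands, each stored under the 12 conditions, give the 36 samples.
n_samples = 3 * len(CONDITIONS)
```

With three brands and 12 conditions each, this accounts for the 36 samples evaluated in the following sections.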

Panelists
Ten expert panelists and 100 consumer respondents were recruited for the sensory study and hedonic test, respectively. The expert panelists, 5 males and 5 females aged 22 to 46, were recruited externally and trained regularly by the COFCO Nutrition Health Research Institute (NHRI), the research and development (R&D) center of China National Cereals, Oils and Foodstuffs Corporation (COFCO) in Beijing. Volunteer employees from the non-R&D and non-marketing departments of COFCO served as the consumer respondents for the hedonic test. These volunteers, aged 22 to 46 with a male to female ratio of 2.125:1, generally purchase yogurt products habitually. All participants confirmed in advance that they had no allergy to, or intolerance of, dairy products.
The selection and training of sensory experts followed ISO 8586:2012 [25]. People with no less than 4 years of experience in sensory evaluation were considered, and those who could restate the sample attributes and whose assessment intensities could be calibrated to similar scales were invited to the sensory descriptive analysis of yogurts. Before the formal test, they were asked to join a training program consisting of 1-2 sessions per week over a four-week period. The training was performed mainly with triangle, ranking, and matching tests to ensure the panelists' abilities in descriptive vocabulary, matching, and discrimination. During the training, the panelists' performance levels were also monitored with three identical yogurt samples labeled with different three-digit numbers.

Sensory Descriptive Test
The 36 yogurt samples were randomly divided into 3 groups of 12 samples each. In each sensory evaluation, which was conducted once a day, the expert panelists were asked to evaluate one group of samples based on 22 preselected descriptive attributes. The assessment of all samples was completed within 3 consecutive working days. The sensory descriptors were predetermined by the panel leaders, and the intensity of each attribute was scored on a 10 cm line scale anchored by verbal endpoints at both ends [26]. All the yogurt samples were kept overnight in a refrigerator set at 4 ± 2 °C and incubated for 20 min at 26 ± 2 °C before serving. In an environment of 26 ± 2 °C, approximately 30 mL of each yogurt sample in a 60 mL plastic cup was labeled with a three-digit number and presented to the panelists in random order. Between samples, a 3-minute break was taken, and mineral water and unsalted steamed buns were served to cleanse the assessors' palates. Samples S1, S2, and S3, stirred yogurts stored for 2 weeks at room temperature, 4 °C, and 38 °C, respectively, were used as blind reference samples in the evaluation procedure. Sense Whisper (test.sensewhisper.com) and PanelCheck (programmed by Oliver Tomic and Henning Risvik) were used to collect and calculate the evaluation data [27]. More information on the 22 yogurt sensory attributes and the experts' performance is given in the supplementary file, in Figures S1-S5.

Hedonic Test
The 36 yogurt samples were sent to the 100 consumer respondents in various offices of COFCO. After tasting each sample, the respondents were asked to rate it on a 9-point hedonic scale, on which scores of 1 and 9 denote extremely unpleasant and extremely pleasant, respectively [26].
The high degree of randomness and human error in the consumer ratings makes it infeasible to precisely predict the entire distribution of consumer preferences. Thus, the scores were divided into three categories for further analysis: preferred (scores between 7 and 9), neutral (scores between 4 and 6), and disliked (scores between 1 and 3). The percentages of consumers' preferences for the 36 samples are shown in Table 1.
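The grouping of hedonic scores can be illustrated with a short sketch. The function names and the example ratings below are hypothetical; only the score boundaries (7 to 9, 4 to 6, and 1 to 3) come from the text.

```python
from collections import Counter

GROUPS = ("preferred", "neutral", "disliked")


def categorize(score):
    """Map a 9-point hedonic score to one of the three groups used in the text."""
    if not 1 <= score <= 9:
        raise ValueError("hedonic scores range from 1 to 9")
    if score >= 7:
        return "preferred"
    if score >= 4:
        return "neutral"
    return "disliked"


def preference_percentages(scores):
    """Return the percentage of respondents falling into each group."""
    counts = Counter(categorize(s) for s in scores)
    return {group: 100.0 * counts[group] / len(scores) for group in GROUPS}


# Hypothetical ratings from ten respondents for one sample (not study data).
example_scores = [8, 7, 9, 5, 4, 6, 2, 7, 3, 8]
example = preference_percentages(example_scores)
```

Applied to the 100 ratings collected for each sample, the same calculation would give one row of percentages per sample.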

Feature Extraction Based on Deep Learning
For each yogurt sample, 22 attributes were assessed by 10 experts, generating a two-dimensional matrix of size 10 × 22. The matrix elements, i.e., the attribute values of the 36 samples, were read from the 10 cm line scales and linearly normalized to the range 0 to 1. Using Python 3.6, the normalized data for the individual experts were rendered as heatmaps [28]. The detailed dataset can be found in Figures S1-S5 of the supplementary file.
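As an illustration of this preprocessing step, the sketch below normalizes one 10 × 22 score matrix. It assumes that "linearly normalized" means dividing each reading by the 10 cm scale length; min-max scaling per attribute would be another reading of the text. The data are synthetic, and the function name is hypothetical.

```python
import numpy as np

N_EXPERTS, N_ATTRIBUTES = 10, 22   # matrix size stated in the text
SCALE_LENGTH_CM = 10.0             # length of the line scale


def normalize_scores(raw_cm):
    """Scale line-scale readings (0-10 cm) linearly to the range 0-1.

    Assumption: normalization is division by the scale length.
    """
    raw = np.asarray(raw_cm, dtype=float)
    if raw.shape != (N_EXPERTS, N_ATTRIBUTES):
        raise ValueError("expected a 10 x 22 expert-by-attribute matrix")
    if raw.min() < 0.0 or raw.max() > SCALE_LENGTH_CM:
        raise ValueError("readings must lie on the 10 cm scale")
    return raw / SCALE_LENGTH_CM


# Synthetic readings standing in for one sample's expert scores.
rng = np.random.default_rng(0)
raw_sample = rng.uniform(0.0, SCALE_LENGTH_CM, size=(N_EXPERTS, N_ATTRIBUTES))
heatmap = normalize_scores(raw_sample)
```

The resulting array has experts as rows and attributes as columns, which is the layout a heatmap plotting routine would display directly.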
For a comprehensive evaluation of the experts' sensory scores, conventional statistical correlation methods may not fully support the regression because of the high dimensionality. Conventional methods such as PCA or PLSR may also lose key sensory information during dimensionality reduction and feature extraction [29]. Thus, we applied deep learning methods to extract the product features from the sensory attribute scores. Among the various deep learning methods, the autoencoder (AE) may be an efficient algorithm for dimensionality reduction and noise suppression. An AE first learns a representation of a set of data by dimensionality reduction, which is the encoding operation, and then creates the closest possible reconstruction of the original data from the extracted representation, which is the decoding operation. The representation consists of the features extracted from the original data, converting the 10 experts' attribute scores into a low-dimensional vector. To extract the features more effectively, both the FCAE and DCAE methods were applied to the same dataset, and their graphical structures are shown in Figure 1. The operations in Figure 1 were executed with eight Intel Core i7-7700HQ CPUs @ 3.8 GHz and one GeForce GTX 1050 Ti GPU (CUDA 9.0), running Python 3.6 with the Keras framework on a TensorFlow back end [30].
As shown in Figure 1a, the FCAE consists of the original data (220 nodes), a hidden layer (24 nodes), and the reverted data (220 nodes). The normalized original data are obtained by flattening a heatmap into one dimension. All the layers are fully connected, and the FCAE is trained to minimize the deviation between the input and output data. The hidden layer, which can be reverted to a vector similar to the input, is taken as the features of the sensory attributes of a specific yogurt sample.
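The 220-24-220 structure can be sketched in a few lines. The example below is a minimal NumPy illustration rather than the Keras model used in the study: the sigmoid activations, weight initialization, learning rate, number of iterations, and plain gradient descent are all assumptions, and the input is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
N_IN, N_HIDDEN = 220, 24   # layer sizes stated in the text


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Synthetic stand-in for 25 flattened, normalized heatmaps (10 x 22 -> 220).
X = rng.uniform(0.0, 1.0, size=(25, 10, 22)).reshape(25, N_IN)

# Assumed initialization: small random weights, zero biases.
W1 = rng.normal(0.0, 0.1, size=(N_IN, N_HIDDEN))
b1 = np.zeros(N_HIDDEN)
W2 = rng.normal(0.0, 0.1, size=(N_HIDDEN, N_IN))
b2 = np.zeros(N_IN)


def forward(data):
    hidden = sigmoid(data @ W1 + b1)        # encoding: 24 features per sample
    reverted = sigmoid(hidden @ W2 + b2)    # decoding: 220-node reconstruction
    return hidden, reverted


def reconstruction_mse(data):
    return float(np.mean((forward(data)[1] - data) ** 2))


loss_before = reconstruction_mse(X)
LEARNING_RATE = 2.0   # assumed value
for _ in range(200):
    H, Y = forward(X)
    # Gradients of the mean squared reconstruction error (backpropagation).
    dZ2 = 2.0 * (Y - X) / X.size * Y * (1.0 - Y)
    dZ1 = (dZ2 @ W2.T) * H * (1.0 - H)
    W2 -= LEARNING_RATE * (H.T @ dZ2)
    b2 -= LEARNING_RATE * dZ2.sum(axis=0)
    W1 -= LEARNING_RATE * (X.T @ dZ1)
    b1 -= LEARNING_RATE * dZ1.sum(axis=0)
loss_after = reconstruction_mse(X)

# The hidden-layer activations are the 24-dimensional features.
features = forward(X)[0]
```

Because the input here is random noise, the reconstruction error falls only modestly; the point of the sketch is the data flow from 220 inputs through 24 hidden features back to 220 outputs.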
As shown in Figure 1b, the encoding operations, including convolution and pooling, were used to extract features from the original data for further regression. The subsequent deconvolution network reverted the features into the original heatmap through a decoding operation (upsampling and deconvolution) for network parameter optimization and verification of feature correctness. Four original heatmaps were randomly selected from the testing sets, and a comparison between the original heatmaps and their reverted heatmaps is shown in Figure 2. In the original heatmaps, the x-axis represents the 22 sensory attributes, while the y-axis represents the 10 experts.
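The building blocks named here (convolution, pooling, and upsampling) can be illustrated on a single 10 × 22 heatmap. The sketch below is not the network in Figure 1b: the 3 × 3 averaging kernel stands in for a learned filter, and the kernel size, zero padding, ReLU activation, and 2 × 2 pooling window are assumptions, since the text does not list the layer settings.

```python
import numpy as np


def conv2d_same(image, kernel):
    """Single-channel 2-D cross-correlation with zero padding (same output size)."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(image, dtype=float)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = np.sum(padded[r:r + kh, c:c + kw] * kernel)
    return out


def max_pool_2x2(image):
    """Keep the maximum of each non-overlapping 2 x 2 block."""
    rows, cols = image.shape
    return image.reshape(rows // 2, 2, cols // 2, 2).max(axis=(1, 3))


def upsample_2x2(image):
    """Nearest-neighbour upsampling: repeat every value in a 2 x 2 block."""
    return image.repeat(2, axis=0).repeat(2, axis=1)


rng = np.random.default_rng(0)
heatmap = rng.uniform(0.0, 1.0, size=(10, 22))   # synthetic experts x attributes
kernel = np.full((3, 3), 1.0 / 9.0)              # stand-in for a learned filter

# Encoder side: convolution, ReLU, pooling (10 x 22 -> 5 x 11).
encoded = max_pool_2x2(np.maximum(conv2d_same(heatmap, kernel), 0.0))
# Decoder side: upsampling restores the original grid size.
decoded_shape = upsample_2x2(encoded).shape
```

Each pooled value summarizes a neighbourhood spanning adjacent experts and adjacent attributes, which is one way to picture how convolution and pooling combine information across both axes of the heatmap.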
Compared to the conventional PCA and PLSR methods, the deep learning treatment of the original sensory evaluation data takes comprehensive account of all evaluators' scoring habits and the mutual couplings among sensory attributes. In FCAE, the features were easily extracted, carried explicit meaning, and were stored in the hidden layer for further use. In DCAE, the reverted heatmaps were reconstructed by extracting features from the distribution information of the sensory attribute dimension in each column and reducing the fluctuation among sensory evaluators in each row, which effectively supports data denoising and feature robustness.
After the FCAE and DCAE structures and parameters were determined, all features (with 24 dimensions) were input for further analyses using AE and SVM as the main algorithms. The entire deep learning process for consumer preference prediction, hedonic contour mapping, and classification of brand/storage conditions was performed as shown in Figure 3. To explore the effectiveness and robustness of feature extraction, the averages of the sensory attributes were also used as a type of extracted feature for comparison.

Intelligent SVM Regression for Preference Prediction
The features were structured into 24-dimensional vectors for input, and the output took the form of the percentages of the consumer preference groups. A conversion between the high-dimensional features and the two-dimensional percentages (preferred and disliked) was therefore needed. The ε-support vector regression (ε-SVR) algorithm, which is suited to problems with high-dimensional data and small sample sizes, was selected for this conversion. This branch of SVM, based on VC dimension theory and the structural risk minimization principle, has consistently shown outstanding generalization ability and global optimization [31]. In addition, the solution obtained from ε-SVR is interpretable and is expressed as follows [32]:

$$y = f(x) = \sum_{x_i \in SV} (\alpha_i - \alpha_i^*) K(x_i, x) + b$$

$$b = \frac{1}{N_{NSV}} \left\{ \sum_{0 < \alpha_i < C} \left[ y_i - \sum_{x_j \in SV} (\alpha_j - \alpha_j^*) K(x_j, x_i) - \varepsilon \right] + \sum_{0 < \alpha_i^* < C} \left[ y_i - \sum_{x_j \in SV} (\alpha_j - \alpha_j^*) K(x_j, x_i) + \varepsilon \right] \right\}$$

where y represents the preference prediction output, x represents the extracted features, α_i and α_i^* are Lagrange multipliers, b is the intercept of the linear regression, SV is the set of support vectors, N_NSV is the number of standard support vectors, and K is the kernel function. C, ε, and γ are the penalty coefficient, the tolerance boundary, and the distribution width in the kernel function, respectively, which are the parameters to be optimized in the model. The optimization process, based on a swarm intelligence algorithm [19], is included with the ε-SVR algorithm in an Industrial Internet plug-in, and the entire process can be executed automatically. To compare the different methods fairly, the three parameters are optimized over the same parameter space, {(C, ε, γ) | C ∈ (16, 64), ε ∈ (0.06, 0.12), γ ∈ (0.05, 0.09)}.
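For readers unfamiliar with the form of an ε-SVR solution, the sketch below evaluates the standard decision function f(x) = Σ (α_i − α_i*) K(x_i, x) + b for a trained model. It assumes a radial basis function kernel, K(u, v) = exp(−γ‖u − v‖²), which is consistent with γ being described as a distribution width but is not stated explicitly in the text. The support vectors, coefficients, and intercept are hypothetical values, and two-dimensional inputs replace the 24-dimensional features for brevity.

```python
import math


def rbf_kernel(u, v, gamma):
    """Assumed kernel: K(u, v) = exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def svr_predict(x, support_vectors, dual_coefs, intercept, gamma):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b.

    dual_coefs[i] holds the difference (alpha_i - alpha_i*) for support vector i.
    """
    return intercept + sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )


# Hypothetical trained model (illustrative values, not from the study).
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
dual_coefs = [0.5, -0.25]
intercept = 0.1
gamma = 0.07   # lies inside the search interval (0.05, 0.09) quoted in the text

prediction = svr_predict([0.0, 0.0], support_vectors, dual_coefs, intercept, gamma)
```

Only the support vectors contribute to the sum, so the cost of a prediction grows with their number rather than with the size of the training set.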

Evaluation of Consumer Preferences Using Hedonic Contour Mapping
Product sensory properties can vary over a wide range, so it is impossible to collect actual consumer preferences at all data points. However, using the deep learning process outlined above, consumer preferences can be predicted once the specific sensory attributes are provided. The hedonic contour mapping method identifies preferences along selected key sensory dimensions using the process shown in Figure 3. To help yogurt producers improve their product quality, the hedonic contour map is built from specific sensory dimensions rather than principal components or other explanatory variables [33]. After the preference predictions are collected, a contour map can be generated in which the points on the map indicate preference percentages.
Compared to the previous mapping methods, the hedonic contour mapping method can clearly display the consumer preference characteristics for specific attributes and the future improvement direction for a specific product.
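The mapping idea can be sketched as a grid evaluation: two sensory attributes are varied over their range while the others are held fixed, and a preference predictor is evaluated at every grid point. In the sketch below, `toy_predict` is a hypothetical stand-in for the trained prediction model, and the baseline values, grid resolution, and peak location are arbitrary illustrative choices.

```python
import numpy as np


def preference_grid(predict, baseline, idx_x, idx_y, x_values, y_values):
    """Evaluate predict() over two attributes, holding the others at baseline."""
    grid = np.empty((len(y_values), len(x_values)))
    for r, y in enumerate(y_values):
        for c, x in enumerate(x_values):
            point = np.array(baseline, dtype=float)   # fresh copy per grid point
            point[idx_x], point[idx_y] = x, y
            grid[r, c] = predict(point)
    return grid


def toy_predict(attributes):
    """Hypothetical predictor: preferred percentage peaking at (5, 5)."""
    return 100.0 * np.exp(-((attributes[0] - 5.0) ** 2 + (attributes[1] - 5.0) ** 2) / 8.0)


baseline = np.full(22, 5.0)                # 22 attributes at arbitrary mid-scale values
x_values = np.linspace(0.0, 10.0, 21)      # first attribute on the 0-10 scale
y_values = np.linspace(0.0, 10.0, 21)      # second attribute on the 0-10 scale
grid = preference_grid(toy_predict, baseline, 0, 1, x_values, y_values)

# Location of the highest predicted preference on the grid.
best_r, best_c = np.unravel_index(np.argmax(grid), grid.shape)
best_point = (float(x_values[best_c]), float(y_values[best_r]))
```

The resulting array can be passed to any contour-plotting routine, with the brightest region marking the most favored combination of the two attributes.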

Classification of Brands and Storage Conditions
The features extracted by the AEs can also be used to classify the yogurt samples by aspects such as brand and storage conditions. Hierarchical clustering analysis is frequently used to identify similarities among variables. In this study, hierarchical clustering analysis with Euclidean distance and Ward's criterion [34] was used to classify the 36 yogurt samples of various brands and storage conditions, as shown in Figure 3.
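A minimal sketch of this clustering step, using SciPy's hierarchical clustering routines, is given below. The features are synthetic: three well-separated groups of twelve 24-dimensional vectors stand in for the three brands, so the clean separation shown here says nothing about the real data. Cutting the tree into three clusters is likewise an illustrative choice.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)

# Synthetic stand-in for the extracted features: 3 groups x 12 samples x 24 dims.
group_centers = np.array([0.0, 5.0, 10.0])
features = np.vstack(
    [center + rng.normal(0.0, 0.3, size=(12, 24)) for center in group_centers]
)

# Agglomerative clustering with Euclidean distance and Ward's criterion.
tree = linkage(features, method="ward", metric="euclidean")

# Cut the dendrogram into three flat clusters.
labels = fcluster(tree, t=3, criterion="maxclust")
```

The linkage matrix `tree` can also be drawn as a dendrogram, in which smaller sub-clusters within each group would correspond to samples with similar extracted features.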

Overall Performance of Preference Predictions
The 36 yogurt samples were randomly divided into two groups: 25 (70%) for training the AEs and 11 (30%) for testing with the trained AEs. After the AE system had been trained and its ε-SVR optimized, the original heatmap of each sample, which contains the full information on the sensory attribute scores, was transformed into consumer preference percentages using the deep learning process shown in Figure 3. The averages of the sensory attribute scores were also used as an input to ε-SVR. Each approach was tested five times. For comparison, the percentages of consumer preference were also calculated using the PLSR algorithm that had previously been applied to this dataset [35]. The root mean square error (RMSE) was used as the evaluation index. The results generated by the different methods are listed in Table 2. The RMSE results in Table 2 indicate that, compared to the PLSR method, the ε-SVR based models greatly reduced the deviations of the preference predictions. Among the ε-SVR based models, the deviations obtained with either FCAE or DCAE were also lower, in most cases, than those from the model using averages as features. In the preferred group, DCAE achieved better accuracy, whereas DCAE and FCAE performed similarly in the disliked group.
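The split and the evaluation index can be sketched as follows. The function names, the random seed, and the example percentages are hypothetical; only the 25/11 split of 36 samples and the use of RMSE come from the text.

```python
import math
import random


def split_samples(n_samples=36, n_train=25, seed=0):
    """Randomly split sample indices into training and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # seed chosen arbitrarily
    return sorted(indices[:n_train]), sorted(indices[n_train:])


def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


train_idx, test_idx = split_samples()

# Hypothetical predicted vs. observed preference percentages (not study data).
example_rmse = rmse([50.0, 30.0, 20.0], [47.0, 34.0, 20.0])
```

In the example, the errors are 3, −4, and 0 percentage points, so the RMSE equals the square root of 25/3, roughly 2.89 percentage points.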
The performance of FCAE and DCAE was further evaluated in terms of algorithm stability during feature extraction. The loss function of each method, which represents the difference between the reverted data and the original data, was recorded and plotted in Figure 4. The training process of DCAE converged stably after around 60 iterations, with a total time consumption of 32.4 s. In contrast, intermittent oscillation persisted throughout the training process of FCAE, which did not appear to have converged after 1000 iterations, with a total time consumption of 67.2 s.
In addition, the smaller gaps between the training and testing sets demonstrated that the DCAE-SVM model could avoid overfitting to a certain degree compared to the other methods. Considering the overall performance, further analysis was carried out with this model. The prediction results of the selected DCAE-SVM model for the 11 testing samples are shown in Figure 5.

Hedonic Contour Mapping Based on the Prediction Model
After the deep learning process was packaged, the consumer preferences for a product design could be obtained by entering the sensory attributes into the program. Several hedonic maps derived from the key sensory attributes of dairy products are shown in Figure 6. According to our previous research [35], eight key sensory attributes play an important role in determining consumers' preference for yogurt products. These eight key sensory dimensions, namely the acidity, smoothness, sweetness, milkiness, adhesiveness (effort to draw through a straw), oxidization, graininess, and whiteness of yogurt, were used as the independent variables, while the percentage of consumers in the preferred group for each sample was set as the dependent variable.
The subplots in Figure 6 were generated with Matlab R2018a (MathWorks Inc.), and the color bars show the color scale of the preferred ratio of the yogurt samples. The brightest (yellow) area represents the sensory qualities most favored by consumers. Using the hedonic contour maps in Figure 6, conclusions about consumer preference characteristics and product improvement schemes can be drawn within the experimental measurement range of a specific product. As revealed by Figure 6a, consumer preference was highest when the acidity value was around 5.8 and the smoothness value was around 5.9, and the positive slope of the yellow area indicates the product design principle for this kind of yogurt. For example, if the yogurt is very high in smoothness, it should also have a more acidic taste to balance the overall texture. Conversely, if the yogurt lacks smoothness, its taste should be less acidic. This area also indicates the boundaries of the optimal yogurt design parameters. The acidity value for this kind of yogurt should not be less than 2.4, and the smoothness value should not be more than 7.6.
Figure 6b shows that a milky taste is important to consumers, and it should have a value of no less than 4.8. The preferred range for sweetness is between 4.0 and 5.0, and the corresponding milkiness is between 6.5 and 8.0. Figure 6c shows that an oxidized flavor was not welcomed by consumers, and its value should be as low as possible. When the oxidized flavor value is lower than 2.0, the adhesiveness value should also not be too high, preferably between 3.0 and 4.0. Figure 6d shows the hedonic map for whiteness and graininess. The most favorable area is in the upper left corner, which means that the whiteness value should be as high as possible, while the graininess value should be as low as possible.

Hierarchical Clustering of Yogurt Samples with the Extracted Features
Through the deep learning process, 24-dimensional vectors were extracted as the key variables. To strengthen the interpretability of these extracted features, a hierarchical clustering analysis was performed using them as inputs to group the yogurt samples by brand (Brands 1, 2, and 3) and storage condition. The results are shown in Figure 7. With the extracted features as variables, hierarchical clustering clearly separates the three brands of yogurt. The microclusters show the sensory divergence under different storage conditions. The green labels represent suitable storage conditions, which result in high preferred percentages; they indicate that yogurt should be stored at around 4 °C for less than four weeks or at around 25 °C for less than one week. The red labels represent improper conditions that lower the sensory quality of yogurt and result in low consumer preference; they indicate that yogurt should not be stored at around 38 °C for more than two weeks or at around 25 °C for more than five weeks.
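The clustering step can be sketched as follows. This is a minimal pure-Python illustration that assumes average linkage and Euclidean distance (the linkage and metric behind Figure 7 are not stated in this section) and uses synthetic 24-dimensional vectors in place of the real extracted features. The per-brand offsets, the noise level, and the split of 12 samples per brand are invented for the example.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(points, n_clusters):
    """Agglomerative clustering with average linkage.

    Starts with one cluster per point and repeatedly merges the two clusters
    with the smallest mean pairwise distance until n_clusters remain.
    Returns a list of clusters, each a sorted list of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    dist = [[euclidean(p, q) for q in points] for p in points]

    def cluster_distance(c1, c2):
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = cluster_distance(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Synthetic stand-in for the extracted features (assumption, not study data).
rng = random.Random(0)
DIM = 24
brand_offsets = {1: 0.0, 2: 5.0, 3: 10.0}  # hypothetical brand centres
features, brands = [], []
for brand, offset in brand_offsets.items():
    for _ in range(12):  # hypothetical: 12 samples per brand
        features.append([offset + rng.gauss(0.0, 0.5) for _ in range(DIM)])
        brands.append(brand)

clusters = average_linkage(features, n_clusters=3)
for members in clusters:
    print(sorted({brands[i] for i in members}), len(members))
```

Cutting the same merge sequence at a larger number of clusters would expose the finer groups within each brand, which is where storage-condition effects such as the microclusters described above would be expected to appear.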

Discussion
Although deep learning methods are at present rarely applied to food research, they have been widely used in process engineering, for example in process fault diagnosis [36], and have been shown to be more effective than traditional methods for nonlinear dynamic processes with high uncertainty [37]. Similarly, modeling the correlation between food sensory evaluation and consumer preference is a dynamic, nonlinear, and highly uncertain process. Thus, a novel deep learning approach combining several ideas from previous papers is proposed and compared with one of the representative traditional correlation models, PLSR. Our numerical experiments showed that SVR using the features extracted by AEs as input has advantages over the conventional method.
The two AEs proposed in the present work, FCAE and DCAE, each have strengths and weaknesses. For FCAE, the excellent learning ability is demonstrated by its low training RMSE. Its concise structure and high interpretability make it efficient and feasible for use in sensory laboratories. However, when the dataset is small, FCAE may suffer from instability and overfitting. For DCAE, overfitting is reduced and stability is strengthened by the sparsity of convolution. With convolution and pooling operations, DCAE can consider the original heatmap more comprehensively with respect to the scale of the coupling relationships among the sensory attributes. The deviations among the expert evaluators are regarded as unnecessary features or noise for a yogurt sample, so they were removed by convolution and average pooling. However, preprocessing the dataset for DCAE is somewhat time-consuming, and its compiled model file is somewhat large. In addition, compared to FCAE, DCAE is more complex and less interpretable. These factors may hinder the further application of DCAE. Although DCAE is at present a good choice for high-dimensional regression with a small sample size, as in our yogurt preference prediction, we expect that the excellent learning ability of FCAE may make it the more powerful model once its limitations with respect to dataset size are addressed in the future.
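The smoothing effect attributed to average pooling can be illustrated on a toy score matrix. The sketch below assumes a heatmap with one row per expert and one column per attribute (10 experts and 22 attributes, matching the evaluation design), and pools non-overlapping pairs of experts. The window size, the simulated scores, and the helper names are illustrative assumptions and do not reproduce the DCAE configuration.

```python
import random
import statistics


def average_pool2d(matrix, pool_h, pool_w):
    """Non-overlapping average pooling over a list-of-lists matrix.

    Trailing rows or columns that do not fill a complete window are dropped.
    """
    rows, cols = len(matrix), len(matrix[0])
    pooled = []
    for r in range(0, rows - pool_h + 1, pool_h):
        out_row = []
        for c in range(0, cols - pool_w + 1, pool_w):
            window = [matrix[r + i][c + j]
                      for i in range(pool_h) for j in range(pool_w)]
            out_row.append(sum(window) / len(window))
        pooled.append(out_row)
    return pooled


def mean_column_spread(matrix):
    """Average, over columns, of the population standard deviation across rows."""
    columns = list(zip(*matrix))
    return sum(statistics.pstdev(col) for col in columns) / len(columns)


# Simulated scores (assumption): a shared attribute profile plus expert noise.
rng = random.Random(42)
N_EXPERTS, N_ATTRIBUTES = 10, 22
true_profile = [rng.uniform(2.0, 8.0) for _ in range(N_ATTRIBUTES)]
heatmap = [[t + rng.gauss(0.0, 0.8) for t in true_profile]
           for _ in range(N_EXPERTS)]

# Pool pairs of experts; attributes are left unpooled in this sketch.
pooled = average_pool2d(heatmap, pool_h=2, pool_w=1)

print("pooled shape:", len(pooled), "x", len(pooled[0]))
print("spread across experts before pooling:", mean_column_spread(heatmap))
print("spread across pooled rows:", mean_column_spread(pooled))
```

Because each pooled row is a mean of equally sized groups of experts, the spread across pooled rows cannot exceed the spread across the original rows, which is the sense in which pooling suppresses evaluator-to-evaluator deviations.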
Application of the deep learning approach to preference prediction provides plenty of opportunities for future food product design. The proposed hedonic contour mapping can reduce the workload of consumer investigation compared to traditional methods, such as a series of just-about-right (JAR) questions. To explore the preferences of another group of consumers for a similar food product, only the hedonic scores need to be acquired instead of long JAR questionnaires, because the essential food product features have already been extracted and recorded in the deep learning process. In addition, the hedonic contour mapping method is effective for revealing the coupling relationship between any two variables, which will help food enterprises improve the recipe and sensory qualities of their products in a targeted way. Because the dataset is limited, the mapping results cannot yet be regarded as fully accurate. Market analysis using these preference data is outside the scope of this manuscript; our role here as data scientists is to provide the data, not to construct market analysis indicators. Nevertheless, the results of this research can provide food enterprises and sensory researchers with more choices for future work.
In addition, some of the data resulting from the deep learning approach can benefit further analyses of the samples, as shown by our hierarchical clustering of yogurt samples with the extracted features. Compared to the clustering results obtained from the PLSR key components in our previous study [35], hierarchical clustering of the features extracted with AE gives better results in terms of storage condition classification, which supports the effectiveness of the deep learning process.

Conclusions
In this study, a complete AE-SVM deep learning framework is proposed for consumer preference prediction, and the dataset of yogurt sensory attributes and corresponding consumer preferences demonstrates the feasibility, accuracy, and stability of this model. The process of feature extraction with AE is highly interpretable, and the results obtained can also be extended to other applications. Further analyses based on the extracted features can contribute to the improvement of yogurt product processing, storage, marketing, etc. Combining hedonic contour mapping with this deep learning approach indicates that Chinese consumers prefer yogurts with minimum oxidization and graininess, maximum whiteness, and certain levels of acidity, smoothness, sweetness, milkiness, and adhesiveness. Since the optimal values of the various quality attributes revealed by hedonic contour mapping can provide helpful information to food enterprises, it is recommended that a fixed panel of sensory assessors be trained and prepared to perform sensory evaluation regularly. The proposed AE-SVM deep learning method is also shown to be feasible and effective for quality control of food products. In addition, the feature extraction process described above may provide more useful information if it is combined with other techniques for big data collection.
Supplementary Materials: The following are available online at http://www.mdpi.com/2227-9717/8/5/518/s1, Figure S1: Detailed sensory evaluation results of 36 yogurt samples assessed by 10 experts on 22 sensory attributes after normalization, Figure S2: P*MSE plot of the experts as a blind reference in three ratings, Figure S3: F plot of the experts as a blind reference in three ratings, Figure S4