The Forecasting Sales Volume and Satisfaction of Organic Products through Text Mining on Web Customer Reviews

Abstract: The purpose of this study was to predict the online sales volume of organic products, identify important factors for selling organic products, and suggest web marketing strategies for organic product sales. Reviews of organic products were crawled from the Taobao platform, and sentiment analysis was used to divide them into positive and negative reviews. Latent Dirichlet Allocation (LDA) was then used to extract keywords and identify important factors for selling organic products, and an online survey with regression analysis was used to assess customers’ purchase intentions and to suggest web marketing strategies. In addition, data on each organic product’s price, current price, free delivery, sales volume, number of customer reviews, customer reviews, organic labeling, and product fans were collected from Taobao, and neural network analysis was used to predict online sales volume. This study found that packaging design, nutritional information, food quality, delivery risk, freshness, and source risk are the important online factors in the buying of organic products, and that product fans, price discount, and the number of customer reviews affected sales volume. Therefore, the promotion of online services and logistics can be used to increase the sales of organic products. This research can help promote the sale of organic products, improve consumer satisfaction, and provide consumers with safe and reliable products, and it is also significant for promoting sustainable development.


Introduction
With the continuous improvement of people's living standards, consumers increasingly value organic products, because organic products not only protect the environment but also play an important role in consumer health [1][2][3]. Organic products are safe, healthy, and nutritious, so consumers are more willing to buy them [4]. In particular, food safety issues are becoming more prominent year after year, as shown by incidents involving poisoned rice, fake eggs, and gutter oil. Although organic products are expensive, consumers seeking high-quality products are increasingly willing to purchase them. Because organic products command high prices, many non-organic products are packaged and sold as organic in various regions, making authentic organic products difficult to distinguish [5]. At the same time, with rapid economic growth, consumers' consumption capacity has improved significantly, and excessive consumption of natural resources has caused environmental problems [6]. Growers' excessive or inappropriate use of pesticides and fertilizers [7], as well as the

Table 1. Previous studies on online products.

| Study | Research Content | Variables | Methodology |
|---|---|---|---|
| [29] | Customers' food order intentions via internet or phone | Service quality, product quality, product freshness, time savings, behavioral intentions | Survey, ANOVA |
| [30] | The impact of perceived risk on online shopping | Fraud risk, delivery risk, financial risk, process and time loss risk, product risk, privacy risk, information risk | Survey, SPSS |
| [31] | Factors determining customer satisfaction with online shopping | Information, search, ordering facilities, after ordering facilities, website aesthetic and attractiveness, delivery and customer support activities, price, quality | Survey, SPSS |
| [32] | Factors affecting customers buying products online | Reputation, website design, fulfillment, reliability, customer service, security, privacy, emotion, perceived risk, purchase intention | Survey, EMS |
| [33] | Factors affecting customers' online shopping satisfaction | Consumer satisfaction, website design, security, information quality, payment method, e-service quality, product quality, product variety, delivery service, customer satisfaction | Survey, ANOVA, Regression |

In previous studies, research on online purchase satisfaction and sales volume mainly used questionnaire surveys or face-to-face interviews. However, such data are often limited and constrained by the initial setup; for example, the amount of data that can be surveyed is limited. As the number of online consumers grows, customers' perceptions of products after a purchase become more varied. A single survey or conventional face-to-face interviews cannot comprehensively reveal how different factors affect the satisfaction of customers buying organic products online. Therefore, this research not only analyzes data from online reviews but also uses a questionnaire survey to study satisfaction.
Consumers pay more attention to product quality and authenticity when purchasing products online [34]. Therefore, reviews of products purchased online are important. Reviews are written after consumers buy a product, and these evaluations of the products and services provided by the merchant can include pictures and additional comments. Online product reviews are the main source of information for customers, retailers, and manufacturers. They can explain product quality to both practitioners and researchers. In fact, customer reviews of a product represent customer satisfaction with that product. Table 2 below is a collection of previous research on online reviews.

Table 2. Previous studies on online reviews.

| Literature | Research Content | Data Sources | Methodology |
|---|---|---|---|
| [35] | Online reviews are related to product sales | Historical sales and online review data | The Bass/Norton model and sentiment analysis |
| [34] | Provide consumers with the best products online | Online consumer reviews | Sentiment mining |
| [36] | The relationship between word-of-mouth (WOM) and online reviews | Online customer reviews | Natural language processing (NLP) |
| [33] | Impact of online reviews on travel products | Online reviews for hotels | Latent Dirichlet Allocation (LDA) |

Previous studies of customers buying products online have drawn their data from online reviews. The methods used are Latent Dirichlet Allocation (LDA), sentiment analysis, natural language processing (NLP) methods, and other single analysis methods. A single method cannot fully capture the satisfaction expressed in online reviews. Multiple analysis methods need to be combined to better understand customers' online reviews and product satisfaction. Therefore, this research mainly used sentiment analysis, LDA, and regression analysis, and the results were examined with regard to satisfaction. Online reviews were collected, and then, using sentiment analysis and LDA, the following keywords were obtained: packaging, nutritional information, food quality, delivery risk, product freshness, and source risk. These keywords were then used to study the relationship with satisfaction.

Research on Forecasting Sales Volume of Online Shopping
When consumers buy products online, they focus on reviews, product price, discounts, convenience, and other aspects; because they care about these aspects, the variety of information available online is an important factor in driving consumer purchases [37]. In research on online purchases, online reviews, the discount value, the discount rate, free delivery, and the sentiments of user reviews have been found to help predict product sales [19]. Because organic products are more expensive than traditional products, consumers pay more attention to the price of online organic products, and assortment, price, and promotions are mainly used to determine the impact on retailer performance [38]. At the same time, organic products differ from traditional products: to be credible, organic produce must carry an organic mark certified by the relevant organizations. These are important factors for consumers purchasing organic products online. In this study, price, free delivery, the number of customer reviews, product fans, organic labeling, and discount value were selected to predict the sales volume of online organic products. Table 3 below is a collection of previous research on predicting product sales.

Table 3. Previous studies on predicting product sales.

| Study | Research Content | Variables | Methodology |
|---|---|---|---|
| [39] | Online reviews (e.g., valence and volume), online promotional strategies (e.g., free delivery and discounts) and sentiments from user reviews can help predict product sales. | Discount value, discount rate, free delivery, online reviews | Artificial neural network |
| [40] | The goal of this paper is to develop models for predicting the helpfulness of reviews, providing a tool that finds the most helpful reviews of a given product. | Product type, retail price, sales rank, review rating | Artificial neural network, Regression |
| [41] | Social media communication structure and predicting the product sales volume based on the literature review of the existing media theory. | Conveyance, convergence | Artificial neural network, Regression |
| [42] | This study aims to examine the roles of online reviews and reviewer characteristics in predicting product sales. | Votes of reviewer and picture of reviewer | Sentiment analysis, Artificial neural network |

Previous studies have used neural networks and regression analysis to predict sales. Neural networks generally predict better than traditional regression analysis and are suited to large-scale data with a relatively large number of input variables [9]. Linear regression is a simple tool for studying the linear relationship between independent and dependent variables, and its results provide information about the estimated parameters, whereas a neural network gives no explanation of the parameter estimates [43]. Therefore, this research combines linear regression and neural networks to predict online sales of organic products from price, free delivery, the number of customer reviews, product fans, organic labeling, and discount value.

Study 1: Effects of Online Customer Reviews on Satisfaction
Product packaging is a bridge between products and consumers. Packaging design is one of the important factors in conveying product information and beautifying products to provide consumers with valuable products [44]. Therefore, packaging design can guide consumers to make a satisfactory choice [45]. An interesting food packaging design is a design made to satisfy consumer psychology. Product packaging must be realized with visual images that appeal to consumers, and that visual design affects consumers' evaluations or behavioral intentions [46]. When the text and images of the product packaging are inconsistent with the product and do not meet consumers' expectations, it affects consumers' willingness to buy [47]. Therefore, our package design hypothesis is as follows: H1. Package design has a positive effect on satisfaction.
With the improvement of consumers' quality of life, consumers pay more attention to the nutrition of products [48]. Therefore, the nutritional information on product packaging is very important, enabling consumers to choose the products that suit them best and supporting a healthier quality of life. When consumers have relevant nutritional information, they make more informed choices, and this makes them more satisfied [31,49]. Nutrition information labeling can improve consumers' awareness of maintaining their health, which is of great significance in promoting consumer satisfaction and continued use of a product [46]. Therefore, our nutritional information hypothesis is as follows: H2. Nutritional information has a positive effect on satisfaction.
Food quality refers to the degree to which the characteristics of a food satisfy consumers. It is mainly the consumer's evaluation of the food, including its appearance, taste, and quality [50]. Food safety has become an important food quality attribute [16]. Food safety issues make it difficult for consumers to trust products and to be satisfied with them. Many studies have shown that food quality is an important factor affecting consumer satisfaction [51]. In other words, food quality is an important factor in improving sales and satisfaction, retaining customers, and ensuring a good buying experience. Therefore, our food quality hypothesis is as follows: H3. Food quality has a positive effect on satisfaction.
Delivery risk is mainly a measure of consumers' concerns about product loss, damage, and delivery to the wrong place when they purchase a product online [30]. Delivery risk mainly concerns on-time delivery (OTD) and delivery without damage [52]. Products bought online often arrive damaged or are sent to the wrong place, which causes considerable distress to consumers who buy online. Many studies have shown that delivery risk is negatively correlated with customer satisfaction [33,52]. Therefore, our delivery risk hypothesis is as follows: H4. Delivery risk has a negative effect on satisfaction.
Product freshness usually refers to the freshness of food in terms of crispness, juiciness, and aroma [53]. Freshness is an important factor in the products customers purchase because customers think fresh products are good for health [29,54]. When consumers purchase products online, their main concern is the freshness of the products [55]. Online ordering for home delivery, which shortens the delivery time and keeps products fresh compared to traditional delivery, makes consumers more satisfied [56]. Consumer perception of the quality of food freshness has a positive influence on satisfaction [57,58]. Therefore, our product freshness hypothesis is as follows: H5. Product freshness has a positive effect on satisfaction.
Source risk refers to the risk of dealing with unreliable and dubious online shops when purchasing products online [59]. When buying products online, it is important for consumers to know the reputation and evaluation of the store, whether the store's products are credible, whether they will receive the product after payment, and whether the after-sales service is complete. These are all issues that can make consumers distrust online products [60]. At the same time, purchasing products online can expose private data, such as consumers' addresses, names, and phone numbers, to unreliable stores, resulting in a higher source risk for products purchased online [18]. Therefore, our source risk hypothesis is as follows: H6. Source risk has a negative effect on satisfaction.

Study 2: Effects of Online Variables on Sales
Price is an important factor for consumers buying products online, and consumers want to buy good-quality products at a low price. Online shopping can provide more information about product categories so that consumers can get more price information and choose more suitable products. The prices of fruits and vegetables on Taobao are lower than those of meat and aquatic products [61]. Average sales are highest for fruits, followed by vegetables [62]. Price is thus an important factor affecting sales volume. Therefore: H7. The price of organic products online has a positive effect on sales volume.
Merchants often use price discounts to promote products to stimulate consumer purchases [63]. Based on the transaction posit utility theory, the price discount provided is one of the main reasons that promotes consumers to buy products [64]. The main reason is that there is a price concession during a promotion, so the purchase price is much lower than the usual purchase price and the products purchased make consumers feel the value of the product. Therefore: H8. The price discount of organic products online has a positive effect on sales volume.
Free delivery of products purchased online brings convenience to consumers and attracts them to online shopping [65]. Free delivery is in effect a promotional activity that stimulates consumption: consumers feel that their spending is lower than expected, which encourages them to buy. Previous research found that free delivery has a positive influence on attitudes toward online shopping, encourages consumers to buy, and increases online sales [11,66,67]. Therefore: H9. The free delivery of organic products online has a positive effect on sales volume.
Organic labeling refers to a mark for organic products that is traceable by customers [66]. The main difference between organic products and traditional products is whether there is an organic label [68]. Organic labels can be obtained through national agency certification. There are two forms of organic labels: one is an organic code, and the other is a QR code. These two forms indicate whether or not products are organic agricultural products. Government certification builds the trust consumers need to purchase organic products [69]. A previous study found that US consumers who purchase chicken breasts with USDA (U.S. Department of Agriculture) organic labeling or universal organic certification labeling are willing to pay more [70]. Therefore: H10. The organic labeling of organic products online has a positive effect on sales volume.
Consumers are provided with products available for purchase online. Fans can collect and follow their favorite products [71]. When there are activities related to certain products or new products are launched, they receive relevant information provided by the merchant. Fans are more loyal to product brands and promote consumption [72]. Fans' loyalty to a product generates better word-of-mouth marketing, makes more consumers understand the product, and some consumers also become fans of the product and better promote the consumption of it [73]. Therefore: H11. The product fans of organic products online have a positive effect on sales volume.
Generally, the reviews of consumers who buy products online have a significant impact on the sales volume of products, and good reviews are particularly important for sales volume [30]. Similarly, the number of reviews by consumers is also very important [74]. The more consumers buy a product, the more reviews they provide. These reviews reach more consumers, allowing more of them to learn about the product, and this increases the credibility of the product and the number of products purchased [75]. Therefore, the number of consumer reviews has a significant impact on product purchases. Thus: H12. The number of consumer reviews of organic products online has a positive effect on sales volume.
Based on the proposed hypotheses, two research models were developed in this research, as shown in Figure 1.
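As a rough sketch only, the linear part of the two research models implied by H1 to H12 can be written as the regressions below. The symbols are illustrative assumptions rather than the authors' notation: SAT is satisfaction, PD packaging design, NI nutritional information, FQ food quality, DR delivery risk, PF product freshness, SR source risk, SV sales volume, P price, D price discount, FD free delivery, OL organic labeling, F product fans, and NR number of customer reviews.

```latex
\begin{align}
% Study 1: satisfaction model (H1-H6)
\mathrm{SAT}_i &= \beta_0 + \beta_1\,\mathrm{PD}_i + \beta_2\,\mathrm{NI}_i + \beta_3\,\mathrm{FQ}_i
                + \beta_4\,\mathrm{DR}_i + \beta_5\,\mathrm{PF}_i + \beta_6\,\mathrm{SR}_i + \varepsilon_i \\
% Study 2: sales volume model (H7-H12)
\mathrm{SV}_j  &= \gamma_0 + \gamma_1\,\mathrm{P}_j + \gamma_2\,\mathrm{D}_j + \gamma_3\,\mathrm{FD}_j
                + \gamma_4\,\mathrm{OL}_j + \gamma_5\,\mathrm{F}_j + \gamma_6\,\mathrm{NR}_j + u_j
\end{align}
```

Under the hypotheses as stated, the expected signs would be $\beta_1, \beta_2, \beta_3, \beta_5 > 0$ and $\beta_4, \beta_6 < 0$ for Study 1, and $\gamma_1, \dots, \gamma_6 > 0$ for Study 2.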

Research Procedure
The research purpose was to identify important factors for online sales of organic products, to predict the online sales of organic products, and to suggest web marketing strategies for organic product sales. To achieve this purpose, this research constructed two research models and established two research processes. This research mainly collected unstructured data on Taobao organic product reviews, together with prices, product fans, price discounts, the number of customer reviews, organic labeling, and free delivery. After the Taobao reviews were crawled, sentiment analysis was first used to divide the reviews into positive and negative reviews, and Latent Dirichlet Allocation (LDA) was then used to find the topic words in the positive and negative reviews. Linear regression analysis was used to test the relationship between the topic words and satisfaction in order to identify important factors for online sales of organic products. At the same time, we used the data collected on variables such as price, product fans, price discount, the number of customer reviews, organic labeling, and free delivery to determine the relationship between each variable and sales volume, and a neural network analysis method was used to predict the online sales of organic products. The research process is shown in Figure 2 below.

Data Collection
The purpose of this research was to identify important factors for online sales of organic products, to predict the online sales of organic products, and to suggest web marketing strategies for organic product sales. This research mainly used Taobao as the platform for data collection. Taobao is one of the fastest growing online B2C markets in China and has 721 million users, and its sales have increased year by year [76]. Organic products are agricultural products that are produced and processed according to the principles, production methods, and standards of organic agriculture and certified by organic food certification agencies, for example, organic grains, organic fruits, and organic vegetables. Therefore, this study mainly collected data on organic products on Taobao, searching for related products by keywords such as organic fruits, organic vegetables, organic rice, organic red beans, organic beef, organic pork, organic egg, and other organic products. A Python-based web crawler was developed to retrieve the relevant data. Python is a high-level, interpreted, interactive, object-oriented scripting language. A web crawler simulates browser access to network resources and automatically collects the content of the accessed web pages in order to obtain the required content quickly and efficiently [77]. The period of data collection was from September 8 to September 18, 2019. Data on a total of 9040 organic products were collected; 3446 of these products had reviews, and 506,001 reviews were collected. For each product, this research mainly collected the price, current price, free delivery, sales volume, number of customer reviews, customer reviews, organic labeling, and product fans. The data collected are shown in Figure 3 below.
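To make the collected fields concrete, the following is a minimal sketch of how one crawled listing might be held in memory. The class name, field names, and sample values are hypothetical and are not the authors' schema; in particular, the definition of discount value as listed price minus current price is an assumption, since the text does not define it.

```python
from dataclasses import dataclass


@dataclass
class ProductRecord:
    """One crawled listing; field names are illustrative, not the authors' schema."""
    title: str
    price: float           # listed (original) price
    current_price: float   # price at crawl time
    free_delivery: bool
    sales_volume: int
    review_count: int
    organic_label: bool
    fans: int

    @property
    def discount_value(self) -> float:
        """Assumed definition: listed price minus current price, floored at 0."""
        return max(self.price - self.current_price, 0.0)

    def features(self) -> list:
        """Numeric predictor vector for a sales-volume model (booleans as 0/1)."""
        return [self.price, self.discount_value, float(self.free_delivery),
                float(self.organic_label), float(self.fans), float(self.review_count)]


# Made-up example values, for illustration only.
item = ProductRecord("organic rice 5kg", 80.0, 68.0, True, 1200, 350, True, 4100)
print(item.discount_value)  # 12.0
print(item.features())      # [80.0, 12.0, 1.0, 1.0, 4100.0, 350.0]
```

Keeping the derived discount as a property rather than a stored field avoids the two prices and the discount drifting out of sync if a record is re-crawled.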

Study 1: Sentiment Analysis and Latent Dirichlet Allocation (LDA) Topic Modeling Analysis Results
This research was mainly based on the sentiment analysis of reviews of organic products on Taobao, which can help consumers learn about the reputation of organic products on the platform. Usually, sentiment is classified as positive, negative, or neutral [78]. A dictionary-based sentiment analysis uses a sentiment lexicon to give each word a weight for its emotional inclination. All sentiment words are then extracted from the review, the final sentiment score is calculated taking into account the negation words and degree adverbs in the review, and the emotional polarity of the review is judged based on the sentiment score [36,79]. The dictionary includes the Boson NLP (Natural Language Processing) dictionary (containing positive and negative sentiment words), a negation dictionary, and a degree adverb dictionary. The Boson NLP sentiment dictionary is derived from social media text, so it is suitable for social media sentiment analysis. This study divided reviews into positive and negative reviews based on the weighted score: reviews with a score below 0 were classed as negative, and reviews with a score above 0 were classed as positive.
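The scoring scheme described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the three tiny lexicons are made-up placeholders rather than Boson NLP data, the reviews are assumed to be already tokenized, and the rule that a negation or degree adverb modifies the next sentiment word is one common convention assumed here.

```python
# Toy lexicons: every entry is a made-up placeholder, not Boson NLP data.
SENTIMENT = {"fresh": 2.0, "good": 1.5, "slow": -1.5, "rotten": -3.0}
NEGATIONS = {"not", "never"}
DEGREE = {"very": 2.0, "slightly": 0.5}


def sentiment_score(tokens):
    """Sum lexicon weights; preceding negations and degree adverbs modify each sentiment word."""
    score = 0.0
    sign, scale = 1.0, 1.0          # modifiers accumulated since the last sentiment word
    for tok in tokens:
        if tok in NEGATIONS:
            sign *= -1.0
        elif tok in DEGREE:
            scale *= DEGREE[tok]
        elif tok in SENTIMENT:
            score += sign * scale * SENTIMENT[tok]
            sign, scale = 1.0, 1.0  # reset for the next sentiment word
    return score


def polarity(tokens):
    """Threshold from the text: score > 0 is positive, score < 0 is negative."""
    s = sentiment_score(tokens)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"  # a score of exactly 0 falls in neither class


print(polarity(["very", "fresh"]))            # positive
print(polarity(["delivery", "not", "good"]))  # negative
```

Words absent from all three lexicons, such as "delivery" here, simply contribute nothing to the score.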
Latent Dirichlet Allocation (LDA) is the most commonly used method for topic modeling [80]. Topic modeling using LDA can discover topic words from large amounts of unstructured text data or big data [75]. In this study, LDA was mainly used to extract the keywords related to consumer satisfaction with the online purchase of organic products. The generation process of this study was as follows:
1. Read the collection of review documents and use Jieba for word segmentation.
2. Assign an ID to each word to build the corpus dictionary.
3. After the IDs are assigned, count the frequency of each word and form a sparse vector in the form "word ID: word frequency".
4. Train the LDA model from the Gensim library.
5. After the model finishes running, it outputs the probability that a review belongs to each topic, and the review's topic is judged based on that probability.
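Steps 2, 3, and 5 above can be illustrated with a small standard-library sketch. It assumes the reviews have already been segmented (step 1, for which the study used Jieba) and it does not train a model (step 4). The function names and toy data are hypothetical. In Gensim, the analogous operations are `corpora.Dictionary`, `Dictionary.doc2bow`, and `LdaModel`, although Gensim's ID assignment order may differ from the first-seen order used here.

```python
from collections import Counter


def build_dictionary(docs):
    """Step 2: assign an integer ID to each distinct word, in first-seen order."""
    word2id = {}
    for doc in docs:
        for word in doc:
            word2id.setdefault(word, len(word2id))
    return word2id


def to_bow(doc, word2id):
    """Step 3: sparse vector of (word ID, word frequency) pairs, sorted by ID."""
    counts = Counter(word2id[w] for w in doc if w in word2id)
    return sorted(counts.items())


def dominant_topic(topic_probs):
    """Step 5: pick the topic with the highest probability for one review."""
    return max(topic_probs, key=lambda pair: pair[1])[0]


# Pre-segmented toy reviews (stand-ins for Jieba output).
docs = [["packaging", "gift", "packaging"], ["delivery", "slow", "late"]]
word2id = build_dictionary(docs)
corpus = [to_bow(d, word2id) for d in docs]
print(corpus[0])  # [(0, 2), (1, 1)]

# Hypothetical per-review output of a trained model: (topic ID, probability).
print(dominant_topic([(0, 0.15), (3, 0.70), (5, 0.15)]))  # 3
```

The sparse representation stores only the words that occur in a review, which keeps memory use manageable when the vocabulary is large and each review is short.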
First, using the sentiment analysis method, the reviews crawled from Taobao were divided into positive and negative reviews. A total of 36,603 reviews were classified as negative, and a total of 431,567 reviews were classified as positive.
Second, the positive and negative reviews were analyzed using LDA topic modeling to derive the keywords. The topic words were extracted from the nouns in the reviews. The table below summarizes the themes related to consumer purchases of organic products. The LDA topic modeling analysis results of this study were as follows.
In Topic 1, words such as great, golden, color, picture, very good, appearance, bag, and gift were extracted, indicating that this topic was related to packaging design. In Topic 2, words such as quality, products, nutrition, health, first-rate, product quality, good, and type were extracted; thus Topic 2 was related to nutritional information. In Topic 3, words such as quality, beautiful, perfect, great, loyal, fans, fresh, and crisp were extracted, which means that Topic 3 was related to food quality. In Topic 4, words such as time, too slow, hour, yuan tong, consumption, postage, nonsense, and late were extracted; therefore, Topic 4 was related to delivery risk. In Topic 5, words such as organic, garbage, almost, pesticide, diarrhea, bad smell, hospital, and epidermis were extracted, so Topic 5 was related to freshness. In Topic 6, words such as evaluation, customer service, online shopping, attitude, regular customer, psychology, merchants, and cautious were extracted, indicating that Topic 6 was related to source risk.
In total, this study used LDA topic modeling to extract six keywords. The keywords for positive reviews were packaging design, nutritional information, and food quality, and the keywords for negative reviews were delivery risk, freshness, and source risk. The keywords of online organic products are shown in Table 4 below.
An online survey was conducted among 434 users who purchased organic produce online to test the relationship between the six keywords above and satisfaction.
The 24 items were rated on a 7-point Likert scale (1 "completely disagree" to 7 "completely agree"). The measurement scales were adapted from previous studies, as shown in Appendix A. The items were reviewed by Chinese and Korean experts.
We conducted an online survey of 434 Chinese users who purchased organic agricultural products online; the questionnaire was administered from October 29 to November 16, 2019. Appendix B shows the demographic information of the participants. Among them, there were 160 males (51.95%) and 274 females (48.05%). Users aged 18-40 constituted the largest group, with 184 consumers (42.63%) aged 18-30 and 101 consumers (21.89%) aged 31-40. Regarding educational level, users who were undergraduates or had a master's or higher degree were the largest group, with 227 undergraduates (52.53%) and 107 respondents with a master's or higher degree (24.65%). In terms of income, 255 (58.99%) consumers earned less than $710 and 140 (32.49%) consumers earned $710-1410; these groups contained the largest number of consumers who purchased organic produce online. In terms of occupation, 149 (34.33%) of the consumers who purchased organic produce online were career students, followed by 72 (16.59%) consumers who were full-time workers (e.g., professor, nurse). Comparing the online and offline purchase of organic products, consumers were more willing to buy organic products online. Online, 279 (64.52%) consumers bought organic products once a month and 6 (1.38%) consumers bought organic products 11 or more times; offline, 154 (35.71%) consumers purchased organic products once a month and 44 (10.37%) consumers purchased organic products 11 or more times. The types of organic produce that were often bought were organic vegetables, 253 (58.29%); organic fruits, 345 (79.49%); and organic foods, 178 (41.01%). Consumers purchased organic products mainly for health reasons: 157 (55.67%).
This research used the data from the online questionnaire to test validity and to test the hypotheses on the relationship between the variables and satisfaction, and the results are discussed below. First, in the factor analysis, the factor loading of each item on its principal component was greater than 0.5, indicating that the items loaded well on the corresponding dimensions. The construct reliability (CR) and average variance extracted (AVE) were calculated based on the loading values. The results showed that the construct reliability of each variable was between 0.836 and 0.917, all greater than the standard of 0.6, and the average variance extracted was between 0.562 and 0.759, all greater than the 0.5 standard. The alpha coefficient is usually used to measure the reliability of a questionnaire: the larger the alpha coefficient, the higher the reliability and stability of the questionnaire. Generally, the alpha coefficient should be higher than 0.5, and the analysis results were all higher than 0.8, which showed that the data had good reliability and that this study passed the reliability test. The results are shown in Table 5.

Table 6 describes the regression analysis. Each independent variable had a corresponding regression coefficient and a significance test. β represents the standardized regression coefficient, which indicates the strength and direction of the association between each predictor and the dependent variable. The results showed that packaging design, nutritional information, and food quality had a positive correlation with satisfaction. Delivery risk, freshness, and source risk had a negative correlation with satisfaction. Therefore, H1, H2, H3, and H4 were supported. However, H5 was rejected. Standardization puts each independent variable and the dependent variable on a common scale, which makes the results more accurate and reduces errors due to different units. The t-value is the result of a t-test of the regression coefficients. The larger its absolute value, the smaller the sig, where sig represents the significance of the t-test. Statistically, a sig less than 0.05 is generally considered significant for the coefficient test and shows that the independent variable can effectively predict the variation of the dependent variable. Our results were as follows: packaging design (β = 0.245, sig = 0.000), nutritional information (β = 0.240, sig = 0.000), food quality (β = 0.199, sig = 0.000), delivery risk (β = −0.104, sig = 0.009), freshness (β = −0.107, sig = 0.008), and source risk (β = −0.137, sig = 0.001). The six independent variables had significant standardized regression coefficients for user satisfaction.
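The reliability statistics named above can be computed from standardized factor loadings and item scores. The sketch below uses the usual textbook formulas for CR, AVE, and Cronbach's alpha; the loadings and item scores are hypothetical and are not taken from Table 5.

```python
from statistics import variance

def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of (1 - loading^2))."""
    s = sum(loadings)
    error = sum(1 - l * l for l in loadings)
    return s * s / (s * s + error)

def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)

def cronbach_alpha(items):
    """items: one list of respondent scores per questionnaire item (equal lengths)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # total score per respondent
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))

# Hypothetical standardized loadings for one three-item construct
loadings = [0.8, 0.8, 0.8]
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)

# Hypothetical 7-point Likert responses: three items, five respondents
items = [[5, 6, 7, 4, 6], [5, 7, 6, 4, 6], [4, 6, 7, 5, 6]]
alpha = cronbach_alpha(items)
print(round(cr, 3), round(ave, 3), round(alpha, 3))
```

A construct would meet the thresholds quoted in the text when CR exceeds 0.6 and AVE exceeds 0.5.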

Study 2: Online Variables and Sales Volume Linear Regression
The main purpose of this part was to examine how six crawled variables (price, product fans, price discount, number of customer reviews, organic labeling, and free delivery) affect the sales volume of organic agricultural products. Regression analysis was applied to the crawled data. β represents the standardized regression coefficient. The results showed that product fans, the number of reviews, and price discount had a positive correlation with the sales volume. The t-value is the result of a t-test of the regression coefficients. The larger its absolute value, the smaller the sig, where sig represents the significance of the t-test. Statistically, a sig less than 0.05 is generally considered significant for the coefficient test and shows that the independent variable can effectively predict the variation of the dependent variable. The results of the regression model showed that the significance values of product fans, price discount, and number of customer reviews were all below 0.005, so all three variables passed the significance test. The results are shown in Table 7. The relationship between each variable and the sales volume in the regression analysis is shown in Figure 4 below. The path coefficient for H8 was positive and significant (2.868, p < 0.01). Thus, H8 was supported, indicating that the price discount has a positive impact on the sales volume. The hypothesis for the relationship between product fans and sales volume (H11) was also supported, with a path coefficient of 14.174 (p < 0.01). The hypothesis regarding the number of reviews (H12) was also supported, with a significant path coefficient of 36.283. Thus, the regression analysis showed that the three variables of product fans, number of reviews, and price discount had a positive impact on the sales volume. However, hypotheses H7, H9, and H10 were not supported: price, free delivery, and organic labeling did not significantly affect the sales volume.
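The quantities reported for the regression (coefficients and their t-values) follow from ordinary least squares. The sketch below is a from-scratch illustration on hypothetical toy data, not the statistical software or the data used in the study; z-scoring the variables first makes the slopes standardized coefficients.

```python
import math
from statistics import mean, stdev

def invert(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    a = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [v - f * c for v, c in zip(a[r], a[col])]
    return [row[n:] for row in a]

def ols(X, y):
    """Return (coefficients, t_values); an intercept is added as the first coefficient."""
    n = len(y)
    Xd = [[1.0] + list(row) for row in X]
    p = len(Xd[0])
    xtx = [[sum(r[i] * r[j] for r in Xd) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(Xd, y)) for i in range(p)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(Xd, y)]
    sigma2 = sum(e * e for e in resid) / (n - p)        # residual variance
    se = [math.sqrt(sigma2 * inv[i][i]) for i in range(p)]
    return beta, [b / s for b, s in zip(beta, se)]

def zscore(values):
    """Standardize a sequence to mean 0 and sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical toy data: columns stand for product fans and number of reviews
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
y = [5.1, 4.9, 11.2, 10.8, 17.1, 16.9]
# z-score every column and y so that the slopes are standardized coefficients
Xz = list(zip(*[zscore(col) for col in zip(*X)]))
beta, t_values = ols(Xz, zscore(y))
print([round(b, 3) for b in beta[1:]], [round(t, 2) for t in t_values[1:]])
```

The significance (sig) of each coefficient is then obtained by comparing its t-value with a t distribution with n − p degrees of freedom, which is omitted here to keep the sketch dependency-free.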
Recent neural network research has mainly focused on prediction for complex problems and is therefore suited to research with a large amount of data [50]. Neural networks are one of the research methods of machine learning. This research mainly used the BP (back propagation) neural network to predict the sales volume of organic products on Taobao. The network was composed of interconnected nodes in three hierarchical layers (input, hidden, and output). The process of the BP neural network was divided into two stages. The first stage was the forward propagation of the signal, from the input layer through the hidden layer to the output layer; the second stage was the backpropagation of the error, from the output layer through the hidden layer to the input layer. The BP model was trained with the Keras neural network framework to predict the sales volume of organic products on Taobao.
Artificial neural network analysis and modeling, one of the representative methods of predictive analysis, was used to check whether the three indicators obtained through regression analysis can predict sales volume. There were three layers: the input layer, the hidden layer, and the output layer. The input layer consisted of the prices, discounted prices, free delivery, organic labeling, and the number of customer reviews, and the output layer was the sales volume. The hidden layer was set to 2. The crawled data set was divided into a training set and a test set, with "training" set to 50% and "testing" set to 50%. On the training set, the inputs of prices, product fans, price discount, number of customer reviews, organic labeling, and free delivery were used for modeling to obtain the output index of the sales volume. After the model was obtained, the test set was used with the same input dimensions to get the output index of the sales volume. To confirm the predictive power of the three statistically significant indicators, a total of 6 artificial neural network models were implemented, where lower loss and lower RMSE (root mean square error) indicate a better model. The artificial neural network analysis concluded that the three variables had an impact on the sales volume. The results are shown in Table 8 below. The artificial neural network model is described in detail in Appendix C.
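The two BP stages (forward propagation of the signal, backpropagation of the error), the 50/50 split, and the RMSE criterion can be sketched in plain Python. This is a from-scratch stand-in rather than the Keras model of the study: the synthetic data, the learning rate, the number of epochs, and the reading of "the hidden layer was set to 2" as two hidden units are all assumptions.

```python
import math
import random

def train_bp(X, y, hidden=2, lr=0.05, epochs=500, seed=0):
    """Train a one-hidden-layer network (tanh hidden units, linear output) by backpropagation."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            # Stage 1: forward propagation, input -> hidden -> output
            h = [math.tanh(sum(w * v for w, v in zip(W1[j], x)) + b1[j]) for j in range(hidden)]
            out = sum(w * hj for w, hj in zip(W2, h)) + b2
            # Stage 2: backpropagation of the error, output -> hidden -> input weights
            err = out - target                     # gradient of 0.5 * err^2 w.r.t. out
            for j in range(hidden):
                grad_h = err * W2[j] * (1 - h[j] ** 2)   # uses W2[j] before its update
                W2[j] -= lr * err * h[j]
                for i in range(n_in):
                    W1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err

    def predict(x):
        h = [math.tanh(sum(w * v for w, v in zip(W1[j], x)) + b1[j]) for j in range(hidden)]
        return sum(w * hj for w, hj in zip(W2, h)) + b2

    return predict

def rmse(pred, actual):
    """Root mean square error between predictions and observed values."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

# Synthetic stand-in data: three inputs scaled to [0, 1] and a made-up sales target
rng = random.Random(1)
rows = [[rng.random() for _ in range(3)] for _ in range(40)]
sales = [0.5 * r[0] + 0.3 * r[1] - 0.2 * r[2] for r in rows]

# 50% training, 50% testing
order = list(range(len(rows)))
rng.shuffle(order)
half = len(order) // 2
train_idx, test_idx = order[:half], order[half:]

model = train_bp([rows[i] for i in train_idx], [sales[i] for i in train_idx])
test_rmse = rmse([model(rows[i]) for i in test_idx], [sales[i] for i in test_idx])
print(f"test RMSE: {test_rmse:.4f}")
```

Comparing models built on different input subsets by their test RMSE, with lower values being better, mirrors the model comparison described above.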

Discussion
First, in this study, we used topic modeling to study the important factors for online consumers purchasing organic agricultural products. The analysis determined that packaging design, nutritional information, food quality, delivery risk, freshness, and source risk are important factors in buying organic products online. Regression analysis was then used to determine the relationship between these six factors and satisfaction. Packaging design, nutritional information, and food quality have a positive effect on satisfaction, whereas delivery risk, freshness, and source risk have a negative effect. Previous studies have shown that packaging design, nutritional information, food quality, and freshness all have a positive effect on satisfaction [56,79-85]. However, freshness has a negative impact on satisfaction in this study. The main reason is that, although online ordering for home delivery can provide consumers with fresher products than traditional express delivery [56], products purchased on Taobao usually take 3-4 days to arrive. Therefore, the freshness period of organic products purchased on Taobao affects consumers' purchasing of organic products online. To address this, a combined online and offline platform can be established so that online orders are delivered to the home in a timely manner, which maintains the freshness of organic products and makes it convenient for consumers to purchase them online. Providing fresh products is especially important for consumer satisfaction with online purchases of organic produce.
At the same time, maintaining online platforms that offer better packaging designs and clear nutritional information about organic agricultural products, while providing better products and ensuring product quality, is an important marketing strategy for encouraging consumers to purchase organic products online. While buying organic produce online has brought many conveniences to consumers, there are also many inconveniences. Previous research has shown that the delivery risk and source risk have a negative impact on satisfaction [11,18,33]. The relationship between the delivery risk, source risk, and satisfaction in this study is consistent with the results obtained in previous studies. Buying products online may expose consumer data to unreliable merchants, and online shopping also raises the question of whether the merchants can send products to consumers safely. These are all unfavorable factors for purchasing products online. Merchants should improve the quality of service and establish a good and credible online shop, providing consumers with a satisfactory and credible platform for purchasing products.
Second, this research mainly used prices, product fans, price discount, the number of customer reviews, organic labeling, and free delivery to predict the sales volume of online organic agricultural products. The results show that product fans, price discount, and the number of customer reviews affected the sales volume. Previous studies have shown that product fans, price discounts, and the number of customer reviews promote product sales [65,73,74]. The relationship between product fans, price discount, the number of customer reviews, and the sales volume in this study is consistent with the results obtained in previous studies. At the same time, the sales of organic products purchased online are not affected by a single factor alone; various factors jointly promote them. The more products are discounted, the more consumers will buy and the more consumers will write reviews for the product; this in turn attracts more consumers to become product fans, and together these effects increase product sales. Previous research has shown that prices have an effect on the sales volume, with higher prices lowering the sales volume [61]. The results of this study show that price is not a very important factor in forecasting the sales volume. Generally, the price of organic agricultural products is higher than the price of traditional products, and consumers who purchase organic agricultural products pay less attention to the price of the product. At the same time, previous research has shown that free delivery has a positive impact on sales [11,66,67]. However, the results of this study show that free delivery is not an important factor in predicting sales, because the purchasers of organic products are more concerned about health, product quality, and the environment [3,4,55].
When a consumer buys a product online, some products come with free delivery, some offer free delivery above a certain price, and some do not offer free delivery. The collected numbers of online reviews and monthly sales show that many consumers are willing to buy organic products online without free delivery, so free delivery is not an important factor in predicting sales volume. At the same time, the analysis results show that organic codes are not an important factor in predicting sales. However, previous research has shown that organic codes have a positive impact on sales [70], because consumers are more willing to pay for organic products with organic codes [69]. When selling organic products online, merchants provide consumers with the trademarks of organic agricultural products and the certifications of national authoritative organizations so that consumers can trust the products and buy them with confidence. Therefore, in this research, organic codes were not an important factor in predicting the sales of organic products bought online. The research shows that, when promoting the sale of organic products online, focusing on discount promotions can attract many consumers to buy online, encourage spending, generate more reviews for the product, and attract more consumers to become fans of the product.

Conclusions and Implications
This research mainly aimed to determine the factors influencing customer satisfaction when purchasing organic products online, predict the sales volume of organic products, and suggest web marketing strategies for organic product sales. This study first identified the factors influencing satisfaction with the online purchase of organic products. The keywords obtained from online reviews using sentiment analysis and LDA analysis were packaging design, nutritional information, food quality, delivery risk, product freshness, and source risk. A questionnaire was designed from these topic words, data were collected online, and regression analysis was applied. The results showed that packaging design, nutritional information, and food quality had a positive effect on satisfaction, and delivery risk, product freshness, and source risk had a negative effect on satisfaction. The second aim was to predict the factors that affect the sales volume of organic products.
Through regression analysis and neural network analysis, the results show that price discounts, product fans, and the number of customer reviews have an impact on the sales volume of online organic products. The results of this study mean that consumers are satisfied with online organic products mainly when they are provided with good packaging design, clear nutrition information, good product quality, and fresh products, and when their various inconveniences are reduced, for example by establishing a trusted consumer platform and after-sales service and by providing better home delivery services. Consumers who are more satisfied with online organic products will give good reviews. At the same time, giving customers more discounted promotions attracts consumers to buy organic products online, gains more product reviews, attracts more customers to become fans of online organic products, and can increase the sales of online organic products. These are important factors that satisfy consumers and increase merchant profits. Providing consumers with safe and reliable organic products online satisfies consumers and also promotes more profits for merchants and farmers. At the same time, providing these products is of great significance for improving the ecological environment and promoting sustainable development.
The results of this study have the following theoretical implications. First, this research collected a large amount of online data: not only reviews but also the data used to forecast sales volume. Although the data were extracted only from the Taobao platform, big data analysis was used to study consumer satisfaction and to forecast the sales volume of organic produce. Large data samples can predict consumer demand for organic products online more accurately and help to satisfy consumers better. Second, this research used four analysis methods: sentiment analysis, LDA, neural network analysis, and regression analysis. The four methods were not applied simultaneously; each was used on its own at a separate stage of the study. In this way, the research explored online products using data mining, machine learning, and regression analysis. Third, this research used not only online survey methods but also text mining methods to study consumer satisfaction with online organic products. Most research that does not collect data from online reviews relies mainly on online survey methods. By using the two methods together, this research could study consumers' satisfaction in purchasing organic products online better.
This study also has important implications for practice. First, it is necessary to provide consumers with high-quality, safe, cheap, and convenient online organic products. When consumers purchase organic products online, they can get exquisite packaging design, high product quality, and detailed product nutrition information. At the same time, the products have many discounts. These are important factors in attracting consumers to buy online organic products. By attracting more consumers to become fans, products get more good reviews, consumers are satisfied with the purchase of organic products online, and sales of online organic products increase. Second, merchants should establish a secure and trusted online shopping platform. Merchants can protect consumer information and prevent leakage by improving service quality, including logistics, services, and reputation. For example, so that consumers receive online products in a timely manner and the freshness of organic agricultural products is maintained, online and offline platforms can be established through which products are sent to consumers quickly and any problems with the product are communicated and resolved in a timely manner. This provides convenient services to consumers, meets consumer demand for online purchase of organic products, and increases sales of online organic products. Third, organic agriculture can improve the environment and the quality of life of farmers. It can promote sustainable development by promoting consumers' consumption habits of organic products and ensure that consumers are provided with safe and high-quality products to protect consumers' health. At the same time, improving the profits of online merchants and farmers is of great significance to improving the quality of life of farmers.
The results of this study should be interpreted in the context of its limitations. First, this research mainly crawled data on organic agricultural products on Taobao in China. Future research can crawl relevant data on organic products from online platforms in the United States, South Korea, and other regions to expand the results of the research, using three countries for comparison. Second, this study used sentiment analysis only with positive and negative reviews, without considering neutral reviews. Future research can consider using fine-grained sentiment analysis. Third, the online sales of organic produce are forecasted using only prices, discounted prices, free delivery, organic labeling, and the number of customer reviews. Future research could look for more factors to predict the sales of organic products online. Fourth, fake reviews are not considered in the scraped review data. Many merchants attract consumers by making fake reviews. The data obtained may have errors. Therefore, in future research, excluding fake reviews could be considered in order to get more accurate consumer reviews for research.

Acknowledgments:
We are indebted to the anonymous reviewers and editor.

Conflicts of Interest:
The authors declare no conflict of interest.

Table A1. Measurement items.

| Construct | Item | Question | Reference |
|---|---|---|---|
| Packaging Design | PD1 | Packaging color composition on organic products packaging draws attention. | |
| | PD3 | Organic products packaging visual design is aesthetic and unique to draw attention. | |
| | PD4 | Packaging material of organic products reflects good quality. | |
| Nutritional Information | NI1 | Nutritional information is easy to understand. | [80] |
| | NI2 | Nutritional information is useful and important, from the point of view of nutrition. | |
| | NI3 | Nutritional information influences more deliberate and reasonable choices. | |
| | NI4 | Nutritional information should be available in online shopping. | |
| Food Quality | FQ1 | I shop online because organic products are superior to that sold in offline stores. | [84,86] |
| | FQ2 | I feel the quality of organic products online is better than offline. | |
| | FQ3 | I feel the organic products purchased online are healthier than offline. | |
| Delivery Risk | DR1 | The delivered organic products could be lost. | [71] |
| | DR2 | Delivered the organic products to a wrong place. | |
| | DR3 | The organic products are damaged during delivery. | |
| Freshness | FN1 | The freshness of organic products purchased online is more fresh than offline. | [30] |
| | FN3 | The quality of organic products purchased online is more fresh than offline. | |
| | FN2 | The quality of fresh organic products purchased online is better than offline. | |
| Source Risk | SR1 | Online information about organic products is not true. | [71] |
| | SR2 | It is difficult to get support when organic products fail. | |
| | SR3 | I cannot find the place to settle disputes. | |
| | SR4 | Providers fail to keep the promise of post-purchase services. | |
| Satisfaction | SF1 | I am very happy to buy organic products online. | [56] |
| | SF2 | Overall, I am satisfied with the purchase of organic products online. | |
| | SF3 | Overall, buying organic products online comes up to my expectations. | |