Applying Sentiment Product Reviews and Visualization for BI Systems in Vietnamese E-Commerce Website: Focusing on Vietnamese Context

Product reviews are increasingly important in customers' buying decisions. Exploiting and analyzing the sentiment of customer product reviews has also become an advantage for businesses and researchers on e-commerce platforms. This study proposes a sentiment evaluation model for customer reviews that extracts objects and emotional words for emotion-level analysis using machine learning algorithms. The research object is the Vietnamese language, which has special semantic structures and characteristics. In this research model, emotional dictionaries and sets of extraction rules are combined to build a training data set based on the semantic dependency relationships between words in sentences in the given Vietnamese context. The sentiment analysis problem is solved with a recurrent neural network (RNN) model, specifically the long short-term memory (LSTM) neural network. This analysis model combines vector representations of words with a continuous bag-of-words (CBOW) architecture. Our system is designed to crawl real data from an e-commerce website and aggregate them automatically. These data are stored in MongoDB before being processed and fed into our model on the server. The system can then exploit the features in product reviews and classify customer reviews. These features are extracted from different feedback on each shopping step and depend on the kind of product. Finally, a web app connects to the server and visualizes all the research results. Based on these results, enterprises can follow up with their customers in real time and receive recommendations that help them understand their customers. From there, they can improve their services and provide sustainable consumer service.
Author Contributions: Conceptualization, N.-B.-V.L.; data curation, N.-B.-V.L. and J.-H.H.; and J.-H.H.; funding acquisition, J.-H.H.; investigation, N.-B.-V.L.; project administration, resources, and


Introduction
In recent years, the development of information technology (IT) and internet applications has allowed users worldwide to create a massive amount of information. Content is created daily on various online platforms, such as status updates in social media streams and videos or pictures on e-commerce websites and applications. Social media, e-commerce websites, mobile platforms and applications allow people to share information and express their attitudes and perspectives on products, services and other social issues. Product reviews have become more important in the buying decision-making process of customers [1]. Reviewing customers' comments is a good way to develop a business: it reveals the strengths and weaknesses of a company's goods and services, which helps the company capture customers' demands and offer them the best choices. Exploiting and analyzing customer product reviews has also become an advantage for businesses in many sectors, especially e-commerce platforms.
In the past, humans were commonly in charge of handling customer opinions. This way of processing led to unnecessary errors and increased costs for businesses and organizations. Natural language processing and language mining techniques have been presented to help solve this problem, for example in Refs. [2][3][4]. In the field of e-commerce, classification and sentiment analysis are often researched on the Amazon platform, which is the largest online supplier in the world, with a wide range of products and numerous reviews. For example, a model that extracts features from user opinion information and evaluates customer opinions in iPhone 5 reviews at three basic levels (good, average and bad) was built with data crawled from Amazon [2]. A continuous naïve Bayes learning framework was used to classify the sentiment of product reviews in Ref. [3]. Currently, emotion analysis usually relies on three main methods: methods based on a dictionary of emotional words; deep learning neural network methods; and methods that combine rule-based and corpus-based approaches. So far, the problem of analyzing user emotions based on reviews has been studied mainly for the English language, with many different technologies and models applied. However, the analysis of reviews, especially Vietnamese product reviews, has not yet been widely applied. Our problem is to detect and extract the sentences that discuss each evaluation aspect in the text while predicting the user's rating and the importance of each aspect. This is a form of opinion analysis and a language processing problem in Vietnamese.
In this research, to address the problem, our system is designed to crawl real data from an e-commerce website and aggregate them automatically. These data are stored in MongoDB before being processed and fed into our model on the server. The system then exploits all the product features that customers were interested in, as expressed in reviews in the Vietnamese e-commerce market, together with different feedback on shopping processes depending on the kind of product. Finally, a web app is built and connected to the server to visualize all the research results, so that small and medium enterprises can follow up on their products and their customers in real time and receive recommendations that help them understand their customers. From there, they can improve their services and provide sustainable consumer service. To address the problem of sentiment analysis, this research uses machine learning algorithms to propose a sentiment evaluation model for customer reviews that extracts objects and sentiment words. By combining our method with the sentiment dictionary and the extraction rule set, a training data set is constructed based on the semantic dependency in the Vietnamese context. The sentiment analysis problem is solved by a recurrent neural network (RNN) model, specifically its commonly used variant, the long short-term memory (LSTM) neural network, which is combined with vector representations of words from a continuous bag-of-words (CBOW) architecture.
The rest of this paper is organized as follows: Chapter 2 (Research Background) provides a brief overview of e-commerce development in Vietnam, as well as the chosen research objectives; Chapter 3 (Literature Review) presents previous research and works related to sentiment reviews; Chapter 4 (Design and Implementation) discusses the methodologies for collecting reviews, the research design and the process of analyzing and classifying sentiment reviews; Chapter 5 (Result and Discussion) presents the results and discussions; Chapter 6, the last chapter, presents the conclusions of this research and future work aiming to improve the results.

Research Background
The e-commerce market is showing strong growth in Vietnam, with a high internet accessing rate (ranked 17th in the world) and rising smartphone penetration rates (Ecommerce Industry in Vietnam, 2018). According to the Vietnam E-commerce Association (VECOM) in the Vietnam Online Business Forum 2019 (VOBF), the retail market started out at 4 billion USD in 2013 and reached 7.8 billion USD in 2018. If the growth rate in 2019 and 2020 continues to be maintained at 30%, by 2020, the scale of the retail e-commerce market will reach about 13 billion USD, which is higher than expected in the overall plan for e-commerce trade development for 2016-2020 [5]. In addition to transacting via traditional e-commerce platforms, consumers are gradually shifting to online transactions via social network sites, with a tendency to use multi-platform channels because of intelligent searching or convenient shopping experiences. Vietnamese consumers now contribute to the greater reach of the internet trend by using multi-platform channels [6].
The trend of the digital economy is changing the shopping habits of customers in many industries. The e-commerce market is increasingly open, with many models and participants, and supply chains are gradually changing in a more modern direction with the support of digitalization and information technology. E-commerce trends and consumer behavior around the world have changed rapidly. Lockdown, self-isolation and social distancing restrictions have been driving factors behind the increase in households purchasing luxury products through e-commerce [7]. The effects of COVID-19 on five global e-commerce companies were studied in Ref. [8]. The results indicated that new coronavirus cases, current coronavirus cases and cumulative deaths affected almost all of the companies studied. A study on the impact of COVID-19 on the e-commerce market in Malaysia also shows that shipment processes were affected, resulting in a shortage of some products and kits manufactured in China [9]. In particular, due to the impact of the COVID-19 pandemic, the e-commerce market is becoming more vibrant as consumers switch from traditional buying habits to e-commerce purchases. Consumers' purchasing decisions have changed significantly during this period. Consumers' awareness and experiences have become important influencing factors when making decisions [10]. Seven factors were found to affect consumers' online shopping behavior: ease of use, perceived risk, perceived usefulness, website design, economic factors, availability of products and customer satisfaction [11]. The relationship of normative determinants and hedonic motivation to German consumers' online purchasing intentions was also investigated [12]. In particular, the influence of social distancing has made the hedonic motivation factor of Gen Z even higher.
Another piece of evidence comes from research in Bangladesh, which has been significantly impacted by COVID-19, resulting in changes in online shopping such as products, time-saving, payment and administrative factors [13]. In Ref. [14], five factors influencing the online shopping intentions of consumers in Ho Chi Minh City during the COVID-19 period were arranged in order of importance from high to low: perceived usefulness; reference group; safety and confidentiality; reputation. Nowadays, customers have become increasingly meticulous when shopping. Notably, customers will also measure the value they receive when they pay for a product, not only the price. This means they will consider the quality, features and also fashion. When people become busier, reviews from other customers can convince them and also can affect purchasing decisions. Almost 90% of consumers read reviews before visiting a business, 31% are likely to spend 31% more on a business with excellent reviews and 72% will take action only after reading a positive review [15]. With the development of multi-platform channels in e-commerce, primarily via mobile applications and e-commerce websites through the internet [16], customers are increasingly able to conveniently and freely comment on products after they purchase them or share their shopping experiences.
With so many customer comments available, effectively extracting insights from them plays an important role. Collecting and analyzing customer comments can lead to timely changes in product quality, logistics and even the customer relationship management (CRM) system [17], improving the efficiency of each business and helping it attract customers and earn revenue more effectively. Currently, many research studies worldwide analyze the sentiment of customer comments in many different languages, as in Refs. [18][19][20][21]. Other researchers have conducted studies on the analysis and classification of Vietnamese sentences, as in Refs. [20,21], or on specific service products, such as hotel services, as in Ref. [22]. However, there has not been much research on the sentiment of customer reviews in the context of Vietnamese e-commerce transactions, despite the volume of transactions and the wealth of information in the current e-commerce environment.
In this research, a process is established to collect Vietnamese customer reviews on e-commerce websites and to classify their opinion polarity and features. In particular, the product review data used for building the training data sets and for analysis were collected from Tiki, a famous e-commerce website in Vietnam. Tiki is an all-in-one commercial ecosystem consisting of member companies, such as TikiNOW Smart Logistics, which provides end-to-end logistics services; Ticketbox, which offers top-notch event and movie ticket services; and Retailer Tiki Trading and Exchange, which offers 10 million products in 26 categories and serves millions of customers nationwide. Tiki's unique features include 10%+ discounts on 90% of its products, 24 h or 2 h delivery, a customer hotline offering support 12 h a day, 7 days a week, and return and exchange policies [23]. Tiki had 19.74 million visits in December 2017. Given the diversity of its goods and the trust of Vietnamese customers, Tiki's product review data are a reliable source that reflects customers' thoughts and can be exploited from many research angles.

Mining Product Reviews and Sentiment Analysis
Several works have been conducted in English. In Ref. [24], opinion mining was applied to Vietnamese and American online reviews to reveal hotel service preferences. The study collected 37,712 online reviews of Vietnamese hotels written by domestic and foreign customers, in both English and Vietnamese, on the hotel booking and review channel Agoda.com (accessed on 17 April 2020). The study was conducted in five steps: data collection, data preprocessing, lexicon-based model design, machine learning model design and hybrid model design. Lexicon-based models were used to classify opinions based on lexicon resources. As a result, the authors presented a summary of Vietnamese hotel aspects as reviewed by Vietnamese customers and American customers. Opinions differed between domestic customers and American customers because of the cultural differences between the two nations. In this study, the authors also introduced a hybrid model combining lexicon-based and machine learning approaches. The proposed model can be applied to English and non-English languages. This model can be modified with specific characteristics and used for review analytics in the present study.
In Vietnamese, a process using a machine learning model was proposed to classify emotion into two polarities: positive and negative [25]. Hotel services and experimental data from Vietnamese reviews collected from booking.com (accessed on 21 April 2020) were used as a case study. The techniques used were overfitting training, 5-FoldCV, confusion matrix and ensemble methods. Because the author trained the model only on data from the hotel services context, this training data set is effective only in this case study and has not yet been widely used in other contexts. Vietnamese sentiment classification on one public data set was studied in Ref. [26]. The authors built five ensemble schemes and evaluated feature importance methods with five techniques. The results showed that ensemble methods could perform better than the others. In another study, the authors used a Vietnamese lexicon to build a sentiment dictionary with a scoring method. The domain of that work is electronic products from e-commerce websites. This dictionary can extract nouns, adjectives and adverbs, which makes it highly applicable in social analysis. In our research, we also expect to build a dictionary for standardizing acronyms and some punctuation in the Vietnamese context.
For other languages such as Chinese, textual factors in online product reviews were identified through both qualitative and quantitative methods in Ref. [27]. The authors interviewed and surveyed 136 participants and used "Mobile phones" as the sample product. The study also followed the decision-making process and examined the emotion of online product reviews at each step of this process. From this study, the researchers identified six measures that influence perception: accuracy, comparison to other products, information on customer support, overall star rating, technical information and persuasive words. The study also found the presence of rude and racist words, ambivalence/neutrality, spelling and grammar mistakes, brevity and other reviewers' ratings and comments to be less influential. However, this study, "Textual factors in online product reviews: a foundation for a more influential approach to opinion mining", was conducted only in a specific region (China) and with one defined product, "Mobile phones", so its findings cannot yet be assumed to hold in other contexts, especially in younger e-commerce markets such as Vietnam. Moreover, other factors on e-commerce websites, such as products, product details, sales and promotions, may influence consumer buying decisions alongside online reviews. Our study investigates and presents this idea.
In the field of e-commerce, classification and sentiment analysis are often researched on the Amazon platform because it is the largest online supplier in the world, with a wide range of products and a large number of product reviews. In Ref. [2], the author presented a model that extracts features from user opinion information and evaluates customer opinions in iPhone 5 reviews at three basic levels: good, average and bad. A continuous naïve Bayes learning framework for classifying the sentiment of product reviews was presented in Ref. [3]. There are other research studies on mining and sentiment analysis, such as Refs. [28][29][30]. In these studies, aspect extraction is essential and can affect the result of sentiment classification.

Overview of Technologies Used in Supporting Sentiment Reviews
Natural language processing (NLP) is a collection of theory-motivated computational techniques for the automatic analysis and representation of human languages [31]. NLP plays an important role and can be applied in many areas, such as machine translation, which focuses on automatically translating from one language to another; information retrieval (IR), which addresses the problem of providing the information most relevant to a user's input question; information extraction (IE), which extracts information automatically; question answering; text summarization; topic modeling; and, more recently, opinion mining [32]. As the primary task in NLP, tokenization breaks up a string of words into semantically functional units called tokens and works by defining boundaries and criteria for where each token begins and ends [33]. Tokenization can be used both to split a text into sentences and to split a sentence into words [31]. Removing stop words is an essential step in our text processing; it involves filtering out high-frequency words that add little or no semantic value to a sentence. In this research, NLP techniques are used to remove stop words, tokenize and embed the entities of a user's statement.
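The tokenization and stop word removal steps described above can be illustrated with a minimal Python sketch. The stop word list, the regular expression and the sample sentence are illustrative assumptions rather than the resources used in this study, and multi-syllable Vietnamese words are assumed to have already been joined with underscores by a word segmenter.

```python
import re
import unicodedata

# Hypothetical stop word list, for illustration only.
STOP_WORDS = {unicodedata.normalize("NFC", w) for w in ("là", "và", "thì", "của", "rất")}

def tokenize(text):
    """Normalize to NFC, lowercase, and split into word tokens (punctuation is dropped)."""
    text = unicodedata.normalize("NFC", text).lower()
    return re.findall(r"\w+", text)

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Keep only the tokens that are not in the stop word list."""
    return [tok for tok in tokens if tok not in stop_words]

review = "Sản_phẩm rất tốt, giao hàng nhanh!"
tokens = tokenize(review)             # six word tokens, punctuation removed
filtered = remove_stop_words(tokens)  # the same tokens without the stop word "rất"
```

Normalizing to NFC first keeps accented Vietnamese characters as single code points, so that the `\w+` pattern does not split a word at a combining diacritic.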
As a subtopic of NLP in artificial intelligence, natural language understanding (NLU) is understood as machine reading comprehension and considered an AI-hard problem [31]. Most NLU systems share some standard components, which include a lexicon of the language and a parser, as well as grammar rules to break sentences into an internal representation [32], and the construction of a rich lexicon with suitable ontology [33]. Generally, NLU is a field of NLP that deals with the conversion of natural language into a semantic representation that the computer can interpret [32,33].
To effectively support and implement NLP and NLU, recurrent neural networks (RNNs) are often used in many systems and models [34,35]. The main idea of RNNs is to use a series of information to theoretically perform the same task for all sequence elements with the output dependent on previous calculations [36]. The neural network consists of three key sections: input layer, hidden layer and output layer [35].
A recurrent neural network (RNN) can be unrolled into a flat form with $X_t$, $X_{t+1}$ and $X_{t+2}$ as the input data at times $t$, $t+1$ and $t+2$ (here, $t$, $t+1$, $t+2$, etc., are called the timesteps). An RNN uses only a single neural network (usually a single layer) to calculate the output value of each timestep. Therefore, when outputs become inputs, they are multiplied by the same weight matrix (here, $W_R$). Figure 1 shows a simple recurrent neural network [30]. Figure 2 presents the unfolding process of a recurrent neural network. First, $X_t$ is taken from the input sequence and $Y_t$ is output; $Y_t$, together with $X_{t+1}$, is the input for the next step. Similarly, $Y_{t+1}$ from that step is the input, together with $X_{t+2}$, for the following step, and so on. In this way, the network keeps remembering the context while training. The formula for the current state is:
$$Y_t = f(Y_{t-1}, X_t)$$
Applying the activation function:
$$Y_t = \tanh(W_{hh} Y_{t-1} + W_{hx} X_t)$$
where $W$ is a weight, $Y$ is the single hidden vector, $W_{hh}$ is the weight applied to the previous hidden state, $W_{hx}$ is the weight applied to the current input state and $\tanh$ is the activation function, which implements a non-linearity that squashes the activations to the range from −1 to 1.
Output:
$$H_t = W_{hy} Y_t$$
The unfolding process of the recurrent neural network brings two main advantages for using this network to build our machine learning model, which classifies polarity words in each product review in this paper [37]. First, regardless of the sequence length, the learned model always has the same input size because it is specified in terms of the transition from one state to another, instead of in terms of a variable-length history of states. Second, it is possible to use the same transition function $f$ with the same parameters at every time step. These two factors allow our training procedure to learn a single model $f$ that operates on all time steps and all sequence lengths, instead of needing to learn a separate model for every possible time step. Learning a single shared model allows generalization to sequence lengths that did not appear in the training set and enables the model to be estimated with far fewer training examples than would be required without parameter sharing.
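The recurrence above, in which the hidden state Y_t is computed from Y_{t-1} and X_t through tanh and the output H_t is obtained from Y_t, can be sketched in a few lines of Python. This is a simplified scalar illustration under assumed weight values, not the network trained in this study; the function name `rnn_forward` and all numbers are hypothetical.

```python
import math

def rnn_forward(inputs, w_hx, w_hh, w_hy, y0=0.0):
    """Run a scalar RNN over a sequence, reusing the same weights at every timestep."""
    y_prev = y0
    outputs = []
    for x_t in inputs:
        # Hidden state: Y_t = tanh(W_hh * Y_{t-1} + W_hx * X_t)
        y_t = math.tanh(w_hh * y_prev + w_hx * x_t)
        # Output: H_t = W_hy * Y_t
        outputs.append(w_hy * y_t)
        y_prev = y_t
    return outputs

# The same (assumed) parameters handle sequences of different lengths.
short = rnn_forward([1.0, 0.5], w_hx=0.8, w_hh=0.5, w_hy=2.0)
longer = rnn_forward([1.0, 0.5, -0.3, 0.9], w_hx=0.8, w_hh=0.5, w_hy=2.0)
```

Because the transition is applied with shared parameters, the first two outputs of `longer` coincide with `short`, which is the parameter-sharing property discussed in the text.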
In this paper, a training model is built in which the sentiment analysis problem is solved by a recurrent neural network (RNN) model, specifically its commonly used variant, the long short-term memory (LSTM) neural network, combined with vector representations of words from a continuous bag-of-words (CBOW) architecture.

Proposal of Research Model
In this research, a sentiment evaluation model for customer reviews is proposed that extracts objects and emotional words for emotion-level analysis using machine learning algorithms. We combined emotional dictionaries and a set of extraction rules to build a training data set based on the semantic dependency relationships between words in sentences in the given Vietnamese context. The research framework is presented in Figure 3. The process starts with crawling review data automatically. The crawled data are then sent to data processing, which includes functions such as standardization and stop word removal, and are extracted into features. Five main groups of features that directly affect customer reviews were found in this research. After the training program is run, a new review can be input to the model and classified by sentiment. If a new feature is defined, it is added to the feature library, as shown in Figure 3. Given a new product review statement as input, the model can analyze the sentiment of the review and visualize the analysis results on the real-time website. The details of each primary process in this research framework are explained in the following parts.

Crawling Reviews Data
We used Python combined with Splash-Scrapy to crawl sales data from e-commerce websites. Splash is a lightweight headless browser that works as an HTTP API. With Splash, we can render JavaScript pages, follow URLs and extract data from specific HTML paths on each page. The collected data were saved in MongoDB on AWS. After collecting the data, we cleaned and filtered them in Python. Key condition expressions were used to filter the data each time data were crawled.
Next, after reviewing the data sources of Tiki, the Hot Deal data of the Tiki website were chosen. As a result, there were 56,484 raw reviews from 16 product categories in the Hot Deal section of the Tiki website, as shown in Table 1. Then, we used two Python libraries, ViTokenizer and ViPostagger, to produce tagged files containing word-segmented paragraphs in Vietnamese, and the Gensim library for unsupervised topic modeling and natural language processing using modern statistical machine learning. In this same step, we used TF-IDF to determine the importance of words to the sentences in the corpus [38]. TF-IDF works by determining the relative frequency of words in a specific document compared to the inverse proportion of such words over the entire document corpus [38]. The TF-IDF weight is composed of two terms. The first is the normalized term frequency (TF): the number of times a word appears in a document divided by the total number of words in that document. The second is the inverse document frequency (IDF), which measures how important a term is [39] and is computed as the logarithm of the number of documents in the corpus divided by the number of documents in which the specific term appears [39][40][41].
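The TF-IDF weighting described above (normalized term frequency multiplied by the logarithm of the number of documents divided by the number of documents containing the term) can be written out directly. The following is a minimal pure-Python sketch over a tiny, made-up corpus of already tokenized, non-empty reviews; it is not the library implementation used in this study, and the helper name `tf_idf` is hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per tokenized, non-empty document."""
    n_docs = len(documents)
    # Document frequency: the number of documents in which each term appears.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # TF = count / document length; IDF = log(N / document frequency)
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical tokenized reviews.
docs = [
    ["hàng", "tốt", "giao", "nhanh"],
    ["hàng", "đẹp", "giá", "tốt"],
    ["hàng", "giao", "chậm"],
]
scores = tf_idf(docs)
```

In this toy corpus, "hàng" occurs in every document, so its IDF is log(1) = 0 and its weight vanishes, which is why TF-IDF is convenient for spotting stop-word-like terms.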
In our research, we used TF-IDF for stop word filtering, a technique applied in various subject fields, including text summarization and classification. Figure 4 presents the results of the stop word and punctuation process. In addition, after processing the data, we found that customers habitually used many acronyms, misspelt words, spoken language, etc., in product reviews. We therefore built a separate dictionary for these particular words.

Feature Defining
In this step, we reviewed the data set and extracted the essential features of the product reviews. We also conducted a frequency analysis, extracted terms in customer reviews related to process features and added them to the final feature word list, as shown in Table 2. First, the collected product reviews are standardized as sentences and their words are tagged as subjects, objects, verbs and adjectives by comparison with the sentence tagger dictionary. Second, the sentences are represented as vectors in a vectorization step. The system identifies the entity and feature words with a machine learning algorithm using the topic training data set model. After this step, the process can identify the adjectives and verbs corresponding to each feature's attribute. Finally, the process calculates the weight of the features for each entity and summarizes the feature and opinion words in the reviews. The main purpose of the topic modeling process is to explore feature words, as well as the main opinion of customers' review sentences in the data set.
Text files are ordered series of words. To run the machine learning algorithm, we needed to convert the text files into numerical feature vectors, for which we used a bag-of-words model. Briefly, we segmented each text file into words (for English, by splitting on spaces), counted the number of times each word occurs in each document and, finally, assigned each word an integer ID. Each unique word in our dictionary corresponds to a word in the table of opinion feature words. The representation of text documents is a challenging task in machine learning. Nowadays, many techniques are used, such as the doc2vec method presented in [40]. In this research, we used the TfidfVectorizer method.
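The bag-of-words steps just described (segment into words, assign each unique word an integer ID and count occurrences per document) can be sketched as follows. This is a minimal illustration with made-up tokens, assuming the documents are already segmented; the function names are hypothetical and it does not reproduce the vectorizer used in this study.

```python
from collections import Counter

def build_vocabulary(tokenized_docs):
    """Assign each unique word an integer ID in order of first appearance."""
    vocab = {}
    for doc in tokenized_docs:
        for word in doc:
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab

def to_count_vector(doc, vocab):
    """Represent one document as a list of word counts indexed by word ID."""
    vector = [0] * len(vocab)
    for word, count in Counter(doc).items():
        if word in vocab:  # words outside the vocabulary are ignored
            vector[vocab[word]] = count
    return vector

docs = [["giao", "nhanh", "giao"], ["nhanh", "tốt"]]
vocab = build_vocabulary(docs)
vectors = [to_count_vector(doc, vocab) for doc in docs]
# vectors is [[2, 1, 0], [0, 1, 1]]
```

Every document is mapped to a vector of the same length (the vocabulary size), which is what allows a machine learning algorithm to consume reviews of different lengths.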

Machine Learning Algorithms
The workflow for building the machine learning model is presented in Figure 7. The process starts with creating the sentiment reviews. After that, the process extracts features from the text files into labels. We continued by defining the model, which is a training model. After drawing up the training model, we evaluated it with the model's comparison indices (accuracy, val_loss, loss). If all indices are suitable, the process ends. If not, the process reports the situation, reviews the dictionary and extends it or changes the model if needed.

Phase 1: Extracting features from text files
In this phase, we stored the collected review data, after processing, as values in the corresponding feature columns. Then, we calculated statistics on the number of comments for each feature, pivoted the data and saved them to the database. We also prepared statistics on comments by topic. Because the review details are nested objects, they need to be processed inside a loop over the objects. We defined the evaluation functions according to each feature word. Then, we exported the results into a file used as the training data.
In the beginning, all collected product review data are stored in MongoDB in a key-value format. The keys are ID, product event ID, rating, reviews count, rating count, review detail (review author, review date post, review title, content and useful), link of review and time updated of this review. Each key contains the corresponding value.
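To make the key-value layout above concrete, the following Python sketch shows one hypothetical review document and a helper that flattens the nested review detail into columns. The field spellings, the sample values and the `flatten` helper are assumptions for illustration; only the list of keys comes from the description above.

```python
# Hypothetical example of one stored review document; all values are made up.
review_doc = {
    "id": "r-001",
    "product_event_id": "p-123",
    "rating": 4,
    "reviews_count": 12,
    "rating_count": 15,
    "review_detail": {
        "review_author": "user_a",
        "review_date_post": "2020-04-17",
        "review_title": "Hài lòng",
        "content": "Giao hàng nhanh",
        "useful": 3,
    },
    "link": "https://example.com/review/r-001",
    "time_updated": "2020-04-18T10:00:00",
}

def flatten(doc, parent_key=""):
    """Flatten nested dictionaries into a single level with dotted keys."""
    flat = {}
    for key, value in doc.items():
        full_key = f"{parent_key}.{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key))
        else:
            flat[full_key] = value
    return flat

row = flatten(review_doc)  # e.g. row["review_detail.content"] holds the review text
```

Flattening the nested object in this way gives one column per key, which suits the per-feature statistics and file export described in this phase.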
The statistics on reviews by product category are prepared with the df_overview_topic functions. At this step, all the reviews are counted by product category in two main columns: name of category and frequency. There are 14 product categories found in this step, as shown in Figure 8. Next, the data move to the process that extracts all features from the text files into labels. We define the evaluation functions according to the feature dictionary and analyze product reviews with five features: price, quality, shipment, design and customer satisfaction. Each feature is polarized into three levels: 1 is negative, 2 is neutral and 3 is positive. We use conditional statements to extract the levels of these features in customer reviews easily. After that, the resulting values are stored in the database. Finally, we export all the resulting values into CSV files, which are used as the training data.
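The rule-based labeling described above, in which conditional statements map each feature mentioned in a review to level 1 (negative), 2 (neutral) or 3 (positive), might look like the following sketch. The keyword sets are hypothetical stand-ins for the feature dictionary, only two of the five features are shown, and the polarity is decided crudely at the level of the whole review rather than per feature.

```python
# Hypothetical feature dictionary: keywords that signal each feature.
FEATURE_WORDS = {
    "price": {"giá", "rẻ", "đắt"},
    "shipment": {"giao", "ship"},
}
# Hypothetical opinion words for the negative and positive levels.
NEGATIVE = {"chậm", "đắt", "tệ", "sai"}
POSITIVE = {"nhanh", "rẻ", "tốt"}

def label_review(tokens):
    """Return {feature: level} with 1 = negative, 2 = neutral, 3 = positive."""
    token_set = set(tokens)
    labels = {}
    for feature, keywords in FEATURE_WORDS.items():
        if not token_set & keywords:
            continue  # the review does not mention this feature
        if token_set & NEGATIVE:
            labels[feature] = 1
        elif token_set & POSITIVE:
            labels[feature] = 3
        else:
            labels[feature] = 2
    return labels

labels = label_review(["giao", "nhanh"])  # {"shipment": 3}
```

A review that mentions a feature without any opinion word falls through to the neutral level, and a review that mentions no feature keyword receives no label at all.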

Phase 2: Building the Data model
We use the trained model to build the classification function in order to predict future comments in real time. In our training model, we separated our data into two parts: approximately 80% of the data for training and the remaining 20% for testing. The training data contain 30,000 records and the test data contain 8,778 records. The training data are processed with steps such as standardization, POS tagging and feature extraction. Each sentence is divided into basic tokens, punctuation and spaces before training. The input text of the model can be one sentence or a short paragraph of 3 to 5 sentences from the same review. Vietnamese is an isolating language, so word boundaries are not automatically defined by spaces. Determining the semantics of words is greatly influenced by word segmentation, so accurate segmentation helps the model achieve higher accuracy. Therefore, there is also a dictionary for phrases that often appear but do not carry much meaning, as well as for words that affect the operation of the model. The continuous bag-of-words (CBOW) model is used to represent the word set in a vector space at this stage. In this vector space, the more similar two words are, the larger their cosine similarity (closer to 1); conversely, less similar words have a smaller value (closer to 0). As a result of the training process, we obtained the set of weights of the LSTM neural network, stored in a file (params), along with the LSTM network configuration hyperparameters (conf). These two files are loaded into the LSTM network for testing and operation. As the sampling procedure for testing the model, we chose random selection in order to ensure the accuracy and fairness of the model evaluation. Figure 9 shows the entire UML framework of our system. The user can access MainActivity, which has a login interface. MainActivity connects with the server through the Internet.
When a user sends a request to view the report of customer product reviews, MainActivity sends it to the server through API. The server also connects with RealtimeCollect, where the systems check real-time reviews; if there are any new reviews, this function adds, such new reviews and sends them to Analysis_review. If not, go straight to Anal-ysis_review. Analysis review shows sentiment reviews, topic mining and entity mining. Finally, send all analysis data to visualization. Visualization sends historical reports and predicted reports to MainActivity, as requested by the user.
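As a small illustration of the similarity property of the word vectors used in this phase, the sketch below computes the cosine similarity between word vectors. The three-dimensional vectors are invented for the example and are not taken from the trained CBOW model; real embeddings have many more dimensions, and cosine similarity can in general also be negative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings (assumed values, for illustration only).
toy_vectors = {
    "tốt": [0.9, 0.1, 0.3],   # "good"
    "đẹp": [0.8, 0.2, 0.4],   # "beautiful"
    "giao": [0.1, 0.9, 0.0],  # "ship"
}

similar = cosine_similarity(toy_vectors["tốt"], toy_vectors["đẹp"])
dissimilar = cosine_similarity(toy_vectors["tốt"], toy_vectors["giao"])
```

With these toy values, the pair "tốt"/"đẹp" scores close to 1, while "tốt"/"giao" scores much lower, which is the behavior the text describes.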

Topic Modeling
The results of the topic modeling process show that there are 42 distributions of words structured into five main features, which we manually named quality, price, design, shipment and satisfaction, as shown in Figure 10. For each topic, the most frequent words (vietnamese_word and english_word) are listed together with their frequencies and probabilities (Prob.). We designed word cloud charts, as in Figure 11, to visualize all the topic distributions so that users can easily follow and understand the overview of the product review data. As shown in Figure 11, the ten most commonly used words in customers' reviews are "Hàng" (Product), "Không" (No), "Sản_phẩm" (Product), "Giao" (Ship), "Tốt" (Good), "Được" (Get), "Nhanh" (Fast), "Đẹp" (Beautiful), "Nhưng" (But) and "Mới" (New).
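The frequency and probability columns of such a word listing can be sketched as below. This is not the topic model itself, only an assumed way of producing a top-k word table from already segmented reviews; the function name top_words and the sample tokens are hypothetical.

```python
from collections import Counter


def top_words(tokenized_reviews, k=10):
    """Return (word, frequency, probability) for the k most frequent tokens."""
    counts = Counter(
        token.lower() for review in tokenized_reviews for token in review
    )
    total = sum(counts.values())
    return [(word, freq, freq / total) for word, freq in counts.most_common(k)]


# Hypothetical segmented reviews (multi-syllable words joined with "_").
sample_reviews = [
    ["hàng", "tốt"],
    ["hàng", "giao", "nhanh"],
    ["hàng", "tốt"],
]
table = top_words(sample_reviews, k=2)
```

Here the probability is simply the token's share of all tokens in the corpus; a word cloud can then scale each word by its frequency.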

Data Training Set
As a result, this research built a training data set for classifying the sentiment of Vietnamese reviews. It contains 38,778 rows, each a review labeled on five features (price, shipment, quality, design, satisfy) at three sentiment levels (1-negative, 2-neutral, 3-positive), as shown in Table 3.
During training, val_loss declines and val_acc increases, which indicates that the model is learning and working well. Only the loss function is used to update the model's parameters; accuracy is used only to monitor how well the model works.
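The difference between the two quantities can be illustrated with a toy calculation. The sketch assumes a categorical cross-entropy loss over the three sentiment levels, which the text does not state explicitly, and the predicted probabilities are invented.

```python
import math


def mean_cross_entropy(batch_probs, true_classes):
    """Average negative log-probability assigned to the true class."""
    losses = [-math.log(probs[t]) for probs, t in zip(batch_probs, true_classes)]
    return sum(losses) / len(losses)


def accuracy(batch_probs, true_classes):
    """Share of samples whose most probable class is the true class."""
    hits = sum(
        1
        for probs, t in zip(batch_probs, true_classes)
        if probs.index(max(probs)) == t
    )
    return hits / len(true_classes)


# Indices 0, 1, 2 stand for levels 1 (negative), 2 (neutral), 3 (positive).
true_classes = [2, 0]
early_epoch = [[0.3, 0.3, 0.4], [0.4, 0.3, 0.3]]  # hypothetical predictions
later_epoch = [[0.1, 0.1, 0.8], [0.8, 0.1, 0.1]]  # hypothetical predictions

early = (mean_cross_entropy(early_epoch, true_classes), accuracy(early_epoch, true_classes))
later = (mean_cross_entropy(later_epoch, true_classes), accuracy(later_epoch, true_classes))
```

Both sets of predictions are fully accurate, yet the loss is lower for the more confident ones. The loss is a smooth signal that can drive parameter updates, whereas accuracy only counts correct labels and is therefore used for monitoring.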
When a new review is input, the model can classify its sentiment. For example, given the review "Fast shipping, good product quality, prices are quite cheap, so I feel great" ("mình mua hàng giao nhanh_chóng chất_lượng sản phẩm đóng_gói chắc chắn giá khá_rẻ hàng đảm bảo nên cũng khá an_tâm"), the model correctly classifies the sentiment with status (33222): Price_Positive, Shipment_Positive, Quality_Neutral, Design_Neutral and Satisfy_Neutral.

bought at the promotion price, bought many times, packaged goods with plastic wrap outside to avoid getting wet, but the box is still a paper carton, so the wide box is heavy and dented.
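The (33222) status in the example can be read as one digit per feature, in the order price, shipment, quality, design, satisfy. A minimal sketch of that decoding follows; the function name decode_status and the validation rules are assumptions for illustration.

```python
FEATURES = ["Price", "Shipment", "Quality", "Design", "Satisfy"]
LEVELS = {"1": "Negative", "2": "Neutral", "3": "Positive"}


def decode_status(status):
    """Map a five-digit status string to Feature_Level labels."""
    if len(status) != len(FEATURES):
        raise ValueError("status must contain one digit per feature")
    labels = []
    for feature, digit in zip(FEATURES, status):
        if digit not in LEVELS:
            raise ValueError(f"unknown level {digit!r} for {feature}")
        labels.append(f"{feature}_{LEVELS[digit]}")
    return labels


labels = decode_status("33222")
```

For "33222" this yields the same five labels as listed for the example review.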

Visualization of the Results
Charts and tables are presented so that readers and users can gain more information and insight from our research. We also designed a web-app to visualize all the results.
As shown in Figure 12, the product category with the highest share of reviews is fashion (18.37%), followed by toys-moms and children (10.48%) and beauty-health (9.76%). Most customers who took the time to give feedback were female, so the company should focus on designing and providing products that are more suitable and attractive to female customers. The research results also show the ratio and polarity of feedback. Table 4 presents an example of a visualization screen in our web-app showing the total number of reviews for each product category. The top three product categories with the most reviews are fashion (10,378 reviews), toys-moms and children (5921 reviews) and beauty-health (5513 reviews). Figure 13 presents the sentiment results of product reviews by product category according to five features: customer satisfaction, good design, good quality, good shipping and low price.
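The reported percentages are consistent with dividing each category's review count by the total number of crawled reviews. The sketch below reproduces that arithmetic; using the 56,484 crawled reviews reported in the Conclusions as the denominator is an assumption, and review_ratio is a hypothetical helper name.

```python
TOTAL_REVIEWS = 56484  # total crawled reviews, as reported in the Conclusions

# Review counts of the top three categories, as reported in the text.
category_counts = {
    "fashion": 10378,
    "toys-moms and children": 5921,
    "beauty-health": 5513,
}


def review_ratio(count, total=TOTAL_REVIEWS):
    """Share of all reviews, in percent, rounded to two decimals."""
    return round(100 * count / total, 2)


ratios = {name: review_ratio(count) for name, count in category_counts.items()}
```

Under this assumption the three ratios come out as 18.37, 10.48 and 9.76 percent, matching the values read from Figure 12.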

Conclusions
This research aims to build a machine learning model to classify Vietnamese product reviews based on five main features of the online buying process on an e-commerce website. Specifically, we crawled 56,484 customer reviews from 16 kinds of products. We collected details and descriptions of products and screened the product reviews to find the main feature words. As a result, we found five main features that appear in and affect almost all customer reviews: price, shipment, quality, design and satisfaction. By extracting feature words, we built a solitary word dictionary with three levels (negative, neutral and positive) in Vietnamese. After that, we built a machine learning model using an RNN model with a training data set of 38,778 records. Given a customer review or a file of customer reviews, our model can describe the reviews with five features and three levels. Finally, there is a web-app that connects to a server and visualizes all the research results. Based on the research results, enterprises can follow up their customers in real time and receive recommendations to understand their customers.
In this research, we used only a simple model and polarized reviews with five defined features and three levels; we have not yet considered the relationship and ratio of each feature with the products. The testing method of the research model and the evaluation of the sampling procedure have not yet been established. The current model still needs to be evaluated in terms of utility and cross-validation metrics with different training and test sets; building this cross-validation and evaluation is left for future work. In addition, future work shall be conducted on other e-commerce websites. Building a multi-platform product review polarity model covering a variety of products is also essential, and deep learning models and natural language processing techniques will be researched in the future to address these problems. We also plan to build an Android app based on cloud computing to enable users to follow sales processing and customer feedback in real time. This research can serve as a suitable and highly applicable reference, not only in the field of e-commerce: the methodology can also be used to train models and analyze the sentiment of reviews in other specific domains such as education, government, banking, service, healthcare or scientific domains.