Improving the Forecasting Performance of Taiwan Car Sales Movement Direction Using Online Sentiment Data and CNN-LSTM Model

The automotive industry is the leading producer of machines in Taiwan and worldwide. Developing effective methods for forecasting car sales can allow car companies to arrange their production and sales plans. Capitalizing on the growth of social media and deep learning algorithms, this research aimed to improve the overall performance of forecasting the movement direction of car sales in Taiwan by using online sentiment data and a CNN-LSTM method. First, the historical sales volumes and multi-channel online sentiment data for six car brands in Taiwan were collected and preprocessed for labeling of car sales movement direction. Then, three models, namely, the classical, sentimental, and CNN-LSTM models, were constructed and trained/fitted for forecasting car sales movement directions in Taiwan. Finally, the performance of the three prediction models was compared to verify the effects of online sentiment data and the CNN-LSTM model on forecasting performance. The results showed that four forecasting performance indices, i.e., accuracy, precision, recall, and F1-score, improved by 27.78 percentage points (from 41.67% to 69.45%), 0.39 (from 0.38 to 0.77), 0.27 (from 0.42 to 0.69), and 0.33 (from 0.35 to 0.68), respectively. Therefore, the online sentiment data and the CNN-LSTM method can indeed improve the overall performance of forecasting car sales movement direction in Taiwan.


Introduction
Forecasting, which can help managers develop more accurate and meaningful plans, has played an important role in reducing business uncertainty for companies [1]. Sales forecasting in particular is the basis of definite and reliable plans for marketing, sales management, production, procurement, and logistics, which further empower companies to provide better services and reap more benefits [2]. A successful sales forecast is therefore essential for companies to manage their business successfully.
The automobile industry, the leading producer of machines in many countries, is important for worldwide economic development. Furthermore, manufacturing a car requires iron, aluminum, plastic, steel, glass, rubber, copper, and more materials. If an automobile company can accurately predict its car sales, it can arrange effective production plans for its supply chain to prevent shortages and excesses of materials in the inventory process. In addition, when a customer decides to buy a new car, he/she generally hopes to take possession of the vehicle as soon as possible. If an automobile company can accurately predict its sales, it can develop effective sales plans to provide good service to its customers. Therefore, the development of a good car sales forecasting method is important for the automobile industry.
Unfortunately, few studies to date have focused on car sales forecasting [2][3][4][5][6][7][8]. Liu and Long [8] assembled a curve-regression model, a time series decomposition model, and RBF neural networks as a combined forecasting model and used economic data which takes on the obvious time factor and trends in car-making and selling. Brühl et al. [3] developed a time series model consisting of additive components: trend, seasonal, calendar, and error components. The model collected the main time series of newly registered automobiles and a secondary time series of exogenous parameters which could influence the trend of the main time series. The trend component was estimated by Multiple Linear Regression and Support Vector Machine (SVM). Yearly, quarterly, and monthly data for newly registered automobiles served as the basis for the tests of the models. The outcomes showed that the quarterly data provided the most accurate results. Wang et al. [4] developed an automobile sales forecasting methodology based on monthly sales volume, coincident indicator, leading indicator, wholesale price index, and income. Then an adaptive network-based fuzzy inference system (ANFIS) was created to obtain the forecast. The automobile forecasting methodology developed by Hülsmann et al. [5] used market-specific exogenous parameters, such as gross domestic product (GDP), stock index, personal income, and unemployment rate, on a yearly, quarterly, or monthly basis as the input variables for time series analysis and classical data mining algorithms.
On the Internet, consumers enthusiastically share their opinions and reviews via news, blogs, and social media, also known as electronic word of mouth (eWOM), and increasing numbers of potential buyers habitually consult eWOM before making their purchasing decisions [9][10][11][12][13][14]. Since eWOM can be positive or negative statements about a product or company [15][16][17], researchers have proposed sentiment analysis methods for automatically distinguishing three types of eWOM: positive, negative, and neutral [18]. To simultaneously apply historical sales data and eWOM to car sales forecasting, Fan et al. [2] used a sentiment analysis method, the Naive Bayes (NB) algorithm, to extract the sentiment index from each online review, and then integrated the sentiment index into the imitation coefficient of the Bass/Norton model to improve the forecasting accuracy.
Although very little effort has been expended to examine car sales forecasting, several points can still be raised by referring to forecasting studies of both car sales and sales of other products to facilitate the improvement of car sales forecasting methods.
First, historical sales data are the major predictor variable used for sales forecasting. Several other predictor variables, such as product prices, advertising campaigns, holidays [19], and economic indicators [4], are also frequently used for sales forecasting. Recently, with the popularity of social media, studies have begun using online reviews [2], online promotional strategies [20], and sentiment analysis [2,20,21] as the predictor variables to improve the performance of sales forecasting.
Second, several linear and nonlinear models, such as the Delphi technique, exponential smoothing, regression analysis, autoregressive integrated moving average (ARIMA), the Bass diffusion model, and multinomial logistic regression (MLR), are classical methods employed for sales forecasting and other predictions [4,22,23]. However, with the development of deep learning techniques such as Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks, such techniques have recently been applied to sales forecasting to improve prediction performance [4,19,20,24,25]. The CNN is usually applied to image data for solving classification problems [26], while LSTM is used to analyze time series data for solving classification, processing, and forecasting problems [27].
Third, the response variable of sales forecasting can be either the sales volume (or amount) or the sales movement direction [18,28]. Sales volume forecasting is a continuous value prediction of sales volume. In contrast, the sales movement direction transforms the sales volume into directional changes in sales, such as Up, Flat, and Down. Thus, sales movement direction forecasting is a classification problem of sales forecasting.
In Taiwan's automobile industry, the sales volume of passenger cars in 2019 was 383,987. As consumer preferences changed, the sales of imported cars in Taiwan increased year by year. The 2019 sales volume of imported cars was 200,548, i.e., about 52% of the market share of passenger cars. The total sales volume of the top six leading imported car brands, namely, BMW, Lexus, Mazda, Mercedes-Benz (Benz), Toyota, and Volkswagen (VW), was 146,231, around 73% of the market share of imported cars [29]. Thus, accurately predicting car sales, especially for the top six leading imported car brands, could contribute to the development of Taiwan's automobile industry. Consequently, this research aimed to improve the sales forecasting for Taiwan's car industry and used the top six leading imported car brands as the experiment cases.
As mentioned above, the response variable of car sales forecasting can be either the sales volume (or amount) or the sales movement direction. Fantazzini and Toktamysova [7] argued that correct forecasts of car sales movement directions can still provide useful information even with large errors in the forecast car sales volumes. This is particularly important when predicting a turning point, which is a special case of directional accuracy and represents a change in the car sales movement direction. Therefore, this research selected the car sales movement direction as the response variable for the sales forecasting of Taiwan's six leading imported car brands. In addition, because a car is a durable consumer good, potential buyers will spend more time on eWOM to aid in decision-making on the purchase. To improve the performance of car sales forecasting, in addition to historical sales data, multi-channel online sentiment data were also used as the predictor variables for car sales forecasting. Instead of regarding the sentiment data as a coefficient of the Bass/Norton model in Fan et al.'s study [2], this research prepared and analyzed a series of daily multi-channel online sentiment data of a car brand in the form of images. Therefore, to consider the image characters of online sentiment data and the time series characteristics of historical car sales data, a CNN-LSTM model integrating the CNN and LSTM networks was used to build a car sales prediction model with improved prediction performance.
To clarify the effects of the online sentiment data and the CNN-LSTM model on car sales predictions, three models were created for forecasting car sales movement directions in Taiwan. The first "classical" model, using the historical sales data as predictor variables and MLR as the prediction model, was created as the performance baseline of forecasting of car sales movement directions in Taiwan. The MLR is a generalized logistic regression for solving problems with more than two classes [22,30]. Then a "sentimental" model was created by adding the multi-channel online sentiment data as the predictor variables to the classical model so as to verify the effects of online sentiment data on prediction performance. Finally, a "CNN-LSTM" model was created by replacing the MLR method in the sentiment model with the CNN-LSTM method proposed in this research to verify the effects of the latter method on prediction performance.
The performance comparison of the three prediction models showed that the forecasting accuracy of car sales movement directions in Taiwan was effectively improved by the use of online sentiment data and the CNN-LSTM model. This paper is organized as follows. Section 1 states the relevant topics of this research and reviews the literature related to the research problem. Section 2 elaborates the creation process of the three prediction models for forecasting car sales movement directions in Taiwan. In Section 3, the results of the three prediction models are compared and analyzed to verify the effects of the online sentiment data and the CNN-LSTM model on the forecasting of car sales movement directions in Taiwan. Finally, the important findings, discussions, and suggestions for further research are summarized in Section 4.
Figure 1 shows the research framework for improving the forecasting of car sales movement directions in Taiwan by using online sentiment data and the CNN-LSTM model. First, the historical sales volumes and multi-channel online sentiment data of Taiwan's top six leading imported car brands were collected and preprocessed for labeling of car sales movement directions. Second, the structures of three prediction models, namely, the classical, sentimental, and CNN-LSTM models, were constructed for forecasting car sales movement directions in Taiwan. Third, the three prediction models were trained or fitted with the datasets of Taiwan's top six leading imported car brands. Finally, the prediction performances of the three prediction models were evaluated and compared to verify the effects of online sentiment data and the CNN-LSTM model on the forecasting of car sales movement directions in Taiwan.

Data Collection
As mentioned above, this research used both online sentiment data and the CNN-LSTM method to improve the performance of predictions of the car sales movement direction of Taiwan's top six leading imported car brands. As shown in Figure 2, for each of Taiwan's six car brands, namely, BMW, Lexus, Mazda, Benz, Toyota, and VW, the historical car sales data and online sentiment data were collected mainly from 2014 to 2019. The historical car sales data were collected from the Ministry of Transportation and Communications, R.O.C. (MOTC) [29]. The MOTC website is a platform for commonly used transportation statistics and is operated by Taiwan's government. Since the MOTC website provides new car registration data on a monthly basis, as shown in Table 1, the new car registration data from 2014 to 2019 were retrieved as the historical monthly sales volumes for six car brands. In addition, the new car registration data of January 2020 were also collected for the continuing labeling work. The online sentiment data were collected from the OpView Insight: Social Media Monitoring Tool (OpView) [31]. OpView is the largest social media monitoring service platform in Taiwan. It collects eWOM and news every day from five online media sources in Taiwan [32], including more than 6100 discussion forums (e.g., the Mobile01 and the Dcard), more than 36,000 social media (e.g., Facebook and Instagram), more than 400 Q&A websites (e.g., Yahoo! Answers), more than 1800 blogs, and more than 3600 news websites (e.g., ETtoday and Line Today) [31]. The collected daily eWOM and news are then analyzed as three types of sentiments, i.e., positive, negative, and neutral, for various products and brands. For the study, three types of daily online sentiment volumes, positive (P), negative (N), and total (T), from 2014 to 2019 for six car brands were collected. Table  2 shows the collected daily online sentiment volumes for BMW.

Labeling of Car Sales Movement Directions
As mentioned in Section 1, this research selected the sales movement direction as the response variable of the car sales prediction models. Hence, the monthly sales movement directions were labeled with the collected monthly sales volumes for the six car brands. The three types of sales movement directions, Up (U), Flat (F), and Down (D), are defined in Equation (1):
$$l_i = \begin{cases} \text{U}, & (S_{i+1} - S_i)/S_i > h \\ \text{F}, & -h \le (S_{i+1} - S_i)/S_i \le h \\ \text{D}, & (S_{i+1} - S_i)/S_i < -h \end{cases} \quad (1)$$
Since the intent of this research was to predict, in the current month, the car sales movement direction of the next month, this equation indicates that the ith month's label $l_i$ is determined by the (i+1)th month's sales growth rate $(S_{i+1} - S_i)/S_i$ and the predefined threshold h, where $S_i$ denotes the sales volume of the ith month.
Assume that the threshold h is set at 10%. The results of Equation (1), in Table 3, show the labeling results of each month from 2014 to 2019 based on the collected historical car sales volumes shown in Table 1.
For example, the l1 (i.e., the label of 2014/1) of BMW is determined by the 2nd month's (i.e., 2014/2's) monthly sales growth rate (624−1437)/1437 = −56.58% and the predefined threshold h = 10%. As the 2nd month's monthly sales growth rate, −56.58%, is smaller than −10% (i.e., −h), the l1 of BMW is labeled as the Down direction "D". As mentioned in Section 2.1.1, for labeling the sales movement direction of 2019/12, i.e., l72, the car sales volume of 2020/1 must be collected. In the example of the l72 of Lexus, the 73rd month's monthly sales growth rate (2687−2348)/2348 = 14.44% is greater than the threshold h = 10%, so the l72 of Lexus is labeled with the Up direction "U" according to Equation (1).
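The labeling rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `label_direction` and `label_series` are hypothetical, and growth rates that fall exactly on ±h are treated as Flat, which matches the worked examples (strictly greater than h is Up, strictly smaller than −h is Down) but is not spelled out further in the text.

```python
def label_direction(current_sales, next_sales, threshold=0.10):
    """Return 'U', 'F', or 'D' for the month whose volume is current_sales.

    The label depends on the growth rate from this month to the next one.
    Assumption: a rate exactly equal to +/- threshold counts as Flat.
    """
    growth_rate = (next_sales - current_sales) / current_sales
    if growth_rate > threshold:
        return "U"
    if growth_rate < -threshold:
        return "D"
    return "F"


def label_series(monthly_sales, threshold=0.10):
    """Label every month that has a following month in monthly_sales."""
    return [
        label_direction(current, nxt, threshold)
        for current, nxt in zip(monthly_sales, monthly_sales[1:])
    ]


# The two worked examples from the text (h = 10%).
bmw_2014_01 = label_direction(1437, 624)     # growth rate of about -56.58%
lexus_2019_12 = label_direction(2348, 2687)  # growth rate of about +14.44%
print(bmw_2014_01, lexus_2019_12)
```

Because the last month of a series needs the following month's volume, a series of n monthly volumes yields n − 1 labels, which is why the January 2020 volume is required to label December 2019.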

Prediction Model Structure Construction
As explained in Section 1, to verify the effects of online sentiment data and the CNN-LSTM model on Taiwan's car sales prediction, three models, namely, the classical, sentimental and CNN-LSTM models, were created for forecasting car sales movement directions in Taiwan. The structures of these three prediction models will be described in this section.

The Classical Model
In this research, the classical model was created as the baseline for prediction performance for comparison with the sentimental and the CNN-LSTM models.
The classical model adopted the most frequently used predictor variables for sales forecasting, including historical sales data and seasonality data, to predict the car sales movement direction. For example, Table 4 is the dataset prepared for creating the classical model for BMW. The monthly car sales volumes and monthly labeling data were retrieved from Table 3. As for seasonality data, car companies in Taiwan usually start different sales campaigns in specific months to promote sales, so monthly car sales volumes exhibit strong seasonality. In this research, the seasonality data, namely, the month number and the same-month-last-year sales movement direction labels, were added, as shown in Table 4, to improve the accuracy of predictions of car sales movement directions. Furthermore, since three types of car sales movement directions (U, F, and D) were defined in this research, the MLR, a generalized logistic regression for solving problems with more than two classes [22], was selected as the forecasting method of car sales movement directions. Hence, the classical model for forecasting car sales movement directions in Taiwan can be expressed by Equation (2):
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \epsilon \quad (2)$$
where $y_i$ refers to the car sales movement direction (U, F, or D) of the ith month; $x_{i1}$ refers to the sales volume of the ith month; $x_{i2}$ refers to the same-month-last-year sales movement direction label of the ith month; $x_{i3}$ refers to the month number of the ith month; $\beta_0$ refers to the y-intercept (constant term); $\beta_1$ to $\beta_3$ refer to the slope coefficients for $x_{i1}$ to $x_{i3}$; and $\epsilon$ refers to the model's error term (also known as the residuals).
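In multinomial logistic regression with K classes, one class is chosen as a "pivot", and K-1 independent binary logistic regression models are constructed against it [22]. If car sales movement direction Y = U is selected as the pivot, then the models relative to Y = U are:
$$\ln\frac{P(Y = F)}{P(Y = U)} = b_F \cdot x, \qquad \ln\frac{P(Y = D)}{P(Y = U)} = b_D \cdot x \quad (3)$$
where Y refers to the random variable of $y_i$ in Equation (2); $b_F$ and $b_D$ refer to the sets of regression coefficients in Equation (2) associated with car sales movement directions F and D, respectively (b is typically estimated by the maximum likelihood method [33]); and x refers to the vector of $x_{i1}$ to $x_{i3}$ in Equation (2). Then, the probabilities that x belongs to car sales movement directions F and D can be expressed as Equation (4):
$$P(Y = F) = P(Y = U)\, e^{b_F \cdot x}, \qquad P(Y = D) = P(Y = U)\, e^{b_D \cdot x} \quad (4)$$
Since the sum of the probabilities that x belongs to each class is 1, the probability that x belongs to car sales movement direction U becomes:
$$P(Y = U) = \frac{1}{1 + e^{b_F \cdot x} + e^{b_D \cdot x}} \quad (5)$$
and Equation (4) can be rewritten as follows:
$$P(Y = F) = \frac{e^{b_F \cdot x}}{1 + e^{b_F \cdot x} + e^{b_D \cdot x}} \quad \text{and} \quad P(Y = D) = \frac{e^{b_D \cdot x}}{1 + e^{b_F \cdot x} + e^{b_D \cdot x}} \quad (6)$$
Given the x from Table 4's dataset, MLR outputs a car sales movement direction label y such that:
$$y = \arg\max_{k \in \{U, F, D\}} P(Y = k) \quad (7)$$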
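To make the pivot formulation concrete, the following sketch computes the three class probabilities for one observation and picks the most probable direction. It is a minimal illustration, not the fitting procedure used in this research: the function names, the coefficient values, and the feature encoding (a leading 1.0 for the intercept, followed by scaled or encoded versions of the three predictors) are all hypothetical.

```python
import math


def mlr_probabilities(x, b_f, b_d):
    """Class probabilities of a 3-class logistic model with U as the pivot.

    x, b_f, and b_d are equal-length lists; x[0] is 1.0 so that the first
    coefficient of each set acts as the intercept.
    """
    score_f = sum(b * v for b, v in zip(b_f, x))
    score_d = sum(b * v for b, v in zip(b_d, x))
    denominator = 1.0 + math.exp(score_f) + math.exp(score_d)
    return {
        "U": 1.0 / denominator,
        "F": math.exp(score_f) / denominator,
        "D": math.exp(score_d) / denominator,
    }


def mlr_predict(x, b_f, b_d):
    """Return the direction with the highest probability."""
    probabilities = mlr_probabilities(x, b_f, b_d)
    return max(probabilities, key=probabilities.get)


# Hypothetical observation: intercept term, scaled sales volume,
# encoded same-month-last-year label, scaled month number.
x = [1.0, 0.5, 1.0, 0.25]
b_f = [0.2, -1.0, 0.5, 0.3]   # hypothetical coefficients for direction F
b_d = [-0.4, 0.8, -0.6, 0.1]  # hypothetical coefficients for direction D

probabilities = mlr_probabilities(x, b_f, b_d)
prediction = mlr_predict(x, b_f, b_d)
print(probabilities, prediction)
```

The pivot class needs no coefficients of its own: its score is fixed at zero, which is where the constant 1 in the denominator comes from.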

The Sentimental Model
As mentioned in Section 1, since the potential car buyers tend to spend more time on online sentiment data to aid in purchasing decision-making, this research intends to improve the overall performance of forecasting of car sales movement directions in Taiwan by using multi-channel online sentiment data.
To clarify the effects of multi-channel online sentiment data on car sales movement direction prediction, the sentimental model was created as shown in Equation (8) by adding the sentiment data to the classical model, shown in Equation (2), as predictor variables for forecasting car sales movement directions in Taiwan:
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_4 x_{i4} + \cdots + \beta_{18} x_{i18} + \epsilon \quad (8)$$
The only difference between the classical model and the sentimental model was that the predictor variables of the sentimental model additionally contained the online sentiment data from $x_{i4}$ to $x_{i18}$. As shown in Table 2, $x_{i4}$ to $x_{i18}$ refer to the five channels, namely discussion forums, social media, Q&A websites, blogs, and news websites, each of which contributes three types of online sentiment volume, i.e., P, N, and T, collected for the six car brands from OpView Insight.
However, the online sentiment data were collected on a daily basis, whereas the forecasting model defined in Equation (8) was monthly, since the subscript i represents the ith month. Consequently, the collected daily online sentiment data needed to be converted into monthly online sentiment data. For BMW, for example, the dataset for creating the sentimental model was prepared as shown in Table 5.
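The daily-to-monthly conversion described above could look like the sketch below. The text does not state which aggregation was used, so summing the daily volumes within each calendar month is an assumption here, and the function name `to_monthly`, the column names, and the sample volumes are hypothetical.

```python
from collections import defaultdict
from datetime import date


def to_monthly(daily_records):
    """Sum daily sentiment volumes into (year, month) buckets.

    daily_records: iterable of (date, {column_name: volume}) pairs.
    Assumption: monthly volume = sum of the daily volumes of that month.
    """
    monthly = defaultdict(lambda: defaultdict(int))
    for day, volumes in daily_records:
        key = (day.year, day.month)
        for column, volume in volumes.items():
            monthly[key][column] += volume
    return {key: dict(columns) for key, columns in sorted(monthly.items())}


# Hypothetical daily volumes (P, N, T) for one channel of one brand.
daily = [
    (date(2014, 1, 30), {"forum_P": 12, "forum_N": 3, "forum_T": 40}),
    (date(2014, 1, 31), {"forum_P": 8, "forum_N": 5, "forum_T": 31}),
    (date(2014, 2, 1), {"forum_P": 10, "forum_N": 2, "forum_T": 25}),
]
monthly = to_monthly(daily)
print(monthly)
```

With five channels and three sentiment types per channel, each monthly bucket would hold the 15 values that serve as the sentiment predictors of the sentimental model.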
As in the classical model, the MLR was used for solving the sentimental model to predict the car sales movement directions. Therefore, the fitting and forecasting processes for Equation (8) were the same as those for Equations (3)-(7).

The CNN-LSTM Model
In the last section, for the sentimental model, online sentiment data were added to the predictor variables of the classical model in an attempt to improve the overall performance of forecasting of car sales movement directions in Taiwan. This section presents the development of the CNN-LSTM model, which integrated the CNN and LSTM networks in place of the MLR method used in the sentimental model to further improve that performance. The basic idea behind combining these networks is that LSTM models are appropriate for dealing with time series data, while CNN models may filter out the noise of the input data and extract more valuable features [34,35]. Figure 3 presents the architecture of the CNN-LSTM model. A sliding window [36] is first used in pre-processing to extract data for the CNN-LSTM model. The CNN-LSTM model then consists, in sequence, of a one-dimensional convolutional neural network (1D CNN) layer, a CNN network, an LSTM network, a fully connected network, and an output layer. The CNN-LSTM model used the same predictor variables as the sentimental model with the exception of the historical car sales volumes. After a test of prediction performance, this research chose to use the past 60-day (2-month) data of the predictor variables to predict the next-month (1-month) response variable of car sales movement direction. Therefore, this research used a 90-day (3-month) window with a 30-day (1-month) sliding step to extract the training data for the CNN-LSTM model from the collected dataset. Each sliding window includes a set of 2-month predictor variables (Xi) and a 1-month response variable (Yi). The data of the predictor variables were standardized and input via a 1D CNN layer, which was developed for processing data arranged in a single dimension [26,37].
Since the multi-channel sentiment data were collected on a daily basis, their 60-day data were extracted directly to form the 1D 60-day input data. However, because both the same-month-last-year sales movement direction label and the month number were monthly values, each monthly value had to be duplicated into 30 daily values so that the two months formed the 1D 60-day input data.
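The window extraction and the duplication of monthly values described above can be sketched as follows. This is an illustration under explicit assumptions: every month is treated as exactly 30 days (as the 30-day sliding step suggests), `monthly_labels[m]` is taken to be the response variable associated with month m, and the function names are hypothetical.

```python
def expand_monthly(monthly_values, days_per_month=30):
    """Duplicate each monthly value into days_per_month daily values."""
    return [value for value in monthly_values for _ in range(days_per_month)]


def sliding_windows(daily_features, monthly_labels, days_per_month=30, input_months=2):
    """Pair each 60-day block of daily features with the following month's label.

    daily_features: list of per-day feature rows, days_per_month rows per month.
    monthly_labels: one direction label ('U', 'F', or 'D') per month.
    The window covers input_months + 1 months and slides by one month.
    """
    samples = []
    n_months = len(daily_features) // days_per_month
    for start_month in range(n_months - input_months):
        start = start_month * days_per_month
        stop = start + input_months * days_per_month
        x = daily_features[start:stop]                  # 2 months of predictors
        y = monthly_labels[start_month + input_months]  # 3rd month's response
        samples.append((x, y))
    return samples


# Hypothetical data: 4 months of one daily feature plus one label per month.
features = [[day] for day in range(120)]
labels = ["U", "F", "D", "U"]
samples = sliding_windows(features, labels)
print(len(samples), samples[0][1], samples[1][1])
```

Under these assumptions, n months of data produce n − 2 windows, so 72 months would give 70 samples before the split into training and test sets.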
After receiving data via the 1D CNN layer, a CNN network containing five convolutional and max-pooling layers filtered out the noise of the input sentiment data and extracted sentiment change features for the prediction of car sales movement directions in Taiwan. Both the daily online sentiment data and the monthly car sales data were time series, and the LSTM network, which has a feedback loop for processing an entire data sequence, is widely used for classification, processing, and forecasting based on time series data [27].
Therefore, an LSTM network and a 4-layer fully connected network following the CNN network were used to generate the output of the car sales movement direction prediction. The main hyperparameters of the CNN-LSTM model were as follows: the activation function was relu; the loss function was cross_entropy; the optimizer was adam; the learning rate was 0.003. Finally, early stopping was used to mitigate overfitting [38,39].

Results
As mentioned in Section 2.1.1, this research collected the data of car sales volumes and multi-channel online sentiment volume from 2014 to 2019. For training and evaluating the three prediction models (classical, sentimental, and CNN-LSTM) proposed in Section 2.2, the collected data from 2014 to 2018 were used as the training dataset, and the data of 2019 were used as the test dataset. This section reviews and compares the forecasting performances of the three models to verify the effects of the online sentiment data and the CNN-LSTM model on forecasting of car sales movement directions in Taiwan.
The performances of the classical, sentimental, and CNN-LSTM models proposed in Section 2.2 were evaluated with four indices: accuracy, precision, recall, and F1-score. These measures are defined in Equations (9) to (14) based on the confusion matrix shown in Table 6.
Table 6. The confusion matrix for car sales movement direction prediction.
Accuracy is an intuitive measure of the performance of a prediction model. As defined in Equation (9), accuracy is the ratio of the correct predictions to the total number of predictions. However, accuracy is a reliable measure only when the dataset is balanced. Since the dataset for forecasting car sales movement directions in Taiwan collected in this research had an uneven class distribution, three other measures, namely precision, recall, and F1-score, also needed to be reviewed to evaluate the performance of the three prediction models.
Precision was used to evaluate the correctness of the predictions of a model for each kind of car sales movement direction. As shown in Equation (10), the precision of a specific direction is the ratio of the correct predictions to the total predictions of that direction. That is, it isolates the predictions made for one direction (U, F, or D) and evaluates how many of them are correct. To evaluate the overall precision of a model, Equation (11) defines the precision as the weighted average of the precision of the individual directions.
On the other hand, recall is used to evaluate the ability of a model to correctly identify the observations in each kind of car sales movement direction in the dataset. As defined in Equation (12), the recall of a specific direction is the ratio of the correct predictions among the observations with actual values of that direction. To evaluate the overall recall of a model, the recall is defined as Equation (13), as the weighted average of the recall of individual directions.
In some situations, precision or recall will be maximized at the expense of the other metric. To take both precision and recall into account, Equation (14) defines the F1-score as the harmonic mean of precision and recall. Therefore, the F1-score is an overall measure of a model's accuracy that combines precision and recall [40]. The F1-score is usually more useful than accuracy, especially if the dataset has an uneven class distribution.
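The four indices described above can be computed from a 3×3 confusion matrix as in the sketch below. The counts are those of the aggregate confusion matrix that appears in this section; the orientation (rows as actual directions, columns as predicted directions), the use of actual class counts as weights, and the function name `evaluate` are assumptions. The F1-score is computed here as the harmonic mean of the weighted precision and recall, following the prose description; Equation (14) may aggregate differently.

```python
LABELS = ["D", "F", "U"]


def evaluate(matrix):
    """Accuracy, weighted precision, weighted recall, and F1 from a square matrix.

    Assumption: matrix[i][j] counts observations whose actual direction is
    LABELS[i] and whose predicted direction is LABELS[j].
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(n))
    actual = [sum(row) for row in matrix]            # observations per class
    predicted = [sum(col) for col in zip(*matrix)]   # predictions per class

    precision = [matrix[k][k] / predicted[k] if predicted[k] else 0.0 for k in range(n)]
    recall = [matrix[k][k] / actual[k] if actual[k] else 0.0 for k in range(n)]

    # Weighted averages, using the actual class counts as weights.
    w_precision = sum(a * p for a, p in zip(actual, precision)) / total
    w_recall = sum(a * r for a, r in zip(actual, recall)) / total
    denominator = w_precision + w_recall
    f1 = 2 * w_precision * w_recall / denominator if denominator else 0.0

    return {
        "accuracy": correct / total,
        "precision": w_precision,
        "recall": w_recall,
        "f1": f1,
        "precision_by_class": dict(zip(LABELS, precision)),
        "recall_by_class": dict(zip(LABELS, recall)),
    }


matrix = [
    [14, 3, 4],
    [3, 10, 8],
    [1, 3, 26],
]
result = evaluate(matrix)
print(result)
```

With actual class counts as weights, the weighted recall always equals the accuracy, which is consistent with the matching average accuracy and recall values reported for each model in this section.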
As mentioned in Sections 1 and 3, the classical, sentimental, and the CNN-LSTM models were designed to verify the effects of the online sentiment data and the CNN-LSTM model on the forecasting of car sales movement directions in Taiwan. Since the sentimental model was designed by adding the sentimental data to the classical model as the predictor variables for that purpose, the effects of multi-channel online sentiment data on car sales prediction accuracy could be reviewed by comparing the performances of the classical and sentimental models. Furthermore, since the CNN-LSTM model adopted a deep learning method integrating the CNN and LSTM networks instead of the MLR method in the sentimental model, the effects of the CNN-LSTM model on car sales prediction accuracy could be verified by comparing the performances of the two models.
Consequently, the remainder of this section first reviews the confusion matrices of the testing results. Tables 7-9 show the aggregate confusion matrices of the classical, sentimental, and CNN-LSTM models, respectively, for the six car brands.
         D    F    U   Total
D       14    3    4    21
F        3   10    8    21
U        1    3   26    30
Total   18   16   38
Based on the confusion matrices, the four performance indices of accuracy, precision, recall, and F1-score were then computed and compared model by model to evaluate the effects of online sentiment data and the CNN-LSTM model on the forecasting of the car sales movement directions.
Figure 4 shows the accuracy of the classical, sentimental, and CNN-LSTM models for the six car brands. For all brands but Toyota, the prediction accuracies of the sentimental model were higher than those of the classical model. Toyota had an accuracy of 50% in both the classical and sentimental models. On average, the sentimental model (51.39%) provided an improvement in accuracy of 9.72 percentage points over the classical model (41.67%) due to the addition of multi-channel online sentiment data as predictor variables. Furthermore, the accuracies of the CNN-LSTM model for all six brands were higher than those of the sentimental model. On average, the CNN-LSTM model (69.45%) achieved an accuracy that was 18.06 percentage points higher than that of the sentimental model (51.39%) due to the adoption of the CNN-LSTM network instead of the MLR method. In total, the CNN-LSTM model demonstrated an average improvement in accuracy of 27.78 percentage points over the classical model. Consequently, according to the above comparison, the accuracy of forecasting of car sales movement directions in Taiwan was indeed improved by using online sentiment data and the CNN-LSTM model.
However, accuracy is an intuitive and reliable measure only when the dataset is balanced.
The collected dataset for forecasting car sales movement directions in Taiwan had an uneven class distribution. Hence, this research further examined three other measures, i.e., precision, recall, and F1-score, to evaluate the performances of the three prediction models. Figure 5 shows the precision of the classical, sentimental, and the CNN-LSTM models for six car brands. As mentioned above, precision is a measure of the ability of a model to correctly identify the observations from the predictions of a specific car sales movement direction. As shown in Figure 5, the precision of the sentimental model for all six brands was higher than that of the classical model. On average, the precision of the sentimental model (0.55) was 0.16 higher than that of the classical model (0.38). Furthermore, the precision of the CNN-LSTM model for all but Lexus was higher than that of the sentimental model. Lexus had high precision of 0.83 in both the sentimental and the CNN-LSTM models. On average, the precision of the CNN-LSTM model (0.77) was 0.22 higher than that of the sentimental model (0.55). In total, the precision of the CNN-LSTM model was an average of 0.39 higher than that of the classical model. Consequently, according to the above comparison, the precision of the forecasting of car sales movement directions in Taiwan was effectively improved by the use of online sentiment data and the CNN-LSTM model. Accuracy Figure 5. The precision of the classical, sentimental, and the CNN-LSTM models for six car brands. Figure 6 shows the recall of the classical, sentimental, and the CNN-LSTM models for the six car brands. As mentioned above, recall is a measure of the ability of a model to correctly identify the observations of a specific car sales movement direction within the dataset. As shown in Figure 6, the recalls of the sentimental model for all six brands but TOYOTA were higher than those of the classical model. 
TOYOTA had a consistent recall of 0.50 in both the classical and sentimental models. On average, the recall of the sentimental model (0.51) was 0.09 higher than that of the classical model (0.42). Furthermore, the recalls of the CNN-LSTM model for all six brands were higher than those of the sentimental model. On average, the recall of the CNN-LSTM model (0.69) was 0.18 higher than that of the sentimental model (0.51). In total, the CNN-LSTM model demonstrated an average improvement in recall of 0.27 over that of the classical model. According to the above comparison, the recall of forecasting of car sales movement directions in Taiwan was improved by the use of online sentiment data and the CNN-LSTM model.  Figure 6. The recall of the classical, sentimental, and the CNN-LSTM models for six car brands.

Finally, the F1-scores, which consider precision and recall simultaneously, are reviewed. Figure 7 shows the F1-scores of the classical, sentimental, and CNN-LSTM models for the six car brands. As mentioned above, the F1-score is more useful than accuracy when the dataset has an uneven class distribution. As shown in Figure 7, the F1-scores of the sentimental model were higher than those of the classical model for all six brands. On average, the F1-score of the sentimental model (0.46) was 0.11 higher than that of the classical model (0.35). Furthermore, the F1-scores of the CNN-LSTM model were higher than those of the sentimental model for all six brands. On average, the F1-score of the CNN-LSTM model (0.68) was 0.22 higher than that of the sentimental model (0.46). Overall, the CNN-LSTM model demonstrated an average improvement in F1-score of 0.33 over the classical model. Based on the above comparison, the F1-score of forecasting car sales movement directions was improved by the use of online sentiment data and the CNN-LSTM model. The comparisons of the four indices across the classical, sentimental, and CNN-LSTM models show that the performance of forecasting car sales movement directions in Taiwan was improved by the use of online sentiment data and the CNN-LSTM model.
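To make the relationship between the four indices concrete, the following minimal sketch computes accuracy and the per-direction precision, recall, and F1-score for the three labels D, F, and U. The function names and the sample labels are hypothetical, and the unweighted (macro) average over the three directions is an assumption, since this section does not state how the per-brand values were averaged.

```python
from collections import Counter

LABELS = ("D", "F", "U")  # the three movement directions used in the text


def per_class_metrics(y_true, y_pred, labels=LABELS):
    """Return {label: (precision, recall, f1)} for a multi-class problem."""
    correct = Counter()    # correct predictions per direction
    predicted = Counter()  # how often each direction was predicted
    actual = Counter()     # how often each direction occurs in the data
    for t, p in zip(y_true, y_pred):
        predicted[p] += 1
        actual[t] += 1
        if t == p:
            correct[t] += 1
    metrics = {}
    for c in labels:
        precision = correct[c] / predicted[c] if predicted[c] else 0.0
        recall = correct[c] / actual[c] if actual[c] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[c] = (precision, recall, f1)
    return metrics


def macro_average(metrics):
    """Unweighted mean of precision, recall, and F1 over the classes (assumed)."""
    n = len(metrics)
    return tuple(sum(m[i] for m in metrics.values()) / n for i in range(3))


# Made-up labels for illustration only; these are not the paper's data.
y_true = ["U", "U", "D", "F", "D", "U"]
y_pred = ["U", "F", "D", "F", "U", "U"]

metrics = per_class_metrics(y_true, y_pred)
macro = macro_average(metrics)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy sample, every prediction of D is correct but only half of the actual D months are found, which is the kind of difference between precision and recall that accuracy alone hides.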
However, accuracy works best when the different kinds of false predictions have similar costs. If the costs of different false predictions are very dissimilar, it is better to look at precision or recall. In this research, the costs of classifying direction D as F or U, classifying direction F as D or U, and classifying direction U as D or F were different, so precision and recall were further explored to examine the applicability of the proposed methods.
As mentioned above, precision evaluates the ability of a model to correctly predict each car sales movement direction, whereas recall evaluates the ability of a model to correctly identify the observations of each car sales movement direction in the dataset. In practice, car companies arrange their production and sales plans according to the predicted car sales movement directions, so the correctness of the predictions, i.e., the precision, has a direct impact on car inventories and sales. Thus, in the application of car sales movement direction forecasting, precision is more important than recall. The following section discusses the precision of the three models for the six car brands in greater detail. Figure 8 illustrates the precision for direction D, Precision(D), of the classical, sentimental, and CNN-LSTM models for the six car brands. A low Precision(D) means that most of the predictions of direction D are incorrect and should be F or U, so the planned production quantity will not meet the market demand. Improving Precision(D) can prevent losses of sales and market share due to insufficient production. As shown in Figure 8 ...
For the precision of direction U, Figure 9 shows the Precision(U) of the three models for the six car brands. A low Precision(U) indicates that most of the predictions of direction U are incorrect and should be F or D, so car companies will plan a production quantity that exceeds market demand. Improving Precision(U) can avoid a surplus of cars. As shown in Figure 9, the Precision(U) values of the sentimental model for all six brands except TOYOTA were equal to or higher than those of the classical model.

On the other hand, the CNN-LSTM model appeared unable to raise the Precision(U) for Lexus, which dropped from 1.00 in the sentimental model to 0.70 in the CNN-LSTM model. Finally, the precision of direction F, Precision(F), of the three models for the six car brands is shown in Figure 10. A low Precision(F) value indicates that most of the predictions of direction F are incorrect and should be D or U. Since the planned production quantity will then be higher or lower than the market demand, car companies may either lose sales and market share or hold excess inventory. As shown in Figure ...
Figure 10. The precision of direction F of the classical, sentimental, and the CNN-LSTM models for six car brands.
The above analysis shows that the precisions of the three directions, Precision(D), Precision(F), and Precision(U), all improved step by step from the classical model to the sentimental model and then to the CNN-LSTM model. Therefore, the results indicate that the online sentiment data and the CNN-LSTM method improved the precision of forecasting directions D, F, and U of car sales in Taiwan.
However, the extent of the effects of the online sentiment data and the CNN-LSTM method on Precision(D), Precision(F), and Precision(U) differed. The contribution of the online sentiment data to Precision(U) was 0.06, while that of the CNN-LSTM method was 0.10. That is, Precision(U) gained only a small improvement from the online sentiment data and the CNN-LSTM method. For Precision(D), in contrast, the contributions of the online sentiment data and the CNN-LSTM method were 0.39 and 0.11, respectively. Thus, Precision(D) gained a larger improvement from the online sentiment data than from the CNN-LSTM method. Furthermore, Precision(F) gained only 0.15 from the online sentiment data but 0.57 from the CNN-LSTM method, so most of the improvement in Precision(F) resulted from the CNN-LSTM method. A further discussion of how the online sentiment data and the CNN-LSTM method affected Precision(D), Precision(F), and Precision(U) is provided as follows.
As mentioned in Section 2.1, this research collected and analyzed three types of online sentiment volumes, i.e., positive (P), negative (N), and total (T), to predict the car sales movement direction for six car brands. Theoretically, positive sentiment consists of consumer responses indicating satisfaction, and negative sentiment consists of consumer responses suggesting dissatisfaction. A larger positive sentiment volume in the previous months indicates an increased likelihood of a sales movement direction of U in the next month. Conversely, a larger negative sentiment volume in the previous months increases the likelihood of a sales movement direction of D in the next month.
In practice, the originators and motives of eWOMs and online sentiment will influence consumer intentions and decisions, which in turn will affect the overall performance of sales movement direction forecasting. The main originators of eWOMs and online sentiment are the customers and the profession. The online sentiments of customers who have real praise (positive eWOMs) or complaints (negative eWOMs) may directly affect the sales movement directions of the subsequent periods. On the other hand, the online sentiments from the profession include eWOMs written by the companies themselves, the company's partners, or the company's competitors. To promote its products, a company and its partners may intentionally leave positive eWOMs or sentiments on social media or blogs. Conversely, competitors may write negative eWOMs as malicious attacks. Therefore, many of the eWOMs or online sentiments from the profession may be false and could interfere with sales movement direction forecasting.
In the car market, safety and after-sales service play key roles in shaping consumer attitudes and perceptions towards car purchases. Safety and after-sales service experiences also drive customer satisfaction and eWOMs. Over the years, car companies have striven to target and attract specific customer segments through automotive design and services. Meanwhile, consumers become loyal customers of a particular automotive brand on the basis of their experiences and personal preferences. These loyal customers are also willing to write positive eWOMs and share their positive sentiment and experiences with the public via news, blogs, and social media. In contrast, customers who have poor experiences with a car brand will leave negative eWOMs and sentiments on the Internet.
As a result, in normal times, a car brand will have stable volumes of positive and negative eWOMs and sentiments from satisfied and dissatisfied customers, respectively. This steady volume of positive and negative sentiments from customers should theoretically help to predict the car sales movement direction F accurately. However, the car company may sometimes write positive eWOMs and sentiments in response to the negative eWOMs and sentiments of customers. These positive eWOMs and sentiments from the car manufacturer interfere with the prediction of direction F because they are not from actual customers. This interference may explain why the online sentiment volume did not contribute much to the improvement of Precision(F), increasing it only from 0.07 in the classical model to 0.22 in the sentimental model on average. In contrast, the CNN-LSTM method used in this research raised Precision(F) from an average of 0.22 in the sentimental model to an average of 0.79 in the CNN-LSTM model. A likely reason is that it can filter out the noise from the eWOMs and sentiment volume of the profession, because it applies the CNN's image processing ability to the online sentiment data and the LSTM's time series processing ability to the historical car sales data.
When a car manufacturer launches a new car, the company will actively stimulate online discussion of the merits of the new car. As a result, the positive eWOMs or sentiment will continue to grow for a period of time. In the early stage of the introduction of a new car, increases in positive eWOMs and online sentiments are usually accompanied by growth in new car sales. However, whether subsequent increases in positive eWOMs and online sentiment continue to stimulate new car sales depends on consumers' budget constraints and acceptance of the car brand. As mentioned above, consumers usually accept and prefer only a few brands. Therefore, in the late stage of the introduction of a new car, positive eWOMs and online sentiment may continue to increase without leading to an increase in car sales. In other words, increases in positive eWOMs and online sentiment sometimes lead to increases in car sales and sometimes do not. This reduces the effectiveness of the online sentiment data and the CNN-LSTM method, and it may be the reason why Precision(U) gained only small improvements from the online sentiment data (0.06, from 0.57 in the classical model to 0.63 in the sentimental model) and the CNN-LSTM method (0.10, from 0.63 in the sentimental model to 0.73 in the CNN-LSTM model).
In contrast, if customers are dissatisfied with the quality or service of a car brand, the volume of negative eWOMs and online sentiment will increase, and negative experiences are usually communicated faster than positive experiences. Although the car company may write positive eWOMs and sentiments in response to the negative eWOMs and sentiments from customers, car sales can still decline to varying degrees.
If the complaints are related to serious safety issues or design defects, the car company will be more active in issuing a public recall notification to instruct customers to return their cars for free repairs. At this point, the volume of negative eWOMs and online sentiments will increase significantly, and potential consumers will turn away from the car brand. This is why adding online sentiment data as predictor variables resulted in a significant improvement of 0.39 in Precision(D), from 0.34 in the classical model to 0.73 in the sentimental model. Because the online sentiment data had already contributed so much, the deep learning ability of the CNN-LSTM method improved Precision(D) by only a further 0.11, from 0.73 in the sentimental model to 0.84 in the CNN-LSTM model.
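The per-direction contributions discussed above can be checked with a few lines of arithmetic. The sketch below takes the average precision values quoted in this section for the three models and splits the total gain into the part attributed to the online sentiment data (classical to sentimental) and the part attributed to the CNN-LSTM method (sentimental to CNN-LSTM). The dictionary layout and function name are illustrative choices, not part of the original study.

```python
# Average precision per direction, as quoted in this section.
reported = {
    "U": {"classical": 0.57, "sentimental": 0.63, "cnn_lstm": 0.73},
    "D": {"classical": 0.34, "sentimental": 0.73, "cnn_lstm": 0.84},
    "F": {"classical": 0.07, "sentimental": 0.22, "cnn_lstm": 0.79},
}


def contributions(p):
    """Return (gain from sentiment data, gain from CNN-LSTM, total gain)."""
    sentiment_gain = round(p["sentimental"] - p["classical"], 2)
    model_gain = round(p["cnn_lstm"] - p["sentimental"], 2)
    total_gain = round(p["cnn_lstm"] - p["classical"], 2)
    return sentiment_gain, model_gain, total_gain


gains = {direction: contributions(p) for direction, p in reported.items()}

for direction, (s, m, t) in gains.items():
    print(f"Precision({direction}): sentiment +{s:.2f}, CNN-LSTM +{m:.2f}, total +{t:.2f}")
```

Laid out this way, the pattern in the discussion is easy to see: direction D benefits mostly from the sentiment data, direction F mostly from the CNN-LSTM method, and direction U only modestly from either.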
To sum up, both the online sentiment data and the CNN-LSTM method improve the precision of the three car sales movement directions, D, F, and U. As to the computational burden, the three models were executed on a PC with an Intel Core i7-6700 CPU @ 3.40 GHz × 4, 24 GB of RAM, and an NVIDIA GeForce GTX 1060 6 GB GPU. The total running time of the classical, sentimental, and CNN-LSTM models for each brand was about 1, 2, and 45 min, respectively. The CNN-LSTM model takes longer because of the time needed to train it. Once the model is trained, the time required for subsequent testing and actual prediction is not much different from that of the other two models. Although the CNN-LSTM model takes longer, it delivers better prediction performance, which makes it worthwhile overall.

Conclusions
With the explosive growth of social media and emerging forecasting methods, the efforts to improve the performance of car sales forecasting should consider the adoption of eWOM, online sentiment data, and some deep learning techniques. The purpose of this research was to improve the overall performance of forecasting of car sales movement directions in Taiwan by using online sentiment data and the CNN-LSTM method. This research selected the car sales movement direction as the predicted object, and it was defined by the sales growth rate of the next month and the predefined threshold. Therefore, in practical application, if a threshold is set at 10% and the car sales movement direction predicted by the CNN-LSTM model is up (U) or down (D), the car company can adjust the production and sales of the next month upward or downward by 10%. To verify the effects of online sentiment data and the CNN-LSTM method on the forecasting of car sales movement directions in Taiwan, three forecasting models, namely, the classical model, the sentimental model, and the CNN-LSTM model, were constructed and compared.
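As an illustration of how a movement direction can be derived from the sales growth rate and a threshold, consider the minimal sketch below. The function name, the symmetric threshold, the strict inequalities at the boundaries, and the use of F for changes within the threshold are all assumptions made for this example; the exact labeling rule is defined earlier in the paper and may differ in these details.

```python
def movement_direction(current_sales, next_sales, threshold=0.10):
    """Label next month's movement as 'U', 'D', or 'F'.

    Assumed rule: growth above +threshold is U, growth below -threshold is D,
    and anything in between is F.
    """
    if current_sales <= 0:
        raise ValueError("current_sales must be positive")
    growth = (next_sales - current_sales) / current_sales
    if growth > threshold:
        return "U"
    if growth < -threshold:
        return "D"
    return "F"


# Hypothetical monthly sales volumes, for illustration only.
print(movement_direction(1000, 1150))  # growth of +15%
print(movement_direction(1000, 880))   # growth of -12%
print(movement_direction(1000, 1050))  # growth of +5%
```

Under these assumptions, the three calls are labeled U, D, and F, respectively.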
The results showed that the use of both online sentiment data and the CNN-LSTM method led to significant improvements over the classical model in accuracy (27.78 percentage points, from 41.67% to 69.45%), precision (0.39, from 0.38 to 0.77), recall (0.27, from 0.42 to 0.69), and F1-score (0.33, from 0.35 to 0.68). Hence, the overall performance of forecasting car sales movement directions in Taiwan can be effectively improved by the use of online sentiment data and the CNN-LSTM model.
Furthermore, because car companies use predictions of car sales movement directions to arrange their production and sales plans, the degree of precision directly affects both car inventories and sales, so it needs to be explored in more detail. The results showed that the model with the online sentiment data and the CNN-LSTM method improved Precision(U), Precision(D), and Precision(F) by 0.16 (from 0.57 to 0.73), 0.50 (from 0.34 to 0.84), and 0.72 (from 0.07 to 0.79), respectively.
In practice, if direction U is wrongly predicted as direction F or D, the planned production quantity will fall behind the market demand and may incur some opportunity cost, but the impact is not significant. Since loyal customers may simply delay their orders, there is little loss of sales volume or market share. In contrast, if direction D is wrongly predicted as direction F or U, the resulting cost will be high. Since the planned production quantity will exceed the market demand, automobile manufacturers may hold excess inventory, which can tie up capital or force sales at a reduced price. Therefore, improvements in Precision(U), Precision(F), and Precision(D) will be of great benefit to car companies in developing effective production and sales plans.
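The asymmetry described above can be expressed as a cost matrix, so that two models with the same accuracy are scored differently depending on which mistakes they make. The sketch below is only illustrative: the cost values are hypothetical numbers chosen to reflect the qualitative ordering in the text (mispredicting an actual D is expensive, mispredicting an actual U is comparatively cheap), and the function name and sample labels are likewise assumptions.

```python
# Hypothetical relative costs, indexed as COST[actual][predicted].
# The values are illustrative only and are not taken from the study.
COST = {
    "D": {"D": 0.0, "F": 3.0, "U": 5.0},  # overproduction: excess inventory
    "F": {"D": 1.0, "F": 0.0, "U": 2.0},
    "U": {"D": 1.0, "F": 0.5, "U": 0.0},  # underproduction: delayed orders
}


def average_cost(y_true, y_pred, cost=COST):
    """Mean misclassification cost per prediction under the given cost matrix."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one observation is required")
    return sum(cost[t][p] for t, p in zip(y_true, y_pred)) / len(y_true)


# Two made-up prediction sets with identical accuracy (5 of 6 correct).
y_true = ["D", "D", "U", "U", "F", "F"]
model_a = ["U", "D", "U", "U", "F", "F"]  # one actual D predicted as U
model_b = ["D", "D", "F", "U", "F", "F"]  # one actual U predicted as F

cost_a = average_cost(y_true, model_a)
cost_b = average_cost(y_true, model_b)
```

With these assumed costs, the single error of the first prediction set is ten times as expensive as that of the second, even though accuracy cannot distinguish the two.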
In addition, many other deep learning models have been developed, and some of them are applicable to this research. For example, the LSTM-vanilla model, which only adds a peephole connection to the classic LSTM, is similar to the LSTM used in the CNN-LSTM model of this research. In the future, the LSTM-vanilla model may be applied to the same forecasting task and compared with the CNN-LSTM model proposed in this research.