Research on the Purchase Intention of Electric Vehicles Based on Customer Evaluation and Personal Information

Abstract: With the continuous development of electric vehicle (EV) technology, there is an increasing need to analyze the factors influencing customers' purchase intentions. Using customers' vehicle experience evaluations and personal information, this paper develops influencing factor analysis models based on the analysis of variance (ANOVA) algorithm and the Kruskal–Wallis algorithm. A purchase intention model for EVs is then proposed using the random forest method. Finally, an optimization model for the EV sales plan is built. The results show that the main factors influencing customers' purchases differ across vehicle brands, but that the customer's evaluation of the vehicle experience has the greater influence overall. Compared with other prediction models, the random forest model has the highest accuracy. For the 3 EV brands, the prediction accuracies are 97.8%, 98.9%, and 97.6%. In addition, this paper predicts the purchase intentions of 15 customers. By optimizing the sales plans for the 3 EV brands, the predicted purchase rate of the 15 customers increased from 40% to 53%. The research work contributes to the sales of electric vehicles, the accurate positioning of customers, and the identification of more potential customers.


Introduction
With increasing global environmental problems [1], people's awareness of environmental protection [2] has strengthened, and battery technology continues to advance. New energy electric vehicles (EVs) have occupied part of the automotive market [3]. Compared with traditional fuel vehicles, EVs have the advantages of zero emissions, low noise, low maintenance costs, and intelligent functions [4]. How to further promote the adoption of EVs and increase their sales is currently the focus of EV enterprises.
At present, there are two main drivers of EV popularity. The first is government policy, such as emission regulations, regional restrictions, higher fuel taxes, and EV subsidies, which have been implemented in some cities [5]. Some governments in developed countries even plan to stop the sale of fuel vehicles in the next few years. The second is new vehicle technology, such as driverless technology, vehicle networking technology, and intelligent cockpits [6,7]. Compared with fuel vehicles, EVs are better suited to these new technologies, which helps them meet modern demands for vehicle functions. However, some factors also hinder the adoption of EVs, such as range anxiety, battery safety, and the availability of charging facilities [8].
To improve the adoption of EVs, many researchers have analyzed the factors influencing EV sales and established EV sales or purchase prediction models. Most studies focus on macro factors, such as national or regional policies, overall EV technology, average education level, and average income, and their sales or purchase models are only applicable to the EV industry as a whole [9]. However, there is little research on micro factors, such as customers' personal information and EV brands.
This paper analyzes the factors influencing EV purchases from a micro perspective. Based on customer information and evaluations for three EV brands, an influencing factor analysis model, a purchase prediction model, and a sales plan optimization model are established. The research can support the sales of different EV brands. The innovations in this paper are as follows: (1) Aimed at the micro-level purchase factors of EVs, two influencing factor analysis models are established, which determine the main factors influencing customers to purchase EVs from different brands. (2) Based on the analysis of the influencing factors, a purchase prediction model for EVs is established, which can quickly predict the purchase intention of customers. (3) Based on the above research, a sales plan optimization model is established so as to propose a better sales plan.
The remaining sections are organized as follows: Section 2 reviews current EV purchase prediction research; Section 3 describes the processing of the personal information and evaluation data of customers of three EV brands; Section 4 presents the analysis of the influencing factors of EV purchase, the purchase prediction model, and the sales plan optimization model; Section 5 is the result analysis; and Section 6 is the discussion.

Related Works
At present, the prediction research for EV purchases is mainly based on questionnaire surveys. Various influencing factors have been analyzed [10], and a variety of prediction models have been established. Descriptive statistics and principal component analysis were used to investigate the important factors of EV purchase intention in China, which showed that the intrinsic essence of the products and the cost were the most important, while the government policy was moderate [11]. Survey and statistical methods were applied to study the factors influencing the plug-in EV (PEV) purchase intention of adult drivers in the USA. They found that highly educated consumers were more interested in PEVs, and interest in PEVs is slightly higher than interest in EVs [12]. Based on an adapted stage model of self-regulated behavior change, a two-month follow-up survey was conducted among people interested in buying EVs. It was found that stage transitions were preceded by increases in goal intentions and implementation intentions in the week prior to the transition. Intent prediction was different for different people [13]. Following a survey in China and the US, consumer preferences for conventional vehicles, PEVs, and EVs were modeled. They found that EVs were less popular in the US than in China. Despite subsidies for EVs, US consumers prefer low-range PEVs to EVs. The adoption of EVs in China would be earlier than in the US [14]. A conceptual framework was developed to examine the influence of consumer innovativeness on EV preferences. The results indicated that adopters' innovativeness and attitudes toward the functional performance of EVs significantly affect EV preferences [15]. The structural equation model (SEM) was used to analyze the factors influencing EV purchase intentions in China. They pointed out that government policies have a great influence on EV purchase intentions [16]. The SEM was used to compare the effect of environmental performance, price value, and range
confidence on consumers' EV purchase intentions. The results showed that EV environmental performance was a stronger predictor of attitude and thus purchase intention than price value and range confidence [17]. The univariate time series model and multivariate time series model were proposed to predict the sales volume of EVs in China. They focused on short-term (12 months) and long-term (60 months) predictions, which showed relatively high accuracy [5]. PEV and EV purchase intentions in the US were studied, which showed that consumers preferred PEVs to EVs, mainly because of the range anxiety associated with EVs [18]. A choice experiment was conducted to evaluate whether personal carbon trading (PCT) influences individual decisions to adopt EVs. The results showed that PCT can effectively promote the adoption of EVs. Government subsidies played a more important role in EV purchase intention than free parking and vehicle tax exemption [19]. A model of EV purchase intention was proposed, taking into account a number of demographic characteristics and attitudinal factors. It was found that attitudinal factors such as network externalities, government subsidies, and vehicle performance, and demographic characteristics such as gender, age, and marital status, have a significant impact on EV purchase intention [20]. A personality-perception-intention framework was proposed to study consumers' EV adoption behavior. It was shown that consumer perception and personality play an important role in EV purchase intention; personal innovativeness and environmental concern also have a significant impact [21]. Correlation analysis and hierarchical multiple regression analyses were applied to study the socio-psychological effect on EV purchase intention. They found that low-carbon awareness has a slight moderating effect, while subjective norms and government policies have stronger effects [22]. An online questionnaire was conducted to investigate the variations and determinants of EV purchase
intentions in three countries. The results showed that Chinese citizens were more willing to purchase EVs than citizens of Brazil and Russia, mainly because of social networks and government policies [9]. Using big data and text mining technologies, Chinese consumers' preferences for EVs were investigated through their online behaviors, which found that EV prices, car classification, and powertrain have a great influence on consumers' EV selection [23]. Based on the structural equation model (SEM), the purchase intention model and post-satisfaction model of EVs in Japan were proposed. They found that environmental awareness had a direct effect on purchase intention and an indirect effect on post-purchase satisfaction [24]. Based on the theory of planned behavior (TPB), a questionnaire survey was conducted among potential consumers of EVs in Beijing. In addition, a structural equation model (SEM) was proposed to analyze the factors influencing EV purchase intention. The results showed that attitude, perceived behavioral control, cognitive status, product perception, and monetary incentive policy were important for consumers to purchase EVs [25]. The determinants of Chinese citizens' intentions to purchase EVs were studied in depth through an online survey. It was found that people with a wide social network and friends who already own EVs were more likely to purchase an EV. Age and education also had a limited effect [26]. Survey and statistical methods were used to investigate the changes in factors affecting PEV purchase intention over time. The results showed that the intention to purchase a PEV increased over time [27]. SEM was used to study customers' perceived value of EVs and found that the concept of "mianzi" has no significant effect on purchase intention, while the price factor has a direct effect [28]. The data mining method combined with deep learning technologies was used to study the purchasing reasons for EVs and found that EVs, demographic characteristics, and
national policies were the main reasons [29]. A partial least squares structural equation model (PLS-SEM) of purchase intention for electric two-wheelers was built. It was found that perceived economic benefits were the most important factor, and women were more inclined to purchase electric two-wheelers than men in India [30]. The SEM was used to estimate the effect of positive anticipated emotion (PAE), negative anticipated emotion (NAE), and moral norms with TPB on EV purchase intention in China. The results showed that PAE has the greatest effect on EV purchase intention, followed by attitude, NAE, and perceived behavioral control (PBC) [31]. To analyze consumers' stated preferences, a rank-ordered logit model was constructed to assess the effect of providing 5-year fuel cost and total cost of ownership information on stated EV preferences. The results indicated that providing information can increase consumers' EV purchase intentions [32].
To summarize, the current research mainly analyzes a variety of EV purchase factors, including national and regional policies, people's cognition, EV prices, etc., and establishes a variety of EV purchase and sales forecasting models. The current research has macro-guiding significance for government policy formulation and EV industry development. However, there is little research on customer personal factors and EV brand factors. This kind of analysis can help EV companies accurately target potential customers, provide accurate services, and optimize sales plans. In this study, customer personal factors and EV brand factors are analyzed, and the influencing factor analysis model, purchase prediction model, and sales plan optimization model are established.

Data Processing
In this paper, the research data come from three EV brands in China [33]: Brand 1 is a joint venture brand; Brand 2 is an independent brand; and Brand 3 is a new brand. The total data include data A (customer experience score data) and data B (customer personal information data), as shown in Table 1. Because the original data contain anomalies, they must be processed: the anomalous data are removed, and the blank data are filled in.

Customer Experience Score Data (Data A)
Data A consists of customer experience scores from 1600 customers for 3 EV brands. The lowest possible score is 0, and the highest is 100. An examination of data A showed that some scores given by customers fall outside this scoring interval. In this paper, the box plot method is used to examine data A and eliminate unreasonable data. The results are shown in Figure 1. In Figure 1, abnormal data are found in columns A1, A3, A5, and A7. Data greater than 100 or less than 0 were then replaced by a modified value yimod obtained by linear interpolation, where xi is the serial number corresponding to data Ai, arranged in ascending order, and yi is the abnormal score corresponding to xi. After the data of all customers are corrected in this way, a cleaned data set is obtained.
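The replacement step can be illustrated with a minimal Python sketch. It assumes that each out-of-range score is replaced by interpolating linearly between the nearest valid scores on either side, with the list position standing in for the serial number xi; the function name and the handling of scores at the ends of the list are illustrative assumptions, not details given in the paper.

```python
def interpolate_invalid(scores, low=0.0, high=100.0):
    """Replace scores outside [low, high] by linear interpolation.

    The list index plays the role of the serial number x_i (assumption).
    Invalid scores at either end copy the nearest valid score (assumption).
    """
    valid = [i for i, s in enumerate(scores) if low <= s <= high]
    if not valid:
        raise ValueError("no valid scores to interpolate from")
    fixed = list(scores)
    for i, s in enumerate(scores):
        if low <= s <= high:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            fixed[i] = scores[right]
        elif right is None:
            fixed[i] = scores[left]
        else:
            # Straight line between the two nearest valid neighbours.
            t = (i - left) / (right - left)
            fixed[i] = scores[left] + t * (scores[right] - scores[left])
    return fixed


if __name__ == "__main__":
    print(interpolate_invalid([80, 120, 90]))
```

Only neighbouring valid scores are used, so one abnormal entry never influences the replacement of another.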

Customer Personal Information Data (Data B)
Data B has some missing data and logical errors; therefore, a logical analysis is carried out. Data groups with logical relationships in data B include groups B4 and B8; groups B13, B15, B16, and B17; and groups B5, B6, and B7, where bi is the value of the i-th index in data B.
For groups B4 and B8, the customer cannot obtain a driver's license until adulthood. Therefore, the driving years b4 should be less than or equal to the number of years elapsed since the customer reached adulthood, as computed from the birth year b8.
For groups B13, B15, B16, and B17, the criterion is that income must be greater than or equal to expenses (Equation (3)). Values of b15 that do not satisfy Equation (3) must be excluded. The excluded data are filled in using the family annual income rate and the individual annual income rate calculated from the normal data, where b13 is the total family annual income and b14 is the total individual annual income. K is the magnification ratio between the b13 data and the b14 data. By looping over all records, the average ratio K of all normal data is obtained, where m is the total number of normal records. Combining K with the personal income and the total family income, the replacement for each abnormal value of b15 is calculated.
For groups B5, B6, and B7, the definitions of their values are shown in Table 2. Assuming that customers do not have children outside of marriage, the logical relationship between family size (B5), marital status (B6), and the number of children (B7) is used as the criterion, with B7 set as the variable. Combined with Table 2, the values of B5, B6, and B7 are adjusted, as shown in Table 3.
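As an illustration of the first consistency rule (driving years versus birth year), the following Python sketch flags records that violate it. The adulthood age of 18, the survey year, and the dictionary keys "b4" and "b8" are assumptions made for the example; the paper does not state them.

```python
def driving_years_consistent(driving_years, birth_year, survey_year, adult_age=18):
    """True if the driving years do not exceed the years since adulthood.

    adult_age and survey_year are assumed values, not taken from the paper.
    """
    years_since_adult = survey_year - birth_year - adult_age
    return 0 <= driving_years <= years_since_adult


def split_records(records, survey_year, adult_age=18):
    """Separate records into (consistent, inconsistent) lists.

    Each record is a dict with hypothetical keys "b4" (driving years)
    and "b8" (birth year).
    """
    ok, bad = [], []
    for rec in records:
        if driving_years_consistent(rec["b4"], rec["b8"], survey_year, adult_age):
            ok.append(rec)
        else:
            bad.append(rec)
    return ok, bad


if __name__ == "__main__":
    sample = [{"b4": 5, "b8": 1990}, {"b4": 20, "b8": 1995}]
    print(split_records(sample, survey_year=2021))
```

Records in the second list would then be corrected or excluded according to the rules described above.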

Methods
After the customer data processing is completed, the influencing factors of EV purchase are analyzed to determine the important ones. Then, based on the important factors, the probability prediction model of the customer's EV purchase is established. Finally, the sales plan optimization model is established. The flow chart of these models is shown in Figure 2. In Figure 2, the analysis of variance (ANOVA) algorithm (F model) and the Kruskal-Wallis algorithm (H model) are used to analyze the influencing factors, and the important factors for the purchase of EVs of each brand are determined. The EV purchase prediction model adopts the random forest method, which uses 70% of the data for training and 30% of the data for verification. Based on the purchase prediction model, the sales plan optimization model is established, which can provide better sales plans for different EV brands.

ANOVA Analysis Model
Based on the ANOVA algorithm [34], the influencing factor analysis model (F model) is established:
$$F_r = \frac{MSB}{MSE}$$
where Fr is the ratio of the inter-group variance to the intra-group variance. MSB is the mean square between groups, that is, the sum of squares between groups (written here as SSB) divided by the inter-group degrees of freedom. MSE is the mean square error within groups, that is, the within-group sum of squares (written here as SSE) divided by the within-group degrees of freedom:
$$MSB = \frac{SSB}{df_B}, \qquad MSE = \frac{SSE}{df_E}$$
where dfB is the number of inter-group degrees of freedom, calculated as the total number of groups minus 1, and dfE is the number of degrees of freedom for the error within groups, i.e., the total number of observations minus the number of groups.
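The F statistic defined by these quantities can be computed directly. The following is a minimal Python sketch of the standard one-way ANOVA calculation; the function name and the example groups are hypothetical and are not taken from the paper's data.

```python
def anova_f(groups):
    """One-way ANOVA F statistic: F = MSB / MSE."""
    k = len(groups)                              # number of groups
    n_total = sum(len(g) for g in groups)        # total observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Sum of squares between groups and within groups.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_b = k - 1            # inter-group degrees of freedom
    df_e = n_total - k      # within-group degrees of freedom
    return (ssb / df_b) / (sse / df_e)


if __name__ == "__main__":
    # Three hypothetical score groups with means 2, 3, and 4.
    print(anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```

For the example groups, SSB = 6 with 2 degrees of freedom and SSE = 6 with 6 degrees of freedom, so MSB = 3, MSE = 1, and F = 3.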

Kruskal-Wallis Analysis Model
In addition to the above model, this paper also establishes the Kruskal-Wallis analysis model [35] (H model). Assuming that there are k independent groups of score data, the total number of samples N is as follows:
$$N = \sum_{q=1}^{k} n_q$$
where nq is the number of samples in group q. All the collected data are pooled and ranked in ascending order, and the statistic Hr is used to test whether the population medians of the different groups differ significantly:
$$H_r = \frac{12}{N(N+1)} \sum_{q=1}^{k} \frac{R_q^2}{n_q} - 3(N+1)$$
where q is the group index and Rq is the rank sum of the nq observations of group q.
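The H statistic can be sketched in a few lines of Python. This version uses the standard Kruskal-Wallis formula with average ranks for tied values and no tie-correction factor, which is an assumption made for simplicity; the function name and the example data are hypothetical.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (average ranks for ties, no tie correction)."""
    # Pool all observations, remembering which group each came from.
    pooled = sorted((x, gi) for gi, g in enumerate(groups) for x in g)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)

    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j; give each the average rank.
        avg_rank = (i + 1 + j) / 2
        for _, gi in pooled[i:j]:
            rank_sums[gi] += avg_rank
        i = j

    total = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


if __name__ == "__main__":
    print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

In the example, the rank sums are 6, 15, and 24 with N = 9, giving H = (12/90) × 279 − 30 = 7.2.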

Random Forest Purchase Prediction Model
Based on the above analysis results, the random forest method [36] is adopted to establish a car purchase prediction model, and the algorithm flow is shown in Figure 3.
When u→∞, the probability that a given sample of S is absent from the bootstrap set S* tends to the following limit:
$$\lim_{u \to \infty}\left(1 - \frac{1}{u}\right)^{u} = \frac{1}{e} \approx 0.368$$
Although the total number of samples in S* is equal to the total number of samples in S, S* may contain duplicate samples. When duplicate samples are removed, S* contains only about 63.2% of the samples in S. Then, using the exhaustive method [37], all values of each feature in the sample are traversed to find the best segmentation variable and segmentation point. The quality of the segmentation variables and segmentation points is measured by the impurity of the nodes after segmentation, which is the weighted sum G(ro, tcd) of the impurities of the child nodes, where ro is the splitting variable, tcd is the split value of ro, y'bag(xe) is the impurity of the node, and αl, αr, and Ns are the number of training samples of the left sub-node after segmentation, the number of training samples of the right sub-node, and the number of all training samples of the current node, respectively. xl and xr are the sets of training samples of the left and right sub-nodes, and M is the number of decision trees. After the best segmentation variables and segmentation points in each decision tree are calculated, the training set and test set can be fed into the model for calculation. Finally, a prediction of the purchase can be obtained. For ease of description, f(yj) is used to represent the entire mapping between the input and the output of the random forest method. Therefore, the probability that customers will purchase EVs can be predicted as follows:
$$P_{forecast} = f(y_j)$$
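The bootstrap limit and the split criterion described in this section can be checked numerically. The following minimal Python sketch assumes Gini impurity as the node impurity measure and candidate split points at the midpoints between adjacent feature values; the paper does not specify either choice, and all function names are illustrative.

```python
import math


def prob_not_drawn(u):
    """Probability that a given sample is never drawn in u draws with replacement."""
    return (1 - 1 / u) ** u


def gini(labels):
    """Gini impurity of a list of class labels (assumed impurity measure)."""
    n = len(labels)
    return 1 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def weighted_impurity(left, right):
    """Impurity of a split: child impurities weighted by child sample counts."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_split(values, labels):
    """Exhaustive search for the threshold with the lowest weighted impurity."""
    best = None
    points = sorted(set(values))
    for a, b in zip(points, points[1:]):
        t = (a + b) / 2  # midpoint candidate (assumption)
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = weighted_impurity(left, right)
        if best is None or score < best[1]:
            best = (t, score)
    return best


if __name__ == "__main__":
    print(prob_not_drawn(10 ** 6), 1 / math.e)
    print(best_split([1, 2, 3, 4], [0, 0, 1, 1]))
```

For large u the first value approaches 1/e, so about 63.2% of the distinct samples in S appear in S*, in agreement with the figure quoted above.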

Single Objective Linear Optimization Model of the Sales Plan
Based on the above analysis, the customer experience score data (data A) have a significant impact on customer purchase intention. Usually, the service cost of EV sales companies is directly related to data A. Increasing the service cost can improve the scores in data A, thereby increasing customer purchase intention. However, it is costly and difficult for automobile sales companies to improve all data A scores comprehensively. Therefore, based on the above prediction model and existing data, an optimization model for short-term sales plans is proposed. The goal is to slightly improve the scores of some items in data A, i.e., to increase the customer's purchase intention with less service cost. For the sake of simplicity, the following assumptions are made.
(1) The customer experience score data are assumed to be equivalent to the service cost of the automobile sales company, so increasing a customer experience score means increasing the service cost. (2) Considering short-term improvement, it is assumed that the scores of data A increase in integer steps, with a maximum increase of 5 for each item and a maximum score of 100.
Based on the purchase prediction model and the above assumptions, a single objective optimization model is established with the goal of minimizing the increase in data A [38]. The model is constrained by requiring the customer purchase probability to reach 0.5. In this way, more purchasing customers can be attracted with less investment in service costs.
(1) Objective function. Since the difficulty of improving a service score grows with the size of the improvement, the aim is to find the smallest increase in scores for which the predicted purchase is achieved. To reduce the service difficulty as much as possible, the satisfaction scores should increase as little as possible, so the objective is as follows:
$$\min \sum_{j=1}^{8} v_j$$
where vj is the increase in the j-th service score and j ∈ [1, 8].
(2) Constraint conditions. Since the maximum satisfaction score is 100, no increased satisfaction score may exceed 100:
$$y_j + v_j \le 100$$
where yj is the initial satisfaction score.
Here vj is the increase applied to the j-th satisfaction score of brand i. For the sake of optimization, vj is restricted to integers. It can only be applied to the satisfaction scores in data A, and its maximum is set to 5:
$$0 \le v_j \le 5, \quad v_j \in \mathbb{Z}$$
According to Equation (15), the probability of the customer purchasing an EV after the increase is Pforecast = f(yj + vj). If Pforecast is at least 0.5, the customer is predicted to buy the vehicle, and if Pforecast is less than 0.5, the customer is predicted not to buy, so the constraint is as follows:
$$P_{forecast} = f(y_j + v_j) \ge 0.5$$
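The optimization described in this section can be sketched as an exhaustive search over the integer increases. The Python sketch below is illustrative only: `predict_proba` stands in for the trained random forest f, the toy predictor in the usage example is hypothetical, and brute-force enumeration is one possible solution strategy rather than the solver used in the paper.

```python
from itertools import product


def minimal_score_increase(scores, predict_proba, max_step=5,
                           max_score=100, threshold=0.5):
    """Find integer increases v_j minimising sum(v_j) subject to
    y_j + v_j <= max_score, 0 <= v_j <= max_step, and
    predict_proba(y + v) >= threshold. Returns None if infeasible.

    Scores are assumed to be integers.
    """
    # Allowed increases per item, capped so no score exceeds max_score.
    ranges = [range(0, min(max_step, max_score - y) + 1) for y in scores]
    best = None
    for v in product(*ranges):
        # Skip candidates that cannot improve on the best total so far.
        if best is not None and sum(v) >= sum(best):
            continue
        new_scores = [y + dv for y, dv in zip(scores, v)]
        if predict_proba(new_scores) >= threshold:
            best = v
    return best


if __name__ == "__main__":
    # Hypothetical predictor: purchase probability depends on the first score only.
    def toy(s):
        return s[0] / 100

    print(minimal_score_increase([47, 99, 100], toy))
```

With 8 items and increases of 0 to 5, the search space has 6^8 (about 1.7 million) combinations, so enumeration is feasible for a single customer, although a dedicated integer programming method would scale better.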

Results and Discussion
According to the above models, the influencing factors of EV purchase are analyzed, customer purchase intention is predicted, and the sales plan is optimized.

Analysis of Influencing Factors
When the F model is used, the tested data should conform to a normal distribution. Therefore, the built-in normplot function of MATLAB is used to check the data groups, as shown in Figure 4. The results show that all the data groups used in this paper are close to the standard normal distribution, which means that all the data groups conform to the normal distribution. As shown in Figure 5, the analysis results of the influencing factors of the F and H models are basically consistent and mutually verified. Overall, for the three brands, data A has a greater influence, while data B has a lesser influence, which means that the customer experience score has a great influence on EV purchase. In addition, B5, B7, and B8 (family size, number of children, and year of birth) have little impact. However, there are certain differences in the influencing factors of different brands; some indexes, including B17, have a greater influence on the purchase of Brand 1 and Brand 2 and a relatively small influence on Brand 3. For further analysis, based on the results of the F model with 4% as the benchmark, the main influencing factors are screened out according to the proportion shown in Table 4. It can be seen from Table 4 that the scores of battery technical performance, comfort, economy, safety performance, and power performance in data A have a great influence on the purchase of all three brands. For Brand 1 (joint venture brand), customers are more concerned about the vehicle economy (A3); for Brand 2 (independent brand), customers are more concerned about the battery technical performance (A1); for Brand 3 (new power brand), customers are more concerned about the vehicle comfort performance (A2). In addition, car loans and mortgage loans have more influence on Brand 1 and less influence on Brand 3.

EV Purchase Prediction
The main influencing factors screened above are input into the purchase prediction model, in which 70% of the data is used as the training set and 30% of the data is used as the test set. For the results, a probability in [0, 0.5) means "No", and a probability in [0.5, 1] means "Buy". The prediction error varies with the number of decision trees in the model, as shown in Figure 6. It can be seen that the overall error decreases as the number of decision trees increases. When the number of decision trees is greater than 40, the error curve flattens. In this paper, the number of decision trees in the model is set to 50, with an error of 0.04. The prediction accuracy of the training set and the test set is shown in Table 5. It can be seen that the accuracy of the purchase prediction model is high. The lowest accuracy of the training set of the three brands is 98.9%; the highest is 99.4%. The accuracy of the test set is 97.8%, 98.9%, and 97.6%, respectively. The accuracy of the purchase model is related to the amount of data. Brand 2 has the most data and Brand 1 has the least, so the accuracy of Brand 2 is higher than that of the other two brands. Overall, the maximum error of the purchase prediction model for the three brands is not more than 2.4%, which meets the research requirements. To further verify the random forest purchase prediction model, this paper also compares it with the decision tree prediction model, the BP neural network prediction model, the SVM prediction model, and the KNN prediction model. Taking Brand 1 as the research object, the comparison of model results is shown in Figure 7. As shown in the figure, the random forest prediction model has the highest accuracy compared to the other models. According to the optimization plan, the customer's purchase probability is recalculated, and the results are shown in Table 8. It can be seen that after the optimization plan, the purchase probability of customers has increased, with the purchase probability
of customer 10 increasing from 0.275 to 0.52 and that of customer 15 increasing from 0.459 to 0.52. The total number of customers purchasing has increased from 6 to 8 out of 15, giving a purchase rate of 53%, an increase of 13 percentage points over the previous 40%. Based on the above research, it can be found that different EV companies have different focuses in the sales process. If the key points of the sales process can be found, it will be beneficial to attract more potential customers, provide customers with more detailed and accurate services, improve customer satisfaction, and thereby increase car sales. For example, for the sales of Brand 1, Table 6 suggests improving the scores of A1, A3, and A5; focusing on introducing the battery performance, economy, and power of the company's EVs to customers during the sales process will significantly increase customer purchase intention and satisfaction. Similarly, for the sales of Brand 2, focusing on introducing the battery performance and economy of the company's EVs to customers during the sales process will significantly increase their purchase intention and satisfaction. For the sales of Brand 3, the focus during the sales process should be on introducing the comfort advantages of the company's EVs to customers, which will significantly increase customer purchase intention and satisfaction. In summary, for the above three EV brands or other EV companies, using the model established in this paper for sales strategy planning can, to some extent, increase the number of EV customers and the sales volume.

Conclusions
Based on the customer experience score and personal information data, this paper establishes a complete process of data processing, influencing factor analysis, customer purchase prediction, and sales plan optimization to further reveal the relationship between EVs and customers. The specific conclusions are as follows: (1) According to the logical relationships, the original data are processed, and the abnormal data are eliminated. The missing data are filled in by linear interpolation, average magnification, and other methods, and the reliability of the data is greatly improved. (2) The F and H models are used to analyze the influencing factors in the data. It was found that the results of the two test methods are consistent and mutually verified. For different EV brands, the main influencing factors of customer purchase are different, but on the whole, the customer experience score has a greater impact on EV purchase. (3) Based on the main influencing factors tested above, a customer purchase prediction model is established using the random forest method. The results show that the model is more accurate than other purchase prediction models, with the accuracy for all three brands exceeding 97%. In addition, the purchase intentions of 15 customers are predicted, and 6 customers are predicted to buy EVs. (4) Based on the customer prediction model, the single objective linear optimization model of the sales plan is established. The results showed that, with the optimized sales plan, the predicted purchase rate of test customers increased from 40% to 53%.
The model established in this paper uses F-tests and H-tests to extract the important factors affecting automobile sales, and a random forest model is established to predict customer purchases. At the same time, according to the prediction model, the optimization model of the sales plan is established. It can predict and optimize automobile sales in a more intelligent way and provide a more scientific sales model for automobile sales. The research work in this paper can theoretically help EV companies attract more potential customers and provide accurate services.
However, the data used in this paper have certain limitations, chiefly uneven distributions. The amount of data collected from each company is uneven, with sufficient data for Brands 1 and 2 and poor data for Brand 3. The distribution of purchase outcomes among customers is also uneven: most customers did not make a purchase and only a small number did, resulting in a significant difference between the amounts of purchase and non-purchase data. In some cases, this may affect the prediction accuracy. At the same time, the model used in this paper can be extended in the future: the prediction model can be applied to the sales of other products, such as housing and electronic products, to find the best sales strategy. In the future, richer data will be collected, and the model established in this paper will be improved to achieve higher accuracy.

Figure 1. Data A filtered by the box plot method. Red "+" in the figure represents the satisfaction scores in data A.

Figure 2. Flow chart of the methods.

Figure 3. Flow chart of the random forest algorithm.
Taking 70% of the major factor data selected above as the training set and 30% as the test set, the bootstrap sampling model of the random forest is established. The specific steps are as follows: Assume the set S contains u different samples (x1, x2, …, xu). If a sample is drawn from S with replacement a total of u times to form a new set S*, then the probability that S* does not contain sample xs (s = 1, 2, …, u) is (1 − 1/u)^u.

Figure 4. Data group 25 normal distribution probability graph. Blue "+" in the figure represent the probability of values in data A7. The dotted red line reflects the degree to which the data conform to a normal distribution.
The F and H models were used to analyze the influencing factors of the customer data for the three brands, and the results are shown in Figure 5.

Figure 6. Influence of decision tree number on error.

Table 1. Implications of customer data indexes.

Table 4. Main influencing factors of car purchases.

Table 8. Comparison of customer purchase probability before and after optimization.