Prediction of China Automobile Market Evolution Based on Univariate and Multivariate Perspectives

The automobile is an important part of transportation systems. Accurate prediction of the sales prospects of vehicles with different power types can provide an important reference for scientific national decision making, flexible operation of enterprises and rational purchases by consumers. Considering that China has achieved its 2025 goal of a 20% sales share for new energy vehicles ahead of schedule, and in order to accurately judge the future competition pattern between new and old kinetic energy vehicles, the automobile market is divided into three types according to power types: traditional fuel vehicles, new energy vehicles and plug-in hybrid vehicles. Firstly, based on the monthly sales data of automobiles from March 2016 to March 2023, the prediction effects of multiple models are compared from the perspective of univariate prediction. Secondly, from the perspective of multivariate prediction, combined with the data of economic, social and technical factors, a multivariate prediction model with high prediction accuracy is selected. On this basis, the sales volume of various power vehicles from April 2023 to December 2025 is predicted. Univariate prediction results show that in 2025, the penetration rates of the three types of vehicles will reach 43.8%, 44.4% and 11.8%, respectively, and multivariate prediction results show that the penetration rates will reach 51.0%, 37.9% and 11.1%, respectively.


Background and Motivation
China's automotive industry is in a new stage of energy and technological innovation, and the era of multi-power with traditional fuel vehicles, battery electric vehicles and plug-in hybrid vehicles as the mainstay has begun. According to the data of the China Association of Automobile Manufacturers, in 2022, the operating income of China's automobile manufacturing industry reached 9289.99 billion CNY, up 6.8% year-on-year, accounting for 6.7% of the total operating income of industrial enterprises above the designated size. According to the statistics of the China Association of Automobile Manufacturers, in the first quarter of 2023, China exported 1.07 million automobiles, surpassing Japan for the first time and becoming the largest automobile exporter in the world. As a pillar industry, the automobile industry plays a key role in accelerating industrialization, promoting manufacturing innovation, stimulating domestic demand, increasing employment and promoting economic growth.
Under the dual pressure of environmental protection and energy shortages, developing new energy vehicles is a common strategic choice for all countries in the world. In 2022, the State Council, China, issued the Twelfth Five-Year National Strategic Emerging Industries Development Plan, which listed the new energy automobile industry as one of the seven strategic emerging industries in China. On 28 March 2023, the agreement of EU member states to ban the sale of new fossil fuel vehicles from 2035 was officially approved. In 2021, US President Biden announced an order that by 2030, the sales of new energy vehicles will account for half of the national car sales.
In 2022, Xu Changming, deputy director of the National Information Center, said that traditional fuel vehicles and new energy vehicles have their own advantages, and both have certain development space in the foreseeable future. In 2023, at the monthly meeting of the National Passenger Car Market Information Association, Cui Dongshu, Secretary-General of the Association, said that the development of new energy and the development of fuel vehicles are not simply antagonistic. While rationally arranging the new energy automobile industry, it is necessary to avoid the sudden drop in demand for traditional automobile products and ensure the steady transformation, upgrading and sustainable development of the industry.
The competition pattern of the automobile industry is influenced by many factors [1,2], such as the economic environment, technological progress, consumer demand, environmental laws and regulations, etc. The market demand and competition pattern of various power vehicles in the future are still uncertain. Under the condition of limited data, it is of great significance for the production planning of automobile enterprises and the scientific decision making of the government to effectively capture the historical sales laws of various types of automobiles in the automobile sales system, analyze the dynamic relationship among various influencing factors and various types of automobile sales, and then accurately predict the sales trend of various power types of automobiles.

Literature Review

Influencing Factors of Vehicle Sales
The sales of traditional cars and new energy vehicles are affected by many factors. Government policy is an important factor affecting the development of the automobile industry, and it plays a key role in stabilizing consumption and releasing demand. In all stages of the development of new energy vehicles, government support has played an important role in their large-scale promotion [3]. Socioeconomically, the employment rate of residents and the consumer price level are usually considered important factors affecting automobile sales. In addition, price is one of the most widely studied socioeconomic factors affecting automobile sales, and most consumers are sensitive to price [4]. Subsidies can reduce the price of new energy vehicles, thus increasing consumers' willingness to buy [5]. In terms of technology, the progress and innovation of automobile technology are directly related to the use cost, cruising range, energy consumption and other characteristics of automobiles, and are key factors affecting the promotion and popularization of automobiles [6]. There is a positive correlation between technological innovation and the adoption of electric vehicles [7]. Supporting infrastructure is directly related to whether consumers can enjoy convenient services after buying a car, so it is considered to have an impact on automobile promotion and sales. Studies have shown that well-developed charging facilities for new energy vehicles help to enhance consumers' willingness to buy [8]. However, Lin and Wu [3] argue that in the case of short-distance travel, consumers mostly choose home charging, and the coverage of charging infrastructure has limited influence on purchase intention. Some studies focus on the influence of consumer psychological factors on the sales of new energy vehicles. Yang, et al. [9] show that the comfort, handling, space and cost performance of electric vehicles have a significant impact on their sales. Wang, et al. [10] show that perceived risk and environmental awareness have a significant impact on the acceptance of electric vehicles. Some studies regard the energy price as one of the driving forces of automobile promotion and sales [11,12]. For example, an increase in the gasoline price is beneficial to the promotion of electric vehicles [13], but some studies show that the energy price has little influence on the promotion of electric vehicles [14].

Prediction Models of Vehicle Sales Volume
Statistical models are widely used in automobile sales forecasting. Li, et al. [15], based on the improved Bass model and grey theory, constructed the demand forecasting model of new energy vehicles, and its effectiveness is verified by using three data sets from Norway, France and Europe. Hsieh, et al. [16] developed a Monte Carlo model based on historical data. Kumar, et al. [17] used Gompertz, Logistic, Bass and Generalized Bass models to simulate the future demand for electric vehicles, and determined the best-fitting model for 20 major countries.
However, some studies argue that traditional statistical models cannot effectively extract the nonlinear characteristics of automobile sales data [18]. Therefore, some scholars have improved the traditional forecasting methods. Zhou, et al. [19] put forward a new time-varying grey Bernoulli model, which better captured the nonlinear, complex and time-varying characteristics of electric vehicle sales. Pei and Li [20] established a nonlinear grey Bernoulli model based on data grouping, and predicted that the sales of new energy vehicles would exceed 2 million in 2020. Liu, et al. [1] point out that by 2025, the sales of new energy vehicles in China will reach 8.84 million. Li, et al. [21], after improving the parameters of the Bass model and the LV model, showed that both have a better forecasting effect on the sales of battery electric vehicles in China, with the Bass model being more accurate.
Some scholars also apply artificial intelligence algorithms to the problem of automobile sales forecasting. Zhang, et al. [22], using the LSTM algorithm, established a smart car sales forecasting model based on KOL network public opinion and the network search index, which improves the forecasting accuracy of smart car sales. Liu, et al. [23] proposed a multi-factor sales forecasting model combining the discrete wavelet transform and BiLSTM, and obtained MAE, MAPE and RMSE values of 0.811, 5.671 and 1.001, respectively, for the optimal DWT-BiLSTM model. Xia, et al. [24] used the XGBoost prediction algorithm for automobile sales prediction and achieved high prediction accuracy in a short runtime. Wu and Chen [25] combined principal component analysis and a neural network to predict the sales volume and growth rate of electric vehicles, and pointed out that the sales volume of electric vehicles in the world and in China will continue to increase in the next 50 years, but the growth rate will continue to decline. With the development of machine learning technology, sentiment analysis has also been applied to the problem of automobile sales forecasting. Liu, et al. [26] proposed a combined forecasting model based on multi-angle feature extraction and sentiment analysis, which improved the forecasting accuracy of automobile sales. Ding, et al. [27] used an online comment-driven combination forecasting model to improve the forecasting accuracy of new energy vehicle sales.
In addition to the innovation of methods, some studies focus on the competition and substitution between traditional fuel vehicles and new energy vehicles. Sun and Wang [28] predicted the market evolution of traditional fuel vehicles and new energy vehicles in China based on the Lotka-Volterra model and the system dynamics (SD) model. Guo, et al. [29] divided the passenger car market into four types (gasoline, natural gas, battery electric and plug-in hybrid passenger cars) and predicted the evolution trend of the passenger car market based on the Lotka-Volterra model and historical data of automobile sales.
There is a lot of literature on the diffusion trend of automobile sales; however, few publications study the dynamic change trend of automobile sales with competition based on the perspective of prediction. On the basis of related research, this paper analyzes the diffusion of different power types of vehicles in 2025 from the perspectives of univariate prediction and multivariate prediction.

Contribution and Organization
The innovation and contribution of this paper are mainly reflected in two aspects. Firstly, the automobile industry is divided into three types, according to the power type: traditional fuel vehicles (ICEs), battery electric vehicles (BEVs) and plug-in hybrid vehicles (PHEVs). Based on the monthly sales data, from the univariate and multivariate perspectives, the prediction effects of various statistical models and machine learning models for automobile sales are compared, and high prediction accuracy is obtained. Secondly, in the multivariate forecasting part, the VAR model and BP neural network model are innovatively combined to analyze the lag effect of 17 influencing factors from economic, social and technical aspects on automobile sales, and a multivariate BP neural network forecasting model with lag characteristics is developed. Figure 1 shows the framework of this study.
The vehicle is one of the basic components of transportation systems, and the reform of the automobile industry will profoundly affect the reconstruction of transportation systems. Considering that the monthly sales of traditional fuel vehicles and new energy vehicles have certain cyclical and seasonal characteristics, firstly, the automobile industry is divided into traditional fuel vehicles, battery electric vehicles and plug-in hybrid vehicles, according to power types. Secondly, based on the historical sales data of different power types from March 2016 to March 2023, the capturing effects of three univariate forecasting models, namely, the BP neural network (BPNN) model, quadratic exponential smoothing (QES) model and Prophet model, on the trends and laws of sales changes are compared. The univariate forecast results of automobile sales of three power types in 2025 are obtained by selecting the model with better overall performance.
In addition, the multivariate prediction model can effectively identify the dynamic coupling relationship between the target variables and related influencing factors. Therefore, the influencing factors of automobile sales are selected from the aspects of economy, society and technology. Based on the automobile sales data and influencing factor data from March 2016 to March 2023, a vector autoregressive (VAR) model is established; the lag order of the influencing factors is analyzed; the input samples of the multivariate prediction model are constructed; and the prediction effects of the VAR model, the support vector regression (SVM) model and the BP neural network model are compared.

Variable Selection and Data Description
Electric energy is one of the main power sources of new energy vehicles. The automobile market is divided into three categories: traditional fuel vehicles, battery electric vehicles and plug-in hybrid vehicles. Through the analysis of the influencing factors of automobile sales in Section 1.2.1, and considering the availability of data, 17 influencing factors of traditional fuel vehicles, battery electric vehicles and plug-in hybrid vehicles are selected from the aspects of economy, society and technology, as shown in Table 1. The statistical data cover the period from March 2016 to March 2023, with a total of 85 groups of data. The historical sales data of the three types of cars are shown in Figure 2. Among them, automobile sales data come from the official website of the China Association of Automobile Manufacturers, average automobile price data come from the car home website, patent data come from the China National Intellectual Property Administration website, customs import and export data come from the Oriental Fortune Network, and other data come from the National Bureau of Statistics and the China Statistical Yearbook. Some missing data are filled in by linear interpolation.
Car companies and governments pay more attention to the monthly data of car sales, while the annual data can better reflect the general development trend of different power types of cars. In Figure 2, the sales history is divided into stages so that they can be distinguished more clearly. The overall sales of traditional fuel vehicles show an obvious downward trend, while the overall sales of battery electric vehicles and plug-in hybrid vehicles show an upward trend.

Quadratic Exponential Smoothing Model
The exponential smoothing model is one of the classic statistical models in time series prediction [30], among which the quadratic exponential smoothing model is suitable for the prediction of time series with a linear trend [31]. The formula of the quadratic exponential smoothing model is as follows:
$$S_t^{(1)} = \alpha x_t + (1 - \alpha) S_{t-1}^{(1)} \tag{1}$$
$$S_t^{(2)} = \alpha S_t^{(1)} + (1 - \alpha) S_{t-1}^{(2)} \tag{2}$$
$$a_t = 2 S_t^{(1)} - S_t^{(2)} \tag{3}$$
$$b_t = \frac{\alpha}{1 - \alpha}\left(S_t^{(1)} - S_t^{(2)}\right) \tag{4}$$
where $x_t$ is the observed value in period $t$; $S_t^{(1)}$ and $S_t^{(2)}$ represent the first exponential smoothing value and the second exponential smoothing value in period $t$, respectively; and $\alpha$ $(0 < \alpha < 1)$ is the smoothing coefficient.
According to Formula (5), the sales of traditional fuel vehicles, battery electric vehicles and plug-in hybrid vehicles from April 2023 to March 2024 are predicted respectively:
$$F_{t+T} = a_t + b_t T \tag{5}$$
where $F_{t+T}$ represents the predicted value of period $t + T$, and $T$ represents the number of predicted periods. $a_t$ and $b_t$ are the model coefficients of period $t$.
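The smoothing recursion and the linear extrapolation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' MATLAB implementation: the function name is hypothetical, the observed value is written as `x`, and initializing both smoothed values to the first observation is an assumption, since the text does not state the initialization.

```python
def quadratic_exponential_smoothing(series, alpha=0.3, horizon=1):
    """Brown's quadratic (double) exponential smoothing.

    Returns the one-step-ahead fitted values and `horizon` forecasts.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie in (0, 1)")
    # Assumed initialization: both smoothed values start at the first observation.
    s1 = s2 = series[0]
    a = b = 0.0
    fitted = []
    for x in series:
        s1 = alpha * x + (1 - alpha) * s1       # first smoothing value
        s2 = alpha * s1 + (1 - alpha) * s2      # second smoothing value
        a = 2 * s1 - s2                         # level coefficient a_t
        b = alpha / (1 - alpha) * (s1 - s2)     # trend coefficient b_t
        fitted.append(a + b)                    # one-step-ahead value F_{t+1}
    # Extrapolate T periods ahead from the last period: F_{t+T} = a_t + b_t * T.
    forecasts = [a + b * T for T in range(1, horizon + 1)]
    return fitted, forecasts


# Usage on a toy upward-trending series (placeholder numbers, not sales data).
toy_fitted, toy_forecasts = quadratic_exponential_smoothing(
    [float(v) for v in range(10)], alpha=0.3, horizon=3
)
```

Because the forecast is a straight line in T, the method cannot bend with a turning point in the data, which is worth keeping in mind for long horizons.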

Prophet Model
The Prophet model is a time series forecasting model put forward by Facebook in 2017, which can effectively fit the trend and seasonal characteristics of the series, and can also deal with holiday factors [32]. The expression is shown in Formula (6):
$$y(t) = g(t) + s(t) + h(t) + \epsilon_t \tag{6}$$
where $g(t)$ represents the trend term for modeling the non-periodic change of sales of traditional fuel vehicles, battery electric vehicles and plug-in hybrid vehicles; $s(t)$ represents the seasonal term for modeling the periodic change of the time series; $h(t)$ represents holidays; and $\epsilon_t$ represents the error term.
There are two types of trend term $g(t)$: the nonlinear saturated growth model and the piecewise linear model. Because the automobile sales data did not show an obvious saturated growth trend, the piecewise linear model was chosen. The expression is as follows:
$$g(t) = \left(k + a(t)^{\mathsf{T}} \delta\right) t + \left(b + a(t)^{\mathsf{T}} \gamma\right) \tag{7}$$
where $k$ represents the growth rate, $b$ represents the offset, $a(t)$ represents the indicator function, $\delta$ is the vector of growth-rate adjustments at the changepoints (as in the standard Prophet formulation), and $\gamma$ represents the offset of smoothing.
The Prophet model uses a Fourier series to simulate the periodic change of the time series, which is expressed as follows:
$$s(t) = \sum_{n=1}^{N} \left(a_n \cos\frac{2\pi n t}{T} + b_n \sin\frac{2\pi n t}{T}\right) \tag{8}$$
where $N$ represents the total number of cycles, $T$ represents the period, and $a_n$ and $b_n$ are parameters to be estimated.
The expression of the holiday term is as follows:
$$h(t) = \sum_{i=1}^{L} \gamma_i \, 1(t \in D_i) \tag{9}$$
where $i$ indexes the various holidays, $Z(t) = [1(t \in D_1), \ldots, 1(t \in D_L)]$ collects the indicator functions, $D_i$ represents the set of dates of holiday $i$, and $\gamma_i$ is the parameter of each holiday.
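To make the trend and seasonal terms concrete, the following sketch evaluates a piecewise linear trend and a Fourier seasonal term directly. It is not the Prophet library's code: the function names, the changepoint list and all coefficient values are hypothetical, and the rule `gamma_j = -s_j * delta_j` (which keeps the trend continuous at changepoint `s_j`) is taken from the standard Prophet formulation rather than from the text above.

```python
import math


def piecewise_linear_trend(t, k, b, changepoints, deltas):
    """Trend g(t): base rate k and offset b, adjusted at each passed changepoint."""
    rate, offset = k, b
    for s_j, d_j in zip(changepoints, deltas):
        if t >= s_j:                 # indicator a_j(t) = 1 once s_j has passed
            rate += d_j              # growth-rate adjustment delta_j
            offset += -s_j * d_j     # gamma_j keeps g(t) continuous at s_j
    return rate * t + offset


def fourier_seasonality(t, period, a_coefs, b_coefs):
    """Seasonal term s(t) as a truncated Fourier series with the given period."""
    return sum(
        a_n * math.cos(2 * math.pi * n * t / period)
        + b_n * math.sin(2 * math.pi * n * t / period)
        for n, (a_n, b_n) in enumerate(zip(a_coefs, b_coefs), start=1)
    )


# Usage with made-up parameters: monthly index t, yearly period of 12 months.
trend_value = piecewise_linear_trend(24, k=1.0, b=0.0, changepoints=[10], deltas=[0.5])
season_value = fourier_seasonality(24, period=12, a_coefs=[0.3, 0.1], b_coefs=[0.2, 0.05])
```

In practice these parameters are estimated from the data; the sketch only shows how the fitted components combine at a given time index.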

Vector Autoregressive Model
The vector autoregressive model (VAR) was originally proposed by Christopher Sims to study the dynamic relationship between variables [33,34]. The VAR model regards all variables as endogenous variables, which reduces the uncertainty in the simultaneous equations model caused by subjective judgment errors. The VAR model is used to determine the optimal lag order of the influencing factors of automobile sales, which is used as the basis for constructing the multivariate prediction model.
The form of the VAR(p) model is shown in Formula (10):
$$Y_t = A_0 + \sum_{i=1}^{p} A_i Y_{t-i} + \mu_t \tag{10}$$
where $Y_t$ is the n-dimensional vector of endogenous variables, $p$ is the lag order, $A_0$ and $A_i$ are the matrices of coefficients to be estimated, and $\mu_t$ is the n-dimensional random perturbation term.
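As a small illustration of how a VAR(p) model produces a forecast, the sketch below computes the one-step-ahead conditional mean from given coefficient matrices, with the perturbation term set to zero. The function name and the example coefficients are hypothetical; in the study the coefficients would be estimated from the data, which this sketch does not attempt.

```python
def var_one_step_forecast(history, a0, coef_mats):
    """One-step forecast of a VAR(p): A_0 plus the sum of A_i times the i-th lag.

    history   : list of past n-dimensional vectors, most recent last
    a0        : intercept vector of length n
    coef_mats : list [A_1, ..., A_p] of n x n matrices (lists of rows)
    """
    p = len(coef_mats)
    if len(history) < p:
        raise ValueError("need at least p past observations")
    y_hat = [float(v) for v in a0]
    for i, a_i in enumerate(coef_mats, start=1):
        lagged = history[-i]                      # the vector observed i periods ago
        for row in range(len(y_hat)):
            y_hat[row] += sum(a_i[row][col] * lagged[col] for col in range(len(lagged)))
    return y_hat


# Usage with a made-up 2-variable VAR(2).
example_forecast = var_one_step_forecast(
    history=[[2.0, 4.0], [6.0, 8.0]],
    a0=[1.0, 0.0],
    coef_mats=[[[0.5, 0.0], [0.0, 0.5]], [[0.0, 1.0], [0.0, 0.0]]],
)
```

Every variable appears on the left-hand side of its own equation, which is what treating all variables as endogenous means in practice.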

Support Vector Regression Model
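The support vector machine (SVM) is a classic machine learning model, and support vector regression (SVR) is an important branch of SVM [35,36]. The expression of the SVM model is as follows:
$$f(x) = \omega^{\mathsf{T}} x + b$$
where $\omega$ is the weight vector, and $b$ is the deviation.
According to the principle of structural risk minimization, SVM is transformed into the following optimization problem:
$$\min_{\omega,\, b,\, \delta,\, \delta^*} \ \frac{1}{2}\lVert \omega \rVert^2 + C \sum_{i=1}^{m} \left(\delta_i + \delta_i^*\right)$$
$$\text{s.t.} \quad y_i - \omega^{\mathsf{T}} x_i - b \le \varepsilon + \delta_i, \qquad \omega^{\mathsf{T}} x_i + b - y_i \le \varepsilon + \delta_i^*, \qquad \delta_i,\ \delta_i^* \ge 0$$
where $\delta_i$ and $\delta_i^*$ are relaxation variables, $C$ is a penalty factor, $\varepsilon$ is the width of the insensitive loss function, and $m$ is the number of training samples.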
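The role of the relaxation variables, the penalty factor and the insensitive loss can be seen by evaluating the objective for a fixed weight vector. The sketch below is illustrative only: it assumes a linear model without a kernel (the text does not state which kernel was used), the function names are hypothetical, and each slack is set to its smallest feasible value, which equals the epsilon-insensitive loss of that sample.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the epsilon tube, linear in the excess deviation outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_objective(weights, bias, xs, ys, c, epsilon):
    """Structural-risk objective: 0.5 * ||w||^2 + C * (sum of slack values)."""
    regularizer = 0.5 * sum(w * w for w in weights)
    total_slack = 0.0
    for x, y in zip(xs, ys):
        prediction = sum(w * xi for w, xi in zip(weights, x)) + bias
        total_slack += epsilon_insensitive_loss(y, prediction, epsilon)
    return regularizer + c * total_slack


# Usage on two made-up samples: the first lies inside the tube, the second outside.
objective_value = svr_objective(
    weights=[1.0], bias=0.0, xs=[[1.0], [2.0]], ys=[1.05, 3.0], c=10.0, epsilon=0.1
)
```

A larger penalty factor makes deviations outside the tube more costly relative to the size of the weights, so the fit follows the training data more closely.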

BP Neural Network Model
The BP neural network, put forward by scholars such as Rumelhart and McClelland, is the most widely used artificial neural network [37]. It has the characteristics of self-learning and adaptation, parallel processing, strong learning ability and generalization [38], and generally consists of an input layer, a hidden layer and an output layer. Studies have shown that a three-layer BP neural network prediction model can approximate any nonlinear function [39,40]. Based on the Keras deep learning framework, this study constructs the univariate prediction model and the multivariate prediction model of the BP neural network.
The general process of BP neural network prediction is as follows:
Step 1: Normalize the original data sequence, and input the normalized training samples into the network.
Step 2: Initialize the network parameters. Set the number of neurons in each layer of the network, set the maximum number of iterations and the learning rate, randomly assign initial values to the weights and biases of the network, and determine the activation function.
Step 3: Calculate the input and output values of each layer, and compare the output value with the target value to determine the error.
Step 4: Based on the error, correct the weights and biases (thresholds).
Step 5: Repeat Steps 3 and 4 until the model error drops to the preset value or the number of training iterations reaches the preset value. After the training, the trained BP neural network can be used for data prediction.
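The five steps can be traced in a compact, dependency-free sketch of a three-layer network trained by backpropagation. This is not the Keras model used in the study: the tanh hidden activation, linear output, per-sample gradient updates, default hyperparameters and all function names are assumptions chosen to keep the example short.

```python
import math
import random


def min_max_normalize(values):
    """Step 1: scale a sequence to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def train_bpnn(samples, targets, hidden=4, lr=0.05, max_epochs=500, tol=1e-4, seed=0):
    """Train an input-hidden-output network; returns (predict, loss_history)."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    # Step 2: random initial weights, zero biases, tanh hidden activation.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
        return h, sum(w * hj for w, hj in zip(w2, h)) + b2

    history = []
    for _ in range(max_epochs):
        squared_error = 0.0
        for x, y in zip(samples, targets):
            h, out = forward(x)                  # Step 3: forward pass
            err = out - y                        # Step 3: output error
            squared_error += err * err
            for j in range(hidden):              # Step 4: backpropagate and update
                grad_h = err * w2[j] * (1 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err
        history.append(squared_error / len(samples))
        if history[-1] < tol:                    # Step 5: stop on small error
            break

    def predict(x):
        return forward(x)[1]

    return predict, history


# Usage on a toy identity mapping (placeholder data, not sales figures).
xs = min_max_normalize([1.0, 2.0, 3.0, 4.0, 5.0])
predict, losses = train_bpnn([[x] for x in xs], xs)
```

The loop stops either when the mean squared error falls below `tol` or after `max_epochs` passes, mirroring the two stopping conditions in Step 5.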

Model Evaluation Index
The mean absolute percentage error (MAPE) and root-mean-square error (RMSE) are selected as evaluation indexes to compare the accuracy of the above models for automobile sales forecasting. The lower the index value, the better the forecasting effect of the model.
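For reference, both indexes can be computed as follows, using their standard definitions: MAPE is the mean of the absolute relative errors expressed as a percentage, and RMSE is the square root of the mean squared error. The function names are illustrative, and the sketch assumes that no actual value is zero.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 / len(actual) * sum(abs((a - p) / a) for a, p in zip(actual, predicted))


def rmse(actual, predicted):
    """Root-mean-square error, in the same units as the data."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Usage on made-up numbers.
example_mape = mape([100.0, 200.0], [110.0, 180.0])
example_rmse = rmse([100.0, 200.0], [110.0, 180.0])
```

MAPE is scale-free, while RMSE depends on the units of the series, so RMSE values computed on log-transformed sales are not comparable with values computed on raw sales.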

Comparison of Univariate Prediction Models
The multivariate forecasting model can comprehensively consider the influence of various factors on automobile sales, but it cannot fully extract the trend information contained in the monthly automobile sales data. In some cases, the effect of univariate forecasting may be better than multivariate forecasting [41]. Therefore, based on the historical sales data of traditional fuel vehicles, battery electric vehicles and plug-in hybrid vehicles, univariate forecasting models are established individually. In order to remain consistent with the dimensions of subsequent multivariate prediction, the data series are processed by logarithm, and fitting and prediction are based on logarithm.
Taking the first 85% of the data, from March 2016 to February 2022, as the training set, and the remaining 15% of the data, from March 2022 to March 2023, as the test set, this paper compares the accuracy of the selected univariate forecasting models for the sales forecasts of the three power types. The univariate BP neural network model and the Prophet model are implemented in Python 3.9, and the quadratic exponential smoothing model is implemented in MATLAB 2017b.

Prediction Results of Univariate BP Neural Network
Considering that the monthly sales data of automobiles is periodic with a step size of 12, a sliding window with a length of 12 is set; so, the input variable of the univariate BP neural network model is a 12-dimensional vector.
Taking 12 as a step, based on the historical sales data of traditional fuel vehicles, battery electric vehicles and plug-in hybrid vehicles from March 2016 to March 2023, 60 sets of training data from March 2017 to February 2022 and 13 sets of test data from March 2022 to March 2023 were constructed.
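The sample counts above follow directly from the window construction: 85 monthly observations with a 12-month window give 85 - 12 = 73 input-target pairs, of which the first 60 are used for training and the last 13 for testing. A minimal sketch, using the integer month index as a placeholder for the log-transformed sales values (the function name is hypothetical):

```python
def make_windows(series, window=12):
    """Pair each run of `window` consecutive values with the value that follows it."""
    count = len(series) - window
    inputs = [series[i:i + window] for i in range(count)]
    targets = [series[i + window] for i in range(count)]
    return inputs, targets


# Placeholder series: index 0 stands for March 2016 and index 84 for March 2023.
series = list(range(85))
inputs, targets = make_windows(series, window=12)

# The first 60 targets run from March 2017 (index 12) to February 2022 (index 71).
train_x, train_y = inputs[:60], targets[:60]
# The last 13 targets run from March 2022 (index 72) to March 2023 (index 84).
test_x, test_y = inputs[60:], targets[60:]
```

Splitting by position rather than at random keeps every test target later in time than every training target, so the test error reflects genuine out-of-sample forecasting.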
Based on the Keras deep learning framework, the univariate BP neural network models of traditional fuel vehicle sales, battery electric vehicle sales and plug-in hybrid vehicle sales are constructed individually, and the parameter settings are shown in Table 2. In total, 60 groups of pre-constructed training data of three types of power vehicles are individually input into the univariate BP neural network model, and they are trained according to the steps in Section 2.2.5, and the trained univariate BP neural network model and the fitting value of automobile sales from March 2017 to February 2022 are obtained. A total of 13 groups of data used for testing are input into the univariate BP neural network model based on the training set data, and the forecast value of automobile sales from March 2022 to March 2023 is obtained.
Comparing the fitted and predicted values with the actual values in the corresponding periods, the fitting and predicted results of the univariate BP neural network model for traditional fuel vehicles, battery electric vehicles and plug-in hybrid vehicles are shown in Figure 3.

Prediction Results of Quadratic Exponential Smoothing
The input variable of the quadratic exponential smoothing model is a one-dimensional vector, that is, the historical sales data of the automobile. Based on the historical sales data of traditional fuel vehicles, battery electric vehicles and plug-in hybrid vehicles from March 2016 to February 2022, the quadratic exponential smoothing models of three types of power vehicles are constructed according to Formulas (1)- (5), and the quadratic exponential smoothing fitting values of the sales of three types of power vehicles from March 2016 to February 2022 are calculated. Based on the constructed quadratic exponential smoothing model and Formula (3), the predicted sales values of three types of power vehicles from March 2022 to March 2023 are calculated individually.
After many attempts, the quadratic exponential smoothing model has the best effect when the smoothing coefficient α is 0.3. Comparing the fitted and predicted values with the actual values in the corresponding periods, the fitting and predicted results of the quadratic exponential smoothing model for three types of power vehicles are shown in Figure 4.

Prediction Results of Prophet Model
The Prophet models of three power types of vehicles are constructed individually from the training data, and the fitting values of the vehicle sales of three power types are calculated. Based on the constructed Prophet model, the sales volumes of three types of power vehicles from March 2022 to March 2023 are predicted by Formula (6).
Comparing the fitted and predicted values with the actual values in the corresponding periods, the fitted and predicted results of automobile sales of three power types are shown in Figure 5.

Error Comparison
Comparing the fitting and forecasting effects of each benchmark model on the sales volume of traditional fuel vehicles, the RMSE index shows that the fitting error of the Prophet model is the smallest, and the forecasting error of the exponential smoothing model is the smallest. The MAPE index shows that the fitting error of the quadratic exponential smoothing model is the smallest, and the prediction error of the BP neural network model is the smallest. For battery electric vehicles and plug-in hybrid vehicles, except RMSE index, the fitting error of Prophet model for pure electric vehicles is the smallest, and other RMSE indexes and MAPE indexes show that the fitting and prediction error of quadratic exponential smoothing model is the smallest.
Although the quadratic exponential smoothing model performs well on most indicators, it lacks the ability to identify the turning point of data, and the long-term prediction effect is poor, so it is difficult to meet the needs of out-of-sample prediction. Considering the fitting and forecasting performance of each benchmark model for three types of automobile sales, it is considered that the overall forecasting performance of the Prophet Sales volume/ units  Table 3 shows the fitting prediction errors of the BP neural network model, quadratic exponential smoothing model and Prophet model for the sales of three types of power vehicles. Among them, because the BP neural network model is based on the training data constructed by a sliding window, the fitting time range is from March 2017 to February 2022. Comparing the fitting and forecasting effects of each benchmark model on the sales volume of traditional fuel vehicles, the RMSE index shows that the fitting error of the Prophet model is the smallest, and the forecasting error of the exponential smoothing model is the smallest. The MAPE index shows that the fitting error of the quadratic exponential smoothing model is the smallest, and the prediction error of the BP neural network model is the smallest. For battery electric vehicles and plug-in hybrid vehicles, except RMSE index, the fitting error of Prophet model for pure electric vehicles is the smallest, and other RMSE indexes and MAPE indexes show that the fitting and prediction error of quadratic exponential smoothing model is the smallest.

Error Comparison
Although the quadratic exponential smoothing model performs well on most indicators, it cannot identify turning points in the data and its long-term prediction performance is poor, so it is difficult to meet the needs of out-of-sample prediction. Considering the fitting and forecasting performance of each benchmark model across the three types of automobile sales, the overall forecasting performance of the Prophet model best meets the requirements: its average fitting MAPE and RMSE are only 1.748% and 0.255, respectively, and its average forecasting MAPE and RMSE are only 2.502% and 0.418, respectively. Therefore, the Prophet model is selected to forecast the sales volume of three types of power vehicles from April 2023 to December 2025.
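The two error indexes used in this comparison can be sketched as follows. This is a minimal illustration that assumes the standard definitions of RMSE and MAPE; the function names and the sample values are hypothetical and are not taken from Table 3.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical log-sales values, for illustration only.
actual_values = [10.0, 12.0, 11.0]
predicted_values = [10.5, 11.5, 11.0]

example_rmse = rmse(actual_values, predicted_values)
example_mape = mape(actual_values, predicted_values)
```

Averaging such values over the three vehicle types would give summary figures of the kind quoted above, under the assumption that the averages are simple means.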

Comparison of Multivariate Prediction Models
Taking the data from March 2016 to February 2022 as the training set and the data from March 2022 to March 2023 as the test set, this paper compares the effectiveness of several multivariate prediction models. The VAR prediction model is established in EViews 9, and the SVM model and the multivariate BP neural network prediction model are established in Python 3.9.

Analysis of Lag Effect Based on VAR Model
The VAR model is widely used to analyze the causal relationship between variables, so the VAR model is used to analyze the lag effect and dynamic mechanism of various factors on automobile sales, and to determine the input variables of the multivariate prediction model.
(1) Stationarity test. To avoid spurious regression, we test the stationarity of the sales volume and influencing factor series with the ADF test. To eliminate the influence of heteroscedasticity, the automobile sales data and influencing factor data are log-transformed. As shown in Table 4, at the 1% significance level, the 17 data series involved in the study are stationary.
(2) Determination of the optimal lag order. Establishing the VAR model requires choosing an appropriate lag order, and several criteria need to be weighed so that the model fully reflects the dynamic characteristics of the system. The optimal lag order is determined based on the LR, FPE, AIC, SC and HQ criteria, and the lag order marked with the largest number of * is taken as the optimal lag period. As shown in Table 5, the optimal lag order is 3.
(3) Stability test of the VAR model. When an impulse shock is applied to one equation of the VAR model, the system is considered stable if the effect of the shock dies out over time. The VAR model is stable when the moduli of the inverse characteristic roots are all less than 1. As shown in Figure 6, the characteristic roots all lie inside the unit circle, which shows that the VAR (3) model is stable.
(5) Impulse response analysis. The impulse response function reflects the dynamic relationship between variables and the dynamic path along which a shock to one variable affects another [42]. In Figures 7-9, the horizontal axis represents the number of periods, the vertical axis represents the magnitude of the impulse response function, the blue solid line represents the impulse response function, and the red dotted line represents the band of plus or minus two standard errors (±2 S.E.).
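The link between the stability condition and a decaying impulse response can be illustrated with a small sketch. It assumes a two-variable VAR(1) with hypothetical coefficient matrices rather than the 17-series VAR (3) estimated in EViews, and it traces a plain (non-orthogonalized) unit shock, so it shows the idea only and not the paper's computation.

```python
import cmath


def eigenvalues_2x2(a):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = cmath.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2


def is_stable(a):
    """A VAR(1) y_t = A y_{t-1} + e_t is stable if all eigenvalues of A lie inside the unit circle."""
    return all(abs(root) < 1 for root in eigenvalues_2x2(a))


def impulse_response(a, shock, horizon):
    """Response path of both variables to a one-off shock at period 0."""
    path = [list(shock)]
    for _ in range(horizon):
        prev = path[-1]
        path.append([
            a[0][0] * prev[0] + a[0][1] * prev[1],
            a[1][0] * prev[0] + a[1][1] * prev[1],
        ])
    return path


# Hypothetical coefficient matrices, for illustration only.
stable_a = [[0.5, 0.1], [0.2, 0.3]]
unstable_a = [[1.1, 0.0], [0.0, 0.5]]

stable_path = impulse_response(stable_a, [1.0, 0.0], 20)
unstable_path = impulse_response(unstable_a, [1.0, 0.0], 20)
```

With the stable matrix the response shrinks toward zero as the horizon grows, while with the unstable matrix the response to the same shock keeps growing, which is why the stability test precedes the impulse response analysis.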
According to Figure 7, the responses of traditional fuel vehicle sales to shocks in the 17 log-transformed series, such as the consumer price index, customs export volume and customs import volume, fluctuate over time. The initial response of traditional fuel vehicle sales to a unit shock in customs import volume, total export value, the number of patents granted for electric vehicles and the employee index is 0, and the subsequent responses alternate between positive and negative, indicating that these factors have no effect on traditional fuel vehicle sales in the initial period. In the long run, the effects of customs import volume and the number of patents granted for electric vehicles on traditional fuel vehicle sales tend to be stable, while total export value and the employee index have a cyclically fluctuating effect. Shocks to the consumer price index, customs export volume, total export value, average price of fuel vehicles, average price of new energy vehicles, effective number of patents granted for power batteries, output of lithium-ion batteries, hydropower generation, nuclear power generation and road passenger traffic make the sales of traditional fuel vehicles fluctuate to varying degrees. The sales of traditional fuel vehicles, battery electric vehicles and plug-in hybrid vehicles have a relatively stable influence on the sales of traditional fuel vehicles.
Figure 8 shows the effect of various influencing factors on the sales of battery electric vehicles. The response of battery electric vehicle sales to a shock in traditional fuel vehicle sales fluctuates slightly at the beginning and finally stabilizes. The response of battery electric vehicle sales to its own shock fluctuates slightly in the middle periods and finally stabilizes. The response of battery electric vehicle sales to a shock in plug-in hybrid vehicle sales is relatively stable throughout. Shocks from the other influencing factors make the sales of battery electric vehicles fluctuate to varying degrees.
Figure 8. Impulse response of battery electric vehicle sales.
Figure 9 shows the response of plug-in hybrid vehicle sales to shocks in the various factors. The response of plug-in hybrid vehicle sales to its own shock is relatively stable. In the short term, the dynamic effects of traditional fuel vehicle sales and battery electric vehicle sales on plug-in hybrid vehicle sales rise and fall. In the long term, the influence of traditional fuel vehicle sales tends to be stable, but the influence of battery electric vehicle sales continues to rise and fall. The consumer price index, customs import, the number of patents granted for electric vehicles, the effective number of patents granted for power batteries and nuclear power generation have clear positive effects on plug-in hybrid vehicle sales. The response of plug-in hybrid vehicle sales to shocks in the other influencing factors fluctuates up and down.
Comparing the subgraphs, it can be found that in the long run, the effects of the sales of traditional fuel vehicles and plug-in hybrid vehicles and of the average price of fuel vehicles tend to be stable, while the sales of battery electric vehicles have a negative impact on the three types of vehicles. In the short term, the effects of the consumer price index, the total export value, the number of patents granted for electric vehicles and the effective number of patents granted for power batteries on the sales of the three types of vehicles fluctuate, while in the long term these effects tend to be positive. The effects of customs import, average price of new energy vehicles, nuclear power generation and employee index on the sales of the three types of vehicles alternately rise and fall; in the long run, the average price of new energy vehicles has a positive impact on the sales of traditional fuel vehicles and a negative impact on the sales of battery electric vehicles and plug-in hybrid vehicles. The impact of customs imports, nuclear power generation and employee index on the sales of traditional fuel vehicles gradually weakens, while these factors have a positive impact on the sales of battery electric vehicles and plug-in hybrid vehicles. The effects of customs export volume and road passenger volume on the sales of the three types of vehicles also alternately rise and fall. In the long run, their impact on the sales of traditional fuel vehicles gradually weakens, and they have a negative impact on the sales of battery electric vehicles and plug-in hybrid vehicles. In the short term, the effects of total import value, lithium-ion battery output and hydropower generation on the sales of the three types of vehicles alternately rise and fall, and in the long term these effects tend to be negative.
According to the analysis results of the VAR (3) model on influencing factors, based on 14 influencing factor data series and historical sales data of three types of power vehicles, the time window is set to 3, and a rolling time window is used to generate training samples as inputs of the VAR model, SVM model and multivariable BP neural network model.
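The rolling-window construction with a window of 3 can be sketched as below. The layout is an assumption made for illustration: each month is a row of 17 values with the three sales series first, the previous three rows are flattened into one input vector, and placeholder zeros stand in for the real data. The helper names are hypothetical.

```python
def month_range(start_year, start_month, count):
    """Return `count` consecutive month labels formatted as YYYY-MM."""
    labels = []
    year, month = start_year, start_month
    for _ in range(count):
        labels.append(f"{year}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return labels


def make_windows(months, rows, window=3, n_targets=3):
    """Pair the flattened previous `window` rows with the current sales values."""
    samples = []
    for t in range(window, len(rows)):
        features = [value for row in rows[t - window:t] for value in row]
        target = rows[t][:n_targets]  # assumed: sales series occupy the first columns
        samples.append((months[t], features, target))
    return samples


# March 2016 to March 2023 is 85 months; zeros are placeholders for the 17 log series.
months = month_range(2016, 3, 85)
rows = [[0.0] * 17 for _ in months]

samples = make_windows(months, rows, window=3)
train = [s for s in samples if s[0] <= "2022-02"]
test = [s for s in samples if s[0] >= "2022-03"]
```

Under these assumptions the first usable target month is June 2016, and the split yields 69 training samples and 13 test samples, which matches the sample counts reported for the multivariate BP neural network model.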

Predictive Results of VAR Model
The input variables of the VAR (3)
The construction method of the SVM forecasting model for battery electric vehicle sales and plug-in hybrid vehicle sales is similar. After repeated training, when forecasting the sales of traditional fuel vehicles, battery electric vehicles and plug-in hybrid vehicles, the kernel function is set to "Linear", the penalty coefficient is set to 10, and the default values of the Python scikit-learn library are used for the other parameters. The fitting and forecasting results of the SVM model for automobile sales are shown in Figure 11.
Based on the Keras deep learning framework, a multivariate BP neural network prediction model is constructed, and the parameter settings are shown in Table 7. We input 69 groups of pre-constructed training data into the multivariate BP neural network model and train it according to the steps in Section 2.2.5, obtaining the trained model and the fitted values of automobile sales from June 2016 to February 2022. The 13 groups of test data are then input into the trained multivariate BP neural network model to obtain the forecast values of automobile sales from March 2022 to March 2023. The fitting and prediction results of the multivariate BP neural network model for the sales of the three types of vehicles are shown in Figure 12.
Table 8 shows the fitting and prediction errors of the VAR model, SVM model and multivariate BP neural network model for the sales of traditional fuel vehicles, battery electric vehicles and plug-in hybrid vehicles.

Error Comparison
In terms of RMSE, the multivariate BP neural network model has the best prediction performance for the sales of the three power types, and the VAR model has the best fitting performance. In terms of MAPE, the VAR model has a small fitting error for the three types of power vehicles, while the BP neural network model and the SVM model forecast the three types of power vehicles well.
Based on the above analysis, the multivariate BP neural network model has the best prediction performance for the sales of the three power types of vehicles: its fitting MAPE and RMSE are only 0.922% and 0.138, respectively, and its prediction MAPE and RMSE are only 2.668% and 0.434, respectively. Therefore, the multivariate BP neural network model is selected for out-of-sample prediction of the sales of the three power types of vehicles.

Comparison with Existing Literature
Table 9 summarizes some of the existing literature on automobile sales forecasting. Comparing it with Tables 3 and 8, we find that the prediction errors of the univariate Prophet model and the multivariate BP neural network prediction model we selected are smaller than those reported in the literature listed in the table.

Comparison between Univariate Prediction Models and Multivariate Prediction Models
Comparing the errors in Tables 3 and 8, it can be found that the maximum fitting MAPE and RMSE of the univariate forecasting model for the three types of automobile sales are 2.842% and 0.405, respectively, whereas the maximum fitting MAPE and RMSE of the multivariate forecasting model are 1.807% and 0.345, respectively. The maximum prediction MAPE and RMSE of the univariate prediction model are 3.382% and 0.596, respectively, and those of the multivariate prediction model are 4.863% and 0.693, respectively. On the whole, both the univariate and the multivariate prediction models selected in this paper perform well, and the univariate prediction model performs better overall. Table 10 summarizes the main advantages and disadvantages of the univariate and multivariate forecasting models. This paper puts forward two kinds of models not to compare their forecasting performance, but because most of the current literature forecasts automobile sales from only one perspective. By carrying out univariate forecasting and multivariate forecasting separately, this paper aims to combine the advantages of the two kinds of models, make the research more scientific and rigorous, and provide a more comprehensive reference for automobile sales forecasting.
Table 10. Advantages and disadvantages of univariate prediction models and multivariate prediction models.

Univariate prediction models
Advantages: Effectively capture the trends and changing rules contained in the historical sales data of automobiles.
Disadvantages: Only one characteristic parameter is considered, and the information extracted is limited, which cannot reflect the influence of other variables in the system on automobile sales.

Multivariate prediction models
Advantages: Make full use of the relationship between automobile sales and various factors to improve the generalization ability of the automobile sales forecasting model.
Disadvantages: It is difficult to define the system boundary, and too many influencing factors may affect the extraction of automobile sales trend information.

Out-of-Sample Prediction Results
According to the above analysis, the sales of traditional fuel vehicles (ICEs), battery electric vehicles (BEVs) and plug-in hybrid vehicles (PHEVs) from April 2023 to December 2025 are first predicted by the Prophet model. The default parameters of the Python fbprophet library are used for the out-of-sample prediction of traditional fuel vehicle sales. For the out-of-sample prediction of battery electric vehicle sales and plug-in hybrid vehicle sales, changepoint_range is set to 0.1 and the default values are used for the other parameters.
Then, we predict the values of the 14 influencing factors from April 2023 to December 2025 with the Prophet model and input the predicted values into the trained multivariate BP neural network prediction model. The univariate and multivariate predicted values of the logarithms of traditional fuel vehicle sales, battery electric vehicle sales and plug-in hybrid vehicle sales from April 2023 to December 2025, together with the values restored from the logarithms, are shown in Figure 13.

According to the forecast results in Figure 13, the Prophet model and the multivariate BP neural network forecast model show the same forecast trend for the sales of the three types of power vehicles.
In 2022, the National Development and Reform Commission and the National Energy Administration issued the 14th Five-Year Plan for Modern Energy System, and the 14th Five-Year Comprehensive Work Plan for Energy Conservation and Emission Reduction issued by the State Council pointed out that by 2025, the sales of new energy vehicles should account for about 20% of the total sales of new vehicles in that year. China has achieved this goal more than three years ahead of schedule. According to the univariate forecast, by the end of 2025 and the beginning of 2026, the sales share of new energy vehicles in China will reach 56.2%, while the multivariate forecast shows that this proportion will reach 49.0%. The International Energy Agency (IEA) and Bloomberg New Energy Finance (BNEF) have also published relevant outlooks [44], and we compare our forecast results with them. The Global EV Outlook 2023 released by the IEA points out that, led by China, the sales of electric vehicles, including battery electric vehicles and plug-in hybrid vehicles, continue to grow. According to the Electric Vehicle Outlook 2023 published by BNEF, by 2026, the sales share of electric vehicles in China will reach 52.0%. It can be seen that the prediction results of this paper are very close to the outlooks of the IEA and BNEF.
At present, Great Wall and other traditional car giants have accelerated their deployment in new energy vehicles, and BYD announced in 2022 that it would stop producing fuel vehicles. The intensification of competition in the new energy field has also promoted the research, development and application of new energy vehicles. In the short term, traditional fuel vehicles will not be completely eliminated, but their market share is gradually declining. According to the univariate and multivariate forecast results, by 2025, the sales share of traditional fuel vehicles will drop to 43.8% and 51.0%, respectively. In 2020, the State Council released the "New Energy Vehicle Development Plan (2021-2035)", pointing out that by 2035, battery electric vehicles will become the mainstream of sales. The forecast results of this paper are consistent with the national planning direction, and the sales volume of battery electric vehicles will maintain an upward trend. According to the univariate and multivariate forecasts, by 2025, the market share of battery electric vehicles will reach 44.4% and 37.9%, respectively.
Compared with traditional fuel vehicles and battery electric vehicles, plug-in hybrid vehicles have certain advantages in fuel economy and cruising range. BYD, Great Wall and other car giants are shifting the main battlefield of new energy vehicle competition to the hybrid market, further promoting the maturity of hybrid technology. Univariate and multivariate forecasting results show that by 2025, the sales share of plug-in hybrid vehicles in the new car market will rise to 11.8% and 11.1%, respectively.
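The step from restored sales values to penetration rates, and the consistency of the reported shares, can be checked with a short sketch. The log-sales values in the first part are hypothetical placeholders; the percentages in the second part are the 2025 shares quoted in the text. The function names are illustrative.

```python
import math


def penetration_rates(log_sales):
    """Restore log-sales forecasts to volumes and convert them to market shares in percent."""
    volumes = {name: math.exp(value) for name, value in log_sales.items()}
    total = sum(volumes.values())
    return {name: 100.0 * volume / total for name, volume in volumes.items()}


def nev_share(shares):
    """New energy vehicle share, taken here as the BEV share plus the PHEV share."""
    return round(shares["BEV"] + shares["PHEV"], 1)


# Hypothetical log-sales forecasts for a single month, for illustration only.
example_rates = penetration_rates(
    {"ICE": math.log(100.0), "BEV": math.log(80.0), "PHEV": math.log(20.0)}
)

# 2025 shares reported in the text (percent).
reported = {
    "univariate": {"ICE": 43.8, "BEV": 44.4, "PHEV": 11.8},
    "multivariate": {"ICE": 51.0, "BEV": 37.9, "PHEV": 11.1},
}
nev_univariate = nev_share(reported["univariate"])
nev_multivariate = nev_share(reported["multivariate"])
```

Adding the battery electric and plug-in hybrid shares reproduces the new energy vehicle shares of 56.2% and 49.0% quoted for the univariate and multivariate forecasts, and each set of three shares sums to 100%.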
The forecast results show that new energy vehicles powered mainly by electricity are the inevitable trend in the development of China's automobile industry.
From a global perspective, traditional car giants in various countries are accelerating their transition to electrification. For example, BMW in Germany has set the goal that 50% of the group's sales will be battery electric vehicles by 2030. BYD plans to achieve global sales of 5 million vehicles by 2025. In addition, with the support of national policies, many new car makers have emerged. Tesla in the United States has become the world's largest battery electric vehicle company. China's Xpeng Motors, Li Auto and NIO have also taken advantage of the historic opportunity of the automobile industry reform to become the three leading new car makers. The global competition in new energy vehicles has started.

Conclusions
China's automobile industry is characterized by unprecedented diversification, and its development is accompanied by strong uncertainty. The new consumption reform of automobiles brings certain challenges to the country, enterprises and consumers. In order to accurately judge the development potential of the mainstream power vehicles currently on the market, the forecasting performance of various statistical and machine learning models on automobile sales is compared from the univariate and multivariate perspectives. Among them, the Prophet model and the BP neural network model achieve high prediction accuracy, and the developed VAR-based multivariate BP neural network model fully considers the lag effect of various factors on automobile sales. Therefore, we forecast the sales volume of traditional fuel vehicles, battery electric vehicles and plug-in hybrid vehicles in China from April 2023 to December 2025 with the Prophet model as the univariate forecasting model and the BP neural network as the multivariate forecasting model. According to the univariate forecast, by 2025, the shares of traditional fuel vehicles, battery electric vehicles and plug-in hybrid vehicles will be 43.8%, 44.4% and 11.8%, respectively. The multivariate forecast shows that by 2025, the sales of these three types of cars will account for 51.0%, 37.9% and 11.1%, respectively.

Suggestions
According to the above conclusions and analysis, the following suggestions are put forward: (1) Local governments and relevant departments should strengthen the construction of supporting infrastructure for new energy vehicles to better meet the demands of consumers. (2) In order to maintain China's competitive advantage in the global market, independent innovation capability should be further enhanced and investment in research and development of key core technologies, such as power batteries, operating systems and independent chips, should be increased.
Although good results have been achieved for the various types of cars, this study assumes the same lag period for every influencing factor on car sales, which is a limitation. Future work can consider the heterogeneity of the lag effects of different factors on car sales.
Funding: This research was funded by the project "Process Optimization and Data Governance Solution Research" of Shanghai MAHLE Thermal Systems Co., Ltd., (23H00706).

Conflicts of Interest:
The authors declare no conflict of interest.