Relationships between Vehicle Pricing and Features: Data Driven Analysis of the Chinese Vehicle Market

Abstract: A full-scale understanding of the dynamics of the Chinese vehicle market can benefit stakeholders with respect to rational decision-making and effective long-term investment. This study attempts to discover the common vehicle pricing patterns in the Chinese market by quantifying statistical correlations among critical vehicle features from intrinsic powertrain systems to extrinsic market positioning. The data samples involve almost all passenger vehicle models sold in 2013 to 2019. After comparing multiple statistical methodologies, a log-transformation variant of the multinomial linear regression model was found to be the best one, and the goodness of fit shows that this model can offer stable estimates, which were validated using 2019 market data. The insights achieved are: (1) The price and major performance features of SUVs/crossovers are similar to those of sedans; (2) If all other explicit features remain the same, the price of a Japanese midsize sedan is 62% higher than that of a Chinese midsize sedan, and European midsize vehicles have the highest prices overall; (3) The incremental price of fuel consumption varies by vehicle class and fuel economy. For example, from 30 to 50 MPG, the vehicle price increases by $119 for a Chinese brand sedan and by $69 for a Chinese brand SUV. This study uses statistical modeling to perform data analyses of the Chinese passenger vehicle market. The methods and conclusions of this study can be used for vehicle ownership and consumer purchase preference analyses by others. Some assumptions are made to simplify the analysis process: only the major vehicle features are considered; only the gasoline sedan segment and the gasoline SUV/crossover segment are considered; and vehicle brands are classified according to their regional areas instead of by their brand names. As more is learned about the data and the methodology, the analysis will be updated and improved.


Introduction
Rapid wealth accumulation in China and the enormous growth of the Chinese middle class are transforming the Chinese vehicle market [1]. According to the China Automotive Technology and Research Center (CATARC), the total motor vehicle stock in China reached 348 million by the end of 2019, and the passenger vehicle population increased to around 201 million units [2]. Although it is a developing country, China has grown to be one of the largest passenger vehicle markets in the world. In 2018-2020, the industry was experiencing an adjustment, and sales growth slowed. However, it is still regarded as a must-win market by investors, and it attracts a significant amount of capital [13]. Tesla's Shanghai factory kicked off production at the end of 2019 after just ten months of construction [14]. GM expanded its research center in China with a team of highly qualified designers, scientists, and engineers in order to respond to consumer demands more efficiently by closely embracing local markets and manufacturing partners [15]. At the same time, the vehicle industry, viewed by the Chinese government as a strategic manufacturing industry, will be more open to foreign investors [16]. The government has announced that, by 2020, the investment restrictions regarding foreign ownership in both passenger and commercial vehicle manufacturing companies will be fully removed [17].
Meanwhile, the explosive growth of the vehicle market in China has inevitably raised issues related to the economy, energy security, air pollution, and urban planning [1,9]. China has been one of the world's largest CO2 emitters and one of the largest oil importers [18]. Road traffic accounts for about 70% of the energy consumed by the transportation sector [18,19]. Therefore, the Chinese government is attempting to divert the vehicle market to a more fuel-efficient and electrified pathway via government stimulus incentives, fuel economy policies, and other measures [3]. In 2017, the Chinese government released a vehicle policy, "Passenger Cars Corporate Average Fuel Consumption and New Energy Vehicle Credit Regulation," to urge auto companies to produce highly fuel-efficient conventional vehicles or electric vehicles in order to meet the stricter fuel economy standards [20].
Regardless of the specific goals of industry and the government, all of these endeavors require an understanding of the preferences of China's car buyers, especially in terms of vehicle price and features. In order to maximize profits and survive under fierce market competition, auto companies need to quickly adjust their vehicle characteristics to meet consumer needs at acceptable price points. Hence, it is important for economists and market analysts to develop quantitative models or analyze market data to quickly identify emerging consumer-side market trends. Chen et al. created a mixed oligopolistic differentiated products model to investigate gaming between the supply side and the demand side, as well as gaming among different automakers [21]. This model reveals the relationships between incremental costs and technology-related vehicle characteristics, such as fuel efficiency and horsepower, for different types of automakers. Studies have shown that incremental prices by vehicle technology and by vehicle brand are important inputs in discrete choice models, which rely heavily on consumer preferences regarding various vehicle types [22,23]. Typically, the relationship between vehicle pricing and vehicle intrinsic/extrinsic features is quantified through data survey, data review, or expert assessment [24,25]. For example, the National Research Council uses data review and expert assessment to quantify the relationship between fuel economy and manufacturing cost and projects incremental costs for further improving vehicle fuel economy by 2017, 2020, and 2025 [24]. Xie et al. conducted analyses based on a large number of regulatory and industrial documents to estimate the corresponding incremental costs associated with fuel-efficient technologies [25]. Huo et al. estimated the relationship between vehicle fuel consumption rate and vehicle weight among passenger vehicles as well as commercial vehicles [26].
Because of the strong spending power of the Chinese middle class, the more expensive premium passenger vehicle market continues to grow despite a general downturn in the vehicle market. According to Gasgoo, most premium auto brands, such as Mercedes-Benz, Audi, and BMW, experienced a growth in sales in the first half of 2019 compared to the same period of the previous year [27]. Car buyers have distinct purchase preferences and attitudes regarding vehicle brands, which becomes clear when examining which vehicles did not sell well in 2019. Local Chinese brands seem to be regarded as cheap, low-quality products, even though those automakers are improving their brands with attractive designs [28], sophisticated powertrain components (both fuel-powered and electrified), and innovative electronic interfaces [29,30]. Even if the vehicle prices are the same, profit margins can vary among car brands due to differences in the supply chain, manufacturing techniques, and cost controls. Also, Chen et al. found that, for vehicles made by joint ventures in China, the incremental costs of fuel efficiency and of alternative vehicles tend to be lower in joint ventures with Japanese firms, while the incremental costs of horsepower tend to be lower in joint ventures with U.S. firms [21]. According to our literature review, no publications have discussed the incremental price of vehicle features such as fuel efficiency or horsepower among different car brands in the Chinese vehicle market, and no studies have systematically quantified the possible relative values among the vehicle brands through data analysis.
This study applies statistical methodologies and uses historical market data to quantify the relationships between vehicle prices and features in the Chinese vehicle market in the 2010s. It has three primary objectives: 1. Estimate the price range of a vehicle based on its intrinsic/extrinsic features and performance; 2. Identify the brand premium of vehicles (domestic product, joint-venture product, and imported product); 3. Investigate the relationship between incremental vehicle prices and fuel economy (or fuel consumption rate).
By collaborating with CATARC, this study collected sales records and key powertrain information for all vehicle makes and models from 2013 to 2019 for the analyses. This paper consists of five sections. The first section presents the motivations for and objectives of this study and discusses background literature on the passenger vehicle market and recent trends in China. The second section presents the data collection and processing efforts. The third section describes and compares the methodologies used to quantify the relationships among vehicle features. The fourth section discusses the vehicle feature analyses. The last section summarizes this study. This paper focuses only on the passenger vehicle segment, which is the dominant part of the market, so all "vehicles" mentioned in the following sections refer to passenger vehicles. In addition, the 2018 yearly average currency exchange rate of 1.0 USD = 6.620 CNY is used in this paper [31].

Sample Analysis
Through data collection and processing by CATARC, this study obtains major vehicle market and vehicle performance information for the passenger vehicle models sold from 2013 to 2019 (extreme luxury sports cars were viewed as outliers and were excluded from data collection). CATARC is an independent research organization that offers consulting services on policymaking, product marketing, and consumer surveys for clients from governments, auto companies, and suppliers. It owns a comprehensive database covering most aspects of the vehicle market. This study uses records covering more than 100 million passenger vehicles (including gasoline sedans, gasoline SUVs/crossovers, and diesel SUVs/crossovers) sold in 2013-2018 for model quantification and uses vehicles sold in 2019 for model validation.
As shown in Table 1, these vehicles account for more than 98% of all passenger vehicles sold in the market in each year, except for 2013. Figure 2 summarizes both the number of models by vehicle type and the sales percentages by vehicle type from 2013 to 2018. This study divides vehicles into two segments: sedans and SUVs/crossovers (multi-purpose vehicles are also classified as SUVs/crossovers). The gasoline sedan has been the prevalent vehicle type sold in recent years. However, the gasoline SUV/crossover is gaining popularity, and this trend is increasingly evident in the market downturn in 2018. Furthermore, 2018 is the first year SUV/crossover sales exceed sedan sales. In sum, this comprehensive, rich data improves the reliability and accuracy of the estimates in this study.  The analysis of this study focuses primarily on gasoline-powered passenger vehicles, even though diesel-powered vehicles and plug-in electric vehicles (PEVs) are also sold in China. There are several reasons for this: 1. Gasoline-powered vehicles are the only products with a significant market share in the current passenger vehicle market, and this trend could continue for at least another decade in China [32]. 2. The well-developed gasoline powertrain technologies and mature gasoline vehicle market ensure the robustness of the study conclusions, which are primarily based on historical market data. 3. Though PEVs are becoming popular in the Chinese vehicle market, the limited number of PEV models, the rapid changes in PEV market dynamics, and the continuous upgrades to electric vehicle technologies are very likely to discount the effectiveness of the conclusions, which are based on contemporary PEV powertrain features and historic market information.
To account for inflation, all prices are normalized to year 2018 values using the consumer price index for China as reported by the World Bank (

Vehicle Features
One issue this study needs to address is which explanatory variables to include in the model for quantifying the relationships among a vehicle's price and its features. It is a trade-off: if the model has too few explanatory variables, relevant variables could be missed, resulting in estimation bias (omitted variable bias); however, a model with too many variables could render an efficiency loss. In this study, the Bayesian information criterion is adopted to determine how many explanatory variables should be used in the model.
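As a rough illustration of how the Bayesian information criterion trades goodness of fit against model size, the sketch below uses the common Gaussian least-squares form BIC = n ln(RSS/n) + k ln(n). The function name, the candidate models, and the residual sums of squares are hypothetical and are not taken from this study's data.

```python
import math


def bic_least_squares(n_obs, n_params, rss):
    """BIC of a least-squares fit with Gaussian errors, up to an additive constant."""
    if n_obs <= 0 or rss <= 0:
        raise ValueError("n_obs and rss must be positive")
    return n_obs * math.log(rss / n_obs) + n_params * math.log(n_obs)


# Hypothetical candidates: name -> (number of fitted parameters, residual sum of squares)
candidates = {
    "4 variables": (5, 120.0),
    "8 variables": (9, 100.0),
    "16 variables": (17, 99.0),
}
n_obs = 500  # hypothetical number of observations

scores = {name: bic_least_squares(n_obs, k, rss) for name, (k, rss) in candidates.items()}
best = min(scores, key=scores.get)  # the lowest BIC is preferred
print(best, scores)
```

In this made-up case the 16-variable model fits only marginally better than the 8-variable model, so the ln(n) penalty per extra parameter outweighs the gain in fit.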
The vehicle features, which will be directly or indirectly used for the explanatory variables in the models, are segmented into three aspects: vehicle identification information, vehicle market information, and vehicle technical performance. Figure 3 shows the vehicle data structure. The vehicle identification information offers the most basic information used by buyers to differentiate between vehicles. The vehicle's model name and model year indicate its auto manufacturer and brand, which can impact buyers' purchase decisions [34]. The vehicle class includes the information on vehicle type (sedan, SUV/crossover), and the vehicle size is usually determined by curb weight, passenger/luggage volume, or other factors. Sedans are segmented into five size classes: minicompact, subcompact, compact, midsize, and large; SUVs/crossovers are segmented into four classes: subcompact, compact, midsize, and large. The vehicle class and size were determined by CATARC, which relies heavily on the information supplied by the Ministry of Industry and Information Technology [3]. The second data category, "market," indicates the "market value" determined by consumer purchase choices. It includes the vehicle price (the MSRP in this article), the sales volume (used as a weighting factor), and the brand attribute (whether it is a luxury brand). CATARC classifies the following brands into the luxury (high-end) car segment: Audi, BMW, Cadillac, DS, HQ, Infiniti, Jaguar Land Rover, Lexus, Mercedes-Benz, and Volvo. This classification is based on the vehicle's target market as well as public perception. Unlike the vehicle technical specifications, which can clearly indicate the value/cost, the brand attribute is a more ambiguous concept and is hard to quantify. A superior brand may be associated with distinctive tangible and intangible benefits to vehicle buyers, even though the brand name itself never promises to offer all the services that people imagine [35].
In other words, consumers often associate a superior brand with a more stylish vehicle interior and design, a more comfortable or exciting driving experience, more attentive aftermarket service, or simply greater satisfaction. Because of this intangible extra value, a favorable brand could attract buyers to pay extra money for its product even if the product quality or the aftermarket service is similar to that of other mainstream products [36]. Thus, the brand attribute is an important characteristic that could influence vehicle market price.
The vehicle "technical performance" is a critical factor that determines the essential quality of the product. Due to data limitations, this study uses four technical parameters as indicators of performance: fuel economy, vehicle weight, engine rated power, and engine displacement. The fuel economy value indicates the level of fuel efficiency; the vehicle weight reflects vehicle size and vehicle lightweight technology, and it impacts the glider (car body without powertrain) cost in vehicle manufacturing; the engine rated power is related to the vehicle acceleration performance and towing capability; and the engine displacement is related to the size of the engine and the cost of the aftertreatment system in the vehicle. Notably, the engine rated power and the engine displacement are commonly regarded as two different parameters, although they could correlate with each other. This correlation should be addressed through log transformation or by increasing the sample size in the model to reduce the impacts of collinearity. Figure 4 compares the values of the major vehicle features in 2014-2019 to their 2013 values (all values are sales weighted). It shows that the fuel economy for both sedans and SUVs/crossovers increased significantly, while vehicle weight increased by much less (SUV/crossover weight stays about the same). Engine rated power increased for both vehicle types, but much more for sedans than SUVs/crossovers. As fuel economy goes up, weight and engine power go down initially (through 2016); then, they begin to increase. It could be that automakers initially improved fuel economy by reducing weight and engine power (low-hanging fruit). Then, they began improving engine/powertrain technology, which allowed them to increase vehicle weight and engine power while fuel economy continued to go up. Also, the engine displacement continues to go down, suggesting that improved technology allows them to get increased power from smaller engines.
These trends indicate that the vehicles became more fuel-efficient during these years. In summary, this study collects 10 vehicle features for each vehicle model and year, including five numerical and five categorical features, as shown in Table 3. The categorical features used in the model are transformed into 11 dummy variables in the vehicle price prediction model. In addition, since the ranges of values for different vehicle features vary widely and these conspicuous differences could affect the estimated contribution of the numerical vehicle features, the min-max rescaling method is adopted to normalize the dataset before these characteristics are used for analysis. The min-max normalization rescales each numerical vehicle feature to fall between −1 and 1.
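The min-max rescaling to the range −1 to 1 can be sketched as follows. This is a minimal illustration rather than the study's actual preprocessing code; the function name and the curb weights are hypothetical.

```python
def min_max_rescale(values, lo=-1.0, hi=1.0):
    """Linearly map values so that min(values) -> lo and max(values) -> hi."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        raise ValueError("cannot rescale a constant feature")
    scale = (hi - lo) / (v_max - v_min)
    return [lo + (v - v_min) * scale for v in values]


# Hypothetical curb weights in kg (not from the CATARC data)
curb_weight_kg = [1000.0, 1250.0, 1500.0, 2000.0]
scaled = min_max_rescale(curb_weight_kg)
print(scaled)
```

The lightest vehicle maps to −1, the heaviest to 1, and the others fall proportionally in between, so features measured in kg, kW, and liters end up on a common scale.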

Estimating Vehicle Prices
The goal of vehicle price estimation is to quantify the vehicle price with correlated vehicle features summarized from the data samples. We denote the aggregated features for a vehicle $n$ as a vector $x_n \in \mathbb{R}^{P}$, as in Equation (1): $x_n = [x_{n,1}, x_{n,2}, \cdots, x_{n,P}]$, $n = 1, 2, \cdots, N$, where $N$ denotes the total number of observations (vehicle models by year), and $P$ denotes the total number of features. The aggregate vehicle feature matrix (VFM) is shown in Equation (2), $\mathrm{VFM} = [x_1; x_2; \cdots; x_N] \in \mathbb{R}^{N \times P}$, where each row represents all features of one vehicle. VFM includes all available historical data collected from the vehicle market. These data are used as the explanatory variables in the vehicle price estimation model. However, these data might not directly act as explanatory variables, because the explanatory variables, if used directly from VFM without any transformations, could cause multicollinearity in a multiple regression model. For example, as discussed in Section 2.2, two vehicle features (vehicle weight and engine rated power) could be linearly correlated. Therefore, variants of the vehicle features are needed to avoid this issue.
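One simple way to spot the kind of linear correlation described above is to compute the Pearson correlation between two feature columns before fitting. The sketch below is illustrative only; the weight and power values are invented, and a real check would be run on the columns of the VFM.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical vehicle weights (kg) and engine rated powers (kW)
weight_kg = [1100, 1300, 1500, 1700, 1900]
power_kw = [70, 95, 118, 140, 165]

r = pearson_correlation(weight_kg, power_kw)
print(round(r, 4))
```

A coefficient close to 1 or −1 suggests that entering both features untransformed may inflate the variance of their estimated coefficients.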
The dependent variable is the vehicle price, as shown in Equation (3). The vehicle price estimation problem aims to determine a function $f(\cdot)$ that maps historical vehicle features (i.e., VFM) to their price so that, with given vehicle features, the model can roughly estimate and project the corresponding vehicle price, based on the relationships between them. Table 4 summarizes both the pros and cons of four state-of-the-art prediction methods: partial least squares (PLS) regression, k-nearest neighbors (k-NN), support vector regression (SVR), and multinomial linear regression (MLR). There are a few characteristics that the candidate estimation/prediction methods should have for vehicle price estimation problems. First, this study aims not only to provide a reliable and accurate estimation of vehicle price based on vehicle features. More importantly, it aims to answer some important research questions: What are the major factors that affect vehicle price? What are the implicit brand values of vehicles from different automakers by origin? In order to answer these questions, the estimation model should be able to generate interpretable parameters. The k-NN algorithm is a non-parametric technique for classification and regression problems, so it does not produce coefficients that can be interpreted in this way. Hence, it is less preferable than the other three methods. Second, the proposed algorithm should generate predictions as accurately and efficiently as possible. Table 5 shows a comparison of the vehicle price prediction results among the four approaches in terms of mean absolute error (MAE) and mean absolute percentage error (MAPE). The prediction results indicate that the MLR method outperforms the PLS and k-NN methods and is comparable to SVR. Due to its easy implementation, high interpretability, and low computational complexity, this study uses MLR to estimate vehicle price.
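For reference, the two error metrics used to compare the methods can be computed as in the sketch below. The prices are hypothetical and serve only to show the arithmetic; they are not results from Table 5.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """Average absolute error relative to the actual value, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical actual and predicted MSRPs in U.S. dollars
actual_msrp = [10000.0, 20000.0, 40000.0]
predicted_msrp = [11000.0, 18000.0, 40000.0]

mae = mean_absolute_error(actual_msrp, predicted_msrp)
mape = mean_absolute_percentage_error(actual_msrp, predicted_msrp)
print(mae, mape)
```

MAE is expressed in dollars and is dominated by expensive vehicles, whereas MAPE weights each vehicle's error relative to its own price, which is why the two metrics are reported together.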

Model Construction
After comparing different prediction methods in Section 3.2, this study adopts the MLR method after data transformation [42] to build a statistical model and examine the potential quantitative relationships among the vehicle features, which are still unclear. Before constructing the model, all numerical vehicle features shown in Table 3 are normalized. This study fits the data for sedans and SUVs separately. The estimation model is described by Equation (4). The sales volume functions as the weighting factor; the four remaining numerical variables are log-transformed based on their relationship to vehicle price [42]; and the three categorical variables are transformed into dummy variables. In the model, the feature indicating that vehicles belong to a Chinese brand is not assigned a dummy variable, so Chinese brands serve as the reference category. In Equation (4), each log-transformed numerical feature $x$ enters as $\ln(1 + x)$, the dummy variables enter untransformed, and the resulting terms are combined with a coefficient vector $\beta = (\beta_0, \beta_1, \cdots, \beta_K)$, which contains all the coefficients calculated in this multinomial regression model. Table 6 shows the sample sizes used in this study. Table 7 shows the statistical results in the sedan segment and in the SUV/crossover segment, respectively, after fitting with the training dataset. The regression statistics indicate a good fit for both the sedan and the SUV/crossover segments, as shown in Figure 5. The adjusted R-squares are both around 0.9998. The F-statistic values, which reveal the overall significance of the regression model, indicate that the results are highly significant in terms of variance. Furthermore, the p-values of the coefficients in both the sedan and the SUV/crossover segments are nearly zero, which means that the coefficients are significant as well. The overall results from the MLR model appear reliable. In addition, Figure 6 shows the distributions of the percentage errors in the sedan test dataset and the SUV/crossover test dataset. Overall, the percentage errors of most vehicles in the sedan segment are between −20% and 20%, and the MAPE in that segment is about 15.2%.
The percentage errors of most vehicles in the SUV segment are also between −20% and 20%, except for large SUVs/crossovers. Most of the percentage errors for the large SUVs/crossovers are between −40% and 40%. The MAPE for the SUV/crossover segment is about 21.9%.
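The general structure of such a model (a sales-weighted linear regression on ln(1 + x) terms plus brand dummy variables, with Chinese brands as the baseline) can be sketched as below. This is not the study's implementation: the use of log price as the dependent variable, the single numerical feature, the single "European" dummy, the coefficients, and all records are assumptions made for illustration.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (check for collinear features)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def weighted_least_squares(rows, targets, weights):
    """Minimize sum_i w_i * (y_i - row_i . beta)^2 via the normal equations."""
    p = len(rows[0])
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(p)]
            for i in range(p)]
    xtwy = [sum(w * r[i] * t for r, t, w in zip(rows, targets, weights)) for i in range(p)]
    return solve_linear_system(xtwx, xtwy)


def design_row(power_scaled, is_european):
    """Intercept, ln(1 + x) term, and a brand dummy (Chinese brand = baseline, no dummy)."""
    return [1.0, math.log(1.0 + power_scaled), 1.0 if is_european else 0.0]


# Hypothetical records: (scaled engine power > -1, European brand?, sales volume)
records = [(0.1, False, 900), (0.4, False, 700), (0.8, False, 300),
           (0.2, True, 400), (0.6, True, 250), (0.9, True, 100)]
true_beta = [9.0, 0.5, 0.6]  # made-up coefficients used to generate noise-free log prices

rows = [design_row(p, e) for p, e, _ in records]
log_price = [sum(b * v for b, v in zip(true_beta, row)) for row in rows]
weights = [sales for _, _, sales in records]

beta = weighted_least_squares(rows, log_price, weights)
print([round(b, 6) for b in beta])
```

Because the synthetic log prices contain no noise, the fit recovers the made-up coefficients, which is a convenient sanity check before applying the same routine to real data.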

Monte Carlo Simulation
Monte Carlo simulation is a method used for explaining the impacts of inherent uncertainty of inputs on outcomes. This study aims to visualize the potential patterns or relationships among the vehicle prices and vehicle performance and features through the market data. For each vehicle model by year, the vehicle price and its corresponding features are pre-determined. However, for a vehicle market with millions of annual vehicle sales, the aggregations of vehicle performance characteristics and features can often be described by probability density functions. The outcomes, or the vehicle prices for this type of vehicle, will be generated through the statistical model using hundreds or thousands of simulations by assigning possible input values based on their probability distributions. The outcomes are usually represented as a probability distribution, as shown in Figure 7.
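The procedure described above can be sketched as follows: draw input features from assumed probability distributions, pass each draw through a fitted price model, and summarize the resulting price distribution. The log-linear model, its coefficients, and the normal input distributions are all hypothetical stand-ins, not the fitted model or the empirical distributions from this study.

```python
import math
import random
import statistics


def predict_price(fuel_economy_mpg, power_kw, beta=(8.0, 0.3, 0.2)):
    """Hypothetical log-linear price model; the coefficients are illustrative only."""
    log_price = (beta[0]
                 + beta[1] * math.log(1.0 + fuel_economy_mpg)
                 + beta[2] * math.log(1.0 + power_kw))
    return math.exp(log_price)


def simulate_prices(n_draws, seed=0):
    """Sample inputs from assumed normal distributions and return simulated prices."""
    rng = random.Random(seed)  # seeded for reproducibility
    prices = []
    for _ in range(n_draws):
        mpg = max(rng.gauss(35.0, 3.0), 1.0)      # assumed fuel economy distribution
        power = max(rng.gauss(133.0, 10.0), 1.0)  # assumed engine power distribution
        prices.append(predict_price(mpg, power))
    return prices


prices = simulate_prices(5000)
mean_price = statistics.mean(prices)
cuts = statistics.quantiles(prices, n=20)  # 19 cut points at 5%, 10%, ..., 95%
p05, p95 = cuts[0], cuts[-1]
print(round(mean_price), round(p05), round(p95))
```

The 5th and 95th percentiles bound a 90% interval of simulated prices, which is the kind of summary that a probability distribution plot conveys graphically.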

Data Visualization-Vehicle Fuel Economy and Vehicle Price
Based on the data analysis and statistical modeling, this study is able to quantify the relationships between vehicle prices (normalized to 2018 U.S. dollars) and vehicle features such as fuel economy (MPG), vehicle weight (kg), and vehicle power (kW) for the 2013-2018 Chinese vehicle market. In this section, the study summarizes the historic trends of MPG, vehicle MSRP, and relationships among vehicle powertrain features, based on the market data. The aggregate values are based on the vehicle models in the dataset only; they are not sales- or production-weighted values.
The distributions of MPG for the two vehicle segments by year and by vehicle brand are presented in Figure 9. Figure 9a shows the median MPG of the vehicle models of each segment. Overall, the MPG of both vehicle segments increases over time, and the MPG level of SUVs/crossovers increases much faster than that of sedans. Five brand origins are also compared in the study, as shown in Figure 9b: Chinese brands (e.g., Geely and Great Wall), American brands (e.g., Ford and GM), European brands (e.g., VW and BMW), Korean brands (e.g., Hyundai and Kia), and Japanese brands (e.g., Toyota and Honda). The non-Chinese brands could be manufactured in joint-venture factories in China or could be imported from other countries. Japanese-brand vehicles rank the highest on fuel economy in the sedan segment, but the lowest in the SUV/crossover segment, according to the 2013-2018 data. Note that the MPG values in Figure 9 are not exactly the same as the sales- or production-weighted values in other references and are, therefore, not comparable.
The distributions of MSRPs among vehicle models by year and by vehicle brand are presented in Figure 10. All vehicle prices have been converted to 2018 U.S. dollars. Figure 10a shows that the MSRP of sedan models did not change much during the study period, except for an increase in 2018. The median MSRP for 2013 was $15,175. It decreased $600 to $14,502 by 2017, but it increased to $19,909 by 2018. This increase in MSRP might be due to a 19.3% increase in luxury car sales in 2018. In comparison, the MSRP of SUV/crossover models generally decreased from 2013 to 2018. In 2013, the median price of SUV/crossover models was $19,223, and the average price was $26,948. In 2018, by contrast, the median price of SUV/crossover models was $18,716, and the average price was $23,620. One reason might be the increasingly competitive SUV/crossover market. As shown in Figure 2a, the number of SUV/crossover models in the Chinese vehicle market increased significantly from 97 in 2013 to 278 in 2018. Figure 10b shows the MSRP by vehicle brand origin. Overall, the European brands have the highest MSRP, while Chinese brands rank the lowest in overall MSRP. In addition, note that the MSRP difference between SUVs/crossovers and sedans is much larger for Japanese brands ($19,637) than for other brands. The median MSRP of the Japanese SUVs is $37,734, and the median MSRP of the Japanese sedans is $18,420. Chinese brands targeted the smaller vehicle size segments, as shown in Figure 11a,b. These brands account for 77% of minicompact sedans and 78% of subcompact SUVs/crossovers. This may partially explain why both the MPG and the MSRP values of vehicles by Chinese automakers are lower than those of brands from other countries, as shown in Figure 11c. Since the European vehicles include some luxury brands, the MSRPs of some European vehicle models shown in Figure 11c are much higher than those of other brands.
Moreover, as shown in Figure 11b, Chinese automakers focus more on the SUV/crossover segment. The rapid growth of SUV/crossover models in recent years (Figure 2) is due in large part to the substantial number of SUVs/crossovers produced by Chinese automakers. The change in emphasis from sedans to SUVs/crossovers by the Chinese automakers may be due to two factors: (1) Chinese consumer demand for increased interior vehicle room [43] and (2) the higher profit margin of SUVs/crossovers relative to sedans [44].
Furthermore, as shown in Figure 11c, the level of fuel economy appears negatively correlated with the MSRP. This seems to contradict the conclusions by the National Research Council and other studies (such as Autonomie results) that the vehicle price commonly increases with a higher fuel economy level, since more sophisticated fuel economy technologies inevitably add to the vehicle production cost [24,45]. However, the discrepancy probably arises because this trend (a positive correlation between fuel economy and vehicle production cost) can be obscured by other factors when the data are complex and the influencing factors are many. For example, as shown in Figure 11d, the positive correlation between fuel economy and vehicle price holds for sedans with MSRPs between $12,000 and $13,000, that is, when car models have similar MSRPs. Therefore, a quantitative model, which can minimize the distractions of irrelevant vehicle market factors, is needed to accurately determine intrinsic relationships.

Vehicle Price Estimation and Validation
To validate the reliability of the statistical model, which quantifies the relationships between vehicle features and vehicle prices based on the vehicle market data, this study randomly selects 20 different sedan models and 12 different SUV/crossover models from the best-selling vehicle models in the 2019 Chinese vehicle market. For sedans, it selects four vehicle models from each of five vehicle size classes (minicompact, subcompact, compact, midsize, and large), and for SUVs/crossovers, it selects three vehicle models from each of four size classes (subcompact, compact, midsize, and large). Figure 12 shows the projected MSRPs based on the statistical model discussed in Section 3.3. The projected and actual MSRPs of most vehicle models are close to one another, and the deviations for most vehicle models fall within a reasonable range. However, the projected and actual MSRPs of some vehicle models differ significantly. This does not necessarily mean the model performs poorly. It might be because the listed MSRPs of the vehicle models deviate from the market average. For example, as shown in Figure 12, the blue point in the SUV/crossover segment stands for a large SUV/crossover model produced by a Chinese manufacturer with a short history in the auto industry. The actual MSRP of this SUV/crossover is much lower than other similarly sized models with a similar powertrain configuration produced by more established automakers. This might be because this Chinese automaker wants to compete for that market using a low-price strategy: offering a comparable vehicle at a much lower price. Another reason might be that different automakers may have significantly different profit margins for some vehicles, especially higher-priced models. These profits are often used to offset lower profits, or even losses, on other vehicles in their lineup.
Therefore, this statistical model can help analyze the "reasonableness" of vehicle prices by comparing them with the market average price range. Although this study considers the price impacts of luxury brands and vehicle models produced by different automakers to some extent, it finds that the actual prices of a few models by Chinese brands tend to be far below the prices estimated by the statistical model and that the actual prices of some vehicle models by luxury (high-end) brands tend to be much higher than their estimated prices. These vehicle models, as outliers in the market, could reduce the accuracy of their price estimation by this statistical model, which relies greatly on historical data. On the other hand, it might imply that, compared with the overall market, the actual prices of these vehicle models could be very unusual: For an extremely cheap car, could its price be a potential indicator of low quality or unreliable performance? For an exceptionally expensive high-end car, could its price be based mostly on its intangible brand value, or even on nothing at all? Once modeling errors are ruled out, this statistical model could alert users to be cautious of the quality/service of a vehicle model with an abnormally low price and to be vigilant regarding a luxury vehicle with a price much higher than its estimated price.

Vehicle Price Comparison among Brands
Based on the data analyses and statistical modeling results, the average prices of vehicle segments (sedans and SUVs/crossovers) of different brands can be quantified. If all parameters other than brand origin remain the same, the impact of vehicle brand on MSRP can be determined. The differences in MSRP between Chinese brands and other brand origins are shown in Figure 13. The percentages and the confidence intervals (boxes) in Figure 13 indicate the vehicle price change ratios relative to a Chinese brand vehicle. For example, the statistical model estimates that the MSRP of a European brand SUV/crossover is 82% higher than the MSRP of a Chinese brand SUV/crossover with the same parameter values. Furthermore, the 90% confidence interval for this increase in MSRP ranges from 79% to 86%.
Compared to the MSRP of a Chinese brand vehicle, the MSRPs of brands from Europe, Japan, Korea, and America are markedly higher. The MSRP difference is even greater in the SUV/crossover segment than in the sedan segment. The MSRP difference by brand reveals the value added by different vehicle brands. The results clearly show that, in the current vehicle market, European brand vehicles have the highest added value due to brand and Chinese brand vehicles have the lowest. The difference in added value due to brand may come from brand premium values that are not explicitly included in the statistical model, such as high-grade interior design, more comfortable driving control (better suspension and safety performance), reliable vehicle aftermarket service, favorable public image of the brand, and other factors.
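To show how a percentage premium of this kind relates to a regression coefficient, the following sketch assumes that the response variable is the natural logarithm of MSRP and that brand origin enters as a dummy variable with Chinese brands as the reference category. Under that assumption, a dummy coefficient beta corresponds to a price ratio of exp(beta), so the premium is exp(beta) - 1. The function names are hypothetical; only the 82% estimate and the 79% to 86% interval come from the text.

```python
import math


def premium_from_log_coef(beta):
    """Convert a dummy coefficient on log(MSRP) to a fractional premium."""
    return math.exp(beta) - 1.0


def log_coef_from_premium(premium):
    """Inverse mapping: fractional premium to coefficient on log(MSRP)."""
    return math.log(1.0 + premium)


# Reported European SUV/crossover premium and its 90% interval.
beta_low = log_coef_from_premium(0.79)
beta_mid = log_coef_from_premium(0.82)
beta_high = log_coef_from_premium(0.86)

# The mapping is monotonic, so the endpoints of an interval on the
# coefficient scale map directly to the endpoints on the percentage scale.
for label, beta in [("low", beta_low), ("mid", beta_mid), ("high", beta_high)]:
    print(f"{label}: beta = {beta:.4f}, premium = {premium_from_log_coef(beta):.0%}")
```

Because the premium is multiplicative, it applies as a ratio to the Chinese brand MSRP at any fixed set of vehicle features, which is why it can be reported as a single percentage per brand origin and segment.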

Vehicle Fuel Economy and Engine Rated Power
The effects of fuel economy (MPG) and engine rated power on vehicle MSRP are also quantified by the statistical model. Figure 14a presents the projected MSRP and its corresponding probability distribution at different vehicle fuel economies for a midsize Chinese brand sedan. In this example, all other parameters are set to the sales-weighted average values for midsize sedans in the 2019 Chinese vehicle market. Based on CATARC data, these sales-weighted averages are 133 kW for the engine rated power, 1.83 L for the engine displacement, and 1550 kg for the vehicle weight. As shown in Figure 14a, when the fuel economy increases from 20 MPG to 50 MPG, the mean value of the vehicle MSRP increases from around $16,156 to around $24,871. The MSRPs of midsize sedans from other brand origins would be expected to be higher than those given in Figure 14a. Figure 14b shows the combined effect of fuel economy and engine rated power on the MSRP of a midsize Chinese brand vehicle. In the simulation, the parameters other than the fuel economy and engine rated power are the sales-weighted average values for midsize sedans in the 2019 Chinese vehicle market. The engine rated power and vehicle fuel economy are two parameters that constrain each other. Therefore, the vehicle MSRP could be extremely high when a vehicle has both a powerful engine and high fuel economy, as it would require exceptional manufacturing technology to achieve both performance targets. Similarly, this study determined the incremental vehicle prices per unit of fuel economy increase for different vehicle sales years and vehicle brand origins, which can be used indirectly to infer or compare the willingness of Chinese consumers to pay for vehicle fuel economy [46]. For a vehicle fuel economy ranging from 30 to 50 MPG, the incremental vehicle prices by vehicle sales year and brand origin are presented in

Conclusions
The goal of this paper is to quantitatively evaluate the historical vehicle market in China (2013-2019) and to determine the relationships among vehicle prices and features through statistical modeling. In collaboration with researchers from CATARC, this study collects passenger vehicle sales data from 2013 to 2019, and data for more than 100 million vehicle sales are used for the modeling analyses. The market data from 2013 to 2018 are used for model training and in-sample testing, while the market data from 2019 are used for out-of-sample estimation and validation. After comparing the stability and accuracy of different statistical methodologies, this study creates a linear multinomial regression model that uses dummy variables to project vehicle MSRP based on specific vehicle features (engine rated power, engine size, fuel economy, and vehicle weight) and categories (brand origin, vehicle size class, vehicle segment [sedan or SUV/crossover], and luxury/non-luxury vehicle). The study also uses Monte Carlo simulation to quantify the possible distributions of the vehicle MSRPs by assuming vehicle feature ranges based on their historical distributions. This study may contribute valuable information to researchers and policy makers in surveying vehicle pricing and vehicle technology trends, analyzing the price impacts of fuel economy and engine power in the passenger vehicle market, and evaluating the market penetration of highly fuel-efficient vehicles in China.
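The workflow summarized above (a regression with dummy variables followed by Monte Carlo simulation) can be sketched as follows. This is a simplified illustration under stated assumptions, not the model fitted in this study: the data are synthetic, the response is assumed to be log(MSRP), only two continuous features and one brand dummy are included, and every coefficient and spread is hypothetical. The 133 kW mean rated power is the only value taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Synthetic training data (hypothetical, for illustration only) ---
n = 500
power_kw = rng.uniform(60, 200, n)       # engine rated power
mpg = rng.uniform(20, 50, n)             # fuel economy
is_european = rng.integers(0, 2, n)      # brand dummy: 1 = European, 0 = Chinese
true_beta = np.array([8.5, 0.006, 0.012, 0.60])  # assumed coefficients

# Design matrix with an intercept column and the dummy variable.
X = np.column_stack([np.ones(n), power_kw, mpg, is_european])
log_msrp = X @ true_beta + rng.normal(0, 0.05, n)

# --- Ordinary least squares fit on log(MSRP) ---
beta_hat, *_ = np.linalg.lstsq(X, log_msrp, rcond=None)
residuals = log_msrp - X @ beta_hat
sigma_hat = residuals.std(ddof=X.shape[1])  # residual standard deviation

# --- Monte Carlo simulation for a Chinese brand vehicle ---
# Feature values are drawn from assumed distributions; the study instead
# bases the feature ranges on their historical distributions.
draws = 10_000
sim_power = rng.normal(133, 10, draws)   # mean from the text, spread assumed
sim_mpg = rng.uniform(30, 50, draws)
sim_X = np.column_stack([np.ones(draws), sim_power, sim_mpg, np.zeros(draws)])
sim_msrp = np.exp(sim_X @ beta_hat + rng.normal(0, sigma_hat, draws))

p5, p50, p95 = np.percentile(sim_msrp, [5, 50, 95])
print(f"Simulated MSRP: 5% = {p5:,.0f}, median = {p50:,.0f}, 95% = {p95:,.0f}")
```

Adding further dummy columns for vehicle size class, segment, and luxury status would follow the same pattern, with one reference category omitted per categorical variable to keep the design matrix full rank.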
The results of this study offer several insights that may be of interest to stakeholders involved with the Chinese passenger vehicle and energy market:


The SUV/crossover is the predominant vehicle type in the Chinese passenger vehicle market.
The number of small SUVs/crossovers in the market is expected to increase, and their prices are falling toward sedan prices. The major vehicle performance features of SUVs/crossovers are approaching those of sedans.


The fuel economy of sedan and SUV/crossover models sold in China increased annually from 2013 to 2018, and it increased more rapidly for SUVs/crossovers than for sedans. Sedan fuel economy increased from 35.26 to 38.56 MPG, and SUV/crossover fuel economy increased from 26.12 to 32.67 MPG.
Comparing fuel economy among vehicle models by brand origin, Japanese brands on average perform the best among sedans but the worst among SUVs/crossovers.
When all other vehicle features are the same, Chinese sedans are priced lower than sedans from the other countries/regions in this study. European, Japanese, Korean, and American sedans are priced 68%, 62%, 58%, and 47% higher, respectively.
When all other vehicle features are the same, the MSRP of European SUVs/crossovers is 82% higher than that of Chinese SUVs/crossovers. Japanese, Korean, and American SUVs/crossovers are priced 69%, 59%, and 55% higher, respectively.
This study uses statistical modeling to perform data analyses of the Chinese passenger vehicle market. The methods and conclusions of this study can be used by others for vehicle ownership and consumer purchase preference analyses. Some assumptions are made to simplify the analysis process: only the major vehicle features are considered; only the gasoline sedan segment and the gasoline SUV/crossover segment are considered; and vehicle brands are classified according to their regional origins instead of by their brand names. As more is learned about the data and the methodology, the analysis will be updated and improved.