5.1.1. Principles for Constructing the Indicator System
The indicator system for NEV sales forecasting is built on four principles to ensure rigor and practicality.
First, the scientific principle requires that the system rely on reliable data sources and rigorous measurement methods, adhere to objective facts, select representative indicators that capture the important factors influencing the NEV industry, and apply standardized data processing grounded in scientific theory, so that results are valid, reasonable, reproducible, and free of bias.
Second, under the systematic principle, a multi-level structure of targets, criteria, and indicators is built to cover the critical aspects of NEV sales. NEV sales are simultaneously affected by technology, consumer spending, policy, and infrastructure; therefore, indicators are selected at different levels and from different fields to fully reflect the industry’s development trend and support an accurate evaluation.
Third, the operability principle requires that indicators rest on available, measurable data (for example, data published on government platforms such as the National Bureau of Statistics) that can be tracked continuously and in a timely manner. The goal is to be able to supply data as evidence and context when responding to questions from researchers and managers.
Fourth, the dynamic-stability principle balances stability and adaptability: when market conditions change, dynamic forecast indicators are revised to suit the new circumstances, while core indicators remain constant for a certain period, as long as the goal of predicting NEV sales does not change.
5.1.2. Analysis of Influencing Factors and Selection of Indicators
Given the multi-faceted influences on NEV sales, this study constructs a sales forecasting indicator system by analyzing key factors across four dimensions—economy, technology, policy, and consumers—supported by relevant theories and literature.
Higher economic levels stimulate NEV consumption, with economic scale impacting NEV promotion efficiency [52]. GDP and Urban Survey Unemployment Rate are selected to reflect macroeconomic conditions, while Per Capita Disposable Income and Per Capita Consumption Expenditure measure residents’ purchasing power, which is key to affording NEVs [53]. For industrial dynamics, NEV Industry Investment Amount, NEV Industry Investment Events, Automotive Aftermarket Investment Amount, and Automotive Aftermarket Investment Events capture industry attractiveness and post-sales service impacts. Gasoline Price is included, as oil price hikes positively influence NEV purchase intentions [54]. In total, nine economic indicators are chosen.
Technological advancement boosts NEV demand by enhancing battery performance, drive systems, and intelligence, which directly increases consumer purchase intent [48]. Technological progress outweighs financial subsidies in promoting NEVs [49]. Because technical metrics are inconsistent across NEV firms and R&D activity is opaque, NEV Basic Patent Applications in China is selected to reflect industry innovation. Battery technology constraints (range, safety) make Power Battery Output and Total Power Battery Installation Capacity key indicators (scaled battery production supports stable industry growth). In total, two technological indicators are chosen.
NEV market dynamics are closely tied to industrial policies. Fiscal and tax incentives (e.g., vehicle purchase tax exemptions) boost NEV demand [50], so the Number of NEV-related Policies is selected to reflect policy support. Inadequate charging infrastructure causes range anxiety [55]; Public Charging Pile Quantity is chosen as it indicates government emphasis on NEV promotion (via infrastructure subsidies). Lower bank loan interest rates reduce purchase costs, stimulating demand. Thus, Bank Loan Interest Rate is included to capture financial policy impacts. Three policy indicators are chosen.
Online platforms reshape how consumers access information. Online reviews (sentiment/polarity) impact purchase decisions [56], so Review Volume, Average Sentiment Score, Positive/Negative Reviews, Most Probable Review Topic, and Topic-specific Sentiment Score are selected. Baidu Search Index reflects purchase intent [57]. Seven consumer indicators are finalized.
A total of 21 indicators were selected from the four dimensions to form the index system, which is presented in Table 1.
5.1.3. Data Sources
To balance data accessibility and processability, this study uses data spanning July 2018 to December 2024; no indicator has missing values over this period. Economic indicator data are sourced from the National Bureau of Statistics (NBS) (monthly/quarterly data) and the Pan-Internet Venture Capital Project Information Database, and are processed at a quarterly time scale. Technological indicator data are obtained from national government websites, the China Automotive Power Battery Industry Innovation Alliance, and patent databases. Policy indicator data come from national government websites, the People’s Bank of China, and the China Charging Alliance. Among consumer-related indicators, Baidu Search Index data are retrieved from Baidu Index.
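As an illustration of processing monthly series at a quarterly time scale, the following minimal sketch groups monthly observations into quarters labelled in the same style as Table 2 (e.g., “Q3 2018”). The function names and the choice between summing and averaging are assumptions made for illustration; the aggregation rule actually applied to each indicator is not specified here.

```python
from collections import defaultdict


def quarter_label(year: int, month: int) -> str:
    """Return a label such as 'Q3 2018' for a calendar month."""
    return f"Q{(month - 1) // 3 + 1} {year}"


def to_quarterly(monthly: dict, how: str = "sum") -> dict:
    """Aggregate {(year, month): value} into {quarter label: value}.

    Assumption: how='sum' suits flow variables (e.g., investment amounts),
    how='mean' suits level or rate variables (e.g., an unemployment rate).
    """
    buckets = defaultdict(list)
    for (year, month), value in sorted(monthly.items()):
        buckets[quarter_label(year, month)].append(value)
    if how == "sum":
        return {q: sum(v) for q, v in buckets.items()}
    if how == "mean":
        return {q: sum(v) / len(v) for q, v in buckets.items()}
    raise ValueError(f"unknown aggregation: {how}")
```

For example, three monthly values for July to September 2018 collapse into a single “Q3 2018” observation.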
For consumer-related indicators, online review data are derived from processing semi-structured text reviews, with the raw text sourced from the Autohome website (https://www.autohome.com.cn/, accessed on 3 December 2025) and CHEZHIWANG (https://www.12365auto.com/, accessed on 3 December 2025). Autohome is recognized as one of China’s most influential and trusted automotive vertical platforms, with a proven track record of providing high-quality, user-generated content (UGC) and authoritative industry data. As of December 2024, Autohome’s mobile App had accumulated over 500 million downloads, with a monthly active user (MAU) count of 64.51 million and a panoramic ecosystem monthly unique user peak of 532 million [58]. Autohome is a trusted data partner for major NEV manufacturers (e.g., BYD, NIO, Xpeng) and government research institutions. CHEZHIWANG is a leading national platform specializing in automotive quality complaints and consumer feedback, with a focus on data integrity and objectivity. Its authority is anchored in its institutional endorsements and collaborative partnerships: as an official member unit of China’s National Automobile Product Defect Clue Monitoring Network (founded in 2019), the platform operates under direct guidance from and in coordination with the State Administration for Market Regulation. Notably, the Autohome and CHEZHIWANG platforms have been widely adopted as reliable data sources in numerous high-impact, peer-reviewed journals for automotive-related studies [59,60,61,62]. This use in the scholarly literature further supports the credibility and scientific applicability of the review data derived from these platforms. Given their long-term focus on the automotive sector, broad user bases, long time spans of reviews, and large review volumes, Autohome and CHEZHIWANG are representative platforms for NEV user reviews. Thus, they are selected as the sources of online review data for NEVs in our research.
To address the inherent procedural limitations of text mining (e.g., subjectivity in sentiment classification, ambiguity in topic delineation) and further enhance the reliability and interpretability of sentiment/topic indicators derived from NEV online reviews, we present systematic validation experiments and standardized quality assessment metrics specifically designed for SnowNLP-based sentiment analysis and LDA-based topic modeling.
To examine the performance of SnowNLP in classifying NEV review sentiment, a two-part validation experiment was performed. First, we drew a stratified random sample of 1000 reviews from the 97,887 preprocessed reviews, with every quarter between 2018Q3 and 2024Q4 equally represented (≈40 reviews per quarter), which ensures both temporal coverage and randomness. We manually annotated these reviews on a three-point scale: −1 (negative), 0 (neutral), and +1 (positive). Annotation consistency was assessed with Cohen’s kappa coefficient, which reached 0.87; this exceeds the 0.75 threshold for high consistency [43] and supports the use of the annotations as a gold standard.
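For reference, Cohen’s kappa compares the observed agreement between two sets of labels with the agreement expected by chance. The sketch below is a plain-Python illustration of that standard definition; it assumes two annotators labelling the same reviews, and the function name and toy labels are hypothetical rather than taken from the annotation procedure used in this study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:  # both raters used one identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy example with the -1 / 0 / +1 scale (illustrative labels only).
rater_1 = [1, 1, 0, -1, 1, 0]
rater_2 = [1, 1, 0, -1, 0, 0]
kappa = cohens_kappa(rater_1, rater_2)
```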
Second, the SnowNLP model was tested against this gold standard, yielding accuracy = 92.3%, macro-averaged recall = 91.7%, and macro-averaged F1-score = 0.91. Positive reviews achieve the highest recall (93.5%), while negative reviews have a recall of 89.2%; this minor gap is attributed to the low proportion of negative reviews (≈8.3% of the total dataset).
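The metrics reported above follow the usual definitions: per-class recall and F1 are computed first and then averaged with equal class weights (macro-averaging). The following sketch illustrates that computation under the assumption that model outputs have already been mapped to the three labels −1, 0, and +1; the mapping from SnowNLP scores to labels is not shown, and the function name is hypothetical.

```python
def evaluate_sentiment(y_true, y_pred, classes=(-1, 0, 1)):
    """Return accuracy, per-class recall, macro recall, and macro F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    recall, f1 = {}, {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        rec = tp / (tp + fn) if tp + fn else 0.0
        prec = tp / (tp + fp) if tp + fp else 0.0
        recall[c] = rec
        f1[c] = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return {
        "accuracy": accuracy,
        "recall_per_class": recall,
        "macro_recall": sum(recall.values()) / len(classes),
        "macro_f1": sum(f1.values()) / len(classes),
    }
```

Macro-averaging gives the minority negative class the same weight as the majority positive class, which is why a recall gap on negative reviews remains visible in the aggregate figures.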
The quality of the LDA topic model is evaluated using two core metrics: perplexity (measuring model generalization ability) and coherence score (measuring semantic consistency of topics). For the NEV review corpus, with the number of topics K = 8 (determined via grid search over K = 5, 8, 10, 12), the model achieves a perplexity of 892.
To address ambiguities in data processing procedures, this section clarifies technical specifications for text preprocessing and provides a rationale for normalization methods, along with statistical results.
Noise Removal: Irrelevant characters (including emojis, special symbols like “★” or “→”, and non-Chinese/English text) are removed using the regular expression r'[^\u4e00-\u9fa5a-zA-Z0-9\s]'. Domain-specific terms (e.g., “kW·h”, “fast charging”) are retained to avoid information loss.
Redundant Review Filtering: Duplicate reviews (identical text and posting time) and short reviews (<5 Chinese characters, e.g., “Good” or “Decent”) are excluded, reducing the corpus from 112,345 to 97,887 valid reviews.
Stopword Removal: A hybrid stopword list is used, combining the Harbin Institute of Technology’s general Chinese stopword list (782 terms, e.g., “of”, “because”) and a custom NEV domain stopword list (68 terms, e.g., “automobile”, “vehicle”, “new energy”—terms that appear in >90% of reviews and lack discriminative value).
Word Segmentation: The Word Segmentation library’s “precise mode” is used, with a custom dictionary of 320 NEV-specific terms (e.g., “driving range”, “public charging pile”, “battery degradation”) added to improve segmentation accuracy. Post-segmentation validation shows that domain terms are correctly split in 95.7% of cases (vs. 82.3% without the custom dictionary).
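The noise-removal, filtering, and stopword steps above can be sketched with the Python standard library as follows. This is an illustrative outline rather than the exact pipeline: the function names, the (text, posting time) tuple layout, and the decision to deduplicate on the raw text are assumptions, and word segmentation is omitted because it depends on an external library. Note that the character class as written also strips the “·” in “kW·h”, so retaining such terms would require an additional protection step that is not shown.

```python
import re

# Character class from the noise-removal step: keep Chinese characters,
# Latin letters, digits, and whitespace; everything else is dropped.
NOISE = re.compile(r"[^\u4e00-\u9fa5a-zA-Z0-9\s]")
CHINESE_CHAR = re.compile(r"[\u4e00-\u9fa5]")


def clean_text(text: str) -> str:
    """Remove noise characters and collapse repeated whitespace."""
    return re.sub(r"\s+", " ", NOISE.sub("", text)).strip()


def filter_reviews(reviews, min_chinese_chars: int = 5):
    """Drop exact duplicates (same text and time) and very short reviews.

    `reviews` is an iterable of (text, posted_at) tuples (assumed layout).
    """
    seen, kept = set(), []
    for text, posted_at in reviews:
        key = (text, posted_at)
        if key in seen:
            continue  # duplicate: identical text and posting time
        seen.add(key)
        cleaned = clean_text(text)
        if len(CHINESE_CHAR.findall(cleaned)) < min_chinese_chars:
            continue  # fewer than five Chinese characters
        kept.append((cleaned, posted_at))
    return kept


def remove_stopwords(tokens, stopwords):
    """Filter an already segmented token list against a stopword set."""
    return [t for t in tokens if t not in stopwords]
```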
To address concerns about indicator reliability, we take concrete measures to ensure reproducibility and validate indicator robustness across time and tools.
All experiments involving randomness are strictly controlled to ensure results can be replicated:
Random Seed Fixing: Key tools and libraries use fixed random seeds:
Python version 3.13 base randomness: random.seed(42)
TensorFlow (for model training): tf.random.set_seed(42)
LDA topic modeling (via gensim): random_state=42
SnowNLP sentiment analysis: snownlp.seed(42)
Three experiments are conducted to verify that sentiment and topic indicators are stable across time, tools, and parameter changes:
Temporal Consistency Test: The dataset is split into two periods: Q3 2018 to Q4 2021 (Phase 1) and Q1 2022 to Q4 2024 (Phase 2). The grey correlation between sentiment X17 and NEV sales is 0.81 in Phase 1 and 0.83 in Phase 2; the correlation for X16 is 0.89 and 0.91, respectively. The small differences (0.02 in each case) indicate that the indicators maintain consistent predictive relevance over time.
Cross-Tool Validation: The same 1000 annotated reviews are analyzed using two alternative tools—BosonNLP (Chinese sentiment analysis) and TextBlob (English sentiment analysis, for bilingual reviews). Pearson correlations between SnowNLP scores and BosonNLP/TextBlob scores are 0.93 and 0.88, respectively, indicating high consistency across tools.
LDA Parameter Sensitivity Analysis: The number of topics K is varied (K = 5, 8, 10, 12) to test topic stability. The top 3 keywords for “driving range & charging” (Topic 1) remain unchanged across all K values, and coherence scores fluctuate only between 0.72 and 0.76. This indicates that topic indicators are not sensitive to minor parameter adjustments.
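To make the temporal consistency test concrete, the sketch below computes a grey relational grade between an indicator series and the sales series, which can then be evaluated separately on each phase. It assumes the commonly used Deng formulation with mean-value normalization and a distinguishing coefficient of 0.5, applied to a single comparison series; these choices and the function name are illustrative assumptions, since the exact variant used for the figures above is not restated here.

```python
def grey_relational_grade(reference, comparison, rho: float = 0.5) -> float:
    """Grey relational grade of `comparison` with respect to `reference`."""
    if len(reference) != len(comparison) or not reference:
        raise ValueError("series must be non-empty and of equal length")
    ref_mean = sum(reference) / len(reference)
    cmp_mean = sum(comparison) / len(comparison)
    if ref_mean == 0 or cmp_mean == 0:
        raise ValueError("mean-value normalization needs a non-zero mean")
    # Mean-value normalization removes differences in units and scale.
    ref = [x / ref_mean for x in reference]
    cmp_ = [x / cmp_mean for x in comparison]
    deltas = [abs(r - c) for r, c in zip(ref, cmp_)]
    d_min, d_max = min(deltas), max(deltas)
    if d_max == 0:  # the normalized series coincide exactly
        return 1.0
    # Relational coefficient per time point, then averaged into the grade.
    coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in deltas]
    return sum(coeffs) / len(coeffs)
```

Applying the function to the Phase 1 and Phase 2 slices of the same indicator and comparing the two grades reproduces the structure of the test; when several indicators are ranked together, the minimum and maximum differences are usually taken across all comparison series rather than one.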
The data are presented in Table 2, where indicator names are replaced with simplified symbols; “Y” denotes quarterly NEV sales, and “Q3 2018” represents the third quarter of 2018.
As shown in Table 2, the number of online reviews increased about 58-fold from the third quarter of 2018 to the fourth quarter of 2024, indicating that online discussion of and consumer interest in NEVs have grown sharply in recent years. The period from 2018 to 2019 was a slow start: the number of comments grew slowly from a relatively low base. In 2019, the number of quarterly comments increased from 283 to 747, reflecting the gradual rise in consumer attention to NEVs in the early stage of the market. The period from 2020 to 2021 was one of rapid growth. The number of comments increased from 561 to 1999 in 2020 and from 1736 to 4758 in 2021, a marked acceleration. This may be associated with the surge in online activity during the epidemic, the implementation of government subsidy policies, and the launch of popular car models. From 2022 to 2023, growth slowed or plateaued. The number of comments continued to increase in 2022, but at a slower rate (from 3905 to 6914), and a quarterly decline occurred in 2023. This may be due to market saturation, supply chain issues, or economic uncertainty, which temporarily stabilized the level of discussion. The year 2024 was a period of strong rebound, with the number of comments rising from 4150 to 12,799 and peaking in the fourth quarter. This may be attributed to the maturing of NEV technology, the frequent launch of new models, price wars that stimulated consumption, and global carbon neutrality goals driving long-term demand. The data also show that the number of reviews is often higher in the fourth quarter, which may be related to year-end promotions, auto shows, or car-buying decisions made during holidays.
As shown in Table 2, the sentiment score rose from 0.783 in the third quarter of 2018 to 0.917 in the fourth quarter of 2024. The overall upward trend indicates that consumers’ attitudes towards NEVs are gradually becoming more positive, and it reflects the transition of the new energy vehicle industry from the introduction stage to the mature stage. Consumers have gradually shifted from initial observation and doubt to positive evaluation, which can be attributed to the overall development of the industry and the support of the social environment. In the future, the sentiment score may remain at a relatively high level, but fluctuating factors (such as economic conditions or technical malfunctions) need to be monitored to maintain consumer confidence. In addition, the external influencing factors have all increased to varying degrees, reflecting a relatively favorable development environment for NEVs.
Numerous factors influence NEV sales, yet an excessive number of input indicators would overcomplicate the model, hindering its ability to solve practical problems efficiently. Additionally, not all preselected indicators exhibit strong correlation or high impact on NEV sales. Therefore, it is necessary to screen the 21 preselected indicators before incorporating them into the NEV sales forecasting index system as model inputs.