1. Introduction
The global transition to electric vehicles (EVs) has accelerated dramatically in recent years, with sales reaching 17 million units in 2024, accounting for over 20% of all cars sold worldwide and representing a 25% year-on-year increase [1,2]. This growth is driven by environmental concerns, policy support, and technological advancements, with major markets (China at 45% EV share, Europe at 15%, and the United States at 11%) leading adoption [3,4,5]. However, range anxiety remains a persistent barrier to mass-market penetration, particularly among prospective EV buyers [6,7]. Studies indicate that 78% of future EV owners report high levels of range anxiety, with concerns peaking 1–2 years before purchase [8,9]. This anxiety stems largely from uncertainty about whether battery capacity will suffice to reach destinations, especially under adverse weather conditions where auxiliary systems significantly impact available range [10,11]. Traditional energy consumption models predominantly focus on traction energy (motor efficiency, aerodynamic drag, and rolling resistance), while auxiliary systems such as HVAC (heating, ventilation, and air conditioning) receive limited attention despite their substantial contribution to total energy demand [12,13,14].
Auxiliary systems, particularly HVAC and battery thermal management, have been identified as critical components that affect the efficiency and range of EVs [15,16]. Research demonstrates that auxiliary loads can constitute 20–50% of total energy consumption depending on ambient temperature and driving conditions, with particularly severe impacts during winter operation [17,18]. A comprehensive analysis by Gil-Sayas et al. [19] showed that the mobile air-conditioning (MAC) system alone can increase energy consumption by 25–35% at sub-freezing ambient temperatures (−7 °C), and that cabin heating requirements escalate dramatically in cold climates. The energy penalty is nonlinear: heating power demand increases exponentially as ambient temperature decreases, with typical heating systems consuming 6–9 kW in harsh winter conditions compared to merely 0.2–0.5 kW in moderate temperatures. This seasonal variation translates directly to range reduction; real-world field studies consistently demonstrate that EV driving range decreases by 30–50% in winter conditions compared to summer driving at equivalent battery charge levels [20,21]. The discrepancy between standardized test cycles (EPA, WLTP) and real-world performance is largely attributable to the underestimation of auxiliary loads, as current testing protocols do not adequately capture the complexity of thermal management demands across diverse climatic conditions [22,23]. Consequently, consumers frequently experience cold-weather range degradation that exceeds manufacturer estimates, compounding range anxiety and undermining EV adoption in northern climates and in regions with pronounced seasonal variation [24,25].
Despite the acknowledged importance of auxiliary systems, the current state of electric vehicle energy modeling exhibits significant research gaps. Most published studies on energy consumption employ one of two approaches: (1) physics-based models that estimate traction energy with high precision but largely ignore auxiliary loads [26,27], or (2) data-driven machine learning models that treat the vehicle as a “black box” and rely on empirical patterns without physics-based decomposition [28,29]. Few studies attempt to integrate both approaches by combining the mechanistic understanding of vehicle dynamics with the predictive power of modern machine learning algorithms [30,31]. Furthermore, the vast majority of existing research uses simulation data or controlled laboratory conditions instead of real-world driving data, limiting the applicability of the results to diverse climatic conditions and actual user driving patterns [32,33]. Recent advances in machine learning, particularly ensemble methods such as Random Forest and XGBoost as well as neural networks, have demonstrated significant potential for energy prediction in various transportation applications [34,35]. However, their application to auxiliary system modeling remains underdeveloped. Additionally, most auxiliary energy studies focus on single components (e.g., heating or cooling individually) rather than integrated thermal management systems that must simultaneously manage cabin comfort, battery temperature, defrosting, and heating distribution across multiple zones [36,37]. The lack of comprehensive real-world validation data from instrumented vehicles further constrains model development. Current prediction models typically achieve R² scores in the range of 0.85 to 0.95 on test data, leaving room for significant improvement [38,39]. This work addresses these gaps by developing an integrated physics-based and machine learning framework validated on 95,028 multi-seasonal real-world measurements, achieving high prediction accuracy (R² = 0.998) while maintaining model interpretability.
The primary objective of this study is to develop a comprehensive, physics-informed machine learning framework for predicting auxiliary power consumption in electric vehicles across multi-seasonal real-world driving conditions. Specifically, this work addresses four key research questions: (1) What is the relative contribution of auxiliary systems to total EV energy consumption in diverse weather and driving contexts? (2) What environmental and operational parameters are the dominant predictors of the demand for auxiliary power and what hierarchical relationships exist between them? (3) Can an integrated physics-based and data-driven approach achieve superior prediction accuracy compared to traditional linear methods? (4) What practical design and operational recommendations can be derived from such a model to inform OEM thermal system optimization and user energy management strategies? To address these research questions, we developed an integrated analysis framework combining physics-based energy decomposition with machine learning techniques, validated on multi-seasonal real-world driving data collected from two battery electric vehicle platforms with comprehensive thermal instrumentation. The methodology employs detailed feature engineering to capture vehicle dynamics, thermal interactions, and driving behavior context, followed by systematic model comparison and cross-validation to ensure robustness across random data partitions.
Our specific methodological contributions are as follows: (i) the first integrated energy decomposition framework combining physics-based traction energy modeling with machine learning-based auxiliary power prediction, enabling interpretable separation of propulsion and thermal components; (ii) comprehensive feature engineering incorporating derived features that capture vehicle dynamics, thermal interactions, driving behavior context, and environmental conditions; (iii) a systematic comparison of four machine learning architectures (Linear Regression, Random Forest, XGBoost, and Gradient Boosting) to identify optimal prediction performance; (iv) multi-fold cross-validation across random data partitions to confirm model generalizability and robustness; and (v) a detailed feature importance analysis that reveals the dominant drivers of auxiliary power consumption and their hierarchical relationships, enabling actionable insights for vehicle design and energy management optimization. Results demonstrating the effectiveness of this integrated framework, including quantitative accuracy metrics, feature importance rankings, seasonal heating effects, and context-dependent auxiliary power contributions, are presented in detail in Section 4 (Results). These findings enable practical deployment in original equipment manufacturer energy management systems, consumer-facing range forecasting applications, and fleet management operations.
2. Research Background
Physics-based energy consumption models for electric vehicles have been extensively developed in the past decade, focusing primarily on traction energy components: aerodynamic drag, rolling resistance, motor/inverter losses, and regenerative braking [40,41]. These models typically employ first-principles equations derived from vehicle dynamics and thermodynamics to estimate the energy demand based on speed profiles, road gradients, and vehicle mass [42,43]. Although such approaches achieve reasonable accuracy for the traction energy (R² ≈ 0.85–0.90 in standardized driving cycles), they fundamentally underestimate real-world energy consumption because auxiliary systems are either entirely ignored or treated as constant additive terms [44]. Kim et al. [45] demonstrated with real-world vehicle data that auxiliary systems consume 27–42% of total energy depending on trip conditions, yet most simulation frameworks allocate less than 10% overhead for auxiliaries, leading to systematic range overestimation. Schäfers et al. [46] addressed this gap by developing system identification and deep learning models specifically for auxiliary power prediction in battery electric vehicles, achieving R² = 0.92 on real-world test data, a significant improvement over pure physics-based methods. However, these studies acknowledge limitations, including reliance on controlled laboratory conditions and limited seasonal coverage (single temperature range), highlighting the need for comprehensive multi-seasonal real-world validation datasets. Furthermore, traditional models struggle to capture the nonlinear, temperature-dependent behavior of HVAC systems, where heating power increases exponentially (not linearly) with decreasing ambient temperature, a phenomenon requiring data-driven approaches to model accurately [47,48,49].
The energy consumption of HVAC systems in electric vehicles has gained increasing attention because of its profound impact on driving range, particularly in extreme climatic conditions. Gil-Sayas et al. [19] conducted comprehensive laboratory tests on plug-in hybrid electric vehicles (PHEVs) and battery electric vehicles (BEVs) at −7 °C, 22 °C, and 35 °C over the WLTC driving cycle, revealing that heating power at −7 °C is 4–10 times higher than cooling power at 35 °C. Specifically, the MAC (Mobile Air-Conditioning) impact at −7 °C ranged from 35% to 45% of total energy consumed, compared to 15–18% at 35 °C. These findings align with earlier work by Kim et al. [45] on annual energy consumption of EV air conditioning in China, which established that heating energy exceeds cooling energy in most climates due to lower system efficiency and larger temperature differentials. More recently, the U.S. Department of Energy [50] published a comprehensive analysis demonstrating that cold ambient temperatures reduce BEV range by 30–50%, with cabin heating identified as the primary contributor alongside battery performance degradation. Battery thermal management itself introduces additional auxiliary loads; studies show that maintaining optimal battery temperature (20–25 °C) requires active heating/cooling, further compounding HVAC energy demands. Advanced thermal management strategies, such as heat pumps with a Coefficient of Performance (COP) > 2, cabin pre-conditioning during charging, and multi-zone climate control, have been proposed as mitigation measures. However, most existing studies analyze heating and cooling systems independently, lacking integrated models that simultaneously account for cabin climate control, battery thermal management, defrosting, and their interactions across the full seasonal temperature spectrum [51]. This work addresses this gap by modeling the complete HVAC system as an integrated auxiliary load across −8 °C to +33.5 °C real-world conditions.
Machine learning techniques have emerged as powerful tools for predicting energy consumption in electric vehicles, offering superior performance compared to traditional physics-based models when trained on comprehensive real-world datasets. Ensemble methods, particularly Random Forest, Gradient Boosting, and XGBoost, have demonstrated exceptional capability in capturing complex nonlinear relationships between environmental factors, driving behavior, and energy demand. Rathore et al. [29] compared Random Forest, XGBoost, Linear Regression, ANN, and DNN for EV energy consumption prediction using historical charging data, concluding that XGBoost achieved the highest accuracy due to its gradient boosting framework and regularization techniques. Huang et al. [52] proposed a hybrid modeling approach combining physics-based simulation with data-driven machine learning (generalized additive mixed models, Random Forests, boosting), reducing average prediction error from 40% (pure physics) to 10% (hybrid approach). This demonstrates the value of integrating domain knowledge (vehicle dynamics) with statistical learning. Feature engineering plays a critical role in ML model performance; studies emphasize the importance of temporal features (lag variables, rolling averages), weather-related parameters (temperature, precipitation), and driving cycle characteristics (accelerator patterns, speed variability). Furthermore, model interpretability has become increasingly important for deployment in safety-critical automotive applications. SHAP (SHapley Additive exPlanations) values enable feature importance analysis, revealing which inputs drive model predictions and facilitating stakeholder trust [53,54]. Despite these advances, existing ML studies on EV energy consumption predominantly focus on total vehicle energy rather than specifically modeling auxiliary systems, and most employ synthetic or limited-scope datasets. The present work extends this literature by (1) applying XGBoost specifically to auxiliary power prediction, (2) utilizing comprehensive multi-seasonal real-world data (95,028 records), (3) integrating physics-based energy decomposition with ML, and (4) achieving R² = 0.998, surpassing the typical benchmark performance of R² = 0.90–0.95 reported in prior studies. Table 1 summarizes the scope of prior work relative to the present study.
3. Materials and Methods
The research methodology follows an integrated physics-based and machine learning approach to predict auxiliary power consumption in electric vehicles. As presented in Figure 1, the workflow begins with the acquisition of multi-seasonal real-world driving data from two distinct vehicle platforms, followed by systematic data processing and physics-based energy decomposition into traction and auxiliary components. Feature engineering transforms 52 raw sensor measurements into 24 informative derived features that capture thermal interactions, driving dynamics, and operational context. Four machine learning architectures (Linear Regression, Random Forest, XGBoost, and Gradient Boosting) are trained and compared on identical train-test partitions to identify optimal prediction performance. The selected XGBoost model undergoes cross-validation across random data partitions to confirm robustness and generalizability. Detailed feature importance analysis reveals the dominant predictors of auxiliary power consumption, while residual diagnostics validate model assumptions and prediction reliability. The validated framework enables practical deployment in three application domains: original equipment manufacturer energy management systems, consumer-facing range forecasting tools, and fleet management operations.
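To make the cross-validation step concrete, the following is a minimal sketch of k-fold evaluation over random partitions, scored with R². It is not the pipeline used in this study: the one-variable least-squares fit is only a stand-in for the compared regressors, and the function names, fold count, and seed are illustrative assumptions.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def kfold_indices(n_samples, k=5, seed=0):
    """Shuffle indices once, then yield (train, test) index lists per fold."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def fit_line(x, y):
    """Least-squares fit of y = a*x + b (stand-in for any regressor)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


def cross_validate(x, y, k=5, seed=0):
    """Return one held-out R² score per fold."""
    scores = []
    for train, test in kfold_indices(len(x), k, seed):
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        preds = [a * x[i] + b for i in test]
        scores.append(r2_score([y[i] for i in test], preds))
    return scores
```

Comparing the spread of the per-fold scores, rather than a single train-test split, is what indicates whether performance is stable across random partitions.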
Data were acquired from two battery electric vehicles, covering varied operating conditions for the development and validation of the model. Vehicle 1 is a five-door urban electric car equipped with a permanent magnet synchronous motor delivering 107 kW (143 HP) and 250 Nm maximum torque, driving the rear wheels through a single-speed transmission. This vehicle utilizes a 19 kWh lithium-ion battery that provides a range of up to 122 km (NEDC cycle) with energy consumption below 19 kWh/100 km. It accelerates from 0 to 100 km/h in 11 s and has a maximum speed of 150 km/h. Full battery charging requires approximately 3 h 48 min. Vehicle 2 is an electric hatchback (data based on [58]) equipped with a 125 kW motor delivering 250 Nm maximum torque and reaching 150 km/h top speed with a 0–100 km/h acceleration time of 7.2 s. The vehicle is equipped with a 33 kWh battery that provides a range of 245 to 300 km with fast charging capability (0.7 h DC fast charge, 11 h single-phase household outlet). Both vehicles were instrumented with comprehensive CAN (Controller Area Network) bus data loggers that interfaced with on-board diagnostic systems to capture real-time signals from multiple Electronic Control Units (ECUs). Real-time battery data were collected via the vehicle’s Battery Management System (BMS), supplemented by a HIOKI 3390 power analyzer for voltage and current measurement at ±1% accuracy. Data acquisition was performed at a sampling frequency of 1 Hz (one measurement per second), ensuring sufficient temporal resolution to capture transient HVAC behavior and dynamic driving events. Both vehicles represent typical mid-size battery electric architectures common in European markets, with detailed HVAC specifications as follows:
Vehicle 1:
Heating system: Positive Temperature Coefficient (PTC) resistive heater element with maximum heating capacity of ~5 kW, providing direct cabin heating with on/off control logic. COP (Coefficient of Performance) = 1.0 (resistive heating).
Air conditioning: Vapor-compression refrigerant cycle employing R1234yf refrigerant with variable displacement compressor. Typical cooling capacity 2.5–3.0 kW at rated conditions with COP = 2.8–3.2.
Cabin climate control: Single-zone automatic climate control maintaining user-defined setpoint at approximately 21 ± 2 °C during normal operation. Cabin temperature sensor (thermistor) with ±0.5 °C accuracy provided feedback for HVAC ECU modulation.
Battery thermal management: Liquid-cooled lithium-ion battery pack (LG Chem LMO chemistry, 23 kWh gross capacity) with active coolant circulation pump maintaining battery thermal operating window of 15–35 °C. Coolant circuit shared with cabin heating core for waste heat recovery during cold-start acceleration.
Vehicle 2:
Heating system: Positive Temperature Coefficient (PTC) resistive heater element with maximum heating capacity of ~5 kW, identical architecture to Vehicle 1. On/off control with proportional modulation via PWM (Pulse Width Modulation) signals. COP = 1.0.
Air conditioning: Vapor-compression refrigerant cycle with R1234yf refrigerant and variable capacity compressor. Cooling capacity 2.5–3.0 kW with COP = 2.8–3.2, equipped with electronic expansion valve enabling precise superheat control across load variations.
Cabin climate control: Single-zone automatic climate control maintaining setpoint at 21 ± 2 °C. Dual temperature sensors (cabin and evaporator core) provided redundancy for HVAC ECU control algorithms.
Battery thermal management: Liquid-cooled lithium-ion battery pack (Samsung SDI NMC chemistry, 33–42 kWh gross capacity depending on variant) with active thermal management including optional battery pre-heating during charging. Coolant circuit integrated with both cabin heating and high-voltage component cooling, enabling multi-zone thermal optimization.
Common Instrumentation Architecture:
Heating power monitoring: Dual-channel redundancy via CAN protocol (primary HVAC ECU signal) and LIN protocol (secondary heater controller signal) enabling robust signal integrity verification.
Air conditioning power: Measured directly from HVAC ECU compressor command signal, cross-validated by motor current measurement (compressor electrical draw).
Cabin temperature: High-resolution cabin temperature sensor (thermistor, ±0.5 °C accuracy) mounted in central dashboard location.
Coolant circuit: Five temperature measurement points (heater core inlet, outlet, heat exchanger inlet/outlet, battery inlet) plus coolant volume flow measurement (electromagnetic flowmeter, ±2% accuracy).
Environmental conditions: Ambient temperature sensor (vehicle-mounted, prone to solar radiation error) supplemented by external weather station data (±0.1 °C precision when available), ensuring accurate thermal gradient quantification.
Neither vehicle employed advanced thermal management technologies such as heat pumps (COP > 2) or cabin pre-conditioning during charging, representing standard automotive architectures prevalent in the 2013–2017 model year range. This architectural similarity ensures methodological consistency while limiting generalizability to heat pump-equipped or thermally advanced platforms addressed in future work.
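The practical meaning of the COP values quoted above can be illustrated with a short calculation. This is a sketch only: the 5 kW thermal demand matches the PTC capacity stated for both vehicles, while the heat pump COP of 2.5 is a hypothetical value chosen to satisfy COP > 2, not a measured figure.

```python
def electrical_power_w(thermal_demand_w, cop):
    """Electrical input needed to deliver a thermal load: P_el = Q / COP."""
    if cop <= 0:
        raise ValueError("COP must be positive")
    return thermal_demand_w / cop


# PTC resistive heater, COP = 1.0 (as stated for both test vehicles).
ptc_w = electrical_power_w(5000.0, 1.0)

# Hypothetical heat pump with COP = 2.5 (illustrative assumption).
heat_pump_w = electrical_power_w(5000.0, 2.5)
```

Under these assumptions the resistive heater draws 5000 W from the battery and the heat pump 2000 W for the same cabin heat, which is why results obtained on PTC-only vehicles may not transfer directly to heat pump platforms.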
The diagram of the vehicles tested is shown in Figure 2.
Data collection spanned a multi-seasonal period, enabling capture of diverse ambient temperature conditions and driving patterns. The combined dataset accumulated 95,028 measurement records in urban, suburban, highway, and mixed driving conditions for both vehicles. The ambient temperature ranged from −8 °C to +33.5 °C (mean: 13.5 °C, standard deviation: ±9.8 °C), ensuring a substantial coverage of the winter heating conditions critical for the prediction of the auxiliary power. The vehicles were operated under real-world driving conditions without experimental constraints—normal user driving patterns that included commuting, commercial errands, and recreational trips. This approach maximizes ecological validity compared to standardized test cycles (EPA, WLTP), which utilize predetermined speed profiles under controlled laboratory conditions. Data collection encompassed the following diverse driving scenarios:
Urban driving (0 to 50 km/h typical): traffic congestion, stop-and-go patterns, frequent acceleration/deceleration.
Suburban driving (30–80 km/h): mixed speeds, moderate acceleration, variable traffic density.
Highway driving (80–130 km/h): sustained cruising, minimal acceleration events.
Idle/charging periods (0 km/h): parked vehicle with HVAC active, maximum auxiliary-to-traction power ratio.
No weather-specific data filtering was applied; instead, all naturally occurring weather conditions (sunny, rainy, snowy, cloudy) were included to ensure representative real-world coverage. Both vehicles contributed to the diversity of the overall dataset. This multi-seasonal, unrestricted-driving approach represents a significant advancement over laboratory-controlled studies that typically capture only narrow temperature ranges under standardized cycles.
The 52 measured variables captured across the vehicles’ ECUs included four primary categories: (1) vehicle dynamics and propulsion, (2) electrical battery system, (3) HVAC and thermal systems, and (4) environmental conditions.
The raw measurement dataset comprises 52 sensor parameters spanning vehicle dynamics, battery management, thermal systems, and HVAC operation (Table 2). To prepare data for machine learning while preserving physical interpretability, we systematically engineered 24 features that capture fundamental mechanisms driving auxiliary power consumption. Feature engineering prioritized physics-based rationale over purely statistical correlation, ensuring that each derived feature corresponds to known thermodynamic or mechanical principles governing vehicle energy behavior.
Vehicle dynamics features were engineered from raw velocity, acceleration, throttle position, motor torque, and elevation measurements. Instantaneous velocity was supplemented with polynomial expansion terms (velocity squared, V², and velocity cubed, V³) to capture nonlinear aerodynamic drag scaling, since aerodynamic power increases with the cube of speed (P_aero ∝ V³). This polynomial expansion enables the model to distinguish between urban driving, where aerodynamic power is negligible, and highway cruising, where it dominates energy consumption. Temporal smoothing was applied via moving average filters: a 5-sample moving average (V_ma_5) removes sensor noise while preserving transient dynamics, whereas a 20-sample moving average (V_ma_20) characterizes the slower-varying average speed (a 20 s window at the 1 Hz sampling rate). Driving smoothness was quantified using a 10-sample rolling standard deviation of velocity (V_std_10), where high values indicate stop-and-go urban driving and low values indicate steady highway cruising. This smoothness metric correlates with occupant comfort perception and climate control adjustment frequency, as aggressive driving with frequent acceleration events tends to trigger more dynamic HVAC responses.
Temperature differential features were derived from ambient temperature sensors, cabin temperature measurements, and battery thermal monitoring points. Because steady-state heat transfer between the cabin and its surroundings scales approximately with the temperature difference (Q ∝ ΔT), we computed absolute temperature differences rather than using raw temperature values. The absolute cabin-ambient temperature difference (|ΔT_cabin-ambient| = |T_cabin − T_ambient|) quantifies thermal load independent of temperature direction, while the absolute battery-ambient temperature difference (|ΔT_battery-ambient|) indicates battery thermal management system activation intensity.
Rate-of-change features were calculated using forward differences. The cabin temperature rate (dT_cabin/dt) characterizes HVAC system response speed and heating transient dynamics, which are particularly important during cold-start scenarios where a rapid temperature rise reflects aggressive PTC heater activation. The battery temperature rate (dT_battery/dt) indicates thermal management intensity, with positive gradients during acceleration reflecting increased motor thermal dissipation and negative gradients during idling reflecting thermal stabilization. The cabin temperature setpoint deviation (T_cabin − T_setpoint) quantifies how far the measured cabin temperature lies from the occupant-defined comfort target (approximately 21 ± 2 °C), with larger deviations triggering more aggressive HVAC modulation.
Battery and energy state features were engineered from battery voltage, current, state-of-charge, and temperature measurements. Battery power was computed directly as the product of voltage and current (P_batt = V_battery × I_battery), providing the instantaneous electrical power delivered or absorbed. A binary charging indicator (1 if I_battery > 0, 0 otherwise) captures fundamental differences in thermal management strategy between charging and driving modes. The rate of change of state-of-charge (dSoC/dt, in percent per minute) indicates charging or discharging speed; rapid SoC change during fast charging activates maximum battery cooling to prevent thermal runaway. To improve model stability across seasonal variation, we computed heating efficiency per temperature gradient (P_heating/|ΔT_cabin-ambient|), a physics-based normalization that accounts for temperature-dependent variation in heating system effectiveness and improves cross-seasonal prediction accuracy.
Physics-based energy decomposition features were calculated using first-principles vehicle dynamics combined with instantaneous velocity, acceleration, and elevation rate measurements.
As detailed in Equations (2)–(6), we computed aerodynamic drag power, rolling resistance power, acceleration power, and elevation power using standard automotive engineering models with representative vehicle parameters (mass, drag coefficient, frontal area). The sum of these four components yields the total traction power required to propel the vehicle. Auxiliary power was measured directly as the sum of heating and air conditioning loads (Equation (7)), and the auxiliary power percentage was computed as (P_auxiliary/P_total_battery) × 100%, quantifying the proportion of battery energy consumed by the auxiliary systems. This percentage reveals context-dependent contributions ranging from approximately 12% during highway cruising (traction-dominated) to 75% during idling with HVAC active (auxiliary-dominated). This physics-based separation prevents confounding effects where aggressive driving dynamics could mask auxiliary system behavior.
Driving context features were derived from velocity and acceleration patterns to characterize the operational regime. Driving phase classification categorizes instantaneous speed into five regimes: Idle (<5 km/h), City (5–20 km/h), Suburban (20–50 km/h), Highway (50–80 km/h), and Very Fast (>80 km/h), enabling the model to distinguish between the fundamentally different HVAC activation patterns of urban congestion and highway steady-state operation. A speed variability index, computed as the rolling standard deviation of velocity over a 10-sample window, provides a metric of driving smoothness, with high values indicating the frequent acceleration/deceleration cycles typical of urban congestion and low values indicating steady-state highway cruising. This variability metric correlates with occupant-perceived comfort and climate control adjustment frequency.
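The velocity-derived features described above (polynomial terms, moving averages, rolling standard deviation, and driving phase) can be sketched as follows. This is an illustrative reconstruction rather than the authors' code: the handling of the first samples (shorter windows), the use of the population standard deviation, and the treatment of the regime boundaries (half-open intervals, with exactly 80 km/h counted as Highway) are assumptions.

```python
from statistics import mean, pstdev


def rolling(values, window, func):
    """Apply func to the trailing window ending at each sample.

    Windows are shorter at the start of the series (assumption).
    """
    return [func(values[max(0, i - window + 1): i + 1])
            for i in range(len(values))]


def driving_phase(speed_kmh):
    """Map instantaneous speed to one of the five regimes in the text."""
    if speed_kmh < 5:
        return "Idle"
    if speed_kmh < 20:
        return "City"
    if speed_kmh < 50:
        return "Suburban"
    if speed_kmh <= 80:
        return "Highway"
    return "Very Fast"


def velocity_features(speed_kmh):
    """Return one feature dict per 1 Hz velocity sample."""
    ma5 = rolling(speed_kmh, 5, mean)
    ma20 = rolling(speed_kmh, 20, mean)
    std10 = rolling(speed_kmh, 10, pstdev)  # population std (assumption)
    return [
        {"V": v, "V2": v ** 2, "V3": v ** 3,
         "V_ma_5": a, "V_ma_20": b, "V_std_10": s,
         "phase": driving_phase(v)}
        for v, a, b, s in zip(speed_kmh, ma5, ma20, std10)
    ]
```

Trailing windows are used so that each feature depends only on past and present samples, which keeps the features usable for on-line prediction.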
From the initial 52 raw sensor measurements, this 24-feature subset was selected using four systematic criteria. (1) Physics relevance: each feature must correspond to a known physical mechanism driving auxiliary power consumption, which excludes generic features such as time-of-day or day-of-week that lack a mechanistic connection to HVAC behavior. (2) Measurement quality: only sensors with less than 2% missing data across the entire 95,028-record dataset were retained, and sensors with documented calibration drift or high instrumental noise were eliminated. (3) Multicollinearity screening: Pearson correlation analysis identified and removed redundant features with correlation coefficients |r| > 0.95, improving model interpretability and reducing the risk of overfitting. (4) Information gain: mutual information analysis quantified the association strength between each candidate feature and the target variable (auxiliary power), retaining features with mutual information I(X;Y) > 0.01 bits to exclude random or spurious associations.
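The multicollinearity screening criterion can be sketched as a greedy filter on pairwise Pearson correlation. The source states only the |r| > 0.95 threshold; the keep-the-first-seen tie-breaking rule, the function names, and the toy data in the usage note are assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def drop_collinear(features, threshold=0.95):
    """Keep a feature only if |r| <= threshold against every kept feature.

    `features` maps feature name -> list of values; insertion order decides
    which member of a redundant pair survives (assumption).
    """
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept
```

For example, with a velocity column, a near-duplicate of it, and an unrelated temperature-difference column, only the near-duplicate is removed.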
The comprehensive energy analysis in this study is based on a systematic decomposition of total vehicle power consumption into two primary components:
$$P_{\text{total}} = P_{\text{traction}} + P_{\text{auxiliary}} \tag{1}$$
where $P_{\text{total}}$ represents instantaneous total power delivered by the battery [W], $P_{\text{traction}}$ encompasses energy required to overcome vehicle resistive forces and accelerate the vehicle [W], and $P_{\text{auxiliary}}$ represents all auxiliary systems including HVAC, thermal management, and auxiliary vehicle functions [W]. This decomposition provides physical insight into the energy balance of the electric vehicle while enabling quantification of auxiliary system contributions. The traction component is computed using first-principles vehicle dynamics, while auxiliary systems, particularly HVAC, are modeled through machine learning methods trained on measured data. This hybrid approach combines the interpretability of physics-based models with the accuracy of data-driven methods, leveraging the strengths of both paradigms.
Traction power encompasses all energy required to propel the vehicle and overcome environmental resistance. Using standard automotive engineering models, traction power is decomposed into four components:
$$P_{\text{traction}} = P_{\text{aero}} + P_{\text{roll}} + P_{\text{accel}} + P_{\text{elev}} \tag{2}$$
Each component is calculated as follows:
$$P_{\text{aero}} = \tfrac{1}{2}\,\rho\,C_d\,A\,V^3 \tag{3}$$
where
$C_d$: aerodynamic drag coefficient
$A$: vehicle frontal area [m²]
$\rho$: air density [kg/m³], approximately 1.225 at sea level
$V$: vehicle velocity [m/s]
The aerodynamic drag power scales with the cube of velocity, making this component particularly significant at highway speeds (>80 km/h) but negligible during urban driving (<30 km/h).
$$P_{\text{roll}} = C_{rr}\,m\,g\,V \tag{4}$$
where
$C_{rr}$: coefficient of rolling resistance
$m$: vehicle mass [kg]
$g$: gravitational acceleration [9.81 m/s²]
$V$: vehicle velocity [m/s]
Rolling resistance represents the energy dissipated by tire deformation and friction with the road surface. This component scales linearly with velocity, unlike aerodynamic drag.
$$P_{\text{accel}} = m\,a\,V \tag{5}$$
where
$a$: longitudinal acceleration [m/s²].
Acceleration power represents the rate of increase in kinetic energy. This term is set to zero during deceleration or coasting phases (i.e., $P_{\text{accel}} = \max(0,\, m\,a\,V)$), as braking and regenerative energy recovery are handled separately.
$$P_{\text{elev}} = m\,g\,\frac{dh}{dt} \tag{6}$$
where
$dh/dt$: rate of elevation change [m/s], calculated as the time derivative of elevation measured by GPS.
Similar to acceleration power, elevation power is clipped to zero during descent (i.e., $P_{\text{elev}} = \max(0,\, m\,g\,dh/dt)$), as downhill driving provides an opportunity for potential energy recovery.
The standard vehicle parameters used in all traction energy calculations are presented in Table 3 below. These parameters are representative of mid-size electric vehicles and remain constant throughout the analysis.
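The four traction terms can be evaluated per sample as in the following minimal sketch. The drag coefficient, frontal area, rolling resistance coefficient, and mass below are hypothetical placeholders for a mid-size EV and are not the Table 3 values; only the air density and gravitational acceleration come from the text.

```python
RHO_AIR = 1.225  # air density [kg/m^3], sea-level value quoted in the text
G = 9.81         # gravitational acceleration [m/s^2]

# Hypothetical placeholder parameters (NOT the Table 3 values).
C_D = 0.28            # aerodynamic drag coefficient
FRONTAL_AREA = 2.3    # vehicle frontal area [m^2]
C_RR = 0.010          # rolling resistance coefficient
MASS = 1800.0         # vehicle mass [kg]


def traction_power(v_mps, accel_mps2, climb_rate_mps):
    """Return the four traction power terms and their sum, all in watts."""
    p_aero = 0.5 * RHO_AIR * C_D * FRONTAL_AREA * v_mps ** 3
    p_roll = C_RR * MASS * G * v_mps
    # Acceleration and elevation terms are clipped at zero, as in the text.
    p_accel = max(0.0, MASS * accel_mps2 * v_mps)
    p_elev = max(0.0, MASS * G * climb_rate_mps)
    return {
        "aero": p_aero,
        "roll": p_roll,
        "accel": p_accel,
        "elev": p_elev,
        "total": p_aero + p_roll + p_accel + p_elev,
    }


# Example: 90 km/h (25 m/s) while braking gently on a slight descent.
sample = traction_power(v_mps=25.0, accel_mps2=-1.0, climb_rate_mps=-0.5)
print({name: round(value, 1) for name, value in sample.items()})
```

With the clipping applied, only the drag and rolling terms remain non-zero in this example, and doubling the speed multiplies the drag term by eight.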
Auxiliary power consumption includes all vehicle systems that are not directly related to propulsion. The dominant auxiliary loads are the following:
$$P_{\text{auxiliary}} = P_{\text{heating}} + P_{\text{AirCon}} + P_{\text{other}}$$
where:
$P_{\text{heating}}$: power consumption of cabin heating and battery thermal management systems [W]
$P_{\text{AirCon}}$: power consumption of air conditioning/cooling systems [W]
$P_{\text{other}}$: power for vehicle control systems, lighting, and other auxiliary functions [W]
Heating power monitoring used dual-channel redundancy through separate vehicle communication protocols to ensure signal integrity:
CAN protocol heating power: Primary measurement from HVAC ECU, representing main heating system load
LIN protocol heating power: Secondary verification from heating element controller, validating CAN measurements
Both signals demonstrated excellent correlation (r = 0.9660), confirming that they are redundant measurements of the same load rather than monitors of separate heating circuits. For model development, the dual-channel approach enabled robust quality control: the CAN signal served as the primary input, while the LIN signal was retained to fill gaps during intermittent communication dropouts. This strategy maximized the completeness of the dataset while preventing duplication of signals. The power consumption of the heating system responds to the temperature difference between the cabin and the ambient air. When the cabin temperature falls below the comfort setpoint, the PTC (Positive Temperature Coefficient) resistive heater is activated. Power consumption exhibits strongly nonlinear behavior with ambient temperature, increasing exponentially as temperature drops, particularly below freezing (0 °C).
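The gap-filling rule described above can be sketched as follows, assuming both channels are aligned sample by sample and that a dropout is represented as `None`. The function name and the sample values are illustrative, not taken from the study.

```python
def fill_heating_gaps(can_power_w, lin_power_w):
    """Use the CAN reading when present, otherwise fall back to LIN.

    A sample stays None when both channels dropped out at the same time.
    """
    merged = []
    for can_value, lin_value in zip(can_power_w, lin_power_w):
        merged.append(can_value if can_value is not None else lin_value)
    return merged


# Hypothetical 1 Hz samples in watts; None marks a communication dropout.
can_samples = [1000.0, None, 1200.0, None]
lin_samples = [990.0, 1100.0, None, None]
heating_w = fill_heating_gaps(can_samples, lin_samples)
print(heating_w)
```

Because only one value is kept per timestamp, the two channels never contribute duplicate power to the auxiliary total.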
Air conditioning power represents the compressor and cooling system load. This signal is measured directly from the HVAC ECU. Unlike heating, cooling power shows weak dependence on ambient temperature (only significant when T_ambient > 20 °C) but responds strongly to solar load and cabin temperature setpoint.
The remaining auxiliary consumption (lighting, control systems, vehicle electronics) is typically constant (~0.3–0.5 kW) and was treated as background load. For this study, the main focus remained on HVAC systems as the dominant auxiliary consumers; other loads were measured directly or accounted for by residual analysis.
Data processing proceeded as follows:
Unit conversion: All power values converted to watts [W]; velocity converted from km/h to m/s; elevation rates calculated as central differences in GPS altitude data.
Filtering and smoothing: Raw acceleration signals smoothed using 5-point moving median filter to reduce accelerometer noise; velocity signals smoothed with 3-point moving average to eliminate CAN bus signal dropouts.
Outlier identification: Records with Ptotal > 150 kW flagged as measurement errors and excluded (implausible for mid-size EV); <0.1% of data affected.
Temporal aggregation: Although collected at 1 Hz, data were analyzed at both instantaneous (1 s resolution) and trip-aggregated (full driving session) levels depending on analysis requirements.
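The processing steps listed above can be sketched with the standard library alone. The handling of window edges (shrinking windows) and of the first and last samples in the central difference (one-sided differences) is an assumption, since the text does not specify it.

```python
from statistics import median


def kmh_to_mps(v_kmh):
    """Convert velocity from km/h to m/s."""
    return v_kmh / 3.6


def central_difference(values, dt_s=1.0):
    """Central differences in the interior, one-sided at both ends."""
    n = len(values)
    if n < 2:
        return [0.0] * n
    rates = [(values[1] - values[0]) / dt_s]
    for i in range(1, n - 1):
        rates.append((values[i + 1] - values[i - 1]) / (2 * dt_s))
    rates.append((values[-1] - values[-2]) / dt_s)
    return rates


def centered_filter(values, width, reducer):
    """Apply reducer over a centered window that shrinks at the edges."""
    half = width // 2
    return [reducer(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]


def window_mean(window):
    return sum(window) / len(window)


def flag_outliers(p_total_w, limit_w=150_000.0):
    """True where total power exceeds the 150 kW plausibility limit."""
    return [p > limit_w for p in p_total_w]


# Hypothetical raw samples at 1 Hz.
elevation_rate = central_difference([0.0, 1.0, 4.0, 9.0])
accel_smooth = centered_filter([0.0, 0.0, 10.0, 0.0, 0.0], 5, median)
speed_smooth = centered_filter([3.0, 6.0, 9.0], 3, window_mean)
flags = flag_outliers([100_000.0, 160_000.0])
print(elevation_rate, accel_smooth, speed_smooth, flags)
```

The median filter removes the isolated acceleration spike entirely, which is why a median rather than a mean is a natural choice for spiky accelerometer noise.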
This decomposition framework provides a physically grounded understanding of energy flows while enabling quantitative attribution of energy consumption to specific vehicle systems. The auxiliary power component Pauxiliary, particularly heating and AirCon, becomes the primary target for machine learning modeling in subsequent sections.
4. Results
This section presents a comprehensive analysis of auxiliary power prediction in electric vehicles using the integrated physics-based and machine learning framework described in Materials and Methods. The analysis proceeds through four sequential components: first, systematic feature engineering that transforms 52 raw measurements into 24 informative derived features that capture thermal, dynamic, and operational contexts; second, comparative evaluation of four machine learning architectures to identify optimal prediction performance; third, detailed feature importance analysis that reveals dominant drivers of auxiliary power consumption; and fourth, rigorous cross-validation evaluation that confirms model robustness and generalizability across random data partitions. The multi-seasonal dataset comprising 95,028 measurement records provides comprehensive coverage of real-world driving conditions ranging from idle parked vehicle states to highway cruising, across temperature extremes from −8 °C to +33.5 °C. Results demonstrate that machine learning-based prediction achieves substantial improvements over traditional physics-based approaches, with a specific focus on quantifying context-dependent auxiliary power contributions, characterizing strong nonlinear temperature dependencies, and revealing fundamental coupling between traction and auxiliary system loads that contradicts conventional constant-overhead assumptions in vehicle energy modeling.
4.1. Feature Engineering
Auxiliary power consumption in electric vehicles exhibits complex, nonlinear dependencies on environmental and operational parameters that cannot be adequately captured by raw sensor signals alone. To enable accurate machine learning-based prediction, systematic feature engineering was performed to transform 52 raw sensor measurements into 24 informative derived features. These engineered features were organized into five primary categories: (1) temperature-based features capturing thermal gradients and dynamics (8 features), (2) velocity and dynamics features encoding aerodynamic and resistance effects (7 features), (3) battery state features reflecting energy management context (4 features), (4) heating demand indicators quantifying thermal load (2 features), and (5) driving cycle classification discretizing operational context (1 feature). This set of engineered features captures both the physical mechanisms driving the behavior of the HVAC system (through physics-derived features) and enables machine learning models to identify complex nonlinear patterns in real-world data (through derived nonlinear and statistical features).
The foundation for feature engineering is a physics-based energy decomposition framework that systematically separates total vehicle power consumption into six major components. As illustrated in Figure 3, the decomposition model characterizes the following:
Traction energy components: Aerodynamic drag (2315 W average), rolling resistance (1179 W), acceleration/inertia (2845 W), and elevation/gradient forces (402 W).
Auxiliary energy components: Heating systems (2243 W average) and air conditioning systems (183 W average).
The multi-seasonal dataset shows that total traction power averages 6741 W while total auxiliary power averages 2426 W, indicating that auxiliary systems represent 26.5% of average total power consumption. Most critically, these averages mask dramatic contextual variation, as detailed in the comprehensive temperature analysis of Figure 4. Figure 5 reveals that the auxiliary power percentage peaks at 75% during idle operation (0–5 km/h, including parked vehicles with HVAC active), declines to 58% during city driving (5–20 km/h), drops further to 23% during highway cruising (50–80 km/h), and plateaus at 12% during very fast driving (>80 km/h). This inverse relationship between velocity and auxiliary power percentage, driven by the cubic scaling of aerodynamic drag with speed, fundamentally motivates the need for velocity-aware feature engineering.
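The 26.5% share follows directly from the component averages reported for Figure 3. A minimal check of that arithmetic, with dictionary keys chosen here only for illustration:

```python
# Average component powers reported for Figure 3, in watts.
TRACTION_W = {"aero": 2315, "rolling": 1179, "acceleration": 2845, "elevation": 402}
AUXILIARY_W = {"heating": 2243, "air_conditioning": 183}


def auxiliary_share_percent(traction_w, auxiliary_w):
    """Auxiliary power as a percentage of total (traction + auxiliary) power."""
    total = traction_w + auxiliary_w
    return 100.0 * auxiliary_w / total if total > 0 else 0.0


traction_total = sum(TRACTION_W.values())
auxiliary_total = sum(AUXILIARY_W.values())
share = auxiliary_share_percent(traction_total, auxiliary_total)
print(traction_total, auxiliary_total, round(share, 1))
```

The same function applied per driving phase, rather than to dataset-wide averages, yields the context-dependent percentages discussed for Figure 5.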
Temperature emerges as the dominant predictor of auxiliary power consumption, with engineered features that capture both steady-state thermal state and dynamic thermal evolution. Figure 4 provides a comprehensive analysis of temperature effects:
Figure 4a (Heating Power Temperature Dependence): The data reveal a striking nonlinear relationship. Heating power under cold conditions (<−5 °C) averages 4.1 kW, declining sharply through the temperate range to 0.2 kW under warm conditions (>25 °C). This represents a 20.5-fold increase from warm to cold. The nonlinear character of this relationship reflects its thermodynamic basis: the required heating power scales approximately as the temperature differential raised to the power ~1.3, plus an offset component from system losses.
Figure 4c (Cabin-Ambient Temperature Differential Distribution): The cabin-ambient differential exhibits a mean of 10.3 °C with a distribution ranging from approximately −8 °C (cabin cooler than ambient in summer cooling mode) to +25 °C (maximum heating demand in winter). The marked bimodal distribution with peaks at ~5 °C (mild winter/autumn) and ~18 °C (summer) reflects the multi-seasonal character of the dataset and seasonal driving patterns in the continental climate.
Figure 4d (Heating Power Versus Temperature Differential): The scatter plot reveals the underlying physical relationship: heating power exhibits a nonlinear (approximately quadratic) dependence on the cabin-ambient temperature differential. Most critically, the data reveal power saturation at extreme differentials (>15 °C), where heating power plateaus at maximum system capacity (~4–5 kW), indicating that the PTC heating elements reach their design limit. This saturation behavior is physically meaningful: the heating system cannot exceed its maximum electrical power, which represents a hard constraint on auxiliary system capability.
Engineered temperature features capture these nonlinear behaviors:
Cabin-ambient temperature differential (ΔTCA = Tcabin − Tambient): Directly quantifies the thermal gradient driving the heating/cooling demand.
Absolute temperature differentials (|ΔTCA|, |ΔTbattery−ambient|): Enable symmetric treatment of heating and cooling; analysis confirms that both scale with the absolute differential magnitude.
Temperature change rates (dTcabin/dt, dTbattery/dt): Capture transient thermal dynamics, enabling detection of HVAC control system switching points.
Temperature categories (discrete bins): Enable tree-based models to learn regime-specific behavior, such as heating thresholds (<15 °C ambient) or cooling activation (>20 °C).
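A minimal sketch of how these temperature features might be derived from aligned 1 Hz signals is given below. The 15 °C and 20 °C thresholds come from the text; the bin labels, the backward-difference rate estimate, and the function names are assumptions for illustration.

```python
def ambient_bin(t_ambient_c):
    """Coarse ambient regime using the 15 °C and 20 °C thresholds in the text.

    The three labels are hypothetical; the study's actual bins are not given.
    """
    if t_ambient_c < 15.0:
        return "heating"
    if t_ambient_c > 20.0:
        return "cooling"
    return "neutral"


def temperature_features(t_cabin_c, t_ambient_c, t_battery_c, dt_s=1.0):
    """Build per-sample temperature features from three aligned signals."""
    rows = []
    for i in range(len(t_cabin_c)):
        delta_ca = t_cabin_c[i] - t_ambient_c[i]
        delta_ba = t_battery_c[i] - t_ambient_c[i]
        if i == 0:
            # No previous sample: the rate is taken as zero (assumption).
            d_cabin, d_battery = 0.0, 0.0
        else:
            d_cabin = (t_cabin_c[i] - t_cabin_c[i - 1]) / dt_s
            d_battery = (t_battery_c[i] - t_battery_c[i - 1]) / dt_s
        rows.append({
            "delta_cabin_ambient": delta_ca,
            "abs_delta_cabin_ambient": abs(delta_ca),
            "abs_delta_battery_ambient": abs(delta_ba),
            "dT_cabin_dt": d_cabin,
            "dT_battery_dt": d_battery,
            "ambient_bin": ambient_bin(t_ambient_c[i]),
        })
    return rows


# Hypothetical winter samples in °C.
features = temperature_features(
    t_cabin_c=[20.0, 21.0], t_ambient_c=[-5.0, -5.0], t_battery_c=[10.0, 10.5]
)
print(features[1])
```

The signed differential distinguishes heating from cooling demand, while the absolute value gives the models a single magnitude that is comparable across seasons.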
Vehicle dynamics encode driving behavior and operational context, fundamentally influencing both traction energy requirements and the auxiliary share of total power consumption. Figure 5 characterizes driving dynamics across the multi-seasonal dataset:
Figure 5a (Velocity Distribution): Velocity exhibits a multimodal distribution with a mean of 45.1 km/h (median: 41.7 km/h), reflecting mixed urban-suburban-highway driving patterns typical of Eastern European vehicle usage [65,66]. The prominent peak near 0 km/h represents frequent stop-and-go urban driving and idle periods; secondary peaks at ~30 km/h and ~90 km/h represent suburban and highway cruising regimes.
Figure 5b (Acceleration Distribution): Longitudinal acceleration exhibits a mean of 0.001 m/s² (essentially zero), with a roughly symmetric distribution spanning the regions of deceleration (−0.5 to 0 m/s²) and acceleration (+0.5 to +1.5 m/s²). The approximately Gaussian distribution with slight positive skew (favoring acceleration) indicates relatively balanced driving dynamics without extreme maneuvers.
Figure 5c,d (Driving Phase Segmentation): Most significant for auxiliary power analysis, the pie chart reveals:
Idle driving (0–5 km/h): 5.4% of driving time with a 75% auxiliary percentage (vehicles parked with HVAC active).
City driving (5–20 km/h): 12.1% of time with a 58% auxiliary percentage.
Suburban driving (20–50 km/h): 35% of time with a 34% auxiliary percentage.
Highway driving (50–80 km/h): 26.8% of time with a 23% auxiliary percentage.
Very fast driving (>80 km/h): 19.7% of time with a 12% auxiliary percentage.
This inverse relationship between velocity and the percentage of auxiliary power drives the need for velocity-aware feature engineering. Engineered velocity features include:
Polynomial velocity terms (V, V², V³): Capture nonlinear aerodynamic drag scaling (Paero ∝ V³), enabling models to recognize that high-speed highway driving dramatically increases traction power while leaving auxiliary power relatively constant.
Velocity moving averages (5 s, 20 s windows): Smooth high-frequency CAN bus noise; a 20 s average is particularly effective at identifying driving context.
Velocity variability (rolling standard deviation): Quantifies driving smoothness; a smooth highway cruise exhibits low variability, while stop-and-go urban driving exhibits high variability.
Driving phase classification (categorical): Enables tree-based models to learn different operating regimes and auxiliary power percentage patterns.
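The velocity features above can be sketched as follows, assuming 1 Hz sampling so that a 5 s window is 5 samples and a 20 s window is 20 samples. Trailing windows and upper-inclusive phase boundaries are assumptions, since the text does not state how window alignment or bin edges were handled.

```python
from statistics import mean, pstdev


def driving_phase(v_kmh):
    """Map speed to the five driving phases (upper bounds inclusive)."""
    if v_kmh <= 5.0:
        return "idle"
    if v_kmh <= 20.0:
        return "city"
    if v_kmh <= 50.0:
        return "suburban"
    if v_kmh <= 80.0:
        return "highway"
    return "very_fast"


def trailing_window(values, i, width):
    """Samples from the last `width` steps up to and including index i."""
    return values[max(0, i - width + 1): i + 1]


def velocity_features(v_kmh):
    """Per-sample polynomial, smoothed, variability, and phase features."""
    rows = []
    for i, v in enumerate(v_kmh):
        short = trailing_window(v_kmh, i, 5)
        long_ = trailing_window(v_kmh, i, 20)
        rows.append({
            "v": v,
            "v2": v ** 2,
            "v3": v ** 3,
            "v_mean_5s": mean(short),
            "v_mean_20s": mean(long_),
            "v_std_5s": pstdev(short),  # 0.0 for a single-sample window
            "phase": driving_phase(v),
        })
    return rows


# Hypothetical speed trace in km/h.
rows = velocity_features([0.0, 10.0, 20.0])
print(rows[2])
```

Trailing windows use only past samples, which keeps the features usable for online prediction in a vehicle.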
Battery state variables and thermal load indicators provide additional context for activation and optimization of the auxiliary system. Figure 6 reveals the distributions of key engineered features.
Battery State Features:
Heating Demand Features:
Heating demand magnitude (Qdemand = |Tcabin − Tambient|): Figure 6 (middle-right) shows an approximately uniform distribution across the 0–25 °C range, reflecting multi-seasonal coverage. This feature directly quantifies the magnitude of the thermal load, bridging physics understanding (heat transfer ∝ ΔT) with data-driven ML.
Smoothness of the velocity (σv over a 5 s window): Figure 6 (bottom-right) shows the distribution with a peak near 0 km/h (smooth highway driving) and a tail extending to ~150 km/h (erratic stop-and-go urban driving). This feature encodes the stability of the driving pattern, which correlates with the HVAC activation frequency.
Figure 6 provides visual confirmation that the engineered features capture diverse patterns: temperature differentials span −10 to +25 °C, squared velocities range from 0 to 20,000 (km/h)², battery power exhibits a broad range of −75 to +50 kW, and heating demand uniformly spans the operational envelope.
4.2. Machine Learning Models and Comparison
To predict auxiliary power consumption from the 24 engineered features, four machine learning models were developed and compared: Linear Regression (baseline for establishing minimum performance), Random Forest (ensemble method capturing nonlinearities), XGBoost (gradient boosting with regularization), and Gradient Boosting (sequential error correction). This multi-model approach enables identification of the optimal balance between predictive accuracy and model complexity. All models were trained on identical 80/20 train-test splits (76,022 training samples, 19,006 test samples) with consistent preprocessing: missing values were filled via median imputation, continuous features were standardized to zero mean and unit variance (StandardScaler fitted on the training set only to prevent data leakage), and hyperparameters were tuned via grid search on a validation set.
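The leakage-free preprocessing can be illustrated for a single feature column with a minimal standard-library sketch. It stands in for the median imputation and StandardScaler steps named above rather than reproducing the authors' pipeline; the function names and sample values are hypothetical. The population standard deviation is used, which matches how StandardScaler computes its scale.

```python
from statistics import mean, median, pstdev


def fit_preprocessor(train_column):
    """Learn the fill value, mean, and std from the TRAINING column only."""
    observed = [v for v in train_column if v is not None]
    fill = median(observed)
    filled = [fill if v is None else v for v in train_column]
    std = pstdev(filled)
    return {"fill": fill, "mean": mean(filled), "std": std if std > 0 else 1.0}


def transform(column, params):
    """Impute and standardize any column with previously fitted parameters."""
    return [((params["fill"] if v is None else v) - params["mean"]) / params["std"]
            for v in column]


# Hypothetical feature values; None marks a missing reading.
train = [1.0, None, 3.0, 5.0]
test = [None, 7.0]

params = fit_preprocessor(train)         # statistics come from train only
train_scaled = transform(train, params)
test_scaled = transform(test, params)    # test data never influences params
print(params, test_scaled)
```

Fitting on the combined data instead would let test-set statistics shape the training inputs, which is the leakage the text guards against.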
Figure 7 presents a comprehensive performance comparison across four models and five key metrics:
Figure 7a (Comparison of R² scores): XGBoost and Random Forest achieve near-perfect R² scores on both the training (0.998–0.9993) and test (0.9984–0.9986) sets, dramatically outperforming Linear Regression (train R²: 0.478, test R²: 0.492). The minimal train-test R² gap (<0.001 for tree-based models) indicates excellent generalization with negligible overfitting, in stark contrast to Linear Regression, which exhibits massive underfitting because the linear assumption fails for nonlinear auxiliary power phenomena.
Figure 7b (RMSE Comparison): Root Mean Squared Error (W) reveals a dramatic performance hierarchy. Linear Regression achieves a test RMSE of 2343 W, essentially useless for practical prediction (the mean auxiliary power is only 2428 W). Random Forest reduces the error to 133 W, XGBoost to 145 W, and Gradient Boosting to 195 W. These error magnitudes represent approximately 5.5–8.0% of average auxiliary power for the tree-based models versus 96% for Linear Regression.
Figure 7c (MAE Comparison): Mean Absolute Error (±W) confirms the hierarchy: Linear Regression ±1257 W, Random Forest ±32 W, XGBoost ±35 W, and Gradient Boosting ±47 W. XGBoost achieves a mean absolute prediction error of ±35 W, enabling real-world deployment where users could expect energy estimates accurate to within ~1.5% of actual consumption.
Figure 7d (MAPE Comparison): Mean Absolute Percentage Error (%) shows Linear Regression at 51.2% (unacceptable), Random Forest at 1.3%, XGBoost at 1.4%, and Gradient Boosting at 1.9%. The <2% MAPE for all tree-based models represents state-of-the-art performance for auxiliary power prediction in published literature.
Figure 7e (Overfitting Analysis): The train-test R² gap distinguishes the models: Linear Regression 0.0140 (severe underfitting), Random Forest 0.0009, XGBoost 0.0000 (essentially perfect generalization), and Gradient Boosting 0.0010. XGBoost exhibits zero measurable overfitting, with train and test R² identical to four decimal places (0.9980), indicating that the model generalizes identically to seen and unseen data.
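For reference, the four error metrics compared in Figure 7 can be computed as in this minimal sketch. The sample values are hypothetical, and skipping zero-valued targets in the MAPE is an assumption, since the text does not say how zero auxiliary power readings were treated.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, MAE, and MAPE for paired observations.

    Assumes y_true is not constant and contains at least one non-zero value.
    """
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # Zero targets are skipped to avoid division by zero (assumption).
    nonzero = [(t, r) for t, r in zip(y_true, residuals) if t != 0]
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
        "mape": 100.0 * sum(abs(r / t) for t, r in nonzero) / len(nonzero),
    }


# Hypothetical auxiliary power values in watts.
metrics = regression_metrics(
    y_true=[100.0, 200.0, 300.0, 400.0],
    y_pred=[110.0, 190.0, 310.0, 390.0],
)
print({name: round(value, 4) for name, value in metrics.items()})
```

Note that RMSE and MAE are in watts while MAPE is a percentage, so dividing RMSE by the mean target gives a different relative error than MAPE does.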
XGBoost model—detailed analysis
XGBoost was selected as the primary model due to its superior performance and interpretability. The optimized configuration employs 150 boosting rounds (estimators), a maximum tree depth of 7 (limiting model complexity), a learning rate of 0.1 (controlling the step size of each boosting update), a subsample ratio of 0.8 (stochastic gradient boosting using 80% of training samples per iteration), and a colsample_bytree of 0.8 (feature subsampling to 80% of available features per tree). This configuration balances accuracy with regularization, limiting overfitting while capturing complex nonlinear auxiliary power patterns. The model was trained on 76,022 samples and evaluated on 19,006 test samples, with no hyperparameter optimization on test data.
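The reported configuration can be written down as a parameter dictionary, shown here as a sketch that assumes the scikit-learn style `XGBRegressor` interface of the `xgboost` package. The text does not state which interface the authors used, and any settings not listed in the text (for example the random seed) are left at library defaults.

```python
# Hyperparameters as reported in the text.
XGB_PARAMS = {
    "n_estimators": 150,      # boosting rounds
    "max_depth": 7,           # maximum tree depth
    "learning_rate": 0.1,     # shrinkage applied to each boosting update
    "subsample": 0.8,         # fraction of training rows per tree
    "colsample_bytree": 0.8,  # fraction of features per tree
}


def build_model(params=None):
    """Create an untrained regressor; requires the optional xgboost package."""
    # Imported lazily so the parameter dictionary is usable without xgboost.
    from xgboost import XGBRegressor
    return XGBRegressor(**(params or XGB_PARAMS))


print(XGB_PARAMS)
```

A model built this way would then be fitted on the preprocessed training features and evaluated once on the held-out test set.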
Figure 8 visualizes the model's predictions on the training and test sets. Panel (a), Training Set: The scatter plot of 76,022 training predictions shows an almost perfect alignment with the y = ŷ diagonal line (red dashed), with a point cloud close to the diagonal and R² = 0.9991. The training RMSE of 95 W indicates that the model learned the training data well. Panel (b), Test Set: Critically, test set predictions (19,006 samples) show nearly identical alignment quality to training predictions, with R² = 0.9986 (only 0.0005 lower than training) and a test RMSE of 114 W. This close train-test agreement is the primary evidence of negligible overfitting. In particular, the predictions span the entire range (0–80,000 W) accurately; even extreme heating scenarios (>40,000 W during cabin heating at −10 °C) are predicted with high precision. Panel (c), Residuals Distribution: A histogram of residuals (actual − predicted) on the test set exhibits an approximately normal distribution centered near zero (mean: 1 W, essentially unbiased) with a standard deviation of 114 W. The narrow peak and minimal skew indicate homoscedastic errors (variance constant across the prediction range), satisfying the assumptions of the regression model. Approximately 95% of residuals fall within ±224 W (approximately ±2σ), confirming predictive reliability. Panel (d), Residuals vs. Predicted Values: The scatter plot reveals a critical pattern: residuals exhibit homogeneity across the entire prediction range (0–80,000 W), with the point cloud width remaining constant rather than expanding (which would indicate heteroscedasticity). A few outliers appear at extreme power values (>50,000 W), representing rare heating scenarios at extreme cold that deviate from typical patterns. The tight concentration around the zero error line confirms that the model predicts with consistent accuracy regardless of auxiliary power magnitude.
The feature importance analysis presented in Figure 9 reveals the dominant factors driving the prediction of auxiliary power in electric vehicles. The position of the accelerator (throttle %) emerges as the most influential predictor, with an importance weight of 0.4153, a counterintuitive finding that demonstrates a strong correlation between traction demand and auxiliary power consumption. This phenomenon suggests that aggressive acceleration or sustained high-speed cruising simultaneously increases motor load and auxiliary power demand through coupled activation of thermal management systems and cabin climate control adjustments. The efficiency of the heating system per unit temperature differential ranks second in importance (0.2716), directly quantifying the responsiveness of the HVAC system, while the elevation change (0.0622) represents the third most significant factor, affecting the thermal management of the battery during variable terrain driving. Temperature-related features play a particularly critical role in the prediction framework, including the difference in battery-ambient temperature (0.0539), the difference in battery temperature (0.0340), the cabin temperature (0.0414), and the thermal gradient between the cabin and the ambient temperature (0.0290). Color-coded feature categorization reveals that temperature-related characteristics comprise approximately 30 to 35% of the top 20 most important features, while heating demand-related factors contribute an additional 27% to overall model importance. Driving dynamics and battery state features contribute more modestly at approximately 5% and 3%, respectively. The feature importance analysis uncovers a fundamental coupling between traction and auxiliary loads in real-world driving scenarios, where aggressive driving styles or sustained high speeds activate both propulsion and auxiliary systems concurrently, creating correlated energy demands. This finding carries significant implications for electric vehicle energy management system design, underscoring the necessity of incorporating interdependencies between vehicle subsystems into energy efficiency optimization strategies.
The robustness and generalizability of the model were evaluated by 5-fold cross-validation as presented in Figure 10, ensuring that the performance metrics generalize across random data partitions. The analysis demonstrates consistent stability across all validation folds, with test R² scores ranging from 0.9986 to 0.9991 and producing a mean of 0.9986 with a standard deviation of 0.0001. This low variance indicates that the model performs consistently regardless of the train-test partition configuration. Training R² scores remain between 0.9991 and 0.9992, approximately 0.0005 higher than test scores, confirming minimal overfitting. The root mean squared error values show consistent results, with test RMSE values ranging between 116 and 118 W across all five folds, producing a mean of 117 W with a standard deviation of approximately 1 W. This stability validates that the prediction error remains stable across different data partitions without notable performance variation.
The distribution analysis of the box plot in Figure 10c reveals a minimal spread for the R² values at 0.9986, while the mean absolute error clusters within the 57–58 W range and the RMSE maintains the interval 117–118 W. The absence of outliers across the validation folds confirms that no anomalous partitions influenced the results. The stability assessment confirms that the overfitting is minimal with the train-test R² gap below 0.0005, generalization is reliable with an R² standard deviation of 0.0001 across the folds, and reproducibility remains consistent with stable RMSE values across the partitions. The 5-fold cross-validation methodology validates that the XGBoost model's test R² of 0.9986 represents genuine predictive capability, reflecting the model's ability to capture underlying relationships in the data rather than the result of a favorable single train-test split.
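The partitioning and the fold-level summary statistics can be sketched with the standard library as follows. The shuffling seed, the interleaved fold assignment, and the use of the population standard deviation are assumptions; the text does not describe how folds were drawn or which estimator produced the reported standard deviations.

```python
import random
from statistics import mean, pstdev


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for k shuffled, disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # near-equal fold sizes
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def summarize(scores):
    """Mean and population standard deviation of per-fold scores."""
    return mean(scores), pstdev(scores)


splits = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in splits:
    print(sorted(test_idx), len(train_idx))

# Hypothetical per-fold RMSE values in watts, for illustration only.
rmse_mean, rmse_std = summarize([116.0, 117.0, 117.0, 117.0, 118.0])
print(rmse_mean, round(rmse_std, 3))
```

Every sample appears in exactly one test fold, so each record contributes to validation once and to training four times.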
The XGBoost model achieves high accuracy for the prediction of auxiliary power in electric vehicles, with R² = 0.9986 on the held-out test data and a mean absolute error of ±35 W, corresponding to 1.4% MAPE. These metrics exceed typical machine learning benchmarks for energy prediction reported in the literature, with baseline R² values of 0.90–0.95, and enable deployment in production systems. The model generalizes consistently across 5-fold cross-validation with an R² standard deviation of 0.0001, confirming stability across arbitrary train-test partitions. Computational efficiency is adequate for practical implementation, with predictions on 19,006 test samples requiring less than 0.1 s on standard CPU hardware, making the approach viable for embedded vehicle systems. The developed model enables multiple practical applications across electric vehicle systems and operations. Within original equipment manufacturer energy management systems, the model can be deployed in vehicle electronic control units to provide auxiliary power predictions 5–10 s ahead, enabling predictive pre-conditioning, where cabin temperature setpoints adjust before demand increases, and allowing integration with route planning to optimize charging intervals based on thermal loads. For consumer-facing range estimation, the model replaces generic kilometer range displays with context-aware predictions that account for ambient temperature conditions, and enables real-time range calculation showing the impact of cabin temperature settings, quantifying the energy trade-offs between thermal comfort and driving range. Fleet management operations benefit from improved route optimization based on precise energy consumption forecasting, predictive maintenance capabilities through early detection of anomalous auxiliary power patterns, and thermal efficiency benchmarking across fleet vehicles, allowing identification of underperforming units requiring service attention.
5. Discussion
This study achieved high accuracy for auxiliary power prediction in electric vehicles, with R² = 0.9986 (99.86% variance explained) and a mean absolute error of ±35 W (1.4% MAPE), exceeding published benchmarks of R² = 0.90–0.95. The 5-fold cross-validation confirmed robustness, with R² across folds ranging from 0.9986 to 0.9991 (σ = 0.0001), indicating that the integrated physics-ML approach captured underlying relationships rather than statistical artifacts. Feature importance analysis revealed a distinct hierarchy in which accelerator pedal (throttle) position (0.4153) dominated over ambient temperature, suggesting strong coupling between traction demand and auxiliary loads. Vehicles operating at high throttle during aggressive acceleration or highway cruising simultaneously exhibit elevated auxiliary power through three mechanisms: thermal coupling, where excess motor heat requires coolant circuit activation; driving context effects, such as increased air infiltration at highway speeds; and climate control adjustments in response to occupant comfort demands. Temperature-related features, collectively representing 30–35% of importance, ranked below heating efficiency per unit temperature differential (0.2716), indicating that thermal load magnitude rather than absolute temperature values drives HVAC power consumption.
The analysis revealed pronounced nonlinearity in the heating-temperature relationship, with heating power increasing 44-fold from +25 °C to below 0 °C (0.2 kW to 8.9 kW) and saturating at extreme cold temperatures, reflecting the design constraints of the PTC heater. This nonlinearity has practical implications: extrapolating linear energy models fitted to moderate-temperature data would substantially overpredict consumption at −20 °C, producing overly pessimistic winter range estimates. The contribution of auxiliary power to total energy consumption varied significantly by driving context: 75% during vehicle idling with active HVAC, 58% in city driving, 34% in suburban conditions, 23% in highway driving, and 12% at speeds exceeding 80 km/h, indicating that the current industry practice of applying a single fixed auxiliary power overhead does not accurately represent real operating conditions. Heating demands dominated air conditioning by a 7:1 ratio (mean: 1277 W versus 183 W), reflecting the thermal burden asymmetry in cold-climate electric vehicle deployment. This finding highlights that manufacturers must prioritize heating efficiency through heat pump adoption, thermal insulation improvements, and battery pre-conditioning strategies more aggressively than cooling systems for competitive positioning in Northern European and similar climates.
The model's R² = 0.9986 and MAE = 35 W compare favorably with published EV energy prediction studies. Gil-Sayas et al. [19] achieved similar accuracy (R² ~0.95) for air conditioning-specific load prediction but used controlled laboratory conditions across limited temperature ranges (−7 °C, 22 °C, 35 °C WLTC cycles), while this work employed 95,028 real-world measurements spanning −8 °C to +33.5 °C. The prediction model presented in [56] reported R² = 0.92 on real driving data; our result represents an improvement of approximately 0.08 in R² through integrated physics-decomposition and feature engineering. Kim et al. [45] developed fleet-level auxiliary power models achieving R² ~0.90 using machine learning on charging station data but operated at the trip aggregation level rather than instantaneous prediction, limiting real-time applicability. Zhang et al. [55] characterized annual air conditioning energy consumption in Chinese vehicles through experimental measurement, reporting that heating energy exceeds cooling energy across most climates, consistent with the 7:1 ratio found here, though their study lacked real-world driving data integration. Rathore and Meena [29] performed a systematic comparison of Random Forest, XGBoost, and neural networks for overall EV consumption prediction across charging cycles, confirming XGBoost superiority (R² = 0.945) versus simpler ensemble methods, validating the algorithm choice in this work. Huang et al. [52] demonstrated that physics-augmented machine learning (combining traction models with data-driven auxiliary prediction) reduced average energy estimation error from 40% (pure physics) to 10% (hybrid), supporting the present study's methodological integration of physics-based decomposition with ML feature engineering.
Linear energy models based on aerodynamic drag and rolling resistance alone, the dominant approach in traction-focused literature, implicitly treat auxiliary loads as constant additive terms [67]. This study demonstrates that heating power varies by a factor of 44 across the observed temperature range and that auxiliary power comprises 12–75% of total energy depending on driving context, contradicting constant-overhead assumptions. Physics-based decomposition combined with ML captures this variability through explicit feature engineering rather than treating auxiliaries as unmeasured residuals. The finding that throttle position ranks highest among the 24 engineered features (0.4153 importance) aligns with recent work by Huang et al. showing that operational context (acceleration patterns, driving phase) significantly improves prediction, though the specific throttle–auxiliary coupling has not been previously characterized in published literature. This dominance reflects the coupling between traction demand and auxiliary thermal load. During acceleration or sustained highway driving, increased motor power dissipation elevates the thermal load on battery and cabin heat exchangers, necessitating enhanced cooling or heating circulation. This finding suggests that aggressive driving patterns incur not only direct traction energy penalties but also secondary auxiliary energy costs, with implications for driving behavior recommendations in fleet telemetry systems.
The auxiliary power saturation at cold temperatures reflects design constraints in existing PTC heating systems. Published data on heat pump alternatives (COP ~2.5–3.5 vs. resistive heating COP ~1.0) suggest this saturation could be substantially reduced through advanced thermal management. The 7:1 heating-to-cooling energy ratio in the dataset reflects continental climate driving patterns; subtropical or equatorial studies would show different ratios, limiting generalizability. Cross-validation across random data partitions (standard deviation of R² across partitions, σ = 0.0001) supports robustness within this climate zone, though application to other regions requires corresponding data collection.
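The COP comparison above follows from the relation P_elec = Q_heat / COP. The following is a minimal sketch of that arithmetic, assuming a hypothetical 6 kW cabin heat demand (an illustrative value, not a measurement from this dataset) and ignoring the decline in heat pump COP at low ambient temperatures; the function and constant names are introduced here for illustration only.

```python
def electrical_heating_power_kw(heat_demand_kw: float, cop: float) -> float:
    """Electrical power needed to deliver a given heat flow at a given COP."""
    if cop <= 0:
        raise ValueError("COP must be positive")
    return heat_demand_kw / cop


# Hypothetical cabin heat demand, chosen only to illustrate the ratio.
HEAT_DEMAND_KW = 6.0

ptc_kw = electrical_heating_power_kw(HEAT_DEMAND_KW, cop=1.0)             # resistive PTC heater
heat_pump_low_kw = electrical_heating_power_kw(HEAT_DEMAND_KW, cop=2.5)   # lower end of published COP range
heat_pump_high_kw = electrical_heating_power_kw(HEAT_DEMAND_KW, cop=3.5)  # upper end of published COP range

print(f"PTC: {ptc_kw:.2f} kW, heat pump: {heat_pump_high_kw:.2f}-{heat_pump_low_kw:.2f} kW")
```

Under these assumptions the electrical draw for the same heat flow falls in inverse proportion to COP, which is why the saturation level observed for PTC heaters would not carry over directly to heat pump systems.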
The model enables several practical applications for EV manufacturers and fleet operators. OEM energy management systems can deploy the prediction model in vehicle ECUs to forecast auxiliary power 5–10 s ahead, enabling predictive pre-conditioning, such as adjusting the cabin temperature setpoint before demand surges during rapid acceleration or entry into cold regions. Integration with route planning can optimize charging-stop placement based on thermal loads, with range estimates differentiated for highway versus city routes and seasonally adjusted for winter versus summer. For consumer-facing applications, context-aware range displays can replace generic “X km range” estimates with driving-scenario-specific predictions (“210 km in +20 °C summer, 140 km in −10 °C winter”) and quantify the energy cost of comfort preferences (“maintaining cabin at 22 °C costs 8 kW; lowering to 19 °C reduces to 5 kW, adding 15 km range”). Fleet telematics can benchmark vehicle thermal efficiency against fleet averages and provide maintenance alerts when auxiliary power consumption deviates from expected values, indicating potential HVAC system degradation. The framework’s modularity also enables extension to advanced thermal architectures incorporating health-conscious energy management paradigms, where optimization simultaneously addresses energy consumption, thermal system durability, and occupant comfort. These objectives are increasingly important for next-generation EV and alternative powertrain designs requiring integrated thermal-electric management [
68].
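The context-aware range display described above can be sketched with a simple energy balance: auxiliary power is converted to a per-kilometre cost by dividing by average speed and is then added to traction consumption. This is an illustrative sketch only. The battery capacity, traction consumption, and average speed below are hypothetical assumptions, and only the 8 kW and 5 kW auxiliary loads are taken from the example in the text.

```python
def estimate_range_km(
    usable_energy_kwh: float,
    traction_kwh_per_km: float,
    aux_power_kw: float,
    avg_speed_kmh: float,
) -> float:
    """Range estimate that charges auxiliary power against distance travelled."""
    if avg_speed_kmh <= 0:
        raise ValueError("average speed must be positive")
    # Auxiliary energy per kilometre: kW divided by km/h gives kWh/km.
    aux_kwh_per_km = aux_power_kw / avg_speed_kmh
    return usable_energy_kwh / (traction_kwh_per_km + aux_kwh_per_km)


# Hypothetical vehicle and trip parameters (not from the study).
BATTERY_KWH = 60.0
TRACTION_KWH_PER_KM = 0.15
AVG_SPEED_KMH = 50.0

range_at_22c = estimate_range_km(BATTERY_KWH, TRACTION_KWH_PER_KM, 8.0, AVG_SPEED_KMH)
range_at_19c = estimate_range_km(BATTERY_KWH, TRACTION_KWH_PER_KM, 5.0, AVG_SPEED_KMH)

print(f"Range at 8 kW auxiliary load: {range_at_22c:.0f} km")
print(f"Range at 5 kW auxiliary load: {range_at_19c:.0f} km")
```

The size of the range gain depends strongly on the assumed speed and battery state, so this sketch is not intended to reproduce the 15 km figure quoted above. It also shows why the same auxiliary load costs more range in slow city traffic than on the highway: the load is drawn for longer per kilometre.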
The dominant feature importance of accelerator pedal position (0.4153) merits mechanistic explanation. While counterintuitive from a steady-state thermal perspective, three coupled physical mechanisms explain this relationship. First, motor waste heat (P_loss = (1 − η_motor) × P_motor ≈ 0.1 × P_motor) scales directly with throttle-commanded torque, activating battery thermal management cooling circuits (~0.5–1.0 kW additional pump power during full throttle). Second, the rise in vehicle speed during acceleration increases the infiltration thermal load (Q_infiltration ∝ v²), creating cabin temperature gradients independent of setpoint and demanding a transient HVAC response. Third, occupant thermal comfort psychology couples aggressive driving (high throttle) to climate control adjustments: occupants reduce cabin temperature during perceived motion intensity, creating correlated auxiliary power activation.
Quantitatively, maximum throttle generates ~12.5 kW of motor waste heat (125 kW motor × 0.1 inefficiency), necessitating immediate cooling system activation. This coupling explains why accelerator position (0.4153 importance) exceeds ambient temperature (0.2716) in the feature hierarchy: throttle captures transient dynamics during acceleration, while temperature-based features characterize quasi-steady-state thermal equilibrium. The integrated coupling mechanism can be approximated as P_aux ≈ α₁ × P_motor_loss(θ_accel) + α₂ × Q_infiltration(v(θ_accel)) + α₃ × f_occupant(θ_accel), where the empirical importance coefficients correspond to contributions of roughly 40% (motor), 35% (infiltration), and 25% (occupant).
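The motor waste heat term in the approximation above can be checked with a short calculation. The sketch below assumes a linear mapping from throttle position to motor power, which is a simplification introduced here; the 125 kW rating and the 0.1 loss fraction come from the text. The infiltration and occupant terms are omitted because the text gives no functional form for them.

```python
def motor_waste_heat_kw(
    throttle: float,
    rated_power_kw: float = 125.0,
    motor_efficiency: float = 0.9,
) -> float:
    """Motor waste heat, P_loss = (1 - eta_motor) * P_motor, for a throttle fraction in [0, 1]."""
    if not 0.0 <= throttle <= 1.0:
        raise ValueError("throttle must be between 0 and 1")
    # Assumption: motor power scales linearly with throttle position.
    motor_power_kw = throttle * rated_power_kw
    return (1.0 - motor_efficiency) * motor_power_kw


for throttle in (0.25, 0.5, 1.0):
    print(f"throttle {throttle:.2f}: {motor_waste_heat_kw(throttle):.2f} kW waste heat")
```

At full throttle this yields approximately 12.5 kW, matching the figure quoted above; real motor efficiency varies with speed and torque, so the constant 0.9 is only a first-order estimate.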
This traction–auxiliary coupling contradicts conventional vehicle energy models that treat auxiliary systems as constant overhead and demonstrates that accurate EV energy forecasting requires operational context awareness. Prior studies using standardized driving cycles (predetermined acceleration profiles) or trip-aggregated analysis (masking transient dynamics) failed to observe this coupling, making this finding novel in the published literature.
This study has several acknowledged limitations restricting generalizability. Data originated from two vehicle platforms with specific HVAC architectures (PTC resistive heater, vapor-compression A/C); heat pump-equipped vehicles exhibit different thermal dynamics and auxiliary power curves requiring separate validation. Geographic specificity is significant: measurements ranged from −8 °C to +33.5 °C and were collected in Poland, Italy, and Germany, providing coverage of the temperate climate of Central Europe with pronounced seasonal variation. However, this geographic specificity limits direct applicability to fundamentally different climate zones. Tropical or subtropical regions with persistent ambient temperatures >25 °C would exhibit AC-dominated auxiliary loads (inverse to the observed 7:1 heating-to-cooling ratio); arctic regions with temperatures below −20 °C would operate exclusively in heating power saturation regimes. Solar radiation intensity varies dramatically with latitude, affecting cabin cooling demands independently of ambient temperature. Consequently, model transfer to extreme climate regions requires region-specific validation and likely retraining. The physics-based decomposition employed a constant drag coefficient (Cd = 0.29) and rolling resistance coefficient (Cr = 0.008), both of which vary with speed, tire pressure, and road surface under real conditions; model accuracy could improve with dynamic coefficients accounting for these variables. The dual-channel HVAC instrumentation (CAN primary, LIN secondary) ensured signal redundancy, and minor data gaps were addressed using standard preprocessing methods. The model targets instantaneous auxiliary power prediction (1 Hz sampling); application to longer time horizons (hour-scale range forecasting) requires validation of prediction consistency and stability across integration windows.
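To illustrate how the constant coefficients enter a physics-based traction estimate, the sketch below uses the standard road-load equation on a level road. This is an assumed formulation, and the study's exact decomposition may differ. Cd = 0.29 and Cr = 0.008 are taken from the text; the frontal area, vehicle mass, and air density are hypothetical values chosen for illustration.

```python
AIR_DENSITY_KG_M3 = 1.225      # assumed sea-level air density
GRAVITY_M_S2 = 9.81
DRAG_COEFFICIENT = 0.29        # Cd from the text
ROLLING_RESISTANCE = 0.008     # Cr from the text
FRONTAL_AREA_M2 = 2.3          # hypothetical
VEHICLE_MASS_KG = 1800.0       # hypothetical


def traction_power_kw(speed_m_s: float, accel_m_s2: float = 0.0) -> float:
    """Traction power on a level road from drag, rolling, and inertial forces."""
    drag_n = 0.5 * AIR_DENSITY_KG_M3 * DRAG_COEFFICIENT * FRONTAL_AREA_M2 * speed_m_s ** 2
    rolling_n = ROLLING_RESISTANCE * VEHICLE_MASS_KG * GRAVITY_M_S2
    inertial_n = VEHICLE_MASS_KG * accel_m_s2
    # Power is total tractive force times speed, converted from W to kW.
    return (drag_n + rolling_n + inertial_n) * speed_m_s / 1000.0


for speed_kmh in (50, 90, 130):
    power = traction_power_kw(speed_kmh / 3.6)
    print(f"{speed_kmh} km/h steady state: {power:.1f} kW")
```

Because drag power grows with the cube of speed, a fixed Cd has the largest influence at highway speeds, whereas a fixed Cr matters most in low-speed driving; replacing either constant with a speed- or surface-dependent function would only require changing the corresponding force term.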
The feature importance rankings (throttle position: 0.4153, heating efficiency: 0.2716) are specific to PTC resistive heating architecture with COP = 1.0. Heat pump systems with COP ≈ 2.5–3.5 would exhibit fundamentally altered heating power saturation characteristics and potentially different feature hierarchies due to compressor modulation strategies [
69,
70]. Direct transfer of this model to heat pump-equipped vehicles requires validation against a heat pump dataset, establishing domain adaptation requirements before production deployment.
The robustness of cross-validation within the climatic envelope of the dataset does not guarantee performance on fundamentally different vehicle platforms or climate zones. Transfer to different EV architectures (e.g., Tesla Model 3 with heat pump, Mercedes EQE with advanced thermal management) remains untested. Arctic climates with temperatures consistently below −20 °C may exceed the heating saturation plateau observed here, while tropical climates would show AC-dominated auxiliary loads rather than heating-dominated ones. Future work should prioritize multi-vehicle dataset collection across geographic regions and thermal management strategies to establish generalized auxiliary power prediction frameworks applicable across diverse EV platforms and climates.