Optimisation of Direct Battery Thermal Management for EVs Operating in Low-Temperature Climates

Electric vehicles (EVs) experience a range reduction at low temperatures caused by the energy demand of cabin heating and a reduction in lithium-ion battery performance. Heat-pump-equipped vehicles have been shown to reduce heating, ventilation and air conditioning (HVAC) consumption and improve range at low ambient temperatures. Heating the battery to improve its low-temperature performance, however, reduces the heat available for the cabin. In this paper, dynamic programming is used to find the optimal battery heating trajectory, which can optimise the vehicle's control for either cabin comfort or battery performance and, therefore, range. Using the strategy proposed in this research, a 6.2% increase in range compared to no battery heating and a 5.5% increase in thermal comfort compared to full battery heating were achieved at an ambient temperature of −7 °C.


Introduction
One potential barrier to the mass acceptance of electric vehicles (EVs) is range anxiety, which can be exacerbated at low ambient temperatures. Increased heating, ventilation and air conditioning (HVAC) power consumption, coupled with reduced battery performance, leads to a range reduction of 40% to 70% at −20 °C compared to 20 °C [1,2]. To address this issue, research has focussed on the development of heat pump technologies as a more efficient mechanism to produce cabin heat [3][4][5]. Other research introduces the use of the heat pump to provide heat to the battery, thereby minimising the impact of cold temperatures [6][7][8][9]. The battery and the cabin must both be heated in such a way that battery performance and driving comfort, respectively, can be ensured. The objective of this research is, therefore, to develop a methodology which can actively balance the needs of these two factors according to a quantitative cost function.
The necessity of controlling the balance between battery and cabin heating is reflected by a shift in manufacturers' focus toward optimised thermal management. For example, in Reference [10], Bosch explains that their HVAC system is designed to provide heat to both the battery and cabin according to their requirements at low temperatures. In their system, this is achieved through two separate heating mechanisms. The cabin is heated via the heat pump, while the battery is heated via a positive temperature coefficient (PTC) heater and waste heat from the electric motor.
When operating in cold climates, up to 7.6 kW of heat can be required to heat the cabin [11]. By comparison, assuming most EVs have an efficiency of between 144 Wh/km and 225 Wh/km [12], the equivalent average tractive power on a World Harmonised Light Vehicle Test Procedure (WLTP) drive cycle would be 6.7 kW to 10.4 kW. When this heat must be provided to both the cabin and the battery, one heat source needs to be split between two sinks. There is, therefore, a trade-off between range (which is improved by heating the battery) and comfort (which is reduced by heating the battery), which needs to be considered. Contextually, this problem can be likened to the power split problem in hybrid vehicles. In this area of research, dynamic programming (DP) is considered to be the benchmark of what is potentially achievable for a system and is able to find the global minimum in control optimisation problems [27,28].
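The 6.7 kW to 10.4 kW figure can be approximately reproduced with a short calculation. The sketch below assumes that the WLTP cycle covers 23.25 km (the distance used later in this paper) in 1800 s; the cycle duration and the function name are assumptions introduced here for illustration.

```python
# Hedged sketch: mean electrical power implied by a Wh/km efficiency figure.
WLTP_DISTANCE_KM = 23.25   # distance quoted later in this paper
WLTP_DURATION_S = 1800.0   # assumed cycle duration, not stated in the text


def mean_cycle_power_kw(efficiency_wh_per_km: float) -> float:
    """Average power over the cycle for a given energy efficiency."""
    energy_kwh = efficiency_wh_per_km * WLTP_DISTANCE_KM / 1000.0
    duration_h = WLTP_DURATION_S / 3600.0
    return energy_kwh / duration_h


low_kw = mean_cycle_power_kw(144.0)    # about 6.7 kW
high_kw = mean_cycle_power_kw(225.0)   # about 10.5 kW
```

Under these assumptions the result is close to the range quoted above, which places the 7.6 kW cabin heating demand at the same order of magnitude as the power needed to drive the vehicle.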
DP is a method of generating optimised control trajectories for dynamic problems. A dynamic model is used to generate a cost matrix which describes the cost of an action given a control choice and a given vehicle state. An optimisation algorithm is then used to find a series of control options which leads to the minimum cost through the dynamic problem. The advantage of this method is that having full knowledge of the consequence of every control option at every time step allows for the absolute optimum control trajectory to be found. The disadvantage of this method is that it is rarely implementable in its pure form, since full knowledge of the duty cycle is required beforehand, and found optimal trajectories are only applicable to specific scenarios. It is, therefore, regarded as the best case scenario which implementable strategies aim to meet [29].
Other optimised control strategies, such as heuristic control techniques and static optimisation, are more implementable than DP but are sub-optimal in comparison [30]. For example, Wang et al. showed that DP could reduce the fuel costs of a Toyota Prius by 30%, but, when a heuristic controller was created to mimic the DP results, the improvement reduced to 27% [29].
The remainder of the paper is structured as follows: Section 2 describes the vehicle model used to conduct the research, Section 3 describes how the vehicle performance will be measured, Section 4 describes how the model is validated, Section 5 explains how a specific DP algorithm was implemented, and Section 5.2 defines the cost function that describes the balance between range and comfort; Sections 6.3 and 6.4 show the results of implementing the DP method and the optimal control trajectories produced, and Section 6.5 describes a potential route to the actual implementation of the results presented. A review of the method used with suggestions of improvements is made in Section 7, and, finally, Section 8 summarises the progress made through this research.

Vehicle and Thermal Model
The architecture of the heat pump system can be seen in Figure 1. Here, the interactions between vehicle components and the heat pump are shown. The connection between the battery and heat pump is identified with a dashed line as it is the connection which will be optimised through dynamic programming. If the battery rises above its target temperature of 20 °C, then the vehicle controller will cease heating of the battery. Figure 1 shows a PTC heater which is required to support the heat pump in extremely low temperatures and during the warm-up phase [31]. In extremely cold temperatures, the heat pump is not able to extract heat from ambient, so its heat output is reduced; in these conditions, the PTC heater is required to meet the heating load. Additionally, Kim et al. showed that a PTC heater is required during the warm-up phase of the heat pump to provide adequate cabin heating [31]. Here, the PTC is controlled to supply 4 kW of heat from the beginning of the cycle until the cabin reaches its target temperature, or the high temperature coolant (HTC) circuit, which provides heat to the battery and the cabin, reaches its target temperature of 90 °C.
A vehicle model built in Dymola (Version 2019, Claytex Ltd Libraries, Leamington Spa, UK) has been used to evaluate, develop, and optimise a battery thermal management control which balances comfort and range through the application of DP. The vehicle model is a hypothetical EV, which has parameters defined in Table 1. The model utilises a physical heat pump model, which uses waste heat from the motor and cabin exhaust. The system also collects heat from ambient and is further supported by a thermal storage system. This is in agreement with the results presented in Jeffs et al. concerning the most useful components to be used with a heat pump [8]. The models for the heat pump, battery, cabin, motor, and thermal storage will be explained in the remainder of this section. Further details of the model can be found in References [8,32]. Should the reader be further interested in the model, the libraries used are available from Claytex Ltd.

Heat Pump
The heat pump is able to extract heat from multiple sources around the vehicle, as shown in References [8,32]. Parameters relating to the heat pump's operation can be found in Table 2.
For the purposes of this work, focus is given to the connection between the heat pump and the battery. The top level of the heat pump is shown in Figure 2, and this corresponds to the box labelled HVAC in Figure 1. Figure 2a shows the model as it appears in Dymola, while Figure 2b gives a block representation of the same model. Here, the thermal switch for the battery is used to isolate the battery from the heat pump when the battery has reached its target temperature of 20 °C. When the battery is being heated, the heat flow is controlled by a dynamic thermal resistor inside the thermal switch, allowing the user to set a desired heat flow. In a physical vehicle, this would be controlled by the temperature of coolant sent to the battery and its mass flow rate. Here, dynamic thermal resistors are used to simplify the interaction while maintaining the functionality. The heat flow for the battery is set inside a heat pump control unit (HPCU), external to the heat pump model, which is described in Section 2.2.
Inside the block labelled "Heat Pump" in Figure 2b is the middle layer of the heat pump, seen in Figure 3. The middle level of the heat pump model is used to house, pump coolant between, and direct heat flows to the three main sub-models of the heat pump. These are the HTC, chiller circuit, and refrigeration loop. Inputs to this level include coolant mass flow rate (which is then controlled by a pump and a proportional integral derivative (PID) controller), compressor power demand, which is passed on to the refrigerant model, and cabin temperature, which is used to shut off the PTC heater located in the HTC. The coolant circuits themselves contain heat exchangers (HXs) which have been sized using examples and estimates taken from components found on existing vehicles. This level also sees the input of heat recuperated from cabin exhaust, which is used to increase the temperature of the chiller loop.
The two coolant circuits, the HTC and the chiller, are shown in Figures 4 and 5, respectively. They have operating temperatures of 90 °C and −10 °C, respectively. In Figure 4, the thermal battery and PTC heater are in series before the coolant reaches the cabin, battery, and ambient, which are in parallel. It can also be seen that all components with the exception of the PTC heater have a bypass option; this is the second control point for component connections which can be used to thermally isolate a component from the system. This arrangement was chosen so that the thermal battery could be used to increase the coolant temperature from the condenser output and under certain conditions negate the need for the PTC heater, which is set to turn off if the coolant temperature exceeds 85 °C. The hot coolant is then split between the battery and the cabin; hence, when battery heating is operational, the cabin receives less mass flow through its HX. Additionally, as the battery heating is increased, the coolant outlet temperature will be lower, increasing the demand on the heat pump, or further reducing cabin heat when the compressor and PTC loads are saturated.
The chiller circuit is arranged in a similar way to the HTC, as seen in Figure 5, with the components capable of contributing heat to the system in parallel. These HXs allow heat to be extracted from: the motor and inverter (as one unit), the gearbox and driveline (which will be treated as one component and referred to as the transmission), the battery, the thermal battery, and ambient. For the purposes of this research, the transmission and battery are not used as heat sources. Battery cooling is implementable through the chiller in case of excessive battery temperatures, which are not seen in the scenarios tested here. The chiller circuit also uses bypasses to thermally isolate components from the heat pump.
The bottom level of the heat pump contains the physical model of the refrigerant circuit. For this research, R134a is used as the refrigerant. The configuration of the refrigeration circuit is typical of what may be found in literature [4,6]. The refrigeration cycle uses refrigerant-to-coolant HXs (as opposed to more traditional refrigerant-to-air HXs) to distribute and collect heat to/from the vehicle. The coolant system is limited by the melting and boiling points of the coolant; it is, therefore, important that the compressor is carefully controlled to prevent the coolant from freezing or boiling. The coolant is prevented from going beyond its temperature limits using the HPCU, which controls the compressor demand seen on the right of Figure 6. The refrigerant is protected from exceeding its pressure limit by the PID labelled "Pressure Control PID", seen in Figure 6, which is set to limit the pressure to 30 bar, a value chosen according to R134a's pressure-enthalpy diagram. Since there are two power demands at this level, the demand according to the pressure controller and the demand input from the HPCU, the minimum of these values is used to control the compressor, ensuring that neither refrigerant pressure nor coolant temperature exceeds its physical limit.
The chiller circuit is used for thermal exchange and extraction, i.e., extracting from the motor and inverter, the gearbox and driveline, ambient, and the heat battery. There is also the capability to cool the cabin and electric battery, if needed.
Figure 6. The refrigeration cycle is physically modelled. At this level, the speed of the compressor is controlled either by the demand from the heat pump controller, or by a PID which stops the system pressure getting too high.

Heat Pump Control Unit (HPCU)
The heat pump control unit has two primary purposes: firstly, to set the desired amount of battery heating, which will be implemented using a heating schedule generated through the results of dynamic programming; and secondly, to set the desired power output of the heat pump compressor. The compressor controller, shown schematically in Figure 7, has the purpose of getting the cabin and battery to temperature, while ensuring that the HTC and chiller do not exceed their temperature limits. PID controllers are used to create a compressor power demand according to the battery and cabin current and target temperatures, and the greatest of these demands is taken to ensure there is enough heat to meet these requirements. Additionally, PID controllers are used to set a power demand needed to bring the chiller and HTC loops to their target temperatures. The minimum power request (from either the cabin and battery, the chiller, or the HTC controls) is then passed to the compressor; the minimum is used so that, if a component has reached its set point, it is not pushed beyond that target by the requirements of another component. This logic prevents the battery and cabin from overheating and reduces the chance of the model failing due to the coolant breaching its temperature limits.
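The arbitration between the individual power requests can be illustrated with a few lines of code. This is a minimal sketch of the max/min selection described above only; the PID controllers themselves are not modelled, and the function and argument names are hypothetical.

```python
# Hedged sketch of the HPCU demand arbitration (PID outputs passed in as numbers).
def compressor_power_request(cabin_kw: float, battery_kw: float,
                             chiller_kw: float, htc_kw: float) -> float:
    """Select the compressor power request from four PID demands.

    The larger of the cabin and battery demands ensures enough heat is
    produced for both consumers; the minimum against the chiller and HTC
    demands prevents either coolant loop being pushed past its set point.
    """
    consumer_demand_kw = max(cabin_kw, battery_kw)
    return min(consumer_demand_kw, chiller_kw, htc_kw)


# Example: the HTC loop is close to its target, so it limits the request.
request_kw = compressor_power_request(cabin_kw=3.0, battery_kw=1.5,
                                      chiller_kw=4.0, htc_kw=2.0)
```

In this example the cabin asks for the most heat, but the request passed to the compressor is capped by the HTC loop demand.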

Battery
The battery model can be broken into two parts: electric and thermal. The electric side uses a first order RC (resistor-capacitor) network equivalent circuit model (ECM), as seen in Figure 8.
The components in this model are parameterised using look up tables which are a function of component temperature, as measured using the thermal model, and the cell's state of charge (SOC).
The resistor and open circuit voltage (OCV) values are scaled by the number of cells in series to produce an RC circuit which represents one string of the pack. These data were generated specifically for Xalt 40 Ah cells by Tripathy et al. in Reference [33]. This type of battery model is commonly seen in literature [34,35] when modelling pack-sized batteries for vehicle applications. In this application, the battery was configured in a 3p108s (3 parallel strings of 108 cells in series) arrangement, giving a pack size of approximately 48 kWh (assuming a nominal voltage of 3.7 V).
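The quoted pack size follows directly from the cell data given above, as the short check below shows. The variable names are illustrative only.

```python
# Pack energy for the 3p108s arrangement of 40 Ah cells at 3.7 V nominal.
PARALLEL_STRINGS = 3
CELLS_IN_SERIES = 108
CELL_CAPACITY_AH = 40.0
NOMINAL_CELL_VOLTAGE_V = 3.7

pack_capacity_ah = PARALLEL_STRINGS * CELL_CAPACITY_AH            # 120 Ah
pack_voltage_v = CELLS_IN_SERIES * NOMINAL_CELL_VOLTAGE_V         # about 400 V
pack_energy_kwh = pack_capacity_ah * pack_voltage_v / 1000.0      # about 48 kWh
```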
Since the parameters of the RC network are dependent on temperature, a thermal model is used to estimate the bulk temperature of the pack. The waste heat is calculated using Equation (1).
Here, Q_battery_waste is the waste heat generated by the battery, I is the battery current, and R_0(T, SOC) and R_1(T, SOC) are internal resistances corresponding to Figure 8, which are dependent on battery temperature and state of charge. A schematic representation of the thermal model of the battery can be seen in Figure 9. The battery has two modes of heat exchange: interaction with the thermal management system and thermal losses to ambient. The latter heat exchange is modelled using the flat plate parallel flow equation [36], given in Equation (2).
where h̄ is the mean heat exchange coefficient of the plate, and k, Re, and Pr are the thermal conductivity, Reynolds number, and Prandtl number of the convection fluid. The Reynolds number and Prandtl number are defined in Equations (3) and (4). L is the length of the surface over which the fluid is flowing; here, 4 m is used as an approximation of the length of the underside of the vehicle.
where ρ, v, µ, and C_p are the density, velocity, dynamic viscosity, and specific heat capacity of the convective fluid, in this case air. The final heat flow to ambient is then given by

$$Q_{ambient} = \bar{h}\, W L\, (T_{plate} - T_{ambient}) \qquad (5)$$

or

$$Q_{ambient} = G\, (T_{plate} - T_{ambient}) \qquad (6)$$

where Q_ambient is the heat rejected to ambient, G is the thermal conductance of the plate, W and L are the width and length of the plate, and T_plate − T_ambient defines the temperature difference between the plate and ambient.
It should be noted from literature that the capacity of a battery is dependent on its temperature; in research concerning the operation of electric vehicles in low temperatures, this should be accounted for. A state of charge model is used to estimate the state of charge of the battery through the drive cycle as a function of temperature. To achieve this, the state of charge model has an additional lookup table which contains information about the proportion of capacity available as a function of temperature. This is then used to scale the Coulomb counting equation by the factor C_eff seen in Equation (7). This adjustment has been seen to improve the accuracy of SOC tracking and range prediction at low temperatures and so is an important addition to this work [33,37,38].
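For the convective loss to ambient described earlier in this section, the chain from fluid properties to heat flow can be sketched as follows. The Reynolds and Prandtl numbers use their standard definitions. The mean heat transfer coefficient uses the laminar flat-plate correlation (0.664 Re^0.5 Pr^(1/3)), which is an assumption made here because the correlation in Equation (2) is not reproduced in this text; at vehicle speeds a turbulent or mixed correlation may be more appropriate. The air properties, plate width, and coefficient value in the example are illustrative placeholders.

```python
# Hedged sketch of the ambient heat-loss calculation for the battery underside.
def reynolds(rho: float, v: float, length: float, mu: float) -> float:
    """Reynolds number: rho * v * L / mu."""
    return rho * v * length / mu


def prandtl(c_p: float, mu: float, k: float) -> float:
    """Prandtl number: c_p * mu / k."""
    return c_p * mu / k


def mean_h_laminar(k: float, length: float, re: float, pr: float) -> float:
    """Mean coefficient from the laminar flat-plate correlation (assumed form)."""
    return 0.664 * (k / length) * re ** 0.5 * pr ** (1.0 / 3.0)


def heat_to_ambient(h_mean: float, width: float, length: float,
                    t_plate: float, t_ambient: float) -> float:
    """Heat rejected to ambient in W: h * W * L * (T_plate - T_ambient)."""
    return h_mean * width * length * (t_plate - t_ambient)


# Illustrative air properties and geometry (placeholders, not from the paper,
# except the 4 m plate length).
re = reynolds(rho=1.2, v=10.0, length=4.0, mu=1.8e-5)
pr = prandtl(c_p=1005.0, mu=1.8e-5, k=0.025)
h = mean_h_laminar(k=0.025, length=4.0, re=re, pr=pr)
q_w = heat_to_ambient(h_mean=20.0, width=1.8, length=4.0,
                      t_plate=20.0, t_ambient=-7.0)
```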
Table 3 shows the values of C_eff as a function of temperature.
Table 3. Effective capacity factor (C_eff) of the battery as a function of temperature for a 0.1 C discharge. Data provided by Tripathy et al. [39].
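The effect of the capacity factor on state of charge tracking can be illustrated with a single Coulomb counting step. This is a sketch of one plausible form of the scaling, in which C_eff multiplies the nominal capacity; the exact form of Equation (7) is not reproduced here, and the C_eff value of 0.8 used in the example is a placeholder rather than a value from Table 3.

```python
# Hedged sketch: Coulomb counting with a temperature-dependent capacity factor.
def soc_step(soc: float, current_a: float, dt_s: float,
             capacity_ah: float, c_eff: float) -> float:
    """Advance SOC by one time step; positive current discharges the pack."""
    discharged_ah = current_a * dt_s / 3600.0
    return soc - discharged_ah / (c_eff * capacity_ah)


PACK_CAPACITY_AH = 120.0  # 3 parallel strings of 40 Ah cells

# Same 60 Ah discharge at full and at reduced effective capacity.
soc_warm = soc_step(1.0, current_a=120.0, dt_s=1800.0,
                    capacity_ah=PACK_CAPACITY_AH, c_eff=1.0)
soc_cold = soc_step(1.0, current_a=120.0, dt_s=1800.0,
                    capacity_ah=PACK_CAPACITY_AH, c_eff=0.8)  # placeholder
```

With the reduced factor, the same charge throughput produces a larger drop in SOC, which is how the model captures the loss of usable capacity at low temperature.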

Motor
Motor waste heat is modelled using an efficiency lookup table. The motor's thermal model consists of a lumped thermal capacitance representing the motor windings, in which the heat is generated and from which it is extracted. The heat pump aims to extract 1 kW of heat from the motor, and the motor is also able to lose heat to ambient through a fixed thermal resistance.

Cabin
The cabin has a target temperature of 22 °C, which is measured at the "Large air volume" seen in Figure 10a. An infinite air source at ambient temperature is used as the air source for the cabin, where air into the cabin is heated in the "eAC" component, seen in Figure 10. The heated air is then pumped into the cabin, where a single air volume and heat capacitance are used to measure the cabin temperature. This heat capacitor has four modes of losing heat: convection to ambient through panels, thermal exchange with soft furnishings, thermal exchange with hard furnishings, and cabin air exhaust. The convection to ambient through the exterior surfaces is modelled using a variable thermal conductance with dependency on vehicle speed through a lookup table [40]. The model uses three heat capacitances: one for air, one for hard furnishings (such as dashboard panels, glass, etc.), and one for soft furnishings (such as the seats, carpet, etc.), with thermal resistances between the air and the furnishings. Finally, there are two volumes representing the air in the cabin: one large and one small. The larger air volume represents the majority of the cabin where the target temperature is imposed. The smaller air volume is used to harvest cabin exhaust waste heat. Here, 30% of the potential heat contained in the wasted air can be extracted and used in the heat pump. This amount reflects the claims made by BMW in Reference [41].

Thermal Storage
The thermal storage block utilises a phase change material (PCM) to store heat which can be redeployed instantly when the vehicle begins its journey. The material modelled is called paraffin 70-75, and its characteristics were found by Ukrainczy et al. [42] and are shown in Table 4.
Table 4. Thermal properties of paraffin 70-75 [42].

The PCM is modelled using a thermal capacitor block with dynamic capacitance, allowing for the modelling of the latent heat release of the material. When the material temperature drops to T_melting + 1, the capacitance is increased from C_p to C_p + C_l; then, when the temperature falls beneath T_melting, the capacitance is restored to C_p. This approach for modelling PCMs has been used in other similar research [8,32,43].
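The switching capacitance can be written as a simple function of material temperature. The sketch below implements the band described above; the melting point and capacitance values in the example are placeholders for illustration, not the properties listed in Table 4.

```python
# Hedged sketch of the dynamic capacitance used to represent latent heat.
def pcm_capacitance(temp_c: float, t_melting_c: float,
                    c_p: float, c_l: float) -> float:
    """Return the effective capacitance of the PCM at a given temperature.

    Within the 1 degC band from T_melting to T_melting + 1 the latent
    contribution C_l is added to the sensible capacitance C_p.
    """
    in_phase_change_band = t_melting_c <= temp_c <= t_melting_c + 1.0
    return c_p + c_l if in_phase_change_band else c_p


# Placeholder values, not taken from Table 4.
T_MELT_C, C_P, C_L = 72.0, 2000.0, 150000.0

above_band = pcm_capacitance(73.5, T_MELT_C, C_P, C_L)   # sensible only
inside_band = pcm_capacitance(72.5, T_MELT_C, C_P, C_L)  # sensible + latent
below_band = pcm_capacitance(71.9, T_MELT_C, C_P, C_L)   # sensible only
```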

Key Performance Indicators
The performance of the vehicle will be measured using three metrics: energy efficiency measured from the battery, range, and comfort. This section defines how these key performance indicators (KPIs) are calculated. Firstly, the energy efficiency of the battery is calculated using Equation (8).
Here, D_WLTP is the WLTP distance (23.25 km), and I(t) and V_OCV are the battery current and open circuit voltage, respectively. The range is calculated using Equation (9), where E_cap is the maximum capacity of the battery (48 kWh), and C_eff is the effective capacity factor of the battery defined in Table 3. Cabin comfort is defined as the mean cabin temperature given by Equation (10).
These three KPIs will be used throughout this report to inspect the performance of the vehicle.
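The three KPIs can be computed from sampled simulation outputs as sketched below. The forms used (energy per distance from the integral of V_OCV times I, range as effective capacity divided by efficiency, and a time-averaged cabin temperature) are assumptions consistent with the definitions above, since Equations (8) to (10) are not reproduced in this text. All function names are hypothetical.

```python
# Hedged sketch of the KPI calculations from sampled time series.
def trapezoid(values, times_s):
    """Integrate sampled values over time with the trapezoidal rule."""
    total = 0.0
    for k in range(1, len(times_s)):
        dt = times_s[k] - times_s[k - 1]
        total += 0.5 * (values[k] + values[k - 1]) * dt
    return total


def energy_efficiency_wh_per_km(times_s, current_a, v_ocv_v,
                                distance_km=23.25):
    """Battery energy drawn per kilometre over the cycle."""
    power_w = [i * v for i, v in zip(current_a, v_ocv_v)]
    energy_wh = trapezoid(power_w, times_s) / 3600.0
    return energy_wh / distance_km


def range_km(efficiency_wh_per_km, c_eff, e_cap_kwh=48.0):
    """Range from usable capacity (E_cap scaled by C_eff) and efficiency."""
    return e_cap_kwh * 1000.0 * c_eff / efficiency_wh_per_km


def mean_cabin_temperature_c(times_s, t_cabin_c):
    """Time-averaged cabin temperature, used as the comfort metric."""
    return trapezoid(t_cabin_c, times_s) / (times_s[-1] - times_s[0])


# Illustrative inputs: constant 30 A at 400 V, cabin warming from 0 to 22 degC.
t = [0.0, 900.0, 1800.0]
efficiency = energy_efficiency_wh_per_km(t, [30.0] * 3, [400.0] * 3)
comfort = mean_cabin_temperature_c(t, [0.0, 22.0, 22.0])
```

As a consistency check under these assumptions, an efficiency of 243 Wh/km with C_eff = 1 gives `range_km(243.0, 1.0)` of roughly 198 km.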

Validation
Here, the validation of the battery and complete vehicle is shown. This is used to demonstrate that the models used are representative of data presented in a range of literature.

Battery
The battery model was discharged at a rate of 0.1 C (to prevent self-heating from higher currents) at −15 °C to 15 °C in 10 °C increments. The resulting recorded capacities are presented in Figure 11, alongside an amalgamation of data extracted from literature. Here, it can be seen that the battery capacity predicted by the model falls within the range of battery capacity results presented in literature and is, therefore, validated for the purposes of this research.
Figure 11. Battery model capacity prediction compared to results extracted from the following sources: Nagasubramanian [19], Zhang [16], Ji [20], Jaguemont [21], and Panasonic [22] (manufacturer).

Complete System Validation
Here, the complete model is validated against high level parameters available from literature. At this level, the model is judged on energy efficiency, measured using Equation (8), which is key when assessing the performance of an electric vehicle, and on whether it reproduces a representative range reduction.
Firstly, the vehicle model is validated at room temperature against known EV efficiencies. Since the vehicle model is generic to electric SUVs, it is sensible to use similar vehicles against which to assess the performance. Table 5 highlights a range of similar vehicles and their efficiency. The model is simulated at 23 °C using no HVAC. Doing so, the complete vehicle used 243 Wh/km, which is within the boundaries set out in Table 5.
The final step of validation is to ensure that the vehicle captures a representative range reduction as the ambient temperature is lowered. Referring to literature, it has been seen that a range reduction in the region of 40% to 70% should be measured when moving from 15 °C to −15 °C [1,2,45]. This, therefore, defines the bounds which are used to validate the model. Table 6 shows the vehicle range, calculated using Equation (9), achieved by the vehicle using the WLTP cycle over a range of temperatures, without the heat pump engaged (only using the 4 kW PTC). The range reduction here is 41% when moving from 23 °C to −15 °C. While this is on the lower side of range loss, it should be noted that a 4 kW PTC on its own would be undersized for a vehicle of this type; therefore, higher HVAC consumption would be expected at lower ambient temperatures for a similar vehicle without a heat pump. However, the value still falls within the accepted range, indicating the model is representative of reality.

Dynamic Programming
When considering control optimisation, it is relevant to consider solutions for the power split problem in hybrid electric vehicles, which is conceptually similar to the heating split problem here. Cook et al. [30] performed a review of different control techniques used for controlling the power split between the internal combustion engine and motor in hybrid electric vehicles. Cook et al. considered heuristic or rule-based controllers, steady state optimised control, and dynamic programming and highlighted that dynamic programming was the only method which could reliably find a global minimum. They demonstrated this by showing that a controller based on dynamic programming results was able to reduce the fuel consumption of a hybrid vehicle by 54% compared to a 31% reduction using a rule-based controller [30]. The Equivalent Consumption Minimisation Strategy (ECMS) is also considered, which calculates the power split based on a simplified model to minimise the instantaneous fuel consumption based on current demand. This method is suboptimal as it is unable to account for the future state of the battery and future requirements [46][47][48].
DP has been used liberally in hybrid vehicles to optimise the power split between the internal combustion engine and electric motor but has not been used as frequently in the area of thermal management. Shojaei et al. [49] demonstrated an optimised battery cooling schedule during a 24-h duty cycle in high ambient temperatures. In this research, the control of the system was binary, with the option of having the chiller on or off. Perez et al. investigated the problem of splitting the power demand between the electric motor and internal combustion engine on a hybrid vehicle [50]. Here, a dynamic programming approach was used to minimise the cost function over a drive cycle, resulting in an optimal trajectory describing how the demand should be split. Wang et al. showed dynamic programming could reduce the fuel costs of a Toyota Prius by 30% [29]. Others describe how DP represents the benchmark of what can be achieved in a dynamic problem [27,28].
Although DP offers global optimality for control problems, it struggles to be implementable due to the "curse of dimensionality" which causes the computational problem to grow beyond manageable limits. However, it is the only option for ensuring global optimality and is, therefore, the most appropriate choice for exploratory research into the problem. DP has not yet been used to produce a multi-level heating trajectory for vehicle components during a drive cycle. The work in this paper, therefore, represents a new application of this well known technique. Furthermore, DP has been coupled to a cost function with variable weightings which can control the balance between energy, range, and comfort. To demonstrate the superiority of this control method, the DP has been compared to simple heating strategies in the results. The simple strategies used are no battery heating and full battery heating, where the battery is heated at 10 kW from the start of the cycle until it reaches target temperature.

Problem Definition
The objective of using DP is to produce a heating trajectory to control the interaction between the heat pump and the battery to balance vehicle performance with cabin comfort during the World Harmonised Light Vehicle Test Procedure (WLTP). The WLTP drive cycle has been selected for use in this research as it is the mandatory test to establish vehicle emission performance in Europe.
The balance between vehicle performance and cabin comfort will be achieved by minimising a cost metric, J. This process is described by Equation (11), where U(k) is the input vector, which describes the amount of heat delivered to the battery at the kth time step, as seen in Figure 1. U(k) is discretised into four heating levels: none (0 kW), low (3.33 kW), medium (6.66 kW), and high, or full (10 kW). The state variables, X_s, for the problem are T_cabin and T_battery, which represent the cabin and battery temperatures, respectively; this is also seen in Figure 1. These are discretised into 1 °C and 2 °C increments, respectively. The problem is also discretised into time steps of 300 s. This time step length was chosen since it is approximately the length of the shortest segment of the WLTP cycle (which is used for this research). This ensures that the time resolution is high enough to identify the required battery heating quantity for each section of the chosen drive cycle.
Other vehicle states (such as vehicle speed) which are not affected by the choice of heating control will also vary through the drive cycle, and these are known as static parameters. In the first step of the implementation procedure, shown in Figure 12, a selection of vehicle parameters is recorded to be used as static parameters. These are vital for ensuring that the simulation can be reinitialised correctly at an arbitrary point during the drive cycle. The list of variables considered is:
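The size of the resulting search problem can be estimated from this discretisation. The sketch below assumes an 1800 s cycle, rectangular state grids running from a −7 °C ambient to the 22 °C cabin and 20 °C battery targets, and heating levels taken as thirds of 10 kW (consistent with the rounded 3.33 kW and 6.66 kW above). These bounds are assumptions for illustration; the feasible state region actually searched may be smaller.

```python
# Hedged sketch: counting the model evaluations implied by the discretisation.
import math


def make_grid(lower: float, upper: float, step: float) -> list:
    """Evenly spaced grid points from lower, not exceeding upper."""
    n_points = int(math.floor((upper - lower) / step)) + 1
    return [lower + i * step for i in range(n_points)]


CONTROL_LEVELS_KW = [0.0, 10.0 / 3.0, 20.0 / 3.0, 10.0]  # none, low, medium, full
TIME_STEP_S = 300
CYCLE_DURATION_S = 1800   # assumed cycle duration
AMBIENT_C = -7.0          # assumed lower bound of both state grids

cabin_grid = make_grid(AMBIENT_C, 22.0, 1.0)     # 1 degC increments
battery_grid = make_grid(AMBIENT_C, 20.0, 2.0)   # 2 degC increments
n_time_steps = CYCLE_DURATION_S // TIME_STEP_S

n_evaluations = (n_time_steps * len(cabin_grid)
                 * len(battery_grid) * len(CONTROL_LEVELS_KW))
```

Each additional state variable would multiply this count by the number of grid points in that variable, which is the growth that limits the choice of states.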
Battery SOC.
While the control option will have no effect on the vehicle speed, it is likely to have an effect on some of the other parameters from this list. For example, should the battery heating be initiated late in the drive cycle, it may cause an increase in compressor speed, which would not be captured using the compressor speed as a static parameter. It may be argued then that these variables should be included as state variables. However, this approach would not be feasible since the computational effort would increase exponentially with the addition of each new state variable due to the "curse of dimensionality". It is, therefore, assumed that, for items two to twelve on the list presented above, the variation in their value due to different battery heating requests is either insignificant or inconsequential to the cost function and cost matrix.

Cost Function
The cost function is defined to account for three factors critical to performance and comfort, which are represented using three vehicle metrics. The first metric is energy consumption, which incentivises the minimisation of HVAC usage. The second is depth of discharge (DOD), which correlates directly to remaining range and is separable from energy consumption due to the subtleties of the impacts of low temperature on battery performance [32]. For example, it is possible to spend extra energy on heating the battery at the start of the drive cycle, resulting in higher battery temperatures and a higher final SOC. In this case, the energy efficiency would be reduced, but the vehicle range increased [32]. Finally, comfort, which is paramount to the user experience of electric vehicles, is represented with the third metric. These three attributes correspond to objectives j_1, j_2, and j_3, which are defined by:
Here, V_ocv is the open circuit voltage, which is used instead of terminal voltage to include cell internal losses; I(t) is the battery current; t_start(k) and t_end(k) are the start and end times of the kth time step; SOC is the battery state of charge; and T_cabin is the measured cabin temperature, with 22 °C being the cabin target temperature. The individual objectives are combined in the following way to define the cost J(k), which represents the cost at the kth time step of the solution.
Here, w_i are weighting factors used to control the desired balance between battery performance and comfort; a full definition and explanation of these factors is given in Section 6.1. n_i are the normalisation factors, which are defined using Equation (16). For this normalisation, the vehicle is operated in a baseline configuration, in which the battery is not heated, and generates cost metrics j_i^baseline. Hence, for the purpose of the dynamic optimisation, improvements in cost correspond to improvements over the no heating strategy.
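A weighted, normalised combination of this kind can be sketched as follows. The sketch assumes that each normalisation factor is the reciprocal of the corresponding baseline metric and that the cost is the weighted sum of the normalised objectives; both are assumptions, since the combining equation and Equation (16) are not reproduced in this text. The numerical values are placeholders.

```python
# Hedged sketch of a weighted cost normalised against a no-heating baseline.
def normalisation_factors(j_baseline):
    """Assumed form: n_i = 1 / j_i^baseline."""
    return tuple(1.0 / j for j in j_baseline)


def combined_cost(j, weights, n):
    """Assumed form: J = sum_i w_i * n_i * j_i."""
    return sum(w * n_i * j_i for w, n_i, j_i in zip(weights, n, j))


# Placeholder metrics for one time step: (energy, depth of discharge, comfort).
j_baseline = (1.2e6, 0.05, 900.0)
j_heated = (1.3e6, 0.045, 1100.0)   # more energy, less DOD, worse comfort
weights = (0.2, 0.3, 0.5)           # hypothetical weighting split

n = normalisation_factors(j_baseline)
cost_baseline = combined_cost(j_baseline, weights, n)
cost_heated = combined_cost(j_heated, weights, n)
```

With weights that sum to one, the baseline strategy scores exactly 1 under these assumptions, so any cost below 1 indicates an improvement over no battery heating for the chosen weighting.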

Implementation
The DP solution is implemented in the following steps.

• Step 1. DP search boundary generation through the evaluation of the model under two cases: full battery heating (10 kW requested at the start of the cycle) and no battery heating. These two cases are also used to generate static parameters for the problem and for comparison in Section 6.1.
• Step 2. Cost matrix generation. Here, the cost is found at every 300 s and at every feasible state (discretised cabin and battery temperature) and for every control option. The cost matrix describes the consequence of using every control at every point in the problem and, hence, can be used to find a control sequence which minimises cost.
• Step 3. Backwards solution. Here, the open source algorithm works backwards through the cost matrix and finds the control option which minimises the cost at every point in the state space. The output of this is a control lookup table which is used in the forward solution to identify which control option to use at each time step.
• Step 4. Forward solution. The problem is simulated iteratively (each time step reinitialised and simulated separately); after each time step, the vehicle's location in state space is considered, and the control matrix from the backward solution is used to find the control that should be used. The output from this step is the optimised control trajectory.
• Step 5. Validation of the found solution in Dymola. The control strategy found in Step 4 is imported into Dymola and the problem is simulated continuously (without re-initialisation). The output of this step is the final vehicle metrics which can be used to assess progress.
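Steps 2 to 4 can be sketched in Python as follows. This is a minimal illustration, not the open source toolbox used in this work: the function toy_step is a hypothetical stand-in for a 300 s Dymola run, its coefficients and the temperature grid ranges are invented for illustration, and the six time steps assume an 1800 s cycle. Only the four control levels and the 1 °C and 2 °C grid increments are taken from this section. Because the toy model does not depend on time, one cost matrix is reused at every step, whereas the method described here evaluates the cost at every time step.

```python
import itertools

CONTROLS_KW = [0.0, 3.33, 6.66, 10.0]   # four heating levels from the text
N_STEPS = 6                              # ASSUMPTION: 1800 s cycle / 300 s steps
CABIN_GRID = list(range(-7, 23))         # 1 degC increments; range is assumed
BATT_GRID = list(range(-8, 21, 2))       # 2 degC increments; range is assumed


def snap(value, grid):
    """Return the grid point closest to value."""
    return min(grid, key=lambda g: abs(g - value))


def toy_step(state, u_kw):
    """HYPOTHETICAL stand-in for one 300 s vehicle simulation.

    Heat sent to the battery (u_kw) is taken away from the cabin.
    Returns the next discretised state and the cost of the step.
    """
    t_cabin, t_batt = state
    new_cabin = min(22.0, t_cabin + 0.9 * (10.0 - u_kw))
    new_batt = min(20.0, t_batt + 0.8 * u_kw)
    cost = abs(22.0 - new_cabin) / 29.0 + (20.0 - new_batt) / 28.0
    return (snap(new_cabin, CABIN_GRID), snap(new_batt, BATT_GRID)), cost


def build_cost_matrix():
    """Step 2: evaluate every control option at every discretised state."""
    states = itertools.product(CABIN_GRID, BATT_GRID)
    return {s: [toy_step(s, u) for u in CONTROLS_KW] for s in states}


def backward(cost_matrix, n_steps):
    """Step 3: work backwards and store the best control for each state."""
    cost_to_go = {s: 0.0 for s in cost_matrix}
    lookup = []
    for _ in range(n_steps):
        table, updated = {}, {}
        for s, options in cost_matrix.items():
            totals = [c + cost_to_go[nxt] for nxt, c in options]
            best = min(range(len(totals)), key=totals.__getitem__)
            table[s], updated[s] = best, totals[best]
        cost_to_go = updated
        lookup.insert(0, table)          # earliest time step ends up first
    return lookup, cost_to_go


def forward(cost_matrix, lookup, start):
    """Step 4: follow the lookup table forwards from the initial state."""
    state, controls, total = start, [], 0.0
    for table in lookup:
        idx = table[state]
        state, cost = cost_matrix[state][idx]
        controls.append(CONTROLS_KW[idx])
        total += cost
    return controls, total


cost_matrix = build_cost_matrix()
lookup, cost_to_go = backward(cost_matrix, N_STEPS)
start = (snap(-7, CABIN_GRID), snap(-7, BATT_GRID))
controls, total = forward(cost_matrix, lookup, start)
print(controls, round(total, 3))
```

In this sketch the forward pass reuses the same stored transitions as the backward pass, so its total cost matches the backward cost-to-go exactly. In the workflow described above, the forward pass re-simulates each step, and Step 5 then checks the result against a continuous simulation.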
The work flow implemented when using the open source toolbox is visualised in Figure 12. Here, C_cycle is the cost found from simulating the entire cycle, where the inputs into the Dymola model are the ambient temperature and a fixed control strategy, either full battery heating (10 kW) or no battery heating. C_Ts is the cost associated with the simulation of the time step Ts within the drive cycle. Here, the inputs to the Dymola model are the ambient temperature, the required simulation start time, the control option for that time step, the values of the state variables, and the values of the static variables, which are found using profiles generated in output 1.
To produce the cost matrix, the problem needs to be discretised; hence, the control trajectory for the electric battery heating has been discretised into four levels: none (0 kW), low (3.33 kW), medium (6.66 kW), and high, or full (10 kW). This extra discretisation beyond the binary approach, which would be 0 kW or 10 kW, provides extra control over the heating strategy. X_s are the state variables, T_cabin and T_battery, which are discretised into 1 °C and 2 °C increments, respectively. The problem is also discretised into time steps of 300 s. This time step length was chosen since it is approximately the length of the shortest segment of the WLTP cycle (which is used for this research). This ensures that the time resolution is high enough to identify the required battery heating quantity for each section of the chosen drive cycle.
Table 7 shows the weighting splits between each of the components of interest which make up the cost function. These weightings are retrospectively applied to the cost matrix between steps 2 and 3 of the DP process described in Section 5. Table 7 has been created to control the balance between range and comfort when considering how to optimally heat the electric battery.
Neutral refers to a case where the priorities are treated equally, with the intention of finding a solution towards the midpoint in available range and comfort. There are then weighting groups which skew the priority to either the cabin or the battery; in these cases, the prioritised component carries two-thirds of the weighting, which is twice as much weight as that given to the un-prioritised component. The intention here is to find solutions which improve cabin heating or battery performance, while still taking the other into account, hence not finding the simple heating solutions which are expected to represent the extremes. Within each group, there is also a split in battery priority which is denoted by Neutral, Energy, and Range for each column of Table 8. Within the results, the specific weightings used will be referred to as "Neutral N", "Neutral E", "Neutral R", "Battery N", and so on, as shown in Table 7. With the battery objectives split into two components, energy and DOD, there may be an advantage to be had in changing the weighting between them. This is explored in the results with the best weighting mix identified for each weighting group (Neutral, Cabin, and Battery).
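The structure of the nine weightings can be sketched as below. The group-level split follows the description above (equal priority for Neutral, two-thirds to the prioritised component otherwise). The split of the battery weight between energy and DOD is NOT given in this text, so the values in SUB_SPLIT are placeholders chosen purely for illustration and should be replaced with the values from Table 7.

```python
from fractions import Fraction

# (battery share, cabin share) for each weighting group, from the text.
GROUP_SPLIT = {
    "Neutral": (Fraction(1, 2), Fraction(1, 2)),
    "Cabin": (Fraction(1, 3), Fraction(2, 3)),
    "Battery": (Fraction(2, 3), Fraction(1, 3)),
}

# ASSUMPTION: share of the battery weight given to the energy objective.
# These numbers are hypothetical placeholders, not the Table 7 values.
SUB_SPLIT = {"N": Fraction(1, 2), "E": Fraction(2, 3), "R": Fraction(1, 3)}


def weightings():
    """Return {name: (w_energy, w_dod, w_cabin)} for all nine cases."""
    table = {}
    for group, (w_batt, w_cabin) in GROUP_SPLIT.items():
        for tag, energy_share in SUB_SPLIT.items():
            w_energy = w_batt * energy_share
            w_dod = w_batt - w_energy
            table[f"{group} {tag}"] = (w_energy, w_dod, w_cabin)
    return table


for name, w in weightings().items():
    print(name, [str(x) for x in w])
```

Exact fractions are used so that each set of weights sums to one without rounding error.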

Simulation Set Up
The model developed in Dymola is used for Step 1 of the process discussed in Section 5.3. In completing this step, a .exe file is generated, which is used within MATLAB (R2018a, MathWorks, Natick, MA, USA) to complete Steps 2 to 4 of the process. The solution is then imported into Dymola again, where it is implemented, and the improvement can be measured. This simulation setup is visualised in Figure 13. For the drive cycle shown in Figure 13b, the WLTP drive cycle is selected as shown in Figure 14.
The WLTP cycle is chosen as it is currently the official cycle for measuring fuel consumption and electric vehicle range.
The results from the implementation of the method described in Sections 2 and 5 are presented in Sections 6.3 and 6.4. Section 6.3 demonstrates an implementation of the dynamic programming method for the nine priority weightings described in Table 7. In that section, the individual solutions will be presented and analysed in more detail. In Section 6.4, the weightings in Table 7 will be used to investigate the sensitivity of the method to variations in the balance of the cost function at two additional ambient temperatures, without analysing the individual solutions produced. The results will be used to provide recommendations of which weighting split to use at each temperature to maximise performance.

DP Demonstration
Here, the DP method is performed on the model described using the WLTP drive cycle, seen in Figure 14, at −7 °C. This combination of drive cycle and ambient temperature represents the low temperature testing procedure in Europe [51]. The resulting optimal trajectories can be seen in Figures 15 and 16. These generated optimal battery heating trajectories were implemented in the Dymola model as described in the final stage of Figure 12. The resulting vehicle performance metrics corresponding to the optimised heating trajectories can be seen in Table 8. Table 8 also shows results for two simple heating strategies, full heating and no heating. In the full heating strategy, the battery receives the full 10 kW of heat from the start of the cycle, and battery heating ceases when the battery reaches its target temperature of 20 °C. In the no battery heating scenario, the battery receives no heat through the drive cycle. It should be noted that the full battery heating strategy uses less energy than the no heating strategy. While battery heating uses some additional energy at the start of the cycle, this is recovered through an increase in battery efficiency at the end of the cycle, leading to the energy and range improvements seen. However, the full battery heating also leads to a higher mean discomfort, defined by the cost metric j_3 in Equation (14). This shows that heating the battery, while improving the vehicle's performance, reduces the thermal comfort experienced by the occupants of the cabin. Figures 15 and 16 show that all the optimal heating strategies begin heating the battery at 1200 s. This is a chance result of the optimisation process. It shows that, under any balance of cost metrics, 1200 s is the optimal time to heat the battery during the WLTP drive cycle at −7 °C.
It is likely that this is the case as it allows enough time for the cabin to reach temperature, thereby having minimal impact on the comfort metric, while still being early enough in the cycle to improve battery performance before the high speed section at the end of the cycle. This is a "sweet spot" that exists due to the length and design of the WLTP cycle and the ambient temperature. Different drive cycles at different ambient temperatures may have different "sweet spots". In Table 8, it can be seen that the vehicle performance metrics for the optimised heating trajectories fall between the values corresponding to the full heating and no heating control strategies. This indicates that the method has been successful, as a balance has been reached between range and comfort. For example, considering the results corresponding to the "Neutral N" weighting mix, the range has improved by 8% compared to the no battery heating strategy but is not as good as the full battery heating strategy. Conversely, the comfort has improved by 4% compared to the full battery heating strategy but is not as good as the no heating strategy. Table 8 shows results corresponding to all nine weightings presented in Table 7, and the sub-weighting which gives the best results considering the intention of the weighting is identified in bold. "Neutral E" is selected as the best Neutral weighting as it improves energy consumption and range in comparison to "Neutral N" and "Neutral R", while only having a small impact on discomfort. For prioritising the cabin, "Cabin N" has been selected, which significantly improves range compared to "Cabin E", while having a very small increase in discomfort. However, if this additional range is not required, then the No Heating simple strategy would be the best strategy to use as it promotes maximum comfort.
Finally, when the battery is prioritised, "Battery R" is selected as the best weighting mixture as it maximises range, while still improving comfort in comparison to the Full Heating strategy. However, here, it should also be mentioned that, should 126.7 km not cover the required range for a day's duty cycle, then it may be best to use the Full Heating strategy at the expense of some additional discomfort.
Figure 17 shows the cabin and battery temperature profiles for the simple heating strategies and the optimised heating trajectory according to the "Cabin N" weighting in Table 7. The benefits of the DP method can be better seen through Figure 17. The decision to heat the battery, which will increase its temperature and therefore effective capacity, extending the vehicle's range, is delayed until the end of the cycle, as seen in Figure 16a. Delaying the heating event has the effect of increasing cabin comfort in the early part of the cycle, owing to the increased heating capacity that results from not heating the battery. This is an important change, as it is likely to give a greater feeling of comfort at the start of the cycle, which can be more important than final cabin temperature in terms of thermal sensation and perceived comfort [52]. The DP optimised profile also activates battery heating shortly before the high speed section of the WLTP cycle. It is likely that this decision arises because the battery performance will have a greater impact when the vehicle is at higher speed and drawing more current, since losses are governed by I²R ohmic heating. This is an important point to make, as it is possible that battery heating may not be required if a drive cycle does not contain a high speed section. Figure 17a shows a reduction in cabin temperature when using the DP derived solution in comparison to both the simple heating strategies at the end of the drive cycle.
This will have an impact on the cabin discomfort metric j_3, defined in Equation (14), although the majority of the benefit in comparison to the Full Heating strategy is achieved before 1200 s. The severity of the cabin temperature reduction at the end of the cycle may indicate a potential area for method refinement, which is discussed more fully in Section 7, but stems from the exclusion of the heater core temperature as a state variable, leading to a less accurate cost matrix. Table 8 clearly demonstrates the advantage of using an optimised battery heating trajectory in balancing range with cabin comfort. Taking "Cabin N" as an example, the range has increased by 6% in comparison to the no heating strategy, while the comfort has improved by 6% (measured using the distance to target temperature) compared to the full heating strategy. With implementation, this would offer the user the ability to select an option to increase the vehicle's range without sacrificing cabin comfort.
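The link drawn above between the high speed section and battery losses follows from the ohmic loss relation. As a brief reminder, with R_int denoting the battery internal resistance (a symbol introduced here for illustration only, not taken from the paper's equations):

```latex
P_{\text{loss}} = I^{2} R_{\text{int}}, \qquad
\frac{P_{\text{loss}}(2I)}{P_{\text{loss}}(I)}
  = \frac{(2I)^{2} R_{\text{int}}}{I^{2} R_{\text{int}}} = 4
```

For a fixed resistance, doubling the current quadruples the resistive loss. Since the internal resistance of lithium ion cells generally rises as cell temperature falls, warming the battery reduces this loss most when current demand is highest.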

Sensitivity Analysis
The DP method is repeated using the full set of weightings presented in Table 7 for ambient temperatures −15 °C and 0 °C. Tables 9 and 10 show the results of the DP application at ambient temperatures −15 °C and 0 °C, respectively. This temperature range is chosen as it covers the expected cold temperature range of Europe in January [53]. This is done to understand how the results and sensitivity of the method vary at different temperatures. As with the results corresponding to −7 °C, the results identified in bold signify the best compromise and the heating trajectory which would most likely be implemented. Here, it can be seen that the window in range is greater at −15 °C than at 0 °C; this arises since −15 °C creates a greater window in battery temperature operating range and, therefore, a greater window in battery performance. Likewise, the window generated at 0 °C is the smallest, reflecting the smaller battery temperature operating window. While the window of operation at −15 °C is bigger than that at 0 °C, it is not as great as the window of operation at −7 °C. Compared to −7 °C, at −15 °C, the battery requires more energy to heat it to a beneficial temperature, so the window in range is not as large.
It may be noticed that the range does not improve steadily as the ambient temperature rises, which might seem surprising since an increase in ambient temperature would be expected to improve battery performance and reduce HVAC consumption. At −15 °C, the maximum range is 123.8 km, and, at −7 °C, 127.1 km is achievable, but, at 0 °C, only 124.8 km is available. The cause of this is the dependence of the heat pump's ability to extract heat from ambient on the ambient temperature. At −15 °C, the heat pump is unable to extract heat from ambient, and this reduces the amount of work that the heat pump is able to do. At 0 °C, ambient is warm enough for thermal extraction, leading to an increase in compressor load. For example, using the Neutral N weighting, the heat pump consumes 0.883 kWh of electrical energy at −15 °C, and this increases to 1.2 kWh of electrical energy at 0 °C. The reward for this extra consumption in electrical energy is the increase in cabin comfort which arises from greater heat pump performance. This can be evidenced by a 15% increase in heat delivered to the cabin during the first minute of the cycle when comparing the full battery heating strategies at the two ambient temperatures. This can be seen in Figure 18. The cost of increasing the power consumption of the compressor is a reduction in potential range of the vehicle. This may be adjusted by focusing on the control of the compressor demand, or by controlling the heat flow to the cabin; however, this work is focused on maximising cabin comfort, while balancing range using optimised battery heating.
In summary, it has been shown that there is the opportunity to control the balance between cabin comfort and battery performance by dynamically heating the battery. Using the found trajectories, it is possible to make recommendations as to when during a drive cycle the battery should be heated, and by how much. This information is summarised in Table 11.
2. Intelligent decision maker to identify which weighting should be used to achieve the required range while maximising cabin comfort.
3. DP solution repository combination into implemented control.
This process represents a strategy for building a controller, allowing this theoretical work to become implementable on physical vehicles.
Figure 19a shows the optimised battery heating control generated when prioritising the cabin and using a neutral split of battery objectives ("Cabin N"). Here, it can be seen that all of the heating events happen at the end of the cycle. In Figure 17a, it was seen that this strategy had an adverse impact on cabin temperature. Figure 19b shows two cabin temperature profiles: one resulting from the DP-based thermal management optimisation process and one resulting from the implementation of the DP solution into the Dymola model. Here, it can be seen that, during the DP process, the algorithm predicted that the cabin would be less affected by the late activation of battery heating. In this section, the discrepancy between the two profiles is investigated.
The optimisation results were derived from the algorithm performing the simulation in consecutive piece-wise sections. Having performed the backward solution, the algorithm consecutively simulates each time step, evaluating the value of the state variables at the end of each simulation, and using this information to decide the control value for the next step. This corresponds to the 4 steps of Figure 12. This piece-wise simulation approach, generated through the forward solution, is reconstructed into a single trajectory and presented as the reconstructed profile in Figure 19b. Once the solution is found, it is simulated in full in Dymola to check for agreement with the reconstruction, and this is denoted as the re-simulated profile. Figure 19b distinctly shows that the algorithm predicts a more comfortable cabin temperature experience than the one generated through the implementation of the optimised control. Considering Figure 19a,b together partially explains why this divergence occurs.
During the heating phase at 1200 s, the re-simulated cabin temperature begins to deviate from the reconstructed cabin temperature. This implies that, during the reconstruction, the DP algorithm is operating under the assumption that there is more heat available than there actually is. This can be explained when considering the assumption made about the heater core temperature explained in Section 5.1. The reconstructed heater core temperature profile is shown in Figure 20. The heater core is located between the high temperature circuit and the cabin in Figure 1, and it converts heat from the high temperature circuit into heated air for the cabin; therefore, its temperature significantly impacts heat delivered to the cabin. Figure 20 shows that the reconstructed heater core temperature is able to recover instantaneously after the battery heating performed at 1200 s. This means the reconstructed heater core starts at a higher temperature during the next time step at 1500 s, while the re-simulated heater core temperature remains low as the battery heating continues. This happens because its initialisation value is controlled by the static parameter trajectory (Section 5.1), which is not adapted to the previous time step's control value. However, it can be seen from the re-simulated trajectory that the heater core temperature is not able to instantly recover after the heating event; therefore, the cabin temperature does not rise as quickly as expected. The DP algorithm interprets the over-predicted heater core temperature as an abundance of heat available to the cabin. The algorithm assumes that heat can be delivered to the battery at little cost to the cabin comfort due to the high heater core temperature; however, this is not the case, and the cabin temperature suffers. The solution to this is to include the heater core temperature as a state variable.
Then, after a heating event in which the heater core temperature is lowered, the cost matrix will be able to predict the cost of starting at this reduced heater core temperature, which may discourage the use of battery heating under some circumstances. The inclusion of heater core temperature as a state variable will also improve the accuracy of the reconstructed cabin temperature when predicting the consequences of implementing the optimised battery heating control strategy. However, the inclusion of the heater core as a state variable adds to the "curse of dimensionality" when using dynamic programming. For example, if 8 values of heater core temperature were to be used, then the simulation time required to complete the cost matrix would increase by a factor of 8. This could make the computation time prohibitively long.
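The growth in cost matrix size can be illustrated with a short count of the simulations required. The grid sizes below (30 cabin values, 15 battery values, six time steps) are hypothetical round numbers chosen for illustration; only the four control levels and the factor of 8 for the heater core come from the text. The count is an upper bound, since the method described in Section 5 evaluates feasible states only.

```python
def cost_matrix_runs(n_time_steps, state_grid_sizes, n_controls):
    """Upper bound on single-step simulations needed for the cost matrix.

    Every control is evaluated at every combination of state values
    at every time step, so the count is a plain product.
    """
    runs = n_time_steps * n_controls
    for size in state_grid_sizes:
        runs *= size
    return runs


# ASSUMED grid sizes: 30 cabin and 15 battery temperature values.
two_states = cost_matrix_runs(6, [30, 15], 4)
# Adding heater core temperature with 8 discrete values.
three_states = cost_matrix_runs(6, [30, 15, 8], 4)
print(two_states, three_states, three_states // two_states)
```

Each additional state variable multiplies the count by its number of grid values, which is why the total grows exponentially with the number of state variables.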

Conclusions and Further Work
A dynamic optimisation was performed to find the optimal dynamic heating strategy for a battery electric vehicle. It was demonstrated to be able to successfully find a balance between range and comfort in Section 6.3. The result of this optimisation was a 6.2% increase in range compared to no battery heating and a 5.5% increase in thermal comfort compared to full battery heating, at an ambient temperature of −7 °C.
In Section 6.4, a sensitivity analysis was performed at different ambient temperatures. This resulted in a recommendation for the best heating strategies to use dependent on ambient temperature and desired priority. Finally, there was a recommendation of when, and by how much, to heat the battery during a drive cycle depending on different priorities and ambient temperatures, shown in Table 11.
The heating optimisation method shown in this work is not suitable for direct use in real-world applications. However, exploring a wide spectrum of driving scenarios with this method can provide the data required to optimise heating strategies off-line. The results can then be utilised as a control benchmark to develop an intelligent heating control strategy suitable for real-time application.
The next step to be performed in this work is to refine the method by including the heater core temperature as a state variable, as described in Section 7. Upon the success of this refinement, the method should be expanded to incorporate more drive cycles and an intelligent scenario recognition control, which will automate the choice of the ideal control strategy. This will represent a step in the direction of real-world implementation.

• Develop intelligent controller to recognise driving condition and deploy appropriate control.
• Include heater core as a state variable to improve performance of DP procedure.
• Improve state limitation recognition for more computationally efficient DP cost matrix construction.
Although the examples used in this study have focused on the low temperature use case, the same issue is present at high temperatures. Here, the motivation is to reduce the battery temperature to prevent ageing and ensure safety. Battery cooling reduces the cooling capacity available for the cabin, so the method demonstrated here could be reapplied to create dynamic battery cooling trajectories, which balance battery ageing against cabin thermal comfort.
Using this method could increase the acceptance of EVs, as manufacturers will be able to improve range at low ambient temperatures, while still maintaining high cabin comfort when extended range is not required. This will allow the vehicle to self-regulate and self-optimise according to individual journeys and requirements.
Author Contributions: J.J. was responsible for the conceptualisation, methodology, validation and final analysis. Software assistance was provided by A.P. and Claytex; supervision and assistance in writing and editing were provided by T.Q.D., W.D.W., and A.M. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by EPSRC, grant reference number [EP/M507593], and Jaguar Land Rover.