Improving Electricity Consumption Estimation for Electric Vehicles Based on Sparse GPS Observations

Improving the estimation accuracy for the energy consumption of electric vehicles (EVs) would greatly contribute to alleviating the range anxiety of drivers and serve as a critical basis for the planning, operation, and management of charging infrastructures. To address the challenges in energy consumption estimation encountered due to sparse Global Positioning System (GPS) observations, an estimation model is proposed that considers both the kinetic characteristics from sparse GPS observations and the unique attributes of EVs: (1) work opposing the rolling resistance; (2) aerodynamic friction losses; (3) energy consumption/generation depending on the grade of the route; (4) auxiliary load consumption; and (5) additional energy losses arising from the unstable power output of the electric motor. Two quantities, the average energy consumption per kilometer and the energy consumption for an entire trip, were examined and compared in terms of model fitness, parameters, and effectiveness; the latter showed a higher fitness. Based on sparse GPS observations of 68 EVs in Aichi Prefecture, Japan, the traditional linear regression approach and a multilevel mixed-effects linear regression approach were used for model calibration. The proposed model showed high accuracy and demonstrated great potential for application in using sparse GPS observations to predict the energy consumption of EVs.


Introduction
As a means of reducing the world's dependency on fossil fuels and the associated harmful emissions, electric vehicles (EVs) have attracted significant attention in recent years. However, the driving range of EVs is not competitive with that of internal combustion engine vehicles because of significant technological barriers, and this can make drivers feel anxious about their remaining energy. It has been widely accepted that accurate range prediction is the key to minimizing range anxiety and helping drivers make the best use of their limited electricity [1,2]. More importantly, the energy consumption of vehicles is a critical aspect to consider both in transportation planning and in evaluating the energy impacts of operational-level projects [3]. In particular, EV energy consumption data serve as a critical basis for the spatial planning, operation, and management of charging infrastructures.
According to previous studies, a variety of factors affect energy consumption, including travel-related factors [4,5], environment-related factors [6-9], vehicle-related factors [10-14], roadway-related factors [15], traffic-related factors [16-18], driver-related factors [3,19-22], the health and degradation condition of the battery [23-26], the efficiency of braking energy recovery [27-31], and the charge and discharge characteristics of the battery [32]. Some previous studies collected the information through driving-cycle experiments in the lab [4,24] and Global Positioning System (GPS) observations in the real world [32-34], but some results showed significant differences between experiments and real-world conditions [35,36], leading to relatively low accuracy and poor practicality of the models. In particular, it is very difficult for an experimental design with predetermined conditions to take certain real-world conditions into consideration, including drivers' air-conditioner and heater usage, the influence of the driving environment, real-world aerodynamic friction loss, lane-changing behaviors, car-following behaviors, and other driving behaviors. Thus, highly accurate energy consumption estimation requires the collection of extremely detailed and comprehensive information on the operation of EVs in the real world.
In practice, detailed and comprehensive field observations of daily travel are extremely difficult to obtain because of the high costs of implementation and privacy issues. However, sparser and more fragmentary data on EVs are relatively easy to obtain from simple devices installed in such vehicles, such as GPS devices. Nevertheless, energy consumption estimation based on sparse observations of EVs faces many challenges. First, with sparse GPS data, many slight changes in the level of stored energy and the finer details of driving behavior characteristics and traffic conditions, which might be derived from real-time location information, will remain unobserved [37]. Moreover, without continuous observations of vehicle movement, the kinetic characteristics that most strongly influence the energy consumption of EVs are difficult to model, and the quantification of these factors may readily introduce large errors. In particular, sparse GPS observations increase the difficulty of and errors in map matching [38,39].
To overcome these challenges, improving the accuracy of energy consumption estimation, based on sparse observations, would make an important contribution to practical applications. Thus, the purpose of the present work is to improve the accuracy of electricity consumption estimation, based on sparse spatial-temporal behavior observations of EVs. An energy consumption model is proposed that considers the kinetic characteristics and unique attributes of EVs: (1) work opposing the rolling resistance; (2) aerodynamic friction losses; (3) energy consumption/generation depending on the grade of the route; (4) auxiliary load consumption; and (5) additional energy losses arising from the unstable power output of the electric motor. The traditional linear regression approach and a multilevel mixed-effects linear regression approach are used for model calibration based on GPS observations of 68 EVs.
In previous studies of energy consumption estimation, either the energy consumption for an entire trip [35] or the energy consumption per unit distance [5] has been used as the research object. An estimation for an entire trip gives drivers an overview of the electricity costs for future travel, while an estimation per unit distance provides a detailed view of the electricity consumed when driving. In the present work, energy consumption models for both are built, and the results are compared and analyzed from the aspects of model fitness, parameters, and effectiveness.
The paper is organized as follows. The second section provides a detailed description of the data collection procedures used to obtain the EV energy consumption observations. The energy consumption estimation models and the results of the study are then successively discussed in the following sections (Sections 3 and 4, respectively). Finally, a summary of our findings is presented.

Data Collection
GPS trajectory data were collected for approximately 500 EVs in Japan from February 2011 to January 2013. Among these data, those for 39,685 trips of 68 EVs in Aichi Prefecture, collected from February 2012 to January 2013, were used in this study. The vehicle coordinates, the timestamps, the vehicle odometer records, the vehicle ID, the state of charge (SOC) of the battery, and the usage states of the air conditioner and heater were collected once per minute. It should be noted that the odometer records distance in integer kilometers, which makes it impossible to model energy consumption on a road link basis.
With the help of map matching, the trips completed by each vehicle were identified. The travel distance, travel time, SOCs before and after travel, and so on were acquired. Furthermore, based on elevation data for the road network in Aichi Prefecture, the gradients of the travel routes were also determined. To investigate the influence of the ambient environment, the ambient temperature for each trip was recorded, except for 208 trips for which the temperature was missing. In total, 39,477 trips of 68 EVs were employed in the present study.
The polling frequency of the EVs' GPS reports was relatively low (once per minute), and therefore the data could not provide detailed information on actual driving behavior or the moments at which the SOC changed (which occurred in steps of 0.5% during operation). It would be extremely difficult to compare the energy consumption of vehicles under varying driving conditions or to estimate the actual SOC values at the start and end nodes of each link; thus, we focused on the energy consumption per trip. In total, 39,477 trips completed by 68 EVs were investigated in this study.
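To make the effect of the 0.5% SOC step concrete, the following minimal sketch converts start and end SOC readings into trip energy. The conversion through a usable battery capacity is an assumption for illustration (the text does not state how SOC was converted to kWh), and `capacity_kwh` is a hypothetical parameter that must be supplied by the reader.

```python
def trip_energy_kwh(soc_start_pct, soc_end_pct, capacity_kwh):
    """Energy drawn over a trip from start/end SOC readings (in percent).

    capacity_kwh is a hypothetical usable battery capacity; it is not
    given in the text and is an assumption of this sketch.
    """
    if capacity_kwh <= 0:
        raise ValueError("capacity_kwh must be positive")
    return (soc_start_pct - soc_end_pct) / 100.0 * capacity_kwh


def soc_step_kwh(capacity_kwh, step_pct=0.5):
    """Smallest energy change resolvable with an SOC step of step_pct percent."""
    return step_pct / 100.0 * capacity_kwh
```

Because the SOC only changes in 0.5% steps, any energy difference smaller than `soc_step_kwh(capacity_kwh)` is invisible in the records, which is one reason per-link estimation is impractical with these data.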

Energy Consumption Estimation Models
The energy consumption model for an internal combustion engine vehicle is formulated in terms of the work of the tractive force necessary to overcome acceleration resistance, air resistance, gradient resistance, and rolling resistance; it is also used for EVs [15,40] and is given by: where ∆E_i is the energy consumption for trip i, Eac_i is the work necessary to overcome the acceleration resistance, Ero_i is the work opposing the rolling resistance, Eae_i is the aerodynamic friction loss, and Egr_i is the energy consumed or generated depending on the grade of the route (this value is positive when the vehicle is travelling uphill and negative during downhill travel). Nevertheless, for an EV, these four components corresponding to the work opposing various resistances will always underestimate the true energy consumption in the real world. The electricity consumed by auxiliary loads represents a non-negligible contribution. In particular, the effect of the ambient temperature is seldom considered. Because of the particularities of the dynamic system, the energy loss arising from the unstable power output of the electric motor, as influenced by the ambient temperature, must also be included in the energy consumption model. Thus, the energy consumption model in the present work is formulated as follows: where Eau_i is the auxiliary load consumption and Ete_i is the additional energy loss due to the unstable power output of the electric motor under the influence of the ambient temperature, which most previous studies have ignored. Notably, the once-per-minute polling frequency of the GPS devices used in this study resulted in considerable difficulty in measuring acceleration. The work necessary to overcome the acceleration resistance is correlated with other parts of the power output and therefore shifts to other terms when the model is calibrated by a regression model [35].
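As a reading aid, the two formulations described above can be written out from the term definitions in the text. This is a reconstruction, not the authors' typeset equations. The first line follows directly from the four named components. In the second line, dropping the acceleration term is an assumption based on the statement that it shifts to the other terms during calibration.

```latex
\begin{align}
\Delta E_i &= \mathrm{Eac}_i + \mathrm{Ero}_i + \mathrm{Eae}_i + \mathrm{Egr}_i \\
\Delta E_i &= \mathrm{Ero}_i + \mathrm{Eae}_i + \mathrm{Egr}_i + \mathrm{Eau}_i + \mathrm{Ete}_i
\end{align}
```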

Average Energy Consumption Estimation per Unit Distance
An estimation of energy consumption per unit distance provides a detailed view of the electricity consumed while driving. However, because of the low polling frequency of the GPS observations in this study (once per minute) and because the residual SOC was recorded only in steps of 0.5%, corresponding to intervals of approximately 1-3 min during continuous driving, the energy consumption per kilometre could not be directly observed. Thus, the average electricity consumption per kilometre was estimated, as described in this section.
When the average energy consumption per kilometre is considered, ∆E_i on the left-hand side of Equation (1) is replaced with ∆e_i. The work opposing the rolling resistance [15,40] is formulated as: where ϕ is the rolling resistance coefficient, which is a function of the vehicle speed, V_i [41]; M is the mass of the vehicle, which is taken to be a constant because all vehicles investigated were of the same type; g is the gravitational acceleration; and θ is the road grade angle. The product ϕ * M * g is the rolling resistance, and l_i is the unit distance, i.e., one kilometre in this study. Most of the road grades in the urban area of Aichi Prefecture are less than 5%, so cos θ ≈ 1; therefore, with l_i = 1 km, Ero_i is approximately equal to ϕ * M * g. For ϕ, a simple linear function of the average travel speed is applied; η_1 and η_2 are the two parameters of this function, which are estimated in the regression model. The aerodynamic friction loss [15,40] is given by: where q is the air density, which is estimated in the regression model; K is the frontal area of the vehicle; and V_i is the average travel speed for trip i. In fact, Eae_i is proportional to V_i^2 when l_i = 1 km. Regarding the energy consumed for hill climbing, previous studies [15,40] formulate it as: where l_s is the distance travelled in one second and θ is the road grade angle. The product l_s * sin θ is the height difference covered in that second. In practice, however, the electricity consumption for the same height difference differs with the road grade angle [5,18]. To quantify the influence of different gradients, the gradient should be classified into several intervals [18]. The observations indicate that most trips had their origins and destinations in urban areas and were relatively short in length. If the range of each interval is too narrow, the coefficients estimated for two adjacent intervals may be the same, whereas if it is too wide, the continuous change of the coefficients across grades cannot be captured. Thus, after several experiments, the gradients were classified into 11 intervals: <−9%, −7%~−9%, −5%~−7%, −3%~−5%, −1%~−3%, −1%~1%, 1%~3%, 3%~5%, 5%~7%, 7%~9%, and >9% (where '−' indicates a downward gradient). Furthermore, ten of them (all except −1%~1%, which represents a flat road) were used to formulate Egr_i to avoid multicollinearity in the model: where p_j is the percentage of the link length per kilometre with grade angle j (j ∈ {<−9%, −7%~−9%, −5%~−7%, −3%~−5%, −1%~−3%, 1%~3%, 3%~5%, 5%~7%, 7%~9%, >9%}), τ_1 to τ_10 are the influence coefficients for the different gradients, G_j is the link length with grade angle j for trip i, and l_i is the total travelled distance.
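The gradient classification described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the ASCII interval labels, and the treatment of values lying exactly on an interval edge (assigned to the upper interval) are all assumptions, and any `tau` coefficients passed in are hypothetical rather than the estimates in Table 2.

```python
from bisect import bisect_right

# Interval edges in percent grade, giving the 11 classes listed in the text.
EDGES = [-9.0, -7.0, -5.0, -3.0, -1.0, 1.0, 3.0, 5.0, 7.0, 9.0]
LABELS = ["<-9", "-9~-7", "-7~-5", "-5~-3", "-3~-1", "-1~1",
          "1~3", "3~5", "5~7", "7~9", ">9"]
FLAT = "-1~1"  # reference class, left out of Egr to avoid multicollinearity


def grade_class(grade_pct):
    """Return the interval label for a grade in percent.

    Assumption: a value exactly on an edge falls into the upper interval.
    """
    return LABELS[bisect_right(EDGES, grade_pct)]


def gradient_shares(links):
    """Compute p_j = G_j / l_i from (length_km, grade_pct) pairs of one trip."""
    total = sum(length for length, _ in links)
    if total <= 0:
        raise ValueError("total link length must be positive")
    shares = {label: 0.0 for label in LABELS}
    for length, grade in links:
        shares[grade_class(grade)] += length / total
    return shares


def grade_energy_per_km(shares, tau):
    """Sum tau_j * p_j over the ten non-flat classes (tau values hypothetical)."""
    return sum(tau[label] * share
               for label, share in shares.items() if label != FLAT)
```

For a trip, `gradient_shares` yields the ten regressors p_j (plus the flat share, which is dropped), and `grade_energy_per_km` shows how the fitted coefficients would combine with them.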
The auxiliary load consumption consists of two main components in this study: (1) the air conditioning load consumption and (2) the heating load consumption. This is expressed as follows: where A_i and H_i are the average service times per kilometer of the air conditioner and heater, respectively, which are calculated as shown in Equations (10) and (11), respectively; t_i is the total travel time; a_i and h_i are the numbers of GPS points at which the air conditioner and heater, respectively, are switched on; n_i is the total number of GPS points recorded for trip i; and ζ_1 and ζ_2 are the effects of the air conditioner and heater, respectively, on the EV's energy consumption.
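A minimal sketch of how the per-kilometre service times could be derived from the once-per-minute GPS points is given below. The formula (share of points with the device on, multiplied by travel time and divided by distance) is inferred from the variable definitions above and is an assumption, as are the function names. The `zeta1` and `zeta2` arguments stand for the regression coefficients and carry no particular values here.

```python
def service_time_per_km(points_on, points_total, travel_time_min, distance_km):
    """Average device-on time per kilometre, e.g. A_i or H_i.

    Assumed form: (a_i / n_i) * t_i / l_i, with time in minutes.
    """
    if points_total <= 0 or distance_km <= 0:
        raise ValueError("points_total and distance_km must be positive")
    return (points_on / points_total) * travel_time_min / distance_km


def auxiliary_energy_per_km(a_per_km, h_per_km, zeta1, zeta2):
    """Auxiliary load term: zeta1 * A_i + zeta2 * H_i."""
    return zeta1 * a_per_km + zeta2 * h_per_km
```

For example, a 20-minute, 4 km trip with the air conditioner on at 5 of 10 recorded points gives a service time of 2.5 min/km.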
As illustrated in Reference [42], the relationship between energy efficiency and ambient temperature exhibits an asymmetrical 'U' shape, which is best fit by a third-order polynomial. Thus, the additional energy loss due to the unstable power output of the electric motor under the influence of the ambient temperature is formulated as: where α_1 to α_3 are the influence coefficients, which are estimated during model calibration, and T is the ambient temperature.
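One plausible written-out form of this third-order polynomial, based only on the symbols defined above, is shown below. Omitting a constant term is an assumption (in a regression it would be absorbed by the intercept), and the subscript on T is added here to denote the temperature of trip i.

```latex
\begin{equation}
\mathrm{Ete}_i = \alpha_1 T_i + \alpha_2 T_i^{2} + \alpha_3 T_i^{3}
\end{equation}
```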

Energy Consumption Estimation for an Entire Trip
The ability to accurately estimate the energy consumption for an entire trip would help drivers to better manage and use the limited electricity stored in their vehicle batteries. Thus, the electricity consumption for an entire trip (∆E_i) is the focus of this section. The work opposing the rolling resistance for an entire trip is formulated as: For a small road grade angle, Equation (13) can be simplified to: where Ero_i is proportional to ϕ * l_i. The aerodynamic friction loss is given by: whereas Egr_i is formulated as: The auxiliary load consumption is given by: where AC_i and HT_i are the service times of the air conditioner and heater, respectively, during trip i.
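The whole-trip components can be sketched from the definitions in the text and from the regressors named in the Results section (l_i and V_i * l_i for rolling resistance, V_i^2 * l_i for aerodynamic loss, link lengths G_j for grade, and AC_i and HT_i for auxiliary loads). This is a reconstruction under those assumptions, not the authors' typeset equations. The aerodynamic term is given only up to proportionality because the text does not name a drag coefficient.

```latex
\begin{align}
\mathrm{Ero}_i &= \phi M g \cos\theta \; l_i \;\approx\; (\eta_1 + \eta_2 V_i)\, M g\, l_i \\
\mathrm{Eae}_i &\propto q K V_i^{2}\, l_i \\
\mathrm{Egr}_i &= \sum_{j=1}^{10} \tau_j G_j \\
\mathrm{Eau}_i &= \zeta_1\, \mathrm{AC}_i + \zeta_2\, \mathrm{HT}_i
\end{align}
```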
The additional energy loss due to the unstable power output of the electric motor under the influence of the ambient temperature is again formulated as shown in Equation (12).

Model Calibration
A linear regression model was employed to calibrate the energy consumption model. The model for the average energy consumption per unit distance was formulated as: where ∆e_i is the average electricity consumption per kilometre during trip i (dependent variable); V_i, p, A_i, H_i, and T_i are the independent variables described in Section 3.4; η_2, α_2, and α_3 are the coefficients of the variables; β_0 + η_1 is the intercept term of the regression model; τ and p are the vectors of the influence coefficients for the different grade angle categories and the link length percentages at the different grade angles, respectively; and ε_i is the residual term. Furthermore, the model for the energy consumption for an entire trip was formulated as: where ∆E_i is the energy consumption for the entire trip i; β_0 is the intercept term; τ and G are the vectors of the influence coefficients for the different grade angle categories and the link lengths at the different grade angles, respectively; and ε_i is the residual term.
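To illustrate how the two regression specifications differ, the sketch below assembles one row of regressors for each model, following the variables listed in the text and in the Results section. The function names and the column order are assumptions. The rows can be passed to any ordinary least squares routine, which is not shown here.

```python
def per_km_row(V, p, A, H, T):
    """Regressors for the per-kilometre model (dependent variable: delta e_i).

    Columns: intercept, V, V^2, ten gradient shares p_j, A, H, T, T^2, T^3.
    """
    if len(p) != 10:
        raise ValueError("p must hold the ten non-flat gradient shares")
    return [1.0, V, V ** 2, *p, A, H, T, T ** 2, T ** 3]


def trip_row(l, V, G, AC, HT, T):
    """Regressors for the entire-trip model (dependent variable: delta E_i).

    Columns: intercept, l, V*l, V^2*l, ten link lengths G_j, AC, HT,
    T, T^2, T^3.
    """
    if len(G) != 10:
        raise ValueError("G must hold the ten non-flat link lengths")
    return [1.0, l, V * l, V ** 2 * l, *G, AC, HT, T, T ** 2, T ** 3]
```

The only structural difference is that every distance-dependent term in the trip model is scaled by l_i, whereas the per-kilometre model uses shares and per-kilometre service times.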
To address the heterogeneity of drivers [18], another regression model based on multilevel mixed-effects regression was also employed to capture the potential correlations and non-constant variability of the energy consumption characteristics. The general form of a two-level mixed-effects regression model [43] is given by: where E_ij is the dependent variable, β_0 is the fixed intercept term of the model, x_λij is a variable with a fixed coefficient β_λ, π is the number of variables x_λij that appear in the model, z_mij is a variable with a random coefficient u_m0j, ω is the number of variables z_mij that appear in the model, u_0j is the random intercept, and ε_ij is the residual term. In this study, for the average electricity consumption per kilometre and the energy consumption for an entire trip, the dependent variable E_ij in Equation (22) is replaced by ∆e_i and ∆E_i, respectively. Accordingly, the fixed independent variables are also replaced by those in Equations (20) and (21), respectively, while the random effects for each are discussed in Section 4.
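Using only the symbols defined above, the general two-level form can be written as follows. This is a reconstruction of the standard random-intercept, random-slope specification and may differ in layout from the original equation. The index j denotes the group (here, the vehicle or driver) and i the observation within it.

```latex
\begin{equation}
E_{ij} = \beta_0 + \sum_{\lambda=1}^{\pi} \beta_\lambda x_{\lambda ij}
       + u_{0j} + \sum_{m=1}^{\omega} u_{m0j}\, z_{mij} + \varepsilon_{ij}
\end{equation}
```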

Variable Descriptions
All of the variables for the energy consumption estimation were calculated as discussed above and are listed in Table 1. The average energy consumption per trip was 0.752 kWh, and the energy consumption per kilometre was 0.15 kWh/km. The mean trip distance was only 5.342 km, and the average speed was less than 22 km/h. Regarding the gradient distribution, 75% of the road distance travelled was at a grade of −1% to 1%, whereas approximately 98% of the distance travelled fell within the range of −5% to 5%. Thus, the observations indicate that most trips had their origins and destinations in urban areas and were relatively short in length. Moreover, the mean ambient temperature was approximately 18 °C, with a large variance among the different trips.

Results
Four regression models were constructed for calibrating the energy consumption models discussed above. Specifically, Models 1 and 2 are related to the estimation of the energy consumption per kilometre, whereas Models 3 and 4 are related to the energy consumption estimation for an entire trip.
Moreover, Models 1 and 3 are traditional linear regression models, formulated as shown in Equations (20) and (21), respectively, whereas the other two are multilevel mixed-effects linear regression models that consider the heterogeneity among drivers, formulated as shown in Equation (22).
In Models 1 and 2, both V_i and a constant are used to explain the work opposing the rolling resistance, as described by Equation (3); V_i^2 is used to account for the aerodynamic friction loss; the link length percentages corresponding to each gradient category per kilometre are used to interpret the energy consumed for hill climbing; the service times of the air conditioner and heater per kilometre (A_i and H_i, respectively) are used to explain the auxiliary load consumption; and a third-order polynomial with respect to T_i is used to consider the influence of the ambient temperature. Additionally, the differing driving habits of different drivers lead to variations in energy consumption characteristics. Thus, a number of variables related to driving behaviour, such as the average travel speed, air-conditioner usage time, and heater usage time, may show heterogeneity. The same average travel speed may contribute differently to the energy consumption depending on the driving behaviour of the driver. The same holds true for the link length percentage corresponding to a given gradient category. In addition, for air-conditioner usage and heater usage, some drivers prefer rapid cooling/heating but others do not. Thus, in Model 2, a random coefficient is assigned to each of the average travel speed, air-conditioning time, heating time, and grade angle categories, together with a random intercept at the driver level. Among the 68 EVs in the present study, each EV, whether private or company-owned, is driven by no more than two drivers, and in most cases there is only one main driver per vehicle. Although we cannot ensure that the same vehicle is driven by the same driver for each trip, the performance of each EV may be similar with regard to the driver's habits because the travel patterns (origin and destination, number of trips) and travel routes are similar for daily trips. The regression results are presented in Table 2. As shown in Table 2, all of the parameters estimated in Models 1 and 2 are highly significant. The signs and magnitudes of the parameters in these two models are similar. The significant standard deviations of V_i and the link length percentage corresponding to each gradient category reflect a notable heterogeneity among different drivers. The fitness of each model was evaluated based on the adjusted R² and Akaike's Information Criterion (AIC) value. An improvement of more than 3% was achieved in Model 2, which indicates a more appropriate model specification. The mean square error (MSE) of the estimated energy consumption for an entire trip was also calculated, and it likewise indicates more effective estimation by Model 2.
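The three fitness measures mentioned above can be computed from observed and fitted values as in the following sketch. The function names are assumptions, and the AIC is given in its Gaussian least-squares form, n ln(RSS/n) + 2k, which differs from likelihood-based values reported by statistical packages by an additive constant. The paper does not state which form was used.

```python
import math


def _rss(y, y_hat):
    """Residual sum of squares."""
    return sum((a - b) ** 2 for a, b in zip(y, y_hat))


def mse(y, y_hat):
    """Mean square error of the fitted values."""
    return _rss(y, y_hat) / len(y)


def adjusted_r2(y, y_hat, n_predictors):
    """Adjusted R^2 for a model with n_predictors regressors plus an intercept."""
    n = len(y)
    mean_y = sum(y) / n
    tss = sum((a - mean_y) ** 2 for a in y)
    r2 = 1.0 - _rss(y, y_hat) / tss
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


def aic_gaussian(y, y_hat, n_params):
    """AIC up to an additive constant, assuming Gaussian residuals."""
    n = len(y)
    return n * math.log(_rss(y, y_hat) / n) + 2 * n_params
```

When two models are fitted to the same trips, the one with the higher adjusted R² and the lower AIC and MSE is preferred, which is the comparison made between Models 1 and 2.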
In Models 3 and 4, both l_i and V_i * l_i are used to account for the work opposing the rolling resistance; V_i^2 * l_i is used to consider the aerodynamic friction loss; the link lengths corresponding to the different gradient categories for the trip are used to interpret the energy consumed for hill climbing; the service times of the air conditioner and heater during the trip (AC_i and HT_i, respectively) are used to explain the auxiliary load consumption; and, once again, the influence of the ambient temperature is represented by a third-order polynomial in T_i. Moreover, similar to Model 2, the heterogeneity among drivers is considered in Model 4. Specifically, a random intercept and random coefficients for travel distance, air conditioning time, heating time, and link length at different grade angles are included to capture the heterogeneity among different drivers.
Similar to Models 1 and 2, all the parameters in Models 3 and 4 are highly significant. An improvement of 1.48% was achieved in Model 4, and its smaller MSE and AIC also show it to be more effective.
All four models show that the impact of heating is about twice that of air conditioning per minute. According to Models 3 and 4, about 0.14 kWh of electricity is consumed for every kilometre driven. The product of speed and distance (V_i * l_i) shows a negative impact on energy consumption, and a one-kilometre increase in travel distance decreases the influence of the average travel speed by 0.0022 V_i. The significant random coefficients of heating time and air conditioning time indicate that thermal comfort habits vary significantly among drivers. The signs and magnitudes of the gradient coefficients in the four models are similar, and the coefficients are highly significant. As the grade angle rises, energy consumption/generation increases linearly, with a sharp increase at a grade of 8%.
The significant standard deviations of the random coefficients related to the link lengths at different grade angles, the average travel speed, and travel distance in Models 2 and 4 reflect significant heterogeneity in driving characteristics among the drivers. Trips with the same link length but different grade angles or average travel speeds showed different energy consumption characteristics because of differences in driving behaviour. By accounting for this unobserved heterogeneity, Models 2 and 4 show better fitness ratings compared with Models 1 and 3, respectively. The high fitness ratings and the very small MSE of the four models confirm that the proposed model can effectively estimate the energy consumption of EVs based on sparse GPS observations, regardless of whether the average energy consumption per kilometre or the energy consumption for an entire trip is used as the dependent variable.
For the energy consumption of an entire trip, all investigated factors directly influence the energy consumption.By contrast, for the energy consumption per kilometre, the investigated factors (such as the service time of the air conditioner and the link length percentage corresponding to each gradient interval) are the spatially averaged values of those for an entire trip.Thus, the spatial averaging process results in a relative reduction in the extent of correlation between the energy consumption and the investigated factors.

Conclusions
To improve the accuracy of energy consumption estimation based on sparse GPS observations, an electricity consumption model was proposed based on the kinetic characteristics and unique attributes of EVs. The average energy consumption per kilometre and the energy consumption for an entire trip were each treated as the dependent variable in separate analyses. Sparse GPS observations of 68 EVs in Aichi Prefecture, Japan were collected, and two types of regression models were used for model calibration. The present work is novel because (1) a relatively comprehensive model for energy consumption estimation is proposed considering work opposing the rolling resistance, aerodynamic friction losses, energy consumption/generation depending on the grade of the route, auxiliary load consumption, and additional energy losses arising from the unstable power output of EVs and (2) two different dependent variables are examined to better understand the mechanism of estimation error. The results reveal that the proposed method demonstrates great potential for application in using sparse GPS observations to predict the energy consumption of EVs.
Among the results for the four proposed models, the differences in MSE between the first two models and the second two models reflect the disadvantages of sparse observations for estimating the average energy consumption per kilometre. Due to the spatial averaging process, the extent of the correlation between the energy consumption and the investigated factors is reduced.
By contrast, Models 2 and 4, in which the heterogeneity between drivers is considered, no matter whether the average energy consumption per kilometre or the energy consumption for an entire trip is taken as the dependent variable, show very high fitness values (of more than 96%) and highly significant parameter estimations, implying appropriate model specification and a high potential for application in using sparse GPS observations to predict the energy consumption of EVs.

Table 1. Descriptive statistics of the variables.

Table 2. Parameter estimation results for the four models.