Eco-Driving and Its Impacts on Fuel Efficiency: An Overview of Technologies and Data-Driven Methods

Abstract: Eco-driving is a multidimensional concept that includes driving behavior, route selection and all other choices or behaviors related to a vehicle's fuel consumption (e.g., the use of quality fuel, the use of air conditioning, driving at peak hours, etc.). The scope of this paper is to present an overview of recent literature on eco-driving and on the models developed for calculating fuel consumption, as well as the most important factors affecting it. Recent literature contains a large number of models that estimate fuel consumption based on naturalistic driving data, which are collected using smartphones and OBDs. In this work, the existing literature is critically assessed in relation to conceptual, methodological and data-related aspects. The analyses result in a set of limitations and challenges that are further discussed in the framework of system-wide implementations for deriving policies that increase drivers' awareness and also improve system performance.


Introduction
Alleviating human-driven climate change and reducing environmental pollution, as well as the heavy dependence on non-renewable resources for energy production, are among the most important challenges prioritized by both the United Nations sustainability goals and the European Union Green Deal [1,2]. In general, the transport sector is responsible for the highest volume of greenhouse gases, estimated at about 30% of man-made emissions [3], having increased by 22% since 1990 [4]. The transport sector consumes about 20-25% of the total energy produced [2,5,6], 65-75% of which is related to road transport [2,6,7].
The amount of harmful emissions is related to fuel consumption efficiency; thus, studying the fuel consumption of transportation systems, and especially road transportation, together with ways to reduce it, is vital. In previous years, many attempts have been made to decrease harmful emissions and increase fuel efficiency, such as the enforcement of stricter standards concerning the vehicles' engines (Euro V and VI), new generation motors (electric and hybrid), and alternative and better-quality fuels (high-octane and biofuel) [8]. Moreover, environmental issues as well as the high price of fuel have shifted drivers' interest towards reducing fuel consumption, which is also reflected in their choices when buying a new vehicle [9].
Although the use of alternative fuel and electric vehicles together with a turn to renewable energy sources is a possible solution to the above-mentioned environmental issues in the long term, reducing emissions of the current vehicle fleet as much as possible would be a promising alternative with short and medium term benefits [6]. Because of the latter, research interest on ecological driving (eco-driving) and the effect of driving behavior on fuel consumption has recently increased.
Although the relationship between fuel consumption and driving behavior has been studied since the beginning of automotive history, it still remains a popular topic because of the environmental and financial impacts of fuel consumption. For example, transportation and logistics companies are always seeking new opportunities for fuel savings in order to reduce operational costs [10].
In this paper, a survey of the literature concerning the effect of driving behavior on fuel consumption is conducted. The variables most often used to define driving behavior are the acceleration and the speed of the vehicle, while habits such as idling and harsh braking and acceleration are also significant. More specifically, the aim of the paper is to investigate the driving behavior factors that most affect fuel consumption, as well as modeling techniques that accurately estimate fuel consumption based on driving behavior. The results of the analyses are further discussed in the framework of challenges and limitations for developing large-scale eco-driving policies and raising the awareness of users on the impacts of eco-driving.
The rest of the paper is organized as follows: In Section 2 the notion of eco-driving is presented in detail and in Section 3 ways of observing eco-driving and collecting driving behavior data are discussed. In Section 4 state-of-the-art models for predicting fuel consumption are presented and, in Section 5, the most important emerging factors related to driving behavior are highlighted. Section 6 includes a thorough discussion about the promises and caveats of eco-driving and of modeling driving behavior. In Section 7 the main conclusions are presented.

Defining Eco-Driving
Eco-driving is the adoption of a driving behavior (or a driving style) that aims at saving fuel and reducing harmful emissions of greenhouse gases (GHG) [11]. In general, it refers to the adjustment of the vehicle's moving speed (in relation to traffic conditions) and the choice of routes that minimize fuel consumption [12,13]. Therefore, eco-driving can be seen as a set of choices and behaviors adopted by drivers that are connected with an energy-efficient way of using a vehicle. The increasing research interest in eco-driving implies that, although the efficiency of vehicle engines has improved due to recent technological achievements and the integration of new fuel types, drivers have not improved their behavior accordingly. However, it is expected that, with proper information and training of the drivers, eco-driving would contribute to decreasing fuel consumption and harmful GHG emissions [14].
Recent research has revealed that eco-driving is capable of reducing fuel consumption by an amount ranging from 15% to 25% and GHG emissions by at least 30% [8,10,15,16]. In contrast, the total fuel savings achieved by engines and vehicles of the latest technology is estimated at about 10-12%, which is significantly lower [8]. According to other researchers, an overall ecological behavior, including the purchase of the vehicle, as well as its usage, and ecological decision-making concerning mobility, are considered to lead to fuel consumption reduction of the order of 40-45% [17].
More formally, eco-driving refers to the adoption of a driving behavior that maximizes the efficiency of the vehicle's engine [18]. It should be made clear that the notion of eco-driving is multidimensional: it integrates driving behavior as well as all decisions directly or indirectly related to fuel saving and GHG emissions reduction, e.g., the choice of vehicle. An ecological driving behavior, i.e., accelerating smoothly and maintaining a constant speed, plays a very important role [19], but it has to be combined with other actions, like the prudent use of air conditioning, in order to lead to maximum fuel saving [9]. All the pieces that constitute the notion of eco-driving are summarized in Figure 1 below. In this paper we focus on driving behavior, without underestimating the effect of other parameters.

Observing Eco-Driving
In order to study and model driving behavior and its effect on fuel consumption, it is essential to observe it in real-world conditions. The most efficient way to collect such data is through naturalistic driving experiments. In this kind of experiment, the participants simply drive according to their daily needs and habits, revealing their actual driving behavior under the prevailing traffic conditions. Concurrently, a device is used to record the most important variables connected to driving behavior, e.g., speed and acceleration [20], as well as various other information. This method ensures the validity and representativeness of the data collected.
On-board diagnostic scanners (OBDs) are the devices most commonly used to record the above information. OBDs are not integrated with most cars, but are sold separately and are connected to the corresponding port of the vehicle's engine control unit. The data that can be collected using such a device include, among others, fuel consumption, GHG emissions, moving speed, acceleration and brake usage, and the time resolution is usually one second [21]. The majority of OBD devices are also equipped with a GPS sensor and thus can also record the exact position of the vehicle [22].
Furthermore, modern smartphones are equipped with various sensors, including an accelerometer and GPS, and have enhanced computing capabilities [23]. Taking into account that the vast majority of drivers are smartphone users, as well as the rapid development of telecommunication networks, it is possible to collect a vast amount of data at a relatively low cost. Indeed, compared to an OBD, a smartphone is able to collect almost all the data required, except for fuel consumption and GHG emissions [24]. It is also a way to collect data on a larger scale (from more drivers) when the required number of OBDs is not available [19].
The combination of smartphones and OBDs offers new possibilities and enables many intelligent transportation systems applications [25], including harsh event detection, safety monitoring and driving behavior evaluation [26], the implementation of novel motor insurance schemes [27], etc. The two devices may act complementarily to create a rich database with many features and variables, related to the exact position of the vehicle, its speed and acceleration, fuel consumption, etc. [14].

The above methods of observing driving behavior have already been used in various studies related to fuel consumption [9,20]. Usually, a smartphone application is also developed and installed on the participants' smartphones, which allows data sensing and storage and, most importantly, communication between the participants and the data collectors. Furthermore, through the application, the users are informed about their performance. In fact, as research shows, by gaining knowledge of the impact of their actions on fuel consumption, drivers are more likely to adopt more environmentally friendly practices [15,28].
Although the above-mentioned data collection method offers many advantages, there is also a major drawback: it is not possible to collect external information, such as weather and traffic conditions and road geometry, which significantly affect fuel consumption and driving behavior [22]. The lack of this kind of information makes the modeling process more challenging and complicated. For example, drivers move slower due to heavy congestion or rainy weather conditions. By observing their driving behavior without being aware of the conditions, it can be wrongly interpreted that they drive at low speed in general. Additionally, moving with an average speed of 60 km/h on a road with zero inclination is totally different, in terms of fuel consumption, from moving on a road with high inclination.
Recent studies have shown that driving behavior differs between people of different age, gender, and social and cultural background. For example, younger drivers' behavior is more likely to be influenced by emotions such as anger and anxiety while driving, which increases aggressive and unsafe driving behavior and may increase the risk of an accident [29]. Similarly, men are found to be more skilled but also more unsafe and risky drivers [30]. Moreover, variation in driving behavior has also been observed across countries and regions. Specifically, in [31], Ozkan et al. compared the answers to a questionnaire survey on driving behavior between respondents from six countries (Finland, Great Britain, Greece, Iran, The Netherlands, and Turkey). It was revealed that traffic culture does indeed differ between the Western/Northern European countries and Southern Europe, Turkey and Iran, with the former being considered safer. Specifically, drivers from the former countries may only commit minor violations of traffic laws, such as speeding on motorways, while drivers from the other countries more often commit serious ones and are more likely to drive in a risky and aggressive manner. Furthermore, drivers from "unsafe" countries are also less aware of their behavior.

Basic Fuel Consumption Estimation Models
A variety of models have been proposed for predicting fuel consumption, each utilizing different types of data and having a different mathematical form [32]. In recent years, developed models have been able to calculate the amount of fuel consumed on a route with an error of less than 10% [33], which is considered quite satisfactory. In an attempt to systematically organize and summarize the models that have been proposed so far and are most common in the literature, we classify them based on the different criteria presented below.

Physics-Based vs. Data-Driven Models
Studying a vehicle's movement from the perspective of the science of Physics, the energy produced by the vehicle's engine (by consuming fuel), $E_{tot}$, is equal to the sum of the work of the opposing forces plus the energy required to accelerate the vehicle [14]. More specifically:
$$E_{tot} = E_{air} + E_{roll} + E_{g} + E_{acc} + E_{id}$$
where $E_{air}$ is the work of the aerodynamic resistance, $E_{roll}$ the work of rolling resistance (friction), $E_{g}$ the work of the vehicle's weight (positive if the vehicle is moving uphill and negative if it is moving downhill), $E_{acc}$ the energy required for accelerating the vehicle and $E_{id}$ the energy required to keep the engine running, even when the vehicle is at stop.
The equations for calculating the above quantities are very complex and require the determination of a large number of parameters, which is a difficult task. In an attempt to simplify the calculation, it can be assumed that the engine exerts on the vehicle (through the torque exerted on the wheels) a force, named traction force, in order to overcome the resistance forces exerted on the vehicle and allow the vehicle to accelerate [34]. The forces that are exerted on the vehicle, in addition to the traction force, are the rolling resistance, aerodynamic resistance and its weight. The equilibrium of the forces at any time is positive if the vehicle accelerates and negative if it is decelerating. The equation for the calculation of the traction force is:
$$F_{tr} = m \frac{dv}{dt} + m g \sin\theta + F_{aero} + F_{RR}$$
where $F_{tr}$ is the traction force, $m$ is the vehicle's mass, $dv/dt$ its instantaneous acceleration, $g$ is the acceleration of gravity, $\theta$ the slope of the street, $F_{aero}$ is the aerodynamic resistance and $F_{RR}$ is the rolling resistance (Figure 2).
As it is implied by the equations presented earlier, in order to make calculations using them, knowledge of the instantaneous values of speed, inclination and other parameters during the whole trip (per second) is necessary. The above, in combination with the large number of required parameters, makes it very difficult to use the equations in real problems. However, they have a very strong scientific foundation and may be employed in some cases.
Different physics-based fuel estimation models have been developed in literature. Among the most prevailing is the Vehicle Specific Power (VSP) [35], which allows the calculation of both fuel consumption and vehicle emissions. It includes the detailed per second calculation of the forces exerted on the vehicle, i.e., the aerodynamic resistance, the rolling resistance, the weight and the acceleration of the vehicle. The model has been used with various simplifications of the calculation relationships. Moreover, the Comprehensive Modal Emission Model (CMEM) developed by [36] is a fuel consumption model calibrated using microscopic data. Similar to VSP, it relies on calculating smaller amounts of energy that, when added together, give the total energy consumed. It also requires a large amount of data per second of travel, such as speed. It has been used by a large number of researchers since then [3,37]. Finally, the EMIssions from Traffic (EMIT) was developed for calculating instantaneous fuel consumption and vehicle emissions based on values per second of parameters such as speed and acceleration (Cappiello et al., 2002).
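As a rough illustration of the force balance described in this subsection, the following sketch sums the inertial term, the weight component along the slope, the aerodynamic resistance and the rolling resistance for each second of a speed trace, and integrates the resulting traction power. The closed-form expressions used for the aerodynamic and rolling resistance, every default coefficient, and the function names are textbook assumptions introduced here; they are not taken from VSP, CMEM or EMIT, and idle energy and engine efficiency are ignored.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def traction_force(mass_kg, accel_ms2, speed_ms, slope_rad,
                   c_rr=0.01, rho_air=1.2, c_d=0.30, frontal_area_m2=2.2):
    """Traction force in N: inertia + weight component + drag + rolling resistance.

    The drag and rolling-resistance formulas and all default coefficients
    are textbook assumptions, not values taken from the surveyed models.
    """
    f_inertia = mass_kg * accel_ms2
    f_weight = mass_kg * G * math.sin(slope_rad)
    f_aero = 0.5 * rho_air * c_d * frontal_area_m2 * speed_ms ** 2
    # Rolling resistance is only applied while the vehicle is moving.
    f_rr = c_rr * mass_kg * G * math.cos(slope_rad) if speed_ms > 0 else 0.0
    return f_inertia + f_weight + f_aero + f_rr


def traction_energy_joules(speeds_ms, slopes_rad, mass_kg, dt_s=1.0):
    """Integrate positive traction power over a per-second speed trace."""
    energy = 0.0
    for k in range(1, len(speeds_ms)):
        accel = (speeds_ms[k] - speeds_ms[k - 1]) / dt_s
        force = traction_force(mass_kg, accel, speeds_ms[k], slopes_rad[k])
        # Negative force (braking, coasting downhill) is assumed to need no engine work.
        energy += max(force, 0.0) * speeds_ms[k] * dt_s
    return energy


# One minute at a constant 60 km/h, on a flat road and on a 5% uphill grade.
speed = 60 / 3.6
trace = [speed] * 61
flat = traction_energy_joules(trace, [0.0] * 61, mass_kg=1500)
uphill = traction_energy_joules(trace, [math.atan(0.05)] * 61, mass_kg=1500)
print(f"flat: {flat / 1e3:.0f} kJ, uphill: {uphill / 1e3:.0f} kJ")
```

Even in this simplified form, the same average speed yields very different energy demands once the road inclination changes, which is why per-second slope data matter for this family of models.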
Data-driven models, on the other hand, approximate the true value of fuel consumption using statistical methods or Machine Learning techniques. A well-known model in this category is the VT-Micro (Virginia Tech microscopic). Its data requirements are similar to those of the previously presented models, but it differs in that it does not utilize the relationships known from Physics; instead, the calculation algorithm adapts to the data. VT-Micro is developed based on experimental data and its mathematical formulation is the exponential of a polynomial function of up to third degree terms of vehicle speed and acceleration [38]:
$$MOE = \begin{cases} \exp\left(\sum_{i=0}^{3}\sum_{j=0}^{3} L_{i,j}\, s^{i} a^{j}\right), & a \geq 0 \\ \exp\left(\sum_{i=0}^{3}\sum_{j=0}^{3} M_{i,j}\, s^{i} a^{j}\right), & a < 0 \end{cases}$$
where $MOE$ (mL/s) is the instantaneous fuel consumption rate, $L_{i,j}$ and $M_{i,j}$ are the model regression coefficients for the MOE at a speed power "i" and an acceleration power "j" for positive and negative accelerations respectively, $s$ (km/h) is the instantaneous speed and $a$ (km/h/s) is the instantaneous acceleration.
The use of this model, as well as of those mentioned earlier, is very limited for two main reasons: first, road slope, moving speed and acceleration data per second are difficult to obtain, even with today's technology, and, second, the models include a large number of other parameters that need to be determined. For example, VT-Micro includes 32 such parameters [3].
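A minimal sketch of how a VT-Micro style estimate could be evaluated is given below, assuming the commonly cited form in which the rate is the exponential of a double sum over speed and acceleration powers from 0 to 3, with one 4x4 coefficient matrix for non-negative and one for negative accelerations. The coefficient values are made-up placeholders for illustration only, not calibrated values from [38], and the function name is hypothetical.

```python
import math


def vt_micro_rate(speed_kmh, accel_kmh_s, coeffs_pos, coeffs_neg):
    """Return exp(sum_ij K[i][j] * s**i * a**j).

    K is coeffs_pos when the acceleration is non-negative and
    coeffs_neg otherwise; both are 4x4 nested lists.
    """
    coeffs = coeffs_pos if accel_kmh_s >= 0 else coeffs_neg
    exponent = 0.0
    for i in range(4):
        for j in range(4):
            exponent += coeffs[i][j] * speed_kmh ** i * accel_kmh_s ** j
    return math.exp(exponent)


def zero_matrix():
    return [[0.0] * 4 for _ in range(4)]


# Placeholder coefficients (NOT calibrated): only a few terms are non-zero.
L_demo = zero_matrix()
L_demo[0][0], L_demo[1][0], L_demo[0][1] = -0.7, 0.01, 0.1
M_demo = zero_matrix()
M_demo[0][0], M_demo[1][0] = -0.7, 0.005

n_parameters = sum(len(row) for row in L_demo) + sum(len(row) for row in M_demo)
print(n_parameters)  # 16 + 16 coefficients
print(vt_micro_rate(60.0, 1.5, L_demo, M_demo))
print(vt_micro_rate(60.0, -1.5, L_demo, M_demo))
```

The two 4x4 matrices account for the 32 parameters mentioned above, all of which would have to be estimated from measured data before the function returns meaningful values.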
In general, data-driven models do not require such detailed data. Their training process may also include aggregated data from each trip. Their main difference from Physics-based models is that the calculation does not derive from knowledge of the laws of a body's movement and conservation of energy, but from statistical methods or algorithms which, most of the time, do not have physical meaning. Furthermore, they focus mainly on driving behavior, not taking into account factors such as road geometry, traffic conditions and vehicle condition. Consequently, two identical routes can have a big difference in the amount of fuel consumed.

Modeling Scale
Concerns about the availability of per second data, as well as the convenience of the relevant models, lead to a second classification. Based, this time, on the time resolution of the data required, models can be divided into microscopic, mesoscopic and macroscopic [33].
Microscopic models are suitable in the case where trajectory data with the exact position of the vehicles per second are available, as time series of speed, acceleration and other parameters per second are required. However, collecting such data is often difficult on a large scale. VT-Micro, VSP, EMIT and CMEM are all microscopic models.
On the other hand, when using macroscopic models, the calculation of fuel consumption is based on variables such as the average value of speed and acceleration, the type of vehicle, etc. Thus, they do not take into account the heterogeneity in driving behavior, since two routes with the same average speed will have the same fuel consumption. Mainly, data-driven models that deploy aggregated data are classified in the macroscopic category.
Finally, mesoscopic models attempt to combine the advantages of the other two categories: they do not require data per second, but they are not based only on the average values of the parameters. Instead, they use more detailed statistical metrics related to the variation and distribution of each variable. Data collection for mesoscopic models is less difficult than for microscopic ones, and this category is, in general, more suitable for most eco-driving applications.
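To make the distinction between the scales concrete, the sketch below derives a macroscopic input (the average speed only) and a small set of mesoscopic inputs (spread and extremes in addition to the average) from the same per-second speed trace. The feature names and the two toy traces are illustrative assumptions, not variables prescribed by any of the surveyed models.

```python
import statistics


def macroscopic_features(speeds_kmh):
    """Trip described by its average speed only."""
    return {"mean_speed": statistics.fmean(speeds_kmh)}


def mesoscopic_features(speeds_kmh):
    """Trip described by averages plus variation metrics (1 Hz samples assumed)."""
    accels = [b - a for a, b in zip(speeds_kmh, speeds_kmh[1:])]  # km/h per second
    return {
        "mean_speed": statistics.fmean(speeds_kmh),
        "speed_std": statistics.pstdev(speeds_kmh),
        "max_speed": max(speeds_kmh),
        "accel_std": statistics.pstdev(accels),
    }


# Two toy trips with the same average speed but very different dynamics.
steady = [50.0] * 8
stop_and_go = [20.0, 80.0] * 4

print(macroscopic_features(steady), macroscopic_features(stop_and_go))
print(mesoscopic_features(steady))
print(mesoscopic_features(stop_and_go))
```

The macroscopic description cannot tell the two trips apart, whereas the mesoscopic one separates them through the speed and acceleration spread without needing the full per-second series as a model input.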

Model Transparency
Another interesting classification of fuel consumption calculation models has been proposed by Zhou et al. (2016). This classification is based on the degree to which the calculation methodology depends on a large volume of experimental data or on the understanding of mechanical details for the estimation of fuel consumption.
There are three types of models depending on the degree of transparency of the calculations: white, black and gray box. White box models are mainly derived from knowledge of the relevant theory and their mathematical background requires in-depth knowledge of the engine's and subsystems' operation and are largely deterministic.
The opposite of white box models are black box models. Compared to white box models, black box ones lack a theoretical (mathematical/physical) background in their structure and utilize system input and output data and metrics. Gray box models are somewhere between the two approaches and are based on both knowledge of the basics of engine operation and the collection of experimental data and measurements.
More specifically, the white box models are based on the physical and chemical processes that take place in the vehicle engine, using mathematical equations to describe the processes of fuel intake, compression, combustion and evaporation. In black box models, the engine itself is considered a "black box", in the sense that the processes that take place inside it are neither known nor of interest, except for their measurable results. In fact, there are three types of black box models:
• Engine-based models, in which input variables are metrics related to the engine, such as rotational speed (revolutions per minute), engine torque, and power.
• Vehicle-based models, with variables such as instantaneous or average speed and acceleration.
• Models based on operating modes, with input variables such as acceleration, constant speed, deceleration and idling.
Black box models are usually based on statistical and Machine Learning methods, trained using experimental data collected from real-world trip data. Their main disadvantages are that a large amount of data is often required for their development and that they fall short in terms of the results' interpretability. Gray box models, on the other hand, require at least a partial knowledge of the mechanical background, but also require the use of experimental data. According to the previous categorizations, they are microscopic models based on the theory of Physics.
White box models should not be confused with Physics-based models, as they employ equations and data concerning the vehicle's engine rather than its movement and may be microscopic, mesoscopic or macroscopic. The majority of black box models are data-driven and the data exploited can also be of any temporal resolution.
Regarding the field of application of each model, the white box models are limited due to the requirement for the user to know the mechanical characteristics of the vehicle, as well as details of the processes that take place during fuel consumption. Thus, they are mainly applied in laboratory tests of new engines, rather than for the evaluation of eco-driving. In addition, they do not take into account the driving behavior of each individual driver, which, as already mentioned, has a great influence on fuel consumption. The above also applies largely to gray box models. On the other hand, the use of driving behavior data in black box models, in combination with the possibility of transferring them to newer data gives them an advantage in case a large-scale calculation is required.

Behavioral Fuel Consumption Models
Recently, various approaches have been documented in the literature that deal with modeling the relationship between eco-driving and fuel consumption. The methodologies and data utilized, as well as the most important parameters taken into consideration in the modeling process, are summarized in Table 1. Methodologically, in the vast majority of the cases studied, models from the family of linear regression models are used [3,33,39,40]. These models have several advantages, such as their solid mathematical background and their structure, which is transparent and easy to understand and allows good interpretation of the results and of the influence of each factor. On the other hand, they do not achieve high accuracy in estimating fuel consumption compared to more complex models, mainly due to their fundamental assumption that there is a linear relationship between the independent variables and fuel consumption, which does not hold in this case, at least not for all variables [8]. A typical example is the evolution of fuel consumption with average driving speed; contrary to what a linear model would assume, driving at a moderate average speed (50-80 km/h) can be more efficient than driving at a lower or higher speed.
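The limitation of the linearity assumption can be illustrated with a small numerical experiment. In the sketch below, a purely synthetic U-shaped relationship between average speed and consumption, with its minimum placed inside the 50-80 km/h range, is fitted once with a straight line and once with an added squared-speed term. The data, the shape of the curve and all names are assumptions made for illustration; they do not come from any of the surveyed studies.

```python
def fit_least_squares(rows, targets):
    """Ordinary least squares via the normal equations and Gauss-Jordan elimination."""
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * y for r, y in zip(rows, targets))]
         for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def r_squared(rows, targets, beta):
    """Coefficient of determination of the fitted linear combination."""
    mean_y = sum(targets) / len(targets)
    ss_res = sum((y - sum(b * x for b, x in zip(beta, r))) ** 2
                 for r, y in zip(rows, targets))
    ss_tot = sum((y - mean_y) ** 2 for y in targets)
    return 1.0 - ss_res / ss_tot


# Synthetic U-shaped curve (assumption): lowest consumption at 65 km/h.
speeds = [float(v) for v in range(20, 121, 10)]
consumption = [5.0 + 0.002 * (v - 65.0) ** 2 for v in speeds]

linear_rows = [[1.0, v] for v in speeds]
quadratic_rows = [[1.0, v, v * v] for v in speeds]

r2_linear = r_squared(linear_rows, consumption,
                      fit_least_squares(linear_rows, consumption))
r2_quadratic = r_squared(quadratic_rows, consumption,
                         fit_least_squares(quadratic_rows, consumption))
print(f"R^2 linear: {r2_linear:.2f}, R^2 with squared term: {r2_quadratic:.2f}")
```

On such data a straight line explains only a small share of the variance, while a model that is allowed a non-linear term fits the curve closely. This is the kind of gap that motivates non-linear approaches.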
Moreover, when developing linear models, it is a required condition that there is no linear correlation between the independent variables. Two frequently used and strongly correlated variables are gear level and speed, for example. In this case, only one of the two variables should be used; otherwise the influence of each on fuel consumption will not be calculated correctly [39].
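A simple screening step for this condition is to compute pairwise correlation coefficients between the candidate variables before fitting a linear model. The sketch below does this for a toy set of observations; the values, the variable names and the 0.8 threshold are assumptions chosen for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


def correlated_pairs(features, threshold=0.8):
    """List variable pairs whose absolute correlation reaches the threshold."""
    names = list(features)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(features[first], features[second])
            if abs(r) >= threshold:
                flagged.append((first, second, round(r, 2)))
    return flagged


# Toy observations (made up): gear follows speed closely, acceleration does not.
trip_features = {
    "speed_kmh": [10, 25, 40, 55, 70, 90, 110],
    "gear": [1, 2, 3, 4, 5, 5, 6],
    "accel_ms2": [0.5, -0.3, 0.8, -0.6, 0.1, -0.2, 0.4],
}

for first, second, r in correlated_pairs(trip_features):
    print(f"{first} and {second} are strongly correlated (r = {r}); keep only one")
```

A pairwise check of this kind only detects correlation between two variables at a time, so it is a first filter rather than a complete test for multicollinearity.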
The above issues favor the use of Machine Learning models, which do not suffer from such limitations and are efficient in function approximation tasks, regardless of the functional form (linear or not) [40]. As in many other areas, the popularity of these models in estimating fuel consumption has greatly increased in recent years. Random Forests [41,42] and Neural Networks [3,33] have been exploited the most, while some of the most sophisticated approaches include LSTM networks and other Deep Learning architectures [42]. In most of the cases referred to, linear models are utilized as a baseline for evaluating more complex ones. Comparisons show that linear models are clearly outperformed by Machine Learning models in almost all cases.
The recent popularity of Machine Learning approaches has been facilitated by the use of high-resolution driving and fuel consumption data coming from in-vehicle technologies, OBD devices and smartphones [19]. Additionally, one study utilized synthetic simulation data for the development of a fuel consumption model [33].
More specifically, there are four remarkable studies that deploy exclusively linear regression models for the estimation of fuel consumption. Yao , who developed a deep learning framework (LSTM) and also used K-Means clustering to, firstly, separate drivers into different profiles and then estimate fuel consumption. This work also focuses mainly on driving behavior and achieves an overall accuracy higher than 80%.
In Table 1, the most significant works are summarized; methodologies and data collection methods (if estimation models are included) are presented, as well as the most important factors referred to in each work.
One can observe that moving speed is the parameter that is most often included in the calculation of fuel consumption. The same applies to acceleration. Therefore, they can be considered the most important parameters. Inclination and other geometric characteristics of the road, as well as the weather conditions, are the next most common parameters. At the same time, many models use variables related to the engine's operation, which are collected by OBDs. It is quite interesting that idling time has been used as a variable in only a few of the studies presented, although it is one of the most frequently reported inefficient behaviors and can easily be estimated from GPS trajectory data. Finally, the number of stops and the length of the route, although they are also easily measurable parameters, are not used by many researchers.
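As an indication of how little processing these rarely used variables require, the sketch below derives idling time, number of stops and route length from a per-second speed series such as the one obtained from GPS trajectory data. The 1 km/h threshold, the 1 s sampling interval and the function names are assumptions; speed alone also cannot tell whether the engine is running while the vehicle is stationary, so the result is an upper bound on true idling.

```python
def idle_time_and_stops(speeds_kmh, idle_threshold_kmh=1.0, dt_s=1.0):
    """Return (seconds spent below the threshold, number of stationary episodes).

    A trace that starts stationary counts that first episode as a stop.
    """
    idle_time = 0.0
    stops = 0
    was_idle = False
    for v in speeds_kmh:
        is_idle = v < idle_threshold_kmh
        if is_idle:
            idle_time += dt_s
            if not was_idle:
                stops += 1  # a new stationary episode begins
        was_idle = is_idle
    return idle_time, stops


def route_length_km(speeds_kmh, dt_s=1.0):
    """Approximate distance by summing speed times the sampling interval."""
    return sum(v / 3600.0 * dt_s for v in speeds_kmh)


trace = [0.0, 0.0, 10.0, 20.0, 0.0, 0.0, 0.0, 15.0]
print(idle_time_and_stops(trace))  # (5.0, 2)
print(route_length_km(trace))
```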

Eco Driving Effects on Fuel Consumption
Fuel consumption can be affected by a variety of factors that are usually categorized as: i. Strategic factors, concerning the vehicle, ii. Tactical factors, concerning the choice of route and iii. Operational factors, concerning the driver [12]. The vehicle's type, the fuel it uses (gasoline, diesel, electric etc.), its weight and its proper maintenance are some examples of strategic factors that affect fuel consumption. The tactical factors usually mentioned can be separated into two categories: those concerning the traffic conditions (e.g., level of congestion, driving during peak hours) and those concerning road geometry (e.g., inclination, road type). Finally, the operational factors are related to the driver's actions and decisions. The most important is the driving behavior, and more specifically the vehicle's speed and acceleration, the gear selected and the duration of idling. Other operational factors, such as the use of air conditioning and entertainment systems, have a less significant effect. A detailed overview of the factors, as reported in some of the most important recent works, is given in Table 1 above.
Although driving behavior is one of the most important factors influencing fuel consumption, its influence is often ignored [2]. For example, models based solely on external factors (weather and traffic conditions, etc.) or solely on the mechanical characteristics of the vehicle for calculating fuel consumption do not take into account the influence of the individual characteristics of different drivers. As mentioned earlier, however, adopting ecological driving behavior can reduce fuel consumption by about 25% [40].
The variables used to define driving style are related to speed and acceleration or deceleration. According to the most common approach, the driving profile is characterized by examining whether and how often a driver accelerates or decelerates harshly and whether the driving speed is far below, near or even above the corresponding speed limit. Specifically, the above quantities are summarized for each route using statistical metrics, such as mean, variance, maximum and minimum, and frequency (for acceleration and deceleration), and are considered adequate to describe a driver's driving style [20].
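The route-level summary described above can be sketched in a few lines. The following is only an illustrative example, not a procedure taken from the reviewed studies: the function name `trip_features`, the 1 Hz sampling interval and the harsh-event threshold of 2.5 m/s^2 are assumptions made here for demonstration.

```python
import numpy as np

def count_events(mask):
    """Count runs of consecutive True values (one run = one event)."""
    mask = np.asarray(mask, dtype=bool)
    if mask.size == 0:
        return 0
    # An event starts at index 0 if mask[0] is True, or wherever False -> True.
    onsets = np.count_nonzero(mask[1:] & ~mask[:-1])
    return int(onsets + int(mask[0]))

def trip_features(speed_kmh, dt=1.0, harsh_threshold=2.5):
    """Summarize a speed trace (km/h, one sample every dt seconds).

    harsh_threshold is in m/s^2 and is an illustrative assumption.
    """
    s = np.asarray(speed_kmh, dtype=float)
    a = np.diff(s / 3.6) / dt  # acceleration in m/s^2
    return {
        "speed_mean_kmh": float(s.mean()),
        "speed_var_kmh2": float(s.var()),
        "speed_max_kmh": float(s.max()),
        "speed_min_kmh": float(s.min()),
        "acc_mean_ms2": float(a.mean()),
        "acc_var_ms4": float(a.var()),
        "harsh_acc_events": count_events(a > harsh_threshold),
        "harsh_dec_events": count_events(a < -harsh_threshold),
    }

# Usage with a short, made-up trace
features = trip_features([0, 0, 18, 36, 36, 36, 18, 18])
print(features)
```

Consecutive samples above the threshold are counted as a single harsh event, so the frequency metric reflects maneuvers rather than seconds.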
Regarding driving-related factors, most studies exploit the average value or other statistical metrics of speed in calculating fuel consumption (mesoscopic and macroscopic models), while significantly fewer use the time series of speed, i.e., its instantaneous value per second (microscopic models) [49]. Regardless of the approach, speed is considered the most important factor in calculating fuel consumption. It is in fact one of the variables that, along with acceleration, are used to determine driving style [8]. Maintaining a speed that is as constant as traffic conditions allow is considered the best practice, because it minimizes accelerations and, consequently, the additional kinetic energy that the vehicle's engine has to produce. In addition, the level of speed also affects fuel consumption: each engine has an optimal speed value, at which it operates with optimal efficiency. Fuel consumption at low speeds is higher due to increased heat loss, then approaches optimal efficiency, while, at very high speeds, it increases again due to friction losses in the engine [2]. Consequently, the fuel consumption versus speed curve has a "U" shape. The optimum speed is between 50 and 80 km/h [50].
Acceleration is the variable that primarily describes the driving profile. An aggressive driver is characterized by harsh accelerations and decelerations, while the opposite is true for the careful or cautious driver, who is characterized by smooth changes in speed. The main goal of eco-driving is the transition of drivers from an aggressive to a cautious profile [2], since smoother driving allows fuel savings [16]. Firstly, frequent decelerations also lead to frequent accelerations and, thus, more fuel is consumed to increase kinetic energy and restore the vehicle's speed. In addition, when accelerating, the vehicle's engine operates at higher speed (more revolutions per minute), which also increases fuel consumption, especially when accelerating harshly. Smooth acceleration is the most important parameter for reducing fuel consumption and contributes to savings of up to 40% [2].
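The kinetic energy argument can be made concrete with the standard relation for a change in speed. As a rough sketch, assuming a hypothetical vehicle mass of m = 1500 kg (not a value from the reviewed studies) and ignoring all losses, restoring the speed from v_1 = 30 km/h to v_2 = 50 km/h after a braking event requires:

```latex
\Delta E = \tfrac{1}{2}\, m \left(v_2^2 - v_1^2\right)
         = \tfrac{1}{2}\,(1500\ \mathrm{kg})
           \left[(13.89\ \mathrm{m/s})^2 - (8.33\ \mathrm{m/s})^2\right]
         \approx 93\ \mathrm{kJ}
```

In a conventional vehicle without energy recovery, each deceleration that is later undone adds at least this amount to the mechanical work demanded from the engine, and more once engine and drivetrain losses are counted.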
Empirical findings have also identified idling as a contributing factor to fuel consumption [2]. It is estimated that more than 22 billion liters of fuel are consumed because of idling in the United States [51]. Drivers are advised to switch off their vehicle's engine if it is to remain stopped for more than half a minute, for example at a traffic light, as the fuel required to restart the engine is less than that consumed while the engine keeps running. The savings achieved when idling is avoided in this way reach up to 20%, in combination with other factors [8].
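Since idling time and the number of stops are easily derived from a speed trace, a minimal sketch of such a computation is given below. The function name `idling_summary`, the 1 km/h standstill threshold and the 1 Hz sampling are assumptions made for illustration; the 30 s limit mirrors the half-minute advice above. Note that a stationary vehicle is only a proxy for idling, since speed data alone cannot show whether the engine was actually running.

```python
import numpy as np

def idling_summary(speed_kmh, dt=1.0, stop_speed_kmh=1.0, long_stop_s=30.0):
    """Detect standstill periods in a speed trace sampled every dt seconds."""
    s = np.asarray(speed_kmh, dtype=float)
    stopped = s <= stop_speed_kmh
    # Pad with False so every run of stopped samples has a start and an end.
    padded = np.concatenate(([False], stopped, [False])).astype(int)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    durations = (ends - starts) * dt
    long_mask = durations > long_stop_s
    return {
        "n_stops": int(durations.size),
        "idle_time_s": float(durations.sum()),
        "long_stops": int(long_mask.sum()),
        "long_stop_time_s": float(durations[long_mask].sum()),
    }

# Usage with a made-up trace: a 40 s stop and a 10 s stop
trace = [30] * 5 + [0] * 40 + [20] * 5 + [0] * 10 + [25] * 3
print(idling_summary(trace))
```

The `long_stop_time_s` value approximates the idling time that could have been avoided by switching the engine off.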

The Promises and Some Caveats
From the in-depth literature review conducted for the purpose of this paper, it became clear that eco-driving includes many factors that have to be taken into account, whose relation with fuel consumption is multidimensional and more complex than one may expect.
Furthermore, the necessity of an integrated framework for collecting driving behavior data, processing them, developing models and utilizing them in real-world problems has been highlighted and described throughout the sections of the present paper and is illustrated in Figure 3. The steps presented should be treated as an integrated process and researchers should consider following them. However, some of them are often not taken into account, leading to approaches that may miss various important features. The above steps are even more important, because the data may be used for the implementation of evidence-based policy making. Therefore, it is deemed necessary to focus on those significant challenges that may arise in future real-world implementations of behavioral fuel estimation models discussed in the following sections.

Data Collection
Naturalistic driving experiments are considered the most reliable way of collecting driving behavior and fuel consumption data. However, the quality and accuracy of such data are not always guaranteed, due to errors and limitations of the devices used (smartphones and OBDs). Furthermore, if each driver uses a different device (i.e., his/her own smartphone), the above errors become unpredictable, because they are not systematic, and thus cannot be easily corrected. Having all participants use devices whose accuracy has been validated would be a possible solution to this problem.
A second issue that was evident while studying the relevant literature was the relatively low availability of data, as, in most experiments, no more than 50 or 100 drivers participated. The main reasons behind this are the cost of such experiments (supply of the devices, potential subscriptions required, monitoring of the experiment etc.) and the fact that most drivers are reluctant to be monitored, even if they are offered monetary rewards. The small sample size raises a data representativeness issue as well, as the driving behaviors of so few drivers, and the fuel consumption models developed from them, can barely be assumed to reflect the entire population. The above should be seriously considered in future naturalistic driving experiments.

Modeling Efficiency
A reasonable direction for researchers today is the use of Machine Learning models, which have lately become widespread in all research areas. The use of such models has indeed increased the accuracy in estimating fuel consumption, and increasingly advanced Deep Learning models are being exploited in recent works. However, these models, with a few exceptions (e.g., models based on Decision Trees), do not provide any interpretation of the influence of each variable on fuel consumption. These approaches are often called "black box" approaches and are not well suited for driving behavior and fuel consumption applications because, among other things, researchers cannot identify which practices are correct and therefore cannot provide recommendations to users to improve their behavior.
To overcome this problem, while allowing researchers to continue to use models with high prediction accuracy, techniques of interpretable Machine Learning should be additionally applied. These techniques include concepts such as feature permutation and the calculation of the partial derivatives of the dependent variable, and bring some of the advantages of statistical models to Machine Learning [52,53]. More specifically, the permutation importance of a feature is estimated as the decrease in the model's accuracy when the values of that feature are randomly shuffled and, therefore, is a metric of the feature's importance for estimating the dependent variable accurately. The partial derivative of each feature is a function that estimates the dependent variable based on the value of the feature, while all other features are held at their mean values. It is estimated using a simplified mathematical formula of derivation and is applicable to both continuous and categorical variables. The partial derivative indicates the sign and magnitude of the effect of each independent variable on the dependent one.
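Both ideas can be illustrated with a short model-agnostic sketch. Everything in the example is an assumption made for demonstration: the data are synthetic (a U-shaped dependence on speed with a minimum at 65 km/h, an effect of the absolute acceleration and one irrelevant feature), the fitted model is a simple least-squares stand-in for any Machine Learning regressor, and the function names are hypothetical. Accuracy loss is measured here as the increase in mean squared error.

```python
import numpy as np

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when one column of X is randomly shuffled."""
    rng = np.random.default_rng(seed)
    base = np.mean((predict(X) - y) ** 2)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            scores[j] += np.mean((predict(Xp) - y) ** 2) - base
    return scores / n_repeats

def partial_effect(predict, X, j, grid):
    """Prediction as feature j sweeps over grid, other features at their means."""
    rows = np.tile(X.mean(axis=0), (len(grid), 1))
    rows[:, j] = grid
    return predict(rows)

# Synthetic data (purely illustrative, not measurements)
rng = np.random.default_rng(42)
n = 500
speed = rng.uniform(20, 120, n)   # km/h
accel = rng.normal(0, 1, n)       # m/s^2
irrelevant = rng.normal(0, 1, n)  # feature with no effect on the target
X = np.column_stack([speed, accel, irrelevant])
y = 0.002 * (speed - 65) ** 2 + 1.5 * np.abs(accel) + 5 + rng.normal(0, 0.2, n)

def design(X):
    """Basis functions of the stand-in model."""
    return np.column_stack(
        [np.ones(len(X)), X[:, 0], X[:, 0] ** 2, np.abs(X[:, 1]), X[:, 2]]
    )

coef, *_ = np.linalg.lstsq(design(X), y, rcond=None)

def predict(X):
    return design(X) @ coef

importance = permutation_importance(predict, X, y)
curve = partial_effect(predict, X, 0, np.array([30.0, 65.0, 100.0]))
print("importance (speed, accel, irrelevant):", importance)
print("predicted consumption at 30, 65, 100 km/h:", curve)
```

In this setting the irrelevant feature receives an importance close to zero, while the sweep over speed recovers the U-shaped relation built into the synthetic data. Because both functions only call `predict`, they can be applied unchanged to a Random Forest or a Neural Network.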
Moreover, the comparative discussion of the most widely used models for estimating fuel consumption highlighted that such models are adequately accurate. However, in order to improve model efficiency, and based on the nature of the problem, researchers should focus on feature engineering techniques. Feature engineering is the process of transforming raw data into variables that better represent the underlying problem, and it is particularly useful in cases where data collection is more difficult [54]. Feature engineering in driving behavior modeling can transform speed and acceleration patterns into high-level features that can be used to extract interesting information about the drivers, separate them into different profiles etc. [55].
Complementary to feature engineering, feature selection is another process that can improve model efficiency. Feature selection refers to the process of choosing the most relevant variables among the set of available variables and may be performed through feature importance or dimensionality reduction techniques. Feature selection is a very significant procedure, since too many features may lead to overfitting, while too few features may lead to underfitting [56].

Modeling Completeness vs. Usefulness
Although driving data have been systematically used to detect eco-driving behavior and link it to fuel consumption, in practice, this approach is not enough if one aims at a reliable and complete mathematical representation of fuel consumption. Evidently, behavior can explain only a small part of fuel consumption. The most influential factors relate to the type of vehicle and to geometric and traffic conditions. Vehicles with larger engine displacement, for example, consume more fuel than vehicles with smaller displacement, even when both follow the same route at the same speeds and with the same driving behavior [8]. Therefore, the appropriate information should also be available to develop a reliable model. The same is true for newer vehicles and for vehicles whose driver chooses to use higher quality fuel (higher octane) [47]. Another beneficial practice is the good maintenance of the vehicle's systems (engine, tires, filters, etc.), which ensures their most efficient operation throughout their use.
In addition, weather conditions often have a significant effect on fuel consumption for a variety of reasons. First, extreme temperatures (either low or high) affect the operation of the engine. Secondly, strong winds increase the aerodynamic resistance exerted on the vehicle and, thirdly, phenomena such as rain, fog and snowfall affect the driver's behavior, preventing the application of optimal practices in terms of acceleration and speed [8].
Furthermore, a significant number of works refer to the use of air conditioning, as well as other electrical systems such as lights, entertainment systems, etc., as parameters that increase fuel consumption [2]. Finally, the weight carried is another variable, the minimization of which enhances fuel efficiency [47].
Moreover, there are several other factors that are less commonly reported and have a minor but non-negligible impact on fuel consumption. For instance, research has shown that monitoring fuel consumption through a system installed in the vehicle or a smart application pushes drivers towards a generally more ecological behavior [41,47].
The models presented in the literature are often the result of a tradeoff between completeness on the one hand and data availability and usefulness on the other, as not all of the above data can be collected from naturalistic driving experiments (e.g., external data concerning traffic and weather conditions). Furthermore, even if they were available during the model's development, their use would lead to a complete model, but would also make it impossible to use the model in other research and applications, where the collection of such detailed data would not be possible.
The application in which the model is going to be used determines, of course, how detailed the required data are. If it is an application that evaluates the extent to which a driver drives ecologically and suggests actions to improve efficiency, then the data related to driving behavior are sufficient. If the accuracy of the calculations is important, for example to calculate the cost of trips, then additional data are necessary.

Managing Driving Behavior and Eco-Routing
Optimizing driving performance by addressing personalized aspects of driving behavior, without posing unrealistic restrictions on personal mobility, may have far-reaching implications for traffic safety, flow operations and the environment, as well as significant benefits for users. The above is supported by various research attempts, including Machine Learning methods for driving style identification [26,43,57] and even more advanced Deep Learning and Reinforcement Learning implementations for controlling and improving driving style, using data collected through smartphones [58]. Therefore, observing, studying and modeling eco-driving is not just a theoretical research topic, but should be applied and exploited in real-world applications.
Ecological routing is a complementary concept which affects numerous factors that are related to fuel consumption, such as idling, number of stops and speed. It refers to the choice of route that, for a particular trip, leads to minimal fuel consumption [8]. This choice is largely related to the geometry and the slope of the road (routes with a smaller slope are selected), but, most importantly, to the expected traffic conditions that the vehicle will encounter, the existence of traffic lights on the route, etc. Congestion and traffic lights force the driver to make many stops and increase idling, which dramatically increases fuel consumption. In addition, these factors do not allow the driver to drive at the optimum speed, but only at a lower speed, further reducing fuel economy, as explained earlier.
The broader concept of eco-routing also includes other decisions, beyond the choice of route, such as the choice of departure time. Drivers should choose to travel during off-peak hours, if possible, to avoid congestion [47]. Eco-routing can contribute to reducing fuel consumption by 10 to 25%, according to recent literature [2,6].

Conclusions
In this paper, an extensive review and analysis of the literature on ecological driving and driving-related fuel consumption was carried out. Eco-driving is a broad concept that primarily refers to the adoption of a driving style aimed at minimizing fuel consumption and GHG emissions, but also includes other strategies and behaviors related to the above goal. The review emphasized the importance of novel ICT systems for the collection of detailed data which can be employed to quantify and understand the effects of driving style on fuel consumption and, thus, the power of eco-driving in modern cities. It was also underlined that the driving style is one of the five components of fuel consumption, with road geometry, vehicle specifications, traffic and weather conditions being among the most influential. Consequently, to quantify eco-driving comprehensively in real-world conditions using real-world data, a big data approach is needed that jointly considers data from different sources of information.
Evidently, the methodologies used to model fuel consumption need to move beyond the most commonly used linear regression models. Machine Learning and Deep Learning techniques, which are not limited by assumptions, are expected to attract most of the researchers' interest in the future, especially in the cases where all the components of fuel consumption are to be taken into consideration. The recent literature makes clear that the most sophisticated Machine Learning models are suitable for estimating fuel consumption using driving behavior data, with methods such as Neural Networks, Random Forests and Support Vector Machines being the most popular. On the other hand, linear models can nowadays be considered more useful for assessing the influence and importance of each factor in fuel consumption than for predicting it.
Regardless of the availability and quality of the data and the associated modeling, eco-driving as defined and discussed in this work has multiple positive impacts on transportation systems in terms of environment, traffic efficiency and safety. Consequently, eco-driving should be linked to fair and applicable practices that aim to raise awareness among drivers and improve their behavior, for the benefit of society. This can be accompanied by targeted pricing policies, new regulations for alternative fuel vehicles and a systematic upgrade of the transport infrastructure towards a more connected and cooperative city environment.