Estimation of the Energy Consumption of Battery Electric Buses for Public Transport Networks Using Real-World Data and Deep Learning

Abstract: The estimation of energy consumption is an important prerequisite for planning the required infrastructure for charging and optimising the schedules of battery electric buses used in public urban transport. This paper proposes a model using a reduced number of readily acquired bus trip parameters: arrival times at the bus stops, map positions of the bus stops and a parameter indicating the trip conditions. A deep learning network is developed for deriving the estimates of energy consumption stop by stop of bus lines. Deep learning networks belong to the important group of methods capable of the analysis of large datasets ("big data"). This property allows for the scaling of the method and application to differently sized transport networks. Validation of the network is done using real-world data provided by bus authorities of the town of Jaworzno in Poland. The estimates of energy consumption are compared with the results obtained using a regression model that is based on the collected data. Estimation errors do not exceed 7.1% for the set of several thousand bus trips. The study results indicate spots in the public transport network of potential power deficiency which can be alleviated by introducing a charging station or correcting the bus trip schedules.


Introduction
Public transport is a key factor for the functionality of urban systems. Much attention is given to limiting urban pollution and reducing the total cost of ownership (TCO) of the transport fleet while improving the quality of the services. In this context, the electrification of bus transport gains favour as it does not emit polluting gases in the urban environment, is highly power efficient and is less noisy than conventional bus transport. Extended reports [1,2] show that the TCO of electric fleets will drop below the TCO of conventional fleets in the near future. Environmental pollution, first of all the carbon footprint, highly depends on the way the electricity is generated. In countries where coal-fired thermal power stations predominate (Poland), the introduction of battery electric buses may not reduce the carbon footprint, as the electricity is sourced from high-carbon emission plants.
The growth of the world electric bus fleet accelerated rapidly in recent years, currently totalling more than 425,000 [3]. The largest number of electric vehicles is operated in China (over 90%). Although the European share of the global electric bus fleet is not great, European companies play a significant role in the development and design of solutions in the field of transport electrification [4].

• The energy demand is the product of the length or of the time of the trip and a mean energy consumption per unit of length or of time;
• Drive profiles represent the course of the trip.

The first group of models uses a mean energy consumption measure per unit of length or of time [8]. This measure is determined for different types of buses based on drive cycles or using real-world measurements of energy consumption on bus lines. The energy consumption depends on a large number of parameters (bus technology, traffic conditions, number of passengers, profile of the route) and varies between 1.0 and 3.5 kWh/km [9].
An average energy consumption rate of 1.41 kWh/km, which is based on the results obtained on a regular city bus route under a real-world evaluation experiment, is used by the authors in [10] for optimising charging schedules in an urban transport network with charging stations.
If a bus operates on a long bus line, the total used energy may surpass the capacity of the batteries, so a place for a charging station or a modification of the charging schedule is proposed or the organisation of the bus line is changed. An extended study is reported in [11] which additionally optimises the costs of the operation of the charging stations. The authors report in [12] a method of optimisation of the charging infrastructure assuming a constant energy demand per unit length of the trips.
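The fixed-rate approach described above can be sketched in a few lines. This is a minimal illustration, not the method of [10] to [12]: the function names, the 12 km segments and the 160 kWh usable battery capacity are hypothetical; only the 1.41 kWh/km rate comes from the text.

```python
def fixed_rate_energy_kwh(length_km, rate_kwh_per_km=1.41):
    """Energy demand as trip length times a mean consumption rate."""
    return length_km * rate_kwh_per_km


def first_shortfall(segment_lengths_km, usable_capacity_kwh, rate_kwh_per_km=1.41):
    """Return the index of the first segment the battery cannot complete.

    Returns None when the whole sequence fits within the usable capacity.
    """
    used = 0.0
    for index, length_km in enumerate(segment_lengths_km):
        used += fixed_rate_energy_kwh(length_km, rate_kwh_per_km)
        if used > usable_capacity_kwh:
            return index
    return None


# Hypothetical duty: ten consecutive 12 km runs on an assumed 160 kWh battery.
segments_km = [12.0] * 10
shortfall_index = first_shortfall(segments_km, usable_capacity_kwh=160.0)
print(shortfall_index)
```

The index returned marks where a charging stop or a schedule change would be needed under the constant-rate assumption.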
The use of fixed consumption rates in the case of highly variable traffic conditions or undulating roads is inaccurate and leads to difficulties in the proper estimation of energy consumption and as such compromises the effectiveness of the proposed optimisation algorithms. Weather conditions can also severely change the energy consumption when air-conditioning or heating is switched on in the bus.
In the case of trip course representations, the range of models for energy consumption covers a wide spectrum of drive description approaches. There are complex kinematic models of vehicle movements, route-based scenarios of energy consumption, models incorporating characteristics of driving styles, environment variations and vehicle construction related factors. These models require complex measurements of variables used for the description of elements of the models and references to produce a detailed physical modelling of these.
Energies 2020, 13, 2340
Vepsäläinen et al. [13,14] propose a surrogate model which is a derivative of an electro-mechanical model of an electric bus. Fourteen noise factors and their uncertainty margins affecting the energy consumption are identified. The application of the model shows that temperature, rolling resistance and payload contribute most to the variation in energy consumption.
In [15], the authors report on the use of grey relational analysis (GRA) for the analysis of the impact of various external factors on energy consumption on bus trips. Factors such as travel time, weather conditions, length of trip and use of air-conditioning produce the largest effect when estimating energy consumption.
An example of a kinematic model for energy consumption estimation is discussed in [16]. The model is developed in the H2020 project EVERLASTING (Electric Vehicle Enhanced Range, Lifetime and Safety Through INGenious battery management). The longitudinal equations of motion of the vehicle are used to derive expressions for the energy consumption of the powertrain of the bus.
Authors in [17] present a vehicle energy consumption model taking into consideration the influence of weather conditions and road surface-dependent rolling resistance, which includes a road load, a powertrain, a regenerative braking, an auxiliary system and a battery model.
The report in [18] presents a detailed electric model showing the energy flows and power dependencies. The authors propose a method for the state of charge and range estimation by taking into account location-dependent environmental conditions and time-varying drive system losses.
Trips are modelled using standardised scenarios of energy consumption related to the characteristics of roads or traffic conditions. A scenario is determined which most accurately approximates the course of the bus trip or part of the trip [19]. Each scenario has a specific energy demand which is summed up to give the total energy necessary to carry out the bus trip.
The conditions of travel can be diverse, especially in large transport networks, so the application of standardised scenarios of energy consumption leads to large discrepancies in estimations. Changing numbers of passengers, congestion on the roads and weather changes bring about significant changes in the course of the buses causing anomalies in energy demand.
The application of real-world drive profiles for modelling requires measurements of the characteristics of the bus route and parameters of travelling, which determine the energy requirements for the movement of an object. This physical modelling is appropriate for electric vehicles as their efficiency of transforming electric power to mechanical movement is much higher than in the case of vehicles with combustion engines.
The authors in [20] report a mobile data collection system to study the impact of route type on the energy demand. Trip trajectories are registered, a battery management system is used to monitor the voltage, temperature and energy expenditure and an analytical model of energy consumption is proposed using this extended set of variables. Results of the studies presented in [21] prove that accurate estimations can be obtained using route characteristics, such as distance covered, inclination of the road and number of turns, and travel parameters, such as speed and acceleration. The measurements must be carried out with high accuracy and high time resolution. The presented case studies involve a small number of routes and indicate a need for further validation of the proposed estimation method.
Galleta et al. in [22] present a modified approach with a smaller number of parameters. The times of travel and the distances between bus stops are used to calculate the energy consumption. Arrival and departure times are used for determining the travel time and boarding time, which significantly differ in energy requirements. Additionally, the trip between bus stops is divided into segments of acceleration, constant speed and deceleration in order to model the way the bus driver goes through the road network with traffic lights and differing congestion. An assumption is made that consumption is related to the speed of travelling. A simplified speed profile is derived dynamically for each trip. The speed depends on the road conditions, load and on the way the vehicle is driven. This reduced set of measured parameters facilitates the implementation of the estimation method for the analysis of the functioning of large transport networks.
There are studies that combine a physics-based and data-driven approach. In [23], the structure of the energy consumption model is based on vehicle dynamics and extended using additional parameters: travel distance, travel time and temperature. These additional parameters are derived from real-world data collected during drive tests.
Data-driven models use determined relations between the chosen parameters describing trips and the electric bus energy consumption. In general, the models are derived by identifying statistical relations in sets of real-world data or capturing complex relations between the parameters using AI tools. The model parameters often no longer represent a physical quantity, and insight in the underlying physics is lost.
Kanarachos et al. employ a soft sensor for estimating fuel consumption [24]. Soft sensors estimate a variable by combining a system model with other physical measurements. Data acquired with smartphone embedded circuits such as accelerometers, magnetometers and GPS receivers are inputs to a deep neural network (DNN)-based model of fuel consumption. The soft sensor follows the measured values achieving an error rate of approximately 6%.
A data-driven approach is applied to the estimation of polluting gas emissions in road traffic [25]. The motor vehicle emission simulator (MOVES) model is complemented with GPS data sets. Additional data provide a description of the driver's behaviour, contributing to a better mapping of pollution patterns and thus improving the estimation accuracy.
Long short-term memory (LSTM) is applied for modelling drivers' behaviour impacts on fuel consumption [26]. This artificial recurrent neural network (RNN) architecture proves successful for processing a vehicle's operating information and tracking its position in real time to estimate fuel consumption. One study shows estimation error rates less than 6%.
This paper presents a model and a method for the estimation of energy consumption coinciding with the data-driven approach. The method uses readily available arrival times at the bus stops, supplemented with map positions of the bus stops and a variable indicating the trip conditions.
The introduced electric buses have a limited range and monitoring of their movement is reduced to a few basic trip parameters. Bus companies are reluctant to equip their fleet with specialised devices for a detailed registering of the drive parameters. The collected data can be corrupted as the simple monitoring is exposed to different disruptions. The aim is to find a method for using these data for the estimation. The use of a deep learning network (DLN)-based approach is proposed, which demonstrates resistance to corrupt data. This feature of DLNs was earlier studied by the author of [27] and is also reported in the literature.
Arrival times at bus stops are routinely registered by the bus company in order to optimise the bus schedules. The arrival times at bus stops enable the calculation of times of trips between bus stops and represent a measure of the speed of travelling. The times also indirectly indicate the load of the vehicle, summing up the journey time and boarding times at the bus stops. The times vary due to changing road traffic conditions as the buses are mixed in general traffic. This variability is also influenced by hard to define auxiliary factors such as the driver's experience and his current mood.
The drive parameters collected by the bus company are not all used in their raw form. The location of bus stops is fixed. Bus stop positions and route lengths (between bus stops) are derived from maps of the bus lines. In order to include a factor of steepness of a route, which highly increases the energy expenditure as physical models of movement show, the difference in elevation of consecutive bus stops is noted.
Another factor was required to improve the estimation and it was decided to include a code indicating the conditions of travel. The extra variable indicating or describing trip conditions originates from observations of energy consumption changes at different times of the day and of the year. The changes could not be accounted for using measures of traffic congestion, weather changes or transport demands. The variable represents a weather-related measure, which includes additional energy required to sustain the comfort of the bus passengers: lighting, heating and air-conditioning. In the case of electric buses, heating and air-conditioning constitute a very significant energy expenditure, highly influencing the range of the buses. DLNs can work with variables that only approximately describe imprecisely defined factors.
The collected data constitute the input to a "deep learning" network, which determines the estimates [27]. No detailed modelling of the course of the bus trips is done. The proposed input variables are regarded as synthetic measures of the characteristics of the bus trip.
Deep learning networks belong to the important group of methods capable of the analysis of large datasets ("big data"). This property allows for the scaling of the method and application to large transport networks. The validation of the network is done using real-world data provided by bus authorities of the town of Jaworzno (Poland). This dataset contains travel parameters gathered during almost a year of battery bus operation in different traffic, load and weather conditions.
The main contributions of this paper are:

The rest of this paper is organised as follows. Section 2 introduces the energy consumption model's variables. Section 3 presents the DLN-based model. The validation of the model and estimation results, based on a case study using real-world data from a bus company, are discussed in Section 4. Section 5 concludes the paper and provides a brief outlook on future work.

Analysis of Energy Demand of Battery Electric Buses
The aim of the study is to develop an efficient method for the estimation of energy consumption of battery electric buses at the nodes of the bus network. Knowledge of the placement of high energy demand nodes enables the development of propositions for changes in bus schedules or the introduction of charging stations. Efficient means that the requirements for data describing bus trips are small and that the processing methods are robust and immune to data corruption.
Transport networks are constantly being extended and optimised to reduce the costs of functioning and improve the quality of services. Tools that streamline this process without requiring complex data collection are in high demand.
The working thesis is phrased as follows: the application of a deep learning network, using readily available bus trip data, enables an efficient estimation of energy consumption. The estimation is done for trips between consecutive bus stops.
The thesis implies that a large bus trip dataset must be acquired for the proper training of the deep learning network. The other prerequisite is a proper choice of input parameters for the network which concisely describe the course of bus trips. The required parameters must adequately represent the different conditions of the bus operation and be easy to collect using commonly available measurement means.

Bus Trip Datasets
The dataset was obtained thanks to good relations with the municipal bus company of Jaworzno, a leader in the introduction of battery electric buses to urban transport fleets in the south of Poland. The dataset is constantly supplemented with daily trip data of buses operating in the transport system. The company manages 24 battery electric buses, which amounts to 40% of its fleet, and plans to replace all conventional buses with electric vehicles. Currently, only one type of electric bus in three sizes is used in order to minimise maintenance costs. This is the Solaris Urbino electric (18, 12 and 8,9 LE), produced by the Solaris Bus and Coach company in Poland. The buses are mixed in general traffic and no extra privileges are granted to their movement. Bus schedules are prepared taking into account transport demands of the different parts of the town. Most of the bus stops have LED text display boards which provide current travel data.
The dataset comprises map data (GPS coordinates) for bus stops, times of arrival at bus stops and energy expended between bus stops. Solaris Urbino 12 buses operate on the bus lines chosen for the study. Buses are equipped with GSM-based telemetry devices which report the buses' position, battery states and charging events to the central bus depot. The reporting is done for every bus stop and at 1-min intervals when the bus moves.
The data were collected at the bus depot for working days in the summer and winter months during the whole operation periods of the bus lines (5 a.m.-11 p.m.). Data registered in April, May and June are used as representative of the summer months. Data registered in December, January and February are used as representative of the winter months. These periods have characteristic weather conditions. Some bus lines are looped, so there is repeated trip data useful for confirming the travelling parameters.
In all, more than 3000 trips between 135 bus stops are available for analysis. Table 1 presents an excerpt from the dataset, which is a list of the raw trip parameters of a bus line with 15 bus stops. The values of energy consumption are derived from the registered levels of bus battery charge. Distances between bus stops and elevation differences are calculated using GPS coordinates and map data. The presented table shows an example of a bus operating on a line in winter: it needs 20 min 12 s and 9.6 kWh of electric energy to cover the route.
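As an illustration of how energy values might be derived from registered battery charge levels, the sketch below converts consecutive state-of-charge readings into energy used per trip. The reporting format (percent state of charge), the 160 kWh usable capacity and the readings themselves are assumptions made for the example; the source does not specify them.

```python
def energy_between_stops_kwh(soc_percent, usable_capacity_kwh):
    """Convert consecutive state-of-charge readings (%) into energy used per trip.

    Each result is the charge drop between two consecutive stops, scaled by
    the usable battery capacity.
    """
    return [
        (before - after) / 100.0 * usable_capacity_kwh
        for before, after in zip(soc_percent, soc_percent[1:])
    ]


# Hypothetical readings at four consecutive stops (assumed values).
soc_readings = [80.0, 79.5, 79.0, 78.25]
per_trip_kwh = energy_between_stops_kwh(soc_readings, usable_capacity_kwh=160.0)
total_kwh = sum(per_trip_kwh)
print(per_trip_kwh, total_kwh)
```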
A corresponding map is shown in Figure 1, where the red dotted line is the route of the bus line.

Changes of Energy Demand Due to Weather Conditions
Representative month data are analysed. The demand change follows the increase in travel time in the winter months, but in the summer behaves the other way round. The mean values of energy consumption in January and June are the same and very high, and this is accounted for by heating in the winter and air-conditioning in the summer, which consume much energy. Figure 2a,b shows the column graphs of the two parameters. The difference between the highest and the lowest consumption amounts to 0.1 kWh, which is 20% of the lowest consumption.

Changes of Energy Demand Due to Bus Stop Elevation Differences
Single trip characteristics are analysed. The movement parameters of the bus trips constitute the input variables of the proposed deep learning network and have a direct impact on the energy expenditure. Significant changes in the energy demand are noted for trips between bus stops at different elevations. One example illustrated in Figure 3a,b shows that a 25 m difference over a 386 m distance brings about an 18% change in December and a 23% change in June. Interestingly, winter travel has lower energy consumption when travelling downwards in comparison with summer travel. Such dependencies indicate that the factor of elevation change is important for the energy consumption estimation.
The energy consumption also changes during the day of operation. Peaks are observed which can be attributed to a higher number of passengers travelling on the bus. A heavier bus requires more energy to move, and it also takes longer for the passengers to board the bus.
In some cases, the distance between stops can differ when the bus travels upwards and downwards. This happens when the bus lines are routed on one-way roads. Figure 4a,b presents an example of bus trips where the distances in the opposite directions differ significantly: one way it is 592 m, and the other way it is 702 m, with a slight difference in elevation of 8 m.

Energy Consumption Model Variables
The analysis of the relations between the raw bus trip data and the demand for a concise description of the bus trips, which is a prerequisite for the efficient application of a deep learning network (DLN), indicate the necessity to combine some of the collected data. The values of energy consumption change with the trip distance, as the trip travel time changes, when there is a difference in elevation between the stops and when the weather changes. The values of these variables except for weather changes can be derived using the collected trip data. The distance and elevation difference are calculated using the position data of the start and end bus stops of a trip, while trip time is the difference in arrival times at these stops. The problem of expressing weather conditions is a case of coding non-measurable variables frequently found in applications of neural networks. It is usually solved using some heuristic based on the expertise of the designer of the network.
The weather conditions are strongly related to the time of year and this is the basis for assigning variable values proposed in this study. Table 2 contains the assignment list. The assigned values are a derivative of the energy consumption rates and average monthly weather conditions. Months with similar measured energy consumption rates and weather received the same values. Months with lower consumption rates and better weather had higher values ascribed. The following variables are used for the description of bus trips:
• $d_n$: distances between bus stops;
• $\Delta t_n$: travel times, that is, the time taken to travel between consecutive bus stops;
• $\Delta h_n$: elevation differences;
• $w_n$: weather codes.
The energy consumption is expressed using
$$E_n = f(d_n, \Delta t_n, \Delta h_n, w_n) \qquad (1)$$
where n is the number of the bus trip of a bus line. Table 3 presents the description of the previously (Table 1) quoted bus line using the network input variables. The bus line data were reported on a winter day, w = 2 (February). Travel times mostly exceed 100 s and do not follow the distances between bus stops. The route is hilly, with 20-24 m changes in elevation. Energy consumption per bus trip is mostly above the 0.8 kWh value.
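The derivation of the four input variables from raw trip data can be sketched as follows. This is a simplified illustration under stated assumptions: the stop records and their field names are hypothetical, and the distance is a straight-line (great-circle) approximation, whereas the study takes route lengths from maps of the bus lines, which may differ by direction. Only the weather code w = 2 for February is taken from the text.

```python
import math
from datetime import datetime

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two GPS positions."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def trip_inputs(start, end, weather_code):
    """Return (d_n, dt_n, dh_n, w_n) for the trip from `start` to `end`."""
    d = haversine_m(start["lat"], start["lon"], end["lat"], end["lon"])
    dt = (end["arrival"] - start["arrival"]).total_seconds()
    dh = end["elevation_m"] - start["elevation_m"]
    return d, dt, dh, weather_code


# Hypothetical stop records (coordinates, elevations and times are made up).
stop_a = {"lat": 50.2050, "lon": 19.2750, "elevation_m": 262.0,
          "arrival": datetime(2020, 2, 3, 6, 0, 0)}
stop_b = {"lat": 50.2080, "lon": 19.2750, "elevation_m": 270.0,
          "arrival": datetime(2020, 2, 3, 6, 1, 45)}

d_n, dt_n, dh_n, w_n = trip_inputs(stop_a, stop_b, weather_code=2)
print(round(d_n), dt_n, dh_n, w_n)
```

In practice the straight-line distance would be replaced by the mapped route length wherever the road geometry matters, for instance on one-way streets.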

Energy Consumption Model
The energy consumption is modelled using a deep learning network (DLN). This ensures a high level of immunity to corrupted data [26] and a good response to complex relations between the data inputs. These desirable characteristics can be attained in the course of network training using a large dataset of input values, which are derivatives of the collected historical data.
The way of using the model to calculate energy consumption for bus lines is presented in Figure 5. The trained DLN gives estimates of energy consumption for trips between bus stops and requires inputs which are derived from the raw data collected during bus travels. The first two steps process and prepare the data for the DLN. Different tools for collecting data can be used, for instance, specialised data loggers or mobile devices such as smartphones or smart cameras. The registered trip data are next converted to DLN inputs, and this is done straightforwardly in the case of data loggers and with some effort in the case of smart devices.
The total bus line energy consumption is the sum of estimates for all the bus trips of the line. Additional energy expenditure can be accounted for when the bus waits at the end terminals of the line.
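The summation over a bus line can be sketched as below. The estimator passed in is a placeholder and is not the trained DLN: it applies the fixed 1.41 kWh/km rate quoted in the Introduction purely so that the example runs. The function names, the trip tuples and the 0.3 kWh terminal allowance are hypothetical.

```python
def line_energy_kwh(trips, estimate_trip, terminal_wait_kwh=0.0):
    """Sum per-trip estimates for a bus line and add any terminal waiting energy.

    `trips` is a sequence of (d_m, dt_s, dh_m, w) tuples, one per pair of
    consecutive bus stops; `estimate_trip` maps such a tuple to kWh.
    """
    return sum(estimate_trip(*trip) for trip in trips) + terminal_wait_kwh


def placeholder_estimator(d_m, dt_s, dh_m, w):
    """Stand-in for the trained network: fixed rate per kilometre only."""
    return 1.41 * d_m / 1000.0


# Hypothetical two-trip line observed with weather code 2.
line_trips = [(500.0, 100.0, 5.0, 2), (1000.0, 150.0, -3.0, 2)]
line_total_kwh = line_energy_kwh(line_trips, placeholder_estimator, terminal_wait_kwh=0.3)
print(line_total_kwh)
```

Passing the estimator as an argument keeps the summation independent of how the per-trip values are produced, so a trained model can be substituted without changing the rest.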
A deep learning network with autoencoders is proposed for modelling the energy consumption function (Equation (1)). Figure 6 shows the draft of the proposed solution. A multilayer perceptron (MLP) calculates the energy consumption values based on the features elaborated by the last autoencoder of the stack. The hidden layer and the output layer of the DLN are trained using the Levenberg-Marquardt algorithm [28].
Autoencoders consist of encoders and decoders. An encoder with inputs p finds a hidden feature representation of its inputs, y(p). This set of features constitutes the inputs of the decoder, which produces the outputs z(p). In the course of the training of the autoencoder, the outputs of the decoder are adjusted to the inputs of the encoder and a set of neuron weights is derived.

$$y(p) = f(W_e\,p + b_e), \qquad z(p) = g(W_d\,y(p) + b_d)$$

where (W_e, b_e) and (W_d, b_d) are the sets of weight and bias vectors of the encoder and decoder, respectively.

• The neuron activation functions f and g are sigmoidal.

• The goal of the training of the autoencoder is to find weights and bias vectors which give the closest approximation of p using z(p). This is done by minimising the cost function over the k training inputs using the stochastic gradient descent method [29].
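The encoder and decoder mapping described above can be illustrated with a short sketch. It assumes the usual affine-plus-sigmoid form for both layers and a mean squared reconstruction error as the cost; the cost function is not given in this excerpt, so that choice is an assumption. The layer sizes (4 inputs, 16 features), the random weights and all function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer(W, b, v):
    """Return sigmoid(W v + b) for a weight matrix W given as a list of rows."""
    return [sigmoid(sum(w * x for w, x in zip(row, v)) + b_i)
            for row, b_i in zip(W, b)]


def encode(p, W_e, b_e):
    return layer(W_e, b_e, p)        # y(p): hidden feature representation


def decode(y, W_d, b_d):
    return layer(W_d, b_d, y)        # z(p): reconstruction of the input


def reconstruction_cost(inputs, W_e, b_e, W_d, b_d):
    """Mean squared reconstruction error over the k training inputs
    (assumed cost form)."""
    total = 0.0
    for p in inputs:
        z = decode(encode(p, W_e, b_e), W_d, b_d)
        total += sum((p_j - z_j) ** 2 for p_j, z_j in zip(p, z))
    return total / len(inputs)


random.seed(0)
n_inputs, n_features, k = 4, 16, 8   # hypothetical sizes
W_e = [[random.uniform(-0.1, 0.1) for _ in range(n_inputs)]
       for _ in range(n_features)]
b_e = [0.0] * n_features
W_d = [[random.uniform(-0.1, 0.1) for _ in range(n_features)]
       for _ in range(n_inputs)]
b_d = [0.0] * n_inputs
# Inputs are assumed to be scaled to [0, 1] so a sigmoid output can match them.
inputs = [[random.random() for _ in range(n_inputs)] for _ in range(k)]

features = encode(inputs[0], W_e, b_e)
cost = reconstruction_cost(inputs, W_e, b_e, W_d, b_d)
```

Training would adjust the four weight and bias sets to reduce `cost`; the sketch only evaluates it for untrained weights.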

Case Study
The model and method are validated using the dataset obtained from the municipal bus company of Jaworzno. Preliminary investigations using a small dataset of bus trips indicate that a DLN with one autoencoder is adequate for modelling energy consumption. The performance of this solution depends on the number of features that are discerned by the autoencoder. There is a specific number of features that best describes the characteristics of the DLN input variables.
Configurations with 10 to 30 neurons in the autoencoder and 6 to 16 neurons in the MLP were tested. The number of MLP neurons is equal to the number of discerned features. The training set consists of 3941 bus trips and the testing set consists of 600 bus trips.
A systematic approach was used to find the required number of neurons of the network. The number of neurons was gradually changed and each new configuration was trained. The maximum number of training epochs was set to 1000 for the autoencoder and to 300 for the MLP layer. The training was repeated five times and the results of the estimation were evaluated. The set of neuron weights with the best performance was noted.
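The search procedure above can be outlined as a simple loop. This is only a sketch under stated assumptions: `train_and_score` is a hypothetical callable that trains one configuration and returns its test error, and the dummy scorer below stands in for real training so that the example runs.

```python
import itertools


def search_configurations(train_and_score, ae_sizes, mlp_sizes, repeats=5):
    """Try every (autoencoder, MLP) size pair, repeat the training,
    and keep the run with the lowest error."""
    best = None
    for n_ae, n_mlp in itertools.product(ae_sizes, mlp_sizes):
        for run in range(repeats):
            error = train_and_score(n_ae, n_mlp, run)
            if best is None or error < best[0]:
                best = (error, n_ae, n_mlp, run)
    return best


def dummy_score(n_ae, n_mlp, run):
    """Placeholder for real training; lowest at 16 and 6 neurons by construction."""
    return abs(n_ae - 16) + abs(n_mlp - 6) + 0.1 * run


best = search_configurations(dummy_score, range(10, 31), range(6, 17))
```

In a real run the scorer would also keep the trained weights, since the text notes that the best-performing weight set is retained.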
The errors of estimation are evaluated using the mean absolute percentage error (MAPE) and root mean square error (RMSE):

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{E(i)-E_r(i)}{E_r(i)}\right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(E(i)-E_r(i)\bigr)^{2}}$$

where E(i) and E_r(i) are the estimated and registered energy consumption, respectively, for trip i of the test set, n = 600.
The best performing DLN configurations are listed in Table 4. The DLN N1, with a 16-neuron autoencoder discerning 16 features and an MLP with six neurons in the hidden layer, gives the lowest MAPE and RMSE errors. The hidden layer elaborates the relations between the feature values and provides data for the output neuron which generates the estimates of energy consumption.
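The two error measures can be computed as follows. This is a minimal sketch using the standard definitions of MAPE and RMSE; the function names and sample values are hypothetical.

```python
import math


def mape(estimated, registered):
    """Mean absolute percentage error, in percent."""
    n = len(registered)
    return 100.0 / n * sum(abs(e - r) / abs(r)
                           for e, r in zip(estimated, registered))


def rmse(estimated, registered):
    """Root mean square error, in the unit of the inputs."""
    n = len(registered)
    return math.sqrt(sum((e - r) ** 2
                         for e, r in zip(estimated, registered)) / n)


# Hypothetical estimated and registered energy values for two trips.
estimated = [2.0, 4.0]
registered = [1.0, 5.0]
mape_value = mape(estimated, registered)
rmse_value = rmse(estimated, registered)
```

MAPE is undefined for a registered value of zero, so trips with no recorded consumption would have to be excluded or handled separately.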
Absolute values of the percentage estimation errors are presented in Figure 8. Figure 8a shows box plots for the three best networks, which present the statistical characteristics of the estimation errors. Error medians for the consecutive networks are 5.6%, 5.0% and 5.6%, the upper adjacent values are 18%, 19% and 20%, and the lower adjacent values are equal to 0. The diverse conditions of the data registration produce a number of outliers, which account for about 7% of the test set samples.
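The box plot statistics quoted above (median, adjacent values, share of outliers) can be derived from a list of absolute percentage errors as sketched below. The sketch assumes the common convention that adjacent values are the most extreme observations within 1.5 times the interquartile range of the quartiles; the text does not state which convention was used. The function name and sample data are hypothetical.

```python
import statistics


def box_stats(errors):
    """Median, adjacent values and outlier share for a list of errors."""
    q1, median, q3 = statistics.quantiles(errors, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [e for e in errors if lower_fence <= e <= upper_fence]
    return {
        "median": median,
        "lower_adjacent": min(inside),
        "upper_adjacent": max(inside),
        "outlier_share": 1.0 - len(inside) / len(errors),
    }


# Hypothetical absolute percentage errors for nine trips.
sample_errors = [0, 2, 4, 5, 6, 8, 10, 12, 40]
stats = box_stats(sample_errors)
```

For this sample only the value 40 lies beyond the upper fence, so it is counted as the single outlier.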
Figure 8b depicts the estimation errors for a range of bus trips as stem plots. Error values are scattered, although most of the values fall below the 10% limit. There are no explicit relations between the N1, N2 and N3 estimation errors.
Figure 9 presents the estimation performance for a larger excerpt of test data amounting to 90 bus trips. The test data and estimates graphs are paired for comparison. Rectangles cover the range of bus trips which are depicted in Figure 8b. The graphs confirm a mixed performance of the networks. Correctly mapped samples include both large and small values.
The graphs in Figure 9 differ slightly, in line with the statistics. Careful analysis reveals more overlapping lines in Figure 9a, which suggests a better fit of the estimates to the test data. The characteristics of the whole set of test and estimated data confirm the best performance of the N1 network.

Comparison with a Multiple Linear Regression (MLR) Model of Bus Trip Data
The course of development of the DLNs shows that the relations between input variables do not exhibit extraordinary behaviour. This observation suggests that a multiple linear regression (MLR) model could satisfy the requirements of the energy consumption estimation. The MLR model is based on the assumptions that there is a linear relationship between the variables and that the variables are not too highly correlated with each other. It uses several explanatory variables to predict the outcome of a response variable. In this case, the network input variables are the explanatory variables and the estimated energy consumption (E_n) is the response, as in Equation (6):

$$E_n = m_4 d_n + m_3 \Delta t_n + m_2 \Delta h_n + m_1 w_n + b \qquad (6)$$
Stochastic gradient descent is used to find the coefficients (m) of the equation. The same set of bus trip data as in the case of DLN training is processed. The results indicate that the proposed model is suitable for estimating energy consumption. The coefficient of determination (R²) reached the value 0.89. The standard errors of the coefficients m4−m1 are {0.101, 3.144, 0.016, −0.123}, and these show that all of the explanatory variables are necessary for the calculation of the estimates.
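Fitting the coefficients of Equation (6) with stochastic gradient descent can be sketched as follows. This is an illustration on synthetic, noise-free data, not the authors' procedure: the learning rate, epoch count, function name and "true" coefficients are all hypothetical, and the inputs are assumed to be scaled to [0, 1].

```python
import random


def fit_mlr_sgd(X, y, lr=0.05, epochs=1000, seed=0):
    """Fit y ≈ m·x + b by per-sample gradient steps on the squared error."""
    rng = random.Random(seed)
    m = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)                      # visit samples in random order
        for i in order:
            pred = sum(mj * xj for mj, xj in zip(m, X[i])) + b
            err = pred - y[i]
            m = [mj - lr * err * xj for mj, xj in zip(m, X[i])]
            b -= lr * err
    return m, b


# Synthetic data: columns stand for (d, Δt, Δh, w), so the fitted list
# corresponds to (m4, m3, m2, m1) in Equation (6).
data_rng = random.Random(1)
true_m, true_b = [0.8, 0.3, 0.5, 0.1], 0.05    # hypothetical coefficients
X = [[data_rng.random() for _ in range(4)] for _ in range(50)]
y = [sum(mj * xj for mj, xj in zip(true_m, row)) + true_b for row in X]

m_fit, b_fit = fit_mlr_sgd(X, y)
```

With real, noisy trip data a constant learning rate leaves some jitter around the least-squares solution, so a decaying rate or a closed-form solver may be preferable.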
The errors of estimation reached the values MAPE = 8.2% and RMSE = 0.085. The box plots in Figure 10a compare the statistical characteristics of the estimation errors of the best network, N1, and the MLR model. Error medians for N1 and the MLR model are 5.6% and 5.9%, the upper adjacent values are 18% and 22%, and the lower adjacent values are equal to 0. The MLR model gives a similar number of outliers, about 7% of the test samples. The characteristics confirm a better performance of the N1 network, especially the lower upper adjacent value and the smaller 75th percentile.
Figure 11 presents the estimation performance, and again, the test data and estimates graphs are paired for comparison. The rectangle covers the range of bus trips which are depicted in Figure 10b. The graphs confirm a mixed performance of the MLR model. The discrepancies between graphs are larger than in the case of the N1 DLN.
Figure 11. Test data and estimated values using the multiple regression (MLR) model.

Discussion
The proposed model and method of estimation of energy consumption meet the expected requirements for a robust and efficient tool. The application of the trained DLN enables an accurate and fast calculation of the energy needs for bus trips as well as for whole bus lines operated by battery electric vehicles. The number of variables representing energy consumption is reduced to parameters of bus trips between bus stops. The values of these variables can be recorded on a daily basis and used for updating the model. A larger database will be useful for extending the training sets and achieving lower estimation errors.
The proposed approach gives an acceptable error for practical use. Collected real-world data on bus operations show that factory specifications of bus powertrains, and especially of battery performance, are in many cases overestimated by tens of percent.
The referenced works do not explicitly report the error rates of their models. Indirect comments hint at errors in the range of a few percent, but there are many restrictions on the use of these models in real-world conditions.
The proposed energy consumption model based on DLN is more accurate than an MLR-based model. The difference is not substantial; however, the trained DLN is resistant to inputs of corrupted data, whereas the MLR model responds with high errors. The DLN generates a response based on the latest training data. Missing input data are compensated for by the network and the response does not suffer greatly [26]. This property contributes to the robustness of the proposed DLN-based energy consumption model.

Conclusions
Estimation results are used for the energy analysis of bus lines. The resolution of analysis, bus stop to bus stop, is adequate for locating energy deficient places of the bus network. The positions of bus stops are mostly determined by the transport needs of the inhabitants. Energy deficient nodes are potential locations of charging stations. Bus companies can adjust bus schedules to avoid the energy bottlenecks of the network.
Knowledge of energy consumption in the bus network is beneficial for planning the expansion of the bus fleet, modernising the infrastructure and managing the daily operations. A large number of energy deficient places raises the question of whether it is more advantageous to build charging stations or to buy buses with batteries of higher capacity.
The designed DLN for the estimation of energy consumption performs satisfactorily in the case of a homogeneous fleet of battery electric vehicles. New companies are currently entering the electric bus market, so there is a need to include fleet variability in the estimation model; this is a subject for future study.