Characterization of Daily Travel Distance of a University Car Fleet for the Purpose of Replacing Conventional Vehicles with Electric Vehicles

This study fits one year of daily travel distance (DTD) data collected from the Nagoya University (NU) car-sharing system to several distribution functions, including a lognormal mixture model. Based on their p-values, the lognormal distribution performs best among the five single-distribution functions tested. Moreover, according to the Akaike information criterion (AIC), the lognormal mixture model represents the driving pattern better overall. Considering two types of electric vehicles (EVs), the results show that, according to the observed DTD data at a 95% confidence level, 30 out of 48 vehicles can be substituted by the EV type with the larger battery capacity. In this exercise, the updated car-sharing system can have up to nine vehicles available at peak hour, which meets the peak-shaving need and makes it possible to contribute electricity for common use through a vehicle-to-grid (V2G) system. Additionally, the updated system with the larger battery capacity can reduce CO2 emissions by 24%. Such systems could be widely applied in other organizations or companies concerned with electricity consumption and emission reduction.


Introduction
Considerable effort has been devoted to reducing CO2 emissions and saving energy, and electric vehicles (EVs) appear able to serve both goals at once. Smith et al. [1] describe EVs as having the potential to reduce oil dependency and greenhouse gas (GHG) emissions in transportation. This was confirmed by Casals et al. [2], who examined the change in emissions between internal combustion engine vehicles and EVs and suggested that some countries (e.g., France or Norway) are better suited to EV adoption. Other researchers have combined EVs with car-sharing systems to obtain a larger reduction in emissions; for example, Baptista et al. [3] analyzed a car-sharing system with empirical data and showed that CO2 emissions can be reduced by 65% if a shift to EVs is promoted. Additionally, EVs that replace conventional vehicles can serve as an energy resource by sending electricity back into the grid with the help of a vehicle-to-grid (V2G) system, which lowers electric system costs [4][5][6].
One unavoidable question faces both researchers and EV users when considering the substitution of EVs for conventional vehicles, as pointed out by Pearre et al. [7] and Plötz [8]: the limited driving range of EVs due to their small battery capacity.
Many approaches have been applied to solve this issue by focusing on charging patterns and infrastructure. Schücking et al. [9] claimed that a prudent mix of conventional and Direct Current (DC) fast charging allows for a higher annual mileage, while Frade et al. [10] and Xi et al. [11] concentrated on the location of the charging infrastructure.
The study of daily travel distance (DTD) is one of the considerations in optimizing the charging strategy and infrastructure or in determining the battery size, and many articles have linked it with distribution functions. As Greene [12] mentioned, daily travel may be regarded as a series of independent, random values drawn from a particular probability distribution. Researchers have tried numerous types of distributions. Gamma, binomial, and normal distributions were tested by Greene [12], and Tamor et al. [13] used the sum of a broad exponential and a narrow Gaussian distribution. Lin et al. [14] compared lognormal, Weibull, and gamma distributions and found that the gamma distribution gave the most precise fit for their data. In contrast, Plötz et al. [15] compared the same three distributions, and their results showed that the lognormal and Weibull distributions generally performed better than the gamma distribution. Some researchers chose not to use one single distribution. Pearre et al. [7] used an empirical distribution of their collected data. Plötz et al. [15] claimed that no single distribution clearly outperforms all others. Li et al. [16] tested a mixture model in which they categorized drivers into nine types, but they did not test any existing distribution model.
This study focuses on university-owned cars at Nagoya University (NU), which are used similarly to a car-sharing system. The study fits empirical DTD data to five different distribution functions as well as a lognormal mixture model, so that the model can ultimately be used to test the effects of substituting EVs for conventional, gasoline-powered vehicles. We find that the lognormal distribution fits the DTD data better than the other four distribution functions; moreover, a lognormal mixture model shows great promise in fitting the DTD. The fitted model is then compared with the DTD that an EV with a given driving range can reach in order to determine which vehicles can be substituted by EVs. Although the university car fleet has a different travel pattern than conventional household vehicles, fitting a probability distribution to the DTD data allows us to obtain results analogous to Blum [17] and Plötz et al. [18].

Data
The observed DTD data for this study were obtained from Nagoya University's car fleet. The fleet is used for administrative, educational, and research purposes by employees, including administrative workers and researchers. Nagoya University holds 54 general-purpose vehicles (GPVs), 52 of which have travel data; 1 of these 52 has only travel time data and no travel distance data, which leaves 51 vehicles with distance data. The data include department, vehicle ID, vehicle type, time of check-out and check-in, odometer reading at check-out and check-in, and refueling logs. There are 5 vehicle body types in total: minivan (25 vehicles), sedan (12), van (7), SUV (2), and truck (2). For all but 3 of the 51 vehicles, data were collected from October 2014 to September 2015. Thus, we analyzed only the 48 vehicles with 1 full year of data.
Some vehicles made several trips in a day; we therefore summed the trip distances of each vehicle on each day to obtain its DTD, and the resulting histogram is shown in Figure 1. The 48 vehicles have 4586 driving days in total; Figure 1 shows the DTD of 4522 driving days, since only 1.4% of the DTD data exceeded 500 km. According to Figure 1, the data show a distinct peak at 10 km and another slight peak at 150 km. Ninety-five percent of the DTD data lie within 300 km. Whether this travel demand can be met by existing EVs, given their driving range, is discussed in the latter part of this study. For EVs in a car-sharing system, a lower DTD is obviously preferable because of the limited battery size.
One reason for analyzing DTD data is to explore the availability of the shared EVs as a power source through the vehicle-to-grid (V2G) system, and one aspect to consider for discharging is the charging time of EVs. The discharging system must guarantee that vehicles retain sufficient power to fulfill the travel demand. Figure 2 shows the average use rate by time of day for each day of the week. The highest use rate is 24.6% on Friday between 13:00 and 14:00, while the lowest is 1.6% on Wednesday at 02:00. Overall, two peaks in use rate appear clearly at 10:00 and 13:00, and the use rate falls to a local minimum at 12:00. While the use rates at peak hours are significantly higher on weekdays than on the weekend, the use rate from 23:00 to 7:00 does not differ substantially between the days of the week, which aligns with when people typically work and rest. This basic information describes the usage patterns of the car-sharing system and informs the discharging options after the substitution of EVs. However, whether travel demand can be fulfilled with EVs requires more study and is discussed in the subsequent sections.
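The aggregation of individual trips into daily travel distances can be sketched in Python as follows. This is a minimal sketch, not the authors' processing code: the record layout, the vehicle IDs, the odometer values, and the rule that a trip counts toward the calendar day of its check-out are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical trip records: (vehicle ID, check-out time, odometer out, odometer in).
trips = [
    ("V01", "2014-10-01 09:10", 12000, 12008),
    ("V01", "2014-10-01 13:30", 12008, 12020),
    ("V01", "2014-10-02 10:00", 12020, 12170),
    ("V02", "2014-10-01 08:45", 5000, 5015),
]

def daily_travel_distance(trips):
    """Sum trip distances per (vehicle, check-out day).

    Assumption: a trip is assigned entirely to the day of its check-out,
    so trips that run past midnight are not split.
    """
    dtd = defaultdict(float)
    for vehicle_id, checkout, odo_out, odo_in in trips:
        day = datetime.strptime(checkout, "%Y-%m-%d %H:%M").date()
        dtd[(vehicle_id, day)] += odo_in - odo_out
    return dict(dtd)

dtd = daily_travel_distance(trips)
for (vehicle_id, day), km in sorted(dtd.items()):
    print(vehicle_id, day, km)
```

Days on which a vehicle was not used do not appear in the result, which matches the use of "driving days" in the text.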

Methods
Similar to Plötz et al. [8], we assume the DTD to be independent and identically distributed random variables and test the fit of the DTD data for each vehicle against five probability distributions as well as a lognormal mixture model. The five distributions tested in this study are the normal, lognormal, gamma, exponential, and Weibull distributions. The mixture model used in this study is a combination of two lognormal distribution functions, and its probability density function (PDF) for a given vehicle can be written as follows:
$$f(r_t) = \alpha \, f_{LN}(r_t; \mu_1, \sigma_1) + (1 - \alpha) \, f_{LN}(r_t; \mu_2, \sigma_2)$$
where $r_t$ is the daily travel distance for day $t$, $\alpha \in (0,1)$ is the mixing proportion, $f_{LN}$ is the lognormal density, $\sigma_i$ is the standard deviation of the $i$-th mixture component, and $\mu_i$ is the mean of the $i$-th mixture component.
Two statistical measures are adopted in this study to evaluate the goodness of fit of each distribution and of the lognormal mixture model. The p-value estimated by the Kolmogorov-Smirnov test (K-S test) [19] is used to determine whether the DTD of a given vehicle is subject to a given distribution form at a 95% confidence level. The Akaike information criterion (AIC) is used to compare the goodness of fit of the mixture model and the other distribution forms for each vehicle. It is defined as
$$\mathrm{AIC} = -2\,LL + 2(p + 1), \qquad LL = \sum_{t} \ln f(r_t)$$
where $p$ is the number of model parameters and $LL$ is the log-likelihood function of the fitted density $f$.
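The mixture density and the AIC described above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' code: the lognormal components are parameterized by the mean and standard deviation of ln(r), which is the usual convention, and the sample values and parameter values in the usage lines are hypothetical.

```python
import math

def lognormal_pdf(r, mu, sigma):
    """Lognormal density; mu and sigma are the mean and s.d. of ln(r) (assumed convention)."""
    if r <= 0:
        return 0.0
    z = (math.log(r) - mu) / sigma
    return math.exp(-0.5 * z * z) / (r * sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(r, alpha, mu1, sigma1, mu2, sigma2):
    """Two-component lognormal mixture with mixing proportion alpha in (0, 1)."""
    return (alpha * lognormal_pdf(r, mu1, sigma1)
            + (1.0 - alpha) * lognormal_pdf(r, mu2, sigma2))

def log_likelihood(data, pdf):
    """Sum of log densities over the observed daily travel distances (all must be > 0)."""
    return sum(math.log(pdf(r)) for r in data)

def aic(ll, p):
    """AIC as written in the text: -2 LL + 2 (p + 1)."""
    return -2.0 * ll + 2.0 * (p + 1)

# Hypothetical DTD sample (km) and hypothetical mixture parameters.
sample = [8.0, 10.0, 12.0, 15.0, 140.0, 160.0]
params = (0.6, 2.3, 0.5, 5.0, 0.4)
ll = log_likelihood(sample, lambda r: mixture_pdf(r, *params))
print(aic(ll, p=5))
```

Because every candidate model is scored with the same formula, only the differences in AIC between models for the same vehicle matter for the comparison.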

Comparison of Five Distributions
The 95% confidence level is most commonly used [20] and is therefore used in this study. The K-S test is applied to every vehicle after fitting its DTD with each of the 5 distributions to obtain the p-value. For each vehicle, there are 32 (2^5) possible combinations of being or not being subject to the 5 distribution forms, and only 12 of these combinations occur among the vehicles. The results are provided in Table 1.
* O indicates that the DTD of the vehicles can be subject to this distribution; × indicates that it cannot.
Twenty out of 48 vehicles could not pass the K-S test at the 95% confidence level for any distribution, and the DTD of 3 vehicles can be subject to all distributions. In terms of the number of vehicles passing the K-S test, the lognormal distribution provided the closest fit of the five alternatives (21 vehicles), followed by the Weibull (20 vehicles), gamma (14 vehicles), and normal (8 vehicles) distributions. The exponential distribution yielded the worst fit to the DTD of the vehicles.
More than half of the vehicles fit at least one distribution at the 95% confidence level, and the distribution that fit the most vehicles was the lognormal distribution. The combination of two lognormal distributions was therefore considered as an additional step to find the closest fit to the DTD data.
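The per-vehicle K-S check described in this section can be illustrated with a short standard-library sketch for the lognormal case. It is not the authors' procedure: the maximum-likelihood fit, the asymptotic p-value formula with its small-sample correction factor, and the DTD values are assumptions chosen for illustration. One caveat worth stating is that when the parameters are estimated from the same data being tested, the standard K-S p-value tends to be optimistic.

```python
import math

def fit_lognormal(data):
    """Maximum-likelihood estimates of the mean and s.d. of ln(x)."""
    logs = [math.log(x) for x in data]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma

def lognormal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF and the model CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def ks_pvalue(d, n, terms=100):
    """Asymptotic Kolmogorov p-value with a common small-sample correction."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here and the p-value is ~1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))

# Hypothetical DTD values (km) for one vehicle.
dtd = [6.0, 8.0, 9.0, 11.0, 12.0, 15.0, 18.0, 25.0, 40.0, 150.0]
mu, sigma = fit_lognormal(dtd)
d = ks_statistic(dtd, lambda x: lognormal_cdf(x, mu, sigma))
p = ks_pvalue(d, len(dtd))
print("passes at the 95% level:", p > 0.05)
```

Repeating the same steps with the CDF of each candidate distribution gives the pass/fail pattern per vehicle that the 32 combinations in the text refer to.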

Results of the Lognormal Mixture Model
As mentioned above, the lognormal distribution performs best among the 5 distributions, yet fewer than half of the vehicles can be represented by it. Thus, we combine 2 lognormal distributions into a mixture model, and the parameters of the mixture model are estimated with the mixtools package for the R Project for Statistical Computing.
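For readers without R, the idea of estimating a two-component lognormal mixture can be sketched with a small expectation-maximization (EM) loop in Python. This is a stand-in written for illustration and is not the mixtools implementation: it fits a two-component normal mixture to the log-distances, and the median-split initialization, the fixed iteration count, and the lower bound on the standard deviations are all assumptions.

```python
import math

def normal_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def weighted_mean_sd(ys, weights, min_sigma):
    total = sum(weights)
    mu = sum(w * y for w, y in zip(weights, ys)) / total
    var = sum(w * (y - mu) ** 2 for w, y in zip(weights, ys)) / total
    return mu, max(min_sigma, math.sqrt(var))

def fit_lognormal_mixture(data, iterations=200, min_sigma=1e-3):
    """EM on y = ln(r); returns (alpha, mu1, sigma1, mu2, sigma2).

    Needs at least two positive observations.
    """
    ys = sorted(math.log(r) for r in data)
    n = len(ys)
    half = n // 2
    # Crude initialization (assumption): split the sorted log-distances in half.
    mu1, s1 = weighted_mean_sd(ys[:half], [1.0] * half, min_sigma)
    mu2, s2 = weighted_mean_sd(ys[half:], [1.0] * (n - half), min_sigma)
    alpha = half / n
    for _ in range(iterations):
        # E-step: probability that each observation belongs to component 1.
        resp = []
        for y in ys:
            a = alpha * normal_pdf(y, mu1, s1)
            b = (1.0 - alpha) * normal_pdf(y, mu2, s2)
            resp.append(a / (a + b) if a + b > 0.0 else 0.5)
        w1 = sum(resp)
        if w1 < 1e-9 or n - w1 < 1e-9:
            break  # one component has collapsed; keep the last estimates
        # M-step: update the mixing proportion and the component parameters.
        alpha = w1 / n
        mu1, s1 = weighted_mean_sd(ys, resp, min_sigma)
        mu2, s2 = weighted_mean_sd(ys, [1.0 - g for g in resp], min_sigma)
    return alpha, mu1, s1, mu2, s2

# Hypothetical DTD values (km) with a short-trip and a long-trip cluster.
dtd = [7.0, 9.0, 10.0, 12.0, 14.0, 120.0, 140.0, 150.0, 165.0, 180.0]
print(fit_lognormal_mixture(dtd))
```

EM can converge to different local optima depending on the starting values, so a production fit would normally try several initializations.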
The p-value is also tested here, and 34 vehicles could be represented by the mixture model at the 95% confidence level. However, even though more vehicles fit the mixture model, the p-value cannot compare the goodness of fit between the single distributions and the mixture model. Hence, to evaluate whether the mixture model fits the DTD data more closely, AIC is applied to determine the goodness of fit. Since the goal of this study is to find the distribution model that best fits DTD, we compare the AIC values of the distribution models for each vehicle instead of considering a p-value with a certain confidence level. Table 2 shows the proportion of vehicles that each distribution or mixture model fits best in terms of the AIC. Once the lognormal mixture model is included in the analysis, it clearly represents the DTD most accurately: thirty-nine out of 48 vehicles (81.25%) are best explained by the mixture model. In this case, AIC helps to determine the best fitted form among the 6 alternatives. The results demonstrate that the lognormal mixture model better replicates DTD trends for NU's car-sharing system.
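The tabulation behind a table of this kind, where each vehicle is assigned to the model with the lowest AIC and the shares are then counted, can be sketched as follows. The AIC values and vehicle IDs below are hypothetical placeholders, not the paper's results.

```python
from collections import Counter

# Hypothetical AIC values per vehicle and candidate model (lower is better).
aic_by_vehicle = {
    "V01": {"normal": 812.4, "lognormal": 760.1, "gamma": 765.3,
            "exponential": 790.0, "weibull": 762.8, "lognormal mixture": 748.9},
    "V02": {"normal": 640.2, "lognormal": 601.7, "gamma": 603.0,
            "exponential": 630.5, "weibull": 598.4, "lognormal mixture": 600.9},
    "V03": {"normal": 955.0, "lognormal": 910.3, "gamma": 915.8,
            "exponential": 940.1, "weibull": 912.2, "lognormal mixture": 899.6},
    "V04": {"normal": 720.9, "lognormal": 688.8, "gamma": 690.4,
            "exponential": 705.7, "weibull": 691.0, "lognormal mixture": 680.2},
}

def best_model_shares(aic_by_vehicle):
    """Share of vehicles for which each model attains the lowest AIC."""
    winners = Counter(min(aics, key=aics.get) for aics in aic_by_vehicle.values())
    n = len(aic_by_vehicle)
    return {model: count / n for model, count in winners.items()}

shares = best_model_shares(aic_by_vehicle)
for model, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{model}: {share:.2%}")
```

Applied to the counts reported in the text, the same calculation gives 39 / 48 = 81.25% for the mixture model.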

Factors Contributing to DTD
Even in a shared vehicle system, people have different preferences when choosing a vehicle due to factors such as the type of vehicle and its fuel consumption. Therefore, some vehicles are selected more often, resulting in more travel data. Moreover, although the university owns the vehicles, they belong to different departments, and a vehicle can only be used by the staff of its own department. As a result, some departments have more users but share fewer vehicles, which also generates an imbalance in the use of each vehicle. These imbalances are reflected in the parameters of the mixture model and might be explained by other explanatory variables, such as vehicle type.
The mixture model with 2 lognormal distributions has 5 parameters: $\sigma_i$ and $\mu_i$ ($i = 1, 2$) and the mixing proportion $\alpha$. A regression model is used here to describe the relationship between these 5 parameters and other explanatory variables. The explanatory variables include vehicle type, engine size, engine type, and the proportion of faculty members to vehicles. As mentioned above, there are 5 vehicle types in total (van, minivan, sedan, SUV, and truck); they enter the regression model as dummy variables. For engine type, we also introduced dummy variables for hybrid and diesel vehicles. The proportion of faculty members to vehicles is calculated as the number of staff members divided by the number of vehicles in the department to which the vehicle belongs.
However, these explanatory variables turned out to be insignificant for the 5 parameters, as shown in Appendix A. The small sample size could be one of the reasons. Additionally, the set of explanatory variables is very limited. Thus, the investigation of the relationship between the parameters of the mixture model and other explanatory variables remains a future research theme.

EV Adoption
As described in [21], new registrations of electric cars reached a record in 2016, with over 750,000 sales worldwide. The EV market continues to expand, and the choice of EV when substituting conventional vehicles has gained increasing public attention.
The results of the distribution analysis can be used to determine the probability of a particular travel distance. Therefore, once the type of EV has been chosen, the ability of the EV to replace the conventional vehicle at a given distance can also be determined.
The DTD of each vehicle was tested with the 5 distributions and the mixture model. Even though the mixture model performed best overall, some vehicles still fit a single distribution better than the mixture model. DTD is considered here as a representative element of driving patterns and is used to determine which vehicles from the original car-sharing system can be replaced by EVs, based on the best fitted form and on the observed data at the 95% satisfaction level. The extent of replacement for different driving ranges is shown in Figure 3. The line chart shows quite a close trend between the best fitted form and the observed data, especially when the DTD is within 100 km. For a given gasoline vehicle in the car-sharing system, the 95th percentile of its DTD can be determined from both the best fitted form and the observed usage frequency. This helps us to verify the driving range, and hence the battery capacity, that an EV needs. The alternative EVs are selected based on this DTD demand, using either the best fitted form or the observed usage frequency. As shown in Figure 3, 50% of the vehicles in the car-sharing system can be replaced by an EV with a 145 km driving range according to the observed data, or by an EV with a 164 km driving range according to the best fitted form. To obtain 100% replacement, the EV must have a driving range longer than 716 km according to the observed usage frequency, or 757 km based on the best fitted form. However, considering that no EV with such a long range exists yet, along with other economic and environmental reasons, we choose the following 2 types of EV as alternatives for substitution.
Based on the best fitted form for each vehicle, we used two types of EVs, the MITSUBISHI i-MiEV (Japanese cycle) (Type 1) and the Tesla Model 3 (Type 2), to test the ability of EVs to replace conventional vehicles. The driving ranges of these EVs are 160 km and 350 km, respectively, and the proportion of conventional vehicles in the NU car-sharing system that can be substituted by EVs at the 95% confidence level is shown in Table 3. A larger battery in the EV no doubt means that more conventional vehicles can be replaced. However, neither type can meet the travel demand of all conventional vehicles. Unlike with private vehicles, the gasoline vehicles remaining in the system after substitution can still serve longer trips. Therefore, the sharing system can still fulfill varied travel demands.
Even though a longer driving range naturally leads to a higher substitution rate, driving performance and driving cost differ between EVs. Thus, a higher substitution rate is not simply better, especially considering that the EVs are used not only for travel but sometimes also as a power source, and considering the financial difference between consuming gasoline and electricity. As seen in Table 4, Type 1 can substitute 23 out of 48 vehicles, while Type 2 can substitute 30 vehicles. Substitution with Type 2, however, also implies that more electricity is consumed when traveling the same distance. Thus, considering that Type 1 can travel farther than Type 2 on the same amount of electricity, the impact of substitution, such as the availability of discharging and the change in driving cost, requires further examination.
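The observed-data version of this replacement check can be sketched as follows: a vehicle counts as replaceable when the 95th percentile of its DTD does not exceed the EV's driving range. The nearest-rank percentile rule, the vehicle IDs, and the DTD values are assumptions made for illustration; the paper does not state which percentile convention it uses.

```python
import math

def percentile(values, q):
    """q-th percentile (0 < q <= 100) by the nearest-rank method (assumed convention)."""
    xs = sorted(values)
    rank = max(1, math.ceil(q * len(xs) / 100))
    return xs[rank - 1]

def replaceable_share(dtd_by_vehicle, driving_range_km, q=95):
    """Share and IDs of vehicles whose q-th percentile DTD fits within the range."""
    ok = sorted(v for v, dtd in dtd_by_vehicle.items()
                if percentile(dtd, q) <= driving_range_km)
    return len(ok) / len(dtd_by_vehicle), ok

# Hypothetical fleet: 20 driving days per vehicle (km per day).
fleet = {
    "V01": [10.0] * 19 + [400.0],          # one rare long trip
    "V02": [10.0] * 18 + [200.0, 400.0],   # long trips on 10% of days
}

for range_km in (160, 350):  # driving ranges of the two EV types in the text
    share, ids = replaceable_share(fleet, range_km)
    print(range_km, share, ids)
```

Sweeping the driving range over a grid of values and plotting the resulting share produces a curve of the kind described for Figure 3.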

The Change of Travel Cost after Substitution
Type 1 and Type 2 can substitute 23 and 30 conventional vehicles, respectively. Thus, the 2 types of EV lead to 2 substitution scenarios. The change in cost consists of the travel cost and the purchase cost.
The calculation of an EV's travel cost is based on its driving performance (km/kWh) and the electricity cost (yen/kWh). The price of electricity is taken from Nagoya University's energy use website. In the scenario using Type 1 as the alternative, 23 vehicles can be substituted, and the average travel cost of these vehicles is shown in Table 4, as is that of the 30 conventional vehicles in the Type 2 scenario. The travel cost of Type 2 is higher than that of Type 1, which is understandable since Type 2 consumes more electricity over the same distance. Although this leads to a higher travel cost per EV, the total financial saving is not smaller, because more vehicles are converted to EVs. According to the observed data, if the conventional vehicles that qualify at the 95% satisfaction level are converted to EVs, Type 1 reduces the total travel cost of the fleet by 680,790 Japanese yen per year, while Type 2 reduces it by 981,085 Japanese yen. Because traveling on electricity is cheaper than traveling on gasoline, substituting EVs for conventional vehicles is recommended.
Even though both types decrease the travel cost, the cost of purchasing EVs also needs to be considered. The price of Type 1 is 3,000,300 Japanese yen, while that of Type 2 is 3,800,000 Japanese yen, and the average price of conventional vehicles is also shown in Table 2. Although the price difference between Type 1 and Type 2 is not large, the cost difference grows when all substituted conventional vehicles are considered. In scenario 1, it would take only approximately 1.67 years for the travel cost savings to offset the extra cost of purchasing Type 1. However, Type 2 would take 30.79 years to reach the same point. Thus, Type 1 is economically better than Type 2, considering both travel cost and purchase cost.
The substituted vehicles were used on only around one-third of the days in a year; on most days they were parked in the school garage. This offers the possibility of using the EVs as an electricity supply on the roughly two-thirds of days on which they are parked.
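The break-even reasoning can be made explicit with a small calculation. The sketch assumes that the break-even time is the extra purchase cost of the EVs divided by the annual travel-cost saving; the paper does not state its formula, so this reading is an assumption. Under that reading, the reported savings and break-even times imply the total extra purchase cost of each scenario.

```python
def payback_years(extra_purchase_cost, annual_saving):
    """Years until cumulative travel-cost savings equal the extra purchase cost."""
    return extra_purchase_cost / annual_saving

def implied_extra_cost(years, annual_saving):
    """Extra purchase cost implied by a reported break-even time (same assumption)."""
    return years * annual_saving

# Annual savings and break-even times as reported in the text (Japanese yen, years).
type1_extra = implied_extra_cost(1.67, 680_790)
type2_extra = implied_extra_cost(30.79, 981_085)

print(f"Type 1 implied extra purchase cost: {type1_extra:,.0f} yen")
print(f"Type 2 implied extra purchase cost: {type2_extra:,.0f} yen")
```

The two implied totals differ by more than an order of magnitude, which is why the small per-vehicle price gap between the two EV types still produces very different break-even times.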

Available Electricity from Sharing System
With the one-year DTD data from the car-sharing system and the list of conventional vehicles that can be substituted with EVs, and assuming that every EV is charged immediately after check-in, we calculated the average electricity used for traveling in each hour of the day over that one-year period; the results are shown in Figure 4. Type 2 requires more electricity for traveling, both because it replaces more vehicles and because it consumes more electricity over the same distance. Even though 13:00 is the peak hour for vehicle use according to the descriptive analysis in Section 2, the peak hour for charging is actually 17:00, because most vehicles are back in the garage by then.
On the other hand, the electricity charged to the vehicles could be discharged to the grid in order to shave the peak of general electricity demand at the university. Here, we assume that the discharging power of Type 1 and Type 2 is the same when they supply electricity. According to Erdogan et al. [22], the discharging power varies between vehicles and can be up to 10 kW, so 10 kW is used here as the discharging power for both types of EV, and both EVs take 5 h to be fully charged. The university's electricity consumption is recorded every half hour and shown on the university website; the hourly average peaks at 14:00, at 13,561.25 kWh. Figure 5 shows the average electricity available at 14:00 in each month from the EVs of the sharing system parked at the garage. We collected 20 days of electricity use from Nagoya University as a reference; even in June, the lowest month, the EVs can still provide 1.26% of total electricity demand. Considering that the peak-shaving need is 0.3% of total demand, since the university's contract with the electricity company is based on the maximum kWh usage, the car-sharing system with EVs can reduce the burden of the electricity bill.
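The peak-shaving arithmetic can be illustrated with the figures given in the text: a discharging power of 10 kW per EV, a peak hourly demand of 13,561.25 kWh at 14:00, and a peak-shaving need of 0.3% of demand. The sketch assumes that each parked EV can sustain the full 10 kW for one hour and holds enough stored energy to do so; the count of nine parked EVs is used only as an illustrative input.

```python
import math

def v2g_share(n_parked_evs, discharge_kw, hours, demand_kwh):
    """Share of demand covered by parked EVs discharging at constant power."""
    return n_parked_evs * discharge_kw * hours / demand_kwh

def evs_needed(target_share, demand_kwh, discharge_kw, hours=1.0):
    """Smallest number of EVs whose discharge covers the target share of demand."""
    return math.ceil(target_share * demand_kwh / (discharge_kw * hours))

PEAK_DEMAND_KWH = 13_561.25   # hourly average at 14:00, from the text
DISCHARGE_KW = 10.0           # assumed discharging power per EV, from the text
PEAK_SHAVING_SHARE = 0.003    # 0.3% of total demand, from the text

print(evs_needed(PEAK_SHAVING_SHARE, PEAK_DEMAND_KWH, DISCHARGE_KW))
print(f"{v2g_share(9, DISCHARGE_KW, 1.0, PEAK_DEMAND_KWH):.2%}")
```

Under these assumptions, covering 0.3% of the 14:00 demand requires about 40.7 kWh in that hour, so any number of parked EVs at or above the first printed value would meet the stated need.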

Reduction of Carbon Dioxide
Electric vehicles are not truly zero-emission; they indirectly produce carbon dioxide, because emissions arise at power plants when the electricity is generated. One liter of petrol produces 2.3 kg of CO2 when burnt [23], while according to the Ministry of the Environment, Chubu Electric Power, a Japanese electric utility, generates 0.496 kg of CO2 for every kWh of electricity produced. In this way, the emissions of the observed system can be compared with those of the system with different types of EVs.
The original car-sharing system emitted 46,133.99 kg of CO2 over a year; substituting Type 1 reduces these emissions by 18.98%, while substituting Type 2 reduces them by 24.06%.
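The emission comparison follows directly from the two emission factors quoted in this section. The sketch below uses those factors; the fuel and electricity quantities in the example are hypothetical placeholders, since the paper's per-vehicle consumption data are not reproduced here.

```python
PETROL_KG_CO2_PER_LITRE = 2.3    # from the text [23]
GRID_KG_CO2_PER_KWH = 0.496      # from the text (Chubu Electric Power)

def fleet_emissions_kg(petrol_litres, electricity_kwh):
    """Annual CO2 of a mixed fleet: tailpipe emissions plus grid emissions."""
    return (petrol_litres * PETROL_KG_CO2_PER_LITRE
            + electricity_kwh * GRID_KG_CO2_PER_KWH)

def reduction_share(baseline_kg, new_kg):
    """Fractional reduction relative to the baseline."""
    return 1.0 - new_kg / baseline_kg

# Hypothetical example: substitution removes 6,000 L of petrol and adds 8,000 kWh.
baseline = fleet_emissions_kg(petrol_litres=20_000, electricity_kwh=0)
after = fleet_emissions_kg(petrol_litres=14_000, electricity_kwh=8_000)
print(f"{reduction_share(baseline, after):.2%}")
```

The reduction depends on both the petrol displaced and the electricity added, so an EV type that replaces more vehicles but uses more electricity per km does not reduce emissions in proportion to the number of vehicles replaced.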

Discussion
In the first part of the distribution test, the p-value indicated that the lognormal distribution performed best among the 5 alternatives. However, AIC was used once the mixture model joined the comparison, and if we consider only the 5 single-distribution forms under AIC, the Weibull distribution actually performed better than the others; 28 out of 48 vehicles were better represented by it. In other words, the p-value alone could not determine the best fitted single-distribution form. Yet the lognormal mixture model with 2 components performed best based on both the p-value and the AIC. Li et al. [16] pointed out that vehicles are used by different drivers, and therefore categorized their data by type of driver. In this study, the DTD of one vehicle is a combination of different drivers. Even though driver types were not explicitly defined when applying the mixture model, the lognormal mixture model still fit the DTD data better than the 5 single-distribution forms.
The reason why the lognormal mixture distribution fit better than the single-distribution functions might be that the mixture of 2 lognormal distributions has 2 peaks, which can better capture the behavior of different drivers. However, it has not been verified that a mixture model with more components achieves a better goodness of fit. Even though the mixture model can provide a better explanation for patterns in DTD, how the data should be categorized still needs to be considered. In the study by Li et al. [16], the key categorical factor is the type of driver; however, in a car-sharing system, since drivers vary, elements other than the driver might be better for categorization. In this study, even though we failed to find any statistically significant explanatory variables in the regression analysis of the mixture model parameters, we still believe that department or occupation type could be a way of clustering for a sharing system belonging to a given organization. In addition, type of vehicle and parking spot could also be key factors for categorizing the data and combining them into a mixture model, especially for a public car-sharing system. Determining the key categorical factor could substantially improve the mixture model.
Even though only 2 types of EV are used here, the percentage of vehicles replaceable by EVs with different driving ranges could serve as a reference for both consumers and manufacturers when considering battery size.

Conclusions
As discussed above, DTD, as an important element of the usage pattern of each vehicle, cannot be modeled with one single-distribution form, especially for NU's car-sharing system, since these vehicles belong to different departments and are shared by multiple users. Based on the tested distributions, the lognormal distribution achieves the best fit among the 5 single-distribution functions, and we believe it is also the best component for the mixture model. In the subsequent test, the mixture model, which consists of 2 lognormal distributions, performed better than the 5 single distributions, as expected, and the usage pattern is better represented by the lognormal mixture model. The results of regression can help in managing the type of vehicles in the system.
Additionally, after testing 2 types of EVs, the results of the substitution imply that the replacement of gasoline-powered vehicles with EVs in the car-sharing system has great potential. With EVs in the car-sharing system, the system remains workable for various driving demands. Moreover, the new system is able to reduce travel cost to some extent in both scenarios, and it can also provide electricity back to the grid for common use and reduce CO2 emissions. Thus, we believe that this type of car-sharing system has great potential to be applied to other organizations or companies.

Conflicts of Interest:
The authors declare no conflict of interest.
Data Availability: The daily travel distance data of the university fleet used to support the findings of this study have not been made available because the university provided the data exclusively to our research group for internal use.