Article

Campus Shuttle Bus Route Optimization Using Machine Learning Predictive Analysis: A Case Study

by Rafidah Md Noor 1,2,*, Nadia Bella Gustiani Rasyidi 1, Tarak Nandy 1,* and Raenu Kolandaisamy 1,3,*
1 Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur 50603, Malaysia
2 Centre for Mobile Cloud Computing, Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur 50603, Malaysia
3 Institute of Computer Science and Digital Innovation, UCSI University, Kuala Lumpur 56000, Malaysia
* Authors to whom correspondence should be addressed.
Sustainability 2021, 13(1), 225; https://doi.org/10.3390/su13010225
Submission received: 21 September 2020 / Revised: 19 October 2020 / Accepted: 21 October 2020 / Published: 29 December 2020
(This article belongs to the Special Issue Sustainable Public Transport System)

Abstract

Public transportation is a vital service that enables a community to carry out its daily activities, and the bus is one of the most widely used modes of mass transportation. The smart transportation concept is an integrated application of technology and strategy in the transportation system, and smart concepts of this kind are central to the application of the Internet of Things. Improving the management of the transportation system has become a bottleneck for traditional data analytics solutions, and machine learning is one of the answers. This paper uses the Artificial Neural Network (ANN) and Support Vector Machine (SVM) algorithms to predict travel time with a low error rate in a case study of a university shuttle bus. Apart from predicting the travel time, this study also considers the fuel cost and gas emission of the transportation. The analysis of the experiment shows that the ANN outperformed the SVM. Furthermore, a recommender system is used to recommend suitable routes for the chosen scenario. The paper closes with a range of future directions for the field of study.

1. Introduction

Over the last decade, technology has enabled intelligent systems, through sensing devices, to analyze and improve the transportation system. One of the tools used to track the location of a vehicle is the Global Positioning System (GPS). GPS devices can be used for navigation not only in buses but also in airplanes, other vehicles, smartphones, and other mobile devices [1]. In addition, in the past few years, the demand for public transportation has increased significantly to help people commute. Combining technologies such as the Internet of Things (IoT) and the Intelligent Transportation System (ITS) with the automobile brings both automation and security [2]. However, improving the management of the transportation system has become a bottleneck for traditional data analytics solutions, and machine learning is one of the answers. Machine learning is a data-driven branch of Artificial Intelligence that can meet the latest system requirements. It learns patterns from historical data to model the performance of the system, so that the system can respond and run automatically based on the analytical model. Today, machine learning is becoming popular in transportation systems, where it supports robust analysis focused on prediction methods. Moreover, machine learning techniques have become an integral part of accomplishing the smart transportation system [3] for transportation sustainability. Therefore, to obtain the best-quality travel time prediction for a shuttle bus, machine learning techniques with a low error rate are used in this study. Furthermore, the study illustrates the environmental benefits by estimating the fuel consumption and gas emission in the given case study.
The study of the model is tested on the shuttle bus [4] of the University of Malaya [5], which is a public research university situated in the center of Kuala Lumpur, Malaysia, over 309 hectares of land. The University of Malaya is populated with a total of 24,769 local and international students [6]. The university has a capacity of 12 residential colleges and one international house for accommodating university students. Most of the residential arrangements are situated on campus, except for a few buildings. However, around 10% of students come from outside of the university; approximately 7% of students live in their own accommodation near the university. On the other hand, the main mode of transportation in the campus is the shuttle bus arranged by the university. Two of the shuttle buses go outside of the campus to connect the outside accommodated students. Alternatively, two different paid public buses provide their services outside of the main university area (UM Sentral). Moreover, a few students use their own arrangements for transportation. However, as the prime transportation medium for university students is the shuttle bus, the sustainability plan for transport is essential, encouraging the current study.
The main contributions of the paper are as follows:
(1) Robust machine learning algorithms such as Artificial Neural Network (ANN) and Support Vector Machine (SVM) are used to predict the travel time of the shuttle bus on a chosen scenario.
(2) The estimation of fuel cost in a preferred period is shown in the study.
(3) Gas emission of the vehicle in a period of time is also considered in this learning.
(4) A range of future scopes is demonstrated to show the path of the upcoming direction of the stipulated field of study.
The acronyms used in this article are listed in Table 1. The rest of the document is organized as follows. Section 2 discusses the recent related work that aligns with the scope of the paper. Section 3 elaborates on the main idea of the research, which is the optimization of the bus route using data analysis. Section 4 presents the experimental results and analysis of the proposed work. Section 5 evaluates the proposed research work and analyzes the results, followed by the conclusion and future research scope in Section 6.

2. Theoretical Framework of Related Research

In recent decades, many researchers have used artificial neural networks and support vector machines to analyze the travel time of vehicles. Jiang [7] predicted the travel time of a whole bus route and of a route segment using three-layer neural networks with different numbers of input and hidden units. GPS, real-time traffic, and primary bus information data were used in the experiment, and the Mean Squared Error loss function, which indicated the result of positive and negative errors, was tested. With the training alpha set to 0.5, the comparison of the prior data with the GPS data achieved a higher accuracy. Jindal et al. [8] implemented a unified neural network algorithm to estimate the travel time of a taxi, incorporating a multi-layer perceptron with two hidden layers that was trained to minimize the loss over a large number of epochs. The study showed that the Spatio-Temporal Neural Network (STNN) (unified learning) could reduce the mean absolute error by up to 17% for travel time prediction. Junyou et al. [9] applied a support vector machine to predict bus travel time while accounting for many random factors such as weather, traffic congestion, and passenger flows. They suggested that SVM can be used to analyze predictive objects and predict unknown data or new phenomena, and explained that the SVM has a high accuracy of learning and generalization. In their study, the relative error between actual and predicted data lay between −0.5 and 0.5, which is considered a low error rate and a high accuracy level. Mingheng et al. [10] proposed a regression algorithm in SVM, known as Support Vector Regression (SVR), to predict the arrival time of a vehicle. They showed that the SVR could reduce not only relative mean errors but also root mean square errors in predicting travel time. They selected the portion of the data that had no corrupt or missing values. The SVR prediction used a kernel parameter equal to 0.01 and c = 1000, and the result was good because it could overcome local minima, with a prediction error of 5%. Overall, the results showed the strong ability of the SVR model for traffic data analysis [11]. Recently, Md Noor et al. [12] used the SVR model to predict the urban bus arrival time and showed a significant prediction ability with a low average error. On the other hand, Nandy et al. [13] applied an interpolation model to predict the vehicle arrival time from a statistical dataset. In that study, the predicted vehicle position lay within 4 m of the actual location.
Besides the travel time, the optimization of transportation is an essential topic highlighted by recent researchers. Wu et al. [14] proposed a mechanism of a control system in a vehicle using holding and speed control to improve the level of service, while Qin et al. [15] proposed a route optimization for the emergency services to enhance sustainability, utilization, and timeliness. Alternatively, Cyril et al. [16] projected a performance optimization with the help of a hierarchical process for both the user and resources. Likewise, Leyerer et al. [17] showed that most of the previous research was focused on the single-tailored vehicle optimization problem. Therefore, they proposed a customized vehicle routing problem for road transportation optimization. On the other hand, via vehicle-mounted GPS systems, large and redundant vehicle trajectory data are continually sent to the server, creating a variety of sustainable problems such as computation, connectivity, and storage. To address this issue, Chen et al. [18] proposed an online vehicle trajectory compression technique facilitated by two phases, namely mapping and compression. Recently, Ciesla et al. [19] proposed a mathematical model of transport optimization to understand the mobility demand and multicriteria decision making of passengers.
The cost of fuel rises with the increased number of vehicles on the road. Therefore, recent trends in transportation sustainability tend toward energy cost reduction in creative ways [20]. Tian et al. [21] argued that the fuel monitoring platform does not show accurate data due to interference, noise, and collision errors, even though a high-precision fuel level sensor detects the level of fuel in a vehicle. Therefore, they proposed an effective and efficient method of repairing and cleaning the data to allow for a more precise estimate of fuel consumption. Dioha and Kumar [22] explained five different pathways for the Nigerian transport sector to reduce energy consumption and gas emission over the period 2010–2050: improved fuel economy, carbon tax, modal shifting, fuel switching, and improved logistics. Hu et al. [23] proposed a cost-effective energy management strategy to monitor and manage the fuel cell/battery consumption in a vehicle. Rivera-Gonzalez et al. [24] studied the energy and fuel demands of Ecuador’s sustainable road transport for 2016–2035; the result shows that the energy demand for the chosen scenario decreased by 12.14%. Finally, Yang and Liu [25] presented a mechanical model for the energy consumption of different vehicles depending on driving style, road condition, and vehicle type.
Transportation is the backbone of a country's development because it makes movement and the supply chain smoother. However, transportation produces greenhouse gas emissions, which are, in turn, a threat to the environment. Significant research on the gas emission of transportation has been conducted in the last decades. In 2017, Salehi et al. [26] proposed a bi-objective mixed-integer nonlinear programming (MINLP) model to reduce the cost and gas emission from transportation. Llopis-Castelló et al. [27] conducted a study to observe the CO2 emission of a vehicle under geometric design consistency. In the same year, Penazzi et al. [28] designed a framework to reduce transportation carbon discharge to ensure ecosystem sustainability. Recently, Golebiowski et al. [29] proposed a mathematical model of pro-ecological distribution on a network to minimize the carbon footprint of a journey. Huang et al. [30] experimented on large-scale transportation data and established a calculation for energy consumption and gas emission based on a chosen scenario. Shimizu et al. [31] recently used a technique called the third-generation wireless in-wheel motor in electric vehicles. As the driving resistance is reduced due to the light weight of the battery, the CO2 emission is also reduced. Similarly, Wang and Wen [32] proposed a numerical benchmark test based on an adaptive genetic algorithm (AGA) to study the low carbon vehicle routing problem. As a whole, the recent research focuses not only on the cost-benefit and the time complexity but also on environmental sustainability.
The research on the prediction model of the vehicle timing is growing gradually. Moreover, a range of studies focuses only on the accuracy of the timing of prediction. To continue with the journey, this article shows the experimental analysis on the shuttle bus arrival time based on a case study. Moreover, data analysis and machine learning algorithms such as SVM and ANN are used. Additionally, the overall fuel cost and the gas emission are estimated to enhance the understandability of the identified field.

3. Data and Methods

The bus route optimization techniques are discussed in this section. The process of data collection, the travel time prediction models, the optimization for cost and environment, and the current system are presented. Additionally, the datasets and feature selection procedures are explained in detail.

3.1. System Model

This research tests the effectiveness of the techniques on the University of Malaya shuttle bus. Therefore, this paper used the route, schedule, and other characteristics of the aforesaid bus as a case study. To elaborate further, the discussion of the time frame and all the constraints are given in this part of the paper.
This research analyzed the shuttle bus system in the odd semester from September 2019 to December 2019. Moreover, the analysis only focused on the last learning week of the semester. However, the research excluded the exam week because the routes were different from those of the study week. The various routes of the buses are provided in Figure 1.
The University of Malaya provides five available journeys based on the routes. There are Bus A, Bus B, Bus C, Bus D, and Bus E. The detailed journey is discussed in the following section.
Bus A and B each stop at eight shelters in every service round, and their journeys cover 6.64 km and 5.61 km in total, respectively. Bus A operates every twenty minutes, from 07.30 until 21.00, Monday to Friday, with a break from 12.00 to 15.00 for lunch, prayer, and rest. Both routes allow passengers to commute inside the University of Malaya through several faculties and student dormitories. Bus C connects students from outside the campus; its service interval is slightly different, as the bus operates every thirty minutes, from 07.30 to 21.00. The overall journey distance for Bus C is 7.44 km. Bus D and E connect the students to the international house and student dormitories (Kolej Kediaman 9), with journey distances of 7.44 km and 7.19 km, respectively. The yearly averages of trips, kilometers traveled, and passengers for all the buses are represented in Figure 2.

3.2. Research Design

This research is quantitative. It applies numerical methods, using a statistical approach and analysis of the variables to gain insight. Two machine learning techniques were applied to predict the travel time of the shuttle bus in the case study and were compared to find the prediction with the lower error rate. Moreover, a numerical optimization model was used to identify the cost of operation and the production of carbon dioxide, which impacts the environment, based on the average travel time. The independent variable in this study is the GPS data, gathered from the GPS devices attached to the buses, and the dependent variables are travel time, cost, and environmental impact (production of carbon dioxide). The relationship between the dependent and independent variables is analyzed and shown in this paper. The raw dataset has 13 attributes, and each attribute contains 10,328,710 instances. The available dataset is split into training and testing datasets for the models in this study.

3.3. Research Conduct

The GPS coordinate data, along with the time, device id, and speed, is collected from the GPS device, which is attached to the buses, with five-second intervals. As per the research requirements, the data were collected and analyzed from September 2019 to December 2019, within one semester period, in the main route. In addition, artificial neural networks and support vector regression were used to conduct the research. The research procedures can be summarized as follows:
i. The data collected from GPS are stored in a MySQL server and converted to a CSV file to run in the R programming language.
ii. The pre-processed data runs in R Studio and selects the useful data in an appropriate format.
iii. Exploratory data analysis is implemented before conducting data modeling and feature extraction.
iv. Modeling the training and testing with machine learning techniques for travel time analysis in the UM shuttle bus.
v. Evaluating the performances based on the result and providing solutions.

3.4. A Detailed Discussion of the Data

This section elaborates on the importance and readiness of data for conducting the proposed research. Moreover, a thorough explanation of the preparation of the data for usefulness is shown in this part of the research. The schema for data preparation is pictorially represented in Figure 3 and the details are clarified below.

3.4.1. Data Collection

All the buses were equipped with GPS devices that report the location of the bus at a specific interval. Furthermore, the buses were connected to 4G and Wi-Fi networks to transmit the data in real time. The data were also stored in a memory unit on the bus in case of network problems and for backup. The data were collected in a secure MySQL database with mirror backup. The GPS devices gathered data only during the scheduled times; the rest of the time, the buses were mostly in the terminus. A sample of the dataset is presented in Figure 4a. These data were then used for prediction and modeling with two machine learning algorithms: ANN and SVM.

3.4.2. Data Pre-Processing

For the data pre-processing phase, the dataset was cleaned using the R programming language. A threshold of 20% was set for missing data, and the dataset was cleaned based on this cut-off. Moreover, the time was divided into two categories, namely peak hours (from 08.00 a.m. to 10.00 a.m. and 05.00 p.m. to 07.00 p.m.) and non-peak hours (the remaining period). The next step was mapping the position of the buses, grouping the collected data based on id, and sorting the timestamps in ascending order.
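The peak/non-peak labeling and the group-and-sort step can be sketched as follows. This is a minimal Python illustration, not the authors' R code: the half-open window boundaries, the use of `device_id` as the grouping id, and the sample records are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Peak windows as stated in the text: 08.00-10.00 and 17.00-19.00.
# Treating them as half-open intervals [start, end) is an assumption.
PEAK_WINDOWS = [(8, 10), (17, 19)]


def is_peak(ts: datetime) -> bool:
    """Return True when the timestamp falls inside a peak window."""
    hour = ts.hour + ts.minute / 60 + ts.second / 3600
    return any(start <= hour < end for start, end in PEAK_WINDOWS)


def group_and_sort(records):
    """Group GPS records by device_id and sort each group by fix_time."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["device_id"]].append(rec)
    for recs in groups.values():
        recs.sort(key=lambda r: r["fix_time"])
    return dict(groups)


# Hypothetical records, for illustration only.
records = [
    {"device_id": 2, "fix_time": datetime(2019, 12, 2, 8, 30, 5)},
    {"device_id": 1, "fix_time": datetime(2019, 12, 2, 11, 0, 0)},
    {"device_id": 2, "fix_time": datetime(2019, 12, 2, 8, 30, 0)},
]
grouped = group_and_sort(records)
labels = {dev: [is_peak(r["fix_time"]) for r in recs] for dev, recs in grouped.items()}
```

Here `labels` marks both fixes of device 2 as peak and the single fix of device 1 as non-peak.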

3.4.3. Exploring the Data

This part of the study used OpenStreetMap (OSM) to match the collected GPS data with the real-world map. Map matching is a process used to pair the latitude and longitude locations detected by the GPS with the road network. This process is performed to get the exact distance to each shelter along the route. A sample of the map matching of Bus A from the Open Source Routing Machine (OSRM) is shown in Figure 4b. Moreover, the map matching of all the different buses is provided in Appendix A.

3.4.4. Data Classification

Often the GPS point does not correspond precisely to the shelter’s location. This can be referred to as GPS error data. This type of error is found when the shelters are between buildings or in a forest. In that case, the bus location does not appear on the road, a situation known as off-road data. To solve this problem, the GPS location is reassigned to the nearest shelter (see Figure 5). Another type of error is backward data, which are not only off the road but also reveal a location that is earlier than the previous point. However, as the number of backward data points is low, they are not significant for the proposed work and are thus ignored. The total number of points is shown in Table 2. On-road GPS data amount to 78.76% of a total of 40,460 data points, followed by off-road and backward data.

3.4.5. Feature Selection

Each record in the data stream from the buses carries 13 attributes. The attributes are id, device_id, protocol, server_time, device_time, fix_time, valid, latitude, altitude, longitude, speed, course, and address. However, only the data that are directly correlated with the research are gathered for further use in this study. A well-accepted and standard feature selection method, the Random Forest, is used to select the five most important features. The selected features are device_id, fix_time, latitude, longitude, and speed.

3.4.6. Model Training and Testing

In this part of the study, the whole dataset was split into two crucial parts: the training dataset and the testing dataset. Training data are required to train the machine learning model to learn the pattern of the dataset. Moreover, the learning state of machine learning helps to improve the prediction of future data. On the other hand, the testing dataset was used to check the potentiality of the trained model to predict the output accurately. However, using a large training set makes a model overfit. Therefore, the ratio of the training and testing data was approximately 1:5, which means that about 20% of the whole dataset was used as a training dataset. The training and testing of the data were done over two different machine learning algorithms, namely SVM and ANN, and are described in detail in Section 4.

3.4.7. Optimizing Input Data

The optimization of the data calculates the travel time for each bus route, and the machine learning approaches are then used to predict bus travel time. To calculate the travel time in each direction, we need the data from every shelter. However, it has been observed that a bus may take more time to reach the next shelter. The time of the bus at the shelter location is given by (1).
$$T_n = T_{n-1} + \frac{X_{n-1,n}}{X_{n-1,n+1}} \times \left(T_{n+1} - T_{n-1}\right) \tag{1}$$
where $T_n$ is the GPS timestamp of the data at shelter n, $X_{n-1,n}$ is the distance between data n − 1 and n, and $X_{n-1,n+1}$ is the distance between data n − 1 and n + 1. However, it often happens that no passenger wants to board or alight at a shelter, in which case the driver just moves forward to the next shelter.
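Reading Equation (1) as a linear interpolation between the GPS fixes recorded before and after the shelter, the calculation can be sketched in Python as below. The function and argument names are hypothetical, and timestamps are assumed to be in seconds.

```python
def interpolate_shelter_time(t_prev, t_next, dist_prev_to_shelter, dist_prev_to_next):
    """Estimate the time at shelter n from the fixes n-1 and n+1.

    t_prev, t_next: timestamps of fixes n-1 and n+1 (seconds, assumed unit).
    dist_prev_to_shelter: distance X_{n-1,n} from fix n-1 to the shelter.
    dist_prev_to_next: distance X_{n-1,n+1} between fixes n-1 and n+1.
    """
    if dist_prev_to_next <= 0:
        raise ValueError("distance between the two fixes must be positive")
    fraction = dist_prev_to_shelter / dist_prev_to_next
    return t_prev + fraction * (t_next - t_prev)


# Fixes 5 s apart and 50 m apart, shelter 20 m after the first fix:
# 100 + (20 / 50) * 5 = 102.0
shelter_time = interpolate_shelter_time(100.0, 105.0, 20.0, 50.0)
```

The guard against a non-positive distance is an addition for the example; it covers the case of two identical consecutive fixes.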

Calculating Off-Road Data

The GPS may report data points that differ from the shelter location; off-road data points are detected, based on the GPS data, within a radius of 30 m from the shelter. The points may differ because the bus may not stop precisely at the bus shelter. The total number of all bus data points is 40,460, and among them, 17.25% are off-road. Therefore, the off-road bus data points need to be considered. Figure 6 illustrates the off-road data, and (8) and (9) demonstrate the calculation of the off-road data.
As per Figure 6, consider two different stops: stop 1 $(a_1, b_1)$ and stop n $(a_n, b_n)$, where $a_1, b_1$ and $a_n, b_n$ are the coordinates. Therefore, the distance between the coordinates can be represented as $d_a = (a_1 - a_n)$ and $d_b = (b_1 - b_n)$. Similarly, the slope ($m_1$) between stop 1 and stop n can be shown as (2).
$$m_1 = \frac{d_b}{d_a} \tag{2}$$
The line between stop 1 and stop n is then as follows.
$$y = m_1 x + c_1 \tag{3}$$
where $c_1$ is the constant, and the value of $c_1$ can be shown as (4) by substituting stop 1 into (3).
$$c_1 = b_1 - m_1 a_1 \tag{4}$$
On the other hand, let the other point be L1 $(d_a, d_b)$, and let $m_2$ be the slope of the line from L1 that is perpendicular to the line with slope $m_1$. Therefore, $m_2$ is shown as follows.
$$m_2 = -\frac{1}{m_1} = -\frac{d_a}{d_b} \tag{5}$$
Let the line from L1 meet the line between stop 1 and stop n at L2 $(a_{l2}, b_{l2})$. Therefore, the line between L1 and L2 is shown as follows.
$$y = m_2 x + c_2 \tag{6}$$
where $c_2$ is the constant, and the value of $c_2$ can be shown as (7) by substituting L1 into (6).
$$c_2 = d_b - m_2 d_a \tag{7}$$
Lastly, the off-road data point L2 $(a_{l2}, b_{l2})$ can be represented as follows.
$$a_{l2} = \frac{c_2 - c_1}{m_1 - m_2} \tag{8}$$
$$b_{l2} = m_1 a_{l2} + c_1 \tag{9}$$
where $c_1$ and $c_2$ are the constant values, and $m_1$ and $m_2$ signify the slopes of line 1 and line 2, respectively.
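The construction above amounts to dropping a perpendicular from the off-road point onto the line through the two stops. The Python sketch below follows the slope-intercept steps, taking the perpendicular slope as the negative reciprocal of the route slope and solving the two line equations for their intersection. It assumes coordinates can be treated as planar over such short distances and that the route segment is neither vertical nor horizontal; the function name and the sample points are hypothetical.

```python
def project_onto_route(stop_1, stop_n, off_road_point):
    """Project an off-road point onto the line through stop 1 and stop n.

    Each argument is an (a, b) coordinate pair. Returns (a_l2, b_l2).
    """
    a1, b1 = stop_1
    an, bn = stop_n
    pa, pb = off_road_point

    da = a1 - an
    db = b1 - bn
    if da == 0 or db == 0:
        # Slope-intercept form breaks down for vertical/horizontal segments.
        raise ValueError("segment must be neither vertical nor horizontal")

    m1 = db / da              # slope of the route line
    c1 = b1 - m1 * a1         # intercept of the route line
    m2 = -1.0 / m1            # slope of the perpendicular line
    c2 = pb - m2 * pa         # intercept of the perpendicular through the point

    a_l2 = (c2 - c1) / (m1 - m2)   # intersection of the two lines
    b_l2 = m1 * a_l2 + c1
    return a_l2, b_l2


# Route along y = x, off-road point at (0, 4): the foot of the perpendicular is (2, 2).
snapped = project_onto_route((0.0, 0.0), (4.0, 4.0), (0.0, 4.0))
```

A vector-projection formulation would avoid the vertical and horizontal special cases; the slope-intercept form is kept here only because it mirrors the equations in the text.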

Calculating Arrival Time

The arrival time of each bus is taken from the GPS time when the bus arrives at each shelter or at the final destination. When the bus stops at a shelter, the time of the stop is taken as the arrival time at that shelter. However, if the bus does not stop at a shelter, the arrival time at that shelter is the timestamp of the nearest location. Take the time of data point i as the arrival time at bus shelter n.
$$A_n = T_i \tag{10}$$
where $A_n$ denotes the arrival time at bus shelter n, and $T_i$ is the GPS time of data point i at the nearest location.
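One way to pick the "nearest location" is to compute the great-circle distance from the shelter to every GPS fix and take the timestamp of the closest one. The Python sketch below does this with the haversine formula. The haversine choice, the dictionary keys, the sample coordinates, and the rule of returning `None` when no fix lies within the 30 m radius mentioned for off-road detection are all assumptions for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def arrival_time(shelter, fixes, radius_m=30.0):
    """Return the timestamp of the fix nearest to the shelter.

    shelter: (lat, lon). fixes: list of dicts with "lat", "lon", "time".
    Returns None when there is no fix within radius_m of the shelter.
    """
    def dist(fix):
        return haversine_m(shelter[0], shelter[1], fix["lat"], fix["lon"])

    nearest = min(fixes, key=dist, default=None)
    if nearest is None or dist(nearest) > radius_m:
        return None
    return nearest["time"]


# Hypothetical shelter and fixes, for illustration only.
shelter = (3.1200, 101.6500)
fixes = [
    {"lat": 3.1200, "lon": 101.6510, "time": 10},  # roughly 111 m away
    {"lat": 3.1201, "lon": 101.6500, "time": 15},  # roughly 11 m away
    {"lat": 3.1203, "lon": 101.6500, "time": 20},  # roughly 33 m away
]
arrival = arrival_time(shelter, fixes)
```

With these sample fixes, `arrival` is the timestamp of the second fix, the only one inside the radius.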

Scheduled Time

The bus service is assumed to operate on schedule. Usually, the bus arrives 10 min earlier at the initial bus shelter, depending on the peak and non-peak hour. The schedule of the bus is designed so that a journey is finished in 30 min. The estimated time to travel to each shelter based on the schedule is given by (11).
$$T_{ij} = \frac{S_{ij}}{n_{ij}} \tag{11}$$
where $T_{ij}$ is the estimated time to travel from one point to another, $S_{ij}$ denotes the scheduled time to travel from shelter i to j, and $n_{ij}$ represents the total number of shelters in each route.
The evaluation and recommendation of the data and model, as per the selected case study, are explained in detail in Section 4.

3.5. Gas Emission

Outdoor air quality is decreasing, and the use of heavy-duty vehicles is increasing nowadays. In this study, one of the concerns was to calculate and reduce the gas emission produced on each route. However, unlike many countries, Malaysia is raising its diesel standard toward greener alternatives [33]. According to Tang et al. [34], the emission factors of the gases CO2, HC, CO, NOx, and PM2.5 are calculated in Equation (12), from which the CO2 equivalent of the pollutant gases is computed.
$$E_m^{total} = EF_{I,m}\, L\, H_I\, f_I, \qquad m \in \{CO_2, HC, CO, NO_x, PM_{2.5}\} \tag{12}$$
In the equation above, m represents the pollutants from the bus operation system by estimating the sum of the gas that contributes to pollutant emission. The chart of gas emission by the buses is presented in Table 3.

3.6. Cost of Fuel

The fuel consumption model is based on the study by Yang et al. [35], which considers a diesel bus manufactured before 2007. This case fits the University of Malaya shuttle bus specification. According to the authors, the fuel consumption model has already been validated. From the proposed simulation, we used the equation presented below.
The fuel consumption of a bus depends on many factors, such as the load of the vehicle, traffic, road condition, and the driving behavior of the driver [36]. However, if these factors are held constant, the average fuel consumption of a vehicle can be measured [35]. The mathematical expression for fuel consumption is shown in Equation (13).
$$\mathrm{Fuel} = \frac{P \times SFC \,/\, (\rho_{diesel} \times 1000)}{DS} \times 100 \tag{13}$$
where $P$ is the power demand of the diesel engine (kW), $DS$ denotes the travel distance of the journey (km), $SFC$ is the specific fuel consumption (g/kW), and $\rho_{diesel}$ represents the density of the fuel (kg/L).
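As a worked illustration, Equation (13) can be read as Fuel = (P × SFC / (ρ_diesel × 1000)) / DS × 100, where the factor of 1000 converts the density from kg/L to g/L and the factor of 100 expresses the result per 100 km. The Python sketch below follows that reading. The grouping of terms is an interpretation of the equation, and the input values are hypothetical rather than measurements from the study.

```python
def fuel_per_100km(power_kw, sfc, diesel_density_kg_per_l, distance_km):
    """Fuel consumption per 100 km under the stated reading of Equation (13).

    power_kw: engine power demand P (kW).
    sfc: specific fuel consumption SFC, in the unit given in the text.
    diesel_density_kg_per_l: fuel density (kg/L).
    distance_km: travel distance DS (km).
    """
    if diesel_density_kg_per_l <= 0 or distance_km <= 0:
        raise ValueError("density and distance must be positive")
    # Grams of fuel divided by grams per litre gives litres.
    litres = power_kw * sfc / (diesel_density_kg_per_l * 1000)
    return litres / distance_km * 100


# Hypothetical inputs: P = 100, SFC = 200, density = 0.85 kg/L, DS = 10 km.
example = fuel_per_100km(100, 200, 0.85, 10)
```

With these made-up inputs the function returns about 235.29, which is only a check of the arithmetic and not a figure for the University of Malaya buses.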

3.7. Tools

The tools used in this study are MySQL and the R programming language. MySQL was used to extract the GPS data before processing it in R. Firstly, the data were fetched from MySQL using extract and select statements for the one-semester period from September 2019 to December 2019. Secondly, the data were exported to a CSV file and processed in R for cleaning, modeling, and predicting. The R package CoordinateCleaner was used to clean the coordinate data. The neuralnet and e1071 packages were used to run the models for predicting travel time. Moreover, Tableau was used for the final visualization in this research.
Overall, this section introduces the process and methodology applied in the proposed scheme. Furthermore, the optimization calculation to calculate the independent variables in this study is mentioned. Additionally, the estimate of pre-processing data selection of point data in each shelter is indicated. The results of the experiments and the analysis of findings are discussed in detail in the next section.

4. Experimental Results and Analysis

In this section, the result of the quantitative analysis based on the processed GPS data is introduced to explore the shuttle bus travel-time data. This section represents the result prediction using an Artificial Neural Network model and Support Vector Machine model. In addition, the calculation of fuel and gas emission is presented in this section.

4.1. Actual Data Analysis Based on GPS Data

The stop point of the vehicles was measured after pre-processing the data components. Next, we looked at the actual time consumption for each route of the vehicles. Bus A and B commute inside the campus, and the rest travel to the outside of the campuses. However, Bus E is free of traffic because of its trajectory.
In this study, the data were explored based on two categories: peak hour and non-peak hour. The peak hour often impacts the vehicle journey because of the number of vehicles on-road. The average travel time for each journey is listed in Table 4.

4.2. Analysis Data Based on Artificial Neural Network

The multi-layer perceptron neural network was used to predict the time of the vehicle movement. In addition, this model was one of the artificial neural network algorithms used to indicate the travel time to support the intelligent transportation system. Therefore, the R programming language was applied using the neuralnet package available from CRAN to execute the prediction.
The same hidden-layer configuration was implemented for all shuttle buses to reduce bias in the results. One of the ANN models for the GPS data of Bus A is shown in Figure 7.
As per Figure 7, the travel time from one bus stop to another, symbolized as S1 until S9, was considered as the input data. However, the input depends on the number of stops for each shuttle bus journey. Consequently, the output layer was calculated based on the input layer and the hidden layer.
There were nine stops on each journey for Bus A, which were fitted with three hidden layers. In this model, 112 steps were needed to predict the travel time. The sample size used was 0.70 of the dataset. set.seed(80) was called in this model so that the same random sample is reproduced each time the model is executed again. Additionally, the same model was implemented for the other shuttle bus routes after modeling the result of different steps. See Appendix B for the figures of the ANN models of the other shuttle bus routes.
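The role of the fixed seed can be illustrated with a small Python analogue of a seeded 70% sample. This is a sketch of the idea only: the study used R, and Python's generator with seed 80 does not reproduce the sample that R's set.seed(80) would give. The function name and the toy rows are hypothetical.

```python
import random


def split_sample(rows, train_fraction=0.70, seed=80):
    """Split rows into training and testing parts using a seeded shuffle."""
    rng = random.Random(seed)          # private generator, so the split is repeatable
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    cut = round(len(rows) * train_fraction)
    train = [rows[i] for i in indices[:cut]]
    test = [rows[i] for i in indices[cut:]]
    return train, test


# Toy data: ten rows standing in for travel-time records.
rows = list(range(10))
first_run = split_sample(rows)
second_run = split_sample(rows)
```

Because the generator is re-created with the same seed on every call, `first_run` and `second_run` contain exactly the same rows, which is what makes a re-run of the model comparable with the earlier one.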
From the model, the next step was to predict the travel time. The predicted average travel time over the 14 weeks for the two categories, peak hour and non-peak hour, is shown in Table 5. The shortest travel times were for Bus E, at 813.57 s for non-peak hours and 1058.12 s for peak hours. The longest travel time was for Bus C during peak hours. Overall, the data show a delay when the buses operate during peak hours, which is to be expected due to congestion on the road. The visualization of the predicted travel time data is shown in Figure 8a.
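The peak-hour delay described above can be worked out directly from the ANN predictions in Table 5. The following Python sketch is illustrative only; the dictionary layout and the function name are assumptions, and the numbers are copied from Table 5.

```python
# ANN-predicted average travel times from Table 5: (peak hour, non-peak hour), in seconds.
ANN_PREDICTED_S = {
    "A": (1616.37, 1164.33),
    "B": (1362.23, 1043.15),
    "C": (2570.39, 1675.18),
    "D": (1975.22, 1306.24),
    "E": (1058.12, 813.57),
}


def peak_delay_s(peak_s: float, non_peak_s: float) -> float:
    """Extra travel time, in seconds, of a peak-hour journey over a non-peak one."""
    return peak_s - non_peak_s


delays = {bus: peak_delay_s(*times) for bus, times in ANN_PREDICTED_S.items()}
for bus, delay in delays.items():
    print(f"Bus {bus}: {delay:.2f} s longer in peak hours")
```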

4.3. Analysis Data Based on Support Vector Regression

Support Vector Regression (SVR) is a supervised learning method: it is the support vector machine model, which also handles classification data, applied to regression or prediction. In this study, the data were classified based on the availability of destination routes. The data were loaded and modeled in the R programming language, using the built-in SVM functions of the e1071 package. In the prediction step, the data were modeled with the available SVR function, which mapped the travel time on each route to the weekly data. The plot from the SVR prediction is shown in Figure 9a. Based on the plot, the estimated travel time is close to the real data. The values obtained from the support vector regression are shown in Table 6 and pictorially represented in Figure 8b. The chart shows that Bus C takes longer to reach its destination than the other buses, because it has the longest route and faces traffic congestion. The figures for all other buses are shown in Appendix C.
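What distinguishes SVR from ordinary least-squares regression is its epsilon-insensitive loss, which ignores prediction errors smaller than a tolerance. The sketch below shows that loss in Python as general background on SVR. The epsilon values and the travel times are placeholders, not settings or data reported in the study.

```python
def epsilon_insensitive_loss(actual: float, predicted: float, epsilon: float = 0.1) -> float:
    """SVR loss: zero inside the epsilon tube, linear in the error outside it."""
    return max(0.0, abs(actual - predicted) - epsilon)


# Hypothetical travel times in seconds, with a tolerance of 2 s.
print(epsilon_insensitive_loss(1500.0, 1501.0, epsilon=2.0))  # inside the tube
print(epsilon_insensitive_loss(1500.0, 1510.0, epsilon=2.0))  # outside the tube
```

Only the points that fall outside the tube contribute to the fit, and these become the support vectors.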

4.4. Production of Gas Emission

This part discusses the gas emission produced by the bus journeys. According to Tang et al. [34], gas emission is characterized by three classifications: high, medium, and low. The classifications are based on the fuel used by a vehicle and the type of vehicle. This study estimates the emission production of a heavy vehicle, namely a passenger bus. The university shuttle buses use diesel for fuel. The gas emission estimates are shown in Table 7 and in Figure 9b.
In the chosen case study, the gas emission production falls into the medium category, as only green diesel has been available in Malaysia for the past decade. Therefore, regarding the estimated contribution to pollution and global warming, Bus A will produce 11,200 g of CO2 equivalent in each journey. Bus B, C, D, and E will produce 9744, 12,936, 12,516, and 3276 g of CO2 equivalent per journey, respectively. The contribution of each bus has a significant impact on the environment. Bus C has the highest CO2 equivalent contribution, followed by Bus D, A, B, and E. Overall, the factors affecting the production are the distance and travel time of each journey. The shorter the distance and travel time, the less gas a bus produces.
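The per-journey estimate is the route distance multiplied by the emission factor of the chosen category. The Python sketch below assumes the factors of Table 3 and uses the distances that Table 7 lists for Buses A to D; the function and constant names are hypothetical.

```python
# Emission factors from Table 3, in grams of CO2 equivalent per mile.
EMISSION_FACTOR_G_PER_MILE = {"low": 2500, "medium": 2800, "high": 3100}


def co2_per_journey_g(distance_miles: float, category: str = "medium") -> float:
    """Estimated CO2 equivalent (g) for one journey of the given length."""
    return distance_miles * EMISSION_FACTOR_G_PER_MILE[category]


# Journey distances in miles for a subset of routes, as listed in Table 7.
distances_miles = {"A": 4.0, "B": 3.48, "C": 4.62, "D": 4.47}
for bus, miles in distances_miles.items():
    print(f"Bus {bus}: {co2_per_journey_g(miles):.0f} g per journey (medium category)")
```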

4.5. Cost of Fuel

The fuel consumed on each journey must be known to manage the shuttle bus operation. Cost management keeps track of the budget and prevents over-planning. According to Yang et al. [35], the fuel consumption of a heavy-duty vehicle is about 0.568 L/km, depending on the type of vehicle and the year of production of the bus. The estimated fuel consumption for each route is shown in Table 8.
As per Table 8, Bus A spends USD 1.57 on one journey, whereas Bus E needs only USD 0.68 for each trip. Estimating fuel in advance helps with planning, but the estimate has limitations, such as fluctuation in the fuel price and unexpected congestion on the road.
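The fuel estimate follows from the route distance, the consumption rate of 0.568 L/km, and the diesel price of USD 0.42/L given with Table 8. The Python sketch below is illustrative, with hypothetical function names. Because the price is a rounded figure, a computed cost may differ from Table 8 in the last cent.

```python
FUEL_RATE_L_PER_KM = 0.568     # heavy-duty vehicle consumption, per Yang et al. [35]
DIESEL_PRICE_USD_PER_L = 0.42  # diesel price listed with Table 8


def fuel_litres(distance_km: float) -> float:
    """Estimated diesel consumed on one journey, in litres."""
    return distance_km * FUEL_RATE_L_PER_KM


def fuel_cost_usd(distance_km: float) -> float:
    """Estimated fuel cost of one journey, in USD."""
    return fuel_litres(distance_km) * DIESEL_PRICE_USD_PER_L


# Route distances in km from Table 8.
for bus, km in {"A": 6.64, "E": 2.85}.items():
    print(f"Bus {bus}: {fuel_litres(km):.2f} L, about USD {fuel_cost_usd(km):.2f}")
```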
Overall, both the ANN and SVM perform well in estimating the travel time. The factors observed to affect travel time in this study are the distance and whether the journey takes place in peak or non-peak hours. A comparison between the ANN and SVM is shown in Figure 10. The prediction performance is better on the non-peak hour dataset, and the comparison indicates that the predictions for Routes C and E are close to the actual data. The estimated production of gas emissions is directly proportional to the time and distance traveled. Lastly, the cost of fuel consumption for the shuttle bus services follows a pattern similar to that of the gas emission production.

5. Evaluation and Discussion

The prediction based on SVM and ANN is discussed in detail in this part of the document. Besides that, the cost-effectiveness and the gas emission contribution from the vehicles are also presented in this section. Finally, the recommendation of the bus routes is provided based on the results. All the factors are shown in the following.

5.1. Time Travel Prediction Evaluation

The root mean square error (RMSE) is used to evaluate the prediction accuracy. It is calculated from the predicted and actual values of the testing dataset, following Yin et al. [37], as shown in Equation (14). The RMSE results for the ANN and SVM predictions are shown in Table 9. Overall, the RMSE of the ANN is lower than that of the SVM, and a smaller RMSE indicates a better fit. For the non-peak hour classification, the smallest RMSE is found in Bus A, with a value of 9.91 for the ANN model. The peak hour’s smallest RMSE value is found in Bus C, at 5.89 with the ANN model.
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n}\left(t_{travel} - \hat{t}_{time}\right)^{2}}{n-1}} \qquad (14)$$
where $t_{travel}$ is the actual travel time data, and $\hat{t}_{time}$ is the predicted time data.
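As an illustration, the RMSE can be computed as in the Python sketch below. It assumes the n - 1 denominator that Equation (14) appears to use, whereas many definitions of RMSE divide by n. The function name is hypothetical and the sample values are invented.

```python
import math


def rmse(actual, predicted):
    """Root mean square error with an n - 1 denominator (assumed from Equation (14))."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    n = len(actual)
    if n < 2:
        raise ValueError("at least two observations are required")
    squared_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared_error / (n - 1))


# Hypothetical actual and predicted travel times in seconds.
print(rmse([1500.0, 1600.0, 1550.0], [1510.0, 1590.0, 1555.0]))
```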

5.2. Cost of Fuel Estimation

The estimation of fuel consumption is calculated for a one-semester shuttle bus operation, covering only the period from week 1 to week 14. However, the exam week is deliberately excluded due to the different operating hours. The estimation can be useful for the evaluation of budgeting plans in the future for the shuttle bus operation.
Table 10 shows the fuel cost for a 14-week shuttle bus operation, which totals USD 7386.69.

5.3. Gas Emission Production

Transportation is a major contributor to emission production, and air pollution is one of its main environmental impacts. The estimated gas emission production for all the journeys within 14 weeks is shown in Table 11, calculated with the medium category parameter. The total production of carbon dioxide equivalent, or CO2*, over all scheduled journeys within these 14 weeks is 52,418,240 g.
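The 14-week total can be reproduced from the per-journey emissions and the daily journey counts in Table 11. The Python sketch below assumes five operating days per week, which is inferred from the ratio between the daily and weekly columns of Table 11 rather than stated in the text. The variable names are hypothetical.

```python
# Values from Table 11: journeys per day and CO2 equivalent per journey (g).
JOURNEYS_PER_DAY = {"A": 17, "B": 17, "C": 15, "D": 13, "E": 11}
CO2_PER_JOURNEY_G = {"A": 11200, "B": 9744, "C": 12936, "D": 12516, "E": 3276}
OPERATING_DAYS_PER_WEEK = 5  # assumption inferred from the day and week columns
WEEKS = 14


def semester_emissions_g(bus: str) -> int:
    """CO2 equivalent (g) produced by one route over the 14-week period."""
    per_day = JOURNEYS_PER_DAY[bus] * CO2_PER_JOURNEY_G[bus]
    return per_day * OPERATING_DAYS_PER_WEEK * WEEKS


totals = {bus: semester_emissions_g(bus) for bus in JOURNEYS_PER_DAY}
grand_total = sum(totals.values())
print(totals)
print(grand_total)
```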

5.4. Bus Route Recommendation

The evaluation result shows that the ANN model is more accurate than the SVM model on the historical GPS dataset. However, the buses produce a large amount of gas emission within the stipulated period. Therefore, based on the analysis of the results, the recommendations for the bus routes in the observed case study are given below, with the motivation for each.
(1) Combine the routes of Bus A and Bus E, because the routes overlap and carry fewer passengers.
(2) The journey of Bus A should run only as far as KK10, turn around toward the Faculty of Languages and Linguistics, and continue to KK9 (the Bus E route) to avoid redundancy.
To further illustrate this, the proposed new journey is given in Table 12, and a route comparison is shown in Figure 11.
As per Table 12, no change is recommended for Bus B, C, and D. However, merging the routes followed by Bus A and Bus E can reduce a considerable overhead, because the proposed journey removes one bus route by combining it with another. Combining Bus A and Bus E reduces the journey distance by up to 15%, which in turn shortens the travel time by 27.26% in peak hours and by 28.16% in non-peak hours. Merging these two bus routes also has a significant impact on the environment. The recommended journey reduces fuel consumption, fuel cost, and CO2 by up to 32.57%, 43.55%, and 14.34%, respectively. The detailed description is listed in Table 13.
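A reduction percentage compares the recommended value with the combined value of the two existing routes. The Python sketch below shows the calculation for the CO2 figures listed in Table 13; the function name is hypothetical, and the same helper applies to the other columns.

```python
def percent_reduction(before: float, after: float) -> float:
    """Reduction from `before` to `after`, as a percentage of `before`."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (before - after) / before * 100.0


# CO2 figures as listed in Table 13: combined A and E total, then recommended AE route.
co2_reduction = percent_reduction(14.476, 12.400)
print(f"CO2 reduction: {co2_reduction:.2f}%")
```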
The recommendation of the merged route is based on several factors, such as distance, travel time, fuel consumption, overall fuel cost, and gas emission reduction.
Persistent concern about environmental change, soil, renewable energy, clean water, and breathable air motivates sustainability and, more importantly, sustainable planning. The discussion in this section accordingly refers to sustainable transportation planning. The case study shows that the estimated operating time of a vehicle depends on many factors, such as traffic congestion, timing, vehicle frequency, and road condition. Of these, the distinction between peak hours and non-peak hours was taken into account when calculating travel time with the machine learning algorithms. Two widely used machine learning approaches were applied in the study. The RMSE results in Table 9 show the best and worst predictions for the university case study, and Table 10 lists the fuel cost of each bus over the 14-week period. Although Bus A and Bus C differ in distance and congestion, Bus A runs more frequently than Bus C, so the fuel costs of the two buses are almost the same and are the highest among the routes. As the distance traveled by a vehicle increases, its fuel consumption and gas emission rise. Accordingly, the gas emissions of Bus A and Bus C are the highest, followed by Bus B, Bus D, and Bus E. Finally, the recommendation of the bus routes is justified by considering all of these results.
Although the techniques were tested on a specific scenario, namely the university shuttle bus, the approach can be applied in other contexts, and the study can be reproduced to predict operation time, cost, and gas emission. However, the model may not be directly transferable. Several other factors, such as variation in fuel cost, environmental policies, road conditions, and traffic conditions, may need to be added to the existing model to make the prediction more precise, and their absence is a limitation of the proposed model. Because the campus is very large, two additional paid public buses connect off-campus students with the university. The main purpose of these outside buses is to connect to UM Central (the University of Malaya center point), from which students can take the free shuttle buses to any location in the university. Merging the recommended routes can save time and significantly reduce costs and gas emission, which benefits passengers and can help management promote environmental sustainability.

6. Conclusions and Future Work

Computers have contributed greatly to transport planning over the last three decades. New methods that take human activity into account have been added to the tools of transport planners, along with the ability to build more complex models for analyzing collected data. High-performance computing systems allow decision-makers to reach greater precision quickly. Likewise, modern approaches such as artificial intelligence, neural networks, and computer vision help transportation decision-makers move from a single solution to a set of alternative solutions. The main objective of this research was to study the ANN and SVM for predicting travel time, and to estimate fuel consumption and harmful gas emission. The experiment was conducted during one semester, excluding the exam period because of its different schedule times and routes. The RMSE results showed that the ANN model performed better than the SVM model. In addition, the overall result showed that the fuel cost of operation and the gas emission production over the 14 learning weeks were USD 7386.69 and 52,418,240 g, respectively. Based on the results and analysis, a suitable recommendation was made on the bus routes to enhance the quality of service.
Although the model was tested under a specific scenario, namely a university shuttle bus, it can be replicated in other contexts to promote urban sustainability. However, the research has some limitations, which suggest directions for future work. The research was conducted during learning weeks 1 to 14, excluding the study break and exam weeks. The experiment could be extended to the whole year, including all of these periods, to make more accurate predictions. In addition, several other factors, such as road condition, traffic congestion, dynamic fuel pricing, transportation policies, or weather conditions, can be examined in the model to support more precise decisions. To conclude, by estimating the travel time, fuel cost, and gas emission rate, the study can benefit the environment and society, and the approach can be applied not only to the shuttle bus but also to other service vehicles to support transportation sustainability.

Author Contributions

Conceptualization, R.M.N., N.B.G.R., T.N., and R.K.; methodology, N.B.G.R. and R.K.; software, N.B.G.R. and T.N.; validation, R.M.N. and T.N.; formal analysis, N.B.G.R., T.N., and R.K.; investigation, N.B.G.R. and R.K.; resources, N.B.G.R. and R.K.; data curation, R.M.N. and T.N.; writing—original draft preparation, R.M.N., T.N., and N.B.G.R.; writing—review and editing, T.N. and R.K.; visualization, N.B.G.R. and T.N.; supervision, R.M.N. and T.N.; project administration, R.M.N.; funding acquisition, R.M.N. and R.K. All authors have read and agreed to the published version of the manuscript.

Funding

The work was financially supported by the UCSI University Pioneer Scientist Incentive Fund (PSIF), Proj-2019-In-FOBIS-023. The authors of this article are thankful to the University of Malaya for providing all the resources for conducting this research. Moreover, the authors express their gratitude to all the reviewers for making the article more useful and attractive.

Acknowledgments

The authors of this research acknowledge the help and support of the University of Malaya (LL048-2019 Iot Monitoring & Management System on Campus Sustainable Transpiration) and UCSI University.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

This section of the document reveals the figures of the map mapping of OSRM on different buses in the chosen case study.
Figure A1. Map Mapping of Bus B.
Figure A2. Map Mapping of Bus C.
Figure A3. Map Mapping of Bus D.
Figure A4. Map Mapping of Bus E.

Appendix B

This part of the document shows the figures of the experimental results of the artificial neural network models on different buses in the chosen case study.
Figure A5. Artificial Neural Network Model for Bus B with 116 steps.
Figure A6. Artificial Neural Network Model for Bus C with 121 steps.
Figure A7. Artificial Neural Network Model for Bus D with 112 steps.
Figure A8. Artificial Neural Network Model for Bus E with 128 steps.

Appendix C

This section of the document demonstrates the figures of the experimental results of the support vector machine models on different buses in the chosen case study.
Figure A9. SVM Model of Regression for Bus B.
Figure A10. SVM Model of Regression for Bus C.
Figure A11. SVM Model of Regression for Bus D.
Figure A12. SVM Model of Regression for Bus E.

References

1. Juhari, M.N.Z.; Mansor, H. IIUM Bus on Campus Monitoring System. In Proceedings of the 2016 International Conference on Computer and Communication Engineering (ICCCE), Kuala Lumpur, Malaysia, 26–27 July 2016; pp. 138–143.
2. Nandy, T.; Idris, M.Y.I.B.; Noor, R.M.; Kiah, M.L.M.; Lun, L.S.; Juma’at, N.B.A.; Ahmedy, I.; Ghani, N.A.; Bhattacharyya, S. Review on Security of Internet of Things Authentication Mechanism. IEEE Access 2019, 7, 151054–151089.
3. Tizghadam, A.; Khazaei, H.; Moghaddam, M.H.Y.; Hassan, Y. Machine Learning in Transportation. J. Adv. Transp. 2019, 2019, 4359785.
4. University of Malaya Shuttle Bus. Available online: https://hep.um.edu.my/um-shuttle-bus (accessed on 21 December 2019).
5. Malaya, U.O. About UM. Available online: https://www.um.edu.my/ (accessed on 20 August 2020).
6. Malaya, U.O. UM Fact Sheet. Available online: https://um.edu.my/um-fact-sheet (accessed on 10 August 2020).
7. Jiang, F. Bus Transit Time Prediction Using GPS Data with Artificial Neural Networks. Carnegie Mellon University. Available online: https://www.ml.cmu.edu/research/dap-papers/F17/dap-jiang-fan.pdf (accessed on 15 October 2020).
8. Jindal, I.; Chen, X.; Nokleby, M.; Ye, J. A unified neural network approach for estimating travel time and distance for a taxi trip. arXiv 2017, arXiv:1710.04350.
9. Junyou, Z.; Fanyu, W.; Shufeng, W. Application of Support Vector Machine in Bus Travel Time Prediction. Int. J. Syst. Eng. 2018, 2, 21–25.
10. Mingheng, Z.; Yaobao, Z.; Ganglong, H.; Gang, C. Accurate Multisteps Traffic Flow Prediction Based on SVM. Math. Probl. Eng. 2013, 2013, 418303.
11. Chun-Hsin, W.; Jan-Ming, H.; Lee, D.T. Travel-time prediction with support vector regression. IEEE Trans. Intell. Transp. Syst. 2004, 5, 276–281.
12. Md Noor, R.; Seong Yik, N.; Kolandaisamy, R.; Ahmedy, I.; Hossain, M.A.; Alvin Yau, K.; Md Shah, W.; Nandy, T. Predict Arrival Time by Using Machine Learning Algorithm to Promote Utilization of Urban Smart Bus. Preprints 2020, 2020020197.
13. Nandy, T.; Noor, R.M.; Idris, M.Y.I.B.; Bhattacharyya, S. Vehicle Location Prediction System Based on Historical Data. Techrxiv 2020.
14. Wu, W.; Ma, W.J.; Long, K.J.; Zhou, H.P.; Zhang, Y. Designing Sustainable Public Transportation: Integrated Optimization of Bus Speed and Holding Time in a Connected Vehicle Environment. Sustainability 2016, 8, 1170.
15. Qin, J.; Ye, Y.; Cheng, B.R.; Zhao, X.B.; Ni, L.L. The Emergency Vehicle Routing Problem with Uncertain Demand under Sustainability Environments. Sustainability 2017, 9, 288.
16. Cyril, A.; Mulangi, R.H.; George, V. Performance Optimization of Public Transport Using Integrated AHP–GP Methodology. Urban Rail Transit 2019, 5, 133–144.
17. Leyerer, M.; Sonneberg, M.O.; Heumann, M.; Kammann, T.; Breitner, M.H. Individually Optimized Commercial Road Transport: A Decision Support System for Customizable Routing Problems. Sustainability 2019, 11, 5544.
18. Chen, C.; Ding, Y.; Xie, X.; Zhang, S.; Wang, Z.; Feng, L. TrajCompressor: An Online Map-matching-based Trajectory Compression Framework Leveraging Vehicle Heading Direction and Change. IEEE Trans. Intell. Transp. Syst. 2020, 21, 2012–2028.
19. Ciesla, M.; Sobota, A.; Jacyna, M. Multi-Criteria Decision Making Process in Metropolitan Transport Means Selection Based on the Sharing Mobility Idea. Sustainability 2020, 12, 7231.
20. Corlu, C.G.; de la Torre, R.; Serrano-Hernandez, A.; Juan, A.A.; Faulin, J. Optimizing Energy Consumption in Transportation: Literature Review, Insights, and Research Opportunities. Energies 2020, 13, 1115.
21. Tian, D.; Zhu, Y.; Duan, X.; Hu, J.; Sheng, Z.; Chen, M.; Wang, J.; Wang, Y. An Effective Fuel-Level Data Cleaning and Repairing Method for Vehicle Monitor Platform. IEEE Trans. Ind. Inform. 2019, 15, 410–422.
22. Dioha, M.O.; Kumar, A. Sustainable energy pathways for land transport in Nigeria. Util. Policy 2020, 64, 101034.
23. Hu, X.; Zou, C.; Tang, X.; Liu, T.; Hu, L. Cost-Optimal Energy Management of Hybrid Electric Vehicles Using Fuel Cell/Battery Health-Aware Predictive Control. IEEE Trans. Power Electron. 2020, 35, 382–392.
24. Rivera-Gonzalez, L.; Bolonio, D.; Mazadiego, L.F.; Naranjo-Silva, S.; Escobar-Segovia, K. Long-Term Forecast of Energy and Fuels Demand Towards a Sustainable Road Transport Sector in Ecuador (2016-2035): A LEAP Model Application. Sustainability 2020, 12, 472.
25. Yang, X.; Liu, L. A Multi-Objective Bus Rapid Transit Energy Saving Dispatching Optimization Considering Multiple Types of Vehicles. IEEE Access 2020, 8, 79459–79471.
26. Salehi, M.; Jalalian, M.; Vali Siar, M.M. Green transportation scheduling with speed control: Trade-off between total transportation cost and carbon emission. Comput. Ind. Eng. 2017, 113, 392–404.
27. Llopis-Castelló, D.; Camacho-Torregrosa, F.J.; García, A. Analysis of the influence of geometric design consistency on vehicle CO2 emissions. Transp. Res. Part D Transp. Environ. 2019, 69, 40–50.
28. Penazzi, S.; Accorsi, R.; Manzini, R. Planning low carbon urban-rural ecosystems: An integrated transport land-use model. J. Clean. Prod. 2019, 235, 96–111.
29. Golebiowski, P.; Zak, J.; Jacyna-Golda, I. Approach to the Proecological Distribution of the Traffic Flow on the Transport Network from the Point of View of Carbon Dioxide. Sustainability 2020, 12, 6936.
30. Huang, W.; Guo, Y.; Xu, X. Evaluation of real-time vehicle energy consumption and related emissions in China: A case study of the Guangdong–Hong Kong–Macao greater Bay Area. J. Clean. Prod. 2020, 263, 121583.
31. Shimizu, O.; Nagai, S.; Fujita, T.; Fujimoto, H. Potential for CO2 Reduction by Dynamic Wireless Power Transfer for Passenger Vehicles in Japan. Energies 2020, 13, 3342.
32. Wang, Z.Q.; Wen, P.H. Optimization of a Low-Carbon Two-Echelon Heterogeneous-Fleet Vehicle Routing for Cold Chain Logistics under Mixed Time Window. Sustainability 2020, 12, 1967.
33. Malaysia Moves Towards Greener Diesel. Available online: https://www.theborneopost.com/2018/12/11/malaysia-moves-towards-greener-diesel/ (accessed on 13 January 2020).
34. Tang, C.; Ceder, A.; Ge, Y.-E. Optimal public-transport operational strategies to reduce cost and vehicle’s emission. PLoS ONE 2018, 13, e0201138.
35. Yang, C.; Bibeau, E.; Molinski, T. Fuel Consumption Model for Diesel And Electric Buses Considering Bus Route and Passenger Load Variation. Proceedings of EV 2012, Montreal, QC, Canada, 22 October 2012.
36. Zhang, X.; Chen, M. Quantifying the Impact of Weather Events on Travel Time and Reliability. J. Adv. Transp. 2019, 2019, 8203081.
37. Yin, T.; Zhong, G.; Zhang, J.; He, S.; Ran, B. A prediction model of bus arrival time at stops with multi-routes. Transp. Res. Proc. 2017, 25, 4623–4636.
Figure 1. University Shuttle bus service route.
Figure 2. Average Yearly Statistics of All the Shuttle Buses of the University of Malaya. (a) Kilometers traveled, (b) Number of trips, and (c) Number of passengers.
Figure 3. The schema for Data Preparation.
Figure 4. (a) The sample of raw data from the GPS server; (b) A sample of Map Mapping of Bus A.
Figure 5. (a) GPS Data Point On-road and Off-road Illustration; (b) GPS point of backward data illustration.
Figure 6. The calculation for Off-road Data Illustration.
Figure 7. Artificial Neural Network (ANN) Model for Bus A.
Figure 8. (a) The Average Travel Time in the ANN Model; (b) The Average Travel Time Comparison in the Support Vector Regression (SVR) Model.
Figure 9. (a) Support Vector Machine (SVM) Model of Regression for Bus A: the point is the actual data, and the red cross is the predicted data by SVR; (b) Gas Emission Production.
Figure 10. (a) Comparison of ANN, SVM and Actual Model for a Non-Peak Hour; (b) Comparison of ANN, SVM and Actual Model for Peak Hour.
Figure 11. Actual routes and recommended routes. (a) Actual route of Bus A; (b) Actual route of Bus E; and (c) Recommended route of Bus AE.
Table 1. Acronyms and their descriptions.
| Acronym | Description | Acronym | Description |
|---|---|---|---|
| ANN | Artificial Neural Network | OSRM | Open Source Routing Machine |
| 4G | Fourth Generation | RM | Ringgit Malaysia |
| AGA | Adaptive Genetic Algorithm | RMSE | Root Mean Square Error |
| CRAN | Comprehensive R Archive Network | STNN | Spatio-Temporal Neural Network |
| GPS | Global Positioning System | SVM | Support Vector Machine |
| ITS | Intelligent Transportation System | SVR | Support Vector Regression |
| IoT | Internet of Things | USD | United States Dollar |
| MINLP | Mixed-Integer Nonlinear Programming | UM | University of Malaya |
| OSM | Open Street Map | Wi-Fi | Wireless Fidelity |
Table 2. Number and percentage of the points based on location and error.
| Data | Number of Data | Percentage |
|---|---|---|
| On-road data | 31,866 | 78.76% |
| Off-road data | 6979 | 17.25% |
| Backward data | 1615 | 3.99% |
| Total | 40,460 | 100% |
Table 3. Gas Emission/CO2 Equivalent.
| CO2 Equivalent | Gas Emission Generated by Buses (g/mile) |
|---|---|
| Low | 2500 |
| Medium | 2800 |
| High | 3100 |
Table 4. Actual Average Time Travel Data.
| Bus Route | Travel Time, Peak Hour (sec) | Travel Time, Non-Peak Hour (sec) |
|---|---|---|
| A | 1572.15 | 1144.50 |
| B | 1350.45 | 1078.43 |
| C | 2606.11 | 1620.33 |
| D | 1960.22 | 1276.29 |
| E | 1031.10 | 785.29 |
Table 5. ANN Model on Average Time Travel Data.
| Bus Route | ANN Prediction, Peak Hour (sec) | ANN Prediction, Non-Peak Hour (sec) |
|---|---|---|
| A | 1616.37 | 1164.33 |
| B | 1362.23 | 1043.15 |
| C | 2570.39 | 1675.18 |
| D | 1975.22 | 1306.24 |
| E | 1058.12 | 813.57 |
Table 6. SVM Model on Average Time Travel Data.
| Bus Route | SVM Prediction, Peak Hour (sec) | SVM Prediction, Non-Peak Hour (sec) |
|---|---|---|
| A | 1797.15 | 1299.50 |
| B | 1470.45 | 1140.22 |
| C | 2759.11 | 1670.45 |
| D | 1990.14 | 1389 |
| E | 981.19 | 765.12 |
Table 7. Production of Gas Emission Based on the Category.
| Bus Route | Distance/Journey (mile) | Gas Emission, High (3100 g/mile) (g) | Gas Emission, Medium (2800 g/mile) (g) | Gas Emission, Low (2500 g/mile) (g) |
|---|---|---|---|---|
| A | 4 | 12,400 | 11,200 | 10,000 |
| B | 3.48 | 10,788 | 9744 | 8700 |
| C | 4.62 | 14,322 | 12,936 | 11,550 |
| D | 4.47 | 13,857 | 12,516 | 11,175 |
| E | 1.77 | 3627 | 3276 | 4425 |
Table 8. The Estimated Fuel Cost for Each Journey.
| Bus Route | Distance (km) | Fuel Consumption ¹ (litre) | Fuel Cost ² (USD) |
|---|---|---|---|
| A | 6.64 | 3.77 | 1.57 |
| B | 5.61 | 3.19 | 1.33 |
| C | 7.44 | 4.23 | 1.77 |
| D | 7.19 | 4.08 | 1.71 |
| E | 2.85 | 1.62 | 0.68 |
¹ Fuel consumption rate 1.758 km/L; ² diesel price in Malaysia USD 0.42/L.
Table 9. RMSE Comparison between ANN and SVM.
| Bus Route | SVM RMSE, Peak Hour (sec) | SVM RMSE, Non-Peak Hour (sec) | ANN RMSE, Peak Hour (sec) | ANN RMSE, Non-Peak Hour (sec) |
|---|---|---|---|---|
| A | 112.50 | 77.50 | 22.11 | 9.91 |
| B | 60.00 | 30.90 | 5.89 | 17.64 |
| C | 76.50 | 25.06 | 17.86 | 27.43 |
| D | 14.96 | 56.36 | 7.50 | 14.98 |
| E | 24.95 | 10.09 | 13.51 | 14.14 |
Table 10. Estimation Cost of Fuel in 14 Weeks.
| Bus Route | Journey/Day | Cost of Fuel/Journey (USD ¹) | Cost of Fuel/Day (USD) | Cost of Fuel/Week (USD) | Cost of Fuel 14 Weeks (USD) |
|---|---|---|---|---|---|
| A | 17 | 1.57 | 26.77 | 133.86 | 1874.11 |
| B | 17 | 1.33 | 22.63 | 113.13 | 1583.79 |
| C | 15 | 1.77 | 26.49 | 132.43 | 1853.99 |
| D | 13 | 1.71 | 22.20 | 111 | 1554.03 |
| E | 11 | 0.68 | 7.44 | 37.19 | 520.77 |
| Total | | | | | 7386.69 |
¹ 1 USD = RM 4.14, where USD is the United States Dollar and RM is Ringgit Malaysia.
Table 11. Estimation of Gas Emission Produces within 14 Weeks.
| Bus Route | Journey/Day | CO2*/Journey (g) | CO2*/Day (g) | CO2*/Week (g) | CO2*/14 Weeks (g) |
|---|---|---|---|---|---|
| A | 17 | 11,200 | 190,400 | 952,000 | 13,328,000 |
| B | 17 | 9744 | 165,648 | 828,240 | 11,595,360 |
| C | 15 | 12,936 | 194,040 | 970,200 | 13,582,800 |
| D | 13 | 12,516 | 162,708 | 813,540 | 11,389,560 |
| E | 11 | 3276 | 36,036 | 180,180 | 2,522,520 |
| Total | | | | | 52,418,240 |
Table 12. Proposed Bus Route of the Case Study.
| Bus | Current Route | Bus | Proposed Route |
|---|---|---|---|
| A | UM Central - KK 3,4&6 - Academy of Malay Studies - KK 8,10 FSKTM - Academy of Islamic Studies - KK 8,10 FSKTM - Academy of Malay Studies - PTM - UM Central | AE | UM Central - KK 3,4&6 - Academy of Malay Studies - KK 8,10 FSKTM - Academy of Malay Studies - PTM - FBL - KK 9 - UM Sentral |
| E | UM Central - FBL - KK 9 - UM Central | AE | (merged with Bus A, as above) |
| B | UM Central - PASUM - KK 5 - Academy of Islamic Studies - KK 11 - KK 12 - KK 1 - Faculty of Engineering - UM Sentral | B | No change |
| C | UM Central - PASUM - KL Hockey - Angkasapuri - Pantai Permai - Bangsar South - KK 1 - Faculty of Engineering - UM Sentral | C | No change |
| D | UM Central - International House - Rapid 1 - Rapid 2 - Rapid 3 - Rapid 4 - International House - UM Central | D | No change |
Table 13. Comparison between the actual bus route and the recommended combined bus route.
| Bus | Status | Distance (km) | Travel Time, Peak Hour | Travel Time, Non-Peak Hour | Fuel Consumption (litre) | Fuel Cost (USD) | CO2 (g) |
|---|---|---|---|---|---|---|---|
| A | Actual | 6.64 | 1572.15 | 1144.50 | 3.77 | 1.57 | 11.200 |
| E | Actual | 2.85 | 1031.10 | 785.29 | 1.62 | 0.68 | 3.276 |
| A&E | Total | 9.49 | 2603.25 | 1929.79 | 5.39 | 2.25 | 14.476 |
| AE | Recommended | 7.99 | 1893.36 | 1377.58 | 3.07 | 1.27 | 12.400 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Noor, R.M.; Rasyidi, N.B.G.; Nandy, T.; Kolandaisamy, R. Campus Shuttle Bus Route Optimization Using Machine Learning Predictive Analysis: A Case Study. Sustainability 2021, 13, 225. https://doi.org/10.3390/su13010225
