Effective Bus Travel Time Prediction System of Multiple Routes: Introducing PMLNet Based on MDARNN

Lei, Jianmei; Chen, Yulan; Han, Qingwen; Zeng, Lingqiu; He, Guangyan

doi:10.3390/app15148104

by Jianmei Lei ¹, Yulan Chen ², Qingwen Han ²,*, Lingqiu Zeng ³ and Guangyan He ²

¹ State Key Laboratory of Intelligent Vehicle Safety Technology, Chongqing 400044, China
² School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044, China
³ College of Computer Science, Chongqing University, Chongqing 400044, China
* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(14), 8104; https://doi.org/10.3390/app15148104

Submission received: 8 June 2025 / Revised: 15 July 2025 / Accepted: 16 July 2025 / Published: 21 July 2025

Abstract

Accurate bus travel time prediction is crucial for improving travel experience, especially in transfer journeys. This study introduces a novel multi-route bus travel time prediction system based on PMLNet, a partition and combination prediction framework that addresses the gap in accurate prediction models by incorporating macro and local impact factors. The system employs a pre-processing algorithm for constructing travel chains, partitions travel time into four components, utilizes LSTM along with the newly proposed MDARNN model for predicting each component, and applies four real-time traffic impact factors to calibrate the predictions of each component. Experimental validation on four bus routes demonstrates PMLNet’s superior performance, achieving mean absolute percentage errors (MAPE) as low as 2.91% and mean absolute errors (MAE) below 1.45 min, outperforming traditional models and various partitioned combination frameworks. These findings underscore PMLNet’s potential to significantly improve public transportation services by providing more accurate travel time predictions, ultimately enhancing the user experience in intelligent transportation systems.

Keywords: intelligent transportation; bus travel time prediction system; LSTM; DA-RNN

1. Introduction

As an effective approach for addressing urban transportation challenges, public transportation (PT) systems are assuming an increasingly vital role in the daily lives of individuals [1]. However, a series of quality of service (QoS) problems, such as long waiting times, low transfer efficiency, and high flow density, influence passengers’ travel choice behavior. In recent years, PT companies have realized that seamless transfers between different bus routes are one of the most important factors for the attractiveness of PT services. With the improvement of the intelligence level of PT systems, city buses can offer real-time driving data to optimize the performance of travel time prediction.

The performance of bus travel time prediction, which determines the QoS of PT, is affected by many factors, such as bus running state, weather, real-time traffic flow, and road features [2,3]. Moreover, the effectiveness of the prediction model is also an important factor [4].

To meet the demand of high-level PT services, our goal is to design a highly accurate multi-route bus travel time prediction system through harnessing the influence of impact factors and selecting a reasonable model which considers the varying impacts of different factors. Therefore, the key components of the designed system are impact factor selection and prediction model construction [4].

The rationality of impact factors determines the accuracy of travel time prediction outputs to a great extent. Macro impact factors, such as weather, temperature, and air quality, affect all road vehicles and remain relatively stable [5,6]. On the other hand, local impact factors, such as traffic features and passenger behavior, exhibit time-varying and random behavior, introducing uncertainty and complexity to travel time prediction [7]. Furthermore, real-time traffic flow can also influence the current status of bus travel or dwell time [3].

In the past few years, researchers have explored multiple approaches to represent real-time traffic features. Regarding the characteristics of bus operations, in our prior research we introduced two distinct traffic factors, namely the travel factor and the dwelling factor, to characterize the prevailing real-time traffic conditions [3]. However, that work [3] focused solely on Bus Arrival Time prediction and overlooked transfers between routes. According to the findings in [8], waiting time, the frequency of transfers, the count of stops, and bus travel time are four pivotal factors influencing passengers’ experiences. This underscores the need to refine Bus Arrival Time predictions. In particular, integrating transfer behavior and transfer point waiting time can enhance the QoS of urban PT services. Consequently, by analyzing the path model of multiple routes, we divide bus travel time into four components: Current Driving Segment Travel Time (CDSTT), Bus Stop Dwelling Time (BSDT), Stop-to-Stop Travel Time (SSTT), and Transfer Point Waiting Time (TPWT).

However, a real challenge lies in capturing passengers’ behaviors, which remains largely unknown due to privacy protection policies. To address this issue, this paper introduces the concept of a “travel chain,” building upon the ideas presented in [9,10]. This travel chain aims to supplement the deficiency in travel data by incorporating transfer behaviors from the raw dataset.

To select an appropriate prediction model, we note that the recurrent neural network (RNN), which is well-suited to time series problems, is considered an ideal choice for Bus Arrival Time prediction [11]. To improve prediction accuracy, a broader set of impact factors is used. Additionally, the attention mechanism has been introduced to construct prediction models, such as the dual-stage attention-based recurrent neural network (DA-RNN) [12]. Inspired by this, we incorporate macro impact factors into the DA-RNN model to predict Bus Stop Dwelling Time and Stop-to-Stop Travel Time, resulting in an enhanced model termed macro impact factor DA-RNN (MDARNN). On the other hand, the Long Short-Term Memory (LSTM) model is chosen for Current Driving Segment Travel Time and Transfer Point Waiting Time prediction. Therefore, in this paper, a parallel structure consisting of four sub-models is proposed, which employs both LSTM and MDARNN to predict four different time periods based on their characteristics.

The main contributions of this work are outlined below:

A travel chain is constructed using the original bus datasets, which addresses the existing constraint of inadequate historical data concerning individual passengers’ bus journeys, and provides a reference for the analysis of transit transfer time under the condition of a single GPS data source.
A novel enhanced network named MDARNN is introduced, building upon the DA-RNN model. This enhancement involves the classification of input features into two distinct components: macro impact factors and local impact factors, which allows for a more detailed and accurate analysis of the various elements influencing bus travel time.
A partition and combination framework, referred to as PMLNet, is presented. MDARNN and LSTM are used to predict the four time periods individually, and the predictions are then corrected and combined using four traffic impact factors. This enhances prediction accuracy by accounting for diverse traffic impact factors and offers a novel approach to integrating and correcting partial predictions.
A real-time bus travel time prediction system is designed. The results affirm the system’s effectiveness in delivering accurate travel time predictions, underscoring the potential impact of our contributions on improving public transportation efficiency and user experience.

This paper is organized as follows: Section 2 gives an overview of the literature related to bus travel time prediction. Section 3 introduces the system architecture of the bus travel time prediction system, together with data processing and the core part of the system, PMLNet. Experimental results analysis and system validation are presented in Section 4. Finally, Section 5 concludes the paper and outlines future research directions.

2. Related Work

Travel time prediction approaches are useful for improving the QoS of PT. Within this section, we review the relevant works from two aspects: impact factor selection and prediction model construction.

2.1. Impact Factor Selection

One of the main challenges encountered in bus travel time prediction is that the bus is affected by various factors while traveling or dwelling [13]. The main impact factors selected by related studies can be categorized as macro impact factors and local impact factors based on their characteristics.

Macro impact factors are defined as the features that are typically the same for all buses or do not change for a long time. For example, weather conditions (sunny or rainy) affect bus running state and passenger boarding time [5,6], making them common input features for prediction models in various studies. Previous research and observations of bus travel data reveal different distribution patterns in peak and off-peak periods, and on weekdays and weekends [14,15]. Therefore, some studies include hour and date information as influencing factors for predicting bus travel time [9,16]. Additionally, temperature [17,18] and air quality [19] have also been investigated as impact factors for bus travel time prediction.

Local impact factors are defined as features that vary over a short period of time and differ between buses. Bus running states, such as speed and acceleration, have real-time effects on bus travel time [7]. These states are often acquired from Automatic Vehicle Location (AVL) or a Global Positioning System (GPS) and utilized as input features for bus travel time prediction models [20,21]. While some of the literature focuses on analyzing the influences of real-time traffic conditions, other studies like [22] predict bus travel time by integrating traffic flow data. In one study [3], two traffic factors are defined to express real-time traffic state. Inspired by this, four impact factors are defined to correct four predicted time parts, which consider the recent bus traveling or dwelling status in the most recent period to map the real-time traffic conditions.

Another factor to be considered is the construction of travel chains. Due to the sensitivity of individual passenger data, obtaining public transport transfer data is difficult. Therefore, it is necessary to build travel chains with transfer behavior. One study [9] generated sufficient (pseudo) travel chains from historical bus trajectories of multiple routes. Another study [10] reconstructed travel chains by merging data from different sections of different bus routes. It is easy to reconstruct travel chains from the travel data of different routes. In this paper, bus travel chains are constructed based on the bus trajectory data of multiple routes to compensate for the lack of transfer data.

2.2. Prediction Model Construction

Constructing an appropriate model is crucial for ensuring a high QoS of PT. Previous studies have developed prediction models that can be roughly categorized into two types: traditional models and machine learning-based models.

Over the last few decades, traditional models like historical average and the Kalman Filter have been extensively employed in the field. In the early years, historical average was a naive algorithm used to build bus travel time prediction models [23]. However, the historical average method heavily relies on historical data, ignores real-time traffic factors, and leads to prediction errors. Another traditional method is the Kalman Filter. One study [24] initially used the Kalman Filter to predict Bus Arrival Time by combining historical data and GPS data. Based on the historical average method, another study [25] employed the Kalman Filter to enhance the precision of predicting arrival time for each subsequent stop and construct a dynamic Bus Arrival Time prediction model. However, the Kalman Filter also has some disadvantages, as it is sensitive to anomalies, which prevents it from adapting to prediction tasks with large variability datasets, such as multi-route bus travel time prediction [26].

Machine learning has undergone rapid advancements in recent years and has demonstrated strong performance in Bus Arrival Time prediction. Classic regressions, such as linear regression, were widely employed in the travel time prediction of vehicles [27,28]. Nevertheless, linear regression is not applicable to non-linear prediction tasks. In [14], a Support Vector Regression (SVR) model was applied to predict Bus Arrival Time since the model has a greater generalization capacity. One study [29] combined a Support Vector Machine (SVM) and the GPS real-time forecast to build a hybrid Bus Arrival Time prediction model. Neural networks have also been widely utilized in bus travel time prediction. Another study [30] proposed a segmentation prediction model, which employed a back propagation (BP) neural network model to predict bus driving times that were segmented using the K-means clustering algorithm.

The process of predicting travel time hinges on time series considerations. To achieve satisfactory forecasting results, RNNs have been used to predict Bus Arrival Time, as they are well-suited to time series problems [11,31]. In one study [31], the authors employed both the LSTM and the historical average method to predict Transfer Point Waiting Time and travel time for each distinct path segment. However, in that approach the entire trip is only roughly segmented into two parts, the travel time of bus driving and the waiting time at the transfer point, without considering real-time traffic conditions. One study [32] constructed meta-based spatiotemporal networks to model diverse dependencies for single-line Bus Arrival Time prediction. Furthermore, ref. [33] used Multi-Relational Modeling Graph Convolution (MRMGCN) as a frontend module to capture spatial relationships in multi-route areas, combined with time series models for Bus Arrival Time prediction. In contrast, PMLNet is designed for multi-route scenarios and incorporates bus time prediction and real-time traffic calibration. This approach, which excels in handling transfers and navigating complex urban networks, demonstrates superior performance in such contexts. In addition, one study [12] proposed a dual-stage attention-based recurrent neural network (DA-RNN), which obtained a noteworthy enhancement of time series prediction. Although attention-based mechanisms perform well, they suffer from information loss and require substantial resources. Therefore, in this paper, we propose a partition and combination framework, PMLNet, which combines MDARNN (an improvement on DA-RNN) with the traditional LSTM to predict the divided bus travel time, while also incorporating four real-time traffic flow factors for correction and combination.

3. Materials and Methods

3.1. System Architecture

In this section, the framework of the real-time bus travel time prediction system is first introduced. The input features of the model are then analyzed, and a path model is presented based on the time period division. Finally, appropriate models are selected based on the characteristics of time blocks.

3.1.1. Framework

The real-time bus travel time prediction system is designed according to the demands of Chongqing Public Transportation Company, China, and its framework is shown in Figure 1. Two types of databases, namely the bus driving database and the public database, serve as inputs to the bus travel time prediction system. The cloud server collects the relevant data and subsequently provides Bus Arrival Time prediction services using the proposed PMLNet. Additionally, real-time data transmission occurs between the mobile application and the cloud server.

The bus driving dataset is offered by the cloud control system of Chongqing Hengtong Bus Company, which encompasses historical travel data from approximately 3000 buses in Chongqing, China. The intelligent on-board systems of the buses collect 46 real-time bus cruising attributes, which are uploaded to the cloud server every 2 s. These 46 attributes are extracted and computed into 48 local factors (denoted as $IF_{local}$), including the information of date, position, speed, engine and mileage; these are denoted as Equation (1).

$IF_{local} = \{IF_{local}^{1}, IF_{local}^{2}, \dots, IF_{local}^{48}\}$

(1)

The public database consists of two sub-datasets: the bus route dataset and the meteorological dataset. The bus route dataset is extracted from the Amap open platform [34], which collects the information at one-meter intervals along the bus routes. Each entry in the bus route dataset consists of 8 attributes: route, origin, destination, stop status, longitude, latitude, stop name, and stop number (0 for non-stop). The meteorological dataset is sourced from Eastern Weather Net [35], a public service website operated by the Shanghai Public Meteorological Service Center. It records 8 attributes: date, day of the week, air quality index and level, maximum and minimum temperature, weather condition, and wind speed.

To analyze the influence of environmental factors, we utilize driving data from route 805 during the period of April to November 2016, as shown in Figure 2. Specifically, Figure 2a focuses on temperature and presents the distribution of average travel time across different temperature ranges within the study period. Figure 2b compares average travel time between holidays and workdays over the same period to reveal the effect of holiday status. Figure 2c explores the correlation between air quality levels and average travel time. Figure 2d shows the differences in average travel time between peak hours (7:00–9:00, 17:00–19:00) and off-peak periods. Finally, Figure 2e shows the variation in average travel time from Monday to Sunday, reflecting the influence of the day of the week. As shown in Figure 2, the average travel time is influenced by air quality, weather condition, time of day, and day of the week. Therefore, this paper extracts 7 macro impact factors from the meteorological dataset and denotes them as

$IF_{macro} = \{IF_{macro}^{1}, IF_{macro}^{2}, \dots, IF_{macro}^{7}\}$

(2)

3.1.2. Path Model

Travel time prediction is a crucial aspect of the multiple bus route transfer problem [10]. The travel path from the source to the destination can be represented as a series of time periods, as shown in Figure 3. Inspired by the division of bus travel time in [3], the path model in this paper comprises 4 blocks: Current Driving Segment Travel Time, Bus Stop Dwelling Time, Stop-to-Stop Travel Time, and Transfer Point Waiting Time. It should be noted that CDS refers to the path segment connecting the bus’s present position and the next stop.

Therefore, based on the aforementioned path model, the whole travel time T can be denoted as the sum of the time consumed by the 4 blocks mentioned above, which can be formulated as follows:

$T = t_{CDSTT} + t_{SSTT} + t_{BSDT} + t_{TPWT}$

(3)

Note: $t_{BSDT}$ and $t_{SSTT}$ include p sub-time blocks, which are denoted as Equation (4), where $t_{BSDT}(k)$ is the dwell time at $Stop_{k}$, and $t_{SSTT}(k, k+1)$ is the travel time between $Stop_{k}$ and $Stop_{k+1}$.

$t_{BSDT} = \sum_{k=0}^{p} t_{BSDT}(k), \quad t_{SSTT} = \sum_{k=0}^{p} t_{SSTT}(k, k+1)$

(4)
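As a minimal sketch of how Equations (3) and (4) combine, the following Python function sums the four blocks, with the per-stop dwell times and stop-to-stop times passed as lists. The function name, argument names, and sample values are illustrative assumptions and are not taken from the system described in this paper.

```python
from typing import Sequence


def total_travel_time(t_cdstt: float,
                      t_bsdt: Sequence[float],
                      t_sstt: Sequence[float],
                      t_tpwt: float) -> float:
    """Sum the four blocks of Equation (3); BSDT and SSTT are expanded per Equation (4)."""
    if len(t_bsdt) != len(t_sstt):
        # Equation (4) sums both terms over the same index range k = 0..p.
        raise ValueError("expected one dwell time per stop-to-stop segment")
    return t_cdstt + sum(t_bsdt) + sum(t_sstt) + t_tpwt


# Hypothetical values in minutes for k = 0, 1, 2.
example = total_travel_time(
    t_cdstt=1.5,
    t_bsdt=[0.5, 0.5, 1.0],
    t_sstt=[2.0, 3.0, 2.5],
    t_tpwt=4.0,
)
print(example)
```

Keeping the per-stop terms as lists, rather than pre-summed totals, mirrors the fact that each sub-time block is predicted separately.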

3.1.3. Basic Model Selection

Based on Equations (3) and (4), $t_{CDSTT}$ and $t_{TPWT}$ can be obtained via a many-to-one prediction, while the values of $t_{BSDT}$ and $t_{SSTT}$ can be achieved via a many-to-many prediction.

It is widely acknowledged that the LSTM model is suitable for solving many-to-one problems [9]. Therefore, it is selected to predict $t_{CDSTT}$ and $t_{TPWT}$ in this paper, as depicted in Figure 4a. It should be noted that the length l of the input time series for $t_{CDSTT}$ prediction is set to 3, while that for $t_{TPWT}$ prediction is set to 10.
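To make the many-to-one setup concrete, the sketch below builds fixed-length input windows from a sequence of feature rows, using the window lengths stated above (3 for CDSTT, 10 for TPWT). It is only an illustration: the helper name `make_windows` is hypothetical, and pairing each window with the target of its final row is an assumption, since the label alignment is not specified in the text.

```python
from typing import List, Sequence, Tuple


def make_windows(features: Sequence[Sequence[float]],
                 targets: Sequence[float],
                 length: int) -> List[Tuple[List[List[float]], float]]:
    """Pair each run of `length` consecutive feature rows with one target value."""
    if len(features) != len(targets):
        raise ValueError("features and targets must be aligned")
    samples = []
    for end in range(length, len(features) + 1):
        window = [list(row) for row in features[end - length:end]]
        # Assumption: the label is the target aligned with the window's last row.
        samples.append((window, targets[end - 1]))
    return samples


# Hypothetical data: 5 time steps with 2 features each.
rows = [[float(i), float(i) * 2] for i in range(5)]
labels = [10.0, 11.0, 12.0, 13.0, 14.0]

cdstt_samples = make_windows(rows, labels, length=3)    # l = 3 for CDSTT
tpwt_samples = make_windows(rows, labels, length=10)    # l = 10 for TPWT
print(len(cdstt_samples), len(tpwt_samples))
```

With only five rows, the length-10 window yields no samples, which illustrates that TPWT prediction needs a longer history than CDSTT prediction before a first estimate can be made.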

On the other hand, MDARNN is based on DA-RNN [12]. It introduces an input attention mechanism to weight local factors at each time point and incorporates macro factors at the decoder stage. This allows it to capture macro factor dependencies and extract key information about local influences, making it a viable solution for predicting $t_{BSDT}$ and $t_{SSTT}$. Suppose at time t there are m + 1 subsequent bus stops. The inputs and outputs of the MDARNN are shown in Figure 4b. The length of the input time series for both $t_{BSDT}$ and $t_{SSTT}$ is set to 5, and the length of the output time series is variable, which refers to the travel time of subsequent stops.
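The input attention step can be illustrated with a small sketch of the softmax weighting formalized in Equations (13) and (14) of Section 3.2.3. Here the attention scores are passed in as plain numbers; in the model they come from the learned layer of Equation (12), which is not implemented in this sketch. The function names and sample values are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Normalize attention scores so that the weights sum to 1 (Equation (13))."""
    peak = max(scores)                       # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weight_local_factors(scores: Sequence[float],
                         values: Sequence[float]) -> List[float]:
    """Scale each local factor value at one time step by its attention weight (Equation (14))."""
    if len(scores) != len(values):
        raise ValueError("one score is required per local factor")
    return [w * v for w, v in zip(softmax(scores), values)]


# Hypothetical example with n = 2 local factors and equal scores.
weighted = weight_local_factors([0.0, 0.0], [10.0, 20.0])
print(weighted)
```

Equal scores give equal weights, so neither factor is emphasized; a larger score for one factor would shift more of the unit weight budget onto it.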

It should be noted that the basic prediction models mentioned above do not take into account the influence of traffic. Therefore, the model outputs need to be adjusted by corresponding weight factors, which will be further discussed in Section 3.2.2.

3.2. Method

In this section, data pre-processing is discussed first, which includes raw datasets, data fusing, travel chain construction, and dataset division. The traffic impact factors are then defined. Finally, the parallel prediction and calibration model is explored, and the core components are introduced.

3.2.1. Data Pre-Processing

Raw Datasets

For this study, we have selected Chongqing city bus routes 805, 248, 121, and 202, as shown in Figure 5. The corresponding route information is listed in Table 1.

To capture the characteristics of bus travel time under different seasonal and climatic conditions, we utilize the bus driving dataset from January, May, August, and October 2016. The dataset comprises 52 buses and 49,605,165 data records, and is used as a raw driving dataset $D_{d}$; it is denoted as

$D_{d} = \begin{bmatrix} D_{d_{1}}^{1} & D_{d_{1}}^{2} & \cdots & D_{d_{1}}^{46} \\ \vdots & \vdots & & \vdots \\ D_{d_{i}}^{1} & D_{d_{i}}^{2} & \cdots & D_{d_{i}}^{46} \\ \vdots & \vdots & & \vdots \\ D_{d_{h}}^{1} & D_{d_{h}}^{2} & \cdots & D_{d_{h}}^{46} \end{bmatrix}, \quad i = 1, 2, \dots, h$

(5)

Similar to $D_{d}$, the raw route dataset $D_{r}$ and the raw meteorological dataset $D_{m}$ are selected and denoted as Equation (6), where h, f, and g correspond to the sizes of datasets $D_{d}$, $D_{r}$, and $D_{m}$.

$D_{r} = \begin{bmatrix} D_{r_{1}}^{1} & D_{r_{1}}^{2} & \cdots & D_{r_{1}}^{8} \\ \vdots & \vdots & & \vdots \\ D_{r_{i}}^{1} & D_{r_{i}}^{2} & \cdots & D_{r_{i}}^{8} \\ \vdots & \vdots & & \vdots \\ D_{r_{f}}^{1} & D_{r_{f}}^{2} & \cdots & D_{r_{f}}^{8} \end{bmatrix}, \quad D_{m} = \begin{bmatrix} D_{m_{1}}^{1} & D_{m_{1}}^{2} & \cdots & D_{m_{1}}^{8} \\ \vdots & \vdots & & \vdots \\ D_{m_{j}}^{1} & D_{m_{j}}^{2} & \cdots & D_{m_{j}}^{8} \\ \vdots & \vdots & & \vdots \\ D_{m_{g}}^{1} & D_{m_{g}}^{2} & \cdots & D_{m_{g}}^{8} \end{bmatrix}; \quad i = 1, 2, \dots, f, \; j = 1, 2, \dots, g$

(6)

The full data pre-processing process involves three stages: data fusion, travel chain construction, and dataset division.

Data Fusion

The data fusion process aims to generate a fused driving dataset consisting of 55 attributes, including 7 macro impact factors and 48 local impact factors. Algorithm 1 provides the pseudo code of the data fusion process, while Table 2 lists the corresponding explanations of variables and functions.

The fused dataset is denoted as follows, where h is the size of dataset $D_{fused}$.

$D_{fused} = \begin{bmatrix} IF_{local_{1}}^{1} & \cdots & IF_{local_{1}}^{48} & IF_{macro_{1}}^{1} & \cdots & IF_{macro_{1}}^{7} \\ \vdots & & \vdots & \vdots & & \vdots \\ IF_{local_{i}}^{1} & \cdots & IF_{local_{i}}^{48} & IF_{macro_{i}}^{1} & \cdots & IF_{macro_{i}}^{7} \\ \vdots & & \vdots & \vdots & & \vdots \\ IF_{local_{h}}^{1} & \cdots & IF_{local_{h}}^{48} & IF_{macro_{h}}^{1} & \cdots & IF_{macro_{h}}^{7} \end{bmatrix}, \quad i = 1, 2, \dots, h$

(7)

Algorithm 1: Datasets fusion algorithm

Input: $D_{d}$, $D_{r}$, $D_{m}$

Output: $D_{fused}$
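The following sketch illustrates one part of such a fusion step, namely attaching the macro factors for a given date to each driving record. It is not a reproduction of Algorithm 1: the field names (`date`, `speed`, `weather`, `max_temp`), the date-keyed lookup, and the decision to skip records without a matching meteorological entry are all assumptions, and the merge with the route dataset is omitted.

```python
from typing import Any, Dict, List


def fuse_records(driving: List[Dict[str, Any]],
                 weather_by_date: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Attach the macro factors of the matching date to every driving record."""
    fused = []
    for record in driving:
        macro = weather_by_date.get(record["date"])
        if macro is None:
            continue  # assumption: records without meteorological data are dropped
        fused.append({**record, **macro})
    return fused


# Hypothetical records; the real datasets have 46 and 8 attributes respectively.
driving = [
    {"date": "2016-01-05", "speed": 31.5},
    {"date": "2016-01-06", "speed": 28.0},
]
weather = {"2016-01-05": {"weather": "rainy", "max_temp": 9}}

fused = fuse_records(driving, weather)
print(fused)
```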

Travel Chain Construction

Due to the sensitive nature of individual passenger data, the fused dataset lacks travel records and contains only single-route bus travel data. Based on [9,10], in order to support multi-route travel time prediction, we have designed a travel chain construction process, as shown in Algorithm 2. Explanations of the corresponding functions and parameters are listed in Table 3. Note that, due to the limitations of the dataset, we assume that a trip includes only one transfer. We use four bus routes, 805, 248, 121, and 202, to construct travel chains, as shown in Figure 5.

Algorithm 2: Travel chain construction algorithm

Input: $D_{fused}^{A}$, $D_{fused}^{B}$, $D_{r}$

Output: $D_{X \to Y}$

$D_{fused}$ is divided into four sub-travel chain datasets, denoted as $D_{805\text{–}248}$, $D_{202\text{–}121}$, $D_{805\text{–}202}$, and $D_{121\text{–}202}$, respectively. The corresponding information for each travel chain is presented in Table 4. The sub-travel chain dataset is denoted as Equation (8), where m is the data record amount of route X and Y.

$D_{X\text{–}Y} = \begin{bmatrix} IF_{local_{1}}^{X_{1}} & \cdots & IF_{local_{1}}^{X_{48}} & IF_{macro_{1}}^{X_{1}} & \cdots & IF_{macro_{1}}^{X_{7}} \\ \vdots & & \vdots & \vdots & & \vdots \\ IF_{local_{i}}^{X_{1}} & \cdots & IF_{local_{i}}^{X_{48}} & IF_{macro_{i}}^{X_{1}} & \cdots & IF_{macro_{i}}^{X_{7}} \\ \vdots & & \vdots & \vdots & & \vdots \\ IF_{local_{j}}^{Y_{1}} & \cdots & IF_{local_{j}}^{Y_{48}} & IF_{macro_{j}}^{Y_{1}} & \cdots & IF_{macro_{j}}^{Y_{7}} \\ \vdots & & \vdots & \vdots & & \vdots \\ IF_{local_{m}}^{Y_{1}} & \cdots & IF_{local_{m}}^{Y_{48}} & IF_{macro_{m}}^{Y_{1}} & \cdots & IF_{macro_{m}}^{Y_{7}} \end{bmatrix}, \quad i, j = 1, 2, \dots, m; \; i < j$

(8)

Dataset Division

According to Equation (3), travel time T is the sum of $t_{CDSTT}$, $t_{SSTT}$, $t_{BSDT}$, and $t_{TPWT}$, which should be predicted separately. Therefore, for each travel chain, the dataset $D_{X\text{–}Y}$ should be divided into 4 sub-datasets based on the geographic regions in which the data were generated. The division rules are as follows; the resulting sub-datasets are used to train the sub-models that predict $t_{CDSTT}$, $t_{SSTT}$, $t_{BSDT}$, and $t_{TPWT}$.

Bus stop region: Regions within 0.0001° of latitude and longitude from a stop are defined as stop regions. It should be noted that transfer points are not considered as bus stops. Data pieces whose coordinates belong to bus stop regions are classified into $D_{BSDT}$, which is labeled by the dwelling time of all preceding stops.
Inter-stop region: The region between two adjacent bus stop regions is defined as the inter-stop region. It should be noted that the start and end points of an inter-stop region are located on the edges of the bus stop regions. Data pieces whose coordinates belong to inter-stop regions are classified into $D_{SSTT}$, which is labeled by the travel time across all preceding inter-stop regions.
CDS region: The region between the current location of the bus and the next bus stop is the CDS region. Data pieces that belong to the CDS region are sorted into $D_{CDSTT}$, which is labeled according to the cruising time between the current location and the next bus stop.
Transfer region: Regions within 0.0001° of latitude and longitude from the transfer point are defined as transfer point regions, and the corresponding data pieces are used to generate $D_{TPWT}$, which is labeled according to the transfer waiting time.
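The coordinate-based part of these rules can be sketched as follows. The sketch assumes that "within 0.0001 latitude and longitude" means both coordinate differences are at most 0.0001 degrees (a square box around the point), and that every record lies on the route, so that any record outside a stop or transfer region falls in an inter-stop region. The CDS region depends on the bus's current position and is therefore not covered. All names and coordinates are hypothetical.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]   # (latitude, longitude) in degrees
RADIUS_DEG = 0.0001           # threshold stated in the division rules


def near(p: Point, q: Point, radius: float = RADIUS_DEG) -> bool:
    """True if p lies within `radius` degrees of q in both latitude and longitude."""
    return abs(p[0] - q[0]) <= radius and abs(p[1] - q[1]) <= radius


def classify(point: Point, stops: Iterable[Point], transfer_point: Point) -> str:
    """Return the sub-dataset a record belongs to, based only on its coordinates."""
    if near(point, transfer_point):
        return "TPWT"   # transfer points are not treated as ordinary stops
    if any(near(point, stop) for stop in stops):
        return "BSDT"
    return "SSTT"       # assumption: on-route records outside stop regions


# Hypothetical coordinates.
stops = [(29.5630, 106.5510), (29.5660, 106.5540)]
transfer = (29.5700, 106.5600)

print(classify((29.56305, 106.55105), stops, transfer))
print(classify((29.5645, 106.5525), stops, transfer))
print(classify((29.5700, 106.5600), stops, transfer))
```

Checking the transfer point first reflects the rule that transfer points are not considered bus stops, so a record near a transfer point is never counted as dwelling data.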

3.2.2. Traffic Impact Factors

To improve prediction accuracy, the impact of real-time traffic should also be considered [22]. Following the traffic factor definitions in [3], real-time traffic impacts can be expressed in terms of the traveling and dwelling status of multiple buses, or the passenger transfer status at the bus stop, over the most recent period of time.

Based on this, corresponding to $t_{CDSTT}$, $t_{SSTT}$, $t_{BSDT}$, and $t_{TPWT}$, 4 traffic impact factors, $\alpha_{CDSTT}$, $\alpha_{SSTT}$, $\alpha_{BSDT}$, and $\alpha_{TPWT}$, are defined as Equation (9), while Table 5 lists the corresponding explanations of variables.

$\alpha_{CDSTT}$ quantifies the impact of real-time traffic conditions on the travel time of the current driving segment. In contrast, $\alpha_{BSDT}$ characterizes the influence of real-time traffic conditions on bus stay time at stops. Meanwhile, $\alpha_{SSTT}$ represents the effect of real-time traffic conditions on the inter-stop travel time between stop k and stop k + 1. Finally, $\alpha_{TPWT}$ reflects the impact of real-time traffic conditions on passenger waiting time at transfer points.

$\begin{cases} \alpha_{CDSTT} = 1 + \frac{1}{n} \sum_{q=1}^{n} \frac{\bar{t}_{CDSTT}^{q} - t_{CDSTT}^{q}}{t_{CDSTT}^{q}} \\ \alpha_{SSTT}(k, k+1) = 1 + \frac{1}{u} \sum_{q=1}^{u} \frac{\bar{t}_{SSTT}^{q}(k, k+1) - t_{SSTT}^{q}(k, k+1)}{t_{SSTT}^{q}(k, k+1)} \\ \alpha_{BSDT}(k) = 1 + \frac{1}{v} \sum_{q=1}^{v} \frac{\bar{t}_{BSDT}^{q}(k) - t_{BSDT}^{q}(k)}{t_{BSDT}^{q}(k)} \\ \alpha_{TPWT} = 1 + \frac{1}{w} \sum_{q=1}^{w} \frac{\bar{t}_{TPWT}^{q} - t_{TPWT}^{q}}{t_{TPWT}^{q}} \end{cases}$

(9)

Considering the traffic impact factor, Equation (3) becomes

$T = \alpha_{CDSTT} \cdot t_{CDSTT} + \alpha_{TPWT} \cdot t_{TPWT} + \sum_{k=0}^{p} \left( t_{BSDT}(k) \cdot \alpha_{BSDT}(k) + t_{SSTT}(k, k+1) \cdot \alpha_{SSTT}(k, k+1) \right)$

(10)

where $t_{CDSTT}$ is calibrated to $\alpha_{CDSTT} \cdot t_{CDSTT}$ by $\alpha_{CDSTT}$, and $t_{TPWT}$ is adjusted to $\alpha_{TPWT} \cdot t_{TPWT}$ by $\alpha_{TPWT}$, while the total dwell time at each stop, $\sum_{k=0}^{p} t_{BSDT}(k)$, is corrected to $\sum_{k=0}^{p} t_{BSDT}(k) \cdot \alpha_{BSDT}(k)$, and the whole Stop-to-Stop Travel Time, $\sum_{k=0}^{p} t_{SSTT}(k, k+1)$, is calibrated to $\sum_{k=0}^{p} t_{SSTT}(k, k+1) \cdot \alpha_{SSTT}(k, k+1)$. Note that $Stop_{p+1}$ is the destination, and p is the number of bus stops between the current location and the destination.
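A minimal sketch of Equations (9) and (10) is given below. The first function evaluates the common form shared by the four factors in Equation (9), and the second applies the factors as in Equation (10). The arguments `t_bar` and `t` stand for the barred and unbarred times of the most recent samples; their precise meaning is given in Table 5 and is not assumed here. Function names and sample values are hypothetical.

```python
from typing import Sequence


def impact_factor(t_bar: Sequence[float], t: Sequence[float]) -> float:
    """Equation (9): 1 + mean((t_bar_q - t_q) / t_q) over the most recent samples."""
    if len(t_bar) != len(t) or len(t) == 0:
        raise ValueError("need the same non-zero number of samples in both sequences")
    return 1.0 + sum((b - a) / a for b, a in zip(t_bar, t)) / len(t)


def calibrated_total(t_cdstt: float, a_cdstt: float,
                     t_tpwt: float, a_tpwt: float,
                     t_bsdt: Sequence[float], a_bsdt: Sequence[float],
                     t_sstt: Sequence[float], a_sstt: Sequence[float]) -> float:
    """Equation (10): each predicted block is scaled by its own traffic impact factor."""
    if not (len(t_bsdt) == len(a_bsdt) == len(t_sstt) == len(a_sstt)):
        raise ValueError("per-stop sequences must have the same length")
    per_stop = sum(tb * ab + ts * a_s
                   for tb, ab, ts, a_s in zip(t_bsdt, a_bsdt, t_sstt, a_sstt))
    return a_cdstt * t_cdstt + a_tpwt * t_tpwt + per_stop


# Hypothetical values in minutes.
factor = impact_factor(t_bar=[6.0, 4.0], t=[4.0, 4.0])
total = calibrated_total(
    t_cdstt=2.0, a_cdstt=factor,
    t_tpwt=4.0, a_tpwt=1.0,
    t_bsdt=[1.0, 1.0], a_bsdt=[1.0, 0.5],
    t_sstt=[2.0, 4.0], a_sstt=[1.5, 1.0],
)
print(factor, total)
```

Algebraically, each factor in Equation (9) equals the mean of the ratios between the barred and unbarred times, so a factor of 1 leaves the corresponding prediction unchanged.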

3.2.3. Travel Time Prediction Model

On the basis of MDARNN, LSTM, and 4 traffic impact factors, this paper proposes a parallel prediction and calibration model, namely PMLNet. As illustrated in Figure 6, PMLNet utilizes 4 basic prediction models to predict the corresponding time parts. Subsequently, each of the 4 time parts is calibrated by its real-time traffic flow factor, and the corrected time periods constitute the final predicted time. The proposed MDARNN is described in detail as follows.

The MDARNN model, presented in this work, is developed upon the foundation of the DA-RNN model [12], which incorporates the attention mechanism in both encoder stage and decoder stage. This characteristic renders the model well-suited for long time series prediction. However, the traditional DA-RNN relies heavily on historical data to predict bus travel time, resulting in both information loss and waste of computing resources.

In order to address the aforementioned issues, we incorporate macro factors into the DA-RNN model, presenting an enhanced iteration named MDARNN. The architecture of the MDARNN is depicted in Figure 7.

The MDARNN employs a sequence-to-sequence architecture that encompasses encoding and decoding phases. Its core components, the encoder and the decoder, are depicted in Figure 8. The processing procedure of MDARNN is outlined as follows:

The local impact factors are denoted as Equation (11), where k indexes the local impact factors, T is the number of time steps, and n represents the number of local impact factors.

$IF_{local}^{k} = (IF_{local_{1}}^{k}, IF_{local_{2}}^{k}, \dots, IF_{local_{T}}^{k}), \quad k = 1, 2, \dots, n$

(11)
The attention layer output is given by Equation (12), where $V_{e}^{T}$ , $W_{e}$ , and $U_{e}$ are the parameters to be learned, and $h_{t - 1}$ and $s_{t - 1}$ are the previous hidden state and cell state of the encoder. For the n local impact factors, the attention scores at time t are $(e_{t}^{1}, e_{t}^{2}, \dots, e_{t}^{n})$ , which are then used as the input to the softmax layer.

$e_{t}^{k} = V_{e}^{T} tanh (W_{e} [h_{t - 1}, s_{t - 1}] + U_{e} I F_{l o c a l}^{k})$

(12)
The weighted output vector of the softmax layer is given by Equation (13).

$(α_{t}^{1}, α_{t}^{2}, \dots, α_{t}^{n}), \sum_{k = 1}^{n} α_{t}^{k} = 1$

(13)

where $α_{t}^{k} = \frac{exp (e_{t}^{k})}{\sum_{i = 1}^{n} exp (e_{t}^{i})}$ , which can be used to estimate the importance of the kth local impact factor at time t. The weighted local impact factors, which are set as the input of the encoder, are denoted as $\bar{I F_{l o c a l_{t}}}$ , as in Equation (14).

$\bar{I F_{l o c a l_{t}}} = (α_{t}^{1} I F_{l o c a l_{t}}^{1}, α_{t}^{2} I F_{l o c a l_{t}}^{2}, \dots, α_{t}^{n} I F_{l o c a l_{t}}^{n})$

(14)
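As a minimal sketch of Equations (13) and (14), the following plain-Python code normalizes a set of attention scores with a softmax and applies the resulting weights to the local impact factors at one time step. The scores are taken as given here (the learned parameters of Equation (12) are not modeled), and the function names and numeric values are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Normalize scores so that they are positive and sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weight_local_factors(scores: Sequence[float],
                         factors: Sequence[float]) -> List[float]:
    """Scale each local impact factor by its attention weight, as in Equation (14)."""
    alphas = softmax(scores)
    return [a * x for a, x in zip(alphas, factors)]


scores_t = [0.2, 1.5, -0.3]    # hypothetical attention scores e_t^k
factors_t = [12.0, 0.4, 1.0]   # hypothetical local impact factors at time t
alphas_t = softmax(scores_t)
weighted_t = weight_local_factors(scores_t, factors_t)
print(alphas_t, weighted_t)
```

The factor with the largest score receives the largest weight, so the encoder input emphasizes the local factors judged most relevant at that time step.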
By inputting $\bar{I F_{l o c a l_{t}}}$ , the hidden layer state at time t is obtained as in Equation (15), where $f_{L S T M}^{1}$ is the encoder. The encoder hidden states are then weighted by employing an attention mechanism.

$h_{t} = f_{L S T M}^{1} (h_{t - 1}, \bar{I F_{l o c a l_{t}}})$

(15)
The hidden layer outputs the vector $(l_{t}^{1}, l_{t}^{2}, \dots, l_{t}^{T})$ , where $l_{t}^{i}$ represents the attention score of the hidden state of the ith unit in the encoder, and is defined as

$l_{t}^{i} = V_{d}^{T} tanh (W_{d} [d_{t - 1}, s_{t - 1}^{'}] + U_{d} h_{i})$

(16)

where $V_{d}^{T}$ , $W_{d}$ , and $U_{d}$ are the learnable parameters, and $d_{t - 1}$ and $s_{t - 1}^{'}$ denote the previous step’s hidden state and cell state of the decoder, respectively. The attention weight of the hidden state from the ith unit in the encoder is then defined as

$β_{t}^{i} = \frac{exp (l_{t}^{i})}{\sum_{j = 1}^{T} exp (l_{t}^{j})}$

(17)
The local impact factor is finally obtained as

$c_{t} = \sum_{i = 1}^{T} β_{t}^{i} h_{i}$

(18)
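Equations (17) and (18) amount to a softmax over the T scores followed by a weighted sum of the encoder hidden states. A minimal plain-Python sketch is given below, assuming the scores $l_{t}^{i}$ have already been computed; the function name `context_vector` and the toy values are hypothetical.

```python
import math
from typing import List, Sequence


def context_vector(scores: Sequence[float],
                   hidden_states: Sequence[Sequence[float]]) -> List[float]:
    """Compute c_t = sum_i beta_t^i * h_i with beta = softmax(scores)."""
    if len(scores) != len(hidden_states):
        raise ValueError("one score is required per encoder hidden state")
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(l - m) for l in scores]
    z = sum(exps)
    betas = [e / z for e in exps]
    dim = len(hidden_states[0])
    return [sum(b * h[j] for b, h in zip(betas, hidden_states))
            for j in range(dim)]


# T = 3 encoder steps with hidden size 2; equal scores give a plain average.
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
c = context_vector([0.0, 0.0, 0.0], h)
print(c)
```

With unequal scores, the hidden states of the more relevant time steps dominate the result.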
The input macro impact factor is represented as Equation (19), where $I F_{m a c r o_{t}} \in R^{\bar{n}}$ , and $\bar{n}$ is the number of macro factors.

$I F_{m a c r o} = (I F_{m a c r o_{1}}, I F_{m a c r o_{2}}, \dots, I F_{m a c r o_{T}}) \in R^{\bar{n} \times T}$

(19)
Factor fusion in the decoder yields the output in Equation (20), where ${\tilde{w}}^{T}$ and $\tilde{b}$ are the learnable parameters, $f_{L S T M}^{2}$ represents the decoder, and $d_{t}$ represents the decoder’s hidden state. The MDARNN model employs an additional LSTM network as its decoder.

$\bar{I F_{m a c r o_{t}}} = {\tilde{w}}^{T} [I F_{m a c r o_{t}}, c_{t}] + \tilde{b}, d_{t} = f_{L S T M}^{2} (d_{t - 1}, I F_{m a c r o_{t}})$

(20)

Figure 8. The network structure of decoder and encoder. (a) The network structure of encoder. (b) The network structure of decoder.

4. Results and Discussion

In this section, a series of experiments is carried out to verify the efficacy of the proposed PMLNet. All experiments were performed on an i7-8700K CPU (3.70 GHz) with 32 GB RAM, using the TensorFlow 1.14 framework.

4.1. Datasets Partition

As previously mentioned, after pre-processing, the raw datasets are divided into four sub-datasets: the Current Driving Segment Travel Time dataset $D_{CDSTT}$, the Bus Stop Dwelling Time dataset $D_{BSDT}$, the Stop-to-Stop Travel Time dataset $D_{SSTT}$, and the Transfer Point Waiting Time dataset $D_{TPWT}$. $t_{CDSTT}$, $t_{SSTT}$, $t_{BSDT}$, and $t_{TPWT}$ are used to train and evaluate the four sub-modules, and the sub-datasets are further divided into training, verification, and testing datasets.

It should be noted that, in order to validate the performance of the proposed PMLNet, a pre-partitioning process is conducted, and a testing dataset, $D_{PMLNet, T}$, is taken from $D_{fused}$. The corresponding results of the dataset partition are shown in Table 6.

4.2. Model Training

As mentioned in Section 3.1.3, two basic models, namely LSTM and the proposed MDARNN, are employed in this study. The model selection is task-specific. For Current Driving Segment Travel Time and Transfer Point Waiting Time, locally-driven many-to-one prediction tasks, LSTM is adopted, as it efficiently captures short-term dependencies. For Bus Stop Dwelling Time and Stop-to-Stop Travel Time, which involve long-term dependencies and require integration of local factors in many-to-many predictions, MDARNN is utilized. Its attention mechanisms effectively filter local features, rendering it suitable for addressing the complexity of long-sequence tasks. The corresponding parameter settings are described as follows.

MDARNN
– Number of input features: 55 features in total, comprising 7 macro impact factors and 48 local impact factors;
– Encoder hidden layer length: 20 units;
– Decoder hidden layer length: 30 units;
– Time step: 5;
– Batch size: 160;
– Initial learning rate: 0.006.

LSTM
– Number of input features: 55 features in total, comprising 7 macro impact factors and 48 local impact factors;
– Number of neurons: 160 units;
– Fully connected layer neurons (Current Driving Segment Travel Time module): 10 units;
– Fully connected layer neurons (Transfer Point Waiting Time module): 3 units;
– Batch size: 160;
– Initial learning rate: 0.006.

As shown in Table 7, for the parameter settings of the four modules, the initial step size is set to 0.006, and the Adam optimizer dynamically adjusts the learning rate and updates the network’s weights and biases. Each module undergoes 100 epochs of training, where every epoch encompasses 160 instances for training purposes.

All four modules utilize the mean absolute error (MAE) as their loss function and employ the Adam optimizer to minimize the MAE. This approach is used to search for the combination of parameters that minimizes the loss function on the training set. The details of the algorithm are shown in Algorithm 3, while the corresponding explanation is provided in Table 8.

Algorithm 3: Adam Algorithm

Input: $s, r, ε, ρ_{1}, ρ_{2}, σ$

Output: $θ$
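As an illustration only, the standard Adam update can be sketched in plain Python as follows. The roles assigned to the symbols are assumptions: s and r are taken to be the first- and second-moment estimates, ρ1 and ρ2 their decay rates, and σ a small stabilizing constant, while the step size is passed as `step_size`. Table 8 gives the authors' own definitions. The function name `adam_minimize`, the default decay rates, and the toy objective are hypothetical.

```python
import math
from typing import Callable, List, Sequence


def adam_minimize(grad: Callable[[List[float]], List[float]],
                  theta0: Sequence[float],
                  step_size: float = 0.006,
                  rho1: float = 0.9,
                  rho2: float = 0.999,
                  sigma: float = 1e-8,
                  n_steps: int = 2000) -> List[float]:
    """Run the standard Adam update for n_steps and return the parameters."""
    theta = list(theta0)
    s = [0.0] * len(theta)  # first-moment estimate (assumed meaning of s)
    r = [0.0] * len(theta)  # second-moment estimate (assumed meaning of r)
    for t in range(1, n_steps + 1):
        g = grad(theta)
        for i, g_i in enumerate(g):
            s[i] = rho1 * s[i] + (1.0 - rho1) * g_i
            r[i] = rho2 * r[i] + (1.0 - rho2) * g_i * g_i
            s_hat = s[i] / (1.0 - rho1 ** t)  # bias correction
            r_hat = r[i] / (1.0 - rho2 ** t)
            theta[i] -= step_size * s_hat / (math.sqrt(r_hat) + sigma)
    return theta


# Toy objective (theta - 3)^2, whose gradient is 2 * (theta - 3).
result = adam_minimize(lambda th: [2.0 * (th[0] - 3.0)], [0.0])
print(result)
```

In the paper's setting the gradient would come from the MAE loss of each sub-module rather than from a toy quadratic.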

The training performance of the four sub-models is shown in Figure 9a–d. The blue line is the loss curve on the training set, and the yellow line is the loss curve on the validation set. The vertical axis of the left panel is the error calculated by MAE, which participates in updating the parameters; the vertical axis of the right panel is the error calculated by mean square error (MSE), which does not participate in updating the parameters and serves only as a reference curve for adjusting the parameters of the model. The horizontal axis is the number of training epochs. As the number of epochs increases, the training loss and validation loss of the four models show good convergence.

4.3. Model Verification

4.3.1. Evaluation Metrics

This paper employs mean absolute error (MAE), mean square error (MSE), and mean absolute percentage error (MAPE) as evaluation metrics to measure the generalization ability of models in different ways. These metrics are defined as follows, where T denotes the target sequence, $T^{'}$ indicates the predicted sequence, and n is the length of T and $T^{'}$.

$\mathrm{MAE}(T, T^{'}) = \frac{1}{n} \sum_{i = 1}^{n} | T(i) - T^{'}(i) |$

(21)

$\mathrm{MSE}(T, T^{'}) = \frac{1}{n} \sum_{i = 1}^{n} {(T(i) - T^{'}(i))}^{2}$

(22)

$\mathrm{MAPE}(T, T^{'}) = \frac{1}{n} \sum_{i = 1}^{n} \frac{| T(i) - T^{'}(i) |}{T(i)}$

(23)
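The three metrics can be computed directly from their definitions. The following plain-Python sketch mirrors Equations (21) to (23); the function names and sample values are hypothetical, and MAPE is returned as a fraction (multiply by 100 for a percentage), assuming all target values are nonzero.

```python
from typing import Sequence


def mae(target: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute error, Equation (21)."""
    return sum(abs(t - p) for t, p in zip(target, pred)) / len(target)


def mse(target: Sequence[float], pred: Sequence[float]) -> float:
    """Mean square error, Equation (22)."""
    return sum((t - p) ** 2 for t, p in zip(target, pred)) / len(target)


def mape(target: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute percentage error as a fraction, Equation (23)."""
    return sum(abs(t - p) / t for t, p in zip(target, pred)) / len(target)


# Hypothetical true and predicted travel times in minutes.
target = [10.0, 20.0, 40.0]
pred = [11.0, 18.0, 40.0]
print(mae(target, pred), mse(target, pred), mape(target, pred))
```

Because MAPE divides each error by the true value, the same absolute error weighs more heavily on short trips than on long ones.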

4.3.2. Analysis of Sub-Module Results

The errors of the neural network models for the four sub-modules are listed in Table 9. The MAPE values of all four models are below 3%, and the MAE values remain under 1 min. Overall, the MAE, MAPE, and MSE results indicate that each model is effective.

In the MDARNN model for the Bus Stop Dwelling Time module and Stop-to-Stop Travel Time module, the impact of the number of bus stations on error features is depicted in Figure 10; the horizontal coordinate indicates the number of consecutive stations involved in the prediction of dwelling time and the vertical coordinate indicates the error (MAE, MSE or MAPE). The figure suggests that as the number of stations increases, both MAE and MSE values exhibit a corresponding rise. In contrast, the MAPE value maintains a relatively consistent level.

Figure 11 displays the errors associated with the transfer points listed in Table 4. As shown in the figure, the LSTM network in this module performs relatively well for different lines, with MAE within 1 min and MAPE less than 0.03.

4.4. Contrast Experiments

To assess the effectiveness of the proposed model, PMLNet, we selected seven alternative methods for comparison. Among them, Historical Average (HA), Support Vector Regression (SVR), and Partitioning and Combination Framework Linear Regression (PCF-LR) are existing methods, while PCF-unweight, PCF-LSTM, PCF-DARNN, and Pure MDARNN are ablation experiments.

HA: This method uses historical travel records as the basic data and takes the average of the historical records as the predicted travel time [23].

SVR (Support Vector Regression): SVR is employed for predicting total bus travel time, following the methodology outlined in literature [29].

PCF-unweight: The current travel time is derived directly from the sum of the outputs of all four sub-modules, without real-time traffic flow impact factor weighting, as shown in Equation (24).

T = t_{C D S T T} + t_{T P W T} + \sum_{k = 0}^{p} (t_{B S D T} (k) + t_{S S T T} (k, k + 1))

(24)
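Equation (24) is a plain sum of the four predicted components. A minimal Python sketch is shown below; the function name `pcf_unweight_total` and the numeric values are hypothetical.

```python
from typing import Sequence


def pcf_unweight_total(t_cdstt: float, t_tpwt: float,
                       t_bsdt: Sequence[float],
                       t_sstt: Sequence[float]) -> float:
    """Sum the four components without calibration, as in Equation (24)."""
    if len(t_bsdt) != len(t_sstt):
        raise ValueError("one dwelling time is required per stop-to-stop segment")
    return t_cdstt + t_tpwt + sum(d + s for d, s in zip(t_bsdt, t_sstt))


# Hypothetical predictions in minutes for a trip with two remaining stops.
total = pcf_unweight_total(2.0, 4.0, [0.5, 0.5], [3.0, 2.0])
print(total)
```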

PCF-LSTM: The method predicts all four time periods by utilizing LSTM, with four real-time traffic impact factors being employed for calibration.

PCF-DARNN: Instead of MDARNN, this method employs DA-RNN to predict Stop-to-Stop Travel Time and Bus Stop Dwelling Time. The total predicted travel time is weighted and summed by four real-time traffic impact factors.

Pure MDARNN: This approach utilizes the MDARNN model introduced in this work for predicting bus travel time without using the partitioning and combination framework. The model’s predictions are then calibrated using traffic factors as in Equation (25), where $t^{q}$ represents the travel time of the qth bus on the route predicted by the MDARNN model in the first 30 min, and $\bar{t^{q}}$ represents the true travel time of the qth bus on the route in the first 30 min.

$α = 1 + \frac{1}{n} \sum_{q = 1}^{n} \frac{(\bar{t^{q}} - t^{q})}{t^{q}}$

(25)
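The calibration factor in Equation (25) is one plus the mean relative deviation of the true travel times from the predicted ones. A minimal Python sketch follows; the function name `calibration_factor` and the sample values are hypothetical, and returning 1.0 when no buses were observed is an assumption made here, not something stated in the paper.

```python
from typing import Sequence


def calibration_factor(predicted: Sequence[float],
                       actual: Sequence[float]) -> float:
    """Compute alpha = 1 + mean((actual - predicted) / predicted), Equation (25)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        return 1.0  # assumption: no recent observations means no correction
    n = len(predicted)
    return 1.0 + sum((a - p) / p for p, a in zip(predicted, actual)) / n


# Two recent buses, each taking 10% longer than predicted.
alpha = calibration_factor([10.0, 20.0], [11.0, 22.0])
print(round(alpha, 3))
```

Multiplying a new prediction by this factor scales it up when recent buses ran slower than predicted and down when they ran faster.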

PCF-LR: PCF-LR combines the historical average method for predicting Transfer Point Waiting Time and constructs an LSTM model for travel time prediction. The total travel time is then obtained using linear regression. This method is employed for Bus Arrival Time prediction in [9].

In this study, four bus routes are considered; the results of the performance comparison are presented in Table 10. All experiments were conducted under the same environment and using the same dataset as processed in this paper as shown in Table 6.

PMLNet outperforms the other models on all routes, confirming the effectiveness of the parallel prediction framework. Notably, MAPE slightly decreases as route distance increases (Route 4 is the shortest). This is because bus travel time depends on travel status, real-time traffic congestion, stop dwelling time, and transfer waiting time. When a bus route is shorter, its road sections are shorter and its bus stops are fewer, so the predicted congestion status, travel time between stops, dwelling time at stops, and waiting time at the transfer point differ more widely. The predicted values of the four sub-modules tend to stabilize as the travel distance increases, resulting in better prediction accuracy. In contrast, longer bus routes tend to have a larger MAE, because a longer route has a greater average travel time and the MAE accumulates as the travel time increases. For this reason, Route 4, with the shortest travel distance, has a higher MAPE value and a lower MAE value than the other routes.

Figure 12 shows the MAE and MAPE of the four models on different routes. The MAE and MAPE of PCF-DARNN are higher than those of PMLNet, suggesting that MDARNN, which considers macro impact factors, is more suitable than DA-RNN for bus travel time prediction. Pure MDARNN performs worse in both MAE and MAPE than the models that use the partition and combination framework, possibly because the framework assigns different neural networks according to the characteristics of the four travel time parts. The performance of PCF-unweight on the four routes is inferior to that of PMLNet, which demonstrates that the real-time traffic factors are effective in reducing the deviation of bus travel time prediction.

The overall performance of PMLNet is also optimal, as shown in Table 11. Compared with the other methods, both the MAE and MAPE of PMLNet are small. The MAPE is 2.91%, which indicates the effectiveness of the PMLNet model. The MAE is 1.45 min, meaning that the average difference between the estimated and actual travel time on the four travel routes is 1.45 min, a low value acceptable to passengers. The experimental results show that the $α$ correction factor improves the prediction accuracy of each route by about 15–40% by integrating real-time traffic flow information.

4.5. Bus Travel Time Prediction

To validate the effectiveness of PMLNet, a bus travel time prediction system based on PMLNet is designed. The system architecture is given in Figure 13. As shown in Figure 13, the system employs a client/server model to achieve travel time prediction. The client handles real-time data collection/processing, travel time prediction, and output demonstration, while the server handles database access and PMLNet training. The data collected by the client is sent to the server and stored in the server database.

The client first collects and processes the real-time macro and local factors, then sends the real-time location information to the client page and the real-time data to both PMLNet and the server. After obtaining the features, the PMLNet module outputs the predicted bus travel time. Finally, the client page displays the real-time location information and the predicted bus travel time. On the server side, the real-time data, actual travel time, and predicted travel time are stored in the database. As sufficient data accumulates, the model can undergo incremental training on the server to enhance its generalization capability.

The system was tested on a XIAOMI 10 on 15 June 2023. The test trip ran from route 805 to route 202, starting at Jialing Garden, ending at Golden Fort, and transferring at Shapingba Station.

The client screenshots show the user interface of the bus travel time prediction system. This interface is designed to provide users with key information about their bus trips. The main elements displayed include the route map, predicted and actual travel times, and important station markers. Before starting their trip, users can interact with the interface by checking the predicted times to plan their commute. After the trip, they can compare the actual travel times with the predicted ones to assess the system’s accuracy.

Figure 14 shows the client page at the start location and the destination. In (a), the predicted travel time consists of two driving periods and one waiting period, while (b) demonstrates the actual travel time of three periods. The test results reveal that the error between the predicted travel time and the actual travel time is less than two minutes, which proves that the bus travel time prediction system can accurately identify the real-time location of the bus and predict the travel time with high accuracy.

5. Conclusions

In this paper, we introduced a bus travel time prediction system based on PMLNet, which uses a partition and combination framework to predict multi-route bus travel times. PMLNet integrates macro and local impact factors through the MDARNN model and employs four real-time traffic flow factors to refine the predicted results, which enhances prediction accuracy. The PMLNet method can be reproduced in other contexts, given the availability of the essential databases and the generalizability of the proposed framework. The core components of our system, including the partitioning of travel time into distinct components and the use of both LSTM and MDARNN models, are designed to be adaptable to various urban transportation scenarios. Our experimental results show that PMLNet outperforms the compared methods in bus travel time prediction across multiple routes. This advancement not only bridges the gap in existing prediction methodologies but also offers reference options for intelligent transportation systems, potentially improving urban mobility and passenger experience.

There are still some shortcomings in our work that need to be further addressed, such as the model’s dependency on high-quality, comprehensive datasets and its performance under varying traffic conditions. Future research will focus on optimizing PMLNet’s architecture for broader applicability, including the integration of multimodal transportation data, to further refine prediction accuracies. Additionally, exploring adaptive algorithms capable of real-time adjustments to sudden traffic changes presents a promising avenue for enhancing the robustness and reliability of our prediction system.

Author Contributions

Conceptualization, Q.H., L.Z. and J.L.; methodology, L.Z., G.H. and Y.C.; software, Y.C. and G.H.; validation, Q.H., L.Z. and Y.C.; formal analysis, Y.C. and G.H.; investigation, Y.C. and G.H.; resources, Q.H. and L.Z.; data curation, Y.C.; writing—original draft preparation, Y.C.; writing—review and editing, Q.H., L.Z. and J.L.; visualization, Y.C. and G.H.; supervision, Q.H. and L.Z.; project administration, L.Z.; funding acquisition, Q.H., L.Z. and J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Science and Technology Innovation Key R&D Program of Chongqing No. CSTB2022TIAD-STX0001.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The bus driving dataset was provided by our partner company Chongqing Hengtong Bus Company and they have not given permission for the researchers to share their data. Data requests can be made via tangtao.cy@crrcgc.cc. The bus route dataset was extracted from the Amap open platform, and data will be made available through https://lbs.amap.com/. The meteorological dataset was sourced from Eastern Weather Net, and data will be made available through https://tianqi.eastday.com/.

Acknowledgments

This research is supported by Science and Technology Innovation Key R&D Program of Chongqing No. CSTB2022TIAD-STX0001.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Zeng, W.; Fu, C.-W.; Arisona, S.M.; Erath, A.; Qu, H. Visualizing mobility of public transportation system. IEEE Trans. Vis. Comput. Graph. 2014, 20, 1833–1842.
2. Petersen, N.C.; Rodrigues, F.; Pereira, F.C. Multi-output bus travel time prediction with convolutional LSTM neural network. Expert Syst. Appl. 2019, 120, 426–435.
3. Han, Q.; Liu, K.; Zeng, L.; He, G.; Ye, L.; Li, F. A bus arrival time prediction method based on position calibration and LSTM. IEEE Access 2020, 8, 42372–42383.
4. Akhtar, M.; Moridpour, S. A review of traffic congestion prediction using artificial intelligence. J. Adv. Transp. 2021, 2021, 8878011.
5. Hofmann, M.; O’Mahony, M. The impact of adverse weather conditions on urban bus performance measures. In Proceedings of the 2005 IEEE Intelligent Transportation Systems Conference, Vienna, Austria, 16 September 2005; pp. 84–89.
6. Zhou, M.; Wang, D.; Li, Q.; Yue, Y.; Tu, W.; Cao, R. Impacts of weather on public transport ridership: Results from mining data from different sources. Transp. Res. Part C Emerg. Technol. 2017, 75, 17–29.
7. Ramakrishna, Y.; Ramakrishna, P.; Lakshmanan, V.; Sivanandan, R. Use of GPS probe data and passenger data for prediction of bus transit travel time. In Transportation Land Use, Planning, and Air Quality; American Society of Civil Engineers: Reston, VA, USA, 2008; pp. 124–133.
8. Zhang, C. Comparative Research on the Governance Mode of Urban Public Transport Service. Ph.D. Thesis, Shanghai Jiao Tong University, Shanghai, China, 2015.
9. He, P.; Jiang, G.; Lam, S.-K.; Tang, D. Travel-time prediction of bus journey with multiple bus trips. IEEE Trans. Intell. Transp. Syst. 2018, 20, 4192–4205.
10. He, P.; Jiang, G.; Lam, S.-K.; Sun, Y. Learning heterogeneous traffic patterns for travel time prediction of bus journeys. Inf. Sci. 2020, 512, 1394–1406.
11. Pang, J.; Huang, J.; Du, Y.; Yu, H.; Huang, Q.; Yin, B. Learning to predict bus arrival time from heterogeneous measurements via recurrent neural network. IEEE Trans. Intell. Transp. Syst. 2018, 20, 3283–3293.
12. Qin, Y.; Song, D.; Chen, H.; Cheng, W.; Jiang, G.; Cottrell, G. A dual-stage attention-based recurrent neural network for time series prediction. arXiv 2017, arXiv:1704.02971.
13. Ma, J.; Chan, J.; Ristanoski, G.; Rajasegarar, S.; Leckie, C. Bus travel time prediction with real-time traffic information. Transp. Res. Part C Emerg. Technol. 2019, 105, 536–549.
14. Bin, Y.; Yang, Z.; Yao, B. Bus arrival time prediction using support vector machines. J. Intell. Transp. Syst. 2006, 10, 151–158.
15. Jeong, R.; Rilett, R. Bus arrival time prediction using artificial neural network model. In Proceedings of the 7th International IEEE Conference on Intelligent Transportation Systems, Washington, DC, USA, 3–6 October 2004; pp. 988–993.
16. Kumar, B.A.; Jairam, R.; Arkatkar, S.S.; Vanajakshi, L. Real time bus travel time prediction using k-NN classifier. Transp. Lett. 2019, 11, 362–372.
17. Yu, Z.; Wood, J.S.; Gayah, V.V. Using survival models to estimate bus travel times and associated uncertainties. Transp. Res. Part C Emerg. Technol. 2017, 74, 366–382.
18. Wang, Y.; Bie, Y.; An, Q. Impacts of winter weather on bus travel time in cold regions: Case study of Harbin, China. J. Transp. Eng. Part A Syst. 2018, 144, 05018001.
19. Xu, T.; Li, X.; Claramunt, C. Trip-oriented travel time prediction (TOTTP) with historical vehicle trajectories. Front. Earth Sci. 2018, 12, 253–263.
20. Yu, B.; Wang, H.; Shan, W.; Yao, B. Prediction of bus travel time using random forests based on near neighbors. Comput.-Aided Civ. Infrastruct. Eng. 2018, 33, 333–350.
21. Julio, N.; Giesen, R.; Lizana, P. Real-time prediction of bus travel speeds using traffic shockwaves and machine learning algorithms. Res. Transp. Econ. 2016, 59, 250–257.
22. Mazloumi, E.; Rose, G.; Currie, G.; Sarvi, M. An integrated framework to predict bus travel time and its variability using traffic flow data. J. Intell. Transp. Syst. 2011, 15, 75–90.
23. Sinn, M.; Yoon, J.W.; Calabrese, F.; Bouillet, E. Predicting arrival times of buses using real-time GPS measurements. In Proceedings of the 2012 15th International IEEE Conference on Intelligent Transportation Systems, Anchorage, AK, USA, 16–19 September 2012; pp. 1227–1232.
24. Wall, Z.; Dailey, D. An Algorithm for Predicting the Arrival Time of Mass Transit Vehicles Using Automatic Vehicle Location Data. Master’s Thesis, University of Washington, Seattle, WA, USA, 1998.
25. Chen, M.; Liu, X.; Xia, J. Dynamic prediction method with schedule recovery impact for bus arrival time. Transp. Res. Rec. 2005, 1923, 208–217.
26. Fan, W.; Gurmu, Z. Dynamic travel time prediction models for buses using only GPS data. Int. J. Transp. Sci. Technol. 2015, 4, 353–366.
27. Du, L.; Peeta, S.; Kim, Y.H. An adaptive information fusion model to predict the short-term link travel time distribution in dynamic traffic networks. Transp. Res. Part B Methodol. 2012, 46, 235–252.
28. Rice, J.; Van Zwet, E. A simple and effective method for predicting travel times on freeways. IEEE Trans. Intell. Transp. Syst. 2004, 5, 200–207.
29. Li, Y.; Huang, C.; Jiang, J. Research of bus arrival prediction model based on GPS and SVM. In Proceedings of the 2018 Chinese Control And Decision Conference (CCDC), Shenyang, China, 9–11 June 2018; pp. 575–579.
30. Zhang, J.; Gu, J.; Guan, L.; Zhang, S. Method of predicting bus arrival time based on MapReduce combining clustering with neural network. In Proceedings of the 2017 IEEE 2nd International Conference on Big Data Analysis (ICBDA), Beijing, China, 10–12 March 2017; pp. 296–302.
31. Yuan, Y.; Shao, C.; Cao, Z.; He, Z.; Zhu, C.; Wang, Y.; Jang, V. Bus dynamic travel time prediction: Using a deep feature extraction framework based on RNN and DNN. Electronics 2020, 9, 1876.
32. Sun, H.; Li, F.; Liu, L. A spatial-temporal neural network model based on meta learning for bus arrival time prediction. In Proceedings of the 2024 10th International Conference on Big Data Computing and Communications (BigCom), Dalian, China, 9–11 August 2024; pp. 57–64.
33. Qiu, T.; Lam, C.-T.; Liu, B.; Ng, B.K.; Yuan, X.; Im, S.K. FEN-MRMGCN: A frontend-enhanced network based on multi-relational modeling GCN for bus arrival time prediction. IEEE Access 2025, 13, 5296–5307.
34. Shanghai Meteorological Service Center. Eastday Weather. Available online: http://tianqi.eastday.com/ (accessed on 15 July 2025).
35. Amap. Amap Open Platform. Available online: https://developer.amap.com/ (accessed on 15 July 2025).

Figure 1. The framework of the real-time bus travel time prediction.

Figure 2. The influence of environment factors. (a) Temperature. (b) Holiday. (c) Air quality. (d) Peak/off-peak time. (e) Week.

Figure 3. The path model of the entire journey.

Figure 4. The inputs and outputs of the LSTM and MDARNN models. (a) LSTM for Current Driving Segment Travel Time and Transfer Point Waiting Time prediction. (b) MDARNN for Bus Stop Dwelling Time and Stop-to-Stop Travel Time prediction.

Figure 5. The bus driving routes of 805, 248, 121, and 202.

Figure 6. The architecture of PMLNet.

Figure 7. The architecture of MDARNN.

Figure 9. The loss curve of training and verification of the four models. (a) The loss decline curve of Current Driving Segment Travel Time. (b) The loss decline curve of Bus Stop Dwelling Time. (c) The loss decline curve of Stop-to-Stop Travel Time. (d) The loss decline curve of Transfer Point Waiting Time.

Figure 10. The values of MAE, MSE, and MAPE at different stops. (a) The values of MAE, MSE, and MAPE of Bus Stop Dwelling Time. (b) The values of MAE, MSE, and MAPE of Stop-to-Stop Travel Time.

Figure 11. The error of various transfer points.

Figure 12. The performance of PCF-DARNN, PCF-unweight, Pure MDARNN, and PMLNet. (a) MAE comparison. (b) MAPE comparison.

Figure 13. The architecture of bus travel time prediction system.

Figure 14. The client page of the system. (a) Start location page. (b) Destination page.

Table 1. The bus routes information of 805, 248, 121, and 202.

Route	Start Station	Destination	Distance (km)	Average Driving Time (min)
805	Chongqing University	Duijin Village	9.6	34
248	Chenjiawan	Xianfeng Street	13	38
121	East of Panxi	Danlong Road	13.3	55
202	Ciqikou	North Station South Square	16.4	61

Table 2. The explanation for Algorithm 1 variables and functions.

Variable/Function	Explanation
$D_{d}$ , $D_{r}$ , $D_{m}$	The raw datasets, denoted as Equations (5) and (6).
$D_{d_{i}}$	Currently reading driving data at time $t_{i}$ , consists of ${D_{d_{i}}^{1}, \dots, D_{d_{i}}^{46}}$ .
$t_{i}$	The time feature of $D_{d_{i}}$ .
$d_{i}$	The date feature of $D_{d_{i}}$ .
$D_{f u s e d}$	The fused dataset, denoted as Equation (7).
$D_{f u s e d_{i}}$	Currently fusing data at time $t_{i}$ .
T	The peak periods set.
H	The holiday set.
S	The coordinate set of stops from $D_{r}$ , where the difference in latitude and longitude between both fall within a range of 0.0001 degrees.
$I F_{m a c r o_{i}}^{1}$	The peak factor of the macro impact factors at time $t_{i}$ , where Peak_Label is a constant indicating that $t_{i}$ is in T.
$I F_{m a c r o_{i}}^{2}$	The holiday factor of the macro impact factors at time $t_{i}$ of date $d_{i}$ , where Holiday_Label is a constant indicating that $d_{i}$ is in H.
${I F_{m a c r o_{i}}^{3}, \dots, I F_{m a c r o_{i}}^{7}}$	Macro impact factors related to meteorology and date at time $t_{i}$ : maximum and minimum temperature factors, air quality factor, week factor, date factor.
${I F_{l o c a l_{i}}^{1}, \dots, I F_{l o c a l_{i}}^{46}}$	The driving factor of local impact factor at time $t_{i}$ , such as speed, acceleration, throttle state, brake state, etc.
$I F_{l o c a l_{i}}^{47}$	The trip number factor of local impact factor at time $t_{i}$ .
$I F_{l o c a l_{i}}^{48}$	The stop factor of local impact factor at time $t_{i}$ .
$D_{d_{i}}^{1}, \dots, D_{d_{i}}^{46}$	Raw feature of $D_{d}$ at time $t_{i}$ .
$D_{m_{i}}, \dots, D_{m_{j}}$	Raw feature of $D_{m}$ at time $t_{i}$ .
$(L a t_{i}, L n g_{i})$	The bus location coordinates of $D_{d_{i}}$ at time $t_{i}$ .
get_coordinate_set ( $D_{r}$ )	Get the coordinate set of stops from $D_{r}$ , where the difference in latitude and longitude between stops both fall within a range of 0.0001 degrees.
fill_factor ( $A_{i}, B_{i}$ )	Filter redundancy and erroneous data, fill $B_{i}$ with $A_{i}$ , if $A_{i}$ is outlier (e.g., Null), fill $B_{i}$ with $A_{i - 1}$ .
get_trip_factor ( $D_{d_{i}}$ )	Get the trip order number, return how many times the bus has reached the start and end stop at time $t_{i}$ of $D_{d_{i}}$ .
get_stop_factor ((Lat, Lng), S)	Get the stop factor of stop impact factor, if $(L a t, L n g)$ exists within the coordinates of a stop of S, return the serial number of the stop.
get_data ( $D_{d_{i}}$ )	Get $(L a t, L n g)$ , $t_{i}$ and $d_{i}$ from $D_{d_{i}}$ .
fuse_data ( $A_{i}, B_{i}$ )	Fuse data $A_{i}$ and $B_{i}$ as ${A_{i}, B_{i}}$ according to Equation (7).
add ( $D_{f u s e d_{i}}$ , $D_{f u s e d}$ )	Add fusing data $D_{f u s e d_{i}}$ to $D_{f u s e d}$ .

Table 3. The explanation for Algorithm 2 variables and functions.

Variable/Function	Explanation
$X, Y$	The bus routes.
A	One of the buses running on route X.
B	One of the buses running on route Y.
$D_{fused}^{A}$	The fused dataset of bus A.
$D_{fused}^{B}$	The fused dataset of bus B.
$D_{r}$	The raw route dataset, given by Equation (6).
$D_{fused_{i}}^{A}$	The driving record of $D_{fused}^{A}$ currently being read, at time $t_{i}$.
$D_{fused_{j}}^{B}$	The driving record of $D_{fused}^{B}$ currently being read, at time $t_{j}$.
$D_{X \to Y}$	The travel chain dataset from route X to route Y, given by Equation (8).
$Z1$	The latitude and longitude coordinates of bus A on route X from the start stop to the transfer point, obtained from $D_{r}$.
$Z2$	The location for which the total longitude and latitude error relative to the transfer point falls within a range of 0.0001 degrees, obtained from $D_{r}$.
$Z3$	The latitude and longitude coordinates of bus B on route Y from the transfer point to the end stop, obtained from $D_{r}$.
$(Lat_{i}^{A}, Lng_{i}^{A})$	The driving location coordinates of $D_{fused}^{A}$ at time $t_{i}$.
$(Lat_{j}^{B}, Lng_{j}^{B})$	The driving location coordinates of $D_{fused}^{B}$ at time $t_{j}$.
get_coordinates($D_{fused}$)	Get $Z1$, $Z2$ and $Z3$ from $D_{fused}$.
get_location($D_{fused}^{A}$)	Get the location information (longitude and latitude) from $D_{fused}^{A}$ at time $t_{i}$.
is_first_arrival($D_{fused}^{A}$, $D_{fused}^{B}$)	Check whether bus B, at time $t_{j}$, is the first bus of route Y to arrive at the transfer point after bus A arrives (at time $t_{i}$). Here, $t_{i}$ and $t_{j}$ are the current time features of $D_{fused}^{A}$ and $D_{fused}^{B}$, respectively.
add($D_{fused}^{A}$, $D_{fused}^{B}$)	Add the driving data $D_{fused}^{B}$ to $D_{fused}^{A}$.
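
The first-arrival check used when building a travel chain can be sketched as follows. This is a minimal illustration, not the implementation of Algorithm 2: the function names first_arrival_after and waiting_time, the use of plain timestamps in seconds, and the choice to count a simultaneous arrival as a valid transfer are all assumptions made for the example.

```python
from bisect import bisect_left
from typing import Optional, Sequence


def first_arrival_after(arrival_a: float, arrivals_b: Sequence[float]) -> Optional[int]:
    """Index of the first route-Y arrival at or after bus A's arrival, else None.

    `arrivals_b` holds the transfer-point arrival times of route-Y buses and
    must be sorted in ascending order (for example, seconds since midnight).
    """
    idx = bisect_left(arrivals_b, arrival_a)  # first position with time >= arrival_a
    return idx if idx < len(arrivals_b) else None


def waiting_time(arrival_a: float, arrivals_b: Sequence[float]) -> Optional[float]:
    """Waiting time at the transfer point, or None if no later bus B exists."""
    idx = first_arrival_after(arrival_a, arrivals_b)
    if idx is None:
        return None
    return arrivals_b[idx] - arrival_a
```

Using a binary search keeps the lookup logarithmic in the number of route-Y arrivals, which matters when the fused datasets contain millions of records.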

Table 4. The corresponding information for each travel chain.

Travel Chain	Sub-Dataset	Start	End	Transfer
805 to 248	$D_{805 - 248}$	Jialing Garden	Xianfeng Street	Shuangbei
202 to 121	$D_{202 - 121}$	Shapingba Station	South of TIYZ	Golden Fort
805 to 202	$D_{805 - 202}$	Jialing Garden	Golden Fort	Shapingba Station
121 to 202	$D_{121 - 202}$	SM Square	Rongjing City	Golden Fort

Table 5. The explanation for Equation (9).

Variable/Function	Explanation
$\alpha_{CDSTT}$	The Current Driving Segment Travel Time factor.
$\hat{t}_{CDSTT}^{q}$	The predicted Current Driving Segment Travel Time of bus q.
$t_{CDSTT}^{q}$	The real Current Driving Segment Travel Time of bus q.
$n^{c}$	The number of buses that passed through the CDS during the time period $[t_{0} - 30\ \text{min}, t_{0}]$, with $q \in n$; $t_{0}$ is the current time.
$\alpha_{SSTT(k, k+1)}$	The stop-to-stop travel factor of the segment between $Stop_{k}$ and $Stop_{k+1}$.
$u^{c}$	The number of buses that passed through the segment between $Stop_{k}$ and $Stop_{k+1}$ during the time period $[t_{0} - 30\ \text{min}, t_{0}]$, with $q \in u$.
$\hat{t}_{SSTT(k, k+1)}^{q}$	The predicted Stop-to-Stop Travel Time of bus q through the segment between $Stop_{k}$ and $Stop_{k+1}$.
$t_{SSTT(k, k+1)}^{q}$	The real Stop-to-Stop Travel Time of bus q through the segment between $Stop_{k}$ and $Stop_{k+1}$.
$\alpha_{BSDT(k)}$	The bus stop dwelling factor of $Stop_{k}$.
$v^{c}$	The number of buses that dwelled at $Stop_{k}$ during the time period $[t_{0} - 30\ \text{min}, t_{0}]$, with $q \in v$.
$\hat{t}_{BSDT(k)}^{q}$	The predicted Bus Stop Dwelling Time of bus q at $Stop_{k}$.
$t_{BSDT(k)}^{q}$	The real Bus Stop Dwelling Time of bus q at $Stop_{k}$.
$\alpha_{TPWT}$	The Transfer Point Waiting Time factor.
$\hat{t}_{TPWT}^{q}$	The predicted waiting time for the qth transfer of the same kind at the transfer point.
$t_{TPWT}^{q}$	The real waiting time for the qth transfer of the same kind at the transfer point.
$w^{c}$	The number of transfers at the transfer point during the time period $[t_{0} - 30\ \text{min}, t_{0}]$, with $q \in w$.

Table 6. The corresponding results of the dataset partition.

Dataset	Total Number of Data (Million)	Training Dataset (Million)	Verification Dataset (Million)	Testing Dataset (Million)
$D_{CDSTT}$	13.444	8.064	2.688	2.688
$D_{BSDT}$	7.3	4.38	1.46	1.46
$D_{SSTT}$	10.70	6.42	2.14	2.14
$D_{TPWT}$	0.096	0.0576	0.0192	0.0192
$D_{PMLNet.T}$	4.2	-	-	4.2
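
The sub-dataset sizes in Table 6 are consistent with a 60/20/20 partition into training, verification, and testing data (for example, 4.38/7.3 = 0.6 for $D_{BSDT}$). A minimal sketch of such a partition is given below; the function name split_60_20_20 and the order-preserving (unshuffled) split are assumptions made for illustration, since the table does not state how records were assigned to each part.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_60_20_20(records: Sequence[T]) -> Tuple[List[T], List[T], List[T]]:
    """Split records into 60% training, 20% verification, and 20% testing parts.

    The original order is preserved (assumption): the first 60% of the records
    form the training part, the next 20% the verification part, and the rest
    the testing part. Integer arithmetic avoids floating-point rounding.
    """
    n = len(records)
    train_end = (n * 6) // 10
    verify_end = (n * 8) // 10
    return (
        list(records[:train_end]),
        list(records[train_end:verify_end]),
        list(records[verify_end:]),
    )
```

Applied to 7300 records, the function returns parts of 4380, 1460, and 1460 records, which mirrors the proportions of the $D_{BSDT}$ row.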

Table 7. The parameter setting of four modules in PMLNet.

Module	Local Factors	Macro Factors	Output Length	Batch Size	Time Step	Learning Rate
Bus Stop Dwelling Time	48	7	Variable length	128	5	0.006
Stop-to-Stop Travel Time	48	7	Variable length	128	5	0.006
Current Driving Segment Travel Time	55	55	1	128	10	0.006
Transfer Point Waiting Time	55	55	1	128	3	0.006

Table 8. The explanation for Algorithm 3 variables and functions.

Variable/Function	Explanation
s	First moment vector.
r	Second moment vector.
$\epsilon$	Step size.
$\rho_{1}, \rho_{2}$	Exponential decay rates for the moment estimates.
$\sigma$	Small constant used for numerical stabilization.
$\theta$	Initial parameters.
g	The gradients.
$\hat{s}$	Bias-corrected first moment estimate.
$\hat{r}$	Bias-corrected second moment estimate.
sample $(x^{(1)}, \dots, x^{(m)}, y^{(i)})$	Sample a minibatch of m examples from the training set $\{x^{(1)}, \dots, x^{(m)}\}$ with corresponding targets $y^{(i)}$.
set_gradients($m, \theta, x^{(i)}, y^{(i)}$)	Obtain the gradients based on the sample number m, the parameters $\theta$, the samples $x^{(i)}$, and the targets $y^{(i)}$.
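
The symbols in Table 8 appear to correspond to the standard Adam update with bias-corrected moment estimates. The sketch below shows a single update step under that assumption. The function name adam_step and the default values of rho1, rho2, and sigma are common choices assumed here rather than values reported in the paper; the default step size of 0.006 is taken to be the learning rate listed in Table 7.

```python
import math
from typing import List, Tuple


def adam_step(
    theta: List[float],
    grad: List[float],
    s: List[float],
    r: List[float],
    t: int,
    eps: float = 0.006,    # step size (learning rate from Table 7, assumed)
    rho1: float = 0.9,     # decay rate of the first moment (assumed default)
    rho2: float = 0.999,   # decay rate of the second moment (assumed default)
    sigma: float = 1e-8,   # numerical stabilization constant (assumed default)
) -> Tuple[List[float], List[float], List[float]]:
    """Apply one Adam update and return the new (theta, s, r).

    `t` is the 1-based step counter used for bias correction.
    """
    new_theta, new_s, new_r = [], [], []
    for th, g, s_i, r_i in zip(theta, grad, s, r):
        s_i = rho1 * s_i + (1.0 - rho1) * g       # biased first moment
        r_i = rho2 * r_i + (1.0 - rho2) * g * g   # biased second moment
        s_hat = s_i / (1.0 - rho1 ** t)           # bias-corrected first moment
        r_hat = r_i / (1.0 - rho2 ** t)           # bias-corrected second moment
        new_theta.append(th - eps * s_hat / (math.sqrt(r_hat) + sigma))
        new_s.append(s_i)
        new_r.append(r_i)
    return new_theta, new_s, new_r
```

In the first step the bias correction cancels the decay factors, so the parameter moves by approximately the step size regardless of the gradient magnitude; this is why the step size is often read as an upper bound on the per-step change.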

Table 9. Performance metrics of sub-models.

Module	MAE (min)	MAPE (%)	MSE
Current Driving Segment Travel Time	0.63	2.75	0.97
Bus Stop Dwelling Time	0.052	1.32	0.098
Stop-to-Stop Travel Time	0.96	0.87	1.12
Transfer Point Waiting Time	0.79	2.81	0.88
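
For reference, the three metrics reported in Table 9 can be computed as in the following sketch, which assumes the standard definitions of MAE, MAPE, and MSE (the paper's own formulas in Section 4.3.1 are not reproduced in this part of the text). The function names and the example values in the usage note are illustrative only.

```python
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error, in the unit of the inputs (e.g., minutes)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error, in the squared unit of the inputs."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
```

For actual travel times of 10, 20, and 40 min and predictions of 11, 18, and 40 min, the MAE is 1.0 min, because the absolute errors of 1, 2, and 0 min average to 1. MAPE is undefined when an actual value is zero, which is worth keeping in mind for very short dwelling times.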

Table 10. Performance comparison of different methods on multiple routes.

Method	Route 1 MAE (min)	Route 1 MAPE (%)	Route 2 MAE (min)	Route 2 MAPE (%)	Route 3 MAE (min)	Route 3 MAPE (%)	Route 4 MAE (min)	Route 4 MAPE (%)
HA [23]	5.63	11.28	7.12	9.41	6.35	13.57	4.78	15.77
SVR [29]	4.43	11.25	6.75	8.33	5.93	10.49	3.48	14.25
Pure-MDARNN	4.12	7.03	4.99	6.33	4.70	7.19	4.54	8.35
PCF-LR [9]	3.01	6.18	3.91	4.52	3.43	5.76	2.75	8.27
PCF-unweight	2.46	4.79	3.89	3.96	3.64	4.84	2.29	5.82
PCF-LSTM	2.25	4.85	3.40	3.35	3.49	4.95	2.18	5.72
PCF-DARNN	2.16	4.14	2.94	3.11	2.65	3.05	1.86	4.62
PMLNet	1.89	2.69	2.48	1.86	2.07	2.06	1.58	3.35

Table 11. The overall performance comparison.

Method	MAE (min)	MAPE (%)
HA [23]	6.32	14.89
SVR [29]	5.78	12.25
Pure-MDARNN	3.32	7.25
PCF-LR [9]	2.58	5.77
PCF-unweight	2.13	4.32
PCF-LSTM	2.39	5.85
PCF-DARNN	2.02	3.22
PMLNet	1.45	2.91
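
To put the figures in Table 11 into perspective, the relative error reduction of PMLNet over a baseline can be computed as in the short sketch below. The dictionary simply restates the values from Table 11; the function name relative_reduction is a hypothetical helper introduced for this example.

```python
# (MAE in minutes, MAPE in percent), copied from Table 11.
TABLE_11 = {
    "HA": (6.32, 14.89),
    "SVR": (5.78, 12.25),
    "Pure-MDARNN": (3.32, 7.25),
    "PCF-LR": (2.58, 5.77),
    "PCF-unweight": (2.13, 4.32),
    "PCF-LSTM": (2.39, 5.85),
    "PCF-DARNN": (2.02, 3.22),
    "PMLNet": (1.45, 2.91),
}


def relative_reduction(baseline: float, candidate: float) -> float:
    """Percentage by which `candidate` reduces the error of `baseline`."""
    return 100.0 * (baseline - candidate) / baseline


def reduction_vs(method: str) -> tuple:
    """Relative MAE and MAPE reduction of PMLNet over `method`, in percent."""
    base_mae, base_mape = TABLE_11[method]
    pml_mae, pml_mape = TABLE_11["PMLNet"]
    return (
        relative_reduction(base_mae, pml_mae),
        relative_reduction(base_mape, pml_mape),
    )
```

With these values, PMLNet reduces the MAE of the strongest baseline, PCF-DARNN, by about 28.2% and its MAPE by about 9.6%.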

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
