Hybrid Graph Convolutional-Recurrent Framework with Community Detection for Spatiotemporal Demand Prediction in Micromobility Systems

Moon Zin, Mayme; Patanukhom, Karn; Demissie, Merkebe Getachew; Phithakkitnukoon, Santi

doi:10.3390/math14010116


by Mayme Moon Zin ¹, Karn Patanukhom ¹,*, Merkebe Getachew Demissie ² and Santi Phithakkitnukoon ¹,*

¹ Department of Computer Engineering, Faculty of Engineering, Chiang Mai University, Chiang Mai 50200, Thailand

² Department of Civil Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB T2N 1N4, Canada

* Authors to whom correspondence should be addressed.

Mathematics 2026, 14(1), 116; https://doi.org/10.3390/math14010116

Submission received: 19 November 2025 / Revised: 19 December 2025 / Accepted: 22 December 2025 / Published: 28 December 2025

(This article belongs to the Special Issue Theoretical and Applied Mathematics in Supply Chain Management)


Abstract

The rapid growth of dockless electric scooter (e-scooter) sharing services has transformed short-distance urban mobility, offering convenience and sustainability benefits while amplifying challenges related to demand imbalance, fleet rebalancing, and spatial inequity. Accurate spatiotemporal demand prediction is therefore essential for optimizing resource allocation and supporting data-driven policy interventions. This study proposes a hybrid deep learning framework that integrates a Graph Convolutional Network (GCN) with a Gated Recurrent Unit (GRU) and community detection to enhance short-term prediction of e-scooter pick-up and drop-off demands. The Louvain algorithm is employed to partition urban areas into mobility-based communities, enabling the model to capture functional connectivity rather than relying solely on geographic proximity. Using real-world e-scooter trip data from Calgary, Canada, the model’s performance is evaluated against established baselines, including a Masked Fully Convolutional Network (MFCN) and conventional GRU architectures. Results show that the proposed approach achieves up to 11.8% improvement in mean absolute error (MAE) compared with the MFCN baseline and more robust generalization across temporal horizons. The findings demonstrate that integrating community structures into graph-based learning effectively captures complex urban dynamics, providing practical insights for sustainable micromobility operation and service deployment.

Keywords:

e-scooter demand prediction; spatiotemporal modeling; graph convolutional network (GCN); gated recurrent unit (GRU); community detection; micromobility; urban mobility management

MSC:

90B06

1. Introduction

Shared micromobility has rapidly transformed urban transportation over the past decade. Among various forms, dockless electric scooters (e-scooters) have expanded swiftly since their launch in Santa Monica, California, in 2017 by Bird. Within only a few years, major operators such as Bird and Lime scaled globally, embedding e-scooters as a convenient yet controversial feature of many city streets across North America, Europe, and Asia [1,2,3]. Their appeal lies in convenience, digital connectivity, and adaptability. E-scooters can be unlocked through mobile applications, tracked via GPS, and paid for seamlessly using integrated digital platforms [4].

E-scooters are often described as a potential solution to the “first/last-mile” problem, connecting users between public transport nodes and final destinations [5,6]. Within the broader context of Mobility-as-a-Service (MaaS), they support multimodal travel, reduce car dependency, and promote sustainable and inclusive mobility [7]. However, their proliferation has also introduced new challenges concerning safety, environmental impact, and social equity. Accidents and pedestrian conflicts have raised public concerns [8], while life-cycle analyses reveal that much of their environmental footprint stems from manufacturing, maintenance, and fleet rebalancing rather than electricity consumption [9,10]. Moreover, accessibility remains uneven, as deployment tends to concentrate in central or affluent areas [11,12]. These issues highlight the need for operational efficiency and evidence-based policymaking driven by accurate demand prediction.

1.1. Importance of Demand Prediction in Micromobility

Accurate spatiotemporal demand prediction plays a key role in sustaining shared micromobility systems. Reliable prediction enables operators to optimize fleet redistribution, reduce idle time, and maintain service availability in high-demand areas. From a public-policy perspective, demand prediction can guide infrastructure planning and equitable resource allocation. While statistical and classical machine learning models have provided foundational insights, they often fail to capture complex nonlinear interactions between spatial and temporal factors influencing micromobility demand [13,14].

With the rise of large-scale trip data, deep learning has become increasingly popular in mobility research. Early studies applied convolutional or recurrent neural networks to capture spatial or temporal correlations separately. Phithakkitnukoon et al. [15] proposed a Masked Fully Convolutional Network (MFCN) to predict e-scooter demand while handling data sparsity through a dual-branch classification and regression structure. Their approach achieved robust short- and long-term performance by weighting zero-demand regions and focusing on spatial areas of interest. Ham et al. [16] later introduced a convolutional autoencoder integrated with a recurrent decoder (ERD), demonstrating that latent feature extraction can improve short-term e-scooter demand prediction compared with long short-term memory (LSTM) models. Sahnoon et al. [17] evaluated three deep learning models, MFCN, UNet, and UNet Transformer (UNETR), for short-term prediction of e-scooter pick-ups and drop-offs. While Transformers are often favored for capturing long-range dependencies and achieving high accuracy, the study did not find definitive evidence of their overall superiority.

Kim et al. [18] incorporated community structure into a deep learning framework by clustering Seoul’s e-scooter service areas using modularity optimization before predicting temporal demand with LSTM models. They found that activation functions significantly affected peak-period prediction performance, with the exponential linear unit and hyperbolic tangent (tanh) functions yielding the most accurate results. Liu et al. [19] analyzed three months of e-scooter trips in Indianapolis to uncover temporal variations, noting pronounced peaks during commuting hours and minimal activity at night. Bai and Jiao [20] compared usage patterns between Austin and Minneapolis, identifying the built environment, such as proximity to downtown areas and land-use diversity, as major drivers of demand. Similarly, Lee et al. [21] developed a multivariate log-linear regression model in Manhattan that incorporated demographic factors such as population density and median income, revealing that socioeconomic variables also influence ridership.

1.2. Advances in Deep Learning for Spatiotemporal Demand Prediction

More sophisticated hybrid and graph-based models have recently emerged to address spatial dependencies more effectively. Yang et al. [22] proposed a Spatiotemporal Graph Convolutional Network (SGCN) combining graph convolution for spatial relations with LSTM for temporal dynamics, also including weather variables such as temperature, wind, and air quality. Using real-world data from Tianjin, their model outperformed baseline ARIMA and LSTM models. Li et al. [23] developed an irregular convolutional neural network (ICNN) to forecast bike-sharing demand in Singapore, Chicago, Washington D.C., New York, and London. The ICNN, which integrates convolutional and LSTM layers, achieved superior predictive accuracy across all sites.

Song et al. [24] tackled the issue of data imbalance using a Sparse Diffusion Convolutional Gated Recurrent Unit (SpDCGRU), which merges diffusion convolution with GRU architecture and introduces spatial data reclustering and fusion-loss strategies. Their approach achieved improved overall performance and interpretability. Xu et al. [25] advanced this direction by proposing a Spatiotemporal Multi-Graph Transformer (STMGT) that integrates four types of graphs, adjacency, functional similarity, demographic similarity, and transportation supply, to capture multiple forms of spatial dependency. Their model outperformed conventional deep learning frameworks and identified weather as the most influential predictor of demand. In parallel with these developments, attention-based architectures have gained prominence in spatiotemporal traffic and demand prediction. For example, Jiang et al. [26] proposed PDFormer, a transformer-based model designed to capture long-range propagation delays in complex traffic flows. Similarly, Lan et al. [27] introduced DSTAGNN, a dynamic spatiotemporal graph neural network that constructs adaptive spatial–temporal graphs from data to model evolving traffic patterns. More recently, Singh et al. [28] proposed an Integrated Spatiotemporal Graph Neural Network (ISTGCN) for traffic forecasting, demonstrating that jointly modeling spatial dependencies and temporal dynamics within a unified GNN architecture can significantly improve prediction accuracy. While these approaches highlight the effectiveness of advanced graph-based and attention-driven models, they typically rely on computationally intensive graph representations or fixed sensor networks, which limits direct functional interpretability in mobility systems.

These developments illustrate a clear progression toward deep, graph-based, and multi-feature architectures that jointly model spatial and temporal relationships. Nevertheless, most existing frameworks still rely on fixed geographic grids or administrative zones that may not represent functional travel communities. Few studies have explored spatial partitioning based on intrinsic travel-flow connectivity.

1.3. Graph-Based Learning and Community Detection

GCNs provide a powerful way to model complex dependencies between spatially linked nodes [13,14]. They have been successfully applied in transportation studies for traffic prediction [29], taxi-demand prediction [30], and ride-hailing demand analysis [31]. However, applications to micromobility remain relatively limited.

In parallel, community detection offers an alternative spatial structuring approach that reflects functional connectivity rather than fixed spatial boundaries. The Louvain algorithm [32] efficiently identifies cohesive communities within large networks by maximizing modularity [33]. Integrating this clustering approach into predictive models can capture localized collective behavior and improve interpretability in urban mobility studies. Dastjerdi and Morency [34] demonstrated that community-level modeling can enhance demand prediction performance in bike-sharing systems, particularly under changing conditions such as the COVID-19 pandemic.

1.4. Research Gap and Contribution

Despite significant progress in deep learning-based micromobility short-term demand prediction, most existing studies still rely on fixed grid cells or administrative zones that may not represent functional mobility structures. Such spatial partitions often fail to capture the dynamic flow connectivity among urban areas, which limits interpretability and reduces prediction performance under sparse or unevenly distributed data. In addition, many state-of-the-art models such as SGCN, ICNN, SpDCGRU, and STMGT construct spatial graphs solely from geographic adjacency or predefined neighborhood structures. These approaches are not able to capture latent mobility communities that emerge from actual origin–destination travel behavior. Although these models advance graph-based and hybrid spatiotemporal learning, they do not incorporate functional communities as explicit and learnable features within the prediction framework. Furthermore, although GCNs effectively model spatial dependencies and GRUs capture temporal dynamics, few studies have integrated these approaches with community detection to represent mobility-driven spatial relationships. Existing hybrid models such as STMGT rely on multiple predefined spatial graphs or handcrafted proximity measures, but they do not include flow-derived communities or embed community identifiers as trainable inputs within the GCN. Likewise, prior spatiotemporal variants typically use a single regression output and do not separate the occurrence of demand from its magnitude. These methodological gaps highlight the need for a unified framework that combines mobility-informed spatial structures with multi-task prediction.

To address these gaps, this study proposes a hybrid deep learning framework that combines GCNs with community detection for short-term e-scooter demand prediction. The Louvain algorithm is employed to partition the city into demand-flow-based communities, producing a functional spatial representation that better reflects real-world travel behavior. Unlike prior studies that rely on fixed spatial grids or distance-based adjacency, our framework embeds community identifiers as a learnable feature, enabling the GCN to capture both local geographic correlations and higher-level functional similarity across urban regions. This community-aware design differs from existing models such as STMGT, which do not incorporate community detection outputs into the learning process, and it enables the model to jointly leverage geographic adjacency and mobility-based functional zones. The integration of a GRU-based temporal encoder and a dual-branch classification and regression architecture further allows the model to learn short-term temporal dynamics, spatial spillover effects, and demand sparsity within a single cohesive design. This multi-task structure, which distinguishes the probability of activity from the magnitude of demand, provides an additional level of modeling flexibility not present in existing spatiotemporal hybrids. These combined capabilities have not been addressed together in previous frameworks. The resulting community graph is integrated into a GCN-based predictive model, enabling the framework to jointly learn temporal patterns and inter-community dependencies while mitigating the effects of data sparsity and spatial heterogeneity. Using real-world e-scooter trip data from Calgary, Canada, the model is benchmarked against established baselines, including MFCN and LSTM architectures. Results demonstrate that incorporating community structures within graph learning substantially improves predictive accuracy and robustness.

Beyond methodological advancement, this research provides actionable insights for service providers and urban policymakers. More accurate demand prediction can support efficient fleet rebalancing, service deployment, and sustainable integration of micromobility into urban transport systems. The findings also contribute to the broader field of urban informatics by showing how community-aware graph learning can uncover latent spatial organization within human mobility networks.

2. Materials and Methods

2.1. Dataset

This research uses a real-world dataset from Calgary, Canada, where two companies, Lime and Bird, deployed 1000 and 500 e-scooters, respectively. Calgary is the largest city in the Canadian province of Alberta. It covers an area of 820.62 km² and has a population of 1,306,784. Dockless e-scooter sharing was introduced in Calgary in July 2019. E-scooters are small, electric-powered vehicles that users can rent and return at any location via a mobile application. Each scooter trip record contains detailed usage information, including the date, hour of the day, day of the week, trip duration in minutes, trip length in kilometers, and starting/ending geolocations (latitude, longitude). The study period covers 75 days, from 15 July 2019 to 27 September 2019.

In the original dataset, the City of Calgary aggregated trip geolocations into 300 m² hexagonal grids for privacy protection. In contrast, our approach adopts a 200 m square grid format to enable a detailed spatial analysis of scooter usage patterns. The dataset contains 459,478 e-scooter rides from 4080 different starting grid locations to 4367 ending grid locations. Although Calgary is composed of approximately 21,702 square grids (200 m × 200 m), the analysis dataset covers 19.45% of the city. Figure 1 illustrates this spatial coverage, highlighting the active grids used in the analysis and the spatial concentration of trips within the downtown core. This study aims to predict the hourly and 24 h pick-up and drop-off demand for each square grid.

Analysis of the trip data indicates that e-scooter demand is higher on weekends than on weekdays. On average, there were 6570 trips per day on weekends and 5965 trips per day on weekdays, representing an increase of approximately 10% in weekend usage compared with weekday activity. Figure 2 shows the hourly demand patterns for weekday and weekend e-scooter usage. On weekdays, demand clearly shows two peaks, which is a strong indicator of work-related commuting. A smaller rise occurs in the morning, around 8 a.m., as people travel to work, followed by a much larger peak in the late afternoon, around 4 p.m. (16:00), when people make their return trips. In contrast, weekend demand grows more gradually through the late morning and peaks in the mid-afternoon, suggesting a stronger focus on leisure and social trips rather than commuting. The demand reaches its highest point in the late afternoon, between 3 p.m. and 6 p.m. (15:00–18:00). This comparison suggests that e-scooters play a dual role in urban mobility, serving as a practical option for structured weekday commutes and as a flexible, leisure-based mode of transport on weekends.

The analysis of trip duration and distance, shown in Figure 3, indicates that most e-scooter rides are short in both time and distance, a characteristic feature of micromobility usage. While the average trip duration was 12.88 min, the median was only 8.40 min. A similar pattern is seen for trip distance. The average trip covered 1.85 km, but the median distance was just 1.26 km. These data suggest that e-scooters in Calgary are most often used to solve the “first- and last-mile” problem, such as traveling between transit stations and workplaces, and to substitute for short walking trips rather than longer car journeys.

The spatial distribution of e-scooter activity across Calgary, illustrated in Figure 4, highlights strong spatial clustering in both pick-up and drop-off demand. The highest concentration of trips occurs in and around the downtown core, particularly in areas with dense commercial activity and proximity to public transit stations. This pattern suggests that e-scooters are commonly used for short urban trips within the central business district and for connecting to public transit hubs. Beyond the downtown area, moderate levels of activity are also visible on main transit routes and near other popular areas for work and recreation, such as those adjacent to the Bow River pathway network. Overall comparison between weekdays and weekends shows that weekday trips are slightly more centralized, reflecting commuter-oriented travel to and from the city center. In contrast, weekend trips are more spatially dispersed, with increased activity extending toward parks and leisure destinations. These spatiotemporal characteristics form the foundation for the predictive modeling described in the next section, where hourly and daily demand are estimated for each grid cell.

2.2. Proposed Model Architecture

To predict the spatiotemporal demand of e-scooters, we propose a deep learning model based on GCN and GRU. The workflow of the proposed model is summarized in Algorithm 1 and illustrated in Figure 5. A key aspect of our approach is the inclusion of a community feature, derived by applying the Louvain community detection algorithm to a travel network graph of the grids. This feature provides the model with a higher-level understanding of the urban structure. The model architecture consists of two main branches: a classification branch and a regression branch. The model input is a spatiotemporal tensor that represents 14 historical demand lag features, five contextual features (the sine–cosine components for the temporal dynamics and the weekend flag), and one categorical community identifier feature. In this implementation, only the lag features are processed by the GRU to learn short- and long-term temporal patterns. The resulting hidden states are then concatenated with the five contextual features and the embedded community feature, and the combined representation is passed through two GCN layers, which capture spatial dependencies between adjacent grids and communities. This combined GRU-GCN architecture integrates sequential and spatial dependencies within the same learning framework.

Algorithm 1 Spatiotemporal E-Scooter Demand Prediction Using the Proposed GCN Model

1: Input: Feature tensor X ∈ R^{B \times N \times H \times F}, where B is the batch size, N is the number of active grid nodes, H is the historical time window, and F is the total number of input features. Edge index matrix \hat{Adj} ∈ R^{2 \times E}.
2: Output: Predicted e-scooter pick-up or drop-off demand matrix \hat{y} ∈ R^{B \times N \times 1}.
3: Initialization: Define feature counts: F_{lag} = 14, F_{context} = 5, F_{comm\_id} = 1. Models: GRU, Embedding, GCNConv^{(1)}, GCNConv^{(2)}. Split X into three parts based on feature indices:
    X_{lag} ← X[:, :, :, 0 : F_{lag}] // Contains the 14 lag-based demand features.
    X_{context} ← X[:, :, :, F_{lag} : F_{lag} + F_{context}] // Contains the 5 contextual features {h_{sin}, h_{cos}, d_{sin}, d_{cos}, w}, where h_{sin} and h_{cos} are sine–cosine encodings of the hour (24 h cycle), d_{sin} and d_{cos} are sine–cosine encodings of the day of the week (7-day cycle), and w is the weekend indicator.
    X_{comm\_id} ← X[:, :, :, F_{lag} + F_{context}] // Contains the 1 categorical community ID feature {c_{id}}.
4: X_{reshaped\_lag} ← reshape(X_{lag}, (B × N, H, F_{lag}))
5: H_{out}, _ ← GRU(X_{reshaped\_lag})
6: H_{GRU} ← H_{out}[:, −1, :]
7: X_{context\_last} ← X_{context}[:, :, −1, :]
8: H_{context\_flat} ← reshape(X_{context\_last}, (B × N, F_{context}))
9: X_{comm\_id\_last} ← X_{comm\_id}[:, :, −1, :]
10: H_{comm\_id\_flat} ← reshape(X_{comm\_id\_last}, (B × N))
11: H_{comm\_emb} ← Embedding(H_{comm\_id\_flat})
12: H^{(0)} ← concat[H_{GRU}, H_{context\_flat}, H_{comm\_emb}]
13: H^{(1)} ← ReLU(Dropout(GCNConv^{(1)}(H^{(0)}, \hat{Adj})))
14: H^{(2)} ← ReLU(Dropout(GCNConv^{(2)}(H^{(1)}, \hat{Adj})))
15: P ← σ(W_{cls} H^{(2)} + b_{cls})
16: A ← W_{reg} H^{(2)} + b_{reg}
17: \hat{y} ← max(0, P ⊙ A)
18: \hat{y} ← reshape(\hat{y}, (B, N, 1))
19: return \hat{y}
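Steps 13 and 14 of Algorithm 1 can be illustrated with a small, dependency-free sketch of a single graph convolution. It assumes the widely used symmetric normalization with self-loops, D^{-1/2}(A + I)D^{-1/2}, which the text above does not spell out; the function name `gcn_layer`, the toy graph, and the weights are hypothetical, and dropout is omitted (as at inference time).

```python
import math

def gcn_layer(h, edges, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 . H . W).

    Assumption: symmetric normalization with self-loops; the paper's
    exact GCNConv configuration is not given in this section.
    """
    n = len(h)
    # Neighbor sets including a self-loop for every node.
    nbrs = [{i} for i in range(n)]
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    deg = [len(s) for s in nbrs]
    f_in, f_out = len(weight), len(weight[0])
    out = []
    for i in range(n):
        # Degree-normalized sum of neighbor features.
        agg = [0.0] * f_in
        for j in nbrs[i]:
            coef = 1.0 / math.sqrt(deg[i] * deg[j])
            for f in range(f_in):
                agg[f] += coef * h[j][f]
        # Linear transform followed by ReLU.
        out.append([max(0.0, sum(agg[f] * weight[f][o] for f in range(f_in)))
                    for o in range(f_out)])
    return out

# Toy example: a 3-node path graph with 2 input features and 1 output feature.
h0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1), (1, 2)]
w = [[1.0], [1.0]]
h1 = gcn_layer(h0, edges, w)
```

Each output row mixes a node's own features with those of its graph neighbors, which is how spatial dependencies between connected grids enter the representation.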

The classification branch addresses a binary classification task, estimating the probability (P) that any trip activity will occur in each community at the next time step. This branch is optimized using a binary cross-entropy loss function. In parallel, the regression branch predicts the expected magnitude of the demand (A), if activity does occur, and is optimized using a Weighted Mean Squared Error (WMSE) loss. The weighting scheme uses the square root of the historical demand as a sublinear scaling factor. This increases the importance of high-demand locations while preventing excessively large weights that could destabilize the training process. Using raw demand values as weights would cause the loss function to be dominated by a small number of extreme-demand samples, leading to overfitting and poor generalization. The square root operation provides a balanced compromise. It emphasizes accurate prediction in high-demand grids, where operational decisions such as rebalancing are most critical, while maintaining numerical stability and ensuring that moderate-demand regions also influence model learning in a meaningful way. Finally, the model combines the outputs of both branches by multiplying the probability by the predicted demand (P ⊙ A). Since the classification and regression branches are trained sequentially and independently rather than through a joint multi-task loss, a weighting coefficient λ is not required to balance the two objectives. This decoupled training approach avoids the sensitivity issues of multi-task methods and allows each branch to specialize effectively. This learning framework is particularly effective for sparse demand data, as it enables the model to first identify potential active locations before estimating demand intensity. As a result, the proposed model achieves improved accuracy and robustness in predicting both the spatial and temporal variations in e-scooter usage across urban environments.
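The branch combination and the square-root weighting described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the weight floor of 1 (so that zero-demand cells still contribute) and the normalization by the sum of weights are hypothetical choices, as are the function names and toy values.

```python
import math

def combine(p, a):
    """Final prediction per grid: max(0, P * A), as in step 17 of Algorithm 1."""
    return [max(0.0, pi * ai) for pi, ai in zip(p, a)]

def weighted_mse(pred, target, floor=1.0):
    """MSE weighted by the square root of demand.

    Assumptions (not from the paper): weights are floored at `floor`
    and the loss is normalized by the sum of the weights.
    """
    weights = [max(floor, math.sqrt(t)) for t in target]
    total = sum(w * (y - t) ** 2 for w, y, t in zip(weights, pred, target))
    return total / sum(weights)

p = [0.9, 0.1, 0.8]          # classification branch: activity probabilities
a = [4.5, -0.5, 1.0]         # regression branch: raw demand magnitudes
target = [4.0, 0.0, 1.0]     # observed demand

y_hat = combine(p, a)                 # negative products are clamped to 0
reg_loss = weighted_mse(a, target)    # regression branch trained on its own loss
```

The grid with demand 4 receives weight 2 rather than 4, which shows the sublinear scaling: high-demand cells matter more, but not in proportion to their raw counts.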

2.2.1. Input Representation

The raw e-scooter trip data was transformed into a structured spatiotemporal tensor that serves as the model’s input. This process began with spatial discretization, where the city was partitioned into the 200 m × 200 m grids defined in Section 2.1. Each trip was then assigned to a grid based on its pick-up and drop-off locations. Subsequently, the data was temporally aggregated by counting the number of pick-ups and drop-offs within each grid for every hour of the study period. To handle the inherent sparsity of the data and create a regular time series for each grid, all grid-time instances with no activity were explicitly assigned a demand value of zero.
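The discretization, hourly aggregation, and zero-filling steps can be sketched in a few lines. The sketch assumes that trip locations have already been projected to planar coordinates in meters and that hours are indexed from 0 over the study period; the function names and the sample events are hypothetical.

```python
from collections import Counter

CELL_M = 200  # grid cell size in meters (Section 2.1)

def grid_id(x_m, y_m):
    """Map projected coordinates (meters) to a (column, row) grid cell."""
    return (int(x_m // CELL_M), int(y_m // CELL_M))

def hourly_counts(events, n_hours):
    """Count events per (grid, hour) and zero-fill inactive grid-hours.

    `events` is an iterable of (x_m, y_m, hour_index) tuples, e.g. pick-ups.
    Returns {grid: [count at hour 0, ..., count at hour n_hours - 1]}.
    """
    counts = Counter((grid_id(x, y), hour) for x, y, hour in events)
    active = sorted({g for g, _ in counts})
    return {g: [counts.get((g, h), 0) for h in range(n_hours)] for g in active}

pickups = [(50.0, 120.0, 0), (180.0, 10.0, 0), (450.0, 30.0, 2)]
series = hourly_counts(pickups, n_hours=3)
# series == {(0, 0): [2, 0, 0], (2, 0): [0, 0, 1]}
```

Only grids that appear at least once are kept, mirroring the restriction to active grids, and every missing grid-hour becomes an explicit zero so that each grid has a regular time series.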

From this structured time series, a feature vector consisting of 20 features was engineered for each grid at each time step. These features were organized into two distinct groups based on their function within the model architecture, as detailed in Table 1. The historical demand features include a set of fourteen lagged demand values from specific past time steps, serving as the primary predictive inputs. These lags are selected to capture immediate trends (recent hours), daily seasonality (the same hour on the previous day), and weekly seasonality (the same hour in the previous week). The spatiotemporal context features include six variables that provide static contextual information for each prediction. These features were concatenated with the GRU output and passed directly into the GCN layers.

Among the context features, the community feature enhanced the model’s spatial understanding of urban structure. The Louvain algorithm was applied to detect clusters of functionally related grids, and each grid’s community identifier was used as a categorical input feature. Each community identifier is encoded using a 10-dimensional embedding layer. This dimensionality was selected through empirical tuning (testing 5, 10, 16, and 32), with 10 yielding the best validation performance. This choice provides a compact yet expressive representation for the 257 identified communities, balancing model complexity with the ability to capture latent functional similarities between urban zones. The resulting vectors are concatenated with the temporal and contextual features before being passed to the GCN layers.

In addition, temporal dynamics were represented by sine–cosine encodings of the hour of the day and day of the week, which preserve the cyclic nature of these temporal variables, and by a binary weekend indicator capturing behavioral differences between weekdays and weekends.

Finally, the feature data was structured into sequences. Each training sample contains feature vectors for all active grids over a 24 h historical window (H = 24). This results in a final input tensor with the shape (B, N, H, F), where B is the batch size, N is the number of active grids, H is the historical time window, and F is the number of features.
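The five temporal context features can be computed as in the sketch below. It assumes hours numbered 0 to 23 and weekdays numbered 0 to 6 with Monday as 0, so that Saturday and Sunday are 5 and 6; this indexing and the function name are assumptions made for illustration.

```python
import math

def context_features(hour, weekday):
    """Return [h_sin, h_cos, d_sin, d_cos, w].

    Assumes hour in 0..23 and weekday in 0..6 with Monday = 0.
    """
    h_angle = 2 * math.pi * hour / 24      # 24 h cycle
    d_angle = 2 * math.pi * weekday / 7    # 7-day cycle
    weekend = 1.0 if weekday >= 5 else 0.0
    return [math.sin(h_angle), math.cos(h_angle),
            math.sin(d_angle), math.cos(d_angle), weekend]

late = context_features(23, 6)   # Sunday 23:00
early = context_features(0, 0)   # Monday 00:00

# Distance between the two encoded hours stays small, unlike |23 - 0| = 23.
hour_gap = math.dist(late[:2], early[:2])
```

The encoding places 23:00 and 00:00 next to each other on the unit circle, which is the cyclic property that a raw hour value would lose.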

2.2.2. Community Detection

To enhance the model’s spatial understanding of Calgary’s urban structure, we introduce a community feature derived from the origin–destination (O-D) flows of all e-scooter trips. A mobility network graph was constructed using the networkx library, where each active grid was treated as a node. For each individual trip record, an unweighted edge was created between its start grid and its end grid, representing the functional connectivity derived from user travel patterns. The Louvain community detection algorithm was then applied to the aggregated network to partition the city into communities of grids exhibiting strong internal trip connections [32]. In this study, community detection using the Louvain algorithm was performed once on the fully aggregated trip graph, which covers all days and hours in the dataset (75 days). This approach generates a single, robust set of mobility-based communities that remains stable across the entire study period, capturing long-term functional connectivity rather than transient daily fluctuations. The Louvain method is an efficient algorithm that identifies community structures by optimizing a metric known as modularity (Q) [33]. Modularity measures the quality of a partition by comparing the density of connections within communities to the density of connections in a random network. A high modularity score indicates a well-defined community structure. Modularity is formally defined as:

Q = \frac{1}{2m} \sum_{i,j} \left( A_{ij} - \frac{k_{i} k_{j}}{2m} \right) \delta(c_{i}, c_{j}),    (1)

where A_{ij} is an element of the adjacency matrix (1 if an edge exists between nodes i and j, and 0 otherwise), k_{i} is the degree of node i, \delta(c_{i}, c_{j}) is 1 if nodes i and j are in the same community and 0 otherwise, and m is the total number of edges in the network [30].
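As a worked illustration of Equation (1), the sketch below evaluates Q directly for a toy graph of two triangles joined by one edge. It assumes an undirected, unweighted simple graph without repeated edges, which is a simplification of a trip-based network; the function name and the toy data are hypothetical.

```python
def modularity(edges, community):
    """Modularity Q from Equation (1) for an undirected simple graph."""
    nodes = sorted(community)
    adjacent = set()
    degree = {n: 0 for n in nodes}
    for u, v in edges:
        adjacent.add((u, v))
        adjacent.add((v, u))
        degree[u] += 1
        degree[v] += 1
    m = len(edges)
    total = 0.0
    for i in nodes:
        for j in nodes:
            if community[i] == community[j]:          # delta(c_i, c_j) = 1
                a_ij = 1.0 if (i, j) in adjacent else 0.0
                total += a_ij - degree[i] * degree[j] / (2 * m)
    return total / (2 * m)

# Two triangles (0-1-2 and 3-4-5) connected by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
two_triangles = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_block = {n: "a" for n in range(6)}

q_split = modularity(edges, two_triangles)   # 5/14, about 0.357
q_single = modularity(edges, one_block)      # 0 (up to rounding)
```

Splitting the graph along the bridge yields a clearly positive Q, while placing all nodes in one community yields zero, which matches the interpretation of modularity as excess within-community density over a random baseline.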

The Louvain method operates iteratively through two repeating phases. In the first phase (modularity optimization), each node is initially assigned to its own community. For each node, the algorithm evaluates the modularity gain that would occur if the node were moved into each of its neighbors’ communities, selecting the move that provides the largest positive gain. This process continues for all nodes until no individual move can further improve modularity. In the second phase (network aggregation), the communities identified in the first phase are aggregated into “super-nodes” to form a smaller network, where edges between super-nodes are weighted by the sum of the connections between their constituent nodes. These two phases are repeated until no additional improvement in modularity is achieved, yielding a stable community structure [32].
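The first phase can be sketched with a deliberately simple greedy loop. This is a teaching sketch, not the Louvain implementation used in the study: it recomputes Q from scratch for every candidate move instead of using the incremental gain formula, it omits the aggregation phase, and it assumes an undirected simple graph given as a hypothetical adjacency dictionary.

```python
def modularity(adj, comm):
    """Modularity Q of a partition; adj maps each node to its neighbor set."""
    two_m = sum(len(nbrs) for nbrs in adj.values())
    q = 0.0
    for i in adj:
        for j in adj:
            if comm[i] == comm[j]:
                a_ij = 1.0 if j in adj[i] else 0.0
                q += a_ij - len(adj[i]) * len(adj[j]) / two_m
    return q / two_m

def local_moving(adj):
    """Simplified phase 1: move nodes to neighbor communities while Q improves."""
    comm = {node: node for node in adj}    # each node starts in its own community
    improved = True
    while improved:
        improved = False
        for node in adj:
            best_comm, best_q = comm[node], modularity(adj, comm)
            for nbr in adj[node]:
                trial = dict(comm)
                trial[node] = comm[nbr]
                q = modularity(adj, trial)
                if q > best_q + 1e-12:     # accept only strict improvements
                    best_comm, best_q = comm[nbr], q
            if best_comm != comm[node]:
                comm[node] = best_comm
                improved = True
    return comm

# Two triangles connected by the edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
       3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
comm = local_moving(adj)
```

On this toy graph the loop settles on the two triangles as communities. A production implementation would evaluate each move in constant time from community totals and then contract communities into super-nodes before repeating.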

The detection was performed on the combined network of all active pick-up (4080) and drop-off (4367) grids, encompassing all unique locations appearing as either trip origins or destinations. The Louvain algorithm identified 257 communities (≈17 grids per community) with a final modularity score of 0.368, indicating a meaningful community structure within Calgary’s mobility network. A modularity value of this magnitude reflects a moderately strong clustering pattern, meaning that connections within communities are significantly denser than would be expected in a random graph. This ensures that the detected communities represent genuine functional regions shaped by actual mobility flows rather than artifacts of the algorithm. In practice, such a modularity score confirms that the resulting clusters, such as downtown commercial zones, university districts, and recreational corridors, are coherent and interpretable, making the community assignments informative as spatial features for demand prediction.

Each grid cell was then assigned a unique community identifier (c_{id}), which was used as a categorical feature in the model’s input to represent its belonging to a specific functional zone. Incorporating community membership provides the model with a clearer understanding of the city’s spatial organization. It reduces data sparsity, groups grids with similar mobility patterns (such as downtown, university areas, and recreational corridors), and enables the GCN to learn more meaningful relationships within and between these regions.

Community detection algorithms can be broadly categorized into several groups, including modularity-optimization approaches (such as Louvain and Leiden), statistical inference and generative models (for example, stochastic block models), spectral clustering methods, and evolutionary or heuristic algorithms. Recent studies have introduced hybrid intelligent approaches that combine fuzzy multi-criteria decision making with evolutionary search, such as the Fuzzy-AHP-based evolutionary algorithm proposed by Pourabbasi et al. [35]. These methods aim to improve accuracy in identifying communities within large and complex networks.

Although a variety of newer algorithms exist, the Louvain method remains well suited for mobility-based applications due to its strong scalability for large and sparse O-D networks, its ability to achieve high modularity without extensive parameter tuning, and its production of flat community labels that can be directly embedded as categorical features within a GCN. Since the objective of this study is to enhance spatiotemporal demand prediction rather than develop a new community detection algorithm, Louvain provides an efficient, interpretable, and practical solution for constructing mobility-aware spatial features.

2.2.3. Gated Recurrent Unit (GRU)

The GRU component is designed to capture the temporal dependencies of e-scooter demand over time. As discussed in Section 2.1, Calgary’s e-scooter usage exhibits pronounced daily and hourly variations, making a sequential model like a GRU essential for accurate prediction. A GRU is an advanced type of Recurrent Neural Network (RNN) developed to process sequential data efficiently, making it well suited for time-series prediction [36]. Unlike a simple RNN [37,38], a GRU incorporates gating mechanisms—an update gate and a reset gate—that control the flow of information. This enables the model to learn long-term dependencies and decide which historical information is relevant for the current prediction, mitigating issues such as the vanishing-gradient problem [16].

In our model, at each hour t, the GRU receives a spatiotemporal input tensor X_{t} ∈ \mathbb{R}^{N \times F}, where N denotes the number of active grid cells and F = 14 represents the number of lag-based features describing each grid. These lag features capture the historical demand patterns across the previous 24 h window (H = 24) and serve as the temporal sequence input to the GRU. For each grid cell, the GRU processes these sequences step by step, updating its hidden state at every time interval. The final hidden state thus encodes a compact representation of the temporal dynamics for that grid over the past 24 h. In the next stage, this learned temporal representation is concatenated with six contextual features (the four sine–cosine components derived from the hour of the day and day of the week, the weekend flag, and the community identifier derived from Louvain detection in Section 2.2.2). The combined vector is then passed to the GCN layers (Section 2.2.4), which integrate spatial correlations among adjacent grids and communities.

Formally, for each time step t, given the input vector x_{t} and the previous hidden state h_{t - 1}, the GRU updates its state using the following operations [33]:

Reset Gate : r_{t} = σ (W_{i r} x_{t} + b_{i r} + W_{h r} h_{t - 1} + b_{h r}),

(2)

Update Gate : z_{t} = σ (W_{i z} x_{t} + b_{i z} + W_{h z} h_{t - 1} + b_{h z}),

(3)

New Gate : {\tilde{h}}_{t} = \tanh (W_{i n} x_{t} + b_{i n} + r_{t} ⊙ (W_{h n} h_{t - 1} + b_{h n})),

(4)

Final Hidden State : h_{t} = (1 - z_{t}) ⊙ h_{t - 1} + z_{t} ⊙ {\tilde{h}}_{t},

(5)

where x_{t} is the input, h_{t} is the hidden state, W and b are learnable parameters, σ is the sigmoid function, and ⊙ denotes element-wise multiplication. The reset gate r_{t} controls how much of the previous state enters the candidate state \tilde{h}_{t}, while the update gate z_{t} sets the balance between the previous state h_{t - 1} and the candidate in Equation (5): values of z_{t} near 0 retain prior information, and values near 1 replace it with the candidate. After processing the full 24 h sequence, the GRU outputs the final hidden representation h_{t}, which summarizes the temporal context of demand for each grid.
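As an illustration of Equations (2)–(5), the following sketch applies one GRU update to plain Python lists. It is written for readability and is not the PyTorch implementation used in the experiments. The helper names (`gru_step`, `zero_params`), the parameter dictionary layout, and the toy dimensions are assumptions made for this example.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def gru_step(x_t, h_prev, p):
    """One GRU update following Eqs. (2)-(5); p maps parameter names to values."""
    def affine(w_name, b_name, v):
        return [a + b for a, b in zip(matvec(p[w_name], v), p[b_name])]

    # Eq. (2): reset gate
    r = [sigmoid(a + b) for a, b in
         zip(affine("W_ir", "b_ir", x_t), affine("W_hr", "b_hr", h_prev))]
    # Eq. (3): update gate
    z = [sigmoid(a + b) for a, b in
         zip(affine("W_iz", "b_iz", x_t), affine("W_hz", "b_hz", h_prev))]
    # Eq. (4): candidate state, with the reset gate applied to the hidden term
    n = [math.tanh(a + ri * b) for a, ri, b in
         zip(affine("W_in", "b_in", x_t), r, affine("W_hn", "b_hn", h_prev))]
    # Eq. (5) as written in the text: h_t = (1 - z_t) * h_{t-1} + z_t * candidate
    return [(1 - zi) * hp + zi * ni for zi, hp, ni in zip(z, h_prev, n)]


def zero_params(n_in, n_hidden):
    """Hypothetical helper: all-zero parameters, useful only for checking shapes."""
    p = {}
    for gate in ("r", "z", "n"):
        p[f"W_i{gate}"] = [[0.0] * n_in for _ in range(n_hidden)]
        p[f"W_h{gate}"] = [[0.0] * n_hidden for _ in range(n_hidden)]
        p[f"b_i{gate}"] = [0.0] * n_hidden
        p[f"b_h{gate}"] = [0.0] * n_hidden
    return p


# Run a 24-step sequence for one grid cell (toy sizes: 3 inputs, 2 hidden units).
params = zero_params(n_in=3, n_hidden=2)
h = [1.0, -2.0]
for _ in range(24):
    h = gru_step([0.1, 0.2, 0.3], h, params)
```

With all-zero parameters both gates equal 0.5 and the candidate state is zero, so each step simply halves the hidden state. This makes the interpolation in Equation (5) easy to verify by hand.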

In summary, the GRU functions as the temporal encoder within the proposed GRU-GCN framework, transforming sequences of historical demand into meaningful latent representations that provide a robust foundation for the spatial learning performed by the subsequent GCN component.

2.2.4. Graph Convolutional Network (GCN)

The GCN component builds upon the temporal representations produced by the GRU to model spatial dependencies in e-scooter demand across Calgary’s grid structure. As discussed in Section 2.1, demand exhibits clear spatial clustering, particularly in downtown and transit-adjacent areas, with notable variation between weekdays and weekends. The GCN is well suited for this task because it operates on graph-structured data, propagating information between connected nodes to learn relational patterns such as demand spillover between adjacent grids or within functionally similar communities [39]. In this framework, the city is represented as a graph G = (V, E, A), whose components are defined as follows. The vertex set V contains the N nodes corresponding to the active grid cells where e-scooter trips occurred; in our study, N = 4080 starting locations for the pick-up graph and N = 4367 ending locations for the drop-off graph. The edge set E consists of unweighted edges capturing spatial proximity, where an edge (i, j) \in E is established if and only if grid cell i and grid cell j share a common geographic boundary. This structure models the direct diffusion of demand between immediately adjacent areas. The adjacency matrix A is an N \times N symmetric binary matrix where A_{ij} = 1 if (i, j) \in E and A_{ij} = 0 otherwise. Self-loops (A_{ii} = 1) are not part of E; they are handled implicitly during the GCN propagation step via the renormalization trick (\hat{A} = A + I), which improves numerical stability and includes each node’s own features during aggregation.
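A minimal sketch of how such an adjacency matrix could be built is given below. It assumes that each active grid cell is identified by integer (row, column) coordinates and that "sharing a common geographic boundary" means sharing an edge (rook contiguity). Both are assumptions for illustration, as is the function name `build_adjacency`.

```python
def build_adjacency(active_cells):
    """Binary symmetric adjacency for grid cells that share an edge (assumed rook contiguity)."""
    index = {cell: k for k, cell in enumerate(sorted(active_cells))}
    n = len(index)
    A = [[0] * n for _ in range(n)]
    for (row, col), i in index.items():
        # Look only down and right; writing both A[i][j] and A[j][i] covers up and left.
        for d_row, d_col in ((1, 0), (0, 1)):
            j = index.get((row + d_row, col + d_col))
            if j is not None:
                A[i][j] = A[j][i] = 1
    return index, A


# Hypothetical active cells: three connected cells and one isolated cell.
cells = {(0, 0), (0, 1), (1, 1), (3, 3)}
node_index, A = build_adjacency(cells)
```

The diagonal of `A` stays zero here, matching the text: self-loops are only added later, inside the propagation step.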

The input to the GCN is the concatenated representation of the GRU output and the six contextual features (as described in Table 1). Specifically, the GRU provides a matrix of hidden states H^{(0)} ∈ \mathbb{R}^{N \times D}, where D is the GRU hidden dimension. Each row h_{i}^{(0)} represents the temporal embedding for grid i. These temporal embeddings are then augmented with the contextual features to form the complete node feature matrix used by the GCN. Among the contextual features is the categorical community identifier c_{id}, derived from the Louvain community detection process (Section 2.2.2). This feature is numerically encoded and incorporated into the node feature matrix, enabling the GCN to jointly capture local spatial proximity through network edges and regional functional similarity through community membership. The GCN consists of two stacked layers that progressively aggregate spatial information. We adopt the spectral graph convolution formulation of [39], which approximates localized spectral filters using first-order Chebyshev polynomials [40]. For a given layer l, the propagation rule is given by Equation (6).

H^{(l + 1)} = σ ({\hat{D}}^{- \frac{1}{2}} \hat{A} {\hat{D}}^{- \frac{1}{2}} H^{(l)} W^{(l)}),

(6)

where H^{(l)} is the matrix of node features at layer l, \hat{A} = A + I is the adjacency matrix of the graph with self-loops added, \hat{D} is the diagonal degree matrix of \hat{A} (\hat{D}_{ii} = \sum_{j} \hat{A}_{ij}), W^{(l)} is the layer-specific learnable weight matrix, and σ here denotes the ReLU activation.
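To make Equation (6) concrete, the sketch below computes the symmetrically normalized adjacency matrix and applies one propagation step on a three-node path graph. It uses plain Python lists rather than PyTorch Geometric, takes \hat{D} to be the degree matrix of \hat{A}, and uses toy feature and weight values. The function names are illustrative.

```python
import math


def normalized_adjacency(A):
    """Return D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I."""
    n = len(A)
    A_hat = [[A[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_hat]
    return [[A_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(X, Y):
    """Dense matrix product for lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def gcn_layer(A, H, W):
    """One propagation step of Eq. (6) with ReLU as the activation."""
    Z = matmul(matmul(normalized_adjacency(A), H), W)
    return [[max(0.0, v) for v in row] for row in Z]


# Toy example: path graph 0 - 1 - 2, one input feature, one output feature.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
H0 = [[1.0], [2.0], [3.0]]
H1 = gcn_layer(A, H0, [[1.0]])
```

Each output row mixes a node's own feature with those of its direct neighbours, weighted by the inverse square roots of the degrees. Stacking a second layer extends the receptive field to two-hop neighbours.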

This operation allows each node to update its representation by aggregating features from its spatially connected neighbors. The first GCN layer captures local correlations, such as demand patterns propagating from busy transit hubs to adjacent grids, while the second layer refines higher-order relationships, producing the final spatiotemporal embedding matrix H^{(2)}, with one row per grid. These embeddings are then fed into the parallel classification and regression branches (Section 2.2). The classification branch uses a fully connected layer followed by a sigmoid activation to output the activity probability P ∈ [0, 1] per grid. The regression branch employs a similar structure but with a linear output to predict demand magnitude A ≥ 0, and is trained with a weighted MSE loss to handle count data. The branches share the initial GCN layers for efficiency but diverge after the GRU stage to specialize in their respective tasks.
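The way the two branch outputs are combined at inference time, by element-wise multiplication of the activity probability and the demand magnitude as stated in the training description of this section, can be sketched as follows. The function name `combine_branches` is hypothetical, and clamping negative regression outputs to zero is an assumption of this sketch, added because a linear head does not by itself guarantee a non-negative magnitude.

```python
import math


def combine_branches(logits, magnitudes):
    """Per-grid demand estimate P * A, with P = sigmoid(classifier logit)."""
    predictions = []
    for logit, magnitude in zip(logits, magnitudes):
        p_active = 1.0 / (1.0 + math.exp(-logit))
        # Assumption: negative regression outputs are clamped to zero.
        predictions.append(p_active * max(magnitude, 0.0))
    return predictions


# Toy values: an uncertain grid, a confidently inactive grid, a confidently active grid.
demand = combine_branches([0.0, -8.0, 6.0], [4.0, 3.0, 10.0])
```

Grids that the classifier considers inactive are pushed toward zero regardless of the regression output, which is how the two-stage design limits false positives in sparse regions.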

To enhance generalization and ensure stable convergence, the model was trained using the Adam optimizer [41] with an initial learning rate of 0.0005 and a weight decay of 1 \times 10^{-5} (L2 regularization). The training process utilized a batch size of 16 and ran for a maximum of 200 epochs. To prevent overfitting on the sparse spatial graph, dropout regularization with a rate of 0.2 was applied after each GCN layer [42], and an early stopping strategy was implemented with a patience of 10 epochs. Additionally, a learning rate scheduler reduced the learning rate by a factor of 0.5 if the validation loss stagnated for 5 epochs. Each branch is trained independently: the classifier minimizes Binary Cross-Entropy loss, while the regressor minimizes Weighted MSE loss. Since the framework does not use a joint multi-task loss, no weighting coefficient λ is required. Final predictions combine both outputs via multiplication (P ⊙ A) at inference time.

Instead of training separate models for different temporal periods, the model was trained on a unified dataset comprising both weekdays and weekends, using a single set of training hyperparameters. Temporal variations are captured directly through the sine–cosine encodings of the hour of the day and day of the week, together with a weekend indicator. These features allow the model to learn distinct demand patterns for weekdays and weekends within a single architecture.

In summary, the GCN functions as the spatial encoder in the proposed GRU-GCN architecture, transforming grid-specific temporal embeddings into relational representations that reflect neighborhood dynamics and community structure. This integration enables precise prediction of demand hotspots while addressing the spatiotemporal sparsity inherent in urban e-scooter data.
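The sine–cosine encodings and the weekend indicator mentioned above could be computed as in the following sketch. The exact periods and ordering used in the study are not specified in this section, so the 24 h and 7-day periods, the Monday-based weekday index, and the function name `temporal_features` are assumptions.

```python
import math
from datetime import datetime


def temporal_features(ts):
    """Four sine-cosine components plus a weekend flag for one hourly timestamp."""
    hour_angle = 2 * math.pi * ts.hour / 24
    dow_angle = 2 * math.pi * ts.weekday() / 7  # Monday = 0, Sunday = 6
    return [
        math.sin(hour_angle), math.cos(hour_angle),
        math.sin(dow_angle), math.cos(dow_angle),
        1.0 if ts.weekday() >= 5 else 0.0,      # Saturday or Sunday
    ]


# First training day at midnight (a Monday) and a Saturday morning.
monday_midnight = temporal_features(datetime(2019, 7, 15, 0))
saturday_morning = temporal_features(datetime(2019, 7, 20, 6))
```

Together with the community identifier, these five values make up the six contextual features described in Section 2.2.3. The cyclical form keeps 23:00 and 00:00 close in feature space, which a raw hour index would not.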

2.3. Implementation Details

To ensure reproducibility, here we summarize the specific configuration used in our experiments. All models were implemented using PyTorch version 2.9.0 and PyTorch Geometric on an NVIDIA A100 GPU. The Adam optimizer was employed with an initial learning rate of 0.0005. To mitigate overfitting, a dropout rate of 0.2 was applied after each GCN layer, and early stopping was used with a patience of 10 epochs. The complete set of hyperparameters and model architecture settings is summarized in Table 2.

3. Results

To evaluate the predictive performance of the proposed GRU-GCN framework, the model was trained and tested on a 75-day e-scooter dataset from the City of Calgary. The data was divided to preserve temporal consistency: the first 61 days (15 July–14 September 2019) were used for training, while the final 14 days (15–27 September 2019) were used for testing. The Masked Fully Convolutional Network (MFCN) developed in [15] was used as the baseline for comparison. Building upon this benchmark, six variants of the proposed GCN framework were constructed to investigate how alternative spatial–temporal configurations influence predictive performance. The evaluation considers two prediction horizons: next-hour prediction (t + 1) and next 24 h prediction (t + 24). The performance of the model was quantified using Mean Absolute Error (MAE), defined as follows:

MAE = \frac{1}{N} \sum_{i = 1}^{N} | ŷ_{i} - y_{i} |,

(7)

where y_{i} and ŷ_{i} denote the observed and predicted e-scooter demand for grid i, and N is the total number of grid cells evaluated. The MAE measures the average magnitude of prediction errors across all grids, providing an overall indicator of accuracy.
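Because the results that follow report MAE separately for different demand levels, the sketch below shows one way to compute Equation (7) per demand bin. The bin boundaries follow the ranges quoted in the results tables; the function name `mae_by_bin`, the bin labels, and the toy data are illustrative.

```python
def mae_by_bin(y_true, y_pred):
    """MAE of Eq. (7), evaluated separately for each observed-demand range."""
    bins = {
        "y = 0": lambda y: y == 0,
        "y > 0": lambda y: y > 0,
        "1-5": lambda y: 1 <= y <= 5,
        "6-10": lambda y: 6 <= y <= 10,
        "11-15": lambda y: 11 <= y <= 15,
        "y >= 16": lambda y: y >= 16,
    }
    result = {}
    for name, in_bin in bins.items():
        errors = [abs(p - t) for t, p in zip(y_true, y_pred) if in_bin(t)]
        # None marks a bin with no observations, so it is not mistaken for zero error.
        result[name] = sum(errors) / len(errors) if errors else None
    return result


# Toy observations and predictions (hypothetical values).
scores = mae_by_bin([0, 0, 3, 8, 20], [0.5, 0.0, 2.0, 10.0, 17.0])
```

Binning by the observed value rather than the predicted one keeps the groups fixed across models, so the per-bin errors of different variants remain directly comparable.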

The following subsections summarize the performance of the proposed GCN model, which integrates a mobility-based community feature and consistently provides the most balanced prediction results across sparse, moderate, and high-demand regions.

3.1. Next-Hour Pick-Up and Drop-Off Prediction Performance

Six GCN model variants were developed to evaluate different spatiotemporal modeling strategies for short-term e-scooter demand prediction. The benchmark reference is the MFCN from the baseline study. The GCN variants gradually build from a baseline spatial GCN to more sophisticated architectures incorporating temporal encoding with GRU, dual-branch outputs (classification and regression), weighted loss adjustment, and community-based spatial features. To clarify the model configurations presented in Table 3, the variants are defined as follows. Proposed SB (Single Branch) utilizes a single regression output without a classification stage. Proposed WC1 (With Classification Version 1) represents the initial two-stage architecture consisting of independent classification and regression steps using standard MSE, while Proposed WC2 (With Classification Version 2) employs the enhanced two-stage framework that incorporates Weighted MSE loss to better handle high demand. Proposed GCN-MM (Multi-Model) involves training separate models for each community independently. Finally, the Proposed GCN represents the complete framework, integrating an optimized two-stage framework with community-based spatial features.

Table 3 presents the performance of the six GCN variants for the one-hour-ahead pick-up demand prediction. The results highlight a trade-off between handling sparse regions and capturing high-demand peaks. The two-branch model, Proposed WC2, achieved the best accuracy for zero-demand grids (MAE = 0.0050), confirming its effectiveness for sparse data. Proposed SB, incorporating a GRU-based temporal encoder, proved highly effective in the medium-demand range (y ∈ [6, 10], MAE = 2.4290), indicating that the inclusion of temporal dependencies enhances short-term demand prediction. In contrast, Proposed GCN-MM, which trained individual models for each community, showed strong accuracy in low-demand active grids (y > 0, MAE = 1.1815 and y ∈ [1, 5], MAE = 0.8979), but its high error in zero-demand cells (y = 0, MAE = 0.0978) reflects overfitting and poor generalization caused by limited community-level data.

Our proposed GCN, which incorporates community detection as a spatial feature, achieved the most balanced and consistent results across all demand ranges. It closely matched the best sparse-demand model (y = 0, MAE = 0.0051) and outperformed the MFCN baseline in both low-demand (1.1891 vs. 1.2136) and medium-demand (2.8777 vs. 2.9911) ranges. Additionally, it achieved the lowest error in the high–medium demand bracket (y ∈ [11, 15], MAE = 4.4186), confirming its stability in dense urban areas where accurate demand prediction is most critical. Although the baseline MFCN exhibited lower error in the rare, extreme-demand category (y ≥ 16; 8.4077 vs. 9.6895 for the proposed GCN), these instances account for less than 0.01% of all observations. Overall, the proposed GCN’s consistently superior or competitive performance across all typical demand levels, particularly its leading results in the y ∈ [11, 15] range, indicates that it offers the most balanced and generalizable framework for short-term e-scooter demand prediction in complex urban environments.

Table 4 summarizes the one-hour-ahead drop-off demand prediction results. The performance trends are consistent with those of the pick-up task, revealing a similar trade-off between sparse and high-demand regions. Proposed WC1, using a dual-branch structure, achieved the best zero-demand accuracy (MAE = 0.0047) but showed weaker performance for high-demand cases (MAE = 13.9176). Adding a weighted loss in Proposed WC2 improved stability in high-demand intervals (MAE reduced to 9.2507) while maintaining low-demand accuracy (MAE = 1.2336 for y ∈ [1, 5]). Proposed SB performed well in the medium-demand range (y ∈ [6, 10], MAE = 2.4657), confirming the value of temporal dependency modeling. Proposed GCN-MM, trained separately for each community, achieved the best results in low-demand active grids (y > 0, MAE = 1.1087; y ∈ [1, 5], MAE = 0.8497) but had high error for zero-demand cells (MAE = 0.1011), indicating overfitting from limited per-community samples. Our Proposed GCN, which integrates community detection, provided the most balanced and generalizable results. It achieved MAEs of 0.0054 for zero-demand and 9.0302 for high-demand grids, with strong performance in the high–medium range (y ∈ [11, 15], MAE = 4.2972). Compared with the baseline MFCN (y = 0, MAE = 0.0063 and y ≥ 16, MAE = 9.8085), the proposed GCN outperformed the baseline across most intervals.

Our proposed model outperforms the benchmark MFCN and the baseline GCN primarily because it better understands not only when e-scooter demand changes but also how different areas of the city influence each other. While the MFCN employs convolutional filters limited to fixed grid neighborhoods and the baseline GCN models only local geographic adjacency, the proposed GCN leverages graph-based connectivity derived from functional relationships between zones, enabling spatial information to propagate adaptively across the urban network. The integration of community detection as a node feature further enhances this process by clustering grids with similar mobility behavior, allowing the model to learn cross-zone dependencies that are not represented in either baseline. Moreover, the combination of GRU-based temporal encoding and GCN spatial aggregation allows the proposed framework to integrate short-term fluctuations with broader spatial trends, enhancing its generalization across varying demand levels. Consequently, the proposed GCN achieves a more coherent spatial understanding of urban mobility dynamics, resulting in lower overall prediction errors and stronger robustness than the baselines.

The overall temporal performance of the proposed GCN model for the next-hour prediction task is shown in Figure 6, which plots the average prediction error (MAE) for both (a) pick-up and (b) drop-off demand. It can be observed that for both tasks, the MAE is consistently low during off-peak hours (approximately 00:00 to 06:00). A clear divergence in performance between weekdays and weekends emerges as demand increases throughout the day. For both pick-up and drop-off predictions, the error is significantly higher on weekends, particularly during the afternoon peak from 14:00 to 20:00. This indicates that the model more accurately captures the structured, commuter-based demand on weekdays, while the less regular, leisure-driven activity on weekends results in a greater average prediction error.

3.2. Next 24-h Pick-Up and Drop-Off Prediction Performance

Table 5 presents the results for the 24 h-ahead pick-up demand prediction. Overall, the performance trends are consistent with those observed in the one-hour prediction task. The two-branch models (Proposed WC1 and Proposed WC2) again demonstrated superior accuracy in sparse regions (y = 0) and moderate-demand zones, achieving MAE values of 0.0044 and 0.0050, respectively. The incorporation of a weighted loss (Proposed WC2) effectively reduced errors for higher demand levels, indicating improved robustness to data imbalance. Proposed SB performed strongly in the medium-demand range (y ∈ [6, 10], MAE = 2.3631), demonstrating that temporal dependency modeling remains beneficial even for long-term prediction. The proposed GCN, which integrates community detection as a spatial feature, achieved the most balanced performance across all demand categories. With an MAE of 0.0053 for zero-demand grids and 8.6355 for extreme high-demand cases, the proposed GCN consistently outperformed the benchmark MFCN model (MAE = 0.0072 and 11.2299, respectively). The inclusion of community structure enabled more effective spatial information propagation, allowing the model to generalize better across diverse urban regions. In contrast, proposed GCN-MM, which trained separate models for each community, achieved competitive accuracy in low-demand active grids (y > 0, MAE = 1.2128 and y ∈ [1, 5], MAE = 0.8514) but showed degraded performance in high-demand areas (MAE = 13.5130) due to limited training data and overfitting within smaller subgraphs. Therefore, the results confirm that incorporating community information enhances the stability and scalability of long-term e-scooter demand prediction.

The next 24 h drop-off prediction experiment was conducted in a similar way, and the results are presented in Table 6. The overall performance patterns were consistent with those observed in the short-term predictions. The two-branch models (Proposed WC1 and Proposed WC2) achieved strong accuracy in sparse grids, with MAE values of 0.0051 and 0.0054, respectively, and the weighted-loss variant Proposed WC2 improved robustness in higher-demand ranges (MAE = 11.4845). Proposed SB, with a GRU temporal encoder, achieved the lowest error for active-demand cells (y > 0, MAE = 1.0935). In contrast, Proposed GCN-MM, trained independently for each community, showed lower overall accuracy, performing best in the low active bin (y ∈ [1, 5], MAE = 0.8154) but declining under high-demand conditions (MAE = 15.2888) due to limited training data. The baseline GCN achieved the best accuracy in the medium-demand range (y ∈ [6, 10], MAE = 2.3938), indicating effective spatial modeling for that range. The proposed GCN, which incorporates community detection, provided the most balanced results overall: it achieved the best zero-demand accuracy (y = 0, MAE = 0.0050) and strong extreme-demand performance (y ≥ 16, MAE = 10.0982), outperforming the baseline MFCN (y = 0, MAE = 0.0078 and y ≥ 16, MAE = 12.4215). These results confirm that explicitly integrating community structure within a unified model enhances the capacity to capture spatial dependencies and leads to more reliable long-term e-scooter demand forecasts.

The overall performance of the proposed GCN model in the 24 h prediction scheme is illustrated in Figure 7. Both pick-up and drop-off predictions exhibit low MAE values during early morning hours (00:00–06:00) and increasing errors throughout the day as e-scooter activity intensifies, with the highest error levels during afternoon and evening peaks (14:00–20:00), particularly on weekends. This pattern suggests that weekday demand, driven primarily by structured commuting behavior, is more predictable than the irregular, leisure-oriented activity that characterizes weekends. Within the 24 h forecasts for active regions (y > 0), the proposed GCN achieved a slightly lower MAE for drop-offs (1.4455) than for pick-ups (1.5293), indicating stable long-range prediction capability.

In summary, the performance comparison between one-hour and 24 h prediction time scales indicates that the proposed GCN performs more effectively in short-term prediction for both pick-up and drop-off tasks. For active demand regions (y > 0), the next-hour pick-up prediction achieved a Mean Absolute Error (MAE) of 1.4134, which slightly increased to 1.5293 for the 24 h horizon. A notable advantage of the proposed GCN over the MFCN benchmark and the baseline GCN is observed in high-demand scenarios. For instance, in predicting high-intensity drop-offs (y ≥ 16), the proposed GCN reduced the MAE from 9.8085 (MFCN) and 11.6168 (GCN) to 9.0302 for the next-hour task and from 12.4215 (MFCN) and 13.1775 (GCN) to 10.0982 for the 24 h task. These results highlight that incorporating the community feature within the proposed GCN framework enhances spatial representation and model stability, enabling reliable performance across different temporal scales and particularly improving long-range e-scooter demand prediction.
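The high-demand comparison above can be restated as relative MAE reductions. The short sketch below derives them from the drop-off figures quoted in the preceding paragraph (y ≥ 16); the helper name `relative_reduction` and the dictionary layout are illustrative, and the percentages are simple arithmetic on those quoted values rather than additional reported results.

```python
def relative_reduction(baseline_mae, model_mae):
    """Fractional MAE reduction of the model relative to a baseline."""
    return (baseline_mae - model_mae) / baseline_mae


# (baseline MAE, proposed GCN MAE) for high-intensity drop-offs, as quoted in the text.
reported = {
    ("next-hour", "MFCN"): (9.8085, 9.0302),
    ("next-hour", "GCN"): (11.6168, 9.0302),
    ("24 h", "MFCN"): (12.4215, 10.0982),
    ("24 h", "GCN"): (13.1775, 10.0982),
}

reductions = {key: relative_reduction(*pair) for key, pair in reported.items()}
for (horizon, baseline), value in reductions.items():
    print(f"{horizon:9s} vs {baseline:4s}: {100 * value:.1f}% lower MAE")
```

Relative to the MFCN, this works out to roughly 8% for the next-hour task and roughly 19% for the 24 h task, which is consistent with the observation that the gain is larger at the longer horizon.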

4. Discussions

The findings show that embedding community detection within a hybrid GRU-GCN framework enhances both interpretability and predictive performance in modeling spatiotemporal micromobility demand. Compared with conventional deep learning baselines such as the MFCN and single-branch GCN models, the proposed framework provides a more robust representation of spatial dependencies and temporal patterns. By utilizing graph-theoretic representations of mobility flows, the model captures both structural and functional connectivity between urban areas, leading to more accurate and stable predictions.

The inclusion of the Louvain community feature enables the framework to identify functionally cohesive subregions such as downtown commercial zones, university areas, and recreational corridors that exhibit similar mobility behaviors. This transformation of raw origin–destination data into a modular network representation allows the model to learn the relationships between spatially connected communities more effectively. Embedding these community structures as node features allows the GCN to propagate information across the graph based on spectral graph theory, thereby modeling correlations that extend beyond adjacent geographic grids. Consequently, the model performs better in both densely and sparsely populated regions by leveraging the inherent regularities in urban mobility patterns.

Beyond predictive performance, the detected communities also enhance the interpretability and practical utility of the model. Several representative communities identified by the Louvain algorithm illustrate this point. For example, a large downtown-centered cluster captures the strong bidirectional flows between commercial blocks and transit-accessible grids, reflecting the core role of the central business district in shaping weekday mobility. Another community emerges around the University of Calgary, where trip patterns exhibit strong evening and weekend activity associated with student travel and campus-related trips. A third community aligns with the Bow River pathway network, showing higher leisure-oriented usage on weekends. These examples demonstrate that the community structure highlights meaningful functional regions and provides a compact representation of mobility behavior that is not available from geographic adjacency alone. The resulting interpretability helps urban planners identify mobility anchors, evaluate network connectivity, and design infrastructure or staging strategies tailored to corridor-level demand patterns.

The GRU component complements the graph structure by capturing recurring temporal dependencies, such as daily commuting and weekend activity patterns. This sequential learning process enhances the model’s ability to handle both short-term fluctuations and longer-term seasonal trends. The higher prediction error observed during weekend afternoons reflects the inherent variability of leisure-related trips, which are often influenced by social or environmental factors not explicitly included in the current model. These findings are consistent with prior studies indicating that leisure and discretionary trips display greater unpredictability than routine commuting behavior. Although the dataset represents a single summer period, the temporal patterns suggest that extending the model to multi-seasonal or year-long datasets would likely improve robustness by capturing additional seasonal cycles. In such cases, incorporating exogenous variables such as weather or event indicators may be necessary to account for greater temporal variability.

From a methodological perspective, the adoption of a multi-branch structure that combines classification and regression tasks contributes to improved robustness under sparse data conditions. This design enables the model to first detect potential activity zones and then estimate the expected demand within them, reducing the sensitivity to data imbalance. Moreover, the use of graph convolution over irregular spatial structures allows information to diffuse efficiently across the network, accommodating complex urban topologies that cannot be represented well by fixed grid-based methods. Together, these components enable the proposed framework to capture nonlinear spatial and temporal interactions without relying on rigid spatial partitions.

The practical implications of these findings are significant. Improved demand prediction can support operational decisions such as fleet redistribution, charging logistics, and dynamic pricing, while also informing policymakers on issues related to service accessibility and urban equity. The enhanced interpretability offered by mobility-based communities further enables planners to identify high-demand corridors, evaluate functional connectivity between neighborhoods, and prioritize infrastructure upgrades in regions that serve as mobility anchors.

In addition to interpretability, computational efficiency is an important consideration for real-time deployment. The Louvain algorithm is executed only once during preprocessing, and its near-linear complexity on sparse graphs ensures scalability for large urban networks. During training, the GRU-GCN architecture requires moderately more computation than MFCN or single-branch GCN models, but the inference phase, critical for real-time forecasting, consists of a single forward pass with no need for repeated community detection or graph reconstruction. This keeps runtime efficiency suitable for near-real-time operational use.

Finally, although the study focuses on Calgary, the underlying methodology is expected to generalize well to other cities. The approach relies on three readily available data sources: grid-level demand counts, origin–destination flows, and simple temporal contextual features. Because both the spatial graph and the community structure are derived directly from local travel patterns, the model adapts naturally to different urban forms without the need for manual spatial tuning. In practical applications, transferring the framework to a new city would require either full retraining with local data or a domain-adaptation strategy in which GRU weights are partially retained while GCN layers and community assignments are recalibrated using the new origin–destination network. Even limited local data is typically sufficient to reconstruct the flow graph and detect mobility communities, enabling the framework to generalize across cities with varying scales, densities, and street layouts. Beyond its application to micromobility, the proposed approach illustrates how advanced mathematical modeling and network-based learning can be leveraged to address complex problems in urban analytics.

5. Conclusions

This study proposed a hybrid deep learning framework that combines Graph Convolutional Networks (GCNs) and Gated Recurrent Units (GRUs) with community detection for short- and long-term e-scooter demand prediction. Using trip data from Calgary, Canada, the proposed GCN model achieved an average reduction of 11.8% in mean absolute error (MAE) compared with the benchmark MFCN model. By incorporating community-based spatial features derived from the Louvain algorithm, the model successfully captured functional relationships among urban regions, resulting in improved predictive accuracy and interpretability. Unlike conventional grid-based or purely adjacency-driven approaches, the proposed framework embeds mobility-driven communities as learnable spatial features, enabling the model to jointly capture local spatial proximity and higher-level functional similarity across urban areas. The resulting community-aware representation also enhanced interpretability by highlighting mobility clusters that align with meaningful functional areas of the city.

Despite these promising results, several limitations remain. The model was developed and validated using data from a single city, and its applicability to other contexts requires further evaluation. Nevertheless, the framework is inherently transferable because it constructs both the spatial graph and community structure directly from observed origin–destination flows, which are commonly available in micromobility systems. This design allows the framework to move beyond fixed spatial partitions and adapt naturally to city-specific mobility patterns. Applying the model to a new city would involve reconstructing the flow network, redetecting communities, and retraining or fine-tuning the model with local demand data, making the adaptation process straightforward. Future research should evaluate the model’s generalizability by testing it in cities with differing socioeconomic and spatial characteristics.
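To make the adaptation steps concrete, the following minimal sketch aggregates origin–destination trips into a weighted flow graph and evaluates the modularity of a candidate community assignment, which is the quantity the Louvain algorithm greedily maximizes. The function names, the toy cell labels, and the choice to ignore trips that start and end in the same cell are illustrative assumptions, not details taken from this study.

```python
from collections import defaultdict


def build_flow_graph(trips):
    """Aggregate (origin_cell, destination_cell) trips into undirected edge weights."""
    weights = defaultdict(float)
    for origin, dest in trips:
        if origin == dest:
            continue  # assumption: trips within a single cell are ignored
        edge = tuple(sorted((origin, dest)))
        weights[edge] += 1.0
    return dict(weights)


def modularity(weights, community_of):
    """Newman modularity Q of a partition on a weighted undirected graph."""
    m = sum(weights.values())       # total edge weight
    strength = defaultdict(float)   # weighted degree of each node
    internal = defaultdict(float)   # edge weight inside each community
    for (u, v), w in weights.items():
        strength[u] += w
        strength[v] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w
    total = defaultdict(float)      # summed node strength per community
    for node, s in strength.items():
        total[community_of[node]] += s
    return sum(internal[c] / m - (total[c] / (2 * m)) ** 2 for c in total)


# Toy origin-destination records for five hypothetical grid cells.
trips = ([("a", "b")] * 5 + [("b", "c")] * 4 + [("a", "c")] * 3
         + [("d", "e")] * 6 + [("c", "d")])
flows = build_flow_graph(trips)
partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
print(round(modularity(flows, partition), 4))
```

In a new city, only `trips` would change: the flow graph is rebuilt from local data, and a community detection routine searches for the partition with the highest modularity before the predictive model is retrained or fine-tuned.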

Additionally, the current framework does not explicitly account for exogenous factors such as weather, traffic, or public events, which are known to influence micromobility usage. Including these variables could improve both prediction accuracy and adaptability to short-term fluctuations. The dataset used in this study covers a single summer period, and incorporating multi-seasonal or year-long data may improve robustness by capturing seasonal variations and weather-driven behavioral shifts. Extending the temporal scope would allow the model to learn additional recurring patterns, although such an expansion may require contextual variables to account for the increased variability. Another limitation lies in the static nature of the Louvain-based community detection approach, which does not capture temporal evolution in mobility patterns. Future extensions could explore dynamic or time-aware community detection to reflect evolving functional regions within cities. Finally, exploring attention mechanisms, graph transformers, or other advanced architectures could further improve learning efficiency and the interpretability of spatial dependencies.

In conclusion, the proposed GRU-GCN framework represents a promising step forward in the modeling of spatiotemporal demand for micromobility systems. The key contributions of this study lie in (i) integrating mobility-informed community structures into graph-based learning, (ii) jointly modeling temporal dynamics and functional spatial relationships within a unified framework, and (iii) addressing demand sparsity through a two-stage prediction design. By integrating community-aware graph structures with temporal sequence modeling, the framework provides both methodological rigor and practical relevance. The computational efficiency of the approach, particularly during inference where only a single forward pass is required, further supports its suitability for near-real-time deployment in operational settings. Overall, the proposed framework advances current spatiotemporal demand prediction practices by shifting from static spatial representations toward community-aware, functionally interpretable, and transferable modeling of urban mobility demand. This approach contributes to the advancement of data-driven urban analytics, supporting the development of intelligent and sustainable urban mobility systems.

Author Contributions

Conceptualization, M.G.D. and S.P.; methodology, K.P. and M.M.Z.; software, M.M.Z.; validation, M.M.Z., K.P. and S.P.; formal analysis, K.P. and S.P.; investigation, M.M.Z., K.P. and S.P.; resources, M.M.Z.; data curation, M.M.Z.; writing—original draft preparation, M.M.Z.; writing—review and editing, S.P., K.P. and M.G.D.; visualization, M.M.Z.; supervision, K.P. and S.P.; project administration, S.P.; funding acquisition, S.P. and M.G.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially supported by the National Research Council of Thailand (NRCT), the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant (RGPIN/03037-2022), and Chiang Mai University.

Data Availability Statement

The data are available on request, subject to the restrictions of a Non-Disclosure Agreement (NDA).

Acknowledgments

We thank the City of Calgary for providing the e-scooter data. The research was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant (RGPIN/03037-2022), the National Research Council of Thailand (NRCT) through the Hub of Talents in AI and Emerging Technology (AI-NEXT), and Chiang Mai University.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Spatial distribution of e-scooter (a) pick-up and (b) drop-off activities in Calgary. The city contains 21,702 square grids (200 m × 200 m), but only about 19.45% recorded trips during the 75-day study period. Color intensity ($\log_{10}$ scale) indicates the number of trips, showing concentrated activity in the city center.

Figure 2. Average hourly e-scooter demand: (a) weekdays and (b) weekends.

Figure 3. $\log_{10}$-scale distributions of e-scooter trip characteristics: (a) trip duration (minutes) and (b) trip distance (kilometers). The average (red dashed line) and median (black dotted line) values are indicated for both duration (12.88 and 8.40 min, respectively) and distance (1.85 and 1.26 km, respectively).

Figure 4. Spatial distribution of e-scooter demand in Calgary. (a–c) Pick-up demand for overall, weekday, and weekend periods. (d–f) Drop-off demand for overall, weekday, and weekend periods. Brighter colors on the $\log_{10}$ scale indicate higher trip volumes.

Figure 5. Proposed model architecture for scooter demand prediction. The input tensor $X \in \mathbb{R}^{B \times N \times H \times F}$ includes 14 lag features ($X_{\mathrm{lag}}$), five contextual variables ($h_{\mathrm{sin}}$, $h_{\mathrm{cos}}$, $d_{\mathrm{sin}}$, $d_{\mathrm{cos}}$, and $w$, with cyclic variables encoded as sine and cosine pairs), and one categorical community feature ($c_{\mathrm{id}}$). The classification branch outputs the probability $P$, and the regression branch estimates the demand magnitude $A$.
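The caption of Figure 5 does not state how the two branch outputs are merged into a final prediction. One plausible reading of a two-stage design is sketched below: the classification probability P gates the regression magnitude A, so that cells unlikely to see any demand are set to zero. The gating rule, the 0.5 threshold, and the clipping of negative magnitudes are assumptions made for illustration only.

```python
def combine_two_stage(prob, amount, threshold=0.5):
    """Gate regression outputs with classifier probabilities, cell by cell.

    prob      -- predicted probability of non-zero demand per grid cell (P)
    amount    -- predicted demand magnitude per grid cell (A)
    threshold -- assumed cut-off; not specified in the source
    """
    if len(prob) != len(amount):
        raise ValueError("prob and amount must have the same length")
    # Keep the (non-negative) magnitude only where demand is predicted to occur.
    return [max(a, 0.0) if p >= threshold else 0.0
            for p, a in zip(prob, amount)]


# Three hypothetical grid cells: likely demand, unlikely demand, borderline.
print(combine_two_stage([0.9, 0.2, 0.5], [3.2, 1.1, -0.4]))
```

A gate of this kind targets the sparsity visible in Figure 1, where most grid cells record no trips, because the regression branch no longer has to predict exact zeros on its own.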

Figure 6. Hourly average prediction error (MAE) for next-hour or (t + 1) demand: (a) pick-up and (b) drop-off, shown separately for weekday and weekend using the proposed GCN-4 model.

Figure 7. Hourly average prediction error (MAE) for next-24-h or (t + 24) demand: (a) pick-up and (b) drop-off, shown separately for weekday and weekend using the proposed GCN-4 model.

Table 1. Features used for the next-hour (t + 1) prediction and the next 24 h (t + 24) prediction.

Feature Group	Prediction Task/Feature	Lag Values (Hours Before Prediction)	Description
Historical Demand Features (14)	Next-Hour Prediction (t + 1)	0, 1	Immediate trend (recent hours)
		22, 23, 24	Daily cycle (same hour on the previous day)
		143, 166, 167, 168, 191	Weekly cycle (same hour in the previous week)
		335, 336	Two-week cycle
		503, 504	Three-week cycle
	Next-24-h Prediction (t + 24)	0, 1	Immediate trend
		120, 121	Five days prior
		143, 144, 145	Six days prior
		168	One week prior (weekly cycle)
		312, 313	Thirteen days prior
		336	Two weeks prior
		480, 481	Twenty days prior
		648	Twenty-seven days prior
Spatiotemporal Context Features (6)	Temporal context: $h_{\mathrm{sin}}$, $h_{\mathrm{cos}}$	-	Represents 24 h cyclic variation (sine–cosine encoded)
	Temporal context: $d_{\mathrm{sin}}$, $d_{\mathrm{cos}}$	-	Represents seven-day cycle (sine–cosine encoded)
	Temporal context: $w$	-	Binary variable indicating weekends (1) vs. weekdays (0)
	Spatial context: $c_{\mathrm{id}}$	-	Categorical feature from Louvain community detection
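The feature construction summarized in Table 1 can be sketched as follows for the next-hour task. The sketch assumes that lag 0 refers to the most recent observed hour, that days are numbered from Monday = 0 (Python's convention), and that the sine and cosine encodings use periods of 24 and 7; the function names are hypothetical.

```python
import math
from datetime import datetime

# Lag offsets (hours) for the next-hour task, copied from Table 1.
NEXT_HOUR_LAGS = [0, 1, 22, 23, 24, 143, 166, 167, 168, 191, 335, 336, 503, 504]


def temporal_context(ts):
    """Return [h_sin, h_cos, d_sin, d_cos, w] for a timestamp."""
    hour_angle = 2 * math.pi * ts.hour / 24       # 24 h cycle
    day_angle = 2 * math.pi * ts.weekday() / 7    # seven-day cycle
    is_weekend = 1 if ts.weekday() >= 5 else 0    # Saturday or Sunday
    return [math.sin(hour_angle), math.cos(hour_angle),
            math.sin(day_angle), math.cos(day_angle), is_weekend]


def lag_features(series, t, lags=NEXT_HOUR_LAGS):
    """Pick lagged hourly counts for one grid cell.

    series -- hourly demand counts for the cell, oldest first
    t      -- index of the most recent observed hour
    """
    if t - max(lags) < 0:
        raise ValueError("not enough history for the largest lag")
    return [series[t - lag] for lag in lags]


context = temporal_context(datetime(2025, 7, 5, 6))   # a Saturday, 06:00
history = lag_features(list(range(600)), t=504)
print(len(history), len(context))
```

Encoding hour and day as sine and cosine pairs places 23:00 next to 00:00 and Sunday next to Monday, which a raw integer encoding would not.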

Table 2. Summary of model hyperparameters and training configurations used in the proposed GRU-GCN framework.

Hyperparameter	Value	Description
GRU Hidden Dimension	64	Hidden state size of the GRU layer
GRU Layers	1	Number of recurrent layers in the GRU block
GCN Hidden Dimension	64	Node embedding size in GCN layers
GCN Layers	2	Number of Graph Convolutional layers
Community Embedding Dim	10	Dimension of learnable community embeddings
Training Settings
Batch Size	16	Number of samples per batch
Training Epochs	200	Maximum iterations
Early Stopping Patience	10	Epochs to wait before stopping
Optimizer	Adam	Adaptive Moment Estimation
Learning Rate	0.0005	Initial learning rate
Weight Decay	$1 \times 10^{-5}$	L2 regularization coefficient
Dropout Rate	0.2	Applied after GCN layers
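For readers reproducing the setup, the values in Table 2 can be collected in a single configuration object, as in the hedged sketch below. The field names are hypothetical, and the early-stopping rule assumes that validation loss is the monitored quantity, which the table does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters from Table 2; field names are illustrative."""
    gru_hidden_dim: int = 64
    gru_layers: int = 1
    gcn_hidden_dim: int = 64
    gcn_layers: int = 2
    community_embedding_dim: int = 10
    batch_size: int = 16
    max_epochs: int = 200
    early_stopping_patience: int = 10
    learning_rate: float = 5e-4
    weight_decay: float = 1e-5
    dropout: float = 0.2


def should_stop(val_losses, patience):
    """Return True once the best validation loss is at least `patience` epochs old."""
    if not val_losses:
        return False
    best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    return len(val_losses) - 1 - best_epoch >= patience


cfg = TrainConfig()
print(should_stop([1.0] + [1.1] * 10, cfg.early_stopping_patience))
```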

Table 3. Performance comparison between the benchmark MFCN model and six variants of the proposed GCN architecture for the next-hour or t + 1 pick-up demand prediction, based on the mean absolute error (MAE) along with the standard deviation of absolute errors (SD of AE).

			MAE (SD of AE)
Model	Feature Size	No. of Predictions	$y = 0$	$y > 0$	$y \in [1, 5]$	$y \in [6, 10]$	$y \in [11, 15]$	$y \in [16, \infty)$
MFCN	14	2.4 k	0.0057 (0.1812)	1.5180 (3.1669)	1.2136 (2.773)	2.9911 (3.669)	4.5346 (5.396)	8.4077 (9.330)
GCN	17	1.28 M	0.0734 (0.1084)	1.2374 (1.6946)	0.9675 (0.9469)	2.8630 (1.9336)	5.5368 (2.9251)	11.811 (10.6081)
Proposed SB	17	1.28 M	0.0123 (0.1010)	1.5091 (1.8267)	1.3275 (1.3030)	2.4290 (1.7725)	4.4716 (2.6754)	10.6398 (10.3199)
Proposed WC1	17	1.28 M	0.0053 (0.0665)	1.4034 (1.7018)	1.1153 (1.0386)	3.2156 (1.9538)	5.7806 (2.9315)	12.2893 (11.1480)
Proposed WC2	17	1.28 M	0.0050 (0.0748)	1.4197 (1.8226)	1.1993 (1.1119)	2.8480 (1.8205)	4.4732 (2.9049)	9.7067 (9.4221)
Proposed GCN-MM	17	1.28 M	0.0978 (0.2409)	1.1815 (1.4523)	0.8979 (0.9249)	3.0238 (1.9411)	5.1599 (2.9228)	11.5644 (10.5682)
Proposed GCN	18	1.28 M	0.0051 (0.0749)	1.4134 (1.8139)	1.1891 (1.1049)	2.8777 (1.8549)	4.4186 (2.5946)	9.6895 (9.2734)
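The stratified metrics reported in Tables 3–6 can be computed along the following lines. This is a minimal sketch: it assumes that bins are defined on the observed demand y and that the standard deviation is the population form, neither of which is stated explicitly in the tables.

```python
from statistics import mean, pstdev

# Demand bins used in Tables 3-6; None marks an open upper bound.
BINS = {
    "y = 0": (0, 0),
    "y > 0": (1, None),
    "[1, 5]": (1, 5),
    "[6, 10]": (6, 10),
    "[11, 15]": (11, 15),
    "[16, inf)": (16, None),
}


def binned_errors(y_true, y_pred):
    """Return {bin: (MAE, SD of absolute errors)} for bins that contain samples."""
    results = {}
    for name, (low, high) in BINS.items():
        errors = [abs(t - p) for t, p in zip(y_true, y_pred)
                  if t >= low and (high is None or t <= high)]
        if errors:  # skip bins with no observations
            results[name] = (mean(errors), pstdev(errors))
    return results


# Five hypothetical cell-hour observations and predictions.
observed = [0, 0, 2, 7, 20]
predicted = [0.0, 0.5, 1.0, 9.0, 15.0]
for bin_name, (mae, sd) in binned_errors(observed, predicted).items():
    print(bin_name, round(mae, 4), round(sd, 4))
```

Reporting errors per bin matters here because zero-demand cells dominate the data; a single overall MAE would be driven almost entirely by the y = 0 column.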

Table 4. Performance comparison between the benchmark MFCN model and six variants of the proposed GCN architecture for the next-hour or t + 1 drop-off demand prediction, based on the mean absolute error (MAE) along with the standard deviation of absolute errors (SD of AE).

			MAE (SD of AE)
Model	Feature Size	No. of Predictions	$y = 0$	$y > 0$	$y \in [1, 5]$	$y \in [6, 10]$	$y \in [11, 15]$	$y \in [16, \infty)$
MFCN	14	2.4 k	0.0063 (0.1872)	1.5382 (2.8517)	1.2513 (2.4982)	3.1714 (3.6090)	4.5551 (4.0260)	9.8085 (9.6832)
GCN	17	1.37 M	0.0205 (0.1146)	1.4476 (1.7277)	1.2842 (1.2170)	2.4893 (1.9347)	4.5246 (2.8562)	11.6168 (12.2902)
Proposed SB	17	1.37 M	0.0235 (0.1220)	1.3856 (1.7646)	1.2123 (1.2445)	2.4657 (1.8582)	4.7671 (2.8317)	12.2261 (12.2568)
Proposed WC1	17	1.37 M	0.0047 (0.0625)	1.3420 (1.7380)	1.0798 (0.9123)	3.2769 (1.9168)	6.1085 (2.9796)	13.9176 (12.4829)
Proposed WC2	17	1.37 M	0.0056 (0.0752)	1.4184 (1.6779)	1.2336 (1.2186)	2.9032 (2.2759)	4.3552 (3.1804)	9.2507 (9.7960)
Proposed GCN-MM	17	1.37 M	0.1011 (0.2339)	1.1087 (1.6985)	0.8497 (0.9387)	3.0258 (1.8103)	5.7779 (2.8675)	13.0021 (11.5892)
Proposed GCN	18	1.37 M	0.0054 (0.0751)	1.4117 (1.6656)	1.2169 (1.1997)	3.0118 (2.1835)	4.2972 (3.1262)	9.0302 (9.9654)

Table 5. Performance comparison between the benchmark MFCN model and six variants of the proposed GCN architecture for the next 24 h or t + 24 pick-up demand prediction, based on the mean absolute error (MAE) along with the standard deviation of absolute errors (SD of AE).

			MAE (SD of AE)
Model	Feature Size	No. of Predictions	$y = 0$	$y > 0$	$y \in [1, 5]$	$y \in [6, 10]$	$y \in [11, 15]$	$y \in [16, \infty)$
MFCN	14	2.4 k	0.0072 (0.1870)	1.6850 (3.1231)	1.3307 (2.8816)	3.1609 (3.5391)	5.2500 (3.6433)	11.2299 (8.4332)
GCN	17	1.28 M	0.0139 (0.1204)	1.6534 (1.9371)	1.4911 (1.3790)	2.4091 (1.7625)	4.3537 (2.5127)	11.0606 (10.1379)
Proposed SB	17	1.28 M	0.0135 (0.1163)	1.5568 (1.9356)	1.3817 (1.3096)	2.3631 (1.7463)	4.3800 (2.6205)	11.4834 (10.8860)
Proposed WC1	17	1.28 M	0.0044 (0.0615)	1.4302 (1.8170)	1.1073 (0.8909)	3.3048 (2.1369)	5.9645 (2.7744)	12.2274 (11.9223)
Proposed WC2	17	1.28 M	0.0050 (0.0756)	1.4557 (1.8362)	1.2082 (1.0797)	2.8506 (1.8144)	4.8619 (2.7734)	10.3802 (9.8565)
Proposed GCN-MM	17	1.28 M	0.0924 (0.2247)	1.2128 (1.7382)	0.8514 (0.8686)	3.2655 (1.8970)	6.3713 (2.8842)	13.5130 (12.0103)
Proposed GCN	18	1.28 M	0.0053 (0.0799)	1.5293 (1.8994)	1.2911 (1.2978)	3.1125 (1.8276)	4.3152 (2.4896)	8.6355 (9.8171)

Table 6. Performance comparison between the benchmark MFCN model and six variants of the proposed GCN architecture for the next 24 h or t + 24 drop-off demand prediction, based on the mean absolute error (MAE) along with the standard deviation of absolute errors (SD of AE).

			MAE (SD of AE)
Model	Feature Size	No. of Predictions	$y = 0$	$y > 0$	$y \in [1, 5]$	$y \in [6, 10]$	$y \in [11, 15]$	$y \in [16, \infty)$
MFCN	14	2.4 k	0.0078 (0.1678)	1.6230 (2.567)	1.3130 (2.2940)	3.0823 (3.0603)	5.2701 (3.0756)	12.4215 (8.9490)
GCN	17	1.37 M	0.0147 (0.1264)	1.5506 (1.8655)	1.3953 (1.3368)	2.3938 (1.7272)	4.6165 (2.7112)	13.1775 (11.4313)
Proposed SB	17	1.37 M	0.1820 (0.2270)	1.0935 (1.6387)	0.8234 (0.8663)	2.9773 (1.8192)	6.1763 (2.8791)	15.3078 (12.4467)
Proposed WC1	17	1.37 M	0.0051 (0.0724)	1.3690 (1.7939)	1.0123 (0.8982)	3.2218 (1.9256)	6.4127 (3.0165)	15.4617 (12.5934)
Proposed WC2	17	1.37 M	0.0054 (0.0738)	1.3796 (1.8102)	1.1555 (0.9937)	2.8758 (1.7430)	5.0472 (2.7751)	11.4845 (10.0969)
Proposed GCN-MM	17	1.37 M	0.0977 (0.2202)	1.1478 (1.7006)	0.8154 (0.8269)	3.3612 (2.0973)	6.8945 (3.0820)	15.2888 (12.4140)
Proposed GCN	18	1.37 M	0.0050 (0.0682)	1.4455 (1.8547)	1.2290 (1.1590)	3.0437 (1.9116)	4.6059 (2.4427)	10.0982 (10.0180)

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Moon Zin, M.; Patanukhom, K.; Demissie, M.G.; Phithakkitnukoon, S. Hybrid Graph Convolutional-Recurrent Framework with Community Detection for Spatiotemporal Demand Prediction in Micromobility Systems. Mathematics 2026, 14, 116. https://doi.org/10.3390/math14010116
