Advancing Rural Mobility: Identifying Operational Determinants for Effective Autonomous Road-Based Transit

Jayatilleke, Shenura; Bhaskar, Ashish; Bunker, Jonathan

doi:10.3390/smartcities8050170

by Shenura Jayatilleke, Ashish Bhaskar and Jonathan Bunker *

School of Civil and Environmental Engineering, Queensland University of Technology, 2 George Street, Brisbane, QLD 4000, Australia

* Author to whom correspondence should be addressed.

Smart Cities 2025, 8(5), 170; https://doi.org/10.3390/smartcities8050170

Submission received: 26 August 2025 / Revised: 8 October 2025 / Accepted: 9 October 2025 / Published: 12 October 2025

(This article belongs to the Special Issue Cost-Effective Transportation Planning for Smart Cities)

Highlights

What are the main findings?

Small autonomous shuttles are preferred for flexible, non-routine trips, minibus shuttles for First Mile Last Mile (FMLM) connectivity in town centers, and standard-sized buses for high-capacity school and emergency transport.
Hybrid models integrating autonomous and conventional buses, alongside multipurpose services and Mobility-as-a-Service (MaaS) integration, are favored to address equity concerns and optimize efficiency, with autonomous taxis raising accessibility barriers for disadvantaged groups.

What is the implication of the main finding?

These insights inform targeted policy deployment, such as small shuttles in university/tourist areas, minibus shuttles in accessible town centers, and subsidized standard buses for schools, to reduce transport disadvantages and enhance rural connectivity.
By tailoring autonomous road-based transit to diverse user needs, the study promotes sustainable, inclusive mobility solutions that mitigate social inequality and improve quality of life in low-density regions.

Abstract

Rural communities face persistent transport disadvantages due to low population density, limited service availability, and high operational costs, restricting access to essential services and exacerbating social inequality. Autonomous public transport systems offer a transformative solution by enabling flexible, cost-effective, and inclusive mobility options. This study investigates the operational determinants for autonomous road-based transit systems in rural and peri-urban South-East Queensland (SEQ), employing a structured survey of 273 residents and analytical approaches, including General Additive Model (GAM) and Extreme Gradient Boosting (XGBoost). The findings indicate that small shuttles suit flexible, non-routine trips, with leisure travelers showing the highest importance (Gain = 0.473) and university precincts demonstrating substantial influence (Gain = 0.253), both confirmed as significant predictors by GAM (EDF = 0.964 and EDF = 0.909, respectively). Minibus shuttles enhance first-mile and last-mile connectivity, driven primarily by leisure travelers (Gain = 0.275) and tourists (Gain = 0.199), with shopping trips identified as a significant non-linear predictor by GAM (EDF = 1.819). Standard-sized buses are optimal for high-capacity transport, particularly for school children (Gain = 0.427) and school trips (Gain = 0.148), with GAM confirming their significance (EDF = 1.963 and EDF = 0.834, respectively), demonstrating strong predictive accuracy. Hybrid models integrating autonomous and conventional buses are preferred over complete replacement, with autonomous taxis raising equity concerns for low-income individuals (Gain = 0.047, indicating limited positive influence). Integration with Mobility-as-a-Service platforms demonstrates strong potential, particularly for special events (Gain = 0.290) and leisure travelers (Gain = 0.252). These insights guide policymakers in designing autonomous road-based transit systems to improve rural connectivity and quality of life.

Keywords:

sustainable mobility; public transport; shared autonomous vehicles; first mile and last mile; general additive model; extreme gradient boost; South East Queensland

1. Introduction

1.1. Challenges in Rural Public Transport

The provision of efficient and accessible public transport in rural areas remains a persistent challenge, driven by low population densities, dispersed settlements, and constrained financial resources. These factors contribute to significant transport disadvantages, limiting access to essential services such as education, healthcare, employment, and social activities, which in turn exacerbates social isolation and inequality. In rural and remote regions, public transport options are often scarce, infrequent, or entirely absent, compelling residents to rely heavily on private vehicles, which may not be viable for all, particularly low-income individuals, seniors, or those with disabilities. For instance, rural areas face higher traffic fatality rates (1.5 times those in urban areas) due to poor infrastructure, such as unpaved roads, sharp curves, and limited visibility, compounded by extreme weather conditions like snow and fog that impair navigation. Additionally, limited digital connectivity and insufficient charging infrastructure further hinder sustainable transport solutions [1,2,3]. Studies on regional public transport preferences emphasize that these challenges lead to lower modal shares for public transit, with users prioritizing reliability, frequency, and cost over other attributes [4,5].

1.2. Potential of Autonomous Shuttles in Addressing Rural Transport Challenges

Autonomous public transport systems, enabled by advancements in vehicle automation and artificial intelligence, offer a transformative opportunity to address these challenges. Unlike traditional fixed-route systems, autonomous vehicles can provide flexible, on-demand services tailored to the irregular and dispersed demand patterns characteristic of rural areas. By eliminating driver-related costs, which constitute a significant portion of public transport expenses, autonomous systems could reduce operational costs, enabling extended service hours, including 24/7 operations, and broader geographic coverage [6]. Autonomous shuttles also promise enhanced safety by minimizing human error, improved mobility for underserved groups (e.g., elderly and disabled), and economic benefits through efficient goods transport [7]. Electric autonomous shuttles can reduce emissions and noise, aligning with sustainability goals, while shared mobility models like on-demand shuttles optimize vehicle utilization in low-density areas [8].

1.3. Public Perceptions and User Preference for Autonomous Shuttles in Rural Contexts

Public perceptions and user preferences play a crucial role in autonomous shuttle adoption, particularly in rural areas where trust in technology is often lower than in urban environments [9]. Surveys indicate that rural residents express skepticism toward autonomous shuttles due to concerns about reliability, safety, affordability, and privacy, with older adults and those with lower education levels showing greater resistance [10,11]. In small and mid-sized metropolitan areas with rural peripheries, autonomous shuttles (shared or private) have limited appeal, even at reduced costs, and commuters are uncomfortable sharing roads with them [12]. However, positive perceptions emerge for specific use cases, such as first-mile/last-mile (FMLM) connectivity and leisure travel, where flexibility is valued [10]. The literature highlights that while autonomous shuttles are seen as opportunities for sustainable mobility, rural adoption lags due to cultural and infrastructural barriers, with demographic factors influencing preferences [13,14].

1.4. Operational Determinants and Service Design

Researchers such as Scheltes and de Almeida Correia [15], Lau and Susilawati [16], Roy, Dadashev [17], and Gurumurthy, Kockelman [18] have primarily focused on Autonomous Demand Responsive Transit (ADRT) as the preferred service type, highlighting its prominence in their studies. The high flexibility of autonomous shuttle services has significantly contributed to their increased adoption. Successful implementation of these services hinges on effective service design (e.g., autonomous shuttle penetration rate) and user acceptance [7,19]. Although numerous studies have explored service design through simulations and mathematical programming [20,21], there remains a critical need to incorporate public perception into service design to identify key predictors of operational success [22]. Land use patterns influence trip generation and distribution, informing vehicle capacity and routing, while trip purposes drive service flexibility [23,24]. While existing studies focus on Land Use and Transportation Interaction (LUTI) models, the successful deployment of such systems requires a nuanced understanding of user preferences and the operational determinants that influence their adoption in rural contexts, where transport needs differ markedly from urban environments.

1.5. Contributions of the Study

This study empirically investigates the operational determinants of autonomous public transport systems in rural and peri-urban communities of South East Queensland. Through a structured survey targeting residents in low-density areas, the research explores preferences for various autonomous vehicle types and service offerings. The analysis employs two complementary analytical approaches: General Additive Models (GAM) and Extreme Gradient Boosting (XGBoost). GAM is particularly suited for this study due to its ability to model non-linear relationships between predictors (such as trip purpose, demographic groups, and land use) and user preferences, which may exhibit complex, non-linear patterns. XGBoost, a powerful machine learning algorithm, excels in handling high-dimensional data and capturing intricate interactions among variables, enabling the identification of the most influential factors driving user preferences. GAM identifies significant predictors and non-linear relationships through Effective Degrees of Freedom (EDF), while XGBoost quantifies predictor importance through feature gain metrics.

The significance of this research lies in its contribution to the emerging field of autonomous transport, particularly in rural settings where traditional public transport solutions have proven inadequate. By providing empirical evidence on user preferences across diverse demographic groups, trip purposes, and land use contexts, this study offers actionable insights for policymakers, transport planners, and industry stakeholders. These insights can inform the design and implementation of autonomous public transport systems that are tailored to the unique needs of rural communities, thereby reducing transport disadvantages, enhancing access to essential services, and improving overall quality of life.

The remainder of the paper is organized as follows. Section 2 describes the study methodology, including the survey design, data collection, and analysis. Section 3 presents the model estimation and validation results of GAM and XGBoost. A comprehensive discussion, policy implications, and the conclusion of the study are provided in Section 4, Section 5 and Section 6, respectively. Finally, the research limitations and future research directions are presented in Section 7.

2. Materials and Methods

This section details the methodology undertaken, focusing on survey design and data collection, data characteristics and processing, GAM, and XGBoost model fitting and validation.

2.1. Survey Design and Data Collection

To empirically identify the operational determinants for rural autonomous road-based transit, a structured questionnaire was developed. The survey design process was structured in two main phases [10]. First, a comprehensive literature review was conducted to identify and conceptualize key constructs, including trip purpose, land use, demographic groups, vehicle types, and service offering. The attributes were mainly extracted from the Australian Transport Assessment and Planning Guidelines [24,25] and the literature [9,26,27,28,29]. These references provided a theoretical foundation for understanding travel behavior, land use dynamics, vehicle suitability, and service models in rural and peri-urban settings, especially for autonomous shuttles. In the second phase, semi-structured interviews with domain experts from the Australian transport sector were conducted to validate the identified variables and ensure their relevance to rural environments [30]. The survey instrument, encompassing measurement items and introductory content, was iteratively refined through consultations with an expert supervisory review panel to enhance content validity and contextual appropriateness. The final list of study attributes is given in Table 1.

Before the main study, a pilot study was administered to Higher Degree Research (HDR) students, selected for their expertise in survey methodologies and analytical rigor. This pilot aimed to identify ambiguities, structural inconsistencies, and potential misinterpretations within the questionnaire. Feedback informed further refinements, improving the instrument’s clarity, interpretability, and face validity. The finalized questionnaire was approved by the University Human Research Ethics Committee (UHREC: 9083-HE09), ensuring compliance with ethical standards. The questionnaire (Appendix A) collected respondent data and evaluated autonomous road-based transit suitability across demographic groups, land use contexts, trip purposes, vehicle types, and service configurations. The target population for the main study consisted of residents in rural and peri-urban communities within the South East Queensland (SEQ) region, specifically in the Lockyer Valley, Moreton Bay, Sunshine Coast, Scenic Rim, and Redland local government areas, as identified by Mortoja and Yigitcanlar [31]. To align with global rural classification standards, only the localities with a population density below 300 inhabitants per square kilometer were included [32].

Sample size determination was guided by population-based sampling adequacy, targeting a sample of 300 responses for a population exceeding 1,000,000, at a 95% confidence level with a 6% margin of error [33], sufficient for the exploratory analyses employed in this study. Data collection was facilitated by Qualtrics using a convenience random sampling approach to reach rural residents within the study area. A total of 357 responses were initially collected. The final sample size was 273, after a rigorous data cleaning process that addressed four key issues: unengaged respondents (inconsistent or patterned answers), outliers (anomalous data points with absolute Z-scores greater than 3), missing data, and multicollinearity (variance inflation factor < 5) [34,35]. The socio-demographic profile of the respondents, including gender, age, education level, employment status, and annual household income, is presented in Figure 1a–e.
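As a rough cross-check on the sample-size figures above, the sketch below applies Cochran's formula for estimating a proportion in a large population. The source cites [33] without stating the formula it used, so Cochran's formula, the conservative proportion p = 0.5, the critical value z = 1.96, and the function names are all assumptions made for illustration.

```python
import math

def cochran_sample_size(z: float, margin: float, p: float = 0.5) -> int:
    """Minimum n for a proportion in a large population (assumed formula)."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

def margin_of_error(z: float, n: int, p: float = 0.5) -> float:
    """Margin of error achieved by a sample of size n (same assumptions)."""
    return z * math.sqrt(p * (1 - p) / n)

# 95% confidence level (z = 1.96) and a 6% margin of error, as stated in the text.
n_min = cochran_sample_size(z=1.96, margin=0.06)
print("minimum sample size:", n_min)

# Margin of error implied by the final cleaned sample of 273 respondents.
print("margin at n = 273:", round(margin_of_error(1.96, 273), 4))
```

Under these assumptions, both the target of 300 responses and the final sample of 273 sit at or above the computed minimum.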

The socio-demographic profile of the survey participants (N = 273) revealed a predominantly female sample, with women comprising the majority (n = 179, 65.6%), followed by men (n = 93, 34.1%) and a negligible non-binary representation (n = 1, 0.4%). Age distribution was relatively even across younger and middle-aged groups, peaking at 36–50 years (n = 77, 28.2%) and lowest among those aged 66 years or older (n = 54, 19.8%). Education levels were highest for trade/apprenticeship or TAFE qualifications (n = 90, 32.9%), while Year 10 (n = 38, 13.9%) and postgraduate degrees (n = 37, 13.6%) were the least common. Employment status showed full-time employment as dominant (n = 109, 40.0%), with full-time or part-time students representing the smallest group (n = 15, 5.5%). Household income was skewed towards higher brackets, with AUD 104,000 or more the most prevalent (n = 101, 36.9%) and under AUD 15,600 the least (n = 7, 2.6%).

To establish the sample’s contextual validity, key demographics were compared against official Australian Bureau of Statistics (ABS) data for the target area (rural/peri-urban SEQ) [36]. The comparison revealed several selection biases. The sample showed a significant gender skew, with the proportion of women (65.6%) substantially exceeding the female share in the regional SEQ population (≈48% in rural SEQ). A socio-economic bias was evident, as 37% of respondents reported an annual household income in the highest bracket (AUD 104,000 or more), which contrasts with the median annual household income for the rural SEQ benchmark (≈AUD 80,964). The sample’s age distribution showed an over-representation of middle-aged respondents (36–65 years accounted for 51.3% of the sample, compared to the regional average of ≈40.5%), and the youngest age group (18–35 years) was under-represented (28.9% vs. regional average ≈ 39.5%). Finally, the sample exhibited a higher level of educational attainment than the regional average; for example, 32.9% reported a trade/apprenticeship/TAFE qualification (regional average ≈ 25.5%), and 13.6% held a postgraduate degree (regional average ≈ 10.1%). While the high representation of seniors (aged 66+ at 19.8%, regional average ≈ 20%) appropriately reflects the older demographic profile of many rural areas, the biases toward female, higher-income, and more highly educated respondents must be considered a limitation on the overall generalizability of the findings.

Internal consistency of the data was evaluated using Cronbach’s Alpha, with values ranging from 0.705 to 0.984, indicating acceptable to excellent reliability across constructs [37,38]. Figure 2 presents the data analysis process, including preprocessing via binarization and feature selection, followed by data property checks (sample size adequacy, category balance, multicollinearity, and linearity). Model development includes GAM to capture non-linear relationships interpretably, complemented by XGBoost for ensemble-based predictive accuracy, with model validation employing AUC, F1 score, and bootstrap confidence intervals (CIs).

2.2. Data Characteristics and Processing

The study employed a sophisticated statistical pipeline for data processing, including binarization, feature selection, sample size adequacy, category balance, multicollinearity and linearity checks. RStudio 2024.12.1 provided the platform for implementing statistical models (XGBoost and GAM) and executing the required data preparation and visualization steps.

2.2.1. Binarization

The dependent variables are binarized to transform Likert-scale responses into binary outcomes. Equation (1) represents the logistic regression model used to model the binarized dependent variables. The logistic function was initially implemented and later extended to incorporate the non-linear effects of GAM:

$$P(Y = 1 \mid X) = \frac{1}{1 + e^{-(\beta_{0} + \beta_{1} X_{1} + \dots + \beta_{p} X_{p})}}, \tag{1}$$

where $\beta_{0}, \dots, \beta_{p}$ are the coefficients and $X_{1}, \dots, X_{p}$ are the predictors of the logistic function.

Dependent variables are binarized at a threshold of ≥4 (coding 4–5 as 1, else 0), based on Norman [39], who argues that Likert-scale data can be treated as binary when high scores indicate agreement. Sensitivity analysis is conducted at thresholds of 3 and 5 to assess impacts on class balance, ensuring robustness. The default threshold of 4 is selected, considering the lower Akaike Information Criterion (AIC) value [40].
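The thresholding rule and the accompanying class-balance check can be sketched as follows. The response values are invented for illustration and are not survey data; the function names are hypothetical.

```python
def binarize_likert(responses, threshold=4):
    """Code responses at or above the threshold as 1, otherwise 0."""
    return [1 if r >= threshold else 0 for r in responses]

def class_balance(labels):
    """Return (number of positives, number of negatives)."""
    positives = sum(labels)
    return positives, len(labels) - positives

# Hypothetical five-point Likert responses, not taken from the survey.
sample = [1, 2, 3, 4, 5, 4, 3, 5]

# Sensitivity of class balance to the threshold choice (3, 4 and 5).
for t in (3, 4, 5):
    print(t, class_balance(binarize_likert(sample, t)))
```

Lowering the threshold moves cases into the positive class and raising it moves them out, which is why the class balance is inspected at each candidate threshold.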

2.2.2. Feature Selection

Feature selection is performed using the Least Absolute Shrinkage and Selection Operator (LASSO) to identify relevant predictors, addressing the high dimensionality of the dataset (26 predictors). LASSO imposes an L1 penalty on logistic regression, shrinking irrelevant coefficients to zero [41]:

$$\min_{\beta} \left\{ -\frac{1}{n} \sum_{i = 1}^{n} \left[ y_{i} \log(p_{i}) + (1 - y_{i}) \log(1 - p_{i}) \right] + \lambda \sum_{j = 1}^{p} |\beta_{j}| \right\}, \tag{2}$$

where $p_{i} = \frac{1}{1 + e^{-(\beta_{0} + \beta_{1} X_{i1} + \dots + \beta_{p} X_{ip})}}$. The negative log-likelihood (log-loss) for logistic regression is averaged over n observations. The L1 penalty shrinks the coefficients, with $\lambda$ controlling the penalty strength. Ten-fold cross-validation was conducted to select the optimal $\lambda$, maximizing the Area Under the Curve (AUC). AUC is the probability that a randomly chosen positive instance ranks higher than a negative one, computed as the integral of the Receiver Operating Characteristic (ROC) curve [42,43]. LASSO’s ability to handle multicollinearity and select sparse models makes it ideal for high-dimensional datasets. Predictors with non-zero coefficients were retained for each dependent variable as highlighted in Table 2 [43,44]. Supply-side features were modelled as dependent variables, while demand drivers and built-environment factors were considered as independent variables.
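To make the two quantities in this subsection concrete, the sketch below evaluates an L1-penalized logistic log-loss and computes AUC directly from its pairwise-ranking definition. It only evaluates the objective and does not fit a model; the toy data, coefficient values, and function names are illustrative assumptions, not the study's data or code.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lasso_logistic_objective(y, X, beta0, beta, lam):
    """Average log-loss plus lam times the L1 norm of the slope coefficients."""
    log_loss = 0.0
    for yi, xi in zip(y, X):
        p_i = sigmoid(beta0 + sum(b * x for b, x in zip(beta, xi)))
        log_loss -= yi * math.log(p_i) + (1 - yi) * math.log(1.0 - p_i)
    return log_loss / len(y) + lam * sum(abs(b) for b in beta)

def auc_pairwise(y, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count one half."""
    pos = [s for label, s in zip(y, scores) if label == 1]
    neg = [s for label, s in zip(y, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical): four observations, two predictors.
y = [0, 0, 1, 1]
X = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]
beta0, beta = -1.5, [1.0, 0.0]

for lam in (0.0, 0.1):
    print(lam, round(lasso_logistic_objective(y, X, beta0, beta, lam), 4))

scores = [sigmoid(beta0 + sum(b * x for b, x in zip(beta, xi))) for xi in X]
print("AUC:", auc_pairwise(y, scores))
```

In a cross-validated search, the objective would be minimized for each candidate penalty strength and the value giving the highest held-out AUC retained.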

2.2.3. Sample Size and Category Balance Check

The sample size is evaluated to ensure sufficient events per parameter (EPP) for stable model estimates, given the relatively small sample size. An EPP ≥ 10 is recommended to avoid overfitting and ensure reliable estimates [45]. This threshold was evaluated after LASSO feature selection to account for the reduced number of predictors, aligning with recommendations for robust model estimation in smaller samples [46]. Imbalanced classes can bias machine learning models toward the majority class, reducing predictive accuracy for minority classes [47]. Category balance is assessed by calculating the minority class count, and a minimum of 20 minority cases is required to ensure adequately balanced classes.
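Both adequacy checks reduce to simple counts, as in the sketch below. It assumes that "events" means the size of the less frequent outcome class, a common convention that the source does not spell out; the label counts and number of retained predictors are hypothetical.

```python
def minority_count(labels):
    """Number of cases in the less frequent binary class."""
    positives = sum(labels)
    return min(positives, len(labels) - positives)

def events_per_parameter(labels, n_parameters):
    """EPP, taking the minority class count as the number of events (assumption)."""
    return minority_count(labels) / n_parameters

def adequacy_checks(labels, n_parameters, min_epp=10, min_minority=20):
    """Apply the EPP >= 10 and minority >= 20 thresholds described in the text."""
    return {
        "epp_ok": events_per_parameter(labels, n_parameters) >= min_epp,
        "balance_ok": minority_count(labels) >= min_minority,
    }

# Hypothetical outcome: 120 positives and 153 negatives, 8 predictors kept after LASSO.
labels = [1] * 120 + [0] * 153
print(events_per_parameter(labels, 8), adequacy_checks(labels, 8))
```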

2.2.4. Multicollinearity Check

Multicollinearity among predictors can inflate standard errors and destabilize model estimates. Variance Inflation Factors (VIF) are calculated to quantify multicollinearity as follows:

$$\mathrm{VIF}_{j} = \frac{1}{1 - R_{j}^{2}}, \tag{3}$$

where $R_{j}^{2}$ is the coefficient of determination from regressing predictor j on all other predictors, with VIF > 5 indicating redundancy [48].
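A minimal illustration of Equation (3) is given below for the simplified case of exactly two predictors, where the coefficient of determination from regressing one predictor on the other equals their squared Pearson correlation. With more predictors a full multiple regression is required; the data and function names here are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R^2); with two predictors, R^2 is the squared correlation."""
    r_squared = pearson_r(x1, x2) ** 2
    return 1.0 / (1.0 - r_squared)

x1 = [1.0, 2.0, 3.0, 4.0]
uncorrelated = [1.0, -1.0, -1.0, 1.0]   # zero correlation with x1
near_copy = [1.0, 2.0, 3.0, 5.0]        # almost collinear with x1

print(vif_two_predictors(x1, uncorrelated))          # no inflation
print(round(vif_two_predictors(x1, near_copy), 2))   # well above the cut-off of 5
```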

2.2.5. Linearity Check

The linearity assumption of logistic regression is tested by comparing the AIC of logistic regression (linear) versus GAM (non-linear). Lower GAM AIC or significant non-linear residual patterns (via loess smoothing) indicated non-linearity, justifying GAM’s use [49,50]. Loess smoothing fits a local regression to residuals, revealing non-linear patterns that violate logistic regression assumptions [51].

2.3. General Additive Model (GAM)

GAMs offer a robust framework for modelling complex, non-linear relationships between predictors and outcomes, extending Generalized Linear Models (GLMs) by incorporating smooth functions to capture non-linear effects (25). For binary outcomes, as analyzed in this study, GAMs employ a logistic link function to model the log-odds of the response variable as a sum of smooth functions of predictors. The logistic GAM is shown below:

$$g(\mu_{i}) = \log \frac{\mu_{i}}{1 - \mu_{i}} = \beta_{0} + \sum_{j = 1}^{p} f_{j}(X_{ij}), \tag{4}$$

where $\mu_{i} = P(Y_{i} = 1)$ is the expected probability of the binary outcome $Y_{i} \in \{0, 1\}$, $\beta_{0}$ is the intercept, and $f_{j}(X_{ij})$ are smooth functions for each predictor $X_{ij}$. These smooth functions enable GAMs to model non-linear relationships without assuming a specific functional form [52]. In this study, the smooth functions are implemented using cubic regression splines, which are piecewise cubic polynomials joined at knots, ensuring continuity up to the second derivative [53]. The basis dimension is set to k = 4, providing a balance between flexibility for capturing non-linear effects and robustness against overfitting, as recommended for moderate-sized datasets [54]. Each smooth function is expressed as

$$f_{j}(X_{j}) = \sum_{k = 1}^{k_{j}} \beta_{jk} b_{jk}(X_{j}), \tag{5}$$

where $b_{jk}(X_{j})$ are cubic spline basis functions, $\beta_{jk}$ are coefficients estimated during model fitting, and $k_{j} \approx 4$ is the number of basis functions, determined by the number of knots and boundary constraints. To prevent overfitting, GAM employed penalized regression, optimizing a penalized log-likelihood:

$$\text{Objective} = l(\beta, y) - \sum_{j = 1}^{p} \left( \lambda_{j} \int [f_{j}''(x)]^{2} \, dx + \lambda_{j}^{*} \int [f_{j}(x)]^{2} \, dx \right), \tag{6}$$

where $l(\beta, y)$ is the binomial log-likelihood:

$$l(\beta, y) = \sum_{i = 1}^{n} \left[ y_{i} \log(\mu_{i}) + (1 - y_{i}) \log(1 - \mu_{i}) \right]. \tag{7}$$

The first penalty term, $\lambda_{j} \int [f_{j}''(x)]^{2} \, dx$, controls the roughness (wiggliness) of each smooth function by penalizing the squared second derivative, and the second penalty term, $\lambda_{j}^{*} \int [f_{j}(x)]^{2} \, dx$, shrinks non-significant smooth terms to zero, facilitating automatic feature selection [55]. The smoothing parameters $\lambda_{j}$ and $\lambda_{j}^{*}$ are estimated using Restricted Maximum Likelihood (REML), balancing model fit and complexity [52]. The Effective Degrees of Freedom (EDF) quantifies the complexity of each smooth function after penalization. Defined as the trace of the influence matrix for each smooth term, $\mathrm{EDF}_{j} = \mathrm{trace}(F_{j})$, the EDF reflects the effective number of parameters used by $f_{j}$. An EDF near 1 indicates a nearly linear relationship, values greater than 1 indicate non-linearity, and values near 0 suggest the predictor has been penalized out of the model [52]. This study uses EDF to assess the non-linear contributions of predictors, while ensuring that non-significant terms are excluded to enhance model parsimony.
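The additive structure of the logistic GAM and the behaviour of the second-derivative penalty can be illustrated numerically. The sketch below is not how a GAM is fitted in practice: the two smooth functions are hand-picked stand-ins rather than estimated splines, and the integral is approximated with central differences and the midpoint rule. All names and values are assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def additive_probability(beta0, smooths, x):
    """Probability from an intercept plus one smooth function per predictor."""
    eta = beta0 + sum(f(x_j) for f, x_j in zip(smooths, x))
    return sigmoid(eta)

def roughness_penalty(f, lower, upper, steps=1000):
    """Approximate the integral of f''(x)^2 over [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.0
    for i in range(steps):
        x = lower + (i + 0.5) * h
        second_derivative = (f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2
        total += second_derivative ** 2 * h
    return total

def linear_effect(x):
    return 0.8 * x                 # straight line: no curvature

def curved_effect(x):
    return 0.5 * (x - 3.0) ** 2    # parabola: constant second derivative of 1

print(additive_probability(0.0, [linear_effect, curved_effect], [0.0, 3.0]))
print(roughness_penalty(linear_effect, 1.0, 5.0))   # approximately zero
print(roughness_penalty(curved_effect, 1.0, 5.0))   # approximately 4
```

A straight-line effect incurs essentially no roughness penalty, which matches the interpretation of an EDF near 1 as a nearly linear relationship, while curvature is what the penalty charges for.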

2.4. Extreme Gradient Boost (XGBoost)

XGBoost is an advanced, scalable implementation of Gradient Boosting Machines (GBMs), originally introduced by [56] as a method for constructing predictive models through an ensemble of weak learners, typically decision trees. XGBoost, developed by [57], enhances this framework by incorporating regularization terms to prevent overfitting, efficient tree construction algorithms for handling large datasets, and support for parallel computing. It is particularly suited for binary classification tasks with non-linear relationships, where interpretability and predictive accuracy are crucial. The model’s objective is to minimize a regularized loss function iteratively. XGBoost builds an additive ensemble of regression trees, where each tree corrects the residuals of the previous ones. For a binary classification task, the predicted probability $\hat{y}_{i}$ for the positive class for observation i is given by

$$\hat{y}_{i} = \sigma\left(\sum_{m = 1}^{M} f_{m}(x_{i})\right), \tag{8}$$

where $\sigma(z) = \frac{1}{1 + e^{-z}}$ is the sigmoid (logistic) function that maps the additive log-odds to a probability between 0 and 1, M is the total number of trees (boosting iterations), $x_{i}$ is the vector of selected predictors (post-LASSO features), and $f_{m}$ is the output of the m-th tree, a piecewise constant function defined over a partition of the feature space [56,57]. Each tree $f_{m}$ assigns a score (leaf weight) to each terminal node, and the final prediction aggregates these scores. This formulation extends generalized linear models by allowing flexible, non-parametric capturing of interactions and non-linearities.
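The additive prediction rule can be sketched with hand-written one-split trees (stumps). The thresholds and leaf weights below are invented for illustration and do not come from a fitted model; `make_stump` and `predict_proba` are hypothetical helper names.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_stump(feature, threshold, left_weight, right_weight):
    """A one-split tree: a piecewise constant function of a single feature."""
    def tree(x):
        return left_weight if x[feature] < threshold else right_weight
    return tree

def predict_proba(trees, x):
    """Sum the leaf weights of all trees (log-odds), then apply the sigmoid."""
    return sigmoid(sum(tree(x) for tree in trees))

# Two hypothetical stumps acting on features 0 and 1.
trees = [
    make_stump(feature=0, threshold=0.5, left_weight=-1.0, right_weight=1.0),
    make_stump(feature=1, threshold=2.0, left_weight=0.5, right_weight=-0.5),
]

print(round(predict_proba(trees, (1.0, 3.0)), 4))   # log-odds 1.0 - 0.5 = 0.5
print(round(predict_proba(trees, (0.0, 1.0)), 4))   # log-odds -1.0 + 0.5 = -0.5
```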

Training minimizes a regularized objective function that balances predictive accuracy (empirical risk) with model complexity to avoid overfitting:

$$L = \sum_{i = 1}^{n} l(y_{i}, \hat{y}_{i}) + \sum_{m = 1}^{M} \Omega(f_{m}), \tag{9}$$

where $l(y_{i}, \hat{y}_{i})$ is the loss function measuring prediction error, and $\Omega(f_{m})$ penalizes complex trees. For binary logistic regression, the loss is the binomial deviance (negative log-likelihood):

$$l(y_{i}, \hat{y}_{i}) = -\left[ y_{i} \log(\hat{y}_{i}) + (1 - y_{i}) \log(1 - \hat{y}_{i}) \right], \tag{10}$$

which penalizes confident wrong predictions more heavily. The regularization term $\Omega(f_{m})$ is

$$\Omega(f_{m}) = \gamma T + \frac{1}{2} \lambda \sum_{j = 1}^{T} w_{j}^{2} + \frac{1}{2} \alpha \sum_{j = 1}^{T} |w_{j}|, \tag{11}$$

where T is the number of leaves in the tree, $w_{j}$ is the weight (score) of the j-th leaf, $\gamma$ controls the minimum loss reduction required for a split (pruning), $\lambda$ applies L2 (ridge) regularization to shrink leaf weights, and $\alpha$ applies L1 (lasso) regularization to promote sparsity by setting some weights to zero [57]. This hybrid regularization prevents overfitting in high-dimensional settings, as justified by Tibshirani [41] for lasso penalties and Hoerl and Kennard [58] for ridge penalties.
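A direct evaluation of the regularized objective may help fix the notation. The sketch follows Equations (9) to (11) as written in the text, including the factor of one half on the L1 term; it evaluates the objective for given predictions and leaf weights rather than training anything, and all numeric values and function names are hypothetical.

```python
import math

def logistic_loss(y, p):
    """Binomial deviance for one observation with predicted probability p."""
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def tree_penalty(leaf_weights, gamma, lam, alpha):
    """gamma * T + 0.5 * lam * sum(w^2) + 0.5 * alpha * sum(|w|) for one tree."""
    n_leaves = len(leaf_weights)
    ridge = 0.5 * lam * sum(w * w for w in leaf_weights)
    lasso = 0.5 * alpha * sum(abs(w) for w in leaf_weights)
    return gamma * n_leaves + ridge + lasso

def regularized_objective(y, p, trees, gamma, lam, alpha):
    """Total loss over observations plus the penalty summed over all trees."""
    data_term = sum(logistic_loss(yi, pi) for yi, pi in zip(y, p))
    complexity = sum(tree_penalty(w, gamma, lam, alpha) for w in trees)
    return data_term + complexity

# One hypothetical two-leaf tree with weights +0.5 and -0.5.
print(tree_penalty([0.5, -0.5], gamma=1.0, lam=1.0, alpha=0.0))
print(tree_penalty([0.5, -0.5], gamma=1.0, lam=1.0, alpha=1.0))
print(round(regularized_objective([1, 0], [0.5, 0.5], [[0.5, -0.5]], 1.0, 1.0, 1.0), 4))
```

Each extra leaf costs gamma regardless of its weight, so a split has to reduce the loss by more than that amount to be worthwhile.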

With the objective defined, XGBoost optimizes it via second-order gradient boosting, approximating the loss with a Taylor expansion for efficiency. At iteration m, a new tree $f_{m}$ is added to minimize

$$L^{(m)} \approx \sum_{i = 1}^{n} \left[ g_{i} f_{m}(x_{i}) + \frac{1}{2} h_{i} f_{m}(x_{i})^{2} \right] + \Omega(f_{m}), \tag{12}$$

where $g_{i} = \frac{\partial l(y_{i}, \hat{y}_{i}^{(m - 1)})}{\partial \hat{y}_{i}^{(m - 1)}}$ is the first-order gradient (residual direction) and $h_{i} = \frac{\partial^{2} l(y_{i}, \hat{y}_{i}^{(m - 1)})}{\partial (\hat{y}_{i}^{(m - 1)})^{2}}$ is the second-order Hessian (curvature, weighting uncertain residuals less) [56]. For logistic loss, with the derivatives taken with respect to the log-odds score, $g_{i} = \hat{y}_{i}^{(m - 1)} - y_{i}$ and $h_{i} = \hat{y}_{i}^{(m - 1)} (1 - \hat{y}_{i}^{(m - 1)})$, where $\hat{y}_{i}^{(m - 1)}$ is the predicted probability after m − 1 trees. Tree construction uses an exact greedy algorithm to find splits maximizing gain:

$$\text{Gain} = \frac{1}{2} \left[ \frac{G_{L}^{2}}{H_{L} + \lambda} + \frac{G_{R}^{2}}{H_{R} + \lambda} - \frac{(G_{L} + G_{R})^{2}}{H_{L} + H_{R} + \lambda} \right] - \gamma, \tag{13}$$

where $G_{L}, G_{R}$ are summed gradients in left/right child nodes, and $H_{L}, H_{R}$ are summed Hessians [57]. Splits occur only if Gain > 0, ensuring parsimony. The learning rate $\eta$ shrinks tree contributions, $f_{m} \leftarrow \eta f_{m}$, controlling step size to prevent overshooting.
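As a worked instance of the split-gain formula, the sketch below computes per-observation gradients and Hessians for the logistic loss, taken with respect to the log-odds score (so the gradient is the predicted probability minus the label), and scores one candidate split. The four observations, the starting probability of 0.5, and the regularization values are illustrative assumptions.

```python
def grad_hess(y, p):
    """Gradient and Hessian of the logistic loss with respect to the log-odds."""
    return p - y, p * (1.0 - p)

def split_gain(g_left, h_left, g_right, h_right, lam, gamma):
    """Gain of splitting a node into left and right children."""
    def score(g, h):
        return g * g / (h + lam)
    parent = score(g_left + g_right, h_left + h_right)
    return 0.5 * (score(g_left, h_left) + score(g_right, h_right) - parent) - gamma

# Four hypothetical observations, all currently predicted at probability 0.5.
labels = [0, 0, 1, 1]
stats = [grad_hess(y, 0.5) for y in labels]

# Candidate split: the first two observations go left, the last two go right.
g_left = sum(g for g, _ in stats[:2])
h_left = sum(h for _, h in stats[:2])
g_right = sum(g for g, _ in stats[2:])
h_right = sum(h for _, h in stats[2:])

gain = split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0)
print(round(gain, 4))  # positive, so this split would be accepted
```

Because the split separates the two classes perfectly, the child gradient sums have opposite signs and the parent sum cancels, which is what produces the positive gain.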

Once the core optimization is established, hyperparameters are tuned to adapt the model to the specific dataset. This is performed via grid search over a predefined set of values: tree depth $\in \{3, 6\}$, which limits the maximum depth of each tree to control complexity and prevent overfitting by restricting the number of splits (lower depths yield simpler trees, reducing variance at the potential cost of bias); learning rate $\eta \in \{0.1, 0.3\}$, which scales the contribution of each tree to slow down learning and improve generalization; and number of boosting iterations $\in \{50, 100\}$, determining the ensemble size, where too few may underfit and too many may overfit [57].

Grid search operates by exhaustively evaluating all combinations in the Cartesian product of the parameter spaces, resulting in $2 \times 2 \times 2 = 8$ candidates in this case. For each combination $\theta$ = (tree depth, learning rate, number of boosting iterations), the model is trained and evaluated to find the $\theta^{*}$ that maximizes a performance metric. Mathematically, grid search solves

$$\theta^{*} = \arg\max_{\theta \in \Theta} M(\theta), \tag{14}$$

where $\Theta$ is the grid of hyperparameters, and $M(\theta)$ is the cross-validated performance metric [59]. This exhaustive approach is computationally feasible for small grids like this one but can be inefficient for larger spaces; however, it is empirically effective for GBMs in moderate datasets, as it avoids local optima common in random search [59]. To evaluate $\theta$, 5-fold cross-validation is employed, which partitions the training data into k = 5 equally sized folds to estimate out-of-sample performance and minimize overfitting [60]. During cross-validation, the mean test AUC across folds is evaluated. Early stopping is integrated, halting training if the validation AUC does not improve for 10 consecutive rounds to prevent overfitting and reduce computation [61]. The best parameters $\theta^{*}$ are those maximizing the cross-validation AUC. The final model is then trained on the full training set (70% of data) with the tuned hyperparameters. This nested approach balances the bias-variance trade-off in the data sample [62].

Feature importance in XGBoost quantifies each predictor’s contribution to the model. It yields three metrics: Gain (primary), Cover, and Frequency (Weight), aggregated across all trees. Gain measures the total improvement in loss from splits on a feature, reflecting its predictive power. Cover quantifies the relative number of observations affected by splits on the feature, averaged across trees, and Frequency (Weight) is the number of times the feature is used in splits across the ensemble. Gain was prioritized as it directly ties to model improvement, while Cover and Frequency provided complementary views.
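The grid enumeration and fold partitioning described in this subsection can be sketched as follows. The dictionary keys are illustrative labels, and `placeholder_metric` is an invented stand-in for the mean cross-validated AUC; it does not reproduce any result from the study.

```python
from itertools import product

# Candidate values as listed in the text; the key names are illustrative.
GRID = {"max_depth": [3, 6], "eta": [0.1, 0.3], "nrounds": [50, 100]}

def enumerate_grid(grid):
    """All combinations in the Cartesian product of the parameter lists."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in product(*(grid[n] for n in names))]

def grid_search(grid, metric):
    """Return the combination with the highest value of the supplied metric."""
    return max(enumerate_grid(grid), key=metric)

def k_fold_indices(n, k):
    """Assign observation indices 0..n-1 to k folds of near-equal size."""
    folds = [[] for _ in range(k)]
    for i in range(n):
        folds[i % k].append(i)
    return folds

def placeholder_metric(theta):
    """Hypothetical score standing in for the mean 5-fold test AUC."""
    return 0.8 - 0.01 * theta["max_depth"] - 0.1 * theta["eta"] + 0.0001 * theta["nrounds"]

candidates = enumerate_grid(GRID)
print(len(candidates))                       # 2 x 2 x 2 combinations
print(grid_search(GRID, placeholder_metric))
print([len(fold) for fold in k_fold_indices(10, 5)])
```

In a real run, the metric function would train the model on four folds, score it on the fifth, and average the AUC across the five rotations.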

Shapley Additive Explanation (SHAP) values decompose predictions into additive feature contributions, addressing the black-box nature of ensemble models [63]. Rooted in cooperative game theory [64], SHAP assigns each feature a value representing its marginal contribution to the prediction, averaged over all possible coalitions of features. For a prediction

ŷ_{i} = f (x_{i})

, SHAP decomposes it as

ŷ_{i} = φ_{0} + \sum_{j = 1}^{p} φ_{j, i},

(15)

where

φ_{0} = E [f (x)]

is the base value (expected model output over the dataset, often mean log-odds for logistic models), and

φ_{j, i}

is the SHAP value for feature j in instance i:

φ_{j, i} = \sum_{S \subseteq N \setminus \{j\}} \frac{|S|! (p - |S| - 1)!}{p!} [f_{S \cup \{j\}} (x_{S \cup \{j\}}) - f_{S} (x_{S})],

(16)

where

N = \{1, \dots, p\}

is the set of features,

S

is a subset (coalition), and

f_{S}

is the model trained on

S

(or approximated). This satisfies efficiency

(\sum φ_{j, i} = ŷ_{i} - φ_{0})

, symmetry, dummy (zero for non-contributing features), and additivity [63].
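To make Equation (16) concrete, the following sketch computes exact Shapley values by enumerating every coalition for a three-feature toy model and checks the efficiency and dummy properties. The toy model, the instance `x`, and the zero `baseline` used to represent absent features are all hypothetical assumptions; replacing absent features with a baseline is one common approximation of f_S, and practical SHAP implementations for tree ensembles use faster algorithms than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_values(value, p):
    """Exact Shapley values for a coalition value function over p features.

    value(S) takes a frozenset of feature indices and returns the model
    output when only the features in S are treated as present.
    """
    features = range(p)
    phi = []
    for j in features:
        others = [f for f in features if f != j]
        total = 0.0
        for size in range(len(others) + 1):
            # Weight |S|! (p - |S| - 1)! / p! from Equation (16).
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value(s | {j}) - value(s))
        phi.append(total)
    return phi

# Hypothetical toy model: additive in x0 and x1 plus an x0*x1 interaction;
# x2 is a dummy feature that never affects the output.
x = [1.0, 2.0, 5.0]
baseline = [0.0, 0.0, 0.0]  # assumed reference values for absent features

def toy_value(S):
    z = [x[i] if i in S else baseline[i] for i in range(3)]
    return 2.0 * z[0] + 1.0 * z[1] + z[0] * z[1]

phi = shapley_values(toy_value, 3)
phi_0 = toy_value(frozenset())          # base value (empty coalition)
y_hat = toy_value(frozenset(range(3)))  # prediction with all features present
```

Here the interaction term is split equally between the two interacting features, the dummy feature receives zero, and the contributions sum to the prediction minus the base value.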

2.5. Model Validation

Validation is essential to ensure that the models reliably predict user preferences. This study employs a held-out test set approach, computing a suite of metrics derived from the confusion matrix and estimating uncertainty through bootstrap resampling. Predictions on the test set are binarized using a 0.5 threshold (

ŷ_{i} = 1 \text{ if } Ṕ_{i} \geq 0.5)

, where

Ṕ_{i}

is the predicted probability from trained models. The resulting predictions yield a 2 × 2 confusion matrix C, tabulating true positives (TP: correct positive predictions), true negatives (TN: correct negative predictions), false positives (FP: incorrect positive predictions), and false negatives (FN: incorrect negative predictions) [65]. The performance metrics reported include AUC and the F1 score. AUC is the probability that a randomly chosen positive instance ranks higher than a randomly chosen negative one, computed as the area under the ROC curve, or equivalently via the Wilcoxon–Mann–Whitney statistic [42]. Precision is the proportion of positive predictions that are correct:

\mathrm{Precision} = \frac{TP}{TP + FP}.

(17)

Precision is critical in contexts where false positives carry significant costs, such as misclassifying non-agreement [66]. Sensitivity is the proportion of actual positives correctly identified:

\mathrm{Sensitivity} = \frac{TP}{TP + FN}.

(18)

Specificity complements sensitivity by focusing on the correct identification of non-agreement cases. It is the proportion of actual negatives correctly identified:

\mathrm{Specificity} = \frac{TN}{TN + FP}.

(19)

The F1 score balances precision and sensitivity, providing a single measure of performance that accounts for both false positives and false negatives. F1 score is the harmonic mean of precision and sensitivity:

\mathrm{F1} = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Sensitivity}}{\mathrm{Precision} + \mathrm{Sensitivity}}.

(20)
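As an illustration of Equations (17) to (20) and the rank-based view of AUC, the sketch below binarizes predicted probabilities at 0.5, tabulates the confusion matrix, and derives each metric. The labels and probabilities are made-up toy values, and the function names are hypothetical; this is not the study's evaluation code.

```python
def classification_metrics(y_true, prob, threshold=0.5):
    """Confusion-matrix metrics after binarizing probabilities at threshold."""
    y_pred = [1 if p >= threshold else 0 for p in prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn,
            "precision": precision, "sensitivity": sensitivity,
            "specificity": specificity, "f1": f1}

def auc_rank(y_true, prob):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a win (Wilcoxon-Mann-Whitney formulation).
    """
    pos = [p for t, p in zip(y_true, prob) if t == 1]
    neg = [p for t, p in zip(y_true, prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical): 4 positives and 4 negatives.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

metrics = classification_metrics(y_true, prob)
auc = auc_rank(y_true, prob)
```

Note that AUC is computed from the ranking of predicted probabilities and therefore does not depend on the 0.5 threshold, whereas the other four metrics do.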

To quantify uncertainty in model performance, bootstrap resampling with

B = 100

iterations is used to compute 95% Confidence Intervals (CIs) for accuracy and F1 scores. For a test set

\mathcal{D}_{\text{test}} = \{(x_{i}, y_{i})\}_{i = 1}^{n_{\text{test}}},

B

resamples

\mathcal{D}_{b}^{*}

are drawn with replacement, each of size

n_{\text{test}}

. For each resample

b

, the GAM and XGBoost models are refitted (or, for XGBoost, predictions are recomputed using the fitted model). This method assumes approximate normality of the bootstrap distribution for large

B

, providing reliable intervals without parametric assumptions [67].
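The resampling step can be sketched as follows. This is a minimal illustration under stated assumptions: it resamples fixed test-set predictions rather than refitting a model, and it forms a percentile interval, which may differ from the interval construction used in the study. The toy labels, the seed, and the function names are hypothetical.

```python
import random

def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def bootstrap_ci(y_true, y_pred, metric, n_boot=100, alpha=0.05, seed=0):
    """Percentile bootstrap CI for a metric on fixed test-set predictions."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        # Draw n indices with replacement to form one resample of the test set.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx],
                            [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * (n_boot - 1))]
    upper = stats[int(round((1 - alpha / 2) * (n_boot - 1)))]
    return lower, upper

# Toy test set (hypothetical): 40 cases, 8 of them misclassified.
y_true = [1] * 20 + [0] * 20
y_pred = [1] * 16 + [0] * 4 + [0] * 16 + [1] * 4

point = accuracy(y_true, y_pred)
ci = bootstrap_ci(y_true, y_pred, accuracy, n_boot=100)
```

The same function applies to the F1 score by passing a different `metric` callable; a larger `n_boot` would give smoother interval endpoints at higher computational cost.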

3. Results

Model estimation results of GAM significant predictor EDF values and XGBoost feature importance are presented in Figure 3 and Figure 4, respectively, for autonomous shuttle vehicle types (VT1–VT3) and operational scenarios (SO1–SO7) evaluated against suitable trip purposes (TP1–TP8), demographic groups (DG1–DG12), and land uses (LU1–LU6).

Small shuttles (VT1) in this study are defined as vehicles with a capacity of up to 6 passengers, designed for autonomous road-based transit. The GAM for small shuttles identified several significant predictors: work trips (EDF = 0.796), school trips (EDF = 0.999), tourists (EDF = 1.384), leisure travelers (EDF = 0.964), people with physical disabilities (EDF = 1.514), and university precincts (EDF = 0.909). The XGBoost model highlighted leisure travelers (Gain = 0.473) and university precincts (Gain = 0.253) as the most influential features, followed by medical trips (Gain = 0.090), people with physical disabilities (Gain = 0.065), special event trips (Gain = 0.039), tourists (Gain = 0.027), high-income individuals (Gain = 0.015), school trips (Gain = 0.014), work trips (Gain = 0.010), and residential neighbourhoods (Gain = 0.008).

Minibus shuttles (VT2) are defined as autonomous shuttles with a capacity of up to 20 passengers. GAM for minibus shuttle identified significant predictors: shopping trips (EDF = 1.819), tourists (EDF = 1.107), people with sensory disabilities (EDF = 2.317), and university precincts (EDF = 1.275). The XGBoost model highlighted leisure travelers (Gain = 0.275), tourists (Gain = 0.199), and industrial/business parks (Gain = 0.128) as the most influential features, followed by university students (Gain = 0.098), town centres (Gain = 0.065), university precincts (Gain = 0.058), shopping trips (Gain = 0.039), senior citizens (Gain = 0.030), low-income individuals (Gain = 0.027), people with sensory disabilities (Gain = 0.025), residential neighbourhoods (Gain = 0.014), special event trips (Gain = 0.011), working professionals (Gain = 0.009), leisure trips (Gain = 0.008), and medical trips (Gain = 0.007).

Standard-sized conventional autonomous bus (VT3) refers to vehicles that can carry up to 60 passengers for autonomous road-based transit. GAM for standard-sized conventional bus identified significant predictors: school trips (EDF = 0.834), emergency trips (EDF = 1.140), school children (EDF = 1.963), agricultural land areas (EDF = 0.711), tourist destinations (EDF = 1.960), and town centres (EDF = 1.837). The XGBoost model prioritized school children (Gain = 0.427) and school trips (Gain = 0.148), followed by industrial/business parks (Gain = 0.075), work trips (Gain = 0.066), emergency trips (Gain = 0.064), low-income individuals (Gain = 0.049), university students (Gain = 0.040), agricultural land areas (Gain = 0.032), tourist destinations (Gain = 0.029), town centres (Gain = 0.023), high-income individuals (Gain = 0.015), special event trips (Gain = 0.014), leisure trips (Gain = 0.008), and university precincts (Gain = 0.004).

The operational scenario: complete replacement of conventional buses with autonomous shuttle buses is denoted as SO1. GAM identified significant predictors for this scenario: school trips (EDF = 0.816), people with sensory disabilities (EDF = 0.677), agricultural land areas (EDF = 0.936), and town centres (EDF = 0.897). The XGBoost model highlighted people with physical disabilities (Gain = 0.208) and town centres (Gain = 0.142) as the most influential features, followed by agricultural land areas (Gain = 0.112), school trips (Gain = 0.106), emergency trips (Gain = 0.102), school children (Gain = 0.102), tourist destinations (Gain = 0.080), people with sensory disabilities (Gain = 0.057), special event trips (Gain = 0.053), and people with cognitive disabilities (Gain = 0.033).

The operational scenario SO2 refers to having autonomous shuttles as a connector to existing fixed-route bus services, where the autonomous road-based transit complements the traditional bus services. GAM identified significant predictors: working professionals (EDF = 0.854) and university precincts (EDF = 0.899). XGBoost mirrored these findings, highlighting working professionals (Gain = 0.829) and university precincts (Gain = 0.170) as the most influential features. Notably, fewer predictors were retained in these model fits, owing to the penalized splines and regularization. However, the models were cross-validated to minimize the risk of underfitting [41].

The operational scenario SO3 underscores the potential of autonomous shuttles to enhance the connectivity of long-distance road/rail-based services. School children (EDF = 0.856) and leisure travelers (EDF = 0.824) were GAM significant predictors. The XGBoost model prioritized leisure travelers (Gain = 0.508) and special event trips (Gain = 0.116) as the most influential features, followed by industrial/business parks (Gain = 0.098), people with physical disabilities (Gain = 0.066), university students (Gain = 0.061), low-income individuals (Gain = 0.053), shopping trips (Gain = 0.029), work trips (Gain = 0.021), school children (Gain = 0.018), town centres (Gain = 0.018), university trips (Gain = 0.006), and residential neighbourhoods (Gain = 0.001).

Autonomous shuttles operating as a private taxi service (SO4) function like traditional taxi services, providing on-demand, point-to-point transport. GAM identified several significant predictors: work trips (EDF = 1.413), university trips (EDF = 1.367), shopping trips (EDF = 0.999), school children (EDF = 0.795), senior citizens (EDF = 0.860), people with physical disabilities (EDF = 2.745), low-income individuals (EDF = 2.397), and high-income individuals (EDF = 0.999). The XGBoost model ranked work trips (Gain = 0.229) and people with physical disabilities (Gain = 0.206) as the most influential features, followed by university students (Gain = 0.079), industrial/business parks (Gain = 0.063), shopping trips (Gain = 0.052), low-income individuals (Gain = 0.047), university precincts (Gain = 0.043), school children (Gain = 0.038), medical trips (Gain = 0.038), tourists (Gain = 0.035), high-income individuals (Gain = 0.034), school trips (Gain = 0.032), residential neighbourhoods (Gain = 0.030), people with sensory disabilities (Gain = 0.021), senior citizens (Gain = 0.015), emergency trips (Gain = 0.014), town centres (Gain = 0.012), middle-income individuals (Gain = 0.002), and university trips (Gain = 0.001).

Accommodating autonomous shuttles as a multipurpose service (SO5) combines passenger transport with light freight delivery. Leisure trips (EDF = 0.852), working professionals (EDF = 1.392), university precincts (EDF = 0.909), and agricultural land areas (EDF = 0.859) were significant for GAM. The XGBoost model ranked working professionals (Gain = 0.314) and industrial/business parks (Gain = 0.254) as the most influential features, followed by university precincts (Gain = 0.139), agricultural land areas (Gain = 0.111), leisure travelers (Gain = 0.080), work trips (Gain = 0.044), leisure trips (Gain = 0.040), and middle-income individuals (Gain = 0.014).

The operational scenario SO6 refers to the integration of autonomous shuttles with other transport offerings such as Mobility-as-a-Service (MaaS), to provide seamless connectivity across modes. GAM identified significant predictors: work trips (EDF = 0.921), special event trips (EDF = 0.786), senior citizens (EDF = 2.674), leisure travelers (EDF = 1.730), and tourist destinations (EDF = 0.807). The XGBoost model ranked special event trips (Gain = 0.290) and leisure travelers (Gain = 0.252) as the most influential features, followed by senior citizens (Gain = 0.164), work trips (Gain = 0.1429), tourist destinations (Gain = 0.067), university students (Gain = 0.038), university precincts (Gain = 0.023), low-income individuals (Gain = 0.011), and industrial/business parks (Gain = 0.009).

The operational scenario for autonomous shuttles to operate 24/7 (SO7) refers to providing continuous service availability to users. GAM identified university precincts (EDF = 1.812) as the only significant predictor. The XGBoost model highlighted university precincts (Gain = 0.593) and working professionals (Gain = 0.219) as the most influential features, followed by leisure trips (Gain = 0.110), agricultural land areas (Gain = 0.052), low-income individuals (Gain = 0.012), and work trips (Gain = 0.011).

The combined swarm plot from the XGBoost model provides deeper insights into feature contributions of VT1, VT2, VT3, SO1, SO2, SO3, SO4, SO5, SO6 and SO7 (Figure 5a–j). For VT1, leisure travelers showed SHAP values ranging from approximately −0.944 to 0.595, indicating a significant and variable impact on predicting small shuttle needs. University precincts exhibited SHAP values from −1.131 to 0.641, with a median around −0.795, suggesting a generally slight negative influence on predictions, though its impact can be positive in some instances. Other features like work trips and school trips showed SHAP value ranges of −0.187 to 0.147 and −0.158 to 0.147, respectively, indicating moderate and context-dependent influences. For VT2, tourists showed SHAP values ranging from −0.521 to 0.543, indicating a significant and variable impact on predicting minibus shuttle needs. Leisure travelers exhibited SHAP values from −0.846 to 0.663, suggesting a substantial influence that can either increase or decrease the prediction. Shopping trips had SHAP values ranging from −0.462 to 0.474, with a median around 0.133, indicating a generally modest positive influence, though its impact varies. For VT3, school children showed SHAP values ranging from −0.879 to 1.153, indicating a substantial and variable impact on predicting standard-sized autonomous bus needs. School trips exhibited SHAP values from −0.499 to 0.750, with a median around 0.526, suggesting a generally positive influence. Agricultural land areas and tourist destinations showed SHAP values ranging from −0.194 to 0.145 and −0.208 to 0.101, respectively, indicating a mixed and relatively modest influence.

For SO1, people with physical disabilities showed SHAP values ranging from −0.659 to 0.723, indicating a significant and variable impact. School children exhibited SHAP values from −0.522 to 0.316, suggesting a mixed influence. School trips had SHAP values ranging from −0.600 to 0.644, with a median around 0.417, indicating a generally positive but variable influence. Special event trips ranged from −0.104 to 0.428, showing a modest and variable impact. Agricultural land areas displayed SHAP values from −0.422 to 0.426, indicating a mixed and moderate influence. For SO2, working professionals exhibited SHAP values ranging from −0.879 to 0.732, indicating a substantial and variable impact on predicting the need for connector services. University precincts showed SHAP values from −0.268 to 0.193, with a median around 0.015, suggesting a more modest and mixed influence.

The combined swarm plot for SO3 revealed leisure travelers with SHAP values ranging from −1.399 to 1.151, indicating a substantial and variable impact on the likelihood of needing long-distance connectors. Special event trips showed SHAP values from −1.285 to 1.322, suggesting a significant and highly variable influence tied to event schedules. School children had SHAP values from −0.519 to 0.193, indicating a moderate and mixed effect, with a tendency toward positive values (median around 0.056). The combined swarm plot for SO4 shows work trips with SHAP values ranging from −1.036 to 0.686, indicating a substantial and variable influence on the likelihood of autonomous shuttles serving as private taxis, with both positive and negative contributions depending on work trip patterns. People with physical disabilities have SHAP values from −0.747 to 1.161, underscoring their critical and highly variable role in driving accessibility needs. Low-income individuals show SHAP values from −0.670 to 0.216, suggesting a mixed effect with a tendency toward negative contributions (median around 0.087), reflecting variable demand among this group. Shopping trips range from −0.425 to 0.355, indicating a moderate and variable impact.

In the combined swarm plot for SO5, working professionals exhibited SHAP values ranging from −1.490 to 0.530, indicating a substantial and variable influence on the likelihood of needing multipurpose services, with both positive and negative contributions depending on professional activity patterns. Leisure trips showed SHAP values from −0.776 to 0.081, suggesting a moderate and variable influence, with a tendency toward negative contributions (median around 0.061). Agricultural land areas had SHAP values from −0.559 to 0.368, indicating a highly variable impact, while university precincts ranged from −0.329 to 1.122, reflecting a significant and diverse effect. For SO6, special event trips had SHAP values ranging from −0.558 to 0.559, indicating a strong and variable influence on integrated transport demand, particularly during events. Leisure travelers have SHAP values from −0.479 to 0.664, reflecting a substantial and variable role in tourist-heavy periods. Senior citizens show SHAP values from −0.541 to 0.607, indicating a highly variable impact, while work trips range from −0.449 to 0.399, suggesting a notable but mixed effect. For SO7, university precincts exhibited SHAP values ranging from −1.063 to 0.618, indicating a strong influence on the demand for 24/7 operations. Working professionals showed SHAP values from −0.273 to 0.208, suggesting a moderate and variable supporting role. Leisure trips had SHAP values from −0.395 to 0.157, indicating a minor and inconsistent impact, while agricultural land areas ranged from −0.116 to 0.390, reflecting a limited and variable effect.

Model validation metrics for GAM and XGBoost (AUC, F1 score, and bootstrap 95% CIs for accuracy and F1 score) are presented in Figure 6a–d.

For VT1, model performance metrics indicate robust predictive capability for both models. The GAM achieved an AUC of 0.909 and F1 Score of 0.816. The XGBoost model performed slightly better with an AUC of 0.913 and F1 Score of 0.838. Bootstrap 95% CIs for Accuracy were 0.855–0.947 for GAM and 0.835–0.910 for XGBoost, while for F1 Score, they were 0.821–0.934 for GAM and 0.801–0.894 for XGBoost. For VT2, model performance metrics indicate excellent predictive capability. The GAM achieved an AUC of 0.958 and F1 Score of 0.933. The XGBoost model performed slightly better with an AUC of 0.982 and F1 Score of 0.947. Bootstrap 95% CIs for Accuracy were 0.880–0.958 for GAM and 0.857–0.932 for XGBoost, while for F1 Score, they were 0.870–0.942 for GAM and 0.843–0.929 for XGBoost. For VT3, model performance metrics indicate strong predictive capability. The GAM achieved an AUC of 0.918 and F1 Score of 0.872. The XGBoost model performed slightly better with an AUC of 0.928 and F1 Score of 0.877. Bootstrap 95% CIs for Accuracy were 0.872–0.971 for GAM and 0.840–0.914 for XGBoost, while for F1 Score, they were 0.902–0.978 for GAM and 0.879–0.939 for XGBoost.

For SO1, model performance metrics indicate good predictive capability. The GAM achieved an AUC of 0.843 and F1 Score of 0.860. The XGBoost model performed better with an AUC of 0.909 and F1 Score of 0.899. Bootstrap 95% CIs for Accuracy were 0.789–0.890 for GAM and 0.809–0.886 for XGBoost, while for F1 Score, they were 0.849–0.921 for GAM and 0.863–0.923 for XGBoost. For SO2, model performance metrics indicate moderate predictive capability. The GAM achieved an AUC of 0.855 and F1 Score of 0.727. The XGBoost model showed an AUC of 0.842 and F1 Score of 0.630. Bootstrap 95% CIs for Accuracy were 0.752–0.840 for GAM and 0.739–0.826 for XGBoost, while for F1 Score, they were 0.651–0.807 for GAM and 0.665–0.789 for XGBoost. For SO3, model performance metrics indicate strong predictive capability. The GAM and XGBoost models achieved AUC values of 0.902 and 0.907, respectively, and both achieved an F1 score of 0.750. Bootstrap 95% CIs for accuracy were 0.833–0.927 for GAM and 0.823–0.895 for XGBoost, while for F1 score, they were 0.829–0.927 for GAM and 0.822–0.893 for XGBoost.

For SO4, model performance metrics indicate strong predictive capability. The GAM achieved an AUC of 0.870 and F1 Score of 0.828. The XGBoost model performed better with an AUC of 0.930, F1 Score of 0.889. Bootstrap 95% CIs for Accuracy were 0.839–0.971 for GAM and 0.833–0.916 for XGBoost, while for F1 Score, they were 0.842–0.974 for GAM and 0.835–0.919 for XGBoost. For SO5, model performance metrics indicate solid predictive capability. The GAM achieved an AUC of 0.859 and F1 Score of 0.785, while the XGBoost model performed similarly with an AUC of 0.858, F1 Score of 0.821. Bootstrap 95% CIs for Accuracy were 0.809–0.901 for GAM and 0.794–0.871 for XGBoost, while for F1 Score, they were 0.819–0.909 for GAM and 0.794–0.880 for XGBoost.

For SO6, model performance metrics indicate strong predictive capability. The GAM achieved an AUC of 0.940 and F1 Score of 0.880. The XGBoost model performed similarly with an AUC of 0.945, F1 Score of 0.872. Bootstrap 95% CIs for Accuracy were 0.809–0.901 for GAM and 0.798–0.890 for XGBoost, while for F1 Score, they were 0.777–0.886 for GAM and 0.771–0.876 for XGBoost. For SO7, model validation metrics indicate moderate predictive performance. The GAM achieved an AUC of 0.828 and F1 Score of 0.643. The XGBoost model showed improved performance with an AUC of 0.843 and F1 Score of 0.710. Bootstrap 95% CIs for Accuracy were 0.719–0.826 for GAM and 0.729–0.813 for XGBoost, while for F1 Score, they were 0.615–0.785 for GAM and 0.657–0.771 for XGBoost.

4. Discussion

This section provides a comprehensive discussion on the operational determinants for the different autonomous shuttle vehicle types (Section 4.1, Section 4.2 and Section 4.3) and operational scenarios (Section 4.4, Section 4.5, Section 4.6, Section 4.7, Section 4.8, Section 4.9 and Section 4.10).

4.1. Autonomous Shuttles as Small-Sized Shuttle

The findings underscore the potential of small autonomous shuttles to address specific mobility needs in targeted contexts. Both the GAM and XGBoost models identified leisure travelers and university precincts as key predictors of small shuttle suitability, with additional significant predictors in the GAM including work trips, school trips, tourists, and people with physical disabilities. The prominence of leisure travelers in both models, with a high feature importance gain of 0.473 in the XGBoost model and a significant EDF of 0.964 in the GAM, suggests that small shuttles are particularly well-suited for non-routine, flexible travel demands. This finding is consistent with research by Golbabaei and Yigitcanlar [19], who identified perceived usefulness and ease of use as critical drivers of autonomous shuttle adoption, particularly among users seeking convenient transport options. The variable impact of leisure travelers, as indicated by SHAP values ranging from −0.944 to 0.595, underscores the diverse needs of this group, which may require tailored shuttle services to meet specific demands.

Similarly, the significance of university precincts as a predictor, with a gain of 0.253 in the XGBoost model and an EDF of 0.909 in the GAM, supports the suitability of small shuttles in campus environments. Iclodean and Cordos [6] note that autonomous shuttles are frequently deployed in university campuses due to their controlled traffic conditions and predictable demand patterns, which facilitate the integration of autonomous technologies. The mixed impact, as shown by SHAP values from −1.131 to 0.641 with a median of −0.795, suggests that while university precincts are generally suitable, their effectiveness may vary depending on specific route characteristics or demand fluctuations. The GAM’s identification of work trips (EDF = 0.796) and school trips (EDF = 0.999) as significant predictors further highlights the potential of small shuttles for short, routine trips. These findings align with the literature emphasizing the role of autonomous shuttles in reducing urban congestion and enhancing mobility for regular, predictable journeys [68]. The moderate SHAP value ranges for work trips (from −0.187 to 0.147) and school trips (from −0.158 to 0.147) in the XGBoost model suggest that while these trip types contribute to shuttle suitability, their impact is context-dependent, possibly influenced by factors such as trip distance or time of day.

4.2. Autonomous Shuttles as Minibus-Sized Shuttles

The findings of the autonomous minibus shuttle underscore its potential as a versatile and inclusive mobility solution, particularly for shopping trips, tourist-related travel, and serving individuals with sensory disabilities, and in university precincts and industrial/business parks. The GAM and XGBoost models demonstrated exceptional predictive performance, with AUC scores of 0.958 and 0.982, respectively, and F1 scores of 0.933 and 0.947, indicating robust and reliable predictions. These results align with the growing body of literature on autonomous shuttle services, which highlights their efficacy in urban and controlled environments for diverse user groups [6,10]. The prominence of shopping trips in both models, with a significant EDF of 1.819 in GAM and a gain of 0.039 in XGBoost, suggests that minibus shuttles are well-suited for urban settings where shopping is a frequent activity. The SHAP values for shopping, ranging from −0.462 to 0.474 with a median around 0.133, indicate a generally positive but context-dependent influence. This aligns with research emphasizing the role of autonomous shuttles in providing last-mile connectivity to shopping districts, reducing reliance on private vehicles and alleviating urban congestion [69,70,71].

Tourists and leisure travelers emerged as key predictors, with tourists significant in both models (GAM: EDF = 1.107; XGBoost: Gain = 0.199) and leisure travelers leading in XGBoost (Gain = 0.275). The wide SHAP value ranges for tourists (−0.521 to 0.543) and leisure travelers (−0.846 to 0.663) suggest that their impact varies depending on factors such as tourist volume or leisure travel patterns. This variability is consistent with studies highlighting the flexibility of autonomous shuttles in adapting to sporadic demand [10,72]. The significance of university precincts in both models (GAM: EDF = 1.275; XGBoost: Gain = 0.058) supports the suitability of minibus shuttles for campus environments, where flexible and efficient transport is critical for students and staff. Autonomous shuttles have been tested in university settings globally, where they have demonstrated effectiveness in controlled environments with predictable demand patterns [20,73].

A critical finding is the significance of people with sensory disabilities in the GAM (EDF = 2.317), underscoring the importance of accessibility in autonomous shuttle design. However, its lower influence in the XGBoost model (Gain = 0.025) suggests that accessibility may not be the primary driver of shuttle suitability in all contexts. This finding is supported by literature emphasizing the transformative potential of autonomous vehicles for individuals with disabilities, particularly those with sensory impairments, but also highlighting the lack of specific accessibility standards [10,74]. The high predictive performance of both models, coupled with stable bootstrap confidence intervals, provides confidence in their ability to guide minibus shuttle deployment. However, the variable SHAP values for key predictors like tourists and leisure travelers highlight the need for context-specific deployment strategies to account for fluctuating demand. User acceptance studies suggest that operational improvements, such as increasing travel speeds and reducing abrupt braking, are critical to enhancing passenger satisfaction, particularly for tourists and leisure travelers who prioritize comfort and convenience [9,19]. The findings advocate for deploying autonomous minibus shuttles in mixed-use urban settings, where their higher capacity and flexibility can balance diverse mobility needs [75].

4.3. Autonomous Shuttles as Standard-Sized Conventional Buses

The findings for the standard-sized conventional bus highlight its critical role in public transportation, particularly for school trips and school children. Additionally, the GAM significance of town centres and tourist destinations underscores the pivotal role of standard-sized buses in providing safe, accessible, and efficient transport for large groups, particularly in high-demand scenarios [76,77,78]. The prominence of school trips and school children in both models emphasizes the essential function of standard buses in student transport. The significant influence of school children in the XGBoost model, with a gain of 0.427 and SHAP values ranging from −0.879 to 1.153, reflects the variable but substantial impact of school-related demand and parent acceptance. Similarly, school trips’ SHAP values from −0.499 to 0.750, with a median around 0.526, indicate a generally positive influence, reinforcing the consistent need for standard-sized autonomous buses for school trips.

The significance of emergency trips in the GAM (EDF = 1.140) and its notable presence in the XGBoost model (Gain = 0.064) suggest that standard buses are vital for emergency services, such as community evacuations or rapid response during crises. While not typically designed for emergency medical services, buses are often utilized in large-scale emergencies due to their high capacity and reliability. While buses have been deployed for disaster management over the years, the use of autonomous shuttles has also been proposed for evacuation from natural disasters such as hurricanes and bushfires [79,80]. However, autonomous buses should be deployed with care in such critical situations, as evacuees may not comply and unforeseen technological issues may arise.

The inclusion of agricultural land areas, tourist destinations, and town centres as significant predictors in the GAM, with modest SHAP values in the XGBoost model, indicates the versatility of standard-sized autonomous buses in serving diverse geographies. In rural areas, where transport infrastructure may be limited, autonomous buses can be utilized to serve various land uses. However, the modest SHAP values for agricultural land areas (−0.194 to 0.145) and tourist destinations (−0.208 to 0.101) suggest that their influence on individual predictions is less pronounced compared to school-related variables, indicating that deployment strategies should prioritize school transport while considering these areas as secondary contexts. The XGBoost model’s emphasis on industrial/business parks (Gain = 0.075) and work trips (Gain = 0.066) suggests additional suitability for serving employees in areas with limited public transport options, aligning with the role of standard buses in facilitating commuter travel. The presence of low-income individuals (Gain = 0.049) as a notable feature further underscores the importance of buses in providing affordable transport solutions, particularly in underserved communities.

4.4. Autonomous Shuttles Completely Replacing Conventional Buses

The findings suggest that replacing conventional buses with autonomous road-based transit is feasible in specific contexts, particularly in rural town centres, with a strong emphasis on serving school trips and passengers with sensory and physical disabilities. GAM and XGBoost models demonstrated good predictive performance, with AUC scores of 0.843 and 0.909, respectively, and F1 scores of 0.860 and 0.899. The prominence of rural town centres in both models, with a significant EDF of 0.897 in GAM and a gain of 0.142 in XGBoost, underscores the suitability of autonomous road-based transit in urban environments where demand is variable and congestion is a concern [81]. Autonomous shuttles in town centres should be integrated through real-time route optimization and fleet variation to respond to demand fluctuations, improving service efficiency.

Agricultural land areas also emerged as a significant predictor, with an EDF of 0.936 in GAM and a gain of 0.112 in XGBoost. The SHAP values for agricultural land areas, ranging from −0.422 to 0.426, suggest a mixed influence, indicating that the effectiveness of autonomous road-based transit in rural areas depends on specific demand patterns. Literature supports the use of Demand Responsive Transit (DRT) in low-density rural areas, where traditional bus services are often financially unsustainable due to sparse populations [82]. The significance of school trips in both models, with an EDF of 0.816 in GAM and a gain of 0.106 in XGBoost, highlights the potential of autonomous shuttles to serve educational transport needs. The SHAP values for school trips, ranging from −0.600 to 0.644 with a median around 0.417, indicate a generally positive but context-dependent influence.

Accessibility for passengers with disabilities is a critical finding, with people with sensory disabilities significant in GAM (EDF = 0.677) and people with physical disabilities leading in XGBoost (Gain = 0.208). The wide SHAP value range for people with physical disabilities (−0.659 to 0.723) underscores its variable but substantial impact, suggesting that physical accessibility is a key driver for autonomous shuttle adoption in certain contexts. However, the moderate influence of people with sensory disabilities in XGBoost (Gain = 0.057) suggests that sensory accessibility may require further technological advancements to fully meet user needs. Literature emphasizes the transformative potential of autonomous vehicles for individuals with disabilities, provided they incorporate inclusive design features [10,83].

The robust performance of the XGBoost model, with higher AUC and specificity (0.700 vs. 0.667 for GAM), indicates its effectiveness in identifying suitable contexts for autonomous shuttle replacement, which is crucial for cost-effective deployment. However, the moderate specificity of both models points to potential challenges in distinguishing unsuitable contexts, which could lead to over-reliance on autonomous shuttles in areas better served by conventional buses. This aligns with research advocating for hybrid transit models that combine conventional transit with autonomous transit [84,85,86]. Additionally, dynamic public transit systems have been proposed by researchers for optimized transit operations [87,88]. Transit performance depends on the autonomous shuttle penetration rate, which should be optimized against real-time passenger demand [15,89,90].
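To make the link between specificity and the risk of over-deployment concrete, the following minimal sketch derives sensitivity, specificity, precision, and F1 from a binary confusion matrix. The counts are hypothetical, chosen only so that specificity equals 0.700; they are not the study's confusion matrix.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return common binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of suitable contexts correctly identified
    specificity = tn / (tn + fp)  # share of unsuitable contexts correctly rejected
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts: 50 suitable and 20 unsuitable contexts.
metrics = classification_metrics(tp=45, fp=6, tn=14, fn=5)
```

In this illustration, 6 of the 20 unsuitable contexts are wrongly flagged as suitable even though F1 remains high, which is the pattern that could encourage over-reliance on autonomous shuttles.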

4.5. Autonomous Shuttles as a Connector to Existing Fixed-Route Bus Services

The findings suggest that autonomous shuttles have significant potential to serve as a connector to fixed-route bus services, particularly for working professionals and in university precincts. GAM and XGBoost models identified these predictors as the most influential, highlighting the specific user group and location where autonomous shuttles can enhance first-mile and last-mile (FMLM) connectivity. The moderate performance metrics, with AUC scores around 0.85 and F1 scores between 0.63 and 0.73, indicate that while autonomous shuttles show promise, challenges remain in accurately predicting their suitability across diverse contexts, likely due to variable demand patterns in university settings.

The prominence of working professionals in both models, with a high gain of 0.829 in XGBoost and an EDF of 0.854 in GAM, underscores their critical role in driving the need for connector services. The SHAP values for working professionals, ranging from −0.879 to 0.732, suggest a substantial but variable impact, likely reflecting the diverse commuting patterns of professionals in rural areas. University precincts also emerged as a significant predictor, with an EDF of 0.899 in GAM and a gain of 0.170 in XGBoost. The SHAP values for university precincts, ranging from −0.268 to 0.193 with a median around 0.015, indicate a more modest and mixed influence, suggesting that while university precincts are relevant, their impact varies depending on specific campus characteristics, such as size, layout, or transit hub proximity. This aligns with the existing research findings, as many researchers have suggested the applicability of autonomous shuttles to solve the FMLM problem [16,18,29,90,91,92,93,94].

4.6. Autonomous Shuttles as a Connector to Longer Distance Services

GAM and XGBoost models demonstrated strong predictive performance, with AUC scores of 0.902 and 0.907, respectively, and F1 scores of 0.750, indicating reliable predictions for autonomous shuttle deployment in this context. The high feature importance of leisure travelers in the XGBoost model (Gain = 0.508) and its substantial SHAP values (−1.399 to 1.151) highlight the critical and variable impact of leisure travel, while special event trips’ SHAP values (−1.285 to 1.322) emphasize the significant influence of event-related demand. The prominence of leisure travelers in both models suggests that autonomous shuttles are well-suited to meet the flexible and sporadic travel needs of this group, who often require transport to access long-distance services in tourist-heavy areas or during off-peak times. Research has clearly identified that passenger quality of service can be significantly improved through autonomous shuttle operations, through improved availability, accessibility, reliability and comfort [89,95,96].

Special events trips, with a notable gain of 0.116 in the XGBoost model and highly variable SHAP values, indicate that autonomous shuttles can effectively serve event attendees, whose travel demand is often concentrated around specific times and locations. The moderate influence of school children, with an EDF of 0.856 in GAM and SHAP values ranging from −0.519 to 0.193, suggests that while school-related travel contributes to the need for long-distance connectors, its impact is less dominant compared to leisure and special event travel. The findings advocate for prioritizing autonomous shuttle services in tourist-heavy routes and special events trips to reduce reliance on private vehicles and support sustainable transport goals.

4.7. Autonomous Shuttles as a Private Taxi Service

The findings highlight the versatility of autonomous shuttles when operated as a private taxi service, evidenced by the broad range of significant predictors identified by both the GAM and the XGBoost models. The prominence of work trips and people with physical disabilities in both models underscores the critical role that autonomous shuttles can play in serving daily commuting needs and enhancing accessibility for individuals with mobility challenges. The high feature importance of work trips in both models, with an EDF of 1.413 in GAM and a gain of 0.229 in XGBoost, suggests that autonomous shuttles operating as private taxis are particularly well-suited for commuters requiring flexible, on-demand transport. The SHAP values for work trips, ranging from −1.036 to 0.686, indicate a substantial but variable influence, likely reflecting diverse commuting patterns across remote and rural town centres. Accessibility for people with physical disabilities is a critical driver, as evidenced by its high EDF of 2.745 in GAM and gain of 0.206 in XGBoost, with SHAP values ranging from −0.747 to 1.161. This variability underscores the diverse mobility needs within this demographic, which may include requirements for low-floor vehicles, automated boarding systems, or real-time assistance [10].

Low-income individuals also emerged as a significant predictor, with an EDF of 2.397 in GAM and a gain of 0.047 in XGBoost, though their SHAP values (−0.670 to 0.216, median around 0.087) raise concerns about affordability. The mixed influence of low and middle-income individuals indicates that financial barriers may limit adoption unless addressed through subsidies or dedicated services, as suggested by the study’s policy recommendations. On a similar note, high-income individuals contribute to demand, with an EDF of 0.999 in GAM and a gain of 0.034 in XGBoost, reflecting their preference for convenient, on-demand transport options, as noted in studies on consumer willingness to adopt autonomous taxis [97,98]. The inclusion of other demographics, such as school children and senior citizens, and of trip purposes such as university trips and shopping trips, highlights the broad applicability of autonomous shuttles as private taxis. Autonomous taxis could reduce private car costs by up to 16%, especially for short trips and/or in residential areas far from stations, with larger reductions in low-density metropolitan areas [99].

4.8. Autonomous Shuttles as Multipurpose Services

The findings from this study underscore the potential of autonomous road-based transit to operate as multipurpose services, integrating passenger transport and parcel delivery, particularly for working professionals in university precincts, industrial/business parks, and agricultural land areas. GAM and XGBoost models demonstrated solid predictive performance, with AUC scores of 0.859 and 0.858, respectively, and F1 scores of 0.785 and 0.821, indicating reliable predictions for the operational scenario. The prominence of working professionals in both models, with a high EDF of 1.392 in GAM and a gain of 0.314 in XGBoost, suggests high acceptance of multipurpose services among this demographic, likely reflecting its high acceptance of technology. The wide SHAP value range for working professionals (−1.490 to 0.530) reflects broad acceptance among this demographic, but also potential disagreement over prioritization between passengers and parcels. Operations should therefore be optimized so that passengers do not experience extended waiting times. Priority should always be given to passenger transport, with freight delivery scheduled to offset empty vehicle kilometres.
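The passenger-first principle described above can be illustrated with a simple priority queue. This is only a sketch under stated assumptions: the study does not specify a dispatching algorithm, and the `Request` class, the priority constants, and the request labels are all hypothetical. A real deployment would likely rely on a full routing and scheduling optimization instead.

```python
import heapq
from dataclasses import dataclass, field

# Lower value is served first (assumption: strict passenger priority).
PASSENGER, PARCEL = 0, 1


@dataclass(order=True)
class Request:
    priority: int
    requested_at: int  # minutes since start of service
    label: str = field(compare=False)


def dispatch_order(requests):
    """Serve passengers before parcels; within each class, earliest request first."""
    heap = list(requests)
    heapq.heapify(heap)
    return [heapq.heappop(heap).label for _ in range(len(heap))]


# Hypothetical mixed queue of passenger and parcel requests.
queue = [
    Request(PARCEL, 5, "parcel-A"),
    Request(PASSENGER, 12, "passenger-1"),
    Request(PARCEL, 2, "parcel-B"),
    Request(PASSENGER, 8, "passenger-2"),
]
order = dispatch_order(queue)
# order == ["passenger-2", "passenger-1", "parcel-B", "parcel-A"]
```

Strict priority of this kind keeps passenger waiting times low but can delay parcels indefinitely during busy periods, so parcel delivery is best assigned to otherwise empty vehicle kilometres.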

University precincts also emerged as a key predictor, with an EDF of 0.909 in GAM and a gain of 0.139 in XGBoost. The SHAP values for university precincts (−0.329 to 1.122) indicate a significant and diverse impact that leans positive, reflecting the higher acceptability of multipurpose services in this land use. Industrial/business parks, with a high gain of 0.254 in XGBoost, highlight the potential for autonomous shuttles to serve both employee commuting and logistics needs in commercial areas. Agricultural land areas, significant in both models (EDF = 0.859 in GAM; Gain = 0.111 in XGBoost), indicate autonomous shuttles’ potential to improve rural mobility and logistics. The SHAP values for agricultural land areas (−0.559 to 0.368) suggest a variable impact, likely due to the complex nature of these land uses and the resulting difficulty autonomous shuttles may have in navigating them effectively.

4.9. Autonomous Shuttles Integrated with Other Transport Offerings

The findings highlight the potential of autonomous shuttles to enhance mobility within integrated transport ecosystems, such as MaaS, particularly for special events, leisure travelers, and senior citizens in tourist destinations. Work trips also play a significant role, reflecting the versatility of autonomous shuttles in supporting diverse mobility needs. Given the prominence of ADRT as a service type for autonomous shuttles, their further integration into MaaS aligns with the existing literature on improving connectivity and accessibility. The strong performance of both the GAM and XGBoost models, with AUC scores of 0.940 and 0.945, respectively, and F1 scores around 0.88, indicates reliable predictions for autonomous shuttle deployment in this context.

The high feature importance of special events trips in the XGBoost model (Gain = 0.290) and its significant EDF of 0.786 in GAM, combined with SHAP values ranging from −0.558 to 0.559, suggest that autonomous shuttles can be offered with other transport offerings for special events. Leisure travelers are a key driver, with an EDF of 1.730 in GAM and a gain of 0.252 in XGBoost, and SHAP values from −0.479 to 0.664 indicating a substantial and variable impact. This reflects the dynamic travel patterns of tourists, who often require flexible transport to access destinations integrated with broader transport networks. MaaS plays a pivotal role in providing seamless, multimodal travel options for tourists, with autonomous shuttles serving as a critical component for last-mile connectivity in tourist-heavy areas. Senior citizens also emerged as a significant predictor, with a high EDF of 2.674 in GAM and a gain of 0.164 in XGBoost, with SHAP values from −0.541 to 0.607 indicating variable but notable influence. The integration of autonomous shuttles within MaaS can further support seniors by offering door-to-door services coordinated with other transport modes.

Work trips, with an EDF of 0.921 in GAM and a gain of 0.1429 in XGBoost, suggest that autonomous shuttles can support daily commuting within integrated transport systems. The SHAP values for work trips (−0.449 to 0.399) indicate a mixed effect, reflecting variable commuting patterns. Integrating autonomous shuttles within MaaS is expected to improve service efficiency, reducing congestion and thereby emissions.

4.10. Autonomous Shuttles to Operate 24/7

The findings highlight the potential of autonomous shuttles to provide continuous service, with university precincts emerging as the most significant predictor. The GAM identified university precincts as the sole significant predictor (EDF = 1.812), while the XGBoost model reinforced this with a Gain of 0.593, and added working professionals (Gain = 0.219) as a secondary factor. The moderate performance metrics, with AUC scores of 0.828 (GAM) and 0.843 (XGBoost) and F1 scores of 0.643 and 0.710, respectively, suggest challenges in predicting continuous operation needs, likely due to fluctuating demand outside peak hours. The dominance of university precincts in both models, with SHAP values ranging from −1.063 to 0.618, underscores the unique suitability of campuses for 24/7 autonomous shuttle operations.

Leisure trips (Gain = 0.110) and agricultural land areas (Gain = 0.052) showed minor and variable impacts, with SHAP values indicating inconsistent demand (leisure trips: −0.395 to 0.157; agricultural land areas: −0.116 to 0.390). This suggests limited need for 24/7 operations in these contexts, likely due to lower nighttime demand. Economically, 24/7 operations face challenges due to high operational costs during low-demand periods. An optimized model that integrates multipurpose services during low-demand periods is required to maintain operational efficiency. In addition, public acceptance and safety are critical considerations for 24/7 operations, particularly in less controlled environments.
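The relative weight of the four XGBoost predictors reported for 24/7 operation can be seen by ranking them and accumulating their Gain values, as in the sketch below. The Gain figures are those quoted in this section; treating them as shares of a total of one is an assumption, since the section does not state how Gain was normalised, and `cumulative_gain` is an illustrative helper.

```python
# Gain values as reported in this section for the 24/7 operation scenario.
reported_gain = {
    "university precincts": 0.593,
    "working professionals": 0.219,
    "leisure trips": 0.110,
    "agricultural land areas": 0.052,
}


def cumulative_gain(gains):
    """Rank features by Gain (descending) and return the running total."""
    ranked = sorted(gains.items(), key=lambda item: item[1], reverse=True)
    running = 0.0
    result = []
    for name, gain in ranked:
        running += gain
        result.append((name, round(running, 3)))
    return result


ranking = cumulative_gain(reported_gain)
# ranking[1] == ("working professionals", 0.812)
```

If the Gain values are indeed normalised shares, university precincts and working professionals together would account for about 0.81 of the total, which is consistent with the limited role attributed to leisure trips and agricultural land areas.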

5. Policy Implications

Based on the empirical findings from the study, policy recommendations are provided for effective implementation of autonomous road-based transit in rural settings (Table 3).

6. Conclusions

This study provides a comprehensive empirical investigation into the operational determinants of autonomous shuttle systems in rural and peri-urban communities of South East Queensland, addressing a critical gap in the literature on autonomous road-based transit in rural networks. By employing a structured survey and adopting an analytical approach through GAM and XGBoost, the research explains user preferences for autonomous shuttle types and service offerings across diverse demographic groups, trip purposes, and land-use contexts. Small shuttles are optimally suited for flexible, non-routine transport operations, offering adaptability to diverse travel demands. In contrast, minibus shuttles are preferred for their critical role in facilitating FMLM connectivity, effectively bridging gaps in public transport networks. Standard-sized buses remain the preferred choice for high-capacity transport requirements, particularly for school-related trips, due to their ability to accommodate large passenger volumes efficiently. Furthermore, these buses are recognized for their utility in providing disaster management support in regional areas.

While the complete replacement of conventional buses with autonomous shuttles is not widely supported, a hybrid model integrating traditional buses with autonomous shuttles is increasingly advocated. Autonomous shuttles are particularly valued for their capacity to address FMLM connectivity challenges, serving as vital connectors to fixed-route and long-distance transport services. The emergence of autonomous shuttles as private taxi services has been noted, though this development raises equity concerns, particularly for transport-disadvantaged groups, who may face barriers to accessing such services. For cost-effective operations, autonomous shuttles can be deployed as multipurpose services, particularly within industrial and business parks. This approach enables the simultaneous transport of passengers and freight, optimizing vehicle utilization during otherwise empty vehicle kilometres. While cost concerns associated with 24/7 operations persist, these can be mitigated by adopting a multipurpose service framework. Moreover, integrating autonomous shuttles within a MaaS ecosystem enhances operational efficiency and contributes to reduced emissions, aligning with broader sustainability objectives.

7. Limitations and Future Research Directions

This study offers a robust methodology to analyze operational determinants for rural autonomous road-based transit, but several methodological limitations warrant consideration. First, the sample size of 357 responses, reduced after cleaning, may not fully represent the diverse preferences of rural populations, and convenience random sampling via Qualtrics risks selection bias, particularly for those with limited digital access. This was evidenced by comparing key demographic characteristics of the sample with Australian Bureau of Statistics data for the target area. Second, the focus on South East Queensland restricts generalizability to other rural regions with distinct socio-economic or infrastructural contexts. Third, reliance on self-reported Likert-scale data, binarized at ≥4, may oversimplify nuanced preferences, potentially obscuring subtle variations. Fourth, key predictors like technology trust, safety perceptions, and infrastructure readiness were omitted, despite their relevance to adoption. Fifth, the trade-off between XGBoost’s complexity and GAM’s interpretability may limit stakeholder accessibility, and GAM’s fixed knots may miss complex interactions. Sixth, the cross-sectional design fails to capture dynamic demand or evolving technology acceptance. Seventh, accessibility and equity for people with disabilities and low-income groups were not fully explored, particularly regarding specific technological needs or affordability barriers. Finally, the smaller sample size heightens overfitting risks, especially for XGBoost, despite LASSO and cross-validation efforts. Given the modest sample size, the study’s findings should be interpreted as strong exploratory evidence and as testable premises for future investigation.
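The information lost through binarization at ≥4 can be illustrated with a short sketch. It assumes responses are coded from 1 to 5 with 5 as the most favourable answer; this coding, the function name, and the sample responses are hypothetical.

```python
def binarize_likert(score, threshold=4):
    """Map a 1-5 Likert score to 1 (favourable) or 0 (not favourable)."""
    if not 1 <= score <= 5:
        raise ValueError("Likert score must be between 1 and 5")
    return 1 if score >= threshold else 0


# Hypothetical responses from six participants.
responses = [1, 3, 4, 5, 3, 2]
binary = [binarize_likert(r) for r in responses]
# binary == [0, 0, 1, 1, 0, 0]
```

After binarization, a response of 1 and a response of 3 fall in the same class, as do responses of 4 and 5, which is the kind of nuance the third limitation refers to.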

To address the study limitations and advance autonomous public transport in rural settings, future research should pursue several directions. Longitudinal studies are needed to track evolving preferences, capturing seasonal and technological changes through panel surveys or repeated cross-sectional designs. Expanding to diverse global rural regions will enhance generalizability, comparing areas with varying infrastructure and socio-economic profiles. Incorporating predictors like trust, safety, and infrastructure readiness via mixed-method approaches will deepen understanding of adoption barriers. Detailed analyses of accessibility and equity should explore design requirements for people with disabilities and affordability solutions, including subsidy cost–benefit analyses. Real-world pilots and agent-based simulations can validate findings, optimize fleet management, and test hybrid models. Advanced analytical methods can improve accuracy and interpretability. Crucially, further research must place greater emphasis on verifying the research premises established here, which requires a sample size of N > 400 [100]. Finally, developing policy frameworks with stakeholders to address regulatory, safety, and ethical issues will ensure equitable, scalable deployment, informing sustainable transport solutions for rural communities.

Author Contributions

Conceptualization, S.J., A.B. and J.B.; methodology, S.J., A.B. and J.B.; software, S.J.; validation, S.J. and J.B.; formal analysis, S.J.; investigation, S.J.; resources, J.B.; data curation, S.J.; writing—original draft preparation, S.J.; writing—review and editing, S.J., A.B. and J.B.; visualization, S.J.; supervision, A.B. and J.B.; project administration, J.B. All authors have read and agreed to the published version of the manuscript.

Funding

The first author gratefully acknowledges the scholarship support provided by the QUT to carry out this PhD research.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the QUT Human Research Ethics Committee (Approval number: 9083-HE09, Date: 13 September 2024).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

De-identified data is available upon request.

Acknowledgments

During the preparation of this manuscript, the author(s) used Grok 3.5I and Copilot GPT-4 for the purposes of text editing. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ADRT	Autonomous Demand Responsive Transit
AIC	Akaike Information Criterion
AUC	Area Under the Curve
DRT	Demand Responsive Transit
EDF	Effective Degrees of Freedom
EPP	Events Per Parameter
FMLM	First-Mile/Last-Mile
GAM	Generalized Additive Models
GBMs	Gradient Boosting Machines
GLMs	Generalized Linear Models
LASSO	Least Absolute Shrinkage and Selection Operator
MaaS	Mobility-as-a-Service
REML	Restricted Maximum Likelihood
ROC	Receiver Operating Characteristic
SEQ	South East Queensland
SHAP	Shapley Additive Explanations
TAFE	Technical and Further Education
VIF	Variance Inflation Factors
XGBoost	Extreme Gradient Boosting

Appendix A. Questionnaire Survey

Section A—Questions about yourself and your household
1. What best describes your gender identity? (a) Man (b) Woman (c) Non-binary (d) Prefer not to say (e) Other (Please describe)
2. Age group—Choose one answer that best describes your current age: (a) Below 18 (b) 19–35 (c) 36–50 (d) 51–65 (e) 66–80 (f) 81 or higher
3. Occupational status—choose one answer that best describes your current status: (a) Working full-time (b) Working part-time or casually (c) Full-time student (d) Part-time student (e) Unemployed (f) Retired (g) Combination of the above (please define)
4. Please choose your highest level of completed education (if you completed your education outside of Australia, please choose the nearest equivalent option). (a) No formal education attained (b) Year 10 (c) Year 12 (d) Trade apprenticeship/TAFE (e) Undergraduate Degree (f) Post-graduate Degree
5. In which of the following ranges does your total annual household income fall? (a) Negative income/Nil income (b) $1–$15,599 (c) $15,600–$31,199 (d) $31,200–$51,999 (e) $52,000–$77,999 (f) $78,000–$103,999 (g) $104,000 or more (h) Prefer not to answer
6. What is your postcode?
7. How many people live in your household (including yourself)? (a) Under 18 years (b) 18 to 64 years (c) Over 65 years
8. Do you have any disabilities that affect your mobility? (a) Yes (b) No (c) Prefer not to answer
9. Do you have a valid driver’s license? (a) Yes (b) No (c) Prefer not to answer
10. How many vehicles does your household have?
Section B—Questions about suitability of autonomous road-based transit
11. How familiar were you with autonomous (driverless) shuttles before participating in this survey? (a) Very familiar (b) Somewhat familiar (c) Not familiar
12. Have you ever ridden in an autonomous vehicle of any kind? (a) Yes (b) No
13. To what extent do you agree or disagree that autonomous shuttles are suitable for different types of people?
	Extremely suitable	Very suitable	Moderately suitable	Slightly suitable	Not at all suitable
Autonomous shuttles are suitable for school children
Autonomous shuttles are suitable for university students
Autonomous shuttles are suitable for working professionals
Autonomous shuttles are suitable for senior citizens
Autonomous shuttles are suitable for tourists
Autonomous shuttles are suitable for leisure travellers
Autonomous shuttles are suitable for people with physical disabilities (e.g., mobility impairments)
Autonomous shuttles are suitable for people with sensory disabilities (e.g., visually impaired, hard of hearing)
Autonomous shuttles are suitable for people with cognitive disabilities (e.g., learning disabilities, intellectual disabilities)
Autonomous shuttles are suitable for low-income individuals
Autonomous shuttles are suitable for middle-income individuals
Autonomous shuttles are suitable for high-income individuals
14. To what extent do you agree or disagree that autonomous shuttles are suitable for different types of areas?
	Extremely suitable	Very suitable	Moderately suitable	Slightly suitable	Not at all suitable
Autonomous shuttles are suitable for residential neighbourhoods
Autonomous shuttles are suitable for industrial/ business parks
Autonomous shuttles are suitable for university precincts
Autonomous shuttles are suitable for agriculture land areas
Autonomous shuttles are suitable for tourist destinations
Autonomous shuttles are suitable for town centres
15. To what extent do you agree or disagree that autonomous shuttles are suitable for different types of trips?
	Extremely suitable	Very suitable	Moderately suitable	Slightly suitable	Not at all suitable
Autonomous shuttles are suitable for work trips
Autonomous shuttles are suitable for school trips
Autonomous shuttles are suitable for university trips
Autonomous shuttles are suitable for shopping trips
Autonomous shuttles are suitable for medical trips
Autonomous shuttles are suitable for leisure trips
Autonomous shuttles are suitable for emergency trips
Autonomous shuttles are suitable for special events or gatherings
16. To what extent do you agree or disagree that vehicle types are suitable for autonomous shuttle operations?
	Extremely suitable	Very suitable	Moderately suitable	Slightly suitable	Not at all suitable
Small shuttles (capable of carrying up to 8 passengers) will be suitable for autonomous operations
Minibus shuttles (capable of carrying 8–15 passengers) will be suitable for autonomous operations
Standard sized, conventional buses (capable of carrying up to 60 passengers) will be suitable for autonomous operations
17. To what extent do you agree or disagree with the following statements in relation to autonomous operations?
	Strongly agree	Agree	Neither	Disagree	Strongly disagree
Autonomous shuttles could completely replace conventional buses
Autonomous shuttles could operate as a connector to existing fixed route bus services
Autonomous shuttles could operate as a connector to longer distance services (e.g., coach, train)
Autonomous shuttles could operate as private taxi services (including uber/didi style operations)
Autonomous shuttles could accommodate as a multipurpose service, with both passenger transport and light freight (parcel) delivery
Autonomous shuttles could be integrated with other transport offerings
I would expect autonomous shuttles to operate 24/7
I prefer fixed route bus services over autonomous shuttle services

References

Mitchell, R.; Chong, S. Comparison of injury-related hospitalised morbidity and mortality in urban and rural areas in Australia. Rural. Remote Health 2010, 10, 123–133.
CARRS-Q. Rural & Remote Road Safety: Centre for Accident Research & Road Safety—Queensland (CARRS-Q). 2021. Available online: https://research.qut.edu.au/carrsq/wp-content/uploads/sites/296/2021/12/Rural-remote-road-safety.pdf (accessed on 27 July 2025).
Xi, H.; Nelson, J.D.; Mulley, C.; Hensher, D.A.; Ho, C.Q.; Balbontin, C. Barriers towards enhancing mobility through integrated mobility services in a regional and rural context: Insights from suppliers and organisers. Transp. Policy 2025, 171, 282–295.
Hansson, J.; Pettersson, F.; Svensson, H.; Wretstrand, A. Preferences in regional public transport: A literature review. Eur. Transp. Res. Rev. 2019, 11, 38.
Asgharpour, S.; Askari, S.; Mohammadian, A. Dependence or preference? Navigating public transit loyalty across heterogeneous levels of transit dependence. Transp. Policy 2025, 171, 821–837.
Iclodean, C.; Cordos, N.; Varga, B.O. Autonomous Shuttle Bus for Public Transportation: A Review. Energies 2020, 13, 2917.
Jayatilleke, S.; Bhaskar, A.; Bunker, J. Autonomous bus adoption in public transport networks: A systematic literature review on potential and prospects. Australas. Transp. Res. Forum Perth Aust. 2023, 29, 1–17.
Silva, Ó.; Cordera, R.; González-González, E.; Nogués, S. Environmental impacts of autonomous vehicles: A review of the scientific literature. Sci. Total Environ. 2022, 830, 154615.
Golbabaei, F. Challenges and Opportunities in the Adoption of Autonomous Demand Responsive Transit (ADRT) by Adult Residents of South East Queensland: Queensland University of Technology. Ph.D. Thesis, Queensland University of Technology, Brisbane City, Australia, 2023.
Jayatilleke, S.; Bhaskar, A.; Bunker, J. A Cross-Sectional Study on the Public Perception of Autonomous Demand-Responsive Transits (ADRTs) in Rural Towns: Insights from South-East Queensland. Smart Cities 2025, 8, 72.
Koch, L.-C.; Wishart, D.; Muthukkumarasamy, V. Exploring the perceived benefits and potential challenges of the autonomous vehicle rollout in Australia. Transp. Res. Part F Traffic Psychol. Behav. 2025, 114, 1324–1351.
Zhong, H.; Wang, K.; Li, W.; Burris, M.W.; Sinha, K.C. An urban-rural divide? Preferences for autonomous vehicles in small and med-sized metropolitan areas. Appl. Geogr. 2024, 169, 103324.
Sharma, S.; Woodman, R.; Ocean, N.; Elliott, M.T. Understanding future adoption of autonomous vehicle services among disabled and non-disabled users. Transp. Res. Interdiscip. Perspect. 2025, 34, 101602.
Ansarinejad, M.; Ansarinejad, K.; Lu, P.; Huang, Y.; Tolliver, D. Autonomous Vehicles in Rural Areas: A Review of Challenges, Opportunities, and Solutions. Appl. Sci. 2025, 15, 4195.
Scheltes, A.; de Almeida Correia, G.H. Exploring the use of automated vehicles as last mile connection of train trips through an agent-based simulation model: An application to Delft, Netherlands. Int. J. Transp. Sci. Technol. 2017, 6, 28–41.
Lau, S.T.; Susilawati, S. Shared autonomous vehicles implementation for the first and last-mile services. Transp. Res. Interdiscip. Perspect. 2021, 11, 100440.
Roy, S.; Dadashev, G.; Yfantis, L.; Nahmias Biran, B.-H.; Hasan, S. Autonomous on-Demand Shuttles for First Mile-Last Mile Connectivity: Design, Optimization, and Impact Assessment. Transp. Res. Rec. J. Transp. Res. Board 2024, 2679, 819–840.
Gurumurthy, K.M.; Kockelman, K.M.; Zuniga-Garcia, N. First-Mile-Last-Mile Collector-Distributor System using Shared Autonomous Mobility. Transp. Res. Rec. J. Transp. Res. Board 2020, 2674, 638–647.
Golbabaei, F.; Yigitcanlar, T.; Paz, A.; Bunker, J. Individual Predictors of Autonomous Vehicle Public Acceptance and Intention to Use: A Systematic Review of the Literature. J. Open Innov. Technol. Mark. Complex. 2020, 6, 106.
Xu, Z.; Zheng, N. Integrating connected autonomous shuttle buses as an alternative for public transport—A simulation-based study. Multimodal Transp. 2024, 3, 100133.
Chen, Z.; Li, X.; Zhou, X. Operational design for shuttle systems with modular vehicles under oversaturated traffic: Continuous modeling method. Transp. Res. Part B Methodol. 2020, 132, 76–100.
Alipour, D.; Dia, H. A Systematic Review of the Role of Land Use, Transport, and Energy-Environment Integration in Shaping Sustainable Cities. Sustainability 2023, 15, 6447.
Sarri, P.; Kaparias, I.; Preston, J.; Simmonds, D. Using Land Use and Transportation Interaction (LUTI) models to determine land use effects from new vehicle transportation technologies; a regional scale of analysis. Transp. Policy 2023, 135, 91–111.
Transport and Infrastructure Council. T1 Travel Demand Modelling; Commonwealth Department of Infrastructure and Regional Development: Canberra, Australia, 2016.
Transport and Infrastructure Council. T2 Cost Benefit Analysis; Commonwealth Department of Infrastructure and Regional Development: Canberra, Australia, 2018.
Greifenstein, M. Factors influencing the user behaviour of shared autonomous vehicles (SAVs): A systematic literature review. Transp. Res. Part F Traffic Psychol. Behav. 2024, 100, 323–345.
Chng, S.; Anowar, S.; Cheah, L. Understanding Shared Autonomous Vehicle Preferences: A Comparison between Shuttles, Buses, Ridesharing and Taxis. Sustainability 2022, 14, 13656.
Kim, S.W.; Gwon, G.P.; Hur, W.S.; Hyeon, D.; Kim, D.Y.; Kim, S.H.; Kye, D.-K.; Lee, S.-H.; Lee, S.; Shin, M.-O.; et al. Autonomous Campus Mobility Services Using Driverless Taxi. IEEE Trans. Intell. Transp. Syst. 2017, 18, 3513–3526.
Dong, Z.; Chen, C.; Ouyang, J.; Yan, X.; Liao, C.; Chen, X.; Lee, D.-H. Understanding commuter preferences for shared autonomous electric vehicles in first-mile-last-mile scenario. Transp. Res. Part D Transp. Environ. 2025, 140, 104621.
Jayatilleke, S.; Bhaskar, A.; Bunker, J. Unveiling the Challenges and Opportunities of Autonomous Bus Integration in Rural and Suburban Areas: An Expert Interview Study. Manuscr. Submitt. Publ. 2025, in press.
Mortoja, M.G.; Yigitcanlar, T. Why is determining peri-urban area boundaries critical for sustainable urban development? J. Environ. Plan. Manag. 2021, 66, 67–96.
Dijkstra, L.; Hamilton, E.; Lall, S.; Wahba, S. How Do We Define Cities, Towns, and Rural Areas? 2020. Available online: https://blogs.worldbank.org/sustainablecities/how-do-we-define-cities-towns-and-rural-areas (accessed on 3 August 2025).
Krejcie, R.V.; Morgan, D.W. Determining Sample Size for Research Activities. Educ. Psychol. Meas. 1970, 30, 607–610. [Google Scholar] [CrossRef]
Hair, J.F.; Risher, J.J.; Sarstedt, M.; Ringle, C.M. When to use and how to report the results of PLS-SEM. Eur. Bus. Rev. 2019, 31, 2–24. [Google Scholar] [CrossRef]
Kwak, S.K.; Kim, J.H. Statistical data preparation: Management of missing values and outliers. Korean J. Anesthesiol. 2017, 70, 407–411. [Google Scholar] [CrossRef]
Australian Bureau of Statistics. Gatton 2021 Census All Persons QuickStats. 2021. Available online: https://abs.gov.au/census/find-census-data/quickstats/2021/SAL31104 (accessed on 3 August 2025).
Bland, J.M.; Altman, D.G. Statistics notes: Cronbach’s alpha. BMJ 1997, 314, 572. [Google Scholar] [CrossRef] [PubMed]
Tavakol, M.; Dennick, R. Making sense of Cronbach’s alpha. Int. J. Med. Educ. 2011, 2, 53–55. [Google Scholar] [CrossRef] [PubMed]
Norman, G. Likert scales, levels of measurement and the “laws” of statistics. Adv. Health Sci. Educ. Theory Pract. 2010, 15, 625–632. [Google Scholar] [CrossRef]
Hosmer, D.W., Jr.; Lemeshow, S.; Sturdivant, R.X. Applied Logistic Regression; John Wiley & Sons: Hoboken, NJ, USA, 2013. [Google Scholar]
Tibshirani, R. Regression Shrinkage and Selection via the Lasso. J. R. Stat. Soc. Ser. B Methodol. 1996, 58, 267–288. [Google Scholar] [CrossRef]
Hanley, J.A. Characteristic (ROC) curvel. Radiology 1982, 743, 29–36. [Google Scholar] [CrossRef]
Fawcett, T. An introduction to ROC analysis. Pattern Recognit. Lett. 2006, 27, 861–874. [Google Scholar] [CrossRef]
Friedman, J.H.; Hastie, T.; Tibshirani, R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J. Stat. Softw. 2010, 33, 1–22. [Google Scholar] [CrossRef]
Peduzzi, P.; Concato, J.; Kemper, E.; Holford, T.R.; Feinstein, A.R. A simulation study of the number of events per variable in logistic regression analysis. J. Clin. Epidemiol. 1996, 49, 1373–1379. [Google Scholar] [CrossRef]
Vittinghoff, E.; McCulloch, C.E. Relaxing the rule of ten events per variable in logistic and Cox regression. Am. J. Epidemiol. 2007, 165, 710–718. [Google Scholar] [CrossRef]
King, G.; Zeng, L. Logistic regression in rare events data. Political Anal. 2001, 9, 137–163. [Google Scholar] [CrossRef]
Belsley, D.A.; Kuh, E.; Welsch, R.E. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity; John Wiley & Sons: Hoboken, NJ, USA, 2005. [Google Scholar]
Box, G.E.P.; Tidwell, P.W. Transformation of the Independent Variables. Technometrics 1962, 4, 531–550. [Google Scholar] [CrossRef]
Hastie, T.; Tibshirani, R. Generalized additive models. Stat. Sci. 1986, 1, 297–310. [Google Scholar] [CrossRef]
Cleveland, W.S.; Devlin, S.J. Locally Weighted Regression: An Approach to Regression Analysis by Local Fitting. J. Am. Stat. Assoc. 1988, 83, 596–610. [Google Scholar] [CrossRef]
Wood, S.N. Generalized Additive Models: An Introduction with R, 2nd ed.; Chapman and Hall/CRC: Boca Raton, FL, USA, 2017. [Google Scholar]
Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning; Springer Series in Statistics New-York; Springer: New York, NY, USA, 2009. [Google Scholar]
Wood, S.N.; Pya, N.; Säfken, B. Smoothing parameter and model selection for general smooth models. J. Am. Stat. Assoc. 2016, 111, 1548–1563. [Google Scholar] [CrossRef]
Marra, G.; Wood, S.N. Practical variable selection for generalized additive models. Comput. Stat. Data Anal. 2011, 55, 2372–2387. [Google Scholar] [CrossRef]
Friedman, J.H. Greedy Function Approximation: A Gradient Boosting Machine. Ann. Stat. 2001, 29, 1189–1232. [Google Scholar] [CrossRef]
Chen, T.; Guestrin, C. (Eds.) Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM Sigkdd International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016. [Google Scholar]
Hoerl, A.E.; Kennard, R.W. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 1970, 12, 55–67. [Google Scholar] [CrossRef]
Bergstra, J.; Bengio, Y. Random search for hyper-parameter optimization. J. Mach. Learn. Res. 2012, 13, 281–305. [Google Scholar]
Kohavi, R. (Ed.) A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection; Ijcai: Montreal, QC, Canada, 1995. [Google Scholar]
Prechelt, L. Automatic early stopping using cross validation: Quantifying the criteria. Neural Netw. 1998, 11, 761–767. [Google Scholar] [CrossRef] [PubMed]
Belkin, M.; Hsu, D.; Ma, S.; Mandal, S. Reconciling modern machine-learning practice and the classical bias-variance trade-off. Proc. Natl. Acad. Sci. USA 2019, 116, 15849–15854. [Google Scholar] [CrossRef] [PubMed]
Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 2017, 30, 1–10. [Google Scholar]
Shapley, L.S. A Value for n-Person Games; Princeton University Press: Princeton, NJ, USA, 1953. [Google Scholar]
Powers, D.M. Evaluation: From precision, recall and F-measure to ROC, informedness, markedness and correlation. arXiv 2020, arXiv:201016061. [Google Scholar] [CrossRef]
Davis, J.; Goadrich, M. The relationship between Precision-Recall and ROC curves. In Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, USA, 25–29 June 2006; Association for Computing Machinery: New York, NY, USA; pp. 233–240. [Google Scholar]
Davison, A.; Hinkley, D. Bootstrap Methods and Their Application. J. Am. Stat. Assoc. 1997, 94. [Google Scholar] [CrossRef]
Koskinen, K.; Mallat, N.; Raj Upreti, B. Shared benefits and sustainable mobility—A case of autonomous bus. Case Stud. Transp. Policy 2024, 18, 101286. [Google Scholar] [CrossRef]
Othman, K. Exploring the implications of autonomous vehicles: A comprehensive review. Innov. Infrastruct. Solut. 2022, 7, 1–32. [Google Scholar] [CrossRef]
Schippl, J.; Truffer, B.; Fleischer, T. Potential impacts of institutional dynamics on the development of automated vehicles: Towards sustainable mobility? Transp. Res. Interdiscip. Perspect. 2022, 14, 100587. [Google Scholar] [CrossRef]
Thorhauge, M.; Fjendbo Jensen, A.; Rich, J. Effects of autonomous first- and last mile transport in the transport chain. Transp. Res. Interdiscip. Perspect. 2022, 15, 100623. [Google Scholar] [CrossRef]
Li, H.; Jin, Z.; Cui, H.; Tu, H. An exploration of the preferences and mode choice behavior between autonomous demand-responsive transit and traditional buses. Int. J. Transp. Sci. Technol. 2024, 15, 81–101. [Google Scholar] [CrossRef]
Golchin, M.; Grandhi, A.; Gore, N.; Pulugurtha, S.S.; Ghasemi, A. UNC Charlotte Autonomous Shuttle Pilot Study: An Assessment of Operational Performance, Reliability, and Challenges. Machines 2024, 12, 796. [Google Scholar] [CrossRef]
Golbabaei, F.; Paz, A.; Yigitcanlar, T.; Bunker, J. Navigating autonomous demand responsive transport: Stakeholder perspectives on deployment and adoption challenges. Int. J. Digit. Earth 2023, 17, 2297848. [Google Scholar] [CrossRef]
Fadlelseed, S.; Sarhadi, P.; Ramalingam, S.; Gan, H.; Kourtessis, P.; Jackman, G.; Contreras, A.G.S.; Tena, J.G.; West, J.; Holland, J.; et al. Recent Advances in Demand Responsive Transport: Opportunities With Autonomous Bus Service—A System-of-Systems Overview. IEEE Access 2025, 13, 107800–107831. [Google Scholar] [CrossRef]
Sistig, H.M.; Sinhuber, P.; Rogge, M.; Sauer, D.U. Optimizing Fleet Structure for Autonomous Electric Buses: A Route-Based Analysis in Aachen, Germany. Sustainability 2024, 16, 4093. [Google Scholar] [CrossRef]
Xu, M. Addressing the Fleet Sizing Problem for Shared-and-Autonomous-Mobility Services; University of Michigan: Ann Arbor, MI, USA, 2021; Available online: https://limos.engin.umich.edu/istdm2021/wp-content/uploads/sites/2/2021/05/ISTDM-2021-Extended-Abstract-0161.pdf (accessed on 25 August 2025).
Ciari, F.; Janzen, M.; Ziemlicki, C. Chapter 8—Planning shared automated vehicle fleets: Specific modeling requirements and concepts to address them. In Demand for Emerging Transportation Systems; Antoniou, C., Efthymiou, D., Chaniotakis, E., Eds.; Elsevier: Amsterdam, The Netherlands, 2020; pp. 151–168. [Google Scholar]
Afkham, M.; Ramezanian, R.; Shahparvari, S. Balancing traffic flow in the congested mass self-evacuation dynamic network under tight preparation budget: An Australian bushfire practice. Omega 2022, 111, 102658. [Google Scholar] [CrossRef]
Partheepan, S.; Sanati, F.; Hassan, J. Autonomous Unmanned Aerial Vehicles in Bushfire Management: Challenges and Opportunities. Drones 2023, 7, 47. [Google Scholar] [CrossRef]
Choi, M.; Min, S.; Kim, J.; Kim, S.; Kwak, J.; Lee, S. Autonomous vehicle integration with public transit for congestion mitigation and energy efficiency. Sustain. Energy Technol. Assess. 2025, 82, 104476. [Google Scholar] [CrossRef]
Alonso-González, M.J.; Liu, T.; Cats, O.; Van Oort, N.; Hoogendoorn, S. The Potential of Demand-Responsive Transport as a Complement to Public Transport: An Assessment Framework and an Empirical Evaluation. Transp. Res. Rec. 2018, 2672, 879–889. [Google Scholar] [CrossRef]
Golbabaei, F.; Dwyer, J.; Gomez, R.; Peterson, A.; Cocks, K.; Bubke, A.; Paz, A. Enabling mobility and inclusion: Designing accessible autonomous vehicles for people with disabilities. Cities 2024, 154, 105333. [Google Scholar] [CrossRef]
Tang, C.; Liu, J.; Ceder, A.; Jiang, Y. Optimisation of a new hybrid transit service with modular autonomous vehicles. Transp. A Transp. Sci. 2023, 20, 1–23. [Google Scholar] [CrossRef]
Wu, J.; Kulcsár, B.; Selpi Qu, X. A modular, adaptive, and autonomous transit system (MAATS): An in-motion transfer strategy and performance evaluation in urban grid transit networks. Transp. Res. Part A Policy Pract. 2021, 151, 81–98. [Google Scholar] [CrossRef]
Maruyama, R.; Seo, T. Integrated Public Transportation System with Shared Autonomous Vehicles and Fixed-Route Transits: Dynamic Traffic Assignment-Based Model with Multi-Objective Optimization. Int. J. Intell. Transp. Syst. Res. 2023, 21, 99–114. [Google Scholar] [CrossRef]
Abdelwahed, A.; van den Berg, P.L.; Brandt, T.; Ketter, W. Balancing convenience and sustainability in public transport through dynamic transit bus networks. Transp. Res. Part C Emerg. Technol. 2023, 151, 104100. [Google Scholar] [CrossRef]
Rau, A.; Tiana, L.; Jain, M.; Xie, M. Dynamic Autonomous Road Transit (DART) for Use-case Capacity More Than Bus. Transp. Res. Procedia 2019, 41, 812–823. [Google Scholar] [CrossRef]
Leich, G.; Bischoff, J. Should autonomous shared taxis replace buses? A simulation study. Transp. Res. Procedia 2019, 41, 450–460. [Google Scholar] [CrossRef]
Shen, Y.; Zhang, H.; Zhao, J. Integrating shared autonomous vehicle in public transportation system: A supply-side simulation of the first-mile service in Singapore. Transp. Res. Part A Policy Pract. 2018, 113, 125–136. [Google Scholar] [CrossRef]
Sadler, K. First and Last Mile: Emerging Autonomous Public Transport. 2017. Available online: https://www.eurotransportmagazine.com/22134/transport-extra/autonomous-public-transport/ (accessed on 27 July 2025).
Huang, Y.; Kockelman, K.M.; Garikapati, V.; Zhu, L.; Young, S. Use of Shared Automated Vehicles for First-Mile Last-Mile Service: Micro-Simulation of Rail-Transit Connections in Austin, Texas. Transp. Res. Rec. J. Transp. Res. Board 2020, 2675, 135–149. [Google Scholar] [CrossRef]
Zubin, I.; Van Oort, N.; Van Binsbergen, A.; Van Arem, B. Deployment Scenarios for First/Last-Mile Operations With Driverless Shuttles Based on Literature Review and Stakeholder Survey. IEEE Open J. Intell. Transp. Syst. 2021, 2, 322–337. [Google Scholar] [CrossRef]
Stevens, M.; Correia, G.H.d.A.; Scheltes, A.; van Arem, B. An agent-based model for assessing the financial viability of autonomous mobility on-demand systems used as first and last-mile of public transport trips: A case-study in Rotterdam, The Netherlands. Res. Transp. Bus. Manag. 2022, 45, 100875. [Google Scholar] [CrossRef]
Zhang, W.; Jenelius, E.; Badia, H. Efficiency of Connected Semi-Autonomous Platooning Bus Services in High-Demand Transit Corridors. IEEE Open J. Intell. Transp. Syst. 2022, 3, 435–448. [Google Scholar] [CrossRef]
Hatzenbühler, J.; Cats, O.; Jenelius, E. Transitioning towards the deployment of line-based autonomous buses: Consequences for service frequency and vehicle capacity. Transp. Res. Part A Policy Pract. 2020, 138, 491–507. [Google Scholar] [CrossRef]
Kalambay, P.; Kitali, A.; Ngereza, A.; Kidando, E.; Ogungbire, A. Autonomous taxis and ride-sharing vehicles: A social construct perspective for future mobility and infrastructure readiness. Sustain. Cities Soc. 2025, 118, 106060. [Google Scholar] [CrossRef]
Bi, C.; Li, Y.; Gruyer, D.; Tu, M. Who is more willing to use shared autonomous vehicles in first-mile-last-mile? A heterogeneity study on carbon incentive policy from China. Int. J. Transp. Sci. Technol. 2024. [Google Scholar] [CrossRef]
Abe, R. Introducing autonomous buses and taxis: Quantifying the potential benefits in Japanese transportation systems. Transp. Res. Part A Policy Pract. 2019, 126, 94–113. [Google Scholar] [CrossRef]
Hair, J.F. Multivariate Data Analysis, 7th ed.; Prentice Hall: Upper Saddle River, NJ, USA, 2009; p. 761. [Google Scholar]

Figure 1. Socio-demographic profile: (a) Gender; (b) age; (c) education level; (d) employment status; and (e) annual household income.

Figure 2. Data analysis process.

Figure 3. GAM significant predictor EDF values. (Note: VT1—small shuttle; VT2—minibus shuttle; VT3—standard-sized conventional bus; SO1—completely replace conventional buses; SO2—operate as a connector to existing fixed-route bus services; SO3—connector to longer distance services; SO4—operate as private taxi services; SO5—accommodate as a multipurpose service; SO6—integrated with other transport offerings; SO7—operate 24/7; TP1—work; TP2—school; TP3—university; TP4—shopping; TP5—medical; TP6—leisure; TP7—emergency; TP8—special events; DG1—school children; DG2—university students; DG3—working professionals; DG4—senior citizens; DG5—tourists; DG6—leisure travelers; DG7—people with physical disabilities; DG8—people with sensory disabilities; DG9—people with cognitive disabilities; DG10—low-income individuals; DG11—middle-income individuals; DG12—high-income individuals; LU1—residential neighborhoods; LU2—industrial/business parks; LU3—university precincts; LU4—agricultural land areas; LU5—tourist destinations; LU6—town centres).

Figure 4. XGBoost predictor feature importance (Gain). (Note: VT1—small shuttle; VT2—minibus shuttle; VT3—standard-sized conventional bus; SO1—completely replace conventional buses; SO2—operate as a connector to existing fixed-route bus services; SO3—connector to longer distance services; SO4—operate as private taxi services; SO5—accommodate as a multipurpose service; SO6—integrated with other transport offerings; SO7—operate 24/7; TP1—work; TP2—school; TP3—university; TP4—shopping; TP5—medical; TP6—leisure; TP7—emergency; TP8—special events; DG1—school children; DG2—university students; DG3—working professionals; DG4—senior citizens; DG5—tourists; DG6—leisure travelers; DG7—people with physical disabilities; DG8—people with sensory disabilities; DG9—people with cognitive disabilities; DG10—low-income individuals; DG11—middle-income individuals; DG12—high-income individuals; LU1—residential neighborhoods; LU2—industrial/business parks; LU3—university precincts; LU4—agricultural land areas; LU5—tourist destinations; LU6—town centres).

Figure 5. Combined swarm plots of SHAP values for XGBoost. (a) VT1. (b) VT2. (c) VT3. (d) SO1. (e) SO2. (f) SO3. (g) SO4. (h) SO5. (i) SO6. (j) SO7. (Note: VT1—small shuttle; VT2—minibus shuttle; VT3—standard-sized conventional bus; SO1—completely replace conventional buses; SO2—operate as a connector to existing fixed-route bus services; SO3—connector to longer distance services; SO4—operate as private taxi services; SO5—accommodate as a multipurpose service; SO6—integrated with other transport offerings; SO7—operate 24/7; TP1—work; TP2—school; TP3—university; TP4—shopping; TP5—medical; TP6—leisure; TP7—emergency; TP8—special events; DG1—school children; DG2—university students; DG3—working professionals; DG4—senior citizens; DG5—tourists; DG6—leisure travelers; DG7—people with physical disabilities; DG8—people with sensory disabilities; DG9—people with cognitive disabilities; DG10—low-income individuals; DG11—middle-income individuals; DG12—high-income individuals; LU1—residential neighborhoods; LU2—industrial/business parks; LU3—university precincts; LU4—agricultural land areas; LU5—tourist destinations; LU6—town centres).

Figure 6. Model validation plots of GAM and XGBoost. (a) AUC. (b) F1 Score. (c) Bootstrap 95% CI for Accuracy. (d) Bootstrap 95% CI for F1 Score.

Table 1. Study attributes.

Variable	Item	Description
Demand Drivers
Trip Purpose (TP)	TP1	Work
	TP2	School
	TP3	University
	TP4	Shopping
	TP5	Medical
	TP6	Leisure
	TP7	Emergency
	TP8	Special events or gatherings
Demographic Group (DG)	DG1	School children
	DG2	University students
	DG3	Working professionals
	DG4	Senior citizens
	DG5	Tourists
	DG6	Leisure travelers
	DG7	People with physical disabilities
	DG8	People with sensory disabilities
	DG9	People with cognitive disabilities
	DG10	Low-income individuals
	DG11	Middle-income individuals
	DG12	High-income individuals
Built-Environment Factors
Land Use (LU)	LU1	Residential neighbourhoods
	LU2	Industrial/business parks
	LU3	University precincts
	LU4	Agricultural land areas
	LU5	Tourist destinations
	LU6	Town centres
Supply-Side Features
Vehicle Type (VT)	VT1	Small shuttle
	VT2	Minibus shuttle
	VT3	Standard-sized conventional bus
Service Offering (SO)	SO1	Completely replace conventional buses
	SO2	Operate as a connector to existing fixed-route bus services
	SO3	Connector to longer distance services
	SO4	Operate as private taxi services
	SO5	Accommodate as a multipurpose service
	SO6	Integrated with other transport offerings
	SO7	Operate 24/7

Table 2. LASSO feature selection results.

Dependent Variable	Selected Variables
VT1	TP1, TP2, TP5, TP8, DG5, DG6, DG7, DG12, LU1, LU3
VT2	TP4, TP5, TP6, TP8, DG2, DG3, DG4, DG5, DG6, DG8, DG10, LU1, LU2, LU3, LU6
VT3	TP1, TP2, TP6, TP7, TP8, DG1, DG2, DG10, DG12, LU2, LU3, LU4, LU5, LU6
SO1	TP2, TP7, TP8, DG1, DG7, DG8, DG9, LU4, LU5, LU6
SO2	DG3, LU3
SO3	TP1, TP3, TP4, TP8, DG1, DG2, DG6, DG7, DG10, LU1, LU2, LU6
SO4	TP1, TP2, TP3, TP4, TP5, TP7, DG1, DG2, DG4, DG5, DG7, DG8, DG10, DG11, DG12, LU1, LU2, LU3, LU6
SO5	TP1, TP6, DG3, DG6, DG11, LU2, LU3, LU4
SO6	TP1, TP8, DG2, DG4, DG6, DG10, LU2, LU3, LU5
SO7	TP1, TP6, DG3, DG10, LU3, LU4

Table 3. Policy recommendations.

Strategy	Key Findings	Policy Recommendations
Small shuttle	University/leisure trips are primary drivers with flexible demand and high need for accessible short trips (including for physical disabilities).	- Partner with universities for campus shuttles using dynamic routing. - Integrate in tourist areas with accessibility features and peak-season scheduling.
Minibus shuttle	Driven by shopping and last-mile connectivity. Key for sensory disabilities and sporadic tourist/leisure demand. Also strong in university/industrial precincts.	- Deploy in town centres/shopping districts to reduce congestion. - Mandate inclusive design standards. - Partner with tourism boards for high-traffic areas, emphasizing flexibility/comfort. - Deploy in industrial parks and university precincts with peak-hour schedules.
Standard-sized conventional bus	Consistent demand from school trips. Critical for emergency trips (high capacity). Predictors include agricultural areas, town centres, tourist spots, and low-income areas. Used by industrial park commuters.	- Collaborate with schools for student transport with safety features like monitoring/geofencing. - Develop protocols for evacuations in disaster-prone rural areas. - Subsidize services in these areas for connectivity. - Prioritize deployment in industrial parks for employee travel.
Completely replace conventional buses	High relevance to rural town centres and consistent school trips. Important for agricultural areas and vital for accessibility for sensory/physical disabilities.	- Pilot programs with real-time optimization. - Dedicated routes for schools with safety compliance. - Evaluate hybrid models in agricultural contexts. - Incorporate inclusive designs, optimizing demand.
Operate as a connector to existing fixed-route bus services	Dominantly serves working professionals and university precincts for commuting.	- Develop apps for seamless integration targeting professionals. - Awareness campaigns in precincts for trust and demand adaptation. - Regular evaluations for service refinement.
Connector to longer distance services	Driven primarily by leisure travelers with flexible needs and by special events. Also supports connections for school children.	- Deploy in tourist areas with flexible scheduling for rail/coach connections. - Integrate with event schedules to reduce vehicle reliance. - Enhance reliability for school connections.
Operate as private taxi services	High demand for work trips. Critical for physical disabilities. Strong relevance to both low-income (equity) and high-income segments.	- Deploy in residential/industrial areas with on-demand platforms. - Include low-floor/automated features. - Subsidies/tiered pricing for affordability. - Premium options for commuters, including children/seniors.
Accommodate as a multipurpose service	Key drivers are working professionals, university precincts, and industrial parks. Significant for logistics and passenger needs in agricultural areas.	- Dual-purpose designs prioritizing passengers. - Deploy in university precincts with optimization. - Off-peak freight in industrial parks for employees/logistics. - Flexible routes for rural logistics.
Integrated with other transport offerings	High demand from special events, leisure travelers, and senior citizens (for accessibility). Also supports work-trip efficiency within a MaaS framework.	- MaaS platforms with dynamic scheduling for events. - Focus on tourist MaaS for connectivity/emissions reduction. - Door-to-door accessible interfaces. - Support commuting efficiency in MaaS.
Operate 24/7	Dominated by demand from university precincts. Working professionals are a secondary driver.	- Base in university precincts for continuous service with off-peak charging. - Awareness campaigns for trust. - Integrate multipurpose use during low-demand periods for efficiency/safety.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

