Article

Dynamic Empty-Vehicle Repositioning on Long-Haul Freight Corridors: Lower Bounds and Rolling-Horizon Policies Under Lead Times and Time Windows

School of Social Sciences, Waseda University, Tokyo 169-8050, Japan
Future Transp. 2026, 6(3), 125; https://doi.org/10.3390/futuretransp6030125
Submission received: 10 May 2026 / Revised: 7 June 2026 / Accepted: 10 June 2026 / Published: 11 June 2026

Abstract

Empty-vehicle repositioning is a persistent challenge in long-haul road freight because carriers must reduce empty mileage without sacrificing service reliability under lead times, appointment windows, and uncertain load realization. This paper formulates empty-vehicle repositioning on freight corridors as a stochastic control problem with explicit space–time feasibility and a stated within-epoch event order. Lead times couple current dispatch decisions to future capacity, pickup windows impose reachability constraints, and stochastic match feasibility captures information and market frictions. We develop dynamic lower bounds from time-expanded relaxations, showing that dual prices of inventory-balance constraints can be interpreted as space–time scarcity values. We further introduce an order-dependent nested friction decomposition that separates excess empty movement into spatial imbalance, temporal mismatch induced by lead times and time windows, and information frictions. Guided by this structure, we propose price-guided rolling-horizon (PG-RH) and generalized-cost policies and evaluate them on synthetic corridor experiments organized around the three friction families. The results reveal service–empty-mileage trade-offs, a pronounced knee in the Pareto frontier, lower service loss when tight pickup windows are widened, and strong sensitivity to match feasibility. The PG-RH policy reduces empty-distance exposure and total cost relative to static balancing in the main scenarios while maintaining comparable, but not uniformly dominant, service performance. The framework provides a diagnostic basis for identifying the sources of deadhead and for designing operational interventions that reduce empty mileage without undermining reliability.

1. Introduction

1.1. Empty-Vehicle Repositioning as a Stochastic Control Problem

Empty-vehicle repositioning remains a persistent structural inefficiency in long-haul truckload freight. Even on mature lane networks, carriers routinely dispatch tractors and trailers without a revenue load in order to regain access to high-demand regions, satisfy shipper appointment constraints, and hedge against near-term uncertainty in load availability. The operational problem is not merely to reduce empty mileage. A policy that suppresses empty moves too aggressively may increase missed pickups, late deliveries, and rejected tenders, whereas a policy that rebalances too proactively may protect service performance at the cost of additional deadhead. The central trade-off is therefore dynamic: repositioning matters because it reshapes future feasibility.
From an operations-research perspective, these phenomena belong to the broader class of dynamic resource rebalancing problems under uncertainty, closely related to empty-equipment repositioning and fleet management in other freight settings [1,2,3]. Long-haul trucking, however, is distinguished by two time-related frictions that are especially binding in practice: lead times and time windows (appointments). Lead times create strong intertemporal coupling, because dispatch decisions made now determine the future spatial profile of available capacity. Time windows convert service commitments into hard feasibility constraints rather than soft timing preferences, and appointment systems may further couple carriers through shared facility capacity and congestion externalities [4].
These features make empty repositioning qualitatively different from static “balance-the-network” heuristics. The relevant object is a stochastic control problem in which feasibility depends jointly on location and remaining time to deadlines, while decisions are taken under incomplete and delayed information about future demand and realized matching outcomes. This informational environment is increasingly shaped by digital brokerage and online freight exchanges, which affect both what carriers observe and which attempted assignments ultimately become realized loads [5,6]. For this reason, empty repositioning in long-haul freight should be understood not simply as a cost-minimization exercise, but as a dynamic coordination problem linking fleet positioning, service reliability, and market-mediated uncertainty.

1.2. Definition: Dynamic Repositioning Under Time Constraints

We formalize empty repositioning as dynamic repositioning under time constraints on a corridor network. Let $G = (V, E)$ be a directed graph, where $V$ denotes corridor nodes (markets) and $E$ denotes directed lanes. Time is discretized into decision epochs $t = 0, 1, \dots, T$. Let $x_t \in \mathbb{Z}_+^{|V|}$ be the vector of available vehicle inventories by node at time $t$.
Freight requests arrive stochastically. Each request $r$ is characterized by an origin–destination pair $(o_r, d_r)$, a release time, a pickup (service-start) window $[a_r, b_r]$, and a service or transport time $\tau_r$. Vehicles may reposition empty along arcs $(i, j) \in E$ with travel time $\tau_{ij}$ and empty-movement cost $c_{ij}$. Crucially, a repositioning action chosen at time $t$ does not make a vehicle available at the destination until time $t + \tau_{ij}$, so present actions determine future capacity only with delay. Moreover, the feasibility of serving a request depends on whether a vehicle can reach the origin before the pickup deadline $b_r$ and subsequently satisfy any downstream timing requirements. The state must therefore encode not only where vehicles are, but also when in-transit vehicles become usable and how much slack remains for pending service obligations.
A natural representation is the time–space expansion: define time-indexed nodes $(i, t)$ and allow flows on holdover arcs (waiting), empty-move arcs (repositioning), and loaded arcs (serving accepted requests) [3]. In this representation, lead times and time windows appear directly as reachability constraints in an expanded network, while the online, uncertain nature of arrivals preserves a genuinely dynamic decision problem.
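As a rough illustration of the time–space expansion, the sketch below builds holdover and empty-move arcs for a toy corridor and checks whether a vehicle can reach a pickup origin before a deadline. The node names, travel times, and helper names (`build_time_expanded_arcs`, `can_reach`) are hypothetical and chosen for illustration only; loaded arcs are omitted for brevity.

```python
from typing import Dict, List, Tuple

Node = Tuple[str, int]        # (location, epoch)
Arc = Tuple[Node, Node, str]  # (tail, head, arc kind)


def build_time_expanded_arcs(
    locations: List[str],
    travel_time: Dict[Tuple[str, str], int],
    horizon: int,
) -> List[Arc]:
    """Build holdover and empty-move arcs for epochs 0..horizon."""
    arcs: List[Arc] = []
    for t in range(horizon):
        for i in locations:
            # Holdover arc: the vehicle waits at i for one epoch.
            arcs.append(((i, t), (i, t + 1), "hold"))
        for (i, j), tau in travel_time.items():
            # Empty-move arc: dispatched at t, usable at j from epoch t + tau.
            if t + tau <= horizon:
                arcs.append(((i, t), (j, t + tau), "empty"))
    return arcs


def can_reach(arcs: List[Arc], start: Node, origin: str, deadline: int) -> bool:
    """Return True if a vehicle at `start` can be at `origin` by `deadline`."""
    successors: Dict[Node, List[Node]] = {}
    for tail, head, _kind in arcs:
        successors.setdefault(tail, []).append(head)
    stack, seen = [start], {start}
    while stack:
        loc, t = stack.pop()
        if loc == origin and t <= deadline:
            return True
        for nxt in successors.get((loc, t), []):
            # Prune space-time nodes that lie beyond the pickup deadline.
            if nxt not in seen and nxt[1] <= deadline:
                seen.add(nxt)
                stack.append(nxt)
    return False


# Toy corridor A -> B -> C with lead times of 2 and 3 epochs (assumed values).
arcs = build_time_expanded_arcs(
    ["A", "B", "C"], {("A", "B"): 2, ("B", "C"): 3}, horizon=6
)
print(can_reach(arcs, ("A", 0), "C", deadline=5))  # True: A0 -> B2 -> C5
print(can_reach(arcs, ("A", 0), "C", deadline=4))  # False: earliest arrival is epoch 5
```

In this toy case the pickup deadline, not the geography, decides feasibility: the same vehicle can serve a request at C with deadline 5 but not one with deadline 4.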
Formally, let $\mathcal{F}_t$ denote the information available at time $t$ (observed demand, forecasts, and platform signals), and let $u_t$ denote a control specifying empty-repositioning flows and acceptance or assignment decisions. The system evolves as
$$ s_{t+1} = \mathcal{T}(s_t, u_t, \xi_{t+1}), $$
where the state $s_t$ aggregates on-hand inventories, in-transit vehicles, and the residual time-window status associated with pending and accepted loads, and $\xi_{t+1}$ represents exogenous uncertainty such as arrivals, travel-time shocks, and match realizations. The objective is to minimize expected total cost,
$$ \min_{\varpi} \; \mathbb{E}^{\varpi} \left[ \sum_{t=0}^{T} \Big( C(u_t) + C_{\mathrm{svc}}(u_t, s_t) + C_{\mathrm{fail}}(u_t, s_t) \Big) \right], $$
where $C$ captures empty-movement costs, $C_{\mathrm{svc}}$ captures operational service costs, and $C_{\mathrm{fail}}$ penalizes unserved demand and tardiness. This yields an MDP, or a partially observed MDP when information frictions are modeled explicitly, whose dimensionality is driven by the joint space–time coupling induced by lead times and time windows.

1.3. Why Theory: Lower Bounds and Friction Structure

Lead times and time windows impose hard feasibility constraints that couple space and time, while uncertainty renders exact dynamic programming intractable at corridor scale. In large fleet problems, approximate dynamic programming (ADP) and rolling-horizon schemes are therefore prevalent, often built on value-function approximations or price-based decompositions [7]. Yet for policy design—and for interpreting interventions such as appointment systems, pooling, and platform adoption—workable heuristics alone are not enough. One also needs structure. In particular, it is important to identify performance lower bounds that reveal the unavoidable empty movement implied by imbalance and deadlines, and to develop decompositions that attribute excess empty miles to specific frictions.
Accordingly, this paper adopts a theoretical stance and asks the following questions: given spatial imbalance, temporal peaking coupled with window tightness, and information frictions, what is the minimal empty movement compatible with a target service level? Which components of empty mileage are fundamentally induced by imbalance, which by time constraints, and which by imperfect information? To answer these questions, we proceed in three steps. First, we formulate a corridor-scale MDP for empty repositioning that explicitly incorporates lead times, time windows, and stochastic match feasibility, and we use a time–space interpretation to make coupling and feasibility transparent. Second, we derive computable dynamic lower bounds via time-expanded relaxations and dualization of inventory-balance constraints, producing interpretable shadow prices for space–time capacity; building on these bounds, we propose a friction decomposition that separates empty movement attributable to (i) spatial imbalance, (ii) temporal mismatch and window tightness, and (iii) information frictions such as delays, match success probabilities, and search or coordination costs. Third, we show how the resulting dual prices can be converted into scalable and interpretable rolling-horizon controls, thereby connecting operational interventions to model primitives (e.g., appointment systems as effective window widening; collaboration as supply pooling; platforms as improved match feasibility) [4,8,9].
Beyond its methodological contribution, the framework is intended to support operational interpretation in freight transportation. In particular, the decomposition clarifies when excess empty mileage is primarily driven by structural corridor imbalance, by tight appointment windows relative to travel lead times, or by information frictions in load matching. This distinction is useful for evaluating practical levers such as appointment-window design, carrier collaboration, and digital platform improvements, and thus helps connect the analysis to freight system planning and logistics management.

2. Related Work

The present study draws its intellectual provenance from three interlocking literatures: (i) empty-asset repositioning and network balancing, (ii) market-mediated trucking operations governed by dynamic matching, and (iii) time–space network representations for transportation planning and control. Two further strands, while not always foregrounded in formal repositioning models, are nonetheless constitutive for practice and for our theoretical agenda: research on collaboration and backhauls as institutional and operational devices that reshape effective imbalance, and research on forecasting for optimization as the informational substrate of rolling-horizon control.

2.1. Empty-Vehicle Repositioning and Asset Balancing in Freight Networks

Empty repositioning has attracted sustained attention as a canonical manifestation of network imbalance in freight transportation and logistics. In its classical expression, the problem is cast as a system of controllable flows that reconstitute asset availability across locations, balancing the direct cost of repositioning against service performance and reliability. Early contributions established this perspective through models of fleet sizing and empty-equipment redistribution in freight systems [1,10]. Subsequent work in trucking and adjacent freight contexts extends the balancing paradigm into explicitly dynamic settings, where allocation and repositioning must be undertaken under uncertain demand and binding network constraints [11].
A closely allied corpus arises from container and drayage logistics, where the repositioning, reuse, and sizing of empty containers pose structurally analogous challenges. Representative formulations examine dynamic reuse in port environments [12], integrated fleet sizing and repositioning [13], generalized drayage models [14], and interventions intended to curtail empty container truck trips [15]. While the physical asset differs, the underlying mathematics is shared: conservation of equipment, persistent spatial asymmetry of flows, and temporal constraints that render feasibility time-dependent.
Within operations research on empty-vehicle moves per se, repositioning is commonly formulated as a network optimization problem and is at times integrated with loaded movements and routing. For instance, [2] studies optimal empty-vehicle repositioning on a network, and [16] integrates loaded and empty decisions within planning models. These contributions supply the canonical ingredients—inventory balance, flow conservation, and the cost–service trade-off—that motivate our corridor-level representation.
Our point of departure is twofold. First, we elevate lead times and time windows from ancillary timing parameters to first-class determinants of feasibility, treating them as reachability constraints in space–time; this elevation is not merely notational, but qualitatively shifts the problem toward stochastic control with pronounced intertemporal coupling. Second, our emphasis is explicitly theoretical: we develop dynamic lower bounds and a friction decomposition (spatial, temporal/window, and informational) designed to disentangle unavoidable deadhead from reducible inefficiency under explicit time constraints.

2.2. Collaboration, Backhauls, and Consolidation as Imbalance Mitigators

A complementary literature approaches empty miles not only through refined repositioning policies, but through institutional and operational arrangements that enlarge the effective pooling of supply and demand: collaboration among carriers, consolidation of flows, and systematic backhaul planning. Horizontal cooperation has been examined both empirically and through optimization models, repeatedly highlighting utilization gains and cost reductions [8,9,17]. Surveys of collaborative routing synthesize algorithmic, organizational, and incentive-related considerations [18], while more recent work situates collaborative vehicle utilization within contemporary logistics ecosystems [19].
In vehicle routing, the backhaul literature formalizes pickup–delivery structure and its operational benefits [20], including formulations with time windows [21] and robust or selective backhaul decisions [22]. Related contributions link routing and backhaul design to environmental objectives, including emissions mitigation [23]. Across these lines of work, collaboration and backhauls function less as ad hoc remedies than as mechanisms that alter the feasible set of matches and thereby attenuate imbalance at its source.
We interpret this literature through a single operational lens: collaboration, consolidation, and backhaul design can be represented as parameter shifts that reduce effective spatial imbalance (by enlarging the relevant market or pooling inventories) and/or enlarge feasible matching sets (by enabling multi-party exchange). This lens is valuable in our framework because it identifies precisely where such interventions enter the system dynamics—as changes in effective arrival rates, feasible OD sets, or pooling constraints—and therefore how they propagate into lower bounds and into the structure of implementable policies.

2.3. Dynamic Matching, Freight Exchanges, and Information Frictions

Digital freight exchanges and platform-mediated brokerage have renewed scholarly interest in dynamic matching and repositioning in trucking markets. Relative to classical planning models, the salient novelty is that assignment feasibility and profitability are mediated by search, market mechanisms, and information availability, thereby introducing information frictions beyond physical imbalance. Dynamic-equilibrium and hyperpath perspectives on freight exchanges clarify how routing options and market access shape trucking decisions [5,6]. Complementary empirical and behavioral studies investigate repositioning and acceptance decisions under platform signals and uncertainty [24], as well as determinants of matching outcomes in online freight markets [25]. Broader typologies and mechanisms of digital freight platforms are surveyed in [26].
Our model incorporates these market-mediated features in reduced form—through stochastic match feasibility (e.g., acceptance or award probabilities), observation delays, and search/coordination costs—and treats them as explicit axes of variation in our experiments. Conceptually, this positions our analysis at the boundary between operational control and market design: we do not seek to reproduce platform equilibrium in full generality; rather, we ask how frictions deform the attainable frontier of performance and how they reprice the marginal value of repositioning capacity in space–time.

2.4. Time–Space Networks and Dynamic Fleet Management

Time–space (time-expanded) networks constitute a standard representational apparatus for travel times, waiting, and temporal constraints in transportation planning. In equipment-repositioning applications, [3] employs a time–space network to model reefer container repositioning, illustrating how lead times and availability can be encoded via arcs and conservation constraints. Port and drayage models similarly rely on time-dependent structures—including reuse dynamics and gate congestion—that render feasibility intrinsically temporal [12,14]. The associated emissions impacts in port-related settings underscore, in turn, that time-dependent operational constraints are not mere modeling embellishments but central determinants of system performance [4].
For dynamic fleet management under uncertainty, approximate dynamic programming (ADP) and decomposition principles have proved influential at operational scale. Prior work develops ADP methods for large-scale fleet management and fleet-control approximations under uncertain demand [7,27]. These works motivate our methodological posture: in lieu of exact dynamic programming, we pursue (i) tractable relaxations that yield interpretable dual prices—shadow values of space–time capacity—and (ii) rolling-horizon controls that operationalize these prices while remaining computationally viable.
A final, increasingly salient connection concerns forecasting for optimization. Rolling-horizon operation requires forecasts or predictive signals, yet the objective-relevant forecast need not coincide with the most accurate point predictor. Recent survey work on this interface reinforces our decision to treat forecasts and platform signals as policy inputs whose value is assessed end-to-end in terms of control performance [28].
Taken together, the foregoing literatures furnish substantial modeling and algorithmic foundations for empty repositioning, collaborative utilization, market-mediated matching, and time–space representation. The long-haul corridor setting, however, confers unusual prominence on the conjunction of lead times, time windows, and information frictions. It is precisely this conjunction that motivates the theoretical instruments developed herein: lower bounds that quantify unavoidable deadhead induced by space–time imbalance and deadlines, and decompositions that separate spatial, temporal, and informational drivers of excess empty movement. Our contribution is to develop these instruments and to translate them into price-guided rolling-horizon controls, evaluated along the corresponding experimental axes.

3. Model

3.1. Time, Network, and Demand Primitives

We consider a long-haul freight corridor represented by a directed graph $G = (V, E)$, where nodes $i \in V$ are regions or terminals and arcs $(i, j) \in E$ are admissible empty relocations and loaded movements. Time is discretized into epochs $t = 0, 1, \dots, T$ with step size $\Delta > 0$ (e.g., $\Delta = 1$ h). Travel times are integer multiples of $\Delta$:
$$ \tau_{ij} \in \{1, 2, \dots\} \ \text{for each } (i, j) \in E, \qquad \tau_{ij}^{L} \in \{1, 2, \dots\} \ \text{for each loaded lane } (i, j). $$
Empty relocation on $(i, j)$ incurs cost $c_{ij} \ge 0$ per vehicle, while a loaded move on lane $(i, j)$ incurs cost $c_{ij}^{L} \ge 0$ (or yields revenue, handled equivalently by a negative cost). We assume a finite fleet of homogeneous vehicles of size $N$. This homogeneous-fleet assumption is a baseline abstraction used to keep the lower-bound construction and friction decomposition transparent. In practical implementations, heterogeneous tractors, trailer types, load capacities, and driver qualifications can be represented by adding resource-class indices, as discussed in Section 3.7.
At each epoch $t$, a random set of freight requests arrives. For tractability and to preserve a Markov structure, we aggregate requests by lane and time-window class. Let $K$ denote a finite set of time-window types. A type $k \in K$ is characterized by a pickup window length $W_k \in \mathbb{Z}_+$ and, optionally, a delivery slack parameter. For each lane $(i, j)$ and type $k$, let $A_{ij,t}^{k} \in \mathbb{Z}_+$ be the number of newly released requests at epoch $t$ that require pickup at origin $i$ within the inclusive window
$$ [t, \; t + W_k], $$
and then loaded travel of $\tau_{ij}^{L}$ time steps to destination $j$ (often $\tau_{ij}^{L} = \tau_{ij}$, but this is not required). The arrival process $\{A_{ij,t}^{k}\}$ may be nonstationary to capture temporal waves; in experiments we vary this explicitly.
  • Information frictions (reduced form). We allow the platform or market interface to induce stochastic feasibility of turning an attempted assignment into an actual load. Specifically, if the controller attempts to assign a vehicle to a request class $(i, j, k)$ at time $t$, the match succeeds with probability $0 < p_{ij,t}^{k} \le 1$, which may depend on information delay, competition, and search costs. We also allow delayed observation of arrivals: the controller observes $\hat{A}_{ij,t}^{k}$, a possibly lagged or noisy signal of $A_{ij,t}^{k}$. When represented as a fixed lag, the observation delay is $\delta \in \mathbb{Z}_+$ time steps. This yields a partially observed control problem; for the theoretical development we work with (i) a fully observed benchmark and (ii) a belief-state reduction when needed.
In the baseline formulation, $p_{ij,t}^{k}$ is treated as an exogenous reduced-form representation of match feasibility. It summarizes tender acceptance, broker or platform search effectiveness, price attractiveness, competing capacity, and other market-mediated factors that affect whether an attempted assignment becomes a realized load. A more endogenous specification can be obtained by replacing $p_{ij,t}^{k}$ with a state-dependent function
$$ p_{ij,t}^{k} = p_{ij}^{k}\left( \rho_{ij,t}, \, r_{ij,t}, \, m_{ij,t}, \, z_t \right), $$
where $\rho_{ij,t}$ denotes a local supply–demand ratio, $r_{ij,t}$ denotes the offered rate or price, $m_{ij,t}$ denotes market tightness or competing capacity, and $z_t$ collects other platform or macro-market signals. Under such a specification, prices can be treated either as exogenous signals observed by the controller or as additional control variables. The present paper keeps the matching probability in reduced form to isolate the operational consequences of information frictions, rather than to formulate a full freight-platform equilibrium model.

3.2. State and Controls

Lead times imply that the controller must track not only vehicles currently available at nodes but also vehicles in transit (empty or loaded). We use a pipeline representation.
  • Available inventory. Let $x_{i,t} \in \mathbb{Z}_+$ be the number of vehicles available at node $i$ at the start of epoch $t$.
  • Empty-transit pipeline. For each arc $(i, j) \in E$ and remaining travel time $\ell \in \{1, \dots, \tau_{ij}\}$, let $y_{ij,t}(\ell) \in \mathbb{Z}_+$ be the number of vehicles traveling empty from $i$ to $j$ with $\ell$ time steps remaining until arrival.
  • Loaded-transit pipeline. For each loaded lane $(i, j)$ and remaining travel time $\ell \in \{1, \dots, \tau_{ij}^{L}\}$, let $y_{ij,t}^{L}(\ell) \in \mathbb{Z}_+$ be the number of vehicles traveling loaded on lane $(i, j)$ with $\ell$ time steps remaining.
  • Unserved demand backlog by time to deadline. Time windows create a perishable backlog. For each origin $i$, destination $j$, and type $k$, define
    $$ b_{ij,t}^{k}(d) \in \mathbb{Z}_+, \qquad d \in \{0, 1, \dots, W_k\}, $$
    where $b_{ij,t}^{k}(d)$ is the number of outstanding requests of class $(i, j, k)$ whose pickup deadline is in $d$ steps, i.e., must be picked up no later than $t + d$. The bucket $d = 0$ denotes requests expiring during epoch $t$, which generate service failures if not matched by the end of that epoch.
  • Within-epoch event order. To remove ambiguity in the timing convention, each epoch is processed in the following order:
    • The inventory $x_t$ is the stock already available for decisions at epoch $t$; vehicles still recorded in the pipelines at time $t$ arrive only after the current-epoch decisions according to their remaining travel times.
    • Newly released and observed requests $A_{ij,t}^{k}$ enter the current backlog bucket $b_{ij,t}^{k}(W_k)$ before decisions are made.
    • The controller chooses empty dispatches and assignment attempts using the inventory and backlog available at epoch $t$.
    • Match outcomes are realized at the end of the epoch. Successful matches depart immediately into the loaded pipeline; failed attempts remain in the backlog if their pickup deadline has not expired. Vehicles assigned to failed attempts are released at the same origin for the next epoch and cannot be reused within the same epoch.
    • Pipelines advance one time step, outstanding requests age by one bucket, and expired unmatched requests are counted as failures.
  • This convention makes the pickup window $[t, t + W_k]$ literal: a request released at epoch $t$ can be served at epoch $t$ if inventory is available at its origin.
  • Full state. After incorporating newly released requests as above, the decision state at time $t$ is
    $$ s_t = \left( x_t, \, y_t, \, y_t^{L}, \, b_t \right), $$
    where $x_t = (x_{i,t} : i \in V)$, $y_t = \{ y_{ij,t}(\ell) \}$, $y_t^{L} = \{ y_{ij,t}^{L}(\ell) \}$, and $b_t = \{ b_{ij,t}^{k}(d) \}$.
At each epoch $t$, after observing available information $\mathcal{F}_t$ (which may be the full $s_t$ or a partial signal), the controller chooses:
  • Empty dispatch decisions. For each arc $(i, j)$, choose integer flows
    $$ u_{ij,t} \in \mathbb{Z}_+, $$
    representing the number of available vehicles at $i$ dispatched empty toward $j$ at epoch $t$.
  • Assignment decisions for outstanding requests. For each lane $(i, j)$, type $k \in K$, and deadline bucket $d \in \{0, 1, \dots, W_k\}$, choose
    $$ u_{ij,t}^{L}(k, d) \in \mathbb{Z}_+, $$
    representing attempted assignments of available vehicles at $i$ to requests in bucket $d$.
  • The joint feasibility constraints are
    $$ \sum_{j : (i,j) \in E} u_{ij,t} + \sum_{j : (i,j) \in E} \sum_{k \in K} \sum_{d=0}^{W_k} u_{ij,t}^{L}(k, d) \le x_{i,t} \qquad \forall i, $$
    $$ 0 \le u_{ij,t}^{L}(k, d) \le b_{ij,t}^{k}(d) \qquad \forall (i, j) \in E, \ k \in K, \ d \in \{0, \dots, W_k\}. $$
    A “wait” action is implicit by not dispatching or assigning a vehicle.
  • Match success. Attempted assignments do not necessarily materialize. Let $M_{ij,t}(k, d)$ be the realized number of successful matches from attempts $u_{ij,t}^{L}(k, d)$, with
    $$ M_{ij,t}(k, d) \sim \mathrm{Binomial}\left( u_{ij,t}^{L}(k, d), \; p_{ij,t}^{k} \right), $$
    conditionally independent given $\mathcal{F}_t$ in the baseline model. The realized matched loads enter the loaded pipeline. Failed attempts are not converted into loads; the corresponding vehicles become available again at the same origin only in the next epoch, and the unmatched requests remain in the backlog if their deadline has not expired. This timing rule prevents within-epoch double use of vehicles while preserving the interpretation of $u^{L}$ as attempts rather than guaranteed loaded moves.
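The control constraints and the match-success rule can be sketched in a few lines of Python. This is a minimal illustration under assumed data structures (plain dictionaries keyed by lane or bucket); the function names `is_feasible` and `sample_matches` are hypothetical and not part of the paper.

```python
import random
from collections import defaultdict
from typing import Dict, Tuple

Lane = Tuple[str, str]              # (origin i, destination j)
Bucket = Tuple[str, str, str, int]  # (i, j, window type k, steps to deadline d)


def is_feasible(
    x: Dict[str, int],
    u_empty: Dict[Lane, int],
    u_load: Dict[Bucket, int],
    backlog: Dict[Bucket, int],
) -> bool:
    """Check the two joint feasibility constraints for a single epoch."""
    if any(flow < 0 for flow in u_empty.values()):
        return False
    # Attempts in a bucket are bounded by that bucket's outstanding requests.
    for bucket, attempts in u_load.items():
        if not 0 <= attempts <= backlog.get(bucket, 0):
            return False
    # Empty dispatches plus attempts at each origin are bounded by inventory.
    committed: Dict[str, int] = defaultdict(int)
    for (i, _j), flow in u_empty.items():
        committed[i] += flow
    for (i, _j, _k, _d), attempts in u_load.items():
        committed[i] += attempts
    return all(total <= x.get(i, 0) for i, total in committed.items())


def sample_matches(
    u_load: Dict[Bucket, int],
    p: Dict[Tuple[str, str, str], float],
    rng: random.Random,
) -> Dict[Bucket, int]:
    """Draw M ~ Binomial(attempts, p) independently for each bucket."""
    matches = {}
    for (i, j, k, d), attempts in u_load.items():
        prob = p[(i, j, k)]
        # Sum of Bernoulli(prob) trials, one per attempted assignment.
        matches[(i, j, k, d)] = sum(rng.random() < prob for _ in range(attempts))
    return matches


# Illustrative numbers only: three vehicles at A, two expiring requests A -> B.
x = {"A": 3, "B": 0}
backlog = {("A", "B", "tight", 0): 2}
u_empty = {("A", "B"): 1}
u_load = {("A", "B", "tight", 0): 2}
if is_feasible(x, u_empty, u_load, backlog):
    print(sample_matches(u_load, {("A", "B", "tight"): 0.8}, random.Random(0)))
```

Because the realized matches are at most the number of attempts, the difference between the two is exactly the count of vehicles held at the origin until the next epoch.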

3.3. State Transitions

It is convenient to write transitions after the current-epoch arrivals have been inserted into $b_t(W_k)$ and after the controller has selected $u_t$. Define the number of failed attempts in bucket $(i, j, k, d)$ as
$$ F_{ij,t}(k, d) := u_{ij,t}^{L}(k, d) - M_{ij,t}(k, d). $$
Let $U_{ij,t}^{k}$ denote requests that expire during epoch $t$:
$$ U_{ij,t}^{k} := b_{ij,t}^{k}(0) - M_{ij,t}(k, 0) \ge 0. $$
Failed attempts in the expiring bucket are included in $U_{ij,t}^{k}$ and do not carry forward.
  • Inventory. The travel-time convention is that a vehicle dispatched at epoch $t$ on an arc with travel time $\tau$ becomes available at the destination at the start of epoch $t + \tau$. Hence moves with $\tau = 1$ are already usable at the next decision epoch, while moves with $\tau > 1$ remain in the pipeline at $t + 1$. For each node $i$, the available inventory at the next epoch is
    $$
    \begin{aligned}
    x_{i,t+1} = {} & x_{i,t} - \sum_{j} u_{ij,t} - \sum_{j,k,d} u_{ij,t}^{L}(k, d) + \sum_{j,k,d} F_{ij,t}(k, d) \\
    & + \sum_{h} y_{hi,t}(1) + \sum_{h} y_{hi,t}^{L}(1) \\
    & + \sum_{h : \tau_{hi} = 1} u_{hi,t} + \sum_{h : \tau_{hi}^{L} = 1} \sum_{k,d} M_{hi,t}(k, d).
    \end{aligned}
    $$
    Equivalently, since $u^{L} - F = M$, the assignment part reduces to subtracting successful loaded matches at the origin, while the displayed form makes the within-epoch reservation of failed attempts explicit. The last line adds vehicles dispatched during epoch $t$ that complete a one-period empty or loaded trip before epoch $t + 1$ begins.
  • Pipelines. For empty transit, vehicles already in transit advance one time step:
    $$ y_{ij,t+1}(\ell - 1) \leftarrow y_{ij,t+1}(\ell - 1) + y_{ij,t}(\ell), \qquad \ell = 2, \dots, \tau_{ij}. $$
    New empty dispatches are treated according to their travel time:
    $$ \begin{cases} \text{added directly to } x_{j,t+1}, & \tau_{ij} = 1, \\ y_{ij,t+1}(\tau_{ij} - 1) \leftarrow y_{ij,t+1}(\tau_{ij} - 1) + u_{ij,t}, & \tau_{ij} > 1. \end{cases} $$
    Analogously, loaded vehicles already in transit advance as
    $$ y_{ij,t+1}^{L}(\ell - 1) \leftarrow y_{ij,t+1}^{L}(\ell - 1) + y_{ij,t}^{L}(\ell), \qquad \ell = 2, \dots, \tau_{ij}^{L}, $$
    and successful matches are inserted as
    $$ \begin{cases} \text{added directly to } x_{j,t+1}, & \tau_{ij}^{L} = 1, \\ y_{ij,t+1}^{L}(\tau_{ij}^{L} - 1) \leftarrow y_{ij,t+1}^{L}(\tau_{ij}^{L} - 1) + \sum_{k \in K} \sum_{d=0}^{W_k} M_{ij,t}(k, d), & \tau_{ij}^{L} > 1. \end{cases} $$
    All pipeline components not assigned by these advancement or insertion equations are zero. If destination inventory should be available only after unloading or turn time, one can increase $\tau_{ij}^{L}$ accordingly.
  • Demand aging. For $d = 1, \dots, W_k$, the next-epoch backlog is
    $$ b_{ij,t+1}^{k}(d - 1) = b_{ij,t}^{k}(d) - M_{ij,t}(k, d). $$
    Newly released requests at epoch $t + 1$ are then inserted into $b_{ij,t+1}^{k}(W_k)$ before the next decision, following the event order above. If late pickup is allowed with penalty rather than outright failure, $U_{ij,t}^{k}$ can be interpreted as tardy volume and carried forward with modified costs.
This accounting preserves fleet size:
$$ \sum_{i} x_{i,t} + \sum_{i,j,\ell} y_{ij,t}(\ell) + \sum_{i,j,\ell} y_{ij,t}^{L}(\ell) = N \qquad \forall t. $$
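The transition accounting can be checked numerically with a short sketch. The code below assumes dictionary-based state containers and takes realized matches as an input; the helper names (`advance`, `insert`, `step`, `fleet_size`) and the toy numbers are hypothetical. It subtracts only successful matches from inventory, which is the reduced form of the inventory equation (attempts minus failed attempts).

```python
from collections import defaultdict
from typing import Dict, Tuple

Pipe = Tuple[str, str, int]  # (i, j, remaining steps ell)


def advance(pipe: Dict[Pipe, int], x_next: Dict[str, int]) -> Dict[Pipe, int]:
    """Move in-transit vehicles one step; those with ell == 1 arrive at j."""
    nxt: Dict[Pipe, int] = defaultdict(int)
    for (i, j, ell), n in pipe.items():
        if ell == 1:
            x_next[j] += n
        else:
            nxt[(i, j, ell - 1)] += n
    return nxt


def insert(pipe_next, x_next, i, j, n, tau):
    """Remove n departing vehicles from i and place them on lane (i, j)."""
    x_next[i] -= n
    if tau == 1:
        x_next[j] += n  # one-period trip: usable at the next epoch
    else:
        pipe_next[(i, j, tau - 1)] += n


def step(x, y, yL, b, u_empty, matches, tau, tauL):
    """One epoch of inventory, pipeline, and backlog accounting."""
    x_next = defaultdict(int, x)
    y_next, yL_next = advance(y, x_next), advance(yL, x_next)
    for (i, j), n in u_empty.items():
        insert(y_next, x_next, i, j, n, tau[(i, j)])
    for (i, j, _k, _d), m in matches.items():
        insert(yL_next, x_next, i, j, m, tauL[(i, j)])
    b_next, expired = defaultdict(int), defaultdict(int)
    for (i, j, k, d), n in b.items():
        left = n - matches.get((i, j, k, d), 0)
        if d == 0:
            expired[(i, j, k)] += left  # unmatched at the deadline
        else:
            b_next[(i, j, k, d - 1)] += left  # ages by one bucket
    return dict(x_next), dict(y_next), dict(yL_next), dict(b_next), dict(expired)


def fleet_size(x, y, yL):
    """Total vehicles on hand plus in the empty and loaded pipelines."""
    return sum(x.values()) + sum(y.values()) + sum(yL.values())


# Toy two-node corridor with a fleet of nine vehicles (assumed values).
x = {"A": 5, "B": 1}
y = {("B", "A", 2): 1}
yL = {("A", "B", 1): 2}
b = {("A", "B", "tight", 0): 2, ("A", "B", "tight", 1): 1}
matches = {("A", "B", "tight", 0): 1, ("A", "B", "tight", 1): 1}
tau = {("A", "B"): 2, ("B", "A"): 2}
x2, y2, yL2, b2, expired = step(x, y, yL, b, {("A", "B"): 1}, matches, tau, {("A", "B"): 2})
print(fleet_size(x, y, yL), fleet_size(x2, y2, yL2))  # 9 9
```

New arrivals for the next epoch would be added to the widest bucket of `b2` before the next decision, following the event order; that insertion is left out of the sketch.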

3.4. One-Period Cost and Objective

We optimize a finite-horizon expected cost. The one-period cost at epoch $t$ is
$$ g_t(s_t, u_t, \xi_{t+1}) = \underbrace{\sum_{(i,j) \in E} c_{ij} \, u_{ij,t}}_{\text{empty repositioning}} + \underbrace{\sum_{(i,j)} c_{ij}^{L} \sum_{k,d} M_{ij,t}(k, d)}_{\text{loaded operating cost}} + \underbrace{\sum_{(i,j),k} \pi_{ij}^{k} \, U_{ij,t}^{k}}_{\text{service failure/lateness}} + \underbrace{\sum_{(i,j),k,d} \kappa_{ij}^{k} \, u_{ij,t}^{L}(k, d)}_{\text{search/coordination cost}}. $$
Here $\pi_{ij}^{k}$ penalizes unserved or tardy volume and $\kappa_{ij}^{k}$ captures per-attempt frictions such as brokerage effort, platform fees, or expected negotiation cost. Let $\varpi$ denote a policy mapping information $\mathcal{F}_t$ (or belief states) to feasible controls $u_t$.
We seek
$$ \min_{\varpi} \; \mathbb{E}^{\varpi} \left[ \sum_{t=0}^{T-1} g_t(s_t, u_t, \xi_{t+1}) + g_T(s_T) \right], $$
where $g_T$ is a terminal cost (often 0 or a convex potential encouraging balanced end-of-horizon inventories). A discounted infinite-horizon variant is immediate by replacing the finite sum with $\sum_{t \ge 0} \beta_{\mathrm{disc}}^{t} \, g(s_t, u_t, \xi_{t+1})$, where $\beta_{\mathrm{disc}}$ is the discount factor.
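To make the four cost components concrete, the following sketch evaluates the one-period cost for given controls and realizations. The function name `one_period_cost`, the dictionary layout, and all numerical values are assumptions made for illustration.

```python
from typing import Dict, Tuple

Lane = Tuple[str, str]
LaneType = Tuple[str, str, str]     # (i, j, window type k)
Bucket = Tuple[str, str, str, int]  # (i, j, k, steps to deadline d)


def one_period_cost(
    u_empty: Dict[Lane, int],
    u_load: Dict[Bucket, int],
    matches: Dict[Bucket, int],
    expired: Dict[LaneType, int],
    c_empty: Dict[Lane, float],
    c_load: Dict[Lane, float],
    penalty: Dict[LaneType, float],
    kappa: Dict[LaneType, float],
) -> Dict[str, float]:
    """Return the four components of the one-period cost and their total."""
    parts = {
        # Empty repositioning: cost per dispatched empty vehicle.
        "empty": sum(c_empty[(i, j)] * n for (i, j), n in u_empty.items()),
        # Loaded operating cost: charged on realized matches only.
        "loaded": sum(c_load[(i, j)] * m for (i, j, _k, _d), m in matches.items()),
        # Service failure: penalty on requests that expired unmatched.
        "failure": sum(penalty[key] * n for key, n in expired.items()),
        # Search/coordination: charged per attempt, successful or not.
        "search": sum(kappa[(i, j, k)] * a for (i, j, k, _d), a in u_load.items()),
    }
    parts["total"] = sum(parts.values())
    return parts


# Illustrative numbers: 1 empty move, 3 attempts, 2 matches, 1 expired request.
cost = one_period_cost(
    u_empty={("B", "A"): 1},
    u_load={("A", "B", "tight", 0): 3},
    matches={("A", "B", "tight", 0): 2},
    expired={("A", "B", "tight"): 1},
    c_empty={("B", "A"): 100.0},
    c_load={("A", "B"): 80.0},
    penalty={("A", "B", "tight"): 500.0},
    kappa={("A", "B", "tight"): 5.0},
)
print(cost["total"])  # 775.0
```

The search term depends on attempts while the loaded term depends on realized matches, so a low match probability raises cost through both wasted attempts and expired requests.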

3.5. MDP and Time–Space Interpretation

Under full observability, with the event order and transition equations above, the process $\{s_t\}$ is Markov and defines an MDP with a large but structured state space. The time–space network viewpoint is useful for relaxations and lower bounds: the pipelines $y, y^{L}$ correspond to flows on time-expanded relocation and service arcs, while the backlog vectors $b^{k}(d)$ encode the reachability constraints induced by time windows. This representation is standard in equipment repositioning and fleet planning models and will be leveraged in Section 4 to construct dynamic lower bounds via relaxations and dual prices [3,7].
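For reference, a finite-horizon MDP of this kind admits the standard Bellman recursion under full observability. The value-function symbol $V_t$ and the feasible-control set $\Phi(s_t)$ (the controls satisfying the joint feasibility constraints of Section 3.2) are introduced here for illustration and are not part of the original notation.

$$
V_T(s_T) = g_T(s_T), \qquad
V_t(s_t) = \min_{u_t \in \Phi(s_t)} \mathbb{E}\left[ g_t(s_t, u_t, \xi_{t+1}) + V_{t+1}(s_{t+1}) \,\middle|\, s_t, u_t \right], \quad t = 0, \dots, T-1.
$$

Under this recursion, $V_0(s_0)$ equals the optimal expected cost of Section 3.4. Because $s_t$ collects inventories, both pipelines, and every deadline bucket, evaluating the recursion exactly is impractical at corridor scale, which is what motivates relaxations and rolling-horizon approximations.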

3.6. Parameters for Experiments and Extensions

The model exposes three parameter families that align with our experimental axes:
  • Spatial imbalance: Lane- and node-level arrival intensities $\lambda^{k}_{ij,t} := \mathbb{E}[A^{k}_{ij,t}]$ and their asymmetry across the corridor.
  • Temporal imbalance and time-window tightness: Nonstationarity of $\lambda^{k}_{ij,t}$ over $t$ and the window widths $W^{k}$ (and, if modeled, delivery slack).
  • Information frictions: Match success probabilities $p^{k}_{ij,t}$, observation delays/noise mapping $A \mapsto \hat{A}$, and search/coordination costs $\kappa^{k}_{ij}$.
Performance metrics reported later (empty-distance ratio, service failure/tardiness, and total cost) are all functions of sample-path realizations induced by $\varpi$ under these parameters.

3.7. Scope of the Baseline Model and Operational Extensions

The baseline formulation intentionally uses a homogeneous vehicle resource, reduced-form match feasibility, and corridor-level time–space constraints. These assumptions allow the lower bounds and friction decomposition to remain analytically transparent. They should not be interpreted as excluding richer operational constraints; rather, such constraints can be incorporated by enlarging the state and action spaces.
First, vehicle and driver heterogeneity can be represented by introducing a finite resource-class index $q \in \mathcal{Q}$. The inventory state then becomes $x^{q}_{i,t}$, transit pipelines become $y^{\varnothing,q}_{ij,t}$ and $y^{L,q}_{ij,t}$, and each request class $(i,j,k)$ is associated with an eligibility set $\mathcal{Q}^{k}_{ij} \subseteq \mathcal{Q}$. Assignment variables are indexed as $u^{L,q}_{ij,t}(k,d)$, with feasibility restricted to $q \in \mathcal{Q}^{k}_{ij}$. Vehicle-specific costs, capacities, and travel-time characteristics can likewise be encoded as $c^{\varnothing,q}_{ij}$, $c^{L,q}_{ij}$, and $\tau^{q}_{ij}$. This extension captures tractor type, trailer configuration, load capacity, and driver qualification requirements while preserving the time–space structure of the model.
Second, the match success probability $p^{k}_{ij,t}$ is a reduced-form representation of market-mediated feasibility. In an endogenous market specification, it can be replaced by a function of local supply–demand ratios, prices or offered rates, platform competition, and other market signals. Prices may be treated either as exogenous signals or as additional control variables. The present paper keeps this component exogenous because its primary objective is to decompose the effects of spatial imbalance, time-window constraints, and information frictions on empty repositioning and service reliability, rather than to solve a full freight-market equilibrium problem.
Third, driver hours-of-service restrictions can be incorporated by augmenting the resource state with remaining driving time, remaining duty time, and rest status. For example, each vehicle or vehicle–driver resource class can be indexed by a discretized hours-of-service state $h$, so that the state includes $x^{q,h}_{i,t}$. Feasible empty and loaded arcs would then be restricted to movements that do not violate driving-time, duty-time, or mandatory-rest requirements. Rest arcs can be added to the time–space network in the same way as waiting arcs, with transitions that replenish available duty or driving time. Including such restrictions would reduce the feasible assignment and repositioning sets and increase the shadow value of compliant capacity at particular space–time nodes, and may require earlier repositioning or additional buffering in rolling-horizon policies.
These extensions preserve the basic stochastic control and time–space structure of the framework, but they increase the dimensionality of the state and the computational burden of the rolling-horizon optimization. For this reason, they are treated as operational extensions rather than as part of the baseline model studied in the numerical experiments.

4. Theory: Dynamic Lower Bounds and Friction Decomposition

This section develops two theoretical pillars. First, we derive computable dynamic lower bounds on the stochastic control problem in Section 3; their dual variables yield shadow prices for space–time vehicle capacity. Second, we introduce a friction decomposition that attributes deadhead and service loss to spatial imbalance, temporal misalignment induced by lead times and time windows, and information frictions.
Throughout, we focus on the fully observed benchmark MDP; the same constructions can be applied to belief-state formulations, at the cost of additional notation.

4.1. Benchmark Objective and Notation

Let $\varpi$ denote an admissible (non-anticipative) policy and let
$$J^{\star} := \inf_{\varpi}\ \mathbb{E}^{\varpi}\!\left[\sum_{t=0}^{T-1} g_t(s_t,u_t,\xi_{t+1})\right] \tag{24}$$
be the optimal expected cost over horizon $T$ (terminal cost omitted for simplicity). We write $C^{\varnothing}(\varpi)$ for expected empty-move cost, $C^{\mathrm{fail}}(\varpi)$ for expected service failure/lateness penalties, and similarly for other components so that
$$J(\varpi) = C^{\varnothing}(\varpi) + C^{L}(\varpi) + C^{\mathrm{fail}}(\varpi) + C^{\mathrm{search}}(\varpi). \tag{25}$$
All results below hold with minor modifications for discounted infinite horizons.

4.2. A Time-Expanded “Prophet” Relaxation

A standard way to obtain lower bounds in dynamic allocation is to relax non-anticipativity and allow a prophet (clairvoyant) controller that knows all future uncertainty $\{\xi_t\}$ at time 0. Since the prophet can condition on information unavailable to any admissible online policy, the prophet optimum is no larger than the online optimum and therefore provides a lower bound.
We first rewrite the system in a time–space (time-expanded) network. Define time-indexed nodes $(i,t)$ for $i \in \mathcal{V}$ and $t \in \{0,\dots,T\}$. Consider three types of arcs:
  • Holdover arcs $(i,t) \to (i,t+1)$ for $t+1 \le T$ (waiting);
  • Empty arcs $(i,t) \to (j,t+\tau_{ij})$ for $(i,j) \in \mathcal{E}$ and $t+\tau_{ij} \le T$ (repositioning);
  • Loaded arcs $(i,t) \to (j,t+\tau^{L}_{ij})$ for loaded lanes $(i,j)$ and $t+\tau^{L}_{ij} \le T$, representing the service of a request released at $t$ whose pickup is executed at $t$ (for now).
Time windows can be enforced by allowing service arcs for a request only within its feasible pickup epochs. To keep notation compact, let $\mathcal{A}^{H}$ be the set of holdover arcs, $\mathcal{A}^{\varnothing}$ the set of empty arcs, and $\mathcal{A}^{L}(\omega)$ the set of service arcs induced by a sample path $\omega$ of request arrivals, time windows, and exogenous match-feasibility primitives. Holdover arcs have zero operating cost. For each request $r \in \mathcal{R}(\omega)$, let $\mathcal{A}^{L}_{r}(\omega) \subseteq \mathcal{A}^{L}(\omega)$ denote the feasible service arcs for request $r$ (pickup times within its window). For a time–space node $v$, let $\delta^{+}(v)$ and $\delta^{-}(v)$ denote the outgoing and incoming arc sets, respectively. In this relaxation, all primitives are fixed before optimization. Because match outcomes in the original model are generated only after an attempt, we use a policy-independent coupling for the prophet benchmark: each potential request unit and feasible pickup arc is assigned an ex ante feasibility indicator (or, equivalently, a common random seed) before optimization. A service arc is included only when the corresponding request, time window, and feasibility indicator permit service. This convention makes the feasible arc set a property of the sample path rather than of a particular online policy’s realized attempts.
Let $f_a \ge 0$ be the flow on arc $a$ (vehicles traversing that arc). The prophet relaxation imposes vehicle-flow capacity at each time–space node: outgoing flow cannot exceed available inflow. If $b_{i,t}$ denotes the exogenous supply at $(i,t)$ from the initial inventory (at $t=0$) plus arrivals of in-transit vehicles, then the feasible set is described by linear constraints in the time-expanded network. Because unused capacity is not forced to be carried to a terminal node, this formulation permits free disposal of unused vehicles; this can only decrease the optimal value and is therefore valid for lower-bounding purposes. The relaxation also omits non-negative search/coordination costs and failed-attempt reservation effects, which can only further reduce the relaxed objective value.
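Prophet program. Fix such a realization $\omega$. Define the prophet minimum-cost flow:
$$
(\mathrm{P}(\omega))\qquad
\begin{aligned}
\min_{f \ge 0,\ 0 \le u \le 1}\quad & \sum_{a \in \mathcal{A}^{H}} 0 \cdot f_a + \sum_{a \in \mathcal{A}^{\varnothing}} c^{\varnothing}_a f_a + \sum_{a \in \mathcal{A}^{L}(\omega)} c^{L}_a f_a + \sum_{r \in \mathcal{R}(\omega)} \pi_r u_r \\
\text{s.t.}\quad & \sum_{a \in \delta^{+}(v)} f_a - \sum_{a \in \delta^{-}(v)} f_a \le b_v, \qquad \forall v = (i,t), \\
& \sum_{a \in \mathcal{A}^{L}_{r}(\omega)} f_a + u_r = 1, \qquad \forall r \in \mathcal{R}(\omega).
\end{aligned}
\tag{26}
$$
Here $\mathcal{R}(\omega)$ indexes individual requests (or request units) in sample path $\omega$; $u_r$ indicates unserved volume, and the relaxation uses $0 \le u_r \le 1$. Costs on arcs aggregate operating terms; holdover arcs have cost zero, $c^{\varnothing}_a$ is $c^{\varnothing}_{ij}$ for an empty arc $(i,t) \to (j,t+\tau_{ij})$, etc. Program (26) can be written as a linear program (LP) when requests are divisible or aggregated; the integer structure is not essential for lower bounds since we will relax integrality anyway.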
Proposition 1 (Prophet lower bound). Let $z^{\star}(\omega)$ be the optimal value of (P($\omega$)). Then
$$\mathbb{E}[z^{\star}(\omega)] \le J^{\star}. \tag{27}$$
Proof. Any admissible policy is non-anticipative and hence feasible under the realized sample path $\omega$ when interpreted as a time–space flow. The prophet relaxation enlarges the feasible set by allowing decisions to depend on future information. Minimizing over a superset yields a cost no larger than the optimal non-anticipative cost. Taking expectations preserves the inequality.    □
Interpretation. Proposition 1 provides a baseline bound, but its principal value for our purposes is structural: the dual of (26) yields space–time prices that will later reappear as guiding signals in rolling-horizon control.

4.3. Dual Prices and a Lagrangian Lower Bound with Separability

We next derive a bound that is both interpretable and algorithmically useful: a Lagrangian relaxation that dualizes space–time vehicle conservation.
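Let $\lambda_v \ge 0$ be the Lagrange multiplier for the conservation constraint written as
$$\sum_{a \in \delta^{+}(v)} f_a - \sum_{a \in \delta^{-}(v)} f_a - b_v \le 0. \tag{28}$$
The Lagrangian of (P($\omega$)) becomes
$$\mathcal{L}(f,u;\lambda) = \sum_{a \in \mathcal{A}^{H}} \big(\lambda_{\mathrm{tail}(a)} - \lambda_{\mathrm{head}(a)}\big) f_a + \sum_{a \in \mathcal{A}^{\varnothing}} \big(c^{\varnothing}_a + \lambda_{\mathrm{tail}(a)} - \lambda_{\mathrm{head}(a)}\big) f_a + \sum_{a \in \mathcal{A}^{L}(\omega)} \big(c^{L}_a + \lambda_{\mathrm{tail}(a)} - \lambda_{\mathrm{head}(a)}\big) f_a + \sum_{r} \pi_r u_r - \sum_{v} \lambda_v b_v, \tag{29}$$
where $\mathrm{tail}(a)$ and $\mathrm{head}(a)$ are the tail/head nodes of arc $a$. Minimizing $\mathcal{L}$ over $(f,u)$ with only request-cover constraints yields a dual function that decomposes by request.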
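Theorem 1 (Price-based (Lagrangian) lower bound). Let
$$\Lambda := \Big\{ \lambda \ge 0 \,:\, c^{\varnothing}_a + \lambda_{\mathrm{tail}(a)} - \lambda_{\mathrm{head}(a)} \ge 0 \ \ \forall a \in \mathcal{A}^{\varnothing}, \quad \lambda_{\mathrm{tail}(a)} - \lambda_{\mathrm{head}(a)} \ge 0 \ \ \forall a \in \mathcal{A}^{H} \Big\}. \tag{30}$$
For any $\lambda \in \Lambda$, define the per-request generalized service cost
$$\tilde{c}_r(a;\lambda) := c^{L}_a + \lambda_{\mathrm{tail}(a)} - \lambda_{\mathrm{head}(a)}, \qquad a \in \mathcal{A}^{L}_{r}(\omega), \tag{31}$$
and the per-empty-arc generalized empty cost
$$\tilde{c}^{\varnothing}(a;\lambda) := c^{\varnothing}_a + \lambda_{\mathrm{tail}(a)} - \lambda_{\mathrm{head}(a)}, \qquad a \in \mathcal{A}^{\varnothing}. \tag{32}$$
For holdover arcs, the corresponding reduced cost is $\lambda_{\mathrm{tail}(a)} - \lambda_{\mathrm{head}(a)}$, reflecting their zero operating cost. Then the optimal value $z^{\star}(\omega)$ of (P($\omega$)) satisfies
$$z^{\star}(\omega) \ \ge\ -\sum_{v} \lambda_v b_v + \sum_{r \in \mathcal{R}(\omega)} \min\Big\{ \pi_r,\ \min_{a \in \mathcal{A}^{L}_{r}(\omega)} \tilde{c}_r(a;\lambda) \Big\}. \tag{33}$$
Consequently,
$$\sup_{\lambda \in \Lambda} \Big\{ -\sum_{v} \lambda_v b_v + \sum_{r \in \mathcal{R}(\omega)} \min\Big\{ \pi_r,\ \min_{a \in \mathcal{A}^{L}_{r}(\omega)} \tilde{c}_r(a;\lambda) \Big\} \Big\} \ \le\ z^{\star}(\omega), \tag{34}$$
and
$$\mathbb{E}\Big[ \sup_{\lambda \in \Lambda} \Big\{ -\sum_{v} \lambda_v b_v + \sum_{r \in \mathcal{R}(\omega)} \min\Big\{ \pi_r,\ \min_{a \in \mathcal{A}^{L}_{r}(\omega)} \tilde{c}_r(a;\lambda) \Big\} \Big\} \Big] \ \le\ \mathbb{E}[z^{\star}(\omega)] \ \le\ J^{\star}. \tag{35}$$
Proof. For any feasible $(f,u)$ in (P($\omega$)) and any $\lambda \ge 0$, the conservation residual $\sum_{a \in \delta^{+}(v)} f_a - \sum_{a \in \delta^{-}(v)} f_a - b_v$ is nonpositive. Therefore the Lagrangian value is no larger than the original objective value at any feasible solution. Minimizing the Lagrangian after dropping conservation constraints gives a valid lower bound. If $\lambda \in \Lambda$, the holdover-arc and empty-arc terms have non-negative reduced costs and hence are minimized by zero such flow in the relaxed problem. The remaining minimization separates by request: each request is either left unserved at cost $\pi_r$ or assigned to its least generalized-cost feasible service arc. Taking the infimum over feasible primal solutions and then the supremum over $\lambda \in \Lambda$ preserves the lower-bound inequality; Proposition 1 gives the final comparison with $J^{\star}$.    □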
Discussion. The multipliers $\lambda_{i,t}$ admit a precise marginal-value interpretation: they represent the value of an additional unit of vehicle capacity at location $i$ and time $t$ in the relaxed space–time system. Generalized costs $\tilde{c}$ translate each candidate action into “physical cost net of downstream value,” thereby producing exactly the price structure exploited by rolling-horizon controllers. We will return to this connection in Section 5.
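The separable bound of Theorem 1 is cheap to evaluate for any candidate price vector. The sketch below does so on a hypothetical two-location, three-epoch instance with one vehicle at $(A,0)$ and one request that can only be picked up at $(B,1)$; the instance, the function names, and the arc encoding are illustrative assumptions. By inspection, the prophet optimum of this toy instance is 7 (reposition for 3, then serve for 4, versus a penalty of 10).

```python
def in_lambda_set(lam, empty_arcs, holdover_arcs):
    """Membership in Lambda: non-negative prices and non-negative reduced
    costs on every empty and holdover arc."""
    ok_sign = all(v >= 0 for v in lam.values())
    ok_empty = all(c + lam[tail] - lam[head] >= 0 for tail, head, c in empty_arcs)
    ok_hold = all(lam[tail] - lam[head] >= 0 for tail, head in holdover_arcs)
    return ok_sign and ok_empty and ok_hold


def lagrangian_bound(lam, supply, requests):
    """-sum_v lam_v b_v + sum_r min(pi_r, cheapest generalized service cost)."""
    supply_term = -sum(lam[v] * b for v, b in supply.items())
    request_term = 0
    for penalty, service_arcs in requests:
        reduced = [c + lam[tail] - lam[head] for tail, head, c in service_arcs]
        request_term += min([penalty] + reduced)
    return supply_term + request_term


# Hypothetical toy instance (all travel times equal to one epoch).
nodes = [(loc, t) for loc in "AB" for t in range(3)]
holdover_arcs = [((loc, t), (loc, t + 1)) for loc in "AB" for t in range(2)]
empty_arcs = ([(("A", t), ("B", t + 1), 3) for t in range(2)]
              + [(("B", t), ("A", t + 1), 3) for t in range(2)])
supply = {("A", 0): 1}
requests = [(10, [(("B", 1), ("A", 2), 4)])]  # (penalty, feasible service arcs)

lam_zero = {v: 0 for v in nodes}
lam_tight = {**lam_zero, ("B", 0): 3, ("B", 1): 3}  # price capacity at B early

bound_zero = lagrangian_bound(lam_zero, supply, requests)
bound_tight = lagrangian_bound(lam_tight, supply, requests)
```

With zero prices the bound only sees the loaded cost of 4; pricing capacity at $B$ at the repositioning cost closes the gap to 7 on this instance.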

4.4. An Imbalance Lower Bound: Unavoidable Empty Movement

While the prophet/Lagrangian bounds incorporate full space–time constraints, it is also useful to have a coarse bound that isolates the effect of spatial imbalance alone, abstracting away time windows and information frictions. This bound provides a baseline notion of unavoidable deadhead even under ideal temporal coordination.
Consider the aggregated net loaded flow requirement over the horizon. Let $D_{ij,t}$ denote the realized number of served loads on lane $(i,j)$ at time $t$, and define the net vehicle surplus at node $i$ over the horizon:
$$\Delta_i := \sum_{t} \sum_{j} D_{ji,t} - \sum_{t} \sum_{j} D_{ij,t}, \tag{36}$$
i.e., loaded arrivals minus loaded departures. If $\Delta_i > 0$, node $i$ accumulates vehicles and therefore must send net empty flow out if terminal inventories are fixed; if $\Delta_i < 0$, node $i$ requires net inflow of vehicles via empty repositioning.
Let $d_{ij}$ be a metric-like distance on $\mathcal{V}$ for empty moves (e.g., shortest-path distance in $G$). Let $C^{\varnothing,\star}$ be the minimal total empty distance required to rebalance the horizon-aggregated surpluses:
$$C^{\varnothing,\star}(\Delta) := \min_{z_{ij} \ge 0} \sum_{i,j} d_{ij} z_{ij} \quad \text{s.t.} \quad \sum_{j} z_{ij} - \sum_{j} z_{ji} = \Delta_i, \ \ \forall i. \tag{37}$$
This is the Earth-mover/min-cost circulation needed to offset the net imbalance induced by loaded moves.
Proposition 2 (Imbalance lower bound). For any policy $\varpi$ and any realization of served loaded flows $(D_{ij,t})$, suppose node-level terminal inventories are fixed to their initial levels over the aggregation horizon, or equivalently that any net terminal-inventory change is accounted for before forming $\Delta$. Then the total empty distance traveled satisfies
$$\mathrm{EmptyDistance}(\varpi) \ \ge\ C^{\varnothing,\star}(\Delta). \tag{38}$$
Taking expectations,
$$\mathbb{E}[\mathrm{EmptyDistance}(\varpi)] \ \ge\ \mathbb{E}\big[ C^{\varnothing,\star}(\Delta) \big]. \tag{39}$$
In particular, under any service-level target that pins down the expected served OD volumes, the right-hand side yields an unavoidable deadhead baseline driven purely by spatial imbalance.
Proof. Under the stated terminal-inventory accounting, vehicle conservation over the horizon implies that net surpluses and deficits created by loaded departures/arrivals must be balanced by empty relocations. If we ignore timing and allow instantaneous empty moves at distance cost $d_{ij}$, the minimal empty distance needed to satisfy the resulting net balance is exactly (37). Any feasible trajectory induces a feasible flow $z_{ij}$ with a total distance no smaller than the optimum.    □
Use. Proposition 2 isolates the spatial driver of deadhead under the stated terminal-inventory accounting and served-OD-volume comparison. The gap between realized empty distance and this bound is therefore interpreted below only for comparisons that hold those quantities fixed; outside such comparisons, the bound is a diagnostic reference rather than a directly comparable performance floor.
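On a line corridor with shortest-path distances, the rebalancing program (37) has a closed form: every vehicle of cumulative surplus to the left of a segment must cross that segment, so the minimum empty distance is the sum over segments of the absolute prefix surplus times the segment length. The sketch below implements this special case only; the function name, node positions, and surplus vector are hypothetical, and a general network would require a min-cost flow solver instead.

```python
from itertools import accumulate


def line_imbalance_bound(delta, positions):
    """Minimum empty distance to rebalance surpluses on a line corridor.

    delta[i]     : net vehicle surplus at node i (must sum to zero)
    positions[i] : location of node i along the corridor, increasing
    Assumes d_ij is the distance along the line between nodes i and j.
    """
    if len(delta) != len(positions):
        raise ValueError("delta and positions must have the same length")
    if sum(delta) != 0:
        raise ValueError("surpluses must sum to zero")
    prefix = accumulate(delta)  # net surplus to the left of each segment
    gaps = (b - a for a, b in zip(positions, positions[1:]))
    return sum(abs(s) * g for s, g in zip(prefix, gaps))


# Hypothetical corridor: node 0 ends with 3 extra vehicles,
# node 1 is short by 1 and node 3 is short by 2.
delta = [3, -1, 0, -2]
positions = [0, 100, 250, 400]  # e.g., kilometres from the first node
bound = line_imbalance_bound(delta, positions)
```

Here one vehicle travels 100 and two travel 400, so the bound is 900 distance units regardless of how the moves are timed.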

4.5. Friction Decomposition

We now introduce a decomposition that maps performance gaps to three friction families aligned with the experimental design: (i) spatial imbalance, (ii) temporal constraints (lead times and time windows), and (iii) information frictions.
A nested sequence of relaxations. Let $J_{\mathrm{full}}$ denote the optimal value of the full model (Section 3). Define two counterfactual models obtained by removing frictions:
  • Temporal relaxation (remove time constraints). Consider a model in which time windows are infinite ($W^{k} = \infty$) and lead times are zero ($\tau_{ij} = 0$ for empty moves and $\tau^{L}_{ij} = 0$ for loaded moves), while preserving total OD demand volumes over the horizon. Let the optimal value be $J_{\mathrm{sp}}$ (“spatial-only”).
  • Information relaxation (remove information frictions). Consider a model with full observability and deterministic match feasibility ($p^{k}_{ij,t} \equiv 1$, zero observation delay, and $\kappa \equiv 0$), but retaining lead times and time windows. Let the optimal value be $J_{\mathrm{st}}$ (“space–time without info frictions”).
These models define a natural chain:
$$J_{\mathrm{sp}} \ \le\ J_{\mathrm{st}} \ \le\ J_{\mathrm{full}}. \tag{40}$$
The inequalities hold because each step adds constraints or frictions and therefore cannot decrease the optimal cost.
Lemma 1 (Monotonicity under friction removal). Let $M_{\mathrm{sp}}$, $M_{\mathrm{st}}$, and $M_{\mathrm{full}}$ be the spatial-only, space–time (no info frictions), and full models defined above. Then
$$J^{\star}(M_{\mathrm{sp}}) \ \le\ J^{\star}(M_{\mathrm{st}}) \ \le\ J^{\star}(M_{\mathrm{full}}). \tag{41}$$
Proof. The spatial-only model removes lead-time and time-window restrictions from the space–time model while preserving the same aggregate demand volumes, so every feasible solution of $M_{\mathrm{st}}$ maps to a feasible solution of $M_{\mathrm{sp}}$ with the same cost accounting. Likewise, under the common coupling used here, removing information frictions replaces delayed or stochastic match realization with full observability, deterministic feasibility, and zero search cost. With non-negative search costs and the same service-penalty accounting, this relaxation cannot increase the optimal expected cost. These two friction-removal relaxations yield the stated inequalities.    □
Cost decomposition. Define the following nested accounting components:
$$\Phi_{\mathrm{base}} := J_{\mathrm{sp}}, \qquad \Phi_{\mathrm{time}} := J_{\mathrm{st}} - J_{\mathrm{sp}}, \qquad \Phi_{\mathrm{info}} := J_{\mathrm{full}} - J_{\mathrm{st}}. \tag{42}$$
Then
$$J_{\mathrm{full}} = \Phi_{\mathrm{base}} + \Phi_{\mathrm{time}} + \Phi_{\mathrm{info}}. \tag{43}$$
Equation (43) is an exact accounting identity by construction for this particular nesting of relaxations. The baseline term $\Phi_{\mathrm{base}}$ is the spatial-only benchmark cost, not a marginal friction premium relative to a zero-friction model. The marginal premiums $\Phi_{\mathrm{time}}$ and $\Phi_{\mathrm{info}}$ are order-dependent, and interaction effects are allocated according to the chosen order. A different nesting, or a Shapley-style attribution over all orderings, would generally allocate those interaction effects differently.
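The nested accounting is a telescoping difference, which the following sketch makes explicit. The three input values are made-up placeholders chosen only to show the arithmetic; they are not results from the experiments, and the function name is hypothetical.

```python
def nested_decomposition(j_sp, j_st, j_full):
    """Split J_full into base, time and info components for the nesting
    spatial-only -> space-time -> full."""
    if not (j_sp <= j_st <= j_full):
        raise ValueError("expected J_sp <= J_st <= J_full")
    return {
        "base": j_sp,           # spatial-only benchmark cost
        "time": j_st - j_sp,    # premium from lead times and time windows
        "info": j_full - j_st,  # premium from information frictions
    }


# Placeholder optimal values (illustrative only).
parts = nested_decomposition(j_sp=100.0, j_st=130.0, j_full=145.0)
total = sum(parts.values())
```

Because the components telescope, their sum reproduces the full-model value exactly; changing the order in which frictions are removed would change the individual premiums but not the total.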
Deadhead decomposition (distance-based). Because the objective includes multiple cost terms, practitioners often prefer a decomposition of empty distance itself. We propose a parallel decomposition using the imbalance lower bound.
Let $E_{\mathrm{full}}$ be the minimal expected empty distance in the full model when empty distance is used as the objective, with the same served OD volumes and service requirements as the comparison models. Define
$$E^{\mathrm{LB}}_{\mathrm{sp}} := \mathbb{E}\big[ C^{\varnothing,\star}(\Delta) \big], \tag{44}$$
the imbalance lower bound from Proposition 2, and let $E_{\mathrm{st}}$ be the minimal expected empty distance in the space–time model without info frictions (same lead times/windows, $p \equiv 1$, full observation).
We then define
$$\Psi_{\mathrm{spatial}} := E^{\mathrm{LB}}_{\mathrm{sp}}, \qquad \Psi_{\mathrm{time}} := E_{\mathrm{st}} - E^{\mathrm{LB}}_{\mathrm{sp}}, \qquad \Psi_{\mathrm{info}} := E_{\mathrm{full}} - E_{\mathrm{st}}. \tag{45}$$
The same nesting argument as in Lemma 1 applies to the empty-distance objective under the common served-OD-volume comparison, and Proposition 2 gives $E_{\mathrm{st}} \ge E^{\mathrm{LB}}_{\mathrm{sp}}$. Under these stated comparison conditions, the three terms are non-negative. Hence
$$E_{\mathrm{full}} = \Psi_{\mathrm{spatial}} + \Psi_{\mathrm{time}} + \Psi_{\mathrm{info}}. \tag{46}$$
As with the cost decomposition, (46) is an order-dependent accounting decomposition under a common served-OD-volume and terminal-inventory comparison, not a unique causal decomposition of realized empty miles on arbitrary sample paths.
Interpretation and policy mapping.
  • $\Psi_{\mathrm{spatial}}$ captures deadhead that is unavoidable given net OD imbalance, even with perfect temporal coordination and information.
  • $\Psi_{\mathrm{time}}$ captures additional deadhead induced by lead times and time windows: vehicles must reposition earlier and may need to “over-move” to satisfy reachability under deadlines.
  • $\Psi_{\mathrm{info}}$ captures additional deadhead induced by imperfect/lagged information and stochastic match feasibility: failed attempts and delayed visibility lead to mispositioning and reactive relocations.
This separation aligns directly with interventions discussed later: pooling/collaboration changes the effective imbalance (reducing $\Psi_{\mathrm{spatial}}$); appointment systems and facility operations change effective window widths (reducing $\Psi_{\mathrm{time}}$); and platform adoption improves feasibility/visibility (increasing $p$ or reducing delay, thus reducing $\Psi_{\mathrm{info}}$).

4.6. Operational Meaning of Shadow Prices

The dual multipliers $\lambda_{i,t}$ in Theorem 1 admit a marginal-value interpretation: in the relaxed space–time system, $\lambda_{i,t}$ is the opportunity cost of consuming one unit of vehicle capacity at $(i,t)$. This yields generalized arc costs and a time-window implication that will be leveraged algorithmically.
Corollary 1 (Generalized arc costs). Fix multipliers $\lambda$. For an empty move $(i,t) \to (j,t+\tau_{ij})$, the generalized cost is
$$\tilde{c}^{\varnothing}_{ij,t}(\lambda) = c^{\varnothing}_{ij} + \lambda_{i,t} - \lambda_{j,\,t+\tau_{ij}}. \tag{47}$$
For serving a request by picking up at time $t' \in [t, t+W^{k}]$ and delivering to $j$ at $t'+\tau^{L}_{ij}$,
$$\tilde{c}^{L}_{ij,t'}(\lambda) = c^{L}_{ij} + \lambda_{i,t'} - \lambda_{j,\,t'+\tau^{L}_{ij}}. \tag{48}$$
Thus, a price-guided controller prefers actions with low (possibly negative) generalized cost, i.e., actions that move capacity toward high-value space–time nodes.
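The generalized cost of Corollary 1 is a one-line computation once prices are stored by space–time node. The sketch below compares two candidate empty moves from the same origin; the price table, costs, and travel times are hypothetical values chosen for illustration.

```python
def generalized_cost(cost, lam, origin, dest, t_depart, travel_time):
    """Physical arc cost, plus the price of capacity consumed at the origin,
    minus the price of capacity delivered at the destination."""
    return cost + lam[(origin, t_depart)] - lam[(dest, t_depart + travel_time)]


# Hypothetical space-time prices keyed by (node, epoch).
lam = {("A", 0): 2.0, ("B", 2): 9.0, ("C", 1): 1.0}

# A -> B is the longer and costlier move, but B is scarce on arrival.
cost_to_b = generalized_cost(5.0, lam, "A", "B", t_depart=0, travel_time=2)
cost_to_c = generalized_cost(3.0, lam, "A", "C", t_depart=0, travel_time=1)
```

The move to $B$ has the higher physical cost but a negative generalized cost, so a price-guided rule would prefer it over the cheaper move to $C$.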
Implication (time-window tightness). All else equal, shrinking time windows (smaller $W^{k}$) reduces the feasible service arc set $\mathcal{A}^{L}_{r}(\omega)$. This can increase the marginal value of timely capacity at affected space–time nodes and can contribute to larger temporal-friction components in (43)–(46); the direction and magnitude of individual dual prices depend on the selected optimal multiplier and the surrounding network constraints.
This implication formalizes the intuition that appointment tightness manifests as scarcity of timely capacity. In Section 5, we translate these prices into rolling-horizon repositioning decisions.
Synthesis. Proposition 1 and Theorem 1 provide computable bounds and interpretable shadow prices for capacity indexed by location and time. Proposition 2 supplies the complementary spatial baseline. The decompositions (43) and (46) then translate the gap between attainable performance and spatial inevitability into temporal- and information-friction premiums, setting up the sensitivity analyses in Section 6.

5. Policies and Algorithms

This section develops implementable control policies for the corridor-scale problem in Section 3. Exact dynamic programming is precluded at the scales of interest, owing to the dimensionality of the state (inventories, in-transit pipelines, and perishable backlogs) and to the combinatorial nature of assignment decisions. We therefore pursue approximate control architectures that remain computationally viable while preserving the two structural constraints that govern feasibility in practice—lead times and time windows. A further desideratum is conceptual: the policies should not be ad hoc, but should inherit an interpretable decision logic from the lower-bound and dual-price constructions of Section 4.

5.1. Rolling-Horizon Planning with Terminal Prices

At each epoch $t$, given available information $\mathcal{F}_t$ (possibly partial), we solve a finite-horizon planning problem over a lookahead window of length $H$ (in time steps), execute only the first-period decisions, advance the system, and repeat. Let $\hat{\theta}_t$ denote forecast inputs at time $t$ (arrival rates, window-type mix, match success probabilities, and related quantities). Let $\mathcal{U}(s_t)$ denote the set of feasible current-epoch controls satisfying the inventory and backlog constraints in Section 3. A generic rolling-horizon policy may be written as
$$u_t \in \arg\min_{u \in \mathcal{U}(s_t)} \hat{J}^{H}_{t}(s_t, u; \hat{\theta}_t), \tag{49}$$
where $\hat{J}^{H}_{t}$ approximates the expected cost-to-go over $[t, t+H]$ augmented by a continuation proxy at $t+H$. Two ingredients govern the quality of such a policy: the tractable structure chosen for the lookahead optimization (e.g., LP/min-cost flow) and the terminal proxy that imputes value to inventories at the end of the window. We adopt a time–space planning formulation together with a terminal valuation expressed in space–time prices, thereby internalizing the opportunity cost of consuming capacity that is valuable beyond the window.
Fix $t$ and consider time–space nodes $(i,\tau)$ for $\tau = t,\dots,t+H$; here the unsubscripted $\tau$ is a time index, distinct from the travel times $\tau_{ij}$. We form a deterministic planning instance by replacing random arrivals with forecasted expected demand (or a scenario set). Let $\hat{A}^{k}_{ij,\tau}$ denote forecasted demand units with pickup windows $[\tau, \tau+W^{k}]$ restricted to the lookahead horizon. Let $f^{\varnothing}_{ij,\tau} \ge 0$ be the number of vehicles dispatched empty from $i$ at time $\tau$ to arrive at $j$ at $\tau+\tau_{ij}$; let $f^{L}_{ij,\tau,k,d} \ge 0$ be the number of loads of class $(i,j,k)$ and deadline bucket $d \in \{0,\dots,W^{k}\}$ served by picking up at time $\tau$ (with $\tau$ within the relevant time window); let $h_{i,\tau} \ge 0$ denote holdover (waiting); let $u^{k}_{ij,\tau} \ge 0$ denote unserved forecasted volume; and let $a^{k}_{ij,\tau} \ge 0$ denote attempted assignment volume. Constraints take the familiar form: conservation of vehicle flow in time–space, coverage of forecasted loads within admissible pickup epochs, and (optionally) slack for unserved volume.
The distinguishing feature is the terminal valuation. Let $\lambda_{i,\tau}$ denote a set of space–time prices (shadow values) for having one additional vehicle at node $i$ and time $\tau$. Such prices may be obtained from a longer-horizon relaxation (as in Section 4) or updated online. We approximate the continuation value by
$$V_{t+H}(x_{t+H}) \approx \sum_{i \in \mathcal{V}} \lambda_{i,t+H}\, x_{i,t+H}, \tag{50}$$
so that the lookahead problem accounts for the downstream value of capacity.
A representative deterministic lookahead program is
$$
\begin{aligned}
\min\quad & \sum_{\tau=t}^{t+H-1} \sum_{(i,j) \in \mathcal{E}} c^{\varnothing}_{ij}\, f^{\varnothing}_{ij,\tau} + \sum_{\tau=t}^{t+H-1} \sum_{(i,j),k} c^{L}_{ij} \sum_{d} f^{L}_{ij,\tau,k,d} + \sum_{\tau=t}^{t+H-1} \sum_{(i,j),k} \pi^{k}_{ij}\, u^{k}_{ij,\tau} \\
& + \sum_{\tau=t}^{t+H-1} \sum_{(i,j),k} \kappa^{k}_{ij}\, a^{k}_{ij,\tau} - \sum_{i} \lambda_{i,t+H}\, x_{i,t+H} \\
\text{s.t.}\quad & \mathcal{C}_{\mathrm{flow}}(s_t), \quad \mathcal{C}_{\mathrm{cov}}(\hat{A}), \\
& \sum_{d} f^{L}_{ij,\tau,k,d} \le p^{k}_{ij,\tau}\, a^{k}_{ij,\tau}, \qquad \forall (i,j),\, k,\, \tau, \\
& f^{\varnothing},\ f^{L},\ h,\ u,\ a \ \ge 0.
\end{aligned}
\tag{51}
$$
Here $\mathcal{C}_{\mathrm{flow}}(s_t)$ denotes vehicle conservation over the lookahead, with the initial supply fixed by the observed state and pipeline arrivals. The set $\mathcal{C}_{\mathrm{cov}}(\hat{A})$ denotes demand coverage within admissible pickup windows, with slack variables $u^{k}_{ij,\tau}$. The terminal term enters with a negative sign because vehicles left at high-price nodes at $t+H$ carry value beyond the window. The inequality involving $a^{k}_{ij,\tau}$ links planned served flow to expected match feasibility, so $a^{k}_{ij,\tau}$ is the aggregate attempt volume for lane–type pair $(i,j,k)$ at time $\tau$. Thus (51) remains a time–space LP with time-window coverage. When demands are aggregated and divisibility is acceptable, (51) can be solved efficiently even for moderate $|\mathcal{V}|$ and $H$.
The rolling-horizon controller extracts the first-period decisions as follows. Empty dispatches use $u^{\varnothing}_{ij,t} := f^{\varnothing}_{ij,t}$. Assignment decisions use the first-period attempt volume $a^{k}_{ij,t}$, allocated across deadline buckets according to the planned served-flow proportions $f^{L}_{ij,t,k,d}$, with integerization applied if needed (Section 5.4).

5.2. Prices Under Uncertainty and Adaptation

The performance of (51) is mediated by the quality of the terminal prices λ and by how uncertainty is represented in the planning instance. We therefore treat price construction and uncertainty handling as coupled design elements.
  • Offline Prices from a Long-Horizon Relaxation
A natural source of prices is a longer-horizon time–space relaxation (prophet or expected-value LP) solved over a representative day/week profile, from which one extracts dual multipliers of the space–time conservation constraints. This is directly motivated by the Lagrangian structure in Theorem 1 and the generalized-cost representation in Corollary 1. In stationary or periodically stationary corridors, the resulting $\lambda_{i,\tau}$ may be stored as a periodic table (e.g., by hour of day and day of week).
  • Online Adaptive Prices via Stochastic Approximation
When demand patterns drift, it is advantageous to adapt prices using observed scarcity. Let $\lambda_{i,t}$ be the current estimate. After executing $u_t$ and observing outcomes, define a proxy imbalance signal at $(i,t)$, for example
$$\hat{g}_{i,t} := x^{\mathrm{target}}_{i,t} - x_{i,t}, \tag{52}$$
where $x^{\mathrm{target}}$ is a desired inventory profile (possibly obtained from a fluid model). Update
$$\lambda_{i,t+1} \leftarrow \big[ \lambda_{i,t} + \eta_t\, \hat{g}_{i,t} \big]_{+}, \tag{53}$$
with step size $\eta_t > 0$ and projection onto $\mathbb{R}_{+}$. This may be viewed as a dual-ascent step for balancing constraints in a relaxation, and in practice it stabilizes rolling-horizon behavior in nonstationary environments.
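The projected update can be written in a few lines. The sketch below applies one step to a hypothetical two-node corridor; the node names, target inventories, and step size are illustrative assumptions rather than tuned values.

```python
def update_prices(lam, x_target, x_observed, step):
    """One projected dual-ascent step: raise the price where inventory is
    below target, lower it where inventory is above, and clip at zero."""
    return {
        node: max(0.0, lam[node] + step * (x_target[node] - x_observed[node]))
        for node in lam
    }


lam = {"A": 1.0, "B": 0.5}
x_target = {"A": 4, "B": 4}
x_observed = {"A": 2, "B": 7}  # A is short of vehicles, B has a surplus

new_lam = update_prices(lam, x_target, x_observed, step=0.25)
```

Node $A$ is two vehicles short, so its price rises to 1.5; node $B$ holds a surplus, and the projection keeps its price at zero rather than letting it turn negative.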
  • Match Uncertainty and Partial Observability
The full model allows delayed/noisy observation and stochastic feasibility. We incorporate these features in two pragmatic ways. First, in (51) we may plan in terms of attempted volume and enforce expected served volume via
$$\mathbb{E}[\text{served}] \ \le\ p^{k}_{ij,\tau} \cdot a^{k}_{ij,\tau}, \tag{54}$$
with penalties for failed attempts, yielding a tractable expected-value approximation. Second, when observation delay is material, we maintain a belief $\mu_t$ over true arrivals/backlogs (e.g., via a simple filter) and run the deterministic lookahead on posterior means $\mathbb{E}_{\mu_t}[A]$ and $\mathbb{E}_{\mu_t}[b]$. This certainty-equivalence policy is deliberately simple; in experiments we vary delay/noise to quantify its degradation.

5.3. A Greedy Generalized-Cost Policy (Fast Baseline)

For real-time dispatch, solving (51) at every epoch may remain burdensome in large corridors or under fine time discretization. Motivated by Corollary 1, we therefore also consider a one-step, price-guided rule that retains the economic logic of shadow values without multi-period optimization.
Given prices $\lambda$, define for an empty move from $i$ to $j$ at time $t$
$$\tilde{c}^{\varnothing}_{ij,t} = c^{\varnothing}_{ij} + \lambda_{i,t} - \lambda_{j,\,t+\tau_{ij}}. \tag{55}$$
For serving an available request class $(i,j,k)$ with deadline bucket $d$, the generalized net cost of assigning now is
$$\tilde{c}^{L}_{ij,t}(k,d) = c^{L}_{ij} + \lambda_{i,t} - \lambda_{j,\,t+\tau^{L}_{ij}} + \kappa^{k}_{ij} + (\text{expected failure penalty adjustment}). \tag{56}$$
A simple adjustment for match success probability is to inflate costs by $1/p$ and/or include an expected residual penalty $(1-p)\pi$.
Operationally, the rule is as follows: at each node $i$, allocate available vehicles to the lowest generalized-cost actions among (a) serving expiring demand (small $d$), (b) serving non-expiring demand, (c) empty repositioning toward destinations with high shadow price $\lambda_{j,\,t+\tau_{ij}}$, and (d) waiting. The resulting per-epoch complexity is $O(|\mathcal{E}|)$, yet the policy often captures the correct qualitative behavior induced by time windows and lead times.
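A single-node version of this rule can be sketched as a sort-and-fill over candidate actions. The candidate labels, generalized costs, and capacities below are hypothetical, and the sketch sorts the candidates for simplicity instead of reproducing the linear-time bookkeeping implied by the complexity statement.

```python
def greedy_allocate(vehicles, candidates):
    """Assign idle vehicles at one node to actions in increasing order of
    generalized cost.

    candidates: list of (generalized_cost, label, capacity) tuples, where
    capacity is the number of vehicles the action can absorb.
    Returns the plan {label: vehicles} and the number left unassigned.
    """
    plan = {}
    remaining = vehicles
    for cost, label, capacity in sorted(candidates):
        if remaining == 0:
            break
        n = min(remaining, capacity)
        if n > 0:
            plan[label] = n
            remaining -= n
    return plan, remaining


# Hypothetical candidates at node A with four idle vehicles.
candidates = [
    (-4.0, "serve A->B tight, d=0", 2),  # expiring appointment freight
    (-1.0, "serve A->B loose, d=3", 1),  # flexible freight
    (0.5, "empty A->C", 5),              # reposition toward C
    (0.0, "wait", 5),                    # holdover at A
]
plan, left_over = greedy_allocate(4, candidates)
```

With these numbers the two expiring loads are served first, then the flexible load, and the last vehicle waits because holding over is cheaper than the empty move to $C$.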

5.4. Integer Feasibility, Rounding, and Implementation

When vehicles and loads are indivisible, (51) produces fractional flows. We therefore employ lightweight integerization procedures that preserve feasibility at the current epoch:
  • Network-flow integrality. If the time–space program is expressed as a pure min-cost flow with integral supplies/demands and no complicating side constraints, integrality holds and the LP solution is integer.
  • Deterministic rounding. Round dispatch variables $f^{\varnothing}_{ij,t}$ and assignment variables at the current epoch, and then repair feasibility by local adjustments (e.g., greedy fill subject to inventory).
  • Randomized rounding with repair. Interpret fractional values as probabilities, sample integer decisions, and then repair to satisfy per-node inventory constraints.
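As an illustration of the third option, the sketch below rounds the fractional first-period flows leaving a single node and then repairs any inventory violation. The repair rule (trim the actions with the smallest planned flow first) and the example flows are assumptions made for this sketch; other repair orders are equally defensible.

```python
import math
import random


def round_with_repair(fractional, inventory, rng):
    """Randomized rounding of the flows leaving one node, followed by repair.

    fractional: {action: planned fractional flow}
    inventory : vehicles actually available at the node
    Each flow is rounded up with probability equal to its fractional part;
    if the rounded total exceeds inventory, units are trimmed from the
    actions with the smallest planned flow first.
    """
    rounded = {}
    for action, value in fractional.items():
        base = math.floor(value)
        rounded[action] = base + (1 if rng.random() < value - base else 0)

    excess = sum(rounded.values()) - inventory
    for action in sorted(rounded, key=fractional.get):
        if excess <= 0:
            break
        cut = min(excess, rounded[action])
        rounded[action] -= cut
        excess -= cut
    return rounded


# Hypothetical LP output at one node holding three vehicles.
fractional = {"empty A->B": 1.6, "empty A->C": 0.7, "wait": 0.7}
rng = random.Random(0)  # seeded for reproducibility
decision = round_with_repair(fractional, inventory=3, rng=rng)
```

The repaired decision is always integer and never dispatches more vehicles than the node holds, although it may leave a vehicle unassigned when several flows round down.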
We demonstrate in Section 6 that rounding-induced gaps are typically small relative to the structural effects (imbalance, window tightness, information frictions) that dominate empty-mile outcomes. Algorithm 1 provides pseudocode for the proposed price-guided rolling-horizon controller.
Algorithm 1 Price-guided rolling-horizon control
   1: Initialize prices $\lambda_{i,0}$ (offline duals or zeros), set lookahead $H$.
   2: for $t = 0, 1, \dots, T-1$ do
   3:     Observe information $\mathcal{F}_t$ and update forecasts $\hat{\theta}_t$ (arrivals, $p$, etc.).
   4:     Build the $H$-step time–space instance and solve the lookahead LP (51) (or apply the greedy generalized-cost rule).
   5:     Execute first-period empty dispatches $u^{\varnothing}_{\cdot,t}$ and assignment attempts $u^{L}_{\cdot,t}$ (with rounding if needed).
   6:     Observe realized matches and state transition to $s_{t+1}$.
   7:     Update prices via (53) (optional, if using online adaptation).
   8: end for

6. Experiments: Synthetic Corridors and Sensitivity Analysis

This section evaluates the theoretical constructs and the dispatch/repositioning policies developed in Section 5 on a controlled family of synthetic long-haul corridor instances. The experimental design is deliberately factorized along three axes mirroring the friction families in Section 4: (i) spatial imbalance, (ii) temporal imbalance coupled with time-window tightness, and (iii) information frictions. For each configuration we report (a) the empty-distance ratio, (b) service-level outcomes (unserved and late, where applicable), and (c) total cost, together with comparisons to dynamic lower bounds that connect numerical behavior back to theory.

6.1. Synthetic Corridor Environment

  • Corridor network. We construct directed corridor graphs $G = (V, E)$ with $|V| \in \{6, 10, 20\}$ nodes representing regions arranged along a line (baseline) or a branched line (robustness). Edges connect adjacent regions in both directions and may include skip links (e.g., two-hop express arcs) to represent alternative repositioning options. Empty travel times $\tau_{ij}$ are proportional to corridor distance; empty costs are set as $c_{ij} = \alpha_{\text{empty}} d_{ij}$, where $d_{ij}$ denotes shortest-path distance and $\alpha_{\text{empty}} > 0$ scales the per-distance empty cost. Loaded travel times $\tau_{ij}^{L}$ equal $\tau_{ij}$ unless stated otherwise.
    Fleet size and initial state. Fleet size $N$ is selected to meet a target utilization range (baseline: 70–85% under the rolling-horizon policy). Initial inventories are either balanced ($x_{i,0} = N/|V|$) or drawn from a skewed distribution to test transient recovery. In-transit pipelines are initialized empty.
    Stochastic demand and window types. Demand arrivals are generated by lane- and time-dependent Poisson processes,
    $A_{ij,t}^{k} \sim \mathrm{Poisson}(\lambda_{ij,t}^{k}),$
    where $k \in K$ indexes time-window types with widths $W_k \in \{1, 2, 4, 8\}$ (in time steps). Unless noted, we use two window types with mixture weights $(\rho, 1-\rho)$ and widths $(W_{\text{tight}}, W_{\text{loose}})$, thereby representing the coexistence of appointment freight (tight) and flexible freight (loose).
    Information frictions. Information frictions enter through (i) match success probabilities $p_{ij,t}^{k}$, (ii) per-attempt search/coordination costs $\kappa_{ij}^{k}$, and (iii) observation delay $\delta \in \mathbb{Z}_{+}$ measured in time steps, such that the controller observes $\hat{A}_{ij,t}^{k} = A_{ij,t-\delta}^{k}$ (baseline: $\delta = 0$). When $\delta > 0$, backlogs are filtered via a simple belief update as described in Section 5.
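A minimal sketch of the baseline environment is given below, assuming a straight-line corridor with uniform spacing between adjacent regions. The function names, the uniform-spacing assumption, and the use of Knuth's multiplication method for Poisson sampling are choices made for this illustration and are not taken from the experimental code.

```python
import math
import random


def line_corridor(n, spacing=1.0, alpha_empty=1.0):
    """Adjacent regions linked in both directions; empty cost = alpha_empty * distance."""
    edges = [(i, i + 1) for i in range(n - 1)] + [(i + 1, i) for i in range(n - 1)]
    # On a line with uniform spacing, shortest-path distance is spacing * |i - j|
    dist = {(i, j): spacing * abs(i - j) for i in range(n) for j in range(n) if i != j}
    cost = {od: alpha_empty * d for od, d in dist.items()}
    return edges, dist, cost


def poisson(rate, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    limit, k, prod = math.exp(-rate), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def observed_arrivals(arrivals, t, delta):
    """With delay delta, the controller at time t sees the arrivals of t - delta."""
    return arrivals[t - delta] if t - delta >= 0 else {}
```

For example, `line_corridor(6, spacing=100.0, alpha_empty=0.5)` yields ten directed edges and an end-to-end empty cost of 250 cost units.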

6.2. Factorized Design and Experimental Protocol

We vary three factor families on pre-specified grids while holding all remaining parameters at their baseline values. Table 1 reports the baseline configuration, and Table 2 summarizes the sensitivity design used to generate Figures 1–5.
  • Spatial imbalance parameterization. We generate baseline OD intensities $\bar{\lambda}_{ij}$ from a gravity-like form and then apply two distortions. First, concentration uses a softmax tilt,
    $\lambda_{ij,t}^{\text{base}} \propto \exp(\sigma \cdot s_{ij}),$
    where $s_{ij}$ scores a designated subset of “dominant” OD pairs. Second, directionality induces net imbalance by scaling forward vs. backward lanes by $(1 + \beta_{\text{dir}})$ and $(1 - \beta_{\text{dir}})$, producing systematic accumulation/depletion along the corridor.
    Temporal imbalance parameterization. Temporal waves are introduced through a multiplicative diurnal profile $q_t$:
    $\lambda_{ij,t}^{k} = \lambda_{ij}^{k} \left(1 + \gamma q_t\right), \quad q_t \in [-1, 1], \quad \sum_{t} q_t = 0,$
    where $\gamma$ controls peak amplitude.
    Information friction parameterization. We vary $p$ (success probability), $\delta$ (observation delay), and $\kappa$ (search cost) independently to isolate the dominant degradation channel. In variants, $p$ is made type-dependent (e.g., tighter windows have lower $p$) to represent harder-to-match appointment freight.
    Protocol. For each factor configuration, we run $R$ independent replications (baseline $R = 25$) over a horizon of $T$ periods (baseline: $T = 24 \times 7$ for a week with hourly discretization). Random seeds are fixed across policies to enable paired comparisons, using the rule reported in Table 1. We report sample means and 95% confidence intervals via the normal approximation.
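The spatial and temporal distortions can be sketched as below. Three choices here are assumptions of the example rather than statements about the experimental code: the tilt is renormalized so that total volume stays fixed (one reading of the proportionality sign), forward lanes are those with $i < j$, and the diurnal profile is a cosine wave.

```python
import math


def shape_intensities(base, scores, sigma, beta_dir):
    """Apply the softmax tilt and the directional scaling to {(i, j): intensity}."""
    total = sum(base.values())
    tilted = {od: lam * math.exp(sigma * scores.get(od, 0.0)) for od, lam in base.items()}
    scale = total / sum(tilted.values())  # assumption: total volume is held fixed
    out = {}
    for (i, j), lam in tilted.items():
        direction = 1 + beta_dir if i < j else 1 - beta_dir
        out[(i, j)] = lam * scale * direction
    return out


def diurnal_profile(period=24):
    """Cosine wave with values in [-1, 1] that sums to zero over one period."""
    return [math.cos(2 * math.pi * t / period) for t in range(period)]


def time_varying_intensity(lam, gamma, profile):
    """lam * (1 + gamma * q_t) for each t; non-negative whenever 0 <= gamma <= 1."""
    return [lam * (1 + gamma * q) for q in profile]
```

With a symmetric base matrix, the directional scaling leaves total volume unchanged and only shifts it from backward to forward lanes.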

6.3. Policies, Metrics, and Lower Bounds

  • Policies. We evaluate the following policies.
    Price-guided rolling-horizon (PG-RH). Our main method is Algorithm 1: solve the H-step time–space lookahead LP with terminal prices and execute the first-period decision. Unless noted, H equals the 90th percentile of empty lead times plus the loose window width, ensuring that the lookahead spans the principal feasibility horizon. In the numerical implementation, the same rebalancing intensity cap, bal_strength, is applied to first-period proactive empty moves in the PG-RH frontier runs; this cap is used only to sweep the service–empty-mileage frontier and is not part of the theoretical policy definition.
    Price-guided generalized-cost (PG-GC). This is a lightweight price-guided variant that ranks feasible actions by generalized costs (Section 5) and allocates vehicles greedily at each node. PG-GC is included as an implementable fast alternative to PG-RH; the main numerical tables below report the PG-RH policy unless explicitly stated otherwise.
    Myopic serve-first (Myopic). Vehicles are assigned to currently available demand in earliest-deadline-first order, breaking ties by shortest loaded travel time; no proactive empty repositioning is performed beyond what is necessary to serve expiring demand.
    Static balancing (Static). A static target inventory vector $\bar{x}$ is computed from mean demand (fluid balance). Each period, vehicles are repositioned to reduce $\|x_t - \bar{x}\|_1$ subject to feasibility, without explicit time-window anticipation. The parameter bal_strength caps the fraction of the current post-service inventory deviation that can be corrected by proactive empty repositioning in that period; the same cap is used in the PG-RH frontier runs to sweep the aggressiveness of proactive rebalancing.
    No reposition (NR). Vehicles never reposition empty unless required by a committed load; this yields low empty distance but typically poor service.
    Metrics. We report three primary metrics aligned with the paper’s objectives.
    (i) Empty-distance ratio. Let $D$ be total empty distance traveled and $D^{L}$ be total loaded distance. The empty-distance ratio is
    $\mathrm{EDR} := \frac{D}{D + D^{L}}.$
    This definition is used consistently when reporting the EDR.
    (ii) Service level. We report the unserved rate
    $\mathrm{UR} := \frac{\sum_{t,i,j,k} U_{ij,t}^{k}}{\sum_{t,i,j,k} A_{ij,t}^{k}},$
    and, where late service is permitted, the late rate (fraction served outside the window). In the baseline setting, window violations are treated as failures (penalty $\pi^{\text{fail}}$), so UR captures both rejection and expiry.
    (iii) Total cost. Total cost aggregates empty and loaded operating costs, service penalties, and search costs:
    $\mathrm{TC} := \sum_{t} \left( C_t + C_t^{L} + C_t^{\text{fail}} + C_t^{\text{search}} \right).$
    Unless otherwise stated, TC is reported in total model cost units over one simulation horizon. For comparisons across demand intensities, the same quantity can be normalized by total arrivals.
    Lower-bound comparisons. To connect experiments to theory, we compute the spatial-imbalance bound C , ( Δ ) from (37) and convert it into an EDR-form reference line for the Pareto-frontier experiment. This bound is intentionally coarse: it abstracts from deadlines and information frictions, and therefore serves as a diagnostic floor for imbalance-induced empty movement rather than as a full prophet benchmark. The time-expanded prophet relaxation in Section 4 provides the sharper theoretical benchmark, while Table 3 reports the numerical policy-performance summaries that focus on the spatial bound because it is transparent and directly interpretable in the synthetic corridor setting.
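The three metrics and the normal-approximation interval from the protocol in Section 6.2 reduce to a few lines of arithmetic. The sketch below assumes that a simulation run exposes total empty and loaded distance, per-period unserved and arrival counts, and per-period cost components; the function names and input layouts are illustrative.

```python
import math
import statistics


def empty_distance_ratio(empty_dist, loaded_dist):
    """EDR = empty distance / (empty distance + loaded distance)."""
    total = empty_dist + loaded_dist
    return empty_dist / total if total > 0 else 0.0


def unserved_rate(unserved, arrivals):
    """UR = total unserved loads / total arrivals."""
    total = sum(arrivals)
    return sum(unserved) / total if total > 0 else 0.0


def total_cost(per_period):
    """Sum of (empty, loaded, fail, search) cost tuples over all periods."""
    return sum(sum(costs) for costs in per_period)


def mean_ci95(samples):
    """Sample mean and half-width of the normal-approximation 95% interval."""
    mean = statistics.mean(samples)
    half = 1.96 * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, half
```

Because seeds are shared across policies, `mean_ci95` can also be applied to per-replication differences between two policies to obtain a paired interval.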

6.4. Numerical Illustrations and Sensitivity Findings

Figures 1–5 summarize the main numerical patterns from the synthetic corridor experiments. The figures are organized to mirror the friction decomposition in Section 4: the service–empty-mileage frontier and the spatial lower-bound comparison are shown first, followed by targeted sensitivity analyses for spatial imbalance, temporal peakiness, and information frictions.
Figure 1 plots an empirical Pareto frontier between empty-distance ratio (EDR) and unserved rate (UR) on a representative synthetic corridor instance by sweeping the rebalancing intensity parameter (bal_strength) under fixed ( σ , γ , p ) . The dashed vertical line overlays an EDR-form spatial-imbalance lower bound computed from expected net flows (Section 4), thereby providing a diagnostic reference for the unavoidable component of empty movement induced by corridor imbalance under comparable served-volume and terminal-inventory conditions; Table 4 reports the corresponding numerical values. Figure 2 isolates the effect of time-window widening: increasing the tight pickup-window width shifts the UR–EDR frontier downward, improving service reliability at comparable empty-distance ratios. This behavior is consistent with the time-friction channel in Section 4 and the operational interpretation in Section 7.
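An empirical frontier of this kind can be extracted from the swept (EDR, UR) pairs by discarding dominated points. The following sketch assumes that lower is better on both axes and that each run of the sweep contributes one pair; the function name is hypothetical.

```python
def pareto_frontier(points):
    """Return the (edr, ur) pairs that no other pair weakly improves on both axes."""
    frontier = []
    best_ur = float("inf")
    # After sorting by EDR, a point is kept only if it strictly improves UR
    for edr, ur in sorted(set(points)):
        if ur < best_ur:
            frontier.append((edr, ur))
            best_ur = ur
    return frontier
```

For instance, the input `[(0.05, 0.30), (0.10, 0.12), (0.15, 0.10), (0.12, 0.20), (0.20, 0.10)]` reduces to the first three pairs, since the last two are dominated.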
Figure 3 examines the spatial-friction channel by varying the spatial skew parameter σ . As spatial imbalance increases, all policies face higher unserved rates and higher total costs. The myopic policy avoids proactive empty repositioning and therefore maintains nearly zero EDR, but this comes at the cost of rapidly deteriorating service. Static and price-guided repositioning reduce UR by moving empty vehicles proactively. The PG-RH policy generally achieves service performance close to static balancing while using less empty movement and incurring lower total cost, illustrating the value of distance-aware repositioning when corridor imbalance is pronounced without claiming uniform service dominance.
Figure 4 varies the temporal peakiness parameter γ . In the baseline parameterization, changing γ produces more modest changes than those observed for spatial skew or match feasibility. This does not imply that temporal frictions are unimportant. Rather, it indicates that temporal nonstationarity alone is not the dominant binding constraint in this baseline corridor. The operational cost of temporal mismatch is mediated by its interaction with lead times, pickup-window tightness, and available fleet slack. This interpretation is consistent with Figure 2: widening pickup windows directly relaxes deadline-induced reachability constraints, whereas peakiness alone need not generate severe degradation unless those constraints are binding.
Figure 5 evaluates the information-friction channel by varying the match success probability p. Lower values of p represent more severe uncertainty in converting attempted assignments into realized loads. As p increases, UR and TC fall sharply, confirming that assignment feasibility is a major determinant of service reliability and cost. The PG-RH policy is particularly effective when match feasibility is moderate to high: relative to static balancing, it attains a lower EDR and lower TC while maintaining service levels that are close to, but not uniformly better than, the static policy. This pattern supports the interpretation of platform liquidity and matching reliability as reductions in the information-friction component of excess empty movement.
  • Sensitivity along the three friction axes. The figures show three qualitative patterns. First, spatial imbalance increases the unavoidable pressure to reposition: myopic dispatch avoids empty mileage but suffers substantial service loss, whereas proactive repositioning converts empty movement into improved reliability (Figure 3). Second, temporal frictions are conditional: widening tight pickup windows clearly improves the service–empty-mileage frontier (Figure 2), while peakiness alone has only a modest effect in the baseline configuration (Figure 4). This suggests that the temporal premium is driven by the interaction of demand waves with reachability constraints, not by nonstationarity alone. Third, information frictions materially affect both service and cost: higher match success probability sharply lowers UR and TC, especially for policies that actively reposition vehicles in anticipation of future loads (Figure 5).
Overall, the experiments support the interpretation developed in Section 4. Spatial imbalance establishes a lower-bound floor for empty movement; time-window and lead-time constraints determine when temporal mismatch becomes binding; and information frictions alter the reliability with which positioned capacity can be converted into realized service. Price-guided repositioning does not uniformly dominate every baseline in every metric, but it consistently provides a more favorable cost–service balance than static balancing in the main sensitivity experiments by reducing unnecessary empty exposure while preserving much of the service improvement from proactive repositioning.

7. Implications: Interventions as Parameter Shifts

This section distills the theoretical decomposition and the synthetic corridor evidence into implications for operational and market interventions. The central point is that interventions rarely act on “empty miles” as a direct object of control; instead, they alter the primitives of the corridor dispatch problem and thereby relocate the system within the feasible trade-off set between service and repositioning. Since the model distinguishes spatial imbalance, time-window/lead-time constraints, and information frictions as separate channels, it becomes possible to associate candidate interventions with explicit parameter shifts and to anticipate, ex ante, which performance components are malleable and which remain constrained by lower bounds.

7.1. A Parameter-to-Policy Map

Let Θ denote the collection of primitives introduced in Section 3:
$\Theta = \left( \lambda_{ij,t}^{k},\; W_k,\; \tau_{ij},\; p_{ij,t}^{k},\; \delta,\; \kappa_{ij}^{k},\; \pi_{ij}^{k} \right),$
where $\lambda$ encodes demand intensity and its spatial–temporal structure, $W$ the time-window widths, $\tau$ the travel/lead times, $p$ match feasibility, $\delta$ the observation delay, $\kappa$ the search/coordination cost, and $\pi_{ij}^{k}$ service penalties (a proxy for contractual strictness and service-level requirements). The friction decomposition of Section 4 implies that improvements inherit corresponding limits: interventions that primarily reshape $\lambda$ influence the imbalance-induced baseline component and thus the magnitude of unavoidable deadhead; those that enlarge $W$ or reduce $\tau$ mitigate the time-friction premium; and those that improve $p$ or reduce $(\delta, \kappa)$ mitigate the information-friction premium. Hence, rather than treating “better dispatching” as an undifferentiated objective, one may interpret intervention impacts through their induced perturbations of $\Theta$ and through the implied movement relative to the lower bounds.
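The parameter-to-policy map can be made concrete by representing a scalar summary of $\Theta$ as a record and each intervention as a function that shifts it. In the sketch below, the field names, the intervention labels, and all shift magnitudes are hypothetical placeholders chosen for illustration; they are not estimates from the experiments.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Primitives:
    sigma: float      # spatial concentration of lambda
    beta_dir: float   # directional bias of lambda
    window: int       # tight pickup-window width W
    tau_scale: float  # multiplier applied to lead times tau
    p: float          # match success probability
    delta: int        # observation delay
    kappa: float      # per-attempt search cost


# Each entry: (friction channel, shift applied to the primitives).
# Magnitudes are placeholders, not calibrated values.
INTERVENTIONS = {
    "backhaul_planning": ("spatial", lambda th: replace(th, beta_dir=0.5 * th.beta_dir)),
    "appointment_flexibility": ("time", lambda th: replace(th, window=th.window + 1)),
    "yard_process_improvement": ("time", lambda th: replace(th, tau_scale=0.9 * th.tau_scale)),
    "freight_platform": ("info", lambda th: replace(
        th, p=min(1.0, th.p + 0.1), delta=max(0, th.delta - 1))),
    "standardized_contracts": ("info", lambda th: replace(th, kappa=0.8 * th.kappa)),
}


def apply_intervention(theta, name):
    """Return the affected channel and the shifted primitives."""
    channel, shift = INTERVENTIONS[name]
    return channel, shift(theta)
```

Re-running the simulation at the shifted primitives then shows which component of the service–empty-mileage trade-off the intervention moves.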

7.2. Interventions Targeting Spatial Imbalance: Reshaping λ and Enlarging Effective Pooling

Horizontal collaboration among carriers and brokers enlarges the effective pool of vehicles and loads, increasing cross-lane substitution and attenuating local deficits. In the model, such cooperation can be represented either as an aggregation of inventories across cooperating fleets (a larger effective $N$ with partially shared $x_t$) or as a partial smoothing of OD-specific demand intensities (a less skewed $\lambda_{ij,t}^{k}$). Both mechanisms reduce the imbalance bound in Proposition 2, and thereby relax the floor on empty repositioning that no online policy can breach. This channel has been formalized and operationalized in the collaboration literature on carrier cooperation and consolidation, which documents utilization and cost improvements mediated by enlarged pooling and exchange possibilities [8,9,17,18].
Relatedly, systematic backhaul planning and consolidation may be interpreted as a redesign of the feasible OD mix: by expanding the set of loaded “return” opportunities and reducing net one-way flow, such initiatives shift the composition of $\lambda$ toward more balanced counterflows. Under the corridor parameterization, this corresponds to reducing directionality and concentration (e.g., lowering the effective skew parameter $\sigma$ and directionality bias $\beta_{\text{dir}}$ as in Section 6). The VRP-with-backhauls literature provides algorithmic foundations for these redesigns under operational constraints, including time windows and robust variants [20,21,22]. The salient implication is that backhaul initiatives can lower empty miles even absent any change in real-time control, because they act on the spatial component $\Psi_{\text{spatial}}$ in (46).
At the same time, the lower bound clarifies a structural limitation: when persistent net directional flow is pronounced (large $\beta_{\text{dir}}$), the imbalance bound itself rises, and incremental improvements in dispatch logic cannot eliminate deadhead. In such regimes, material reductions in empty travel require interventions that alter effective OD balance (pooling, contractual lane pairing, consolidation, or other measures that reconfigure $\lambda$) rather than purely informational or algorithmic refinements.

7.3. Interventions Targeting Time Constraints: Enlarging W and Reducing Effective τ

Facility appointment systems and berth-reservation mechanisms chiefly operate by widening the effective time window and reducing the operational hardness of deadlines. In the model, this corresponds to larger effective $W_k$ (or, equivalently, to a softened penalty structure for modest deviations), which can reduce the time-friction premium $\Psi_{\text{time}}$ and attenuate the sharpness of space–time scarcity prices near deadlines, consistent with the time-window implication in Section 4. Evidence from port and drayage settings illustrates how gate operations and appointment policies influence truck movements and externalities [4]; the same mechanism extends to distribution centers and long-haul terminals, where increased schedule predictability converts reachability constraints into smoother cost trade-offs.
Lead times $\tau_{ij}$, while partly physical, also reflect operational latencies: dispatch delays, driver availability, trailer readiness, yard processing, and handoff frictions. Interventions that reduce these latencies (for instance, faster tendering/acceptance cycles, pre-positioned equipment, or yard-process improvements) reduce effective $\tau$, enlarge the reachable set in space–time, and diminish the shadow value of early capacity. The experiments in Section 6 suggest that the effect of temporal peakiness depends on the ratio $\tau/W$: when lead times are large relative to window widths, even mild peaks can induce failures unless repositioning is initiated well in advance. Enlarging $W$ or reducing $\tau$ moves the system away from such brittle regimes, thereby narrowing the gap between the spatial lower bound and realized performance by shrinking $\Psi_{\text{time}}$.
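The role of the ratio $\tau/W$ can be illustrated with a simple reachability check. The sketch assumes a pickup window of the form $[r, r + W]$ for a load released at epoch $r$, and an empty vehicle that must arrive no later than $r + W$; this window convention and the function names are assumptions of the example.

```python
def latest_dispatch_epoch(release, window, tau):
    """Last epoch at which an empty vehicle can depart and still arrive by release + window."""
    return release + window - tau


def reachable(now, release, window, tau):
    """True if a vehicle dispatched at `now` arrives within the pickup window."""
    return now + tau <= release + window
```

With `tau = 4` and `window = 1`, a load released at epoch 2 has a latest dispatch epoch of −1, so only a vehicle repositioned before the load was even observed can serve it. Widening the window to 4 moves the latest dispatch epoch to 2, the release epoch itself.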

7.4. Interventions Targeting Information Frictions: Improving p and Reducing ( δ , κ )

Digital freight platforms and data-sharing interfaces may be interpreted, within the present formalism, as instruments that increase match feasibility and reduce informational latency. In particular, improved market access and standardized interfaces can raise the probability that an attempted match materializes (higher $p$), while real-time visibility and faster dissemination of opportunities reduce observation delay (smaller $\delta$). Both changes act to reduce $\Psi_{\text{info}}$ in (46). Theoretical treatments of dynamic freight exchanges and related online decision problems articulate how routing options and market access shape equilibrium and performance [5,6], while empirical studies examine determinants of online matching success and repositioning behavior [24,25]. From the model’s standpoint, platform “liquidity” and visibility are parsimoniously represented as increases in $p$ and decreases in $\delta$.
Search and coordination costs κ remain consequential even when feasibility is high. Brokerage effort, negotiation, compliance, and idiosyncratic contractual terms impose per-attempt costs that suppress exploration and delay commitment. Standardized contracts, automated tendering, and clearer service-level agreements reduce κ , thereby shifting the optimal policy toward broader and earlier search, which is particularly valuable under tight time windows. The synthetic sensitivity patterns suggest that high κ can partially neutralize the benefits of improved visibility: opportunities that are observable but costly to operationalize are, in effect, not exploitable at the margin.
Finally, informational improvements are valuable only insofar as they are decision-relevant. Reducing $\delta$ via forecasting and signals must be assessed through its induced change in downstream control performance rather than through point-forecast metrics. The forecasting-for-optimization perspective formalizes this requirement, emphasizing evaluation through decision quality [28]. In the present setting, the appropriate criterion is the induced reduction in $\Psi_{\text{info}}$ and the corresponding movement of the service–empty-mileage frontier.

7.5. Implications for Policy Design and Evaluation

The parameter interpretation yields concrete guidance for diagnosis and evaluation. First, lower bounds provide a disciplined baseline for target-setting: computing a spatial-imbalance bound (Proposition 2) quantifies unavoidable deadhead under the prevailing OD balance. When observed empty travel lies close to this bound, interventions confined to dispatch logic or information improvement admit limited upside; conversely, a large gap indicates substantive slack that may be captured by better control and/or by reducing time and information frictions.
Second, the decomposition in Section 4 suggests a principled attribution of performance losses. Deterioration concentrated in tight-window and peak regimes points to a time-driven gap (large $\Psi_{\text{time}}$), favoring window widening and lead-time reduction; deterioration concentrated under low $p$ or high $\delta$ points to an information-driven gap (large $\Psi_{\text{info}}$), favoring liquidity and visibility interventions.
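One way to operationalize this attribution is a telescoping split across nested benchmarks. The sketch below is an assumed reading of the decomposition in (46), not a restatement of it: it takes the spatial-imbalance bound, the value of a relaxation that respects lead times and windows but has full information, and the realized value under the deployed policy, and assigns the successive differences to the time and information channels. The function names and argument order are hypothetical.

```python
def nested_decomposition(spatial_bound, full_info_value, realized_value):
    """Telescoping split; assumes spatial_bound <= full_info_value <= realized_value."""
    return {
        "spatial": spatial_bound,                   # floor implied by net flow
        "time": full_info_value - spatial_bound,    # premium from lead times and windows
        "info": realized_value - full_info_value,   # premium from information frictions
    }


def dominant_excess_channel(parts):
    """Return the larger of the two excess terms above the spatial floor."""
    return max(("time", "info"), key=lambda k: parts[k])
```

Because the decomposition is order-dependent, evaluating the benchmarks in a different nesting order would generally assign different magnitudes to the time and information terms.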
Third, evaluation is naturally conducted as parameter experimentation rather than anecdote. Because real interventions typically move several primitives simultaneously—appointment systems may widen effective windows while also reducing variability; platforms may increase p while reducing δ and, occasionally, κ —credible assessment requires controlled A/B or before–after designs that identify changes in ( W , p , δ , κ ) and trace the induced displacement of the Pareto frontier between empty distance and service level. Under such a lens, operational control and market/facility design appear complementary: price-guided rolling horizons exploit space–time shadow values, and as p increases or W widens those prices flatten, reducing reliance on proactive repositioning without sacrificing service. The framework thus offers a disciplined route from corridor-level interventions to measurable primitives and separates improvements constrained by imbalance from those achievable through enhanced temporal flexibility and reduced informational frictions. Table 5 summarizes these intervention channels as parameter shifts and their primary expected effects within the present framework.

7.6. Practical Applicability and Validation Scope

Although the computational study is based on synthetic corridor instances, the proposed framework is intended to be directly calibratable from operational data commonly available to carriers, brokers, and digital freight platforms. The main required inputs are lane-level demand intensities, pickup and delivery time-window distributions, travel-time or lead-time estimates, vehicle inventory by region and time, empty-move costs, match success probabilities, and observation or communication delays. These quantities correspond directly to the primitive parameters in the model, which makes the framework suitable for scenario analysis even before full-scale deployment.
In a practical implementation, the rolling-horizon policy would operate as a decision-support layer within a transportation management or dispatch system. At each decision epoch, the controller would update the current vehicle inventory, outstanding load requests, and in-transit vehicles, solve the lookahead problem over a finite horizon, and recommend loaded assignments and empty repositioning moves. The lower bounds would not be used as operational decisions themselves, but as diagnostic benchmarks for evaluating whether observed empty mileage is primarily driven by spatial imbalance, time-window tightness, or information frictions.
The present experiments should therefore be interpreted as structural validation of the model mechanisms rather than as an empirical performance claim for a particular corridor. Empirical validation with carrier or platform data remains an important next step. Such validation would allow the numerical magnitudes of the lower bounds, shadow prices, and policy improvements to be assessed under real appointment patterns, demand waves, equipment mixes, driver regulations, and market frictions.

8. Conclusions

This paper develops a unified theory and implementable control framework for empty mileage in long-haul corridors by viewing deadhead as dynamic repositioning under time constraints. The corridor setting is distinguished by three interacting features: lead times that couple current dispatch to future capacity, time windows that impose space–time reachability constraints, and market-mediated information frictions that separate attempted assignments from realized loads.
Our first contribution is a family of dynamic lower bounds on achievable performance. A prophet relaxation on the time-expanded network provides a principled benchmark, and its dual variables deliver interpretable space–time shadow prices that quantify corridor-time scarcity created by lead times and tight windows. These prices offer a common language for marginal opportunity costs and connect analytical limits to operational decision making.
Second, we propose a friction decomposition that attributes excess empty movement to three primitives aligned with intervention levers: spatial imbalance, temporal mismatch induced by windows and lead times, and information frictions. The decomposition separates structurally unavoidable deadhead implied by net flow from additional deadhead induced by deadline-adjacent scarcity and imperfect observability/coordination, thereby clarifying where operational or market design changes can be most effective.
Third, guided by the dual-price structure, we introduce a price-guided control family on a time–space network, centered on a rolling-horizon policy with terminal values informed by the bounds. The main numerical experiments use the H-step PG-RH implementation, while PG-GC provides a one-step generalized-cost alternative for real-time settings in which repeatedly solving the lookahead LP is impractical; both variants minimize generalized reduced costs that internalize downstream opportunity values, yielding a scalable bridge between principled benchmarks and real-time implementable heuristics. Synthetic experiments factorized along imbalance, window tightness, and information frictions demonstrate systematic trade-offs among empty-distance ratio, service reliability, and total cost, and show that informational improvements matter insofar as they improve decision quality rather than prediction accuracy alone [28].
Overall, the paper establishes a coherent theory–policy–measurement loop for corridor deadhead: bounds to delimit what is fundamentally achievable under space–time feasibility, a decomposition to diagnose performance gaps, and a practical rolling-horizon controller that operationalizes the same logic based on shadow prices. The framework is intended both as a research scaffold for richer frictions and equilibrium extensions and as a diagnostic tool for corridor interventions aimed at reducing empty miles without sacrificing reliability.
The present analysis is intentionally based on synthetic corridor environments in order to isolate the roles of spatial imbalance, time-window tightness, and information frictions in a transparent way. The numerical results should therefore be read as structural validation of the proposed mechanisms rather than as an empirical claim for a particular corridor or carrier. A natural next step is empirical calibration with carrier or platform data, including observed appointment structures, realized match success, regional demand waves, equipment and driver-qualification requirements, and hours-of-service constraints. Such extensions would help translate the proposed lower bounds and price-guided policies into decision-support tools for real freight operations while clarifying which conclusions are robust under full operational heterogeneity and endogenous market conditions.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The author declares no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ADP      approximate dynamic programming
EDR      empty-distance ratio
LP       linear programming
MDP      Markov decision process
PG-RH    price-guided rolling-horizon
PG-GC    price-guided generalized-cost policy
TC       total cost
UR       unserved rate

References

  1. Du, Y.; Hall, R. Fleet Sizing and Empty Equipment Redistribution for Center-Terminal Transportation Networks. Manag. Sci. 1997, 43, 145–157.
  2. Song, D.P.; Earl, C.F. Optimal empty vehicle repositioning and fleet-sizing for two-depot service systems. Eur. J. Oper. Res. 2008, 185, 760–777.
  3. Chao, S.L.; Chen, C.C. Applying a time–space network to reposition reefer containers among major Asian ports. Res. Transp. Bus. Manag. 2015, 17, 65–72.
  4. Schulte, F.; Lalla-Ruiz, E.; González-Ramírez, R.G.; Voß, S. Reducing port-related empty truck emissions: A mathematical approach for truck appointments with collaboration. Transp. Res. Part E Logist. Transp. Rev. 2017, 105, 195–212.
  5. Miller, J.; Nie, Y.M. Dynamic Trucking Equilibrium through a Freight Exchange. Transp. Res. Procedia 2019, 38, 320–340.
  6. Miller, J.; Nie, Y.M.; Liu, X. Hyperpath Truck Routing in an Online Freight Exchange Platform. Transp. Sci. 2020, 54, 1676–1696.
  7. Simão, H.P.; Day, J.; George, A.P.; Gifford, T.; Nienow, J.; Powell, W.B. An Approximate Dynamic Programming Algorithm for Large-Scale Fleet Management: A Case Application. Transp. Sci. 2009, 43, 178–197.
  8. Cruijssen, F.; Cools, M.; Dullaert, W. Horizontal cooperation in logistics: Opportunities and impediments. Transp. Res. Part E Logist. Transp. Rev. 2007, 43, 129–142.
  9. Ergun, Ö.; Kuyzu, G.; Savelsbergh, M. Reducing Truckload Transportation Costs Through Collaboration. Transp. Sci. 2007, 41, 206–221.
  10. Wu, P.; Hartman, J.C.; Wilson, G.R. An Integrated Model and Solution Approach for Fleet Sizing with Heterogeneous Assets. Transp. Sci. 2005, 39, 87–103.
  11. Vasco, R.A.; Morabito, R. The dynamic vehicle allocation problem with application in trucking companies in Brazil. Comput. Oper. Res. 2016, 76, 118–133.
  12. Jula, H.; Chassiakos, A.; Ioannou, P. Port dynamic empty container reuse. Transp. Res. Part E Logist. Transp. Rev. 2006, 42, 43–60.
  13. Dong, J.X.; Song, D.P. Container fleet sizing and empty repositioning in liner shipping systems. Transp. Res. Part E Logist. Transp. Rev. 2009, 45, 860–877.
  14. Moghaddam, M.; Pearce, R.H.; Mokhtar, H.; Prato, C.G. A generalised model for container drayage operations with heterogeneous fleet, multi-container sizes and two modes of operation. Transp. Res. Part E Logist. Transp. Rev. 2020, 139, 101973.
  15. Uddin, M.; Huynh, N. Model for Collaboration among Carriers to Reduce Empty Container Truck Trips. Information 2020, 11, 377.
  16. Braekers, K.; Caris, A.; Janssens, G.K. Integrated planning of loaded and empty container movements. OR Spectr. 2013, 35, 457–478.
  17. Krajewska, M.A.; Kopfer, H.; Laporte, G.; Røpke, S.; Zaccour, G. Horizontal cooperation among freight carriers: Request allocation and profit sharing. J. Oper. Res. Soc. 2008, 59, 1483–1491.
  18. Gansterer, M.; Hartl, R.F. Collaborative vehicle routing: A survey. Eur. J. Oper. Res. 2018, 268, 1–12.
  19. Ahari, S.A.; Bakir, I.; Roodbergen, K.J. A new perspective on carrier collaboration: Collaborative vehicle utilization. Transp. Res. Part C Emerg. Technol. 2024, 163, 104647.
  20. Koç, Ç.; Laporte, G. Vehicle routing with backhauls: Review and research perspectives. Comput. Oper. Res. 2018, 91, 79–91.
  21. Zhong, Y.; Cole, M.H. A vehicle routing problem with backhauls and time windows: A guided local search solution. Transp. Res. Part E Logist. Transp. Rev. 2005, 41, 131–144.
  22. Santos, M.J.; Curcio, E.; Mulati, M.H.; Amorim, P.; Miyazawa, F.K. A robust optimization approach for the vehicle routing problem with selective backhauls. Transp. Res. Part E Logist. Transp. Rev. 2020, 136, 101888.
  23. Pradenas, L.; Oportus, B.; Parada, V. Mitigation of greenhouse gas emissions in vehicle routing problems with backhauling. Expert Syst. Appl. 2013, 40, 2985–2991.
  24. Boumahdaf, A.; Broniatowski, M.; Miranda, É.; Le Squeren, A. A behavioral probabilistic model of carrier spatial repositioning decision-making. Transp. Res. Part C Emerg. Technol. 2023, 153, 104194.
  25. Park, A.; Chen, R.; Cho, S.; Zhao, Y. The determinants of online matching platforms for freight services. Transp. Res. Part E Logist. Transp. Rev. 2023, 179, 103284. [Google Scholar] [CrossRef]
  26. Heinbach, C.; Beinke, J.; Kammler, F.; Thomas, O. Data-driven forwarding: A typology of digital platforms for road freight transport management. Electron. Mark. 2022, 32, 807–828. [Google Scholar] [CrossRef] [PubMed]
  27. Shi, N.; Song, H.; Powell, W.B. The dynamic fleet management problem with uncertain demand and customer chosen service level. Int. J. Prod. Econ. 2014, 148, 110–121. [Google Scholar] [CrossRef]
  28. Sonnleitner, B.; Kourentzes, N.; Ehrig, C.; Pflaum, A. Forecasting for optimization in road freight transport: A review. Transp. Res. Part E Logist. Transp. Rev. 2025, 204, 104378. [Google Scholar] [CrossRef]
Figure 1. Service–empty-mileage trade-off on a synthetic 10-node corridor (1 h steps). The curves trace the Pareto frontier between unserved rate (UR) and empty-distance ratio (EDR) by sweeping the rebalancing intensity (bal_strength) under a fixed environment (σ, γ, p). The dashed vertical line is a spatial-imbalance EDR lower-bound reference computed from expected net flows; it is directly comparable only under comparable served-volume and terminal-inventory conditions. The pronounced “knee” highlights diminishing returns: modest repositioning yields large service gains, while additional empty movement delivers limited improvements.
Figure 2. Effect of time-window widening on the service–empty-mileage frontier. Holding the corridor environment fixed, increasing the tight pickup-window width from W_tight = 1 to W_tight = 4 shifts the UR–EDR Pareto curve downward, indicating improved service reliability at comparable empty-distance ratios.
Figure 3. Sensitivity to spatial imbalance. The panels report empty-distance ratio (EDR), unserved rate (UR), and total cost (TC) as the spatial skew parameter σ increases. The myopic policy avoids empty repositioning but suffers increasing service loss as imbalance grows. Static and price-guided repositioning reduce unserved demand by moving empty vehicles proactively; PG-RH generally achieves similar service performance with lower empty-distance exposure and lower total cost than static balancing. Error bars indicate 95% confidence intervals over Monte Carlo replications.
Figure 4. Sensitivity to temporal peakiness. The panels report EDR, UR, and TC as the temporal peakiness parameter γ varies. In the baseline configuration, increasing γ produces only modest changes relative to the differences across policies. This indicates that temporal nonstationarity alone is not the dominant binding friction; its operational effect is mediated by pickup-window tightness, lead times, and fleet slack. Error bars indicate 95% confidence intervals over Monte Carlo replications.
Figure 5. Sensitivity to match feasibility. The panels report EDR, UR, and TC as the match success probability p varies. Lower p represents stronger information and matching frictions. As p increases, service reliability improves and total cost declines for all policies. PG-RH achieves lower empty-distance exposure than static balancing while maintaining comparable service performance in moderate- and high-feasibility regimes; under severe match infeasibility, service differences are small and need not favor PG-RH. Error bars indicate 95% confidence intervals over Monte Carlo replications.
Table 1. Baseline simulation parameters.

| Category | Parameter | Baseline Value | Description |
|---|---|---|---|
| Network | n | 10 | Number of corridor nodes in the baseline line network |
| Horizon | T | 168 | One-week horizon with hourly decision epochs |
| Fleet | N | 220 | Number of homogeneous vehicles |
| Cost | α_empty | 1.0 | Empty-movement cost per distance unit |
| Cost | β_load | 0.3 | Loaded-movement cost parameter |
| Cost | π_fail | 20.0 | Penalty for unserved or expired demand |
| Time windows | W_tight | 2 | Tight pickup-window width in time steps |
| Time windows | W_loose | 6 | Loose pickup-window width in time steps |
| Time-window mix | ρ_tight | 0.7 | Share of tight-window requests |
| Spatial imbalance | σ | 1.0 | Baseline spatial demand skew |
| Directionality | β_dir | 0.4 | Directional imbalance parameter |
| Temporal imbalance | γ | 0.5 | Baseline diurnal peakiness parameter |
| Information friction | p_match | 0.8 | Baseline match success probability |
| Search friction | κ | 0.0 | Search/coordination cost |
| Rebalancing | bal_strength | 0.20 | Fractional rebalancing intensity after service decisions |
| Replication | R | 25 | Monte Carlo replications per configuration |
| Random seeds | seed | 0 + 1000r + 17 | Replication seed for replication index r |
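As a reading aid, the baseline in Table 1 can be written as a small configuration object. This is a minimal sketch, not the author's code: the class name `BaselineConfig`, the field names, and the helper `replication_seed` are hypothetical, and the seed rule assumes the table entry reads as base seed 0 plus 1000r + 17.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaselineConfig:
    """Baseline values transcribed from Table 1 (field names are illustrative)."""
    n_nodes: int = 10           # n: corridor nodes in the line network
    horizon: int = 168          # T: hourly epochs over one week
    fleet_size: int = 220       # N: homogeneous vehicles
    alpha_empty: float = 1.0    # empty-movement cost per distance unit
    beta_load: float = 0.3      # loaded-movement cost parameter
    pi_fail: float = 20.0       # penalty for unserved or expired demand
    w_tight: int = 2            # tight pickup-window width (time steps)
    w_loose: int = 6            # loose pickup-window width (time steps)
    rho_tight: float = 0.7      # share of tight-window requests
    sigma: float = 1.0          # spatial demand skew
    beta_dir: float = 0.4       # directional imbalance
    gamma: float = 0.5          # diurnal peakiness
    p_match: float = 0.8        # match success probability
    kappa: float = 0.0          # search/coordination cost
    bal_strength: float = 0.20  # fractional rebalancing intensity
    replications: int = 25      # R: Monte Carlo replications per configuration


def replication_seed(r: int, base: int = 0) -> int:
    """Seed for replication index r, assuming the rule base + 1000*r + 17."""
    return base + 1000 * r + 17


cfg = BaselineConfig()
seeds = [replication_seed(r) for r in range(cfg.replications)]
```

Under this reading, each of the 25 replications gets a distinct seed, so the same replication index reproduces the same draw across policies.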
Table 2. Sensitivity design used in the synthetic experiments.

| Experiment/Axis | Varied Parameter | Values | Main Fixed Settings |
|---|---|---|---|
| Spatial imbalance | σ | {0.0, 0.5, 1.0, 1.5} | γ = 0.5, p_match = 0.8, β_dir = 0.4 |
| Temporal peakiness | γ | {0.0, 0.3, 0.6, 0.9} | σ = 1.0, p_match = 0.8, W_tight = 2 |
| Information friction | p_match | {0.5, 0.7, 0.85, 1.0} | σ = 1.0, γ = 0.5, κ = 0 |
| Pareto frontier | bal_strength | {0.00, 0.03, 0.05, 0.08, 0.12, 0.16, 0.20, 0.25} | σ = 1.0, γ = 0.5, p_match = 0.8 |
| Window widening | W_tight | {1, 4} | Same corridor environment as the Pareto-frontier experiment |
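The design in Table 2 varies one parameter at a time around the baseline. The sketch below enumerates those design points under that reading; the names `BASELINE`, `AXES`, and `design_points` are hypothetical. It treats window widening as a single override of W_tight, whereas the table pairs it with the Pareto-frontier environment, so a faithful replication would presumably repeat the bal_strength sweep for each window width.

```python
from typing import Dict, List, Tuple

# Baseline values for the parameters that Table 2 varies or holds fixed.
BASELINE: Dict[str, float] = {
    "sigma": 1.0, "gamma": 0.5, "p_match": 0.8, "beta_dir": 0.4,
    "w_tight": 2, "kappa": 0.0, "bal_strength": 0.20,
}

# Axis name -> (varied parameter, grid of values), transcribed from Table 2.
AXES: Dict[str, Tuple[str, List[float]]] = {
    "spatial_imbalance": ("sigma", [0.0, 0.5, 1.0, 1.5]),
    "temporal_peakiness": ("gamma", [0.0, 0.3, 0.6, 0.9]),
    "information_friction": ("p_match", [0.5, 0.7, 0.85, 1.0]),
    "pareto_frontier": ("bal_strength",
                        [0.00, 0.03, 0.05, 0.08, 0.12, 0.16, 0.20, 0.25]),
    "window_widening": ("w_tight", [1, 4]),
}


def design_points() -> List[Tuple[str, Dict[str, float]]]:
    """Return (axis, config) pairs: the baseline with one parameter overridden."""
    points = []
    for axis, (param, values) in AXES.items():
        for value in values:
            config = dict(BASELINE)  # copy so axes do not contaminate each other
            config[param] = value
            points.append((axis, config))
    return points


points = design_points()
```

Each configuration would then be run for every policy and for R = 25 replications.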
Table 3. Representative policy performance by sensitivity scenario. Values are sample means with 95% confidence intervals.

| Scenario | Policy | EDR | UR | TC |
|---|---|---|---|---|
| Baseline (σ = 1.0) | myopic | 0.000 ± 0.000 | 0.610 ± 0.008 | 17,411 ± 275 |
| | static | 0.714 ± 0.007 | 0.484 ± 0.008 | 18,281 ± 225 |
| | PG-RH | 0.683 ± 0.006 | 0.492 ± 0.008 | 17,744 ± 249 |
| High spatial imbalance (σ = 1.5) | myopic | 0.000 ± 0.000 | 0.693 ± 0.006 | 19,409 ± 324 |
| | static | 0.812 ± 0.004 | 0.565 ± 0.006 | 21,672 ± 197 |
| | PG-RH | 0.781 ± 0.006 | 0.568 ± 0.006 | 20,619 ± 248 |
| High temporal peakiness (γ = 0.9) | myopic | 0.000 ± 0.000 | 0.607 ± 0.010 | 17,243 ± 357 |
| | static | 0.718 ± 0.007 | 0.482 ± 0.009 | 18,297 ± 244 |
| | PG-RH | 0.677 ± 0.006 | 0.492 ± 0.008 | 17,679 ± 209 |
| Low match feasibility (p = 0.5) | myopic | 0.000 ± 0.000 | 0.826 ± 0.007 | 23,059 ± 297 |
| | static | 0.686 ± 0.007 | 0.822 ± 0.006 | 24,239 ± 283 |
| | PG-RH | 0.649 ± 0.008 | 0.827 ± 0.007 | 24,177 ± 348 |
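Entries of the form "mean ± half-width" in Table 3 can be produced from per-replication outputs as sketched below. The paper does not state which interval construction it uses; this sketch assumes a normal-approximation interval (with R = 25 a Student-t critical value would be slightly wider), and the function name `mean_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev
from typing import Sequence, Tuple


def mean_ci(samples: Sequence[float], level: float = 0.95) -> Tuple[float, float]:
    """Return (sample mean, CI half-width) using a normal approximation."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two replications to estimate spread")
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    half_width = z * stdev(samples) / sqrt(n)   # z times the standard error
    return mean(samples), half_width


# Toy replication outputs, for illustration only (not data from the paper).
m, h = mean_ci([1.0, 2.0, 3.0, 4.0, 5.0])
print(f"{m:.3f} ± {h:.3f}")
```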
Table 4. Pareto-frontier values and spatial lower-bound comparison. Values are sample means with 95% confidence intervals; EDR–LB is reported as a raw difference from the spatial-imbalance EDR lower-bound reference. Negative differences occur when low-repositioning policies serve substantially less demand and are therefore not directly comparable to a full-service balancing benchmark with the same served OD volumes and terminal-inventory accounting.

| Policy | bal. | EDR | UR | TC | Spatial LB | EDR–LB |
|---|---|---|---|---|---|---|
| static | 0.00 | 0.000 ± 0.000 | 0.609 ± 0.009 | 17,426 ± 349 | 0.365 | −0.365 |
| static | 0.03 | 0.133 ± 0.008 | 0.508 ± 0.010 | 14,793 ± 330 | 0.365 | −0.232 |
| static | 0.05 | 0.276 ± 0.008 | 0.478 ± 0.008 | 14,383 ± 332 | 0.365 | −0.089 |
| static | 0.08 | 0.439 ± 0.009 | 0.470 ± 0.007 | 14,990 ± 222 | 0.365 | 0.074 |
| static | 0.12 | 0.582 ± 0.007 | 0.469 ± 0.008 | 15,958 ± 263 | 0.365 | 0.217 |
| static | 0.16 | 0.659 ± 0.008 | 0.472 ± 0.008 | 16,991 ± 231 | 0.365 | 0.294 |
| static | 0.20 | 0.727 ± 0.005 | 0.472 ± 0.007 | 18,141 ± 233 | 0.365 | 0.362 |
| static | 0.25 | 0.770 ± 0.005 | 0.485 ± 0.005 | 19,622 ± 173 | 0.365 | 0.405 |
| PG-RH | 0.00 | 0.000 ± 0.000 | 0.596 ± 0.011 | 16,894 ± 411 | 0.365 | −0.365 |
| PG-RH | 0.03 | 0.114 ± 0.006 | 0.516 ± 0.009 | 14,954 ± 297 | 0.365 | −0.251 |
| PG-RH | 0.05 | 0.246 ± 0.009 | 0.503 ± 0.007 | 15,080 ± 236 | 0.365 | −0.119 |
| PG-RH | 0.08 | 0.398 ± 0.007 | 0.493 ± 0.008 | 15,452 ± 269 | 0.365 | 0.033 |
| PG-RH | 0.12 | 0.525 ± 0.008 | 0.487 ± 0.006 | 16,017 ± 237 | 0.365 | 0.160 |
| PG-RH | 0.16 | 0.618 ± 0.008 | 0.491 ± 0.008 | 16,731 ± 274 | 0.365 | 0.253 |
| PG-RH | 0.20 | 0.681 ± 0.008 | 0.488 ± 0.008 | 17,650 ± 185 | 0.365 | 0.316 |
| PG-RH | 0.25 | 0.729 ± 0.006 | 0.485 ± 0.010 | 18,474 ± 284 | 0.365 | 0.364 |
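The EDR–LB column in Table 4 is the mean EDR minus the spatial lower-bound reference of 0.365. The sketch below recomputes it from the tabulated means and picks out two summary points on each frontier. The names `FRONTIER`, `edr_gap`, `min_cost_point`, and `first_above_bound` are hypothetical, and the comparison uses sample means only, ignoring the confidence intervals.

```python
from typing import Dict, List, Tuple

SPATIAL_LB = 0.365  # spatial-imbalance EDR lower-bound reference from Table 4

# (bal_strength, mean EDR, mean UR, mean TC), transcribed from Table 4.
FRONTIER: Dict[str, List[Tuple[float, float, float, int]]] = {
    "static": [
        (0.00, 0.000, 0.609, 17426), (0.03, 0.133, 0.508, 14793),
        (0.05, 0.276, 0.478, 14383), (0.08, 0.439, 0.470, 14990),
        (0.12, 0.582, 0.469, 15958), (0.16, 0.659, 0.472, 16991),
        (0.20, 0.727, 0.472, 18141), (0.25, 0.770, 0.485, 19622),
    ],
    "PG-RH": [
        (0.00, 0.000, 0.596, 16894), (0.03, 0.114, 0.516, 14954),
        (0.05, 0.246, 0.503, 15080), (0.08, 0.398, 0.493, 15452),
        (0.12, 0.525, 0.487, 16017), (0.16, 0.618, 0.491, 16731),
        (0.20, 0.681, 0.488, 17650), (0.25, 0.729, 0.485, 18474),
    ],
}


def edr_gap(edr: float) -> float:
    """Raw difference between a mean EDR and the spatial lower-bound reference."""
    return round(edr - SPATIAL_LB, 3)


def min_cost_point(policy: str) -> Tuple[float, float, float, int]:
    """Frontier row with the lowest mean total cost for the given policy."""
    return min(FRONTIER[policy], key=lambda row: row[3])


def first_above_bound(policy: str) -> float:
    """Smallest bal_strength whose mean EDR exceeds the lower-bound reference."""
    return next(bal for bal, edr, _, _ in FRONTIER[policy] if edr > SPATIAL_LB)


for policy in FRONTIER:
    print(policy, min_cost_point(policy)[0], first_above_bound(policy))
```

On the tabulated means, the lowest total cost falls at a low rebalancing intensity for both policies, which matches the knee described for Figure 1. Adjacent cost means near the minimum have overlapping intervals, so the exact minimizing intensity should be read loosely.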
Table 5. Interventions as parameter shifts and primary expected effects.

| Intervention | Primary Parameter Shift(s) | Primary Effect(s) |
|---|---|---|
| Carrier collaboration/pooling | λ less skewed; larger effective pool N | Lowers spatial bound; reduces Ψ_spatial |
| Backhaul planning/consolidation | OD mix shift in λ; reduced directional bias | Reduces net surpluses/deficits; lowers unavoidable deadhead |
| Appointment/reserved berths | Wider effective W; lower effective penalties π for small deviations | Reduces Ψ_time; improves service at lower EDR |
| Operational latency reduction | Smaller effective lead times τ | Reduces anticipatory repositioning; reduces time brittleness |
| Digital freight platforms | Higher p; smaller δ; possibly lower κ | Reduces Ψ_info; stabilizes under uncertainty |
| Contract/standardization automation | Lower κ | Encourages timely matching; reduces wasted effort |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Noguchi, T. Dynamic Empty-Vehicle Repositioning on Long-Haul Freight Corridors: Lower Bounds and Rolling-Horizon Policies Under Lead Times and Time Windows. Future Transp. 2026, 6, 125. https://doi.org/10.3390/futuretransp6030125

