Article

Optimization of Multimodal Transportation Routes for North-to-South Grain Transportation in China Considering Carbon Emissions

College of Civil Engineering and Transportation, Northeast Forestry University, 26 Hexing Road, Xiangfang District, Harbin 150040, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(1), 510; https://doi.org/10.3390/app16010510
Submission received: 12 November 2025 / Revised: 30 December 2025 / Accepted: 31 December 2025 / Published: 4 January 2026
(This article belongs to the Special Issue Advanced, Smart, and Sustainable Transportation)

Abstract

This study presents a multi-objective optimization framework for China’s North-to-South Grain Transportation (NSGT), balancing costs, time, carbon emissions, and grain quality loss to promote sustainable logistics. We propose a hybrid algorithm combining genetic optimization with reinforcement learning to identify efficient routes and evaluate trade-offs. Compared to standard methods, our approach achieves better solution diversity and robustness, as validated by sensitivity analysis, scalability tests, and statistical comparisons. The findings advance carbon accounting in multimodal transport and provide practical guidance for policymakers to enhance eco-friendly grain distribution.

1. Introduction

1.1. Research Status

The transportation of grain from northern production hubs to southern consumption regions, often referred to as North-to-South Grain Transportation (NSGT), is a critical component of food security and supply chain logistics [1]. In China, this process is particularly significant due to the geographical disparity between grain-producing areas in the north and high demand in the south. With increasing grain consumption and urbanization, the logistics system faces mounting pressure to ensure efficiency, cost-effectiveness, and sustainability. Multimodal transportation [2], which integrates various modes such as road, rail, and waterways, has emerged as a promising approach to optimize NSGT. However, the environmental impact of these operations, particularly their carbon emissions, poses a significant challenge amid global calls for decarbonization and sustainable development. According to data from the International Energy Agency, the transportation sector is the second-largest contributor to worldwide carbon dioxide emissions, accounting for approximately 25% of the global total. In China, this sector represents approximately 15% of the country’s emissions from final energy consumption, making it the third-largest emission source after industry and buildings [3].
Both environmental and policy imperatives drive the urgency to address carbon emissions in transportation. The logistics sector contributes substantially to global greenhouse gas emissions, with freight transportation being a major source. As nations strive to meet carbon neutrality goals, optimizing multimodal transportation networks to minimize carbon footprints while balancing cost and time efficiency has become a pressing research priority. Existing studies have explored multimodal transport optimization and carbon emission estimation, yet few have integrated these aspects within the specific context of NSGT. This gap underscores the need for a comprehensive framework that simultaneously addresses economic, temporal, and environmental objectives in grain logistics.

1.2. Focus of This Study

This study aims to develop a multi-objective optimization model for the NSGT initiative, integrating transportation costs, transit time, carbon emission costs, and grain quality loss into a unified analytical framework to address gaps in existing research. At the theoretical level, it enhances the carbon footprint accounting system for multimodal transportation, offering a novel perspective for incorporating environmental factors into logistics optimization. On the practical front, the research outcomes provide actionable decision-making support for policymakers and logistics planners in designing sustainable and efficient grain transportation schemes. By employing the hybrid NSGA-II-Q-learning algorithm, this study optimizes the North-to-South Grain Transportation network structure, evaluates trade-offs among multiple objectives, and examines the effectiveness of the hybrid algorithm through comparisons with other methods. Unlike general-purpose models, this work specifically targets NSGT by adding an exponential quality degradation term and validating the results under uncertainty via Monte Carlo simulation, addressing gaps in the optimization of perishable supply chains. Figure 1 illustrates the multimodal transportation network.

2. Literature Review

2.1. Research Status of Multimodal Transportation Route Optimization

Route optimization for multimodal transportation has been a longstanding focus of scholarly attention, with substantial research conducted in related domains. In general, from a modeling standpoint, the primary objective of most studies is to minimize integrated costs, encompassing various metrics such as monetary expenses, transit time, and carbon emission costs. For instance, numerous researchers, including Yang [4] and Zheng [5], have incorporated factors such as cost or time in their investigations. Rahman et al. [6], as well as Cui et al. [7], incorporated carbon emission factors into the objective function. Liu, Hou, and Peng researched cold chain logistics. Specifically, Liu [8] incorporated the damage costs of cold chain container cargoes into consideration, Hou [9] integrated the cargo damage costs into the objective function, and Peng [10] incorporated food waste into the objective function.
The aforementioned studies primarily employ an approach that consolidates multiple cost components into a unified objective for resolution. Nevertheless, owing to the inherent independence of these cost factors, certain investigations have elected to examine each element individually, thereby formulating multi-objective optimization frameworks. For instance, Huang [11] devised a multi-objective model for multimodal transportation route optimization, with objectives focused on minimizing overall transportation expenses and reducing total transit duration. Yang [12] constructed a framework targeting transportation distance, duration, and carbon emissions as key optimization criteria. Zhang [13] explored and contrasted three primary objectives, such as transportation expenses, carbon emission expenses, and transit time.
Most solutions to multi-objective problems predominantly emphasize the generation of Pareto solutions. However, a key limitation of Pareto solutions lies in their capacity to offer merely referential guidance rather than definitive decision-making support. To address this challenge effectively, investigators have increasingly adopted a decision-oriented perspective to determine a solution. Xu [14] introduced a multimodal solution selection strategy grounded in route similarity to equilibrate the distribution of solutions within the decision space, thereby preserving multiple equivalent optimal solutions. Qian [15] employed projection and transformation operators informed by stakeholder preferences to yield a stable Pareto solution set. Chen [16] initially formulated a three-objective integer programming model and incorporated an enhanced multi-objective genetic algorithm, DSNSGA3, to facilitate decision-making. Maneengam et al. [17] utilized an adaptive ϵ -constraint approach to construct the Pareto front, converting one objective into constraints for iterative approximation, followed by a modified TOPSIS integrated with D-CRITIC weighting to select compromise solutions that harmonize cost, time, and sustainability. Chen [18] applied the Normalized Normal Constraint Method (NNCM) to derive the Pareto front, incorporating container utilization and time constraints, and subsequently extracted Pareto solutions to aid decision-making in container shortage scenarios.
Regarding algorithmic approaches, multimodal transportation route optimization is classified as an NP-hard problem, rendering intelligent algorithms a viable methodology for resolution. For instance, Guo [19] developed a hybrid heuristic algorithm termed KIGALNS, while Zhu [20] devised an enhanced genetic algorithm that integrates carpooling scheduling techniques. Zhang [21] applied an improved ant colony optimization (ACO) algorithm to solve the model, while Peng [22] formulated an NSGA-II algorithm augmented with an external elite preservation strategy, and Wu [23] utilized a genetic algorithm (GA) for computation. Concurrently, hybrid algorithms proposed by certain scholars have demonstrated notable efficacy in problem-solving; for example, Lu [24] introduced a hybrid algorithm combining an improved genetic algorithm (GA) with Artificial Fish Swarm Optimization (AFO), denoted as GA-AFO, which leverages GA to produce high-quality initial populations for expedited convergence. Xu [25] employed genetic algorithm (GA), particle swarm optimization (PSO), and a hybrid genetic-particle swarm algorithm (GA-PSO) to resolve the model, whereas Zhang [26] designed and validated a catastrophe adaptive genetic algorithm (CA-GA, a variant incorporating sudden population resets to escape local optima) incorporating Monte Carlo sampling. While these approaches improve convergence in general multimodal problems, they often assume deterministic environments and overlook domain-specific uncertainties such as variable grain degradation rates. Building on these hybrids, our NSGA-II-Q-learning method refines multi-objective routes by incorporating reinforcement learning to handle the stochastic elements of NSGT, which enhances convergence and diversity [27] and leads to more robust Pareto fronts.
In this study, we develop a multi-objective framework that integrates overall transportation and transshipment expenses, duration, carbon emission-related expenditures, and grain quality degradation. Furthermore, a hybrid algorithm merging NSGA-II with Q-learning reinforcement learning is formulated to evaluate the model’s efficacy.

2.2. Research Status of Multimodal Grain Transportation

Early investigations into multimodal grain transportation primarily concentrated on fundamental carbon footprint evaluations for individual modes. For instance, Matsiuk et al. [28] devised a calculation framework for assessing the carbon footprint of road-based grain transport, employing vehicle-specific parameters and mileage metrics to quantify emissions. Through this framework, they established a basis for single-mode environmental assessments and elucidated the effects of fuel types and load factors on carbon emissions.
As scholarly efforts advanced, researchers began incorporating emission forecasting and policy considerations into analyses of multimodal systems. The Matsiuk [29] team formulated a predictive model for carbon emissions in multimodal transport, amalgamating empirical data from rail and water modes to simulate emission profiles under diverse operational scenarios. Sun et al. [30] proposed a green intermodal routing optimization model for grain, integrating carbon taxation and trading mechanisms, while utilizing trapezoidal fuzzy numbers to characterize multi-source uncertainties such as demand variability. By implementing chance-constrained programming to establish wastage thresholds, they demonstrated the viability of concurrently optimizing economic viability, operational reliability, and environmental sustainability.
Ni [31] selected time, cost, and product quality degradation as primary optimization objectives in route planning, developing a node-selection technique to resolve the multi-objective shortest route problem, thereby mitigating the risk of convergence to local optima.
In terms of optimization methodologies, simulation techniques and heuristic algorithms have proven efficacious in tackling the intricacies of multimodal transport. Mazaraki et al. [32] applied multi-agent simulation to refine a rail–water grain supply chain, incorporating infrastructure constraints such as fleet capacity and port storage, with the objective of minimizing transit duration. By representing grain tonnage as discrete entities, their model unveiled optimal resource configurations for practical routes, such as those from Ukraine to Egypt. Prior studies provide valuable emission estimates but simplify wastage as linear or ignore multimodal synergies, limiting applicability in dynamic NSGT scenarios. This work extends these by modeling exponential quality loss and validating via Monte Carlo simulations, filling critical voids in perishable supply chain optimization.
These grain studies guide our carbon-focused NSGT model by highlighting uncertainties and wastage, which we extend with MC simulations and hybrid optimization for enhanced applicability. Despite these advances, gaps persist in integrating real-time uncertainties, comprehensive wastage models, and policy-sensitive optimizations across full multimodal chains. Existing works often simplify assumptions, such as constant emission factors or deterministic demands, limiting applicability in dynamic environments. In this study, we address these limitations by proposing a multi-objective model that incorporates transportation costs, time, carbon emissions, and grain quality losses, validated through a hybrid NSGA-II and Q-learning algorithm.

2.3. Research Gaps and Contributions

While Lu and Gao [24] present a general multi-attribute model for multimodal transportation optimizing costs, time, emissions, and risks using a GA-AFO hybrid with user-preference fuzzy decision-making, our work differentiates by targeting China’s North-to-South Grain Transportation (NSGT). We introduce grain quality loss as a fourth objective, modeled exponentially in Equation (4) to capture time-sensitive degradation, a critical gap in generic models that lack a focus on perishable goods. Our NSGA-II-Q-learning hybrid algorithm enhances Pareto front robustness (validated with hypervolume and p < 0.05 hypothesis testing), outperforming baselines like NSGA-II and SPEA. This provides policy-relevant insights for sustainable NSGT under China’s carbon neutrality targets, addressing limitations in prior works such as deterministic assumptions and simplified emissions.

3. Model Building

3.1. Problem Description

The optimization of multimodal transportation routes for the NSGT problem, considering carbon emissions, is a complex issue involving multiple transport routes, nodes, modes, fixed origin and destination, and various stochastic factors. It requires minimizing transportation costs, transit time, carbon emission costs, and grain quality losses. Therefore, this study establishes a model for optimizing multimodal transportation routes with the objective of minimizing transportation costs, transit time, carbon emission costs, and grain quality losses.
Assume a batch of grain is to be transported from City A to City G, with multiple intermediate nodes as shown in Figure 2. The origin is City A, the destination is City G, and there are five intermediate nodes (B to F). Each node offers three transportation modes: road, rail, and waterway. By evaluating the carbon emissions, costs, and grain quality losses associated with each mode at every node, this study’s aim is to determine an optimized multimodal transportation route for NSGT.
The challenges addressed in carbon-aware route optimization for intermodal grain transportation from northern to southern regions can be distilled into key components: identifying a route within the grain shipment network that encompasses the maximum feasible number of nodes. This involves employing objective functions encompassing shipment expenses, duration, environmental impact fees, and product degradation, subject to constraints including emission limits, scheduling windows, and modal choices, with computations aimed at reducing overall inefficiencies in grain logistics.

3.2. Model Hypothesis

To facilitate the formulation of the multi-objective optimization model for the North-to-South Grain Transportation network, the following assumptions are made based on practical considerations and the existing literature on multimodal transportation optimization.
(1) Grain is assumed homogeneous, with degradation rates averaged across common varieties for simplicity, though H/T effects are evaluated in sensitivity analysis. Flows are conserved at hubs, allowing splitting/merging, but without dynamic route generation [33].
(2) All parameters, such as distances, costs, times, emission factors, efficiencies, and degradation rates, are known and deterministic [34].
(3) At hubs, mode switches are binary decisions with fixed times and costs, assuming compatible modes and sufficient infrastructure capacity. Hub congestion is not modeled [35].
(4) Destinations have strict time windows for arrivals. Violations incur linear penalties, prioritizing operational and quality impacts over soft constraints [36].
(5) Emissions are computed via distance, mode factors, and efficiencies, assuming a carbon pricing policy. Quality loss follows an exponential time-based function with base rate α , ignoring environmental variables like humidity for tractability [37].
Table 1 illustrates symbol definitions for parameters and decision variables in this study.

3.3. Mathematical Model

The proposed model is a multi-objective optimization framework that minimizes total costs, time, carbon emission costs, and grain quality loss, integrating both transportation and transshipment hub operations. The objective functions are formulated as follows:
$$\min \{ f_1, f_2, f_3, f_4 \}$$
(1) This objective combines transportation costs (based on unit cost, distance, and flow), transshipment costs at hubs (based on unit transshipment cost and transshipped amount), and time window violation penalties, where $V_h$ denotes the set of hub nodes.
$$f_1 = \sum_{k \in K} \sum_{(i,j) \in E} c_{ij}^{k} \cdot d_{ij}^{k} \cdot q_{ij}^{k} \cdot x_{ij}^{k} + \sum_{i \in V_h} \sum_{k \in K} \sum_{l \in K} c_{i}^{kl} \cdot z_{i}^{kl} \cdot y_{i}^{kl} + \sum_{i \in V} \gamma \cdot \max(0,\ p_i - b_i,\ a_i - p_i) \tag{1}$$
(2) This objective aggregates transportation time across selected routes and transshipment time at selected hubs.
$$f_2 = \sum_{k \in K} \sum_{(i,j) \in E} t_{ij}^{k} \cdot x_{ij}^{k} + \sum_{i \in V_h} \sum_{k \in K} \sum_{l \in K} t_{i}^{kl} \cdot y_{i}^{kl} \tag{2}$$
(3) This objective accounts for carbon emission costs from transportation (adjusted by mode-specific efficiency factors $\eta_k$) and transshipment operations (adjusted by energy efficiency factors $\theta_{i}^{kl}$), extending the environmental considerations, where $V_h$ denotes the set of hub nodes.
$$f_3 = \sum_{k \in K} \sum_{(i,j) \in E} p_c \cdot \eta_k \cdot e_{ij}^{k} \cdot d_{ij}^{k} \cdot q_{ij}^{k} \cdot x_{ij}^{k} + \sum_{i \in V_h} \sum_{k \in K} \sum_{l \in K} p_c \cdot \theta_{i}^{kl} \cdot e_{i}^{kl} \cdot z_{i}^{kl} \cdot y_{i}^{kl} \tag{3}$$
(4) This objective function minimizes the grain quality loss during transportation and trans-shipment, which is modeled using an exponential decay function (the longer the time, the greater the loss). Here, $\alpha$ and $\beta$ are quality degradation parameters (with $\alpha$ representing the base rate and $\beta$ the time sensitivity), and $e^{\beta t_{ij}^{k}}$ denotes the exponential decay factor, which increases exponentially with time $t_{ij}^{k}$.
$$f_4 = \sum_{k \in K} \sum_{(i,j) \in E} \alpha e^{\beta t_{ij}^{k}} \cdot q_{ij}^{k} \cdot x_{ij}^{k} + \sum_{i \in V_h} \sum_{k \in K} \sum_{l \in K} \alpha e^{\beta t_{i}^{kl}} \cdot z_{i}^{kl} \cdot y_{i}^{kl} \tag{4}$$
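As a rough illustration of how Equations (1)–(4) could be evaluated for one already-selected route, the following Python sketch sums the per-segment and per-transfer terms. The `Segment` and `Transfer` containers, the function name `evaluate`, and all numeric inputs are hypothetical and are not taken from the study's MATLAB implementation or its tables; only the formulas follow the objectives above.

```python
import math
from dataclasses import dataclass

@dataclass
class Segment:
    """One selected edge (i, j) travelled by mode k, i.e. x_ij^k = 1 (hypothetical container)."""
    unit_cost: float    # c_ij^k
    distance: float     # d_ij^k
    flow: float         # q_ij^k
    time: float         # t_ij^k
    emission: float     # e_ij^k
    efficiency: float   # eta_k

@dataclass
class Transfer:
    """One selected mode switch k -> l at hub i, i.e. y_i^kl = 1 (hypothetical container)."""
    unit_cost: float    # c_i^kl
    amount: float       # z_i^kl
    time: float         # t_i^kl
    emission: float     # e_i^kl
    efficiency: float   # theta_i^kl

def evaluate(segments, transfers, arrivals, windows,
             penalty, carbon_price, alpha, beta):
    """Return (f1, f2, f3, f4); arrivals holds p_i and windows holds (a_i, b_i)."""
    # f1: transport cost + transshipment cost + time window penalties
    f1 = sum(s.unit_cost * s.distance * s.flow for s in segments)
    f1 += sum(t.unit_cost * t.amount for t in transfers)
    f1 += sum(penalty * max(0.0, p - b, a - p)
              for p, (a, b) in zip(arrivals, windows))
    # f2: travel time + transshipment time
    f2 = sum(s.time for s in segments) + sum(t.time for t in transfers)
    # f3: carbon cost of transport and transshipment
    f3 = sum(carbon_price * s.efficiency * s.emission * s.distance * s.flow
             for s in segments)
    f3 += sum(carbon_price * t.efficiency * t.emission * t.amount
              for t in transfers)
    # f4: exponential, time-dependent quality loss
    f4 = sum(alpha * math.exp(beta * s.time) * s.flow for s in segments)
    f4 += sum(alpha * math.exp(beta * t.time) * t.amount for t in transfers)
    return f1, f2, f3, f4
```

Because the binary variables are implicit in which segments and transfers are passed in, the sketch evaluates a fixed candidate route rather than solving the optimization problem.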

3.4. Constraint Condition

The model is subject to the following constraints:
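$$\sum_{k \in K} x_{ij}^{k} \le 1, \quad \forall (i,j) \in E \tag{5}$$
$$\sum_{k \in K} \sum_{l \in K,\ l \ne k} y_{i}^{kl} \le 1, \quad \forall i \in V_h \tag{6}$$
$$q_{ij}^{k} \le C_k \cdot x_{ij}^{k}, \quad \forall (i,j) \in E,\ k \in K \tag{7}$$
$$\sum_{k \in K} \sum_{(i,j) \in E} n_k \cdot e_{ij}^{k} \cdot d_{ij}^{k} \cdot q_{ij}^{k} \le E_{\max} \tag{8}$$
$$p_j \ge p_i + t_{ij}^{k} \cdot x_{ij}^{k}, \quad \forall (i,j) \in E,\ k \in K \tag{9}$$
$$\sum_{k \in K} \sum_{j:(i,j) \in E} q_{ij}^{k} - \sum_{k \in K} \sum_{j:(j,i) \in E} q_{ji}^{k} = Q_i, \quad \forall i \in V \tag{10}$$
$$q_{ij}^{k} \ge 0, \quad \forall (i,j) \in E,\ k \in K \tag{11}$$
$$x_{ij}^{k} \in \{0, 1\}, \quad \forall (i,j) \in E,\ k \in K \tag{12}$$
$$y_{i}^{kl} \in \{0, 1\}, \quad \forall i \in V,\ (k, l) \in K \tag{13}$$
$$p_i \ge 0, \quad \forall i \in V \tag{14}$$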
Equation (5) implies that at most one transportation mode can be employed between any pair of adjacent nodes. Equation (6) specifies that each hub node is limited to at most one transition between transportation modes. Equation (7) states that, for every edge $(i, j)$ and mode $k$, the load $q_{ij}^{k}$ must not exceed the vehicle capacity $C_k$ multiplied by the binary decision variable $x_{ij}^{k}$, which indicates whether the edge is utilized. Equation (8) requires that the aggregate carbon emissions across all modes $k$ and edges $(i, j)$ not surpass the prescribed emission cap $E_{\max}$. Equation (9) stipulates that, for each edge $(i, j)$ and mode $k$, the arrival time at node $j$, denoted $p_j$, is at least the arrival time at node $i$, $p_i$, plus the travel time $t_{ij}^{k}$ scaled by the decision variable $x_{ij}^{k}$. Equation (10) is a flow conservation constraint: for each node $i$, the total outbound load from $i$ across all modes minus the total inbound load to $i$ equals the net demand $Q_i$. Equation (11) enforces non-negativity on all load variables $q_{ij}^{k}$. Equations (12) and (13) define the binary nature of the decision variables. Finally, Equation (14) ensures that arrival times at all nodes are non-negative.
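To make the role of the capacity, emission-cap, and timing constraints more concrete, here is a minimal Python sketch that checks a fixed candidate route against Equations (7), (8), (9), and (11). The function `check_route`, the dictionary keys, and the sample values are hypothetical illustrations; in particular, Equation (9) is applied with equality (earliest possible arrival), which is an assumption of this sketch.

```python
def check_route(segments, capacity, emission_cap):
    """Check one fixed route against constraints (7), (8), (9) and (11).

    segments: list of dicts with keys mode, flow, distance, time,
              emission (e_ij^k) and factor (the n_k multiplier in Eq. (8)).
    capacity: dict mapping mode -> C_k.
    Returns (feasible, arrival_times); arrival_times[0] is the origin.
    """
    arrivals = [0.0]
    total_emission = 0.0
    for seg in segments:
        # (11) non-negative load and (7) load within the mode's capacity
        if not 0.0 <= seg["flow"] <= capacity[seg["mode"]]:
            return False, arrivals
        # accumulate the left-hand side of the emission cap (8)
        total_emission += (seg["factor"] * seg["emission"]
                           * seg["distance"] * seg["flow"])
        # (9) taken with equality: earliest arrival at the next node
        arrivals.append(arrivals[-1] + seg["time"])
    return total_emission <= emission_cap, arrivals
```

A route that passes this check still has to satisfy flow conservation (10) and the mode-switch limit (6), which depend on the full network and are not modeled here.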

4. Algorithm Design

This section outlines the NSGA-II-Q-learning hybrid algorithm for solving the multi-objective NSGT model. Standard NSGA-II procedures are referenced from [38] and condensed here to emphasize innovations tailored to NSGT: adaptive mutation for robustness, diversity reinitialization to prevent stagnation, extended MDP in Q-learning incorporating previous mode for multimodal refinement, and dominance-based rewards prioritizing grain loss ( f 4 ). Parameters such as population size = 100, generations = 250, crossover = 0.9, and mutation = 0.15 are tuned via grid search for convergence. The flowchart of the NSGA-II-Q-learning hybrid algorithm is depicted in Figure 3.

4.1. Non-Dominated Sorting Genetic Algorithm II

The Non-dominated Sorting Genetic Algorithm II (NSGA-II) serves as the foundational evolutionary framework to concurrently optimize four objectives: transportation cost f 1 , transit time f 2 , emission cost f 3 , and grain quality degradation f 4 . This elitist multi-objective evolutionary algorithm incorporates non-dominated sorting and crowding distance metrics to preserve diversity and convergence towards the Pareto front. Adaptive mechanisms are integrated to mitigate premature convergence, a prevalent issue in complex optimization landscapes. The specific steps of the algorithm are as follows:
Step 1: Encoding Mechanism: Individuals are represented via integer vectors, wherein the inaugural gene denotes the route identifier (spanning 1 to 29 unique routes derived from graph enumeration and deduplication). Subsequent genes encode transportation modalities per segment (1: highway; 2: railway; 3: waterway), augmented with 0s for routes shorter than the maximum segment count (6). This schema guarantees solution feasibility by leveraging precomputed routes via k-shortest route algorithms. The encoding paradigm is visualized in Figure 4.
To enhance diversity and reproducibility in route generation, we dynamically generate candidate routes using Yen’s k-shortest routes algorithm, implemented via MATLAB’s (R2024b) graph and shortest route functions to produce alternative routes while iteratively removing edges from the graph. For consistent experimental results across runs, random seeds are set using MATLAB’s (R2024b) rng function.
Step 2: Initialization and Population Diversity: A population of 800 individuals is instantiated through randomized integer assignment within stipulated bounds: [1, 29] for route indices and [1, 3] for modalities. Diversity is bolstered by cyclically assigning initial route indices across all viable routes. Objective evaluations are conducted via the evaluate objectives function, which quantifies $f_1$ to $f_4$ from segment attributes, trans-shipment expenditures, durations, emissions, and quality attrition.
Step 3: Genetic Operators: This part mainly includes the following three steps:
(1) Selection: Binary tournament selection is utilized, favoring individuals with superior ranks and elevated crowding distances.
(2) Crossover: Single-point crossover is executed at a rate of 0.9. A random locus is chosen, after which genetic material is exchanged between the parents to yield offspring. This operator is exemplified in Figure 5, illustrating partial solution inheritance and the introduction of variation.
(3) Mutation: Probabilistic mutation (initially 0.05) entails random perturbation of selected genes within bounds (e.g., modality reassignment). The mutation procedure is depicted in Figure 6, accentuating altered loci.
Step 4: Non-Dominated Sorting and Crowding Distance: Post-offspring generation and appraisal, the amalgamated populace undergoes non-dominated sorting for rank allocation. In this process, individuals are iteratively assigned to fronts based on dominance: an individual p dominates q if p is no worse in all objectives and strictly better in at least one. The first front comprises non-dominated individuals, with subsequent fronts formed from the remaining population recursively. Crowding distances are computed to foster intra-front diversity. For each objective m, individuals in a front are sorted, and the crowding distance d i for individual i is given by Formula (15):
$$d_i = \sum_{m=1}^{M} \frac{f_m(i+1) - f_m(i-1)}{f_m^{\max} - f_m^{\min}} \tag{15}$$
where $f_m(i+1)$ and $f_m(i-1)$ are the objective values of neighboring individuals in the sorted list for objective $m$, and boundary individuals receive infinite distance. The ensuing population is constituted by sequentially incorporating fronts, with truncation of the terminal front predicated on descending crowding distances.
Step 5: Adaptive Mechanisms and Convergence Check: To circumvent stagnation, generational diversity is quantified via a diversity check, and suboptimal diversity (<0.5) triggers a tripling of mutation probability (to 0.15) for transient exploration augmentation; convergence is ascertained by Pareto front size fluctuation: cessation ensues upon <1% variation over 200 successive generations.
Step 6: Top Individual Selection: The terminal Pareto front undergoes objective normalization, with scores derived as the summation of normalized f 1 + f 2 + f 3 augmented by twice the normalized f 4 . Ranked solutions are deduplicated via route and initial four modes to sustain diversity, culminating in the extraction of the top 10% as Q-Learning inputs.
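The encoding and genetic operators of Steps 1 to 3 can be sketched in Python as follows. This is only an illustration under stated assumptions: `route_lengths` (a mapping from route identifier to segment count) is hypothetical, and the `repair` step that restores the zero padding after crossover is an assumption of this sketch, since the text does not say how inconsistent offspring are handled.

```python
import random

N_ROUTES, MAX_SEGMENTS = 29, 6   # values stated in Step 1

def random_individual(rng, route_lengths):
    """Chromosome [route_id, mode_1, ..., mode_6]; unused trailing genes are 0."""
    route_id = rng.randint(1, N_ROUTES)
    n_seg = route_lengths[route_id]
    modes = [rng.randint(1, 3) for _ in range(n_seg)]
    return [route_id] + modes + [0] * (MAX_SEGMENTS - n_seg)

def crossover(p1, p2, rng, rate=0.9):
    """Single-point crossover: swap the tails after a random cut point."""
    if rng.random() >= rate:
        return p1[:], p2[:]
    cut = rng.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def mutate(ind, rng, rate=0.05):
    """Reassign the mode of each active segment gene with probability `rate`."""
    out = ind[:]
    for pos in range(1, len(out)):
        if out[pos] != 0 and rng.random() < rate:
            out[pos] = rng.randint(1, 3)
    return out

def repair(ind, route_lengths, rng):
    """Assumed repair step: active genes get a valid mode, the rest become 0."""
    n_seg = route_lengths[ind[0]]
    fixed = []
    for pos, gene in enumerate(ind[1:]):
        if pos >= n_seg:
            fixed.append(0)
        else:
            fixed.append(gene if gene in (1, 2, 3) else rng.randint(1, 3))
    return [ind[0]] + fixed
```

Because the first gene selects a precomputed route, any repaired chromosome maps to a feasible node sequence, which matches the feasibility argument made in Step 1.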
This NSGA-II facet proficiently navigates the solution domain, providing diversified high-caliber candidates for the ensuing Q-Learning refinement.
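The non-dominated sorting and crowding distance computation of Step 4 can be illustrated with a short Python sketch. It uses a simple quadratic-time front-peeling loop rather than the bookkeeping of the original NSGA-II procedure, and it normalizes by the objective range within the front; both choices are assumptions made for brevity, not details confirmed by the text.

```python
def dominates(p, q):
    """p dominates q: no worse in every objective, strictly better in one (minimization)."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))

def non_dominated_sort(objs):
    """Peel off successive fronts; returns a list of lists of indices."""
    remaining = set(range(len(objs)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objs[j], objs[i])
                            for j in remaining if j != i)]
        fronts.append(sorted(front))
        remaining -= set(front)
    return fronts

def crowding_distance(objs, front):
    """Crowding distance of Formula (15) for the individuals in one front."""
    dist = {i: 0.0 for i in front}
    n_obj = len(objs[front[0]])
    for m in range(n_obj):
        ordered = sorted(front, key=lambda i: objs[i][m])
        lo, hi = objs[ordered[0]][m], objs[ordered[-1]][m]
        # boundary individuals always receive infinite distance
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if hi == lo:
            continue  # degenerate objective: no spread to measure
        for pos in range(1, len(ordered) - 1):
            gap = objs[ordered[pos + 1]][m] - objs[ordered[pos - 1]][m]
            dist[ordered[pos]] += gap / (hi - lo)
    return dist
```

For example, with the two-objective points `(1, 4), (2, 3), (3, 2), (4, 1)` all in the first front, the two interior points each accumulate a distance of 2/3 per objective, while the two extremes are kept unconditionally.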

4.2. Q-Learning Algorithm

Following the NSGA-II phase, Q-learning is integrated as a reinforcement learning component to refine the top 10% Pareto-optimal solutions and identify the singular optimal route. This Markov Decision Process (MDP)-based approach models the transportation network as a state-action space, where learning occurs through trial-and-error interactions to maximize cumulative rewards tied to the objectives. By restricting the action space to elite candidates from NSGA-II, computational efficiency is enhanced while focusing on high-potential solutions.
Q-learning is applied post-NSGA-II to leverage Markov Decision Processes for sequential route optimization, justified theoretically by treating multimodal choices as actions in a state space—addressing NSGA-II’s limitations in local refinement. Reward scaling is empirically tuned via sensitivity analysis, ensuring balance across objectives rather than arbitrary weights [27].

4.2.1. MDP Formulation

The problem is framed as an MDP with states representing nodes ($S = \{1, 2, \ldots, 7\}$). Actions from state $s$ are tuples $a = (s', m)$, where $s'$ is the next node and $m \in \{1, 2, 3\}$ denotes the mode (highway, railway, waterway). The action space is constrained to transitions observed in the top routes from NSGA-II, ensuring feasibility.
The Q-table is initialized randomly: $Q(s, a) \sim U(-0.5, 0.5)$, with dimensions $|S| \times \max(|A_s|)$, where $\max(|A_s|) = 6 \times 3 = 18$.

4.2.2. Reward Function

Rewards guide learning towards minimizing objectives, with a focus on grain quality loss ($f_4$). For each transition $(s, a, s')$,
$$r = \alpha \exp(\beta t_{ij}^{k})\, Q \cdot \gamma^{\text{step} - 1} + N(0, 0.05) - 0.05 \tag{16}$$
In Formula (16), the immediate reward approximates segment quality loss, where $t_{ij}^{k}$ is the time on edge $(i, j)$ under mode $k$, $\alpha = 0.01$, $\beta = 0.01$, $Q = 100$, $\gamma = 0.09$, and noise promotes exploration.
Upon reaching the goal (terminal state), a comprehensive terminal reward incorporates all objectives and time window violation:
$$r_{\text{terminal}} = \frac{\sum_{o=1}^{4} f_o}{50000} + \gamma \cdot v \cdot 50$$
where $v = \max(0, T - b) + \max(0, a - T)$ is the soft time window violation at node 7, with $[a, b] = [100, 210]$ h, and $T$ is total time ($f_2$).
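The soft time window violation used in the terminal reward is simple to compute; the following Python sketch shows it for the stated window of 100 to 210 h. The function name `window_violation` is a hypothetical label chosen for this illustration.

```python
def window_violation(total_time, earliest=100.0, latest=210.0):
    """v = max(0, T - b) + max(0, a - T), in hours, for the window [a, b]."""
    return max(0.0, total_time - latest) + max(0.0, earliest - total_time)
```

Under this definition, a total time of 150 h incurs no violation, 230 h is 20 h late, and 90 h is 10 h early; at most one of the two terms is non-zero for any given arrival.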
Table 2 examines the impact of varying reward scales on mean objective values. At Scale1 = 40,000/Scale2 = 40 and Scale1 = 60,000/Scale2 = 60, the results are identical, indicating symmetry or saturation at these bounds. The middle scale (50,000/50) yields a lower cost ($f_1$) but compromises emissions ($f_3$) and loss ($f_4$), highlighting a trade-off: moderate scales balance exploration and exploitation in Q-learning, whereas the extremes favor low-penalty outcomes. This suggests that 50,000/50 may be preferable for cost-sensitive NSGT scenarios, though higher scales enhance environmental reliability.

4.2.3. Learning Process

Q-learning employs an ϵ -greedy policy for action selection. The update rule is as follows:
$$Q(s, a) \leftarrow Q(s, a) + \alpha \left( r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right)$$
with learning rate $\alpha = 0.5$ and discount factor $\gamma = 0.9$. Episodes commence at state 1 and terminate at state 7 or after 20 steps. The exploration rate $\epsilon$ is initialized at 1.0 and decays as $\epsilon \leftarrow \epsilon \times 0.9995$. Periodic resets (every 5000 episodes) reinitialize the Q-table, and $\epsilon$ resets to 0.8 every 10,000 episodes for balanced exploration.
Training spans 30,000 episodes, with successful episodes validated against full objectives and adjusted rewards.
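A compact Python sketch of the tabular update with an epsilon-greedy policy is given below. The four-node network, its rewards (illustrative negative travel times), and the per-episode decay schedule are hypothetical and far smaller than the seven-node case; the periodic Q-table and epsilon resets described above are omitted for brevity.

```python
import random

# Hypothetical toy network: state -> {action: reward}.
# An action is (next_node, mode); node 4 plays the role of the goal.
ACTIONS = {
    1: {(2, "rail"): -3.0, (3, "road"): -1.0},
    2: {(4, "water"): -1.0},
    3: {(4, "road"): -5.0},
}
GOAL = 4

def train(episodes=2000, alpha=0.5, gamma=0.9, eps=1.0, decay=0.9995,
          max_steps=20, seed=0):
    rng = random.Random(seed)
    # Q-table initialized uniformly in [-0.5, 0.5], as in the MDP formulation
    q = {(s, a): rng.uniform(-0.5, 0.5)
         for s, acts in ACTIONS.items() for a in acts}
    for _ in range(episodes):
        s = 1
        for _ in range(max_steps):
            acts = list(ACTIONS[s])
            if rng.random() < eps:                      # explore
                a = rng.choice(acts)
            else:                                       # exploit
                a = max(acts, key=lambda x: q[(s, x)])
            reward, s_next = ACTIONS[s][a], a[0]
            # no future value once the goal (terminal state) is reached
            future = 0.0 if s_next == GOAL else max(
                q[(s_next, b)] for b in ACTIONS[s_next])
            q[(s, a)] += alpha * (reward + gamma * future - q[(s, a)])
            s = s_next
            if s == GOAL:
                break
        eps *= decay   # assumed here to decay once per episode
    return q
```

In this toy setting the cheap first hop by road leads to an expensive second hop, so the learned values favor the rail-then-water route even though its first step has the lower immediate reward.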

4.2.4. Optimal Route Extraction

Post-training, the optimal policy is greedy: from state 1, $a = \arg\max_{a} Q(s, a)$ is selected iteratively until state 7 is reached. The resultant route and modes are evaluated for final objectives.
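Greedy route extraction from a trained Q-table can be sketched as follows in Python. The `greedy_route` function, the dictionary layout of `q` and `actions`, and the step limit used as a safeguard against cycles are assumptions of this illustration.

```python
def greedy_route(q, actions, start, goal, max_steps=20):
    """Follow argmax_a Q(s, a) from start until goal; actions are (next_node, mode)."""
    route, s = [], start
    for _ in range(max_steps):
        if s == goal:
            break
        a = max(actions[s], key=lambda x: q[(s, x)])
        route.append(a)
        s = a[0]   # the first element of the action is the next node
    return route
```

The returned list of (next node, mode) pairs can then be passed to the objective evaluation to obtain the final values of the four objectives for the extracted route.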

5. Case Study

To validate the efficacy of the proposed model and the applicability of the algorithm, this study constructs a multimodal transportation network as illustrated in Figure 7. In the figure, the blue solid lines represent highways, the black dashed lines represent railways, and the green dotted lines represent waterways. Data for time, cost, emissions, and capacity are referenced from Table 3, Table 4 and Table 5. Leveraging this network, we simulate the complexity and diversity of real-world traffic scenarios by configuring specific parameters and key settings. Subsequently, the aforementioned methodology is applied to address the case study problem, followed by an analysis of the resulting solutions to derive the ultimate decision-making strategy.

5.1. Case Description

The North-to-South Grain Transportation network comprises seven nodes, namely Harbin, Shenyang, Dalian, Qingdao, Shanghai, Wuhan, and Guangzhou, along with twelve edges and three transportation modes. In this study, the average speeds for highway, railway, and waterways are adopted as 80 km/h, 50 km/h, and 20 km/h, respectively. Table 3, Table 4 and Table 5 detail the supplementary attributes of the multimodal intermodal network. Specifically, Table 3 presents the trans-shipment times, costs, carbon emissions, and energy efficiency factors under different transportation modes [39]; Table 4 provides the unit costs, unit carbon emissions, and transportation energy efficiency factors for each mode [40]; and Table 5 lists the segment lengths of each route, the time windows for every edge, and the corresponding capacities [41].
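To show how the stated average speeds feed into the time and quality-loss objectives, the following Python sketch converts a segment length into travel time and then into the per-segment loss term $\alpha e^{\beta t} q$. The 1000 km distance and 100-unit load are hypothetical, and $\alpha = \beta = 0.01$ are the values quoted in the reward function section, assumed here to apply to the loss term as well.

```python
import math

# Average speeds in km/h, as adopted in the case description
SPEED_KMH = {"highway": 80.0, "railway": 50.0, "waterway": 20.0}

def travel_time(distance_km, mode):
    """Travel time in hours for one segment at the mode's average speed."""
    return distance_km / SPEED_KMH[mode]

def quality_loss(hours, load, alpha=0.01, beta=0.01):
    """Per-segment term of f4: alpha * exp(beta * t) * q."""
    return alpha * math.exp(beta * hours) * load

for mode in SPEED_KMH:
    t = travel_time(1000.0, mode)   # hypothetical 1000 km segment
    print(mode, t, round(quality_loss(t, 100.0), 3))
```

On the same distance, the waterway takes four times as long as the highway, so its exponential loss term is noticeably larger, which is the trade-off against its typically lower cost and emissions that the multi-objective model is meant to balance.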

5.2. Case Solution

The proposed hybrid NSGA-II-Ql algorithm was developed utilizing MATLAB R2024b and evaluated on a Windows 11 platform featuring a 2.30 GHz processor and 16 GB RAM. The primary parameter settings for the algorithm are outlined as follows [38]:
popsize = 100 (population size)
maxgen = 250 (maximum number of generations)
Crossover probability = 0.9
Mutation probability = 0.15
alphaql = 0.5 (learning rate, which determines the weight assigned to new information when updating Q-values)
gammaql = 0.9 (discount factor, which applies a discount to future rewards)
epsilonql = 1.0 (exploration rate)
decay-rate = 0.99 (epsilon decay rate)
numepisodes = 30,000 (number of training episodes)
maxsteps = 20 (maximum number of steps per episode to prevent infinite loops)
The proposed framework achieves an equilibrium between exploratory search (conducted over 2000 evolutionary generations) and exploitative refinement (enabled by a Markov Decision Process and epsilon-greedy mechanism for identifying superior routes), with NSGA-II constructing the multi-objective Pareto frontier and Q-Learning pinpointing the preferred outcome.
In the context of the North-to-South Grain Transportation scenario, NSGA-II was initially executed independently. The top 10% of routes were then identified through comparative analysis of the multi-objective Pareto fronts. The 3D visualization of the Pareto front is illustrated in Figure 8, and across the ten runs, the aggregated frequency distribution (Figure 9) reveals consistent preferences for certain routes, as summarized in Table 6.
In the diagram, the x-axis depicts the aggregate cost ( f 1 ), the y-axis represents the overall duration ( f 2 ), and the z-axis illustrates the carbon emission expense ( f 3 ), while the color spectrum encodes grain degradation ( f 4 ), transitioning from violet and azure to emerald and amber. Specifically, azure shades denote minimal degradation, generally linked to brief transit routes, whereas amber shades reflect substantial degradation, aligned with prolonged routes such as those reliant on slower maritime shipping. This depiction highlights the method’s robustness, featuring an extensive frontier that enables stakeholders in eco-friendly supply chains to harmonize financial, ecological, and product integrity aspects during route planning.
The table highlights that long and medium routes are prevalent, with rail and sea modes dominating examples like 2-3-2-2-2, suggesting efficiency in balancing cost and emissions. Medium routes incorporate road/rail/sea mixes in later segments to reduce emissions, while medium-long variants prioritize multimodal flexibility despite potential time increases.
As depicted in Figure 9, the bar chart shows the total frequencies sorted descending, with route index 10 dominating, followed by indices 6 and 15. This distribution indicates a strong bias toward medium–long routes, likely due to their balanced performance across objectives.
Following the aggregation of top routes from ten NSGA-II executions, Q-learning served as a post-processing mechanism to identify superior solutions. Candidates were derived from the diversity-enhanced set to construct a Markov Decision Process (MDP), wherein nodes functioned as states and pairs of (next node, transportation mode) as actions. Training encompassed 30,000 episodes employing an epsilon-greedy policy (initial ϵ = 1.0, decaying at a rate of 0.99 to a minimum of 0.1), with Q-table perturbations (±0.2) introduced every 500 episodes to enhance exploration. The update procedure incorporated a learning rate of 0.5 and a discount factor of 0.9. The resulting optimal outcomes are summarized in Table 7.
The Q-learning-derived optimal route is Harbin-Dalian-Shanghai-Guangzhou with road-sea-rail modes (nodes 1-3-5-7, modes [1,3,2]; f 1 = 53,356, f 2 = 79, f 3 = 2056, f 4 = 10). By bypassing intermediate nodes, it shortens the time-sensitive segments that drive exponential degradation and so keeps grain quality loss low, although the sea leg raises total cost. Compared to the previously dominant route (1-2-3-4-5-7 with modes [2,3,2,2,2]; f 1 = 35,472, f 2 = 113, f 3 = 2552, f 4 = 10), it has approximately 50% higher cost, approximately 20% lower carbon emissions, about 30% shorter duration, and the same grain loss. The emission saving stems from waterway preferences that offer lower unit emissions but incur greater fixed overheads. The route therefore reflects a strategic emission-quality balance suited to China’s carbon neutrality policy scenarios that prioritize sustainability over budget.
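The relative differences between the two routes follow directly from the reported objective values. The short check below recomputes them; the dictionary and function names are illustrative only.

```python
# Objective values as reported for the two routes.
q_learning_route = {"f1": 53356, "f2": 79, "f3": 2056, "f4": 10}
dominant_route = {"f1": 35472, "f2": 113, "f3": 2552, "f4": 10}


def percent_change(new, old):
    """Relative change of new against old, in percent."""
    return 100.0 * (new - old) / old


changes = {
    key: percent_change(q_learning_route[key], dominant_route[key])
    for key in q_learning_route
}
for key, value in changes.items():
    print(f"{key}: {value:+.1f}%")
```

The signs make the trade-off explicit: only f 1 moves in the unfavorable direction, while f 2 and f 3 improve and f 4 is unchanged.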

5.3. Sensitivity and Uncertainty Analysis

To evaluate the model’s robustness under parameter variations and uncertainties (as per assumptions in Section 3.1), we performed sensitivity analyses on key parameters: carbon price ( p c ), penalty coefficient ( γ ), degradation rates ( α / β ), partial containerization fraction (frac), and humidity/temperature (H/T). Additionally, Monte Carlo (MC) simulation with 100 iterations [42] incorporated stochastic perturbations that simulate real-world NSGT fluctuations: Q by ±20% (normal), t by ±10%, and p c by ±15%. Each MC run executed the Hybrid NSGA-II-Ql to compute the objectives.
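One way to generate such perturbed inputs is sketched below. The text does not fully specify the sampling distributions, so this sketch makes two explicit assumptions: the "±20% (normal)" perturbation of Q is read as a normal multiplier with mean 1 and standard deviation 0.20, and the perturbations of t and p c are drawn uniformly within their stated ranges. The base input values and all names are placeholders, and the optimizer call that would follow each draw is omitted.

```python
import random


def sample_perturbations(n_iter=100, seed=42):
    """Draw multiplicative factors for grain quantity, transit time, and carbon price."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_iter):
        draws.append({
            # Assumption: normal factor with sd 0.20, truncated at zero.
            "Q_factor": max(0.0, rng.gauss(1.0, 0.20)),
            # Assumption: uniform draws within the stated ranges.
            "t_factor": rng.uniform(0.90, 1.10),
            "pc_factor": rng.uniform(0.85, 1.15),
        })
    return draws


def apply_perturbation(base, factors):
    """Scale hypothetical base inputs (Q, t, pc) by one set of sampled factors."""
    return {
        "Q": base["Q"] * factors["Q_factor"],
        "t": base["t"] * factors["t_factor"],
        "pc": base["pc"] * factors["pc_factor"],
    }


base_inputs = {"Q": 1000.0, "t": 100.0, "pc": 0.05}   # placeholder values
scenarios = [apply_perturbation(base_inputs, f) for f in sample_perturbations()]
print(len(scenarios), "scenarios; first:", scenarios[0])
```

Fixing the seed makes the 100 scenarios reproducible, which matters when objective distributions from different model variants are compared.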
Table 8 presents the sensitivity of mean objective values to variations in carbon price (Pc) from 0.05 to 0.10. The results exhibit negligible changes across all objectives: f 1 (cost) remains stable at approximately 64,901, f 2 (time) is invariant at 97.19, f 3 (emission cost) increases marginally by 0.04% (from 2583.49 to 2584.56), and f 4 (grain loss) shows minimal fluctuation around 20.78. This limited sensitivity underscores the model’s robustness to carbon pricing policies, implying that optimized transportation routes maintain efficiency even under escalating environmental taxes. Such stability is particularly valuable for NSGT planning, as it reduces the need for frequent reconfiguration in response to policy shifts. However, the slight rise in f 3 highlights a preference for low-emission modes at higher Pc, aligning with sustainable development goals.
Figure 10 illustrates the sensitivity of the mean violation penalty in f 3 to γ (0.5–2). At γ = 0.5, the penalty is negative (−1279.69), indicating lenient handling of time window violations that may underestimate emission costs. At γ = 1, it approaches zero (1.52), reflecting balanced enforcement. At γ = 2, it becomes positive (2563.94), about 2562 above the value at γ = 1, demonstrating that stricter penalties reduce violations but elevate f 3 . The rise is linear (the penalty changes by about 2562 per unit of γ over both intervals), which aligns with the model’s penalty term and confirms that a high γ promotes compliance in NSGT schedules. Practically, γ > 1 favors robust low-carbon routes by minimizing delays, though excessive values may overpenalize feasible solutions.
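The linearity of the penalty in γ can be checked from the three reported points alone. The snippet below computes the slope over each interval; the variable names are illustrative.

```python
# (gamma, mean violation penalty) pairs as reported for Figure 10.
points = [(0.5, -1279.69), (1.0, 1.52), (2.0, 2563.94)]


def interval_slopes(pts):
    """Slope of the penalty between each pair of consecutive gamma values."""
    return [
        (y1 - y0) / (x1 - x0)
        for (x0, y0), (x1, y1) in zip(pts, pts[1:])
    ]


slopes = interval_slopes(points)
print([round(s, 2) for s in slopes])
```

Both intervals give the same slope to two decimal places, which is what a penalty term proportional to γ would produce.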
Figure 11 depicts α / β sensitivity on mean f 4 : at α = 0.005, f 4 ranges from 20.386 to 20.390 (minimal β impact); at α = 0.01, from 20.776 to 20.785; and at α = 0.02, from 21.559 to 21.573. This is an approximately linear increase of about 5.8% overall as α rises from 0.005 to 0.02 (roughly 1.9% for the first doubling and 3.8% for the second), with β inducing < 0.1% variation. Over this range the response is consistent with the exponential degradation model in Section 3.3, Equation (4), where higher rates amplify time-dependent loss.
Figure 12 shows humidity/temperature effects: f 4 rises non-linearly from 21.536 (H = 0.4, T = 20) to 21.577 (H = 0.6, T = 25), about 0.2% overall, with peaks at elevated values. This extension to previously ignored factors confirms exponential humidity/temperature impacts on grain quality, underscoring the need for controlled storage in NSGT.
Table 9 summarizes the sensitivity of mean objective values to the partial containerization fraction (frac) from 0.5 to 1.0. The analysis reveals minimal fluctuations: f 1 (cost) hovers around 90,525 with negligible perturbation (<0.001%), f 2 (time) remains constant at 97.19, f 3 (emission cost) shows a subtle rise of 0.0002% (from 5146.97 to 5146.98), and f 4 (grain loss) varies slightly by 0.06% around 21.55. This limited response highlights the model’s insensitivity to fraction variations, validating the transshipment structure. In practice, higher frac values support cost and emission reductions through mixed-mode efficiency, facilitating adaptable NSGT strategies under partial loading scenarios. A limitation is the assumption of uniform containerization; empirical logistics data could further calibrate the impacts.
Figure 13 illustrates the distributions of objective values from 100 Monte Carlo simulations, incorporating stochastic perturbations in grain quantity (±20%), transit time (±10%), and carbon price (±15%) to mimic real-world NSGT uncertainties. The f 1 (cost) histogram is right-skewed, with frequencies peaking at low values (0 to 2 × 10^5) and tapering to 6 × 10^5, reflecting optimization bias toward cost efficiency. f 2 (time) shows a bimodal pattern, clustering around 0 and 200–400 h, indicating preferences for short-to-medium routes. f 3 (emission cost) is also right-skewed (0–5000 dominant, with a tail to 15,000), underscoring low-carbon mode prioritization. f 4 (grain loss) displays a narrow peak (0–200), demonstrating controlled degradation.
These profiles affirm the model’s validity: tight distributions confirm resilience, ensuring viable solutions under fluctuations. This supports eco-friendly NSGT strategies, validating the framework’s practical utility. Overall, the model’s robustness is evident, with variations <20% in most objectives.

5.4. Comparison of Different Algorithms

To validate the effectiveness of the proposed Hybrid NSGA-II (NSGA-II-Ql) algorithm [43], it is compared with four benchmark algorithms: Pure NSGA-II, NSGA (Original), SPEA (Original), and Weighted Sum Genetic Algorithm. All algorithms are executed on the same multimodal transportation network instance using consistent parameters: population size = 100; maximum generations = 250; crossover probability = 0.9; and mutation probability = 0.15 [44,45]. Evaluations are based on 30 independent runs to ensure statistical reliability. The primary metric is Hypervolume (HV), which comprehensively assesses the convergence and diversity of the Pareto front. Additional metrics include inverted generational distance (IGD) and the Wilcoxon rank-sum test for statistical significance.
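For readers unfamiliar with IGD, the following Python sketch shows the standard definition: the mean Euclidean distance from each point of a reference front to its nearest point on the obtained front. It is an illustration under stated assumptions rather than the evaluation code used in this study; in particular, it works on raw objective vectors and applies no normalization, and the function name is hypothetical.

```python
import math


def igd(reference_front, obtained_front):
    """Mean distance from each reference point to its nearest obtained point."""
    if not reference_front or not obtained_front:
        raise ValueError("both fronts must be non-empty")
    total = 0.0
    for ref in reference_front:
        total += min(math.dist(ref, sol) for sol in obtained_front)
    return total / len(reference_front)


# Toy two-objective fronts for illustration only.
reference = [(0.0, 4.0), (2.0, 2.0), (4.0, 0.0)]
obtained = [(0.0, 4.0), (2.0, 3.0), (4.0, 0.0)]
print(igd(reference, obtained))
```

A lower IGD means the obtained front lies closer to the reference set, whereas a higher HV means more of the objective space is dominated, so the two metrics point in opposite directions.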
To address concerns on statistical efficacy, we conducted power analysis using the sampsizepwr function in MATLAB (R2024b) [46], revealing a power of 0.99999 for detecting >10% HV differences across 30 runs at α = 0.05 and effect size = 1.2. This exceeds the 0.8 threshold, confirming sufficiency. Average run time was 35.5331 s, feasible for practical NSGT planning on standard hardware. Figure 14 illustrates the power curve obtained from 30 experimental runs. Furthermore, the HV reference point was set to [6 × 10^6, 12,000, 60,000, 1200], derived from the maximum objective values plus a 20% buffer to ensure positive volumes [47], so that HV is meaningful as a measure of dominated space.
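The reference point also fixes the scale of the reported HV values. Assuming that all four objectives are non-negative and that HV is computed on raw (unnormalized) objective values, the hypervolume cannot exceed the volume of the box between the origin and the reference point. The sketch below computes that bound and compares it with the mean HV reported for the hybrid algorithm; the variable names are illustrative.

```python
import math

# HV reference point as stated in the text: [f1, f2, f3, f4].
REFERENCE_POINT = (6 * 10**6, 12_000, 60_000, 1_200)

# Assumption: objectives are non-negative, so the origin bounds the box.
hv_upper_bound = math.prod(REFERENCE_POINT)

reported_hybrid_hv = 5.105e18   # mean HV reported for the hybrid algorithm
share_of_bound = reported_hybrid_hv / hv_upper_bound

print(f"upper bound: {hv_upper_bound:.3e}")
print(f"share of bound covered: {share_of_bound:.3f}")
```

Under these assumptions the bound is 5.184 × 10^18, which explains why the multi-objective algorithms cluster just above 5 × 10^18 and why differences between them appear only in the third or fourth significant digit.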
Figure 15 illustrates the evolution of the average hypervolume (HV) over generations across 30 runs, with shaded areas representing the standard deviation (SD). The shades consist of transparent filled bands, where the band width reflects the variability across runs. The Hybrid NSGA-II-Ql exhibits the highest average HV, commencing at approximately 4.9 × 10^18, undergoing a brief fluctuation, and then rapidly ascending to about 5.1 × 10^18 by around 100 generations before stabilizing, with a small SD indicating superior average performance and robustness. The Pure NSGA-II and Original NSGA stabilize at similar levels around 5.1 × 10^18, with narrow SD shades demonstrating consistent performance. The Original SPEA stabilizes at approximately 5.07 × 10^18, showing moderate improvement but with wider SD, indicating less stability. The Weighted Sum GA performs the worst, commencing at a low value and stabilizing at about 7.8 × 10^17, with a very thin SD but limited overall performance due to its single-objective conversion nature, which may favor specific weights at the expense of diversity. A summary of the mean values and p-values is presented in Table 10.
As shown in Table 10, Hybrid NSGA-II achieves a mean HV of 5.105 × 10^18, comparable to Pure NSGA-II but superior to SPEA and Weighted Sum GA. Despite close means, p-values are low for some comparisons because the Wilcoxon rank-sum test is sensitive to differences in distribution and variance rather than means alone. The Hybrid’s lower SD indicates a tighter distribution and greater reliability, reflecting consistent ranks across runs, which explains the apparent contradictions.
Figure 16 presents a boxplot comparing the final hypervolume (HV) distributions after 250 generations across 30 runs. The Hybrid NSGA-II exhibits a median HV of approximately 5.105 × 10^18, with a compact box and minimal variability, demonstrating high consistency and robustness. The Pure NSGA-II and Original NSGA show similar medians around 5.08–5.10 × 10^18, with slightly wider boxes indicating moderate variability. The Original SPEA has a lower median of about 5.068 × 10^18 and a flatter box, reflecting greater variability and less stable performance. The Weighted Sum GA performs the worst with a median HV of approximately 7.79 × 10^17, indicating inconsistent performance; it may achieve peaks in certain runs but remains lower in most due to its bias toward specific weight configurations.
Statistical analysis using the Wilcoxon rank-sum test reveals a p-value of 1.212 × 10^−12 (<0.05) between the Hybrid NSGA-II and Weighted Sum GA, indicating significant differences. The Hybrid NSGA-II achieves one of the highest average HV values, with low variability that enhances its reliability in practical applications. The Hybrid NSGA-II attains a mean inverted generational distance (IGD) of 50.235 (SD = 29.336), suggesting that its Pareto front is closest to the reference set, thereby exhibiting superior quality. Compared to the other benchmarks, the Hybrid NSGA-II yields p-values < 0.05 for most comparisons (vs. Original NSGA: 9.211 × 10^−5; vs. Original SPEA: 1.695 × 10^−9), confirming its significant superiority over these algorithms. For Pure NSGA-II (p = 0.137 > 0.05), the lack of significance indicates statistical equivalence in HV distributions; however, the Hybrid NSGA-II demonstrates advantages in lower variability (SD = 5.44 × 10^15 vs. 6.42 × 10^15) and substantially better IGD (50.235 vs. 6104.330), underscoring its enhanced robustness and solution quality in multi-objective optimization.
Figure 17 presents a 3D comparison of the Pareto fronts ( f 1 : cost vs. f 2 : time vs. f 3 : emission cost, with color representing f 4 : grain loss, ranging from blue for low values to yellow for high values). The Hybrid NSGA-II solutions exhibit a broad distribution, encompassing regions of low cost, low time, and low emissions, with f4 colors biased toward blue (indicating low loss), thereby demonstrating excellent balance across multiple objectives.
The results indicate that the Hybrid NSGA-II-Ql performs best in terms of average hypervolume (HV), with a value of approximately 5.105 × 10^18, demonstrating stable and diverse solutions across 30 runs. The Pure NSGA-II and Original NSGA achieve slightly lower HV values (5.096 × 10^18 and 5.083 × 10^18, respectively); the difference from Pure NSGA-II is not statistically significant (p = 0.137), whereas the difference from Original NSGA is (p = 9.211 × 10^−5). The Hybrid NSGA-II also offers lower variability (SD = 5.44 × 10^15), achieving an improvement rate of approximately 12-fold in IGD relative to the Pure NSGA-II (based on code calculations). These advantages arise from its hybrid features: Q-learning refines top-tier routes, while adaptive mutation and diversity reinitialization enhance robustness. In the context of multimodal transportation optimization incorporating carbon emissions, the Hybrid NSGA-II demonstrates greater reliability in real-world scenarios. Limitations include elevated computational overhead, and future research may investigate additional metrics such as spread.

5.5. Scalability Assessment of the Framework

To evaluate the extensibility of the hybrid NSGA-II-Q-learning approach for optimizing multimodal grain transportation, we scaled the baseline 7-node network to a randomly constructed 20-node configuration. This involved amplifying network intricacy while preserving core parameters. The analysis, outlined in Table 11 and illustrated in Figure 18, leverages standard multi-objective evolutionary algorithm indicators like hypervolume (HV) and inverted generational distance (IGD), commonly used to gauge solution excellence and approximation accuracy. These align with contemporary MOEA applications in supply chain contexts, where HV assesses front span and IGD evaluates deviation from an optimal benchmark.
The outcomes demonstrate robust coherence and conform to expected behaviors in logistics modeling. The 7-node reference, rooted in empirical North-to-South Grain Transportation routes, yielded a mean terminal HV of 5.10 × 10^18, IGD of 50.235, and average runtime of 35.5 s across iterations. In the 20-node setup, HV rose modestly to 5.17 × 10^18, IGD increased to 52.658, and runtime grew to 38.2 s. Such variations are credible: enlarged solution spaces foster broader exploration, potentially boosting HV, but pose slight approximation hurdles evident in higher IGD. Runtimes scale modestly, reflecting NSGA-II’s efficiency in complex scenarios.
The 3D Pareto visualization in Figure 18 reinforces this. Inverse relationships and even spacing highlight effective optimization, echoing dynamics in emission-focused supply models. The constrained f 4 span, possibly due to variance-enhancing perturbations, remains viable and consistent with decay mechanisms for fragile commodities. Metrics align with field norms, affirming cross-scale dependability without irregularities.

6. Conclusions

This study addresses the multimodal transportation route optimization problem for the North-to-South Grain Transportation initiative, considering carbon emissions. It comprehensively accounts for factors such as cost, time, carbon emissions, and grain quality loss. Subsequently, a hybrid algorithm, NSGA-II-Ql, is proposed, which integrates adaptive mutation and diversity reinitialization mechanisms to achieve balanced optimization across multiple objectives, including cost, time, emission cost, and grain loss. Dynamic carbon taxes and time window constraints are introduced. Using the transportation route from Harbin to Guangzhou as a case study, optimal transportation schemes composed of combinations of different modes under specific constraints are proposed. Finally, the model is optimized using MATLAB software (R2024b), considering the four key influencing factors—time, cost, carbon emissions, and grain quality loss—to obtain global optimal solutions, providing a reference basis for logistics enterprises in implementing multimodal transportation.
The hybrid NSGA-II-Ql exhibits excellent performance in Pareto front coverage, multi-objective balance, and stability. Final results show hybrid HV at 5.105 × 10^18 vs. Weighted Sum’s 0.779 × 10^18, with low variability (5.44 × 10^15), indicating superior reliability. Statistical tests confirm its significant advantages over Pure NSGA-II, Original NSGA, and Original SPEA in IGD (p < 0.05 across all), and in HV over Original NSGA and Original SPEA (p < 0.05), while showing comparability to Pure NSGA-II in mean HV (p = 0.137). The observed differences with Pure NSGA-II likely reflect variances in distribution rather than means, enhancing the hybrid’s robustness. These results are supported by sensitivity analysis showing <20% objective variation under parameter changes. These findings imply reduced emissions for NSGT planning, differing from prior single-mode studies by enabling policy-sensitive trade-offs.
Despite the positive results achieved, this study has several limitations. First, the network scale is relatively small and may not fully reflect the complexity of real-world large-scale transportation systems. Second, the parameter settings are dependent on specific cases, necessitating further validation for generalizability. Additionally, the high computational overhead may constrain its applicability in real-time scenarios. Future work could include larger-scale network models, incorporating uncertainties such as traffic congestion. Concurrently, exploring additional performance metrics and parallel computing techniques could reduce overhead. Furthermore, applying the algorithm to other domains, such as supply chain optimization or urban logistics planning, would validate its robustness.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/app16010510/s1.

Author Contributions

Conceptualization, Y.X. and W.Z.; methodology, Y.X.; software, Y.X.; validation, Y.X., W.Z. and X.H.; formal analysis, W.Z.; investigation, Y.X.; resources, X.H.; data curation, Y.X.; writing—original draft preparation, Y.X.; writing—review and editing, W.Z.; visualization, X.H.; supervision, W.Z.; project administration, W.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article or Supplementary Materials.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Liu, M. The Trends and Problems in China’s North to South Grain Logistics Integration. A Case Study of COFCO & CM Grain Exchange Co. Master’s Thesis, World Maritime University, Malmö, Sweden, 2016.
  2. Kumar, P.P.; Parida, M.; Swami, M. Performance evaluation of multimodal transportation systems. Procedia Soc. Behav. Sci. 2013, 104, 795–804.
  3. Wu, J. Zhong Zhihua Committee Member: Exploring the Establishment of Individual “Carbon Accounts” to Encourage Public Transport Travel, Focusing on “Dual Carbon” to Promote Innovative Development of Urban Public Transport. Wen Wei Po 2022. Available online: https://wenhui.whb.cn/zhuzhan/ztjj2022qglh/20220306/453121.html (accessed on 10 November 2025).
  4. Yang, J.; Liang, D.; Zhang, Z.; Wang, H.; Bin, H. Path optimization of container multimodal transportation considering differences in cargo time sensitivity. Transp. Res. Rec. 2024, 2678, 1279–1292.
  5. Zheng, C.; Sun, K.; Gu, Y.; Shen, J.; Du, M. Multimodal transport path selection of cold chain logistics based on improved particle swarm optimization algorithm. J. Adv. Transp. 2022, 2022, 5458760.
  6. Rahman, H.; Javidroozi, V. Smart Carbon Emission Tracker: A Data-Driven Approach for Greener Multi-modal Transportation. In Proceedings of the International Conference of Advanced Computing and Informatics, Birmingham, UK, 16–17 December 2024; Springer: Berlin/Heidelberg, Germany, 2024; pp. 161–172.
  7. Cui, T.; Shi, Y.; Wang, J.; Ding, R.; Li, J.; Li, K. Practice of an improved many-objective route optimization algorithm in a multimodal transportation case under uncertain demand. Complex Intell. Syst. 2025, 11, 136.
  8. Liu, S. Multimodal transportation route optimization of cold chain container in time-varying network considering carbon emissions. Sustainability 2023, 15, 4435.
  9. Hou, D.N.; Liu, S.C. Optimization of cold chain multimodal transportation routes considering carbon emissions under hybrid uncertainties. Adv. Prod. Eng. Manag. 2024, 19, 315–332.
  10. Peng, Y.; Zhang, Y.; Yu, D.Z.; Luo, Y. Multiobjective Route Optimization for Multimodal Cold Chain Networks Considering Carbon Emissions and Food Waste. Mathematics 2024, 12, 3559.
  11. Huang, C.; Sun, H.; Liu, C.; Zhang, X.; Gao, T.; Tian, J. Research on container multimodal transportation multi-objective path optimization from hinterlands to shanghai port. IEEE Access 2025, 13, 32794–32807.
  12. Yang, L.; Zhang, C.; Wu, X. Multi-objective path optimization of highway-railway multimodal transport considering carbon emissions. Appl. Sci. 2023, 13, 4731.
  13. Zhang, T.; Cheng, J.; Zou, Y. Multimodal transportation routing optimization based on multi-objective Q-learning under time uncertainty. Complex Intell. Syst. 2024, 10, 3133–3152.
  14. Xu, Z.; Zhang, K.; Del Ser, J.; Li, M.; Xu, X.; He, J.; Wu, N. Multi-Objective Optimization for Multimodal Multi-Objective Multi-Point Shortest Path Problem Considering Unforeseeable Road Eventualities. IEEE Trans. Intell. Transp. Syst. 2025, 26, 8622–8640.
  15. Qian, L.; Ruan, Y.; Luo, W. Method for Solving the Pareto Multi-Stakeholder Stable Solution Set of Multi-Objective Optimization for Land-Sea Intermodal Transport. J. Asian Geogr. 2024, 3, 49–61.
  16. Chen, F.; Zhu, Q. Intelligent optimization method for hazardous materials transportation routing with multi-mode and multi-criterion collaborative constraints. Sci. Rep. 2025, 15, 7804.
  17. Maneengam, A. Multi-objective optimization of the multimodal routing problem using the adaptive ε-constraint method and modified TOPSIS with the D-CRITIC method. Sustainability 2023, 15, 12066.
  18. Chen, D.; Zhang, Y.; Gao, L.; Thompson, R.G. Optimizing multimodal transportation routes considering container use. Sustainability 2019, 11, 5320.
  19. Guo, F.; Liang, J.; Niu, R.; Huang, Z.; Liu, Q. Robust optimization of a procurement and routing strategy for multiperiod multimodal transport in an uncertain environment. Eur. J. Oper. Res. 2025, 327, 115–135.
  20. Zhu, P.; Lv, X.; Shao, Q.; Kuang, C.; Chen, W. Optimization of green multimodal transport schemes considering order consolidation under uncertainty conditions. Sustainability 2024, 16, 6704.
  21. Zhang, H.; Li, Y.; Zhang, Q.; Chen, D. Route selection of multimodal transport based on China railway transportation. J. Adv. Transp. 2021, 2021, 9984659.
  22. Peng, Y.; Luo, Y.J.; Jiang, P.; Yong, P.C. The route problem of multimodal transportation with timetable: Stochastic multi-objective optimization model and data-driven simheuristic approach. Eng. Comput. 2022, 39, 587–608.
  23. Wu, C.; Zhang, Y.; Xiao, Y.; Mo, W.; Xiao, Y.; Wang, J. Optimization of multimodal paths for oversize and heavyweight cargo under different carbon pricing policies. Sustainability 2024, 16, 6588.
  24. Lu, Y.; Gao, G. Multi-Attribute Collaborative Optimization for Multimodal Transportation Based on User Preferences. Appl. Sci. 2025, 15, 5512.
  25. Xu, Z.; Zheng, C.; Zheng, S.; Ma, G.; Chen, Z. Multimodal transportation route optimization of emergency supplies under uncertain conditions. Sustainability 2024, 16, 10905.
  26. Zhang, X.; Jin, F.-Y.; Yuan, X.-M.; Zhang, H.-Y. Low-carbon multimodal transportation path optimization under dual uncertainty of demand and time. Sustainability 2021, 13, 8180.
  27. Li, P.; Xue, Q.; Zhang, Z.; Chen, J.; Zhou, D. Multi-objective energy-efficient hybrid flow shop scheduling using Q-learning and GVNS driven NSGA-II. Comput. Oper. Res. 2023, 159, 106360.
  28. Matsiuk, V.; Yanovska, V.; Matviienko, H.; Parfentieva, O.; Ilchenko, N. Development of a method for estimating the carbon footprint when transporting grain by road. In IOP Conference Series: Earth and Environmental Science; IOP Publishing: Bristol, UK, 2024; Volume 1415, p. 012033.
  29. Matsiuk, V.; Yanovska, V.; Hurochkina, V.; Ilchenko, N.; Tvoronovych, V. Prediction of CO2 emissions during multimodal grain transportation. In IOP Conference Series: Earth and Environmental Science; IOP Publishing: Bristol, UK, 2024; Volume 1415, p. 012036.
  30. Sun, Y.; Zhang, C.; Chen, A.; Sun, G. Modeling a green and reliable intermodal routing problem for food grain transportation under carbon tax and trading regulations and multi-source uncertainty. Systems 2024, 12, 547.
  31. Ni, S.; Feng, C. Research on multi-objective intermodal transportation of grain in China considering quality changes. In Proceedings of the Eighth International Conference on Traffic Engineering and Transportation System (ICTETS 2024), Dalian, China, 20–22 September 2024; SPIE: Bellingham, WA, USA, 2024; Volume 13421, pp. 1591–1597.
  32. Mazaraki, A.; Matsiuk, V.; Ilchenko, N.; Kavun-Moshkovska, O.; Grygorenko, T. Development of a multimodal (railroad-water) chain of grain supply by the agent-based simulation method. East.-Eur. J. Enterp. Technol. 2020, 6, 108.
  33. Binsfeld, T.; Hamdan, S.; Jouini, O.; Gast, J. On the optimization of green multimodal transportation: A case study of the West German canal system. Ann. Oper. Res. 2025, 351, 667–726.
  34. Li, W.; Wang, Y. A Comprehensive Analysis Perspective on Path Optimization of Multimodal Electric Transportation Vehicles: Problems, Models, Methods and Future Research Directions. World Electr. Veh. J. 2025, 16, 320.
  35. Chupin, A.; Ragas, A.A.M.A.; Bolsunovskaya, M.; Leksashov, A.; Shirokova, S. Multi-Objective Optimization for Intermodal Freight Transportation Planning: A Sustainable Service Network Design Approach. Sustainability 2025, 17, 5541.
  36. Tanwar, R.; Agarwal, P.K.; Patel, S. Evaluation of travel time performance of multimodal transport system of Bhopal city. Transp. Dev. Econ. 2025, 11, 6.
  37. Uddin, M.; Clark, R.J.; Hilliard, M.R.; Thompson, J.A.; Langholtz, M.H.; Webb, E.G. Agent-based modeling for multimodal transportation of CO2 for carbon capture, utilization, and storage: CCUS-agent. Appl. Energy 2025, 378, 124833.
  38. Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 2002, 6, 182–197.
  39. Luo, Y.; Zhang, Y.; Huang, J.; Yang, H. Multi-route planning of multimodal transportation for oversize and heavyweight cargo based on reconstruction. Comput. Oper. Res. 2021, 128, 105172.
  40. Ziaei, Z.; Jabbarzadeh, A. A multi-objective robust optimization approach for green location-routing planning of multi-modal transportation systems under uncertainty. J. Clean. Prod. 2021, 291, 125293.
  41. Feng, X.; Song, R.; Yin, W.; Yin, X.; Zhang, R. Multimodal transportation network with cargo containerization technology: Advantages and challenges. Transp. Policy 2023, 132, 128–143.
  42. Oberle, W. Monte Carlo Simulations: Number of Iterations and Accuracy; Technical Note ARL-TN-0684; US Army Research Laboratory: Adelphi, MD, USA, 2015.
  43. Faroqi, H. Multiobjective route finding in a multimode transportation network by NSGA-II. J. Eng. Appl. Sci. 2024, 71, 81.
  44. Deb, K. Multi-objective evolutionary algorithms. In Springer Handbook of Computational Intelligence; Springer: Berlin, Germany, 2015; pp. 995–1015.
  45. Fonseca, C.M.; Fleming, P.J. An overview of evolutionary algorithms in multiobjective optimization. Evol. Comput. 1995, 3, 1–16.
  46. Berens, P. CircStat: A MATLAB toolbox for circular statistics. J. Stat. Softw. 2009, 31, 1–21.
  47. Deb, K.; Siegmund, F.; Ng, A.H.C. R-HV: A metric for computing hyper-volume for reference point based EMOs. In International Conference on Swarm, Evolutionary, and Memetic Computing; Springer: Cham, Switzerland, 2014; pp. 98–110.
Figure 1. Multimodal transportation network.
Figure 2. Network diagram of random route layout for multimodal transport.
Figure 3. Hybrid algorithm flowchart (NSGA-II + Q-learning).
Figure 4. Encoding paradigm.
Figure 5. Crossover process.
Figure 6. Mutation process.
Figure 7. Multimodal transport network from Harbin to Guangzhou.
Figure 8. Three-dimensional representation of the Pareto-optimal front in multi-objective grain transportation optimization.
Figure 9. Frequency of route indices in top 10% of routes.
Figure 10. Sensitivity to Gamma.
Figure 11. Sensitivity to α/β.
Figure 12. Sensitivity to Humidity/Temperature.
Figure 13. Distributions of objective values under Monte Carlo uncertainty analysis.
Figure 14. Power Curve for n = 30.
Figure 15. Average HV comparison with SD.
Figure 16. Boxplot of final HV.
Figure 17. Pareto front comparison.
Figure 18. Three-Dimensional Pareto Front of the Extended 20-Node Network.
Table 1. Symbol description of parameters and decision variables.
| Symbol | Description |
| --- | --- |
| $V$ | Set of nodes, including origins, hubs, and destinations ($i, j \in V$) |
| $E$ | Set of edges representing transportation routes ($(i, j) \in E$) |
| $K$ | Set of transportation modes (road, rail, waterway) ($k \in K$) |
| $d_{ij}^{k}$ | Distance of edge $(i, j)$ using mode $k$ |
| $c_{ij}^{k}$ | Unit distance transportation cost (static) |
| $t_{ij}^{k}$ | Transportation time of edge $(i, j)$ using mode $k$ (static) |
| $e_{ij}^{k}$ | Unit distance carbon emissions, related to mode $k$ |
| $p_c$ | Fixed carbon tax price |
| $Q_i$ | Grain demand/supply at node $i$ |
| $TW_i = [a_i, b_i]$ | Time window at node $i$, where $a_i$ is the earliest arrival time and $b_i$ is the latest arrival time |
| $\alpha, \beta$ | Parameters of the grain quality degradation function |
| $\gamma$ | Penalty coefficient for time window violations. The penalty is a linear function with $\gamma = 1$, selected based on common practice in transportation |
| $x_{ij}^{k}$ | Decision variable: 1 if mode $k$ is used on edge $(i, j)$, 0 otherwise |
| $y_{i}^{kl}$ | Decision variable: 1 if hub $i$ uses mode switch $(k, l)$, 0 otherwise |
| $p_i$ | Arrival time at node $i$ |
| $q_{ij}^{k}$ | Grain quantity transported on edge $(i, j)$ using mode $k$ |
| $z_{i}^{kl}$ | Grain quantity at node $i$ using mode switch $(k, l)$ |
| $\theta$ | Energy transshipment efficiency factor |
| $\eta$ | Energy transport efficiency factor |
| $C_k$ | Transportation capacity of mode $k$ |
| $f_1$ | Objective function 1: Transportation cost |
| $f_2$ | Objective function 2: Transportation time |
| $f_3$ | Objective function 3: Carbon emission cost |
| $f_4$ | Objective function 4: Grain quality loss |
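The linear time-window penalty described for $\gamma$ in Table 1 can be illustrated with a short sketch. Table 1 states only that the penalty is linear with $\gamma = 1$; the exact functional form is not given there, so the version below, which charges early and late arrivals symmetrically, is an assumption, and the function name `time_window_penalty` is hypothetical.

```python
def time_window_penalty(arrival: float, earliest: float, latest: float,
                        gamma: float = 1.0) -> float:
    """Linear penalty for arriving outside the window [earliest, latest].

    Assumption: early and late deviations are penalized at the same rate
    gamma. Table 1 only states that the penalty is linear with gamma = 1.
    """
    early = max(0.0, earliest - arrival)  # hours before the window opens
    late = max(0.0, arrival - latest)     # hours after the window closes
    return gamma * (early + late)


# Arrival inside the window [10, 90] incurs no penalty.
inside = time_window_penalty(50.0, 10.0, 90.0)
# Arrival 5 h after the window closes incurs a penalty of gamma * 5.
late_by_five = time_window_penalty(95.0, 10.0, 90.0)
```

If the model penalizes only late arrivals, the `early` term would simply be dropped.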
Table 2. Sensitivity of mean objective values to reward scales.
| Scale1/Scale2 | Mean $f_1$ (Cost) | Mean $f_2$ (Time) | Mean $f_3$ (Emission Cost) | Mean $f_4$ (Grain Loss) |
| --- | --- | --- | --- | --- |
| 40,000/40 | 467,019.82 | 110.33 | 31.60 | 17.39 |
| 50,000/50 | 415,696.27 | 111.83 | 3197.09 | 20.91 |
| 60,000/60 | 467,019.82 | 110.33 | 31.60 | 17.39 |
Table 3. Transshipment costs/times/carbon emissions/efficiency factors at the nodes.
| | Highway | Railway | Waterway |
| --- | --- | --- | --- |
| Highway | 0 | 0.25/1.1/0.05/0.8 | 0.4/1.2/0.06/0.7 |
| Railway | 0.35/1.1/0.04/0.8 | 0 | 0.3/1.1/0.05/0.8 |
| Waterway | 0.45/1.2/0.045/0.7 | 0.35/1.1/0.05/0.8 | 0 |
The units for the data in the table are CNY/t, h, and kg/t; the efficiency factor is unitless.
Table 4. Transportation costs/carbon emissions/energy efficiency factors for different modes.
| Transportation Mode | Data Category | Value |
| --- | --- | --- |
| Highway | Transportation cost (CNY (t/km)$^{-1}$) | 1.3 |
| Highway | Carbon emission factor (kg (t/km)$^{-1}$) | 0.125 |
| Highway | Energy efficiency factor | 1.0 |
| Railway | Transportation cost (CNY (t/km)$^{-1}$) | 0.65 |
| Railway | Carbon emission factor (kg (t/km)$^{-1}$) | 0.052 |
| Railway | Energy efficiency factor | 0.8 |
| Waterway | Transportation cost (CNY (t/km)$^{-1}$) | 0.22 |
| Waterway | Carbon emission factor (kg (t/km)$^{-1}$) | 0.021 |
| Waterway | Energy efficiency factor | 0.5 |
Table 5. Transportation distances and capacities for different transportation modes on each arc.
| Arc Segment | Highway Distance/km | Highway Capacity/t | Highway Time Window | Railway Distance/km | Railway Capacity/t | Railway Time Window | Waterway Distance/km | Waterway Capacity/t | Waterway Time Window |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| (1,2) | 580 | 200 | [10, 90] | 520 | 300 | [10, 90] | – | 500 | – |
| (1,3) | 950 | 200 | [20, 110] | 950 | 300 | [20, 110] | – | 500 | – |
| (2,3) | 380 | 200 | [20, 110] | 390 | 300 | [20, 110] | – | 500 | – |
| (2,4) | 1140 | 200 | [30, 130] | 1360 | 300 | [30, 130] | – | 500 | – |
| (3,4) | 540 | 200 | [30, 130] | 1500 | 300 | [30, 130] | 1302 | 500 | [30, 130] |
| (3,5) | 1860 | 200 | [50, 180] | 2100 | 300 | [50, 180] | 1850 | 500 | [50, 180] |
| (3,7) | 2900 | 200 | [100, 210] | 3250 | 300 | [100, 210] | 4107 | 500 | [100, 210] |
| (4,5) | 800 | 200 | [50, 180] | 1310 | 300 | [50, 180] | 1302 | 500 | [50, 180] |
| (4,6) | 1100 | 200 | [70, 190] | 800 | 300 | [70, 190] | – | 500 | – |
| (5,6) | 790 | 200 | [70, 190] | 800 | 300 | [70, 190] | – | 500 | – |
| (5,7) | 1400 | 200 | [100, 210] | 1700 | 300 | [100, 210] | 2738 | 500 | [100, 210] |
| (6,7) | 980 | 200 | [100, 210] | 1070 | 300 | [100, 210] | – | 500 | – |
“–” indicates that this arc segment does not have this mode of transportation.
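To show how the data in Tables 3 to 5 combine, the following minimal sketch adds up per-tonne transport cost and emissions along one route. It uses the node sequence 1, 3, 5, 7 with modes highway, waterway, railway (the sequence listed in Table 7). Several points are assumptions rather than the paper's model: the Table 4 factors are read as per tonne-kilometre, the rows of Table 3 are read as the "from" mode and the columns as the "to" mode, and the efficiency factors, time windows, carbon price, and shipment quantity are ignored. The result is therefore not expected to reproduce the reported objective values. All names (`MODE_FACTORS`, `TRANSSHIP`, `DISTANCE_KM`, `per_tonne_totals`) are hypothetical.

```python
# Table 4: (transport cost in CNY, emissions in kg) per tonne-kilometre (assumed reading).
MODE_FACTORS = {
    "highway": (1.3, 0.125),
    "railway": (0.65, 0.052),
    "waterway": (0.22, 0.021),
}

# Table 3: (cost CNY/t, time h, emissions kg/t), keyed by (from_mode, to_mode).
# Assumption: table rows are the "from" mode and columns the "to" mode.
TRANSSHIP = {
    ("highway", "railway"): (0.25, 1.1, 0.05),
    ("highway", "waterway"): (0.4, 1.2, 0.06),
    ("railway", "highway"): (0.35, 1.1, 0.04),
    ("railway", "waterway"): (0.3, 1.1, 0.05),
    ("waterway", "highway"): (0.45, 1.2, 0.045),
    ("waterway", "railway"): (0.35, 1.1, 0.05),
}

# Table 5: distances (km) for the three arcs used in this example only.
DISTANCE_KM = {
    ((1, 3), "highway"): 950,
    ((3, 5), "waterway"): 1850,
    ((5, 7), "railway"): 1700,
}


def per_tonne_totals(nodes, modes):
    """Return (cost in CNY/t, emissions in kg/t) for one route."""
    cost = 0.0
    emissions = 0.0
    for idx, mode in enumerate(modes):
        arc = (nodes[idx], nodes[idx + 1])
        km = DISTANCE_KM[(arc, mode)]
        unit_cost, unit_emission = MODE_FACTORS[mode]
        cost += km * unit_cost
        emissions += km * unit_emission
        # Add transshipment terms whenever the mode changes at a hub.
        if idx > 0 and modes[idx - 1] != mode:
            ts_cost, _ts_time, ts_emission = TRANSSHIP[(modes[idx - 1], mode)]
            cost += ts_cost
            emissions += ts_emission
    return cost, emissions


route_cost, route_emissions = per_tonne_totals(
    [1, 3, 5, 7], ["highway", "waterway", "railway"]
)
```

Under these assumptions the route comes to roughly 2747.75 CNY/t and 246.11 kg/t, with the two mode switches contributing only 0.75 CNY/t and 0.11 kg/t.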
Table 6. Summary of path frequencies across NSGA-II runs.
| Route Index | Node Sequence (Decoded) | Average Frequency (SD) | Percentage (%) | Common Mode Examples | Implications |
| --- | --- | --- | --- | --- | --- |
| 10 | Harbin–Shenyang–Dalian–Qingdao–Shanghai–Guangzhou (long route) | 1.4 (±0.0) | ∼14.3 | 2-3-2-2-2 (rail–sea–rail–rail–rail) | Extended routes with diverse variants, incorporating sea modes, suitable for reduced emissions |
| 6 | Harbin–Dalian–Shanghai–Wuhan–Guangzhou (medium route) | 0.9 (±0.0) | ∼9.2 | 1-1-2-2 (road–road–rail–rail) | Medium routes emphasizing rail/sea integration for balance |
| 15 | Harbin–Shenyang–Qingdao–Wuhan–Guangzhou (medium–short route) | 0.9 (±0.0) | ∼9.2 | 2-3-2-3 (rail–sea–rail–sea) | Preference for medium–short routes, predominantly multimodal, achieving cost-duration balance |
| 11 | Harbin–Dalian–Qingdao–Shanghai–Wuhan–Guangzhou | 0.8 (±0.0) | ∼8.2 | 2-3-1-2-2 (rail–sea–road–rail–rail) | Similar to route 10 but via Qingdao, highlighting flexibility |
| 13 | Harbin–Shenyang–Dalian–Shanghai–Guangzhou | 0.8 (±0.0) | ∼8.2 | 2-1-2-1 (rail–road–rail–road) | Medium–long routes with mode variability |
| 14 | Harbin–Shenyang–Qingdao–Shanghai–Guangzhou | 0.8 (±0.0) | ∼8.2 | 2-1-1-2 (rail–road–road–rail) | Comparable to route 15 but shorter sea segments |
| Others (e.g., 17, 1, 8, 5, 3, 2, 7, 4, 12, 19, 9) | Variant routes | <0.8 | <8.2 | Mixed | Infrequent occurrences, facilitating boundary exploration |
Table 7. Summary of Q-Learning output components.
| Component | Description |
| --- | --- |
| Optimal route nodes: 1 3 5 7 | The optimal route is Harbin → Dalian → Shanghai → Guangzhou. This is a medium-length route (4 nodes, 3 segments) that bypasses certain nodes, likely to optimize time and emissions while maintaining cost balance. |
| Optimal modes per segment: 1 3 2 | The transportation mode for each route segment (node connection): 1 = road, 3 = sea, 2 = rail. |
| Optimal objectives $[f_1, f_2, f_3, f_4]$ | $f_1 = 53{,}356$, $f_2 = 79$, $f_3 = 2056$, $f_4 = 10$. This configuration yields moderate costs, elevated durations accompanied by reduced emissions, and minimal losses. |
Table 8. Sensitivity of mean objective values to carbon price (Pc).
| Pc | Mean $f_1$ (Cost) | Mean $f_2$ (Time) | Mean $f_3$ (Emission Cost) | Mean $f_4$ (Grain Loss) |
| --- | --- | --- | --- | --- |
| 0.05 | 64,901.26 | 97.19 | 2583.49 | 20.77 |
| 0.07 | 64,901.50 | 97.19 | 2583.92 | 20.78 |
| 0.10 | 64,901.46 | 97.19 | 2584.56 | 20.78 |
Table 9. Sensitivity of mean objective values to partial containerization fraction (frac).
| Frac | Mean $f_1$ (Cost) | Mean $f_2$ (Time) | Mean $f_3$ (Emission Cost) | Mean $f_4$ (Grain Loss) |
| --- | --- | --- | --- | --- |
| 0.5 | 90,525.43 | 97.19 | 5146.97 | 21.55 |
| 0.75 | 90,525.71 | 97.19 | 5146.98 | 21.56 |
| 1.0 | 90,525.46 | 97.19 | 5146.97 | 21.55 |
Table 10. Summary of mean values, SD, and p-values.
| Algorithm | Mean HV ($\times 10^{18}$) | SD HV | p-HV vs. Hybrid | Mean IGD | SD IGD | p-IGD vs. Hybrid |
| --- | --- | --- | --- | --- | --- | --- |
| Hybrid NSGA-II | 5.105 | $5.445 \times 10^{15}$ | - | 50.235 | 29.336 | - |
| Pure NSGA-II | 5.096 | $6.422 \times 10^{15}$ | 0.137 | 6104.330 | 1623.964 | $5.573 \times 10^{-10}$ |
| Original NSGA | 5.083 | $1.114 \times 10^{16}$ | $9.211 \times 10^{-5}$ | 5286.408 | 1278.760 | $3.020 \times 10^{-11}$ |
| Original SPEA | 5.068 | $1.271 \times 10^{16}$ | $1.695 \times 10^{-9}$ | 6433.867 | 1892.806 | $3.020 \times 10^{-11}$ |
| Weighted Sum GA | 0.779 | 0 | $1.212 \times 10^{-12}$ | 6331.575 | 1315.177 | $3.020 \times 10^{-11}$ |
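For readers unfamiliar with the IGD column in Table 10, the sketch below computes inverted generational distance in its standard form: the mean Euclidean distance from each point of a reference front to its nearest point in the obtained front, so lower values indicate a closer and better-spread approximation. How the reference front was built and whether objectives were normalized is not stated in this table, so the inputs here are purely illustrative and the function name `igd` is hypothetical.

```python
import math


def igd(reference_front, obtained_front):
    """Inverted generational distance (standard definition).

    Averages, over every reference point, the Euclidean distance to the
    closest point of the obtained front. Both fronts are sequences of
    equal-length objective vectors.
    """
    if not reference_front or not obtained_front:
        raise ValueError("both fronts must be non-empty")
    total = 0.0
    for ref in reference_front:
        total += min(math.dist(ref, sol) for sol in obtained_front)
    return total / len(reference_front)


# Illustrative two-objective fronts (not data from the paper).
reference = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
approximation = [(0.0, 1.0), (1.0, 0.0)]
score = igd(reference, approximation)
```

Here the two extreme reference points are matched exactly, so only the middle point contributes, at a distance of $\sqrt{0.5}$ from either obtained point.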
Table 11. Comparison of Performance Metrics Between 7-Node and 20-Node Networks.
| Metric | 7-Node Original | 20-Node Random |
| --- | --- | --- |
| Nodes | 7 | 20 |
| Edges | 12 | 60 |
| Avg Run Time (s) | 35.5 | 38.2 |
| Mean Final HV | $5.10 \times 10^{18}$ | $5.17 \times 10^{18}$ |
| Mean IGD | 50.235 | 52.658 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Xie, Y.; Zhang, W.; Hao, X. Optimization of Multimodal Transportation Routes for North-to-South Grain Transportation in China Considering Carbon Emissions. Appl. Sci. 2026, 16, 510. https://doi.org/10.3390/app16010510

