To validate the efficacy of the proposed model and the applicability of the algorithm, this study constructs a multimodal transportation network as illustrated in Figure 7. In the figure, the blue solid lines represent highways, the black dashed lines represent railways, and the green dotted lines represent waterways. Data for time, cost, emissions, and capacity are taken from Table 3, Table 4, and Table 5. Leveraging this network, we simulate the complexity and diversity of real-world traffic scenarios by configuring specific parameters and key settings. Subsequently, the aforementioned methodology is applied to the case study problem, followed by an analysis of the resulting solutions to derive the final decision-making strategy.
5.2. Case Solution
The proposed hybrid NSGA-II-Ql algorithm was developed in MATLAB R2024b and evaluated on a Windows 11 platform with a 2.30 GHz processor and 16 GB RAM. The primary parameter settings for the algorithm are as follows [38]:
Crossover probability = 0.9
Mutation probability = 0.15
Learning rate = 0.5 (determines the weight assigned to new information when updating Q-values)
Discount factor = 0.9 (applies a discount to future rewards)
Initial exploration rate (epsilon) = 1.0
decay-rate = 0.99 (epsilon decay rate)
Training episodes = 30,000
Maximum number of steps per episode (a cap that prevents infinite loops)
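The exploration settings can be made concrete with a short sketch. It assumes that epsilon is multiplied by the decay rate once per episode and is clipped at the floor of 0.1 reported later in this section; the source does not state the update granularity, and the function names are illustrative only.

```python
import math


def epsilon_schedule(eps0=1.0, decay=0.99, eps_min=0.1, episodes=30_000):
    """Yield the exploration rate used in each episode.

    Assumption: multiplicative decay applied once per episode, with a floor.
    """
    eps = eps0
    for _ in range(episodes):
        yield eps
        eps = max(eps_min, eps * decay)


def episodes_to_floor(eps0=1.0, decay=0.99, eps_min=0.1):
    """Smallest n such that eps0 * decay**n <= eps_min."""
    return math.ceil(math.log(eps_min / eps0) / math.log(decay))


schedule = list(epsilon_schedule())
first_floor = episodes_to_floor()
print(first_floor, schedule[first_floor])
```

Under these assumptions the floor is reached after a few hundred episodes, so nearly all of the 30,000 training episodes would run at the minimum exploration rate.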
The proposed framework achieves an equilibrium between exploratory search (conducted over 2000 evolutionary generations) and exploitative refinement (enabled by a Markov Decision Process and epsilon-greedy mechanism for identifying superior routes), with NSGA-II constructing the multi-objective Pareto frontier and Q-Learning pinpointing the preferred outcome.
In the context of the North-to-South Grain Transportation scenario, NSGA-II was first executed on its own over ten runs. The top 10% of routes were then identified through comparative analysis of the multi-objective Pareto fronts. The 3D visualization of the Pareto front is illustrated in Figure 8. Across the ten runs, the aggregated frequency distribution (Figure 9) reveals consistent preferences for certain routes, as summarized in Table 6.
In the diagram, the x-axis depicts the aggregate cost (f1), the y-axis represents the overall duration (f2), and the z-axis illustrates the carbon emission expense (f3), while the color spectrum encodes grain degradation (f4), transitioning from violet and azure to emerald and amber. Specifically, azure shades denote minimal degradation, generally linked to brief transit routes, whereas amber shades reflect substantial degradation, associated with prolonged routes such as those reliant on slower maritime shipping. This depiction highlights the method’s robustness: the extensive frontier enables stakeholders in eco-friendly supply chains to balance financial, ecological, and product integrity aspects during route planning.
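The Pareto front shown in the diagram consists of solutions that no other solution beats in all four objectives at once. A minimal sketch of that filter follows, assuming all objectives are minimized; the candidate tuples are hypothetical and are not values from the study.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points):
    """Keep the points that no other point dominates (minimization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (f1 cost, f2 time, f3 emission cost, f4 grain loss) tuples.
candidates = [
    (60.0, 100.0, 2.5, 20.0),
    (65.0, 90.0, 2.6, 21.0),
    (66.0, 101.0, 2.7, 22.0),  # worse than the first tuple in every objective
    (58.0, 120.0, 2.4, 25.0),
]
front = non_dominated(candidates)
print(front)
```

This quadratic-time filter is only meant to illustrate the dominance relation; NSGA-II itself uses fast non-dominated sorting with crowding distance.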
The table highlights that long and medium routes are prevalent, with rail and sea modes dominating examples like 2-3-2-2-2, suggesting efficiency in balancing cost and emissions. Medium routes incorporate road/rail/sea mixes in later segments to reduce emissions, while medium-long variants prioritize multimodal flexibility despite potential time increases.
As depicted in Figure 9, the bar chart shows the total frequencies sorted in descending order, with route index 10 dominating, followed by indices 6 and 15. This distribution indicates a strong bias toward medium–long routes, likely due to their balanced performance across objectives.
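The aggregation behind such a frequency chart can be sketched as follows. The per-run lists are made-up route indices chosen only to mirror the ordering described above, and counting every occurrence (rather than once per run) is an assumption.

```python
from collections import Counter


def aggregate_route_frequencies(runs):
    """Count how often each route index appears across runs, sorted by descending frequency."""
    counts = Counter()
    for top_routes in runs:
        counts.update(top_routes)
    return counts.most_common()


# Hypothetical top-route indices from four runs (the study aggregates ten runs).
runs = [[10, 6, 15], [10, 6], [10, 15, 6], [10]]
ranking = aggregate_route_frequencies(runs)
print(ranking)  # [(10, 4), (6, 3), (15, 2)]
```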
Following the aggregation of top routes from ten NSGA-II executions, Q-learning served as a post-processing mechanism to identify superior solutions. Candidates were derived from the diversity-enhanced set to construct a Markov Decision Process (MDP), in which nodes functioned as states and pairs of (next node, transportation mode) as actions. Training encompassed 30,000 episodes with an epsilon-greedy policy (initial epsilon = 1.0, decaying at a rate of 0.99 to a minimum of 0.1), with Q-table perturbations (±0.2) introduced every 500 episodes to enhance exploration. The update procedure used a learning rate of 0.5 and a discount factor of 0.9. The resulting optimal outcomes are summarized in Table 7.
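The MDP formulation described above can be illustrated with a small tabular Q-learning sketch. The four-node network, its edge costs, the reward (negative edge cost), the step cap of 20, and the reduced episode count are all assumptions made for illustration; only the learning rate, discount factor, epsilon schedule, and perturbation settings are taken from the text.

```python
import random

ALPHA, GAMMA = 0.5, 0.9                 # learning rate and discount factor from the text
EPS0, DECAY, EPS_MIN = 1.0, 0.99, 0.1   # epsilon-greedy schedule from the text
DEST = 4
# Hypothetical network: state -> {(next node, mode): cost}.
EDGES = {
    1: {(2, 1): 4.0, (3, 2): 6.0},
    2: {(4, 2): 5.0},
    3: {(4, 3): 1.0},
    4: {},
}


def train(episodes=2000, max_steps=20, seed=0):
    """Tabular Q-learning; episodes and max_steps are scaled-down, assumed values."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in acts} for s, acts in EDGES.items()}
    eps = EPS0
    for ep in range(episodes):
        if ep and ep % 500 == 0:        # periodic Q-table perturbation of +/-0.2
            for acts in q.values():
                for a in acts:
                    acts[a] += rng.uniform(-0.2, 0.2)
        state = 1
        for _ in range(max_steps):
            if state == DEST:
                break
            actions = list(q[state])
            if rng.random() < eps:      # explore
                action = rng.choice(actions)
            else:                       # exploit
                action = max(actions, key=q[state].get)
            nxt = action[0]
            reward = -EDGES[state][action]
            best_next = max(q[nxt].values(), default=0.0)
            q[state][action] += ALPHA * (reward + GAMMA * best_next - q[state][action])
            state = nxt
        eps = max(EPS_MIN, eps * DECAY)
    return q


def greedy_route(q, start=1):
    """Follow the highest-valued action from the start node to the destination."""
    nodes, modes, state = [start], [], start
    while state != DEST:
        nxt, mode = max(q[state], key=q[state].get)
        nodes.append(nxt)
        modes.append(mode)
        state = nxt
    return nodes, modes


q_table = train()
print(greedy_route(q_table))
```

In this toy network the cheaper path is 1-3-4, so the greedy policy should recover it; the study's actual reward design combines several objectives and is not reproduced here.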
The Q-learning-derived optimal route is Harbin-Dalian-Shanghai-Guangzhou with road-sea-rail modes (nodes 1-3-5-7, modes [1,3,2]). By bypassing intermediate nodes, it shortens the time-sensitive segments and thereby limits exponential degradation, which yields low grain quality loss, although sea transportation expenses raise the total cost. Compared to the previously dominant route (1-2-3-4-5-7 with modes [2,3,2,2,2]), it exhibits approximately 50% higher costs alongside approximately 20% lower carbon emissions. This difference stems from the waterway preference, which offers lower unit emissions but incurs greater fixed overheads. The route therefore reflects a strategic emission-quality balance suited to scenarios under China’s carbon neutrality policy that prioritize sustainability over speed or budget.
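The route notation used above pairs a node sequence with one mode code per leg. The sketch below decodes it; the mode mapping (1 = road, 2 = rail, 3 = sea) is inferred from the statement that road-sea-rail corresponds to [1,3,2], and only the four node names given in the text are included.

```python
MODE_NAMES = {1: "road", 2: "rail", 3: "sea"}  # inferred from road-sea-rail = [1, 3, 2]
NODE_NAMES = {1: "Harbin", 3: "Dalian", 5: "Shanghai", 7: "Guangzhou"}  # names given in the text


def describe_route(nodes, modes):
    """Return one readable string per leg of a (nodes, modes) route."""
    if len(modes) != len(nodes) - 1:
        raise ValueError("a route with n nodes needs n - 1 mode codes")
    legs = []
    for (a, b), m in zip(zip(nodes, nodes[1:]), modes):
        legs.append(f"{NODE_NAMES.get(a, a)} -> {NODE_NAMES.get(b, b)} by {MODE_NAMES[m]}")
    return legs


legs = describe_route([1, 3, 5, 7], [1, 3, 2])
print(legs)
```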
5.3. Sensitivity and Uncertainty Analysis
To evaluate the model’s robustness under parameter variations and uncertainties (as per the assumptions in Section 3.1), we performed sensitivity analyses on key parameters: carbon price (Pc), penalty coefficient (gamma), degradation rates, partial containerization fraction (frac), and humidity/temperature (H/T). Additionally, a Monte Carlo (MC) simulation with 100 iterations [42] incorporated stochastic perturbations to simulate real-world NSGT fluctuations: Q by ±20% (normal), t by ±10%, and Pc by ±15%. Each MC run executed the Hybrid NSGA-II-Ql to compute the objectives.
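The perturbation scheme can be sketched as below. Several details are assumptions: the ±20% on Q is read as the standard deviation of a normal distribution, the t and Pc perturbations are drawn uniformly within their bounds, the nominal values are placeholders, and `toy_objective` is a hypothetical stand-in for a full optimizer run.

```python
import random

# Placeholder nominal values, not taken from the study's data tables.
NOMINAL = {"Q": 1000.0, "t": 100.0, "Pc": 0.05}


def perturb(nominal, rng):
    """Draw one perturbed scenario (distribution choices are assumptions)."""
    return {
        "Q": max(0.0, nominal["Q"] * (1 + rng.gauss(0.0, 0.20))),  # normal, sd = 20%
        "t": nominal["t"] * (1 + rng.uniform(-0.10, 0.10)),        # within +/-10%
        "Pc": nominal["Pc"] * (1 + rng.uniform(-0.15, 0.15)),      # within +/-15%
    }


def monte_carlo(nominal, evaluate, iterations=100, seed=0):
    """Evaluate `evaluate` on `iterations` independently perturbed scenarios."""
    rng = random.Random(seed)
    return [evaluate(perturb(nominal, rng)) for _ in range(iterations)]


def toy_objective(scenario):
    """Hypothetical stand-in for one optimizer run: an emission-cost-like product."""
    return scenario["Q"] * scenario["t"] * scenario["Pc"]


results = monte_carlo(NOMINAL, toy_objective)
print(min(results), max(results))
```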
Table 8 presents the sensitivity of mean objective values to variations in carbon price (Pc) from 0.05 to 0.10. The results exhibit negligible changes across all objectives: f1 (cost) remains stable at approximately 64,901, f2 (time) is invariant at 97.19, f3 (emission cost) increases marginally by 0.04% (from 2583.49 to 2584.56), and f4 (grain loss) shows minimal fluctuation around 20.78. This limited sensitivity underscores the model’s robustness to carbon pricing policies, implying that optimized transportation routes maintain efficiency even under escalating environmental taxes. Such stability is particularly valuable for NSGT planning, as it reduces the need for frequent reconfiguration in response to policy shifts. However, the slight rise in f3, which is small relative to the doubling of Pc, is consistent with a preference for low-emission modes at higher Pc, aligning with sustainable development goals.
Figure 10 illustrates the sensitivity of the mean violation penalty to the penalty coefficient gamma (0.5–2). At gamma = 0.5, the penalty is negative (−1279.69), indicating lenient handling of time window violations that may underestimate emission costs. At gamma = 1, it approaches zero (1.52), reflecting balanced enforcement. At gamma = 2, it becomes positive (2563.94), an increase of 2562.42 over the value at gamma = 1, demonstrating that stricter penalties reduce violations but elevate the penalized objective. This linear rise aligns with the model’s penalty term, confirming that a high gamma promotes compliance in NSGT schedules. Practically, a higher gamma favors robust low-carbon routes by minimizing delays, though excessive values may overpenalize feasible solutions.
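The linearity claim can be checked directly from the three penalty values quoted above. The short sketch below computes the slope between consecutive gamma settings; it uses only numbers from the text.

```python
# Mean violation penalty at each gamma, as quoted in the text.
penalty = {0.5: -1279.69, 1.0: 1.52, 2.0: 2563.94}

gammas = sorted(penalty)
slopes = [
    (penalty[b] - penalty[a]) / (b - a)
    for a, b in zip(gammas, gammas[1:])
]
print([round(s, 2) for s in slopes])  # [2562.42, 2562.42]
```

Both segments have the same slope, so the reported penalties lie on a single straight line in gamma.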
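Figure 11 depicts the sensitivity of mean f4 to the degradation rate: at a rate of 0.005, f4 ranges from 20.386 to 20.390; at 0.01, from 20.776 to 20.785; and at 0.02, from 21.559 to 21.573. This is an approximately linear increase of 5.8% across the tested range (about 1.9% for the first doubling of the rate and 3.8% for the second), while the spread within each setting stays below 0.1%. This moderate response, close to linear over the tested range, is consistent with the degradation model (Section 3.3, Equation (4)), where higher rates amplify time-dependent loss.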
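The percentage changes for the degradation-rate sweep can be recomputed from the quoted values. The sketch below uses the lower end of each reported f4 range, which is a choice made here for illustration.

```python
# Lower end of the reported mean f4 range at each degradation rate (from the text).
loss = {0.005: 20.386, 0.01: 20.776, 0.02: 21.559}

rates = sorted(loss)
step_pct = []    # percent increase in f4 for each doubling of the rate
step_slope = []  # increase in f4 per unit of rate
for a, b in zip(rates, rates[1:]):
    step_pct.append(100 * (loss[b] - loss[a]) / loss[a])
    step_slope.append((loss[b] - loss[a]) / (b - a))

total_pct = 100 * (loss[rates[-1]] - loss[rates[0]]) / loss[rates[0]]
print([round(p, 2) for p in step_pct], round(total_pct, 1))
print([round(s, 1) for s in step_slope])
```

The near-constant slope per unit of rate is what makes the response look linear over this range, with the 5.8% figure applying to the full sweep rather than to each doubling.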
Figure 12 shows the humidity/temperature effects: f4 rises non-linearly from 21.536 (H = 0.4, T = 20) to 21.577 (H = 0.6, T = 25), an increase of 0.2% overall, with peaks at elevated values. This extension to previously ignored factors indicates that higher humidity and temperature increase grain quality loss, underscoring the need for controlled storage in NSGT.
Table 9 summarizes the sensitivity of mean objective values to the partial containerization fraction (frac) from 0.5 to 1.0. The analysis reveals minimal fluctuations: f1 (cost) hovers around 90,525 with negligible perturbation (<0.001%), f2 (time) remains constant at 97.19, f3 (emission cost) shows a subtle rise of 0.0002% (from 5146.97 to 5146.98), and f4 (grain loss) varies slightly by 0.06% around 21.55. This limited response highlights the model’s insensitivity to fraction variations, validating the transshipment structure. In practice, higher frac values may support cost and emission reductions through mixed-mode efficiency, facilitating adaptable NSGT strategies under partial loading scenarios. A limitation is the assumption of uniform containerization; empirical logistics data could further calibrate the impacts.
Figure 13 illustrates the distributions of objective values from 100 Monte Carlo simulations, incorporating stochastic perturbations in grain quantity (±20%), transit time (±10%), and carbon price (±15%) to mimic real-world NSGT uncertainties. The f1 (cost) histogram is right-skewed, with frequencies peaking at low values and tapering toward higher ones, reflecting an optimization bias toward cost efficiency. f2 (time) shows a bimodal pattern, clustering around 0 and 200–400 h, indicating preferences for short-to-medium routes. f3 (emission cost) is also right-skewed (0–5000 dominant, tail to 15,000), underscoring low-carbon mode prioritization. f4 (grain loss) displays a narrow peak (0–200), demonstrating controlled degradation.
These profiles affirm the model’s validity: tight distributions confirm resilience, ensuring viable solutions under fluctuations. This supports eco-friendly NSGT strategies, validating the framework’s practical utility. Overall, the model’s robustness is evident, with variations <20% in most objectives.
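The skew direction of a histogram such as those in Figure 13 can be checked numerically with the moment-based skewness coefficient: a positive value means the mass sits at low values with a long tail toward high ones. The sample below is hypothetical and only mimics the shape described for f3.

```python
import statistics


def skewness(values):
    """Population (Fisher-Pearson) skewness; positive means a long right tail."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    n = len(values)
    return sum((v - mean) ** 3 for v in values) / (n * sd ** 3)


# Hypothetical emission-cost sample: most values are small, a few reach far higher.
sample = [100, 200, 300, 400, 500, 1000, 5000, 15000]
print(skewness(sample) > 0)  # True
```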
5.4. Comparison of Different Algorithms
To validate the effectiveness of the proposed Hybrid NSGA-II (NSGA-II-Ql) algorithm [43], it is compared with four benchmark algorithms: Pure NSGA-II, NSGA (Original), SPEA (Original), and Weighted Sum Genetic Algorithm. All algorithms are executed on the same multimodal transportation network instance using consistent parameters: population size = 100; maximum generations = 250; crossover probability = 0.9; and mutation probability = 0.15 [44,45]. Evaluations are based on 30 independent runs to ensure statistical reliability. The primary metric is Hypervolume (HV), which comprehensively assesses the convergence and diversity of the Pareto front. Additional metrics include inverted generational distance (IGD) and the Wilcoxon rank-sum test for statistical significance.
To address concerns about statistical power, we conducted a power analysis using the sampsizepwr function in MATLAB (R2024b) [46], which gave a power of 0.99999 for detecting >10% HV differences across 30 runs at alpha = 0.05 and effect size = 1.2. This exceeds the 0.8 threshold, confirming that the sample size is sufficient. The average run time was 35.5331 s, which is feasible for practical NSGT planning on standard hardware.
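A rough cross-check of the reported power is possible with a normal approximation. The sketch below is not the MATLAB computation: it assumes a two-sided, one-sample z-type test, whereas the test type passed to sampsizepwr is not stated in the text, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(effect_size, n, alpha=0.05):
    """Normal-approximation power of a two-sided one-sample test (an assumed test type)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n)
    # Probability that the test statistic falls in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


power = approx_power(effect_size=1.2, n=30)
print(power > 0.8)  # True
```

Under this approximation the power for 30 runs at an effect size of 1.2 is very close to 1, which agrees in magnitude with the reported value.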
Figure 14 illustrates the power curve obtained from 30 experimental runs. Furthermore, the HV reference point was set to [6 ×
, 12,000, 60,000, 1200], derived from the maximum objective values plus a 20% buffer to ensure positive volumes [47], which makes HV meaningful as a measure of the dominated space.
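To illustrate how a reference point with a 20% buffer turns a front into a hypervolume, the sketch below computes the exact dominated volume by inclusion-exclusion over the boxes spanned by each point and the reference point. This is a didactic method that is exponential in the number of points, not necessarily the routine used in the study, and it assumes minimization with positive objective values.

```python
from itertools import combinations
from math import prod


def reference_point(front, buffer=0.20):
    """Per-objective maximum plus a relative buffer (assumes positive objective values)."""
    dims = range(len(front[0]))
    return tuple((1 + buffer) * max(p[i] for p in front) for i in dims)


def hypervolume(front, ref):
    """Exact hypervolume for minimization via inclusion-exclusion (small fronts only)."""
    total = 0.0
    for k in range(1, len(front) + 1):
        sign = 1 if k % 2 else -1
        for subset in combinations(front, k):
            # The boxes of a subset intersect in the box from their componentwise max to ref.
            corner = [max(p[i] for p in subset) for i in range(len(ref))]
            total += sign * prod(max(0.0, r - c) for r, c in zip(ref, corner))
    return total


front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]  # hypothetical two-objective front
print(hypervolume(front, (4.0, 4.0)))  # 6.0
print(hypervolume(front, reference_point(front)))
```

Because the reference point is worse than every solution in each objective, every point contributes a box of positive volume, which is the purpose of the buffer.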
Figure 15 illustrates the evolution of the average hypervolume (HV) over generations across 30 runs, with shaded areas representing the standard deviation (SD). The shaded areas are transparent filled bands whose width reflects the variability across runs. The Hybrid NSGA-II-Ql exhibits the highest average HV: after a brief initial fluctuation, it ascends rapidly and stabilizes by around 100 generations, with a small SD indicating superior average performance and robustness. The Pure NSGA-II and Original NSGA stabilize at similar levels, with narrow SD bands demonstrating consistent performance. The Original SPEA shows moderate improvement before stabilizing, but with a wider SD, indicating less stability. The Weighted Sum GA performs the worst, commencing at a low value and stabilizing at the lowest level, with a very thin SD band but limited overall performance due to its single-objective conversion, which may favor specific weights at the expense of diversity. A summary of the mean values and p-values is presented in Table 10.
As shown in Table 10, the Hybrid NSGA-II achieves a mean HV comparable to that of Pure NSGA-II but superior to those of SPEA and the Weighted Sum GA. Despite close means, some p-values are low because the Wilcoxon rank-sum test is sensitive to differences in distribution and variance rather than to means alone. The Hybrid’s lower SD indicates a tighter distribution and greater reliability, reflecting consistent ranks across runs, which explains the apparent contradiction.
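The rank-based nature of the test can be seen in a minimal sketch of the Wilcoxon rank-sum statistic with a normal approximation. This version omits the tie and continuity corrections that statistical packages may apply, so it is an approximation for illustration; the sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_p_value(x, y):
    """Two-sided rank-sum p-value (normal approximation, no tie or continuity correction)."""
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x, i = 0.0, 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # average of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (rank_sum_x - mean) / sd
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical HV samples from 30 runs each: close means, but every a-value exceeds every b-value.
a = [5.090 + 0.0001 * i for i in range(30)]
b = [5.080 + 0.0001 * i for i in range(30)]
print(rank_sum_p_value(a, b) < 0.05)  # True
```

The two hypothetical samples differ by only about 0.2% in mean, yet the test reports a very small p-value because the ranks separate completely, which is the effect described above.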
Figure 16 presents a boxplot comparing the final hypervolume (HV) distributions after 250 generations across 30 runs. The Hybrid NSGA-II exhibits a compact box with minimal variability, demonstrating high consistency and robustness. The Pure NSGA-II and Original NSGA show similar medians (around 5.08–5.10 in the scaled units of the HV axis), with slightly wider boxes indicating moderate variability. The Original SPEA has a lower median and a flatter box, reflecting greater variability and less stable performance. The Weighted Sum GA performs the worst, with the lowest median HV and inconsistent performance; it may achieve peaks in certain runs but remains lower in most due to its bias toward specific weight configurations.
Statistical analysis using the Wilcoxon rank-sum test reveals a p-value below 0.05 between the Hybrid NSGA-II and the Weighted Sum GA, indicating a significant difference. The Hybrid NSGA-II achieves one of the highest average HV values, with low variability that enhances its reliability in practical applications. It attains a mean inverted generational distance (IGD) of 50.235 (SD = 29.336), suggesting that its Pareto front is closest to the reference set and therefore of superior quality. Against the other benchmarks, the Hybrid NSGA-II yields significant p-values for most comparisons (vs. Original NSGA and vs. Original SPEA; see Table 10), confirming its significant superiority over these algorithms. For Pure NSGA-II (p = 0.137 > 0.05), the lack of significance indicates statistical equivalence in HV distributions; however, the Hybrid NSGA-II has a lower SD and a substantially better IGD (50.235 vs. 6104.330), underscoring its enhanced robustness and solution quality in multi-objective optimization.
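For readers unfamiliar with IGD, the sketch below shows the standard definition: the average distance from each reference-front point to its nearest obtained solution, so lower is better. The toy fronts are hypothetical, and whether the study normalizes objectives before computing distances is not stated, so no normalization is applied here.

```python
from math import dist


def igd(reference_front, obtained_front):
    """Mean Euclidean distance from each reference point to its nearest obtained point."""
    return sum(
        min(dist(r, s) for s in obtained_front) for r in reference_front
    ) / len(reference_front)


# Hypothetical two-objective fronts.
reference = [(0.0, 0.0), (1.0, 1.0)]
obtained = [(0.0, 0.0)]
print(round(igd(reference, obtained), 4))  # 0.7071

# Ratio of the two mean IGD values quoted in the text (Pure NSGA-II vs. Hybrid).
igd_ratio = 6104.330 / 50.235
print(round(igd_ratio, 1))  # 121.5
```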
Figure 17 presents a 3D comparison of the Pareto fronts (f1: cost vs. f2: time vs. f3: emission cost, with color representing f4: grain loss, ranging from blue for low values to yellow for high values). The Hybrid NSGA-II solutions exhibit a broad distribution, encompassing regions of low cost, low time, and low emissions, with f4 colors biased toward blue (indicating low loss), thereby demonstrating excellent balance across multiple objectives.
The results indicate that the Hybrid NSGA-II-Ql is among the best performers in terms of average hypervolume (HV), demonstrating stable and diverse solutions across 30 runs. Although the Pure NSGA-II and Original NSGA achieve slightly higher HV values, the difference from Pure NSGA-II is not statistically significant (p = 0.137; the p-value for Original NSGA is reported in Table 10), and the Hybrid NSGA-II offers lower variability, with an IGD approximately 121 times lower than that of the Pure NSGA-II (50.235 vs. 6104.330). These advantages arise from its hybrid features: Q-learning refines top-tier routes, while adaptive mutation and diversity reinitialization enhance robustness. In the context of multimodal transportation optimization incorporating carbon emissions, the Hybrid NSGA-II demonstrates greater reliability in real-world scenarios. Limitations include elevated computational overhead, and future research may investigate additional metrics such as spread.
5.5. Scalability Assessment of the Framework
To evaluate the extensibility of the hybrid NSGA-II-Q-learning approach for optimizing multimodal grain transportation, we scaled the baseline 7-node network to a randomly constructed 20-node configuration. This increased the network’s complexity while preserving the core parameters. The analysis, outlined in Table 11 and illustrated in Figure 18, uses standard multi-objective evolutionary algorithm indicators, hypervolume (HV) and inverted generational distance (IGD), which are commonly used to gauge solution quality and approximation accuracy. These align with contemporary MOEA applications in supply chain contexts, where HV assesses front span and IGD evaluates deviation from an optimal benchmark.
The outcomes are consistent and conform to expected behaviors in logistics modeling. The 7-node reference, rooted in empirical North-to-South Grain Transportation routes, yielded a mean IGD of 50.235 and an average runtime of 35.5 s across iterations. In the 20-node setup, the mean terminal HV rose modestly, IGD increased to 52.658 (about 4.8%), and runtime grew to 38.2 s (about 7.6%). Such variations are credible: enlarged solution spaces foster broader exploration, potentially boosting HV, but pose slight approximation hurdles, as the higher IGD shows. Runtimes scale modestly, reflecting NSGA-II’s efficiency in complex scenarios.
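The relative changes between the two network sizes follow directly from the quoted figures. The sketch below recomputes them using only the IGD and runtime values given in the text.

```python
# Mean IGD and average runtime quoted for the 7-node and 20-node networks.
baseline = {"IGD": 50.235, "runtime_s": 35.5}
scaled = {"IGD": 52.658, "runtime_s": 38.2}

changes = {
    key: 100 * (scaled[key] - baseline[key]) / baseline[key]
    for key in baseline
}
node_growth = 20 / 7  # network size grows by a factor of roughly 2.9

print({k: round(v, 1) for k, v in changes.items()})  # {'IGD': 4.8, 'runtime_s': 7.6}
```

Both indicators grow by less than 10% while the node count nearly triples, which is the basis for describing the scaling as modest.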
The 3D Pareto visualization in Figure 18 reinforces this. Inverse relationships between objectives and even spacing of solutions highlight effective optimization, echoing dynamics in emission-focused supply models. The constrained f4 span, possibly due to variance-enhancing perturbations, remains viable and consistent with decay mechanisms for fragile commodities. The metrics align with field norms and show no irregularities, supporting the framework’s dependability across scales.