Search Results (1,307)

Search Parameters:
Keywords = policy gradient

19 pages, 1359 KB  
Article
ESO-Enhanced Actor–Critic Reinforcement Learning-Optimised Trajectory Tracking Control for 3-DOF Marine Vessels
by Xiaoling Liang and Jiajian Li
Mathematics 2026, 14(5), 867; https://doi.org/10.3390/math14050867 - 4 Mar 2026
Abstract
This paper develops an extended-state-observer (ESO)-enhanced actor–critic reinforcement learning (RL) scheme for the trajectory tracking control of 3-DOF marine vessels subject to uncertain hydrodynamics and environmental disturbances. A coordinate-consistent error construction is provided to obtain an exact strict-feedback second-order uncertain template. On this basis, a Hamilton–Jacobi–Bellman (HJB)-inspired optimised control structure is implemented: the critic approximates the optimal value-gradient and the actor generates the optimised control law. A key simplification is employed: rather than minimising the squared Bellman residual via complex gradients, we introduce an HJB-inspired actor–critic consistency regularisation through a weight-matching coupling. This yields computationally light online update laws and enables transparent Lyapunov-based stability analysis while not claiming exact HJB satisfaction or policy optimality. The ESO estimates lumped uncertainty and provides feedforward compensation, so the RL module learns only the observer residual. A composite Lyapunov analysis establishes the semi-global uniform ultimate boundedness of tracking errors and boundedness of all observer signals. Practical implementation with thruster allocation, explicit wind–wave–current disturbance shaping filters, and a theory-aligned ablation protocol are provided for reproducibility.

26 pages, 1296 KB  
Article
Spatiotemporal Evolution and Obstacle Factors of Coupling Coordination Among Low-Carbon Logistics, Regional Economy, and Ecological Environment Systems in the Yellow River Basin
by Qian Zhou, Ligang Wu and Mengyao Zhang
Sustainability 2026, 18(5), 2458; https://doi.org/10.3390/su18052458 - 3 Mar 2026
Abstract
Against the background of the “dual carbon” strategy and regional coordinated development, the synergistic evolution of low-carbon logistics, regional economy, and ecological environment in the Yellow River Basin has become a key pathway to achieving high-quality development. Taking nine provinces (autonomous regions) within the basin as the study area, this paper constructed a coupling coordination evaluation index system for the LREES (Low-carbon Logistics–Regional Economy–Ecological Environment System), and measured the comprehensive development level of each subsystem using the entropy weight method. Based on the coupling coordination degree model, the temporal evolution of the three systems from 2010 to 2024 was systematically evaluated. In addition, global and local spatial autocorrelation models were introduced to identify spatial clustering patterns, while the obstacle degree model was used to identify key constraints at both the criterion and indicator levels. The results revealed that the overall development level of the LREES systems steadily increased, with reduced regional disparities; the coupling coordination degree showed a trend of “fluctuating rise–gradual coordination,” with the average value increasing from 0.450 to 0.623, indicating continuously enhanced synergy; spatially, a gradient pattern of “downstream > midstream > upstream” emerged, accompanied by significant positive spatial autocorrelation; resource endowment and development scale were major constraints, while construction level, operational efficiency, and governance capacity were secondary. High-frequency obstacle indicators included per capita water resources, total import and export volume, and urban sewage treatment capacity. These findings offer theoretical support and policy guidance for promoting green transformation, enhancing system synergy, and advancing coordinated regional development in the Yellow River Basin.

18 pages, 4743 KB  
Article
Reinforcement Learning-Based Super-Twisting Sliding Mode Control for Maglev Guidance System
by Junqi Xu, Wenshuo Wang, Chen Chen, Lijun Rong, Wen Ji and Zijian Guo
Actuators 2026, 15(3), 147; https://doi.org/10.3390/act15030147 - 3 Mar 2026
Viewed by 37
Abstract
The high-speed Electromagnetic Suspension (EMS) maglev guidance system exhibits inherent characteristics of strong nonlinearity, parameter time-variation, and complex external disturbances. To further optimize and improve the control performance of the guidance system for high-speed maglev trains, a novel intelligent control strategy that integrates the Deep Deterministic Policy Gradient (DDPG) algorithm with Super-Twisting Sliding Mode Control (STSMC) is proposed. Focusing on a single-ended guidance unit with differential control of dual electromagnets, an STSMC controller is first designed based on a cascaded control framework. To overcome the limitation of offline parameter tuning in dynamic operational conditions, a reinforcement learning optimization framework employing DDPG is introduced. A multi-objective hybrid reward function is formulated, incorporating error convergence, sliding mode stability, and chattering suppression, thereby realizing the online self-tuning of core STSMC parameters via real-time interaction between the agent and the environment. Numerical simulations under typical disturbance conditions verify that the proposed DDPG-STSMC controller significantly reduces the amplitude of guidance gap variation and accelerates dynamic recovery compared to conventional PID control. Its superior performance in disturbance rejection, control accuracy, and operational adaptability is validated. This study, conducted through high-fidelity numerical simulations based on actual system parameters, provides a robust theoretical foundation for subsequent hardware-in-the-loop (HIL) experimentation. Full article
(This article belongs to the Special Issue Advanced Theory and Application of Magnetic Actuators—3rd Edition)

23 pages, 13416 KB  
Article
An Adaptive Ensemble Model Based on Deep Reinforcement Learning for the Prediction of Step-like Landslide Displacement
by Tengfei Gu, Lei Huang, Shunyao Tian, Zhichao Zhang, Huan Zhang and Yanke Zhang
Remote Sens. 2026, 18(5), 761; https://doi.org/10.3390/rs18050761 - 3 Mar 2026
Viewed by 40
Abstract
Accurate prediction of landslide displacement is crucial for hazard prevention. However, recurrent neural network (RNN) models have limitations in simultaneously capturing lag time and feature importance, and their black-box nature limits their interpretability. Moreover, the performance of single models varies across different deformation stages, especially during acceleration. To address these challenges, we propose an interpretable deep reinforcement learning-based adaptive ensemble (DRL-AE) framework. The method employs Seasonal and Trend decomposition using Loess to separate cumulative displacement into trend and periodic components. Trend and periodic sequences are predicted using double exponential smoothing and three RNN variants, respectively. An improved Convolutional Block Attention Module (ICBAM) enhances periodic feature extraction and provides temporal–spatial interpretability. The Deep Deterministic Policy Gradient algorithm adaptively integrates multi-model predictions in response to evolving environmental conditions. To validate the DRL-AE, a case study is conducted on the Baijiabao landslide in Zigui County, China. The results indicate that the DRL-AE substantially enhances prediction accuracy. For periodic displacement, it reduces MAE by 10.02% and RMSE by 6.65%, and increases R² by 4.27% compared with the ICBAM-GRU model. The results also confirm the effectiveness of ICBAM in feature extraction, and the generated heatmaps provide intuitive interpretability of the relevant triggering factors.

21 pages, 1099 KB  
Article
Low-Latency Holographic Video Transmission in Indoor VLC Networks Assisted by Rotatable Photodetectors
by Wenzhe Wang and Long Zhang
Future Internet 2026, 18(3), 129; https://doi.org/10.3390/fi18030129 - 2 Mar 2026
Viewed by 129
Abstract
As a next-generation immersive service, holographic video enables users to move freely within a virtual world. This imposes stringent requirements on wireless networks. Given the massive bandwidth capacity inherent to visible light, visible light communication (VLC) can effectively meet the transmission requirements of holographic video and is an ideal wireless technology for next-generation indoor immersive services. However, VLC channels are highly dependent on Line-of-Sight (LoS) links. Due to user mobility, traditional VLC systems relying on fixed-orientation Photodetectors (PDs) often suffer from severe channel fading, which significantly degrades the transmission performance. In this paper, we propose an indoor VLC holographic video transmission architecture supporting rotatable PDs, utilizing rotatable PDs mounted on Head-Mounted Displays (HMDs) to assist in holographic video transmission. To minimize the total transmission delay of all users, we address the holographic video transmission problem by jointly optimizing the transmit power allocation of VLC Access Points (APs) and the pitch and roll angles of the users’ PDs. By formulating the problem as a Markov Decision Process (MDP), we address it using a novel Deep Reinforcement Learning (DRL) strategy leveraging the Soft Actor–Critic (SAC) architecture. Simulation results demonstrate that the proposed scheme reduces the overall latency by up to 29.6% compared to the benchmark schemes. Furthermore, the convergence speed of the algorithm is improved by 35% compared to traditional deep reinforcement learning algorithms such as Deep Deterministic Policy Gradient (DDPG). Full article

23 pages, 2097 KB  
Article
Stochastic Inventory Optimization with Coherent Risk Measures: A Decision-Theoretic Framework for Probabilistic Forecasting and Constrained Optimization
by Lebede Ngartera, Saralees Nadarajah, Rodoumta Koina and Youssou Gningue
J. Risk Financial Manag. 2026, 19(3), 173; https://doi.org/10.3390/jrfm19030173 - 1 Mar 2026
Viewed by 132
Abstract
Each year, inventory decisions made under demand uncertainty generate substantial economic losses, reflecting a persistent disconnect between forecasting models and the operational decisions they are intended to support. This paper addresses this gap by proposing a Decision Intelligence Framework that unifies three components typically treated in isolation: probabilistic demand forecasting via gradient boosting quantile regression, constrained newsvendor optimization under capacity and budget constraints, and coherent tail risk evaluation using Conditional Value-at-Risk (CVaR95). We establish a central theoretical result showing that calibrated quantile forecasts are mathematically equivalent to optimal newsvendor solutions, providing a rigorous decision-theoretic foundation linking probabilistic forecasting and inventory control. The framework is evaluated on the UCI Online Retail dataset (2010–2011), aggregated to daily demand at the country–SKU level and densified to a daily panel by treating missing transaction days as zero demand. Relative to median-based (P50) policies, P90 policies reduce tail risk (CVaR95) by 26.7% under empirical residual bootstrap, increase cycle service levels from 44.4% to 89.5%, and reduce mean cost by 48.7% (non-overlapping bootstrap CIs for CVaR95). A lognormal stress test shows larger reductions (72.3%), and a CV sweep confirms monotone gains in this setting. Full article
(This article belongs to the Section Risk)

28 pages, 2490 KB  
Article
Life Cycle Participation in Urban Regeneration: A Policy Design–Implementation–Evaluation Assessment of Guangzhou
by Chengwang Yang, Changdong Ye, Yin Ding, Jiyang Mi, Yingsheng Liu and Long Zhou
Land 2026, 15(3), 402; https://doi.org/10.3390/land15030402 - 28 Feb 2026
Viewed by 112
Abstract
Public participation in Global South urban regeneration often exhibits a “high-commitment—low-conversion” gap between institutional intent and effective citizen influence. Taking Guangzhou, China, as a case, this study develops a Policy design–Implementation–Evaluation (P–I–E) framework to examine participation across the policy life cycle. We review 48 municipal policy documents (2009–2024) to code 34 participation elements, link them to implementation rates of 798 projects across 11 districts, and triangulate outcomes using a survey of 1000 residents. By operationalizing Arnstein’s ladder into an index and introducing an expert-scored Design Completeness (DC) measure, we identify a participation gradient in which refined, enforceable provisions cluster in ex post compliance, while early-stage agenda-setting remains weak. The persistent conversion gap is explained by contrasting governance mechanisms: procedural participation is administratively legible and low-cost to implement, whereas empowerment requires enforceable decision interfaces, multi-actor coordination, and closed-loop accountability. Empirically, symbolic instruments achieve high implementation, while power-sharing elements are rarely enacted; substantive co-creation bundled with early empowerment and feedback mechanisms is associated with higher resident satisfaction and greater uptake of citizen input. Strengthening legally binding decision interfaces and accountability infrastructures is therefore critical for advancing substantive participation. Full article
(This article belongs to the Section Land Planning and Landscape Architecture)

25 pages, 5606 KB  
Article
Health-Aware Differentiated Energy Management for Multi-Stack Fuel Cell Hybrid Power Systems on Ships
by Lin Zhu, Yancheng Liu, Haohao Guo and Siyuan Liu
J. Mar. Sci. Eng. 2026, 14(5), 460; https://doi.org/10.3390/jmse14050460 - 28 Feb 2026
Viewed by 81
Abstract
This study proposes a health-aware energy management strategy based on the twin delayed deep deterministic policy gradient (TD3) algorithm for hybrid fuel cell/battery-powered ships. Unlike traditional approaches that treat multiple fuel cell stacks as homogeneous units, this strategy innovatively implements differentiated power allocation based on the real-time state of health of each stack. The research first validates the superiority of the TD3 framework over the deep Q-learning framework at the algorithmic level. Further comparative experiments conducted across three scenarios with varying degrees of state of health differences show that, compared to the TD3 baseline strategy employing average power allocation, the health-aware differentiated TD3 strategy significantly reduces the total voyage cost of the system, with the cost-saving effect becoming more pronounced as the state of health disparity between stacks increases. Additionally, by incorporating rule-based constraints, the convergence speed of the TD3 algorithm is effectively enhanced, improving its feasibility for real-time control. Tests under dynamic and fluctuating load conditions further confirm the strategy’s effectiveness and applicability. In summary, the health-aware TD3 strategy proposed in this study not only provides an efficient and reliable energy management solution for hybrid-powered ships but also promotes the application of machine learning in the field of ship energy management. Full article
(This article belongs to the Section Ocean Engineering)

31 pages, 2520 KB  
Article
Parameterized Reinforcement Learning with Route Guidance for Controlling Urban Road Traffic Networks
by Edwin M. Kataka, Thomas O. Olwal, Karim Djouani and Prosper Z. Sotenga
Future Transp. 2026, 6(2), 56; https://doi.org/10.3390/futuretransp6020056 - 28 Feb 2026
Viewed by 71
Abstract
Traditional macroscopic fundamental diagram (MFD)-based traffic perimeter metering control strategies rely on full knowledge of vehicle accumulation and inter-regional flow dynamics, assumptions that seldom hold in heterogeneous and highly variable real-world networks. Classical data-driven reinforcement learning methods face similar constraints, often converging slowly and exhibiting low sample efficiency when confronted with such complexities. Motivated by these limitations, this paper proposes a Parameterized Deep Q-Network perimeter control (P-DQNPC) scheme designed for multi-region urban road networks. The framework jointly optimizes discrete actions (regional routing choices) and continuous actions (signal-timing or flow-duration regulation) within a model-free learning structure. The approach is first trained and validated on synthetic MFD data to establish stable and interpretable policy behavior under controlled conditions. It is then transferred and further evaluated using real-world measurements from the Performance Measurement System – San Francisco Bay Area (PeMS-SF), a dataset collected from 18,954 loop detectors across the California State Highway System. PeMS-SF is selected due to its high spatial and temporal resolution, broad network coverage, and strong ability to capture realistic and diverse congestion patterns, qualities that support both rigorous validation and generalization to other metropolitan regions. Experimental results show that P-DQNPC consistently outperforms state-of-the-art baselines, including deep deterministic policy gradient, deep Q-network, and No-Control schemes. The proposed method achieves superior regulation of regional accumulations and demonstrates enhanced robustness in large, heterogeneous, and uncertain urban traffic environments.

25 pages, 9018 KB  
Review
The Status of Marine Energy of Costa Rica: Challenges and Opportunities for Grid Integration
by Jose Rodrigo Rojas-Morales, Christopher Vega-Sánchez, Juan Luis Guerrero-Fernández, Rodney Eduardo Mora-Escalante, Pablo César Mora-Céspedes, Michelle Chavarría-Brenes, Manuel Corrales-Gonzalez, Julio César Rojas-Gómez, Rolando Madriz-Vargas and Leonardo Suárez-Matarrita
Energies 2026, 19(5), 1189; https://doi.org/10.3390/en19051189 - 27 Feb 2026
Viewed by 245
Abstract
Marine renewable energy could support Costa Rica’s decarbonization pathway, but its offshore resource base and enabling conditions remain poorly characterized in the body of knowledge. This study provides the first integrated assessment of marine energy resources, grid integration opportunities, and governance challenges in Costa Rica. A meta-analysis of 76 technical, legal, and policy sources is combined with qualitative doctrinal analysis, GIS-based multi-criteria evaluation for Ocean Thermal Energy Conversion (OTEC), and satellite and reanalysis data for winds, waves, currents, and sea surface temperature to estimate power densities and extractable energy. Results show a contrast between the Pacific and Caribbean coasts. For instance, on the Northern Pacific coast, strong Papagayo winds and persistent swells yield high offshore wind and wave energy potentials, with technical offshore wind resources of around 14.4 GW and Pacific wave power frequently exceeding 20–25 kW/m with relatively low seasonal variability. Furthermore, twelve OTEC-suitable zones are identified with two priority areas in the southern Pacific that combine steep bathymetry and strong thermal gradients with limited environmental conflicts, but they overlap with sensitive conservation and Indigenous territories. Current energy potential is more localized and modest on the Caribbean coast. The analysis highlights major infrastructural, legal, and social barriers but concludes that marine energy can play a pivotal role in diversifying Costa Rica’s renewable-dominated electricity market.
(This article belongs to the Special Issue Advanced Technologies for the Integration of Marine Energies)

43 pages, 421 KB  
Article
Education and Sustainability-Related Orientations: Cross-National Evidence from the World Values Survey
by Fatma Gülçin Demirci, Yavuz Selim Balcioglu, Ejder Güven, Sevda Uğuz, Ayşe İlgün Kamanlı, Cihan Yılmaz and Ayşe Bilgen
Sustainability 2026, 18(5), 2266; https://doi.org/10.3390/su18052266 - 26 Feb 2026
Viewed by 130
Abstract
As societies confront accelerating sustainability challenges, understanding the individual-level orientations that support collective action has become increasingly important. This study examines the association between educational attainment and three theoretically distinct sustainability-relevant value orientations using cross-national survey data. Drawing on the World Values Survey Wave 7, we analyze responses from 65,608 individuals across 65 countries using weighted least squares regression with country fixed effects to investigate how education relates to norm orientation, future orientation, and inclusion. The analysis reveals substantial variation in the strength of these associations across value dimensions. Education demonstrates a particularly strong relationship with future orientation, yielding a standardized effect size of 0.497, while showing considerably weaker associations with inclusion and norm orientation. Moderation analyses uncover important demographic contingencies, indicating that education gradients for norm orientation and inclusion weaken significantly with age, whereas the education-future orientation relationship remains stable across age groups. A modest gender difference emerges for future orientation, with slightly attenuated education effects among women. These findings suggest that education contributes to sustainability-relevant values primarily through cognitive pathways that enhance temporal perspective rather than through socialization into normative compliance or expansion of social tolerance. The results carry implications for education policy design and sustainable development initiatives. Full article
23 pages, 2877 KB  
Article
Bi-Level Coordinated Planning of Port Multi-Energy Systems Considering Source-Load Uncertainty Based on WGAN-GP and SBOA
by Liying Zhong, Ming Yang, Shuang Liu, Ting Liu, Xinhao Bian and Liang Tong
Energies 2026, 19(5), 1160; https://doi.org/10.3390/en19051160 - 26 Feb 2026
Viewed by 125
Abstract
The high-penetration integration of renewable energy into port power systems is challenged by the stochastic volatility of wind–solar generation and dynamic load demands. To address this, this study proposes a data-driven bi-level coordinated planning framework for port wind–solar-storage systems, integrating a Wasserstein generative adversarial network with gradient penalty (WGAN-GP) and a hybrid secretary bird optimization algorithm (SBOA) for solution seeking. The WGAN-GP-K-Means++ framework is adopted to capture the high-dimensional spatiotemporal correlations under the uncertainty of port sources and loads, and to generate the wind and solar resource scenarios for typical days. Subsequently, a bi-level planning model is constructed: the upper layer optimizes the siting and sizing of distributed generation and energy storage to minimize the life-cycle net present value, while the lower layer minimizes annual operating costs through multi-scenario dispatch. To resolve the resulting complex mixed-integer programming problem, a nested SBOA-Gurobi algorithm is developed. A case study of a Guangxi port demonstrates that the proposed approach reduces life-cycle cost by 44.94% relative to the baseline grid-connected scheme and exhibits superior convergence stability compared with GA, GRSO, and WOA. Additionally, sensitivity analysis quantifies the impact of electricity pricing policies, shore power utilization rates, and the discount rate on the system’s economic benefits. This study provides a decision-support tool for the low-carbon transition and economic planning of port energy systems.

25 pages, 4796 KB  
Article
AI-Driven Predictive Analytics for Sustainable Aviation: Metaheuristic-Optimized XGBoost for Carbon Emission Prediction
by Abdullah Mohamed Salem Elarifi and Wagdi M. S. Khalifa
Sustainability 2026, 18(5), 2246; https://doi.org/10.3390/su18052246 - 26 Feb 2026
Viewed by 113
Abstract
Intelligent transportation systems increasingly rely on artificial intelligence and predictive analytics to achieve sustainability. This study presents Adaptive Weighting, Chaos Theory, and Gaussian Mutation-based RIME algorithm-tuned Extreme Gradient Boosting (ACGRIME-XGBoost), an advanced Artificial Intelligence (AI)-driven framework specifically designed for carbon emission prediction in air transport to contribute to the development of sustainable smart infrastructure. The proposed hybrid model integrates XGBoost with ACGRIME, a novel metaheuristic optimization algorithm enhanced with chaos theory, adaptive weighting, and Gaussian mutation mechanisms to overcome limitations in traditional hyperparameter tuning approaches. The framework demonstrates exceptional performance on Congress on Evolutionary Computation (CEC) 2020 benchmark functions, outperforming conventional optimization algorithms in accuracy and robustness. When applied to real-world flight data within a smart transportation monitoring setting, ACGRIME-XGBoost achieves a 94% R² score for CO₂ emission prediction, significantly surpassing other optimized machine learning models. This research bridges the gap between advanced AI optimization techniques and sustainable transportation infrastructure, offering a scalable decision-support system that can be integrated with IoT sensor networks and mobility platforms in the future. The results demonstrate how metaheuristic-assisted machine learning can enhance environmental monitoring capabilities in smart transportation ecosystems, supporting data-driven policy-making for climate-resilient infrastructure and sustainable aviation management within the broader context. The research also contributes to sustainable aviation by enabling high-fidelity CO₂ prediction models that can inform policy-making and be integrated into digital monitoring tools for future smart transport infrastructures.

19 pages, 3606 KB  
Article
Autonomous Navigation of an Unmanned Underwater Vehicle via Safe Reinforcement Learning and Active Disturbance Rejection Control
by Qinze Chen, Yun Cheng, Yinlong Yuan and Liang Hua
J. Mar. Sci. Eng. 2026, 14(5), 425; https://doi.org/10.3390/jmse14050425 - 25 Feb 2026
Viewed by 172
Abstract
A two-layer control framework for unmanned underwater vehicle (UUV) navigation is proposed, combining a lower-layer active disturbance rejection controller (ADRC) with an upper-layer safe reinforcement learning (RL) policy for obstacle-avoidance navigation. The lower layer, utilizing ADRC, ensures high tracking accuracy and effective disturbance rejection, while the upper layer integrates the twin delayed deep deterministic policy gradient (TD3) algorithm, combined with a control barrier function (CBF)-based quadratic programming (QP) safety filter and safety-inspired reward shaping (SR). The method is evaluated in two simulation studies: (i) velocity and attitude control to assess tracking and disturbance rejection, and (ii) obstacle-avoidance navigation to assess learning efficiency, trajectory smoothness, and safety-related metrics. Simulation results show that ADRC achieves faster tracking and stronger disturbance rejection than a conventional proportional–integral–derivative (PID) controller. Moreover, the proposed TD3 + QP + SR scheme exhibits faster learning, smoother trajectories, and improved safety performance compared with RL baselines. These results indicate that the proposed framework enables efficient and safe UUV navigation in simulation scenarios with obstacles and disturbances.
(This article belongs to the Section Ocean Engineering)
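For readers unfamiliar with CBF-based safety filters, the following is a minimal sketch of the general idea, not the authors' implementation. It assumes a planar single-integrator model (velocity is the control input), one circular obstacle, and a single barrier constraint, in which case the QP has a closed-form solution and no solver is needed. The function name `cbf_safety_filter`, the gain `alpha`, and the obstacle geometry are all hypothetical.

```python
import numpy as np


def cbf_safety_filter(x, u_nom, obstacle_center, obstacle_radius, alpha=1.0):
    """Minimally modify u_nom so one control-barrier constraint holds.

    Assumed model (illustrative only): x_dot = u, with barrier
    h(x) = ||x - c||^2 - r^2, which is non-negative outside the obstacle.
    The CBF condition  grad_h(x) . u + alpha * h(x) >= 0  is linear in u,
    so  min ||u - u_nom||^2  subject to  a . u >= b  is solved by
    projecting u_nom onto the half-space {u : a . u >= b}.
    """
    diff = x - obstacle_center
    h = diff @ diff - obstacle_radius ** 2   # barrier value
    a = 2.0 * diff                           # gradient of h with respect to x
    b = -alpha * h                           # right-hand side of a . u >= b

    violation = b - a @ u_nom
    if violation <= 0.0:
        # The nominal action already satisfies the constraint.
        return u_nom.copy()

    norm_sq = a @ a
    if norm_sq == 0.0:
        # Degenerate case: x sits exactly at the obstacle centre.
        raise ValueError("barrier gradient is zero; constraint cannot be enforced")

    # Smallest correction (in the Euclidean sense) that restores a . u = b.
    return u_nom + (violation / norm_sq) * a


# Example: the RL policy proposes heading straight at the obstacle.
x = np.array([-2.0, 0.0])
u_rl = np.array([1.0, 0.0])
u_safe = cbf_safety_filter(x, u_rl, np.array([0.0, 0.0]), 1.0)
```

In this example the filter slows the approach rather than reversing it, because the constraint only bounds how quickly the barrier value may decrease. A multi-constraint or input-limited version would generally need a numerical QP solver.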

31 pages, 6983 KB  
Article
Multi-Agent Deep Deterministic Policy Gradient-Based Coordinated Control for Urban Expressway Entrance–Arterial Interfaces
by Shunchao Wang, Zhigang Wu and Wangzi Yu
Systems 2026, 14(3), 231; https://doi.org/10.3390/systems14030231 - 25 Feb 2026
Viewed by 137
Abstract
Coordinated control of ramp metering, variable speed limits, and intersection signals is critical for mitigating congestion and enhancing efficiency at urban expressway–arterial interfaces. Existing strategies often operate in isolation, leading to fragmented responses and limited adaptability under heterogeneous traffic demands. This study develops a multi-agent reinforcement learning framework based on MADDPG to achieve cooperative decision-making across heterogeneous controllers. An asynchronous control cycle mechanism is designed to accommodate different temporal requirements of ramp meters, speed limits, and signal controllers, ensuring practical feasibility in real-time operations. A conflict-aware reward design further embeds density regulation, speed harmonization, and spillback prevention to stabilize flow dynamics. Simulation experiments on a calibrated urban network demonstrate that the proposed framework delays congestion onset, reduces shockwave propagation, and improves throughput compared with classical benchmarks. In particular, at the mainline merge, average travel time is reduced to 13.56 s (62.4% of VSL-only); at the ramp, occupancy is lowered to 6.4% (40.6% of ALINEA); and at the signalized approach, average delay decreases to 85.71 s (62.7% of actuated control). These results highlight the scalability and deployment potential of the proposed cooperative control approach for system-level traffic management in mixed traffic environments.
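One plausible reading of an asynchronous control cycle is that each controller refreshes its action only at multiples of its own period and holds the previous action in between. The sketch below illustrates that reading only; the abstract does not specify the mechanism, and the `Controller` class, the `run_async_cycles` function, the stand-in policy, and the periods (20 s, 60 s, 90 s) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    """A control agent that acts on its own cycle (hypothetical structure)."""
    name: str
    period_s: int        # seconds between decisions for this controller
    action: float = 0.0  # last action, held between decisions
    updates: int = 0     # number of decisions taken so far


def run_async_cycles(controllers, policy, horizon_s, step_s=1):
    """Step a shared clock; each controller re-decides only on its own cycle.

    `policy(name, t)` stands in for a trained actor network. Between its
    decision instants a controller keeps applying its previous action.
    """
    log = []
    for t in range(0, horizon_s, step_s):
        for c in controllers:
            if t % c.period_s == 0:
                c.action = policy(c.name, t)
                c.updates += 1
        # Record the joint action in force at time t.
        log.append((t, {c.name: c.action for c in controllers}))
    return log


# Illustrative periods only; these values are not taken from the article.
agents = [
    Controller("ramp_meter", 20),
    Controller("speed_limit", 60),
    Controller("signal", 90),
]
history = run_async_cycles(agents, lambda name, t: float(t), horizon_s=180)
```

With a stand-in policy that returns the current time, the final log entry shows each controller holding the action from its most recent decision instant, which makes the differing update rates easy to inspect.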