Search Results (538)

Search Parameters:
Keywords = policy gradient methods

18 pages, 868 KB  
Article
Stochastic Production Planning in Manufacturing Systems
by Dragos-Patru Covei
Axioms 2025, 14(10), 766; https://doi.org/10.3390/axioms14100766 - 16 Oct 2025
Viewed by 132
Abstract
We study stochastic production planning in capacity-constrained manufacturing systems, where feasible operating states are restricted to a convex safe-operating region. The objective is to minimize the total cost that combines a quadratic production effort with an inventory holding cost, while automatically halting production when the state leaves the safe region. We derive the associated Hamilton–Jacobi–Bellman (HJB) equation, establish the existence and uniqueness of the value function under broad conditions, and prove a concavity property of the transformed value function that yields a robust gradient-based optimal feedback policy. From an operations perspective, the stopping mechanism encodes hard capacity and safety limits, ensuring bounded risk and finite expected costs. We complement the analysis with numerical methods based on finite differences and illustrate how the resulting policies inform real-time decisions through two application-inspired examples: a single-product case calibrated with typical process-industry parameters and a two-dimensional example motivated by semiconductor fabrication, where interacting production variables must satisfy joint safety constraints. The results bridge rigorous stochastic control with practical production planning and provide actionable guidance for operating under uncertainty and capacity limits. Full article
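As a pointer for readers, the kind of HJB equation this setup leads to can be written schematically as below; the notation (state x, control u, diffusion σ, holding cost h, safe region O) is a simplified one-dimensional illustration, not the paper's general formulation.

```latex
% Schematic HJB equation: quadratic production effort u^2 plus holding cost h(x),
% with production halted (value fixed) on the boundary of the safe region O.
\min_{u}\Big\{\, u\,V'(x) \;+\; \tfrac{\sigma^{2}}{2}\,V''(x) \;+\; u^{2} \;+\; h(x) \Big\} = 0,
\qquad x \in \mathcal{O},
\qquad V = 0 \ \text{on } \partial\mathcal{O}.
```

The inner minimization gives u*(x) = -V'(x)/2, which is the sense in which the gradient of the (transformed) value function directly yields the feedback production policy mentioned in the abstract.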

30 pages, 8790 KB  
Article
An Adaptive Framework for Remaining Useful Life Prediction Integrating Attention Mechanism and Deep Reinforcement Learning
by Yanhui Bai, Jiajia Du, Honghui Li, Xintao Bao, Linjun Li, Chun Zhang, Jiahe Yan, Renliang Wang and Yi Xu
Sensors 2025, 25(20), 6354; https://doi.org/10.3390/s25206354 - 14 Oct 2025
Viewed by 587
Abstract
The prediction of Remaining Useful Life (RUL) constitutes a vital aspect of Prognostics and Health Management (PHM), providing capabilities for the assessment of mechanical component health status and prediction of failure instances. Recent studies on feature extraction, time-series modeling, and multi-task learning have shown remarkable advancements. However, most deep learning (DL) techniques predominantly focus on unimodal data or static feature extraction techniques, resulting in a lack of RUL prediction methods that can effectively capture the individual differences among heterogeneous sensors and failure modes under complex operational conditions. To overcome these limitations, an adaptive RUL prediction framework named ADAPT-RULNet is proposed for mechanical components, integrating the feature extraction capabilities of attention-enhanced deep learning and the decision-making abilities of deep reinforcement learning (DRL) to achieve end-to-end optimization from raw data to accurate RUL prediction. Initially, Functional Alignment Resampling (FAR) is employed to generate high-quality functional signals; then, attention-enhanced Dynamic Time Warping (DTW) is leveraged to obtain individual degradation stages. Subsequently, an attention-enhanced hybrid multi-scale RUL prediction network is constructed to extract both local and global features from multi-format data. Furthermore, the network achieves optimal feature representation by adaptively fusing multi-source features through Bayesian methods. Finally, we innovatively introduce a Deep Deterministic Policy Gradient (DDPG) strategy from DRL to adaptively optimize key parameters in the construction of individual degradation stages and achieve a global balance between model complexity and prediction accuracy. The proposed model was evaluated on aircraft engines and railway freight car wheels. The results indicate that it achieves a lower average Root Mean Square Error (RMSE) and higher accuracy in comparison with current approaches. Moreover, the method shows strong potential for improving prediction accuracy and robustness in varied industrial applications. Full article
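The Bayesian fusion step is only named in the abstract; as a rough, generic illustration of how multi-source features can be fused adaptively, the sketch below applies inverse-variance (precision) weighting per source. The function name, shapes, and data are invented for the example and are not the authors' implementation.

```python
import numpy as np

def precision_weighted_fusion(features, variances):
    """Fuse one feature vector per sensor/source by inverse-variance weighting.

    features  : list of arrays, one feature vector per source
    variances : list of arrays, an uncertainty estimate per feature per source
    Sources whose features are more certain (smaller variance) receive more weight.
    """
    weights = [1.0 / (v + 1e-8) for v in variances]          # precision weights
    total = np.sum(weights, axis=0)
    fused = np.sum([w * f for w, f in zip(weights, features)], axis=0) / total
    return fused

# toy usage: three sensors, 4-dimensional feature vectors
rng = np.random.default_rng(0)
feats = [rng.standard_normal(4) for _ in range(3)]
uncerts = [rng.random(4) + 0.1 for _ in range(3)]
print(precision_weighted_fusion(feats, uncerts))
```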

24 pages, 5244 KB  
Article
Optimizing Spatial Scales for Evaluating High-Resolution CO2 Fossil Fuel Emissions: Multi-Source Data and Machine Learning Approach
by Yujun Fang, Rong Li and Jun Cao
Sustainability 2025, 17(20), 9009; https://doi.org/10.3390/su17209009 - 11 Oct 2025
Viewed by 230
Abstract
High-resolution CO2 fossil fuel emission data are critical for developing targeted mitigation policies. As a key approach for estimating spatial distributions of CO2 emissions, top–down methods typically rely upon spatial proxies to disaggregate administrative-level emissions to finer spatial scales. However, conventional linear regression models may fail to capture complex non-linear relationships between proxies and emissions. Furthermore, methods relying on nighttime light data are mostly inadequate in representing emissions for both industrial and rural zones. To address these limitations, this study developed a multiple proxy framework integrating nighttime light, points of interest (POIs), population, road networks, and impervious surface area data. Seven machine learning algorithms—Extra-Trees, Random Forest, XGBoost, CatBoost, Gradient Boosting Decision Trees, LightGBM, and Support Vector Regression—were comprehensively incorporated to estimate high-resolution CO2 fossil fuel emissions. Comprehensive evaluation revealed that the multiple proxy Extra-Trees model significantly outperformed the single-proxy nighttime light linear regression model at the county scale, achieving R2 = 0.96 (RMSE = 0.52 MtCO2) in cross-validation and R2 = 0.92 (RMSE = 0.54 MtCO2) on the independent test set. Feature importance analysis identified brightness of nighttime light (40.70%) and heavy industrial density (21.11%) as the most critical spatial proxies. The proposed approach also showed strong spatial consistency with the Multi-resolution Emission Inventory for China, exhibiting correlation coefficients of 0.82–0.84. This study demonstrates that integrating local multiple proxy data with machine learning corrects spatial biases inherent in traditional top–down approaches, establishing a transferable framework for high-resolution emissions mapping. Full article
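As a concrete sense of the workflow (not the study's code or data), the sketch below fits an Extra-Trees regressor on a table of invented proxy columns, reports cross-validated R2, and prints feature importances analogous to the proxy-importance analysis above; all column names and the toy target are placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "ntl_brightness": rng.gamma(2.0, 2.0, 500),   # nighttime light proxy
    "poi_industry":   rng.poisson(3.0, 500),      # heavy-industry POI density
    "population":     rng.gamma(3.0, 1e4, 500),
    "road_density":   rng.random(500),
    "impervious_pct": rng.random(500),
})
# toy emission target constructed from two proxies plus noise
y = 0.4 * df["ntl_brightness"] + 0.2 * df["poi_industry"] + rng.normal(0, 0.5, 500)

model = ExtraTreesRegressor(n_estimators=300, random_state=0)
scores = cross_val_score(model, df, y, cv=5, scoring="r2")
print("cross-validated R2:", scores.mean())

model.fit(df, y)
for name, imp in zip(df.columns, model.feature_importances_):
    print(f"{name}: {imp:.2%}")   # proxy importance, analogous to the paper's analysis
```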

28 pages, 3016 KB  
Article
Ensemble Learning Model for Industrial Policy Classification Using Automated Hyperparameter Optimization
by Hee-Seon Jang
Electronics 2025, 14(20), 3974; https://doi.org/10.3390/electronics14203974 - 10 Oct 2025
Viewed by 247
Abstract
The Global Trade Alert (GTA) website, managed by the United Nations, releases a large number of industrial policy (IP) announcements daily. Recently, leading nations including the United States and China have increasingly turned to IPs to protect and promote their domestic corporate interests. They use both offensive and defensive tools such as tariffs, trade barriers, investment restrictions, and financial support measures. To evaluate how these policy announcements may affect national interests, many countries have implemented logistic regression models to automatically classify them as either IP or non-IP. This study proposes ensemble models—widely recognized for their superior performance in binary classification—as a more effective alternative. The random forest model (a bagging technique) and boosting methods (gradient boosting, XGBoost, and LightGBM) are proposed, and their performance is compared with that of logistic regression. For evaluation, a dataset of 2000 randomly selected policy documents was compiled and labeled by domain experts. Following data preprocessing, hyperparameter optimization was performed using the Optuna library in Python 3.10. To enhance model robustness, cross-validation was applied, and performance was evaluated using key metrics such as accuracy, precision, and recall. The analytical results demonstrate that ensemble models consistently outperform logistic regression in both baseline (default hyperparameters) and optimized configurations. Compared to logistic regression, LightGBM and random forest showed baseline accuracy improvements of 3.5% and 3.8%, respectively, with hyperparameter optimization yielding additional performance gains of 2.4–3.3% across ensemble methods. In particular, the analysis based on alternative performance indicators confirmed that the LightGBM and random forest models yielded the most reliable predictions. Full article
(This article belongs to the Special Issue Machine Learning for Data Mining)
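For readers who want to see the shape of such a pipeline, here is a minimal, hedged sketch of Optuna-driven hyperparameter search for a LightGBM classifier on an IP / non-IP task; the placeholder features, labels, and search ranges are invented and are not the study's configuration.

```python
import numpy as np
import optuna
from lightgbm import LGBMClassifier
from sklearn.model_selection import cross_val_score

X = np.random.rand(200, 50)            # placeholder document features (e.g., TF-IDF)
y = np.random.randint(0, 2, 200)       # placeholder IP / non-IP labels

def objective(trial):
    # hyperparameter search space (illustrative ranges)
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 100, 600),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "num_leaves": trial.suggest_int("num_leaves", 15, 127),
        "min_child_samples": trial.suggest_int("min_child_samples", 5, 50),
    }
    clf = LGBMClassifier(**params, random_state=0)
    # cross-validated accuracy as the optimization target
    return cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params, study.best_value)
```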

19 pages, 6362 KB  
Article
Micro-Platform Verification for LiDAR SLAM-Based Navigation of Mecanum-Wheeled Robot in Warehouse Environment
by Yue Wang, Ying Yu Ye, Wei Zhong, Bo Lin Gao, Chong Zhang Mu and Ning Zhao
World Electr. Veh. J. 2025, 16(10), 571; https://doi.org/10.3390/wevj16100571 - 8 Oct 2025
Viewed by 386
Abstract
Path navigation for mobile robots critically determines the operational efficiency of warehouse logistics systems. However, the current QR (Quick Response) code path navigation for warehouses suffers from low operational efficiency and poor dynamic adaptability in complex dynamic environments. This paper introduces a deep reinforcement learning and hybrid-algorithm SLAM (Simultaneous Localization and Mapping) path navigation method for Mecanum-wheeled robots, validated with an emphasis on dynamic adaptability and real-time performance. Based on the Gazebo warehouse simulation environment, the TD3 (Twin Delayed Deep Deterministic Policy Gradient) path planning method was established for offline training. Then, the Astar-Time Elastic Band (TEB) hybrid path planning algorithm was used to conduct experimental verification in static and dynamic real-world scenarios. Finally, experiments show that the TD3-based path planning for mobile robots makes effective decisions during offline training in the simulation environment, while Astar-TEB accurately completes path planning and navigates around both static and dynamic obstacles in real-world scenarios. Therefore, this verifies the feasibility and effectiveness of the proposed SLAM path navigation for Mecanum-wheeled mobile robots on a miniature warehouse platform. Full article
(This article belongs to the Special Issue Research on Intelligent Vehicle Path Planning Algorithm)
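Since the abstract leans on TD3, a minimal sketch of TD3's defining update (target policy smoothing plus the minimum over twin target critics) may help orient readers; the toy networks and dimensions below are placeholders, and this is not the paper's planner.

```python
import torch
import torch.nn as nn

def td3_target(reward, next_state, done, actor_target, critic1_target, critic2_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, max_action=1.0):
    """Compute the TD3 critic target: clipped double-Q with target policy smoothing."""
    with torch.no_grad():
        a = actor_target(next_state)
        # target policy smoothing: perturb the target action with clipped noise
        noise = (torch.randn_like(a) * noise_std).clamp(-noise_clip, noise_clip)
        next_action = (a + noise).clamp(-max_action, max_action)
        # clipped double-Q: take the minimum of the twin target critics
        q1 = critic1_target(next_state, next_action)
        q2 = critic2_target(next_state, next_action)
        return reward + gamma * (1.0 - done) * torch.min(q1, q2)

class Critic(nn.Module):
    def __init__(self, s_dim, a_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(s_dim + a_dim, 64), nn.ReLU(), nn.Linear(64, 1))
    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

# toy usage with placeholder networks
s_dim, a_dim, batch = 8, 2, 32
actor_t = nn.Sequential(nn.Linear(s_dim, a_dim), nn.Tanh())
y = td3_target(torch.zeros(batch, 1), torch.randn(batch, s_dim), torch.zeros(batch, 1),
               actor_t, Critic(s_dim, a_dim), Critic(s_dim, a_dim))
print(y.shape)   # torch.Size([32, 1])
```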

25 pages, 2714 KB  
Article
Evaluating Municipal Solid Waste Incineration Through Determining Flame Combustion to Improve Combustion Processes for Environmental Sanitation
by Jian Tang, Xiaoxian Yang, Wei Wang and Jian Rong
Sustainability 2025, 17(19), 8872; https://doi.org/10.3390/su17198872 - 4 Oct 2025
Viewed by 298
Abstract
Municipal solid waste (MSW) refers to solid and semi-solid waste generated during human production and daily activities. The process of incinerating such waste, known as municipal solid waste incineration (MSWI), serves as a critical method for reducing waste volume and recovering resources. Automatic online recognition of flame combustion status during MSWI is a key technical approach to ensuring system stability, addressing issues such as high pollution emissions, severe equipment wear, and low operational efficiency. However, when manually selecting optimized features and hyperparameters based on empirical experience, the MSWI flame combustion state recognition model suffers from high time consumption, strong dependency on expertise, and difficulty in adaptively obtaining optimal solutions. To address these challenges, this article proposes a method for constructing a flame combustion state recognition model optimized based on reinforcement learning (RL), long short-term memory (LSTM), and parallel differential evolution (PDE) algorithms, achieving collaborative optimization of deep features and model hyperparameters. First, the feature selection and hyperparameter optimization problem of the ViT-IDFC combustion state recognition model is transformed into an encoding design and optimization problem for the PDE algorithm. Then, the mutation and selection factors of the PDE algorithm are used as modeling inputs for LSTM, which predicts the optimal hyperparameters based on PDE outputs. Next, during the PDE-based optimization of the ViT-IDFC model, a policy gradient reinforcement learning method is applied to determine the parameters of the LSTM model. Finally, the optimized combustion state recognition model is obtained by identifying the feature selection parameters and hyperparameters of the ViT-IDFC model. Test results based on an industrial image dataset demonstrate that the proposed optimization algorithm improves the recognition performance of both left and right grate recognition models, with the left grate achieving a 0.51% increase in recognition accuracy and the right grate a 0.74% increase. Full article
(This article belongs to the Section Waste and Recycling)
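For orientation only, the sketch below shows a textbook differential evolution mutation/crossover step (DE/rand/1/bin), which is where the mutation factor F and crossover rate CR that such searches tune come from; it is generic DE, not the authors' parallel variant or their LSTM-guided parameter prediction.

```python
import numpy as np

def de_step(population, F=0.5, CR=0.9, rng=None):
    """One DE/rand/1/bin generation step on a (n, dim) population array."""
    rng = rng or np.random.default_rng(0)
    pop = np.asarray(population, dtype=float)
    n, dim = pop.shape
    trials = pop.copy()
    for i in range(n):
        r1, r2, r3 = rng.choice([j for j in range(n) if j != i], size=3, replace=False)
        mutant = pop[r1] + F * (pop[r2] - pop[r3])          # mutation with factor F
        cross = rng.random(dim) < CR                         # binomial crossover mask
        cross[rng.integers(dim)] = True                      # keep at least one mutant gene
        trials[i, cross] = mutant[cross]
    return trials

print(de_step(np.random.rand(6, 4)))
```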

17 pages, 782 KB  
Article
DAPO: Mobility-Aware Joint Optimization of Model Partitioning and Task Offloading for Edge LLM Inference
by Hao Feng, Gan Huang, Nian Zhou, Feng Zhang, Yuming Liu, Xiumin Zhou and Junchen Liu
Electronics 2025, 14(19), 3929; https://doi.org/10.3390/electronics14193929 - 3 Oct 2025
Viewed by 502
Abstract
Deploying Large Language Models (LLMs) in edge environments faces two major challenges: (i) the conflict between limited device resources and high computational demands, and (ii) the dynamic impact of user mobility on model partitioning and task offloading decisions. To address these challenges, this paper proposes the Dynamic Adaptive Partitioning and Offloading (DAPO) framework, an intelligent solution for multi-user, multi-edge Mobile Edge Intelligence (MEI) systems. DAPO employs a Deep Deterministic Policy Gradient (DDPG) algorithm to jointly optimize the model partition point and the task offloading destination. By mapping continuous policy outputs onto valid discrete actions, DAPO efficiently addresses the high-dimensional hybrid action space and dynamically adapts to user mobility. Through extensive simulations, we demonstrate that DAPO outperforms baseline strategies and mainstream RL methods, achieving up to 27% lower latency and 18% lower energy consumption compared to PPO and A2C, while maintaining fast convergence and scalability in dynamic mobile environments. Full article
(This article belongs to the Special Issue Towards Efficient and Reliable AI at the Edge)
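The continuous-to-discrete mapping is only described at a high level above; a common realization (sketched below with invented sizes) rescales each continuous actor output into an index range and rounds it to obtain a partition layer and an offloading target. This illustrates the general technique, not DAPO's exact scheme.

```python
import numpy as np

NUM_LAYERS = 32        # assumed number of candidate partition points
NUM_SERVERS = 4        # assumed number of edge servers

def to_discrete_action(raw_action):
    """raw_action: array in [-1, 1]^2 produced by a DDPG-style actor network."""
    a = (np.clip(raw_action, -1.0, 1.0) + 1.0) / 2.0          # rescale to [0, 1]
    partition_point = int(round(a[0] * (NUM_LAYERS - 1)))      # which layer to split at
    offload_target = int(round(a[1] * (NUM_SERVERS - 1)))      # which edge server
    return partition_point, offload_target

print(to_discrete_action(np.array([0.2, -0.9])))
```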

26 pages, 12288 KB  
Article
An Optimal Scheduling Method for Power Grids in Extreme Scenarios Based on an Information-Fusion MADDPG Algorithm
by Xun Dou, Cheng Li, Pengyi Niu, Dongmei Sun, Quanling Zhang and Zhenlan Dou
Mathematics 2025, 13(19), 3168; https://doi.org/10.3390/math13193168 - 3 Oct 2025
Viewed by 302
Abstract
With the large-scale integration of renewable energy into distribution networks, the intermittency and uncertainty of renewable generation pose significant challenges to the voltage security of the power grid under extreme scenarios. To address this issue, this paper proposes an optimal scheduling method for power grids under extreme scenarios, based on an improved Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm. By simulating potential extreme scenarios in the power system and formulating targeted secure scheduling strategies, the proposed method effectively reduces trial-and-error costs. First, the time series clustering method is used to construct the extreme scene dataset based on the principle of maximizing scene differences. Then, a mathematical model of power grid optimal dispatching is constructed with the objective of ensuring voltage security, with explicit constraints and environmental settings. Then, an interactive scheduling model of distribution network resources is designed based on a multi-agent algorithm, including the construction of an agent state space, an action space, and a reward function. Then, an improved MADDPG multi-agent algorithm based on specific information fusion is proposed, and a hybrid optimization experience sampling strategy is developed to enhance the training efficiency and stability of the model. Finally, the effectiveness of the proposed method is verified by the case studies of the distribution network system. Full article
(This article belongs to the Special Issue Artificial Intelligence and Game Theory)
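As a small, generic illustration of the kind of voltage-security reward the abstract mentions (not the paper's actual formulation), the sketch below combines operating cost with a penalty on per-unit voltage-limit violations; all limits and weights are invented.

```python
import numpy as np

V_MIN, V_MAX, PENALTY = 0.95, 1.05, 100.0   # assumed per-unit limits and penalty weight

def dispatch_reward(operating_cost, bus_voltages):
    """Negative operating cost minus a penalty proportional to total voltage violation."""
    violation = np.sum(np.maximum(0.0, V_MIN - bus_voltages)
                       + np.maximum(0.0, bus_voltages - V_MAX))
    return -operating_cost - PENALTY * violation

print(dispatch_reward(1200.0, np.array([0.98, 1.06, 1.01])))
```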

27 pages, 4866 KB  
Article
An Intelligent Control Framework for High-Power EV Fast Charging via Contrastive Learning and Manifold-Constrained Optimization
by Hao Tian, Tao Yan, Guangwu Dai, Min Wang and Xuejian Zhao
World Electr. Veh. J. 2025, 16(10), 562; https://doi.org/10.3390/wevj16100562 - 1 Oct 2025
Viewed by 212
Abstract
To address the complex trade-offs among charging efficiency, battery lifespan, energy efficiency, and safety in high-power electric vehicle (EV) fast charging, this paper presents an intelligent control framework based on contrastive learning and manifold-constrained multi-objective optimization. A multi-physics coupled electro-thermal-chemical model is formulated as a Mixed-Integer Nonlinear Programming (MINLP) problem, incorporating both continuous and discrete decision variables—such as charging power and cooling modes—into a unified optimization framework. An environment-adaptive optimization strategy is also developed. To enhance learning efficiency and policy safety, a contrastive learning–enhanced policy gradient (CLPG) algorithm is proposed to distinguish between high-quality and unsafe charging trajectories. A manifold-aware action generation network (MAN) is further introduced to enforce dynamic safety constraints under varying environmental and battery conditions. Simulation results demonstrate that the proposed framework reduces charging time to 18.3 min—47.7% faster than the conventional CC–CV method—while achieving 96.2% energy efficiency, 99.7% capacity retention, and zero safety violations. The framework also exhibits strong adaptability across wide temperature (−20 °C to 45 °C) and aging (SOH down to 70%) conditions, with real-time inference speed (6.76 ms) satisfying deployment requirements. This study provides a safe, efficient, and adaptive solution for intelligent high-power EV fast-charging. Full article
(This article belongs to the Section Charging Infrastructure and Grid Integration)
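The contrastive element is only named above; as a loose, generic illustration of scoring high-quality charging trajectories above unsafe ones, the sketch below uses a pairwise margin loss. It is not the paper's CLPG objective, and all shapes are invented.

```python
import torch

def trajectory_contrastive_loss(good_scores, unsafe_scores, margin=1.0):
    """Push scores of high-quality trajectories above unsafe ones by at least `margin`."""
    gaps = good_scores.unsqueeze(1) - unsafe_scores.unsqueeze(0)   # all good/unsafe pairs
    return torch.clamp(margin - gaps, min=0).mean()

good = torch.randn(8)     # scores of sampled high-quality trajectories (placeholder)
unsafe = torch.randn(8)   # scores of unsafe trajectories (placeholder)
print(trajectory_contrastive_loss(good, unsafe))
```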

26 pages, 2589 KB  
Article
Vision-Based Adaptive Control of Robotic Arm Using MN-MD3+BC
by Xianxia Zhang, Junjie Wu and Chang Zhao
Appl. Sci. 2025, 15(19), 10569; https://doi.org/10.3390/app151910569 - 30 Sep 2025
Viewed by 290
Abstract
To address the reliance of traditional calibrated visual servo systems on precise model calibration, together with the high training cost and low efficiency of online reinforcement learning, this paper proposes a Multi-Network Mean Delayed Deep Deterministic Policy Gradient Algorithm with Behavior Cloning (MN-MD3+BC) for uncalibrated visual adaptive control of robotic arms. The algorithm improves upon the Twin Delayed Deep Deterministic Policy Gradient (TD3) network framework by adopting an architecture with one actor network and three critic networks, along with corresponding target networks. By constructing a multi-critic network integration mechanism, the mean output of the networks is used as the final Q-value estimate, effectively reducing the estimation bias of a single critic network. Meanwhile, a behavior cloning regularization term is introduced to address the common distribution shift problem in offline reinforcement learning. Furthermore, to obtain a high-quality dataset, an innovative data recombination-driven dataset creation method is proposed, which reduces training costs and avoids the risks of real-world exploration. The trained policy network is embedded into the actual system as an adaptive controller, driving the robotic arm to gradually approach the target position through closed-loop control. The algorithm is applied to uncalibrated multi-degree-of-freedom robotic arm visual servo tasks, providing an adaptive and low-dependency solution for dynamic and complex scenarios. MATLAB simulations and experiments on the WPR1 platform demonstrate that, compared to traditional Jacobian matrix-based model-free methods, the proposed approach exhibits advantages in tracking accuracy, error convergence speed, and system stability. Full article
(This article belongs to the Special Issue Intelligent Control of Robotic System)
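The abstract's two key ingredients (the mean of three critics as the Q estimate, plus a behavior-cloning regularizer, much like offline TD3+BC) can be sketched in a few lines; the networks, dimensions, and weighting constant below are placeholders rather than the authors' settings.

```python
import torch
import torch.nn as nn

def actor_loss(actor, critics, states, dataset_actions, bc_weight=2.5):
    """Mean-of-critics policy objective with a behavior-cloning (BC) penalty."""
    actions = actor(states)
    q_mean = torch.stack([c(states, actions) for c in critics]).mean(dim=0)
    lam = bc_weight / q_mean.abs().mean().detach()       # scale the RL term vs. the BC term
    return -(lam * q_mean).mean() + nn.functional.mse_loss(actions, dataset_actions)

# toy usage with three placeholder critics
s_dim, a_dim = 6, 3
actor = nn.Sequential(nn.Linear(s_dim, a_dim), nn.Tanh())

class Critic(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(s_dim + a_dim, 32), nn.ReLU(), nn.Linear(32, 1))
    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

critics = [Critic() for _ in range(3)]
states = torch.randn(16, s_dim)
dataset_actions = torch.rand(16, a_dim) * 2 - 1    # stand-in for offline dataset actions
print(actor_loss(actor, critics, states, dataset_actions))
```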

28 pages, 3341 KB  
Article
Research on Dynamic Energy Management Optimization of Park Integrated Energy System Based on Deep Reinforcement Learning
by Xinjian Jiang, Lei Zhang, Fuwang Li, Zhiru Li, Zhijian Ling and Zhenghui Zhao
Energies 2025, 18(19), 5172; https://doi.org/10.3390/en18195172 - 29 Sep 2025
Viewed by 341
Abstract
Against the background of the energy transition, the park-level Integrated Energy System (IES) has become a key carrier for enhancing renewable energy consumption capacity due to its multi-energy complementary characteristics. However, the high share of wind and solar resources and the fluctuation of diverse loads expose the system to dual uncertainty challenges, and traditional optimization methods struggle to adapt to the dynamic and complex dispatching requirements. To this end, this paper proposes a new dynamic energy management method based on Deep Reinforcement Learning (DRL) and constructs an IES mixed-integer nonlinear programming model that includes wind power, photovoltaics, combined heat and power generation, and electric and thermal energy storage, with the goal of minimizing the operating cost of the system. By expressing the dispatching process as a Markov decision process, a state space covering wind and solar output, multiple loads and energy storage states is defined, a continuous action space for unit output and energy storage control is constructed, and a reward function integrating economic cost and the penalty for renewable energy consumption is designed. The Deep Deterministic Policy Gradient (DDPG) and Deep Q-Network (DQN) algorithms were adopted to achieve policy optimization. This study is based on simulation rather than experimental validation, which aligns with the exploratory scope of this research. The simulation results show that the DDPG algorithm achieves an average weekly operating cost of 532,424 yuan in the continuous action space scheduling, which is 8.6% lower than that of the DQN algorithm, and the standard deviation of the cost is reduced by 19.5%, indicating better robustness. Under the fluctuation of 10% to 30% on the source-load side, the DQN algorithm still maintains a cost fluctuation of less than 4.5%, highlighting the strong adaptability of DRL to uncertain environments. Therefore, this method has significant theoretical and practical value for promoting the intelligent transformation of the energy system. Full article
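A toy sketch of the MDP ingredients the abstract lists (state vector, continuous action, cost-plus-penalty reward) is given below; all names, bounds, and the penalty weight are invented placeholders, not the paper's model.

```python
import numpy as np

def make_state(wind, pv, elec_load, heat_load, soc):
    # state space: renewable output, multiple loads, energy-storage state of charge
    return np.array([wind, pv, elec_load, heat_load, soc], dtype=float)

def step_reward(operating_cost, curtailed_renewable, penalty=50.0):
    # reward: negative operating cost minus a penalty on un-consumed (curtailed) renewables
    return -operating_cost - penalty * curtailed_renewable

# continuous action: normalized CHP output and storage charge/discharge power
action = np.clip(np.array([0.6, 0.2]), 0.0, 1.0)
print(make_state(3.2, 1.1, 5.0, 2.4, 0.5), step_reward(820.0, 0.3), action)
```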

21 pages, 2647 KB  
Article
Structural Determinants of Greenhouse Gas Emissions Convergence in OECD Countries: A Machine Learning-Based Assessment
by Volkan Bektaş
Sustainability 2025, 17(19), 8730; https://doi.org/10.3390/su17198730 - 29 Sep 2025
Viewed by 476
Abstract
This study explores the convergence in greenhouse gas emissions (GHGs) and its determinants across 38 OECD countries during the period 1996–2022, employing a novel approach that combines the club convergence method with the supervised machine learning algorithm Extreme Gradient Boosting (XGBoost) and the SHapley Additive exPlanations (SHAP) method. The findings reveal the presence of three distinct convergence clubs shaped by structural economic and institutional characteristics. Club 1 exhibits low energy efficiency, high fossil fuel dependence, and weak governance structures; Club 2 features strong institutional quality, advanced human capital, and effective environmental taxation; and Club 3 displays heterogeneous energy profiles but converges through socio-economic foundations. While traditional growth-related drivers such as technological innovation, foreign direct investments, and GDP growth play a limited role in explaining emission convergence, energy structures and institutional and policy-related factors emerge as key determinants. These findings highlight the limitations of one-size-fits-all climate policy frameworks and call for a more nuanced, club-specific approach to emission mitigation strategies. By combining convergence theory with interpretable machine learning, this study contributes a novel empirical framework to assess the differentiated effectiveness of environmental policies across heterogeneous country groups, offering actionable insights for international climate governance and targeted policy design. Full article
(This article belongs to the Section Air, Climate Change and Sustainability)
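To make the XGBoost + SHAP pairing concrete, here is a minimal sketch on invented country-level indicators: fit a classifier for club membership, then rank determinants by mean absolute SHAP value. Feature names, data, and the toy label are placeholders, not the study's.

```python
import numpy as np
import pandas as pd
import shap
from xgboost import XGBClassifier

rng = np.random.default_rng(1)
X = pd.DataFrame({
    "energy_intensity":  rng.random(300),
    "fossil_share":      rng.random(300),
    "gov_effectiveness": rng.random(300),
    "env_tax_share":     rng.random(300),
    "gdp_growth":        rng.normal(2, 1, 300),
})
# toy binary club label constructed from two structural indicators
y = (X["fossil_share"] + 0.5 * X["energy_intensity"] > 0.9).astype(int)

model = XGBClassifier(n_estimators=200, max_depth=3, eval_metric="logloss")
model.fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
mean_abs = np.abs(shap_values).mean(axis=0)        # mean |SHAP| per feature
print(pd.Series(mean_abs, index=X.columns).sort_values(ascending=False))
```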

30 pages, 2577 KB  
Article
Indigenous Knowledge and Sustainable Management of Forest Resources in a Socio-Cultural Upheaval of the Okapi Wildlife Reserve Landscape in the Democratic Republic of the Congo
by Lucie Mugherwa Kasoki, Pyrus Flavien Ebouel Essouman, Charles Mumbere Musavandalo, Franck Robéan Wamba, Isaac Diansambu Makanua, Timothée Besisa Nguba, Krossy Mavakala, Jean-Pierre Mate Mweru, Samuel Christian Tsakem, Michel Babale, Francis Lelo Nzuzi and Baudouin Michel
Forests 2025, 16(10), 1523; https://doi.org/10.3390/f16101523 - 28 Sep 2025
Viewed by 652
Abstract
The Okapi Wildlife Reserve (OWR) in northeastern Democratic Republic of the Congo represents both a biodiversity hotspot and the ancestral homeland of the Indigenous Mbuti and Efe peoples, whose livelihoods and knowledge systems are closely tied to forest resources. This study investigates how Indigenous knowledge and practices contribute to sustainable resource management under conditions of rapid socio-cultural transformation. A mixed-methods approach was applied, combining socio-demographic surveys (n = 80), focus group discussions, floristic inventories, and statistical analyses (ANOVA, logistic regressions, chi-square, MCA). Results show that hunting, fishing, gathering, and honey harvesting remain central livelihood activities, governed by customary taboos and restrictions that act as de facto ecological regulations. Agriculture, recently introduced through intercultural exchange with neighboring Bantu populations, complements rather than replaces traditional practices and demonstrates emerging agroecological hybridization. Nevertheless, evidence of biodiversity decline (including local disappearance of species such as Dioscorea spp.), erosion of intergenerational knowledge transmission, and increased reliance on monetary income indicate vulnerabilities. Multiple Correspondence Analysis revealed a highly structured socio-ecological gradient (98.5% variance explained; Cronbach’s α = 0.977), indicating that perceptions of environmental change are strongly coupled with demographic identity and livelihood strategies. Floristic inventories confirmed significant differences in species abundance across camps (ANOVA, p < 0.001), highlighting site-specific pressures and the protective effect of persistent customary norms. The findings underscore the resilience and adaptability of Indigenous Peoples but also their exposure to ecological and cultural disruptions. We conclude that formal recognition of Indigenous institutions and integration of their knowledge systems into co-management frameworks are essential to strengthen ecological resilience, secure Indigenous rights, and align conservation policies with global biodiversity and climate agendas. Full article
(This article belongs to the Special Issue Forest Ecosystem Services and Sustainable Management)
Show Figures

Figure 1

18 pages, 4509 KB  
Article
Reinforcement Learning Stabilization for Quadrotor UAVs via Lipschitz-Constrained Policy Regularization
by Jiale Quan, Weijun Hu, Xianlong Ma and Gang Chen
Drones 2025, 9(10), 675; https://doi.org/10.3390/drones9100675 - 26 Sep 2025
Viewed by 484
Abstract
Reinforcement learning (RL), and in particular Proximal Policy Optimization (PPO), has shown promise in high-precision quadrotor unmanned aerial vehicle (QUAV) control. However, the performance of PPO is highly sensitive to the choice of the clipping parameter, and inappropriate settings can lead to unstable training dynamics and excessive policy oscillations, which limit deployment in safety-critical aerial applications. To address this issue, we propose a stability-aware dynamic clipping parameter adjustment strategy, which adapts the clipping threshold ϵ_t in real time based on a stability variance metric S_t. This adaptive mechanism balances exploration and stability throughout the training process. Furthermore, we provide a Lipschitz continuity interpretation of the clipping mechanism, showing that its adaptation implicitly adjusts a bound on the policy update step, thereby offering a deterministic guarantee on the oscillation magnitude. Extensive simulation results demonstrate that the proposed method reduces policy variance by 45% and accelerates convergence compared to baseline PPO, resulting in smoother control responses and improved robustness under dynamic operating conditions. While developed within the PPO framework, the proposed approach is readily applicable to other on-policy policy gradient methods. Full article
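To illustrate the mechanism in code form: below is a hedged sketch of PPO's clipped surrogate with a clipping threshold that shrinks when a variance-based stability signal is high and relaxes otherwise. The specific stability metric S_t and the schedule used in the paper are not reproduced here; the update rule and constants are invented stand-ins.

```python
import torch

def adaptive_epsilon(eps, stability_var, target_var=0.05,
                     eps_min=0.05, eps_max=0.3, rate=0.1):
    """Shrink the clipping threshold when the policy is oscillating, grow it when stable."""
    eps = eps * (1.0 - rate) if stability_var > target_var else eps * (1.0 + rate)
    return float(min(max(eps, eps_min), eps_max))

def ppo_clip_loss(log_prob_new, log_prob_old, advantage, eps):
    """Standard PPO clipped surrogate objective (to be minimized)."""
    ratio = torch.exp(log_prob_new - log_prob_old)
    unclipped = ratio * advantage
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return -torch.min(unclipped, clipped).mean()

# toy usage
lp_new, lp_old = torch.randn(64), torch.randn(64)
adv = torch.randn(64)
eps = adaptive_epsilon(0.2, stability_var=0.08)
print(eps, ppo_clip_loss(lp_new, lp_old, adv, eps).item())
```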

27 pages, 4674 KB  
Article
Design of a Robust Adaptive Cascade Fractional-Order Proportional–Integral–Derivative Controller Enhanced by Reinforcement Learning Algorithm for Speed Regulation of Brushless DC Motor in Electric Vehicles
by Seyyed Morteza Ghamari, Mehrdad Ghahramani, Daryoush Habibi and Asma Aziz
Energies 2025, 18(19), 5056; https://doi.org/10.3390/en18195056 - 23 Sep 2025
Viewed by 508
Abstract
Brushless DC (BLDC) motors are commonly used in electric vehicles (EVs) because of their efficiency, small size and great torque-speed performance. These motors offer benefits such as low maintenance, increased reliability and high power density. Nevertheless, BLDC motors are highly nonlinear and their dynamics are very complicated, particularly under changing load and supply conditions. These characteristics require robust and adaptable control methods that can ensure performance over a broad spectrum of disturbances and uncertainties. To overcome these issues, this paper uses a Fractional-Order Proportional-Integral-Derivative (FOPID) controller that offers better control precision, better frequency response, and an extra degree of freedom in tuning through its non-integer order terms. Despite these benefits, the controller has three primary drawbacks: (i) it is not real-time adaptable, (ii) it is hard to choose appropriate initial gain values, and (iii) it is sensitive to large disturbances and parameter changes. A new control framework is suggested to address these problems. First, a Reinforcement Learning (RL) approach based on Deep Deterministic Policy Gradient (DDPG) is presented to optimize the FOPID gains online so that the controller can adjust itself continuously to variations in the system. Second, the Snake Optimization (SO) algorithm is used to fine-tune the FOPID parameters at the initial stage to guarantee stable convergence. Lastly, a cascade control structure is adopted, in which FOPID controllers are used in the inner (current) and outer (speed) loops. This structure adds robustness to the system as a whole and minimizes the effect of disturbances on performance. In addition, the cascade design allows more coordinated and smooth control actions, reducing stress on the power electronic switches, which lowers switching losses and improves the overall efficiency of the drive system. The suggested RL-enhanced cascade FOPID controller is verified by Hardware-in-the-Loop (HIL) testing, which shows better performance in terms of speed regulation, robustness, and adaptability to realistic operating conditions in EV applications. Full article
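As a small sketch of the online-tuning idea (not the authors' controller), a DDPG-style actor can map an error-based state to the five FOPID parameters (Kp, Ki, Kd, λ, μ) within bounded ranges; the bounds, state variables, and network size below are assumptions, and the fractional-order dynamics themselves are not implemented.

```python
import torch
import torch.nn as nn

GAIN_LOW  = torch.tensor([0.1, 0.01, 0.001, 0.5, 0.5])   # assumed lower bounds for Kp, Ki, Kd, lambda, mu
GAIN_HIGH = torch.tensor([10.0, 5.0, 1.0, 1.2, 1.2])     # assumed upper bounds

# placeholder actor: maps a 3-dimensional error state to 5 bounded outputs
actor = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 5), nn.Tanh())

def fopid_gains(error, error_rate, speed_ref):
    """Return bounded FOPID parameters from the actor's [-1, 1] outputs."""
    state = torch.tensor([error, error_rate, speed_ref], dtype=torch.float32)
    a = actor(state)                                   # in [-1, 1]^5
    return GAIN_LOW + (a + 1.0) / 2.0 * (GAIN_HIGH - GAIN_LOW)

print(fopid_gains(error=12.0, error_rate=-3.0, speed_ref=1500.0))
```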
