Search Results (114)

Search Parameters:
Keywords = multiarmed bandits

24 pages, 1069 KB  
Article
Context-Aware Online Model Splitting and Device Association for Semi-Decentralized Federated Learning in Internet of Things
by Bo Xu, Shuang Wang and Xiaoyu Tang
Sensors 2026, 26(13), 4016; https://doi.org/10.3390/s26134016 (registering DOI) - 24 Jun 2026
Abstract
As a distributed approach to Artificial Intelligence (AI) model construction over wireless networks, federated learning (FL) based on multi-device collaborative training can protect data privacy, as well as increase the computing load of local model updates. In contrast, split learning (SL) with proper model splitting can adapt to the computation and transmission capabilities among devices. In this paper, while taking advantage of FL and SL, we concentrate on a semi-decentralized hybrid federated split learning (SD-HFSL) framework, in which we surpass the limitations of a single central server and allow the shared split models to be aggregated among multiple edge servers. To verify the importance of latency optimization for training efficiency, we analyze the convergence performance of SD-HFSL while jointly considering the limited computation and communication resources. Then, aiming at maximizing the long-term training efficiency, we propose an online optimization problem that includes local model splitting and device association. Considering that the training latency is unknown to the system a priori, a context-aware online training algorithm with sublinear regret is proposed based on the framework of contextual multi-armed bandit (CMAB), where the edge servers can observe the context information of device sites for latency estimation, followed by the iterative optimization based on the evaluated information in different contexts. Experiments on several neural network models show that the proposed algorithm reduces training latency and improves test accuracy compared with the selected benchmarks. Full article
(This article belongs to the Section Internet of Things)
33 pages, 1096 KB  
Article
Surrogate-Assisted Rezone-Enhanced Multi-Objective Adaptive Evolutionary Algorithm for Truck–UAV Collaborative Delivery Route Optimization
by Ai-Qing Tian, Fei-Fei Liu and Xiao-Yang Wang
J. Superintelligence 2026, 1(1), 3; https://doi.org/10.3390/superintelligence1010003 - 8 Jun 2026
Cited by 1 | Viewed by 133
Abstract
To address the challenges of combinatorial explosion and expensive evaluations in truck–drone (truck–UAV) collaborative delivery under complex geographical constraints, this paper proposes a Surrogate-assisted Rezone-Enhanced Multi-objective Adaptive Evolutionary Algorithm (SRE-MAEA). As a knowledge-driven decomposition-based surrogate-assisted framework, the proposed algorithm aims to synergistically optimize a four-dimensional conflicting objective space consisting of economic cost, social satisfaction, environmental emissions, and battery resilience. To overcome the curse of dimensionality in high-dimensional and strongly constrained environments, SRE-MAEA constructs an adaptive Rezone Search architecture. By dynamically deconstructing the decision space, it transforms global search pressure into refined knowledge mining within high-potential local regions. The core mechanism incorporates an intelligent sampling strategy based on the Multi-Armed Bandit (MAB). By utilizing real-time evolutionary feedback to dynamically prioritize the Pareto contribution of each rezone, the MAB achieves pruning-level scheduling of expensive evaluation resources. Simulation results on 15 benchmark instances with clustered, random, and mixed spatial distributions demonstrate that SRE-MAEA exhibits superior convergence boundaries and distribution uniformity in terms of IGD and HV metrics, significantly outperforming state-of-the-art regression-based strategies. Furthermore, computational efficiency analysis confirms that by precisely identifying invalid search paths via the MAB mechanism, SRE-MAEA maintains a high-precision Pareto front while reducing the average CPU time by approximately 35.2–48.5%. This effectively resolves the computational bottleneck caused by complex battery resilience integral models. 
This research provides an efficient algorithmic paradigm for resilient logistics scheduling in extreme environments and holds significant academic value and engineering application prospects. Full article
25 pages, 1822 KB  
Article
Adaptive Task Scheduling for Edge-Intelligent Systems: An Online Sleeping Restless Bandits Framework
by Sujunjie Sun, Chenchen Fu, Yuhang Xu and Weiwei Wu
Symmetry 2026, 18(6), 951; https://doi.org/10.3390/sym18060951 - 1 Jun 2026
Viewed by 203
Abstract
In edge-intelligent systems, efficient resource management and task scheduling are critical but challenging due to the dynamic and heterogeneous nature of edge nodes (e.g., IoT devices, drones). We model this dynamic resource allocation challenge as an online sleeping Restless Multi-Armed Bandits (RMAB) problem, where each edge node (arm) operates as a Markov decision process. Unlike prior RMAB frameworks assuming perpetual availability, our setting captures the stochastic availability of edge nodes across rounds. The system controller (learner) is unaware of the transition functions, reward distributions, and node availability a priori. The goal is to maximize expected cumulative rewards through adaptive node selection. To explore this target problem, we first derive an asymptotically optimal sleeping-index policy (SIP) as the oracle based on the fluid process transformation. Then we propose OSILA (Online Sleeping Index-aware Learning Algorithm), featuring a Minimum Exploration Guarantee (MEG) mechanism for efficient exploration. This is coupled with a modified Linear Programming-based exploitation mechanism to construct an online sleeping index, effectively handling dynamic node availability. To the best of our knowledge, this work is the first to provide a theoretical analysis of the online sleeping RMAB problem, achieving $\tilde{O}(K T^{2/3} \log T)$ regret, where $K$ is the number of arms and $T$ is the time horizon. Empirical results validate both theoretical guarantees and practical effectiveness in dynamic edge computing environments. Full article
(This article belongs to the Special Issue Symmetry and Asymmetry in Embedded Systems)
26 pages, 7601 KB  
Article
State-Separated SARSA: A Practical Sequential Decision-Making Algorithm with Recovering Rewards
by Yuto Tanimoto and Kenji Fukumizu
Algorithms 2026, 19(5), 419; https://doi.org/10.3390/a19050419 - 21 May 2026
Viewed by 298
Abstract
While many multi-armed bandit algorithms assume that rewards for all arms are constant across rounds, this assumption does not hold in many real-world scenarios. This paper considers the setting of recovering bandits, where the reward depends on the number of rounds elapsed since the last time an arm was pulled. We propose a new reinforcement learning (RL) algorithm tailored to this setting, named the State-Separated SARSA (SS-SARSA) algorithm, which treats the elapsed rounds as states. The SS-SARSA algorithm achieves efficient learning by reducing the number of state combinations required for Q-learning/SARSA, which often suffers from combinatorial explosion for large-scale RL problems. Additionally, it makes minimal assumptions about the reward structure and has lower computational complexity. Furthermore, we prove asymptotic convergence to an optimal policy under mild assumptions. Simulation studies demonstrate the superior performance of our algorithm across various settings. Full article
(This article belongs to the Section Evolutionary Algorithms and Machine Learning)
36 pages, 5711 KB  
Article
Digital Twin-Enabled Waveform Optimization for VHF Radio Communication Systems
by Chenzhe Zhong, Bo Liu, Wei Zhu, Binnian Wang, Yifan Tan and Xiangchen Wang
Sensors 2026, 26(10), 3060; https://doi.org/10.3390/s26103060 - 12 May 2026
Viewed by 528
Abstract
Very High Frequency (VHF) radio communication systems face significant challenges in modern electromagnetic environments, including spectrum congestion, dynamic interference, and varying channel conditions. Existing adaptive approaches rely on static rule-based switching or single-cycle optimization, which cannot accumulate operational experience across decision cycles. This paper proposes a digital twin-enabled online learning framework (DT-MAB) for adaptive waveform selection in tactical VHF communication. The framework employs a contextual multi-armed bandit algorithm (Lin-UCB) that continuously learns the mapping from channel conditions to optimal configurations, with the digital twin serving as a virtual exploration sandbox that screens candidate configurations before physical deployment—preventing link disruptions during exploratory actions. An expanded configuration space of 63 candidates (7 waveforms × 3 MAC protocols × 3 power levels) is constructed, and a hierarchical performance evaluation model combining voice quality, bit error rate, communication delay, and transmission range is developed using the Analytic Hierarchy Process (AHP) as the reward function for online learning. Experimental results across 10 random seeds demonstrate that DT-MAB achieves the lowest mean cumulative regret, reducing regret by 29% relative to MAB without a digital twin and by 16.5% relative to PSO-based optimization on average. Ablation experiments confirm that removing virtual exploration increases performance drop events by 49% (from 250 ± 79 to 373 ± 6), demonstrating that the digital twin is a functionally indispensable component of the online learning architecture. Full article
(This article belongs to the Section Communications)
28 pages, 10170 KB  
Article
An RL-Guided Hybrid Forecasting Framework for Aircraft Engine RUL and Performance Emission Prediction
by Ukbe Üsame Uçar and Hakan Aygün
Appl. Sci. 2026, 16(9), 4271; https://doi.org/10.3390/app16094271 - 27 Apr 2026
Viewed by 399
Abstract
In this paper, a new hybrid prediction method is proposed for estimating remaining useful life, emissions, and performance parameters using experimental data obtained from a micro-turbojet engine. Experiments were conducted under various rotational speed conditions, yielding a total of 342 measurement points. Turbine speed, exhaust gas temperature, fuel flow rate, and thrust were considered as input variables in the study. Thermal efficiency, total power, CO₂, and NO₂ were considered as output variables. The experimental findings showed that thermal efficiency varied between 0.49% and 7.1%, total power between 0.266 and 13.94 kW, and CO₂ emissions by volume between 0.317% and 2.183%. The proposed RL-MH-LR-CBR approach combines the advantages of multiple methods. In this method, the interpretable formulation of linear regression serves as the foundation. Additionally, in the adaptive meta-heuristic optimization process, a hyper-heuristic selection mechanism based on the UCB1-based multi-arm bandit approach is used to select the optimal algorithm from among the meta-heuristic methods. Finally, the CatBoost-based residual error learning component aims to capture non-linear patterns that cannot be explained by the linear model. The method was compared with 14 different methods on both the NASA C-MAPSS FD001 dataset and real engine data. The results demonstrate that the proposed framework exhibits more balanced, stable, and higher generalization capabilities compared to classical regression models and powerful AI methods, particularly in non-linear, noisy, and heterogeneous outputs. In the real engine dataset, the proposed method produced R² values of 0.968 for CO₂ and 0.936 for NO₂, while the predictive performance was even stronger for thermal efficiency and total power, with corresponding R² values of 0.998 and 0.995, respectively. Additionally, the method demonstrated a clear advantage in hard-to-model outputs by reducing the error level to 0.061 in NO₂ predictions.
These findings demonstrate that the proposed approach is not limited to micro-turbojet engines. The developed method provides a robust decision support framework that is applicable, scalable, and generalizable to predictive maintenance, emissions monitoring, energy systems, aviation analytics, and other highly dynamic engineering problems. Full article
(This article belongs to the Section Aerospace Science and Engineering)
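The abstract above names UCB1 as the rule that picks among meta-heuristics but gives no details of how the authors apply it. As a rough illustration only, the following is a minimal sketch of the generic UCB1 rule, assuming each arm is a candidate optimizer and each reward is a score in [0, 1]. The function names `ucb1_select` and `run_ucb1` and the reward callables are hypothetical and are not taken from the paper.

```python
import math
import random


def ucb1_select(counts, sums, t):
    """Return the arm with the highest UCB1 score at round t (t starts at 1)."""
    # Play every arm once before using the confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [
        sums[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_ucb1(reward_fns, rounds, seed=0):
    """Run UCB1 for a number of rounds; return how often each arm was played.

    reward_fns is a list of callables (hypothetical stand-ins for running one
    meta-heuristic step), each taking a random.Random and returning a reward.
    """
    rng = random.Random(seed)
    counts = [0] * len(reward_fns)
    sums = [0.0] * len(reward_fns)
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, sums, t)
        reward = reward_fns[arm](rng)
        counts[arm] += 1
        sums[arm] += reward
    return counts


if __name__ == "__main__":
    # Three toy arms with Bernoulli rewards; the means are illustrative only.
    arms = [lambda r, p=p: 1.0 if r.random() < p else 0.0 for p in (0.2, 0.5, 0.8)]
    print(run_ucb1(arms, rounds=2000))
```

The exploration bonus shrinks as an arm is played more often, so the rule gradually concentrates on the arm with the best observed average while still revisiting the others occasionally.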
24 pages, 1601 KB  
Article
SHIFT-MAB: Fair and Mobility-Aware Handover Control for 6G Fully Decoupled RANs
by Tian Gong, Chen Dai and Tongtong Yang
Sensors 2026, 26(8), 2560; https://doi.org/10.3390/s26082560 - 21 Apr 2026
Viewed by 360
Abstract
Fully decoupled radio access networks (FD-RANs) achieve spectral efficiency and coverage flexibility for 6G via independent uplink (UL) and downlink (DL) base station operation, yet dynamic user mobility brings critical challenges to joint user association and resource allocation. Asymmetric interference and heterogeneous base station capacities cause persistent network unfairness, while uncoordinated mobility management triggers ping-pong handovers and heavy handover overheads. To resolve these intertwined problems, we propose a fully decoupled, mobility-resilient and fairness-guaranteed framework, which integrates short-term congestion pricing with the long-term Jain fairness index for equitable resource distribution and introduces a composite handover penalty with a strict physical hysteresis margin to block invalid handovers. We formulate the optimization problem as a novel Sliding-Window Hysteresis-Integrated Fairness Two-Layer Multi-Armed Bandit (SHIFT-MAB) model, embedding an exponentially weighted moving average (EWMA) sliding-window mechanism to track real-time channel fluctuations efficiently. Theoretical analysis confirms the model’s decoupling optimality, sublinear regret bound and fairness convergence. Extensive simulations show that SHIFT-MAB effectively suppresses invalid handovers, ensures high network fairness, optimizes system utility and achieves a superior handover–throughput trade-off. Full article
(This article belongs to the Section Communications)
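Two standard building blocks named in the abstract above, the Jain fairness index and an exponentially weighted moving average (EWMA), have simple closed forms. The sketch below shows only those textbook definitions. How SHIFT-MAB combines them with pricing and handover penalties is not described in the abstract, and the function names and the smoothing factor `alpha` are assumptions made for illustration.

```python
def jain_index(allocations):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), ranging from 1/n to 1."""
    n = len(allocations)
    squares = sum(x * x for x in allocations)
    if n == 0 or squares == 0:
        raise ValueError("need at least one non-zero allocation")
    total = sum(allocations)
    return total * total / (n * squares)


def ewma(samples, alpha):
    """Exponentially weighted moving average, seeded with the first sample.

    alpha in (0, 1] weights the newest sample; larger alpha tracks faster.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    samples = list(samples)
    if not samples:
        raise ValueError("need at least one sample")
    estimate = samples[0]
    for x in samples[1:]:
        estimate = alpha * x + (1.0 - alpha) * estimate
    return estimate


if __name__ == "__main__":
    print(jain_index([2.0, 2.0, 2.0, 2.0]))  # equal shares
    print(jain_index([1.0, 0.0, 0.0, 0.0]))  # one user takes everything
    print(ewma([0.0, 1.0], alpha=0.5))
```

An index of 1 means every user receives the same share, and the value falls toward 1/n as the allocation concentrates on a single user.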
35 pages, 2019 KB  
Article
Defining Quantum Agents: Formal Foundations, Architectures, and NISQ-Era Prototypes
by Eldar Sultanow, Madjid Tehrani, Siddhant Dutta, William J. Buchanan and Muhammad Shahbaz Khan
Quantum Rep. 2026, 8(1), 24; https://doi.org/10.3390/quantum8010024 - 13 Mar 2026
Viewed by 1257
Abstract
Quantum computing offers potential computational advantages, yet its integration into autonomous decision-making systems remains largely unexplored. This paper addresses the need for a unified framework that systematically combines quantum computation with agent-based artificial intelligence. We examine how quantum technologies can enhance the capabilities of autonomous agents and, conversely, how agentic AI can support the advancement of quantum systems. We analyze both directions of this synergy and present conceptual and technical foundations for future quantum–agentic platforms. Our work introduces a formal definition of quantum agents and outlines architectures that integrate quantum computing with agent-based systems. As concrete proof-of-concept implementations, we develop and evaluate three quantum agent prototypes: (i) a Grover-based decision agent for quantum search-driven action selection, (ii) a variational quantum reinforcement learning agent for adaptive policy learning in a multi-armed bandit setting, and (iii) an adaptive quantum image encryption agent that autonomously selects encryption strategies based on entropy-driven feedback. These prototypes demonstrate practical realizations of quantum agency in decision-making, learning, and security contexts under NISQ-era constraints. Furthermore, we discuss application domains including quantum-enhanced optimization, hybrid quantum–classical orchestration, autonomous quantum workflow management, and secure quantum information processing. By bridging these fields, we introduce a structured theoretical and architectural framework for quantum–agentic systems, providing formal definitions, system models, and early operational prototypes that illustrate the feasibility of quantum-enhanced agency under NISQ constraints. Full article
30 pages, 4019 KB  
Article
S-HSFL: A Game-Theoretic Enhanced Secure-Hybrid Split-Federated Learning Scheme for UAV-Assisted Wireless Networks
by Qiang Gao, Xintong Zhang, Guishan Dong, Bo Tang and Jinhui Liu
Drones 2026, 10(1), 37; https://doi.org/10.3390/drones10010037 - 7 Jan 2026
Viewed by 603
Abstract
Hybrid Split Federated Learning (HSFL for short) in emerging 6G-enabled UAV networks faces persistent challenges in data protection, device trust management, and long-term participation incentives. To address these issues, this study introduces S-HSFL, a security-enhanced framework that embeds verifiable federated learning mechanisms into HSFL and incorporates digital-signature-based authentication throughout the device selection process. This design effectively prevents model tampering and forgery attacks, achieving a defense success rate above 99%. To further strengthen collaborative training, we develop a MAB-GT device selection strategy that integrates multi-armed bandit exploration with multi-stage game-theoretic decision models, spanning non-cooperative, coalition, and repeated games, to encourage high-quality UAV nodes to provide reliable data and sustained computation. Experiments on the Modified National Institute of Standards and Technology (MNIST) dataset under both Independent and Identically Distributed (IID) and non-IID conditions demonstrate that S-HSFL maintains approximately 97% accuracy even in the presence of 30% adversarial UAVs. The MAB-GT strategy significantly improves convergence behavior and final model performance, while incurring only a 10–30% increase in communication overhead. The proposed S-HSFL framework establishes a secure, trustworthy, and efficient foundation for distributed intelligence in next-generation 6G UAV networks. Full article
23 pages, 677 KB  
Article
Hierarchical MAB Framework for Energy-Aware Beam Training for Near-Field Communications
by Yunxing Xiang, Yi Yan, Yunchao Song, Jing Gao, Xiaohui You, Jun Wang, Huibin Liang and Yixin Jiang
Sensors 2026, 26(1), 60; https://doi.org/10.3390/s26010060 - 21 Dec 2025
Viewed by 658
Abstract
For XL-MIMO multi-user frequency division duplex systems, this paper proposes a near-field beam training scheme using a two-phase combinatorial multi-armed bandit (MAB) framework. This scheme leverages the MAB framework, integrating energy-aware user scheduling and hierarchical beam training to balance communication quality and device battery level, thereby effectively enhancing system energy efficiency and extending the device’s lifespan. Specifically, in the first phase, we account for user battery levels by designing an energy-aware upper confidence bound (UCB) algorithm for user scheduling. This algorithm effectively balances exploration and exploitation, prioritizing users with higher achievable rates and sufficient battery level. In the second phase, based on the scheduled users, two UCB algorithms are employed for beam training. In the first layer, discrete Fourier transform codebook-based beam scanning is utilized, and a UCB algorithm is applied to initially acquire angle information for scheduled users. In the second layer, based on the obtained angle information, a candidate set of polar-domain codewords is constructed. Another UCB algorithm is then employed to select the optimal polar-domain codewords. The effectiveness of our scheme is confirmed by simulations, demonstrating notable achievable rate gains for multi-user communications. Full article
17 pages, 406 KB  
Article
Spectral Efficiency Beamforming Scheme for UAV MIMO Communication via Budgeted Combinatorial Multi-Armed Bandit
by Jing Gao, Yunxing Xiang, Yunchao Song, Jing Zhu, Jun Wang, Xiaohui You, Ge Wang and Tianbao Gao
Electronics 2025, 14(24), 4805; https://doi.org/10.3390/electronics14244805 - 6 Dec 2025
Viewed by 486
Abstract
Unmanned aerial vehicles (UAVs) equipped with antenna arrays can deliver high-capacity, high-throughput, and low-latency communication services. Considering a UAV-assisted mmWave multi-input and multi-output (MIMO) system, a two-stage beamforming scheme based on a budgeted combinatorial multi-armed bandit (BC-MAB) is proposed to improve the system’s spectral efficiency (SE). The pre-beamformer design problem is initially formulated as a BC-MAB problem. In this framework, the reward is the received energy, while the cost corresponds to the energy consumed by each RF chain and the budget is represented by the residual energy of the UAV. To achieve a favorable trade-off between the number of communication slots and the energy acquired per slot, a pre-beamforming scheme based on the bang-per-buck ratio is introduced to optimize the number of activated RF chains, therefore maximizing the cumulative reward. The second stage utilizes the reduced-dimensional instantaneous channel state information to design and optimize the beamformer to achieve maximum system SE. The proposed scheme achieves more than 7.1% improvement in SE compared to the benchmark schemes. Simulations validate the superiority of the proposed scheme. Full article
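The bang-per-buck idea mentioned in the abstract above can be illustrated with a generic budgeted selection rule: among the options that still fit in the remaining budget, pick the one with the largest estimated reward per unit cost. This is a minimal sketch under simplifying assumptions, namely fixed reward estimates and known positive costs. It is not the paper's pre-beamforming algorithm, and the function names are hypothetical.

```python
def bang_per_buck_choice(est_rewards, costs, budget):
    """Return the affordable arm with the highest reward/cost ratio, or None."""
    best_arm, best_ratio = None, float("-inf")
    for arm, (reward, cost) in enumerate(zip(est_rewards, costs)):
        if cost <= 0:
            raise ValueError("costs must be positive")
        if cost > budget:
            continue  # this arm no longer fits in the residual budget
        ratio = reward / cost
        if ratio > best_ratio:
            best_arm, best_ratio = arm, ratio
    return best_arm


def spend_budget(est_rewards, costs, budget):
    """Greedily play the best affordable arm until none fits.

    Returns the sequence of arms played and the total estimated reward.
    """
    plays, total = [], 0.0
    while True:
        arm = bang_per_buck_choice(est_rewards, costs, budget)
        if arm is None:
            break
        plays.append(arm)
        total += est_rewards[arm]
        budget -= costs[arm]
    return plays, total


if __name__ == "__main__":
    # Illustrative numbers only: estimated rewards, per-play costs, total budget.
    print(spend_budget([4.0, 9.0, 5.0], [2.0, 3.0, 5.0], budget=8.0))
```

In a learning setting the reward estimates would be updated after each play, for example with a confidence bound added to each estimate. They are held fixed here to keep the selection rule itself visible.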
15 pages, 1380 KB  
Article
Optimizing LoRaWAN Performance Through Learning Automata-Based Channel Selection
by Luka Aime Atadet, Richard Musabe, Eric Hitimana and Omar Gatera
Future Internet 2025, 17(12), 555; https://doi.org/10.3390/fi17120555 - 2 Dec 2025
Cited by 1 | Viewed by 649
Abstract
The rising demand for long-range, low-power wireless communication in applications such as monitoring, smart metering, and wide-area sensor networks has emphasized the critical need for efficient spectrum utilization in LoRaWAN (Long Range Wide Area Network). In response to this challenge, this paper proposes a novel channel selection framework based on Hierarchical Discrete Pursuit Learning Automata (HDPA), aimed at enhancing the adaptability and reliability of LoRaWAN operations in dynamic and interference-prone environments. HDPA leverages a tree-structure reinforcement learning model to monitor and respond to transmission success in real-time, dynamically updating channel probabilities based on environmental feedback. Simulation results conducted in MATLAB R2023b demonstrate that HDPA significantly outperforms conventional algorithms such as Hierarchical Continuous Pursuit Automata (HCPA) in terms of convergence speed, selection accuracy, and throughput performance. Specifically, HDPA achieved 98.78% accuracy with a mean convergence of 6279 iterations, compared to HCPA’s 93.89% accuracy and 6778 iterations in an eight-channel setup. Unlike the Tug-of-War-based Multi-Armed Bandit strategy, which emphasizes fairness in real-world heterogeneous networks, HDPA offers a computationally lightweight and highly adaptive solution tailored to LoRaWAN’s stochastic channel dynamics. These results position HDPA as a promising framework for improving reliability and spectrum utilization in future IoT deployments. Full article
16 pages, 3342 KB  
Article
Geoscientific Input Feature Selection for CNN-Driven Mineral Prospectivity Mapping
by Arya Kimiaghalam, Kyubo Noh and Andrei Swidinsky
Minerals 2025, 15(12), 1237; https://doi.org/10.3390/min15121237 - 23 Nov 2025
Viewed by 855
Abstract
In recent years, machine learning techniques such as convolutional neural networks have been used for mineral prospectivity mapping. Since a diverse range of geoscientific data is often available for training, it is computationally challenging to select a subset of features that optimizes model performance. Our study aims to demonstrate the effect of optimal input feature selection on convolutional neural network model performance in mineral prospectivity mapping applications. We demonstrate results from both exhaustive and algorithmic feature selection methods in the context of copper porphyry prospectivity modeling and analyze the performance and stability of optimally trained models. Using the QUEST dataset from central interior British Columbia, such a feature selection technique improves model performance by 6.8% over models that use all available features, yet consumes around 2.2% of the computational resources needed to exhaustively search for the optimal feature subset. Full article
(This article belongs to the Special Issue Feature Papers in Mineral Exploration Methods and Applications 2025)
23 pages, 1841 KB  
Article
Population-Level Analysis of Personalized Food Recommendation Using Reinforcement Learning
by Yone Tellechea, Markel Arrojo, Ander Cejudo and Cristina Martin
Foods 2025, 14(21), 3770; https://doi.org/10.3390/foods14213770 - 3 Nov 2025
Cited by 1 | Viewed by 8946
Abstract
This paper introduces an innovative methodology for optimizing recommendation strategies across different populations within the food industry. While previous approaches to recommending courses have overlooked cultural and age-based preferences, our work demonstrates how understanding these differences can significantly enhance the attractiveness for consumers and create new opportunities for marketing. By simulating diverse populations using a fuzzy logic approach, based on individual characteristics such as age, gender, geographical area, and city size, the study evaluates how recommendation algorithms perform within a generated menu database. Results show that algorithms like State–Action–Reward–State–Action (SARSA), multi-armed bandit (MAB), and Deep-Q Network (DQN) exhibit varying levels of efficiency depending on the population. Notably, the DQN improves accumulated reward over a random recommender by 71.60% for “Foodies”, 65.02% for “Veggies”, 63.46% for “Spanish”, and 8.89% for “Seniors”, while MAB achieves similar performance with fewer resources. Statistically significant differences (p < 0.005) are found in the performance of the DQN between populations, with large effect sizes according to Cliff’s delta. These findings highlight recommender systems as an opportunity to navigate market demand, optimize supply chains, and reduce food waste. A better understanding of public preferences enables more effective alignment of supply and demand across the entire food supply chain. As a conclusion, while the DQN effectively captures target group preferences, the optimum recommendation strategy should be chosen by balancing algorithmic performance, computational efficiency, and the specific requirements of the food sector. Full article
(This article belongs to the Special Issue Artificial Intelligence for the Food Industry)
35 pages, 10688 KB  
Article
Multi-Armed Bandit Optimization for Explainable AI Models in Chronic Kidney Disease Risk Evaluation
by Jianbo Huang, Long Li and Jia Chen
Symmetry 2025, 17(11), 1808; https://doi.org/10.3390/sym17111808 - 27 Oct 2025
Cited by 2 | Viewed by 1228
Abstract
Chronic kidney disease (CKD) impacts over 850 million people globally, representing a critical public health issue, yet existing risk assessment methodologies inadequately address the complexity of disease progression trajectories. Traditional machine learning approaches encounter critical limitations including inefficient hyperparameter selection and lack of clinical transparency, hindering their deployment in healthcare settings. This study introduces an innovative computational framework that integrates adaptive Multi-Armed Bandit (MAB) strategies with BorderlineSMOTE sampling techniques to improve CKD risk assessment. The proposed methodology leverages XGBoost within an ensemble learning paradigm enhanced by Upper Confidence Bound exploration strategy, coupled with a comprehensive interpretability system incorporating SHAP and LIME analytical tools to ensure model transparency. To address the challenge of algorithmic interpretability while maintaining clinical utility, a four-level risk categorization framework was developed, employing cross-validated stratification methods and balanced performance evaluation metrics, thereby ensuring fair predictive accuracy across diverse patient populations and minimizing bias toward dominant risk categories. Through rigorous empirical evaluation on clinical datasets, we performed extensive comparative analysis against sixteen established algorithms using paired statistical testing with Bonferroni correction. The MAB-optimized framework achieved superior predictive performance with accuracy of 91.8%, F1-score of 91.0%, and ROC-AUC of 97.8%, demonstrating superior performance within the evaluated cohort of reference algorithms (p-value < 0.001). Remarkably, our optimized framework delivered nearly ten-fold computational efficiency gains relative to conventional grid search methods while preserving robust classification performance. 
Feature importance analysis identified albumin-to-creatinine ratio, eGFR measurements, and CKD staging as dominant prognostic factors, demonstrating concordance with established clinical nephrology practice. This research addresses three core limitations in healthcare artificial intelligence: optimization computational cost, model interpretability, and consistent performance across heterogeneous clinical populations, offering a practical solution for improved CKD risk stratification in clinical practice. Full article