Advances in Intelligent Decision-Making Systems

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 30 December 2026 | Viewed by 1635

Special Issue Editors


Dr. Jiajia Zhang
Guest Editor
Department of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, China
Interests: computer decision-making; multi-agent; swarm intelligence; computer games

Prof. Dr. Shuhan Qi
Guest Editor
Department of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, China
Interests: computer game; multi-agent RL; multimedia information processing

Prof. Dr. Grigorios Beligiannis
Guest Editor
Department of Business Administration of Food and Agricultural Enterprises, University of Patras, 30100 Agrinio, Greece
Interests: artificial intelligence; computational intelligence; machine learning; genetic/evolutionary algorithms; decision support theory; intelligent information systems; multi-objective optimization

Special Issue Information

Dear Colleagues,

Recent years have seen accelerated progress in artificial intelligence. Within this field, computational games occupy a pivotal position as foundational testbeds for AI development. Imperfect-information games are particularly critical, as their inherent hidden information effectively models real-world complexity, presenting greater challenges and research value. Insights derived from these studies directly facilitate the advancement of intelligent systems for domains such as smart transportation, financial risk control, and autonomous driving.

A core research focus is intelligent decision-making under uncertainty. This requires several key capabilities: perceiving ambiguous or incomplete information in dynamic environments, accurately modeling opponents and context despite partial observability, and deriving efficient, robust decision policies. Progress depends on novel methods for uncertainty quantification, opponent intention prediction, and real-time adaptation, all of which are essential for deploying reliable AI in stochastic settings.

Autonomous driving is a representative example: systems must perceive complex scenarios, infer the hidden intentions of other agents, and execute safe real-time maneuvers under uncertainty. Breakthroughs in handling such uncertainty within imperfect-information games yield frameworks that transfer to other intelligent decision-making systems. These advances drive innovation in domains where adaptive strategies under partial information are imperative.

Dr. Jiajia Zhang
Prof. Dr. Shuhan Qi
Prof. Dr. Grigorios Beligiannis
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • decision-making
  • imperfect-information games
  • opponent modeling
  • real-time adaptation strategies
  • machine learning
  • reinforcement learning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (5 papers)

Research

16 pages, 7605 KB  
Article
Decision of Nonsynchronous Framework: Agents in MARL Have Different Priorities While Making Decisions
by Shanghui Xie, Junyang Zhao, Jiajia Zhang and Lei Wang
Appl. Sci. 2026, 16(11), 5202; https://doi.org/10.3390/app16115202 - 22 May 2026
Abstract
Multi-Agent Reinforcement Learning (MARL) faces key challenges in credit assignment and the curse of dimensionality as agent numbers grow. In cooperative settings, uniform treatment of agents often exacerbates these issues. We argue that an agent’s importance depends on its personalized attributes and environment states and propose concentrating computational resources on key agents while others act simply, alleviating dimensionality explosion and improving generalization. We propose the Decision of Nonsynchronous Framework (DNF), which identifies and prioritizes key agents at each time step for optimized decision-making, while assigning predefined or simplified behaviors to the remaining agents based on computational outcomes. To realize this, we introduce a Core Extractor (CE) architecture that categorizes agents into Priorities Key Agents (PKAs) and followers. Although agents are differentiated by priority, we still adhere to the Centralized Training with Decentralized Execution (CTDE) paradigm. This approach reduces the dimensionality of the joint state-action space, mitigates the dimensionality explosion problem in MARL, and fosters improved collaboration among agents. Experimental results demonstrate that DNF achieves a 100% win rate on multiple SMAC maps, including 3m, 2s3z, and 1c3s5z, and achieves 98.9–100% win rates on challenging hard and super-hard scenarios such as 2c_vs_64zg and Corridor, significantly outperforming baseline methods like QMIX and QPLEX in both final performance and training stability, while incurring only a modest increase in computational overhead. In the continuous MPE, DNF matches or exceeds HAPPO in performance and demonstrates substantially higher time efficiency, with both advantages growing more pronounced as the number of agents increases.
(This article belongs to the Special Issue Advances in Intelligent Decision-Making Systems)

26 pages, 3759 KB  
Article
Prediction-Regularized Spatio-Temporal Transformer Framework for Offline Multi-Intersection Traffic Signal Control
by Yueting Deng, Huale Li, Tong Xia, Zhaobin Wang and Ruoming Lei
Appl. Sci. 2026, 16(10), 5156; https://doi.org/10.3390/app16105156 - 21 May 2026
Abstract
Multi-intersection traffic signal control must jointly address local coordination and delayed traffic propagation under strongly time-varying conditions. Existing offline sequence-imitation methods mainly recover actions from historical trajectories and make limited use of short-term future traffic evolution in shared-representation learning. To address this issue, we propose PR-STLight, a prediction-regularized spatio-temporal extension of TransformerLight for offline multi-intersection traffic signal control. PR-STLight introduces short-term future inbound-queue evolution as structural supervision for shared representation learning. The model combines neighborhood-constrained spatial self-attention, causal temporal self-attention, and a Topology-Recurrent Queue Predictor (TRQP) to capture topology-aware spatio-temporal dependencies and near-future congestion dynamics. Training adopts a two-stage strategy, namely queue-prediction pretraining followed by joint control-prediction optimization, to improve optimization stability on a fixed offline replay buffer. In experiments on the adopted CityFlow benchmarks, PR-STLight obtains average travel times of 274.39 s on Jinan 3×4 and 288.09 s on Hangzhou 4×4, corresponding to 1.14% and 2.82% lower travel times than the strongest non-PR baseline, and 21.27% and 22.54% lower travel time than the TransformerLight backbone, respectively. It also achieves the lowest average inbound queue on Hangzhou and remains competitive on Jinan. These results show that PR-STLight provides an effective offline spatio-temporal sequence framework for coordinated multi-intersection signal control.
(This article belongs to the Special Issue Advances in Intelligent Decision-Making Systems)

27 pages, 6447 KB  
Article
Active Distribution Network Voltage Control with a Physics-Informed Spatiotemporal Attention Network
by Tong Xia, Huale Li, Yueting Deng, Zetao Lin and Lei Wang
Appl. Sci. 2026, 16(10), 5109; https://doi.org/10.3390/app16105109 - 20 May 2026
Abstract
Active voltage control (AVC) in active distribution networks coordinates the reactive power outputs of distributed inverters to maintain bus voltages within secure limits. Although multi-agent reinforcement learning (MARL) shows promise for AVC, current methods face three main limitations: graph topologies rely on unweighted adjacency, ignoring physical parameters like line impedance and electrical distance; centralized critics output a single global Q-value, leading to coarse spatial credit assignment; and temporal critic modules suffer from vanishing gradients and representation drift. To address these issues, we propose physics-informed spatiotemporal multi-agent value learning (PST-MA), a physics-informed spatiotemporal value-learning framework integrating three coupled designs: a physics-informed graph attention mechanism with electrical-distance-aware sparsification; node-conditional value outputs utilizing a replicated-graph diagonal-extraction strategy; and a temporal latent compression module featuring a gated bypass and late action fusion. Experiments on the IEEE 33-bus and 141-bus systems validate the effectiveness of the proposed PST-MA method. Results demonstrate that it consistently achieves a higher controllable ratio than baseline methods for coordinated voltage regulation under uncertainty.
(This article belongs to the Special Issue Advances in Intelligent Decision-Making Systems)

30 pages, 4413 KB  
Article
Dotsformer: Capturing Chain-Loop Structures for Transformer in Dots-and-Boxes
by Ranran Zhang, Changming Xu, Kuo Wu, Mingze Zheng, Xingcan Liu and Junwei Wang
Appl. Sci. 2026, 16(7), 3395; https://doi.org/10.3390/app16073395 - 31 Mar 2026
Viewed by 539
Abstract
In many board games, AlphaZero has demonstrated superhuman abilities. Dots-and-Boxes is a classic board game with simple rules but requiring skills to win. This paper proposes Dotsformer, which extracts chain-loop structures from the game board. These structures connect distant boxes, providing long-range relational information as input to the Transformer. We employ multiple convolutional kernels to generate Q, K, and V, and incorporate information about the box structure itself into the attention scores. We also incorporate auxiliary training tasks, including an initiative task and a classification task. These tasks determine whether to retain or relinquish the initiative in the current situation, and classify actions into forbidden, conceding, safe, and scoring moves. They provide additional supervisory signals and accelerate learning. The experimental results show that Dotsformer outperforms AlphaZero in both rollback speed and playing strength: it achieved a winning rate of 87.6% and an ELO rating lead of 340 points against the baseline. Additionally, ablation studies verify the effectiveness of each key module.
(This article belongs to the Special Issue Advances in Intelligent Decision-Making Systems)
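The two headline figures in the Dotsformer abstract (an 87.6% winning rate and a 340-point ELO lead) can be cross-checked against each other. The sketch below assumes the standard logistic Elo model with a 400-point scale. The abstract does not say how the rating lead was computed, so this is an illustration rather than the authors' procedure, and both function names are hypothetical.

```python
import math


def elo_expected_score(rating_lead: float) -> float:
    """Expected score of the stronger player under the logistic Elo model.

    Assumption: standard 400-point scale, E = 1 / (1 + 10^(-lead / 400)).
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_lead / 400.0))


def elo_lead_from_score(score: float) -> float:
    """Inverse mapping: rating lead implied by an expected score in (0, 1)."""
    return 400.0 * math.log10(score / (1.0 - score))


# Figures quoted in the abstract: 340-point lead, 87.6% winning rate.
print(f"expected score at +340: {elo_expected_score(340):.3f}")
print(f"lead implied by 87.6%:  {elo_lead_from_score(0.876):.1f}")
```

Under this assumed model, a 340-point lead corresponds to an expected score of roughly 0.876, so the two reported numbers are consistent with each other.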

26 pages, 972 KB  
Article
Constructing Non-Markovian Decision Process via History Aggregator
by Yongyi Wang, Lingfeng Li and Wenxin Li
Appl. Sci. 2026, 16(2), 955; https://doi.org/10.3390/app16020955 - 16 Jan 2026
Viewed by 419
Abstract
In the domain of algorithmic decision-making, non-Markovian dynamics manifest as a significant impediment, especially for paradigms such as Reinforcement Learning (RL), thereby exerting far-reaching consequences on the advancement and effectiveness of the associated systems. Nevertheless, the existing benchmarks are deficient in comprehensively assessing the capacity of decision algorithms to handle non-Markovian dynamics. To address this deficiency, we have devised a generalized methodology grounded in category theory. Notably, we established the category of Markov Decision Processes (MDP) and the category of non-Markovian Decision Processes (NMDP), and proved the equivalence relationship between them. This theoretical foundation provides a novel perspective for understanding and addressing non-Markovian dynamics. We further introduced non-Markovianity into decision-making problem settings via the History Aggregator for State (HAS). With HAS, we can precisely control the state dependency structure of decision-making problems in the time series. Our analysis demonstrates the effectiveness of our method in representing a broad range of non-Markovian dynamics. This approach facilitates a more rigorous and flexible evaluation of decision algorithms by testing them in problem settings where non-Markovian dynamics are explicitly constructed.
(This article belongs to the Special Issue Advances in Intelligent Decision-Making Systems)