Comparative Evaluation of Bandit-Style Heuristic Policies for Moving Target Detection in a Linear Grid Environment

Kang, Hyunmin; Ahn, Minho; Seo, Yongduek

doi:10.3390/s26010226

This is an early access version, the complete PDF, HTML, and XML versions will be available soon.

Open AccessArticle

Comparative Evaluation of Bandit-Style Heuristic Policies for Moving Target Detection in a Linear Grid Environment

by

Hyunmin Kang

^1,2

,

Minho Ahn

^3,4 and

Yongduek Seo

^2,*

¹

Digital Healthcare Center, Gumi Electronics & Information Technology Research Institute, Gumi 39253, Republic of Korea

²

Department of Artificial Intelligence, Sogang University, Seoul 04107, Republic of Korea

³

Artificial Intelligence Laboratory, Konan Technology Inc., Seoul 06627, Republic of Korea

⁴

Department of Defense Acquisition, Konkuk University, Seoul 05209, Republic of Korea

^*

Author to whom correspondence should be addressed.

Sensors 2026, 26(1), 226; https://doi.org/10.3390/s26010226 (registering DOI)

Submission received: 27 November 2025 / Revised: 23 December 2025 / Accepted: 25 December 2025 / Published: 29 December 2025

(This article belongs to the Special Issue Multi-Sensor Technology for Tracking, Positioning and Navigation)

Download Versions Notes

Abstract

Moving-target detection under strict sensing constraints is a recurring subproblem in surveillance, search-and-rescue, and autonomous robotics. We study a canonical one-dimensional finite grid in which a sensor probes one location per time step with binary observations while the target follows reflecting random-walk dynamics. The objective is to minimize the expected time to detection using transparent, training-free decision rules defined on the belief state of the target location. We compare two belief-driven heuristics with purely online implementation: a greedy rule that always probes the most probable location and a belief-proportional sampling (BPS, probability matching) rule that samples sensing locations according to the belief distribution (i.e., posterior probability of the target location). Repeated Monte Carlo simulations quantify the exploitation–exploration trade-off and provide a self-comparison between the two policies. Across tested grid sizes, the greedy policy consistently yields the shortest expected time to detection, improving by roughly 17–20% over BPS and uniform random probing in representative settings. BPS trades some average efficiency for stochastic exploration, which can be beneficial under model mismatch. This study provides an interpretable baseline and quantitative reference for extensions to noisy sensing and higher-dimensional search.

Keywords: moving target detection; partially observable Markov decision process (POMDP); restless-bandit motivation (RMAB-inspired); belief-proportional sampling (BPS; probability matching); belief state; reinforcement-learning–inspired heuristics; expected time to detection (ETTD)

Share and Cite

MDPI and ACS Style

Kang, H.; Ahn, M.; Seo, Y. Comparative Evaluation of Bandit-Style Heuristic Policies for Moving Target Detection in a Linear Grid Environment. Sensors 2026, 26, 226. https://doi.org/10.3390/s26010226

AMA Style

Kang H, Ahn M, Seo Y. Comparative Evaluation of Bandit-Style Heuristic Policies for Moving Target Detection in a Linear Grid Environment. Sensors. 2026; 26(1):226. https://doi.org/10.3390/s26010226

Chicago/Turabian Style

Kang, Hyunmin, Minho Ahn, and Yongduek Seo. 2026. "Comparative Evaluation of Bandit-Style Heuristic Policies for Moving Target Detection in a Linear Grid Environment" Sensors 26, no. 1: 226. https://doi.org/10.3390/s26010226

APA Style

Kang, H., Ahn, M., & Seo, Y. (2026). Comparative Evaluation of Bandit-Style Heuristic Policies for Moving Target Detection in a Linear Grid Environment. Sensors, 26(1), 226. https://doi.org/10.3390/s26010226

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Comparative Evaluation of Bandit-Style Heuristic Policies for Moving Target Detection in a Linear Grid Environment

Abstract

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI