Open Access Article
Sensors 2012, 12(3), 2632-2653; doi:10.3390/s120302632

Intelligent Control of a Sensor-Actuator System via Kernelized Least-Squares Policy Iteration

B. Liu 1,2,†, S. Chen 1,†,*, S. Li 3 and Y. Liang 1
1 Key Lab of Visual Media Processing and Transmission, Shenzhen Institute of Information Technology, Shenzhen, Guangdong 518029, China
2 Department of Computer Science, University of Massachusetts, Amherst, MA 01003, USA
3 Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030, USA
† These authors contributed equally to this work.
* Author to whom correspondence should be addressed.
Received: 3 January 2012 / Revised: 20 January 2012 / Accepted: 2 February 2012 / Published: 28 February 2012
(This article belongs to the Section Physical Sensors)

Abstract

This paper proposes a new framework, Compressive Kernelized Reinforcement Learning (CKRL), for computing near-optimal policies in sequential decision making under uncertainty. CKRL combines non-adaptive, data-independent Random Projections with nonparametric Kernelized Least-Squares Policy Iteration (KLSPI). Random Projections are a fast, non-adaptive dimensionality reduction framework in which high-dimensional data is projected onto a random lower-dimensional subspace via spherically random rotation and coordinate sampling. KLSPI introduces the kernel trick into the LSPI framework for Reinforcement Learning, often achieving faster convergence and providing automatic feature selection via various kernel sparsification approaches. In the proposed approach, policies are computed in a low-dimensional subspace generated by projecting the high-dimensional features onto a random basis. We first show that Random Projections constitute an efficient sparsification technique and that our method often converges faster than regular LSPI at lower computational cost. The theoretical foundation of this approach is a fast approximation of the Singular Value Decomposition (SVD). Finally, simulation results on benchmark MDP domains confirm gains in both computation time and performance in large feature spaces.
Keywords: Markov Decision Process; sensor-actuator systems; Random Projections; Kernelized Least-Squares Policy Iteration
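
To make the projection step described in the abstract concrete, the following is a minimal sketch of a random projection of high-dimensional feature vectors onto a lower-dimensional subspace. It is not the authors' implementation: it substitutes a dense Gaussian projection matrix for the spherically random rotation and coordinate sampling named in the abstract, and it does not include the KLSPI policy computation. The dimensions `d` and `k`, the feature vectors `phi_a` and `phi_b`, and all function names are hypothetical and chosen for illustration only.

```python
import math
import random


def gaussian_projection_matrix(d, k, rng):
    """Return a k x d matrix with i.i.d. N(0, 1/k) entries (assumed scheme)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, x):
    """Map a d-dimensional feature vector x to k dimensions."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in matrix]


def distance(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


rng = random.Random(0)
d, k = 1000, 200  # hypothetical original and reduced feature dimensions

# Two hypothetical high-dimensional feature vectors.
phi_a = [rng.uniform(-1.0, 1.0) for _ in range(d)]
phi_b = [rng.uniform(-1.0, 1.0) for _ in range(d)]

# The projection is data-independent: P is drawn without looking at the features.
P = gaussian_projection_matrix(d, k, rng)
z_a, z_b = project(P, phi_a), project(P, phi_b)

# Ratio of projected to original distance; values near 1 mean little distortion.
ratio = distance(z_a, z_b) / distance(phi_a, phi_b)
print(f"projected dimension: {len(z_a)}, distance ratio: {ratio:.3f}")
```

With the 1/sqrt(k) scaling, the distance between the projected vectors is expected to stay close to the original distance, which is the property that allows a policy to be computed in the reduced feature space instead of the full one.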
This is an open access article distributed under the Creative Commons Attribution License (CC BY 3.0).


MDPI and ACS Style

Liu, B.; Chen, S.; Li, S.; Liang, Y. Intelligent Control of a Sensor-Actuator System via Kernelized Least-Squares Policy Iteration. Sensors 2012, 12, 2632-2653.


Sensors, EISSN 1424-8220, published by MDPI AG, Basel, Switzerland.