Special Issue "Feature Paper Special Issue: Reinforcement Learning"

A special issue of Stats (ISSN 2571-905X).

Deadline for manuscript submissions: 1 January 2023

Special Issue Editors

Prof. Dr. Wei Zhu
Guest Editor
Department of Applied Mathematics and Statistics, State University of New York at Stony Brook, Stony Brook, NY 11794, USA
Interests: biostatistics; quantitative finance; errors-in-variables regression (EIV); structural equation modeling (SEM); experimental design; statistical learning
Dr. Sourav Sen
Guest Editor
Machine Learning Team, Upstart Network Inc., San Carlos, CA 94070, USA
Interests: interpretable machine learning; natural language processing; computational linguistics; reinforcement learning; applied machine learning
Dr. Keli Xiao
Guest Editor
College of Business, Stony Brook University, Stony Brook, NY 11794, USA
Interests: business analytics; data mining; real estate/urban computing; economic bubbles and crises; asset pricing

Special Issue Information

Dear Colleagues,

Machine-learning methods can be classified into three general categories: unsupervised learning, supervised learning, and reinforcement learning. Of the three, reinforcement learning holds the promise of creating artificial intelligence that can surpass human capacity in specialized tasks such as gaming (e.g., Google's AlphaGo, AlphaGo Zero, and AlphaZero) or research (e.g., Google's AlphaFold).

Reinforcement learning spans diverse academic areas, including statistics, operations research, computational mathematics, and computer science. We are organizing this Special Issue to help promote the development of this research frontier, and we are honored to feature incoming papers from leaders in the field, including Professor Richard Sutton, co-author of the first textbook in the field [1], and Professor Jiaqiao Hu, co-developer of the Monte Carlo tree search algorithm [2] that lies behind the success of Google's AlphaGo. We welcome colleagues from all related fields to contribute to this Special Issue as authors and/or Guest Editors. Accepted papers will be published sequentially without delay and without a publication fee, to help advance this important research topic.

[1] Sutton, Richard S.; Barto, Andrew G. (2018). Reinforcement Learning: An Introduction (2nd ed.). MIT Press. ISBN 978-0-262-03924-6.

[2] Chang, Hyeong Soo; Fu, Michael C.; Hu, Jiaqiao; Marcus, Steven I. (2005). "An Adaptive Sampling Algorithm for Solving Markov Decision Processes". Operations Research. 53: 126–139.

Procedure

All submissions will be rigorously reviewed according to the Stats journal guidelines.

Authors of manuscripts that are not suitable for this Special Issue will be notified as soon as possible, after consultation with the Editorial Board Members. Authors of these manuscripts may still consider submitting to another Special Issue or as a regular paper. All other manuscripts will be forwarded for review.

Authors of manuscripts that are not selected as feature papers will be notified after the first round of reviews; the selection will be based on the review reports. These authors may still decide to revise their work and submit it as a regular paper in Stats. Please note that, in this case, the authors will need to cover the publication fee.

The remaining manuscripts will be sent for a second round of reviews. However, this does not guarantee that a manuscript in the second round will be published as a feature paper; we will still seek comments and suggestions from the reviewers.

Prof. Dr. Wei Zhu
Dr. Sourav Sen
Dr. Keli Xiao
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Stats is an international peer-reviewed open access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1200 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • reinforcement learning
  • dynamic programming
  • Markov decision process
  • multi-agent system

Published Papers (2 papers)


Research

Article
Deriving the Optimal Strategy for the Two Dice Pig Game via Reinforcement Learning
Stats 2022, 5(3), 805-818; https://doi.org/10.3390/stats5030047 - 17 Aug 2022
Abstract
Games of chance have historically played a critical role in the development and teaching of probability theory and game theory, and, in the modern age, computer programming and reinforcement learning. In this paper, we derive the optimal strategy for playing the two-dice game Pig, both the standard version and its variant with doubles, coined “Double-Trouble”, using certain fundamental concepts of reinforcement learning, especially the Markov decision process and dynamic programming. We further compare the newly derived optimal strategy to other popular play strategies in terms of the winning chances and the order of play. In particular, we compare to the popular “hold at n” strategy, which is considered to be close to the optimal strategy, especially for the best n, for each type of Pig Game. For the standard two-player, two-dice, sequential Pig Game examined here, we found that “hold at 23” is the best choice, with the average winning chance against the optimal strategy being 0.4747. For the “Double-Trouble” version, we found that “hold at 18” is the best choice, with the average winning chance against the optimal strategy being 0.4733. Furthermore, time in terms of turns to play each type of game is also examined for practical purposes. For optimal vs. optimal or optimal vs. the best “hold at n” strategy, we found that the average number of turns is 19, 23, and 24 for one-die Pig, standard two-dice Pig, and the “Double-Trouble” two-dice Pig games, respectively. We hope our work will inspire students of all ages to invest in the field of reinforcement learning, which is crucial for the development of artificial intelligence and robotics and, subsequently, for the future of humanity.
(This article belongs to the Special Issue Feature Paper Special Issue: Reinforcement Learning)
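The optimal play described in the abstract drops out of a standard dynamic-programming treatment of the game's Markov decision process. The sketch below is not the paper's code: it applies value iteration to the simpler one-die Pig game (rolling 2-6 adds to the turn total; rolling a 1 forfeits it), with the target score reduced from 100 to 25 so it converges in seconds, and GOAL, win_prob, and action_values are illustrative names of our own.

```python
# Value-iteration sketch for ONE-die Pig, a simplification of the
# two-dice variants analyzed in the paper: rolling 2-6 adds to the turn
# total; rolling a 1 ends the turn with nothing banked.

GOAL = 25   # lowered from the usual 100 so the sketch runs quickly
EPS = 1e-9

# P[(i, j, k)] = win probability for the player to move, holding score i,
# opponent at score j, with k points accumulated so far this turn.
P = {(i, j, k): 0.5
     for i in range(GOAL) for j in range(GOAL) for k in range(GOAL - i)}

def win_prob(i, j, k):
    """Look up a state, treating reached-goal states as certain wins."""
    return 1.0 if i + k >= GOAL else P[(i, j, k)]

def action_values(i, j, k):
    """Return (hold_value, roll_value); holding with k == 0 is pointless."""
    hold = 1.0 - win_prob(j, i + k, 0) if k > 0 else 0.0
    roll = (1.0 - win_prob(j, i, 0)) / 6.0   # rolled a 1: turn is lost
    for r in range(2, 7):                    # rolled 2-6: keep building
        roll += win_prob(i, j, k + r) / 6.0
    return hold, roll

# Sweep all states until the win probabilities stop changing.
delta = 1.0
while delta > EPS:
    delta = 0.0
    for (i, j, k) in P:
        new = max(action_values(i, j, k))
        delta = max(delta, abs(new - P[(i, j, k)]))
        P[(i, j, k)] = new

# The optimal policy falls out of the values: roll exactly when rolling
# has the higher win probability than holding.
hold_v, roll_v = action_values(10, 15, 6)
print("P(win) from the start:", round(P[(0, 0, 0)], 4))
print("Roll at (10, 15, 6)?", roll_v > hold_v)
```

The same fixed-point argument extends to the two-dice rules; only the transition probabilities inside action_values change.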

Article
Quantitative Trading through Random Perturbation Q-Network with Nonlinear Transaction Costs
Stats 2022, 5(2), 546-560; https://doi.org/10.3390/stats5020033 - 10 Jun 2022
Cited by 2
Abstract
In recent years, reinforcement learning (RL) has seen increasing applications in the financial industry, especially in quantitative trading and portfolio optimization when the focus is on the long-term reward rather than short-term profit. Sequential decision making and Markov decision processes are well suited for this type of application. Through trial and error based on historical data, an agent can learn the characteristics of the market and evolve an algorithm to maximize the cumulative returns. In this work, we propose a novel RL trading algorithm utilizing random perturbation of the Q-network and accounting for the more realistic nonlinear transaction costs. In summary, we first design a new near-quadratic transaction cost function that considers slippage. Next, we develop a convolutional deep Q-learning network (CDQN) with multiple price inputs based on this cost function. We further propose a random perturbation (rp) method to modify the learning network and resolve the instability intrinsic to the deep Q-learning network. Finally, we use this newly developed CDQN-rp algorithm to make trading decisions based on the daily stock prices of Apple (AAPL), Meta (FB), and Bitcoin (BTC), and demonstrate its strengths over other quantitative trading methods.
(This article belongs to the Special Issue Feature Paper Special Issue: Reinforcement Learning)
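Two ingredients named in the abstract, a near-quadratic transaction cost and random perturbation of the Q-network's weights, can be illustrated with a short sketch. The functional form, the parameter names (fee, impact, sigma), and the perturb_weights helper below are hypothetical choices of our own for illustration, not the paper's specification, which should be taken from the article itself.

```python
import numpy as np

def transaction_cost(delta_shares, price, fee=1e-3, impact=1e-8):
    """One plausible near-quadratic cost: a linear commission plus a
    quadratic slippage/market-impact term that grows with trade size.
    The coefficients here are arbitrary placeholders."""
    notional = abs(delta_shares) * price
    return fee * notional + impact * notional ** 2

def perturb_weights(weights, sigma=0.01, rng=None):
    """Random perturbation of Q-network parameters: add small Gaussian
    noise to every weight array, one generic way to counter the
    instability of a deterministic deep Q-learning update."""
    rng = rng if rng is not None else np.random.default_rng()
    return [w + sigma * rng.standard_normal(w.shape) for w in weights]

# Example: cost of trading 500 shares at $150, then noising the weights
# of a toy two-layer network.
print(transaction_cost(500, 150.0))            # commission + slippage
layers = [np.zeros((4, 8)), np.zeros((8, 1))]
noisy = perturb_weights(layers, sigma=0.05)
print(noisy[0].shape, noisy[1].shape)
```

In a full agent, the perturbed copies of the network would be evaluated (or trained against) inside the usual Q-learning loop; the quadratic term makes large position changes disproportionately expensive, which pushes the learned policy toward smaller, smoother trades.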
