Enhancing Portfolio Performance through Financial Time-Series Decomposition-Based Variational Encoder-Decoder Data Augmentation

Abstract: The objective of portfolio diversification is to reduce risk and potentially enhance returns by spreading investments across different asset classes. Existing portfolio diversification models have traditionally been trained on historical financial time series data. However, several issues arise with historical financial time series data, making it challenging to train models effectively to achieve the portfolio diversification objective: an insufficient amount of training data and the uncertainty deficiency problem, wherein the uncertainty that existed in the past is not visible in the present. Insufficient datasets, characterized by small data size, result in information asymmetry and compromise portfolio performance. This limitation underscores the importance of adopting a pattern-centric data augmentation approach, capable of unveiling hidden patterns and structures within the financial time series data. To address these challenges, this paper introduces the financial time series decomposition-based variational encoder-decoder (FED) method to augment financial time series data, overcoming the limitations of insufficient training data and providing a more realistic and dynamic simulation of the financial market environment. By decomposing the data into distinct components, such as trend, dispersion, and residual, FED leverages pattern-centric data augmentation within the financial time series data. In the environment generated using the FED method, this paper proposes a two-class portfolio diversification, called FED2Port. It integrates stochastic elements into the reward function, enabling a reinforcement learning algorithm to learn from a comprehensive spectrum of financial market uncertainties. The experimental results demonstrate that the proposed model significantly enhances portfolio performance.


Introduction
Financial investments involve a trade-off between risk and return. Higher potential returns usually come with higher risks. A diversified portfolio is an investment strategy that involves spreading investments across different asset classes. Large-scale funds, such as national pensions worldwide, invest in a diverse range of assets. In most countries, equities and bills and bonds were the two main asset classes in which pension capital was invested in 2020, accounting for more than half of the investment in 35 out of 38 OECD countries and four reporting non-OECD G20 jurisdictions [1]. The Melbourne Mercer Global Pension Index (MMGPI) considers a split between growth and defensive assets [2]. Growth assets typically include high-risk assets, such as equities, property, and some alternative assets. On the other hand, defensive assets include low-risk assets, such as bills and bonds, as well as cash and deposits.
The present study classifies financial assets into two broad categories based on their inherent characteristics and the level of associated risk: high-risk and low-risk assets. This classification is similar to the MMGPI categorization, with growth representing high-risk and defensive representing low-risk. Such categorization assists investors in making well-informed portfolio decisions, balancing risk tolerance and investment goals. Investments in high-risk assets can offer significantly large returns, making them attractive to investors seeking aggressive growth, but they also come with a higher likelihood of losses. On the other hand, investments in low-risk assets are often considered safer for preserving capital and generating modest, consistent returns. Two-class portfolio diversification involves spreading investments between these two classes of assets to reduce the overall portfolio risk.
A buy-and-hold strategy is a long-term investment approach where an investor buys assets and holds onto them for an extended period, regardless of short-term market fluctuations. Portfolio rebalancing is the process of periodically adjusting the weights of assets in a portfolio. The tangency portfolio from Markowitz optimization [3], risk budgeting [4,5], recurrent reinforcement learning (RRL) [6,7], and deep deterministic policy gradient (DDPG) [8,9] each aim to find the optimal proportion of assets within a given period. Traditional portfolio diversification models [3-5] aim to optimize the allocation of assets in a portfolio to balance risk and return. Markowitz optimization [3] provides a mathematical approach for constructing an investment portfolio that maximizes the expected return for a given level of risk or minimizes the risk for a given expected return. Risk budgeting [4,5] involves allocating risk across different assets or asset classes based on predefined risk constraints. This strategy aims to control and manage the portfolio risk effectively. Reinforcement learning (RL) portfolio diversification models [6-11] make decisions by interacting with an environment to maximize a cumulative reward signal. While RRL [6,7,10] aims to learn the optimal policy by maximizing the reward functions, DDPG [8,9] achieves this goal by adjusting the parameters of the actor and critic networks iteratively using optimization techniques.
Existing portfolio diversification models share a common shortcoming: they are trained using only historical financial time series data. However, historical financial time series data have the following problems.

• Uncertainty deficiency. Both the financial market and its empirical time series data contain inherent uncertainty. At some point, probabilities were assigned to different events or market scenarios, including rises, falls, and magnitudes of changes, with nonzero probabilities. On the other hand, as time elapses, all past events collapse into a single outcome. Consequently, only one event is assigned a 100% probability, and the probabilities of all other events are set to 0%. This phenomenon, termed uncertainty deficiency, suggests that historical financial time series data only represent a sequence of singular events, lacking the diversity of market uncertainties that existed in the past. Ignoring financial market uncertainty can lead to overly confident models that fail to account for unforeseen risks. RL algorithms or traditional models optimized solely based on historical financial time series data may lack robustness and show poor capability when applied to novel or extreme events.

• Insufficient amount of training data. Historical financial time series datasets are often not large enough for training due to financial market uncertainty. For example, even with 10 years of daily data for an asset class (250 trading days in a year × 10 years = 2500), the amount is relatively small, only 2.5k. Insufficient datasets, characterized by small data size, result in information asymmetry and compromise portfolio performance.
These problems make it difficult to achieve good results in the face of future uncertainty. A financial time series decomposition-based variational encoder-decoder (FED) data augmentation method is proposed to address the challenges of financial market uncertainty and insufficient training data, providing a more realistic and dynamic simulation of the financial market environment. Under the environment generated by FED, this paper proposes a two-class portfolio diversification (FED2Port), allowing the RL algorithm to learn from a comprehensive spectrum of financial market uncertainties.
The main contributions of this paper are as follows.
• FED for Financial Time Series Data Augmentation. The first contribution introduces an innovative financial time series data augmentation method called FED. Generating nonstationary financial time series data is challenging, and FED addresses this challenge by leveraging decomposition techniques, separating the financial time series into distinct components (trend, dispersion, and residual). Based on the encoder-decoder architecture, the FED method utilizes latent variables that are further decomposed into components. This pattern-centric approach provides a profound understanding of the underlying structure of financial time series data, unveiling the hidden patterns or structures and offering insights into factors influencing observed trends and fluctuations. FED captures the distributions of latent variable components, generating more realistic financial time series data. In doing so, the FED method revives some of the past uncertainty that had disappeared, compensating for the problems of uncertainty deficiency and an insufficient amount of training data.

• FED2Port for Decision-Making under Financial Market Uncertainty. The second contribution is the proposal of FED2Port as a novel diversification approach to enhance the efficiency of RL algorithms. Specifically tailored for RL portfolio diversification models, FED2Port addresses the uncertainty deficiency problem inherent in historical financial time series data. FED2Port trains the RL algorithm under the financial market environment generated using FED. This environment simulation incorporates stochastic elements in the reward function, enabling the algorithm to learn from a more comprehensive spectrum of financial market uncertainties. Therefore, FED2Port improves the adaptability of the algorithm significantly, empowering it to make well-informed decisions in the face of future uncertainty, ultimately enhancing portfolio performance.

Related Work
Financial time series data generation plays a significant role in RL portfolio diversification models by addressing the challenges of financial market uncertainty and enhancing portfolio performance. Simulating the financial market environment with additional scenarios and variations can help improve the robustness of portfolio diversification. This ensures that the RL algorithm is exposed to a broader range of market conditions. The two most prominent types of generative models are generative adversarial nets (GANs) [12,13] and variational autoencoders (VAEs) [14,15].
GANs usually generate more realistic data but face training stability and sampling diversity challenges. GANs are based on a two-player minimax game with value function $V(G, D)$:
$$\min_G \max_D V(G, D) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))],$$
where $z$, $G$, and $D$ are random noise, a generator, and a discriminator, respectively. GANs involve training a generator and a discriminator in a competitive setting, which can sometimes lead to training instabilities, mode collapse, or difficulties in convergence. Time-series GAN (TimeGAN) [16] and real-world time series GAN (RTSGAN) [17] are designed to generate synthetic data that closely resemble real-world time series data. In TimeGAN, the generator produces embeddings, and the recovery network produces time series data based on the generated embeddings. RTSGAN shares similarities with TimeGAN but sets itself apart by specializing in generating time series data with variable lengths. Neither model addresses the generation of nonstationary financial time series data.
VAEs often exhibit more stable training dynamics compared to GANs. VAEs explicitly model the generative process by assuming a specific form for the latent variable distribution. This can be advantageous in scenarios where understanding the generative process is crucial. VAEs aim to maximize the probability of the generated output with respect to the input and produce an output from a target distribution by compressing the input into a latent space. VAEs can learn via maximum likelihood using a variational approach to maximize the evidence lower bound (ELBO) as follows:
$$\log p_\theta(x) \geq \mathbb{E}_{q_\phi(z|x)}[\log p_\theta(x|z)] - D_{KL}(q_\phi(z|x) \,\|\, p(z)),$$
where $q_\phi(z|x)$ is an approximate posterior distribution for the latent variables, also known as a probabilistic encoder; $p(z)$ is a prior over the latent variables; and $p_\theta(x|z)$ is a likelihood function, also known as a probabilistic decoder.
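As a minimal sketch of how the ELBO can be estimated, the following assumes a diagonal Gaussian encoder, a standard normal prior, and a unit-variance Gaussian decoder. These choices, the function names, and the identity decoder are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, diag(exp(log_var))) || N(0, I))."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def gaussian_log_likelihood(x, mean):
    """log N(x; mean, I), the assumed decoder likelihood."""
    squared_error = sum((a - b) ** 2 for a, b in zip(x, mean))
    return -0.5 * squared_error - 0.5 * len(x) * math.log(2.0 * math.pi)


def elbo_estimate(x, mu, log_var, decode, rng, n_samples=16):
    """Monte Carlo estimate of E_q[log p(x|z)] - KL(q(z|x) || p(z))."""
    reconstruction = 0.0
    for _ in range(n_samples):
        z = reparameterize(mu, log_var, rng)
        reconstruction += gaussian_log_likelihood(x, decode(z))
    return reconstruction / n_samples - kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)

    def identity_decoder(z):
        # Hypothetical decoder used only for illustration.
        return z

    print(elbo_estimate([0.1, -0.2], [0.1, -0.2], [-2.0, -2.0],
                        identity_decoder, rng))
```

The reparameterized sample keeps the estimate differentiable with respect to the encoder outputs in a real autodiff framework; here it only illustrates the two terms of the bound.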
Time series decomposition [18-26] aims to decompose a time series into its components structurally and interpretably. These components typically include the trend, seasonal, and residual components. The trend component represents the long-term direction or underlying movement in the financial time series data, capturing the overall trajectory, which can be either linear or nonlinear. It helps identify whether the financial time series data generally increases or decreases over time. The seasonal component captures the periodic patterns and fluctuations within a year or specified period, elucidating regular, predictable movements in the financial time series data. The cyclical component represents longer-term fluctuations tied to economic or business cycles, spanning multiple years and identifying broader economic trends. The residual component, also called the error term or remainder, accounts for random and unexplained variability in the financial time series data, not attributable to the trend, seasonal, or cyclical components. Time series decomposition can be expressed in additive or multiplicative forms. An additive decomposition [22] is written as
$$y_t = S_t + T_t + R_t,$$
and a multiplicative decomposition [22] is written as
$$y_t = S_t \times T_t \times R_t,$$
where $y_t$, $S_t$, $T_t$, and $R_t$ are the data, seasonal component, trend component, and residual component, respectively, all at time $t$. The additive decomposition is appropriate when the magnitude of the seasonal fluctuations or the variation around the trend does not change as the level of the time series increases. The multiplicative decomposition is more appropriate when the variation in the seasonal pattern or the variation around the trend is proportional to the time series level [22]. STD decomposition [26] extracts the seasonal, trend, and dispersion components and is expressed as
$$y_t = S_t \times D_t + T_t,$$
where $y_t$, $S_t$, $D_t$, and $T_t$ are the data, seasonal component, dispersion component, and trend component, respectively, all at time $t$. STD with a remainder component [26], called STDR, is defined as
$$y_t = \bar{S}_t \times D_t + T_t + R_t,$$
where $\bar{S}_t$ is an averaged seasonal component, and $R_t$ is a remainder component, all at time $t$.
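To make the seasonal-trend-dispersion idea concrete, the sketch below splits a series cycle by cycle, assuming the form y_t = S_t × D_t + T_t, with the trend taken as the cycle mean and the dispersion as the cycle standard deviation. The exact estimators in [26] may differ, and the function names are hypothetical.

```python
import statistics


def std_decompose(y, period):
    """Split y into seasonal, dispersion, and trend parts, cycle by cycle.

    Assumption: trend = cycle mean, dispersion = cycle standard deviation.
    A trailing incomplete cycle is ignored.
    """
    seasonal, dispersion, trend = [], [], []
    for start in range(0, len(y) - period + 1, period):
        cycle = y[start:start + period]
        level = statistics.fmean(cycle)
        # A constant cycle has zero spread; fall back to 1.0 to avoid
        # dividing by zero (its seasonal pattern is then all zeros).
        spread = statistics.pstdev(cycle) or 1.0
        seasonal.extend((value - level) / spread for value in cycle)
        dispersion.extend([spread] * period)
        trend.extend([level] * period)
    return seasonal, dispersion, trend


def std_reconstruct(seasonal, dispersion, trend):
    """Recombine the components as y_t = S_t * D_t + T_t."""
    return [s * d + t for s, d, t in zip(seasonal, dispersion, trend)]


if __name__ == "__main__":
    series = [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0]
    s, d, t = std_decompose(series, period=4)
    print(std_reconstruct(s, d, t))
```

In this toy series the second cycle is the first one scaled by ten, so both cycles share the same normalized seasonal pattern while the trend and dispersion absorb the change in level and scale.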
Decomposition is a crucial tool in analyzing and simulating nonstationary financial time series data, providing insights into changing patterns and helping simulators better understand and model the complexities of financial markets.
The Markowitz optimization [3], known as modern portfolio theory (MPT), provides a mathematical approach to constructing portfolios to maximize the expected returns while minimizing risk. The tangency portfolio represents the optimal portfolio that maximizes a risk-adjusted return measure, the Sharpe ratio [27]. Risk budgeting [4,5] is a portfolio construction approach that involves allocating risk among different assets based on predefined risk constraints. A risk budgeting model helps to manage and control the overall risk of a portfolio while optimizing returns. Previous studies [3-5] relied on the assumption of a stationary market. The Black-Litterman model [28] is an asset allocation framework that combines market equilibrium assumptions [29] with an investor's subjective views. Applying the Black-Litterman model requires the availability of expert views or a predictive model that can represent those expert views.
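For the two-asset case, the tangency portfolio has a simple closed form, sketched below under the textbook assumption that weights are proportional to the inverse covariance matrix applied to the excess returns, with short selling allowed. The function names and the input numbers are illustrative only and are not taken from the experiments.

```python
import math


def tangency_weights(mu, cov, risk_free=0.0):
    """Two-asset tangency weights, w proportional to inv(cov) @ (mu - r_f).

    Unconstrained closed form: weights sum to 1 but may be negative.
    Assumes the unnormalized weights sum to a positive number.
    """
    (var_a, cov_ab), (cov_ba, var_b) = cov
    det = var_a * var_b - cov_ab * cov_ba
    excess_a, excess_b = mu[0] - risk_free, mu[1] - risk_free
    raw_a = (var_b * excess_a - cov_ab * excess_b) / det
    raw_b = (var_a * excess_b - cov_ba * excess_a) / det
    total = raw_a + raw_b
    return raw_a / total, raw_b / total


def sharpe_ratio(weights, mu, cov, risk_free=0.0):
    """Excess portfolio return divided by portfolio standard deviation."""
    w_a, w_b = weights
    expected = w_a * mu[0] + w_b * mu[1]
    variance = (w_a * w_a * cov[0][0] + 2.0 * w_a * w_b * cov[0][1]
                + w_b * w_b * cov[1][1])
    return (expected - risk_free) / math.sqrt(variance)


if __name__ == "__main__":
    mu = (0.08, 0.03)                      # hypothetical expected returns
    cov = ((0.04, 0.0), (0.0, 0.0025))     # hypothetical covariance matrix
    w = tangency_weights(mu, cov)
    print(w, sharpe_ratio(w, mu, cov))
```

With these inputs the low-risk asset receives most of the weight, because its return per unit of variance is higher even though its raw return is lower.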
Maximizing future rewards typically involves optimizing a sequence of decisions or actions to achieve the best possible outcomes. It is a fundamental problem in various fields, including reinforcement learning. Ref. [6] introduced an RL model called recurrent reinforcement learning (RRL) for portfolio management, using the Sharpe ratio as the reward. A previous study [7] used a modified RRL, which optimizes the Sharpe ratio with batch learning. The deep deterministic policy gradient (DDPG) [30] was used for portfolio management [8,9]. RL portfolio diversification models [6-9] can construct optimal portfolios that achieve the best possible rewards, such as the expected return, the Sharpe ratio [27], the Sortino ratio [31], or the market-adaptive ratio [32]. The Sharpe ratio [27] relates the excess returns on a portfolio to its risk, the standard deviation of the excess return. The market-adaptive ratio [32] is a risk-adjusted return based on a market-type measure, rho. This ratio is a general form of the Sharpe ratio, considering the characteristics of the market types. During bull markets, the focus is on seeking high returns and embracing risk. In contrast, during bear markets, the focus is on preserving capital and minimizing risk. The Sortino ratio [31] focuses on the downside risk. RL portfolio diversification models leverage insights gained from data analysis instead of relying on an assumption. However, they estimate the model parameters from observed historical environments, which leads to the uncertainty deficiency and training data shortage problems.

Proposed Methods
The d-day log-return vector of the high-risk asset (or the low-risk asset) at time $t$, denoted $x_t$, is computed from $p_t$, the price of the high-risk asset (or the low-risk asset) at time $t$.
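One plausible reading of the d-day log-return vector is the sequence of the d most recent one-day log returns ending at time t. The sketch below follows that reading; the indexing convention and the function name are assumptions, since the defining equation is not shown here.

```python
import math


def log_return_vector(prices, t, d):
    """Return the d most recent one-day log returns ending at index t."""
    if d < 1 or t - d < 0 or t >= len(prices):
        raise ValueError("need d >= 1 and d earlier prices before index t")
    return [
        math.log(prices[i] / prices[i - 1]) for i in range(t - d + 1, t + 1)
    ]


if __name__ == "__main__":
    prices = [100.0, 110.0, 121.0, 108.9]  # hypothetical closing prices
    print(log_return_vector(prices, t=2, d=2))
```

Because log returns are additive, the entries of the vector sum to the log return over the whole d-day window.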

FED
The encoder-decoder architecture encourages the latent space to have meaningful representations of the data, which is advantageous for operations like interpolation or feature manipulation. Based on this architecture, the FED method utilizes latent variables that are further decomposed into components. This approach provides a profound understanding of the underlying structure of financial time series data, unveiling the hidden patterns or structures and offering insights into factors influencing observed trends and fluctuations.
Time series decomposition is a fundamental technique in time series analysis that separates complex time series data into individual components, helping understand the underlying dynamics. Most time series decomposition methods have focused on the trend, seasonal, and residual components. Previous work [26] considers a component related to the dispersion of the time series. The trend and dispersion components are crucial for generating financial time series data due to their nonstationary property. FED incorporates the trend, dispersion, and residual components. The trend component, $m_t$, is the mean return at time $t$, representing the direction of financial time series data. The dispersion component, $s_t$, is the standard deviation of the return at time $t$, representing the fluctuation of financial time series data. The residual component accounts for the unexplained variability in the financial time series data. The primary concept of the proposed model is to apply decomposition in the hidden space. By emphasizing these components, FED leverages pattern-centric data augmentation within the financial time series data.
Assume that data $\tilde{x}_t$ are generated by a decoder with a probabilistic latent variable, $h_t$. The FED method is based on the latent variable decomposition $h_t = \nu_t \times \tau_t \times \xi_t$, where $\nu_t \sim N(\mu_{\nu_t}, \Sigma_{\nu_t})$ is a probabilistic trend component of the latent variable, $\tau_t \sim N(\mu_{\tau_t}, \Sigma_{\tau_t})$ is a probabilistic dispersion component of the latent variable, and $\xi_t \sim N(\mu_{\xi_t}, \Sigma_{\xi_t})$ is a probabilistic residual component of the latent variable, all at time $t$.
The product of two multivariate normal densities is, up to normalization, another multivariate normal density [33], a property on which the proposed model relies. Consequently, the parameters of the probabilistic hidden variable, $h_t \sim N(\mu_{h_t}, \Sigma_{h_t})$, were calculated from the parameters of the three components. FED employs three encoders to model the three probabilistic components of the latent variables, including trend (return), dispersion (standard deviation), and residual. Similar to reference [14], the reparameterization trick was used. Figure 1 illustrates the general framework of FED.
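The standard result for the normalized product of Gaussian densities can be sketched as follows, assuming diagonal covariances so that each dimension is handled independently. Whether FED uses exactly this precision-weighted form is an assumption, and the function name and example values are hypothetical.

```python
def fuse_gaussians(components):
    """Parameters of the normalized product of diagonal Gaussian densities.

    components: list of (mu, var) pairs, each a list of per-dimension values.
    Precisions add, and the mean is the precision-weighted average.
    """
    dim = len(components[0][0])
    mu_h, var_h = [], []
    for k in range(dim):
        precision = sum(1.0 / var[k] for _, var in components)
        weighted_mean = sum(mu[k] / var[k] for mu, var in components)
        var_h.append(1.0 / precision)
        mu_h.append(weighted_mean / precision)
    return mu_h, var_h


if __name__ == "__main__":
    trend = ([0.0], [1.0])        # hypothetical (mu, var) of nu_t
    dispersion = ([3.0], [0.5])   # hypothetical (mu, var) of tau_t
    residual = ([1.0], [2.0])     # hypothetical (mu, var) of xi_t
    print(fuse_gaussians([trend, dispersion, residual]))
```

The fused variance is never larger than the smallest component variance, so the component with the most confident estimate dominates the mean of the hidden variable.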
The marginal log-likelihood of the trend $m_t$ is bounded as
$$\log p_{\theta_\nu}(m_t) \geq \mathbb{E}_{q_{\phi_\nu}(\nu_t|x_t)}[\log p_{\theta_\nu}(m_t|\nu_t)] - D_{KL}(q_{\phi_\nu}(\nu_t|x_t) \,\|\, p(\nu_t)),$$
where $p_{\theta_\nu}(m_t|\nu_t)$ is the conditional probability distribution of the trend $m_t$ given the latent variable $\nu_t$, modeled by a decoder and the sampling of the latent variable, and $q_{\phi_\nu}(\nu_t|x_t)$ is the conditional probability distribution of the latent variable $\nu_t$ given data $x_t$, modeled by an encoder and the reparameterization trick. The above bound is the evidence lower bound (ELBO).
Similarly, the marginal log-likelihood of the dispersion $s_t$ is bounded as
$$\log p_{\theta_\tau}(s_t) \geq \mathbb{E}_{q_{\phi_\tau}(\tau_t|x_t)}[\log p_{\theta_\tau}(s_t|\tau_t)] - D_{KL}(q_{\phi_\tau}(\tau_t|x_t) \,\|\, p(\tau_t)),$$
where $p_{\theta_\tau}(s_t|\tau_t)$ is the conditional probability distribution of the dispersion $s_t$ given the latent variable $\tau_t$, modeled by a decoder and the sampling of the latent variable, and $q_{\phi_\tau}(\tau_t|x_t)$ is the conditional probability distribution of the latent variable $\tau_t$ given data $x_t$, modeled by an encoder and the reparameterization trick.
Similarly, the marginal log-likelihood of data $x_t$ is bounded as
$$\log p_\theta(x_t) \geq \mathbb{E}_{q_\phi(h_t|x_t)}[\log p_\theta(x_t|h_t)] - D_{KL}(q_\phi(h_t|x_t) \,\|\, p(h_t)),$$
where $p_\theta(x_t|h_t)$ is the conditional probability distribution of data $x_t$ given the latent variable $h_t$, modeled by a decoder and the sampling of the latent variable, and $q_\phi(h_t|x_t)$ is the conditional probability distribution of the latent variable $h_t$ given data $x_t$, modeled by encoders and the reparameterization trick. The FED method aims to maximize a weighted combination of the above three bounds, where $\alpha$, $\beta$, and $\gamma$ are hyperparameters that control the importance of each task.
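The multi-task objective can be sketched as a weighted sum of the three bounds. The pairing of α with the trend bound, β with the dispersion bound, and γ with the data bound follows the order of presentation and is an assumption, as is the use of the population standard deviation for the dispersion target; the function names are hypothetical.

```python
import statistics


def decomposition_targets(x):
    """Trend m_t (mean) and dispersion s_t (standard deviation) of a window.

    Assumption: the population standard deviation is used for s_t.
    """
    return statistics.fmean(x), statistics.pstdev(x)


def fed_objective(bound_trend, bound_dispersion, bound_data,
                  alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted combination of the three evidence lower bounds.

    Assumption: alpha, beta, and gamma weight the trend, dispersion,
    and data bounds, respectively. Training would maximize this value.
    """
    return alpha * bound_trend + beta * bound_dispersion + gamma * bound_data


if __name__ == "__main__":
    window = [0.01, 0.03, -0.02, 0.00]  # hypothetical log-return window
    print(decomposition_targets(window))
    print(fed_objective(-1.0, -2.0, -3.0, alpha=1.0, beta=0.5, gamma=2.0))
```

In a gradient-based framework the negative of this value would serve as the loss, so raising one weight shifts capacity toward the corresponding reconstruction task.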

FED2Port
The environment of the FED2Port is defined as follows.
• The action is defined as the weight vector $a_t = (a_{t,hr}, a_{t,lr})$, where $a_{t,hr} \geq 0$ and $a_{t,lr} \geq 0$ represent the weights of a high-risk asset and a low-risk asset, respectively, with the constraint that $a_{t,hr} + a_{t,lr} = 1$.

• The state is defined as the portfolio return $s_t$, computed from $x_{t,hr}$ and $x_{t,lr}$, the d-day log-return vectors of the high-risk and low-risk assets, respectively.

• The reward is defined as the market-adaptive ratio [32], evaluated on the generated log-return vectors. Here, $\rho_{hr} = \frac{2}{1 + e^{-R_{hr}}}$ represents the rho of the high-risk asset; $R_{hr}$ is the return of the high-risk asset; and $\tilde{x}_{t+d,hr}$ and $\tilde{x}_{t+d,lr}$ are the generated log-return vectors of the high-risk and low-risk assets, respectively. The FED method is applied to both the high-risk and low-risk assets. $\bar{R}_p$ and $\sigma_p$ represent the expected return and standard deviation of the total portfolio, respectively, and $R_f$ is the risk-free rate. In this paper, the risk-free rate is set to zero. By using the market-adaptive ratio as the reward, FED2Port can take into account market characteristics such as bull and bear markets.
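The building blocks of this environment can be sketched as follows: a softmax that yields non-negative weights summing to one, the rho measure of the high-risk asset, and a weighted portfolio return. The linear weighting of log returns is a common approximation and an assumption here, and the full market-adaptive ratio of [32] is deliberately not reproduced; all function names are hypothetical.

```python
import math


def softmax(logits):
    """Map raw scores to non-negative weights that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rho(return_hr):
    """Market-type measure rho = 2 / (1 + exp(-R_hr)), which lies in (0, 2)."""
    return 2.0 / (1.0 + math.exp(-return_hr))


def portfolio_returns(weights, x_hr, x_lr):
    """Weighted return per day (assumed linear mix of the two assets)."""
    a_hr, a_lr = weights
    return [a_hr * h + a_lr * l for h, l in zip(x_hr, x_lr)]


if __name__ == "__main__":
    action = softmax([0.3, -0.1])  # hypothetical agent outputs
    print(action, rho(0.05))
    print(portfolio_returns(action, [0.02, -0.01], [0.001, 0.002]))
```

A rho above 1 corresponds to a positive high-risk return (a bull-like market) and a rho below 1 to a negative one, which is how the reward can adapt to the market type.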
The agent, $\pi_\omega$, receives the portfolio return and selects an action. It controls the policy using an evaluation of the reward. Figure 2 illustrates the general framework of FED2Port. FED2Port aims to allocate the total investment into two classes: high-risk and low-risk assets.
This paper considers three stock indices and three bond funds (Table 1) in the experiment. Daily data from January 2010 to December 2022 (https://finance.yahoo.com/, accessed on 1 October 2023) were included. To initialize the models, they were trained using the five-year data from January 2010 to December 2014 for each dataset. The models were then tested using eight-year data, from January 2015 to December 2022. Figure 3 depicts the price data of the assets, while Table 2 lists the differences between stock market indices and bond funds. While stock market indices carry higher risk, bond funds offer lower risk. Nine two-class portfolios (Table 3) were considered, comprising three stock indices and three bond funds (Table 1), to assess the performance of the proposed model.

Benchmarks
For comparison, several benchmarks (Table 4) were considered, including buy-and-hold strategies, traditional portfolio diversification models, and RL portfolio diversification models. The buy-and-hold strategy is a long-term investment approach in portfolio management where an investor buys financial assets and holds onto them for an extended period, regardless of short-term market fluctuations. Traditional portfolio diversification models help construct portfolios that align with the investors' risk tolerance and return objectives. RL portfolio diversification models showcase the adaptability and learning capabilities of reinforcement learning.

Performance Measures
The expected portfolio return, the standard deviation of the portfolio return, and the Sharpe ratio were considered to evaluate the effectiveness of portfolio strategies.
The expected portfolio return (Profit) is computed from $t$, the length of the test period, and $\bar{R}_p$, the daily mean return of the portfolio (Equation (16)). The expected portfolio return provides insight into the overall portfolio performance, capturing the total change in value over time.
The standard deviation of the portfolio return (Risk) is computed from $R_{p,i}$, the daily return of the portfolio at time $i$ (Equation (17)). The standard deviation of the portfolio return is a key metric in assessing the risk associated with a portfolio. A higher standard deviation indicates greater variability in returns, suggesting higher risk, while a lower standard deviation implies more stability. The Sharpe ratio is a risk-adjusted return measure that evaluates the portfolio performance; it was calculated using the expected return and risk during the test period (Equation (18)).
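The three measures can be sketched as below. The exact scaling of Equations (16)-(18) is not shown here, so the sketch assumes the simplest unscaled forms: Profit as the test length times the daily mean return, Risk as the sample standard deviation of daily returns, and the Sharpe ratio as the daily mean excess return divided by that standard deviation. The function names are hypothetical.

```python
import statistics


def profit(daily_returns):
    """Assumed form: length of the test period times the daily mean return."""
    return len(daily_returns) * statistics.fmean(daily_returns)


def risk(daily_returns):
    """Assumed form: sample standard deviation of the daily returns."""
    return statistics.stdev(daily_returns)


def sharpe_ratio(daily_returns, risk_free=0.0):
    """Assumed form: mean excess daily return per unit of daily risk."""
    excess = statistics.fmean(daily_returns) - risk_free
    return excess / risk(daily_returns)


if __name__ == "__main__":
    returns = [0.01, -0.01, 0.02, 0.0]  # hypothetical daily portfolio returns
    print(profit(returns), risk(returns), sharpe_ratio(returns))
```

Any annualization or per-period scaling would multiply these values by constants and would not change the ranking of strategies evaluated over the same test period.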

Experimental Results
The network architectures in Figure 4 were used for the encoder and decoder of the FED method. The dimensions of the latent variables were set to 100. The network architecture in Figure 5 was used for the FED2Port agent, $\pi_\omega$, which utilizes the Softmax function to generate portfolio weights. A rolling window approach was implemented to retrain the FED2Port model annually from January 2015 to December 2022. The total portfolios were rebalanced every month (20 trading days). The Profit (Equation (16)), Risk (Equation (17)), and Sharpe ratio (Equation (18)) were considered to evaluate the effectiveness of the portfolio strategies. The importance of using FED in FED2Port was demonstrated by comparing its performance with that of TimeGAN2Port and RTSGAN2Port. Synthetic data were generated using TimeGAN [16] for TimeGAN2Port and RTSGAN [17] for RTSGAN2Port. Ten samples were generated at each time step for each generation.
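The retraining and rebalancing schedule can be sketched in terms of trading-day indices. The sketch assumes 250 trading days per year, 20 trading days per month, and a fixed-length training window that slides forward at each annual retraining; whether the window is fixed or expanding is not stated, so that choice and the function name are assumptions.

```python
def rolling_schedule(n_days, train_days, retrain_every=250, rebalance_every=20):
    """List (training window, rebalancing days) for each retraining block.

    Days are integer indices into the full daily series. The training
    window has a fixed length and ends where the test block starts.
    """
    schedule = []
    for block_start in range(train_days, n_days, retrain_every):
        block_end = min(block_start + retrain_every, n_days)
        train_window = (block_start - train_days, block_start)
        rebalance_days = list(range(block_start, block_end, rebalance_every))
        schedule.append((train_window, rebalance_days))
    return schedule


if __name__ == "__main__":
    # Hypothetical sizes: 5 training years followed by 8 test years.
    plan = rolling_schedule(n_days=13 * 250, train_days=5 * 250)
    print(len(plan), plan[0][0], plan[0][1][:3])
```

Each block retrains on data that ends strictly before its first rebalancing day, which keeps the test-period decisions free of look-ahead.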
Tables 5-13 list the experimental results. The empirical evaluation of FED2Port across diverse datasets underscored its robustness and superior performance, consistently outperforming benchmark models, including traditional and reinforcement learning models, in terms of the Sharpe ratio. The risk-return trade-off is a fundamental investment principle: higher potential returns generally come with higher risk. The Sharpe ratio is a helpful measure for quantifying this trade-off. For eight portfolios out of nine, 100% low-risk asset portfolios provided the lowest risks, but the profits were not sufficiently strong. In the VCIT&DAX dataset (Table 12), TimeGAN2Port provided the lowest risk, but its profit was also lower. For five portfolios out of nine, 100% high-risk asset portfolios offered the highest profits, but they also came with the highest risks. In the BND&KOSPI (Table 7) and the BSV&KOSPI (Table 10) datasets, DDPG offered the highest profits, but its risks were higher than those of the proposed model, FED2Port. The Sharpe ratios of FED2Port were the highest among the compared models across all portfolios, indicating that FED2Port delivered the most favorable return per unit of risk undertaken.
Other RL portfolio diversification models (RRL, DDPG, TimeGAN2Port, and RTSGAN2Port) exhibited mixed results in terms of robustness. They sometimes outperformed traditional portfolio models (tangency portfolio and risk budgeting) while yielding poorer results at other times. This variability suggests that the performance of these RL models may be sensitive to specific market conditions or dataset characteristics.
The primary concept behind FED2Port is to utilize financial market environment simulation through FED. The importance of using FED was highlighted by comparing the performances of FED2Port, TimeGAN2Port, and RTSGAN2Port. The results demonstrated that employing financial market environment simulation through FED is crucial for enhancing portfolio performance. FED was compared with the most recent time series data generation models, namely TimeGAN [16] and RTSGAN [17]. TimeGAN and RTSGAN are designed to generate synthetic data that closely resemble real-world time series data. However, neither of these models addresses the generation of nonstationary financial time series data. FED leverages decomposition techniques to break down financial time series data into distinct components, such as trend, dispersion, and residual. By decomposing the data in this manner, FED can capture the various underlying factors influencing the trends and fluctuations in the market, leading to a more accurate representation of real-world financial time series data. Figure 6 shows the t-SNE plots of the original versus generated data. The results indicated that FED produces synthetic data that closely match the original distribution of the data, suggesting that FED is more effective in capturing the underlying structure and characteristics of financial time series data compared to other models.

Conclusions
This paper introduced a novel portfolio diversification approach called FED2Port, which effectively addresses the uncertainty deficiency problem inherent in historical financial time series data and the problem of insufficient training data. This is achieved by utilizing dynamic financial market environment simulation during reinforcement learning algorithm training. Our experimental results across diverse datasets have demonstrated the robustness and superior performance of FED2Port compared to benchmark models, including traditional and reinforcement learning models. Notably, FED2Port consistently outperformed the benchmarks in terms of the Sharpe ratio, emphasizing its effectiveness in delivering risk-adjusted returns. This superior performance underscores the importance of environment simulation in enhancing portfolio diversification strategies, as it allows for a more accurate representation of real-world conditions.
However, it is important to note that the experimental results for TimeGAN2Port and RTSGAN2Port were not as favorable as those of the other benchmarks. This highlights the limitations of solely relying on synthetic data generation methods that do not specifically address the complexities of financial markets. Our findings suggest the necessity of employing financial pattern-centric data augmentation techniques, such as FED, to enhance portfolio diversification strategies. By providing more accurate insights into market trends and fluctuations, FED2Port enables investors to make informed decisions that can potentially enhance portfolio performance and mitigate risks.
Overall, our findings highlight the practical importance of incorporating sophisticated data augmentation techniques, like FED, into portfolio diversification. Moving forward, further research in this area could explore additional applications of FED and similar methods in portfolio optimization and risk management, ultimately contributing to more robust and effective investment strategies in financial markets.

Figure 1. General framework of the financial time series decomposition-based variational encoder-decoder (FED). $x_t$ is the d-day log-return vector of the high-risk asset (or the low-risk asset) at time $t$. $\tilde{m}_t$, $\tilde{s}_t$, and $\tilde{x}_t$ are the generated trend (return), the generated dispersion (standard deviation), and the generated d-day log-return vector of the high-risk asset (or the low-risk asset), respectively, all at time $t$. $\nu_t \sim N(\mu_{\nu_t}, \Sigma_{\nu_t})$ is a probabilistic trend component of the latent variable, $\tau_t \sim N(\mu_{\tau_t}, \Sigma_{\tau_t})$ is a probabilistic dispersion component of the latent variable, and $\xi_t \sim N(\mu_{\xi_t}, \Sigma_{\xi_t})$ is a probabilistic residual component of the latent variable. $h_t = \nu_t \times \tau_t \times \xi_t$ is the decomposed latent variable.

Figure 2. General framework of two-class portfolio diversification (FED2Port). $s_t$ and $a_t$ are the state and the action, respectively, at time $t$. $\tilde{x}_{t+d,hr}$ and $\tilde{x}_{t+d,lr}$ represent the generated log-return vectors of the high-risk and low-risk assets, respectively, at time $t$. $r_t$ is the reward at time $t$. The objective of FED2Port is to maximize the expected reward.

Figure 3. Graphs of the price data of the assets.

Table 2. Statistics of the funds during the test period.

Table 5. Results of the BND&SP500 portfolio. Cells with a red background color indicate the best Sharpe ratio in the experiment.

Table 6. Results of the BND&DAX portfolio. Cells with a red background color indicate the best Sharpe ratio in the experiment.

Table 7. Results of the BND&KOSPI portfolio. Cells with a red background color indicate the best Sharpe ratio in the experiment.

Table 8. Results of the BSV&SP500 portfolio. Cells with a red background color indicate the best Sharpe ratio in the experiment.

Table 9. Results of the BSV&DAX portfolio. Cells with a red background color indicate the best Sharpe ratio in the experiment.

Table 10. Results of the BSV&KOSPI portfolio. Cells with a red background color indicate the best Sharpe ratio in the experiment.

Table 11. Results of the VCIT&SP500 portfolio. Cells with a red background color indicate the best Sharpe ratio in the experiment.

Table 12. Results of the VCIT&DAX portfolio. Cells with a red background color indicate the best Sharpe ratio in the experiment.

Table 13. Results of the VCIT&KOSPI portfolio. Cells with a red background color indicate the best Sharpe ratio in the experiment.