Asset Allocation Model for a Robo-Advisor Using the Financial Market Instability Index and Genetic Algorithms

There has been a growing demand for portfolio management using robo-advisors, and hence, research on the automation of portfolio composition has been increasing. In this study, we propose a model that automates the portfolio structure by using the instability index of the financial time series and genetic algorithms (GAs). We use the instability index to filter the investment assets and optimize the threshold value used as a filtering criterion by applying a GA. For an empirical analysis, we use stock, bond, and commodity exchange traded funds (ETFs) and an exchange rate. We compare the performance of our model with that of the risk parity and mean-variance models and find that our model performs better. Several additional experiments with our model using various internal parameters are conducted, and the proposed model with a one-month test period after one year of learning is found to provide the highest Sharpe ratio.


Introduction
Interest in robo-advisors is growing with FinTech trends. FinTech, a compound of finance and technology, has a large impact on various industries as well as the financial industry. As one of the most important FinTech developments, robo-advisors are being treated by the investor community as representative of the next-generation business model in the area of portfolio management [1,2]. On behalf of human investment professionals, robo-advisors provide online investment advisory services for portfolio management according to individual investment propensity and investment purpose, using big data analysis and advanced algorithm-based automation systems. In fact, robo-advisors are not an entirely new technique, but the terminology is new. The core of a robo-advisor is the composition of investments in a portfolio that is mainly based on exchange traded funds (ETFs). An ETF is a type of fund that owns underlying assets such as stocks, bonds, or commodities and is traded like a common stock on a stock exchange with relatively low transaction costs [3]. Empirical research on robo-advisors has been steadily progressing with technology development. Recent developments in information and communication technology (ICT) and artificial intelligence have defined new terms and become a domain of the robo-advisor [4,5]. Several studies show better portfolio management performance using artificial intelligence than using various existing strategies [6][7][8].
As financial markets are constantly evolving with sophisticated techniques, the classical financial theories and traditional models are no longer effective for understanding the financial market mechanism and constructing efficient portfolios. Since the mean-variance model by Markowitz, the asset allocation model for optimal portfolios has evolved steadily [9]. The mean-variance model has been extensively used in theoretical and empirical studies [10,11]. Since the 1960s, the capital asset pricing model (CAPM) and a model that eliminates unsystematic risk through diversified investment have been developed [12][13][14]. The CAPM, which finds the expected return of an asset based on many assumptions, has been criticized because it does not fit real market conditions [15]. Since then, multi-factor models using other factors in addition to the market return were introduced, and the Black-Litterman model, which reflects the investor's market prospects, was introduced for constructing a portfolio [15][16][17].
The other noteworthy model is the risk parity model, which is gradually replacing existing models. The risk parity model has many advantages over the previous models. This model allows investors to focus more on risk than other factors. It effectively constructs a more diversified portfolio and shows good performance in backtesting [18][19][20][21]. This result seems to be possible because the model shows lower drawdown and volatility than existing models. However, some have argued that the risk parity strategy is affected by interest rates, lacks consideration of future returns, and requires leveraging low-risk assets. The weights on each asset included in a portfolio can be changed significantly depending on risk measurement. Furthermore, the risk parity strategy does not reflect the investor's special circumstances efficiently. In contrast, a robo-advisor needs to reflect investors' risk preferences more properly for portfolio management regardless of leverage. Indeed, a model that can select assets considering investors' risk preference is more efficient for portfolio management.
Because the classical financial theories and traditional models are losing their effectiveness in constantly evolving financial markets, the purpose of this research is to develop an asset allocation model for a robo-advisor using more advanced techniques and to compare its performance with those of traditional models. The model proposed in this paper focuses on asset instability. This model uses market indices to measure the degree of instability of financial time series data, compares them to thresholds, and manages risk by eliminating unstable assets. The degree of instability is measured by a p-value that is also used to determine the weight of each asset included in a portfolio. Genetic algorithms are used to optimize the thresholds that determine the degree of instability. By comparing with the risk parity model and the mean-variance model, we confirm that the proposed model provides higher profitability and stability [22,23]. The significant advantage of the proposed model is that it can reflect investors' propensity to invest in various ways by adjusting this threshold. We can increase the share of assets with high instability for investors who prefer risk, whereas we can increase the proportion of stable assets for investors who prefer stability. A general role of a robo-advisor is to provide financial advice or investment management based on mathematical rules or algorithms. In this sense, our asset allocation model can be used as a portfolio management algorithm or as automated investment advice by a robo-advisor.
For empirical analysis, we use ETFs traded in Korean markets. Many recent portfolio investments focus on investing in ETFs. Before ETFs were traded on the market, it was difficult to manage the risks of a portfolio because a large amount of capital was required to build a diversified portfolio. With the advent of ETFs, however, it is possible to build a fully diversified portfolio even at low cost.
The composition of this paper is as follows. In Section 2, we review the algorithms and theories used in the proposed model and describe how to derive the iFMII (integrated financial market instability index) that identifies instability. We also discuss the risk parity model and the genetic algorithms used for threshold optimization and describe the components of the proposed model. In Section 3, we report empirical results and compare the performance of our model with that of other models. Section 4 concludes our paper.

p-Value Derived from the iFMII
The p-value indicating the degree of instability is derived from the iFMII. In fact, the instability index was first introduced as the iSMII (integrated stock market instability index), but it was expanded to the iFMII so that it is not limited to the stock market [24]. The iFMII considers time series data to be linear or nonlinear. For the nonlinear model, a nonparametric nearly stationary autoregressive (NNSAR) model with artificial neural networks (ANNs) is used. ANNs are among the most widely applied models for nonlinear time series. For the linear model, an autoregressive (AR) model is used.
As a first step, a stable interval of the time series between certain values is determined. Once the interval is determined, linear and nonlinear models are constructed based on the stable interval. Then, the time series included in the remaining interval is predicted by the model constructed in the stable interval. To determine the order of the constructed model, the sample partial autocorrelation function (SPACF) is used.
The linear FMII (financial market instability index) and the nonlinear FMII are calculated as the difference between the predicted value and the actual value. Because the model is constructed based on a stable section, the larger the error, the more unstable the value is. The two derived FMIIs are combined using the Bayesian averaging method to produce the iFMII. It is empirically found that the combined index with approximately 50% of each FMII best reflects the degree of instability. The p-value of each FMII is obtained from the rank of the error in descending order, which implies that the higher the error value (or the lower the rank of an error), the lower the p-value. The two p-values obtained from the linear and nonlinear FMIIs are also combined with the same weights as those used when calculating the iFMII to produce the p-value of the iFMII, called the ip-value. Previous studies have shown that an ip-value lower than 0.2 indicates a very unstable condition in the Korean stock market [24]. In this paper, we propose to optimize the threshold value for unstable conditions and determine whether a 0.2 threshold value is appropriate.

Risk Parity Model
To use the mean-variance model to build an efficient portfolio, the expected return of each asset, the covariance matrix, and the risk aversion factor of the investor are required [9,16]. The covariance matrix includes the volatility of each asset and the covariances between all possible pairs of assets included in the portfolio. Expected return, volatility, and correlation are estimated by various methods, but the most controversial variable is the expected return. The expected return has the greatest impact on asset allocation decisions, and its value depends heavily on the estimation method, such as the use of historical figures and the assumptions imposed for estimation.
To overcome these weaknesses, various alternatives have been proposed. One of them is the risk parity model, which requires only the covariance matrix [25,26]. The risk parity model constructs a portfolio so that each asset (or security) has the same risk contribution to the total portfolio risk. In this sense, the risk parity model is considered a model for distributing risks rather than a model for constructing an efficient portfolio. In this paper, the historical standard deviation of each asset is used as the volatility measure, and the assets in a portfolio are allocated proportionally to the reciprocal of the volatility, which is also known as the naïve risk parity model [27][28][29][30].
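The naïve risk parity allocation described above can be illustrated with a minimal sketch. The function name and the sample returns are hypothetical and only serve to show the reciprocal-of-volatility rule; the sample standard deviation is assumed as the volatility estimator.

```python
from statistics import stdev

def inverse_volatility_weights(returns_by_asset):
    """Weights proportional to 1 / historical standard deviation of returns."""
    inv_vol = {name: 1.0 / stdev(r) for name, r in returns_by_asset.items()}
    total = sum(inv_vol.values())
    return {name: v / total for name, v in inv_vol.items()}

# Hypothetical monthly returns, for illustration only (not data from the study).
sample_returns = {
    "stock": [0.04, -0.02, 0.04, -0.02],
    "bond": [0.01, -0.005, 0.01, -0.005],
}
weights = inverse_volatility_weights(sample_returns)
```

In this toy input the stock series is four times as volatile as the bond series, so the bond receives four times the weight of the stock.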

Genetic Algorithms
A genetic algorithm is an optimization methodology that is based on the principles of Darwinian evolution [31][32][33][34]. The algorithm is mainly used to find solutions to nonlinear optimization problems using evolutionary rules such as crossover, selection, and mutation. Before applying the algorithm, all candidates (chromosomes) for solutions are usually converted into a binary structure, and the objective (fitness) function, which represents the goal of the problem and is used to calculate fitness score, should be constructed. A set of chromosomes is called a population, and each population is numbered as the nth generation. Moving on to the next generation, the evolutionary rules are implemented based on the fitness scores of chromosomes, which measure how well chromosomes fit the problem. Typically, chromosomes that have higher fitness scores have higher probability to be parents for crossover according to a selection rule. A crossover rule is a way to create better chromosomes by mixing two different chromosomes, and a mutation rule randomly selects and changes genes of chromosomes. By repeating this process, chromosomes gradually fit the objective function and are closer to the optimal solution [35,36].
In this study, a GA is used to optimize the thresholds of asset filtering in a portfolio. The threshold set of each asset in a portfolio is treated as a variable, and the maximization of the Sharpe ratio with a zero risk-free rate over the past period is used as the objective function. The Sharpe ratio, which is widely used in finance to measure an asset's performance, is defined as the average return earned in excess of the risk-free rate per unit of volatility or total risk. For the GA operators, rank-based selection and uniform crossover with a 0.5 crossover rate are used. Mutation is performed randomly on each variable individually with a 0.1 mutation rate. With the rank-ordered replacement method, the worst solutions are replaced with new candidate solutions created by the operators. The GA stops only when the fitness value has not improved by at least 0.01% during the last 20,000 trials. The larger the number of trials, the more likely it is to obtain a globally optimized solution, but the complexity of finding an optimal threshold increases exponentially. Therefore, the empirical analysis is usually carried out using a reasonable stopping condition in the process of finding an optimal solution.
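The operators named above (rank-based selection, uniform crossover, per-variable mutation, replacement of the worst solution) can be sketched as follows. This is a simplified illustration, not the implementation used in the study: chromosomes are real-valued thresholds in [0, 1) rather than binary strings, the child replaces the worst chromosome only if it is better, the stopping rule uses an absolute gain over a short patience window instead of the 0.01% / 20,000-trial rule, and `toy_fitness` is a hypothetical stand-in for the Sharpe ratio objective.

```python
import random

def run_ga(fitness, n_genes, pop_size=30, crossover_rate=0.5,
           mutation_rate=0.1, patience=500, min_gain=1e-9,
           max_trials=20000, seed=0):
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_genes)]
                  for _ in range(pop_size)]
    population.sort(key=fitness, reverse=True)  # best chromosome first
    initial_best = fitness(population[0])
    # Rank-based selection: the best rank gets the largest selection weight.
    rank_weights = [pop_size - r for r in range(pop_size)]
    stale = 0
    for _ in range(max_trials):
        if stale >= patience:
            break
        best_before = fitness(population[0])
        parent_a, parent_b = rng.choices(population, weights=rank_weights, k=2)
        # Uniform crossover: each gene comes from parent_b with prob. crossover_rate.
        child = [b if rng.random() < crossover_rate else a
                 for a, b in zip(parent_a, parent_b)]
        # Mutation: each gene is redrawn independently with prob. mutation_rate.
        child = [rng.random() if rng.random() < mutation_rate else g
                 for g in child]
        # Replacement: the child takes the place of the worst chromosome.
        if fitness(child) > fitness(population[-1]):
            population[-1] = child
            population.sort(key=fitness, reverse=True)
        gain = fitness(population[0]) - best_before
        stale = 0 if gain > min_gain else stale + 1
    return population[0], initial_best

def toy_fitness(genes):
    """Hypothetical objective with its maximum at 0.3 for every gene."""
    return -sum((g - 0.3) ** 2 for g in genes)

best, initial_best = run_ga(toy_fitness, n_genes=4)
```

Because the best chromosome is never overwritten, the best fitness cannot decrease from one trial to the next.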

Proposed Model
The model proposed in this paper consists of three phases. In the first phase, we calculate the ip-value, which is used for the two purposes of asset weighting and asset filtering. In the second phase, we find an optimal threshold using a GA. As a measure of the instability degree of an asset, the ip-value is used as a criterion to exclude assets that do not meet the threshold. Once the unstable asset group is removed from the portfolio, the ip-value is used to determine the proportions of the remaining assets and construct a portfolio containing stable assets. In the last phase, we finally decide the proportion of the investor's capital to invest in the portfolio using the volatility target method. Figure 1 shows the three phases of the proposed model. The FMII value for each asset is obtained by constructing linear and nonlinear models for the time series data included in a stable interval. For a given time series of the $i$th asset, $Y_{i1}, Y_{i2}, \cdots, Y_{in}$, the NNSAR model with order $k$ and the error term $e_{i,t}$ is presented in Equation (1).
The AR model with order $k$ and the error term $e_{i,t}$ is presented in Equation (2).
The FMII value for each asset is the error between the predicted value from the constructed model and the actual value. We use the mean absolute percentage error type (MAPE) presented in Equation (3) to calculate the error.
where $\hat{Y}_{t-i}$ is the predicted value and $Y_{t-i}$ is the actual value. FMII1 is derived from the linear model, and FMII2 is derived from the nonlinear model. Then, the p-value is calculated for each FMII value, and the two p-values are combined to determine the final ip-value for the asset. First, we determine a stable interval for each asset. A stable interval is a period in which the asset's time series data appear to behave as a normal (or stationary) time series. Generally, an interval where the time series is not volatile and moves within a certain box is selected as the stable interval. After selecting the stable interval and calculating FMII1 and FMII2 for each asset, the p-values of FMII1 and FMII2 for each asset are obtained by dividing the rank of each FMII value in descending order by the total number of predictions. We first calculate the p-value for the linear model. For a given interval, we check the sample autocorrelation function (SACF) to determine whether the time series in the interval is abnormal (or nonstationary). Then, we use the nonstationary linear AR model for the prediction of an abnormal time series. The order of the AR and ANN models is determined through the SPACF. The AR formula is constructed using the determined order, and the FMII1 of each asset is derived. In this paper, the SPACF cuts off at a lag of 2 because this is the first lag at which the SPACF falls between −0.2 and 0.2. Therefore, we can present our AR(2) and ANN models as Equations (4) and (5), respectively. The error terms for Equation (5) are fitted using AR(1) after the ANN model is constructed. When asset movements are nonlinear, an ANN is used to estimate and predict the nonlinear time series. In this paper, the ANN derives FMII2 using the commonly used backpropagation (BPN) algorithm and a multi-layer perceptron (MLP) architecture with a 2 × 3 × 1 structure for financial data analysis.
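As an illustration of the linear branch, the sketch below fits an AR(2) model with an intercept on a stable interval by ordinary least squares and then computes one-step-ahead absolute percentage errors on the remaining observations. The least-squares estimator, the intercept term, the function names, and the synthetic noise-free series are all assumptions made for this example; the study's exact specification is given by its Equations (3) and (4).

```python
def solve_linear(a, b):
    """Solve a small system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ar2(series):
    """Least-squares fit of y_t = c + phi1 * y_{t-1} + phi2 * y_{t-2}."""
    rows = [[1.0, series[t - 1], series[t - 2]] for t in range(2, len(series))]
    targets = series[2:]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(3)]
    return solve_linear(xtx, xty)

def absolute_percentage_errors(coeffs, history, actuals):
    """One-step-ahead |actual - predicted| / |actual| outside the stable interval."""
    c, phi1, phi2 = coeffs
    window = list(history[-2:])
    errors = []
    for y in actuals:
        predicted = c + phi1 * window[-1] + phi2 * window[-2]
        errors.append(abs((y - predicted) / y))
        window.append(y)
    return errors

# Synthetic, noise-free AR(2) series used only to check the fit.
series = [0.0, 2.0]
for _ in range(12):
    series.append(1.0 + 0.5 * series[-1] - 0.3 * series[-2])
stable, remaining = series[:10], series[10:]
coeffs = fit_ar2(stable)
errors = absolute_percentage_errors(coeffs, stable, remaining)
```

On real data the errors on the remaining interval would be nonzero, and their ranks are what feed the p-value calculation.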
The p-values from the linear model (FMII1) and the nonlinear model (FMII2) are calculated by Equation (6), and they are combined with equal weights to calculate the ip-value of the asset. The derived ip-value lies between 0 and 1; values closer to 0 indicate a more unstable asset, and values closer to 1 indicate a more stable one. This value is used for two purposes in the next step.
$$p\text{-value} = \frac{\text{rank of the FMII in descending order}}{\text{total number of predictions}} \qquad (6)$$
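The rank-based p-value and the equal-weight combination into the ip-value can be sketched in a few lines. The function names and the sample FMII values are hypothetical, and ties are broken by position here, which is an assumption since the text does not say how ties are handled.

```python
def p_values_from_errors(errors):
    """p-value = rank of the error in descending order / number of predictions."""
    n = len(errors)
    order = sorted(range(n), key=lambda i: errors[i], reverse=True)
    p = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        p[idx] = rank / n  # the largest error gets rank 1 and the lowest p-value
    return p

def ip_values(linear_errors, nonlinear_errors, weight=0.5):
    """Combine the two p-value series with equal weights by default."""
    p1 = p_values_from_errors(linear_errors)
    p2 = p_values_from_errors(nonlinear_errors)
    return [weight * a + (1.0 - weight) * b for a, b in zip(p1, p2)]

# Hypothetical FMII1 and FMII2 values for four prediction dates.
fmii1 = [0.10, 0.40, 0.20, 0.05]
fmii2 = [0.30, 0.50, 0.10, 0.20]
ip = ip_values(fmii1, fmii2)
```

The second date has the largest error under both models, so it receives the lowest ip-value and is flagged as the most unstable.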

Phase 2: Determine the Allocation of Assets
The allocation of assets uses the ip-value calculated in Section 3.1. First, we use the ip-value to filter the assets. If the ip-value of an asset is lower than a certain threshold, the asset is excluded from the portfolio because it is considered unusual and unstable. The second use of the ip-value is to determine the weight of each asset in the portfolio. The weight is obtained by dividing the ip-value of each asset by the sum of the ip-values of all assets remaining in the portfolio. For example, suppose the ip-values of three assets in a portfolio, A1, A2, and A3, are 0.3, 0.8, and 0.1, respectively, and the threshold is 0.2. Then, A3 is excluded from the portfolio, and the investment weights are 0.3/(0.3 + 0.8) ≈ 0.273 for A1 and 0.8/(0.3 + 0.8) ≈ 0.727 for A2.
The determination of the asset weight as mentioned above requires a threshold value for the ip-value. In a previous study, the threshold value for the p-value is reported as 0.2 for the Korean stock market, but this value is not always appropriate [24]. In this paper, a GA is used to determine an appropriate threshold value with a population size, crossover rate, and mutation rate of 1000, 0.5, and 0.06, respectively. The learning interval is a fixed period in the past over which the optimized ip-value threshold is found. The chromosome is a linear array of the ip-value thresholds for each asset, and the objective function is to maximize the Sharpe ratio of index returns over the past period. We use the optimized ip-value threshold from the GA to remove unstable assets and determine the asset weights of a portfolio for the next period.
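The filtering and weighting rule of this phase can be written as a short function and checked against the three-asset example in the text. The function name and the per-asset threshold dictionary are illustrative choices; an asset is kept here when its ip-value is not lower than its threshold.

```python
def allocate(ip_by_asset, thresholds):
    """Drop assets whose ip-value is below their threshold, then weight by ip-value."""
    kept = {a: v for a, v in ip_by_asset.items() if v >= thresholds[a]}
    total = sum(kept.values())
    # Excluded assets get weight 0; kept assets share the capital by ip-value.
    return {a: (kept[a] / total if a in kept else 0.0) for a in ip_by_asset}

# The example from the text: ip-values 0.3, 0.8, 0.1 and a threshold of 0.2.
ip_by_asset = {"A1": 0.3, "A2": 0.8, "A3": 0.1}
thresholds = {"A1": 0.2, "A2": 0.2, "A3": 0.2}
weights = allocate(ip_by_asset, thresholds)
```

In the proposed model the entries of `thresholds` would be the values found by the GA for each asset rather than a common 0.2.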

Phase 3: Volatility Target
Once the asset weights in the portfolio have been obtained, investors must determine the proportion of their capital to invest in the portfolio. We use the volatility target method to determine the portfolio investment proportion. First, the portfolio volatility is calculated using the volatility and weight of each asset and the correlation coefficients between asset returns in the portfolio. Then, if the portfolio volatility is larger than the target volatility, the proportion of capital invested in the portfolio is obtained by dividing the target volatility by the portfolio volatility. The investor's capital other than the portfolio investment is assumed to be invested in bonds. Finally, the investor's total return is generated by the portfolio returns and the bond returns.
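A minimal sketch of the volatility target step is given below. It assumes the standard portfolio variance formula built from weights, volatilities, and pairwise correlations, and it caps the exposure at 100% when the portfolio volatility is already below the target; the function names and all numbers are hypothetical.

```python
import math

def portfolio_volatility(weights, vols, corr):
    """sqrt of sum_i sum_j w_i * w_j * sigma_i * sigma_j * rho_ij."""
    n = len(weights)
    variance = sum(weights[i] * weights[j] * vols[i] * vols[j] * corr[i][j]
                   for i in range(n) for j in range(n))
    return math.sqrt(variance)

def target_exposure(port_vol, target_vol):
    """Share of capital placed in the portfolio; the rest goes to bonds."""
    if port_vol <= target_vol:
        return 1.0
    return target_vol / port_vol

# Hypothetical two-asset portfolio with uncorrelated monthly returns.
weights = [0.5, 0.5]
vols = [0.08, 0.04]
corr = [[1.0, 0.0], [0.0, 1.0]]
port_vol = portfolio_volatility(weights, vols, corr)
exposure = target_exposure(port_vol, target_vol=0.029)

# Total return for one month, given hypothetical portfolio and bond returns.
total_return = exposure * 0.02 + (1.0 - exposure) * 0.003
```

Here the portfolio volatility is about 4.5% per month, so roughly 65% of the capital is invested in the portfolio and the remainder in bonds.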

Application of the Proposed Model to Build a Robo-Advisor
A robo-advisor is created by a team of experts in financial advice, investment management, and tech product development. The procedure of building a robo-advisor largely consists of three steps, as described by the robo-advisor framework in Figure 2. In the first step, shown in panel 1 of Figure 2, a model for identifying customer propensity is developed: a client survey is conducted, and clients are classified depending on their status and risk preferences. In the second step, shown in panels 2-5 of Figure 2, an optimized portfolio is constructed by judging market status, measuring price volatility, and clustering ETFs by risk level. In the last step, shown in panel 6 of Figure 2, a pattern-matching trading system is developed for making actual trades. Comparing Figures 1 and 2, the proposed model in this paper can be applied to the second step of building a robo-advisor because the proposed asset allocation model, consisting of three phases, can be used to construct an optimized portfolio. In other words, Figure 1 is related to panels 2-5 of Figure 2.

Experimental Environments
We use ETF time series data traded in Korean financial markets. In Korea, investors can invest in ETFs at a lower cost than other funds because there are no sales fees and commissions are lower. Investing in a stock ETF provides a similar effect to investing in all stocks included in the index. Therefore, ETFs enable investors to invest in a very small percentage of the entire market as well as in assets that are difficult to access, such as illiquid bonds or crude oil. As a result, ETFs allow investors to make diversified investments at a lower cost. We use the KODEX200, which tracks the KOSPI200 index, as the stock index and the KODEX KTB as the bond index. We use the won/dollar exchange rate for foreign exchange assets and, for commodity assets, TIGER crude oil futures, a Korean exchange-traded fund that tracks the performance of the S&P GSCI Crude Oil Enhanced Index. Additionally, U.S. (United States) financial market data are used for model validation. We use the iShares ETF dataset managed by BlackRock, an American global investment management corporation. Table 1 shows the selected ETFs for the stock, bond, currency, and commodity sectors of the Korean and U.S. markets. Monthly time series data for each asset from January 2014 to December 2018 are used for training and testing in our experiments. Figure 3 shows the movement of the four time series during the sample period. The learning interval, or training period, for the threshold optimization is set to the past 12 months, and the test period is the next month. Because we use monthly data, we learn from the 12 most recent data points and apply the determined asset weights to the next month. The sliding window method is used. The window moves forward by one month, and the total number of windows is 48. Table 2 shows the sliding window schedule with training and testing periods. The target volatility is set at 2.9%, which is the ex post realized volatility of a typical 60/40 U.S. stock/bond portfolio [37].
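The sliding window schedule implied by this description (12 training months, one test month, a one-month step over January 2014 to December 2018) can be generated with a short sketch. The function name and the month labels are illustrative; the schedule in Table 2 remains the authoritative version.

```python
def sliding_windows(months, train_len=12, test_len=1, step=1):
    """Return (training months, test months) pairs for a rolling schedule."""
    windows = []
    last_start = len(months) - train_len - test_len
    for start in range(0, last_start + 1, step):
        train = months[start:start + train_len]
        test = months[start + train_len:start + train_len + test_len]
        windows.append((train, test))
    return windows

# 60 monthly labels from 2014-01 to 2018-12.
months = [f"{2014 + k // 12}-{k % 12 + 1:02d}" for k in range(60)]
windows = sliding_windows(months)
first_train, first_test = windows[0]
last_train, last_test = windows[-1]
```

With 60 monthly observations this yields 48 windows, the first tested on January 2015 and the last on December 2018.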
Four preliminary experiments were conducted to verify that the proposed model is more valid than the other models. Experiment 1 adopts the risk parity model with a monthly volatility target of 2.9%, which determines the proportion of asset allocation as the reciprocal of the volatility of each asset. In Experiment 2, we use the traditional mean-variance model with a volatility target to determine asset weights for an efficient portfolio. Experiment 3 is conducted to determine the optimal asset weights by using a GA and the ip-value with a fixed threshold of 0.2. In this experiment, the ip-value is used to determine the investment assets, and the specific allocation weights are searched by a GA with 1-year training. In Experiment 4, we compose a portfolio based on the ip-value with threshold values optimized by a GA for each asset. The optimized threshold values are used to determine the investment assets, and the specific allocation weights are derived from the ip-values of the invested assets. We use 1-year (Experiment 4.1) and 2-year (Experiment 4.2) training periods in this experiment. The objective function for the GA is to maximize the Sharpe ratio during the training period. Experiments 1 to 4 are based on the Korean ETF market. For model validation, we conduct Experiment 5, in which we apply the most profitable parameters obtained from Experiments 1 to 4 to the U.S. ETF market.
In all experiments, the volatility target method described in Phase 3 determines the proportion of capital invested in the portfolio. The investor's capital other than the portfolio investment is assumed to be invested in KODEX KTB (bond) in this model, so the investor's total return is generated by the portfolio returns and the KODEX KTB return.

Experimental Results
First, the volatility and the ip-value of each ETF are obtained as shown in Figures 4 and 5, respectively. The lower the ip-value is, the higher the instability. When the ip-values in Figure 5 are compared with the volatilities of each asset in Figure 4, it is noted that the larger the volatility is, the lower the ip-value. In the case of bonds, even though the volatility does not seem to change, the ip-value fluctuates because it is based on the rank of the iFMII and reflects even a low level of volatility. Therefore, the ip-value shows the level of stability of asset movement. Using the monthly returns during the testing period, we calculate the mean monthly return, cumulative return, standard deviation, downside risk, skewness, kurtosis, Sharpe ratio, and Sortino ratio for each experiment model. The Sharpe ratio is the ratio of the portfolio's mean monthly return in excess of the risk-free rate to its standard deviation, as shown in Equation (7). Downside risk is calculated based on Equation (8). For every month, a deviation is counted only when the monthly return is less than the target return, which is set to the mean monthly risk-free return in this paper. The Sortino ratio, which measures the risk-adjusted return, is the ratio of the mean monthly return in excess of the target return to the downside risk, as shown in Equation (9).
The results of these statistics for each experiment model are shown in Table 3. The risk parity model (Experiment 1) shows the lowest return and negative Sharpe and Sortino ratios due to low return volatility and negative excess return. Returns and ratios of the mean-variance model (Experiment 2) are found to be lower than those from the ip-value models. Experiment 3, which uses the specific threshold rather than the optimized threshold with a 1-year training period, also shows higher returns and ratios than those from Experiments 1 and 2. Therefore, we can expect that the ip-value-based models bring higher returns per volatility than the existing models. Additionally, the Sharpe and Sortino ratios of Experiment 4 are higher than those from Experiments 1 and 2. However, we find some differences in results from the ip-value models using different parameters. The model using a 1-year training period (Experiment 4.1) shows better performance than that using a 2-year learning period (Experiment 4.2). Comparing the results of Experiment 3 and Experiment 4.2, the model using the specific threshold and 1-year weight optimization by GA shows lower returns but similar Sharpe ratios and higher Sortino ratios. Although assessing the asset allocation model through the returns is important, it is more important to evaluate it through the generated portfolio returns per volatility, which is measured by the Sharpe and Sortino ratios. From that point of view, a 2-year training process may not be appropriate when constructing ip-value-based models. Our presented 1-year training model in Experiment 4.1 shows the highest mean monthly and cumulative returns as well as the highest Sharpe and Sortino ratios. Therefore, we use the training interval and other parameters of Experiment 4.1 for Experiment 5 with U.S. data.
We also find that monthly returns from all models except for the risk parity model show a left-tailed distribution with negative skewness, and there is no excess kurtosis problem for any model. Figures 6 and 7 show, respectively, the monthly return and cumulative return of Korean ETFs over the sample period generated from the ip-value model with a 1-year learning period. As shown in Figure 6, the portfolio generates mostly positive returns except for several months. In this empirical analysis, rebalancing is performed monthly, so there may be points in time at which the asset allocation is not adjusted sensitively enough. If the asset allocation were rebalanced on a weekly or daily basis, this problem would likely be mitigated to some extent. Figures 8 and 9 show, respectively, the monthly return and cumulative return of U.S. ETFs over the sample period generated from the ip-value model with a 1-year learning period. As shown in Figure 8, the portfolio generates positive returns more frequently for U.S. ETFs than for Korean ETFs. We also find that our model generates competitive performance in the U.S. ETF market as well as the Korean ETF market.
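The three risk-adjusted measures can be sketched from their verbal definitions. The exact normalizations are assumptions: the sample standard deviation is used for the Sharpe ratio, and the downside risk is taken as the root mean square of shortfalls below the target, averaged over all months. The function names and the return series are hypothetical.

```python
import math
from statistics import mean, stdev

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return over the risk-free rate per unit of standard deviation."""
    return (mean(returns) - risk_free) / stdev(returns)

def downside_risk(returns, target=0.0):
    """Root mean square of shortfalls; months at or above the target count as 0."""
    shortfalls = [min(0.0, r - target) ** 2 for r in returns]
    return math.sqrt(sum(shortfalls) / len(returns))

def sortino_ratio(returns, target=0.0):
    """Mean return in excess of the target per unit of downside risk."""
    risk = downside_risk(returns, target)
    return math.inf if risk == 0.0 else (mean(returns) - target) / risk

# Hypothetical monthly returns with a zero risk-free and target return.
monthly_returns = [0.02, -0.01, 0.03, 0.00]
sharpe = sharpe_ratio(monthly_returns)
downside = downside_risk(monthly_returns)
sortino = sortino_ratio(monthly_returns)
```

Only the single negative month contributes to the downside risk, which is why the Sortino ratio of this series is much larger than its Sharpe ratio.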

Concluding Remarks
With the emergence of the robo-advisor, the asset allocation model is expected to evolve one step further. There are few asset allocation models that achieve high diversification effects and reflect the investment tendency of various investors. A model that can reflect investment tendencies is more suitable for the future robo-advisor period than the previous risk parity and mean-variance models. The model proposed in this paper combines a time series model (or statistical model) to measure the degree of instability with a GA to optimize the asset weights and thresholds for instability. In this respect, our approach differs from the existing asset allocation models, which are based only on statistical models. The empirical results show better performance by the proposed model than the previous models. It is noted that the previous models depend on some assumptions. In contrast, our model does not make any specific assumptions, so it seems more plausible to apply it to actual market data. Our model can also be used for investors with various investment propensities by using various threshold values.
The model proposed in this paper can be used as a portfolio management algorithm or as automated investment advice to construct a robo-advisor. A procedure to identify investor preferences should be added, and then ETFs that match the identified trends can be selected. Furthermore, adding another component that automatically makes asset trades completes a robo-advisor. In future research, we plan to assess the proposed model using various ranges of internal parameters for the learning period and threshold values.
This study has potential limitations. The model developed in this paper is based on a portfolio that consists of stocks, bonds, foreign exchange assets, and commodity assets in the Korean ETF market, with the U.S. ETF market used for model validation. As such, the empirical results are limited to the Korean and U.S. ETF market data. Building on the idea of our model, future research can be enriched by developing a model that can be used for other portfolios containing various types of financial assets in a global market.

Conflicts of Interest:
The authors declare no conflict of interest.