Gray Wolf Optimization Algorithm for Multi-Constraints Second-Order Stochastic Dominance Portfolio Optimization

In the field of investment, constructing a suitable portfolio from historical data remains an important problem. The second-order stochastic dominance constraint is a branch of stochastic dominance constraint theory. However, considering only second-order stochastic dominance constraints does not reflect realistic investment conditions. We therefore add a series of constraints that reflect the realistic investment environment, such as skewness and kurtosis, to the basic portfolio optimization model. In addition, we consider two risk measures: conditional value at risk and value at risk. Most importantly, we introduce the Gray Wolf Optimization (GWO) algorithm, which simulates the gray wolf's social hierarchy and predatory behavior, into the portfolio optimization model. In the numerical experiments, we compare the GWO algorithm with the Particle Swarm Optimization (PSO) algorithm and the Genetic Algorithm (GA). The experimental results show that the GWO algorithm has better optimization ability and efficiency, and that the portfolio optimized by the GWO algorithm outperforms the FTSE100 index, which suggests that the GWO algorithm has great potential in portfolio optimization.


Introduction
Since Markowitz proposed the mean-variance (MV) model [1,2], the portfolio optimization problem has attracted a lot of attention. In the MV model, variance is used as the risk measure, and returns are assumed to be normally or elliptically distributed [3]. However, Chunhachinda et al. [4] point out that the returns of the world's fourteen major stock markets are not normally distributed, which means that the MV model lacks effectiveness in practical applications. Moreover, variance counts both upward and downward deviations, which is contrary to the definition of investment risk. To address this gap, Markowitz [5] later replaces the risk measure with the semi-variance, which counts only downward deviations. In addition, Konno [6] and Speranza [7] introduce the mean absolute deviation (MAD) into the portfolio optimization model as a risk measure. MAD reduces the sensitivity of the risk measure, but it is difficult to differentiate. Generally speaking, these risk measures are rather abstract. As a new type of risk measure, value at risk (VaR) was put forward in the mid-1990s. Roughly speaking, VaR is a quantile of the random loss over a specified period at a given confidence level [8]. However, VaR also has limitations. For example, Artzner et al. [9] point out that VaR is not a coherent measure of risk. Therefore, as an improvement on VaR, conditional value at risk (CVaR) has been widely applied in the field of portfolio optimization.
Besides the Markowitz MV model, the stochastic dominance (SD) model is another important model for solving the portfolio optimization problem. SD was proposed by Fishburn [10]. Dentcheva and Ruszczynski [11] then introduce the SD constraint into the portfolio optimization model, which takes risk appetite into consideration and is better suited to investors in realistic investment environments. Within the SD model, we mainly consider the first-order stochastic dominance (FSD) constraint and the second-order stochastic dominance (SSD) constraint. In addition, Leshno and Levy [12] establish almost stochastic dominance (ASD) rules, which formally reveal a preference shared by most decision makers, but not by all of them. Moreover, Javanmardi and Lawryshyn [13] introduce a new second-order stochastic dominance efficiency model, called the SSD-DP model, which does not require a benchmark portfolio.
On the basis of the MV model and the SD model, scholars refine their models by adding a series of objectives or constraints. Arditti [14] points out that skewness is of great importance in financial economics. Yu et al. [15] and Bhattacharyya et al. [16] then study and develop the mean-variance-skewness (MVS) model, which introduces skewness into the MV model as one of the objectives. Recently, Pouya et al. [17] add the P/E criterion and experts' recommendations on market sectors to the primary MV model as two objectives. Moreover, Soleimani et al. [18] take a sector capitalization constraint into account. Generally speaking, these refinements allow the MV model and the SD model to be used more widely in realistic investment environments.
After these objectives are added to the portfolio optimization model, the original single-objective portfolio optimization problem becomes a multi-objective optimization problem. Recently, Macedo et al. [19] apply multi-objective evolutionary algorithms (MOEAs), including the non-dominated sorting genetic algorithm II (NSGA-II) and the strength Pareto evolutionary algorithm II (SPEA-II), to the mean-semivariance framework. However, because portfolio optimization involves competing and conflicting objectives, it is unlikely that a single portfolio can optimize all objectives simultaneously [20]. Moreover, a multi-objective optimization problem is normally transformed into a single-objective programming model by a fuzzy programming approach, as in Chen et al. [21]. Therefore, in this paper, we introduce skewness, kurtosis, and CVaR or VaR into the SSD framework as constraints, and we call the resulting models the MCVSK model and the MVSK model, respectively. Furthermore, we use return as the only objective.
Inspired by the leadership hierarchy and hunting mechanisms of the gray wolf, Mirjalili proposed the Gray Wolf Optimization algorithm, which simulates the gray wolf's social hierarchy and predatory behavior [22]. Compared with immune particle swarm optimization (IPSO), the ant colony algorithm (ACA), GA, and biogeography-based optimization (BBO), GWO has a simple structure and fast convergence. Further details about bio-inspired algorithms can be found in the literature [23,24]. In the GWO algorithm, the hunt is led by the α, β and δ wolves, and the ω wolves follow these three to track the prey and finally complete the predation task. Recently, GWO has been used to improve the classification performance of Convolutional Neural Network (CNN) models. Kumaran et al. [25] propose a hybrid CNN-GWO approach for recognizing human actions in unconstrained videos; the experimental validation of this approach reports a recognition accuracy of 99.9%. In addition, Mustaffa et al. [26] use GWO to train a Least Squares Support Vector Machine (LSSVM) for price forecasting and present a hybrid forecasting model. In this paper, the GWO algorithm is used to solve the MCVSK and MVSK portfolio optimization models, and we compare it with the PSO algorithm and GA in numerical experiments.
The rest of the paper is organized as follows. In Section 2, we discuss the MCVSK and MVSK portfolio optimization models. In Section 3, we discuss the GWO algorithm for the MCVSK and MVSK portfolio optimization models. In Section 4, we conduct numerical experiments and analyse the results. In Section 5, we summarize the performance of the MCVSK and MVSK portfolio optimization models.

The Measure of Return and Risk
As the most fundamental objective, the return of the portfolio always attracts investors' attention. In the portfolio optimization problem, our aim is to invest our capital in a set of assets so as to obtain desirable characteristics of the total return on investment.
Let n denote the number of assets available for investment at the beginning of a fixed period, and assume that we have a fixed capital to invest in them. We use $x = (x_1, x_2, \ldots, x_n)^T$ to denote the fractions of the initial capital invested in the n assets, where $x_i = w_i / w$, $i = 1, \ldots, n$, $w_i$ is the capital invested in asset i and w is the total capital to be invested. Let X denote the set of feasible portfolios, so that $x \in X$; clearly $X \subset \mathbb{R}^n$ is a bounded convex polyhedron. In addition, let $R_i(\xi)$, $i = 1, \ldots, n$, denote the return of asset i in the case of a discrete distribution, where $\xi : \Omega \to \Xi$ is a random vector on the probability space $(\Omega, \mathcal{F}, P)$ [27], and assume that $E|R_j| < \infty$ for all $j = 1, \ldots, n$. In short, for a fixed capital, the return of the portfolio $g(x, \xi)$ can be formulated as

$$g(x, \xi) = \sum_{i=1}^{n} x_i R_i(\xi).$$

The measure of risk takes many forms, such as variance, semi-variance, mean absolute deviation, VaR and CVaR. Variance is the risk measure of the Markowitz MV model, which has a long history and a wide range of applications. Semi-variance, mean absolute deviation and other risk measures were later developed on the basis of variance. In this paper, we mainly study two risk measures: VaR and CVaR. VaR is one of the best-known downside risk measures because of its intuitive meaning and its wide range of practical applications [8]. For a fixed level α, the value at risk $\mathrm{VaR}_\alpha$ is defined as the α-quantile of the cumulative distribution function $F_Y$ [28]:

$$\mathrm{VaR}_\alpha(Y) = \inf\{\eta \in \mathbb{R} : F_Y(\eta) \ge \alpha\},$$

where $Y = -X$. However, VaR still has several noticeable limitations and drawbacks. For example, it is insensitive to the magnitude of losses beyond VaR, and it is not a coherent risk measure [9]. CVaR, in contrast, measures the conditional expectation of losses beyond VaR and is a coherent risk measure. Therefore, CVaR is theoretically more attractive and partly resolves the shortcomings of VaR [29]. CVaR is defined as the optimal value of an optimization problem whose objective involves the positive part $[z]_+ = \max(z, 0)$. Moreover, for smooth $F_Y$, CVaR equals the conditional expectation of Y beyond VaR [30].
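The quantities in this section can be illustrated on a small set of equally likely return scenarios. The following is a minimal sketch rather than the implementation used in the experiments: the function names are hypothetical, losses are taken as the negated portfolio returns, VaR is taken as the empirical α-quantile of the losses, and CVaR is approximated by the average of the losses at or beyond that quantile. The quantile and tail conventions are assumptions made for illustration.

```python
import math


def portfolio_returns(weights, scenario_returns):
    """g(x, xi) = sum_i x_i * R_i(xi), evaluated for every scenario (row)."""
    return [sum(w * r for w, r in zip(weights, row)) for row in scenario_returns]


def empirical_var(losses, alpha):
    """Smallest eta with F_Y(eta) >= alpha, for equally likely losses."""
    ordered = sorted(losses)
    rank = max(1, math.ceil(alpha * len(ordered)))  # 1-based rank of the quantile
    return ordered[rank - 1]


def empirical_cvar(losses, alpha):
    """Average of the losses at or beyond the empirical alpha-quantile."""
    var = empirical_var(losses, alpha)
    tail = [y for y in losses if y >= var]
    return sum(tail) / len(tail)


# Two assets, three scenarios (illustrative numbers only).
scenarios = [[0.02, 0.04], [-0.01, 0.03], [0.00, -0.02]]
returns = portfolio_returns([0.5, 0.5], scenarios)
losses = [-r for r in returns]  # Y = -X
print(empirical_var(losses, 0.5), empirical_cvar(losses, 0.5))
```

By construction the tail average is never smaller than the quantile, which mirrors the statement that CVaR accounts for the magnitude of losses beyond VaR.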

The Second-Order Stochastic Dominance Constraint
The theory of stochastic dominance originates from the theory of discrete stochastic variable optimization and later developed into the theory of generalized stochastic variable optimization, which is now widely used in economics and finance. In stochastic dominance theory, random variables are compared through their k-th order distribution functions $F^{(k)}$. Assume that there are two portfolios X and Y and that the utility functions of all investors are monotonically increasing. If every investor either prefers portfolio X to portfolio Y or is indifferent between them, we say that portfolio X stochastically dominates portfolio Y in the first order [31]. Let x and y be decision vectors and ξ be the random variable. We say that $g(x, \xi)$ stochastically dominates $g(y, \xi)$ in the first order, denoted by $g(x, \xi) \succeq_{(1)} g(y, \xi)$, if

$$F(g(x, \xi); \eta) \le F(g(y, \xi); \eta) \quad \text{for all } \eta \in \mathbb{R},$$

where $g(x, \xi)$ is a concave continuous function in both x and ξ, and $F(g(x, \xi); \eta)$ is the cumulative distribution function of $g(x, \xi)$. For a random variable X, the first-order distribution function of X is its right-continuous cumulative distribution function:

$$F^{(1)}(X; \eta) = P(X \le \eta), \quad \eta \in \mathbb{R}.$$

In the portfolio optimization problem, SSD is of interest because, for any decision maker with risk-averse and non-satiable preferences, a portfolio that dominates a benchmark portfolio is preferred to the benchmark [32]. Based on the definition of first-order stochastic dominance, we say that $g(x, \xi)$ stochastically dominates $g(y, \xi)$ in the second order, denoted by $g(x, \xi) \succeq_{(2)} g(y, \xi)$, if

$$\int_{-\infty}^{\eta} F(g(x, \xi); t)\, dt \le \int_{-\infty}^{\eta} F(g(y, \xi); t)\, dt \quad \text{for all } \eta \in \mathbb{R}.$$

For two different portfolios X and Y, the strict dominance relation $\succ_{(k)}$ is defined as follows:

$$X \succ_{(k)} Y \iff X \succeq_{(k)} Y \ \text{and} \ Y \not\succeq_{(k)} X.$$

Fábián et al. [33] and Ogryczak et al. [34] prove that SSD is equivalent to two inequalities stated in terms of $E(\cdot)$, the expected value with respect to the probability distribution of ξ. Moreover, for any increasing and concave utility function u, second-order dominance of $R_x$ over $R_y$ implies

$$E[u(R_x)] \ge E[u(R_y)],$$

where $R_x$ and $R_y$ are two random variables that may represent the returns of two portfolios x and y [33]. In addition, for two different portfolios X and Y, Noyan and Rudolf [35] point out that the SSD constraint is equivalent to a continuum of CVaR constraints over all confidence levels $\alpha \in (0, 1]$.
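For equally likely historical scenarios, second-order dominance can be checked directly. The sketch below is an illustration under stated assumptions, not the procedure used in the paper: it relies on the standard identity that the integrated distribution function equals the expected shortfall below a threshold, $E[(\eta - X)_+]$, and on the fact that for finite samples it is enough to compare the two piecewise-linear functions at the observed return values. The function names and the tolerance are hypothetical.

```python
def expected_shortfall_below(sample, eta):
    """E[(eta - X)_+] for an equally weighted sample of returns."""
    return sum(max(eta - v, 0.0) for v in sample) / len(sample)


def dominates_ssd(x_returns, y_returns, tol=1e-12):
    """True if the x sample dominates the y sample in the second order."""
    # Both shortfall functions are piecewise linear with kinks only at
    # observed values, so comparing them at those values is sufficient.
    thresholds = sorted(set(x_returns) | set(y_returns))
    return all(
        expected_shortfall_below(x_returns, eta)
        <= expected_shortfall_below(y_returns, eta) + tol
        for eta in thresholds
    )


# y is a mean-preserving spread of x, so a risk-averse investor prefers x.
x = [0.01, 0.02, 0.03]
y = [0.00, 0.02, 0.04]
print(dominates_ssd(x, y), dominates_ssd(y, x))
```

In a constrained model, the second argument would be the return sample of the benchmark portfolio, and the check would act as a feasibility test for a candidate weight vector.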

The Skewness and Kurtosis Constraints
To obtain more practical results from the portfolio optimization model, many scholars add objectives and constraints to the model. For example, Soleimani et al. [18] introduce transaction costs, cardinality constraints and market capitalization into the Markowitz MV model as constraints and optimize the resulting model with a genetic algorithm. In addition, Lwin et al. [3] introduce real-world constraints, such as cardinality, quantity, pre-assignment, round-lot and class constraints, into the Markowitz MV model. In this paper, we mainly consider the skewness and kurtosis constraints.
Regarding transaction costs, Yoshimoto [36] points out that ignoring them leads to very ineffective portfolio implementation. Briefly, transaction costs can be used to model a number of costs, such as brokerage fees, bid-ask spreads, taxes, or even fund loads. Suppose that the transaction cost $c_i$ is a V-shaped function of the difference between a given portfolio $x^o = (x^o_1, \ldots, x^o_n)$ and a new portfolio $x = (x_1, \ldots, x_n)$, and that it is incorporated into the portfolio return. The transaction cost of the i-th asset is then a V-shaped function of $x_i - x^o_i$, and the total transaction cost is the sum of the costs over the n assets [37]. Generally speaking, the transaction cost can be modelled as the sum of a fixed cost and a variable cost [38]. In this paper, we consider only a single period, so we treat the transaction cost as a relatively exogenous variable.
Chunhachinda et al. [4] point out that higher moments, including skewness and kurtosis, cannot be neglected in portfolio selection and are of great importance in financial economics. As shown by Arditti [14], an investor's preference for more skewness over less is consistent with the notion of decreasing absolute risk aversion, because a positively skewed asset return has a density function with an elongated right-hand tail. Let ζ be an uncertain variable with finite expected value e. The skewness and kurtosis of ζ are defined through its third and fourth central moments and are given by Equations (14) and (15), respectively.
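As an illustration of these higher-moment quantities, the sketch below computes sample skewness and kurtosis from a list of historical returns. It assumes the common standardized definitions (third and fourth central moments divided by the appropriate power of the variance); the normalization in Equations (14) and (15) may differ, and the function names are hypothetical.

```python
def central_moment(sample, order):
    """Sample central moment E[(zeta - e)^order] with equal weights."""
    mean = sum(sample) / len(sample)
    return sum((v - mean) ** order for v in sample) / len(sample)


def skewness(sample):
    """Third central moment scaled by variance^(3/2) (assumed convention)."""
    return central_moment(sample, 3) / central_moment(sample, 2) ** 1.5


def kurtosis(sample):
    """Fourth central moment scaled by variance^2 (assumed convention)."""
    return central_moment(sample, 4) / central_moment(sample, 2) ** 2


# A sample with one large positive outcome has a long right-hand tail.
right_tailed = [0.0, 0.0, 0.0, 4.0]
print(skewness(right_tailed), kurtosis(right_tailed))
```

A positive skewness value corresponds to the right-hand elongated tail described above, which is why a skewness constraint is typically imposed as a lower bound and a kurtosis constraint as an upper bound.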

The MCVSK and MVSK Portfolio Optimization Model
In summary, in this paper we mainly study the MCVSK portfolio optimization model (16), where y is a benchmark investment, such as the FTSE100 index portfolio. For the objective function $f(x, \xi)$, we mainly consider the two forms given in Equations (17) and (18) [31], where x is the investment proportion vector and ξ is a vector representing the returns of the assets. Equation (17) is relatively simple, whereas Equation (18) is more complicated because it takes the impact of transaction costs into account, which favors large investments in the portfolio. The first three constraints of the MCVSK model represent the SSD, skewness and kurtosis constraints, respectively. It is worth mentioning that we combine CVaR with the return of the portfolio and introduce the result into the MCVSK model as a new constraint. This constraint is mainly inspired by behavioral economics: we think that the risk an investor can bear should not be a fixed value.
Clearly, the higher the return of the portfolio, the higher the risk we should be prepared to bear. Moreover, we do not consider short-selling, so X can be expressed as in Equation (19). In practical applications, to ensure the diversity of the portfolio, the upper bound on the fraction of capital invested in each asset is set to a relatively low value, and the specific value is adjusted according to the number of assets available for investment. For comparison, we replace the risk measure in the MCVSK model with VaR and propose the MVSK portfolio optimization model (20), in which $x \in X$.
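The feasible set described above can be checked with a few lines of code. This is a minimal sketch under assumptions: since x holds fractions of the capital, the weights are assumed to sum to one; short-selling is excluded by requiring nonnegative weights; and the per-asset upper bound is passed in as a parameter. The function name and the numerical tolerance are hypothetical.

```python
def is_feasible_weights(x, upper_bound, tol=1e-9):
    """Check x against an assumed feasible set X:
    fully invested, no short-selling, and a per-asset cap."""
    fully_invested = abs(sum(x) - 1.0) <= tol
    within_bounds = all(-tol <= xi <= upper_bound + tol for xi in x)
    return fully_invested and within_bounds


# Example: 101 equally weighted assets with a 5% cap on each asset.
n_assets = 101
equal_weights = [1.0 / n_assets] * n_assets
print(is_feasible_weights(equal_weights, upper_bound=0.05))
```

A metaheuristic such as GWO can use a check of this kind either to reject infeasible candidates or as the trigger for a repair or penalty step; which of these the original experiments use is not stated here.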

The GWO Algorithm for the MCVSK and MVSK Portfolio Optimization Model
The Grey Wolf Optimization algorithm is a metaheuristic search algorithm proposed by Mirjalili and inspired by grey wolves. It simulates the social hierarchy and hunting mechanism of grey wolves in nature and can be used to solve nonconvex engineering optimization problems. Grey wolves are organized into four main levels [39]. α wolves are the leaders and are responsible for making decisions. β wolves are second-level wolves that help the α wolves in decision-making and other activities; the β wolf is the best candidate to become the next α wolf. δ wolves carry out the instructions of the α and β wolves but can direct the lower-ranked individuals; they serve as scouts, sentinels, elders, hunters and caretakers. ω wolves are fourth-level wolves and act as executors within the pack. GWO is based on three main steps: searching for prey, encircling prey and hunting. In this section, we discuss these three steps [22].

Prey Searching
The random initialization of the grey wolves in the search space paves the way for the search. To search for prey, the grey wolves diverge from one another, and they converge again once the prey has been detected.

Prey Encirclement
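The encirclement of prey begins once the prey has been detected. The encirclement behaviour can be described mathematically as follows:

$$\vec{X}(t+1) = \vec{X}_p(t) - \vec{A} \cdot \vec{D}, \tag{21}$$

where $\vec{X}$ is the grey wolf position, $\vec{X}_p$ is the prey position, t is the iteration number, and $\vec{D}$ is given by

$$\vec{D} = \left| \vec{C} \cdot \vec{X}_p(t) - \vec{X}(t) \right|. \tag{22}$$

$\vec{A}$ and $\vec{C}$ are coefficient vectors given by

$$\vec{A} = 2\vec{a} \cdot \vec{r}_1 - \vec{a}, \tag{23}$$

$$\vec{C} = 2\vec{r}_2, \tag{24}$$

where $\vec{a}$ is linearly decreased from 2 to 0 over the course of the iterations, as given by Equation (25), and $\vec{r}_1$ and $\vec{r}_2$ are random vectors with components in [0, 1]:

$$\vec{a} = 2 - \frac{2t}{l}, \tag{25}$$

where l is the total number of iterations allowed for the optimization.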
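A single encirclement move can be sketched in a few lines. The code below follows the standard GWO update from Mirjalili's formulation, applied componentwise; the function names are hypothetical, and the random number source is a parameter so that runs can be reproduced.

```python
import random


def linear_a(t, total_iterations):
    """Control parameter a, decreased linearly from 2 (t = 0) to 0 (t = l)."""
    return 2.0 - 2.0 * t / total_iterations


def encircle_step(wolf, prey, a, rng=random):
    """Move one wolf relative to the prey position, component by component."""
    new_position = []
    for x, xp in zip(wolf, prey):
        A = 2.0 * a * rng.random() - a  # lies in [-a, a]
        C = 2.0 * rng.random()          # lies in [0, 2]
        D = abs(C * xp - x)             # distance to the (perturbed) prey
        new_position.append(xp - A * D)
    return new_position


rng = random.Random(0)
print(encircle_step([1.0, 2.0], [0.5, -0.5], linear_a(3, 12), rng))
```

When a is large, |A| can exceed 1 and the wolf may overshoot the prey, which supports exploration; as a approaches 0, the new position collapses onto the prey position, which supports exploitation.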

Chasing (Hunting)
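After the encirclement procedure is complete, the grey wolves focus on hunting the prey. The alpha wolf usually leads the hunt, and the beta and delta wolves also participate. However, in an abstract search space, we do not know the location of the optimum (the prey). To simulate the hunting behaviour of grey wolves mathematically, we suppose that the alpha (the best candidate solution), beta and delta wolves have better knowledge about the potential location of the prey. The distances between a wolf and the alpha, beta and delta wolves are given by Equations (26)-(28):

$$\vec{D}_\alpha = \left| \vec{C}_1 \cdot \vec{X}_\alpha - \vec{X} \right|, \tag{26}$$

$$\vec{D}_\beta = \left| \vec{C}_2 \cdot \vec{X}_\beta - \vec{X} \right|, \tag{27}$$

$$\vec{D}_\delta = \left| \vec{C}_3 \cdot \vec{X}_\delta - \vec{X} \right|. \tag{28}$$

The position vectors of the prey with respect to the alpha, beta and delta wolves can then be calculated as

$$\vec{X}_1 = \vec{X}_\alpha - \vec{A}_1 \cdot \vec{D}_\alpha, \tag{29}$$

$$\vec{X}_2 = \vec{X}_\beta - \vec{A}_2 \cdot \vec{D}_\beta, \tag{30}$$

$$\vec{X}_3 = \vec{X}_\delta - \vec{A}_3 \cdot \vec{D}_\delta. \tag{31}$$

The best position is calculated as the average of the positions obtained from the alpha, beta and delta wolves, as shown in Equation (32):

$$\vec{X}(t+1) = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3}. \tag{32}$$

For the MCVSK and MVSK models, we use the return of the portfolio as the fitness function of the grey wolf algorithm and constrain each initialized wolf position to be a 101-dimensional investment proportion vector. The procedure chart of the GWO algorithm is shown in Figure 1, and the GWO approach is described in Algorithm 1.
Algorithm 1. The main procedure of the GWO algorithm for the MCVSK and MVSK models.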
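The overall loop can be sketched as follows. This is a generic GWO maximizer on box constraints written under assumptions, not a reproduction of Algorithm 1: the function name, the default population size and iteration count, the clipping to the bounds, and the toy fitness function are all hypothetical, and the SSD, skewness, kurtosis and risk constraints of the MCVSK and MVSK models are not included.

```python
import random


def gwo_maximize(fitness, dim, lower, upper, n_wolves=30, n_iter=100, seed=0):
    """Maximize `fitness` over [lower, upper]^dim with a basic GWO loop."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lower, upper) for _ in range(dim)]
              for _ in range(n_wolves)]
    best, best_fit = None, float("-inf")
    for t in range(n_iter):
        # Rank the pack; the three fittest wolves act as alpha, beta, delta.
        ranked = sorted(wolves, key=fitness, reverse=True)
        leaders = [list(w) for w in ranked[:3]]
        top_fit = fitness(leaders[0])
        if top_fit > best_fit:
            best, best_fit = leaders[0], top_fit
        a = 2.0 - 2.0 * t / n_iter  # decreases linearly from 2 towards 0
        for wolf in wolves:
            for d in range(dim):
                estimate = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    D = abs(C * leader[d] - wolf[d])
                    estimate += leader[d] - A * D
                # Average of the three estimates, clipped to the bounds.
                wolf[d] = min(max(estimate / 3.0, lower), upper)
    for wolf in wolves:  # account for the positions after the last update
        f = fitness(wolf)
        if f > best_fit:
            best, best_fit = list(wolf), f
    return best, best_fit


def toy_fitness(v):
    """Concave toy objective with its maximum at (1, 1, 1)."""
    return -sum((c - 1.0) ** 2 for c in v)


solution, value = gwo_maximize(toy_fitness, dim=3, lower=-5.0, upper=5.0)
print(solution, value)
```

For a portfolio model, the fitness would be the expected portfolio return of a candidate weight vector, with infeasible candidates repaired or penalized before ranking.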

Numerical Experiments
We carried out a series of numerical tests using the GWO algorithm, the PSO algorithm and GA in MATLAB 2014b installed on a Lenovo PC with the Windows 8.1 operating system and 8 GB of RAM. In the actual investment market, there are a large number of assets for investors to choose from. Therefore, in the numerical experiments, we use the stocks included in the FTSE100 index as the data source. We collect 249 daily historical returns of 101 FTSE100 index assets prior to December 2016. Specifically, we use the first 200 daily historical returns to construct the portfolio strategy and the remaining 49 daily historical returns for an out-of-sample test. In this section, we report and analyze the test results.

Backtesting and Out-of-Sample Test
In model (16) and model (20), we introduced the basic MCVSK model and MVSK model, respectively. In this part, we fix a benchmark portfolio y and assume that short-selling is prohibited. In addition, to ensure the diversity of the portfolio, the upper bound on the fraction of capital invested in each asset is set to 5%. Putting these together, we obtain the MCVSK model (33) and the MVSK model (34). Comparing with the FTSE100 index, we obtain the test results shown in Table 1; the index weight data of the FTSE100 index are given in Table A1.

In Table 1 and the remaining tables, "α" refers to the confidence level, "algorithm" indicates the algorithm used to optimize the MCVSK and MVSK models, "skew" and "kurt" refer to the skewness and kurtosis of the portfolio, "VaR" and "CVaR" are the two risk measures, "E[g(x, ξ)]" is the return of the portfolio, "max" is the largest investment ratio in the portfolio and "time" indicates the average running time of the algorithm. The algorithm parameters are set as follows: the initial populations of the GWO algorithm, the PSO algorithm and GA are 100, 10 and 100, respectively, and the number of iterations of all three algorithms is 12. Because the PSO algorithm is slow in optimizing the portfolio optimization model, we set its initial population to a lower level so that the three optimization processes take a comparable amount of time. Furthermore, we conduct thirty repeated experiments in each case.

Figures 2-11 show the asset structure of the optimal portfolio and the return of the optimal portfolio in each period obtained by the GWO algorithm. Specifically, Figures 2-7 show the asset composition of the portfolios optimized by the GWO algorithm, each of which can invest in a total of 101 assets. Figure 8 shows the asset composition of the FTSE100 index portfolio. Figures 9-11 show the returns in each period of the optimal portfolios obtained from model (33) and model (34) by the GWO algorithm and of the FTSE100 index portfolio in backtesting at three confidence levels. The data set for the backtesting is the historical rate of return for the first 200 days, which gives the x-axis scale in Figures 9-11.
Above, we set up a backtest that compares the optimal portfolios obtained from the MCVSK and MVSK models by the GWO algorithm, the PSO algorithm and GA with the FTSE100 index. Furthermore, we set up an out-of-sample test to evaluate the performance of the selected portfolios over the remaining 49 samples. Figures 12-14 show the returns of the optimal portfolios over these 49 days.

Numerical Analysis
In the above experiments, we conduct in-sample and out-of-sample tests of the MCVSK and MVSK portfolio optimization models with the GWO algorithm, the PSO algorithm and GA, and we compare the results with the FTSE100 index and the control group y. In this section, we mainly consider four indicators: skewness, kurtosis, VaR or CVaR, and E[g(x, ξ)]. The backtesting results show that the skewness of the portfolios derived from the MCVSK and MVSK models is higher than that of the FTSE100 index and the control group y, while their kurtosis is lower. As for the risk measure, the MCVSK model uses CVaR, whereas the MVSK model uses VaR. The experimental results show that the portfolios optimized by the GWO algorithm are generally much more profitable than the FTSE100 index and the control group y while carrying lower risk.
Specifically, when the confidence level α is 1%, 5% or 10%, the CVaR of the portfolio obtained by the MCVSK model is −2.7743, −2.1648 and −1.7413, respectively, which is smaller in magnitude than the comparison values of −3.2503, −2.3205 and −1.8086. At the same time, the return of the portfolio obtained by the MCVSK model is 0.1430, 0.1478 and 0.1512, respectively, which is considerably higher than 0.1017. Comparing the MCVSK model with the MVSK model, we find that the portfolio obtained by the MCVSK model generally performs better on the rate of return. For example, when the confidence level α is 1%, the return of the portfolio obtained by the MCVSK model is 0.1430, whereas that of the portfolio obtained by the MVSK model is 0.1402. The same pattern holds when α is 5% and 10%. In this sense, CVaR has a particular advantage as a risk measure.
We then compare the GWO algorithm with the PSO algorithm and GA. The experimental results show that all three algorithms can obtain excess returns when all the constraints of the portfolio optimization model are met. However, on the main evaluation index, the return of the portfolio optimized by the GWO algorithm is much higher than that obtained by the PSO algorithm and GA. Specifically, when the confidence level is 1%, the GWO algorithm optimizes the MCVSK model to obtain a portfolio return of 0.1430. Under the same conditions, the PSO algorithm obtains a portfolio return of 0.1107 and GA obtains 0.1215. The same pattern holds when the confidence level is 5% and 10%. Despite its higher rate of return, the risk of the portfolio optimized by the GWO algorithm is similar to that of the portfolios optimized by the PSO algorithm or GA. This shows that the GWO algorithm has a higher optimization capability for the MCVSK and MVSK portfolio optimization models.
In the out-of-sample tests, the MCVSK and MVSK models optimized by the GWO algorithm also perform well. Generally speaking, the experimental results suggest that the GWO algorithm has great potential in solving single-objective portfolio optimization problems. The GWO algorithm not only optimizes faster but also maintains high solution quality. We believe that the GWO algorithm will find much wider application.

Conclusions and Future Research
In this paper, we introduce several constraints, including skewness and kurtosis, into the basic second-order stochastic dominance portfolio optimization model. In addition, we use two different risk measures in the model, which gives the MCVSK model and the MVSK model. In the experiments, the portfolio obtained from this single-objective optimization model by the GWO algorithm achieves a higher return than those obtained by the PSO algorithm and GA, and than the FTSE100 index. We also find that using CVaR as the risk measure achieves better and more stable performance than using VaR. Generally speaking, the GWO algorithm obtains a better portfolio faster, which demonstrates its optimization ability and efficiency. We believe that the GWO algorithm has great potential in solving the portfolio optimization problem.
As single-objective optimization models, the MCVSK model and the MVSK model take return as their only optimization goal. Therefore, the portfolio optimized by the MCVSK model or the MVSK model may carry a higher risk than the FTSE100 index, which means that the portfolio obtained by these models will not meet the specific risk appetite of every investor. In the future, we will study multi-objective portfolio optimization models based on the second-order stochastic dominance constraint, which are designed to make up for the shortcomings of the single-objective portfolio optimization model. Moreover, from the experimental results, we find that the efficiency of the model is closely related to the choice of benchmark. We will therefore pay attention to this problem in future research.

Figure 1. The procedure chart of the GWO algorithm.

Table 1. The results of backtesting.