A Heuristic Approach to Forecasting and Selection of a Portfolio with Extra High Dimensions

Abstract: The performance of a financial portfolio depends on the output of two tasks: first, a forecasting process, where quantities of interest for the investors, such as the rate of return and risk for each stock, are predicted into the future, and second, an optimization process, where those individual stocks are formed into the portfolio optimizing the combined risk and reward features. However, in very large dimensions, when the number of stocks is high, those two quantitative problems often become intractable because of a loss in precision. This paper introduces a forecasting and portfolio formation strategy in multiple periods based on the splitting of the multivariate forecasting model into multiple bivariate forecasting models and updating investment weights at each period based on the predicted target quantities for the returns and the covariances. The methodology proposed is suitable for a very large portfolio of assets. The experimental results are based on a sample of one thousand stocks from the Chinese stock market. For such a large sample, the forecast and optimization process is executed speedily. The investment strategies are benchmarked against the equally weighted portfolio. In the long run, they offer a better investment performance in terms of a higher rate of return or lower risk, compared with this portfolio, demonstrating the applicability and economic value of the proposed methodology in practice.


Introduction
Portfolio managers and investors make decisions concerning the amount they invest in different assets and the timing of these investments. At the core of their decision-making process, there are two challenges. First, for the timing, there is a forecast process for the performance of each asset. Second, once the expectations for future performances are formed, there is a wealth optimization process that combines individual assets into a diversified portfolio. We shall refer to the first challenge as the forecasting problem and the second challenge as the optimization problem.
In the finance literature, the two problems are often considered separately. The forecasting problem has been taken up by studies in time-series modeling. Since the work of Box and Jenkins [1] on the autoregressive moving average (ARMA) models and the seminal paper of Engle [2] on conditional heteroscedasticity, the 1990s and 2000s have seen a vast and fruitful development in the modeling of conditional stock returns and conditional volatility. The literature considers both univariate and multivariate model specifications for, respectively, the conditional variance and conditional covariance. For the multivariate case, an excellent review is [3], summarizing the main parametric models. The optimization problem, instead, has been studied since the pioneering work of Markowitz in mathematical finance [4]. He proposed a "mean-variance" framework for the formation of diversified portfolios. Investors are risk averse and, in making investment decisions, they optimize their utilities by combining investments to maximize the expected return and minimize the expected risk. The Sharpe ratio [5] was introduced as a measure of portfolio performance combining both the reward and the risk aspects of investments. For investment evaluations, this ratio is commonly used by practitioners as an objective function.
This paper takes a holistic view of the two problems. Both the forecasting of stock performances and the optimization of a large portfolio of stocks are tackled as a single challenge for the investment decision-making process. Because of the difficulty in producing an accurate forecast of stock returns, in practice, portfolio managers tend to focus on the second issue, neglecting the forecasting models for returns and covariances. The unconditional return and covariance, estimated as the historic sample return and sample covariance of assets, are often used for portfolio optimization. The forecasting problem, although more difficult, is no less important than the optimization problem, as superior out-of-sample portfolio performance can be delivered only with a better forecast of returns and covariances. The approach to investment adopted by this paper falls into the predict-and-optimize category. A recent review of both time series prediction and portfolio optimization is [6], where investment management based on an approach that falls into this category is also presented. Some general results in the empirical literature for daily return forecasting are statistically significant trends and partial autocorrelations (see for example [7] for the UK stock market). To capture these effects, this paper applies the traditional autoregressive AR(1) forecasting model. Moreover, there is plenty of evidence of heteroskedasticity and time-varying covariances (see for example [8] for volatility spillover and heteroskedasticity in the Chinese stock markets due to the economic business cycle). This paper employs the GARCH(1,1) specification for the time-varying stock variance and applies the dynamic conditional correlation (DCC) model [9], a parsimonious specification that also allows time-varying conditional correlations to be predicted.
The contribution of this paper is the following. It offers a practical approach to both forecasting and portfolio optimization by combining the two problems with an iterative algorithm. Furthermore, the methodology presented overcomes the curse of dimensionality for high-dimensional volatility modeling and the intractability of portfolio optimization due to large covariance matrices. Instead of modeling the covariance structure of all assets as a whole, the approach relies on the equally weighted portfolio as a proxy for the market portfolio and models the covariance of each asset with this portfolio, thus reducing the large multivariate model to several bivariate models. In practice, with the development of electronic markets, the basket of stocks and other assets available for investment is very large. Investment managers often monitor investment opportunities from different markets, and even when only one market is considered, the number of stocks traded is far higher than what forecasting and optimization models can handle without incurring precision problems. For example, some of the world's largest stock exchanges, such as the New York Stock Exchange or the Shanghai Stock Exchange, each have more than two thousand stocks traded every day. This contrasts with the fact that, in the research literature, multivariate volatility models have been applied successfully only for up to 50 assets, as in [10]. Multivariate time series forecasting models with more than a thousand assets have never been successfully attempted.
Recently, thanks to developments in deep neural network modeling, stock market forecasting has acquired a new impetus. While traditional time series models impose specific functional forms and require stationarity assumptions on stock returns, a deep-learning model does not require such assumptions and allows the inclusion of data from different sources in predicting future market performance. For example, in [11] a deep model is used to extract market sentiment information from textual data and this information is embedded in another learning model to forecast future returns. In [12], a deep neural network architecture that includes convolutions and memory gates is applied to three series of high-frequency price data to predict their future prices. Deep-learning models can potentially handle hyperdimensional data without running into accuracy problems during model training. However, their performance out-of-sample (model testing) is yet to be established. Training a deep-learning model is computationally expensive, and to achieve consistent out-of-sample predictive power on an extra-large set of data, such as one thousand stock series, those hyper-parameterized models would require much more input data than is normally available to researchers and industry practitioners. An example of an out-of-sample performance study is [13], where several deep-learning models are compared with the traditional ARMA model to forecast a univariate time series of stock prices. Even in this case of low dimensionality, only a semi-parametric ARMA can beat the traditional ARMA out-of-sample. This paper does not use any deep-learning architecture to forecast future returns but instead uses the simplest parametric autoregressive model to capture the trend of a stock return, in the hope that, when this is captured, the downstream investment strategy will benefit from it and generate economic value.
The investment strategies presented in this paper rely on multiple-period evaluations of predicted returns and covariances. The strategies are heuristic in the sense that they do not rely on mathematical optimization but rather on simple and approximate updating of portfolio weights, at each period of investment, by tracking and improving upon the equally weighted portfolio investment strategy. For each investment period, the forecasting models produce predictions for the stock returns and covariances and, based on those, investment weights are re-adjusted and thus portfolios are re-balanced. The investment strategies proposed in this paper are tactical in the sense that they aim to optimize reward and risk for each period of investment while also tracking the equally weighted portfolio as a long-term benchmark. Investment strategies with multiple-period decisions have been studied in the extant literature. For example, in [14,15], approximate optimization methods are used for investment decision problems. Their approaches do not consider the out-of-sample forecast and investment performance, which limits their practical usability. Furthermore, in [16] asset allocation is studied by considering realistic scenarios for investor preferences and transaction costs. However, its approach is infeasible in large dimensions because it is heavily based on simulation.
Heuristic optimization techniques have been studied and applied in many areas of research. They are greedy algorithms specific to the problem at hand, with the goal of reaching a solution that is near the optimum when the optimization problem has high complexity. For example, in [17,18] some of those algorithms are presented and reviewed. Applications of heuristic algorithms in portfolio optimization can be found in [19,20]. When the investment problem is too complex, such as with multiple objective functions or with many inequality constraints, those algorithms are needed because the computational complexity of the problem at hand is too high or because the Pareto optimal solution may not exist. Instead of considering an optimal solution for one period, say, the minimum variance portfolio for one day of investment, the objective function of this study is the overall investment performance over multiple periods, such as the Sharpe ratio for an arbitrarily long period of time composed of several subperiods where investment decisions are made. In this sense, the investment approach proposed in this paper can be considered a heuristic multi-period stochastic optimization approach to investment, where stochastic refers to the fact that stock returns are predicted random variables.
In terms of the research question and modeling strategy, this article is closest to [21], as it considers both prediction and asset allocation strategy jointly. In the latter, a multivariate hidden Markov model is used for forecasting stock returns and an intertemporal optimization process is considered, maximizing the expected return and incorporating the maximum value drawdown as an optimization constraint. There are other research items in the optimization literature that consider intertemporal financial optimization with different investment objectives (see for example [22,23]). However, none of the methods proposed are evaluated in very large dimensions, such as when the number of assets reaches a thousand units. This paper takes a more practical perspective by focusing on investment performance when the portfolio under management has an arbitrary dimension and, for this purpose, it applies less complex forecasting models.
A related stream of work combines the prediction and optimization problems into non-parametric models. In [24], a neural network with user-defined loss functions is used to train a prediction model for the portfolio return, and the investment weights are the output of this prediction model. Similarly, in [25] a deep neural network with the Sharpe ratio as a loss function for the forecasting model is used. Furthermore, in [26] the problem of forecasting and optimization is considered jointly by formalizing properties of the loss function used in the model. In this combined prediction and optimization literature, the downstream performance from the optimization problem is taken into account when measuring errors in the upstream prediction model. This literature also does not consider the curse of dimensionality that arises with high-dimensional portfolios. This paper differs from this literature stream as it takes a more traditional parametric predict-and-optimize approach to portfolio investment decisions.
This paper is also related to the stream of work focused on estimating large covariance matrices. For example, in [27] the modeling of correlation matrices is based on an equal dynamic correlation among all assets, thus overcoming problems related to the exploding number of parameters in large dimensions. Furthermore, in [28,29] a factor structure is considered in estimating very large covariance matrices with dimensions even reaching one thousand. Those models rely on the shrinkage of the covariance matrix, which can be used for both forecasting and portfolio selection. The shrinkage estimator of the covariance matrix has been proposed by other studies but with different model specifications. Among others, [30,31] consider non-linear and non-parametric methods. Instead of estimating the whole covariance matrix, the approach adopted in this paper relies on estimating pairwise covariances of each asset with the equally weighted portfolio taken as the market.
In the experiment, data from the Shanghai and Shenzhen stock exchanges are used. The methodology is applied to a medium-large basket of 100 stocks and an extra-large basket of 1000 stocks for multiple-period forecast and investment evaluation. The data used are rather extensive, covering more than 10 years of daily observations. When transaction costs are excluded, the experimental results support the out-of-sample superiority of the proposed approach over the equally weighted portfolio, as it delivers a better realized Sharpe ratio and other performance metrics in the long run.
The remainder of the article is organized as follows. Section 2 presents the forecast model specification and the heuristic portfolio allocation methodology. Section 3 introduces the data used for the analysis, the sampling scheme, and the investment performance results. Finally, Section 4 concludes.

Methodology
Given $n$ assets, each with return $r_{i,t}$, for $i = 1, \dots, n$ and $t = 0, \dots, T$, the multi-period portfolio allocation problem consists of choosing weights $w_{i,t}$ for each asset $i$ at time $t$. Without loss of generality, we consider $t$ to represent one day, such that on each day the portfolio is re-balanced as the weights change.
Let $p$ represent the portfolio. At time $t-1$ the portfolio manager sets the weights based on a forecast of future returns $\hat{r}_{i,t}$ and covariances $\mathrm{cov}(r_{i,t}, r_{j,t})$ for $i = 1, \dots, n$ and $j = 1, \dots, n$. This portfolio manager is risk averse: her goal is to maximize the expected return of the portfolio and, for a given return, to minimize the expected risk of the portfolio. Given the vector of weights $w_t = [w_{1,t}, \dots, w_{n,t}]^\top$ and the vector of predicted returns $\hat{r}_t = [\hat{r}_{1,t}, \dots, \hat{r}_{n,t}]^\top$, the expected return of the portfolio at time $t$ is

$$\mu_{p,t} = w_t^\top \hat{r}_t \tag{1}$$

and the expected risk of the portfolio at time $t$ is

$$\sigma^2_{p,t} = w_t^\top \Sigma_t w_t \tag{2}$$

with the matrix $\Sigma_t$ being the predicted $(n \times n)$ variance-covariance matrix of the $n$ assets.
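In the mean-variance framework, a common objective for this portfolio manager is to maximize the Sharpe ratio:

$$\max_{w_t} \; \frac{w_t^\top \hat{r}_t}{\sqrt{w_t^\top \Sigma_t w_t}} \quad \text{subject to} \quad w_t^\top \mathbf{1} = 1, \quad w_t \geq 0 \tag{3}$$

where $\mathbf{1}$ is an $(n \times 1)$ vector of ones. The two constraints in the optimization problem (3) represent the budget constraint of the portfolio manager and a constraint on short selling.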
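As a small illustration of the quantities entering the mean-variance objective, the sketch below evaluates the expected return, the expected variance and their ratio for a given weight vector. The function name `portfolio_stats` and the toy numbers are hypothetical, and no risk-free rate is subtracted because the text does not state one.

```python
import math

def portfolio_stats(weights, pred_returns, pred_cov):
    """Return (expected return, expected variance, return-to-risk ratio)."""
    n = len(weights)
    # Expected portfolio return: weighted sum of predicted asset returns.
    exp_ret = sum(w * r for w, r in zip(weights, pred_returns))
    # Expected portfolio variance: quadratic form w' * Sigma * w.
    exp_var = sum(weights[i] * pred_cov[i][j] * weights[j]
                  for i in range(n) for j in range(n))
    return exp_ret, exp_var, exp_ret / math.sqrt(exp_var)

# Toy example with two uncorrelated assets (illustrative numbers only).
w = [0.5, 0.5]
r_hat = [0.02, 0.04]
sigma = [[0.04, 0.0], [0.0, 0.04]]
exp_ret, exp_var, ratio = portfolio_stats(w, r_hat, sigma)
```

The double loop over the covariance matrix costs on the order of n squared operations per evaluation, which hints at why working with the full matrix becomes burdensome for thousands of assets.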
In such a framework, when the number of assets $n$ is large, we encounter two major problems. First, all parametric models for $\Sigma_t$ suffer from the curse of dimensionality. Even the simplest models for the conditional covariances will fail to estimate the underlying covariances when the number of variables is very high, not even considering the computational time required for such an estimation. As the likelihood function is a high-dimensional surface with many local optima, all estimates are left with very low precision. Evidence of this fact is presented in [32], where it is concluded that simpler models, such as the DCC of [9], are more robust when the number of assets is high. The second problem is also related to accuracy: the optimal weights in the problem of Equation (3) require the inversion of the matrix $\Sigma_t$. When this matrix is too big, traditional matrix inversion methods will fail. Even with approximate regularization methods to estimate the inverse covariance matrix, such as the one proposed in [33], the challenge of computational time in a practical environment is hard to overcome. Moreover, the upstream noise from the forecasting models is amplified downstream by the matrix inversion.
In this paper, a simplified approach is proposed. Instead of modeling the whole $(n \times n)$ covariance matrix $\Sigma_t$, a $(2 \times 2)$ covariance matrix is modeled $n$ times, thus overcoming the curse of dimensionality. Moreover, instead of computing the inverse of the matrix $\Sigma_t$ on each day $t$ for the optimal weights $w_t$, a heuristic weight updating approach based on the predicted returns and predicted covariances of each asset with the equally weighted portfolio is adopted. The portfolio allocation problem is thus divided into two steps. In the first step, the future returns and covariances are predicted. In the second step, a portfolio based on those predictions is formed.
In the subsections below the two steps of the methodology are described. For the choice of names and mathematical symbols used throughout the paper see Appendix C.

Return and Variance Forecasting Model
This first step aims to build models to forecast the future returns of each asset, the future return of the equally weighted portfolio, and the covariance of each return with the return of the equally weighted portfolio. The return of the equally weighted portfolio that we observe (at time $t-1$) is

$$r_{ew,t-1} = w_{ew}^\top r_{t-1} \tag{4}$$

where $w_{ew}$ is an $(n \times 1)$ vector with elements $1/n$. To predict future returns, the autoregressive conditional mean model of order 1, AR(1), is used in conjunction with the generalized autoregressive conditional heteroskedasticity model GARCH(1,1) for the conditional variance. The AR(1)-GARCH(1,1) model has 5 parameters, two for the conditional mean and three for the conditional variance. It is a parsimonious model capturing the trend and partial autocorrelation of stock returns and the time-varying volatility that are typical of stock market returns (see [34] for a discussion of the performance of the GARCH model). For a univariate stock return, the conditional mean and variance are specified in Equations (5) and (6), with stationarity and positivity-of-variance constraints imposed on the parameters. This model is estimated independently for each individual asset $i$, for $i = 1, \dots, n$, and also for the equally weighted portfolio return $r_{ew,t}$. Further, to predict future covariances, the correlation between each asset in the portfolio and the equally weighted portfolio return is modeled. To capture this time-varying correlation, the DCC parametrization of Equations (7) and (8) is used, with stationarity and positivity constraints on its parameters and with the innovations taken from the individual model estimations of Equation (5). The matrix $R_t = \begin{bmatrix} 1 & \rho_{i,t} \\ \rho_{i,t} & 1 \end{bmatrix}$ is guaranteed to be a correlation matrix thanks to the standardization in Equation (8), and the element $\rho_{i,t}$ is the estimated correlation between an asset and the equally weighted portfolio for day $t$. This correlation model is also estimated between each asset $i$ and the equally weighted portfolio independently from the other assets. In Equations (5)-(7) the subscript $i$ is dropped for compactness.
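For orientation, the textbook forms of the two models named in this subsection are sketched below. The parameter symbols ($\mu, \phi, \omega, \alpha, \beta, a, b$) and the matrix $\bar{Q}$ are assumptions chosen for illustration and need not coincide with the notation of Equations (5)-(8); $z_t$ denotes the $(2 \times 1)$ vector of standardized innovations of one asset and of the equally weighted portfolio, and $\bar{Q}$ its sample covariance.

```latex
\begin{aligned}
r_t &= \mu + \phi\, r_{t-1} + \varepsilon_t, \qquad \varepsilon_t = \sigma_t z_t, \\
\sigma_t^2 &= \omega + \alpha\, \varepsilon_{t-1}^2 + \beta\, \sigma_{t-1}^2, \\
& |\phi| < 1, \quad \omega > 0, \quad \alpha \ge 0, \quad \beta \ge 0, \quad \alpha + \beta < 1, \\
Q_t &= (1 - a - b)\, \bar{Q} + a\, z_{t-1} z_{t-1}^\top + b\, Q_{t-1}, \qquad a \ge 0, \quad b \ge 0, \quad a + b < 1, \\
R_t &= \operatorname{diag}(Q_t)^{-1/2}\, Q_t\, \operatorname{diag}(Q_t)^{-1/2}.
\end{aligned}
```

With this parametrization each univariate model carries five parameters and each bivariate correlation model two, which agrees with the parameter counts reported in the text.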
The estimation of this model follows a two-stage maximum likelihood optimization. First, the parameters in Equations (5) and (6) are estimated by maximizing the normal log-likelihood function. Second, once the optimal parameters are found, their innovations $\varepsilon_t$ are used to run the recursions in Equations (7) and (8), and the parameters of the correlation model are estimated by maximizing the bivariate normal likelihood, in which the covariance matrix is the $(2 \times 2)$ covariance matrix of any asset and the equally weighted portfolio. The covariance terms $\mathrm{cov}_{i,t} = \rho_{i,t}\, \sigma_{i,t}\, \sigma_{ew,t}$ are obtained iteratively given estimates for the variance of each asset and the correlation between each asset and the equally weighted portfolio. Because the dimension of the covariance matrix is small, the maximum likelihood estimation approach is fast for a reasonable data size $T$.
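The first estimation stage can be illustrated with a minimal sketch of the Gaussian negative log-likelihood of an AR(1)-GARCH(1,1) model, together with the one-step-ahead forecasts it implies. The function name `ar1_garch11_filter`, the parameter ordering and the choice of the sample variance as the starting value of the recursion are assumptions made for this example, not details taken from the paper.

```python
import math

def ar1_garch11_filter(params, returns):
    """Run the AR(1)-GARCH(1,1) recursions over a return series.

    Returns (negative log-likelihood, next-period mean, next-period variance).
    params = (mu, phi, omega, alpha, beta); this ordering is an assumption.
    """
    mu, phi, omega, alpha, beta = params
    n = len(returns)
    mean = sum(returns) / n
    # Assumed initialisation: start the variance recursion at the sample variance.
    sigma2 = sum((r - mean) ** 2 for r in returns) / n
    nll = 0.0
    for t in range(1, n):
        eps = returns[t] - mu - phi * returns[t - 1]   # AR(1) innovation
        nll += 0.5 * (math.log(2.0 * math.pi) + math.log(sigma2)
                      + eps ** 2 / sigma2)
        # GARCH(1,1) update: this becomes the conditional variance for t + 1.
        sigma2 = omega + alpha * eps ** 2 + beta * sigma2
    next_mean = mu + phi * returns[-1]
    return nll, next_mean, sigma2

# Toy check with constant unit variance (illustrative numbers only).
toy_returns = [1.0, -1.0, 1.0, -1.0]
nll, next_mean, next_var = ar1_garch11_filter((0.0, 0.5, 1.0, 0.0, 0.0), toy_returns)
```

In practice the first element of the returned tuple would be handed to a numerical optimizer under the stationarity and positivity constraints; the second-stage correlation likelihood follows the same pattern on bivariate standardized innovations.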
Note that the correlations $\rho_{i,t}$ are estimated correlations between the rate of return of any asset $i$ and the rate of return of the equally weighted portfolio $ew$. For brevity of notation, the subscript $ew$ is dropped. Furthermore, the correlation matrix $\Sigma_t$ is re-estimated for each asset return, independently from the other correlation matrices. For example, with $n = 100$ assets, $n + 1$ models for the conditional mean and variance and $n$ models for the conditional correlation are estimated, for a total of 505 parameters for the conditional mean and variance and 200 parameters for the conditional correlation. Instead of modeling the whole correlation among all assets, this approach treats all assets as if they were independent of each other but all correlated with the equally weighted portfolio, thus reducing the modeling problem to smaller subproblems.
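The parameter arithmetic in this paragraph can be checked with a few lines. The helper names below are hypothetical; the count of distinct elements of a full symmetric covariance matrix, n(n + 1)/2, is included only as a point of comparison.

```python
def parameter_counts(n_assets, mean_var_params=5, corr_params=2):
    """Parameters of the n + 1 univariate models and the n bivariate DCC models."""
    univariate = (n_assets + 1) * mean_var_params
    correlation = n_assets * corr_params
    return univariate, correlation

def full_covariance_elements(n_assets):
    """Distinct entries of a symmetric n x n covariance matrix."""
    return n_assets * (n_assets + 1) // 2

print(parameter_counts(100))          # (505, 200)
print(full_covariance_elements(100))  # 5050
print(parameter_counts(1000))         # (5005, 2000)
```

The number of parameters grows linearly in the number of assets, whereas the number of entries of the full covariance matrix grows quadratically.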
Once the models are estimated, forecasts of future returns and future covariances are produced one period ahead. Thus, the following vectors of return and covariance forecasts are obtained:

$$\hat{r}_{t+1} = [\hat{r}_{1,t+1}, \dots, \hat{r}_{n,t+1}], \qquad \widehat{\mathrm{cov}}_{t+1} = [\widehat{\mathrm{cov}}_{1,t+1}, \dots, \widehat{\mathrm{cov}}_{n,t+1}].$$

The forecasting problem is summarized in Algorithm 1. The input of this algorithm is the multivariate time series of return data described in Section 3.1. The data are split into $K$ windows of training and testing subsamples. For each training window, the univariate AR(1)-GARCH(1,1) model is estimated for the equally weighted portfolio return and for each stock return. Furthermore, the bivariate DCC model is estimated between each individual stock return and the equally weighted portfolio return. Based on the estimates of this training window, one-step-ahead forecasts are produced for the corresponding test window, which contains data of size step. In the algorithm, the procedure in the outer for loop selects the training and testing window and estimates the models. The procedure in the innermost for loop produces forecast quantities for each stock corresponding to the step out-of-sample time stamps.

Portfolio Formation Strategy
The second step of the methodology aims to set the weight vector $w_{t+1} = [w_{1,t+1}, \dots, w_{n,t+1}]$ given the forecast of returns and covariances obtained previously. Starting with the equal weight vector, weights are updated on each day of the investment period.
The portfolio selection strategy consists of setting a target return and a target risk. The predicted return of the equally weighted portfolio is taken as the target return: $\mathrm{targetRet} = \hat{r}_{ew,t+1}$. This is because the equally weighted portfolio is rather hard to beat; its out-of-sample performance is considerable, as shown in [35]. Therefore, the heuristic approach of this paper seeks to improve upon the rate of return of this equally weighted portfolio. For the risk, instead, the average predicted covariance between each of the $n$ stocks and the equally weighted portfolio is taken as the target risk for our portfolio, $\mathrm{targetCov} = \frac{1}{n} \sum_{i=1}^{n} \widehat{\mathrm{cov}}_{i,t+1}$. The rationale for taking the average covariance as a risk target stems from the fact that, in a large portfolio, the covariance risk is the dominant source of risk.
Consider the equally weighted portfolio return $r_{ew}$. When the number of its components is infinitely high, its variance is equal to the average pairwise covariance among its components (this result can be found in [36]):

$$\lim_{n \to \infty} \mathrm{Var}(r_{ew}) = \sigma^2_{i,j}, \quad \forall i \neq j,$$

where $\sigma^2_{i,j}$, $\forall i \neq j$, is the average covariance among assets $i$ and $j$. Because the whole covariance structure between the assets is not modeled directly, but only the covariance between each asset and the equally weighted portfolio is modeled, such average covariance is taken as a relative proxy for the systemic risk $\sigma^2_{i,j}$. Therefore, taking this portfolio as the benchmark raises the bar for the investment strategy when there is an extra-large number of stocks, because it is already a fully diversified portfolio. The investment strategies aim to beat this equally weighted portfolio in short periods when there are diversification opportunities not captured by it and when there are superior forecasts of stock returns. In the long run, the equally weighted portfolio is the benchmark investment.
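The limiting result can be seen from a short, standard decomposition of the variance of an equally weighted portfolio. The symbols $\overline{\sigma^2}$ (average variance) and $\bar{\sigma}_{ij}$ (average pairwise covariance) are introduced here for illustration only, and the argument assumes that the average variance stays bounded as $n$ grows.

```latex
\begin{aligned}
\mathrm{Var}(r_{ew})
  &= \frac{1}{n^2} \sum_{i=1}^{n} \sigma_i^2
   + \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j \neq i} \sigma_{ij} \\
  &= \frac{1}{n}\, \overline{\sigma^2}
   + \left(1 - \frac{1}{n}\right) \bar{\sigma}_{ij}
   \;\longrightarrow\; \bar{\sigma}_{ij}
   \quad \text{as } n \to \infty .
\end{aligned}
```

The contribution of the individual variances vanishes at rate $1/n$, so for a basket of one thousand stocks the portfolio variance is driven almost entirely by the average covariance.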
Let $\hat{r}_{\min,t+1} = \min_i [\hat{r}_{1,t+1}, \dots, \hat{r}_{i,t+1}, \dots, \hat{r}_{n,t+1}]$ and $\widehat{\mathrm{cov}}_{\min,t+1} = \min_i [\widehat{\mathrm{cov}}_{1,t+1}, \dots, \widehat{\mathrm{cov}}_{i,t+1}, \dots, \widehat{\mathrm{cov}}_{n,t+1}]$ be, respectively, the minimum of the predicted returns and the minimum of the predicted covariances among all $n$ assets under consideration for the forecast period $t+1$. Based on the target return and the target covariance, the weights for the forecast period are updated according to Equations (15)-(18), with $w_0 = w_{ew}$ and the product in Equations (15) and (16) denoting the element-by-element product. The ratio expression in Equation (15) is a vector containing the relative predicted returns over the target return. Its elements take values above 1 if a stock return is predicted to be higher than the target return and below 1 if a stock return is predicted to be lower than the target return. The ratio in Equation (16) is a vector that contains the relative predicted covariances over the target covariance. Its elements take values above 1 if the covariance is predicted to be higher than the target covariance and below 1 if the covariance is predicted to be lower than the target covariance. Taking the differences of the returns and covariances from their respective minimum elements guarantees that no element of the ratio vectors is negative. The equal weight vector $w_{ew}$ is added to the weight of the previous period $w_t$ to prevent the weights from shrinking to zero under repeated iteration once an asset has the minimum return or the minimum covariance among all assets. Furthermore, adding equal weights allows for better tracking of the equally weighted portfolio, as this is the benchmark strategy.
The two objectives of maximizing expected returns (Equation (15)) and minimizing risk (Equation (16)) are combined linearly by weighting more on assets for which the relative return is predicted to be higher and weighting less on assets for which the relative covariance is predicted to be higher. The constant $c$, for $0 \leq c \leq 1$, is a parameter that maps to the risk aversion: when $c = 1$ only the expected return is considered, so the position on risk is neutral, and when $c = 0$ the most weight is placed on avoiding risk. The standardization in Equation (18) guarantees that all weights sum to one, fulfilling the budget constraint of the investment. The weights are updated iteratively for each of the forecast periods, resulting in a multi-period investment strategy.
The portfolio allocation strategy is summarized in Algorithm 2. The portfolio weights are updated at each period based on the forecast of returns and covariances. When the return of some assets is expected to be high compared to the expected return of the equally weighted portfolio, more weight is assigned to those assets. On the other hand, when the covariance of the assets with the equally weighted portfolio is expected to be high, less weight is assigned to these assets. The two effects are balanced linearly in the updating process. In the algorithm, the for loop runs over each out-of-sample time stamp $t$, corresponding to one day. Therefore, after receiving the predicted returns and covariances for each day, the for loop outputs the stock allocation weights $w$ within the loop.
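The exact update formulas are not reproduced in this text, so the following is only one hypothetical form that is consistent with the verbal description: previous weights plus equal weights are tilted up by the relative predicted return and down by the relative predicted covariance, mixed linearly through $c$, and then normalized. The function name `update_weights`, the decreasing transform `1 / (1 + cov_ratio)` and the handling of degenerate cases are assumptions of this sketch, not the authors' specification.

```python
def update_weights(w_prev, ret_hat, cov_hat, target_ret, c=0.5):
    """One heuristic re-balancing step (hypothetical form, see lead-in)."""
    n = len(w_prev)
    w_ew = 1.0 / n
    target_cov = sum(cov_hat) / n          # average predicted covariance
    r_min, cov_min = min(ret_hat), min(cov_hat)

    def ratios(values, v_min, target):
        span = target - v_min
        if span <= 0.0:                    # assumed fallback: no tilt
            return [1.0] * len(values)
        return [(v - v_min) / span for v in values]

    ret_ratio = ratios(ret_hat, r_min, target_ret)
    cov_ratio = ratios(cov_hat, cov_min, target_cov)
    # Assumed combination: reward tilt grows with the return ratio,
    # risk tilt shrinks as the covariance ratio grows.
    raw = [(w + w_ew) * (c * rr + (1.0 - c) / (1.0 + cr))
           for w, rr, cr in zip(w_prev, ret_ratio, cov_ratio)]
    total = sum(raw)
    if total == 0.0:                       # assumed fallback: equal weights
        return [w_ew] * n
    return [x / total for x in raw]        # weights sum to one

# Toy example with four assets (illustrative numbers only).
w0 = [0.25] * 4
r_hat = [0.00, 0.01, 0.02, 0.03]
cov_hat = [1e-4, 2e-4, 3e-4, 4e-4]
w_reward = update_weights(w0, r_hat, cov_hat, target_ret=0.015, c=1.0)
w_risk = update_weights(w0, r_hat, cov_hat, target_ret=0.015, c=0.0)
```

With `c=1.0` the toy weights increase with the predicted return, and with `c=0.0` they decrease with the predicted covariance; calling the function repeatedly with the previous output as `w_prev` gives a multi-period path of weights.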

Results
In this section, the data used to evaluate the proposed forecast and portfolio allocation methodology are presented, and the sampling scheme used to train and test the forecasting models is further explained. Moreover, the investment performance metrics are defined and performance statistics are reported.

Data Description and Sampling Scheme
The methodology is evaluated using data from the Chinese stock market. All stocks listed on the Shanghai and Shenzhen stock exchanges are taken into consideration. Moreover, a long-history dataset, with daily observations from 2011 to 2022, is considered. All the data are obtained from the repository tushare.pro (for documentation of the dataset see https://tushare.pro/document/2, accessed on 1 November 2022).
The data present many missing values, mainly because many stocks are not actively traded. To overcome this problem, a sampling filter is applied, selecting all stocks that over the last 11 years were actively traded for at least 95% of the trading days. As the sample data in the analysis consist of 2837 trading days (11 years and 10 months), this filter retains all stocks with activity on at least 2695 of the 2837 trading days. After the filtering, a random selection of stocks is applied to form two baskets of stocks. First, a moderately large basket of 100 stocks is formed and, second, a very large basket of 1000 stocks is formed. The market trading codes, as well as sample names of the stocks constituting those baskets, are reported in Appendix B. The dataset suffers from survivorship bias (see [37]), as all stocks are continuously listed from January 2011 to October 2022. The performance evaluations are all relative to the equally weighted portfolio, which is also subject to this bias, thus reducing its impact.
For each basket of investments, two investment strategies are considered: a heuristic reward-risk strategy where the parameter $c = 0.5$ in Equation (17), such that both the return forecast and the covariance forecast are considered, and a heuristic max-reward strategy with $c = 1$, where only the return forecast is considered. For the latter, investors maximize return without considering the risk of the portfolio, so that they are risk-neutral. From those two strategies on the two baskets of investments, four portfolios are derived, namely the heuristic reward-risk 100, the heuristic max-reward 100, the heuristic reward-risk 1000 and the heuristic max-reward 1000.
To evaluate the investment strategy, a realistic forecast training and investment scheme is adopted. Forecasting models are trained on a rolling window of size ws and, once the models are trained, forecasts for returns and covariances are generated every day for the next day. Based on these forecasts, the portfolio weights are also updated every day. For example, the first training window consists of data from day t = 1 to day t = ws. Based on data from this window, the models generate one-step-ahead forecasts of the means and covariances for the time stamps t = ws + 1, ..., ws + step, where step is the number of days by which a window is rolled over. Based on these forecasts, the weight vector w_t is set for each day t. After this first training sample, the training window is rolled step days ahead, so that it now consists of data from day step + 1 to day step + ws, and the resulting forecasts refer to t = ws + step + 1, ..., ws + 2 · step. This pseudo-out-of-sample investment process is repeated over the whole sample. Figure 1 illustrates the training and forecasting sampling scheme. The window size ws and the step size step determine how often the forecast models need to be re-estimated. For instance, with our dataset of 2837 daily observations (about 11.8 years) and considering ws = 243, corresponding to about 1 year, and step = 20, corresponding to about 1 month (excluding weekends and holidays), there are 130 training windows for which the forecasting models need to be re-estimated. In the experimentation, those parameters are set differently for the two baskets of investments. For the basket of 100 stocks, the parameters ws = 243 and step = 20 are kept, while for the basket of 1000 stocks, the parameters are ws = 486, corresponding to about 2 years, and step = 40, corresponding to about 2 months (of working days). For the first basket, there are 130 training windows in which models are estimated and for the second basket, there are 58 such windows. As the first window is used for training only, the out-of-sample period starts in January 2012, leaving out one year of data, for the portfolios of 100 stocks, and in January 2013, leaving out two years of data, for the portfolios of 1000 stocks.
Because the forecast models are fitted for each stock in each window, when the number of assets reaches the thousands, it can take some hours to complete the whole model evaluation procedure with serial computing. With 1000 stocks, the equally weighted portfolio, and 58 training windows, there are 58 × (1000 × 2 + 1) = 116,058 maximum likelihood optimizations, as a two-stage estimation is required to estimate conditional means, variances, and covariances. In the experimentation, a small-scale computing set-up is used, with memory for data processing limited to 500 MB. The algorithm runs sequentially and is implemented in Python. It uses the "minimize" function with the Sequential Least Squares Programming (SLSQP) method, as implemented in the SciPy package, for the maximum likelihood evaluations. Estimation of the forecasting models takes most of the computing time: the maximum likelihood evaluations for all training windows are completed in about 10 minutes for the basket of 100 stocks and in about 5 hours for the basket of 1000 stocks. This nonlinear increase in time is due to a nonlinear increase in complexity as the size of the training data ws increases.
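The rolling scheme described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `rolling_windows` and `n_ml_optimizations` are hypothetical, days are 1-indexed, each window is assumed to hold exactly ws observations, and a shorter final forecast block is assumed to be kept rather than dropped. Under a different convention for that last block, the number of windows may differ by one from the counts quoted in the text.

```python
from typing import List, Tuple

# (train_start, train_end, forecast_start, forecast_end), 1-indexed and inclusive
Window = Tuple[int, int, int, int]


def rolling_windows(n_obs: int, ws: int, step: int) -> List[Window]:
    """Enumerate training windows and the days forecast from each of them."""
    windows: List[Window] = []
    k = 0
    # Keep rolling while at least one out-of-sample day remains to be forecast.
    while k * step + ws + 1 <= n_obs:
        train_start = k * step + 1
        train_end = k * step + ws
        forecast_start = train_end + 1
        # Assumption: a shorter final forecast block is kept, not dropped.
        forecast_end = min(train_end + step, n_obs)
        windows.append((train_start, train_end, forecast_start, forecast_end))
        k += 1
    return windows


def n_ml_optimizations(n_windows: int, n_stocks: int) -> int:
    """Number of likelihood optimizations, n_windows * (2 * n_stocks + 1), as in the text."""
    return n_windows * (2 * n_stocks + 1)


if __name__ == "__main__":
    basket_100 = rolling_windows(n_obs=2837, ws=243, step=20)
    print(len(basket_100))   # number of training windows for the 100-stock setting
    print(basket_100[0])     # first window: train on days 1..243, forecast 244..263
    print(n_ml_optimizations(58, 1000))
```

With 2837 observations, ws = 243 and step = 20, this convention gives 130 windows, matching the count in the text for the basket of 100 stocks.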

Investment Performance
The performance of the investment strategies is assessed in terms of cumulative return, annualized daily return, volatility of the daily return, and transaction costs. For the cumulative return, the whole ten years and 10 months of out-of-sample data are considered and a price index is constructed such that the portfolio is worth 100 at the beginning of 2012 and its value is updated each day. As for transaction costs, only the portfolio rebalancing cost is considered, assuming a fee of 25 basis points per unit transaction. This is the retail rate charged by many banks for trading in the Chinese market and also the fair transaction cost for the pricing-model-based investment strategies considered in [38]. The heuristic investment strategies are compared with the buy-and-hold equally weighted trading strategy, which does not incur any transaction costs.
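As an illustration of the price index described above, the following sketch compounds daily returns from a base value of 100. It assumes simple (not logarithmic) daily returns, which the text does not state explicitly, and the function names are hypothetical.

```python
from typing import List, Sequence


def price_index(daily_returns: Sequence[float], base: float = 100.0) -> List[float]:
    """Portfolio value path that starts at `base` and is updated every day.

    Assumption: `daily_returns` are simple returns, so the value compounds
    as V_t = V_{t-1} * (1 + r_t).
    """
    values = [base]
    for r in daily_returns:
        values.append(values[-1] * (1.0 + r))
    return values


def cumulative_return(daily_returns: Sequence[float]) -> float:
    """Cumulative return over the whole period implied by the index."""
    index = price_index(daily_returns)
    return index[-1] / index[0] - 1.0


if __name__ == "__main__":
    example = [0.10, -0.10]  # made-up returns, for illustration only
    print(price_index(example))
    print(cumulative_return(example))
```

A gain of 10% followed by a loss of 10% leaves the index slightly below 100, which is why the cumulative return is read from the compounded index rather than from the sum of daily returns.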
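As an overall performance measure, the realized daily Sharpe ratio is built as the standardized out-of-sample daily return:
$$SR = \frac{\bar{r}}{\hat{\sigma}_r}, \qquad (19)$$
where $\bar{r} = \frac{1}{T} \sum_{t=1}^{T} r_t$ is the average out-of-sample daily return, $\hat{\sigma}_r$ is the standard deviation of the out-of-sample daily returns $r_t$, and T is the total number of days out-of-sample. Note that this daily ratio is not equal to the ratio between the annual average return and the annual average standard deviation.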
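A minimal sketch of this standardized daily return is given below. The function name is hypothetical, and the use of the sample standard deviation (divisor T - 1) is an assumption, since the normalization of the standard deviation is not specified here.

```python
import statistics
from typing import Sequence


def realized_daily_sharpe(daily_returns: Sequence[float]) -> float:
    """Average out-of-sample daily return divided by its standard deviation."""
    mean_r = statistics.fmean(daily_returns)
    # Assumption: sample standard deviation (divisor T - 1).
    sd_r = statistics.stdev(daily_returns)
    if sd_r == 0.0:
        raise ValueError("daily returns have zero variance")
    return mean_r / sd_r


if __name__ == "__main__":
    example = [0.01, -0.02, 0.03, 0.00]  # made-up returns, for illustration only
    print(realized_daily_sharpe(example))
```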
At a glance, the max-reward strategy performs better than the reward-risk strategy, offering a higher cumulative return. In Figure 2, the overall price performance of the max-reward strategy and of the reward-risk strategy is compared with that of the equally weighted portfolio of the 100 stocks. In Figure 3, the price performance of the reward-risk and max-reward strategies for the 1000 stocks is compared with that of the corresponding equally weighted portfolio. The overall winner is the heuristic max-reward 100 portfolio, which yields the highest cumulative return over the investment period. The portfolios built on 1000 stocks perform worse than the portfolios built on 100 stocks, as they offer lower cumulative returns. However, this is due to the stocks being randomly selected from the population, as the equally weighted 1000 portfolio also performs poorly compared with the equally weighted 100 portfolio (compare Figures 2 and 3). The heuristic investment strategies effectively improve upon the equally weighted portfolio, and their investment performance matches the designed goal of each strategy. Overall, the equally weighted portfolio is well tracked, as is evident from the movement pattern of prices in Figures 2 and 3. The equally weighted portfolio is already fully diversified; nevertheless, its performance is beaten thanks to superior return forecasts. For example, during the market bubble-and-crash period of 2015-2016, prices soared again after an initial crash in mid-2015, and the investment strategies capitalized on this opportunity. Furthermore, in the relatively positive year 2021, some stocks outperformed the market, and in this case too the investment strategies benefited from the return forecasts. In terms of the risk goal, the reward-risk portfolios, which combine both the return forecast and the covariance forecast in their investment strategy, have overall less volatility than the equally weighted portfolios (see, for example, the years 2016-2020 in Figure 3). Therefore, the strategy also benefits from superior covariance forecasts. Overall, the goal of risk reduction is more difficult to achieve because the equally weighted portfolio is already diversified. In periods of high correlation, such as during the market crash of mid-2015, the performance of all trading strategies crashes.
Tables 1 and 2 summarize the investment performance for the investment strategies with 100 stocks and 1000 stocks, respectively. The results confirm that the max-reward strategy reaches its goal of maximizing return: the annualized rate of return, calculated as the daily average return over the whole out-of-sample period multiplied by 243, is about 5.5% and 4% higher than that of the respective equally weighted portfolios of 100 stocks and 1000 stocks, with a similar level of risk (27.2% annual standard deviation versus 26.68% for the portfolios of 100 stocks and 26.32% annual standard deviation versus 26.35% for the portfolios of 1000 stocks). The reward-risk portfolios also achieve their goal of balancing return and standard deviation. The average yearly returns for these portfolios are about 1% higher than those of the respective equally weighted portfolios and the annualized standard deviation is about 3-4% lower than that of the equally weighted portfolio. Some additional results on the effectiveness of the heuristic portfolio strategies in minimizing risk are contained in Appendix A.
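The annualization used above can be sketched as follows. The annualized return follows the stated rule (daily average return multiplied by 243). The annualized standard deviation uses square-root-of-time scaling, which is an assumption of this sketch, since the scaling convention is not stated in this section. The function names are hypothetical.

```python
import math
import statistics
from typing import Sequence

TRADING_DAYS = 243  # trading days per year, as used in the text


def annualized_return(daily_returns: Sequence[float]) -> float:
    """Daily average return over the period multiplied by 243."""
    return statistics.fmean(daily_returns) * TRADING_DAYS


def annualized_volatility(daily_returns: Sequence[float]) -> float:
    """Annualized standard deviation of daily returns.

    Assumption: square-root-of-time scaling of the sample standard deviation.
    """
    return statistics.stdev(daily_returns) * math.sqrt(TRADING_DAYS)


if __name__ == "__main__":
    example = [0.001, 0.003, -0.002, 0.002]  # made-up returns, for illustration only
    print(annualized_return(example))
    print(annualized_volatility(example))
```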
Considering the realized Sharpe ratio defined in Equation (19), both the max-reward and the reward-risk strategies offer better performance than the equally weighted strategy. However, these positive results do not take transaction costs into account. Assuming a relatively high transaction fee of 25 basis points per unit transaction, this translates into a yearly average cost of about 7.88% and 13.21% for the portfolios reward-risk 100 and max-reward 100, and 5.76% and 7.23% for the portfolios reward-risk 1000 and max-reward 1000. Once those costs are accounted for, both the max-reward and the reward-risk strategies are unprofitable compared with the equally weighted strategy. In summary, the equally weighted portfolio is well tracked, and the designed goals of the max-reward and reward-risk strategies, namely to beat the equally weighted portfolio, are achieved only in some specific periods when stock returns and risk are highly predictable. In most other periods, when there is no predictability, the performance of the investment strategies does not depart significantly from that of the equally weighted portfolio. A negative side effect is that, because of noise in the forecasts of returns and covariances, those strategies require some level of portfolio rebalancing which, even if small, results in transaction costs without direct performance benefit.
To determine where those transaction costs are incurred, the daily weight changes are computed as the sum of all absolute changes in weight for all stocks from one day to the next, including both buy and sell transactions:
$$\Delta_t w = \sum_{i=1}^{N} \left| w_{i,t} - w_{i,t-1} \right|,$$
where $w_{i,t}$ is the weight of stock i on day t and N is the number of stocks in the portfolio. The transaction costs reported in Tables 1 and 2 are calculated yearly from the cumulative weight change $\sum_{t=1}^{243} \Delta_t w$, to which the assumed fee is applied. Figures 4 and 5 report those weight changes for the portfolio strategies with 100 stocks, while Figures 6 and 7 report the weight changes for the portfolio strategies with 1000 stocks. For the portfolios of 100 stocks, the fraction of the investment that gets rebalanced is rather high. On average, (13/2)% of the investment is rebalanced every day for the reward-risk portfolio and (22/2)% for the max-reward investment strategy (the division by two arises because the weight changes count both the buy and the sell side of each reallocation). By contrast, the equally weighted strategy has zero rebalancing cost. For the portfolios of 1000 stocks, the average fraction of the investment rebalanced daily is (9/2)% and (12/2)%, respectively. Between June and December 2015, when all heuristic investment strategies significantly beat the equally weighted portfolio in terms of absolute return, the portfolio rebalancing costs are significantly higher. This indicates that the predicted returns for this period have high variability and are on average accurate, as the investment strategies based on those forecasts generate significant economic value. In the period between March and November 2018, the heuristic reward-risk 1000 portfolio incurred particularly high transaction costs (see Figure 6) and the performance of this investment strategy peaked relative to that of the equally weighted portfolio (compare the blue line with the red line in Figure 3). In this case too, the return forecasts are on average accurate. In calm periods, such as the period between 2012 and 2015, the transaction costs are relatively low because both returns and covariances are predicted to be very similar from one day to the next. Even so, the heuristic investment strategies still incur transaction costs in those periods without any economic benefit. This is due to noise in the return forecasts.
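The weight-change and cost calculation described above can be sketched as follows. The function names are hypothetical, and the cost formula (fee multiplied by the summed daily weight changes) is an assumption of this sketch. It is, however, roughly consistent with the reported figures: an average daily weight change of 13% or 22% over 243 days at 25 basis points gives a yearly cost of about 7.9% or 13.4%, close to the 7.88% and 13.21% quoted for the 100-stock portfolios.

```python
from typing import List, Sequence

FEE = 0.0025        # 25 basis points per unit traded, as assumed in the text
TRADING_DAYS = 243  # trading days per year, as used in the text


def daily_weight_change(w_prev: Sequence[float], w_next: Sequence[float]) -> float:
    """Sum of absolute weight changes over all stocks (buys and sells together)."""
    return sum(abs(b - a) for a, b in zip(w_prev, w_next))


def weight_change_series(weights: Sequence[Sequence[float]]) -> List[float]:
    """Daily weight changes for a sequence of daily weight vectors."""
    return [daily_weight_change(weights[t - 1], weights[t])
            for t in range(1, len(weights))]


def yearly_cost(avg_daily_change: float, fee: float = FEE,
                days: int = TRADING_DAYS) -> float:
    """Assumption: yearly cost = fee * (sum of daily weight changes in one year)."""
    return fee * days * avg_daily_change


if __name__ == "__main__":
    weights = [[0.5, 0.5], [0.6, 0.4], [0.6, 0.4]]  # made-up weights
    print(weight_change_series(weights))
    print(yearly_cost(0.13), yearly_cost(0.22))
```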
As the annual transaction cost is high, one way to reduce it is to rebalance the portfolio less frequently, for example weekly or even monthly instead of daily. In this study, the daily rebalancing scheme is kept in order to better assess the effect of the forecasting models, which are more accurate at a daily level, on the investment strategies.
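The effect of less frequent rebalancing on turnover can be illustrated with the sketch below, which is not part of the study. It holds the last traded weights between rebalancing days and, as a simplifying assumption, ignores the drift in weights caused by price movements in between. The function names and the example weights are hypothetical.

```python
from typing import List, Sequence


def hold_between_rebalances(target_weights: Sequence[Sequence[float]],
                            every: int) -> List[List[float]]:
    """Adopt the target weights only every `every` days; hold them otherwise.

    Simplifying assumption: weights do not drift with prices between rebalances.
    """
    if every < 1:
        raise ValueError("`every` must be at least 1")
    held: List[List[float]] = []
    current: List[float] = []
    for t, w in enumerate(target_weights):
        if t % every == 0:
            current = list(w)  # rebalancing day: trade to the new target
        held.append(list(current))
    return held


def total_weight_change(weights: Sequence[Sequence[float]]) -> float:
    """Sum over days of the absolute weight changes across all stocks."""
    return sum(
        sum(abs(b - a) for a, b in zip(weights[t - 1], weights[t]))
        for t in range(1, len(weights))
    )


if __name__ == "__main__":
    # Made-up daily targets that oscillate because of noisy forecasts.
    targets = [[0.5, 0.5], [0.6, 0.4], [0.5, 0.5], [0.6, 0.4], [0.5, 0.5]]
    print(total_weight_change(hold_between_rebalances(targets, every=1)))
    print(total_weight_change(hold_between_rebalances(targets, every=2)))
```

In this toy example the daily scheme trades on every oscillation of the targets, while rebalancing every second day avoids those trades entirely. Real data would show a smaller reduction and a possible loss of forecast timeliness.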

Conclusions
This article introduces a heuristic approach for multi-period portfolio selection based on a forecasting model for stock returns and covariances. This approach is feasible for investment in a very large number of assets, even exceeding a thousand units. The portfolio selection method is based on an iterative assessment of reward and risk following the mean-variance framework but does not require mathematical optimization, as it updates investment weights based on predicted quantities. The portfolio strategies presented in the paper track the equally weighted portfolio and aim to beat this portfolio in terms of delivering a better rate of return and lower risk in the long run.
Based on a large sample of market data from the Chinese stock exchanges, the portfolio investment strategies hit their target of improving the rate of return and reducing the risk of the investment when compared with the equally weighted portfolio investment strategy. However, when transaction costs are considered, the proposed strategies are not as profitable as the buy-and-hold equally weighted strategy, which does not incur any portfolio rebalancing costs. In this evaluation of transaction costs, a relatively expensive fee of 25 bps per unit transaction is assumed. For large institutional investors, this transaction fee can be as low as one-fifth of the assumed fee (see, for example, http://english.sse.com.cn/start/taxes/, accessed on 1 December 2022, for the fees charged by the stock exchange), thus resulting in more profitable strategies compared with the benchmark even when those transaction costs are taken into account.
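The sensitivity of the yearly cost to the fee level can be checked with a few lines of arithmetic. The sketch below rescales the yearly costs reported at 25 bps to a fee of one-fifth of that level. It assumes that costs scale linearly with the fee and that turnover is unchanged. The dictionary and function names are hypothetical.

```python
# Yearly transaction costs (in % per year) reported in the text at 25 bps.
REPORTED_COST_AT_25BPS = {
    "reward-risk 100": 7.88,
    "max-reward 100": 13.21,
    "reward-risk 1000": 5.76,
    "max-reward 1000": 7.23,
}


def scaled_cost(cost_at_base_fee: float, fee_ratio: float) -> float:
    """Assumption: cost is proportional to the fee for unchanged turnover."""
    return cost_at_base_fee * fee_ratio


if __name__ == "__main__":
    for name, cost in REPORTED_COST_AT_25BPS.items():
        # One-fifth of the assumed fee, i.e. 5 bps per unit transaction.
        print(f"{name}: {scaled_cost(cost, 1 / 5):.2f}% per year")
```

Under these assumptions the yearly costs fall to roughly 1.2-2.6%. This is below the approximate 4-5.5% return advantage reported for the max-reward strategies, and of the same order as the roughly 1% return advantage of the reward-risk strategies, which also deliver a lower standard deviation.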
Mathematical forecasting models for stock returns have notoriously high errors (see [39] for a discussion), and this paper does not attempt to deliver a forecasting model with high accuracy but rather aims to capture the general trend of stock returns and covariances with simple forecasting models. In the empirical application, the forecasting models are re-trained every month (every 2 months for the portfolio of 1000 stocks) and, once the models are trained, the one-step-ahead forecast approach is applied to capture this trend over the following periods. Re-training the models every day and producing a forecast for one day ahead could arguably offer more accurate forecasts and thus better economic performance with the multi-period portfolio allocation strategy. However, this approach is not adopted, as it requires re-estimating thousands of coefficients every day and can be rather expensive from a computational point of view.
Based on the sample data, the forecasting and portfolio formation approach presented generates an economic value of about 400-500 basis points: the annual return of the heuristic max-reward strategies is about 4-5% higher than that of the equally weighted portfolio strategy while maintaining a similar level of risk. This result is based on a long sample period of more than 10 years that includes both calm and turbulent market dynamics. For some sub-periods, such as the turbulent period of 2015-2016, this economic value is significantly higher.

Figure 1. Sampling scheme. The forecast and investment strategy are evaluated using a rolling window scheme. Each training window of size ws is followed by step days of one-step-ahead forecasts. The process is repeated K times by shifting the training and testing samples step days ahead each time.

Figure 2. Out-of-sample price performance of the heuristic reward-risk (blue line), the heuristic max-reward (green line) and the equally weighted (red line) portfolios with the 100 stocks.

Figure 3. Out-of-sample price performance of the heuristic reward-risk (blue line), the heuristic max-reward (green line) and the equally weighted (red line) portfolios with the 1000 stocks.

Figure 4. The daily fraction of shares to be re-weighted in the heuristic reward-risk 100 portfolio, taking into consideration both buy and sell orders. The average daily change in weight, shown by the red line, is calculated as a rolling average of the changes in weight over the past 20 days.

Figure 5. The daily fraction of shares to be re-weighted in the heuristic max-reward 100 portfolio, taking into consideration both buy and sell orders. The average daily change in weight, shown by the red line, is calculated as a rolling average of the changes in weight over the past 20 days.

Figure 6. The daily fraction of shares to be re-weighted in the heuristic reward-risk 1000 portfolio, taking into consideration both buy and sell orders. The average daily change in weight, shown by the red line, is calculated as a rolling average of the changes in weight over the past 20 days.

Figure 7. The daily fraction of shares to be re-weighted in the heuristic max-reward 1000 portfolio, taking into consideration both buy and sell orders. The average daily change in weight, shown by the red line, is calculated as a rolling average of the changes in weight over the past 20 days.

Algorithm 1. Training and One-Step-Ahead Forecasting of Returns and Covariances.

Algorithm 2. Weight Update for Each Successive Day.

Table 1. Summary of the out-of-sample investment performance for the 100 stocks.

Table 2. Summary of the out-of-sample investment performance for the 1000 stocks.

Table A1. Ticker symbols for the 100 randomly selected stocks in the Shenzhen and Shanghai stock exchanges. The suffix ".SZ" denotes Shenzhen and the suffix ".SH" denotes Shanghai.

Table A2. Ticker symbols for the 1000 randomly selected stocks in the Shenzhen and Shanghai stock exchanges. This table reports only the stocks traded in the Shenzhen market.

Table A3. Ticker symbols for the 1000 randomly selected stocks in the Shenzhen and Shanghai stock exchanges. This table reports only the stocks traded in the Shanghai market.

Table A4. Details of some example companies among the portfolio of 100 stocks.