Robust Estimation for the Single Index Model Using Pseudodistances

For portfolios with a large number of assets, the single index model allows for expressing the large number of covariances between individual asset returns through a significantly smaller number of parameters. This avoids the need for very large samples to estimate the mean and the covariance matrix of the asset returns, which would be unrealistic in practice given the dynamics of market conditions. The traditional way to estimate the regression parameters in the single index model is the maximum likelihood method. Although the maximum likelihood estimators have desirable theoretical properties when the model is exactly satisfied, they may give completely erroneous results when outliers are present in the data set. In this paper, we define minimum pseudodistance estimators for the parameters of the single index model and use them to construct new robust optimal portfolios. We prove theoretical properties of the estimators, such as consistency, asymptotic normality, equivariance and robustness, and illustrate the benefits of the new portfolio optimization method on real financial data.


Introduction
The problem of portfolio optimization in the mean-variance approach depends on a large number of parameters that need to be estimated on the basis of relatively small samples. Due to the dynamics of market conditions, only a short period of market history can be used for estimation of the model's parameters. In order to reduce the number of parameters that need to be estimated, the single index model proposed by Sharpe (see [1,2]) can be used. The traditional estimators for parameters of the single index model are based on the maximum likelihood method. These estimators have optimal properties for normally distributed variables, but they may give completely erroneous results in the presence of outlying observations. Since the presence of outliers in financial asset returns is a frequently occurring phenomenon, robust estimates for the parameters of the single index model are necessary in order to provide robust and optimal portfolios.
Our contribution to robust portfolio optimization through the single index model is based on using minimum pseudodistance estimators.
Interest in statistical methods based on information measures, and particularly on divergences, has grown substantially in recent years. It is well known that, for a wide variety of models, statistical methods based on divergence measures have some optimal properties in relation to efficiency, and especially in relation to robustness, which makes them viable alternatives to the classical methods. We refer to the monographs of Pardo [3] and Basu et al. [4] for an excellent presentation of such methods and of their importance and applications.
The minimum pseudodistance methods for estimation belong to the same category as the minimum divergence methods. The minimum divergence estimators are defined by minimizing some appropriate divergence between the assumed theoretical model and the true model corresponding to the data. Depending on the choice of the divergence, minimum divergence estimators can afford considerable robustness with a minimal loss of efficiency. The classical minimum divergence methods require nonparametric density estimation, which implies some difficulties such as bandwidth selection. In order to avoid nonparametric density estimation in minimum divergence estimation methods, some proposals have been made in [5-7], and robustness properties of such estimators have been studied in [8,9].
The pseudodistances that we use in the present paper were originally introduced in [6], where they are called "type-0" divergences, and the corresponding minimum divergence estimators have been studied there. They are also obtained (using a cross entropy argument) and extensively studied in [10], where they are called γ-divergences, and they are introduced in [11] in the context of decomposable pseudodistances. By its very definition, a pseudodistance satisfies two properties, namely nonnegativity and the fact that the pseudodistance between two probability measures equals zero if and only if the two measures are equal. Divergences are moreover characterized by the information processing property, i.e., by complete invariance with respect to statistically sufficient transformations of the observation space (see [11], p. 617). In general, a pseudodistance may not satisfy this property. We adopted the term pseudodistance for this reason, but the other terms above also appear in the literature. The minimum pseudodistance estimators for general parametric models have been presented in [12] and consist of minimizing an empirical version of a pseudodistance between the assumed theoretical model and the true model underlying the data. These estimators have the advantage of not requiring any prior smoothing, and they reconcile robustness with high efficiency, two goals that usually require distinct techniques.
In this paper, we define minimum pseudodistance estimators for the parameters of the single index model and use them to construct new robust optimal portfolios. We study properties of the estimators, such as consistency, asymptotic normality, robustness and equivariance, and illustrate the benefits of the proposed portfolio optimization method through examples with real financial data.
We mention that we define minimum pseudodistance estimators, and prove the corresponding theoretical properties, for the parameters of the simple linear regression model (8) associated with the single index model. However, in a very similar way, we can define minimum pseudodistance estimators and obtain the same theoretical results for the more general linear regression model $Y_j = X_j^T \beta + e_j$, $j = 1, \dots, n$, where the errors $e_j$ are i.i.d. normal variables with mean zero and variance $\sigma^2$, $X_j = (X_{j1}, \dots, X_{jp})^T$ is the vector of independent variables corresponding to the j-th observation and $\beta = (\beta_1, \dots, \beta_p)^T$ is the vector of regression coefficients.
The rest of the paper is organized as follows. In Section 2, we present the problem of robust estimation for some portfolio optimization models. In Section 3, we present the proposed approach. We define minimum pseudodistance estimators for the regression parameters corresponding to the single index model and obtain the corresponding estimating equations. Some asymptotic properties and equivariance properties of these estimators are studied. The robustness of the estimators is considered through influence function analysis. Using minimum pseudodistance estimators, new optimal portfolios are defined. Section 4 presents numerical results illustrating the performance of the proposed methodology. Finally, the proofs of the theorems are provided in Appendix A.

The Single Index Model
Portfolio selection is the problem of allocating a given capital over a number of available assets in order to maximize the return of the investment while minimizing the risk. We consider a portfolio formed by a collection of N assets. The returns of the assets are given by the random vector $X := (X_1, \dots, X_N)^T$. Usually, it is supposed that X follows a multivariate normal distribution $\mathcal{N}_N(\mu, \Sigma)$, with $\mu$ being the vector containing the mean returns of the assets and $\Sigma = (\sigma_{ij})$ the covariance matrix of the asset returns. Let $w := (w_1, \dots, w_N)^T$ be the vector of weights associated with the portfolio, where $w_i$ is the proportion of capital invested in asset i. Then, the total return of the portfolio is defined by the random variable
$$w^T X = w_1 X_1 + \dots + w_N X_N.$$
The mean and the variance of the portfolio return are given by
$$R(w) := w^T \mu, \qquad S(w) := w^T \Sigma w.$$
A classical approach for portfolio selection is the mean-variance optimization introduced by Markowitz [13]. For a given investor's risk aversion $\lambda > 0$, the mean-variance optimization gives the optimal portfolio $w^*$, solution of the problem
$$\arg\max_{w} \left\{ R(w) - \frac{\lambda}{2} S(w) \right\}, \tag{4}$$
with the constraint $w^T e_N = 1$, $e_N$ being the N-dimensional vector of ones. The solution of the optimization problem (4) is explicit, the optimal portfolio weights for a given value of $\lambda$ being
$$w^* = \frac{1}{\lambda} \Sigma^{-1} (\mu - \eta e_N),$$
where
$$\eta = \frac{e_N^T \Sigma^{-1} \mu - \lambda}{e_N^T \Sigma^{-1} e_N}.$$
This is the case when short selling is allowed. When short selling is not allowed, the optimization problem has a supplementary constraint, namely that all the weights $w_i$ are nonnegative.
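The explicit solution can be checked with a short Lagrangian argument. The sketch below assumes the objective $R(w) - \frac{\lambda}{2} S(w)$ with $R(w) = w^T\mu$ and $S(w) = w^T\Sigma w$, a positive definite $\Sigma$, and a multiplier $\eta$ for the budget constraint; the scaling of the risk term is an assumption of this sketch, and a different scaling only rescales $\lambda$.

```latex
\begin{align*}
L(w,\eta) &= w^T\mu - \tfrac{\lambda}{2}\, w^T\Sigma w - \eta\,(w^T e_N - 1),\\
\nabla_w L = 0 \;&\Longrightarrow\; \mu - \lambda\Sigma w - \eta e_N = 0
  \;\Longrightarrow\; w^* = \tfrac{1}{\lambda}\,\Sigma^{-1}(\mu - \eta e_N),\\
e_N^T w^* = 1 \;&\Longrightarrow\; \tfrac{1}{\lambda}\left(e_N^T\Sigma^{-1}\mu - \eta\, e_N^T\Sigma^{-1}e_N\right) = 1
  \;\Longrightarrow\; \eta = \frac{e_N^T\Sigma^{-1}\mu - \lambda}{e_N^T\Sigma^{-1}e_N}.
\end{align*}
```

Since the objective is strictly concave in $w$ when $\Sigma$ is positive definite, this stationary point is the unique maximizer under the budget constraint.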
Another classical approach for portfolio selection is to minimize the portfolio risk, defined by the portfolio variance, under given constraints. This means determining the optimal portfolio $w^*$ as a solution of the optimization problem
$$\arg\min_{w} S(w), \quad \text{subject to } R(w) = w^T \mu \ge \mu_0, \tag{7}$$
for a given value $\mu_0$ of the portfolio return. However, the mean-variance analysis has been criticized for being sensitive to estimation errors in the mean and the covariance of the asset returns. For both optimization problems above, estimates of the input parameters $\mu$ and $\Sigma$ are necessary. The quality, and hence the usefulness, of the results of the portfolio optimization problem critically depends on the quality of the statistical estimates of these input parameters. In practice, the mean vector and the covariance matrix of the returns are estimated by the maximum likelihood estimators under the multivariate normal assumption. When the model is exactly satisfied, the maximum likelihood estimators have optimal properties, being the most efficient. On the other hand, in the presence of outlying observations, these estimators may give completely erroneous results, and consequently the weights of the corresponding optimal portfolio may be completely misleading. It is well known that outliers frequently occur in asset returns, where an outlier is defined to be an unusually large value well separated from the bulk of the returns. Therefore, robust alternatives to the classical approaches need to be carefully analyzed.
For an overview on the robust methods for portfolio optimization, using robust estimators of the mean and covariance matrix in the Markowitz's model, we refer to [14]. We also cite the methods proposed by Vaz-de Melo and Camara [15], Perret-Gentil and Victoria-Feser [16], Welsch and Zhou [17], DeMiguel and Nogales [18], and Toma and Leoni-Aubin [19].
On the other hand, in portfolio analysis, one is sometimes faced with two conflicting demands. Good quality statistical estimates require a large sample size. When estimating the covariance matrix, the sample size must be larger than the number of different elements of the matrix. For example, for a portfolio involving 100 securities, the covariance matrix has 100 · 101/2 = 5050 different elements, so this would mean observations from 5050 trading days, which is about 20 years. From a practical point of view, such large samples are not adequate for the considered problem. Since market conditions change rapidly, very old observations would lead to estimates that are irrelevant for the current or future market conditions. In addition, in some situations, the number of assets could even be much larger than the sample size of exploitable historical data. Therefore, estimating the covariance matrix of asset returns is challenging due to the high dimensionality and also to the heavy-tailedness of asset return data. It is well known that extreme events are typical in financial asset prices, leading to heavy-tailed asset returns. One way to treat these problems is to use the single index model. The single index model (see [1]) allows us to express the large number of covariances between the returns of the individual assets through a significantly smaller number of parameters. This is possible under the hypothesis that the correlation between two assets is strictly given by their dependence on a common market index. The return of each asset i is expressed in the form
$$X_i = \alpha_i + \beta_i X_M + e_i, \quad i = 1, \dots, N, \tag{8}$$
where $X_M$ is the random variable representing the return of the market index, $e_i$ are zero mean random variables representing error terms and $\alpha_i$, $\beta_i$ are new parameters to be estimated. It is supposed that the $e_i$'s are independent and also that the $e_i$'s are independent of $X_M$. Thus, $E(e_i) = 0$, $E(e_i e_j) = 0$ and $E(e_i X_M) = 0$ for all i and all $j \ne i$. The intercept $\alpha_i$ in Equation (8) represents the asset's expected return when the market index return is zero. The slope coefficient $\beta_i$ represents the asset's sensitivity to the index, namely the impact of a unit change in the return of the index. The error $e_i$ is the return variation that cannot be explained by the index.
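The following notations are also used: $\mu_M := E(X_M)$, $\sigma_M^2 := \mathrm{Var}(X_M)$ and $\sigma_i^2 := \mathrm{Var}(e_i)$. Using Equation (8), the components of the parameters $\mu$ and $\Sigma$ from the models (4) and (7) are given by
$$\mu_i = \alpha_i + \beta_i \mu_M, \tag{9}$$
$$\sigma_{ii} = \beta_i^2 \sigma_M^2 + \sigma_i^2, \tag{10}$$
$$\sigma_{ij} = \beta_i \beta_j \sigma_M^2, \quad i \ne j. \tag{11}$$
Both variances and covariances are determined by the assets' betas and sigmas and by the standard deviation of the market index. Thus, the $N(N + 1)/2$ different elements of the covariance matrix $\Sigma$ can be expressed by the $2N + 1$ parameters $\beta_i$, $\sigma_i$, $\sigma_M$. This is a significant reduction of the number of parameters that need to be estimated.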
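As an illustration of how the single index structure determines the inputs of the optimization problems, the following minimal sketch builds the mean vector and the covariance matrix from per-asset intercepts, betas and error standard deviations together with the index mean and standard deviation. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
def single_index_moments(alpha, beta, sigma_e, mu_m, sigma_m):
    """Mean vector and covariance matrix implied by the single index model.

    mu_i = alpha_i + beta_i * mu_m
    cov_ij = beta_i * beta_j * sigma_m^2, plus sigma_e_i^2 on the diagonal.
    """
    n = len(alpha)
    mu = [alpha[i] + beta[i] * mu_m for i in range(n)]
    cov = [[beta[i] * beta[j] * sigma_m ** 2 for j in range(n)] for i in range(n)]
    for i in range(n):
        cov[i][i] += sigma_e[i] ** 2  # idiosyncratic variance only on the diagonal
    return mu, cov


def parameter_counts(n):
    """(distinct entries of a full covariance matrix, single index parameters)."""
    return n * (n + 1) // 2, 2 * n + 1


# Illustrative (made-up) values for three assets.
alpha = [0.001, 0.002, 0.0]
beta = [0.8, 1.2, 1.0]
sigma_e = [0.01, 0.02, 0.015]
mu, cov = single_index_moments(alpha, beta, sigma_e, mu_m=0.0005, sigma_m=0.01)
print(mu)
print(cov)
print(parameter_counts(100))  # (5050, 201)
```

For 100 assets the covariance matrix is thus described by 201 parameters instead of 5050 distinct entries.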
The traditional estimators for parameters of the single index model are based on the maximum likelihood method. These estimators have optimal properties for normally distributed variables, but they may give completely erroneous results in the presence of outlying observations. Therefore, robust estimates for the parameters of the single index model are necessary in order to provide robust and optimal portfolios.
The classical estimators for the unknown parameters α, β, σ of the linear regression model are the maximum likelihood estimators (MLE). The MLE perform well if the model hypotheses are satisfied exactly and may otherwise perform poorly. It is well known that the MLE are not robust, since a small fraction of outliers, or even a single outlier, may induce significant errors in the estimates. Therefore, robust alternatives to the MLE should be considered in order to obtain robust estimates for the single index model, leading in turn to robust portfolio weights.
In order to robustly estimate the unknown parameters α, β, σ, suppressing the outsized effects of outliers, we use the approach based on pseudodistance minimization.
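For two probability measures P, Q admitting densities p and q, respectively, with respect to the Lebesgue measure, we consider the following family of pseudodistances (also called γ-divergences in some articles) of orders $\gamma > 0$:
$$R_\gamma(P, Q) := \frac{1}{\gamma + 1} \ln\left(\int p^\gamma \, dP\right) + \frac{1}{\gamma(\gamma + 1)} \ln\left(\int q^\gamma \, dQ\right) - \frac{1}{\gamma} \ln\left(\int p^\gamma \, dQ\right), \tag{13}$$
satisfying the limit relation
$$R_\gamma(P, Q) \to R_0(P, Q) := \int \ln\frac{q}{p} \, dQ \quad \text{for } \gamma \downarrow 0.$$
Note that $R_0(P, Q)$ is the well-known modified Kullback-Leibler divergence. Minimum pseudodistance estimators for parametric models, using the family (13), have been studied in [6,10,11]. We also mention that the pseudodistances (13) have been used for defining optimal robust M-estimators with Hampel's infinitesimal approach in [20].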
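To make the two defining properties concrete (nonnegativity, and zero only for equal measures), the following sketch evaluates a pseudodistance of order γ between two univariate normal distributions by numerical integration. It assumes the three-term form $\frac{1}{\gamma+1}\ln\int p^{\gamma}dP + \frac{1}{\gamma(\gamma+1)}\ln\int q^{\gamma}dQ - \frac{1}{\gamma}\ln\int p^{\gamma}dQ$ used in the γ-divergence literature; the helper names, the integration range and the example distributions are choices made here for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def integrate(f, lo=-30.0, hi=30.0, steps=60000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * h)
    return total * h


def pseudodistance(p, q, gamma):
    """R_gamma(P, Q) for densities p and q on the real line, gamma > 0."""
    int_pp = integrate(lambda x: p(x) ** (gamma + 1.0))  # integral of p^gamma dP
    int_qq = integrate(lambda x: q(x) ** (gamma + 1.0))  # integral of q^gamma dQ
    int_pq = integrate(lambda x: p(x) ** gamma * q(x))   # integral of p^gamma dQ
    return (math.log(int_pp) / (gamma + 1.0)
            + math.log(int_qq) / (gamma * (gamma + 1.0))
            - math.log(int_pq) / gamma)


def p(x):
    return normal_pdf(x, 0.0, 1.0)


def q(x):
    return normal_pdf(x, 1.0, 1.5)


r_same = pseudodistance(p, p, 0.5)  # numerically zero
r_diff = pseudodistance(p, q, 0.5)  # strictly positive
print(r_same, r_diff)
```

When both arguments coincide, the three logarithms share the same integral and their coefficients sum to zero, which is why the first value vanishes up to rounding.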
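For the linear regression model, we consider the joint distribution of the entire data, the explanatory variable $X_M$ being random together with the response variable X, and write a pseudodistance between a theoretical model and the data. Let $P_\theta$, with $\theta := (\alpha, \beta, \sigma)$, be the probability measure associated with the theoretical model given by the random vector $(X_M, X)$, where $X = \alpha + \beta X_M + e$ with $e \sim \mathcal{N}(0, \sigma)$, e independent of $X_M$, and let Q be the probability measure associated with the data. Denote by $p_\theta$ and q, respectively, the corresponding densities. For $\gamma > 0$, the pseudodistance between $P_\theta$ and Q is defined by
$$R_\gamma(P_\theta, Q) := \frac{1}{\gamma + 1} \ln \int p_\theta^\gamma(x_M, x) \, dP_\theta(x_M, x) + \frac{1}{\gamma(\gamma + 1)} \ln \int q^\gamma(x_M, x) \, dQ(x_M, x) - \frac{1}{\gamma} \ln \int p_\theta^\gamma(x_M, x) \, dQ(x_M, x).$$
Using the change of variables $(x_M, x) \mapsto (x_M, e := x - \alpha - \beta x_M)$ and taking into account that $f(x_M, e) := p_\theta(x_M, \alpha + \beta x_M + e)$ is the density of $(X_M, e)$, since $X_M$ and e are independent, we can write
$$p_\theta(x_M, x) = p_M(x_M) \, \phi_\sigma(x - \alpha - \beta x_M),$$
where $p_M$ is the density of $X_M$ and $\phi_\sigma$ is the density of the random variable $e \sim \mathcal{N}(0, \sigma)$. Then,
$$R_\gamma(P_\theta, Q) = \frac{1}{\gamma + 1} \ln \int p_M^{\gamma + 1}(x_M) \, dx_M + \frac{1}{\gamma + 1} \ln \int \phi_\sigma^{\gamma + 1}(e) \, de + \frac{1}{\gamma(\gamma + 1)} \ln \int q^\gamma(x_M, x) \, dQ(x_M, x) - \frac{1}{\gamma} \ln \int p_M^\gamma(x_M) \, \phi_\sigma^\gamma(x - \alpha - \beta x_M) \, dQ(x_M, x).$$
Notice that the first and the third terms in the pseudodistance $R_\gamma(P_\theta, Q)$ do not depend on θ and hence are not included in the minimization process. The parameter $\theta_0 := (\alpha_0, \beta_0, \sigma_0)$ of interest is then given by
$$\theta_0 = \arg\min_{\alpha, \beta, \sigma} \left\{ \frac{1}{\gamma + 1} \ln \int \phi_\sigma^{\gamma + 1}(e) \, de - \frac{1}{\gamma} \ln \int p_M^\gamma(x_M) \, \phi_\sigma^\gamma(x - \alpha - \beta x_M) \, dQ(x_M, x) \right\}. \tag{17}$$
Suppose now that an i.i.d. sample $Z_1, \dots, Z_n$, with $Z_i := (X_{Mi}, X_i)$, is available from the true model. For a given $\gamma > 0$, we define a minimum pseudodistance estimator of $\theta_0 = (\alpha_0, \beta_0, \sigma_0)$ by minimizing an empirical version of the objective function in Equation (17). This empirical version is obtained by replacing $p_M(x_M)$ with the empirical density function $\hat{p}_M(x_M) = \frac{1}{n} \sum_{i=1}^n \delta(x_M - X_{Mi})$, where δ is the Dirac delta function, and Q with the empirical measure $P_n$ corresponding to the sample. More precisely, we define
$$\hat\theta := \arg\min_{\alpha, \beta, \sigma} \left\{ \frac{1}{\gamma + 1} \ln \int \phi_\sigma^{\gamma + 1}(e) \, de - \frac{1}{\gamma} \ln \left( \frac{1}{n^{\gamma + 1}} \sum_{i=1}^n \phi_\sigma^\gamma(X_i - \alpha - \beta X_{Mi}) \right) \right\},$$
or equivalently
$$(\hat\alpha, \hat\beta, \hat\sigma) = \arg\max_{\alpha, \beta, \sigma} \sum_{i=1}^n \sigma^{-\gamma/(\gamma + 1)} \exp\left( -\frac{\gamma}{2} \left( \frac{X_i - \alpha - \beta X_{Mi}}{\sigma} \right)^2 \right).$$
Differentiating with respect to α, β, σ, the estimators $\hat\alpha$, $\hat\beta$, $\hat\sigma$ are solutions of the system
$$\sum_{i=1}^n \exp\left( -\frac{\gamma}{2} \left( \frac{X_i - \alpha - \beta X_{Mi}}{\sigma} \right)^2 \right) \frac{X_i - \alpha - \beta X_{Mi}}{\sigma} = 0, \tag{19}$$
$$\sum_{i=1}^n \exp\left( -\frac{\gamma}{2} \left( \frac{X_i - \alpha - \beta X_{Mi}}{\sigma} \right)^2 \right) \frac{X_i - \alpha - \beta X_{Mi}}{\sigma} \, X_{Mi} = 0, \tag{20}$$
$$\sum_{i=1}^n \exp\left( -\frac{\gamma}{2} \left( \frac{X_i - \alpha - \beta X_{Mi}}{\sigma} \right)^2 \right) \left[ \left( \frac{X_i - \alpha - \beta X_{Mi}}{\sigma} \right)^2 - \frac{1}{\gamma + 1} \right] = 0. \tag{21}$$
Note that, for γ = 0, the solution of this system is nothing but the maximum likelihood estimator of (α, β, σ). Therefore, the estimating Equations (19)-(21) are generalizations of the maximum likelihood score equations. The tuning parameter γ associated with the pseudodistance controls the trade-off between robustness and efficiency of the minimum pseudodistance estimators.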
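One possible way to solve estimating equations of this type numerically is a fixed-point (iteratively reweighted) scheme: with weights $w_i = \exp(-\frac{\gamma}{2} r_i^2)$ computed from the current standardized residuals $r_i$, the equations for the intercept and slope are weighted least squares normal equations, and the scale equation gives $\sigma^2 = (\gamma + 1)\sum_i w_i e_i^2 / \sum_i w_i$ for the raw residuals $e_i$. The sketch below is only an illustration under these assumptions and is not claimed to be the numerical method used for the results reported here; the function name, the robust starting values (Theil-Sen slope, median intercept, normalized median absolute residual) and the simulated data are all choices made for this example.

```python
import math
import random
import statistics


def min_pseudodistance_fit(x, y, gamma, max_iter=200, tol=1e-10):
    """Fixed-point iteration for (alpha, beta, sigma) with weights
    w_i = exp(-gamma / 2 * r_i^2), r_i being the standardized residual."""
    n = len(x)
    # Robust starting point (an assumption of this sketch).
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(n) for j in range(i + 1, n) if x[j] != x[i]]
    beta = statistics.median(slopes)
    alpha = statistics.median(yi - beta * xi for xi, yi in zip(x, y))
    res = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    sigma = 1.4826 * statistics.median(abs(e) for e in res)  # must be positive
    for _ in range(max_iter):
        w = [math.exp(-0.5 * gamma * (e / sigma) ** 2) for e in res]
        sw = sum(w)
        # Weighted least squares step for intercept and slope.
        xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
        ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
        new_beta = sxy / sxx
        new_alpha = ybar - new_beta * xbar
        res = [yi - new_alpha - new_beta * xi for xi, yi in zip(x, y)]
        # Scale step: weighted second moment of the residuals times (gamma + 1).
        new_sigma = math.sqrt((gamma + 1.0) * sum(wi * e * e for wi, e in zip(w, res)) / sw)
        change = abs(new_alpha - alpha) + abs(new_beta - beta) + abs(new_sigma - sigma)
        alpha, beta, sigma = new_alpha, new_beta, new_sigma
        if change < tol:
            break
    return alpha, beta, sigma


# Simulated data: true line 1 + 2x, error sd 0.5, 10% of responses shifted by 20.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(200)]
y = [1.0 + 2.0 * xi + random.gauss(0.0, 0.5) for xi in x]
for i in range(20):
    y[i] += 20.0
alpha_hat, beta_hat, sigma_hat = min_pseudodistance_fit(x, y, gamma=0.5)
print(alpha_hat, beta_hat, sigma_hat)
```

Because the objective is not concave in general, the result of such an iteration can depend on the starting point, which is why a robust initial fit is used in this sketch.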
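We can also write that
$$\int \Psi(z, \hat\theta) \, dP_n(z) = 0, \tag{22}$$
where
$$\Psi(z, \theta) = \left( \phi\left( \frac{x - \alpha - \beta x_M}{\sigma} \right), \ \phi\left( \frac{x - \alpha - \beta x_M}{\sigma} \right) x_M, \ \chi\left( \frac{x - \alpha - \beta x_M}{\sigma} \right) \right)^T, \tag{23}$$
with $z = (x_M, x)$, $\theta = (\alpha, \beta, \sigma)$, $\phi(t) = \exp(-\frac{\gamma}{2} t^2) \, t$ and $\chi(t) = \exp(-\frac{\gamma}{2} t^2) \left[ t^2 - \frac{1}{\gamma + 1} \right]$. When the measure Q corresponding to the data pertains to the theoretical model, hence $Q = P_{\theta_0}$, it holds that
$$\int \Psi(z, \theta_0) \, dP_{\theta_0}(z) = 0.$$
Thus, we can consider $\hat\theta = (\hat\alpha, \hat\beta, \hat\sigma)$ as a Z-estimator of $\theta_0 = (\alpha_0, \beta_0, \sigma_0)$, which allows us to adapt to the present context asymptotic results from the general theory of Z-estimators (see [21]).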

Remark 1.
In the case when the density p M is known, by replacing Q with the empirical measure P n in Equation (17), a new class of estimators of (α 0 , β 0 , σ 0 ) can be obtained. These estimators can also be written under the form of Z-estimators, using the same reasoning as above. The results of Theorems 1-4 below could be adapted for these new estimators, and moreover all the influence functions of these estimators would be redescending bounded. However, in practice, the density of the index return is not known. Therefore, we will work with the class of minimum pseudodistance estimators as defined above.

Asymptotic Properties
In order to prove the consistency of the estimators, we use their definition (22) as Z-estimators.

Consistency
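Theorem 1. Assume that, for any ε > 0, the following condition for the separability of the solution holds:
$$\inf_{\theta \in M} \left\| \int \Psi(z, \theta) \, dP_{\theta_0}(z) \right\| > 0 = \left\| \int \Psi(z, \theta_0) \, dP_{\theta_0}(z) \right\|,$$
where $M := \{\theta : \|\theta - \theta_0\| \ge \varepsilon\}$. Then, $\hat\theta$ converges in probability to $\theta_0$.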

Asymptotic Normality
Assume that $Z_1, \dots, Z_n$ are i.i.d. two-dimensional random vectors having the common probability distribution $P_{\theta_0}$. For γ > 0 fixed, let $\hat\theta = (\hat\alpha, \hat\beta, \hat\sigma)$ be a sequence of estimators of the unknown parameter $\theta_0 = (\alpha_0, \beta_0, \sigma_0)$, solution of the system (26) written in terms of the function Ψ from (27). Note that the estimators $\hat\theta = (\hat\alpha, \hat\beta, \hat\sigma)$ defined by Equations (19)-(21), or equivalently by (22), are also solutions of the system (26). Using the function (27) for defining the estimators allows us to obtain the asymptotic normality by imposing only the consistency of the estimators, without the supplementary assumptions that are usually imposed in the case of Z-estimators.
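After some calculations, we obtain the asymptotic covariance matrix of $\hat\theta$ having the form
$$V = \sigma_0^2 \, \frac{(\gamma + 1)^3}{(2\gamma + 1)^{3/2}} \begin{pmatrix} \dfrac{\mu_M^2 + \sigma_M^2}{\sigma_M^2} & -\dfrac{\mu_M}{\sigma_M^2} & 0 \\ -\dfrac{\mu_M}{\sigma_M^2} & \dfrac{1}{\sigma_M^2} & 0 \\ 0 & 0 & \dfrac{3\gamma^2 + 4\gamma + 2}{4(2\gamma + 1)} \end{pmatrix},$$
where $\mu_M = E(X_M)$ and $\sigma_M^2 = \mathrm{Var}(X_M)$. It follows that $\hat\beta$ and $\hat\sigma$ are asymptotically independent; in addition, $\hat\alpha$ and $\hat\sigma$ are asymptotically independent.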
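The efficiency cost of robustness can be quantified with the classical M-estimation ratio $(E\phi')^2 / E\phi^2$, evaluated under standard normal errors for the score function $\phi(t) = t\exp(-\frac{\gamma}{2}t^2)$ that appears in the estimating equations for the regression coefficients. The sketch below computes this ratio by numerical integration and compares it with the closed form $(2\gamma+1)^{3/2}/(\gamma+1)^3$ obtained from Gaussian integrals; both the use of this ratio as the efficiency measure and the helper names are assumptions of this illustration.

```python
import math


def std_normal_expectation(f, lo=-12.0, hi=12.0, steps=24000):
    """E[f(T)] for T ~ N(0, 1), by the composite trapezoidal rule."""
    h = (hi - lo) / steps

    def g(t):
        return f(t) * math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

    total = 0.5 * (g(lo) + g(hi))
    for k in range(1, steps):
        total += g(lo + k * h)
    return total * h


def regression_efficiency(gamma):
    """(E phi')^2 / E phi^2 for phi(t) = t * exp(-gamma * t^2 / 2)."""

    def phi_squared(t):
        return (t * math.exp(-0.5 * gamma * t * t)) ** 2

    def dphi(t):
        return (1.0 - gamma * t * t) * math.exp(-0.5 * gamma * t * t)

    return std_normal_expectation(dphi) ** 2 / std_normal_expectation(phi_squared)


for gamma in (0.0, 0.1, 0.5, 1.0):
    closed_form = (2.0 * gamma + 1.0) ** 1.5 / (gamma + 1.0) ** 3
    print(gamma, round(regression_efficiency(gamma), 4), round(closed_form, 4))
```

The ratio equals 1 at γ = 0 (maximum likelihood) and decreases as γ grows, which is the efficiency side of the trade-off controlled by the tuning parameter.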

Influence Functions
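In order to describe stability properties of the estimators, we use the following well-known concepts from the theory of robust statistics. A map T, defined on a set of probability measures and taking values in the parameter space, is a statistical functional corresponding to an estimator $\hat\theta$ of the parameter θ if $\hat\theta = T(P_n)$, $P_n$ being the empirical measure pertaining to the sample. The influence function of T at $P_\theta$ is defined by
$$\mathrm{IF}(z; T, P_\theta) := \left. \frac{\partial T(P_{\varepsilon z})}{\partial \varepsilon} \right|_{\varepsilon = 0},$$
where $P_{\varepsilon z} := (1 - \varepsilon) P_\theta + \varepsilon \delta_z$, $\delta_z$ being the Dirac measure putting all mass at z. As a consequence, the influence function describes the linearized asymptotic bias of a statistic under a single point contamination of the model $P_\theta$. An unbounded influence function implies an unbounded asymptotic bias of a statistic under single point contamination of the model. Therefore, a natural robustness requirement on a statistical functional is the boundedness of its influence function. For γ > 0 fixed and a given probability measure P, the statistical functionals α(P), β(P) and σ(P), corresponding to the minimum pseudodistance estimators $\hat\alpha$, $\hat\beta$ and $\hat\sigma$, are defined by the solution of the system
$$\int \Psi(z, T(P)) \, dP(z) = 0, \tag{29}$$
with Ψ defined by (23) and T(P) := (α(P), β(P), σ(P)), whenever this solution exists. When $P = P_\theta$ corresponds to the considered theoretical model, the solution of system (29) is $T(P_\theta) = \theta = (\alpha, \beta, \sigma)$.
Theorem 3. The influence functions corresponding to the estimators $\hat\alpha$, $\hat\beta$ and $\hat\sigma$ are respectively given by
$$\mathrm{IF}(x_{M0}, x_0; \alpha, P_\theta) = \sigma (\gamma + 1)^{3/2} \, \phi\left( \frac{x_0 - \alpha - \beta x_{M0}}{\sigma} \right) \frac{E(X_M^2) - x_{M0} E(X_M)}{\mathrm{Var}(X_M)},$$
$$\mathrm{IF}(x_{M0}, x_0; \beta, P_\theta) = \sigma (\gamma + 1)^{3/2} \, \phi\left( \frac{x_0 - \alpha - \beta x_{M0}}{\sigma} \right) \frac{x_{M0} - E(X_M)}{\mathrm{Var}(X_M)},$$
$$\mathrm{IF}(x_{M0}, x_0; \sigma, P_\theta) = \frac{\sigma (\gamma + 1)^{5/2}}{2} \, \chi\left( \frac{x_0 - \alpha - \beta x_{M0}}{\sigma} \right),$$
where $\phi(t) = \exp(-\frac{\gamma}{2} t^2) \, t$ and $\chi(t) = \exp(-\frac{\gamma}{2} t^2) \left[ t^2 - \frac{1}{\gamma + 1} \right]$ are the functions entering Ψ.
Since χ is redescending, $\hat\sigma$ has a bounded influence function and hence it is a redescending B-robust estimator. On the other hand, $\mathrm{IF}(x_{M0}, x_0; \alpha, P_\theta)$ and $\mathrm{IF}(x_{M0}, x_0; \beta, P_\theta)$ will tend to infinity only when $x_{M0}$ tends to infinity and $\left| \frac{x_0 - \alpha - \beta x_{M0}}{\sigma} \right| \le k$, for some k. Hence, these influence functions are bounded with respect to partial outliers or leverage points (outlying values of the independent variable). This means that large outliers with respect to $x_M$, or with respect to x, will have a reduced influence on the estimates. However, the influence functions are clearly unbounded for γ = 0, which corresponds to the non-robust maximum likelihood estimators.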
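The redescending behaviour can be seen directly on the two score functions that drive the influence functions, $\phi(t) = t\exp(-\frac{\gamma}{2}t^2)$ and $\chi(t) = (t^2 - \frac{1}{\gamma+1})\exp(-\frac{\gamma}{2}t^2)$, evaluated at a standardized residual t. The following minimal sketch, with function names chosen here for illustration, compares γ = 0.5 with the maximum likelihood case γ = 0.

```python
import math


def phi(t, gamma):
    """Score function for the regression coefficients."""
    return t * math.exp(-0.5 * gamma * t * t)


def chi(t, gamma):
    """Score function for the scale parameter."""
    return (t * t - 1.0 / (gamma + 1.0)) * math.exp(-0.5 * gamma * t * t)


# Columns: residual, phi and chi for gamma = 0.5, phi and chi for gamma = 0 (MLE).
for t in (0.0, 1.0, 3.0, 10.0):
    print(t, phi(t, 0.5), chi(t, 0.5), phi(t, 0.0), chi(t, 0.0))
```

For γ > 0, both functions return to zero for large residuals, so a gross vertical outlier contributes almost nothing to the estimating equations, whereas for γ = 0 they grow linearly and quadratically in the residual.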

Equivariance of the Regression Coefficients' Estimators
If an estimator is equivariant, it means that it transforms "properly" in some sense. Rousseeuw and Leroy [22] (p. 116) discuss three important equivariance properties for a regression estimator: regression equivariance, scale equivariance and affine equivariance. These are desirable properties since they allow one to know how the estimates change under different types of transformations of the data. Regression equivariance means that any additional linear dependence is reflected in the regression vector accordingly. Regression equivariance is routinely used when studying regression estimators. It allows for assuming, without loss of generality, any value for the parameter (α, β) when proving asymptotic properties or describing Monte Carlo studies. An estimator being scale equivariant means that the fit produced by it is independent of the choice of measurement unit for the response variable. Affine equivariance is useful because it means that changing to a different coordinate system for the explanatory variable will not affect the estimate. It is known that the maximum likelihood estimator of the regression coefficients satisfies all three properties. We show that the minimum pseudodistance estimators of the regression coefficients satisfy all three equivariance properties, for all γ > 0. On the other hand, the objective function in the definition of the estimators depends on the data only through a summation, which is permutation invariant. Thus, the corresponding estimators of the regression coefficients and of the error standard deviation are permutation invariant, and therefore the ordering of the data does not affect the estimators. The minimum pseudodistance estimators are also equivariant with respect to reparametrizations. 
If θ = (α, β, σ) and the model is reparametrized to Υ = Υ(θ) with a one-to-one transformation, then the minimum pseudodistance estimator of Υ is simply Υ = Υ( θ), in terms of the minimum pseudodistance estimator θ of θ, for the same γ.
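For reference, the three equivariance properties can be stated as follows for an estimator T of the coefficient vector computed from the pairs $(x_i, y_i)$, with $x_i$ written as a row vector of regressors. This restates the definitions discussed by Rousseeuw and Leroy [22]; the notation is chosen here for illustration.

```latex
\begin{align*}
\text{regression equivariance:}\quad
  & T(\{(x_i,\; y_i + x_i v)\}) = T(\{(x_i, y_i)\}) + v
  && \text{for any vector } v,\\
\text{scale equivariance:}\quad
  & T(\{(x_i,\; c\, y_i)\}) = c\, T(\{(x_i, y_i)\})
  && \text{for any constant } c,\\
\text{affine equivariance:}\quad
  & T(\{(x_i A,\; y_i)\}) = A^{-1}\, T(\{(x_i, y_i)\})
  && \text{for any nonsingular matrix } A.
\end{align*}
```

In the simple regression setting considered here, $x_i = (1, X_{Mi})$ and the coefficient vector is $(\alpha, \beta)^T$.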

Robust Portfolios Using Minimum Pseudodistance Estimators
The robust estimation of the parameters $\alpha_i$, $\beta_i$, $\sigma_i$ from the single index model given by (8), using minimum pseudodistance estimators, together with the robust estimation of $\mu_M$ and $\sigma_M$, leads to robust estimates of µ and Σ, on the basis of relations (9)-(11). Since we do not model the explanatory variable $X_M$ in a specific way, we estimate $\mu_M$ and the standard deviation $\sigma_M$ using as robust estimators the median and the median absolute deviation, respectively. Then, the portfolio weights, obtained as solutions of the optimization problems (4) or (7) with robustly estimated input parameters, will also be robust. This methodology leads to new optimal robust portfolios. In the next section, on the basis of real financial data, we illustrate this new methodology and compare it with the traditional method based on maximum likelihood estimators.

Comparisons of the Minimum Pseudodistance Estimators with Other Robust Estimators for the Linear Regression Model
In order to illustrate the performance of the minimum pseudodistance estimators for the simple linear regression model, we compare them with the least median of squares (LMS) estimator (see [22,23]), with S-estimators (SE) (see [24]) and with the minimum density power divergence (MDPD) estimators (see [25]), estimators that are known to have a good behavior from the robustness point of view.
We considered a data set that comes from astronomy, namely the data from the Hertzsprung-Russell diagram of the star cluster CYG OB1, containing 47 stars in the direction of Cygnus. For these data, the independent variable is the logarithm of the effective temperature at the surface of the star and the dependent variable is the logarithm of its light intensity. The data are given in Rousseeuw and Leroy [22] (p. 27), who underlined that there are two groups of points: the majority, following a steep band, and four stars clearly forming a separate group from the rest of the data. These four stars are known as giants in astronomy. Thus, these outliers are not recording errors, but represent leverage points coming from a different group.
The estimates of the regression coefficients and of the error standard deviation obtained with minimum pseudodistance estimators for several values of γ are given in Table 1, and some of the fitted models are plotted in Figure 1. For comparison, in Table 1, we also give estimates obtained with S-estimators based on the Tukey biweight function, these estimates being taken from [24], as well as estimates obtained with minimum density power divergence methods for several values of the tuning parameter and estimates obtained with the least median of squares method, all these estimates being taken from [25]. The MLE estimates, given on the first line of Table 1, are significantly affected by the four leverage points. On the other hand, like the robust least median of squares estimator, the robust S-estimators and some minimum density power divergence estimators, the minimum pseudodistance estimators with γ ≥ 0.32 can successfully ignore outliers. In addition, the minimum pseudodistance estimators with γ ≥ 0.5 give robust fits that are closer to the fits generated by the least median of squares estimates or by the S-estimates than the fits generated by the minimum density power divergence estimates.

Robust Portfolios Using Minimum Pseudodistance Estimators
In order to illustrate the performance of the proposed robust portfolio optimization method, we considered real data sets for the Russell 2000 index and for 50 stocks from its components. The stocks are listed in Appendix B. We selected daily return data for the Russell 2000 index and for all these stocks from 2 January 2013 to 30 June 2016. The data were retrieved from Yahoo Finance.
The data were divided by quarter, giving 14 quarters in total for the index and for each stock. For each quarter, on the basis of the data corresponding to the index, we estimated $\mu_M$ and the standard deviation $\sigma_M$ using as robust estimators the median (MED) and the median absolute deviation (MAD), respectively, the latter defined by
$$\mathrm{MAD} := \frac{1}{0.6745} \, \mathrm{median}\left( \left| X_i - \mathrm{median}(X_i) \right| \right).$$
We also estimated $\mu_M$ and $\sigma_M$ classically, using the sample mean and the sample variance. Then, for each quarter and each of the 50 stocks, we estimated α, β and σ from the regression model using the robust minimum pseudodistance estimators and the classical MLE, respectively. Then, on the basis of relations (9), (10) and (11), we estimated µ and Σ, first using the robust estimates and then the classical estimates, all being previously computed.
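The robust location and scale estimates for the index can be sketched in a few lines. The normalization constant 1/0.6745, which makes the median absolute deviation consistent for the standard deviation under normality, is treated here as an assumption, and the return values below are made up for illustration, with one gross outlier.

```python
import statistics


def robust_location_scale(returns, consistency=1.0 / 0.6745):
    """Median and normalized median absolute deviation of a sample."""
    med = statistics.median(returns)
    mad = consistency * statistics.median(abs(r - med) for r in returns)
    return med, mad


# Hypothetical daily index returns; the last value is a gross outlier.
index_returns = [0.002, -0.001, 0.004, 0.000, 0.003, -0.002, 0.001, -0.250]
med, mad = robust_location_scale(index_returns)
print(med, mad)
print(statistics.mean(index_returns), statistics.stdev(index_returns))
```

The median and the MAD stay at the scale of the typical returns, while the sample mean and the sample standard deviation are dominated by the single outlying value.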
Once the input parameters for the portfolio optimization procedure were estimated, for each quarter, we determined efficient frontiers, for both robust estimates and classical estimates. In both cases, the efficient frontier is determined as follows. Firstly, the range of returns is determined as the interval between the return of the portfolio of global minimum risk (variance) and the maximum value of the return of a feasible portfolio, where the feasible region is
$$\mathcal{X} = \{ w \in \mathbb{R}^N : w^T e_N = 1, \ w_k \ge 0, \ k = 1, \dots, N \}$$
and N = 50. We trace each efficient frontier in 100 points; therefore, the range of returns is divided, in each case, into ninety-nine sub-intervals with
$$\mu_1 < \mu_2 < \dots < \mu_{100},$$
where $\mu_1$ is the return of the portfolio of global minimum variance and $\mu_{100}$ is the maximum return for the feasible region $\mathcal{X}$. We determined $\mu_1$ and $\mu_{100}$ using robust estimates of µ and Σ (for the robust frontier) and then using classical estimates (for the classical frontier). In each case, 100 optimization problems are solved:
$$\arg\min_{w \in \mathcal{X}} S(w), \quad \text{subject to } R(w) \ge \mu_i, \quad i = 1, \dots, 100.$$
In Figure 2, for eight quarters (the first four quarters and the last four quarters), we present efficient frontiers corresponding to the optimal minimum variance portfolios based on the robust minimum pseudodistance estimates with γ = 0.5 and on the classical estimates, respectively. The horizontal axis shows the portfolio risk (given by the portfolio standard deviation) and the vertical axis shows the portfolio return. We notice that, in comparison with the classical method based on MLE, the proposed robust method provides optimal portfolios that have higher returns for the same level of risk (standard deviation). Indeed, for each quarter, the robust frontier is situated above the classical one, the standard deviations of the robust portfolios being smaller than those of the classical portfolios. We obtained similar results for the other quarters and for other choices of the tuning parameter γ of the minimum pseudodistance estimators.
We also illustrate the empirical performance of the proposed optimal portfolios through an out-of-sample analysis, using the Sharpe ratio as the out-of-sample measure. For this analysis, we apply a "rolling-horizon" procedure as presented in [18]. First, we choose a window over which to perform the estimation. We denote the length of the estimation window by τ < T, where T is the size of the entire data set. Then, using the data in the first estimation window, we compute the weights for the considered portfolios. We repeat this procedure for the next window, by including the data for the next day and dropping the data for the earliest day. We continue doing this until the end of the data set is reached. At the end of this process, we have generated T − τ portfolio weight vectors for each strategy, which are the vectors $w_t^k$ for $t \in \{\tau, \dots, T - 1\}$, k denoting the strategy. For a strategy k, $w_t^k$ has the components $w_{j,t}^k$, where $w_{j,t}^k$ denotes the portfolio weight in asset j chosen at time t. The out-of-sample return at time t + 1, corresponding to the strategy k, is defined as $(w_t^k)^T X_{t+1}$, with $X_{t+1} := (X_{1,t+1}, \dots, X_{N,t+1})^T$ representing the data at time t + 1. For each strategy k, using these out-of-sample returns, the out-of-sample mean and the out-of-sample variance are defined by
$$\hat\mu^k = \frac{1}{T - \tau} \sum_{t=\tau}^{T-1} (w_t^k)^T X_{t+1}, \tag{35}$$
$$(\hat\sigma^k)^2 = \frac{1}{T - \tau - 1} \sum_{t=\tau}^{T-1} \left( (w_t^k)^T X_{t+1} - \hat\mu^k \right)^2, \tag{36}$$
and the out-of-sample Sharpe ratio is defined by
$$\widehat{SR}^k = \frac{\hat\mu^k}{\hat\sigma^k}.$$
In this example, we considered the data set corresponding to the quarters 13 and 14. The size of the entire data set was T = 126 and the length of the estimation window was τ = 63 points. For the data from the first window, classical and robust efficient frontiers were traced, following all the steps that we explained in the first part of this subsection. 
More precisely, we considered the classical efficient frontier corresponding to the optimal minimum variance portfolios based on MLE and three robust frontiers, corresponding to the optimal minimum variance portfolios using robust minimum pseudodistance estimations with γ = 1, γ = 1.2 and γ = 1.5, respectively. Then, on each frontier, we chose the optimal portfolio associated with the maximal value of the ratio between the portfolio return and portfolio standard deviation. These four optimal portfolios represent the strategies that we compared in the out-of-sample analysis. For each of these portfolios, we computed the out-of-sample returns for the next time (next day). Then, we repeated all these procedures for the next window, and so on until the end of the data set has been reached. In the spirit of [18] Section 5, using (35) and (36), we computed out-of-sample means, out-of-sample variances and out-of-sample Sharpe ratios for each strategy. The out-of-sample means and out-of-sample variances were annualized, and we also considered a benchmark rate of 1.5 %. In this way, we obtained the following values for the out-of-sample Sharpe ratio: SR = 0.22 for the optimal portfolio based on MLE, SR = 0.74 for the optimal portfolio based on minimum pseudodistance estimations with γ = 1, SR = 0.71 for the optimal portfolio based on minimum pseudodistance estimations with γ = 1.2 and SR = 0.29 for the optimal portfolio based on minimum pseudodistance estimations with γ = 1.5. In Figure 3, we illustrate efficient frontiers for the windows 7 and 8, as well as the optimal portfolios chosen on each frontier.
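The rolling-horizon evaluation can be sketched as follows. The function names are hypothetical, the equal-weight strategy is only a placeholder standing in for the frontier-based portfolio choice described above, and annualization and the benchmark rate are left out of this sketch.

```python
import math


def out_of_sample_sharpe(returns, tau, strategy):
    """Rolling-horizon evaluation of a portfolio strategy.

    returns  : list of T return vectors, returns[t][j] for asset j at time t
    tau      : length of the estimation window (tau < T)
    strategy : function mapping a window of return vectors to a weight vector
    """
    total = len(returns)
    realized = []
    for t in range(tau, total):
        window = returns[t - tau:t]  # the tau most recent observations
        w = strategy(window)
        realized.append(sum(wj * xj for wj, xj in zip(w, returns[t])))
    m = len(realized)  # equals T - tau
    mean = sum(realized) / m
    var = sum((r - mean) ** 2 for r in realized) / (m - 1)
    return mean, var, mean / math.sqrt(var)


def equal_weights(window):
    """Hypothetical placeholder strategy: 1/N in every asset."""
    n = len(window[0])
    return [1.0 / n] * n


# Made-up returns for two assets over five days, estimation window of two days.
toy_returns = [[0.01, 0.03], [0.02, 0.00], [-0.01, 0.01], [0.03, 0.01], [0.00, 0.02]]
oos_mean, oos_var, oos_sharpe = out_of_sample_sharpe(toy_returns, 2, equal_weights)
print(oos_mean, oos_var, oos_sharpe)
```

Replacing `equal_weights` by a function that estimates the model parameters on the window and returns the chosen frontier portfolio would give one strategy of the comparison; each strategy is evaluated on the same sequence of out-of-sample days.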
This example shows that the optimal minimum variance portfolios based on robust minimum pseudodistance estimates in the single index model may attain higher Sharpe ratios than the traditional optimal minimum variance portfolios given by the single index model using MLE.
The numerical results show that, for the single index model, the presented robust technique for portfolio optimization yields better results than the classical method based on MLE when outliers or atypical observations are present in the data set, in the sense that it leads to larger returns for the same level of risk. The considered data sets contain such outliers, as is often the case for this problem, since outliers frequently occur in asset return data. However, when there are no outliers in the data set, the classical method based on MLE is more efficient than the robust ones and therefore may lead to better results.

Conclusions
When outliers or atypical observations are present in the data set, the new portfolio optimization method based on robust minimum pseudodistance estimates yields better results than the classical single index method based on maximum likelihood estimates, in the sense that it leads to larger returns for smaller risks. In the literature, there exist various methods for robust estimation in regression models. In the present paper, we proposed a method based on the minimum pseudodistance approach, which only requires solving a simple optimization problem. In addition, from a theoretical point of view, these estimators have attractive properties: they are redescending robust, consistent, equivariant and asymptotically normally distributed. The comparison with other known robust estimators of the regression parameters, such as the least median of squares estimators, the S-estimators or the minimum density power divergence estimators, shows that the minimum pseudodistance estimators are an attractive alternative that may be considered in other applications too.
Author Contributions: A.T. designed the methodology, obtained the theoretical results and wrote the paper. A.T. and C.F. conceived the application part. C.F. implemented the methods in MATLAB and obtained the numerical results. Both authors have read and approved the final manuscript.
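Since $\theta \mapsto \Psi(z, \theta)$ is continuous, by the uniform law of large numbers, (A1) implies that the empirical mean of the $\Psi(Z_j, \theta)$ converges to its expectation uniformly in $\theta$, in probability. Then, (A2) together with assumption (25) ensures the convergence in probability of $\hat{\theta}$ toward $\theta_0$. The arguments are the same as those in van der Vaart [21], Theorem 5.9, p. 46.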
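For each $i$, call $\ddot{\Psi}_i$ the matrix with elements $\partial^2 \Psi_i / \partial\theta_k \partial\theta_l$ and $C_n(z, \theta)$ the matrix with its $i$-th row equal to $(\hat{\theta} - \theta_0)^T \ddot{\Psi}_i(z, \theta)$. Using a Taylor expansion around $\theta_0$, with $\theta_j$ denoting intermediate points between $\hat{\theta}$ and $\theta_0$, we get
$$0 = \frac{1}{n}\sum_{j=1}^{n} \Psi(Z_j, \hat{\theta}) = \frac{1}{n}\sum_{j=1}^{n} \left\{ \Psi(Z_j, \theta_0) + \dot{\Psi}(Z_j, \theta_0)(\hat{\theta} - \theta_0) + \frac{1}{2} C_n(Z_j, \theta_j)(\hat{\theta} - \theta_0) \right\}.$$
Therefore, $0 = A_n + (B_n + C_n)(\hat{\theta} - \theta_0)$ with
$$A_n = \frac{1}{n}\sum_{j=1}^{n} \Psi(Z_j, \theta_0), \qquad B_n = \frac{1}{n}\sum_{j=1}^{n} \dot{\Psi}(Z_j, \theta_0), \qquad C_n = \frac{1}{2n}\sum_{j=1}^{n} C_n(Z_j, \theta_j),$$
i.e., $C_n$ is the matrix with its $i$-th row equal to $(\hat{\theta} - \theta_0)^T \bar{\ddot{\Psi}}_i$, where
$$\bar{\ddot{\Psi}}_i = \frac{1}{2n}\sum_{j=1}^{n} \ddot{\Psi}_i(Z_j, \theta_j),$$
which is bounded by a constant that does not depend on $\theta$, according to the arguments mentioned above. Since $\hat{\theta} - \theta_0 \to 0$ in probability, this implies that $C_n \to 0$ in probability. We have $\sqrt{n}(\hat{\theta} - \theta_0) = -(B_n + C_n)^{-1} \sqrt{n} A_n$.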
Note that, for $j = 1, \dots, n$, the vectors $\Psi(Z_j, \theta_0)$ are i.i.d. with mean zero and covariance matrix $A$, and the matrices $\dot{\Psi}(Z_j, \theta_0)$ are i.i.d. with mean $B$. Hence, as $n \to \infty$, using (A3), the law of large numbers implies that $B_n \to B$ in probability, and therefore $B_n + C_n \to B$ in probability, where $B$ is nonsingular. Moreover, the multivariate central limit theorem implies $\sqrt{n} A_n \to N_3(0, A)$ in distribution. By Slutsky's lemma, it follows that $\sqrt{n}(\hat{\theta} - \theta_0)$ converges in distribution to $N_3(0, B^{-1} A (B^{-1})^T)$.
Proof of Theorem 3. We first rewrite the system (29). We then consider the contaminated model $P_{\varepsilon, x_{M0}, x_0} := (1 - \varepsilon) P_\theta + \varepsilon\, \delta_{(x_{M0}, x_0)}$, where $\delta_{(x_{M0}, x_0)}$ is the Dirac measure putting all mass at the point $(x_{M0}, x_0)$, and which we simply denote here by $P_\varepsilon$; the rewritten system holds for the functionals evaluated at $P_\varepsilon$. Differentiating the first equation of this system with respect to $\varepsilon$ and evaluating the derivative at $\varepsilon = 0$, we obtain, after some calculations, a first relation. Similarly, differentiating Equations (A11) and (A12) with respect to $\varepsilon$ and evaluating the derivatives at $\varepsilon = 0$ gives two further relations. Solving the system formed by Equations (A13)-(A15), we find the expressions for the influence functions.
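For reference, the differentiation step used in this proof is the standard definition of the influence function. Writing $T$ for the statistical functional associated with one of the estimators (the symbol $T$ is our notation, introduced only for this illustration), the influence function at the point $(x_{M0}, x_0)$ is the derivative of $T$ along the contamination path, evaluated at $\varepsilon = 0$:

```latex
\mathrm{IF}(x_{M0}, x_0; T, P_\theta)
  = \left.\frac{\partial}{\partial \varepsilon}\,
    T\big(P_{\varepsilon, x_{M0}, x_0}\big)\right|_{\varepsilon = 0},
\qquad
P_{\varepsilon, x_{M0}, x_0} = (1-\varepsilon)\, P_\theta + \varepsilon\, \delta_{(x_{M0}, x_0)} .
```

Differentiating each estimating equation in this way yields a linear system in the unknown influence functions, which is why the proof ends by solving a system of three equations.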