Does Sustainability Score Impact Mutual Fund Performance?

Given that sustainable investing constitutes a major force across global financial markets, in 2016 Morningstar began reporting Morningstar Sustainability and ESG scores. We use these scores to study the effects of Socially Responsible Investments (SRI) on European equity fund performance. The sustainability score and the individual pillars of the ESG score (environmental, social, and governance) are negatively related to performance. We also test the effect on mutual fund flows and risk. The sustainability score has a significant effect on flows: higher-rated funds receive larger inflows. In terms of risk, the level of sustainability is negatively related to the VaR (value at risk) of the fund, which suggests that higher scored mutual funds offer better protection against extreme losses.


Introduction
Socially Responsible Investment (SRI), also known as sustainable, responsible, and impact investing, is "an investment discipline that considers environmental, social and corporate governance (ESG) criteria to generate long-term competitive financial returns and positive societal impact" (US SIF, n.d. i ). According to the 2016 Global Sustainable Investment Review (GSIA, 2017), in 2016 there were $22.89 trillion assets being professionally managed under SRI strategies in the world. Bilbao-Tero, Álvarez-Otero, Bilbao-Tero and Cañal-Fernández (2017) conclude that the SRI label in the mutual fund industry is valued favorably by the market, which is an important factor that drives this growth. Barreda-Tarrazona, Matalín-Sáez and Balaguer-Franch (2011) conclude that social preference (instead of financial performance) is the primary factor for investors choosing SRI mutual funds.
The growing interest in SRI in recent years has led to several organizations assessing mutual funds on how well the underlying companies perform on ESG issues.
In 2016, Morningstar launched the Morningstar Sustainability Rating. The rating classifies mutual funds on ESG factors relative to their Morningstar category peers. The advantage of this product is that it makes it possible to find sustainable funds even if they do not label themselves specifically as funds that support an SRI approach. The use of these scores marks an important difference from previous studies, which either compare SRI funds with an index or, in the most advanced cases, apply a so-called matching approach, i.e. they compare the performance of SRI and non-SRI investment funds with similar characteristics (fund size, fund age, expenses, et cetera) so as to properly account for management and transaction costs for both SRI funds and conventional funds (see, among others, Mallin, Saadouni and Briston, 1995; Gregory, Matatko and Luther, 1997; Statman, 2000; Kreander, Gray, Power and Sinclair, 2002; Kreander, Gray, Power and Sinclair, 2005).
One important research question in the mutual fund industry about SRI investing is to know how SRI mutual funds perform. There are several studies that have demonstrated that companies with social responsibility policies and practices are good investments. For example, a recent paper of Friede, Busch and Bassen (2015) conducted a meta-analysis of about 2,200 unique primary empirical studies. They found that the majority of studies show a positive correlation between ESG factors and financial performance. But despite the investigations carried out to date, there is still a debate about whether these types of investments can create value for investors or not and why they put their money here. Although according to Lewis and Mackenzie (2000) and Webley, Lewis and Mackenzie (2001), some investors in SRI funds are willing to accept lower returns for their moral stance, the performance of SRI funds and conventional funds is still an open question. As Junkus and Berry (2015) sustain, after a review of the most recent work in major finance journals on SRI, "the performance of SR mutual funds and indexes are not generally significantly different to conventional funds or indexes, but again these results are also highly dependent on model specification, time period, benchmark, and other characteristics of the study".
Authors such as Luther, Matatko and Corner (1992) and Mallin, Saadouni and Briston (1995) support the idea that SRI funds outperform market indexes. But the more conventional theory is that SRI mutual funds have the same return as any other funds, and authors such as Hamilton, Jo and Statman (1993), Sinclair (2002, 2005), Gregory and Whittaker (2007), Bauer, Derwall and Otten (2007), Humphrey, Warren and Boon (2016) and Syed (2017) are in line with this theory. Another theory holds that choosing SRI funds is basically a "trade-off" between investing in SRI and returns, so SRI investments underperform the benchmark (for example, White, 1995). One important recent paper is Nofsinger and Varma (2014), which provides a new perspective: the authors found that the difference between socially responsible (SR) and conventional mutual funds depends on the state of the market. SR mutual funds outperform conventional mutual funds during periods of market crisis, but in non-crisis periods, SR funds underperform conventional funds.
Previous research has studied the effect of sustainability on performance exclusively using a dichotomous variable to differentiate between socially responsible funds and conventional funds. However, the results could be biased because the "socially responsible" group could include funds with very different levels of sustainability. Statman and Glushkov (2016) conclude that the lack of clearly defined criteria to distinguish mutual funds as "socially responsible" results in inconsistently applied classifications that make it difficult to measure the performance of SRIs. The traditional methodology in empirical research is benchmarking against indices or, more recently, matched pair analysis, which was initially applied by Mallin, Saadouni and Briston (1995) and is based on comparing returns of SRI funds and conventional funds with similar characteristics in terms of volume of assets, inception dates, et cetera. For this reason, the inclusion of sustainability scores in our work allows us to evaluate whether the degree of sustainability of the portfolio in which the funds are invested has a positive effect on performance. As far as we know, only Dolvin, Fulkerson and Krukover (2017) and El Ghoul and Karoui (2017) analyzed this effect. Dolvin, Fulkerson and Krukover (2017) conclude that funds with higher Morningstar Sustainability scores have similar alphas to those with lower Sustainability scores. The authors also observe that there is little difference in the performance or Sustainability scores between self-proclaimed SRI funds versus those that fall in the top 50 and top 20 percent of Morningstar's Sustainability scores. Finally, they observe that mutual funds with higher Morningstar Sustainability metrics do not appear to be more attractive to investors compared to low scoring funds.
In contrast, self-proclaimed SRI funds have performed significantly better regarding fund flows. El Ghoul and Karoui (2017) use CSR (Corporate Social Responsibility) scores to study the effect on fund performance and flows, concluding that higher values display poorer performance and weaker performance-flow relation. From an investor point of view, the advantage of using Sustainability scores is that they can select their SRI, taking into consideration the funds with better scores, whether or not they are declared as an SRI fund ii .
This paper adds to the growing literature on SRI by specifically examining the effect of the degree of sustainability, measured through Morningstar Sustainability scores included in Morningstar Direct in 2016. In particular, we assess the effect of sustainability scores and the different dimensions into which the score is subdivided (environmental, social, and governance) on performance, as well as on downside risk and the flow of funds. The conventional dichotomous variable has also been added to the models to evaluate to what extent the results may differ. Our empirical evidence also contributes to the literature on mutual funds that discusses whether applying a particular investment screening in portfolio selection affects the mutual fund performance (see, for example, Bauer, Derwall and Otten, 2007 or Muñoz, Vargas and Marco, 2014). SRI portfolios are subject to both positive and negative social screens (Rivoli, 2003). Portfolio theory argues that narrowing the universe of assets restricts diversification opportunities and thus the risk-adjusted performance (Rudd, 1981), whereas Hill, Ainscough, Shank and Manullang (2007) and Chegut, Schenk and Scholtens (2011) consider that restricting investment screening allows the identification of companies with higher growth potential and better management, therefore leading to a better financial performance and risk profile. Sustainable mutual funds apply a specific portfolio screening by concentrating investments in socially conscious businesses. Although there is profuse empirical literature on the impact of social responsibility on performance, little is known about sustainability-based screening.
Our empirical results show that a large number of funds are not declared sustainable but their portfolio is comparable to sustainable mutual funds. Furthermore, the Sustainability score is significant in explaining the level of performance, downside risk, and flows. We also achieved equivalent results for the three dimensions of sustainability (environmental, social and corporate). The signs are different on performance and downside risk when the conventional dummy to declare social mutual funds is used.
The remainder of this paper is laid out as follows. In Section 2, we review the related literature on SRI performance, in Section 3 we describe our data and the performance evaluation metrics, in Section 4 we describe our empirical methods and results, in Section 5 we conduct robustness tests and, finally, we draw conclusions from our research.

Literature Review
Over the last few years, SRI investment research has been growing. The CFA Institute, which is a global association for investment professionals, states that "a key idea in the discussion of ESG issues is that systematically considering ESG issues will likely lead to more complete analyses and better-informed investment decisions" and "that every investment analyst should be able to identify and properly evaluate investment risks, and ESG issues are a part of this evaluation" (CFA Institute, 2015). For this association, there are basically two investors interested in considering ESG issues: value-motivated and values-motivated investors. We focus on the first kind of investors concerned with the financial performance of their SRI funds. Hamilton, Jo and Statman (1993) developed three hypotheses regarding the performance of SRI mutual funds. The first hypothesis is that SRI fund performance equals that of conventional funds, which is consistent with a market that does not regard the social responsibility feature. The second hypothesis is that SRI fund performance is lower than that of conventional funds, which is consistent with a market that values the social responsibility feature. Finally, the third hypothesis is that SRI fund performance is higher than that of conventional funds. There are several arguments which could explain why SRI mutual funds can outperform, in financial terms, the conventional funds (which do not consider ESG factors). First, SRI mutual funds have a higher proportion of their portfolio in the segment of small companies; these companies are better adapted to market changes (Luther, Matatko and Corner, 1992;Gregory, Matatko and Luther, 1997) and may also be more profitable in the long run. Second, social companies are more efficient, better managed and develop better in the market (Hamilton, Jo and Statman, 1993). 
From a theoretical point of view (for example, Margolis, Elfenbein and Walsh, 2009 or Flammer, 2015), social companies can reduce costs (penalties, et cetera) or increase revenues (innovative products, greater employee effort, better public perception, increasing the likelihood that consumers will purchase the company's products or its share price, attract socially conscious customers, et cetera). In contrast, one important argument of the detractors of SRI funds is that the universe of possible investments of these funds (individual companies) is small, so they assume a higher investment risk because of the lack of diversity (Chegut, Schenk and Scholtens, 2011). Humphrey and Tan (2014) replicate 10,000 pairs of SRI and conventional portfolios to test the impact of SRI screening on performance, finding no significant difference in the risk-adjusted return of screened and unscreened portfolios. They conclude that a typical SRI fund will neither gain nor lose from screening its portfolio. But Trinks and Scholtens (2017) find that negative screening implies an opportunity cost, because excluding controversial stocks from an investment portfolio may reduce financial performance. Authors such as Kurtz (1997) or Goldreyer and Diltz (1999) argue that SRI mutual fund managers need more information than conventional fund managers about the companies in which they invest; they base their decisions on deeper, more complete, and higher quality information, resulting in a significant reduction in the risk of their investment decisions. Empirical evidence from some authors, such as Luther, Matatko and Corner (1992) and Mallin, Saadouni and Briston (1995), supports the idea that SRI funds outperform conventional investments.
But there is also evidence to support the idea that SRIs are neutral to financial performance (Hamilton, Jo and Statman, 1993;Kreander, Gray, Power and Sinclair, 2005;Gregory and Whittaker, 2007;Bauer, Derwall and Otten, 2007;among others), or that SRI funds underperform conventional investments (for example, White, 1995).
The first study about SRI investment was done by Luther, Matatko and Corner (1992), where these authors found that SRI investment funds did not under or outperform the index benchmark. They used 15 British Ethical funds, finding weak evidence that 15 UK SRI funds outperformed two stock market indices. Hamilton, Jo and Statman (1993) conducted a similar study where the difference of means of excess returns was not significant and only one of 17 mutual funds had a positive Jensen's alpha. Luther and Matatko (1994) improved their prior work by including a small market index and they concluded that the excess returns of SRI funds are strongly influenced by the low capitalization of the small cap stocks. The study also shows that SRI funds have a neutral effect on performance. White (1995) researches US and German mutual funds using a simple regression against an environmental market index, showing that the SRI investments underperform the benchmark in terms of three performance measures (Jensen's alpha, the Treynor ratio, and the Sharpe ratio). In this research, the author used a sample of six US funds and five German SRI Investment funds.
All previous studies used an index as benchmark, so they face the problem of choosing the appropriate index. Mallin, Saadouni and Briston (1995) avoided this problem by using a matched pair analysis to compare SRI mutual funds and conventional funds in the UK. The authors matched 29 SRI mutual funds to conventional ones using the size and the age of the funds as criteria. Their results showed no differences in the performance of both samples using the Sharpe and Treynor ratios as performance measures, but they found that ethical funds did better than the non-ethical funds when the Jensen performance measure was used. Gregory, Matatko and Luther (1997) studied 18 SRI funds where the investment area and the fund type were considered. They did not find differences in performance against conventional funds. Statman (2000) studied the performance of 31 US SRI mutual funds and the Domini 400 Social-Index (DSI) from 1990 to 1998. The results show that only some SRI funds could underperform the benchmark (S&P 500 or DSI). But, in general, SRI funds obtained a similar performance to S&P 500, DSI, and conventional funds. Kreander, Gray, Power and Sinclair (2002) used a matching procedure and the age, size, country and investment universe of the fund as variables. The study included mutual funds from Sweden, the Netherlands, Norway, Germany, the UK and Switzerland, and Jensen's alpha and the Sharpe and Treynor ratios as performance metrics. Their results showed that SRI funds' performance was very similar to that of conventional funds. Kreander, Gray, Power and Sinclair (2005) studied the performance of 30 European SRI funds from four countries, finding that there is no difference between SRI funds and conventional funds. Bello (2005) studied 42 US SRI mutual funds; he found no evidence of a performance difference between SRI and conventional funds. Both underperformed the Domini 400 Social Index and S&P 500 during the study period (1994-2001).
Bauer, Koedijk and Otten (2005) investigated the performance of 32 British, 16 German and 55 US SRI funds. They used Jensen's and Carhart's alphas and found that German and US SRI mutual funds underperformed both their relevant indexes and the conventional funds, whereas UK funds slightly outperformed; however, the differences were not significant. Scholtens (2005) investigated the performance of Dutch SRI funds and found that these funds outperformed conventional funds but with no statistically significant difference.
Barnett and Salomon (2006) studied 61 SRI funds tracked by the US Social Investment Forum (USSIF). They found that the relationship between financial and social performance is neither strictly negative, nor strictly positive. Instead, they found a curvilinear relationship, suggesting that the two viewpoints may be complementary. Risk-adjusted performance varies with the types of social screens used. Community relations screening (excludes firms that do not invest in and/or develop economically depressed communities) increased financial performance, but environmental and labor relations screening (excludes firms with a record of poor environmental performance and firms with a record of poor labor relations practices, respectively) decreased financial performance. Bauer, Otten and Rad (2006) investigated the performance of Australian ethical funds, and Bauer, Derwall and Otten (2007) investigated evidence from Canada, finding no statistical difference in performance between conventional and SRI funds. Gregory and Whittaker (2007), in the UK market, found that neither SRI nor non-SRI funds exhibited significant underperformance. Renneboog, Ter Horst and Zhang (2008) found that SRI funds in the US, the UK, and in many continental European and Asia-Pacific countries underperformed their domestic benchmarks. However, with the exception of France, Japan, and Sweden, the risk-adjusted performance of SRI funds is not statistically different from the performance of conventional funds. Gil-Bazo, Ruiz-Verdú and Santos (2010) found that during the period 1997-2005, US SRI funds had better performance (gross and net Carhart's alphas) than conventional mutual funds with similar characteristics.
The authors find that the differences are driven exclusively by SRI funds run by management companies specializing in SRI, while funds run by companies not specializing in SRI underperform conventional funds. Climent and Soriano (2011) randomly selected US-based large-cap equity mutual funds (25 are members of the SIF and 21 are conventional funds) finding there were no significant performance differences between conventional and SRI mutual funds employing Data Envelopment Analysis. Nofsinger and Varma (2014) found that SRI mutual funds outperformed conventional funds in the global financial crisis, so they can be an optimal choice for investors who want to protect themselves from downside risk. They also found that SRI funds underperform at other times. Leite and Cortez (2014) performed a multi-country study focused on 54 international SRI funds located in eight European markets (Austria, Belgium, France, Germany, Italy, the Netherlands, the UK, and Spain); they applied the five-factor model and found a similar performance between socially responsible funds and conventional funds. Muñoz, Vargas and Marco (2014) studied 89 European green funds and 18 US funds from 1994 to 2013. They applied the Carhart four-factor model and stated that, for the US market, green funds did not perform any worse than the market, but with a global equity portfolio green funds showed evidence of underperformance. Becchetti, Ciciretti, Dalo and Herzel (2015) found no clear-cut dominance over the entire period analyzed (1992-2012), but also found that SRI funds generally did better than conventional funds in the period following the global financial crisis of 2007. Leite and Cortez (2015), focusing on the French market, found that SRI funds underperformed slightly more than their matched samples according to different models, but differences in alphas are not statistically significant in most cases. They only found significance in one of the estimated models at the 10% significance level.
Humphrey, Warren and Boon (2016) found that SRI managers have longer tenure and are more likely to be female, but they did not find any significant difference in the performance of SRI and conventional funds. Ibikunle and Steffen (2017) conducted a comparative financial performance analysis on European green, conventional, and black mutual funds; they concluded that there was no difference in the performance of the green and the conventional funds and that green funds are beginning to significantly outperform black funds. Dolvin, Fulkerson and Krukover (2017) is the only reference, to our knowledge, that employs Morningstar Sustainability scores in their analysis. The authors conclude that funds with higher Morningstar Sustainability scores have similar alphas to those with lower Sustainability scores. The authors also observe that there is little difference in the performance or Sustainability scores between self-proclaimed SRI funds versus those that fall in the top 50 and top 20 percent of Morningstar's Sustainability scores. Finally, El Ghoul and Karoui (2017) employed a CSR score, which is an asset-weighted composite CSR fund score. They showed the effects of CSR on fund performance; compared to low-CSR funds, high-CSR funds displayed a poorer performance.

Sample
Our sample contains 1,593 European equity funds rated by Morningstar Sustainability in November 2016. The funds are of the "open funds" type, have an ESG score, and invest in Europe. Furthermore, to avoid problems of multicollinearity, we have selected only a single equivalent class for each fund. We obtained for each equity mutual fund several measures of performance and other variables such as size, volatility, socially conscious status, expenses, and age. We also used the Morningstar style-box to control for the effect of the different categories which are included in the sample. The number of funds falls from 1,593 to 571 when we consider costs, owing to the lack of data available in Morningstar Direct.

Variables construction
Our sustainability variables have been obtained from Morningstar Direct (original source Sustainalytics iii ). We employ five variables: three are the pillar scores [Environment score variable (Envscore), Social score variable (Socscore) and Governance score variable (Govscore)], the fourth is the ESG score of a portfolio (ESGscore), and the fifth is the Portfolio Sustainability Score (Sustscore), which is the ESG score minus the Portfolio Controversy Score (Bos, 2017). In order to receive a portfolio sustainability score, a portfolio must have a portfolio ESG score and a portfolio controversy score and, according to Morningstar (2016b), at least 50% of a portfolio's assets under management must have these scores. The Morningstar Portfolio ESG Score (ESGscore) iv is calculated as: [equation and symbol definitions missing in the source]

We have divided our sample funds into two groups based on whether the ESG scores are below or above the median. Then, we estimated the means and their differences between both groups. Table 1 reports the results of the univariate analysis. As can be observed, the differences between the two groups are highly significant for the different scores, with a difference of approximately five points in favour of the funds included in the high score group.

The funds are classified into low or high groups depending on whether their score is above or below the median. The t-statistic for difference of means is reported in the third column. Sustscore is the level of sustainability of the mutual fund measured by Morningstar. ESGscore is the ESG score of a fund. Envscore, Socscore and Govscore are the mutual fund scores for the three dimensions (environment, social and corporate governance). *Significant at 10%; ** significant at 5% and *** significant at 1%.

This table reports the number of mutual funds classified as sustainable using two different dummy variables. Sustainabledummy is based on low or high sustainable scores depending on whether their score is above or below the median. Sociallyconscious is for those mutual funds declared as socially conscious.
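The median split and difference-of-means comparison described above can be illustrated with a minimal sketch. The paper does not state which t-test was used or how funds exactly at the median were assigned, so the Welch (unequal-variance) statistic, the tie rule and all data below are assumptions made only for illustration.

```python
import math
import statistics


def split_by_median(scores):
    """Return (low, high) index lists: scores at or below the median vs. strictly above.

    Assigning ties to the low group is an assumption; the source does not specify it.
    """
    med = statistics.median(scores)
    low = [i for i, s in enumerate(scores) if s <= med]
    high = [i for i, s in enumerate(scores) if s > med]
    return low, high


def welch_t(a, b):
    """Welch t-statistic for mean(a) - mean(b), allowing unequal variances."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


# Hypothetical ESG scores and two-year returns for six funds.
esg = [52.0, 55.0, 58.0, 61.0, 63.0, 66.0]
ret = [0.08, 0.07, 0.09, 0.05, 0.04, 0.06]

low, high = split_by_median(esg)
t_stat = welch_t([ret[i] for i in high], [ret[i] for i in low])
print(low, high, round(t_stat, 4))
```

A negative statistic here means the high-score group has the lower mean, which is the direction the paper reports for most performance metrics.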

Performance variables
We considered different performance measures from the Morningstar Direct database. Given that we only have ESG data available for December 2016, we have analyzed the performance and risk effects using the performance and risk metrics for the last two years based on Wimmer (2012), who showed that ESG scores persisted for two years, and were motivated by the changes in the holdings of the SRI mutual funds. In particular, we used the raw return and Sharpe ratios. We also computed Carhart's alphas based on values provided on Kenneth French's website v .
The differences in performance between the high and low ESG scored funds are negative when considering raw returns, Sharpe ratios and two-year alphas (Table 3). That is, higher ESG scored funds show a poorer performance, except in the case of the one-year alpha. Our results are consistent with those achieved by El Ghoul and Karoui (2017) and Dolvin, Fulkerson and Krukover (2017) for US mutual funds.
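For readers unfamiliar with Carhart's alpha, the sketch below shows how such an alpha is typically obtained: the fund's excess return is regressed on the market, size (SMB), value (HML) and momentum (MOM) factors, and the intercept is the alpha. This is a generic illustration, not the authors' code; the factor series are made-up numbers, and the small `ols` helper is a hypothetical stand-in for a statistics package.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def carhart_alpha(excess_ret, mkt, smb, hml, mom):
    """Return (alpha, [b_mkt, b_smb, b_hml, b_mom]) from the four-factor regression."""
    X = [[1.0, m, s, h, u] for m, s, h, u in zip(mkt, smb, hml, mom)]
    coef = ols(X, excess_ret)
    return coef[0], coef[1:]


# Hypothetical factor returns for eight periods (illustrative values only).
mkt = [0.02, -0.01, 0.03, 0.00, -0.02, 0.01, 0.04, -0.03]
smb = [0.01, 0.00, -0.01, 0.02, 0.01, -0.02, 0.00, 0.01]
hml = [0.00, 0.01, 0.01, -0.01, 0.02, 0.00, -0.02, 0.01]
mom = [0.01, -0.02, 0.00, 0.01, 0.00, 0.02, -0.01, -0.01]
# Fund excess returns built with a known alpha of 0.001 so the fit can be checked.
excess = [0.001 + 1.1 * m + 0.2 * s - 0.1 * h + 0.05 * u
          for m, s, h, u in zip(mkt, smb, hml, mom)]

alpha, betas = carhart_alpha(excess, mkt, smb, hml, mom)
print(round(alpha, 6), [round(b, 4) for b in betas])
```

Because the example returns are constructed without noise, the regression recovers the alpha and loadings that were used to build them; with real data the intercept would come with a standard error.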

This table reports the values of the performance metrics indicated in the first column. Alpha (Carhart's alpha) and Sharpe (Sharpe ratio) are risk-adjusted returns calculated for two and one years, estimated at the end of 2016. Return is the raw measure of profitability. The data have been obtained from the Morningstar Direct database and Kenneth French's website. The funds are classified into low or high groups depending on whether their ESG score is above or below the median. The t-statistic for difference of means is reported in the third column. *Significant at 10%; ** significant at 5% and *** significant at 1%.

Downside risk variables
We also assessed fund performance by considering downside risk. Tail risk is commonly taken by mutual funds and it has been shown to be useful in explaining fund performance (Kelly and Jiang, 2014). Specifically, we examined whether sustainable mutual funds are more or less exposed to tail risk by measuring mutual fund downside risk using the Value at Risk (VaR). VaR measures the maximum loss that a fund can obtain for a given time period and a given confidence level (1-p) as:

$$\Pr(R_i \le VaR_i) = p$$

which is the loss associated with the p-th percentile of the return distribution. It can be computed as $VaR_i = F_i^{-1}(p)$, where $F_i$ is the return distribution of fund $i$.

Table 4 shows the difference of means for downside risk measured by the historical monthly VaR at a 99% confidence level. The evidence for VaR reveals that highly scored mutual funds display less tail risk, but the difference is only statistically significant for the two-year measure.

The funds are classified into low or high groups depending on whether their ESG score is above or below the median. The t-statistic for difference of means is reported in the third column. *Significant at 10%; ** significant at 5% and *** significant at 1%.
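The VaR definition above, the p-th percentile of the return distribution, can be sketched as a historical (empirical) quantile. Several quantile conventions exist and the source does not say which one Morningstar applies; the version below assumes the generalized inverse of the empirical distribution, and the return series is hypothetical.

```python
import math


def historical_var(returns, p=0.01):
    """Empirical p-quantile of returns, inf{r : F(r) >= p}.

    With p = 0.01 this corresponds to a 99% confidence level. The result is
    expressed as a return, so large losses appear as large negative values.
    """
    if not returns:
        raise ValueError("returns must not be empty")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    ordered = sorted(returns)
    # Smallest order statistic whose empirical CDF reaches p.
    k = max(math.ceil(p * len(ordered)) - 1, 0)
    return ordered[k]


# Hypothetical monthly returns for one fund.
monthly = [0.03, -0.05, 0.01, -0.12, 0.02, 0.04, -0.01, 0.00]
var_99 = historical_var(monthly, p=0.01)
var_75 = historical_var(monthly, p=0.25)
print(var_99, var_75)
```

With short samples such as one or two years of monthly data, the 1% quantile under this convention coincides with the worst observed month, which is worth keeping in mind when interpreting differences between groups.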

Flow of funds
We measure the flow of funds as:

$$Flow_{i,t} = \frac{TNA_{i,t} - TNA_{i,t-1}\,(1 + R_{i,t})}{TNA_{i,t-1}}$$

where $TNA_{i,t}$ and $TNA_{i,t-1}$ are the total net assets for fund $i$ at the end of year $t$ and $t-1$, respectively, and $R_{i,t}$ is the return of fund $i$ in year $t$. Table 5 displays the difference of means for the flow of funds, showing positive differences for higher scored mutual funds.

Table 5 (values as extracted): -0.10, -0.07, -1.75*

This table reports the values of flow of funds obtained from Morningstar Direct database. The funds are classified into low or high groups depending on whether their score is above or below the median. The t-statistic for difference of means is reported in the third column. *Significant at 10%; ** significant at 5% and *** significant at 1%.

Table 6 shows the different variables considered in our work. As can be seen, the variables related to the level of sustainability have an average level close to 60 points, and the difference between the minimum and maximum is around 25 points. On average, the funds have a negative alpha despite yielding positive returns for the term of 1 and 2 years.
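As a sketch of the flow measure defined in this section, the function below assumes the usual normalized form: the growth in total net assets beyond what the fund's own return would have produced, divided by beginning-of-period assets. The formula in the source was lost in extraction, so this form is an assumption based on the symbol definitions given, and the figures are hypothetical.

```python
def fund_flow(tna_now, tna_prev, ret):
    """Net flow as a fraction of beginning-of-period total net assets.

    tna_now  : total net assets at the end of year t
    tna_prev : total net assets at the end of year t-1
    ret      : fund return over year t (0.10 means 10%)
    """
    if tna_prev <= 0:
        raise ValueError("tna_prev must be positive")
    return (tna_now - tna_prev * (1.0 + ret)) / tna_prev


# Hypothetical fund: assets grow from 100 to 105 while the fund returns 10%,
# so investors withdrew, on net, about 5% of beginning-of-year assets.
flow = fund_flow(105.0, 100.0, 0.10)
print(round(flow, 4))
```

A negative value therefore indicates net redemptions even when assets under management have grown in absolute terms.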

Descriptive statistics
The average flow has been negative and the percentage declared to be socially responsible is very small (8%). The size is very variable, the expense ratio is greater than 1% because the mutual funds included invest in equity, and in general the funds have a high average age.

Fund performance and Sustainability Scores
In this part, we test if the degree of sustainability measured through ESG scores has a positive or negative effect on performance. In addition, we consider ESG scores to evaluate the contribution of each dimension to the portfolio performance. We propose the following model:

$$Perf_i = \alpha + \beta_1 Score_i + \beta_2 Age_i + \beta_3 LossDev_i + \beta_4 LogSize_i + \beta_5 ExpRat_i + \beta_6 Sociallyconcious_i + \sum_j \gamma_j Category_{ij} + \varepsilon_i$$

where:
$Perf_i$ = Alternative performance metrics for fund $i$.
$i$ = 1 through N, where N is the total number of funds in the sample.
$Score_i$ is the sustainability score provided by Morningstar.
Age = Years since inception date.
LossDev = standard deviation of mutual fund returns.
LogSize = logarithm of mutual fund market value.
ExpRat = Net expense ratio of fund $i$.
Sociallyconcious = dummy of SRI mutual funds.
Category = dummies of categories except small style.
$\alpha$, $\beta_1, \ldots, \beta_6$ and $\gamma_j$ are the parameters of the regression and $\varepsilon_i$ is the error term.
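To make the role of the category dummies concrete, the following sketch builds one row of the regressor matrix for a fund, omitting the base category (the small style) so that the remaining dummies are read relative to it. The category labels, the ordering of the regressors and the fund values are hypothetical; only the variable names come from the list above.

```python
def column_names(categories, base):
    """Names of the regressors: intercept, controls, then non-base category dummies."""
    fixed = ["const", "Sustscore", "Age", "LossDev", "LogSize",
             "ExpRat", "Sociallyconcious"]
    return fixed + [f"Category_{c}" for c in categories if c != base]


def design_row(fund, categories, base):
    """One row of the design matrix for a single fund (a dict of its variables)."""
    row = [1.0, fund["Sustscore"], fund["Age"], fund["LossDev"],
           fund["LogSize"], fund["ExpRat"], float(fund["Sociallyconcious"])]
    # One 0/1 indicator per category except the omitted base category.
    row += [1.0 if fund["Category"] == c else 0.0
            for c in categories if c != base]
    return row


# Hypothetical style categories and fund.
categories = ["Large", "Mid", "Small"]
fund = {"Sustscore": 58.2, "Age": 12.0, "LossDev": 0.14, "LogSize": 8.1,
        "ExpRat": 1.6, "Sociallyconcious": 1, "Category": "Large"}

names = column_names(categories, base="Small")
row = design_row(fund, categories, base="Small")
print(dict(zip(names, row)))
```

Dropping one category avoids perfect collinearity with the intercept; a fund in the base category simply has all category dummies equal to zero.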
Our results show that Sustscore is significant in explaining the level of performance for all the metrics and terms. If we use ESG scores instead of Sustscore, the results are mainly the same. Most of the models present a negative sign in line with El Ghoul and Karoui (2017) and Renneboog et al. (2008), who suggest that socially responsible mutual funds underperform other funds. The dummy variable is also significant, showing that considering the level of sustainability can help to better understand the relationship between performance and social responsibility. Our results support Statman and Glushkov (2016), who conclude that the lack of clearly defined criteria to distinguish mutual funds as socially responsible affects the results of previous research based on dichotomy variables. Among the control variables, Table 7 shows that volatility and the expense ratio, but only in some models, are negatively related to performance, while size and age are not significant. coefficients for the regression models for different performance measures. Alpha is the Carhart´s alpha measure; Sharpe is the yearly risk-adjusted return and Return is the total net return. Sustscore is the level of sustainability of the fund provided by Morningstar and Sociallyconcious is the common dummy variable used to analyse sociallyconscious mutual funds. N is the number of observations and r2 the R-squared fit measure. The dummies of categories have been included and compared with small mutual fund of Morningstar Style Box. *Significant at 10%; ** significant at 5% and *** significant at 1%.
Using the different elements into which ESG scores are subdivided, we obtained similar results, finding in most models a negative relation between the dimensions of sustainability and performance (Table 8). Again, mutual funds with higher environmental scores show lower performance, both risk-adjusted and unadjusted, in five of the six estimated models. For the other dimensions (social and governance) the results are quite similar: in general, the different dimensions have a negative impact on the alternative performance metrics.
This table reports the coefficients for the regression models for different performance measures. Alpha is Carhart's alpha measure; Sharpe is the yearly risk-adjusted return and Return is the total net return. Sociallyconcious is a dummy variable used to analyse socially conscious mutual funds. N is the number of observations and r2 the R-squared fit measure. *Significant at 10%; ** significant at 5% and *** significant at 1%.

Downside risk and sustainability scores
In this part, we test if the degree of sustainability measured through ESG scores and their components has a positive or negative effect on the historical VaR of the portfolio.
We used the following model:
$$\mathit{VaR}_i = \alpha + \beta_1 \mathit{Sustscore}_i + \beta_2 \mathit{Age}_i + \beta_3 \mathit{LossDev}_i + \beta_4 \mathit{LogSize}_i + \beta_5 \mathit{ExpRat}_i + \beta_6 \mathit{Sociallyconcious}_i + \sum_j \gamma_j \mathit{Category}_{ij} + \varepsilon_i$$
As Table 9 shows, the downside risk of mutual funds is affected by the level of sustainability (ESG score). Specifically, the variable Sustscore is negatively and significantly related to the VaR of the fund at a 99% confidence level over both the one-year and two-year terms. These results support the view that funds with a higher degree of sustainability better protect investors against extreme losses. As Kurtz (1997) or Goldreyer and Diltz (1999) explain, SRI mutual fund managers base their decisions on deeper, more complete, and higher quality information, resulting in a significant reduction in the risk of their investment decisions. On the other hand, the commonly used dichotomous variable has a positive sign, opposite to that obtained with the continuous variable. We also repeated the analysis for the individual sub-factors, again observing a negative and significant relationship for most of the estimated models. As can be seen in Table 9, an increase in the level of environmental, social, and governance sustainability reduces the level of extreme losses of investment funds. The dummy variable is again significant and positively related to the level of risk. This analysis shows that evaluating the effect of sustainability with dichotomous variables may yield results that contradict those obtained with continuous variables.
This table reports the coefficients for the regression models. VaR is the maximum loss that fund i can incur over a given time period at a given confidence level. Sustscore is the level of sustainability of the mutual fund measured by Morningstar. Envscore, Socscore and Govscore are the mutual fund scores for the three dimensions (environment, social and corporate governance). Sociallyconcious is a dummy variable used to analyse socially conscious mutual funds. N is the number of observations and r2 the R-squared fit measure. *Significant at 10%; ** significant at 5% and *** significant at 1%.
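As an illustration of the dependent variable in this section, the following minimal sketch computes a historical VaR from a series of periodic returns. The text does not describe how the VaR figures were obtained, so the function name, the sign convention (VaR reported as a positive loss), and the quantile rule used here are all assumptions; the returns are invented for the example.

```python
import math


def historical_var(returns, confidence=0.99):
    """Historical VaR as a positive loss.

    Returns minus the empirical (1 - confidence) quantile of the
    return series (assumed convention).
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    ordered = sorted(returns)
    tail = 1.0 - confidence
    # Smallest order statistic whose empirical CDF reaches the tail
    # probability; the tolerance guards against floating-point noise
    # in tail * n.
    k = max(math.ceil(tail * len(ordered) - 1e-9) - 1, 0)
    return -ordered[k]


# Invented return series: two bad periods and 98 quiet ones.
sample_returns = [-0.10, -0.05] + [0.01] * 98
var_99 = historical_var(sample_returns, confidence=0.99)
var_95 = historical_var(sample_returns, confidence=0.95)
print("99% VaR:", var_99, "95% VaR:", var_95)
```

Under this convention a larger VaR means a larger extreme loss, so a negative coefficient on Sustscore corresponds to smaller extreme losses for higher-scored funds.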

Flows and sustainability scores
In this section, we analyze the effect of sustainability on the flows of investment funds.
In particular, flows of sustainable funds are generally considered to be less sensitive to changes in performance because investors value other elements in their utility function. Benson and Humphrey (2008) and Renneboog et al. (2011) obtained evidence in favor of greater stability in flows for sustainable funds, while Bollen (2007) found that SRI mutual funds are more sensitive to positive returns and less to negative ones. In line with Doven et al. (2017) and El Ghoul and Karoui (2017), we argue that funds with higher ESG scores attract more socially conscious investors, who are less worried about performance, so that flows are less sensitive to past performance. Thus, we estimate the following model to evaluate the effect of sustainability on the flow of funds using the different performance metrics (alpha, Sharpe, raw return), the sustainability score, and their interaction (SustPerf: sustsharpe, sustalpha or sustreturn):
$$\mathit{Flow}_i = \alpha + \beta_1 \mathit{Perf}_i + \beta_2 \mathit{Sustscore}_i + \beta_3 \mathit{SustPerf}_i + \beta_4 \mathit{Age}_i + \beta_5 \mathit{LossDev}_i + \beta_6 \mathit{LogSize}_i + \beta_7 \mathit{ExpRat}_i + \beta_8 \mathit{Sociallyconcious}_i + \sum_j \gamma_j \mathit{Category}_{ij} + \varepsilon_i$$
where SustPerf is the product of Sustscore and Sharpe (sustsharpe), alpha (sustalpha), or net return (sustreturn), depending on the model.
On the other hand, in Model 3, the sustainability score is also significant, so that higher-rated funds received a larger volume of funds than those with a lower score. This shows that sustainability stimulates fund inflows, and more so the higher the degree of sustainability. The sustainability dummy variable (Sociallyconcious) is also significant in all models, which confirms the importance of sustainability in attracting investors interested in funds that are declared sustainable. This can be related both to greater social awareness and to expectations of greater profitability in SRIs. Finally, the negative sign of the interaction variable (sustreturn) shows the lower sensitivity of sustainable funds, supporting the results found by Doven et al. (2017) and El Ghoul and Karoui (2017) using alternative metrics and US funds.
This table reports the coefficients for the regression models. Alpha is Carhart's alpha measure; Sharpe is the yearly risk-adjusted return and Return is the total net return. Sustscore is the level of sustainability of the mutual fund measured by Morningstar. Sociallyconcious is a dummy variable used to analyse socially conscious mutual funds. N is the number of observations and r2 the R-squared fit measure. *Significant at 10%; ** significant at 5% and *** significant at 1%.
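The reading of the interaction term given above can be made explicit with a short derivation. The coefficient labels are assumptions made for this illustration: $\beta_1$ is taken to multiply the performance measure, $\beta_2$ the sustainability score, and $\beta_3$ their product; the remaining controls are abbreviated.

```latex
% Assumed labels: \beta_1 on Perf_i, \beta_2 on Sustscore_i,
% \beta_3 on the interaction SustPerf_i = Sustscore_i \times Perf_i.
\begin{align*}
\mathit{Flow}_i &= \alpha + \beta_1 \mathit{Perf}_i + \beta_2 \mathit{Sustscore}_i
  + \beta_3 \left(\mathit{Sustscore}_i \times \mathit{Perf}_i\right) + \dots + \varepsilon_i \\
\frac{\partial \mathit{Flow}_i}{\partial \mathit{Perf}_i} &= \beta_1 + \beta_3\, \mathit{Sustscore}_i
\end{align*}
```

With $\beta_3 < 0$, the response of flows to performance declines as Sustscore rises, which is the sense in which a negative sustreturn coefficient indicates lower flow-performance sensitivity for more sustainable funds.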

Robustness
We conducted additional robustness tests to check the consistency of our results and to provide complementary analyses. We checked whether the results differ according to fund managers' skills by considering the quantiles of the different performance measures; differences across quantiles would indicate differences in fund managers' ability to generate performance.
Quantile regression allowed us to capture information about the coefficients at different quantiles of the dependent variable given the set of exogenous variables. In addition, the conditional quantile regression developed by Koenker and Bassett (1978) successfully deals with skewed distributions of fund performance. In particular, we adopted the bootstrapping method proposed by Efron (1979) and implemented in the software Stata 12. Given $y_i$ as the different performance metrics used in this paper (alpha, Sharpe, and returns), and $x_i$ as a vector of exogenous variables representing the sustainability score of each mutual fund and other controls, the quantile model can be written as:

$$y_i = x_i' \beta_\theta + u_{\theta i}$$
Assuming that:
$$\mathrm{Quant}_\theta(y_i \mid x_i) = x_i' \beta_\theta$$
Table 11 reports quantile parameter estimates for three different risk-return performance measures. The estimates are consistent across all quantiles: sustainability appears to matter regardless of the level of performance analysed.
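The idea behind the Koenker and Bassett (1978) estimator can be illustrated in its simplest form. The sketch below implements only the asymmetric check loss and minimises it for an intercept-only model, where the minimiser is a sample quantile. The function names and the performance values are hypothetical; the models in this paper include regressors and were estimated in Stata with bootstrapped standard errors, which this sketch does not attempt.

```python
def check_loss(residual, tau):
    """Check (pinball) loss for one residual at quantile tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_check_loss(values, candidate, tau):
    """Sum of check losses when every value is predicted by `candidate`."""
    return sum(check_loss(v - candidate, tau) for v in values)


def best_constant(values, tau):
    """Intercept-only quantile fit.

    A minimiser of the total check loss is always attained at one of
    the observations, so searching over the sample points suffices.
    """
    return min(sorted(values), key=lambda c: total_check_loss(values, c, tau))


# Invented, skewed performance figures (e.g. alphas in percent).
alphas = [-3.0, -1.0, 0.0, 0.5, 1.0, 2.0, 8.0]
for tau in (0.25, 0.50, 0.75):
    print("tau =", tau, "fitted constant =", best_constant(alphas, tau))
```

Because the loss weights positive and negative residuals by tau and 1 - tau respectively, the fitted value moves through the distribution as tau changes, and the single large observation has little influence on the lower and middle quantiles.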
We also estimated the models excluding the expense ratio, because this variable has many missing values and reduces the sample. The results again did not differ from those of the models presented in the previous empirical analyses. Finally, we re-estimated the models for each category and obtained different results depending on the category, concluding that the effect on performance is negative on average but specific to each category.
This table reports the coefficients for the quantile regression models (q25 or lower quartile, q50 or median, and q75 or upper quartile). Sustscore is the level of sustainability of the mutual fund measured by Morningstar. Sociallyconcious is a dummy variable used to analyse socially conscious mutual funds. N is the number of observations. *Significant at 10%; ** significant at 5% and *** significant at 1%.

Conclusion
In Europe, SRI strategies grew by 11.7% from 2014 to 2016 to reach $12.04 trillion (GSIA, 2017). Traditional studies focus their work on mutual funds which declare themselves as funds that support an SRI approach. One important limitation of this approach is that results could be biased, because SRI mutual funds could have different levels of sustainability and differences with conventional funds may not be significant.
Recently, Morningstar launched the Morningstar Sustainability Score to classify mutual funds.
The use of sustainability scores in our work allows us to evaluate the effect of the degree of sustainability on performance, risk, or flows on European equity mutual funds.
Our results show that a large number of funds are not declared sustainable even though their portfolios are comparable to those of sustainable mutual funds. Furthermore, the Sustainability Score is significant in explaining the level of performance for all the metrics analysed (alpha, Sharpe, and net return), and has a negative sign in most models. When a conventional dummy is used to identify socially responsible mutual funds, the results are significant but with the opposite sign, showing that considering the level of sustainability can help to better understand the link between performance and social responsibility. Our results are in accordance with Statman and Glushkov (2016), who concluded that the lack of clearly defined criteria to distinguish SRI mutual funds affected the results. We also obtained results similar to those of El Ghoul and Karoui (2017) for the US mutual fund market. Using the different pillars of ESG scores (environmental, social, and governance), we found a negative link between the dimensions of sustainability and performance, showing that all the dimensions play an important role in explaining performance. Our results are consistent with the idea that investors pay a premium for investing in highly scored mutual funds.
In terms of downside risk, the level of sustainability is negatively and significantly related to the VaR of the fund, supporting the view that higher-scored mutual funds better protect against extreme losses. The opposite is found for the conventional dummy, showing the advantages of employing a quantitative measure of sustainability to evaluate the risk of assets.
This result could mean that SRI mutual fund managers base their decisions on deeper analysis, resulting in a significant reduction in the risk of their investment decisions. Our work shows that sustainability scores can be used by investors worried about extreme losses and not only by values-motivated investors.
Finally, we analyzed the effect of sustainability on flows, confirming the importance of sustainability in attracting investors. The effect of the sustainability dummy variable is significant in all models. Unadjusted returns and Carhart's alphas have a positive influence on investment decisions. The sustainability score is significant for flows, so higher-rated funds received a larger volume of funds. In addition, the negative sign of the interaction variable (the product of sustainability and return) shows the lower sensitivity of sustainable funds to performance, which reflects the different sensitivity to performance of values-motivated investors.
Future research will benefit from the increasing amount of data available for empirical studies based on sustainability criteria. Unfortunately, because Morningstar Sustainability scores are only available from 2016, our sample assumes that the score was constant prior to 2016. Another limitation of our work is possible survivorship bias, but since our sample only covers two years, this bias should be very small.