The Investment Performance of Ethical Equity Funds in Malaysia

This paper investigates the investment performance of Malaysian Islamic equity funds and a matching sample of conventional equity funds relative to their market benchmark. An integrated model is used to simultaneously capture the market timing and selectivity skills of fund managers. Our findings indicate that the Islamic funds do not match the performance of the conventional funds in terms of selectivity skill. However, Islamic funds perform no worse than their conventional counterparts in market timing, although neither group outperforms the market. These findings have crucial implications not only for fund managers’ investment decisions, but also for sensitive shariah-compliant investors and risk-seeking investors in Islamic equity funds when they choose their investment portfolios.


Introduction
According to divine Islamic law, or shariah, investors are permitted and indeed urged to trade, invest, and share risk directly through profit-and-loss sharing (PLS). Even though investment is generally encouraged in Islam, this does not imply that the nature of the business, the operations, and the financial activities of every company fully adhere to the ethical values of shariah. Therefore, to make sound investment decisions from an Islamic perspective, Muslim investors need to comply with shariah. Stocks that are permissible under shariah are designated as shariah-compliant stocks, and they form the investment opportunity set for both individual and institutional Muslim investors (for details, see Rahman 2015; Basov and Bhatti 2016; Azmat et al. 2020).
The demand for shariah-compliant investments has risen rapidly with the growth of Islamic finance (IF). Kuwait Finance House, one of the leading global Islamic financial institutions, estimates that assets of the Islamic finance sector were valued at US$1.6 trillion by the end of 2014 (Yahaya et al. 2014). These have now reached US$2.6 trillion (S&P Global 2020 report; see https://www.spglobal.com/_assets/documents/ratings/research/islamic_finance_2020_screen.pdf), which represents an increase of about US$1 trillion, or an average annual increase of about US$200 billion over the last five years. However, due to the current COVID-19 pandemic, it is expected that the global IF industry will grow at a slower pace, as sukuk (Islamic bonds) volumes shrink and core markets grapple with massive economic slowdowns or even closures during 2020. A growing number of affluent and knowledgeable middle-class Muslims have likewise shown increasing enthusiasm for Islamic investment activities.
... been refined by Bhattacharya and Pfleiderer (1983), and an empirical econometric methodology was enhanced by Lee and Rahman (1990). Rahman (1990, 1991) applied this refined model to investigate the investment performance of US equity mutual funds. Coggin et al. (1993) utilized it to analyze the performance of US equity pension funds and found evidence of stock selection and/or market timing expertise in a small number of funds.
Fund performance has been the subject of intensive investigation in the literature. For example, Gjergji et al. (2018) examined the impact of trading desk efficiency on the portfolio performance and investment behavior of mutual funds. Busse et al. (2019) studied the influence of trading regulation on performance efficiency. Sitikantha and Terence (2018) investigated the impact of portfolio disclosure frequency on fund performance, whereas Don (2019) and Juan et al. (2019) assessed the influence of social responsibility or ethical level on performance appraisal. Jon and Timothy (2019) studied the relationship between fund performance and portfolio concentration, while Huamao et al. (2019) looked at decentralized portfolios in a dynamic size-induced fund flow. Papadamou et al. (2017) used style and performance analyses to investigate how mutual funds performed in Japan before and after the 2008 global financial crisis (GFC). Their empirical findings reveal no positive correlation between active management in a monetary easing environment and mutual fund performance. This finding is consistent with Papadamou and Siriopoulos (2004), who asserted that active management cannot beat the Eurostoxx index over a specific period, although the choice of benchmark portfolio can affect the magnitude of alpha.
Regarding Islamic fund performance, Arif et al. (2019) examined the performance of Islamic and conventional mutual funds in Pakistan from 2010 to 2017 using Sharpe's ratio, Treynor's ratio, and Jensen's alpha, along with other analytical methodologies. Although the different statistics do not all agree, their Treynor and Sharpe ratios reveal that Islamic funds do better than conventional ones. Muhammad and Dawood (2019) examined the importance for fund performance of asset allocation based on smart beta strategies (specific equity attributes) rather than securities selection. However, they stated that these strategies need further verification. Robiyanto et al. (2019) used the Sharpe, Treynor, and Jensen measures to assess the performance of 21 Indonesian mutual funds from 2012 to 2017, and their findings are consistent with those of Arif et al. (2019).
To the best of our knowledge, the investment performance of Islamic equity funds in Malaysia has not yet been analyzed using such a refined model. This article fills the gap in the literature by investigating the investment performance of selected Islamic equity funds in Malaysia using the Bhattacharya-Pfleiderer model. The paper is organized as follows: Section 2 discusses the advantages of the Bhattacharya-Pfleiderer model over other models and justifies its selection for this investigation. Section 3 describes the data and econometric approach, and Section 4 explains the empirical results. Section 5 concludes with a summary of the main findings.

A Model for Market Timing and Selectivity Measures
This study employs a methodology similar to that proposed by Rahman et al. (2017). Unconditional risk-adjusted performance measures assume a stationary risk level in a managed portfolio and ignore the manager's market timing skill, that is, the ability to move into and out of segments of the market to minimize overall portfolio risk.
Portfolio turnover in mutual funds also changes portfolio risk. According to Barker (2014), an average turnover rate of about 85% (that is, the proportion of a fund's holdings replaced in a year) indicates that funds sell most of their holdings every year. This violates the risk stationarity assumption behind Sharpe's and Treynor's ratios. If fund managers implement a market-timing strategy, then Jensen's alpha becomes biased. Jensen (1968) used annual net asset and dividend data of 115 open-ended mutual funds from 1955 to 1964 to examine the forecasting ability of fund managers, and found that, on average, these funds could not predict stock prices well enough to outperform a buy-and-hold policy. Jensen also showed that the performance estimate (Jensen's alpha) will be upwardly biased and the systematic risk estimate (beta) will be downwardly biased in the presence of market timing ability. Fama (1972) and Jensen (1972) suggested finer methods to evaluate investment performance by separating returns due to "selectivity ability" on individual stocks from those due to "timing" (predictions of market price trends). Treynor and Black (1973) emphasized that it is necessary to distinguish between systematic risk and insurable risk in balancing portfolios and found that portfolio managers can effectively isolate returns coming from security analysis from those coming from market timing. Ferson and Schadt (1996) stated that measures that do not control for market timing behavior are often biased. Grant (1978) explained how the results of empirical tests that focus only on microforecasting or selectivity will be biased by managers' market timing actions. Admati and Ross (1985) argued that conventional risk-return measures in the capital asset pricing model (CAPM) will fail to detect fund managers' performance because of information asymmetry and the changing risk level. They developed a rational expectations equilibrium CAPM to make valid performance evaluations. Lee and Rahman (1994) found that the true and relevant risk carried by the manager changes over time, even though other parameters are stationary. Thus, it is appropriate to evaluate fund managers' performance through both selection ability and timing skill, which implies the need to model selectivity and timing simultaneously.
The same fund managers might simultaneously manage Islamic and conventional funds, and while managing Islamic funds, their performance is primarily driven by the investment objectives and constraints of the respective funds. However, although selectivity and market timing skills have been well examined for traditional equity funds in the literature, our understanding of how these skills differ between conventional and Islamic funds is limited. This paper, therefore, examines the selectivity and market timing skills of Islamic equity fund managers simultaneously, in order to avoid a model misspecification that would lead to biased estimates, and assesses whether these managers outperform conventional fund managers in market timing.
Merton (1981) and Henriksson and Merton (1981) developed a model to examine the theoretical structure of the pattern of returns from market timing, derived an equilibrium theory of value for market timing forecasting skills, and then tested this theory using parametric and nonparametric methods. In this model, fund managers can forecast which of stocks and the risk-free asset will earn the higher return, and adjust the relative weights of the assets in their portfolios accordingly, but cannot predict the size of the superior performance. Henriksson and Merton (1981) also showed that the returns on the portfolio under the model are the same as those that would be created by a strategy of investing in both stocks and bonds and acquiring free put options on the market portfolio. The Henriksson-Merton model assumes that managers have only binary information (positive/negative) on the excess return. Fund managers in the Merton (1981) and Henriksson-Merton models therefore cannot forecast the magnitude of superior returns, whereas managers in Jensen's (1972) model can. Dybvig and Ross (1985) confirmed that the weakness of the Henriksson-Merton model is that there is no test of whether information is being used properly.
Chang and Lewellen (1984) applied a parametric method to test for market timing and security selection skills in mutual fund managers and could not find evidence of these skills. They concluded that mutual funds cannot outperform a passive investment strategy. They also questioned the specification used in Merton (1981), in other words the validity of using CAPM for portfolio performance evaluation, due to the persistence of a negative correlation between alpha and beta estimates.
Treynor and Mazuy (1966) observed that funds would hold more high-beta stocks to increase their portfolio returns if the market return is forecasted to increase, and hold low-beta stocks to reduce capital losses if the market is expected to decline. This implies that the portfolio return will be a nonlinear function of the return on the market portfolio. The authors captured this characteristic by adding a quadratic excess market return term to the standard CAPM regression, as shown below:

$$R_{pt} = \alpha_p + \beta_p R_{mt} + \mu_{pt} \quad (1)$$

$$R_{pt} = \alpha_p + \beta_p R_{mt} + \gamma R_{mt}^2 + \varepsilon_{pt} \quad (2)$$

where $R_{pt}$ is the excess return of fund $p$ at time $t$, $R_{mt}$ is the excess market return at time $t$, $\alpha_p$ measures security selection skill, $\beta_p$ measures the sensitivity of the fund excess return to the market excess return, $\gamma$ measures the fund manager's market timing skill, and $\mu_{pt}$ and $\varepsilon_{pt}$ are random error terms with zero expected value. In Equation (2), the fund excess return is a convex function of the market excess return when $\gamma$ is positive. Empirical results from the returns of 57 open-end mutual funds show that market-timing skill is significant at the 5% level in only one fund.
Jensen (1972) developed a model similar to that of Treynor and Mazuy (1966) in order to detect the selectivity and timing skills of fund managers. In this model, the fund manager forecasts the market return, and the forecasted and actual market returns are assumed to have a joint normal distribution.
Jensen (1972) showed that the market timing skill can be measured by the correlation between the forecasted and actual market return. However, Bhattacharya and Pfleiderer (1983) corrected an error in Jensen's model and showed that selectivity and timing skills can be detected via a regression technique (see Lee and Rahman 1990; Coggin et al. 1993). They specified a relationship in observed variables that is similar to the Treynor-Mazuy model, in which $\alpha_p$ is the proxy for the fund manager's selectivity skill:

$$R_{pt} = \alpha_p + \theta E(R_{mt})(1 - \Psi) R_{mt} + \theta \Psi (R_{mt})^2 + \theta \Psi \varepsilon_t R_{mt} + \mu_{pt} \quad (3)$$

where, in Equation (3), $\theta$ is the response of the fund manager to information, $\Psi$ is the coefficient of determination ($R^2$) between forecasted and excess market returns, $\varepsilon_t$ is the forecasting error, and $E(R_{mt})$ is the expected excess market return. Note that $\alpha_p$ is proved to be an accurate measure of security selection ability in Bhattacharya and Pfleiderer (1983), where $\alpha_p = 0$ implies that a manager does not have security-specific information. In Equation (3), managers who have security-specific information may also have information that permits them to time the market.
Equation (4) below is the error term of Equation (3), which provides the information needed to detect the manager's timing skill:

$$\omega_t = \theta \Psi \varepsilon_t R_{mt} + \mu_{pt} \quad (4)$$

The first component in Equation (4) contains the information needed to quantify timing ability, and can be extracted by regressing $(\omega_t)^2$ on $(R_{mt})^2$ as shown in Equation (5):

$$(\omega_t)^2 = \theta^2 \Psi^2 \sigma_\varepsilon^2 (R_{mt})^2 + \zeta_t \quad (5)$$

where

$$\zeta_t = \theta^2 \Psi^2 (R_{mt})^2 (\varepsilon_t^2 - \sigma_\varepsilon^2) + (\mu_{pt})^2 + 2 \theta \Psi R_{mt} \varepsilon_t \mu_{pt} \quad (6)$$

Given the estimates from Equations (3) and (5), the variance of the excess market return, $\sigma_\pi^2$, allows us to estimate $\Psi = \sigma_\pi^2 / (\sigma_\pi^2 + \sigma_\varepsilon^2) = \rho^2$, where $\rho$ is the correlation coefficient between the forecast and the excess market return, and is a proxy for the quality of the manager's timing skill. This correlation coefficient, $\rho$, is analogous to the Pearson product-moment correlation coefficient. The significance of the timing skill is examined using the following t-test: $t = \rho \sqrt{(n - 2)/(1 - \rho^2)}$. This test statistic approximately follows a t distribution with $(n - 2)$ degrees of freedom, where $n$ is the number of observations from which $\rho$ is calculated (for details, see Harnett and Soni 1991, pp. 503-4). The model is an improvement on the Treynor-Mazuy model and analyzes the error term to detect a manager's macro-forecasting or timing skill.
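As an illustration of the two regression steps described in this section, the following is a minimal sketch in plain Python, not the estimation code used in this study. It fits the quadratic regression of fund excess returns on market excess returns by ordinary least squares, then regresses the squared residuals on the squared market excess return. The function names, the synthetic data, and the choice to run the second regression through the origin are assumptions made for the example; no correction for heteroskedasticity or autocorrelation is applied here.

```python
import random

def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(columns, y):
    """OLS coefficients of y on the given regressor columns (normal equations)."""
    k = len(columns)
    xtx = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(u * v for u, v in zip(columns[i], y)) for i in range(k)]
    return solve_linear(xtx, xty)

def quadratic_market_regression(fund_excess, market_excess):
    """Fit R_pt = eta0 + eta1*R_mt + eta2*R_mt^2 + omega_t.

    Returns the three coefficients and the residuals omega_t.
    eta0 plays the role of the selectivity measure and eta2 the timing coefficient.
    """
    ones = [1.0] * len(market_excess)
    squared = [r * r for r in market_excess]
    eta0, eta1, eta2 = ols([ones, market_excess, squared], fund_excess)
    residuals = [y - (eta0 + eta1 * r + eta2 * r * r)
                 for y, r in zip(fund_excess, market_excess)]
    return (eta0, eta1, eta2), residuals

def squared_residual_slope(residuals, market_excess):
    """Slope from regressing omega_t^2 on R_mt^2 through the origin (assumed form)."""
    x = [r * r for r in market_excess]
    y = [w * w for w in residuals]
    return sum(u * v for u, v in zip(x, y)) / sum(u * u for u in x)

# Synthetic monthly excess returns, for illustration only (not the study's data).
rng = random.Random(0)
market = [rng.gauss(0.005, 0.05) for _ in range(232)]
fund = [0.001 + 0.8 * r + 0.5 * r * r + rng.gauss(0.0, 0.01) for r in market]

(eta0, eta1, eta2), omega = quadratic_market_regression(fund, market)
slope = squared_residual_slope(omega, market)
print("selectivity:", eta0, "beta term:", eta1, "timing coefficient:", eta2)
print("second-stage slope:", slope)
```

The first-stage coefficient on the squared market return and the second-stage slope are the two inputs needed to back out the forecast-error variance and, from it, the timing quality measure.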

Data and Methodology
The sample comprises a balanced panel of monthly returns for thirty Malaysian Islamic equity funds from January 1990 to April 2009, collected from the Morningstar mutual fund database. The monthly returns are net of all expenses, including management expenses. However, these returns are calculated before deducting front- and back-end load fees, so that they appropriately reflect the fund managers' investment performance. We do not control for load versus no-load funds, since load fees are managed by fund administrations.
A matched sample of thirty Malaysian conventional equity funds was generated from the Morningstar mutual fund database in order to compare the performance of Islamic equity funds with that of their conventional counterparts. Each Islamic fund was matched with a conventional fund based on asset size and investment objective. The monthly return on the FTSE Bursa Malaysia Kuala Lumpur Composite Index (KLCI) was used as the market return (see footnote 4). Monthly observations of the 12-month Malaysian T-bill rate were used as a proxy for the risk-free rate. To obtain robust results, we examined the investment performance of the fund managers using both the Treynor-Mazuy and the Bhattacharya-Pfleiderer models.
A disadvantage of the Bhattacharya-Pfleiderer model is that it is unable to detect negative or inferior market timing (Hunter and Coggin 1993). This paper resolves this problem by inspecting the sign of the coefficient of the squared excess market return in Equation (3) following Coggin et al. (1993). In the Treynor-Mazuy model, the sign of this coefficient implies the nature of the timing skill. A negative sign implies a manager's poor timing skill (as measured by ρ). This modification makes the model more realistic. Jagannathan and Korajczyk (1986) also made a similar adjustment to the Bhattacharya-Pfleiderer model.
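The sign adjustment and the t-test for timing quality can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the function name and arguments are hypothetical, the variance of the excess market return is taken as a given input, and the forecast-error variance is backed out by dividing the second-stage slope by the square of the coefficient on the squared excess market return.

```python
import math

def timing_quality(timing_coef, second_stage_slope, market_variance, n_obs):
    """Return (rho, t_stat) for a fund's market timing quality.

    timing_coef: coefficient on the squared excess market return (theta * psi).
    second_stage_slope: slope from regressing squared residuals on the squared
        excess market return (theta^2 * psi^2 * sigma_eps^2); must be positive.
    market_variance: variance of the excess market return (sigma_pi^2), assumed given.
    n_obs: number of observations used to compute rho.
    """
    if timing_coef == 0 or second_stage_slope <= 0:
        raise ValueError("timing quality is not identified for these inputs")
    # Forecast-error variance implied by the two regression estimates.
    forecast_error_variance = second_stage_slope / timing_coef ** 2
    psi = market_variance / (market_variance + forecast_error_variance)
    # Sign adjustment: a negative timing coefficient is read as poor timing.
    rho = math.copysign(math.sqrt(psi), timing_coef)
    t_stat = rho * math.sqrt((n_obs - 2) / (1.0 - rho ** 2))
    return rho, t_stat

# Hypothetical inputs, for illustration only.
rho, t_stat = timing_quality(timing_coef=0.5, second_stage_slope=0.001875,
                             market_variance=0.0025, n_obs=102)
print("rho:", rho, "t:", t_stat)
```

Attaching the sign of the timing coefficient to the correlation lets one statistic distinguish superior from inferior timing, which the unsigned measure cannot do.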

Empirical Results
Table 1 below presents descriptive statistics of monthly returns for Islamic funds and their matching conventional funds. The mean monthly return across all Islamic funds is 0.33 percent, which is lower than the mean monthly return of 0.55 percent across all conventional funds. Average returns of Islamic funds vary from −0.5 percent to 2.13 percent, while average returns of conventional funds vary from 0.14 percent to 1.04 percent.
Next, the average risk of Islamic funds (0.2604) is lower than that of conventional funds (0.2922). Risk ranges from 0.0981 to 1.1647 for Islamic funds and from 0.1339 to 0.5125 for conventional funds. The mean betas of Islamic and conventional funds are 0.7220 and 0.7717, respectively. These betas range from 0.3387 to 1.1012 for Islamic funds and from 0.4748 to 1.1914 for conventional funds. In summary, it appears that the sample Islamic funds have lower average returns, lower risk, and lower betas compared to their conventional counterparts. In Table 1, betas and variances of funds are calculated assuming stationarity of risk measures. As discussed earlier in Section 2, the systematic and total risks of a mutual fund change over time due to portfolio rebalancing in search of mispriced securities and/or market timing efforts. Therefore, it is not meaningful to compare and analyze the Islamic and conventional funds based merely on static statistics without adjusting for a dynamic risk measure.
Footnote 4: The Malaysian KLCI is an appropriate market benchmark for overall national market performance. Using an independent index such as the US S&P 500 can lead to other problems, such as no or low correlation between Malaysian fund returns and the index return. Furthermore, the Malaysian market is very small compared with that covered by the S&P 500.
Next, we examined the performance of Islamic funds and their conventional counterparts based on time-varying, nonstationary risk-adjusted measures. We checked each regression in this study carefully and corrected for heteroskedasticity and autocorrelation problems in every regression reported in these tables. A summary of the empirical findings from applying the Treynor-Mazuy and the Bhattacharya-Pfleiderer models to Islamic and conventional fund returns is presented in Table 2. Evidence of selectivity and market timing at the individual fund level can be found in both Islamic and conventional funds. There are some noticeable differences between the Treynor-Mazuy and Bhattacharya-Pfleiderer models in detecting the market timing skill of managers of Islamic and conventional funds. Under the Treynor-Mazuy model, nineteen out of thirty Islamic funds have a positive selectivity measure, only one of which is statistically significant at the 0.05 level, and twenty-nine out of thirty conventional funds have a positive selectivity measure, only four of which are statistically significant at the 0.05 level. Regarding the timing measure from the Treynor-Mazuy model, nineteen Islamic and twenty-nine conventional funds have a positive value, of which four Islamic and four conventional funds are statistically significant at the 0.05 level. Under the Bhattacharya-Pfleiderer model, nineteen Islamic funds have a positive selectivity measure, only one of which is statistically significant at the 0.05 level, and twenty-nine conventional funds have positive selectivity measures, only four of which are statistically significant at the 0.05 level.
As can be seen from Table 2 for the Bhattacharya-Pfleiderer model, eleven Islamic funds and one conventional fund have a negative selectivity measure but none of these measures are statistically significant at the 0.05 level. Additionally, positive timing measures are found in nineteen Islamic and nineteen conventional funds, but only one Islamic and two conventional funds have significant estimates at the 0.05 level.
Ten Islamic funds have both positive selectivity and timing measures using the Treynor-Mazuy model, and none of those funds have statistically significant selectivity and timing measures. Nineteen conventional funds have both positive selectivity and timing measures using the Treynor-Mazuy model, and none of those funds have statistically significant selectivity and timing measures. Eleven Islamic funds have both positive selectivity and timing measures using the Bhattacharya-Pfleiderer model, and none of those funds have statistically significant selectivity and timing measures. Eighteen conventional funds have both positive selectivity and timing measures using the Bhattacharya-Pfleiderer model, and none of these have statistically significant selectivity and timing measures. In short, Table 2 shows that no Islamic or conventional fund has both statistically significant selectivity skill and statistically significant timing skill in either model. This suggests that the funds reveal some degree of specialization in one or the other forecasting skill, as noted by Fama (1972).
These outcomes show that neither Islamic nor conventional funds outperform the market on a risk-adjusted basis, as neither group has many funds showing statistically significant superior performance. Our findings are not subject to the econometric and methodological problems and specification errors discussed previously.
These findings are fully consistent with the efficient market hypothesis that no one can consistently generate superior risk-adjusted returns. Our results are also consistent with those of earlier studies in portfolio investment performance (Jensen 1968; Kon 1983; Chang and Lewellen 1984; Henriksson 1984; Cumby and Glen 1990; Lee and Rahman 1990; Connor and Korajczyk 1991; Coggin et al. 1993; Elton et al. 1993; Grinblatt and Titman 1994; Malkiel 1995; Carhart 1997; Daniel et al. 1997; Pollet and Wilson 2008; Benos and Jochec 2011). Studies on mutual fund performance in Australia (Robson 1986; Hallahan and Faff 1999; Sawicki and Ong 2000) and the U.K. (Firth 1977; Blake and Timmermann 1998) detected similar inferior performance. However, our results are not fully in harmony with the mixed findings of prior studies on Malaysian Islamic equity funds. Several analyses of Malaysian Islamic funds concluded that Islamic funds generally perform poorly compared to the market (Hayat 2006; Ahmed 2007; Abdullah et al. 2007; Taib and Isa 2007; Hayat and Kraeussl 2011). Other studies (Hoepner et al. 2011) concluded that Islamic funds outperform the market.
We next examined the equality of risk-adjusted performance measures between the two kinds of funds using a parametric matched-pairs t-test and a nonparametric Wilcoxon matched-pairs signed-rank test; the results are shown in Table 3. The table shows that the differences in selectivity measures in both models are significant at the 0.05 level in the matched-pairs t-test and at the 0.01 level in the Wilcoxon matched-pairs signed-rank test, implying a significant difference in selectivity skill between managers of Islamic funds and managers of conventional funds. However, both the matched-pairs t-test and the Wilcoxon matched-pairs signed-rank test fail to reject, at the 0.05 level, the null hypothesis of no significant difference between Islamic and conventional funds in the timing measure of either the Treynor-Mazuy model or the Bhattacharya-Pfleiderer model.
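The two matched-pairs statistics named above can be computed as in the following minimal sketch, which is an illustration and not the code behind Table 3. The function names and the sample values are hypothetical. Zero differences are dropped and tied absolute differences receive average ranks in the Wilcoxon statistic, which is one common convention; p-values are not computed here and would come from the t distribution and the signed-rank distribution, respectively.

```python
import math

def paired_t_statistic(x, y):
    """Matched-pairs t statistic and degrees of freedom for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(variance / n), n - 1

def wilcoxon_signed_rank(x, y):
    """Sums of ranks of positive and negative paired differences (W+, W-)."""
    diffs = sorted((a - b for a, b in zip(x, y) if a != b), key=abs)
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(diffs):
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < len(diffs) and abs(diffs[j + 1]) == abs(diffs[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus

# Hypothetical selectivity estimates for five matched fund pairs (illustration only).
islamic = [-0.002, 0.001, 0.000, -0.004, 0.003]
conventional = [0.004, 0.002, 0.005, 0.001, 0.002]

t_stat, dof = paired_t_statistic(islamic, conventional)
w_plus, w_minus = wilcoxon_signed_rank(islamic, conventional)
print("t:", t_stat, "df:", dof, "W+:", w_plus, "W-:", w_minus)
```

Reporting both statistics, as the paper does, guards against the t-test's normality assumption being violated in a sample of only thirty pairs.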
Since the Bhattacharya-Pfleiderer model is econometrically and methodologically superior to the Treynor-Mazuy model and is robust in measuring the investment performance of managed portfolios, we rely on its results, which indicate that Islamic funds match the performance of conventional funds in the timing measure. This finding is consistent with those of Hamilton et al. (1993), Mallin et al. (1995), Bauer et al. (2005, 2007), and Rahman et al. (2017).
However, Islamic funds perform worse than their conventional counterparts in the selectivity measure. This finding is consistent with Hayat (2006) and Hayat and Kraeussl (2011) who discovered that Malaysian Islamic equity funds perform worse than their conventional counterparts. One rational explanation for the apparent disadvantage of Islamic equity funds in comparison with conventional equity funds is that the range of available stocks for efficient diversification and risk-reduction is limited under the shariah screening process.
A possible way to overcome this problem is international portfolio diversification, because cross-border stocks are more likely to be segmented or less positively correlated than stocks within the same country (Solnik 1995). However, another problem could arise with international stocks: some of them may not pass an Islamic fund's screening criteria, in which case the Islamic fund encounters a "lost opportunity." Thus, in comparison with conventional funds, Islamic funds have a smaller set of assets available for portfolio diversification, which might negatively affect investors' investment efficiency and their risk reduction strategies.
The finding in this study that Islamic funds fail to match the performance of conventional funds in stock selection on a risk-adjusted basis is consistent with the conventional wisdom, which holds that Islamic funds have limited diversification opportunities because investable stocks must satisfy shariah criteria and designated Islamic values. Supporters of Islamic investment may argue that restricted investment opportunities due to shariah screening criteria would challenge Islamic fund managers to be more efficient and disciplined in selecting "winners" and leaving the "losers" behind, as well as in identifying potentially profitable companies. Unfortunately, we do not find support for this view in our empirical results for Malaysian Islamic equity funds. In Table 3 below, we present the results of the parametric matched-pairs t-test and the nonparametric Wilcoxon matched-pairs signed-rank test between Islamic and conventional equity funds.
The empirical findings of this study have major implications for investors in Islamic funds. An Islamic fund may have two groups of clients: first, "devoted" Islamic investors who want to keep Islamic values at the cost of risk-adjusted return; and second, "profit-maximizing" Islamic investors who are reluctant to accept lower investment returns than those of conventional funds in a similar risk class. Our findings could be good news for devoted Islamic investors and bad news for profit-maximizing Islamic investors. Devoted Islamic investors have to sacrifice returns for investments in Islamic funds, but profit-maximizing Islamic investors might not get what they expect.
It is worth noting here that the issue of survivorship bias is well-known in studies of investment performance. Our sample suffers from survivorship bias because the sample excludes funds that disappeared from the database due to merger, acquisition, or liquidation. However, this bias exerts an impact on Islamic as well as conventional funds in our sample as it might overstate the estimated coefficients and performance of all funds on average. Yet, it is not likely to significantly distort the matched-pair analysis. Although we do not know the true extent of this bias in our empirical analysis, the results in Grinblatt and Titman (1989) and Brown and Goetzmann (1995) suggest that it may not be large; it is only about 0.5 percent per year.

Concluding Remarks
This paper examines the investment performance of a sample of Malaysian Islamic equity funds and compares their performance to that of matched-pair conventional equity funds selected based on fund size and investment objective. The Bhattacharya-Pfleiderer model employed has been found to be methodologically and econometrically robust in other empirical research on investment performance. The empirical findings presented in this study indicate that Islamic funds do not match the performance of their conventional counterparts in selectivity or stock-picking skill, a shortfall we attribute to the lost investment opportunity associated with the shariah screening and monitoring process. The failure of the Islamic funds to match the performance of the conventional funds in stock picking suggests that Islamic investors bear a financial cost for adhering to their Islamic principles. However, this study is confined to evaluating investment performance and does not assess the impact of shariah screening on investors' expenses.
Moreover, Islamic funds perform no worse than their conventional counterparts in the market timing measure, although Islamic and conventional funds as a group do not outperform the market, which is consistent with literature on mutual fund performance (see Rahman et al. 2017). These findings have crucial implications not only for fund managers' investment decisions, but also for sensitive shariah-compliant investors and risk-seeking investors of Islamic equity funds in their investment portfolio preference. However, due to the potential influence of the GFC, further research based on updated data is necessary to make a robust conclusion.
It appears that the matching conventional funds have slightly higher average return, variance, and beta than the Islamic funds. This study also provides some evidence of superior security selection and/or market timing skills among a very small number of Islamic and conventional funds although they do not outperform the market. These findings have crucial implications not only for fund managers' decision-making, but also for sensitive shariah-compliant investors, and risk-seeking investors or profit-maximizing investors of Islamic equity funds in their investment preferences.