Clustering Stock Performance Considering Investor Preferences Using a Fuzzy Inference System

Abstract: The fact that many stocks are traded in the marketplace makes the process of choosing the right stocks for investment crucial and challenging. In the literature on stock selection, cluster analysis-based methods have usually been used to group stocks and to determine the best stock for investment. Many established cluster analysis-based methods cluster the stocks under consideration using the average of the variables, where stocks with similar scores are concluded to have the same performance. Nevertheless, the performance results obtained do not reflect the actual performance of the stocks. Depending only on the average score of each variable is inefficient, as market situations usually involve uncertain extreme values. Moreover, when grouping stock performance, the established clustering methods assume that investors' selection preferences are single and unclear, when, in reality, investors' selection preferences vary; some investors are pessimistic, while others may be more optimistic. Due to this issue, this paper presents a novel fuzzy clustering method using a fuzzy inference system to flexibly assess the consistent evaluations given to stock performance that differentiate between pessimistic and optimistic investors that are symmetrical in nature. All variables considered in this study were defined in terms of linguistic inputs, where the consensus among them was aggregated using rule bases. These rule bases provide assistance in obtaining the linguistic output, which is the actual performance of the stock. Next, each stock under consideration was ranked using the proposed novel stock selection strategy based on investors' confidence levels and preferences. The proposed method was then applied to a case study of 30 Syariah stocks listed on the Malaysian stock exchange, where the results obtained were empirically validated with established cluster analysis-based methods.


Introduction
Stocks traded on the financial market are often regarded as unpredictable and unstable. This is due to the uncertain fluctuations of daily stock prices, which lead to hesitance in selecting the right stocks to invest in [1-3]. Investors' doubtful selection preferences, which stem from the difficulty of identifying a well-balanced interaction between risks and returns, also contribute to indefinite stock selection [1]. In making investment decisions regarding stock selection, investors usually aim to select stocks when both the risks and the returns are consistent, such that the prior is [...] is to consider their effects on fundraising by non-professional investors [19]. In [12], the authors evaluated the performance of 10 sectorial stock indices individually, using the Sharpe, Treynor, Jensen alpha, adjusted Sharpe, adjusted Jensen, and Sortino indices. Unfortunately, analyzing stocks one at a time is time consuming, and thus this approach is unsuitable for large data sets.
Apart from the variables that are related to stocks, investors' preferences toward risk-taking have also been considered one of the most important factors for evaluating stock performance. Shams and Rezvani [20] investigated investors' risk aversion and risk taking by ranking the performance of investment companies using three loss aversion indices and comparing the results against the Treynor index. The results show that the loss aversion behavior of investment companies is influenced by the outcome of previous performances. In [21], multiple hybrid methods were developed by combining SOM and k-means to cluster stocks, the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) to rank stocks, and the genetic algorithm to establish different classes of investors with respect to their risk-taking levels.
The established clustering methods in the literature consider several options to ensure that the stocks preferred by investors are the best to invest in. Among the investors' preferences studied are stocks that have high return rates, low standard deviations, and high Treynor index values [3,4]; low standard deviations, moderate return rates, moderate turnover ratios, and moderate Sortino index values [7]; or high return rates, high Sharpe index values, high appraisal ratios, high Sortino index values, low standard deviations, and low downside risks [8]. However, the investors' preferences considered by these methods are ambiguous, since such evaluation focuses on only one unknown investor's selection preference when clustering stock performance. These evaluations are inconsistent with investors' genuine preferences, which can be either pessimistic or optimistic in nature; thus, clustering methods are unable to track the true performance of stocks [2,22]. The inefficiencies of the established clustering methods motivate this study.
As indicated above, the established clustering methods neglect the importance of diverse investors' preferences when selecting stocks, and thus are inefficient in accurately clustering stocks. This study extends the works of [4,6] by proposing a novel fuzzy clustering method that can distinctively express investors' vague preferences, which established clustering methods cannot capture. Furthermore, investors' diverse preferences, which can be distinguished in various forms, such as pessimistic or optimistic, enhance and complement fuzzy inference systems for developing specific stock selection strategies for different types of investors. A fuzzy inference system was utilized in this study, as it possesses great capability for considering vague decision-makers' preferences, as well as the uncertainty in the decision-making environment [23,24]. In this proposed method, the variables considered were defined as the linguistic inputs, while stock performance was defined as the linguistic output. All defined linguistic inputs were then aggregated using fuzzy rule bases that were developed in accordance with established investors' preferences on stock clustering. In this case, rule base development is important to achieve rational interaction between the variables and the performances of stocks that are defined linguistically based on investors' preferences [23,24]. As for the output, the proposed method produces two distinct views of investors' preferences, pessimistic and optimistic, with each view consisting of multiple levels of investors' preferences differentiated based on confidence levels and the frequency of stock performance. The novel differentiation process in this study is the first of its kind to be developed with the objective of assisting investors in selecting the best stock to invest in, given their preferences. To assess its efficiency, the results obtained from the analysis of this proposed method were compared against the established clustering methods.
The rest of the paper is structured as follows. Section 2 explains the development of the novel proposed fuzzy clustering method. Section 3 presents a case study on clustering 30 Syariah-compliant stocks in Malaysia for the year 2011, and then validation of the results is provided in Section 4. The discussion and conclusion are presented in Sections 5 and 6, respectively.

Research Formulation
As mentioned in the introduction section, previous works on clustering were unable to handle outliers, produced inconsistent numbers of clusters, and neglected investors' preferences. Taking into consideration the limitations of previous studies, this study presents a novel fuzzy clustering method that is capable of distinctively expressing vague investors' preferences. Furthermore, investors' diverse preferences, which can be distinguished in various forms, such as pessimistic or optimistic, enhance and complement fuzzy inference systems for developing specific stock selection strategies for different types of investors.
The development of the novel proposed fuzzy clustering method using a fuzzy inference system involved four steps. The first step was data collection, the identification of the inputs and outputs, and normalization. In this step, the inputs were variables related to the stocks (namely, return rates, standard deviations, and Treynor index values), while stock performance served as the output. All inputs and outputs were then normalized to place the data on a common scale. In the second step, the results obtained from the normalization process were transformed into triangular fuzzy numbers; this process is known as fuzzification. All normalized inputs and outputs were defined as linguistic terms described by triangular fuzzy numbers. Step 3 comprised the fuzzy rule base, the fuzzy inference system, and defuzzification. Fuzzy rule bases were developed based on the results of [3,4,7,8] and characterized by IF-THEN rules. These rule bases were then aggregated in the fuzzy inference system, and the products were converted into crisp values that represent stock performance. This conversion process is known as defuzzification. The defuzzification process addresses the limitations of outliers and inconsistent numbers of clusters. In step 4, the results obtained from defuzzification were projected according to confidence levels, where the confidence levels represent the actual levels of investors' preferences. This step addresses the limitation of neglecting investors' preferences.
To distinguish each stock based on its performance and on investors' confidence levels, this study presents a unique stock selection strategy, whereby the best stocks are ranked based on investors' preference priority. A flowchart of the development of the novel proposed fuzzy clustering method is given in Figure 1. The steps involved in the development of the method are described below.

Step 1: Data Collection, Input and Output Identification, and Normalization
This study started with data collection, where information related to the input and output variables was identified. The inputs were return rates, standard deviations, and Treynor index values, while the output was the stock performance. Based on [4,12,25,26], a definition for each input variable is given, as follows.
Definition 1. Return rates, R_t: The return rate, R_t, is the return gained from investment. A high value of R_t indicates a high profit gain, and thus positive stock performance, which is a good sign for investors. Following Chen and Huang [4], R_t is defined based on the concept of net asset value (NAV) by Equation (1), where R_t is the return rate for stock t, NAV_t is the net asset value for the current transaction, and NAV_{t-1} is the net asset value of the previous transaction.

Definition 2. Standard Deviation, S_t: The standard deviation, S_t, measures the volatility of returns, which denotes the investment risk level [4,25]. It can be calculated using Equation (2), where R_ti is the return rate of stock t on the ith day, and R̄_t is the average return rate over a period of n days.

Definition 3. Treynor Index, T_t: The Treynor index, T_t, is a measure of the excess return earned per unit of systematic risk [4]. The Treynor index was chosen in this study as it examines the stock portfolio against the market as a whole and is highly sensitive to market risk [12,26]. A high value of T_t denotes a high return per unit of market risk [4]. The Treynor index is given by Equation (3), where β is the systematic risk or the market risk, and R_rf is the daily average risk-free rate for a week.
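The three input definitions above can be written out as follows. This is a sketch under stated assumptions rather than the paper's exact typesetting: the return rate is taken as the simple NAV return, the standard deviation is written in its sample form with an n - 1 divisor (the divisor is an assumption), and the Treynor index is taken as the mean return in excess of the risk-free rate per unit of β. The bars denoting averages are likewise assumed.

```latex
\begin{align}
R_t &= \frac{NAV_t - NAV_{t-1}}{NAV_{t-1}} \tag{1} \\
S_t &= \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(R_{ti} - \bar{R}_t\right)^{2}} \tag{2} \\
T_t &= \frac{\bar{R}_t - \bar{R}_{rf}}{\beta} \tag{3}
\end{align}
```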
As mentioned earlier, step 1 of the novel proposed fuzzy clustering method involved the normalization process. In this case, all of the values obtained from the inputs were normalized using the following definition.
Definition 4. Normalization, i′: Let i′ be the normalized value of input variable i [9], with i = R_t, S_t, and T_t. It is given by Equation (4), where Min i_j and Max i_j are the minimum and maximum values of i over j = 1, 2, 3, ..., n, respectively.
With respect to all inputs defined above, all variables were normalized using Equation (4), as shown in Equations (5)-(7), where R′_t, S′_t, and T′_t are the normalized values of the return rates, standard deviations, and Treynor index values, respectively.
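Step 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes simple NAV returns, a sample standard deviation (the `ddof=1` default is an assumption), and min-max scaling to [0, 1] as the reading of Definition 4. All function names and the NAV series are hypothetical.

```python
import math

def return_rates(nav):
    """Simple return between consecutive net asset values (reading of Definition 1)."""
    return [(cur - prev) / prev for prev, cur in zip(nav, nav[1:])]

def std_dev(returns, ddof=1):
    """Volatility of returns; ddof=1 (sample form) is an assumption, not from the source."""
    mean = sum(returns) / len(returns)
    squared = sum((r - mean) ** 2 for r in returns)
    return math.sqrt(squared / (len(returns) - ddof))

def treynor(returns, risk_free, beta):
    """Mean return in excess of the risk-free rate, per unit of systematic risk (beta)."""
    return (sum(returns) / len(returns) - risk_free) / beta

def min_max(values):
    """Rescale values to [0, 1]; a constant column maps to 0.0 by convention here."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical NAV series for three stocks (illustrative numbers only).
navs = {
    "A": [1.00, 1.02, 1.01, 1.05],
    "B": [2.00, 1.98, 2.02, 2.04],
    "C": [0.50, 0.51, 0.50, 0.52],
}
rets = {name: return_rates(series) for name, series in navs.items()}
stds = [std_dev(r) for r in rets.values()]
normalized_stds = min_max(stds)  # one normalized risk value per stock
```

Normalizing each input across the stocks places return, risk, and Treynor values on the same [0, 1] scale, which is what allows a shared set of linguistic terms to be applied in the next step.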

Step 2: Fuzzification
The results obtained from the normalization process in step 1 were then transformed into linguistic triangular fuzzy numbers, as shown by Equations (8)-(10), where a and c are the minimum and maximum values, respectively, and b is the modal value of the triangular fuzzy number [27,28].
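A triangular fuzzy number (a, b, c) assigns each normalized value a membership degree that rises linearly from a to the modal value b and falls linearly from b to c. The sketch below shows that standard membership function. The five evenly spaced input terms are hypothetical placeholders, since the actual parameters are those of Table 3.

```python
def triangular(x, a, b, c):
    """Membership degree of x in the triangular fuzzy number (a, b, c), with a <= b <= c."""
    if x == b:
        return 1.0  # modal value; also covers shoulder sets where a == b or b == c
    if x <= a or x >= c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical evenly spaced terms on [0, 1]; not the values from Table 3.
INPUT_TERMS = {
    "very low": (0.0, 0.0, 0.25),
    "low": (0.0, 0.25, 0.5),
    "moderate": (0.25, 0.5, 0.75),
    "high": (0.5, 0.75, 1.0),
    "very high": (0.75, 1.0, 1.0),
}

def fuzzify(x):
    """Return the membership degree of a normalized value in every linguistic term."""
    return {term: triangular(x, *abc) for term, abc in INPUT_TERMS.items()}
```

With these placeholder terms, a normalized value of 0.375 belongs to both "low" and "moderate" with degree 0.5, which is how a single crisp input can activate more than one rule.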

Step 3: Fuzzy Rule Base, Fuzzy Inference System, and Defuzzification
In this step, rule bases in the form of linguistic terms were developed based on established stock performance decisions [23,24,29-31]. All inputs in step 2 were aggregated using the rule bases developed in this step, and the output obtained represents the performance of the stocks. It is worth noting that the output produced underwent defuzzification, where the linguistic triangular fuzzy numbers were transformed into crisp values. The interaction between the inputs, fuzzy rule bases, and outputs is expressed generically through IF-THEN rules.
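The text does not state which operators or defuzzification method were used, so the following is only a sketch of one common choice: Mamdani-style inference with min for AND, max for aggregation, and centroid defuzzification over a grid. The membership parameters are hypothetical, and only the four example rules quoted in the case study are encoded.

```python
def tri(x, a, b, c):
    """Triangular membership degree of x in (a, b, c)."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# Hypothetical term parameters on [0, 1]; the paper's values are in Tables 3 and 4.
IN_TERMS = {
    "very low": (0.0, 0.0, 0.25), "low": (0.0, 0.25, 0.5),
    "moderate": (0.25, 0.5, 0.75), "high": (0.5, 0.75, 1.0),
    "very high": (0.75, 1.0, 1.0),
}
OUT_TERMS = {
    "inferior": (0.0, 0.2, 0.4), "stable": (0.2, 0.4, 0.6),
    "good": (0.4, 0.6, 0.8), "aggressive": (0.6, 0.8, 1.0),
}
# (R_t term, S_t term, T_t term) -> performance term
RULES = [
    (("very low", "low", "high"), "inferior"),
    (("high", "high", "very high"), "aggressive"),
    (("high", "low", "very high"), "good"),
    (("moderate", "moderate", "high"), "stable"),
]

def infer(r, s, t, steps=100):
    """Return a crisp performance score, or None when no rule fires."""
    strengths = {}
    for terms, out in RULES:
        # AND is taken as the minimum of the three antecedent memberships.
        w = min(tri(x, *IN_TERMS[term]) for x, term in zip((r, s, t), terms))
        strengths[out] = max(strengths.get(out, 0.0), w)
    num = den = 0.0
    for i in range(steps + 1):
        y = i / steps
        # Clip each output set at its rule strength, then aggregate with max.
        mu = max(min(w, tri(y, *OUT_TERMS[out])) for out, w in strengths.items())
        num += y * mu
        den += mu
    return num / den if den else None  # centroid of the aggregated set
```

Because the centroid averages over the whole aggregated output set, a single extreme input shifts the crisp score only gradually, which is one way to read the claim that defuzzification mitigates the effect of outliers.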

Step 4: Stock Performance, Investor Selection Preferences and Stock Selection Strategy
The stock performance obtained from the defuzzification process in step 3 was expressed as a single value; this value represents investors' evaluation of the stocks. The evaluation was then projected onto the height of the linguistic triangular fuzzy numbers, from which two confidence levels were obtained. The confidence levels represent the selection preferences of two types of investors, namely, pessimistic and optimistic investors. To distinguish the stocks in accordance with investors' preference priority, a stock selection strategy was developed and is given by Equation (11), where w_k is the weight of the performance, b_k is the number of performances obtained for the stock, k is the stock performance, and c_l is the average of the confidence levels for stock performance.
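One way to read the projection step is that a crisp score falling between two adjacent output terms has a membership degree in each, and these two degrees are the two confidence levels. The sketch below follows that reading and assigns the lower degree to the pessimistic view and the higher degree to the optimistic view, in line with the discussion of confidence levels later in the paper. The output term parameters are hypothetical, and this is an interpretation rather than the authors' stated procedure.

```python
def tri(x, a, b, c):
    """Triangular membership degree of x in (a, b, c)."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# Hypothetical output terms; the paper's parameters are given in Table 4.
OUT_TERMS = {
    "inferior": (0.0, 0.2, 0.4),
    "stable": (0.2, 0.4, 0.6),
    "good": (0.4, 0.6, 0.8),
    "aggressive": (0.6, 0.8, 1.0),
}

def confidence_levels(score):
    """Membership degree of the crisp score in every output term it touches."""
    levels = {term: tri(score, *abc) for term, abc in OUT_TERMS.items()}
    return {term: mu for term, mu in levels.items() if mu > 0.0}

def investor_views(score):
    """Return ((term, level) pessimistic, (term, level) optimistic), or None."""
    levels = confidence_levels(score)
    if not levels:
        return None
    pessimistic = min(levels.items(), key=lambda kv: kv[1])  # lower confidence
    optimistic = max(levels.items(), key=lambda kv: kv[1])   # higher confidence
    return pessimistic, optimistic
```

When the score sits exactly on the modal value of one term, only that term has a nonzero level and both views coincide.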

Clustering Malaysia's Stock Market
According to [32], in studies that develop a model, selecting perfect data for model illustration is not a key factor; it is sufficient to use a sample of real data to illustrate a realistic investment scenario. As a small economy, Malaysia is vulnerable to global and regional developments, and a stock market crash in Malaysia led to an economic downturn. The global downturn of the 2007 financial crisis hit Malaysia hard, and the market took about 3 years to recover to its pre-crisis level. In this study, data from the year 2011, when the stock market started to recover [33], were used to observe investors' stock selection preferences that were not affected by the crisis. Generally, the proposed fuzzy clustering method can be applied to any data set, as was done in [3,34], which employed a small sample and a shorter time frame for evaluating stock performance. Thus, consistent with previous studies whose objective was to present a novel clustering method, the capability of this novel method was demonstrated using real data of 30 stocks listed on Bursa Malaysia. These 30 stocks are listed under the Syariah category of the consumer products and services sector, a dominant sector observed by the masses [35,36]. The details of the processes involved in the application of the proposed model are given below.

Step 1: Data Collection, Input/Output Identification and Normalization
In this study, data of 30 Syariah-compliant companies listed under the consumer products and services sector from 3 January 2011 to 30 December 2011 were collected. The data for the 30 stocks comprised stock prices together with the return rates, R_t, standard deviations, S_t, and Treynor index values, T_t, which served as the input variables, while stock performance served as the output for this investigation. The companies' stock prices were obtained from DataStream, while the Syariah index and the Kuala Lumpur Composite Index (KLCI) were obtained from Bursa Malaysia. Table 1 shows part of the variables evaluated for Aeon Co (M) Bhd, one of the 30 stocks examined. Each variable was normalized using Equation (4), as displayed in Table 2.

Step 2: Fuzzification
The normalization results obtained in step 1 were transformed into linguistic triangular fuzzy numbers, where they were defined in linguistic terms as very high, high, moderate, low, and very low for the input variables, while for outputs, the linguistic terms were inferior, stable, good, and aggressive. Tables 3 and 4 describe the linguistic terms and their respective triangular fuzzy numbers for the inputs and outputs, respectively.

Step 3: Fuzzy Rule Base, Fuzzy Inference System, and Defuzzification
In this step, fuzzy rule bases were developed to aggregate the inputs considered. These rule bases were obtained based on the results achieved in previous works [3,4,7,8]. In total, 125 rules were developed (5 × 5 × 5 combinations of the five linguistic terms across the three input variables), each mapping to one of the four linguistic terms for the output variable defined in step 2. Some of the rules generated for Aeon Co (M) Bhd are given below.

IF R_t is very low AND S_t is low AND T_t is high THEN inferior stock performance.
IF R_t is high AND S_t is high AND T_t is very high THEN aggressive stock performance.
IF R_t is high AND S_t is low AND T_t is very high THEN good stock performance.
IF R_t is moderate AND S_t is moderate AND T_t is high THEN stable stock performance.
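The size of the rule base follows directly from the term sets: three inputs with five linguistic terms each give 5 × 5 × 5 = 125 antecedent combinations. The sketch below enumerates them and attaches the four consequents quoted above. The remaining consequents are not given in the text, so they are left as `None` rather than guessed. The variable names are hypothetical.

```python
from itertools import product

INPUT_TERMS = ["very low", "low", "moderate", "high", "very high"]
OUTPUT_TERMS = ["inferior", "stable", "good", "aggressive"]

# The four rules quoted in the text, keyed by the (R_t, S_t, T_t) terms.
KNOWN_RULES = {
    ("very low", "low", "high"): "inferior",
    ("high", "high", "very high"): "aggressive",
    ("high", "low", "very high"): "good",
    ("moderate", "moderate", "high"): "stable",
}

# Every combination of one term per input variable.
antecedents = list(product(INPUT_TERMS, repeat=3))

# None marks a consequent that the text does not state.
rule_base = {combo: KNOWN_RULES.get(combo) for combo in antecedents}

def format_rule(combo, outcome):
    """Render one rule in the IF-THEN form used in the text."""
    r, s, t = combo
    return f"IF R_t is {r} AND S_t is {s} AND T_t is {t} THEN {outcome} stock performance."
```

Calling `format_rule(("high", "low", "very high"), "good")` reproduces the third rule in the list above.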

Step 4: Stock Performance, Investor Selection Preferences, and Stock Selection Strategy
The results of the stock performance acquired in step 3 are presented in the form of a single value. Table 5 gives the results for the stock performance and the confidence levels with respect to the pessimistic and optimistic investors for Aeon Co (M) Bhd. To determine the overall performance of a specific stock as judged by different types of investors, the performance frequency for the stocks was evaluated. Table 6 shows part of the performance evaluation for 10 stocks. Based on Table 6, the performance that emerges most frequently indicates the actual performance of the stock. In this case, Aeon Co (M) Bhd is considered to have good performance from the point of view of pessimistic investors, while optimistic investors classify the stock performance as stable. Table 7 provides the results of the stock performance for all 30 stocks evaluated, as well as the average confidence level for both pessimistic and optimistic investors. The same technique was applied to evaluate market performance, and consistent results between the market and stock performances were obtained. Such evaluation is important to ensure that stocks and markets are coherent. Tables 8 and 9 display the market performance evaluation for the Syariah index and the KLCI.
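The frequency rule described here, in which the most frequent label is taken as the stock's overall performance, can be sketched with a counter. The labels below are hypothetical and are not taken from Table 6. The tie-breaking behavior (first label encountered wins) is a property of this sketch, since the text does not say how ties are resolved.

```python
from collections import Counter

def overall_performance(labels):
    """Return (label, frequency) for the most frequent performance label.

    Ties are broken in favor of the label seen first, which is an
    assumption of this sketch rather than a rule stated in the text.
    """
    if not labels:
        raise ValueError("at least one performance label is required")
    label, freq = Counter(labels).most_common(1)[0]
    return label, freq

# Hypothetical per-period labels for one stock under one investor view.
pessimistic_labels = ["good", "stable", "good", "inferior", "good"]
result = overall_performance(pessimistic_labels)
```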

Stock Selection Strategy
There are various ways to select stocks in the market. This study introduces a novel stock selection strategy based on investors' confidence levels and stock preferences. The strategy aims to provide investors with a unique stock priority evaluation, so that high-priority stocks from the available pool of stocks will be selected first for investment. This is computed by assigning each performance a weight: 0.1 for inferior, 0.2 for stable, 0.3 for good, and 0.4 for aggressive performance. The stock priority evaluation for Aeon Co (M) Bhd was then calculated using Equation (11). Table 10 shows the stock priority evaluation and the ranking for the 30 stocks analyzed in this study. The distinct selections made based on different investors' preferences can be observed in Table 10. As shown, pessimistic investors select the stock of SHH Resources Holding Bhd, while optimistic investors choose the Milux Corporation Bhd stock to invest in.

Validation of the Results
For validation purposes, a comparative analysis of the novel proposed fuzzy clustering, k-means clustering, and hierarchical clustering methods is presented. The validation focused on the correlation between the actual stock performance ranking and the rankings produced by each of the three methods. Table 11 summarizes the stock performance evaluations and the resulting ranking order for the k-means, hierarchical, and novel proposed fuzzy clustering methods. This ranking order was then validated against the actual ranking performance using Spearman's rank correlation coefficient [37]. Table 12 reports the rankings and Spearman's rank correlation coefficients for the three methods.
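For rankings without ties, Spearman's coefficient reduces to ρ = 1 - 6 Σ d² / (n(n² - 1)), where d is the difference between the two ranks of each stock. The sketch below implements that textbook formula. The rankings are hypothetical and are not the values from Table 12, and the no-ties condition is an assumption of this simplified form.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rank correlation for two rankings of the same items, no ties."""
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("rankings must have equal length of at least 2")
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Hypothetical rankings of five stocks: actual order vs. a method's order.
actual = [1, 2, 3, 4, 5]
method = [2, 1, 3, 5, 4]
rho = spearman_rho(actual, method)
```

A value near 1 means the method orders the stocks almost exactly as the actual ranking does, a value near -1 means the order is reversed, and a value near 0 means the two orderings are unrelated.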

Discussion
As presented in Section 4, this study successfully extended the established clustering methods [4,6] by developing a novel fuzzy clustering method using a fuzzy inference system. The novel fuzzy clustering method is capable of determining stock performance based on investor preferences, as well as ranking stocks based on priority. As shown in Table 5, four performance evaluations, namely, inferior, stable, good, and aggressive, were formed. Inferior performance consists of stocks that are unstable and in poor condition, yielding high risk and low return gains. Investment in this performance classification is deemed to be unworthy. Stable performance consists of stocks that are still considered to be high risk and to have low return gains, but whose performance is slightly better than that of inferior stocks. Moreover, stable performance also consists of stocks that have moderate return rates and risk levels, but are unable to provide profit over shorter investment periods. Good performance stocks are the best stocks for investment, since the risk is low and the return rate is high, indicating that investors' chances of losing are low and that they are able to secure high returns on their investments. Finally, the aggressive performance classification consists of stocks that provide higher returns but with higher risk. This stock classification is for investors who are not intimidated by high investment risk in pursuit of high profit returns. Stocks with a lack of investor preference are classified as inferior and stable, rather than good or aggressive. Therefore, it is suggested that investors invest in stocks classified as having good and aggressive performance.
Unlike established past works that considered only a single investor's selection preference, which cannot be justified, the proposed novel fuzzy clustering method is able to distinguish investors' preferences based on stock performance. Real investors' selection preferences can be either pessimistic or optimistic, as shown by the results obtained using the novel fuzzy clustering method. The types of investors are represented by confidence levels, with a low value indicating pessimistic investors and a high value denoting optimistic investors, as shown in Tables 5 and 7. Even though investors express different preferences toward performance evaluation, the results of overall stock performance show that some stocks are given the same performance evaluation by both optimistic and pessimistic investors. This implies that investors' preferences are an important element to consider when selecting appropriate stocks.
The stock performance presented using the novel fuzzy clustering method is in the form of a numerical value. This numerical value shows the strength of the stock performance, which can be used to rank stocks based on priority, something established works were unable to do. Typically, most established works used more than one method to sort and rank stocks based on priority, as was done by [20,21]. By contrast, this study used only one method to determine stock performance, to cluster stock performance, to rank stocks based on priority, and to determine and rank priority stocks based on investor preferences.
In step 4, the priority evaluation was applied to search for the best stocks to invest in. At this stage, the stocks were ranked based on the priority of the stock performance, as shown in Table 10. Such ranking was done for both pessimistic and optimistic investors. The performance of the proposed method was validated against established works and against the actual ranking using Spearman's rank correlation coefficient. The results show that the proposed novel fuzzy clustering method is superior to the k-means and hierarchical clustering methods.

Conclusions
This paper presents a novel fuzzy clustering method for stock selection based on investors' selection preferences. Compared to established methods, the novel proposed method expresses investors' selection preferences precisely and unambiguously with regard to different types of investors, such as pessimistic and optimistic investors. Moreover, unlike established methods, the novel selection strategy developed in this study ensures that high-priority stocks are chosen as the best stocks and selected first for investment by employing the proposed stock priority method. The efficiency of the proposed method was illustrated and validated by clustering 30 Syariah stocks listed on Bursa Malaysia. This study successfully applied the novel fuzzy clustering method to the problem of stock selection based on investors' preferences so as to assist investors in their investment choices. The results obtained from the validation justify this novel fuzzy clustering method, as it provided higher efficiency by achieving consistent ranking correlations against the actual results, unlike the established clustering methods. Although the novel clustering method obtained results consistent with the actual stock performances, the ranking correlation values were not sufficiently high, and thus better computations will be needed to increase the level of accuracy. In addition, the novel fuzzy clustering method considers only the uncertainty component of pessimistic and optimistic investors' behaviors; for better results, the reliability, hesitancy, and bipolarity components may need to be embedded into stock selection. In the future, the authors aim to explore stock selection procedures in relation to investors' reliability, hesitancy, and bipolarity.