Top-N Recommender Systems Using Genetic Algorithm-Based Visual-Clustering Methods

The drastic increase of websites is one of the causes behind the recent information overload on the internet. A recommender system (RS) has been developed for helping users filter information. However, the cold-start and sparsity problems lead to low performance of the RS. In this paper, we propose methods including the visual-clustering recommendation (VCR) method, the hybrid between the VCR and user-based methods, and the hybrid between the VCR and item-based methods. The user-item clustering is based on the genetic algorithm (GA). The recommendation performance of the proposed methods was compared with that of traditional methods. The results showed that the GA-based visual clustering could properly cluster user-item binary images. They also demonstrated that the proposed recommendation methods were more efficient than the traditional methods. The proposed VCR2 method yielded an F1 score roughly three times higher than the traditional approaches.


Introduction
The number of websites available on the internet has exploded in recent years. As a result, users can be inundated by a great number of websites that provide reams of information, most of which is irrelevant to their needs. To solve this problem, the recommender system (RS) has been developed to filter the information on the internet by choosing only the most appropriate information for the user. There are some helpful surveys in the literature. The details of several methods and algorithms involved in content-based recommender systems were provided in [1]. The details of methods, including their limitations, in content-based, collaborative, and hybrid recommendation systems were described in [2].
The main methods of the RS can be classified into several groups, such as content-based [3,4], collaborative filtering [3-9], feature-based [10-13], and demographic-based algorithms [14]. Collaborative filtering is widely applied in the area of the RS [3,4,6,7,15,16]. This algorithm may be classified into two categories: the user-based and item-based algorithms. In the user-based algorithm, the system generates the top-N recommendation based on similarity among users. In the item-based algorithm, on the other hand, the system generates the top-N recommendation based on similarity among items. Several attempts have been made to improve the RS performance. The hybrid between collaborative filtering and sequential pattern analysis yielded better recommendation performance [17]. Personal models achieve better personalized RSs [18-20]. User-created tags were also exploited to improve the performance of collaborative filtering [21]. It was found that sequential bias in product reviews was harmful to the RS performance and should be removed before performing the recommendation [22]. Local and global features of a social graph were used in the calculation of a transitive node-similarity measure in online social networks to enhance friend recommendations [23]. The best subset of a user's friends, i.e., the inferred circles of friends, was used in the online social network recommendation system [24]. The ensemble learning of several recommendation systems, e.g., SVD, Neighborhood-Based Approaches, Restricted Boltzmann Machine, Asymmetric Factor Model, and Global Effects, was utilized to produce the final recommendation result [25].
Although many recommendation algorithms have been developed to improve the performance of the RS, they still need to overcome the cold-start and sparsity problems [26]. The cold-start problem occurs when the RS does not have enough information about a new user and/or a new item. The sparsity problem occurs when the frequency of the purchased items is too small. Many algorithms have been used to address the cold-start problem. A probabilistic graph was used to represent an implicit social network, and a probabilistic graph-based measure was then used to produce the final prediction in the social network recommender system [27]. The predictive feature-based regression based on the pairwise user preferences was used to leverage all available user and item information [28]. The hybrid taxonomy-based recommender (HTR) was built based on the assumption that there was a relation between users' item preferences and taxonomy preferences [29]. The cold-start hybrid taxonomy-based recommender (CSHTR), based on the closest taxonomic preferences cluster, was built accordingly to cope with the cold-start problem. To solve the new user problem, the item-based collaborative filtering [7], feature-based [11], fuzzy-based [30], and behavior-based [31] methods were proposed. Furthermore, the fuzzy-based and feature-based methods were used to solve the new product problem [11,30,31]. Prediction of missing information was applied prior to the collaborative filtering to ease the cold-start problem [32]. In addition, combining the collaborative filtering with personal agents [26], using the fuzzy-based method [30], and combining the collaborative and content-based filtering methods [4] were applied to solve the sparsity problem.
However, all of the aforementioned methods have limitations. In the fuzzy-based method, the performance depends on the expert who creates the fuzzy rule base. In the combination of collaborative filtering and personal agents, the personal information is difficult to collect. When using the combined collaborative and content-based filtering method or the feature-based method, the attributes (features) of items have to be chosen properly. Different types of items, such as books, movies, and songs, usually have different sets of useful attributes. Therefore, the RS of each type of item has to be developed separately. Above all, the algorithms still have difficulty when the cold-start and sparsity problems occur together. These problems lead to low recommendation performance on real-world data sets.
Some methods based on matrix manipulation were proposed for recommendation problems. The social matrix factorization was proposed for the recommendation system [33]. A matrix factorization was applied to perform a recommendation in social networks [34]. It can handle the transitivity of trust and trust propagation. Multiple collaborative filtering tasks in different domains were considered [35]. The rating in each domain was modeled by the probabilistic matrix factorization. The knowledge was transferred across different domains using the correlation between domains. The multi-scale spectral decomposition method (MSEIGS) was utilized in inductive matrix completion (IMC) to find the top-k eigenvectors or top-k result in recommender systems [36]. The MSEIGS provides a result comparable with those of the other methods in a smaller computation time. An extension of matrix factorization called tensor factorization (TF), an N-dimensional tensor of user-item-context, allowed additional dimensions of different context-type representations [37]. Information (a blog) was represented as a vertex of a hypergraph, while blogs in the same set were represented by hyperedges [38]. The proposed multilevel clustering algorithm was utilized to segment a hypergraph. To achieve a final optimized recommender set, the authors utilized some optimization methods.
The objective of this research is to solve the cold-start and sparsity problems using the visual-clustering method. The visual-clustering method (VCM) was initially proposed by our research group to cluster data in binary images [39]. Its clustering performance was not investigated numerically. Besides the initial clustering results, our clustering method was not utilized in any problem. In the present work, we describe the VCM in more detail and apply it to the recommendation problem. Six methods for the RS are proposed. The first is the visual-clustering recommendation (VCR1) method. The idea behind this method is to apply the clustering technique to cluster the users and items in a binary image. Then, we use the information in the derived clusters to generate the top-N recommended items for an active user. The second method is the hybrid between the VCR1 and user-based methods, namely the VCR-UB1. The third method is the hybrid between the VCR1 and item-based methods, the VCR-IB1. The other three proposed methods are derived similarly with a new fitness function during cluster generation. We test the clustering performance on three synthetic and two real-world data sets. The performances of the top-N RSs are tested on four real-world data sets by using the precision, recall, and F1 scores as evaluation measures. The proposed methods are also compared with three traditional methods: the frequency-based, user-based, and item-based methods. In previous works, recommendation performances were usually evaluated on entire data sets. That means the entire data sets were used as the training set and the test was performed on the training set. This casts doubt on the generalization of the previous results. Hence, in this work, ten-fold cross-validation is chosen to cope with this problem.
This paper is organized as follows. In Section 2, we review and describe related methods. The framework of the experiments is given in Section 3. The experimental results on synthetic and real-world data sets and the respective discussions are presented in Section 4. The conclusion is drawn in Section 5.

Proposed Genetic Algorithm-Based Visual-Clustering Method
As a heuristic search algorithm that mimics the process of natural evolution, the genetic algorithm (GA) has been widely applied to many applications in different RSs [5,39-42]. In the process of the GA, there are five parts: the initial population, evaluation, reproduction, crossover operation, and mutation operation. The population of the GA is a group of chromosomes consisting of genes (an array of values). By mimicking natural selection, the chromosomes with a high fitness value are selected into a mating pool. The reproduction process occurs in the pool by copying individual chromosomes to the next generation. The crossover operation creates children from the parents based on the pairing process. The mutation operation aims to maintain genetic diversity from one generation of the population to the next.
The model of the proposed top-N recommendation systems is shown in Figure 1. In the proposed methods, the process can be divided into three parts. Firstly, we create the user-item table from the purchase transactions as shown in Figure 2a. The user-item table is then mapped into a binary image as shown in Figure 2b. Secondly, we propose a method to cluster the users and items in the binary image. Finally, we develop methods for the RS based on the clustered image to recommend items to an active user.
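The first step, mapping purchase transactions to a binary image, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `build_binary_image` and the (user, item) pair format of the transactions are assumptions made for the example.

```python
import numpy as np

def build_binary_image(transactions):
    """Map (user, item) purchase pairs to a 0/1 user-item matrix.

    Rows are users and columns are items; a pixel is 1 when the
    user purchased the item at least once. (Hypothetical helper.)
    """
    users = sorted({user for user, _ in transactions})
    items = sorted({item for _, item in transactions})
    row_of = {user: r for r, user in enumerate(users)}
    col_of = {item: c for c, item in enumerate(items)}
    image = np.zeros((len(users), len(items)), dtype=np.uint8)
    for user, item in transactions:
        image[row_of[user], col_of[item]] = 1
    return image, users, items

# Toy transactions (made up for illustration).
transactions = [("u1", "a"), ("u1", "b"), ("u2", "b"), ("u3", "c")]
image, users, items = build_binary_image(transactions)
print(image)
```

Repeated purchases of the same item by the same user collapse to a single pixel in this sketch, which matches the binary nature of the image described above.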
We initially proposed the visual-clustering method (VCM) and found that it was able to cluster data in binary images [39]. However, the analytical evaluation of clustering performance and the usefulness of the clustered images were not investigated. In this work, the VCM is described in more detail. We also apply the VCM to more data sets and extend it to cope with the recommendation problem.
The idea of the VCM is to search for possible clusters in a binary image by interchanging positions of rows and columns. One may think about using an exhaustive search to solve the problem. Unfortunately, the complexity of the exhaustive search method is I! × J! in the scenario of I users and J items, which is far too large. Meanwhile, the complexity of the GA in this problem is O(n(I + J)), where n is the number of chromosomes in the population (binary images in this case). This is the reason that the GA is selected to find the optimized clusters in this research. The five steps of a typical GA, i.e., the initial population, evaluation, reproduction, crossover, and mutation, are applied here. The detailed information is as follows.
In the process of initial population, the positions of rows and columns in a binary image are randomly interchanged to generate each chromosome. There are two genes in a chromosome, i.e., the first gene represents the users (rows) and the second gene represents the items (columns). Figure 3 shows an example of the initial population in the five-user four-item scenario.
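A chromosome of this kind can be sketched as a pair of permutations, one for the rows and one for the columns. The code below is an illustrative sketch only: the function names are hypothetical, indices are zero-based (unlike the one-based values printed in the paper's examples), and `exhaustive_arrangements` simply evaluates the I! × J! count quoted above.

```python
import math
import numpy as np

def initial_population(num_users, num_items, size, seed=0):
    """Create `size` chromosomes, each a (row gene, column gene) pair."""
    rng = np.random.default_rng(seed)
    return [(rng.permutation(num_users), rng.permutation(num_items))
            for _ in range(size)]

def apply_chromosome(image, chromosome):
    """Reorder rows and columns of the binary image as the genes dictate."""
    row_gene, col_gene = chromosome
    return image[np.ix_(row_gene, col_gene)]

def exhaustive_arrangements(num_users, num_items):
    """Number of row/column orderings an exhaustive search would visit."""
    return math.factorial(num_users) * math.factorial(num_items)

# Five-user four-item toy image (made up for illustration).
image = np.eye(5, 4, dtype=np.uint8)
population = initial_population(5, 4, size=4)
shuffled = apply_chromosome(image, population[0])
print(exhaustive_arrangements(5, 4))  # 5! * 4! = 2880
```

Even the 20 × 20 synthetic image used later would give more than 10^36 arrangements, which illustrates why an exhaustive search is impractical.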
The process of evaluation selects the chromosomes with high fitness values into the mating pool. In our previous work, we clustered objects in images, but further applications, like recommendation, had never been performed [39]. In the previous work, the number of clusters and the compactness of objects in the image were used to determine the fitness value, i.e.,

$$\mathrm{Fitness}_i = a_1 (1 - \alpha_i) + a_2 \beta_i \qquad (1)$$

$$\alpha_i = \frac{N_i}{\max_j (N_j)}, \quad j = 1, 2, \ldots, n \qquad (2)$$

and

$$\beta_i = \frac{C_i}{\max_j (C_j)}, \quad j = 1, 2, \ldots, n \qquad (3)$$

where $\mathrm{Fitness}_i$ is the fitness function calculated for the ith binary image. Parameters $a_1$ and $a_2$ are weight parameters. $\alpha_i$ is the normalized number of clusters, $\beta_i$ is the normalized compactness, $N_i$ is the number of clusters, and $C_i$ is the average compactness in the ith binary image. The compactness of a cluster (connected component) is defined as:

$$\mathrm{Compactness} = \frac{(\text{cluster perimeter})^2}{4\pi (\text{cluster area})} \qquad (4)$$

n is the number of the binary images, i.e., the number of chromosomes in the population. This fitness function is high for an image with a small number of clusters in which the shape of each cluster is close to a circle.
In this research, a new fitness function is proposed for clustering. The idea is to remedy the weakness of the previous fitness function in [39] for the recommendation problem. We found that compactness in the previous function might not be a good indicator of a well-grouped cluster in this problem. For the recommendation problem, a well-grouped cluster can be of any shape (circular, rectangular, elongated, etc.) with many four-connected pixels. Therefore, we discard the compactness in this research but add three more factors, and the new fitness function is

$$\mathrm{Fitness}_i = a_1 (1 - \alpha_i) + a_2 (1 - \beta_i) + a_3 \gamma_i + a_4 \delta_i \qquad (5)$$

$\alpha_i$ is defined as in Equation (2) to deal with the number of clusters. $\beta_i$ is used to deal with the number of small clusters. The number of small clusters in the ith image is denoted by $N_{s,i}$. We do not want to have many small clusters; therefore, $N_{s,i}$ should be small. In this research, the clusters with fewer than four pixels are considered small clusters. The size of a large cluster (though it does not need to be the largest cluster) is taken into account in $\gamma_i$. $P_{l,i}$ denotes the number of pixels in the large cluster in the ith image. In this research, $P_{l,i}$ is the number of pixels of the third largest cluster in the image. Similarly, the size of a small cluster (though not necessarily the smallest) is taken into account in $\delta_i$. $P_{s,i}$ denotes the number of pixels in the small cluster in the ith image. In this research, $P_{s,i}$ is the number of pixels of the smallest cluster with more than three pixels. It is worth noting that the values of $P_{l,i}$ and $P_{s,i}$ are not taken from the largest and smallest clusters; order statistics are utilized instead. This provides more robustness in terms of the size variation. The bottom line is that the fitness value will be large when pixels in the image are well-clustered (small $\alpha_i$), the number of small clusters is small (small $\beta_i$), the third largest cluster is large (large $\gamma_i$), and the smallest cluster with more than three pixels is large (large $\delta_i$).
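The fitness in Equation (5) can be illustrated with a small sketch. Several points are assumptions, not statements from the paper: each of the four raw quantities is normalized by its maximum over the population, the third-largest cluster size falls back to 0 when an image has fewer than three clusters, and all function names are hypothetical. Clusters are taken as four-connected components of 1-pixels, and the default weights follow the 0.2, 0.2, 0.3, 0.3 setting reported in the parameter section.

```python
from collections import deque

def cluster_sizes(image):
    """Pixel counts of the 4-connected components of 1-pixels."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue, size = deque([(r, c)]), 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and image[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes

def raw_terms(image, small_limit=4, large_rank=3):
    """Return (N_i, N_s_i, P_l_i, P_s_i) for one binary image."""
    sizes = sorted(cluster_sizes(image), reverse=True)
    not_small = [s for s in sizes if s >= small_limit]
    n_small = len(sizes) - len(not_small)
    # Assumed fallback: 0 when fewer than `large_rank` clusters exist.
    p_large = sizes[large_rank - 1] if len(sizes) >= large_rank else 0
    p_small = min(not_small) if not_small else 0
    return len(sizes), n_small, p_large, p_small

def fitness_values(images, weights=(0.2, 0.2, 0.3, 0.3)):
    """Equation (5) per image, normalizing each term by its population max."""
    terms = [raw_terms(img) for img in images]
    maxima = [max(t[k] for t in terms) or 1 for k in range(4)]
    a1, a2, a3, a4 = weights
    scores = []
    for term in terms:
        alpha, beta, gamma, delta = (v / m for v, m in zip(term, maxima))
        scores.append(a1 * (1 - alpha) + a2 * (1 - beta)
                      + a3 * gamma + a4 * delta)
    return scores

clustered = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
scattered = [[1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1]]
scores = fitness_values([clustered, scattered])
print(scores)
```

In this toy population, the image with two four-pixel blocks scores higher than the checkerboard of isolated pixels, which is the behaviour the text describes.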
Reproduction is the process in which individual chromosomes are copied according to their fitness values. The chromosomes with higher fitness values have a higher probability of being selected. Table 1 shows the sample chromosomes, fitness values, and corresponding probabilities in the five-user four-item scenario when the number of chromosomes in the population (n) equals four. Two chromosomes are selected from the mating pool of $n_k$ chromosomes to produce two new chromosomes, where $n_k$ is the number of surviving chromosomes, which is set to half of n in this research. This reproduction process is repeated until $n - n_k$ chromosomes are generated.
Crossover operation is the creation of children from the parents which are selected in the pairing process [39,41,43]. We design the crossover operation to only cross the values within a gene. For example, chromosome 1 (Parent 1) and chromosome 2 (Parent 2) in Table 1 may be selected from the population as the parents.
The mutation operation is designed to allow interchanging of the positions only within the same gene. To replace the selected point, another point within the same gene is randomly selected. For example, when the value in column 5 (value 5) of chromosome 2 (Table 1) is selected, the values of the new chromosome after mutation are 5 3 1 2 4 | 3 2 4 1.
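The three operators can be sketched for a single gene as follows. This is an illustration under stated assumptions, not the authors' procedure: selection probabilities are taken as fitness-proportional, and the crossover shown is a simple order-preserving variant chosen here because it keeps the child a valid permutation; the paper's exact crossover rule is not reproduced. The gene values in the usage lines are made up and are not the entries of Table 1.

```python
import random

def selection_probabilities(fitness):
    """Fitness-proportional selection probabilities (assumed scheme)."""
    total = sum(fitness)
    return [f / total for f in fitness]

def crossover_gene(parent_a, parent_b, cut):
    """Order-preserving crossover within one gene (assumed variant).

    The child keeps the first `cut` values of parent_a, then the
    remaining values in the order they appear in parent_b.
    """
    head = list(parent_a[:cut])
    tail = [value for value in parent_b if value not in head]
    return head + tail

def mutate_gene(gene, rng):
    """Swap two randomly chosen positions within the same gene."""
    i, j = rng.sample(range(len(gene)), 2)
    child = list(gene)
    child[i], child[j] = child[j], child[i]
    return child

probabilities = selection_probabilities([0.2, 0.4, 0.1, 0.3])
child = crossover_gene([1, 2, 3, 4, 5], [5, 3, 1, 2, 4], cut=2)
mutant = mutate_gene([5, 3, 1, 2, 4], random.Random(1))
print(child)  # [1, 2, 5, 3, 4]
```

Applying the same operators separately to the user gene and the item gene keeps every chromosome a valid pair of row and column orderings.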

Recommendation Engine
Detailed information on user-based and item-based collaborative filtering is widely available in the literature. Therefore, the following methods are only briefly described. In the user-based top-N recommendation algorithm (UB), the top-N recommended items are produced by calculating the similarity between an active user and other users who have made similar purchases [15]. To generate a recommendation, the user-based collaborative filtering algorithm creates the top-N recommendations from a neighborhood of users by using the most frequent item recommendation technique. On the other hand, in the item-based top-N recommendation algorithm (IB), the RS produces the top-N items for an active user by calculating the similarity between items [44]. This algorithm uses item-to-item similarity to compute the relation between the items. The frequency-based method (FB) is another basic method. The RS sorts the frequency counts of the purchased items and presents to an active user the N most frequent items that the user has not yet purchased.
The recommendation engine is the main process of the RS, and we have developed six of them. The first recommendation engine, the visual-clustering recommendation (VCR), directly uses the information in the clusters derived from the VCM to generate the top-N items for an active user. The top-N most frequent items in the derived clusters that the active user's purchased items belong to will be recommended. It is worth noting that the active user can initially purchase items in different clusters. In this case, we recommend items by considering frequencies of purchased items in all related clusters. The second recommendation engine generates the top-N most frequent items in a similar manner but by using the union of the VCR and UB outputs from the derived clusters. The third recommendation engine performs similarly, though it uses the union of the VCR and IB outputs from the derived clusters. The other three engines are derived similarly but use the fitness function in Equation (5) to generate the clusters.
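The cluster-based recommendation step can be sketched as below. This is a simplified reading of the description above, not the authors' code: clusters are assumed to be given as lists of (user, item) pixels, ties are broken alphabetically, items already in the active basket are excluded, and the name `vcr_top_n` is hypothetical.

```python
from collections import Counter

def vcr_top_n(clusters, active_basket, n=5):
    """Recommend the n most frequent items from all clusters that
    contain at least one item of the active user's basket."""
    basket = set(active_basket)
    counts = Counter()
    for pixels in clusters:
        cluster_items = {item for _, item in pixels}
        if cluster_items & basket:  # cluster is related to the basket
            counts.update(item for _, item in pixels)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked if item not in basket][:n]

# Toy clusters (made up for illustration).
clusters = [
    [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "b"), ("u3", "b")],
    [("u4", "c"), ("u5", "c"), ("u5", "d")],
]
print(vcr_top_n(clusters, ["a"]))       # ['b']
print(vcr_top_n(clusters, ["a", "d"]))  # ['b', 'c']
```

The second call shows the case of a basket spanning two clusters, where the frequencies from all related clusters are pooled before ranking.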

Evaluation Measures
In this study, we evaluate the performance of the RS by using precision, recall, and F1 score because they have been widely used to evaluate the performance of the top-N RS [15,31,44-46]. Precision is defined as the ratio of the number of elements in the hit set to the number of elements in the recommendation set, i.e.,

$$\mathrm{precision} = \frac{\#\,\text{of hits}}{\#\,\text{of recommended items}} \qquad (9)$$

Recall is defined as the ratio of the number of elements in the hit set to the number of purchased items, i.e.,

$$\mathrm{recall} = \frac{\#\,\text{of hits}}{\#\,\text{of purchased items}} \qquad (10)$$

The F1 score or F-measure can be derived from precision and recall, i.e.,

$$F1 = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} \qquad (11)$$
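Equations (9) to (11) translate directly into a few lines of code. The sketch below assumes that the hit set is the intersection of the recommended items and the purchased items being predicted, and that a score is 0 when its denominator is 0; the function name is hypothetical.

```python
def top_n_scores(recommended, purchased):
    """Return (precision, recall, F1) for one user's recommendation."""
    hits = len(set(recommended) & set(purchased))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(purchased) if purchased else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Five recommended items, three purchased items, two hits.
precision, recall, f1 = top_n_scores(["a", "b", "c", "d", "e"],
                                     ["a", "c", "x"])
print(precision, recall, f1)
```

With two hits out of five recommendations and three purchased items, precision is 0.4, recall is 2/3, and F1 is 0.5.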

Experimental Framework
The experiments consist of two parts. The first part was designed to investigate the clustering performance. Both synthetic and real-world data sets were used to test the proposed clustering methods. In the second part, the recommendation performance of the proposed methods was investigated on four real-world data sets. As mentioned earlier, precision, recall, and F1 score were applied to evaluate the performance of the top-N RSs.

Data Descriptions
The data sets used in the experiments can be divided into two groups, i.e., the synthetic data sets and real-world data sets. For the synthetic data sets, three original images are generated. The three images have sizes of 20 × 20, 34 × 34, and 48 × 48. There are three, five, and seven clusters in the three images, respectively. Each cluster in the images contains 24 pixels. For each image, the rows are randomly interchanged. The process is repeated with the columns. The derived images are the row-interchanged and column-interchanged versions of the corresponding original ones.
There are four real-world data sets used in the experiments. The first data set is the purchase transactions from Gazelle.com, a legwear and legcare e-tailer, collected by Blue Martini Software for KDD-CUP2000 (KDD). In this data set, there are 3465 purchases in total by 1831 customers. However, a great deal of important data is missing, including the customer ID, item ID, and the number of purchases in some transactions. Hence, after removing those incomplete transactions, there are 1697 customers and 247 items. However, there are only 271, 110, and 14 customers who purchased at least two, three, and four items, respectively. Moreover, there are only 102 items that were purchased at least twice. The second data set is our private data. It is the purchase transactions at the Thaiherbs-Thaimassage shop (TTS) [47]. The data set consists of 371 customers and 175 items. However, there are only 112 and 55 customers who purchased at least two and three items, respectively. There are only 95 items that were purchased at least twice.
The third data set is the restaurant and consumer data set (RCM) collected by the Department of Computer Science, National Center for Research and Technological Development in Mexico. This data set contains 1161 ratings for 130 restaurants rated by 138 users. The fourth data set is the transactions of visiting the Entree Chicago restaurant recommendation system (ECR), collected by the Department of Information and Computer Science, the University of California, Irvine. We chose only the transactions recorded in the 4th quarter of 1996. Each user is represented by a session of user interaction with the system. There are 1786 users (sessions) and 674 restaurants. These four data sets have the sparsity problem, i.e., the frequency of the purchased items is too small. Therefore, they are suitable for the recommendation performance evaluation of the proposed methods.
It is impossible to evaluate the recommendation performance using the information of customers who purchased only one item in total. To make the recommendation possible, we use only the 110 customers who purchased at least three items in the KDD data set and only the 112 customers who purchased at least two items in the TTS data set. We selected 115 users who rated at least four times for the RCM data set. For the ECR data set, we selected 611 users who visited at least five times. Hence, the size of the KDD data set is 110 customers and 247 items, whereas that of the TTS data set is 112 customers and 175 items. For the RCM and ECR data sets, the sizes are 115 customers/130 restaurants and 611 customers/386 restaurants, respectively. After the selection, the sparsity levels of the KDD, TTS, RCM, and ECR data sets are very high at 0.986, 0.969, 0.939, and 0.983, respectively.
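The text does not spell out how the sparsity level is computed. A common definition, assumed here, is one minus the fraction of non-zero entries in the user-item matrix; the function name is hypothetical.

```python
def sparsity_level(image):
    """1 - (number of 1-pixels) / (number of pixels), an assumed definition."""
    total = sum(len(row) for row in image)
    filled = sum(sum(row) for row in image)
    return 1 - filled / total

# Toy 2 x 4 user-item image with two purchases.
print(sparsity_level([[1, 0, 0, 0], [0, 0, 1, 0]]))  # 0.75
```

Under this assumed definition, a sparsity level of 0.986 on a 110 × 247 matrix would correspond to roughly 380 non-zero entries out of 27,170.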

Parameter Setting
From extensive experiments, the related parameters are chosen as follows. For the GA, the size of the population is 80, the mutation rate is 0.01, and the crossover rate is 0.6. The weights of our previously proposed fitness function (see Equation (1)) are set to 0.5 for both $a_1$ and $a_2$. For the presently proposed fitness function (see Equation (5)), the weights $a_1$, $a_2$, $a_3$, and $a_4$ are set to 0.2, 0.2, 0.3, and 0.3, respectively. In the RSs, the neighborhood size of the user-based and item-based top-N RSs is limited to ten. The number N of the top-N RSs is set to five (i.e., top-5 recommendation).

Cross-Validation
When a data set is not officially divided into training and test sets, but such sets are needed to evaluate the generalization properties of the recommendation methods, cross-validation is a standard solution. In our experiments, ten-fold cross-validation was performed. We briefly describe the cross-validation here. In ten-fold cross-validation, the entire data are divided into ten groups of approximately the same size. In the first validation, the first group is kept as the test set or validation set while the nine remaining groups are used as the training set. In our case, the data in the training set are used to create the clustered image. The derived clustered image is then used to provide recommended items to the customers in the test set. The recommendation evaluation is performed on this test set. The process is repeated with each of the remaining groups in turn, giving ten validations in total. Hence, every sample is used as test data exactly once, in a validation whose training process never used its information. In this study, there are ten values for each evaluation measure from the ten validations. For each evaluation measure, we report the results in terms of the average of those ten values.
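The splitting procedure can be sketched as follows. The shuffling with a fixed seed and the round-robin assignment of users to folds are assumptions made for this illustration; the text only requires ten groups of approximately equal size.

```python
import random

def k_fold_splits(user_ids, k=10, seed=0):
    """Yield (training users, test users) for each of the k validations."""
    ids = list(user_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::k] for i in range(k)]  # round-robin assignment
    for i in range(k):
        train = [u for j, fold in enumerate(folds) if j != i for u in fold]
        yield train, folds[i]

# 110 users, as in the selected KDD data set.
splits = list(k_fold_splits(range(110)))
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 10 99 11
```

In each validation, the clustered image would be built from the training users only, and the averaged precision, recall, and F1 over the ten test folds would be reported.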

Clustering Results on Synthetic Data Sets
In real-world data sets, it is extremely difficult or impossible to evaluate whether a clustering method is able to properly cluster the users and items. Hence, three synthetic data sets were created to represent data sets with known ground truth. Figures 4a, 5a and 6a show the original binary images containing three, five, and seven clusters, respectively. It should be noted that all three original images are actually binary. The gray-level versions are shown here so that we can visualize the clustering performances. To indicate whether elements are in the same or different clusters, we label elements in the same cluster using the same gray level. For elements in different clusters, the gray levels are different. Figures 4b, 5b and 6b show the corresponding images after randomly interchanging rows and columns. After applying our previously proposed VCM [39], the clustering results are shown in Figures 4c, 5c and 6c, respectively. Furthermore, the results using our presently proposed VCM are shown in Figures 4d, 5d and 6d, respectively. It can be clearly seen that both VCMs achieve three, five, and seven clusters, respectively. Although the shape and location of each cluster are different from the original ones due to changes of rows and columns, the members in each cluster are the same as those in the original image. This emphasizes that the VCMs are able to properly cluster the users and items in a binary image.
Even though the VCMs achieve the correct number of clusters and correct members in each cluster, it is interesting to examine the value of compactness in each case. We consider the summation of compactness values of all objects in each image before and after clustering. It should be noted that, by the definition of compactness in Equation (4), a larger value implies a less compact object. For the three-cluster data set, the total compactness value before clustering (Figure 4b) is 7.639. After clustering, the total compactness values are 3.397 and 4.063 using our previously proposed VCM (Figure 4c) and the presently proposed VCM (Figure 4d), respectively. The results show that our previously proposed VCM yields more compact clusters overall. These results are not surprising because we take the compactness into account in our previously proposed VCM, while it is discarded in the presently proposed VCM. Likewise, for the five-cluster data set, the total compactness values of the image before clustering, after clustering using the previously proposed VCM, and after clustering using the presently proposed VCM are 12.732, 6.448, and 9.376, respectively. The total compactness values for the seven-cluster data set are 17.825, 7.808, and 8.062 for the three scenarios, respectively. All of the results suggest that the compactness of the clusters in each image has been properly taken care of by the previously proposed VCM. On the other hand, the presently proposed VCM yields less compact objects. However, as we mentioned in Section 2.1, a good cluster in the RS is not necessarily compact. That is the idea of the new fitness function in the presently proposed VCM. The recommendation results confirm the validity of this idea, as shown later in Section 4.2.
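To see why a larger compactness value means a less compact object, Equation (4) can be evaluated for a few ideal shapes. The worked values below use continuous geometric perimeters and areas, which is an assumption of this illustration; perimeters measured on pixel grids would differ somewhat.

```latex
% Circle of radius r: the minimum possible value.
\mathrm{Compactness}_{\text{circle}}
  = \frac{(2\pi r)^2}{4\pi \cdot \pi r^2} = 1

% Square of side s.
\mathrm{Compactness}_{\text{square}}
  = \frac{(4s)^2}{4\pi s^2} = \frac{4}{\pi} \approx 1.273

% Elongated 1 x 24 strip (24 units of area, as in a 24-pixel cluster).
\mathrm{Compactness}_{\text{strip}}
  = \frac{(2 \cdot 25)^2}{4\pi \cdot 24} = \frac{625}{24\pi} \approx 8.29
```

A circle attains the value 1, a square is slightly above it, and an elongated strip of the same area as a 24-pixel cluster is several times larger, which is consistent with summed values decreasing as clusters become more compact.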

Clustering Results on Real-World Data Sets
The clustering results using the VCMs on the three synthetic data sets confirm that both proposed VCMs are able to cluster the information in binary images. The VCMs are then applied to cluster the users and items in the four real-world data sets. We show only the clustering results from the first two data sets. Figure 7a shows the original KDD image. The clustering results using our previously proposed VCM and the presently proposed VCM are shown in Figure 7b,c, respectively. Figure 8a shows the original TTS image, whereas the corresponding clustering results using our previously proposed VCM and the presently proposed VCM are shown in Figure 8b,c, respectively. The results clearly show that the VCMs are able to cluster the users and items in the two real-world data sets, as the resulting images have far fewer clusters than the corresponding original images. The information in the derived clusters can be used in the RSs.


Top-5 Recommendation Results
In the recommendation experiments, the top-5 RSs were evaluated on the four real-world data sets. We intentionally created the cold-start problem by considering the recommendation only for new users with one or two items chosen in their baskets. This created a scenario in which the RS does not have enough information about its new users, i.e., the cold-start problem. Moreover, ten-fold cross-validation was performed to assess the generalization of the results. It is worth noting that our previously proposed VCM has never been applied to the recommendation problem. Hence, we investigate its recommendation performance here. The RSs based on the clustered images derived using the VCM in [39] and the presently proposed VCM are called VCR1 and VCR2, respectively. We also proposed hybrid versions of both VCRs by combining each of them with the traditional user-based (UB) and item-based (IB) RSs as described in Section 2.2. The combination of the VCR1 and UB is called the VCR-UB1. Likewise, the combination of the VCR1 and IB is called the VCR-IB1. The terms VCR-UB2 and VCR-IB2 are derived in the same manner, but using the VCR2. We compared the performance of all six proposed methods, i.e., VCR1, VCR-IB1, VCR-UB1, VCR2, VCR-IB2, and VCR-UB2, with the traditional methods, i.e., frequency-based (FB), UB, and IB. Because ten-fold cross-validation was applied, we report the average of each evaluation measure over all ten validations.
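As an illustration of how such measures are typically computed for a top-N list, the sketch below uses the standard set-based definitions of precision, recall, and F1 and then averages over folds. It is a hedged example only: the function names `top_n_scores` and `summarize_folds`, the item labels, and the use of the sample standard deviation are assumptions and are not taken from the original experiments.

```python
from statistics import mean, stdev


def top_n_scores(recommended, relevant):
    """Precision, recall and F1 for one user's top-N list (standard definitions)."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def summarize_folds(fold_values):
    """Average and sample standard deviation of one measure over the folds."""
    return mean(fold_values), stdev(fold_values)


# Hypothetical top-5 list and held-out items for a single active user.
recommended = ["a", "b", "c", "d", "e"]
relevant = ["b", "e", "z"]
precision, recall, f1 = top_n_scores(recommended, relevant)
print(precision, recall, f1)
```

In a ten-fold setting, `summarize_folds` would be called once per measure with the ten per-fold averages.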
For the KDD data set, the customers who purchased at least three items were selected. The items to be added to the basket of the active user were selected at random. We considered two scenarios here. In the first, the user selected one item into the basket and we had to recommend other items. In the second, the user selected two items into the basket and we had to complete the recommendation. Table 2 shows the average precision, average recall, and average F1 of the nine methods on the KDD data set for the two scenarios. The corresponding standard deviation from each ten-fold cross-validation is also shown. For the TTS data set, we selected only the customers who purchased at least two items. Hence, only one item was randomly selected into the basket of the active user, and we then completed the recommendation. Table 3 shows the average precision, average recall, and average F1 of the nine methods on the TTS data set. For the RCM and ECR data sets, we considered the scenarios in which one to five restaurants were chosen. The average precision, average recall, and average F1 of the nine methods on the RCM and ECR data sets are shown in Tables 4 and 5. The recommendation results on the four real-world data sets are similar. The proposed methods, i.e., the VCR1 and VCR2, perform better than the traditional methods, i.e., IB, UB, and FB. On average, VCR2 is the best among all nine methods. This confirms that the proposed VCMs can cluster the data and that the new fitness function works better than the previous one.
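The basket construction described above can be sketched as follows. This is an assumption-laden illustration rather than the procedure actually used: the names `split_basket` and `cold_start_cases`, the dictionary layout of the transactions, and the fixed random seed are all hypothetical.

```python
import random


def split_basket(items, basket_size, rng):
    """Randomly move `basket_size` items into the basket; the rest are held out."""
    items = list(items)
    basket = rng.sample(items, basket_size)
    held_out = [item for item in items if item not in basket]
    return basket, held_out


def cold_start_cases(transactions, basket_size, min_items, seed=0):
    """Keep users with at least `min_items` distinct items and split their items."""
    rng = random.Random(seed)
    cases = {}
    for user, items in transactions.items():
        unique_items = sorted(set(items))
        if len(unique_items) >= min_items:
            cases[user] = split_basket(unique_items, basket_size, rng)
    return cases


# Hypothetical purchase histories; user "u2" is filtered out (too few items).
transactions = {"u1": ["a", "b", "c"], "u2": ["a"], "u3": ["b", "c", "d", "e"]}
cases = cold_start_cases(transactions, basket_size=1, min_items=3)
for user, (basket, held_out) in cases.items():
    print(user, basket, held_out)
```

With `min_items=3` and `basket_size` set to 1 or 2, this mirrors the two KDD scenarios; `min_items=2` and `basket_size=1` mirrors the TTS setting. The held-out items are the ones a recommendation would be scored against.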
Combining the VCR1 with UB or IB yields a better F1 than using the VCR1 alone. In fact, these combinations do not provide much help in the precision. However, they always help in the recall. That is the reason why combining with UB or IB can enhance the recommendation performance of the VCR1. However, these combinations do not work with the VCR2. They sometimes deteriorate the recommendation performance of the VCR2. For the KDD data set, when two items are selected into the basket, the recommendation performance is better than when only one item is selected. This is not surprising because when there are two items in the basket, we have more information about the user-item relationship. As a result, the recommended items are more likely to be chosen than in the case of one-item selection. The same holds for the ECR data set. When two or three restaurants are chosen, better recommendation performance is achieved compared to when only one or two restaurants are chosen. This is also true for the RCM data set; when two restaurants are chosen, better recommendation performance is achieved compared to when only one restaurant is chosen. However, when three restaurants are chosen, the performance worsens. This is due to the fact that most of the customers rated only a handful of restaurants. For example, there are only 38 customers who rated more than 10 restaurants. This makes it more difficult to have the correct items in the top-N recommendation when the number of items in the basket is larger, because more correct items are already placed in the basket and not many are left in the recommendation pool.
We also performed an indirect comparison with other existing methods. The comparison between different methods is not straightforward because the ways of data preprocessing/cleaning are different. For the KDD data set, a cold-start eliminating method with an improved most frequency item-extracting algorithm for top-N recommendation yielded an F-measure value of 0.146 (20 neighbors were used) [48], which is slightly better than what we achieved. It should be noted that the numbers of users and items in [48] are 1426 and 207, respectively, which are fewer than what we have here. For the RCM data set, the fusion method using contextual features in [49] yielded a precision of 0.07-0.08 and a recall of 0.30-0.32. Meanwhile, it can be seen in Table 4 that the proposed methods yielded comparable recall but much better precision. For the ECR data set, we cannot compare our results to other methods directly because we used data from one quarter only due to the computation time constraint. The previous method using contextual information as virtual items in [50] yielded an F1 measure of 0.225 using collaborative filtering and 0.341 using association rules.
The memory required to run each of the proposed methods is unlikely to be a problem here because most of it is used to store the binary images. The memory required for a set of binary images is nIJ bits, where n is the number of binary images (set to 80 in this research), I is the number of users, and J is the number of items. Hence, the memory required to store the binary images during the clustering (training) process for the KDD, TTS, RCM, and ECR data sets is approximately 27 KB, 19 KB, 18 KB, and 235 KB, respectively. For the recommendation (testing) process, only one final binary image is required for each data set. We tested the proposed methods on a computer with an i5 CPU and 4 GB of RAM, running the 64-bit Windows 10 operating system. Recall that 10-fold cross-validation was performed in our experiments; the average computation times to cluster the images in each cross-validation for the KDD, TTS, RCM, and ECR data sets are 4 h 34 min, 4 h 44 min, 4 h 19 min, and 10 h 30 min, respectively. These are the approximate computation times for these data sets in a real-world application. However, in our experiments, the clustering process for each data set was performed ten independent times (ten-fold cross-validation). Therefore, the times taken in our experiments were about ten times those. On the other hand, the recommendation itself is very quick. The computation times for each recommendation for the KDD, TTS, RCM, and ECR data sets are 0.9 ms, 1.0 ms, 1.1 ms, and 2.1 ms, respectively.
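The nIJ estimate can be reproduced with a few lines of code. The sketch below is only a worked illustration of the formula: the user and item counts in the example are hypothetical placeholders (not the sizes of the four data sets), and the conversion assumes 8 bits per byte and 1024 bytes per KB, which the text does not state.

```python
def binary_image_memory_kb(n_images: int, n_users: int, n_items: int) -> float:
    """Memory for n_images binary images of size n_users x n_items (nIJ bits), in KB."""
    bits = n_images * n_users * n_items
    return bits / 8 / 1024  # assumes 8 bits per byte and 1 KB = 1024 bytes


N_IMAGES = 80  # number of binary images used in this research

# Hypothetical sizes chosen for illustration only.
example_kb = binary_image_memory_kb(N_IMAGES, n_users=50, n_items=60)
print(f"{example_kb:.1f} KB")
```

Because the requirement is linear in each of n, I, and J, doubling the number of users or items doubles the storage needed during clustering, while the recommendation stage needs only one image (n = 1).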

Conclusions
To solve the cold-start and sparsity problems in recommender systems (RSs), we have developed recommendation methods based on clustered user-item images. In this study, we improved a clustering method by changing the fitness function in the genetic algorithm and applied it to the recommendation problem. The proposed clustering methods worked very well on a set of synthetic data sets whose ground truths were known a priori. Even though the clustering results look good visually, it is extremely hard to evaluate whether they work properly in more complicated scenarios such as the real-world recommendation data sets. We, therefore, further applied the results from the clustering methods to generate top-5 recommendations for active users with VCR1 and VCR2. We also combined both methods with the user-based method (UB) or item-based method (IB).
Four real-world data sets were used, i.e., the purchasing transactions collected by Blue Martini Software for KDD-CUP2000 (KDD), the purchasing transactions at the Thaiherbs-Thaimassage shop (TTS) [47], the restaurant and consumer data set (RCM) collected by the Department of Computer Science, National Center for Research and Technological Development in Mexico, and the entree Chicago restaurant (ECR) data set collected by the University of California, Irvine. The proposed methods were tested on all four real-world data sets using ten-fold cross-validation. The evaluation measures used here included precision, recall, and F1 score. The proposed VCRs and their respective combinations with UB and IB were compared with three traditional methods, i.e., the frequency-based, user-based, and item-based methods. The recommendation results showed that the VCR2 was the best among all nine methods tested. This confirms that changing the fitness function yields better clustering results. The combinations with UB or IB help the VCR1, but deteriorate the performance of the VCR2. In a real application, the actual purchase by a new user can be used directly to update the current clusters by setting the pixels at the corresponding user and items to one. The purchasing frequencies of those items are also updated. Therefore, these clusters can be updated without reclustering the entire data set. This proposed method is very useful for e-commerce applications. The results from the VCR2 are currently used for the recommendation at the website of the Thaiherbs-Thaimassage shop.
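The incremental update mentioned above can be pictured with a small sketch. It assumes the clustered image is held as a list of rows, that the row index of the user is already known, and that the frequency of an item is incremented once per recorded purchase; the function name `record_purchase` and this data layout are hypothetical and not taken from the deployed system.

```python
def record_purchase(image, item_counts, user_row, item_cols):
    """Set the user's pixels for the purchased items to one and update item frequencies."""
    for col in item_cols:
        image[user_row][col] = 1  # mark the user-item interaction in the binary image
        item_counts[col] += 1     # assumed: one increment per recorded purchase


# Hypothetical 3-user x 4-item clustered image with no interactions yet.
image = [[0, 0, 0, 0] for _ in range(3)]
item_counts = [0, 0, 0, 0]

record_purchase(image, item_counts, user_row=1, item_cols=[0, 2])
print(image)
print(item_counts)
```

Only the affected pixels and counters change, which is why no reclustering of the whole image is needed for such an update.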
Unlike other clustering methods proposed for recommendation problems, our proposed methods provide a visualization of user-item interactions in the clustered images. Each cluster possesses many useful characteristics that the system owners can extract and use later on. We hope that the clustered images will cull a great deal more new information from these transactions/ratings than is currently possible with existing methods. It should be noted that it is theoretically feasible to implement the proposed methods on an extremely large data set, i.e., millions of customers and millions of items. However, it is not presently practical due to technological limitations. Judging from the speed of computation machine development, however, we believe that more powerful computers or a new kind of computation machine that can handle such complexity will be available on the market in the near future.

Figure 1.Model of the proposed top-N recommendation systems.

Figure 3. Example of the initial population.

Figure 5. (a) Original 5-cluster binary image; (b) Row-column-interchanged binary image; (c) Result of Figure 5b using our previously proposed VCM; (d) Result of Figure 5b using our presently proposed VCM.

Figure 6. (a) Original 7-cluster binary image; (b) Row-column-interchanged binary image; (c) Result of Figure 6b using our previously proposed VCM; (d) Result of Figure 6b using our presently proposed VCM.

Figure 7. (a) Original KDD image; (b) Clustering result of Figure 7a using our previously proposed VCM; (c) Clustering result of Figure 7a using our currently proposed VCM.

Figure 8. (a) Original TTS image; (b) Clustering result of Figure 8a using our previously proposed VCM; (c) Clustering result of Figure 8a using our currently proposed VCM.

Table 1. Sample chromosomes and fitness values.

Table 2. Recommendation performance comparison on the KDD data set when one and two items are selected into the basket (average ± standard deviation, evaluated on test sets of 10-fold cross-validation).

Table 3. Recommendation performance comparison on the TTS data set when one item is selected into the basket (average ± standard deviation, evaluated on test sets of 10-fold cross-validation).