Further Improvement on Two-Way Cooperative Collaborative Filtering Approaches for Binary Market Basket Data

Two-way cooperative collaborative filtering (CF) has been known to be crucial for binary market basket data. We propose an improved two-way logistic regression approach, a Pearson correlation-based score, a random forests (RF) R-square-based score, an RF Pearson correlation-based score, and a CF scheme based on the RF R-square-based score. The main idea is to utilize as much predictive information as possible within the two-way prediction in order to cope with the cold-start problem. Experimental results show that all of the proposed methods work better than the existing two-way cooperative CF approach.


Introduction
User similarity measures in collaborative filtering (CF) are crucial for recommendations [1,2]. Pearson correlation is one of the most well-known user-item similarity measures in CF. Ahn [3] developed a new similarity measure for a cold-start problem with data sparsity, where many voting scores are missing. This cold-start problem is common in CF [3][4][5][6][7]. For the cold-start problem, Liu et al. [8] modified Ahn's user-item similarity measure by using nearest neighbors. Son [9] compared the existing user-item similarity measures that tackle the cold-start problem.
A variety of CF approaches can be categorized into user-based CF using the user similarity measures, model-based CF using data mining approaches, and hybrid CF combining CF with content-based filtering. Breese et al. [10] developed the user-based CF leveraging on the Pearson correlation, which has become one of the most widely used user-based CF approaches. In it, similarities between active users and existing users are considered for the predicted scores of test data. The user-based CF leveraging on the Pearson correlation is convenient and easy to implement. Ahn [3] and Choi and Suh [11] used the user-based CF leveraging on the Pearson correlation for predicting voting scores. By contrast, model-based CF methodologies have leveraged data mining approaches, such as Bayesian network, clustering, regression, classification, and association rule, among others [2,12–14]. Stai et al. [15] developed a hybrid recommender system by using both collaborative and content-based filtering in multimedia information retrieval. CF also can be combined with knowledge-based filtering to improve its performance [16]. Many other hybrid approaches have appeared as data become easily available from complex social networks [17].
Mild and Reutterer [18] proposed using the Pearson correlation-based approach rather than the user-based CF leveraging on the Pearson correlation for binary market basket data [19]. Whereas the binary user-item matrix is used for the user-based CF for the Pearson correlation-based approach, the binary item-user matrix can be considered for the item-based CF for the Pearson correlation-based approach [20]. Recently, Hwang [20] proposed a feature selection approach to improve the Pearson correlation-based approach.

Existing CF Approaches
This section briefly reviews the previous studies, including both the one-way and the two-way CF approaches. Although we follow the conventional notation used in the recommender systems literature, we summarize the main symbols in Table 1 for readers' convenience. Some of these symbols are also used in Section 3, which explains the proposed methods.

Table 1. Main symbols.
w(a, i)            similarity between users a and i
w(b, j)            similarity between items b and j
P_aj, P_bi         predicted scores by user-based and item-based CF
v̂_j, û_i           predicted scores by regression
PP(P_aj, P_bi)     Pearson correlation-based score
Prsq(v̂_j, û_i)     RF R-square-based score
PP(v̂_j, û_i)       RF Pearson correlation-based score

One-Way Pearson Correlation-Based Approaches
The Pearson correlation-based approach can use either user-item similarities or item-user similarities, where either the user-based CF or the item-based CF is considered. For the user-based CF, V = (v_1, . . . , v_j, . . . , v_m) = (v_ij), (i = 1, 2, . . . , n; j = 1, 2, . . . , m), represents the binary user-item matrix shown in Figure 1a, which comprises ones (representing purchased items) and zeros (representing non-purchased items). Mild and Reutterer [18] expressed the predicted score P_aj for an active user a and an item j as

P_aj = v̄_a + Σ_{i=1}^{n} w(a, i)(v_ij − v̄_i) / Σ_{i=1}^{n} |w(a, i)|,   (1)

where v̄_a and v̄_i are the mean scores of users a and i. Here, the Pearson correlation denoted by w(a, i) represents a user-item similarity for the user-based CF. On the contrary, we can consider the binary item-user matrix in Figure 1b, where item-user similarities are used for the item-based CF [20]. Then, the predicted voting score P_bi for an active item b and a user i is denoted by

P_bi = v̄_b + Σ_{j=1}^{m} w(b, j)(v_ij − v̄_j) / Σ_{j=1}^{m} |w(b, j)|,   (2)

where v̄_b and v̄_j are the mean scores of items b and j. Here, the Pearson correlation denoted by w(b, j) represents an item-user similarity.
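As a concrete illustration, the following is a minimal NumPy sketch of the user-based prediction in (1); the function and variable names are ours, and similarities that are undefined (constant vectors) are taken as zero, in line with the cold-start convention proposed later.

```python
import numpy as np

def pearson_sim(x, y):
    # Pearson correlation between two rating vectors; 0.0 when undefined
    xd, yd = x - x.mean(), y - y.mean()
    denom = np.sqrt((xd ** 2).sum() * (yd ** 2).sum())
    return float(xd @ yd / denom) if denom > 0 else 0.0

def predict_user_based(V, active, j):
    # Eq. (1): mean score of the active user plus correlation-weighted
    # deviations of the existing users' scores on item j
    num = den = 0.0
    for i in range(V.shape[0]):
        w = pearson_sim(active, V[i])
        num += w * (V[i, j] - V[i].mean())
        den += abs(w)
    return float(active.mean() + (num / den if den > 0 else 0.0))
```

The item-based prediction in (2) is obtained by applying the same function to the transposed (item-user) matrix.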

One-Way RF Regression Approaches
Note that V = (v_1, . . . , v_j, . . . , v_m) is the binary user-item matrix in Figure 1a. Then, the RF item modeling can be considered by

v̂_j = RF(v_1, . . . , v_{j−1}, v_{j+1}, . . . , v_m),   (3)

where v_j is the binary user-item matrix vector representing an item j [20]. To calculate the predicted voting scores, the active users are considered as test data. On the contrary, we can consider the RF user modeling [20]. Then, the voting score of an active item b, for a user i, is calculated by

û_i = RF(u_1, . . . , u_{i−1}, u_{i+1}, . . . , u_n),   (4)

where U = (u_1, . . . , u_i, . . . , u_n) is the binary item-user matrix, and u_i is a vector representing a user i. This approach is known as RF user modeling [20].
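Using scikit-learn's RandomForestRegressor as a stand-in for the R randomForest package used in [20], the RF item modeling in (3) can be sketched as follows; all names are ours.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def rf_item_model(V, j, n_trees=100, seed=0):
    # Eq. (3): response is item column j, predictors are the remaining
    # item columns; returns the fitted forest
    X = np.delete(V, j, axis=1)
    y = V[:, j]
    rf = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
    rf.fit(X, y)
    return rf

def rf_item_predict(rf, active_rows, j):
    # predicted voting scores for active users (test data)
    return rf.predict(np.delete(active_rows, j, axis=1))
```

The RF user modeling in (4) is the same construction applied to the transposed (item-user) matrix.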

One-Way PCA+LR Approaches
Lee et al. [22] considered the first k principal components of the binary user-item matrix as predictors for the binary logistic regression model. Note that V = (v_1, . . . , v_j, . . . , v_m) is the binary user-item matrix. When the first k principal components, pc_v1, pc_v2, . . . , pc_vk, are given, the PCA+LR item modeling can be considered by

v̂_j = exp(β_0 + β_1 pc_v1 + · · · + β_k pc_vk) / (1 + exp(β_0 + β_1 pc_v1 + · · · + β_k pc_vk)),   (5)

where v_j is a vector representing an item j. On the contrary, we can consider the PCA+LR user modeling [20]. Then, the voting score of an active item b, for a user i, is represented as

û_i = exp(γ_0 + γ_1 pc_u1 + · · · + γ_k pc_uk) / (1 + exp(γ_0 + γ_1 pc_u1 + · · · + γ_k pc_uk)),   (6)

where U = (u_1, . . . , u_i, . . . , u_n) is the binary item-user matrix whose first k principal components, pc_u1, . . . , pc_uk, serve as predictors, and u_i is a vector representing a user i.
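A sketch of the PCA+LR item modeling with scikit-learn; the choice of k and all names are ours, and the active rows are scored with the fitted components and coefficients.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

def pca_lr_item_model(V, j, k=3):
    # Eq. (5): response is item column j; predictors are the first k
    # principal components of the remaining item columns
    X = np.delete(V, j, axis=1)
    y = V[:, j].astype(int)
    pca = PCA(n_components=k).fit(X)
    lr = LogisticRegression().fit(pca.transform(X), y)
    return pca, lr

def pca_lr_item_predict(pca, lr, active_rows, j):
    # predicted voting scores: estimated purchase probabilities
    X = np.delete(active_rows, j, axis=1)
    return lr.predict_proba(pca.transform(X))[:, 1]
```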

Two-Way Logistic Regression Approach (PCA+LR Two-Way 1)
The Hosmer-Lemeshow Goodness-of-Fit Chi-square statistic is a model adequacy measure for the logistic regression approach. Lee and Olafsson [24] considered this measure to obtain a weighted mean of the PCA+LR item modeling-based prediction and the PCA+LR user modeling-based prediction, where the two weights are the Hosmer-Lemeshow Goodness-of-Fit Chi-square statistics for the two predictions. Based on (5) and (6), the weighted mean is represented as

P(v̂_j, û_i) = (τ_i · v̂_j + τ_j · û_i) / (τ_i + τ_j),   (7)

where τ_i is the Hosmer-Lemeshow Goodness-of-Fit Chi-square statistic for the PCA+LR user modeling-based prediction, and τ_j is the Hosmer-Lemeshow Goodness-of-Fit Chi-square statistic for the PCA+LR item modeling-based prediction. Because a lower statistic indicates a better fit, each prediction is weighted by the statistic of the other model.

Proposed Two-Way Cooperative CF Approaches
The two-way CF scheme combining the user-based and item-based predictions is illustrated in Figure 2, where their moving directions for taking necessary information are orthogonal [24]. Then, we calculate a weighted average of the user-based and item-based predictions considering their contributions estimated by the Hosmer-Lemeshow Goodness-of-Fit Chi-square statistic, the Pearson correlation, and the R-square value.

Improved Two-Way Logistic Regression Approach (PCA+LR Two-Way 2)
For the extreme high-dimensional cold-start problem, where either the row of an active user or the column of an active item in the market basket data is all zeros, the Hosmer-Lemeshow Goodness-of-Fit Chi-square statistic is not available (NaN, from 0/0) in the R package ResourceSelection, which worsens the performance of the PCA+LR two-way 1.

To resolve this problem, we propose that in (7), τ_j becomes zero when the Hosmer-Lemeshow Goodness-of-Fit Chi-square statistic is NaN (0/0) for the PCA+LR item modeling-based prediction, whereas τ_i becomes zero when the Hosmer-Lemeshow Goodness-of-Fit Chi-square statistic becomes NaN (0/0) for the PCA+LR user modeling-based prediction.

The Hosmer-Lemeshow Goodness-of-Fit Chi-square statistic is a Pearson goodness-of-fit statistic in which the number of observed zeros and the number of expected zeros in a group are considered. For the extreme high-dimensional cold-start problem, the binary classification problem is easily fitted as a one-class classification problem, so the number of observed zeros and the number of expected zeros can both be zero, such that the Hosmer-Lemeshow Goodness-of-Fit Chi-square statistic becomes NaN (0/0). The lower the Hosmer-Lemeshow Goodness-of-Fit Chi-square statistic, the better the model fit. Thus, we propose to set the statistic to zero for the extreme high-dimensional cold-start problem.
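A minimal sketch of this NaN handling, assuming the weighted-mean form of (7); the plain average when both statistics end up zero is our assumption for completeness.

```python
import math

def pca_lr_two_way_2(v_hat_j, u_hat_i, tau_i, tau_j):
    # PCA+LR two-way 2: a NaN Hosmer-Lemeshow statistic (the 0/0 extreme
    # cold-start case) is replaced by zero, i.e., treated as a perfect fit
    tau_i = 0.0 if math.isnan(tau_i) else tau_i
    tau_j = 0.0 if math.isnan(tau_j) else tau_j
    if tau_i + tau_j == 0.0:
        # both fits perfect: plain average (assumption)
        return 0.5 * (v_hat_j + u_hat_i)
    # each prediction weighted by the other model's statistic (lower = better)
    return (tau_i * v_hat_j + tau_j * u_hat_i) / (tau_i + tau_j)
```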

Pearson Correlation-Based Score
In (1) and (2), the sums of the absolute values of the Pearson correlations are normalized to their total in order to consider the proportions of the contributions. Then, the Pearson correlation-based score for two-way cooperative CF is defined as a weighted mean of P_aj and P_bi as follows:

PP(P_aj, P_bi) = ( Σ_j |w(b, j)| · P_aj + Σ_i |w(a, i)| · P_bi ) / ( Σ_j |w(b, j)| + Σ_i |w(a, i)| ).   (8)

The first weight, for P_aj, is the proportion of the sum of the absolute values of the Pearson correlations between an active item b and an existing item j. Since the sum of the absolute values of the Pearson correlations reveals the importance of the prediction, the two proportions reasonably assign the importance of the prediction to the two predictions, P_aj and P_bi. For the extreme high-dimensional cold-start problem, where either the row of an active user or the column of an active item in the market basket data is all zeros, we propose that the corresponding Pearson correlations be taken as zeros, because they cannot be calculated and the correlations between the two variables are low. Then, the weighted mean can be reasonably calculated because both P_aj and P_bi, the two predictions obtained by the user-based CF and by the item-based CF, become available.
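A sketch of this score with the zero-for-NaN convention just described; the names and the plain-average fallback when no correlations are usable are ours.

```python
import numpy as np

def pearson_two_way_score(P_aj, P_bi, w_item, w_user):
    # Pearson correlation-based score: weighted mean of the user-based
    # prediction P_aj and the item-based prediction P_bi.
    # w_item: correlations w(b, j); w_user: correlations w(a, i).
    # NaN correlations (extreme cold-start case) count as zero.
    s_item = np.nansum(np.abs(w_item))  # weight assigned to P_aj
    s_user = np.nansum(np.abs(w_user))  # weight assigned to P_bi
    if s_item + s_user == 0.0:
        # no usable correlations: plain average (assumption)
        return 0.5 * (P_aj + P_bi)
    return (s_item * P_aj + s_user * P_bi) / (s_item + s_user)
```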

RF R-Square-Based Score and RF Pearson Correlation-Based Score
For the RF item modeling and the RF user modeling, we consider the averages of the R-square (rsq) values of the RF regression approach to calculate a two-way cooperative score, because they represent model adequacy. Based on (3) and (4), the RF R-square-based score is defined by

Prsq(v̂_j, û_i) = ( item-rsq · v̂_j + user-rsq · û_i ) / ( item-rsq + user-rsq ),   (9)

where item-rsq = (1/T_i) Σ rsq_i is the average of the R-square values rsq_i of the T_i regression trees for the RF item modeling, and user-rsq = (1/T_j) Σ rsq_j is the average of the R-square values rsq_j of the T_j regression trees for the RF user modeling. Since the average of the R-square values of the RF regression approach reveals the importance of the prediction, the two proportions reasonably assign the importance of the prediction to the two predictions, v̂_j and û_i.

For the extreme high-dimensional cold-start problem, where either the row of an active user or the column of an active item in the binary market basket data is all zeros, the R-square values can have a negative sign when the mean squared error of the RF approach is greater than the variance of the response variable. Moreover, when the R package randomForest reports that the R-square values are NaN, both the mean squared error of the RF approach and the variance of the response variable are zeros; we then regard the model fit as perfect. As a result, the weighted mean can be reasonably calculated.

Additionally, instead of the averages of the R-square (rsq) values, we can adopt the proportions of the sums of the absolute values of the Pearson correlations for the voting scores, as follows, which is called an RF Pearson correlation-based score in this study:

PP(v̂_j, û_i) = ( Σ_j |w(b, j)| · v̂_j + Σ_i |w(a, i)| · û_i ) / ( Σ_j |w(b, j)| + Σ_i |w(a, i)| ).   (10)
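The weighted mean in (9), and the negative-weight pathology discussed in the experiments, can be checked numerically with the values reported later for case 3 (predictions −0.3204994 and −0.5009333; averages of R-square values 0.251122 and −0.3145476):

```python
def rf_rsq_score(v_hat_j, u_hat_i, item_rsq, user_rsq):
    # Eq. (9): predictions weighted by their own average R-square values
    return (item_rsq * v_hat_j + user_rsq * u_hat_i) / (item_rsq + user_rsq)

# Values reported for case 3: a negative user-rsq pushes the "weighted
# mean" outside the interval spanned by the two predictions.
score = rf_rsq_score(-0.3204994, -0.5009333, 0.251122, -0.3145476)
# score is about -1.215, outside [-0.5009333, -0.3204994]
```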

Scheme for RF R-Square-Based Score
For the extreme high-dimensional cold-start problem, the RF R-square-based score depends on an ad hoc approach considering even the inaccurately calculated averages of the R-square (rsq) values. By leveraging only the accurately calculated averages of the R-square (rsq) values, we can improve the performance of the RF R-square-based score. We first consider both the average of the R-square values for the RF item modeling (item-rsq) and that for the RF user modeling (user-rsq) in (9). Indeed, we modify (9) according to the availabilities and the signs of item-rsq and user-rsq. The pseudocode of the proposed method is depicted below.
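The pseudocode did not survive into this version of the text, so the following Python sketch reconstructs the scheme from the description above; the exact branching (in particular the order of the NaN checks and the fallback when neither average is usable) is our assumption.

```python
import math

def rf_scheme_score(v_hat_j, u_hat_i, item_rsq, user_rsq):
    # Scheme for the RF R-square-based score: use only accurately
    # calculated averages. A NaN average (0/0 in randomForest) is treated
    # as a perfect fit; a negative average disqualifies that side.
    if math.isnan(item_rsq):
        return v_hat_j                 # item model fits perfectly
    if math.isnan(user_rsq):
        return u_hat_i                 # user model fits perfectly
    item_ok, user_ok = item_rsq > 0, user_rsq > 0
    if item_ok and user_ok:            # both usable: Eq. (9)
        return (item_rsq * v_hat_j + user_rsq * u_hat_i) / (item_rsq + user_rsq)
    if item_ok:
        return v_hat_j                 # drop the inaccurate user side
    if user_ok:
        return u_hat_i                 # drop the inaccurate item side
    return 0.5 * (v_hat_j + u_hat_i)   # neither usable: plain average (assumption)
```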

Computational Complexity Analysis
We usually analyze the computational complexity of recommender systems by considering two parts: the computation time for model construction and that for one rating prediction. Based on a binary user-item matrix of size n × m, the computational complexities of the user-based CF are O(n²mk) for model construction and O(k) for one rating prediction. The former is for calculating similarities among users, and the latter is for making a prediction using k neighbors. Likewise, the computational time of the item-based CF can be estimated as O(nm²k) and O(k). It is obvious that our approaches require more computational time for model construction because we employ statistical learning algorithms. The computational complexities are O(nm) for logistic regression and O(min(n³, m³)) for principal component analysis. The CART (classification and regression tree) algorithm has a complexity of O(mn log n) in the worst case, which means that the depth of a tree is n. If we build s trees with t randomly chosen variables at each split, the complexity of random forests becomes O(stn log n). Notice that the actual times for training prediction models can be reduced by performing PCA because we use fewer input variables than m. Although our methods need more computational time for model construction than the existing CF approaches, their prediction complexity is O(1), i.e., constant time, because they do not use k neighbors. As a result of this small complexity for rating prediction, our methods are suitable for online recommendation, as other model-based approaches are. In our numerical experiments in the next section, the item-based CF took 0.05 s and our methods took 0.02 s for one rating prediction, although the PCA+LR item modeling and the RF item modeling took 5.29 s and 47.57 s, respectively, for model construction.

Experimental Settings
Based on the experimental settings used by Mild and Reutterer [18] and Lee et al. [22], we consider both the Groceries dataset (arules R package) and the EachMovie dataset (https://grouplens.org/datasets/eachmovie/, accessed on 5 September 2004). The Groceries dataset contains 9835 transactions over 169 categories, collected for 30 days from a grocery store [25]. The first 20 users are selected as existing users and the next 980 users as active users, each with 168 categories. "Whole milk" is chosen as a new item. We consider the classification error, recall, and precision of the predicted values against the actual values to evaluate the prediction performance.
The EachMovie dataset comprises 72,916 users and 1628 movies with 2,811,983 ratings, where a six-point scale with [0.0, 0.2, 0.4, 0.6, 0.8, 1.0] is considered. The ratings are converted into binary scales, and the experimental settings are used [22], where future responses or non-responses can be predicted for target marketing.
From the EachMovie dataset, 604 existing users and 207 movies (case 1), 150 existing users and 150 movies (case 2), 10 existing users and 100 movies (case 3), and 604 existing users and 20 movies (case 4) are randomly chosen for the section A × C in Figure 1a.
Corresponding to the selected existing users and movies, 121 active users and 207 movies, 50 active users and 150 movies, 90 active users and 100 movies, and 121 active users and 20 movies are randomly selected for the section B × C in Figure 1a. Finally, 100 movies for new items (D in Figure 1a) are randomly chosen for 10 existing users and 100 movies, whereas 50 movies for new items (D in Figure 1a) are randomly chosen for the other cases. We consider Top-1, Top-2, . . . , and Top-10 accuracies as our performance measure [22]. For example,

Top-10 accuracy = (the number of actual Top-10 items among the first ten items recommended by a CF scheme) / 10.
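Under the definition above, Top-N accuracy can be computed as in the following sketch (names ours):

```python
def top_n_accuracy(scores, actual, n):
    # fraction of the n items with the largest predicted scores that
    # were actually chosen (actual[k] == 1)
    ranked = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return sum(actual[k] for k in ranked[:n]) / n
```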

Grocery Dataset
A cutoff value with a minimal classification error is chosen. For the best cutoff values, Table 2 presents the classification error, precision, recall, and F1 score of the CF approaches. As shown in Table 2, in terms of classification error, the RF Pearson correlation-based score is the best, whereas the RF item modeling and RF user modeling are the best in terms of precision. Regarding recall and F1 score, the PCA+LR item modeling works better than the other approaches, but its precision is the lowest. Most significantly, the PCA+LR two-way 1 and PCA+LR two-way 2 fail to provide two-way predictions because of the high-dimensional cold-start problem. By contrast, the Pearson correlation-based score improves the classification error and precision of the user-based CF and item-based CF, whereas the RF Pearson correlation-based score improves the classification error, recall, and F1 score of the RF item modeling and RF user modeling. In conclusion, the two-way logistic regression approaches are outperformed by the proposed Pearson correlation-based score and RF Pearson correlation-based score.

Eachmovie Dataset
We calculate the Top-N accuracies for the approaches. Table 3 summarizes the Top-N accuracy for case 1, where we can effectively check the recommendation performance by manipulating N. The Top-N accuracy, ranging from 0 to 1, has been widely used for evaluating recommendation performance because N can be selected by recommender system managers, who are interested in how many items among the recommended ones would actually be chosen by users [18][19][20][21][22][23][24]. The bold numbers in the table indicate the best performances. In case 1, for the PCA+LR item modeling and PCA+LR user modeling, the PCA+LR two-way 1 performs the best for Top-8, Top-9, and Top-10, whereas the PCA+LR user modeling is the best for Top-1, Top-6, and Top-7, and the PCA+LR item modeling is the best for Top-1 to Top-5. For the user-based CF and item-based CF, the Pearson correlation-based score performs the best for Top-2, Top-3, Top-9, and Top-10, whereas the user-based CF performs the best for Top-1 and the item-based CF does for Top-4 to Top-9. For the RF item modeling and RF user modeling, the RF R-square-based score performs the best for Top-1 and Top-4 to Top-10, whereas the RF user modeling performs the best for Top-3 and the RF item modeling does for Top-1 and Top-2. For the two-way cooperative CF, the Pearson correlation-based score and the RF R-square-based score provide the best average of the ten Top-N accuracies. Therefore, we realize that the Pearson correlation-based score as well as the RF R-square-based score works more effectively than the PCA+LR two-way 1. Note that there are 604 users and 207 items in section A × C in Figure 1a.

In case 2, as shown in Table 4, for the PCA+LR item modeling and PCA+LR user modeling, the PCA+LR two-way 1 performs the best for Top-2, Top-3, Top-9, and Top-10, whereas the PCA+LR user modeling is the best for Top-1 and Top-4 to Top-8 and the PCA+LR item modeling is the best for Top-1.
For the user-based CF and item-based CF, the Pearson correlation-based score performs the best for Top-1 to Top-8, whereas the user-based CF performs the best for Top-2 to Top-4, Top-7, Top-9, and Top-10; the item-based CF is outperformed by the two approaches. The RF R-square-based score performs the best for Top-1 to Top-9, whereas the RF item modeling performs the best for Top-1 to Top-3, Top-5, Top-6, and Top-10; the RF user modeling is outperformed by the two approaches. For the two-way cooperative CF, the Pearson correlation-based score and the RF R-square-based score provide the best average of the ten Top-N accuracies.
Therefore, we conclude that both the Pearson correlation-based score and the RF R-square-based score work more effectively for the two-way cooperative CF than the PCA+LR two-way 1. Note that there are 150 users and 150 items in section A × C in Figure 1a.
In case 3, as shown in Table 5, the PCA+LR item modeling performs better than the other approaches for all the Top-N accuracies. The PCA+LR two-way 2 seems not to outperform the PCA+LR user modeling, although it clearly outperforms the PCA+LR two-way 1. For the user-based CF and item-based CF, the Pearson correlation-based score and the item-based CF are outperformed by the user-based CF for all the Top-N accuracies. The Pearson correlation-based score does not seem to work well. The RF R-square-based score is outperformed by the RF user modeling and the RF item modeling. Note that there are only 10 users in section A × C in Figure 1a. In this case, the columns of some active items in the market basket data are all zeros, which is the extreme high-dimensional cold-start problem. Then, the average of the R-square values can have a negative sign, which can lead to bad prediction performance.
For instance, we randomly select a test observation, where 1 denotes a purchased item and −1 denotes a non-purchased item, and the predicted values of the item modeling and the user modeling range from −1 to 1. The predicted value of the item modeling is −0.3204994, and the predicted value of the user modeling is −0.5009333. The average of the R-square values of the item modeling is 0.251122, and the average of the R-square values of the user modeling is −0.3145476, which has a negative sign. Then, the weighted average calculated based on (9) is −1.215328, which does not make sense because it does not fall between −0.5009333 and −0.3204994. Therefore, the RF R-square-based score does not work well in this case.
Instead of the RF R-square-based score, we apply the Pearson correlation-based score to the RF item and user modeling. For Top-1 and Top-4 to Top-6, the RF Pearson correlation-based score performs the best and is close to the RF user modeling or the RF item modeling for the other Top-N accuracies. Moreover, the RF Pearson correlation-based score gives the best average of the ten Top-N accuracies. As a result, we realize that the RF Pearson correlation-based score works better for the two-way cooperative CF than the RF R-square-based score.
For case 4, as shown in Table 6, the PCA+LR item modeling performs better than the other approaches for all the Top-N accuracies. The PCA+LR two-way 2 does not seem to outperform the PCA+LR item modeling, although it clearly outperforms the PCA+LR two-way 1. The PCA+LR two-way 1 does not even provide appropriate predicted values. For the user-based CF and item-based CF, the user-based CF and the Pearson correlation-based score outperform the item-based CF for all the Top-N accuracies. The Pearson correlation-based score does not seem to perform the best, except for Top-1 and Top-8. The RF R-square-based score is outperformed by the RF user modeling and the RF item modeling. Note that there are only 20 items in section A × C in Figure 1a. In this case, the rows of some active users in the binary market basket data are all zeros, which is the extreme high-dimensional cold-start problem. Then, the average of the R-square values can have a negative sign, which can lead to bad prediction performance.

For further analysis, we randomly select 10 test data and respectively calculate the predicted values for the RF user modeling, the RF item modeling, and the RF R-square-based score, as shown in Figure 3, where 1 denotes a purchased item and −1 denotes a non-purchased item. Although the RF R-square-based score is a weighted average of the RF item modeling-based prediction and the RF user modeling-based prediction, the first, second, third, and seventh observations violate the requirement that the weighted mean should fall between the predicted value of the RF item modeling and the predicted value of the RF user modeling, as illustrated in Figure 3, because the averages of the R-square values have negative signs. As a result, the RF R-square-based score does not work well. Instead, we apply the Pearson correlation-based score to the RF item modeling and the RF user modeling.
For Top-1, Top-3, Top-4, and Top-10, the RF Pearson correlation-based score performs the best and is close to the item modeling for the other Top-N accuracies, as shown in Table 6. Moreover, the RF Pearson correlation-based score gives the best average of the ten Top-N accuracies. Thus, the RF Pearson correlation-based score works better for the two-way cooperative CF than the RF R-square-based score. To understand these matters better, we randomly select 10 test data and calculate the predicted values for the RF user modeling, the RF item modeling, and the RF Pearson correlation-based score (Figure 4), where 1 denotes a response and −1 denotes a non-response. The RF Pearson correlation-based score should be a weighted average of the RF item modeling-based prediction and the RF user modeling-based prediction. In this case, no observations violate this requirement. In other words, the RF Pearson correlation-based scores always fall between the predicted value of the RF item modeling and the predicted value of the RF user modeling. Thus, the RF Pearson correlation-based score works more effectively than the RF R-square-based score for the two-way cooperative CF.

Additionally, although the proposed CF scheme for the RF R-square-based score in Section 3 requires more procedures, it dramatically improves the prediction performance of the RF R-square-based score, as shown in Table 6. As illustrated in Figure 5, the proposed CF scheme emulates the RF Pearson correlation-based score. Indeed, the average of the ten Top-N accuracies for the proposed CF scheme, 0.7737, is greater than that for the Pearson correlation-based score, 0.7696. We realize that the proposed CF scheme performs as well as the RF Pearson correlation-based score for the two-way cooperative CF.

Conclusions
In this study, we propose a PCA+LR two-way 2, a Pearson correlation-based score, an RF R-square-based score, an RF Pearson correlation-based score, and a CF scheme for the RF R-square-based score for two-way cooperative CF for binary market basket data. The experimental results show that the proposed two-way cooperative CF approaches work better than the existing PCA+LR two-way 1. For the Grocery dataset, the PCA+LR two-way 1 does not even provide an appropriate predicted value, which demonstrates that it is clearly outperformed by the Pearson correlation-based score and RF Pearson correlation-based score. For the non-high-dimensional EachMovie dataset, the Pearson correlation-based score as well as the RF R-square-based score clearly improve the accuracy of the one-way approaches, whereas the PCA+LR two-way 1 does not. For the extreme high-dimensional EachMovie dataset, only the RF Pearson correlation-based score and the proposed CF scheme clearly improve the performance of the one-way approaches.
Most significantly, for the first time, we apply the proposed two-way cooperative CF approaches to the Grocery transaction dataset and obtain promising results. Two-way cooperative CF is crucial for binary market basket data; therefore, the proposed two-way cooperative CF approaches would be useful for marketing practitioners. However, the two proposed CF approaches cannot always improve the performance of the one-way CF approaches because the prediction performance depends on the datasets. In our future research, we plan to apply the proposed two-way CF approaches to other domains and employ other supervised learning approaches.