Design and Comparative Analysis of New Personalized Recommender Algorithms with Specific Features for Large Scale Datasets

Abstract: Nowadays, because of the tremendous amount of information that humans and machines produce every day, it has become increasingly hard to choose the most relevant content from a broad range of choices. This research focuses on the design of two different intelligent optimization methods using Artificial Intelligence and Machine Learning for real-life applications that are used to improve the process of generating recommendations. In the first method, modified cluster based intelligent collaborative filtering is applied with sequential clustering that operates on the values of the dataset, the user's neighborhood set, and the size of the recommendation list. This strategy splits the given dataset into different subsets or clusters, and a recommendation list is extracted from each group to construct the better recommendation list. In the second method, a specific features-based customized recommender works in training and recommendation steps by applying a split and conquer strategy on the problem datasets, which are clustered into a minimum number of clusters, and the better recommendation list is created among all the clusters. This strategy automatically tunes the tuning parameter λ, which serves the role of supervised learning in generating the better recommendation list for large datasets. The quality of the proposed recommenders for some large scale datasets is improved compared to some well-known existing methods. The proposed methods work well when λ = 0.5 with the size of the recommendation list |L| = 30 and the size of the neighborhood |S| < 30. For a large value of |S|, the significant difference in the root mean square error becomes smaller in the proposed methods.
For large scale datasets, when the proposed methods are simulated with varying user sizes and the user size exceeds 500, the experimental results show that better values of the metrics are obtained and proposed method 2 performs better than proposed method 1. Significant differences are obtained in these methods because the structure of computation depends on the number of user attributes, λ, the number of bipartite graph edges, and |L|. The better values of the (Precision, Recall) metrics obtained with size 3000 for the large scale Book-Crossing dataset in the proposed methods are (0.0004, 0.0042) and (0.0004, 0.0046), respectively. The average computational time of the proposed methods is < 10 seconds for the large scale datasets, yielding better performance compared to the well-known existing methods.


Introduction
Recommender systems are broadly utilized to assist users in handling the great amount of information available on the web, particularly in searching for the most appropriate content tailored to the specific preferences of the users. Because of the large value of this type of service and the need to guarantee that such interfaces provide the user with highly relevant, high-quality results, the mechanisms for making these recommendations should be updated continuously [1]. Recommendations can be personalized or non-personalized in approach, depending on the user's characteristics. For a personalized recommendation, a list of the best recommended products on the site is provided; a product may be recommended based on an analysis of the user's past behavior or on the statistical advice provided by other users. Non-personalized recommendations, in contrast, are simple to make, as they are independent of the user's actions.
Conventional recommender systems are essentially content-based or Collaborative Filtering (CF) systems [2]. The traditional method utilized for recommendations is CF. Recommender systems based on CF calculate user preferences for products or services by learning past user-item relationships from a group of users sharing the same interests and tastes. An additional standard scheme when designing referral systems is content-based filtering. Content-based filtering schemes rely on the description of the item and a profile of the user's preferences. These schemes are well-matched to situations where the known data of an object (name, location, description, etc.) are present, but the user list is not. Despite the achievements of these two filtration techniques, several drawbacks have been recognized. Most of the issues connected with content-based filtering methods include limited content analysis, over-specialization, and sparsity of data [3]. Moreover, collaborative approaches reveal cold-start, sparsity, and scalability problems. These issues generally decrease the quality of referrals. To alleviate some of the issues identified, hybrid filtration has been proposed, combining two or more filtration methods in various ways to increase the accuracy and efficiency of recommender systems [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18].
Important research in recommender applications is the development of a good recommendation system that is expected to create a better recommendation list based on the specific needs of the users [13,[19][20][21][22][23][24][25][26][27][28][29][30]. To resolve the problems in the existing methods, this research focuses on the design of two different intelligent optimization methods using Artificial Intelligence (AI) and Machine Learning (ML) for real-life applications that are used to improve the process of generating recommendations. In the first method, modified cluster based intelligent CF is applied with sequential clustering that operates on the values of the dataset, the user's neighborhood set, and |L|. In the second method, a specific features-based customized recommender works in training and recommendation steps by applying a split and conquer strategy on the problem datasets, which are clustered into a minimum number of clusters, and the better recommendation list is created among all the clusters. Some of the research gaps in the existing recommender methods are: larger deviations in the performance measurements and high computational complexity, which result in less accuracy in the generation of the recommendation list [13,23,24,[31][32][33][34][35][36][37][38][39]. Hence, it is necessary to design new recommender strategies to offset the issues in the well-known methods when solving real-life applications [6,33,38,[40][41][42][43][44][45][46][47]. Section 2 focuses on the survey of recent recommender algorithms for the generation of a better recommendation list. The requirement for the design of new recommender methods for real-world applications is also discussed in Section 2. The notations and definitions applied in the proposed methods are explained in Section 3. The proposed recommender algorithms are presented in Section 4.
The simulation of the proposed strategies, along with the analysis of the experimental results, is presented in Section 5. The conclusions of this research, with future research areas, are discussed in Section 6.

User Profile Orientation Recommenders
A few personalized recommenders have been developed recently for real-world applications. The User Profile Oriented Diffusion (UPOD) strategy was developed to learn the user profile [6]. The UPOD strategy makes customized recommendations through diffusion, integrating the novelty of products with their familiarity, and functions in two phases: training and recommendation. The training phase of UPOD performs operations such as data prediction, defining the values of user features, feature-based clustering of users, creating a bipartite map of interactions, and finding the profile of every cluster. The training phase trains a classifier, which provides the features of the target user. The recommendation phase of UPOD requires an input map, a target user, a set of user attributes, the classifier trained in the training phase, and the size of the recommendation list. In this strategy, the tuning parameter is automatically adjusted to evaluate the amount of mixing in the mass diffusion activity for different sparse datasets. The strategy generates recommendations based on certain features of the user's profile; most importantly, it includes the user's profile information to refine and personally recommend content to the users. This strategy is simulated with the parameters λ = 0.5, the size of the recommendation list |L| = 30, a neighborhood size of 30, k-minimum = 100, and k-maximum = 200.

Content-Based Recommenders
Content-based recommenders using the Convolutional Neural Network (CNN) model have been developed [7]. A CNN can be utilized to find hidden factors from the textual data of media assets. To train the CNN, its input and output must first be fixed. For its input, a language model is utilized; for its output, a latent factor model is constructed, constrained by the L1-norm. In addition, a split Bregman iteration scheme is designed to solve the system. The main advantage of the designed recommendation method is that the text information is utilized directly to perform content-based recommendation without tagging.

Hybrid Recommenders
A hybrid scheme that exploits genomic tags of movies, integrated with a content-based filter, has been developed to recommend related movies [8]. It utilizes Principal Component Analysis (PCA) and correlation coefficient methods to remove tags that are superfluous or explain a minimum proportion of the variance. Using content-based filtering on the average rating of a movie, the designed system suggests the top N movies to the users. E-learning personalization based on a hybrid recommendation strategy and learning style identification has also been discussed [9]. That work describes the recommendation module of a programming training method in which the system automatically adjusts to the learner's interests and knowledge levels. The designed system recognizes different learning styles and learners' habits by testing the learning style of the learners and mining their server records. First, it forms clusters according to the different learning styles. Then, it investigates the habits and interests of learners by mining frequently visited scenes. Finally, the system completes a customized recommendation of the learning content following the continuous visualizations provided by the recommender system.

Filter-Based Recommenders
Personalized travel route recommenders with the help of CF based on Global Positioning System (GPS) trajectories have been designed [10]. The presented methods consider users' personal travel preferences according to their historical GPS routes. In this work, the frequencies of the user's travel behavior are first computed using the collaborative filtering technique. A path with the highest probability of user travel behavior is then computed according to a naive Bayes scheme. The extended version of Collaborative Travel Route Recommendation (CTRR), the CTRR+ scheme, improves the performance of CTRR by considering cold start users and combining distance with the user travel behavior probability. The experimental outcome shows that the introduced scheme attains good performance for travel route recommendations compared with the shortest distance path scheme.

Features-Based Recommenders
The feature-based recommender provides personalized recommendations to users in solving ERP System and E-Agribusiness datasets by performing some configuration functions initially [48]. The evaluation process requires offline evaluation of parametric optimization and splitting of the dataset into disjoint training and test sets. The choice of the method has a great impact on the recommendation quality. Modified CF has been proposed for both user-based and object-based cases in solving the MovieLens dataset [49]. A group recommendation model has been proposed based on factors such as sparsity, dynamics, and timeliness [50]. When additional features are considered, intelligent optimization strategies are further required to minimize the Root Mean Squared Error (RMSE). For large scale datasets, the proposed intelligent optimization methods provide a better recommendation and are competitive with some of the well-known existing methods [51].
Similarity-based targeting, along with a baseline approach and latent factor models treated with an adaptive regularization technique that provides personalization to both users and items, is discussed in [52]. A recommendation system that recommends books to users is presented in [53]; the bottleneck of that paper is the need to test the system with other intelligent techniques to improve its performance. Building a recommendation system for online news is described in [54]. That recommendation strategy mainly focuses on user personalization and browsing history, but it was only a microservice recommendation system. The architecture of an intelligent and autonomous recommendation system, usable in any virtual learning environment to efficiently recommend digital resources, is presented in [55]. The architecture extracts information from the context of the students, identifying variables such as individual learning styles, socioeconomic information, connection log information, and location information, among others. It uses the learning styles of the students, the context information, and the social networks, among other sources, to select the best digital resources; however, the integration of recommendation methods was not analyzed. The approach in [56] consists of functional components that assist in determining users, such as active users, and in finding the value of |L|. This recommender system uses a specialized weight calculation block that assists in placing the various items at different positions of L.
For small values of L, it has been found that the precision metric for the traditional benchmark is very low. A machine learning model that recommends a suitable candidate's resume to human resources based on the given job information is presented in [57]. This model operates in two stages: first, the resume is categorized; then, the recommendation is made based on a similarity index measured against the given job information. This model can be further enhanced by applying deep learning strategies. A system for contextual collaborative recommendation that addresses the issues of n-dimensional contextual complexity models with new users and items is discussed in [58]. This model is simulated in the healthy food field, where only a few proposed methods have taken a research interest, and it used a very small sample of only 524 users. More intelligent techniques can be applied along with large scale datasets [59][60][61].
A recommender framework for personalization and relevance feedback for some online applications was developed with a dataset of 2500 videos in [62]. This framework focuses on the development of enriched multimedia content that is targeted to the user's preferential information. The recommendation is implemented by extracting the video information through the collection of relevance feedback from user interactions. In contrast, the proposed recommender algorithms are designed to provide a better recommendation for large scale datasets with different categories of features such as users, items, interactions, age, location, gender, and country [19]. The proposed recommender algorithms can also be extended to support recommendation based on video information for online applications [63][64][65].
A context-aware video recommender system has been developed to improve recommendation performance by incorporating contextual features along with the conventional user-item ratings used by video recommender systems [66]. A CF algorithm is discussed that confronts the sparsity problem in the resulting graph partitions, which may improve the prediction performance of parallel implementations without strongly affecting their time efficiency [67]. A parallel hardware implementation based algorithm has been developed for embedded CF applications with large datasets [68]. An online recommendation algorithm has been designed that combines clustering and CF techniques to improve the accuracy of online recommendation systems for group-buying applications [69]. The development of a recommender system that uses several algorithms to obtain groupings is discussed in [70].
Nowadays, new recommender algorithms are required for real-world applications for the following reasons [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][21][22][23][24][29][30][31][32][43][44][45][46][47][66][67][68][69][70]: One of the main reasons why recommender systems are needed in modern society is that people have so many ways to use the Internet. For example, Netflix has an enormous collection of movies; despite the increase in the amount of information available, a new problem arose in the difficulty of selecting the items people want to see. A recommender system attempts to assess and predict user content preferences related to games, stories, or videos; the system draws on data usage history to make recommendations based on the user's current interests. In an e-commerce system, recommender systems improve revenue because they are the best way to sell more products. A company with thousands of products will be hard-pressed to hand-pick recommendations for all of them, and such standard recommendations will soon be outdated or inappropriate for many customers; by using a variety of filtering methods, a business can recommend new products that customers may buy (whether on its site, via email, or otherwise). Finally, a recommender system should provide more precise and personalized recommendations than the existing systems, and its outcome should prove the correctness of the recommendation based on the specific needs of the users.

Notations & Definitions
This research focuses on the design of two different intelligent optimization strategies using AI and ML for real-life applications that are used to generate the better recommendation. The proposed recommender algorithms use the following notations and definitions.

Correlation Coefficient

r(x, y), the correlation coefficient between two random variables or users X and Y for n pairs of observations, checks the existence of a linear relationship between them. It is computed using Cov(X, Y), the covariance between the random variables X and Y, and σ_x, σ_y, the standard deviations of X and Y, respectively, and is defined as follows [32]:

r(x, y) = Cov(X, Y) / (σ_x σ_y)

In general, −1 ≤ r(x, y) ≤ 1. The values r(x, y) = −1 and +1 signify perfect negative and perfect positive correlation, respectively. Cov(X, Y) becomes zero when X and Y are independent, in which case r(x, y) = 0. Cov(X, Y) is defined as follows:

Cov(X, Y) = E[(X − E(X))(Y − E(Y))]

The expectation of a random variable X is given by E(X) = x̄.
The means of the random variables X and Y are defined as follows:

x̄ = (1/n) Σ_{j=1}^{n} x_j  and  ȳ = (1/n) Σ_{j=1}^{n} y_j   (4)

The unbiased estimators of σ_x, σ_y are the following:

σ̂_x = sqrt( (1/(n−1)) Σ_{j=1}^{n} (x_j − x̄)² )  and  σ̂_y = sqrt( (1/(n−1)) Σ_{j=1}^{n} (y_j − ȳ)² )
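The definitions above can be sketched directly in code. This is a minimal illustration of the covariance-based correlation coefficient; the function names are illustrative, not from the paper.

```python
import math

def covariance(xs, ys):
    # Cov(X, Y) = (1/n) * sum((x_j - mean_x) * (y_j - mean_y))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    # r(x, y) = Cov(X, Y) / (sigma_x * sigma_y)
    sigma_x = math.sqrt(covariance(xs, xs))
    sigma_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sigma_x * sigma_y)
```

For perfectly linearly related inputs the function returns +1 or −1, matching the bounds stated above.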

Pearson Correlation Coefficient
The correlation coefficient between two users x and y according to Pearson is given by

PearsonCorrCoeff(x, y) = Σ_{j∈I_xy} (r_{x,j} − r̄_x)(r_{y,j} − r̄_y) / ( sqrt(Σ_{j∈I_xy} (r_{x,j} − r̄_x)²) · sqrt(Σ_{j∈I_xy} (r_{y,j} − r̄_y)²) )   (7)

where I_xy defines the set of items rated by both users x and y [32]. The rating values of the jth item are defined as r_{x,j} and r_{y,j} for users x and y, respectively. The average ratings of all items that interacted with users x and y are defined as r̄_x and r̄_y, respectively. PearsonCorrCoeff(x, y) ∈ R and −1 ≤ PearsonCorrCoeff(x, y) ≤ 1.
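Equation (7) can be computed over the co-rated item set I_xy as follows. This is a hedged sketch: the dict-of-ratings data layout is an assumption, and averaging over the co-rated items only is one common convention (some formulations average over all of a user's ratings).

```python
import math

def pearson_corr(ratings_x, ratings_y):
    """Pearson correlation between users x and y over their co-rated items.
    ratings_x, ratings_y: dicts mapping item id -> rating (assumed layout)."""
    common = set(ratings_x) & set(ratings_y)  # the set I_xy
    if not common:
        return 0.0
    mean_x = sum(ratings_x[j] for j in common) / len(common)
    mean_y = sum(ratings_y[j] for j in common) / len(common)
    num = sum((ratings_x[j] - mean_x) * (ratings_y[j] - mean_y) for j in common)
    den = (math.sqrt(sum((ratings_x[j] - mean_x) ** 2 for j in common))
           * math.sqrt(sum((ratings_y[j] - mean_y) ** 2 for j in common)))
    return num / den if den else 0.0
```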

Item Prediction
The prediction of item j (or the rating for item j) for a user x is given by

Prediction(x, j) = r̄_x + Σ_{a∈S} PearsonCorrCoeff(x, a) (r_{a,j} − r̄_a) / Σ_{a∈S} PearsonCorrCoeff(x, a)

where S is the set of users similar to user x with respect to the values of the neighborhood [31]. PearsonCorrCoeff(x, a) defines the similarity between users x and a. The rating value of the jth item is defined as r_{a,j} for user a. The mean rating of all items interacted with by user a is defined as r̄_a. Typical values of the cardinality of the set S lie between 20 and 40 to maximize the item prediction for a better recommendation.
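The neighborhood-based prediction can be sketched as below. The data layout and function names are assumptions; note the denominator uses the absolute similarity, a common stabilization, whereas the formula in the text divides by the raw sum.

```python
def predict_rating(x, j, ratings, means, similarity, neighbors):
    # ratings: user -> {item: rating}; means: user -> mean rating;
    # similarity(x, a): Pearson correlation; neighbors: the set S.
    num, den = 0.0, 0.0
    for a in neighbors:
        if j in ratings[a]:
            s = similarity(x, a)
            num += s * (ratings[a][j] - means[a])
            den += abs(s)  # abs() is a common stabilization of the raw sum
    if den == 0:
        return means[x]  # no neighbor rated j: fall back to the user's mean
    return means[x] + num / den
```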

Mass Diffusion Resource Values
Let degree(j) and degree(v) represent the vertex degrees corresponding to the item j and the user v, respectively. Then, the resource values according to the Mass Diffusion algorithm are defined as follows [46]:

Heat Spreading Resource Values
The resource values rHeatSpreading(v, j) are computed according to the Heat Spreading algorithm and are defined as follows [46]:

Mass Diffusion Heat Spreading Resource Values
Let λ be a tuning parameter that lies between 0 and 1. Then the resource values rMDHS(v, j) are calculated according to the Mass Diffusion Heat Spreading algorithm and are defined as follows [36]:
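The λ-weighted hybrid can be illustrated as a two-step resource spreading on the user-item bipartite graph. The sketch below follows the standard hybrid of the mass diffusion/heat spreading family, not necessarily the paper's exact implementation; the edge-list representation and function name are assumptions.

```python
from collections import defaultdict

def mdhs_scores(edges, target_user, lam=0.5):
    # edges: iterable of (user, item) pairs; lam in [0, 1].
    # lam = 1 recovers pure mass diffusion (ProbS);
    # lam = 0 recovers pure heat spreading (HeatS).
    user_items, item_users = defaultdict(set), defaultdict(set)
    for u, i in edges:
        user_items[u].add(i)
        item_users[i].add(u)
    scores = defaultdict(float)
    for l in user_items[target_user]:       # resource starts on collected items
        k_l = len(item_users[l])
        for u in item_users[l]:             # spread items -> users -> items
            k_u = len(user_items[u])
            for j in user_items[u]:
                k_j = len(item_users[j])
                scores[j] += 1.0 / (k_j ** (1 - lam) * k_l ** lam * k_u)
    for l in user_items[target_user]:       # do not re-recommend known items
        scores.pop(l, None)
    return dict(scores)
```

Ranking the returned scores and keeping the top |L| items yields the recommendation list.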

Dissimilarity (Y, X)
Let Y and X represent vectors of attribute values of users corresponding to the values of categories, where y_j ∈ Y and x_j ∈ X are the values of each data attribute of users Y and X, respectively. Then the metric of dissimilarity between Y and X in the k-modes computation for m attributes is defined as follows [6]:

Dissimilarity(Y, X) = Σ_{i=1}^{m} δ(y_i, x_i)

where δ(y_i, x_i) = 0 when x_i = y_i and δ(y_i, x_i) = 1 when x_i ≠ y_i. For example, consider two vectors X and Y with sample attributes a_1 = profession, a_2 = age, and a_3 = sex, and assume the sample values x_1 = engineer, x_2 = 20-25, x_3 = female and y_1 = doctor, y_2 = 25-35, y_3 = female, respectively. The dissimilarity between X and Y is Dissimilarity(Y, X) = 1 + 1 + 0 = 2, since only one attribute, a_3 = sex, is common to both vectors. This measurement finds the number of identical and non-identical attributes in both vectors X and Y. This function does not define a distance metric. The measurement is applied in the k-modes clustering algorithm when assigning the classifiers for each object in each cluster. The mode of clusters is evaluated using the Dissimilarity(Y, X) and Dissimilarity(Q, X) measurements; the effective data partition is then evaluated for each cluster.
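The simple-matching dissimilarity and the worked example above translate directly to code (the function name is illustrative):

```python
def dissimilarity(y, x):
    """k-modes dissimilarity: counts the attributes on which two
    categorical records disagree."""
    return sum(1 for yi, xi in zip(y, x) if yi != xi)

# Worked example from the text: only the third attribute (sex) matches,
# so the dissimilarity is 2.
X = ("engineer", "20-25", "female")
Y = ("doctor", "25-35", "female")
```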

Dissimilarity (Q, X)
Let X_1, X_2, X_3, . . ., X_n be the set of n objects corresponding to the categories of vector X. Let Q be the attribute vector of the categories of these n objects. Q has the least value of dissimilarity over the cluster at hand, and hence it minimizes the following function [6]:

D(Q) = Σ_{i=1}^{n} Dissimilarity(Q, X_i)

Column Entropy
Let X = {x_1, x_2, x_3, . . ., x_n} be a vector of the dataset, which consists of c columns and n instances. The column values of x_i are assigned from the finite number of unique categorical values of the domain set A_i. For a value v ∈ A_i, the probability of x_i = v is given by P(x_i = v). Let p(v|X) be the empirical probability of x_i = v, evaluated in the dataset X. Then, the column entropy of domain A_i is defined as follows [16]:

H(A_i) = − Σ_{v∈A_i} p(v|X) log p(v|X)
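The empirical column entropy can be computed by counting category frequencies in one column; this short sketch uses base-2 logarithms (the base is a choice, not specified by the text):

```python
import math
from collections import Counter

def column_entropy(column):
    """H(A_i) = -sum_v p(v|X) * log2 p(v|X), where p(v|X) is the
    empirical frequency of category v in the column."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

A uniform two-category column has entropy 1 bit; a constant column has entropy 0.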

Expected Entropy
Partition the dataset X into k clusters of objects C_1, C_2, . . ., C_k. Define H(C_j) as the entropy of cluster C_j, which depends on the dataset. Then E(C_k), the expected entropy over the k clusters, is the size-weighted sum of the cluster entropies:

E(C_k) = Σ_{j=1}^{k} (|C_j| / n) H(C_j)

Entropy-Based Clustering Criterion
The entropy-based clustering criterion is given by the function Optimize(C_k), which minimizes the value of the expected entropy E(C_k) for a constant c [16].

Recall (List, User)

Let g(L) be the count of items in the recommendation list L and the testing set that are associated with the valid target user, User ∈ ValidUsers. Let I_User be the total count of items related to the valid user User ∈ ValidUsers. Then Recall(List, User) is the proportion of the valid user's items in the testing set that also appear in the recommendation list L created for a target user, and is given by [38]

Recall(List, User) = g(L) / I_User   (20)

Precision (List, User)
Let |L| be the number of elements in the recommendation list L. Then Precision(List, User) is the ratio of items in L that correspond to items connected to the target user User in the testing set. This metric is defined as follows [38]:

Precision(List, User) = g(L) / |L|   (21)
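Recall (20) and precision (21) share the hit count g(L) and differ only in the denominator, which the following sketch makes explicit (function names are illustrative):

```python
def recall_at_list(recommended, relevant):
    """Recall(L, User) = g(L) / I_User: hits over all relevant items."""
    hits = len(set(recommended) & set(relevant))
    return hits / len(relevant) if relevant else 0.0

def precision_at_list(recommended, relevant):
    """Precision(L, User) = g(L) / |L|: hits over the list size."""
    hits = len(set(recommended) & set(relevant))
    return hits / len(recommended) if recommended else 0.0
```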

Rank (i)
Rank(i), the rank of item i, determines the position where items connected to the target user in the testing set appear in L, and is defined as follows [38]:

Rank(i) = (The item position in L) / (The number of items initially unknown to the user)   (22)

Ranking Score (User)

RankingScore(User), the ranking score of a user, is defined as follows [39]:

Sparseness (Dataset)
Given the total number of users, the number of data items, and the number of interactions between the users and data items, the sparseness of a given dataset is computed as follows [27]:

Sparseness(Dataset) = Number of interactions between the items and users / (Number of users × Number of items)

Root Mean Square Error (RMSE)

For n, the number of ratings present in the test set, the quality of the predicted ratings is obtained using the RMSE [50]. This metric compares the predicted ratings with the probe test set and is defined as follows:

RMSE = sqrt( (1/n) Σ (r_predicted − r_actual)² )
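Both dataset-level measures are one-liners; the sketch below assumes the sparseness is reported as the filled fraction of the user-item matrix (function names are illustrative):

```python
import math

def sparseness(n_users, n_items, n_interactions):
    """Fraction of observed user-item interactions in the full matrix."""
    return n_interactions / (n_users * n_items)

def rmse(predicted, actual):
    """Root Mean Square Error over n predicted/actual rating pairs."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)
```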

Proposed Recommender Algorithms
This research focuses on the design of two different intelligent optimization methods using Artificial Intelligence and Machine Learning for real-life applications that are used to improve the process of generating recommendations. In the first method, modified cluster based intelligent CF is applied with sequential clustering that operates on the values of the dataset, the user's neighborhood set, and the size of the recommendation list. This strategy splits the given dataset into different subsets or clusters, and a recommendation list is extracted from each group to construct the better recommendation list. In the second method, a specific features-based customized recommender works in training and recommendation steps by applying a split and conquer strategy on the problem datasets, which are clustered into a minimum number of clusters, and the better recommendation list is created among all the clusters. This strategy automatically tunes the tuning parameter λ, which serves the role of supervised learning in generating the better recommendation list for large datasets.
The proposed recommender algorithms can also be extended to support the recommendation based on the multimedia information for online applications by updating the user profile, dataset resources, and domain knowledge base components of the proposed recommender systems [62].These recommender algorithms are also applied in generating recommendations for hybrid online and big data in social and complex network applications [63][64][65].
The proposed algorithms are discussed in the following subsections.

Novelty in the Proposed Methods
The proposed recommender algorithms are designed using the following new strategies: applying the split and conquer strategy to large scale datasets, partitioning them into different clusters, and generating better recommenders from each cluster; updating the similar knowledge needs of other users in all clusters in a database and storing the better recommendation lists in the database for all the clusters.
Applying Machine Learning to the identification of knowledge requirements for every partition by extracting the better previously-stored knowledge information from the database to reduce the computational complexity.
The proposed strategy works well for large scale datasets compared to the feature-based recommender developed for solving the ERP System and E-Agribusiness datasets, which requires the computation of some configuration functions initially and whose evaluation process requires offline evaluation of parametric optimization [48]. Compared to that strategy, the proposed strategy obtains the better recommendation list in each of the clusters and updates it in the database for future recommendation purposes. The favorite items are combined in the recommendation list based on the user profile of a target user.
The proposed methods are compared with the CF developed for both user-based and object-based cases in bipartite networks, where the filtering is based on the degree of nodes in the bipartite network [49]. In the proposed CF, however, the split and conquer strategy predicts the overall ratings for all unrated items and recommends the best list for each cluster, and the better recommendation from the ratings of the entire list is then chosen artificially from the database. The proposed methods work well in solving the MovieLens dataset compared to the other methods.
A group recommendation model has been proposed based on factors such as sparsity, dynamics, and timeliness [50]. In contrast, the proposed collaborative recommender is designed to update the similar knowledge needs of other users in all clusters of the same group, and the information is updated in a database for providing the overall better recommendation. Even when additional features are considered, this intelligent optimization strategy further minimizes the RMSE for large scale datasets to provide a better recommendation.
For datasets that require a higher number of categories, latent class methods are computationally slow and provide an infeasible solution using the k-modes clustering developed in [51]. Hence, for large datasets that involve more categorical variables, the k-modes clustering with the specific features-based personalized recommender algorithm is required to improve the performance of cluster analysis. The k-modes method is applied to construct the different clusters by preprocessing the updated training dataset. The proposed k-modes strategy is a frequency-based method that evaluates the mode of clusters using the Dissimilarity(Y, X) and Dissimilarity(Q, X) measurements. In the proposed k-modes strategy, each object from the classifier (training-dataset, attributes-set) is assigned to the cluster whose mode is nearest to it according to the Dissimilarity(Y, X) and Dissimilarity(Q, X) measurements. This strategy is used to evaluate the effective data partition for each cluster in obtaining a better recommendation. The training and recommendation steps are applied with some preprocessing, and the parameter λ is automatically tuned to serve the role of supervised learning.

Modified Cluster Based Intelligent Collaborative Filtering Algorithm (Method 1)
The key idea of CF is that items which are liked by many users are likely to be liked by any other user. The proposed modified cluster based intelligent CF operates on the dataset, the neighborhood set of the user, and |L| [25][26][27][28]. The method starts with the partition of the knowledge needs of the required information and applies sequential clustering to identify the required knowledge needs [33][34][35][36][37][38][39][40][41][42]. It identifies the necessary current knowledge needs for each of the partitions by artificially extracting the better previously-stored similar knowledge needs from the database. This strategy splits the given dataset into different subsets or clusters, the recommendation list is extracted from each cluster, and the combined recommendation list is generated and stored in a database. The better recommendation list is then chosen at the end by extracting from the previously-stored recommendation lists in the database. The flowchart for proposed method 1 is depicted in Figure 1, and its algorithm is shown in Algorithm 1.
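The split-extract-combine pipeline described above can be sketched as follows. This is a hedged illustration only: the function names and the list-of-(item, score) representation are assumptions standing in for the paper's clustering and per-cluster recommenders, and a plain dict plays the role of the database.

```python
def cluster_based_recommend(dataset, cluster_fn, recommend_fn, list_size):
    """Sketch of the cluster-based CF pipeline:
    cluster_fn(dataset) -> list of clusters;
    recommend_fn(cluster) -> list of (item, score) pairs."""
    stored_lists = {}  # plays the role of the database of per-cluster lists
    for cid, cluster in enumerate(cluster_fn(dataset)):
        stored_lists[cid] = recommend_fn(cluster)
    # Combine the stored per-cluster lists and keep the best |L| distinct items.
    combined = [pair for lst in stored_lists.values() for pair in lst]
    combined.sort(key=lambda pair: pair[1], reverse=True)
    seen, final = set(), []
    for item, score in combined:
        if item not in seen:
            seen.add(item)
            final.append((item, score))
        if len(final) == list_size:
            break
    return final
```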

The Specific Features based Personalized Recommender Algorithm (Method 2)
The new profile-based customized recommender algorithm works in the training and recommendation steps with the split and conquer strategy, in which the datasets are clustered into a minimum number of clusters. The algorithmic design strategy is depicted in Figure 2. The training and recommendation steps are applied to each cluster. The recommendation list is generated from each cluster independently and updated in the database. The better recommendation list is then generated by combining the generated recommendation lists from the database. The proposed profile-based customized recommender is modeled as a bipartite user-item graph G with three sets (I, U, E), where the user set is {u1, u2, u3, ..., uN}, the item set is {i1, i2, i3, ..., iN}, and the edge set is {e1, e2, e3, ..., ek}.
This algorithm operates based on the user profile information for each cluster. It applies a tuning parameter λ, which lies between 0 and 1. The two important stages in this algorithm are training and recommendation. The stage 1 operations are dataset preprocessing, defining user features, deleting information for invalid users and empty data, performing basic feature-based operations, constructing interactions through graphs, passing the preprocessed data features into a k-modes clustering algorithm, mapping the clusters to λ values, defining the (λ, cluster) pairs, and transforming attributes into categories. The required parameters are n, the data size; k, the cluster size; and t, the number of iterations. The proposed specific features based recommender is depicted in Algorithm 2.
Algorithm 2: Preprocessing-Specific Features based Personalized Recommender
Inputs: dataset, user x, S, |L|
1: Partition the knowledge needs of the required information
2: Apply sequential clustering to identify the current knowledge needs and extract the better stored similar knowledge needs from the database
3: Construct the graph G using the training data set that is already known
4: Perform the following operations on each cluster:
5: Assign every user u to a vertex in U
6: Assign every item j to a vertex in I
7: If there is an interaction of a user u with any item j then:
8: Insert the edge ek into G
9: Make vertex u adjacent to vertex j
10: Perform the following for the target user u in each cluster:
11: Assign the resource value r(u, j) to every item j in G
12: Set r(u, j) = 1 if there is an edge between u and j; otherwise set r(u, j) = 0
13: Apply the propagation process to redistribute the resource values
14: Calculate r(v, j) for every user v ∈ U
15: Redistribute the values of r(v, j)
16: Update the new resource value r(v, j) for each item j in G

The flowchart of the training step of the proposed method 2 is depicted in Figure 3. The implementation of a training phase that trains a classifier is shown in Algorithm 3.
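Steps 3-12 of Algorithm 2 (building the bipartite graph and seeding the resource values for a target user) can be sketched as follows. This is an illustrative reading of the pseudocode, not the paper's Java implementation; the adjacency-dictionary representation is an assumption.

```python
def build_bipartite(interactions):
    """Steps 3-9: one vertex per user and per item, one edge per interaction.
    Returns the two adjacency maps of the bipartite graph G."""
    items_of, users_of = {}, {}
    for u, j in interactions:
        items_of.setdefault(u, set()).add(j)   # u adjacent to j
        users_of.setdefault(j, set()).add(u)   # j adjacent to u
    return items_of, users_of

def initial_resources(items_of, users_of, target):
    """Steps 11-12: r(u, j) = 1 if an edge (u, j) exists, else 0."""
    liked = items_of.get(target, set())
    return {j: 1.0 if j in liked else 0.0 for j in users_of}
```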
The best tuning value of λ is extracted based on the target user's features and is returned from Algorithm 3. This algorithm starts by applying initial preprocessing operations on the dataset, transforming the attribute values into categorical values. For datasets that require a higher number of categories, latent class methods are computationally slow and provide an infeasible solution [51]. Hence, for large datasets that involve more categorical variables, the k-modes clustering with the specific features-based personalized recommender algorithm is required to improve the performance of cluster analysis. The k-modes method is then applied to construct the different clusters by preprocessing the updated training dataset. The k-modes strategy, a frequency-based method that evaluates the mode of each cluster using the Dissimilarity (Y, X) and Dissimilarity (Q, X) measurements, is shown in Algorithm 4. The asymptotic complexity of this strategy is Θ(nki), a linear complexity. The k clusters are then evaluated for each value of k ranging from k-minimum to k-maximum. The effective data partition is evaluated for each cluster using the k-modes method [20]. The best value of k, best (k), is obtained for the best-clusters, which optimizes the entropy-based clustering criterion function, Optimize Ck. This algorithm then proceeds by determining λ-best, the best λ, for each cluster using Algorithm 5.
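The cluster-assignment step of k-modes described above can be sketched with the simple-matching dissimilarity commonly used for categorical data. The exact form of the paper's Dissimilarity (Y, X) and Dissimilarity (Q, X) measurements is not reproduced here, so the functions below are a minimal illustration under that assumption.

```python
def dissimilarity(y, x):
    """Simple-matching distance for k-modes: the number of categorical
    attributes on which the two objects disagree."""
    return sum(1 for a, b in zip(y, x) if a != b)

def nearest_mode(obj, modes):
    """Assign an object to the cluster whose mode is nearest to it,
    as in the frequency-based k-modes assignment step."""
    return min(range(len(modes)), key=lambda k: dissimilarity(obj, modes[k]))
```

In full k-modes, this assignment alternates with recomputing each cluster's mode (the per-attribute most frequent category) until the partition stabilizes.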
A training pair for the classifier, (user-features, λ-best (j)), is constructed, which collects the features of each user in the specific cluster. Finally, the classifier is trained on the training-set. The user profile corresponding to the λ-best value is indicated by the classifier using the given user attributes. This training algorithm outputs the trained classifier and the bipartite graph G, which represents the interactions between items and users in the training-dataset. The specific features based recommendation is depicted in Algorithm 6. This algorithm computes λ-best, the best value of λ for the user's profile, for each cluster. It determines λ-best by evaluating each recommendation list that is generated with each value of λ, applying the metrics recall, precision, and ranking score. The set Λ consists of the typical λ values that lie between 0.0 and 1.0. The output of the trained classifier from Algorithm 3 determines the best λ for a target user in the recommender phase. This step requires the following inputs: the bipartite graph G, the attributes set, the target user, the trained classifier obtained from Algorithm 3, and |L|. The complete operations of the recommendation step are described in Algorithm 6.
The algorithm starts by extracting the target user's feature values, which are assigned to user-features. Then λ-User, the proper λ value for the user, is predicted by giving the user-features values to the trained classifier. The target user profile is reflected in the variable λ-User through the Mass Diffusion Heat Spreading algorithm, which selects the novelty-based popular items. The recommendation list is updated by applying the Mass Diffusion Heat Spreading algorithm with the tuning parameter λ-User and |L|. Finally, the lesser-known and favorite items are combined in the recommendation list based on the profile of the target user. The flowchart of the recommendation step of the proposed method 2 is depicted in Figure 4. The specific features based recommendation process is described in Algorithm 6. In general, the worst-case complexity of the proposed recommenders is O(ni) for n users and i data items. The complexity of model construction of the proposed cluster based recommenders with a partition of c clusters is O(cni), a linear complexity for one rating prediction, with a space requirement of O(ci + n). The asymptotic complexity of the proposed k-modes clustering strategy is Θ(nki), a linear complexity, since the k clusters are evaluated for each value of k ranging from k-minimum to k-maximum. However, in practical applications, a complexity of O(n + i) is expected, since for each user only a finite number of items is considered: one loop runs over the n users to compute the similarity and one over the i items to compute the prediction.
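One propagation step of a λ-tuned Mass Diffusion Heat Spreading hybrid can be sketched as below. The interpolation between the two normalizations follows the commonly used hybrid formulation (λ = 1 leaning toward mass diffusion, λ = 0 toward heat spreading); the paper's exact convention for λ, and the split into two half-steps, are assumptions.

```python
def mdhs_step(items_of, users_of, r, lam):
    """One hybrid propagation step on the user-item bipartite graph.
    items_of -- user -> set of items, users_of -- item -> set of users,
    r -- current resource value r(u, j) per item, lam -- tuning parameter."""
    # Item -> user half-step: each item spreads its resource equally
    # among the users connected to it.
    u_res = {}
    for j, res in r.items():
        if res:
            share = res / len(users_of[j])
            for u in users_of[j]:
                u_res[u] = u_res.get(u, 0.0) + share
    # User -> item half-step: λ interpolates the normalization between
    # the user degree (mass-diffusion-like) and the item degree
    # (heat-spreading-like).
    new_r = dict.fromkeys(r, 0.0)
    for u, res in u_res.items():
        ku = len(items_of[u])
        for j in items_of[u]:
            kj = len(users_of[j])
            new_r[j] = new_r.get(j, 0.0) + res / (ku ** lam * kj ** (1 - lam))
    return new_r
```

Ranking the uncollected items by their new resource values and keeping the top |L| then yields the recommendation list for the target user.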
The implementation cost in realistic recommender systems depends on the number of user attributes, the size of the recommendation list, the tuning parameter, and the number of graph edges produced in the bipartite graph. There are some solutions for addressing this cost: discarding the users with fewer than the minimal required popular items and discarding very popular items, since the items are partitioned into different datasets and clustering is applied to the data.


Simulations & Results
This section focuses on the simulation of the proposed algorithms on the sample datasets, together with the outcomes and analysis. The algorithms are implemented in the Java language. It has been experimentally found that the computation time depends on the structure of the computation, the number of user attributes, the number of graph edges, and the dataset.

Datasets
The proposed algorithms are implemented and executed on the following datasets [19]. The MovieLens dataset consists of 910 users, 1672 items, and 95,579 interactions, with the user features age, location, and gender. The Last.FM dataset consists of 2846 users, 4995 items, and 14,583 interactions, with the user features country, gender, and age. The Book-Crossing dataset consists of 3421 users, 26,811 items, and 35,572 interactions, with the user features age and location.

Metrics for Evaluation
The results are evaluated using the metrics Recall (List, User), Precision (List, User), and Ranking Score (User), which are applied to validate the recommendation lists L generated by the recommender system. A higher Recall (List, User) means that the system recommends more of the testing set items. A higher Precision (List, User) indicates that more items in the recommendation list correspond to the user's testing set items. A lower Ranking Score (User) indicates that the item is closer to the first position. The sparseness of the datasets is computed using the expression Sparseness (dataset). The sparseness values for the MovieLens, Last.FM, and Book-Crossing datasets are 0.9371, 0.9989, and 0.9996, respectively.
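Under their standard definitions, these measures can be sketched as follows. The paper does not reproduce its exact expressions here, so the formulas below are the usual ones, chosen to be consistent with the reported sparseness values (e.g., 1 - 95,579/(910 x 1672) ≈ 0.9372 for MovieLens).

```python
def precision(rec_list, test_items):
    """Fraction of the |L| recommended items found in the user's test set."""
    return sum(j in test_items for j in rec_list) / len(rec_list)

def recall(rec_list, test_items):
    """Fraction of the user's test-set items that were recommended."""
    return sum(j in test_items for j in rec_list) / len(test_items)

def ranking_score(position, total_uncollected):
    """Relative position of a test item in the ordered queue; lower is better."""
    return position / total_uncollected

def sparseness(n_users, n_items, n_interactions):
    """Share of the user-item matrix that carries no interaction."""
    return 1.0 - n_interactions / (n_users * n_items)
```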

Comparative Results & Analysis
The proposed methods are compared with the MDHS algorithm [36,46], the nearest neighborhood CF [11,23], UPOD [6], and Dynamic Group Recommender (DGR) algorithms [6,49-51]. The following simulation parameters are applied during the execution of the proposed methods: λ = 0.5, the size of the recommendation list |L| = 30, the size of the neighborhood 30, k-minimum set to 100, and k-maximum set to 200. The simulation outcomes are tabulated in Tables 1-6. The results are compared using the statistical t-test to find the significance of the proposed methods over the existing methods [6]. The mean µ and standard deviation σ are calculated for the datasets after applying Optimize Ck (Equation (19)) 100 times. The tables show that the proposed methods outperform the existing techniques based on the metrics applied to evaluate the performance measurements. The data for user profiles can also be extracted from social networks. The average computational time (in seconds) for the proposed methods, in the form (proposed method 1, proposed method 2), for the MovieLens, Last.FM, and Book-Crossing datasets is (8.2, 4.9), (9.8, 4.89), and (8.75, 4.97), respectively.

Table 5. Comparison of µ for the proposed methods with existing strategies-Book-Crossing.

Strategies            Ranking Score (User)   Recall (List, User)   Precision (List, User)
CF [11,23]            0.406                  0.000                 0.000
MDHS [36,46]          0.395                  0.002                 0.000
UPOD [6]              0.394                  0.003                 0.000
Proposed Method 1     0.395                  0.004                 0.000
Proposed Method 2     0.398                  0.004                 0.000

Table 6. Comparison of σ for the proposed methods with existing strategies-Book-Crossing.

Strategies            Ranking Score (User)   Recall (List, User)   Precision (List, User)
CF [11,23]            0.007                  0.000                 0.000
MDHS [36,46]          0.007                  0.001                 0.000
UPOD [6]              0.008                  0.001                 0.000
Proposed Method 1     0.008                  0.001                 0.000
Proposed Method 2     0.007                  0.005                 0.000

The proposed methods are evaluated and compared with the other well-known strategies, such as CF, MDHS, and UPOD. The parameter λ has been set to 0.5 when comparing with the MDHS and UPOD methods. The proposed methods are evaluated using n-group cross-validation, in which the data set is divided into n groups of equal sizes. Valid Users, that is, the users who are available in both the testing and the training set, are considered for the evaluation.
The performance of the proposed methods against the existing methods is evaluated using the three metrics Recall (List, User), Precision (List, User), and Ranking Score (User), and the recommendation list generated by the system is validated. The simulation outcomes are tested using the statistical t-test with a level of significance α = 0.05 to check whether the difference between the proposed and the existing methods is statistically significant. The measures kµ and kσ are computed for each dataset after applying the clustering with 10 executions. The inferences are analyzed from the experimental results, which are tabulated in Tables 1-6. The figures indicated in bold show the better performance of the proposed methods over the existing methods.
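For reference, the two-sample t statistic underlying such a test can be computed as below. Welch's form is used here, which does not assume equal variances; whether the paper uses the pooled or the Welch variant is not stated, so this is an assumption.

```python
import math

def welch_t(xs, ys):
    """Welch's two-sample t statistic for two lists of metric values
    (e.g., per-run Ranking Score results of two methods)."""
    nx, ny = len(xs), len(ys)
    mx, my = sum(xs) / nx, sum(ys) / ny
    vx = sum((x - mx) ** 2 for x in xs) / (nx - 1)  # sample variance of xs
    vy = sum((y - my) ** 2 for y in ys) / (ny - 1)  # sample variance of ys
    return (mx - my) / math.sqrt(vx / nx + vy / ny)
```

The statistic is then compared against the critical value at α = 0.05 (with the Welch-Satterthwaite degrees of freedom) to decide significance.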
The comparison of the mean µ and standard deviation σ for the MovieLens dataset is shown in Tables 1 and 2, respectively. For this dataset, the measures are calculated as kµ = 198 and kσ = 4.21. It has been found that there is no significant difference in the parameters µ and σ concerning the Ranking Score (User) and Recall (List, User) measures in the proposed methods. However, the proposed method 2 behaves well in terms of the Precision (List, User) measurements compared to the other methods with the size of the recommendation list |L| = 30.
The accuracy of the proposed methods is also evaluated for the MovieLens dataset with the same parameters considered in [49]. To test the performance of the proposed recommenders, the MovieLens dataset is divided into a 90% training set and 10% probe data. The dataset can also be divided into (80%, 20%), (70%, 30%), and so on. The only known information is available in the training set, and no prediction is made in the probe set of data. For the jth user uj, the position of an uncollected object oj is measured in the ordered queue. The position of oj is obtained by dividing its particular location from the top by the total number of uncollected movies. Hence, a good recommender is expected to produce a small Ranking Score (User), which shows the better accuracy of the recommender. The performance comparison of the proposed methods with other methods for the MovieLens dataset over the three metrics is shown in Figure 5 [49]. The simulation is conducted with 10% of probe data and L = 50. The values corresponding to the proposed methods are the better ones concerning all three metrics.
The comparison of the mean µ and standard deviation σ for the Last.FM dataset is shown in Tables 3 and 4, respectively. For this dataset, the measures are calculated as kµ = 195 and kσ = 2.75. For the Last.FM dataset, the proposed method 2 performs better than the existing methods in terms of all performance metrics considered. For this dataset, there is no significant difference even when the size of the recommendation list |L| becomes 30, while keeping the size of the neighborhood < 30 in all executions.
For n, the number of ratings present in the test set, the quality of the predicted rating is obtained using the RMSE [50]. This metric compares the predicted ratings with the probe test set. The accuracy of the proposed methods is also evaluated for the Last.FM dataset, and the results are compared with the methods presented in [50]. The minimum RMSE is considered for some of the methods presented in the DGR. The RMSE comparison of the proposed methods with the DGR presented in [50] is shown in Figure 6, where the size of the groups is plotted on the X-axis and the RMSE on the Y-axis. The predictions generated by the proposed recommenders are better than those of the existing methods. The experimental results further show that the proposed recommenders consider the individual preferences of the group members and the specific features. It has also been found that the accuracy slightly decreases while the group size increases, since the group recommendation depends on the diverse set of users and specific personal features. The average (kµ) and standard deviation (kσ) of the observed ratings for the proposed datasets are shown in Table 7. The comparison of the mean µ and standard deviation σ for the Book-Crossing dataset is shown in Tables 5 and 6, respectively. For this dataset, the measures are calculated as kµ = 194 and kσ = 7.83. It has been found that there is a significant difference in the parameters µ and σ concerning the Precision (List, User) and Recall (List, User) measures in the proposed methods. However, the proposed methods are competitive with the existing methods, and the proposed method 2 performs well compared to proposed method 1. It has also been found that, for this dataset, there is no significant difference even when the size of the recommendation list |L| becomes 50, while keeping the size of the neighborhood < 30 in all executions.
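The RMSE used in the group-size comparison above is the standard root mean square error over the n probe ratings, which can be computed as:

```python
import math

def rmse(predicted, actual):
    """Root mean square error over the n ratings of the probe/test set."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)
```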
The proposed methods are simulated for the large scale Book-Crossing dataset, which consists of 3421 users and 26,811 items, on varying the tuning parameter λ over the values 0.3, 0.4, 0.5, and 0.6. The significant results obtained are shown in Figure 8. It has been found that when λ < 0.5, the Ranking Score (User) increases while the Precision (List, User) and Recall (List, User) measures decrease. When λ > 0.5, the Ranking Score (User) also increases gradually, and no significant differences are obtained in the Precision (List, User) and Recall (List, User) measures. In this case, the experimental results show that the proposed method 2 can provide a better recommendation based on the defined metrics when λ = 0.5.
For the Book-Crossing dataset, the metric Precision (List, User) is evaluated for different sizes of the recommendation list, for example, for the values |L| = 10, 20, 30, 40, 50, and 100; the corresponding metric is plotted in Figure 9. When this metric is higher, more of the items in L correspond to the test-set items of the users. By keeping the tuning parameter λ = 0.5, the proposed method 2 provides better performance while |L| ≤ 30. When |L| > 30, the proposed method 1 provides the better recommendation based on this metric. The experiments are also conducted for the Book-Crossing dataset when varying the user sizes, for example, over the values 500, 1000, 1500, 2000, 2500, and 3000, and the corresponding Precision (List, User) and Recall (List, User) metrics are plotted in Figure 10. The experimental results show that for small user sizes (≤500), higher values of these metrics are obtained. When the user sizes exceed 500, these metrics decrease gradually, and the proposed method 2 performs better than the proposed method 1. The significant differences obtained in these methods arise because the structure of computation of the methods depends on the number of user attributes, the size of the recommendation list, the tuning parameter, and the number of graph edges produced in the bipartite graph. The maximum values of the (Precision, Recall) metrics obtained for the Book-Crossing dataset in the proposed methods are (0.0004, 0.0042) and (0.0004, 0.0046), respectively.


Discussion of Important Results
The proposed methods provide better performance compared to the existing methods. The important results of the proposed methods are analyzed as follows. The statistical measurements mean µ and standard deviation σ are calculated for the considered datasets after applying Optimize Ck (Equation (19)) 100 times. The expected entropy E Ck is minimized after applying Optimize Ck more than 90 times. The proposed methods are evaluated using n-group cross-validation, in which the data set is divided into n groups of equal sizes. Valid Users, that is, the users who are available in both the testing and the training set, are considered for the evaluation. The simulation outcomes are tested using the statistical t-test with a level of significance α = 0.05 to check whether the difference between the proposed and the existing methods is statistically significant. The measures kµ and kσ are computed for each dataset after applying the clustering with 10 executions. It has been experimentally found that the accuracy of the proposed methods slightly decreases while the group size increases, since the group recommendation depends on the diverse set of users and specific personal features.
The performance of the specific feature selection is analyzed by varying the neighborhood size |S|. An interesting result is obtained when the experiments are conducted for small, medium, and large neighborhood sizes |S|. RMSE values of 41% and 37% are obtained in the proposed methods when |S| < 20. When the value of |S| increases, the RMSE also increases. For large values of |S|, the significant difference of the RMSE becomes smaller in the proposed methods. The simulation of the proposed methods on the large scale Book-Crossing dataset, when varying the tuning parameter λ over the values 0.3, 0.4, 0.5, and 0.6, produces significant results. It has been experimentally found that when λ < 0.5, the Ranking Score (User) increases while the Precision (List, User) and Recall (List, User) measures decrease. When λ > 0.5, the Ranking Score (User) also increases gradually, and no significant differences are obtained in the Precision (List, User) and Recall (List, User) measures. In this case, the experimental results show that the proposed method 2 can provide a better recommendation based on the defined metrics when λ = 0.5.
For the Book-Crossing dataset, the simulation has been conducted for different sizes |L| = 10, 20, 30, 40, 50, and 100. When this measurement is higher, more of the items in L correspond to the test-set items of the users. By keeping the tuning parameter λ = 0.5, the proposed method 2 provides better performance while |L| ≤ 30. When |L| > 30, the proposed method 1 provides the better recommendation based on this metric. The simulation has also been performed on the Book-Crossing dataset when varying the user sizes in multiples of 500, up to the maximum of 3000. The experimental results show that for small user sizes (≤500), higher values of the metrics are obtained. When the user sizes exceed 500, the metrics decrease gradually, and the proposed method 2 performs better than the proposed method 1. The significant differences obtained in these methods arise because the structure of computation of the methods depends on the number of user attributes, the size of the recommendation list, the tuning parameter, and the number of graph edges produced in the bipartite graph. The maximum values of the (Precision, Recall) metrics obtained for the Book-Crossing dataset in the proposed methods are (0.0004, 0.0042) and (0.0004, 0.0046), respectively. The proposed method 2 works well when λ = 0.5 with the size of the recommendation list |L| = 30 and the size of the neighborhood 30, and it automatically tunes the tuning parameter λ.

Conclusions & Future Work
The modified cluster based intelligent CF and the profile based customized recommender method are proposed and analyzed in this research. The proposed method 2 works well when λ = 0.5 with the size of the recommendation list |L| = 30 and the size of the neighborhood 30, and it automatically tunes the tuning parameter λ. The proposed methods combine the novelty and popularity features based on the user's profile and generate the recommendation list. The experimental

Figure 2. The Flowchart of the Split and Conquer Strategy of the Proposed Method 2.

Figure 3. The Flowchart of the Training Step of the Proposed Method 2.

Algorithm 3: Training Steps
Inputs: training-dataset, attributes-set, k-minimum, k-maximum
1: Partition the required knowledge needs, apply the split and conquer strategy on the training-dataset, and extract the better stored similar knowledge needs from the database
2: Apply the preprocessing operations on the training-dataset and update it
3: Initialize evaluation to zero
4: For the cluster size k = k-minimum to k-maximum:
5:   Construct clusters(k), that is, the k clusters, by applying Algorithm 4
6:   Evaluate clusters(k) using the entropy-based clustering criterion (Eqn. 19) and update evaluation(k)
7:   Compare evaluation(k) with evaluation and update evaluation, best-clusters, and best(k)
8: End For
9: Construct a bipartite graph G for the training-dataset
10: Initialize training-set to null
11: For j = 1 to best(k):
12:   Compute λ-best(j) by applying Algorithm 5
13:   For each user in best-clusters(j):
14:     Update user-features by extracting the features from the user attributes-set
15:     Include the pair (user-features, λ-best(j)) in the training-set
16:   End For
17: End For
18: Update the classifier by training it on the training-set
19: Return the bipartite graph and the classifier
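The control flow of the training steps can be sketched as follows. This is a minimal illustration of the loop structure only: `build_clusters`, `entropy_score`, `best_lambda`, and `extract_features` are hypothetical stand-ins for Algorithm 4, the entropy-based criterion of Eqn. 19, Algorithm 5, and the feature-extraction step, respectively, and are passed in as callables.

```python
def train(dataset, attributes, k_min, k_max,
          build_clusters, entropy_score, best_lambda, extract_features):
    """Sketch of Algorithm 3: pick the best clustering over k, then
    label each user's feature vector with the cluster's best lambda."""
    best_eval, best_clusters = float("-inf"), None
    for k in range(k_min, k_max + 1):            # steps 4-8
        clusters = build_clusters(dataset, k)    # Algorithm 4 (stand-in)
        score = entropy_score(clusters)          # Eqn. 19 (stand-in)
        if score > best_eval:                    # step 7: keep the best
            best_eval, best_clusters = score, clusters
    training_set = []                            # steps 10-17
    for cluster in best_clusters:
        lam = best_lambda(cluster)               # Algorithm 5 (stand-in)
        for user in cluster:
            training_set.append((extract_features(user, attributes), lam))
    return training_set                          # fed to the classifier (step 18)
```

The returned list of (user-features, λ) pairs is what the classifier of step 18 is trained on, so that the recommendation step can later predict a proper λ for an unseen user.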


Figure 4. The Flowchart of the Recommendation Step of the Proposed Method 2.

Algorithm 6: Specific Features based Recommendation
Inputs: G, attributes-set, target-user, classifier, |L|
1: Update user-features by extracting the features from the user attributes-set
2: Predict λ, the proper λ value for the user, by giving the user-features values to the trained classifier
3: Apply the Mass Diffusion Heat Spreading algorithm and update the Recommendation-List
4: Return Recommendation-List
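Step 3 hybridizes mass diffusion and heat spreading on the bipartite graph, weighted by λ. The sketch below follows the well-known ProbS/HeatS hybrid formulation of Zhou et al. (λ = 1 recovers pure mass diffusion, λ = 0 pure heat spreading); the exact variant used in the paper may differ, and the function name is illustrative. It assumes every item has been rated at least once (nonzero item degree).

```python
import numpy as np

def hybrid_scores(A, user, lam):
    """Score items for one user via the lambda-weighted mass diffusion /
    heat spreading hybrid on the user-item bipartite graph.

    A: (n_users, n_items) binary adjacency matrix of G
    """
    k_item = A.sum(axis=0)                # item degrees
    k_user = A.sum(axis=1)                # user degrees
    # M[a, b] = sum over users u of A[u, a] * A[u, b] / k_u
    M = (A / k_user[:, None]).T @ A
    # hybrid transfer: W[a, b] = M[a, b] / (k_a^(1 - lam) * k_b^lam)
    W = M / (k_item[:, None] ** (1 - lam) * k_item[None, :] ** lam)
    scores = W @ A[user]                  # diffuse the user's resources
    scores[A[user] > 0] = -np.inf         # exclude items already collected
    return scores                         # top-|L| indices form the list

# Toy example: 2 users, 3 items; recommend for user 0 with lam = 0.5.
A = np.array([[1, 1, 0],
              [0, 1, 1]], dtype=float)
s = hybrid_scores(A, 0, 0.5)
```

Taking the |L| highest-scoring items from `scores` yields the Recommendation-List of step 4.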

[49]. The simulation is conducted with 10% of probe data and L = 50. The values corresponding to the proposed methods are better concerning all three metrics. However, the proposed method 2 behaves well in terms of the Precision (List, User) measurement compared to the other methods with the size of the recommendation list |L| = 30.

Figure 5. The Performance Comparison of the Proposed Methods with Other Methods [49]: MovieLens dataset.

The RMSE is increasing since the group recommendation depends on the diverse set of users and specific personal features.

Figure 6. The RMSE Comparison of the Proposed Methods with DGR [50]. The performance of specific feature selection is also analyzed by varying the neighborhood size |S|. An interesting result is obtained when the experiments are conducted for small, medium, and large neighborhood sizes |S|, as shown in Figure 7. RMSE values of 41% and 37% are obtained in the proposed methods when |S| < 20. When the value of |S| increases, the RMSE also increases. For large values of |S|, the significant difference between the proposed methods becomes smaller.
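The RMSE reported in these neighborhood-size experiments is the standard root mean square error between predicted and observed ratings. A minimal sketch (with hypothetical example data, not the paper's):

```python
import numpy as np

def rmse(predicted, actual):
    """Root mean square error between predicted and observed ratings."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return np.sqrt(np.mean((predicted - actual) ** 2))

# Hypothetical predictions vs. observed ratings for three test items:
error = rmse([3.5, 4.0, 2.0], [4.0, 4.0, 1.0])
```

Lower values indicate better rating prediction; the curves in Figures 6 and 7 report this quantity as |S| varies.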

Figure 7. The RMSE Comparison of the Proposed Methods for Different Neighborhood Sizes.



Figure 8. Performance of the Proposed Method 2 for Different Values of λ: Book-Crossing.

Figure 9. Size of the Recommendation List versus Precision (List, User) for Book-Crossing.

Figure 10. Sizes of the Users versus Precision & Recall Metrics for Book-Crossing.

Table 1. Comparison of µ for the proposed methods with existing strategies: MovieLens.

Table 2. Comparison of σ for the proposed methods with existing strategies: MovieLens.

Table 3. Comparison of µ for the proposed methods with existing strategies: Last.FM.

Table 4. Comparison of σ for the proposed methods with existing strategies: Last.FM.

Table 5. Comparison of µ for the proposed methods with existing strategies: Book-Crossing.

Table 7. The average (kµ) & standard deviation (kσ) of the observed ratings.
