Abstract
In this paper, we build a recommender system for a new study area: social commerce, which combines rich information about social network users and products on an e-commerce platform. The idea behind this recommender system is that a social network contains abundant information about its users, which can be exploited to create user profiles. For social commerce, the quality of the profiles of potential consumers determines whether the recommender system succeeds or fails. In our work, not only the user’s textual information but also the tags and the relationships between users are considered when building the user profiling model. A topic model is adopted in our system, and a feedback mechanism is also designed. We then apply a collaborative filtering method and a clustering algorithm in order to obtain high recommendation accuracy. We conduct an empirical analysis based on real data collected from a social network and an e-commerce platform. We find that the social network has an impact on e-commerce, so social commerce could be realized. Simulations show that our topic model performs better in topic finding, meaning that our profile-building model is suitable for a social commerce recommender system.
1. Introduction
E-commerce, the activity of buying or selling online, has generated significant business value, and it is becoming ever more common for both consumers and companies to purchase products or services online. The rise of various e-commerce platforms has intensified competition between companies, which continue to explore new sales models. In recent years, social commerce, defined as a subset of e-commerce involving social networks, has become one of the most popular topics [1]. In social commerce, users are encouraged by a platform via financial rewards to share information about the various products sold on that platform via social networks. For example, Amazon and Twitter launched a seamless cross-platform shopping service, ‘AmazonCart’, in 2014, which allows Twitter users to purchase Amazon items in tweets while browsing Twitter. Another example is Alibaba, the largest e-commerce platform in China. A large number of Alibaba vendors promote their commodities through Sina Weibo (the largest Chinese social network). This indicates that e-commerce has evolved from a stand-alone platform to one that incorporates social information. Wigand et al. indicated that the rapid development of social media and web technology may have the potential to transform e-commerce from a product-oriented environment to a social and customer-centered one [2]. Tajvidi et al. found that, in social commerce, consumer–consumer interaction and consumer–seller interaction enhance consumers’ intention to co-create brand value [3]. Adam et al. provided evidence that social media use significantly influences the development of e-government and the diffusion of e-commerce globally [4].
Recommender systems for e-commerce companies have been well studied [5,6]. However, most of the existing recommender systems use only information from e-commerce to make recommendations, such as consumers’ purchase history logs or the rating scores of purchased commodities. With the integration of e-commerce and social platforms, new recommender systems should be designed, and they should make full use of social network user information and product information. Some studies have begun to use social network information to improve the accuracy of the recommendations on e-commerce platforms. Damian et al. proposed a web recommender system for e-commerce that traces clients and analyzes their activities on Facebook [7]. However, the user profiles are based only on keywords from the users’ activities and their friends’ activities. Hao Ma et al. improved the recommender system by adding social contextual information, i.e., social tags and latent information obtained by heterogeneous data mining [8]. Zhao et al. extracted the demographic information of both products and social network users’ activities, and then leveraged the demographic information to improve the recommendation performance on e-commerce websites [9]. They also proposed a cross-platform recommender system that operates by learning both users’ and products’ features from data collected from e-commerce websites using recurrent neural networks and then applying a modified gradient boosting trees method to transform users’ social networking features into user embeddings [10]. Using these features, their work realized cross-site product recommendation and solved the cold-start problem. With the rise of AI technology, Pan et al. proposed a unified framework of active transfer learning for cross-system recommendation, which used an active learning principle to construct entity correspondences across systems [11]. Xiang et al. integrated the fuzzy association rule and complex preference into a recommendation model to improve the efficiency of the traditional collaborative filtering recommendation algorithm [12]. However, these cross-platform recommendation methods rely solely on sparse social network data and e-commerce data. They do not fully integrate textual information, tagging information and behavioral information from the social network. In fact, the social network contains abundant, detailed, time-resolved, real-world user data, e.g., tweets (microblogs), tags and relations with other users, which motivates us to extract users’ information and capture users’ interest profiles for cross-platform recommendations.
In this paper, we propose a novel cross-platform recommender system (CPRec) to make full use of the abundant information on social networks to improve recommendation accuracy. In CPRec, we build both a user profile and a commodity profile from data on social networks and e-commerce. We design an improved topic model, based on Latent Dirichlet Allocation (LDA) [13], for detecting users’ interest profiles from the content they have posted in the past. The users’ tags and their followees’ profiles are also used when building the users’ profiles. After obtaining a user profile, we make recommendations based on the recommended score calculated from the users’ profiles and the commodity profiles. Considering that each user may take different actions after he/she receives the recommended products, a feedback mechanism is designed for CPRec. Since a user-commodity-score matrix is obtained once CPRec is in operation, we develop an improved collaborative filtering algorithm that incorporates user profiles to make further recommendations. Finally, in order to show the performance of the proposed system, we evaluate and analyze CPRec based on two platforms, Sina Weibo and Alibaba. The contributions of this paper are summarized as follows:
- We propose a novel cross-platform personalized recommender system, CPRec, for recommending e-commerce commodities to users on social network platforms.
- An interest mining process is proposed to build the user interest profiles, which makes full use of users’ information on social networks.
- We propose three subdivisions for CPRec, i.e., recommendations for individuals, a feedback mechanism and an improved collaborative filtering algorithm.
- The experimental results validate the feasibility of the CPRec, the veracity of user profiling and the superior performance of our improved collaborative filtering algorithm compared with some existing algorithms.
The remainder of the paper is organized as follows. In Section 2, an overview of our proposed cross-platform recommender system is given. System building and user profiling are presented in Section 3, and commodity profiling and recommendation subdivisions are described in detail. Section 4 discusses the experimental results. Section 5 concludes this paper and outlines future work.
2. System Model
2.1. Preliminary
2.1.1. Social Networks
Social networks can be illustrated as a graph of the relationships and interactions within a group of individuals, and they play a fundamental role as a medium for the spread of information, ideas and influence among their members. In this paper, we consider a general model of social networks, which is abstracted as a set of nodes and a set of edges between the nodes. Each node can be considered as an individual or as a collective unit such as a department, organization or family. There exists an edge between two nodes if they have a relation. Figure 1 shows a brief instance of a social network, which contains four nodes (users in the social network) and their relations (following a person). In Figure 1, user a follows user c while a is being followed by b. In the social network, a user followed by other users is defined as a followee, and those who follow this user are called followers. In practice, users are most likely to follow a user whose interests align with their own. In Sina Weibo, users typically write a short message (limited to 140 characters) and upload some pictures to show moments in their lives or interesting things.
Figure 1.
An abstract graph of a social network.
The short message is an important part of our model. Because a single short microblog may be unable to depict the full scope of a user’s interests or thoughts, we collect the user’s historical microblogs and divide them into groups for analysis. The user’s microblogs from a uniform time period $j$ are represented as the microblog group $M_j$. For a given user, his/her entire microblog history $M$ can be denoted as $M = \{M_1, M_2, \dots, M_n\}$, where $n$ represents the current time slot. Users’ tags are another useful source of information. Selecting tags is an essential, if optional, part of the registration process after users create their accounts with the social network. Tags give explicit information about the interests that users want to present to others, such as singing, eating, shopping, traveling, or IT. Let $Tag_u$ denote the tags that user $u$ selected. Another data source is followees. We denote user $u$’s followees as $F_u$.
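The three data sources above (a follow graph, time-sliced microblog groups and tags) map onto simple data structures. The following is a minimal sketch, not the paper's implementation: the function names, the integer-day timestamps and the extra edge `d -> c` are illustrative assumptions; only the edges `a -> c` and `b -> a` come from the description of Figure 1.

```python
from collections import defaultdict

# Directed "follows" edges as (follower, followee) pairs.
# a->c and b->a follow the Figure 1 description; d->c is a made-up filler edge.
FOLLOWS = [("a", "c"), ("b", "a"), ("d", "c")]


def build_graph(edges):
    """Return (followees, followers) adjacency maps for a follow graph."""
    followees = defaultdict(set)  # user -> users he/she follows
    followers = defaultdict(set)  # user -> users who follow him/her
    for follower, followee in edges:
        followees[follower].add(followee)
        followers[followee].add(follower)
    return followees, followers


def group_by_slot(posts, slot_length):
    """Split (day, text) posts into uniform time slots numbered from 1."""
    groups = defaultdict(list)
    for day, text in posts:
        groups[day // slot_length + 1].append(text)
    return dict(groups)


followees, followers = build_graph(FOLLOWS)
print(sorted(followees["a"]))  # users that a follows
print(sorted(followers["a"]))  # users that follow a
```

With a slot length of seven days, posts from days 0 and 3 fall into the first group and a post from day 8 into the second.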
2.1.2. E-Commerce
In this paper, the e-commerce data source that we adopt is Taobao, a subsidiary of Alibaba and the most successful e-business platform in China. As of 2014, it had generated a total transaction volume of 1.172 trillion RMB.
Commodities on e-commerce platforms are the key factor that we concentrate on, and they can be described by the commodity’s name, its description and the buyer’s information. These three kinds of data will be used to define the commodity profiles.
2.2. System Model of the Cross-Platform Recommender System (CPRec)
We concentrate on an idealized cross-platform process in which an e-commerce company or a retailer R wants to recommend suitable commodities to a social network user u. The first thing that the retailer R should do is to understand what the user u prefers. For example, if R recommends a basketball to a user who enjoys soccer, this user will not be satisfied with the recommendation. With our recommender system, retailers have an opportunity to learn about a social network user’s interests, supposing that our system has the necessary permissions to access the social network user’s information. R could maintain a user’s profiles from his/her existing information on social networks through our profiling model. The next question is how to make recommendations. Retailers naturally understand that users are more likely to purchase those commodities that most closely match their interests. Figure 2 depicts an overview of our CPRec. The first part is what we call the original data collection period, in which data for “potential consumer profiling” and “commodity profiling” are collected. The original data used in user profiling include three types: historical microblogs, a user’s tags and information about the user’s followees (the people whom that user follows). Information about a commodity (the commodity name, commodity description and buyers’ information) is collected analogously. The second part refers to the profiling process, whereby the data on the user and commodity are analyzed and used to obtain profiles of both. From the original data, we aim to build both a stable interest profile and a temporal interest profile for each user, because a user’s preferences may evolve over time while his or her general preferences remain relatively stable. In the third part, namely, the recommendation process, we offer three subdivisions for the CPRec, and the different subdivisions realize different functions.
Figure 2.
System model of the proposed cross-platform recommender system (CPRec).
3. The Proposed CPRec
3.1. User Profiling
User profiling is the core of the proposed recommender system. It is hardly possible to recommend suitable goods to a consumer without knowing his/her profiles. In this section, we aim to identify each user’s profiles, or what we may call their interests. Because users’ interests may vary, we model a user’s interest profile as two components: stable interests and temporal interests. Stable interests are time-immune, which means that they change only slightly as time passes. Other interests arise for reasons such as the influence of a hot social trend; we define these as temporal interests, meaning that they are short-term interests. In our model, we employ a time-weighting scheme to capture both stable interests and temporal interests from a user’s historical microblogs. Then, since tags are interest labels that users assign to themselves at registration, and are therefore strong evidence of a user’s stable interests, we propose an algorithm to combine the interest profiles drawn from these two sources. Last, we integrate the profiles of a user’s followees.
3.1.1. Latent Interest Profiles Obtained by Microblogs
Many studies related to users’ posted messages have been conducted [14,15,16], including research on identifying influential users in a social network by taking into account the similarity of the topics that users post about, which is determined from the users’ posted messages [17]. In contrast, we wish to detect a user’s latent interest profiles through the messages posted naturally by that user. The method we adopt is based on Latent Dirichlet Allocation (LDA), an unsupervised machine learning technique which has been widely used to detect latent topics in documents. LDA treats a single document as “a bag of words”, which means that it views a document as a vector of word counts. Each document is represented as a probability distribution over various topics, while each topic is represented as a probability distribution over a number of words, as shown in Figure 3.
Figure 3.
An abstract representation of the LDA principle.
Standard LDA may not fit the writing of microblog users, because a single microblog is usually short and contains only one topic. The method we adopt is therefore the Microblogs Topic Discovery Model (MTDM), which is based on the Twitter-LDA in [18]. In Section 2, we introduced a method for dividing a user’s microblogs into groups such that $M = \{M_1, M_2, \dots, M_n\}$. $M_j$ denotes all of the microblogs that the user released during time slot $j$, and it is processed to obtain the user’s interest profile during that time slot.
Suppose that there are $T$ hidden topics in the microblog set $M_j$, that each topic $t$ has a word distribution $\phi^t$, and that there is a background word distribution $\phi^B$. $\pi$ denotes a Bernoulli distribution that governs the choice between background words and topic words. $\theta^u$ is the topic distribution of user $u$. Each multinomial distribution is governed by a symmetric Dirichlet prior. Gibbs sampling is used to perform model inference. We omit the derivation details and the sampling formulas here. Figure 4 describes the generation process of microblogs, and we illustrate the plate notation of the model in Figure 5.
Figure 4.
Generation process of Microblogs.
Figure 5.
Plate notation of Microblogs Topic Discovery Model.
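The generative story described above can be simulated in a few lines. This is a hedged sketch of a Twitter-LDA-style process for a single user, not the authors' code: the function names, the hyperparameter defaults (`alpha`, `beta`, `gamma`) and the Beta prior on the topic-word probability are assumptions made for illustration. Each microblog draws one topic from the user's topic distribution, and each word is drawn either from that topic's word distribution or from the background distribution.

```python
import random


def dirichlet(alpha, size, rng):
    """Sample a symmetric Dirichlet vector via normalized Gamma draws."""
    # The small floor guards against a numerically zero draw.
    draws = [max(rng.gammavariate(alpha, 1.0), 1e-12) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_microblogs(vocab, num_topics, num_posts, words_per_post,
                        alpha=0.5, beta=0.5, gamma=1.0, seed=0):
    """Simulate one user's microblogs; returns a list of (topic, words)."""
    rng = random.Random(seed)
    vocab_size = len(vocab)
    # One word distribution per topic, plus one background distribution.
    phi_topic = [dirichlet(beta, vocab_size, rng) for _ in range(num_topics)]
    phi_background = dirichlet(beta, vocab_size, rng)
    # Probability that a word is a topic word rather than a background word.
    pi = rng.betavariate(gamma, gamma)
    # Topic distribution of the user.
    theta_u = dirichlet(alpha, num_topics, rng)

    posts = []
    for _ in range(num_posts):
        # A short microblog is assumed to carry a single topic.
        z = rng.choices(range(num_topics), weights=theta_u)[0]
        words = []
        for _ in range(words_per_post):
            source = phi_topic[z] if rng.random() < pi else phi_background
            words.append(rng.choices(vocab, weights=source)[0])
        posts.append((z, words))
    return posts


for topic, words in generate_microblogs(["tea", "code", "trip"], 2, 3, 4):
    print(topic, words)
```

Inference runs this process in reverse: Gibbs sampling estimates the hidden topic of each microblog and the topic/background indicator of each word from the observed words.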
As a result of the method we adopt, the microblog set $M_j$ is modeled as a distribution over all topics: we obtain the latent topics and represent the topic distribution of the microblog collection as a topic vector, each entry of which denotes the weight of the representative words for one topic. Moreover, the topics that a user focuses on are the latent interests that the user has, so, leveraging LDA, we depict $M_j$ as an interest vector
$$I_{M_j} = \{(i_1, w_1), (i_2, w_2), \dots, (i_k, w_k)\}$$
where $I_{M_j}$ denotes the set of interest vectors, $i_x$ represents a kind of interest (lurking in topics, e.g., eating, IT, traveling), and $w_x$ is the degree that $i_x$ occupies in user $u$’s interest. Taking all the $I_{M_j}$ into account, we formulate $I_M$ (the latent interest vector in user $u$’s microblogs) as
Note that time is a factor that influences a user’s interests: some interests fade or weaken. We therefore add a time weight function $f$ to each $I_{M_j}$. There are two requirements for $f$:
- $f$ must be monotonically decreasing in the age of the microblog group, because the current interest vector should have more weight in $I_M$.
- The value of $f$ should lie in the range [0, 1].
Inspired by Li et al. [19], we found that three kinds of time function could be used to describe the time-weight curve: an exponential function, a logistic function and a damping function, each governed by a decay rate. Figure 6 shows the diagrams of these three functions, in which we can clearly see that all of them are monotonically decreasing and tend to zero, which makes them suitable for describing interest attenuation. Because a user’s interests should not change quickly, we choose the damping function as our time weight function. Hence, the calculation formula is updated as
Figure 6.
Diagrams of three time functions.
Then $I_M$ can be rewritten as follows
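As an illustration of the time-weighting idea only, the sketch below assumes common textbook forms for the three decay functions, each equal to 1 at age 0 and decreasing towards 0, and assumes that the per-slot interest vectors are combined by a weighted sum. These functional forms, the parameter `lam` and the summation rule are assumptions for the example, not the paper's exact formulas.

```python
import math


def exponential_decay(age, lam):
    """Assumed form: exp(-lam * age)."""
    return math.exp(-lam * age)


def logistic_decay(age, lam):
    """Assumed form: 2 / (1 + exp(lam * age)), scaled so that f(0) = 1."""
    return 2.0 / (1.0 + math.exp(lam * age))


def damping_decay(age, lam):
    """Assumed form: 1 / (1 + lam * age); it decays slowly for old slots."""
    return 1.0 / (1.0 + lam * age)


def aggregate_interests(slot_vectors, lam=0.1, decay=damping_decay):
    """Combine per-slot interest vectors, oldest first, into one vector.

    Each slot vector is a dict {interest: degree}. The current (last) slot
    has age 0 and therefore the largest weight.
    """
    n = len(slot_vectors)
    combined = {}
    for j, vector in enumerate(slot_vectors, start=1):
        weight = decay(n - j, lam)
        for interest, degree in vector.items():
            combined[interest] = combined.get(interest, 0.0) + weight * degree
    return combined


history = [{"travel": 0.6, "food": 0.4}, {"food": 1.0}]
print(aggregate_interests(history, lam=1.0))
```

In this toy history the older "travel" interest is halved by the damping weight, while the current "food" interest keeps its full degree.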
3.1.2. Interest Profiles Obtained from Tags
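A user $u$’s tags are denoted as $Tag_u = \{tag_1, tag_2, \dots, tag_m\}$, which correspond to the user’s interests $\{i_1, i_2, \dots, i_m\}$. Considering that tags are dominant and stable indicators of a user’s interests, we set $I_{Tag}$ (the interest vector in user $u$’s tags) by the following equation:
$$I_{Tag} = \{(i_1, c), (i_2, c), \dots, (i_m, c)\}$$
where $c$ is a constant and the value of $c$ corresponds to the interest degree of each $i_x$.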
3.1.3. Interest Profiles Obtained by Followees’ Profiles
In the real world, we have more connections with people who have tastes and habits similar to ours. A user’s motivation to follow another user in a social network is largely determined by whether the two have nearly the same interests. In other words, the followees’ profiles can mirror the user’s profiles to some extent. Therefore, we expand a user’s profiles via the people whom the user is following. Suppose user $u$ follows $l$ people and their profiles have already been created. Hence, $I_F$, the interest profile reflected by the followees, is calculated by
where is reduction factor.
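One plausible reading of this step is an average of the followees' interest vectors scaled down by the reduction factor. The sketch below makes that assumption explicit; the averaging rule, the function name and the default `reduction=0.5` are hypothetical choices for illustration.

```python
def followee_profile(followee_profiles, reduction=0.5):
    """Average the followees' interest vectors and damp them.

    followee_profiles: list of dicts {interest: degree}, one per followee.
    reduction: assumed reduction factor in (0, 1].
    """
    count = len(followee_profiles)
    if count == 0:
        return {}
    totals = {}
    for profile in followee_profiles:
        for interest, degree in profile.items():
            totals[interest] = totals.get(interest, 0.0) + degree
    # An interest missing from a followee's profile contributes 0 to the mean.
    return {interest: reduction * total / count
            for interest, total in totals.items()}


print(followee_profile([{"it": 1.0}, {"it": 0.5, "food": 0.5}]))
```

Keeping the reduction factor below 1 ensures that interests inferred from followees never outweigh the same degree of interest observed directly from the user.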
3.1.4. Stable Interest Profiles and Temporal Interest Profiles
We have acquired three kinds of interest profiles ($I_M$, $I_{Tag}$ and $I_F$), which are obtained from a user’s historical microblogs, tags and followees, respectively. In this section, we propose an algorithm to integrate $I_M$, $I_{Tag}$ and $I_F$, and then define stable interest profiles and temporal interest profiles. First, Algorithm 1 gives the procedure for integrating $I_M$ and $I_{Tag}$, and we denote its result as $I_{MT}$. Then, we follow the same procedure for integrating $I_{MT}$ and $I_F$, which we do not detail here. This finally yields the user’s profile $I_u$, which we define as the user’s stable interest profile, while the temporal interest profile refers to the current interest vector $I_{M_n}$, which is determined by the recent user data.
| Algorithm 1. Procedure for integrating $I_M$ and $I_{Tag}$ |
| Input: the interest sequences of $I_M$ and $I_{Tag}$; the interest vectors $I_M$ and $I_{Tag}$ |
| Output: user $u$’s interest vector $I_{MT}$ |
| 1. initialize $I_{MT} = I_M$; |
| 2. for i = 1 : m (the number of interests in $I_{Tag}$) |
| 3. if $i_i \in I_{MT}$, |
| 4. update the weight of $i_i$ in $I_{MT}$; |
| 5. else |
| 6. insert $i_i$ and its value into $I_{MT}$; |
| 7. end for; |
| 8. output $I_{MT}$; |
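The merge procedure of Algorithm 1 can be sketched as a dictionary merge. The value written in the update step is not specified here, so the sketch exposes it as a parameter and defaults to keeping the larger of the two weights. That default, like the function name `integrate`, is an assumption rather than the paper's rule.

```python
def integrate(base, other, update=max):
    """Merge interest vector `other` into a copy of `base`.

    base, other: dicts {interest: weight}.
    update: rule applied when an interest appears in both vectors
            (assumed here to be `max`; overwrite or sum are alternatives).
    """
    merged = dict(base)                       # step 1: start from the first vector
    for interest, value in other.items():     # step 2: loop over the second vector
        if interest in merged:                # step 3: interest already present
            merged[interest] = update(merged[interest], value)  # step 4
        else:
            merged[interest] = value          # step 6: insert a new interest
    return merged


microblog_profile = {"food": 0.4, "it": 0.2}
tag_profile = {"food": 1.0, "travel": 1.0}
stage_one = integrate(microblog_profile, tag_profile)
stable_profile = integrate(stage_one, {"it": 0.1})  # then merge the followee vector
print(stable_profile)
```

The same function is applied twice, first to the microblog and tag vectors and then to the result and the followee vector, mirroring the two-stage integration described above.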
3.2. Commodity Profiling
The name of a commodity is made up of commodity details. For example, “Motorola Moto 360 2nd Gen Smartwatch for Most Apple iOS and Android Cell Phones (Men’s, 42 mm, Black w/Black Leather)” is the trade name of a product sold on Amazon. Generally, e-commerce platforms limit the length of the trade name, so each component carries important information about the product. Observing this trade name, we divide it into several key components: “Motorola”, “Moto 360”, “Smartwatch”, “Apple”, “iOS”, “Android”, “Cell Phones”, “Men’s”, “Black”, “w/Black” and “Leather”. These key components show consumers the nature of the commodity concisely and explicitly. We define these components as commodity profile components $N_c = \{n_1, n_2, \dots\}$, where each $n_x$ is one component, such as “Motorola”. The other crucial aspect we consider is the commodity’s classifying labels, which were recorded in the e-business database when the vendor listed the item. For instance, “Cell Phones & Accessories”, “Accessories” and “Smart Watch Accessories” are three classifying labels for the commodity mentioned above. From these three labels, consumers understand that this product belongs to those classifications. Let $L_c = \{l_1, l_2, \dots, l_r\}$ be the collection of a commodity’s classification labels.
For each commodity $c$, $P_N$ and $P_L$ are the profiles obtained from that commodity’s name and classifying labels, respectively. By combining $P_N$ and $P_L$, the commodity profile $P_c$ can be obtained using Algorithm 2.
| Algorithm 2. Procedure for integrating $P_N$ and $P_L$ |
| Input: the interest sequences of $P_N$ and $P_L$; the interest vectors $P_N$ and $P_L$ |
| Output: commodity $c$’s profile $P_c$ |
| 1. initialize $P_c = P_N$; |
| 2. for i = 1 : r |
| 3. if $l_i \in P_c$, |
| 4. update the weight of $l_i$ in $P_c$; |
| 5. else |
| 6. insert $l_i$ and its value into $P_c$; |
| 7. end for; |
| 8. output $P_c$; |
3.3. Recommendation Subdivisions
In this section, we introduce three recommendation subdivisions. We propose three subdivisions because the recommendation process is complex, and a complete recommendation scheme must therefore cover several stages. Recall the process by which we intend to recommend a commodity from an e-commerce platform to a social network user. First, we consider how to recommend a single commodity to an individual. The user profiles and commodity profiles obtained above are the means to resolve this question. However, when the individual receives an item recommendation based on both profiles, different actions may ensue, as he/she may accept, reject or ignore it. We treat these actions as feedback on the item, because the different actions represent different levels of acceptance of the recommended item. We then design a feedback mechanism that takes such feedback actions into account. By regarding the three feedback actions as three user-item scores, a user-commodity recommendation matrix is acquired. Finally, we employ an improved collaborative filtering algorithm, which makes recommendations based on the user-commodity recommendation matrix. Consequently, this section introduces a recommendation scheme containing three subdivisions: recommendations for individuals, the feedback mechanism of the recommendation scheme, and recommendations by the improved collaborative filtering algorithm.
3.3.1. Recommendations for Individuals
Since the proposed cross-platform recommender system recommends commodities across platforms, the profiles we obtain from both platforms become the bridge connecting user and commodity. Because consumers are more likely to spend money on commodities that fit their habits, we compute the similarity between the user profiles and the commodity profiles. For a given user $u$ and commodity $c$, the recommended score $Score(u, c)$ (the cosine similarity of $I_u$ and $P_c$) is computed as follows:
$$Score(u, c) = \frac{I_u \cdot P_c}{\lVert I_u \rVert \, \lVert P_c \rVert}$$
Supposing $\delta$ is the critical value of recommendation, then
$$Rec(u, c) = \begin{cases} 1, & Score(u, c) \ge \delta \\ 0, & \text{otherwise} \end{cases}$$
which means that, if the relevance of $u$ and $c$ is large enough, we recommend $c$ to $u$. Otherwise, $c$ is not a suitable recommended item for $u$.
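The scoring rule for individual recommendations can be sketched as a cosine similarity over sparse profile vectors followed by a threshold test. The dictionary representation, the function names and the default threshold of 0.5 are illustrative assumptions; the source does not state the critical value or whether the comparison is strict.

```python
import math


def cosine_score(user_profile, commodity_profile):
    """Cosine similarity of two sparse vectors stored as {feature: weight}."""
    shared = set(user_profile) & set(commodity_profile)
    dot = sum(user_profile[k] * commodity_profile[k] for k in shared)
    norm_u = math.sqrt(sum(v * v for v in user_profile.values()))
    norm_c = math.sqrt(sum(v * v for v in commodity_profile.values()))
    if norm_u == 0.0 or norm_c == 0.0:
        return 0.0  # an empty profile matches nothing
    return dot / (norm_u * norm_c)


def should_recommend(user_profile, commodity_profile, threshold=0.5):
    """Recommend when the score reaches the (assumed) critical value."""
    return cosine_score(user_profile, commodity_profile) >= threshold


user = {"soccer": 1.0, "travel": 1.0}
soccer_ball = {"soccer": 1.0}
basketball = {"basketball": 1.0}
print(should_recommend(user, soccer_ball), should_recommend(user, basketball))
```

This reproduces the soccer and basketball example: a user profile with no weight on "basketball" scores zero against a basketball and is not recommended one.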
3.3.2. Feedback Mechanism
When dealing with individual recommendations, the recommended score plays an important role. However, once a user receives the item recommendation, the user may take different actions. Supposing that a user u receives a recommendation of commodity c, one of the three actions below would be taken:
- Click the recommended item, browse it and finally buy it.
- Click the recommended item and browse it, but do not buy it.
- Do not click the recommended item.
Different actions indicate different levels of acceptance (emotions) of the recommended item. Table 1 lists the details of these three actions as different types of feedback from the user, which correspond to three rankings of the recommended commodity, defined as rank(high), rank(middle) and rank(low), respectively. rank(high) means that the result of recommending the commodity with its profile to the user with his/her profile is perfect, while rank(middle) means the result is generally positive and rank(low) means the result is bad.
Table 1.
Different actions and the corresponding emotions.
Then the recommended score in the recommendation database is updated according to the user’s feedback. Formally, we define the feedback score and have
where is a positive real number. Hence, is updated by the formula
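The feedback loop can be sketched as an additive update of the stored recommended score. The concrete feedback values are not given here, so the mapping below (a full step up for a purchase, half a step up for a click without purchase, a full step down for no click) is a hypothetical choice that merely respects the ordering rank(high) > rank(middle) > rank(low); `epsilon` stands in for the positive real number mentioned above.

```python
from enum import Enum


class Action(Enum):
    BUY = "click, browse and buy"
    BROWSE = "click and browse, no purchase"
    IGNORE = "no click"


def feedback_score(action, epsilon):
    """Assumed mapping from a user action to a feedback score."""
    if action is Action.BUY:
        return epsilon          # rank(high)
    if action is Action.BROWSE:
        return epsilon / 2.0    # rank(middle), treated as mildly positive
    return -epsilon             # rank(low)


def update_score(score, action, epsilon=0.1):
    """Return the recommended score after applying the user's feedback."""
    return score + feedback_score(action, epsilon)


for action in Action:
    print(action.name, round(update_score(0.6, action), 3))
```

Each updated value becomes one entry of the user-commodity score matrix used by the collaborative filtering step.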
3.3.3. Recommendations Based on Collaborative Filtering Algorithm Using User Profiles
Once the proposed cross-platform recommender system is in operation and recommends different commodities to different users, a user-commodity-feedback score matrix is obtained. We apply the collaborative filtering (CF) method to predict the preference score of a given item for a given user with the help of this matrix. CF is one of the most successful and widely used recommendation techniques. The basic idea of CF-based algorithms is to provide item recommendations or predictions based on the opinions of other like-minded users, and CF characterizes consumers and products implicitly by their previous interactions. Since we also have user interest profiles, we improve the traditional collaborative filtering algorithm by combining it with the similarity between users’ profiles.
Suppose that we have a list of $m$ users $U = \{u_1, u_2, \dots, u_m\}$ and a list of $n$ commodities $C = \{c_1, c_2, \dots, c_n\}$. The user-commodity-score matrix is denoted as $R_{m \times n}$.
$$R_{m \times n} = \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{m1} & r_{m2} & \cdots & r_{mn} \end{pmatrix}$$
where $r_{ij}$ represents the feedback score that user $i$ gives to commodity $j$. $R_{m \times n}$ is typically a sparse matrix, and the recommendation method aims to predict the unknown scores in $R_{m \times n}$.
Measuring the similarity between users is quite important in CF. Three popular similarity measures are used in CF: Cosine Similarity, Pearson Correlation Coefficient Similarity and Modified Cosine Similarity. In this paper, Pearson Correlation Coefficient Similarity is used. The formula is shown below.
$$sim(a, b) = \frac{\sum_{j \in C_{ab}} (r_{aj} - \bar{r}_a)(r_{bj} - \bar{r}_b)}{\sqrt{\sum_{j \in C_{ab}} (r_{aj} - \bar{r}_a)^2} \, \sqrt{\sum_{j \in C_{ab}} (r_{bj} - \bar{r}_b)^2}}$$
where $sim(a, b)$ denotes the similarity of user $a$ and user $b$. $r_{aj}$ and $r_{bj}$ denote the scores that $u_a$ and $u_b$ give to commodity $c_j$. $\bar{r}_a$ and $\bar{r}_b$ are the average scores that $u_a$ and $u_b$ give to commodities. $C_{ab}$ denotes the set of commodities to which both $u_a$ and $u_b$ have given a score.
$p_{aj}$ indicates the predicted score that $u_a$ gives to $c_j$, which means that $u_a$ has never given a score to $c_j$. First, we need the set $U_j$ of users who have given scores to $c_j$. Then we calculate the similarity between $u_a$ and every user in $U_j$, choose the $k$ users with the greatest similarity as the set of neighbor users $N_a$ and calculate $p_{aj}$ by
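The neighbor-based prediction can be sketched as follows, assuming the usual mean-centered weighted average over the k most similar users who have scored the item. That prediction rule is the standard one paired with Pearson similarity and is an assumption here; the nested-dictionary rating store and the function names are likewise illustrative.

```python
import math


def mean_score(ratings, user):
    """Average of all scores the user has given."""
    scores = ratings[user].values()
    return sum(scores) / len(scores)


def pearson(ratings, a, b):
    """Pearson correlation of users a and b over their co-rated commodities."""
    common = set(ratings[a]) & set(ratings[b])
    if not common:
        return 0.0
    mean_a, mean_b = mean_score(ratings, a), mean_score(ratings, b)
    num = sum((ratings[a][j] - mean_a) * (ratings[b][j] - mean_b) for j in common)
    den_a = math.sqrt(sum((ratings[a][j] - mean_a) ** 2 for j in common))
    den_b = math.sqrt(sum((ratings[b][j] - mean_b) ** 2 for j in common))
    if den_a == 0.0 or den_b == 0.0:
        return 0.0
    return num / (den_a * den_b)


def predict(ratings, a, item, k=2):
    """Predict user a's score for an item he/she has not scored yet."""
    candidates = [b for b in ratings if b != a and item in ratings[b]]
    sims = {b: pearson(ratings, a, b) for b in candidates}
    neighbours = sorted(candidates, key=lambda b: sims[b], reverse=True)[:k]
    mean_a = mean_score(ratings, a)
    num = sum(sims[b] * (ratings[b][item] - mean_score(ratings, b))
              for b in neighbours)
    den = sum(abs(sims[b]) for b in neighbours)
    return mean_a if den == 0.0 else mean_a + num / den


ratings = {
    "u1": {"c1": 3, "c2": 1},
    "u2": {"c1": 3, "c2": 1, "c3": 4},
    "u3": {"c1": 1, "c2": 3, "c3": 3},
}
print(round(predict(ratings, "u1", "c3", k=1), 3))
```

With k = 1 the only neighbor of u1 is u2, whose tastes on c1 and c2 agree with u1's, so the prediction is u1's mean score shifted by u2's deviation on c3.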
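Considering that each user has his or her own profiles, we make some improvements to the similarity formula and propose a collaborative filtering algorithm based on the user profiles (CFUP). Suppose $u_a$ and $u_b$ have $n$-dimensional profiles $I_a = (w_{a1}, w_{a2}, \dots, w_{an})$ and $I_b = (w_{b1}, w_{b2}, \dots, w_{bn})$. We use the Euclidean metric to measure the profile difference between $u_a$ and $u_b$. $d(a, b)$ denotes the Euclidean distance between $I_a$ and $I_b$, and is calculated as shown below.
$$d(a, b) = \sqrt{\sum_{x=1}^{n} (w_{ax} - w_{bx})^2}$$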
Then, the improved similarity formula appears as follows:
where .
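One way to fold profile distance into the rating similarity is a convex combination of the Pearson score and a distance-derived profile similarity. The sketch below assumes that form, with the profile similarity taken as 1 / (1 + d) and a mixing weight `alpha` in [0, 1]. Both are hypothetical choices for illustration and may differ from the paper's improved formula.

```python
import math


def euclidean_distance(profile_a, profile_b):
    """Euclidean distance between two sparse profiles {interest: weight}."""
    keys = set(profile_a) | set(profile_b)
    return math.sqrt(sum((profile_a.get(k, 0.0) - profile_b.get(k, 0.0)) ** 2
                         for k in keys))


def cfup_similarity(rating_similarity, profile_a, profile_b, alpha=0.5):
    """Blend rating similarity with profile closeness (assumed form).

    rating_similarity: e.g. the Pearson similarity of the two users.
    alpha: assumed mixing weight; 1.0 falls back to plain CF.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    profile_similarity = 1.0 / (1.0 + euclidean_distance(profile_a, profile_b))
    return alpha * rating_similarity + (1.0 - alpha) * profile_similarity


same = {"food": 1.0, "it": 0.2}
print(cfup_similarity(0.5, same, same))             # identical profiles
print(cfup_similarity(0.5, same, {"travel": 1.0}))  # distant profiles
```

Under this form, two users with identical profiles receive the maximum profile bonus, which helps when the score matrix is too sparse for the rating similarity alone to be reliable.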
4. Evaluation and Analysis
In this section, we first introduce the data that we used. After that, we conduct experiments to validate the value of the cross-platform recommender system and confirm that promotional information in a social network can affect e-business. Next, we compare the performance of the Microblogs Topic Discovery Model and standard LDA in user profiling. Following that, users’ profiles are built by our topic model. Finally, a comparison between CFUP and the traditional collaborative filtering algorithm is presented.
4.1. Data Preparation
The social network data source we adopted in this paper is Sina Weibo, which is the most popular and influential social network in China. It had 530 million monthly active users (MAUs) in December 2021. For our simulation, the original microblog corpus was collected from the Sina Weibo website using a crawler tool. We collected this dataset by starting from a seed set of active Sina Weibo users (we call these users shop owners) who are active both in social networks and in e-business. When these users market their commodities using microblogs (we call these microblogs promotional microblogs) in a social network, we trace the information in these microblogs and collect data on both the social network platform and the e-business platform. This process is shown in Figure 7.
Figure 7.
The detailed process of collecting data.
For the purpose of obtaining credible and complete user profiles, the collected data from the social network platform contains four main types of information, which are shown in Table 2 in detail.
Table 2.
Four types of information collected from social network platform.
The same collection method is used in the commodity information collection process. The six types of commodity information are shown in Table 3.
Table 3.
Six types of information collected from e-business platform.
4.2. Effects of a Promotional Microblog on Commodity Sales
In this section, we study the following process: a shop owner who has a Weibo account and a number of followers releases a promotional microblog, namely, an advertisement for his products, which are available on an e-commerce platform. Would this microblog promote his product sales? Starting from this question, we conduct the experiments below.
First, several Weibo users who are shop owners and have shops (selling food, cosmetics, household goods, etc.) on Taobao (a famous e-commerce platform) were chosen. When they release a promotional microblog for selling their own products, we track the microblog and collect the product’s sales information from Taobao. Later, we analyze these sales data. In the instance of the promotional microblog shown in Figure 8, one of the shop owners has a Weibo account (his homepage: http://weibo.com/wysr2007 (accessed on 7 March 2022)) and a shop on an e-commerce site (shop link: https://shop70713800.taobao.com/ (accessed on 7 March 2022)). He has more than 1.9 million followers. The promotional microblog in Figure 8 was released by this shop owner to announce that there would be a discount in his shop. In total, it received 1009 forwards, 368 comments and 433 likes.
Figure 8.
An instance of the special users we study.
In this experiment, five different promotional microblogs are selected and analyzed, and detailed information about them is shown in Table 4.
Table 4.
Detailed information on five different promotional microblogs.
After these microblogs were released, we kept track of all of the commodity sales in the e-business stores and collected their sales information. Figure 9 shows how the different shop owners’ commodity sales changed over time.
Figure 9.
The curves of different shop owners’ commodity sales.
In these figures, the abscissa denotes time, with each unit representing one day and zero marking the day on which the shop owner released the promotional microblog, while the ordinate denotes the sales volume of the shop on the e-commerce platform. To display the change clearly, we use a red line to indicate where commodity sales are higher than before.
From these curves, we can observe that, after these shop owners released their promotional microblogs, there was an obvious upward trend in their shops’ sales volumes. For shop owners 1, 4 and 5, commodity sales peaked after they released the promotional microblogs. For shop owners 2 and 3, sales volume remained continuously higher than in the days before. Although these sales curves move differently, their general tendency is upward, which indicates that promotional microblogs do help to promote sales. In other words, a social network can play a role in creating economic value for e-commerce, which gives meaning to our research field. In turn, if we know a user’s profiles in a social network, meaning that we know what this user prefers, could we take a suitable commodity and recommend it to that user? This is the process of cross-platform recommendation, which discovers a user’s interests with the help of social media and chooses products that fit that user’s interests. In the next section, we evaluate our method of obtaining user profiles.
4.3. Microblogs Topic Discovery Model and Standard LDA Model
In order to test the efficiency of our model, we quantitatively compare the MTDM with the standard LDA model, i.e., treating all tweets as a single document.
The above-mentioned two models have four parameters, and different choices of parameters affect the inference results. In our experiment, drawing on other research and on our own experience, the number of topics T is set to 5, α is 50/T, β is 0.1 and the number of Gibbs sampling iterations is 1000. In addition, some preparatory work must be completed before using these two models, such as deleting stop words, removing punctuation and segmenting words. We omit the description of this work. Table 5 shows samples of the results obtained using MTDM and the standard LDA model (we only list six words in each topic, and we translate them into English in brackets).
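To make the baseline settings concrete, the following is a minimal sketch of collapsed Gibbs sampling for standard LDA. It assumes that the two prior values quoted above (50/T and 0.1) are the usual document-topic and topic-word Dirichlet priors, called `alpha` and `beta` here; the function names and the toy token lists are hypothetical, and this is not the MTDM itself.

```python
import random

def lda_gibbs(docs, num_topics=5, beta=0.1, iterations=1000, seed=0):
    """Collapsed Gibbs sampling for standard LDA on tokenized documents."""
    alpha = 50.0 / num_topics          # assumed document-topic prior, 50/T
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]               # tokens of doc d in topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # count of word v in topic k
    topic_total = [0] * num_topics                              # tokens assigned to topic k
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the current token from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional distribution over topics for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    return vocab, topic_word

def top_words(vocab, topic_word, n=6):
    """Return the n most frequent words of each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in topic_word
    ]
```

Calling `top_words` with `n=6` mirrors the six words per topic listed in Table 5; stop-word removal, punctuation removal and word segmentation are assumed to have been done before `docs` is built.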
Table 5.
The result samples of MTDM and Standard LDA.
We repeat the process 100 times with different data sets so that we have 100 pairs of results. We select three human judges to assess these results. The results are first mixed randomly and then sent to the judges, who assign a grade to each topic according to the original data. The grading rules are given below.
Grading rules:
- 1: meaningful and coherent
- 0.5: not very good; contains other topics or meaningless words
- 0: makes no sense
Then, we calculate the average grade for each of the two models and list them in Table 6.
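The averaging step can be sketched as follows. The grades in the usage note are hypothetical placeholders, not the judges' actual grades, and the helper name `average_grade` is an assumption introduced for illustration.

```python
from statistics import mean

# The three grades permitted by the grading rules.
ALLOWED_GRADES = {0, 0.5, 1}

def average_grade(grades_by_judge):
    """Average all per-topic grades given by all judges to one model.

    grades_by_judge: one list of grades per judge.
    """
    all_grades = [g for judge in grades_by_judge for g in judge]
    if not all_grades:
        raise ValueError("no grades given")
    if any(g not in ALLOWED_GRADES for g in all_grades):
        raise ValueError("grades must be 0, 0.5 or 1")
    return mean(all_grades)
```

For example, `average_grade([[1, 0.5, 1], [1, 1, 0.5], [0.5, 1, 1]])` averages nine hypothetical grades from three judges; running it once per model gives the two numbers to compare.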
Table 6.
Comparison results of MTDM and Standard LDA.
The average grade of MTDM is higher than that of the standard LDA, which indicates that MTDM performs better at topic detection. In the next section, we use MTDM to detect user interest profiles.
4.4. The Construction of User Interest Profiles
This section builds user interest profiles based on the data we have collected from social networks and tests the effectiveness of the profiles we have built.
4.4.1. The Process of Building User Interest Profiles
Here, we choose one user as an example. Considering that the user’s interests may change as time goes by, we separate the user’s microblogs into five groups, so that each group contains the microblogs that the user released during one six-month period. Then, we train MTDM on each of the five groups separately. The user interest vectors for the different periods are shown in Table 7.
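The grouping step can be sketched as below. This is a minimal illustration under stated assumptions: posts are `(date, text)` pairs, periods are counted backwards in whole calendar months from a chosen reference date, and group 0 is the most recent period. The function names and the ordering of groups are hypothetical.

```python
from datetime import date

def months_between(earlier, later):
    """Whole calendar months elapsed from `earlier` to `later`."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the last month is not yet complete
    return months

def split_into_periods(posts, reference, num_groups=5, months_per_group=6):
    """Split (date, text) posts into consecutive six-month groups.

    Group 0 holds the most recent period before `reference`; posts newer
    than `reference` or older than the last group are ignored.
    """
    groups = [[] for _ in range(num_groups)]
    for posted_on, text in posts:
        if posted_on > reference:
            continue
        index = months_between(posted_on, reference) // months_per_group
        if index < num_groups:
            groups[index].append(text)
    return groups
```

Each of the returned groups would then be passed to the topic model separately to obtain one interest vector per period.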
Table 7.
Examples of profiles building process.
We can observe from Table 7 that the interest vectors of the five periods differ from each other. The vector for the last six months shows that the user mainly focused on Car, Mobile and Soccer, while the vector for another period shows that the user was interested in Shopping, Mobile, Technology, Soccer and News. This demonstrates that user interests detected from microblogs, which reflect actual user preferences, change over time. It also indirectly shows the necessity of distinguishing stable interest profiles from temporal interest profiles.
The interest profiles in a user u’s microblogs can be obtained by
Here the value of we choose is 1; therefore, ,,,,.
The user’s microblog interest profile that we obtain is {(Soccer, 0.0296), (Mobile, 0.0259), (Car, 0.0246), (Shopping, 0.0116), (Huawei, 0.009548), (Technology, 0.0088), (Environment, 0.0087), (News, 0.0085), (Meizu, 0.0044), (Website, 0.0021), (Japan, 0.0014), (Company, 0.0015)}.
The user also has the tags (Soccer, News, Humor, Mobile, Java, Game, Post-80s), so the tag-based interest profile is {(Soccer, 0.01), (News, 0.01), (Humor, 0.01), (Mobile, 0.01), (Java, 0.01), (Game, 0.01), (Post-80s, 0.01)}. The user profile can then be obtained by Algorithm 1:
{(Soccer, 0.0396), (Car, 0.0246), (Shopping, 0.0116), (News, 0.0185), (Humor, 0.01), (Mobile, 0.0359), (Java, 0.01), (Game, 0.01), (Post-80s, 0.01), (Huawei, 0.009548), (Technology, 0.0088), (Environment, 0.0087), (Meizu, 0.0044), (Website, 0.0021), (Japan, 0.0014), (Company, 0.0015)}.
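The numbers above are consistent with a simple additive merge: keywords that appear in both the microblog profile and the tag profile have their weights summed (for example, Soccer: 0.0296 + 0.01 = 0.0396), and all other keywords are carried over unchanged. The sketch below reproduces that behavior; it is inferred from the listed values only and is not a restatement of Algorithm 1, and the function name `merge_profiles` is hypothetical.

```python
def merge_profiles(microblog_profile, tag_profile):
    """Merge two keyword->weight profiles by summing weights of shared keywords."""
    merged = dict(microblog_profile)
    for keyword, weight in tag_profile.items():
        merged[keyword] = merged.get(keyword, 0.0) + weight
    return merged

# Values copied from the example user in the text.
microblog_profile = {
    "Soccer": 0.0296, "Mobile": 0.0259, "Car": 0.0246, "Shopping": 0.0116,
    "Huawei": 0.009548, "Technology": 0.0088, "Environment": 0.0087,
    "News": 0.0085, "Meizu": 0.0044, "Website": 0.0021,
    "Japan": 0.0014, "Company": 0.0015,
}
tags = ["Soccer", "News", "Humor", "Mobile", "Java", "Game", "Post-80s"]
tag_profile = {tag: 0.01 for tag in tags}

user_profile = merge_profiles(microblog_profile, tag_profile)
```

Because the merge never mutates its inputs, the microblog profile and the tag profile remain available for building stable and temporal profiles separately.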
4.4.2. Efficiency of the Profiles Used
Having built user profiles using the process described above, we now test the effectiveness of the profiles that we obtain. The method we adopt is indirect and relies on common sense to an extent. It is easy to understand that users in a social network prefer to comment on microblogs when they are interested in the information that the microblog spreads. Therefore, we chose one promotional microblog that advertised a mobile phone and analyzed the interest profiles of its 51 reviewers, obtained through the process above. If the user profiles we obtain are effective, the reviewers’ interest profiles will be very likely to include an interest in ‘Mobile’.
We generate statistics according to keywords in the reviewers’ interest profiles, and the statistical results are shown in Figure 10.
Figure 10.
Statistical results of keywords in users’ interest profiles.
Figure 10 indicates that most of the reviewers are interested in the Mobile area, which supports our assumption. From the figure, we can see that the reviewers of this promotional microblog are mostly interested in mobile, technology, digital and some other mobile-related areas. This indicates that users in social networks are selective about the information that they focus on and that our profile model can describe user interest profiles.
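The keyword statistics behind such a figure can be sketched as a simple count of how many reviewers have each keyword in their interest profile. This is an illustrative sketch only; the profile format (keyword to weight) and the function name are assumptions, and the example profiles in the usage note are hypothetical.

```python
from collections import Counter

def keyword_frequencies(profiles):
    """Count, for each keyword, the number of reviewers whose profile contains it.

    profiles: list of keyword->weight dictionaries, one per reviewer.
    """
    counts = Counter()
    for profile in profiles:
        counts.update(set(profile))  # each keyword counts once per reviewer
    return counts
```

For instance, with three hypothetical reviewers `[{"Mobile": 0.03, "Technology": 0.01}, {"Mobile": 0.02}, {"Soccer": 0.02}]`, calling `keyword_frequencies(...).most_common()` ranks the keywords by the number of reviewers who share them.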
4.5. Analysis of the Collaborative Filtering Algorithm Based on the User Profiles (CFUP)
Since no actual data set exists for a cross-platform recommender system, we choose the SUSHI data set, which is similar to the cross-platform data (containing users’ profiles) that we want to use. It contains 5000 users, 100 different kinds of sushi and the scores that different users give to different types of sushi. Each user has ten attributes. The scores range from zero to four. In this experiment, 80% of the SUSHI data set is used as training data, while the rest is test data. Mean absolute error (MAE) is adopted as the evaluation criterion; it is the sum of the absolute differences between the predicted score and the real score, divided by T, the number of data items.
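The evaluation protocol can be sketched as follows: an 80/20 split and the standard definition of MAE (the mean of the absolute differences between predicted and real scores). The function names, the random shuffling and the seed are assumptions made for this illustration; the source does not state how the split was drawn.

```python
import random

def mean_absolute_error(predicted, actual):
    """Mean of |predicted - actual| over paired scores."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty score lists of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(predicted)

def train_test_split(ratings, train_fraction=0.8, seed=0):
    """Shuffle the rating records and split them into training and test parts."""
    shuffled = list(ratings)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

With scores on the zero-to-four scale, `mean_absolute_error([3, 1, 4], [4, 1, 2])` averages the absolute errors 1, 0 and 2.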
4.5.1. Impacts of the Parameter
The CFUP similarity formula contains a parameter, and different values of this parameter lead to different predicted results. This experiment aims at finding the best value for CFUP. Here the number of neighbor users is 30.
As we can observe from Figure 11, the MAE first decreases and then increases as the parameter increases. The value at which the MAE reaches its lowest point is selected as the best parameter value.
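The parameter search can be sketched as a plain grid search that keeps the candidate with the lowest MAE. The linear blend of rating similarity and attribute similarity shown here is a hypothetical stand-in, since the CFUP similarity formula itself is not reproduced in this section; `blended_similarity`, `best_parameter` and the `evaluate_mae` callback are all names introduced for illustration.

```python
def blended_similarity(rating_sim, attribute_sim, weight):
    """Hypothetical blend: `weight` shifts emphasis from ratings to attributes."""
    return weight * attribute_sim + (1.0 - weight) * rating_sim

def best_parameter(candidates, evaluate_mae):
    """Evaluate every candidate value and return the one with the lowest MAE.

    evaluate_mae: caller-supplied function mapping a parameter value to the
    MAE measured on held-out data with that value.
    """
    scores = {value: evaluate_mae(value) for value in candidates}
    best = min(scores, key=scores.get)
    return best, scores
```

In practice `evaluate_mae` would run the recommender with 30 neighbor users for the given parameter value and return the resulting MAE; the grid of candidate values is a design choice left to the experimenter.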
Figure 11.
The impact of the parameter on MAE.
4.5.2. Comparison Results with Collaborative Filtering Algorithm
This experiment examines whether the collaborative filtering algorithm based on user profiles (CFUP) performs better in terms of prediction precision. To answer this question, we compare it with the traditional collaborative filtering algorithm (CF), and the result is shown in Figure 12. The abscissa is the number of neighbors.
Figure 12.
The comparison results of CF and CFUP.
It can be observed from Figure 12 that MAE decreases as the number of neighbor users increases for both algorithms. It changes sharply at the beginning and then more slowly, before finally appearing to stabilize. Overall, the recommendation result of CFUP is better than that of the traditional CF.
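For reference, the role of the neighbor count in a traditional user-based CF baseline can be sketched as below. This is a generic textbook variant (cosine similarity on co-rated items, similarity-weighted average over the k most similar users), not necessarily the exact baseline used in the experiment; all names are hypothetical.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity of two item->score dicts over their co-rated items."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # all-zero scores carry no direction
    return dot / (norm_a * norm_b)

def predict(ratings, user, item, k=30):
    """Predict `user`'s score for `item` from the k most similar users who rated it."""
    candidates = [
        (cosine_similarity(ratings[user], other_ratings), other_ratings[item])
        for other, other_ratings in ratings.items()
        if other != user and item in other_ratings
    ]
    neighbors = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:k]
    total = sum(sim for sim, _ in neighbors)
    if total <= 0:
        # No usable neighbors: fall back to the user's own mean score.
        own = list(ratings[user].values())
        return sum(own) / len(own) if own else 0.0
    return sum(sim * score for sim, score in neighbors) / total
```

Increasing `k` brings in more, but less similar, neighbors, which is one plausible reason why the error curve flattens as the number of neighbors grows.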
5. Conclusions
This paper proposed a cross-platform recommender system, CPRec. By constructing user profiles and commodity profiles, commodity recommendations are realized based on the similarity between user and commodity. The experiments and analysis demonstrated that social networks affect e-commerce and can play an important role in creating economic value for it. The simulation results showed that the Microblogs Topic Discovery Model performs better than the LDA model, and we built users’ profiles more precisely with the help of the proposed model. Moreover, we improved the traditional collaborative filtering algorithm and proposed a collaborative filtering algorithm based on user profiles (CFUP) by considering the similarity of users’ attributes. The experiments with CFUP identified the best parameter value for CFUP and showed that CFUP obtains more accurate recommendation results than the traditional collaborative filtering algorithm.
In future work, we will focus on studying the information spread path. We aim to find the fastest and broadest path for information spreading in order to enhance the influence of promotional information, because the greater the influence of cross-platform information, the more economic value it may create.
Author Contributions
J.Z. and B.S. designed the proposed method and wrote the paper; J.Z. and X.R. wrote the code and performed the experiments; J.Z. and B.S. analyzed the data; Z.C. modified the paper and offered support. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the 2020 Youth Fund Project of Fuzhou Polytechnic, grant number FZYKJJQN202001. Additionally, the APC was funded by the 2020 Youth Fund Project of Fuzhou Polytechnic (FZYKJJQN202001). This work is also supported by the National Natural Science Foundation of China (62277010), the Fujian Natural Science Foundation (2021J011013 and 2020J01132452) and the Medical Innovation Project (2021CXA001).
Data Availability Statement
Not applicable.
Acknowledgments
The authors thank the 2020 Youth Fund Project of Fuzhou Polytechnic (FZYKJJQN202001) for covering the costs to publish in open access and the costs incurred when writing this study. In addition, the authors thank the anonymous reviewers for their insightful comments that helped improve the quality of this study.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Gao, C.; Huang, C.; Yu, D.; Fu, H.; Lin, T.; Jin, D.; Li, Y. Item Recommendation for Word-of-Mouth Scenario in Social E-Commerce. IEEE Trans. Knowl. Data Eng. 2022, 34, 2789–2809.
- Wigand, R.T.; Benjamin, R.I.; Birkland, J.L.H. Web 2.0 and beyond: Implications for Electronic Commerce. In Proceedings of the 10th International Conference on Electronic Commerce, Innsbruck, Austria, 19–22 August 2008; pp. 1–5.
- Tajvidi, M.; Wang, Y.; Hajli, N.; Love, P.E. Brand Value Co-creation in Social Commerce: The Role of Interactivity, Social Support, and Relationship Quality. Comput. Hum. Behav. 2021, 115, 105238.
- Adam, I.O.; Alhassan, M.D. The Role of Social Media on the Diffusion of E-Government and E-Commerce. Inf. Resour. Manag. J. 2021, 34, 63–79.
- Kang, S.; Lee, D.; Kweon, W.; Yu, H. Personalized Knowledge Distillation for Recommender System. Knowl.-Based Syst. 2022, 239, 107958.
- Forestiero, A. Heuristic recommendation technique in Internet of Things featuring swarm intelligence approach. Expert Syst. Appl. 2022, 187, 115904.
- Fijalkowski, D.; Zatoka, R. An Architecture of a Web Recommender System Using Social Network User Profiles for E-Commerce. In Proceedings of the 2011 Federated Conference on Computer Science and Information Systems (FedCSIS), Szczecin, Poland, 19–21 September 2011; pp. 287–290.
- Ma, H.; Zhou, T.C.; Lyu, M.R.; King, I. Improving Recommender Systems by Incorporating Social Contextual Information. ACM Trans. Inf. Syst. 2017, 29, 1–23.
- Zhao, W.X.; Li, S.; He, Y.; Wang, L.; Wen, J.R.; Li, X. Exploring demographic information in social media for product recommendation. Knowl. Inf. Syst. 2016, 49, 61–89.
- Zhao, W.X.; Li, S.; He, Y.; Chang, E.Y.; Wen, J.R.; Li, X. Connecting Social Media to E-Commerce: Cold-Start Product Recommendation Using Microblogging Information. IEEE Trans. Knowl. Data Eng. 2016, 28, 1147–1159.
- Pan, S.J.; Zhao, L.; Yang, Q. A unified framework of active transfer learning for cross-system recommendation. Artif. Intell. 2017, 245, 38–55.
- Xiang, D.; Zhang, Z. Cross-border e-commerce personalized recommendation based on fuzzy association specifications combined with complex preference model. Math. Probl. Eng. 2020, 2020, 8871126.
- Blei, D.M.; Ng, A.Y.; Jordan, M.I. Latent dirichlet allocation. J. Mach. Learn. Res. 2003, 3, 993–1022.
- Zhou, X.; Chen, L. Migrating social event recommendation over microblogs. Proc. VLDB Endow. 2022, 15, 3213–3225.
- Djenouri, Y.; Belhadi, A.; Srivastava, G.; Lin, C.W. Toward a Cognitive-Inspired Hashtag Recommendation for Twitter Data Analysis. IEEE Trans. Comput. Soc. Syst. 2022, 9, 1748–1757.
- Tahmasebi, H.; Ravanmehr, R.; Mohamadrezaei, R. Social movie recommender system based on deep autoencoder network using Twitter data. Neural Comput. Appl. 2021, 33, 1607–1623.
- Weng, J.; Lim, E.P.; Jiang, J.; He, Q. Twitterrank: Finding topic-sensitive influential twitterers. In Proceedings of the 3rd ACM International Conference on Web Search and Data Mining (WSDM 2010), New York, NY, USA, 4–6 February 2010; pp. 261–270.
- Zhao, W.X.; Jiang, J.; Weng, J.; He, J.; Lim, E.P.; Yan, H.; Li, X. Comparing twitter and traditional media using topic models. In Proceedings of the European Conference on Information Retrieval, Heidelberg, Berlin, 18–21 April 2011; pp. 338–349.
- Li, L.; Zheng, L.; Yang, F.; Li, T. Modeling and broadening temporal user interest in personalized news recommendation. Expert Syst. Appl. 2014, 41, 3168–3177.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).