A Game-Theoretic Approach for Modeling Competitive Diffusion over Social Networks

In this paper, we consider a novel game theory model for the competitive influence maximization problem. We model this problem as a simultaneous non-cooperative game with complete information and rational players, where there are at least two players who are supposed to be out of the network and are trying to institutionalize their options in the social network; that is, the objective of players is to maximize the spread of a desired opinion rather than the number of infected nodes. In the proposed model, we extend both the Linear Threshold model and the Independent Cascade model. We study an influence maximization model in which users’ heterogeneity, information content, and network structure are considered. Contrary to previous studies, in the proposed game, players find not only the most influential initial nodes but also the best information content. The proposed novel game was implemented on a real data set where individuals have different tendencies toward the players’ options that change over time because of gaining influence from their neighbors and the information content they receive. This means that information content, the topology of the graph, and the individual’s initial tendency significantly affect the diffusion process. The proposed game is solved and the Nash equilibrium is determined for a real data set. Lastly, the numerical results obtained from the proposed model were compared with some well-known models previously reported in the literature.


Introduction
In recent years, there has been a growing interest in studying social networks, as they have widespread applications in fields such as sociology, economics, computer science, biology, and mathematics [1][2][3][4][5][6]. In every society, information is disseminated among the population through the relationships between individuals. Thus, an important part of research on social networks targets the problem of diffusion and spread of products, information, and so forth in social networks. Two early studies in this context, by Domingos and Richardson [7] and Kempe et al. [8], deal with the two most popular influence models, the Linear Threshold model and the Independent Cascade model. It is thought that diffusion of messages is usually more effective and convincing if messages are received from a friend rather than from a social change agent (e.g., a company) [9]. Social change agents work to support people and organizations with the aim of creating social impact; universities are instances of social change agents [10]. Therefore, most studies have focused on a situation in which players (social change agents) attempt to find the most influential nodes in order to maximize the total number of infected nodes at the end of the diffusion process [5,7,8].
The influence maximization problem has been studied from two perspectives. The first is a non-competitive situation where there is only one social change agent who wants to diffuse its own option in a social network. This problem was first defined by Domingos and Richardson [7], who considered it in a probabilistic context and provided heuristics to find an influence-maximizing set. Their research was followed up in several studies such as [1,3,8,11]. The other perspective is a competitive situation where two or more social change agents compete with each other in order to maximize the diffusion of their options in a network. Recently, the competitive diffusion problem has been investigated for a cascade model in [12][13][14] and for a threshold model in [8,15].
One of the most efficient tools for studying the competitive diffusion problem is game theory, which has lately been applied in studies such as [16][17][18][19][20][21][22]. These studies treat the diffusion process as a strategic game in which social change agents are players who compete to maximize the diffusion of their options (for instance, their information or their goods) in the social network. They seek to find the most influential initial nodes so as to maximize the total number of infected nodes. Most of these research endeavors focus on social networks with homogeneous nodes. Also, these studies assume that if a node receives influence from more than one player at a time, it is considered a gray node, which has no effect at the subsequent steps of the diffusion process and can be deleted from the network. Moreover, most of the previous research has not paid attention to the way in which data are diffused. These basic assumptions make the models developed in these studies detached from reality.
More recently, Kermani et al. [23] regarded the diffusion process as a game played in a network by external agents and investigated the effect of individual characteristics and message content on the diffusion process. In particular, they supposed that information diffusion takes place within a communication framework such as a cell phone text messaging service. Furthermore, they supposed that nodes are heterogeneous and considered the effect of nodes' identity and message content. These considerations make this study more in line with reality compared to other studies. However, Kermani et al. [23] supposed that individuals have an initial tendency (desire) toward the diffused options which does not change over time, whereas the tendency of individuals in real-world networks may change over time due to the influence they receive from their neighbors. Another assumption that makes this model less realistic is that the content of messages is constant and that players cannot choose their own message content.
The present study treats the diffusion maximization problem as a strategic game in which the effects of individuals' characteristics (i.e., of the nodes in the network), message content, and the topology of the network are investigated. That is to say, individuals have an initial tendency toward the options diffused by the different players. Nodes' tendencies can change due to the effects of their friends during the diffusion process. Additionally, players can select the content of their own messages. While previous studies assumed that players only aim to find the most influential initial nodes, in the model developed here players have the additional goal of finding the best message content. To this end, it is assumed that social change agents attempt to maximize the total sum of the tendencies of individuals toward their own diffused options rather than the total number of infected nodes. To explain the logic behind this choice, it is necessary to review the active node concept as used in the literature. A node is called active if it receives a message from one of the agents (players) in the diffusion process. However, some of the received messages do not have any effect on a node and cannot convince the inactive node to choose that agent's option. Therefore, the total number of infected nodes is not the most suitable criterion for optimizing the diffusion process. Hence, in the current paper, a node is called active if it has received a message that can change its tendency, and players seek to maximize the sum of individuals' tendencies toward their options; i.e., the level of society's desire to choose the options offered by the players is maximized. All of these considerations bring the proposed model closer to reality compared to previous studies.
The remainder of the paper is organized as follows. Section 2 introduces the game and its components. The competitive influence model is presented in Section 3. In Section 4, the proposed model is implemented on a real data set and a sensitivity analysis is provided. Finally, Section 5 concludes the paper and offers suggestions for future research.

The Game
Without loss of generality, in the proposed model, it is assumed that there are two social change agents (players). All of the results can be extended to games that involve more than two players. Players are outside the network and have two different options to be diffused in the network. $P = \{p_1, p_2\}$ denotes the set of players. The network is represented by a weighted directed graph $G(N, E)$ in which $N = \{1, 2, \dots, n\}$ is the set of nodes (i.e., individuals in the network) and $E$ is the set of edges. This network is a messaging network (Short Message System, Telegram, Instagram, WhatsApp, ...) in the sense that arc $(i, j)$ shows that node $i$ can send a message to node $j$. The weight $w_{i,j} \in [0, 1]$ of edge $(i, j)$ represents the effect of node $i$ on node $j$; if there is no edge between $i$ and $j$, then $w_{i,j} = 0$. For each node $i \in N$, two sets $I_i = \{j \in N \mid (j, i) \in E\}$ and $O_i = \{j \in N \mid (i, j) \in E\}$ are defined based on the graph. The nodes are heterogeneous and have different scores. For instance, the initial tendency of node $i$ toward the players' options is represented by the vector $\alpha_i = (\alpha_{i1}, \dots, \alpha_{im})$, where $m$ is the number of players (agents) and $-1 \le \alpha_{ij} \le 1$ shows the tendency of node $i$ toward the option diffused by agent (player) $j$, which may be a result of information obtained from friends, advertisement, or other channels. This tendency can change over time because of the influence of messages received from neighbors during the diffusion process. Further, the normalized social skill score of each node $i$ is denoted by $\beta_i$; it reflects the sociability of node $i$ in the network and is evaluated by a social skill questionnaire [23]. Lastly, for each node $i$, two thresholds $0 \le \theta_i^l \le \theta_i^h \le 1$ are defined, $0 \le \delta \le 1$ is a fixed threshold for the network, and $t \in \{1, 2, \dots\}$ denotes a discrete time step.
Agents (players) send messages with different content to the network. The content of player $p_i$'s message is denoted by $\mathrm{text}_{p_i} = \{t_{p_i 1}, t_{p_i 2}, \dots, t_{p_i m}\}$, where $-1 \le t_{p_i 1} \le 1$ represents the tendency of message $\mathrm{text}_{p_i}$ toward player $p_1$'s option and $-1 \le t_{p_i 2} \le 1$ is the tendency of message $\mathrm{text}_{p_i}$ toward player $p_2$'s option. It might be thought that $\mathrm{text}_{p_1}$ should be $\{1, -1\}$ and $\mathrm{text}_{p_2}$ should be $\{-1, 1\}$; that is, that player $p_i$ should diffuse a message whose content fully supports the player's own option and strongly opposes its rival's option. However, it will be shown in Section 4 that this is not always true: depending on social conditions, players who act more moderately are sometimes more successful in diffusing their options. Contrary to previous studies, in the proposed model each player $p_i$ not only chooses a set of nodes $I \subseteq N$ (based on its budget) but also selects the content $\mathrm{text}_{p_i}$ of its message at the first step. Thus, a strategy of player $p_i$ consists of an initial node set $I \subseteq N$ together with a message content $\mathrm{text}_{p_i}$. Diffusion then continues on the basis of an influence model, and at the end of the diffusion process, the total sum of social tendency toward the $i$th player's option is this player's payoff, which is denoted by $f_{p_i}$. Each player $p_i$ tries to select the best initial set and the best message content in order to maximize $f_{p_i}$.
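The components of the game described above can be sketched as a small data model. This is only an illustrative sketch: the names `Node`, `Strategy`, and `payoff` are hypothetical and do not come from the paper, and the payoff is simply the sum of the nodes' tendencies toward a player's option, evaluated on whatever state the diffusion process has reached.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Node:
    """One individual in the network (hypothetical container, not from the paper)."""
    alpha: List[float]   # tendency toward each player's option, each in [-1, 1]
    beta: float          # normalized social skill score
    theta_low: float     # low threshold of the node
    theta_high: float    # high threshold of the node

    def __post_init__(self) -> None:
        if any(not -1.0 <= a <= 1.0 for a in self.alpha):
            raise ValueError("each tendency must lie in [-1, 1]")
        if not 0.0 <= self.theta_low <= self.theta_high <= 1.0:
            raise ValueError("thresholds must satisfy 0 <= low <= high <= 1")


@dataclass(frozen=True)
class Strategy:
    """A player's choice: initial nodes plus message content."""
    seeds: Tuple[int, ...]      # indices of the initial nodes
    content: Tuple[float, ...]  # tendency of the message toward each option


def payoff(nodes: Sequence[Node], player: int) -> float:
    """Sum of all nodes' tendencies toward the option of `player` (0-based)."""
    return sum(node.alpha[player] for node in nodes)
```

In this sketch a player's payoff would be read off after the diffusion process has updated every `alpha`; the update rule itself is not part of the block.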

The Influence Model
In this section, the influence maximization problem sets the rules of the game. Based on [24,25], both node personality and message content affect the diffusion process. Therefore, social skill, initial tendency, and message content are considered in this influence model. Diffusion occurs in discrete steps. At the first step ($t = 1$), each social change agent (player) $p_i$ selects a set of nodes $I \subseteq N$ (based on its budget) and the content $\mathrm{text}_{p_i}$ of the message to be sent to $I$. At each step, every node is in one of two states, active or inactive. In this influence model, a node is called active if its tendency has changed by receiving a message, and it is inactive if it has not received any message or if the received messages cannot change its initial tendency. At the first step, all of the nodes are inactive. Nodes face three different decisions: first, whether to accept the incoming message, so that the node's tendency changes; second, whether to forward the message; and third, which target nodes to forward the message to. It should be noted that, at the first step ($t = 1$), nodes receive messages from social change agents (players), but at subsequent steps ($t \ge 2$), they receive messages from their neighbors.
At step $t = 1$, all of the nodes are inactive and the diffusion process starts. Players select the content of their messages and their initial subsets of nodes and then send their own messages to them. So, at step $t = 1$, one of the following events occurs for each node $i$:
1. Node $i$ does not receive any message from the players and remains inactive.
2. Node $i$ receives (only) a message from the $k$th social change agent. In this case, the effect of the received message on node $i$ depends on the sender's social skill and the consistency of $i$'s tendency toward the content of the received message. From a mathematical point of view, the magnitude of this effect is calculated by . If this value is lower than node $i$'s low threshold $\theta_i^l$, then it does not influence node $i$ and cannot change its tendency. That is: If this value is higher than node $i$'s low threshold $\theta_i^l$, then it can change the tendency of node $i$; that is: In this case, node $i$ is called active, and if this value is also higher than node $i$'s high threshold $\theta_i^h$, then node $i$ decides to forward the received message. In order to model this step, the variable $x_{ik}(t)$ is defined as follows: $x_{ik}(t) = 1$ if node $i$ decides to send message $\mathrm{text}_{P_k}$ in step $t$, and $x_{ik}(t) = 0$ otherwise. So, the following relationship is concluded: for (k ∈ P − k).
3. Node $i$ receives both messages $\mathrm{text}_{p_1}$ and $\mathrm{text}_{p_2}$ from both players. In this case, node $i$ faces a decision-making situation in which it evaluates the influence of both messages and decides how to act by comparing them. Realistically, node $i$ selects one of the incoming messages based on its own social skill and the consistency of its tendency toward the content of the received messages. The mathematical representation of this situation is as follows: Also, node $i$ decides to forward message $\mathrm{text}_y$ to some of its neighbors if $\mathrm{text}_y$ not only fits best with the node's interest but also has an effect whose magnitude is greater than $\theta_i^h$. That is: where ) and y = P − {y}.
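The threshold logic of the three cases at the first step can be sketched as follows. This is a hedged illustration, not the authors' implementation: the effect magnitude of each message is passed in as a number because its formula is not reproduced here, the names `Outcome`, `classify`, and `decide` are hypothetical, a value exactly equal to a threshold is treated as not exceeding it, and applying the low threshold to the winning message when both messages arrive is an assumption.

```python
from enum import Enum
from typing import Dict, Optional, Tuple


class Outcome(Enum):
    INACTIVE = "inactive"                      # tendency unchanged
    ACTIVE = "active"                          # tendency changes, no forwarding
    ACTIVE_AND_FORWARD = "active_and_forward"  # tendency changes and message is forwarded


def classify(effect: float, theta_low: float, theta_high: float) -> Outcome:
    """Compare one message's effect magnitude with the node's two thresholds."""
    if not 0.0 <= theta_low <= theta_high <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= low <= high <= 1")
    if effect <= theta_low:       # assumption: equality does not exceed the threshold
        return Outcome.INACTIVE
    if effect <= theta_high:
        return Outcome.ACTIVE
    return Outcome.ACTIVE_AND_FORWARD


def decide(effects: Dict[int, float], theta_low: float,
           theta_high: float) -> Tuple[Optional[int], Outcome]:
    """effects maps a player index to the effect magnitude of that player's message.

    Empty dict: no message received. One entry: a single message.
    Two entries: the node keeps the message with the larger effect.
    """
    if not effects:
        return None, Outcome.INACTIVE
    best = max(effects, key=effects.get)
    outcome = classify(effects[best], theta_low, theta_high)
    return (best if outcome is not Outcome.INACTIVE else None), outcome
```

For example, with thresholds 0.3 and 0.6, a node that receives messages with effects 0.5 and 0.8 accepts the second one and forwards it, while a node that receives a single message with effect 0.2 stays inactive.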
At the subsequent steps of the diffusion process, nodes receive messages from their neighbors who have been active and decided to send a message at the previous step. It is to be noted that if node $i$ accepts one of the incoming messages at step $t$ and decides to forward it, it can only do this at step $t + 1$, but not at later steps. Let us suppose that at step $t - 1$, node $i$ has accepted message $\mathrm{text}_{P_k}$ and decided to forward it at step $t$. However, since the cost of forwarding a message is a consideration, node $i$ cannot forward the message to all of its neighbors. Hence, this node selects some of its neighbors as destination nodes (which are inactive and are also better capable of forwarding the received message compared to other nodes). The choice of destination node $j$ by active node $i$ is related to the consistency of $j$'s tendency toward the content of the forwarded message, the influence of node $i$ on $j$, and $j$'s social skill. To model the selection of the destination node, the variable $y_{ijk}(t)$ is defined as follows: $y_{ijk}(t) = 1$ if node $i$ forwards message $\mathrm{text}_{P_k}$ to node $j$ at step $t$, and $y_{ijk}(t) = 0$ otherwise, in which $t \in \{1, 2, \dots\}$, $(i, j) \in E$ and $k \in P$. Therefore, the node selection step is modeled as: The patterns of accepting or rejecting an incoming message at steps $t \ge 2$ are different from the first step. At subsequent steps of the diffusion process, for each inactive node $i \in N$, one of the following situations occurs.
1. It does not receive any message and remains inactive.
2. It receives (only) one type of message from its neighbors, e.g., message $\mathrm{text}_{p_k}$. In this case, node $i$ decides how to act according to the magnitude of the impact of the received messages (which depends on the senders' influence on $i$, the social skill of $i$, and the message content). That is: else, the tendency of node $i$ does not change, so that $\alpha_i(t) = \alpha_i(t - 1)$. Also, the forwarding decision step can be represented as below: where (k = P k ).
3. It receives both messages $\mathrm{text}_{p_1}$ and $\mathrm{text}_{p_2}$ from its active neighbors. It then encounters a decision-making situation. The message that fits best with the node's tendency and has been forwarded by neighbors which have a considerable influence on $i$ is accepted and changes $i$'s tendency. That is: ∑ Moreover, the forwarding decision step is mathematically shown below: where q = arg max k∈P ∑ j∈I i :y jik (t)=1 w ji ) and q = P − {y}.

Results
In this section, the performance of the proposed model is evaluated by implementing the game on two different networks: first, a small dataset with 20 nodes which have been selected randomly and then a real dataset with 163 nodes.
Suppose that there are two players that are outside the network and want to diffuse their own options in a random network. All of the parameters $\alpha_i(t)$, $\theta_i^l$, $\theta_i^h$ and $\beta_i$ have been randomly selected, and $\alpha_i(t)$ has been selected such that $\sum_{i=1}^{20} \alpha_{i1}(1) = -0.2975$ and $\sum_{i=1}^{20} \alpha_{i2}(1) = 0.2272$, meaning that on average individuals do not have much tendency toward either option diffused by the players. The purpose of the players is to maximize the sum of social tendency toward their own options. They select their initial nodes and message content and then forward the messages to the selected nodes. Without loss of generality, suppose that each player selects only one node (the number of initial nodes depends on the players' budget). Thus, the strategy of each player consists of one initial node and a message content. As was explained in the previous section, $-1 \le m_{ij} \le 1$ represents the tendency of player $i$'s message content toward the option diffused by player $j$. The closer $m_{ij}$ is to $-1$, the more strongly the content of player $i$'s message opposes player $j$'s diffused option. Without loss of generality, to simplify the calculations, $m_{ij}$ is considered to be discrete and belongs to $\{-1, -0.9, -0.8, \dots, 0, \dots, 0.9, 1\}$. The proposed strategic game is implemented on the network, and the Nash equilibrium is calculated using the concept of best response functions.
It is shown that the parameters $\theta_i^h$, $\theta_i^l$, and $\delta$ have a significant effect on the players' payoff. Table 1 shows how the number of infected nodes and the players' payoff vary when $\theta_i^h$ and $\theta_i^l$ vary between 0 and 1, and $\delta$ varies between 0.1 and 0.9. Based on the parameter definitions, increasing the value of the parameters decreases the number of infected nodes and the players' payoff. In Table 1, the players' payoff, the best initial nodes, and the number of infected nodes are reported for different values of the parameters. For instance, if $\delta = 0.1$, $0 < \theta_i^l < 0.3$ and $0.3 < \theta_i^h < 1$, the Nash equilibrium occurs when player 1 selects node 19 and message content $\{1, 0\}$. Specifically, the message content of player 1 is such that it strongly promotes the advantage of its option and does not openly attack the credibility of the competitor's option. Player 2 selects node 2 and message content $\{-1, 1\}$. Also, player 1's payoff is 8.4581, which is the sum of the tendencies of individuals toward player 1's option at the end of the diffusion process; this sum was $-0.2975$ prior to the process. Player 2's payoff is 1.3689, and the numbers of nodes infected by player 1 and player 2 are 17 and 1, respectively. To show the efficiency of the model in terms of performance validity and solution accuracy, the proposed game is also implemented on a real data set (the Abrar data set [6]).
Abrar University is a single-sex university located in Tehran, Iran. This data set consists of 163 students enrolled in the fields of computer engineering and industrial engineering in the 2010-2011 and 2011-2012 academic years. These students are regarded as the nodes of the social network. A directed link is formed from person $i$ to person $j$ if node $i$ can send a message to node $j$ in a short message system. Sets $I_i$ and $O_i$ are defined on the basis of the Abrar data set. Further, for each node $i$, the social skill score $\beta_i$ is determined based on a questionnaire developed in 1992 [26]. To explain the numerical results, let us assume that two principal cell phone brands (say, Nokia and Samsung) compete with each other to maximize the sales of their products in the network. The purpose of these two agents is to maximize the total tendency of the network (Abrar data set) toward their products, which in turn maximizes the enthusiasm of the nodes for choosing the agents' products at the time of purchase. The strategy of the players is viral marketing, which is based on a messaging system. This system can be a short messaging system or any online social network such as Telegram, Instagram, WhatsApp, and Facebook. $P = \{P_1, P_2\}$ is the set of players. Without loss of generality, let us suppose that each player $p_k$ selects one of the students as its initial node because of its budget constraint and also selects content $m_k = \{m_{k1}, m_{k2}\}$ for the message to be sent to its chosen initial node $i$. Thus, a strategy of each player consists of an initial node $i$ and a message content $m_k$. For each node $i$, the initial tendency $\alpha_i(1)$ and the thresholds $\theta_i^h$ and $\theta_i^l$ are determined randomly such that $\sum_{i=1}^{163} \alpha_{i1} = 2.0747$ and $\sum_{i=1}^{163} \alpha_{i2} = 15.4826$. Moreover, without loss of generality, $w_{i,j}$ for all $i, j \in N$ is set to 1 if there is a link from $i$ to $j$ and to 0 if there is no such link.
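To give a sense of the size of the discrete strategy space in the two experiments, the following sketch enumerates every pair of one initial node and one two-component message content on the grid from -1 to 1 in steps of 0.1. It assumes that both content components range independently over the full grid, and the function names `content_grid` and `strategy_set` are hypothetical.

```python
from itertools import product
from typing import List, Tuple

Content = Tuple[float, float]


def content_grid() -> List[Content]:
    """All two-component contents with entries in {-1, -0.9, ..., 0.9, 1}."""
    levels = [k / 10 for k in range(-10, 11)]  # 21 evenly spaced levels
    return list(product(levels, repeat=2))


def strategy_set(n_nodes: int) -> List[Tuple[int, Content]]:
    """Every (initial node, message content) pair for a single-seed budget."""
    grid = content_grid()
    return [(node, content) for node in range(1, n_nodes + 1) for content in grid]


if __name__ == "__main__":
    print(len(content_grid()))     # number of admissible message contents
    print(len(strategy_set(20)))   # strategies per player, 20-node network
    print(len(strategy_set(163)))  # strategies per player, 163-node data set
```

Under these assumptions there are 21 × 21 = 441 contents, hence 20 × 441 = 8820 strategies per player on the 20-node network and 163 × 441 = 71,883 on the 163-node data set.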
It is shown that the parameters $\theta_i^h$, $\theta_i^l$, and $\delta$ have a significant effect on the players' payoff. The number of infected nodes and the sum of the tendencies of the individuals vary as a function of $\theta_i^h$, $\theta_i^l$ and $\delta$. The Nash equilibrium is determined for each game. Table 2 shows how the number of infected nodes and the players' payoff vary when $\theta_i^h$ varies between 0.4 and 1, $\theta_i^l$ varies between 0 and 1, and $\delta$ varies between 0.1 and 0.9. Based on the parameter definitions, increasing the value of the parameters decreases the number of infected nodes and the players' payoff. Note that, in this game (as in the previous one), the message content is considered to be discrete and the Nash equilibrium can be calculated using the concept of the best response function. In Table 2, the Nash equilibrium is determined, and it is shown that the players' payoff varies and the number of infected nodes decreases with an increase in the value of the parameters.
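The best-response computation mentioned above can be illustrated for any finite two-player game. The sketch below is a generic exhaustive search, not the authors' code: `pure_nash_equilibria` is a hypothetical name, and `payoff` stands for any function returning both players' payoffs for a strategy profile (in the setting of this paper it would be obtained by simulating the diffusion process). A profile is a pure-strategy Nash equilibrium when each strategy is a best response to the other.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

Profile = Tuple[Hashable, Hashable]


def pure_nash_equilibria(
    s1: Sequence[Hashable],
    s2: Sequence[Hashable],
    payoff: Callable[[Hashable, Hashable], Tuple[float, float]],
) -> List[Profile]:
    """Return all profiles (a, b) where a and b are mutual best responses."""
    table = {(a, b): payoff(a, b) for a in s1 for b in s2}
    # Best payoff player 1 can reach against each fixed strategy b of player 2.
    best1 = {b: max(table[(a, b)][0] for a in s1) for b in s2}
    # Best payoff player 2 can reach against each fixed strategy a of player 1.
    best2 = {a: max(table[(a, b)][1] for b in s2) for a in s1}
    return [
        (a, b)
        for a in s1
        for b in s2
        if table[(a, b)][0] == best1[b] and table[(a, b)][1] == best2[a]
    ]


if __name__ == "__main__":
    # Toy coordination game with two pure equilibria.
    u = {("A", "A"): (2, 2), ("B", "B"): (1, 1)}
    print(pure_nash_equilibria("AB", "AB", lambda a, b: u.get((a, b), (0, 0))))
```

Depending on the payoffs, the returned list may be empty, contain one profile, or contain several. Exhaustive enumeration of this kind is only practical when the discrete strategy sets are small enough to evaluate every profile.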
As is shown in Table 2, some of the games have no Nash equilibrium, some have only one, and some others have more than one. For instance, if $\delta = 0.7$, $\theta_i^l = 0$ and $\theta_i^h = 0.4$, the Nash equilibrium occurs when player 1 selects node 27 and message content $\{1, -0.5\}$. Specifically, the message content of player 1 is such that it strongly promotes the advantage of its option and does not openly attack the credibility of the competitor's option, but rather mildly expresses the disadvantages of the competitor's option. Player 2 selects node 113 and message content $\{-0.2, 1\}$. Also, player 1's payoff is 3.5958, which is the total sum of the tendencies of individuals toward player 1's option at the end of the diffusion process; this sum was 2.0747 prior to the process. Player 2's payoff is 17.1273, and the numbers of nodes infected by player 1 and player 2 are 12 and 17, respectively. By definition, a Nash equilibrium is a strategy profile from which no player has an incentive to deviate unilaterally. In the present study, the players' strategy is to find the best initial nodes and the best message content. What follows is an analysis of situations in which at least one of the players deviates from the Nash equilibrium, once in the choice of the initial node and once in the choice of message content. For example, when $\delta = 0.7$, $0 \le \theta_i^l \le 0.4$, and $0.4 \le \theta_i^h \le 1$, the performance of the competitive influence model proposed here is compared with some of the well-known models previously proposed in the literature (e.g., MGBD, MGEB, MGTB, MGSB, and MRND [23]), and the numerical results are summarized in Tables 4 and 5.
Table 3 shows the initial node determined by the above-mentioned models. Now, suppose that the players send their messages separately to the network; that is, each plays in a non-competitive situation in the absence of rivals. Their payoff is expected to increase compared to competitive conditions. This scenario is implemented on the proposed game, and the results are shown in Tables 8 and 9, respectively. As Tables 8 and 9 show, if players send their messages to the network in the absence of competitors, they can diffuse their options better than in a competitive situation. For instance, if $\delta = 0.1$, then player 1's payoff is 23.3373 and player 2's payoff is 59.7413 in the competitive game (Table 2). However, based on Tables 8 and 9, when they play the game separately, their payoffs are 82.5373 and 89.2413, respectively. These results are exactly what we expect, and they are further evidence of the efficiency and good performance of the proposed model.
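As a quick check on the figures quoted for $\delta = 0.1$, the gain from playing alone can be written as the difference between the non-competitive and the competitive payoff. The symbol $\Delta f$ is introduced here only for illustration and does not appear in the tables.

```latex
\begin{align*}
\Delta f_{p_1} &= 82.5373 - 23.3373 = 59.2000, \\
\Delta f_{p_2} &= 89.2413 - 59.7413 = 29.5000.
\end{align*}
```

Both differences are positive, which is consistent with the expectation that each player diffuses its option better without a competitor.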

Conclusions
In this paper, we developed a novel game theory model to study the influence maximization problem in a messaging network. The proposed model improves on those previously reported owing to the assumptions made about the network and the more realistic diffusion method employed. More particularly, unlike in the previous studies, we assumed that individuals in the network are heterogeneous (i.e., that they have personal tendencies that can change over time) and have different degrees of influence on their neighbors. The proposed model also takes account of information content. The players attempt to find the best initial set of nodes and the best content in order to maximize the total sum of network tendencies toward their options, whereas in the previous studies the sole purpose of the players was to maximize the total number of infected nodes. A possible avenue for future research is to model this influence maximization problem as a multi-objective linear optimization problem and solve it through exact methods. Other interesting possibilities would be to explore networks other than messaging networks and to find mixed-strategy Nash equilibria instead of pure-strategy Nash equilibria.

Table 1. Nash equilibrium for a random network.

Table 2. Nash equilibrium for different parameters for the Abrar data set.

Table 3. Suggested initial nodes based on different strategies.

Table 8. Only player 1 forwarding its message to the network.

Table 9. Only player 2 forwarding its message to the network.