Multi-Winner Election Control via Social Influence: hardness and algorithms for restricted cases

Abstract: Many political campaigns now use social influence (SI) to convince voters to support or oppose a specific candidate or party. In the election control via SI problem, an attacker tries to find a limited set of influencers who start disseminating a political message in a social network of voters. A voter changes his opinion when he receives and accepts the message. In the constructive case, the goal is to maximize the number of votes/winners of a target candidate/party, while in the destructive case, the attacker tries to minimize them. Recent works considered the problem in different models and presented some hardness and approximation results. In this work, we consider multi-winner election control through SI on different graph structures and diffusion models, and our goal is to maximize/minimize the number of winners in our target party. We show that the problem is hard to approximate when the voters' connections form a graph and the diffusion model is the linear threshold model. We also prove the same result for an arborescence under the independent cascade model. Moreover, we present a dynamic programming algorithm for the case in which the voting system is a variation of straight-party voting and the voters form a tree.


Introduction
Social media (SM) is an integral part of modern life, and its effect on many aspects of our lives is hard to ignore. People all around the world use SM to provide or use services such as teaching and learning, spreading information, announcing events, and advertising. It has been shown that two-thirds of American adults get news on SM [1]. It is easy to find evidence that a social influence (SI) started by a few users has influenced many people, so SM is a cheap means of spreading a message among many users. Note that the power of SM goes beyond spreading a message or advertising: it comes from the fact that a user receives news from those who have enough authority to change his opinion, such as close friends, family members, and colleagues. Since using SI is effective and cheap, it has attracted the attention of many political campaigns and candidates, who disseminate a piece of information through SI to change voters' opinions.
There are two well-known diffusion models used in SI, called the linear threshold model (LTM) and the independent cascade model (ICM) [7]. In LTM, a voter accepts a message if the total influence of his incoming neighbors who have already accepted the message is high enough. In ICM, a voter accepts a message if at least one of his incoming neighbors who has already accepted the message convinces him to accept it (see Section 2 for formal definitions of LTM and ICM).
In this paper, we consider the multi-winner election control (MWEC) via SI problem. We are given a social network of voters, a limited budget, a set of candidates, each belonging to a party, a dynamic diffusion model that spreads a message among the voters, and an attacker/manipulator who supports or opposes a party. When we use LTM, we assume that the attacker knows the probability that each voter votes for each candidate. To take into account the incoming influence of each node $v$, we use an updating rule based on the influence from the node's incoming activated neighbors, akin to [8]. When we use ICM, on the other hand, we assume the attacker knows the exact preference lists of all voters. When a voter becomes active (influenced/infected), in the constructive (resp. destructive) case he promotes (resp. demotes) the position of the target candidates in his preference list, akin to [9,10] (see Section 3 for the formal definition).
In both LTM and ICM, there are several winners, and they are elected according to the overall candidates' scores after the diffusion. In the constructive (resp. destructive) case, the attacker wants to find a set of nodes (voters), within his budget, that start the diffusion and change the voters' opinions so as to maximize (resp. minimize) the number of winners from his target party. More precisely, in a given directed graph, we look for diffusion starters such that the difference between the number of winners from our target party and the number of winners in the opponent party with the most winners, compared after and before the diffusion, is maximized (resp. minimized). We present several results, including hardness of approximation, approximation algorithms, and polynomial-time exact algorithms, for some well-known objective functions on different structures.
Related works. There are many articles regarding voting manipulation (see the survey in [11]). The problem of finding a limited set of seed nodes in a given graph to maximize the expected number of influenced nodes is known as the Influence Maximization (IM) problem, and there is an extensive literature about it, too [12]. Domingos and Richardson [13,14] introduced the IM problem, and Kempe et al. formalized it [7,15]. On the other hand, few works consider both topics together, i.e., the election control through SI problem.
Wilder and Vorobeychik introduced the election control through SI problem for single-winner elections [10]. They investigated maximizing the margin of victory (MoV) and the probability of victory (PoV), where MoV is the change, from before to after the diffusion, in the score difference between the target candidate and the most voted opponent. The problem is considered under ICM. They showed that maximizing MoV is NP-hard and presented a $(1 - \frac{1}{e})$-approximation algorithm with respect to the optimal solution. For maximizing PoV, they showed that it is NP-hard to approximate the problem within any constant factor. Corò et al. [16,17] extended the work to any non-increasing scoring function under LTM.
They demonstrated the same approximation factor for it. Abouei Mehrizi et al. considered the problem under LTM when the attacker knows a probability distribution over the candidates instead of the exact preference lists [8]. They showed that maximizing/minimizing the expected probability of voting for a target candidate is hard to approximate within any constant factor under the unique games with small set expansion conjecture. They also presented some constant-factor approximation algorithms for a relaxed version of the problem. Abouei Mehrizi and D'Angelo showed that in multi-winner elections, when the manipulator wants to maximize/minimize the number of winners in his target party, the problem is inapproximable under ICM unless P = NP [9]. They also presented some constant-factor approximation algorithms when the voting system is similar to straight-party voting.
Bredereck and Elkind considered some different models under LTM, such as bribing nodes/voters and adding or deleting edges. They showed that the problem is hard in those models and presented polynomial-time algorithms for specific cases of the problem [18]. Castiglioni et al. investigated the same models under ICM. They showed that the problem is hard even in restricted structures. Regarding bribing nodes to influence other voters, they proved that election control is hard even if the given graph is a line. For the edge removal/addition case, they demonstrated that the problem is hard even if the attacker has an infinite budget [19]. Faliszewski et al. considered the problem where each voter has a preference list. Each node of the graph represents all users with the same opinion.

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 30 August 2020
There is an edge between two nodes if their opinions differ by the placement of one adjacent pair of candidates.
They used LTM and proved that maximizing the number of votes for the target candidate is NP-hard and fixed-parameter tractable with respect to the number of candidates [20].
There is also another model in which voters have a preference list over the candidates and change it according to the majority of their neighbors' opinions [21][22][23].
Outline and our results. In Section 2, we define the most prominent diffusion models in the literature (LTM and ICM), which we use in this paper. Section 3 formally defines our model and objective functions. In Section 4, we show that our problem is hard to approximate within any factor on a general graph when the diffusion model is LTM. Section 5 contains the same result when the diffusion model is ICM and the given graph is an arborescence, i.e., a tree whose edges are directed from the leaves to the root.
Moreover, in Section 6, we investigate the problem when the voting system is a variation of straight-party voting (SPV), in which voters vote for parties. In other words, voters have a preference list (or probability distribution) over the candidates, but they vote for the parties instead of the candidates.
We present a polynomial-time algorithm based on dynamic programming that finds the maximum difference of votes for our target party before and after the diffusion. It also gives $\frac{1}{3}$- and $\frac{1}{2}$-approximation algorithms for maximizing MoV in the constructive and destructive models, respectively.
Finally, we discuss the results and future work in Section 7.

Background
In this section, we introduce the two diffusion models used in this paper, the linear threshold model (LTM) and the independent cascade model (ICM), presented by Kempe et al. [7,15]. They are the most prominent dynamic diffusion models in the literature (see the survey in [24]).

Linear Threshold Model
We are given a directed graph $G = (V, E)$ in which each edge $(u, v) \in E$ has a weight $b_{u,v} \in [0, 1]$, and for each node $v \in V$ the incoming weights satisfy $\sum_{u \in N^i_v} b_{u,v} \le 1$, where $N^i_v$ is the set of incoming neighbors of $v$. Also, each node $v \in V$ has a threshold $t_v \in [0, 1]$, which is generated uniformly at random.
In this model, the diffusion starts from a set of nodes $S \subseteq V$ known as seed nodes. At the first step, only the seed nodes are active (influenced/infected), and all other nodes are inactive. Let $A_i$ denote the set of nodes that are active at step $i$, i.e., $A_1 = S$. The activation process, for each step $i > 1$, is as follows: all nodes in $A_{i-1}$ remain active at step $i$, i.e., $A_{i-1} \subseteq A_i$; moreover, each inactive node $v \in V \setminus A_{i-1}$ becomes active if the sum of the weights from its incoming active neighbors is not less than its threshold, i.e., $v \in A_i$ if $\sum_{u \in A_{i-1} \cap N^i_v} b_{u,v} \ge t_v$. The diffusion process proceeds in at most $|V|$ discrete steps and stops as soon as no new node becomes active, i.e., it stops at step $k > 1$ if $A_k = A_{k-1}$. We use $A_S$ for the set of activated nodes after the diffusion process started from the set of seed nodes $S$. In what follows, to increase readability, "after $S$" means after the diffusion process started from the set of seed nodes $S$. Note that the thresholds are not part of the input; they are generated uniformly at random and independently when the process runs. Hence the process is random, and several executions on the same graph may yield different results for $A_S$.
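The activation rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `simulate_ltm` and the dictionary layout `in_edges` (each node mapped to its incoming neighbors and weights) are assumptions made for the example.

```python
import random

def simulate_ltm(in_edges, seeds, rng=None):
    """Run one LTM diffusion and return the final active set A_S.

    in_edges maps every node v to a dict {u: b_uv} of incoming weights
    (assumed to sum to at most 1 per node); seeds is the seed set S.
    """
    rng = rng or random.Random()
    # Thresholds are not part of the input: draw them uniformly per run.
    threshold = {v: rng.random() for v in in_edges}
    active = set(seeds)
    while True:
        newly_active = {
            v for v in in_edges
            if v not in active
            and sum(w for u, w in in_edges[v].items() if u in active) >= threshold[v]
        }
        if not newly_active:  # A_k = A_{k-1}: the process has stopped
            return active
        active |= newly_active

# A directed path a -> b -> c with weight 1 on each edge.
path = {"a": {}, "b": {"a": 1.0}, "c": {"b": 1.0}}
print(simulate_ltm(path, {"a"}, random.Random(0)))
```

Because the thresholds are redrawn on every call, repeated runs on the same graph and seed set can return different active sets; averaging over many runs estimates expected quantities.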
Kempe et al. [7] defined the IM problem as follows: given a graph $G = (V, E)$ and a budget $B \le |V|$, find a set of at most $B$ seed nodes that maximizes the expected number of active nodes after the diffusion. Kempe et al. [7] also considered IM under ICM. They showed that the greedy algorithm works for this model, too. They also demonstrated that it is NP-hard to approximate the problem within any factor better than $1 - \frac{1}{e}$.
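As a rough illustration of IM under ICM, the sketch below simulates the cascade and picks seeds greedily by estimated spread. It assumes the standard ICM rule, in which each newly activated node gets a single chance to activate each inactive outgoing neighbor with probability equal to the edge weight. The names `simulate_icm`, `expected_spread`, and `greedy_seeds` are hypothetical, and the spread is only a Monte Carlo estimate.

```python
import random

def simulate_icm(out_edges, seeds, rng):
    """One ICM run: each newly active u tries once to activate each out-neighbor."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, prob in out_edges.get(u, {}).items():
                if v not in active and rng.random() < prob:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

def expected_spread(out_edges, seeds, runs=200, seed=0):
    """Monte Carlo estimate of the expected number of active nodes."""
    rng = random.Random(seed)
    return sum(len(simulate_icm(out_edges, seeds, rng)) for _ in range(runs)) / runs

def greedy_seeds(out_edges, nodes, budget, runs=200):
    """Repeatedly add the node with the largest estimated spread (budget <= |nodes|)."""
    chosen = set()
    for _ in range(budget):
        best = max((v for v in nodes if v not in chosen),
                   key=lambda v: expected_spread(out_edges, chosen | {v}, runs))
        chosen.add(best)
    return chosen

star = {"a": {"b": 1.0, "c": 1.0}}
print(greedy_seeds(star, ["a", "b", "c", "d"], budget=1))
```

Maximizing the spread of `chosen | {v}` is the same as maximizing the marginal gain of `v`, so this is the usual hill-climbing scheme; practical implementations use far more simulation runs or lazy evaluation.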

Multi-Winner Election Control: Models and Objective Functions
In this section, we consider Multi-Winner Election Control (MWEC), where several parties run in an election in which more than one candidate is elected, as in a parliamentary election. We consider $t$ different parties $C_1, \dots, C_t$, each containing $k$ different candidates, i.e., $C_i = \{c^i_1, \dots, c^i_k\}$ for $1 \le i \le t$. We use $C$ for the set of all candidates, i.e., $C = \bigcup_{i=1}^{t} C_i$. Without loss of generality, we assume $C_1$ is our target party. Note that the election has exactly $k$ winners.

MWEC under LTM
In this model, we investigate the case in which the adversary does not know the preference lists of the voters; instead, for each voter, the attacker has a probability distribution over all candidates. This model is similar to the model known as probabilistic linear threshold ranking (PLTR) defined in [8]. Since most voters do not reveal their preferences on SM, this is a realistic assumption.
The adversary tries to maximize/minimize the number of winners in his target party. For each node $v \in V$, we denote by $\pi_v$ the probability distribution of the voter $v$ over all candidates, and by $\pi_v(c)$ the probability that the voter $v$ votes for a specific candidate $c \in C$. Then for every node $v \in V$, $\sum_{c \in C} \pi_v(c) = 1$. In LTM, each node has an incoming influence, which shows the amount of pressure from its incoming neighbors to support/oppose a target party. We use this incoming influence of node $v \in V$ to change its probability distribution. Let $\tilde{\pi}_v$ be the probability distribution of node $v$ after $S$; accordingly, $\tilde{\pi}_v(c)$ is the probability that node $v$ votes for candidate $c \in C$ after $S$. We use $A_S$ for the set of nodes that become active after $S$.
We consider a single message that spreads among the voters. The message contains constructive/destructive information targeting all candidates in the target party. When a node $v$ becomes active, its probability distribution changes according to the incoming influence from its activated neighbors. We have to normalize the vector to make sure that the probabilities sum to one after $S$. In the constructive model, the probability distribution of a node $v \in A_S$ changes as follows.
Recall that $N^i_v$ is the set of incoming neighbors of node $v$. In the destructive case, the probability distribution of an active node $v \in A_S$ changes as follows.
With these changes (and normalization), we guarantee that the probabilities of each node sum to 1. In both the constructive and the destructive case, the probability distribution of an inactive node $v \in V \setminus A_S$ does not change after $S$, i.e., $\tilde{\pi}_v = \pi_v$.
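Let us define the expected number of votes for candidate $c \in C$ after $S$ as $F(c, S) = \mathbb{E}_{A_S}\left[\sum_{v \in V} \tilde{\pi}_v(c)\right]$; accordingly, $F(c, \emptyset) = \sum_{v \in V} \pi_v(c)$ is the expected number of votes for candidate $c \in C$ before any diffusion.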

MWEC under ICM
Our model is similar to the one presented in [9]; we briefly describe it below. In this model, unlike in LTM, we assume that the attacker knows the voters' preference lists. Each voter $v \in V$ has a preference list $\pi_v$. Abusing notation, $1 \le \pi_v(c) \le tk$ is the rank of candidate $c$ in the preference list of the voter $v$. After the diffusion, inactive voters keep their original opinions, i.e., $\forall v \in V \setminus A_S : \tilde{\pi}_v = \pi_v$, but the activated voters change their preference lists as follows. Recall that $A_S$ is the set of activated nodes after $S$.
• Constructive: For each node $v \in A_S$ and for each target candidate $c \in C_1$, the new position of $c$ in $\tilde{\pi}_v$ is … otherwise the new rank of the candidate $c$ will be calculated as follows. …
• Destructive: For each node $v \in A_S$ and for each target candidate $c \in C_1$, we have …
In this article, we consider the plurality scoring rule for simplicity, where only the most preferred candidate of each voter gets one point. However, the results can be extended to any non-increasing scoring function, e.g., k-approval, anti-plurality, and Borda's rule [25]. Let us denote by $F(c, \emptyset)$ and $F(c, S)$ the expected score of candidate $c$ before and after $S$, respectively; formally, $\forall c \in C$: $F(c, \emptyset) = \sum_{v \in V} \mathbb{1}[\pi_v(c) = 1]$ and $F(c, S) = \mathbb{E}_{A_S}\left[\sum_{v \in V} \mathbb{1}[\tilde{\pi}_v(c) = 1]\right]$. If we want to generalize the problem and consider any non-increasing scoring function $g(\cdot)$, the functions would be defined as $F(c, \emptyset) = \sum_{v \in V} g(\pi_v(c))$ and $F(c, S) = \mathbb{E}_{A_S}\left[\sum_{v \in V} g(\tilde{\pi}_v(c))\right]$.
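To make the scoring rules concrete, here is a small sketch that scores a candidate from fixed preference lists, i.e., for one realized outcome of the diffusion; averaging it over simulated runs would estimate the expected score. The helper names `score`, `borda`, and `k_approval` and the rank-dictionary layout are assumptions for illustration only.

```python
def score(rank_lists, candidate, g=None):
    """Sum g(rank of candidate) over all voters; rank 1 is the most preferred."""
    if g is None:
        g = lambda rank: 1 if rank == 1 else 0  # plurality rule
    return sum(g(ranks[candidate]) for ranks in rank_lists)

def borda(m):
    """Borda's rule with m candidates: rank r earns m - r points."""
    return lambda rank: m - rank

def k_approval(k):
    """k-approval: each of the top k ranks earns one point."""
    return lambda rank: 1 if rank <= k else 0

voters = [
    {"a": 1, "b": 2, "c": 3},
    {"a": 2, "b": 1, "c": 3},
    {"a": 1, "b": 3, "c": 2},
]
print(score(voters, "a"), score(voters, "a", borda(3)), score(voters, "c", k_approval(2)))
```

All three rules are non-increasing in the rank, which is the property the text requires of $g(\cdot)$.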

Objective Functions
In this paper, our goal is to maximize/minimize the number of winners from our target party, so the objective functions are the same as in [9]. For both ICM and LTM, we define $F(C_1, S)$ as the number of candidates in $C_1$ that are among the winners. Formally, consider a given set of activated nodes $A_S$, which became active after $S$. Let $F_{A_S}(c)$ be the expected number of votes that candidate $c$ receives when $A_S$ is the set of activated nodes. We set $Y_{A_S}(c)$ as the number of candidates $c' \in C \setminus \{c\}$ whose expected number of votes is less than that of $c$. In order to consider the tie-breaking rule, if $F_{A_S}(c$ … By this definition, we define $F(C_1, S)$ as the expected number of winners from party $C_1$, i.e., $F(C_1, S) =$ …
Now, let us define the first objective function, Difference of Winners (DoW), which is the difference between the number of winners in our target party before and after $S$. Formally, in the constructive (resp. destructive) model we define $DoW_c$ (resp. $DoW_d$) as $DoW_c(C_1, S) = F(C_1, S) - F(C_1, \emptyset)$ and $DoW_d(C_1, S) = F(C_1, \emptyset) - F(C_1, S)$.
As the second objective function, we define a more compelling one called Margin of Victory (MoV). For the constructive case, we define it as DoW plus the difference between the number of winners in the opponent parties with the most winners before and after $S$. Formally, for the constructive (resp. destructive) case, we define $MoV_c$ (resp. $MoV_d$) as $MoV_c(C_1, S) = \left(F(C_1, S) - F(C^S_A, S)\right) - \left(F(C_1, \emptyset) - F(C_B, \emptyset)\right)$ and $MoV_d(C_1, S) = \left(F(C_1, \emptyset) - F(C_B, \emptyset)\right) - \left(F(C_1, S) - F(C^S_A, S)\right)$, where $C_B$ and $C^S_A$ are the opponent parties with the most winners before and after $S$, respectively.
The constructive margin of victory (CMV) problem asks for a set of seed nodes $S$ ($|S| \le B$) that maximizes $MoV_c(C_1, S)$. Similarly, destructive margin of victory (DMV) refers to the problem of finding a set of seed nodes $S$ ($|S| \le B$) that maximizes $MoV_d(C_1, S)$.
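The two objectives can be illustrated on fixed score vectors. The sketch below elects the top $k$ candidates, counts winners per party, and evaluates the constructive DoW and MoV. It works on one deterministic pair of score vectors rather than on expectations, and it assumes ties are broken in favor of the target party; the function names and data layout are hypothetical.

```python
def winners_per_party(scores, party_of, k, target):
    """Elect the k highest-scoring candidates and count winners per party."""
    # Higher score first; on ties, target-party candidates come first (assumed rule).
    ranked = sorted(scores, key=lambda c: (-scores[c], party_of[c] != target))
    counts = {}
    for c in ranked[:k]:
        counts[party_of[c]] = counts.get(party_of[c], 0) + 1
    return counts

def dow_and_mov(before, after, party_of, k, target):
    """Constructive DoW and MoV for one pair of score vectors."""
    wb = winners_per_party(before, party_of, k, target)
    wa = winners_per_party(after, party_of, k, target)
    opponents = set(party_of.values()) - {target}
    best_opp_before = max(wb.get(p, 0) for p in opponents)
    best_opp_after = max(wa.get(p, 0) for p in opponents)
    dow = wa.get(target, 0) - wb.get(target, 0)
    mov = (wa.get(target, 0) - best_opp_after) - (wb.get(target, 0) - best_opp_before)
    return dow, mov

party_of = {"a1": 1, "a2": 1, "b1": 2, "b2": 2}
before = {"a1": 1, "a2": 0, "b1": 3, "b2": 2}
after = {"a1": 4, "a2": 0, "b1": 3, "b2": 2}
print(dow_and_mov(before, after, party_of, k=2, target=1))
```

In this two-party example the target party gains one seat and the opponent loses one, so MoV is twice DoW; with more parties the lost seat need not come from the strongest opponent.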

MWEC on Graph under LTM
It has been proven that the problem is NP-hard to approximate within any factor under ICM [9]. In this part, we prove the same statement for LTM.
Theorem 1. It is NP-hard to approximate CMV and CDW within any factor on a given graph under LTM.
Proof. We reduce the vertex cover (VC) problem to any approximation algorithm for CDW (resp. CMV), i.e., for maximizing $DoW_c$ (resp. $MoV_c$). In VC, we are given an undirected graph $G = (V, E)$ and an integer $k$; the decision question is: is there a set of nodes $\hat{V} \subseteq V$ ($|\hat{V}| \le k$) such that for each edge $(u, v) \in E$, at least one of its endpoints is in $\hat{V}$? Assume $I(G, B)$ is a given instance of the VC problem, where $G = (V, E)$ is the given graph and $B$ is an integer value. We create an instance $I'(G', B)$ for CDW (resp. CMV), where $G' = (V \cup V' \cup V'', E')$ is the graph built from $G$, and $B$ is also the budget for our problem. Let us consider the case where there are two parties with two candidates each, i.e., $t = k = 2$. We fix the order of candidates in the probability distribution of a voter $v$ as … and build $G'$ as follows.
• For each undirected edge $(u, v) \in E$, add two directed edges $(u, v), (v, u)$ to $E'$. Set the weight of each incoming edge of a node $v \in V$ to $\frac{1}{|N^i_v|}$. By this, the sum of the weights of all incoming edges is equal to one, i.e., $\forall v \in V : \sum_{u \in N^i_v} b_{u,v} = 1$.
• For each node $v \in V$, add two more nodes $v', v''$ to $V', V''$, respectively. Also, add an edge $(v, v')$ to $E'$.
• Set the preference lists of the nodes as follows. …
By this reduction, the score of candidates before any diffusion is … Note that in this reduction a node $v$ becomes active deterministically if either it is selected as a seed node or all of its incoming neighbors are selected as seed nodes. Hence, if we can find a set of seed nodes $S \subseteq V$ that activates all nodes in $V$ deterministically, the seed set $S$ is also an answer for the corresponding VC problem.
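The link between deterministic activation and vertex cover can be checked on small graphs. The sketch below only models the nodes of $V$ (not the added copies), sets every incoming weight to $1/|N^i_v|$, and treats every threshold as 1, which is the worst case; it assumes the input graph has no isolated vertices. The function names are hypothetical.

```python
from fractions import Fraction

def deterministically_active(edges, seeds):
    """Nodes that become active even if every threshold equals 1."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    # Each incoming edge of v has weight 1/|N_v|, so the weights sum to 1.
    weight = {v: Fraction(1, len(ns)) for v, ns in neighbors.items()}
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, ns in neighbors.items():
            if v not in active and sum(weight[v] for u in ns if u in active) >= 1:
                active.add(v)
                changed = True
    return active

def is_vertex_cover(edges, nodes):
    """True if every edge has at least one endpoint in nodes."""
    return all(u in nodes or v in nodes for u, v in edges)

path = [("a", "b"), ("b", "c"), ("c", "d")]
for seeds in ({"b", "c"}, {"a", "d"}):
    everyone = deterministically_active(path, seeds) == {"a", "b", "c", "d"}
    print(sorted(seeds), is_vertex_cover(path, seeds), everyone)
```

Exact fractions are used so that $|N^i_v|$ copies of $1/|N^i_v|$ sum to exactly one, which floating-point addition does not guarantee.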
For any approximation algorithm, we can assume that $S \subseteq V$; otherwise, if there is a node $v' \in V' \cap S$, we can replace it with its incoming neighbor $v \in V$ such that $(v, v') \in E'$ and get at least the same value for $MoV_c$ and $DoW_c$. Also, if there exists a node $v'' \in V'' \cap S$, one of the following situations holds:
• There exists an inactive node $v \in V \setminus A_S$ after $S$. In this case, we can substitute $v$ for $v''$ and get at least the same $DoW_c$ and $MoV_c$.
• There is no inactive node $v \in V \setminus A_S$. In this case, according to the nodes' probability distributions, when all nodes in $V$ become active, the values of $MoV_c$ and $DoW_c$ are maximum. Then even if we remove $v''$ from $S$, the value of $MoV_c$ or $DoW_c$ does not change. In this situation, if there exists any node $v \in V \setminus A_S$ we replace $v''$ with it; otherwise we replace it with a node …
From now on, we therefore assume $S \subseteq V$.
If all nodes in $V$ become active, since each of them has an outgoing edge with probability one to the corresponding node $v' \in V'$, all nodes in $V \cup V'$ become active, and the score of the candidates will be as follows. …
In this case $DoW_c(C_1, S) > 0$ and $MoV_c(C_1, S) > 0$, so any approximation algorithm returns a positive value, and the answer to $I$ is YES.
On the other hand, if there is a node $v \in V$ that is inactive after the diffusion, i.e., $\exists v \in V \setminus A_S$, the score of candidates will be as follows. …
In this case $DoW_c(C_1, S) = MoV_c(C_1, S) = 0$, so any approximation algorithm returns zero, and the answer to $I$ is NO.
For the other direction, note that if we can find a set of nodes $S \subseteq V$ that is an answer for $I$, then using the same set of nodes we can activate all nodes in $V \cup V'$, and $DoW_c(C_1, S) > 0$, $MoV_c(C_1, S) > 0$.
To extend the proof to any number of parties ($t$) and candidates ($k$), we need to assign the probability distributions as follows, and the same approach concludes the proof for any $t, k > 2$. As before, the order of the candidates in the probability distribution of a voter $v$ is … $0, \dots, 0)$.
The following theorem proves the same statement for the destructive case of the problem.
Theorem 2. It is NP-hard to approximate DMV and DDW within any factor on a given graph under LTM.
Proof. The reduction is similar to the constructive case. Consider the case where $t = k = 2$. We set the voters' probability distributions such that one of our target candidates is among the losers before and after any diffusion. The other target candidate is among the winners before any diffusion, but he loses the election if and only if all nodes in the connected part of the graph become active.
Note that, since our target candidates have priority over the others in ties, we need one more node to achieve this.

MWEC on Arborescence under ICM
In this section, instead of a general graph, we consider an arborescence. We are given a tree $G = (V, E)$ whose directed edges point from the leaves towards the root, and a budget $B$; the diffusion model is ICM.
We are asked to find at most $B$ seed nodes to maximize MoV (resp. DoW).
Theorem 3. It is NP-hard to find an approximation algorithm for CMV and CDW on arborescence under ICM.
Proof. We show the hardness by reducing the IM problem to our problem. We are given an instance $I(T, B)$ of the IM problem, where $T = (V, E)$ is the tree (arborescence) and $B$ is the budget. Let us define the decision version of the problem as follows: is there a set of at most $B$ seed nodes that activates all nodes of the tree in expectation?
We consider the case where there are two parties, each with just two candidates, i.e., $C_1 = \{c^1_1, c^1_2\}$ and $C_2 = \{c^2_1, c^2_2\}$. Also, for simplicity, we consider the plurality scoring rule.
The proof can be extended to any number of parties and candidates and to any non-increasing scoring function, akin to [29].
Let us create an instance $I'(T', B)$ of our problem as follows, where $T' = (V \cup V' \cup V'', E')$ is a tree and $B$ is the same budget for both problems.
• For each node $v \in V$ we add two more nodes $v', v''$ to $V', V''$, respectively, i.e., $\forall v \in$ …
• Set the preference lists of all nodes as follows. …
Clearly, seed nodes will be selected from $V$, i.e., $S \subseteq V$; otherwise, if there is a node $v'' \in S \cap V''$, then the node is useless and does not affect $DoW_c$ or $MoV_c$. If there is a node $v' \in S \cap V'$, we can replace it with its incoming neighbor and get at least the same value for $DoW_c$ and $MoV_c$.
Using the aforementioned polynomial-time reduction, if there exists a set of nodes $S \subseteq V$ ($|S| \le B$) such that $MoV_c > 0$ (resp. $DoW_c > 0$), then these nodes activate all nodes in $V \cup V'$. Hence, we can select the same set, and it activates all nodes in $T$; then the answer to $I$ is YES. On the other hand, if $MoV_c = 0$ (resp. $DoW_c = 0$), no seed set can activate all nodes in $V \cup V'$; then the answer to $I$ is NO. More formally, before any diffusion the score of candidates is … Then, none of the candidates in our target party is elected as a winner. After $S$, if there exists an inactive node in $V \cup V'$, then the score of candidates will be as follows: … In this case, too, none of our target candidates is among the winners, and $MoV_c = DoW_c = 0$. But if all nodes in $V \cup V'$ become active after $S$, the score of the candidates will be as follows … and one of our target candidates ($c^1_1$) is elected as a winner, so any approximation algorithm returns $MoV_c > 0$ (resp. $DoW_c > 0$). This concludes the proof.
The following theorem demonstrates the same hardness of approximation for the destructive case of our problem.
Theorem 4. It is NP-hard to find an approximation algorithm for DMV and DDW on arborescence under ICM.
Proof. The proof for the destructive case is similar to the constructive one. Consider $I'$ from Theorem 3. We need to set the preference lists of the nodes so that all of our target candidates win the election before any diffusion, but after the diffusion one of them (say $c \in C_1$) loses if and only if all nodes in $V \cup V'$ become active. Note that since our target candidates have priority over the others in ties, we need one more isolated node to ensure that $c$ loses the election after the diffusion. Following the same approach concludes the statement.

MWEC on Tree Using Straight-Party Voting
In this part, we consider the problem under a variation of the straight-party voting (SPV) system (also called straight-ticket voting), in which voters vote for a party instead of candidates [30,31]. This model is used in many real elections [32,33]. The multi-winner election control problem via social influence under ICM on a general graph is considered in [9], where the authors showed that the problem is hard and presented some constant-factor approximation algorithms for the SPV system. In this section, we consider the problem on a tree whose edges are directed from the root to the leaves.
In the rest of this section, we assume the given tree is binary, since we can convert any tree $T$ to a binary tree $T'$ by adding $O(n)$ fake nodes. Our algorithm can use the fake nodes to navigate the tree, but they neither have a probability distribution (preference list) nor can be selected as seed nodes. To ensure that the fake nodes do not change the diffusion process on the tree, the weight of each incoming edge of a fake node is set to one. Moreover, the weight of an edge from a fake node to an original node is equal to the weight of the original node's incoming edge in $T$.
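One possible way to perform this conversion is sketched below: a node with more than two children keeps its first child and hands the rest to a chain of fake nodes. The construction and the names (`binarize`, the `("fake", i)` labels, the dictionary layout) are assumptions for illustration; the source only states that $O(n)$ fake nodes suffice and how the edge weights are set.

```python
def binarize(children, weight, root):
    """Convert a rooted tree into a binary tree by inserting fake nodes.

    children: node -> list of children; weight: (parent, child) -> edge weight.
    Returns (left, right, new_weight, fake), where fake is the set of added nodes.
    """
    left, right, new_weight, fake = {}, {}, {}, set()
    stack = [root]
    while stack:
        v = stack.pop()
        kids = list(children.get(v, []))
        stack.extend(kids)
        parent = v  # current attachment point (v itself or a fake node)
        while len(kids) > 2:
            f = ("fake", len(fake))
            fake.add(f)
            first = kids.pop(0)
            left[parent], right[parent] = first, f
            new_weight[(parent, first)] = weight[(v, first)]
            new_weight[(parent, f)] = 1.0  # edges into fake nodes have weight one
            parent = f
        for slot, child in zip((left, right), kids):
            slot[parent] = child
            # An original child keeps the weight of its incoming edge in T.
            new_weight[(parent, child)] = weight[(v, child)]
    return left, right, new_weight, fake

children = {"r": ["a", "b", "c", "d"]}
weight = {("r", "a"): 0.1, ("r", "b"): 0.2, ("r", "c"): 0.3, ("r", "d"): 0.4}
left, right, new_weight, fake = binarize(children, weight, "r")
print(len(fake), left["r"], right["r"])
```

A node with $m > 2$ children receives $m - 2$ fake nodes, so the total number of fake nodes is linear in the tree size, and the product of the weights on the path from an original parent to an original child is unchanged.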
In the following, we present dynamic programming (DP) algorithms to maximize DoV. Given a tree $T = (V, E)$ and a budget $B$, the idea is that for a fixed node $v \in V$ and budget $k$ ($0 \le k \le B$), we calculate the maximum outcome from the sub-tree rooted at $v$ among the following cases: first, select the node $v$ and look for the other $k - 1$ seed nodes among its descendants; second, do not select $v$ and look for $k$ seed nodes among its descendants.
We define $r(v)$, $l(v)$, $f(v)$ as the right child, the left child, and the parent (father) of the node $v$, respectively. In Section 6.1 we consider the problem under LTM, and in Section 6.2 the problem is investigated under ICM.

MWEC Using SPV under LTM
In this section, the voters have preference lists over the candidates. However, each voter votes for a party in proportion to his probability of voting for the candidates of that party. Let us define $F_{spv}(C_1, \emptyset)$ and $F_{spv}(C_1, S)$ as the sum of the scores for our target party $C_1$ before and after $S$, respectively. Formally, $F_{spv}(C_1, \emptyset) = \sum_{c \in C_1} F(c, \emptyset)$ and $F_{spv}(C_1, S) = \sum_{c \in C_1} F(c, S)$.

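As before, we define the objective functions MoV and difference of votes (DoV) for the constructive case as follows:
$DoV^{spv}_c(C_1, S) = F_{spv}(C_1, S) - F_{spv}(C_1, \emptyset)$, $MoV^{spv}_c(C_1, S) = \left(F_{spv}(C_1, S) - F_{spv}(C^S_A, S)\right) - \left(F_{spv}(C_1, \emptyset) - F_{spv}(C_B, \emptyset)\right)$, (1)
where $C_B$ and $C^S_A$ are the most voted opponent parties before and after $S$, respectively. For the destructive model the objective functions are defined as
$DoV^{spv}_d(C_1, S) = F_{spv}(C_1, \emptyset) - F_{spv}(C_1, S)$, $MoV^{spv}_d(C_1, S) = \left(F_{spv}(C_1, \emptyset) - F_{spv}(C_B, \emptyset)\right) - \left(F_{spv}(C_1, S) - F_{spv}(C^S_A, S)\right)$. (2)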

Maximizing DoV in SPV under LTM
We define $F_v$ as the set of possible probabilities with which the node $f(v)$ may become active. More precisely, consider all nodes on the path from the root to $v$ as $v_0, v_1, \dots, v_t$ (where $v_0$ is the root and $v_t = f(v)$ is the parent of $v$). If none of the nodes on this path is selected as a seed node, then the probability that $f(v)$ becomes active through its incoming influence is zero. If only the root ($v_0$) is selected as a seed node, then the probability that $f(v)$ becomes active is $\prod_{i=0}^{t-1} b_{v_i, v_{i+1}}$; if $v_1$ is selected as a seed node but none of the nodes $v_i$, $2 \le i \le t$, is selected, the probability that $f(v)$ becomes active through its parent is $\prod_{i=1}^{t-1} b_{v_i, v_{i+1}}$, and so on. All these probabilities belong to $F_v$.
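A short sketch of how such a set could be enumerated: walk from $f(v)$ up to the root and record the product of edge weights below each possible nearest seed ancestor. The function name `parent_activation_probs` and the `parent`/`weight` dictionaries are assumptions for the example; the value 1 corresponds to $f(v)$ itself being a seed.

```python
def parent_activation_probs(parent, weight, v):
    """Possible probabilities that f(v) is active, one per nearest seed ancestor.

    parent: node -> parent (the root is absent); weight: (parent, child) -> b.
    """
    path = []  # f(v), f(f(v)), ..., root
    u = parent.get(v)
    while u is not None:
        path.append(u)
        u = parent.get(u)
    probs = {0.0}  # no node on the path is a seed
    if path:
        prod = 1.0
        probs.add(prod)  # f(v) itself is a seed
        for child, ancestor in zip(path, path[1:]):
            prod *= weight[(ancestor, child)]
            probs.add(prod)  # ancestor is the nearest seed above f(v)
    return probs

# Chain r -> a -> b -> c.
parent = {"a": "r", "b": "a", "c": "b"}
weight = {("r", "a"): 0.5, ("a", "b"): 0.5, ("b", "c"): 0.25}
print(sorted(parent_activation_probs(parent, weight, "c")))
```

For a node at depth $t + 1$ the set has at most $t + 2$ elements, so enumerating it for every node stays polynomial in the size of the tree.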
Let us define $DoV_c(v, k, S, p)$ as the maximum value of the sum of the differences in the probability of voting for our target party after and before $S$ in the sub-tree rooted at $v$, where $p \in F_v$ is the probability that its parent is active and the budget is $k$. Also, all selected seed nodes are collected in $S$. In other words, $DoV_c(v, k, S, p) = \max\{DoV^{spv}_c(C_1, S)\}$ in the sub-tree rooted at $v$, where $v$ becomes active through its parent with probability $p \cdot b_{f(v),v}$ and $|S| \le k$. The formal definition of $DoV_c(v, k, S, p)$ is as follows: … where $D_v$ is the increase in the score of our target party caused by the node $v$ if it becomes active, which is …
As the base cases, for each leaf $v \in V$ and each probability $p \in F_v$, we have $DoV_c(v, k, S, p) = D_v$ if $k > 0$ and $DoV_c(v, 0, S, p) = p \cdot b_{f(v),v} \cdot D_v$ if $k = 0$, which is the difference in the probability of voting for our party after and before the diffusion $S$ caused by the node $v$. In fact, if the budget is greater than zero, the node becomes active for sure, and we need to consider the difference of scores; but if the budget is zero, we cannot select it as a seed node, and the value must be multiplied by the probability that the node becomes active, i.e., $p \cdot b_{f(v),v}$. We also define $DoV_c(null, k, S, p) = 0$, that is, the value of $DoV_c$ for a null reference is zero. This is useful when a node has only a left (resp. right) child: the value of the function for its right (resp. left) child is then zero regardless of the other parameters.
The pseudo-code of the DP is presented in Algorithm 1, which calculates the maximum $DoV^{spv}_c$; with small changes, it can find the seed nodes too. Note that the final answer is given by $DoV_c(v_{root}, B, \emptyset, 0)$, where $v_{root}$ is the root node of the tree, $B$ is the budget, $\emptyset$ represents that we have no seed node so far, and $0$ means the parent of the root node is activated with probability zero. The following theorem shows that the DP is correct.
Algorithm 1: Calculating the maximum $DoV_c$ for a given tree $T$ and budget $B$ when the diffusion model is LTM and the voting system is SPV.
    A is a two-dimensional array A[0..B, 0..|V| − 1]
    Name all nodes in V from 0 to |V| − 1 in BFS reverse order
    for (j ← 0; j < |V|; j ← j + 1) do
        F_{v_j} ← set of all possible probabilities that f(v_j) may become active
        for (i ← 0; i <= B; i ← i + 1) do    // i and j are counters for rows and columns, respectively
            …
        end
    end
    …    // the final result for the root node using all budget
Note that if $f(v)$ becomes active, it can activate $v$ with a probability equal to the weight of the edge between them ($b_{f(v),v}$). This holds because each node has just one incoming edge (from its parent), and the threshold of the node is generated uniformly at random. Then the probability that the threshold of the node $v$ is less than or equal to the weight of the incoming edge is $b_{f(v),v}$.
Let us show that all values in the arrays will be calculated correctly, by induction.To see that, consider the base cases.For each leaf v ∈ V, the node cannot activate any other node as it has no outgoing edge.Then, these nodes cannot change the probability distribution of other nodes.In other words, each leaf will change just its own probability distribution.If k = 0, it means that we cannot select the node as a seed node, and we need to consider the probability of activating the node, because just activated nodes can update their probability distribution after the diffusion.Then if k = 0, we have , where D v is the difference of the party's score if the node v becomes active (defined in (4)), and p • b f (v),v is the probability that the node will be activated by its parent.On the other hand, if k > 0, we can select v as a seed node, and it will be activated with the probability of one, then we have DoV c (v, k, S, p) = D v .Using the updating rule (defined in Section 3.1), and the definition of DoV spv c (defined in (1)), the base cases are true.
Let us define $(i', j') < (i, j)$ if $j' < j$, or $j' = j \wedge i' < i$. We have shown that all arrays $A'$ related to the base cases are filled out correctly. For the induction step, assume that all arrays related to pairs $(i', j')$ smaller than $(i, j)$ are correctly calculated. To calculate the array $A'$ related to $A[i, j]$, for each column $p \in F_{v_j}$ we evaluate the recurrence (3), in which the first maximization considers the maximum value among all possible cases where we do not select node $v_j$ as a seed node, and the second one considers the maximum value among all possible cases where we choose $v_j$ as a seed node. The last term in each maximization is the increase of $\mathrm{DoV}_c$ at node $v_j$, weighted by the probability that $v_j$ becomes active. The recurrence uses the value of $\mathrm{DoV}_c$ for the children of $v_j$. Since the nodes are sorted in reverse BFS order, all required values have been calculated correctly before, and we select the maximum value among all possible cases. Hence $\mathrm{DoV}_c(v_j, i, S, p)$ is the maximum possible value of $\mathrm{DoV}^{spv}_c$, which concludes the proof.
For the destructive model, we define $\mathrm{DoV}_d(v, k, S, p)$ as the maximum difference between the probability of voting for our target party before and after $S$ in the sub-tree rooted at $v$, where $k$ is the budget and $p \in F_v$ is the probability that $f(v)$ becomes active. Formally, $\mathrm{DoV}_d(v, k, S, p)$ is defined by the recurrence (5), in which $D_v$ is the difference that node $v$ can apply. For the base cases, for each leaf $v \in V$ and each probability $p \in F_v$, if $k = 0$ we need to consider the probability that the node becomes active, so $\mathrm{DoV}_d(v, 0, S, p) = p \cdot b_{f(v),v} \cdot D_v$.

Maximizing MoV is harder because of a circular dependency: identifying the most voted opponent party after $S$ requires the optimal set of seed nodes, while finding the optimal set of seed nodes requires the most voted opponent party (or parties).
To deal with this problem, one might propose to fix each $C_i$, $2 \le i \le t$, in turn as the most voted opponent party after $S$, solve the related DP, and, after finding the outcome for all $t - 1$ parties, select the maximum result as the output. Nevertheless, this is not correct in all cases. Consider a case with two opponent parties, each of which has half of the votes before any diffusion. If we fix either of them as the most voted opponent after the diffusion, we get a wrong outcome, since both can be the most voted opponent after different diffusion processes. In fact, we need to consider multiple parties as the most voted opponent party.
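However, it has been shown that maximizing DoV also approximates MoV: in polynomial time, the DP yields a 1/3-approximation for MoV in the constructive case and a 1/2-approximation in the destructive case.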

MWEC Using SPV under ICM
As we saw in the previous section (for LTM), each node $v$ becomes active either by being among the seed nodes or through the incoming influence from its parent $f(v)$. Since there is just one incoming edge for each node $v \in V$, and the threshold $t_v$ of each node is generated uniformly at random, the probability that its threshold is less than or equal to the incoming weight $b_{f(v),v}$ is exactly $b_{f(v),v}$. In other words, the node becomes active through its parent with the probability that its parent $f(v)$ is active, times the weight of the edge between them. On the other hand, in ICM, a node $v$ becomes active if it is either selected as a seed node or its parent $f(v)$ is activated and influences $v$, which succeeds with probability $b_{f(v),v}$.
Then in a tree, the activation processes in both LTM and ICM are the same.
However, the updating rule is entirely different in the two models. In LTM, voters have a probability distribution over the candidates, and the activated nodes update their probability of voting for each candidate according to the influence from activated incoming neighbors. In ICM, voters have an exact preference list over the candidates, and the activated nodes promote/demote the position of some candidates in their preference list, regardless of their neighbors (see Section 2 for a formal definition).
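The shared activation process can be illustrated with a short Python sketch. It assumes a rooted tree given as a `parent` map and a `weight` map holding $b_{f(v),v}$ for each non-root node $v$; these names and the function `activation_probabilities` are illustrative and not part of the original text. The sketch computes, for a fixed seed set, the probability that each node becomes active, which is the same under LTM and ICM in a tree.

```python
from typing import Dict, Hashable, Optional, Set


def activation_probabilities(
    parent: Dict[Hashable, Optional[Hashable]],
    weight: Dict[Hashable, float],
    seeds: Set[Hashable],
) -> Dict[Hashable, float]:
    """Probability that each node of a rooted tree becomes active.

    parent[v] is f(v) (None for the root); weight[v] is b_{f(v),v}.
    A seed is active with probability 1; any other node is active with
    probability P(f(v) active) * b_{f(v),v}; an unseeded root is never active.
    """
    prob: Dict[Hashable, float] = {}

    def p(v: Hashable) -> float:
        if v not in prob:
            if v in seeds:
                prob[v] = 1.0
            elif parent[v] is None:
                prob[v] = 0.0
            else:
                prob[v] = p(parent[v]) * weight[v]
        return prob[v]

    for v in parent:
        p(v)
    return prob
```

For instance, on the path 0 → 1 → 2 with weights 0.5 and 0.4 and seed set {0}, node 2 becomes active with probability 0.5 · 0.4 = 0.2.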

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 30 August 2020
Since the diffusion process in ICM is the same as in LTM, we focus on the updating part of the problem to maximize $\mathrm{DoV}^{spv}_c$. Recall that we consider the plurality scoring rule for simplicity, but it is possible to extend the results to any non-increasing scoring function. The scoring function $F^{spv}$ for our target party is defined under this rule,² and the objective functions for the constructive and destructive cases of our problem are the same as (1) and (2), respectively.

Maximizing DoV in SPV under ICM
In this case, node $v$ can increase our target party's score by one if none of our target candidates is in the first position before any diffusion and one of them is in the second position of the voter's preference list. In other words, voter $v$ may increase the score of our target party if $\exists c \in C_1, \exists c' \in C \setminus C_1 : \pi_v(c') = 1 \wedge \pi_v(c) = 2$; otherwise, node $v$ can influence its children and change their opinion, but it cannot affect the target party's score. We call this condition the pre-condition and denote it by $\P_v$. We define $F_v$ as the set of all possible probabilities that node $v$ may become active.³ For a sub-tree rooted at $v \in V$, budget $k$, seed set $S$, and $p \in F_v$, we define $\mathrm{DoV}_c(v, k, S, p)$ by the recurrence (6).
As the base cases of the problem, for each leaf $v \in V$, budget zero, and $p \in F_v$ as the probability that $v$ becomes active, we set $\mathrm{DoV}_c(v, 0, S, p) = p \cdot \mathbb{1}_{\P_v}$; for the same parameters but with a budget $k > 0$, we set $\mathrm{DoV}_c(v, k, S, p) = \mathbb{1}_{\P_v}$.⁴ As before, for each reference to a node that does not exist (null), we define $\mathrm{DoV}_c(\text{null}, k, S, p) = 0$. The DP (6) is implemented in the same way as Algorithm 1. The following theorem shows that it calculates the maximum $\mathrm{DoV}^{spv}_c$ in polynomial time.

Proof of Theorem 7. In DP (6), there is a maximization over two other maximization formulae. The first one considers the case where we do not select $v$ as a seed node; in this case, we use the probability that node $v$ becomes active, i.e., $p \in F_v$. The second maximization considers selecting $v$ as a seed node; in this case, $v$ is activated with probability one. In both cases, the node may increase the function's value if the pre-condition holds; otherwise, it can only influence its children. As in the previous proofs, we show correctness by induction.

To show that the base cases are correct, note that the leaves cannot activate any other node. Their only effect is to become active and change their own opinion. If the pre-condition holds for a leaf $v$, there are two cases. First, if the budget is more than zero, $v$ can be a seed node and increase $\mathrm{DoV}_c$ by one. Second, if the budget is zero, $v$ can increase $\mathrm{DoV}_c$ only by becoming active through its parent, so in expectation its contribution is $p \cdot \mathbb{1}_{\P_v}$, where $p \in F_v$ is the probability that $v$ is activated through its parent. If the pre-condition does not hold, the leaf has no effect, and in both cases its contribution is zero.

⁵ To extend the result to any non-increasing scoring function $g(\cdot)$, the indicator terms are replaced by the corresponding score differences, as in footnote 4.
Let us say $(i', j') < (i, j)$ if $j' < j$, or $j' = j \wedge i' < i$. As the induction step, assume that all cells $(i', j')$ smaller than $(i, j)$ are filled correctly, for $0 \le i \le B$ and $0 \le j < |V|$. To calculate the array $A'$ related to the cell $(i, j)$, for each $p \in F_{v_j}$ we have to evaluate the recurrence (6).
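For reference, the recurrence can be reconstructed from the case analysis in this proof roughly as follows. This is a sketch consistent with the surrounding description rather than a verbatim restatement of (6): the split variable $k'$ and the exact form of the probabilities passed to the children ($p \cdot b_{v,\cdot}$ when $v$ is not a seed, $b_{v,\cdot}$ when it is) are assumptions, and the second branch applies only when $k > 0$.

```latex
\[
\mathrm{DoV}_c(v,k,S,p) = \max\left\{
\begin{aligned}
 &\max_{0 \le k' \le k}\Big[\mathrm{DoV}_c\big(l(v),k',S,\,p\, b_{v,l(v)}\big)
   + \mathrm{DoV}_c\big(r(v),k-k',S,\,p\, b_{v,r(v)}\big)\Big]
   + p \cdot \mathbb{1}_{\P_v},\\
 &\max_{0 \le k' \le k-1}\Big[\mathrm{DoV}_c\big(l(v),k',S\cup\{v\},\,b_{v,l(v)}\big)
   + \mathrm{DoV}_c\big(r(v),k-1-k',S\cup\{v\},\,b_{v,r(v)}\big)\Big]
   + \mathbb{1}_{\P_v}
\end{aligned}
\right\}
\]
```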
There is a maximization over two cases. Let us check each case separately.

The first case considers all possible ways to split the budget into two parts for the children $r(v_j)$ and $l(v_j)$ (the first and second terms) when $v_j$ is not selected as a seed node. It finds the split with the maximum outcome using the $\mathrm{DoV}_c$ values of the children, which are calculated correctly. In this case, since node $v_j$ is not a seed node, the probability that its right (resp. left) child becomes active is $p \cdot b_{v_j, r(v_j)}$ (resp. $p \cdot b_{v_j, l(v_j)}$). The fixed term is the amount by which node $v_j$ itself can increase our target party's score: if the pre-condition holds, then with probability $p$ it increases the score by one, that is, $p \cdot \mathbb{1}_{\P_{v_j}}$.

The second case investigates the same situation except that it selects $v_j$ as a seed node (if $i > 0$) and uses the $\mathrm{DoV}_c$ values of its children to find the best split of the remaining budget $i - 1$. In this case, node $v_j$ increases our party's score by one (if the pre-condition holds), as it is selected as a seed node and is activated for sure.⁵

All corresponding values for the children of $v_j$ have been calculated correctly before, because the nodes are sorted in reverse BFS order. Finally, the DP takes the maximum value of the two cases.
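The two cases above can be sketched as a memoized recursion in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the tree is binary and given by hypothetical `left`/`right` child maps, `weight[(v, u)]` holds $b_{v,u}$, and `gain[v]` is 1 if the pre-condition holds for $v$ and 0 otherwise. Using the destructive pre-condition for `gain` gives the destructive variant. The sketch recurses top-down with memoization instead of filling the table in reverse BFS order, and it returns only the optimal value, not the seed set.

```python
from functools import lru_cache
from typing import Dict, Hashable, Optional, Tuple


def max_dov_icm(
    left: Dict[Hashable, Hashable],
    right: Dict[Hashable, Hashable],
    weight: Dict[Tuple[Hashable, Hashable], float],
    gain: Dict[Hashable, int],
    root: Hashable,
    budget: int,
) -> float:
    """Maximum expected score increase with at most `budget` seeds."""

    @lru_cache(maxsize=None)
    def dov(v: Optional[Hashable], k: int, p: float) -> float:
        # p is the probability that v becomes active when it is not a seed.
        if v is None:  # null reference contributes nothing
            return 0.0
        l, r = left.get(v), right.get(v)

        def best_split(b: int, q: float) -> float:
            # q is the activation probability of v; try every budget split.
            return max(
                dov(l, kl, q * weight.get((v, l), 0.0))
                + dov(r, b - kl, q * weight.get((v, r), 0.0))
                for kl in range(b + 1)
            )

        # Case 1: v is not a seed, so it is active with probability p.
        not_seed = best_split(k, p) + p * gain[v]
        if k == 0:
            return not_seed
        # Case 2: v is a seed, so it is active for sure and uses one unit.
        seed = best_split(k - 1, 1.0) + gain[v]
        return max(not_seed, seed)

    # The root has no parent, so it is active with probability 0 unless seeded.
    return dov(root, budget, 0.0)
```

On a root with two leaf children, edge weights 0.9 and 0.8, and the pre-condition holding only at the leaves, a budget of one is best spent on the root (expected gain 1.7), while a budget of two is best spent on the two leaves (gain 2).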
For the destructive case of the problem, we define the pre-condition as $\exists c \in C_1 : \pi_v(c) = 1$. Then, for a node $v$, if it becomes active and this pre-condition holds, the node decreases the party's score by one; otherwise, $v$ cannot change it. For each sub-tree rooted at $v$, budget $k$, and $p \in F_v$, we define $\mathrm{DoV}_d(v, k, S, p)$ by the recurrence (7), which is exactly the same as in the constructive case except for the pre-condition. The base cases are also the same as before once the destructive pre-condition is substituted for the constructive one. The proof of the following theorem is similar to that of Theorem 7, so we omit it to avoid repetition.

Discussion
Controlling elections via SI is a major concern for democratic elections. It has been shown that many campaigns use this powerful tool to influence voters and change their opinion during elections. In this work, we considered multi-winner election control through SI, in which the attacker tries to maximize/minimize the number of winners from his target party relative to the party with the most winners.
We exhibited different results, including hardness of approximation, approximation guarantee, and optimal solutions for our problem considering different structures, diffusion models, and voting systems.
In ICM, each voter has a preference list over the candidates and votes for one or more candidates according to the voting rule, e.g., plurality, Borda's rule, k-approval, and anti-plurality. In this case, the influenced voters change their opinion by promoting/demoting the candidates' positions in their preference list. On the other hand, in LTM, we consider that the voters have a probability distribution over all candidates. Each voter votes for one or more candidates proportionally to the probability of voting for them. In this model, the activated voters change their opinion based on the influence of their activated incoming neighbors.
We proved that the problem is hard to approximate within any factor when the structure is a general graph and the diffusion model is LTM. We also considered the problem when the structure is an arborescence and the diffusion process follows the ICM rules, and showed that it is inapproximable within any factor unless P = NP. Another structure that we investigated is a tree where the voting system is a variation of straight-party voting. We presented a polynomial-time algorithm to maximize the expected score of our target party under both the LT and IC diffusion models. It follows that we obtain a 1/3-approximation for maximizing MoV in the constructive case and a 1/2-approximation for MoV in the destructive case.
The results of this paper open several research directions. Considering MWEC through SI on arborescences when the diffusion model is LTM is an interesting open problem. We conjecture that maximizing both objective functions (MoV and DoW) is hard, even though there exists a polynomial-time algorithm for the IM problem on arborescences under LTM. We plan to study maximizing MoV in SPV, to either present an optimal solution or provide a hardness result for both the constructive and destructive cases. Maximizing DoV on bidirected trees, where a child can also activate its parent, would be interesting as well. We conjecture that this problem admits a polynomial-time algorithm following a similar dynamic programming approach.
The problem of constructive difference of winners (CDW) asks for a set of seed nodes $S$ ($|S| \le B$) that maximizes $\mathrm{DoW}_c(C_1, S)$. Similarly, destructive difference of winners (DDW) refers to the problem of finding a set of seed nodes $S$ ($|S| \le B$) that maximizes $\mathrm{DoW}_d(C_1, S)$.

c and $\mathrm{DoW}_c$. It has been shown that the problem is inapproximable on a general graph unless P = NP [9]. Bharathi et al. conjectured that the IM problem under ICM on arborescences is NP-hard [26]. Lu et al. proved that the conjecture is true [27], while Wang et al. showed that the IM problem admits a polynomial-time algorithm on arborescences under LTM [28]. In the following, we show that our problem is hard to approximate within any factor on arborescences under ICM.

Theorem 3. It is NP-hard to find an approximation algorithm for CMV and CDW on arborescence under ICM.

$F(c^1_1, S) = |V|$,

We can calculate and store the values in a two-dimensional array $A[B + 1, |V|]$, where the rows are the budgets (from zero to $B$), the columns are the nodes of the tree in reverse BFS order, and each cell $(i, j)$ ($0 \le i \le B$, $0 \le j < |V|$) of the array refers to another array $A'[|F_{v_j}|]$. In the worst case, since the budget $B$ and $|F_{v_j}|$ (for any $v_j \in V$) are at most $|V|$, we can solve the problem in polynomial time using $O(|V|^3)$ memory. Note that we have to fill the matrix $A$ left-to-right and top-down, while for each cell we can fill the corresponding array $A'$ in any order.
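To see why $|F_{v_j}| \le |V|$ under LTM, the sets $F_v$ can be computed top-down, as in the following Python sketch. It assumes that the parent of $v$ is active either with probability one (when it is a seed) or with the probability of its own parent being active times the connecting edge weight; the maps `parent` and `weight` and the function name are illustrative, not taken from the original text.

```python
from typing import Dict, Hashable, List, Optional, Set


def parent_activation_sets(
    parent: Dict[Hashable, Optional[Hashable]],
    weight: Dict[Hashable, float],
    order: List[Hashable],
) -> Dict[Hashable, Set[float]]:
    """F_v for every node: the possible probabilities that f(v) is active.

    `order` must list every parent before its children (e.g. BFS order);
    weight[u] is the weight of the edge from f(u) to u.
    """
    F: Dict[Hashable, Set[float]] = {}
    for v in order:
        u = parent[v]
        if u is None:
            # The root has no parent, so its parent is active with probability 0.
            F[v] = {0.0}
        else:
            # Either u is a seed (probability 1), or u is activated by its own
            # parent: each q in F_u is scaled by the weight of the edge into u.
            F[v] = {1.0} | {q * weight.get(u, 0.0) for q in F[u]}
    return F
```

Each level of the tree adds at most one new value, so $|F_v|$ is at most the depth of $v$ plus one, which is bounded by $|V|$.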

Theorem 5. Given a tree $T = (V, E)$ and budget $B$, the DP (3) finds a set of seed nodes $S$ ($|S| \le B$) that maximizes $\mathrm{DoV}^{spv}_c$.

Proof. Consider the matrix $A[B + 1, |V|]$, where each cell $A[k, v]$ points to another array $A'$ whose columns are all possible probabilities that $f(v)$ becomes active. Calculating all possible probabilities for the array $A'$, we have at most $|F_v|$ columns for each node $v \in V$ and budget $0 \le k \le B$, and for each of them we need to calculate and store the maximum $\mathrm{DoV}_c$.

For a leaf $v$ and budget $k > 0$, we have $\mathrm{DoV}_d(v, k, S, p) = D_v$. Also, we set $\mathrm{DoV}_d(\text{null}, k, S, p) = 0$. As in the constructive case, the implementation needs a two-dimensional array $A[B + 1, |V|]$. Moreover, for each cell $(i, j)$, $0 \le i \le B$, $0 \le j < |V|$, we keep another array $A'[|F_{v_j}|]$, where $F_{v_j}$ is the set of possible probabilities that node $f(v_j)$ becomes active. The following theorem shows that by filling the matrix $A$ left-to-right and top-down, we find the optimal answer for $\mathrm{DoV}^{spv}_d$.

Theorem 6. Given a tree $T = (V, E)$ and a budget $B$, using the DP (5), we can find a set of seed nodes $S$ ($|S| \le B$) that maximizes $\mathrm{DoV}^{spv}_d$.

Proof. The proof is similar to that of Theorem 5, except for the base cases and the way each activated node updates its probability distribution after the diffusion. Since a leaf cannot activate any other node, the only change it can make is to update its own probability distribution. By the updating rule (in Section 3.1) and the definition of $\mathrm{DoV}^{spv}_d$ (defined in (2)), the base cases hold. By induction, the DP (5) finds the maximum value of $\mathrm{DoV}^{spv}_d$ correctly.

6.1.2. Maximizing MoV in SPV under LTM

To maximize $\mathrm{MoV}^{spv}_c$ we have to know $C^S_A$, i.e., the most voted opponent party after $S$. Finding the most voted opponent party before any diffusion ($C_B$) is not a problem, but to find the most voted opponent party after $S$ we need the optimal set of seed nodes that maximizes $\mathrm{MoV}^{spv}_c$.

Theorem 7. Given a tree $T = (V, E)$ and budget $B$, the DP (6) gives a set of seed nodes $S$ ($|S| \le B$) that maximizes $\mathrm{DoV}^{spv}_c$.

³ Please note that the definition of $F_v$ in ICM is different from that in LTM.

⁴ To extend the algorithm to any non-increasing scoring function $g(\cdot)$, we need to define the base cases, respectively, as $\mathrm{DoV}_c(v, k, S, p) = p \cdot \sum_{c \in C_1,\ \exists c' \in C \setminus C_1 :\ \pi_v(c') < \pi_v(c)} \big[g(\pi_v(c) - 1) - g(\pi_v(c))\big]$ and $\mathrm{DoV}_c(v, k, S, p) = \sum_{c \in C_1,\ \exists c' \in C \setminus C_1 :\ \pi_v(c') < \pi_v(c)} \big[g(\pi_v(c) - 1) - g(\pi_v(c))\big]$.

Consider a two-dimensional array $A[B + 1, |V|]$, where the rows are the budgets from zero to $B$ and the columns are the nodes in reverse BFS order. Each cell $A[i, j]$ ($0 \le i \le B$, $0 \le j < |V|$) refers to another array $A'$ of size $|F_{v_j}|$. We calculate the array related to each cell $(i, j)$ left-to-right and top-down.

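Theorem 8. Given a tree $T = (V, E)$ and budget $B$, the DP (7) gives a set of seed nodes $S$ ($|S| \le B$) that maximizes $\mathrm{DoV}^{spv}_d$.

Maximizing MoV in SPV under ICM

Similar to Section 6.1.2, we do not know the most scored parties after the diffusion started from an optimal set of seed nodes. However, as in the LTM case, maximizing DoV yields an approximation for MoV: a 1/3-approximation in the constructive case and a 1/2-approximation in the destructive case.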

In ICM, we are given a graph with node set $V$ and edge set $E$, with a weight $b_{u,v} \in [0, 1]$ on each edge $(u, v) \in E$. As in LTM, all nodes are initially inactive, and at the first step the seed nodes $S \subseteq V$ become active. Let us define $S_i$ as the set of nodes that were inactive at step $i - 1$ and became active at step $i$, so that $S_1 = S$. At each step $i > 1$, each node $v \in S_{i-1}$ tries to activate its outgoing neighbors with the probability of the edge between them. In other words, let $N^o_v$ be the set of outgoing neighbors of node $v$; for each $u \in N^o_v$, node $v$ tries to activate $u$ with probability $b_{v,u}$. If $v$ has multiple outgoing neighbors, it tries to activate them in an arbitrary order. Note that a node becomes active only once, say at step $k$, and tries to activate its outgoing neighbors exactly once, at step $k + 1$.
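The process described above can be simulated with a few lines of Python. The sketch below is illustrative only: the adjacency representation `out_edges` (a map from a node to a list of `(neighbor, weight)` pairs) and the function name are assumptions, and a `random.Random` instance is passed in so that runs can be reproduced.

```python
import random
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def simulate_icm(
    out_edges: Dict[Hashable, List[Tuple[Hashable, float]]],
    seeds: Iterable[Hashable],
    rng: random.Random,
) -> Set[Hashable]:
    """Run one independent cascade and return the set of active nodes."""
    active: Set[Hashable] = set(seeds)
    frontier: List[Hashable] = list(active)  # S_1 = S
    while frontier:
        next_frontier: List[Hashable] = []  # nodes activated at this step
        for v in frontier:
            # v tries each outgoing neighbor exactly once, in arbitrary order.
            for u, b in out_edges.get(v, []):
                if u not in active and rng.random() < b:
                    active.add(u)
                    next_frontier.append(u)
        frontier = next_frontier
    return active
```

Averaging an objective over many such runs gives a Monte Carlo estimate of its expectation for a fixed seed set, which is useful for sanity-checking exact computations on small instances.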