Formation of Stable and Efficient Social Storage Cloud

Mane, Pramod C.; Krishnamurthy, Nagarajan; Ahuja, Kapil

doi:10.3390/g10040044

by Pramod C. Mane ¹, Nagarajan Krishnamurthy ² and Kapil Ahuja ¹,*

¹ Computer Science and Engineering, Indian Institute of Technology Indore, Indore 453552, India

² Operations Management and Quantitative Techniques, Indian Institute of Management Indore, Indore 453556, India

* Author to whom correspondence should be addressed.

Games 2019, 10(4), 44; https://doi.org/10.3390/g10040044

Submission received: 28 June 2019 / Revised: 4 September 2019 / Accepted: 16 October 2019 / Published: 1 November 2019

Abstract

In this paper, we study the formation of endogenous social storage cloud in a dynamic setting, where rational agents build their data backup connections strategically. We propose a degree-distance-based utility model, which is a combination of benefit and cost functions. The benefit function of an agent captures the expected benefit that the agent obtains by placing its data on others’ storage devices, given the prevailing data loss rate in the network. The cost function of an agent captures the cost that the agent incurs to maintain links in the network. With this utility function, we analyze what network is likely to evolve when agents themselves decide with whom they want to form links and with whom they do not. Further, we analyze which networks are pairwise stable and efficient. We show that for the proposed utility function, there always exists a pairwise stable network, which is also efficient. We show that all pairwise stable networks are efficient, and hence, the price of anarchy is the best that is possible. We also study the effect of link addition and deletion between a pair of agents on their, and others’, closeness and storage availability.

Keywords:

network formation; pairwise stability; network externalities; social storage cloud; socially-aware storage-sharing

MSC:

91A40; 91A80; 91B32; 91B99

JEL Classification:

C72; D62; D85; L86

1. Introduction

Online data backup services such as BuddyBackup¹ and CrashPlan² allow agents to share their under-utilized storage (disk) space with others as well as back up their data on the storage space shared by other agents. In academic discourse, numerous architectural prototypes of data backup systems (for example, Social Storage Cloud [1], Friendstore [2], F2Box [3], FriendBox [4], BlockParty [5], and so on) have been proposed. In order to mitigate issues like data security, trust, or low quality of services, these systems (services) are leveraging social connections. The social connections are either exogenous (that is, encoded in social graphs, for instance, the Facebook social graph³) or endogenous (constructed by the agents [6]). Such social connections are at the core of these systems.

A recently published survey [7] in this field mentions various issues related to social connections, such as small friend sets, social closeness quantification, and so on. These issues are discussed in the context of exogenous social connections.

The aspect of endogenous social connections is notably lacking. The agents’ self-interested behaviour, guided by the cost-benefit trade-off, in building storage-sharing connections is still poorly understood. Specifically, what is not well understood in this context is: (1) which network structure is likely to emerge when self-interested agents construct their storage sharing connections; (2) whether the emerged storage-sharing connection structure is stable and efficient, or not; and (3) the impact of link formation between two agents on their storage availability as well as that of other agents. To advance our understanding about these aspects, there is a need for formal modeling of endogenous socially-aware storage-sharing networks, as previous studies have focused exclusively on exogenous networks.

This paper studies the aforementioned aspects by focusing on social storage cloud systems (a case of socially-aware resource sharing systems). We model social storage cloud systems as an endogenous social storage cloud by using the tools of network analysis⁴, game theory⁵, and network formation⁶. Specifically, we model social storage cloud systems as a strategic network formation game, where self-interested agents decide with whom they want to form a connection and with whom they do not. For this, we define the utility of agents in a social storage cloud by taking into consideration three parameters: the data failure rate, the value of data, and the cost of maintaining social connections.

In [23], the authors consider a degree-based utility model, where agents benefit only from direct neighbors, and the benefit decreases with an increase in the number of neighbors of each neighbor [24]. The utility function we define in this study is degree-distance-based, where agents obtain benefits from direct and indirect neighbors, but the benefit decreases with an increase in the number of direct and indirect neighbors [25]. With this utility function, we study the effect of decisions of addition and deletion of links by pairs of agents on their storage availability in the network. We study externalities in the network, that is, the effect of link formation between a pair of agents on the utility of the other agents. We then analyze the network structure that evolves due to these decisions of link addition and link deletion.

The focus of this paper is to study network stability, efficiency, and the measures of price of anarchy and price of stability. For the analysis of network stability, we make use of the concept of pairwise stability proposed in [26]. In our model, agents experience both positive and negative externalities, determined by storage availability. We provide necessary and sufficient conditions for an agent to experience positive and negative externalities. Further, we show that if the data failure rate is less than the ratio of the cost of maintaining a link to the data value, then the null network is the unique pairwise stable network and is also efficient. However, if the data failure rate is higher than this ratio, then a network where every agent has at most a single link is the unique pairwise stable and efficient network.

The structure of the paper is as follows. Section 2 discusses the social storage cloud model. Section 3 studies the effect of addition and deletion of a link between a pair of agents on their closeness and storage availability, and that of others. Section 4 discusses the characterization of stable networks, where we study deviation conditions that show when agents have incentives for adding or deleting a link. Further, the section discusses network stability, efficiency, and inefficiency. Section 5 concludes the discussion.

2. Social Storage Cloud Model

In this section, we describe the social storage cloud model through an interaction structure, a storage-sharing framework and cost-benefit analysis of agents.

2.1. Interaction Structure

A social storage cloud $g = (A, L)$ is a storage-sharing and data backup network that consists of a nonempty set $A$ of N agents who are involved in storage (disk)-space sharing and data backup activity, and a set $L$ of links that connect these agents. The set $L$ acts as a communication infrastructure for agents to share their storage space with others and search for storage space provided by others. A link $〈 i j 〉 \in L$ represents a direct communication channel between agents i and j, which is bidirectional (and hence, $〈 i j 〉 = 〈 j i 〉$). If $〈 i j 〉 \in L$, we call agents i and j neighbours in the network $g$. The number of neighbours of agent i in $g$ is denoted by $η_{i} (g)$.

Given distinct agents $a_{1}, a_{2}, \dots, a_{n} \in A$, if $〈 a_{1}, a_{2} 〉, 〈 a_{2}, a_{3} 〉, \dots, 〈 a_{n - 1}, a_{n} 〉 \in L$, then there is a path $P_{a_{1} a_{n}} (g)$ from $a_{1}$ to $a_{n}$ of length $n - 1$. The distance $d_{i j} (g)$ $(= d_{j i} (g))$ between a pair of agents i and j is the length of the shortest path connecting them in $g$.

A network $g$ is connected if there exists at least one path between any pair of agents; otherwise, it is disconnected. A path of length $\geq 2$ between a pair of agents is an indirect communication channel between them. The set $G (N)$ consists of all possible networks on N agents.

Data stored on local storage space is prone to loss due to multiple reasons such as virus infection, software or hardware failure, data corruption, and so on. Therefore, each agent wants to backup its data on remote storage (disk) space. For any agent, data loss is costly. We capture this by assuming that the value each agent associates with its data is quantifiable and given. Every agent (as a data owner) strives for obtaining storage space provided by other agents (as storage providers) in

g \in G (N)

. Agent i wants to backup

{\bar{b}}_{i}

amount of data and shares

{\bar{s}}_{i} = \sum_{j \in A ∖ {i}} {\bar{b}}_{j}

amount of storage space. This leads to endogenous social storage cloud formation, where each agent builds its communication channel to seek storage space from direct and indirect communication channels. We assume that each agent has global (complete) information about the network structure.

A network

g

evolves when agents perform two actions, namely, link addition (

g + 〈 i j 〉

) and link deletion (

g - 〈 i j 〉

). Mutual consent of a pair of agents is required for addition of a link between them, but any link can be unilaterally deleted.

Table 1 summarizes all notations used in this paper.

2.2. Storage Sharing

According to [1], agents could limit storage-sharing with those who are close to them in the social cloud. In order to capture this, we make use of the harmonic centrality measure (discussed in [10,27,28]), defined as follows:

Φ_{i} (g) = \sum_{j \in g ∖ {i}} \frac{1}{d_{i j} (g)} .

(1)

We use harmonic centrality as it deals with disconnected networks as well.
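As an illustration of Equation (1), the following minimal Python sketch computes harmonic closeness by breadth-first search. The adjacency-dictionary representation and the names `distances_from` and `harmonic_centrality` are assumptions made for this example and are not part of the model. Unreachable agents are treated as being at infinite distance, so they contribute zero to the sum.

```python
from collections import deque

def distances_from(adj, source):
    """Breadth-first search: shortest hop counts from `source` to every reachable agent."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def harmonic_centrality(adj, i):
    """Phi_i(g): sum of 1/d_ij over reachable j != i (unreachable agents add 0)."""
    return sum(1.0 / d for j, d in distances_from(adj, i).items() if j != i)

# Hypothetical network: path 1 - 2 - 3 plus an isolated agent 4.
g = {1: {2}, 2: {1, 3}, 3: {2}, 4: set()}
print(harmonic_centrality(g, 1))  # 1/1 + 1/2
print(harmonic_centrality(g, 4))  # no reachable agents
```

The isolated agent 4 receives a well-defined closeness of zero, which is the property that motivates the choice of harmonic centrality for possibly disconnected networks.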

In $g$, an agent j (as a storage provider) computes a probability distribution on all agents for the purpose of allocating storage space to agent $i \in g$ (as a data owner), as below:

α_{i j} (g) = \frac{\frac{1}{d_{i j} (g)}}{\sum_{k \in g ∖ {j}} \frac{1}{d_{j k} (g)}} = \frac{1}{d_{i j} (g) Φ_{j} (g)},

(2)

where $α_{i j} (g)$ is the probability that agent i will obtain storage space from agent j in $g$. The normalisation is over the provider j's distances, so the closeness of j, $Φ_{j} (g)$, appears in the denominator.

Remark 1.

If

d_{i j} (g) = \infty

, then

α_{i j} (g) = 0

(and

α_{j i} (g) = 0

). As agents i and j are disconnected in

g

, their chances of obtaining storage space from each other are zero.

The probability that an agent i obtains storage space from at least one agent in $g$ is

γ_{i} (g) = 1 - \prod_{j \in g ∖ {i}} (1 - α_{i j} (g)) .
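The allocation probability and the storage availability can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and Equation (2) is read with the provider's closeness $Φ_{j} (g)$ as the normaliser, which is how the later proofs (Lemmas 5 and 8) use it.

```python
from collections import deque
from math import prod

def distances_from(adj, source):
    """Shortest hop counts from `source` to every reachable agent (BFS)."""
    dist, queue = {source: 0}, deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(adj, i):
    """Phi_i(g) from Equation (1)."""
    return sum(1.0 / d for j, d in distances_from(adj, i).items() if j != i)

def alpha(adj, i, j):
    """alpha_ij(g) = 1 / (d_ij * Phi_j) (assumed reading); 0 if i and j are disconnected."""
    d = distances_from(adj, i).get(j)
    if i == j or d is None:
        return 0.0  # Remark 1: infinite distance gives probability zero
    return 1.0 / (d * closeness(adj, j))

def gamma(adj, i):
    """gamma_i(g): probability that i obtains storage from at least one agent."""
    return 1.0 - prod(1.0 - alpha(adj, i, j) for j in adj if j != i)

# Hypothetical three-agent path 1 - 2 - 3.
g = {1: {2}, 2: {1, 3}, 3: {2}}
print(alpha(g, 1, 2), alpha(g, 1, 3))  # provider 2 is closer to agent 1 than provider 3
print(gamma(g, 1), gamma(g, 2))        # end agent versus centre agent
```

Under this reading, the probabilities that a fixed provider j assigns to all other agents sum to one, which matches the description of j computing a probability distribution.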

2.3. Agent’s Utility and Symmetry

The utility of agent i in

g

is given by a function

u_{i} : G \to R^{+}

. Let u be the vector (profile) of utility functions

u = (u_{1}, \dots, u_{N})

. Thus, we have

u : G \to R^{N}

. In other words, each possible social storage cloud structure

g \in G

leads to a utility function profile for agents.

We define the utility of agents in a social storage cloud

g

with the following parameters. An agent, i, loses its data with probability

λ_{i} \in (0, 1)

. Therefore, to minimize this risk of data loss, agent i aims to backup its data on the storage provided by others. For agent i,

β_{i}

is the value of the local data that is to be backed up. Agent i obtains storage space provided by others in

g

, with probability

γ_{i} (g)

. Thus, the value of data

β_{i}

, the chance of losing the data

λ_{i}

, and the chance of obtaining storage space

γ_{i} (g)

capture the expected benefit of agent i in

g

.

An agent searches for storage by staying connected in the network. Direct as well as indirect links help agents to get storage space. The direct link between agents i and j costs

ς_{i}

. This cost can be interpreted as the cost required for maintaining storage space, infrastructure, bandwidth, time, and so on. The cost to maintain an existing link and that for adding (and maintaining) a new link are the same. There is no additional cost to add a new link. Thus, agent i incurs a total cost of

ς_{i} η_{i} (g)

in order to obtain an expected benefit of

β_{i} λ_{i} γ_{i} (g)

, in case of data loss. But the network is formed upfront, before the data loss happens. The cost to maintain links is, hence, incurred even in the case of no data loss, where the expected benefit to i is

β_{i} (1 - λ_{i})

.

Therefore, given the aforementioned parameters, the expected utility is

u_{i} (g) = β_{i} (1 - λ_{i}) + β_{i} λ_{i} γ_{i} (g) - ς_{i} η_{i} (g) .

(3)

Free riding (a situation where an agent offers less storage space, but consumes more) is a widely discussed issue in the literature on peer-to-peer storage. In order to deal with free riding, many P2P storage systems (for example, Internet Cooperative Backup System [29], PeerStore [30], Pastiche [31]) follow a symmetric storage-sharing mechanism, where agents share the same amount of storage space.

We define a symmetric social storage cloud

g

as follows.

Definition 1.

A symmetric social storage cloud (SSSC)

g

is a network where the benefit (value) associated with backed-up data is the same for all agents in the network, that is,

β_{i} = β_{j}

(say β),

ς_{i} = ς_{j}

(say ς⁷), and

λ_{i} = λ_{j}

(say λ⁸) for all

i, j \in A

, and hence, utility of each agent i in

g

is

u_{i} (g) = β (1 - λ) + β λ γ_{i} (g) - ς η_{i} (g),

(4)

where

λ, β, ς \in (0, 1)

.

For further study, we consider the above utility function (Equation (4)). Henceforth, whenever we refer to a network, or just

g

, we mean an SSSC.
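A short numerical sketch of Equation (4) may help fix ideas. The parameter values below are purely illustrative, and the storage-availability values passed in (0, 1, and 8/9) are assumptions computed by hand for an isolated agent, an agent in an isolated pair, and the centre of a three-agent path, reading Equation (2) with the provider's closeness as normaliser.

```python
def utility(beta: float, lam: float, sigma: float, gamma_i: float, degree_i: int) -> float:
    """Equation (4): u_i = beta*(1 - lam) + beta*lam*gamma_i - sigma*degree_i."""
    return beta * (1.0 - lam) + beta * lam * gamma_i - sigma * degree_i

beta, lam, sigma = 0.8, 0.5, 0.1  # illustrative values in (0, 1)

u_isolated = utility(beta, lam, sigma, gamma_i=0.0, degree_i=0)    # no links, no backup
u_paired = utility(beta, lam, sigma, gamma_i=1.0, degree_i=1)      # one link, certain backup
u_centre = utility(beta, lam, sigma, gamma_i=8 / 9, degree_i=2)    # two links, lower availability
print(u_isolated, u_paired, u_centre)
```

With these values the paired agent is better off than the isolated one, while the agent with two links pays more and obtains a lower storage availability, so its utility is lower than that of the paired agent.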

2.4. Pairwise Stability

In order to characterize endogenously built social storage cloud, we adopt pairwise stability [26] as a solution concept. A network is pairwise stable if (1) no agent benefits by deleting an existing link and (2) no two agents benefit by adding a new link between them.

Definition 2.

[26] A network

g

is pairwise stable if

for all $i, j \in g$ such that $〈 i j 〉 \in g$ , $u_{i} (g) \geq u_{i} (g - 〈 i j 〉)$ , and $u_{j} (g) \geq u_{j} (g - 〈 i j 〉)$ ; and
for all $i, j \in g$ such that $〈 i j 〉 \notin g$ , if $u_{i} (g + 〈 i j 〉) > u_{i} (g)$ , then $u_{j} (g + 〈 i j 〉) < u_{j} (g)$ .
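The two conditions of Definition 2 can be checked mechanically on small networks. The sketch below is one possible way to do so and is not taken from the paper: the helper names are hypothetical, the utility is Equation (4), and Equation (2) is read with the provider's closeness as normaliser (an assumption). A small tolerance is used for floating-point comparisons.

```python
from collections import deque
from itertools import combinations

def distances_from(adj, source):
    """Shortest hop counts from `source` to every reachable agent (BFS)."""
    dist, queue = {source: 0}, deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def gamma(adj, i):
    """Probability that i obtains storage from at least one other agent."""
    miss = 1.0
    for j, d in distances_from(adj, i).items():
        if j != i:
            phi_j = sum(1.0 / x for k, x in distances_from(adj, j).items() if k != j)
            miss *= 1.0 - 1.0 / (d * phi_j)  # assumed reading: alpha_ij = 1 / (d_ij * Phi_j)
    return 1.0 - miss

def utility(adj, i, beta, lam, sigma):
    """Equation (4) for agent i."""
    return beta * (1 - lam) + beta * lam * gamma(adj, i) - sigma * len(adj[i])

def toggled(adj, i, j):
    """Copy of the network with link <ij> added if absent, removed if present."""
    new = {a: set(nb) for a, nb in adj.items()}
    new[i] ^= {j}
    new[j] ^= {i}
    return new

def is_pairwise_stable(adj, beta, lam, sigma, tol=1e-12):
    u = {a: utility(adj, a, beta, lam, sigma) for a in adj}
    for i, j in combinations(adj, 2):
        alt = toggled(adj, i, j)
        gain_i = utility(alt, i, beta, lam, sigma) - u[i]
        gain_j = utility(alt, j, beta, lam, sigma) - u[j]
        if j in adj[i]:
            # Condition (1): neither endpoint strictly gains by deleting <ij>.
            if gain_i > tol or gain_j > tol:
                return False
        else:
            # Condition (2): if one endpoint strictly gains, the other must strictly lose.
            if (gain_i > tol and gain_j >= -tol) or (gain_j > tol and gain_i >= -tol):
                return False
    return True

null4 = {a: set() for a in range(1, 5)}
matching = {1: {2}, 2: {1}, 3: {4}, 4: {3}}
print(is_pairwise_stable(null4, 0.8, 0.5, 0.1), is_pairwise_stable(matching, 0.8, 0.5, 0.1))
print(is_pairwise_stable(null4, 0.8, 0.1, 0.1), is_pairwise_stable(matching, 0.8, 0.1, 0.1))
```

The check enumerates every pair once and evaluates both endpoints, which is enough because both conditions of Definition 2 are symmetric in i and j. It is exhaustive and therefore only practical for small N.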

3. Network Structure and Storage Availability

One of the objectives of this paper is to understand the impact of link addition and deletion on storage availability for those agents who are involved in the link addition/deletion as well as those who are not. The storage availability is determined by the distances between them and their closeness (from Equation (2)). Therefore, first we study how addition and deletion of a link impacts the shortest distances between pairs of agents and, therefore, their closeness. This analysis provides a base for understanding the effect of link-addition/deletion on agents’ storage availability in

g

.

3.1. Effect of Link Alteration on Closeness

Lemma 1.

Suppose

〈 i j 〉 \notin g

. Then,

Φ_{i} (g + 〈 i j 〉) > Φ_{i} (g)

.

Proof.

Clearly,

d_{i j} (g + 〈 i j 〉) < d_{i j} (g)

. As

〈 i j 〉 \notin g

, we have,

d_{i j} (g) \geq 2

. Also,

d_{i j} (g + 〈 i j 〉) = 1

. Thus,

Φ_{i} (g)

and

Φ_{j} (g)

increase by at least

\frac{d_{i j} (g) - 1}{d_{i j} (g)}

in

g + 〈 i j 〉

. ☐

Lemma 2.

Suppose

〈 i j 〉 \in g

. Then,

Φ_{i} (g - 〈 i j 〉) < Φ_{i} (g)

.

Proof.

Suppose there is no path between i and j in $g - 〈 i j 〉$ . Then $d_{i j} (g - 〈 i j 〉) = \infty$ , and thus $Φ_{i} (g)$ and $Φ_{j} (g)$ decrease by at least 1 in $g - 〈 i j 〉$ .
Now suppose there exists a path $P_{i j} (g - 〈 i j 〉)$ between i and j in $g - 〈 i j 〉$ . The distance between i and j in $g - 〈 i j 〉$ is at least 1 more than that in $g$ . Thus, $Φ_{i} (g)$ and $Φ_{j} (g)$ decrease by at least $\frac{d_{i j} (g - 〈 i j 〉) - 1}{d_{i j} (g - 〈 i j 〉)}$ in $g - 〈 i j 〉$ . ☐

Lemmas 1 and 2 show that, with respect to closeness, every link benefits agents on either side of the link. An action of link addition or deletion between a pair of agents not only impacts their closeness, but also that of other agents. Now, we study the impact of link addition or deletion between a pair of agents (say, i and j) on the closeness of the other agents

k \in g ∖ {i, j}

.

Lemma 3.

Suppose

〈 i j 〉 \notin g

and

k \in g ∖ {i, j}

. Then,

Φ_{k} (g) = Φ_{k} (g + 〈 i j 〉)

if and only if

d_{k l} (g) = d_{k l} (g + 〈 i j 〉)

for all

l \in g

.

Proof.

If

d_{k l} (g) = d_{k l} (g + 〈 i j 〉)

for all

l \in g

, then by Equation (1),

Φ_{k} (g) = Φ_{k} (g + 〈 i j 〉)

.

Conversely, suppose

Φ_{k} (g) = Φ_{k} (g + 〈 i j 〉)

.

It is easy to see that if, for some $l \in g$, $d_{k l} (g + 〈 i j 〉) \neq d_{k l} (g)$, then $d_{k l} (g + 〈 i j 〉) < d_{k l} (g)$, because every path in $g$ also exists in $g + 〈 i j 〉$.

We have

d_{k l} (g + 〈 i j 〉) \leq d_{k l} (g)

for all

l \in g

and, if there exists x such that

d_{k x} (g + 〈 i j 〉) < d_{k x} (g)

, then

Φ_{k} (g) < Φ_{k} (g + 〈 i j 〉)

, a contradiction. ☐

Lemma 4.

Suppose

〈 i j 〉 \in g

and

k \in g ∖ {i, j}

. Then,

Φ_{k} (g) = Φ_{k} (g - 〈 i j 〉)

if and only if

d_{k l} (g) = d_{k l} (g - 〈 i j 〉)

for all

l \in g

.

Proof.

As

d_{k l} (g - 〈 i j 〉) \geq d_{k l} (g)

for all l, the proof follows in lines similar to that of Lemma 3. ☐

We now show necessary and sufficient conditions for increase in the closeness of agents who are not involved in link addition or deletion.

Theorem 1.

Suppose

〈 i j 〉 \notin g

, and let k be an agent distinct from i and j. Then,

Φ_{k} (g) < Φ_{k} (g + 〈 i j 〉)

if and only if there exists at least one agent

l \in g

such that

d_{k l} (g) \geq 3

and all shortest paths

P_{k l} (g + 〈 i j 〉)

from k to l in

g + 〈 i j 〉

contain

〈 i j 〉

.

Proof.

Let

Φ_{k} (g) < Φ_{k} (g + 〈 i j 〉)

. Then, by Lemma 3, there must be at least one agent, say l, such that

d_{k l} (g) > d_{k l} (g + 〈 i j 〉)

.

Suppose

i, k,

and l are all distinct. Note that j may be the same as l.

If possible, let $d_{k l} (g) < d_{k i} (g) + d_{i j} (g) + d_{j l} (g)$ for all $l \in g$. Then, $d_{k l} (g) = d_{k l} (g + 〈 i j 〉)$ for all $l \in g$, and by Lemma 3, $Φ_{k} (g) = Φ_{k} (g + 〈 i j 〉)$, which contradicts $Φ_{k} (g) < Φ_{k} (g + 〈 i j 〉)$. Therefore, there exists an $l \in g$ such that $d_{k l} (g) = d_{k i} (g) + d_{i j} (g) + d_{j l} (g)$.

As

〈 i j 〉 \notin g

,

d_{i j} (g) \geq 2

. As

k \neq i

,

d_{i k} (g) \geq 1

and

j = l

. Hence,

d_{k l} (g) \geq 3

.

Now,

\begin{array}{rl}
d_{k l} (g + 〈 i j 〉) & = d_{k i} (g + 〈 i j 〉) + d_{i j} (g + 〈 i j 〉) + d_{j l} (g + 〈 i j 〉) \\
& = d_{k i} (g) + d_{i j} (g + 〈 i j 〉) + d_{j l} (g) \\
& < d_{k i} (g) + d_{i j} (g) + d_{j l} (g) \\
& = d_{k l} (g) .
\end{array}

It follows that every shortest path between k and l in

g + 〈 i j 〉

contains

〈 i j 〉

. (Note that if there exists a shortest path from k to l in

g + 〈 i j 〉

that does not contain

〈 i j 〉

, then this shortest path exists in

g

too).

Conversely, let

l \in g

such that

d_{k l} (g) \geq 3

and all shortest paths

P_{k l} (g + 〈 i j 〉)

from k to l in

g + 〈 i j 〉

contain

〈 i j 〉

.

Clearly,

Φ_{k} (g) \leq Φ_{k} (g + 〈 i j 〉)

.

If possible, let

Φ_{k} (g) = Φ_{k} (g + 〈 i j 〉)

. This means for every l in

g

there exists a shortest path from k to l in

g + 〈 i j 〉

that does not contain

〈 i j 〉

, a contradiction. Therefore,

Φ_{k} (g) < Φ_{k} (g + 〈 i j 〉)

. ☐
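Theorem 1 can be illustrated numerically on a small example. The sketch below uses a hypothetical five-agent path 1 - 2 - 3 - 4 - 5 in which agents 2 and 5 add a link; the network and the helper names are assumptions chosen for illustration only.

```python
from collections import deque

def distances_from(adj, source):
    """Shortest hop counts from `source` to every reachable agent (BFS)."""
    dist, queue = {source: 0}, deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(adj, i):
    """Phi_i(g) from Equation (1)."""
    return sum(1.0 / d for j, d in distances_from(adj, i).items() if j != i)

def add_link(adj, i, j):
    """Copy of the network with link <ij> added."""
    new = {a: set(nb) for a, nb in adj.items()}
    new[i].add(j)
    new[j].add(i)
    return new

path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
after = add_link(path, 2, 5)

for k in (1, 3, 4):  # agents not involved in the new link <2 5>
    far = sorted(l for l, d in distances_from(path, k).items() if d >= 3)
    print(k, far, closeness(path, k), closeness(after, k))
```

Agent 1 has agent 5 at distance 4 in the path, and its only shortest route to 5 after the addition uses the new link, so its closeness rises. Agent 3 has no agent at distance 3 or more, and agent 4 has agent 1 at distance 3 but keeps a shortest path to it that avoids the new link, so the closeness of both is unchanged. Both parts of the condition in Theorem 1 are therefore needed.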

Theorem 2.

Suppose

〈 i j 〉 \in g

, and let k be an agent distinct from i and j. Then,

Φ_{k} (g - 〈 i j 〉) < Φ_{k} (g)

if and only if there exists at least one agent

l \in g

such that

d_{k l} (g) \geq 2

and all shortest paths

P_{k l} (g)

from k to l in

g

contain

〈 i j 〉

.

We skip the proof as it is similar to the proof of Theorem 1.

In subsequent sections, we present our results due to link addition. We present our results on link deletion in Appendix B.

3.2. Effect of Closeness on Distances of Agents Not Involved in Link Alteration

In this section, we classify agents whose mutual distances from each other remain the same after link alteration. We use the same to analyze the effect of closeness on distances between agents who are not involved in the link addition or deletion.

Given k such that

Φ_{k} (g) < Φ_{k} (g + 〈 i j 〉)

, we use

L_{k}^{+}

to denote the set of all

l \in g

such that all shortest paths from k to l in

g + 〈 i j 〉

contain

〈 i j 〉

. We use

l_{k}^{+}

to denote an agent in

L_{k}^{+}

.

Proposition 1.

Suppose i, j, and k are distinct agents in

g

. Suppose l is another agent, distinct from i and k, and suppose

Φ_{k} (g + 〈 i j 〉) > Φ_{k} (g)

. If

d_{k i} (g + 〈 i j 〉) < d_{k j} (g + 〈 i j 〉) \leq d_{k l} (g + 〈 i j 〉)

, then

d_{i k} (g + 〈 i j 〉) = d_{i k} (g)

.

Proof.

We have

Φ_{k} (g) < Φ_{k} (g + 〈 i j 〉)

. Then, from Theorem 1, there exists

l \in g

such that all shortest paths

P_{k l} (g + 〈 i j 〉)

from k to l in

g + 〈 i j 〉

contain

〈 i j 〉

.

We consider the two cases

j = l

and

j \neq l

.

Suppose $j = l$ . As $d_{k i} (g + 〈 i j 〉) < d_{k j} (g + 〈 i j 〉)$ , k observes i before j on all shortest paths $P_{k l} (g + 〈 i j 〉)$ . This implies $d_{i k} (g + 〈 i j 〉) = d_{i k} (g)$ .
Suppose $j \neq l$ . As $d_{k i} (g + 〈 i j 〉) < d_{k j} (g + 〈 i j 〉) \leq d_{k l} (g + 〈 i j 〉)$ , k observes i before j, and j before l, on all shortest paths $P_{k l} (g + 〈 i j 〉)$ . This implies $d_{i k} (g + 〈 i j 〉) = d_{i k} (g)$ . ☐

Definition 3.

Suppose

〈 i j 〉 \notin g

and k is an agent such that

Φ_{k} (g) < Φ_{k} (g + 〈 i j 〉)

. A

(k, + i j)

-shortest-path-network,

g_{i j}^{k^{+}}

, is a subnetwork of

g + 〈 i j 〉

that consists of all shortest paths from k to

l_{k}^{+}

in

g + 〈 i j 〉

, which contain

〈 i j 〉

, for all

l_{k}^{+} \in L_{k}^{+}

.

Definition 4.

An (all $k, + i j$)-shortest-path-network, $g_{i j}^{+}$, is

⋃_{k \in g, \, Φ_{k} (g) < Φ_{k} (g + 〈 i j 〉)} g_{i j}^{k^{+}}

, the smallest network consisting of all $(k, + i j)$-shortest-path-networks.

Definition 5.

A sub-

(i, +)

-network,

g_{i}^{+}

of

g_{i j}^{+}

, is the induced subnetwork of

g_{i j}^{+}

consisting of all agents

k \in g_{i j}^{+}

such that

d_{i k} (g) = d_{i k} (g + 〈 i j 〉)

. Similarly, we define the sub-

(j, +)

-network,

g_{j}^{+}

of

g_{i j}^{+}

, as the induced subnetwork of

g_{i j}^{+}

consisting of all agents

l \in g_{i j}^{+}

such that

d_{j l} (g) = d_{j l} (g + 〈 i j 〉)

.

Refer to Appendix A for an illustration of the above definitions.

Proposition 2.

For all

k, \bar{k} \in g_{i}^{+}

,

d_{k \bar{k}} (g) = d_{k \bar{k}} (g + 〈 i j 〉)

.

Proof.

If

k, \bar{k} \in g_{i}^{+}

then, from Definition 5,

d_{i k} (g) = d_{i k} (g + 〈 i j 〉)

and

d_{i \bar{k}} (g) = d_{i \bar{k}} (g + 〈 i j 〉)

. As

k, \bar{k} \in g_{i j}^{+}

as well, there exists l and

\bar{l}

such that

d_{k l} (g) > d_{k l} (g + 〈 i j 〉)

and

d_{\bar{k} \bar{l}} (g) > d_{\bar{k} \bar{l}} (g + 〈 i j 〉)

.

It is sufficient to show that, given

\bar{k}

,

\bar{l}

can never be k.

If possible, let

\bar{l} = k

. Then, from Definition 5,

d_{\bar{k} i} (g) = d_{\bar{k} i} (g + 〈 i j 〉)

implies

\bar{k}

observes i first, and subsequently j to reach k, on all shortest paths

P_{\bar{k} k} (g + 〈 i j 〉)

from

\bar{k}

to k in

g + 〈 i j 〉

. Then,

d_{i k} (g) \neq d_{i k} (g + 〈 i j 〉)

. This is because, if $k = j$, then $d_{i k} (g) \geq 2 > d_{i k} (g + 〈 i j 〉) = 1$. Therefore,

k \notin g_{i}^{+}

, which is a contradiction. Now, if

k \neq j

, then k must first visit j, and later i, to reach

\bar{k}

on all shortest paths

P_{k \bar{k}} (g + 〈 i j 〉)

from k to $\bar{k}$. This implies

d_{i k} (g) \neq d_{i k} (g + 〈 i j 〉)

and hence,

k \notin g_{i}^{+}

, again, a contradiction. Thus,

k \neq \bar{l}

. ☐

We discuss our results on shortest distances due to link deletion in Appendix B.1.

3.3. Effect of Link Alteration on Storage Availability

Our aim here is to analyze under what conditions agents’ chance of obtaining storage space in the network increases or decreases by adding a new link. We present our results in the case of link deletion in Appendix B.2.

Lemma 5.

Suppose agent i and j add a direct link in

g

and let

k \notin g_{i j}^{+}

. Then,

α_{i k} (g) = α_{i k} (g + 〈 i j 〉)

and

α_{j k} (g) = α_{j k} (g + 〈 i j 〉)

.

Proof.

If agent

k \notin g_{i j}^{+}

then

Φ_{k} (g) = Φ_{k} (g + 〈 i j 〉)

. Thus,

d_{k i} (g) = d_{k i} (g + 〈 i j 〉)

. Therefore, from Equation (2),

α_{i k} (g) = α_{i k} (g + 〈 i j 〉)

. A similar proof holds for j too. ☐

Lemma 6.

Suppose agents i, j, k, and l are such that

i \neq j

,

j \neq k

,

i \neq l

, and

k \neq l

. (Agents i and k may be the same, and agents j and l may be the same). Suppose

〈 i j 〉 \notin g

,

k \in g_{i}^{+}

, and

l \in g_{j}^{+}

. Then,

$α_{k l} (g) < α_{k l} (g + 〈 i j 〉)$ , and
$i \neq k$ implies that $α_{i k} (g) > α_{i k} (g + 〈 i j 〉)$ . Similarly, if $j \neq l$ , then $α_{j l} (g) > α_{j l} (g + 〈 i j 〉)$ .

Proof.

Refer to Appendix C for the proof. ☐

Lemma 7.

Let k and

\bar{k}

be agents in

g_{i}^{+}

. Then,

α_{k \bar{k}} (g) = α_{k \bar{k}} (g + 〈 i j 〉)

and

α_{\bar{k} k} (g) = α_{\bar{k} k} (g + 〈 i j 〉)

.

Proof.

The proof follows from Proposition 2. ☐

Theorem 3.

Suppose agents i and j are such that $i \neq j$ and $〈 i j 〉 \notin g$. Then,

γ_{i} (g) < γ_{i} (g + 〈 i j 〉)

if and only if

\frac{\prod_{k \in g_{i}^{+}} (1 - α_{i k} (g + 〈 i j 〉))}{\prod_{l \in g_{j}^{+}} (1 - α_{i l} (g))} < \frac{\prod_{k \in g_{i}^{+}} (1 - α_{i k} (g))}{\prod_{l \in g_{j}^{+}} (1 - α_{i l} (g + 〈 i j 〉))}

.

Additionally,

γ_{i} (g) < γ_{i} (g + 〈 i j 〉)

if and only if

\frac{\prod_{k \in g_{i}^{+}} (α_{i k} (g + 〈 i j 〉))}{\prod_{l \in g_{j}^{+}} (α_{i l} (g))} > \frac{\prod_{k \in g_{i}^{+}} (α_{i k} (g))}{\prod_{l \in g_{j}^{+}} (α_{i l} (g + 〈 i j 〉))}

.

Proof.

The proof follows from Lemmas 5, 6, and 7. ☐

3.4. Externalities

In this section, we study externalities, that is, how a link that is added between a pair of agents affects the utility of others. (Refer to Definition 6). The particular form of externalities (positive, negative, or none) is crucial in determining which network is likely to evolve and the conditions under which it will lead to a stable and efficient network.

Definition 6.

[32] Consider a network,

g

, with agents

i, j \in g

such that

i \neq j

and

〈 i j 〉 \notin g

. Suppose agents i and j form a direct link

〈 i j 〉

. Then, agent

k \in g ∖ {i, j}

experiences

Positive externalities if $u_{k} (g + 〈 i j 〉) > u_{k} (g)$ ;
Negative externalities if $u_{k} (g + 〈 i j 〉) < u_{k} (g)$ ;
No externalities if $u_{k} (g + 〈 i j 〉) = u_{k} (g)$ .

We now show that the type of externalities an agent

k \in g

experiences, can be determined using conditions on the storage availability, independent of the data loss rate and the value that agents associate with their data.

Proposition 3.

In an SSSC

g

, an agent

k \in g

experiences

Positive externalities if $γ_{k} (g + 〈 i j 〉) > γ_{k} (g)$ ;
Negative externalities if $γ_{k} (g + 〈 i j 〉) < γ_{k} (g)$ ;
No externalities if $γ_{k} (g + 〈 i j 〉) = γ_{k} (g)$ .

Proof.

By Definition 6, $u_{k} (g + 〈 i j 〉) > u_{k} (g)$
$\Rightarrow β (1 - λ) + β λ γ_{k} (g + 〈 i j 〉) - ς η_{k} (g + 〈 i j 〉) > β (1 - λ) + β λ γ_{k} (g) - ς η_{k} (g)$ .
As agent k does not pay the cost for link $〈 i j 〉$ , we have $ς η_{k} (g + 〈 i j 〉) = ς η_{k} (g)$ .
Thus, $β λ γ_{k} (g + 〈 i j 〉) > β λ γ_{k} (g) \Rightarrow γ_{k} (g + 〈 i j 〉) > γ_{k} (g)$ .
For Cases 2 and 3, the proof is similar to that of Case 1. ☐
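Proposition 3 reduces the classification of externalities to a comparison of storage availability before and after the link is added. The sketch below applies that comparison directly; the function names are hypothetical, and Equation (2) is read with the provider's closeness as normaliser, which is an assumption of this example.

```python
from collections import deque

def distances_from(adj, source):
    """Shortest hop counts from `source` to every reachable agent (BFS)."""
    dist, queue = {source: 0}, deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def gamma(adj, i):
    """Probability that i obtains storage from at least one other agent."""
    miss = 1.0
    for j, d in distances_from(adj, i).items():
        if j != i:
            phi_j = sum(1.0 / x for k, x in distances_from(adj, j).items() if k != j)
            miss *= 1.0 - 1.0 / (d * phi_j)  # assumed reading: alpha_ij = 1 / (d_ij * Phi_j)
    return 1.0 - miss

def externality(adj, i, j, k, tol=1e-12):
    """Classify the externality on k when i and j add link <ij> (Proposition 3)."""
    after = {a: set(nb) for a, nb in adj.items()}
    after[i].add(j)
    after[j].add(i)
    change = gamma(after, k) - gamma(adj, k)
    if change > tol:
        return "positive"
    if change < -tol:
        return "negative"
    return "none"

# Hypothetical network: path 1 - 2 - 3 plus an isolated agent 4; agents 1 and 3 add a link.
g = {1: {2}, 2: {1, 3}, 3: {2}, 4: set()}
print(externality(g, 1, 3, 2))  # centre agent 2
print(externality(g, 1, 3, 4))  # isolated agent 4
```

In this example the distances from agent 2 do not change, but its two neighbours become closer to each other and so allocate less of their probability mass to agent 2. The isolated agent is unaffected because its storage availability is zero both before and after.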

The following results provide a necessary and sufficient condition under which an agent

k \in g

experiences positive or negative externalities.

Lemma 8.

Let i, j, and k be distinct agents in

g

. Suppose

k \notin g_{i j}^{+}

. Then, k experiences only negative externalities.

Proof.

If agents i and j add a direct link in

g

, then, from Lemma 1,

Φ_{i} (g) < Φ_{i} (g + 〈 i j 〉)

. If

k \notin g_{i j}^{+}

, then, from Theorem 1,

Φ_{k} (g) = Φ_{k} (g + 〈 i j 〉)

, thus,

d_{k l} (g) = d_{k l} (g + 〈 i j 〉)

for all

l \in g

. Therefore,

α_{k i} (g + 〈 i j 〉) < α_{k i} (g)

, by Equation (2). Now, for all

l \in g

,

Φ_{l} (g) \leq Φ_{l} (g + 〈 i j 〉)

. If

Φ_{l} (g) = Φ_{l} (g + 〈 i j 〉)

, then

α_{k l} (g + 〈 i j 〉) = α_{k l} (g)

and, if

Φ_{l} (g) < Φ_{l} (g + 〈 i j 〉)

, then

α_{k l} (g + 〈 i j 〉) < α_{k l} (g)

. Thus,

γ_{k} (g + 〈 i j 〉) < γ_{k} (g)

. ☐

Theorem 4.

Suppose agents i, j, k,

\bar{k}

, and l are such that

i \neq j

,

i \neq k

,

i \neq l

,

j \neq k

,

k \neq \bar{k}

, and

k \neq l

. (Agents i and

\bar{k}

may be the same, and agents j and l may be the same). Suppose

〈 i j 〉 \notin g

,

\bar{k} \in g_{i}^{+}

and

l \in g_{j}^{+}

. Then, agent k experiences positive externalities if and only if

k \in g_{i j}^{+}

and

\frac{\prod_{\bar{k} \in g_{i}^{+}} (1 - α_{k \bar{k}} (g + 〈 i j 〉))}{\prod_{l \in g_{j}^{+}} (1 - α_{k l} (g))} < \frac{\prod_{\bar{k} \in g_{i}^{+}} (1 - α_{k \bar{k}} (g))}{\prod_{l \in g_{j}^{+}} (1 - α_{k l} (g + 〈 i j 〉))}

, otherwise k experiences negative externalities.

Proof.

By Lemma 8, an increase in agent k’s closeness is necessary for k to experience positive externalities. It is straightforward to observe that if

\frac{\prod_{\bar{k} \in g_{i}^{+}} (1 - α_{k \bar{k}} (g + 〈 i j 〉))}{\prod_{l \in g_{j}^{+}} (1 - α_{k l} (g))} < \frac{\prod_{\bar{k} \in g_{i}^{+}} (1 - α_{k \bar{k}} (g))}{\prod_{l \in g_{j}^{+}} (1 - α_{k l} (g + 〈 i j 〉))}

, then

γ_{k} (g + 〈 i j 〉) > γ_{k} (g)

. Thus, k experiences positive externalities.

Conversely, let $γ_{k} (g + 〈 i j 〉) < γ_{k} (g)$. Then either, as in Lemma 8, $d_{k l} (g) = d_{k l} (g + 〈 i j 〉)$ for all $l \in g$, or

\frac{\prod_{\bar{k} \in g_{i}^{+}} (1 - α_{k \bar{k}} (g + 〈 i j 〉))}{\prod_{l \in g_{j}^{+}} (1 - α_{k l} (g))} > \frac{\prod_{\bar{k} \in g_{i}^{+}} (1 - α_{k \bar{k}} (g))}{\prod_{l \in g_{j}^{+}} (1 - α_{k l} (g + 〈 i j 〉))}

. ☐

Lemma 8 and Theorem 4 show that an increase in the closeness of an agent (who is not involved in the link formation) is necessary for that agent to experience positive externalities. Although we have provided a necessary and sufficient condition for positive and negative externalities by performing a microscopic analysis of externalities, it is hard to obtain a general characterization of networks where agents experience only positive externalities. This leads us to the following question: at least for specific network structures, can we show positive (or negative) externalities? For instance, in a network of diameter two, agents never experience positive externalities: no pair of agents is at distance three or more, so by Theorem 1 the closeness of an agent outside the new link cannot increase, and by Lemma 8 such an agent experiences only negative externalities.

4. Characterization of Stable and Efficient Networks

One of the central focuses of this study is to analyze what network is likely to emerge when each agent (or pair of agents) decides selfishly which link they want to delete (respectively, whether to add a link or not), when agents build their social connections (links) based on the benefit associated with their data, the cost for link formation, and the prevailing data loss rate.

In the following subsections, we discuss pairwise stable networks, efficient networks, and the measures of efficiency, namely, price of anarchy (PoA) and price of stability (PoS). In our analysis of stable and efficient networks, we assume that network formation takes place starting with the null network (where there are no links between any pair of agents).

4.1. Stable Networks: Characterization, Existence, and Uniqueness

We now discuss the conditions under which an agent prefers to add a new link or delete an existing link, and use the same to characterize stable networks.

Lemma 9.

Let

〈 i j 〉 \notin g

. An agent

i \in g

benefits from adding a direct link with agent

j \in g

if and only if

β λ [γ_{i} (g + 〈 i j 〉) - γ_{i} (g)] > ς

.

Proof.

Agent i has incentive to form a link with agent j if and only if

u_{i} (g + 〈 i j 〉) > u_{i} (g)

\Rightarrow β (1 - λ) + β λ γ_{i} (g + 〈 i j 〉) - ς (η_{i} (g) + 1) > β (1 - λ) + β λ γ_{i} (g) - ς η_{i} (g)

\Rightarrow β λ [γ_{i} (g + 〈 i j 〉) - γ_{i} (g)] > ς

. ☐

Corollary 1.

An agent

i \in g

has no incentive to add a link with agent

j \in g

if and only if

λ [γ_{i} (g + 〈 i j 〉) - γ_{i} (g)] \leq \frac{ς}{β}

.

Lemma 10.

Let

〈 i j 〉 \in g

. An agent

i \in g

benefits by deleting a link with agent j if and only if

β λ [γ_{i} (g) - γ_{i} (g - 〈 i j 〉)] < ς

.

Proof.

An agent i has incentive to delete a link with agent j if and only if

u_{i} (g - 〈 i j 〉) > u_{i} (g)

.

\Rightarrow β (1 - λ) + β λ γ_{i} (g - 〈 i j 〉) - ς (η_{i} (g) - 1) > β (1 - λ) + β λ γ_{i} (g) - ς η_{i} (g)

\Rightarrow ς > β λ [γ_{i} (g) - γ_{i} (g - 〈 i j 〉)]

. ☐

Corollary 2.

An agent i has no incentive to delete an existing link with agent j if and only if

λ [γ_{i} (g) - γ_{i} (g - 〈 i j 〉)] \geq \frac{ς}{β}

.
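The deviation conditions in Lemmas 9 and 10 can be checked numerically. The following is a minimal sketch, not part of the original analysis: it evaluates the utility expression used in the proofs above and compares the resulting utility differences with the closed-form conditions. The function names and the sample parameter values are assumptions made only for this illustration.

```python
from fractions import Fraction as F


def utility(beta, lam, cost, gamma, degree):
    """u_i = beta*(1 - lam) + beta*lam*gamma_i - cost*eta_i, as used in the proofs."""
    return beta * (1 - lam) + beta * lam * gamma - cost * degree


def gains_from_adding(beta, lam, cost, gamma_before, gamma_after):
    """Lemma 9: adding a link pays iff beta*lam*(gamma_after - gamma_before) > cost."""
    return beta * lam * (gamma_after - gamma_before) > cost


def gains_from_deleting(beta, lam, cost, gamma_with, gamma_without):
    """Lemma 10: deleting a link pays iff beta*lam*(gamma_with - gamma_without) < cost."""
    return beta * lam * (gamma_with - gamma_without) < cost


# Hypothetical parameter values, chosen only for illustration.
beta, lam, cost = F(100), F(2, 100), F(1)

# An isolated agent (gamma = 0, degree 0) forming its first link (gamma = 1, degree 1).
add_by_utility = utility(beta, lam, cost, F(1), 1) > utility(beta, lam, cost, F(0), 0)
add_by_lemma = gains_from_adding(beta, lam, cost, F(0), F(1))

# The same agent considering the deletion of that single link.
del_by_utility = utility(beta, lam, cost, F(0), 0) > utility(beta, lam, cost, F(1), 1)
del_by_lemma = gains_from_deleting(beta, lam, cost, F(1), F(0))

print(add_by_utility, add_by_lemma, del_by_utility, del_by_lemma)
```

With these values, the direct utility comparison and the lemma conditions agree: the first link is worth adding and not worth deleting. Exact fractions are used so that the boundary case in Corollaries 1 and 2 (equality with ς/β) is not blurred by floating-point rounding.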

Proposition 4, stated below, provides an easy characterization of a stable network

g

.

Proposition 4.

A network

g

is pairwise stable if and only if

for all $i, j \in g, λ [γ_{i} (g) - γ_{i} (g - 〈 i j 〉)] \geq \frac{ς}{β}$ , and $λ [γ_{j} (g) - γ_{j} (g - 〈 i j 〉)] \geq \frac{ς}{β}$ ; and
for all $i, j \in g,$ if $λ [γ_{i} (g + 〈 i j 〉) - γ_{i} (g)] > \frac{ς}{β}$ , then $λ [γ_{j} (g + 〈 i j 〉) - γ_{j} (g)] < \frac{ς}{β}$ .

Proof.

The proof follows from Definition 2, Corollary 2, and Lemma 9. ☐

In the following theorem, we prove existence and uniqueness of pairwise stable networks, given the values of the parameters.

Theorem 5.

There always exists a pairwise stable network. Given N, there exist exactly two pairwise stable networks. For each

β, ς

, and λ, the pairwise stable network

g

is unique.

1.

If

λ \leq \frac{ς}{β}

, then

g

is the null network.

2.

If

λ > \frac{ς}{β}

, then

g

consists of

(a): a set of $\frac{N}{2}$ connected pairs of agents, if N is even; or
(b): a set of $\frac{N - 1}{2}$ connected pairs of agents and one isolated agent, if N is odd.

Proof.

Initially, all agents are isolated in

g

, hence, for all i∈

g

,

γ_{i} (g) = 0

.

If agents i and j form a direct link

〈 i j 〉

, then

Φ_{i} (g + 〈 i j 〉) = Φ_{j} (g + 〈 i j 〉) = 1 .

Thus,

γ_{i} (g + 〈 i j 〉) = γ_{j} (g + 〈 i j 〉) = 1

.

However, from Lemma 9, agents i and j benefit by forming a direct link if and only if

λ β [γ_{i} (g + 〈 i j 〉) - γ_{i} (g)] > ς

and

λ β [γ_{j} (g + 〈 i j 〉) - γ_{j} (g)] > ς

, respectively.

This implies that a pair of agents have no incentive to add a direct link if and only if

λ \leq \frac{ς}{β}

.

Therefore,

g

is the null network. This completes the proof of 1.

Now, if

λ > \frac{ς}{β}

, then every pair of agents has an incentive to add a direct link. Suppose agents i and j add a direct link, and suppose link

〈 i j 〉

is the only link in the network, say

g^{'}

. Let k be another agent, different from i and j, in

g^{'}

. Then,

γ_{i} (g^{'} + 〈 i k 〉) = 1 - {(1 - \frac{1}{1.5})}^{2}

.

By Lemma 9, agent i benefits by adding the link

〈 i k 〉

if and only if

λ [γ_{i} (g^{'} + 〈 i k 〉) - γ_{i} (g^{'})] > \frac{ς}{β}

. Here,

λ [γ_{i} (g^{'} + 〈 i k 〉) - γ_{i} (g^{'})] < 0 ≯ \frac{ς}{β}

.

This implies that no agent benefits by adding more than one link, proving 2. ☐
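The quantities in this proof can be reproduced on small networks. The sketch below is illustrative and rests on stated assumptions: Equation (1) is taken to be the sum of inverse shortest-path distances, Equation (2) is taken to be α_ij = (1/d_ij)/Φ_j (the form used in Appendix C), and γ_i is taken to be one minus the product of (1 − α_ij) over all other agents j. Under these assumptions, γ_i(g') = 1 and γ_i(g' + ⟨ik⟩) = 8/9, so the marginal gain from a second link is −λ/9. The helper names and the parameter values are hypothetical.

```python
from collections import deque
from fractions import Fraction


def hop_distances(adj, src):
    """Breadth-first search: hop distance from src to every reachable agent."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness(adj, i):
    """Assumed Equation (1): sum of 1/d_ik over agents k reachable from i."""
    return sum(Fraction(1, d) for k, d in hop_distances(adj, i).items() if k != i)


def alpha(adj, i, j):
    """Assumed Equation (2): (1/d_ij) / closeness of j; zero if j cannot reach i."""
    dist = hop_distances(adj, j)
    if i == j or i not in dist:
        return Fraction(0)
    return Fraction(1, dist[i]) / closeness(adj, j)


def gamma(adj, i):
    """Assumed form: probability that i obtains storage from at least one agent."""
    miss = Fraction(1)
    for j in adj:
        miss *= 1 - alpha(adj, i, j)
    return 1 - miss


def with_link(adj, i, j, present):
    """Copy of adj in which link <ij> is present (True) or absent (False)."""
    new = {a: set(nb) for a, nb in adj.items()}
    if present:
        new[i].add(j)
        new[j].add(i)
    else:
        new[i].discard(j)
        new[j].discard(i)
    return new


def is_pairwise_stable(adj, lam, ratio):
    """Conditions of Proposition 4, with ratio = cost / beta."""
    agents = sorted(adj)
    for x, i in enumerate(agents):
        for j in agents[x + 1:]:
            if j in adj[i]:
                cut = with_link(adj, i, j, False)
                # An endpoint prefers deletion if its loss in gamma is too small.
                if any(lam * (gamma(adj, a) - gamma(cut, a)) < ratio for a in (i, j)):
                    return False
            else:
                ext = with_link(adj, i, j, True)
                gi = lam * (gamma(ext, i) - gamma(adj, i))
                gj = lam * (gamma(ext, j) - gamma(adj, j))
                if (gi > ratio and gj >= ratio) or (gj > ratio and gi >= ratio):
                    return False
    return True


null = {a: set() for a in "ijkl"}
pairs = with_link(with_link(null, "i", "j", True), "k", "l", True)
path = with_link(with_link({a: set() for a in "ijk"}, "i", "j", True), "i", "k", True)

print(gamma(pairs, "i"))  # 1
print(gamma(path, "i"))   # 8/9
lam = Fraction(2, 100)    # hypothetical data loss rate
print(is_pairwise_stable(pairs, lam, Fraction(1, 100)))
print(is_pairwise_stable(null, lam, Fraction(1, 100)))
print(is_pairwise_stable(null, lam, Fraction(5, 100)))
```

The last three lines probe the two cases of Theorem 5 on four agents: linked pairs when λ > ς/β, and the null network when λ ≤ ς/β.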

4.2. Efficient Network, Price of Anarchy, and Price of Stability

We analyze whether the network formed by self-interested agents is also efficient, that is, socially optimal or, in other words, “good” for all the agents put together.

Definition 7.

A social storage network

g

is efficient with respect to utility profile

(u_{1}, . . ., u_{N})

if

\sum_{i \in N} [β (1 - λ) + β λ γ_{i} (g) - ς η_{i} (g)] \geq \sum_{i \in N} [β (1 - λ) + β λ γ_{i} (\bar{g}) - ς η_{i} (\bar{g})], \forall \bar{g} \in G (N)

.

It might be possible that when self-interested agents build their social connections for their own benefit, the resulting network formation will lead to a “bad” outcome from a societal viewpoint. That is, the resulting network may be advantageous for a set of agents, while other agents may not be benefited by the outcome. This results in an inefficient network. In this state of affairs, we would like to measure how far a pairwise stable network is from an efficient network. For this, we make use of the widely discussed measures, namely, price of anarchy (PoA) and price of stability (PoS). We define these measures as follows.

Definition 8.

The price of anarchy (PoA) is the ratio of the lowest total utility attained in an equilibrium network to the optimal total utility attained in any network.

Definition 9.

The price of stability (PoS) is the ratio of the highest total utility attained in an equilibrium network to the optimal total utility attained in any network.
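As a small illustration of Definitions 7 to 9, the sketch below computes the total utility of a network from per-agent values of γ_i and η_i and forms the two ratios. The function names and all numeric values are assumptions introduced for this example; in particular, the second list of equilibrium welfares is purely hypothetical and does not come from the model.

```python
from fractions import Fraction


def welfare(beta, lam, cost, gammas, degrees):
    """Total utility: sum over agents of beta*(1-lam) + beta*lam*gamma_i - cost*eta_i."""
    return sum(
        beta * (1 - lam) + beta * lam * g - cost * d
        for g, d in zip(gammas, degrees)
    )


def price_of_anarchy(equilibrium_welfares, optimal_welfare):
    """Worst equilibrium welfare divided by the optimal welfare."""
    return Fraction(min(equilibrium_welfares)) / Fraction(optimal_welfare)


def price_of_stability(equilibrium_welfares, optimal_welfare):
    """Best equilibrium welfare divided by the optimal welfare."""
    return Fraction(max(equilibrium_welfares)) / Fraction(optimal_welfare)


# Four agents in two linked pairs: gamma_i = 1 and eta_i = 1 for every agent.
beta, lam, cost = Fraction(100), Fraction(1, 50), Fraction(1)  # hypothetical values
w_pairs = welfare(beta, lam, cost, [Fraction(1)] * 4, [1] * 4)
print(w_pairs)  # 396

# If this network is the only equilibrium and is also optimal, both ratios equal 1.
print(price_of_anarchy([w_pairs], w_pairs), price_of_stability([w_pairs], w_pairs))

# A purely hypothetical contrast with a second, worse equilibrium.
print(price_of_anarchy([300, 396], 396), price_of_stability([300, 396], 396))
```

Because utilities are maximized here, both ratios are at most one under these definitions, and a value of one is the best possible.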

Theorem 6.

Every pairwise stable network is efficient. Therefore, PoA = 1. In addition, every efficient network is pairwise stable. Hence, PoS = 1.

Proof.

The proof follows from Theorem 5 and Definition 7, the fact that network formation starts with the null network, and the fact [32] that PoS = 1 if and only if every efficient network is pairwise stable, and PoA = 1 if and only if all pairwise stable networks are efficient. ☐

In Appendix D, we discuss our experimental results on random stable networks where, for 150 random scenarios, no agent loses its data. That is, even if the disk of an agent fails, in our random experiments, the disk of the agent’s neighbor (from whom it can retrieve its data) is intact.

5. Conclusions

In this paper, we present a model of social storage cloud network formation, where agents (involved in storage sharing and data backup) wish to form a network strategically. The agents in this network strive to increase the probability of obtaining storage space by minimizing their distances to others. We propose a degree-distance-based utility function and use it to study network formation. We also study the impact of link addition (deletion) between a pair of agents on shortest distances, closeness, and storage availability.

We study the deviation conditions under which agents have an incentive to add or delete a link in a given network structure. With these conditions, we analyze pairwise stability and efficiency of the social storage cloud. We show that there always exists a unique pairwise stable network, which is also efficient. Hence, the price of anarchy and the price of stability are both one.

5.1. Research Implications

In the social cloud literature, the issues of low service availability (for example, data and storage availability) and imbalanced workload (that lead to low storage utilization) are strongly correlated with the number of social contacts. The studies [33,34] show that the small friend set is a cause of low service availability as well as poor storage utilization. However, it is worth noting that these findings are drawn in the context of exogenous social contacts. We show that, for the given utility function in symmetric social storage, if agents are allowed to select their storage partners, then each agent wants to form a social connection with only one other agent, or in other words, each agent has only one neighbor. We infer that if agents select their partners by looking at their cost-benefit trade-offs, then the issues discussed above are more significant than in the context of exogenous social contacts.

We believe that our analysis of storage availability and network formation has several advantages from the point of view of storage providers (for example, BuddyBackup, CrashPlan, Friendstore). The analysis of network stability may help design efficient data redundancy strategies that suggest how many data pieces are needed on the storage space provided by partners in order to achieve the required level of data availability. It also helps design efficient workload strategies to maximize storage utilization. One of the advantages of endogenous network formation is that it gives agents more control over their data and over the selection of their storage partners. Further, our approach to the analysis of network stability is useful for agents who are part of the Friendstore storage system; it is easy for them to calculate their maintainable capacity [2] so as to maximize their storage utilization and data reliability.

5.2. Limitations

Despite the above advantages and implications, our study has several limitations. Firstly, our social storage model stands on the assumption (similar to various network formation models [35,36,37,38]) that agents have complete information about the network structure. Though we do not require this assumption during network formation as, owing to Theorem 5, links form (at most) pairwise, this assumption is crucial for our analysis in Section 3 on closeness and distances. Secondly, although the proposed utility model captures various parameters essential for understanding social storage cloud formation, we cannot rule out that parameters like online availability of agents, trust between them, and the bandwidth they have may influence the network formation.

5.3. Future Scope

In this paper, though the utility function we propose is for heterogeneous agents, our analysis is limited to homogeneous agents (or symmetric social storage cloud systems). In the case of heterogeneous agents, it would be interesting to see how externalities will influence social storage cloud formation. Our analysis of storage availability will also be more relevant in this setting.

Further, we can enrich the utility model proposed in this paper by taking the above mentioned parameters (that is, online availability, bandwidth, and trust) into account. One can then study social storage cloud formation with both complete and incomplete information. For example, in the incomplete information setting, agents know neither the network structure nor the online availability and bandwidth of others. Analysis in this context will give more insight into social storage cloud formation.

Author Contributions

All authors have contributed equally to all aspects of this paper.

Funding

This research received no external funding.

Acknowledgments

The authors would like to acknowledge the editor for the help and the anonymous reviewers for their valuable comments and suggestions, which have improved the quality of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Example

Example A1.

Consider networks

g

and

g + 〈 i j 〉

, as shown in Figure A1a,b, respectively. The newly added link

〈 i j 〉

in

g

increases the closeness of agents

k, c, d, f, g, j, i, h

, and l in

g + 〈 i j 〉

. For instance, from Equation (1), we have

Φ_{k} (g) = 4.60

and

Φ_{k} (g + 〈 i j 〉) = 4.87

. However, there is no change in the closeness of agents

a, b

, and e. For instance,

Φ_{a} (g) = 6.17 = Φ_{a} (g + 〈 i j 〉)

. Figure A1c shows the shortest-path-network

g_{i j}^{k^{+}}

for agent k, where the set

L_{k}^{+}

consists of agents

j, g, h

, and l. This suggests that the newly added link

〈 i j 〉

in

g

brings agents

j, g, h

, and l close to k in

g + 〈 i j 〉

, and therefore,

Φ_{k} (g) < Φ_{k} (g + 〈 i j 〉)

. Figure A1d represents the shortest-path-network

g_{i j}^{+}

that satisfies Definition 4. In this case,

g_{i j}^{+}

is the union of

g_{i j}^{k^{+}}, g_{i j}^{c^{+}}, g_{i j}^{d^{+}}, g_{i j}^{f^{+}}, g_{i j}^{g^{+}}, g_{i j}^{h^{+}}, g_{i j}^{i^{+}}, g_{i j}^{j^{+}}

, and

g_{i j}^{l^{+}}

.

Figure A1e,f show the induced subnetworks

g_{i}^{+}

and

g_{j}^{+}

of

g_{i j}^{+}

, respectively. The induced subnetworks

g_{i}^{+}

and

g_{j}^{+}

are as per Definition 5. In

g_{i}^{+}

, the distances between agent i, and other agents

k, c, d

, and f are the same in networks

g

and

g + 〈 i j 〉

. We have

d_{i k} (g) = d_{i k} (g + 〈 i j 〉) = 2

,

d_{i c} (g) = d_{i c} (g + 〈 i j 〉) = 2

,

d_{i d} (g) = d_{i d} (g + 〈 i j 〉) = 1

, and

d_{i f} (g) = d_{i f} (g + 〈 i j 〉) = 1

. Similarly, in

g_{j}^{+}

, the distances between agent j, and other agents

g, h

, and l are the same in network

g

and

g + 〈 i j 〉

;

d_{j g} (g) = d_{j g} (g + 〈 i j 〉) = 1

,

d_{j h} (g) = d_{j h} (g + 〈 i j 〉) = 1

, and

d_{j l} (g) = d_{j l} (g + 〈 i j 〉) = 2

.

Figure A1. Induced subnetworks of

g + 〈 i j 〉

.

Appendix B. Results Owing to Link Deletion

Appendix B.1. Effect of Closeness on Distances of Agents not Involved in Link Deletion

Given k such that

Φ_{k} (g) > Φ_{k} (g - 〈 i j 〉)

, we use

L_{k}^{-}

to denote the set of all

l \in g

such that all shortest paths from k to l in

g

contain

〈 i j 〉

. We use

l_{k}^{-}

to denote an agent in

L_{k}^{-}

.

Proposition A1.

Let

i \neq j \neq k

,

l \neq k

,

i \neq l

, and

Φ_{k} (g - 〈 i j 〉) < Φ_{k} (g)

. If

d_{i k} (g) < d_{i j} (g) \leq d_{k l} (g)

, then

d_{i k} (g - 〈 i j 〉) = d_{i k} (g)

.

We skip the proof as it is similar to the proof of Proposition 1.

Definition A1.

Suppose

〈 i j 〉 \in g

and k is an agent such that

Φ_{k} (g) > Φ_{k} (g - 〈 i j 〉)

. A

(k, - i j)

-shortest-path-network containing

〈 i j 〉

,

g_{i j}^{k^{-}}

, is a subnetwork of

g

that consists of all shortest paths from k to

l_{k}^{-}

in

g

, which contain

〈 i j 〉

, for all

l_{k}^{-} \in L_{k}^{-}

.

Definition A2.

An (all

k, - i j)

-shortest-path-network,

g_{i j}^{-}

is

⋃_{\underset{Φ_{k} (g) > Φ_{k} (g - 〈 i j 〉)}{k \in g,}} g_{i j}^{k^{-}}

, the smallest network that contains all

(k, - i j)

-shortest-path-networks.

Definition A3.

A sub-

(i, -)

-network,

g_{i}^{-}

of

g_{i j}^{-}

, is the induced subnetwork of

g_{i j}^{-}

consisting of all agents

k \in g_{i j}^{-}

such that

d_{i k} (g) = d_{i k} (g - 〈 i j 〉)

. Similarly, we define the sub-

(j, -)

-network,

g_{j}^{-}

of

g_{i j}^{-}

, as the induced subnetwork of

g_{i j}^{-}

consisting of all agents

l \in g_{i j}^{-}

such that

d_{j l} (g) = d_{j l} (g - 〈 i j 〉)

.

Proposition A2.

Let

k, \hat{k} \in g_{i}^{-}

. Then, for all

k, \hat{k} \in g_{i}^{-}

,

d_{k \hat{k}} (g) = d_{k \hat{k}} (g - 〈 i j 〉)

.

We skip the proof as it is similar to the proof of Proposition 2.

Appendix B.2. Effect of Link Deletion on Storage Availability

We discuss the effect of link deletion on agents’ storage availability.

Lemma A1.

Suppose

〈 i j 〉 \in g

and

k \notin g_{i j}^{-}

. Then,

α_{i k} (g) = α_{i k} (g - 〈 i j 〉)

and

α_{j k} (g) = α_{j k} (g - 〈 i j 〉)

.

Proof.

The proof follows from Proposition A1 and is along similar lines to the proof of Lemma 5. ☐

Lemma A2.

Suppose agents i, j, k, and l are such that

i \neq j

,

j \neq k

,

i \neq l

, and

k \neq l

. (Agents i and k may be the same, and agents j and l may be the same). Suppose

〈 i j 〉 \in g

,

k \in g_{i}^{-}

, and

l \in g_{j}^{-}

. Then,

1.: $α_{k l} (g) > α_{k l} (g - 〈 i j 〉)$ .
2.: If $i \neq k$ , then $α_{i k} (g) < α_{i k} (g - 〈 i j 〉)$ . Similarly, if $j \neq l$ , then $α_{j l} (g) < α_{j l} (g - 〈 i j 〉)$ .

Proof.

The proof of 1 is along similar lines to the proof of 1 of Lemma 6. The proof of 2 follows from Proposition A2. ☐

Lemma A3.

Let k and

\hat{k}

be agents in

g_{i}^{-}

. Then,

α_{k \hat{k}} (g) = α_{k \hat{k}} (g - 〈 i j 〉)

and

α_{\hat{k} k} (g) = α_{\hat{k} k} (g - 〈 i j 〉)

.

Proof.

The proof follows from Lemma A2. ☐

Theorem A1.

Suppose agents i, j, k, and l are such that

i \neq j

and

k \neq l

. Suppose

〈 i j 〉 \in g

. Then,

γ_{i} (g) < γ_{i} (g - 〈 i j 〉)

if and only if

\frac{\prod_{l \in g_{j}^{-}} (1 - α_{i k} (g))}{\prod_{l \in g_{j}^{-}} (1 - α_{i l} (g - 〈 i j 〉))} < \frac{\prod_{k \in g_{i}^{-}} (1 - α_{i k} (g - 〈 i j 〉))}{\prod_{k \in g_{i}^{-}} (1 - α_{i l} (g))}

.

In addition,

γ_{i} (g) < γ_{i} (g - 〈 i j 〉)

if and only if

\frac{\prod_{l \in g_{j}^{-}} (α_{i k} (g))}{\prod_{l \in g_{j}^{-}} (α_{i l} (g - 〈 i j 〉))} > \frac{\prod_{k \in g_{i}^{-}} (α_{i k} (g - 〈 i j 〉))}{\prod_{k \in g_{i}^{-}} (α_{i l} (g))}

.

Proof.

The proof follows from Lemmas A1–A3. ☐

Appendix C. Proof of Lemma 6

Proof of 2 follows from Proposition 1.

Proof of 1 is as follows.

As

k \in g_{i}^{+}

and

l \in g_{j}^{+}

,

d_{k l} (g) > d_{k l} (g + 〈 i j 〉)

, hence,

Φ_{k} (g) < Φ_{k} (g + 〈 i j 〉)

and

Φ_{l} (g) < Φ_{l} (g + 〈 i j 〉)

.

It is easy to see that network

g

in Figure A2a is the one where adding link

〈 i j 〉

leads to the maximum increment in l’s closeness and the minimum decrements in the distances between l and

k_{m}, (m = 1, 2 \dots, n - 4

, where n is the number of agents in

g

), the maximum and the minimum being across all network structures.

We consider two cases,

j \neq l

and

j = l

.

Suppose

j \neq l

. Consider the network

g

, as shown in Figure A2a.

From Equation (1),

Φ_{l} (g) = \frac{1}{d_{l j} (g)} + \frac{1}{d_{l x} (g)} + \frac{1}{d_{l i} (g)} + \sum_{\underset{d_{k l} (g) = 4}{k \in g,}} \frac{1}{d_{k l} (g)} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{n - 4}{4} = \frac{3 n + 10}{12}

.

Without loss of generality, let

k = k_{m}

for some

m \in \{1, 2, \dots, n - 4\}

. Then, from Equation (2),

α_{k l} (g) = \frac{(\frac{1}{4})}{Φ_{l} (g)} = \frac{3}{3 n + 10}

.

If agents i and j add a direct link in

g

, we have network

g + 〈 i j 〉

, as shown in Figure A2b. Then, from Equations (1) and (2), we have

Φ_{l} (g + 〈 i j 〉) = \frac{n + 2}{3}

and

α_{k l} (g + 〈 i j 〉) = \frac{1}{n + 2}

.

From the above, clearly,

α_{k l} (g) < α_{k l} (g + 〈 i j 〉)

.

Figure A2. Network structure

g

and

g + 〈 i j 〉

with n agents.

Now, suppose

l = j

.

From Equations (1) and (2) applied to Figure A2a,b, we have

Φ_{j} (g) = \frac{2 n + 7}{6}

,

α_{k j} (g) = \frac{2}{2 n + 7}

,

Φ_{j} (g + 〈 i j 〉) = \frac{n + 2}{2}

, and

α_{k j} (g + 〈 i j 〉) = \frac{1}{n + 2}

.

It is easy to see that

α_{k j} (g) < α_{k j} (g + 〈 i j 〉)

in this case as well. This completes the proof of 1. ☐
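The closed-form values in this proof can be cross-checked by direct computation. The sketch below is illustrative and assumes that the network of Figure A2a is the path l, j, x, i with the agents k_1, ..., k_{n−4} each linked to i; this structure is inferred from the distances used above, not read from the figure. It also assumes Equation (1) is the sum of inverse shortest-path distances and Equation (2) is α_kl = (1/d_kl)/Φ_l. The helper names are hypothetical.

```python
from collections import deque
from fractions import Fraction


def hop_distances(adj, src):
    """Breadth-first search: hop distance from src to every reachable agent."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness(adj, i):
    """Assumed Equation (1): sum of 1/d_ik over all other reachable agents k."""
    return sum(Fraction(1, d) for k, d in hop_distances(adj, i).items() if k != i)


def alpha(adj, k, l):
    """Assumed Equation (2): (1/d_kl) divided by the closeness of l."""
    return Fraction(1, hop_distances(adj, l)[k]) / closeness(adj, l)


def figure_a2(n, with_ij=False):
    """Assumed structure: path l-j-x-i plus n-4 agents k1..k(n-4) linked to i."""
    leaves = [f"k{m}" for m in range(1, n - 3)]
    links = [("l", "j"), ("j", "x"), ("x", "i")] + [("i", k) for k in leaves]
    if with_ij:
        links.append(("i", "j"))
    adj = {a: set() for a in ["l", "j", "x", "i"] + leaves}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def matches_closed_forms(n):
    """Compare computed values with the expressions stated in the proof."""
    g, h = figure_a2(n), figure_a2(n, with_ij=True)
    return all([
        closeness(g, "l") == Fraction(3 * n + 10, 12),
        alpha(g, "k1", "l") == Fraction(3, 3 * n + 10),
        closeness(h, "l") == Fraction(n + 2, 3),
        alpha(h, "k1", "l") == Fraction(1, n + 2),
        closeness(g, "j") == Fraction(2 * n + 7, 6),
        alpha(g, "k1", "j") == Fraction(2, 2 * n + 7),
        closeness(h, "j") == Fraction(n + 2, 2),
        alpha(h, "k1", "j") == Fraction(1, n + 2),
    ])


print(all(matches_closed_forms(n) for n in range(5, 30)))
```

Both inequalities then follow by comparing denominators: 3/(3n + 10) < 3/(3n + 6) = 1/(n + 2) and 2/(2n + 7) < 2/(2n + 4) = 1/(n + 2).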

Appendix D. Experimental Results

We conduct random experiments to answer the following question. Although agents form links and back up their data with adjacent agents, can any agent still lose its data? From Theorem 5, the null network is the unique pairwise stable network when the cost to add links is “high”, that is,

ς \geq β λ

, and the network of linked pairs of agents is the unique stable network otherwise. Therefore, as far as formation of networks is concerned, we always obtain one of these two networks, depending on the values of

ς, β

, and

λ

. In our experiment, we randomly generate networks of the second type, namely, pairwise links. We generate such networks on 30 agents and consider 150 random scenarios: 10 random networks, each with 5 different sets of randomly chosen agents whose storage disks fail, for each of 3 cases,

λ = 1 %, 2 %

, and

4 %

. Our assumption on the values of

λ

is based on data from Backblaze (see footnote 9) on hard drive failure rates. Interestingly, in none of the random cases we generated did agents on the two sides of a link fail at the same time.
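An experiment of this shape can be sketched as follows. This is not the authors' code: the text does not say how failing agents are sampled, so the sketch assumes that each disk fails independently with probability λ, and it counts a scenario as a data loss when both agents of some linked pair fail. The function names, the seed, and the failure model are assumptions.

```python
import random


def random_pairing(agents, rng):
    """Random network of linked pairs (one agent stays isolated if the count is odd)."""
    shuffled = agents[:]
    rng.shuffle(shuffled)
    return [(shuffled[k], shuffled[k + 1]) for k in range(0, len(shuffled) - 1, 2)]


def failed_agents(agents, lam, rng):
    """Assumed failure model: each disk fails independently with probability lam."""
    return {a for a in agents if rng.random() < lam}


def count_data_loss_scenarios(n_agents=30, n_networks=10, n_failure_sets=5,
                              lams=(0.01, 0.02, 0.04), seed=0):
    """Return (number of scenarios, number of scenarios in which some pair fails jointly)."""
    rng = random.Random(seed)
    agents = list(range(n_agents))
    scenarios = losses = 0
    for lam in lams:
        for _ in range(n_networks):
            pairs = random_pairing(agents, rng)
            for _ in range(n_failure_sets):
                failed = failed_agents(agents, lam, rng)
                scenarios += 1
                # Data is lost only if both ends of a backup link fail together.
                if any(a in failed and b in failed for a, b in pairs):
                    losses += 1
    return scenarios, losses


scenarios, losses = count_data_loss_scenarios()
print(scenarios, losses)
```

The number of loss scenarios depends on the seed and on the assumed failure model, so this sketch should not be read as a reproduction of the reported outcome. Under independent failures, the chance that a given pair fails jointly is λ², which is small for the values of λ considered.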

References

[1] Chard, K.; Bubendorfer, K.; Caton, S.; Rana, O.F. Social cloud computing: A vision for socially motivated resource sharing. IEEE Trans. Serv. Comput. 2012, 5, 551–563.
[2] Tran, N.; Chiang, F.; Li, J. Efficient cooperative backup with decentralized trust management. Trans. Storage 2012, 8, 8:1–8:25.
[3] Gracia-Tinedo, R.; Sánchez-Artigas, M.; García-López, P. F2Box: Cloudifying F2F storage systems with high availability correlation. In Proceedings of the 2012 IEEE Fifth International Conference on Cloud Computing (CLOUD), Honolulu, HI, USA, 24–29 June 2012; pp. 123–130.
[4] Moreno-Martínez, A.; Gracia-Tinedo, R.; Sánchez-Artigas, M.; Garcia-Lopez, P. FRIENDBOX: A cloudified F2F storage application. In Proceedings of the 2012 IEEE 12th International Conference on Peer-to-Peer Computing (P2P), Tarragona, Spain, 3–5 September 2012; pp. 75–76.
[5] Nguyen, T.D.; Li, J. BlockParty: Cooperative offsite backup among friends. In Proceedings of the 4th USENIX Symposium on Networked Systems Design & Implementation, Cambridge, MA, USA, 11–13 April 2007.
[6] Tran, N.; Li, J.; Subramanian, L.; Chow, S.S. Optimal Sybil-resilient node admission control. In Proceedings of the 2011 Proceedings IEEE INFOCOM, Shanghai, China, 10–15 April 2011; pp. 3218–3226.
[7] Zuo, X.; Iamnitchi, A. A survey of socially aware peer-to-peer systems. ACM Comput. Surv. 2016, 49, 9:1–9:28.
[8] Freeman, L.C. Centrality in social networks conceptual clarification. Soc. Netw. 1978, 1, 215–239.
[9] Borgatti, S.P.; Everett, M.G. A Graph-theoretic perspective on centrality. Soc. Netw. 2006, 28, 466–484.
[10] Boldi, P.; Vigna, S. Axioms for centrality. Internet Math. 2014, 10, 222–262.
[11] Bloch, F.; Jackson, M.O.; Tebaldi, P. Centrality measures in networks. arXiv 2017, arXiv:1608.05845.
[12] Skibski, O.; Sosnowska, J. Axioms for distance-based centralities. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; pp. 1218–1225.
[13] Rao, N.S.V.; Ma, C.Y.T.; He, F.; Yau, D.K.Y.; Zhuang, J. Cyber-physical correlation effects in defense games for large discrete infrastructures. Games 2018, 9, 52.
[14] Altman, E.; Kameda, H.; Hosokawa, Y. Nash equilibria in load balancing in distributed computer systems. Int. Game Theory Rev. 2002, 04, 91–100.
[15] Hausken, K. Information sharing among cyber hackers in successive attacks. Int. Game Theory Rev. 2017, 19, 1750010-1–1750010-33.
[16] Timmer, J.; Scheinhardt, W. Customer and cost sharing in a Jackson network. Int. Game Theory Rev. 2018, 20, 1850002-1–1850002-10.
[17] Pilling, R.; Chang, S.C.; Luh, P.B. Shapley value-based payment calculation for energy exchange between micro- and utility grids. Games 2017, 8, 45.
[18] Sanchez-Soriano, J. An overview on game theory applications to engineering. Int. Game Theory Rev. 2013, 15, 1340019-1–1340019-18.
[19] Dutta, B.; Jackson, M.O. On the formation of networks and groups. In Networks and Groups: Models of Strategic Formation (Studies in Economic Design), 1st ed.; Dutta, B., Jackson, M.O., Eds.; Springer: Berlin/Heidelberg, Germany, 2003; Volume VIII, pp. 1–15.
[20] Jackson, M.O. A survey of network formation models: Stability and efficiency. In Group Formation in Economics: Networks, Clubs, and Coalitions; Demange, G., Wooders, M., Eds.; Cambridge University Press: Cambridge, UK, 2005; pp. 11–57.
[21] Marini, M.A. Games of coalition and network formation: A survey. In Networks, Topology and Dynamics, 1st ed.; Naimzada, A.K., Stefani, S., Torriero, A., Eds.; Lecture Notes in Economics and Mathematical Systems; Springer: Berlin/Heidelberg, Germany, 2009; Volume 613, pp. 67–93.
[22] Borkotokey, S.; Gogoi, L.; Sarangi, S. A survey of player-based and link-based allocation rules for network games. Stud. Micro. 2014, 2, 5–26.
[23] Mane, P.C.; Ahuja, K.; Krishnamurthy, N. Stability, efficiency, and contentedness of social storage networks. Ann. Oper. Res. 2019, 1–32.
[24] Morrill, T. Network formation under negative degree-based externalities. Int. J. Game Theory 2011, 40, 367–385.
[25] Möhlmeier, P.; Rusinowska, A.; Tanimura, E. A degree-distance-based connections model with negative and positive externalities. J. Public Econ. Theory 2016, 18, 168–192.
[26] Jackson, M.O.; Wolinsky, A. A strategic model of social and economic networks. J. Econ. Theory 1996, 71, 44–74.
[27] Opsahl, T.; Agneessens, F.; Skvoretz, J. Node centrality in weighted networks: Generalizing degree and shortest paths. Soc. Netw. 2010, 32, 245–251.
[28] Marchiori, M.; Latora, V. Harmony in the small-world. Phys. A 2000, 285, 539–546.
[29] Lillibridge, M.; Elnikety, S.; Birrell, A.; Burrows, M.; Isard, M. A cooperative Internet backup scheme. In Proceedings of the Annual Conference on USENIX Annual Technical Conference, San Antonio, TX, USA, 9–14 June 2003; pp. 29–41.
[30] Landers, M.; Zhang, H.; Tan, K.L. PeerStore: Better performance by relaxing in peer-to-peer backup. In Proceedings of the Fourth International Conference on Peer-to-Peer Computing, Zurich, Switzerland, 27–27 August 2004; pp. 72–79.
[31] Cox, L.P.; Murray, C.D.; Noble, B.D. Pastiche: Making backup cheap and easy. SIGOPS Oper. Syst. Rev. 2002, 36, 285–298.
[32] Jackson, M.O. Social and Economic Networks, 2nd ed.; Princeton University Press: Princeton, NJ, USA, 2010; pp. 215–216.
[33] Sharma, R.; Datta, A.; Dell’Amico, M.; Michiardi, P. An empirical study of availability in friend-to-friend storage systems. In Proceedings of the 2011 IEEE International Conference on Peer-to-Peer Computing, Kyoto, Japan, 31 August–2 September 2011; pp. 348–351.
[34] Zuo, X.; Blackburn, J.; Kourtellis, N.; Skvoretz, J.; Iamnitchi, A. The power of indirect ties in friend-to-friend storage systems. In Proceedings of the 14-th IEEE International Conference on Peer-to-Peer Computing, London, UK, 8–12 September 2014; pp. 1–5.
[35] Bala, V.; Goyal, S. A noncooperative model of network formation. Econometrica 2000, 68, 1181–1229.
[36] Johnson, C.; Gilles, R. Spatial social networks. Rev. Econ. Des. 2000, 5, 273–299.
[37] Moscibroda, T.; Schmid, S.; Wattenhofer, R. Topological implications of selfish neighbor selection in unstructured peer-to-peer networks. Algorithmica 2011, 61, 419–446.
[38] Buechel, B. Network Formation with Closeness Incentives. In Networks, Topology and Dynamics: Theory and Applications to Economics and Social Systems, 1st ed.; Naimzada, A.K., Stefani, S., Torriero, A., Eds.; Lecture Notes in Economics and Mathematical Systems; Springer: Berlin/Heidelberg, Germany, 2009; Volume 613, pp. 95–109.

1.	http://www.buddybackup.com (accessed on 21 June 2019).
2.	https://support.crashplan.com (accessed on 21 June 2019).
3.	https://developers.facebook.com/docs/graph-api (accessed on 21 June 2019).
4.	We refer the reader to the literature [8,9,10,11,12] for details on centrality measures in networks.
5.	Game-theoretic techniques have been used quite successfully in computer science [13,14,15,16] and other engineering disciplines [17,18].
6.	The literature on network formation is vast. The surveys [19,20,21,22] explore many dimensions of this topic.
7.	We assume, $ς = \frac{ς_{i} + ς_{j}}{2}$ , that is, a pair of agents involved in a link share the cost $ς$ .
8.	For simplicity, we assume uniform data loss rate $λ$ .
9.	https://www.backblaze.com/blog/backblaze-hard-drive-stats-q1-2019/ (accessed on 04 September 2019).

Table 1. Notation summary.

$g$	social storage cloud.
$A$	set of agents (or vertices).
N	the number of elements in the set $A$ , which is the number of agents in $g$ .
$L$	set of links (or edges).
$〈 i j 〉$	link between agents i and j.
$ς$	cost incurred by each agent to maintain a link.
$λ$	probability that an agent loses its data.
$β$	worth (or value) that each agent has for its data.
$Φ_{i} (g)$	closeness of agent i in $g$ .
$α_{i j} (g)$	probability that agent i obtains storage space from agent j in $g$ .
$γ_{i} (g)$	probability that agent i obtains storage space from at least one agent in $g$ .
$η_{i} (g)$	neighborhood size of agent i in $g$ . Also denotes the set of neighbors of i.
$P_{a_{1} a_{n}} (g)$	a path from agent $a_{1}$ to $a_{n}$ in $g$ such that $〈 a_{1}, a_{2} 〉, 〈 a_{2}, a_{3} 〉, \dots, 〈 a_{n - 1}, a_{n} 〉 \in L$ .
$d_{i j} (g)$	the length of the shortest path connecting agents i and j in $g$ .
$g + 〈 i j 〉$	new link $〈 i j 〉$ is added to $g$ .
$g - 〈 i j 〉$	existing link $〈 i j 〉$ is deleted from $g$ .
$G (N)$	the set of all networks on N agents.
$u_{i} (g)$	utility of agent i in $g$ .

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

