Improving Traffic Load Distribution Fairness in Mobile Social Networks

Abstract: Mobile social networks suffer from an unbalanced traffic load distribution due to the heterogeneity in mobility of nodes (humans) in the network. A few nodes in these networks are highly mobile, and the proposed social-based routing algorithms are likely to choose these most "social" nodes as the best message relays. This can lead to inequitable traffic load distribution and resource utilization, such as faster battery drain and/or storage consumption at the most (socially) popular nodes. We propose a framework called Traffic Load Distribution Aware (TraLDA) to improve traffic load balancing across network nodes. We present a novel method for calculating node popularity which takes into account both the inherent popularity and the social-relations popularity of a node. The former is purely determined by the node's sociability level in the network, and in TraLDA is computed using the Kalman-prediction, which considers the periodicity of the node's behaviour. The latter exploits interactions with more popular neighbours (acquaintances) to boost the popularity of lower (social) level nodes. Using extensive simulations in the Opportunistic Network Environment (ONE) driven by real human mobility scenarios, we show that our proposed strategy enhances the traffic load distribution fairness of the classical, yet popular social-aware routing algorithms BubbleRap and SimBet without negatively impacting the overall delivery performance.


Introduction
As a particular case of mobile ad-hoc networks (MANETs), opportunistic mobile networks (OMNs) [1] are unique dynamic wireless mobile networks. Unlike MANETs, such networks do not require persistent connectivity, and end-to-end paths from sources to destinations are not assumed to exist at all times. A link between a pair of nodes is established whenever they come into contact. In OMNs, pairwise node contacts occur randomly in time, and the duration of each contact is also random. Thanks to the omnipresence of mobile devices nowadays, e.g., mobile phones and tablets, humans can exploit contact opportunities to exchange information by means of short-range radio connections. This leads to human-centric opportunistic mobile networks, also referred to as mobile social networks (MSNs) in [2,3]. These networks have mainly been introduced by combining social networks and mobile communication networks. MSNs take a human-centric approach to networking, closing the gap between networks and human behaviour. Moreover, studies in [4][5][6] revealed that social [...] as optimal carriers for message transfers [7][8][9]. This might result in a heavy traffic load on the (socially) popular nodes, quickly draining the nodes' constrained resources, such as power and storage, and this unbalanced traffic load eventually deteriorates the network's delivery performance [10]. In addition, poor traffic load balancing also results in an unfair delivery success rate among individuals: messages from popular individuals reach their destinations with a high probability, but individuals with few social connections experience a low delivery success rate [11]. This variance in the delivery rate deters nodes from participating in message forwarding. Ultimately, the unfairness of traffic load makes popular nodes easy targets of attacks [12].
Unbalanced traffic distribution across network nodes leading to traffic congestion in social networks has been extensively studied in several areas [13][14][15]. In [13], (data) traffic congestion during crowd disasters was thoroughly discussed. In that crowd management scenario, mobile devices carried by individuals are used to detect the crowd density and report it to the crowd managers. However, in crowded areas traffic can increase dramatically within a short period of time and, in turn, traffic congestion starts to occur, causing the crowd managers to fail to handle the crowd. In [14], traffic in social networks was investigated in various applications, ranging from vehicular traffic in urban environments to data traffic in the Internet of Things and human-machine networks. In these settings, local failures such as traffic congestion in some parts of a network might provoke a cascade of failures throughout the system. Machine learning approaches were therefore proposed to address such issues. In [15], pocket switched networks were proposed to transfer data between users' mobile devices. Such opportunistic networks exploit human mobility to enable a store-carry-forward mechanism to deliver messages from sources to destinations. In each contact, social-based routing algorithms [7][8][9] typically select popular nodes (individuals) as the best relays in the network, resulting in unbalanced traffic distribution across nodes and traffic congestion in the most central nodes.
Social-based routing algorithms are a class of utility-based routing algorithms. In such schemes, heuristic methods are used to determine the "quality" (utility) of a node as a relay. Each node i maintains $U_i(j)$, a utility function that denotes the likelihood of i delivering a message to j. The utility function can be based on different parameters, such as contact history, mobility model, or social relations. Spyropoulos et al. [16] categorized utility functions into two types: destination-dependent (DD) and destination-independent (DI).
In DD, node utility depends on the destination; i.e., node i is an optimal relay for one destination $d_1$, yet node j is the best one for another destination $d_2$, or $U_i(d_1) > U_j(d_1)$ but $U_i(d_2) < U_j(d_2)$ for $d_1 \neq d_2$. DD functions could be based on last contact, social similarity, or correlated mobility pattern with the given destination. However, DD imposes a large overhead on nodes, since each node must keep an entry for every peer in the network. As opposed to DD, node utility in DI is independent of any destination; for example, a single node may be the best carrier for most/all destinations in the network, or in general, if $U_i(d_1) > U_j(d_1)$ then $U_i(d) > U_j(d)$ for most/all j, d. Instances of nodes which are better relays for all destinations would be those with many connections to others (e.g., hub nodes in scale-free networks), nodes with many acquaintances (e.g., popular nodes in social networks), or nodes with high mobility (e.g., cars or buses in vehicular delay-tolerant networks). Nevertheless, DI imposes a higher forwarding overhead on better relays, leading to poorer fairness in both traffic load distribution and utilization of the nodes' resources.
This paper proposes a framework called Traffic Load Distribution Aware (hereafter, TraLDA), aiming to improve fairness in forwarding of social-based routing algorithms.
Here, we introduce a novel computation of node (global) popularity in the entire network.
This utility metric is obviously independent of the message destination, and it may contribute to a traffic load imbalance across nodes, as mentioned in [16]. In TraLDA, we consider two different popularities in the calculation of node popularity, namely inherent popularity and social-relations popularity. Inherent popularity is based solely on the node's sociability level, and in TraLDA is computed using the Kalman-prediction [17], which considers the periodicity in human behaviour. The works in [18,19] confirmed that human activities typically exhibit some periodicity. Consequently, the calculation of node popularity in mobile social networks should consider this property. Social-relations popularity, on the other hand, reflects the social benefit of connections with popular nodes, and spreads the popularity of these nodes to their lower ranking acquaintances. Finally, we apply TraLDA's node popularity computation to the classical, yet prominent social-based routing algorithms SimBet [20] and BubbleRap [21], and then investigate the performance improvements of these routing schemes, particularly in the trade-off between forwarding fairness and efficiency. SimBet and BubbleRap basically combine two different utility metrics to decide a node's fitness as a relay to a given destination: one that depends on the destination (i.e., similarity and social community in SimBet and BubbleRap, respectively), and another that is independent of the destination (i.e., be[...]
We proceed in this paper as follows. Related literature is given in Section 2, research background is described in Section 3, the detailed design strategy of TraLDA is discussed in Section 4, simulation and discussion are presented in Section 5, and lastly conclusions and future work are given in Section 6.

Related literature
Fairness is important in many areas of human life, e.g., sociology, economics and politics, and the same holds for technology. In computer engineering, distinct computer resources should be shared equally amongst all processes and threads. In computer networking, all nodes should obtain bandwidth and quality of service (QoS) equitably.
In [23], fairness challenges and issues in wireless networks are thoroughly discussed, and some trade-offs between fairness and performance are reviewed. Mtibaa and Harras [10] studied the trade-offs between fairness and efficiency of social-based routing algorithms in mobile social networks. They found that excluding popular nodes from message forwarding significantly degrades the delivery efficiency. In [24], we also showed that absolute traffic load fairness degrades delivery efficiency, whereas high delivery efficiency results in an unfair traffic load.
To overcome this problem, fair routing algorithms have been proposed for mobile social networks [11,25,26,27]. Fan et al. [11] introduced a fair routing strategy based on packet priority to improve fairness in success rate among nodes. Ying et al. [25] proposed FSMF, a fair social-aware message forwarding scheme, to solve the issues of imbalanced traffic load distribution as well as unfair delivery rate. Pujol et al. [26] proposed FairRoute, which combines social strength and buffer queue length as the routing metrics to fairly distribute the traffic load among nodes. Milena and Grundy [27] presented CafRep, an adaptive congestion-aware forwarding strategy that diverts traffic from congested nodes (popular nodes) to less congested nodes (unpopular nodes).
Indeed, fair routing algorithms in distributed, intermittently connected wireless networks like mobile social networks are more complex than those in conventional networks, such as the Internet, since: (i) negotiation and compromise amongst autonomous nodes is more complicated, for example non-cooperative nodes may be reluctant to help other nodes in forwarding; and (ii) due to the lack of knowledge about the global state, routing decisions are made solely based on nodes' local information. For the first issue, the impact of selfish nodes on delivery performance and resource consumption fairness has been investigated in [28]. In addition, to increase fairness in forwarding, an incentive or a credit was applied to the routing decisions in [25].
Finally, in [29] a game theoretic approach is used to support fair cooperation among nodes in opportunistic networks. For the second issue, current fair routing schemes search for suitable locally available node information to ensure a better trade-off between fairness and efficiency. Two sorts of node local knowledge are commonly used to improve traffic fairness and reduce congestion: (i) buffer statistics and (ii) social measures. For the former, some algorithms consider node burden, inferred from the node's buffer queue length, as the forwarding metric to achieve a balanced traffic load distribution; for example, FOG [10] and GreBurD [30] prioritize nodes with higher residual buffer space as suitable relays to distribute load away from congested nodes. CafRep [27] defines node retentiveness, calculated as an expected weighted moving average of the node's remaining storage, as the congestion heuristic to detect storage congestion in popular nodes. For the latter, on the other hand, researchers search for better social network measures for improving fairness in forwarding of social-based routing schemes. For example, FairRoute [26] improves the calculation of pairwise tie strength based on short-term and long-term relationships; SimBet [20] adds connection strength information to the routing metrics to offload traffic from popular nodes; Socially-Aware Prediction (SAP) [31] estimates future contacts based on node (social) similarity, and forwards messages to nodes with a higher similarity with the destinations, thus reducing messages forwarded to globally popular nodes.
As opposed to [20][26][31], which focus on improving the calculation of destination-dependent (DD) utility metrics, our proposed scheme TraLDA improves the computation of node popularity in the network, since, as noted in [16], this destination-independent (DI) utility metric primarily contributes to the traffic imbalance among nodes in mobile social networks. In social network analysis, Freeman [32] [...] Nevertheless, Freeman's centrality measures typically disregard the influence of the neighbours. The authors of [33] argued that a node's importance in the social network should also be determined by the importance of its neighbours. In [34], the authors studied a strategy to find persons that are able to spread advertisements as far as possible in a social network. They showed that the advertisements of a person who is highly respected by her friends are very likely to spread quickly over the social network. In addition, Ursino and Virgili [35] integrated the concepts of social networks and IoT to determine the reputation of IoT objects. They proposed a formula to calculate the reputation of an object in a social Internet of Things based on the well-known Google PageRank. In that technique, the reputation of an object is determined by the level of trust it obtains from other IoT objects. Similarly, Cauteruccio et al. [36] attempted to introduce concepts and behaviours of social networks into IoT settings. In that work, to measure the reputation of an IoT object, the authors defined the Impact Degree, calculated as the average trust degree that the object receives from the other objects in its scope (neighbourhood). Meanwhile, social network theory offers centrality measures that consider a richer range of direct and indirect influence of neighbours, such as Katz's prestige measure [37]. This centrality metric is developed based on the premise that a node's importance in the network is influenced by its neighbours' importance. Thus, this prestige measure considers a node's connectedness to other nodes as well as its proximity to other important nodes. In this regard, node popularity calculation in TraLDA should take into account the influence of more popular neighbours when determining the popularity of a node. Therefore, as our second contribution in this paper, we propose a method to calculate node social-relations popularity based on Katz's prestige measure [37]. We modify the calculation of this centrality metric to make it appropriate for distributed, ad hoc environments, such as mobile social networks.

Research Background
In this section, we discuss the topology structure of mobile social networks and the forwarding strategy of social-based routing algorithms. We initially consider an opportunistic network with N nodes as a graph G(V, E), where V and E are the sets of nodes and links, respectively. In this graph, a link between two nodes represents the physical contact between them, and the link weight is defined as the probability of their pairwise contact. We assume the graph G is connected, that is, at least one path exists between any pair of nodes. Further, message dissemination in the graph G under utility-based routing is formulated as a discrete-time Markov chain. Suppose that a message m is transferred hop-by-hop in this graph. The message m is in state i if it is carried by node i; when a contact occurs between nodes i and j and i transfers the message m to j, the state of m changes from i to j. Therefore, the forwarding procedure of a message in an opportunistic network can be modelled as a state transition process in a discrete-time Markov chain. Next, we develop a transition probability matrix P, where $p_{ij}$ denotes the probability that the message m is transferred from node i to j, and is expressed as follows

$$p_{ij} = p^{c}_{ij}\, p^{f}_{ij}, \qquad (1)$$

where $p^{c}_{ij}$ is the probability of an encounter between i and j, and $p^{f}_{ij}$ is the likelihood that i transfers the message m to j during the contact. Node contact probability in mobile social networks is directly related to the human mobility pattern, and in some papers, such as [5,38], it was characterized based on the structural properties of node contacts.
Yet, forwarding probability depends entirely on the forwarding rules used in message routing. In the following, we analyze the topology characteristics of mobile social networks as well as the forwarding features of social-based routing schemes, and discuss how their combination may result in unfairness in forwarding among network nodes.
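As a minimal sketch of the Markov-chain formulation above, the snippet below builds a transition matrix as the element-wise product of a contact-probability matrix and a forwarding-probability matrix. The function name `transition_matrix` and the three-node values are illustrative assumptions, not data from the paper.

```python
def transition_matrix(contact_prob, forward_prob):
    """Element-wise product: p[i][j] = contact_prob[i][j] * forward_prob[i][j].

    Both inputs are square lists of lists; the diagonal is forced to zero
    because a node does not forward a message to itself.
    """
    n = len(contact_prob)
    return [
        [contact_prob[i][j] * forward_prob[i][j] if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical 3-node example: symmetric contact probabilities and a
# forwarding rule under which node 0 hands over to 1, and 1 hands over to 2.
contact = [
    [0.0, 0.5, 0.2],
    [0.5, 0.0, 0.4],
    [0.2, 0.4, 0.0],
]
forward = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]
P = transition_matrix(contact, forward)
```

In this toy setting the message can only move 0 → 1 → 2, so node 2 absorbs all traffic, which is the kind of concentration the following sections analyze.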

Topology structures of mobile social networks
When analyzing the delivery performance of a routing algorithm, information of net[...] Additionally, we conduct an online analysis in the ONE simulator [22] to investigate the node popularity distribution in real mobile social networks. A node in a self-organizing network such as a mobile social network should be able to sense its own popularity throughout the network. Here, node popularity is defined as the number of different nodes contacted in a certain time window. In an aggregated contact graph, this corresponds to node degree (centrality) [21]. In this study, we consider the Reality contact traces [39] as the mobility scenario. In Fig. 2 (left) we show the node degree distribution in Reality, where the degree of a node is computed on a 6-hour time window basis.
It is evident that some nodes have a degree value that is significantly larger than the network's average degree (i.e., ≈2.2). Furthermore, in Fig. 2 (right) we show the node degree distribution in the Reality scenario on a log-log scale. The graph shows that the node degree distribution follows a power law: most network nodes have a low degree, and the probability of finding a node with a high degree is low. Moreover, the authors of [40] established the potential coupling between mobile social networks and scale-free networks, which have a power-law degree distribution as their main characteristic. In other words, the degree distribution in real human-based networks differs from the Gaussian (normal) degree distribution commonly assumed in random networks.
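The windowed degree described above (number of distinct nodes contacted per time window) can be sketched as follows. The contact record layout `(timestamp_seconds, node_a, node_b)` and the function name are assumptions made for illustration; the real Reality trace format and the ONE simulator code may differ.

```python
from collections import defaultdict

WINDOW_SECONDS = 6 * 3600  # 6-hour window, as used for the Reality trace


def windowed_degree(contacts, window=WINDOW_SECONDS):
    """Return {node: {window_index: number of distinct peers met}}.

    contacts is an iterable of (timestamp_seconds, node_a, node_b) tuples.
    Repeated contacts with the same peer inside one window count once.
    """
    peers = defaultdict(lambda: defaultdict(set))
    for timestamp, a, b in contacts:
        w = int(timestamp // window)
        peers[a][w].add(b)
        peers[b][w].add(a)
    return {
        node: {w: len(met) for w, met in by_window.items()}
        for node, by_window in peers.items()
    }


# Hypothetical contact log.
contacts = [
    (100, "A", "B"),
    (200, "A", "C"),
    (300, "A", "B"),    # repeated contact, counted once
    (22000, "A", "D"),  # falls into the second 6-hour window
    (22500, "B", "C"),
]
degree = windowed_degree(contacts)
```

Here node A has degree 2 in the first window and 1 in the second, which already hints at how strongly the per-window value can fluctuate.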

Social-based routing algorithms
Social-based routing schemes typically define a utility metric for each node when making routing choices. Clearly, a higher utility reflects a higher chance that the node delivers a message. In each contact, the method forwards the message to the contacted node if that node has a higher utility. This best next-hop heuristic forwarding $p^{f}_{ij}$ can be described as

$$p^{f}_{ij} = \begin{cases} 1, & U_j > U_i \\ 0, & \text{otherwise} \end{cases} \qquad (2)$$

where $U_i$ denotes the utility of node i. Nevertheless, the utility-based routing algorithms in mobile social networks have the following drawbacks. First, hill-climbing heuristic forwarding is a pure greedy approach that sends the message to the nodes with the highest utility at each contact (hop). Fan et al. [11] used a Markov model to show that under this forwarding technique, the probability of a message reaching the highest utility node(s) is one, implying that messages will always find the highest utility nodes in mobile social networks. Furthermore, we show mathematically that the forwarding heuristic, which is biased towards higher value nodes, guides the routing algorithm to send the bulk of network traffic through the highest utility node(s). We first assume a routing strategy that determines the next-hop nodes in a random manner. The message forwarding is therefore a random walk over the graph G(V, E) mentioned above, with the transition probability matrix P, whose element $p_{ij}$ is defined in (1). Under this random forwarding, $p^{f}_{ij}$ is equal to the inverse of node i's degree $k_i$, or $p^{f}_{ij} = 1/k_i$. In a steady state traffic flow, the chance of finding a message m in node j, which also equals j's traffic load, can be computed as the first eigenvector of the distribution matrix $\Pi$, with $\pi_{ij} = p_{ij} \left(\sum_{l} p_{il}\right)^{-1}$. It is then easy to see that, under this random scheme, the eigenvector for distribution matrices of networks with a non-random (heterogeneous) connectivity distribution like mobile social networks will be skewed towards the highly connected nodes (hub nodes).
This confirms the natural traffic load imbalance in mobile social networks. Further, if the forwarding strategy is not random, but biased towards connectivity (i.e., favouring nodes with a higher degree), the probability of hub nodes receiving relay traffic increases and the traffic load distribution becomes more unbalanced. Furthermore, using simulation in the Reality mobility scenario [39], we illustrate in Fig. 3 (left) the node degree vs. node traffic load when the hill-climbing heuristic forwarding is applied to the network (here, node traffic load is defined as the total number of relay messages carried by a node). The graph shows that a few of the highest degree nodes handle a large portion of the traffic, whereas most network nodes process only a small one. This quickly depletes the hub nodes' constrained resources, such as power and storage. For instance, we show in Fig. 3 (right) the buffer occupancy changes of an illustrative hub node and non-hub node in Reality. Clearly, the buffer occupancy of the hub node is regularly saturated, whereas the buffer queue of the non-hub node is normally low during the experiment.


Second, node utility in mobile social networks can change over time, and a node with low utility at present could become a good relay in the future. Most conventional utility-based forwarding algorithms, however, ignore this.
Furthermore, the studies in [18,19] showed that node degree popularity in human-based networks, such as MSNs, varies over time and has a periodic pattern. TraLDA therefore takes these features into account when calculating node popularity.
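The random-walk argument in the first drawback can be checked numerically with a small sketch. It assumes an unweighted, undirected contact graph given as adjacency lists (a simplification of the weighted graph used in this section) and iterates the random forwarding rule until the load settles. For such a graph the steady-state load of node j is known to be $k_j / \sum_l k_l$, so the hub ends up carrying the largest share even though forwarding is unbiased. The graph and the function name are hypothetical.

```python
def random_walk_load(adjacency, iterations=200):
    """Steady-state message load under uniform random next-hop forwarding.

    adjacency[i] lists the neighbours of node i in an undirected,
    connected, non-bipartite graph. Each step, node i passes its load
    to each neighbour with probability 1 / degree(i).
    """
    n = len(adjacency)
    load = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [0.0] * n
        for i, neighbours in enumerate(adjacency):
            share = load[i] / len(neighbours)
            for j in neighbours:
                nxt[j] += share
        load = nxt
    return load


# Hypothetical 5-node graph: node 0 is a hub linked to everyone,
# plus one extra link between nodes 1 and 2.
adjacency = [[1, 2, 3, 4], [0, 2], [0, 1], [0], [0]]
load = random_walk_load(adjacency)
degrees = [len(neighbours) for neighbours in adjacency]
expected = [k / sum(degrees) for k in degrees]  # k_j / sum of degrees
```

The hub (degree 4 out of a total of 10) receives 40% of the load, four times that of a leaf node. A forwarding rule biased towards high-degree nodes would only widen this gap.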

TraLDA Design
In TraLDA, we improve the computation of node (global) popularity in mobile social networks. To determine a node's popularity, two popularity metrics are calculated: inherent popularity and social-relations popularity. We hypothesize that inherent popularity is purely determined by the node's own mobility pattern or sociability level, whereas social-relations popularity is derived as an advantage from relationships with more popular nodes. Finally, TraLDA uses both popularity indicators to choose optimal relays during contacts. In the following, the computations of both measures are described in detail.

Inherent popularity calculation
The inherent popularity of a node is determined by its own sociability degree or network movement pattern. In practice, this metric is defined based on a particular measure, such as the total number of contacts with different nodes in a time interval [21] or the neighbour change rate [38,41]. In the literature, the former is denoted as the node degree in an aggregated encounter graph. In TraLDA, we use node degree to quantify a node's inherent popularity. Moreover, our investigation below shows that node degree in realistic mobile social networks fluctuates significantly over time and exhibits some periodicity. Thus, it is important to take these features into account when calculating node degree at a given time. Finally, we introduce a novel calculation of node degree using the Kalman-prediction [17], which considers the periodicity of human behaviour.
We begin by looking into the node degree change characteristics in mobile social networks using real-world human movement cases. The Reality trace dataset [39] is used in this study because it consists of a large number of nodes and spans a lengthy period of time. An instantaneous node degree is estimated by the number of distinct nodes contacted in a given time window. In the case of Reality, we chose a time window of 6 hours as the basis for node degree calculation, based on a study in [21] that found that individuals' daily life is typically separated into four main periods of 6 hours each: morning, afternoon, evening, and night (for a detailed discussion of the impact of the time window scale on the node degree calculation, see [42]).
We now depict changes in node degree in the Reality scenario; for instance, in Fig. 4 (left), we present the node degree variations of an illustrative hub node in Reality. We notice that the node's degree changes dramatically and rapidly over time. Subsequently, we use a periodogram analysis [43] to find the main periods (frequencies) within the node's degree data series. We display the discovered periodicities of the hub node's degree in Fig. 4 (right). The figure clearly shows that the degree of the hub node has a 7-day (weekly) period (moreover, our investigation of all the nodes in Reality finds that the majority of the nodes also have a weekly cycle in their popularity). Indeed, the Reality dataset logged MIT staff and student activities on campus, which are higher during weekdays but lower on weekends due to fewer interactions.
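To illustrate how a periodogram exposes a weekly cycle, the sketch below computes a plain DFT-based periodogram and reports the period with the highest power. It is a generic textbook construction, not the exact procedure of [43]; the synthetic series (four 6-hour samples per day, exact 28-sample cycle) and both function names are assumptions made for the example.

```python
import cmath
import math


def periodogram(series):
    """Return [(k, power)] for Fourier frequencies k / n, k = 1 .. n // 2."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # drop the zero-frequency term
    result = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        result.append((k, abs(coeff) ** 2 / n))
    return result


def dominant_period(series):
    """Period (in samples) of the frequency with the largest power."""
    k, _ = max(periodogram(series), key=lambda item: item[1])
    return len(series) / k


# Synthetic degree series: 8 weeks of 6-hour samples (28 samples per week).
SAMPLES_PER_DAY = 4
series = [5 + 3 * math.cos(2 * math.pi * t / 28) for t in range(224)]
period_samples = dominant_period(series)
period_days = period_samples / SAMPLES_PER_DAY
```

On real traces the peak is less clean than in this noise-free example, so a dominant period should be read together with the neighbouring frequencies.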
Nevertheless, depending on the experimental setting, distinct human encounter datasets may have different periodicities.
The structural components of the node degree data series in Fig. 4 (left) are then examined using a discrete time series analysis. A discrete time series is a set of observations $y_t$ logged regularly at a specific time interval. In the traditional decomposition model [44], $y_t$ can be broken down into a trend component, a seasonal (periodic) component with period d, and a random noise component. We apply a seasonal filter [45] to the given data series to obtain the estimated periodic data: in Fig. 5 (upper), we present the long-term seasonal data of the data series. Finally, the deseasonalized data (shown in Fig. 5 (lower)) is obtained by removing this periodic data from the original data; it consists of a random noise element and a less obvious trend element.
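A simplified sketch of the seasonal filtering step is given below. It assumes the trend is negligible, estimates the seasonal component by averaging all observations that share the same phase within the period d, and centres the result so that it sums to zero over one period. The seasonal filter of [45] may differ (for example by removing a moving-average trend first); the function names and the toy series are hypothetical.

```python
def seasonal_component(series, d):
    """Estimate a zero-sum seasonal pattern of period d by phase averaging."""
    phase_means = []
    for phase in range(d):
        values = series[phase::d]
        phase_means.append(sum(values) / len(values))
    centre = sum(phase_means) / d
    return [m - centre for m in phase_means]


def deseasonalize(series, d):
    """Subtract the estimated seasonal pattern from every observation."""
    seasonal = seasonal_component(series, d)
    return [x - seasonal[t % d] for t, x in enumerate(series)]


# Toy series: constant level 10 plus a repeating pattern of period 4.
pattern = [3.0, -1.0, -2.0, 0.0]
series = [10.0 + pattern[t % 4] for t in range(20)]
seasonal = seasonal_component(series, 4)
residual = deseasonalize(series, 4)
```

For this noise-free input the estimated pattern equals the injected one and the residual is the constant level, which is the behaviour expected of the deseasonalized series.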
Based on the previous analysis, we now use the Kalman-filter theory [17] to develop an estimation model of the time series data to compute a node's inherent popularity in a time interval.The Kalman filter is widely used in control system design to estimate unmeasured process conditions.It can calculate the best estimates of the current states of a dynamic system defined in a state vector.The state is updated based on periodic observations of the system.We use a typical state space model [46] to express the problem in our model.Furthermore, we only investigate the case when a seasonal component dominates the time series (see [47] for the discussion of Kalman-prediction for a complete model).
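The state space model consists of two equations, namely the observation equation and the state equation. For our model with (only) a seasonal component, the observation equation is given as follows

$$y_t = s_t + w_t \qquad (3)$$

where $s_t$ is a state variable and $w_t$ is an additive white noise with zero mean and variance $\sigma_w^2$ ($w_t \sim \mathrm{WN}(0, \sigma_w^2)$). Furthermore, when $s_t$ represents a seasonal component with a period $d$ such that $s_{t+d} = s_t$ and $\sum_{t=1}^{d} s_t = 0$, it is possible to determine $s_{t+1}$ as

$$s_{t+1} = -s_t - s_{t-1} - \dots - s_{t-d+2} \qquad (4)$$

For a more general expression of $s_t$ allowing random deviations to exist in the periodicity, a white noise term $v_t$ ($v_t \sim \mathrm{WN}(0, \sigma_v^2)$) is added to the right-hand side of (4). Afterwards, regarding only the seasonal effect on the series, in order to obtain the state equation for our model, we introduce the $(d-1)$-dimensional state vector $\mathbf{X}_t$ defined as

$$\mathbf{X}_t = [\, s_t \;\; s_{t-1} \;\; \dots \;\; s_{t-d+2} \,]^T \qquad (5)$$

and the series $s_t$ is determined as

$$s_t = [\, 1 \;\; 0 \;\; 0 \;\; \dots \;\; 0 \,]\, \mathbf{X}_t \qquad (6)$$

For the purpose of the derivation of Kalman-prediction, the observation equation in (3) is now rewritten in a general form as follows

$$y_t = G\, \mathbf{X}_t + w_t \qquad (7)$$

with $G = [\, 1 \;\; 0 \;\; 0 \;\; 0 \;\; \dots \;\; 0 \,]$, and $\mathbf{X}_t$ satisfies the state equation

$$\mathbf{X}_{t+1} = F\, \mathbf{X}_t + \mathbf{V}_t \qquad (8)$$

with $\mathbf{V}_t = [\, v_t \;\; 0 \;\; 0 \;\; \dots \;\; 0 \,]^T$, and

$$F = \begin{bmatrix} -1 & -1 & \cdots & -1 & -1 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix}$$

Given the observation equation in (7) and the state equation in (8), the recursive equations of the Kalman filter for the estimation of the values of the series are defined as follows. Considering the initial settings as

$$\hat{\mathbf{X}}_1 = E[\mathbf{X}_1], \qquad \Omega_1 = E\big[(\mathbf{X}_1 - \hat{\mathbf{X}}_1)(\mathbf{X}_1 - \hat{\mathbf{X}}_1)^T\big]$$

the Kalman recursive equations are then given as

$$\hat{\mathbf{X}}_{t+1} = F\, \hat{\mathbf{X}}_t + \Theta_t \Delta_t^{-1} \big(y_t - G\, \hat{\mathbf{X}}_t\big)$$

$$\Omega_{t+1} = F\, \Omega_t F^T + Q - \Theta_t \Delta_t^{-1} \Theta_t^T$$

where $\Delta_t = G\, \Omega_t G^T + R$, $\Theta_t = F\, \Omega_t G^T$, $Q = E[\mathbf{V}_t \mathbf{V}_t^T]$, and $R = \sigma_w^2$.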

As an example of node popularity estimation using the Kalman-prediction, we show in Fig. 7 the Kalman estimates of the hub node's degree compared with the actual values of the node degree in the Reality mobility scenario (for the detailed implementation of the Kalman-prediction for the node degree calculation in the ONE simulator, please refer to https://github.com/soelistijanto/TraLDA/routing/community/KalmanDegree.java).
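To make the recursion concrete, the following is a minimal pure-Python sketch of one-step Kalman prediction for a seasonal-only state space model with period d, in the standard textbook form. It is not the KalmanDegree.java implementation referenced above. The function names, the zero initial state, the diffuse initial covariance `init_var`, and the noise variances `var_w` and `var_v` are all illustrative assumptions.

```python
def seasonal_transition(d):
    """Return the (d-1)x(d-1) matrix F: first row of -1s, ones on the sub-diagonal."""
    n = d - 1
    F = [[0.0] * n for _ in range(n)]
    F[0] = [-1.0] * n
    for i in range(1, n):
        F[i][i - 1] = 1.0
    return F


def kalman_seasonal_predict(y, d, var_w, var_v, init_var=1e3):
    """One-step-ahead predictions of y for the seasonal-only model.

    Observation: y_t = G X_t + w_t with G = [1 0 ... 0].
    State:       X_{t+1} = F X_t + V_t with V_t = [v_t 0 ... 0]^T.
    """
    n = d - 1
    F = seasonal_transition(d)
    x = [0.0] * n                                   # predicted state (assumed start)
    omega = [[init_var if i == j else 0.0 for j in range(n)] for i in range(n)]
    predictions = []
    for obs in y:
        predictions.append(x[0])                    # G x: prediction of y_t
        delta = omega[0][0] + var_w                 # Delta = G Omega G^T + R
        # Theta = F Omega G^T, i.e. F times the first column of Omega.
        theta = [sum(F[i][k] * omega[k][0] for k in range(n)) for i in range(n)]
        innovation = obs - x[0]
        fx = [sum(F[i][k] * x[k] for k in range(n)) for i in range(n)]
        x = [fx[i] + theta[i] * innovation / delta for i in range(n)]
        # Omega <- F Omega F^T + Q - Theta Theta^T / Delta, Q = diag(var_v, 0, ...).
        f_om = [[sum(F[i][k] * omega[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]
        f_om_ft = [[sum(f_om[i][k] * F[j][k] for k in range(n)) for j in range(n)]
                   for i in range(n)]
        omega = [[f_om_ft[i][j]
                  + (var_v if i == 0 and j == 0 else 0.0)
                  - theta[i] * theta[j] / delta
                  for j in range(n)] for i in range(n)]
    return predictions


# Illustrative zero-mean series with period 4.
series = [3.0, -1.0, -2.0, 0.0] * 5
print(kalman_seasonal_predict(series, d=4, var_w=0.01, var_v=0.01))
```

Because the model has no trend or level term, the input is assumed to be zero-mean over a period; a raw degree series would first need its mean (or trend) removed.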

Social-relations popularity calculation
An individual can gain (social) benefits from relationships with his/her more central or popular acquaintances in a social network. Depending on the substance of the relations, measures of node centrality can be classified as undirected (symmetric) relations, such as friendship and kinship, or directed (asymmetric) relations, such as choice and influence.
Moreover, in directed graphs centrality is known as "prestige" [48], where the direction of the interaction is a key attribute for this metric. For instance, individuals who are picked as friends by many others have a special status (prestige) in the group. In the literature, there exist metrics of prestige which consider both direct and indirect social influences.
For instance, the centrality measures in [37,49] are based on the assumption that the importance of a node in the network is determined by the importance of its neighbours.
Thus, these metrics take into account both a node's connectivity to other nodes and its proximity to other important nodes.
We now mention one of the widely used centrality measures, Katz's prestige measure [37]. This defines the prestige of node i in the graph G, denoted by $P_i(G)$, as the sum of the prestige of all of i's neighbours divided by their degrees. Node i therefore gains its prestige from having a neighbour j with higher prestige. This prestige of i is, however, corrected by the number of neighbours of j, so if j has more relations, then i gains less prestige from its friendship with j. This adjustment might be thought of as correcting for i's time spent with, or relative access to, j. As a result, node i's Katz centrality in the graph G is determined as follows:

$$P_i(G) = \sum_{j \neq i} a_{ij}\, \frac{P_j(G)}{d_j} \qquad (13)$$

where $a_{ij} = 1$ if there is a relation between i and j, or 0 otherwise, and $d_j$ is the degree of j, representing the number of j's neighbours.
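Because the prestige of each node depends on the prestige of its neighbours, the definition is a fixed-point equation. A minimal sketch of one way to solve it, assuming an undirected, connected, non-bipartite graph and a simple normalised power iteration (the function name and the iteration count are hypothetical choices):

```python
def katz_prestige(neighbours, iterations=200):
    """Iterate P_i = sum over neighbours j of P_j / d_j, normalised to sum to 1.

    `neighbours` maps each node to the list of its neighbours (undirected graph).
    """
    nodes = list(neighbours)
    prestige = {i: 1.0 / len(nodes) for i in nodes}
    for _ in range(iterations):
        updated = {
            i: sum(prestige[j] / len(neighbours[j]) for j in neighbours[i])
            for i in nodes
        }
        total = sum(updated.values())
        prestige = {i: value / total for i, value in updated.items()}
    return prestige


# Illustrative graph: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(katz_prestige(graph))
```

For an undirected graph the fixed point is proportional to node degree, so node 2 ends up with the highest prestige and node 3 with the lowest in this example.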
Inspired by Katz's centrality measure, we introduce social-relations popularity, the node's popularity derived from relationships with more popular nodes. This distributes the popularity of more (socially) important nodes to their less important neighbours, and thus takes neighbours' popularity into account when calculating a node's popularity.
We employ (13) to compute a node's social-relations popularity in a given time window as follows. To begin, we suppose that social influence occurs in only one direction: nodes with lower popularity can only receive social benefits from their more popular neighbours. For instance, from (13) we can deduce that an influence from j towards i, denoted by $\vec{e}_{ji} = 1$, exists when the prestige of j is higher than that of i, and $\vec{e}_{ji} = 0$ otherwise; therefore $\vec{e}_{ji} \neq \vec{e}_{ij}$. Second, we assume that the popularity of a more important node is shared by its less important neighbours and is weighted by the strength of their interactions with the given node. As a result, a higher (social) level node has more effect on its closer neighbours. Finally, the social-relations popularity of node i in time window t is defined in terms of $\vec{e}_{ji}$, the presence of a (social) influence of j towards i, and the instantaneous global popularity of j in time window t, which is computed using (16).
To give an example of the calculation of node social-relations popularity, we consider a simple neighbourhood of node A in Fig. 6. The resulting node popularity is used in TraLDA to select optimal relays during node contacts.
We now discuss how TraLDA is implemented in a distributed environment. In self-organizing networks like mobile social networks, a node should be able to perceive its immediate neighbours autonomously. In TraLDA, we use the terminology "familiar set" in [50] to refer to a node's group of friends (direct neighbours) (hereafter called a friendship set F). Every node stores a map of the contacted nodes together with their total encounter times. When the pairwise total contact time surpasses a given friendship threshold, the contacted node is added to the given node's friendship set. This implies that the two nodes now have a link, and in turn, we apply a direction and a weight to this connection to indicate the direction of social impact and the strength of the tie between them, respectively. Finally, in Algorithm 1 we describe how to calculate node popularity in mobile social networks using the TraLDA distributed algorithm. When a contact occurs in time window t and the contacted node is in the current node's friendship set, the two nodes exchange two items of data to compute their social-relations popularities: the mean of global popularity in time window t − 1, and the total strength of connections to the less popular neighbours. The latter is computed for a node j as $\sum_{k \in F(j)} w_{jk}\, \vec{e}_{jk}$, where k ranges over the direct neighbours of j, $w_{jk}$ is the connection strength of j and k, and $\vec{e}_{jk}$ indicates the existence of an influence of j towards k. The current node modifies its social-relations popularity and then recalculates both its instantaneous global popularity and cumulative average global popularity based on this peer's data. When the contact ends, if the contacted node is not in the friendship set yet, the current node updates its map entry (peer, total contact time). Finally, the peer is added to the friendship set when the pairwise total contact time exceeds the friendship threshold.
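The bookkeeping described in this paragraph can be sketched as follows. This is an illustrative sketch only and is not Algorithm 1 or the ONE simulator code: the class name `FriendshipTracker`, its method names, and the use of total contact time as the tie strength are assumptions.

```python
class FriendshipTracker:
    """Tracks total encounter time per peer and derives the friendship set."""

    def __init__(self, friendship_threshold_s):
        self.friendship_threshold_s = friendship_threshold_s
        self.total_contact_s = {}   # peer id -> accumulated contact time (seconds)
        self.friends = set()

    def on_contact_end(self, peer, duration_s):
        """Accumulate contact time; promote the peer once the threshold is passed."""
        total = self.total_contact_s.get(peer, 0.0) + duration_s
        self.total_contact_s[peer] = total
        if total > self.friendship_threshold_s:
            self.friends.add(peer)

    def strength_to_less_popular(self, own_popularity, peer_popularity):
        """Total tie strength towards friends that are less popular than this node.

        Tie strength is assumed here to be the total contact time in seconds.
        """
        return sum(
            self.total_contact_s[p]
            for p in self.friends
            if peer_popularity.get(p, 0.0) < own_popularity
        )


tracker = FriendshipTracker(friendship_threshold_s=100)
tracker.on_contact_end("b", 60)    # below the threshold: not yet a friend
tracker.on_contact_end("b", 50)    # 110 s in total: becomes a friend
tracker.on_contact_end("c", 200)
print(sorted(tracker.friends))
print(tracker.strength_to_less_popular(5.0, {"b": 3.0, "c": 9.0}))
```

Only friend "b" is less popular than the current node in this example, so the returned strength counts its tie alone.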

Simulation setup
We now discuss the simulation scenarios and evaluation metrics considered in the investigation of TraLDA. We implement TraLDA and the benchmark algorithms in the Opportunistic Network Environment (ONE) simulator [22]. For the simulations, we vary the total number of nodes and the simulation time depending on the mobility scenario.
A warm-up phase of 30% of the simulation duration is used to enable nodes to gather information about the network's states. We set the node buffer to 20 MB, while the message size and its TTL are set to 10 kB and 7 days, respectively. A new message is generated at a rate of 12 messages per hour at a random node, and is directed to a randomly selected destination. For each algorithm, the simulations are run five times with distinct random number seeds.
For mobility scenarios, we use two realistic, long-duration human encounter datasets, Reality [39] and Sassy [51]. In Reality, 100 mobile phones were carried by MIT staff and students over nine months. The phones were running software that performed Bluetooth device discovery every 5 minutes, logging contacts with nearby Bluetooth-enabled devices. The dataset gathered device contacts on the campus over the given period.
In Sassy, by contrast, the traces were acquired using TMote Invent devices carried by academics of the University of St. Andrews. The Invent devices were designed to broadcast beacons every 6.67 seconds to detect other devices within a 10-meter radius. The experiment was conducted for 74 days, during which the participants were asked to carry the devices at all times, whether in or out of town.
For performance evaluation, we utilize the following evaluation metrics:
1. Delivery ratio: the ratio of the number of messages delivered to the number of new messages created.
2. Delivery latency: the time between the creation of a message and its delivery to the intended recipient.
3. Message overhead ratio: the ratio of total overhead messages to total delivered messages. The total number of overhead messages is computed as the number of forwarded messages minus the number of messages successfully delivered.
4. GINI index: this statistical dispersion measure [52] computes the disparity between values of a frequency distribution. Here, the GINI index is used to quantify the fairness level of traffic load distribution in the network: a value of "0" indicates that traffic is divided equally among network nodes, while a value of "1" indicates that all network traffic is processed by a single node.
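The metrics listed above can be computed from simple counters, as in the following sketch. The function names are hypothetical, and the GINI index uses the common rank-based formula over sorted per-node loads, which is an assumption about the estimator since the text does not state one. With this estimator, a single node carrying all traffic among n nodes yields (n − 1)/n, which approaches 1 as n grows.

```python
def gini_index(loads):
    """GINI index of non-negative per-node traffic loads (0 means perfectly equal)."""
    n = len(loads)
    total = sum(loads)
    if n == 0 or total == 0:
        return 0.0
    ordered = sorted(loads)
    # Rank-weighted sum over loads in ascending order (ranks start at 1).
    weighted = sum(rank * load for rank, load in enumerate(ordered, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def delivery_ratio(delivered, created):
    """Delivered messages divided by newly created messages."""
    return delivered / created if created else 0.0


def overhead_ratio(forwarded, delivered):
    """(Forwarded minus delivered) divided by delivered messages."""
    return (forwarded - delivered) / delivered if delivered else float("inf")


print(gini_index([5, 5, 5, 5]))     # equal load on every node
print(gini_index([0, 0, 0, 10]))    # one node relays everything
print(delivery_ratio(8, 10))
print(overhead_ratio(30, 10))
```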

We now present the simulation results and discussions of the delivery performances of conventional BubbleRap [21] (hereafter, called BubbleRap) and conventional SimBet [20] (hereafter, called SimBet) compared with their improved versions within the TraLDA framework (hereafter, called Bubble-TLDA and SimBet-TLDA, respectively) in the given mobility scenarios, Reality and Sassy.

BubbleRap vs. Bubble-TLDA
BubbleRap bases its routing on both node global popularity and the community to which the destination belongs. When either the current node or the encountered node is in the destination's community, routing decisions are made based on local popularity, which is the popularity of the node within the given community; otherwise, global popularity is considered. In BubbleRap, the C-Window method is used to compute node global popularity. This method calculates a node's degree value in the current time window by simply taking the average of all the node's degree values in prior time windows. TraLDA, on the other hand, estimates node inherent popularity in a time window (also measured in node degree) based on the Kalman-prediction, which considers the periodicity of human activities. As an illustration, in Fig. 7 we show the time series of an illustrative hub node's degree values in Reality. In each time window, the node's degree value is determined based on real measurement, C-Window, and Kalman-prediction (we show these values on a daily basis to make them easily observed). For Kalman-prediction, we assume (from Section 3.1) that the seasonality is known, with a period of 7 days. We next discuss the delivery performance of BubbleRap compared with that of Bubble-TLDA in the Reality and Sassy scenarios based on the given evaluation metrics. As we noted above, BubbleRap considers node global popularity and the community to which the destination belongs when making forwarding decisions. To determine the community of a node, we exploit the k-clique community detection in [50]. For the parameters of the k-clique scheme used by both BubbleRap and Bubble-TLDA, we choose k=5 and a familiar threshold of 250k seconds for Reality, and k=3 and a familiar threshold of 3k seconds for Sassy. Moreover, for TraLDA's parameters in Bubble-TLDA, we use two distinct values of the friendship threshold for each mobility scenario: 150k seconds and 300k seconds for Reality, and 2k seconds and 3k seconds for Sassy. In addition, for both mobility scenarios, we use a social impact factor of 0.8, which determines the weight of neighbours' influences on the overall node's popularity.
As previously stated, the neighbourhood of a node is defined in terms of a friendship set, with another node being included in the node's friendship set if their pairwise total encounter time surpasses a given friendship threshold. Indeed, this threshold is critical for TraLDA's performance as it dictates the size of a node's friendship set, which in turn impacts the node's social influence in its neighbourhood.
For instance, in Table 1 we show the friendship sets of a hub node and a non-hub node in the Reality scenario for various values of the friendship threshold (in seconds). In the case of the hub node, we notice that increasing the friendship threshold shrinks the node's friendship set (Table 1(a)). This also implies that as the threshold increases, the spread of social influences of the hub node to its neighbours diminishes. Since a hub node, in general, is the most active node in the network, it consequently has weaker ties with its neighbours. Furthermore, Granovetter [53] underlined the relevance of weak relationships in information dissemination in social networks. A non-hub node, on the other hand, has stronger relationships with its friends (direct neighbours), and as indicated in Table 1(b) the friendship threshold in this case has a small influence on the node's friendship set size.

| Friendship threshold (s) | (a) Hub node friendship set | (b) Non-hub node friendship set |
| 200k | [5, 7, 13, 17, 20, 22, 32, 82, 84, 95] | [45, 63, 82, 95, 96] |
| 250k | [5, 7, 13, 17, 20, 22, 82, 84, 95] | [45, 63, 82, 95, 96] |
| 300k | [5, 13, 17, 22, 82, 84, 95] | [45, 63, 82, 95, 96] |

Finally, we depict the delivery performances of BubbleRap and Bubble-TLDA in Reality and Sassy in Figs. 8 and 9. Compared with BubbleRap, Bubble-TLDA lowers the GINI index at the cost of a longer delivery delay in both scenarios. The greater the value of the social impact factor, the more traffic is redirected from optimal paths (via hub nodes) to suboptimal paths (through non-hub nodes), which are often longer than the shortest routes to the destination. Finally, for the case of message overhead ratio, Bubble-TLDA marginally raises BubbleRap's delivery cost in both mobility scenarios (Figs. 8 and 9 (lower-right)). This implies that reducing traffic in hub nodes while increasing traffic in non-hub nodes has little impact on the delivery overhead, i.e., Bubble-TLDA keeps the total number of message copies at about the same level as BubbleRap.

SimBet vs. SimBet-TLDA
For the last analysis of TraLDA, we now consider SimBet routing [20]. SimBet uses two distinct social properties, namely betweenness centrality and social similarity, to calculate node utility to a given destination. A node's centrality in SimBet-TLDA, however, is determined by considering both the periodicity of human activities and the centrality of the node's neighbours. This can reduce the traffic in the most central nodes and distribute the traffic more equitably across the network nodes, as indicated by the reduction of the GINI index in both mobility scenarios. The GINI index reduction in SimBet-TLDA is more obvious in the case of lower friendship thresholds. As Table 1 shows, with a lower friendship threshold the influence of more central nodes is wider in their neighbourhood, and hence more of the less central neighbours can increase their popularity and become good candidate relays. This GINI index reduction, moreover, only slightly impacts the delivery ratio, and SimBet-TLDA delivers the messages to the destinations with a success rate as high as that of SimBet. However, as in Bubble-TLDA, the GINI index reduction in SimBet-TLDA also increases the delivery time in both mobility cases. The explanation is similar to that given for Bubble-TLDA: when SimBet-TLDA successfully reduces the GINI index, some of the traffic is diverted away from the shortest paths (through hub nodes) onto sub-optimal paths (via non-hub nodes), in turn increasing the average delivery time.
Finally, in terms of message overhead ratio, SimBet-TLDA performs as well as SimBet in both scenarios, i.e., SimBet-TLDA creates as many (redundant) message copies in the network as SimBet.

Conclusion
We presented TraLDA, a distributed framework aimed primarily at improving fairness in forwarding among nodes in mobile social networks. In TraLDA, we introduce a novel calculation of node popularity, a function of inherent and social-relations popularity. We have demonstrated that TraLDA achieves this fairness, reducing the GINI index of BubbleRap and SimBet, but at the expense of a slight increase in the delivery delay of these routing schemes. Given that mobile social networks are assumed to be delay-tolerant, the increased delivery latency is a reasonable trade-off given the enhanced network traffic fairness and lower resource use in the most popular nodes.
For future work, we believe that TraLDA can be incorporated with buffer congestion control to further improve traffic load balancing across network nodes and simultaneously avoid congestion mainly in the most popular nodes.

Social interactions influence human mobility. As a result, MSNs are closely linked to social (relation) networks, and knowledge about social ties can be used to improve routing algorithms in such human-based networks.

Researchers currently focus on studying social relation patterns, e.g., node popularity and social similarity, as the choice parameters of relay nodes. Furthermore, the proposed social-based routing algorithms [7-9] typically favour nodes with many social ties as optimal carriers for message transfers.

Citation: Lastname, F.; Lastname, F.; Lastname, F. Title. Algorithms 2022, 15, x. https://doi.org/10.3390/xxxxx

[...] particularly in the trade-off between forwarding fairness and efficiency. SimBet and BubbleRap combine two different utility metrics to decide node fitness as a relay to a given destination: one that depends on the destination (i.e., similarity and community in SimBet and BubbleRap, respectively), and one that is independent of the destination (i.e., betweenness centrality and global popularity in SimBet and BubbleRap, respectively). TraLDA, in turn, focuses on improving the calculation of global popularity and centrality in BubbleRap and SimBet, respectively. The following are the main contributions we made in this paper:
- To increase fairness in forwarding of social-based routing algorithms in MSNs, we propose a framework called Traffic Load Distribution Aware (TraLDA). In TraLDA, we offer a new method for calculating node global popularity, a function of both inherent and social-relations popularity.
- The inherent popularity of a node is solely determined by the node's own mobility pattern or sociability level in the network, and in TraLDA is computed using the Kalman-prediction, which accounts for the regularity (periodicity) of human behaviour.
- Node social-relations popularity, on the other hand, represents the advantages of connections with the more popular or central nodes (individuals). It shares the popularity of more popular nodes with their less popular counterparts.
- Finally, we apply TraLDA to the calculation of node global popularity and centrality in BubbleRap and SimBet, respectively, in order to improve the traffic load balancing among network nodes. Using extensive simulations in the Opportunistic Network Environment (ONE) [22] driven by realistic human mobility scenarios, we show that TraLDA enhances fairness in forwarding of both schemes, without negatively affecting the overall delivery performances.

Figure 1. A mobile social network's structural topology. On the top layer, the social network drives humans to move, and this human mobility creates opportunistic contacts in the physical network.

Figure 2. (left) The node degree distribution in the Reality mobility scenario, and (right) the same distribution plotted on a log-log scale. The nearly linear plot of the node degree on the log-log scale verifies that the node degree is power-law distributed.

Figure 3. (left) Node degree vs. node traffic load, and (right) the buffer queue growth of an illustrative hub node and non-hub node, when the hill-climbing heuristic forwarding is used in the mobile social network in the Reality mobility scenario. This shows an imbalanced traffic load among nodes, with the highest degree nodes handling the bulk of network traffic, resulting in significant buffer occupancy throughout the simulation.

Figure 4. (left) The changes of node degree of an illustrative hub node in Reality (measured by node degree in a 6-hour time window), and (right) the detected periodicities of the node's degree. This shows that the node popularity in the mobile social network fluctuates over time and has a weekly period.

In the definition of social-relations popularity, the influence indicator is 0 when no influence exists, $F(i)$ represents the set of i's friends, $w_{ij}$ is the connection strength of i and j, and the popularity term is the cumulative mean of global popularity in time window t.

Figure 6. A neighbourhood of node A, comprising 4 neighbour nodes which can give social influences to node A. Social influences to node A (red dotted vectors) exist when the global popularity of the neighbours is higher than A's.

Fig. 7 shows that Kalman-prediction captures fluctuations in the node degree values, and thus delivers more accurate estimations of the instantaneous node's popularity compared to BubbleRap's C-Window. C-Window reacts slowly to variations in node popularity and ignores the regularity of human activity.
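The slow reaction of a cumulative mean can be seen in a small sketch. It assumes, as the text describes, that C-Window estimates the current window's degree as the average of all prior windows; returning 0.0 for the first window (which has no history) and the function name are illustrative assumptions.

```python
def c_window_estimates(degrees):
    """Cumulative-mean estimate per window: the average of all prior windows."""
    estimates = []
    running_sum = 0.0
    for t, value in enumerate(degrees):
        # Estimate for window t uses only windows 0..t-1.
        estimates.append(running_sum / t if t > 0 else 0.0)
        running_sum += value
    return estimates


# Illustrative daily degrees: five busy weekdays followed by a quiet weekend.
degrees = [10, 10, 10, 10, 10, 2, 2, 10]
print(c_window_estimates(degrees))
```

In this example the estimate stays near the weekday level through the weekend, because each new window contributes only a small share of the running average.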

Figure 7. Node degree values of an illustrative Reality hub node in a certain time window, comparing the actual value, the Kalman prediction, and the C-Window estimate. Kalman-prediction clearly outperforms C-Window when estimating the actual node's degree level in each time window, and it captures the periodic pattern of the node degree quite well.

Figure 8. Performance evaluation of BubbleRap and Bubble-TLDA (social impact factor 0.8) for the Reality mobility scenario, comparing the delivery performances of BubbleRap and its improved version, Bubble-TLDA. Bubble-TLDA significantly decreases the GINI index of BubbleRap in this case, without negatively impacting other delivery performances.

Figure 10. (left) The traffic load distribution among nodes in Reality for BubbleRap, and (right) for Bubble-TLDA (social impact factor 0.8, friendship threshold 150k seconds). Clearly, the improved node popularity calculation of TraLDA on BubbleRap significantly reduces the traffic load in the hub nodes, while increasing the relay traffic in the majority of non-hub nodes.
SimBet's two utility metrics, which determine node utility to a given destination, are both calculated based on a binary model of a social connection, where a value of "1" denotes that a pair of nodes have known each other, and "0" otherwise. The binary social relationships may create a substantial issue in forwarding fairness, since a node having many contacts with other nodes will always be considered popular regardless of time. To quantify node centrality, SimBet computes node betweenness centrality based on an ego-centric network approach, since the global network topology information is commonly unavailable for nodes in MSNs. Node social similarity, on the other hand, is calculated as the number of common encountered nodes between a pair of nodes.

Figure 12. Performance evaluation of SimBet and SimBet-TLDA (social impact factor 0.8) for the Reality mobility scenario. As in the case of BubbleRap, TraLDA is also able to substantially reduce SimBet's GINI index here, without much affecting other delivery performances.

Figure 13. Performance evaluation of SimBet and SimBet-TLDA (social impact factor 0.8) for the Sassy mobility scenario. Similar to the case in Reality, here SimBet-TLDA also improves the traffic fairness in the network (indicated by the reduced GINI index), while keeping other delivery performances as high as those of SimBet.

[Figure 1 labels: Social Network, Human Mobility, Physical Network; topology volatility from Low to High]
[...] topology is typically needed. The movement patterns of nodes in mobile networks, e.g., MANETs and OMNs, have a direct impact on the networks' topologies. Mobile social networks, in particular, are human-based networks, and node encounters in such networks represent the ways in which people interact. Yoneki et al. [38] and Hossmann et al. [5] studied the topology characteristics of mobile social networks using some realistic human mobility scenarios. They first aggregated the contact data to establish weighted contact graphs, where the link weights express the contact duration of pairs of nodes. These graphs in turn exhibit the characteristics of social networks. (A social network is a graph of human relationships formed by one or more types of interdependencies, such as mutual interests, kinship, or friendship.) By applying a complex network analysis on the derived graphs, they concluded that the networks have a non-random (heterogeneous) connectivity structure, exhibiting a power-law degree distribution in which some nodes have a relatively large connectivity degree to other nodes, whereas the majority of nodes in the network have few connections. The large-degree nodes (so-called hub nodes) are the most popular (central) nodes in the social graph, and therefore they can act as information brokers which are capable of disseminating messages to all nodes within a relatively short delay. In Fig. 1 we show the structural topology of a mobile social network: a virtual social network, which is less volatile than the physical network, exists on top of a mobile social network, and this network guides humans to move.
The neighbourhood of node A in Fig. 6 comprises 4 neighbours with different levels of global popularity at a time t. Between a pair of nodes A and B, a black line indicates the social connection between them, labelled with their connection strength (e.g., measured in total contact duration (seconds)). A red dotted vector, on the other hand, indicates a social influence towards node A, which exists when the neighbour's global popularity is higher than A's. Finally, the social-relations popularity of node A at time t is calculated from the influences of these more popular neighbours, weighted by their connection strengths.

Table 1. The friendship sets of a hub node and a non-hub node in Reality for different values of the friendship threshold.