A Privacy-Preserving Routing Protocol Using Mix Networks in Opportunistic Networks

Abstract: This paper focuses on the problem of providing anonymous communications in opportunistic networks. To that end, we propose an approach using Mix networks that enables a relatively simple solution. Opportunistic networks present some constraints that make the deployment of typical network anonymity solutions difficult or infeasible. We show, utilizing simulations on the basis of real mobility traces, that the proposed solution is feasible for some scenarios by introducing a tolerable penalty in terms of message delay and delivery. To investigate the impact of routing strategies, we offer two different methods to select Mix nodes. From the experiment results, we show the trade-off between network performance and security.


Introduction
We live in an era of global deployment of computer networks and network devices. This technology has allowed for the creation of a wide range of network-based applications that enable collaborative behavior among users, who share personal interests such as their hobbies and professions. During the last few years, the research community has proposed different solutions to allow mobile nodes to communicate with each other, even in extreme cases where no end-to-end connection is guaranteed. This network paradigm has been termed opportunistic networks (OppNets) [1]. In these opportunistic social networks, users carry mobile devices to communicate with each other and share data as in traditional social networks. Nevertheless, when it comes to privacy in OppNets, not all that glitters is gold.
There is a range of applications in OppNets, such as crisis management, battlefield coordination, wildlife monitoring, transportation engineering, and remote healthcare. They are typically used in environments that require networks to be tolerant of long delays, interruptions, and high error rates. Nodes in these networks communicate with each other in an opportunistic manner. It is common to use routing strategies on the basis of the so-called store-carry-forward strategies. That is, a node can store and carry a message until it finds another node to forward it.
Because of the unstable connectivity of OppNets, it is difficult to directly use traditional network privacy and anonymity mechanisms. Current proposals are mostly centered on providing extensions or strategies to adapt onion routing for OppNets. Although these could be interesting approaches, some OppNet scenarios need to address the problem of traffic analysis. Contrary to the use of onion routing on the Internet, for example, in some OppNet scenarios an attack model where the attacker has access to the whole communication medium is not rare. Typical OppNets use wireless connections, where traffic analysis poses a dangerous threat. Attacks can exploit message flow to detect the whole onion-routing path, which leaks the communication pattern between sender and receiver, provided that overall network traffic is not extremely high. A common approach to deal with this problem is to generate noisy network traffic to mask actual traffic. Another alternative that has not been investigated in depth is the use of mix networks (mixnets). In this paper, we propose the use of a mixnet mechanism to provide privacy in opportunistic networks.
Mixnets can be used to provide anonymous communication in OppNets without high impact on the normal operation of the network. Some nodes from the network act as mix nodes enabling the implementation of different strategies. To the best of our knowledge, our anonymous schema is the first work to propose the use of mixnets in OppNets. Our main contributions are the following: First, we designed a mixnet-based communication approach for OppNets. Second, we validated and showed the particularities of our proposed schema in the ONE simulator (a common network simulator especially focused towards OppNets). Third, we analyzed the performance and privacy property of our proposed mechanism. This paper is organized as follows. In Section 2, we introduce the current related work. We then present our anonymous-routing mechanism using mixnets in Section 3. Section 4 describes our experiment simulation on the ONE simulator. Lastly, in Section 5, we conclude our proposed work.

Motivation and Related Work
In OppNets, we can have unstable nodes with unpredictable movements producing a regularly changing network topology. Hence, end-to-end routing paths between source and destination nodes may be difficult to establish. The reliability of data transmission and network-traffic control, which are provided by traditional transport protocols such as TCP, are very inefficient [2] in OppNets. This also means that, in order to provide privacy and anonymous communication services, traditional network mechanisms cannot be directly applied [3].
Several proposals addressed anonymous communications in OppNets, and most of them are based on some application of onion routing. For instance, ARDEN [4] proposes an attribute-based encryption method to provide strong anonymity guarantees. It is based on a traditional onion-routing architecture where nodes can act as onion routers. ARDEN divides nodes into several groups, and each group shares the same key. Thus, nodes from the same group can forward the message, which can improve delivery probability.
Sakai et al. [5] designed an onion-based routing protocol and extended the existing protocol with group onions into multicopy forwarding. The main contributions of this paper were performance and security analyses of onion-based anonymous routing for DTNs.
In [6], Chen et al. proposed an onion-based scheme to ensure anonymous communications in predictable opportunistic networks (POppNets). A POppNet is a network where end-to-end connectivity is not guaranteed, and node communication happens in an opportunistic manner, but the behavior of the network can be predicted in advance. The predictability of such networks can be exploited to simplify some mechanisms of more generic OppNets where there is no prior knowledge on the network behavior.
Alternatively, [7] provided a new structure for the history table of nodes in an OppNet in which the personal information of participants in the network is hidden from other nodes. To send a message from a sender node to the final receiver node, the optimized route, composed of different nodes, is identified by inspecting the history table that each node creates on the basis of previous interactions. That paper proposed a new privacy-preserving history-based (PPHB) routing mechanism on the basis of historical location tracking. Bakiras et al. used a random-walk process and encrypted the exchanged messages to ensure anonymous communications in OppNets [8]. Their work leveraged the simple opportunistic store-carry-forward mechanism to deliver the message from end to end.
In general, and particularly in solutions based on onion routing, some problems might arise if we consider traffic analysis attacks. In an OppNet, the transmission medium is usually the air, making it relatively easy to gather traffic data by anyone with physical proximity to the devices. If we consider an adversarial model where the attacker has access and can monitor all network interactions, traffic analysis techniques can be used to correlate source and destination in common onion routing based approaches. This adversarial model is relatively realistic for some scenarios. Consider, e.g., an OppNet built for a conference where nodes move around a relatively small physical space. OppNet routing protocols usually provide some sort of traffic analysis resistance, in the sense that a packet might be broadcast or sent to nodes which are neither the destination nor an intermediate node in the final delivery path. This does not prevent the identification of the message source, and in most cases does not prevent the attacker from gaining some knowledge about intermediate and destination nodes.
In order to provide an anonymous communication method that is robust enough against this adversarial traffic analysis model, we considered the use of mixnets for OppNets. The idea of using mixes to provide anonymous communication was first proposed by Chaum in 1981 [9]. In Chaum's untraceable system, mix nodes encrypt messages, and use the stack to store and shuffle the messages. Freedman proposed a peer-to-peer mix network called Tarzan [10] that exploits layered encryption and multihop routing to achieve anonymous communication. We introduce a similar approach for OppNets, considering the particularities of these networks and, more precisely, the typical routing strategies used in these networks.

Anonymous Routing Using a Mix Network
Our proposal is based on the use of a mixnet in OppNets. The main idea is to be able to use a relatively simple mixnet schema without requiring complex cryptographic mechanisms. This approach is not only feasible in OppNet scenarios, but can also set the basis for interesting solutions to anonymous communications in dynamic networks.

Opportunistic Mix Network
We are dealing with networks where nodes can move, appear, and disappear, and all links are opportunistic. Following the notation from [6], we can see an OppNet as an undirected dynamic graph $G(V, E)$, where $V$ is the set of nodes and $E$ the set of edges. Each edge denotes a connection between two nodes that can be used in both directions during a time range. Usually, an edge is denoted as $e = (u, v, t, \lambda)$, showing that it starts at time $t$ and has duration $\lambda$. Nodes can also appear and disappear from the network; without loss of generality, we can assume that the disappearance of a node is captured by the fact that it has no edges (a node without connections and a node that has disappeared are equivalent for our purposes). Graph $G^*(V^*, E^*)$ is the static undirected graph obtained from $G$ by considering all its edges without time constraints (all edges are in the graph independently of time); we have that $V^* = V$ and $E^* = \{(u, v) \mid (u, v, t, \lambda) \in E\}$, and we denote by $N^*(v)$ the set of neighbors of node $v$ in $G^*$.
In order to implement the mixnet, each network has a set of $m$ mix nodes $M = \{M_1, \ldots, M_m\}$, such that $M \subseteq V$. Our proposal exploits the fact that there could be several mix nodes, and the source node is free to choose which ones it uses to send a message. Moreover, the sender can choose the number of mix nodes to use in a cascade fashion (network) from set $M$. In this sense, our approach follows what could be denoted as a restricted free-route mix network: it is a free-route mixnet [11,12], but the selection of nodes is limited to a subset of the nodes of the network. We assumed that there was a key distribution mechanism, so each mix node had an asymmetric key pair, and the public key of each mix node was known to all nodes.
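The graph model above can be made concrete with a short sketch. It assumes contacts are given as (u, v, t, λ) tuples and that N*(v) denotes the neighbors of v in the static graph G*; the `Contact` type and the `static_neighbors` helper are illustrative names, not part of the simulator or of the original notation.

```python
from collections import defaultdict
from typing import NamedTuple


class Contact(NamedTuple):
    """One timed edge e = (u, v, t, lam): u and v are connected from t for lam seconds."""
    u: str
    v: str
    t: float
    lam: float


def static_neighbors(contacts):
    """Return N*(v) for every node: its neighbors in the time-free graph G*."""
    neighbors = defaultdict(set)
    for c in contacts:
        # Edges are undirected, so record the link in both directions.
        neighbors[c.u].add(c.v)
        neighbors[c.v].add(c.u)
    return dict(neighbors)


contacts = [
    Contact("a", "b", 0.0, 30.0),
    Contact("b", "c", 10.0, 5.0),
    Contact("a", "b", 100.0, 12.0),  # a repeated contact collapses to one static edge
]
n_star = static_neighbors(contacts)
```

Dropping the time fields is what turns the dynamic graph into its static projection; repeated contacts between the same pair contribute a single edge.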
We considered a typical decryption mixnet. For example, consider sender node $V_s$ that chooses three mix nodes, $M_1, M_2, M_3$, to send message $x$ to destination $V_d$. The nodes have corresponding public keys $PK_s, PK_1, PK_2, PK_3, PK_d$. To that end, $V_s$ builds an onion-based encryption such as $PK_1(r_1, PK_2(r_2, PK_3(r_3, PK_d(r_d, x))))$, where the $r_i$ are random values ensuring that messages cannot be correlated with their encrypted versions in any step. The message is received by $M_1$, decrypted, and sent to $M_2$ after performing the mix. Each mix node does the same until the message arrives at destination $V_d$. Each mix node $M_i$ waits until it has received $k_i$ messages, and then forwards these $k_i$ messages, making it hard for an attacker to correlate input messages with output messages.
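The layering can be illustrated with a structural sketch. It models only the nesting of the layers and who may remove each one; the `Sealed` class is a stand-in for public-key encryption and provides no confidentiality. All names (`Sealed`, `build_onion`, `peel`) are hypothetical.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Sealed:
    """Stand-in for PK_recipient(nonce, inner). This is NOT real encryption."""
    recipient: str
    nonce: str
    inner: object


def build_onion(path, destination, message):
    """Wrap message as PK_1(r_1, PK_2(r_2, ... PK_d(r_d, x))) from the inside out."""
    layer = Sealed(destination, secrets.token_hex(8), message)
    for mix in reversed(path):
        layer = Sealed(mix, secrets.token_hex(8), layer)
    return layer


def peel(node, layer):
    """Remove one layer; only the node the layer is addressed to may do so."""
    if layer.recipient != node:
        raise PermissionError(f"{node} cannot open a layer for {layer.recipient}")
    return layer.inner


onion = build_onion(["M1", "M2", "M3"], "Vd", "hello")
layer = onion
for hop in ["M1", "M2", "M3", "Vd"]:
    layer = peel(hop, layer)
recovered = layer
```

In a real deployment each layer would be a ciphertext under the corresponding public key, and the fresh random value in each layer is what keeps a mix's input and output from being matched by re-encryption.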
In the general case, mix nodes are a subset of the whole network nodes, $M \subseteq V$. The source node selects a path of mix nodes $M_i \in M$ of length $p$, and encrypts message $x$ as $PK_1(r_1, PK_2(r_2, \ldots PK_p(r_p, PK_d(r_d, x)) \ldots))$. We assumed that each mix node had enough capacity to perform the cryptographic operations, and memory to store the required $k_i$ messages. Which nodes act as mix nodes depends on the specific scenario and the parameterization of our approach. If we assume that we can choose such nodes, this can have implications in several ways. Having more mix nodes could provide more privacy. The sender can choose randomly from the set of mix nodes, and thus make it harder for an attacker to follow the message through the network. On the other hand, having many mix nodes can enlarge the delay associated with each mix since each node needs to wait for $k_i$ messages. We can also consider other approaches to select potential mix nodes; for instance, it seems reasonable that nodes with more interactions are more interesting to use. We discuss this in Section 3.3.
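The waiting behavior of a mix node can be sketched as a simple threshold mix: messages are pooled until k of them are available, then shuffled and flushed together. This is a minimal sketch under that assumption; the `ThresholdMix` class is a hypothetical name, and it omits decryption, buffering limits, and TTL handling.

```python
import random


class ThresholdMix:
    """Pool incoming messages and release them in random order once k are held."""

    def __init__(self, k, rng=None):
        self.k = k
        self.pool = []
        self.rng = rng or random.Random()

    def receive(self, message):
        """Buffer a message; return a shuffled batch when the pool reaches k, else []."""
        self.pool.append(message)
        if len(self.pool) < self.k:
            return []
        batch, self.pool = self.pool, []
        self.rng.shuffle(batch)  # output order is independent of arrival order
        return batch


mix = ThresholdMix(k=3, rng=random.Random(0))
outputs = [mix.receive(m) for m in ["m1", "m2", "m3", "m4"]]
```

The first two calls return nothing, the third flushes all three pooled messages, and the fourth message starts a new pool. This is where the extra delay comes from: a message cannot leave until enough other messages have arrived at the same mix.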

Message Routing
Routing messages in our opportunistic mixnet approach differs greatly from a common mixnet implemented in a more classical network. Messages might not be easily routed, and source routing cannot generally be used. Thus, when the source node chooses the path of mix nodes, delivering the message to such nodes might require routing it through other intermediary nodes. To that end, our proposal uses epidemic routing.
In epidemic routing, each node forwards a message to all its neighbors until the message reaches its destination. It can produce much flooding in the network and, although it is not the most efficient routing strategy for OppNets, it is the most generic and basic approach, and served as a good basis to evaluate our proposal.
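A minimal sketch of epidemic forwarding over a contact trace follows. It assumes contacts are (u, v, t) tuples processed in time order, that a contact always suffices to copy the message, and that buffers and TTL are unlimited; the function name `epidemic_delivery` is illustrative.

```python
def epidemic_delivery(contacts, source, target, start=0.0):
    """Flood one message over time-ordered contacts (u, v, t).

    Returns (delivery_time, hops) for the first copy that reaches target,
    or None if no sequence of contacts delivers it.
    """
    if source == target:
        return start, 0
    hops = {source: 0}  # node holding a copy -> hops taken by that copy
    for u, v, t in sorted(contacts, key=lambda c: c[2]):
        if t < start:
            continue
        # A contact is symmetric: whichever side holds a copy hands it over.
        for carrier, other in ((u, v), (v, u)):
            if carrier in hops and other not in hops:
                hops[other] = hops[carrier] + 1
                if other == target:
                    return t, hops[other]
    return None


trace = [("Vs", "V1", 5), ("V2", "M1", 8), ("V1", "V2", 10), ("V2", "M1", 20)]
forward = epidemic_delivery(trace, "Vs", "M1")
backward = epidemic_delivery(trace, "M1", "Vs")
```

In this trace the message reaches M1 at time 20 after three hops, while the reverse direction is never delivered because the contacts occur in the wrong order. In the mixnet setting, one such flood would be run for each routing step.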
As an example, Figure 1 shows a possible message delivery from source node $V_s$ to destination node $V_d$. The source node chooses three mix nodes, $M_1, M_2, M_3$. The delivery of the message needs to use intermediary nodes, denoted as $V_i$ for $i = 1, \ldots, 8$. The dashed lines denote epidemic routing, which is used to deliver the message from $V_s$ to $M_1$, then from $M_1$ to $M_2$, and so on. We call each of these routes (from the source to the first mix node, between consecutive mix nodes, and from the last mix node to the destination) a routing step. The number of intermediate nodes used in each routing step depends on the network behavior at the moment. As described before, each mix node has to wait until receiving $k_i$ messages, which introduces an additional penalty to message-delivery time.

Choosing Mix Nodes
As described before, the placement of mix nodes in the network is an important issue. In order to evaluate our opportunistic mixnet proposal, we chose a set of network nodes M that could act as mix nodes. The size of M, denoted as |M|, or more precisely its size with respect to |V|, determines the overall performance of the mixnet. Apart from its size, another important decision is how to choose such a set M. There are existing routing strategies in free-route mixnets to optimize the anonymity of the system [13]. These strategies, however, assume that all nodes are reachable, and that routing is not opportunistic. In order to evaluate the use of our proposal, we provide two selection strategies focused on the performance of the opportunistic mixnet.
The two different strategies that can be used to select set M are:
• Random selection: M is randomly selected among all nodes V. This is a relatively simple idea where nodes acting as mixes can be any node from the network. No other characteristic is considered.
• Centrality-based selection: M is selected as the |M| nodes from V with higher centrality.
Here, the idea is to promote the use of highly connected nodes to act as mixes. This should improve the performance of the overall network. In order to simulate this idea, we look at nodes with higher centrality. As the centrality measure, we used the historic number of neighbors of each node. That is, a node with more contacts over time has higher centrality: a connection-based centrality measure that captures how many connections or interactions a given node has. More precisely, the centrality of a node $v \in V$ is denoted as $C(v) = |N^*(v)|$. In real situations, this could be equivalent to selecting nodes that are known to have higher centrality or connectivity; e.g., in some vehicular networks, road-side units are known by every member of the network, and they are also central nodes.
Intuitively, the centrality-based strategy should yield better delivery ratios and delivery times. Those nodes have better connectivity, and messages are more efficiently routed among them.
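The two selection strategies can be sketched as follows, assuming the historic neighbor sets N*(v) are available as a dictionary. The function names and the alphabetical tie-breaking rule are assumptions made for this illustration.

```python
import random


def select_random(nodes, m, rng=None):
    """Random selection: pick m mix nodes uniformly from all nodes."""
    rng = rng or random.Random()
    return set(rng.sample(sorted(nodes), m))


def select_by_centrality(neighbors, m):
    """Centrality-based selection: the m nodes with the largest C(v) = |N*(v)|.

    neighbors maps each node v to its historic neighbor set N*(v).
    Ties are broken by node name (an arbitrary choice for this sketch).
    """
    ranked = sorted(neighbors, key=lambda v: (-len(neighbors[v]), v))
    return set(ranked[:m])


neighbors = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a", "d"},
    "d": {"a", "c"},
}
central = select_by_centrality(neighbors, 2)
randomly_chosen = select_random(neighbors, 2, random.Random(1))
```

Here node a has three historic neighbors and is always selected; c and d tie with two each, and the tie-break picks c.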
The size of set M with respect to V also has performance and privacy implications, but such implications might not be that obvious. For instance, a big set of potential mix nodes implies:

• Increased privacy: as more nodes can be used as mix nodes, the attacker needs to monitor more nodes when attempting to correlate inputs with outputs. This introduces confusion as a means of privacy. Privacy is actually determined by the number of messages that each mix node accepts and the number of mix nodes each message uses. The fact that there are more mix nodes present in the network hinders the task of the attacker in monitoring such nodes, and can provide better privacy due to the distribution of links in the entire mix network [13].
• More time delays: as there are more mix nodes that can be used, filling each node with its corresponding $k_i$ messages takes more time, and the delay introduced by each mix node is expected to be higher. This obviously depends on the actual traffic of the network (number of messages exchanged over time), but in general, more mix nodes produce higher delay times.

Threat Model
The objective of the adversary is to link the sender and receiver in our OppNets system. Specifically, we assumed that the attacker could access our OppNets system. It can wiretap, forward, and delete a message. Meanwhile, the attacker can also compromise the user and mix nodes in the system. However, we assumed that at least some mixes were honest.
Our threat model was based on [14]. No security mechanism can resist all kinds of attacks. We mainly considered three different threat types.

• Traffic analysis. Attacker A can conduct a traffic analysis to probe the link between sender node S and receiver node D. A can obtain access to every single message M, and its goal is to infer, with high probability, that S and D are communicating with each other.
• Compromised user nodes. We also assumed that attacker A could compromise user nodes. Thus, A can obtain message M and forward it. In our scenarios, we assumed that at least two users were honest.
• Compromised mix nodes. Mix nodes can also be compromised, but at least one of them should be honest. The compromised mix nodes can conduct a link attack to deanonymize the system.

Evaluation
In this section, we evaluate our opportunistic-mixnet proposal. We first introduce the scenarios and simulation environment. Then, we describe the four metrics used to evaluate the performance. Lastly, we present the results.

Simulation Environment and Scenarios
We present an experiment using an enhanced version of the Opportunistic Network Environment (ONE) simulator [15] that included our anonymous mixnet routing. We chose different scenarios to analyze the performance of our schema. Node contacts from the scenarios were defined by physical contacts obtained from real mobility traces from the Crawdad database (http://crawdad.org/), a community resource for collecting wireless data at Dartmouth College. Each scenario corresponded to a different network with different nodes, and different mobility and contact patterns.
The first scenario, Info5, was based on real mobility traces [16] obtained during the 2005 edition of the Infocom conference over the course of almost three days. Contacts from these mobility traces represent 22,459 contacts from 41 different nodes. The second scenario, Cambridge, was based on 10,641 real contact traces from 51 students from the system research group of the University of Cambridge carrying small devices for six days [17]. Lastly, the third scenario, Taxis [18], contained 449,226 mobility contacts from 304 taxis during one month in the city of Rome. Table 1 summarizes the main characteristics of the three scenarios. These include the connections per minute, which is the average number of connections a network has in one minute during the whole duration of the scenario; total connection time, which is the sum of the duration of all connections in the network, $\sum_{e_i \in G} \lambda(e_i)$; and the maximal and average degree of all nodes $v \in V^*$, where the degree of the node is determined by $N^*(v)$ (see Section 3.1). These scenarios are commonly used in the evaluation of routing algorithms in the OppNet literature. They provide a realistic setup to measure the feasibility of our proposal.
In our experiment, we set the simulation time to 86,400 s (24 h). That is, we simulated each scenario for one day.

Performance Metrics
One of the main concerns when using not only mixnets, but any means to provide anonymous communications in a network, is the penalty that such solutions impose on overall network performance. This is especially relevant in OppNets, where message-delivery performance is usually already poor. The use of mix nodes introduces latency in message delivery and decreases the delivery ratio. The goal is to see whether these penalties are tolerable. To that end, we measured message latency and delivery ratio in several simulations for the three proposed scenarios, using different setups and parameterizations.
We mainly focused our experiments on the following metrics:
• Message latency: the average time that it takes for a message to reach its destination.
• Delivery ratio: the ratio of successfully delivered messages to created messages.
• Overhead ratio: the number of relayed messages minus the number of delivered messages, divided by the number of delivered messages. The overhead ratio shows the amount of network resources used in the process of delivering a message to its receiver.
• Average routing-step path length: the average number of hops that one packet needs to traverse in each routing step using epidemic routing (see Section 3.2). That is, the number of nodes used to route a message from the source node to the first mix node, from the first mix node to the second, and so on.
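The first three metrics follow directly from message counters. The sketch below assumes those counters are available from the simulation reports; the function names and the example counts are hypothetical and are not results from the experiments.

```python
def delivery_ratio(created, delivered):
    """Fraction of created messages that reached their destination."""
    return delivered / created if created else 0.0


def overhead_ratio(relayed, delivered):
    """(relayed - delivered) / delivered: extra transmissions per delivered message."""
    return (relayed - delivered) / delivered if delivered else float("inf")


def mean_latency(latencies):
    """Average creation-to-delivery time over the delivered messages."""
    return sum(latencies) / len(latencies) if latencies else float("nan")


# Hypothetical counters, chosen only to illustrate the formulas.
created, delivered, relayed = 10, 8, 40
ratio = delivery_ratio(created, delivered)
overhead = overhead_ratio(relayed, delivered)
latency = mean_latency([100.0, 300.0])
```

With these illustrative counts, 8 of 10 messages are delivered and each delivered message costs 4 additional relays on average.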
There are two main parameters that influence how our proposal is deployed in a given scenario. One is the percentage of mix hosts, i.e., the percentage of nodes that can be chosen as mix hosts or the size of M with respect to V. These are 20%, 40%, 60%, 80%, and 100%.
The other parameter is the number of mix nodes used in cascade, that is, the number of mix nodes that each message goes through. In our experiment, we varied the number of mix nodes from 0 to 6. Here, a value of 0 means using the general broadcast method. We used simple broadcast to transmit our message from one mix to another. To simulate a real OppNet environment, the TTL (time-to-live) of each message was set to a relatively large value of about 5.5 h (20,000 s). We assumed that each node had enough buffer space to store the messages. During the simulation, every 100 s, a randomly selected node sent a 1-byte message to a randomly chosen node. This was performed in all three scenarios. In our simulation, we set the parameter $k_i$ introduced in previous sections to 10 for each mix node $i$.

Results
Running the simulations, we could measure the penalty introduced by using the opportunistic-mixnet approach in the three above-mentioned scenarios. Figure 2 shows the delivery ratio for each scenario. In each case, the figure shows the delivery ratio as a function of the number of mix nodes used in cascade. Zero mix nodes corresponds to the case of not using our opportunistic-mixnet approach. Each line corresponds to a different size for the set M of mix nodes. We used sizes corresponding to 100%, 80%, 60%, 40%, and 20% of the total nodes. The nodes in M were randomly selected from the whole set of nodes of the network. Similarly, Figure 3 shows the same data, but the selection of the set of mix nodes was performed using node centrality.
In general, the selection of nodes with higher centrality as mix nodes provided slightly better results in terms of delivery ratio. This improvement was, however, relatively small in most cases.
Using a higher number of nodes as mix nodes, that is, a larger size for set M, usually provided lower delivery ratios. This is because increasing the number of mix nodes also increases the delay introduced by each. A source node randomly selects mix nodes from M. With more possible nodes, the distribution of messages for each mix node decreases, and mix nodes need more time to receive their corresponding k i messages to be able to flush them. Increasing the size of M increases the privacy provided by the system, but on the other hand introduces a clear observable penalty.
In Figures 4 and 5, we show the latency of delivered messages for each scenario using different sizes for set M. Those sizes are 80%, 60%, 40%, and 20% of the total number of nodes in each case. The delay is also shown for different numbers of cascaded mix nodes. As in the previous figures, zero mix nodes is the case of sending messages without the opportunistic mixnet.
Again, Figure 4 shows the case when the set of mix nodes was chosen randomly, and Figure 5 when those nodes were selected by their centrality. The obtained results in this case are analogous to the ones observed for the delivery ratio. When using nodes on the basis of their centrality as mix nodes, we obtained slightly better results. The same happens for smaller sizes of set M.
Overall, the penalty introduced by using the opportunistic-mixnet approach was not very large, and should be tolerable in most scenarios and applications of opportunistic networks. In Tables 2 and 3, we can see the overhead ratio of our proposed mix network methods with random and centrality-based mix nodes. The overhead ratio indicates the volume of network resources used to deliver a message to its receiver. From the results, we can see that the overhead ratio of the Taxis scenario was much higher than those of the Cambridge and Info5 scenarios. The main reason is that there are more nodes in the Taxis scenario than in the two other scenarios, which leads to more relayed messages during transmission. The results also show that the number of mixes had more of an effect on the overhead ratio of the Taxis scenario than on those of the Cambridge and Info5 scenarios.
We can also check the average routing-step path length in each case, that is, the number of hops that a message needs to traverse between each mix node. This is shown in Tables 4 and 5 when selecting mix nodes randomly or on the basis of their centrality, respectively.
In general, the path length for each step is kept constant, and the use of the mix nodes does not increase such length. Obviously, as the number of mix nodes increases, the overall path length from source to destination also increases, since a message that uses p mix nodes traverses p + 1 routing steps; e.g., if we have an average step-path length of 4, using one mix node results in an overall path length of 8, and using three results in an overall path length of 16. This is expected given the way an opportunistic mixnet works. We saw, however, that the penalty introduced in terms of delivery ratio or overall path delay is relatively tolerable.
To analyze the buffer storage of each node, we measured in simulations the average time that messages are held in mix nodes, which is less than 1% of total delivery time. This percentage is so small because there is a lot of traffic in the network, and the minimal number of messages needed to forward a message is reached 99% of the time. Because of this, buffer utilization did not affect delivery latency.

Anonymity Analysis
Our mixnet-based proposal can ensure sender and receiver anonymity, and sender-receiver unlinkability.
Sender anonymity: message M cannot be linked to any sender S. In our mixnet proposal, receiver R cannot trace back sender S, as R can only monitor its previous node. A single mix node only knows the previous and successor node. Even if the attacker compromises a number of mix nodes, it is still impossible to connect the sender and receiver if there is at least one honest mix node.
Receiver anonymity: message M cannot be linked to any receiver R. When message M is forwarded in our OppNet schema, it goes through a number of mix nodes. Since at least one of these mix nodes is not compromised, the attacker cannot obtain any information from message M. Thus, our proposed mixnet method can provide receiver anonymity.
Sender-receiver unlinkability: to link sender and receiver, an attacker must trace the routing path in the OppNet. However, this is not possible in our scenarios: for any honest sender S and receiver R, message M is encrypted and rerandomized at the mix nodes.

Privacy Analysis Compared with Onion-Routing-Based Scheme
In this section, we compare our proposed mix network scheme with the possible use of an onion-routing (OR) scheme. For this purpose, we introduced the concept of attack range that denotes the communication area that the attacker can access. The attacker can gain access to all messages exchanged in this area. As we previously mentioned, one of the drawbacks of common OppNet scenarios is that it might be relatively easy to monitor the entire communication network. In general, we assumed that, if the attacker can monitor the communication of the entire onion-routing circuit, then the attacker could break the circuit. This means that the attacker could monitor the whole circuit or communication path, violating the privacy of the circuit from sender and receiver nodes.
For example, suppose that a user sends a message from source node V s to destination node V d using three onion routers OR 1 , OR 2 , OR 3 , V s → OR 1 → OR 2 → OR 3 → V d . In order to go from one node to the other, the message might follow an epidemic routing path through different network nodes as described in Section 3.2. That is, we have something similar to the scenario described in Figure 1, but using onion routers instead of mix nodes. In this case, if all nodes are inside attack range, then the attacker can trace communication V s → V d . Depending on the size of the attack range, the attacker could trace a given number of communications.
In order to illustrate this in our OppNet scenarios, we simulated the attack range with the random-walk model in the ONE simulator. The entire simulation movement area was 800 × 800. The number of ORs was three (the common number of onion routers used in Tor). Table 6 indicates the percentage of onion-routing circuits that an attacker could break in terms of the size of the attack range, which is expressed as a percentage of the total geographical area of the network.
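The circuit-breaking criterion can be sketched with a deliberately simplified model. The sketch assumes static node positions drawn uniformly in the 800 × 800 area, a square attack region anchored at the origin, and a circuit consisting only of the sender, three onion routers, and the receiver (intermediary relay nodes are ignored). These are assumptions made for illustration: the sketch does not reproduce the random-walk simulation or the values in Table 6, and all function names are hypothetical.

```python
import random

AREA_SIDE = 800.0  # the simulation area is 800 x 800


def attack_square(fraction):
    """Square anchored at the origin covering `fraction` of the area (an assumption)."""
    side = AREA_SIDE * fraction ** 0.5
    return (0.0, 0.0, side, side)


def inside(point, region):
    x, y = point
    x0, y0, x1, y1 = region
    return x0 <= x <= x1 and y0 <= y <= y1


def circuit_broken(path_positions, region):
    """A circuit counts as broken when every node on it lies inside the attack range."""
    return all(inside(p, region) for p in path_positions)


def estimate_broken_fraction(fraction, path_len=5, trials=10_000, seed=0):
    """Monte Carlo estimate of the share of broken circuits under the toy model."""
    rng = random.Random(seed)
    region = attack_square(fraction)
    broken = 0
    for _ in range(trials):
        path = [(rng.uniform(0, AREA_SIDE), rng.uniform(0, AREA_SIDE))
                for _ in range(path_len)]
        broken += circuit_broken(path, region)
    return broken / trials


region = attack_square(0.25)
```

Under this toy model, a circuit survives as soon as a single node falls outside the monitored area, so the share of broken circuits grows quickly as the attack range approaches the whole area.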
From the table, we can see that the larger the attack range is, the less privacy is achieved using onion routing. With regard to our mix network scheme, it is impossible to connect all packets with the whole circuit even if the attack range is 100%. This is due to the nature of the mixes, which prevents the correlation of source and destination even if the attacker can monitor the whole network. So, the privacy of our mix network scheme remains the same regardless of the range that the attacker can access.

Conclusions
In this paper, we proposed the use of a mix network to provide anonymous communications in OppNets. Our approach relies on a free-route mixnet with opportunistic routing between mix nodes. We showed that the performance penalty introduced by using such a system in common OppNet scenarios is tolerable, making it an interesting solution to consider.
As we pointed out in Section 2, an onion-routing mechanism cannot ensure anonymous communication in some situations, as it can be compromised by an attacker who observes both entry and exit nodes. Our mixnet-based proposal has interesting advantages as compared to existing approaches that are mainly based on some form of onion routing. One of the main advantages is making the system more secure against traffic analysis. Some typical OppNet scenarios involve wireless communications that can be easily monitored. Even monitoring the whole network is not unrealistic. The use of mix nodes provides more resistance to such attacks at the expense of introducing a larger penalty to the performance of the network, both in terms of message-delivery ratio and delay. Our goal here was to show that such a penalty can be tolerable in most cases.
There are some potential extensions to our presented work. As network traffic in OppNets