
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

Wireless communication between sensors allows the formation of flexible sensor networks, which can be deployed rapidly over wide or inaccessible areas. However, the need to gather data from all sensors in the network imposes constraints on the distances between sensors. This survey describes the state of the art in techniques for determining the minimum density and optimal locations of relay nodes and ordinary sensors to ensure connectivity, subject to various degrees of uncertainty in the locations of the nodes.

This survey provides an overview of wireless sensor network (WSN) connectivity, and discusses existing work on connectivity issues in WSNs. In particular, we are interested in maintaining connected WSNs and their connectivity-related characteristics, including sensor node placement and the construction of a small connected relay set. We aim to review the existing results on these topics extensively, and to stimulate new research.

Sensor networks have a long history, which can be traced back as far as the 1950s. It is generally recognized that the first notable sensor network was the Sound Surveillance System (SOSUS) [

Advances in technology have driven sensor networks far from their original form. With the emergence of integrated sensors with embedded wireless capability, most current sensor networks consist of a collection of wirelessly interconnected sensors, each equipped with sensing, computing and communication components. These sensors can observe and respond to phenomena in the physical environment [

In a WSN, after collecting information from the environment, sensors need to transmit aggregated data to gateways or information collection nodes. It is important to ensure that every sensor can communicate with the gateways. Due to the multi-hop communication of WSNs, a sufficient condition for reliable information transmission is full connectivity of the network. A network is said to be fully connected if every pair of nodes can communicate with each other, either directly or via intermediate relay nodes. Due to the large number of sensors in a WSN, the total cost could be high for the whole network, though the cost of each individual sensor is low. Therefore, it is important to find the minimum number of nodes required for a WSN to achieve connectivity.

Another important related problem for WSNs is finding a small connected relay set to assist in routing. Multi-hop WSNs need to perform efficient routing. Since mobile ad hoc networks (MANETs) and WSNs often have very limited, or even no, fixed infrastructure, the routing process in such networks is often complicated and inefficient; it can generate a large amount of overhead, and there are many possible paths, due to the broadcast nature of wireless communications. Thus it is helpful to find a small connected set of sensor nodes to form a routing “backbone”, and restrict all other nodes to connecting to this backbone by a single hop. This node set can also help to resolve the broadcast storm problem [

As WSNs may be deployed in inaccessible terrain, and may contain a tremendous number of sensor nodes, it is often difficult or impossible to replace or recharge their batteries. Thus, energy conservation is critical for WSNs, both at the level of individual sensor nodes and at the level of network-wide operations. Various approaches have been proposed to reduce the energy consumption of sensor networks. For example, for network-level operations such as routing, if only a small fraction of sensors are involved in the routing process, the rest of the sensors can be turned off to save energy. This scheme is supported by hardware and software advances that provide the capability of temporarily shutting down sensors that are not involved in any network operations. For instance, Rockwell's WINS sensor nodes can achieve a factor of ten power reduction by shutting down the radio transceiver, compared to idle nodes whose transceivers are on [

The remainder of this paper is organized as follows. Section 2 gives a brief introduction to the graph models applied to wireless network investigations. Section 3 provides an overview of the prior results for connectivity studies in wireless ad hoc networks and WSNs, including percolation theory. Section 4 describes models with more general radio coverage patterns, and some hybrid models. The implications of connectivity on the achievable capacity are discussed in Section 5. Section 6 considers the construction of a small connected relay set, such that packet delivery can be achieved by forwarding packets using only sensors in the relay set. Section 7 covers the optimal placement of sensor nodes, which has a fundamental impact on the connectivity and other operational requirements of WSNs. Section 8 summarizes this survey.

Connectivity is critical for WSNs, as information collected needs to be sent to data collection or processing centres. This is only possible if there is a path from each node to that collection centre. The connectivity of a WSN is usually studied by considering a graph associated with that WSN.

A WSN or a wireless ad hoc network is often represented by a graph in which vertices correspond to the communication nodes, and a directed edge from one vertex to another indicates that the node corresponding to the former can send data directly to the node corresponding to the latter. It is common to assume that propagation conditions can be modelled simply by there being a “transmission range” within which transmission is possible, and outside of which it is impossible. If all nodes have equal transmission ranges, then the graph becomes undirected.

A network is called connected if this associated graph is connected. A graph

Weaker notions of connectivity are also possible, such as the requirement that each node need only be connected to one of a set of base stations [

It is clear that the connectivity of a WSN is related to the positions of the nodes, and those positions are heavily affected by the method of sensor deployment. In general, there are two approaches to deploying sensors in a WSN: deterministic deployment, where sensors are placed exactly at pre-engineered positions, and random deployment, where nodes are deployed at random positions. In deterministic deployment, networks are carefully planned, and nodes are placed at desired positions. If the specifications of the nodes are known, it is not difficult to determine whether the network is connected, and if not, to add relay nodes where needed. Although deterministic deployment has many advantages, in order to reduce installation costs, it has often been proposed that large WSNs containing very large numbers of nodes be deployed randomly. Nodes may be dispersed by a moving vehicle or artillery shell [

Random graphs are often applied to model communication networks to highlight their randomness. Mathematically, a random graph is a graph that is generated by a stochastic process [

As the probability of an edge existing between each pair of nodes is equal in an Erdős-Rényi graph, this model is not well suited to WSNs, which are embedded in two (or three) dimensional space, and in which the probability of a link existing is much higher between nodes that are geometrically close. Moreover, as discussed by Chlamtac and Faragó [

A natural candidate for random network modelling is the class of geometric random graphs G = (V, E), in which the vertex set V is embedded in a metric space and the existence of an edge in E between two vertices depends on the distance between them.

An important special case of geometric graphs is the class of unit disk graphs. A unit disk graph is a graph embedded in Euclidean space that has an edge between any two vertices whose Euclidean distance is less than 1. If the positions of vertices of the unit disk graph are random, this unit disk graph is a random graph, and it is a random geometric graph if the locations of the vertices are i.i.d. and uniformly distributed.
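As an illustration, a random geometric graph and its connectivity can be checked with a few lines of code. The sketch below (plain Python, with illustrative parameter values) drops n points i.i.d. uniformly in the unit square, links pairs closer than a chosen radius, and tests connectivity by breadth-first search:

```python
import math
import random
from collections import deque

def random_geometric_graph(n, radius, seed=None):
    """Drop n points i.i.d. uniformly in the unit square and connect
    every pair at Euclidean distance below `radius` (the unit disk
    graph rule, rescaled)."""
    rng = random.Random(seed)
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    adj = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(pts[i], pts[j]) < radius:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def is_connected(adj):
    """Breadth-first search from node 0; the graph is connected iff
    every node is reached."""
    if not adj:
        return True
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(adj)

# Illustrative instance: 200 nodes, radius 0.2.
g = random_geometric_graph(200, 0.2, seed=1)
print(is_connected(g))
```

Repeating the experiment over many seeds and radii gives an empirical connectivity probability, the quantity studied analytically in the results below.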

Gilbert [

One of the most interesting questions regarding the connectivity of random WSNs concerns finding limiting regimes for which the connectivity becomes almost sure to occur. Typically these regimes involve the number of nodes becoming large.

Among the most celebrated results is that of Gupta and Kumar [ ], who showed that if n nodes are placed uniformly at random in a disk of unit area, and each node has transmission radius r(n) with πr(n)^2 = (log n + c(n))/n, then the resulting network is connected with probability approaching one as n → ∞ if and only if c(n) → ∞.

Before returning to this type of result, let us review some results for infinite networks.

Many of the results concerning connectivity of large ad hoc networks are derived from results in percolation theory. This section gives an introduction to percolation theory, and reviews some existing work related to network connectivity studies. An introduction to the theoretical aspects of percolation can be found in [

The notion of percolation was developed by Broadbent [

Mathematically, percolation theory studies infinite random graphs, which contain an infinite set of vertices. The central question is whether such a graph contains an infinite connected component. An infinite connected component corresponds to a communication network of infinitely many nodes spread over a large geographic plane. Therefore, the existence of an infinite connected component implies the capability of long distance communication via multi-hop links.

Early studies of percolation constrained vertices to lie on a regular lattice. This requirement was relaxed by Gilbert [ ], who considered vertices placed according to a Poisson point process of density λ in the plane, with an edge between any two vertices within a fixed distance of each other. This model exhibits a phase transition: there is a critical density λ_c such that, with probability one, an infinite connected component exists if λ > λ_c, and no infinite component exists if λ < λ_c. The exact value of λ_c remains unknown, although increasingly tight bounds on it have been obtained.
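The phase transition can be observed numerically. The following sketch is a finite-window approximation with illustrative parameter values (the fixed point count stands in for a true Poisson sample): it estimates the fraction of nodes in the largest connected component at a low and a high density.

```python
import math
import random
from collections import Counter

def largest_component_fraction(lam, side, radius, seed):
    """Sample roughly a Poisson point process of intensity `lam` in a
    side x side box (fixed-count approximation to the Poisson number of
    points), link points within `radius` using union-find, and return
    the fraction of points in the largest connected component."""
    rng = random.Random(seed)
    n = round(lam * side * side)  # approximation: deterministic count
    pts = [(rng.random() * side, rng.random() * side) for _ in range(n)]
    parent = list(range(n))

    def find(x):
        # Path-halving union-find lookup.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(pts[i], pts[j]) < radius:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj
    sizes = Counter(find(i) for i in range(n))
    return max(sizes.values()) / n

# Subcritical vs supercritical density (mean degree lam*pi*r^2).
low = largest_component_fraction(0.2, 20, 1.0, seed=2)
high = largest_component_fraction(3.0, 20, 1.0, seed=2)
print(low, high)
```

At low density the largest component contains only a small fraction of the nodes, while above the critical density almost all nodes merge into one giant component, mirroring the infinite-plane dichotomy in a finite window.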

Designing a large network to achieve full connectivity is much more demanding than designing to ensure that a large percentage of nodes are connected, with an associated increase in energy requirements. This is quantified by simulation in [

Percolation theory can also be applied to 1-dimensional (1-D) networks, such as sensor networks following rivers. Percolation is much more limited in 1-D, since there is much less variety of possible paths than in higher dimensions. However, some fundamental results are known. For example, the asymptotic connectivity probability of a 1-D network exhibits a strong zero-one law even if the locations of all the nodes independently follow an arbitrary identical distribution, as long as this distribution admits a non-vanishing density function [

The effect of a phase transition to partial connectivity in infinite networks is mirrored by a similar effect for complete connectivity in large but finite networks.

Philips

It can be seen that this conjecture, in two dimensions, is established and tightened by (1), which replaces

Piret [

So far, we have considered the problem of determining the radius required to ensure a high probability of connectivity in large networks. The converse problem of estimating the probability of connectivity for a given radius is of considerable engineering interest.

Consider nodes placed uniformly in a square region, with a pair of nodes connected if their distance is below a threshold r_0. As the number of nodes becomes large, the probability that the network is connected (or

The analogous problem of finding the probability of a finite 1-D network being connected can be addressed directly. Considering a 1-D interval with
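The 1-D case is simple enough to admit a closed form: a network of n i.i.d. uniform points on [0,1] with range r is connected exactly when all n − 1 inter-point gaps are at most r, and the standard inclusion-exclusion over uniform spacings gives the probability directly. The sketch below implements that expression together with a Monte Carlo check; the parameter values are illustrative.

```python
import math
import random

def exact_1d_connectivity(n, r):
    """P(n i.i.d. uniform points on [0,1] form a connected network with
    range r), via inclusion-exclusion over the n-1 inter-point gaps:
    sum_k (-1)^k C(n-1, k) * max(1 - k*r, 0)^n."""
    return sum((-1) ** k * math.comb(n - 1, k) * max(1.0 - k * r, 0.0) ** n
               for k in range(n))

def mc_1d_connectivity(n, r, trials, seed):
    """Monte Carlo estimate of the same probability: sort the points and
    check that no consecutive gap exceeds r."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = sorted(rng.random() for _ in range(n))
        if all(b - a <= r for a, b in zip(xs, xs[1:])):
            hits += 1
    return hits / trials

# Sanity check: for n = 2 the probability is 1 - (1 - r)^2.
print(exact_1d_connectivity(2, 0.25))   # 0.4375
```

The exact expression makes the sharp transition visible: for fixed r, the probability tends rapidly to 0 or 1 as n grows, consistent with the zero-one laws discussed above.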

In a practical sensor deployment, it may not be realistic to assume that the nodes are identically distributed. An alternative model, studied in [

As connectivity is so important to the functioning of a sensor network, some techniques have been proposed to improve connectivity by extending the transmission range of sensors. These techniques include cooperative transmission [

Cooperative transmission is an exploitation of distributed beam-forming. With cooperative transmission, the transmission power of different nodes carrying identical information can be accumulated to achieve a higher effective power, so that the transmission range is greatly enlarged and the connectivity of the whole network improved. To implement cooperative transmission, all nodes transmitting the same message must synchronize their transmissions and superimpose the emitted waveforms on the physical medium, so that the received powers add up to aid detection at the receiver. Techniques of this type are especially helpful in eliminating node isolation and network separation in large, sparse networks. Krohn

Another technique attracting renewed interest in the network connectivity area is the use of directional antennas [

Given the complexity of these two techniques on one hand, and the requirements of simple design of sensors on the other, clearly the challenge of improving connectivity in sensor networks still exists.

Geometric random graphs, such as the unit disk model, are tractable because of the high degree of symmetry they assume. In real networks, the coverage regions of different nodes have different areas, and are highly non-circular. Many more realistic models have been studied in the context of both percolation and connectivity of finite graphs.

If nodes can adjust their transmission powers independently, they have inhomogeneous transmission ranges. This model has also received considerable attention. Xue and Kumar [

Having inhomogeneous transmission ranges causes the corresponding graph to be directed. This is not desirable from an engineering point of view, both because it complicates routing, and also because it bars the use of link-layer acknowledgements and retransmission. The solution taken by Blough

The concept of “transmission range” used in most connectivity studies is a marked over-simplification. In real networks, the presence of a link depends on both the signal strength and the background interference. One of the main drawbacks of disk-based models is that they do not consider interference, though dense networks produce strong interference.

To take interference into account, Dousse

In real networks, obstacles block the path of the signal, and cause random anisotropic signal strengths. This effect, called

The most widely accepted model of shadowing is log-normal shadowing [ ], in which the attenuation (in dB) between two nodes differs from its distance-dependent mean by a zero-mean Gaussian random variable, the mean itself following a power law with path-loss exponent ζ.
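Under this model, the probability that a link exists at a given distance can be written in terms of the Gaussian tail function. The sketch below assumes the received power in dB equals its distance-dependent mean, decaying with path-loss exponent zeta, plus a zero-mean Gaussian of standard deviation sigma dB; R denotes the deterministic range obtained when sigma = 0. All parameter values are illustrative.

```python
import math

def link_probability(d, R, zeta, sigma):
    """P(link exists at distance d) under log-normal shadowing.
    Assumed model: received power (dB) = mean(d) + N(0, sigma^2), with
    mean(d) decaying as -10*zeta*log10(d); R is the distance at which
    the mean power equals the detection threshold (range when sigma=0).
    Then P(link) = Q(10*zeta*log10(d/R) / sigma), with Q the Gaussian
    tail function, computed here via erfc."""
    if sigma == 0:
        return 1.0 if d <= R else 0.0
    x = 10.0 * zeta * math.log10(d / R) / sigma
    return 0.5 * math.erfc(x / math.sqrt(2))

# At d = R the link exists with probability exactly 1/2.
print(link_probability(10, 10, 3, 6))  # 0.5
```

The hard-threshold disk model is recovered as sigma → 0; with shadowing, some links shorter than R fail while some longer ones succeed, which changes both the node degree distribution and the connectivity behaviour.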

Bettstetter and Hartmann [

Other models of shadowing are also possible. An interesting result arises if we consider the coverage area of each node to be a deterministic but irregular region, rather than a disk. Booth

Driven in part by results showing that the capacity per node decreases in large networks [

Network capacity or throughput is an important constraint when considering the connectivity of ad hoc networks. In a wireless ad hoc network, increasing the transmission power increases the transmission distance of each node, which in turn increases the probability that the network is connected. However, high power results in severe interference within the network, which reduces the network capacity and degrades decoding performance at receivers. On the other hand, reducing the transmission range by reducing the transmission power limits the interference, but reduces the probability of connectivity and increases the number of hops required to reach the destination. Therefore, the tradeoff between connectivity and network capacity has been widely studied.

As mentioned previously, the work of Gupta and Kumar [

The fact that the capacity scales less than linearly with the number of nodes has prompted the study of hybrid networks containing some wired base stations. However, it has been shown that a linear scaling of capacity with the number of nodes is possible if the modeling assumptions are relaxed.

As already mentioned, Ozgur, Leveque, and Tse [

Dousse

Note that the connectivity requirement is only an issue if the designer has control over the transmission range, but cannot add wireless relay nodes. An often overlooked consideration in these studies is that the

To investigate the relationship between connectivity and capacity, Dousse and Thiran [

A different approach to studying the capacity of sensor networks was taken in the classic work [

This work either implicitly or explicitly assumed that the network is connected. For example, [

Simply maintaining the connectivity of a WSN is not sufficient for data dissemination. Routing must also be considered. This section will consider connectivity issues arising in one popular approach: cluster-based routing.

Routing is a major challenge for WSNs. The large number of sensors involved means that efficient routing is required. However, their typical data rates are much lower than other large networks such as the internet. Hence the relative cost of the signalling overhead of standard routing algorithms such as distance vector routing [

One common solution to these problems is to perform routing through a specific subset of nodes, called a virtual backbone, to which all other nodes connect in a single hop. A good virtual backbone can simplify the routing process and can reduce the overall network energy consumption in two ways. First, as only nodes in the virtual backbone forward packets, non-backbone nodes can spend more time in a low-power idle mode. Second, all sensor nodes need to perform in-network processing and data aggregation. Doing this within virtual backbone nodes can eliminate redundant data and relax the packet transmission burden, which leads to energy saving.

For a given WSN, we typically wish to find a virtual backbone with the minimum number of nodes, to maximise the potential energy saving. However, successful packet delivery requires that the nodes in the virtual backbone remain connected, and that every other node is within range of a backbone node.

In this section we first introduce cluster-based routing, which groups nodes according to their geometric positions, and selects a head node for each group (i.e., cluster). All the cluster heads form a virtual backbone. We also discuss a graph-theoretic concept, the Minimum Connected Dominating Set (MCDS), which models a small connected relay set, and cover state-of-the-art algorithms for MCDS construction.

Clustering is an effective way to achieve efficient routing in WSNs [

The usage of cluster-based routing in WSNs is also motivated by the requirements of in-network processing and data aggregation to reduce energy consumption, as such operations can be spontaneously performed at cluster head nodes. It should be noted that once the clusters are formed in a WSN, traditional proactive routing schemes such as distance vector routing and link state routing, as well as reactive routing approaches such as DSR [

Cluster-based routing is a special example of backbone-based routing [

A virtual backbone of a WSN can be modelled by a

Unfortunately, finding an MCDS in a given connected graph is not as easy in practical situations as in the previous example. This problem is known to be NP-hard [

Note the difference between heuristics and approximation algorithms. An approximation algorithm with performance ratio ρ is guaranteed, for every input, to return a solution whose size is at most ρ times the size of an optimal solution; a heuristic offers no such worst-case guarantee, although it may perform well in practice.

Existing algorithms for constructing an approximate MCDS can be classified into three categories: constructive, pruning-based and multipoint-relay-based algorithms. Constructive algorithms approximate the MCDS in a graph by gradually adding nodes to a candidate set. In contrast, pruning-based algorithms begin by taking a large candidate set, then detect and remove redundant nodes to eventually obtain a small CDS. The last type, multipoint-relay-based algorithms, allow each node to determine its smallest one-hop message relay set; all nodes selected as relay nodes for a particular message relaying form a CDS.

For all three types of algorithms, the connectivity among the relay nodes must be considered. A common approach used by constructive algorithms is first to select a subset of a CDS, called a

These three types of algorithms are described in more detail in the following subsections. Typical examples of each type are also discussed.

Guha and Khuller's seminal paper [

The first algorithm approximates the MCDS by greedily creating a tree
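A minimal sketch in the spirit of this tree-growing idea (not Guha and Khuller's exact algorithm, and without its approximation guarantee) starts from a maximum-degree node and repeatedly adds the already-dominated node that dominates the most new nodes:

```python
def greedy_cds(adj):
    """Constructive CDS heuristic: grow a connected set from a
    maximum-degree node; at each step add the dominated-but-unselected
    node that dominates the most still-undominated nodes. Restricting
    candidates to dominated nodes keeps the selected set connected.
    Assumes `adj` is the adjacency dict of a connected graph."""
    nodes = set(adj)
    start = max(nodes, key=lambda u: len(adj[u]))
    cds = {start}
    dominated = {start} | set(adj[start])
    while dominated != nodes:
        best = max((u for u in dominated - cds),
                   key=lambda u: len(set(adj[u]) - dominated))
        cds.add(best)
        dominated |= set(adj[best]) | {best}
    return cds

# Hypothetical 6-node topology for illustration.
demo = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0, 4], 4: [3, 5], 5: [4]}
backbone = greedy_cds(demo)
print(backbone)  # {0, 3, 4}
```

Every node is either in the backbone or adjacent to it, and the backbone induces a connected subgraph, so packets can be delivered by forwarding only through backbone nodes.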

To develop the second algorithm, Guha and Khuller introduced a new concept called

A distributed algorithm is called

Neither of the above algorithms is localized, as they need global information to select the nodes with maximum degree of white neighbors or

Another distributed heuristic to compute the MCDS for a wireless ad hoc network was proposed by Alzoubi and Wan [

Besides constructive algorithms, there also exist other algorithms which are based on pruning procedures to approximate MCDS. As indicated by its name, a pruning-based algorithm gradually reduces a candidate CDS according to some greedy criteria, and the left-over set after the reduction is the approximate MCDS.

Wu and Li [
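The marking process commonly attributed to Wu and Li can be sketched as follows. As usually stated, the rule marks a node whenever it has two neighbours that are not directly connected to each other; for a connected, non-complete graph the marked set forms a CDS (the additional pruning rules that shrink this set further are omitted here).

```python
def wu_li_marking(adj):
    """Marking process (as commonly described for Wu and Li's method):
    node u marks itself if some pair of its neighbours v, w are not
    adjacent, i.e. u lies on the only short path between them locally.
    Returns the set of marked (gateway) nodes."""
    marked = set()
    for u, nbrs in adj.items():
        for i, v in enumerate(nbrs):
            for w in nbrs[i + 1:]:
                if w not in adj[v]:   # v and w not directly connected
                    marked.add(u)
                    break
            if u in marked:
                break
    return marked

# On a path 0-1-2-3, the two interior nodes are marked.
print(wu_li_marking({0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}))  # {1, 2}
```

Note that the rule needs only two-hop neighbourhood information, which is why this pruning-based approach is fully localized, in contrast to the centralized constructive algorithms above.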

Wu

Butenko

Multipoint relaying is a technique that is widely used for flooding in wireless ad hoc networks. In multipoint relaying, each node
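A greedy multipoint relay selection can be sketched as follows: a node picks one-hop neighbours, highest new coverage first, until every strict two-hop neighbour is covered. This is a generic set-cover greedy, offered as an illustration rather than any specific published algorithm.

```python
def greedy_mpr(adj, u):
    """Greedy multipoint relay selection for node u: choose one-hop
    neighbours of u until every strict two-hop neighbour of u (reachable
    in two hops but not one) is covered by a chosen relay."""
    one_hop = set(adj[u])
    two_hop = set()
    for v in one_hop:
        two_hop |= set(adj[v])
    two_hop -= one_hop | {u}          # strict two-hop neighbours only
    mprs, uncovered = set(), set(two_hop)
    while uncovered:
        # Pick the unchosen neighbour covering the most uncovered nodes.
        best = max(one_hop - mprs,
                   key=lambda v: len(set(adj[v]) & uncovered))
        mprs.add(best)
        uncovered -= set(adj[best])
    return mprs

# Hypothetical topology: node 1 alone covers both two-hop neighbours of 0.
demo = {0: [1, 2], 1: [0, 3, 4], 2: [0, 4], 3: [1], 4: [1, 2]}
relays = greedy_mpr(demo, 0)
print(relays)  # {1}
```

When every node rebroadcasts only via its selected relays, each flooded message still reaches the whole network while most retransmissions are suppressed, and the union of selected relays forms a CDS.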

Note that in [ ], the computational complexity is O(n^2), where

However, as pointed out by Wu [

Other MPR-based CDS construction heuristics include Chen

As the MCDS problem is NP-complete, future research will likely focus on providing heuristic algorithms and approximations that are efficient and scalable. Another possible new research direction is to focus on networks with specific structures, as done by Li

The placement of nodes largely influences the operations and performance of WSNs, as sensor nodes must be able to observe events of interest, and transmit the information to data collection centres. Moreover, sensor placement also affects the resource management in WSNs [

According to the roles that the deployed nodes play, node placement can be classified into placement of ordinary nodes and placement of relay nodes. The former focuses on the deployment of normal sensors, while the latter places a special type of node responsible for forwarding packets.

The first step required for a WSN to perform its designed functions is deploying all the sensor nodes to form the network. As mentioned previously, sensors can be placed exactly at carefully engineered positions, or scattered in bulk at random positions [

Most work on deterministic placement seeks to determine the “optimal” placement pattern. Different optimality criteria are used, according to the applications and goals of the WSNs. A common objective is to minimise the number of sensors required, subject to the constraint that the whole sensing field is monitored by the deployed sensors. This is equivalent to finding the minimum number of nodes such that every position in the sensing field is within the sensing range of at least one node.

Minimising the number of sensors can take the form of an “art gallery” problem [

However, in the art gallery problem, all security guards are assumed to have infinite vision if there are no obstacles. This assumption does not hold for WSNs, in which sensor nodes have limited sensing ranges. It has been shown that arranging sensors at the centres of regular hexagons is optimal for a WSN with a large sensing field, given that all the sensor nodes have identical limited sensing ranges [
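This hexagonal pattern is equivalent to placing sensors on a triangular lattice with spacing √3·r, so that every point of the field lies within sensing range r of some sensor. The sketch below generates such a placement for a rectangular field and numerically spot-checks coverage on a sample grid (dimensions and ranges are illustrative).

```python
import math

def hex_lattice_sensors(width, height, r):
    """Sensor positions at the centres of regular hexagons with
    circumradius r, i.e. a triangular lattice with horizontal spacing
    sqrt(3)*r and staggered rows 1.5*r apart; the classic low-density
    pattern for covering a large field with sensing range r."""
    s = math.sqrt(3) * r      # horizontal spacing within a row
    dy = 1.5 * r              # vertical spacing between staggered rows
    sensors, row, y = [], 0, 0.0
    while y <= height + dy:
        offset = (s / 2) if (row % 2) else 0.0
        x = -s + offset       # start slightly outside to cover edges
        while x <= width + s:
            sensors.append((x, y))
            x += s
        y += dy
        row += 1
    return sensors

def fully_covered(sensors, width, height, r, step=0.05):
    """Check coverage on a sample grid: a numerical sanity check over
    grid points, not a geometric proof."""
    nx, ny = int(width / step) + 1, int(height / step) + 1
    for i in range(nx):
        for j in range(ny):
            p = (i * step, j * step)
            if not any(math.dist(p, sp) <= r + 1e-9 for sp in sensors):
                return False
    return True

# A 2 x 2 field with sensing range 0.5 is covered by the lattice.
sensors = hex_lattice_sensors(2, 2, 0.5)
print(len(sensors), fully_covered(sensors, 2, 2, 0.5))
```

The lattice works because any point lies inside some equilateral triangle of side √3·r formed by three sensors, and the circumradius of that triangle is exactly r; shrinking the assumed sensing range below r breaks coverage near the triangle circumcentres.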

The above work on node placement considered only the coverage constraint: that the WSN must be able to observe any position within the sensing field. It either did not discuss connectivity, or implicitly assumed that the WSN formed by the obtained pattern was connected, regardless of the transmission ranges of the sensor nodes. Nevertheless, this assumption may not be true. To find the optimal node placement pattern subject to both the coverage and connectivity constraints, Biagioni and Sasaki [

Iyengar

Studies on objectives other than minimising the total number of sensors are also available. Khan [

The above-mentioned work [

Models of random placement usually assume that nodes are independently and uniformly distributed in space. It is not clear that this is a suitable model, although its tractability is appealing. Moreover, it has been shown that several models of random motion of nodes eventually yield a uniform distribution of nodes. Blough

In a WSN, if a small set of special nodes whose main function is packet forwarding is deployed, the management and network operations in the WSN can potentially be simplified drastically. These nodes are called relay nodes, and have attracted considerable interest [

The problem of relay node placement can be categorized as either single-tiered or two-tiered, according to the data forwarding scheme adopted by the WSN. If both relay nodes and ordinary sensor nodes can forward packets, the placement is single-tiered. In contrast, in a WSN with two-tiered relay node placement, only relay nodes can forward packets. In a two-tiered system, the relays form a virtual backbone, and must be a connected dominating set (CDS) of the WSN.

We survey the prior work related to these two types of relay placement. In the following discussion, the transmission ranges for relays and ordinary sensors are denoted

Cheng

In order to provide fault-tolerance, Kashyap ^{2}) was developed. Recently, Zhang

The setting in which only relay nodes can perform packet forwarding is known as the two-tiered infrastructure. Pan

Apart from the above unconstrained relay node placement, in which relay nodes can be placed anywhere, recent work has started to investigate the constrained relay node placement problem, which captures practical considerations, such as interference or forbidden regions, that prevent relay nodes from being placed at certain positions. Some recent work investigating the constrained relay node placement problem is given in [

Wireless Sensor Networks have the potential to revolutionize our everyday life, as they provide a flexible approach for observing the surrounding environment and responding to events. The availability of tiny battery-powered sensor nodes, embedded with sensing, processing, and communication capabilities, and wirelessly networked together via multi-hop communication, increases the opportunities for WSNs to find applications in a wide range of areas. Along with the opportunities, there are also challenges and requirements for the successful deployment and operation of WSNs. This survey has focused on the implications of the need for connectivity.

Ensuring connectivity of a WSN is challenging when sensors have random locations, either because of mobility or initial random deployment. A substantial body of literature has been written on this problem, including deep theoretical results applied to simple models of i.i.d. uniformly distributed nodes with circular radio footprints. This model is widely accepted as it is analytically tractable. An important open research problem is to generalise these results to more realistic propagation models, including effects such as shadowing and non-uniform distribution of nodes, and to determine what engineering insights can be drawn from the theoretical asymptotic results.

Connected subsets of nodes also play an important role in WSNs. As we have seen, cluster routing uses a connected “backbone” of nodes to simplify routing, and minimise the work required from the majority of sensor nodes. Finding the smallest such backbone is equivalent to finding a Minimum Connected Dominating Set in the corresponding graph, which is known to be NP-complete. Further research is needed both into finding more efficient, more accurate or simpler suboptimal solutions to the MCDS problem, and also into the benefits which can be obtained by using non-minimum backbones. Such benefits include reduced path lengths, and increased resilience. Such research will play an important part in bringing about the benefits that sensor networks have to offer.

An example of unit disk graphs.

An example of Minimum Connected Dominating Set.