Sensors
  • Article
  • Open Access

17 November 2010

Dynamic Load Balancing Data Centric Storage for Wireless Sensor Networks

1 Department of Computer Engineering, Chungju National University, 72 Daehak-ro, Chungju-si, Chungbuk 380-702, Korea
2 (C) Gain IT/ #503, 505, Daejeon Intelligent Robot Engineering Center, 629 Tamnip-dong, Yuseong-gu, Daejeon, Korea
3 Department of Venture and Business, Jinju National University, 150 Chilam-dong, Jinju-si, Kyeongnam 660-758, Korea
4 Department of Electronics Engineering, Mokwon University, Mokwon Gil 21, Seo-gu, Daejeon, 302-318, Korea
This article belongs to the Special Issue Selected Papers from The 2nd International Conference on Advanced Science and Technology

Abstract

This paper proposes a new data centric storage scheme that adapts dynamically to changes in workload. The proposed scheme distributes the load of hot-spot areas to neighboring sensor nodes by using a multilevel grid technique. It can also use existing routing protocols such as GPSR (Greedy Perimeter Stateless Routing) with small changes. Simulation shows that the proposed method extends the lifetime of sensor networks compared with one of the state-of-the-art data centric storage schemes. We also implement the proposed method on an operating system for sensor networks and evaluate its performance by running it in a simulation tool.

1. Introduction

Recently, wireless sensor networks have attracted great interest because they provide a means for continuous, real-time monitoring of and interaction with the physical world. The sensor nodes that make up a sensor network continuously collect physical data such as location, CO2, temperature and humidity. The collected data is either stored in the sensor network or transmitted to a base station for further query processing. Many studies have proposed techniques for energy efficient data storage and query processing in sensor networks. Among them, data centric storage (DCS) is one of the most interesting techniques [1].

Data centric storage (DCS) stores data in a sensor network by its values. Each sensor reading (event) is mapped to an owner sensor node by a hashing function based on the values of the event’s attributes. The event is routed to the owner node from the original sensor node according to some routing protocols, such as greedy perimeter stateless routing (GPSR) [2]. Therefore, all events with the same value are stored at the same owner node.

Existing DCSs lack a strategy for handling hot-spots. A storage hot-spot occurs when many events are mapped to a small number of sensor nodes. Because sensor nodes have limited storage, a storage hot-spot increases the rate at which overloaded sensor nodes drop events. Queries for events in a storage hot-spot may be delayed due to contention at the storage sensor nodes and the surrounding sensor nodes. The sensor nodes in the hot-spot may also consume energy quickly because of the high load of processing queries and storing sensor readings. A growing number of dead sensor nodes reduces the coverage area and causes coverage gaps to form within such areas.

In distributed index for multi-dimensional data (DIM) [3] which is one of the better-known DCSs, events are mapped to sensor nodes based on a K-D tree. The union of its leaf nodes covers the whole sensor network area and a leaf node contains one or zero sensor nodes. The bisection history of a leaf node is transformed to the address of a sensor node contained in the leaf node. When a sensor node generates an event, it maps the event to an address based on a repetitive fixed uniform splitting of its attributes’ ranges in a round robin fashion. The event is transmitted to the owner node using GPSR. DIM has two major problems. One problem is that there may exist orphan regions that contain no sensor nodes. The nodes contained in the neighbour regions of orphan regions have to take care of the load for the orphan regions. The other is the storage hotspot problem discussed above.

K-D tree based data centric storage (KDDCS) [4] was proposed to solve the problems of DIM. In KDDCS, to avoid orphan regions, a region is divided so that both partitioned regions contain the same number of sensor nodes if possible. Consequently, a KDDCS K-D tree is balanced and there are no orphan regions. However, because of this feature, KDDCS must modify GPSR to use it as its routing technique. The modified GPSR may require more communication cost to send one message than the original GPSR.

In addition, to solve the storage hot-spot problems, KDDCS presents a K-D tree re-balancing (KDTR) algorithm. However, the KDTR requires extra communication costs to move data from a region to its neighbour region for load balancing. Besides DIM and KDDCS, various other DCSs have been proposed.

In [5], zone partitioning (ZP) and zone partial replication (ZPR) are proposed. ZP partitions the hot spot storage responsibility among a larger number of sensors, while ZPR replicates the hot spot in neighbours. The authors of [6] present dynamic balanced data-centric storage (DBAS), which uses a cooperative strategy between the base station and the in-network processing. A virtual grid based DCS is proposed in [7]. It uses a virtual grid to assign addresses to sensor nodes and distributes the load of a hot-spot area by using a multilevel grid technique. This paper is an extended version of [7].

In this paper, we propose a new data centric storage based on multilevel grid techniques. The proposed DCS divides a sensor network area with virtual grid techniques, so all cells cover areas of the same size. The proposed DCS uses GPSR with some modifications to send an event or query to a target, but the modified GPSR does not require any extra communication cost. It also distributes the load of a hot spot sensor node by using a multilevel grid technique without moving events.

This paper is organized as follows. Section 2 gives the description of DIM and KDDCS. Section 3 presents the proposed multi-level grid based DCS. Experimental results are discussed in Section 4, and we conclude in Section 5.

3. Proposed Data Centric Storage

3.1. Initial Process

The proposed DCS is based on virtual multilevel grid techniques. The geographic area of a sensor network is partitioned into 2^b rectangular cells, where b denotes the user-specified number of bits. In our method, the sensor network area is recursively divided into four quadrant cells according to the given level. Once the sensor network space has been divided in this way, we assign a unique bit-string of length b to each cell on each level. The bit-string for each cell on a level is determined by the Z-order. The bit-string of a lowest level cell is assigned to the sensor nodes that are included in the cell. The bit-string of an upper level cell is calculated by shifting the bit-string of any contained lower level cell to the right by two bits.
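The addressing scheme described above can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the choice to place column bits at even positions and row bits at odd positions, and the requirement that b be even are all assumptions made for illustration, and the cell numbering in Figure 3 may follow a different Z-order orientation.

```python
def interleave_bits(col: int, row: int, bits_per_axis: int) -> int:
    """Interleave column and row bits into a Z-order (Morton) code.

    Assumed convention: column bits go to even positions, row bits to odd ones.
    """
    code = 0
    for i in range(bits_per_axis):
        code |= ((col >> i) & 1) << (2 * i)
        code |= ((row >> i) & 1) << (2 * i + 1)
    return code


def cell_id(x: float, y: float, width: float, height: float, b: int) -> int:
    """Return the lowest level cell id of point (x, y) in a width x height area."""
    if b <= 0 or b % 2 != 0:
        raise ValueError("this sketch assumes an even, positive number of bits b")
    per_axis = b // 2
    side = 1 << per_axis  # number of cells along each axis; side * side == 2**b
    # Clamp so that points lying exactly on the far edge fall into the last cell.
    col = min(int(x / width * side), side - 1)
    row = min(int(y / height * side), side - 1)
    return interleave_bits(col, row, per_axis)


def parent_cell_id(cid: int) -> int:
    """Cell id one level up: shift the bit-string right by two bits."""
    return cid >> 2
```

With b = 4 and a 100 × 100 area, the four cells 8, 9, 10 and 11 all map to parent cell 2 under this convention, which agrees with the shift rule stated in the text.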

A cell may contain more than one sensor node. In this case, one of them becomes the cell header (CH). The cell header is selected by the sensor nodes in the cell and manages the other sensor nodes of the cell. Sensor nodes construct this virtual multilevel grid by themselves. When the network is deployed, the grid information, such as the geographic area of the sensor network, the level and the number of bits b, is broadcast to the sensor network or is given by the sensor network manager during the deployment of the sensor nodes. It is also assumed that the nodes are aware of their own locations and those of the neighbor nodes within their radio ranges. Each sensor node calculates a cell id (CID) from its location and assigns it to itself. Figure 3 shows an example when the given level is 3. Cells on each level have their own unique numbers.

Figure 3. Grid based index when the given level is 3.

3.2. Data Insertion of GDCS

When a sensor node detects an event (sensor reading), it forwards the event to the center point of a cell that is selected by mapping the event's value to a lowest level CID. A data packet consists of the event, the coordinates of the cell's center point and the CID of the cell. We use the GPSR algorithm with a simple modification to route the data packet to the target sensor node: we add a termination condition to the original GPSR that determines when forwarding stops. Whenever a sensor node receives a packet, it first compares the CID of the packet with its own CIDs. If the CIDs are the same, the sensor node stores the data of the packet. Otherwise, it performs the forwarding actions according to GPSR.
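As an illustration of the packet contents described above, the following Python sketch derives the center point of a cell from its CID and builds a data packet. The names `DataPacket`, `cell_center` and `make_packet` are hypothetical, and the bit layout (column bits at even positions, row bits at odd positions, b even) is an assumed Z-order convention. How an event's value is hashed to a CID is not specified here, so the CID is simply taken as an input.

```python
from dataclasses import dataclass
from typing import Tuple


def deinterleave(cid: int, bits_per_axis: int) -> Tuple[int, int]:
    """Split a Z-order code into (col, row) under the assumed bit layout."""
    col = row = 0
    for i in range(bits_per_axis):
        col |= ((cid >> (2 * i)) & 1) << i
        row |= ((cid >> (2 * i + 1)) & 1) << i
    return col, row


def cell_center(cid: int, width: float, height: float, b: int) -> Tuple[float, float]:
    """Return the center point of lowest level cell `cid`."""
    per_axis = b // 2
    side = 1 << per_axis  # cells per axis
    col, row = deinterleave(cid, per_axis)
    return ((col + 0.5) * width / side, (row + 0.5) * height / side)


@dataclass(frozen=True)
class DataPacket:
    event: float                    # the sensor reading
    target_xy: Tuple[float, float]  # center point of the target cell
    target_cid: int                 # lowest level CID of the target cell


def make_packet(event: float, cid: int, width: float, height: float, b: int) -> DataPacket:
    """Build the packet that is handed to the (modified) GPSR routing layer."""
    return DataPacket(event, cell_center(cid, width, height, b), cid)
```

GPSR only needs `target_xy` to make forwarding decisions; `target_cid` is what a receiving node compares against its own CIDs to decide whether forwarding should stop.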

In our method, the storage hot-spot problem can still occur. If relatively many events are mapped to a certain cell id, the sensor nodes in the cell will be overloaded. Consequently, some events are lost, and the sensor nodes run out of energy more quickly. Our proposed method uses multilevel grid techniques to handle the storage hot-spot problem. If a hot-spot cell on the lowest level is detected, the cell and its neighbor cells on the same level are merged into a higher level grid cell. A new cell id on the higher level is assigned to the sensor nodes that are contained in the higher level cell. After the merging process, the workload of the hot spot cell is distributed to all of the sensor nodes of the merged cell.

The merging process is performed by the sensor nodes themselves. When a CH detects that the insertion workload of its cell has exceeded a predefined threshold, it starts the merging process. The CH calculates the cell id at the higher level and sends it to the sensor nodes that are contained in the cell at the higher level.

For example, in Figure 4(a), the CH of cell 9 detects that the cell is a hot spot. The CH then merges cells 8, 9, 10 and 11 into the higher level cell 2 on level 2 to distribute its insertion and query processing load to neighbor sensor nodes. In the following, CID(L) denotes a CID on level L. To merge the cells, the CH sends merge messages to the sensor nodes contained in cell 2, and the sensor nodes that receive the message assign the new cell id 2 to themselves. Subsequently, the sensor nodes in the hot spot cell have the newly assigned CID, and the sensor nodes in the merged cell have two CIDs on different levels. Since a new cell has been created, a new CH must be selected. Initially, the CH of the hot spot cell becomes the CH of cell 2. Later, the CH is reselected according to the energy consumption of the sensor nodes.
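The merge trigger can be sketched in Python as follows. This is only an illustration under stated assumptions: the workload is measured as a plain count of insertions, the class `CellHeader` and the dictionary used as a merge message are hypothetical, and only a single merge from level 1 to level 2 is modeled.

```python
from typing import List, Optional


def cells_covered(parent_cid: int) -> List[int]:
    """Lower level cells contained in `parent_cid` (inverse of the 2-bit right shift)."""
    return [(parent_cid << 2) | quadrant for quadrant in range(4)]


class CellHeader:
    """Hypothetical cell header that counts insertions and triggers one merge."""

    def __init__(self, cid: int, threshold: int) -> None:
        self.cid = cid              # CID on level 1
        self.threshold = threshold  # assumed: maximum insertions before merging
        self.insertions = 0
        self.merged = False

    def record_insertion(self) -> Optional[dict]:
        """Count one insertion; return a merge message the first time the threshold is exceeded."""
        self.insertions += 1
        if self.merged or self.insertions <= self.threshold:
            return None
        self.merged = True
        new_cid = self.cid >> 2     # cell id on the next higher level
        return {
            "type": "merge",
            "new_cid": new_cid,
            "level": 2,
            "cells": cells_covered(new_cid),  # cells whose nodes receive the message
        }
```

For the hot-spot cell 9 of the example, the message carries the new cell id 2 and is addressed to the nodes of cells 8, 9, 10 and 11.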

Figure 4. Merge operation for handling hot-spots.

Events are inserted into sensor networks as follows. When a sensor node detects an event (sensor reading), it maps the event to a cell id at the lowest level. Then, it forwards a packet that consists of the event, the cell id and the geographic location toward the center point of the cell. A sensor node that receives the packet compares the cell id of the packet with its own cell ids. If they match, the packet is forwarded to the CH of the sensor node. The CH knows the energy consumption, the storage utilization and the values of the stored events of the sensor nodes, so it can select a suitable sensor node to store the data of the packet and send the packet to that sensor node.

If a sensor node has two or more cell ids on different levels, the cell id on the lowest level is first compared with the received packet's cell id, and then the cell ids on the higher levels are compared in turn until a matching cell id is found. If the level of the sensor node’s cell id is higher than that of the received packet, the sensor node converts the received cell id to the level of its own cell id by shifting bits before comparing them. Algorithm 1 shows the event insertion algorithm.
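The comparison rule can be written as a short Python sketch. The function name `matching_level` and the representation of a node's cell ids as a dictionary from level to CID are assumptions made for illustration.

```python
from typing import Dict, Optional


def matching_level(packet_cid: int, packet_level: int,
                   node_cids: Dict[int, int]) -> Optional[int]:
    """Return the level at which the packet's cell id matches one of the node's
    cell ids, checking the lowest level first, or None if nothing matches.

    `node_cids` maps a level to the node's cell id on that level.
    """
    for level in sorted(node_cids):
        if level < packet_level:
            # A coarser packet id cannot be refined to a finer level.
            continue
        # Each level up drops two bits of the Z-order bit-string.
        converted = packet_cid >> (2 * (level - packet_level))
        if converted == node_cids[level]:
            return level
    return None
```

For a node holding cell ids 9 on level 1 and 2 on level 2, a packet addressed to cell 10 on level 1 does not match on level 1, but 10 shifted right by two bits is 2, so it matches on level 2.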

A CH also consumes energy more quickly since it manages the sensor nodes and every insertion in a cell. Therefore, a CH of a cell should be reelected by considering the energy level of each sensor node in the cell. Since the CH knows the energy level of each sensor node, the CH can select the most proper sensor node as a new CH.

For example, in Figure 5, assume that the sensor node in cell 5 has an event e and that the lowest level cell id of e is 10. Sensor node A has cell ids 9(1) and 2(2); the numbers in parentheses, 1 and 2, denote the levels of the cell ids. Sensor nodes B and C have cell ids 9(1), 2(2) and 8(1), 2(2), respectively. All sensor nodes in cell 2(2) know that cell 9(1) requested a merge operation and that node B is the cell header. The sensor node in cell 5 starts to route the event e to cell 10. At that time, it does not know that cell 10(1) has been merged into 2(2). e is routed to node B through the nodes in cells 7(1) and 6(1). B has two cell ids, 9(1) and 2(2), so we first compare 9(1) with e’s cell id 10(1). They do not match, so we derive the level 2 cell id from e’s cell id 10(1) by shifting 10 (1010b) two bits to the right. The resulting level 2 cell id is 2 (10b), and we compare B’s 2(2) with e’s 2(2). They match, so e can be inserted into cell 2(2). The CH of cell 2(2) determines that A is the most suitable node, and e is inserted into node A.

Figure 5. Example of insertion.
Algorithm 1. Insertion algorithm.
/*
Function handle_event () is a callback function.
It is called whenever an event occurs.
*/
handle_event (event)
{
  if event is a new sensor reading
    /* target_cid is the lowest level cell id of the reading,
       target_xy is the center point of that cell */
    make a packet (event, target_cid, target_xy);
    send packet to a neighbor cell according to GPSR algorithm;
    end handle_event;
  end if
  if event is a received packet
    /* compare from the lowest level cell id upward */
    for (each cid in current_node.cell_ids)
      if (packet.cell_id, converted to the level of cid, is matched with cid)
        if (current_node is CH)
          send packet to a proper node in current_node's cell;
        else
          send packet to CH;
        end if
        end handle_event;
      end if
    end for
    /* no cell id matched: keep forwarding */
    send packet to a neighbor cell according to GPSR algorithm;
  end if
}
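A compact Python sketch of the receiving branch of Algorithm 1 is given below. It is an illustration rather than the TinyOS implementation: the `Node` class, the returned action tuples and the rule that the CH picks the member with the most remaining energy are assumptions (the text only says that the CH selects a proper node), and packets are assumed to carry level 1 cell ids.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    name: str
    cell_ids: Dict[int, int]       # level -> cell id on that level
    is_ch: bool = False
    energy: float = 1000.0
    stored: List[str] = field(default_factory=list)


def cid_matches(packet_cid: int, node: Node) -> bool:
    """Compare a level 1 packet cell id with the node's cell ids, lowest level first."""
    for level in sorted(node.cell_ids):
        if (packet_cid >> (2 * (level - 1))) == node.cell_ids[level]:
            return True
    return False


def handle_packet(node: Node, packet_cid: int, event: str,
                  members: List[Node]) -> Tuple[str, Optional[str]]:
    """Decide what a node does with a received packet.

    `members` are the nodes of the receiving node's (possibly merged) cell.
    """
    if not cid_matches(packet_cid, node):
        return ("forward_gpsr", None)   # keep routing toward target_xy
    if not node.is_ch:
        return ("send_to_ch", None)     # only the CH chooses the storage node
    # Assumed selection rule: the member with the most remaining energy.
    target = max(members, key=lambda n: n.energy)
    target.stored.append(event)
    return ("store", target.name)
```

Replaying the example of Figure 5 with hypothetical energy levels, a packet for cell 10 arriving at cell header B (cell ids 9(1) and 2(2)) matches on level 2 and is stored at whichever member currently has the most energy.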

4. Performance Evaluation

We performed a simulation to evaluate the performance of the proposed DCS. We assume that 400 nodes are randomly deployed in a square area of 200 × 200 m². Each sensor node has an initial energy of 1,000 units and a storage capacity of 100 units. Transmitting a message from a sensor node to a neighbor node consumes 0.1 energy units at the sender and 0.05 energy units at the receiver. The radio range of each node is 15 m. Every node generates 50 events whose values follow a normal distribution, such that 80% of the events fall into 20% of the reading range. We measure the energy level of each sensor node and compare the results with those of KDDCS, one of the best-known data centric storage schemes. Table 1 shows the simulation parameters.
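The energy model above implies that each hop of a route costs 0.1 + 0.05 = 0.15 energy units in total. The Python sketch below shows one way such accounting could be done in a simulator; the function name `charge_route` and the dictionary of per-node energy levels are illustrative assumptions, not details of the simulator used for the evaluation.

```python
from typing import Dict, List

SEND_COST = 0.1   # energy units consumed by the sender per message
RECV_COST = 0.05  # energy units consumed by the receiver per message


def charge_route(energy: Dict[str, float], route: List[str]) -> float:
    """Deduct send/receive costs for one message along `route`.

    `route` lists node ids from the source to the storing node.
    Returns the total energy consumed by the whole route.
    """
    total = 0.0
    for sender, receiver in zip(route, route[1:]):
        energy[sender] -= SEND_COST
        energy[receiver] -= RECV_COST
        total += SEND_COST + RECV_COST
    return total
```

Under this accounting, an intermediate node pays for both receiving and re-sending a message, so nodes on frequently used routes toward a hot-spot cell drain faster than the source or the final storing node.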

Table 1. Parameters for performance evaluation.

Table 2 shows the average energy level of each sensor node in the hot-spot area and in the entire area. As shown in the table, over the entire area the sensor nodes of KDDCS consume about 73% more energy than those of the proposed DCS. In the hot-spot area, the sensor nodes of KDDCS consume about 97% more energy than those of the proposed DCS. The proposed GDCS outperforms KDDCS for two reasons. First, KDDCS needs to visit the node that holds the split information in order to forward a packet to the next node. Second, in the hot-spot handling strategy of KDDCS, a node in the hot spot moves data to neighbor nodes whenever the area covered by a sensor node is adapted.

Table 2. Average energy level of each sensor node.

We also implemented a simple application that uses our GDCS on TinyOS [10] and ran it in TOSSIM [11] to show how our GDCS distributes the workload of a hot spot area. Table 3 shows the TOSSIM parameters. The number of nodes is 256, and the width and height of the sensor network are both 100 m. The radio range is 10 m and the number of bits per grid is 8.

Table 3. Parameters for running TOSSIM.

When the number of levels is set to 1, i.e., multilevel grid techniques are not used, the sensor nodes of the hot spot area receive 40% of the total messages, as shown in Table 4. When the number of levels is 2, however, the share of the total messages received by the sensor nodes in the hot spot area decreases to about 21%. These results indicate that the proposed GDCS distributes the workload of hot spot areas effectively.

Table 4. Message receiving rate.

5. Conclusions

In this paper, we have proposed a grid technique based data centric storage for sensor networks. The proposed DCS (GDCS) handles storage hot spot problems by using multilevel grid techniques. When a cell header detects a hot spot, it merges the hot spot cell and its neighbor cells into a higher level grid cell so that more sensor nodes cooperate to store data and to process queries. We have performed simulations to evaluate the GDCS. The GDCS improved the lifetime of sensor networks by about 60% over KDDCS. We also showed that the proposed GDCS efficiently distributes the workload of hot spots.

Acknowledgments

This research was supported by a grant from the University Restructuring Program (funded by the Ministry of Education and Human Resources Development) of Chungju National University. This work was also supported by the Korea Research Foundation Grant funded by the Korean Government (MOEHRD) (KRF-2005-202-D00395).

References

  1. Shenker, S; Ratnasamy, S; Karp, B; Govindan, R; Estrin, D. Data-Centric Storage in Sensornets. ACM SIGCOMM Comput. Commun. Rev 2003, 33, 137–142.
  2. Karp, B; Kung, H. GPSR: Greedy Perimeter Stateless Routing for Wireless Networks. Proceedings of the 6th annual International Conference on Mobile Computing and Networking, Boston, MA, USA, 8 August 2000; pp. 243–254.
  3. Li, X; Kim, Y; Govindan, R; Hon, W. Multi-dimensional Range Queries in Sensor Networks. Proceedings of SenSys, Los Angeles, CA, USA, 5 November 2003; pp. 63–75.
  4. Aly, M; Pruhs, K; Chrysanthis, PK. KDDCS: A Load-Balanced in-Network Data-Centric Storage Scheme for Sensor Networks. Proceedings of CIKM, Arlington, VA, USA, 6 November 2006; pp. 317–326.
  5. Aly, M; Chrysanthis, P; Pruhs, K. Decomposing Data-Centric Storage Query Hot-Spots in Sensor Networks. Proceedings of MOBIQUITOUS, San Jose, CA, USA, 17 July 2006; pp. 1–9.
  6. Lai, Y; Chen, H; Wang, Y. Dynamic Balanced Storage in Wireless Sensor Networks. Proceedings of the 4th Workshop on Data Management for Sensor Networks, Vienna, Austria, 24 September 2007; pp. 7–12.
  7. Hong, Y; Choi, Y; Kim, T; Kwak, Y; Joe, K; Kang, H; Song, S. Energy Efficient Data Centric Storage for Sensor Networks Employing Multilevel Grid Techniques. Proceedings of Future Generation Communication and Networking, Hainan, China, 13 December 2008; pp. 133–136.
  8. Ratnasamy, S; Karp, B; Yin, L; Yu, F; Estrin, D; Govindan, R; Shenker, S. GHT: A Geographic Hash Table for Data-centric Storage. Proceedings of the 1st ACM International Workshop on Wireless Sensor Networks and Applications, Atlanta, GA, USA, 28 September 2002; pp. 78–87.
  9. Aly, M; Morsillo, N; Chrysanthis, P; Pruhs, K. Zone Sharing: A Hot-spots Decomposition Scheme for Data Centric Storage in Sensor Networks. Proceedings of the 2nd International Workshop on Data Management for Sensor Networks, Trondheim, Norway, 29 August 2005; pp. 21–26.
  10. Levis, P. TinyOS Programming, 2006. Available online: http://csl.stanford.edu/~pal/pubs/tinyos-programming.pdf (accessed on 12 November 2010).
  11. Levis, P; Lee, N; Welsh, M; Culler, D. TOSSIM: Accurate and Scalable Simulation of Entire TinyOS Applications. Proceedings of SenSys, Los Angeles, CA, USA, 5 November 2003; pp. 126–137.
