Two-way Linear Probing Revisited

We introduce linear probing hashing schemes that construct a hash table of size $n$, with constant load factor $\alpha$, on which the worst-case unsuccessful search time is asymptotically almost surely $O(\log \log n)$. The schemes employ two linear probe sequences to find empty cells for the keys. Matching lower bounds on the maximum cluster size produced by any algorithm that uses two linear probe sequences are obtained as well.


Introduction
In classical open addressing hashing [75], $m$ keys are hashed sequentially and on-line into a table of size $n > m$ (that is, a one-dimensional array of $n$ cells, which we denote by the set $T = \{0, \ldots, n-1\}$), where each cell can harbor at most one key.
Each key $x$ has a single infinite probe sequence $f_i(x) \in T$, for $i \in \mathbb{N}$. During the insertion process, if a key is mapped to a cell that is already occupied by another key, a collision occurs, and another probe is required. The probing continues until an empty cell is reached, where the key is placed. This method of hashing is pointer-free, unlike hashing with separate chaining, where keys colliding in the same cell are hashed to a separate linked list or chain. For a discussion of different hashing schemes, see [41,51,92].
The purpose of this paper is to design efficient open addressing hashing schemes that improve the worst-case performance of classical linear probing, where $f_{i+1}(x) = f_i(x) + 1 \bmod n$, for $i \in [[n]] := \{1, \ldots, n\}$. Linear probing is known for its good practical performance, efficiency, and simplicity. It continues to be one of the best hash tables in practice due to its simplicity of implementation, absence of overhead for internally used pointers, cache efficiency, and locality of reference [46,73,81,88]. On the other hand, the performance of linear probing seems to degrade with high load factors $m/n$, due to a primary-clustering tendency of one collision to cause more nearby collisions.
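To fix notation, classical linear probing with the fcfs policy can be sketched in a few lines. The following Python sketch is ours (the identifiers and the toy identity hash used below are illustrative, not from the literature); it probes cyclically from the hashed cell until an empty cell, or the sought key, is found.

```python
def lp_insert(table, key, h):
    """Classical linear probing with fcfs: probe cyclically from h(key)
    until an empty cell is found.  Returns the number of probes used."""
    n = len(table)
    i = h(key) % n
    probes = 1
    while table[i] is not None:      # collision: the resident key stays (fcfs)
        i = (i + 1) % n
        probes += 1
    table[i] = key
    return probes

def lp_search(table, key, h):
    """Return the cell of `key`, or None once an empty cell is reached
    (unsuccessful search)."""
    n = len(table)
    i = h(key) % n
    for _ in range(n):
        if table[i] is None:
            return None
        if table[i] == key:
            return i
        i = (i + 1) % n
    return None
```

For instance, with a table of size 8 and the identity hash, inserting 3 and then 11 places 11 in cell 4, since cell $3 = 11 \bmod 8$ is already taken.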
Our study concentrates on schemes that use two linear probe sequences to find possible hashing cells for the keys. Each key chooses two initial cells independently and uniformly at random, with replacement. From each initial cell, we probe linearly, and cyclically whenever the last cell in the table is reached, to find two empty cells which we call terminal cells. The key is then inserted into one of these terminal cells according to a fixed strategy. We consider strategies that utilize the greedy multiple-choice paradigm [5,93]. We show that some of the trivial insertion strategies with two-way linear probing have unexpectedly poor performance. For example, one of the trivial strategies we study inserts each key into the terminal cell found by the shorter probe sequence. Another simple strategy inserts each key into the terminal cell that is adjacent to the smaller cluster, where a cluster is an isolated set of consecutively occupied cells. Unfortunately, the performance of these two strategies is far from ideal. We prove that when either of these strategies is used to construct a hash table with constant load factor, the maximum unsuccessful search time is $\Omega(\log n)$, with high probability (w.h.p.). Indeed, we prove that, w.h.p., a giant cluster of size $\Omega(\log n)$ emerges in a hash table of constant load factor if it is constructed by a two-way linear probing insertion strategy that always inserts a key upon arrival into the empty cell of its two initial cells whenever one of them is empty.
Consequently, we introduce two other strategies that overcome this problem. First, we partition the hash table into equal-sized blocks of size $\beta$, assuming $n/\beta$ is an integer. We consider the following strategies for inserting the keys:

A. Each key is inserted into the terminal cell that belongs to the least crowded block, i.e., the block with the fewest keys.
B. For each block $i$, we define its weight to be the number of keys inserted into terminal cells found by linear probe sequences whose starting locations belong to block $i$. Each key, then, is inserted into the terminal cell found by the linear probe sequence that started from the block of smaller weight.
For strategy B, we show that $\beta$ can be chosen such that for any constant load factor $\alpha := m/n$, the maximum unsuccessful search time is not more than $c \log_2 \log n$, w.h.p., where $c$ is a function of $\alpha$. If $\alpha < 1/2$, the same property also holds for strategy A. Furthermore, these schemes are optimal up to a constant factor in the sense that an $\Omega(\log \log n)$ universal lower bound holds for any strategy that uses two linear probe sequences, even if the initial cells are chosen according to arbitrary probability distributions.
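As a concrete illustration of strategy B, the following Python sketch is ours and simplifies freely (a pseudo-random generator stands in for truly uniform hashing, $\beta$ is assumed to divide $n$, and the helper names are hypothetical): each key follows the probe sequence whose starting cell lies in the block of smaller weight, and that block's weight is then incremented.

```python
import random

def terminal_cell(table, start):
    """Probe linearly (and cyclically) from `start` to the first empty cell."""
    n = len(table)
    i = start
    while table[i] is not None:
        i = (i + 1) % n
    return i

def insert_strategy_b(table, key, weights, beta, rng):
    """Strategy B: draw two starting cells; follow the probe sequence whose
    *starting* cell lies in the block of smaller weight (ties broken
    randomly), then credit the winning block's weight."""
    n = len(table)
    s1, s2 = rng.randrange(n), rng.randrange(n)
    b1, b2 = s1 // beta, s2 // beta
    if weights[b1] < weights[b2] or (weights[b1] == weights[b2]
                                     and rng.random() < 0.5):
        start, blk = s1, b1
    else:
        start, blk = s2, b2
    table[terminal_cell(table, start)] = key
    weights[blk] += 1
```

Note that the weight of a block counts starting locations, not occupied cells, which is what makes the strategy analyzable via the multiple-choice paradigm.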
For hashing with separate chaining, one can achieve $O(\log \log n)$ maximum search time by applying the two-way chaining scheme [5], where each key is inserted into the shorter of two chains chosen independently and uniformly at random, with replacement, breaking ties randomly. It is proved [5,8] that when $r = \Omega(n)$ keys are inserted into a hash table with $n$ chains, the length of the longest chain upon termination is $\log_2 \log n + r/n \pm O(1)$, w.h.p. Of course, this idea can be generalized to open addressing. Assuming the hash table is partitioned into blocks of size $\beta$, we allow each key to choose two initial cells, and hence two blocks, independently and uniformly at random, with replacement. From each initial cell and within its block, we probe linearly and cyclically, if necessary, to find two empty cells; that is, whenever we reach the last cell in the block and it is occupied, we continue probing from the first cell in the same block. The key, then, is inserted into the empty cell that belongs to the less crowded block.

Random and uniform probings are, in some sense, the idealized models [89,97], and their performance is among the easiest to analyze; but obviously they are unrealistic. Linear probing is perhaps the simplest to implement, but it behaves badly when the table is almost full. Double probing can be seen as a compromise.
During the insertion process of a key $x$, suppose that we arrive at the cell $f_i(x)$, which is already occupied by another, previously inserted key $y$; that is, $f_i(x) = f_j(y)$ for some $j \in \mathbb{N}$. Then a replacement strategy for resolving the collision is needed. Three strategies have been suggested in the literature (see [68] for other methods):

1. first come first served (fcfs) [75]: The key $y$ is kept in its cell, and the key $x$ is referred to the next cell $f_{i+1}(x)$.
2. last come first served (lcfs) [78]: The key $x$ is inserted into the cell $f_i(x)$, and the key $y$ is pushed along to the next cell in its probe sequence, $f_{j+1}(y)$.
3. robin hood [13,14]: The key that has travelled the furthest is inserted into the cell. That is, if $i > j$, then the key $x$ is inserted into the cell $f_i(x)$, and the key $y$ is pushed along to the next cell $f_{j+1}(y)$; otherwise, $y$ is kept in its cell, and the key $x$ tries its next cell $f_{i+1}(x)$.
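To make the third policy concrete, here is a Python sketch (our own illustrative code, not taken from [13,14]) of linear probing insertion under the robin hood rule, storing each key together with its current displacement:

```python
def rh_insert(table, key, h):
    """Linear probing with the robin hood policy: at each occupied cell, the
    key that has travelled further from its initial cell keeps the cell, and
    the other key moves on.  Cells hold (key, displacement) pairs or None."""
    n = len(table)
    i = h(key) % n
    d = 0                              # displacement of the key in hand
    while True:
        if table[i] is None:
            table[i] = (key, d)
            return
        rkey, rd = table[i]
        if d > rd:                     # the key in hand travelled further:
            table[i] = (key, d)        # it takes the cell, and the resident
            key, d = rkey, rd          # key is pushed along instead
        i = (i + 1) % n
        d += 1
```

For example, inserting 0, 1, 8 with $h(k) = k \bmod 8$: key 8 collides at cell 0, then displaces key 1 at cell 1 (displacement 1 versus 0), and key 1 ends up in cell 2.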

Average Performance
Evidently, the performance of any open addressing scheme deteriorates when the ratio $m/n$ approaches 1, as the cluster sizes increase, where a cluster is an isolated set of consecutively occupied cells (cyclically defined) that are bounded by empty cells. Therefore, we shall assume that the hash table is $\alpha$-full, that is, the number of hashed keys is $m = \lfloor \alpha n \rfloor$, where $\alpha \in (0,1)$ is a constant called the load factor. The asymptotic average-case performance has been extensively analyzed for random and uniform probing [9,55,67,75,89,97], linear probing [50,51,54,61], and double probing [6,43,57,84,86]. The expected search times were proven to be constants depending, more or less, on $\alpha$ only. Recent results about the average-case performance of linear probing and the limit distribution of the construction time have appeared in [32,52,91]. See also [3,39,76] for the average-case analysis of linear probing with nonuniform hash functions.
It is worth noting that the average search time of linear probing is independent of the replacement strategy; see [51,75]. This is because inserting the keys in any order results in the same set of occupied cells, i.e., the cluster sizes are the same; hence, the total displacement of the keys from their initial hashing locations remains unchanged. It is not difficult to see that this independence also holds for random and double probing. That is, the replacement strategy does not affect the average successful search time in any of the above probings. In addition, since in linear probing the unsuccessful search time is related to the cluster sizes (unlike random and double probing), the expected and maximum unsuccessful search times in linear probing are invariant under the replacement strategy.
It is known that the lcfs [78,79] and robin hood [13,14,68,91] strategies minimize the variance of displacement. Recently, Janson [45] and Viola [90] have reaffirmed the effect of these replacement strategies on the individual search times in linear probing hashing.

Worst-case Performance
The focal point of this article, however, is the worst-case search time, which is proportional to the length of the longest probe sequence over all keys (llps, for short). Many results have been established regarding the worst-case performance of open addressing.
The worst-case performance of linear probing with the fcfs policy was analyzed by Pittel [77]. He proved that the maximum cluster size, and hence the llps needed to insert (or search for) a key, is asymptotic to $(\alpha - 1 - \log \alpha)^{-1} \log n$, in probability. As we mentioned above, this bound holds for linear probing with any replacement strategy. Chassaing and Louchard [15] studied the threshold of emergence of a giant cluster in linear probing. They showed that when the number of keys is $m = n - \omega(\sqrt{n})$, the size of the largest cluster is $o(n)$, w.h.p.; however, when $m = n - o(\sqrt{n})$, a giant cluster of size $\Theta(n)$ emerges, w.h.p. Gonnet [40] proved that with uniform probing and the fcfs replacement strategy, the expected llps is asymptotic to $\log_{1/\alpha} n - \log_{1/\alpha} \log_{1/\alpha} n + O(1)$, for $\alpha$-full tables. However, Poblete and Munro [78,79] showed that if random probing is combined with the lcfs policy, then the expected llps is at most $(1 + o(1))\, \Gamma^{-1}(\alpha n) = O(\log n / \log \log n)$, where $\Gamma$ is the gamma function.
On the other hand, the robin hood strategy with random probing leads to a more striking performance. Celis [13] first proved that the expected llps is $O(\log n)$. However, Devroye, Morin and Viola [22] tightened the bounds and revealed that the llps is indeed $\log_2 \log n \pm \Theta(1)$, w.h.p., thus achieving double logarithmic worst-case insertion and search times for the first time in open addressing hashing. Unfortunately, one cannot ignore the assumption in random probing of the availability of an infinite collection of hash functions that are sufficiently independent and behave like truly uniform hash functions in practice. On the other side of the spectrum, we already know that the robin hood policy does not affect the maximum unsuccessful search time in linear probing. However, robin hood may be promising with double probing.

Other Initiatives
Open addressing methods that rely on rearranging keys have been under investigation for many years; see, e.g., [10,42,58,60,68,82]. Pagh and Rodler [74] studied a scheme called cuckoo hashing that exploits the lcfs replacement policy. It uses two hash tables, each of size $n > (1 + \epsilon)m$ for some constant $\epsilon > 0$, and two independent hash functions chosen from an $O(\log n)$-universal class, one function for each table.
Each key is hashed initially by the first function to a cell in the first table. If the cell is occupied, then the new key is inserted there anyway, and the old key is kicked out to the second table to be hashed by the second function. The same rule is applied in the second table. Keys are moved back and forth until a key moves to an empty location or a limit is reached. If the limit is reached, new independent hash functions are chosen, and the tables are rehashed. The worst-case search time is at most two, and the amortized expected insertion time is, nonetheless, constant. However, this scheme utilizes less than 50% of the allocated memory, has a worst-case insertion time of $O(\log n)$, w.h.p., and depends on a wealthy source of provably good independent hash functions for the rehashing process. For further details, see [21,26,33,70].
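The insertion and lookup loop just described can be sketched as follows; this is an illustrative Python rendering of the basic scheme (the identifiers, the kick limit, and the toy hash functions used below are our choices), not the authors' implementation.

```python
def cuckoo_insert(t1, t2, h1, h2, key, max_kicks=32):
    """Insert with the lcfs-style cuckoo rule: a key always takes the cell it
    hashes to, evicting the resident, which is re-homed in the other table.
    Returns False when the kick limit is hit (a rehash would be needed)."""
    tables, hashes = (t1, t2), (h1, h2)
    side = 0
    for _ in range(max_kicks):
        t, h = tables[side], hashes[side]
        i = h(key) % len(t)
        if t[i] is None:
            t[i] = key
            return True
        t[i], key = key, t[i]          # evict the old key
        side = 1 - side                # and try it on the other side
    return False

def cuckoo_lookup(t1, t2, h1, h2, key):
    """Worst-case two probes: a key can live in only one of two cells."""
    return t1[h1(key) % len(t1)] == key or t2[h2(key) % len(t2)] == key
```

The lookup makes the constant worst-case search time evident: each key has exactly two candidate cells, one per table.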
The space efficiency of cuckoo hashing is significantly improved when the hash table is divided into blocks of fixed size $b \ge 1$ and more hash functions are used to choose $k \ge 2$ blocks for each key, where each key is inserted into a cell in one of its chosen blocks using the cuckoo random-walk insertion method [25,34,35,37,56,95]. For example, it is known [25,56] that 89.7% space utilization can be achieved when $k = 2$ and the hash table is partitioned into non-overlapping blocks of size $b = 2$. On the other hand, when the blocks are allowed to overlap, the space utilization improves to 96.5% [56,95]. The worst-case insertion time of this generalized cuckoo hashing scheme, however, is proven [35,38] to be polylogarithmic, w.h.p.
Many real-time static and dynamic perfect hashing schemes achieving constant worst-case search time, and linear (in the table size) construction time and space, were designed in [11,23,24,27,28,36,71,72]. All of these schemes, which are based, more or less, on the idea of multilevel hashing, employ more than a constant number of perfect hash functions chosen from an efficient universal class. Some of them even use $O(n)$ functions.

Our Contribution
We design linear probing algorithms that accomplish double logarithmic worst-case search time. Inspired by the two-way chaining algorithm [5] and its powerful performance, we promote the concept of open addressing hashing with two-way linear probing. The essence of the proposed concept is to allow each key to generate two independent linear probe sequences and have the algorithm decide, according to some strategy, at the end of which sequence the key should be inserted. Formally, each input key $x$ chooses two cells independently and uniformly at random, with replacement. We call these cells the initial hashing cells available for $x$. From each initial hashing cell, we start a linear probe sequence (with the fcfs policy) to find an empty cell where we stop. Thus, we end up with two unoccupied cells. We call these cells the terminal hashing cells. The question now is: into which terminal cell should we insert the key $x$?
The insertion process of a two-way linear probing algorithm could follow one of the strategies we mentioned earlier: it may insert the key at the end of the shorter probe sequence, or into the terminal cell that is adjacent to the smaller cluster. Other strategies may make an insertion decision even before linear probing starts. In any of these algorithms, the searching process for any key is basically the same: probe both sequences alternately until the key is found, or, in the case of an unsuccessful search, until the two empty cells at the ends of the sequences are reached. Thus, the maximum unsuccessful search time is at most twice the size of the largest cluster plus two.
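The alternating search procedure common to all these algorithms can be sketched as follows (an illustrative Python rendering; the tuple return value and probe counting are our conventions):

```python
def twoway_search(table, key, s1, s2):
    """Probe the two linear sequences alternately, starting at cells s1 and
    s2, until the key is found or both sequences reach an empty cell.
    Returns (cell, probes) on success and (None, probes) otherwise."""
    n = len(table)
    pos = [s1, s2]
    done = [False, False]
    probes = 0
    while not (done[0] and done[1]):
        for j in (0, 1):
            if done[j]:
                continue
            probes += 1
            if table[pos[j]] == key:
                return pos[j], probes
            if table[pos[j]] is None:   # this sequence is exhausted
                done[j] = True
            pos[j] = (pos[j] + 1) % n
    return None, probes
```

Each sequence stops at the empty cell bounding its cluster, which is why the unsuccessful search time is at most twice the largest cluster size plus two.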
We study the two-way linear probing algorithms stated above, and show that the hash table, asymptotically and almost surely, contains a giant cluster of size Ω(log n).Indeed, we prove that a cluster of size Ω(log n) emerges, asymptotically and almost surely, in any hash table of constant load factor that is constructed by a two-way linear probing algorithm that inserts any key upon arrival into the empty cell of its two initial cells whenever one of them is empty.
We introduce two other two-way linear probing heuristics that lead to $\Theta(\log \log n)$ maximum unsuccessful search times. The common idea of these heuristics is the marriage between the two-way linear probing concept and a technique we call blocking, where the hash table is partitioned into equal-sized blocks. These blocks are used by the algorithm to obtain some information about the allocation of the keys. This information is used to make better decisions about where the keys should be inserted, and hence leads to a more even distribution of the keys.
Two-way linear probing hashing has several advantages over the other hashing methods mentioned above: it reduces the worst-case behavior of hashing, requires only two hash functions, is easy to parallelize, is pointer-free and easy to implement, and, unlike the hashing schemes proposed in [25,74], does not require any rearrangement of keys or rehashing. Its maximum cluster size is $O(\log \log n)$, and its average-case performance is at most twice that of classical linear probing, as shown in the simulation results. Furthermore, it is not necessary to employ perfectly random hash functions, as it is known [73,81,88] that hash functions with a smaller degree of universality suffice to implement linear probing schemes. See also [27,49,70,74,84,85,86] for other suggestions on practical hash functions.

Paper Scope
In the next section, we recall some useful results about the greedy multiple-choice paradigm. We prove, in Section 3, a universal lower bound of order $\log \log n$ on the maximum unsuccessful search time of any two-way linear probing algorithm. We prove, in addition, that not every two-way linear probing scheme behaves efficiently. We devote Section 4 to the positive results, where we present our two two-way linear probing heuristics that accomplish $O(\log \log n)$ worst-case unsuccessful search time. Simulation results of the studied algorithms are summarized in Section 5.
Throughout, we assume the following. We are given $m$ keys, from a universe of keys $U$, to be hashed into a hash table of size $n$ such that each cell contains at most one key. The process of hashing is sequential and on-line; that is, we never know anything about future keys. The constant $\alpha \in (0,1)$ is reserved in this article for the load factor of the hash table; that is, we assume that $m = \lfloor \alpha n \rfloor$. The $n$ cells of the hash table are numbered $0, \ldots, n-1$. The linear probe sequences always move cyclically from left to right of the hash table. The replacement strategy of all of the introduced algorithms is fcfs. The insertion time is defined to be the number of probes the algorithm performs to insert a key. Similarly, the search time is defined to be the number of probes needed to find a key, or two empty cells in the case of an unsuccessful search. Observe that, unlike classical linear probing, the insertion time of two-way linear probing may not equal the successful search time. However, both are bounded by the unsuccessful search time. Notice also that we ignore the time to compute the hash functions.

The Multiple-choice Paradigm
Allocating balls into bins is one of the historical assignment problems [48,53]. We are given $r$ balls to be placed into $s$ bins. The balls are inserted sequentially and on-line; that is, each ball is assigned upon arrival without knowing anything about future balls. The load of a bin is defined to be the number of balls it contains. We would like to design an allocation process that minimizes the maximum load among all bins upon termination. For example, in the classical allocation process, each ball is placed into a bin chosen independently and uniformly at random, with replacement. It is known [40,63,80] that if $r = \Theta(s)$, the maximum load upon termination is asymptotic to $\log s / \log \log s$, in probability.
On the other hand, the greedy multiple-choice allocation process, which appeared in [30,49] and was studied by Azar et al. [5], inserts each ball into the least loaded bin among $d \ge 2$ bins chosen independently and uniformly at random, with replacement, breaking ties randomly. Throughout, we refer to this process for inserting $r$ balls into $s$ bins by GreedyMC(s, r, d). Surprisingly, the maximum bin load of GreedyMC(s, s, d) decreases exponentially, compared to the single-choice process, to $\log_d \log s \pm O(1)$, w.h.p. [5]. One can easily generalize this to the case $r = \Theta(s)$. It is also known that the greedy strategy is stochastically optimal in the following sense.
Theorem 1 (Azar et al. [5]). Let $s, r, d \in \mathbb{N}$, where $d \ge 2$, and $r = \Theta(s)$. Upon termination of GreedyMC(s, r, d), the maximum bin load is $\log_d \log s \pm O(1)$, w.h.p. Furthermore, the maximum bin load of any on-line allocation process that inserts $r$ balls sequentially into $s$ bins, where each ball is inserted into a bin among $d$ bins chosen independently and uniformly at random, with replacement, is at least $\log_d \log s - O(1)$, w.h.p.
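A simulation of GreedyMC(s, r, d) is straightforward; the following Python sketch is ours (random tie-breaking is implemented via a secondary random component of the sort key) and can be used to observe the double logarithmic growth of the maximum load empirically.

```python
import random

def greedy_mc(s, r, d, rng):
    """GreedyMC(s, r, d): each of r balls goes into the least loaded of d
    bins drawn uniformly at random with replacement; ties among equally
    loaded bins are broken randomly."""
    load = [0] * s
    for _ in range(r):
        choices = [rng.randrange(s) for _ in range(d)]
        best = min(choices, key=lambda b: (load[b], rng.random()))
        load[best] += 1
    return load
```

For $s = r$ and $d = 2$, the maximum of the returned loads concentrates around $\log_2 \log s$ plus a constant, far below the $\log s / \log \log s$ of the single-choice process.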
Berenbrink et al. [8] extended Theorem 1 to the heavily loaded case where r ≫ s, and recorded the following tight result.

Theorem 2 (Berenbrink et al. [8]). There is a constant $C > 0$ such that for any integers $r \ge s > 0$ and $d \ge 2$, the maximum bin load upon termination of the process GreedyMC(s, r, d) is $r/s + \log_d \log s \pm C$, w.h.p.

Theorem 2 is a crucial result that we use to derive our results; see Theorems 8 and 9. It states that the deviation of the maximum bin load from the average bin load $r/s$, which is $\log_d \log s$, stays unchanged as the number of balls increases.
Vöcking [93,94] demonstrated that it is possible to improve the performance of the greedy process if non-uniform distributions on the bins and a tie-breaking rule are carefully chosen. He suggested the following variant, called Always-Go-Left. The bins are numbered from 1 to $s$ and partitioned into $d$ groups of almost equal size, that is, each group has size $\Theta(s/d)$. We allow each ball to select upon arrival $d$ bins independently at random, where the $i$-th bin is chosen uniformly from the $i$-th group. Each ball is placed on-line, as before, into the least full bin, but upon a tie, the ball is always placed into the leftmost bin among the $d$ chosen bins. We write LeftMC(s, r, d) to refer to this process. Vöcking [93] showed that if $r = \Theta(s)$, the maximum load of LeftMC(s, r, d) is $\log \log s / (d \log \phi_d) + O(1)$, w.h.p., where $\phi_d$ is a constant related to a generalized Fibonacci sequence. For example, the constant $\phi_2 = 1.61\ldots$ corresponds to the well-known golden ratio, and $\phi_3 = 1.83\ldots$. In general, $\phi_2 < \phi_3 < \phi_4 < \cdots < 2$, and $\lim_{d \to \infty} \phi_d = 2$. Observe the improvement over the performance of GreedyMC(s, r, d), even for $d = 2$: the maximum load of LeftMC(s, r, 2) is $0.72\ldots \times \log_2 \log s + O(1)$, whereas in GreedyMC(s, r, 2) it is $\log_2 \log s + O(1)$. The process LeftMC(s, r, d) is also optimal in the following sense.
Theorem 3 (Vöcking [93]). Let $r, s, d \in \mathbb{N}$, where $d \ge 2$, and $r = \Theta(s)$. The maximum bin load of LeftMC(s, r, d) upon termination is $\log \log s / (d \log \phi_d) \pm O(1)$, w.h.p. Moreover, the maximum bin load of any on-line allocation process that inserts $r$ balls sequentially into $s$ bins, where each ball is placed into a bin among $d$ bins chosen according to arbitrary, not necessarily independent, probability distributions defined on the bins, is at least $\log \log s / (d \log \phi_d) - O(1)$, w.h.p.
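The Always-Go-Left process is equally easy to simulate. In the following Python sketch (ours; for simplicity it assumes $d$ divides $s$, so every group has size exactly $s/d$), ties are broken toward the leftmost chosen bin.

```python
import random

def left_mc(s, r, d, rng):
    """LeftMC(s, r, d): the s bins are split into d groups of size s // d;
    the i-th choice is uniform over the i-th group, and ties among the
    least loaded bins go to the leftmost one."""
    g = s // d
    load = [0] * s
    for _ in range(r):
        choices = [i * g + rng.randrange(g) for i in range(d)]
        best = min(choices, key=lambda b: (load[b], b))   # tie -> leftmost
        load[best] += 1
    return load
```

The only differences from GreedyMC are the group-restricted choices and the deterministic tie-breaking; these alone shave the constant factor in the maximum load.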
Berenbrink et al. [8] extended this result to the heavily loaded case as well.

Life is not Always Good!
We prove here that the idea of two-way linear probing alone is not always sufficient to achieve a plausible hashing performance. We prove that a large class of two-way linear probing algorithms have an $\Omega(\log n)$ lower bound on their worst-case search time. To avoid any ambiguity, we consider the following definition.

Definition 1. A two-way linear probing algorithm is an open addressing hashing algorithm that inserts keys into cells using a certain strategy and does the following upon the arrival of each key:

1. It chooses two initial hashing cells independently and uniformly at random, with replacement.
2. Two terminal (empty) cells are then found by linear probe sequences starting from the initial cells.
3. The key is inserted into one of these terminal cells.
To be clear, we give two examples of inefficient two-way linear probing algorithms. Our first algorithm places each key into the terminal cell discovered by the shorter probe sequence. More precisely, once the key chooses its initial hashing cells, we start two linear probe sequences. We proceed, sequentially and alternately, one probe from each sequence, until we find an empty (terminal) cell where we insert the key. Formally, let $f, g : U \to \{0, \ldots, n-1\}$ be independent and truly uniform hash functions. For $x \in U$, define the linear sequence $f_1(x) = f(x)$, and $f_{i+1}(x) = f_i(x) + 1 \bmod n$, for $i \in [[n]]$; similarly define the sequence $g_i(x)$. The algorithm, then, inserts each key $x$ into the first unoccupied cell in the following probe sequence: $f_1(x), g_1(x), f_2(x), g_2(x), f_3(x), g_3(x), \ldots$ We denote this algorithm, which hashes $m$ keys into $n$ cells, by ShortSeq(n, m), for the shorter sequence.
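A Python sketch of ShortSeq(n, m)'s insertion step reads as follows (our illustrative code; the initial cells `f1` and `g1` are passed in, standing for $f_1(x)$ and $g_1(x)$):

```python
def shortseq_insert(table, key, f1, g1):
    """ShortSeq: probe the cells f1, g1, f2, g2, ... alternately and insert
    the key into the first empty cell met.  Returns the chosen cell."""
    n = len(table)
    seq = [f1, g1]                     # current position in each sequence
    while True:
        for j in (0, 1):
            if table[seq[j]] is None:
                table[seq[j]] = key
                return seq[j]
            seq[j] = (seq[j] + 1) % n
```

For example, in an empty table of size 8, three consecutive insertions with initial cells (2, 5) land in cells 2, 5, and 3, following the probe order $f_1, g_1, f_2, \ldots$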
The second algorithm inserts each key into the empty (terminal) cell that is the right neighbor of the smaller of the two clusters containing the initial hashing cells, breaking ties randomly. If one of the initial cells is empty, then the key is inserted into it; and if both initial cells are empty, we break the tie evenly. Recall that a cluster is a group of consecutively occupied cells whose left and right neighbors are empty cells. This means that one can compute the size of the cluster containing an initial hashing cell by running two linear probe sequences in opposite directions, starting from the initial cell and going to the empty cells at the boundaries. So, practically, the algorithm uses four linear probe sequences. We refer to this algorithm, which inserts $m$ keys into $n$ cells, by SmallCluster(n, m). Before we show that these algorithms produce large clusters, we record a lower bound that holds for any two-way linear probing algorithm.
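The insertion step of SmallCluster(n, m) can be sketched as follows (our illustrative Python code; `cluster_size` assumes the table is not full, otherwise the scan would not terminate):

```python
import random

def cluster_size(table, i):
    """Size of the cluster containing cell i (0 if the cell is empty),
    found by probing in both directions until the bounding empty cells."""
    n = len(table)
    if table[i] is None:
        return 0
    size, j = 1, (i + 1) % n
    while table[j] is not None:
        size, j = size + 1, (j + 1) % n
    j = (i - 1) % n
    while table[j] is not None:
        size, j = size + 1, (j - 1) % n
    return size

def smallcluster_insert(table, key, c1, c2, rng):
    """SmallCluster: insert the key into the empty cell just right of the
    smaller of the two clusters containing the initial cells c1 and c2,
    breaking ties randomly.  An empty initial cell is a cluster of size 0,
    so the key lands directly in it."""
    s1, s2 = cluster_size(table, c1), cluster_size(table, c2)
    i = c1 if (s1 < s2 or (s1 == s2 and rng.random() < 0.5)) else c2
    n = len(table)
    while table[i] is not None:        # walk right past the chosen cluster
        i = (i + 1) % n
    table[i] = key
    return i
```

Note that the bidirectional scan in `cluster_size` is exactly the "four probe sequences" mentioned above: two per initial cell.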

Universal Lower Bound
The following lower bound holds for any two-way linear probing hashing scheme, in particular, the ones that are presented in this article.
Theorem 5. Let $n \in \mathbb{N}$, and $m = \lfloor \alpha n \rfloor$, where $\alpha \in (0,1)$ is a constant. Let $A$ be any two-way linear probing algorithm that inserts $m$ keys into a hash table of size $n$. Then upon termination of $A$, w.h.p., the table contains a cluster of size at least $\log_2 \log n - O(1)$.
Proof. Imagine that we have a bin associated with each cell in the hash table. Recall that for each key $x$, algorithm $A$ chooses two initial cells, and hence two bins, independently and uniformly at random, with replacement. Algorithm $A$ then probes linearly to find two (possibly identical) terminal cells, and inserts the key $x$ into one of them. Now imagine that after the insertion of each key $x$, we also insert a ball into the bin associated with the initial cell from which the algorithm started probing to reach the terminal cell into which the key $x$ was placed. If both initial cells lead to the same terminal cell, we break the tie randomly. Clearly, if there is a bin with $k$ balls, then there is a cluster of size at least $k$, because the $k$ balls represent $k$ distinct keys that belong to the same cluster. However, Theorem 1 asserts that the maximum bin load upon termination of algorithm $A$ is at least $\log_2 \log n - O(1)$, w.h.p.
The above lower bound is valid for all algorithms that satisfy Definition 1. A more general lower bound can be established for all open addressing schemes that use two linear probe sequences where the initial hashing cells are chosen according to some (not necessarily uniform or independent) probability distributions defined on the cells. We still assume that the probe sequences are used to find two (empty) terminal hashing cells, and that the key is inserted into one of them according to some strategy. We call such schemes nonuniform two-way linear probing. The proof of the following theorem is similar to that of Theorem 5, using Vöcking's lower bound (Theorem 3) instead.

Theorem 6. Let $n \in \mathbb{N}$, and $m = \lfloor \alpha n \rfloor$, where $\alpha \in (0,1)$ is a constant. Let $A$ be any nonuniform two-way linear probing algorithm that inserts $m$ keys into a hash table of size $n$ where the initial hashing cells are chosen according to some probability distributions. Then the maximum cluster size produced by $A$, upon termination, is at least $\log \log n / (2 \log \phi_2) - O(1)$, w.h.p.

Algorithms that Behave Poorly
We characterize some of the inefficient two-way linear probing algorithms. Notice that the main mistake in algorithms ShortSeq(n, m) and SmallCluster(n, m) is that keys are allowed to be inserted into empty cells even if these cells are very close to giant clusters. This leads us to the following theorem.
Theorem 7. Let $\alpha \in (0,1)$ be a constant. Let $A$ be a two-way linear probing algorithm that inserts $m = \lfloor \alpha n \rfloor$ keys into $n$ cells such that whenever a key chooses an empty initial cell and an occupied one, the algorithm inserts the key into the empty cell. Then algorithm $A$ produces a giant cluster of size $\Omega(\log n)$, w.h.p.
To prove the theorem, we need to recall the following.

Definition 2 (see, e.g., [29]). Non-negative random variables $X_1, \ldots, X_n$ are said to be negatively associated if, for every pair of disjoint index subsets $I, J \subseteq [[n]]$, and for any functions $f: \mathbb{R}^{|I|} \to \mathbb{R}$ and $g: \mathbb{R}^{|J|} \to \mathbb{R}$ that are both non-decreasing or both non-increasing (componentwise), we have
$$\mathbf{E}\left[\, f(X_i,\, i \in I)\, g(X_j,\, j \in J)\,\right] \;\le\; \mathbf{E}\left[\, f(X_i,\, i \in I)\,\right] \mathbf{E}\left[\, g(X_j,\, j \in J)\,\right].$$
Once we establish that $X_1, \ldots, X_n$ are negatively associated, it follows, by considering inductively the indicator functions, that
$$\Pr\{X_1 \le x_1, \ldots, X_n \le x_n\} \;\le\; \prod_{i=1}^{n} \Pr\{X_i \le x_i\},$$
for all $x_1, \ldots, x_n \in \mathbb{R}$. The next lemmas, which are proved in [29,31,47], provide some tools for establishing negative association.
Lemma 1 (Zero-One Lemma). Any binary random variables $X_1, \ldots, X_n$ whose sum is one are negatively associated.

Lemma 2. If $\{X_1, \ldots, X_n\}$ and $\{Y_1, \ldots, Y_m\}$ are independent sets of negatively associated random variables, then the union $\{X_1, \ldots, X_n, Y_1, \ldots, Y_m\}$ is also a set of negatively associated random variables.

Lemma 3. Suppose that $X_1, \ldots, X_n$ are negatively associated. Let $I_1, \ldots, I_k \subseteq [[n]]$ be disjoint index subsets, for some positive integer $k$. For $j \in [[k]]$, let $h_j: \mathbb{R}^{|I_j|} \to \mathbb{R}$ be non-decreasing functions, and define $Z_j = h_j(X_i,\, i \in I_j)$. Then the random variables $Z_1, \ldots, Z_k$ are negatively associated. In other words, non-decreasing functions of disjoint subsets of negatively associated random variables are also negatively associated. The same holds if the $h_j$ are non-increasing functions.
Throughout, we write binomial(n, p) to denote a binomial random variable with parameters n ∈ N and p ∈ [0, 1].
Proof of Theorem 7. Let $\beta = \lfloor b \log_a n \rfloor$ for some positive constants $a$ and $b$ to be defined later, and assume, without loss of generality, that $N := n/\beta$ is an integer. Suppose that the hash table is divided into $N$ disjoint blocks, each of size $\beta$. For $i \in [[N]]$, let $B_i = \{\beta(i-1)+1, \ldots, \beta i\}$ be the set of cells of the $i$-th block, where we consider the cell numbers in a circular fashion. We say that a cell $j \in [[n]]$ is "covered" if there is a key whose first initial hashing cell is the cell $j$ and whose second initial hashing cell is an occupied cell. A block is covered if all of its cells are covered. Observe that if a block is covered, then it is fully occupied. Thus, it suffices to show that, w.h.p., some block is covered.
For $i \in [[N]]$, let $Y_i$ be the indicator that the $i$-th block is covered. The random variables $Y_1, \ldots, Y_N$ are negatively associated, which can be seen as follows. For $j \in [[n]]$ and $t \in [[m]]$, let $X_j(t)$ be the indicator that the $j$-th cell is covered by the $t$-th key, and set $X_0(t) := 1 - \sum_{j=1}^{n} X_j(t)$. Notice that the random variable $X_0(t)$ is binary. The zero-one lemma asserts that the binary random variables $X_0(t), \ldots, X_n(t)$ are negatively associated. However, since the keys choose their initial hashing cells independently, the random variables $X_0(t), \ldots, X_n(t)$ are mutually independent from the random variables $X_0(t'), \ldots, X_n(t')$, for any distinct $t, t' \in [[m]]$. Thus, by Lemma 2, the union $\cup_{t=1}^{m} \{X_0(t), \ldots, X_n(t)\}$ is a set of negatively associated random variables. The negative association of the $Y_i$ is assured now by Lemma 3, as they can be written as non-decreasing functions of disjoint subsets of the indicators $X_j(t)$. Since the $Y_i$ are negatively associated and identically distributed,
$$\Pr\{\forall i:\ Y_i = 0\} \;\le\; \prod_{i=1}^{N} \Pr\{Y_i = 0\} \;=\; \left(1 - \Pr\{Y_1 = 1\}\right)^N \;\le\; e^{-N \Pr\{Y_1 = 1\}}.$$
Thus, we only need to show that $N \Pr\{Y_1 = 1\}$ tends to infinity as $n$ goes to infinity. To bound the last probability, we need to focus on the way the first block $B_1 = \{1, 2, \ldots, \beta\}$ is covered. For $j \in [[n]]$, let $t_j$ be the smallest $t \in [[m]]$ such that $X_j(t) = 1$ (if such $t$ exists), and $m+1$ otherwise. We say that the first block is "covered in order" if and only if $t_1 < t_2 < \cdots < t_\beta \le m$. Since the event $\{Y_1 = 1\}$ is the union, over the $\beta!$ orderings of the cells, of the events that the cells are covered (for the first time) in that particular order, we have
$$\Pr\{Y_1 = 1\} \;\ge\; \Pr\{B_1 \text{ is covered in order}\}.$$
For $t \in [[m]]$, let $M(t)$ be undefined if the first block is fully covered before the insertion of the $t$-th key, and otherwise let $M(t)$ be the minimum $i \in B_1$ such that the cell $i$ has not been covered yet. Let $A$ be the event that, for all $t \in [[m]]$, the first initial hashing cell of the $t$-th key is either the cell $M(t)$ or a cell outside $B_1$. Define the random variable $W := \sum_{t=1}^{m} W_t$, where $W_t$ is the indicator that the $t$-th key covers a cell in $B_1$. Clearly, if $A$ is true and $W \ge \beta$, then the first block is covered in order. However, since the initial hashing cells are chosen independently and uniformly at random, one can bound $\Pr\{A\}$ from below for $n$ chosen large enough, and likewise bound $\Pr\{W_t = 1\}$ from below for $t \ge \lceil m/2 \rceil$. Therefore, for $n$ sufficiently large, $N \Pr\{Y_1 = 1\}$ goes to infinity as $n$ approaches infinity whenever $a = 8e^2/\alpha^2$ and $b$ is any positive constant less than 1.
Clearly, algorithms ShortSeq(n, m) and SmallCluster(n, m) satisfy the condition of Theorem 7, so the corollary follows.

Hashing with Blocking
To overcome the problems of Section 3.2, we introduce blocking. The hash table is partitioned into equal-sized disjoint blocks of cells. Whenever a key has two terminal cells, the algorithm considers the information provided by the corresponding blocks, e.g., the number of keys each harbors, to make a decision. The blocking technique thus enables the algorithm to avoid some of the bad decisions the previous algorithms make. This leads to a more controlled allocation process and, hence, to a more even distribution of the keys. We use the blocking technique to design two two-way linear probing algorithms, and an algorithm that uses linear probing locally within each block. The algorithms are characterized by the way the keys pick the blocks they land in. The worst-case performance of these algorithms is analyzed and shown to be O(log log n), w.h.p.
Note also that (for insertion operations only) the algorithms require a counter for each block, but the extra space consumed by these counters is asymptotically negligible. In fact, we will see that the extra space is O(n/ log log n) in a model in which integers take O(1) space, and at worst O(n log log log n/ log log n) = o(n) units of memory, w.h.p., in a bit model.
Since the block size differs among the following algorithms, we assume throughout, without loss of generality, that whenever we use blocks of size β, the ratio n/β is an integer. Recall that the cells are numbered 0, . . ., n − 1; hence, for i ∈ [[n/β]], the i-th block consists of the cells (i − 1)β, . . ., iβ − 1. In other words, cell k ∈ {0, . . ., n − 1} belongs to block number λ(k) := ⌊k/β⌋ + 1.
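As a quick illustration of this partition (a hypothetical helper function, not taken from the paper), the block index λ(k) is just an integer division:

```python
def block_index(k: int, beta: int) -> int:
    """Return the 1-based index lambda(k) of the block containing cell k,
    where cells are numbered 0..n-1 and every block has size beta."""
    return k // beta + 1

# Cells 0..beta-1 fall in block 1, cells beta..2*beta-1 in block 2, and so on.
```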

Two-way Locally-Linear Probing
As a simple example of the blocking technique, we present the following algorithm, which is a straightforward application of the two-way chaining scheme [5]. The algorithm does not satisfy the definition of two-way linear probing as explained earlier, because the linear probes are performed within each block and not along the hash table. That is, whenever the linear probe sequence reaches the right boundary of a block, it continues probing from the left boundary of the same block.
The algorithm partitions the hash table into disjoint blocks each of size β_1(n), where β_1(n) is an integer to be defined later. We save with each block its load, that is, the number of keys it contains, and keep it updated whenever a key is inserted into the block. For each key we choose two initial hashing cells, and hence two blocks, independently and uniformly at random, with replacement. From the initial cell that belongs to the least loaded block, breaking ties randomly, we probe linearly and cyclically within the block until we find an empty cell in which to insert the key. If the load of the block is β_1, i.e., it is full, then we check its right neighbor block, and so on, until we find a block that is not completely full. We insert the key into the first empty cell there. Notice that only one probe sequence is used to insert any key. The search operation, however, uses two probe sequences, as follows. First, we compute the two initial hashing cells. We start probing linearly and cyclically within the two (possibly identical) blocks that contain these initial cells. If both probe sequences reach empty cells, or if one of them reaches an empty cell and the other finishes the block without finding the key, we declare the search unsuccessful. If both blocks are full and the probe sequences search them completely without finding the key, then the right neighbors of these blocks (cyclically speaking) are searched sequentially in the same way, and so on. We refer to this algorithm as LocallyLinear(n, m) for inserting m keys into n cells. We show next that β_1 can be defined such that, w.h.p., none of the blocks is completely full. This means that whenever we search for a key, most of the time we only need to search linearly and cyclically the two blocks the key chooses initially.

Theorem 8. Let n ∈ N, and m = ⌊αn⌋, where α ∈ (0, 1) is a constant. Let C be the constant defined in Theorem 2, and define β_1 accordingly. Then, w.h.p., the maximum unsuccessful search time of LocallyLinear(n, m) with blocks of size β_1 is at most 2β_1, and the maximum insertion time is at most β_1 − 1.
Proof. Notice the equivalence between algorithm LocallyLinear(n, m) and the allocation process GreedyMC(n/β_1, m, 2), in which m balls (keys) are inserted into n/β_1 bins (blocks) by placing each ball into the least loaded of two bins chosen independently and uniformly at random, with replacement. It suffices, therefore, to study the maximum bin load of GreedyMC(n/β_1, m, 2), which we denote by L_n; Theorem 2 provides the required bound on L_n, w.h.p., and the stated search and insertion times follow.
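The insertion procedure of LocallyLinear can be sketched in Python as follows. This is an illustrative sketch only; the class name, the list-based table representation, and the tie-breaking coin flip are our own choices, not the paper's code.

```python
import random

class LocallyLinear:
    def __init__(self, n: int, beta: int):
        assert n % beta == 0
        self.n, self.beta = n, beta
        self.table = [None] * n            # None marks an empty cell
        self.load = [0] * (n // beta)      # per-block key counts

    def insert(self, key) -> int:
        """Insert one key and return the cell that received it."""
        # Two initial cells chosen uniformly at random, with replacement.
        i, j = random.randrange(self.n), random.randrange(self.n)
        bi, bj = i // self.beta, j // self.beta
        # Start from the initial cell in the least loaded block (ties: random).
        if self.load[bi] < self.load[bj] or (self.load[bi] == self.load[bj]
                                             and random.random() < 0.5):
            start, b = i, bi
        else:
            start, b = j, bj
        # If the chosen block is full, move to its right neighbor, cyclically.
        while self.load[b] == self.beta:
            b = (b + 1) % (self.n // self.beta)
            start = b * self.beta
        # Probe linearly and cyclically *within* the block.
        off = start - b * self.beta
        for step in range(self.beta):
            cell = b * self.beta + (off + step) % self.beta
            if self.table[cell] is None:
                self.table[cell] = key
                self.load[b] += 1
                return cell
```

Note that only one probe sequence is used per insertion, matching the description above; the two random choices only select the starting block.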

Two-way Pre-linear Probing: algorithm DECIDEFIRST
In the previous two-way linear probing algorithms, each input key initiates linear probe sequences that reach two terminal cells, and the algorithm then decides in which terminal cell the key should be inserted. The following algorithm, in contrast, allows each key to choose two initial hashing cells, and then decides, according to some strategy, which initial cell should start a linear probe sequence to find a terminal cell to harbor the key. Technically, then, the insertion of any key uses only one linear probe sequence, but we still use two sequences for any search.
Formally, we describe the algorithm as follows. Let α ∈ (0, 1) be the load factor. Partition the hash table into blocks of size β_2(n), where β_2(n) is an integer to be defined later. Each key x chooses, independently and uniformly at random, two initial hashing cells, say I_x and J_x, and hence two blocks, which we denote by λ(I_x) and λ(J_x). For convenience, we say that key x has landed in block i if the linear probe sequence used to insert x started (from the initial hashing cell available to x) in block i. Define the weight of a block to be the number of keys that have landed in it. We save with each block its weight, and keep it updated whenever a key lands in it. Now, upon the arrival of key x, the algorithm lets x land in the block among λ(I_x) and λ(J_x) of smaller weight, breaking ties randomly. It then probes linearly from the initial cell contained in that block until it finds a terminal cell, into which key x is placed. If both I_x and J_x belong to the same block, then x lands in λ(I_x), and the linear sequence starts from an arbitrarily chosen cell among I_x and J_x. We write DecideFirst(n, m) to refer to this algorithm for inserting m keys into n cells.
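The insertion step just described might be sketched as follows. This is our own illustrative code (the function name and data layout are assumptions); it mirrors the weight bookkeeping of the description above.

```python
import random

def decide_first_insert(table, weight, beta, rng=random):
    """Insert one key into `table` (a list of cells, None = empty) and
    return the cell used. `weight[b]` counts the keys that *landed* in
    block b; it is updated here, as in the description of DecideFirst."""
    n = len(table)
    i, j = rng.randrange(n), rng.randrange(n)      # two initial cells
    bi, bj = i // beta, j // beta                  # their blocks
    if bi == bj:
        start, b = rng.choice([i, j]), bi          # same block: arbitrary cell
    elif weight[bi] < weight[bj] or (weight[bi] == weight[bj]
                                     and rng.random() < 0.5):
        start, b = i, bi
    else:
        start, b = j, bj
    weight[b] += 1                                 # the key lands in block b
    # One linear, cyclic probe sequence along the whole table.
    for step in range(n):
        cell = (start + step) % n
        if table[cell] is None:
            table[cell] = True
            return cell
```

Note that the weight of a block counts landings, not occupancy: the probe sequence may well place the key in a later block.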
In short, the strategy of DecideFirst(n, m) is: land in the block of smaller weight, walk linearly, and insert into the first empty cell reached. The size of the largest cluster produced by the algorithm is Θ(log log n). The performance of this hashing technique is described in the following theorem.

Theorem 9. Let n ∈ N, and m = ⌊αn⌋, where α ∈ (0, 1) is a constant. There is a constant η > 0 such that, if β_2 is chosen accordingly, then, w.h.p., the worst-case unsuccessful search time of algorithm DecideFirst(n, m) with blocks of size β_2 is at most ξ_n := 12(1 − α)^{−2}(log_2 log n + η), and the maximum insertion time is at most ξ_n/2.
Proof. Assume first that DecideFirst(n, m) is applied to a hash table with blocks of size β = ⌈b(log_2 log n + η)⌉, where b = (1 + ε)/(1 − α) for some arbitrary constant ε > 0, and that n/β is an integer. Consider the resulting hash table after termination of the algorithm. Let M ≥ 0 be the maximum number of consecutive blocks that are fully occupied. Without loss of generality, suppose that these blocks start at block i + 1, and let S = {i, i + 1, . . ., i + M} represent these full blocks together with the left adjacent block, which is not fully occupied (Figure 4). Notice that each key chooses two cells (and hence two, possibly identical, blocks) independently and uniformly at random, and always lands in the block of smaller weight. Since there are n/β blocks and ⌊αn⌋ keys, Theorem 2 yields a constant C > 0 such that, w.h.p., the maximum block weight is at most λ_n := (αb + 1) log_2 log n + αbη + α + C. Let A_n denote the event that the maximum block weight is at most λ_n. Let W be the number of keys that have landed in S, i.e., the total weight of the blocks contained in S. Plainly, since block i is not full, all the keys that belong to the M full blocks have landed in S. Thus, W ≥ Mb(log_2 log n + η), deterministically. Now, if we choose η = C + α, then the event A_n implies that (M + 1)(αb + 1) ≥ Mb, for otherwise the total weight of the M + 1 blocks of S would be smaller than the number of keys inserted in its full blocks, a contradiction. Therefore, on A_n, we have M ≤ (αb + 1)/(b − αb − 1), which is a constant, since b − αb − 1 = ε > 0 by the choice αb + 1 < b = (1 + ε)/(1 − α). Again, since block i is not full, the size of the largest cluster is at most the total weight of the M + 2 blocks that cover it, and hence, w.h.p., at most (M + 2)λ_n. This concludes the proof, as the maximum unsuccessful search time is at most twice the maximum cluster size plus two.
Remark. We have in fact shown that, w.h.p., the maximum cluster size produced by DecideFirst(n, m) is not more than the bound (M + 2)λ_n derived in the proof above.

Two-way Post-linear Probing: algorithm WALKFIRST
We introduce yet another hashing algorithm that achieves Θ(log log n) worst-case search time, in probability, and performs better in experiments than algorithm DecideFirst, as demonstrated by the simulation results presented in Section 5. Suppose that the load factor α ∈ (0, 1/2), and that the hash table is divided into blocks of size β_3, where δ ∈ (2α, 1) is an arbitrary constant. Define the load of a block to be the number of keys (or occupied cells) it contains. We save with each block its load, and keep it updated whenever a key is inserted into one of its cells. Recall that each key x has two initial hashing cells. From these initial cells, the algorithm probes linearly and cyclically until it finds two empty cells, U_x and V_x, which we call terminal cells. Let λ(U_x) and λ(V_x) be the blocks that contain these cells. The algorithm then inserts key x into the terminal cell (among U_x and V_x) that belongs to the least loaded of the blocks λ(U_x) and λ(V_x), breaking ties randomly. We refer to this open addressing hashing algorithm for inserting m keys into n cells as WalkFirst(n, m). In the remainder of this section, we analyze the worst-case performance of WalkFirst(n, m). Recall that the maximum unsuccessful search time is bounded from above by twice the maximum cluster size plus two. The following theorem asserts that, upon termination of the algorithm, it is most likely that every block has at least one empty cell. This implies that the length of the largest cluster is at most 2β_3 − 2.
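The "walk first, then decide" order can be sketched as follows; again, this is our own illustrative code under assumed names, not the paper's implementation.

```python
import random

def walk_first_insert(table, load, beta, rng=random):
    """Insert one key into the terminal cell whose block is least loaded,
    breaking ties randomly; return the cell used. `table` is a list of
    cells (None = empty), `load[b]` is the occupancy of block b."""
    n = len(table)

    def terminal(start):
        # Linear, cyclic probing from `start` to the first empty cell.
        for step in range(n):
            cell = (start + step) % n
            if table[cell] is None:
                return cell

    u = terminal(rng.randrange(n))                 # first terminal cell
    v = terminal(rng.randrange(n))                 # second terminal cell
    bu, bv = u // beta, v // beta                  # their blocks
    if load[bu] < load[bv] or (load[bu] == load[bv] and rng.random() < 0.5):
        cell, b = u, bu
    else:
        cell, b = v, bv
    table[cell] = True
    load[b] += 1
    return cell
```

Contrast this with DecideFirst: here both linear walks are carried out first, and the block comparison is made between the blocks of the two terminal cells, not of the two initial cells.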
Theorem 10. Let n ∈ N, and m = ⌊αn⌋, for some constant α ∈ (0, 1/2). Let δ ∈ (2α, 1) be an arbitrary constant, and define β_3 accordingly. Upon termination of algorithm WalkFirst(n, m) with blocks of size β_3, the probability that there is a fully loaded block goes to zero as n tends to infinity. That is, w.h.p., the maximum unsuccessful search time of WalkFirst(n, m) is at most 4β_3 − 2, and the maximum insertion time is at most 4β_3 − 4.
For k ∈ [[m]], let A_k denote the event that after the insertion of k keys (i.e., at time k), none of the blocks is fully loaded. To prove Theorem 10, we shall show that P{A_m^c} = o(1). We do so using a witness tree argument; see, e.g., [16, 17, 62, 66, 83, 93]. We show that if a fully loaded block exists, then there is a witness binary tree of height β_3 that describes the history of that block. The formal definition of a witness tree is given below. Let us number the keys 1, . . ., m according to their insertion time. Recall that each key t ∈ [[m]] has two initial cells which lead to two terminal empty cells belonging to two blocks. We denote these two blocks available for the t-th key by X_t and Y_t. Notice that all the initial cells are independent and uniformly distributed; the terminal cells, and hence their blocks, are not. Nonetheless, for each fixed t, the two random values X_t and Y_t are independent.

The History Tree
We define for each key t a full history tree T_t that essentially describes the history of the block containing the t-th key up to its insertion time. It is a colored binary tree, labelled by key numbers except possibly at the leaves, where each key refers to the block that contains it. It thus represents all the pairs of blocks available to the other keys upon which the final position of key t relies. Formally, we construct the binary tree node by node in breadth-first-search (bfs) order as follows. First, the root of T_t is labelled t and colored white. Any white node labelled τ gets two children: a left child corresponding to the block X_τ, and a right child corresponding to the block Y_τ. The left child is labelled and colored according to the following rules: (a) If the block X_τ contains some keys at the time of insertion of key τ, and the last key inserted in that block, say σ, has not been encountered thus far in the bfs order of the binary tree T_t, then the node is labelled σ and colored white.
(b) As in case (a), except that σ has already been encountered in the bfs order. We distinguish such nodes by coloring them black, but they get the same label σ.
(c) If the block X_τ is empty at the time of insertion of key τ, then the node is a "dead end": it gets no label and is colored gray.
Next, the right child of τ is labelled and colored by the same rules, but with the block Y_τ. We continue processing nodes in bfs fashion. A black or gray node is a leaf and is not processed any further. A white node with label σ is processed in the same way we processed key τ, but with its two blocks X_σ and Y_σ. We continue constructing the tree recursively until all leaves are black or gray. See Figure 6 for an example of a full history tree. Notice that the full history tree is fully deterministic, as it does not contain any random value. It is also clear that the full history tree contains at least one gray leaf, and that every internal (white) node has two children. Furthermore, since the insertion process is sequential, node values (key numbers) along any path down from the root must be decreasing (so the binary tree has the heap property), because any non-gray child of a node represents the last key inserted in the corresponding block at the insertion time of the parent. We will not use the heap property, however.
Clearly, the full history tree permits one to deduce the load of the block that contains the root key at the time of its insertion: it is the length of the shortest path from the root to any gray node. Thus, if the block's load is more than h, then all gray nodes must be at distance more than h from the root. This leads to the notion of a truncated history tree of height h, that is, with h + 1 levels of nodes. The top part of the full history tree, consisting of all nodes at the first h + 1 levels, is copied, and the remainder is truncated.
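The observation that the load equals the distance from the root to the nearest gray node can be checked on a toy tree. The dictionary-based node representation below is a hypothetical one chosen for illustration:

```python
from collections import deque

def load_from_history_tree(root):
    """Each node is a dict {'color': 'white'|'black'|'gray',
    'children': [...]} (leaves omit 'children'). Return the length of the
    shortest root-to-gray-node path, i.e. the load of the block containing
    the root's key at its insertion time."""
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if node['color'] == 'gray':
            return depth            # BFS reaches the nearest gray node first
        for child in node.get('children', []):
            queue.append((child, depth + 1))

# The root's right child is gray at depth 1, so the block held one key.
tree = {'color': 'white', 'children': [
    {'color': 'white', 'children': [{'color': 'gray'}, {'color': 'black'}]},
    {'color': 'gray'},
]}
```

Since the traversal is breadth-first, the first gray node dequeued is necessarily a nearest one, which is exactly the quantity the text describes.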
We are particularly interested in truncated history trees without gray nodes. In that case, by the property above, the length of the shortest path from the root to any gray node (and, as noted, there is at least one such node) must be at least h + 1, and therefore the load of the block harboring the root's key must be at least h + 1. More generally, if the load is at least h + ξ for a positive integer ξ, then all nodes at the bottom level of the truncated history tree that are not black (and there is at least one such node) must be white nodes whose children represent keys that belong to blocks with load of at least ξ at their insertion times. We redraw these nodes as boxes to indicate that they represent blocks of load at least ξ, and call them "block" nodes.

Figure 7: A witness tree of height h, which is a truncated history tree without gray nodes. The boxes at the lowest level are block nodes. They represent selected blocks with load of at least ξ. The load of the block that contains key 70 is at least h + ξ.

The Witness Tree
Let ξ ∈ N be a fixed integer to be picked later. For positive integers h and k, where h + ξ ≤ k ≤ m, a witness tree W_k(h) is a truncated history tree of a key in the set [[k]], with h + 1 levels of nodes (thus, of height h) and with two types of leaf nodes: black nodes and "block" nodes. Each internal node has two children, and the node labels belong to the set [[k]]. Each black leaf carries the label of an internal node that precedes it in bfs order. Block nodes are unlabelled nodes that represent blocks with load of at least ξ. Block nodes must all be at the level furthest from the root, and there is at least one such node in a witness tree. Notice that every witness tree is deterministic. An example of a witness tree is shown in Figure 7.
Let W_k(h, w, b) denote the class of all witness trees W_k(h) of height h that have w ≥ 1 white (internal) nodes and b ≤ w black nodes (and thus w − b + 1 block nodes). Notice that, by definition, the class W_k(h, w, b) could be empty, e.g., if w < h or w ≥ 2^h. However, |W_k(h, w, b)| ≤ 4^w · 2^{w+1} · w^b · k^w, for the following reason. Without the labelling, there are at most 4^w distinct binary tree shapes, because the shape is determined by the w internal nodes, and hence the number of shapes is the Catalan number (2w choose w)/(w + 1) ≤ 4^w. Having fixed the shape, each of the at most w + 1 leaves is of one of two types. Each black leaf can receive one of the w white node labels, and each white node gets one of k possible labels.
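The counting step relies on the bound C_w = (2w choose w)/(w + 1) ≤ 4^w for the number of binary-tree shapes with w internal nodes. A quick numerical check (illustrative only):

```python
from math import comb

def catalan(w: int) -> int:
    """Number of distinct binary-tree shapes with w internal nodes."""
    return comb(2 * w, w) // (w + 1)

# For example, catalan(3) == 5: the five shapes of a binary tree
# with three internal nodes.
```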
Note that, unlike the full history tree, not every key has a witness tree W_k(h): the key must be placed into a block of load at least h + ξ − 1 just before its insertion time. We say that a witness tree W_k(h) occurs if, upon execution of algorithm WalkFirst, the random choices available for the keys represented by the witness tree are exactly as indicated in the witness tree itself. Thus, a witness tree of height h occurs if and only if some key is inserted into a block whose load is at least h + ξ − 1 just before the insertion.
Before we embark on the proof of Theorem 10, we highlight three important facts whose proofs are provided in the appendix. First, we bound the probability that a valid witness tree occurs.

Lemma 4. Let D denote the event that the number of blocks in WalkFirst(n, m) with load of at least ξ, after termination, is at most n/(aβ_3 ξ), for some constant a > 0. For k ∈ [[m]], let A_k be the event that after the insertion of k keys, none of the blocks is fully loaded. Then, for any positive integers h, w, and k ≥ h + ξ, and any non-negative integer b ≤ w, the probability that a fixed witness tree in W_k(h, w, b) occurs, on the events A_{k−1} and D, admits the bound stated in the appendix.

The next lemma asserts that the event D in Lemma 4 is most likely true, for sufficiently large ξ < β_3.
Lemma 6 addresses a simple but crucial fact. If the height of a witness tree is at least two, then the number of white nodes w is at least two (namely, the root and its left child); but what can we say about b, the number of black nodes? Lemma 6 bounds b for any witness tree.

Proof of Theorem 10. Recall that A_k is the event that after the insertion of k keys (i.e., at time k), none of the blocks is fully loaded. The events A_k are decreasing, and the event A_{β_3−1} is deterministically true. We shall show that P{A_m^c} = o(1). Let D denote the event that the number of blocks with load of at least ξ, after termination, is at most n/(aβ_3 ξ), for some constant a > 1 to be decided later. Lemma 5 reveals that P{D^c} = o(1), and hence we only need to show that p_k := P{A_{k−1} ∩ A_k^c ∩ D} is small, for k = β_3, . . ., m. We do so using the witness tree argument. Let h, ξ, η ∈ [2, ∞) be integers to be picked later such that h + ξ ≤ β_3. If, after the insertion of k keys, there is a block with load of at least h + ξ, then a witness tree W_k(h) (with block nodes representing blocks with load of at least ξ) must have occurred. Recall that the number of white nodes w in any witness tree W_k(h) is at least two. Using Lemmas 4 and 6, we bound p_k by summing over w and b. Note that we disallow b = w + 1, because any witness tree has at least one block node. We split the sum over w ≤ 2^{h−η} and w > 2^{h−η}. For w ≤ 2^{h−η}, we have b ≥ η, and the corresponding terms are small provided that n is so large that a2^{h+1}ξβ_3 ≤ n (this ensures that awξβ_3/n < 1/2). For w ∈ (2^{h−η}, 2^h], we bound trivially, assuming the same large-n condition. In summary, we set a = 32 and ξ = ⌈δβ_3⌉, so that 32αβ_3/(aξ) ≤ 1/2, because δ ∈ (2α, 1). With this choice, the bound involves the constant c = Σ_{w≥2} w^η/2^w. Clearly, if we put h = η + ⌈log_2((log_2 n)/η)⌉ and η = 3, then h + ξ ≤ β_3 and p_k = o(1/n). Notice that h and ξ satisfy the technical condition a2^{h+1}ξβ_3 ≤ n, asymptotically.

The simulation results show that there is a nonlinear increase, as a function of n, in the difference between the performances of these algorithms. This may suggest that the worst-case performances of algorithms ShortSeq and SmallCluster are roughly of the order of log n. The simulation data for algorithms LocallyLinear, WalkFirst, and DecideFirst are presented in Tables 3, 4, and 5. These algorithms are simulated with blocks of size ⌈(1 − α)^{−1} log_2 log n⌉. The purpose of this is to show that, practically, the additive and multiplicative constants appearing in the definitions of the block sizes stated in Theorems 8, 9, and 10 can be chosen to be small. The hash table is partitioned into equal-sized blocks, except possibly the last one. The average and the maximum values of the successful search time, insert time, and cluster size (averaged over 10 iterations each consisting of 100 simulations of the algorithms) are recorded in the tables below, where the best performances are drawn in boldface.
The results show that algorithm LocallyLinear has the best performance, while algorithm WalkFirst appears to perform better than DecideFirst. Indeed, the cluster sizes produced by algorithm WalkFirst appear to be very close to those of LocallyLinear. This supports the conjecture that Theorem 10 is, in fact, true for any constant load factor α ∈ (0, 1), and that the maximum unsuccessful search time of WalkFirst is at most 4(1 − α)^{−1} log_2 log n + O(1), w.h.p. The average maximum cluster size of algorithm DecideFirst is close to the others when α is small, but almost doubles when α is large. This may suggest that the multiplicative constant in the maximum unsuccessful search time established in Theorem 9 can be improved.
Comparing the simulation data from all tables, one can see that the best average performance is achieved by algorithms LocallyLinear and ShortSeq. Notice that algorithm ShortSeq achieves the best average successful search time when α = 0.9. The best (average and maximum) insertion time is achieved by algorithm LocallyLinear. On the other hand, algorithms WalkFirst and LocallyLinear are superior to the others in worst-case performance. It is worth noting that, surprisingly, the worst-case successful search time of algorithm SmallCluster is very close to that achieved by WalkFirst and better than that of DecideFirst, although the difference appears to grow as n increases.
Proof. Fix ξ ≥ δβ_3. Let B denote the last block in the hash table, i.e., B consists of the cells n − β_3, . . ., n − 1. Let L be the load of B after termination. Since the loads of the blocks are identically distributed, it is enough to bound P{L ≥ ξ} and apply the union bound over the n/β_3 blocks. Let S be the set of consecutively occupied cells, after termination, that lie between the first empty cell to the left of block B and cell n − β_3; see Figure 8. We say that a key is born in a set of cells A if at least one of its two initial hashing cells belongs to A. For convenience, we write ν(A) for the number of keys born in A. Obviously, ν(A) is binomial(m, 2|A|/n). Since the cell adjacent to the left boundary of S is empty, all the keys that are inserted in S are actually born in S. That is, if |S| = j, then ν(S) ≥ j, and the probability of this event can be bounded by the binomial tail inequality given earlier.
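A binomial tail inequality of the standard Chernoff form P{Bin(m, p) ≥ j} ≤ (emp/j)^j, for j > mp, suffices for bounds of this kind (whether this is the exact inequality "given earlier" is an assumption on our part, as that passage is not reproduced here). A quick numerical sanity check:

```python
from math import comb, e

def binom_tail(m: int, p: float, j: int) -> float:
    """Exact P{Binomial(m, p) >= j}."""
    return sum(comb(m, t) * p**t * (1 - p)**(m - t) for t in range(j, m + 1))

def tail_bound(m: int, p: float, j: int) -> float:
    """Standard bound P{Bin(m, p) >= j} <= (e*m*p/j)**j, valid for j > m*p."""
    return (e * m * p / j) ** j
```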

Figure 1 :
Figure 1: An illustration of algorithm ShortSeq(n, m) in terms of balls (keys) and bins (cells). Each ball is inserted into the empty bin found by the shorter sequence.

Figure 2 :
Figure 2: Algorithm SmallCluster(n, m) inserts each key into the empty cell adjacent to the smaller cluster, breaking ties randomly. The size of the clusters is determined by probing linearly in both directions.

Figure 3 :
Figure 3: An illustration of algorithm DecideFirst(n, m). The hash table is divided into blocks of size β_2. The number under each block is its weight. Each key decides first to land in the block of smaller weight, breaking ties randomly, then probes linearly to find its terminal cell.

Figure 4 :
Figure 4: A portion of the hash table showing the largest cluster, and the set S, which consists of the full consecutive blocks and their left neighbor.

Figure 5 :
Figure 5: Algorithm WalkFirst(n, m) inserts each key into the terminal cell that belongs to the least crowded block, breaking ties arbitrarily.

Figure 6 :
Figure 6: The full history tree of key 18. White nodes represent type (a) nodes. Black nodes are type (b) nodes; they refer to keys already encountered in bfs order. Gray nodes are type (c) nodes; they occur when a key selects an empty block.

Figure 8 :
Figure 8: The last part of the hash table showing clusters, the last block B, and the set S.

Table 1 :
The average and the maximum successful search and insert times averaged over 10 iterations each consisting of 100 simulations of the algorithms.The best successful search time is shown in boldface and the best insert time is shown in italic.

Table 2 :
The average maximum cluster size and the average cluster size over 100 simulations of the algorithms.The best performances are drawn in boldface.

Table 3 :
The average and the maximum successful search time averaged over 10 iterations each consisting of 100 simulations of the algorithms.The best performances are drawn in boldface.

Table 4 :
The average and the maximum insert time averaged over 10 iterations each consisting of 100 simulations of the algorithms.The best performances are drawn in boldface.

Table 5 :
The average and the maximum cluster sizes averaged over 10 iterations each consisting of 100 simulations of the algorithms.The best performances are drawn in boldface.