The Codebook algorithm performs background subtraction with a clustering technique on sequences taken from a fixed point of view in order to segment moving objects out of the background. The method works in two phases: Codebook construction and foreground detection. In the first phase, a model representing the background is constructed from a sequence of images on a pixel-by-pixel basis. Then, in the second phase, every new frame is compared with this background model in order to finally obtain a foreground–background segmentation.
During the last decade, many works have been dedicated to improving this model. For example, Refs. [22,23] have adopted a two-layer model to handle dynamic background and illumination variation problems. Other modifications, such as transferring RGB to other color models in order to handle shadows and highlights in foreground detection, can be found in [10]. In Ref. [24], a multi-feature Codebook model, which integrates intensity, color and texture information across multiple scales, has been presented.
The objective of this work is to investigate the benefits of multispectral sequences over traditional RGB for improving the performance of moving object detection. Thus, we first adapt the original Codebook algorithm to multispectral sequences. Only minor modifications are required compared with the original RGB Codebook technique [8]. Specifically, the definition of brightness in RGB is extended to the multispectral case. In addition, we replace the color distortion of the original Codebook with a spectral distortion, since the term color is tied to RGB, and even three bands selected from a multispectral sequence are not strictly color channels.
2.1. Codebook Construction
For each pixel, a Codebook C = {c₁, c₂, …, c_L} is constructed to describe how the background should appear, and each codebook consists of L codewords. The number of codewords varies according to the pixel's activity. More precisely, each codeword c_i is defined by two vectors: the first one, v_i = (X̄₁, X̄₂, …, X̄_n), contains the average spectral values for each band of the pixel, where n is the number of bands of the multispectral sequence. The second one is a six-tuple vector aux_i = ⟨Ǐ_i, Î_i, f_i, λ_i, p_i, q_i⟩, where:
Ǐ_i, Î_i: the min and max brightness, respectively, of all pixels assigned to codeword c_i.
f_i: the frequency with which codeword c_i has occurred.
λ_i: the maximum negative run-length (MNRL), defined as the longest interval of time during the construction period in which codeword c_i has not been updated.
p_i, q_i: the first and the last times, respectively, that codeword c_i has occurred.
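The average spectral vector and the six-tuple above can be grouped into one record per codeword. The following minimal Python sketch is our own illustration; the class name `Codeword` and its field names are not from the paper:

```python
import numpy as np

class Codeword:
    """One background codeword for a single pixel (illustrative layout)."""

    def __init__(self, x, t):
        # x: multispectral pixel vector (n bands); t: frame index (1-based)
        I = float(np.linalg.norm(x))
        self.v = np.asarray(x, dtype=float)  # average spectral values (X1..Xn)
        self.I_min = I                       # min brightness of assigned pixels
        self.I_max = I                       # max brightness of assigned pixels
        self.f = 1                           # frequency of occurrence
        self.mnrl = t - 1                    # maximum negative run-length (lambda)
        self.p = t                           # first access time
        self.q = t                           # last access time
```

A codeword created at frame t starts with brightness bounds equal to the pixel's own L2-norm and an MNRL of t − 1, matching the initialization used during construction.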
To construct this background model, the codebook for each pixel is initialized to the empty set when the algorithm starts, as the first line of Algorithm 1 shows. N is defined as the total number of frames in the construction phase. Then, the current value x_t of a given pixel is compared to its current codebook C. If there is a match with a codeword c_m, this codeword is used as the sample's encoding approximation. Otherwise, a new codeword is created. The detailed algorithm of Codebook construction is given in Algorithm 1, during which the matching process is evaluated by two criteria: (a) brightness bounds and (b) spectral distortion.
Algorithm 1 Codebook Construction
L ⟵ 0, C ⟵ ∅
for t = 1 to N do
    x_t = (X₁, X₂, …, X_n), I ⟵ ‖x_t‖
    find the matching codeword c_m to x_t in C if (a) and (b) occur:
        (a) brightness(I, ⟨Ǐ_m, Î_m⟩) = true
        (b) spectral_dist(x_t, v_m) ≤ ε₁
    if C = ∅ or there is no match, then L ⟵ L + 1; create a new codeword c_L by setting
        v_L = x_t
        aux_L = ⟨I, I, 1, t − 1, t, t⟩.
    else, update the matched codeword c_m, composed of v_m = (X̄₁, …, X̄_n) and aux_m = ⟨Ǐ_m, Î_m, f_m, λ_m, p_m, q_m⟩, by setting
        v_m ⟵ ((f_m X̄₁ + X₁)/(f_m + 1), …, (f_m X̄_n + X_n)/(f_m + 1))
        aux_m ⟵ ⟨min(I, Ǐ_m), max(I, Î_m), f_m + 1, max(λ_m, t − q_m), p_m, t⟩.
    end if
end for
for each codeword c_i, i = 1, …, L, wrap around λ_i by setting λ_i ⟵ max(λ_i, N − q_i + p_i − 1)
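The construction loop of Algorithm 1 for a single pixel can be sketched in Python as follows. The function name `construct_codebook`, the list layout `[v, I_min, I_max, f, mnrl, p, q]`, and the default values of `alpha`, `beta` and `eps1` are our own illustrative choices; the final loop applies the MNRL wrap-around step of the original Codebook paper [8]:

```python
import numpy as np

def construct_codebook(samples, alpha=0.5, beta=1.2, eps1=10.0):
    """samples: sequence of n-band vectors for one pixel location over time."""
    codebook = []  # each codeword: [v, I_min, I_max, f, mnrl, p, q]
    for t, x in enumerate(samples, start=1):
        x = np.asarray(x, dtype=float)
        I = float(np.linalg.norm(x))
        match = None
        for cw in codebook:
            v, I_min, I_max = cw[0], cw[1], cw[2]
            # (a) brightness bounds derived from the stored min/max brightness
            I_low = alpha * I_max
            I_hi = min(beta * I_max, I_min / alpha)
            # (b) spectral distortion: distance of x from the axis along v
            p2 = (x @ v) ** 2 / (v @ v)
            dist = np.sqrt(max(x @ x - p2, 0.0))
            if I_low <= I <= I_hi and dist <= eps1:
                match = cw
                break
        if match is None:
            # create a new codeword with aux = <I, I, 1, t-1, t, t>
            codebook.append([x, I, I, 1, t - 1, t, t])
        else:
            v, I_min, I_max, f, mnrl, p, q = match
            match[0] = (f * v + x) / (f + 1)  # running mean of spectral values
            match[1] = min(I, I_min)
            match[2] = max(I, I_max)
            match[3] = f + 1
            match[4] = max(mnrl, t - q)       # longest gap since last update
            match[6] = t                      # last access time
    # wrap the MNRL around the end of the training sequence
    N = len(samples)
    for cw in codebook:
        cw[4] = max(cw[4], N - cw[6] + cw[5] - 1)
    return codebook
```

A stable pixel collapses into a single codeword, while a pixel whose spectrum jumps outside the cylinder of one codeword spawns a second.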
(a) The brightness of the pixel must lie in the interval [I_low, I_hi]. For grayscale pixels, the brightness is simply the grayscale value I. For RGB pixels, the brightness is calculated by I = √(R² + G² + B²). Accordingly, for a multispectral pixel vector x = (X₁, X₂, …, X_n), the brightness can also be measured by the L2-norm of the pixel vector:

I = ‖x‖ = √(X₁² + X₂² + … + X_n²), (1)

where n is the number of bands. The boundaries are calculated from the min and max brightness ⟨Ǐ, Î⟩ with Equation (2):

I_low = αÎ, I_hi = min(βÎ, Ǐ/α), (2)

where the values of α and β are obtained from experiments. Typically, α is between 0.4 and 0.7, and β is between 1.1 and 1.5 [8].
Thus, the logical brightness function is defined as:

brightness(I, ⟨Ǐ, Î⟩) = true if I_low ≤ ‖x‖ ≤ I_hi, and false otherwise. (3)
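The brightness test of Equation (2) can be sketched as a small Python helper; the function name and the default values of `alpha` and `beta` are illustrative assumptions chosen within the reported ranges:

```python
import numpy as np

def brightness(x, I_min, I_max, alpha=0.5, beta=1.2):
    """True if the L2-norm of pixel vector x lies within [I_low, I_hi]."""
    I = float(np.linalg.norm(np.asarray(x, dtype=float)))
    I_low = alpha * I_max                 # lower bound from Equation (2)
    I_hi = min(beta * I_max, I_min / alpha)
    return I_low <= I <= I_hi
```

With α = 0.5 and β = 1.2, a pixel whose norm equals the stored brightness passes, while one an order of magnitude brighter fails.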
(b) The spectral distortion, spectral_dist, must lie under a given threshold ε₁. Equations (4) and (5) define the calculation of the spectral distortion between an input multispectral vector x_t and a background average multispectral vector v_i:

p² = ⟨x_t, v_i⟩² / ‖v_i‖², (4)

spectral_dist(x_t, v_i) = δ = √(‖x_t‖² − p²). (5)
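Geometrically, Equations (4) and (5) give the distance of x_t from the line through the origin along v_i, which a short Python sketch (naming is our own) makes explicit:

```python
import numpy as np

def spectral_dist(x, v):
    """Distance of input vector x from the axis spanned by codeword mean v."""
    x = np.asarray(x, dtype=float)
    v = np.asarray(v, dtype=float)
    p2 = (x @ v) ** 2 / (v @ v)            # squared projection onto v, Eq. (4)
    return float(np.sqrt(max(x @ x - p2, 0.0)))  # delta, Eq. (5)
```

A vector collinear with v_i has zero distortion regardless of its brightness, which is exactly why the brightness bounds of criterion (a) are needed as a second test.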
To make it intuitive, the two criteria (a) and (b) are visualized in Figure 1. The pixel of a multispectral image is considered as a vector in an n-dimensional space, and three bands are used as an example. In Figure 2, the blue cylinder represents a certain codeword, whose bottom radius is the spectral distortion threshold ε₁. The red and the blue vectors stand for the average spectral vector v_i in this codeword and the current pixel x_t, respectively. With Equations (4) and (5), the spectral distortion can be calculated and illustrated as the green line. As discussed above, a match is found if the brightness of the pixel vector lies between I_low and I_hi, and the spectral distortion is under the given threshold ε₁. Accordingly, the L2-norm of vector x_t must be located along the axis of the cylinder, and the length of the green line must be smaller than the radius of the cylinder.
At the end of the Codebook construction algorithm, the model has to discard the codewords that most probably belong to foreground objects. To achieve that, the algorithm makes use of the MNRL recorded in the six-tuple of each codeword. A low value means that the codeword has been frequently observed; a high value means that it has been observed less frequently, and that it should be removed from the model as it is probably part of the foreground. The threshold value is often set to half of the number of images used in the construction period [8].
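This temporal filtering step can be sketched as follows; the codeword layout `[v, I_min, I_max, f, mnrl, p, q]` and the function name are our own illustrative assumptions:

```python
def temporal_filter(codebook, N):
    """Keep codewords whose MNRL does not exceed half the training length N."""
    kept = []
    for cw in codebook:
        mnrl, p, q = cw[4], cw[5], cw[6]
        # wrap the MNRL around the end of the sequence so a codeword last
        # seen early in training is penalized for the trailing gap as well
        wrapped = max(mnrl, N - q + p - 1)
        if wrapped <= N / 2:
            kept.append(cw)
    return kept
```

A codeword observed throughout training survives, whereas one seen only in a short burst near the end of a long sequence is dropped as probable foreground.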