The amount of information captured from a scene, which determines the radiometric resolution of the image and hence the actual information it contains, depends on the sensitivity of the infrared sensors in the thermal camera. The sensitivity of an infrared sensor describes its ability to detect slight differences in infrared energy; the detected differences are encoded using bits and displayed in varying tones of gray. A high gray level corresponds to high infrared energy emission and vice versa. The images are, therefore, made up of numbers ranging from zero (0) to the number of available gray levels minus one, and the maximum gray level is determined by the number of bits used to represent the detected infrared energy. For example, an 8-bit sensor yields $2^8 = 256$ gray levels in the range [0, 255]. To highlight pedestrians, it is necessary to reduce the sensitivity of the camera so that regions with infrared emissions lower than the minimum detectable by the thermal sensor appear as black, and regions with infrared emissions higher than the maximum appear as white. In this paper, we perform this sensitivity reduction using histogram specification and iterative histogram partitioning. The pseudo-code for the Dynamic Range Adjustment algorithm is shown in Algorithm 1.
Algorithm 1: Dynamic Range Adjustment.
3.1.1. Histogram Specification Using Histogram Equalisation
Histogram equalisation is the best-known application of histogram specification, traditionally used as an image enhancement technique that yields a more balanced histogram and better contrast. When used on infrared images, it simulates the effect of reducing the dynamic range of the camera before a scene is captured. In this section, we are interested in creating the effect of reduced camera sensitivity in order to reduce the level of detail in the image and enhance the pedestrians. The mathematical foundation of histogram equalisation is based on the idea that pixels in the original and equalised images can be regarded, respectively, as continuous random variables $X$ and $Y$ in the range of gray levels $[0, L-1]$, with the normalised histogram regarded as a probability density function (PDF) [34]. It is a transform $T$ of $X$ into $Y$ that spreads gray levels over the entire scale so that each gray level is allotted an equal number of pixels. Therefore, $Y$ is defined as follows:

$$Y = T(X) = (L-1)\int_{0}^{X} p_X(x)\,dx,$$

where $p_X(x)$ is the PDF of the original image. $T$, therefore, is the cumulative distribution of $X$ multiplied by $(L-1)$.
The histogram $h(l)$ of an image $f$ with $L$ gray levels in the range [0, 255] is given as follows:

$$h(l) = n_l, \qquad l = 0, 1, \ldots, L-1,$$

where $n_l$ is the number of pixels with gray level $l$. If the image has $m$ pixels in total, the normalised histogram $p(l)$ is calculated as follows:

$$p(l) = \frac{h(l)}{m} = \frac{n_l}{m},$$

where $\sum_{l=0}^{L-1} p(l) = 1$. Given gray level $l$, this is equivalent to dividing the number of pixels at each gray level, $n_l$, by the total number of pixels in the image, $m$. Given intensity $k$, the histogram equalised image $g$ of $f$ can then be defined as follows:

$$g(k) = \left\lfloor (L-1) \sum_{l=0}^{k} p(l) \right\rfloor.$$
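To make the discrete mapping above concrete, the following NumPy sketch equalises a single-channel 8-bit image. The function name and the 256-level assumption are ours for illustration; this is not a reproduction of the authors' implementation.

```python
import numpy as np

def equalise_histogram(image: np.ndarray, levels: int = 256) -> np.ndarray:
    """Histogram equalisation of a single-channel 8-bit image.

    Implements g(k) = floor((L - 1) * sum_{l <= k} p(l)) from the text.
    """
    # h(l): number of pixels with gray level l
    h = np.bincount(image.ravel(), minlength=levels)
    # p(l): normalised histogram (PDF), m = total number of pixels
    p = h / image.size
    # Cumulative distribution scaled by (L - 1), floored to integer gray levels
    mapping = np.floor((levels - 1) * np.cumsum(p)).astype(np.uint8)
    # Apply the transform T to every pixel
    return mapping[image]
```

In the dynamic range adjustment, a call such as `equalise_histogram(frame)` would provide the starting image for the iterative partitioning described next.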
3.1.2. Histogram Partitioning
Histogram partitioning is normally carried out by finding a suitable gray value with which to divide the histogram in order to identify and/or isolate objects of interest. A suitable gray value can be selected using either a local or a global thresholding technique. Local techniques partition an image into subimages and determine a threshold for each of these subimages, while global techniques partition the entire image with a single threshold value. In this section, we are interested in finding the range in which the pedestrians lie, and this is achieved by iterative global thresholding, beginning with the histogram equalised image and continuing until convergence. At each iteration, the threshold value obtained becomes the new minimum of the dynamic range. The threshold value at each iteration is obtained using minimum cross-entropy. The details are given as follows.
The thresholding problem is the choice of the best distribution estimate for an event with unknown probabilities. Let the event with unknown probabilities $b = (b_1, b_2, \ldots, b_L)$ be the resulting binary image, where $L$ refers to the number of grey levels of the image pixels or the number of bins in the image histogram. The problem is to choose the distribution $b$ that best estimates these unknown probabilities given what we know. The solution to the problem is the distribution whose expected values fall within the bounds of, or are equal to, the known values, thereby satisfying certain learned expectations, or constraints. However, there is an infinite set of distributions that satisfy the constraints. Information is a measure of one's freedom of choice in making selections [35]; thus, entropy, a measure of information, becomes necessary. The principle of maximum entropy, which states that the distribution of choice, from all that satisfy the constraints, is the one with the largest entropy, is the prescribed solution for such problems. However, in situations where a prior distribution that estimates $b$ is known in addition to the learned expectations, the principle of minimum cross-entropy, a generalisation of the maximum entropy principle, applies [36].
Let the original image be the prior distribution with known probabilities $w = (w_1, w_2, \ldots, w_L)$. Cross-entropy $H(b, w)$ is the average number of bits needed to encode data with distribution $b$ when it is modelled with distribution $w$ [37] and is defined as follows:

$$H(b, w) = -\sum_{i=1}^{L} b_i \log w_i,$$

and can be written as

$$H(b, w) = H(b) + D_{KL}(b \parallel w),$$

where

$$H(b) = -\sum_{i=1}^{L} b_i \log b_i$$

is the entropy of the distribution $b$, and

$$D_{KL}(b \parallel w) = \sum_{i=1}^{L} b_i \log \frac{b_i}{w_i}$$

is the Kullback–Leibler (K-L) divergence of distributions $b$ and $w$, defined as the excess code over the optimal code needed to represent the data because it was modelled using distribution $w$ instead of the true distribution $b$ [37]. The principle of minimum cross-entropy states that, of all distributions $b$ that satisfy the constraints, the distribution of choice is the one with the smallest cross-entropy [36]. The value of $H(b)$ in the decomposition above is fixed and, during minimisation, reduces to an additive constant. Therefore, minimising cross-entropy reduces to minimising the K-L divergence:

$$\min_{b} D_{KL}(b \parallel w),$$
which is subject to the following constraint:

$$\sum_{i=1}^{L} b_i = 1.$$
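For concreteness, the decomposition above can be checked numerically. The sketch below is our illustration, not part of the paper; it computes the cross-entropy, entropy, and K-L divergence of two hypothetical discrete distributions and verifies that $H(b, w) = H(b) + D_{KL}(b \parallel w)$.

```python
import numpy as np

def cross_entropy(b: np.ndarray, w: np.ndarray) -> float:
    """H(b, w) = -sum_i b_i * log(w_i): cost of encoding data from b with a code built for w."""
    return float(-np.sum(b * np.log(w)))

def entropy(b: np.ndarray) -> float:
    """H(b) = -sum_i b_i * log(b_i)."""
    return float(-np.sum(b * np.log(b)))

def kl_divergence(b: np.ndarray, w: np.ndarray) -> float:
    """D_KL(b || w) = sum_i b_i * log(b_i / w_i): excess code length from modelling b with w."""
    return float(np.sum(b * np.log(b / w)))

# Two example distributions over L = 4 bins (hypothetical values)
b = np.array([0.1, 0.4, 0.4, 0.1])
w = np.array([0.25, 0.25, 0.25, 0.25])

# The decomposition H(b, w) = H(b) + D_KL(b || w) holds up to floating-point error
assert np.isclose(cross_entropy(b, w), entropy(b) + kl_divergence(b, w))
```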
The Monkey Model proposed in [38] views a digital image as a discrete probability distribution. In the model, a troop of monkeys is responsible for randomly throwing balls onto an empty array of cells. The image formed shows the number of balls received at each cell in the array. This view is exactly how infrared images are formed, as the pixel intensity is a measure of the level of infrared emission received from the scene by the thermal camera's sensors. Normalising the pixel intensities provides an approximate measure of the probability of the emission at each pixel from the scene. Therefore, the known probabilities $w$ of the original image are the normalised pixel intensity values.
The binary image's distribution $b$ is determined from that of the original image $w$ and can be described using two probability measures, $\mu_1$ and $\mu_2$, which are the below- and above-threshold means of the original image, respectively. These expectations are summarised as follows:

$$\sum_{i=y}^{T} i\, p_i = \mu_1 \sum_{i=y}^{T} p_i, \qquad \sum_{i=T+1}^{z} i\, p_i = \mu_2 \sum_{i=T+1}^{z} p_i,$$

which allows for the determination of $\mu_1$ and $\mu_2$ as follows:

$$\mu_1(T) = \frac{\sum_{i=y}^{T} i\, p_i}{\sum_{i=y}^{T} p_i}, \qquad \mu_2(T) = \frac{\sum_{i=T+1}^{z} i\, p_i}{\sum_{i=T+1}^{z} p_i},$$

where $y$ and $z$ are the smallest and largest grey levels present in the original image, respectively, $T$ is the candidate threshold value, and $p_i$ is the probability of grey level $i$, given by the following:

$$p_i = \frac{n_i}{N},$$

where $n_i$ is the number of pixels having grey level $i$, and $N$ is the total number of pixels making up the image. Therefore, we have the following:

$$b_i = \begin{cases} \mu_1(T), & y \le i \le T, \\ \mu_2(T), & T < i \le z. \end{cases}$$
Hence, the cross-entropy to be minimised becomes the following:

$$D(T) = \sum_{i=y}^{T} i\, p_i \log \frac{i}{\mu_1(T)} + \sum_{i=T+1}^{z} i\, p_i \log \frac{i}{\mu_2(T)},$$

and the threshold to be chosen corresponds to the minimum cross-entropy and is given as follows:

$$T^{*} = \arg\min_{T} D(T).$$
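A compact NumPy sketch of this threshold selection is given below. It is our illustration of the criterion above; the function name and the exhaustive search over candidate thresholds are ours, not a reproduction of the authors' implementation.

```python
import numpy as np

def min_cross_entropy_threshold(image: np.ndarray, levels: int = 256) -> int:
    """Select the threshold T that minimises the cross-entropy criterion D(T)."""
    hist = np.bincount(image.ravel(), minlength=levels).astype(np.float64)
    p = hist / hist.sum()                       # p_i = n_i / N
    i = np.arange(levels, dtype=np.float64)

    nonzero = np.nonzero(hist)[0]
    y, z = nonzero[0], nonzero[-1]              # smallest and largest grey levels present

    best_T, best_D = int(y), np.inf
    for T in range(y, z):                       # candidate thresholds
        below, above = slice(y, T + 1), slice(T + 1, z + 1)
        w1, w2 = p[below].sum(), p[above].sum()
        if w1 == 0 or w2 == 0:
            continue
        mu1 = (i[below] * p[below]).sum() / w1  # below-threshold mean
        mu2 = (i[above] * p[above]).sum() / w2  # above-threshold mean
        # D(T): grey level 0 contributes nothing, so exclude i = 0 to avoid log(0)
        lo = i[below] > 0
        D = ((i[below][lo] * p[below][lo] * np.log(i[below][lo] / mu1)).sum()
             + (i[above] * p[above] * np.log(i[above] / mu2)).sum())
        if D < best_D:
            best_D, best_T = D, int(T)
    return best_T
```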
Since we know that humans exist in the brightest part of the image, we obtain the iterative procedure shown in Algorithm 1, which runs until the iteration converges. The convergence condition is as follows:

$$|T_k - T_{k-1}| < \epsilon,$$

where $T_k$ is the current threshold value, $T_{k-1}$ is the previous one, and $\epsilon$ is a small tolerance.
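Putting the pieces together, the following sketch shows one plausible reading of the Dynamic Range Adjustment loop described in Algorithm 1: equalise, threshold by minimum cross-entropy, make the threshold the new minimum of the dynamic range, and repeat until the threshold stabilises. The function names, the tolerance `eps`, the iteration cap, and the rescaling of the clipped range back to [0, 255] are our assumptions, not the authors' published code; `equalise_histogram` and `min_cross_entropy_threshold` refer to the sketches given earlier.

```python
import numpy as np

def dynamic_range_adjustment(image: np.ndarray, eps: int = 1, max_iter: int = 50) -> np.ndarray:
    """Iteratively raise the minimum of the dynamic range until the threshold converges.

    Pixels below the current threshold are treated as falling below the sensor's
    minimum (rendered black), so the brightest regions, where pedestrians lie,
    are progressively emphasised.
    """
    current = equalise_histogram(image)           # histogram specification step
    prev_T = -1
    for _ in range(max_iter):
        T = min_cross_entropy_threshold(current)  # minimum cross-entropy threshold
        if prev_T >= 0 and abs(T - prev_T) < eps:
            break                                 # |T_k - T_{k-1}| < eps: converged
        # The threshold becomes the new minimum of the dynamic range: everything
        # below T is clipped to black and the remaining range is stretched back
        # over [0, 255] before the next iteration.
        clipped = np.clip(current, T, 255).astype(np.float64)
        current = ((clipped - T) / max(255 - T, 1) * 255).astype(np.uint8)
        prev_T = T
    return current
```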