Infrared Image Enhancement Using Adaptive Histogram Partition and Brightness Correction

Infrared image enhancement is a crucial pre-processing technique in intelligent urban surveillance systems for Smart City applications. Existing grayscale mapping-based algorithms always suffer from over-enhancement of the background, noise amplification, and brightness distortion. To cope with these problems, an infrared image enhancement method based on adaptive histogram partition and brightness correction is proposed. First, the grayscale histogram is adaptively segmented into several sub-histograms by a locally weighted scatter plot smoothing algorithm and local minima examination. Then, the fore-and background sub-histograms are distinguished according to a proposed metric called grayscale density. The foreground sub-histograms are equalized using a local contrast weighted distribution for the purpose of enhancing the local details, while the background sub-histograms maintain the corresponding proportions of the whole dynamic range in order to avoid over-enhancement. Meanwhile, a visual correction factor considering the property of human vision is designed to reduce the effect of noise during the procedure of grayscale re-mapping. Lastly, particle swarm optimization is used to correct the mean brightness of the output by virtue of a reference image. Both qualitative and quantitative evaluations implemented on real infrared images demonstrate the superiority of our method when compared with other conventional methods.


Introduction
Infrared (IR) imaging has been extensively applied in Smart City applications [1,2], e.g., scene surveillance, goods sorting, and fire prevention due to its unique ability to receive IR rays (780 nm-300 µm) which cannot be perceived by human eyes. However, compared with visible (Vis) images, IR images do suffer from many intrinsic drawbacks, e.g., lack of color information, low contrast, blurred resolution, and visual disturbance caused by noise, which generate much inconvenience when attempting to recognize the target of interest from the background [3,4]. As a result, IR image enhancement is always a hot topic worthy of investigation in Smart City applications and plays a significant role in the pre-processing techniques of intelligent systems.
During the past decades, a number of image enhancement algorithms based on histogram equalization (HE) have been developed, the goal of which is to re-map the grayscales of the original image and obtain a new histogram with a uniform intensity distribution so as to improve the image contrast as much as possible. Among them, global histogram equalization (GHE) is the simplest one, in which a transfer function formulated by the cumulative density function (CDF) is utilized to re-calculate the grayscale histogram so that the overall histogram distribution is forced to be flattened out and accounts for a broader dynamic range, exaggerating the global contrast to the greatest extent [5]. To this end, grayscales with high probability distribution function (PDF) values (usually belonging to the background) will occupy the most intensity ranges and be dramatically enhanced, while those with low PDF values (usually belonging to target regions) will be greatly suppressed or even lost. That is to say, over-enhancement may easily occur in IR images with a large area of homogeneous regions [6]. In an even worse scenario, the noise existing in background may be hugely amplified due to over-enhancement.
Recently, some improved HE-based algorithms have also been investigated. Dynamic histogram equalization (DHE) proposed by Abdullah-Al-Wadud et al. [7] tends to preserve the details of the input. The raw grayscale histogram is partitioned based on local minima and specific grayscale ranges are assigned to each partition, after which HE is implemented on each sub-histogram [8]. Brightness preserving bi-histogram equalization (BBHE) [9] divides the original grayscale histogram into two parts using the mean brightness of the input and then the two sub-histograms are equalized independently. It has been proved that selecting the mean brightness as the separation threshold can preserve the general brightness before and after enhancement to a certain degree. Equal area dualistic sub-image histogram equalization (DSIHE) [10] increases the global contrast with the same strategy as BBHE, but the only difference is that the mean brightness is replaced by the median as the separation threshold. Relevant experiments further verify that DSIHE can achieve better performances than BBHE with respect to preserving the average information content of the input image. In addition, minimum mean brightness error bi-histogram equalization (MMBEBHE) [11] is another improvement of BBHE which yields the optimized mean brightness difference between the input and output. Recursive mean-separate histogram equalization (RMSHE) [12] and recursive sub-image histogram equalization (RSIHE) [13] are the recursive versions of BBHE and DSIHE. Although they provide a flexible way of monitoring the over-enhancement degree, the mean brightness is overly emphasized. Overall, the afore-introduced algorithms may have satisfactory enhancement results in Vis images, but side effects may happen when facing the variation of grayscale distribution in the histogram [14] and the strategy of brightness preservation is obviously unreasonable in IR image enhancement considering the fact that the input brightness itself is distorted.
With the aim of alleviating over-enhancement, Vicker et al. [15] proposed plateau histogram equalization (PHE), whose PDF is limited via a threshold. Furthermore, Song et al. [16] presented double plateau histogram equalization (DPHE) in which the upper threshold is designed to prevent the over-enhancement of background noise with typical grayscales, while the lower threshold is developed to protect details with fewer pixels from being merged [17]. Yet, the thresholds in PHE and DPHE need to be manually selected, which limits their usage in practice. As a result, adaptive plateau histogram equalization (APHE) [18], as well as adaptive double plateau histogram equalization (ADPHE) [17], were then developed. However, the robustness of ADPHE cannot be guaranteed because the thresholds are fixed for all the grayscales.
Most of the existing methods are highly dependent on mode estimation, which may easily lead to an over or under segmentation of the histogram. To deal with this significant problem, a non-parametric method using contrario theory-based automatic histogram mode estimation was proposed by Delon et al. [19]. In this method, a segmentation of the histogram without a priori assumptions in terms of the number or shape of its modes is presented, and the core idea is to test the simplest multi-modal law fitting the data. Specially speaking, an adequacy test named 'meaningful rejections' is developed, which is demonstrated to be both locally and globally superior. The algorithm is able to detect even very small modes when they are isolated, making it well adapted to document analysis. In addition, it is more robust to quantization noise owing to its statistical aspect. To address the afore-mentioned problems in the existing methods, we focus on investigating an effective IR image enhancement algorithm based on adaptive histogram partition and brightness correction. First of all, the original grayscale histogram is smoothed by Gaussian filtering and a locally weighted scatter plot smoothing (LOWESS) algorithm [20] so that the spikes can be removed. After that, the histogram is partitioned into several intervals via local minima and whether a certain sub-histogram belongs to the foreground or background is judged by its grayscale density. For the foreground parts, a novel local contrast weighted distribution is presented to take the place of the conventional intensity distribution, which only takes the occurrence frequency of each gray level into account and each foreground sub-image is equalized using that distribution; for the background parts, all the corresponding intervals keep their proportions of the whole dynamic range. It is remarkable that, motivated by the characteristic of human eyes, a visual correction factor is proposed to adjust the range of each foreground histogram before equalization. Finally, a reference IR image with an appropriate brightness is employed to correct the mean brightness of the enhanced image by a particle swarm optimization (PSO) algorithm. Figure 1 shows a complete flow chart of our method. To address the afore-mentioned problems in the existing methods, we focus on investigating an effective IR image enhancement algorithm based on adaptive histogram partition and brightness correction. First of all, the original grayscale histogram is smoothed by Gaussian filtering and a locally weighted scatter plot smoothing (LOWESS) algorithm [20] so that the spikes can be removed. After that, the histogram is partitioned into several intervals via local minima and whether a certain subhistogram belongs to the foreground or background is judged by its grayscale density. For the foreground parts, a novel local contrast weighted distribution is presented to take the place of the conventional intensity distribution, which only takes the occurrence frequency of each gray level into account and each foreground sub-image is equalized using that distribution; for the background parts, all the corresponding intervals keep their proportions of the whole dynamic range. It is remarkable that, motivated by the characteristic of human eyes, a visual correction factor is proposed to adjust the range of each foreground histogram before equalization. Finally, a reference IR image with an appropriate brightness is employed to correct the mean brightness of the enhanced image by a particle swarm optimization (PSO) algorithm. Figure 1 shows a complete flow chart of our method.

Review of Global Histogram Equalization
Global histogram equalization (GHE) is the basic HE-based image enhancement method in computer vision and has been widely applied to practice. Its core is to build a transfer function based on PDF and re-map the raw grayscale distribution to a new uniform one.
Let us suppose that = I {I(x,y)} represents the input IR image, where I(x,y) is the grayscale of each pixel whose spatial coordinate in I is (x,y) . Additionally, the IR image is digitized into L gray levels as where, k I refers to an arbitrary grayscale and k p(I ) is its corresponding PDF; k n denotes the number of pixels whose grayscales are k I ; and n is the total pixel number of I. Particularly, the graphic appearance of PDF is regarded as a histogram [14]. Further, the CDF of each grayscale is calculated as: , k 0,1,...,L 1 n . (2)

Review of Global Histogram Equalization
Global histogram equalization (GHE) is the basic HE-based image enhancement method in computer vision and has been widely applied to practice. Its core is to build a transfer function based on PDF and re-map the raw grayscale distribution to a new uniform one.
Let us suppose that I = {I(x, y)} represents the input IR image, where I(x, y) is the grayscale of each pixel whose spatial coordinate in I is (x, y). Additionally, the IR image is digitized into L gray levels as {I 0 , I 1 , . . . , I L−1 } (for an 8-bit digital image, L = 256). So, it is apparent that ∀I(x, y) ∈ {I 0 , I 1 , . . . , I L−1 }. Next, the PDF of each grayscale can be defined as: where, I k refers to an arbitrary grayscale and p(I k ) is its corresponding PDF; n k denotes the number of pixels whose grayscales are I k ; and n is the total pixel number of I. Particularly, the graphic appearance of PDF is regarded as a histogram [14]. Further, the CDF of each grayscale is calculated as: n t n , k = 0, 1, . . . , L − 1.  (2), the transfer function T(I k ) is constructed as: Lastly, the enhanced image O is obtained as: Note that: From Equation (5), it is easy to find that grayscales with high PDFs can have a high gain, and account for the majority of the whole dynamic range. As a rule, those grayscales mainly belong to background pixels and are always over-merged by GHE, which is absolutely unexpected. On the contrary, the foreground, especially some small-sized targets, the grayscales of which have relatively low PDFs, are greatly suppressed after GHE, i.e., the details are thus lost.

Theory
In this section, the detailed theories related to the proposed algorithm are discussed at great length. First, Section 3.1 introduces the automatic way of partitioning the raw grayscale histogram. Next, the criterion of distinguishing between fore-and background sub-histograms is revealed in Section 3.2. After that, we clarify the different means of enhancing fore-and background sub-histograms, respectively, in Section 3.3. Finally, PSO algorithm optimizing the mean brightness with a reference image is depicted in Section 3.4.

Adaptive Segmentation of Input Grayscale Histogram
Considering the physical law in IR imaging that the temperature distribution of the target (foreground) or background is continuous and centers around a specific temperature, meaning that the grayscale distribution of the foreground should be unimodal (the condition of background is the same), we propose to judge whether a cluster of grayscales belongs to the foreground or background by means of partitioning the raw histogram into several intervals and recognizing each of them through a specific characteristic of grayscale distribution discussed in Section 3.2. In this way, we can use different strategies to enhance the fore-and background separately.
Inspired by previous works [21][22][23], the strategy of segmenting a histogram via local minima is adopted in this paper. Unlike the conventional methods, we focus on improving its robustness to those spikes existing in the histogram. Therefore, a data smoothing method called LOWESS is utilized as a pre-processing step.
We notice that there are large quantities of spikes, which can be seen as local outliers, in the input grayscale histograms, leading to too many peaks if the local minima are examined directly. Thus, a one-dimensional Gaussian filter is employed to roughly smooth the histogram at first: where, p (I k ) refers to the smoothed PDF after Gaussian filtering; the size of the local window is 1 * (2w 1 + 1); and κ(·) represents a one-dimensional Gaussian kernel function which is defined as: Remote Sens. 2018, 10, 682

of 34
where, σ 1 is a constant parameter deciding the weight of each neighboring PDF affecting the output and x o = k denotes the index of the current grayscale I k .
Then, a local smoother called LOWESS is used on the histogram after Gaussian filtering to further remove the spikes. The specific steps to nullify the minor peaks from p (I k ) are described as follows: (1) Given a span ζ with a fixed length d, the regression weight of each neighboring PDF in ζ is locally calculated using a 9-degree polynomial ϕ j (p (I k )) expressed as Equation (8).
where, p (I j ) denotes the neighboring PDF of p (I k ) in ζ and p (I k ) is the current PDF to be smoothed.
(2) A polynomial unary regression estimation function is used to compute the smoothed PDF of each grayscale, and the coefficients of the locally linear regression are deduced by the weighted least square estimate of equations fitting an affine model to the probabilities. Please refer to the study made by Cleveland [20] for the detailed deduction. Here, the solution of the coefficients is directly given as: where, X ∈ R d×2 in which the first column values are equal to 1 and the second column values are written as a vector (p (I k ) − p (I j )); ϕ = diag(ϕ 1 (p (I k )), ϕ 2 (p (I k )), . . . , ϕ d (p (I k ))) is a d × d sized diagonal matrix; and Y is the intended span of p (I k ). Figure 2 gives a simple schematic diagram of the above-discussed procedure of estimating.
further remove the spikes. The specific steps to nullify the minor peaks from ' k p (I ) are described as follows: (1) Given a span ζ with a fixed length d , the regression weight of each neighboring PDF in ζ is locally calculated using a 9-degree polynomial φ ' j k (p (I )) expressed as Equation (8).
where, ' j p (I ) denotes the neighboring PDF of ' k p (I ) in ζ and ' k p (I ) is the current PDF to be smoothed.
(2) A polynomial unary regression estimation function is used to compute the smoothed PDF of each grayscale, and the coefficients of the locally linear regression are deduced by the weighted least square estimate of equations fitting an affine model to the probabilities. Please refer to the study made by Cleveland [20] for the detailed deduction. Here, the solution of the coefficients is directly given as: where,   Figure 3 shows an example of smoothing the grayscale histograms produced by Gaussian filtering and LOWESS, respectively. It can be clearly seen that although the Gaussian-smoothed histogram in Figure 3b becomes smoother, there are still some fake peak and valley points (see the orange circle). Fortunately, in Figure 3c, LOWESS removes almost all the spikes effectively, while only the valid peaks and valleys remain. Thus, it proves that LOWESS is necessary in our work. According to Equation (9), a 2 × 1 sized coefficient vectorθ of the locally linear regression model is yielded and the smoothed output p * (I k ) of each PDF after Gaussian filtering p (I k ) is determined through Equation (10). Figure 3 shows an example of smoothing the grayscale histograms produced by Gaussian filtering and LOWESS, respectively. It can be clearly seen that although the Gaussian-smoothed histogram in Figure 3b becomes smoother, there are still some fake peak and valley points (see the orange circle). Fortunately, in Figure 3c, LOWESS removes almost all the spikes effectively, while only the valid peaks and valleys remain. Thus, it proves that LOWESS is necessary in our work. Based on the idea of searching for local minima, a × 2 1 w sized sliding window is designed and all the local minima are extracted using the following criterion: where, ⋅ min( ) represents the minimal component in a set.
Besides, both the first and last non-zero components of the input grayscale histogram are also regarded as local minima. In this case, we assume that there is a total of ς local minima in the histogram and they are denoted as

Recognition of Fore-and Background Sub-Histograms
Before re-mapping the grayscales, we need to distinguish between fore-and background intervals and use different tools to deal with them. In this paper, we argue that whether a certain interval belongs to the foreground or background cannot depend on the PDF of a single grayscale, but the overall grayscale distribution of the interval should be considered. On account of the following two physical facts in IR imaging: (1) background pixels occupy a large percentage of the whole image whereas the object has an opposite condition, and both the intensity values of the Based on the idea of searching for local minima, a 1 × w 2 sized sliding window is designed and all the local minima are extracted using the following criterion: where, min(·) represents the minimal component in a set. Besides, both the first and last non-zero components of the input grayscale histogram are also regarded as local minima. In this case, we assume that there is a total of σ local minima in the histogram and they are denoted as {m 1 , m 2 , . . . , m σ }. Hence, the raw histogram can be partitioned into σ − 1 intervals, which are further written as  Based on the idea of searching for local minima, a × 2 1 w sized sliding window is designed and all the local minima are extracted using the following criterion: where, ⋅ min( ) represents the minimal component in a set.
Besides, both the first and last non-zero components of the input grayscale histogram are also regarded as local minima. In this case, we assume that there is a total of ς local minima in the histogram and they are denoted as

Recognition of Fore-and Background Sub-Histograms
Before re-mapping the grayscales, we need to distinguish between fore-and background intervals and use different tools to deal with them. In this paper, we argue that whether a certain interval belongs to the foreground or background cannot depend on the PDF of a single grayscale, but the overall grayscale distribution of the interval should be considered. On account of the following two physical facts in IR imaging: (1) background pixels occupy a large percentage of the whole image whereas the object has an opposite condition, and both the intensity values of the

Recognition of Fore-and Background Sub-Histograms
Before re-mapping the grayscales, we need to distinguish between fore-and background intervals and use different tools to deal with them. In this paper, we argue that whether a certain interval belongs to the foreground or background cannot depend on the PDF of a single grayscale, but the overall grayscale distribution of the interval should be considered. On account of the following two physical facts in IR imaging: (1) background pixels occupy a large percentage of the whole image whereas the object has an opposite condition, and both the intensity values of the foreground and background tend to cluster around a peak; (2) foreground intervals always present a flat shape whereas background intervals possess a narrow but sharp shape, a metric Λ i measuring the grayscale density of each interval is proposed below: where, Based on the definition of grayscale density, we further propose that a small Λ i is related to the foreground and a large one corresponds to the background. Next, an adaptive threshold Λ * to recognize the property of each interval is adaptively calculated via Ostu's method [24], which is based on maximizing the inter-class variance of the dataset. Suppose that all the Λ i compose a set Λ = {Λ 1 , Λ 2 , . . . , Λ σ−1 }, and Λ * separates this set into two classes Λ α and Λ β . Ostu's method exhaustively searches for the optimum Λ * as: where, Λ max and Λ min stand for the maximum and minimum of Λ, respectively; and, where, E(Λ α ) and E(Λ β ) stand for the mean values of the two separated classes; µ Λ is the mean value of Λ; and W α and W β represent the fractions indicating the component numbers of the two classes among the whole, respectively. To this end, the grayscale histogram of input is segmented into fore-and background sub-histograms as the following criterion: Additionally, Figure 5 illustrates the above-introduced segmentation result.
Remote Sens. 2018, 10, x FOR PEER REVIEW 7 of 34 foreground and background tend to cluster around a peak; (2) foreground intervals always present a flat shape whereas background intervals possess a narrow but sharp shape, a metric i Λ measuring the grayscale density of each interval is proposed below: where,  m ] . Based on the definition of grayscale density, we further propose that a small i Λ is related to the foreground and a large one corresponds to the background.
Next, an adaptive threshold * Λ to recognize the property of each interval is adaptively calculated via Ostu's method [24], which is based on maximizing the inter-class variance of the dataset. Suppose that all the i Λ compose a set ..,Λ } , and * Λ separates this set into two classes α Λ and β Λ . Ostu's method exhaustively searches for the optimum * Λ as: where, max Λ and min Λ stand for the maximum and minimum of Λ , respectively; and, where, To this end, the grayscale histogram of input is segmented into fore-and background subhistograms as the following criterion: Additionally, Figure 5 illustrates the above-introduced segmentation result.

Methods of Sub-Histogram Enhancement
In this section, some specific techniques are developed to enhance the partitioned sub-histograms respectively. Section 3.3.1 describes the details of allocating a mapped range for each interval; in Section 3.3.2, the fore-and background parts are enhanced with two different strategies.

Allocation of Re-Mapped Range
In order to overcome the adverse effect of over-enhancement, which commonly occurs in the background, we consider only equalizing the foreground sub-histograms while the modified intensity range of each background sub-histogram accounts for the same percentage in the raw histogram [25]. Under the circumstances, the gray levels of background will not be over-merged, but the target/background contrast is improved.
For an arbitrary background interval [m i , m i+1 ], its re-mapped range R i is computed as follows: where, I max and I min denote the maximal and minimal grayscales in the raw image, respectively. The condition of foreground intervals is more complicated. First of all, the CDF is applied as a metric to decide the re-mapped range as: where, s and t represent the indexes of background and foreground intervals, respectively; is the CDF of a foreground interval [m j , m j+1 ]; and m 1 denotes the optimal minimal grayscale of the enhanced image, which will be discussed in Section 3.4. In light of the property of human vision that noise in flat regions of an image will give rise to spurious or texture to the observer and the contrast sensitivity will decrease at sharp transitions [26], an exponential correction factor γ is designed to modify the re-mapped ranges of foreground intervals so as to reduce the effect of image noise as much as possible. Based on this consideration, we further notice that local entropy is an extensively-used physical metric in informatics that measures the richness of texture of an image block. Thus, we choose to use local entropy as a significant component to construct γ. The definition of γ j ∈ [0, 0.6] is given as: where, Θ j stands for the smoothness degree of [m j , m j+1 ] which is defined as Equation (19) and where, N j represents the number of pixels belonging to [m j , m j+1 ]; and H(x, y) is the local entropy of pixel (x, y) whose grayscale is within [m j , m j+1 ] and its formula is given as Equation (20).
As is shown in Equation (18), the local entropy in regard to all the pixels in a w 3 * w 3 sized local window W e (x, y) around (x,y) is counted. Obviously, a larger Θ j reveals a foreground interval with more complex textures on average.
As a result, Equation (17) can be re-written as: 3.3.2. Re-Mapping of Fore-and Background Histograms Based on the principle that the background intervals maintain their proportions of the whole dynamic range without any other manipulations, the re-mapped result O b (I k ) of a background grayscale I k ∈ [m i , m i+1 ] can be computed as Equation (22).
Then, let us further discuss the enhancement strategy for those foreground intervals, the local details of which need to be enhanced. In our study, a local contrast weighted distribution is developed and is written as: where, where, M and N are the width and height of the image, respectively; δ(x, y) = 1, x = y 0, others is the Kronecker delta function; and ω(x, y) denotes the local contrast operator [27], which is defined as: where, As can be seen from Equations (25) and (26), ω(x, y) indicates the four-direction gradient information of pixel (x, y). Furthermore, ψ(·) representing the weight function is formulated based on the sigmoid function as: where, Sg(x) = 1 1+e −x is the standard sigmoid function whose curve is drawn as Figure 6a; and α, λ, θ, and ξ denote the amplitude, scale, phase, and shift parameters, respectively.
where, ε 1 and ε 2 are two constant values that are approach 0 and 1, respectively. An example of calculating the local contrast weighted distribution via the above-introduced method is given in Figure 7.
On the basis of the presented local contrast weighted distribution ℵ ∈ k (I ) [0,1] , the grayscale distribution of the foreground interval is substituted and a foreground grayscale re-mapped using the approach expressed as: where, ε 1 and ε 2 are two constant values that are approach 0 and 1, respectively. An example of calculating the local contrast weighted distribution via the above-introduced method is given in Figure 7.
where, ε 1 and ε 2 are two constant values that are approach 0 and 1, respectively. An example of calculating the local contrast weighted distribution via the above-introduced method is given in Figure 7.  On the basis of the presented local contrast weighted distribution ℵ(I k ) ∈ [0, 1], the grayscale distribution of the foreground interval is substituted and a foreground grayscale I k ∈ [m j , m j+1 ] is re-mapped using the approach expressed as: where, C ℵ j = m j+1 ∑ k=m j ℵ(I k ) denotes the cumulative local contrast weighted distribution of [m j , m j+1 ]. Here, a schematic diagram of the grayscale re-mapping is also drawn in Figure 8. where, ( I ) denotes the cumulative local contrast weighted distribution of Here, a schematic diagram of the grayscale re-mapping is also drawn in Figure 8. Thus, the final enhanced image O can be obtained as: However, one point that should be noticed is that the minimal grayscale after re-mapping ' 1 m is uncertain and it decides the general brightness of the output image directly. The optimization of this parameter is discussed in Section 3.4.

Brightness Correction via PSO Algorithm
As is mentioned above, the minimal grayscale m is set as 0 by default, which results in the phenomenon that the enhanced image is over-dark. However, we cannot simply use the strategies of brightness preservation proposed in the previous works [9][10][11][12][13], which aim to make the mean brightness close to the input as much as possible, because the brightness of the input itself is distorted for IR images. On account of this problem, we consider using the PSO algorithm to optimize the exact value of ' 1 m by virtue of a reference image with a suitable brightness. PSO is a population-based stochastic optimization algorithm [28] and it utilizes a certain number of particles forming a swarm motion in the search space to estimate the optimum solution of the objective function F. All the particles fly through the search space depending on the best solution of each particle achieved so far and the best solution tracked by any particle among all generations of the swarm, which are named pbest and gbest , respectively [29]. Thus, the final enhanced image O can be obtained as: However, one point that should be noticed is that the minimal grayscale after re-mapping m 1 is uncertain and it decides the general brightness of the output image directly. The optimization of this parameter is discussed in Section 3.4.

Brightness Correction via PSO Algorithm
As is mentioned above, the minimal grayscale m 1 affects the overall brightness of the enhanced IR image. In normal conditions, m 1 is set as 0 by default, which results in the phenomenon that the enhanced image is over-dark. However, we cannot simply use the strategies of brightness preservation proposed in the previous works [9][10][11][12][13], which aim to make the mean brightness close to the input as much as possible, because the brightness of the input itself is distorted for IR images. On account of this problem, we consider using the PSO algorithm to optimize the exact value of m 1 by virtue of a reference image with a suitable brightness.
PSO is a population-based stochastic optimization algorithm [28] and it utilizes a certain number of particles forming a swarm motion in the search space to estimate the optimum solution of the objective function F. All the particles fly through the search space depending on the best solution of each particle achieved so far and the best solution tracked by any particle among all generations of the swarm, which are named pbest and gbest, respectively [29].

Particle Initialization
In this step, N particles are first generated. The initial position of each particle can be regarded as random sampling, and the position of each particle is a potential value of m 1 . Assuming that the sum of the re-mapping ranges of background intervals calculated in Section 3.3.1 is denoted as = ∑ ∀Λ s >Λ * R s , we propose that the initial position of each particle should be greater than 0 and smaller than L − 1 − . Based on this point, uniform distribution-based random sampling is implemented in the range [0, L − 1 − ] to assign the initial value for each particle.
Here, an example of particle initialization for Figure 9 is shown in Table 1 ] to assign the initial value for each particle.
Here, an example of particle initialization for Figure 9 is shown in Table 1

Objective Function
In our study, ' 1 m to be optimized composes a one-dimensional search space for PSO and the objective function is defined as the absolute mean brightness error [30] of the enhanced image and reference image. Equation (29) gives the formulation of the objective function F as:

Objective Function
In our study, m 1 to be optimized composes a one-dimensional search space for PSO and the objective function is defined as the absolute mean brightness error [30] of the enhanced image and reference image. Equation (29) gives the formulation of the objective function F as: where, M O (m 1 ) and M R stand for the mean of output image O and reference image R, respectively. Unlike the offline learning strategy, the reference image R is not used for the learning purpose. Here, only a reference image, the mean brightness of which is comfortable for the human vision system, is needed, i.e., its mean brightness is neither too dark nor too bright. It is utilized to generate a standard mean intensity value M R which is regarded as a standard value to optimize the objective function value F in the following PSO optimization. By this means, the optimal value of m 1 can be computed.
Note that M O (m 1 ) can be calculated using the method discussed from Sections 3.1-3.3, once m 1 is generated by PSO.

Implementation of PSO Iteration
Every particle tries to change its position via information including the current position, current velocity, the respective distance between the current position and pbest, and the distance between the current position and the gbest [31]. The velocity of each particle is suitably updated as Equation (30).
where, V t i denotes the velocity of the i-th particle at the t-th iteration; c 1 and c 2 are learning factors; rand() ∈ [0, 1] represents a uniformly distributed random number; P t i is the position of the i-th particle at th t-th iteration; and pbest i represents the current best of the i-th particle (the minimal F in the current iteration) while gbest i represents the global best of the swarm (the minimal F in all the iterations that have been implemented so far). Besides, ω is the inertia weight which is set to be in the range [ω min , ω max ] and Equation (33) gives its formulation in detail.
where, t is the current iteration and t max is the maximal iteration. Based on the afore-discussed modified velocity V t+1 i , the position of a particle P t+1 i can be thus updated as: Then, the optimal m * 1 is achieved by iteration until convergence, and the complete procedure is concluded in Algorithm 1. Finally, the enhanced image is outputted using the procedures introduced in Sections 3.1-3.3.
Algorithm 1: Optimization of m 1 using PSO 1. Input: an IR image I to be enhanced and the mean of a reference image M R . 2. Create N particles {P 1 , P 2 , . . . , P N }, where P i ∈ R 1×1 representing an m 1 ; 3. for each particle i = 1 to N do 4. Initialize the position of each particle P 1 i and its corresponding velocity V 1 i ; 5. end for 6. for t = 1 : t max do 7. for each particle 8.
Produce the enhanced image O t i using methods introduced in Section 3.1, Section 3.2 and Section 3.3 based on its position value P t i ; 9.
Calculate the objective function F t i using Equation (29)

Experiment and Discussion
In this section, a number of real IR images are applied to make comparative experiments and they can be downloaded from a publicly available database [32]. The reference image with a suitable brightness used in our method is shown in Figure 10 and its mean intensity is 99. All the test IR images whose wavelengths on the electromagnetic spectrum are from 4 to 12 µm [33] are 8-bit and their sizes are 384 × 288. Figure 11 presents the eight test images utilized, which are named as 'residence', 'car', 'guardrail', 'building', 'room', 'cabinet', 'leg', and 'computer', respectively. The former fours are outdoor scenes, while the latter fours are indoor scenes. In our experiments, seven conventional algorithms: GHE, BBHE, MMBEBHE, DSIHE, RMSHE, RSIHE, and ADPHE, introduced in Section 1, are used to make both qualitative and quantitative comparisons. Among them, the recursion level of RMSHE is set as the default value r = 2.

Experiment and Discussion
In this section, a number of real IR images are applied to make comparative experiments and they can be downloaded from a publicly available database [32]. The reference image with a suitable brightness used in our method is shown in Figure 10 and its mean intensity is 99. All the test IR images whose wavelengths on the electromagnetic spectrum are from 4 to 12 μm [33] are 8-bit and their sizes are × 384 288 . Figure 11 presents the eight test images utilized, which are named as 'residence', 'car', 'guardrail', 'building', 'room', 'cabinet', 'leg', and 'computer', respectively. The former fours are outdoor scenes, while the latter fours are indoor scenes. In our experiments, seven conventional algorithms: GHE, BBHE, MMBEBHE, DSIHE, RMSHE, RSIHE, and ADPHE, introduced in Section 1, are used to make both qualitative and quantitative comparisons. Among them, the recursion level of RMSHE is set as the default value = r 2. Figure 10. Reference image utilized in our method. Figure 10. Reference image utilized in our method.

Experiment and Discussion
In this section, a number of real IR images are applied to make comparative experiments and they can be downloaded from a publicly available database [32]. The reference image with a suitable brightness used in our method is shown in Figure 10 and its mean intensity is 99. All the test IR images whose wavelengths on the electromagnetic spectrum are from 4 to 12 μm [33] are 8-bit and their sizes are × 384 288 . Figure 11 presents the eight test images utilized, which are named as 'residence', 'car', 'guardrail', 'building', 'room', 'cabinet', 'leg', and 'computer', respectively. The former fours are outdoor scenes, while the latter fours are indoor scenes. In our experiments, seven conventional algorithms: GHE, BBHE, MMBEBHE, DSIHE, RMSHE, RSIHE, and ADPHE, introduced in Section 1, are used to make both qualitative and quantitative comparisons. Among them, the recursion level of RMSHE is set as the default value = r 2.  Below, we firstly discuss the parameter setting in Section 4.1; the processing results as well as some related discussions are revealed in Section 4.2; in Section 4.3, several quantitative metrics are employed to further demonstrate the superiority of our method. Furthermore, all the simulations are implemented by Matlab 2012a software on a PC with a 2.60 GHz INTEL CPU and 4 GB memory. The codes of this work can be downloaded from our github website [34].

Parameter Setting
In this section, all the parameters utilized, as well as the corresponding settings, are listed in Table 2. The default values of c 1 , c 2 , ω max , and ω min are set according to [31], and the others are set empirically. Note that all these values are fixed in all the simulation experiments. Specifically speaking, w 1 decides the length of the Gaussian filtering window and σ 1 affects the weight of each neighboring PDF. The suitable values of these two parameters need to ensure that the remarkable spikes in the raw histogram can be filtered while the local feature of normal PDFs can be preserved. In the LOWESS algorithm, d adjacent PDFs are utilized to codetermine the final smoothed output of a centered PDF processed by Gaussian filtering. For the purpose of maintaining the local shape of the raw histogram, d cannot be too large. With regards to w 2 , it should be neither too large nor too small so that those spikes which cannot be totally smoothed by Gaussian filtering and LOWESS are prevented from being recognized as local minima and only those outstanding valley points can be picked out. We argue that w 3 should be relatively small, because a large w 3 would not only increase the computational cost, but also make the local entropy calculated not well represent the property of the centered pixel. Due to the fact that the weight function ℵ(x) cannot exactly reach 0 and 1, ε 1 and ε 2 are used to represent the practical lower and upper bounds, respectively. Thus, their values should approach to 0 and 1 as much as possible. In order to limit the computational load and get a satisfactory convergence simultaneously, the maximal iteration t max and the particle number N are both set as 10. Lastly, based on previous work on the PSO algorithm [31], the two learning factors c 1 and c 2 should be in the range [0, 4], and ω max , ω min are commonly set as 0.9 and 0.1, respectively.

Qualitative Comparison
First of all, the enhanced images of all the eight algorithms are given in Figures 12-19. As is shown in Figure 11, all the IR images to be enhanced suffer from low global contrast and blurred details, and image noise is particularly serious in Figure 11e-h, which are indoor scenes.
image contrast while suppressing the noise. Owing to the usage of local contrast weighted distribution, the heat source on the table becomes more outstanding, while it totally vanishes in ADPHE. On the other hand, our strategy of maintaining the raw proportion of background interval in the input grayscale histogram makes it a success in simultaneously eliminating background noise and improving the global contrast.         For the 'residence' scene, it can be observed from Figure 12 that GHE, BBHE, and MMBEBHE have strong abilities of grayscale equalization, leading to the exaggerated fore-and background contrast. The wall and the roof are thus over-enhanced and the noise on the ground is also obviously amplified. On the other hand, DISHE, RMSHE, and RSIHE maintain the input brightness to a certain extent, whereas some grayscales are still over merged. The details, e.g., the windows and doors, are enhanced by ADPHE, but the visual contrast is not clearly improved, which is far from our expectation. Fortunately, our proposed algorithm achieves a relatively better visual performance which avoids the over-enhancement on the wall successfully.
With respect to the 'car' image shown in Figure 11b, the thermal radiations of the wheel and snow cause severe disturbances for most of the traditional methods. The grayscales belonging to the wheel are overly improved and over-merged in GHE, BBHE, MMBEBHE, DISHE, and ADPHE, while those belonging to the telegraph pole and wall are mistakenly re-mapped by RMSHE and RSIHE, but all these adverse phenomena do not happen in our method.
The grass and guardrail pixels of the 'guardrail' scene are exceedingly suppressed by GHE, BBHE, and MMBEBHE, which results in an unnatural visual performance. Besides, artifacts are seriously generated in the wall region at the top left corner, especially in DISHE, RMSHE, RSIHE, and ADPHE (see Figure 14d-g). For our algorithm, we argue that although the global contrast of Figure 14h is lower than Figure 14a-c,g, the over-enhancement is completely overcome and the overall clarity is satisfactory.
The condition of the 'building' scene is quite similar to 'residence', and the transition area around the skyline becomes visually abrupt due to the fact that the conventional algorithms treat fore-and background in the same way. In addition, the image noise is comparatively obvious in Figure 15c produced by MMBEBHE and the textures of the block wall are lost dramatically in Figure 15g produced by ADPHE. It is worth noting that the enhanced result using RSIHE is much more blurred when compared with DISHE and RMSHE, even though they can all preserve the input brightness quite well, indicating that more detailed information is missing in RSIHE. When considering the performance of our method, we find that not only the troublesome noise is suppressed, but also the fore-and background contrast can meet the observational requirement well.
The 'room' image displayed in Figure 11e is changeling on account of the striking noise. On the one hand, the grayscales belonging to the box at the bottom right corner are overly enhanced, except in our method. What is more, the noise distributing around the door is extremely magnified by GHE, BBHE, and MMBEBHE. We also notice that the small heat source near the chairs is over-merged by ADPHE, despite the fact that the noise is not amplified.
The cabinet in Figure 11f is fuzzy and all the comparing algorithms make attempts to improve its contrast as much as possible. We consider this kind of strategy as incorrect since it decreases the clarity in spite of the improvement of contrast. In other words, the balance between clarity and contrast is the key point in IR image enhancement. As is shown in Figure 17h, our presented method not only preserves the texture of the floor, shelf, and power lines, but also increases the global contrast to a certain degree.
The 'leg' image has a relatively higher target/background contrast when it is compared with other test images. One of the most significant tasks in enhancing this image is to enrich the texture existing on the leg and to keep the local contrast of background. Among all the processed results, severe over-enhancement of grayscales belonging to the chair leg and the wall occurs in GHE, MMBEBHE, and ADPHE, even though ADPHE adaptively selects the upper threshold to suppress noise to some extent. It should be noted that although BBHE and DISHE achieve satisfactory global contrasts, one of the legs of the chair located in the middle of the image disappears and the noise is magnified more seriously than RMSHE, RSIHE, ADPHE, and our method.
Finally, the 'computer' image is the most challenging one among all the eight images due to the remarkable noise and quite weak contrast. It is clear that only our method succeeds in improving the image contrast while suppressing the noise. Owing to the usage of local contrast weighted distribution, the heat source on the table becomes more outstanding, while it totally vanishes in ADPHE. On the other hand, our strategy of maintaining the raw proportion of background interval in the input grayscale histogram makes it a success in simultaneously eliminating background noise and improving the global contrast.

Comparison of Processed Result
To further verify the superiority of the presented algorithm, several extensively used metrics are applied to make quantified comparisons in this section.
First of all, the definitions of all the metrics utilized in this paper are introduced: (1) Linear index of fuzziness η [35] is widely applied to analyze the performance of image enhancement, which is defined as: where, O represents the enhanced image and its width and height are M and N, respectively. and, where, max(O) represents the maximal grayscale of O. A smaller value of ρ indicates a better performance of the enhanced image with fewer clutters [36].
(2) Peak signal-to-noise ratio (PSNR) [37][38][39] is commonly used for evaluating image quality and measures how much the enhanced image has degraded when referred to the input image in the area of image enhancement [38]. Here, the formula of PSNR is given in Equation (37).
where, MAX denotes the maximal gray level and MAX = 255 for 8-bit digital images; MSE(O,I) is the mean square error between the output O and input I, which is defined as: Besides, a larger PSNR refers to a higher image quality of the output.
(3) Image definition ρ [40] is a comprehensive index integrating η and PSNR and is always used to reflect the overall definition of an enhanced image. Below, its specific definition is given as: Ideally, a qualified result should not only improve the global contrast, but also has less degradation. Thus, for a high-quality enhanced image, a small ρ is expected.
(4) Roughness Ω [41] is used as a metric for evaluating the performance of an algorithm for noise reduction in many IR imaging applications. Here, we employ it in the following experiments for measuring the effect of noise suppression of the enhanced image. Further, the calculation of Ω is introduced below: where, h 1 = [1, −1] and h 2 = [1, −1] T stand for the horizonal and vertical difference filters, respectively; * is a convolution operator; and · 1 stands for the L 1 norm. Obviously, a smaller Ω is related to a smoother result with less residual noise. (5) Discrete entropy (DE) [42] characterizing the information amount contained in an enhanced image is employed to measure the degree of over-enhancement in our experiment. Additionally, DE is a globally statistical index which is defined as: A larger DE means fewer gray levels are merged, leading to a clearer visual performance. Note that the essence of DE is different from the one of local entropy H defined in Equation (20), although their formulas are seemingly the same. DE is a globally counted index, but H is calculated in a local window.
(6) Logarithmic Michelson Contrast Measure (AME) [43] denotes a measure of local contrast, and Equation (42) gives its definition as: where, the output O is segmented into l 1 × l 2 blocks; O i,j max and O i,j min denote the maximal and minimal grayness of the block, respectively; and ε = 10 −5 is a constant for preventing the invalid values. AME aims to use the relationship between the spread and the sum of the two intensity values in each block, and a smaller AME means a better performance [33].
Tables 3-8 list the statistical results of the six indexes and the average values are also given in the last row of each table. To present a clearer comparison, the corresponding line graphs are also presented in Figure 20.   On the one hand, η denotes the target/background contrast, and it also measures the quantity of clutters to a certain degree. As is shown in Table 3 and Figure 20a, our method achieves the best average η , implying that the foregrounds become brighter while the background clutters become darker after our enhancement. More specifically, we argue that it is the strategy of sub-histogram segmentation that enlarges the intensity difference between the fore-and background. We notice that BBHE, GHE, and MMBEBHE also result in low η values (<2) in most scenes, which matches the experimental fact that the intensity differences between target and background regions are always remarkable in their enhanced images. Besides, the average η of RSIHE is the worst, which is about 10% larger than ours and it proves that the serious over-enhancement is somewhat related to the degradation of global contrast. On the one hand, η denotes the target/background contrast, and it also measures the quantity of clutters to a certain degree. As is shown in Table 3 and Figure 20a, our method achieves the best average η, implying that the foregrounds become brighter while the background clutters become darker after our enhancement. More specifically, we argue that it is the strategy of sub-histogram segmentation that enlarges the intensity difference between the fore-and background. We notice that BBHE, GHE, and MMBEBHE also result in low η values (<2) in most scenes, which matches the experimental fact that the intensity differences between target and background regions are always remarkable in their enhanced images. Besides, the average η of RSIHE is the worst, which is about 10% larger than ours and it proves that the serious over-enhancement is somewhat related to the degradation of global contrast.
Unlike its poor performance related to η, RSIHE has the most satisfactory PSNR on the whole, while our method still takes second place in this metric. Since PSNR reflects the degree of image degradation, we consider that if the PSNR value is excessively large, a large amount of detail information is lost at the same time. That is the reason why the enhanced images processed by RSIHE are fogged and their PSNRs are always very high. By contrast, our method maintains a balance between PSNR (slightly larger than ADPHE and RMSHE) and exhibits the preservation of details, meaning that not only the visual performance presented in Section 4.2, but also the quantitative evaluation of image degradation, is qualified.
As is mentioned above, ρ is a comprehensive index measuring the overall performance of image enhancement. Since our method results in comparatively remarkable performances in both η and PSNR, its ρ values of all the test images thus keep the advantage. Relying on its outstanding achievements in η, the average ρ value of our method is almost two times that of RSIHE, which possesses the second largest value. One point that should be noticed is that those algorithms overly emphasizing input brightness preservation, e.g., BBHE, MMBEBHE, DISHE, RMSHE and RSIHE, sacrifice the fore-and background contrast instead, which leads to their poor performances in η and ρ.
Ω is commonly utilized to reflect the smoothness of image. As is shown in Figure 20d, BBHE and our method obtain the top two Ω values, whereas DISHE presents the worst one. Since the background occupies the majority of an image and the background histograms are not equalized in our method, the noise is not amplified and the smoothness is thus satisfactory. However, some grayscale histogram equalization-based algorithms, like GHE and MMBEBHE, quite easily generate over-enhancement in high grayscales and image noise precisely tends to center on high grayscales, so the effects of the noise suppression of these algorithms greatly decrease. According to Table 6, we also see that ADPHE's capability of removing noise through the adaptive upper threshold is limited in IR images considering its comparatively high Ω values.
DE is a significant index that directly measures the degree of over-enhancement. Through observing its definition, we can easily find that if more grayscales are merged by equalization, the entropy of the image will decrease more. Clearly, our method has a distinct advantage in DE which is approximately 3% greater than the second largest one (BBHE). On the contrary, GHE and DISHE decrease the DE value of the original image seriously, and this is mainly because these methods tend to merge different gray levels together through enhancement. Interestingly, RSIHE achieves a satisfactory performance in PSNR, but its DE performance is poor. Furthermore, BBHE undergoes an opposing situation. We infer that a high DE value can indicate relatively low image degradation to a certain degree.
To evaluate the local contrast of the enhanced image, Table 8 reports the comparison result in terms of AME. Among all the comparative algorithms, the average AME of our method is the best, while the one of GHE is the worst. Even though the η values of GHE and MMBEBHE, which represent the global contrast, are large, plenty of local regions turn to be homogenous due to the fact that it is highly likely that pixels in the same region will merge in these algorithms, contributing to the low AMEs.

Evaluation of Running Time
A comparison of running time is made in Table 9. All the algorithms are implemented for five times and the average running time is recorded.
As we can see, BBHE achieves the fastest running speed, which is about 88 fps. Besides, DISHE and ADPHE also perform well, and their running speeds can reach more than 60 fps and 40 fps, respectively. In regard to our method, the PSO-based optimization consists of multiple iterations of the procedures introduced in Section 3.2. That is to say, these parts of codes need to be executed for N × t max times in total, which leads to the long computation time. As a matter of fact, computation burden is a common problem existing in all the iterative algorithms which needs to be urgently settled. Motivated by the extensive applications of multi-core digital signal processors, e.g., GPU, in practical engineering, we propose to increase our running efficiency using the strategy of parallel processing. Considering that each particle in PSO is individual, the executable codes of all the particles in the same iteration can be regarded as a group of parallel tasks and can be implemented in different cores at the same time. Under the circumstances, even real-time running can be expected. Table 9. Comparison of execution time with conventional methods (s), and bold values indicate the top two results.

GHE BBHE MMBEBHE DISHE RMSHE RSIHE ADPHE Ours
In order to demonstrate the advantage of PSO optimization among concurrent methods, three other concurrent optimization algorithms: the genetic algorithm (GA) [44], ant colony optimization (ACO) [45], and the bat algorithm (BA) [46] are applied to make a comparison. For a fair comparison, all these algorithms are implemented with the same number of particles and iterations (N = 10, t max = 10), and Table 10 clearly reports the running time of the four algorithms. Note that the execution time listed is the average of five times of running.

GA ACO BA PSO
As is indicated from Table 10, the average running time of GA and ACO is obviously longer than that of BA and PSO. This is because GA contains several genetic operations, e.g., crossover and mutation, and ACO contains a series of combinatorial optimizations. BA and PSO achieve almost the same average execution time due to the fact that their update mechanisms of position and velocity are similar.

Evaluation of Mean Brightness Error Ratio
Intended to further verify the advantage of our method when it is compared with other concurrent methods, a comparison of the mean brightness error ratio δ, which reflects the optimization ability, is made in this section. Here, the definition of δ is given below: where, mean(O) and mean(Ref) stand for the mean intensity values of the enhanced image and the reference image, respectively. Table 11 records the optimized values of m 1 and Table 12 records the mean brightness errors. It can be observed from these two tables that all four of the concurrent algorithms are able to generate reasonable optimization results and the average values of δ are all lower than 3.5. However, we need to notice that apart from the 'computer' image, the δ values of our method are superior than most of the others, which convincingly verifies the optimization ability of our algorithm. We consider that this is mainly owing to the memory mechanism of PSO. Furthermore, the reason why the δ values of the 'computer' image are relatively large is that the brightness of the input itself is much higher than the reference image, so the mean brightness of the enhanced image cannot approach the reference even if m 1 = 0.

Conclusions
In this paper, a new IR image enhancement method using adaptive histogram segmentation and local contrast weighted distribution is presented. In order to separately address the fore-and background, the raw grayscale histogram is firstly smoothed by Gaussian filtering and LOWESS and partitioned via local minima. Then, a metric called grayscale density is developed to recognize whether each sub-histogram belongs to the foreground or background. For the foreground sub-histograms, a local contrast weighted distribution is calculated via modified sigmoid transformation and is applied to replace the intensity distribution, after which the detail information can be enhanced by equalization based on the proposed distribution. For the background sub-histograms, a strategy where the corresponding proportions of the whole dynamic range are maintained is considered to avoid over-enhancement. Also, an exponential factor is designed based on a characteristic of human vision and is employed to adjust the re-mapping ranges for the purpose of decreasing the effect of noise. Finally, the overall brightness of the output is corrected through the PSO algorithm based on a reference image with an appropriate mean brightness. A number of both qualitative and quantitative experiments prove that our method outperforms other conventional methods in different scenes.
We would like to acknowledge that there are 13 parameters utilized in our algorithm and most of the values are empirically set (see Section 4.1). Fairly speaking, this number of parameters may be more than other existing methods, although it can achieve a more satisfactory performance in IR image enhancement. In future work, we will pay more attention to this issue and improve the robustness and simplicity of our algorithm further.
Also, we will concentrate on transplanting the presented algorithm to an appropriate hardware platform. Compared with the state-of-the-art methods, the PSO iteration in our method hugely increases the computational complexity. Under the circumstances that we would like to improve the accuracy of optimization, more particles and more iterative rounds are needed, which will further increase the running time. However, we consider that this problem can be solved if multi-core processors, e.g., GPU, are utilized, because every particle can be regarded as an individual task and be implemented in different cores simultaneously.
Author Contributions: Minjie Wan conceived of, designed, and performed the algorithm, analyzed the data, and wrote the paper; Guohua Gu and Xavier Maldague acted as the research supervisors; Kan Ren helped modify the language; Weixian Qian and Qian Chen provided technical assistance to the research; The manuscript was discussed by all the co-authors.