Weight Quantization Retraining for Sparse and Compressed Spatial Domain Correlation Filters

Using Spatial Domain Correlation Pattern Recognition (CPR) in Internet-of-Things (IoT)-based applications often faces constraints such as inadequate computational resources and limited memory. To reduce the inference workload caused by large spatial-domain CPR filters and to convert filter weights into hardware-friendly data types, this paper introduces the power-of-two (Po2) and dynamic-fixed-point (DFP) quantization techniques for weight compression and sparsity induction in filters. Weight quantization retraining (WQR) and the log-polar and inverse log-polar geometric transformations are introduced to reduce quantization error. WQR is a method of retraining the CPR filter, presented to recover the accuracy loss. It enforces the given quantization scheme by adding the quantization error to the training samples and then re-quantizes the filter to the desired quantization levels, which reduces quantization noise. Further, Particle Swarm Optimization (PSO) is used to fine-tune parameters during WQR. Both geometric transforms are applied as pre-processing steps. The Po2 quantization scheme showed performance close to that of full precision, while DFP quantization came even closer to the Receiver Operating Characteristic of full precision for the same bit-length. Overall, spatially-trained filters showed a better compression ratio for Po2 quantization after retraining of the CPR filter. The direct quantization approach achieved a compression ratio of 8 at 4.37× speedup with no accuracy degradation. In contrast, quantization with the log-polar transform achieved a compression ratio of 4 at 1.12× speedup, but with 16% accuracy degradation. The inverse log-polar transform showed a compression ratio of 16 at 8.90× speedup and 6% accuracy degradation. All the mentioned accuracies are reported for a common database.


Introduction
Computer vision systems faced many challenges during their early development phases, which impeded the target detection performance of artificial vision systems. Human vision can easily distinguish an object despite occlusion, clutter, rotation, lighting conditions, scale, or noise; camera-based sensing, however, struggles to resolve these challenges. To mitigate these hindrances, multiple efforts have been made in the Correlation Pattern Recognition (CPR) literature. Usually, these challenges are addressed by applying an appropriate form of rotation and scale invariance, by improving the statistical approach to training the matching filter, or by extracting and exploiting scale-invariant features or geometric transforms for detection. Affixing pre-processing steps before the training and inference phases, however, incurs an extra computation cost.
Traditionally, CPR filters are trained and tested in the frequency domain. In contrast, spatial domain CPR filters are trained in the frequency domain and later converted back to the spatial domain for inference; the current paper refers to this methodology as frequency-trained (FT). In addition to this approach, this paper also considers complete training and inference in the spatial domain, known as spatially-trained (ST). Inference in the spatial domain is computationally expensive compared to the frequency domain, since it involves cross-correlation between the test image and the reference template. Inference can be performed on various devices, such as CPUs, GPUs, ASICs, or Internet-of-Things (IoT) devices.
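The cost trade-off between the two inference styles can be sketched as follows. This is an illustrative comparison (the helper names are assumptions, not the paper's code) showing that a direct spatial cross-correlation, whose cost per output pixel is the template's height times width, and an FFT-based conjugate-product correlation produce the same correlation plane:

```python
import numpy as np

def spatial_correlate(image, template):
    """Valid-mode cross-correlation computed directly in the spatial domain.
    Each output pixel costs H*W multiply-accumulates (template height*width)."""
    H, W = template.shape
    out_h = image.shape[0] - H + 1
    out_w = image.shape[1] - W + 1
    plane = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            plane[i, j] = np.sum(image[i:i+H, j:j+W] * template)
    return plane

def frequency_correlate(image, template):
    """The same correlation via the FFT: conjugate product in the frequency
    domain, zero-padded to avoid circular wrap-around."""
    sh = (image.shape[0] + template.shape[0] - 1,
          image.shape[1] + template.shape[1] - 1)
    F = np.fft.rfft2(image, sh)
    T = np.fft.rfft2(template, sh)
    full = np.fft.irfft2(F * np.conj(T), sh)
    # crop the 'valid' region so it matches the spatial version
    return full[:image.shape[0] - template.shape[0] + 1,
                :image.shape[1] - template.shape[1] + 1]
```

Both functions return identical planes; only the operation count differs, which is why spatial-domain inference motivates the compression and sparsity techniques in this paper.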

Motivation and Research Challenges
Computation Cost Associated with Spatial Domain Correlation Filters: To handle false detections under non-uniform lighting conditions, state-of-the-art CPR [1,2] employs spatial filters instead of the typical approach of training and testing filters in the frequency domain. However, real-time implementation of spatial filters demands more computational resources than frequency domain filters.
Hardware Implementation Constraints: Intrinsically, embedded systems have limited resources. So, synthesizing the state-of-the-art CPR inference on embedded systems poses many challenges. Hardware is either constrained by the number of operations that can be executed in parallel or by the memory interface transmission rate [3].
Associated Research Challenges: The problem of computation complexity and hardware constraints poses the following challenges.

• Efficiency and Computational Complexity of Inference due to the Number and Large Sizes of Spatial Domain Correlation Filters: The large size of CPR-trained templates and the number of filters required for each target, especially for out-of-plane training, make the inference phase computationally complex. This complexity grows under limitations and critical requirements such as limited available power, high throughput demand, and hard real-time processing; sparsity can therefore reduce the workload and increase inference efficiency.
• Memory Requirement of CPR Filter Weights: Full-precision filter weights have high memory requirements, which increase with the size and number of spatial filters. In that case, memory minimization is possible through filter-weight compression.
Consequently, both of the above-mentioned challenges increase the number and complexity of operations required to detect the target. To address such challenges, this paper mainly focuses on compression and retraining of CPR approaches. Subsequently, the following research gaps should be explored:
• Compression techniques for CPR filters that improve inference computation efficiency need to be explored; however, reducing the weight precision introduces quantization error, which degrades the classification accuracy. The real challenge is to maintain classification accuracy while assuring the maximum possible compression ratio.
• The computation workload of inference should be minimized without degradation in classification accuracy.
Recent research on CPR applies pre-processing steps before training filters, focusing on accuracy or invariance. These steps are used to achieve zoom, rotation, or translation invariance. Gardezi et al. [1] use the Affine Scale-Invariant Feature Transform (ASIFT) along with a spatial correlation filter to enable a fully invariant filter. Similarly, Awan et al. [4] devise an auto-contour-based technique to reduce the side lobes; this method assures higher accuracy through object segregation prior to correlation with the reference template. However, the mentioned techniques do not target the filters' compression, efficiency, or memory requirements.
The flow diagram of the proposed techniques is illustrated in Figure 1. Databases of training, validation, and testing samples are read and pre-processed through a geometric transformation (step 1 in Figure 1), which is the log-polar or inverse log-polar transform. For direct quantization, training samples pass through spatial training (ST) (step 2 in Figure 1) or frequency training (FT) (step 3 in Figure 1) before the quantization schemes are applied. Weight quantization retraining (WQR) (step 5 in Figure 1) is applied after ST. The outcome is then quantized (step 4 in Figure 1) to obtain compressed filters. The compressed filters from all methods are cross-correlated with a testing image to produce a correlation plane, which is used to generate the detection score. Particle Swarm Optimization (PSO) (step 6 in Figure 1) is employed to find and fine-tune the γ and β parameters. Further, Table 1 lists the variables used in this paper along with their descriptions.
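The role of PSO in this pipeline can be illustrated with a minimal sketch. The snippet below is a generic global-best PSO (the particle count, inertia, and acceleration constants are illustrative assumptions, not the paper's settings); in this work the fitness would be a detection-score objective evaluated over candidate (β, γ) pairs:

```python
import numpy as np

def pso_tune(fitness, bounds, n_particles=12, iters=40,
             w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal global-best PSO. `fitness` is maximized; `bounds` is a list
    of (low, high) pairs, one per parameter dimension (e.g. beta, gamma)."""
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])
    pos = rng.uniform(lo, hi, (n_particles, len(bounds)))
    vel = np.zeros_like(pos)
    pbest = pos.copy()                                  # personal bests
    pbest_val = np.array([fitness(p) for p in pos])
    gbest = pbest[pbest_val.argmax()].copy()            # global best
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w*vel + c1*r1*(pbest - pos) + c2*r2*(gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([fitness(p) for p in pos])
        better = vals > pbest_val
        pbest[better], pbest_val[better] = pos[better], vals[better]
        gbest = pbest[pbest_val.argmax()].copy()
    return gbest, pbest_val.max()
```

With a detection-score fitness in place of the toy objective, the returned vector would be the fine-tuned (β, γ) used during WQR.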

Contributions
This paper makes the following contributions:
• A Weight Quantization Retraining (WQR) method (step 5 in Figure 1) is proposed to retrain the low-precision quantized weights of the CPR filter for the dynamic-fixed-point and power-of-two (step 4 in Figure 1) quantization schemes. Further, the PSO technique (step 6 in Figure 1) is applied to optimize β and γ.
• Log-polar and inverse log-polar transforms (step 1 in Figure 1) are introduced as pre-processing strategies to support low-precision CPR filter quantization.
• An analysis is performed to compare the advantages of ST filters (step 2 in Figure 1) and FT filters (step 3 in Figure 1). This analysis is further extended within each domain, either spatially-trained or frequency-trained, to investigate the comparative benefits of the power-of-two (Po2) and dynamic-fixed-point (DFP) quantization schemes.
• An overall analysis compares the advantages of the direct, log-polar, inverse log-polar, and WQR approaches, which provides a broader perspective.

Mathematical Background and Related Work
CPR is a matched-filtering [5][6][7] technique. Reference [8] is one of the correlation-based pattern recognition approaches. During the last three decades, CPR has progressively improved the design of statistical methods for training reference templates. Typically, the CPR training phase processes different sample images through a statistical training method to prepare a reference template for the target/object. Target localization in the testing image uses cross-correlation with a stride of 1 (i.e., convolution after a 180-degree rotation), searching for the target/object at each location and producing the output correlation plane. For spatial domain filtering, each output in the correlation plane has a floating-point operational cost equal to the product of the height and width of the reference template. The presence of the target is identified by the height of the peak in the correlation plane: the relative class resemblance of the target/object is proportional to the peak height, a larger peak corresponds to a stronger probability of the target, and the absence of the target results in a broader peak in the correlation output plane. Each reference template must be designed keeping in view the trade-off between optimal correlation peak, distortion invariance, and clutter suppression [9,10]. Generally, these traits are regulated by optimized parameters. Preliminary steps to resolve the target detection problem [5] were limited to optimal optical correlators [6,7]; however, these basic template designs do not solve issues such as distortion and clutter rejection. Synthetic Discriminant Filters (SDFs) [11] were the first sound strategy to deal with these challenges; they are the earliest effort in the overall CPR domain and provide the foundation for further advancements in the field.
The later generalizations of SDFs address the invariance issue in the filter design [12,13], which partially handles the mentioned challenges. To achieve optimality and invariance [13], the SDF design emphasizes enhancing the signal-to-noise ratio in the proximity of the target, but it allows side-lobes in the proximity of the correlation peak, which complicates the estimation method and makes it difficult to obtain an optimal threshold value. Casasent et al. [14] generate a bigger training dataset by rotating and shifting each image. After that, the Minimum Average Correlation Energy (MACE) filter [15,16] and the Minimum Variance SDF (MVSDF) filter [17,18] were proposed. These approaches produce sharper peaks in the correlation output plane than their predecessors. Both filters (MACE, MVSDF) respond well against noise; however, their performance is inadequate against distortion. A trade-off must be kept between the parameters that control object/target detection despite clutter and those that handle distortion of the object/target. To answer these challenges, a breakthrough in the CPR field was made in the mid-90s: first the Maximum Average Correlation Height (MACH) filter and then the Optimal Trade-Off MACH filter were introduced. These statistical models optimize the filter response between noise, distortion, and clutter rejection, but they depend excessively on the mean of all the samples. This obstacle impedes the classifier's performance and results in false positives. The Eigen Maximum Average Correlation Height (EMACH) filter [19] mitigates the dependence on the sample average, which relatively improves the classifier accuracy. Further improvement in classification accuracy is possible with the Enhanced Eigen Maximum Average Correlation Height (EEMACH) filter [20], along with a trade-off.
WMACH [21][22][23] enhances the performance of the reference template by applying a Gaussian wavelet as a pre-processing step before training. Target search in the input scene enables CPR techniques to accurately localize the target. The MMC filter [24] exploits this CPR feature by integrating a Support Vector Machine (SVM) with CPR to pinpoint the target's location within the input scene: the CPR localizes the target while the SVM allows generalization over the input. Further enhancement of CPR performance is introduced through partial-aliasing correlation filters [25]. These filtering techniques ensure sharper peaks in the presence of a target/object; the improvement comes from mitigating the aliasing effect of circular correlation relative to linear correlation, which otherwise impedes CPR performance. CPR filters have also been employed to detect human actions in human action recognition [26][27][28][29].
Achuthanunni et al. [30] and Banerjee et al. [31] propose band-pass Laplacian-of-Gaussian (LoG) pre-processing of an unconstrained correlation filter for facial recognition. The band-pass filtering achieves a trade-off between suppressing irrelevant details and enhancing the edges for feature representation, and PSO is applied to find the optimum scale. The filter successfully handles challenges such as illumination and noise during face recognition and outperforms other correlation filters; however, it fails to detect in the presence of out-of-plane and in-plane rotation and scale changes. Akbar et al. [32] also employ a rotation-invariant correlation filter for moving-human detection. Their methodology applies color conversion and background elimination as pre-processing to enhance the correlation filters' speed and accuracy. Akbar et al. [33] propose a hardware implementation of correlation filters on an FPGA, which reduces the processing time with negligible performance loss. The hardware design is implemented in LabVIEW and may later be used in real-time security applications. Haris et al. [34] apply the MACH filter to localize targets in videos; the target is tracked using a particle filter, while motion is approximated using a Markov model, and an approximate proximal gradient algorithm limits the object tracking to target templates. Haris et al. [35] implement fast tracking and recognition using a Proximal Gradient (PG) filter and a modified MACH filter. Their tracking approach resolves the challenges of target detection and changing target coordinates. In another work [28], a blended approach is proposed to simultaneously handle noise, clutter, and occlusion: a logarithm transform and DoG are applied as pre-processing steps, and a minimum average correlation energy filter is adopted to produce sharp peaks and recognize the target.
The results show the remarkable performance of this approach compared to other correlation filters.
Reducing data precision is a straightforward approximation technique for lowering memory usage and stringent energy requirements. It also brings some accuracy degradation; therefore, determining an application's resilience against the errors introduced by bit-width reduction is vital for approximation. This approximation is feasible both at the software and architecture levels. Venkataramani et al. [36] propose an approximation approach to mitigate the energy requirements of Neural Networks (NNs). An approximation framework is presented that employs back-propagation to convert a standard trained NN into an AxNN, an approximated and energy-efficient version with almost the same accuracy. This method locates the neurons that have the least effect on accuracy and replaces them with their approximated equivalents. Different approximation versions with an energy-accuracy trade-off are produced for the original NN by adjusting the input precision and neuron weights. Retraining is used to recover the accuracy loss caused by the approximations. The authors also propose customized hardware, called a neuromorphic processing engine, that enables flexibility of weights, topologies, and tunable approximation. This engine exploits the computation and activation units to implement the AxNN and achieve a precision-energy trade-off during execution.
Rubio-Gonzalez et al. [37] propose Precimonious, a framework for approximating programs through floating-point precision reduction. The approach finds low-precision floating-point data types for a program's variables, subject to given accuracy constraints. For hardware applications, an FPGA implementation requires code changes, while software applications only require using a dedicated library or modifying data types. The framework is evaluated on different test programs, including numerical analysis applications, a scientific library, and the Numerical Aerodynamic Simulation (NAS) parallel benchmarks. The results demonstrate a 41% performance improvement from the precision reduction in data types. Pandey et al. [38] propose a fixed-point logarithm function approximation implemented on an FPGA. The approach approximates the mathematical function with a binary logarithm unit. The proposed hardware combines a fixed-point data path with a combinational logic circuit, enabling low area utilization, and is verified on a Xilinx Virtex-5 device. The hardware can approximate integer, fractional, and mixed integer-fraction inputs. Moreover, Table 2 summarizes the comparison between significant works in the CPR literature.

Optimal Trade-Off Maximum Average Correlation Height (OT-MACH) Filter
The Maximum Average Correlation Height filter [39,40] is designed with the prime objective of target/object recognition and, unlike previous methods, simultaneously handles the maximum possible distortion tolerance, the ability to discriminate objects, and the capability of dealing with noise in the test image. The MACH filter mainly comprises the criteria known as Average Correlation Height (ACH), Average Correlation Energy (ACE), and Output Noise Variance (ONV). For deriving the MACH filter, the following energy expression is used:
E(h) = α h⁺Ch + β h⁺D_x h + γ h⁺S_x h,
where α, β, and γ are non-negative filter-tuning parameters, m_x is the average of the training image vectors x_1, x_2, x_3, ..., x_n, and C is the diagonal power spectral density matrix of the additive input noise. The average power spectral density of the training images is
D_x = (1/N) Σ_i X_i X_i*,
where X_i is the diagonal matrix of the i-th training image. S_x denotes the similarity matrix of the training images,
S_x = (1/N) Σ_i (X_i − M_x)(X_i − M_x)*,
where M_x is the mean of the vectors X_i. Minimizing this energy expression while maximizing the ACH gives the filter h = m_x / (αC + βD_x + γS_x). Different values of α, β, and γ are optimized to get the required response under different test-image scenarios.
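Under these definitions, the OT-MACH synthesis can be sketched in a few lines. The snippet below assumes white input noise (so the PSD matrix C reduces to all-ones) and treats the diagonal matrices as element-wise spectra; it is a simplified sketch, not the paper's exact pipeline:

```python
import numpy as np

def otmach_filter(train_images, alpha=0.1, beta=0.1, gamma=0.8, eps=1e-8):
    """OT-MACH synthesis in the frequency domain (sketch). Diagonal matrices
    (C, D_x, S_x) act element-wise, so they are stored as 2-D spectra."""
    X = np.stack([np.fft.fft2(im) for im in train_images])  # N spectra
    m = X.mean(axis=0)                      # mean spectrum m_x
    D = (np.abs(X)**2).mean(axis=0)         # average power spectral density D_x
    S = (np.abs(X - m)**2).mean(axis=0)     # similarity / spectral variance S_x
    C = np.ones_like(D)                     # assumed white-noise PSD
    H = m / (alpha*C + beta*D + gamma*S + eps)
    # inverse FFT gives the spatial-domain reference template
    return np.real(np.fft.ifft2(H))
```

The returned spatial template is what the ST pipeline would then quantize and cross-correlate with test images.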

Eigen Maximum Average Correlation Height (EMACH) Filter
EMACH [19] is designed to reduce the false positives generated by over-emphasis on the average training image. The improved statistical method introduces β to control the contribution of the average training image to the filter design. The EMACH filter is defined by the criteria C_x^β and S_x^β: the eigenvalue λ and the corresponding dominant eigenvector of (1 + S_x^β)^(-1) C_x^β define the filter.

Log-Polar Transform
The log-polar transform is analogous to the mapping performed by the mammalian retina; it converts standard Cartesian coordinates (x, y) into log-polar coordinates ρ and θ. The log-polar transform is used for object rotation and scale invariance [41], where scaling and rotation translate into shifts of the peak position in the output correlation plane. The log-polar transform of a color image in RGB format is shown in Figure 2. Note that the transform is applied to each channel separately.
A point in Cartesian coordinates can be written as z = x + iy = r e^(iθ); in the log-polar domain, the same point is represented as log z = log r + iθ, where θ corresponds to the rotation. Rescaling an object results in a horizontal shift in the mapping, and the rescaling effect is captured through the following equation (see Figure 2): z = γ r e^(iθ) (12) so that log z = log γ + log r + iθ, where log γ corresponds to the horizontal shift.
This paper is organized into different sections, each describing a significant part of our approach in detail. Section 3 presents the methodology and its subdivisions: Section 3.2 describes the details of each quantization scheme; Section 3.3 explains the retraining of the CPR filters using the quantization error; Section 3.4 details the log-polar and inverse log-polar transformations and their support for quantization; and Section 3.5 gives the quantization configuration settings. Section 4 provides the experimental analysis, which is further divided into the experimental setup, parameter optimization, and performance analysis. In the end, Section 5 concludes the paper.
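The log-polar mapping described above can be sketched as a nearest-neighbour resampling about the image centre (the output resolution and sampling choices below are illustrative assumptions):

```python
import numpy as np

def log_polar(image, out_shape=(64, 64)):
    """Log-polar resampling about the image centre: rows sample the angle
    theta, columns sample log-radius rho (nearest-neighbour sketch)."""
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    n_theta, n_rho = out_shape
    max_r = np.hypot(cy, cx)
    thetas = np.linspace(0, 2*np.pi, n_theta, endpoint=False)
    # logarithmic radial sampling from 1 pixel out to the corner radius,
    # so uniform scaling of the object becomes a shift along the rho axis
    rhos = np.exp(np.linspace(0, np.log(max_r), n_rho))
    out = np.zeros(out_shape)
    for i, t in enumerate(thetas):
        ys = np.clip(np.round(cy + rhos*np.sin(t)).astype(int), 0, h - 1)
        xs = np.clip(np.round(cx + rhos*np.cos(t)).astype(int), 0, w - 1)
        out[i] = image[ys, xs]
    return out
```

In this representation a rotation of the input becomes a cyclic shift along the θ (row) axis and a rescaling becomes a shift along the ρ (column) axis, which is exactly the property the correlation stage exploits.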

Overview
The block diagram in Figure 3 presents an integrated framework of the proposed approach. The framework accepts a number of training instances x_1, x_2, x_3, ..., x_N and a testing sample y_i as inputs. Instances pass through the log-map (step 1 in Figure 3) or inverse log-map (step 2 in Figure 3) transforms; the details of these transforms are provided in Section 3.4. Direct quantization is applied (step 3 in Figure 3) without any transform. From this point onwards, each of these cases branches out into the FT, ST, or spatially-retrained categories. For an FT filter, the frequency transform (step 4 in Figure 3) is used. After obtaining the Correlation Height (step 7 in Figure 3) and Average Similarity (step 8 in Figure 3) in the frequency domain, regular training (step 13 in Figure 3) is performed to get H_EEMACH. The spatial response h_EEMACH is obtained after the Inverse Fourier Transform (step 14 in Figure 3) of H_EEMACH. For an ST filter, the training process is instead conducted in the spatial domain, where the Spatial Correlation Height (step 5 in Figure 3) and Spatial Average Similarity (step 6 in Figure 3) are calculated. Similarly, for Weight Quantization Retraining, the Modified Spatial Correlation Height (step 9 in Figure 3) and Modified Average Similarity (step 10 in Figure 3) are computed for retraining (step 12 in Figure 3) of the reference filter h_RtEEMACH. Retraining requires β, γ, and the already-optimized filter h_optEEMACH with floating-point precision; the details of retraining are available in Section 3.3. Consequently, the trained templates from all the approaches are quantized (step 15 in Figure 3) using the power-of-two (Po2) and dynamic-fixed-point (DFP) schemes; the details of these schemes are in Section 3.2. Subsequently, the correlation is calculated in the spatial domain, like a window operation, which generates the correlation output plane.
Lastly, the detection score evaluation (step 16 in Figure 3) is performed by post-processing of the output correlation plane. The details of post-processing are available in Section 4.

Quantization Schemes
Quantization schemes convert the pre-trained floating-point precision weights into quantized weights with minimum distance from the original filter; the magnitude of this distance depends on the type of quantization scheme.
Evaluation: For evaluating the quantization mechanism, two quantization schemes are chosen and evaluated for filter compression. These schemes are power-of-two (Po2) and dynamic-fixed-point (DFP) quantization. The resulting properties of these quantization schemes are studied in conjunction with direct, log-polar, inverse log-polar, and filter retraining.
Power-of-Two Quantization: Po2 is a state-of-the-art quantization technique used for data compression. Zhou et al. [45] implement Po2 quantization for deep networks. The technique is employed for its hardware-friendly nature: multiplication by a power-of-two weight can be performed with a shift operation, which gives it an advantage over other quantization schemes during spatial cross-correlation. Po2 quantization maps each weight to the nearest level in the set
Q = {0} ∪ {±2^m : m_2 ≤ m ≤ m_1}, (14)
where m_1 and m_2 are integers. Equation (14) presents the quantization levels for different values of m_1 and m_2; however, as given in Equations (15) and (16), these values depend on the maximum absolute value f_w of the weights in the filter and, for a given bit-width (BW), can be represented as
m_1 = ⌊log_2(4 f_w / 3)⌋, (15)
m_2 = m_1 − 2^(BW−1) + 2. (16)
Overall, the quantization levels depend on the distribution and the maximum absolute value of the weights in the filter. Since the quantization scheme is symmetric, only 2^BW − 1 of the 2^BW quantization levels are utilized, including the extra quantization level placed at zero in Equation (14); therefore, the filter may retain high accuracy. Because CPR training samples contain black backgrounds, adding the quantization level at zero increases the sparsity of the trained filters significantly.
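The scheme above can be sketched as follows; the level-set construction mirrors the description of m_1, m_2, and the zero level, with nearest-level assignment as an assumed rounding rule:

```python
import numpy as np

def po2_quantize(w, bw):
    """Power-of-two quantization (sketch): symmetric levels
    {0} U {+/-2^m : m2 <= m <= m1}, with m1 set by the largest absolute
    weight, so every multiply against a weight becomes a bit-shift."""
    m1 = int(np.floor(np.log2(4 * np.abs(w).max() / 3)))
    m2 = m1 - 2**(bw - 1) + 2        # 2^(BW-1)-1 exponents -> 2^BW-1 levels
    exps = np.arange(m2, m1 + 1, dtype=float)
    levels = np.concatenate(([0.0], 2.0**exps, -(2.0**exps)))
    # assign each weight to its nearest quantization level
    idx = np.abs(w[..., None] - levels).argmin(axis=-1)
    return levels[idx]
```

Every non-zero output is an exact power of two, and small weights near the black background collapse onto the zero level, which is where the sparsity gain comes from.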
Dynamic-Fixed-Point Quantization: There are plenty of cases in which this scheme has been successfully implemented to achieve a relatively better compression-versus-accuracy trade-off [46,47]. Unlike Po2 quantization, DFP quantization has a better peak signal-to-noise ratio (PSNR), which gives it an edge over Po2: the scheme produces equidistant quantization levels and therefore introduces less noise than the previous quantization method. Equations (18) and (19) represent the scheme. For a given bit-width (BW), the step size
Δ = f_w / (2^(BW−1) − 1) (18)
keeps the quantization levels at an equal distance from each other, and each weight w is mapped to the bounded levels by
Q(w) = Δ · round(w / Δ). (19)
Like Po2 quantization, this approach is symmetric: Equation (18) provides the normalizing and scaling of the quantization levels, while Equation (19) maps the weights onto the bounded levels.
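A corresponding sketch of DFP quantization, assuming the step size is rescaled dynamically to the largest absolute weight of the filter:

```python
import numpy as np

def dfp_quantize(w, bw):
    """Dynamic-fixed-point quantization (sketch): equidistant symmetric
    levels, with the step size scaled to the largest |weight|."""
    step = np.abs(w).max() / (2**(bw - 1) - 1)   # 2^(BW-1)-1 positive levels
    q = np.round(w / step)                       # snap to nearest level index
    q = np.clip(q, -(2**(bw - 1) - 1), 2**(bw - 1) - 1)
    return q * step
```

Because the levels are equidistant, weights smaller than half a step are rounded to zero, which is why DFP induces more sparsity than Po2 at the same bit-width.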
The weight distributions of both quantization schemes before and after quantization are presented in Figure 4. Comparing Figure 4a,b shows that Po2 has non-uniform quantization levels compared to DFP, with more levels clustered around zero, which preserves the low-value weights and helps to improve accuracy. Meanwhile, DFP induces more sparsity than Po2 quantization; this sparsity increases with the compression ratio, as more weights fall to the zeroth level. To establish a proper connection between the quantization schemes and the resulting quantization noise, Figures 5 and 6 are provided for analysis. For DFP quantization, Figure 5 illustrates the relationship between the peak signal-to-quantization-noise ratio (PSNR) and the compression ratio. A sample is taken from the Fashion-MNIST dataset and approximated using DFP quantization. Applying direct DFP to the filter, the PSNR remains constant up to 6-bit compression, after which it starts to decrease because the quantization interval doubles with each removed bit, which doubles the compression ratio. Similarly, DFP quantization has a better PSNR after the log-polar transform, while both variants have the same PSNR values from 2-bit (CR = 8) to 1-bit compression (CR = 16). Likewise, in Figure 6, applying Po2 quantization on a sample and monitoring the PSNR at each compression level yields a constant PSNR up to 3-bit compression. From that point, as the compression ratio increases, the new quantization levels are added near the zeroth level, which causes more quantization noise in the compressed version. For direct Po2 quantization, the PSNR gradually falls toward 2-bit (CR = 8) and 1-bit (CR = 16) compression.
Overall, Po2 quantization after the log-polar transform achieves a better PSNR than direct Po2 quantization, but this behavior reverses after 3-bit compression: the PSNR for 2-bit (CR = 8) and 1-bit (CR = 16) compression falls more rapidly than with direct Po2 quantization. Comparing the two quantization approaches, it is evident that DFP quantization achieves better PSNR values than Po2 quantization, whereas the PSNR of Po2 remains insensitive to most compression values. Moreover, for 1-bit compression (CR = 16), the PSNR of DFP quantization drops to −40 dB, while that of Po2 quantization drops only to −32 dB; therefore, it is clear from Figures 5 and 6 that Po2 loses less PSNR at low-bit compression than DFP.
These observations hint at a superior accuracy of DFP quantization compared to Po2 up to 5-bit compression; at 4-bit compression the accuracy of both should be equal, while for 3-bit compression and below, Po2 quantization should have greater accuracy.

Retraining the CPR Filter
Quantization error causes inaccurately trained filters. Fine-tuning the quantized trained filter is an approach to obtain more accurate filters, which is possible by retraining the filter. A mathematical framework is proposed for retraining, in which a quantization-error term is added to the already-defined statistical training method. Equation (20) adds the quantization error h_eq of the trained filter to each sample x_i. In Section 3.2, we have already seen the PSNR degradation with increasing compression ratio; this degradation differs between the Po2 and DFP quantization schemes due to their quantization noise. To compensate for the introduced quantization, the quantization-error coefficient ξ, with 0 < ξ < 1, controls the contribution of the quantization error h_eq to the filter design. The modified Average Image Correlation Height (mAICH) criterion has the additional term h_eq for each sample x_i because it is added to each sample as well as to the average of samples, m. Equation (22) presents the mAICH after substitution of v_i and m_h.
where h + is the complex conjugate transpose of h.
In Equation (23), β denotes the contribution of m_h, and mC_x^β is the average of the correlation peak intensities of the (v_i − βm_h) samples. Ideally, all training images should follow this convention, in which a partial average of the training samples is subtracted from v_i. To achieve this, every sample v_i should have an identical output correlation plane, like the ideal output correlation plane f. To find the f that best suits all samples' correlation output planes, the deviation between their correlation planes must be minimized. Equation (24) describes this deviation as the average square error (ASE).
The average square error between f and g_i is given in Equation (24), where h* is the complex conjugate of h. To achieve the maximum peak, the partial derivative with respect to f should be set equal to zero, as given in Equation (26). In this equation, f_opt is the optimized filter; after solving Equation (26) and substituting g_i into Equation (27), we get Equation (28), where M_h = M + ξH_eq, H_eq is a diagonal matrix having h_eq along its main diagonal, and M is a diagonal matrix having m along its diagonal. Substituting Equation (21) into Equation (23) gives Equation (29). The next step is to modify the Average Similarity Measure (ASM), which defines the dissimilarity of the training samples to (1 − β)M_h h*; the modified measure is referred to as the modified ASM, or mASM.
where mS_x^{β,γ} is a diagonal matrix. The eigenvalue λ and the eigenvector of (1 + mS_x^β)mC_x^β define the filter. In Equations (28) and (30), mS_x^{β,γ} and mC_x^{β,γ} are the modified forms of S_x^{β,γ} and C_x^{β,γ}. Figure 7 shows the histograms of the floating-point filter, its quantized version, and the retrained quantized version. The illustration clearly demonstrates the displacement of the weight values of the retrained filter: the intensity values of the filter shift to adjust to new values. The complete retraining process for a 3-by-3 snip of the filter is shown in Figure 8a. The floating-point precision filter h_f transforms into a quantized version, h_q. Further, the quantization error h_eq is calculated to support the retraining process. This process yields the filter h_rt using Equations (31) and (32), which is a retrained version of the filter in floating-point precision. Finally, its quantized version h_rtq has a reduced quantization error h_eq. Figure 8a demonstrates the function of the retraining approach, as the quantization error h_eq in the retraining method (see the first row in Figure 8a) is less than in direct quantization (see the second row in Figure 8a). Note that the retrained filter in floating-point precision, h_rt, alters its weights to reduce the quantization error h_eq. Figure 8b illustrates a part of the filter before and after the retraining process. Not all weights of the filter change value, because the alteration is limited to certain intensities. The above observations confirm that WQR reduces the quantization error h_eq. We expect that WQR will reduce the accuracy degradation that trained CPR filters suffer due to the quantization process.
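The WQR loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation (which is in MATLAB and retrains through the mAICH/mASM framework): the uniform quantizer and the plain sample-average "retraining" step are stand-ins.

```python
import numpy as np

def quantize_uniform(w, step=0.25):
    """Stand-in uniform quantizer (the paper uses Po2 or DFP schemes)."""
    return np.round(w / step) * step

def wqr_step(h_f, train_samples, xi=0.5, step=0.25):
    """One weight-quantization-retraining pass (simplified sketch).

    h_f: floating-point filter; train_samples: list of training images of
    the same shape; xi: quantization error coefficient (0 < xi < 1).
    """
    h_q = quantize_uniform(h_f, step)           # direct quantization
    h_eq = h_f - h_q                            # quantization error
    # Inject the xi-weighted error into each training sample and retrain;
    # the true retraining uses the mAICH/mASM framework, stood in for here
    # by a plain sample average.
    h_rt = np.mean([x + xi * h_eq for x in train_samples], axis=0)
    h_rtq = quantize_uniform(h_rt, step)        # re-quantize retrained filter
    return h_q, h_eq, h_rt, h_rtq
```

The returned h_rtq corresponds to the quantized retrained filter whose residual error the paper shows to be smaller than that of direct quantization.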

Geometric Transform
Quantizing the magnitude of 2-dimensional filters introduces quantization error, which degrades quality and causes accuracy loss during the inference process. PSNR measures the ratio between the maximum possible signal power and the noise power; this ratio estimates the quality after quantization. Equation (33) represents the PSNR, while the noise power in the denominator is defined by the Mean Square Error (MSE), the average of the squared pixel-by-pixel difference between the original image and its approximated version. The MSE also depends on the variances of the original and quantized signals: Equation (34) establishes the relationship between the MSE and the signal variance. For a large number of pixels, the variances of the original and estimated images contribute more than the equation's last three terms. The equation implies that a higher variance of the original signal and of its quantized version results in more error, as is evident from the mathematical proof presented in Appendix A.
where MAX_f is the maximum possible pixel value for a given bit-width.
where Y_i and Ŷ_i denote the original image and the estimated image, respectively; the two σ² terms are the variances of the original image and the estimated image, respectively; and N denotes the total number of pixels in the original image.
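The PSNR of Equation (33) can be sketched as below; this is the standard computation, with the function name and the `bits` parameter being illustrative choices.

```python
import numpy as np

def psnr(original, estimate, bits=8):
    """Peak signal-to-noise ratio in dB (Equation (33) style)."""
    mse = np.mean((original.astype(float) - estimate.astype(float)) ** 2)
    if mse == 0:
        return float("inf")            # identical images: no noise
    max_f = 2 ** bits - 1              # maximum value for the given bit-width
    return 10.0 * np.log10(max_f ** 2 / mse)
```

A larger quantization error inflates the MSE in the denominator and thus lowers the PSNR, which is the degradation tracked across compression ratios in Section 3.2.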
To enhance the PSNR for a given compression ratio, minimizing the variance of the signal and of its approximated version requires some transformation of the original signal. Here, we introduce two types of geometric transforms to reduce the variance, described in the next two subsections.

Reducing the Standard Deviation Using Log-Polar Transform
Sabir et al. [41] have already demonstrated that applying the log-polar transform has a negligible influence on classification accuracy. This paper demonstrates an additional property of log-polar beyond scale and rotation invariance: the transform alters the distribution of intensity levels, resulting in a reduced standard deviation in the transformed image compared to the original image. This property is useful for the quantization of the filter weights. A grayscale sample of a shirt is selected to understand the effect of the log-polar transform on the intensity value distribution. Figure 9a is the picture of a shirt with strips of various intensity levels. Figure 9c is the histogram of the image, which shows that a large portion of the image has a zero intensity level, while the 100th intensity level has the second-largest occurrence. From the overall distribution, the estimated standard deviation is 63.91. Figure 9b represents the picture's log-polar transform, which is a distorted form of the image but changes its intensity distribution. The histogram in Figure 9d shows that the transform reduces the frequency of black from 1200 to just 200, while the occurrence of the 100th intensity level rises from less than 200 to ∼280; it thus alters the histogram distribution of the shirt image. Under the log-polar transform, the standard deviation of the image falls from 63.91 to 45.62, showing that the intensity levels are now distributed in a more compact form than before; therefore, it becomes more efficient and convenient to apply any quantization scheme to represent the intensity levels, since this improves the PSNR value. For log-polar quantization, higher PSNR values compared to direct quantization confirm the better resilience of this method, as shown in Figures 5 and 6.
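The remap can be illustrated with a minimal nearest-neighbour log-polar sketch; the sampling grid, output size, and fill value are assumptions, not the paper's implementation.

```python
import numpy as np

def log_polar(img, out_shape=(64, 64)):
    """Nearest-neighbour log-polar remap of a grayscale image (illustrative)."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r_max = np.hypot(cy, cx)
    n_rho, n_theta = out_shape
    out = np.zeros(out_shape, dtype=img.dtype)   # unsampled cells stay 0
    for i in range(n_rho):
        # logarithmically spaced radius: dense sampling near the centre
        r = r_max ** ((i + 1) / n_rho)
        for j in range(n_theta):
            theta = 2.0 * np.pi * j / n_theta
            y = int(round(cy + r * np.sin(theta)))
            x = int(round(cx + r * np.cos(theta)))
            if 0 <= y < h and 0 <= x < w:
                out[i, j] = img[y, x]
    return out
```

Because the radial axis is log-spaced, the remap resamples the image non-uniformly, which is what reshapes the intensity histogram (and, for the shirt example above, lowers the standard deviation from 63.91 to 45.62).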

Reducing the Standard Deviation Using Inverse Log-Polar Transform
One of the many properties of the log-polar transform is its reversibility: an image can be converted back to its original by using a 2-dimensional inverse log-polar transform. Figure 10b shows a transformed image but, unlike the previous transform, the transformed object in the image is reduced in size and quality because many horizontal features of the image are almost curbed. The inverse log-polar transform is demonstrated in Figure 10d; the standard deviation of the image is further reduced to 42.48, but, in this process, the zero-intensity count increases to 2300, almost double that of the log-polar transform. Equations (35) and (36) show the conversion of (ρ, θ) into the x and y Cartesian coordinates; however, when θ varies across its range (0 to 2π), the Cartesian coordinates x and y range over 0–r. Figure 10b represents the evidence that the frequency of most intensity levels beyond zero is modified and reduced to the minimum level.
where ρ denotes the logarithm of the distance between the given point and the origin, and θ denotes the angle between the x-axis and the line through the origin and the given point. Based on these observations, we can expect that applying log-map or inverse log-map pre-processing will reduce the quantization noise, which indirectly increases the compression ratio of spatial CPR filters.

Configurations for Weight Quantization
To understand quantization and re-quantization, it is necessary to first understand the different quantization methods and their configurations with or without a transform. Figure 11a illustrates the direct quantization method, in which regular training of the filter h is followed by quantization, i.e., either DFP or Po2 quantization is performed for a given bit-width. Figure 11b represents the retraining method, in which intensity levels are reinforced using the retraining process. First, as in direct quantization, the filter h_q is obtained after regular filter training. Then, using h_q, a separate retraining process is employed for each quantization approach (DFP and Po2). Finally, after the re-quantization process, h_qrt is obtained, which is the quantized form of the retrained filter. In Figure 11c, transforms are applied to support the quantization process: the resulting filter h is transformed using a log-polar or an inverse log-polar transform, and, for a given bit-width, each quantization technique is performed to obtain a different filter h_q^l. Figure 12 illustrates the overall experimental setup, consisting of its different components.
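The two quantizers can be sketched as follows, under assumed conventions: Po2 snaps each weight to a signed power of two (so multiplications become shifts), while DFP shares a block exponent derived from the weights' dynamic range. The exact bit layouts used in the paper may differ.

```python
import numpy as np

def quantize_po2(w, exp_bits=4):
    """Snap each weight to a signed power of two (illustrative Po2 scheme)."""
    sign = np.sign(w)
    mag = np.where(w == 0, 1.0, np.abs(w))            # avoid log2(0)
    lo, hi = -(2 ** (exp_bits - 1)), 2 ** (exp_bits - 1) - 1
    exp = np.clip(np.round(np.log2(mag)), lo, hi)     # nearest exponent
    return np.where(w == 0, 0.0, sign * 2.0 ** exp)

def quantize_dfp(w, bits=8):
    """Dynamic fixed point: shared exponent chosen from the block's range."""
    max_abs = np.max(np.abs(w))
    if max_abs == 0:
        return np.zeros_like(w)
    int_bits = int(np.ceil(np.log2(max_abs))) + 1     # sign + integer part
    frac_bits = bits - int_bits                       # remaining fractional bits
    scale = 2.0 ** frac_bits
    q = np.clip(np.round(w * scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q / scale
```

Applying either function to a trained filter h yields the h_q of Figure 11a; the transform-based variants of Figure 11c would quantize the log-polar (or inverse log-polar) remapped filter instead.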

CPR Filter Implementations and Setting
EEMACH [20] and its derivatives [44] have shown remarkable performance compared to other CPR filters. The literature shows their superior clutter-rejection capability compared to other methods, and the experiments that have shown the best results were conducted with N_v = 1 rather than other values of N_v. The same setting of N_v is applied in our experiments.
To reiterate the proposed approaches, we applied the two mathematical quantization techniques to the filters, as demonstrated in the block diagrams of Figure 11. Then, either a scale, moving-lighting, rotation, or classification test is performed using cross-correlation (conv.m). After this, the evaluation and detection score produces the analysis graphs separately for each test.

Database
For evaluation, the experimental work is carried out on publicly available datasets [48]. These datasets contain test images, with or without a (black) background, in different poses that vary from 0 to 180 degrees out of plane at different elevation angles. For the training phase, we use images without a background. These training snips are centered in the middle of the test image, which makes them ideal for correlation pattern recognition. To analyze the responses of precision reduction in filters, dataset 01 is specifically used to evaluate the ROC comparison of the different techniques and methods adopted in this paper. Similarly, to study the precision reduction responses against scale enhancement and lighting alterations, dataset 02 and dataset 03 are employed, respectively.

Evaluation Framework
To evaluate the efficiency of the proposed techniques, we first outlined an appropriate framework for experimental evaluation. After choosing the database, the next step is to define the performance evaluation framework. Instead of performing a lexicographical scan, an equal window size for both the filter and the full test image is considered. The block diagram of this framework is shown in Figure 3. Three different objects are demonstrated in Figure 13. At a 30-degree elevation angle, the images of each object are divided into six sections, each containing six object images at six consecutive out-of-plane angles. These sections cover the successive intervals 0–30, 35–55, 60–90, 95–125, 130–160, and 165–180 degrees, with a 5-degree incremental gap within each section. For testing, we use an image at a 50-degree elevation angle; an example is demonstrated in Figure 14. Thus, a total of 18 filters are formed, six for each object. To assess the clutter-rejection capability of each filter, we draw 2560 clutter images from the database. The filter response of each filter has a maximum value called the correlation-output peak intensity (COPI). Instead of directly considering the raw correlation plane cp and its peak value for measuring the target object's existence, we consider a mathematically derived form of the correlation plane output intensity, as presented in Equations (37) and (38). This mathematical transform ensures the output quality of the correlation plane: the raw correlation plane response does not convey the quality of the correlation, and considering only the maximum value in the output correlation plane provides no information about the suppression of side lobes. Therefore, during the correlation process, there is a high probability that the final response has a high correlation-output peak intensity but that, after the mathematical processing, the correlation peak is reduced.
Conversely, a correlation plane with a lower raw peak intensity but well-suppressed side lobes might see its peak increase after the mathematical processing.
Here, cp_j is the raw correlation plane of test image j, σ_ϑj is the standard deviation of the mean-subtracted normalized correlation plane ϑ_j, and ncp is the normalized correlation plane. The correlation-output peak intensity of this plane serves as an object-detection score. Table 3 presents sample COPIs and their corresponding detection scores for both quantization schemes. Here, ncp_j is the correlation response of test image j, and ι is the absolute peak correlation intensity used in Equation (39). In Equation (40), Δ% is the percentage difference between the threshold and the COPI of test response j. The average normalized correlation peak response τ of the training instances i = 1, 2, 3, …, N is multiplied by a factor of 0.5 in Equation (41).
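Since Equations (37)–(41) are not reproduced here, the following sketch shows one plausible form of the normalization and detection score: mean subtraction, division by the plane's standard deviation, and a threshold of half the average training peak. Treat the exact formulas as assumptions.

```python
import numpy as np

def detection_score(cp):
    """Normalized correlation plane and its peak (assumed form of Eqs (37)-(38))."""
    theta = cp - cp.mean()              # mean-subtracted plane
    ncp = theta / theta.std()           # normalize by its standard deviation
    return ncp, float(ncp.max())        # COPI = peak of the normalized plane

def percent_difference(copi, train_peaks):
    """Delta% against tau = 0.5 * average training peak (assumed form of Eqs (40)-(41))."""
    tau = 0.5 * float(np.mean(train_peaks))
    return 100.0 * (copi - tau) / tau
```

Under this normalization, a plane with strong side lobes has a large standard deviation, which deflates its peak score, matching the paper's point that the raw peak alone says nothing about side-lobe suppression.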

Parameter Optimization
To maximize the filter response, appropriate parameters should be selected for each filter. For this purpose, we establish a cross-validation set for each filter. This validation set has a test image from the training set of the corresponding filter as the true class and 100 clutter images, randomly chosen from the 2560 clutter images, as the false class. A cross-validation set of 600 clutter images is thus defined for each object. Previously, the Peak-to-Side-lobe Ratio and the Fisher Ratio were used to estimate the binary class difference; in this paper, however, we employ a simple ratio of the mean correlation-output peak intensity of the false class, µ_ncp_F, to that of the true class, µ_ncp_T. This peak ratio is given in Equation (42).
We select the optimal β value with the minimum ratio P_r, searching across the range 0–1 using PSO for each compression ratio. The quantization process for each filter results in a quantization error that differs among the direct quantization, log-polar, inverse log-polar, and filter retraining methods; therefore, each compression ratio holds a different set of optimal parameters.

PSO-Based Optimization of γ and β
The classical PSO is a self-organizing approach that holds the powerful property of dynamic non-linearity. Our problem is non-linear, and we want to calculate the parameter(s) yielding the minimum objective function value; this makes PSO an ideal solution for parameter searching, as in previous literature [36]. This property ensures a trade-off between the positive and negative responses: the positive response supports constructing the swarm structures, while the negative response acts as a counterweight to this construction. Overall, the method provides a stable and complete solution to a non-linear problem. PSO also offers a balance between the exploitation and exploration of a solution. Potential solutions are known as particles, which interact extensively with neighboring particles; this interaction spreads the updated information throughout the swarm. The search space of the filter retraining method is not limited to a single parameter. We employed the classical PSO technique to find the optimal values of β and γ while minimizing the objective function, the peak ratio P_r, where v_i,j = [v_1,j, v_2,j, …, v_20,j] is the velocity vector of the twenty particles and p_i = [γ_i, β_i], with β ∈ [0, 1] and γ ∈ [0, 1], for particle i and dimension j. Velocities and particles are randomly initialized: each particle is initialized with uniformly random values in the bounded range [0, 1]. For each particle, a filter is separately retrained and cross-correlated with the false and true images. Subsequently, the P_r ratio is calculated using the averages of the peak intensities of the false and true samples defined in the cross-validation dataset. The P_r value is minimized for each filter using the algorithm given in Figure 15. PSO exits either on completion of the maximum number of epochs or on obtaining the same minimum value of P_r for a number of epochs. Equations (43) and (44) are used to update the velocity and position of each particle.
v_{i,j}^{itr+1} = w·v_{i,j}^{itr} + s_1·r()·(pBest_{i,j} − p_{i,j}^{itr}) + s_2·R()·(gBest_j − p_{i,j}^{itr}), and p_{i,j}^{itr+1} = p_{i,j}^{itr} + v_{i,j}^{itr+1}, where itr denotes the iteration, i = 1, 2, …, 20, and j = 1, 2. s_1 and s_2 are acceleration constants; r() and R() are random functions; and w, 0 ≤ w < 1, denotes the influence of motion in the previous iteration. pBest is the particle's best position, while gBest is the global best position.
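A classical PSO of this form (20 particles, 2 dimensions, bounded to [0, 1]) can be sketched as below; the objective here is a placeholder for the peak ratio P_r, and the hyperparameter values are illustrative.

```python
import numpy as np

def pso_minimize(objective, n_particles=20, n_dims=2, iters=150,
                 w=0.7, s1=1.5, s2=1.5, seed=0):
    """Classical PSO over [0, 1]^n_dims (sketch of Equations (43)-(44))."""
    rng = np.random.default_rng(seed)
    p = rng.uniform(0.0, 1.0, (n_particles, n_dims))   # positions (gamma, beta)
    v = np.zeros((n_particles, n_dims))                # velocities
    pbest = p.copy()
    pbest_val = np.array([objective(x) for x in p])
    gbest = pbest[np.argmin(pbest_val)].copy()
    for _ in range(iters):
        r = rng.random((n_particles, n_dims))
        R = rng.random((n_particles, n_dims))
        v = w * v + s1 * r * (pbest - p) + s2 * R * (gbest - p)  # Eq. (43)
        p = np.clip(p + v, 0.0, 1.0)                             # Eq. (44), bounded
        vals = np.array([objective(x) for x in p])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = p[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, float(pbest_val.min())
```

In the paper's setting, `objective` would retrain the filter for the candidate (γ, β), cross-correlate it with the validation set, and return P_r.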

Rotational Analysis
For comparison, the full-precision responses of the 16-bit spatially-trained (ST) and frequency-trained (FT) EEMACH filters are considered a baseline for the DFP- and Po2-quantized trained filters. For each compression ratio, the ST and FT filters are quantized using both of the proposed approaches.
Commencing with a compression ratio of 16, the DFP and Po2 quantization schemes are separately analyzed for the spatial and frequency domains. The contemplated strategies show nearly identical performance graphs. The detection responses of all the quantization schemes are slightly above the threshold. For the direct quantization and retraining filters, the average detection response is ∼23% above the baseline. In the case of the inverse log-map transform, the average response is ∼30% above the baseline in Figure 16a (box 1). The log-map quantization response has a slight dip around 100 degrees in Figure 16a (dip 1) and 300 degrees in Figure 16a (dip 2). On average, the log-map response is ∼35% below the baseline in Figure 16a (box 2).
With a compression ratio of 8 for the inverse log-map, the average response reduces to ∼15% above the baseline in Figure 16e (box 3). The direct and retrained quantization filters show responses almost similar to the reference responses. The log-map pre-processing average response is ∼40% below the baseline in Figure 16e (box 4). In the second tetrad (e–h), the log-map response remains lower than the threshold in the 50–300 degree interval in Figure 16e (dip 3) for each quantization type. Contrary to the frequency-trained filter, the response of the spatially-trained filters for log-map quantization remains below the threshold within the 75–270 degree interval in Figure 16g (dip 4). Subsequently, for all spatial responses below compression ratio 8, the log-map pre-processing response for both quantization types diminishes below the threshold from 75 to 270 degrees in Figure 16o (dip 5). No significant change is observed in the rest of the responses compared to the previous cases; all the responses, except for the log-map quantization, do not vary significantly.
Regarding the compression ratio of 5.33, the average response of the log-map pre-processing remains ∼30% below the baseline; however, for all compression ratios below 5.33, the log-map pre-processing for FT filters with Po2 quantization shows a response that diminishes below the threshold in the 0–190 degree interval in Figure 16m (dip 6). For compression ratio 4 or below, the DFP quantization for the FT filter remains below the threshold in the 50–190 degree interval. Consequently, the full-precision and direct quantization responses have almost identical curves, with a gradual drop as the compression ratio decreases.

Scale and Moving Light Analysis
Dataset 2 includes images of a car at different scales; some samples are shown in Figure 17. To investigate the resilience of the compressed configurations in handling the target's scalability, the filter detection responses are measured on a scale of 0–400% of the original target size, as given in Figure 18. Similar to the rotational test, each set of four graphs is obtained for the corresponding compression ratios: 16, 8, 5.33, 4, 8, and 1.33.
For the compression ratio of 16, the full-precision detection response for both the ST and FT filters is above the threshold up to 125% scalability; beyond this scale, the response mainly revolves around the threshold value. However, the detection response of the inverse log-map is well above the threshold, with a slight fall around the 225% scale in Figure 18a (dip 1). For direct and retrained quantization, the detection score is above the threshold up to 350% in Figure 18a (dip 2), whereas log-map pre-processing is only successful up to the 80% scale in Figure 18a. When the compression ratio is 8, the detection responses of all the curves do not change much, except for the log-map pre-processing; in that case, the response continues to increase above the threshold on the scale of 0 to 400% in Figure 18e (box 1). Overall, compression ratio 8 is found to be more resilient to scale enhancements for each type of quantization.
For the compression ratio of 5.33, the FT filter's detection score for the inverse log-map pre-processing has a slightly deeper dip at almost 225% scale enhancement in Figure 18i (dip 4). This drop stays above the threshold for the spatially-trained filter in Figure 18k (dip 5). On the other hand, for compression ratios below 5.33, this dip in the detection score, Figure 18m (dip 6), goes even deeper for the ST and FT filters for both compression schemes, but the log-map pre-processing shows a detection score below the threshold for the FT-quantized filter versions.
Conversely, the detection score remains well above the threshold for the ST versions of the filters. By analyzing the curves for the rest of the compression ratios, the remaining quantization versions do not considerably alter their detection scores. Following a comprehensive analysis of the graphs in Figure 18, the detection responses of the ST-quantized filters are more resilient to scale enhancements than those of the FT-quantized filters. Dataset 3 includes more than 1000 car images captured under various lighting conditions. Each image has a background and belongs to a specific set of images developed by incremental rotation from 0 to 360 degrees under a particular light setting around the car, as shown in Figure 19. Overall, the compressed versions of the filter exhibit excellent responses under different lighting conditions. For brevity, the compression ratios of 8 and 16 are demonstrated in Figures 20 and 21, respectively. Generally, for all compression ratios, the log-map and inverse log-map pre-processing exhibit superior performance compared to the baseline; this is equally valid for both the ST and FT filters. However, the retrained and direct quantization filter responses are below the baseline. In Figure 20, the responses of each graph for all the quantized instances are found to be identical. In comparison, Figure 21 shows a better response for the FT filter than for the ST filter in both the inverse log-map and the log-map cases.
Beyond a compression ratio of 8, the performance graphs do not change; however, their responses remain well above the threshold. Table 4

[Plot legend residue: panels (a) frequency-trained, Po2; (c), (g), (k) spatially-trained, Po2; (l) spatially-trained, DFP. Series: Full_Precision, Direct_Quantization, LogPolar_Quantization, InverseLogPolar_Quantization, Retrain_Quantization; x-axis: Scale (%).]

Figure 21. CPR responses for the object under different moving lighting conditions, at a compression ratio of eight. The first row represents the Po2 frequency-trained and DFP frequency-trained filter graphs, respectively; the second row represents the Po2 spatially-trained and DFP spatially-trained filter graphs, respectively.

ROC Comparative Analysis
Typically, the evaluation analysis of the CPR paradigm is carried out through the conventional Receiver Operator Characteristic (ROC). Previously, the EMACH [19] and EEMACH [20] were analyzed in the available literature using the ROC approach, but issues with the ROC results made it insignificant for statistical analysis and inconsistent in application. Each compression level has an ROC curve and is therefore considered a separate classifier, so there are many ROC curves. The full-precision implementation has a distinct ROC curve for each training method, and these FT and ST curves provide baselines for comparison at the corresponding compression rates. Since each approach holds 16 curves for each training method, a total of 32 ROC curves are evaluated per method. All 32 ROC curves should be compared with the corresponding full-precision baseline ROC curve to find the compression outcome for each bit-width. In our experimental analysis, Z, D, E, and their corresponding p-values are used to describe the likeness between the baseline ROC and the corresponding compressed versions for each method, e.g., direct, log-map, and inverse log-map. The E measure can highlight the differences between any two paired ROCs, and the E values with their p-values are more significant than Z and D; therefore, for brevity, we only discuss the bit-widths having the minimum E value. See Appendix B for detailed results. Figure 22a illustrates the E values and p-values [49,50] for all the above-mentioned methods for ROC comparison. E and its p-value demonstrate an integrated absolute difference between two ROC curves, so a smaller E value shows closeness between the two ROC curves, while a large E value with p < 0.05 illustrates ROC degradation due to the corresponding bit-width compression. For direct quantization, the FT filter has the least E value, 1308, with p-value < 0.92 for both Po2 and DFP.
This indicates a strong closeness between the ROCs, implying classification performance nearly equal to the original ROCs. The ST filter has E = 2368 with p-value < 0.1525 and E = 1106 with p-value < 0.106 for Po2 and DFP, respectively, which means relatively less closeness compared to FT. For the log-map transform, the E value of all FT and ST quantization schemes varies between 83,184 and 88,000 with p-value < 2.2 × 10⁻¹⁶, showing classification performance degradation for all bit-width compressions. For the inverse log-map transform, the E value of all FT and ST quantization schemes varies between 29,714 and 32,948 with p-value < 2.2 × 10⁻¹⁶, demonstrating relatively less classification performance degradation than the log-map transform for all bit-width compressions. For WQR, the ST filter has E = 3618 with p-value < 0.028 and E = 2442 with p-value < 0.053 for Po2 and DFP, respectively. These bit-widths have better classification performance than the log-map and inverse log-map but worse than direct quantization.
For comparing ROC curves, the area under the curve (AUC) is assumed as the accuracy measure. In Equation (45), we present the parameter Z and its p-value to find the difference between the AUCs of the two curves.
In Equation (45), θ₁ and θ₂ denote the respective AUCs of ROC1 and ROC2, while σ² is the standard deviation of the difference between the thetas. Figure 22b illustrates the Z values and p-values [51]. For the inverse log-map transform, the Z value is 11.198 with a p-value of 2.2 × 10⁻¹⁶, illustrating a less significant AUC loss than the log-map. For WQR, the ST filter has Z = −1.4954 with p-value < 0.1348 for Po2 and Z = −0.3745 with p-value < 0.708 for DFP, which shows better AUCs for these bit-widths than the log-map and inverse log-map transforms. Equation (46) gives another measure, D, to compare the AUCs of ROCs. In Equation (46), θ_r and θ_s denote the respective AUCs of ROC curves r and s, while σ²_r and σ²_s are the variances of r and s, respectively. Figure 22c demonstrates the D values and p-values [52] for AUC comparison; the D values further support the results of E and Z. The D values and p-values for direct quantization show no significant AUC loss for any quantization method; the D value varies between −1.4993 and 0.22624. For the log-map transform, the D value ranges from 18.697 to 20.433 for all quantizations, demonstrating considerable AUC degradation. For the inverse log-map transform, the D value ranges from 9.548 to 11.251 for all quantizations, demonstrating less AUC degradation than the log-map transform. Figure 22d illustrates the AUC values for each quantization method for both ST and FT filters and their comparison with the baseline AUCs.
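The Z and D statistics can be sketched directly from these definitions; the two-sided p-value from a standard normal is an assumption about how the cited tools compute it.

```python
import math

def z_statistic(theta1, theta2, sigma_diff):
    """Z of Equation (45): AUC difference over the std. dev. of the difference."""
    return (theta1 - theta2) / sigma_diff

def d_statistic(theta_r, theta_s, var_r, var_s):
    """D of Equation (46): AUC difference scaled by the pooled variance (assumed form)."""
    return (theta_r - theta_s) / math.sqrt(var_r + var_s)

def two_sided_p(stat):
    """Two-sided p-value for a standard-normal test statistic."""
    return math.erfc(abs(stat) / math.sqrt(2.0))
```

A small |Z| (or |D|) with a large p-value indicates that the compressed ROC is statistically indistinguishable from the full-precision baseline, which is the criterion applied throughout Figure 22.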
To demonstrate the benefits of quantization for the CPR filters, the performance parameters are presented as column graphs in Figures 23-25. The compressed versions of these filters are selected based on the least E value. The selected performance parameters are sparsity, CPU execution time, and memory minimization. Here, sparsity refers to the number of zero weights in the trained filters; more zero weights reduce the floating-point operation workload and speed up the convolution process during inference. The sparsity of the corresponding quantization schemes is displayed in Figure 23. On average, the direct and inverse log-map quantization schemes achieve better weight sparsity than the full-precision versions; the best case is Po2 compression with a compression ratio of 16. In contrast, the log-map sparsity is insignificant, and in a few instances it lagged behind full precision. The second performance measure is memory minimization, shown in Figure 24. Again, the inverse log-map and direct schemes have meager memory requirements compared to the full-precision version, while the log-map memory requirements are modest. The third measure is the CPU execution time, given in Figure 25. The CPU is an Intel i5-2500K (3.30 GHz, 4 cores) in a 64-bit system with 16 GB RAM, and the CPU time is measured using the standard tic and toc functions available in MATLAB. Overall, the inverse log-map has the least execution time on the CPU, followed by the direct quantization scheme. The fastest case is Po2 quantization at a compression ratio of 16, which showed a ∼8.90× speed-up over the full-precision implementation.

Figure 23. Sparsity comparison between direct, log-map, and inverse log-map quantization for the filter bank. Note that this is filter sparsity.
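The two static measures reported above, filter sparsity and memory compression ratio, can be computed with a short sketch; the helper names and the 64-bit full-precision baseline are assumptions for illustration:

```python
import numpy as np

def filter_sparsity(weights):
    """Fraction of exactly-zero weights in a quantized filter.
    Zero weights skip multiply-accumulate work during convolution."""
    w = np.asarray(weights)
    return np.count_nonzero(w == 0) / w.size

def compression_ratio(full_bits, quant_bits):
    """Memory compression ratio when full-precision weights
    (assumed 64-bit here) are stored at a reduced bit-width."""
    return full_bits / quant_bits

print(filter_sparsity([0.0, 0.5, 0.0, -0.25]))  # 0.5
print(compression_ratio(64, 4))                  # 16.0
```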

Conclusions
The spatial-domain CPR filters require substantial computational resources and memory. The proposed weight quantization is imperative to reduce the computation workload, processing time, and memory footprint. We propose the WQR approach, together with pre-processing steps such as the log-map and inverse log-map transforms, to recover the accuracy lost when quantizing full-precision weights. WQR regulates the filter retraining process by fine-tuning the weights under any stated quantization scheme, and PSO is used to select the WQR training parameters. Quantization error causes more accuracy loss in the ST filters than in the FT filters, and WQR alleviates this loss. For direct quantization, no accuracy degradation occurs at 9.88-88.54% MAC sparsity and 1.11×-4.73× speed-up, with a maximum compression ratio of 8. The inverse log-map achieves 34.30-94.87% MAC sparsity, 2.57×-8.90× speed-up, and a maximum compression ratio of 16 with 6% accuracy loss, while the log-map achieves 4.25-34.30% MAC sparsity, 0.98×-1.12× speed-up, and a maximum compression ratio of 4 with a 16% decline in accuracy. To study quantization for CPR, the Po2 and DFP quantization approaches are applied. The results show that DFP quantization stays closer to the floating-point ROC, while Po2 achieves a greater precision reduction. Based on the results, it is unnecessary to perform the retraining procedure with DFP quantization; on the other hand, retraining with Po2 shows a larger performance improvement than DFP quantization. It can therefore be concluded that Po2 quantization is the preferable choice for CPR. A hardware implementation of the Po2-quantized spatial-domain CPR is recommended for future work: multiplication by a Po2-quantized weight can be performed with a shift operation, which makes it more hardware-friendly, although it requires optimized hardware. We also consider recovering the accuracy loss of combined geometric pre-processing, Po2 quantization, and retraining as future work.
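As a sketch of the Po2 idea discussed above, the snippet below rounds weights to signed powers of two and replaces multiplication by shifting; the exponent range and helper names are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def po2_quantize(w, exp_min=-8, exp_max=0):
    """Quantize weights to signed powers of two: w -> sign(w) * 2**e,
    with the exponent e rounded and clamped to an assumed range."""
    w = np.asarray(w, dtype=np.float64)
    sign = np.sign(w)
    e = np.clip(np.round(np.log2(np.abs(w) + 1e-12)), exp_min, exp_max)
    q = sign * np.exp2(e)
    q[w == 0] = 0.0  # preserve exact zeros, keeping the filter sparse
    return q

def shift_multiply(x_int, exponent):
    """Multiply an integer activation by 2**exponent using shifts only,
    which is what makes Po2-quantized filters hardware-friendly."""
    return x_int << exponent if exponent >= 0 else x_int >> -exponent

print(po2_quantize([0.3, -0.6, 0.0]))  # rounds to 0.25, -0.5, 0.0
print(shift_multiply(40, -2))          # 10
```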
Funding: This research received no external funding. The APC was partially funded by National University of Sciences and Technology, Islamabad.

Conflicts of Interest: The authors declare no conflicts of interest.

Appendix A. Relationship between Mean Square Error and Variance of Sample and Its Quantized Version
Let Yᵢ and Ŷᵢ denote the original image and the estimated (quantized) image, respectively, and let N denote the total number of pixels in the original image. The mean square error between them is

MSE = (1/N) Σᵢ (Yᵢ − Ŷᵢ)².   (A1)

Let Ȳᵢ = Yᵢ − Ŷᵢ denote the difference between the original image and its corresponding quantized version, and let μY = (1/N) Σᵢ Ȳᵢ be the average of these differences. From the standard variance identity,

σȲ² = (1/N) Σᵢ (Ȳᵢ − μY)² = (1/N) Σᵢ Ȳᵢ² − μY²,   (A2)

so the mean square error decomposes as

MSE = σȲ² + μY².   (A3)

Expanding the variance of the difference gives

σȲ² = σY² + σŶ² − 2σYŶ,   (A4)

where σY² and σŶ² are the variances of the original image and its corresponding quantized version, respectively, and σYŶ is their covariance. Combining (A3) and (A4) yields

MSE = σY² + σŶ² − 2σYŶ + μY²,   (A5)

which relates the mean square error to the variances of the sample and its quantized version.
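The appendix's relationship between the mean square error and the variance of the differences can be checked numerically; the sketch below uses NumPy's population variance (ddof = 0) and an arbitrary toy quantizer as assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.random(1000)         # pixels of an "original image"
Yq = np.round(Y * 16) / 16   # a crude illustrative quantizer

D = Y - Yq                   # per-pixel difference
mse = np.mean(D**2)

# Identity from the appendix: the MSE equals the variance of the
# differences plus the squared mean of the differences.
lhs = mse
rhs = np.var(D) + np.mean(D)**2
print(np.isclose(lhs, rhs))  # True
```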