1. Introduction
Image compression is a critical field in computer science and digital communication, playing a vital role in reducing the storage and transmission requirements of digital images without significantly compromising their quality [1]. With the proliferation of multimedia applications, the demand for efficient compression techniques has increased exponentially. From social media platforms and cloud storage services to real-time video streaming [2] and medical imaging systems [3], the ability to store and transmit high-resolution images efficiently is paramount. Traditional compression methods have served well over the years; however, as technology evolves, so does the need for innovative techniques that address modern challenges, such as scalability, adaptability to different image types, and the ability to preserve essential details in various contexts [4].
Clustering techniques have emerged as a promising solution in the realm of image compression [5]. These methods leverage the inherent structure and patterns within an image to group similar pixel blocks into clusters. Each cluster is represented by a centroid, significantly reducing the amount of data needed to reconstruct the image [6]. Unlike traditional approaches that rely on predefined transformations or quantization schemes, clustering-based methods are data-driven and adaptive, making them well-suited for diverse image types and resolutions [7]. By analyzing and exploiting the spatial and spectral redundancies within images, clustering techniques can achieve high compression ratios while maintaining acceptable visual fidelity [8].
This paper explores the application of nine clustering techniques to image compression: K-Means, Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH), Divisive Clustering, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Ordering Points to Identify the Clustering Structure (OPTICS), Mean Shift, Gaussian Mixture Model (GMM), Bayesian Gaussian Mixture Model (BGMM), and Clustering In Quest (CLIQUE). Each method offers unique advantages and challenges, depending on the characteristics of the image and the specific parameters used. One of the critical research questions is the importance of parameter tuning in achieving optimal results. For example, the choice of block size significantly impacts the performance of each technique [9]. Smaller block sizes often yield higher structural similarity index (SSIM) values, preserving fine details in the image, but they also result in lower compression ratios due to the increased granularity [10]. Conversely, larger block sizes improve compression efficiency at the expense of detail retention, leading to pixelation and reduced visual fidelity. Similarly, clustering-specific parameters, such as the grid size in CLIQUE or the epsilon value in DBSCAN, play a pivotal role in determining the quality and efficiency of compression [11]. This underscores the need for a tailored approach when applying clustering techniques to image compression, taking into account the specific requirements of the application [12].
Several studies have demonstrated the effectiveness of K-Means in image compression. For example, Afshan et al. [13] compared K-Means with other clustering-based approaches and found it to be computationally efficient, particularly for images with distinct color regions. Similarly, a study by Baid and Talbar [14] highlighted that K-Means-based image compression significantly reduces file sizes while maintaining good image quality, as measured by metrics like PSNR and SSIM.
Divisive clustering has been explored in image compression research for its ability to adaptively determine cluster structures. Fradkin et al. [15] found that Divisive Clustering outperformed K-Means in preserving fine-grained details in images with complex textures. Additionally, it has been used to improve vector quantization techniques by dynamically determining cluster sizes.
DBSCAN has been applied in image compression, particularly for segmenting images with irregular regions. A study by Jaseela and Ayaj [16] showed that DBSCAN-based compression achieved better edge preservation compared to K-Means, particularly for medical and satellite images. Since DBSCAN identifies noise pixels separately, it is also useful for removing redundant details in noisy images.
GMM-based clustering is widely used in image compression due to its ability to model complex data distributions. According to Fu et al. [17], GMMs outperform K-Means in handling overlapping clusters and representing natural images with smooth transitions. Moreover, GMM-based compression has been successfully used in medical imaging, where subtle variations in intensity need to be preserved.
This paper provides a comprehensive exploration of clustering techniques for image compression, analyzing their performance in balancing image quality and compression efficiency. The major contributions of this paper are as follows:
It includes an in-depth review of nine clustering techniques: K-Means, BIRCH, Divisive Clustering, DBSCAN, OPTICS, Mean Shift, GMM, BGMM, and CLIQUE.
A comparative analysis highlights the key characteristics, strengths, and limitations of each technique. This includes insights into parameter sensitivity, handling of clusters with varying densities, overfitting tendencies, computational complexity, and best application scenarios.
The paper outlines a universal framework for image compression using clustering methods, including preprocessing, compression, and decompression phases.
Each clustering technique was implemented to achieve image compression by segmenting image blocks into clusters and reconstructing them using cluster centroids.
The implementations are adapted for diverse clustering methods, ensuring consistency in the preprocessing, compression, and decompression phases.
The paper provides detailed analysis and interpretation for each clustering technique, addressing trade-offs between compression and quality.
Rigorous experiments were conducted using benchmark images from CID22 to validate compression efficiency and image quality for all clustering techniques.
The results are synthesized into a clear discussion, ranking the techniques based on their effectiveness in achieving a balance between CR (compression ratio) and SSIM.
Custom visualizations demonstrate the impact of varying block sizes and parameters for each technique, offering intuitive insights into their performance characteristics.
The remainder of this paper is organized as follows. Section 2 presents the vector quantization approach and its use in image compression. Section 3 provides a detailed review of various clustering techniques, categorized into partitioning, hierarchical, density-based, distribution-based, and grid-based methods, along with a comparative analysis of their strengths and weaknesses. Section 4 presents the implementation and experimental evaluation of these clustering techniques, focusing on their performance across diverse datasets and configurations. Section 5 explains the compression and decompression processes facilitated by clustering methods, highlighting the role of clustering in achieving efficient image compression. Section 6 discusses the quality assessment metrics used to evaluate compression performance. Section 7 delves into the performance analysis of clustering techniques for image compression, providing insights into their efficiency and effectiveness. Section 8 validates the results using the CID22 dataset, demonstrating the practical applicability of the methods in real-world scenarios. Finally, Section 9 concludes the paper by summarizing the findings and proposing directions for future work.
4. Implementation and Experimental Evaluation of Clustering Techniques
This section explores the application and testing of various clustering techniques to evaluate their performance in partitioning datasets into distinct clusters. The analysis is performed independent of compression objectives, focusing on how effectively each method segments data and adapts to varying configurations. By examining how these techniques respond to changes in parameter values, the study reveals their strengths and limitations in adapting to data patterns.
Figure 1 demonstrates the performance of the K-Means clustering algorithm across varying numbers of clusters, ranging from 1 to 9. The legend assigns distinct colors to each cluster (Cluster 1 through Cluster 10) and includes additional markers for centroids (red stars), noisy data points (black dots), and unclustered points (solid black markers).
In the first panel, with a single cluster, all data points are grouped together, resulting in a simplistic representation with limited separation between different regions. As the number of clusters increases to 2 and 3, a more distinct segmentation emerges, reflecting K-Means’ ability to partition the data into groups that minimize intra-cluster variance.
At 4 and 5 clusters, the algorithm begins to capture finer structures in the dataset, effectively separating points based on their proximity and density. This segmentation reflects the algorithm’s ability to balance between over-simplification and over-segmentation. As the number of clusters increases further to 6, 7, and beyond, the algorithm divides the data into smaller, more granular groups. This results in more localized clusters, potentially overfitting if the dataset does not naturally support such fine granularity.
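The cluster-count sweep shown in Figure 1 can be expressed compactly in code. The following is a minimal sketch, assuming a Python/scikit-learn implementation; the array X is a randomly generated placeholder for the 2-D points visualized in the figure, so the exact groupings will differ.

```python
import numpy as np
from sklearn.cluster import KMeans

# Placeholder 2-D dataset; the actual points behind Figure 1 are not reproduced here.
X = np.random.default_rng(0).random((500, 2))

for k in range(1, 10):
    # k-means++ initialization is scikit-learn's default seeding strategy
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(f"k={k}: inertia={km.inertia_:.3f}, centroids shape={km.cluster_centers_.shape}")
```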
Figure 2 illustrates the performance of the BIRCH algorithm as the number of clusters increases from 1 to 9. With a single cluster, all data points are aggregated into one group, offering no segmentation and overlooking the underlying structure of the data. As the number of clusters increases to 2 and 3, the algorithm begins to create more meaningful separations, delineating regions of the data based on density and distribution.
With 4 and 5 clusters, the segmentation becomes more refined, capturing the natural groupings within the dataset. BIRCH effectively identifies cohesive regions, even in the presence of outliers, as indicated by isolated points in the scatterplot. The hierarchical nature of BIRCH is evident as it progressively organizes the data into clusters, maintaining balance and reducing computational complexity.
At higher cluster counts, such as 7, 8, and 9, the algorithm demonstrates its capacity to detect smaller, more localized clusters. However, this can lead to over-segmentation, where naturally cohesive groups are divided into sub-clusters. The presence of outliers remains well-managed, with some points clearly designated as noise. Overall, BIRCH shows its strength in clustering data hierarchically, balancing efficiency and accuracy, especially for datasets with varying densities and outliers.
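A comparable sweep for BIRCH is sketched below, again as an illustrative scikit-learn example rather than the exact experimental code; the threshold and branching factor are assumed values, not necessarily the settings behind Figure 2.

```python
import numpy as np
from sklearn.cluster import Birch

X = np.random.default_rng(0).random((500, 2))  # placeholder data

for k in range(1, 10):
    # threshold and branching_factor govern the CF-tree granularity; values here are illustrative
    birch = Birch(n_clusters=k, threshold=0.1, branching_factor=50).fit(X)
    print(f"k={k}: clusters found = {len(set(birch.labels_))}")
```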
Figure 3 demonstrates the progression of the Divisive Clustering algorithm as it separates the data into an increasing number of clusters, from 1 to 9. Initially, with a single cluster, all data points are grouped together, ignoring any inherent structure in the data. This provides no meaningful segmentation and highlights the starting point of the divisive hierarchical approach.
As the number of clusters increases to 2 and 3, the algorithm begins to partition the data into distinct groups based on its inherent structure. These initial divisions effectively segment the data into broad regions, capturing the overall distribution while maintaining cohesion within the clusters.
With 4 to 6 clusters, the algorithm refines these groupings further, identifying smaller clusters within the larger ones. This refinement captures finer details in the dataset’s structure, ensuring that densely populated areas are segmented appropriately. At this stage, Divisive Clustering demonstrates its ability to split clusters hierarchically, providing meaningful separations while maintaining a logical hierarchy.
At higher cluster counts, such as 7 to 9, the algorithm continues to divide existing clusters into smaller subgroups. This leads to a granular segmentation, effectively capturing subtle variations within the data. However, as the number of clusters increases, there is a risk of over-segmentation, where cohesive clusters are fragmented into smaller groups. Despite this, the algorithm handles outliers effectively, ensuring that isolated points are not erroneously grouped with larger clusters. Overall, Divisive Clustering effectively balances granularity and cohesion, making it well-suited for hierarchical data exploration.
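Divisive Clustering has no single canonical library implementation; a common approximation is bisecting K-Means, which repeatedly splits the cluster with the largest within-cluster error. The sketch below follows that assumption and may differ from the exact divisive procedure used for Figure 3.

```python
import numpy as np
from sklearn.cluster import KMeans

def divisive_clustering(X, n_clusters, random_state=0):
    """Bisecting K-Means: repeatedly split the cluster with the largest SSE."""
    labels = np.zeros(len(X), dtype=int)
    while len(np.unique(labels)) < n_clusters:
        # pick the cluster with the largest within-cluster sum of squares
        sse = {c: ((X[labels == c] - X[labels == c].mean(axis=0)) ** 2).sum()
               for c in np.unique(labels)}
        worst = max(sse, key=sse.get)
        idx = np.where(labels == worst)[0]
        if len(idx) < 2:
            break  # a singleton cluster cannot be split further
        halves = KMeans(n_clusters=2, n_init=10, random_state=random_state).fit(X[idx])
        labels[idx[halves.labels_ == 1]] = labels.max() + 1
    return labels

X = np.random.default_rng(0).random((500, 2))  # placeholder data
print(len(np.unique(divisive_clustering(X, 5))), "clusters")
```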
Figure 4 illustrates the performance of the DBSCAN algorithm under varying eps and min_samples parameter configurations. DBSCAN’s ability to detect clusters of varying densities and its handling of noise is evident in the results.
With a small eps of 0.05 and min_samples set to 1, the algorithm identifies a large number of clusters (214), as the tight neighborhood criterion captures even minor density variations. This leads to over-segmentation and a significant amount of noise classified as individual clusters, reducing the interpretability of the results. Increasing min_samples to 3 under the same eps reduces the number of clusters (24) by merging smaller groups, though many data points remain unclustered. At min_samples 9, no clusters are identified, as the eps is too restrictive to form valid clusters.
When eps is increased to 0.1, the algorithm becomes less restrictive, capturing larger neighborhoods. For min_samples 1, the number of clusters decreases to 126, reflecting better grouping of data points. At min_samples 3, the results improve further, with fewer clusters (20) and more cohesive groupings. However, at min_samples 9, only 4 clusters are detected, with many points treated as noise.
With the largest eps value of 0.3, the algorithm identifies very few clusters, as the larger neighborhood radius groups most points into a few clusters. At min_samples 1, only 17 clusters are found, indicating over-generalization. For min_samples 3, the clusters reduce to 3, with most noise eliminated. Finally, at min_samples 9, only 2 clusters remain, demonstrating high consolidation but potentially missing finer details.
In summary, DBSCAN’s clustering performance is highly sensitive to eps and min_samples. Smaller eps values capture local density variations, leading to over-segmentation, while larger values risk oversimplifying the data. Higher min_samples values improve robustness by eliminating noise but can under-cluster sparse regions. The results highlight DBSCAN’s flexibility but emphasize the importance of parameter tuning for optimal performance.
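The parameter grid behind Figure 4 can be reproduced with a short loop. The sketch below assumes scikit-learn's DBSCAN and placeholder data scaled to the unit square, so the reported cluster counts will not match the figure exactly.

```python
import numpy as np
from sklearn.cluster import DBSCAN

X = np.random.default_rng(0).random((500, 2))  # placeholder data in [0, 1]^2

for eps in (0.05, 0.1, 0.3):
    for min_samples in (1, 3, 9):
        labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
        n_clusters = len(set(labels)) - (1 if -1 in labels else 0)  # label -1 marks noise
        n_noise = int(np.sum(labels == -1))
        print(f"eps={eps}, min_samples={min_samples}: {n_clusters} clusters, {n_noise} noise points")
```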
Figure 5 demonstrates the performance of the OPTICS algorithm under varying min_samples and eps parameters. OPTICS, known for its ability to detect clusters of varying densities and hierarchical structures, shows its versatility and sensitivity to parameter adjustments.
For min_samples set to 5 and eps varying from 0.01 to 0.03, the algorithm identifies a relatively large number of clusters. At eps = 0.01, 19 clusters are detected, capturing fine density variations. As eps increases to 0.02 and 0.03, the number of clusters decreases slightly to 18 and 16, respectively. This reflects OPTICS’ tendency to merge smaller clusters as the threshold for cluster merging becomes more lenient. Despite this reduction, the algorithm still captures intricate cluster structures and maintains a high level of detail.
When min_samples increases to 10, the number of clusters decreases significantly. At eps = 0.01, only 5 clusters are found, reflecting stricter density requirements for forming clusters. As eps increases to 0.02 and 0.03, the cluster count further decreases to 5 and 4, respectively, with some finer details being lost. This highlights the impact of min_samples in reducing noise sensitivity but at the cost of losing smaller clusters.
For min_samples set to 20, the clustering results are highly simplified. Across all eps values, only 2 clusters are consistently detected, indicating a significant loss of detail and overgeneralization. While this reduces noise and improves cluster compactness, it risks oversimplifying the dataset and merging distinct clusters.
Overall, the results show that OPTICS performs well with low min_samples and small eps values, capturing fine-grained density variations and producing detailed cluster structures. However, as these parameters increase, the algorithm shifts towards merging clusters and simplifying the structure, which may lead to a loss of critical information in datasets with complex density variations. These findings emphasize the importance of careful parameter tuning to balance detail retention and noise reduction.
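An analogous sweep for OPTICS is sketched below. Scikit-learn's OPTICS offers two extraction modes; the sketch assumes cluster_method="dbscan" so that an explicit eps can be varied as in Figure 5, which may not be the extraction mode used in the original experiments.

```python
import numpy as np
from sklearn.cluster import OPTICS

X = np.random.default_rng(0).random((500, 2))  # placeholder data

for min_samples in (5, 10, 20):
    for eps in (0.01, 0.02, 0.03):
        # extract DBSCAN-style clusters from the reachability ordering at this eps
        labels = OPTICS(min_samples=min_samples, cluster_method="dbscan", eps=eps).fit_predict(X)
        n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
        print(f"min_samples={min_samples}, eps={eps}: {n_clusters} clusters")
```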
Figure 6 illustrates the performance of the Mean Shift clustering algorithm applied to the dataset, with varying bandwidth values from 0.1 to 5. Mean Shift is a non-parametric clustering method that groups data points based on the density of data in a feature space. The bandwidth parameter, which defines the kernel size used to estimate density, plays a critical role in determining the number and quality of clusters.
At a small bandwidth of 0.1, the algorithm detects 143 clusters, indicating a high sensitivity to local density variations. This results in many small clusters, capturing fine details in the dataset. However, such granularity may lead to over-segmentation, with clusters potentially representing noise rather than meaningful groupings. As the bandwidth increases to 0.2 and 0.3, the number of clusters decreases to 60 and 37, respectively. The algorithm begins merging smaller clusters, creating a more structured and meaningful segmentation while still retaining some level of detail.
With a bandwidth of 0.5, the cluster count drops sharply to 13, showing a significant reduction in granularity. The clusters become larger and less detailed, which may improve computational efficiency but risks oversimplifying the dataset. As the bandwidth continues to increase to 1 and beyond (e.g., 2, 3, 4, and 5), the number of clusters reduces drastically to 2 or even 1. At these high bandwidths, the algorithm generalizes heavily, resulting in overly simplistic cluster structures. This can lead to the loss of critical information and may render the clustering ineffective for datasets requiring fine-grained analysis.
In summary, the Mean Shift algorithm’s clustering performance is highly dependent on the bandwidth parameter. While smaller bandwidths allow for detailed and fine-grained clustering, they may result in over-segmentation and sensitivity to noise. Larger bandwidths improve generalization and computational efficiency but at the cost of significant loss of detail and potential oversimplification. Optimal bandwidth selection is essential to balance the trade-off between capturing meaningful clusters and avoiding over-generalization.
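The bandwidth sweep in Figure 6 corresponds to the following minimal scikit-learn sketch, using placeholder data and illustrative settings.

```python
import numpy as np
from sklearn.cluster import MeanShift

X = np.random.default_rng(0).random((500, 2))  # placeholder data

for bandwidth in (0.1, 0.2, 0.3, 0.5, 1.0, 2.0):
    # bin_seeding accelerates seeding; it may change cluster counts slightly
    ms = MeanShift(bandwidth=bandwidth, bin_seeding=True).fit(X)
    print(f"bandwidth={bandwidth}: {len(ms.cluster_centers_)} clusters")
```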
Figure 7 demonstrates the performance of the GMM clustering algorithm, evaluated across different numbers of components, ranging from 1 to 9. The GMM is a probabilistic model that assumes data are generated from a mixture of several Gaussian distributions, making it flexible for capturing complex cluster shapes. For the purpose of compression and vector quantization, each data point must be assigned to a single cluster, whether in the GMM or BGMM. To accomplish this, we adopt a hard assignment approach, in which each point is assigned to the component with the highest posterior probability, i.e., the cluster that is most likely to have generated it under the fitted model.
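If implemented with scikit-learn, this hard assignment corresponds to GaussianMixture.predict, which returns the component with the highest posterior responsibility for each point. A minimal sketch with placeholder data:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

X = np.random.default_rng(0).random((500, 2))  # placeholder data

gmm = GaussianMixture(n_components=5, random_state=0).fit(X)

# Hard assignment: predict() returns the argmax of the posterior probabilities.
hard_labels = gmm.predict(X)
assert (hard_labels == gmm.predict_proba(X).argmax(axis=1)).all()

# The component means act as the cluster centroids for vector quantization.
centroids = gmm.means_
print(centroids.shape, np.bincount(hard_labels))
```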
With a single component, the GMM produces a single, undifferentiated cluster, resulting in poor segmentation. All data points are grouped together, reflecting the model’s inability to distinguish underlying structures in the dataset. As the number of components increases to 2 and 3, the algorithm begins to form meaningful clusters, capturing distinct groupings in the data. However, overlapping clusters are still evident, indicating limited separation.
At 4 components, the clustering becomes more refined, and distinct patterns start emerging. The data points are grouped more accurately into cohesive clusters, demonstrating the ability of the GMM to model underlying structures. As the number of components increases to 5, 6, and 7, the algorithm continues to improve in capturing finer details and separating overlapping clusters. This results in a more accurate representation of the dataset, as observed in the clearer segmentation of the clusters.
By 8 and 9 components, the clustering is highly granular, with minimal overlap between clusters. However, the increased number of components may lead to overfitting, where the algorithm begins to model noise as separate clusters. This trade-off highlights the importance of carefully selecting the number of components to balance accuracy and generalizability.
In summary, the GMM effectively models the dataset’s underlying structure, with improved clustering performance as the number of components increases. However, excessive components can lead to overfitting, underscoring the need for optimal parameter selection. This makes the GMM a versatile and robust choice for applications requiring probabilistic clustering.
Figure 8 illustrates the clustering performance of the BGMM clustering algorithm across a range of component counts from 2 to 10. The BGMM, unlike GMM, incorporates Bayesian priors to determine the optimal number of components, providing a probabilistic framework for clustering. This makes the BGMM robust to overfitting, as it naturally balances the trade-off between model complexity and data representation.
At the lower component counts (2 to 3 components), the BGMM effectively identifies broad clusters in the dataset. For 2 components, the algorithm forms two distinct clusters, offering a coarse segmentation of the data. Increasing the components to 3 enhances granularity, with the addition of a third cluster capturing finer details within the dataset.
As the number of components increases to 4, 5, and 6, the BGMM achieves progressively finer segmentation, forming clusters that better align with the underlying data structure. Each additional component introduces greater specificity in capturing subgroups within the data, reflected in well-defined clusters. The transitions between clusters are smooth, indicating the algorithm’s ability to probabilistically assign points to clusters, even in overlapping regions.
From 7 to 9 components, the BGMM continues to refine the clustering process, but the benefits of additional components start to diminish. At 9 and 10 components, the model begins to overfit, with some clusters capturing noise or forming redundant groups. Despite this, the clusters remain relatively stable, showcasing the BGMM’s ability to avoid drastic overfitting compared to the GMM.
In conclusion, the BGMM demonstrates robust clustering performance across varying component counts. While the algorithm effectively captures complex data structures, it benefits from the Bayesian prior that discourages excessive components. This makes the BGMM particularly suited for scenarios where a balance between precision and generalizability is crucial. Figure 8 emphasizes the importance of selecting an appropriate number of components to maximize clustering efficiency and accuracy.
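A corresponding sketch for the BGMM illustrates how the Bayesian prior can switch off redundant components: n_components acts only as an upper bound, and components whose mixture weights shrink toward zero are effectively pruned. The prior type and strength below are illustrative assumptions, not the settings used for Figure 8.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

X = np.random.default_rng(0).random((500, 2))  # placeholder data

bgmm = BayesianGaussianMixture(
    n_components=10,                                   # upper bound on components
    weight_concentration_prior_type="dirichlet_process",
    weight_concentration_prior=1e-2,                   # small prior discourages extra components
    random_state=0,
).fit(X)

labels = bgmm.predict(X)                               # hard assignment, as with the GMM
effective = int(np.sum(bgmm.weights_ > 1e-2))          # components with non-negligible weight
print(f"effective components: {effective} of {bgmm.n_components}")
```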
Figure 9 demonstrates the clustering results of the CLIQUE algorithm under varying grid sizes, showcasing its performance and adaptability. At a grid size of 1, no clusters are detected, which indicates that the granularity is too coarse to capture meaningful groupings in the dataset. As the grid size increases to 2 and 3, the algorithm begins to detect clusters, albeit sparingly, with 1 and 4 clusters identified, respectively. This improvement in cluster detection shows that a finer grid allows the algorithm to better partition the space and identify denser regions.
From grid size 4 to 6, there is a steady increase in the number of clusters found, reaching up to 8 clusters. The results reveal that moderate grid sizes provide a balance between capturing meaningful clusters and avoiding excessive noise or fragmentation in the data. Notably, the identified clusters at these grid sizes appear well-separated and align with the data’s inherent structure.
For larger grid sizes, such as 7 to 9, the number of clusters continues to grow, with up to 16 clusters detected. However, these finer grids risk over-partitioning the data, potentially splitting natural clusters into smaller subgroups. While the increase in clusters reflects a more detailed segmentation of the data, it might not always represent the most meaningful groupings, especially in practical applications.
Overall, the CLIQUE algorithm demonstrates its ability to adapt to different grid sizes, with the grid size playing a critical role in balancing cluster resolution and interpretability. Lower grid sizes result in the under-detection of clusters, while excessively high grid sizes may lead to over-fragmentation. Moderate grid sizes, such as 4 to 6, seem to strike the optimal balance, capturing the data’s underlying structure without overcomplicating the clustering.
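CLIQUE is not available in scikit-learn; the sketch below is a simplified, full-space (2-D) grid-based variant that bins points into a grid, keeps cells above a density threshold, and merges adjacent dense cells into clusters. CLIQUE's subspace search is omitted and the density threshold is an assumption, so the cluster counts will differ from Figure 9.

```python
import numpy as np
from scipy.ndimage import label

def clique_like(X, grid_size, density_threshold=3):
    """Simplified grid-based clustering: dense cells merged by adjacency."""
    mins, maxs = X.min(axis=0), X.max(axis=0)
    cells = np.floor((X - mins) / (maxs - mins + 1e-12) * grid_size).astype(int)
    cells = np.clip(cells, 0, grid_size - 1)            # grid-cell index per dimension

    counts = np.zeros((grid_size,) * X.shape[1], dtype=int)
    np.add.at(counts, tuple(cells.T), 1)                # points per cell
    dense = counts >= density_threshold

    cell_labels, n_clusters = label(dense)              # connected dense cells = clusters
    point_labels = cell_labels[tuple(cells.T)] - 1      # -1 marks points in sparse cells
    return point_labels, n_clusters

X = np.random.default_rng(0).random((500, 2))           # placeholder data
for g in range(1, 10):
    _, n = clique_like(X, grid_size=g)
    print(f"grid size {g}: {n} clusters")
```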
7. Comprehensive Performance Analysis of Clustering Techniques in Image Compression
This section presents the performance evaluation of various clustering-based image compression techniques. Each method was applied to compress images using the proposed framework, followed by quantitative and qualitative assessments of the reconstructed images. Metrics such as CR, BPP, and SSIM were calculated to gauge the trade-offs between compression efficiency and image quality. Block sizes of 2 × 2, 4 × 4, 8 × 8, and 16 × 16 were applied across different clustering techniques to evaluate their impact on compression performance. However, some methods, namely BIRCH, DBSCAN, OPTICS, and CLIQUE, were evaluated only with the smaller block sizes (2 × 2 and 4 × 4) because of the significant memory they require, which exceeded the resources available on our hardware. The results highlight the strengths and limitations of each method, offering insights into their suitability for different compression scenarios. The findings are summarized and analyzed in the subsequent subsections.
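To make the evaluation pipeline concrete, the following is a minimal sketch of the block-based vector quantization framework with K-Means as the clusterer, assuming a Python implementation with scikit-learn and scikit-image. The CR/BPP accounting shown (an 8-bit codebook plus one index per block) is one common convention and may differ in detail from the definitions given in Section 6.

```python
import numpy as np
from sklearn.cluster import KMeans
from skimage.metrics import structural_similarity as ssim

def compress_reconstruct(img, block=4, n_clusters=64):
    """Block-based VQ: cluster block x block tiles and rebuild each tile
    from its cluster centroid (img is assumed to be an 8-bit RGB array)."""
    h, w, c = img.shape
    h, w = h - h % block, w - w % block              # crop to a multiple of the block size
    tiles = (img[:h, :w]
             .reshape(h // block, block, w // block, block, c)
             .swapaxes(1, 2)
             .reshape(-1, block * block * c))
    km = KMeans(n_clusters=n_clusters, n_init=4, random_state=0).fit(tiles)
    rec = (km.cluster_centers_[km.labels_]
           .reshape(h // block, w // block, block, block, c)
           .swapaxes(1, 2)
           .reshape(h, w, c))

    # Illustrative size model: 8-bit codebook entries plus one index per block.
    index_bits = len(tiles) * np.ceil(np.log2(n_clusters))
    codebook_bits = n_clusters * block * block * c * 8
    compressed_bits = index_bits + codebook_bits
    cr = (h * w * c * 8) / compressed_bits
    bpp = compressed_bits / (h * w)
    quality = ssim(img[:h, :w], rec.astype(np.uint8), channel_axis=2, data_range=255)
    return rec.astype(np.uint8), cr, bpp, quality

# Usage (hypothetical file name): img = np.asarray(PIL.Image.open("peppers.png"))
# rec, cr, bpp, q = compress_reconstruct(img, block=4, n_clusters=64)
```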
The initial experimental results for image compression presented in this section utilized the widely recognized “Peppers” image (Figure 12) obtained from the Waterloo dataset [59], a benchmark resource for image processing research.
7.1. K-Means Clustering for Compression
Figure 13 illustrates the results of image compression using the K-Means clustering algorithm with different block sizes and numbers of clusters. We have implemented K-Means++ as an optimized strategy for centroid initialization to enhance clustering efficiency and improve convergence stability. Starting with the smallest block size of 2 × 2 pixels, the images exhibit high SSIM values, close to 1, indicating a strong similarity to the original image. This high SSIM is expected because the small block size allows for finer granularity in capturing image details. However, as the number of clusters increases from 31 to 255, there is a noticeable trade-off between compression efficiency and representational cost. When the number of clusters is low, the CR is relatively high (14.14) and the BPP is low (1.70), indicating efficient compression. As the number of clusters increases, the CR decreases significantly (to 6.13), while the BPP increases to 3.91. This indicates that image quality is maintained at the cost of compression efficiency, as more clusters mean more distinct pixel groups, which reduces the compression ratio.
When the block size is increased to 4 × 4 pixels, the reconstructed images still maintain high SSIM values, though slightly lower than with the 2 × 2 block size. This decrease in SSIM is due to the larger block size capturing less fine detail, making the reconstruction less accurate. The compression ratio improves significantly, reaching as high as 48.09 when using 31 clusters. However, as with the 2 × 2 block size, increasing the number of clusters leads to a reduction in the compression ratio (down to 18.30 with 255 clusters) and an increase in BPP, indicating that more data are needed to preserve the image quality. The images with 4 × 4 blocks and a higher number of clusters show a good balance between compression and quality, making this configuration potentially optimal for certain applications where moderate image quality is acceptable with better compression efficiency.
The 8 × 8 block size introduces more noticeable artifacts in the reconstructed images, particularly as the number of clusters increases. Although the compression ratio remains high, the SSIM values start to drop, especially as we move to higher cluster counts. The BPP values are also lower compared to smaller block sizes, indicating higher compression efficiency. However, this comes at the expense of image quality, as larger blocks are less effective at capturing the fine details of the image, leading to a more pixelated and less accurate reconstruction. The trade-off is evident here: while the compression is more efficient, the image quality suffers, making this configuration less desirable for applications requiring high visual fidelity.
Finally, the largest block size of 16 × 16 pixels shows a significant degradation in image quality. The SSIM values decrease noticeably, reflecting the loss of detail and the introduction of more visible artifacts. The compression ratios are very high, with a maximum CR of 48.25, but the images appear much more pixelated and less recognizable compared to those with smaller block sizes. This indicates that while large block sizes are highly efficient for compression, they are not suitable for scenarios where image quality is a priority. The BPP values also vary significantly, with lower cluster counts resulting in very low BPP, but as the number of clusters increases, the BPP rises, indicating that the image quality improvements come at the cost of less efficient compression.
In summary, Figure 13 demonstrates that the K-Means clustering algorithm’s effectiveness for image compression is highly dependent on the choice of block size and the number of clusters. Smaller block sizes with a moderate number of clusters offer a good balance between image quality and compression efficiency, making them suitable for applications where both are important. Larger block sizes, while more efficient in terms of compression, significantly degrade image quality and are less suitable for applications requiring high visual fidelity. The results highlight the need to carefully select these parameters based on the specific requirements of the application, whether it prioritizes compression efficiency, image quality, or a balance of both.
7.2. BIRCH Clustering for Compression
Figure 14 displays a series of reconstructed images using the BIRCH clustering algorithm applied to image compression. Starting with the block size of 2 × 2, the images exhibit varying levels of quality depending on the threshold and branching factor used. For a block size of 2 × 2, with a threshold of 0.1 and a branching factor of 10, the SSIM values range from about 0.219 to 0.241, indicating low structural similarity to the original image. The CR is low, around 1.75 to 1.85, and the BPP is high, ranging from 12.94 to 13.74, reflecting poor compression efficiency despite the large amount of data retained. However, as the threshold increases to 0.5, while keeping the branching factor constant at 10, the SSIM decreases significantly, with values dropping to as low as 0.163 to 0.184, indicating a further deterioration in image quality. Despite this, the CR slightly improves, suggesting that more aggressive compression is taking place at the expense of image quality.
When increasing the branching factor to 50, the SSIM values decrease even further, especially for the higher threshold of 0.5, where the SSIM drops to 0.036 and 0.126. This indicates that the reconstructed images lose significant detail and structure, becoming almost unrecognizable. The BPP remains high, suggesting that although a large amount of data are being retained, they are not contributing positively to the image quality. The CR does not show significant improvement, which suggests that the BIRCH algorithm with these parameters might not be efficiently clustering the blocks in a way that balances compression with quality.
Moving to a block size of 4 × 4, the images generally show a deterioration in SSIM compared to the smaller block size, with SSIM values dropping to below 0.1 in several cases, particularly when the threshold is set to 0.5. The BPP shows slight improvements in some cases, but the SSIM remains relatively low, indicating that even though more bits are used per pixel, the quality does not improve and, in some cases, worsens significantly. For example, with a threshold of 0.5 and a branching factor of 50, the SSIM is 0.219, which is slightly better than other configurations with the same block size but still low.
In summary, the BIRCH algorithm appears to struggle with balancing compression and image quality in this context, especially with larger block sizes and higher thresholds. The SSIM values suggest that as the threshold and branching factor increase, the algorithm fails to maintain structural similarity, leading to poor-quality reconstructions. The CR and BPP metrics indicate that while compression is occurring, it is not efficiently capturing the important details needed to reconstruct the image well. This suggests that BIRCH may not be the optimal clustering method for this type of image compression, particularly with larger block sizes and more aggressive parameter settings.
7.3. Divisive Clustering for Compression
Figure 15 presents the results of applying the Divisive Clustering method to compress and reconstruct an image. At a block size of 2 × 2, the method preserves high structural similarity (SSIM close to 1) across different cluster sizes. However, increasing the number of clusters from 31 to 255 leads to a decrease in compression ratio (CR) from 14.22 to 6.58, while the bits per pixel (BPP) increases, indicating a trade-off between compression and quality.
For the 4 × 4 block size, the CR improves significantly, reaching 49.64 with 31 clusters, demonstrating the method’s efficiency in compressing images at larger block sizes. However, increasing the number of clusters to 255 reduces the CR to 18.30, while SSIM remains high, showing that Divisive Clustering can maintain good image quality at this level. The method continues to perform well for 8 × 8 blocks, achieving a peak CR of 93.01, but SSIM slightly drops as clusters increase. At a 16 × 16 block size, the method reaches its highest CR of 48.66 with 31 clusters, but image quality degrades slightly with lower SSIM. However, increasing the number of clusters to 255 partially restores image quality while sacrificing compression efficiency.
Overall, Divisive Clustering and K-Means exhibit similar trends in compression performance and image quality, demonstrating their shared characteristics as partitioning-based clustering techniques.
7.4. DBSCAN and OPTICS Clustering for Compression
Figure 16 and Figure 17 represent the results of image compression using two clustering techniques: DBSCAN and OPTICS. Both methods are designed to identify clusters of varying densities and can handle noise effectively, which makes them particularly suitable for applications where the underlying data distribution is not uniform. However, the results demonstrate distinct differences in how each method processes the image blocks, especially under varying parameter settings. It is important to note that blocks that do not belong to any identified cluster are classified as noise and are assigned unique centroids. This approach accounts for the significantly higher number of clusters observed in certain cases, as each noisy block is treated as an independent entity.
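The noise-handling rule just described, in which every unclustered block receives its own centroid, could be implemented as in the sketch below (scikit-learn DBSCAN, with block vectors assumed to be scaled to [0, 1]); this is an illustrative reconstruction of the mechanism, not the exact code used for Figures 16 and 17.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def dbscan_codebook(tiles, eps=0.1, min_samples=1):
    """Cluster block vectors with DBSCAN; each noise block (label -1)
    becomes its own single-member cluster with a unique centroid."""
    raw = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(tiles)
    centroids, labels = [], np.empty(len(tiles), dtype=int)
    for cluster_id in sorted(set(raw) - {-1}):
        labels[raw == cluster_id] = len(centroids)
        centroids.append(tiles[raw == cluster_id].mean(axis=0))
    for idx in np.where(raw == -1)[0]:
        labels[idx] = len(centroids)          # one unique centroid per noise block
        centroids.append(tiles[idx])
    return np.asarray(centroids), labels
```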
DBSCAN’s performance across different block sizes and parameter configurations shows a stark contrast in image quality and compression metrics. At smaller block sizes (2 × 2), DBSCAN tends to find a very high number of clusters when the eps parameter is low, such as 0.1, and min_samples is set to 1. This results in an extremely high cluster count (e.g., 46,142 clusters found), but this comes at the cost of poor CR and BPP, as seen in Figure 16. The SSIM value remains high, indicating a good structural similarity, but the practical usability of such a high cluster count is questionable, as it results in high computational overhead and potentially overfitting the model to noise.
As the eps value increases (e.g., from 0.1 to 0.3) and min_samples rises, the number of clusters decreases significantly, which is accompanied by a drop in SSIM and an increase in CR and BPP. For instance, when eps is 0.3 and min_samples is 4, DBSCAN produces far fewer clusters, leading to much more compressed images but with significantly degraded quality, as evidenced by the low SSIM values. At larger block sizes (e.g., 4 × 4), DBSCAN’s performance diminishes drastically, with the number of clusters dropping to nearly zero in some configurations. This results in almost no useful information being retained in the image, reflected in the SSIM dropping to zero, indicating a total loss of image quality.
OPTICS, which is similar to DBSCAN but provides a more nuanced approach to identifying clusters of varying densities, shows a different pattern in image processing. Like DBSCAN, the effectiveness of OPTICS is highly dependent on its parameters (eps and min_samples). Figure 17 shows that, regardless of the block size, OPTICS identifies a very small number of clusters (often just 1), especially when eps is set to 0.3 and min_samples is varied. This leads to extremely high CR and very low BPP, but at the cost of significant image distortion and loss, as seen in the brownish, almost entirely abstract images produced.
One notable observation is that OPTICS tends to retain minimal useful image information even when identifying a single cluster, leading to highly compressed images with very high CR but nearly zero SSIM. This suggests that OPTICS, under these settings, compresses the image to the point of obliterating its original structure, making it less suitable for tasks where preserving image quality is essential.
When comparing DBSCAN and OPTICS, it becomes clear that while both methods aim to find clusters in data, their behavior under similar parameter settings leads to vastly different results. DBSCAN’s flexibility in finding a large number of small clusters can either be an advantage or a hindrance depending on the parameter configuration, whereas OPTICS, in this particular case, consistently produces fewer clusters with more significant compression but at the cost of image quality. For instance, both techniques perform poorly with larger block sizes, but DBSCAN’s sensitivity to eps and min_samples allows for more granular control over the number of clusters and the resulting image quality. On the other hand, OPTICS, while theoretically offering advantages in handling varying densities, does not seem to leverage these advantages effectively in this context, leading to overly aggressive compression. The images produced by DBSCAN with lower eps values and small min_samples show that it can maintain a relatively high SSIM while achieving reasonable compression, although this comes with a high computational cost due to the large number of clusters. In contrast, OPTICS, even with different settings, fails to preserve the image structure, resulting in images that are visually unrecognizable.
7.5. Mean Shift Clustering for Compression
Figure 18 presents the results of image compression using the Mean Shift clustering algorithm across different block sizes and bandwidth parameters. At the smallest block size (2 × 2), the clustering process identifies an extremely high number of clusters, with values exceeding 46,000 clusters for bandwidth values of 0.1 and 0.3. Despite the large number of clusters, the compression ratio (CR) is exceptionally low, around 1.36 to 2.12, which suggests that the method is ineffective at reducing image storage size. However, the SSIM remains at 1.000, indicating an almost perfect reconstruction of the original image. The bits per pixel (BPP) remains high, indicating that the compression is not optimized for efficiency.
As the block size increases to 4 × 4, 8 × 8, and 16 × 16, a similar trend is observed where the number of clusters remains high, with minimal improvement in CR. The CR slightly increases from 1.48 (4 × 4) to 1.62 (16 × 16), but these values remain significantly lower compared to other clustering methods like K-Means or Divisive Clustering. This suggests that Mean Shift fails to provide effective compression because it tends to over-cluster image regions, leading to a high number of unique clusters that prevent substantial data reduction.
In conclusion, while Mean Shift preserves image quality perfectly (SSIM = 1.000), its compression efficiency is extremely poor. The high cluster count results in minimal data reduction, making it impractical for image compression. Unlike other methods that balance quality and compression, Mean Shift fails to achieve a desirable trade-off. Thus, it is not a viable option for practical image compression applications.
7.6. GMM and BGMM Clustering for Compression
Figure 19 and Figure 20 present the results of compressing an image using the GMM and BGMM clustering methods. For a block size of 2 × 2, both the GMM and BGMM achieve high compression ratios at lower cluster counts, with the GMM reaching a CR of 12.63 and BGMM at 12.07 with 31 clusters. As the cluster count increases to 255, the CR drops while SSIM improves, showing a trade-off between compression efficiency and quality. The highest SSIM values for both methods remain above 0.98, indicating strong image fidelity.
With a 4 × 4 block size, compression ratios improve significantly, especially with lower cluster counts. Both methods reach CR values above 30 when using 31 clusters, while higher cluster counts lead to better SSIM at the cost of reduced compression efficiency. At 127 clusters, the GMM achieves a CR of 23.04 and an SSIM of 0.939, while the BGMM records 26.62 and 0.906, respectively.
At an 8 × 8 block size, compression ratios further increase, with the GMM reaching 86.50 and the BGMM 89.99 for 31 clusters. However, the SSIM remains relatively low in this setting. As clusters increase, SSIM improves, peaking at 0.987 for the GMM and 0.921 for BGMM at 255 clusters. This highlights that more clusters are necessary for maintaining image quality when using larger blocks.
For a 16 × 16 block size, maximum compression efficiency is achieved, with CR values exceeding 40 for low cluster counts. However, SSIM is initially lower due to larger data blocks capturing less detail. With 31 clusters, the GMM records a CR of 48.20 and SSIM of 0.952, while the BGMM achieves 49.16 and 0.950. At 255 clusters, both methods reach SSIM values above 0.96, providing high-quality reconstruction but at lower compression efficiency.
Overall, both the GMM and BGMM perform nearly identically, with the BGMM slightly favoring compression efficiency at lower cluster counts. The results confirm that distribution-based methods can maintain strong image fidelity while achieving significant compression, making them suitable for balancing storage efficiency and visual quality.
7.7. CLIQUE Clustering for Compression
Figure 21 presents the results of the CLIQUE clustering method applied for image compression with varying grid sizes, ranging from 1 to 8.5. The block size used in all experiments is fixed at 2 × 2, and the grid size controls the number of clusters and consequently affects compression performance and quality. Since CLIQUE is a grid-based clustering method, unclustered blocks are reassigned to the nearest dense grid cell, ensuring continuity in the compressed representation while minimizing data loss.
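That reassignment step could look like the following sketch, which maps each unclustered block to the nearest retained centroid (used here as the representative of a dense grid cell); this is an assumption about the mechanics rather than the exact implementation behind Figure 21.

```python
import numpy as np

def reassign_unclustered(tiles, labels, centroids):
    """Map every block left unclustered (label -1) to the nearest centroid,
    so that all blocks can be reconstructed from the codebook."""
    for idx in np.where(labels == -1)[0]:
        labels[idx] = int(np.argmin(np.linalg.norm(centroids - tiles[idx], axis=1)))
    return labels
```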
For grid sizes 1.5 and 2.0, the algorithm identifies only 1 and 16 clusters, respectively. These settings result in poor visual quality, as evidenced by the low SSIM values (0.014 and 0.864, respectively) and the inability to retain structural details in the image. The CR values are exceptionally high (4572.59 and 59.10), but this comes at the cost of extreme data loss, as depicted by the highly distorted or gray images.
Increasing the grid size from 2.5 to 4.5 results in more clusters (81 to 256), which improves the image’s visual quality. The SSIM steadily increases, reaching 0.956 at a grid size of 4.5, indicating a good retention of structural similarity compared to the original image. Correspondingly, the CR values drop significantly (32.84 to 22.15), reflecting a more balanced trade-off between compression and quality. The images become visually more acceptable as the grid size increases, with better preservation of object edges and colors.
At grid sizes 5.0 to 6.0, the number of clusters increases drastically (625 and 1296), resulting in improved image quality. The SSIM values rise further (0.972 to 0.981), indicating near-perfect structural similarity. The BPP also increases moderately (1.38 to 1.65), demonstrating a slight trade-off in compression efficiency. The images exhibit finer details and color fidelity is well-preserved, making these grid sizes suitable for high-quality compression scenarios.
As the grid size increases from 6.5 to 8.5, the number of clusters grows exponentially (2401 to 4096). The SSIM approaches near-perfection (0.986 to 0.989), and the BPP increases significantly (1.92 to 2.04). These results reflect excellent image reconstruction quality with minimal perceptual differences from the original image. However, the CR values continue to decrease (12.53 to 11.74), highlighting the trade-off between compression efficiency and quality. These grid sizes are ideal for applications requiring minimal quality loss, even at the expense of reduced compression efficiency.
The performance of the CLIQUE method is highly dependent on the grid size, with a direct relationship between grid size and the number of clusters. Lower grid sizes result in fewer clusters, leading to higher compression ratios but significantly compromised image quality, as evidenced by low SSIM values and poor visual results. Medium grid sizes, such as 4.5 to 6.0, strike a balance between compression efficiency and image quality, maintaining good structural integrity while offering reasonable compression ratios. On the other hand, higher grid sizes (e.g., 6.5 to 8.5) generate more clusters, yielding near-perfect SSIM values and visually indistinguishable images from the original but at the cost of reduced compression efficiency.
7.8. Computational Efficiency Analysis
In this evaluation, all image compression techniques were applied using a fixed block size of 4 × 4 pixels. This uniform partitioning ensures consistency across different methods, allowing for a fair comparison of their computational efficiency. The experiments were conducted on a system with the following specifications:
Processor: 11th Gen Intel® Core™ i7-11800H @ 2.30 GHz;
Installed RAM: 32.0 GB;
System Type: 64-bit operating system, x64-based processor;
Operating System: Windows 11 Pro (Dev Version).
The results in Table 4 indicate significant variations in compression time among the techniques. CLIQUE exhibited the fastest performance, with the lowest processing time, making it the most efficient method in this study. K-Means followed with a relatively low computation time, reinforcing its suitability for real-time applications.
Conversely, methods such as Mean Shift, GMM, and BGMM demonstrated considerably higher computational costs, likely due to their iterative nature and probabilistic modeling. OPTICS also showed a substantial runtime, reflecting its density-based approach’s complexity. Meanwhile, BIRCH, Divisive Clustering, and DBSCAN presented moderate execution times, balancing efficiency and effectiveness.
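The per-method timings in Table 4 could be gathered with a simple wall-clock harness such as the sketch below; the methods, parameters, and data sizes shown are illustrative placeholders, not the exact benchmark script used for the table.

```python
import time
import numpy as np
from sklearn.cluster import KMeans, Birch, DBSCAN

def fit_time(method, tiles):
    """Wall-clock time to fit a clustering method on a set of block vectors."""
    start = time.perf_counter()
    method.fit(tiles)
    return time.perf_counter() - start

tiles = np.random.default_rng(0).random((4096, 48))    # e.g. 4 x 4 RGB blocks scaled to [0, 1]
for name, method in [("K-Means", KMeans(n_clusters=64, n_init=4, random_state=0)),
                     ("BIRCH", Birch(n_clusters=64)),
                     ("DBSCAN", DBSCAN(eps=0.1, min_samples=3))]:
    print(f"{name}: {fit_time(method, tiles):.2f} s")
```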
7.9. Discussion
The evaluation of the nine clustering techniques highlights the diverse strengths and limitations of each method in the context of image compression. Each technique offers unique trade-offs between compression efficiency, image quality, and computational complexity, making them suitable for different applications depending on the specific requirements.
K-Means stands out as a robust and versatile method for image compression, demonstrating a good balance between compression efficiency and image quality across various block sizes and cluster configurations. Its ability to produce consistent results with high SSIM values and moderate compression ratios makes it a strong candidate for applications requiring both visual fidelity and reasonable storage savings. However, its performance diminishes slightly with larger block sizes, where fine-grained image details are lost, leading to visible artifacts.
BIRCH, on the other hand, struggles to balance compression and quality, particularly with larger block sizes and higher thresholds. Its tendency to lose structural similarity at higher parameter settings indicates its limitations in preserving critical image features. While BIRCH may excel in other data clustering contexts, its application in image compression appears less effective compared to other techniques.
Divisive Clustering showcases excellent adaptability, particularly with smaller block sizes, where it maintains high SSIM values and reasonable compression ratios. As the block size increases, it achieves impressive compression efficiency with only a slight compromise in image quality. Its hierarchical nature enables it to provide granular control over the clustering process, making it well-suited for scenarios requiring a balance between compression and quality.
Density-based methods like DBSCAN and OPTICS highlight the challenges of applying these techniques to image compression. DBSCAN’s performance varies significantly with its parameters (eps and min_samples), often producing high SSIM values at the cost of computational overhead and impractical cluster counts. OPTICS, while theoretically advantageous for handling varying densities, shows limited effectiveness in this application, often leading to excessive compression at the expense of image structure. Both methods illustrate the importance of parameter tuning and the potential challenges of adapting density-based clustering for image compression.
Mean Shift emerges as a stable technique, particularly for smaller block sizes and lower bandwidth settings. Its non-parametric nature allows it to adapt well to the data, resulting in high SSIM values and low compression ratios. However, as block sizes increase, the sensitivity of Mean Shift to its bandwidth parameter diminishes, leading to less pronounced variations in results. This stability makes it an attractive option for applications where consistency across different settings is desirable.
The GMM and BGMM provide complementary perspectives on probabilistic clustering for image compression. The GMM demonstrates more pronounced changes in performance across block sizes and cluster counts, offering high compression ratios but at the cost of noticeable quality degradation for larger block sizes. In contrast, the BGMM delivers more consistent image quality with slightly higher BPP, making it a reliable choice for scenarios prioritizing visual fidelity over extreme compression efficiency.
Finally, CLIQUE, a grid-based clustering method, demonstrates the importance of balancing grid size with block size to achieve optimal results. While smaller grid sizes lead to significant compression, they often produce highly distorted images. Medium grid sizes strike a balance, maintaining reasonable compression ratios and good image quality, whereas larger grid sizes yield near-perfect SSIM values at the expense of reduced compression efficiency. CLIQUE’s grid-based approach offers a unique perspective, emphasizing the interplay between spatial granularity and compression performance.
In summary, the comparative analysis of these techniques underscores the necessity of selecting a clustering method tailored to the specific requirements of the application. Techniques like K-Means, Divisive Clustering, and BGMM excel in maintaining a balance between compression efficiency and image quality, making them suitable for general-purpose applications. Methods such as CLIQUE and Mean Shift provide specialized advantages, particularly when specific parameter configurations are carefully tuned. On the other hand, techniques like DBSCAN and OPTICS highlight the challenges of adapting density-based clustering to this domain, while BIRCH’s limitations in this context emphasize the importance of evaluating clustering methods in their intended use cases.
8. Validation of Compression Results Using CID22 Benchmark Dataset
The CID22 dataset [60] is a diverse collection of high-quality images specifically designed for evaluating image compression and other computer vision algorithms. It offers a wide range of visual content, including dynamic action scenes, intricate textures, vibrant colors, and varying levels of detail, making it an ideal choice for robust validation.
For this study, eight representative images were selected, as shown in Figure 22, covering diverse categories such as sports, mechanical objects, food, landscapes, macro photography, artwork, and medical images. This selection ensures comprehensive testing across different types of visual data, capturing various challenges like high-frequency details, smooth gradients, and complex patterns. The dataset’s diversity allows for a thorough assessment of the clustering-based compression techniques, providing insights into their performance across real-world scenarios.
Figure 23a–h and Table 5 show the results of compressing the benchmark images using the nine clustering techniques, in addition to PNG and JPEG compressors. Table 5 highlights the consistently high SSIM values observed at higher compression ratios, demonstrating the effectiveness of the compression methods in preserving image quality while achieving significant data reduction.
K-Means consistently delivers a balanced performance across all image categories. For instance, in sports and vehicles, it achieves high CR values (27.95 and 27.05, respectively) while maintaining excellent SSIM values (0.996 and 0.997). This indicates its effectiveness in preserving structural details while achieving reasonable compression. However, for more intricate scenes, such as macro photography, the CR increases to 30.58, suggesting its adaptability for detailed data. Overall, K-Means achieves a good balance between compression efficiency and image quality, making it versatile for a variety of image types.
BIRCH exhibits low performance in both CR and SSIM across all image types. For example, in food photography and macro photography, it achieves SSIM values of −0.004 and −0.032, respectively, with CR values of 1.67 and 3.39. These results indicate significant quality loss and inefficiency in compression. The method struggles to adapt to the complexities of natural scenes or high-detail photography. BIRCH’s weak performance suggests it may not be suitable for image compression tasks where quality retention is critical.
Divisive Clustering achieves high CR and SSIM values across most categories, particularly in medical and vehicles, with CR values of 33.39 and 28.34 and SSIM values of 0.991 and 0.994, respectively. These results show that the method preserves image quality effectively while achieving efficient compression. For macro photography, it performs similarly well, achieving an SSIM of 0.996. Divisive Clustering emerges as one of the top-performing techniques, maintaining a balance between efficiency and visual quality.
DBSCAN’s performance is highly dependent on parameter settings and image content. It achieves perfect SSIM values (1.0) for several categories, such as food photography and vehicles, but at the cost of extremely low CR values (e.g., 1.43 for artwork). This indicates over-segmentation, leading to inefficiencies in practical compression. In outdoor scenes and artwork, the method shows reduced CR values but still retains high SSIM, demonstrating its adaptability for specific types of data. However, its tendency to overfit or underperform depending on parameter tuning makes it less reliable overall.
OPTICS performs poorly in terms of image quality, despite achieving very high compression. For most categories, such as sports and food photography, it attains very high CR values (159.73 and 154.21) but with significantly degraded SSIM values (0.414 and 0.530). The images reconstructed using OPTICS often exhibit severe distortions and fail to retain meaningful structural details. The method’s performance suggests it is not well-suited for image compression tasks where preserving visual quality is important.
Mean Shift shows significant limitations in terms of compression efficiency. Despite achieving perfect SSIM values (1.0) across several categories (e.g., sports, food photography, and vehicles), its CR values are consistently low (e.g., 1.50 to 2.91). This indicates poor compression efficiency, making Mean Shift unsuitable for practical image compression tasks where achieving a high CR is essential. While it preserves image quality well, its limited efficiency renders it a less favorable choice for real-world applications.
The GMM achieves a strong balance between compression and quality, particularly in vehicles and sports, with CR values of 24.44 and 18.21 and SSIM values of 0.978 and 0.981, respectively. However, it struggles slightly with macro photography, where SSIM drops to 0.933. While the GMM performs well overall, its performance is slightly less consistent compared to Divisive Clustering or K-Means. Nonetheless, it remains a strong option for applications requiring good compression and quality balance.
The BGMM exhibits stable performance across all categories, retaining higher CR values than GMM in most cases. For instance, in vehicles and macro photography, the BGMM achieves CR values of 28.88 and 29.27, respectively. The SSIM values are also competitive, with a maximum of 0.968 in vehicles.
CLIQUE emerges as a good-performing method, combining high compression efficiency with excellent quality retention. In sports and vehicles, it achieves CR values of 14.03 and 14.40 and SSIM values of 0.992. In macro photography, it maintains a strong balance, achieving an SSIM of 0.963 while maintaining a reasonable CR of 29.29. CLIQUE adapts well to a wide range of image complexities and demonstrates consistent performance, making it a strong competitor to K-Means and Divisive Clustering.
PNG consistently achieves low CR across all image categories, reinforcing its lossless nature. The CR values for PNG range from 1.38 for outdoor scenes to 4.80 for medical images, confirming that PNG prioritizes image quality preservation over compression efficiency.
JPEG provides significantly better compression than PNG, with a CR of 38.40 for macro photography, making it the most effective method in that category. It also performs well for sports action and outdoor scenes, both at 24.00, and for food photography at 21.33. However, JPEG does not always surpass clustering-based techniques, particularly for medical images, where it has a CR of 24.00 compared to CLIQUE’s 36.30.
In summary, K-Means, Divisive Clustering, and CLIQUE stand out as the most reliable clustering methods for image compression, offering consistent performance across diverse image types. These methods effectively balance compression efficiency (high CR) and image quality (high SSIM), making them suitable for a wide range of applications. GMM and BGMM also provide good results but may require careful parameter tuning to achieve optimal performance. Mean Shift, despite its ability to retain image quality, is limited by poor compression efficiency, making it unsuitable for most compression scenarios. BIRCH, DBSCAN, and OPTICS exhibit significant limitations in either quality retention or compression efficiency, rendering them less favorable for practical applications.