The Impact of State-of-the-Art Techniques for Lossless Still Image Compression

A great deal of information is produced daily due to advances in telecommunication, and storing it on digital devices or transmitting it over the Internet is challenging. Data compression is essential to managing this information well. Research on data compression has therefore become a topic of great interest, and the number of applications in this area is increasing. Over the last few decades, international organisations have developed many strategies for data compression, and no single algorithm works well on all types of data. The compression ratio, as well as the encoding and decoding times, are mainly used to evaluate an algorithm for lossless image compression. Although the compression ratio is more significant for some applications, others may require higher encoding or decoding speeds, or both; alternatively, all three parameters may be equally important. The main aim of this article is to analyse the most advanced lossless image compression algorithms from each of these points of view, and to evaluate the strength of each algorithm for each kind of image. We also develop a technique for evaluating an image compression algorithm based on more than one parameter. The findings presented in this paper may be helpful to new researchers and users in this area.


Introduction
A huge amount of data is now produced daily, especially in medical centres and on social media. These increasing quantities of information are not easy to manage: they are impractical to store, and they take a huge amount of time to transmit over the Internet. More than 2.5 quintillion bytes of data are produced daily, and this figure is growing, according to the sixth edition of a report by DOMO [1]. The report further estimates that approximately 90% of the world's data were produced between 2018 and 2019, and that each person on earth would create 1.7 MB of data per second by 2020. Consequently, storing large amounts of data on digital devices and quickly transferring them across networks is a significant challenge. There are three possible solutions to this problem: better hardware, better software, or a combination of both. However, so much information is being created that it is almost impossible to design new hardware that can keep pace, due to the many limitations on the construction of hardware, as reported in [2]. Therefore, the development of better software is the only practical solution to the problem.
One solution from the software perspective is compression. Data compression is a way of representing data using fewer bits than the original in order to reduce the consumption of storage and bandwidth and increase transmission speed over networks [3]. Data compression can be applied in many areas, such as audio, video, text, and images, and it can be classified into two categories: lossless and lossy [4]. In lossy compression, irrelevant and less significant data are removed permanently, whereas, in lossless compression, every detail is preserved and only statistical redundancy is eliminated. In short, lossy compression allows for slight degradation in the data, while lossless methods perfectly reconstruct the data from its compressed form [5][6][7][8]. There are many applications for lossless data compression techniques, such as medical imagery, digital radiography, scientific imaging, zip file compression, museums/art galleries, facsimile transmissions of bitonal images, business documents, machine vision, the storage and sending of thermal images taken by nano-satellites, observation of forest fires, etc. [9][10][11][12][13]. In this article, we study the compression standards that are used for lossless image compression. There are many such methods, including run-length coding, Shannon-Fano coding, Lempel-Ziv-Welch (LZW) coding, Huffman coding, arithmetic coding, lossless JPEG, PNG, JPEG 2000, JPEG-LS, JPEG XR, CALIC, AVIF, WebP, FLIF, etc. However, we limit our analysis to lossless JPEG, PNG, JPEG 2000, JPEG-LS, JPEG XR, CALIC, AVIF, WebP, and FLIF, since these are the latest and most fully developed methods in this area. Although there are also many types of image, such as binary images, 8-bit and 16-bit grayscale images, 8-bit indexed images, and 8-bit and 16-bit red, green, and blue (RGB) images, we only cover 8-bit and 16-bit grayscale and RGB images in this article.
A detailed review of run-length, Shannon-Fano, LZW, Huffman, and arithmetic coding was carried out in [3]. Four parameters are used to evaluate a lossless image compression algorithm: the compression ratio (CR), bits per pixel (bpp), encoding time (ET), and decoding time (DT). Since the bpp is simply the inverse of the CR, we only consider the CR, ET, and DT when evaluating these methods. Most studies in this research area use only the CR to evaluate the effectiveness of an algorithm [14][15][16][17]. Although there are many applications for which higher CRs are important, others may require higher encoding or decoding speeds; still others may require a high CR and low ET, low ET and DT, or a high CR and fast decoding. In addition, there are many applications where all three of these parameters are equally important. In view of this, we present an extensive analysis from each perspective, and then examine the strength of each algorithm for each kind of image. We compare the performance of these methods on public open datasets. More specifically, the main aim of this research is to address the following research issues:
RI1: How good is each algorithm in terms of the CR for 8-bit and 16-bit greyscale and RGB images?
RI2: How good is each algorithm in terms of the ET for 8-bit and 16-bit greyscale and RGB images?
RI3: How good is each algorithm in terms of the DT for 8-bit and 16-bit greyscale and RGB images?
RI4: How good is each algorithm in terms of the CR and ET for 8-bit and 16-bit greyscale and RGB images?
RI5: How good is each algorithm in terms of the CR and DT for 8-bit and 16-bit greyscale and RGB images?
RI6: How good is each algorithm in terms of the ET and DT for 8-bit and 16-bit greyscale and RGB images?
RI7: How good is each algorithm when all parameters are equally important for 8-bit and 16-bit greyscale and RGB images?
RI8: Which algorithms should be used for each kind of image?
The remainder of this article is structured, as follows. In Section 2, we give a brief introduction to the four types of image. In Section 3, we briefly discuss why compression is important, how images are compressed, the kind of changes that occur in data during compression, and the types of data that are targeted during compression. The measurement parameters that are used to evaluate a lossless image compression algorithm are discussed in Section 4. In Section 5, we give a brief introduction to the advanced lossless image compression algorithms. Based on the usual parameters that were used to evaluate a lossless image compression technique and our developed technique, a detailed analysis of the experimental outcomes obtained using these algorithms is provided in Section 6. We conclude the paper in Section 7.

Background
A visual representation of an object is called an image, and a digital image can be defined as a two-dimensional matrix of discrete values. When the colour at each position in an image is represented as a single tone, this is referred to as a continuous tone image.
The quantised values of a continuous tone image at discrete locations are called the grey levels or the intensity [18], and the pixel brightness of a digital image is indicated by its corresponding grey level. Figure 1 shows the steps that are used to transform a continuous tone image to its digital form. The bit depth indicates the number of bits used to represent a pixel, where a higher bit depth represents more colours, thus increasing the file size of an image [19]. A greyscale image is a matrix of A × B pixels, and 8-bit and 16-bit greyscale images contain 2^8 = 256 and 2^16 = 65,536 different colours, respectively, where the ranges of colour values are 0-255 and 0-65,535. Figures 2 and 3 show examples of 8-bit and 16-bit greyscale images, respectively.
A particular way of representing colours is called the colour space, and a colour image is a linear combination of these colours. There are many colour spaces, but the most popular are RGB, hue, saturation, value (HSV), and cyan, magenta, yellow, and key (black) (CMYK). RGB contains the three primary colours of red, green, and blue, and it is used by computer monitors. HSV and CMYK are often used by artists and in the printing industry, respectively [18]. A colour image carries three colours per pixel; for example, because an RGB image uses red, green, and blue, each pixel of an 8-bit RGB image has a precision of 24 bits, and the image can represent 2^24 = 16,777,216 colours. For an uncompressed image (X), the memory that is required to store the image is calculated using Equation (1), where the dimensions of the image are A × B, the bit depth is N, and the required storage is S:
S = A × B × N × 2^-13 KB (1)
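As a quick check of Equation (1), the storage requirement can be computed directly (a minimal sketch; the 2^-13 factor simply converts bits to kilobytes, since 8 × 1024 = 2^13):

```python
def storage_kb(a, b, n):
    """Storage S in KB for an uncompressed A x B image with bit depth N.

    S = A * B * N / (8 * 1024) = A * B * N * 2**-13  (Equation (1)).
    """
    return a * b * n / 2**13

# A 1024 x 1024 8-bit greyscale image needs 1 MB:
print(storage_kb(1024, 1024, 8))   # 1024.0 KB
# The same image as 24-bit RGB needs three times as much:
print(storage_kb(1024, 1024, 24))  # 3072.0 KB
```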

Data Compression
Data compression is a significant issue and a subject of intense research in the field of multimedia processing. We give a real example below to allow for a better understanding of the importance of data compression. Nowadays, digital cinema and high-definition television (HDTV) use a 4K system, with approximately 4096 × 2160 pixels per frame [20]. However, the newly developed Super Hi-Vision (SHV) format uses 7680 × 4320 pixels per frame, with a frame rate of 60 frames per second [21]. Suppose that we have a three-hour colour video file that is based on SHV video technology, where each pixel has a precision of 48 bits. The size of the video file will then be (7680 × 4320 × 48 × 60 × 3 × 60 × 60) bits, or approximately 120,135.498 GB. According to the report in [22], 500 GB to 1 TB is appropriate for storing movies for non-professional users. Seagate, which is an American data storage company, has published quarterly statistics since 2015 on the average capacity of Seagate hard disk drives (HDDs) worldwide; in the third quarter of 2020, this capacity was 4.1 TB [23]. In other words, we could not even store a single three-hour colour SHV video file on a typical local computer. Compression is another important issue for data transmission over the Internet. Although there are many forms of media that can be used for transmission, fibre optic cables have the highest transmission speed [24], and can transfer up to 10 Gbps [25]. If this video file were transferred at the highest speed available over fibre optic media without compression, approximately 26.697 h would be required, without considering latency. Latency is the amount of time that is required to transfer data from the original source to the destination [26]. In view of the above problems, current technology alone is entirely inadequate, and data compression is the only effective solution.
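The arithmetic above can be reproduced as follows (a sketch; the transfer-time figure assumes, as the original calculation appears to, that 1 Gbps is counted as 2^30 bits per second and 1 GB as 2^30 bytes):

```python
# Size of a three-hour SHV video: 7680 x 4320 pixels, 48 bits/pixel, 60 fps.
bits = 7680 * 4320 * 48 * 60 * 3 * 60 * 60
gb = bits / 8 / 1024**3          # bits -> bytes -> GB (binary)
print(round(gb, 3))              # ~120135.498 GB

# Transfer time over a 10 Gbps fibre link, ignoring latency:
hours = bits / (10 * 2**30) / 3600
print(round(hours, 3))           # ~26.697 h
```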
An image is a combination of information and redundant data, and how much information an image contains is one of the most important issues in image compression. If an image contains SL unique symbols, and P(k) is the probability of the kth symbol, the average information content (AIC) that an image may contain, also known as the entropy, is calculated using Equation (2). Image compression is achieved through a reduction in redundant data.
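Equation (2) is Shannon's entropy, H = -Σ P(k) log2 P(k), summed over the SL unique symbols. A minimal sketch of the computation:

```python
import math
from collections import Counter

def entropy(pixels):
    """Average information content in bits per pixel:
    H = -sum_k P(k) * log2(P(k))  (Equation (2))."""
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())

# Two equally likely symbols carry one bit each; four carry two bits:
print(entropy([0, 1] * 50))        # 1.0
print(entropy([0, 1, 2, 3] * 25))  # 2.0
```

The lower the entropy, the fewer bits per pixel a lossless coder needs in principle, which is why the transforms discussed later aim to produce narrow, low-entropy histograms.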
Suppose that two datasets A and B represent the same image or information. Equation (3) can then be used to define the relative data redundancy (R dr ) of set A, where the CR is calculated using Equation (4).
Three results can be deduced from Equations (3) and (4):

1. When B = A, CR → 1 and R dr → 0: dataset A contains no redundant data.

2. When B ≪ A, CR → infinite and R dr → 1: dataset A contains the highest redundancy, and the greatest compression is achieved.

3. When B ≫ A, CR → 0 and R dr → −infinite: dataset B requires more memory than the original (A).
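Equations (3) and (4) are not reproduced above; the standard definitions consistent with these three deductions are CR = n_A / n_B and R_dr = 1 − 1/CR, which can be checked numerically:

```python
def compression_ratio(n_a, n_b):
    """CR = size of original dataset A / size of compressed dataset B
    (the standard definition, consistent with Equation (4))."""
    return n_a / n_b

def relative_redundancy(cr):
    """R_dr = 1 - 1/CR (the standard definition, consistent with Equation (3))."""
    return 1 - 1 / cr

# Case 1: B = A, no redundancy.
print(relative_redundancy(compression_ratio(800, 800)))  # 0.0
# Case 2: B << A, high redundancy, strong compression.
print(relative_redundancy(compression_ratio(800, 8)))    # 0.99
# Case 3: B >> A, the "compressed" set is larger than the original.
print(relative_redundancy(compression_ratio(8, 800)))    # -99.0
```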
In digital image compression, there are three types of data redundancy: coding, interpixel, and psycho-visual redundancy [27,28]. Suppose that we have the image that is shown in Figure 6 with the corresponding grey levels.
The 10 × 10 image shown in Figure 6 contains nine different values (S) (118, 119, 120, 139, 140, 141, 169, 170, 171). For a fixed code length, each value can be coded as an 8-bit code-word, since the maximum value (171) requires a minimum of 8 bits to code; consequently, 800 bits are required to store the image. In contrast, a variable code length is based on probability, where codes of shorter length are assigned to values with higher probability. The probability of the kth value is calculated using Equation (5), where N is the total number of values in an image. If L k represents the length of the code-word for the value S k , then the average code-word length can be calculated using Equation (6), where SL is the total number of different values. Table 1 shows the variable length coding for the image shown in Figure 6. Using the codes in Table 1, we obtain approximate values of CR = 2.581 and R dr = 0.613, and the compressed image takes 300 bits rather than 800 bits. These results show that the original image contains redundant code, and that the variable length coding has removed this redundancy [29,30].
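Equations (5) and (6) can be sketched as follows. The symbol counts and code lengths below are hypothetical (Table 1 itself is not reproduced here), but the lengths are realisable by a prefix code:

```python
def average_code_length(counts, lengths):
    """L_avg = sum_k L_k * P(S_k), with P(k) = n_k / N
    (Equations (5) and (6)).

    counts:  occurrences n_k of each symbol S_k in the image
    lengths: code-word length L_k assigned to S_k
    """
    n = sum(counts)
    return sum(l * c / n for l, c in zip(lengths, counts))

# Hypothetical 100-pixel image: frequent values get the short codes.
counts = [40, 30, 10, 10, 5, 5]
lengths = [1, 2, 4, 4, 5, 5]
l_avg = average_code_length(counts, lengths)
print(l_avg)      # 2.3 bits per pixel under the variable-length code
print(8 / l_avg)  # resulting CR versus 8-bit fixed-length codes (~3.48)
```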
Interpixel redundancy can be classified as spatial, spectral, and temporal redundancy. In spatial redundancy, there are correlations between neighbouring pixel values, whereas, in spectral redundancy, there are correlations between different spectral bands. In temporal redundancy, there are correlations between the adjacent frames of a video. Interpixel redundancy can be removed using run-length coding, the differences between adjacent pixels, predicting a pixel using various methods, thresholding, or various types of transformation techniques, such as discrete Fourier transform (DFT) [31].
To remove interpixel redundancy from the image presented in Figure 6 using run-length coding, we code the image as a sequence of (grey level, run length) pairs, beginning (120, 1). Twelve bits are required to code each pair: an 8-bit code word is used for the grey level, since the maximum grey level is 171, and a 4-bit code word is used for the run length, since the maximum run length is nine. The main purpose of using prediction or transformation techniques can be described as follows. To create a narrow histogram, a prediction or various other types of transformation techniques can be applied to produce a small value for the entropy. For example, we apply the very simple predictor given below to the image shown in Figure 6, where A represents the predicted pixels. Figure 7a shows the histogram of the original image, and Figure 7b shows the histogram obtained after applying the predictor shown in Equation (7). After applying the predictor, the histogram only contains five values, of which only two (0 and 1) contain 90% of the data, thus giving a better compression ratio. The interpixel redundancy of an image can be removed using one or more techniques in combination. For example, after applying the predictor to the original image presented in Figure 6, we can apply both run-length and Huffman coding, with the result that only 189 bits are required to store the image rather than 800 bits.
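A minimal run-length encoder along these lines might look as follows; the sample row is hypothetical apart from its first pair, (120, 1):

```python
def run_length_encode(pixels):
    """Encode a raster-scanned pixel sequence as (grey level, run length) pairs."""
    pairs = []
    for p in pixels:
        if pairs and pairs[-1][0] == p:
            pairs[-1][1] += 1          # extend the current run
        else:
            pairs.append([p, 1])       # start a new run
    return [tuple(x) for x in pairs]

row = [120, 119, 119, 118, 118, 118, 118, 118, 118, 118]
pairs = run_length_encode(row)
print(pairs)            # [(120, 1), (119, 2), (118, 7)]
# With an 8-bit code for the grey level and a 4-bit code for the run length,
# each pair costs 12 bits: 36 bits instead of 80 for this row.
print(len(pairs) * 12)  # 36
```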
The construction of an image compression technique is highly application-specific. Figure 8 shows a general block diagram of lossless image compression and decompression.
The mapping that is shown in Figure 8 is used to convert an image into a non-visual form to decrease the interpixel redundancy. Run-length coding, various transformation techniques, and prediction techniques are typically applied at this stage. At the symbol encoding stage, Huffman, arithmetic, and other coding methods are often used to reduce the coding redundancy. The image data are highly correlated, and the mapping process is a very important way of decorrelating the data and eliminating redundant data. A better mapping process can eliminate more redundant data and give better compression. The first and most important problem in image compression is to develop or choose an optimal mapping process, while the second is to choose an optimal entropy coding technique to reduce coding redundancy [32]. In channel encoding, Hamming coding is applied to increase noise immunity, whereas, in decoding, the inverse procedures are applied to provide a lossless decompressed image. Quantisation, which is an irreversible process, removes irrelevant information by reducing the number of grey levels, and it is applied between the mapping and symbol encoding stages in lossy image compression [33][34][35][36][37][38].

Measurement Standards
Measurement standards offer ways of determining the efficiency of an algorithm. Four measurement standards (the compression ratio, encoding time, decoding time, and bits per pixel) are used to evaluate a lossless image compression algorithm.
The CR is the ratio between the size of an uncompressed image and its compressed version. The entropy is often used to estimate the average code length of a pixel in an image; in practice, however, this estimate is overoptimistic due to statistical interdependencies among pixels. For example, Table 1 shows that the achieved CR is 2.581, whereas the estimated entropy for the same data is 3.02, giving an entropy-based compression ratio of 2.649. Equation (4) is therefore used to calculate the actual CR. The bpp [39] is the number of bits used to represent a pixel, i.e., the inverse of the CR, as shown in Equation (8). The ET and DT are the times that are required by an algorithm to encode and decode an image, respectively.
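For an N-bit image, the bpp follows directly from the CR as bpp = N / CR (the sense in which bpp is the "inverse" of the CR); using the Table 1 figures quoted above:

```python
def bits_per_pixel(bit_depth, cr):
    """bpp = N / CR: average coded bits per pixel (Equation (8))."""
    return bit_depth / cr

# Table 1 example: achieved CR = 2.581 on an 8-bit image.
print(round(bits_per_pixel(8, 2.581), 3))  # 3.1 bits per pixel
# The entropy estimate of 3.02 bpp implies an (optimistic) CR of:
print(round(8 / 3.02, 3))                  # 2.649
```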

State-of-the-Art Lossless Image Compression Techniques
The algorithms used for lossless image compression can be classified into four categories: entropy coding (Huffman and arithmetic coding), predictive coding (lossless JPEG, JPEG-LS, PNG, CALIC), transform coding (JPEG 2000, JPEG XR), and dictionary-based coding (LZW). All of the predictive and transform-based image compression techniques use an entropy coding strategy or Golomb coding as part of their compression procedure.

Lossless JPEG and JPEG-LS
The Joint Photographic Experts Group (JPEG) format is a DCT-based lossy image compression technique, whereas lossless JPEG is predictive. Lossless JPEG uses the 2D differential pulse code modulation (DPCM) scheme [40], and it predicts a value for the current pixel (P) based on up to three neighbouring pixels (A, B, D). Table 2 shows the causal template used for prediction. If two pixels (B, D) from Table 2 are used in this prediction, then the predicted value and the prediction error (PE) are calculated using Equations (9) and (10), respectively.
Consequently, the prediction errors remain close to zero, and very large positive or negative errors are uncommon; the error distribution is therefore approximately Gaussian. Finally, Huffman or arithmetic coding is used to code the prediction errors. Table 3 shows the predictors used in the lossless JPEG format, based on the three neighbouring pixels (A, B, D). In lossy image compression, three types of degradation typically occur and should be taken into account when designing a DPCM quantiser: granularity, slope overload, and edge-busyness [41]. DPCM is also highly sensitive to channel noise. Table 3. Predictor for Lossless Joint Photographic Experts Group (JPEG).
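Assuming Equation (9) is the integer average of the two causal neighbours B and D (neighbour naming follows the article's template), the prediction and its error (Equation (10)) can be sketched as:

```python
def prediction_error(p, b, d):
    """PE = P - P_hat, with P_hat = (B + D) // 2.

    A sketch of the two-neighbour DPCM prediction described in the text;
    p is the current pixel, b and d are two causal neighbours.
    """
    return p - (b + d) // 2

# In a smooth region the error stays near zero:
print(prediction_error(121, 120, 122))  # 0
# Across an edge the error grows, which is why a nonlinear predictor helps:
print(prediction_error(200, 120, 122))  # 79
```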

A real image usually has a nonlinear structure, whereas DPCM uses a linear predictor; this is why problems can occur, and it gave rise to the need for an effective nonlinear predictor. The median edge detector (MED) is one of the most widely used nonlinear predictors, and it is used by JPEG-LS to address these drawbacks.
JPEG-LS was designed based on LOCO-I (Low Complexity Lossless Compression for Images) [42,43], and a standard was eventually introduced in 1999 after a great deal of development [44][45][46][47]. JPEG-LS improves the context modelling and encoding stages by applying the same concept as lossless JPEG. Although the discovery of arithmetic codes [48,49] conceptually separates these stages, the separation becomes less clean under low-complexity coding constraints, due to the use of an arithmetic coder [50]. In context modelling, the number of parameters is an important issue, and it must be reduced to avoid context dilution; the number of parameters depends entirely on the number of contexts. A two-sided geometric distribution (TSGD) model is assumed for the prediction residuals to reduce the number of parameters. The selection of a TSGD model is significant in a low-complexity framework, since a good model needs only very simple coding. Merhav et al. [51] showed that adaptive symbols can be used in a scheme such as Golomb coding, rather than more complex arithmetic coding, since the structure of Golomb codes allows simple calculation without requiring the storage of code tables; hence, JPEG-LS uses Golomb codes at this stage. Lossless JPEG cannot provide an optimal CR, because it cannot fully decorrelate the data, as measured by the first-order entropy of its prediction residuals. In contrast, JPEG-LS achieves good decorrelation and provides better compression performance [42,43]. Figure 9 shows a general block diagram for JPEG-LS. The prediction or decorrelation process of JPEG-LS is completely different from that of lossless JPEG. The LOCO-I or MED predictor used by JPEG-LS [52] predicts a value (P) according to Equation (11), based on the template shown in Table 2.
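The MED predictor has a well-known three-case form; a sketch (neighbour naming here is ours: left, above, and upper-left; the article's template calls them A, B, D):

```python
def med_predict(a, b, d):
    """Median edge detector (MED) of LOCO-I / JPEG-LS.

    a = left neighbour, b = above neighbour, d = upper-left neighbour:
      min(a, b)  if d >= max(a, b)  (edge detected, take the lower side)
      max(a, b)  if d <= min(a, b)  (edge detected, take the upper side)
      a + b - d  otherwise          (planar prediction in smooth regions)
    """
    if d >= max(a, b):
        return min(a, b)
    if d <= min(a, b):
        return max(a, b)
    return a + b - d

print(med_predict(100, 150, 160))  # 100: edge case, picks min(a, b)
print(med_predict(100, 150, 90))   # 150: edge case, picks max(a, b)
print(med_predict(100, 150, 120))  # 130: smooth region, planar fit
```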

Portable Network Graphics
Portable Network Graphics (PNG) [53][54][55] is a lossless still image compression scheme and an improved, patent-free replacement for the Graphics Interchange Format (GIF). It is also a technique based on prediction and entropy coding. The deflate algorithm, a combination of LZ77 and Huffman coding, is used as the entropy coding technique. PNG uses five types of filters for prediction [56], as shown in Table 4 (based on Table 2).
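Of the five PNG filter types, the Paeth filter is the most elaborate; its predictor, as defined in the PNG specification, can be sketched as:

```python
def paeth_predictor(a, b, c):
    """PNG Paeth predictor (one of the five filter types in Table 4).

    a = left, b = above, c = upper-left. It forms the initial estimate
    p = a + b - c and returns whichever neighbour is closest to p,
    breaking ties in the order a, b, c.
    """
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c

print(paeth_predictor(120, 121, 119))  # 121: the above pixel is closest
print(paeth_predictor(100, 150, 160))  # 100: the left pixel is closest
```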

Context-Based, Adaptive, Lossless Image Codec
Context-based, adaptive, lossless image codec (CALIC) is a lossless image compression technique that uses a more complex predictor (the gradient-adjusted predictor, GAP) than lossless JPEG, PNG, and JPEG-LS. GAP provides better modelling than MED by classifying the edges of an image as strong, normal, or weak. Although CALIC provides more compression than JPEG-LS and better modelling, it is computationally expensive. Wu [57] used the local horizontal (Gh) and vertical (Gv) image gradients (Equations (12) and (13)) to predict a value for the current pixel (P) using Equations (14) and (15), based on the template shown in Table 2. At the coding stage, CALIC uses either Huffman coding or arithmetic coding; the latter provides more compression, but takes more time for encoding and decoding, since arithmetic coding is more complex. The encoding and decoding methods in CALIC follow a raster scan order, with a single pass over the image. There are two modes of operation: binary and continuous tone. If the current locality of the original image has at most two distinct intensity values, binary mode is used; otherwise, continuous-tone mode is used. The continuous-tone approach has four components: prediction, context selection and quantisation, context modelling of the prediction errors, and entropy coding of the prediction errors. The mode is selected automatically, and no additional information is required [58].
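A sketch of GAP as described in Wu's paper follows; dh and dv correspond to the gradients Gh and Gv of Equations (12) and (13), the neighbours are named by compass direction from the current pixel, and the thresholds 80/32/8 are from the published heuristic (this omits the context-modelling stages of the full codec):

```python
def gap_predict(W, N, NE, NW, WW, NN, NNE):
    """Gradient-adjusted predictor (GAP) of CALIC (a sketch).

    W = left, N = above, NE = above-right, NW = above-left,
    WW = two left, NN = two above, NNE = two above and one right.
    """
    dh = abs(W - WW) + abs(N - NW) + abs(N - NE)    # horizontal gradient Gh
    dv = abs(W - NW) + abs(N - NN) + abs(NE - NNE)  # vertical gradient Gv
    if dv - dh > 80:         # sharp horizontal edge: predict from the left
        return W
    if dh - dv > 80:         # sharp vertical edge: predict from above
        return N
    p = (W + N) / 2 + (NE - NW) / 4
    if dv - dh > 32:         # weaker edges blend the estimate towards
        p = (p + W) / 2      # the neighbour on the flat side
    elif dv - dh > 8:
        p = (3 * p + W) / 4
    elif dh - dv > 32:
        p = (p + N) / 2
    elif dh - dv > 8:
        p = (3 * p + N) / 4
    return p

print(gap_predict(100, 100, 100, 100, 100, 100, 100))  # 100.0: flat region
print(gap_predict(200, 100, 100, 100, 200, 100, 100))  # 200: horizontal edge
```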

JPEG 2000
JPEG 2000 is an extension of the JPEG standard [40], and it is a wavelet-based still image compression technique [59][60][61][62] with certain new functionalities. It provides better compression than JPEG [60]. The development of JPEG 2000 began in 1997, and it became an international standard [59] in 2000. It supports both lossless and lossy compression within a single architecture. Most image or video compression standards divide an image or a video frame into square blocks that are processed independently. For example, JPEG uses the Discrete Cosine Transform (DCT) to split an image into a set of 8 × 8 square blocks for transformation. As a result of this processing, blocking artefacts arise during the quantisation of the DCT coefficients at high CRs [7], producing visually perceptible faults in the image [61][62][63][64]. In contrast, JPEG 2000 transforms an image as a whole using a discrete wavelet transform (DWT), which addresses this issue. One of the most significant advantages of JPEG 2000 is that different parts of the same image can be saved at different levels of quality if necessary [65]. Another advantage of using the DWT is that it transforms an image into a set of wavelets, which are easier to store than pixel blocks [66][67][68]. JPEG 2000 is also scalable, which means that a code stream can be truncated at any point; the image can still be reconstructed, but the resolution may be poor if many bits are omitted. JPEG 2000 has two major limitations: it produces ringing artefacts near the edges of an image, and it is computationally more expensive [69,70]. Figure 10 shows a general block diagram for the JPEG 2000 encoding technique. Initially, an image is transformed into the YUV colour space, rather than YCbCr, for lossless JPEG 2000 compression, since the YCbCr transform is irreversible and the YUV transform is completely reversible. The transformed image is then split into a set of tiles.
Although the tiles may be of any size, all of the tiles in an image are the same size. The main advantage of dividing an image into tiles is that the decoder requires very little memory for image decoding. In this tiling process, the image quality can decrease (giving a lower PSNR), and blocking artefacts similar to those of JPEG can arise when more tiles are created. The LeGall-Tabatabai (LGT) 5/3 wavelet transform is then used to decompose each tile for lossless coding [71,72], while the CDF 9/7 wavelet transform is used for lossy compression [72].
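The reversible integer 5/3 lifting steps can be sketched in one dimension (a simplified version assuming an even-length signal and simple mirror handling at the boundaries; JPEG 2000 applies the transform separably to rows and columns with its own defined boundary extensions):

```python
def lgt53_forward(x):
    """One level of the reversible LGT 5/3 lifting transform (1-D sketch).

    Predict step:  d[i] = x[2i+1] - floor((x[2i] + x[2i+2]) / 2)
    Update step:   s[i] = x[2i]   + floor((d[i-1] + d[i] + 2) / 4)
    Returns the low-pass band s and the high-pass (detail) band d.
    """
    m = len(x) // 2  # even-length signal assumed
    d = [x[2*i+1] - (x[2*i] + x[min(2*i+2, 2*m-2)]) // 2 for i in range(m)]
    s = [x[2*i] + (d[max(i-1, 0)] + d[i] + 2) // 4 for i in range(m)]
    return s, d

def lgt53_inverse(s, d):
    """Exact inverse: undo the update step, then the predict step."""
    m = len(s)
    x = [0] * (2 * m)
    for i in range(m):
        x[2*i] = s[i] - (d[max(i-1, 0)] + d[i] + 2) // 4
    for i in range(m):
        x[2*i+1] = d[i] + (x[2*i] + x[min(2*i+2, 2*m-2)]) // 2
    return x

row = [100, 102, 104, 110, 120, 121]
s, d = lgt53_forward(row)
print(s, d)                         # low-pass near the signal, details near zero
print(lgt53_inverse(s, d) == row)   # True: perfect (lossless) reconstruction
```

Because all the lifting steps use integer arithmetic, the inverse recovers the input exactly, which is what makes the 5/3 transform usable for lossless coding, unlike the floating-point 9/7 transform.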
Quantisation is carried out in lossy compression, but not in lossless compression. The outcome of the transformation process is a collection of sub-bands, which are further split into code blocks; these are then coded using the embedded block coding with optimal truncation (EBCOT) process, in which the most significant bits are coded first. All of the bit planes of the code blocks are perfectly stored and coded, and a context-driven binary arithmetic coder is applied independently to each code block as an entropy coder. Some bit planes are dropped in lossy compression. While maintaining the same quality, JPEG 2000 provides about 20% more compression than JPEG, and it works better for larger images.
After the DWT transformation of each tile, we obtain four parts: the top left image with lower resolution, the top right image with higher vertical resolution, the bottom left image with lower vertical resolution, and the bottom right image with higher resolution in both directions. This decomposition process is known as dyadic [60], and it is illustrated in Figure 11 based on a real image, where the entire image is considered as a single tile.

JPEG XR (JPEG Extended Range)
Like JPEG, JPEG Extended Range (JPEG XR) is a still image compression technique, developed based on HD Photo technology [73,74]. The main aim of the design of JPEG XR was to achieve better compression performance with limited computational resources [75], since many applications require a high number of colours; JPEG XR can represent 2.8 × 10^14 colours, as compared to only 16,777,216 for JPEG. While JPEG-LS, CALIC, and JPEG 2000 use MED, GAP, and the DWT, respectively, for compression, JPEG XR uses a lifting-based reversible hierarchical lapped biorthogonal transform (LBT). The two main advantages of this transformation are that encoding and decoding both require relatively few calculations, and that it is completely reversible. Two operations are carried out in this transformation: a photo core transform (PCT), which employs a lifting scheme, and a photo overlap transform (POT) [76]. Similar to the DCT, the PCT is a 4 × 4 wavelet-like multi-resolution hierarchical transformation within a 16 × 16 macroblock, and it improves the image compression performance [76]. The POT is performed before the PCT to reduce blocking artifacts at low bitrates; another advantage of the POT is that it reduces the ET and DT at high bitrates [77]. At the quantisation stage, a flexible coefficient quantisation approach based on the human visual system is used, controlled by a quantisation factor (QF), where the QF varies depending on the colour channels, frequency bands, and spatial regions of the image. It should be noted that quantisation is only applied in lossy compression. An inter-block coefficient prediction technique is also implemented to remove inter-block redundancy [74]. Finally, JPEG XR uses adaptive Huffman coding as an entropy coding technique. JPEG XR also allows image tiling in the same way as JPEG 2000, which means that each block can be decoded independently.
JPEG XR permits multiple colour conversions, and uses the YCbCr colour space for images with 8 bits per sample and the YCoCg color space for RGB images. It also supports the CMYK colour model [73]. Figures 12 and 13 show general block diagrams for JPEG XR encoding and decoding, respectively.

AV1 Image File Format (AVIF), WebP and Free Lossless Image Format (FLIF)
In 2010, Google introduced WebP, based on VP8 [79]. It is now one of the most successful image formats, and it supports both lossless and lossy compression. WebP predicts each block based on three neighbouring blocks, and the blocks are predicted in four modes: horizontal, vertical, DC, and TrueMotion [80,81]. Although WebP provides better compression than JPEG and PNG, only some browsers support it. Another disadvantage is that lossless WebP does not support progressive decoding [82]. A clear explanation of the encoding procedure of WebP is given in [83].
AOMedia Video 1 (AV1), which was developed in 2015 for video transmission, is a royalty-free video coding format [84], and the AV1 Image File Format (AVIF) is derived from AV1 and uses the same technique for image compression. It supports both lossless and lossy compression. AV1 uses a block-based frequency transform and incorporates some new features based on Google's VP9 [85]. As a result, the AV1 encoder has more alternatives for adapting to various kinds of input, and it outperforms H.264 [86]. The detailed coding procedure of AV1 is given in [87,88]. AVIF and HEIC provide similar compression; however, HEIC is based on the patent-encumbered H.265 format and cannot legally be used without patent licences, whereas AVIF is free to use. AVIF has two major drawbacks: it is very slow at encoding and decoding an image, and it does not support progressive rendering. Nevertheless, AVIF provides many advantages over WebP; for example, it produces smaller files at higher quality, and it supports multiple channels [89]. A detailed encoding procedure for AV1 is shown in [90].
Free Lossless Image Format (FLIF) is one of the best lossless image formats, and it outperforms the other state-of-the-art techniques in terms of the compression ratio. Many image compression techniques (e.g., PNG) support progressive decoding, which can show an image without downloading the whole file. In this respect, FLIF is better, as it uses progressive interlacing, an improved version of PNG's progressive decoding. FLIF is developed based on MANIAC (Meta-Adaptive Near-zero Integer Arithmetic Coding), a variant of CABAC (context-adaptive binary arithmetic coding). A detailed explanation of the coding of FLIF is given in [91,92]. One of the main advantages of FLIF is that it is responsive by design, so users can adapt it to their needs. FLIF performs excellently on any kind of image [93]. In [91], Sneyers et al. show that JPEG and PNG work well on photographs and drawings, respectively, and that no single algorithm works well on all types of images; however, they conclude that FLIF alone works well on all kinds of image. FLIF also has limitations: no browser yet supports it [94], and it takes more time to encode and decode an image.

Experimental Results and Analysis
The dependence of many applications on multimedia computing is growing rapidly, due to an increase in the use of digital imagery. As a result, the transmission, storage, and effective use of images are becoming important issues. Raw image transmission is very slow and gives rise to huge storage costs. Digital image compression is a way of converting an image into a format that can be transferred quickly and stored in a comparatively small space. In this paper, we provide a detailed analysis of the state-of-the-art lossless still image compression techniques. As we have seen, most previous survey papers on lossless image compression focus on the CR [3,4,16,78,95-98]. Although this is an important measurement criterion, the ET and DT are two further important factors for lossless image compression.
The key feature of this paper is that we not only carry out a comparison based on CR, but also explore the effectiveness of each algorithm in terms of compressing an image based on several metrics. We use four types of images: 8-bit and 16-bit greyscale and RGB images. We applied the state-of-the-art techniques to a total of nine uncompressed images of each type, taken from the database "New Test Images-Image Compression Benchmark" [99].
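The three evaluation metrics are straightforward to measure for any lossless codec. The following is a minimal sketch in Python, using the standard library's `zlib` as a stand-in compressor purely for illustration (the experiments in this paper used the actual codec implementations in Matlab):

```python
import time
import zlib

def evaluate_codec(raw: bytes, level: int = 6):
    """Measure CR, ET, and DT for one image buffer.

    zlib stands in for a real image codec here; the same three
    measurements apply to any lossless compressor.
    """
    t0 = time.perf_counter()
    compressed = zlib.compress(raw, level)
    encoding_time = time.perf_counter() - t0

    t0 = time.perf_counter()
    restored = zlib.decompress(compressed)
    decoding_time = time.perf_counter() - t0

    # Lossless compression requires an exact round trip.
    assert restored == raw
    compression_ratio = len(raw) / len(compressed)
    return compression_ratio, encoding_time, decoding_time

# A synthetic 256x256 8-bit greyscale "image" with enough structure
# to be compressible (illustration only, not a benchmark image).
image = bytes((x * y) % 256 for y in range(256) for x in range(256))
cr, et, dt = evaluate_codec(image)
print(f"CR = {cr:.3f}, ET = {et * 1e3:.2f} ms, DT = {dt * 1e3:.2f} ms")
```

Note that ET and DT depend on the hardware and implementation, which is why the experimental setup is reported alongside the results.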

Analysis Based on Usual Parameters
In this section, we analyse the results based on each individual parameter. Figures 14-16 show the 8-bit greyscale images and their respective ETs and DTs, and Table 5 shows the CRs of the images. Table 5 shows that FLIF gives the highest CR for the artificial.pgm, big_tree.pgm, bridge.pgm, cathedral.pgm, fireworks.pgm, and spider_web.pgm 8-bit greyscale images; for the rest of the images, AVIF provides the highest CR. JPEG XR gives the lowest CRs (3.196, 3.054, and 2.932) for three images (artificial, fireworks, and flower_foveon, respectively), while for the rest of the images lossless JPEG gives the lowest CR. In terms of the ET, Figure 15 shows that JPEG-LS gives the shortest times (0.046 and 0.115 s) for two images (artificial and fireworks) and JPEG XR gives the shortest times for the remainder of the images. FLIF takes the longest times (3.28, 15.52, 6.18, 2.91, 6.19, 2.73, 3.27, and 4.32 s, respectively) for all the images except flower_foveon.pgm; for this image, AVIF takes the longest time (1.54 s). Figure 16 shows that, at the decoding stage, CALIC takes the longest times (0.542, 0.89, and 0.898 s) for three images (fireworks.pgm, hdr.pgm, and spider_web.pgm, respectively), and WebP takes the longest times (3.26 and 2.33 s) for two images (big_tree.pgm and bridge.pgm, respectively), while FLIF takes the longest times for the rest of the images. On the other hand, PNG takes the shortest DTs for all of the images.

Figure 17 shows examples of 8-bit RGB images. Table 6 and Figures 18 and 19 show the values of CR, ET, and DT for each of these images. From Table 6, it can be seen that FLIF gives the highest CRs (18.446 and 5.742) for two images (artificial.PPM and fireworks.PPM, respectively), and AVIF gives the highest CRs for the rest of the images. JPEG XR gives the lowest CRs (4.316 and 3.121) for two images (artificial and fireworks, respectively) and, for the remaining seven images, lossless JPEG gives the lowest CRs.
Figure 18 shows that FLIF requires the longest ET for all of the images, while JPEG XR gives the shortest ET for most of them. Table 7 shows the 16-bit greyscale images and their respective ETs, DTs, and CRs. From Table 7, it can be seen that FLIF gives the highest CRs for all the images except deer.pgm, flower_foveon.pgm, and hdr.pgm; for these images, AVIF provides the highest CRs (3.742, 8.273, and 7.779, respectively). Lossless JPEG gives the lowest CR (2.791) for artificial.pgm, and PNG gives the lowest for the rest of the images. Figure 21 shows that FLIF gives the longest ETs for all the images except big_tree.pgm; for this image, JPEG 2000 takes the longest (4.579 s). JPEG-LS gives the shortest ETs (0.186, 0.238, 0.257, and 0.406 s) for four images (artificial, cathedral, fireworks, and spider_web, respectively) and JPEG XR gives the shortest for the rest of the images. In terms of decoding, CALIC and lossless JPEG give the longest DTs (0.613 and 0.503 s) for artificial.pgm and flower_foveon.pgm, respectively, and JPEG 2000 gives the longest DTs (1.078, 0.978, and 1.736 s) for three images (fireworks, hdr, and spider_web, respectively). For the rest of the images, FLIF takes the longest DTs. On the other hand, PNG gives the shortest DTs for all of the images shown in Figure 22.

The average CR of FLIF for the 16-bit RGB images is higher than those of JPEG-LS, JPEG 2000, lossless JPEG, PNG, JPEG XR, CALIC, AVIF, and WebP (16.84% and 7.65% higher than the last two, respectively). If the CR is the main consideration for an application, we can see from Figure 26 that FLIF is better for all four types of image. Figure 27 shows the average encoding times: JPEG XR requires the shortest average ET (0.167, 0.376, 0.455, and 0.417 s) for the four types of image, and it is faster than JPEG-LS, JPEG 2000, lossless JPEG, PNG, CALIC, AVIF, WebP, and FLIF (85.42% and 87.12% faster than the last two, respectively).
When DT is the main consideration for an application, we can conclude from Figure 28 that PNG is better for 8-bit greyscale, 16-bit greyscale, and 8-bit RGB images, whereas JPEG XR is better for 16-bit RGB images.

Analysis Based on Our Developed Technique
Because the performance of a lossless algorithm depends on the CR, ET, and DT, we develop a technique, defined in Equations (16)-(19), to calculate the overall and grand total performance of each algorithm when all of these parameters are considered simultaneously. The overall performance expresses how good each method is, on average, relative to all methods in terms of a single evaluation parameter, whereas the grand total performance combines two or more parameters. For each method, the overall performance (OVP) in terms of CR, ET, and DT is calculated using Equations (16)-(18). In Equation (16), N is the number of images of each type, MN represents a method name, AV represents the average value of CR, ET, or DT for the images in a category, and PM is an indicator that can be assigned to CR, ET, or DT. The OVP of each method in terms of the CR is calculated using Equation (17), while Equation (18) is used to calculate the overall performance in terms of ET or DT; in Equation (18), EDT is an indicator that can be set to ET or DT. Finally, the grand total performance (GTP) of each method is calculated using Equation (19), where NP is the number of parameters considered when calculating the grand total, and TP is the total number of parameters. For lossless image compression, TP can be at most three, because only three parameters (compression ratio, encoding time, and decoding time) are used to evaluate a method.
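Since Equations (16)-(19) are not reproduced in this excerpt, the following Python sketch shows one plausible reading of the technique, consistent with the reported GTPs being percentages: each method's OVP for CR is its share of the summed average CRs (higher is better), the OVP for ET or DT uses reciprocals (shorter is better), and the GTP averages the OVPs of the NP chosen parameters. The function names and the numeric averages below are our own illustration, not the paper's measured values.

```python
def ovp_cr(avg_cr: dict) -> dict:
    """OVP for CR: each method's share of the summed average CRs (%)."""
    total = sum(avg_cr.values())
    return {m: 100.0 * v / total for m, v in avg_cr.items()}

def ovp_time(avg_t: dict) -> dict:
    """OVP for ET or DT: reciprocal share, so shorter times score higher (%)."""
    total = sum(1.0 / v for v in avg_t.values())
    return {m: 100.0 * (1.0 / v) / total for m, v in avg_t.items()}

def gtp(*ovps: dict) -> dict:
    """GTP: mean of the chosen parameters' OVPs (NP out of TP = 3)."""
    methods = ovps[0].keys()
    return {m: sum(o[m] for o in ovps) / len(ovps) for m in methods}

# Hypothetical per-category averages for three methods (illustration only).
avg_cr = {"A": 3.0, "B": 2.5, "C": 2.0}   # higher is better
avg_et = {"A": 0.5, "B": 0.2, "C": 1.0}   # seconds; lower is better
two_param = gtp(ovp_cr(avg_cr), ovp_time(avg_et))  # GTP over CR and ET
```

Because each OVP distributes 100% across the methods, every GTP also sums to 100% across the methods, which matches the scale of the percentages reported in Figures 29-33.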
We show the two-parameter GTP (i.e., the GTP for each combination of two parameters) in Figures 29-32 for 8-bit greyscale, 8-bit RGB, 16-bit greyscale, and 16-bit RGB images, respectively. Figure 29a shows that JPEG XR and AVIF give the highest (25.128%) and the lowest (5.068%) GTPs, respectively, in terms of CR and ET. Figure 29b shows that PNG and WebP provide the highest (27.607%) and lowest (5.836%) GTPs, respectively, in terms of CR and DT. For ET and DT, Figure 29c shows that JPEG XR and FLIF provide the highest (25.429%) and the lowest (1.661%) GTPs, respectively. We can conclude from Figure 29 that JPEG XR is better when CR and ET are the main considerations for an application.

Figure 30 shows the two-parameter GTP for 8-bit RGB images. Figure 30a shows that JPEG XR and PNG give the highest (23.238%) and lowest (7.337%) GTPs of the state-of-the-art methods in terms of CR and ET, and that JPEG XR performs 29.1%, 63.13%, 50.68%, 68.43%, 67.79%, 62.94%, 59.62%, and 68% better than JPEG-LS, JPEG 2000, lossless JPEG, PNG, CALIC, AVIF, WebP, and FLIF, respectively. In terms of CR and DT, PNG and CALIC achieve the highest (26.291%) and lowest (6.260%) GTPs, as shown in Figure 30b. From Figure 30 we can conclude that JPEG XR is better for 8-bit RGB image compression when either CR and ET or ET and DT are the major considerations for an application, whereas PNG is better when CR and DT are most important.

For 16-bit greyscale images, Figure 31 shows the two-parameter GTP for the state-of-the-art techniques. In terms of CR and ET, JPEG XR achieves the highest GTP (19.305%) and JPEG 2000 the lowest (5.327%) of the methods shown in Figure 31a. It can also be seen that JPEG XR performs 34.87%, 72.41%, 35.54%, 68.96%, 58.53%, 44.38%, 34.31%, and 33.01% better than JPEG-LS, JPEG 2000, lossless JPEG, PNG, CALIC, AVIF, WebP, and FLIF, respectively.
Figure 31b shows the GTP based on CR and DT; PNG and JPEG 2000 have the highest (20.288%) and lowest (3.866%) GTPs, and PNG performs 50.69%, 80.94%, 38.95%, 31.16%, 73.63%, 50.5%, 43.21%, and 38% better than JPEG-LS, JPEG 2000, lossless JPEG, JPEG XR, CALIC, AVIF, WebP, and FLIF, respectively. PNG and FLIF achieve the highest (20.352%) and lowest (3.833%) GTPs in terms of ET and DT, as shown in Figure 31c, and PNG performs 20.81%, 78.37%, 8.23%, 8.23%, 60.52%, 77.22%, 74.12%, and 81.17% better than JPEG-LS, JPEG 2000, lossless JPEG, JPEG XR, CALIC, AVIF, WebP, and FLIF, respectively. Hence, based on Figure 31, we conclude that JPEG XR is better for 16-bit greyscale image compression when CR and ET are the main considerations for an application, whereas PNG is better when CR and DT or ET and DT are considered important for this type of image.

Figure 32 shows the GTP of each method for 16-bit RGB images. From Figure 32 we can conclude that JPEG XR is the best for 16-bit RGB image compression for any application. Figure 33 shows the three-parameter GTP (CR, ET, and DT) for the state-of-the-art techniques; it can be seen from Figure 33 that JPEG XR is better for all four types of image when all three parameters (CR, ET, and DT) are equally important for lossless still image compression. To allow a straightforward comparison, we summarise the analysis presented in this paper in Table 9, where the two best methods are shown in each category. Matlab (version 9.8.0.1323502 (R2020a)) was used for these experiments, on a DELL desktop computer with an Intel(R) Core(TM) i7-8700 CPU @ 3.20 GHz (Intel, Santa Clara, CA, USA).

Conclusions
Lossless still image compression is a topic of intense research interest in the field of computing, especially for applications that rely on low-bandwidth connections and limited storage. There are various types of images, and different algorithms are used to compress them losslessly. The performance of a lossless algorithm depends on the CR, ET, and DT, and different applications place different demands on these aspects of performance.
Some applications place more importance on the CR, and others mainly focus on the ET or DT. Two or all three of these parameters may be equally important in some applications. The main contribution of this research article is that we have analysed the state-of-the-art techniques from each perspective and evaluated the strengths of each algorithm for each kind of image. We also recommend the two best state-of-the-art methods from each standpoint.
From the analysis presented here, we can see that FLIF is optimal for the four types of images in terms of the CR. However, JPEG XR and PNG provide better performance in terms of encoding and decoding speeds, respectively, for 8-bit greyscale and RGB, and 16-bit greyscale images; for 16-bit RGB images, JPEG XR is the fastest. When the CR and ET are the main considerations, JPEG XR provides better performance for the four types of image. PNG achieves good performance for 16-bit greyscale images when encoding and decoding times are most important, and JPEG XR performs best for the other types of images. When CR and DT are most important, PNG is better for 16-bit greyscale, 8-bit greyscale, and 8-bit RGB images, and JPEG XR is better for 16-bit RGB images. JPEG XR is better for the four types of image if all parameters are equally important for lossless compression in an application.
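The recommendations above can be collected into a small lookup helper. The following Python sketch distils the single best method per case from our analysis (the dictionary layout and the `recommend` function are our own convenience illustration, not part of the paper; Table 9 additionally lists a runner-up in each category):

```python
# The four image categories used throughout the experiments.
TYPES = ["8-bit greyscale", "8-bit RGB", "16-bit greyscale", "16-bit RGB"]

def _all(method: str) -> dict:
    """Assign the same best method to every image type."""
    return {t: method for t in TYPES}

# Best method per priority (parameter or combination) and image type,
# as concluded in this paper's analysis.
BEST = {
    "CR":       _all("FLIF"),
    "ET":       _all("JPEG XR"),
    "DT":       {**_all("PNG"), "16-bit RGB": "JPEG XR"},
    "CR+ET":    _all("JPEG XR"),
    "CR+DT":    {**_all("PNG"), "16-bit RGB": "JPEG XR"},
    "ET+DT":    {**_all("JPEG XR"), "16-bit greyscale": "PNG"},
    "CR+ET+DT": _all("JPEG XR"),
}

def recommend(priority: str, image_type: str) -> str:
    """Return the best-performing method for a given priority and image type."""
    return BEST[priority][image_type]

print(recommend("CR", "8-bit RGB"))         # FLIF
print(recommend("CR+ET+DT", "16-bit RGB"))  # JPEG XR
```

A user can thus pick a codec directly from their application's dominant requirement, which is the practical outcome discussed below.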
An important outcome of this research is that it allows users to easily identify the optimal compression algorithm for an application, based on their particular needs. For example, in many data storage applications, such as Google Drive, OneDrive, and Dropbox, the compression ratio is most important; in this case, FLIF could be the best choice for all types of image. On the other hand, when attaching a photo to an email, the compression ratio and encoding time are more important than the decoding time, and in instant messaging, when a photo is shared, all three parameters are equally important. In these two cases, JPEG XR could be the best selection for all categories of image.
In future work, we plan to carry out an exhaustive analysis to identify which parts of an algorithm should be improved, and which parameters need attention, in order to optimise the algorithm.