Lossless Image Compression Techniques: A State-of-the-Art Survey

Abstract: Modern daily life activities produce a huge amount of data, which creates a big challenge for storing and communicating them. As an example, hospitals produce a huge amount of data on a daily basis, which makes it a big challenge to store it in limited storage or to communicate it through the restricted bandwidth of the Internet. Therefore, there is an increasing demand for more research in data compression and communication theory to deal with such challenges. Such research responds to the requirements of data transmission at high speed over networks. In this paper, we focus on a deep analysis of the most common techniques in image compression. We present a detailed analysis of run-length, entropy and dictionary based lossless image compression algorithms with a common numeric example for a clear comparison. Following that, the state-of-the-art techniques are discussed based on some benchmarked images. Finally, we use standard metrics such as average code length (ACL), compression ratio (CR), peak signal-to-noise ratio (PSNR), efficiency, encoding time (ET) and decoding time (DT) in order to measure the performance of the state-of-the-art techniques.


Introduction
The utilization of computers in modern activities is increasing virtually everywhere. As a result, sending a plethora of data, especially images and videos, over the Internet is a most challenging issue because of limited bandwidth and storage capacity; it is also time-consuming and costly, as reported in [1]. For instance, a conventional movie camera customarily uses 24 frames per second, whereas recent video standards allow 120, 240, or 300 frames per second. Video is a series of still images, or frames, shown per second, and a color image contains three channels: red, green and blue. Suppose you would like to send or store a three-hour color movie file of 1200 × 1200 resolution in which 50 frames are passed every second. If a pixel is coded in 8 bits, it takes approximately (1200 × 1200 × 3 × 8 × 50 × 10,800) bits = 17,797,851.5625 Megabits = 2172.5893 gigabytes of storage, which is a huge challenge to store on a computer or send over the Internet. Here, three is the number of channels of a color image, that is, R, G, and B, and 10,800 is the total number of seconds. Additionally, the transmission medium and latency are two major issues for data transmission. If the video file is sent over a 100 Mbps medium, approximately (17,797,851.5625 Megabits)/100 = 177,978.5156 s = 49.4385 h is required, because the medium can send 100 Megabits per second. For these reasons, compression is required: it is a paramount way to represent an image with fewer bits while keeping its quality, so that an immensely large volume of data can be sent through a limited bandwidth at high speed over the Internet, as reported in [2,3]. The general block diagram of an image compression procedure is shown in Figure 1. There are many image compression techniques, and a technique is said to be the best when it achieves a low average code length, low encoding and decoding times, and a high compression ratio. Image compression algorithms are
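The storage and transmission figures above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only; it assumes binary units (2^20 bits per Megabit, 2^30 bytes per gigabyte), since those reproduce the numbers quoted above:

```python
# Back-of-the-envelope estimate for the three-hour movie example:
# 1200x1200 pixels, 3 color channels, 8 bits per pixel, 50 fps, 10,800 s.
bits = 1200 * 1200 * 3 * 8 * 50 * 10_800

megabits = bits / 2**20        # -> 17,797,851.5625 Megabits
gigabytes = bits / 8 / 2**30   # -> ~2172.5893 gigabytes

seconds = megabits / 100       # over a 100 Mbps link
hours = seconds / 3600         # -> ~49.4385 h
print(f"{megabits} Mb, {gigabytes:.4f} GB, {hours:.4f} h")
```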
extensively applied in medical imaging, computer communication, military communication via radar, teleconferencing, magnetic resonance imaging (MRI), broadcast television and satellite imaging, as reported in [4]. Some of these applications require high-quality visual information, while others need less quality, as reported in [5,6].
From these perspectives, compression is divided into two types: lossless and lossy. All original data are recovered exactly from the encoded data in lossless compression, whereas a lossy technique retrieves most of the data by permanently eliminating certain information, especially redundant information, as reported in [7,8]. Lossless compression is mostly utilized in facsimile transmissions of bitonal images, the ZIP file format, digital medical imagery, internet telephony, and streaming video files, as reported in [9].
The foremost intention of implementing a compression algorithm is to diminish superfluous data, as reported in [10]. Run-length coding, for example, is a lossless procedure where a set of consecutive identical pixels (a run of data) is preserved as a single value and a count, as stated in [11,12]. However, long runs of data do not usually exist in real images, as mentioned in [13,14], which is the main problem of run-length coding. Article [15] shows that a chain code binarization with run-length and LZ77 provides a more satisfactory result than the traditional run-length technique from a compression ratio perspective. The authors in [16] show a different way of compression utilizing a bit series of a bit plane and demonstrate that it provides a better result than conventional run-length coding.
Entropy encoding techniques have been proposed to solve the difficulties of the run-length algorithm. Entropy coding encodes the source symbols of an image with code words of different lengths. There are some well-recognized entropy coding methods, such as Shannon-Fano, Huffman and arithmetic coding. The first entropy coding technique is Shannon-Fano, which gives a better result than run-length, as reported in [17]. The authors in [18] show that Shannon-Fano coding provides 30.64% and 36.51% better results for image and text compression, respectively, compared to run-length coding. However, Nelson et al. stated in [19] that Shannon-Fano sometimes generates two different codes for the same symbol and does not guarantee optimal codes, which are the two main problems of the algorithm. From these perspectives, Shannon-Fano coding is an inefficient data compression technique, as reported in [20,21].
Huffman is another entropy coding algorithm that solves the problems of Shannon-Fano, as reported in [22,23]. In that technique, pixels that occur more frequently are encoded using fewer bits, as shown in [24,25]. Although Huffman coding is a good compression technique, Rufai et al. proposed a singular value decomposition (SVD) and Huffman coding based image compression procedure in [26], where SVD is used to decompose an image first and the rank is reduced by ignoring some lower singular values. Lastly, the processed representation is coded by Huffman coding, which shows a better result than JPEG2000 for lossy compression. In [27], three algorithms, Huffman, fractal coding and Discrete Wavelet Transform (DWT) coding, have been implemented and compared to find the best coding procedure among them. It shows that Huffman works better to reduce redundant data and DWT improves the quality of a compressed image, whereas the fractal provides a better compression ratio. The main problem of Huffman coding is that it is very sensitive to noise: it cannot reconstruct an image perfectly from an encoded image if any changes happen, as reported in [28].
Another lossless entropy method is arithmetic coding, which gives a shorter average code length than Huffman coding, as reported in [29]. In [30], Masmoudi et al. proposed a modified technique of arithmetic coding that encodes an image block-row wise from top to bottom and block by block from left to right, instead of pixel by pixel, using a statistical model. The precise probability between the current and its neighboring block is calculated by reducing the Kullback-Leibler divergence. As a result, around 15.5% and 16.4% of the bitrate is reduced for static and adaptive order, respectively. Utilizing adaptive arithmetic coding and finite mixture models, a block-based lossless compression has been proposed in [31]. Here, an image is partitioned into non-overlapping blocks and every block is encoded individually using arithmetic coding. This algorithm provides 9.7% better results than JPEG-LS, as reported in [32,33], when the work is done in the prediction-error domain instead of the pixel domain. Articles [34,35] state that arithmetic coding provides a better compression ratio. However, it takes so much time that it is virtually unusable for dynamic compression. Furthermore, its use is restricted by patents. On the other hand, though Huffman coding provides marginally less compression, it uses much less time to encode an image than arithmetic coding, which is why it is good for dynamic compression, as reported in [36,37]. Furthermore, an image encoded by arithmetic coding can be corrupted entirely by a single bit error because the method has very poor error resilience, as reported in [38-40]. In addition, the primary limitation of entropy coding is that it increases CPU complexity, as stated in [41,42].
LZW (Lempel-Ziv-Welch) is a dictionary based compression technique that reads a sequence of pixels, groups the pixels into strings, and finally converts the strings into codes. In that technique, a code table with 4096 common entries is utilized, and the fixed codes 0-255 are assigned first as initial entries because an image can have a maximum of 256 different pixel values, from 0 to 255. It works better in the case of text compression, as reported in [43]. However, Saravanan et al. propose an image coding procedure utilizing LZW, which compresses an image in two stages, as shown in [44].
Firstly, an image is encoded utilizing Huffman coding. Secondly, after concatenating all the code words, LZW is applied to compress the encoded image, which provides a better result. However, the main challenge of that technique is managing the string table.
In this study, we use a common numeric data set and show the step-by-step implementation details of the state-of-the-art data compression techniques mentioned. This demonstrates the comparisons among the methods and explains their problems based on the results of some benchmarked images. The organization of this article is as follows: the encoding and decoding procedures and the analysis of run-length, Shannon-Fano, Huffman, LZW and arithmetic coding are discussed in Section 2. The experimental results on some benchmarked images are explained in Section 2.2, and concluding statements are presented in Section 3.
It shows that only twenty-six elements are preserved in two matrices instead of 50 items, which means that (26 × 8) = 208 bits are sent to the decoder instead of (50 × 8) = 400 bits. Thus, the average code length is 208/50 = 4.16 bits and ((8 − 4.16)/8) × 100 = 48% working memory is saved for the data set.

Run-Length Decoding Procedure
The two arrays, named position and items, are received for decoding, and the decoder follows the procedure shown below for decompression.

1.
Read each element from the array items and write the element repeatedly until its corresponding number in the position array is found.
As an example, the first 6 and 7 of the items array are written once at indices 1 and 2 in the new decoded list, respectively, whereas the next 6 and 7 are repeated three times at indices 3 to 5 and nine times at indices 6 to 14 in the same decoded list, respectively. This process continues until all elements of items have been read. Finally, we get the same list as the original list (A) after decoding.
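The position/items variant described above can be sketched in a few lines of Python. This is our own illustrative helper (the function names are not from the paper): `positions[i]` holds the 1-based index of the last element of the i-th run, which is what the decoding walk-through above consumes.

```python
def rle_encode(data):
    """Return (items, positions): items[i] is the value of the i-th run and
    positions[i] is the 1-based index of the LAST element of that run."""
    items, positions = [], []
    for i, v in enumerate(data, start=1):
        if items and v == items[-1]:
            positions[-1] = i        # extend the current run
        else:
            items.append(v)          # start a new run
            positions.append(i)
    return items, positions

def rle_decode(items, positions):
    """Rebuild the original list by repeating each item up to its end index."""
    out, start = [], 1
    for v, end in zip(items, positions):
        out.extend([v] * (end - start + 1))
        start = end + 1
    return out
```

For the prefix 6, 7, 6, 6, 6, 7, 7, ... described above, this yields items [6, 7, 6, 7] and positions [1, 2, 5, 14], matching the example.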

Shannon-Fano Coding
Shannon-Fano is a lossless coding technique that takes the probabilities of an image sorted in descending order and separates them into two sets whose total sums are almost equal, as reported in [46]. The Shannon-Fano encoding procedure is as follows:

1.
Find the distinct symbols (N) and their corresponding probabilities.

2.
Sort the probabilities in descending order.

3.
Divide them into two groups so that the entire sum of each group is as equal as possible, and make a tree.

4.
Assign 0 and 1 to the left and right group, respectively.

5.
Repeat steps 3 and 4 until each element becomes a leaf node on a tree.
Run-length coding does not perform any compression on array C. C contains seven different components (7, 5, 4, 6, 3, 2, 1) and their probabilities are 0.42, 0.26, 0.08, 0.08, 0.06, 0.06 and 0.04, respectively. As indicated by the algorithm, the two groups left (0.42, 0.08) and right (0.26, 0.08, 0.06, 0.06, 0.04) are made and the Shannon-Fano encoding system is applied as demonstrated in Figure 2. Entropy, efficiency, ACL, CR, mean square error (MSE) and PSNR are determined using the following equations, which are utilized to measure the performance of a compression algorithm, where Pro_i, B_i, OR, CO, and MAX represent the probability of the i-th symbol, the length of the code word of the i-th symbol, the original image, the compressed image and the maximum value of the dataset, respectively. The encoded results of the array (C) are shown in Table 1, where E_i represents the encoded code word of the i-th symbol. The resulting bitstream of data set (C) is sent for decompression together with the symbols and their probabilities. It appears that Shannon-Fano saves ((8 − 2.48)/8) × 100 = 69% storage, whereas run-length coding can save no memory for the same data set. Thus, Shannon-Fano provides 69% better results than run-length coding for the data set, and the algorithm's efficiency is 92.298%.
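In terms of the symbols just defined, these metrics take their standard forms (for data set C they give H ≈ 2.29 bits, so Efficiency = H/ACL ≈ 92.3%, consistent with the value reported above):

```latex
H = -\sum_{i=1}^{N} Pro_i \log_2 Pro_i
\qquad
ACL = \sum_{i=1}^{N} Pro_i \, B_i
\qquad
\mathrm{Efficiency} = \frac{H}{ACL} \times 100\%
```

```latex
CR = \frac{\mathrm{size}(OR)}{\mathrm{size}(CO)}
\qquad
MSE = \frac{1}{mn}\sum_{x=1}^{m}\sum_{y=1}^{n}\bigl[OR(x,y) - CO(x,y)\bigr]^2
\qquad
PSNR = 10 \log_{10}\!\frac{MAX^2}{MSE}
```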

Shannon-Fano Decoding Style
In decoding, Shannon-Fano receives the encoded bitstream, the items and their corresponding probabilities. It builds a tree identical to Figure 2 based on the probabilities, and the following procedure is used for decoding. Finally, we get the same data list as array C.

1.
Read each bit from the encoded bitstream and traverse the tree until a leaf node is found. When a leaf node is discovered, read the symbol of the node as the decoded value; the process proceeds until the scanning of the encoded bitstream is finished.
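A minimal sketch of the encoder in Python follows (the function name is ours). It uses the common contiguous split of the sorted probability list, so the exact grouping, and hence the ACL, can differ slightly from the non-contiguous grouping shown in Figure 2; what is guaranteed either way is a prefix-free code with shorter codes for more probable symbols.

```python
def shannon_fano(probs):
    """probs: list of (symbol, probability). Returns {symbol: code}.
    Recursively splits the descending-sorted list at the point where the
    two parts' probability sums are as equal as possible."""
    codes = {}

    def split(group, prefix):
        if len(group) == 1:
            codes[group[0][0]] = prefix or "0"
            return
        total = sum(p for _, p in group)
        acc, cut, best = 0.0, 1, float("inf")
        for i in range(1, len(group)):
            acc += group[i - 1][1]
            diff = abs(total - 2 * acc)   # |left sum - right sum|
            if diff < best:
                best, cut = diff, i
        split(group[:cut], prefix + "0")  # left group gets 0
        split(group[cut:], prefix + "1")  # right group gets 1

    split(sorted(probs, key=lambda sp: -sp[1]), "")
    return codes
```

For the probabilities of data set C, the most probable symbol (7) receives the shortest code, and the resulting code is prefix-free, so a decoder can walk the implied tree bit by bit exactly as described above.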

Huffman Coding
Shannon-Fano coding sometimes produces the poorest code for some sets of probabilities because it cannot produce an optimal tree. David A. Huffman illustrated a coding procedure that consistently makes an optimal tree and tackled the issues that exist in Shannon-Fano coding, as reported in [47,48]. Shannon-Fano coding is a top-down methodology, whereas Huffman coding uses the reverse route, from the leaves to the root. Huffman coding uses the statistical information of an image, like Shannon-Fano coding. The encoding procedure of Huffman coding is given below. In addition, Figure 3 and Table 2 demonstrate a graphical representation of a Huffman tree and the outcomes based on the same data used in Shannon-Fano coding.

1.
List the probabilities of a gray-scale image in descending order.

2.
Form a new node of the tree with the sum of the two lowest probabilities on the list and rearrange them in the same order for the subsequent step. This process continues until the end.

3.
Assign 0 and 1 to each left and right branch of the tree, respectively.
Based on Table 2, we get [0000100101001000101 11000111110110001000110001111111101110101110 011001 10000000110000000000010101010010010101010100001] as the encoded bitstream for the data set (C), which is sent for decoding with the symbols and their corresponding probabilities. In this way, Huffman coding saves 71% memory space, which is 71% and 2% more than run-length and Shannon-Fano coding, respectively, and the efficiency of Huffman coding is 98.664%, which is 6.366% more than Shannon-Fano coding. Huffman coding provides an optimal prefix code. The decoder receives the encoded bitstream, the items and their corresponding probabilities and uses the following methodology for decompression, after which we get data identical to the original list (C):

1.
Recreate the equivalent Huffman tree built in the encoding step using the probabilities.

2.
Each bit is scanned from the encoded bitstream, traversing the tree node by node until a leaf node is reached. When a leaf node is discovered, the symbol of the node is emitted. This process proceeds until the bitstream is finished.
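The bottom-up merging in the three encoding steps above can be sketched with the standard heap-based construction (a sketch with our own helper name, not the paper's code). On the frequency counts of data set C it reproduces the 2.32-bit average code length implied by the 71% saving reported above, i.e., (8 − 2.32)/8 = 71%:

```python
import heapq

def huffman_codes(freqs):
    """freqs: {symbol: weight}. Returns {symbol: bitstring} for an optimal
    prefix code, repeatedly merging the two lowest-weight nodes."""
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)                       # tie-breaker so dicts never compare
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)   # two lowest-probability nodes
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}       # left branch: 0
        merged.update({s: "1" + c for s, c in c2.items()}) # right branch: 1
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

# Frequency counts of the 50-element data set C (probabilities x 50).
counts = {7: 21, 5: 13, 4: 4, 6: 4, 3: 3, 2: 3, 1: 2}
codes = huffman_codes(counts)
acl = sum(counts[s] * len(c) for s, c in codes.items()) / 50
print(acl)  # 2.32 bits per symbol
```

Any valid tie-breaking during the merges yields the same optimal total cost, so the 2.32-bit ACL is stable even though the individual code words may differ.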

Analysis of Huffman Coding
The main problem of Huffman coding is that it is very sensitive to noise: a minor change in any bit of the encoded bitstream can break the whole message, as reported in [28]. Assume that the decoder receives the items, probabilities and the encoded bitstream with only three altered bits, at positions 5, 19 and 54. Then, we get [2676657477775744747777777355773323 2255556555551] as the decoded values, where the bold elements (23 in total) indicate loss of data. In addition, it produces only 47 elements rather than 50 elements. Thus, it destroys ((23 + 3)/50) × 100 = 52% of the data.

Lempel-Ziv-Welch (LZW) Coding
Lempel-Ziv-Welch (LZW), invented by Abraham Lempel, Jacob Ziv, and Terry Welch, is generally used for lossless text compression. The strategy is easy to implement and broadly applied for Unix file compression; it was published in 1984 as an updated version of LZ78. It encodes a sequence of characters with unique codes using a table-based lookup algorithm. In this algorithm, the first 256 8-bit codes, 0-255, are inserted into a table as initial entries because an image contains up to 256 distinct pixel values, and the following codes run from 256 to 4095 and are appended to the bottom of the table. The algorithm works better in the case of text compression and often performs poorly on other kinds of data. The encoding procedure of the algorithm is shown as follows. Since the previously mentioned original list (C) contains only 7 different values (1-7), only 1-7 are inserted into the table as the initial dictionary. Applying the LZW encoding procedure on C, shown in Table 3, we get the encoded list that appears in Table 4. Finally, the encoded bitstream is sent to the decoder, where each piece of encoded data is converted into 6-bit binary because the biggest value in the encoded list is 33 and just 6 bits are required to represent 33. The average code length is 3.84, as appears in Table 4. Thus, LZW saves 36% memory, which is 28.7356% and 29.3103% more than Shannon-Fano and Huffman coding, respectively, for the same dataset. Furthermore, only the encoded bitstream is sent to the decoder for decompression.

LZW Decoding Procedure
The Lempel-Ziv-Welch (LZW) decoding procedure uses the same initial dictionary used in the encoding step, and decoding is done using the procedure shown below. For instance, for the mentioned encoded bitstream, each six bits are converted into a decimal value and 1-7 are assigned as the initial dictionary, as shown in Table 5. The decoding demonstration for the encoded data is shown in Table 6, and we get the same list as C after decoding.
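Both directions can be sketched compactly in Python, generic over the initial alphabet so the 1-7 dictionary of the example above can be passed in (the function names are ours, and this is a sketch rather than the paper's implementation). Note the special case in the decoder for a code that is not yet in the table, which arises for inputs of the form cScSc:

```python
def lzw_encode(data, alphabet):
    """Encode a sequence over `alphabet`; initial codes start at 1 and new
    table entries are appended after them, as in the worked example."""
    table = {(s,): i + 1 for i, s in enumerate(alphabet)}
    next_code = len(table) + 1
    out, cur = [], ()
    for sym in data:
        cand = cur + (sym,)
        if cand in table:
            cur = cand                    # keep extending the match
        else:
            out.append(table[cur])
            table[cand] = next_code       # learn the new string
            next_code += 1
            cur = (sym,)
    if cur:
        out.append(table[cur])
    return out

def lzw_decode(codes, alphabet):
    table = {i + 1: (s,) for i, s in enumerate(alphabet)}
    next_code = len(table) + 1
    prev = table[codes[0]]
    out = list(prev)
    for code in codes[1:]:
        # Special case: code not yet in the table (the cScSc pattern).
        entry = table.get(code, prev + (prev[0],))
        out.extend(entry)
        table[next_code] = prev + (entry[0],)
        next_code += 1
        prev = entry
    return out
```

On repetitive input the code stream comes out shorter than the input; on data with little repetition the table grows without paying off, which is the weakness discussed in the analysis that follows.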

Analysis of LZW Coding
Searching the dictionary is a major challenge in the LZW compression technique because it is complicated and time-consuming. Moreover, an image that does not carry much repetitive data cannot be reduced at all; LZW is good for reducing the size of files that carry more repeated data, as reported in [49,50].

Arithmetic Coding
Arithmetic coding is a lossless data compression procedure where a whole set of symbols is represented using a single code, as reported in [51,52]. It takes the probability data of a dataset and applies the following procedures for encoding, where N and CF indicate number and cumulative frequency, and UL, LL, LUL and LLL denote the upper, lower, last upper and last lower limits of the current range, respectively. The tag value is calculated using Equation (7). For the example shown in Figure 4, the LLL and LUL are 0.9551925 and 0.9551975; thus, the tag is 0.955195. The bitstream of the tag value is 001111000110111. Thus, the average code length is 15/10 = 1.5 bits and the compression ratio is 5.3333, where 15 is the length of the tag. Finally, the tag's bitstream, the symbols (4, 3, 1, 2), and their corresponding probabilities (0.5, 0.2, 0.2, 0.1) are sent to the decoder for decompression. When arithmetic coding is applied to data set (C), we get the [00000101100110101110100101111100101010111011110011 11110101011001011100110011100000000101011010010110110111100001011] bitstream from the provided tag. Thus, the average code length and compression ratio are 2.3000 bits and 3.4783, respectively, which saves 71.25% of storage. It appears that run-length, Shannon-Fano, Huffman and LZW coding use 44.7115%, 7.2581%, 6.5041% and 33.908% more memory than arithmetic coding.

Arithmetic Decoding Procedure
The decoding procedure of arithmetic coding receives the tag, the symbols and their corresponding probabilities; the tag is converted into its floating-point number and the following methodology is used for decoding. If the tag falls within the range of a symbol, that symbol is taken as the decoded value. The range (r) and new tag (NT) are calculated using Equations (8) and (9), respectively. The whole decoding procedure of the ten values is demonstrated in the following list using Figure 5, and we get the same list [2 3 4 3 4 4 4 1 4 1] as the original. Here, the floating value of the corresponding tag's bitstream is 0.955195.

1.
tag = 0.955195. Since 0.9 <= tag <= 1.0, the decoded value is 2 because the symbol 2 is in that range.

2.
NT1 = (tag − LL)/r = 0.55195, and it is between 0.5 and 0.7, so the decoded value is 3.
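The interval-narrowing and tag-rescaling steps above can be sketched with plain floating-point arithmetic (function names are ours). This is adequate for short sequences like the ten-symbol example; practical coders use integer renormalization to avoid precision loss on long inputs:

```python
def _intervals(probs):
    """Map each symbol to its cumulative sub-interval [a, b) of [0, 1)."""
    cum, lo = {}, 0.0
    for s, p in probs.items():
        cum[s] = (lo, lo + p)
        lo += p
    return cum

def ac_encode(data, probs):
    """Narrow [low, high) once per symbol; return the midpoint as the tag."""
    cum, low, high = _intervals(probs), 0.0, 1.0
    for sym in data:
        r = high - low
        a, b = cum[sym]
        low, high = low + r * a, low + r * b
    return (low + high) / 2

def ac_decode(tag, probs, n):
    """Undo the narrowing: pick the symbol whose range holds the tag,
    then rescale the tag into that range (NT = (tag - LL) / r)."""
    cum, out = _intervals(probs), []
    for _ in range(n):
        for s, (a, b) in cum.items():
            if a <= tag < b:
                out.append(s)
                tag = (tag - a) / (b - a)
                break
    return out

# Interval layout matching Figures 4/5: 4->[0,.5), 3->[.5,.7), 1->[.7,.9), 2->[.9,1).
probs = {4: 0.5, 3: 0.2, 1: 0.2, 2: 0.1}
tag = ac_encode([2, 3, 4, 3, 4, 4, 4, 1, 4, 1], probs)
print(tag)  # ~0.955195, the midpoint of LLL = 0.9551925 and LUL = 0.9551975
```

Decoding `tag` with `ac_decode(tag, probs, 10)` walks the same ranges in reverse, reproducing the step-by-step list above.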

Analysis of Arithmetic Coding Procedure
The authors in [34,35] state that arithmetic coding provides a better compression ratio. However, it takes so much time that it is virtually not usable for dynamic compression. Furthermore, its use is restricted by patents. On the other hand, though Huffman coding provides marginally less compression, it takes much less time to encode an image than arithmetic coding. This is why it is good for dynamic compression, as reported in [36,37]. Furthermore, an image encoded by arithmetic coding can be corrupted entirely by a single bit error because the method has very poor error resilience, as reported in [38,40]. Another problem is that an entire code word must be received before decoding of a message can start. In addition, the primary limitation of entropy coding is that it increases CPU complexity, as stated in [41,42]. Suppose the decoder receives the tag of the original 50 elements with only the first bit altered; then we get [65427777727567777477777717765757772472757571571711] as the decoded list, where the bold symbols indicate the altered values. In the list, 31 elements have been altered, which means (31/50) × 100 = 62% of the data have been corrupted.

Experimental Results and Analysis
The outcomes and analysis of the state-of-the-art methods are demonstrated in this section. The techniques have been applied to different types of benchmarked images. In this paper, we initially use three computer-generated photographs and then twenty-two medical images from the DICOM image dataset [53] of various sizes, shown in Figure 6. Encoding time, decoding time, average code length, compression ratio, PSNR and efficiency have been used to analyze the performance of the algorithms. The encoding and decoding times are the periods of time required to encode and decode an image. Average code length determines the number of bits used to store a pixel on average, and the compression ratio represents the ratio between the original and compressed images. Peak signal-to-noise ratio (PSNR) is used to measure the quality of an image. Lower encoding and decoding times, a shorter average code length and a higher compression ratio tell how much faster an algorithm is and how much less memory it uses. Higher efficiency and PSNR convey that an image contains high-quality information. The encoding time, decoding time, average code length, and compression ratio are shown in Tables 7-10, whereas Figures 7-11 show the graphical representation of encoding time, decoding time, average code length, compression ratio and efficiency, respectively, based on the twenty-five images.
Table 7 shows that arithmetic coding takes the highest time (4.0178 ms) on average and LZW the lowest (0.1054 ms), whereas run-length, Shannon-Fano and Huffman take 0.1349, 0.5873 and 0.2488 milliseconds, respectively, to encode the images. It appears that arithmetic coding uses 96.6424%, 85.3825%, 93.8076% and 97.3767% more time than run-length, Shannon-Fano, Huffman and LZW coding, respectively. However, Huffman coding uses much less time (0.0062) on average in decoding, whereas arithmetic coding uses the most time, as demonstrated in Table 8. On the other hand, LZW uses more decoding time than Shannon-Fano and Huffman coding but less than arithmetic and run-length coding. Figures 7 and 8 show the graphical representation of encoding and decoding time for comparison. Tables 9 and 10 show average code length and compression ratio, respectively. It appears that RLE uses 10.5618 bits per pixel on average, which is 24.2553% more memory than the original images use, which is the reason it is not used directly for real image compression. On the other hand, LZW uses the lowest number of bits (5.9365) per pixel, but the problem with LZW is that it sometimes uses more memory than the original, which happened for image 21, as shown in Table 9.
Arithmetic coding uses the second lowest number of bits per pixel on average. Thus, arithmetic coding is the best coding technique, because it provides a better compression ratio than the other state-of-the-art techniques apart from LZW, as shown in Table 10. All the state-of-the-art strategies are lossless; thus, the peak signal-to-noise ratio and mean squared error (MSE) for each algorithm are inf and zero, respectively, in every case. However, arithmetic and run-length coding on average have the highest (99.9899) and lowest (58.6783) efficiency among the methods, as shown in Figure 11. The efficiency of LZW coding sometimes provides better outcomes and sometimes provides absolutely terrible outcomes, which is why it is not used for image compression in real applications. The list of the decompressed images is shown in Figure 12. From the previously mentioned perspectives, it can be concluded that arithmetic coding is the best option when more compression is required; however, it is not useful for real-time applications because of the additional time taken in the encoding and decoding steps. Searching in a dictionary is a big challenge for LZW coding, and it provides the worst results for image compression. Shannon-Fano coding sometimes does not provide optimal codes and can provide two different codes for the same element, which is the reason it is obsolete now. Run-length coding is not good for straightforward real image compression.
Thus, it can be concluded that Huffman coding is the best algorithm for recent technologies among the state-of-the-art lossless methods mentioned, used in various applications. However, if the encoding and decoding times of arithmetic coding can be decreased, then it will be the best algorithm. On the other hand, Huffman coding will perform even better if its average code length can be decreased while keeping the same encoding and decoding times. In this article, all the experiments are done using C and Matlab (version 9.

Conclusions
In this study, we presented a detailed analysis of some common lossless image compression techniques: run-length, Shannon-Fano, Huffman, LZW and arithmetic coding. The relevance of these techniques comes from the fact that most of the other recently developed lossless (or lossy) algorithms use one of them as a part of their compression procedure. All the mentioned algorithms have been discussed using a common numeric data set. Both computer-generated and actual medical images were used to assess the efficiency of these state-of-the-art methods. We also used standard metrics such as encoding time, decoding time, average code length, compression ratio, efficiency and PSNR.

Figure 1 .
Figure 1. General block diagram of an image compression procedure.
Figures 9 and 10 demonstrate the graphical representation of average code length and compression ratio separately for comparison.

Figure 7 .
Figure 7. Encoding time comparison of the images.

Table 1 .
The results of Shannon-Fano encoding procedure.

Table 4 .
Average code length and compression ratio.
IF (NC is not found in the table)
    Assign the translation of FEV to DS and perform DS = DS + NC
ELSE
    Assign the translation of NC to DS, the first code of DS to NC, NC to FEV, and add FEV + NC into the table.
Furthermore, send DS to the output.

Table 6 .
The decoding procedure of LZW coding
Figure 8. Decoding time comparison of the images.

Table 9 .
Comparison of average code length.
Figure 9. Average code length comparison of the images.

Table 10 .
Comparison of compression ratio.