This section discusses the results of encrypting ASCII data with the proposed block cipher in both ECB and CFB modes. Randomness of the resulting ciphertext is measured using the statistical tools discussed in
Section 2.4, as well as Shannon entropy and Law of the Iterated Logarithm calculations. A detailed discussion of diffusion follows, including measurements of decay and security concerns with repeated plaintext. Lastly, the memory and computational requirements of the cryptosystem modes are compared with those of AES to illustrate that the proposed block cipher is a better option for IoT devices. All comparisons include metrics on the GEF stream cipher variation [
23] as an additional baseline. Confusion and other cryptographic properties generally offered by block ciphers are not included in this analysis because the proposed cryptographic algorithm does not include techniques to introduce them. The tools used to validate the results in this section do not guarantee security against intelligent attacks. They offer a baseline measurement of strength against generic statistical attacks such as frequency analysis.
4.1. Entropy Analysis
A sample of 1,000,000 characters was used as the plaintext in a MATLAB implementation of this cryptographic algorithm. All punctuation except spaces was removed, and the text was converted to upper case. The frequency of characters is displayed in a histogram in
Figure 4.
A single-stage RNS PRNG was used for matrix construction, generating values in GF($2^8$). The matrix was sized
. The distribution of the ciphertext from ECB mode is shown in
Figure 5. The mean of the histogram of ciphertext is 3906.25, with one standard deviation of 62.56. The expected mean of the ciphertext in GF($2^8$) is 1,000,000/256 = 3906.25, which matches the observed mean. The Shannon entropy is 7.9998 bits, where the maximum entropy for this system is 8 bits, meaning the ciphertext is almost uniformly distributed.
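The histogram statistics and Shannon entropy reported above can be reproduced with a short script. The following is a minimal Python sketch rather than the MATLAB code used for the experiments; the function names are illustrative, and the sample input is a synthetic uniform byte sequence, not output of the cipher.

```python
import math
from collections import Counter

def shannon_entropy_bits(data: bytes) -> float:
    """Shannon entropy of the symbol histogram, in bits per symbol."""
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def histogram_stats(data: bytes, symbols: int = 256):
    """Mean and population standard deviation of the per-symbol counts."""
    counts = Counter(data)
    bins = [counts.get(s, 0) for s in range(symbols)]
    mean = sum(bins) / symbols
    variance = sum((b - mean) ** 2 for b in bins) / symbols
    return mean, math.sqrt(variance)

# Synthetic stand-in for ciphertext (assumption): every byte value appears
# exactly four times, so the histogram is perfectly flat.
sample = bytes(range(256)) * 4
entropy = shannon_entropy_bits(sample)
mean, std = histogram_stats(sample)
```

For any input of N symbols over 256 bins, the mean count is N/256 by construction, so the standard deviation and the entropy are the quantities that reflect how flat the histogram is.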
Table 1 displays the results of the NIST STS after analyzing the generated ciphertext from this implementation. The test suite was provided 1,000,000 characters, from which 1000 bitstreams, each 1000 bits long, were chosen randomly. With the significance level set to 0.001, at least 980 sequences needed to pass a specific test for the ciphertext to be considered seemingly random. Each test relevant to the cryptosystem is displayed in the table, and the results indicate an overall success. For more information regarding what the individual tests examine, refer to the documentation on NIST’s website [
28].
The same setup was then run using CFB mode for encryption. The distribution of the resulting ciphertext is displayed in
Figure 6. The Shannon entropy is 7.99
bits, which is essentially equivalent to the maximal entropy. The mean of the histogram in GF(
) for the ciphertext is 625,000 since CFB mode produces
n times more ciphertext than ECB mode, and the standard deviation is 128.65. These values indicate that the ciphertext produced in CFB mode is nearly uniformly distributed. The ciphertext was analyzed using NIST’s test suite with the same parameters as the standard mode. The results are also displayed in
Table 1.
The stream cipher variation of the GEF cryptosystem from [
23] was also implemented in MATLAB and used to encrypt the same plaintext as ECB mode and CFB mode, resulting in a Shannon entropy of 7.9998 bits. The distribution, shown in
Figure 7, indicates a mean of 3906.25 and standard deviation of 63.56.
As expected, the stream and ECB modes of the cryptographic algorithm were nearly equivalent in their distribution. However, CFB mode’s results indicated a more secure option due to the higher entropy and a lower standard deviation relative to a substantially larger mean. Based on the results of the NIST test suite, all three variations of the algorithm produce seemingly random output since each test passed the threshold of 98%. ECB mode outperformed the other two modes in the context of these tests.
4.2. The Law of the Iterated Logarithm
While the NIST test suite validates the randomness of a cryptosystem’s output, the Law of the Iterated Logarithm (LIL) provides another level of confidence in the randomness of the ciphertext. Previous research indicates that some cryptographically weak systems have passed the NIST test suite, but did not pass the LIL test [
24,
30]. The test analyzes the variance of a pseudorandom string by looking at the reduced number of 1s after a large output capture. Generally, the provided data is on the order of
to
captured outputs. To pass the LIL test, the variance must converge to within the bounds [−1, 1] while still maintaining a wide distribution within those bounds. The tool referenced in [
30] requires a sample ciphertext size of at least 62.5 Megabytes to accurately measure the variance. The output stream is randomly broken into smaller samples upon which three statistical metrics of distance are calculated: total variation, Hellinger, and root-mean-square deviation. If the distance values calculated from the sample ciphertext are less than accepted thresholds dependent on the sample size, the test is considered a success.
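The quantities involved in the LIL test can be sketched in a few lines. The Python below is an illustration only, not the tool from the cited work: it uses the standard definition of the reduced number of 1s, S* = (2·S(n) − n)/√n, normalized by √(2 ln ln n), together with textbook definitions of the three distance metrics. The acceptance thresholds and the sampling procedure of the referenced tool are not reproduced.

```python
import math

def s_lil(bits):
    """LIL statistic of a 0/1 sequence; requires len(bits) >= 3 so ln ln n > 0."""
    n = len(bits)
    ones = sum(bits)
    s_star = (2 * ones - n) / math.sqrt(n)  # reduced number of 1s
    return s_star / math.sqrt(2 * math.log(math.log(n)))

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def hellinger(p, q):
    """Hellinger distance between two discrete distributions."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))

def rmsd(p, q):
    """Root-mean-square deviation between two discrete distributions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)) / len(p))

# A perfectly balanced sequence sits at the center of the [-1, 1] band,
# while a constant sequence falls far outside it.
balanced = s_lil([0, 1] * 500)
constant = s_lil([1] * 1000)
```

In a full test, `s_lil` would be evaluated on many sampled substrings, and the empirical distribution of the results would be compared against the reference distribution using the three distances.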
Figure 8 shows the results of the LIL test on the ECB mode and
Figure 9 displays the results of the CFB mode. Both of the modes pass all three distance metrics while staying within the bounds [−1, 1] with few deviations outside. As shown in the figures, a wide distribution of variance is maintained within the bounds indicated by the dashed red line.
4.3. Diffusion Analysis
Two properties used in the operation of a symmetric cryptosystem are diffusion and confusion. When employed correctly, these work together to prevent statistical tests and other analysis methods from finding meaningful information in ciphertext. Diffusion means that a change in plaintext should propagate throughout the resulting ciphertext. In an ideal cryptosystem, a change in a single bit of plaintext should result in at least 50% of the ciphertext changing. Confusion means each bit of ciphertext should depend on several parts of the key, obscuring the relationship between the two. These properties are most commonly achieved through permutation and substitution operations, respectively. The block cipher described in this paper only employs methods of diffusion.
In the stream cipher version of the GEF cryptosystem, when a bit change occurs in a plaintext word or key generated by a PRNG, the resulting diffusion is isolated to the corresponding ciphertext word, as shown in
Figure 10. Given a repeated plaintext word, the probability that a uniformly distributed PRNG, operating in GF(
), will generate the same key for encryption is
. While the probability is relatively small, identical ciphertext words can occur given a large sample of plaintext. Intelligent attacks can then use repeated ciphertext as a clue for retrieving the plaintext. To solve this problem, the stream cipher can operate in a different mode of operation that uses previous ciphertext as an encryption parameter, such as Counter (CTR) or Output Feedback (OFB) mode. In this case, diffusion would propagate throughout the remaining ciphertext. Bit errors also need to be considered when deciding which mode to operate in. In the standard stream cipher mode, a bit error would only result in a single ciphertext word changing, whereas an error in CTR or OFB mode would propagate through the remaining ciphertext.
Diffusion in block ciphers is different as a change in the plaintext or key results in a change in the entire block. Within the ECB mode, if a bit change in a plaintext word at index
i occurs, diffusion only occurs within the corresponding ciphertext words from index
i down to 0, since the combination technique uses a dot product with a triangular matrix.
Figure 10 shows an example of the diffusion propagation. Any change in the first
n plaintext words will only result in propagation throughout the first block of ciphertext. The change does not propagate between blocks. In traditional block ciphers such as AES or 3DES operated in ECB mode, the lack of propagation between blocks introduces security issues, as a repeated message block generates the same output [
31]. However, in the proposed system, the use of a PRNG with sufficiently large
k (
k ≥ 8) to generate the key matrix mitigates this concern, as two identical blocks will not be encrypted with the same key. If, for some reason, a prime
p is chosen for the field order, the key matrix need not be triangular so long as the determinant is relatively prime to
p. In this case, a single bit change between two otherwise identical plaintext words encrypted with identical key matrices is expected to result in a substantially changed ciphertext block. On the other hand, a low rate of diffusion between plaintext blocks means bit transmission errors only propagate as far as the corresponding ciphertext block.
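The direction of propagation within a block can be illustrated with a toy example. The sketch below is not the proposed cipher: it assumes arithmetic modulo a hypothetical prime p = 257 in place of the extension-field operations used in the paper, and the key and plaintext values are arbitrary. It shows only the structural effect of multiplying by an upper-right triangular matrix, where a change at plaintext index i reaches ciphertext indices i down to 0.

```python
P = 257  # hypothetical prime modulus, chosen only for this illustration

def encrypt_block(key, block, p=P):
    """Matrix-vector product modulo p: c[j] = sum_i key[j][i] * block[i]."""
    n = len(block)
    return [sum(key[j][i] * block[i] for i in range(n)) % p for j in range(n)]

def changed_indices(a, b):
    """Indices at which two equally sized word vectors differ."""
    return [j for j, (x, y) in enumerate(zip(a, b)) if x != y]

# Arbitrary upper-right triangular key with nonzero entries on and above
# the diagonal (assumed values, not generated by an RNS PRNG).
key = [[3, 5, 7, 11],
       [0, 2, 9, 4],
       [0, 0, 6, 8],
       [0, 0, 0, 10]]
block = [65, 66, 67, 68]
reference = encrypt_block(key, block)

def affected_by_change(index):
    """Ciphertext indices that change when one plaintext word is incremented."""
    modified = list(block)
    modified[index] = (modified[index] + 1) % P
    return changed_indices(reference, encrypt_block(key, modified))

low = affected_by_change(0)   # [0]
mid = affected_by_change(1)   # [0, 1]
high = affected_by_change(3)  # [0, 1, 2, 3]
```

A change in the last word of the block reaches every ciphertext word, while a change in the first word stays confined to a single ciphertext word.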
Table 2 displays the Shannon diffusion averaged from 1000 samples for different block sizes. The row indicates the block size
n and the column marks the index
i of the plaintext word that was changed. As mentioned earlier, using an upper-right triangular matrix results in a lower rate of diffusion when a change occurs in a lower index. As the block size increases, the overall diffusion decreases with a change in index 1 since this does not propagate throughout blocks 2 through
n. However, diffusion reaches the targeted 50% in every scenario when the change occurs at index
n. With smaller block sizes, such as
n = 4, a system can still meet an acceptable rate of diffusion.
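A diffusion rate of this kind can be measured by comparing two ciphertext blocks bit by bit. The sketch below assumes that the rate is the fraction of ciphertext bits that differ between the original and modified encryptions, with 8-bit words; the exact definition behind Table 2 may differ, and the function name is illustrative.

```python
def bit_diffusion(c1, c2, word_bits=8):
    """Fraction of differing bits between two equally sized ciphertext blocks."""
    if len(c1) != len(c2):
        raise ValueError("blocks must have the same number of words")
    differing = sum(bin(a ^ b).count("1") for a, b in zip(c1, c2))
    return differing / (len(c1) * word_bits)

# Two-word example: the first word flips all 8 bits, the second is unchanged,
# so half of the 16 bits differ.
rate = bit_diffusion([0x00, 0xFF], [0xFF, 0xFF])
```

Averaging this value over many random plaintext and key pairs, with a single bit flipped at a fixed plaintext index, would give one entry of a table organized like Table 2.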
Since the CFB mode reuses the ciphertext as a variable in the next iteration of encryption with only a single new plaintext value, diffusion propagates from the current block to further blocks before eventually decaying. How far the change propagates depends on which word in the plaintext vector is modified: the lower its position in the vector, the farther the change propagates. This propagation benefits diffusion when identical blocks differing by a single bit are encrypted. The plaintext word would be encrypted through
n blocks and contribute to increased diffusion in following blocks. However, an increase in diffusion indicates a larger impact when errors occur. A single bit error would result in incorrect plaintext, following decryption, in each block after the location of the error. Since the ciphertext vector shifts out a word with each iteration of encryption, the diffusion eventually decays. An example of this can be seen in
Figure 10. The change made in plaintext word one does not propagate to the second ciphertext block, the change made in plaintext word two only propagates to ciphertext blocks one and two, and so forth.
Further research on triangular matrices as a key proved that the inverse of a lower-left triangular matrix with entries from a finite field can be computed using Algorithm 1. The direction of diffusion in the ciphertext vector is completely dependent upon the type of triangular matrix used as a key since the operator used for encryption is a dot product. In an upper-right triangular matrix, ciphertext words at index
i are a result of
plaintext combinations; in a lower-left matrix, ciphertext words at index
i are a result of
i plaintext combinations. Diffusion propagates to lower index words in an upper-right matrix and the opposite with a lower-left matrix. To achieve a constant rate of diffusion while maximizing entropy, a system designer could alternate key construction between an upper-right and lower-left triangle. Alternatively, a synchronized PRNG between the sending and receiving devices could make this decision, as shown in
Figure 11. By alternating, diffusion need not be restricted to one direction when propagating throughout the resulting ciphertext. Implementing this process does require maintenance of another synchronized PRNG between the transmit and receive devices.
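Inverting a lower-left triangular key can be done by forward substitution, one column at a time. The sketch below is a generic illustration and is not Algorithm 1 from this paper: it assumes arithmetic modulo a hypothetical prime p rather than the extension-field arithmetic of the cryptosystem, and it requires Python 3.8 or later for the modular inverse via `pow(x, -1, p)`.

```python
def invert_lower_triangular(L, p):
    """Inverse of a lower-left triangular matrix modulo a prime p.

    Solves L * X = I column by column; every diagonal entry of L must be
    nonzero modulo p.
    """
    n = len(L)
    inv = [[0] * n for _ in range(n)]
    for col in range(n):
        for row in range(col, n):
            rhs = 1 if row == col else 0
            # Contribution of the already solved entries of this column.
            acc = sum(L[row][k] * inv[k][col] for k in range(col, row)) % p
            inv[row][col] = (rhs - acc) * pow(L[row][row], -1, p) % p
    return inv

def mat_mul_mod(A, B, p):
    """Matrix product modulo p, used here only to check the inverse."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]

# Arbitrary example key (assumed values) over the hypothetical modulus 257.
L_key = [[2, 0, 0],
         [3, 5, 0],
         [7, 11, 13]]
L_inv = invert_lower_triangular(L_key, 257)
identity = mat_mul_mod(L_key, L_inv, 257)
```

The inverse is itself lower-left triangular, so decryption preserves the direction of propagation. An upper-right triangular key could be handled the same way by running the substitution from the last row upward.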
Figure 12 shows a comparison of diffusion rate when encrypting with different types of keys. The black line represents the diffusion rate when encrypting with upper-right triangular keys, the red line represents lower-left triangular keys, and the blue line represents an alternating key structure. The dashed line indicates the targeted diffusion rate of 50%. The displayed percentages are averaged over 1000 samples, where the x-axis indicates the index of the plaintext word modified by a single bit in each block. As expected, upper-right triangular keys result in higher diffusion at greater indices, whereas lower-left triangular keys produce the opposite behavior. Alternating between the two every other encryption operation results in a constant rate of diffusion. Note that these measurements are based on a single bit change per block. To achieve a targeted Shannon diffusion rate of 50%, more changes can be induced in the plaintext.
4.4. Memory and Computation Requirements
In this subsection, we analyze the memory and computation requirements for the ECB, CFB, and stream modes. Scalability is considered, and parameters for real-world applications are recommended. The use of the PRNG is relevant, as maintaining this structure contributes to the overall memory space used and computation time.
Memory and computation requirements for ECB mode of the cryptosystem are substantially less than CFB. For a matrix sized , the total memory necessary to construct all key matrices in GF() is given by where . Each plaintext word is encrypted exactly once in ECB mode. Overall, the mode requires only encryption or decryption operations to build the ciphertext or retrieve the plaintext.
In CFB mode, the option for an initialization vector (IV) is provided, and the IV can be generated by the key PRNG. Without one, every plaintext word in the first block at index
i, from 1 to
n, is encrypted
i times. Every subsequent plaintext word is encrypted
n times. The use of an initialization vector ensures every value of plaintext is encrypted
n times. With the inclusion of an IV, the number of PRNG outputs needed to construct sufficient key matrices is given by
. The amount of ciphertext generated is given by
. The total ciphertext generated is
n times more than in the standard mode, but this yields only a very small increase in entropy and a more uniformly distributed output, as seen in
Section 4.1.
While in ECB or stream mode, the amount of memory generated by the encryption process is equivalent to the amount of plaintext. However, the stream cipher requires only as many PRNG outputs as there are plaintext words. The trade-off here is between diffusion rate and PRNG maintenance. CFB mode generates substantially larger amounts of ciphertext than the given plaintext. A maximum practical size for an IoT caliber platform is likely on the order of GF() with n = 16 or 32. Keep in mind that during the encryption and decryption operations, all values are extended to a higher field, requiring additional space to temporarily store these values. The PRNG outputs used to construct the matrices can be recycled as soon as they are used, but the computational requirements of this cryptosystem need to be considered.
4.5. Hardware Validation
To measure the speed and energy efficiency of this cryptosystem, an implementation was built on a TI MSP430FR5994. The microcontroller chosen to implement the proposed system represents many modern IoT devices, as the device only has 256 KB of FRAM, 8 KB of RAM, and a 16 MHz processor. The microcontroller was chosen to remain consistent with the device used in the stream cipher implementation [
23]. The program required 3% of the device’s RAM and 9% of the available FRAM when fully optimized by the compiler. To measure scalability, the cryptosystem was set up to encrypt ASCII text, GF(
), with parameter
n = 4. The key matrix was generated by the same single-stage RNS PRNG used in the CPU implementation, with a period of
14,429,764,351.
The TI microcontroller has internal software called EnergyTrace that can be used to provide time and energy measurements of a section of code. Using EnergyTrace, time and energy usage during encryption and decryption operations were recorded and averaged from 1000 samples in both ECB and CFB modes, as well as AES. These results are compiled in
Table 3. The measured results for the stream cipher’s encryption are taken from previous work [
23]. Details on the decryption operation were not provided and are not displayed in the table, but the time and energy required would be about the same. The use of the EnergyTrace program slightly increases the recorded time and energy consumption due to the capturing process running in the background.
ECB mode required substantially less time and energy for both encryption and decryption than AES, but encrypted a smaller block size. The CFB mode implementation started with a randomly generated initialization vector and encrypted the same plaintext as the ECB experiment. Encrypting all the plaintext required n times more operations, contributing to larger time and energy requirements. Decryption operations involve calculating the adjoint of a matrix. The algorithm to do so is almost equivalent to calculating the inverse of a matrix, which has a complexity of . As a result, decrypting the ciphertext in both ECB and CFB modes takes much longer and requires more energy than encrypting. The system does not scale well on IoT devices when considering values of n and k that would encrypt equivalent amounts of data per block compared to AES. However, the system can encrypt and decrypt more efficiently when operating on smaller block sizes, such as n = 4.