1. Introduction
For a cryptographic hash function to be considered secure, a variety of properties, such as collision resistance, preimage resistance, and pseudo-randomness, are desired. A hash function collision occurs when two distinct inputs produce the same digest. If it is hard to find two such inputs, the function is considered collision-resistant [1]. This property implies second-preimage resistance, which itself implies preimage resistance, and therefore collision resistance is often considered a primary measure of hash function security.
The Strict Avalanche Criterion (SAC) is a measure of diffusion, a property which describes how well a cryptographic transformation dissipates patterns and statistical structure from its input [2]. A hash function with low diffusion leaves some pattern of the input correlated with the output. Such a pattern could potentially be exploited to discover collisions in the function, and therefore high diffusion is associated with high collision resistance.
The hash function SHA-256, standardized by NIST in FIPS 180-2 and 180-4 [3,4], has previously been measured for the SAC [5]. That measurement did not apply any method to reduce false positives stemming from multiple comparisons. Here, SHA-256 is measured for the SAC using the Bonferroni Method in order to limit such false positives.
Contributions
This work is a direct extension of the work done by Vaughn and Borowczak [5], in that the SAC of all sub-function variant combinations of the SHA-256 compression function is measured. The threshold at which a sub-function combination is said to exhibit the SAC is relaxed via the Bonferroni Method. Previous work in this area [5] only tested sub-function combinations with six elements, known as sub-function-removed variants. This work expands the testing to all 127 non-vacuous sub-function combinations.
The unmodified SHA-256 function is initially tested for SAC across the 64 rounds of compression. Results of this measure lead to the consideration that additional operation of highly diffusive sub-functions might have a diffusion-dampening effect. Next, the round at which each sub-function combination exhibits SAC, if at all, is compared amongst all combinations, and trends in the round at which the SAC plateaus are examined. Additionally, the SAC of each combination of sub-functions is compared to the SAC of its component sub-functions. This is done in order to measure the diffusion which stems not from the individual sub-functions directly, but rather from the overlap and combination of sub-functions. The ratio of measured to expected diffusion, which defines either a dampening or amplifying effect on diffusion, is explored.
2. Strict Avalanche Criterion
A cryptographic transformation is complete if each bit of output depends on all bits of input [6,7]. Specifically, for a cryptographic transformation $f: \{0,1\}^n \to \{0,1\}^m$, if for all $1 \le i \le n$ and $1 \le j \le m$ there exist plaintext vectors $X$ and $X_i$ which differ only at position $i$, such that ciphertext vectors $f(X)$ and $f(X_i)$ differ at position $j$, then the transformation $f$ is complete.
The Avalanche Effect is exhibited by the transformation if on average half of the output bits are flipped when a single input bit is flipped. An exhaustive test of all plaintext vectors is needed in order to thoroughly test this effect, though in practice a sufficiently large sample size $k$ is chosen, and $k$ plaintexts are tested via the following procedure. A plaintext vector $X$ is randomly chosen, and $n$ vectors $X_i$ are calculated such that each is a copy of $X$ except at position $i$, which is inverted, for $1 \le i \le n$. The avalanche vector $V_i = f(X) \oplus f(X_i)$ is calculated for each $i$, where ⊕ is the XOR function. If approximately half of the avalanche vectors' elements equal 1, then the transformation exhibits the Avalanche Effect.
The Strict Avalanche Criterion (SAC) is a combination of both completeness and the Avalanche Effect, where “each output bit should change with probability of one half whenever a single input bit is complemented” [7]. The procedure for measuring the SAC is initially described by Webster and Tavares [7]. For a cryptographic transformation $f: \{0,1\}^n \to \{0,1\}^m$, SAC measurements output an $n \times m$ matrix $A$. First, each element of $A$ is initialized to 0. For a large choice of $k$, a plaintext $X$ is chosen and ciphertext $Y = f(X)$ calculated. For each bit $i$ of the plaintext, a modified plaintext $X_i$ is generated, where position $i$ is flipped, for $1 \le i \le n$. Each modified plaintext results in a modified ciphertext $Y_i = f(X_i)$, which is combined with the original ciphertext to generate an associated avalanche vector $V_i = Y \oplus Y_i$. Each avalanche vector is added to the matrix such that the element of $V_i$ at position $j$ is added to row $i$, column $j$ of $A$, for $1 \le j \le m$. After this process is repeated $k$ times, each value in $A$ is divided by $k$. In order to meet the SAC, each element in $A$ should equal approximately $0.5$.
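The measurement procedure can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this work: the names `sac_matrix` and `toy_hash` are hypothetical, and the transformation under test is a 16-bit-input wrapper around the standard library's SHA-256, chosen only so that the example runs quickly.

```python
import hashlib
import random


def sac_matrix(f, n_bits, m_bits, k, seed=0):
    """Estimate the n x m SAC matrix of f from k random plaintexts."""
    rng = random.Random(seed)
    counts = [[0] * m_bits for _ in range(n_bits)]
    for _ in range(k):
        x = rng.getrandbits(n_bits)
        y = f(x)
        for i in range(n_bits):
            # Avalanche vector for input bit i: original XOR modified output.
            v = y ^ f(x ^ (1 << i))
            for j in range(m_bits):
                counts[i][j] += (v >> j) & 1
    # Divide every accumulated count by k to obtain flip frequencies.
    return [[c / k for c in row] for row in counts]


def toy_hash(x):
    """Hypothetical 16-bit-in, 256-bit-out transformation (assumption)."""
    digest = hashlib.sha256(x.to_bytes(2, "big")).digest()
    return int.from_bytes(digest, "big")


matrix = sac_matrix(toy_hash, n_bits=16, m_bits=256, k=200)
```

With only 200 plaintexts the individual entries are noisy; the sample size governs how tightly each entry can be expected to sit around 0.5.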
Statistical Testing
Previous SAC measurement on SHA-256 [5] utilizes simple binomial testing. A binomial test is a statistical hypothesis test used to compare a real-world measurement with a binary outcome, such as success or failure, to a hypothesized probability of that outcome. In the case of previous SHA-256 SAC measurements, the claim that SHA-256 exhibits SAC was taken as the null hypothesis $H_0$, and the data was tested at 99% confidence, or $\alpha = 0.01$. The hypothesis $H_0$ was rejected when 0.5 was not contained in the resulting confidence interval.
However, this method is likely to falsely reject $H_0$ given the large number of values tested here, $d = 512 \times 256 = 131{,}072$. This is known as the multiple testing problem [8]. The Bonferroni Method is commonly used to deal with this problem. For $d$ hypothesis tests, $H_{0i}$ versus $H_{1i}$ for $i = 1, \dots, d$, where $p_1, \dots, p_d$ are the associated p-values for each test, the Bonferroni Method rejects the null hypothesis $H_{0i}$ if $p_i < \alpha/d$ [8]. Bonferroni is a rather conservative method in the sense that it introduces additional likelihood of false acceptance of the null hypothesis. For this reason, SAC results tested with this method are described as relaxed. Bonferroni is chosen due to this conservatism, such that any results which show a sub-function combination not exhibiting the SAC are likely to be true. The choice of Bonferroni also allows for calculation of a strict threshold, for which a combination is said to exhibit or not exhibit the SAC.
The Bonferroni Method at confidence level $1 - \alpha$ results in a per-test significance level of $\alpha/d$. As either too many bit-flips or too few will fail the SAC, the test is two-tailed and therefore each tail is allotted $\alpha/(2d)$. This value can be input into an inverse Cumulative Distribution Function to obtain the number of standard deviations $z$ that must be exceeded to reject the null hypothesis $H_0$. For the given values, the standard deviation of a SAC flip frequency after $k = 1{,}000{,}000$ trials can be calculated as $\sigma = \sqrt{0.5 \cdot 0.5 / k} = 0.0005 = 0.05\%$ [8]. Combining these values as $z \cdot \sigma$, the allowed deviation for achieving SAC is calculated to be 0.00266 (that is, $z \approx 5.32$), and the threshold for achieving SAC is 50% ± 0.266%.
3. SHA-256
NIST standardized the SHA-2 family of cryptographic hash functions as the Secure Hash Standard in 2002 [3]. Since then, SHA-256 has become a common choice of cryptographic hash function. In 2015 the SHA-3 family was additionally standardized [9], though SHA-3 does not supersede SHA-2 and therefore SHA-256 is still commonly used in practice [10].
The SHA-256 hash function receives an input message $w$ of arbitrary length $l$ bits and outputs a 256-bit hash. To calculate this hash, the message must be split into blocks of 512 bits. Therefore, a single ‘1’ bit is appended, and then the smallest number of ‘0’ bits is appended such that the new message length is congruent to 448 modulo 512. The original length $l$, represented as a 64-bit integer, is then appended so that the total length is a multiple of 512. This padding protocol results in a unique padded message for each original message $w$.
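A byte-oriented sketch of this padding rule is given below. The function name `sha256_pad` is illustrative, and the sketch assumes the message is a whole number of bytes, which the standard does not require.

```python
def sha256_pad(message: bytes) -> bytes:
    """Pad a byte string to a multiple of 64 bytes (512 bits)."""
    bit_length = len(message) * 8
    # 0x80 is the single '1' bit followed by seven '0' bits.
    padded = message + b"\x80"
    # Zero-fill until the length is 56 mod 64 bytes (448 mod 512 bits).
    padded += b"\x00" * ((56 - len(padded)) % 64)
    # Append the original bit length as a 64-bit big-endian integer.
    return padded + bit_length.to_bytes(8, "big")
```

A 55-byte message still fits in one block, while a 56-byte message spills into a second block because the length field no longer fits.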
Each 512-bit block $M_i$, where $1 \le i \le N$ for a padded message of $N$ blocks, is run through SHA-256’s compression algorithm. The first step of this algorithm is to apply the Message Scheduler to $M_i$, which lengthens the block to 2048 bits, best represented as a length-64 array of 32-bit integers. One element of this array is then fed into the compression function in each of the 64 compression rounds. The output of the compression function is then used either as an initial parameter for the compression of the next block $M_{i+1}$, or as the final hash when $M_i$ is the last block of input.
3.1. Message Scheduler
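The Message Scheduler is a function which takes as input a 512-bit block. The block is split into sixteen 32-bit integers $W_t$, for $0 \le t \le 15$. Using the recurrence relation $W_t = \sigma_1(W_{t-2}) + W_{t-7} + \sigma_0(W_{t-15}) + W_{t-16}$, for $16 \le t \le 63$, additional words are created. The Message Scheduler expands the input such that it can be inserted into each of the 64 rounds of compression, while also introducing diffusion via the sigma functions and the carry operation of integer addition.
The sigma functions are defined as $\sigma_0(x) = \mathrm{ROTR}^{7}(x) \oplus \mathrm{ROTR}^{18}(x) \oplus \mathrm{SHR}^{3}(x)$ and $\sigma_1(x) = \mathrm{ROTR}^{17}(x) \oplus \mathrm{ROTR}^{19}(x) \oplus \mathrm{SHR}^{10}(x)$, where $\mathrm{ROTR}^{n}$ is a right rotation and $\mathrm{SHR}^{n}$ a right shift by $n$ bits. These functions provide an additional layer of diffusion, while preventing internal collisions [1].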
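The message schedule of FIPS 180-4 can be sketched as follows. The helper names (`rotr`, `small_sigma0`, `small_sigma1`, `message_schedule`) are illustrative and are not taken from the implementation used in this work.

```python
MASK = 0xFFFFFFFF  # keep every word within 32 bits


def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def small_sigma0(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)


def small_sigma1(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)


def message_schedule(block: bytes):
    """Expand one 64-byte block into sixty-four 32-bit words."""
    assert len(block) == 64
    # The first sixteen words are the block itself, read big-endian.
    w = [int.from_bytes(block[i:i + 4], "big") for i in range(0, 64, 4)]
    # The remaining words follow the recurrence; addition is modulo 2^32.
    for t in range(16, 64):
        w.append((small_sigma1(w[t - 2]) + w[t - 7]
                  + small_sigma0(w[t - 15]) + w[t - 16]) & MASK)
    return w
```

Replacing the additions in the recurrence with XOR would remove the carry propagation, which is the kind of swap used for the Integer Addition sub-function in the variants discussed later.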
3.2. Compression Function
The compression function inputs the 64-element, 32-bit integer array output by the Message Scheduler. Throughout operation, the compression function calls six sub-functions, which contribute to overall diffusion: Majority, Choose, $\Sigma_0$, $\Sigma_1$, Integer Addition (+), and the K Function.
Each compression function calculation additionally inputs an eight element, 32-bit initialization vector (IV). The IV is originally set to a constant set of seemingly random values. For every message block after the first, the output of the compression function on the previous block is used as the next IV.
A total of 64 rounds are calculated in the compression function. Prior to the first round, the IV is split into eight variables (a–h). Each round thereafter, the compression round algorithm is run (Algorithm 1). Each sub-function referenced in the SHA-256 compression round algorithm takes as input one or multiple of these 32-bit integer variables.
Algorithm 1. The SHA-256 Compression Round Algorithm [4]
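One compression round, as specified in FIPS 180-4, can be sketched as below. The function and variable names are illustrative choices; only the arithmetic follows the standard. The state is the tuple of working variables (a–h), and `k_t` and `w_t` are the round constant and schedule word for that round.

```python
MASK = 0xFFFFFFFF  # all arithmetic is modulo 2^32


def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def choose(x, y, z):
    return ((x & y) ^ (~x & z)) & MASK


def majority(x, y, z):
    return (x & y) ^ (x & z) ^ (y & z)


def big_sigma0(x):
    return rotr(x, 2) ^ rotr(x, 13) ^ rotr(x, 22)


def big_sigma1(x):
    return rotr(x, 6) ^ rotr(x, 11) ^ rotr(x, 25)


def compression_round(state, k_t, w_t):
    """Apply one SHA-256 round to the working variables (a..h)."""
    a, b, c, d, e, f, g, h = state
    t1 = (h + big_sigma1(e) + choose(e, f, g) + k_t + w_t) & MASK
    t2 = (big_sigma0(a) + majority(a, b, c)) & MASK
    # Every variable shifts one place; only a and e receive new values.
    return ((t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g)
```

Only two of the eight working variables are freshly computed in each round, which is one reason many rounds are needed before every output bit depends on every input bit.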
The Majority sub-function is defined as Majority$(x, y, z) = (x \wedge y) \oplus (x \wedge z) \oplus (y \wedge z)$. The function operates as the name suggests: at each bit-index of $x$, $y$, and $z$, the binary value held by the majority of the three inputs is output. The function is non-linear and therefore complicates recurrence relation formulas used to dissect the overall algorithm.
The sub-function Choose is similarly defined as Choose$(x, y, z) = (x \wedge y) \oplus (\neg x \wedge z)$. Choose is often referred to as IF, for at each bit-index, if the $x$ bit is ‘1’, the $y$ bit is chosen as output, else the $z$ bit is chosen. This sub-function, as with Majority, is non-linear, and can therefore complicate any attempt to algebraically structure the overall algorithm.
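The two sigma functions are defined as $\Sigma_0(x) = \mathrm{ROTR}^{2}(x) \oplus \mathrm{ROTR}^{13}(x) \oplus \mathrm{ROTR}^{22}(x)$ and $\Sigma_1(x) = \mathrm{ROTR}^{6}(x) \oplus \mathrm{ROTR}^{11}(x) \oplus \mathrm{ROTR}^{25}(x)$. As with the sigma functions defined for the Message Scheduler, these functions are “linear and injective mappings” over GF(2) [1,5]. This property helps support internal collision resistance, at the potential cost of additional diffusion potential.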
Additionally included as a sub-function for the purposes of SAC measures is Integer Addition (+). This is due to the carry operation providing a significant amount of diffusion by occasionally propagating a change into higher bit positions. The carry operation is non-linear over GF(2), so internal collisions could be made possible. It is important to note that in order for the compression function to achieve any compression, some sub-functions must not be injective.
Lastly, each round of the compression function includes the addition of a constant $K_t$, known here as the K Function. This constant was set during standardization [3], and appears to be pseudo-random. The constants are derived from “the first thirty-two bits of the fractional parts of the cube roots of the first sixty-four prime numbers” [3,4], and contribute a small additional factor in the diffusion of the overall function.
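The quoted derivation of the round constants can be checked with exact integer arithmetic. The sketch below is illustrative; `integer_cube_root` and `first_primes` are hypothetical helper names. Scaling each prime by $2^{96}$ before taking the integer cube root yields the cube root scaled by $2^{32}$, whose low 32 bits are the leading fractional bits.

```python
def integer_cube_root(n):
    """Return floor(n ** (1/3)) for a positive integer, via Newton's method."""
    x = 1 << ((n.bit_length() + 2) // 3)  # initial guess, never too small
    while True:
        y = (2 * x + n // (x * x)) // 3
        if y >= x:
            return x
        x = y


def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 1
    return primes


# floor(cbrt(p) * 2^32), reduced to its low 32 bits (the fractional part).
K = [integer_cube_root(p << 96) & 0xFFFFFFFF for p in first_primes(64)]
```

Integer arithmetic is used instead of floating point so that no rounding error can affect the last bit of a constant.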
3.3. Previous Sub-Function Measures
The concept of splitting the SHA-256 compression function into sub-functions comes originally from Yoshida and Biryukov [11], who swap Integer Addition for a simple XOR operation, where XOR is effectively addition without the additional diffusion via the carry operation.
Vaughn and Borowczak [5], taking inspiration from this work, consider swaps for all the listed sub-functions. The Message Scheduler is swapped for a simpler function where the 16-element input block is repeated four times in order to extend it into 64 elements. Integer Addition is swapped with XOR, as had been done previously. The remaining sub-functions of Majority, Choose, $\Sigma_0$, $\Sigma_1$, and K Function are simply removed in tested variants that forgo them.
In this previous work, one of the seven sub-functions is removed from the overall compression function, and the SAC calculated. The results suggest that Choose, Integer Addition, and one of the $\Sigma$ sub-functions provide the most non-redundant diffusion [5]. This measure, however, does not consider how combinations of sub-functions might develop diffusion greater than the sum of the diffusion of their parts.
4. Methods
All combinations of SHA-256 compression sub-functions are tested. In line with previous work [5], when a tested variant does not include a specific sub-function, Integer Addition is swapped with the XOR function, the Message Scheduler is simplified into block repetition, and the remaining sub-functions are simply removed. There are seven SHA-256 compression sub-functions, and therefore $2^7 - 1 = 127$ possible non-empty sub-function combinations.
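Enumerating the tested variants is straightforward with the standard library. The labels in `SUB_FUNCTIONS` are illustrative strings, not identifiers from the implementation used in this work.

```python
from itertools import combinations

# Hypothetical labels for the seven sub-functions.
SUB_FUNCTIONS = [
    "Message Scheduler", "Majority", "Choose",
    "Sigma0", "Sigma1", "Integer Addition", "K Function",
]

# Every non-empty subset, grouped by how many sub-functions it keeps.
variants = [
    combo
    for size in range(1, len(SUB_FUNCTIONS) + 1)
    for combo in combinations(SUB_FUNCTIONS, size)
]

print(len(variants))
```

Each tuple in `variants` names the sub-functions a variant keeps; every sub-function absent from the tuple is swapped or removed as described above.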
For each of these possible combinations, the SAC is measured on each of the 64 rounds of compression. Each combination is tested for 1,000,000 plaintexts. The data for each of these calculations is stored in 131,072-element CSV files. For each combination tested, an additional CSV file stores the minimum, maximum, and mean for each of the 64 per-round CSV files.
This data is then parsed in a variety of ways. First, individual sub-functions are considered and compared. Then, all combinations are compared at the round at which each passes the SAC threshold, if at all. Lastly, the SAC values of combinations are compared to the SAC values of component parts in order to calculate whether diffusion increases to be greater than the sum of parts.
The SHA-256 implementation is the same as was used in previous works in this field [5,12]. Standard implementations such as Go's crypto/sha256 [13] do not provide enough granularity to test individual compression sub-functions. The version used was tested with plaintext/hash pairs from the NIST Secure Algorithm Validation System [14].
5. Results
For each of the CSV files, which represent a matrix of SAC data, the mean is calculated, as well as a combined minimum/maximum metric. This combined metric takes the minimum and maximum values of the CSV, then checks which is further from the ideal of 0.5. The furthest value is referred to as the SAC value for that round. The mean value is referred to as the SAC-mean value. Given that the SAC is strict, the SAC value is the most accurate representation of all of the data in the SAC matrix as a single point. The SAC-mean value can be more useful for visualizing how the measured function approaches the goal.
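The two summary metrics can be sketched as follows. The function names are illustrative, and the tie-breaking rule (preferring the minimum when both extremes are equally far from 0.5) is an assumption, since the text does not specify one.

```python
def sac_value(matrix):
    """Return whichever of the matrix min or max lies further from 0.5."""
    flat = [v for row in matrix for v in row]
    lowest, highest = min(flat), max(flat)
    # Assumption: on a tie, the minimum is reported.
    if abs(lowest - 0.5) >= abs(highest - 0.5):
        return lowest
    return highest


def sac_mean(matrix):
    """Return the mean of all entries in the matrix."""
    flat = [v for row in matrix for v in row]
    return sum(flat) / len(flat)
```

The SAC value is a worst-case summary, so a single poorly diffused input/output bit pair determines it, whereas the SAC-mean averages such outliers away.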
Previous work [5] has considered only individual sub-function-removed variants of SHA-256: combinations of six of the seven sub-functions. Here, all combinations of sub-functions are measured, starting with individual sub-functions. The SAC values and SAC-mean values are graphed across the 64 rounds of compression for these sub-functions and compared to the full SHA-256 SAC values. All 127 sub-function combinations are then compared, simplified to the round at which they pass the SAC Bonferroni threshold. Lastly, diffusion stemming from the combination of parts rather than from the parts themselves is measured.
5.1. SHA-256 Unmodified
Under the tighter thresholds considered in previous work [5], SHA-256 does not exhibit the SAC. This tighter threshold does not account for the multiple testing problem. Here, the Bonferroni Method is used, which prevents most false positives at the cost of additional false negatives.
Under the Bonferroni threshold, as shown in Figure 1, SHA-256 does exhibit the SAC between rounds 23–53 and 58–64. The function does not plateau at SAC levels until round 58, with SAC values from rounds 54–57 all at 0.49728, below the threshold of 0.49734 (Figure 2).
This unexpected result suggests that, as additional rounds are computed, the SAC value can move farther from the ideal of 50%. Certain sub-function combinations appear to dampen diffusion throughout these rounds.
5.2. Individual Sub-Functions
The individual sub-functions Message Scheduler, Majority, Choose, $\Sigma_0$, $\Sigma_1$, Integer Addition (+), and the K Function are tested for SAC across the 64 rounds of compression (Figure 3). These functions yield varying SAC values, though only the Message Scheduler exhibits SAC, at round 40, plateauing from there. The remaining sub-functions never achieve SAC values above 0 throughout the 64 rounds of compression.
The SAC-mean values (Figure 4) provide more information as to how individual sub-functions approach the SAC. The Message Scheduler SAC-mean data looks similar to the SAC-mean data of the unmodified function. Both appear near sigmoid-shaped, with the Message Scheduler curve shifted to the right by 17 rounds.
The $\Sigma_0$ and $\Sigma_1$ sub-functions provide unique SAC-mean data. Neither sub-function seems to plateau at any round; rather, one builds up to 0.29590 SAC-mean at round 36, then retreats, while the other approaches 0.25879 at round 19 and then 0.276854 at round 55. While one of the two is speculated not to provide significant diffusion [5], both sub-functions appear to provide important levels of diffusion compared to other individual sub-functions.
The remaining sub-functions cluster near the bottom throughout all 64 rounds, with Integer Addition leading the group. This sub-function’s SAC-mean peaks at round 64 at 0.04444. The Majority and Choose sub-functions appear to essentially overlap throughout all rounds with SAC-mean peaks at round 64 with values 0.01561 and 0.01562 respectively. The K Function trails behind all other sub-functions, with a round 64 SAC-mean value of 0.00586.
5.3. Sub-Function Combinations
All sub-function combinations are considered as to when, if ever, each combination surpasses and plateaus beyond the threshold set via the Bonferroni Method. The latest round at which the SAC value passes 0.49734 without afterwards returning below it is considered that sub-function combination’s plateau round. Plateau rounds for each combination are graphed along eight subplots (Figure 5), seven of which each represent an individual sub-function. Plateau round data is plotted against the 64 compression rounds and the number of sub-functions included in the combination. If a sub-function combination includes a particular function, that combination’s plateau round data is included in that function’s subplot. The eighth subplot represents plateau rounds for all combinations regardless of combination structure. The alpha value of each data point, which controls its translucency, is set to 0.25. If many combinations with the same number of sub-functions share a plateau round, this is revealed through more opaque coloration at that point. Additionally, if a sub-function combination never exhibits the SAC, that combination is graphed beyond round 64, separated from the other data by a vertical line.
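The plateau round can be computed from the 64 per-round SAC values as sketched below. The function name is illustrative, and the check is written as a two-sided test around 0.5 using the 0.00266 deviation, which is an assumption about how values above 0.5 are treated.

```python
def plateau_round(sac_values, deviation=0.00266):
    """Return the first round from which every later round meets the SAC.

    Rounds are numbered from 1. Returns None when the final round fails,
    meaning the combination never plateaus.
    """
    plateau = None
    # Walk backwards from the last round until the first failing round.
    for round_number in range(len(sac_values), 0, -1):
        if abs(sac_values[round_number - 1] - 0.5) > deviation:
            break
        plateau = round_number
    return plateau


# Synthetic values shaped like the unmodified function's reported behaviour.
example = [0.3] * 22 + [0.499] * 31 + [0.49728] * 4 + [0.499] * 7
print(plateau_round(example))
```

Scanning from the last round backwards means an early passing stretch that is later interrupted does not count as a plateau.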
Of the sub-function combinations which include only an individual sub-function, as verified by Figure 3, the Message Scheduler exhibits the SAC with a round plateau value of 40. All other combinations at this level fail to exhibit the SAC.
At combination level two, all sub-function combinations which include Message Scheduler exhibit the SAC, maximally with plateau round of 40. Combinations including the K function lag behind, where the only combination exhibiting the SAC includes the Message Scheduler. Half of the Majority and Choose combinations exhibit the SAC, with best round plateau values of 28 and 30 respectively. The sigma functions appear in most sub-function combinations which exhibit the SAC at this level, yet when combined they do not exhibit the SAC together. In combination with one of the $\Sigma$ functions, Integer Addition has the lowest round plateau value of the combination level, at 25.
Of all combinations which include three sub-functions, the Message Scheduler once again stands out, as each combination containing it exhibits the SAC. The K function stands out for the inverse reason, as all but one of the combinations that do not exhibit the SAC contain this sub-function. The lowest round plateau value occurs with the combination of one of the $\Sigma$ functions, Integer Addition, and Choose, at round 24. At this level, most function combinations exhibit the SAC, though the round plateau values are spread between 24 and 40.
The four-element sub-function combinations continue the same trends. The data visually clusters between rounds 23 and 38. All combinations which include Message Scheduler exhibit the SAC, whereas there is at least one instance of all other sub-functions contributing to a combination which fails the SAC. Specifically, there are only two combinations which fail to exhibit SAC of the 21, but these combinations are made up of all the other sub-functions. The lowest round plateau value at this combination level is 23, for the combination made up of Message Scheduler, one of the $\Sigma$ functions, Integer Addition, and Choose.
At combination level five, all sub-function combinations exhibit the SAC. The main cluster of SAC plateau rounds continues to tighten between 23 and 29, with only three sub-function combinations remaining outside of the cluster. Four combinations have round plateau values of 23, all of which include one of the $\Sigma$ functions, Integer Addition, and Choose. The combination with the highest round plateau value of 63 excludes Majority and Integer Addition.
Combinations which include six sub-functions, also referred to as sub-function-removed variants in the previous work [5], suggest results similar to those of the previous work, but with one addition. Excluding Majority, the K function, or one of the $\Sigma$ functions results in a round plateau value of 23. The use of the Bonferroni Method to calculate the SAC value threshold pushes the Message Scheduler-removed combination to a round plateau value of 24. As suggested by Vaughn and Borowczak [5], the sub-functions Choose, Integer Addition, and the other $\Sigma$ function yield round plateau values of 23, 25, and 27 respectively when excluded from the combination.
Lastly, there is the whole SHA-256 function, which defies the trend and has a round plateau value of 58. As noted previously, the function exhibits SAC from rounds 23–53, but fails to plateau, and regresses before finally plateauing at round 58. This suggests that through some mechanism, adding a sub-function actually removes SAC exhibition.
Clearly, the Message Scheduler, one of the $\Sigma$ functions, Integer Addition, and Choose sub-functions provide significant levels of diffusion across the 64 rounds of compression compared to Majority, K Function, and the other $\Sigma$ function.
5.4. Diffusion Directly from Combination
The SAC values measured for individual sub-functions, then combined, might not capture the entire SAC value of the same sub-function combination. That is, for individual sub-functions $A$ and $B$, where $AB$ is the function which results from their composition, the resulting values SAC($A$) combined with SAC($B$) might be more or less than SAC($AB$). If SAC($AB$) is the greater value, then the combination of A and B amplifies the result, whereas if SAC($AB$) is the lesser value the combination of A and B dampens the result.
The SAC measurement algorithm detects bit complement in output via the XOR function. Since SAC values are the furthest measured probability from 0.5, analysis is restricted to these values. For one plaintext in the SAC measuring algorithm and one complemented plaintext bit, each bit of output is XOR’ed with the output of the original plaintext. In the case where we run $AB$ through this algorithm, the result will only complement the original hash output if either $A$ complemented it but $B$ did not, or if $B$ complemented it and $A$ did not. This results in an expected value of $\mathrm{SAC}(A)\,(1 - \mathrm{SAC}(B)) + \mathrm{SAC}(B)\,(1 - \mathrm{SAC}(A))$. Therefore, the ratio of the measured SAC value of $AB$ divided by the expected SAC value results in the dampening/amplifying SAC ratio of each sub-function combination for each round. For a SAC ratio $> 1$, the combination of functions amplifies diffusion, and for a SAC ratio $< 1$ the combination dampens diffusion.
For each additional sub-function in a combination, the calculation increases in terms. The general formula for calculating the probability that an odd number of bit-flips occurred, for $n$ composited sub-functions $F_1, \dots, F_n$ with flip probabilities $p_1, \dots, p_n$, is $\frac{1}{2}\left(1 - \prod_{i=1}^{n} (1 - 2p_i)\right)$. For visibility of measurement, SAC-mean is used to calculate the SAC-mean ratio.
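The expected value and the resulting ratio can be sketched as follows. The closed form used here is the standard identity for the probability that an odd number of independent events occurs, $\frac{1}{2}\left(1 - \prod_i (1 - 2p_i)\right)$; treating the sub-functions' flips as independent, and the function names themselves, are assumptions of this sketch.

```python
from math import prod


def expected_flip_probability(probabilities):
    """Probability that an odd number of independent flips occurs."""
    return 0.5 * (1 - prod(1 - 2 * p for p in probabilities))


def sac_ratio(measured, component_probabilities):
    """Measured value divided by the value expected from the components.

    A ratio above 1 indicates amplification, below 1 dampening.
    """
    return measured / expected_flip_probability(component_probabilities)


# Two components: matches a(1 - b) + b(1 - a).
a, b = 0.2, 0.3
print(expected_flip_probability([a, b]), a * (1 - b) + b * (1 - a))
```

Under this formula, a component whose flip probability is exactly 0.5 forces the expected value to 0.5 regardless of the other components.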
For sub-function combinations with two elements (Figure 6), the data is widely distributed. Minimally, the SAC-mean ratio reaches 0.50817 for the Choose, Majority combination, which suggests that there is a significant diffusion-dampening effect between these sub-functions. Maximally, the SAC-mean ratio reaches 4.38483 with the combination of Integer Addition and Choose, which suggests a strong amplifying effect from the combination of these sub-functions. All combinations involving the Message Scheduler sub-function notably exhibit very little amplifying or dampening effect. Overall, the data suggests that sub-function combination leads to much more amplification than dampening.
At combination level three, the range of data shifts. The combination of Choose, Majority and K function demonstrates an extreme dampening effect with a SAC-mean ratio of 0.43115. For the Choose, Integer Addition, K function combination, the SAC-mean ratio is 4.02964. All Message Scheduler combinations once again have a SAC-mean ratio of near 1.0. The data does not appear to significantly constrict in density in comparison to the combination level two data.
However, for combinations which include four sub-functions, the data does appear to tighten. At this level no sub-function combinations lead to dampening, and the minimal SAC-mean ratio is 1.0. Maximally, the SAC-mean ratio is 3.32819 for the combination of Integer Addition, Choose, Majority, and K function. As combination levels increase, the amplifying effect trends to 1.0.
This is revealed in greater detail at combination level five, where dampening continues to be absent, and where amplification occurs maximally at a SAC-mean ratio of 2.02217, with the combination which excludes one of the $\Sigma$ functions and Message Scheduler. For combinations of six sub-functions, the only combination with a ratio other than 1.0 is the combination which excludes just Message Scheduler. This combination has a SAC-mean ratio of 1.36574.
Overall, it appears that diffusion-dampening effects occur most amongst Choose, Majority, and K function. Simultaneously, amplification appears strongest amongst Choose and Integer Addition. The Message Scheduler sub-function notably pushes all combinations it is contained in to a SAC-mean ratio of 1.0.
6. Conclusions
The level of diffusion a hash function demonstrates can be effectively measured via the SAC, though in order to reduce false positives, a method such as Bonferroni should be utilized to account for the multiple comparisons involved. With this method, SHA-256 exhibits the SAC at round 23 of compression, fails the SAC between rounds 54 and 57, then exhibits SAC once more at round 58.
This late-round failure to exhibit SAC defies the trend displayed by the many combinations of SHA-256 sub-functions. While only the Message Scheduler exhibits SAC individually, at round 40, combinations of sub-functions trend toward exhibiting SAC at round 23. This trend continues until the whole function is constructed from the combination, where the entirety of SHA-256 fails late, as described.
Of the sub-functions which make up the SHA-256 compression function, one of the $\Sigma$ functions, Integer Addition, Choose, and Message Scheduler are consistently contained in the combinations which exhibit the SAC at the earliest rounds. This reinforces the results of Vaughn and Borowczak [5], who suggest that $\Sigma$ function, Integer Addition, and Choose as the best targets for future sub-function-focused attacks. Here, we also suggest Message Scheduler as a useful target for this same type of attack.
The measured SAC of a sub-function combination is sometimes amplified or dampened compared to the expected SAC of that combination. Notably, the dampening effect only occurs at lower levels of combination, largely amongst combinations containing Majority, Choose, and K function. Future sub-function attacks could consider targeting this weakness. Further, amplification occurs for almost all sub-function combinations, but decreases as additional functions are added to each combination. There is no obvious attack that stems from sub-function combination amplification.