Data are loaded into the qubits, and the hash value is output through the absorbing and squeezing phases of the sponge structure. In the absorbing phase, each message block is XORed into the state, and the state is then updated by repeatedly applying the permutation function f (i.e., Keccak-f[1600, 24]). The final hash value is output through the squeezing phase. The proposed improved low-depth SHA3 quantum circuit for fault-tolerant quantum computers was implemented for all Keccak-f phases. This section describes the implementation of the quantum circuit for each function. The SHA3 internal function f consists of five steps, presented below, and runs for 12 + 2l rounds, depending on the state width of b bits (SHA3: b = 1600). The power-of-two word size w is defined as 2^l bits, and SHA3 uses a 64-bit word (i.e., l = 6).
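As a quick check of these parameters, the following minimal Python sketch derives the word size, state width, and round count from l. It assumes the standard Keccak relations w = 2^l, b = 25w, and 12 + 2l rounds; the function name is illustrative and does not come from the paper.

```python
def keccak_params(l: int) -> dict:
    """Derive Keccak-f parameters from l (standard Keccak relations, assumed here)."""
    w = 2 ** l           # word (lane) size in bits
    b = 25 * w           # state width: a 5 x 5 array of w-bit lanes
    rounds = 12 + 2 * l  # number of rounds of the permutation f
    return {"w": w, "b": b, "rounds": rounds}


print(keccak_params(6))  # {'w': 64, 'b': 1600, 'rounds': 24}
```

For l = 6 this reproduces the 64-bit word, the 1600-bit state, and the 24 rounds of Keccak-f[1600, 24].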
3.1. Theta
Theta (θ) is one of the five phases of the SHA3 (i.e., Keccak-f) hash function. In the θ phase, the data are processed in the three-dimensional state array structure shown in
Figure 4. The result of
saves to
bits. The final result value is stored in
.
Equation (1) is the theta (θ) operation on a classical computer. In the classical computation, temporary registers (hereafter referred to as temp) C, D, and R are allocated to store intermediate values. Therefore, four 1600-bit temp registers are used. The quantum circuit saves 4800 qubits by allocating only one 1600-bit temp register, equal to one state size. The proposed quantum circuit avoids an increase in depth by not re-initializing the temp qubits through a reverse operation in each round. Instead, this scheme allocates one temp register per round to replace the inverse operation.
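For reference, the classical theta step can be sketched in Python as follows. This is a minimal sketch that follows the standard Keccak definition with the intermediate values C (column parities) and D; the state layout `A[x][y]` as a 5 x 5 list of 64-bit integers and the helper name `rotl` are assumptions made for this illustration, not the paper's notation.

```python
W = 64  # lane size in bits for SHA3


def rotl(v: int, n: int, w: int = W) -> int:
    """Rotate a w-bit lane left by n positions."""
    n %= w
    return ((v << n) | (v >> (w - n))) & ((1 << w) - 1)


def theta(A):
    """Classical theta: A[x][y] is a 64-bit lane; returns a new 5 x 5 state."""
    # C[x] is the parity of column x, taken over the five lanes that share x.
    C = [A[x][0] ^ A[x][1] ^ A[x][2] ^ A[x][3] ^ A[x][4] for x in range(5)]
    # D[x] combines the two neighbouring column parities.
    D = [C[(x - 1) % 5] ^ rotl(C[(x + 1) % 5], 1) for x in range(5)]
    # Every lane in column x is XORed with D[x].
    return [[A[x][y] ^ D[x] for y in range(5)] for x in range(5)]


state = [[0] * 5 for _ in range(5)]
state[2][3] = 1 << 10
out = theta(state)
print(sum(bin(lane).count("1") for col in out for lane in col))  # 11
```

A single input bit influences 11 output bits: itself, plus five bits in each of the two neighbouring columns. This spread is why undoing theta in place is costly and why the intermediate values need temporary storage.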
Figure 5 shows theta (θ) (a) excluding the inverse operation and (b) including the inverse operation. T represents the temp qubits. The scheme excluding the inverse operation includes only one θ function in the quantum circuit per round. The scheme including the inverse operation involves the θ function and the θ⁻¹ function to return the temporary qubits T to their original state in each round. Including the inverse operation reduces the number of T qubits but increases the number of quantum gates and the depth. To reduce the depth, the scheme excluding the inverse operation was selected in this paper. Compared to [8], the operation proposed in this paper increased the number of CNOT gates by about 36.36% and reduced the depth by about 71.27% in θ (a trade-off between quantum gates and depth). In the implementation of [8], θ⁻¹ used the most CNOT gates and increased the depth. Considering this, the proposed quantum circuit replaced θ⁻¹ with the use of T qubits. As a result, 1,360,000 CNOT gates and a depth of 25 for θ⁻¹ were traded for 1600 qubits per round (increase: qubits; decrease: CNOT gates and depth).
Algorithm 1 shows the operation of our quantum circuit for theta (θ). In the input, X and T denote the input qubits and the temp qubits, respectively. All operations performed by the CNOT gates update T, and T is returned at the end. Compared to previous research results [8], the proposed algorithm increased the number of CNOT gates but reduced the full depth in theta (θ).
Algorithm 1 Quantum algorithm for theta (θ)
Input: X, T
1: for (i = 0 to 5) : for (j = 0 to 5) : for (k = 0 to 64) :
2: for s = 0 to 5 do
3:
4:
5:
6: end for
Return T
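The idea of writing theta into the temp qubits T using only CNOT gates can be illustrated with a classical bit-level simulation. The sketch below is one reading that is consistent with the loop structure of Algorithm 1 (loops over i, j, k, and an inner loop over s); the exact gate order of the paper is not recoverable from the text, so the ordering, the index convention `X[i][j][k]`, and the function name are assumptions. A CNOT with control c and target t is modelled as `t ^= c`, and the loop bounds are taken as exclusive (i, j, s in 0..4 and k in 0..63).

```python
W = 64  # lane length in bits (w = 64 for SHA3)


def theta_cnot_only(X):
    """Compute T = theta(X) with CNOTs only, leaving the input X untouched."""
    T = [[[0] * W for _ in range(5)] for _ in range(5)]  # temp qubits, all |0>
    cnot_count = 0
    for i in range(5):
        for j in range(5):
            for k in range(W):
                # Copy the input bit into the temp qubit.
                T[i][j][k] ^= X[i][j][k]
                cnot_count += 1
                for s in range(5):
                    # Accumulate the parity of column i-1 at depth k.
                    T[i][j][k] ^= X[(i - 1) % 5][s][k]
                    # Accumulate the parity of column i+1 at depth k-1.
                    T[i][j][k] ^= X[(i + 1) % 5][s][(k - 1) % W]
                    cnot_count += 2
    return T, cnot_count


X = [[[0] * W for _ in range(5)] for _ in range(5)]
X[2][3][10] = 1
T, cnot_count = theta_cnot_only(X)
print(cnot_count)                                             # 17600
print(sum(bit for col in T for lane in col for bit in lane))  # 11
```

Because X is only ever a control, no inverse operation is needed to restore it, at the price of 1600 extra qubits for T. The gate count printed here belongs to this naive sketch only and says nothing about the scheduling or depth of the circuit in the paper.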
3.4. Chi
Chi (χ) is the only non-linear part of Keccak-f. In the quantum circuit implementation, the Toffoli gate was used only in this step. Since it is the only internal step that uses T gates, it alone determines the T depth. Chi (χ) XORs each bit with the result of multiplying two other bits in its row, and the operation is as follows:
This shows the classical chi (χ) operation. In the proposed quantum circuit, the results are written directly to the target qubits without intermediate temp qubits, which reduces both the depth and the number of additional temp qubits. Using the Toffoli gate, the result of
is directly reflected in the target qubit. The values required to update
in χ are shown in Table 2.
In the order
there is no problem in updating X, but for
, a problem arises because the state of the qubits of
and
X required for the operation has changed from the preceding operation (marked with ⊛ in Table 2). A method involving the inverse operation could be considered, but it greatly increases the depth of the quantum circuit. The proposed quantum circuit instead allocated qubits to store the values of
and
before the operation and preserve them. For each round in chi (χ), this method reduced the number of CNOT gates by about 98.08%, the T depth by about 30.3%, and the full depth by about 90.08% through the use of an additional 640 temp qubits. Algorithm 2 shows the quantum circuit operation for chi (χ).
Algorithm 2 Quantum algorithm for chi (χ)
Input: x, ,
1:
2:
3: for (i = 0 to 5) : for (j = 0 to 5) : for (k = 0 to 64) :
4:
5:
6:
7:
8:
9:
10:
11:
12:
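The in-place update of chi with a small number of saved qubits can be illustrated on a single row of five bits. The sketch below assumes the standard Keccak definition of chi, where each bit is XORed with the AND of the complemented next bit and the bit after that. It also assumes that the two saved values are the first two bits of each row, which is consistent with 2 × 5 × 64 = 640 additional qubits per round but is not confirmed by the text. The gates are simulated classically on bits, and all function names are illustrative.

```python
from itertools import product


def chi_row_reference(x):
    """Classical chi on one row of five bits (out of place)."""
    return [x[i] ^ ((x[(i + 1) % 5] ^ 1) & x[(i + 2) % 5]) for i in range(5)]


def chi_row_in_place(x):
    """In-place chi on one row, simulating X, CNOT, and Toffoli gates on bits."""
    x = list(x)
    # Two CNOTs copy x[0] and x[1] onto fresh |0> qubits before they change.
    saved = {0: x[0], 1: x[1]}
    gates = {"CNOT": 2, "X": 0, "Toffoli": 0}
    for i in range(5):
        p, q = (i + 1) % 5, (i + 2) % 5
        # Positions below i were already overwritten, so read the saved copy.
        a = saved[p] if p < i else x[p]
        b = saved[q] if q < i else x[q]
        # X on a, Toffoli(a, b -> x[i]), X on a again: x[i] ^= (NOT a) AND b.
        x[i] ^= (a ^ 1) & b
        gates["X"] += 2
        gates["Toffoli"] += 1
    return x, gates


EXTRA_QUBITS_PER_ROUND = 2 * 5 * 64  # 2 saved bits x 5 rows x 64 slices

ok = all(chi_row_in_place(bits)[0] == chi_row_reference(bits)
         for bits in product((0, 1), repeat=5))
print(ok, EXTRA_QUBITS_PER_ROUND)  # True 640
```

The first three bits of the row can be updated directly because their operands are still unmodified. The last two need the original values of the first two bits, which is where the saved copies replace an inverse operation.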
3.6. Quantum Cost Analysis for SHA3
Table 3 shows the quantum resources of the proposed circuit for the Keccak-f function in SHA3, and Table 4 shows the quantum resources for our method, with the results of Amy et al. [8] included for comparison. As shown in Table 3, in the proposed quantum circuit, θ and χ used the most quantum resources, and 1600 qubits in θ and 640 qubits in χ were used for each round.
As shown in Table 4, the proposed quantum circuit increased the number of qubits to reduce the depth of the quantum circuit, resulting in a lower depth for each function than in previous implementations. In the theta (θ) operation, we increased the number of CNOT gates by about 36.36% and reduced the depth by about 71.27% per round. To omit the θ⁻¹ process, 1600 additional qubits were used. The result of this trade-off was a reduction of 1,360,000 CNOT gates and a depth of 25 for θ⁻¹, at the cost of 1600 additional qubits. In total, the full depth was reduced by about 73.67% across theta (θ) and theta inverse (θ⁻¹).
In chi (χ), 640 additional qubits were used to reduce the number of CNOT gates by about 98.08%, the T depth by about 30.3%, and the full depth by about 90.08% per round (a trade-off between qubits and gates plus depth). In [8], the number of Toffoli gates was not reported, but the T depth suggests that more Toffoli gates were used than in our method. Our method used more 1qClifford gates but reduced the number of CNOT and Toffoli gates, which are more expensive than 1qClifford gates, so we consider this an appropriate quantum resource trade-off.
For the iota (ι) operation, the classical-to-quantum method reduced the quantum resources required for the round constant (RC) calculation and replaced the CNOT gates with X gates.
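One way to read this is that the round constants are computed classically in advance, so that iota reduces to X gates on the qubits where the constant has a 1 bit. The Python sketch below illustrates that reading. It assumes the standard Keccak round-constant generator (an 8-bit LFSR, as in the SHA-3 standard); the function names and the idea of returning a list of X-gate targets are illustrative assumptions and are not taken from the paper.

```python
def rc_bit(t: int) -> int:
    """One output bit of the Keccak round-constant LFSR."""
    r = 1
    for _ in range(t % 255):
        r <<= 1
        if r & 0x100:
            r ^= 0x171  # feedback taps; also clears the overflow bit
    return r & 1


def round_constant(ir: int) -> int:
    """64-bit round constant RC for round index ir, computed classically."""
    rc = 0
    for j in range(7):
        if rc_bit(j + 7 * ir):
            rc |= 1 << ((1 << j) - 1)  # bit positions 0, 1, 3, 7, 15, 31, 63
    return rc


def iota_x_gate_targets(ir: int) -> list:
    """Bit positions of lane (0, 0) that would receive an X gate in round ir."""
    rc = round_constant(ir)
    return [k for k in range(64) if (rc >> k) & 1]


print(hex(round_constant(1)))   # 0x8082
print(iota_x_gate_targets(1))   # [1, 7, 15]
```

Since the constant is known before the circuit is built, no qubits are needed to hold it and no controlled gates are needed to apply it. Each 1 bit costs a single X gate.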
As a result of these efforts, the proposed improved low-depth SHA3 quantum circuit for fault-tolerant quantum computers decreased the depth of all functions, reducing the overall quantum circuit depth by about 80.01%.
We optimized the depth of each function in the SHA3 quantum circuit to keep the quantum computing error rate low. Our approaches reduced the depth of the quantum circuits. In [32], the additional parts for the Korean block ciphers LEA, HIGHT, and CHAM were implemented in parallel to reduce the overall depth compared to the first proposed quantum circuit [33]. The depth of the preceding quantum circuit [32], whose resource estimates are presented in Table 5, was reduced through a quantum resource trade-off, and the results are shown in Table 6. The method presented in [33] improved the quantum circuit depth for LEA, HIGHT, and CHAM by 78%, 85%, and 70%, respectively.
In [14], to reduce the depth of the Korean standard hash function LSH, the parts of the previous design [34] that could run in parallel were identified and implemented as parallel operations inside the quantum circuit. As a result, compared to the initial research results presented in Table 7, the full depth of the quantum circuit was reduced by about 96%, and the results are shown in Table 8. These previous studies reduced the internal calculation time and error by reducing the depth of the quantum circuit; the depth was reduced by about 70% to 96% compared to the initial work.
Table 9 shows the quantum resources needed for the SHA-256 [8] and SM3 [15] hash functions. The results of comparing our SHA3 quantum circuit with the quantum circuits for other hash functions (LSH-256 [14], SHA-256 [8], and SM3 [15]) were as follows:
The SHA3 quantum circuit proposed in this paper used more X gates and CNOT gates than the parallel quantum circuit of LSH-256 [14] but had fewer Toffoli gates and a smaller T depth and full depth. Compared to the SHA-256 [8] quantum circuit, our SHA3 quantum circuit used more X gates and CNOT gates but fewer Toffoli gates, and its T depth and full depth were smaller. Compared to the Chinese national standard hash function SM3 [15], the SHA3 quantum circuit used more X gates and CNOT gates and fewer Toffoli gates, and its T depth and full depth were smaller. In summary, compared to the LSH, SHA-256, and SM3 quantum circuits, the proposed SHA3 circuit required more X and CNOT gates but fewer Toffoli gates and a lower T depth and full depth. In terms of depth, SHA3 is therefore expected to be the most vulnerable of these hash functions on future quantum computers. In terms of qubits, on the other hand, quantum computers are expected to take the longest to reach the number of qubits required to run the SHA3 quantum circuit.