3.1. LSH Quantum Circuit
This section describes the parallel quantum circuit for LSH. We parallelize the parts of the LSH hash function that can operate independently, trading additional quantum resources for a lower circuit depth. Specifically, we implement the message expansion (MsgExp) function and the message mix (Mix) function in parallel. As a result, the parallel quantum circuit reduces the circuit depth by about 96% compared to the previous work [19]. In MsgExp and Mix, which are internal functions of the LSH hash function, each message is processed independently in units of words. Because the word-level operations do not affect each other's results, they can be executed in parallel, which greatly reduces the depth of the circuit. We design the parallel operation in LSH using the parallel adder proposed in [27]. In the previous work, Song et al. [19] used a sequential adder [27] in the LSH quantum circuit. That adder reuses a single ancilla qubit, so the additions must be performed sequentially even where parallel operation would otherwise be possible. The sequential adder uses 
 Toffoli gates, 
 CNOT gates, and 
 depth; (
n: bit length). 
Figure 5 shows the sequential addition operation in MsgExp. In this adder, the message block pairs 
, 
 are calculated sequentially. Since 16 additions are performed one by one, the quantum circuit has a depth of 16 × (6
2). Performing the additions one after another thus multiplies the depth of a single adder by the number of additions, which makes the sequential design inefficient in terms of circuit depth. As a result, the sequential quantum circuit of the previous work has a depth in the hundreds of thousands (LSH-256-n: about 210,050; LSH-512-m: about 421,850).
We propose a method that uses the adder in [27] as an efficient parallel quantum adder in LSH. We design a parallel addition structure for the LSH quantum circuit that offers an efficient trade-off between quantum resources. The parallel quantum adder uses 
 Toffoli gates, 
 CNOT gates, and 
 X gates, and has a depth of 
. 
Figure 6 shows the parallel addition operation in MsgExp. In this adder, the message block pairs 
, 
 are calculated in parallel. Since 16 additions are performed at once, the depth of the quantum circuit is only (2
 3). The parallel adder increases the number of CNOT and X gates, but it reduces the number of Toffoli gates, which are more expensive than CNOT and X gates, and it significantly reduces the depth of the quantum circuit. We therefore regard this as a very efficient trade-off. We describe the quantum circuits based on LSH-256-n. With the parallel structure, the total depth is reduced by about 96% compared to the previous work. The trade-off results for quantum resources are described in detail in Section 4.
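The effect of ancilla reuse on depth can be illustrated with a small scheduling model. The sketch below is not the adder from [27]: `toy_adder` is a hypothetical stand-in whose only relevant property is that its first and last gates touch a carry ancilla, and `circuit_depth` is an assumed as-soon-as-possible layer count in which gates acting on disjoint qubits share a layer.

```python
def circuit_depth(gates):
    """Depth under as-soon-as-possible scheduling.

    Each gate is a tuple of the qubit labels it acts on; gates that
    share no qubit may be placed in the same layer.
    """
    last_layer = {}  # qubit label -> last layer in which it was used
    depth = 0
    for qubits in gates:
        layer = 1 + max(last_layer.get(q, 0) for q in qubits)
        for q in qubits:
            last_layer[q] = layer
        depth = max(depth, layer)
    return depth


def toy_adder(pair, ancilla, n=4):
    """Hypothetical n-bit adder skeleton (illustration only)."""
    a = [f"a{pair}_{i}" for i in range(n)]
    b = [f"b{pair}_{i}" for i in range(n)]
    gates = [(a[0], b[0], ancilla)]                # first gate uses the ancilla
    for i in range(1, n):
        gates.append((a[i - 1], b[i - 1], a[i]))   # carry ripples upward
    gates.append((a[n - 1], ancilla))              # last gate uses the ancilla
    return gates


PAIRS = 16  # additions per MsgExp call, as stated in the text

single = circuit_depth(toy_adder(0, "c"))
# One shared ancilla: every adder must wait for the previous one.
shared = circuit_depth([g for p in range(PAIRS) for g in toy_adder(p, "c")])
# One ancilla per pair: all adders occupy the same layers.
separate = circuit_depth(
    [g for p in range(PAIRS) for g in toy_adder(p, f"c{p}")]
)

print(single, shared, separate)
```

In this toy model the shared-ancilla circuit is 16 times as deep as a single adder, while the circuit with one ancilla per pair is exactly as deep as a single adder. The real gate counts and depths are those reported in the text and in Section 4.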
3.2. Parallel Quantum Circuit for LSH
In LSH, addition is used in message expansion (MsgExp) and message mix (Mix), and these additions can be processed with a parallel adder. The individual additions in MsgExp and Mix do not affect each other's results, so the depth of the circuit can be significantly reduced using the parallel adder. In LSH-256-n, one word is 32 bits. The LSH-256-n quantum circuit uses 1024 qubits for the padded plaintext M, 512 qubits for the connection variable (CV), and 16 carry qubits for the parallel adder. In LSH-512-m, one word is 64 bits. The LSH-512-m quantum circuit uses 2048 qubits for the padded plaintext M, 1024 qubits for the connection variable (CV), and 15 carry qubits for the parallel adder. The overall operation of LSH-256-n and LSH-512-m is the same, but the word size, the step constants, and the final hash value differ. Source code for the proposed parallel LSH quantum circuit is available at https://github.com/kyungzzu/Grover-on-SM3-and-LSH (accessed on 18 September 2022).
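As a quick check of the register sizes quoted above, the following sketch tallies only the qubits named in this paragraph. The dictionary keys are illustrative labels, and the sums cover just these listed registers; the full circuits may allocate further qubits that are not listed here.

```python
# Register sizes exactly as quoted in the text (qubits).
REGISTERS = {
    "LSH-256-n": {"message M": 1024, "CV": 512, "adder carries": 16},
    "LSH-512-m": {"message M": 2048, "CV": 1024, "adder carries": 15},
}


def listed_qubits(variant):
    """Sum of the registers listed in the text for one LSH variant."""
    return sum(REGISTERS[variant].values())


for name in REGISTERS:
    print(name, listed_qubits(name))
```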
Figure 2 shows the progress of the MsgExp function and 
 function of the original LSH hash function. That is, after expanding all messages through the MsgExp function, the expanded message is used in the 
 function. This approach is costly in quantum resources because it requires qubits to store the entire expanded message. Therefore, in the quantum circuit, the MsgExp function and 
 function are performed iteratively, as shown in 
Figure 7, to reduce the temporary qubits used for message expansion. For example, by the message expansion equation 
 in Equation (
4), the third message block 
 is expanded by the addition operation of 
 and 
. 
 is the value substituted by the permutation in 
Table 4. If the MsgExp and 
 functions are performed in units of one message block when the 
 message block is used, 
 and 
 have already been used, so the result of the expansion of 
 can be calculated in 
. In LSH-256-
n, a 1024-bit message is divided into 
(
) of 1 word (1 word = 32 bits) each to perform the 
 function. In LSH-512-
m, a 2048-bit message is divided into 
(
) of 1 word (1 word = 64 bits) each to perform the 
 function.
In summary, the proposed technique does not allocate new qubits to store the updated M. Instead, it saves qubits by writing each new value of M over the M used in the previous round. The connection variable 
 is updated through the MsgExp, Mix, and WordPerm functions, and the hash value is finally obtained from the updated value through the Final function. MsgExp generates a 16-word message block 
 for message block 
, …, 
 (32 bits) by using Equation (
4). The addition that generates the next message is performed after the bit permutation, where the bit permutation 
 is shown in 
Table 4.
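The qubit-saving idea can be sketched classically. The code below assumes the usual LSH expansion rule, in which each new 16-word block is the word-wise sum modulo 2^32 of the previous block and a permuted copy of the block before it. The permutation `TAU` is written from memory of the LSH specification and should be checked against Table 4, and the function names are illustrative. `expand_all` stores every block, whereas `expand_in_place` keeps only two blocks and overwrites the older one.

```python
MASK32 = 0xFFFFFFFF
# Assumed word permutation; verify against Table 4 before relying on it.
TAU = (3, 2, 0, 1, 7, 4, 5, 6, 11, 10, 8, 9, 15, 12, 13, 14)


def expand_all(m0, m1, blocks):
    """Reference version: keeps every expanded block in memory."""
    out = [list(m0), list(m1)]
    for j in range(2, blocks):
        prev, prev2 = out[j - 1], out[j - 2]
        out.append([(prev[l] + prev2[TAU[l]]) & MASK32 for l in range(16)])
    return out


def expand_in_place(m0, m1, blocks):
    """Keeps two blocks only; the older block is replaced by the new one."""
    older, newer = list(m0), list(m1)
    yield tuple(older)
    yield tuple(newer)
    for _ in range(2, blocks):
        # 16 independent word additions: these are what run in parallel.
        older = [(newer[l] + older[TAU[l]]) & MASK32 for l in range(16)]
        older, newer = newer, older  # the overwritten block is now the newest
        yield tuple(newer)


m0 = [i * 0x01010101 & MASK32 for i in range(16)]
m1 = [(0xFFFFFFF0 + i) & MASK32 for i in range(16)]
same = [list(b) for b in expand_in_place(m0, m1, 10)] == expand_all(m0, m1, 10)
print(same)
```

Each block is consumed by the step function as soon as it is produced, so nothing older than the two live blocks has to be kept.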
        
In MsgExp, the addition operations of message block pairs (i.e., 
, 2 
) are all independent, so the adders can be designed in parallel. Algorithm 1 shows the computation for parallel addition in MsgExp. This adder uses 16 ancilla qubits c, one per message pair, to store the carry values. Since the adder uses ancilla qubits 
 individually, it can perform the additions on all pairs of input messages in parallel. As a result, Algorithm 1 runs concurrently for every message pair.
        
| Algorithm 1 Parallel quantum adder of LSH. | 
| Input:  and  pair, ancilla  (1) |
| 1: for i = 0 to 29 do |
| 2:     [i + 1] ← CNOT([i + 1], [i + 1]) |
| 3: end for |
| 4: ← CNOT([1], ) |
| 5: ← Toffoli([0], [0], ) |
| 6: [1] ← CNOT([2], [1]) |
| 7: [1] ← Toffoli(, [1], [1]) |
| 8: [2] ← CNOT([3], [2]) |
| 9: for i = 0 to 26 do |
| 10:     [i + 2] ← Toffoli([i + 1], [i + 2], [i + 2]) |
| 11:     [i + 3] ← CNOT([i + 4], [i + 3]) |
| 12: end for |
| 13: [29] ← Toffoli([28], [29], [29]) |
| 14: [31] ← CNOT([30], [31]) |
| 15: [31] ← CNOT([31], [31]) |
| 16: [31] ← Toffoli([29], [30], [31]) |
| 17: for i = 0 to 28 do |
| 18:     X([i + 1]) |
| 19: end for |
| 20: [1] ← CNOT(, [1]) |
| 21: for i = 0 to 28 do |
| 22:     [i + 2] ← CNOT([i + 1], [i + 2]) |
| 23: end for |
| 24: [29] ← Toffoli([28], [29], [29]) |
| 25: for i = 0 to 26 do |
| 26:     [28 - i] ← Toffoli([27 - i], [28 - i], [28 - i]) |
| 27:     [29 - i] ← CNOT([30 - i], [29 - i]) |
| 28:     X([29 - i]) |
| 29: end for |
| 30: [1] ← Toffoli(, [1], [1]) |
| 31: [2] ← CNOT([3], [2]) |
| 32: X([2]) |
| 33: ← Toffoli([0], [0], ) |
| 34: [1] ← CNOT([2], [1]) |
| 35: X([1]) |
| 36: ← CNOT([1], ) |
| 37: for i = 0 to 30 do |
| 38:     [i] ← CNOT([i], [i]) |
| 39: end for |
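To make the role of the single carry ancilla concrete, the sketch below simulates a reversible ripple-carry adder on classical bits. It deliberately uses the simpler textbook construction from MAJ and UMA blocks (in the style of Cuccaro et al.), not the optimized gate sequence of Algorithm 1, and all names are illustrative. It shows only the shared behaviour: b is replaced by (a + b) mod 2^n, a is preserved, and the ancilla returns to 0.

```python
def maj(w, c, b, a):
    """MAJ block: leaves the next carry on wire a."""
    w[b] ^= w[a]            # CNOT(a -> b)
    w[c] ^= w[a]            # CNOT(a -> c)
    w[a] ^= w[c] & w[b]     # Toffoli(c, b -> a)


def uma(w, c, b, a):
    """UMA block: undoes MAJ and leaves the sum bit on wire b."""
    w[a] ^= w[c] & w[b]     # Toffoli(c, b -> a)
    w[c] ^= w[a]            # CNOT(a -> c)
    w[b] ^= w[c]            # CNOT(c -> b)


def ripple_add(a_val, b_val, n=32):
    """Return (a, (a + b) mod 2**n, ancilla) after the reversible circuit."""
    w = {"c": 0}                                   # one carry ancilla
    for i in range(n):
        w[f"a{i}"] = (a_val >> i) & 1
        w[f"b{i}"] = (b_val >> i) & 1
    # The carry into bit i lives on the ancilla (i = 0) or on wire a{i-1}.
    carry = ["c"] + [f"a{i}" for i in range(n - 1)]
    for i in range(n):
        maj(w, carry[i], f"b{i}", f"a{i}")
    for i in reversed(range(n)):
        uma(w, carry[i], f"b{i}", f"a{i}")
    a_out = sum(w[f"a{i}"] << i for i in range(n))
    b_out = sum(w[f"b{i}"] << i for i in range(n))
    return a_out, b_out, w["c"]


print(ripple_add(123456789, 987654321))
```

Because the ancilla ends in state 0, one such adder can reuse it for the next addition, which forces sequential execution. Giving every word pair its own ancilla removes that dependency.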
In Mix, adders operate in parallel for 
 and 
 pairs, respectively. The result of the addition operation is stored in 
. Since addition operations of 
 and 
 do not affect each other, the parallel operation is possible. The adder used in Mix is the same as Algorithm 1, and message block pairs (i.e., 
, 2 
) are changed to 
, 
 pairs at the input. Algorithm 2 shows the quantum circuit implementation of the Mix function. One Mix function is performed with two word pairs 
 and a total of eight Mix functions are performed per round. In the Mix function quantum circuit, the a_rotation, b_rotation, and c_rotation functions in lines 2, 6, and 8 perform index rotation. The rotation amount is determined by the word size (32-bit or 64-bit) and the j value of the step function 
. Since the rotation uses only swap gates, it requires no additional quantum resources.
        
| Algorithm 2 Quantum circuit of the Mix function. | 
| Input: , , , (0 ≤ i ≤ 7) |
| 1: ← Parallel_adder(, ) |
| 2: a_rotation() |
| 3: |
| 4: Apply X gate to  according to  |
| 5: ← Parallel_adder(, ) |
| 6: b_rotation() |
| 7: ← Parallel_adder(, ) |
| 8: c_rotation() |
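The data flow of the Mix circuit can be sketched classically for 32-bit words. The code assumes the standard LSH Mix definition (add, rotate, XOR with a step constant, add, rotate, add, rotate). Which word each rotation acts on, and the reading of lines 3 and 4 as an XOR with a constant, are assumptions drawn from that definition rather than from the listing. The parameters `alpha`, `beta`, `gamma`, and `sc` are placeholders for the rotation amounts and the step constant.

```python
MASK32 = 0xFFFFFFFF


def rol32(x, r):
    """Rotate a 32-bit word left by r bits."""
    r %= 32
    return ((x << r) | (x >> (32 - r))) & MASK32


def mix(x, y, alpha, beta, gamma, sc):
    """Classical model of one Mix call on a word pair (x, y)."""
    x = (x + y) & MASK32    # line 1: parallel adder
    x = rol32(x, alpha)     # line 2: a_rotation
    x ^= sc                 # lines 3-4: X gates where the constant has a 1 bit
    y = (x + y) & MASK32    # line 5: parallel adder
    y = rol32(y, beta)      # line 6: b_rotation
    x = (x + y) & MASK32    # line 7: parallel adder
    y = rol32(y, gamma)     # line 8: c_rotation
    return x, y


# With all rotations and the constant set to zero only the adds remain.
print(mix(1, 1, alpha=0, beta=0, gamma=0, sc=0))
```

The eight Mix calls of a round act on disjoint word pairs, so each of the three additions can be executed for all eight pairs at once.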