1. Introduction
Recently, the most efficient quantum main memory was developed [
1,
2], which can access all the data stored in its memory cells with a time cost of
. Additionally, it can store chunks of data in all memory cells with a time cost of
. Moreover, this memory reduces the memory space that is required to store an amount of data exponentially compared to classical RAM [
1,
2]. Storing the inputs and the corresponding outputs of Boolean functions is an essential operation during the processing of these functions using quantum computers. Implementing Boolean functions through data structures is crucial for efficient computation and storage optimization, as it allows for compact representation and the faster evaluation of logical operations [
3,
4]. This approach is fundamental in areas such as circuit design, database indexing, and algorithm optimization. One of the most important problems in this context is the junta problem: learning, from samples drawn uniformly at random, an unknown Boolean function that depends on only a few of its variables. A junta function is a Boolean function that depends only on a small subset of its input variables, called junta variables. Junta variable estimation arises in data structures, machine learning, and computational learning theory, and being able to estimate the number of junta variables may lead to more efficient data storage and processing. In many real-world datasets or functions, only some of the features are relevant for decision-making or computation. For example, in computational biology, only a small, unknown fraction of a long DNA sequence is expected to influence a given genetic trait [
5]. Hence, identifying and estimating the junta variables is an effective form of dimensionality reduction in machine learning applications, since it concentrates on the relevant variables and disregards the irrelevant ones [
6]. It also saves storage, because only the relevant variables need to be stored during the learning process instead of the whole dataset. In addition, performing a computation on a smaller subset of variables reduces the time and resources required [
7]. It also yields an effective sparse representation: instead of storing the entire input space, which has a size
for Boolean functions, the function can be represented using only the junta variables. Identifying the junta variables therefore enables data structure compression by encoding only the relevant variables and their relationships. This is particularly useful in applications such as decision trees or hash tables, where storage efficiency is at a premium. In a database or search algorithm, knowing the junta variables allows queries to be indexed on the relevant attributes only, which reduces the search space and improves query performance.
Several quantum approaches to the junta problem have been investigated. Ambainis et al. [
8] proposed a quantum learning method that solves the group testing problem using
function operations, where
is the error tolerance and
k is the number of junta inputs. A quantum method applicable to any Boolean function was developed by El-Wazan et al. [
9]. They also developed a black-box function that combines two function operations. Their quantum approach uses
function operations to identify dependent variables with a probability of at least
. However, no exact quantum learning method for the general k-junta problem exists. For the special case of 2-junta, the Boolean function of
n variables
, where
and
, was proposed by Floess et al. [
10]. They provided a quantum algorithm that, for any
t function operations, identifies the dependent variables with a probability of
. Chen [
11] provided a quantum algorithm that solves the 2-junta problem with certainty. The worst-case time for Chen’s algorithm to find the two dependent variables is
. This work was then extended by Chen in Ref. [
12] to find the three dependent variables of the Boolean function
, where
and
with the time complexity
. He also proved that this algorithm cannot solve the problem of
k-junta using a single uncomplemented product. Chen [
13] proposed an exact quantum learning procedure to solve the 2-junta problem, involving an
function operation in the worst case. Also, Chen [
14] also proposed an exact quantum learning algorithm that finds the two dependent variables of the 2-junta problem. Later, for the 3-junta problem, Chen [
15] proposed an exact quantum learning algorithm that identifies the three dependent variables using a modified black-box function. The algorithm requires
function operations in the worst case. In 2024, Aljuaydi et al. [
16] used Zidan’s quantum computing model to determine whether a variable is a junta variable for any unknown oracle and unknown input state. Their approach outperformed the classical approach in terms of both time and memory cost. However, estimating the number of junta variables of a Boolean function represented by an unknown oracle, with a low time cost and low memory space, is still an open problem. Therefore, this paper presents a novel quantum circuit and algorithm to tackle this open problem. The proposed technique was realized experimentally using both IBM’s quantum computer simulator and IBM’s real quantum computer.
This paper is organized as follows:
Section 2 states the problem. The methodology used to develop the quantum algorithm that solves this problem is explained in
Section 3. The developed algorithm and its quantum circuit are explained in
Section 4. The practical results are discussed in
Section 6. Finally, the main results are summarized in
Section 7.
6. Experimental Realization of the Proposed Algorithm
The proposed algorithm was realized practically to examine the number of junta variables k by implementing its quantum circuit on IBM’s quantum computer simulator and IBM’s real quantum computer. Here, nine experiments were conducted for nine different oracles. Each oracle was designed with a distinct number of input variables to evaluate the effect of the input dimensionality on the performance of the models, such that each oracle was designed to be known only to the experiment’s designer and hidden from anyone else. The quantum circuit of the proposed algorithm in each experiment was carried out with 20,000 shots. The statistical fidelities of both the simulated results and the real quantum computer results, in each experiment, were calculated as follows: and , where n is the number of qubits and , , and are the simulation probabilities, theoretical probabilities, and real quantum computer probabilities, respectively.
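As an illustration of how a statistical fidelity between two outcome distributions can be computed from measurement counts, the following sketch assumes the classical fidelity F(p, q) = Σ_i sqrt(p_i q_i), summed over all measured basis states. This definition, the function names, and the example counts are assumptions made for illustration; they may differ from the exact expression and data used in the experiments.

```python
import math


def counts_to_probs(counts):
    """Convert raw shot counts (bitstring -> count) into probabilities."""
    total = sum(counts.values())
    return {state: c / total for state, c in counts.items()}


def statistical_fidelity(p, q):
    """Classical fidelity sum_i sqrt(p_i * q_i) between two distributions.

    Assumed definition (Bhattacharyya coefficient); some authors square it.
    Basis states missing from a distribution are treated as probability 0.
    """
    states = set(p) | set(q)
    return sum(math.sqrt(p.get(s, 0.0) * q.get(s, 0.0)) for s in states)


if __name__ == "__main__":
    # Hypothetical example: an ideal distribution concentrated on |0>,
    # compared with made-up noisy counts from 20,000 shots.
    theoretical = {"0": 1.0}
    noisy = counts_to_probs({"0": 19000, "1": 1000})
    print(statistical_fidelity(theoretical, noisy))
```

A fidelity of 1 means the two distributions coincide, and a value of 0 means they have no outcome in common.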
Here, the steps for creating the oracles
in the conducted experiments are detailed. To build the quantum circuit of an oracle that encodes a given Boolean function, the following steps were executed: (i) Create the truth table of the Boolean function, and extract from it the inputs for which the function evaluates to 1. (ii) Combine these inputs into a Boolean expression in which the input terms are joined by logical OR operators. (iii) Replace each OR operation with an XOR operation using the Reed–Muller form, and simplify the resulting Boolean expression [
27]. (iv) Convert the final formula of the Boolean expression into a quantum circuit of
qubits, where
n is the number of input variables and the extra qubit holds the function’s output. Each term in the Boolean expression is then implemented as a CNOT gate or a Toffoli gate in the oracle circuit [
27].
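Steps (i)–(iv) can be sketched in a few lines of Python. Instead of expanding the OR of the 1-valued inputs symbolically, the sketch applies the binary Möbius transform to the truth table, which yields the same simplified XOR-of-AND (Reed–Muller) expression directly. The function names, the zero-based variable indices, and the gate labels are illustrative assumptions, and a real implementation would emit the gates with a quantum SDK.

```python
def anf_coefficients(table, n):
    """Truth table -> Reed-Muller (XOR of ANDs) coefficients.

    table[i] is f(x) for the input whose bit j (i >> j & 1) is variable x_j.
    coeffs[m] == 1 means the product of all x_j with bit j set in m
    appears in the XOR expression (m == 0 is the constant term 1).
    """
    coeffs = list(table)
    for j in range(n):
        bit = 1 << j
        for i in range(1 << n):
            if i & bit:
                coeffs[i] ^= coeffs[i ^ bit]
    return coeffs


def oracle_gates(coeffs, n):
    """Map each XOR term to a gate acting on output qubit n.

    Returns (label, control_qubits, target_qubit) tuples.
    """
    labels = {0: "X", 1: "CNOT", 2: "Toffoli"}
    gates = []
    for mask, present in enumerate(coeffs):
        if not present:
            continue
        controls = [j for j in range(n) if mask >> j & 1]
        # Terms with more than two variables need a multi-controlled X.
        label = labels.get(len(controls), "MCX")
        gates.append((label, controls, n))
    return gates


if __name__ == "__main__":
    # Example: f(x0, x1) = x0 OR x1, truth table indexed by x0 + 2*x1.
    table = [0, 1, 1, 1]
    print(oracle_gates(anf_coefficients(table, 2), 2))
```

For the OR of two variables, the transform gives x0 ⊕ x1 ⊕ x0x1, so the oracle consists of two CNOT gates and one Toffoli gate acting on the output qubit.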
In the first experiment, the first oracle encoded the Boolean function
. The results of this experiment are presented in
Figure 3. It can be observed that the probability of
was equal to 1; this means that
in both the theoretical calculations and the simulation results. Consequently, both theoretically and according to the simulation result, when estimating
k for this oracle using Step 7 of the proposed algorithm,
. In contrast, when the quantum circuit of the proposed algorithm was implemented on the real quantum computer for the first oracle, the probabilities of the basis states
and
were
and
. So, according to Step 7 in the proposed algorithm, the number of junta variables, which was estimated when implementing the algorithm on the real quantum computer, was
, which aligns with the theoretical value. Despite the noise of the real quantum computer, the value of
k obtained on it did not differ from the theoretical result, because the oracle encoded a simple function
. The statistical fidelities of this experiment were
and
.
In the second experiment, the second oracle encoded the Boolean function
. The results of this experiment are presented in
Figure 4. It can be observed that the probability of
was equal to 1; this means that
in both the theoretical calculations and the simulation results. Consequently, both theoretically and according to the simulation result, when estimating
k for this oracle using Step 7 of the proposed algorithm,
. In contrast, when the quantum circuit of the proposed algorithm was implemented on the real quantum computer for the second oracle, the probabilities of the basis states
and
were
and
. So, according to Step 7 in the proposed algorithm, the number of junta variables, which was estimated when implementing the algorithm on the real quantum computer, was
, which aligns with the theoretical value. The statistical fidelities of this experiment were
and
.
In the third experiment, the third oracle encoded the Boolean function
. The results of this experiment are presented in
Figure 5. It can be observed that the probabilities
,
,
, and
in the theoretical calculations and in the simulation results were approximately
,
,
, and
. Consequently, both theoretically and according to the simulation result, when estimating
k for this oracle using Step 7 of the proposed algorithm,
and
. In contrast, when the quantum circuit of the proposed algorithm was implemented on the real quantum computer for the third oracle, the probabilities of the basis states were
,
,
, and
. So, according to Step 7 in the proposed algorithm, the number of junta variables, which was estimated when implementing the algorithm on the real quantum computer, was
, which aligns with the theoretical value. The statistical fidelities of this experiment were
and
.
In the fourth experiment, the oracle encoded the Boolean function
. The results of this experiment are presented in
Figure 6. It can be observed that the probabilities
, and
in the theoretical calculations and in the simulation results were approximately
,
,
, and
. Consequently, both theoretically and according to the simulation result, when estimating
k for this oracle using Step 7 of the proposed algorithm,
. In contrast, when the quantum circuit of the proposed algorithm was implemented on the real quantum computer for the fourth oracle, the probabilities of the basis states were
, and
. Consequently, according to Step 7 in the proposed algorithm, the number of junta variables, which was estimated when implementing the algorithm on the real quantum computer, was
. The statistical fidelities of this experiment were
and
.
In the fifth experiment, the oracle encoded the Boolean function
. The results of this experiment are presented in
Figure 7. It can be observed that the probabilities
, and
in the theoretical calculations and in the simulation results were approximately
, and
. Consequently, both theoretically and according to the simulation result, when estimating
k for this oracle using Step 7 of the proposed algorithm,
and
. In contrast, when the quantum circuit of the proposed algorithm was implemented on the real quantum computer for the fifth oracle, the probabilities of the basis states were
, and
. It is clear that, according to Step 7 in the proposed algorithm, the number of junta variables, which was estimated when implementing the algorithm on the real quantum computer, was
. The statistical fidelities of this experiment were
and
.
In the sixth experiment, the oracle encoded the Boolean function
. The results of this experiment are presented in
Figure 8. It can be observed that the probabilities
, and
in the theoretical calculations and in the simulation results were approximately
,
, and
. Consequently, both theoretically and according to the simulation result, when estimating
k for this oracle using Step 7 of the proposed algorithm,
and
. In contrast, when the quantum circuit of the proposed algorithm was implemented on the real quantum computer for the sixth oracle, the probabilities of the basis states were
,
, and
. Hence, according to Step 7 in the proposed algorithm, the number of junta variables, which was estimated when implementing the algorithm on the real quantum computer, was
. The statistical fidelities of this experiment were
and
.
In the seventh experiment, the oracle encoded the Boolean function
. The results of this experiment are presented in
Figure 9. It can be observed that the probabilities of
, and
in the theoretical calculations and in the simulation results were approximately
, and
. Consequently, both theoretically and according to the simulation result, when estimating
k for this oracle using Step 7 of the proposed algorithm,
and
. In contrast, when the quantum circuit of the proposed algorithm was implemented on the real quantum computer for the seventh oracle, the probabilities of the basis states were
, and
. According to Step 7 in the proposed algorithm, the number of junta variables, which was estimated when implementing the algorithm on the real quantum computer, was
. The statistical fidelities of this experiment were
and
.
In the eighth experiment, the oracle encoded the Boolean function
. The results of this experiment are presented in
Figure 10. It can be observed that the probabilities
, and
in the theoretical calculations and in the simulation results were approximately
, and
. Consequently, both theoretically and according to the simulation result, when estimating
k for this oracle using Step 7 of the proposed algorithm,
and
. In contrast, when the quantum circuit of the proposed algorithm was implemented on the real quantum computer for the first oracle, the probabilities of the basis state were
, and
. Thus, according to Step 7 in the proposed algorithm, the number of junta variables, which was estimated when implementing the algorithm on the real quantum computer, was
. The statistical fidelities of this experiment were
and
.
Finally, in the ninth experiment, the oracle encoded the Boolean function
. The results of this experiment are presented in
Figure 11. It can be observed that the probabilities of
, and
in the theoretical calculations and in the simulation results were approximately
, and
. Consequently, both theoretically and according to the simulation result, when estimating
k for this oracle using Step 7 of the proposed algorithm,
. In contrast, when the quantum circuit of the proposed algorithm was implemented on the real quantum computer for the ninth oracle, the probabilities of the basis states were
, and
. So, according to Step 7 in the proposed algorithm, the number of junta variables, which was estimated when implementing the algorithm on the real quantum computer, was
. The statistical fidelities of this experiment were
and
. Overall, the theoretical and simulation results for estimating the number of junta variables agreed in all experiments. This agreement is expected because the simulation results represent the outcomes of running the proposed approach on future fault-tolerant quantum computers, which would be free from noise and would correct inherent errors [
17,
18]. In the first experiment, where the number of junta variables was 0, the theoretical, simulation, and real quantum computer estimates also agreed. In Experiments 2–4, where the number of junta variables was 1, 2, and 3, respectively, the difference between the theoretical and simulation estimates on the one hand and the real quantum computer estimates on the other was small and can be considered negligible. However, this gap increased in Experiments 5–9, where the number of junta variables ranged from 4 to 8. The discrepancy in these experiments can be attributed to two factors: (i) Current real quantum computers are not resistant to noise, which is why they are referred to as noisy intermediate-scale quantum (NISQ) computers [
19,
20]. Nevertheless, several research efforts are striving to develop fault-tolerant quantum computers by 2029. (ii) The number of junta variables in the oracles used in Experiments 5–9 increased from 4 to 8. These oracles therefore had complex internal structures in terms of CNOT gates and multi-controlled gates, which led to the accumulation of errors when these gates were executed on IBM’s current real quantum computers. Additionally, errors due to decoherence grow with the circuit depth, which is related to the number of gates in the entire quantum circuit; this explains the observed divergence.
Table 1 compares the time cost of the proposed approach for estimating the number of junta variables
k of an unknown oracle with that of the classical approach. The time complexity of the classical approach is
and the required memory space in this approach is
as well (a detailed proof is given in [
16]). It has been proven that the time complexity of solving problems with Zidan’s quantum computing model based on the
operator is
. Consequently, the time cost of solving the problem stated in
Section 2 using the suggested algorithm explained in
Section 4 is
(a detailed proof is given in [
16,
22,
28]), where
is a predefined error tolerance. This means that the time cost of the developed algorithm is independent of the number of inputs
n. In other words, the developed quantum algorithm solves the problem stated in
Section 2 in constant time, whereas the classical approach requires exponential time. However, it is worth noting that, while the classical approach handles the problem in exponential time, it is deterministic. Conversely, the developed approach solves the same problem in constant time with allowable error
. Furthermore, the proposed quantum approach requires
qubits in each of the two versions of the quantum system (see
Section 4). Therefore, the total memory space that is required for the proposed quantum approach is
. Hence, the proposed algorithm offers a quantum advantage over the classical approach in terms of both memory space and time complexity.
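To make the exponential cost of a deterministic classical approach concrete, the following minimal sketch counts the junta variables of a black-box Boolean function by exhaustive evaluation. It is an illustrative baseline under the assumption that the oracle can only be queried point by point; it is not necessarily the classical method analysed in [16], and the function name and example oracles are hypothetical.

```python
from itertools import product


def count_junta_variables(f, n):
    """Count the variables that an n-input Boolean function f depends on.

    f takes a tuple of n bits and returns 0 or 1.
    Returns (k, queries): the number of junta variables and the number of
    oracle evaluations, which is 2**n because the full truth table is stored.
    """
    # Exponential time and memory: one oracle query per possible input.
    table = {x: f(x) for x in product((0, 1), repeat=n)}
    k = 0
    for i in range(n):
        # Variable i is relevant if flipping it changes f for some input.
        if any(table[x] != table[x[:i] + (1 - x[i],) + x[i + 1:]] for x in table):
            k += 1
    return k, len(table)


if __name__ == "__main__":
    # Hypothetical oracle on 5 inputs that depends on x0, x2, and x3 only.
    print(count_junta_variables(lambda x: x[0] ^ (x[2] & x[3]), 5))
```

The result is exact, but both the number of queries and the stored table double with every additional input variable.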