Article

Transforming Neural Networks into Quantum-Cognitive Models: A Research Tutorial with Novel Applications

Artificial Intelligence and Cyber Futures Institute, Charles Sturt University, Bathurst, NSW 2795, Australia
* Author to whom correspondence should be addressed.
Technologies 2025, 13(5), 183; https://doi.org/10.3390/technologies13050183
Submission received: 10 March 2025 / Revised: 27 April 2025 / Accepted: 3 May 2025 / Published: 4 May 2025
(This article belongs to the Topic Quantum Information and Quantum Computing, 2nd Volume)

Abstract

Quantum technologies are increasingly pervasive, underpinning the operation of numerous electronic, optical and medical devices. Today, we are also witnessing rapid advancements in quantum computing and communication. However, access to quantum technologies in computation remains largely limited to professionals in research organisations and high-tech industries. This paper demonstrates how traditional neural networks can be transformed into neuromorphic quantum models, enabling anyone with a basic understanding of undergraduate-level machine learning to create quantum-inspired models that mimic the functioning of the human brain—all using a standard laptop. We present several examples of these quantum machine learning transformations and explore their potential applications, aiming to make quantum technology more accessible and practical for broader use. The examples discussed in this paper include quantum-inspired analogues of feedforward neural networks, recurrent neural networks, Echo State Network reservoir computing, and Bayesian neural networks, demonstrating that a quantum approach can both optimise the training process and equip the models with certain human-like cognitive characteristics.

1. Introduction

In his address to the United Nations Security Council, Professor Yann LeCun, Chief AI Scientist at Meta, highlighted significant limitations of current artificial intelligence (AI) systems [1]. He noted that state-of-the-art AI systems lack a true understanding of the real world, possess no persistent memory, and are incapable of reasoning or effective planning. Furthermore, they fall short in acquiring new skills with the speed and efficiency demonstrated by humans or even animals.
LeCun also emphasised that intelligence cannot be achieved merely by combining extensive datasets with high-performance computing capabilities. Instead, he argued that genuine AI must arise from cognition and human-like behavioural capabilities, surpassing the superficial ability to process vast amounts of data at high speed.
While LeCun’s vision, along with those of other scientists, champions the development of the next generation of cognitive and more human-like AI, the general public and many professionals not directly engaged in AI research or the application of AI technologies approach this idea with a certain degree of scepticism [2,3,4]. Notably, there is a widespread perception that the increased reliance on advanced AI tools may contribute to a decline in critical thinking and even broader cognitive skills among humans [5]. Moreover, the rapid proliferation of AI across various spheres of social and financial affairs presents significant challenges to the modern legal and financial systems [6,7,8].
In this context, quantum cognition theory (QCT) models provide a novel framework for developing human-like cognitive AI by offering insights into the probabilistic and contextual nature of human thought and decision making [9,10,11,12]. Unlike classical approaches, which are often grounded in deterministic models, QCT integrates principles from quantum mechanics with psychology [11,12], behaviour science, and decision making, shedding light on how humans process information, perceive ambiguities, and make judgments under uncertainty [13,14,15,16].
One key feature of QCT is the concept of quantum superposition that enables individuals to hold multiple, sometimes contradictory, beliefs or percepts simultaneously until a decision collapses these possibilities into a single outcome [9,10,11,12]. This capability mirrors the complex and nuanced ways humans approach decision making, particularly in scenarios involving uncertainty or conflicting information [13,14]. Moreover, QCT provides mechanisms for understanding phenomena such as biases, optical illusions, and the contextuality of human judgments [11,12,15,16], which are critical for designing AI systems that can better interpret and respond to human behaviour.
Thus, in essence, QCT bridges the gap between cognitive psychology and advanced AI, offering a paradigm shift in how we design intelligent systems. Such systems hold the promise to enhance decision making in real-world applications, ranging from healthcare and finance to autonomous systems [17], by fostering a deeper integration of human-like cognitive processes within AI systems.
It is worth noting that QCT and its practical applications have often faced unwarranted scepticism and criticism, largely due to confusion regarding the theory of the quantum mind [18], which proposes that quantum effects may influence cognitive processes [19] (see, e.g., the relevant discussion in Ref. [20]). Nevertheless, the quantum mind hypothesis merits at least theoretical consideration, as it expands general knowledge and potentially contributes to a deeper understanding of the concept of quantum neural networks.

1.1. Quantum Neural Networks

Quantum neural networks (QNNs) are computational models that integrate the principles of quantum mechanics with neural network architectures. The concept of quantum neural computation can be traced back to the 1990s [21,22], though its origins may be even earlier (for a relevant review see, e.g., Ref. [23]). Interestingly, a connection can also be drawn between the early concepts of QNNs and the theory of the quantum mind [18], as well as proposals to investigate quantum effects and their potential impact on cognitive processes [19].
Contemporary research on QNNs focuses on integrating classical artificial neural networks, which are widely used in machine learning (ML) and computer vision, with the unique advantages of quantum information processing [24,25]. This integration aims to develop more efficient algorithms by using quantum phenomena such as superposition and entanglement to enhance computational performance [23,26]. For instance, the work of Ref. [27] examined the quantum generalisation of traditional neural network models, while Refs. [28,29,30] introduced diverse approaches to quantum-inspired deep learning [31,32]. A further systematic review of QNN models, including hardware implementations of QNNs, can be found in Refs. [33,34].
Moreover, QNN models provide robust frameworks for developing uncertainty-aware neural networks, thereby enhancing the reliability and safety of AI systems in high-stakes applications. Yet, while quantum mind theory and related scientific hypotheses have often been regarded as distinct from the framework of QCT, as discussed, for example, in Ref. [16], these approaches should eventually converge, further advancing the ability of QNNs to successfully complete these ambitious tasks.

1.2. Neuromorphic Computing

QNNs also share a conceptual background with neuromorphic computing (NC) through their foundational aim of understanding and artificially reproducing cognitive processes observed in a biological brain [35,36,37,38]. NC systems seek to develop hardware and algorithms inspired by the architecture and functioning of biological neural systems, thereby mimicking spiking neurons and synaptic connections to emulate the efficiency and adaptability of a biological brain [17,36,39,40]. Both QNNs and NC systems aim to overcome the limitations of conventional computing architectures, which rely on deterministic logic and sequential processing and, therefore, can be suboptimal for complex tasks such as pattern recognition, real-time decision making, and adaptive learning. In particular, NC systems address these challenges by using event-driven architectures and multithreaded information processing, aligning with how neurons and synapses function in the brain [35,36,37].
QNNs further enhance the concept of NC by exploiting quantum states to encode and process information, enabling parallelism and exponential scalability in specific computational tasks. Therefore, the integration of QNNs with neuromorphic principles offers exciting possibilities. For instance, NC systems could serve as a physical platform for implementing QNNs, using analogue circuits to represent quantum-inspired operations like superposition and entanglement. Conversely, QNNs could inspire new neuromorphic designs, employing quantum algorithms to model phenomena such as associative memory and probabilistic reasoning.

1.3. Objectives of This Tutorial

Thus, this tutorial paper aims to bridge the gap between quantum technology and broader accessibility by demonstrating how traditional neural networks can be transformed into neuromorphic quantum models. While quantum technologies are becoming increasingly pervasive in electronic, optical, and medical devices—and quantum computing and communication are advancing rapidly—access to them remains largely confined to researchers and professionals in high-tech industries. This paper illustrates how anyone with a basic understanding of undergraduate-level ML can develop quantum-inspired models that emulate human brain function—all using a standard laptop.
We present several examples of quantum transformations and explore their potential applications, making quantum technology more accessible and practical. Specifically, we introduce quantum-inspired analogues of feedforward neural networks (FNNs), recurrent neural networks (RNNs), Echo State Network (ESN) reservoir computing, and Bayesian neural networks (BNNs). These quantum approaches not only optimise the training process but also imbue the models with certain human-like cognitive characteristics, opening new avenues for quantum-enhanced AI.

1.4. Organisation of the Article

The remainder of this article is organised as follows. In Section 2, we expand on the distinctions between traditional and quantum neural networks. This discussion is intentionally broad, incorporating general aspects of quantum information and technology that extend beyond the conventional concept of the qubit [41]. In particular, we do not limit our analysis to digital, gate-based quantum computing models. Instead, we focus primarily on analogue quantum technologies—such as those exploiting quantum tunnelling and continuous-variable systems—which are often overlooked in standard quantum ML approaches. These analogue models enable a different, and in some cases more natural, integration with neuromorphic computing principles and quantum cognitive theories, allowing us to explore richer and more biologically plausible architectures for AI [17].
Section 3 continues this discussion, with a focus on quantum tunnelling. We outline the fundamental physics underpinning this phenomenon, its connection with key principles in quantum computing, and its novelty within the context of machine learning.
Section 4 is a central component of this tutorial, as it establishes the foundation for understanding the departure point for the quantum transformation—namely, traditional neural network models. We present these models in a highly detailed manner to offer a self-contained and accessible resource for both expert and non-expert readers.
In line with this, we start our discussion with a basic FNN, one of the simplest types of artificial neural networks, comprising an input layer, one or more hidden layers, and an output layer, where information flows strictly in one direction, from input to output, without any cycles or feedback loops [42]. Each neuron in a layer is connected to every neuron in the subsequent layer, and the network learns by adjusting the weights of these connections during training, typically using backpropagation in conjunction with gradient descent.
Despite its simplicity, the FNN is capable of learning complex, non-linear relationships between inputs and outputs provided it has an adequate number of hidden units and suitable activation functions. In this work, the number of hidden units is intentionally minimised to ensure that the accompanying code can run efficiently on any computer. However, the choice of activation function is a central theme throughout the article, as it plays a crucial role in the performance of the model and forms one of the key subjects of our discussion. Then, building on FNNs, which serve as the foundational architecture for more advanced models, we extend our approach to more complex models such as Bayesian networks, RNNs, and ESNs. This demonstrates the universality of our method in transforming traditional ML models into quantum-inspired counterparts.
Finally, we note that a comprehensive comparison between the performance of traditional and quantum-based neural networks is beyond the scope of this article, as the outcomes of such investigations have already been published elsewhere [15,16,43,44]. The cited work [43] has also provided a detailed discussion of the initial values of the neural weights used in quantum-based models.

2. Motivation

2.1. General Background

Mechanics, a fundamental branch of classical physics, investigates the motion of physical objects resulting from the application of forces [45]. These objects range from macroscopic entities, such as balls and vehicles, to celestial bodies. In contrast, quantum mechanics provides a framework for describing the behaviour of systems at atomic and subatomic scales [46,47]. Quantum mechanics extends beyond the explanatory power of classical physics, enabling the understanding of phenomena involving photons, electrons, and other quantum particles [46,47]. Furthermore, it underpins a wide array of technologies integral to contemporary society, including semiconductor devices, medical imaging systems, optical fibre communication networks, and quantum computing [48].
The principles of quantum mechanics often challenge intuition due to their departure from the classical framework [49]. A prominent example is the concept of superposition that allows a quantum system to exist simultaneously in multiple states until measured [46,47,49]. This concept is famously illustrated by Schrödinger’s cat thought experiment (Figure 1a). Interestingly, Schrödinger did not intend for the notion of a dead-and-alive cat to be taken as a serious possibility. Rather, he used this thought experiment to highlight the absurdity of the prevailing interpretation of quantum mechanics [50]. However, advancements in quantum mechanics since Schrödinger’s time have led scientists to propose alternative interpretations of its mathematical framework, rendering the concept of a superposition of ‘alive and dead’ states more tangible and potentially applicable in practical contexts.
Another foundational principle is Heisenberg’s uncertainty principle that asserts a fundamental limit to the simultaneous measurement precision of certain pairs of physical properties, such as position and momentum [49]. Specifically, as the precision of one property increases, the uncertainty of the other proportionally increases.
Additionally, the phenomenon of quantum entanglement represents a critical aspect of quantum mechanics [51]. Entanglement occurs when two or more particles are generated or interact in such a way that their quantum states become interdependent [24]. As a result, the state of one particle cannot be fully described without reference to the state of the other, regardless of the spatial separation between them. This phenomenon challenges classical notions of locality and underscores the unique characteristics of quantum systems.
To further contextualise this discussion, we compare the operation of a traditional digital computer with that of a quantum computer [48]. A classical digital computer relies on bits, which are always in one of two discrete physical states, representing the binary values ‘0’ and ‘1’. This behaviour is analogous to an on/off light switch. In contrast, a quantum computer employs quantum bits (qubits), which can occupy the states $|0\rangle$ and $|1\rangle$, analogous to the binary states of a classical bit. However, a qubit can also exist in a superposition of these states, represented mathematically as $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$, where the coefficients $\alpha$ and $\beta$ are complex numbers that satisfy the normalisation condition $|\alpha|^2 + |\beta|^2 = 1$.
From a physical perspective, the state of a qubit can be visualised on the Bloch sphere (Figure 1b). When a closed qubit system interacts in a controlled manner with its environment, measurement reveals the probabilities of finding the qubit in either of its basis states. Specifically, for the state $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$, the measurement probabilities are given by $P_{|0\rangle} = |\alpha|^2$ and $P_{|1\rangle} = |\beta|^2$. This implies that, upon measurement, the qubit collapses to one of its basis states $|0\rangle$ or $|1\rangle$. Graphically, this measurement process corresponds to projecting the qubit state onto one of the coordinate axes of the Bloch sphere (e.g., the $z$ axis in Figure 1b).
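As a minimal illustration, the following Python sketch (not part of the accompanying source code; the amplitudes are arbitrary examples) simulates repeated projective measurements of the state $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$ and compares the observed frequencies with the Born-rule probabilities:

```python
import numpy as np

# Illustrative sketch: a qubit state |psi> = alpha|0> + beta|1> and the statistics
# of repeated projective measurements onto the computational basis.
alpha, beta = 1 / np.sqrt(3), np.sqrt(2 / 3) * 1j     # example amplitudes, |alpha|^2 + |beta|^2 = 1
assert np.isclose(abs(alpha) ** 2 + abs(beta) ** 2, 1.0)

p0 = abs(alpha) ** 2                                   # Born-rule probability P|0>
rng = np.random.default_rng(0)
outcomes = rng.choice([0, 1], size=10_000, p=[p0, 1 - p0])  # simulated measurement outcomes

print(f"theory:    P(0) = {p0:.3f}, P(1) = {1 - p0:.3f}")
print(f"simulated: P(0) = {np.mean(outcomes == 0):.3f}, P(1) = {np.mean(outcomes == 1):.3f}")
```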
To demonstrate that light behaves as a wave, scientists illuminate a narrow slit in an opaque screen to observe how light passes through the slit, bends at its edges, and spreads out beyond it. This phenomenon is known as diffraction. When the screen contains two narrow slits, the optical waves diffracted by the two slits interact to produce an alternating pattern of light and dark bands referred to as interference fringes. These experimental results can be reproduced using waves of different natures, including water waves [49].
However, the same experiment can be performed using electrons (Figure 1c). Each electron passing through the slits is registered on a screen as a single bright spot. As more electrons pass through, the individual bright spots begin to cluster, overlap, and merge. Ultimately, a double-slit interference pattern emerges, characterised by alternating bright and dark fringes, analogous to the pattern observed in experiments involving optical waves. This result indicates that each individual electron exhibits wave-like behaviour, described by a wave function $\psi$, which passes through both slits simultaneously and interferes with itself before striking the screen.
The square magnitude of the wave function, $|\psi|^2$, represents the probability density of the particle. Accordingly, the alternating peaks and troughs of the wave function of the electron correspond to a quantum probability pattern: bright fringes indicate a higher probability of finding an electron, while dark fringes indicate a lower probability. Before an electron strikes the screen, its position is not definite but rather probabilistic—it can be found anywhere that the modulus square of the wave function is non-zero. This probability distribution, where multiple states exist simultaneously, is a manifestation of quantum superposition.

2.2. Menneer–Narayanan Quantum-Theoretic Concept

In the 1995 technical report ‘Quantum-inspired Neural Networks’ by Tamaryn Menneer and Ajit Narayanan [52,53], the authors explore the integration of quantum-theoretic concepts into neural network training methodologies. They propose an innovative approach inspired by the many-worlds interpretation of quantum mechanics, aiming to enhance computational efficiency and address problems that traditional neural networks struggle to solve. Although that work initially received relatively little attention, it stands as a pioneering proposal that has significantly influenced subsequent research into quantum neural architectures [23].
The core idea involves training multiple single-layer neural networks, each on a distinct pattern, rather than training a single network on multiple patterns. The weights from these individual networks are then combined to form a quantum network, where the weights are calculated as a superposition of the individual weights of the networks. This method employs the concept of superposition from quantum theory to potentially improve learning efficiency and problem-solving capabilities.
The authors draw an analogy between their approach and the famous double-slit experiment in quantum mechanics (Figure 1b,c). Just as an electron can exist in a superposition of paths until measured, the quantum-inspired neural network can maintain a superposition of multiple learned patterns until collapsed into a final trained state. This analogy reinforces the idea that quantum-inspired models can explore multiple solutions simultaneously, akin to quantum parallelism [48].
The authors validated their approach using two microfeature tasks, demonstrating the potential advantages of their quantum-inspired training method. Thus, their work represents an early effort to merge principles from quantum mechanics with neural network training, laying the groundwork for future research in quantum-inspired computational models [15,16]. In fact, as demonstrated below, at both the theoretical and computational levels, the double-slit experiment and an experiment showcasing the effect of quantum tunnelling through a potential barrier are conceptually similar. This similarity also establishes a connection between the idea of tunnelling-based neural networks discussed in this paper and the Menneer–Narayanan quantum-theoretic concept.

3. Quantum Tunnelling Effect

3.1. Theory

Quantum tunnelling (QT) is a fundamental phenomenon in quantum mechanics that enables particles to pass through potential energy barriers that would be insurmountable under the laws of classical physics [46,47]. This effect arises due to the wave-like nature of quantum particles, described by Schrödinger’s equation, which permits non-zero probability amplitudes even in classically forbidden regions, i.e., regions where the particle does not have sufficient energy to be found classically.
At a microscopic scale, the effect of QT is a direct consequence of the Heisenberg uncertainty principle and the probabilistic interpretation of quantum mechanics. Unlike classical particles, which require sufficient energy to overcome a barrier, quantum particles can ‘tunnel’ through it due to the non-zero probability of their wavefunction extending beyond the boundaries of the barrier.
Mathematically, this behaviour is captured by the transmission coefficient, which depends on factors such as the barrier width, height, and the energy of the particle, and is described by the time-independent Schrödinger equation
$$\left[-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x)\right]\psi(x) = E\,\psi(x),$$
where $\psi(x)$ is the wave function, $m \approx 9.1093837 \times 10^{-31}$ kg is the mass of the electron, $\hbar \approx 1.054571817 \times 10^{-34}$ J·s is the reduced Planck constant, and $E$ is the energy of the electron. The profile of the potential barrier is
$$V(x) = \begin{cases} 0 & \text{for } x < 0 \\ V_0 & \text{for } 0 < x \le a \\ 0 & \text{for } x > a. \end{cases}$$
In classical mechanics, a counterpart of this physical system is a marble ball. While a ball with energy $E < V_0$ cannot penetrate the barrier, an electron, behaving as a matter wave, has a non-zero probability of penetrating the barrier and continuing its motion on the other side. Similarly, for $E > V_0$, the electron may be reflected from the barrier with a non-zero probability.
The electron tunnelling behaviour can be quantified by finding the transmission coefficient from the solution of Equation (1) for the potential barrier given by Equation (2). The solution of the Schrödinger equation can be written as a superposition of left- and right-moving waves [54]
$$\psi(x) = \begin{cases} \psi_L(x) = A_1 e^{ikx} + A_2 e^{-ikx}, & x < 0 \\ \psi_C(x) = B_1 e^{i\kappa x} + B_2 e^{-i\kappa x}, & 0 < x \le a \\ \psi_R(x) = C_1 e^{ikx} + C_2 e^{-ikx}, & x > a, \end{cases}$$
where $i$ is the imaginary unit, $k = \sqrt{2mE/\hbar^2}$, and $\kappa = \sqrt{2m\alpha/\hbar^2}$, with $\alpha = E - V_0$ (the special cases $E = 0$ and $E = V_0$ are treated separately). The coefficients $A$, $B$, and $C$ are found from the boundary conditions at $x = 0$ and $x = a$, which require that $\psi(x)$ and its derivative be continuous. Below, omitting the intermediate derivations [54], we present the expressions for the probability of the electron transmission through the barrier.
For electron energies smaller than the barrier height ($E < V_0$), there is a non-zero transmission probability [54]
$$T\big|_{E<V_0} = \left[1 - \beta \sinh^2(\kappa_1 a)\right]^{-1},$$
where $\beta = \frac{V_0^2}{4E\alpha}$ and $\kappa_1 = \sqrt{-2m\alpha/\hbar^2}$. For $E > V_0$,
$$T\big|_{E>V_0} = \left[1 + \beta \sin^2(\kappa a)\right]^{-1}.$$
Finally, the expression for $E = V_0$ is obtained by taking the limit of $T$ as $E$ approaches $V_0$, resulting in
$$T\big|_{E=V_0} = \left[1 + \frac{m a^2 V_0}{2\hbar^2}\right]^{-1}.$$
For example, suppose an electron with energy $E = 5$ eV encounters a barrier of height $V_0 = 10$ eV and width $a = 1$ nm. Using the expressions above, we can demonstrate that the transmission coefficient will be $T \approx 5 \times 10^{-10}$, which suggests that tunnelling is highly unlikely but not impossible. If the barrier width is reduced to $0.5$ nm, the transmission coefficient increases significantly to $T \approx 4.25 \times 10^{-5}$, highlighting how nanoscale engineering can control the effect of QT. It is noteworthy that the parameters used in this example possess not only a clear physical interpretation but can also be viewed as hyperparameters within a model of human mental states developed under the QCT framework [13].
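The following short Python sketch reproduces this worked example using the expressions above (the fundamental constants are standard CODATA values; the equivalent positive-definite form of the $E < V_0$ expression is used to avoid sign ambiguities):

```python
import numpy as np

# Illustrative sketch: transmission coefficient of a rectangular barrier, reproducing
# the worked example above (E = 5 eV, V0 = 10 eV, a = 1 nm and a = 0.5 nm).
HBAR = 1.054571817e-34   # reduced Planck constant, J*s
M_E = 9.1093837e-31      # electron mass, kg
EV = 1.602176634e-19     # electronvolt, J

def transmission(E_eV, V0_eV, a):
    """Transmission coefficient T for a barrier of height V0_eV (eV) and width a (metres)."""
    E, V0 = E_eV * EV, V0_eV * EV
    if E < V0:
        kappa1 = np.sqrt(2 * M_E * (V0 - E)) / HBAR
        beta = V0**2 / (4 * E * (V0 - E))          # equivalent to -beta with alpha = E - V0
        return 1.0 / (1.0 + beta * np.sinh(kappa1 * a) ** 2)
    if E > V0:
        kappa = np.sqrt(2 * M_E * (E - V0)) / HBAR
        beta = V0**2 / (4 * E * (E - V0))
        return 1.0 / (1.0 + beta * np.sin(kappa * a) ** 2)
    return 1.0 / (1.0 + M_E * a**2 * V0 / (2 * HBAR**2))   # limiting case E = V0

print(transmission(5, 10, 1e-9))     # ~5e-10
print(transmission(5, 10, 0.5e-9))   # ~4e-5
```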
The effect of QT has been widely utilised in semiconductor electronic devices [55,56,57,58], as well as in various spectroscopy [59,60] and microscopy techniques [60,61]. Furthermore, electron devices exploiting QT have been demonstrated as fundamental building blocks of NC systems [62,63]. However, in these neuromorphic computers, QT has not been harnessed directly. Instead, the non-linear dynamics of entire QT-based electron devices and their associated circuits have been employed as a means of computation.

3.2. Practical Applications of Quantum Tunnelling

In practice, QT has been exploited in tunnel [55] and resonant-tunnelling [57] diodes. Additionally, certain NC architectures have exploited negative differential resistance [40,64], which is a hallmark characteristic of tunnel diodes. Notably, systems based on tunnel diodes and other QT-based devices exhibit significantly lower power consumption compared to conventional integrated electronic circuits [62]. Moreover, QT has played a crucial role in the development of scanning tunnelling microscopy (STM) instrumentation [61]. Lastly, QT of individual electrons has been experimentally observed in quantum dots [64], which serve as essential building blocks of quantum neuromorphic systems [65].
Recent advancements in quantum computing have increasingly harnessed the phenomenon of QT in both quantum annealing and NC architectures [66]. In quantum annealing, tunnelling enables systems to efficiently navigate complex energy landscapes by allowing quantum states to traverse potential energy barriers, thereby facilitating the discovery of optimal solutions in combinatorial optimisation problems. For instance, D-Wave’s quantum annealers exploit quantum tunnelling to solve such problems effectively (see Ref. [66] and references therein).
In the field of NC systems, efforts have been made to emulate the brain’s architecture by integrating QT mechanisms. A particular example is the development of neuromorphic Ising machines that utilise Fowler–Nordheim tunnelling annealers [67]. These systems employ pairs of asynchronous ON-OFF neurons, with thresholds adaptively adjusted by annealers replicating optimal escape mechanisms, thereby ensuring convergence to ground states in Ising problems.
Moreover, the effect of QT is essential in quantum biology, influencing enzymic reactions and energy transfer in photosynthesis [68,69,70]. In this context, it has been demonstrated that QT might play a role in the quantum mind theories discussed above in this text [71,72].

3.3. The Relationship Between QT and Menneer–Narayanan Quantum-Theoretic Concept

Let us now focus on Figure 1d. In quantum mechanics, the state of an electron remains undefined until its wave function interacts with a detection screen, causing the wave function to ‘collapse’ and the electron to manifest at a specific location. This principle mirrors the approach used by Shor in his quantum computing algorithm for factoring large integers [73], as well as by Menneer and Narayanan in their pursuit of a robust QNN architecture [52,53].
As detailed in Refs. [52,53], a memory register is initially placed in a superposition of all possible integers it can hold. Following this, a separate calculation occurs in each universe path after the slit (it has been argued [26] that the operation of the specific class of quantum neural networks discussed in Refs. [52,53] is consistent with the many-worlds (parallel universe) interpretation of quantum mechanics [74,75,76,77], which warrants the use of the term ‘universe’ in this current text). The computation halts when the universes begin to interfere with one another, forming standing waves from the repeating sequences of integers in each universe. While there is no guarantee of accuracy in the results, a subsequent check can confirm if the returned numbers are indeed prime factors of the large integer. Remarkably, quantum computing can solve the prime factorisation problem in seconds, which is a feat that would take classical computers exponentially longer [48].
Both the double-slit experiment and QT can be illustrated using a two-dimensional mathematical model, where the electron is represented as a Gaussian-shaped energy packet advancing towards a potential barrier, which may or may not contain slits (Figure 2). The motion of the energy packet, with its direction of propagation indicated by the arrows and energy level encoded in false colour, and its interaction with the barrier, represented by white rectangles, are governed by the Schrödinger equation, which is solved using the Crank–Nicolson method [78,79].
Figure 2 plots the probability density in two-dimensional space, illustrating the evolution of the energy packet through four snapshots taken at distinct non-dimensionalised time intervals. The packet then interacts with the barrier, producing both a reflected and a transmitted signal. The top panels Figure 2(a.i)–(a.iv) show the physical picture used by Menneer and Narayanan [52,53], while the bottom panels Figure 2(b.i)–(b.iv) offer a more rigorous depiction of the effect of QT compared with the algebraic expressions obtained above, also highlighting the physical similarity between QT-based neural network models and the approach suggested by Menneer and Narayanan.

4. Benchmarking Neural Network Models

We begin this section by outlining the traditional neural network algorithms used in this paper as a reference. Then, we present a general framework for transforming these conventional methods into neuromorphic quantum approaches. To demonstrate the effectiveness of this framework, in the following sections, we will test the resulting quantum networks on key tasks: classification tasks such as image recognition for FNN, RNN, and BNN models and chaotic time series prediction for the ESN model. For classification tasks involving image recognition, input images will be flattened into vectors and normalised to the range $[0, 1]$, while the corresponding labels will be transformed into one-hot encodings for loss calculation [42]. A similar normalisation of inputs will also be applied in the ESN tests to ensure consistency across tasks [80].

4.1. Feedforward Neural Networks

FNNs are a class of neural networks where information flows in a single direction, from input to output, without feedback connections. They consist of an input layer, one or more hidden layers, and an output layer. Each layer is fully connected to the next, with no recurrent connections [42].
The Rectified Linear Unit (ReLU) activation function is commonly used to introduce nonlinearity to an FNN model by outputting the input value directly if it is positive and producing zero otherwise [42]. However, other activation functions, such as the sigmoid and tanh, can also be employed depending on the specific characteristics of the task, with each offering distinct advantages in terms of gradient propagation and model performance [42].
For a given input vector x, the output of the network is computed layer by layer. The activation a ( l ) of the l-th layer is given by the equation
$$a^{(l)} = W^{(l)} z^{(l-1)} + b^{(l)},$$
where $W^{(l)} \in \mathbb{R}^{n_l \times n_{l-1}}$ is the weight matrix connecting layer $l-1$ to layer $l$, $z^{(l-1)} \in \mathbb{R}^{n_{l-1}}$ denotes the activations from the previous layer, and $b^{(l)} \in \mathbb{R}^{n_l}$ is the bias vector for layer $l$. The output of layer $l$, denoted $z^{(l)}$, is obtained by applying a non-linear activation function $\sigma$ to the pre-activation $a^{(l)}$ as
$$z^{(l)} = \sigma\left(a^{(l)}\right).$$
At the output layer, the network produces a prediction $\hat{y}$, which depends on the computational task. For classification, the Softmax function is commonly used to compute class probabilities [42]. This function reads as
$$\hat{y}_i = \frac{\exp\left(z_i^{(L)}\right)}{\sum_j \exp\left(z_j^{(L)}\right)},$$
where $z^{(L)}$ represents the output of the final layer $L$, and $\hat{y}_i$ is the probability of class $i$.
The network is trained by minimising a loss function $\mathcal{L}$, which quantifies the difference between the predicted output $\hat{y}$ and the true target $y$. For classification, the cross-entropy loss is typically used:
$$\mathcal{L} = -\sum_i y_i \log(\hat{y}_i).$$
To optimise the weights $W^{(l)}$ and biases $b^{(l)}$, gradient backpropagation is employed [42]. This involves computing the gradients of the loss function with respect to the network parameters. Using the chain rule, the gradient $\partial\mathcal{L}/\partial W^{(l)}$ is calculated as
$$\frac{\partial\mathcal{L}}{\partial W^{(l)}} = \delta^{(l)} \cdot \left(z^{(l-1)}\right)^{\top},$$
where $\delta^{(l)}$ represents the error at layer $l$ propagated backward through the network
$$\delta^{(l)} = \left(W^{(l+1)}\right)^{\top} \delta^{(l+1)} \odot \sigma'\left(a^{(l)}\right),$$
where $\left(W^{(l+1)}\right)^{\top}$ is the transpose of the weight matrix from layer $l$ to $l+1$, $\odot$ is the element-wise multiplication operator, and $\sigma'\left(a^{(l)}\right)$ is the derivative of the activation function at $a^{(l)}$. Then, the error at the output layer is computed as
$$\delta^{(L)} = \hat{y} - y.$$
Using these gradients, the parameters are updated via gradient descent
$$W^{(l)} \leftarrow W^{(l)} - \eta \frac{\partial\mathcal{L}}{\partial W^{(l)}},$$
$$b^{(l)} \leftarrow b^{(l)} - \eta \frac{\partial\mathcal{L}}{\partial b^{(l)}},$$
where η is the learning rate parameter.
By iteratively applying these updates over the training data, the network learns to minimise the loss function, improving its performance on the given task.
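As a minimal illustration of the training loop described above, the following NumPy sketch performs one forward and backward pass for a single-hidden-layer FNN with ReLU and Softmax; the layer sizes, learning rate, and toy data are illustrative values only:

```python
import numpy as np

# Minimal sketch of the forward pass, backpropagation, and gradient-descent update
# described above, for one hidden layer with ReLU and a Softmax output.
rng = np.random.default_rng(0)
n_in, n_hid, n_out, batch = 784, 512, 10, 64
eta = 0.01                                          # learning rate

# Small random initialisation (mean 0, std 0.01), as discussed in Section 5
W1, b1 = 0.01 * rng.standard_normal((n_hid, n_in)), np.zeros((n_hid, 1))
W2, b2 = 0.01 * rng.standard_normal((n_out, n_hid)), np.zeros((n_out, 1))

x = rng.random((n_in, batch))                       # toy inputs, already normalised to [0, 1]
y = np.eye(n_out)[:, rng.integers(0, n_out, batch)] # toy one-hot targets

# Forward pass: a(l) = W(l) z(l-1) + b(l), z(l) = sigma(a(l))
a1 = W1 @ x + b1
z1 = np.maximum(a1, 0)                              # ReLU
a2 = W2 @ z1 + b2
a2 -= a2.max(axis=0, keepdims=True)                 # numerical stabilisation of Softmax
y_hat = np.exp(a2) / np.exp(a2).sum(axis=0, keepdims=True)
loss = -np.mean(np.sum(y * np.log(y_hat + 1e-12), axis=0))

# Backward pass: delta(L) = y_hat - y, delta(l) = W(l+1)^T delta(l+1) (*) sigma'(a(l))
d2 = (y_hat - y) / batch
d1 = (W2.T @ d2) * (a1 > 0)

# Gradient-descent updates
W2 -= eta * d2 @ z1.T
b2 -= eta * d2.sum(axis=1, keepdims=True)
W1 -= eta * d1 @ x.T
b1 -= eta * d1.sum(axis=1, keepdims=True)
print(f"cross-entropy loss: {loss:.4f}")
```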

4.2. Recurrent Neural Networks

RNNs are a class of neural networks designed for processing sequential data [81,82,83]. The RNN algorithm relies on the iterative update of a hidden state, thereby capturing information about past inputs in the sequence.
The hidden state at time step $t$, denoted as $h_t$, is updated based on the current input $x_t$ and the previous hidden state $h_{t-1}$ as
$$h_t = \tanh\left(W_h h_{t-1} + W_x x_t + b_h\right),$$
where $W_h \in \mathbb{R}^{n \times n}$ is the weight matrix for the hidden state, $W_x \in \mathbb{R}^{n \times m}$ is the weight matrix for the input, $b_h \in \mathbb{R}^{n}$ is the bias vector for the hidden state, and $\tanh(\cdot)$ is the hyperbolic tangent activation function introducing nonlinearity to the model.
The output at time step $t$, denoted as $o_t$, is computed based on the current hidden state $h_t$ as
$$o_t = W_y h_t + b_y,$$
where $W_y \in \mathbb{R}^{k \times n}$ is the weight matrix for the output, and $b_y \in \mathbb{R}^{k}$ is the bias vector for the output.
For classification tasks, the output $o_t$ is processed by means of the Softmax function to compute class probabilities $\hat{y}_t$. In the context of the discussion in this subsection, this function can be written as
$$\hat{y}_t^{(i)} = \mathrm{softmax}(o_t)^{(i)} = \frac{\exp\left(o_t^{(i)}\right)}{\sum_j \exp\left(o_t^{(j)}\right)},$$
where $\hat{y}_t^{(i)}$ represents the probability of the $i$-th class at time $t$.
The loss over a sequence is computed as the sum of the individual losses at each time step:
$$\mathcal{L} = \sum_{t=1}^{T} \mathcal{L}_t.$$
The cross-entropy loss that is typically used for classification tasks is
$$\mathcal{L}_t = -\sum_i y_t^{(i)} \log\left(\hat{y}_t^{(i)}\right),$$
where $y_t^{(i)}$ is the true label for class $i$ at time $t$.
During the training of the network, gradients of the loss with respect to the weights are computed via Backpropagation Through Time (BPTT) [81]. Gradients for the loss at time $t$ depend on the current hidden state $h_t$ and the previous hidden states $\{h_{t-1}, h_{t-2}, \ldots\}$. Due to the repeated application of the chain rule, gradients can either vanish (approach zero) or explode (grow uncontrollably) as they propagate backward through time. This challenge limits the effectiveness of standard RNNs for long sequences and is managed using a gradient clipping technique [84].
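A minimal sketch of the forward pass through time and of the gradient-clipping safeguard is given below; full BPTT is omitted for brevity, and all dimensions and data are toy values chosen for illustration:

```python
import numpy as np

# Sketch of the RNN update defined above, unrolled over a short toy sequence,
# together with the gradient-clipping helper mentioned in the text.
rng = np.random.default_rng(1)
m, n, k, T = 8, 16, 2, 5                         # input size, hidden size, classes, sequence length
W_x = 0.01 * rng.standard_normal((n, m))
W_h = 0.01 * rng.standard_normal((n, n))
W_y = 0.01 * rng.standard_normal((k, n))
b_h, b_y = np.zeros(n), np.zeros(k)

xs = [rng.random(m) for _ in range(T)]           # toy input sequence
h = np.zeros(n)
for x_t in xs:                                   # forward pass through time
    h = np.tanh(W_h @ h + W_x @ x_t + b_h)       # hidden-state update
    o_t = W_y @ h + b_y                          # output at time t
    y_hat_t = np.exp(o_t - o_t.max()) / np.exp(o_t - o_t.max()).sum()   # Softmax
print("last-step class probabilities:", y_hat_t.round(3))

def clip_gradient(grad, max_norm=5.0):
    """Rescale a gradient whose norm exceeds max_norm (gradient clipping used in BPTT)."""
    norm = np.linalg.norm(grad)
    return grad * (max_norm / norm) if norm > max_norm else grad
```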

4.3. Bayesian Neural Networks

BNNs are a probabilistic extension of traditional neural networks that aims to model the uncertainty in the weights of the network [85,86,87,88]. Instead of learning deterministic weights, BNNs estimate a distribution over the weights. This enables the model to quantify the uncertainty in its predictions, which is particularly useful in situations where decisions must be made under uncertainty. The typical architecture of a BNN consists of an input layer that processes the input data, hidden layers composed of neurons with probabilistic weights and biases, and an output layer that computes the output using the probabilistic weights.
Let $W_1$ and $W_2$ represent the weights for the first and second layers of the network, respectively. In a Bayesian framework, these weights are not fixed but are treated as random variables with a probability distribution. During training, the model aims to estimate the posterior distribution of the weights given the data.
The weights for the network are parameterised by their mean and standard deviation as follows:
$$W_1 = W_{1,\mathrm{mean}} + W_{1,\mathrm{std}} \cdot \epsilon_1, \quad \epsilon_1 \sim \mathcal{N}(0, 1),$$
$$W_2 = W_{2,\mathrm{mean}} + W_{2,\mathrm{std}} \cdot \epsilon_2, \quad \epsilon_2 \sim \mathcal{N}(0, 1),$$
where $\epsilon_1$ and $\epsilon_2$ are random variables drawn from a standard normal distribution. The forward pass procedure of the BNN uses the probabilistic weights. The first layer of the network is computed as
$$z_1 = X W_1 + b_1,$$
where $X$ is the input matrix, and $b_1$ is the bias for the first layer. The output of the first layer, $a_1$, is obtained using the ReLU activation function
$$a_1 = \mathrm{ReLU}(z_1).$$
The second layer output is then computed as
$$z_2 = a_1 W_2 + b_2,$$
and the output of the network is obtained by applying the Softmax function
$$y_{\mathrm{pred}} = \mathrm{Softmax}(z_2).$$
For prediction, the model samples weights from their respective distributions and exploits the sampled weights to make predictions. This process enables the network to estimate the uncertainty in its predictions. The output prediction is the average of predictions made from multiple weight samples
$$y_{\mathrm{pred}} = \frac{1}{N}\sum_{i=1}^{N} \mathrm{Softmax}\!\left(\mathrm{ReLU}\!\left(X W_1^{(i)} + b_1^{(i)}\right) W_2^{(i)} + b_2^{(i)}\right),$$
where $N$ is the number of weight samples, and each sample $W_1^{(i)}, W_2^{(i)}$ is drawn from a distribution parametrised by the means and standard deviations of the weights.
The weights of the BNN are trained using a standard stochastic gradient descent algorithm but with weight uncertainty incorporated into the training process. The gradient of the loss function with respect to the weights is computed using the chain rule, and the weights are updated in each step based on these gradients. The loss function is typically computed as
$$\mathrm{Loss} = -\frac{1}{T}\sum_{t=1}^{T}\sum_{c=1}^{C} y_t^{(c)} \log\left(p_t^{(c)}\right),$$
where $y_t^{(c)}$ is the true label, and $p_t^{(c)}$ is the predicted probability for class $c$ at time step $t$. Then, the gradients with respect to the weights are computed and used to update the means of the weight distributions
$$W_{1,\mathrm{mean}} \leftarrow W_{1,\mathrm{mean}} - \eta \cdot \nabla_{W_1}\mathrm{Loss},$$
where $\eta$ is the learning rate, and $\nabla_{W_1}$ represents the gradient of the loss with respect to $W_1$. Similar updates are applied for $W_2$, $b_1$, and $b_2$.
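The sampling-based prediction procedure can be sketched as follows; the hidden-layer size of 512, the fixed standard deviation of 0.01, and the 50 weight samples mirror the settings discussed later in this paper, while the input data are toy values and the biases are kept deterministic for brevity:

```python
import numpy as np

# Sketch of BNN prediction: weights parameterised by a mean and a standard deviation,
# with the class probabilities averaged over several sampled forward passes.
rng = np.random.default_rng(2)
d_in, d_hid, d_out, n_samples = 784, 512, 10, 50

W1_mean, W1_std = 0.01 * rng.standard_normal((d_in, d_hid)), 0.01 * np.ones((d_in, d_hid))
W2_mean, W2_std = 0.01 * rng.standard_normal((d_hid, d_out)), 0.01 * np.ones((d_hid, d_out))
b1, b2 = np.zeros(d_hid), np.zeros(d_out)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    return np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)

def predict(X):
    """Average the Softmax outputs over n_samples sampled weight realisations."""
    preds = []
    for _ in range(n_samples):
        W1 = W1_mean + W1_std * rng.standard_normal(W1_mean.shape)   # W1 = mean + std * eps
        W2 = W2_mean + W2_std * rng.standard_normal(W2_mean.shape)
        a1 = np.maximum(X @ W1 + b1, 0)                              # ReLU hidden layer
        preds.append(softmax(a1 @ W2 + b2))
    return np.mean(preds, axis=0)

X_toy = rng.random((3, d_in))          # three toy 'images' flattened to vectors
print(predict(X_toy).round(3))
```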

4.4. Echo State Networks and Reservoir Computing

Echo State Networks (ESNs) are an independent class of RNNs specifically designed for processing sequential data, including highly nonlinear and chaotic time series, while addressing issues such as vanishing and exploding gradients during training [35,36,37,39,80]. ESNs exploit a sparsely connected reservoir of dynamic recurrent units with fixed weights, focusing on training only the output weights, which makes it an independent ML technique [35].
A traditional ESN system consists of three primary components: an input layer that maps the input sequence to the reservoir, a reservoir representing a sparsely connected and randomly initialised network of recurrent neurons that provides rich non-linear dynamics in both physical and mathematical terms [35,36,39], and an output layer that linearly combines the reservoir states to generate predictions.
Similarly to the broader concept of NC, the core concept behind ESN draws inspiration from the functioning of biological brains, which operate through vast, intricate networks of neural connections. Like the brain, neural networks are dynamic systems, meaning they evolve over time and exhibit complex, non-linear, and sometimes chaotic behaviour [89,90]. In mathematical terms, dynamical systems are characterised by equations that describe how their states change over time [91]. This similarity between biological and artificial systems has led to the application of principles from non-linear dynamics in designing ESN-inspired artificial neural network models [92,93]. In particular, non-linear differential equations are often used to model how the connection strengths between nodes in artificial neural networks evolve over time [35,92,93].
For a given input temporal sequence $\{x_t\}_{t=1}^{T}$, the dynamical state of the reservoir $h_t$ at time step $t$ is updated as
$$h_t = \tanh\left(W_{\mathrm{in}} x_t + W_{\mathrm{res}} h_{t-1} + b\right),$$
where $W_{\mathrm{in}} \in \mathbb{R}^{N \times M}$ is the input weight matrix, $W_{\mathrm{res}} \in \mathbb{R}^{N \times N}$ is the reservoir weight matrix, $b \in \mathbb{R}^{N}$ is the bias vector, $N$ is the number of reservoir neurons, and $\tanh(\cdot)$ is the hyperbolic tangent activation function.
The reservoir weights $W_{\mathrm{res}}$ are scaled to ensure the echo state property, which guarantees that the states of the network are stable and dependent on the input history [35,80]. This is typically achieved by setting the spectral radius $\rho$ of $W_{\mathrm{res}}$ to be less than 1 as
$$\rho = \max|\lambda|,$$
where $\lambda$ runs over the eigenvalues of $W_{\mathrm{res}}$. The output at each time step, $y_t$, is computed as
$$y_t = W_{\mathrm{out}} h_t,$$
where $W_{\mathrm{out}} \in \mathbb{R}^{K \times N}$ is the output weight matrix, and $K$ is the dimension of the output. Unlike traditional RNNs, ESNs only train the output weights $W_{\mathrm{out}}$, keeping $W_{\mathrm{in}}$ and $W_{\mathrm{res}}$ fixed. The training involves solving a linear regression problem
$$W_{\mathrm{out}} = Y H^{+},$$
where $Y \in \mathbb{R}^{K \times T}$ is the desired output matrix, $H \in \mathbb{R}^{N \times T}$ is the matrix of reservoir states over time, and $H^{+}$ is the pseudo-inverse of $H$ [80]. Regularisation, such as ridge regression [80], is often used to prevent overfitting and is defined as
$$W_{\mathrm{out}} = Y H^{\top}\left(H H^{\top} + \lambda I\right)^{-1},$$
where $\lambda$ is the regularisation coefficient.
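A compact sketch of the reservoir update, spectral-radius scaling, and ridge-regression readout is shown below; the reservoir size, spectral radius, regularisation coefficient, and the toy input sequence are illustrative choices rather than the settings of Ref. [80]:

```python
import numpy as np

# Sketch of the ESN update and ridge-regression readout described above,
# trained here on a toy one-step-ahead prediction task.
rng = np.random.default_rng(3)
N, M, K, T, rho_target, lam = 200, 1, 1, 500, 0.9, 1e-6

W_in = 0.5 * rng.uniform(-1, 1, (N, M))
W_res = rng.standard_normal((N, N))
W_res *= rho_target / np.max(np.abs(np.linalg.eigvals(W_res)))   # enforce spectral radius < 1
b = np.zeros(N)

u = np.sin(0.1 * np.arange(T + 1)).reshape(-1, 1)    # toy input sequence
H = np.zeros((N, T))                                  # reservoir states over time
h = np.zeros(N)
for t in range(T):
    h = np.tanh(W_in @ u[t] + W_res @ h + b)          # reservoir update
    H[:, t] = h

Y = u[1:].T                                           # one-step-ahead prediction target
W_out = Y @ H.T @ np.linalg.inv(H @ H.T + lam * np.eye(N))   # ridge-regression readout
print("training MSE:", np.mean((W_out @ H - Y) ** 2))
```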

4.5. From Classical to Quantum: Transforming Computational Models

The process of converting a traditional neural network model into a QT-based model is relatively straightforward: the conventional activation functions (as exemplified by ReLU in Figure 3) are replaced by the physical QT effect. In this work, the Softmax function remains unchanged, although some success has been achieved in experimental tasks involving fully QT-based neural networks.
The same procedure applies to the derivative of the activation function if it is used in the particular model of interest [16]. Additionally, applying the QT activation function and its derivative to ML algorithms may require basic mathematical normalisation to regulate the numerical values entering the algorithm (see Ref. [44] for a relevant discussion). This procedure is heuristic and programmatically straightforward (and the variable ampl used in the source code accompanying this paper serves this purpose).
Naturally, the replacement procedure illustrated in Figure 3 alters the dynamics of neural network training and deployment, necessitating a readjustment of the mathematical range of the activation function, weight distribution, number of neurons in the hidden layer, learning rate, and the number of epochs and batches. However, we have established that satisfactory performance can be achieved simply by replacing the traditional activation functions with QT, provided that the original traditional neural network model was appropriately tuned for the specific task at hand. This approach is adopted in this work, i.e., we keep all topological and numerical model parameters constant across both traditional and QT-based models. Importantly, we also demonstrated that the resulting QT-based model can be trained 50 times faster without any additional adjustments and holds potential for further acceleration with proper tuning [43].
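The replacement procedure can be sketched as follows. Here, the mapping of the pre-activation value to the electron energy, the barrier parameters, and the scaling constant ampl are illustrative assumptions made for this sketch; the accompanying source code defines the exact form used in this work:

```python
import numpy as np

# A hedged sketch of the procedure illustrated in Figure 3: the conventional activation
# is swapped for the QT transmission coefficient of Section 3.
HBAR, M_E, EV = 1.054571817e-34, 9.1093837e-31, 1.602176634e-19

def qt_activation(x, V0_eV=10.0, a=0.5e-9, ampl=1.0):
    """Map pre-activations elementwise to barrier-transmission probabilities T(E)."""
    E = np.clip(np.abs(np.asarray(x, dtype=float)) * ampl, 1e-6, None) * EV
    V0 = V0_eV * EV
    # A complex wavenumber covers E < V0 and E > V0 uniformly, since sin(i*y) = i*sinh(y)
    kappa = np.sqrt((2 * M_E * (E - V0)).astype(complex)) / HBAR
    with np.errstate(divide="ignore", invalid="ignore"):
        T = 1.0 / (1.0 + (V0**2 / (4 * E * (E - V0))) * np.sin(kappa * a) ** 2)
    # Limiting case E = V0, taken from the closed-form expression above
    T = np.where(np.isclose(E, V0), 1.0 / (1.0 + M_E * a**2 * V0 / (2 * HBAR**2)), T)
    return np.real(T)

def qt_activation_derivative(x, eps=1e-3):
    """Central-difference derivative, used in place of the analytic sigma'(a) in backprop."""
    x = np.asarray(x, dtype=float)
    return (qt_activation(x + eps) - qt_activation(x - eps)) / (2 * eps)

# Usage: in the FNN sketch of Section 4.1, `np.maximum(a1, 0)` becomes `qt_activation(a1)`
# and the ReLU derivative `(a1 > 0)` becomes `qt_activation_derivative(a1)`.
```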
Aside from their empirical performance, activation functions also possess distinct mathematical properties, including the necessity of non-linearity as stipulated by the universal approximation theorem [42,94,95]. In this paper, we demonstrate that the QT activation function exhibits a greater degree of non-linearity compared to the standard ReLU and sigmoid activation functions (Figure 4).
Analysing the degree of non-linearity of an activation function is a non-trivial task. A previous study [96] suggested that two non-linear processes can be compared by examining the Fourier spectra of the functions that approximate these processes. Following this approach, we analysed the responses of the ReLU, sigmoid, and QT activation functions to a purely sinusoidal wave signal at a frequency of 1 Hz (note that this value was chosen solely for convenience and does not influence the analysis of the non-linear properties of the activation functions). For the sake of comparison, we also applied the sinusoidal signal to a linear identity function, producing a Fourier spectrum with a sole peak at 1 Hz.
As shown in Figure 4, the responses of ReLU, sigmoid, and QT to the sinusoidal waves resulted in the non-linear generation of higher-order harmonics. According to [96], the strength of the non-linearity can be quantified using the magnitude of the harmonic peaks and the total number of harmonics produced by the non-linear process. Focusing on the latter criterion, we can see that ReLU produced the peak at the second-harmonic frequency 2 Hz and then at the fourth, 4 Hz, and the sixth, 6 Hz, harmonics. In turn, sigmoid produced the third, 3 Hz, and fifth, 5 Hz, peaks only. On the contrary, QT generated strong peaks at both odd and even harmonic frequencies. Within the theoretical framework used in this analysis, this result suggests that the QT exhibits the strongest non-linearity among the three analysed functions. We also note that the non-linearity of QT can be enhanced by using higher, yet physically realistic, values of its model parameters such as the thickness of the potential barrier.
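This harmonic analysis can be reproduced with a few lines of NumPy; the sampling rate, signal length, and peak-detection threshold are illustrative choices, and the QT activation from the sketch above can be analysed in the same way:

```python
import numpy as np

# Sketch of the harmonic analysis described above: a 1 Hz sinusoid is passed through
# an activation function and the magnitude spectrum of the response is examined.
fs, duration = 1000, 4.0                           # sampling rate (Hz) and signal length (s)
t = np.arange(0, duration, 1 / fs)
signal = np.sin(2 * np.pi * 1.0 * t)               # 1 Hz test tone

def harmonic_spectrum(response):
    """Return frequencies and normalised magnitude spectrum of a response signal."""
    spectrum = np.abs(np.fft.rfft(response - response.mean()))
    return np.fft.rfftfreq(len(response), 1 / fs), spectrum / spectrum.max()

for name, fn in [("ReLU", lambda x: np.maximum(x, 0)),
                 ("sigmoid", lambda x: 1 / (1 + np.exp(-x)))]:
    freqs, mag = harmonic_spectrum(fn(signal))
    peaks = freqs[(mag > 1e-3) & (freqs > 0.5)][:6]   # strongest harmonic components
    print(f"{name}: significant spectral components near {np.round(peaks, 1)} Hz")
```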
The spectral analysis presented above can, in principle, be extended to other well-known activation functions used in ML, including the Swish [97] and Gaussian Error Linear Unit (GELU) [98]. However, these activation functions can be regarded as functionally smoother versions of ReLU, with Swish also being a variation of the sigmoid function, and they are commonly used in transformer models [99], where smooth gradients are preferred over sharp transitions typical of ReLU. Therefore, from a mathematical and signal processing perspective, the Fourier spectra of these functions will be qualitatively similar to those of ReLU, exhibiting the same distinctions when compared to the QT function.

5. Results and Discussion

In this section, we present and discuss the results obtained using the source codes that accompany this article (these can be accessed via the link provided in the Data Availability Statement). All codes were developed and tested in Python 3 utilising standard mathematical libraries such as NumPy and Matplotlib. Any additional libraries used served only auxiliary purposes, and their functions are not integral to the core neural network algorithms. As a result, the provided codes are highly portable and can be readily adapted to virtually any programming language, assuming the availability of basic linear algebra routines.
We note that in all models—except for the ESN, where the two-dimensional reservoir weight matrices were generated directly using a standard normal distribution—the means of the weights and biases were initialised with samples drawn from a standard normal distribution, with their standard deviations manually set to small constant values (typically 0.01). This deliberate narrowing of the distribution serves multiple purposes. First, it ensures numerical stability during the early stages of training by limiting the variance of activations and gradients, thus helping to avoid exploding or vanishing gradient problems [42]. Second, the choice reflects a common practice in Bayesian neural networks and variational inference, where a small fixed variance allows for controlled stochasticity in parameter sampling without introducing excessive uncertainty [85]. This approach enables the network to explore the parameter space locally around a learnable mean, supporting both robust convergence and better regularisation.
Additionally, we established that the use of a narrow distribution is particularly appropriate when the emphasis is on fine-tuning pre-defined model behaviour or when integrating prior knowledge into model structure and learning dynamics. The Xavier initialisation [100], which is designed to maintain consistent variance of activations and gradients across layers to improve training stability, was also successfully tested. However, it is not included in the accompanying codes, as we aim to keep the implementation as simple as possible in line with the objectives of this article.
Finally, we remind the reader that, in all examples considered in this article, we deliberately kept all topological and numerical model parameters constant across both traditional and QT-based models. Although a comprehensive comparison between traditional and quantum models is beyond the scope of this article, the same approach we have applied in previous work [15,16,43,44] enables a rigorous and fair assessment of model performance. Nevertheless, readers are encouraged to experiment with different parameters, as this may offer them an additional advantage of the QT-based models over the traditional ones.

5.1. QT-Feedforward Neural Network

In this section, we tasked the QT-based model to classify images from the MNIST (Modified National Institute of Standards and Technology) dataset. This dataset is a widely used benchmark in computer vision and ML research due to its balanced class distribution and moderate computational requirements. It contains 60,000 greyscale images of size 28 × 28 pixels, categorised into 10 classes representing handwritten digits from 0 to 9, with 6000 images per class.
The model used in this study comprises a single hidden layer consisting of 512 neurons, reflecting a balance between computational efficiency and representational capacity. A learning rate of 0.01 was selected empirically to ensure stable and efficient convergence, and training was conducted over 100 epochs to provide the model with sufficient opportunities to adjust its weights without overfitting. The batch size was set to 64, a commonly adopted value that offers a practical compromise between training speed and the stability of gradient estimates. Furthermore, a gradient clipping value of 5 was employed to address the issue of exploding gradients—an instability that can arise during the training of neural networks, especially when using non-linear activation functions or deeper architectures [42]. While our focus in this work was on relatively shallow models for clarity and accessibility, the architecture and training setup chosen here are representative of standard practices in the field of traditional ML.
As seen in Figure 5, the model demonstrated accurate classification of all test images. The overall accuracy achieved during training in this configuration is 98.7%, while the testing of all images from the test section of the MNIST dataset yielded an accuracy of 98.3%.
It is worth noting that fully connected neural networks are generally suboptimal for MNIST image classification tasks because they do not explicitly explore the spatial correlations present in image data. Convolutional neural networks (CNNs), on the other hand, are specifically designed to capture local patterns and hierarchical features through convolutional layers, making them more effective for such tasks. However, the QT-based network, despite its fully connected architecture, demonstrated decent performance on MNIST images. The incorporation of a QT activation function introduces a novel non-linearity that helps mitigate some of the inherent limitations of standard fully connected layers. Although the QT network may not reach the optimal performance levels of state-of-the-art CNNs and other advanced models, its ability to deliver respectable accuracy suggests that physics-inspired modifications can offer a viable alternative approach for enhancing fully connected models in complex image classification scenarios.

5.2. QT-RNN for a Sentiment Analysis Task

In this section, we demonstrate the transformation of a traditionally built RNN sentiment analysis model [101] into a QT-based model. The original model processes a dataset structured as a two-column table, where the first column contains text phrases and the second column classifies them as positive or negative [101]. The dataset is divided into two subsets: a training set for model learning and a testing set for performance evaluation.
First, we constructed a vocabulary encompassing all unique words in the dataset and assigned each word a corresponding integer index. The model was then trained on the training subset and evaluated on the testing subset to assess its classification accuracy.
Figure 6 presents the test accuracy and loss over training epochs for QT-RNN and standard RNN models. We can see that the QT-RNN (blue circles) achieved rapid convergence, reaching 100% accuracy with near-zero loss after 300 epochs. In contrast, the traditional RNN (red squares) showed slower improvement, initially struggling with accuracy fluctuations and higher loss before eventually converging. These results highlight the advantages of the QT-based approach in optimising model performance [43].
In addition to faster convergence, the near-zero loss values from early epochs indicate that QT-RNN should be more stable during training. This stability can be a result of architectural choices (such as gating mechanisms or normalisation techniques [16,43]) that help control gradient issues often seen in traditional RNNs. Arguably, QT-RNN could also be employing an architecture that captures the underlying patterns in the data more efficiently. This property can result from enhanced non-linearity (Figure 4) or mechanisms that better preserve long-term dependencies, leading to more robust feature learning [43]. Moreover, the superior performance of QT-RNN may indicate that the quantum model possesses an increased ability to generalise, which results in better performance on test data.

5.3. QT-Bayesian Neural Network

To test the performance of the QT-BNN model, we employed the Fashion MNIST dataset that consists of 70,000 greyscale images (28 × 28 pixels) representing 10 categories of fashion items, such as T-shirts, trousers, and sneakers. Fashion MNIST was created as a more challenging alternative to the classic MNIST dataset of handwritten digits. While MNIST has been a de facto standard for benchmarking ML models, it is often considered too simplistic for modern algorithms, which can achieve near-perfect accuracy. Fashion MNIST, on the other hand, introduces greater complexity due to the variability in textures, shapes, and patterns within each class, making it harder for ML models to distinguish between similar categories like shirts, coats, and pullovers. This increased difficulty makes Fashion MNIST a more realistic and practicable benchmark for evaluating the performance of ML models, particularly in tasks like image classification and object recognition, where real-world data are often noisy and ambiguous.
The model used in this study is a relatively simple yet efficient QT-based BNN architecture designed for classification tasks on the Fashion MNIST dataset. It consists of a single hidden layer with 512 neurons, which provides a balance between model capacity and computational efficiency. The learning rate was set to 0.5, a relatively high value that allows for faster convergence during training. Training was conducted over 1000 epochs. To simplify the software implementation, no batching was used during training, meaning the model updated its weights after processing each individual sample. For the Bayesian component of the algorithm, 50 samples were used to approximate the posterior distribution. While this is a modest number for a Bayesian model, it strikes a balance between computational efficiency and the ability to capture uncertainty in the predictions.
Overall, this experimental setup provides a clear yet rigorous framework for evaluating the performance of BNNs on the Fashion MNIST dataset, as demonstrated by the results in Figure 7. In this instance, the model exhibited strong classification performance, correctly identifying all items except for a single misclassification, where a coat was predicted as a shirt. This type of error is scientifically relevant, as it highlights the inherent challenges associated with distinguishing visually similar classes within the dataset [43]. Indeed, coats and shirts, for example, may share common visual features such as texture, shape or pattern, particularly in greyscale images where colour information is absent. Therefore, the ability of the QT-BNN model to adequately process an image dataset with such characteristics makes this model valuable for undertaking real-world classification tasks [43].

5.4. QT-ESN and Mackey–Glass Time Series Forecast Task

In this section, we used the well-documented and highly optimised source code that accompanies Ref. [80] and converted it into a QT-based model. The RC system contains 1000 neurons and was trained to forecast the future evolution of a Mackey–Glass time series (MGTS), a standard and relatively challenging task for ML systems [39,80,102].
We generated an MGTS dataset by solving the delay differential equation [103]
$$\dot{x}_{\mathrm{MG}}(t) = \frac{\beta_{\mathrm{MG}}\, x_{\mathrm{MG}}(t - \tau_{\mathrm{MG}})}{1 + x_{\mathrm{MG}}^{q}(t - \tau_{\mathrm{MG}})} - \gamma_{\mathrm{MG}}\, x_{\mathrm{MG}}(t),$$
where the overdot denotes differentiation with respect to time, $\tau_{\mathrm{MG}} = 17$, $q = 10$, $\beta_{\mathrm{MG}} = 0.2$, and $\gamma_{\mathrm{MG}} = 0.1$ [80]. These parameter values ensure that the time series exhibits highly non-linear and chaotic behaviour. We split the generated dataset into two equal parts of 2000 discrete time steps each. The first part was used to train the RC system, while the second served as target data that were never shown to the RC system and were used exclusively as the ground truth for evaluating the accuracy of its forecast.
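A minimal Python sketch of this data-generation step, based on a simple Euler discretisation of the delay equation with the parameter values quoted above, is given below; the integration step and the constant initial history are assumptions rather than the exact settings of the codes of Ref. [80].

```python
import numpy as np

def mackey_glass(n_steps=4000, tau=17, q=10, beta=0.2, gamma=0.1, dt=1.0, x0=1.2):
    """Euler discretisation of the Mackey-Glass delay differential equation."""
    x = np.zeros(n_steps + tau)
    x[:tau] = x0                              # constant history before t = 0
    for t in range(tau, n_steps + tau - 1):
        x_del = x[t - tau]                    # delayed value x(t - tau)
        dx = beta * x_del / (1.0 + x_del**q) - gamma * x[t]
        x[t + 1] = x[t] + dt * dx
    return x[tau:]

series = mackey_glass()
train_part, target_part = series[:2000], series[2000:]   # two equal 2000-step parts
```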
Known as the free-running mode of operation, this regime transforms the RC system into a generator of the future temporal evolution of MGTS [39]. Naturally, the predicted signal will deviate from the target (ground truth) after a certain period of time. The accuracy of the forecast is measured as the mean squared error between the predicted outputs and the target values as
$$\mathrm{NMSE} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i^{\mathrm{target}} - \hat{y}_i \right)^{2},$$
where N is the number of testing samples.
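The two ingredients of this evaluation, the error metric defined above and the free-running feedback loop, can be sketched as follows; the esn.step method is an assumed one-step update rather than the exact interface of the code of Ref. [80].

```python
import numpy as np

def forecast_error(y_target, y_pred):
    """Error metric of the equation above: mean squared difference between the
    target values and the predicted outputs."""
    y_target, y_pred = np.asarray(y_target), np.asarray(y_pred)
    return np.mean((y_target - y_pred) ** 2)

def free_run(esn, last_input, n_steps):
    """Free-running (generative) mode: each prediction is fed back as the next input.
    esn.step is an assumed one-step update of the reservoir and readout."""
    u, outputs = last_input, []
    for _ in range(n_steps):
        u = esn.step(u)            # the predicted value becomes the next input
        outputs.append(u)
    return np.array(outputs)
```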
Figure 8 presents the forecast generated by the QT-ESN system (blue solid line) alongside the ground-truth target signal (green dashed line). The QT-ESN output remained virtually indistinguishable from the target signal within the time range from 0 to 500. The NMSE, computed over the entire time range from 0 to 2000, was 4.2 × 10−5, only an order of magnitude higher than that of the highly optimised traditional ESN. Given that the QT-ESN was constructed solely by replacing the traditional tanh activation function with the QT function, without any further optimisation, this result provides strong evidence that the QT function is a viable high-performance activation for RC systems.

6. Potential Applications of QT-Based Models in Neuromorphic and Cognitive AI

Thus far, we have demonstrated the successful application of QT-based neural network models to standard tasks in machine vision, language processing, and chaotic time series forecasting. These results establish the viability of QT-based architectures as robust and efficient alternatives to conventional neural networks in existing AI applications. However, the potential of QT-based networks extends far beyond these benchmarks, especially given the direct relevance of the QT effect to quantum cognition and decision-making theory [11,15,16]. In particular, in what follows, we aim to demonstrate conceptually that their ability to process information through quantum-inspired mechanisms, including those illustrated in Figure 1, could influence the field of human–machine teaming.
Human–machine teaming, where humans and machines collaborate to enhance each other’s strengths (Figure 9), is an evolving field with significant applications in military operations, medicine, and AI development [104,105,106,107]. QT-based neural networks, which more closely align with human cognitive processes [16,43], have the potential to advance this domain. By enabling AI systems to interpret complex data while exhibiting adaptive reasoning, situational awareness, and real-time strategic planning, these networks can go beyond conventional machine intelligence [16,43]. In particular, we argue that their ability to navigate uncertainty and make rapid yet reliable decisions should make them especially valuable for defence and security applications, where high-stakes environments demand both precision and agility (the relevant results will be published elsewhere).

Cognitive Human–Machine Teaming

Good practical application design reduces the cognitive load on operators and helps improve their situational awareness [108,109,110]. This is particularly important in environments where quick and accurate decision making is required, such as in defence operations [111,112,113,114]. AI systems can support critical operations by reducing cognitive load and improving decision-making speed, which in turn can lead to higher mission success rates [115,116]. For example, in our previous work, we demonstrated that algorithms similar to those considered in this paper can make real-time decisions about commands received by an autonomous vehicle from both the human operator and the physical processes occurring in the surrounding environment [17]. Such functionality has the potential to significantly reduce the cognitive load on the drone operator, thereby decreasing the likelihood of errors. Because the commands given by a human to the drone can be represented as a time-dependent sequence of voltage pulses, the same AI system can be configured to learn from the operator’s past inputs and to generate new inputs consistent with the operator’s style of controlling the drone.
Identifying cognitive demands means looking closely at the tasks people perform, the decisions they must make, and the overall cognitive workload they experience in different scenarios [116,117]. When designing systems for complex environments, whether on the battlefield or in high-tech settings such as AI algorithms, it is important to understand the cognitive effort users must expend. By breaking down the tasks and decisions involved, designers can create interfaces and tools that help manage cognitive load rather than adding to it, a process that involves examining operators’ work and understanding how each task contributes to their overall cognitive burden [117,118].
A practical way to address these issues is through cognitive task analysis, which maps out the exact steps and decisions required in an operation. For example, in military and spaceship control settings, cognitive task analysis has provided valuable insights that lead to systems better tuned to human needs [117]. Similarly, by assessing cognitive workload through established methodologies, designers can adjust system requirements to more closely align with what human operators need to function efficiently [118]. In fact, as discussed in more detail in Ref. [16], QT-based neural network models can help establish to what extent a human operator has been affected by adverse environments such as weightlessness.
As part of practical application, experiments can be designed to observe how different versions of neural network algorithms affect performance measures such as prediction accuracy and classification outcomes, focusing on how closely they mimic human decision making through quantum cognition and decision theories [43,119]. In line with this approach, this paper presents a ready-to-use framework for benchmarking traditional and quantum neural network models that account for the effects of QT, enabling the investigation of AI-driven decision making in defence applications and beyond. Specifically, this toolbox provides an opportunity to examine the interplay between quantum mechanics and its application in AI by evaluating the impact of varying QT parameters (e.g., the width and height of the potential barrier—each directly relevant to the models of cognitive processes [13]) on model predictions and classification outcomes, with a focus on emulating human-like reasoning [43].
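As an illustration of such a parameter study, the sketch below uses the textbook transmission probability through a rectangular potential barrier as a stand-in for the QT activation function and sweeps the barrier width and height; the exact functional form of the QT activation and the commented-out train_and_evaluate routine are assumptions, not the implementation used in this paper.

```python
import numpy as np

def qt_transmission(E, barrier_height=1.0, barrier_width=1.0):
    """Textbook transmission probability through a rectangular barrier for E < V0
    (units with hbar = m = 1). An illustrative stand-in for the QT activation."""
    E = np.clip(E, 1e-9, barrier_height - 1e-9)          # keep E strictly below the barrier
    kappa = np.sqrt(2.0 * (barrier_height - E))
    s = np.sinh(kappa * barrier_width) ** 2
    return 1.0 / (1.0 + barrier_height**2 * s / (4.0 * E * (barrier_height - E)))

# Sweep the barrier width and height and record a performance metric for each setting.
for width in (0.5, 1.0, 2.0):
    for height in (1.0, 2.0):
        activation = lambda x, w=width, h=height: qt_transmission(np.abs(x), h, w)
        # accuracy = train_and_evaluate(activation)      # hypothetical benchmarking call
        print(f"width={width}, height={height}: activation(0.5) = {activation(0.5):.3f}")
```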
We suggest that experiments should be conducted in which multiple QT algorithms with different configurations are run simultaneously. This parallel execution, meaning that the algorithms are tested side by side rather than sequentially, is computationally affordable due to the technical simplicity and efficiency of the accompanying codes. These algorithms should be able to pass information to each other and learn from one another, mimicking the interplay between human conscious and subconscious processes, which is also a computationally affordable task. Thus, quantum ML could run far more such processes, and at more levels, than a human ever could (Figure 9), as sketched below.
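Such side-by-side runs can be organised, for example, with Python’s standard process pool; the run_configuration helper and the commented-out train_and_evaluate call are hypothetical placeholders rather than part of the accompanying codes.

```python
from concurrent.futures import ProcessPoolExecutor

def run_configuration(params):
    """Hypothetical helper: train one QT model with the given barrier parameters
    and return its test accuracy."""
    width, height = params
    # accuracy = train_and_evaluate(barrier_width=width, barrier_height=height)
    return {"width": width, "height": height, "accuracy": None}

if __name__ == "__main__":
    configs = [(0.5, 1.0), (1.0, 1.0), (1.0, 2.0), (2.0, 2.0)]
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(run_configuration, configs))   # side-by-side runs
    print(results)
```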
Practical applications of the outlined approach could include building an experiment for military object detection, such as identifying military tracks, where military drones might use multiple cameras installed at different angles (e.g., 30° left, right, top, and bottom) in addition to central camera video capture. Work on such datasets is currently in progress, and the scientific findings will be published separately.
Multimodal models, which process and integrate multiple data types simultaneously, offer significant advantages by combining inputs such as audio, text, and video to generate more accurate and insightful results [120]. This article presents examples of how the QT-based model can be applied to images, text, and time-dependent data, with the latter encompassing not only complex time series but also potentially speech and music patterns [121]. Therefore, these algorithms can be further expanded to process audio and video inputs that come not only from human operators but also from cognitive interactions between humans and machines [122].
An advanced system should also integrate cognitive interactions from both humans and machines, fostering effective human–machine teaming through communication via text and voice. In this context, practical applications must be designed to learn from environmental cues provided by both the system and its users, creating a dynamic feedback loop where the cognitive system continuously adapts to human input while users refine their interactions with the system [122]. A schematic illustration of such an interaction is presented in Figure 9, potentially encompassing the multimodal capabilities of the QT-based neural network models demonstrated in this paper.
Such initiatives in cognitive human–machine teaming underscore the increasing demand for innovative computing paradigms, often termed cognitive computers [123]. In response, quantum and NC systems are establishing the groundwork for the next generation of AI and ML models, enabling more adaptive, efficient and human-like decision making in complex environments [43,65]. Consequently, we anticipate that the paradigm of QT-based neural network models will find a distinct niche within this rapidly evolving field.

7. Conclusions

Quantum technologies are rapidly reshaping multiple scientific and industrial domains, playing an increasingly prominent role in fields such as computing, secure communication, sensing, and even medical diagnostics. While quantum computing promises revolutionary advancements, its accessibility remains a significant challenge, being primarily confined to research institutions and specialised industries with access to high-performance quantum-physical hardware. This paper addresses this gap by demonstrating how traditional neural networks can be transformed into neuromorphic quantum models, significantly lowering the barrier to entry for quantum-inspired computing.
By applying the fundamental principles of quantum mechanics, we have shown that widely used neural network architectures—including feedforward neural networks, recurrent neural networks, reservoir computing models such as the Echo State Network, and Bayesian neural networks—can be adapted to improve both computational efficiency and cognitive-like processing capabilities. The quantum-inspired versions of these models exhibit key advantages, such as more efficient training and an ability to capture aspects of human-like reasoning. This approach not only improves the performance of AI models but also provides insights into the intersection of quantum mechanics and cognitive computing.
One of the most significant contributions of this work is its emphasis on accessibility. Unlike conventional quantum computing frameworks that require specialised hardware and deep technical expertise, the quantum-inspired models presented in this paper can be implemented using standard computational resources, such as a laptop, with only an undergraduate-level understanding of ML. Making these methods available to a wider audience opens new possibilities for researchers, engineers, and students alike, enabling a broader community to explore and apply quantum principles in AI.
Beyond conventional ML tasks, our findings suggest that quantum-inspired networks have the potential to transform fields such as human–machine collaboration, adaptive decision making, and cognitive AI. By embedding quantum dynamics into neural architectures, we move closer to developing AI systems that not only process information efficiently but also demonstrate traits of flexible reasoning, contextual awareness, and adaptive learning—key elements for the next generation of intelligent systems.
Future research should further investigate the scalability of these models in real-world applications, including defence, security, healthcare, and autonomous systems. Additionally, exploring how these networks can be integrated with emerging quantum hardware could lead to hybrid classical–quantum AI frameworks with unprecedented computational power. Ultimately, this work marks a step toward making quantum-inspired neural networks more practical, accessible, and impactful, bringing us closer to realising the full potential of quantum technologies in everyday AI applications.

Author Contributions

I.S.M. developed the traditional and quantum neural network models used in this study and obtained the primary results. The concept for the QT-BNN algorithm originated from M.M., who also envisioned the application of QT-based neural network models in cognitive human–machine teaming and authored the relevant sections of the paper. I.S.M. edited the manuscript as a whole, with contributions from M.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

This paper has no additional data. The source codes that accompany this paper are available in the GitHub repository at https://github.com/IvanMaksymov/Quantum-Tunnelling-Neural-Networks-Tutorial (accessed on 2 May 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AI    Artificial Intelligence
BNN    Bayesian Neural Network
BPTT    Backpropagation Through Time
CNN    Convolutional Neural Network
ESN    Echo State Network
FNN    Feedforward Neural Network
MGTS    Mackey–Glass Time Series
ML    Machine Learning
MNIST    Modified National Institute of Standards and Technology database
NC    Neuromorphic Computing
QT-BNN    Quantum-Tunnelling Bayesian Neural Network
QT-ESN    Quantum-Tunnelling Echo State Network
QT-FNN    Quantum-Tunnelling Feedforward Neural Network
QT-RNN    Quantum-Tunnelling Recurrent Neural Network
QCT    Quantum Cognition Theory
QNN    Quantum Neural Network
QT    Quantum Tunnelling
RNN    Recurrent Neural Network
ReLU    Rectified Linear Unit
STM    Scanning Tunnelling Microscopy

References

  1. LeCun, Y. Security Council Debates Use of Artificial Intelligence in Conflicts, Hears Calls for UN Framework to Avoid Fragmented Governance. Video Presentation. 2024. Available online: https://press.un.org/en/2024/sc15946.doc.htm (accessed on 4 January 2025).
  2. Gunashekar, S.; d’Angelo, C.; Flanagan, I.; Motsi-Omoijiade, D.; Virdee, M.; Feijao, C.; Porter, S. Using Quantum Computers and Simulators in the Life Sciences: Current Trends and Future Prospects; RAND Corporation: Santa Monica, CA, USA, 2022. [Google Scholar] [CrossRef]
  3. Gerlich, M. Perceptions and acceptance of artificial intelligence: A multi-dimensional study. Soc. Sci. 2023, 12, 502. [Google Scholar] [CrossRef]
  4. Walker, P.B.; Haase, J.J.; Mehalick, M.L.; Steele, C.T.; Russell, D.W.; Davidson, I.N. Harnessing metacognition for safe and responsible AI. Technologies 2025, 13, 107. [Google Scholar] [CrossRef]
  5. Gerlich, M. AI tools in society: Impacts on cognitive offloading and the future of critical thinking. Societies 2025, 15, 6. [Google Scholar] [CrossRef]
  6. Sourdin, T. Judge v Robot? Artificial intelligence and judicial decision-making. UNSW Law J. 2018, 41, 1114–1130. [Google Scholar] [CrossRef]
  7. Motsi-Omoijiade, I.D. Financial Intermediation in Cryptocurrency Markets—Regulation, Gaps and Bridges. In Handbook of Blockchain, Digital Finance, and Inclusion, Volume 1; Lee Kuo Chuen, D., Deng, R., Eds.; Academic Press: Cambridge, MA, USA, 2018; pp. 207–223. [Google Scholar] [CrossRef]
  8. Governatori, G.; Bench-Capon, T.; Verheij, B.; Araszkiewicz, M.; Francesconi, E.; Grabmair, M. Thirty years of artificial intelligence and law: The first decade. Artif. Intell. Law 2022, 30, 481–519. [Google Scholar] [CrossRef]
  9. Khrennikov, A. Quantum-like brain: “Interference of minds”. Biosystems 2006, 84, 225–241. [Google Scholar] [CrossRef]
  10. Atmanspacher, H.; Filk, T. A proposed test of temporal nonlocality in bistable perception. J. Math. Psychol. 2010, 54, 314–321. [Google Scholar] [CrossRef]
  11. Busemeyer, J.R.; Bruza, P.D. Quantum Models of Cognition and Decision; Oxford University Press: New York, NY, USA, 2012. [Google Scholar]
  12. Pothos, E.M.; Busemeyer, J.R. Quantum Cognition. Annu. Rev. Psychol. 2022, 73, 749–778. [Google Scholar] [CrossRef]
  13. Maksymov, I.S.; Pogrebna, G. Quantum-mechanical modelling of asymmetric opinion polarisation in social networks. Information 2024, 15, 170. [Google Scholar] [CrossRef]
  14. Maksymov, I.S.; Pogrebna, G. The physics of preference: Unravelling imprecision of human preferences through magnetisation dynamics. Information 2024, 15, 413. [Google Scholar] [CrossRef]
  15. Maksymov, I.S. Quantum-inspired neural network model of optical illusions. Algorithms 2024, 17, 30. [Google Scholar] [CrossRef]
  16. Maksymov, I.S. Quantum-tunneling deep neural network for optical illusion recognition. APL Mach. Learn. 2024, 2, 036107. [Google Scholar] [CrossRef]
  17. Abbas, A.H.; Abdel-Ghani, H.; Maksymov, I.S. Classical and quantum physical reservoir computing for onboard artificial intelligence systems: A perspective. Dynamics 2024, 4, 643–670. [Google Scholar] [CrossRef]
  18. Penrose, R. The Emperor’s New Mind; Oxford University Press: Oxford, UK, 1989. [Google Scholar]
  19. Georgiev, D.D. Quantum Information and Consciousness; CRC Press: Boca Raton, FL, USA, 2019. [Google Scholar]
  20. Brooks, M. Can quantum hints in the brain revive a radical consciousness theory? New Sci. 2024, 261, 40–43. [Google Scholar] [CrossRef]
  21. Chrisley, R. Quantum Learning. In New Directions in Cognitive Science, Proceedings of the International Symposium, Saariselkä, Lapland, Finland, 4–9 August 1995; Pylkkänen, P., Pylkkö, P., Eds.; Finnish Association of Artificial Intelligence: Helsinki, Finland, 1995; pp. 77–89. [Google Scholar]
  22. Kak, S.C. Quantum Neural Computing. In Advances in Imaging and Electron Physics; Elsevier: Amsterdam, The Netherlands, 1995; Volume 94, pp. 259–313. [Google Scholar] [CrossRef]
  23. Schuld, M.; Sinayskiy, I.; Petruccione, F. The quest for a Quantum Neural Network. Quantum Inf. Process. 2014, 13, 2567–2586. [Google Scholar] [CrossRef]
  24. Horodecki, R.; Horodecki, P.; Horodecki, M.; Horodecki, K. Quantum Entanglement. Rev. Mod. Phys. 2009, 81, 865–942. [Google Scholar] [CrossRef]
  25. Chitambar, E.; Gour, G. Quantum resource theories. Rev. Mod. Phys. 2019, 91, 025001. [Google Scholar] [CrossRef]
  26. Ezhov, A.A.; Ventura, D. Quantum Neural Networks. In Future Directions for Intelligent Systems and Information Sciences; Kasabov, N., Ed.; Physica-Verlag HD: Heidelberg, Germany, 2000; pp. 213–235. [Google Scholar] [CrossRef]
  27. Wan, K.H.; Dahlsten, O.; Kristjánsson, H.; Gardner, R.; Kim, M.S. Quantum generalisation of feedforward neural networks. Npj Quantum Inf. 2017, 3, 36. [Google Scholar] [CrossRef]
  28. Beer, K.; Bondarenko, D.; Farrelly, T.; Osborne, T.J.; Salzmann, R.; Scheiermann, D.; Wolf, R. Training deep quantum neural networks. Nat. Commun. 2020, 11, 808. [Google Scholar] [CrossRef]
  29. Zhao, C.; Gao, X.S. QDNN: Deep neural networks with quantum layers. Quantum Mach. Intell. 2021, 3, 15. [Google Scholar] [CrossRef]
  30. Hiesmayr, B.C. A quantum information theoretic view on a deep quantum neural network. AIP Conf. Proc. 2024, 3061, 020001. [Google Scholar] [CrossRef]
  31. Yan, P.; Li, L.; Jin, M.; Zeng, D. Quantum probability-inspired graph neural network for document representation and classification. Neurocomputing 2021, 445, 276–286. [Google Scholar] [CrossRef]
  32. Pira, L.; Ferrie, C. On the interpretability of quantum neural networks. Quantum Mach. Intell. 2024, 6, 52. [Google Scholar] [CrossRef]
  33. Peral-García, D.; Cruz-Benito, J.; García-Peñalvo, F.J. Systematic literature review: Quantum machine learning and its applications. Comput. Sci. Rev. 2024, 51, 100619. [Google Scholar] [CrossRef]
  34. Choi, S.; Salamin, Y.; Roques-Carmes, C.; Dangovski, R.; Luo, D.; Chen, Z.; Horodynski, M.; Sloan, J.; Uddin, S.Z.; Soljačić, M. Photonic probabilistic machine learning using quantum vacuum noise. Nat. Commun. 2024, 15, 7760. [Google Scholar] [CrossRef]
  35. Lukoševičius, M.; Jaeger, H. Reservoir computing approaches to recurrent neural network training. Comput. Sci. Rev. 2009, 3, 127–149. [Google Scholar] [CrossRef]
  36. Tanaka, G.; Yamane, T.; Héroux, J.B.; Nakane, R.; Kanazawa, N.; Takeda, S.; Numata, H.; Nakano, D.; Hirose, A. Recent advances in physical reservoir computing: A review. Neural Netw. 2019, 115, 100–123. [Google Scholar] [CrossRef]
  37. Nakajima, K. Physical reservoir computing–an introductory perspective. Jpn. J. Appl. Phys. 2020, 59, 060501. [Google Scholar] [CrossRef]
  38. Gauthier, D.J.; Bollt, E.; Griffith, A.; Barbosa, W.A.S. Next generation reservoir computing. Nat. Commun. 2021, 12, 5564. [Google Scholar] [CrossRef]
  39. Maksymov, I.S. Analogue and physical reservoir computing using water waves: Applications in power engineering and beyond. Energies 2023, 16, 5366. [Google Scholar] [CrossRef]
  40. Kent, R.M.; Barbosa, W.A.S.; Gauthier, D.J. Controlling chaos using edge computing hardware. Nat. Commun. 2024, 15, 3886. [Google Scholar] [CrossRef]
  41. Kendon, V.M.; Nemoto, K.; Munro, W.J. Quantum analogue computing. Phil. Trans. R. Soc. A. 2010, 368, 3609–3620. [Google Scholar] [CrossRef]
  42. Kim, P. MATLAB Deep Learning with Machine Learning, Neural Networks and Artificial Intelligence; Apress: Berkeley, CA, USA, 2017. [Google Scholar]
  43. Maksimovic, M.; Maksymov, I.S. Quantum-cognitive neural networks: Assessing confidence and uncertainty with human decision-making simulations. Big Data Cogn. Comput. 2025, 9, 12. [Google Scholar] [CrossRef]
  44. McNaughton, J.; Abbas, A.H.; Maksymov, I.S. Neuromorphic Quantum Neural Networks with Tunnel-Diode Activation Functions. arXiv 2025, arXiv:2503.04978. [Google Scholar]
  45. Landau, L.D.; Lifshitz, E.M. Mechanics, 3rd ed.; Course of Theoretical Physics; Butterworth-Heinemann: Oxford, UK, 1976; Volume 1. [Google Scholar]
  46. Messiah, A. Quantum Mechanics; North-Holland Publishing Company: Amsterdam, The Netherlands, 1962. [Google Scholar]
  47. Griffiths, D.J. Introduction to Quantum Mechanics; Prentice Hall: Englewood Cliffs, NJ, USA, 2004. [Google Scholar]
  48. Nielsen, M.; Chuang, I. Quantum Computation and Quantum Information; Oxford University Press: New York, NY, USA, 2002. [Google Scholar]
  49. Feynman, R.P.; Leighton, R.B.; Sands, M. The Feynman Lectures on Physics, Vol. 3: Quantum Mechanics; The New Millennium Edition; Basic Books: New York, NY, USA, 2011. [Google Scholar]
  50. Schrödinger, E. Die gegenwärtige Situation in der Quantenmechanik. Naturwissenschaften 1935, 23, 807–812. [Google Scholar] [CrossRef]
  51. Brody, J. Quantum Entanglement: A Beginner’s Guide; MIT Press: Cambridge, MA, USA, 2020. [Google Scholar]
  52. Menneer, T.; Narayanan, A. Quantum-Inspired Neural Networks; Technical Report 329; Department of Computer Science, University of Exeter: Exeter, UK, 1995. [Google Scholar]
  53. Narayanan, A.; Menneer, T. Quantum artificial neural network architectures and components. Inf. Sci. 2000, 128, 231–255. [Google Scholar] [CrossRef]
  54. McQuarrie, D.A.; Simon, J.D. Physical Chemistry—A Molecular Approach; Prentice Hall: New York, NY, USA, 1997. [Google Scholar]
  55. Esaki, L. New phenomenon in narrow germanium pn junctions. Phys. Rev. 1958, 109, 603–604. [Google Scholar] [CrossRef]
  56. Kahng, D.; Sze, S.M. A floating gate and its application to memory devices. Bell Syst. Tech. J. 1967, 46, 1288–1295. [Google Scholar] [CrossRef]
  57. Chang, L.L.; Esaki, L.; Tsu, R. Resonant tunneling in semiconductor double barriers. Appl. Phys. Lett. 1974, 12, 593–595. [Google Scholar] [CrossRef]
  58. Ionescu, A.M.; Riel, H. Tunnel field-effect transistors as energy-efficient electronic switches. Nature 2011, 479, 329–337. [Google Scholar] [CrossRef]
  59. Modinos, A. Field emission spectroscopy. Prog. Surf. Sci. 1993, 42, 45. [Google Scholar] [CrossRef]
  60. Rahman Laskar, M.A.; Celano, U. Scanning probe microscopy in the age of machine learning. APL Mach. Learn. 2023, 1, 041501. [Google Scholar] [CrossRef]
  61. Binnig, G.; Rohrer, H. Scanning tunneling microscopy—from birth to adolescence. Rev. Mod. Phys. 1987, 59, 615–625. [Google Scholar] [CrossRef]
  62. Feng, Y.; Tang, M.; Sun, Z.; Qi, Y.; Zhan, X.; Liu, J.; Zhang, J.; Wu, J.; Chen, J. Fully flash-based reservoir computing network with low power and rich states. IEEE Trans. Electron Devices 2023, 70, 4972–4975. [Google Scholar] [CrossRef]
  63. Kwon, D.; Woo, S.Y.; Lee, K.H.; Hwang, J.; Kim, H.; Park, S.H.; Shin, W.; Bae, J.H.; Kim, J.J.; Lee, J.H. Reconfigurable neuromorphic computing block through integration of flash synapse arrays and super-steep neurons. Sci. Adv. 2023, 9, eadg9123. [Google Scholar] [CrossRef]
  64. Yilmaz, Y.; Mazumder, P. Image processing by a programmable grid comprising quantum dots and memristors. IEEE Trans. Nanotechnol. 2013, 12, 879–887. [Google Scholar] [CrossRef]
  65. Marković, D.; Grollier, J. Quantum neuromorphic computing. Appl. Phys. Lett. 2020, 117, 150501. [Google Scholar] [CrossRef]
  66. Abel, S.; Chancellor, N.; Spannowsky, M. Quantum computing for quantum tunneling. Phys. Rev. D 2021, 103, 016008. [Google Scholar] [CrossRef]
  67. Chen, Z.; Xiao, Z.; Akl, M.; Leugring, J.; Olajide, O.; Malik, A.; Dennler, N.; Harper, C.; Bose, S.; Gonzalez, H.A.; et al. ON-OFF neuromorphic ISING machines using Fowler-Nordheim annealers. arXiv 2024, arXiv:2406.05224. [Google Scholar] [CrossRef]
  68. Georgiev, D.D.; Glazebrook, J.F. The quantum physics of synaptic communication via the SNARE protein complex. Prog. Biophys. Mol. 2018, 135, 16–29. [Google Scholar] [CrossRef]
  69. Georgiev, D.D.; Glazebrook, J.F. Quantum tunneling of Davydov solitons through massive barriers. Chaos Soliton. Fract. 2019, 123, 275–293. [Google Scholar] [CrossRef]
  70. Georgiev, D.D.; Glazebrook, J.F. Quantum transport and utilization of free energy in protein α-helices. In Quantum Boundaries of Life; Poznaṅski, R.R., Brändas, E.J., Eds.; Advances in Quantum Chemistry; Academic Press: Cambridge, MA, USA, 2020; Volume 82, pp. 253–300. [Google Scholar]
  71. Georgiev, D.D. Quantum propensities in the brain cortex and free will. Biosystems 2021, 208, 104474. [Google Scholar] [CrossRef]
  72. Georgiev, D.D. Causal potency of consciousness in the physical world. Int. J. Mod. Phys. B 2024, 38, 2450256. [Google Scholar] [CrossRef]
  73. Shor, P.W. Algorithms for quantum computation: Discrete logarithms and factoring. In Proceedings of the 35th Annual Symposium on Foundations of Computer Science, Santa Fe, NM, USA, 20–22 November 1994; pp. 124–134. [Google Scholar] [CrossRef]
  74. Everett, H.I. The Many-Worlds Interpretation of Quantum Mechanics: The Theory of the Universal Wave Function. PhD Thesis, Princeton University, Princeton, NJ, USA, 1956. [Google Scholar]
  75. Everett, H. “Relative state” formulation of quantum mechanics. Rev. Mod. Phys. 1957, 29, 454–462. [Google Scholar] [CrossRef]
  76. Wheeler, J.A. Assessment of Everett’s “relative state” formulation of quantum theory. Rev. Mod. Phys. 1957, 29, 463–465. [Google Scholar] [CrossRef]
  77. Vaidman, L. Many-Worlds Interpretation of Quantum Mechanics. In The Stanford Encyclopedia of Philosophy; Fall 2021 ed.; Zalta, E.N., Ed.; Metaphysics Research Lab, Stanford University: Stanford, CA, USA, 2021. [Google Scholar]
  78. Khan, A.; Ahsan, M.; Bonyah, E.; Jan, R.; Nisar, M.; Abdel-Aty, A.H.; Yahia, I.S. Numerical solution of Schrödinger equation by Crank–Nicolson method. Math. Probl. Eng. 2022, 2022, 6991067. [Google Scholar] [CrossRef]
  79. Maksymov, I.S. Quantum Mechanics of Human Perception, Behaviour and Decision-Making: A Do-It-Yourself Model Kit for Modelling Optical Illusions and Opinion Formation in Social Networks. arXiv 2024, arXiv:2404.10554. [Google Scholar]
  80. Lukoševičius, M. A Practical Guide to Applying Echo State Networks. In Neural Networks: Tricks of the Trade, Reloaded; Montavon, G., Orr, G.B., Müller, K.R., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 659–686. [Google Scholar]
  81. Haykin, S. Recurrent Neural Networks for Prediction; Wiley: Chichester, UK, 2001. [Google Scholar]
  82. Jaeger, H. A Tutorial on Training Recurrent Neural Networks, Covering BPPT, RTRL, EKF and the “Echo State Network” Approach; GMD Report 159; German National Research Center for Information Technology: St. Augustin, Schloss Birlinghoven, Germany, 2005. [Google Scholar]
  83. Jiang, J.; Lai, Y.C. Model-free prediction of spatiotemporal dynamical systems with recurrent neural networks: Role of network spectral radius. Phys. Rev. Res. 2019, 1, 033056. [Google Scholar] [CrossRef]
  84. Zhang, J.; He, T.; Sra, S.; Jadbabaie, A. Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity. In Proceedings of the International Conference on Learning Representations, Addis Ababa, Ethiopia, 30 April 2020. [Google Scholar]
  85. Jospin, L.V.; Laga, H.; Boussaid, F.; Buntine, W.; Bennamoun, M. Hands-on Bayesian neural networks–A tutorial for deep learning users. IEEE Comput. Intell. Mag. 2022, 17, 29–48. [Google Scholar] [CrossRef]
  86. Liu, S.; Xiao, T.P.; Kwon, J.; Debusschere, B.J.; Agarwal, S.; Incorvia, J.A.C.; Bennett, C.H. Bayesian neural networks using magnetic tunnel junction-based probabilistic in-memory computing. Front. Nanotechnol. 2022, 4, 1021943. [Google Scholar] [CrossRef]
  87. Gawlikowski, J.; Tassi, C.R.N.; Ali, M.; Lee, J.; Humt, M.; Feng, J.; Kruspe, A.; Triebel, R.; Jung, P.; Roscher, R.; et al. A survey of uncertainty in deep neural networks. Artif. Intell. Rev. 2023, 56, 1513–1589. [Google Scholar] [CrossRef]
  88. Wasilewski, J.; Paterek, T.; Horodecki, K. Uncertainty of feed forward neural networks recognizing quantum contextuality. J. Phys. A: Math. Theor. 2024, 56, 455305. [Google Scholar] [CrossRef]
  89. McKenna, T.M.; McMullen, T.A.; Shlesinger, M.F. The brain as a dynamic physical system. Neuroscience 1994, 60, 587–605. [Google Scholar] [CrossRef]
  90. Korn, H.; Faure, P. Is there chaos in the brain? II. Experimental evidence and related models. C. R. Biol. 2003, 326, 787–840. [Google Scholar] [CrossRef]
  91. Marinca, V.; Herisanu, N. Nonlinear Dynamical Systems in Engineering; Springer: Berlin/Heidelberg, Germany, 2012. [Google Scholar]
  92. Maass, W.; Natschläger, T.; Markram, H. Real-time computing without stable states: A new framework for neural computation based on perturbations. Neural Comput. 2002, 14, 2531–2560. [Google Scholar] [CrossRef] [PubMed]
  93. Jaeger, H.; Haas, H. Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication. Science 2004, 304, 78–80. [Google Scholar] [CrossRef]
  94. Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 1989, 2, 303–314. [Google Scholar] [CrossRef]
  95. Haykin, S. Neural Networks: A Comprehensive Foundation; Pearson-Prentice Hall: Singapore, 1998. [Google Scholar]
  96. Maksymov, I.S.; Greentree, A.D. Coupling light and sound: Giant nonlinearities from oscillating bubbles and droplets. Nanophotonics 2019, 8, 367–390. [Google Scholar] [CrossRef]
  97. Ramachandran, P.; Zoph, B.; Le, Q.V. Searching for Activation Functions. Available online: https://openreview.net/forum?id=SkBYYyZRZ (accessed on 23 March 2025).
  98. Hendrycks, D.; Gimpel, K. Bridging Nonlinearities and Stochastic Regularizers with Gaussian Error Linear Units. Available online: https://openreview.net/forum?id=Bk0MRI5lg (accessed on 23 April 2025).
  99. Lin, T.; Wang, Y.; Liu, X.; Qiu, X. A survey of transformers. AI Open 2022, 3, 111–132. [Google Scholar] [CrossRef]
  100. Glorot, X.; Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, Sardinia, Italy, 13–15 May 2010; Teh, Y.W., Titterington, M., Eds.; Proceedings of Machine Learning Research. Volume 9, pp. 249–256. [Google Scholar]
  101. Zhou, V. RNN from Scratch. 2019. Available online: https://github.com/vzhou842/rnn-from-scratch (accessed on 21 February 2025).
  102. Abbas, A.H.; Maksymov, I.S. Reservoir computing using measurement-controlled quantum dynamics. Electronics 2024, 13, 1164. [Google Scholar] [CrossRef]
  103. Mackey, M.C.; Glass, L. Oscillation and chaos in physiological control systems. Science 1977, 197, 287–289. [Google Scholar] [CrossRef]
  104. McNeese, N.J.; Demir, M.; Cooke, N.J.; Myers, C. Teaming with a synthetic teammate: Insights into human-autonomy teaming. Hum. Factors 2018, 60, 262–273. [Google Scholar] [CrossRef]
  105. Lyons, J.B.; Wynne, K.T.; Mahoney, S.; Roebke, M.A. Trust and Human-Machine Teaming: A Qualitative Study. In Artificial Intelligence for the Internet of Everything; Lawless, W., Mittu, R., Sofge, D., Moskowitz, I.S., Russell, S., Eds.; Academic Press: Cambridge, MA, USA, 2019; pp. 101–116. [Google Scholar] [CrossRef]
  106. Henry, K.E.; Kornfield, R.; Sridharan, A.; Linton, R.C.; Groh, C.; Wang, T.; Wu, A.; Mutlu, B.; Saria, S. Human–machine teaming is key to AI adoption: Clinicians’ experiences with a deployed machine learning system. Npj Digit. Med. 2022, 5, 97. [Google Scholar] [CrossRef]
  107. Greenberg, A.M.; Marble, J.L. Foundational concepts in person-machine teaming. Front. Phys. 2023, 10, 1080132. [Google Scholar] [CrossRef]
  108. Tzafestas, S.G. Human–Machine Interaction in Automation (I): Basic Concepts and Devices. In Human and Nature Minding Automation: An Overview of Concepts, Methods, Tools and Applications; Springer: Dordrecht, The Netherlands, 2010; pp. 47–60. [Google Scholar] [CrossRef]
  109. Wu, L.; Zhu, Z.; Cao, H.; Li, B. Influence of information overload on operator’s user experience of human–machine interface in LED manufacturing systems. Cogn. Technol. Work 2016, 18, 161–173. [Google Scholar] [CrossRef]
  110. Yan, S.; Tran, C.C.; Chen, Y.; Tan, K.; Habiyaremye, J.L. Effect of user interface layout on the operators’ mental workload in emergency operating procedures in nuclear power plants. Nucl. Eng. Des. 2017, 322, 266–276. [Google Scholar] [CrossRef]
  111. Laurila-Pant, M.; Pihlajamäki, M.; Lanki, A.; Lehikoinen, A. A protocol for analysing the role of shared situational awareness and decision-making in cooperative disaster simulations. J. Disaster Risk Reduct. 2023, 86, 103544. [Google Scholar] [CrossRef]
  112. Lim, G.J.; Cho, J.; Bora, S.; Biobaku, T.; Parsaei, H. Models and computational algorithms for maritime risk analysis: A review. Ann. Oper. Res. 2018, 271, 765–786. [Google Scholar] [CrossRef]
  113. Pei, Z.; Rojas-Arevalo, A.M.; de Haan, F.J.; Lipovetzky, N.; Moallemi, E.A. Reinforcement learning for decision-making under deep uncertainty. J. Environ. Manag. 2024, 359, 120968. [Google Scholar] [CrossRef]
  114. Zhu, D.; Li, Z.; Mishra, A.R. Evaluation of the critical success factors of dynamic enterprise risk management in manufacturing SMEs using an integrated fuzzy decision-making model. Technol. Forecast. Soc. Chang. 2023, 186, 122137. [Google Scholar] [CrossRef]
  115. Brachten, F.; Brünker, F.; Frick, N.R.; Ross, B.; Stieglitz, S. On the ability of virtual agents to decrease cognitive load: An experimental study. Inf. Syst. E-Bus. Manag. 2020, 18, 187–207. [Google Scholar] [CrossRef]
  116. Carvalho, A.V.; Chouchene, A.; Lima, T.M.; Charrua-Santos, F. Cognitive manufacturing in Industry 4.0 toward cognitive load reduction: A conceptual framework. Appl. Syst. Innov. 2020, 3, 55. [Google Scholar] [CrossRef]
  117. McDermott, P.L.; Walker, K.E.; Dominguez, C.O.; Nelson, A.; Kasdaglis, N. Quenching the thirst for human-machine teaming guidance: Helping military systems acquisition leverage cognitive engineering research. In Proceedings of the 13th International Conference on Naturalistic Decision Making, Bath, UK, 20–23 June 2017; Gore, J., Ward, P., Eds.; pp. 236–240. [Google Scholar]
  118. Madni, A.M.; Madni, C.C. Architectural framework for exploring adaptive human-machine teaming options in simulated dynamic environments. Systems 2018, 6, 44. [Google Scholar] [CrossRef]
  119. Yearsley, J.M.; Busemeyer, J.R. Quantum cognition and decision theories: A tutorial. J. Math. Psychol. 2016, 74, 99–116. [Google Scholar] [CrossRef]
  120. Liang, P.P.; Zadeh, A.; Morency, L.P. Foundations & trends in multimodal machine learning: Principles, challenges, and open questions. ACM Comput. Surv. 2024, 56. [Google Scholar] [CrossRef]
  121. Shougat, M.R.E.U.; Li, X.; Shao, S.; McGarvey, K.; Perkins, E. Hopf physical reservoir computer for reconfigurable sound recognition. Sci. Rep. 2023, 13, 8719. [Google Scholar] [CrossRef]
  122. Haigh, K.Z.; Nguyen, T.; Center, T.R.M. Challenges of Testing Cognitive EW Systems. In Proceedings of the 2023 IEEE AUTOTESTCON, National Harbor, MD, USA, 28–31 August 2023; IEEE: New York, NY, USA, 2023; pp. 1–8. [Google Scholar]
  123. Wang, Y.; Widrow, B.; Hoare, C.A.R.; Pedrycz, W.; Berwick, R.C.; Plataniotis, K.N.; Rudas, I.J.; Lu, J.; Kacprzyk, J. The odyssey to next-generation computers: Cognitive computers (κC) inspired by the brain and powered by intelligent mathematics. Front. Comput. Sci. 2023, 5, 1152592. [Google Scholar] [CrossRef]
Figure 1. (a) Schrödinger’s cat thought experiment. A cat is placed in a sealed, opaque box containing a radioactive atom, a Geiger counter, a vial of poison, and a hammer. If the atom decays, the Geiger counter triggers the hammer to release the poison, killing the cat. Until the box is opened and observed, quantum mechanics suggests the cat exists in a superposition of being both alive and dead. (b) Illustration of a projective measurement of a qubit | ψ using the Bloch sphere, where measurement collapses the qubit from a superposition to a definite state. (c) The double-slit experiment, which demonstrates quantum interference, showing that particles such as electrons behave as waves, creating an interference pattern until observed. (d) Illustration of wavefunction collapse triggered by detection using an electron detector.
Figure 2. (a.ia.iv) Instantaneous snapshots of an energy wave packet modelling the interaction of an electron with a double-slit structure. (b.ib.iv) Instantaneous snapshots of an energy wave packet modelling the tunnelling of an electron through a continuous potential barrier. The false-colour scale of the images encodes the computed probability density values.
Figure 3. Schematic representation of a generic neural network model, illustrating the replacement of the traditional ReLU activation function with the physical QT effect. Other types of activation functions can be substituted in a similar manner. In this paper, the Softmax activation function remains unaltered. The proposed replacement approach has been demonstrated to be effective across all neural network models examined in this study.
Figure 4. Fourier spectra of the outputs of (a,e) ReLU, (b,f) sigmoid, (c,g) identity, and (d,h) QT functions activated by a sinusoidal wave signal at a frequency of 1 Hz. Note the highest number of non-linearly generated higher-order harmonics in the spectrum of the QT function.
Figure 5. Example of classifications from a randomly selected subset of the MNIST testing dataset produced by the QT-feedforward neural network. The labels above each panel indicate the predicted categories alongside the ground truth labels.
Figure 6. Test accuracy (top panel) and loss (bottom panel) over training epochs for QT-RNN (blue circles) and standard RNN models (red squares).
Figure 7. Example of classifications from a randomly selected subset of the Fashion MNIST testing dataset produced by the QT-BNN model. The labels above each panel indicate the predicted categories alongside the ground truth labels.
Figure 8. Forecast of MGTS made by the QT-ESN system (blue solid line) compared with the target signal (green dashed line).
Figure 9. Conceptual illustration of cognitive human–machine teaming, highlighting cognitive interactions between human operators and ML processes. The bright dots emerging from the human input represent the flow of information. Own work by the first author.