1. Introduction: When Ideas Collapse into Insight
Creative insight is widely recognized as a fundamental yet elusive aspect of human cognition. It is typically characterized as a sudden and often unexpected transition from a state of cognitive impasse to one of solution or conceptual clarity [1,2]. The phenomenology of insight has been documented across diverse domains, including problem solving, metaphor generation, scientific discovery, and artistic production [3,4,5]. Despite its centrality in mental life, modeling insight computationally remains an open and challenging problem [2,6,7,8].
Several theoretical accounts have sought to explain the mechanisms underlying insight, ranging from heuristic-based symbolic search to associative network models that rely on spreading activation within a conceptual space [9,10]. However, these models often struggle to capture the non-linear, discontinuous, and context-sensitive nature of creative transitions [4,11]. More recent approaches have proposed dual-process theories that distinguish between fast associative mechanisms and slower controlled processes [10,12]. Yet, even dual-process frameworks face limitations in formally accounting for the dynamics of sudden restructuring or emergence that characterize genuine insight episodes [4,13].
An alternative line of research has emerged over the past two decades that leverages the mathematical formalism of quantum theory, originally developed for physics, as a modeling framework for cognitive phenomena that exhibit contextuality, interference, and probabilistic emergence [6,14,15]. In particular, quantum models have been applied to concept combination, decision-making under ambiguity, and the role of context in mental representation [1,6,16]. Within this tradition, a growing body of work has suggested that creative cognition, and especially insight, may benefit from a quantum-theoretic treatment [1,3,17]. These models typically rely on the notions of superposition, collapse, and entanglement to capture the emergence of novel ideas from latent cognitive states.
The present work is situated within this broader line of quantum-inspired cognitive modeling. It builds upon previous efforts to represent conceptual dynamics in Hilbert space [1,14], extending them through the use of quantum walks on semantic graphs as a means to simulate the exploratory processes that precede insight. Rather than postulating insight as a discrete event imposed by an external rule or threshold, we aim to show that its probabilistic emergence can be modeled as an endogenous property of a quantum dynamical system navigating a structured conceptual landscape.
In this paper, we develop a quantum-inspired account of insight that moves beyond a purely conceptual proposal. We first establish, in a precise mathematical form, when and how interference can generate short-time amplifications of activation, that is, conditions that make remote but contextually appropriate concepts temporarily more accessible. Building on this foundation, we introduce a set of quantitative observables and nonparametric statistics that let us read the model empirically rather than impressionistically, separating target-centric behavior from open-ended exploration. We then put the framework to work on large empirically derived ConceptNet subgraphs spanning three settings: an illustrative Candle instance, twenty Remote Associates Test (RAT) triads, and Alternative Uses. In every setting, we compare the continuous-time quantum walk (CTQW) with classical diffusion (CRW) under identical initial conditions and a common spectrum-scaled time normalization. Across these conditions, CTQW reliably concentrates more probability on single-solution targets and reaches higher peaks sooner, while in open-ended tasks it explores both more broadly and more deeply within the same temporal budget. The pattern is robust to normalized generator choices and small variations in the rate parameter once time is scaled by the operator spectrum.
The paper is organized as follows. We begin by locating our contribution within the existing literature (Section 2), setting the stage with essentials from semantic networks, quantum walks, and the cognitive motivation that ties them together. Section 3 then fixes notation and modeling choices and, to keep the paper accessible, folds in a brief primer for non-specialists alongside a short note on feasibility and implementation. With this scaffold in place, Section 4 develops the mathematical core (our propositions on time averaging and transient amplification), while Section 5 lays out the quantitative yardsticks and statistical procedures that will anchor the claims. Section 6 and Section 7 introduce the operational CTQW/CRW setup and the Candle case study. Section 8 scales up to ConceptNet-derived subgraphs, detailing the datasets, experimental setup, and results for Candle, twenty RAT triads, and Alternative Uses. Section 9 closes with a discussion of the implications and potential future directions.
  3. From Semantics to Superposition: Unified Notation and Setup
We represent task-relevant conceptual knowledge as a finite undirected weighted graph $G=(V,E,w)$ with $n=|V|$ nodes. Nodes $i\in V$ correspond to concepts, and edge weights $w_{ij}>0$ encode associative strength obtained from empirical resources (e.g., co-occurrence statistics, psycholinguistic word-association norms, WordNet, and ConceptNet) or from synthetic constructions designed to probe specific topologies (small-world, scale-free, and modular). The (symmetric) adjacency matrix is $A=(A_{ij})$, with $A_{ij}=w_{ij}$ for $(i,j)\in E$ and $A_{ij}=0$ otherwise; the diagonal degree matrix is $D=\mathrm{diag}(d_1,\dots,d_n)$, with $d_i=\sum_j A_{ij}$; the (combinatorial) Laplacian is $L=D-A$. These matrices are real-symmetric (hence Hermitian) and admit an orthonormal eigenbasis. In what follows, we work in the node (computational) basis $\{|i\rangle\}_{i\in V}$ (bra–ket notation), identifying $|i\rangle$ with the i-th canonical basis vector when convenient.
Each concept node $i$ is thus associated with a basis vector $|i\rangle$, and these vectors form an orthonormal computational basis. Orthogonality here reflects perfect discriminability of labels at readout and is a representational choice common in quantum walks on graphs; semantic relatedness is encoded in the graph (edge weights) and hence in the Hamiltonian, not as basis overlap. This avoids double counting similarity in both the state space and the dynamics. The framework can be generalized to non-orthogonal concept states with a positive-definite Gram matrix. One may work with the Gram-induced inner product (so that H is Hermitian with respect to it) or equivalently pass to an orthonormal basis via a whitening transform applied to both the states and the Hamiltonian. In that representation, our spectral statements and projector formulas apply as usual, while measurement is represented by a POVM. For clarity, and because our empirical nodes are discrete identifiers, we adopt the orthonormal coding throughout and let the graph weights govern similarity and interference via H.
This graph is not a passive store but the substrate on which conceptual dynamics unfold. The classical baseline is continuous-time diffusion governed by the heat equation $\frac{d}{dt}p(t)=-L\,p(t)$, with solution $p(t)=e^{-Lt}\,p(0)$ (see also Equation (17)), so that probabilities evolve under the graph Laplacian. It provides an interference-free spread of activation that respects local conservation and yields smooth relaxation. In the quantum account adopted here, conceptual dynamics are modeled by a continuous-time quantum walk (CTQW) on G. The cognitive state evolves unitarily as
$$|\psi(t)\rangle = e^{-iHt}\,|\psi(0)\rangle,\qquad H\in\{A,\ \gamma L\},\quad \gamma>0, \tag{1}$$
which defines the continuous-time quantum walk on the semantic graph.
The choice $H=A$ emphasizes direct adjacency-mediated transitions, whereas $H=\gamma L$ emphasizes diffusion-like propagation with a tunable rate. For $H=A$, any global scaling can be absorbed into the time unit, so we fix its prefactor to one without loss of generality; for $H=\gamma L$, we keep $\gamma$ explicit to control the propagation rate. Unless otherwise stated, the initial condition is localized on a source node $s\in V$, namely $|\psi(0)\rangle=|s\rangle$; we occasionally discuss more diffuse or context-biased superpositions when theoretically informative. The instantaneous probability of observing node i at time t is
$$P_i^{\mathrm{QW}}(t)=\left|\langle i|\psi(t)\rangle\right|^2, \tag{2}$$
while in the classical model we denote by $p_i^{\mathrm{CRW}}(t)$ the i-th component of $p(t)$ with the same source $p(0)=e_s$ (the Dirac unit vector at s). When a set of targets $R\subseteq V$ is designated by the task (e.g., solution concepts), we aggregate mass as $P_R^{\mathrm{QW}}(t)=\sum_{r\in R}P_r^{\mathrm{QW}}(t)$ and $p_R^{\mathrm{CRW}}(t)=\sum_{r\in R}p_r^{\mathrm{CRW}}(t)$. Proximity in the graph serves as a proxy for cognitive accessibility, yet weights $w_{ij}$ need not be mere distances: they may encode familiarity, typicality, cue validity, or other psychologically meaningful factors.
Equation (2) provides the node-wise probabilities of the quantum walk, obtained as the squared modulus of the amplitude at node i. These are the directly observable quantities that we compare against the classical baseline.
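The two dynamics can be compared numerically on a small graph. The following is a minimal sketch, assuming a dense real-symmetric generator that can be diagonalized exactly; the 4-cycle, the function names `ctqw_probabilities` and `crw_probabilities`, and the time grid are illustrative assumptions and are not taken from the paper's experiments.

```python
import numpy as np

def ctqw_probabilities(H, source, times):
    """P[k, i] = |<i| exp(-i H t_k) |source>|^2 for a real-symmetric H."""
    evals, evecs = np.linalg.eigh(H)
    coeffs = evecs[source, :]                       # overlaps <phi_m | source>
    phases = np.exp(-1j * np.outer(times, evals))   # shape (len(times), n)
    amps = (phases * coeffs) @ evecs.T              # amplitude at every node
    return np.abs(amps) ** 2

def crw_probabilities(L, source, times):
    """p[k, i] = (exp(-L t_k) e_source)_i for a symmetric Laplacian L."""
    evals, evecs = np.linalg.eigh(L)
    coeffs = evecs[source, :]
    decay = np.exp(-np.outer(times, evals))
    return (decay * coeffs) @ evecs.T

# Toy 4-cycle 0-1-2-3-0 with unit weights (hypothetical example graph).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
times = np.linspace(0.0, np.pi / 2, 51)

P_qw = ctqw_probabilities(A, source=0, times=times)   # H = A
p_crw = crw_probabilities(L, source=0, times=times)   # heat kernel
```

On this toy graph the two paths from node 0 to node 2 interfere constructively, so the quantum probability on node 2 grows far beyond the classical value, which stays below the uniform level 1/4.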
The Hilbert space picture makes the cognitive reading explicit. Each concept $i\in V$ is a basis state $|i\rangle$, and a mental configuration is a normalized superposition
$$|\psi\rangle=\sum_{i\in V}\alpha_i\,|i\rangle,\qquad \sum_{i\in V}|\alpha_i|^2=1, \tag{3}$$
in which amplitudes $\alpha_i$ quantify the momentary potential of concepts to be actualized. Localized states model focused search; broader superpositions capture exploratory modes and context sensitivity. Under the graph-induced Hamiltonian, amplitudes propagate along multiple paths and combine with phases; when phases align, constructive interference can transiently amplify probability on conceptually “remote” nodes, offering a mechanistic route to sudden restructuring episodes often described as insight. Conversely, destructive interference can suppress otherwise accessible alternatives. Beyond single concepts, entanglement-like structures (e.g., compositional subspaces for co-activated pairs) can model joint contextualization, whereby meanings are not merely additive but depend on relational constraints.
This representational stance departs from static symbol manipulation: conceptual content is not a fixed object but a context-sensitive process realized as state evolution. It aligns with empirical and phenomenological accounts of creative thought in which breakthroughs arise from reconfigurations among seemingly unrelated elements rather than from linear associative chains. Importantly, while some illustrations will use small hand-crafted networks to elucidate mechanisms, the same formalism applies to graphs derived from empirical linguistic data, opening a path to hybrid approaches that combine principled quantum dynamics with data-driven semantic structure.
To summarize performance over a finite horizon $[0,T]$, we will report both level and latency measures on designated targets. For a single target r with probability curve $P_r(t)$,
$$\mathrm{AUC}_r=\int_0^T P_r(t)\,dt, \tag{4}$$
approximated in practice on a uniform grid $t_k=k\,\Delta t$ ($k=0,\dots,K$, $\Delta t=T/K$) by Riemann sums. Given a decision threshold $\theta\in(0,1)$, the (first) time-to-threshold is
$$\tau_r(\theta)=\inf\{t\in[0,T]:\ P_r(t)\ge\theta\}. \tag{5}$$
As shown in Equation (4), AUC integrates the probability mass on the target node over the normalized horizon, providing a level-based summary of accessibility. Equation (5) defines the time-to-threshold metric, i.e., the first time a node’s probability reaches a chosen level $\theta$. This captures the latency with which a solution becomes salient.
Unless otherwise indicated, all CTQW/CRW comparisons use the same graph, source node, and time horizon; we vary , the Hamiltonian choice (A vs L), and  and assess sensitivity accordingly. This unified setup provides the bridge from semantic structure to superposed cognitive dynamics, which we analyze mathematically in the next section and evaluate empirically thereafter.
  4. Mathematical Analysis
We now formalize two aspects that distinguish continuous-time quantum walks (CTQWs) on semantic graphs from classical diffusion (CRW): (i) the structure of the long-time average distribution, and (ii) the possibility of transient phase-driven amplification on semantically remote targets. Throughout, we refer to the unified notation of Section 3.
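Proposition 1 (Time-averaged CTQW distribution).
Let H be a Hermitian graph Hamiltonian, either $H=A$ or $H=\gamma L$ with $\gamma>0$, and let $H=\sum_{\lambda}\lambda\,P_\lambda$ be its spectral decomposition into orthogonal projectors $P_\lambda$ onto eigenspaces associated with distinct eigenvalues λ. For any initial state $|\psi(0)\rangle$ and node $r\in V$,
$$\bar P_r:=\lim_{T\to\infty}\frac{1}{T}\int_0^T\left|\langle r|e^{-iHt}|\psi(0)\rangle\right|^2dt=\sum_{\lambda}\left|\langle r|P_\lambda|\psi(0)\rangle\right|^2. \tag{6}$$
Equation (6) provides the long-time average probability of observing node r, which depends only on the spectral projectors of H. In particular, only intra-eigenspace coherences contribute to the time average, while interference across distinct eigenvalues washes out.
Proof.  Expand the unitary evolution in the eigenbasis of H: $e^{-iHt}=\sum_{\lambda}e^{-i\lambda t}P_\lambda$. Then, $\langle r|\psi(t)\rangle=\sum_{\lambda}e^{-i\lambda t}\,\langle r|P_\lambda|\psi(0)\rangle$. Squaring the modulus and averaging over t kills oscillatory cross-terms with frequencies $\lambda-\lambda'\neq 0$ by the Riemann–Lebesgue lemma; only terms with $\lambda=\lambda'$ survive, yielding $\sum_{\lambda}|\langle r|P_\lambda|\psi(0)\rangle|^2$. Degeneracies are handled by the projector $P_\lambda$ at fixed $\lambda$. The statement holds identically for $H=A$ and for $H=\gamma L$; in the latter case, the scaling $\gamma$ rescales time but does not affect the infinite-time average.    □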
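The projector formula in Proposition 1 can be evaluated directly from an eigendecomposition. The sketch below is one possible implementation, assuming a small dense real-symmetric H and a localized initial state; the function name `time_averaged_distribution`, the tolerance used to group degenerate eigenvalues, and the triangle graph are illustrative assumptions.

```python
import numpy as np

def time_averaged_distribution(H, source, tol=1e-9):
    """Long-time average node probabilities for a walk started at `source`.

    Eigenvectors whose eigenvalues agree within `tol` are grouped so that
    each group spans one eigenspace (one spectral projector P_lambda).
    """
    evals, evecs = np.linalg.eigh(H)      # eigenvalues in ascending order
    n = H.shape[0]
    avg = np.zeros(n)
    start = 0
    while start < n:
        stop = start + 1
        while stop < n and abs(evals[stop] - evals[start]) < tol:
            stop += 1
        V = evecs[:, start:stop]          # orthonormal basis of one eigenspace
        amp = V @ V[source, :].conj()     # P_lambda applied to |source>
        avg += np.abs(amp) ** 2
        start = stop
    return avg

# Triangle graph (complete graph on 3 nodes): eigenvalues 2, -1, -1.
A_tri = np.ones((3, 3)) - np.eye(3)
avg_tri = time_averaged_distribution(A_tri, source=0)
```

For the triangle, the degenerate eigenvalue -1 contributes through a single rank-two projector, which gives the averages 5/9 on the source and 2/9 on each other node.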
 The next result isolates a sufficient spectral condition under which interference can produce transient probability gains on a target r relative to diffusion. Intuitively, if at least two eigenmodes connect source s and target r with comparable weights, their phases can align and yield a constructive cross-term that has no classical analogue.
Proposition 2 (Transient interference-based amplification). 
Let  with , and consider the transition amplitude  where  and  is an orthonormal eigenbasis of H. Assume there exist two indices  such that , and the residual spectral weight  is finite. Then, there exists a time  such thatMoreover, for the classical heat kernel with the same initial condition,where  and . For all ,Consequently, if , then there exists a time  at which . Observe that Equation (
7) provides a lower bound on the quantum probability at the target, showing that constructive interference of multiple contributing amplitudes can guarantee a non-trivial activation. Equation (
8) expresses the classical diffusion probability in the eigenbasis of the Laplacian, where decay is governed by the eigenvalues 
. Equation (
9) provides an upper bound for the classical probability at node 
r, showing that it can never exceed the cumulative spectral weight 
B.
Proof.  Write 
, where 
 and 
. Choose 
 so that the two dominant terms are phase-aligned; i.e., 
 with 
, which maximizes 
 to 
. Then, by the reverse triangle inequality,
        and squaring provides the stated lower bound: 
 For the classical case, the heat kernel has no oscillatory cross-terms; by the triangle inequality, 
 If the quantum lower bound exceeds 
B, the claimed strict inequality follows.    □
 Equation (
10) makes explicit that the contribution of the two principal modes adds at least 
; i.e., interference can secure a non-trivial amplitude even in the presence of residual spectral mass.
Corollary 1 (Graph-level sufficient condition). Suppose there exist two internally vertex-disjoint  paths in G with distinct effective lengths/weights (so that the spectrum of H induces at least two eigenmodes with nonzero overlaps at s and r). Then, under mild non-degeneracy of the corresponding overlaps  and sufficiently small residual spectral mass R, the condition of Proposition 2 holds; hence, there exists a time  with .
 Proof.  Two distinct families of  paths typically generate distinct spectral contributions with nonzero overlaps at both s and r; for generic positive weights, this yields  with . The disjointness assumption controls interference with other modes, making R small on small-to-moderate graphs or on well-separated path families. A full proof can be provided under specific regularity assumptions.    □
Remark 1 (Choice of H and role of $\gamma$). The statements above are agnostic to whether $H=A$ or $H=\gamma L$. Proposition 1 depends only on the spectral projectors of H; scaling by $\gamma$ rescales phases $e^{-i\lambda t}$ but leaves the infinite-time average invariant. In Proposition 2, γ shifts the alignment time through the eigenvalue gaps, thereby modulating when amplification occurs, not whether the constructive cross-term can arise. Empirically, we therefore expect similar qualitative effects under both $H=A$ and $H=\gamma L$, with different peak timings and bandwidths; this is consistent with the experimental sensitivity analyses reported later.
 Implications for experiments.  Proposition 1 justifies reporting long-horizon summaries (e.g., AUC) together with transient metrics since persistent biases are tied to spectral weight within degenerate eigenspaces. Proposition 2 and Corollary 1 motivate seeking semantic subgraphs where multiple path families connect source and target (e.g., via alternative associative routes). Under such conditions, CTQWs admit time windows where target probability exceeds the diffusion baseline, a signature we quantify in the experimental sections.
To quantify these effects, we adopt the level and latency metrics defined in Section 5.
  5. Metrics and Statistics
We summarize model behavior over a finite horizon $[0,T]$ using level and latency metrics on designated targets in the unified notation of Section 3. Let an experimental instance $j$ specify a graph $G_j$, a source $s_j$, and a target (or target set) $r_j$ (resp. $R_j$). Time is discretized as $t_k=k\,\Delta t$ for $k=0,\dots,K$, with $\Delta t=T/K$.
This uniform grid spans the auto-normalized horizon in Equation (18), ensuring comparability across graphs and operators.
To begin with, we quantify overall target activation through the area under the curve (AUC). Specifically, for a single target r with probability curve $P_r(t)$,
$$\mathrm{AUC}_r=\int_0^T P_r(t)\,dt, \tag{11}$$
approximated by Riemann sums $\mathrm{AUC}_r\approx\sum_k P_r(t_k)\,\Delta t$. We also report the normalized AUC $\mathrm{AUC}_r/T$. For a set of targets R, replace $P_r$ with the aggregate $P_R=\sum_{r\in R}P_r$.
As defined in Equation (11), AUC summarizes the total probability mass accrued on the target over the horizon $[0,T]$, providing a level-based measure complementary to peak and latency metrics.
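As an illustration of the AUC computation on a sampled curve, the following sketch uses a left Riemann sum; the choice of the left rule, the division by the horizon for the normalized value, and the function names are assumptions made here for concreteness.

```python
def aggregate_targets(curves):
    """Sum several sampled target curves point by point (target set R)."""
    return [sum(values) for values in zip(*curves)]

def auc_left_riemann(probs, dt):
    """Left Riemann sum of a probability curve sampled every `dt`."""
    return sum(probs[:-1]) * dt

def normalized_auc(probs, dt):
    """AUC divided by the horizon T = dt * (number of intervals)."""
    horizon = dt * (len(probs) - 1)
    return auc_left_riemann(probs, dt) / horizon

# Hypothetical sampled curve with K = 3 intervals of width 0.5.
curve = [0.0, 0.2, 0.4, 0.2]
auc_value = auc_left_riemann(curve, dt=0.5)
nauc_value = normalized_auc(curve, dt=0.5)
```

The normalized value can be read as the mean target probability over the horizon, which keeps instances with different horizons comparable.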
Beyond cumulative mass, we also track the most salient activation by considering peak probability and its timing. On the grid,
$$P_r^{\max}=\max_{0\le k\le K}P_r(t_k),\qquad t_r^{\mathrm{peak}}=\min\{t_k:\ P_r(t_k)=P_r^{\max}\}. \tag{12}$$
As per Equation (12), we summarize level and latency via the peak value and its earliest occurrence time on the sampled grid (for ties, we pick the earliest time).
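The peak metric with earliest-time tie-breaking can be sketched in a few lines; the function name and the sample values are hypothetical.

```python
def peak_and_time(probs, times):
    """Return the peak probability and the earliest grid time attaining it."""
    peak = max(probs)
    k = probs.index(peak)   # list.index returns the first occurrence
    return peak, times[k]

# The peak 0.5 is reached twice; the earlier time is reported.
peak_value, peak_time = peak_and_time([0.1, 0.5, 0.3, 0.5], [0.0, 1.0, 2.0, 3.0])
```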
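In addition to peak-based measures, we evaluate latency more directly via time-to-threshold metrics. Given $\theta\in(0,1)$, define the first-passage time
$$\tau_r(\theta)=\inf\{t\in[0,T]:\ P_r(t)\ge\theta\}. \tag{13}$$
Equation (13) defines the instance-level time-to-threshold, namely the earliest time within $[0,T]$ at which the target probability reaches $\theta$.
On the grid, we use linear interpolation: if $P_r(t_{k-1})<\theta\le P_r(t_k)$, set
$$\hat\tau_r(\theta)=t_{k-1}+\frac{\theta-P_r(t_{k-1})}{P_r(t_k)-P_r(t_{k-1})}\,\Delta t. \tag{14}$$
Equation (14) specifies the linear interpolation used to estimate the threshold-crossing time between two consecutive grid points.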
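The interpolated crossing time can be computed with a single pass over the grid. This is a sketch under the assumption that a curve which never reaches the threshold is reported as `None` (a right-censored observation); the function name is hypothetical.

```python
from typing import Optional, Sequence

def time_to_threshold(probs: Sequence[float],
                      times: Sequence[float],
                      theta: float) -> Optional[float]:
    """First time the sampled curve reaches `theta`, linearly interpolated.

    Returns None when the threshold is not reached within the horizon.
    """
    if probs[0] >= theta:
        return times[0]
    for k in range(1, len(probs)):
        lo, hi = probs[k - 1], probs[k]
        if lo < theta <= hi:
            frac = (theta - lo) / (hi - lo)
            return times[k - 1] + frac * (times[k] - times[k - 1])
    return None

grid = [0.0, 1.0, 2.0, 3.0]
curve = [0.0, 0.1, 0.3, 0.2]
tau_reached = time_to_threshold(curve, grid, theta=0.2)   # crosses between t=1 and t=2
tau_censored = time_to_threshold(curve, grid, theta=0.5)  # never reached
```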
If the threshold is not reached by T, we record a right-censored observation and report both the reach rate  and the conditional summary of  on the subset that reaches . In the main text, we use .
To compare quantum and classical dynamics on the same instances, we analyze paired differences across metrics. For each instance j and each level or peak metric $m$ defined above, we form paired differences
$$\Delta_j^{(m)}=m_j^{\mathrm{QW}}-m_j^{\mathrm{CRW}}, \tag{15}$$
and, for each $\theta$, $\Delta_j^{\tau(\theta)}=\tau_j^{\mathrm{QW}}(\theta)-\tau_j^{\mathrm{CRW}}(\theta)$ on the subset where both are uncensored. We summarize paired differences by median and interquartile range (IQR). For inference, we use the two-sided Wilcoxon signed-rank test on each metric family, reporting the exact (or continuity-corrected) p-value and the effect size
$$r=\frac{Z}{\sqrt{N}}, \tag{16}$$
where Z is the standardized Wilcoxon test statistic, and $N$ is the number of nonzero paired differences. Equation (16) defines the standardized effect size r, obtained from the test statistic Z and the effective sample size $N$. In numerical simulations, Equation (8) makes explicit that CRW trajectories are convex combinations of exponential decays. Unlike the constructive interference bound in Equation (7), the CRW distribution is limited by the static spectral sum in Equation (9), which restricts its capacity to amplify remote nodes.
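A self-contained sketch of the effect-size computation is given below. It assumes the normal approximation for the signed-rank statistic without tie or continuity correction, which is a simplification chosen here for brevity; the function name and the sample values are hypothetical.

```python
import math

def wilcoxon_effect_size(quantum, classical):
    """Return (Z, r) with r = Z / sqrt(N) for paired samples.

    Z is the normal-approximation standardization of the positive rank sum;
    N counts the nonzero paired differences.
    """
    diffs = [q - c for q, c in zip(quantum, classical) if q != c]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:                       # assign average ranks to tied |d|
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return z, z / math.sqrt(n)

# Hypothetical paired metric values: every difference favors the first sample.
z_stat, effect_r = wilcoxon_effect_size([2, 4, 6, 8, 10, 12], [1, 2, 3, 4, 5, 6])
```

For reported p-values, an established routine such as `scipy.stats.wilcoxon` would normally be preferred; the code above only illustrates how Z and r relate.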
For robust inference, we complement paired statistics with confidence intervals obtained via resampling. We report  confidence intervals for medians and for median paired differences via nonparametric bootstrap with  resamples (BCa intervals unless otherwise noted). For reach rates , we provide exact binomial CIs; when comparing reach rates between QW and CRW on paired instances, we report McNemar’s exact test.
Finally, to account for multiple hypotheses and ensure consistent reporting, we apply standard correction procedures.
Within each task family (e.g., Candle, RAT, and AUs), we adjust for multiple $\theta$-levels and metric families using the Holm–Bonferroni procedure over the corresponding set of hypotheses. While Equation (6) characterizes asymptotic behavior, our empirical metrics (AUC, peak, and threshold) focus on finite-horizon dynamics. Unless stated otherwise, all tables report medians and paired effect sizes r; figures display distributions (violin/box) of paired differences and time-series exemplars. All tests are two-sided.
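The Holm–Bonferroni step-down adjustment can be sketched as follows; the function name and the example p-values are hypothetical.

```python
def holm_bonferroni(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)   # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted

# Three hypothetical raw p-values from one task family.
adjusted_p = holm_bonferroni([0.01, 0.04, 0.03])
```

A hypothesis is rejected at level α when its adjusted p-value is at most α.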
  6. Walking the Unseen Paths
Before the full model and results, we collect the operational ingredients of continuous-time quantum walks (CTQWs) and contrast them with classical diffusion, keeping only what is used later for Figures 2 and 3. On a semantic graph $G=(V,E,w)$ with adjacency A, degree D, normalized adjacency $\hat A=D^{-1/2}AD^{-1/2}$, and normalized Laplacian $\hat L=I-\hat A$, the CTQW evolves unitarily, as described in Equation (1), and is read out on nodes via the Born rule (Equation (2)). Unnormalized generators $A$ and $L=D-A$ are reported as robustness checks. The matched classical continuous-time random walk (CRW) evolves a probability vector on the same graph and from the same initial mass $p(0)$ as
$$p(t)=e^{-Lt}\,p(0). \tag{17}$$
While CRW is stochastic and diffusive, CTQW is unitary and phase-bearing: amplitudes traveling along multiple paths can interfere constructively or destructively, yielding transient non-classical propagation and potential non-local reinforcement.
To enable comparisons across graphs and generators, we adopt an auto-normalized horizon
$$T=\frac{\tau}{\rho(H)}, \tag{18}$$
with $\rho(H)$ the spectral radius and $\tau$ fixed across all experiments; CTQW and CRW are sampled on the same time grid $\{t_k\}_{k=0}^{K}$ with step $\Delta t=T/K$. Under this convention, scaling (e.g., $H\mapsto cH$ with $c>0$) re-parameterizes time without changing qualitative interference patterns up to discretization.
Both CTQW and CRW are sampled on the same grid within the auto-normalized horizon of Equation (18) so that scale effects due to the spectral radius of the generator are controlled.
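The effect of spectral time normalization can be checked numerically. The sketch below assumes that the horizon is inversely proportional to the spectral radius, with a constant called `tau` here; that functional form, the helper names, and the three-node path graph are assumptions made for illustration.

```python
import numpy as np

def auto_horizon(H, tau):
    """Horizon scaled by the spectral radius of a real-symmetric H (assumed form)."""
    rho = np.max(np.abs(np.linalg.eigvalsh(H)))
    return tau / rho

def target_curve(H, source, target, tau, K):
    """Target probability of the CTQW on a uniform grid over the scaled horizon."""
    T = auto_horizon(H, tau)
    times = np.linspace(0.0, T, K + 1)
    evals, evecs = np.linalg.eigh(H)
    weights = evecs[target, :] * evecs[source, :]   # <target|phi_m><phi_m|source>
    amps = np.exp(-1j * np.outer(times, evals)) @ weights
    return times, np.abs(amps) ** 2

# Path graph 0-1-2; rescaling the generator must not change the sampled curve.
A_path = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0]])
t_base, P_base = target_curve(A_path, source=0, target=2, tau=4.0, K=40)
t_scaled, P_scaled = target_curve(3.0 * A_path, source=0, target=2, tau=4.0, K=40)
```

Multiplying the generator by 3 shortens the physical horizon by the same factor, so both runs sample the same points of the same curve.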
Nodes are concepts, edges carry associative weights, and $\{|i\rangle\}_{i\in V}$ is the computational basis. Initial states encode the task context (localized cue or superposition). Unless stated otherwise, we read a designated target $r$ and report target-centric metrics (AUC, peak, time-of-peak, and time-to-threshold) and exploration metrics (entropy, coverage, expected radius, and time-to-depth) as formally defined in Section 5. In cognitive terms, interference can concentrate probability on remote yet context-appropriate candidates, offering a process-level account of discontinuity and restructuring at commitment; a compact interpretation is provided in Section 9/Appendix A.
The classical baseline is the continuous-time random walk (CRW) in Equation (17), which is stochastic and diffusive: probability mass spreads locally and tends to privilege high-degree or nearby nodes. By contrast, the CTQW in Equation (1) is unitary and phase-bearing, so amplitudes traveling along multiple paths can interfere constructively or destructively. Under identical graphs, initial states, and the auto-normalized horizon in Equation (18), this contrast can yield transient non-classical propagation, most notably a non-local reinforcement of structurally supported yet semantically remote nodes. In the experiments, we make the comparison explicit by pairing CTQW and CRW trajectories on the same time grid and then reading the same target-centric and exploration metrics (Section 5).
Within a semantic network, the initial state encodes task context (localized cue or superposition), and the CTQW evolves multiple candidates in superposition until commitment at readout. Interference can concentrate probability on remote yet context-appropriate nodes, offering a process-level account of apparent discontinuity and restructuring at commitment. In this study, the interpretation is kept operational: claims are tied to numbered quantities (AUC, peak, time-of-peak, time-to-threshold, and exploration measures) and to the spectral statements proved earlier (propositions), which together predict when transient amplification should emerge on a given graph (see Section 5 for metric definitions and Section 8 for empirical tests). A compact non-technical discussion is deferred to Section 9/Appendix A.
The framework is quantum-inspired and operational. Amplitudes evolve unitarily on a graph-induced real-symmetric d-sparse generator $H\in\{\hat A,\hat L\}$ (robustness checks with $A$ and $L$); spectra lie in $[-1,1]$ for $\hat A$ and $[0,2]$ for $\hat L$. We adopt the auto-normalized horizon $T$ (Equation (18)) and sample all trajectories on a uniform grid shared by CTQW and CRW. Simulations are executed classically via sparse actions of the matrix exponential (e.g., expm_multiply), with per-step cost dominated by sparse matrix-vector products and tolerance-controlled Krylov subspaces; unit-norm preservation provides a runtime diagnostic, and empirical scaling is reported in Section 8.1. A hardware realization would amount to implementing $e^{-iHt}$ with standard Hamiltonian-simulation techniques for d-sparse Hermitian matrices; our claims here concern the process model and its testable signatures on semantic graphs.
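As a rough illustration of the sparse simulation route, the sketch below applies SciPy's `scipy.sparse.linalg.expm_multiply` over a uniform time grid and uses the deviation from unit norm as a diagnostic. The 4-cycle, the horizon, and the variable names are assumptions chosen for a quick check, not the paper's configuration.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import expm_multiply

# Toy sparse generator: unit-weight 4-cycle (hypothetical example).
H_sparse = csr_matrix(np.array([[0, 1, 0, 1],
                                [1, 0, 1, 0],
                                [0, 1, 0, 1],
                                [1, 0, 1, 0]], dtype=float))

psi0 = np.zeros(4, dtype=complex)
psi0[0] = 1.0                      # walker localized on node 0

T, K = np.pi / 2, 50               # illustrative horizon and grid size
# Row k holds exp(-i H t_k) psi0 for t_k on the uniform grid over [0, T].
states = expm_multiply(-1j * H_sparse, psi0,
                       start=0.0, stop=T, num=K + 1, endpoint=True)
probs = np.abs(states) ** 2

# Unitary evolution preserves the norm, so this drift should be tiny.
norm_drift = np.max(np.abs(probs.sum(axis=1) - 1.0))
```

Only matrix-vector products with the sparse generator are needed, so the dense matrix exponential is never formed.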
  7. Modeling and Visualizing Insight Dynamics via Quantum Walks
In this section, we present a computational simulation designed to explore the dynamic behavior of concept activation during insight problem solving. The Candle Problem, a well-known cognitive task requiring representational change, serves as our case study. By modeling the underlying semantic structure as a graph of conceptual associations, we compare the trajectories of a classical random walk and a continuous-time quantum walk initiated from a key element in the problem.
Our aim is to evaluate whether the quantum dynamics can account for the emergence of non-obvious yet functional associations. We construct a semantic network encoding the problem space and examine how each model distributes probability over time across conceptual nodes. This approach allows us to investigate whether quantum interference and coherence can facilitate representational restructuring, thereby offering a computational perspective on creative insight.
The proposed model and the associated simulations are fully accessible and reproducible via an interactive Google Colab notebook [23].
  7.1. The Candle Problem: A Case of Insight
The Candle Problem, originally introduced by Karl Duncker in the 1940s [24], is a classical experimental paradigm used to study the phenomenon of creative insight. In its standard formulation, participants are presented with a candle, a box of tacks, and a book of matches. They are asked to attach the candle to a vertical wall in such a way that it does not drip wax onto the table below. Most participants initially attempt to affix the candle using the tacks or melt it with the matches, leading to an impasse. The insight-based solution involves emptying the box of tacks and repurposing it as a shelf for the candle, thus overcoming functional fixedness and reinterpreting the box’s role from a container to a support.
This problem has been widely studied in the psychology of problem solving and creativity as a prototypical example of representational restructuring [25,26]. It exemplifies key features of insight: a period of unsuccessful search, a mental impasse, and a sudden restructuring that redefines object affordances. Notably, the solution often emerges not through incremental reasoning but via a discontinuous shift in conceptual representation [27].
Recent cognitive models have sought to formalize the mechanisms underlying such restructuring, including dual-process theories [12], spreading activation in semantic networks [9], and Bayesian approaches [28]. However, these frameworks often struggle to account for the non-local interference-driven dynamics that characterize true insight episodes.
In this section, we adopt the Candle Problem as a case study to evaluate our quantum-inspired model of creative cognition. We propose that a continuous-time quantum walk over a structured semantic graph can capture the non-linear dynamics required for a distant but contextually relevant concept, such as shelf, to become salient. Thus, we aim to simulate the cognitive reconfiguration that underlies the insight moment in this canonical task.
  7.2. Modeling and Simulating Conceptual Dynamics
To formalize the conceptual landscape of the Candle Problem, we define a weighted semantic graph $G=(V,E,w)$, where each node $v_i\in V$ represents a concept relevant to the problem scenario (e.g., box, container, support, and shelf) and each edge $(i,j)\in E$ is assigned a positive weight $w_{ij}$ reflecting associative or semantic proximity. The adjacency matrix A is then provided by $A_{ij}=w_{ij}$ for connected nodes and $A_{ij}=0$ otherwise; the corresponding Laplacian is $L=D-A$, with $D=\mathrm{diag}(d_1,\dots,d_n)$ and $d_i=\sum_j A_{ij}$. This representation provides the substrate for the continuous-time quantum walk dynamics of Equations (1) and (2) and the classical diffusion baseline of Equation (17).
The illustrative graph contains approximately ten nodes, chosen to balance expressiveness and interpretability. Importantly, there is no direct edge between the initial concept (box) and the target solution concept (shelf), so any successful trajectory must pass through intermediate associations, such as platform, support, or stand. This topological constraint operationalizes the idea of insight as a non-local interference-driven transition.
Edge weights were assigned heuristically, following prior work on semantic networks and spreading-activation models [9,28]. For transparency, the exact weights are reported in Figure 1 (annotated edges) so that the reader can directly reconstruct the adjacency matrix used in the simulations. The network was implemented in Python with the NetworkX library, allowing reproducible analysis and visualization.
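To make the construction concrete, the sketch below builds an adjacency matrix and Laplacian from a weighted edge list. The node names follow the concepts mentioned in the text, but the edge list and every weight are hypothetical placeholders: the weights actually used are those annotated in Figure 1 and are not reproduced here. Plain NumPy is used instead of NetworkX to keep the example short.

```python
import numpy as np

# Hypothetical edges and weights, for illustration only (not the Figure 1 values).
edges = [
    ("box", "container", 0.9),
    ("box", "tacks", 0.8),
    ("box", "support", 0.4),
    ("container", "platform", 0.3),
    ("support", "platform", 0.6),
    ("support", "stand", 0.6),
    ("support", "table", 0.5),
    ("platform", "shelf", 0.7),
    ("stand", "shelf", 0.6),
    ("stand", "candle", 0.5),
    ("candle", "matches", 0.6),
    ("shelf", "wall", 0.5),
]

nodes = sorted({name for u, v, _ in edges for name in (u, v)})
index = {name: i for i, name in enumerate(nodes)}

n = len(nodes)
A = np.zeros((n, n))
for u, v, w in edges:
    A[index[u], index[v]] = w      # undirected graph: fill both entries
    A[index[v], index[u]] = w

D = np.diag(A.sum(axis=1))         # weighted degrees on the diagonal
L = D - A                          # combinatorial Laplacian

# The topological constraint from the text: no direct box-shelf edge.
has_direct_edge = A[index["box"], index["shelf"]] > 0
```

Any path from box to shelf in this placeholder graph passes through intermediate nodes such as support, platform, or stand, mirroring the constraint described above.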
To explore the dynamic behavior of conceptual activation within the semantic network, we implemented both classical and quantum walk simulations in Python. Quantum dynamics follow the continuous-time quantum walk of Equation (1), with readout probabilities provided by Equation (2), starting from the localized initial state $|\psi(0)\rangle=|\text{box}\rangle$ (Section 3). Probabilities $P_i^{\mathrm{QW}}(t_k)$ are sampled on a uniform time grid $\{t_k\}$ with horizon $T$ (spectral normalization; see Section 6).
The matched classical baseline is the continuous-time random walk governed by Equation (17), with $p(0)$ localized at box. This process lacks interference and yields diffusive spreading.
Both models were run in a reproducible Google Colab environment using NumPy 2.3, SciPy 1.16, NetworkX 3.5, and Matplotlib 3.10. For each $t_k$, we record the node-wise probabilities and visualize them as heatmaps.
Figure 2 shows the temporal evolution of the quantum walk probability distribution computed from Equations (1) and (2). The heatmap values correspond to the computed $P_i^{\mathrm{QW}}(t_k)$ for each node i and time step $t_k$. Starting from the localized state on box, the system spreads across the network; after sufficient evolution, the shelf node emerges with elevated probability, illustrating the potential of quantum dynamics to support non-local semantic activation relevant for insight.
To enable a direct comparison with the classical diffusion baseline, we report in Figure 3 the probability distributions obtained from Equations (1), (2), and (17) at four representative time instants: the initial state ($t=0$), two mid-trajectory snapshots (in normalized units), and the final horizon ($t=T$). For each instant, the node-wise probabilities $P_i^{\mathrm{QW}}(t)$ and $p_i^{\mathrm{CRW}}(t)$ are directly computed from the simulated dynamics and plotted side by side. These snapshots are sampled on the same time grid used for Figure 2 and correspond to the auto-normalized horizon $T$ described in Section 6. Each classical snapshot in Figure 3 is computed from Equation (17) on the same time grid used for the quantum evolution.
  
    
  
  
Figure 2. Heatmap of the quantum probability distribution over the semantic graph. The walker starts at the box node. Over time, amplitudes evolve by Equation (1), with readout via Equation (2). The shelf node shows a marked late-time increase, consistent with interference-driven amplification of remote but context-appropriate concepts.
  
 
 
These results substantiate the analytical conditions derived in Section 4: in the presence of multiple paths between source and target, the CTQW can transiently amplify the probability of reaching semantically remote but contextually appropriate nodes. In contrast, the CRW follows a diffusion pattern that distributes mass more evenly and lacks the constructive interference required for focused emergence. This difference is also reflected in the quantitative metrics reported in Section 8.3 (e.g., higher AUC and earlier peak times under CTQW). The sharper rise on shelf under CTQW corresponds to a higher peak probability and an earlier time-of-peak, as defined in Equation (12).
  
    
  
  
Figure 3. Side-by-side comparison of probability distributions obtained from Equations (1), (2), and (17) at the four sampled instants. At the first instant, the CTQW remains localized on the box node, while the CRW has already diffused to neighbors. At the second, the CTQW reaches semantically distant nodes with non-negligible probability on support, stand, and table. At the third, constructive interference amplifies activation on platform and shelf, whereas the CRW shows a flatter spread. At the final horizon, the CTQW exhibits a clear peak on the shelf node, consistent with the interference-driven amplification predicted by Proposition 2; the CRW maintains a uniform distribution.
  
 
 
  7.3. Interpretation and Implications
The simulation results offer a concrete instantiation of how quantum dynamics over a semantic network can enable the emergence of remote associations that are functionally relevant to insight problem solving. In the context of the Candle Problem, the observed convergence of probability on the shelf node, despite the absence of direct connections from the initial concept box, illustrates the capacity of quantum walks to navigate conceptual structures in a non-local and interference-based manner.
  8. Extended Experimental Results
We complement illustrative toy examples with three task families widely used in the insight and creativity literature: the Candle Problem, Remote Associates Test (RAT) triads, and Alternative Uses (AUs). In all cases, the semantic substrate is obtained from ConceptNet 5.7 (English) via k-hop neighborhoods around task-specific seeds, followed by weight symmetrization, normalization (row-stochastic by default), and the removal of self-loops (see Section 3). A repository with the full dataset and Python scripts is available in [29], while basic graph statistics for the resulting subgraphs are reported in Table 1.
RAT tasks probe the ability to connect three cue words via a single associative bridge (e.g., cottage–swiss–cake → cheese). We use a set of 20 English triads (see repository seeds) and, for each triad, derive a dedicated subgraph by taking the union of k-hop neighborhoods around the three cues. The solution word (target) is included only for evaluation, not as a seed. RAT triads are well suited to testing scalability beyond a single illustrative case, since they require converging on a specific remote concept while navigating a moderately sized semantic region. They also support paired statistical comparisons (CTQW vs. diffusion) across multiple independent instances (Section 5).
AU tasks evoke divergent thinking by prompting many uses for a common object (e.g., brick, paperclip, newspaper, rope, shoe, bottle, cup, spoon, fork, pencil, book, and towel). Here, we treat the object name(s) as seeds and extract a broader semantic neighborhood capturing materials, functions, locations, and affordances. While AUs are not single-target problems, they provide a stress test for whether quantum dynamics preferentially amplify non-obvious “remote” regions of the graph. In analyses, we therefore report aggregate target sets or exemplar remote nodes and summarize level/latency metrics accordingly (Section 5).
For all families, we rely on ConceptNet relations from a standard whitelist (e.g., RelatedTo, IsA, UsedFor, PartOf, CapableOf, AtLocation, HasA, Causes, and Synonym); weights are symmetrized (max) when edges exist in both directions and then normalized. The exact seeds, scripts, and per-graph summaries are included in the repository for full reproducibility. ConceptNet data are used under CC BY–SA; we provide attribution and links in the repository README.
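The weight preprocessing described above (max-symmetrization, self-loop removal, row-stochastic normalization) can be sketched on a dense weight matrix as follows. The function name `preprocess_weights` and the dense-matrix input are assumptions made for illustration; the repository scripts may organize these steps differently.

```python
import numpy as np


def preprocess_weights(W):
    """Symmetrize by max, drop self-loops, and row-normalize a weight matrix.

    Returns (S, P): the symmetric weight matrix and its row-stochastic version.
    """
    W = np.asarray(W, dtype=float)
    S = np.maximum(W, W.T)            # keep the larger weight of the two directions
    np.fill_diagonal(S, 0.0)          # remove self-loops
    row_sums = S.sum(axis=1, keepdims=True)
    # Isolated nodes keep an all-zero row instead of triggering a division by zero.
    P = np.divide(S, row_sums, out=np.zeros_like(S), where=row_sums > 0)
    return S, P
```

Note that `P` is generally not symmetric even though `S` is, so a symmetric generator for the quantum walk would have to be built from `S` rather than from `P`.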
Across all subgraphs, we verify that graphs are non-empty, densities fall within valid bounds, the largest connected component typically covers the majority of nodes, and average degree exceeds 1; detailed counts are provided in Table 1. These checks ensure that subsequent CTQW vs. diffusion comparisons operate on meaningful semantic neighborhoods rather than on degenerate or overly fragmented graphs.
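The checks listed above map onto a few NetworkX calls. The sketch below is one possible way to compute them for an undirected graph; the function name `graph_sanity` and the returned keys are hypothetical and are not taken from the repository.

```python
import networkx as nx


def graph_sanity(G):
    """Basic statistics for an undirected subgraph before running any walk."""
    n = G.number_of_nodes()
    if n == 0:
        raise ValueError("empty graph")
    m = G.number_of_edges()
    largest_cc = max(nx.connected_components(G), key=len)
    return {
        "nodes": n,
        "edges": m,
        "density": nx.density(G),
        "lcc_fraction": len(largest_cc) / n,   # share of nodes in the largest component
        "avg_degree": 2 * m / n,
    }
```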
  8.1. Experimental Setup
We evaluate continuous-time quantum walks (CTQWs) against classical diffusion (CRW) on the ConceptNet-derived k-hop subgraphs described above. Notation and metrics follow Section 3 and Section 5. Unless otherwise noted, CTQW and CRW are run on the same graphs, with the same initial states and time grid.
Our main results use graph-normalized operators to make scales comparable across instances: (i) the normalized adjacency $D^{-1/2} A D^{-1/2}$ (normA); and (ii) the normalized Laplacian $I - D^{-1/2} A D^{-1/2}$ scaled by a small rate parameter (normL). In sensitivity checks, we also report the unnormalized choices $A$ and $L = D - A$ (extended tables in [29]). For normL, we sweep the rate parameter over a few small values.
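For concreteness, the four generator choices can be assembled from a symmetric adjacency matrix as in the sketch below. The function name `build_generator`, the string labels other than normA and normL, and the default `rate=0.1` are illustrative assumptions; the text does not specify the rate values used.

```python
import numpy as np


def build_generator(A, kind="normA", rate=0.1):
    """Return a symmetric generator built from adjacency matrix A.

    kind: "A" (adjacency), "L" (Laplacian D - A),
          "normA" (D^-1/2 A D^-1/2), "normL" (rate * normalized Laplacian).
    """
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    if kind == "A":
        return A
    if kind == "L":
        return np.diag(deg) - A

    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5          # isolated nodes stay at zero
    A_norm = inv_sqrt[:, None] * A * inv_sqrt[None, :]
    if kind == "normA":
        return A_norm
    if kind == "normL":
        # Identity restricted to non-isolated nodes, minus the normalized adjacency.
        return rate * (np.diag((deg > 0).astype(float)) - A_norm)
    raise ValueError(f"unknown generator kind: {kind}")
```

With this construction the spectrum of normA lies in [-1, 1] and that of normL in [0, 2 * rate], which is what makes a shared time normalization across graphs meaningful.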
For the Candle Problem, we initialize a localized state on the source concept (e.g., box) and use the standard solution node as target (e.g., shelf or an available variant). For RAT triads, we model simultaneous cue presentation by the equal-phase superposition
$$|\psi(0)\rangle = \frac{1}{\sqrt{3}}\left(|c_1\rangle + |c_2\rangle + |c_3\rangle\right),$$
where $c_1$, $c_2$, and $c_3$ denote the three cue nodes, with the triad’s solution as target. For AUs (no single target), we initialize on the object name (e.g., brick) and evaluate exploration metrics rather than target-centric ones.
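An equal-phase superposition over the cue nodes amounts to placing the same real amplitude on each cue and normalizing to unit length. The helper below is a minimal sketch; the name `cue_superposition` and the list-of-names node indexing are assumptions for illustration.

```python
import numpy as np


def cue_superposition(nodes, cues):
    """Unit-norm state with equal, in-phase amplitude on every cue node."""
    index = {name: i for i, name in enumerate(nodes)}
    psi0 = np.zeros(len(nodes), dtype=complex)
    for cue in cues:
        psi0[index[cue]] = 1.0          # same phase on every cue
    return psi0 / np.linalg.norm(psi0)
```

For three distinct cues each amplitude equals 1/sqrt(3), and the solution node starts with zero amplitude because it is never used as a seed.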
To avoid arbitrary scale effects, the horizon is set from the operator’s spectral radius (autoT), with a fixed scaling constant. Probability trajectories are sampled on a uniform time grid over this horizon, with the number of grid points depending on the experiment. CTQW evolution uses the action of the matrix exponential via expm_multiply applied to the initial state; CRW uses the heat-kernel action computed analogously. We monitor unit-norm preservation for CTQW within a small numerical tolerance.
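One way to realize a spectrally normalized horizon with norm monitoring is sketched below. The specific rule `T = c / rho(H)`, the constant `c`, the grid size `K`, and the tolerance `tol` are assumptions introduced here for illustration, as are the function names; only the use of `expm_multiply` and the norm check come from the description above.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import expm_multiply


def auto_horizon(H, c):
    """Assumed autoT rule: horizon inversely proportional to the spectral radius."""
    rho = np.max(np.abs(np.linalg.eigvalsh(H)))   # H is assumed symmetric
    return c / rho


def ctqw_trajectory(H, psi0, T, K=50, tol=1e-9):
    """Sample CTQW probabilities on a uniform grid of K + 1 points in [0, T]."""
    Hs = csr_matrix(H, dtype=complex)
    times = np.linspace(0.0, T, K + 1)
    probs = np.empty((K + 1, len(psi0)))
    for k, t in enumerate(times):
        psi_t = expm_multiply(Hs * (-1j * float(t)), psi0)
        # Unitary evolution must preserve the norm; flag numerical drift.
        if abs(np.linalg.norm(psi_t) - 1.0) > tol:
            raise FloatingPointError(f"norm drift at t={t}")
        probs[k] = np.abs(psi_t) ** 2
    return times, probs
```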
For Candle and RAT (single-solution tasks), we report the target-centric metrics of Section 5: AUC on the solution node, peak probability, and time-of-peak; we also track time-to-threshold (Equation (5)) at fixed thresholds and first-hitting time with discrete readout. Paired differences (QW−CRW) are formed per instance (per triad for RAT), and nonparametric Wilcoxon signed-rank tests and BCa bootstrap CIs are computed across instances when sample sizes permit. AUC values in Equation (11) are evaluated on the sampled grid by Riemann sums over the auto-normalized horizon (Equation (4)).
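The target-centric metrics can be read off a sampled probability trajectory as in the following sketch. The left-endpoint Riemann sum, the default threshold of 0.1, and the function name `target_metrics` are assumptions; the text specifies only that AUC is computed by Riemann sums on the sampled grid.

```python
import numpy as np


def target_metrics(times, p_target, threshold=0.1):
    """AUC, peak, time-of-peak, and time-to-threshold for one target trajectory."""
    times = np.asarray(times, dtype=float)
    p = np.asarray(p_target, dtype=float)

    auc = float(np.sum(p[:-1] * np.diff(times)))    # left Riemann sum
    k_peak = int(np.argmax(p))                      # first occurrence of the maximum
    above = np.flatnonzero(p >= threshold)
    t_threshold = float(times[above[0]]) if above.size else float("inf")
    return {
        "auc": auc,
        "peak": float(p[k_peak]),
        "t_peak": float(times[k_peak]),
        "t_threshold": t_threshold,
    }
```

Returning infinity when the threshold is never reached keeps such instances in the paired comparison while making explicit why thresholded metrics become uninformative for low-probability targets.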
For AUs (open-ended), we use exploration metrics defined in Section 5: mean entropy over time, coverage (number of nodes that ever exceed a small activation threshold), expected radius (mean shortest-path distance from the source, in hops), and time to a “deep layer” (first time the expected radius reaches a graph-dependent threshold, e.g., a high distance percentile). As in the target-centric case, we aggregate paired differences (QW−CRW) across sources and use Wilcoxon tests where applicable.
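The exploration metrics can be computed from a trajectory matrix whose rows are node distributions at successive times. In the sketch below, the activation threshold `eps`, the 90th-percentile depth rule `depth_pct`, and the function name `exploration_metrics` are illustrative assumptions rather than the values used in the experiments.

```python
import networkx as nx
import numpy as np


def exploration_metrics(G, nodes, source, times, probs, eps=1e-3, depth_pct=90):
    """Exploration summaries for a probability trajectory probs[k, i]."""
    P = np.asarray(probs, dtype=float)
    times = np.asarray(times, dtype=float)

    # Hop distances from the source; unreachable nodes contribute zero radius.
    hops = nx.single_source_shortest_path_length(G, source)
    reachable = np.array([n in hops for n in nodes])
    dist = np.array([hops.get(n, 0) for n in nodes], dtype=float)

    # Shannon entropy per time step, treating 0 * log(0) as 0.
    entropy = -(P * np.log(np.where(P > 0, P, 1.0))).sum(axis=1)
    coverage = int((P.max(axis=0) > eps).sum())     # nodes ever above eps
    radius = P @ dist                               # expected hop distance per step

    depth = np.percentile(dist[reachable], depth_pct)
    hit = np.flatnonzero(radius >= depth)
    return {
        "mean_entropy": float(entropy.mean()),
        "coverage": coverage,
        "mean_radius": float(radius.mean()),
        "t_deep": float(times[hit[0]]) if hit.size else float("inf"),
    }
```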
  8.2. Operator Ablation: Adjacency vs. Laplacian
We compare quantum generators that are standard in graph-based CTQW: (i) the (symmetric) normalized adjacency $D^{-1/2} A D^{-1/2}$ and (ii) the normalized Laplacian $I - D^{-1/2} A D^{-1/2}$ (scaled by a small rate parameter). All ablations use identical graphs and initial states and a shared spectral time normalization (autoT) to remove arbitrary scale effects; small sweeps of the rate parameter are included for Laplacian forms.
Since $L = D - A$, Laplacian-based CTQW differs from adjacency-based CTQW by an on-site diagonal term $D$. In unitary dynamics, diagonal entries act as local phase rates (“site energies”): nodes with larger degree accumulate phase faster, which can dephase interference at hubs and modulate path combination. By contrast, adjacency-based CTQW has no such degree-dependent on-site term and tends to privilege path multiplicity more directly. Normalization ($D^{-1/2}(\cdot)D^{-1/2}$) reduces degree heterogeneity and bounds the spectrum, allowing fairer cross-graph comparison.
Adjacency-based dynamics (normA) emphasize direct associative transitions and multi-path reinforcement, aligning with a view of insight as rapid re-combination of strongly cued associations. Laplacian-based dynamics (normL) impose a degree-sensitive “cost of connectivity,” potentially tempering hub dominance and favoring selective routes. Both are plausible: we therefore let data arbitrate under matched conditions.
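The role of the degree term can be checked numerically: on a regular graph the diagonal of the Laplacian is a constant and contributes only a global phase, so adjacency- and Laplacian-based walks give identical readout probabilities from a localized start, whereas on an irregular graph they differ. The sketch below uses unnormalized operators on two standard NetworkX test graphs; the function names and the evaluation time `t=1.3` are arbitrary illustrative choices.

```python
import networkx as nx
import numpy as np


def ctqw_probs(H, source, t):
    """Readout probabilities |exp(-i H t) e_source|^2 via eigendecomposition."""
    evals, evecs = np.linalg.eigh(H)
    psi0 = np.zeros(H.shape[0])
    psi0[source] = 1.0
    psi_t = evecs @ (np.exp(-1j * evals * t) * (evecs.T @ psi0))
    return np.abs(psi_t) ** 2


def adjacency_laplacian_gap(G, source=0, t=1.3):
    """Largest node-wise probability difference between A- and L-driven walks."""
    A = nx.to_numpy_array(G)
    L = np.diag(A.sum(axis=1)) - A
    return float(np.max(np.abs(ctqw_probs(A, source, t) - ctqw_probs(L, source, t))))
```

Calling `adjacency_laplacian_gap(nx.cycle_graph(6))` should return a value at numerical-noise level, while `adjacency_laplacian_gap(nx.star_graph(4))` (walker starting on the hub) should return a clearly nonzero gap.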
For single-solution tasks (Candle and RAT), we compute AUC on the target, peak probability, and time-of-peak. For AUs, we use mean entropy, coverage, expected radius, and time to a deep layer (Section 5). We report paired differences per instance and Wilcoxon signed-rank tests, together with the percentage of instances in which QW outperforms CRW (%QW-better).
  8.3. Results
The compact summaries used in the main text (Table 2, Table 3 and Table 4) display medians of paired differences (QW−CRW), interquartile ranges, and, when applicable, Wilcoxon p-values and the percentage of instances in which QW outperforms CRW (with “lower is better” for time metrics). Extended results, i.e., operator sweeps, additional thresholds, full per-instance tables, and confidence intervals, are provided in [29].
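A paired summary of this kind can be assembled with NumPy and `scipy.stats.wilcoxon`. The sketch below is an assumption about how such a table row might be computed, with a hypothetical function name and output keys; it omits the bootstrap confidence intervals reported in the extended tables.

```python
import numpy as np
from scipy.stats import wilcoxon


def paired_summary(qw, crw, lower_is_better=False):
    """Median, IQR, Wilcoxon p-value, and %QW-better for paired metric values."""
    d = np.asarray(qw, dtype=float) - np.asarray(crw, dtype=float)   # QW - CRW
    q1, median, q3 = np.percentile(d, [25, 50, 75])
    # For time metrics a negative difference means QW is better.
    better = d < 0 if lower_is_better else d > 0
    test = wilcoxon(d)                  # two-sided signed-rank test on the differences
    return {
        "median": float(median),
        "iqr": (float(q1), float(q3)),
        "p_value": float(test.pvalue),
        "pct_qw_better": 100.0 * float(better.mean()),
    }
```

Passing `lower_is_better=True` for time-of-peak or time-to-depth keeps the "%QW-better" column comparable across level and latency metrics.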
For the Candle dataset, on the ConceptNet-derived subgraph, CTQW shows a transient amplification on the solution node relative to CRW: a higher area under the curve (AUC) and a higher probability peak reached earlier. These effects are summarized in the compact table (Table 2), which reports the median differences (QW−CRW) for AUC, peak, and time-of-peak. As this is a single-graph demonstration, we use Candle chiefly as a qualitative exemplar; extended numeric diagnostics (including full time series) are provided in [29]. The differences in AUC (Equation (11)) indicate that CTQW concentrates more probability on the solution node than CRW. The observed concentration patterns are consistent with the spectral decomposition in Equation (6), which predicts stable long-term distributions tied to eigenstructure, while the observed amplification of the shelf node is consistent with the interference bound derived in Equation (7). Median paired differences in peak probability and time-of-peak (Equation (12)) likewise favor CTQW, indicating stronger and earlier concentration on the solution node.
For the RAT dataset, we computed paired differences between CTQW and CRW on target-centric metrics (AUC, peak, and time-of-peak) for each triad. The compact summary (Table 3) reports the median paired difference (QW−CRW) across the 20 triads (and Wilcoxon p-values when applicable). The pattern is consistent with the interference-based account: medians are positive for AUC (Equation (11)) and peak probability (CTQW concentrates more mass and attains a higher maximum on the correct target) and negative for time-of-peak (CTQW reaches that peak earlier), with peak and time-of-peak as defined in Equation (12). Across the 20 triads, paired AUC differences are consistently positive, supporting stronger target accumulation for CTQW. Threshold-based and first-hitting times are less sensitive in cases where targets remain below fixed thresholds; we therefore include their full details in [29] and do not emphasize them in the main text.
For the AU dataset, which has no unique target, we evaluate exploration on ConceptNet-derived AU graphs across multiple sources (e.g., brick, paperclip, newspaper, rope, and shoe). The compact summary (Table 4) aggregates the median differences (QW−CRW) for four interpretable measures: mean entropy (diversity of the explored distribution), coverage (number of nodes ever exceeding a small activation threshold), expected radius (mean topological distance from the source), and time to a deep layer (first time the expected radius reaches a graph-dependent threshold). The medians are positive for entropy/coverage/radius and negative for time-to-depth, indicating that CTQW explores more broadly and reaches distant regions faster. Per-source breakdowns and additional metrics (e.g., entropy peak and radius peak) are reported in [29]. For divergent tasks, the metric of Equation (13) captures the onset of activation across sets of remote nodes.
All comparisons use identical graphs, identical initial conditions (source node, or superposition of cues for RAT), and identical temporal discretization. The qualitative pattern does not depend on the specific generator (normalized adjacency, normA; normalized Laplacian, normL at a small rate parameter), and the spectral time normalization (autoT) mitigates spurious scale effects. The location and sign of the observed advantages match the conditions for the transient interference-driven amplification presented in our analysis (Section 4).
Taken together, the three datasets paint a coherent picture. On single-solution tasks (Candle and RAT), CTQW concentrates more probability on the correct target and reaches stronger peaks earlier than CRW (Table 2 and Table 3). On open-ended tasks (AUs), CTQW explores broader and deeper regions of the semantic graph within the same time budget (Table 4). Extended tables with per-triad and per-source results, as well as additional metrics and confidence intervals, are provided in [29].
Regarding the ablation results, under identical initial states and spectral time normalization, CTQWs with adjacency-based and Laplacian-based generators yield qualitatively similar advantages over CRW on single-solution tasks, with effect sizes that vary across datasets. Where differences arise, the adjacency-based generator tends to accentuate multi-path reinforcement (higher peaks), while the Laplacian-based generator can produce slightly earlier or more selective amplification when high-degree hubs would otherwise dominate. On AUs, both generators broaden exploration relative to CRW; the adjacency-based generator often shows larger coverage/radius, whereas the Laplacian-based generator reduces hub bias. These patterns are consistent with the diagonal degree term present in Laplacian-based dynamics (Section 8.2).
  9. Discussion and Outlook: Quantum Paths to Creativity
The results presented here support accounts of insight as representational restructuring, an abrupt reorganization that makes previously remote solution paths salient. CRW typically yields gradual locality-driven spread in conceptual space. In contrast, our CTQW formalism introduces superposition and interference between semantic trajectories, enabling transient concentration on semantically distant context-appropriate ideas without sequential traversal of all the intermediates.
Across three families of tasks, the empirical patterns match the qualitative predictions of our analysis (Section 4). On single-solution problems, CTQW concentrates more probability on the correct node and does so earlier than CRW. In the illustrative Candle graph, the compact summary (Table 2) shows CTQW exceeding CRW on AUC and peak probability and achieving an earlier time-of-peak. More importantly, on RAT triads (20 instances), the paired deltas (QW−CRW) in Table 3 have positive medians for AUC and peak and negative medians for time-of-peak, indicating stronger and earlier target activation under CTQW. These effects are precisely the kind of transient amplification predicted by our sufficient conditions (Proposition 2). Threshold-based and first-hitting metrics are less diagnostic when targets remain below strict thresholds; detailed values for those measures are therefore reported in [29] rather than emphasized here.
On open-ended AUs, where no single target exists, CTQW exhibits broader and deeper exploration over the same time budget. The summary in Table 4 shows positive median deltas for mean entropy, coverage, and expected radius, and negative deltas for time to a deep layer, indicating that CTQW spreads activation more widely and reaches distant regions faster than CRW. Together, these findings link the formal notion of interference-enabled amplification to observable advantages in both target-centric (Candle and RAT) and AU settings.
All the CTQW–CRW comparisons were performed on identical graphs with identical initial states and matched time grids (Section 8.1). We also normalized horizons via the operator spectral radius (autoT), reducing spurious scale differences across instances. Hence, the observed gains are not attributable to graph size or trivial timing choices; rather, they emerge when, per our analysis, the topology supports constructive phase alignment along multiple paths.
Several limitations should be noted. First, Candle is a single-graph demonstration; we therefore treat it as qualitative evidence, with quantitative strength coming from RAT (paired across 20 triads) and aggregated AU sources. Second, thresholded metrics (time-to-threshold; first-hitting) can underreport quantum advantages when activation remains below preset levels; extended tables in [29] document these cases. Third, ConceptNet-derived subgraphs inherit biases from the source resource (edge types; coverage). Future work should replicate these analyses on additional semantic networks and psycholinguistic association norms. Finally, while we established robustness across generators (normalized adjacency and normalized Laplacian with a small rate parameter), effect sizes can vary with operator choice and local spectral structure, an expected consequence of the phase geometry emphasized in our propositions.
The consistent advantages of CTQW on AUC/peak/timing (single-solution tasks) and exploration breadth/depth (AUs) suggest a common mechanism: interference distributes, and then selectively re-concentrates, probability mass in a way that diffusion alone cannot. This yields two testable predictions for human cognition: (i) when restructuring succeeds, activation of the correct solution should exhibit sharper and earlier peaks relative to closely matched control conditions; and (ii) in open-ended ideation, the early-time diversity and reach of candidate ideas should be higher under contexts that encourage superposed (rather than serial) processing modes.
Immediate avenues for future work include (i) extending the analyses to empirically curated networks (e.g., word-association norms) to improve ecological validity; (ii) hybrid classical–quantum models, where diffusion sets a baseline and interference modulates context-dependent leaps; (iii) measurement protocols (readout frequency/strength) and open-system variants to emulate attentional collapse and noise; and (iv) behavioral studies designed to track time-resolved activation patterns against our model’s predictions. On the computational side, our formulation offers a clear path to carefully controlled ablations (path multiplicity; spectral overlaps) that can further connect the theorems to the effect sizes observed in Table 2, Table 3 and Table 4.