A Noise-Based CMOS Probabilistic Bit for Combinatorial Optimization Problems

Jeon, Jinwoo; Lim, Chaegang

doi:10.3390/electronics15122510

by Jinwoo Jeon ¹ and Chaegang Lim ²,*

¹ Department of Semiconductor Systems Engineering, Korea University, Seoul 02841, Republic of Korea

² Division of Semiconductor and Electronic Engineering, Hankuk University of Foreign Studies, Yongin 17035, Republic of Korea

* Author to whom correspondence should be addressed.

Electronics 2026, 15(12), 2510; https://doi.org/10.3390/electronics15122510

Submission received: 19 May 2026 / Revised: 4 June 2026 / Accepted: 5 June 2026 / Published: 7 June 2026

(This article belongs to the Section Circuit and Signal Processing)

Abstract

Combinatorial optimization problems (COPs) are challenging for conventional computers because their solution spaces grow exponentially. To reduce exhaustive-search burden, hardware approaches have explored stochastic traversal of energy landscapes, including quantum annealers, CMOS Ising solvers, and probabilistic computing systems. However, quantum annealers require cryogenic operation, while CMOS Ising solvers typically rely on pseudorandom bitstreams or shared random pulses. A CMOS-compatible probabilistic bit with a physical random source is attractive for scalable optimization hardware. We present a CMOS p-bit that generates stochastic states from transistor device noise. The p-bit combines a transistor-noise random source, a correlated double sampling circuit, a calibrated comparator, and a 5-bit probability controller to convert local-field inputs into digitally tunable output probabilities. Because the random source is local to each p-bit and does not require PRNG state or seed assignment, the local random-source circuit in each p-bit does not need to grow larger as the number of p-bits increases, while system-level scaling is still governed by the p-bit count, weighted-sum logic, and interconnects. Prototype p-bit chips fabricated in a 180 nm CMOS process show 32-level output-probability control, pass the NIST Statistical Test Suite, and achieve 50 MHz updates with 6.95 pJ/bit at 50% output probability under a 1.8 V supply. Interfaced with FPGA-based weighted-sum logic, the prototype probabilistic circuit demonstrates invertible Boolean operation using a clamped gate network and performs integer factorization.

Keywords:

CMOS; combinatorial optimization problem (COP); probabilistic computing; probabilistic bit; invertible Boolean operation; integer factorization; Ising model

1. Introduction

Combinatorial optimization problems (COPs) are gaining significant interest because they are tightly coupled with real-world applications such as RSA encryption [1], RTL design [2], and logistics [3]. These problems aim to find an optimal solution from a solution space, where each solution can be evaluated with a cost function. Most COPs are classified as NP-hard (nondeterministic polynomial-time hard), and classical computers cannot efficiently search their exponentially expanding discrete solution spaces. In the absence of algorithmic breakthroughs, classical computers have improved performance by adopting the latest processors, yet they are reaching their limits as transistor density growth has slowed below the pace of Moore’s law [4]. Consequently, obtaining solutions for massive COPs with classical computing presents significant challenges.

In 1983, S. Kirkpatrick et al. introduced a statistical-mechanics-based approach called simulated annealing [5]. This method employs a collection of spins and their interaction network to represent COP instances using the Ising model. Simulated annealing aims to find approximate solutions by searching for the ground states in the solution space, which is achieved by progressively lowering the effective temperature of the system and applying the Metropolis algorithm [6]. Like the Ising model in statistical mechanics, simulated annealing is an energy-based method. The energy landscape of the target COP is configured with the Ising Hamiltonian, and spin states yielding the lowest energy become the correct solution.

Beyond its algorithmic formulation, the annealing principle has driven a range of dedicated hardware. The most prominent quantum realization is D-Wave’s commercial annealer, which encodes spins in superconducting flux qubits [7]. Although it has proven effective on real-world optimization tasks, the platform must be held at cryogenic temperatures, and the associated cooling apparatus imposes substantial energy and footprint penalties. Quantum error and decoherence further constrain how far such systems can scale.

In contrast, CMOS-based Ising hardware offers the distinct advantage of room-temperature operability [8,9,10,11,12,13,14,15]. SRAM-based configurations have successfully implemented the simulated annealing algorithm at the transistor level [8,9]. These configurations inherently require SRAM read operations to access spin states. Certain designs also utilize dual external random pulses distributed across the spin array to modulate spin states, targeting escape from local minima [9]. This approach, however, may compromise the solution quality because of correlated spin flips and mark-space ratio deviation caused by the strength difference in pMOS and nMOS of inverter buffers. Also, the operating frequency of the system is restricted due to chip-level propagation delays. Register-based configurations have mitigated the overhead of SRAM read and write operations using D-flip-flops for storing spin states [10,11]. While this architectural choice improves operational efficiency, the challenges related to the use of shared external random pulses still need to be addressed, affecting both solution quality and operational frequency. More recent CMOS Ising hardware has also improved system scale, coefficient flexibility, density, and connectivity through multi-chip annealing systems, flexible spin processing elements, SRAM-based in-memory Ising macros, and all-to-all coupled-oscillator Ising chips [12,13,14,15].

Most Ising solvers, however, are tuned for COPs with a single optimal answer, as in Figure 1a. Lowering the effective temperature during annealing naturally drives the spins toward one configuration, which is why their use is largely confined to problems such as max-cut. Many important COPs admit several energy-equivalent solutions instead. Boolean satisfiability (SAT) is a representative example, and its landscape is sketched in Figure 1b. Handling such problems calls for a scheme that can sample multiple valid configurations rather than collapse onto a single one.

Probabilistic computing answers this requirement directly. It pivots on the probabilistic bit (p-bit), first introduced by Camsari et al. [16]. A p-bit is a bipolar primitive whose state switches at random, and Figure 2 places it conceptually between the deterministic classical bit and the superposed qubit. What distinguishes it from a classical bit is that the input strength biases this random switching, so the system can deliberately perturb p-bit states to climb out of suboptimal configurations and keep traversing the COP energy landscape. Crucially, unlike a qubit, it sustains this stochastic behavior at ordinary room temperature, removing the cooling overhead that quantum hardware imposes.

Reported p-bit hardware has been explored through several implementation routes. The first uses nanomagnetic devices: near-zero-barrier magnetic tunnel junctions (MTJs) act as compact, energy-efficient computing units whose intrinsic thermal fluctuations serve as the randomness source [17]. A recent on-chip stochastic MTJ/2D-MoS2 p-bit core demonstrated voltage-controllable stochasticity by interconnecting a stochastic MTJ with a 2D-MoS2 FET in a 1T-1MTJ configuration [18]. Their attractive density, however, is offset by the immature large-scale fabrication of such unstable junctions. The second category is fully CMOS and instead synthesizes randomness with pseudorandom number generators (PRNGs) [19,20]. A representative design distributes a single 64-bit xorshift+ stream across every p-bit [19]. Because all bitstreams originate from one generator, inter-p-bit correlation becomes a concern. Allocating a private 32-bit LFSR to each p-bit on an FPGA [20] removes this shared-source coupling, but in turn raises two scaling burdens: guaranteeing distinct seeds across many generators and lengthening each LFSR as the problem grows [21]. Beyond MTJ- and PRNG-based implementations, recent physical p-bit demonstrations have also explored fully CMOS-compatible biristor p-bits and NbOx self-oscillatory p-bits as alternative stochastic devices [22,23]. These developments indicate continuing progress in physical random-source p-bits, but scalable CMOS implementation still benefits from compact local randomness that avoids shared PRNG state, seed assignment, and global random-distribution overhead.

Here, we present a CMOS p-bit for probabilistic computing applications. To address the challenges of random seed uniqueness and the scaling overhead inherent in PRNG-based systems during expansion, our CMOS p-bit leverages transistor device noise to generate the required randomness for probabilistic outputs. Because each p-bit derives randomness from its own device noise, the local random-source circuit does not require shared PRNG state or seed assignment, and its per-p-bit area does not increase as the probabilistic circuit (p-circuit) is scaled to more p-bits. The total p-circuit area, however, still scales with the number of p-bits, coefficient storage, weighted-sum logic, update control, and interconnects. The time-averaged output of the proposed p-bit, expressed as a percentage (representing the occurrences of logical “1” at the output), is controlled via a 5-bit digital input. A collection of these p-bits is connected bidirectionally, forming a p-circuit capable of solving various COPs. We configured a prototype p-circuit with chips fabricated using 180 nm CMOS technology and an FPGA to solve COPs involving both single-solution and multiple-solution scenarios.

The remainder of this article proceeds as follows. Section 2 reviews the Ising-solver background, the probabilistic computing scheme, and the COP encoding used in this study. The proposed p-bit circuit and its p-circuit integration are described in Section 3. Section 4 reports the measurement results, and the work is summarized in Section 5.

2. Background

2.1. Ising Solver

The Ising model offers a mathematical description of how magnetic spins behave collectively. A representative two-dimensional lattice configuration appears in Figure 3a, where every spin takes one of two discrete states: up (+1) or down (−1). Each pair of spins is joined by a bidirectional weight that sets their coupling strength. A positive weight favors parallel alignment so that the connected spins prefer {+1, +1} or {−1, −1}, whereas a negative weight pushes them toward the antiparallel pairs {+1, −1} and {−1, +1}. An external magnetic field additionally tilts each spin individually. Under these interactions, the total system energy is described by the Ising Hamiltonian:

H = - \sum_{i, j} J_{i j} σ_{i} σ_{j} - \sum_{j} h_{j} σ_{j} .

(1)

In Equation (1), H denotes the Ising Hamiltonian that represents the total system energy, J_ij is the interaction coefficient, σ_i is the bipolar spin state, and h_j is the external field acting on spin j. Statistical mechanics treats the spin assignment that minimizes this energy as the physically meaningful one. Prior studies have shown that a wide class of COPs can be reduced to such an Ising ground-state problem in polynomial time [11,24]. An Ising solver therefore tackles a COP by searching its solution space for the ground states defined by the Ising Hamiltonian [25].
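As a minimal sketch of Equation (1), the following Python snippet scores a spin configuration and finds the ground states of a tiny instance by exhaustive search. The three-spin couplings, the zero field, and the function names are hypothetical; J is stored upper-triangular here so that each pair is counted once in the double sum.

```python
import itertools

def ising_energy(J, h, sigma):
    """Equation (1): H = -sum_{i,j} J_ij s_i s_j - sum_j h_j s_j."""
    n = len(sigma)
    pair = sum(J[i][j] * sigma[i] * sigma[j] for i in range(n) for j in range(n))
    field = sum(h[j] * sigma[j] for j in range(n))
    return -pair - field

def ground_states(J, h):
    """Exhaustively score all 2^n configurations (only viable for tiny n)."""
    n = len(h)
    energies = {s: ising_energy(J, h, s)
                for s in itertools.product((-1, 1), repeat=n)}
    e_min = min(energies.values())
    return e_min, sorted(s for s, e in energies.items() if e == e_min)

# Hypothetical instance: positive J_01 favors parallel spins 0 and 1,
# negative J_12 favors antiparallel spins 1 and 2.
J = [[0, 1, 0],
     [0, 0, -1],
     [0, 0, 0]]
h = [0, 0, 0]
e_min, states = ground_states(J, h)
```

With no external field the two ground states are global spin flips of each other, which matches the sign rules for positive and negative weights described above.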

Figure 3. (a) Spin configuration of an Ising model on a two-dimensional lattice graph. (b) Visualization of an Ising model escaping local minima. The green, blue, and yellow markers denote the initial state, intermediate spin-update states, and the ground state, respectively. The arrows indicate a trajectory in which the system escapes a local minimum through an energy-increasing update and then relaxes toward the ground state.

Figure 3b depicts the same process from an energy standpoint. The spins, serving as bipolar variables, span the solution space, and their energy is scored through Equation (1) as the solver adjusts the states toward lower energy. Because the landscape is riddled with local minima that correspond to suboptimal answers, the solver must occasionally force spin flips that raise the energy on purpose. Such deliberate uphill moves let the system leave a local minimum and eventually relax into a configuration of lower energy.
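One common software realization of such uphill moves is the Metropolis acceptance rule mentioned in the introduction. The sketch below is illustrative only: the geometric cooling schedule, the step count, the seed, and the four-spin instance are assumptions and do not describe any of the cited hardware.

```python
import math
import random

def energy(J, h, s):
    """Ising energy with each pair (i < j) counted once."""
    n = len(s)
    pair = sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(i + 1, n))
    return -pair - sum(h[i] * s[i] for i in range(n))

def accept_probability(delta_e, temperature):
    """Metropolis rule: always accept downhill, sometimes accept uphill."""
    if delta_e <= 0:
        return 1.0
    return math.exp(-delta_e / temperature)

def anneal(J, h, steps=2000, t_start=2.0, t_end=0.05, seed=0):
    rng = random.Random(seed)
    n = len(h)
    s = [rng.choice((-1, 1)) for _ in range(n)]
    e = energy(J, h, s)
    for k in range(steps):
        # Assumed geometric schedule from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (k / (steps - 1))
        i = rng.randrange(n)
        s[i] = -s[i]                      # propose a single spin flip
        e_new = energy(J, h, s)
        if rng.random() < accept_probability(e_new - e, t):
            e = e_new                     # keep the flip
        else:
            s[i] = -s[i]                  # reject: undo the flip
    return s, e

# Hypothetical 4-spin instance.
J = [[0, 1, -1, 0],
     [0, 0, 1, 1],
     [0, 0, 0, -1],
     [0, 0, 0, 0]]
h = [0.5, 0, 0, -0.5]
final_state, final_energy = anneal(J, h)
```

At high temperature the acceptance probability for an uphill move is close to one, so the search wanders freely; as the temperature falls, uphill moves become rare and the state settles.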

Building on this theoretical framework, previous works [8,9,10,11] have explored various hardware implementations to optimize local minima escape strategies. In [8], SRAM error bits achieve the required randomness for updating the system state to higher energy levels. As fabricated SRAMs exhibit varying minimum operating voltages, this method deliberately lowers the supply voltage of SRAMs that store spin states, thereby inducing random error bits. While this approach allows for altering the system’s energy level without affecting interconnect values, it should be noted that the SRAM locations susceptible to such errors are spatially deterministic. Thus, the system may explore only a part of the energy landscape.

Another method for randomly flipping spin states involves applying external random pulses [9]. A spin flips its state only when it receives logical “1” from random pulses propagated both horizontally and vertically. In any other case, it keeps its original state. One can control the flipping tendency by modifying the mark-space ratio of these pulses. This method, however, may degrade solution quality because of a correlated flipping tendency among adjacent spins. Scalability challenges arise from the corruption of the mark-space ratio caused by accumulated strength imbalances in lengthy inverter buffers. To address these issues, Su et al. [10,11] applied multiple dedicated random pulses to each row and column of the spin array. Their approach adds weighted sums of random pulse values and a global weight to the weighted sums of interaction coefficients and states of adjacent spins. Although this method distributes random pulses individually to each row and column, spins within the same row or column still share the same pulse, potentially leading to correlated flipping tendencies that could limit the thorough exploration of the solution space.

2.2. Probabilistic Computing

Figure 4a abstracts the internal organization of a single p-bit. It receives the states of the p-bits connected to it together with one bias value, and these terms are scaled by their respective weights to form the drive into the probability controller. The controller then resolves the p-bit into a bipolar output in a probabilistic manner that depends on how strong that drive is. Figure 4b shows the resulting behavior, where a zero drive yields a 50% chance of logical “1”, and pushing the drive higher or lower raises or lowers that chance accordingly. Equations (2) and (3) formalize this state-update rule [16,26].

m_{i}(t) = \operatorname{sgn}\left(\tanh\left(I_{i.\mathrm{total}}(t)\right) + \operatorname{rand}(-1, +1)\right)

(2)

I_{i.\mathrm{total}}(t) = I_{0} \left\{h_{i}(t) + \sum_{j} J_{ij} m_{j}(t)\right\}

(3)

Equation (2) combines two parts: a tunable sigmoidal term written as tanh(I_i.total(t)) and a random value drawn uniformly over the interval from −1 to +1. In a real device, however, the randomness comes from transistor thermal noise, so the underlying distribution is Gaussian rather than uniform. The state decision of a practical p-bit therefore relies on a Gaussian probability density function (PDF) split by an adjustable threshold into two regions, as Figure 5 shows. Sliding this threshold along the horizontal axis tunes the output probability.
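The two noise models can be compared with a short sketch. For the uniform case, Equation (2) implies a probability of (1 + tanh(I))/2 for the +1 state; for the Gaussian case, the probability is the tail area of the PDF beyond the threshold. The function names, the seed, the sample count, and the convention that the output is logical “1” when the noise exceeds the threshold are assumptions made for illustration.

```python
import math
import random

def pbit_update(i_total, rng):
    """Equation (2): sign of tanh(I) plus a uniform value in [-1, +1]."""
    return 1 if math.tanh(i_total) + rng.uniform(-1.0, 1.0) > 0 else -1

def prob_one_uniform(i_total):
    """Probability of m = +1 implied by Equation (2)."""
    return 0.5 * (1.0 + math.tanh(i_total))

def prob_one_gaussian(threshold, sigma):
    """Probability that zero-mean Gaussian noise exceeds the threshold."""
    return 0.5 * (1.0 - math.erf(threshold / (sigma * math.sqrt(2.0))))

rng = random.Random(1)
samples = [pbit_update(0.5, rng) for _ in range(20000)]
empirical = samples.count(1) / len(samples)   # should sit near prob_one_uniform(0.5)
```

Both models give 50% at zero drive and a monotonic, saturating response; they differ in the exact shape of the curve, which is why a practical p-bit is tuned by moving the threshold rather than by adding a tanh term.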

In Equation (3), the controller input is built from a bias term h_i together with the two-body interaction terms J_ij. The Ising Hamiltonian in Equation (1) is quadratic by construction, so a COP whose natural cost function contains products of three or more variables does not map onto it directly. Prior CMOS Ising solvers were implemented with pairwise couplings only, so higher-order problems were handled by first reducing them to a quadratic form, for example, through order reduction or minor embedding. Either route introduces auxiliary spins whose count grows with the number and degree of the higher-order terms.

The p-bit update itself does not share this restriction. The only quantity each p-bit requires is its total input, which corresponds to the local field of the energy function at that node. For a higher-order cost function, this local field is a higher-order polynomial of the neighboring states, so a system that can evaluate this polynomial can incorporate terms such as J_ijk and J_ijkl with no auxiliary spins. Equation (4) extends Equation (3) to support interactions up to fourth order:

I_{i.\mathrm{total}}(t) = I_{0} \left\{h_{i}(t) + \sum_{j} J_{ij} m_{j}(t) + \sum_{j,k} J_{ijk} m_{j}(t) m_{k}(t) + \sum_{j,k,l} J_{ijkl} m_{j}(t) m_{k}(t) m_{l}(t)\right\}

(4)

A second strength follows from how the controller input shapes the flipping tendency of each p-bit, a point already raised in the introduction. This makes the p-circuit naturally suited to COPs that possess several solutions of equal energy. Reaching all of them with an annealing Ising solver usually means repeatedly re-warming the schedule or running multiple replicas within one pass. The p-circuit instead visits the different solutions within a single continuous run while the effective temperature is held fixed, so no annealing restart is needed.

Figure 6 shows the end-to-end probabilistic computing flow. The COP is first cast as a collection of bipolar variables and the interactions linking them. For a more hardware-friendly form, the bipolar variables m_i ∈ {−1, +1} are recast as binary variables s_i ∈ {0, 1} so that the problem fits the higher-order unconstrained binary optimization (HUBO) model, which generalizes quadratic unconstrained binary optimization. Substituting m = 2s − 1 keeps Equations (2)–(4) valid in the binary form, and every equation, state, and coefficient from here on follows the HUBO convention. A p-circuit is then assembled by mapping the binary variables and coefficients onto p-bits and their connections, after which it runs and samples system states over time. A final post-processing stage converts the raw samples into candidate solutions. When the weighted sum evaluations are fast enough and no two connected p-bits update at the same time, the collected samples approach the Boltzmann distribution [16,26,27]. The expected occurrence of each solution then follows from the relations below:

p_{k} = \frac{\exp(-E_{k})}{\sum_{l} \exp(-E_{l})}

(5)

E_{k} = - I_{0} (0.5 \sum_{i, j} J_{i j} s_{i} s_{j} + \sum_{i} h_{i} s_{i}) .

(6)

Here, p_k is the probability that the system occupies a particular state configuration, and E_k is the energy of that configuration. Because lower-energy configurations are favored, the candidates that appear most often are examined first when extracting the correct solutions.
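For a small instance, Equations (5) and (6) can be evaluated exhaustively. The sketch below assumes a symmetric J matrix (so the factor 0.5 undoes the double counting) and uses a hypothetical two-variable instance; the names and values are not taken from the measured circuits.

```python
import itertools
import math

def energy(J, h, s, i0=1.0):
    """Equation (6) with binary states and a symmetric J matrix."""
    n = len(s)
    pair = sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(n))
    bias = sum(h[i] * s[i] for i in range(n))
    return -i0 * (0.5 * pair + bias)

def boltzmann(J, h, i0=1.0):
    """Equation (5): normalized exp(-E) over every binary configuration."""
    states = list(itertools.product((0, 1), repeat=len(h)))
    weights = [math.exp(-energy(J, h, s, i0)) for s in states]
    z = sum(weights)
    return {s: w / z for s, w in zip(states, weights)}

# Hypothetical instance whose two ground states are (0, 0) and (1, 1).
J = [[0.0, 2.0],
     [2.0, 0.0]]
h = [-1.0, -1.0]
p = boltzmann(J, h)
```

In this example the two equal-energy configurations receive the same probability, while each mixed configuration is suppressed by a factor of exp(-1).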

Figure 6. Workflow of probabilistic computing.

2.3. Applications and Encoding Methods

In order to solve COPs with a probabilistic computing system, target problems must first be encoded into the HUBO model for system implementation. This encoding procedure determines the required number of p-bits and interactions. Among various COP instances, we focus on two tasks to validate the probabilistic computing system: one characterized by multiple ground states and the other by an essentially single ground state.

A representative COP with multiple ground states is the SAT problem, which asks whether some assignment of truth values makes a given Boolean formula evaluate to true. SAT carries an exponentially large search space and is historically notable as the first problem proven NP-complete. In probabilistic computing, Boolean constraints can be implemented by realizing their logic as p-bit gate networks [16,28]. Each primitive gate, such as NOT, AND, or OR, is encoded through a set of interaction coefficients defined over binary variables [29], and Figure 7 lists these gates together with their graphs and weight values. Larger circuits are then formed by fusing the p-bits that stand for variables shared between gates and adding their bias contributions.

SAT is a natural fit for probabilistic computing because a Boolean formula can admit several satisfying assignments depending on how it is written. The conjunctive-normal-form formula F = (A ∨ B) ∧ (C), for instance, is satisfied by three distinct input patterns. A conventional logic circuit would have to enumerate input combinations exhaustively to find them, since its inputs are driven deterministically. The p-circuit instead treats the formula as an energy function whose multiple ground states are all valid assignments, so one continuous run can surface them together. Holding the p-bits that represent the circuit output at a chosen logical value, a procedure referred to as clamping, drives the network in the inverted direction and makes it search for the input patterns consistent with that output. Under clamping, the usual separation between inputs and outputs no longer applies, since any p-bit can act as either.
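The three satisfying patterns of F = (A ∨ B) ∧ (C) can be checked by enumeration. The sketch below is the deterministic, exhaustive counterpart of what clamping achieves statistically; the function names are hypothetical.

```python
import itertools

def formula(a, b, c):
    """F = (A or B) and C."""
    return (a or b) and c

def inputs_for_output(target):
    """All (A, B, C) patterns whose formula value equals the clamped target."""
    return [bits for bits in itertools.product((0, 1), repeat=3)
            if bool(formula(*bits)) == bool(target)]

satisfying = inputs_for_output(1)
```

A p-circuit with the output clamped to logical “1” would be expected to visit these same three patterns most often, without stepping through all eight input combinations.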

Integer factorization is a contrasting example, a COP with essentially a single ground state. RSA security rests on an asymmetry in classical computing, where multiplying two integers is easy while factoring their product becomes rapidly harder as the number grows. A probabilistic computing system addresses this by driving the network toward the minimum-energy configuration, recovering the factors without enumerating all candidate pairs.

Encoding integer factorization in the HUBO model starts with a cost function [30]:

E = (XY - F)^{2}

(7)

where E is the cost, X and Y are two integer factors in binary form, and F is the product to be factored. E becomes zero if and only if X and Y are the factors whose product equals F. For odd integer factorization, the least significant bits of X and Y (x₀ and y₀) can be fixed to 1 so that the candidate factors are restricted to odd values. Interaction coefficients for the p-circuit network are obtained by computing the partial derivatives of the cost function with respect to each of the binary variables that represent the factors X and Y [20]. The results are a group of equations containing interaction coefficients up to four-body terms. Each equation yields a weighted sum result of the corresponding p-bit, which determines the input strength of the probability controller.
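The cost function of Equation (7) can be explored on a toy instance. The sketch below fixes the least significant bits to 1, lists the zero-cost factor pairs of F = 35, and uses the energy change of a single bit flip as a finite-difference stand-in for the partial derivatives; the bit width, the choice of F, and all names are assumptions.

```python
import itertools

def to_int(bits):
    """Little-endian bit tuple to integer (bits[0] is the LSB)."""
    return sum(b << k for k, b in enumerate(bits))

def cost(x_bits, y_bits, f):
    """Equation (7): E = (X * Y - F)^2."""
    return (to_int(x_bits) * to_int(y_bits) - f) ** 2

def zero_cost_factors(f, n_bits):
    """Enumerate odd X, Y (LSB fixed to 1) with zero cost."""
    found = []
    for xs in itertools.product((0, 1), repeat=n_bits - 1):
        for ys in itertools.product((0, 1), repeat=n_bits - 1):
            x_bits, y_bits = (1,) + xs, (1,) + ys
            if cost(x_bits, y_bits, f) == 0:
                found.append((to_int(x_bits), to_int(y_bits)))
    return found

def flip_delta(x_bits, y_bits, f, k):
    """Energy change when bit k of X is flipped."""
    flipped = tuple(b ^ 1 if idx == k else b for idx, b in enumerate(x_bits))
    return cost(flipped, y_bits, f) - cost(x_bits, y_bits, f)

factors = zero_cost_factors(35, 3)
```

The two zero-cost entries are the same factorization with X and Y swapped, which is why the text calls the ground state essentially single. At a solution every single-bit flip raises the cost, so each local field points back toward the current bit value.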

3. Proposed CMOS P-Circuit System

3.1. CMOS P-Bit

Figure 8 shows the proposed CMOS p-bit. Stochastic behavior originates in the differential noise voltage produced by the random source circuit. That voltage first passes through a correlated double sampling (CDS) circuit that suppresses offset and low-frequency content, and the clocked comparator then resolves the p-bit state against an adjustable threshold set by the probability controller. To pin the output probability at 50% when the 5-bit probability-control code D_i (the quantized version of I_i.total) is set to 16, foreground calibration logic trims the input offset of the comparator.

Figure 9 depicts the schematic of the random source circuit, which is built around a dynamic amplifier [31] reused here as a noise integrator. Its main stage uses two branches with input transistors M₁ and M₂, and a separate timing stage uses the M₃ and M₄ branches. Because V_IN is biased at a common-mode level, the amplifier sees no deterministic differential input, so the thermal noise of M_1,2 is the only signal. Its input-referred thermal noise appears at the output scaled by the integrator gain A_v = g_m1,2 · t_int / C_S. The resulting output-referred noise power of the main stage is given by Equation (8):

\overline{v_{\mathrm{OUT.main}}^{2}} = \overline{v_{\mathrm{IN.main}}^{2}} \cdot A_{v}^{2} = \frac{4kT}{g_{m1,2}} \cdot \frac{1}{2 t_{int}} \cdot \left(\frac{g_{m1,2} t_{int}}{C_{S}}\right)^{2} = \frac{2kT g_{m1,2} \cdot t_{int}}{C_{S}^{2}} .

(8)

Here, \overline{v_{\mathrm{IN.main}}^{2}} is the input-referred noise power of M_1,2, k is the Boltzmann constant, T is the absolute temperature, g_m1,2 is the transconductance of M_1,2, t_int is the integration time, and C_S is the total output-node capacitance. Integrator behavior, as opposed to a plain sampling network, requires t_int to stay well below the output-node RC time constant τ, a condition met by setting the timing-stage reference voltage V_b so that t_int is short enough.

Figure 9. Schematic of the random source circuit.

Figure 10 traces one operating cycle. Both C_S and C_T start precharged to V_DD, and on the rising edge of CLK_NI, the M_1,2 and M_3,4 branches begin draining C_S and C_T. Once V_TP and V_TN fall under the inverter threshold inside the off-logic block, SW_P and SW_N open, and the noise integration phase ends. A short interval is then reserved for the CDS circuit before the cycle returns to precharge. The timing-stage devices M₃ and M₄ are not noiseless either, as their jitter σ_t couples into the output-referred noise through the slope of V_OUTP,N:

σ_{t}^{2} = \frac{kT g_{m3,4}}{C_{T}^{2}} \cdot t_{int} \cdot \left(\left.\frac{dV_{TP,N}}{dt}\right|_{V_{TP,N} = V_{TH.INV}}\right)^{-2} = \frac{kT g_{m3,4}}{C_{T}^{2}} \cdot t_{int} \cdot \left(\frac{I_{3,4}}{C_{T}}\right)^{-2} = \frac{kT g_{m3,4} \cdot t_{int}}{I_{3,4}^{2}},

(9)

where I_3,4 is the bias current of the timing stage. The common-mode voltage of V_IN is set to let the two stages have the same bias current. Noise induced by the timing-stage jitter is expressed as Equation (10), where I_1,2 is the bias current of the main stage.

\overline{v_{\mathrm{jitter}}^{2}} = σ_{t}^{2} \cdot \left(\frac{dV_{OUTP,N}}{dt}\right)^{2} = σ_{t}^{2} \cdot \left(\frac{I_{1,2}}{C_{S}}\right)^{2} = \frac{kT g_{m3,4} \cdot t_{int}}{C_{S}^{2}}

(10)

The total output-referred noise of the random source circuit is then given by Equation (11).

\overline{v_{\mathrm{total}}^{2}} = \frac{(2 g_{m1,2} + g_{m3,4}) \cdot kT \cdot t_{int}}{C_{S}^{2}}

(11)
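The chain from Equation (8) to Equation (11) can be checked numerically. The sketch below evaluates each step separately and confirms that the main-stage and jitter terms add up to the closed form when the two stages share the same bias current. All component values are placeholders chosen for illustration, not design values from the prototype.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def main_stage_noise(gm12, t_int, c_s, temp):
    """Equation (8): input-referred noise power times the squared integrator gain."""
    v_in_sq = (4 * K_B * temp / gm12) * (1 / (2 * t_int))
    gain = gm12 * t_int / c_s
    return v_in_sq * gain ** 2

def jitter_noise(gm34, t_int, c_t, c_s, i34, i12, temp):
    """Equations (9) and (10): timing jitter converted through the output slope."""
    sigma_t_sq = (K_B * temp * gm34 / c_t ** 2) * t_int * (i34 / c_t) ** -2
    return sigma_t_sq * (i12 / c_s) ** 2

def total_noise(gm12, gm34, t_int, c_s, temp):
    """Equation (11): closed-form total output-referred noise power."""
    return (2 * gm12 + gm34) * K_B * temp * t_int / c_s ** 2

# Placeholder values (assumptions): 100 uS, 1 ns, 50 fF, 10 uA, 300 K.
gm12 = gm34 = 100e-6
t_int, c_s, c_t = 1e-9, 50e-15, 50e-15
i12 = i34 = 10e-6
temp = 300.0

v_main = main_stage_noise(gm12, t_int, c_s, temp)
v_jit = jitter_noise(gm34, t_int, c_t, c_s, i34, i12, temp)
v_tot = total_noise(gm12, gm34, t_int, c_s, temp)
v_rms = math.sqrt(v_tot)  # RMS noise voltage seen by the CDS stage
```

The timing capacitance C_T cancels out of the jitter term, and the equality with Equation (11) relies on I_1,2 = I_3,4, which is the bias condition stated in the text.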

Figure 10. Timing diagram of the random source circuit.

The clocked comparator schematic is given in Figure 11. A pair of binary-weighted capacitor arrays at the discharging nodes A_P and A_N sets how fast each node discharges. Two control words, CTLP and CTLN, steer these arrays and are produced by decoding the code D_i received by the probability controller.

Figure 11. Schematic of the comparator with offset calibration and probability-control scheme.

Two low-frequency nonidealities threaten clean stochastic operation in the proposed CMOS p-bit, namely 1/f noise and the static offsets of the random source circuit and the clocked comparator. The 1/f component shows up as autocorrelation between successive output bits. An offset in the random source circuit displaces the Gaussian PDF, whereas an offset in the comparator displaces the moving threshold. Figure 12a illustrates the first case, where a positive random-source offset slides the PDF so that more probability mass falls on the logical “1” side at zero input. Figure 12b illustrates the second, where a comparator input offset moves the threshold rightward and instead biases the output toward logical “0” at zero input. The first effect is canceled by applying CDS to the random source circuit, and the second by foreground calibration of the comparator.

Figure 13 gives the CDS circuit together with its timing. Across the first two phases, SW₁ and SW₂ route the integrated noise into C₁ and then into C₂. In the final phase, the comparator is driven by OUT_CDS, which is formed from the difference in the two stored noise samples on C₁ and C₂ so that their common offset and low-frequency content cancel.
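The offset-cancelling effect of CDS can be illustrated with a behavioral simulation. The sketch below models each stored sample as a static offset plus independent Gaussian noise and compares the probability of logical “1” with and without differencing. It covers only a static offset (a component that changes slowly compared with one cycle behaves similarly); the offset of two standard deviations, the seed, and the sample count are assumptions.

```python
import random

def simulate(n, offset, sigma, seed=0):
    """Return the fraction of ones for a raw sample and for a CDS difference."""
    rng = random.Random(seed)
    raw_ones = 0
    cds_ones = 0
    for _ in range(n):
        first = offset + rng.gauss(0.0, sigma)    # sample stored on C1
        second = offset + rng.gauss(0.0, sigma)   # sample stored on C2
        raw_ones += first > 0                     # decision without CDS
        cds_ones += (first - second) > 0          # common offset cancels
    return raw_ones / n, cds_ones / n

raw_fraction, cds_fraction = simulate(n=20000, offset=2.0, sigma=1.0)
```

Without differencing, the offset pushes the output heavily toward logical “1”; with differencing, the output returns to about 50%. The price in this simple model is that the white-noise variance of the difference is twice that of a single sample.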

Figure 14 outlines the comparator calibration logic. An asynchronous successive-approximation-register loop drives the capacitor arrays during this phase. The comparator inputs are tied to IN_CM while the loop runs, and the resulting COP and CON codes feed the binary-weighted capacitors at the comparator discharging nodes.
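A successive-approximation search of this kind can be sketched behaviorally. The model below assumes a noise-free comparator, a 5-bit trim code whose mid-scale value applies zero correction, and a fixed correction step per code; none of these details are specified in the text, and a real loop would also have to cope with comparator noise.

```python
def sar_calibrate(offset, lsb, bits=5):
    """Binary-search a trim code so the residual offset falls within one LSB.

    The correction applied by a code is (code - mid) * lsb, where mid is the
    mid-scale code. Both the mapping and the bit count are assumptions.
    """
    mid = 1 << (bits - 1)
    code = 0
    for b in reversed(range(bits)):          # MSB first
        trial = code | (1 << b)
        residual = offset - (trial - mid) * lsb
        if residual >= 0:                    # comparator decision, inputs shorted
            code = trial                     # keep the bit
    return code, offset - (code - mid) * lsb

code_pos, residual_pos = sar_calibrate(offset=3.3e-3, lsb=1e-3)
code_neg, residual_neg = sar_calibrate(offset=-2.5e-3, lsb=1e-3)
```

Each iteration halves the remaining search range, so a 5-bit code is resolved in five comparator decisions.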

3.2. P-Circuit Implementation

A fully connected and programmable p-circuit is realized by interfacing the proposed CMOS p-bit prototype chips with an AMD ZCU104 FPGA board. The invertible Boolean logic and the integer-factorization instances, already cast into the HUBO model in Section 2.3, are mapped onto the FPGA by configuring the bias terms and the bidirectional links among the p-bits in custom logic on the programmable logic (PL). The same PL also derives the 5-bit probability-control code D_i, which drives the probability controller of every p-bit. Once a p-bit completes a state update, it raises a done signal that prompts the FPGA to recompute the weighted sum for the next target p-bit. Since every p-bit emits a binary value, this weighted sum reduces to selecting and accumulating only the active weight and bias terms, which is implemented as adders and digital multiplexers in the PL fabric. The accumulated value is encoded into the 5-bit code D_i and passed to the p-bit scheduled to update next. A 150 MHz clock for the CMOS p-bits is generated with the Clocking Wizard IP located in the PL. According to the post-implementation power report obtained from AMD Vivado, the total, dynamic, and static power of the FPGA board were 3.647 W, 2.954 W, and 0.693 W, respectively.
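The select-and-accumulate step can be sketched in software. The example below covers pairwise weights only, and the mapping from the accumulated value to the 5-bit code (center code 16, unit scale, round-to-nearest, saturation at 0 and 31) is an assumption, since the text does not give the encoding.

```python
import math

def weighted_sum(i, states, bias, weights):
    """Accumulate the bias and only those weights whose source p-bit is 1."""
    total = bias[i]
    for j, s in enumerate(states):
        if s and j != i:          # multiplexer: pass weight only when s_j = 1
            total += weights[i][j]
    return total

def encode_5bit(total, scale=1.0):
    """Assumed encoding: offset by the center code 16 and saturate to 0..31."""
    code = 16 + math.floor(scale * total + 0.5)
    return max(0, min(31, code))

# Hypothetical 3-p-bit network.
weights = [[0, 3, -2],
           [3, 0, 1],
           [-2, 1, 0]]
bias = [1, 0, 0]
states = [0, 1, 1]
d_0 = encode_5bit(weighted_sum(0, states, bias, weights))
```

Because the states are binary, no multiplier is needed: each weight is either added or skipped, which is what allows the PL implementation to use only adders and multiplexers.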

4. Results

The proposed CMOS p-bit was fabricated in a 180 nm process, and its micrograph is shown in Figure 15a. Each prototype chip integrates one p-bit occupying an active area of 6551 μm², and multiple such chips are interfaced through the FPGA to form the p-circuit. Driven by a 1.8 V supply and a 150 MHz clock, the p-bit updates its state at 50 MHz, and the energy dissipation at a 50% output probability is measured as 6.95 pJ/bit. Figure 15b shows the power breakdown of the CMOS p-bit with the same operating conditions. The random source circuit, timing block, and comparator dominate the p-bit-core power because they perform dynamic precharge, noise integration, timing generation, and state decision in every update cycle. In contrast, the CDS circuit consumes only a small fraction because it operates as a switched-capacitor sampling path with small capacitive loading and no static current path. The probability controller also consumes little power in this measurement because the 5-bit code D_i is fixed at the 50% output-probability setting. The “Others” category includes global buffers and test options.

Table 1 shows the measured per-p-bit energy dissipation for various probability-controller inputs. Five D_i values (0, 10, 16, 22, and 31) were selected to show the relationship between output probability and dissipated energy. Because a 50% output probability yields the maximum toggling frequency, energy consumption decreases as the probability-controller input deviates from the center code. Although the differences are small, operation at 50% output probability is the worst case in terms of energy dissipation.

Table 2 summarizes the update rate and energy dissipation of the prototype CMOS p-bit chip across temperature and supply-voltage variation. Likewise, the output probability was set to 50% to represent the worst-case energy condition. Across all combinations, the CMOS p-bit maintained a 50 MHz update rate. When the supply voltage was decreased by 10%, the measured energy dissipation decreased by approximately 20%.
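A few of the reported figures can be cross-checked with simple arithmetic. The sketch below derives the average p-bit power implied by the energy per bit and the update rate, the number of clock cycles per update, and the energy reduction that a quadratic (CV²) supply dependence would predict for a 10% lower supply. The quadratic model is an assumption used for comparison, not a statement from the measurements.

```python
ENERGY_PER_BIT_J = 6.95e-12   # reported energy at 50% output probability
UPDATE_RATE_HZ = 50e6         # reported update rate
CLOCK_HZ = 150e6              # reported p-bit clock

# Average power implied by energy per update times updates per second.
average_power_w = ENERGY_PER_BIT_J * UPDATE_RATE_HZ

# Clock cycles available for each state update.
clock_cycles_per_update = CLOCK_HZ / UPDATE_RATE_HZ

# Assumed CV^2 scaling: a 10% lower supply gives 0.9^2 of the energy.
energy_ratio = 0.9 ** 2
predicted_reduction = 1.0 - energy_ratio
```

Under this assumption the predicted reduction is 19%, which is in line with the measured reduction of approximately 20%.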

The remainder of this section reports the output characteristics of the p-bit, the measured invertible Boolean operation of a p-circuit logic gate, and the measured result of an integer factorization instance.

4.1. Output Characteristics of the CMOS P-Bit

The measured output-probability curve of the CMOS p-bit is plotted in Figure 16a. Every point on the curve is the time average of a 1 Mb output bitstream, given in percent. The prototype p-bits reproduce the full 32-level control range and follow the sigmoidal characteristic of Figure 4b. P-bit #3 was chosen as a representative sample, and its output-probability curves measured under the five temperature and supply-voltage conditions are shown in Figure 16b. The maximum difference among these curves is 2.209 percentage points, at D_i = 14. Chip-to-chip variation is larger, with a maximum difference of 14.08 percentage points at D_i = 15.
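
To put the reported differences in context, the sampling uncertainty of one time-averaged point can be estimated. The sketch below assumes that "1 Mb" means 10^6 bits and that successive bits are independent. Both are assumptions made for this illustration, and the function names are hypothetical.

```python
import math


def time_average_percent(bits):
    """Output probability in percent, estimated as the mean of a 0/1 bitstream."""
    return 100.0 * sum(bits) / len(bits)


def standard_error_pp(p, n_bits):
    """Binomial standard error of that estimate, in percentage points.

    Assumes independent bits, which is an idealization and not a measured property.
    """
    return 100.0 * math.sqrt(p * (1.0 - p) / n_bits)


N_BITS = 1_000_000  # assumption: "1 Mb" read as 10**6 bits
se = standard_error_pp(0.5, N_BITS)
print(f"standard error at 50%: {se:.3f} percentage points")
```

Under these assumptions the standard error near 50% is about 0.05 percentage points, so differences of a few percentage points between conditions or chips are unlikely to be sampling noise alone.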

Randomness of the bitstream is evaluated with the NIST Statistical Test Suite (STS), a standard battery of quantitative randomness tests. For each variation condition, a 10 Mb measured bitstream was applied to the suite, and the outcome is summarized in Table 3. The minimum pass rates are 8/10 for the ten-partition inputs, 9/11 for the eleven-partition inputs, 10/12 for the twelve-partition inputs, and 18/20 for the twenty-partition inputs, taken at a significance level of 0.01. The proposed CMOS p-bit passed every NIST STS test under all evaluated variation conditions.
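
The quoted minimum pass rates are consistent with the proportion interval commonly used with the NIST STS, p̂ − 3·sqrt(p̂(1 − p̂)/m) with p̂ = 1 − α and m sequences. The sketch below evaluates that bound; truncating the result to an integer count is an assumption that happens to reproduce the four thresholds listed above.

```python
import math

ALPHA = 0.01  # significance level quoted in the text


def min_pass_count(m, alpha=ALPHA):
    """Smallest acceptable number of passing sequences out of m.

    Lower edge of the three-sigma proportion interval, truncated to an integer
    (the truncation is an assumption of this sketch).
    """
    p_hat = 1.0 - alpha
    lower = p_hat - 3.0 * math.sqrt(p_hat * alpha / m)
    return math.floor(lower * m)


for m in (10, 11, 12, 20):
    print(f"{m} sequences: at least {min_pass_count(m)} must pass")
```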

4.2. Measurement Results of Invertible Boolean Operation

An AND-OR-invert (AOI22) gate is mapped onto CMOS p-bits to demonstrate invertible Boolean operation through output clamping. The AOI22 gate is built from eight p-bits, as drawn in Figure 17. The output p-bit E is clamped to a target logical value, and the p-circuit is run while configuration counts are collected. Repeating this measurement for the two output-clamping conditions recovers the input combinations that drive the AOI22 output to logical “0” and logical “1”.
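
For reference, the input combinations that an ideal inverted AOI22 gate should recover can be enumerated from its truth table. The sketch below assumes the conventional AOI22 function, E = NOT((A AND B) OR (C AND D)); the input names A to D are labels chosen here and are not taken from Figure 17.

```python
from itertools import product


def aoi22(a, b, c, d):
    """AND-OR-invert: NOT((a AND b) OR (c AND d))."""
    return 1 - ((a & b) | (c & d))


def inputs_for_output(target):
    """All (a, b, c, d) input tuples whose AOI22 output equals target."""
    return [bits for bits in product((0, 1), repeat=4) if aoi22(*bits) == target]


for target in (0, 1):
    combos = inputs_for_output(target)
    print(f"E = {target}: {len(combos)} input combinations")
    for bits in combos:
        print("   ", bits)
```

With the output clamped, a correctly operating p-circuit is expected to concentrate its configuration counts on the listed input combinations for that clamp value.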

4.3. Measurement Results of an Integer Factorization

Figure 18 shows the integer factorization measured on the proposed CMOS p-circuit for the integer 1539. With the connections configured up to 4-body terms, 100,000 samples are collected and shown as a 3D histogram. The correct factorization, 57 × 27, accounts for 840 of the 100,000 samples. Because configurations near the correct answer also receive high counts, the p-circuit effectively narrows the search space, and the correct factors are confirmed by checking the top candidates.

4.4. Comparison with Prior Works

Table 4 compares the proposed CMOS p-bit with a D-Wave quantum annealer [7], an MTJ p-bit [17], and an LFSR-based FPGA p-bit [20]. Each prior approach gives up one of three properties: the quantum annealer needs a cryogenic environment, the MTJ p-bit is not compatible with standard CMOS technology, and the FPGA p-bit relies on pseudorandom LFSR bits. The proposed work instead pairs a transistor-noise random source in a standard 180 nm CMOS process with FPGA-based weighted-sum and connection logic, yielding a room-temperature, physically random p-bit whose random source stays fully CMOS while the network-level functions remain in programmable logic.

5. Discussion

5.1. Random-Source Scaling and System-Level Overheads

The scalability benefit of the proposed p-bit is limited to the local random-source circuit. Each p-bit uses transistor device noise as its own local random source, so it does not require a shared PRNG state, per-p-bit seed assignment, or global random-pulse distribution. In this sense, the random-source circuit size remains fixed per p-bit as the number of p-bits increases. This does not mean that the complete p-circuit area is independent of problem size. A complete p-circuit still requires additional p-bit cores, coefficient storage, weighted-sum logic, update-control logic, and interconnects as the mapped problem becomes larger.

A first-order estimate can be made from the measured p-bit core area and energy. The active area of one CMOS p-bit core is 6551 μm². Therefore, the p-bit-core area scales approximately linearly with the number of cores. This corresponds to approximately 0.0655 mm², 0.655 mm², and 6.55 mm² for 10, 100, and 1000 p-bit cores, respectively. These values are core-only estimates and do not include coefficient storage, weighted-sum circuits, routing, update scheduling, or peripheral circuits.

The measured worst-case energy dissipation is 6.95 pJ/bit at a 50 MHz update rate. This corresponds to an active p-bit-core power of approximately 347.5 μW per p-bit. Thus, 10, 100, and 1000 active p-bit cores would require approximately 3.48 mW, 34.8 mW, and 348 mW, respectively, excluding the network-level overhead. The FPGA power reported in this work belongs to the flexible proof-of-concept platform and should not be interpreted as the optimized power of a custom integrated p-circuit.
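
The core-only area and power estimates above follow from simple linear scaling, as the sketch below reproduces. The constants are the measured values quoted in the text; the function names are hypothetical, and network-level overhead is deliberately left out.

```python
CORE_AREA_UM2 = 6551.0     # measured active area of one p-bit core
ENERGY_PJ_PER_BIT = 6.95   # measured worst case (50% output probability)
UPDATE_RATE_HZ = 50e6      # measured update rate


def core_area_mm2(n_cores):
    """Core-only area in mm^2 (1 mm^2 = 1e6 um^2)."""
    return n_cores * CORE_AREA_UM2 * 1e-6


def core_power_mw(n_cores):
    """Core-only active power in mW: energy per update times update rate."""
    return n_cores * ENERGY_PJ_PER_BIT * 1e-12 * UPDATE_RATE_HZ * 1e3


for n in (10, 100, 1000):
    print(f"{n:5d} cores: {core_area_mm2(n):.4f} mm^2, {core_power_mw(n):.2f} mW")
```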

At the system level, the dominant overhead can shift from the p-bit core to the network implementation. For sparse p-circuits, coefficient storage, weighted-sum logic, and routing scale mainly with the number of programmed interactions. For fully connected pairwise networks, these terms scale approximately with the square of the number of p-bits. Higher-order HUBO mappings add further coefficients and local-field terms. Therefore, the proposed local random source removes PRNG-state, seed-assignment, and random-distribution overhead from the stochastic source itself, but it does not eliminate the area, power, latency, and synchronization overheads of the full weighted-sum network.
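
The growth of the network-level terms can be illustrated by counting coefficients. The sketch below counts pairwise couplings for a fully connected network, couplings for a sparse network with an assumed fixed number of neighbours per p-bit, and an upper bound on the number of HUBO terms up to a given order. The neighbour count and function names are hypothetical.

```python
from math import comb


def pairwise_couplings(n):
    """Fully connected pairwise network: one coefficient per unordered pair."""
    return comb(n, 2)


def sparse_couplings(n, degree):
    """n p-bits with `degree` neighbours each; every coupling is counted once."""
    return n * degree // 2


def hubo_terms_upper_bound(n, max_order):
    """Upper bound on distinct non-constant terms with 1 to max_order variables."""
    return sum(comb(n, k) for k in range(1, max_order + 1))


for n in (10, 100, 1000):
    print(n, pairwise_couplings(n), sparse_couplings(n, degree=6))
print("9 p-bits, up to 4-body terms:", hubo_terms_upper_bound(9, 4))
```

For the 9-p-bit factorization setup with terms up to 4-body, the bound is 255 terms; the number of terms that are actually nonzero for a given cost function can be smaller.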

The present prototype uses one p-bit per chip and an FPGA for weighted-sum computation, so it does not demonstrate full-system scalability. Future multi-p-bit CMOS implementations should co-design p-bit placement, coefficient storage, local-field accumulation, update scheduling, and interconnects to preserve the intended probability distribution while reducing system-level overhead.

5.2. Model-Based Factorization-Size Analysis and Target Selection

For the integer-factorization demonstration, the maximum problem size should be distinguished from the simple variable-representation range. In the present setup, the two odd factors are represented as

X = 1 + \sum_{p = 1}^{5} 2^{p} x_{p},

(12)

and

Y = 1 + \sum_{q = 1}^{4} 2^{q} y_{q},

(13)

where the least significant bits of both factors are fixed to one. Therefore, the largest representable odd factors are X_max = 63 and Y_max = 31, giving a representational upper product of 63 × 31 = 1953. This value, however, only indicates the largest product that can be encoded by the available factor bits; it does not guarantee that the p-circuit can reliably generate the correct factor pair as a high-probability sample. The practical limit of the present architecture is also affected by the 5-bit local-field interface between the FPGA weighted-sum logic and the CMOS p-bit probability controller.
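
The encoding of Equations (12) and (13) can be checked with a few lines of code. In this sketch the bit tuples are ordered from bit 1 upward, and the helper names `decode` and `encode` are hypothetical.

```python
X_BITS, Y_BITS = 5, 4  # free bits x_1..x_5 and y_1..y_4; each LSB is fixed to 1


def decode(bits):
    """Eq. (12)/(13): value = 1 + sum_p 2**p * bit_p, with bits = (bit_1, bit_2, ...)."""
    return 1 + sum((2 ** p) * b for p, b in enumerate(bits, start=1))


def encode(value, n_bits):
    """Inverse of decode for an odd value that fits in n_bits free bits."""
    if value % 2 == 0 or not 1 <= value < 2 ** (n_bits + 1):
        raise ValueError("value must be odd and representable")
    return tuple((value >> p) & 1 for p in range(1, n_bits + 1))


x_max = decode((1,) * X_BITS)
y_max = decode((1,) * Y_BITS)
print(x_max, y_max, x_max * y_max)                 # largest factors and their product
print(encode(57, X_BITS), encode(27, Y_BITS))      # free bits for the 1539 instance
```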

To estimate this limit, we used a simple finite-state model of the present 9-p-bit factorization setup. For each target integer F, all 512 states were enumerated and assigned the cost

E_{F} (s) = {(X (s) Y (s) - F)}^{2} .

(14)

This enumeration and the following local-field calculation were repeated separately for each target F. Equivalently, the factorization coefficients and local fields were re-derived from each target-specific cost function rather than fixed to those used for the F = 1539 experiment. For each p-bit i, the local weighted sum was calculated from the energy difference between the two possible values of the target p-bit, while all other p-bit states were fixed:

A_{i} = E_{F} (s_{i} = 0, s_{\setminus i}) - E_{F} (s_{i} = 1, s_{\setminus i}) .

(15)

A positive A_i favors s_i = 1, while a negative A_i favors s_i = 0. The implemented p-circuit scales this local weighted sum by the fixed inverse-temperature parameter I₀, shifts it by the center code 16 that produces approximately 50% output probability, and quantizes it into the 5-bit probability-control code. Accordingly, the model used

D_{i} = \mathrm{sat}_{[0, 31]} \left( \lfloor 16 + I_{0} A_{i} \rfloor \right),

(16)

where the operator sat_{[0, 31]} denotes unsigned saturation to the range from 0 to 31. The measured 32-level output-probability curve was then used to assign the update probability

P (s_{i} = 1) = p_{D_{i}} .

(17)
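
Equations (14) to (16) can be evaluated directly for a given state. The sketch below computes the 5-bit control codes seen by each p-bit when the circuit sits in the correct state for F = 1539, using the I₀ value reported for the Figure 18 measurement. The state ordering (x_1..x_5, y_1..y_4), the zero-based index i, and the helper names are assumptions of this illustration. The mapping from code to probability in Equation (17) is not included because it relies on the measured curve.

```python
import math

I0 = 6.0e-6  # inverse-temperature scale reported for the Figure 18 measurement


def decode(bits):
    """Eq. (12)/(13): value = 1 + sum_p 2**p * bit_p."""
    return 1 + sum((2 ** p) * b for p, b in enumerate(bits, start=1))


def cost(state, target):
    """Eq. (14); state = (x_1..x_5, y_1..y_4)."""
    return (decode(state[:5]) * decode(state[5:]) - target) ** 2


def local_field(state, i, target):
    """Eq. (15): E(s_i = 0) - E(s_i = 1) with the other eight p-bits held fixed."""
    s0 = state[:i] + (0,) + state[i + 1:]
    s1 = state[:i] + (1,) + state[i + 1:]
    return cost(s0, target) - cost(s1, target)


def control_code(state, i, target, i0=I0):
    """Eq. (16): floor, then saturate to the 5-bit range 0..31."""
    return max(0, min(31, math.floor(16 + i0 * local_field(state, i, target))))


correct = (0, 0, 1, 1, 1, 1, 0, 1, 1)  # X = 57, Y = 27
codes = [control_code(correct, i, 1539) for i in range(9)]
print(codes)
```

In this sketch the codes at the correct state stay within a few steps of the center code 16, which is in line with the broad measured distribution described in the text.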

Let T_i be the 512 × 512 transition matrix for updating p-bit i while all other p-bits are held fixed. One full serial cycle applies the nine updates in order, and the cycle-boundary stationary distribution π₀ is the fixed point of this full-cycle transition:

\pi_{0} = \pi_{0} T_{\mathrm{cycle}}, \quad \text{where} \quad T_{\mathrm{cycle}} = T_{1} T_{2} \dots T_{9} .

(18)

Because the measurement records the full 9-p-bit state after every single p-bit update, the model uses the average distribution over the nine update phases:

\pi_{\mathrm{serial}} = \frac{1}{9} \sum_{k = 1}^{9} \pi_{0} T_{1} T_{2} \dots T_{k} .

(19)
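
A minimal sketch of Equations (18) and (19) is given below. It does not use the measured 32-level curve, which is not tabulated here; instead, `p_one` is a logistic stand-in with an arbitrary slope, so the resulting numbers are illustrative only and should not be compared with the reported model estimates. The sketch exploits the fact that each T_i changes only bit i, so the row-vector products are applied without forming the full matrices. All names are hypothetical.

```python
import math

I0, TARGET, N = 6.0e-6, 1539, 9


def decode(bits):
    return 1 + sum((2 ** p) * b for p, b in enumerate(bits, start=1))


def cost(state):
    return (decode(state[:5]) * decode(state[5:]) - TARGET) ** 2


def p_one(code, slope=0.25):
    """ASSUMPTION: logistic stand-in for the measured 32-level curve p_D."""
    return 1.0 / (1.0 + math.exp(-slope * (code - 16)))


# Bit b of the integer index holds p-bit b of the state (x_1..x_5, y_1..y_4).
STATES = [tuple((idx >> b) & 1 for b in range(N)) for idx in range(2 ** N)]


def update_prob(state, i):
    s0 = state[:i] + (0,) + state[i + 1:]
    s1 = state[:i] + (1,) + state[i + 1:]
    code = max(0, min(31, math.floor(16 + I0 * (cost(s0) - cost(s1)))))
    return p_one(code)


# P1[i][idx]: probability that updating p-bit i from state idx yields s_i = 1.
P1 = [[update_prob(s, i) for s in STATES] for i in range(N)]


def apply_update(pi, i):
    """Row-vector product pi T_i, using the fact that T_i only changes bit i."""
    out = [0.0] * len(pi)
    mask = 1 << i
    for idx, mass in enumerate(pi):
        p = P1[i][idx]
        out[idx | mask] += mass * p
        out[idx & ~mask] += mass * (1.0 - p)
    return out


def cycle(pi):
    """One full serial cycle, pi T_1 T_2 ... T_9."""
    for i in range(N):
        pi = apply_update(pi, i)
    return pi


def stationary(tol=1e-13, max_cycles=20000):
    """Eq. (18) by power iteration from the uniform distribution."""
    pi = [1.0 / len(STATES)] * len(STATES)
    for _ in range(max_cycles):
        nxt = cycle(pi)
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    raise RuntimeError("power iteration did not converge")


def serial_average(pi0):
    """Eq. (19): average of pi0 T_1 ... T_k over k = 1..9."""
    acc, pi = [0.0] * len(pi0), pi0
    for i in range(N):
        pi = apply_update(pi, i)
        acc = [a + b / N for a, b in zip(acc, pi)]
    return acc


pi0 = stationary()
pi_serial = serial_average(pi0)
print("probability mass:", sum(pi_serial))
```

Replacing `p_one` with a lookup into the measured output-probability curve would turn this sketch into the kind of finite-state model described in the text.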

Using this model, we swept the set of nontrivial odd products representable by the same 9-p-bit allocation. This set includes all products F = XY, where X is an odd integer from 3 to 63 and Y is an odd integer from 3 to 31. For each F, the correct-factor probability P_correct(F) was obtained from the probability assigned by π_serial to the correct factor-pair state, and the correct-factor rank was determined by comparing this probability with those of the other candidate factor pairs. Because each 9-p-bit state uniquely corresponds to one factor pair (X,Y) under the present bit allocation, this probability is directly read from the state probability assigned by π_serial. We then defined a top-K model-based supported range as

F_{\max}^{(K)} = \max \{ F \in G \mid \mathrm{rank}_{\mathrm{correct}} (F) \leq K \text{ and } 100000 \, P_{\mathrm{correct}} (F) > 100 \} .

(20)

Here, G denotes the set of nontrivial odd products representable by the current 9-p-bit allocation. The second condition requires the correct factor pair to appear more than 100 times in 100,000 serially recorded samples, so that it is not merely a vanishing-probability state. For the measurement condition used in Figure 18, I₀ = 6.0 × 10⁻⁶, the model estimated approximately 918 occurrences for the demonstrated target 1539 = 57 × 27, which is close to the measured count of 840. Under the same condition, the top-1 criterion gave F_max⁽¹⁾ = 1653, while the top-5 candidate-verification criterion gave F_max⁽⁵⁾ = 1769. These numbers are not absolute mathematical limits of p-bit factorization. They are model-based estimates for the present 9-p-bit allocation, measured 32-level output-probability curve, 5-bit local-field quantization, fixed I₀, and serial-update sampling protocol. The analysis indicates that the present demonstration is constrained not only by the number of p-bits but also by local-field quantization and candidate ranking. Thus, the p-circuit should be interpreted as a stochastic candidate generator whose top-ranked candidates can be verified classically, rather than as a deterministic large-scale factorization solver.
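
The set G and the selection rule of Equation (20) can be written compactly. The sketch below builds G from the stated factor ranges and confirms that the three targets named above belong to it. The `f_max` helper takes a hypothetical mapping from each target to its model rank and probability; no model outputs are reproduced here.

```python
def representable_products():
    """G: products X*Y with odd X in 3..63 and odd Y in 3..31, with their factor pairs."""
    g = {}
    for x in range(3, 64, 2):
        for y in range(3, 32, 2):
            g.setdefault(x * y, []).append((x, y))
    return g


def f_max(model_results, k, n_samples=100_000, min_count=100):
    """Eq. (20). model_results maps F -> (rank_correct, p_correct).

    The container layout is hypothetical; it would be filled from a finite-state model.
    """
    ok = [f for f, (rank, p) in model_results.items()
          if rank <= k and n_samples * p > min_count]
    return max(ok) if ok else None


G = representable_products()
for f in (1539, 1653, 1769):
    print(f, G[f])
```

Some smaller targets in G admit more than one factor pair within the available ranges (for example 45), and a model implementation has to decide how to treat them; the text does not specify this, so the sketch leaves it open.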

Within this 9-p-bit factorization setup, 1539 = 57 × 27 was selected as a high-range but non-boundary validation target. The available bit allocation can represent odd factors up to 63 for X and 31 for Y, giving the endpoint product 1953 = 63 × 31. This endpoint case was not used because both factors correspond to all-one boundary patterns, which are less representative of a general factorization target within the available binary range. Instead, the selected factors 57 (111001₍₂₎) and 27 (11011₍₂₎) both activate their most significant available bits while avoiding the all-one pattern. The product 1539 is approximately 79% of the representational upper product 1953, and the finite-state model also places this target within the top-5 candidate criterion. Therefore, it is not a small low-bit example. It was not selected by screening for an unusually high sampling probability. The measured distribution remains broad, and nearby candidate factor pairs receive comparable counts. Thus, the 1539 result is best interpreted as a high-range proof-of-concept case showing that the p-circuit can generate the correct factor pair among top-ranked stochastic candidates, rather than as a sharply convergent or practically large-scale factorization benchmark.

6. Conclusions

This work demonstrated a CMOS p-bit with a digitally adjustable time-averaged output as a stochastic primitive for probabilistic computing systems targeting COPs. The randomness is drawn from transistor device noise rather than a pseudorandom generator, which removes the seed-management and length-scaling burdens that limit PRNG-based p-bits. The fabricated 180 nm chip supports 32 digitally selected probability levels and passes the NIST STS randomness tests. Under a 1.8 V supply, it operates from a 150 MHz clock to provide 50 MHz p-bit updates, with an energy dissipation of 6.95 pJ/bit at 50% output probability. For system-level validation, multiple CMOS p-bit chips were interfaced with an FPGA that computes weighted sums and supplies the 5-bit probability-control codes. With the HUBO-form coefficients evaluated on the FPGA, the p-circuit experimentally demonstrated invertible Boolean operation and performed integer factorization. The present prototype integrates a single p-bit per chip, so full-system scalability is not demonstrated here. At the p-bit-core level, the use of transistor device noise as a local random source keeps the random-source circuit size fixed per p-bit as the system scales, avoiding PRNG-state and seed-assignment overheads in future multi-p-bit integration for larger COPs.

Author Contributions

Conceptualization, C.L. and J.J.; methodology, J.J.; validation, J.J.; formal analysis, J.J.; investigation, J.J.; resources, C.L. and J.J.; writing—original draft preparation, J.J.; writing—review and editing, C.L. and J.J.; visualization, J.J.; supervision, C.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Regional Innovation System & Education (RISE) program through the Gyeonggi RISE Center, funded by the Ministry of Education (MOE) and the Gyeonggi-do, Republic of Korea (2026-RISE-09-A37), and by the Hankuk University of Foreign Studies Research Fund of 2024.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The chip fabrication and EDA tool were supported by the IC Design Education Center (IDEC), Republic of Korea.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Rivest, R.L.; Shamir, A.; Adleman, L. A method for obtaining digital signatures and public-key cryptosystems. Commun. ACM 1978, 21, 120–126.
2. Cook, C.; Zhao, H.; Sato, T.; Hiromoto, M.; Tan, S.X.-D. GPU-based Ising computing for solving max-cut combinatorial optimization problems. Integr. VLSI J. 2019, 69, 335–344.
3. Bartolacci, M.R.; LeBlanc, L.J.; Kayikci, Y.; Grossman, T.A. Optimization modeling for logistics: Options and implementations. J. Bus. Logist. 2012, 33, 118–127.
4. Theis, T.N.; Wong, H.-S.P. The end of Moore’s law: A new beginning for information technology. Comput. Sci. Eng. 2017, 19, 41–50.
5. Kirkpatrick, S.; Gelatt, C.D., Jr.; Vecchi, M.P. Optimization by simulated annealing. Science 1983, 220, 671–680.
6. Metropolis, N.; Rosenbluth, A.W.; Rosenbluth, M.N.; Teller, A.H.; Teller, E. Equation of state calculations by fast computing machines. J. Chem. Phys. 1953, 21, 1087–1092.
7. Johnson, M.W.; Amin, M.H.S.; Gildert, S.; Lanting, T.; Hamze, F.; Dickson, N.; Harris, R.; Berkley, A.J.; Johansson, J.; Bunyk, P.; et al. Quantum annealing with manufactured spins. Nature 2011, 473, 194–198.
8. Yamaoka, M.; Yoshimura, C.; Hayashi, M.; Okuyama, T.; Aoki, H.; Mizuno, H. A 20k-spin Ising chip to solve combinatorial optimization problems with CMOS annealing. IEEE J. Solid-State Circuits 2016, 51, 303–309.
9. Hayashi, M.; Yamaoka, M.; Yoshimura, C.; Okuyama, T.; Aoki, H.; Mizuno, H. An accelerator chip for ground-state searches of the Ising model with asynchronous random pulse distribution. In Proceedings of the 2015 3rd International Symposium on Computing and Networking (CANDAR), Hokkaido, Japan, 8–11 December 2015; pp. 542–546.
10. Su, Y.; Mu, J.; Kim, H.; Kim, B. A scalable CMOS Ising computer featuring sparse and reconfigurable spin interconnects for solving combinatorial optimization problems. IEEE J. Solid-State Circuits 2022, 57, 858–868.
11. Su, Y.; Kim, H.; Kim, B. CIM-Spin: A scalable CMOS annealing processor with digital in-memory spin operators and register spins for combinatorial optimization problems. IEEE J. Solid-State Circuits 2022, 57, 2263–2273.
12. Takemoto, T.; Yamamoto, K.; Yoshimura, C.; Hayashi, M.; Tada, M.; Saito, H.; Mashimo, M.; Yamaoka, M. A 144Kb annealing system composed of 9×16Kb annealing processor chips with scalable chip-to-chip connections for large-scale combinatorial optimization problems. In Proceedings of the 2021 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 13–22 February 2021; pp. 64–66.
13. Su, Y.; Kim, T.T.-H.; Kim, B. FlexSpin: A CMOS Ising machine with 256 flexible spin processing elements with 8-b coefficients for solving combinatorial optimization problems. IEEE J. Solid-State Circuits 2024, 59, 2659–2670.
14. Bae, J.; Shim, C.; Kim, B. e-Chimera: A scalable SRAM-based Ising macro with enhanced-Chimera topology for solving combinatorial optimization problems within memory. In Proceedings of the 2024 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 18–22 February 2024; pp. 286–288.
15. Cılasun, H.; Moy, W.; Zeng, Z.; Islam, T.; Lo, H.; Vanasse, A.; Tan, M.; Anees, M.; Ramprasath, S.; Kumar, A.; et al. A coupled-oscillator-based Ising chip for combinatorial optimization. Nat. Electron. 2025, 8, 537–546.
16. Camsari, K.Y.; Faria, R.; Sutton, B.M.; Datta, S. Stochastic p-bits for invertible logic. Phys. Rev. X 2017, 7, 031014.
17. Borders, W.A.; Pervaiz, A.Z.; Fukami, S.; Camsari, K.Y.; Ohno, H.; Datta, S. Integer factorization using stochastic magnetic tunnel junctions. Nature 2019, 573, 390–393.
18. Daniel, J.; Sun, Z.; Zhang, X.; Tan, Y.; Dilley, N.; Chen, Z.; Appenzeller, J. Experimental demonstration of an on-chip p-bit core based on stochastic magnetic tunnel junctions and 2D MoS2 transistors. Nat. Commun. 2024, 15, 4098.
19. Smithson, S.C.; Onizawa, N.; Meyer, B.H.; Gross, W.J.; Hanyu, T. Efficient CMOS invertible logic using stochastic computing. IEEE Trans. Circuits Syst. I Regul. Pap. 2019, 66, 2263–2274.
20. Pervaiz, A.Z.; Sutton, B.M.; Ghantasala, L.A.; Camsari, K.Y. Weighted p-bits for FPGA implementation of probabilistic circuits. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 1920–1926.
21. Patel, S.; Canoza, P.; Salahuddin, S. Logically synthesized and hardware-accelerated restricted Boltzmann machines for combinatorial optimization and integer factorization. Nat. Electron. 2022, 5, 92–101.
22. Kim, J.; Han, J.K.; Maeng, H.Y.; Han, J.; Jeon, J.W.; Jang, Y.H.; Woo, K.S.; Choi, Y.K.; Hwang, C.S. Fully CMOS-based p-bits with a bistable resistor for probabilistic computing. Adv. Funct. Mater. 2024, 34, 2307935.
23. Rhee, H.; Kim, G.; Song, H.; Park, W.; Kim, D.H.; In, J.H.; Lee, Y.; Kim, K.M. Probabilistic computing with NbOx metal-insulator transition-based self-oscillatory pbit. Nat. Commun. 2023, 14, 7199.
24. Johnson, D.S. A catalog of complexity classes. In Handbook of Theoretical Computer Science, 1st ed.; MIT Press: Cambridge, MA, USA, 1990; pp. 67–161.
25. Lucas, A. Ising formulations of many NP problems. Front. Phys. 2014, 2, 5.
26. Camsari, K.Y.; Debashis, P.; Ostwal, V.; Pervaiz, A.Z.; Shen, T.; Chen, Z.; Datta, S.; Appenzeller, J. From charge to spin and spin to charge: Stochastic magnets for probabilistic switching. Proc. IEEE 2020, 108, 1322–1337.
27. Pervaiz, A.Z.; Ghantasala, L.A.; Camsari, K.Y.; Datta, S. Hardware emulation of stochastic p-bits for invertible logic. Sci. Rep. 2017, 7, 10994.
28. Biamonte, J.D. Nonperturbative k-body to two-body commuting conversion Hamiltonians and embedding problem instances into Ising spins. Phys. Rev. A 2008, 77, 052331.
29. Aadit, N.A.; Grimaldi, A.; Carpentieri, M.; Theogarajan, L.; Martinis, J.M.; Finocchio, G.; Camsari, K.Y. Massively parallel probabilistic computing with sparse Ising machines. Nat. Electron. 2022, 5, 460–468.
30. Peng, X.; Liao, Z.; Xu, N.; Qin, G.; Zhou, X.; Suter, D.; Du, J. Quantum adiabatic algorithm for factorization and its experimental implementation. Phys. Rev. Lett. 2008, 101, 220405.
31. Park, Y.; Song, J.; Choi, Y.; Lim, C.; Ahn, S.; Kim, C. An 11-b 100-MS/s fully dynamic pipelined ADC using a high-linearity dynamic amplifier. IEEE J. Solid-State Circuits 2020, 55, 2468–2477.

Figure 1. Energy landscapes of Ising models. (a) Single-solution scenario. (b) Multiple-solutions scenario.

Figure 2. Comparison among classical bits, p-bits, and qubits at an abstraction level.

Figure 4. (a) Block diagram of a p-bit. (b) Output probability versus input curve showing the output controllability of a p-bit.

Figure 5. Adjustable threshold for controlling the output probability of a p-bit.

Figure 7. Basic logic gates represented by p-circuits.

Figure 8. Block diagram of the proposed CMOS p-bit.

Figure 12. Horizontal shifts caused by offsets. (a) The offset of the random source circuit. (b) The offset of the clocked comparator.

Figure 13. CDS circuit and timing sequence. (a) CDS circuit generating OUTCDS from two sampled noise-integration outputs. (b) Timing sequence for two noise-integration samples and the subsequent comparator decision phase.

Figure 14. Flowchart of the comparator calibration logic.

Figure 15. (a) Chip micrograph of the proposed CMOS p-bit. (b) Power breakdown of the CMOS p-circuit at 27 °C.

Figure 16. (a) Measured output probability of the CMOS p-bit. (b) Measured output probability of P-bit #3 across temperature and supply-voltage variations.

Figure 17. Measured configuration counts of the clamped p-circuit AOI22 gate.

Figure 18. Measured factorization results of 1539 using the CMOS p-circuit.

Table 1. Measured per-p-bit energy dissipation at different output-probability settings.

D_i	0	10	16	22	31
Energy dissipation [pJ/bit]	6.940	6.946	6.950	6.944	6.943

V_DD = 1.8 V, T = 27 °C, Update rate = 50 MHz.

Table 2. Measured update rate and per-p-bit energy dissipation across temperature and supply variation.

V_DD [V]	1.62	1.8	1.8	1.8	1.98
T [°C]	27	0	27	80	27
Update rate [MHz]	50	50	50	50	50
Energy dissipation [pJ/bit]	5.491	6.857	6.950	7.294	8.852

Table 3. NIST STS results across temperature and supply-voltage variations.

V_DD [V]	1.62		1.8		1.8		1.8		1.98
T [°C]	27		0		27		80		27
Test	p-Value	Pass Rate	p-Value	Pass Rate	p-Value	Pass Rate	p-Value	Pass Rate	p-Value	Pass Rate
Frequency	0.275709	20/20	0.048716	20/20	0.275709	20/20	0.834308	20/20	0.066882	20/20
Block frequency	0.213309	20/20	0.834308	20/20	0.534146	20/20	0.911413	20/20	0.350485	20/20
Cumulative sums 1	0.275709	20/20	0.122325	20/20	0.637119	20/20	0.637119	20/20	0.162606	20/20
Cumulative sums 2	0.090936	20/20	0.991468	20/20	0.964295	20/20	0.911413	20/20	0.637119	20/20
Runs	0.964295	19/20	0.437274	20/20	0.534146	20/20	0.834308	20/20	0.275709	20/20
Longest run	0.911413	20/20	0.739918	20/20	0.534146	20/20	0.534146	20/20	0.739918	20/20
Rank	0.275709	20/20	0.275709	19/20	0.739918	20/20	0.534146	20/20	0.350485	19/20
FFT	0.006196	19/20	0.964295	20/20	0.437274	20/20	0.739918	20/20	0.739918	20/20
Non-overlapping template	0.534146	20/20	0.066882	20/20	0.275709	20/20	0.350485	20/20	0.017912	18/20
Overlapping template	0.739918	20/20	0.350485	20/20	0.213309	19/20	0.739918	20/20	0.213309	20/20
Universal statistical test	0.534146	20/20	0.637119	19/20	0.637119	19/20	0.437274	20/20	0.213309	20/20
Approximate entropy	0.964295	20/20	0.275709	19/20	0.991468	20/20	0.534146	19/20	0.637119	20/20
Random excursions	0.739918	10/10	0.122325	12/12	0.437274	11/11	0.350485	10/10	0.534146	12/12
Random excursions variant	0.534146	10/10	0.035174	12/12	0.090936	11/11	0.534146	10/10	0.739918	12/12
Serial	0.739918	20/20	0.437274	20/20	0.964295	20/20	0.534146	19/20	0.911413	20/20
Linear complexity	0.739918	20/20	0.437274	20/20	0.275709	20/20	0.162606	20/20	0.637119	19/20

Table 4. Comparison with prior works.

	Nature 2011 [7]	Nature 2019 [17]	TNNLS 2019 [20]	This Work
Processing unit	D-Wave qubit	MTJ p-bit	FPGA p-bit	CMOS p-bit
Technology	Superconductor IC	MTJ + discrete	FPGA	CMOS 180 nm + FPGA
Chip integration	Qubits integrated	No	Possible	CMOS p-bit integrated
Operating temperature	20–100 mK	300 K	300 K	273–353 K
Random source	N/A (Quantum annealing)	MTJ thermal noise	Dedicated LFSR	Transistor noise
Random quality	N/A	Random *	Pseudorandom	NIST STS pass
Probability-control precision	N/A	12-bit (external DAC)	6-bit	5-bit
Random source/p-bit core footprint	N/A	≥0.00503 μm² **	42 LUTs + 33 REGs ***	6551 μm² ****
Update rate [MHz]	N/A	0.001 (estimated)	33.3 (estimated)	50
Energy dissipation [pJ/bit]	N/A	2000 (estimated)	N/A	6.95
Applications	COPs *	Invertible Boolean logic, Integer factorization (N = 945)	Invertible Boolean logic, Subset sum problem	Invertible Boolean logic, Integer factorization (N = 1539)

* Not explicitly reported; ** MTJ pillar footprint derived from nominal diameter; *** FPGA resources for single tunable RNG implementation; **** CMOS layout area, including the circuits depicted in Figure 8.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Jeon, J.; Lim, C. A Noise-Based CMOS Probabilistic Bit for Combinatorial Optimization Problems. Electronics 2026, 15, 2510. https://doi.org/10.3390/electronics15122510

