Mathematics
  • Feature Paper
  • Article
  • Open Access

21 October 2025

Fast Deep Belief Propagation: An Efficient Learning-Based Algorithm for Solving Constraint Optimization Problems

1 School of Software Engineering, Sun Yat-sen University, Zhuhai 519000, China
2 Department of Computer Science, Cornell University, Ithaca, NY 14850, USA
* Author to whom correspondence should be addressed.
This article belongs to the Special Issue Applied Mathematics, Computing, and Machine Learning

Abstract

Belief Propagation (BP) is a fundamental heuristic for solving Constraint Optimization Problems (COPs), yet its practical applicability is constrained by slow convergence and instability in loopy factor graphs. While Damped BP (DBP) improves convergence by using manually tuned damping factors, its reliance on labor-intensive hyperparameter optimization limits scalability. Deep Attentive BP (DABP) addresses this by automating damping through recurrent neural networks (RNNs), but introduces significant memory overhead and sequential computation bottlenecks. To reduce memory usage and accelerate deep belief propagation, this paper introduces Fast Deep Belief Propagation (FDBP), a deep learning framework that improves COP solving through online self-supervised learning and graphics processing unit (GPU) acceleration. FDBP decouples the learning of damping factors from BP message passing, inferring all parameters for an entire BP iteration in a single step, and leverages mixed precision to further optimize GPU memory usage. This approach substantially improves both the efficiency and scalability of BP optimization. Extensive evaluations on synthetic and real-world benchmarks highlight the superiority of FDBP, especially for large-scale instances where DABP fails due to memory constraints. Moreover, FDBP achieves an average speedup of 2.87× over DABP with the same restart counts. Because BP for COPs is a mathematically grounded GPU-parallel message-passing framework that bridges applied mathematics, computing, and machine learning, and is widely applicable across science and engineering, our work offers a promising step toward more efficient solutions to these problems.

1. Introduction

Belief Propagation (BP) [1,2] plays a pivotal role as a message-passing algorithm in graphical models. Its applications range from computing the partition function of a Markov random field [3] to estimating marginal distributions [4] and decoding low-density parity-check (LDPC) codes [5]. Constraint optimization problems (COPs) [6,7] provide a versatile mathematical framework for modeling real-world challenges in transportation, supply chain management, energy, finance, and scheduling [7,8,9,10]. In the realm of COPs, BP, also recognized as Min-sum message passing [11], seeks cost-optimal solutions by propagating cost information throughout the factor graph. Beyond standard BP, variational, bounding, and elimination-based perspectives provide additional context for inference in loopy or high-treewidth settings, including Tree-Reweighted BP (TRW) and Max-Product Linear Programming (MPLP) for bounds [12,13], as well as bucket elimination and mini-bucket approximations for structure-exploiting tradeoffs [14,15].
However, vanilla BP lacks convergence guarantees on factor graphs with loops and can converge to suboptimal fixed points in COPs with cyclic structure. Recognizing this challenge, significant research has focused on stabilizing loopy BP [16,17,18,19,20,21]. Notably, Damped Belief Propagation (DBP) [21] has gained attention for improving convergence behavior on loopy graphs. These stabilization techniques are complementary to variational and LP-based reparameterizations that control oscillation and provide certificates when available [12,13].
Despite the advantages of damping, fine-tuning damping factors per instance remains laborious. Recent work integrates deep learning to automate these choices. In particular, Deep Attentive Belief Propagation (DABP) leverages self-supervised learning with gradient-based optimization to determine damping factors, and further introduces dynamic damping based on real-time optimization status to improve per-iteration decisions. This trend is part of a broader line of work on neuralized message passing that parameterizes or augments BP updates while preserving their semantics, including the Belief Propagation Neural Network (BPNN) [22], Neural-Enhanced Belief Propagation (NEBP) [23], and the Factor Graph Neural Network (FGNN) for higher-order factors [24], surveyed in learning for combinatorial reasoning and solver guidance [25].
However, DABP faces scalability and efficiency challenges, primarily due to high GPU memory consumption and its reliance on autoregressive damping-factor predictions, which limit applicability to larger instances. To address these limitations, we introduce a paradigm that decouples damping-factor learning from the BP process. By eliminating the need for a recurrent neural network (RNN) over sequential BP messages, our approach enables joint prediction of damping factors, akin to conventional deep models. This reduces the number of deep neural network (DNN) calls, yields substantial memory savings, and improves efficiency over DABP. Additionally, our method leverages mixed precision to further reduce GPU memory. We refine learning by introducing priors, constraining the damping space, and learning adaptive variations rather than directly optimizing raw values, which enhances stability and convergence. Overall, our approach substantially improves both the efficiency and scalability of DABP. Our design aligns with recent efficiency-first trends in machine learning for discrete optimization that aim to limit sequential neural computation, for example, diffusion-based solvers such as DIFUSCO and Fast tokens-to-tokens (Fast T2T) and industrial-scale pipelines like DISCO [26,27,28], as well as restart policies and learning-guided control for robust anytime behavior [29,30]. From a theoretical perspective, message-passing graph neural networks (GNNs) can capture optimal approximation algorithms for broad classes of Max-CSPs, with OptGNN providing a practical instantiation [31].
The efficacy of our algorithm is substantiated through extensive experiments across diverse benchmarks, with an emphasis on larger problem instances. Our approach not only handles significantly larger problem sizes but also achieves an average speedup of 2.87× over DABP at equivalent restart counts. In terms of solution quality, our model remains competitive on smaller instances (fewer than 150 variables) and surpasses baselines on larger instances, where DABP fails due to memory limitations.
Our contributions are threefold: (1) we propose FDBP, a scalable BP framework that rethinks how damping factors are learned; by leveraging online self-supervised learning and parallelizable message processing, FDBP accelerates COP solving; (2) FDBP eliminates RNN hidden-state storage and per-iteration neural passes, which reduces GPU memory usage and enables scalability to large COP instances where prior DABP fails due to memory constraints; and (3) we conduct extensive experiments across diverse benchmarks, demonstrating that FDBP achieves a favorable accuracy–efficiency tradeoff and consistent gains over existing methods.

3. Preliminaries

A table summarizing all acronyms and abbreviations used in the paper is provided in Appendix C.

3.1. Constraint Optimization Problems

A COP [6] can be specified by the triple $\langle X, D, F \rangle$, where $X$ is the set of variables, $D$ collects their domains, and $F$ is a family of cost functions. Each variable $x_i \in X$ has a finite domain $D_i \in D$. For every $f \in F$ with scope $scp(f) \subseteq X$, the function assigns a cost to each assignment of the variables in its scope.
The task is to compute a solution $\tau^* = (\tau_1^*, \dots, \tau_{|X|}^*) \in \prod_{x_i \in X} D_i$ that minimizes the total cost:
$$\tau^* = \arg\min_{\tau \in \prod_{x_i \in X} D_i} \sum_{f \in F} f\big(\tau|_{scp(f)}\big), \tag{1}$$
where $\tau|_{scp(f)}$ denotes the restriction of $\tau$ to $scp(f)$. A COP may be represented as a factor graph, a bipartite graph with variable nodes for elements of $X$ and function nodes for members of $F$. An edge connects a variable node to a function node iff the variable belongs to the function’s scope.
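To make the objective in Equation (1) concrete, the following minimal sketch evaluates it by brute force on a toy instance. The three variables, their binary domains, and the two cost tables are hypothetical values chosen for illustration; exhaustive enumeration is only viable at this scale and is not how the solvers discussed in this paper operate.

```python
from itertools import product

# Hypothetical toy COP: three binary variables and two binary cost functions.
domains = {"x1": [0, 1], "x2": [0, 1], "x3": [0, 1]}
factors = [
    (("x1", "x2"), {(0, 0): 3, (0, 1): 1, (1, 0): 2, (1, 1): 4}),
    (("x2", "x3"), {(0, 0): 2, (0, 1): 5, (1, 0): 1, (1, 1): 3}),
]


def total_cost(assignment):
    """Sum each factor's cost on the restriction of the assignment to its scope."""
    return sum(table[tuple(assignment[v] for v in scope)] for scope, table in factors)


def brute_force():
    """Enumerate the Cartesian product of all domains and keep the cheapest assignment."""
    names = list(domains)
    best = None
    for values in product(*(domains[n] for n in names)):
        tau = dict(zip(names, values))
        cost = total_cost(tau)
        if best is None or cost < best[1]:
            best = (tau, cost)
    return best


best_assignment, best_cost = brute_force()
print(best_assignment, best_cost)
```

The search space grows as the product of the domain sizes, which is why message-passing heuristics such as BP are used instead of enumeration.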

3.2. Min-Sum Belief Propagation

Min-sum Belief Propagation (Min-sum BP) [11] is a standard method for COPs that runs on the factor graph representation by exchanging messages between variable and factor nodes. At iteration $t$, the message sent from a variable node $x_i$ to a neighboring factor $f$ is a function $m^{t}_{x_i \to f} : D_i \to \mathbb{R}$ defined as
$$m^{t}_{x_i \to f} = \sum_{f_\xi \in N_i \setminus \{f\}} m^{t-1}_{f_\xi \to x_i}, \tag{2}$$
where $N_i$ denotes the neighbors of $x_i$ in the factor graph, and $m^{t-1}_{f_\xi \to x_i} : D_i \to \mathbb{R}$ is the message previously received by $x_i$ from $f_\xi$. In the opposite direction, $f$ computes the message for $x_i$ as
$$m^{t}_{f \to x_i} = \min_{scp(f) \setminus \{x_i\}} \Big( f + \sum_{x_j \in scp(f) \setminus \{x_i\}} m^{t-1}_{x_j \to f} \Big). \tag{3}$$
After aggregating incoming messages, variable $x_i$ selects a value that minimizes its belief:
$$\tau_i^{t} = \arg\min_{\tau_i \in D_i} \sum_{f \in N_i} m^{t}_{f \to x_i}(\tau_i), \tag{4}$$
where $m^{t}_{f \to x_i}(\tau_i)$ is the belief cost assigned to $\tau_i$ by the message from $f$.
On graphs with cycles, plain Min-sum BP can exhibit non-convergence and may settle on poor solutions due to repeated influence along loops. DBP [21] mitigates these issues by damping the variable-to-factor messages. Concretely, the message from $x_i$ to $f$ is updated as
$$m^{t}_{x_i \to f} = \lambda\, m^{t-1}_{x_i \to f} + (1 - \lambda) \sum_{f_\xi \in N_i \setminus \{f\}} m^{t-1}_{f_\xi \to x_i}. \tag{5}$$
With an appropriate choice of $\lambda \in (0, 1]$, DBP often improves on both the convergence behavior and the solution quality of undamped Min-sum BP.
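The updates in Equations (3)–(5) can be sketched for binary factors as follows. This is an illustrative toy, not the authors' implementation: the chain-shaped instance, the synchronous update schedule, the zero initialization, and the names `damped_min_sum`, `lam`, and `steps` are all assumptions. The graph is tree-structured, so the decoded assignment is exact here; on loopy graphs no such guarantee holds.

```python
# Hypothetical chain x1 - f1 - x2 - f2 - x3 with binary domains.
domains = {"x1": [0, 1], "x2": [0, 1], "x3": [0, 1]}
factors = {
    "f1": (("x1", "x2"), {(0, 0): 3, (0, 1): 1, (1, 0): 2, (1, 1): 4}),
    "f2": (("x2", "x3"), {(0, 0): 2, (0, 1): 5, (1, 0): 1, (1, 1): 3}),
}
neighbors = {v: [f for f, (scope, _) in factors.items() if v in scope] for v in domains}


def damped_min_sum(lam=0.5, steps=50):
    """Synchronous damped Min-sum BP for binary factors; returns the decoded assignment."""
    edges = [(v, f) for v in domains for f in neighbors[v]]
    v2f = {(v, f): {a: 0.0 for a in domains[v]} for v, f in edges}
    f2v = {(f, v): {a: 0.0 for a in domains[v]} for v, f in edges}
    for _ in range(steps):
        new_v2f, new_f2v = {}, {}
        for v, f in edges:
            # Eq. (5): blend the old message with the sum from the other factors.
            others = [g for g in neighbors[v] if g != f]
            new_v2f[(v, f)] = {
                a: lam * v2f[(v, f)][a] + (1 - lam) * sum(f2v[(g, v)][a] for g in others)
                for a in domains[v]
            }
            # Eq. (3): minimise out the other variable in the factor's scope.
            scope, table = factors[f]
            u = scope[0] if scope[1] == v else scope[1]
            new_f2v[(f, v)] = {
                a: min(
                    table[(a, b) if scope[0] == v else (b, a)] + v2f[(u, f)][b]
                    for b in domains[u]
                )
                for a in domains[v]
            }
        v2f, f2v = new_v2f, new_f2v
    # Eq. (4): each variable picks the value with the lowest aggregated belief cost.
    return {
        v: min(domains[v], key=lambda a: sum(f2v[(f, v)][a] for f in neighbors[v]))
        for v in domains
    }


solution = damped_min_sum()
print(solution)
```

Setting `lam=0.0` recovers the undamped update of Equation (2).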

4. Methodologies

In each step, the DBP algorithm uses a fixed, manually chosen damping factor to balance the contribution of new and old messages when updating the message from a variable node to a function node in a factor graph. However, the new message is composed of messages received from the neighboring function nodes of the current variable node, and the importance of these messages can vary. To address this limitation, a more flexible approach should allow for the assignment of variable-specific damping factors and neighbor-specific weights for each variable node during the BP process. Specifically, dynamic damping factors and neighbor-specific weights can be integrated into the BP process. In step $t$, a variable node $x_i$ computes the message to function node $f$ using the following expression:
$$m^{t}_{x_i \to f} = \lambda_i^{t}\, m^{t-1}_{x_i \to f} + (1 - \lambda_i^{t})\,(|N_i| - 1) \times \sum_{f_\xi \in N_i \setminus \{f\}} w^{t}_{\xi \to i}(f)\, m^{t-1}_{f_\xi \to x_i}. \tag{6}$$
Here, $\lambda_i^{t} \in [0, 1]$ and $w^{t}_{\xi \to i}(f) \in [0, 1]$ represent learnable damping factors and neighbor weights, respectively, for the message from $x_i$ to $f$ at iteration $t$.
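As a small sanity check on Equation (6), the sketch below computes one weighted, damped variable-to-factor message and compares it with the plain damped update of Equation (5). The function name, the list-based message layout, and the numeric values are hypothetical. With uniform weights $1/(|N_i| - 1)$ the factor $(|N_i| - 1)$ cancels, so the two updates coincide.

```python
def weighted_damped_message(old_msg, incoming, weights, lam):
    """One variable-to-factor message following Eq. (6).

    old_msg  -- previous message, one cost per domain value
    incoming -- messages from the other neighbouring factors (|N_i| - 1 of them)
    weights  -- one neighbour weight per incoming message
    lam      -- damping factor for this message
    """
    k = len(incoming)  # equals |N_i| - 1
    return [
        lam * old_msg[a] + (1 - lam) * k * sum(w * m[a] for w, m in zip(weights, incoming))
        for a in range(len(old_msg))
    ]


# Hypothetical numbers: a binary variable with two other neighbouring factors.
old = [1.0, 2.0]
incoming = [[2.0, 1.0], [0.0, 4.0]]

uniform = weighted_damped_message(old, incoming, [0.5, 0.5], lam=0.9)
plain_damped = [0.9 * old[a] + 0.1 * (incoming[0][a] + incoming[1][a]) for a in range(2)]
skewed = weighted_damped_message(old, incoming, [0.9, 0.1], lam=0.9)
print(uniform, plain_damped, skewed)
```

Non-uniform weights, as in `skewed`, let one neighbor's message dominate the update, which is the extra freedom the learned weights provide.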
Suppose a problem instance has $N$ variables, and its corresponding factor graph has an average node degree of $M$. If the BP process runs for $T$ steps until termination, the number of damping factors and neighbor weights to be set in adaptive BP is $O(NMT)$, which is far too many to handcraft manually.
In response, DABP [33] was proposed, leveraging a DNN $\mathcal{M}_\theta$ with parameters $\theta$ to automatically infer these factors in an autoregressive manner:
$$w^{t}, \lambda^{t} = \mathcal{M}_\theta\big(G,\; m^{1:t-1}_{x \to f},\; m^{1:t-1}_{f \to x}\big). \tag{7}$$
Here, $G$ represents the factor graph, and $m^{1:t-1}_{x \to f}$ and $m^{1:t-1}_{f \to x}$ denote the BP messages from variable nodes to function nodes and from function nodes to variable nodes in the previous steps, respectively.
However, the entire DABP procedure operates autoregressively, requiring the invocation of DNNs at each step to infer damping factors and neighbor weights. This recurrent dependency on DNNs can significantly increase computational overhead.

4.1. Fast Deep Belief Propagation

We propose an efficient learning-based algorithm for solving constraint optimization problems, referred to as Fast Deep Belief Propagation (FDBP). An illustration of the proposed approach is provided in Figure 1. The algorithm will execute R iterations, effectively restarting R times. The restart policy has demonstrated significant potential in addressing COPs [34]. Researchers have observed that combinatorial search algorithms often exhibit highly unpredictable runtime and solution quality across problem instances. By incorporating a restart policy, the algorithm minimizes the risk of getting trapped in local minima, thereby improving its overall efficiency and effectiveness.
Figure 1. The workflow of our algorithm FDBP.
In each iteration $r$, the algorithm runs for $T$ steps. At each step $t$, it updates messages $(m^{t}_{x \to f}, m^{t}_{f \to x})_r$ by using the messages from the previous step $(m^{t-1}_{x \to f}, m^{t-1}_{f \to x})_r$ and inferred parameters $(\lambda^{t}, w^{t})_r$. These parameters $(\lambda^{t}, w^{t})_r$ are computed by a Graph Attention Network (GAT), which takes as input the factor graph and the messages from the previous iteration, specifically $(m^{t-1}_{x \to f}, m^{t-1}_{f \to x})_{r-1}$.
This design introduces a key improvement: instead of inferring parameters $(\lambda^{t}, w^{t})_r$ at each step $t$ during the current iteration (as in the DABP algorithm), our approach infers all parameters for the entire iteration $r$ in a single step, using a single invocation of the GAT. By doing so, the algorithm avoids the need to repeatedly invoke DNNs $T - 1$ times per iteration, as required in DABP. This significantly reduces computational overhead and leverages GPU parallelism, making our approach far more efficient while retaining the effectiveness of parameter inference.
The time complexity of our algorithm per iteration matches that of DBP in the worst-case scenario. However, in practice, our approach often achieves faster performance due to its enhanced convergence behavior (as demonstrated in Table 1 and Table 2 of Section 5), which reduces the required number of steps to reach convergence. Compared to DABP, our algorithm’s efficiency advantage becomes particularly evident when employing the same restart strategy, as all parameters for an entire iteration are computed simultaneously, significantly streamlining the process.
It is worth noting that our algorithm adopts a different approach to parameter inference compared to DABP. In DABP, the parameters for step $t$ are inferred autoregressively by utilizing messages from all preceding steps ($0$ to $t - 1$) within the same iteration, where an RNN is used to encode these messages. By contrast, our algorithm infers parameters for step $t$ using only the corresponding messages from the previous iteration, $(m^{t-1}_{x \to f}, m^{t-1}_{f \to x})_{r-1}$. This design assumes that these messages from the previous iteration already encapsulate the relevant information from earlier steps. While not directly intervening in the BP process, this approach simplifies the inference process and complements our algorithm’s overall focus on efficiency and scalability.
In summary, our FDBP introduces a significantly more efficient and scalable framework for parameter inference in learning-based belief propagation. By addressing the computational bottlenecks inherent in DABP, our approach enhances both the speed and scalability of solving COPs. Specifically, FDBP overcomes the limitations of DABP by reducing the need for repeated, computationally expensive deep learning network invocations at each iteration, thereby lowering memory overhead and avoiding sequential computation bottlenecks. This leads to improved overall performance, making it well-suited for real-world, large-scale applications where both memory efficiency and computational speed are critical.
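The control flow described in this subsection can be sketched schematically as below. Everything in this block is a stand-in: `stub_controller` replaces the GAT, `stub_bp_step` replaces the actual message updates, messages are reduced to a single float, and the use of a default damping factor of 0.9 for the first restart (when no previous trajectory exists) is an assumption. The only point illustrated is that the parameters for restart $r$ come from one controller call on the trajectory recorded during restart $r - 1$.

```python
def run_restarts(R, T, controller, bp_step, default=(0.9, None)):
    """Schematic restart loop: one controller call per restart instead of one per step."""
    prev_traj, calls, msgs = None, 0, None
    for _ in range(R):
        if prev_traj is None:
            params = [default] * T          # assumed fallback for the first restart
        else:
            params = controller(prev_traj)  # single call yields all T (lambda, w) pairs
            calls += 1
        msgs, traj = 0.0, []
        for t in range(T):
            traj.append(msgs)               # record the message used at step t
            lam, w = params[t]
            msgs = bp_step(msgs, lam, w)
        prev_traj = traj
    return calls, msgs


def stub_controller(trajectory):
    """Stand-in for the learned controller: returns a constant damping factor per step."""
    return [(0.9, None) for _ in trajectory]


def stub_bp_step(msg, lam, w):
    """Stand-in for a damped update pulling the scalar message towards 1.0."""
    return lam * msg + (1 - lam) * 1.0


fdbp_calls, final_msg = run_restarts(R=20, T=1000, controller=stub_controller, bp_step=stub_bp_step)
per_step_calls = 20 * (1000 - 1)  # a per-step scheme with T - 1 calls in each restart
print(fdbp_calls, per_step_calls)
```

Under these assumptions the restart-level scheme issues 19 controller calls where a per-step scheme would issue 19,980.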

4.2. The FDBP Algorithm

The GAT model in our FDBP algorithm is trained using an online self-supervised learning approach, removing the dependency on labor-intensive, human-labeled datasets. This design allows our method to be directly applied to solving COPs without requiring prior model pretraining. Specifically, at each iteration $t$, given the BP messages $m$, we optimize the following self-supervised objective $\mathcal{L}(m)$:
$$\min_\theta \mathcal{L}(m) = \sum_{f \in F} \; \sum_{(\tau_{i_1}, \dots, \tau_{i_n}) \in \prod_{x_i \in scp(f)} D_i} f(\tau_{i_1}, \dots, \tau_{i_n}) \prod_{j=1}^{n} p^{t}_{i_j}(\tau_{i_j}), \tag{8}$$
where $p_i(\tau_{i_j})$ denotes the belief-induced probability of the event $x_i = \tau_{i_j}$, defined by the following:
$$p_i(\tau_{i_j}) = \frac{\exp\big(-b_i(\tau_{i_j})\big)}{\sum_{\tau' \in D_i} \exp\big(-b_i(\tau')\big)}, \qquad b_i(\tau_{i_j}) = \sum_{f \in F_i} m_{f \to x_i}(\tau_{i_j}). \tag{9}$$
Intuitively, an assignment τ i j is assigned a higher probability p i ( τ i j ) when it has a lower belief cost b i ( τ i j ) . It preserves the minimization semantics of the argmin in Equation (4) and is consistent with the global objective in Equation (1). By leveraging this self-supervised loss, our algorithm adaptively refines its predictions in a principled manner, facilitating efficient and effective training.
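The loss in Equations (8) and (9) can be illustrated with plain Python as follows. This is a sketch only: the belief values and the single cost table are hypothetical, and a real implementation would compute the same quantities on differentiable tensors so that gradients can flow back to the controller.

```python
import math
from itertools import product


def belief_probs(beliefs):
    """Eq. (9): softmax over negated belief costs, so lower cost means higher probability."""
    shift = min(beliefs)  # subtracting the minimum keeps exp() numerically stable
    exps = [math.exp(-(b - shift)) for b in beliefs]
    z = sum(exps)
    return [e / z for e in exps]


def expected_cost(factors, probs):
    """Eq. (8): each factor's cost table averaged under the product of per-variable probabilities."""
    total = 0.0
    for scope, table in factors:
        for assignment in product(*(range(len(probs[v])) for v in scope)):
            weight = math.prod(probs[v][a] for v, a in zip(scope, assignment))
            total += table[assignment] * weight
    return total


# Hypothetical beliefs: x1 is undecided, x2 strongly prefers value 0.
probs = {"x1": belief_probs([0.0, 0.0]), "x2": belief_probs([0.0, 100.0])}
factors = [(("x1", "x2"), {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 2.0, (1, 1): 4.0})]
loss = expected_cost(factors, probs)
print(loss)
```

Because `x2` is almost surely 0 and `x1` is uniform, the loss is essentially the average of the costs 3.0 and 2.0.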
Our FDBP algorithm is presented in Algorithm 1. The algorithm runs for R iterations (lines 3–19), with each iteration comprising the following steps:
  • Initialization: initialize messages and parameters for the factor graph (line 4).
  • Message Updates: perform sequential message updates using Equations (6) and (3) with the inferred parameters $\lambda^{t}, w^{t}$ (lines 6–7).
  • Solution Evaluation: periodically evaluate the current solution $\tau^{t}$ and compare its cost to the best solution $\tau^{*}$ found so far (lines 8–10).
  • Buffer Storage: store selected message pairs in a buffer for loss computation and gradient updates (lines 11–12).
  • Adaptive Optimization: optimize parameters adaptively through backpropagation using the stored messages and the defined loss function (lines 15–19).
Note that we use mixed precision to further economize GPU memory and refine the learning process by constraining damping factors to the range 0.8 to 1.0. Rather than directly optimizing raw damping values, we learn variations of them within this range, in line with the empirical observation that good values lie around 0.9 [21].
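One possible way to realize such a constrained parameterization is sketched below. The `tanh` squashing, the function name, and the `prior` and `radius` parameters are assumptions made for illustration; the paper states only the range (0.8 to 1.0) and the prior value (around 0.9), not the exact mapping.

```python
import math


def damping_from_variation(z, prior=0.9, radius=0.1):
    """Map an unconstrained output z to a damping factor within prior +/- radius.

    Assumed parameterization: the network predicts a variation around the prior
    rather than the raw damping value.
    """
    return prior + radius * math.tanh(z)


samples = [damping_from_variation(z) for z in (-50.0, -1.0, 0.0, 1.0, 50.0)]
print(samples)
```

With this mapping an output of zero reproduces the prior exactly, so an untrained controller starts from the hand-tuned DBP setting.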
Algorithm 1: The fast deep belief propagation algorithm

4.3. Time Complexity

We denote by $b$ the maximum scope size of a factor, $d$ the maximum variable domain size, and $e$ the average degree of a variable node in the factor graph. Let $M$ be the cost of one call to the GAT-based controller (a single forward pass). We follow the implementation setting where factors and variables are processed in parallel on a single GPU; the expressions below therefore reflect wall-time under this parallelism (work complexity would multiply by $|F|$ or $|X|$ accordingly).
  • Factor-to-variable update, Equation (3). Evaluating a factor message over a scope of size $b$ requires iterating over assignments of the other $b - 1$ variables, giving $O(d^{b-1})$.
  • Loss/objective evaluation, Equation (8). Dominated by factor evaluations; with factors processed in parallel on GPU, the wall-time is $O(d^{b-1})$.
  • Variable-to-factor (weighted BP) update, Equation (6). Aggregation over the $e$ incoming messages yields $O(e)$ (variables processed in parallel).
  • Decoding, Equation (4). Selecting the best label for one variable is $O(d)$ (variables processed in parallel).
Per BP iteration, the dominant cost is, therefore,
$$C_{\text{iter}} = O\big(d^{b-1} + e + d\big).$$
Our controller (GAT) is invoked once every $K$ iterations, so each restart performs $T/K$ controller calls, contributing $O\big(\tfrac{T}{K} M\big)$. We evaluate the objective once per restart, adding $O(d^{b-1})$.
  • Total time complexity. With $R$ restarts and at most $T$ BP iterations per restart, the total wall-time is
$$T_{\text{total}} = O\Big(R\,\Big[\,T\big(d^{b-1} + e + d\big) + \tfrac{T}{K}\, M + d^{b-1}\Big]\Big).$$
For methods that infer weights/damping every iteration (e.g., DABP), replace $\tfrac{T}{K} M$ with $T M$, which explains the higher per-iteration compute/memory footprint relative to our interval/restart-level controller.
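As a rough worked instance of the controller term, plugging in the settings reported in Section 5 ($R = 20$, $T = 1000$, $K = 20$) and assuming every restart runs the full $T$ iterations gives

$$\frac{T}{K} = \frac{1000}{20} = 50 \ \text{calls per restart}, \qquad R \cdot \frac{T}{K} = 20 \cdot 50 = 1000 \ \text{calls in total},$$

whereas per-iteration inference would need $T = 1000$ calls per restart and $R \cdot T = 20{,}000$ calls in total. Under these assumptions the interval-level controller is invoked $K = 20$ times less often; the BP terms $T(d^{b-1} + e + d)$ are unchanged.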

5. Empirical Evaluations

In this section, we present a comprehensive empirical study on the effectiveness of our proposed FDBP. We commence by providing insights into the experimental setup and implementation details. Subsequently, we showcase the superiority of FDBP over existing state-of-the-art methods.
Benchmarks and Baselines. We evaluate on four canonical benchmarks: random COPs, scale-free networks, small-world networks, and Weighted Graph Coloring Problems (WGCPs) [7]. For random COPs and WGCPs, constraint edges are sampled i.i.d. with graph density $p_1 \in (0, 1]$. Scale-free instances are generated using the Barabási–Albert (BA) model with parameters $m_0, m_1 \in \mathbb{Z}^{+}$. Small-world instances follow the Newman–Watts–Strogatz construction with $k \in \mathbb{Z}^{+}$ and $p \in [0, 1]$. See Cohen et al. [21] for further details on problem instance generation.
We benchmark FDBP against state-of-the-art solvers for COPs: (1) Toulbar2 with a timeout of 1200 s (7200 s for large-scale problems); (2) Mini-bucket Elimination (MBE) with an i-bound of 8 (i-bound of 7 for specific random COPs); (3) GAT-PCM-LNS with a destroy probability of 0.2; (4) DBP with a damping factor of 0.9 and a splitting ratio of 0.95; and (5) DABP with different restarts.
All experiments are conducted on a server equipped with an Intel(R) Xeon(R) Gold 6148 CPU (Intel Corporation, Santa Clara, CA, USA), NVIDIA GeForce RTX 3090 GPUs (NVIDIA Corporation, Santa Clara, CA, USA), and 125 GB memory. The reported results represent the best solution cost for each run, and for each experiment, the results are averaged over 100 random problem instances. For larger-scale problems, a 2-hour runtime limit is imposed on all methods.
Implementation. Our FDBP model first applies a learned linear projection to a batch of BP messages to obtain 8-dimensional embeddings. These embeddings are then processed by a stack of four Graph Attention Network (GAT) layers. Each GAT layer produces eight feature channels using four attention heads. In line with DBP and DABP, we operate on a Splitting Constraint Factor Graph (SCFG) with a splitting ratio of 0.95. The implementation is built in PyTorch Geometric and trained with the Adam optimizer, using a learning rate of $10^{-3}$ and a weight decay of $5 \times 10^{-5}$. For each problem instance, we allocate $R = 20$ independent restarts and cap message passing at $T = 1000$ iterations per restart. The restart budget and iteration cap were selected via a small grid search on random COPs with $|X| = 100$, sweeping $R \in \{5, 10, 20, 30\}$ and $T \in \{500, 1000, 1500\}$; the best trade-off between accuracy and efficiency was achieved at $R = 20$ and $T = 1000$, with only minor gains beyond $R = 20$. Dynamic weights and damping factors are re-estimated from the current BP messages every $K = 20$ iterations for most settings; for random COPs with $|X| = 300$, the update schedule is tightened to every $K = 10$ iterations. We provide a list of parameters used in Appendix B.
Performance Comparison. Table 1 and Table 2 compare the performance of various methods across four standard benchmarks. Table 1 focuses on smaller problem instances, while Table 2 addresses larger problem instances.
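Table 1. Comparison of methods using normalised cost and relative gap. Cost is per constraint ($|F|$). Gap is $(\text{Cost} - \text{Cost}_{\min}) / \text{Cost}_{\min}$, with lower values preferred. Best results are highlighted in both bold and blue. OOM means out of memory.

Random COPs ($p_1 = 0.25$)
| Methods | Cost ($\lvert X\rvert = 60$) | Gap | Time | Cost ($\lvert X\rvert = 80$) | Gap | Time | Cost ($\lvert X\rvert = 100$) | Gap | Time |
|---|---|---|---|---|---|---|---|---|---|
| Toulbar2 | 29.17 | 7.53% | 20m | 32.32 | 7.75% | 20m | 34.31 | 7.19% | 20m |
| MBE | 32.42 | 19.51% | 21s | 34.94 | 16.50% | 41s | 36.86 | 15.14% | 59s |
| GAT-PCM-LNS | 28.03 | 3.31% | 4m35s | 30.81 | 2.72% | 10m9s | 32.78 | 2.41% | 19m23s |
| DBP | 27.62 | 1.80% | 38s | 30.41 | 1.39% | 1m30s | 32.45 | 1.37% | 2m45s |
| DABP ($R = 5$) | 27.21 | 0.28% | 57s | 30.10 | 0.35% | 1m8s | 32.09 | 0.25% | 1m25s |
| DABP ($R = 10$) | 27.17 | 0.14% | 1m53s | 30.05 | 0.19% | 2m15s | 32.04 | 0.09% | 2m56s |
| DABP ($R = 20$) | 27.13 | 0.00% | 3m36s | 29.99 | 0.00% | 4m23s | 32.01 | 0.00% | 5m44s |
| FDBP ($R = 5$) | 27.24 | 0.42% | 17s | 30.13 | 0.45% | 29s | 32.08 | 0.23% | 53s |
| FDBP ($R = 10$) | 27.22 | 0.32% | 32s | 30.08 | 0.29% | 55s | 32.05 | 0.14% | 1m48s |
| FDBP ($R = 20$) | 27.17 | 0.14% | 1m1s | 30.06 | 0.21% | 1m43s | 32.02 | 0.03% | 3m42s |

WGCPs ($p_1 = 0.25$)
| Methods | Cost ($\lvert X\rvert = 60$) | Gap | Time | Cost ($\lvert X\rvert = 80$) | Gap | Time | Cost ($\lvert X\rvert = 100$) | Gap | Time |
|---|---|---|---|---|---|---|---|---|---|
| Toulbar2 | 0.19 | 0.00% | 20m | 1.19 | 36.70% | 20m | 2.09 | 41.14% | 20m |
| MBE | 2.04 | 1001.03% | 0s | 2.86 | 227.35% | 0s | 3.50 | 136.26% | 0s |
| GAT-PCM-LNS | 0.52 | 182.02% | 44s | 1.17 | 34.10% | 2m5s | 1.78 | 19.84% | 3m28s |
| DBP | 0.42 | 126.52% | 17s | 1.06 | 21.08% | 1m12s | 1.68 | 13.72% | 2m55s |
| DABP ($R = 5$) | 0.32 | 70.94% | 1m43s | 0.90 | 2.49% | 2m53s | 1.59 | 7.55% | 5m38s |
| DABP ($R = 10$) | 0.30 | 63.29% | 3m25s | 0.88 | 0.95% | 5m35s | 1.49 | 0.80% | 11m10s |
| DABP ($R = 20$) | 0.29 | 55.07% | 6m35s | 0.87 | 0.00% | 11m8s | 1.48 | 0.00% | 22m17s |
| FDBP ($R = 5$) | 0.33 | 79.84% | 31s | 0.90 | 3.23% | 1m | 1.51 | 1.65% | 2m56s |
| FDBP ($R = 10$) | 0.32 | 73.69% | 1m3s | 0.89 | 1.47% | 1m59s | 1.50 | 0.96% | 5m35s |
| FDBP ($R = 20$) | 0.31 | 69.37% | 2m1s | 0.88 | 0.47% | 3m57s | 1.49 | 0.32% | 11m9s |

Scale-free networks ($m_0 = m_1 = 10$)
| Methods | Cost ($\lvert X\rvert = 60$) | Gap | Time | Cost ($\lvert X\rvert = 80$) | Gap | Time | Cost ($\lvert X\rvert = 100$) | Gap | Time |
|---|---|---|---|---|---|---|---|---|---|
| Toulbar2 | 30.90 | 7.21% | 20m | 31.64 | 8.01% | 20m | 32.34 | 9.24% | 20m |
| MBE | 33.97 | 17.87% | 22s | 34.55 | 17.95% | 31s | 34.84 | 17.69% | 40s |
| GAT-PCM-LNS | 29.72 | 3.11% | 5m28s | 30.41 | 3.82% | 8m48s | 31.10 | 5.06% | 13m25s |
| DBP | 29.28 | 1.60% | 41s | 29.79 | 1.70% | 1m | 30.03 | 1.43% | 1m29s |
| DABP ($R = 5$) | 28.90 | 0.29% | 1m1s | 29.39 | 0.33% | 1m3s | 29.69 | 0.30% | 1m5s |
| DABP ($R = 10$) | 28.86 | 0.12% | 1m58s | 29.34 | 0.16% | 2m4s | 29.63 | 0.11% | 2m6s |
| DABP ($R = 20$) | 28.82 | 0.00% | 3m50s | 29.29 | 0.00% | 4m17s | 29.60 | 0.00% | 4m12s |
| FDBP ($R = 5$) | 28.97 | 0.52% | 14s | 29.45 | 0.53% | 19s | 29.78 | 0.61% | 23s |
| FDBP ($R = 10$) | 28.92 | 0.36% | 28s | 29.42 | 0.46% | 36s | 29.74 | 0.47% | 48s |
| FDBP ($R = 20$) | 28.90 | 0.29% | 54s | 29.39 | 0.34% | 1m14s | 29.70 | 0.35% | 1m34s |

Small-world networks ($k = 10$, $p = 0.3$)
| Methods | Cost ($\lvert X\rvert = 60$) | Gap | Time | Cost ($\lvert X\rvert = 80$) | Gap | Time | Cost ($\lvert X\rvert = 100$) | Gap | Time |
|---|---|---|---|---|---|---|---|---|---|
| Toulbar2 | 28.51 | 11.01% | 20m | 28.80 | 12.37% | 20m | 28.37 | 10.67% | 20m |
| MBE | 29.46 | 14.69% | 16s | 29.40 | 14.73% | 22s | 29.31 | 14.32% | 28s |
| GAT-PCM-LNS | 26.49 | 3.13% | 3m24s | 26.43 | 3.14% | 5m6s | 26.26 | 2.45% | 7m19s |
| DBP | 25.87 | 0.70% | 35s | 25.88 | 0.99% | 56s | 25.68 | 0.16% | 1m17s |
| DABP ($R = 5$) | 25.77 | 0.33% | 1m52s | 25.71 | 0.34% | 1m39s | 25.73 | 0.35% | 2m |
| DABP ($R = 10$) | 25.72 | 0.13% | 3m42s | 25.67 | 0.16% | 3m18s | 25.67 | 0.12% | 3m57s |
| DABP ($R = 20$) | 25.69 | 0.00% | 7m18s | 25.63 | 0.00% | 6m33s | 25.64 | 0.00% | 7m57s |
| FDBP ($R = 5$) | 25.81 | 0.49% | 20s | 25.75 | 0.47% | 31s | 25.75 | 0.46% | 41s |
| FDBP ($R = 10$) | 25.76 | 0.27% | 40s | 25.68 | 0.22% | 1m4s | 25.71 | 0.28% | 1m19s |
| FDBP ($R = 20$) | 25.72 | 0.11% | 1m18s | 25.63 | 0.02% | 2m5s | 25.67 | 0.14% | 2m34s |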
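Table 2. Comparison of methods using normalised cost and relative gap. Cost is per constraint ($|F|$). Gap is $(\text{Cost} - \text{Cost}_{\min}) / \text{Cost}_{\min}$, with lower values preferred. Best results are highlighted in both bold and blue. OOM means out of memory.

Random COPs ($p_1 = 0.25$)
| Methods | Cost ($\lvert X\rvert = 150$) | Gap | Time | Cost ($\lvert X\rvert = 200$) | Gap | Time | Cost ($\lvert X\rvert = 250$) | Gap | Time | Cost ($\lvert X\rvert = 300$) | Gap | Time |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Toulbar2 | 37.16 | 5.49% | 2h | 38.92 | 4.69% | 2h | 40.14 | 4.27% | 2h | 41.05 | 3.90% | 2h |
| MBE | 39.48 | 12.057% | 2m32s | 41.00 | 10.29% | 4m40s | 41.98 | 9.03% | 27s | 42.71 | 8.09% | 40s |
| GAT-PCM-LNS | 35.86 | 1.81% | 1h5m | 37.86 | 1.84% | 2h | 39.32 | 2.13% | 2h | 40.39 | 2.22% | 2h |
| DBP | 35.56 | 0.96% | 9m50s | 37.50 | 0.87% | 22m38s | 38.76 | 0.67% | 45m43s | 39.73 | 0.57% | 1h90m |
| DABP ($R = 5$) | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM |
| DABP ($R = 10$) | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM |
| DABP ($R = 20$) | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM |
| FDBP ($R = 5$) | 35.30 | 0.21% | 2m7s | 37.23 | 0.14% | 3m53s | 38.54 | 0.12% | 4m25s | 39.55 | 0.10% | 5m10s |
| FDBP ($R = 10$) | 35.26 | 0.10% | 4m19s | 37.20 | 0.07% | 7m42s | 38.52 | 0.06% | 8m44s | 39.53 | 0.06% | 10m24s |
| FDBP ($R = 20$) | 35.23 | 0.00% | 8m47s | 37.18 | 0.00% | 15m11s | 38.50 | 0.00% | 17m17s | 39.51 | 0.00% | 20m54s |

WGCPs ($p_1 = 0.25$)
| Methods | Cost ($\lvert X\rvert = 150$) | Gap | Time | Cost ($\lvert X\rvert = 200$) | Gap | Time | Cost ($\lvert X\rvert = 250$) | Gap | Time | Cost ($\lvert X\rvert = 300$) | Gap | Time |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Toulbar2 | 3.43 | 25.83% | 20m | 4.25 | 17.89% | 2h | 4.88 | 14.73% | 2h | 5.36 | 11.61% | 2h |
| MBE | 4.55 | 66.84% | 1s | 5.20 | 44.15% | 1s | 5.70 | 34.07% | 2s | 6.06 | 26.22% | 3s |
| GAT-PCM-LNS | 2.97 | 9.04% | 10m39s | 3.78 | 4.70% | 22m1s | 4.37 | 2.84% | 41m49s | 4.81 | 0.31% | 1h8m |
| DBP | 3.00 | 10.08% | 11m42s | 4.01 | 11.05% | 27m54s | 4.71 | 10.79% | 53m48s | 5.24 | 9.14% | 1h42m |
| DABP ($R = 5$) | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM |
| DABP ($R = 10$) | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM |
| DABP ($R = 20$) | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM |
| FDBP ($R = 5$) | 2.79 | 2.53% | 5m48s | 3.70 | 2.42% | 11m43s | 4.38 | 3.00% | 18m54s | 4.99 | 3.89% | 24m57s |
| FDBP ($R = 10$) | 2.75 | 0.78% | 11m32s | 3.64 | 1.01% | 23m17s | 4.31 | 1.22% | 38m9s | 4.89 | 1.94% | 50m |
| FDBP ($R = 20$) | 2.73 | 0.00% | 23m18s | 3.61 | 0.00% | 46m15s | 4.25 | 0.00% | 1h16m | 4.80 | 0.00% | 1h40m |

Scale-free networks ($m_0 = m_1 = 10$)
| Methods | Cost ($\lvert X\rvert = 150$) | Gap | Time | Cost ($\lvert X\rvert = 200$) | Gap | Time | Cost ($\lvert X\rvert = 250$) | Gap | Time | Cost ($\lvert X\rvert = 300$) | Gap | Time |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Toulbar2 | 33.09 | 10.27% | 20m | 33.07 | 9.12% | 2h | 33.18 | 8.81% | 2h | 33.45 | 9.50% | 2h |
| MBE | 35.26 | 17.48% | 1m3s | 35.40 | 16.79% | 1m25s | 35.51 | 16.46% | 1m51s | 35.65 | 16.71% | 2m18s |
| GAT-PCM-LNS | 32.32 | 7.68% | 30m46s | 33.27 | 9.76% | 53m15s | 33.79 | 10.82% | 1h20m | 34.26 | 12.14% | 1h54m |
| DBP | 30.36 | 1.17% | 2m34s | 30.59 | 0.93% | 4m9s | 30.69 | 0.65% | 5m15s | 30.78 | 0.74% | 6m29s |
| DABP ($R = 5$) | 30.08 | 0.23% | 1m14s | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM |
| DABP ($R = 10$) | 30.04 | 0.11% | 2m23s | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM |
| DABP ($R = 20$) | 30.01 | 0.00% | 4m41s | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM |
| FDBP ($R = 5$) | 30.14 | 0.43% | 36s | 30.38 | 0.23% | 36s | 30.53 | 0.13% | 47s | 30.59 | 0.12% | 47s |
| FDBP ($R = 10$) | 30.11 | 0.32% | 1m8s | 30.33 | 0.08% | 1m10s | 30.52 | 0.08% | 1m33s | 30.57 | 0.06% | 1m34s |
| FDBP ($R = 20$) | 30.08 | 0.22% | 2m10s | 30.31 | 0.00% | 2m21s | 30.49 | 0.00% | 3m4s | 30.55 | 0.00% | 3m10s |

Small-world networks ($k = 10$, $p = 0.3$)
| Methods | Cost ($\lvert X\rvert = 150$) | Gap | Time | Cost ($\lvert X\rvert = 200$) | Gap | Time | Cost ($\lvert X\rvert = 250$) | Gap | Time | Cost ($\lvert X\rvert = 300$) | Gap | Time |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Toulbar2 | 28.16 | 12.66% | 20m | 28.77 | 12.13% | 20m | 27.95 | 8.80% | 2h | 27.74 | 8.06% | 2h |
| MBE | 28.47 | 13.90% | 42s | 29.29 | 14.17% | 55s | 29.87 | 16.26% | 1m9s | 29.63 | 15.43% | 1m23s |
| GAT-PCM-LNS | 25.66 | 2.65% | 14m41s | 26.32 | 2.60% | 22m50s | 26.91 | 4.74% | 32m56s | 26.65 | 3.84% | 45m3s |
| DBP | 25.00 | 0.00% | 2m49s | 25.66 | 0.00% | 4m24s | 26.18 | 1.89% | 6m31s | 25.95 | 1.10% | 8m20s |
| DABP ($R = 5$) | 25.73 | 2.92% | 3m5s | 25.74 | 0.31% | 3m50s | 25.76 | 0.28% | 4m52s | 25.73 | 0.24% | 5m40s |
| DABP ($R = 10$) | 25.69 | 2.77% | 6m6s | 25.69 | 0.14% | 7m36s | 25.73 | 0.15% | 9m59s | 25.69 | 0.09% | 11m27s |
| DABP ($R = 20$) | 25.65 | 2.60% | 12m28s | 25.66 | 0.01% | 14m54s | 25.69 | 0.00% | 19m57s | 25.67 | 0.00% | 22m56s |
| FDBP ($R = 5$) | 25.75 | 3.01% | 1m4s | 25.77 | 0.42% | 1m44s | 25.79 | 0.39% | 3m5s | 25.77 | 0.41% | 2m55s |
| FDBP ($R = 10$) | 25.70 | 2.83% | 2m7s | 25.72 | 0.22% | 3m35s | 25.75 | 0.22% | 6m1s | 25.74 | 0.29% | 5m49s |
| FDBP ($R = 20$) | 25.67 | 2.69% | 4m18s | 25.68 | 0.08% | 7m6s | 25.72 | 0.12% | 11m49s | 25.71 | 0.15% | 11m22s |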
Even with generous time budgets (20 min and 2 h), Toulbar2 lags behind on most benchmarks. This outcome is consistent with its largely systematic search strategy: as instance size grows, the effective branching factor renders global exploration impractical, even with pruning and heuristics. By contrast, MBE often achieves shorter runtimes; however, memory limits force small bucket sizes and weaker relaxations, so its solution quality is substantially lower than the stronger competitors.
The results further highlight the difficulty of obtaining certifiably optimal labels for DNN-based BP variants using an exact solver. GAT-PCM-LNS, which combines iterative large neighborhood search with a machine-learned repair policy, often delivers higher-quality solutions than Toulbar2 and MBE. This advantage comes at a cost: wall-clock time is markedly longer because each LNS iteration destroys a subset of variables and requires multiple rounds of model inference to repair and reassign them.
Due to constraint function splitting, DBP frequently outperforms Toulbar2, MBE, and GAT-PCM-LNS. Additionally, DBP exhibits shorter runtime compared to most other methods. By seamlessly integrating BP and deep learning models to infer dynamic weights and damping factors, DABP achieves better solutions than DBP within an acceptable increase in time. However, it faces limitations in solving larger-scale problems due to GPU memory constraints.
Our proposed FDBP demonstrates significant superiority across various benchmarks:
  • Computational Efficiency: FDBP achieves an average speedup of 2.87× over DABP at equivalent restart counts ($R$). For 100-variable random COPs, FDBP ($R = 20$) completes in 3m42s vs. DABP’s 5m44s (35% less time) with only a 0.03% cost gap.
  • Scalability Advantage: DABP runs out of memory from 150 variables onward on random COPs and WGCPs and from 200 variables onward on scale-free networks, while FDBP successfully handles 300-variable instances (Table 2). On 300-variable WGCPs, FDBP ($R = 20$) achieves the best normalized cost (4.80) in 1h40m versus DBP’s 5.24 (9.14% gap) in 1h42m.
  • Performance-Efficiency Tradeoff: While DABP ($R = 20$) maintains a 0.00% gap on smaller instances, FDBP ($R = 20$) shows marginally higher gaps (0.03–0.35%) but with 35–82% runtime reductions. This tradeoff becomes favorable for FDBP at scale: it solves 300-variable scale-free networks in 3m10s (0.00% gap), where DABP fails.
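The runtime figures quoted above can be reproduced from the table entries with a few lines of Python. The helper names `to_seconds` and `runtime_reduction` are hypothetical; the durations are copied from Table 1 (random COPs with 100 variables and small-world networks with 60 variables, both at $R = 20$).

```python
import re


def to_seconds(text):
    """Parse table-style durations such as '3m42s', '20m', '1h40m', or '59s'."""
    parts = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", text)
    if parts is None or not any(parts.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    h, m, s = (int(g) if g else 0 for g in parts.groups())
    return 3600 * h + 60 * m + s


def runtime_reduction(fast, slow):
    """Fraction of the slower runtime that is saved by the faster method."""
    return 1.0 - to_seconds(fast) / to_seconds(slow)


# FDBP vs. DABP at R = 20, durations taken from Table 1.
random_cops_100 = runtime_reduction("3m42s", "5m44s")
small_world_60 = runtime_reduction("1m18s", "7m18s")
print(f"{random_cops_100:.1%}", f"{small_world_60:.1%}")
```

These two entries mark the ends of the 35–82% range of runtime reductions cited for Table 1.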
DABP integrates GRUs and multi-head GAT to infer dynamic weights and damping at each BP iteration, adding per-iteration compute and memory; FDBP avoids such per-iteration neural passes (one lightweight controller call per update interval/restart), trading a small gap on small instances for substantial speed and lower peak memory. This design choice explains why FDBP is 35–82% faster and stays within 15.2 GB at $|X| = 300$, while DABP hits OOM at $|X| \ge 150$ (see Table A1). Note that we report peak GPU memory usage in Appendix A.
Anytime Performance Analysis. We further compare the solution quality of the different methods over time for problem instances with $|X| = 80$ in Figure 2. We have the following key insights:
Figure 2. Solution quality versus time. Shaded regions depict 95% confidence intervals. Note that MBE is an elimination-based approximator: it first selects an elimination ordering and performs mini-bucket partitioning, then executes a backward elimination pass to compute bounds/messages, and finally runs a forward decoding pass to reconstruct a concrete assignment. A valid objective is available only after this decode step, so the cost–time curve for MBE starts once the first assignment is produced.
  • FDBP consistently outperforms all other methods across all problem types, demonstrating the fastest convergence to the optimal or near-optimal solution with the lowest normalized cost. This suggests that FDBP is the most efficient algorithm overall.
  • DABP and DBP generally show slower but steady improvement, with DABP typically outperforming DBP.
  • GAT-PCM-LNS and Toulbar2 progress more slowly and are less effective in terms of both speed and solution quality than FDBP, DABP, and DBP.
  • MBE consistently reaches a plateau early but at a much higher cost than the other methods, indicating that while it converges faster, it is less effective at finding low-cost solutions.
Overall, Figure 2 highlights the superiority of FDBP in terms of both speed and solution quality across all problem instances. The other algorithms, while effective to some extent, generally converge more slowly and reach higher costs, with MBE being the least competitive.
Ablation Study. In loopy BP, the messages at step $t$ summarize the influence of steps $0, \dots, t-1$. Conditioning the inference network for restart $r$ on $\{ m_{x \to f}^{t-1}, m_{f \to x}^{t-1} \}^{r-1}$ therefore provides near-sufficient context for predicting $\lambda^{t}$ and $w^{t}$ without constructing an intra-iteration autoregressive encoder. This preserves the standard BP updates while avoiding sequential neural calls, which reduces latency and memory. To confirm this, we also implemented an intra-iteration variant (FDBP-AR). On $|X| \in \{80, 100\}$, FDBP-AR attains a similar cost/gap but requires about 2× the runtime and 15–25% higher peak GPU memory (see Table 3).
Table 3. Comparison of FDBP and FDBP-AR on random COPs ( | X | = 100 , R = 10 , K = 20 ). FDBP: cost 32.05 , gap 0.14 % , time 108 s, peak 6.3 GB. FDBP-AR: cost 32.03 , gap 0.08 % , time 235 s, peak 7.6 GB. Similar trends are observed at | X | = 80 .
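The control flow discussed in the ablation (standard damped BP updates, with the damping factor refreshed only once every K steps rather than at every iteration) can be illustrated with a small sketch. This is not the authors' implementation: it assumes pairwise cost tables, min-sum updates, a single scalar damping factor, and a damping convention in which the factor weights the old message. `stub_controller` is a hypothetical placeholder for the learned GAT-based inference network, and restarts, dynamic weights, and self-supervised training are omitted.

```python
import itertools
import numpy as np


def stub_controller(q, r):
    """Hypothetical stand-in for the learned controller: fixed damping factor."""
    return 0.5


def damped_min_sum(n_vars, edges, costs, T=100, K=20, controller=stub_controller):
    """Damped min-sum BP on a pairwise factor graph.

    q[e, s] is the variable-to-factor message from endpoint s of edge e,
    r[e, s] is the factor-to-variable message to endpoint s of edge e.
    The controller is queried once every K steps, not at every iteration.
    """
    d = costs[0].shape[0]
    q = np.zeros((len(edges), 2, d))
    r = np.zeros((len(edges), 2, d))
    lam, calls = 0.0, 0

    def beliefs():
        b = np.zeros((n_vars, d))
        for e, (i, j) in enumerate(edges):
            b[i] += r[e, 0]
            b[j] += r[e, 1]
        return b

    for t in range(T):
        if t % K == 0:                      # periodic controller call
            lam = controller(q, r)
            calls += 1
        b = beliefs()
        q_new = np.empty_like(q)
        for e, (i, j) in enumerate(edges):  # sum of all *other* incoming messages
            q_new[e, 0] = b[i] - r[e, 0]
            q_new[e, 1] = b[j] - r[e, 1]
        q_new -= q_new.min(axis=2, keepdims=True)   # normalize for stability
        q = lam * q + (1.0 - lam) * q_new           # damped update (assumed convention)
        r_new = np.empty_like(r)
        for e in range(len(edges)):
            r_new[e, 0] = (costs[e] + q[e, 1][None, :]).min(axis=1)
            r_new[e, 1] = (costs[e] + q[e, 0][:, None]).min(axis=0)
        r = lam * r + (1.0 - lam) * r_new
    return beliefs().argmin(axis=1), calls


def total_cost(assign, edges, costs):
    return sum(costs[e][assign[i], assign[j]] for e, (i, j) in enumerate(edges))


# Tiny chain-structured example (a tree, so min-sum is exact here).
rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2), (2, 3)]
costs = [rng.random((3, 3)) for _ in edges]
assign, calls = damped_min_sum(4, edges, costs)
best = min(total_cost(a, edges, costs)
           for a in itertools.product(range(3), repeat=4))
print("controller calls:", calls)
print("BP cost:", total_cost(assign, edges, costs), "brute-force optimum:", best)
```

With T = 100 and K = 20 the controller is queried five times, whereas a per-iteration scheme would query it 100 times; this is the source of the latency and memory savings the ablation attributes to avoiding sequential neural calls.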
Sensitivity Analysis. We report a sensitivity study in Table 4. For the update interval K on | X | = 100 with R = 10 , we observe a clear accuracy–efficiency trade-off: K = 5 yields cost 32.02 , gap 0.06 % , and 160 s; K = 10 yields 32.04 , 0.10 % , and 121 s; K = 20 yields 32.05 , 0.14 % , and 108 s; K = 40 yields 32.13 , 0.37 % , and 83 s. This suggests K = 20 as a good balance. For damping, the range [ 0.8 , 1.0 ] is stable and fast, whereas [ 0.7 , 1.0 ] occasionally oscillates on WGCPs. For restarts, returns diminish beyond R = 20 ; for | X | = 100 , increasing from R = 10 to R = 20 increases the time from 108 s to 222 s for only a 0.11 % gap improvement. Finally, for GAT capacity, a smaller configuration 2 × 4 × 8 is 14 % faster but slightly less accurate, a larger configuration 6 × 8 × 16 is 38 % slower but slightly more accurate, and the default 4 × 4 × 8 provides the best accuracy–efficiency balance.
Table 4. Sensitivity summary on random COPs ( | X | = 100 , R = 10 ).
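To make the trade-off in the update interval concrete, the sketch below tabulates the K sweep reported in the text and computes, for each increase in K, the seconds saved per percentage point of gap lost. The bound on controller calls assumes one call per K steps over the default T = 1000 steps with no early termination; this is an assumption for illustration, and the function names are hypothetical.

```python
import math

# (K, cost, gap in %, time in s) for |X| = 100, R = 10, copied from the text.
SWEEP = [(5, 32.02, 0.06, 160), (10, 32.04, 0.10, 121),
         (20, 32.05, 0.14, 108), (40, 32.13, 0.37, 83)]
T_MAX = 1000  # default maximum BP steps per restart (Appendix B)


def controller_calls(t_max: int, k: int) -> int:
    """Upper bound on controller calls per restart (assumes no early stop)."""
    return math.ceil(t_max / k)


def marginal_tradeoffs(sweep):
    """For consecutive settings: (K_from, K_to, seconds saved, gap lost, ratio)."""
    out = []
    for (k0, _, g0, s0), (k1, _, g1, s1) in zip(sweep, sweep[1:]):
        saved, lost = s0 - s1, g1 - g0
        out.append((k0, k1, saved, lost, saved / lost))
    return out


for k, _, gap, secs in SWEEP:
    print(f"K={k:>2}: gap {gap:.2f}%, {secs} s, "
          f"<= {controller_calls(T_MAX, k)} controller calls per restart")
for k0, k1, saved, lost, rate in marginal_tradeoffs(SWEEP):
    print(f"K {k0}->{k1}: saves {saved} s, loses {lost:.2f} gap points "
          f"({rate:.0f} s per point)")
```

On these numbers, each further increase in K buys less time per point of gap lost (roughly 975, 325, and 109 s per percentage point), which is consistent with treating K = 20 as the balance point.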

6. Conclusions

This work addresses critical limitations in modern BP methods for COPs. While DBP improved convergence through manual damping heuristics, and DABP automated these via deep learning, both suffered from fundamental bottlenecks: DABP’s reliance on sequential RNNs for damping prediction introduced prohibitive memory overheads and limited scalability, while DBP’s manual tuning proved impractical for large-scale applications. We propose FDBP, a scalable BP framework that fundamentally rethinks how damping factors are learned. By decoupling damping inference from BP iterations and replacing RNNs with periodic message aggregation (every K steps), FDBP achieves the following:
  • Memory Efficiency: Eliminates RNN hidden state storage, significantly reducing GPU memory usage compared to DABP, allowing scalability to 300-variable COPs, where DABP fails due to OOM issues.
  • Computational Efficiency: Parallelizable message processing reduces runtime by an average of 65% at equivalent restart counts ( R ) , solving 100-variable COPs in 3m42s compared to DABP’s 5m44s, with only a 0.03% loss in solution quality.
  • Practical Optimality: Maintains near-DABP performance (0.00–0.35% gaps) on small instances while outperforming all baselines on large-scale problems (e.g., 300-variable WGCPs: 4.80 normalized cost vs. DBP’s 5.24).
Empirical results across synthetic and topology-driven benchmarks confirm that FDBP’s design choices successfully balance the accuracy–efficiency tradeoff.
Limitations and future work. This work focuses on static, single-agent COPs; time-varying structures and decentralized coordination are out of scope. In future work, we will extend the framework to dynamic COPs and multi-agent systems by adding online message updates, warm starts, and communication-efficient, decentralized controllers, enabling more adaptive and distributed problem solving.
Broader implications. This work advances applied mathematics, computing, and machine learning by coupling model-based belief propagation with a lightweight learned controller, which preserves interpretability while improving efficiency. The framework is GPU-friendly and memory-efficient, aligning with high-performance computing for AI. It supports uncertainty-aware decisions via belief estimates and applies broadly to discrete-optimization tasks in science and engineering, including scheduling, routing, network design, and error-correcting codes. This strengthens applied machine learning and AI for scientific discovery under fixed hardware budgets.

Author Contributions

Conceptualization, S.K.; Methodology, S.K. and C.L.; Software, F.C.; Validation, Z.W.; Formal analysis, Z.W.; Writing—original draft, S.K. and C.L.; Supervision, S.K. and C.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (Category C; Grant No. 62506090) and the Humanities and Social Sciences Youth Foundation of the Ministry of Education of the People’s Republic of China (Grant No. 21YJC870009).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Peak GPU Memory Reporting and Supervised Baseline Comparison

We report peak GPU memory on a single 24 GB GPU. DABP runs out of memory for | X | ≥ 150 , while FDBP remains within 15.2 GB at | X | = 300 . The results are detailed in Table A1, with the following conventions: (i) OOM indicates allocation failure at the 24 GB device limit; “pre-OOM” is the last successfully observed peak before failure. (ii) Peaks were recorded with the CUDA memory counters of PyTorch 2.3.1 (CUDA 12.1), calling reset_peak_memory_stats() between runs; this captures the true per-process peak rather than instantaneous usage. (iii) Values may vary slightly (±0.2 GB) across runs due to allocator fragmentation; the table reports representative peaks under the stated controls.
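The measurement protocol in item (ii) can be sketched as follows. This is an illustrative wrapper rather than the authors' script: `peak_reserved_gb` and the `run` callable are hypothetical names, the conversion uses 1 GB = 1024³ bytes as an assumption, and the function returns None when PyTorch or a CUDA device is not available.

```python
from typing import Callable, Optional, Tuple

try:
    import torch
except ImportError:  # lets the sketch load on machines without PyTorch
    torch = None


def peak_reserved_gb(run: Callable[[], None]) -> Optional[Tuple[float, bool]]:
    """Run `run()` once and return (peak reserved GPU memory in GB, hit_oom).

    Returns None if PyTorch or CUDA is unavailable. The peak counter is reset
    before the run so that earlier runs do not contaminate the measurement.
    """
    if torch is None or not torch.cuda.is_available():
        return None
    torch.cuda.synchronize()
    torch.cuda.reset_peak_memory_stats()
    hit_oom = False
    try:
        run()
    except torch.cuda.OutOfMemoryError:
        hit_oom = True  # the counter then holds the last peak before failure
    torch.cuda.synchronize()
    peak_bytes = torch.cuda.max_memory_reserved()
    return peak_bytes / 1024**3, hit_oom


result = peak_reserved_gb(lambda: None)  # placeholder workload
print("peak (GB), OOM:", result)
```

Reading the reserved rather than the allocated counter matches the "peak-reservation counter" wording of the Table A1 caption; `torch.cuda.max_memory_allocated()` would instead report the memory actually occupied by tensors.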
Table A1. Peak GPU memory (GB) by problem family and problem size on a single 24 GB GPU. Identical settings across methods: float32, batch size 1, restart budget R = 10 , no activation checkpointing or mixed precision. Peak measured via the PyTorch CUDA peak-reservation counter with per-run resets. CPU-only baselines (e.g., Toulbar2, MBE) are omitted here as GPU peak is N/A; see Table 1 and Table 2 for cost/gap/time.
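| Problem family | Method | \|X\| = 60 | \|X\| = 80 | \|X\| = 100 | \|X\| = 150 | \|X\| = 200 | \|X\| = 250 | \|X\| = 300 |
|---|---|---|---|---|---|---|---|---|
| Random COPs ( p 1 = 0.25 ) | DBP | 2.6 | 3.1 | 3.6 | 6.2 | 8.8 | 11.4 | 13.6 |
| Random COPs ( p 1 = 0.25 ) | DABP ( R = 10 ) | 5.8 | 6.7 | 8.2 | OOM (pre-OOM ∼22.8) | OOM | OOM | OOM |
| Random COPs ( p 1 = 0.25 ) | FDBP ( R = 10 ) | 3.3 | 3.9 | 4.6 | 9.7 | 11.8 | 13.6 | 15.2 |
| WGCPs ( p 1 = 0.25 ) | DBP | 2.7 | 3.2 | 3.7 | 6.4 | 9.0 | 11.6 | 13.8 |
| WGCPs ( p 1 = 0.25 ) | DABP ( R = 10 ) | 5.9 | 6.9 | 8.4 | OOM (pre-OOM ∼22.9) | OOM | OOM | OOM |
| WGCPs ( p 1 = 0.25 ) | FDBP ( R = 10 ) | 3.4 | 4.0 | 4.7 | 9.8 | 12.0 | 13.8 | 15.2 |
| Scale-free networks ( m 0 = m 1 = 10 ) | DBP | 2.8 | 3.3 | 3.9 | 6.6 | 9.2 | 11.8 | 14.0 |
| Scale-free networks ( m 0 = m 1 = 10 ) | DABP ( R = 10 ) | 6.1 | 7.1 | 8.6 | OOM (pre-OOM ∼23.1) | OOM | OOM | OOM |
| Scale-free networks ( m 0 = m 1 = 10 ) | FDBP ( R = 10 ) | 3.5 | 4.1 | 4.9 | 10.1 | 12.2 | 14.0 | 15.2 |
| Small-world networks ( k = 10 , p = 0.3 ) | DBP | 2.6 | 3.1 | 3.6 | 6.2 | 8.7 | 11.3 | 13.5 |
| Small-world networks ( k = 10 , p = 0.3 ) | DABP ( R = 10 ) | 5.7 | 6.6 | 8.1 | OOM (pre-OOM ∼22.7) | OOM | OOM | OOM |
| Small-world networks ( k = 10 , p = 0.3 ) | FDBP ( R = 10 ) | 3.2 | 3.8 | 4.5 | 9.6 | 11.7 | 13.5 | 15.1 |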
We also compare FDBP to two well-known supervised baselines, FGNN and BPNN, on random COPs.
Table A2. Supervised baseline comparison (random COPs; lower is better).
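| Method | Cost (\|X\| = 60) | Gap (%) (\|X\| = 60) | Time (s) (\|X\| = 60) | Cost (\|X\| = 80) | Gap (%) (\|X\| = 80) | Time (s) (\|X\| = 80) |
|---|---|---|---|---|---|---|
| FGNN | 27.30 | 0.63 | 96 | 30.17 | 0.60 | 178 |
| BPNN | 27.26 | 0.48 | 82 | 30.12 | 0.43 | 151 |
| FDBP ( R = 10 ) | 27.22 | 0.32 | 55 | 30.08 | 0.29 | 64 |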

Appendix B. Empirical Evaluation Parameters

Table A3. List of parameters used in our experiments.
| Parameter | Default/Range |
|---|---|
| Restarts R | 20 (sweep: 5, 10, 20, 30) |
| Max steps T | 1000 (sweep: 500, 1000, 1500) |
| Update interval K | 20 (sweep: 5, 10, 20, 40) |
| Damping λ | [0.8, 1.0] (variants: [0.7, 1.0], [0.9, 1.0]) |
| GAT layers/channels/heads | 4/8/4 (variants: 2/4/8, 6/16/8) |
| Hardware | Single GPU (24 GB); single CPU host |
| Datasets | Random COPs, Small-world, Scale-free, WGCPs |
| Metrics | Cost, Gap (%), Time (s), Peak GPU (GB), OOM |

Appendix C. Acronyms and Abbreviations

Table A4. Acronyms and abbreviations used throughout the paper.
| Abbrev. | Definition |
|---|---|
| Algorithms and models | |
| BP | Belief Propagation |
| DBP | Damped Belief Propagation |
| DABP | Deep Attentive Belief Propagation |
| FDBP | Fast Deep Belief Propagation |
| MBE | Mini-Bucket Elimination |
| MPLP | Max-Product Linear Programming |
| NEBP | Neural-Enhanced Belief Propagation |
| BPNN | Belief Propagation Neural Network |
| FGNN | Factor Graph Neural Network |
| Neural components/learning | |
| GNN | Graph Neural Network |
| GAT | Graph Attention Network |
| GRU | Gated Recurrent Unit |
| Adam | Adaptive Moment Estimation optimizer |
| SCFG | Splitting Constraint Factor Graph (splitting ratio used for factor duplication) |
| Problems/graph families | |
| COP | Constraint Optimization Problem |
| WGCP | Weighted Graph Coloring Problem |
| Scale-free | Barabási–Albert-Type Networks ( m 0 = m 1 = 10 in our setup) |
| Small-world | Watts–Strogatz-Type Networks ( k = 10 , p = 0.3 in our setup) |
| Metrics/reporting | |
| Gap (%) | Relative cost gap to the best result (lower is better) |
| Peak GPU (GB) | Peak per-process GPU memory in gigabytes |
| OOM | Out of memory (allocation failure on the GPU) |
| Common symbols/schedules | |
| R | Number of restarts |
| T | Maximum BP iterations per restart |
| K | Update interval for inferring dynamic weights/damping |

References

  1. Pearl, J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference; Morgan Kaufmann: Burlington, MA, USA, 1988.
  2. Kschischang, F.R.; Frey, B.J.; Loeliger, H.A. Factor graphs and the sum-product algorithm. IEEE Trans. Inf. Theory 2001, 47, 498–519.
  3. Yedidia, J.S.; Freeman, W.T.; Weiss, Y. Constructing free-energy approximations and generalized belief propagation algorithms. IEEE Trans. Inf. Theory 2005, 51, 2282–2312.
  4. Aji, S.M.; McEliece, R.J. The generalized distributive law. IEEE Trans. Inf. Theory 2000, 46, 325–343.
  5. MacKay, D.J. Information Theory, Inference and Learning Algorithms; Cambridge University Press: Cambridge, UK, 2003.
  6. Modi, P.J.; Shen, W.M.; Tambe, M.; Yokoo, M. ADOPT: Asynchronous distributed constraint optimization with quality guarantees. Artif. Intell. 2005, 161, 149–180.
  7. Rossi, F.; van Beek, P.; Walsh, T. Handbook of Constraint Programming; Elsevier Science Inc.: Amsterdam, The Netherlands, 2006.
  8. Kong, S.; Lee, J.H.; Li, S. A deterministic distributed algorithm for reasoning with connected row-convex constraints. In Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, São Paulo, Brazil, 8–12 May 2017; pp. 203–211.
  9. Kong, S.; Lee, J.H.; Li, S. Multiagent simple temporal problem: The Arc-consistency approach. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018.
  10. Bertsekas, D.P. Constrained Optimization and Lagrange Multiplier Methods; Academic Press: New York, NY, USA, 2014.
  11. Farinelli, A.; Rogers, A.; Petcu, A.; Jennings, N.R. Decentralised coordination of low-power embedded devices using the Max-sum algorithm. In Proceedings of the AAMAS, Estoril, Portugal, 3–7 May 2008; pp. 639–646.
  12. Wainwright, M.J.; Jaakkola, T.S.; Willsky, A.S. Tree-reweighted belief propagation algorithms and approximate ML estimation by pseudo-moment matching. In Proceedings of the AISTATS, Key West, FL, USA, 3–6 January 2003.
  13. Globerson, A.; Jaakkola, T. Fixing max-product: Convergent message passing algorithms for MAP LP-relaxations. In Proceedings of the NeurIPS, Vancouver, BC, Canada, 3–6 December 2007.
  14. Dechter, R. Bucket elimination: A unifying framework for reasoning. Artif. Intell. 1999, 113, 41–85.
  15. Dechter, R.; Rish, I. Mini-buckets: A general scheme for bounded inference. J. ACM 2003, 50, 107–153.
  16. Rogers, A.; Farinelli, A.; Stranders, R.; Jennings, N.R. Bounded approximate decentralised coordination via the max-sum algorithm. Artif. Intell. 2011, 175, 730–759.
  17. Rollon, E.; Larrosa, J. Improved bounded max-sum for distributed constraint optimization. In Proceedings of the International Conference on Principles and Practice of Constraint Programming, Québec City, QC, Canada, 8–12 October 2012; pp. 624–632.
  18. Rollon, E.; Larrosa, J. Decomposing utility functions in bounded max-sum for distributed constraint optimization. In Proceedings of the International Conference on Principles and Practice of Constraint Programming, Lyon, France, 8–12 September 2014; Springer International Publishing: Cham, Switzerland, 2014; pp. 646–654.
  19. Chen, Z.; Deng, Y.; Wu, T.; He, Z. A class of iterative refined Max-sum algorithms via non-consecutive value propagation strategies. Auton. Agents Multi-Agent Syst. 2018, 32, 822–860.
  20. Zivan, R.; Parash, T.; Cohen, L.; Peled, H.; Okamoto, S. Balancing exploration and exploitation in incomplete min/max-sum inference for distributed constraint optimization. Auton. Agents Multi-Agent Syst. 2017, 31, 1165–1207.
  21. Cohen, A.; Galiki, R.; Meisels, A.; Grunitzki, R.; Zivan, R. Governing convergence of Max-sum on DCOPs through damping and splitting. Artif. Intell. 2020, 279, 103212.
  22. Kuck, J.; Chakraborty, S.; Tang, H.; Luo, R.; Song, J.; Sabharwal, A.; Ermon, S. Belief Propagation Neural Networks. Adv. Neural Inf. Process. Syst. 2020, 33, 667–678.
  23. García Satorras, V.; Hoogeboom, E.; Welling, M. Neural Enhanced Belief Propagation on Factor Graphs. In Proceedings of the International Conference on Artificial Intelligence and Statistics, Valencia, Spain, 13–15 April 2021.
  24. Zhang, Z.; Dupty, M.H.; Wu, F.; Shi, J.Q.; Lee, W.S. Factor Graph Neural Networks. J. Mach. Learn. Res. 2023, 24, 1–54.
  25. Cappart, Q.; Chételat, D.; Khalil, E.B.; Lodi, A.; Morris, C.; Veličković, P. Combinatorial Optimization and Reasoning with Graph Neural Networks. J. Mach. Learn. Res. 2023, 24, 1–61.
  26. Sun, Y.; Yang, Q. DIFUSCO: Graph-Based Diffusion Solvers for Combinatorial Optimization. Adv. Neural Inf. Process. Syst. 2023, 36, 3706–3731.
  27. Li, Y.; Guo, J.; Wang, R.; Zha, H.; Yan, J. Fast T2T: Consistency Models for Faster Discrete Combinatorial Optimization. Adv. Neural Inf. Process. Syst. 2024, 37, 30179–30206.
  28. Yu, J.; Han, Y.; Xu, M.; Chen, S.; Gu, S. DISCO: Diffusion for Large-Scale Combinatorial Optimization. arXiv 2024, arXiv:2406.19705.
  29. Zhang, X.; Chen, Z.; Cai, S. Revisiting Restarts of CDCL: Should the Search Information be Preserved? arXiv 2024, arXiv:2404.16387.
  30. Li, C.; Liu, C.; Chung, J.; Lu, Z.; Jha, P.; Ganesh, V. A Reinforcement Learning Based Reset Policy for CDCL SAT Solvers. arXiv 2024, arXiv:2404.03753.
  31. Yau, M.; Karalias, N.; Lu, E.; Xu, J.; Jegelka, S. Are Graph Neural Networks Optimal Approximation Algorithms? Adv. Neural Inf. Process. Syst. 2024, 37, 73124–73181.
  32. Lawler, E.L.; Wood, D.E. Branch-and-bound methods: A survey. Oper. Res. 1966, 14, 699–719.
  33. Deng, Y.; Kong, S.; Liu, C.; An, B. Deep Attentive Belief Propagation: Integrating Reasoning and Learning for Solving Constraint Optimization Problems. In Proceedings of the NeurIPS, New Orleans, LA, USA, 28 November–9 December 2022.
  34. Gomes, C.P.; Selman, B.; Kautz, H. Boosting combinatorial search through randomization. In Proceedings of the Fifteenth National/Tenth Conference on Artificial Intelligence/Innovative Applications of Artificial Intelligence, Madison, WI, USA, 27 July 1998; pp. 431–437.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
