1. Introduction
In very-large-scale integrated (VLSI) circuit design, placement is one of the most critical stages in determining overall chip performance, area, and routability [1]. In earlier integrated circuit technologies, standard cells were typically designed with a single-row height due to the relatively low design complexity. With continuous device scaling and the increasing diversity of circuit design objectives, modern standard-cell libraries commonly include cells with heterogeneous heights rather than a uniform row structure. For example, basic logic cells such as inverters and buffers are typically implemented as single-row-height cells, whereas complex functional units including flip-flops, multiplexers, and clock gates are often designed as multi-row-height cells to accommodate larger transistor stacks and more routing resources.
In VLSI design, placement is typically carried out in three sequential phases, namely, global placement, legalization, and detailed placement. During global placement, approximate cell positions are obtained by optimizing wire length and routability while allowing overlaps. In the legalization stage, overlaps are removed and cells are aligned to discrete rows and sites with minimal displacement. The detailed placement stage further refines the layout by locally adjusting cell ordering and spacing. Overall, placement determines the optimal positions and orientations of all standard cells under given constraints. In this paper, we focus on the legalization stage.
Although mixed-cell-height standard cells offer significant benefits in terms of design flexibility and area efficiency, their introduction substantially increases the complexity of legalization. Unlike the single-row-height case, cells with varying heights span multiple placement rows, introducing intricate geometric constraints and inter-row interactions. Heterogeneous cell structures and power-rail compatibility constraints make the legalization problem harder still. For more details, see [2,3,4,5] and the references therein.
Heuristic algorithms and analytic algorithms are the two main categories of methods for the mixed-cell-height legalization problem. Among heuristic methods, Abacus [6] and Tetris [7] are two classical algorithms originally developed for single-row-height legalization. Subsequently, improved heuristic algorithms based on these two classical methods, such as Eh?Placer [8] and Jezz [9], were proposed. Although classical legalization algorithms perform well in uniform-height cases, they cannot be easily generalized to mixed-height configurations. This is because, in single-row-height cases, cell overlaps can be resolved independently in each row, whereas in mixed-cell-height cases, adjusting a cell in one row may introduce new overlaps in other rows. To address these challenges, several enhanced heuristic algorithms have been developed for the mixed-cell-height case [10,11,12,13]. Since the objective of the legalization problem is to minimize the total displacement, it can be formulated as a network flow, integer linear programming, or quadratic programming (QP) model [1,14,15,16], enabling analytic methods to efficiently obtain feasible solutions.
With proper preprocessing and relaxation, the mixed-cell-height legalization problem can be transformed into a QP problem. Using the Karush–Kuhn–Tucker (KKT) optimality framework [17], the QP problem can be equivalently reformulated as a linear complementarity problem (LCP), denoted as LCP($q$, $A$). Specifically, given $q \in \mathbb{R}^{n}$ and $A \in \mathbb{R}^{n \times n}$, the objective is to find vectors $z, w \in \mathbb{R}^{n}$ such that
$$w = Az + q \ge 0, \quad z \ge 0, \quad z^{\top} w = 0. \tag{1}$$
The mixed-cell-height legalization problem can be addressed using the modulus-based matrix splitting (MMS) iteration scheme [18], which has been shown to be effective under certain assumptions [1]. Based on this scheme, a robust MMS method and several accelerated variants have been proposed [3,19,20]. In addition, the LCP can be reformulated as an absolute value equation (AVE), enabling the construction of efficient iterative schemes by exploiting the structure of the system matrix together with matrix splitting techniques [21,22]. Building upon the MMS method and the AVE framework, further legalization problems with technical, regional, and abutment constraints have been extensively studied [4,5,23,24]. However, the classical convergence theory of MMS-type algorithms relies on the system matrix $A$ being symmetric positive definite (PD) or an $H_{+}$-matrix. For mixed-cell-height legalization problems, the resulting system matrix is generally nonsymmetric positive semidefinite (PSD), which does not satisfy the aforementioned assumptions. Consequently, the theoretical convergence guarantees of existing LCP-based algorithms may not hold when they are applied directly.
In fact, the LCP (1) is equivalent to a variational inequality (VI) problem defined as follows: for the function $F(x) = Ax + q$, find a vector $x^{*}$ in the closed convex set $\Omega = \mathbb{R}^{n}_{+}$ such that
$$\langle F(x^{*}),\, x - x^{*} \rangle \ge 0, \quad \forall x \in \Omega.$$
Compared with the LCP formulation, the VI framework provides a more flexible theoretical setting for algorithm design. Notably, the existence of a solution to the proposed variational inequality is guaranteed under standard monotonicity and convexity assumptions, while uniqueness holds under strong monotonicity conditions [25]. Furthermore, a variety of effective iterative algorithms have been developed for the cases where $F$ is Lipschitz continuous and either strongly monotone or monotone (corresponding to $A$ being positive definite or positive semidefinite, respectively). It has been widely observed that projection-based algorithms are particularly efficient when the closed convex set is fairly simple and the projection is relatively easy to compute. Representative projection-based algorithms include the extragradient method [26], the projection contraction method [27], and the prediction–correction method [28]. However, the convergence rate and practical performance of projection-based methods are highly sensitive to the choice of step size. Fixed step-size strategies often fail to balance convergence speed and stability, especially when dealing with ill-conditioned or nonsymmetric systems.
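The link between the LCP and the VI over the nonnegative orthant can be checked numerically through the so-called natural residual: a point solves the problem exactly when it is a fixed point of the projected step. The following is a minimal sketch, assuming the affine operator F(x) = Ax + q and the nonnegative orthant as the feasible set; the 2×2 matrix, the vector q, and all function names are illustrative assumptions and are not taken from the benchmarks.

```python
def affine_map(A, q, x):
    """Evaluate F(x) = A x + q for a dense matrix given as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) + qi for row, qi in zip(A, q)]


def natural_residual(A, q, x):
    """Infinity norm of x - P(x - F(x)), with P the projection onto x >= 0."""
    Fx = affine_map(A, q, x)
    projected = [max(0.0, xi - fi) for xi, fi in zip(x, Fx)]
    return max(abs(xi - pi) for xi, pi in zip(x, projected))


def is_lcp_solution(A, q, x, tol=1e-12):
    """Check x >= 0, w = A x + q >= 0 and x^T w = 0 up to a tolerance."""
    w = affine_map(A, q, x)
    nonneg = all(xi >= -tol for xi in x) and all(wi >= -tol for wi in w)
    complementary = abs(sum(xi * wi for xi, wi in zip(x, w))) <= tol
    return nonneg and complementary


# Illustrative data (assumption): A is nonsymmetric and its symmetric
# part diag(2, 0) is positive semidefinite.
A = [[2.0, 1.0], [-1.0, 0.0]]
q = [-2.0, 1.0]

solution = [1.0, 0.0]      # w = A x + q = (0, 0)
non_solution = [0.0, 0.0]  # w = q = (-2, 1) violates w >= 0

print(is_lcp_solution(A, q, solution), natural_residual(A, q, solution))          # True 0.0
print(is_lcp_solution(A, q, non_solution), natural_residual(A, q, non_solution))  # False 2.0
```

The residual vanishes at the LCP solution and is positive elsewhere, which is why projection-type methods can use it as a stopping measure.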
In this paper, the mixed-cell-height legalization problem is reformulated as a VI problem. Under this formulation, the feasible region can be characterized as a nonempty closed convex set, which enables the construction of projection-type algorithms under mild assumptions on the associated operator. To efficiently solve the resulting VI problem, an existing self-adaptive inertial projection and contraction algorithm (SIPCA) is first adopted as a baseline. Building upon this framework, an improved SIPCA (SIPCA_IP) is developed by incorporating inertial acceleration and a two-step strategy based on the subgradient extragradient technique. The convergence properties of the proposed method are theoretically analyzed, and the adaptive scheme enhances convergence stability and computational efficiency. Furthermore, a lightweight Tetris-like refinement step is employed to eliminate residual overlaps after legalization. The proposed method demonstrates strong performance in solving large-scale mixed-cell-height legalization problems. The main contributions of this paper can be summarized as follows:
First, the mixed-cell-height legalization problem is reformulated as a VI, enabling efficient treatment of the LCP with a nonsymmetric positive semidefinite system matrix. The VI framework provides a flexible theoretical foundation for subsequent algorithm design.
Second, an improved algorithm, termed SIPCA_IP, is developed by incorporating an adaptive step-size scheme and a two-step iteration strategy, thereby enhancing both convergence stability and computational efficiency. Moreover, a rigorous convergence analysis is provided to establish the theoretical guarantees of the proposed method.
Third, a lightweight Tetris-like refinement strategy, adopted from existing legalization techniques, is incorporated as a postprocessing step to eliminate residual overlaps while preserving displacement quality.
Finally, numerical experiments demonstrate that SIPCA_IP outperforms the baseline SIPCA in terms of convergence speed and iterations. Moreover, comparisons with three state-of-the-art methods in terms of overlap and total displacement further confirm its superior legalization accuracy and significant improvements in placement quality.
The remainder of this paper is organized as follows: In Section 2, the mathematical model is established and subsequently reformulated as a VI. Section 3 details the baseline SIPCA and SIPCA_IP, along with an overview of the proposed framework. The experimental settings and corresponding results on seven benchmark cases are detailed in Section 4. Section 5 concludes the paper and discusses future research.
3. Self-Adaptive Inertial Projection and Contraction Algorithm and Its Improvement
3.1. Baseline Self-Adaptive Inertial Projection and Contraction Algorithm (SIPCA)
A classical iterative algorithm for the VI problem [29] is given as follows:
$$x_{k+1} = P_{\Omega}\bigl(x_k - \lambda F(x_k)\bigr),$$
where $P_{\Omega}$ stands for the orthogonal projection onto the feasible set $\Omega$, with $\lambda > 0$ being a fixed step size. It has been proved that this method converges when $F$ is strongly monotone and Lipschitz continuous. However, when the assumption is weakened to monotonicity, the algorithm may diverge [30]. To weaken the requirement of strong monotonicity, ref. [26] introduced the extragradient method, a two-step method with the following iteration:
$$y_k = P_{\Omega}\bigl(x_k - \lambda F(x_k)\bigr), \qquad x_{k+1} = P_{\Omega}\bigl(x_k - \lambda F(y_k)\bigr), \tag{8}$$
where $\lambda \in (0, 1/L)$, with $L$ denoting the Lipschitz constant of $F$.
is generated accordingly to satisfy
Based on the extragradient method, a new projection and contraction algorithm is proposed in [
31], which can be described as follows:
where
and
is a relaxation parameter. Since first-order algorithms, particularly gradient-type methods, often suffer from slow convergence, various acceleration techniques have been developed. One typical method is the inertial technique, which updates each iterate by incorporating information from the two preceding iterates. Under the assumptions that
F is monotone and Lipschitz continuous with a constant
L, an inertial projection and contraction algorithm [
32] is proposed:
with
and
where
,
. The sequence
is nondecreasing, with
and satisfying
, and
such that
It has been proved that, for
, the sequence
generated by (
10) converges weakly to a solution of
. In practical applications,
L is usually hard to estimate. To overcome this limitation, a self-adaptive scheme incorporating the inertial technique, termed SIPCA (Algorithm 1), was proposed in [
30]. Instead of using a fixed value
, the proposed method employs a backtracking procedure to adaptively compute an appropriate step size
:
where
,
,
,
,
is chosen as the minimal nonnegative integer ensuring that
satisfying
,
and
where
Algorithm 1 SIPCA [30]
 1: Input: ; , , , , , , , tolerance .
 2: ,
 3: while true do
 4:
 5:   if then
 6:     return
 7:   end if
 8:
 9:   repeat
10:
11:     if then break
12:
13:   until condition holds
14:
15:   if then else
16:
17:   if then
18:
19:   else
20:
21:   end if
22:
23: end while
In SIPCA, the Lipschitz continuity requirement is removed, and the only assumption is that F is continuous. Lines 17–21 prevent the step size from becoming too small, thereby improving the computational efficiency. This adaptive rule enables the algorithm to automatically enlarge the step size when the residual decreases rapidly and to reduce it when instability is detected, thus maintaining a favorable balance between convergence speed and robustness.
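To make the interplay of inertia, backtracking, and the contraction step concrete, the following is a minimal sketch of a generic inertial projection and contraction iteration with an Armijo-type step-size search. It is not a transcription of Algorithm 1: the acceptance rule λ‖F(w) − F(y)‖ ≤ μ‖w − y‖ is the rule commonly used in the literature and is assumed here, the step-size enlargement of Lines 17–21 is not reproduced (the search simply restarts from `gamma` in every iteration), and the parameter names `gamma`, `ell`, `mu`, `theta`, `alpha`, as well as the 2×2 test problem, are illustrative assumptions.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def inertial_pc(F, project, x0, gamma=1.0, ell=0.5, mu=0.5,
                theta=1.5, alpha=0.2, tol=1e-10, max_iter=1000):
    """Generic inertial projection-contraction sketch with backtracking."""
    x_prev, x = list(x0), list(x0)
    for k in range(1, max_iter + 1):
        # Inertial extrapolation from the two most recent iterates.
        w = [xi + alpha * (xi - pi) for xi, pi in zip(x, x_prev)]
        Fw = F(w)
        lam = gamma
        while True:
            # Trial projection step and the acceptance test (assumed rule).
            y = project([wi - lam * fi for wi, fi in zip(w, Fw)])
            r = [wi - yi for wi, yi in zip(w, y)]
            dF = [a - b for a, b in zip(Fw, F(y))]
            if lam * norm(dF) <= mu * norm(r):
                break
            lam *= ell  # shrink the step and try again
        if norm(r) <= tol:
            return y, k
        # Contraction direction and its optimal scaling.
        d = [ri - lam * gi for ri, gi in zip(r, dF)]
        beta = dot(r, d) / dot(d, d)
        x_prev, x = x, [wi - theta * beta * di for wi, di in zip(w, d)]
    return x, max_iter


# Illustrative nonsymmetric test problem (assumption): F(x) = A x + q with
# A = [[2, 1], [-1, 2]], q = (-2, 3); its solution over x >= 0 is (1, 0).
def F(x):
    return [2.0 * x[0] + x[1] - 2.0, -x[0] + 2.0 * x[1] + 3.0]


def project_nonneg(u):
    return [max(0.0, ui) for ui in u]


sol, iters = inertial_pc(F, project_nonneg, [0.0, 0.0])
print(sol, iters)
```

Because no Lipschitz constant is passed in, the step size is discovered by the inner loop; this is the feature that a self-adaptive scheme relies on when the constant is hard to estimate.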
3.2. Improved Self-Adaptive Inertial Projection and Contraction Algorithm
Owing to its simple iterative form, the extragradient method (8) has attracted considerable attention in recent years, and numerous variants have been developed to enhance its performance. The projection and contraction algorithm (10) is one of its important extensions, and its classical form [27] can be described as follows:
where
,
is either chosen from
or adaptively selected as a sequence
, and
Compared with the classical extragradient method (8), where the same step size is used in both projections, Algorithm (12) employs two different step sizes. This difference contributes to the superior computational efficiency of the projection and contraction algorithm relative to the extragradient method. On the other hand, the extragradient method involves two orthogonal projections onto the feasible set per iteration. As a result, when the projection onto this set has no simple closed form, a minimum-distance problem must be solved twice to obtain the next iterate, potentially reducing efficiency and applicability. To address this issue, the subgradient extragradient method [33] replaces the second projection with an easily computable subgradient projection (a projection onto a half-space), leading to the following iterative scheme:
where
and
or the sequence
is generated adaptively according to
,
,
. The integer
denotes the smallest nonnegative value for which
As discussed above, both step size-based extragradient methods and subgradient extragradient methods play important roles in influencing the convergence behavior of two-step algorithms. However, subgradient extragradient methods, as gradient-type methods, often exhibit relatively lower convergence efficiency. Therefore, it is natural to ask whether step-size adjustment, subgradient extragradient strategies, and inertial techniques can be integrated to further improve the convergence performance of projection and contraction algorithms.
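The key idea of the subgradient extragradient method, replacing the second projection onto the feasible set by a closed-form projection onto a half-space, can be sketched as follows. This is a fixed-step illustration only, assuming the half-space T_k = {z : ⟨x_k − λF(x_k) − y_k, z − y_k⟩ ≤ 0} and a step size below 1/L; the function names and the 2×2 test problem are illustrative assumptions, and the adaptive step-size rule discussed above is not included.

```python
def project_nonneg(u):
    """Projection onto the nonnegative orthant."""
    return [max(0.0, ui) for ui in u]


def project_halfspace(u, a, y):
    """Closed-form projection of u onto T = {z : <a, z - y> <= 0}."""
    aa = sum(ai * ai for ai in a)
    viol = sum(ai * (ui - yi) for ai, ui, yi in zip(a, u, y))
    if aa == 0.0 or viol <= 0.0:
        return list(u)  # u already lies in the half-space
    return [ui - (viol / aa) * ai for ui, ai in zip(u, a)]


def subgradient_extragradient(F, x0, lam, iters):
    x = list(x0)
    for _ in range(iters):
        Fx = F(x)
        t = [xi - lam * fi for xi, fi in zip(x, Fx)]
        y = project_nonneg(t)                  # only projection onto the feasible set
        a = [ti - yi for ti, yi in zip(t, y)]  # normal vector of the half-space
        u = [xi - lam * fi for xi, fi in zip(x, F(y))]
        x = project_halfspace(u, a, y)         # cheap second projection
    return x


# Illustrative problem (assumption): F(x) = A x + q with A = [[2, 1], [-1, 2]],
# q = (-2, 3). Here L = ||A||_2 = sqrt(5), so lam = 0.2 satisfies lam < 1/L.
def F(x):
    return [2.0 * x[0] + x[1] - 2.0, -x[0] + 2.0 * x[1] + 3.0]


x = subgradient_extragradient(F, [0.0, 0.0], lam=0.2, iters=200)
print(x)  # approaches the solution (1, 0)
```

For the orthant used here the saving is negligible, since both projections are trivial; the construction matters when the projection onto the feasible set requires solving a minimum-distance problem.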
Motivated by the above observations, and in order to tackle the VI problem arising from large-scale mixed-cell-height circuit legalization, we develop an improved self-adaptive inertial projection and contraction algorithm, termed SIPCA_IP (Algorithm 2), which integrates the inertial technique with the subgradient extragradient method.
Algorithm 2 SIPCA_IP
 1: Input: ; , , , , , , , tolerance .
 2: ,
 3: while true do
 4:
 5:   if then
 6:     return
 7:   end if
 8:
 9:   repeat
10:
11:     if then break
12:
13:   until condition holds
14:
15:   if then else
16:
17:
18:   ▹
19:   if then
20:
21:   else
22:
23:   end if
24:
25: end while
Compared with SIPCA, the proposed SIPCA_IP introduces a key modification in the update step: Lines 16–18 replace Line 16 of the original algorithm. Instead of performing the original direct iterative update, SIPCA_IP employs a subgradient projection step, which provides a more stable search direction and enhances convergence efficiency.
3.3. Convergence Analysis
In this section, we investigate the convergence behavior of Algorithm 2. Let be a nonempty closed convex set, and let be monotone and Lipschitz continuous. We denote the solution set of by , which is assumed to be nonempty.
Theorem 1. Let be a nonempty closed convex set and be monotone and Lipschitz continuous. Moreover, the solution set is nonempty. Under the condition of Algorithm 2, let be generated bywhere Suppose that the following line-search condition holds for every k:and that the inertial parameters satisfy Then, is bounded, and Moreover, every cluster point of belongs to . In the case where has a unique solution, the sequence converges to the unique solution.
Proof. Due to space considerations, the detailed proof is provided in Appendix A. □
Remark 2. From the structures of W and in Remark 1, it follows that A is a constant matrix. Moreover, noting that and each row of W contains only two nonzero entries and while each block of is bounded, one can estimate that . Therefore, F is Lipschitz continuous with . Further, the matrix A is positive semidefinite, which implies that F is monotone. Consequently, the proposed algorithm is applicable to the VI considered in this work.
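The two properties invoked in Remark 2 each follow in one line. The derivation below is a sketch that assumes the operator has the affine form $F(x) = Ax + q$ (the standard operator associated with LCP($q$, $A$)); the use of the spectral norm and the symbols $x$, $y$ are notational assumptions.

```latex
\begin{aligned}
\|F(x) - F(y)\|_2 &= \|A(x - y)\|_2 \;\le\; \|A\|_2 \,\|x - y\|_2
  && \text{(Lipschitz continuity with } L = \|A\|_2\text{)}, \\
\langle F(x) - F(y),\, x - y \rangle &= (x - y)^{\top} A\,(x - y)
  = \tfrac{1}{2}\,(x - y)^{\top}\bigl(A + A^{\top}\bigr)(x - y) \;\ge\; 0
  && \text{(monotonicity)}.
\end{aligned}
```

The second line uses only the positive semidefiniteness of the symmetric part of $A$, so symmetry of $A$ itself is not required.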
3.4. Computational Complexity Analysis
In this subsection, we analyze the computational complexity of the proposed SIPCA_IP algorithm.
At the kth iteration, the extrapolation step only involves vector addition and scalar multiplication, and thus requires $O(n)$ operations, where $n$ is the dimension of the iterate. The main computational cost comes from the evaluation of the mapping F and the projection step. Specifically, one evaluation of F is needed at the extrapolated point, while an additional evaluation of F is required at the projected point. Therefore, each iteration computes the mapping F twice. In the proposed algorithm, F is induced by a sparse matrix–vector multiplication. Hence, each evaluation of F requires $O(\mathrm{nnz}(A))$ operations, where $\mathrm{nnz}(A)$ denotes the number of nonzero entries of the system matrix A of F. If the algorithm terminates after K iterations, the total complexity becomes $O(K \cdot \mathrm{nnz}(A))$. Since the dominant cost of SIPCA_IP is determined by sparse matrix–vector multiplications and simple projection operations, it is computationally efficient for large-scale sparse legalization problems.
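The claim that one evaluation of F costs a number of operations proportional to the number of nonzeros can be illustrated with a hand-written sparse matrix–vector product. The sketch below assumes the compressed sparse row (CSR) layout; the function name, the operation counter, and the small 3×3 matrix are illustrative assumptions and do not describe the authors' C++ implementation.

```python
def csr_matvec(indptr, indices, data, x):
    """Compute y = A x for A stored in CSR form and count multiply-adds."""
    y = [0.0] * (len(indptr) - 1)
    mults = 0
    for i in range(len(y)):
        acc = 0.0
        # Only the stored nonzeros of row i are visited.
        for p in range(indptr[i], indptr[i + 1]):
            acc += data[p] * x[indices[p]]
            mults += 1
        y[i] = acc
    return y, mults


# Illustrative matrix (assumption):
# [[ 2, 0, 1],
#  [ 0, 0, 0],
#  [-1, 0, 3]]  with nnz(A) = 4.
indptr = [0, 2, 2, 4]
indices = [0, 2, 0, 2]
data = [2.0, 1.0, -1.0, 3.0]

y, mults = csr_matvec(indptr, indices, data, [1.0, 2.0, 3.0])
print(y, mults)  # [5.0, 0.0, 8.0] 4
```

With two such products per iteration plus O(n) vector work, the per-iteration cost scales with nnz(A) rather than with n squared.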
Remark 3. It is worth noting that establishing an explicit convergence rate for the proposed SIPCA_IP method such as is technically challenging due to the incorporation of adaptive step-size strategies and inertial mechanisms. These components introduce additional nonlinearity into the iterative process, making standard convergence rate analysis difficult to apply directly. Therefore, the current work primarily focuses on establishing the convergence properties of the proposed method. The investigation of explicit convergence rates will be pursued in future work.
3.5. Legalization Framework
Figure 2 illustrates the overall framework for mixed-cell-height circuit legalization. The legalization stage begins with a global placement solution, where cell locations are estimated without considering overlaps. We first align each cell to the nearest feasible row while ignoring the right boundary constraints. Multi-row-height standard cells are then partitioned into single-row-height subcells. Next, the legalization task is formulated as a QP model and reformulated as a VI problem, which is solved by the SIPCA and SIPCA_IP algorithms. Because of limited numerical precision, overlaps may still occur after restoring the multi-row-height cells. These remaining overlaps are resolved using a Tetris-like allocation method [3].
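The role of the final refinement step can be illustrated with a simplified greedy sweep. The sketch below is not the allocation method of [3]: it is a hypothetical Tetris-like pass that keeps each cell in its assigned rows, snaps it to an integer site, and pushes it to the right until it clears everything already placed in the rows it spans. The cell fields (`name`, `x`, `width`, `row`, `height`) and the example data are illustrative assumptions, and the right boundary is ignored.

```python
def overlaps(p, q):
    """True if two placed cells share at least one row and one x-interval."""
    share_row = p["row"] < q["row"] + q["height"] and q["row"] < p["row"] + p["height"]
    share_x = p["x"] < q["x"] + q["width"] and q["x"] < p["x"] + p["width"]
    return share_row and share_x


def tetris_like_refine(cells, num_rows):
    """Greedy left-to-right sweep that removes residual overlaps."""
    frontier = [0] * num_rows  # first free site in each row
    legal = []
    for cell in sorted(cells, key=lambda c: c["x"]):
        rows = range(cell["row"], cell["row"] + cell["height"])
        # Snap to the nearest site, then clear all rows the cell spans.
        x = max(round(cell["x"]), max(frontier[r] for r in rows))
        for r in rows:
            frontier[r] = x + cell["width"]
        legal.append({**cell, "x": x})
    return legal


# Illustrative input (assumption): "b" is a double-row-height cell.
cells = [
    {"name": "a", "x": 0.0, "width": 2, "row": 0, "height": 1},
    {"name": "b", "x": 1.2, "width": 2, "row": 0, "height": 2},
    {"name": "c", "x": 2.9, "width": 1, "row": 1, "height": 1},
]

legal = tetris_like_refine(cells, num_rows=2)
print([(c["name"], c["x"]) for c in legal])  # [('a', 0), ('b', 2), ('c', 4)]
```

Because the solver output is already nearly overlap-free, a sweep of this kind only needs to move a few cells by small amounts, which is consistent with the small refinement runtimes reported in the experiments.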
4. Experimental Results and Discussion
This section presents numerical experiments to evaluate the convergence behavior and layout quality of the proposed SIPCA_IP in comparison with SIPCA and several representative methods, including the modulus-based method, the robust modulus-based method, and the Newton method, for mixed-cell-height circuit legalization problems. First, we compare SIPCA and SIPCA_IP in terms of convergence behavior and layout quality. Subsequently, under identical stopping criteria, both methods are further compared with the above representative algorithms. In addition, the robustness of the proposed method and its sensitivity to parameter settings are investigated.
Experiments are conducted on seven standard mixed-cell-height benchmarks from the ISPD 2015 Detailed Routing-Driven Placement Contest [34]. Since the original cell library does not contain multi-row-height cells, 10% of the cells are randomly selected, and their heights are doubled and their widths halved. These benchmarks are provided by the authors of [11] and have been widely used in studies on mixed-cell-height legalization.
Table 2 presents the cell statistics for these benchmarks. “T.Cell”, “S.Cell”, “D.Cell”, and “Dens.” correspond to the total cell count, single-row-height cell count, double-row-height cell count, and design density, respectively. “W.size” and “E.size” denote the dimensions of matrices W and E. The implementation is carried out in C++ using Microsoft Visual Studio Community 2022 (64 bit) version 17.11.4, and the experiments are executed on a machine featuring an Intel Core i5 processor with 32 GB RAM.
The efficiency of the proposed algorithm is evaluated from three perspectives: IT, CPU time, and
. Here, IT denotes the iteration number, CPU time records the running time in seconds, and
is defined by
, which is defined in the two algorithms. The parameters used in the experiments are selected according to the empirical settings reported in the existing literature [
1,
30], which have been shown to provide stable and efficient performance. The stopping tolerance is set to
, and the maximum number of iterations is set to
. For the experiments with increased proportions of multi-height cells,
is increased to 5000 to ensure sufficient convergence. The algorithms terminate when
or
is reached, with
. The detailed parameter configurations and implementation settings are provided in
Appendix B. Note that the stopping tolerance is set to
in the first two subsections. In the parameter sensitivity analysis (
Section 4.3), a stricter tolerance
is also considered to examine the influence of the stopping criterion.
4.1. Comparison Between the Proposed Algorithms
This subsection presents a comparison between the two proposed algorithms focusing on IT, CPU time, and . The “N.Avg” row reports the average normalized ratios of total runtime with respect to SIPCA_IP.
As summarized in Table 3, the improved SIPCA_IP algorithm consistently achieves comparable or higher accuracy with markedly fewer iterations and shorter CPU time. On average, the IT and CPU time of SIPCA are approximately 2.069× and 1.467× those of SIPCA_IP, respectively, confirming the superior convergence efficiency of SIPCA_IP.
To further evaluate the solution quality achieved by the two algorithms, we compare their overlaps and total displacement.
Table 4 presents the quantitative contribution of the Tetris-like refinement stage for both SIPCA and SIPCA_IP. The solver outputs before refinement, including the number of overlaps and displacement values, are reported together with the final displacement values obtained after refinement. The runtime of the refinement stage (denoted as R.Time) is also recorded separately for each benchmark instance.
After applying the Tetris-like refinement, all remaining overlaps are completely eliminated for all benchmark instances. Therefore, the overlap counts after refinement are not listed in the table. Moreover, the runtime of the refinement stage remains extremely small across all benchmark instances, typically ranging from 0.001 to 0.005 s. Meanwhile, the displacement values after refinement show only minor changes compared with those before refinement, indicating that the refinement primarily resolves residual overlaps while preserving displacement quality.
Overall, the solver itself accounts for most of the solution quality, while the Tetris-like refinement serves as an efficient postprocessing step. On average, the total displacement produced by SIPCA is 1.009× that of SIPCA_IP, while the number of overlaps generated by SIPCA is 1.434× that of SIPCA_IP. These results indicate that, under the same termination accuracy, SIPCA_IP achieves better placement quality than SIPCA on average.
To further evaluate the robustness of the proposed method under more challenging benchmark settings, we increase the proportion of double-height cells from 10% to 20%. The double-height cells are generated using a fixed random seed (seed = 1234). The statistics of the benchmark instances, including the numbers of single-height and double-height cells, as well as the corresponding matrix dimensions, are summarized in
Table 5.
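The benchmark modification described above (a fixed fraction of cells is selected at random, their heights are doubled, and their widths are halved) can be sketched as follows. This is an illustration only: the function name and the representation of a cell as a (width, height) pair are assumptions, and Python's random generator with seed 1234 will not reproduce the selection made by the authors' own generator.

```python
import random


def make_mixed_height(cells, fraction, seed):
    """Turn a fraction of single-row cells into double-row cells of equal area."""
    rng = random.Random(seed)  # fixed seed gives a reproducible selection
    count = round(fraction * len(cells))
    chosen = set(rng.sample(range(len(cells)), count))
    return [
        (w / 2.0, 2 * h) if i in chosen else (w, h)
        for i, (w, h) in enumerate(cells)
    ]


# Illustrative library (assumption): ten single-row cells of width 4.
cells = [(4.0, 1)] * 10
mixed = make_mixed_height(cells, fraction=0.2, seed=1234)

print(sum(1 for _, h in mixed if h == 2))  # 2 double-row-height cells
print(sum(w * h for w, h in mixed))        # total cell area is unchanged: 40.0
```

Halving the width while doubling the height keeps the total cell area, and hence the design density, unchanged, so the added difficulty comes from the inter-row coupling rather than from a denser layout.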
Compared with the original 10% setting, increasing the proportion of double-height cells significantly enlarges the constraint matrix and increases the complexity of the legalization problem. Due to the increased problem scale, the maximum iteration number is increased from 3000 to 5000 for both SIPCA and SIPCA_IP to ensure sufficient convergence, while the stopping tolerance remains unchanged. An iteration count of 5000 therefore indicates that the method reached the maximum iteration limit without satisfying the stopping criterion.
The convergence performance of SIPCA and SIPCA_IP under the 20% double-height-cell setting is presented in Table 6. As shown in the table, SIPCA reaches the maximum iteration limit on several benchmarks, such as des_perf_a, fft_a, and pci_bridge32_b, while SIPCA_IP converges on all tested benchmarks within significantly fewer iterations. Increasing the proportion of double-height cells to 20% thus makes the legalization problem considerably harder, as reflected by the enlarged matrix dimensions and slower convergence. Despite this increased complexity, SIPCA_IP maintains stable convergence, which confirms its robustness under more challenging mixed-cell-height cases.
4.2. Comparison with Existing Methods
In this subsection, we compare the total cell displacement of our proposed methods with that of three representative state-of-the-art legalization methods, namely, the modulus-based method [1], the robust modulus-based method [20], and the robust Newton method [4]. To ensure fair and controlled comparisons, all algorithms are implemented within the same legalization framework used in this study. Specifically, in the legalization flow shown in Figure 2, the VI formulation converted from the QP model and its corresponding solver are replaced by the respective baseline formulations and solution methods while all other procedures remain unchanged.
To ensure a consistent comparison environment, the same benchmark instances, stopping criteria, evaluation procedures, and hardware/software settings are applied to all methods. In particular, the termination conditions are unified across all methods, i.e., the iterations terminate when
or when the maximum number of iterations
is reached. All experiments are conducted on the same computing platform described in
Appendix B.
Table 7 presents the controlled comparison results obtained under unified experimental settings. From the results, it can be seen that the proposed SIPCA_IP method achieves the smallest or highly competitive displacement values after refinement on most benchmark instances, such as des_perf_b, fft_a, and fft_b. This demonstrates the effectiveness of the proposed method in improving legalization quality under identical experimental conditions. Furthermore, the computational time of SIPCA_IP remains comparable to or lower than that of several baseline methods on multiple benchmarks, indicating that the improved performance is achieved without introducing significant computational overhead.
To further evaluate the final legalization performance, the final displacement results of all methods are summarized in
Table 8. From
Table 8, SIPCA_IP achieves total displacement comparable to that of SIPCA while outperforming the modulus-based and robust Newton approaches by
and
, respectively. The total displacement reported in [
20] is
that of SIPCA_IP. These results show that SIPCA and SIPCA_IP achieve competitive performance compared with the existing approaches in terms of total displacement.
4.3. Sensitivity Analysis with Respect to the Relaxation Parameter and the Stopping Tolerance
Both SIPCA and SIPCA_IP involve several parameters whose values are chosen according to the empirical settings suggested in the existing literature [
30]. In this subsection, we investigate the sensitivity of the algorithm to two important parameters, namely, the relaxation parameter
and the stopping tolerance
.
Since the parameter
controls the relaxation step in the iterative process and may affect the convergence, we first analyze the influence of
. To provide a more comprehensive evaluation of the parameter sensitivity, the influence of the parameter
is investigated on all seven benchmark instances used in this study. Specifically, both SIPCA and SIPCA_IP are tested with
varying from
to
with a step size of
. For each value of
, the iteration numbers and CPU times obtained from all benchmarks are collected, and their average values are reported to reflect the overall performance trend. The corresponding results are illustrated in
Figure 3, where
Figure 3a shows the average iteration numbers versus
and
Figure 3b presents the average CPU time versus
.
From
Figure 3, it can be observed that both the average number of iterations and the CPU time of SIPCA and SIPCA_IP decrease significantly as
increases. Moreover, SIPCA_IP exhibits a faster reduction than SIPCA, indicating that
significantly affects convergence, with SIPCA_IP being more sensitive to its choice. Based on the above observations,
is adopted in all subsequent experiments, since it provides faster convergence and lower computational cost while maintaining stable performance. It should be noted that a theoretical analysis of the sensitivity of the parameter
, as well as the influence of other parameters on the performance of the two algorithms, deserves further investigation.
Next, we investigate the influence of the stopping tolerance, which is employed as the stopping criterion to terminate the iteration process. In general, a smaller tolerance leads to higher solution accuracy but may increase the computational cost. Therefore, experiments with different tolerance values are conducted to examine the trade-off between computational efficiency and solution accuracy.
As summarized in
Table 3 and
Table 9, tightening the termination tolerance from
and
leads to an increased number of IT across all test cases, indicating the additional computational effort required for higher precision. However, the extent of this increase varies between the two algorithms. The improved SIPCA_IP algorithm consistently attains comparable or higher accuracy with markedly fewer iterations and a shorter CPU time. When the tolerance is further tightened to
, both algorithms require more IT; nevertheless, SIPCA_IP maintains its advantage, exhibiting smaller increases in both IT and CPU time. Moreover, for the benchmarks des_perf_a, fft_a, and pci_bridge32_b, the baseline SIPCA fails to reach the required accuracy within the maximum iteration limit, whereas SIPCA_IP successfully satisfies the tolerance in all cases.
In addition, we compare the overlap counts and total displacement of the two algorithms under different stopping tolerances. The corresponding results are summarized in
Table 10. In addition to displacement values, the corresponding overlap counts and refinement runtimes are also reported for each stopping condition. This enables a further evaluation of the robustness of the Tetris-like refinement stage under varying termination criteria. It can be observed that all remaining overlap counts are completely eliminated after refinement across different stopping tolerances while the refinement runtime remains consistently small. Furthermore,
Table 4 and
Table 10 show that, as the stopping tolerance becomes stricter, both the overlap counts and total displacement decrease. Specifically, for SIPCA, the overlap count decreases by 18.18% and the total displacement by 0.08%; for SIPCA_IP, the overlap count decreases by 8.69% and the total displacement by 0.02%.
From the comparison under different stopping tolerances (
Figure 4 and
Figure 5), it can be seen that decreasing the stopping tolerance significantly increases iteration counts and CPU time for both algorithms. However, the impact of the stopping tolerance on legalization quality, measured by total displacement and overlap counts, is relatively small. This suggests that, in practice, the stopping tolerance can be moderately relaxed to achieve a better balance between computational efficiency and solution quality.
To further illustrate the relationship between computational cost and solution precision, a time–precision trade-off analysis is conducted. Specifically, the stopping tolerance
is varied from
to
. For each tolerance value, both SIPCA and SIPCA_IP are executed on all seven benchmarks. The average CPU time and iteration numbers over the seven benchmarks are computed and plotted as functions of
, as shown in
Figure 6.
From
Figure 6a, it can be observed that the computational time increases steadily as higher precision is required. Meanwhile,
Figure 6b shows that the iteration numbers also increase as the stopping tolerance decreases. In all cases, SIPCA_IP consistently requires fewer iterations and less computational time than SIPCA, demonstrating its superior efficiency under different precision requirements.
4.4. Discussion
The experimental results demonstrate that both SIPCA and SIPCA_IP achieve stable convergence, while SIPCA_IP exhibits clear advantages in convergence speed and overall performance. Compared with SIPCA, SIPCA_IP attains the prescribed accuracy with significantly fewer iterations and a shorter CPU time. In terms of displacement quality, SIPCA_IP produces smaller overlap counts and total displacement under identical termination conditions, leading to improved legalization results.
In addition, to investigate the impact of matrix size on algorithm performance, seven benchmark instances are considered, which are arranged in ascending order according to the matrix size, measured by the number of nonzero elements (nnz(A)). As shown in
Figure 7a, the overall CPU time tends to increase as the matrix size grows, indicating that the computational cost generally increases with problem scale.
Figure 7b presents the corresponding iteration numbers. It can be observed that the iteration numbers of SIPCA vary more significantly as the matrix size increases, suggesting that SIPCA is relatively sensitive to problem scale. In contrast, the iteration numbers of SIPCA_IP remain comparatively stable across different matrix sizes, indicating a weaker dependence of iteration counts on matrix size and thus demonstrating improved scalability over a range of problem scales.
Overall, the SIPCA_IP enhances convergence robustness, computational efficiency, and displacement quality simultaneously, providing a more reliable and scalable solution framework for large-scale problems. Although the proposed algorithm performs well on benchmark instances with matrix sizes up to , future work will involve further evaluation on datasets of the order of millions.
5. Conclusions and Outlook
This study transforms the mixed-cell-height legalization problem into a VI framework and addresses it using SIPCA. Inspired by the subgradient extragradient method, we further propose SIPCA_IP, which integrates adaptive step size and a two-step strategy to enhance convergence stability and computational efficiency. Extensive experiments demonstrate that SIPCA_IP achieves faster convergence, fewer iterations, and improved legalization quality, producing smaller overlap counts and total displacement compared with the baseline SIPCA. In addition, comparative experiments with representative baseline methods conducted under unified experimental settings and identical stopping tolerances demonstrate that SIPCA_IP achieves competitive or superior performance across all benchmark instances, confirming its effectiveness and robustness for large-scale mixed-cell-height legalization problems.
In future work, the proposed VI-based framework may be extended to incorporate additional design constraints, such as half-row-height and fence-region constraints. Such extensions would require modifying the feasible set to accommodate the additional placement restrictions, while the projection-based iterative structure of the algorithm would remain applicable. Moreover, due to the multiple algorithmic parameters involved in SIPCA and SIPCA_IP, integrating the proposed framework with advanced layout engines and machine learning-based parameter optimization strategies is expected to further enhance the adaptability and efficiency in practical VLSI design.