1. Introduction
Non-convergence is a persistent and practically relevant problem in structural equation modeling (SEM). When an estimation algorithm fails to converge, parameter estimates, standard errors, and model fit measures are unavailable, which prevents researchers from addressing their substantive research questions.
Empirically, non-convergence occurs disproportionately in small samples, where two distinct datasets of equal (small) sample size may show a contrasting convergence status even when the same SEM specification is fitted to both. This relationship is well documented [1,2], but is often summarized as insufficient information or increased sampling variability. While such explanations are correct, they do not fully capture the mechanisms through which small samples lead to estimation failure. In SEM, parameter estimation is usually based on the discrepancy between the sample covariance (VCOV) matrix S and its model-implied counterpart $\Sigma(\theta)$. In small samples, this discrepancy depends less on the exact model specification and more on the numerical properties of the VCOV matrix. Throughout the remainder of this paper, the term VCOV refers to the sample covariance matrix S, unless explicitly stated otherwise.
Limited sample sizes often produce VCOV matrices that are ill-conditioned (i.e., with a large spread in eigenvalues), nearly singular (i.e., with eigenvalues close to zero), or, for more general sample covariance/correlation matrices such as tetrachoric or polychoric correlation matrices or matrices based on pairwise deletion, even indefinite (i.e., containing negative eigenvalues). In the context of a VCOV matrix, eigenvalues indicate how much of the total variance is explained by independent linear combinations of variables. Large eigenvalues correspond to dominant directions of variation in the data, whereas eigenvalues close to zero signal strong linear dependencies or numerical instability. The condition number, defined as the ratio of the largest to the smallest eigenvalue, quantifies the numerical stability of the matrix: a large condition number indicates that the matrix is ill-conditioned and that small changes in the data may lead to large changes in parameter estimates. Extreme covariances, unstable eigenvalues, and inflated condition numbers are therefore not exceptional, but may occur more frequently when information is limited. These properties directly affect the computation of the discrepancy function, its gradient, and its Hessian. As a result, optimization algorithms may be unable to find a good solution, may get stuck near the boundary of the parameter space, or may terminate due to numerical instability. From this perspective, non-convergence should not be viewed solely as an optimization failure, but as an indication of problematic VCOV patterns caused by sampling variability, sparse information, or the way the covariance/correlation matrix was estimated.
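The three matrix conditions described above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `matrix_diagnostics` and the threshold `cond_limit` are illustrative choices and are not taken from the paper.

```python
import numpy as np

def matrix_diagnostics(R, cond_limit=1e3):
    """Smallest eigenvalue, condition number, and a coarse stability label.

    cond_limit is an illustrative threshold, not a value from the paper.
    """
    R = np.asarray(R, dtype=float)
    eig = np.linalg.eigvalsh((R + R.T) / 2.0)  # eigenvalues in ascending order
    lam_min, lam_max = float(eig[0]), float(eig[-1])
    # The ratio largest/smallest is only meaningful if all eigenvalues are > 0.
    cond = lam_max / lam_min if lam_min > 0 else float("inf")
    if lam_min < 0:
        label = "indefinite"
    elif cond > cond_limit:
        label = "ill-conditioned"
    else:
        label = "well-conditioned"
    return lam_min, cond, label

well = np.array([[1.0, 0.3], [0.3, 1.0]])
near_singular = np.array([[1.0, 0.9999], [0.9999, 1.0]])
indefinite = np.array([[1.0, 0.9, -0.9],
                       [0.9, 1.0, 0.9],
                       [-0.9, 0.9, 1.0]])

for name, M in [("well", well), ("near_singular", near_singular),
                ("indefinite", indefinite)]:
    print(name, matrix_diagnostics(M))
```

The third matrix has a unit diagonal and admissible pairwise correlations, yet its determinant is negative, which is the kind of internally inconsistent pattern that pairwise deletion or polychoric estimation can produce.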
A range of approaches has been proposed to address small-sample and numerically unstable SEM. Broadly, these can be divided into two categories. First, some approaches modify the input statistics by adjusting the sample VCOV matrix prior to estimation. A widely used strategy is shrinkage estimation [3,4,5,6,7,8], in which the sample VCOV matrix is combined with a structured, well-conditioned target so as to improve numerical stability and increase the likelihood of positive definiteness. Recent work has extended this idea to small-sample multilevel SEM by replacing the sample covariance matrix with a regularized shrinkage estimate [9]. Second, other approaches leave the input statistics unchanged but modify the estimation procedure or restrict the parameter space. These include regularized and penalized SEM [10,11,12], although recent simulations show that these methods themselves can suffer from non-convergence [13]; Bayesian regularized SEM [14,15,16,17,18]; bounded estimation [19,20]; and various forms of factor score regression [21,22,23].
The broader covariance-estimation literature has shown that shrinkage can substantially improve the conditioning and stability of covariance estimates, particularly when the sample covariance matrix is noisy, high-dimensional, or poorly conditioned (e.g., [24]). These developments provide the general statistical rationale for replacing an unstable sample covariance matrix by a better-conditioned compromise between the sample matrix and a structured target. However, such approaches are typically designed to improve the covariance estimator as a whole. They do not aim to identify which specific covariance elements contribute to estimation failure in a given SEM.
This limitation extends beyond covariance-based approaches: regularization at the parameter level—whether through penalized maximum likelihood, Bayesian priors, or factor score regression—equally treats the SEM as a global optimization problem, without diagnosing which parts of the sample covariance pattern are responsible for estimation failure.
The current paper builds on the work of [4], who propose a shrinkage approach specifically designed for SEM. They show that shrinkage approaches, in which the VCOV matrix is combined with a highly structured and well-conditioned shrinkage target matrix T, substantially improve convergence rates in small samples. More formally, the method constructs an adjusted VCOV matrix as a weighted combination of S and T, given by $S^{*} = (1 - \alpha) S + \alpha T$, where $\alpha$ denotes the shrinkage intensity with $0 \le \alpha \le 1$. However, the shrinkage is applied uniformly to the entire VCOV matrix and therefore does not identify which covariance elements drive non-convergence in any particular case.
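The effect of a uniform (global) shrinkage step on the smallest eigenvalue can be illustrated with a small sketch. It assumes NumPy, uses the identity matrix as a stand-in target (the approach in [4] uses a structured, model-based target instead), and the symbol `alpha` for the shrinkage intensity is a notational assumption.

```python
import numpy as np

def global_shrinkage(S, T, alpha):
    """Weighted combination (1 - alpha) * S + alpha * T with alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * np.asarray(S, float) + alpha * np.asarray(T, float)

# Illustrative indefinite correlation-type matrix (not from the paper).
S = np.array([[1.0, 0.95, 0.2],
              [0.95, 1.0, 0.6],
              [0.2, 0.6, 1.0]])
T = np.eye(3)  # stand-in target; an assumption made for this example only

S_star = global_shrinkage(S, T, alpha=0.2)
min_before = np.linalg.eigvalsh(S)[0]
min_after = np.linalg.eigvalsh(S_star)[0]
print(min_before, min_after)
```

With an identity target, every eigenvalue is mapped to (1 - alpha) times its old value plus alpha, so all elements of the matrix are altered even if only one correlation is problematic. This is the uniformity that the localized approach tries to avoid.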
We address this diagnostic gap by treating non-convergence as an observable outcome of the interaction between the VCOV matrix and a specified SEM. The same VCOV matrix may lead to non-convergence for one SEM specification while converging without difficulty for another. To study this interaction, we construct a large-scale, model-independent adversarial matrix-generating process that produces a wide range of VCOV matrices spanning realistic and stress-test conditions, including near-singular, indefinite, and highly correlated cases. A pre-specified SEM is then fitted to each generated VCOV matrix to obtain a convergence label.
Using these labeled matrices, we train a machine learning model, inspired by ([25], Chapter 5), to predict SEM non-convergence using only information from the VCOV matrix. SHAP (SHapley Additive exPlanations) values are then used to decompose the prediction into feature-specific contributions [26]. This allows us to identify covariance pairs that contribute most strongly to the predicted risk of non-convergence. Based on these identified pairs, we outline a localized shrinkage strategy that targets destabilizing covariance patterns while preserving the overall structure of the VCOV matrix as much as possible.
2. Materials and Methods
A fitted SEM may fail to converge for a given VCOV matrix, but it is often unclear which parts of the VCOV are responsible for this behavior. To illustrate and study this mechanism in a controlled setting, we consider a simple SEM shown in Figure 1 as a running example. For this model the number of observed variables is $p = 6$.
2.1. Adversarial Generation of VCOV Matrices and Pathology Injection
In contrast to traditional SEM simulations that rely on correctly or incorrectly specified population models, we separate sample covariance/correlation patterns from model truth. To this end, we adopt a model-agnostic simulation approach with controlled pathology injection [27], where sample covariance/correlation matrices ($R$) are generated independently of any fitted model and then modified in controlled ways to reflect realistic but problematic empirical situations. This stress-test design is conceptually related to adversarial contamination ideas [28]. It allows us to study non-convergence as a function of the sample-matrix pattern itself, rather than as a by-product of model misspecification.
Because the matrices are generated directly rather than sampled from finite datasets, the matrix-generation mechanism is independent of sample size. The injected pathologies should therefore be interpreted as matrix-level stress-test conditions designed to mimic numerical features often encountered in small-sample SEM, such as near-singularity, unstable local correlations, and poor conditioning.
Figure 2 summarizes the full diagnose–localize–shrinkage workflow; the individual steps are described in the subsections that follow.
In practice, we first sampled, with equal probability, a baseline template of size $p \times p$, with $p = 6$, from a heterogeneous set of generators designed to capture common dependence patterns observed in practice (i.e., Wishart-type, block-structured, Toeplitz-like, low-rank, and uniform random templates). Illustrative examples of the baseline templates are presented in Appendix A. Next, we injected controlled pathologies by manipulating the smallest eigenvalue and local covariance patterns; see Appendix B for a detailed description of the injection mechanisms. Pathology severity was controlled using a pre-specified distribution: clean cases were sampled with probability 45%, mild cases with 25%, moderate cases with 15%, severe cases with 10%, and extreme cases with 5%. Pathology types included (i) near-singularity, generated by shrinking the smallest eigenvalue to a severity-dependent range, (ii) indefiniteness, generated by forcing a small negative eigenvalue with severity-dependent magnitude, and (iii) extreme covariance clusters, created by imposing high within-block correlations for a randomly selected subset of variables. A mixed condition combined these mechanisms while avoiding unrealistically extreme matrices.
A total of 50,000 adversarial VCOV matrices were sampled. All generated matrices were symmetrized and standardized to correlation form, with a unit diagonal. The “clean” condition was generated using a rejection criterion that retained only well-conditioned, strictly positive definite matrices. Candidate matrices that did not satisfy these criteria were discarded and resampled. If repeated attempts failed, a nearest positive definite projection [29] was used to obtain a stable clean reference.
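The eigenvalue-based injection mechanisms can be sketched as follows. This is a simplified illustration assuming NumPy, not the procedure from Appendix B: the severity probabilities are those stated above, whereas the Toeplitz-like base matrix, the injected values `1e-4` and `-0.05`, and all function names are hypothetical.

```python
import numpy as np

# Severity distribution as stated in the text.
SEVERITY_PROBS = {"clean": 0.45, "mild": 0.25, "moderate": 0.15,
                  "severe": 0.10, "extreme": 0.05}

def to_correlation(M):
    """Symmetrize and rescale to unit diagonal."""
    M = (M + M.T) / 2.0
    d = np.sqrt(np.diag(M))
    R = M / np.outer(d, d)
    np.fill_diagonal(R, 1.0)
    return R

def set_smallest_eigenvalue(R, new_value):
    """Replace the smallest eigenvalue, rebuild, and restandardize.

    Rescaling by a positive diagonal matrix keeps the signs of the
    eigenvalues, so an injected negative eigenvalue stays negative.
    """
    vals, vecs = np.linalg.eigh(R)  # ascending eigenvalues
    vals[0] = new_value
    return to_correlation(vecs @ np.diag(vals) @ vecs.T)

rng = np.random.default_rng(2024)
severity = rng.choice(list(SEVERITY_PROBS), p=list(SEVERITY_PROBS.values()))

# Hypothetical Toeplitz-like baseline template with rho = 0.5.
base = np.array([[0.5 ** abs(i - j) for j in range(4)] for i in range(4)])
near_singular = set_smallest_eigenvalue(base, 1e-4)   # illustrative value
indefinite = set_smallest_eigenvalue(base, -0.05)     # illustrative value
print(severity, np.linalg.eigvalsh(near_singular)[0],
      np.linalg.eigvalsh(indefinite)[0])
```

Note that restandardizing changes the magnitude of the injected eigenvalue slightly, so the value after rescaling is close to, but not exactly, the requested one.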
To evaluate whether the adversarial matrix-generation procedure produced the intended numerical variation, we computed model-independent diagnostics. Specifically, we inspected the smallest eigenvalue, the condition number, the maximum absolute off-diagonal correlation, and the proportion of matrices with at least one non-positive eigenvalue across pathology severity levels.
Figure 3 shows that the severity levels induced systematic variation in matrix stability. Clean matrices were generally well-conditioned and positive definite, whereas matrices from more severe conditions showed poorer conditioning, more extreme local correlations, and more frequent proximity to, or violation of, positive definiteness. Thus, the adversarial generation procedure produced a heterogeneous set of sample matrices with the intended matrix-level numerical properties.
In a second step, a SEM (see Figure 1) was fitted to each adversarially generated correlation matrix $R$ using both the R [30] packages lavaan ([31], version 0.6-21) and OpenMx ([32], version 2.21.13). The convergence status reported by each package was recorded, with 1 indicating that the model failed to converge and 0 indicating that the model converged. After all model fits had been completed, only cases with equal convergence status in both packages were retained for further analysis. This filtering step reduced the influence of optimizer-specific behavior. A model was classified as converged if (i) the optimizer reported convergence and (ii) all standard errors could be computed, indicating that the local curvature information required for standard error estimation was available.
In total,
cases converged in both lavaan and OpenMx, whereas
resulted in non-convergence. The resulting dataset therefore consisted of the convergence status and the vectorized lower-triangular elements of each correlation matrix. The rate of non-convergence increased with the severity of the injected pathology (see Table 1). This pattern provided an internal validity check of the adversarial data-generating process, confirming that increasing numerical severity was associated with an increased likelihood of non-convergence. The deviation for the extreme category was likely related to inadmissible solutions, indicating that numerical convergence alone is not a sufficient indicator of model validity. The aim of this study, however, is not to guarantee admissible model solutions, but to increase the likelihood of convergence in the presence of problematic correlation patterns.
2.2. An XGBoost Model to Detect Local Correlation Instability
To identify local correlation instability associated with non-convergence, we trained a gradient-boosted decision-tree classifier using the Extreme Gradient Boosting algorithm, implemented in the R package xgboost ([33,34], version 3.1.3.1).
The prediction target was coded as 0 for convergence and 1 for non-convergence. The predictors consisted of the vectorized lower-triangular elements of the correlation matrix. To stabilize the scale of the correlation features and to reduce the influence of extreme values near $\pm 1$, all correlations were transformed using the Fisher-z transformation before model fitting.
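The feature construction can be sketched in a few lines of standard-library Python. The clipping constant `eps` is an assumption added here so that correlations of exactly plus or minus one remain finite; the source does not state how such values were handled.

```python
import math

def lower_triangle(R):
    """Strictly lower-triangular elements, row by row."""
    return [R[i][j] for i in range(len(R)) for j in range(i)]

def fisher_z(r, eps=1e-6):
    """Fisher-z transform, atanh(r), with clipping away from +/-1 (assumption)."""
    r = max(-1.0 + eps, min(1.0 - eps, r))
    return math.atanh(r)

# Hypothetical 6 x 6 correlation matrix with constant correlation 0.3.
R6 = [[1.0 if i == j else 0.3 for j in range(6)] for i in range(6)]
features = [fisher_z(r) for r in lower_triangle(R6)]
print(len(features), features[0])
```

For six observed variables this yields 6 * 5 / 2 = 15 predictors per matrix.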
The full dataset was split into training, validation, and test sets using a stratified sampling procedure, thereby preserving the proportion of converged and non-converged cases across splits. Specifically, 70% of the data were assigned to the training set, 15% to the validation set, and the remaining 15% to the test set. This three-way split allowed model development and hyperparameter tuning to be performed without using information from the final test set.
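A stratified 70/15/15 split can be sketched as follows. This is a plain-Python illustration of the idea, not the code used in the study; the function name, the seed, and the toy label vector are hypothetical.

```python
import random

def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=1):
    """Split indices into train/valid/test while preserving class proportions."""
    rng = random.Random(seed)
    splits = {"train": [], "valid": [], "test": []}
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_train = round(fractions[0] * len(idx))
        n_valid = round(fractions[1] * len(idx))
        splits["train"] += idx[:n_train]
        splits["valid"] += idx[n_train:n_train + n_valid]
        splits["test"] += idx[n_train + n_valid:]  # remainder goes to test
    return splits

# Toy labels: 0 = converged, 1 = non-converged (20% non-convergence).
labels = [0] * 800 + [1] * 200
parts = stratified_split(labels)
print({name: len(idx) for name, idx in parts.items()})
```

Because the split is performed within each class, the share of non-converged cases is the same in all three sets up to rounding.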
Model training used a binary logistic objective function with a fixed learning rate of 0.1. To account for class imbalance between converged and non-converged cases, class weighting was applied to the loss function. The optimal number of boosting iterations was selected using 5-fold cross-validation within the training data, with early stopping based on cross-validation performance to reduce the risk of overfitting.
Predicted probabilities were obtained for the validation and test sets. The validation set was used during model development, whereas the test set was reserved for the final evaluation of model performance. Performance was assessed using multiple metrics, including logloss, AUC-PR, AUC-ROC, and the Brier score. The results in Table 2 show that the XGBoost model discriminated well between converged and non-converged cases. Although predictive accuracy is not the primary objective of the diagnostic model, an adequate level of discrimination is required before model attributions can be used to identify correlation patterns associated with non-convergence.
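Two of the reported metrics, the Brier score and the logloss, have simple closed forms and are sketched below for a toy set of predictions. The data are invented for illustration and do not correspond to Table 2.

```python
import math

def brier_score(y, p):
    """Mean squared difference between predicted probability and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def log_loss(y, p, eps=1e-15):
    """Average negative Bernoulli log-likelihood, with clipped probabilities."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)
        total += -(yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi))
    return total / len(y)

y_true = [0, 0, 1, 1]          # 1 = non-convergence
p_hat = [0.1, 0.2, 0.8, 0.9]   # predicted probability of non-convergence
print(brier_score(y_true, p_hat), log_loss(y_true, p_hat))
```

Both scores are proper scoring rules and equal zero only for perfectly confident, correct predictions, which makes them useful complements to the ranking-based AUC measures.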
A final model was trained on the combined training and validation data using the optimal number of boosting iterations identified above.
SHAP-Based Localization of Unhappy Correlation Pairs
Next, we identified the specific correlation pairs that contributed most to the predicted risk of non-convergence, referred to here as unhappy correlations. The detailed algorithmic steps are provided in Appendix C. Here, we summarize the procedure at a conceptual level.
To localize unhappy correlations, SHAP values were used to decompose the model prediction into additive contributions of individual correlation pairs. For each observation (i.e., each correlation matrix), the SHAP value of a feature quantified how much the corresponding correlation pair increased or decreased the predicted risk of non-convergence relative to the model baseline. Global importance was summarized by the mean absolute SHAP value across all observations, providing a ranking of the correlation features most strongly associated with non-convergence.
Two qualifications of this procedure should be noted. First, many commonly used SHAP implementations rely, either explicitly or implicitly, on assumptions about feature independence that are not strictly met in the present setting: the features are pairwise correlations from a single matrix and are therefore structurally dependent. This dependence can affect how attribution is distributed across closely related correlation pairs [35]. Second, SHAP values quantify contributions to model predictions rather than causal effects on convergence. We therefore used SHAP as a localization device: it identified which correlation pairs the classifier associated most strongly with predicted non-convergence, rather than providing a causal explanation of why the SEM failed to converge. The practical usefulness of the localization was evaluated empirically by examining whether the subsequent targeted repair step restored convergence for the original matrix.
The selection of unhappy correlations proceeded in three steps. First, only correlation pairs with positive SHAP values were considered, because these pairs increased the predicted risk of non-convergence according to the diagnostic classifier. Negative SHAP values were ignored because they indicated correlation pairs associated with a lower predicted risk of non-convergence. Second, the absolute SHAP value of each pair was evaluated relative to its empirical distribution across the training data. This yielded a percentile score that reflected how extreme the contribution was compared with typical cases. Third, a combined score was computed as the product of the SHAP value and its percentile rank. Only correlation pairs that exceeded the predefined thresholds were retained.
This procedure ensured that the selected unhappy correlations were both influential for the current case and unusually large relative to what was typically observed in the training data. To limit unnecessary modifications, the number of unhappy correlations per case could be capped. The final set of unhappy correlations was then used in the targeted repair step.
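The three selection steps can be sketched as follows. The thresholds 0.50 and 0.05 are the ones reported in the illustrative case study; everything else is an assumption made for this example: the percentile is computed as an empirical distribution function of absolute training SHAP values, the combined score is treated as the "strength score", and the variable names and SHAP values are invented. The algorithm in Appendix C may differ in detail.

```python
from bisect import bisect_right

def percentile_score(value, reference):
    """Share of absolute reference SHAP values not exceeding |value|."""
    ref = sorted(abs(v) for v in reference)
    return bisect_right(ref, abs(value)) / len(ref)

def select_unhappy(shap_case, shap_train, min_percentile=0.50,
                   min_strength=0.05, max_pairs=None):
    selected = []
    for pair, phi in shap_case.items():
        if phi <= 0:                      # step 1: keep risk-increasing pairs only
            continue
        pct = percentile_score(phi, shap_train[pair])   # step 2
        strength = phi * pct                            # step 3: combined score
        if pct >= min_percentile and strength >= min_strength:
            selected.append((pair, phi, pct, strength))
    selected.sort(key=lambda t: t[3], reverse=True)
    return selected if max_pairs is None else selected[:max_pairs]

# Hypothetical SHAP values for one matrix and for the training data.
shap_case = {("x1", "x2"): 0.40, ("x1", "x3"): -0.30,
             ("x2", "x3"): 0.02, ("x3", "x4"): 0.04}
shap_train = {("x1", "x2"): [0.05, 0.10, 0.20, 0.30],
              ("x1", "x3"): [0.05, 0.10, 0.20, 0.30],
              ("x2", "x3"): [0.01, 0.05, 0.10, 0.20],
              ("x3", "x4"): [0.01, 0.02, 0.03, 0.50]}
unhappy = select_unhappy(shap_case, shap_train)
print(unhappy)
```

In this toy case only the first pair is retained: the second has a negative SHAP value, the third is not extreme relative to the training distribution, and the fourth is extreme but too small in absolute terms.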
2.3. Shrinkage Approach for Targeted Repair of Correlation Patterns
In the previous section we identified a small set of unhappy correlations associated with non-convergence. We next describe how these unhappy correlations can be used to construct a localized shrinkage update with minimal distortion of the original correlation pattern.
2.3.1. Construction of the Model-Based Target Correlation Matrix
The localized repair operator shrinks selected correlations in $R$ toward a well-conditioned and highly structured target matrix $T$. Rather than relying on a generic identity or constant-correlation target, we used a model-based target matrix that reflected the assumed measurement structure of the SEM. This choice was inspired by the model-based shrinkage framework of [4], but differs in that the target matrix was used here as a reference for localized repair rather than as a global replacement of the VCOV matrix.
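For the six-indicator example, the target correlation matrix has the block structure
$$T = \begin{pmatrix} T_{11} & T_{12} \\ T_{12}^{\top} & T_{22} \end{pmatrix},$$
where $T_{11}$ and $T_{22}$ are the $3 \times 3$ within-factor blocks and $T_{12}$ is the $3 \times 3$ between-factor block.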
We constructed this target using a well-conditioned SEM with two latent factors and three indicators per factor (see
Figure 1). The factor loadings are fixed at 0.7, implying equally strong indicators. Indicator reliabilities are fixed at
, yielding homogeneous residual variances, all latent correlations are set to
, and residual errors were assumed to be uncorrelated. The resulting model-implied covariance matrix was then rescaled to correlation form.
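A target matrix of this kind can be computed from the standard factor-analytic decomposition of the model-implied covariance matrix. The sketch below assumes NumPy and uses the loading of 0.7 stated above; the latent correlation `factor_corr=0.3` is a placeholder chosen for this example, and the residual variances are set so that each indicator has unit variance, which is also an assumption.

```python
import numpy as np

def model_based_target(loading=0.7, n_per_factor=3, factor_corr=0.3):
    """Model-implied correlation matrix of a two-factor model.

    factor_corr is a hypothetical placeholder value.
    """
    # Loading matrix: each indicator loads on exactly one factor.
    Lambda = np.kron(np.eye(2), np.full((n_per_factor, 1), loading))
    Phi = np.array([[1.0, factor_corr], [factor_corr, 1.0]])
    common = Lambda @ Phi @ Lambda.T
    # Uncorrelated residuals, scaled to give unit indicator variances.
    Theta = np.diag(1.0 - np.diag(common))
    Sigma = common + Theta
    d = np.sqrt(np.diag(Sigma))
    return Sigma / np.outer(d, d)   # rescale to correlation form

T = model_based_target()
print(np.round(T, 3))
print(np.linalg.eigvalsh(T)[0])
```

Under these assumptions the within-factor correlations equal 0.7 * 0.7 = 0.49 and the between-factor correlations equal 0.49 times the latent correlation, so the target is well-conditioned by construction.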
2.3.2. Localized Shrinkage Update of Unhappy Correlations
A purely local update of only the unhappy correlations may be too restrictive, because non-convergence can reflect broader dependency patterns rather than a single destabilizing correlation. Therefore, the adjustment of each unhappy correlation was also propagated to connected correlations. The size of this secondary adjustment depended on the strength of the corresponding correlation and was controlled by a smooth sigmoid weighting function. Correlations more strongly connected to an unhappy correlation received a larger adjustment, resulting in a more coherent modification of the correlation matrix.
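The sigmoid weighting of connected correlations can be sketched with a logistic function. The center of 0.65 and the maximum weight of 0.30 are the values reported in the illustrative case study later in the paper; the sharpness of 20 and the minimum weight of 0 are hypothetical values chosen for this example.

```python
import math

def sigmoid_weight(r, center=0.65, sharpness=20.0,
                   weight_min=0.0, weight_max=0.30):
    """Adjustment weight for a connected correlation r.

    sharpness and weight_min are illustrative assumptions.
    """
    s = 1.0 / (1.0 + math.exp(-sharpness * (abs(r) - center)))
    return weight_min + (weight_max - weight_min) * s

for r in (0.20, 0.65, 0.95):
    print(r, round(sigmoid_weight(r), 4))
```

The weight depends on the absolute value of the connected correlation, so strong negative and strong positive correlations are treated alike, and the weight at the center is exactly halfway between the minimum and the maximum.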
For a given shrinkage level $\alpha$, this procedure yielded a candidate repaired correlation matrix $R^{*}(\alpha)$ that preserved symmetry and a unit diagonal by construction. However, the localized shrinkage update did not guarantee positive definiteness. Therefore, for each candidate value of $\alpha$, we computed the smallest eigenvalue of $R^{*}(\alpha)$ and rejected the candidate unless $\lambda_{\min}(R^{*}(\alpha)) > \varepsilon$, where $\varepsilon$ is a small positive tolerance. The shrinkage level $\alpha$ was selected by line search. Starting from a small initial value, $\alpha$ was increased until $R^{*}(\alpha)$ was positive definite and the SEM fitted to $R^{*}(\alpha)$ converged. The first accepted candidate was retained, corresponding to the smallest accepted value of $\alpha$ and therefore minimizing distortion of the original correlation pattern.
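The line search can be sketched as follows, assuming NumPy. Several parts are assumptions made for illustration: the element-wise convex update toward the target, the weight matrix `W` (1 for an unhappy pair, 0 elsewhere), the grid of candidate values, and the callback `sem_converges`, which stands in for an actual SEM fit and simply returns `True` here.

```python
import numpy as np

def local_shrink(R, T, W, alpha):
    """Shift R toward T element-wise; W[i, j] in [0, 1] scales the shift."""
    A = alpha * W
    out = (1.0 - A) * R + A * T
    out = (out + out.T) / 2.0
    np.fill_diagonal(out, 1.0)
    return out

def line_search(R, T, W, sem_converges, start=0.01, step=0.01, eps=1e-6):
    """Return the smallest grid value of alpha that passes both checks."""
    n_steps = int(round((1.0 - start) / step)) + 1
    for k in range(n_steps):
        alpha = start + k * step
        cand = local_shrink(R, T, W, alpha)
        # Accept only if positive definite and the (stand-in) SEM fit converges.
        if np.linalg.eigvalsh(cand)[0] > eps and sem_converges(cand):
            return alpha, cand
    return None, None

# Illustrative indefinite matrix; the pair (0, 1) plays the unhappy correlation.
R = np.array([[1.0, 0.95, 0.2],
              [0.95, 1.0, 0.6],
              [0.2, 0.6, 1.0]])
T = np.full((3, 3), 0.49)
np.fill_diagonal(T, 1.0)
W = np.zeros((3, 3))
W[0, 1] = W[1, 0] = 1.0

alpha, repaired = line_search(R, T, W, sem_converges=lambda M: True)
print(alpha, np.linalg.eigvalsh(repaired)[0])
```

In practice the callback would refit the SEM to each candidate matrix, and `W` would additionally carry the sigmoid weights of the connected correlations.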
2.3.3. Jackknife-Based Reference Interval
To assess whether a repaired correlation introduces changes that exceed the natural variability of the observed data, we evaluated the size of each correlation adjustment relative to a jackknife-based reference interval. The jackknife provides an estimate of the sensitivity of each correlation coefficient to individual observations.
Starting from the observed dataset with n observations, we computed leave-one-out correlation matrices by removing one observation at a time and recomputing the correlation matrix. This procedure yielded n jackknife correlation matrices. For each correlation pair, the 95% interval across these matrices reflected the range of correlation values that one may expect from ordinary single-observation fluctuations in the data.
After the repair step each correlation was compared with its jackknife interval. If the repaired correlation fell within the interval, the adjustment was considered consistent with natural sampling variability of the data.
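The jackknife reference interval can be sketched as follows, assuming NumPy. The simulated data matrix and the function name are hypothetical; the interval is taken as the 2.5th and 97.5th percentiles across the leave-one-out correlation matrices, as described above.

```python
import numpy as np

def jackknife_intervals(X, level=0.95):
    """Element-wise percentile bounds of leave-one-out correlation matrices."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    reps = np.stack([np.corrcoef(np.delete(X, i, axis=0), rowvar=False)
                     for i in range(n)])
    tail = 100.0 * (1.0 - level) / 2.0
    lower, upper = np.percentile(reps, [tail, 100.0 - tail], axis=0)
    return lower, upper

rng = np.random.default_rng(1)
X = rng.standard_normal((30, 3))      # hypothetical data, n = 30
lower, upper = jackknife_intervals(X)

def within_bounds(R_repaired, lower, upper):
    """Boolean matrix: True where a repaired correlation lies in its interval."""
    return (lower <= R_repaired) & (R_repaired <= upper)

print(within_bounds(np.corrcoef(X, rowvar=False), lower, upper))
```

Because each replicate omits only one observation, these intervals describe sensitivity to single cases and are typically much narrower than a sampling-based confidence interval.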
3. Results
Illustrative Case Study
This section illustrates the diagnose–localize–shrinkage pipeline using a controlled example. The goal is to show how a non-convergent SEM fit can be stabilized through a local adjustment of the covariance matrix.
We simulated a dataset of size $n = 30$ from the population model shown in Figure 1. The random seed was chosen such that fitting the SEM to the resulting VCOV would not converge.
- Step 1: SEM fit on S (non-convergence).
We fitted the SEM using lavaan with maximum likelihood estimation. As expected, the model did not converge:
lavaan 0.6-21 did NOT end normally after 2302 iterations
** WARNING ** Estimates below are most likely unreliable
Estimator ML
Optimization method NLMINB
Number of model parameters 13
Number of observations 30
- Step 2: Diagnose and localize the instability.
The VCOV matrix was transformed to its correlation matrix $R$ (see Table 3), after which the unique off-diagonal correlations were Fisher-z transformed and evaluated using the trained XGBoost diagnosis model. For each correlation pair, a SHAP value was computed to quantify its contribution to the model’s prediction of non-convergence for this specific case.
Unhappy correlations were selected using the SHAP-based procedure described in Section SHAP-Based Localization of Unhappy Correlation Pairs. Unhappy correlations were identified using two criteria: a percentile score of at least 0.50 and a minimum strength score of 0.05. Note that these thresholds were chosen heuristically and are not intended as optimal cut-off values, but rather as practical criteria for identifying a small set of unhappy correlations.
Table 4 shows the resulting SHAP values, percentile scores, and strength scores for all correlation pairs.
This resulted in two unhappy correlations, – and –. The positive SHAP value for – indicated that this correlation pair increased the predicted risk of non-convergence for the example matrix. The second selected pair, –, also contributed positively to the non-convergence prediction. These two pairs therefore defined the primary repair targets. The localized shrinkage update was then applied to the correlation pattern surrounding these targets: the unhappy correlations received the strongest adjustment, while connected correlations were adjusted simultaneously according to the sigmoid weighting scheme.
- Step 3: Repair with sigmoid-smoothed shrinkage.
The repair extended beyond the two unhappy correlations. These unhappy correlations were treated as the primary repair targets and received the direct shrinkage adjustment. In addition, a sigmoid smoothing function was applied to correlations that shared a variable with an unhappy correlation, so that the surrounding correlation pattern could also be adjusted. For each connected correlation, a weight was computed using a logistic function parameterized by a center, a sharpness, a minimum weight (weight_min), and a maximum weight (weight_max). As a result, connected correlations with an absolute value well below 0.65 received virtually no additional adjustment, whereas connected correlations substantially above 0.65 approached the maximum adjustment weight of 0.30. The chosen sharpness yielded a relatively steep transition around the center point. Consequently, the unhappy correlations received the primary shrinkage adjustment, while moderate to strong connected correlations were adjusted proportionally and weak connected correlations remained essentially unchanged. For the selected unhappy correlations and their neighbors, the shrinkage update is
The shrinkage level determined by line search was
Using
, the resulting repaired correlations are summarized in
Table 5, which jointly reports the sample correlations (
), repaired values (
), and corresponding repair adjustments (
).
The two unhappy correlations changed only slightly:
No nearest positive definite projection was required. The repair therefore consisted of a minimal adjustment. As shown in
Table 5, all changes were small in magnitude, with adjustments typically on the order of
to
. The largest modifications occurred for correlations directly connected to the unhappy ones, while the majority of correlations remained nearly unchanged.
To put the size of the repair into perspective, we compared the repaired correlations with the variability of the observed correlations using a leave-one-out jackknife procedure. Starting from the observed dataset with 30 observations, we recomputed the correlation matrix 30 times, each time leaving out one observation. This yielded 30 jackknife correlation matrices. For each correlation pair, we determined the 2.5th and 97.5th percentiles across the jackknife replicates, representing the range of correlation values expected under ordinary sampling fluctuations.
The resulting 95% jackknife bounds are also reported in Table 5. Comparing the repaired correlations with these bounds showed that all repaired correlations remained well within the range induced by sampling variability. This indicated that the repair was effective and remained small relative to the natural variability in the observed correlations.
Thus, the repair can be understood as a local stabilization of the correlation pattern identified by the SHAP diagnosis. Rather than replacing the full matrix, the update moved the destabilizing pairs and their connected correlations toward a stable target matrix.
- Step 4: SEM refit on the repaired matrix (convergence).
The original covariance matrix S was rescaled to its corresponding correlation matrix to remove scale effects. Model training, diagnosis, and shrinkage were performed on the correlation scale. Because this transformation is reversible, the repaired matrix can be mapped back to a covariance matrix, and the SEM can therefore be refitted on the covariance scale if desired. We refitted the same SEM to the repaired matrix using maximum likelihood estimation. After the repair, the model converged normally after 41 iterations:
lavaan 0.6-21 ended normally after 41 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 13
Number of observations 30
Thus, the originally non-convergent SEM converged after a small, SHAP-guided local shrinkage update of the identified destabilizing correlation patterns.
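The mapping between the correlation and covariance scales mentioned in Step 4 can be sketched as follows, assuming NumPy. The example covariance matrix is invented, and the function names are illustrative.

```python
import numpy as np

def cov_to_cor(S):
    """Return the correlation matrix and the standard deviations of S."""
    sd = np.sqrt(np.diag(S))
    return S / np.outer(sd, sd), sd

def cor_to_cov(R, sd):
    """Rescale a correlation matrix with the original standard deviations."""
    return R * np.outer(sd, sd)

S = np.array([[4.0, 1.2, 0.6],
              [1.2, 9.0, 2.7],
              [0.6, 2.7, 1.0]])
R, sd = cov_to_cor(S)
S_back = cor_to_cov(R, sd)
print(np.round(R, 3))
```

A repaired correlation matrix can be passed to `cor_to_cov` together with the original standard deviations, so that the variances of the observed variables are left untouched by the repair.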
4. Discussion
In this paper, we proposed a diagnose–localize–shrinkage framework to address non-convergence in small-sample structural equation modeling (SEM). In contrast to existing approaches, which treat non-convergence either as a global optimization problem or as a direct consequence of limited information, our framework conceptualizes non-convergence as the result of local pathologies in the sample covariance matrix and their interaction with a given SEM specification.
A central contribution of this paper is the strict separation between model structure and diagnostic information. Covariance matrices are generated adversarially and independently of the SEM, while convergence is treated solely as an outcome variable. This design avoids information leakage from the model into the diagnostic stage. This separation is important, as many simulation studies confound model misspecification, sampling variability, and numerical instability, which makes it difficult to identify the mechanisms underlying non-convergence. The results show that specific local covariance patterns, such as extreme or internally inconsistent covariances, are systematically associated with non-convergence, even when the SEM itself is correctly specified.
Furthermore, this study demonstrates that machine learning models, such as XGBoost, can serve as a diagnostic tool in SEM. The XGBoost classifier was not primarily used to maximize predictive accuracy, but to identify via SHAP values which specific covariance pairs contribute to non-convergence for a given model. The strong predictive performance indicates that problematic covariance structures follow consistent and detectable patterns. Importantly, these patterns are interpretable at the level of individual covariance elements, which makes local shrinkage possible.
Another key feature of the proposed framework is the local nature of the shrinkage update. Global shrinkage methods, such as ridge or Tikhonov regularization, have well-understood effects on the covariance or correlation matrix, but they modify the matrix as a whole. By contrast, the proposed approach restricts the update to a small set of unhappy correlations and their connected correlations. The illustrative case study shows that relatively small, targeted adjustments can be sufficient to restore convergence without substantially altering the overall covariance patterns.
We do not claim that the local repair has the same general statistical interpretation as global shrinkage. This local procedure is data-, model-, and diagnosis-dependent. It therefore has a diagnostic purpose: to show whether targeted local adjustments can restore convergence under a specified target structure. The repaired matrix can be used as an alternative sample matrix for subsequent SEM fitting, although users may also choose other remedies, such as changing the model, collecting more data, using another estimator, applying global shrinkage, or reporting the instability without repair.
A natural refinement of the procedure is to apply corrections sequentially rather than simultaneously, starting with the most influential unhappy correlation, refitting the model, and continuing only if needed. This strategy may achieve convergence with smaller overall modifications, although a larger shrinkage factor per step may be required. The trade-off between the number of corrected correlations and the magnitude of each adjustment can be monitored against the jackknife-based variability bounds described in Section 2.3.3. We leave a systematic evaluation of this sequential variant to future work.
Several limitations should be acknowledged. First, the diagnostic step is inherently model-specific. The same covariance matrix may lead to non-convergence for one SEM specification while posing no problems for another. Although this dependency is an explicit part of the framework, it implies that diagnostic models need to be retrained for substantively different SEM structures. Second, the analysis focuses on convergence as a binary outcome. Convergence alone does not guarantee substantive validity. As illustrated by the extreme pathology condition, numerical convergence may coincide with biased parameter estimates, distorted standard errors, or other inadmissible solutions, particularly in small samples.
Closely related to this issue is the question of how local shrinkage affects the statistical properties of the estimated model parameters. Although the shrinkage strategy is targeted, its impact on parameter bias, variance, mean squared error, and substantive research questions is currently unknown. In line with the model-based shrinkage approaches in [4], future simulation studies are needed to systematically evaluate these effects. The proposed framework should therefore be viewed as a stabilizing preprocessing step rather than as a guarantee of model validity.
An additional limitation is that the effects of the hyperparameters in the sigmoid transformation used to scale local covariance adjustments are not yet well understood. The different choices for its slope and midpoint may influence both the strength and the localization of the repair. Similarly, the selection of quantile-based cutoff values used to identify extreme SHAP contributions is currently heuristic. There is no criterion for their selection yet. It is also unclear how sensitive the results are to these choices.
A further limitation concerns the diagnostic instrument itself. As discussed in Section SHAP-Based Localization of Unhappy Correlation Pairs, SHAP-based attribution assumes feature independence that does not strictly hold when features are pairwise correlations from a single matrix, and SHAP values quantify contributions to model predictions rather than causal effects on convergence. In our framework, the validity of the localization is therefore not asserted on the basis of the SHAP values alone but evaluated empirically, through whether the proposed repair restores convergence. Future work could examine attribution methods that explicitly accommodate feature dependence, or compare SHAP-based localization with alternative selection strategies—for instance, sensitivity-based approaches that perturb individual covariance pairs and measure the resulting change in convergence.
Despite these limitations, the proposed framework offers something that existing approaches to non-convergence do not: a case-specific diagnosis of which covariance elements drive estimation failure for a given SEM, combined with a localized intervention whose effect is empirically verifiable. Global shrinkage, regularization, and Bayesian priors stabilize estimation but treat the input covariance pattern as a black box. The framework introduced here opens that box, identifies which specific correlation pairs the classifier associates with non-convergence, and offers a minimally invasive adjustment that can be evaluated against the natural variability of the observed data.