Bridging Crisp-Set Qualitative Comparative Analysis and Association Rule Mining: A Formal and Computational Integration
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
I find the draft to be mathematically rigorous, well-motivated, and an important cross-disciplinary contribution. Below is my assessment, divided by key academic dimensions:
- The formal definitions of csQCA as a triple (V, R, M), the notion of configurations, consistency, and reducibility are clearly stated using set theory and logical algebra. The mapping of csQCA configurations to association rules via a transformation operator T is elegantly constructed, and the equivalence theorem is provable, precise, and meaningful. The reduction technique adapted from Quine–McCluskey is well suited and demonstrates thoughtful alignment with existing logic minimization frameworks ===> From a mathematical modeling standpoint, the authors have succeeded in constructing a bijective framework that bridges symbolic logic (csQCA) with empirical pattern mining (ARM) through rigorous definitions and propositions. This is commendable.
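The transformation operator T praised here can be illustrated with a minimal sketch (our own Python illustration, not the paper's implementation; the literal encoding and the names `T` and `"~"` for negation are assumptions): a csQCA configuration, i.e. a truth-table row of 0/1 variable values, becomes an association rule whose antecedent holds one literal per variable and whose consequent is the outcome R.

```python
def T(configuration, outcome="R"):
    """Map a csQCA configuration {var: 0/1} to an association rule
    (antecedent itemset, consequent). Presence (1) maps to the variable
    name, absence (0) to a negated literal, e.g. "~B"."""
    antecedent = frozenset(
        var if value == 1 else f"~{var}"
        for var, value in configuration.items()
    )
    return antecedent, outcome

# The configuration A=1, B=0, C=1 maps to the rule {A, ~B, C} -> R
rule = T({"A": 1, "B": 0, "C": 1})
```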
- While csQCA and ARM have been independently used for decades, very few, if any, papers have established a mathematical equivalence or computational mapping between them. The use of negative association rules (including both presence and absence conditions) and proving that csQCA solutions are a subset of ARM results is novel and impactful ===> This paper provides theoretical unification in a way similar to how, say, statistical learning theory bridges regression and classification. That kind of synthesis is valuable in maturing interdisciplinary methods.
However:
- No experimental benchmarks comparing runtime, memory use, or scalability between csQCA truth tables and ARM reduction.
- No evidence that the proposed ARM reduction scales significantly better on real-world large-N problems beyond the examples shown.
- The related work section is weak: no deep discussion of prior attempts to unify csQCA and other rule-based methods; limited referencing of other QCA variants (e.g., fuzzy-set QCA, mvQCA).
Author Response
Comment 1: "No experimental benchmarks comparing runtime, memory use, or scalability between csQCA truth tables and ARM reduction."
Response 1: We have updated Section 4.3 (Numerical Experiments), adding memory usage and splitting the ARM timings into the association-rule finding step (Apriori algorithm) and the association-rule minimization procedure. We have expanded the discussion of the results and specified the details of the computer on which the simulations were performed.
Comment 2: No evidence that the proposed ARM reduction scales significantly better on real-world large-N problems beyond the examples shown.
Response 2: In the new Section 4.3, we discuss how and in which aspects ARM improves on csQCA. Due to computational limitations, we have not been able to analyze what happens when the number of variables increases further. Although the Apriori algorithm used does not scale well as the number of variables grows, Section 3.2 mentions alternatives that can improve performance in more complex cases.
Comment 3: Related work section is weak: No deep discussion of prior attempts to unify csQCA and other rule-based methods; limited referencing of other QCA variants (e.g., fuzzy-set QCA, mvQCA).
Response 3: We have expanded the literature review in Section 2 to include the aforementioned aspects. We have added three paragraphs with several more references to fsQCA, mvQCA, and other variants, and have indicated more clearly that, with the exception of one article on Bayesian rules as an alternative to QCA, we have not found any work relating association rules to QCA.
Reviewer 2 Report
Comments and Suggestions for Authors
This manuscript presents a novel and rigorous mathematical formalization of Crisp-Set Qualitative Comparative Analysis (csQCA) and establishes a clear theoretical equivalence with a class of Association Rule Mining (ARM) problems under certain conditions. The paper is well-written, logically structured, and supported by both theoretical derivations and empirical applications—a small-N case study on internet shutdowns in Sub-Saharan Africa and a large-N study on immigration attitudes in Europe. The inclusion of a simulation study to demonstrate computational advantages in high-dimensional settings is a valuable contribution.
However, I have several questions and suggestions for clarification:
- Theorem 1 Assumptions:
The equivalence between csQCA and ARM in Theorem 1 is established under two critical conditions. I would appreciate further clarification on the empirical feasibility and practical implications of these assumptions.
- Consistency of Dataset M:
The requirement that the dataset M is consistent (i.e., no contradictory cases with identical configurations but differing outcomes) may be difficult to satisfy in real-world scenarios, especially when the sample size or number of variables is large. In such cases, how sensitive is the theoretical equivalence to this assumption? For example, if 90% of the dataset is consistent, would the equivalence still hold approximately in practice? A brief discussion or empirical comparison would be helpful to understand whether this violation leads to substantial differences in results between csQCA and ARM.
- Thresholds for Support and Confidence in ARM:
The thresholds used in the ARM setup (support = 1/m, confidence = 1) are quite stringent. The support condition implies that even itemsets occurring in only one case are retained, and the confidence threshold requires perfect conditional certainty (i.e., the conditional probability of R given X is 1). In practice, such thresholds are rarely used due to computational and robustness concerns. Could the authors comment on how relaxing these thresholds would impact the overlap between csQCA and ARM results? Specifically, how would more typical thresholds affect detection accuracy and computational efficiency?
- Numerical Experiments (Section 4.3):
- The manuscript reports computation times but does not clarify whether csQCA and ARM produced identical selection results in the simulation study, as they did in the empirical examples. It would strengthen the discussion to explicitly state whether the outcome sets were exactly the same or diverged under different thresholds or data conditions.
- Regarding ARM computation time: does the reported time include the post-processing step of identifying reducible rules through minimization? If not, how much additional time was required for that step? Moreover, is there a software package or implementation available for this minimization procedure? Including such details—or ideally, making code or pseudocode available—would increase the reproducibility and impact of the work.
- Computational Resources:
Please provide the specifications of the computational environment used to generate the reported runtime (e.g., CPU model, RAM, GPU if applicable). This information is essential for readers to assess and compare performance.
- Minor Corrections and Editorial Suggestions:
- Page 15, line 483: It should read "consistency = 1," not "confidence = 1."
- Table 5 (Page 17): The formatting of itemsets in the 4-itemset section is inconsistent—the use of capital and lower-case letters is inconsistent with that in the association rules.
- Page 17, line 542: Consider adding "[THREAT]" after "perceived economic threat" to align with the notation used for other variables.
- Table 7 (Page 19): Please clarify why the confidence metric was not shown and coverage was used instead.
Author Response
Comment 1: Theorem 1 Assumptions:
The equivalence between csQCA and ARM in Theorem 1 is established under two critical conditions. I would appreciate further clarification on the empirical feasibility and practical implications of these assumptions.
- Consistency of Dataset M:
The requirement that the dataset M is consistent (i.e., no contradictory cases with identical configurations but differing outcomes) may be difficult to satisfy in real-world scenarios, especially when the sample size or number of variables is large. In such cases, how sensitive is the theoretical equivalence to this assumption? For example, if 90% of the dataset is consistent, would the equivalence still hold approximately in practice? A brief discussion or empirical comparison would be helpful to understand whether this violation leads to substantial differences in results between csQCA and ARM.
- Thresholds for Support and Confidence in ARM:
The thresholds used in the ARM setup (support = 1/m, confidence = 1) are quite stringent. The support condition implies that even itemsets occurring in only one case are retained, and the confidence threshold requires perfect conditional certainty (i.e., the conditional probability of R given X is 1). In practice, such thresholds are rarely used due to computational and robustness concerns. Could the authors comment on how relaxing these thresholds would impact the overlap between csQCA and ARM results? Specifically, how would more typical thresholds affect detection accuracy and computational efficiency?
Response 1:
- Regarding the consistency of dataset M: Thank you very much for your comment. It is a very good point. We fully agree with the reviewer that requiring full consistency in the dataset is a very restrictive assumption. However, in csQCA it is a working assumption. In fact, if the consistency is not one, it is because there are "contradictions", which are usually interpreted as missing variables describing the possible outcome. This is one of the main criticisms of csQCA that the fuzzy extension (fsQCA), on which we are also working at the moment, attempted to solve. Moreover, the software used for the csQCA calculations in R (Dusa's QCA package) returns an error when the truth table given as input contains contradictions and asks the user to modify it.
- Regarding the support and confidence thresholds: The support and confidence conditions are indeed very stringent. As in the previous point, they come from the usual csQCA working assumptions. Initially, csQCA was meant to be used with a very small number of variables and very few cases. This is why a single occurrence was enough for a case to be considered in the list of "conducive cases". If a larger number of occurrences (say k) were required, one would only need to adjust the minimum support threshold to k/m, and the theorem would still hold.
Also, since the consistency of a configuration equals the confidence of the corresponding rule, if no contradictions are allowed, then the consistency of any conducive configuration is 1 (and thus so is the confidence threshold). However, as before, if one wants to allow a certain degree of inconsistency, it suffices to set the minimum confidence threshold accordingly.
We have added two paragraphs in Remark 4, after Theorem 1, commenting on these issues.
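The threshold arithmetic discussed in this response can be made concrete with a small sketch (our own Python illustration, not the authors' code; encoding cases as sets of literals is an assumption): requiring a configuration to appear in at least k of m cases corresponds to a minimum support of k/m, and forbidding contradictions corresponds to a minimum confidence of 1.

```python
def support_confidence(cases, antecedent, outcome="R"):
    """Compute support and confidence of the rule antecedent -> outcome.
    cases: list of sets of items (literals, plus the outcome if present)."""
    m = len(cases)
    covered = [c for c in cases if antecedent <= c]  # cases matching X
    support = len(covered) / m
    confidence = (
        sum(1 for c in covered if outcome in c) / len(covered)
        if covered else 0.0
    )
    return support, confidence

# Three cases; the configuration {A, ~B} appears twice, always with R:
# support = 2/3 >= 1/m = 1/3, confidence = 1, so the rule is conducive.
cases = [{"A", "~B", "R"}, {"A", "~B", "R"}, {"~A", "B"}]
s, c = support_confidence(cases, {"A", "~B"})
```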
Comment 2: Numerical Experiments (Section 4.3):
- The manuscript reports computation times but does not clarify whether csQCA and ARM produced identical selection results in the simulation study, as they did in the empirical examples. It would strengthen the discussion to explicitly state whether the outcome sets were exactly the same or diverged under different thresholds or data conditions.
- Regarding ARM computation time: does the reported time include the post-processing step of identifying reducible rules through minimization? If not, how much additional time was required for that step? Moreover, is there a software package or implementation available for this minimization procedure? Including such details—or ideally, making code or pseudocode available—would increase the reproducibility and impact of the work.
Response 2:
- Thank you very much for your comment. For the simulations in the numerical experiments, csQCA and ARM did not produce identical solutions: ARM always found more configurations than csQCA. What always happened is that the csQCA solutions were contained in the list of ARM rules found (as stated in Theorem 1). The reason for the discrepancy lies in the minimization algorithms. In csQCA, Quine–McCluskey is a top-down algorithm; therefore, the smaller configurations found are always minimizations of larger configurations. In ARM, on the other hand, we first obtain all interesting rules and then remove the rules that are reducible. This means that csQCA cannot obtain configurations that do not come from the minimization of larger rules. We did not include the detailed list of solutions in the numerical experiments because, in the higher-dimensional cases, the number of rules was in the tens of millions.
- This is also a very good suggestion. The times reported included both the Apriori algorithm for finding interesting rules and the minimization procedure. To clarify the times required at each step, we have rerun the numerical experiments, reporting the memory usage (as suggested by other reviewers) and the computation time for each ARM step separately. The Apriori algorithm used was available in the "arules" R package. The minimization of the interesting rules was done with custom code we developed. We have added a link to a GitHub page with the R code used (and some C++ functions) for reproducing Figure 1.
We have added a new Remark (Remark 5) at the end of Section 4.3 to clarify the first point. We have rerun the numerical experiments, and updated both Figure 1 and Figure 2, adding the memory usage, and splitting the AR computation time into AR apriori and AR minimization. We have also expanded the discussion of the results in Section 4.3 by adding several paragraphs, updating the results obtained.
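The "remove reducible rules" post-processing step described in this response can be sketched as follows (our own minimal Python illustration, not the custom code the authors link to): a rule X -> R is treated as reducible when an already-accepted rule X' -> R exists with X' a proper subset of X, since the larger antecedent then adds nothing.

```python
def minimize_rules(antecedents):
    """Keep only rules whose antecedent has no proper subset among the
    accepted rules; all rules are assumed to share the consequent R."""
    rules = sorted(set(antecedents), key=len)  # shorter antecedents first
    kept = []
    for x in rules:
        # x is reducible if some kept antecedent is a proper subset of x
        if not any(k < x for k in kept):
            kept.append(x)
    return kept

rules = [frozenset({"A"}), frozenset({"A", "~B"}), frozenset({"C", "~B"})]
# {A, ~B} is reducible because {A} alone already implies R
minimal = minimize_rules(rules)
```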
Comment 3: Computational Resources:
Please provide the specifications of the computational environment used to generate the reported runtime (e.g., CPU model, RAM, GPU if applicable). This information is essential for readers to assess and compare performance.
Response 3: We have added the details of the computer used to run the numerical experiments at the end of the third paragraph of Section 4.3
Comment 4: Minor Corrections and Editorial Suggestions:
- Page 15, line 483: It should read "consistency = 1," not "confidence = 1."
- Table 5 (Page 17): The formatting of itemsets in the 4-itemset section is inconsistent—the use of capital and lower-case letters is inconsistent with that in the association rules.
- Page 17, line 542: Consider adding "[THREAT]" after "perceived economic threat" to align with the notation used for other variables.
- Table 7 (Page 19): Please clarify why the confidence metric was not shown and coverage was used instead.
Response 4:
- Done.
- Thank you very much, we have corrected it.
- Done.
- The confidence metric was not shown as an explicit column because, as stated in the table caption, the table shows all association rules with 100% confidence, so we believed a column of all 1s would add no information. We have added "(Conf = 1)" to the table caption for clarity.
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
The authors have answered my questions clearly and completely. I have no further questions.