1. Introduction
Statistical theory offers normative standards for rational behavior. These standards are central to many tasks that require probabilistic reasoning and form the foundation of research on decision-making under uncertainty. Yet, decades of cognitive psychology research show that individuals often deviate from these normative expectations. Rather than applying formal statistical principles, people frequently rely on thoughts that come to mind spontaneously and effortlessly—commonly referred to as intuitions or heuristics. These intuitive processes tend to be fast, emotionally charged, and based on surface-level features of a problem, rather than on abstract reasoning. This pattern has been extensively documented in the “Heuristics-and-Biases” program, initiated by Kahneman and Tversky [
1] and developed in further studies of Kahneman et al. [
2] and Gilovich et al. [
3].
A major theoretical framework that builds on these insights is the Dual-Process Theory of Cognition. This theory has been influential in both cognitive and social psychology and provides a structured account of how humans think, reason, and make decisions (Samuels et al. [
4]; Stanovich & West [
5,
6,
7]; Stanovich [
8,
9]; Evans [
10]). According to this theory, human cognition operates via two qualitatively distinct systems. System 1 (intuitive) is fast, automatic, and prone to cognitive biases. System 2 (analytical) is slower, controlled, and deliberate. The two systems often interact, but they can also compete. Dual-process theory holds that intuitive errors arise when responses generated by System 1 go unchallenged by System 2. To produce normatively correct answers—especially in tasks requiring probabilistic thinking—System 2 must first detect a flaw in the intuitive output of System 1, inhibit that response, and then activate a formal reasoning strategy.
This framework provides a compelling explanation for the mismatch between actual human reasoning and normative models. It captures both the source of error (unchecked intuitions) and the mechanism of correction (engagement of analytical reasoning). As such, dual-process theory serves as a powerful tool for interpreting data in the domain of statistical cognition.
The present study builds on this framework by addressing a notable contradiction in empirical findings from two related studies. These studies used identical statistical reasoning problems, yet produced opposite outcomes. In one study, participants predominantly gave the intuitive but incorrect answer. In the other, participants mostly arrived at the correct, normative response. The contrast in outcomes was not due to the content of the task but rather to the experimental design used to present it. One study used a within-subjects format, while the other used a between-subjects design.
This puzzle is part of a broader research program on statistical reasoning led by Van Dooren and colleagues. Their original study [
11] provided the basis for our investigation, and their subsequent contributions (Van Dooren et al. [
12,
13]) have significantly advanced understanding of how people reason about proportions, ratios, and probability, particularly in educational contexts. This body of work provides the methodological backbone and theoretical motivation for the present study (The terms “Original” and “New” were coined by Zwaan et al. [
14], and relate to the debate on labeling a replication as direct or conceptual).
The sharp divergence between the “Original” and “New” studies raises a crucial theoretical question: Can the structure of the experimental design itself influence which cognitive system—System 1 (intuitive) or System 2 (analytical)—is activated? While previous studies have investigated the effects of task framing, numerical format, and domain familiarity, few have directly examined whether methodological design functions as a cognitive moderator. If, as we argue, the design type nudges participants toward intuitive or analytical processing modes, then methodological decisions hold theoretical weight. The experimenter, through design choice, may inadvertently shape the cognitive route participants take, independent of the problem content.
This insight motivates the central aim of this paper, which is to offer an integrative framework that accounts for both the conceptual and empirical contradictions between the “Original” and “New” studies. In doing so, we adopt a synthesis-based approach that combines experimental findings, theoretical reasoning, and formal modelling. The empirical synthesis aligns the observed outcomes across designs, while the conceptual synthesis clarifies how these outcomes can be understood through the lens of dual-process theory.
To advance this synthesis further, we introduce a mathematical model that captures the functional relationship between experimental design, cognitive system engagement, and performance accuracy. This approach also responds to recent calls for strengthening conceptual replication in the behavioral sciences, such as those by Nosek & Errington [
15] and Derksen & Morawski [
16]. These works emphasize that replication is not merely a confirmatory tool but also a mechanism for theory development, consistent with our goal of synthesizing empirical contradictions into a formal framework. Mathematical modeling in cognitive science has long served to formalize mental processes, generate testable predictions, and expose hidden assumptions. However, dual-process theory has historically been presented in mostly verbal or conceptual terms. Our model addresses this gap by offering a continuous function that incorporates both methodological structure and cognitive dominance as interacting variables. The model not only describes the empirical data but also enables forward prediction and inverse inference using Bayesian reasoning. This step brings formal precision to a theory that is often rich in explanation but sparse in mathematical structure.
Despite the extensive use of dual-process theory across psychology and economics, relatively little attention has been paid to how methodological framing itself may operate as a cognitive moderator. While task content and framing manipulations are well-documented (Evans [
10]; Stanovich [
9]), fewer studies address the systematic impact of within- versus between-subjects designs on reasoning outcomes. Addressing this gap not only advances theoretical clarity but also informs applied domains, such as financial decision-making, where presentation format directly influences cognitive engagement and bias susceptibility.
Further in this paper, we illustrate these integrative aspects. The “Original” and “New” studies are described in
Section 2.
Section 3 presents the synthesis of the data and concepts.
Section 4 introduces a mathematical model for dual-process integration.
Section 5 and
Section 6 conclude with a discussion and implications for future research.
2. “Original” Study vs. “New” Study: Two Methodologies
In this section we describe and analyze results from the two studies in accordance with linear, intuitive, and normative thinking theories. We refer to the “Original” study conducted by Van Dooren et al. [
11], which used a within-subjects methodology to examine intuitive thinking. Alongside it, we present the “New” study, which uses a “conceptual” replication methodology. In line with Gelman’s [
17] comment on Zwaan et al. [
14], the different setups of the “Original” and “New” studies are associated with possible variation in the treatment effect. Thus, discrepancies in the studies’ results are expected.
2.1. The “Original” Study
Van Dooren et al. [
11] investigated biases in statistical thinking, focusing on the ‘illusion of linearity’ in probability. They tested 225 secondary school students in Belgium, approximately half of whom had formal training in probability, while the other half did not. They used four judgment tasks from a binomial chance situation (a binomial chance situation refers to situations in which each of n trials results in one of two outcomes: failure with probability 1 − p or success with probability p, independently of the other trials. In such situations, a variable that counts the number of successes in the n trials is defined, and its behavior is described by the binomial distribution with parameters n and p. Hence, the probability of k successes is C(n, k)·p^k·(1 − p)^(n − k). This probability is the norm for calculating the chance of any number of successes in n trials.), employing a within-subjects methodology. As shown in
Table 1, each participant was required to answer each of these items by circling the correct answer and explaining their choice. Each item reflects a different linear connection between the parameters. The “linearity in n” (n), “linearity in k” (k), “linearity in p” (p), and “linearity in n × k” (n × k) items represent the four binomial chance items (as detailed in
Appendix A). In
Table 1, as a means of clarification, we indicate in the first column, in bold in the text box, the parameters that are linearly manipulated (n × k, n, k, p). In the second column, we parameterized the text (as demonstrated in bold parentheses). Although there are linear connections between the parameters (n, k, p), the event probabilities do not change linearly, and thus the conclusion according to Van Dooren et al. [
11] is that of linear thinking.
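To make the non-linearity concrete, the following minimal Python sketch evaluates the binomial norm C(n, k)·p^k·(1 − p)^(n − k) before and after a linear manipulation of n and k; the item parameters are hypothetical stand-ins rather than the exact values used in the original items.

```python
from math import comb

def binom_prob_at_least(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), the normative benchmark for the chance items."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical dice-style item: probability of rolling at least k sixes in n rolls.
p = 1 / 6
print(round(binom_prob_at_least(2, 1, p), 3))  # n = 2, k = 1  -> ~0.306
print(round(binom_prob_at_least(4, 2, p), 3))  # n and k doubled -> ~0.132
```

Doubling n and k changes the probability in a non-linear way, which is precisely the relation that the linear intuition misses.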
To be in line with Van Dooren et al. [
11], we used the four items, as they were presented in their research, to exemplify the answers expected under their conclusion of linear thinking. For each item, we circled the “attention-grabber”. The term “attention-grabber” signifies the wording that elicits linear thinking. For example, the manipulation of the parameter n (linearity in n) is performed when n2 results from multiplying n1 by the “attention-grabber”. Addressing mathematical, statistical, or economic problems primarily requires System 2—analytic, effortful, and slow—cognition. Our interpretation is that the respondents relied chiefly on System 1, the fast, intuitive mode that is highly sensitive to context and distractions. Repeated cues capture attention and become the default, thereby biasing judgments toward those signals, a pattern characteristic of the within-subjects (W) condition.
Van Dooren et al. [
11] analyzed the data using a combination of quantitative calculations and qualitative text analysis, as described below:
The correct answer for each item, “This is not true”, was given by only 22.4% of the participants.
The explanations of the students who circled the incorrect answer, “This is true”, were analyzed.
A text analysis was conducted using the pre-determined categories P, O, and N.
Focusing on the probability-educated participants, on average, 75% of the explanations were categorized as P. Regarding participants’ performance, Van Dooren et al. [
11] claim that students’ thinking can be considered linear thinking in non-linear situations. This is evident in both the probability-educated and non-educated participants. Furthermore, they concluded that the findings could have been the result of the research methodology, meaning that students might have been prompted to default to linear thinking as a consequence of the structure of the question setup. Hence, they suggested expanding this research by employing a different methodology.
In the next section, we suggest a follow-up study to Van Dooren et al. [
11] using “conceptual” replication with a between-subjects methodology.
2.2. The “New” Study
The present study (“New”) seeks to conceptually replicate the “Original” Van Dooren et al. [
11] study using a different methodology. Our primary aim is to investigate the underlying thinking processes that account for the observed effects rather than to test the statistical significance of the effects themselves, which is typically the focus of direct replications (Crandall & Sherman [
18]).
Primarily, we adopt Campbell’s [
19] suggestion that the findings of a study need to be replicated using different methods. Acknowledging the importance of the methodology component, as well as Van Dooren et al.’s [
11] suggestion as discussed above, encourages us to test the theory of linear thinking in a different way. We therefore modified the methodology in accordance with the two aspects as follows:
We collected data from a different population, namely probability-educated college students.
We limited the number of items each participant received to one item in a different research setup (i.e., between-subjects methodology).
As mentioned above, a “conceptual” replication tests the theory outlined in the “Original” study while varying its parameters. To satisfy this demand, changing the research population (aspect 1) and adopting a between-subjects design (aspect 2) constitute different research methods. The between-subjects design also enables us to control for some psychological consequences of a within-subjects design; in particular, it reduces the risk that participants infer the experimenter’s intentions and change their behavior accordingly (Charness et al. [
20]). Later on, this research approach will help us place the study along the methodology axis, at the far end of the methodology spectrum (see the synthesis in
Section 3). Subsequently, we conjecture that the tendency towards linear thinking will not be observed in this research setup. Specifically, we hypothesize that the percentage of correct answers exceeds the 50% chance level. In addition, we hypothesize that the distribution of answers does not depend on the item type. This follows from our expectation that participants use the binomial equation. Furthermore, we tentatively propose that incorrect answers are not dominated by linear thinking. Thus, we hypothesize that the N category and the combined categories P + O are uniformly distributed. In relation to these three hypotheses, we claim that the between-subjects design minimizes the impact of the “attention-grabber”.
Note that the components that were not changed in the “New” research as compared to the “Original” one are as follows:
The participants’ answering instructions—circling the correct answer and explaining the choice.
The data analysis—calculating frequencies of the correct/incorrect answers, and analyzing the texts attached to the incorrect answers.
2.2.1. Participants
Questionnaires were given to 177 students (aged 18–26) from an engineering college located in Northern Israel. The minimal psychometric grade for admittance averages 550 out of a maximum possible 800; this grade is the Israeli equivalent of the American SAT. Each participant had completed an introductory course in probability and statistics, covering mainly the following: combinatorics, set theory, symmetrical sample space, conditional probabilities, tree diagrams, and the binomial distribution.
A potential concern is that the observed improvement in accuracy between the “Original” and “New” studies could reflect differences in participant knowledge rather than the methodological design. While the “New” Study participants were indeed college students with formal probability training, the “Original” Study also included a substantial subgroup with similar background knowledge, as reported by Van Dooren et al. [
11]. Despite this overlap, the two studies yielded strikingly different accuracy rates. This suggests that knowledge level alone cannot account for the discrepancy. Instead, the within-subjects design in the “Original” Study appears to have amplified reliance on intuitive responses, whereas the between-subjects design in the “New” Study reduced such bias and promoted more analytical engagement. This interpretation is consistent with evidence that methodological framing itself can confound results independently of participant background (Charness et al. [
20]).
Charness, Gneezy, and Kuhn [
20] further note that “the choice between a within-subject and a between-subject design can critically affect the nature and magnitude of observed effects” (p. 1). Their analysis makes clear that differences in design, regardless of participants’ prior knowledge, can systematically shape outcomes. This supports our interpretation that the observed gap between the “Original” and “New” studies is best explained by methodological framing rather than differences in knowledge.
2.2.2. Method
Participants had 20 min to answer one question (item). Each participant received only one out of the four items. The questions are the Hebrew version of the questions by Van Dooren et al. [
11], as presented in
Table 1. As mentioned above, this method is a “conceptual” replication of Van Dooren et al. [
11] in a between-subjects design.
2.2.3. Analysis
Our data analysis followed the procedure used by Van Dooren et al. [
11].
Table 2 presents the frequency distribution of correct and incorrect answers across the different linear items (n, k, p, and n × k) selected by the participants.
In addition, the texts attached to the incorrect answers were analyzed in accordance with the three predetermined categories (P, O, N). The interrater reliability of this analysis (for all the protocols performed independently by two researchers) was kappa = 0.9012, indicating an almost perfect agreement between the raters (Landis and Koch [
21]).
2.2.4. Results
The results for the selected answers and the text explanations are summarized below and evaluated against our research hypotheses:
The findings in
Table 2 show that 80 percent of the participants (141) gave the correct answer, while 20 percent (36) gave the incorrect one. This striking result confirms the research hypothesis that the proportion of correct answers is significantly higher than the 50% chance level. Furthermore, the differences in the distribution of the answers across all the items are not significant (χ²(3) = 3.61, p > 0.1). This confirms our research hypothesis that the distribution of the answers does not depend on the item type.
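As a transparent check on these summary statistics, the sketch below reproduces the proportion test and the item-wise chi-square in Python; the per-item split of the 141 correct answers is a hypothetical placeholder, since Table 2 is not reproduced here.

```python
from scipy import stats

# Observed totals from the "New" study: 141 correct vs. 36 incorrect answers.
correct, total = 141, 177
prop_test = stats.binomtest(correct, total, p=0.5, alternative="greater")
print(f"P(correct) = {correct / total:.2f}, binomial test p = {prop_test.pvalue:.2e}")

# Hypothetical per-item counts of correct answers (placeholders; Table 2 holds the real split).
correct_by_item = [38, 36, 33, 34]            # items n, k, p, n x k
totals_by_item = [45, 44, 44, 44]
incorrect_by_item = [t - c for t, c in zip(totals_by_item, correct_by_item)]
chi2, p_val, dof, _ = stats.chi2_contingency([correct_by_item, incorrect_by_item])
print(f"chi2({dof}) = {chi2:.2f}, p = {p_val:.3f}")  # reported value: chi2(3) = 3.61, p > 0.1
```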
As in the “Original” research, we analyzed the incorrect text explanations according to the following predetermined categories: P, O, N, as demonstrated in
Table 3. Following the “Original” study’s line of research for inducing the P category, we circled the same possible “attention-grabber” in column 3 of items 1–3, as is shown in
Table 1.
Table 4 presents the frequency distribution of the P, O, and N categories of the incorrect text explanations, where the left side corresponds to the “Original” study’s partitioning and the right side to the “New” study’s partitioning. Applying the “Original” study partitioning to our data produces some differences in the frequencies of the three categories (13, 8, 15). These differences are not statistically significant (χ²(2) = 2.167, p > 0.1). Applying the “New” study partitioning produces some differences in the frequencies of the incorrect text explanations (15, 21) across the N and P + O categories. These differences are not statistically significant (χ²(1) = 1, p > 0.1). This confirms our research hypothesis of a uniform distribution for the N and the combined P + O categories.
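The category-level statistics can be verified directly from the frequencies quoted above; a minimal sketch, assuming a uniform expected distribution in both partitionings:

```python
from scipy.stats import chisquare

# Incorrect-answer explanations under the "Original" partitioning (P, O, N).
chi2_orig, p_orig = chisquare([13, 8, 15])      # expected counts: uniform over 3 categories
print(f"chi2(2) = {chi2_orig:.3f}, p = {p_orig:.3f}")   # 2.167, p > 0.1

# Under the "New" partitioning (N vs. combined P + O).
chi2_new, p_new = chisquare([15, 21])           # expected counts: uniform over 2 categories
print(f"chi2(1) = {chi2_new:.3f}, p = {p_new:.3f}")     # 1.0, p > 0.1
```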
According to
Table 2 and
Table 4, the “Original” research findings are not replicated in this “New” research for the answers and the explanation of the categories’ distributions. Based on these findings, the evidence of “illusion of linearity” in participants’ statistical thinking has been extremely narrowed. The existence of two different outcomes directs us to investigate the relationships and interactions of the thinking processes emerging from the various methodologies. It should be noted that the “New” research, at the very least, adds value to the research literature. This is in accordance with Gelman [
17], who refers to cases where replication gives a different outcome from the “Original” one.
3. Synthesizing Data and Concepts Across Methodologies: A Functional Model of the Dual-Process Theory of Thinking
Our findings indicate that most participants relied on analytical thinking, whereas a smaller proportion displayed intuitive or linear reasoning (this claim attributes wrong answers to intuitive thinking). Referring to the “dual-process” theory of thinking, intuitive or linear thinking is associated with System 1 (intuitive) processing, while analytical thinking is associated with System 2 (analytical) processing. A correct or incorrect answer does not necessarily imply the use of solely System 1 or solely System 2. For example, participants need to use System 2 (the analytical mode of thinking) to detect a problem with the operation of System 1, override the intuitive response, and apply their analytical knowledge to give the correct answer. Moreover, a correct answer could be attributed to System 1 processing, while failing to use System 2 efficiently might result in an incorrect answer; the latter could be expected in the absence of the needed mathematical or statistical mindware. However, following the “Original” study and its conclusions, we equate participants’ defaulting to linear thinking with the incorrect answer.
In what follows, we use the notation S1 to refer to this pseudo-System 1. Similarly, following the “New” study and its conclusions, we equate participants’ use of analytical thinking with the correct answer, and we use the notation S2 to refer to this pseudo-System 2.
Moving ahead, we claim that the research design determines the conditions under which incorrect (S1) or correct (S2) answers are given. Foremost among these is the within-subjects design, as it becomes an almost inseparable part of the treatment effect as a consequence of the salience and vividness with which the effect is embedded. This phenomenon is well known as the problem of confounding in within-subjects designs, as discussed by Charness et al. [
20]. We claim that the “Original” research setup leads to the use of heuristic thinking, i.e., a thinking strategy independent of analytical thinking, that is triggered by the “attention-grabber”. Heuristic thinking accounts for a variety of thinking strategies and is not attributed particularly to linear thinking or any other mechanism. Additionally, we emphasize that the within-subjects design highlights possible difficulties in decision-making when encountering an “attention-grabber” and thus raises awareness of the need to improve our monitoring of real-life decisions. This coincides with Stanovich [
8], who indicates that “… when we are evaluating important financial decisions—buying a home or obtaining a mortgage or insurance—we do not want to substitute vividness for careful thought about the situation. Modern society keeps proliferating such situations where shallow processing is not sufficient for maximizing personal happiness. These situations are increasing in frequency because many structures of market-based societies have been designed explicitly to exploit miserly tendencies. Being cognitive misers in these situations will seriously impede people from achieving their goals.”
Regarding reducing the effect of the “attention-grabber” that might dominate participants’ thinking, modifying the research design to the between-subjects design may increase the chance of applying System 2 efficiently.
3.1. Step 1—Dual-Process Theory and Methodology Partitioning
This section provides a preliminary stage, highlighting that the focus is not on which cognitive system the participant relies on, but on the methodology applied within the framework of dual-process theory, represented here as a continuum curve. The curves shown in
Figure 1a,b are illustrative only; their shapes are arbitrary and do not imply any specific rate of change.
3.1.1. Dual-Process Partitioning
We focus on the relationships and interactions between the two thinking processes in relation to the selected answers, conceptualized as a continuum.
Figure 1a illustrates one possible instance of this continuum. The curve presents two domains of cognitive processing. On the left, the domain corresponds to the dominance of System 1 (intuitive) over System 2 (analytical). On the right, the domain corresponds to the dominance of System 2 (analytical) over System 1 (intuitive). The curve is a continuum of points (S1, S1 > S2, S2 > S1, S2 possibilities) that represents all possible states of dominance between the two systems in accordance with cognitive engagement. This starts from the left end point S1, which corresponds to circling the incorrect answer and the dominance of intuitive thinking, and continues through circling the incorrect answer while using both intuitive and analytical thinking (S1 > S2), or circling the correct answer with intuitive or both intuitive and analytical thinking (S2 > S1), up to the right end point S2, which corresponds to circling the correct answer and the dominance of analytical thinking.
3.1.2. Methodology Partitioning
In a similar way to
Section 3.1.1, we zoom in on the relationships and the interactions between the two methodologies as a continuum, shown in
Figure 1b, as one instance out of many possibilities. The curve presents two domains of within-subjects and between-subjects methodologies: the left domain corresponds to the dominance of within-subjects over between-subjects, and the right domain to the dominance of between-subjects over within-subjects. It should be noted that the dominance of one methodology over the other means that there is at least one within-subjects factor and at least one between-subjects factor in the same experiment, with one methodology outnumbering the other in terms of factors. The curve is a continuum of points (W, W > B, B > W, B possibilities) that describes all states in accordance with the methodology dominance. Starting from the left end point W, which corresponds to the within-subjects methodology, it continues through the combined methodologies of within-subjects and between-subjects (W > B or B > W), up to the right end point B, which corresponds to the between-subjects methodology.
3.2. Step 2—Defining Performance Measures as Methodology-Dependent, Alongside the Influence of Thinking Systems
Up to this stage, we can integrate insights from both the “Original” and the “New” experiments into a continuum of curves, as illustrated in
Figure 1b. It is important to note, however, that each experiment was conducted using a different methodological approach, represented as conceptual planes in
Figure 2. Specifically, Plane “O” (blue) corresponds to the within-subjects design of the “Original” study, while Plane “N” (orange) corresponds to the between-subjects design of the “New” study. The solid curve in
Figure 2 traces the trajectory connecting System 1 and System 2, thereby illustrating how methodological choices shape the distribution of answers. The known endpoints (derived from the experimental data and marked with blue dots for Plane “O” or orange dots for Plane “N”) are as follows: for the blue Plane “O”, (W, S1) defines the performance obtained from the “Original” experiment (within-subjects methodology) operating under System 1, while (W, S2) represents the same experiment operating under System 2. Similarly, for the orange Plane “N”, the endpoints (B, S1) and (B, S2) correspond to the between-subjects methodology under Systems 1 and 2, respectively. It is worth noting that the shapes of the curves sketched in
Figure 2 are illustrative and do not imply a specific rate of change or a precise continuum of the answer distribution. In the next section (Step 3), we empirically demonstrate the core synthesis.
3.3. Step 3—Bayesian Probabilities as a Mechanism of the Synthesis
In line with our main idea of synthesizing the “Original” and “New” studies through new imaginary planes that equalize the thinking systems’ states (see Step 4 for specifics), we calculate the posterior probability of the methodology conditioned on the thinking system. These Bayesian probabilities exemplify the core mechanism of the synthesis and are calculated as follows, using the main data.
Based on the four end points—(W, S1); (W, S2); (B, S1); (B, S2)—
Figure 3 shows an example of the third step of artificially merging frequencies for calculating the Bayesian probabilities of the within-subjects/between-subjects design conditioned on the System 1/System 2 planes. Based on the 118 12th-grade participants in the “Original” research and the 177 participants in the “New” study, as well as the statistics given in
Section 2, the transition probabilities between states are given at each edge of the diagram.
Hence, the empirical distribution across methodologies is P(W | S1) = 0.72, where P(W | S1) denotes the probability of encountering the W methodology conditioned on the use of S1 (i.e., on giving the wrong answer). The remaining conditional probabilities, P(B | S1), P(W | S2), and P(B | S2), are defined analogously.
Figure 3.
Tree diagram of probabilistic outcomes. This diagram illustrates the conditional probability structure of decision-making outcomes based on cognitive system dominance (S1 = intuitive/System 1, S2 = analytical/System 2). The branches represent the likelihood of selecting a within-subjects (W) or between-subjects (B) methodology given system activation. Probabilities assigned to each branch reflect observed behavioral patterns under uncertainty.
These results bring together the essence of this synthesis, reflected in the decrease in the percentage of incorrect answers in System 1 due to the shift from within-subjects (72%) to between-subjects (28%), as well as the increase in the percentage of correct answers in System 2 due to the shift from within-subjects (26%) to between-subjects (80%).
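The conditioning step behind these figures can be sketched as follows; the cell counts are approximations reconstructed from the percentages reported in Section 2, not the exact frequencies of Figure 3.

```python
# Illustrative reconstruction of the Figure 3 conditioning step.
# Counts are approximations derived from the reported percentages, not the exact
# cell frequencies used in the original tree diagram.
counts = {
    ("W", "S1"): 92,   # "Original" (within-subjects): incorrect answers
    ("W", "S2"): 26,   # "Original": correct answers
    ("B", "S1"): 36,   # "New" (between-subjects): incorrect answers
    ("B", "S2"): 141,  # "New": correct answers
}

def p_method_given_system(method, system):
    """P(M = method | S = system), estimated from the merged frequencies."""
    total_s = sum(v for (m, s), v in counts.items() if s == system)
    return counts[(method, system)] / total_s

print(round(p_method_given_system("W", "S1"), 2))  # ~0.72
print(round(p_method_given_system("B", "S2"), 2))  # ~0.84 with these approximate counts
```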
Heretofore, the synthesis drew on the various data collected as well as the model of two thinking systems. With what follows, we proceed with the study by theoretically complying with Crandall & Sherman [
18], who claim that “ideas are the unit of analysis in conceptual replication”. Thus, we disengage from the data while advancing our research program.
3.4. Step 4—New Approach for Creating a Functional Model of the Dual-Process Theory of Thinking by Synthesizing Data and Concepts Across Methodologies: From Within-Subjects to Between-Subjects
This section is pivotal to our study and presents the final step of the synthesis. We can infer from observed performance both the operative cognitive system and the methodology to which participants were exposed. In turn, this may reduce the need for resource-intensive group testing, e.g., using fMRIs to localize active brain regions.
Put simply, the research design can be seen as a kind of cognitive nudge. Just as the layout of a store can lead shoppers toward one aisle rather than another, the methodological setup guides participants toward either fast, intuitive judgments (System 1) or slower, more deliberate reasoning (System 2). We mapped these nudges so that by identifying where a participant’s answers fall, we can infer both which system was engaged and which design shaped the response. This analogy shows that methodology is not a passive backdrop but an active force that shapes how decisions are made.
3.5. Scenarios
3.5.1. Symmetric Scenario: P-M Planes
Figure 4 illustrates the relationship between Performance (P) and Methodologies (M), under the assumption that performance outcomes are not fixed but vary according to the methodological approach. Three distinct curves are presented, each corresponding to a different constant cognitive system parameter (s). The horizontal axis (M) represents methodological variation (ranging from 0 to 1), while the vertical axis (P) captures performance outcomes. The curves clearly demonstrate that performance (P) is not absolute but rather a function of the methodology applied. This highlights that the choice of methodology directly influences observed performance levels.
Each curve corresponds to a constant cognitive system value:
Blue curve (s = 0): Performance begins at its maximum when m = 0 and decreases smoothly to a minimum at m = 1.
Green curve (s = 0.5): Performance follows a symmetric half-sine distribution, peaking at m = 0.5 and dropping to zero at both extremes (m = 0 and m = 1).
Purple curve (s = 1): Performance starts at its minimum at m = 0 and increases steadily to its maximum at m = 1.
These curves represent cognitive constraints or system-level properties that remain fixed while methodologies vary.
All curves have been normalized such that the total area under each curve equals 1. This ensures comparability between the three distributions, meaning each cognitive system parameter contributes equally in terms of total “performance capacity,” despite differences in distribution across M.
The chosen functional form for this demonstration is a family of sine-like distributions. These functions are arbitrary and illustrative, serving as initial priors. Importantly, they do not affect subsequent posterior processing; they simply provide a way to visualize possible dependencies between methodologies and performance under fixed system parameters.
The figure demonstrates how performance outcomes (P) depend systematically on methodologies (M), with different curves reflecting distinct cognitive system constants (s). The sine-like prior ensures fairness through normalization and preserves neutrality in assumptions, so that subsequent posterior analyses, calibrated with experimental data, are not biased by the choice of prior.
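The sketch below reproduces the normalization idea with three illustrative sine-like priors; their exact functional forms are assumptions, since Figure 4 specifies the curves only qualitatively.

```python
import numpy as np

m = np.linspace(0, 1, 1001)
dx = m[1] - m[0]

# Illustrative sine-like priors over the methodology axis for three fixed system values.
curves = {
    0.0: np.cos(np.pi * m / 2),        # s = 0: maximal at m = 0, decaying toward m = 1
    0.5: np.sin(np.pi * m),            # s = 0.5: symmetric half-sine, peak at m = 0.5
    1.0: np.sin(np.pi * m / 2),        # s = 1: minimal at m = 0, maximal at m = 1
}

for s_val, curve in curves.items():
    area = curve.sum() * dx            # simple Riemann approximation of the area
    normalized = curve / area          # enforce unit area so the priors are comparable
    print(f"s = {s_val}: normalized area = {normalized.sum() * dx:.3f}")
```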
Weighting principle and justification: One could, in principle, adopt a uniform weighting function along the axis, assigning identical weight to every point/state. If the outcomes are ostensibly random, a natural question arises: why privilege one state over another? In our empirical presentation, we adopted a neutral rule: each participant received the same weight in every methodology, regardless of whether they were assigned to the “Original” experiment or the “New” one (consistent with laboratory conditions). Formally, if n_Original and n_New denote the sample sizes in the two settings, each participant is assigned the weight w = 1/(n_Original + n_New).
Note that as additional tests are introduced, individual participant weights decrease proportionally and uniformly. However, real-world decision making rarely proceeds under equal weighting; heterogeneity in weights is the rule, not the exception. Accordingly, we posit a plausible functional form for how the weights behave (namely, a prior function). We then illustrate why unequal weights are sensible with an applied example: financial advising in a bank. Consider a client meeting with a financial advisor. In a within-subjects setup, the advisor presents the same client with four investment alternatives that differ in compounding frequency (daily, monthly, annually, etc.). Under information overload and noise, the client may mistakenly treat an annual rate as equivalent to a daily rate over a year, overlooking compound interest, so the comparison is invalid.
A hybrid design is also possible: alongside the within-subjects presentation, a between-subjects comparison is introduced by offering a different group an alternative that does not materially change the investment profile, thereby creating a clean between-subjects frame. This setting illustrates that decision patterns differ across (and in combination with) these frames. Accordingly, when there is prior belief or evidence for such heterogeneity, one can posit a more informative prior that better reflects real-world behavior. Nevertheless, for an initial presentation, maintaining a uniform prior remains a conservative, transparent, and defensible starting point, from which the data can subsequently update the weights toward a posterior that better matches reality.
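A one-line calculation makes the advisor example concrete; the 5% rate is purely illustrative.

```python
# Illustrative: a quoted 5% annual rate is not equivalent to 5% compounded daily.
nominal_rate = 0.05
effective_daily_compounding = (1 + nominal_rate / 365) ** 365 - 1
print(f"Simple annual: {nominal_rate:.4%}, daily compounding: {effective_daily_compounding:.4%}")
# Simple annual: 5.0000%, daily compounding: ~5.1267% -- the two offers differ.
```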
In the Bayesian framework, we begin with an initial prior that is iteratively updated to a posterior as real-world data are observed (e.g., from experiments with participant groups), continuing until convergence. Any intuitively chosen initial function is acceptable; this is not problematic, because it will ultimately be updated to the final posterior, becoming increasingly realistic as data accumulate.
3.5.2. Asymmetric Case
In the asymmetric case, there are infinite possible scenarios. One example is that, in certain tests, it may be more effective to answer instinctively rather than to “overthink.” This illustrates just one of many manifestations of asymmetry. A thorough examination of this case may yield valuable insights. For instance, is it better to think more than necessary or less than necessary?
Excessive deliberation (“overthinking”) is known to impair performance on many tasks. In methodologies where instinctive responding maximizes outcomes, additional reflection tends to degrade results. This decline may unfold continuously or exhibit a discontinuity at an overthinking threshold. For example, personality tests are designed to elicit spontaneous, candid answers; prolonged reflection promotes hesitation and misreporting. We posit a tipping point beyond which responses converge to a stable, hesitant pattern, yielding little further change despite more thinking. Just prior to this point, performance remains appropriate; once crossed, hesitation (a consequence of overthinking) can trigger an abrupt, substantial drop in performance.
We now provide a mathematical foundation for this case. Let f: S → D be a function representing the distribution of answers for a given methodology m. The function f reaches its maximum at s ∈ S when s = m, which makes sense, as m is the ideal methodology for the corresponding system.
Should this function be continuous? Below, we present examples where the function is continuous, although continuity is not a strict requirement. The function may contain points of discontinuity. For example, consider the “overthinking point” discussed earlier: at the first moment at which overthinking sets in and performance sharply declines. This transition can be either continuous or discontinuous; both cases are plausible. For our purposes, we focus only on continuous examples, as they allow us to leverage useful mathematical properties. If the function is differentiable, we can compute its derivative and analyze how outcomes change across different systems.
Naturally, each subject will have a unique function for any given methodology. However, by modeling and mathematically representing these functions, we can compute an average function simply by summing them and dividing by the number of subjects. This yields the expected function for a fixed methodology.
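A minimal sketch of this averaging step, with arbitrary per-subject response functions standing in for modeled data:

```python
import numpy as np

s = np.linspace(0, 1, 101)           # cognitive-system axis for a fixed methodology m
rng = np.random.default_rng(0)

# Arbitrary per-subject performance functions f_i(s) (placeholders for modeled subjects).
subject_functions = [np.clip(np.sin(np.pi * s) + 0.1 * rng.standard_normal(s.size), 0, 1)
                     for _ in range(30)]

# Expected function for the fixed methodology: pointwise average across subjects.
expected_f = np.mean(subject_functions, axis=0)
print(expected_f.max().round(3), s[expected_f.argmax()].round(2))  # peak value and its location
```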
To simplify the mathematical model, we use a physical/mechanical representation of a spring–damper–mass system. This framework allows us to observe realistic scenarios of a system’s response to various stimuli (different inputs). The model is convenient because each input can be treated as a fixed methodology, meaning that every scenario is anchored to a consistent interpretive structure. This type of modeling is widely used in applied mathematics to explore system equilibrium, stability, and responsiveness using differential equations [
22]. Furthermore, advanced analytical tools, such as the second Bogolyubov theorem applied to generalized van der Pol oscillators, have shown how nonlinearities and structural asymmetries influence the emergence and stability of periodic behaviors in dynamic systems [
23].
Building on the foregoing, we define the one-dimensional prior response y(t) as the convolution of the input (a fixed methodology on the P-S plane, or a fixed system on the P-M plane) with a characterization of the human cognitive system. This prior provides the basis for the inferred performance level. Additionally, the time axis is then normalized such that it becomes m (for fixed s) or s (for fixed m), ranging from 0 to 1.
Figure 5 demonstrates this.
Note that in the next section, we generalize this construction and represent the resulting prior output as a two-dimensional function over the relevant space.
Scenario 1
Figure 5 illustrates the initial prior function induced by an initial stimulus modeled as a step input (though other input forms are equally admissible). The resulting response tracks the input while exhibiting transient dynamics (e.g., overshoot, settling), which can be interpreted through the decision-making process elicited by the given stimulus.
As a demonstration, we model the system on the P-S plane while holding the methodology M fixed. By symmetry, one could equivalently model the P-M plane with S fixed.
Let g(s) denote the prior output profile along the S-axis at the chosen M. We define it as a normalized version of the time-response y(t): g(s) = c·y(s), with c chosen so that ∫₀¹ g(s) ds = 1, that is, the area under g(s) is constrained to be unity (a normalized “mass” over S).
Under a fixed methodological stimulus M, the participant transitions between cognitive processing modes until the system reaches a final steady state. The stimulus perturbs the system from an initial equilibrium to a final equilibrium. As shown in
Figure 5, during the transient regime, the response exhibits a peak (“overshoot”) that reaches approximately 1.38 (i.e., ~40% above the final steady-state level normalized to 1). This behavior indicates an underdamped, second-order dynamical mechanism (oscillatory affective/cognitive state), consistent with a model whose poles have negative real parts and nonzero imaginary parts.
In the same manner, one may construct an analogous normalized function on the P-M plane for a fixed S: h(m) = d·y(m), with d chosen so that ∫₀¹ h(m) dm = 1, that is, the area under h(m) is constrained to be unity (a normalized “mass” over M).
Together, these one-dimensional constructions provide consistent, slice-wise views of the system’s prior behavior across the design space, with normalization ensuring comparability across slices.
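The following sketch shows one way to generate such a slice-wise prior: the unit-step response of an underdamped second-order system, with the damping ratio chosen so that the overshoot is close to the 1.38 level noted above, and the area rescaled to unity. The specific parameter values are assumptions for illustration.

```python
import numpy as np

def underdamped_step_response(t, wn=10.0, zeta=0.3):
    """Unit-step response of a second-order system with natural frequency wn and damping ratio zeta."""
    wd = wn * np.sqrt(1 - zeta**2)                      # damped natural frequency
    phi = np.arccos(zeta)
    return 1 - np.exp(-zeta * wn * t) * np.sin(wd * t + phi) / np.sqrt(1 - zeta**2)

s = np.linspace(0, 1, 2001)                             # normalized axis (plays the role of s or m)
y = underdamped_step_response(s)

overshoot_peak = y.max()                                # ~1.37 for zeta = 0.3, close to the reported 1.38
area = y.sum() * (s[1] - s[0])
g = y / area                                            # normalized prior g(s) with unit area
print(round(overshoot_peak, 2), round(g.sum() * (s[1] - s[0]), 2))
```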
The subsequent oscillations reflect temporary instability in performance, likely triggered by transient external factors such as mood, fatigue, or distraction. Similar overshoot and recovery behaviors have been observed in nonlinear dynamic systems, such as the Rayleigh-Plesset model of charged cavitation bubbles, where oscillations eventually settle into a stable state [
24]. These transient dynamics are also characteristic of fast–slow coupling systems like the van der Pol–Rayleigh oscillator, which exhibits phases of rapid transition followed by relaxation toward equilibrium [
25]. Ultimately, the system converges to a consistent performance level, suggesting that for this particular methodology, performance becomes independent of cognitive fluctuation. This qualitative stabilization mirrors the equilibrium behavior described in Lane–Emden-type systems, where internal structure drives solution profiles toward a steady state [
26].
Scenario 2
The function describes an undershoot scenario, in which a significant drop in performance may occur under exceptional or atypical conditions (as shown in
Figure 6). Such models are well known in the field of economics, where similar patterns are used to represent sharp declines followed by gradual recovery.
Scenario 3
The system’s behavior can be divided into four distinct phases, as demonstrated in
Figure 7. Before t = 5, the applied force causes the mass to accelerate, resulting in a rising response. Around t = 5, the system reaches its peak velocity, marking the maximum point. Between t = 5 and t = 9, the system begins to return to equilibrium under the combined influence of the spring and damper, leading to a gradual decrease in response. After t = 9, the damping force becomes dominant, causing a steeper, non-oscillatory decay as the system stabilizes.
At the peak point, where performance reaches its highest level, a slight decline in performance follows as the cognitive-system index increases. This gradual decay indicates a weakening in the system’s responsiveness. The subsequent steep decline suggests that the methodology is no longer effective in eliciting high performance beyond a certain level of cognitive engagement.
The function is smooth and differentiable across the entire domain, with no sharp peaks or discontinuities. It effectively models the response of a well-damped second-order system to a smooth, pulse-like input.
3.5.3. Sensitivity Analyses or Simulations to Demonstrate Robustness
Grounded in Bayesian inference, we adopt a stable prior (an exponentially decaying form), ensuring robustness to perturbations, as shown in
Figure 8. The subsequent figures demonstrate that the prior remains stable under data-driven updates: field observations may adjust the function’s parameters, e.g., natural frequency, phase, and damping, thereby yielding a revised posterior, yet the resulting deviations are not substantial.
When the phase shifts, we can re-estimate it at each measurement so that the maximum aligns with the reference point s = m, as theoretically expected (i.e., the cognitive system best matched to the methodology). This alignment serves as a consistency check on the inferred function and a basis for correcting measurement errors where needed.
Figure 9 illustrates the response’s robustness to changes in the natural frequency relative to the original value.
Figure 10 illustrates the response’s robustness to changes in the damping ratio relative to the original value.
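A small parameter sweep, with illustrative values, mirrors the robustness checks of Figures 9 and 10 by perturbing the assumed natural frequency and damping ratio and comparing the resulting responses to the baseline.

```python
import numpy as np

def step_response(t, wn, zeta):
    """Unit-step response of an underdamped second-order system."""
    wd = wn * np.sqrt(1 - zeta**2)
    return 1 - np.exp(-zeta * wn * t) * np.sin(wd * t + np.arccos(zeta)) / np.sqrt(1 - zeta**2)

t = np.linspace(0, 1, 2001)
baseline = step_response(t, wn=10.0, zeta=0.3)

# Perturb the assumed natural frequency and damping ratio; compare peak and final values.
for wn, zeta in [(8.0, 0.3), (12.0, 0.3), (10.0, 0.2), (10.0, 0.4)]:
    perturbed = step_response(t, wn, zeta)
    peak_shift = perturbed.max() - baseline.max()
    final_shift = perturbed[-1] - baseline[-1]
    print(f"wn={wn:4.1f}, zeta={zeta:.1f}: peak shift {peak_shift:+.3f}, final-value shift {final_shift:+.3f}")
```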
4. A Functional Mathematical Model for Dual-Process Integration
Building on the conceptual synthesis in
Section 3, this section develops a formal mathematical model. We move from a 1-D representation to a two-dimensional formulation (methodology × system), yielding a three-dimensional outlook in which performance is a surface jointly determined by both factors. The model captures how methodological framing (within- vs. between-subjects designs) interacts with cognitive system dominance (System 1 vs. System 2) to influence decision accuracy, and it is designed to support both empirical calibration and theoretical prediction.
4.1. Model Dimensions and Definitions
Let us define the following variables over the closed interval [0, 1]:
m ∈ [0, 1] denotes the methodology index, where m = 0 corresponds to a within-subjects design and m = 1 to a between-subjects design; intermediate values represent mixed or hybrid designs.
s ∈ [0, 1] denotes the cognitive dominance index, where s = 0 corresponds to dominance of System 1 (intuitive) and s = 1 to dominance of System 2 (analytical).
d ∈ [0, 1] denotes the decision accuracy, interpreted as the empirical probability of a correct (normative) response.
We write d = f(m, s), where the function
f: [0, 1] × [0, 1] → [0, 1] represents the expected decision accuracy given both the methodological and the cognitive factors.
4.2. Functional Form of the Model
We propose a separable multiplicative form for simplicity and interpretability:
f(m, s) = α·g(m)·h(s),
where
α ∈ (0, 1] is a normalization constant;
g(m) is the methodology response function;
h(s) is the cognitive system response function.
We define
g(m) = 1 − e^(−λm) and h(s) = 1 − e^(−μs),
where −λ and −μ (with λ, μ > 0) are the system’s poles that govern its transient response; the more negative their real parts (i.e., the farther they lie from the imaginary axis), the faster the transients decay and the quicker the system settles.
For simplicity, we model the prior on each plane as a first-order lag, an exponential with a single pole (i.e., no oscillations). These forms reflect diminishing marginal gains in performance as either variable increases—an empirical pattern consistent with cognitive resource limits.
Substituting these functions into the multiplicative form in (2) yields
f(m, s) = α·(1 − e^(−λm))·(1 − e^(−μs)).
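A direct transcription of this functional form is given below; it assumes the first-order-lag expressions reconstructed above, uses the λ and μ estimates reported later in Section 4.4 as defaults, and sets α = 1 for illustration.

```python
import numpy as np

def decision_accuracy(m, s, alpha=1.0, lam=0.46, mu=0.61):
    """Separable dual-process accuracy surface f(m, s) = alpha * (1 - exp(-lam*m)) * (1 - exp(-mu*s))."""
    return alpha * (1 - np.exp(-lam * m)) * (1 - np.exp(-mu * s))

# Accuracy rises monotonically with both the methodology index and cognitive dominance.
print(decision_accuracy(0.0, 0.0))   # least effective corner of the surface
print(decision_accuracy(1.0, 0.5))   # mixed case
print(decision_accuracy(1.0, 1.0))   # most effective corner of the surface
```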
4.3. Probabilistic Conditioning Perspective
In addition to the deterministic formulation of decision accuracy given in Equation (4), we incorporate a probabilistic layer to capture how experimental design influences the activation of cognitive systems, and vice versa. This layer enables both forward and inverse inference and allows empirical results to inform model calibration.
Let m ∈ {0, 1} denote the methodological condition, where m = 0 represents a within-subjects design and m = 1 a between-subjects design. Let s ∈ {0, 1} denote the dominant cognitive system, where s = 0 reflects System 1 (intuitive processing) and s = 1 reflects System 2 (deliberative reasoning).
We are interested in the conditional probability P(M = m | S = s). This expression denotes the probability that a given methodological structure M = m is observed or inferred, conditioned on the engagement of cognitive system S = s. Applying Bayes’ theorem yields the following:
P(M = m | S = s) = P(S = s | M = m)·P(M = m) / P(S = s).
This form is useful when the marginal and conditional probabilities of S and M are either known from empirical data or estimated from a prior distribution.
Empirical findings from both the “Original” and “New” datasets provide estimates of these conditional relationships. In particular, the data support the approximations introduced in Equations (1) and (4), namely P(M = 0 | S = 0) ≈ 0.72 and P(M = 1 | S = 1) ≈ 0.80.
These values indicate that intuitive responses (System 1) are more frequently observed under within-subjects designs, while deliberative responses (System 2) are more often elicited under between-subjects designs. This observation supports the theoretical claim that methodological design acts as a contextual cue that influences which cognitive system is likely to be engaged.
Moreover, Equation (9) can be inverted to support inference in the opposite direction. That is, if the system activation
S = s is observed, Equation (9) can be used to estimate the likelihood that a particular methodological design
M = m was employed. Conversely, if only the design condition is known, the likely dominance of cognitive system
S can be inferred using P(S = s | M = m) = P(M = m | S = s)·P(S = s) / P(M = m).
This bidirectional inference capability allows researchers to use observed accuracy patterns, cognitive system indicators (e.g., response time, confidence), or methodological conditions to update their beliefs about the underlying reasoning dynamics.
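A minimal sketch of this bidirectional inference is given below; the prior over cognitive systems is assumed uniform, and the System 2 conditionals are rounded so that they sum to one.

```python
# Forward conditionals based on the empirical approximations above.
p_m_given_s = {
    (0, 0): 0.72, (1, 0): 0.28,   # P(M = m | S = System 1)
    (0, 1): 0.20, (1, 1): 0.80,   # P(M = m | S = System 2), rounded to sum to 1
}
p_s = {0: 0.5, 1: 0.5}            # assumed uniform prior over cognitive systems

def p_s_given_m(s, m):
    """Invert the conditionals with Bayes' theorem: P(S = s | M = m)."""
    p_m = sum(p_m_given_s[(m, s_prime)] * p_s[s_prime] for s_prime in p_s)
    return p_m_given_s[(m, s)] * p_s[s] / p_m

print(round(p_s_given_m(0, 0), 2))  # P(System 1 | within-subjects design)
print(round(p_s_given_m(1, 1), 2))  # P(System 2 | between-subjects design)
```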
Together, Equations (8)–(11) provide the probabilistic infrastructure of the model and extend its application to empirical calibration, model-based prediction, and formalization of dual-process dynamics.
4.4. Interpretive Geometry and Model Fitting
The functional model defined by Equation (12) can be visualized geometrically as a continuous surface over the unit square [0, 1] × [0, 1], with axes representing the methodological index m and the cognitive dominance index s. The output dimension corresponds to the decision accuracy d = f(m, s) ∈ [0, 1]. This visualization facilitates an intuitive understanding of how changes in methodology and cognitive activation jointly influence performance outcomes.
Contours of constant decision accuracy appear as level sets on this surface. These level curves offer insight into compensatory relationships: for instance, a design that only partially supports analytical processing (lower m) might still achieve high performance if cognitive control is otherwise engaged (higher s). The maximum of the surface, f(1, 1), represents the optimal combination of a between-subjects design and dominant System 2 reasoning, while the minimum, f(0, 0), captures the least effective scenario—within-subjects framing under intuitive control.
This surface can be empirically calibrated using experimental or observational data. Parameters
λ and
μ in Equation (7), which govern the responsiveness of accuracy to changes in
m and
s, respectively, can be estimated through nonlinear least squares or Bayesian methods. Let {(m_i, s_i, d_i)}, i = 1, …, N, denote a dataset of observed design-system-accuracy triples. The objective is to minimize the squared error Σ_i [d_i − f(m_i, s_i)]².
Fitting the model to data not only yields interpretable parameters but also enables the generation of predictive surfaces for new populations or task conditions. Moreover, this calibration allows researchers to assess the marginal influence of methodology (M) and cognition (S) on accuracy (D) under varying constraints. To demonstrate the applicability of this calibration, we fitted Equation (17) to the aggregate accuracy rates observed in the “Original” and “New” studies. Using nonlinear least squares, we obtained parameter estimates of λ ≈ 0.46 and μ ≈ 0.61. These values indicate moderate responsiveness to methodological framing (λ) and stronger responsiveness to cognitive system engagement (μ). This shows that λ and μ are not arbitrary scaling constants but interpretable constructs that capture how changes in design and cognition shape performance outcomes. Although these estimates are based on only two datasets, they provide a proof of concept that the model can be empirically anchored. We explicitly note that predictive robustness requires further validation. Future research should extend the fitting procedure to new datasets and employ cross-validation to confirm that the parameter estimates generalize across tasks and populations.
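The calibration step can be sketched with scipy's nonlinear least squares; the synthetic (m, s, d) triples below stand in for real design-system-accuracy data, which are not tabulated here.

```python
import numpy as np
from scipy.optimize import curve_fit

def f(X, alpha, lam, mu):
    """Separable accuracy surface f(m, s) = alpha * (1 - exp(-lam*m)) * (1 - exp(-mu*s))."""
    m, s = X
    return alpha * (1 - np.exp(-lam * m)) * (1 - np.exp(-mu * s))

# Synthetic (m, s, d) triples standing in for observed design-system-accuracy data.
rng = np.random.default_rng(1)
m = rng.uniform(0, 1, 200)
s = rng.uniform(0, 1, 200)
d = f((m, s), 0.9, 0.5, 0.6) + rng.normal(0, 0.02, 200)

params, _ = curve_fit(f, (m, s), d, p0=[1.0, 1.0, 1.0], bounds=(0, [1.0, 10.0, 10.0]))
alpha_hat, lam_hat, mu_hat = params
print(f"alpha ~ {alpha_hat:.2f}, lambda ~ {lam_hat:.2f}, mu ~ {mu_hat:.2f}")
```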
In experimental design, this geometric interpretation supports the selection of optimal conditions to elicit desired reasoning modes. For instance, locating regions of the surface where accuracy is most sensitive to changes in m can guide interventions aimed at improving outcomes through design adjustments. Similarly, tasks can be targeted to cognitive training if the gradient with respect to s is steep.
Beyond individual task design, the model generalizes to broader domains. Each experimental condition can be conceptualized as a projection onto the (M, S)-plane, with accuracy measured on the vertical axis. Comparing such projections across domains (e.g., statistical reasoning, financial decision-making, educational testing) enables the identification of common surface structures and the emergence of domain-general principles.
In summary, the model surface provides both a descriptive landscape and a prescriptive tool. It translates abstract cognitive principles into measurable quantities and enables actionable predictions about how design and cognitive engagement interact to shape behavior.
5. Discussion
Although the empirical examples used in this paper were framed through controlled tasks involving balanced dice, the theoretical and methodological implications extend well beyond that specific context. The core argument—that experimental framing influences the dominance of cognitive systems—applies directly to real-world domains where probabilistic reasoning is required. One such domain is financial decision-making. Concrete examples from behavioral finance research support this extension. For instance, Glaser and Walther [
27] show that the way investment alternatives are framed can shift investors toward intuitive, System 1-based decision patterns, often reducing accuracy in portfolio choices. Pompian and Longo [
28] similarly document how phrasing such as “guaranteed return” or “loss-proof selection” primes fast, heuristic reasoning and leads to the disposition effect. More recent studies confirm these tendencies: Bhanushali and Rani [
29] and Pooja et al. [
30] demonstrate that investors are more likely to misinterpret risk-return tradeoffs when outcomes are presented with salient but misleading cues, while Lis [
31] reviews how investor sentiment interacts with such framing effects in modern asset pricing models. Taken together, these findings show that methodological framing is not merely an experimental artifact but also a mechanism that biases real-world investor judgment, reinforcing the relevance of our model for applied finance. For instance,
Table 5 could be repurposed to represent portfolio construction scenarios in investment contexts, where the parameters n and k may correspond to the number of selected assets and the target threshold of returns or defaults.
In such contexts, the salience of specific phrasings—such as “guaranteed return,” “performance doubled,” or “loss-proof selection”—often functions as an attention-grabber. These cues can trigger intuitive, System 1-based processing in novice investors. The proposed functional model of dual-process reasoning helps explain how such framing leads to common investment biases, including the illusion of control, anchoring on recent performance, and overestimation of linear returns. These are not just theoretical constructs but observed phenomena in behavioral finance [
27,
28,
29,
30,
31].
The model we developed suggests that intuitive errors in probabilistic reasoning—whether in dice games or financial planning—are not fixed cognitive failures but rather context-sensitive outcomes. Specifically, the methodological design of a task or presentation (within-subjects versus between-subjects) can modulate whether System 1 or System 2 becomes dominant. In applied finance, this translates to the way financial information is communicated—text versus visuals, aggregated figures versus disaggregated detail, or hypothetical gains versus guaranteed losses. All of these presentation factors can influence the cognitive route that the investor follows.
Moreover, our approach has methodological implications for the design of future behavioral finance research. It encourages the use of between-subjects or hybrid designs to explore and isolate the conditions that promote more reflective reasoning. The model provides a theoretical rationale for shifting methodological emphasis depending on whether the goal is to measure cognitive vulnerability or cognitive resilience.
This perspective is supported by findings from studies in behavioral economics and finance, where methodological structure affects observed reasoning behavior. For example, studies of the conjunction fallacy—such as the classic “Linda the bank teller” scenario [
32]—have shown that responses vary significantly between within- and between-subjects formats. Applying our synthesis to such cases implies that the observed distribution of responses can be understood as projections onto different planes of cognitive activation: one representing System 1 and another representing System 2. This allows observed data to be normalized by dividing each answer’s frequency by the total number of responses within its methodological condition, thus clarifying the underlying processing tendencies.
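As a minimal illustration, assuming hypothetical response counts for a conjunction item under the two formats, the normalization amounts to converting raw frequencies into within-condition proportions:

```python
# Hypothetical response counts for a conjunction-fallacy item under two
# methodological conditions; the numbers are illustrative only.
raw_counts = {
    "within-subjects":  {"conjunction (intuitive)": 68, "single event (normative)": 32},
    "between-subjects": {"conjunction (intuitive)": 41, "single event (normative)": 59},
}

# Normalize within each condition so that response proportions can be read as
# projections onto that condition's plane of cognitive activation.
for condition, counts in raw_counts.items():
    total = sum(counts.values())
    proportions = {answer: n / total for answer, n in counts.items()}
    print(condition, proportions)
```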
It is worth emphasizing that although the “Original” paradigm relied on binomial chance scenarios involving rolling dice, the conceptual synthesis generalizes to a much broader set of problems.
Table 5, for example, presents a binomial situation from finance, illustrating how attention-grabbing phrasing such as ‘guaranteed return if at least k assets succeed’ can bias investors’ interpretations. This mirrors the cognitive mechanism observed in the experimental tasks: surface-level cues trigger intuitive reasoning (System 1) and make conclusions appear plausible even when they are mathematically false.
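To make the gap concrete, the following sketch computes the probability behind such a phrasing for assumed values of n, k, and the per-asset success probability p; the specific numbers are hypothetical and are not taken from Table 5.

```python
from math import comb

def prob_at_least_k(n, k, p):
    """P(at least k of n independent assets succeed), each with success probability p."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# Hypothetical framing: "guaranteed return if at least 4 of 6 assets succeed",
# with each asset succeeding independently with probability 0.5.
n, k, p = 6, 4, 0.5
print(f"P(at least {k} of {n} succeed) = {prob_at_least_k(n, k, p):.3f}")
# about 0.344, far lower than the "more than half the assets, so better than
# even odds" reading that the surface phrasing invites.
```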
The functional model developed here permits further formal generalization. By synthesizing both conceptual and empirical observations across within-, between-, and combined-subjects designs, the model supports exploratory hypothesis generation through manipulation of binomial parameters. For example, variations in “linearity in p”, “linearity in n”, and “linearity in k” can be used to design experimental conditions that test the boundary between intuitive and analytical processing. In this way, the model can serve not only as a descriptive tool but also as a generative framework for creating new studies across domains.
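A short sketch, under assumed baseline values, shows how such boundary conditions can be generated: doubling p, doubling n, or halving k changes the probability of obtaining at least k successes by factors far from the factor of two that a linear intuition would predict.

```python
from math import comb

def prob_at_least_k(n, k, p):
    # Same binomial tail as in the previous sketch.
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# Illustrative baseline condition.
base_n, base_k, base_p = 6, 4, 0.25
base = prob_at_least_k(base_n, base_k, base_p)
print(f"baseline            : {base:.4f}")

# "Linearity in p": doubling p does not double the probability.
print(f"p doubled           : {prob_at_least_k(base_n, base_k, 2 * base_p):.4f}")
# "Linearity in n": doubling n (with k fixed) does not double it either.
print(f"n doubled           : {prob_at_least_k(2 * base_n, base_k, base_p):.4f}")
# "Linearity in k": halving k does not double it.
print(f"k halved            : {prob_at_least_k(base_n, base_k // 2, base_p):.4f}")
```

Conditions chosen near the points where the intuitive linear estimate and the binomial value diverge most sharply are natural candidates for probing the boundary between System 1 and System 2 processing.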
Ultimately, while the theory was initially developed in the context of chance-based reasoning, its core mechanisms—interaction between task framing, cognitive system activation, and response pattern—are generalizable. The model presented in
Section 3 and
Section 4 provides a way to represent these interactions mathematically, enabling both predictive analysis and potential empirical calibration. As such, this work contributes to an emerging dialogue between cognitive theory, experimental design, and formal modelling in psychological research.
We initially adopt a sinusoidal form to describe success rates as a function of the cognitive system engaged. In the disturbances section (
Figure 5), we work on the P-S plane with M fixed and model the setting by analogy to control theory: the human reasoner is the system S, the methodology M is the external input, and performance P is the output. Mathematically, the response is the convolution of the input with the system’s impulse response, y(t) = (u ∗ h)(t). In
Figure 5, the input is modeled as a step and the human system as a second-order underdamped LTI element, yielding an exponentially decaying sinusoid.
From a Bayesian standpoint, the chosen form serves only as an initial prior that will be updated to a posterior as empirical data accrue; the specific prior is therefore not decisive. Behaviorally, an individual at equilibrium does not react in the absence of a stimulus; a new external stimulus induces a transient marked by an initial overshoot and damped oscillations at a characteristic frequency, followed by settling to a new equilibrium within a characteristic settling time. Greater damping implies little to no oscillation. In sum, the exponential envelope is a reasonable illustrative prior for a stable system that attenuates initial conditions and converges to a new steady state, and it will be refined by data into the final posterior.
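For readers who wish to reproduce the illustrative prior, the following sketch computes the unit-step response of a second-order underdamped LTI system; the natural frequency and damping ratio are placeholder values, not estimates from data.

```python
import numpy as np

# Illustrative step response of a second-order underdamped LTI system, used
# here as a prior for how performance P settles after a new methodological
# stimulus. Parameter values are hypothetical.
omega_n = 2.0   # natural frequency (rad per unit time)
zeta = 0.3      # damping ratio (0 < zeta < 1 gives an underdamped response)

t = np.linspace(0.0, 10.0, 500)
omega_d = omega_n * np.sqrt(1.0 - zeta**2)            # damped frequency
phi = np.arccos(zeta)                                 # phase offset
envelope = np.exp(-zeta * omega_n * t) / np.sqrt(1.0 - zeta**2)
y = 1.0 - envelope * np.sin(omega_d * t + phi)        # unit-step response

print(f"peak overshoot: {y.max() - 1.0:.3f}")
print(f"final value:    {y[-1]:.3f}  (settles toward the new equilibrium of 1)")
```

Increasing the damping ratio shifts the response from pronounced overshoot and oscillation toward a monotone approach to the new equilibrium, matching the behavioral description above.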
6. Conclusions
This paper introduces a functional model that integrates methodological design with dual-process cognitive theory, offering a novel framework for understanding and influencing reasoning under uncertainty. Our primary goal was to show how specific research designs—particularly within-subjects, between-subjects, and hybrid configurations—can influence whether intuitive (System 1) or analytical (System 2) thinking becomes dominant during decision-making tasks. The empirical basis for our work draws from a synthesis of two studies on probabilistic reasoning, including the foundational work of Van Dooren et al. [
11,
12,
13] and new data presented in this study.
The findings demonstrate that methodological design is not a neutral backdrop but a cognitive catalyst. It can prime participants to engage either intuitive or analytical processes, leading to divergent response patterns even when the task content is held constant. This insight has important implications for the design of experiments in psychology and behavioral economics, where observed behavior is often interpreted without sufficient attention to methodological framing.
To support this theoretical claim, we introduced a continuous mathematical model for dual-process integration. The model formalizes the relationship between design structure, cognitive system engagement, and response probability. By incorporating parameters such as “linearity in
p”, “linearity in
n”, and “linearity in
k”, the model allows for predictive control over how tasks are likely to activate System 1 or System 2 processing. It also enables the normalization of response distributions across different methodological planes, as described in
Section 4. This model not only aligns with observed data patterns but also opens the door to Bayesian inference and forward simulation tools that can substantially enhance the predictive power of dual-process theory.
In applied domains such as finance, these insights carry direct implications. For example, when novice investors are presented with financial options framed through intuitive language—such as “guaranteed return” or “doubled performance”—they are more likely to engage System 1 processing and fall prey to cognitive biases [
27,
28,
29,
30,
31]. Our model provides a framework for mitigating such effects by guiding the design of interventions that promote analytical thinking. Financial educators, advisors, and policymakers can use these principles to enhance investor literacy and resilience to decision-making traps.
From a theoretical perspective, the model supports two directional strategies for activating preferred cognitive systems. First, as described in Step 2, one may start with a specific methodology and identify the thinking system most likely to be activated. Second, as detailed in Step 4, one may begin with a targeted cognitive system (e.g., System 2) and select the methodology that best supports its activation. This bidirectional flexibility highlights the model’s utility not only as a descriptive tool but also as a prescriptive guide for experimental and applied design.
Our conclusions also contribute to ongoing discussions regarding the philosophy of science and research methodologies. The synthesis of empirical and conceptual elements echoes broader calls for integrating descriptive and theoretical levels of explanation [
18,
19]. More broadly, our work aligns with recent debates on replication and methodological innovation. Scholars have emphasized that conceptual replication holds particular value for theory building [
33,
34,
35]. By situating our synthesis within this perspective, we reinforce its contribution not only to dual-process theory but also to the wider discussion on research reliability and scientific progress. The model we offer bridges the divide between empirical description and theoretical explanation by showing how formal tools can unify data structures and conceptual reasoning within a coherent framework.
Finally, this paper suggests several directions for future research. First, the model should be tested in additional domains beyond statistical reasoning—such as legal judgment, medical diagnosis, or social decision-making—where dual-process dynamics are equally relevant. Second, future studies should gather process-tracing data (e.g., response time, eye-tracking, neural activation) to further validate the assumptions of the model. Third, interdisciplinary efforts should explore how variations in individual traits—such as need for cognition, cognitive style, or numeracy—interact with the model’s predictions. These extensions will further establish the model’s value as a cross-domain framework for reasoning research.
In conclusion, this study presents a unified theoretical and mathematical framework that integrates methodological design with dual-process cognition. It provides both an explanatory lens for interpreting contradictory empirical results and a formal tool for guiding future research and application. As cognitive science continues to develop richer models of human reasoning, we believe this synthesis of experimental structure and cognitive theory will serve as a foundation for more nuanced, predictive, and actionable science.