1. Introduction
Retail investors make buy-and-sell decisions in an environment where market information is abundant, uneven in quality, and often communicated as advice rather than neutral data (
Miller & Skinner, 2015;
Shiller, 2017;
Shiller & Pound, 1989). Interpersonal communication, expert commentary, and media attention can reinforce herding and attention-driven trading, especially under uncertainty (
Bikhchandani & Sharma, 2001;
Hsieh et al., 2020;
Shiller, 2017). Large-sample evidence shows persistent behavioural regularities such as excessive trading, attention-based stock selection, and poorer outcomes for the most active traders.
Perceived Cognitive Assistance (PCA) is defined here as the trader’s felt expansion of cognitive capability at the moment of a decision when a large language model (LLM) is available, with emphasis on how the decision is structured rather than whether outcomes improve. PCA differs from perceived usefulness because usefulness evaluates expected results (for example, performance gains, efficiency gains, or better trading outcomes), whereas PCA evaluates decision-time cognitive scaffolding (for example, a clearer path from an idea to an executable plan, better internal checking, and improved scenario comparison). This distinction matters because LLMs are conversational and can shape the user’s reasoning process in real time; therefore, a trader may feel cognitively enabled even when objective performance does not improve and, conversely, may find an LLM “useful” for information retrieval without experiencing decision-time cognitive structuring.
Such behavioural regularities are amplified by asymmetric reactions to gains and losses, including the disposition effect and related forms of loss-sensitive selling and holding (
Ahn, 2022;
Kahneman & Tversky, 1979;
Shefrin & Statman, 1985). Cognitive biases in financial judgement appear widespread across economic groups, suggesting that they are not confined to small or unusual investor segments (
Ruggeri et al., 2023).
Digital distribution channels in financial technology (FinTech) lower execution frictions and keep investors in continuous, attention-competitive streams (
Barber et al., 2021;
Miller & Skinner, 2015). Evidence from trading applications and social media settings indicates that attention shocks can increase risk-taking and are associated with weaker holding-period returns for attention-induced trades (
Eliner & Kobilov, 2023;
Warkulat & Pelster, 2024). These features matter because they change not only what information is available but also how investors experience and process information at the moment of choice (
Miller & Skinner, 2015;
Shiller, 2017).
Large language models (LLMs) are now entering this environment as a new form of retail-facing decision support (
Kong et al., 2024;
Lopez-Lira & Tang, 2024;
Schlosky & Raskie, 2025;
Winder et al., 2025). Recent surveys and applied studies of LLMs in finance document the rapid diffusion of LLM-based analytical support and outline decision-quality and risk channels relevant to private investors (
Li et al., 2023;
J. Lee et al., 2025;
Oh et al., 2025). Unlike screeners and many robo-advisors that mainly automate filtering or allocation, LLM-based systems can provide interactive, multi-turn conversational support that elicits user preferences and delivers tailored explanations and guidance in natural language (
Z. Chen, 2025;
Takayanagi et al., 2025). This interaction can influence framing and perceived controllability, which are central determinants of action in the Theory of Planned Behaviour (
Ajzen, 1991,
2011). At the same time, evidence from AI-advice experiments shows that people may follow AI recommendations even when those recommendations conflict with contextual information and their own interests (
Klingbeil et al., 2025). Broader work on trust and reliance in automation also shows that users can oscillate between avoidance and over-reliance depending on perceived error, presentation, and expectations (
Dietvorst et al., 2015;
Glikson & Woolley, 2020;
Kohn et al., 2021).
Despite rapid diffusion, empirical research is constrained by a measurement gap. Existing technology adoption constructs capture important evaluations of tools, but they do not directly measure the decision-time experience that users describe as “it helps me think through this decision right now” (
Ali et al., 2025;
J. Chen et al., 2025;
Davis, 1989;
Venkatesh et al., 2003,
2012). Perceived usefulness focuses on expected results and performance gains, while perceived ease of use focuses on effort in operating the tool (
Davis, 1989;
Dorobăț & Corbea, 2025;
Mustofa et al., 2025;
Venkatesh et al., 2003).
Trust in automation and AI concerns beliefs about system reliability and appropriate reliance (
Glikson & Woolley, 2020;
Hoff & Bashir, 2015;
Jian et al., 2000;
J. D. Lee & See, 2004). PCA is different; a trader may trust an LLM without feeling cognitively assisted in a specific decision, and vice versa. Trading self-efficacy reflects perceived baseline ability to trade well independent of tools, whereas PCA is conditional on LLM availability at the moment of decision (
Ajzen, 1991,
2006). Constructs developed for robo-advisory settings (e.g., delegation, satisfaction with automated allocation) generally assume a more passive, rules-based service (
Brenner & Meyll, 2019;
D’Acunto et al., 2019). They, therefore, do not target the interactive, multi-turn cognitive scaffolding in natural language that distinguishes LLM-based decision support (
Z. Chen, 2025;
Takayanagi et al., 2025). In short, existing measures do not directly target the decision-time experience of expanded cognitive capability that traders describe when using an LLM.
To address this gap, we propose a new construct, Perceived Cognitive Assistance (PCA) (
Gimmelberg & Ludviga, 2025). PCA is intentionally process-focused: it captures perceived support for understanding, judgement, and decision structuring, rather than downstream outcomes such as returns. The purpose of this study is to provide a measurement foundation for empirical tests of LLM-augmented retail trading behaviour by (i) specifying PCA and its boundaries against neighbouring constructs (usefulness, ease of use, trust, and trading self-efficacy), and (ii) reporting content-validity evidence for a PCA item pool as a gate before factor-analytic testing (
Boateng et al., 2018;
Colquitt et al., 2019;
Hinkin, 1998;
Morgado et al., 2017). This sequencing follows scale development guidance that clear domain specification and content validation should precede statistical tests of factor structure, especially when constructs are proximal and likely to be confused by respondents (
Clark & Watson, 1995,
2019;
Colquitt et al., 2019). Systematic reviews show that scale development studies often report avoidable methodological limitations, reinforcing the need to treat content validity as a front-end requirement rather than an optional add-on (
Morgado et al., 2017).
This study makes three contributions. First, it provides a clear definition and boundaries for PCA, grounded in a transparent qualitative coding frame that anchors item content in trader experiences across 76 sources. Second, it delivers a content-validated item pool: seven items meet all classification thresholds, nine are borderline, and none fall into the problematic range, confirming that PCA is perceptibly distinct from neighbouring constructs at the item level. Third, it identifies the PCA-perceived usefulness boundary as the critical discrimination challenge: filler-item accuracy for usefulness was 81.2%, below the 85% threshold, confirming that distinguishing “helps my thinking” from “improves my results” is genuinely difficult. This finding has direct implications for item wording and discriminant validity testing in subsequent studies. Practically, a short PCA scale can support governance by helping to monitor when LLM use is linked to increasing strategic complexity without matching guardrails. In applied settings, PCA can also support safer financial decision making by flagging when perceived decision-time capability rises, so platforms or advisors can trigger additional risk prompts, suitability checks, and “human-in-the-loop” review before users adopt complex or leveraged tactics (
Barber & Odean, 2000;
Bauer et al., 2009;
J. D. Lee & See, 2004).
This study aims to answer the following research question: can Perceived Cognitive Assistance (PCA)—the felt expansion of capability at decision time when using an LLM—be clearly defined, grounded in qualitative evidence, and supported by content validation as distinct from neighbouring constructs, producing a scale candidate ready for psychometric validation?
This paper proceeds in four steps.
Section 2 describes the two-study design, covering the qualitative domain specification and item generation (Study 1) and the naïve-judge content validation procedure (Study 2).
Section 3 reports the content validation results and proposes a provisional nine-item Perceived Cognitive Assistance (PCA) set for subsequent psychometric testing.
Section 4 discusses implications, limitations, and the next validation stage.
Supplementary Materials A–E support replication and transparency, including the full study instruments and protocols, the PCA macro-code frame and item mapping, the corpus construction and mapping details, the complete item-level validation indices, and the canonical-versus-retained measurement architecture used to position PCA relative to neighbouring constructs.
The two-study design maps directly onto this measurement gap. Study 1 derives PCA inductively from traders’ descriptions of decision-time cognitive experience, rather than deducing items from existing adoption frameworks that were not designed for this purpose (
Podsakoff et al., 2016;
Hinkin, 1995). Study 2 tests whether the resulting items are recognisable as PCA and distinguishable from the neighbouring constructs identified above, using an independent-rater procedure grounded in the perspectives of retail traders themselves (
Colquitt et al., 2019).
4. Discussion
The central measurement implication is that Perceived Cognitive Assistance (PCA) can be defined as a process-focused construct that is related to, but not reducible to, perceived usefulness and perceived ease of use (
Davis, 1989;
Podsakoff et al., 2016). The construct-level calibration results (
Table 6) show that perceived usefulness (PU; a belief that using the system improves task performance) is the closest boundary construct in this setting, because the usefulness filler item did not meet the a priori correspondence target (psa = 0.812 < 0.85), unlike the filler items for perceived ease of use, trust, and trading self-efficacy (
Colquitt et al., 2019). This pattern implies that naïve judges can treat “usefulness” as an umbrella appraisal that absorbs multiple forms of help, including decision-time cognitive support, unless definitions and item wording force an explicitly process-level interpretation (
Davis, 1989;
Podsakoff et al., 2016). At the item level, correspondence and distinctiveness criteria yielded seven CORE items and nine BORDERLINE items (
Table 3), with no PROBLEMATIC items and no overlap exclusion triggers, supporting the viability of PCA as a distinct construct while identifying perceived usefulness as the primary discriminant challenge (
Anderson & Gerbing, 1991;
Colquitt et al., 2019;
Hinkin & Tracey, 1999). This CORE–BORDERLINE distribution suggests that PCA already has a recognisable core that judges interpret as decision-time cognitive scaffolding but that its perimeter remains partially entangled with perceived usefulness language because some phrasings still invite outcome interpretation. For subsequent validation stages, we, therefore, treat perceived usefulness (and secondarily perceived ease of use) as primary discriminant validity comparators rather than peripheral controls, because PCA’s substantive value depends on demonstrating measurement signal beyond general technology appraisals (
Davis, 1989;
Hinkin, 1998;
Colquitt et al., 2019). Finally, to make the operationalization transparent, we carry forward PCA as a nine-item provisional set (
Table 5 and
Table 7), consisting of the seven CORE items plus two safeguard items retained to preserve macro-code coverage for cognitive load relief (C2) and error checking (C3) (
Colquitt et al., 2019).
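For readers auditing these classification thresholds, the item-level correspondence and distinctiveness indices follow the substantive-validity logic of Anderson and Gerbing (1991); in standard notation (a sketch of the conventional indices, not a restatement of the pre-registered decision rules):

```latex
\[
p_{sa} = \frac{n_c}{N}, \qquad
c_{sv} = \frac{n_c - n_o}{N},
\]
```

where \(n_c\) is the number of judges assigning an item to its intended construct, \(n_o\) is the largest number assigning it to any single competing construct, and \(N\) is the total number of judges. Under this convention, the usefulness filler's psa = 0.812 is consistent with 39 of the N = 48 judges selecting the intended construct (39/48 = 0.8125).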
Two retained items use plan-oriented phrasing that can be misread as performance-improving unless interpreted as decision-time structuring. In this scale, “structured path” denotes sequencing and organisation of decision steps from idea to order, and “executable trade plan” denotes translating intent into a specified plan that the trader can implement, without claiming that the plan improves returns, accuracy, or performance (
Davis, 1989;
Venkatesh et al., 2003). This wording choice is consistent with the instrument rule that PCA items describe how the decision is structured at the moment of choice, while usefulness items describe evaluative outcome appraisal (
Davis, 1989;
Venkatesh et al., 2003).
The content validation outcomes also provide early guidance about which facets are easiest to communicate as “cognitive assistance” under strict definitional tests (
Colquitt et al., 2019). Items that emphasised structure from idea to action (C1), comparison and scenario navigation under time pressure (C4), and learning-oriented understanding and reflection (C5) were more likely to meet CORE criteria (see
Table 4), which may reflect that these experiences are more distinctive from trust and ease-of-use judgements when phrased as cognitive process support (
Colquitt et al., 2019;
Davis, 1989). By contrast, the cognitive load and information triage facet (C2) and the error-checking and inconsistency detection facet (C3) did not yield CORE items, and representation of these facets, therefore, relies on the pre-specified coverage safeguard (
Hinkin, 1995;
Colquitt et al., 2019; see
Supplementary Material A for decision rules and
Supplementary Material D for full item-level indices). This outcome does not imply that C2 and C3 are outside the PCA domain, because
Supplementary Material B documents strong qualitative support for both facets in the Tier B corpus, and
Supplementary Material C provides triangulating evidence from the Tier A corpus, but it indicates that the current phrasings may invite overlap with perceived ease of use, trust, or self-efficacy unless wording is sharpened to emphasise “how my thinking changes” rather than “the tool works well” (
Podsakoff et al., 2016;
Davis, 1989). For transparency and future refinement,
Supplementary Material A provides the definitive record of the construct definitions and item wording that produced these outcomes, and
Supplementary Material D can be used to audit which distinctiveness conditions each borderline item failed under the pre-registered thresholds (
Colquitt et al., 2019).
A second implication concerns scale architecture and the sequence of validation evidence (
Hinkin, 1995). Content validation supports the claim that a subset of items corresponds to the intended definition and is not dominated by a single competing construct, but it cannot establish dimensionality, reliability, or predictive validity, which require larger-sample psychometric testing (
DeVellis, 2016;
Hinkin, 1995). The present evidence, therefore, supports treating the retained item set as a provisional instrument that is ready for factor-analytic refinement and validation in an independent sample, rather than as a final scale (
DeVellis, 2016;
Colquitt et al., 2019). Given that
Table 1 and
Supplementary Material B frame PCA as a five-facet content domain, later model comparisons should be open to both a unidimensional representation (a general PCA factor) and a correlated-facets representation, with item performance guiding whether the construct behaves as a single latent tendency or as distinguishable components in practice (
Hinkin, 1995;
DeVellis, 2016). The construct definition discipline applied here, including explicit exclusion of the autonomy and over-reliance risk pathway (C6) from the PCA domain, should help prevent post hoc drift when later statistical models are estimated (
Podsakoff et al., 2016;
J. D. Lee & See, 2004).
The results also have practical implications for research on large language model (LLM)-augmented trading and for tool governance (
Gimmelberg & Ludviga, 2025). PCA provides a way to measure the user’s perceived “cognitive lift” during complex trading decisions without collapsing that experience into simple satisfaction or usefulness ratings, which is important when behavioural change and strategic complexity are the target outcomes rather than mere adoption (
Gimmelberg & Ludviga, 2025;
Davis, 1989). In applied settings, the scale can be used as a monitoring indicator of perceived assistance intensity across tasks, strategies, or market regimes, with
Supplementary Material A providing a complete, reproducible instrument that can be fielded as written (
Colquitt et al., 2019). At the same time, the deliberate separation between PCA (C1–C5) and autonomy/over-reliance risk (C6) implies that practitioners should not use high PCA scores as evidence of safe reliance, because assistance can co-exist with responsibility drift (
J. D. Lee & See, 2004). This is why
Supplementary Material B retains C6 in the domain map even though it is not operationalised as PCA content, and why later work should pair PCA measurement with explicit over-reliance or delegation measures when governance and safety are central outcomes (
Hoff & Bashir, 2015;
J. D. Lee & See, 2004;
Podsakoff et al., 2016).
5. Conclusions
This study addresses the research question of whether Perceived Cognitive Assistance (PCA)—the felt expansion of cognitive capability at the moment of a trading decision when a large language model (LLM) is available—can be clearly defined, grounded in qualitative evidence, and supported by content validation as distinct from neighbouring constructs, producing a scale candidate ready for later psychometric validation. We answer this question in the affirmative by (i) specifying PCA as a decision-time belief focused on process-level cognitive scaffolding, and (ii) demonstrating initial content validity for a provisional PCA item pool prior to any factor analysis. Across a two-tier qualitative programme (76 sources comprising interviews, YouTube narratives, and legacy fintech studies) and a pre-registered naïve-judge procedure (N = 48), the results support a retained nine-item set (seven CORE items plus two safeguards), with no items classified as problematic (
Braun & Clarke, 2006;
Malterud et al., 2016;
Colquitt et al., 2019). These items are consistently interpreted as cognitive scaffolding rather than usefulness, ease of use, trust, or baseline trading skill (
Podsakoff et al., 2016;
Colquitt et al., 2019). The practical implication is that PCA can be measured as a distinct decision-time belief, providing a defensible input to subsequent psychometric validation and to governance-focused applications.
Several limitations follow from the scope of content validation and from the chosen design (
Colquitt et al., 2019). First, Study 2 provides evidence only for definitional correspondence and distinctiveness; dimensionality, reliability, measurement invariance, and predictive validity require larger-sample psychometric testing in planned validation waves (
Hinkin, 1995;
DeVellis, 2016). Second, the naïve-judge method depends on the clarity of the construct definitions and judges’ adherence to them;
Supplementary Material A is, therefore, essential for interpretation, and replications should keep definitions stable when testing alternative wordings (
Anderson & Gerbing, 1991;
Colquitt et al., 2019). Because first-attempt comprehension-check responses were not retained in the analysis export, comprehension performance could not be analysed and no comprehension-based robustness checks could be applied; future implementations will retain first-attempt responses (and timestamps) to enable such checks. Third, inter-rater agreement on the PCA items was low by design (Fleiss’ κ = 0.011 for PCA items alone; κ = 0.316 when filler items are included) because judges classified items against proximal comparator constructs; κ should be interpreted as a boundary-difficulty diagnostic rather than as an index of scale reliability (
Anderson & Gerbing, 1991;
Colquitt et al., 2019). Fourth, the judge sample was recruited from an online panel with OECD residence and prior LLM use for trading and may not represent retail traders using broker-integrated tools, non-English interfaces, or populations with lower technology exposure (
Hinkin, 1995). Fifth, while the qualitative corpus was treated as saturated for the PCA-relevant experiential space, it draws primarily on robo-advice, social trading, and early LLM-trading contexts; as LLM tools evolve and traders gain more experience, the assistance themes captured by C1–C6 may require updating (
Hennink et al., 2017;
Malterud et al., 2016). Finally, the PCA-perceived usefulness boundary remained the hardest discrimination test (filler accuracy 81.2%, below the 85% threshold), implying that subsequent waves should continue to tighten process-focused wording and treat perceived usefulness as a primary discriminant comparator (
Davis, 1989;
Podsakoff et al., 2016).
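For transparency about the agreement diagnostic cited in the third limitation, Fleiss’ κ in its standard multi-rater form (stated here for reference; the notation below is generic and not a restatement of the study’s analysis code) is:

```latex
\[
\kappa = \frac{\bar{P} - \bar{P}_e}{1 - \bar{P}_e}, \qquad
P_i = \frac{1}{m(m-1)} \left( \sum_{j=1}^{k} n_{ij}^2 - m \right), \qquad
\bar{P} = \frac{1}{N} \sum_{i=1}^{N} P_i, \qquad
\bar{P}_e = \sum_{j=1}^{k} p_j^2, \quad
p_j = \frac{1}{N m} \sum_{i=1}^{N} n_{ij},
\]
```

where \(m\) judges classify each of \(N\) items into \(k\) construct categories and \(n_{ij}\) is the number of judges assigning item \(i\) to category \(j\). A κ near zero, as for the PCA-only items (κ = 0.011), indicates agreement close to chance across proximal categories, consistent with its interpretation here as a boundary-difficulty diagnostic rather than a reliability index.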