Strategic Complexity and Behavioral Distortion: Retail Investing Under Large Language Model Augmentation
Abstract
1. Introduction
1.1. The Distinctive Nature of LLMs vs. Previous Investment Technologies
1.2. Do Retail Investors Favor Lower-Risk, Intuitive Strategies over Complex, High-Risk Ones?
2. Materials and Methods
2.1. Extending the Theory of Planned Behavior (TPB) for LLM-Augmented Contexts
Theory of Planned Behavior (TPB): A Starting Point
- Lack of system-oriented constructs: TPB does not account for technology-specific beliefs such as perceived usefulness or ease of use as other models do (Davis, 1989; Venkatesh & Davis, 2000). Other models can help to explain why some users develop trust in AI systems while others do not (Davis et al., 1989; Venkatesh et al., 2003).
- Neglect of affective mechanisms: TPB assumes rational intention formation (Ajzen, 1991; Sussman & Gifford, 2019). It does not incorporate emotional arousal, anticipatory anxiety, or vivid scenario framing (Sniehotta, 2009; Alhamad & Donyai, 2021)—factors that are central in LLM interactions, which often simulate high-stakes decision environments.
- No structured behavioral outputs: While TPB links intention to behavior conceptually, it provides no direct link to observable data such as portfolio changes, option usage, or trade frequency—key metrics in financial behavior modeling (East, 1993; Cucinelli et al., 2016).
2.2. Perceived Cognitive Assistance (PCA): Extending Perceived Behavioral Control for AI-Augmented Decisions
- (a) Technology Self-Efficacy. While technology self-efficacy captures confidence in using a technology effectively (Compeau & Higgins, 1995; Marakas et al., 1998), PCA specifically addresses the perceived enhancement of domain-specific capabilities through AI support. A trader might have high self-efficacy in using an LLM (knowing how to prompt and how to interpret outputs) while having low PCA (not believing it enhances their trading capabilities), or vice versa. PCA builds upon but is conceptually distinct from domain-specific self-efficacy (Bandura, 1997), which captures an individual’s belief in their ability to perform a task based on internal mastery or experience. In contrast, PCA reflects a perceived expansion of one’s capability boundaries specifically induced by external, AI-driven scaffolding, irrespective of genuine skill acquisition.
- (b) Cognitive Offloading. Cognitive offloading describes the delegation of memory or computation to external tools (e.g., calculators, to-do lists), but does not entail the internalized sense of behavioral readiness for novel or analytically intensive tasks (Risko & Gilbert, 2016; Gerlich, 2025). PCA differs fundamentally because it captures not just task delegation but the belief in expanded personal capability boundaries: the belief that one is cognitively able to engage in complex tasks due to real-time AI support, even in the absence of skill acquisition. Empirical support for this distinction is growing, as both preliminary studies (A. K. Singh et al., 2023; Spatharioti et al., 2023) and recent peer-reviewed evidence (Steyvers et al., 2025) consistently demonstrate that interactions with LLMs or dialogic AI tools inflate users’ self-assessed competence, even when their actual decision accuracy remains unchanged. Specifically, users exposed to AI-generated financial narratives rated themselves as more financially knowledgeable but failed to interpret basic derivative setups correctly (Jakesch et al., 2023; Spatharioti et al., 2023).
- (c) Trust in Automation. Trust in automation concerns the reliability, transparency, and dependability of the system (Parasuraman et al., 2000), rather than the user’s own felt competence in executing decisions under system assistance (J. D. Lee & See, 2004). PCA is orthogonal to trust: users might trust an LLM’s outputs while not feeling it enhances their capabilities, or might feel empowered by an LLM despite harboring doubts about its reliability. This distinction is crucial for understanding the “illusion of understanding” phenomenon.
- (d) Perceived Usefulness. Perceived usefulness from the Technology Acceptance Model (TAM) reflects beliefs about system utility in task performance (Davis, 1989; Venkatesh & Davis, 2000), but it does not capture the user’s self-appraisal of increased capability to act (Davis, 1989; King & He, 2006). PCA specifically captures the user’s belief about their own enhanced capabilities, not just improved outcomes. An investor might find an LLM useful for gathering information while not feeling it makes them a more capable trader.
- Positive pathway: When supported by adequate understanding and factual confirmation of LLM effectiveness, PCA facilitates risk democratization with informed confidence.
- Negative pathway: When PCA outpaces actual comprehension and factual confirmation, it may lead to behavioral distortion.
2.3. Technology Acceptance Model (TAM): Explaining Variation in LLM Uptake and Reliance
2.4. Risk-as-Feelings Theory: Modeling Affective Divergence
2.5. Behavioral Shift Index (BSI): Empirical Operationalization
2.6. Proposed Diagnostic Framework for Detecting Behavioral Shifts: Integrating Efficient Market Hypothesis and Adaptive Market Hypothesis
2.6.1. EMH as Baseline Diagnostic Framework
2.6.2. AMH as an Evolutionary Framework
3. Results
Theoretical Model—LLM Impact on Retail Investor Strategy Migration
4. Discussion and Future Research Agenda
4.1. Empirical Validation of the Theoretical Model
4.2. Detecting LLM-Induced Behavioral Change in Retail Investing
- Temporal Exposure Anchoring. The launches of ChatGPT (November 2022) and GPT-4 (March 2023) serve as exogenous structural breaks in time-series analyses of retail trading behavior. We follow event-study methodology refined for behavioral finance, aligning periods of LLM diffusion with shifts in investor activity. Diffusion intensity is proxied by LLM-related search and engagement metrics (e.g., Google Trends, Reddit, YouTube), a strategy grounded in the literature on technology adoption and investor attention (Kirtac & Germano, 2024).
- Strategy Shifts in Retail Trading (BSI Changes). Behavioral indicators, such as multi-leg options trading, increased turnover, portfolio concentration, and reduced holding durations, are used to capture shifts in investor strategies (Lopez-Lira & Tang, 2024). These serve as observable manifestations of perceived cognitive assistance (PCA) and overconfidence, consistent with prior work linking psychological distortions to risky retail trading (Glaser & Weber, 2007; Han et al., 2022). These metrics compose the Behavioral Shift Index (BSI), defined and mathematically specified in Appendix A.
- Psychological Mediation via TPB and PCA Calibration. To directly investigate psychological pathways, we implement TPB-based surveys measuring changes in TPB constructs following LLM exposure. TPB has robust empirical support across domains and informs our understanding of intention-to-behavior processes (Ajzen, 1991; Gimmelberg et al., 2025b). Qualitative interviews and confidence-calibration tasks are used to detect PCA-induced miscalibration, where increased confidence does not correspond to improved trading outcomes.
- Attribution Control via Market Diagnostics (EMH/AMH). We apply matched-asset counterfactuals—comparing LLM-associated securities to similar but presumably unexposed ones—enabling difference-in-differences estimation that accounts for confounders such as macro shocks, earnings announcements, and social sentiment (Tetlock, 2007). Patterns of behavioral change accompanied by price movement would align with competitive equilibrium. In contrast, persistent behavior with negligible price adjustment may indicate AMH dynamics, whereas sustained inefficiencies could suggest EMH deviations.
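The difference-in-differences logic behind the matched-asset counterfactual can be illustrated with a minimal sketch. It assumes a single pre/post split around an LLM release date and a scalar outcome such as weekly turnover; the function name, variable names, and numbers are hypothetical, and standard errors and covariates are omitted.

```python
from statistics import mean

def did_estimate(exposed_pre, exposed_post, control_pre, control_post):
    """Two-by-two difference-in-differences on group means."""
    exposed_change = mean(exposed_post) - mean(exposed_pre)
    control_change = mean(control_post) - mean(control_pre)
    # The control change absorbs shocks common to both groups (assumed parallel trends).
    return exposed_change - control_change

# Hypothetical turnover values for LLM-associated and matched control securities.
exposed_pre, exposed_post = [1.0, 1.2, 0.8], [1.6, 1.8, 1.4]
control_pre, control_post = [0.9, 1.1, 1.0], [1.1, 1.3, 1.2]

effect = did_estimate(exposed_pre, exposed_post, control_pre, control_post)
print(round(effect, 3))
```

The estimate is only interpretable as an LLM-related effect if the matched securities would have followed the same trend in the absence of exposure, which is the identifying assumption of the design.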
4.3. Dual Simulation Benchmarking: The Virtual Trader and Digital Persona Framework
4.3.1. Virtual Trader: A Cognitively Degraded Counterfactual
4.3.2. Digital Persona: Human Plausibility via LLM Emulation
4.3.3. Epistemic Triangulation and Inference Logic
- Both baselines underperform the LLM → Strong evidence for LLM-enabled cognitive uplift and strategic enhancement.
- Only the degraded Virtual Trader underperforms → Indicates that the LLM mimics best-practice human logic without necessarily surpassing it.
- Digital Persona outperforms the LLM → Suggests that LLMs may introduce distortions or risk-taking strategies inconsistent with typical investor behavior.
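The three inference cases above can be written as a small decision rule. This is a sketch only: the function name, the returned labels, and the tolerance parameter are illustrative assumptions, and in practice "underperforms" would be decided by a statistical test on matched trade pairs rather than a raw comparison of point estimates.

```python
def triangulate(llm_perf, virtual_trader_perf, digital_persona_perf, tol=0.0):
    """Map the two baseline comparisons onto the three inference cases."""
    vt_under = virtual_trader_perf < llm_perf - tol
    dp_under = digital_persona_perf < llm_perf - tol
    dp_over = digital_persona_perf > llm_perf + tol
    if dp_over:
        return "possible LLM-induced distortion"
    if vt_under and dp_under:
        return "cognitive uplift"
    if vt_under:
        return "LLM mimics best-practice human logic"
    return "inconclusive"  # a case the three bullets do not cover

print(triangulate(0.10, 0.04, 0.06))
print(triangulate(0.10, 0.04, 0.10))
print(triangulate(0.10, 0.04, 0.15))
```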
4.4. Behavioral Bias: Quantification and Controls
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Behavioral Models Tests and Components
Appendix A.1. Components and Units of Measurement in the TPB-Based Regression Model
- β coefficients represent the estimated weights or effects of each explanatory variable on the behavioral outcome B_i(t).
- LLM_i(t): LLM engagement intensity.
- PCA_i(t): perceived cognitive assistance.
- A_i(t), SN_i(t): TPB components (attitude toward complex strategies; subjective norms).
- Controls_i(t): vector of control covariates.
- β_k: regression coefficients.
- γ: coefficient vector on Controls_i(t).
- ε_i(t): residual error.
| Operationalization of Bi(t) | Unit of Measurement | Interpretation |
|---|---|---|
| Number of complex trades per week | Count (integer) | A discrete behavior frequency |
| Proportion of complex trades | Ratio (0–1) | A share or percentage |
| Portfolio risk score | Volatility or risk points | Risk exposure level |
| BSI score | Unitless composite index | Behavioral complexity measure |
| Change in strategy complexity | Delta in ordinal/index score | Magnitude of strategy shift |
| Self-assessment of behavior (Likert) | Ordinal (1–5/1–7) | Perceived behavior intensity |
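One linear specification consistent with the components listed in this appendix can be written as follows. This is a hedged reconstruction from the symbol definitions only: the coefficient numbering and the absence of interaction terms (for example, between LLM engagement and PCA) are assumptions.

```latex
B_i(t) = \beta_0 + \beta_1\,\mathrm{LLM}_i(t) + \beta_2\,\mathrm{PCA}_i(t)
       + \beta_3\,A_i(t) + \beta_4\,SN_i(t)
       + \gamma^{\top}\mathrm{Controls}_i(t) + \varepsilon_i(t)
```

Under this form, the unit of each coefficient follows from the chosen operationalization of B_i(t) in the table above; for a count outcome, a count-data link would be a natural alternative to the linear form.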
Appendix A.2. Components and Units of Measurement in the BSI Formula
- Symbols: i indexes investors; t indexes time; k ∈ {1, 2, 3, 4} indexes the four BSI components (MultiLeg, Frequency, Concentration, VolExposure); w_k are non-negative component weights; w = (w_1, …, w_4).
- t: a specific time point or interval (e.g., week 1 after ChatGPT launch).
- ΔMultiLeg_{i,t}: Change in proportion of multi-leg options trades.
- ΔFrequency_{i,t}: Change in trading frequency.
- ΔConcentration_{i,t}: Change in portfolio concentration (e.g., top-3 asset weight).
- ΔVolExposure_{i,t}: Change in exposure to implied volatility.
| Component | Δ Variable Description | Unit of Measurement |
|---|---|---|
| ΔMultiLeg_t | Change in proportion of multi-leg option trades | Ratio (0–1) or percentage points |
| ΔFrequency_t | Change in trading frequency | Trades per period (count) |
| ΔConcentration_t | Change in portfolio concentration | Ratio (0–1) or percentage points |
| ΔVolExposure_t | Change in volatility exposure | Volatility units or standardized scores |
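A weighted composite of the four components can be sketched as below. Two assumptions are made that the text does not state explicitly: the component changes are already standardized to a common scale (their raw units differ, as the table shows), and the weights are rescaled to sum to one. The function name and example values are hypothetical.

```python
COMPONENTS = ("MultiLeg", "Frequency", "Concentration", "VolExposure")

def behavioral_shift_index(deltas, weights):
    """Weighted sum of standardized component changes for one investor-period."""
    if any(weights[k] < 0 for k in COMPONENTS):
        raise ValueError("component weights must be non-negative")
    total = sum(weights[k] for k in COMPONENTS)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    # Rescale so the weights sum to one (an assumption of this sketch).
    return sum(weights[k] / total * deltas[k] for k in COMPONENTS)

# Hypothetical standardized changes and equal weights.
deltas = {"MultiLeg": 0.5, "Frequency": 1.0, "Concentration": -0.2, "VolExposure": 0.3}
weights = {k: 1.0 for k in COMPONENTS}

bsi = behavioral_shift_index(deltas, weights)
print(round(bsi, 3))
```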
| Aspect | TPB Regression Model | BSI Composite Index |
|---|---|---|
| Formula | B_i(t) = δ0 + δ1·LLM_i(t) + … | BSI_t = w1·ΔMultiLeg_t + … |
| Dependent Variable | Behavior/intention of individual i at time t | Index at time t (cohort level) |
| Output Unit | Likert, %, or count | Unitless composite |
| Input Variables | TPB constructs + LLM exposure | Trading behavior metrics |
| Time scale | Cross-sectional | Time-series or panel |
| Use case | Explains how LLMs shift intentions | Measures actual behavior change |
| Level of Analysis | Micro (individual) | Meso/Macro (cohort or group) |
| Methodology | Regression (causal inference) | Composite diagnostics |
Appendix A.3. Complementarity Within the Research Model
Appendix A.4. Convergent/Discriminant Validity Test for PCA
- PCA (AI-scaffolded self-efficacy): “I feel confident executing complex trading strategies because the AI helps me understand the required steps.”
- Automation Trust (system reliability): “I trust AI systems to make accurate and responsible trading recommendations.”
- Perceived Usefulness (task enhancement): “AI systems improve the overall quality of my trading performance.”
Appendix B. Simulation Agent Design: Virtual Trader and Digital Persona Framework for Causal Inference
Appendix B.1. Dataset Construction and Requirements
- Trade Sample Size
- To ensure statistical robustness, we require a minimum of 500 matched trade pairs (LLM-assisted and simulated), supporting the following:
- Paired t-tests for return differentials (α = 0.05, power = 0.80).
- Regression-based analyses of risk-adjusted performance.
- Subgroup comparisons by strategy type and market regime.
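The sample-size requirement for the paired t-test can be related to a detectable effect size with a short calculation. The sketch below uses the normal approximation to the t distribution, which is reasonable at this sample size, and expresses the effect as Cohen's d on the paired differences (mean difference divided by the standard deviation of the differences). Function names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

def min_detectable_effect(n_pairs, alpha=0.05, power=0.80):
    """Smallest standardized paired difference detectable (two-sided test)."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / sqrt(n_pairs)

def required_pairs(effect_size, alpha=0.05, power=0.80):
    """Number of matched pairs needed to detect a given standardized difference."""
    z = NormalDist()
    return ceil(((z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / effect_size) ** 2)

mde = min_detectable_effect(500)
print(round(mde, 3))        # about 0.125 with 500 pairs
print(required_pairs(0.2))  # pairs needed for a small effect of d = 0.2
```

Under these assumptions, 500 matched pairs are sufficient to detect standardized return differentials of roughly 0.125 or larger; subgroup comparisons split the sample and therefore detect only larger effects.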
- Asset Selection
- To minimize idiosyncratic noise and ensure strategic generalizability, the dataset shall include 20–30 low-correlation assets (|ρ| < 0.3), drawn from the following:
- Equities (e.g., AAPL, MSFT, IWM, QQQ).
- Volatility ETFs (e.g., VXX, UVXY).
- Macro instruments (e.g., TLT, GLD, BTC).
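One way to enforce the |ρ| < 0.3 criterion is a greedy filter over candidate return series, sketched below. The procedure is an assumption of this example rather than a stated part of the design; it is order-dependent, so candidates would need to be ranked (for example, by liquidity) beforehand. Tickers and return values are placeholders.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length return series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def select_low_correlation(returns, max_abs_rho=0.3):
    """Keep an asset only if it is weakly correlated with every asset already kept."""
    selected = []
    for ticker, series in returns.items():
        if all(abs(pearson(series, returns[s])) < max_abs_rho for s in selected):
            selected.append(ticker)
    return selected

# Placeholder series: B is perfectly correlated with A, C is uncorrelated with A.
returns = {
    "A": [1.0, -1.0, 1.0, -1.0],
    "B": [2.0, -2.0, 2.0, -2.0],
    "C": [1.0, 1.0, -1.0, -1.0],
}
selected = select_low_correlation(returns)
print(selected)
```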
- Strategic Variants
- Each trade pair shall be assigned one of the canonical strategies, covering both intuitive and complex forms, such as the following:
- Momentum (price breakout).
- Contrarian (mean-reversion with volatility filter).
- Volatility overlays (e.g., long straddles).
- Hybrid (momentum + IV skew).
- Event-driven (earnings, Fed policy).
- Passive hold with timing overlays.
- Operational trading families (consistent with OSF registration (Gimmelberg et al., 2025a)).
- Directional: 1–9-week momentum/continuation or breakouts when long-trend and Story gates hold.
- Volatility overlays: long-vol versus short-vol expressions chosen by IV/term/skew context to express dispersion or containment.
- Theta plays: income-oriented overlays that monetize time decay while respecting structural levels and volatility posture.
- Earnings-print plays: event-constrained tactics around expected move and term structure under explicit guardrails.
- STORY (including Dividend STORY) with tactical overlays: actions conditioned on a validated narrative, using overlays to scale in/out or harvest carry without breaking the thesis.
- Temporary ranging long stocks: range-bound/channel tactics and pin-type overlays when structural levels and positioning imply containment.
- Market Regime Coverage
- Bull vs. bear markets.
- High vs. low volatility regimes (based on VIX > 20 threshold).
- Earnings/non-earnings weeks.
- Replicability note.
Appendix B.2. Agent Architecture Overview
| Agent Type | Design Logic | Epistemic Role | Simulation Method |
|---|---|---|---|
| Virtual Trader | Cognitive constraint modeling | Bounded human logic (lower bound) | Degradation |
| Digital Persona | Behavioral emulation via prompting | Demographically plausible behavior | 2-step prompting |
Appendix B.3. Simulation Design: Cognitive Degradation Parameters
| Dimension | LLMs | Humans |
|---|---|---|
| Processing Speed | Quick execution per trade scenario; process vast amounts of data consistently without fatigue, large context window | Slower evaluation; multimodal switching cost, limited “context window” |
| Input Retention | No memory loss across input streams; able to handle complex multisource data simultaneously | Limited working memory; chance of omitting secondary inputs under cognitive load |
| Bias Resistance | Reduced susceptibility to emotional and cognitive biases; biases present but potentially mitigable; unaffected by prior beliefs or emotions | Well-documented vulnerability to emotional and cognitive biases in trading |
| Multimodal Integration | Synthesizes chart patterns, macro data, and options Greeks simultaneously into coherent reasoning | Often relies on few dominant signals or simple heuristics due to overload or lack of training |
| Fatigue and Focus | Performance stable over time; no degradation across multiple trade evaluations | Pattern detection errors rise after several evaluations; accuracy drops with time-on-task |
| Volatility Interpretation | Parses IV skew, gamma, and vega structures with consistency and fewer errors | High variance in interpretation; misestimation of IV thresholds common among retail traders |
| Analytical Depth | Capable of generating detailed, structured financial narratives and scenario analysis | Dependent on intuition and experience; struggles with scaling across large datasets |
| Adaptability and Learning | Retrainable on new datasets; consistent output over fixed training distribution | Adapt in real-time through experience, error correction, and feedback loops |
| Emotional Resilience | Unaffected by stress, mood, or fear of loss; fully consistent under pressure | Emotional stress can degrade judgment, increase error rates |
| Creativity and Intuition | Generates output from statistical pattern recognition; lacks true creativity or intuition | Capable of intuitive judgment and creative problem-solving under novel or ambiguous conditions |
- 1. Information Processing Constraints
- 2. Input Retention
- 3. Bias Resistance
- 4. Multimodal Integration
- 5. Cognitive Fatigue and Focus
- 6. Volatility Interpretation
- 7. Analytical Depth
- 8. Adaptability and Learning
- 9. Emotional Resilience
- 10. Creativity and Intuition
- Interdependency Modeling
- Fatigue increases dropout probability by 20%.
- Emotional arousal increases IV misreading likelihood by 30%.
- Bias susceptibility amplifies under volatile market regimes.
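The first two interdependencies can be sketched as adjustments to baseline degradation probabilities. The text does not specify whether the 20% and 30% increases are relative or absolute; this sketch assumes relative (multiplicative) increases capped at 1.0, and the baseline values are hypothetical. The third interdependency is omitted because no magnitude is given.

```python
def apply_interdependencies(dropout_prob, iv_misread_prob, fatigued, aroused):
    """Adjust baseline degradation probabilities for fatigue and emotional arousal."""
    if fatigued:
        # Assumed relative increase of 20% in input-dropout probability.
        dropout_prob = min(1.0, dropout_prob * 1.20)
    if aroused:
        # Assumed relative increase of 30% in IV misreading likelihood.
        iv_misread_prob = min(1.0, iv_misread_prob * 1.30)
    return dropout_prob, iv_misread_prob

# Hypothetical baselines: 10% input dropout, 20% IV misreading.
dropout, misread = apply_interdependencies(0.10, 0.20, fatigued=True, aroused=True)
print(round(dropout, 3), round(misread, 3))
```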
- LLM Trade Validation Prior to Degradation
- Signal Alignment Test: Trade rationale must show internal coherence across chart structure, IV skew (implied volatility), and GEX (dealer gamma exposure) positioning.
- Confidence Scoring: Trade explanations must be rated “high confidence” by the LLM’s self-reflective prompt or a second-model verification (Ganguli et al., 2022a; Shinn et al., 2024).
Appendix B.4. Digital Persona: Prompt-Based Behavioral Emulation via LLM Prompting
- Investor Archetype Layer
| Persona Type Example | Core Characteristics | Typical Strategy Profile | Behavioral Biases |
|---|---|---|---|
| Conservative Retiree | Age 65+, income-focused, low digital literacy | Bond ETFs, dividend stocks, low turnover | High loss aversion, inertia |
| Tech-Savvy Millennial | Age 25–35, high digital fluency, growth-seeking | Crypto, speculative tech, high-frequency trades | FOMO, recency bias |
| Experienced Amateur | Age 40–55, 10+ years market exposure, rule-based cognition | Options spreads, sector rotation, event trades | Overconfidence, underreaction |
| Novice Enthusiast | Age 20–30, <2 years experience, social media influenced | Meme stocks, trend following, short-hold momentum | Herding, anchoring, thrill-seeking |
- Causal Modeling Constraints
- No memory or feedback loops: Personas do not retain trade history or adapt based on past outcomes.
- No reinforcement logic: Strategy choice is not influenced by synthetic rewards or simulated gain/loss records.
- No social learning or community modeling: Personas are isolated from group sentiment, crowd signals, or simulated feedback mechanisms.
- No path-dependent evolution: Risk profiles, biases, and strategy tendencies are fixed within each simulation episode.
- Fidelity Anchors and Behavior Modulation
- Risk-aligned decision logic: Personas with low risk tolerance avoid leveraged instruments or multi-leg derivatives unless strongly justified by signals.
- Emotionally modulated responses: Anxious profiles avoid volatility overlays during VIX spikes; overconfident personas may chase trend reversals.
- Trait-behavior congruence: Internal consistency is maintained between persona traits and the chosen strategy (e.g., a Conservative Retiree cannot execute a long straddle on TSLA).
- Heuristic structuring: Output logic emulates retail mental models—such as round-number targeting, recency-based signal preference, or “confirmation bias” pattern reinforcement.
- Validation Protocols and Reproducibility Controls
| Test Type | Purpose |
|---|---|
| Temporal Consistency Test | Re-running persona prompt with identical input at different times yields similar outputs. |
| Cross-Scenario Coherence | Behavioral traits remain stable across regimes (bull, bear, neutral). |
| Framing Effect Control | Varying prompt phrasing does not cause illogical strategic divergence. |
| Temperature Sensitivity | Outputs are stable across runs at fixed temperature (set at 0.4). |
| Randomness Mitigation | Multiple iterations confirm output stability and noise resilience. |
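Output stability across repeated persona runs can be summarized by a simple agreement rate, sketched below. The metric (share of runs matching the most frequent output) and the strategy labels are assumptions of this example; the table does not prescribe a specific stability statistic.

```python
from collections import Counter

def modal_agreement(outputs):
    """Share of repeated runs that match the most frequent output."""
    counts = Counter(outputs)
    return counts.most_common(1)[0][1] / len(outputs)

# Hypothetical strategy labels from four runs of the same persona prompt.
runs = ["hold", "hold", "momentum", "hold"]
agreement = modal_agreement(runs)
print(agreement)
```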
Appendix B.5. Performance Metrics and Evaluation Framework
| Metric | Description |
|---|---|
| Return | Absolute net gain/loss per trade (LLM vs. Virtual Trader) |
| Sharpe Ratio | Risk-adjusted return measured over a rolling window, standardizing performance relative to volatility exposure. |
| Max Drawdown | Largest cumulative loss from peak to trough |
| Win Rate | Percentage of trades yielding positive returns, reflecting execution consistency. |
| Missed Opportunity Rate | Percentage of profitable LLM trades skipped by the Virtual Trader, isolating the behavioral and cognitive cost of bounded rationality and conservatism. |
| Latency-adjusted ROI | Return per unit of evaluation time, accounting for cognitive processing speed as a contributor to performance differentials. |
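Several of these metrics can be computed from per-trade return series, as in the sketch below. It makes simplifying assumptions: the Sharpe ratio is not annualized and is computed over the full sample rather than a rolling window, drawdown is measured on a compounded equity curve starting at 1.0, and the Virtual Trader's participation is recorded as a boolean per trade. All names and numbers are illustrative.

```python
from statistics import mean, stdev

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)

def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst

def win_rate(returns):
    """Fraction of trades with a strictly positive return."""
    return sum(r > 0 for r in returns) / len(returns)

def missed_opportunity_rate(llm_returns, trader_took):
    """Share of profitable LLM trades that the Virtual Trader skipped."""
    profitable = [took for r, took in zip(llm_returns, trader_took) if r > 0]
    return sum(not took for took in profitable) / len(profitable)

# Hypothetical per-trade returns and Virtual Trader participation flags.
llm_returns = [0.10, -0.05, 0.20, -0.10]
trader_took = [True, True, False, True]

print(round(sharpe_ratio(llm_returns), 3))
print(round(max_drawdown(llm_returns), 3))
print(win_rate(llm_returns))
print(missed_opportunity_rate(llm_returns, trader_took))
```

The missed opportunity rate is undefined when the LLM has no profitable trades in the sample, a case this sketch does not handle.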
Appendix B.6. Limitations and Extensions
- Residual Methodological Constraints
- Baseline Dependence on LLM Validation. The quality of the Virtual Trader simulation is constrained by the reliability of the LLM-assisted trade it mirrors. Without robust hallucination detection and contradiction screening, the benchmarked trade may encode subtle structural flaws. Even validated LLMs exhibit confident errors under stress scenarios or adversarial prompts (Ganguli et al., 2022b; Shinn et al., 2024).
- Static Agent Architecture. Digital Personas are designed as non-learning, non-adaptive agents to preserve causal inference integrity. This excludes path-dependent decision-making, memory of past trades, and incentive-based learning, all of which play meaningful roles in real-world behavior. While this design choice ensures clean counterfactual attribution, it limits realism, particularly for experienced investors (Lux & Zwinkels, 2018).
- Absence of Trader-Type Calibration. Degradation parameters applied to the Virtual Trader are drawn from empirical aggregate benchmarks but are not yet tailored to specific archetypes (e.g., high-frequency traders, swing traders, retirees). This generalization may mask behaviorally relevant asymmetries in bounded cognition across subgroups (Almansour & Elkrghli, 2023; Ruggeri et al., 2023).
- Ecological Validity Gaps. The simulation environment abstracts away real-world frictions such as bid-ask spreads, slippage, mobile interface constraints, and platform-induced cognitive load. These omitted variables are known to shape live trade execution behavior (Barber & Odean, 2002; Wheeler & Varner, 2024).
- Validation Architecture
- LLM Baseline Reliability
- Ensemble Model Comparison: Cross-validation across distinct LLM families (e.g., GPT-4, Claude, Gemini).
- Confidence Scoring and Contradiction Detection: Flag low-certainty and conflicting outputs.
- Expert Calibration: Manual review of a sample of LLM trades by human finance experts.
- Adversarial and Regime Testing
- Hallucination Stress Tests: Detecting factual inconsistencies in rationale.
- Bias Amplification Monitoring: Checking for recency, confirmation, and availability biases.
- Scenario Boundaries: Evaluating LLM consistency in volatile markets and during black-swan events.
- Reproducibility and Robustness Protocols
| Validation Tier | Purpose |
|---|---|
| Internal | Trait–behavior coherence in Personas; degradation logic stability across trade types |
| External | Consistency with behavioral finance findings and retail trading datasets |
| Temporal | Simulation outcomes replayed on out-of-sample market data for cross-period validation |
Appendix B.7. Sensitivity and Stress Testing
- ±20% variation across all degradation weights.
- Alternate prompt constructions for the same persona.
- LLM output variation across prompt phrasings and temperature ranges.
- Strategy divergence testing under high-volatility, low-liquidity, and regulatory change scenarios.
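The ±20% variation of degradation weights can be organized as a one-at-a-time sensitivity sweep, sketched below. Varying one weight at a time (rather than all jointly) is an assumption of this example, and the weight names and baseline values are hypothetical.

```python
def perturb_weights(weights, variation=0.20):
    """Generate scenarios that scale each weight down and up by the given fraction."""
    scenarios = []
    for name in weights:
        for factor in (1 - variation, 1 + variation):
            scenario = dict(weights)  # copy so the baseline stays untouched
            scenario[name] = weights[name] * factor
            scenarios.append(scenario)
    return scenarios

# Hypothetical degradation weights.
baseline = {"fatigue": 0.5, "bias": 0.25}
scenarios = perturb_weights(baseline)
print(len(scenarios))
```

Each scenario would then be passed through the Virtual Trader simulation, and the spread of the resulting performance metrics indicates how sensitive conclusions are to any single weight.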
- Pathways for Extension and Scalability
- Hybrid Agents: Combining degradation-based impairments with stable persona traits to simulate boundedly rational learning agents.
- Reinforcement-Conditioned Personas: Selectively enabling feedback loops in simulation sessions longer than 10 trades.
- Memory Integration: Using tokenized state representation to simulate persistent trade cognition.
- Real-Time Market APIs: Embedding streaming volatility and flow data into simulation contexts.
- Cross-Market Generalization: Testing framework adaptability in international and decentralized finance (DeFi) markets.
Appendix B.8. Human Validation
Appendix B.9. Behavioral Bias Operationalization (Investor-Level)
| Measure | Operational Rule (Summary) | Inputs (From Workflow) | Interpretation |
|---|---|---|---|
| Calibration error (probabilistic) | For decisions with explicit probabilities, compute Brier score and decompose into reliability/resolution/uncertainty; where interval forecasts are present, report empirical coverage vs. nominal. Aggregate by agent and period. | Decision records with stated probabilities and/or intervals; realized outcomes | Lower Brier and higher reliability imply better calibration |
| Disposition effect (PGR–PLR) | Compute Proportion of Gains Realized (PGR) minus Proportion of Losses Realized (PLR); for options, define gains/losses via mark-to-market relative to entry/basis at decision time. | Trade logs and timestamps; mark-to-market states; closing/roll actions | Higher PGR–PLR indicates a stronger disposition effect |
| Selective exposure (confirmation) | Share of belief-congruent vs. incongruent evidence tokens in each decision’s recorded evidence slate; aggregate by agent and period. | LAT/DP evidence logs (tokenized sources/claims), decision label | Higher congruent-share indicates stronger selective exposure |
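The three measures reduce to simple ratios once the decision records are available. The sketch below computes the overall Brier score (without the reliability/resolution/uncertainty decomposition, which requires binning forecasts), the PGR minus PLR difference from counts of realized and paper gains and losses, and the congruent-evidence share. Function names and example counts are hypothetical.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between stated probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def disposition_effect(realized_gains, paper_gains, realized_losses, paper_losses):
    """Proportion of Gains Realized minus Proportion of Losses Realized."""
    pgr = realized_gains / (realized_gains + paper_gains)
    plr = realized_losses / (realized_losses + paper_losses)
    return pgr - plr

def congruent_share(congruent_tokens, incongruent_tokens):
    """Share of belief-congruent evidence tokens in a decision's evidence slate."""
    return congruent_tokens / (congruent_tokens + incongruent_tokens)

# Hypothetical records: three probabilistic decisions and position counts.
brier = brier_score([0.8, 0.6, 0.3], [1, 0, 0])
pgr_minus_plr = disposition_effect(6, 4, 2, 8)
share = congruent_share(9, 3)
print(round(brier, 4), round(pgr_minus_plr, 2), share)
```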
Appendix B.10. Strategy Structural Complexity (SSC) Coding (C0–C3)
Appendix C. Candidate Empirical Methods for Market-Level Diagnostics
Appendix C.1. EMH-Aligned Diagnostics: Detecting Deviations from Informational Efficiency
| Method | Description | Key Metric |
|---|---|---|
| Matched Asset Counterfactuals | Compare high–LLM-exposure assets with volatility- and beta-matched control assets to isolate LLM effects on performance and volatility | Differential Sharpe ratios, return drift, realized volatility |
| Event Window Structural Breaks | Use LLM release points (e.g., ChatGPT, GPT-4) as anchors to detect structural breaks in return distributions or volatility regimes | Chow tests, Bai–Perron segmentation |
| Return Drift/Variance Ratio Testing | Examine whether LLM-flagged trades or tickers show non-random post-entry price behavior | Variance ratio, autocorrelation, CARs |
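The variance ratio statistic compares the variance of q-period returns with q times the variance of one-period returns; under a random walk the ratio is close to one. The sketch below uses overlapping q-period sums and population variances, and omits the small-sample bias corrections and test statistics used in formal variance ratio tests. The return series is synthetic.

```python
from statistics import pvariance

def variance_ratio(returns, q):
    """Variance of overlapping q-period returns over q times one-period variance."""
    one_period_var = pvariance(returns)
    q_period = [sum(returns[i:i + q]) for i in range(len(returns) - q + 1)]
    return pvariance(q_period) / (q * one_period_var)

# Synthetic, strongly mean-reverting series: every gain is followed by an equal loss.
alternating = [0.01, -0.01] * 10
vr = variance_ratio(alternating, 2)
print(vr)
```

Ratios well below one indicate mean reversion, and ratios above one indicate positive autocorrelation (drift), which is the pattern of interest for LLM-flagged tickers.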
Appendix C.2. AMH-Aligned Diagnostics: Interpreting Adaptive Behavioral Evolution
| Method | Description | Key Metric |
|---|---|---|
| Time-Interacted Anomaly Decay | Model anomaly persistence as a function of time since LLM diffusion, capturing performance normalization | Time-interacted Sharpe ratio regressions, cohort decay coefficients |
| Three-Stage Adaptation Timeline | Operationalize AMH as a timeline: Stage 1 (0–6 months: surge); Stage 2 (6–18 months: crowding); Stage 3 (18+ months: absorption) | Time-windowed anomaly strength |
| Strategy Clustering and Mimicry | Detect convergence in retail behavior via prompt similarity, repeated ticker targeting, or synchronized entry timing | Cosine similarity (prompts); correlation matrices (tickers); timestamp clustering |
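Prompt similarity can be approximated with cosine similarity over word counts, as sketched below. A bag-of-words representation is an assumption made for brevity; embedding-based representations would serve the same role. The prompts are invented examples.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(prompt_a, prompt_b):
    """Cosine similarity between bag-of-words count vectors of two prompts."""
    ca = Counter(prompt_a.lower().split())
    cb = Counter(prompt_b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm_a = sqrt(sum(v * v for v in ca.values()))
    norm_b = sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Invented prompts: the first two share most words, the third shares none.
p1 = "suggest an iron condor on SPY for next week"
p2 = "suggest an iron condor on QQQ for next week"
p3 = "explain dividend reinvestment plans"

sim_close = cosine_similarity(p1, p2)
sim_far = cosine_similarity(p1, p3)
print(round(sim_close, 3), sim_far)
```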
Appendix C.3. Integration with Core Behavioral Framework
| 1 | GARCH: generalized autoregressive conditional heteroskedasticity model. |
| 2 | https://osf.io/ (accessed on 30 August 2025). |
| 3 | For most LLMs, 4 tokens approximately equal 3 words; see https://www.baseten.co/ (accessed on 30 August 2025). |
| 4 | https://www.anthropic.com/claude/sonnet (accessed on 30 August 2025). |
| 5 | While both Processing Speed and Cognitive Fatigue limitations may contribute to reduced human throughput relative to LLM performance, the current framework treats these as potentially overlapping but conceptually distinct constraints. The 25% throughput estimate suggested for processing speed analysis may partially or fully encompass fatigue-related decrements. Further empirical investigation is required to decompose these effects and determine whether they represent independent or confounded limitations. For the purposes of this simulation framework, we implement separate parameters while acknowledging this represents a preliminary operationalization pending empirical validation. |
References
- Abadie, A., Diamond, A., & Hainmueller, J. (2010). Synthetic control methods for comparative case studies: Estimating the effect of California’s tobacco control program. Journal of the American Statistical Association, 105(490), 493–505.
- Abdullahi, M. (2021). The efficient market hypothesis: A critical review of equilibrium models and imperical evidence. African Scholar Journal of Mgt. Science and Entrepreneurship, 23(7), 379–386. Available online: http://www.africanscholarpublications.com/wp-content/uploads/2022/03/AJMSE_Vol23_No7_Dec2021-23.pdf (accessed on 10 February 2025).
- Ajzen, I. (1991). The theory of planned behavior. Organizational Behavior and Human Decision Processes, 50(2), 179–211.
- Akepanidtaworn, K., Mascio, R. D., Imas, A., & Schmidt, L. D. W. (2023). Selling fast and buying slow: Heuristics and trading performance of institutional investors. The Journal of Finance, 78(6), 3055–3098.
- Alhamad, H., & Donyai, P. (2021). The validity of the theory of planned behavior for understanding people’s beliefs and intentions toward reusing medicines. Pharmacy, 9(1), 58.
- Almansour, B. Y., & Elkrghli, S. (2023). Behavioral finance factors and investment decisions: A mediating role of risk perception. Cogent Economics & Finance, 11(2), 2239032.
- Alsup, A. (2023, April 25). Retail investors play a losing game with complex options, according to research. UF Warrington News. Available online: https://news.warrington.ufl.edu/faculty-and-research/retail-investors-play-a-losing-game-with-complex-options-according-to-research/ (accessed on 22 June 2025).
- Anthis, J. R., Liu, R., Richardson, S. M., Kozlowski, A. C., Koch, B., Evans, J., Brynjolfsson, E., & Bernstein, M. (2025). LLM social simulations are a promising research method. arXiv.
- Aqham, A., Endaryati, E., Subroto, V., & Kusumajaya, R. (2024). Behavioral biases in investment decisions: A mixed-methods study on retail investors in emerging markets. Journal of Management and Informatics, 3, 568–586.
- Armitage, C. J., & Conner, M. (2001). Efficacy of the theory of planned behavior: A meta-analytic review. The British Journal of Social Psychology, 40(4), 471–499.
- Athey, S., & Wager, S. (2019). Estimating treatment effects with causal forests: An application. arXiv.
- Bahaj, A., Rahimi, H., Chetouani, M., & Ghogho, M. (2025). Gauging overprecision in LLMs: An empirical study. arXiv.
- Bai, J., & Perron, P. (2003). Computation and analysis of multiple structural change models. Journal of Applied Econometrics, 18(1), 1–22.
- Bandi, F. M., Fusari, N., & Renò, R. (2023). 0DTE option pricing. Social Science Research Network.
- Bandura, A. (1997). Self-efficacy: The exercise of control (pp. ix, 604). W H Freeman/Times Books/Henry Holt & Co.
- Barber, B. M., Huang, X., Odean, T., & Schwarz, C. (2021). Attention induced trading and returns: Evidence from Robinhood users. Social Science Research Network.
- Barber, B. M., & Odean, T. (2000). Trading is hazardous to your wealth: The common stock investment performance of individual investors. The Journal of Finance, 55(2), 773–806.
- Barber, B. M., & Odean, T. (2002). Online investors: Do the slow die first? The Review of Financial Studies, 15(2), 455–488.
- Barber, B. M., Odean, T., & Zhu, N. (2009). Do retail trades move markets? The Review of Financial Studies, 22(1), 151–186.
- Barberis, N., Huang, M., & Santos, T. (1999). Prospect theory and asset prices. Social Science Research Network.
- Barberis, N., & Thaler, R. H. (2002). A survey of behavioral finance. Social Science Research Network. Available online: https://papers.ssrn.com/abstract=332266 (accessed on 24 June 2025).
- Bartsch, H., Jorgensen, O., Rosati, D., Hoelscher-Obermaier, J., & Pfau, J. (2023). Self-consistency of large language models under ambiguity. arXiv.
- Baumeister, R. F. (2018). Self-regulation and self-control: Selected works of Roy F. Baumeister (1st ed.). Routledge.
- Bawalle, A. A., Khan, M. S. R., & Kadoya, Y. (2025). Overconfidence, financial literacy, and panic selling: Evidence from Japan. PLoS ONE, 20(3), e0315622.
- Behrens, M., Gube, M., Chaabene, H., Prieske, O., Zenon, A., Broscheid, K.-C., Schega, L., Husmann, F., & Weippert, M. (2023). Fatigue and human performance: An updated framework. Sports Medicine, 53(1), 7–31.
- Belanche, D., Casaló Ariño, L., & Flavian, C. (2019). Artificial intelligence in fintech: Understanding robo-advisors adoption among customers. Industrial Management & Data Systems, 119, 1411–1430.
- Bewersdorff, A., Hartmann, C., Hornberger, M., Seßler, K., Bannert, M., Kasneci, E., Kasneci, G., Zhai, X., & Nerdel, C. (2025). Taking the next step with generative artificial intelligence: The transformative role of multimodal large language models in science education. Learning and Individual Differences, 118, 102601.
- Bogousslavsky, V., & Muravyev, D. (2024). An anatomy of retail option trading. SSRN Electronic Journal. [Google Scholar] [CrossRef]
- Borman, H., Leontjeva, A., Pizzato, L., Jiang, M., & Jermyn, D. (2024). Do LLM personas dream of bull markets? Comparing human and AI investment strategies through the lens of the five-factor model. arXiv. [Google Scholar] [CrossRef]
- Bortoli, D. D., Costa, D. d., Jr., Goulart, M., & Campara, J. (2019). Personality traits and investor profile analysis: A behavioral finance study. PLoS ONE, 14(3), e0214062. [Google Scholar] [CrossRef]
- Boussioux, L. (2024). Narrative AI and the human-AI oversight paradox in evaluating early-stage innovations. Available online: https://pubsonline.informs.org/do/10.1287/2f948394-3eb2-40b0-aaff-6c621f5f5ab9 (accessed on 22 June 2025).
- Briere, M. (2023). Retail investors’ behavior in the digital age: How digitalization is impacting investment decisions. Social Science Research Network. [Google Scholar] [CrossRef]
- Broadbent, D. E. (1958). Perception and communication (pp. v, 338). Pergamon Press. [Google Scholar] [CrossRef]
- Bruine De Bruin, W., Parker, A. M., & Fischhoff, B. (2007). Individual differences in adult decision-making competence. Journal of Personality and Social Psychology, 92(5), 938–956. [Google Scholar] [CrossRef]
- Brysbaert, M. (2019). How many words do we read per minute? A review and meta-analysis of reading rate. Journal of Memory and Language, 109, 104047. [Google Scholar] [CrossRef]
- Bryzgalova, S., Pavlova, A., & Sikorskaya, T. (2023). Retail trading in options and the rise of the big three wholesalers. The Journal of Finance, 78(6), 3465–3514. [Google Scholar] [CrossRef]
- Buçinca, Z., Lin, P., Gajos, K. Z., & Glassman, E. L. (2020, March 17–20). Proxy tasks and subjective measures can be misleading in evaluating explainable AI systems. IUI ’20: 25th International Conference on Intelligent User Interfaces (pp. 454–464), Cagliari, Italy. [Google Scholar] [CrossRef]
- Byrd, D., Hybinette, M., & Balch, T. H. (2019). ABIDES: Towards high-fidelity market simulation for AI research. arXiv. [Google Scholar] [CrossRef]
- Campbell, S. D., & Sharpe, S. A. (2009). Anchoring bias in consensus forecasts and its effect on market prices. Journal of Financial and Quantitative Analysis, 44(2), 369–390. [Google Scholar] [CrossRef]
- Castro, S. C., Strayer, D. L., Matzke, D., & Heathcote, A. (2019). Cognitive workload measurement and modeling under divided attention. Journal of Experimental Psychology: Human Perception and Performance, 45(6), 826–839. [Google Scholar] [CrossRef]
- Cen, L., Hilary, G., & Wei, J. (2013). The role of anchoring bias in the equity market: Evidence from analysts’ earnings forecasts and stock returns. The Journal of Financial and Quantitative Analysis, 48(1), 47–76. [Google Scholar] [CrossRef]
- Chai, W. J., Abd Hamid, A. I., & Abdullah, J. M. (2018). Working memory from the psychological and neurosciences perspectives: A review. Frontiers in Psychology, 9, 401. [Google Scholar] [CrossRef]
- Chen, L., Zhang, Y., Feng, J., Chai, H., Zhang, H., Fan, B., Ma, Y., Zhang, S., Li, N., Liu, T., Sukiennik, N., Zhao, K., Li, Y., Liu, Z., Xu, F., & Li, Y. (2025). AI agent behavioral science. arXiv. [Google Scholar] [CrossRef]
- Chen, Z., Chen, J., Chen, J., & Sra, M. (2025). Standard benchmarks fail—Auditing LLM agents in finance must prioritize risk. arXiv. [Google Scholar] [CrossRef]
- Choy, S.-K. (2015). Retail clientele and option returns. Journal of Banking & Finance, 51, 26–42. [Google Scholar] [CrossRef]
- Chui, A. C. W., Subrahmanyam, A., & Titman, S. (2022). Momentum, reversals, and investor clientele. Review of Finance, 26(2), 217–255. [Google Scholar] [CrossRef]
- Compeau, D. R., & Higgins, C. A. (1995). Computer self-efficacy: Development of a measure and initial test. MIS Quarterly, 19(2), 189. [Google Scholar] [CrossRef]
- Conte, R., & Paolucci, M. (2014). On agent-based modeling and computational social science. Frontiers in Psychology, 5, 668. [Google Scholar] [CrossRef]
- Costarelli, A., Allen, M., Hauksson, R., Sodunke, G., Hariharan, S., Cheng, C., Li, W., Clymer, J., & Yadav, A. (2024). GameBench: Evaluating strategic reasoning abilities of LLM agents. arXiv. [Google Scholar] [CrossRef]
- Cowan, N. (2001). The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, 24, 87–114. [Google Scholar] [CrossRef]
- Cucinelli, D., Gandolfi, G., & Soana, M.-G. (2016). Customer and advisor financial decisions: The theory of planned behavior perspective. International Journal of Business and Social Science, 7(12), 80–92. [Google Scholar]
- Davis, F. (1989). Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Quarterly, 13(3), 319. [Google Scholar] [CrossRef]
- Davis, F., Bagozzi, R., & Warshaw, P. (1989). User acceptance of computer technology: A comparison of two theoretical models. Management Science, 35, 982–1003. [Google Scholar] [CrossRef]
- Dawid, A. P. (1982). The well-calibrated Bayesian. Journal of the American Statistical Association, 77(379), 605–610. [Google Scholar] [CrossRef]
- DeVellis, R. F. (2016). Scale development: Theory and applications. SAGE Publications. [Google Scholar]
- D’Hondt, C., Petitjean, M., & Elhichou, Y. (2023). Uncovering the profile of passive exchange-traded fund retail investors. Forthcoming in Finance. SSRN Working Paper No. 3522963. Available online: https://ssrn.com/abstract=3522963 (accessed on 22 June 2025).
- Diamond, N., & Perkins, G. (2022). Using intermarket data to evaluate the efficient market hypothesis with machine learning. arXiv. [Google Scholar] [CrossRef]
- Dimitriadis, K. A., Koursaros, D., & Savva, C. S. (2025). Exploring the dynamic nexus of traditional and digital assets in inflationary times: The role of safe havens, tech stocks, and cryptocurrencies. Economic Modelling, 151, 107195. [Google Scholar] [CrossRef]
- Dong, M. M., Stratopoulos, T. C., & Wang, V. X. (2024). A scoping review of ChatGPT research in accounting and finance. International Journal of Accounting Information Systems, 55, 100715. [Google Scholar] [CrossRef]
- Du, J., Huang, D., Liu, Y.-J., Shi, Y., Subrahmanyam, A., & Zhang, H. (2025). Nominal prices, retail investor participation, and return momentum. Management Science, Advance Online Publication, 1423. [Google Scholar] [CrossRef]
- East, R. (1993). Investment decisions and the theory of planned behavior. Journal of Economic Psychology, 14(2), 337–375. [Google Scholar] [CrossRef]
- Elly, A., John, D., Okunola, A., & Notiny, B. (2025). The impact of AI on algorithmic trading and investment strategies: Analyzing performance and risk management. Available online: https://www.researchgate.net/profile/Abiodun-Okunola-6/publication/390172832_The_Impact_of_AI_on_Algorithmic_Trading_and_Investment_Strategies_Analyzing_Performance_and_Risk_Management/links/67e321c6fe0f5a760f9034a5/The-Impact-of-AI-on-Algorithmic-Trading-and-Investment-Strategies-Analyzing-Performance-and-Risk-Management.pdf (accessed on 22 June 2025).
- Epstein, J. M. (1999). Agent-based computational models and generative social science. Complexity, 4(5), 41–60. [Google Scholar] [CrossRef]
- Eriksen, B. A., & Eriksen, C. W. (1974). Effects of noise letters upon the identification of a target letter in a nonsearch task. Perception & Psychophysics, 16(1), 143–149. [Google Scholar] [CrossRef]
- Escobar, L., & Pedraza, A. (2023). Active trading and (poor) performance: The social transmission channel. Journal of Financial Economics, 150(1), 139–165. [Google Scholar] [CrossRef]
- Fama, E. F. (1970). Efficient capital markets: A review of theory and empirical work. Journal of Finance, 25(2), 383–417. [Google Scholar] [CrossRef]
- Fan, W., Zhu, Y., Wang, C., Wang, B., & Xu, W. (2025). Consistency of responses and continuations generated by large language models on social media. arXiv. [Google Scholar] [CrossRef]
- Felix, L., Kraussl, R., & Stork, P. (2020). Implied volatility sentiment: A tale of two tails. Quantitative Finance, 20(5), 823–849. [Google Scholar] [CrossRef]
- Fenton-O’Creevy, M., Nicholson, N., Soane, E., & Willman, P. (2003). Trading on illusions: Unrealistic perceptions of control and trading performance. Journal of Occupational and Organizational Psychology, 76(1), 53–68. [Google Scholar] [CrossRef]
- Ferrag, M. A., Tihanyi, N., & Debbah, M. (2025). From LLM reasoning to autonomous AI agents: A comprehensive review. arXiv. [Google Scholar] [CrossRef]
- Ferrari, J. R. (2001). Procrastination as self-regulation failure of performance: Effects of cognitive load, self-awareness, and time limits on ‘working best under pressure’. European Journal of Personality, 15(5), 391–406. [Google Scholar] [CrossRef]
- Finet, A., Kristoforidis, K., & Laznicka, J. (2025). Emotional drivers of financial decision-making: Unveiling the link between emotions and stock market behavior. Journal of Next-Generation Research 5.0, 1(3), 1–25. [Google Scholar] [CrossRef]
- Finucane, M. L., Alhakami, A., Slovic, P., & Johnson, S. M. (2000). The affect heuristic in judgments of risks and benefits. Journal of Behavioral Decision Making, 13(1), 1–17. [Google Scholar] [CrossRef]
- Firoozye, N., Tan, V., & Zohren, S. (2023). Canonical portfolios: Optimal asset and signal combination. arXiv. [Google Scholar] [CrossRef]
- Flavell, J. (1979). Metacognition and cognitive monitoring: A new area of cognitive-developmental inquiry. American Psychologist, 34(10), 906–911. [Google Scholar] [CrossRef]
- Foltice, B., & Langer, T. (2015). Profitable momentum trading strategies for individual investors. Financial Markets and Portfolio Management, 29(2), 85–113. [Google Scholar] [CrossRef]
- Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 18(1), 39–50. [Google Scholar] [CrossRef]
- Galariotis, E. (2014). Contrarian and momentum trading: A review of the literature. Review of Behavioral Finance, 6, 63–82. [Google Scholar] [CrossRef]
- Ganguli, D., Hernandez, D., Lovitt, L., DasSarma, N., Henighan, T., Jones, A., Joseph, N., Kernion, J., Mann, B., Askell, A., Bai, Y., Chen, A., Conerly, T., Drain, D., Elhage, N., El-Showk, S., Fort, S., Hatfield-Dodds, Z., Johnston, S., … Clark, J. (2022a, June 21–24). Predictability and surprise in large generative models. 2022 ACM Conference on Fairness Accountability and Transparency (pp. 1747–1764), Seoul, Republic of Korea. [Google Scholar] [CrossRef]
- Ganguli, D., Lovitt, L., Kernion, J., Askell, A., Bai, Y., Kadavath, S., Mann, B., Perez, E., Schiefer, N., Ndousse, K., Jones, A., Bowman, S., Chen, A., Conerly, T., DasSarma, N., Drain, D., Elhage, N., El-Showk, S., Fort, S., … Clark, J. (2022b). Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned. arXiv. [Google Scholar] [CrossRef]
- Gao, C., Lan, X., Li, N., Yuan, Y., Ding, J., Zhou, Z., Xu, F., & Li, Y. (2023). Large language models empowered agent-based modeling and simulation: A survey and perspectives. arXiv. [Google Scholar] [CrossRef]
- Gao, C., Lan, X., Li, N., Yuan, Y., Ding, J., Zhou, Z., Xu, F., & Li, Y. (2024). Large language models empowered agent-based modeling and simulation: A survey and perspectives. Humanities and Social Sciences Communications, 11(1), 1259. [Google Scholar] [CrossRef]
- Gempesaw, D., Henry, J. J., & Xiao, H. (2023). Retail ETF investing. Social Science Research Network. [Google Scholar] [CrossRef]
- Gerlich, M. (2025). AI tools in society: Impacts on cognitive offloading and the future of critical thinking. Societies, 15(1), 6. [Google Scholar] [CrossRef]
- Gerrans, P., Abisekaraj, S. B., & Liu, Z. (2023). The fear of missing out on cryptocurrency and stock investments: Direct and indirect effects of financial literacy and risk tolerance. Journal of Financial Literacy and Wellbeing, 1(1), 103–137. [Google Scholar] [CrossRef]
- Ghaffarzadegan, N., Majumdar, A., Williams, R., & Hosseinichimeh, N. (2024). Generative agent-based modeling: An introduction and tutorial. System Dynamics Review, 40, e1761. [Google Scholar] [CrossRef]
- Ghosh, P. (2025). Types of trading strategies: Momentum, mean reversion, and style differences explained by Prodipta Ghosh, QuantInsti articles. Available online: https://www.quantinsti.com/articles/types-trading-strategies/ (accessed on 24 June 2025).
- Giarlotta, A., & Petralia, A. (2024). Simon’s bounded rationality. Decisions in Economics and Finance, 47(1), 327–346. [Google Scholar] [CrossRef]
- Gimmelberg, D., Belinskiy, A., Valentine, A., Iveta, L., Kaže, V., & Filatov, A. (2025a). FACET—Four agent causal evaluation toolkit. Large language models for retail equity and options traders. OSF Registration. Open Science Framework. [Google Scholar] [CrossRef]
- Gimmelberg, D., Głowacka, M., Belinskiy, A., Korotkii, S., Artamov, V., & Ludviga, I. (2025b). Bridging human expertise and AI: Evaluating the role of large language models in retail investors’ decision-making. International Journal of Finance & Banking Studies, 14(1), 20–29. [Google Scholar] [CrossRef]
- Glaser, M., & Weber, M. (2007). Overconfidence and trading volume. The Geneva Risk and Insurance Review, 32(1), 1–36. [Google Scholar] [CrossRef]
- Goodell, J. W., Yadav, M. P., Ruan, J., Abedin, M. Z., & Malhotra, N. (2023). Traditional assets, digital assets and renewable energy: Investigating connectedness during COVID-19 and the Russia-Ukraine war. Finance Research Letters, 58, 104323. [Google Scholar] [CrossRef]
- Graham, J. R., & Kumar, A. (2006). Do dividend clienteles exist? Evidence on dividend preferences of retail investors. The Journal of Finance, 61(3), 1305–1336. [Google Scholar] [CrossRef]
- Grignoli, N., Manoni, G., Gianini, J., Schulz, P., Gabutti, L., & Petrocchi, S. (2025). Clinical decision fatigue: A systematic and scoping review with meta-synthesis. Family Medicine and Community Health, 13(1), e003033. [Google Scholar] [CrossRef] [PubMed]
- Grinblatt, M., & Keloharju, M. (2009). Sensation seeking, overconfidence, and trading activity. The Journal of Finance, 64(2), 549–578. [Google Scholar] [CrossRef]
- Grossman, M. R., & Cormack, G. V. (2011). Technology-assisted review in e-discovery can be more effective and more efficient than exhaustive manual review. Richmond Journal of Law and Technology, 17(3), 11. Available online: http://jolt.richmond.edu/v17i3/article11.pdf (accessed on 25 June 2025).
- Gu, J., Ye, J., Wang, G., & Yin, W. (2024, November 14–17). Adaptive and explainable margin trading via large language models on portfolio management. 5th ACM International Conference on AI in Finance (pp. 248–256), Brooklyn, NY, USA. [Google Scholar] [CrossRef]
- Gui, G., & Toubia, O. (2023). The challenge of using LLMs to simulate human behavior: A causal inference perspective. SSRN Electronic Journal. [Google Scholar] [CrossRef]
- Han, X., Sakkas, N., Danbolt, J., & Eshraghi, A. (2022). Persistence of investor sentiment and market mispricing. Financial Review, 57(3), 617–640. [Google Scholar] [CrossRef]
- Han, X., Wang, N., Che, S., Yang, H., Zhang, K., & Xu, S. X. (2024, November 14–17). Enhancing investment analysis: Optimizing AI-Agent collaboration in financial research. ICAIF ’24: 5th ACM International Conference on AI in Finance (pp. 538–546), Brooklyn, NY, USA. [Google Scholar] [CrossRef]
- Hansen, A. L., & Kazinnik, S. (2024). Can ChatGPT decipher Fedspeak? SSRN Working Paper No. 4399406. Available online: https://ssrn.com/abstract=4399406 (accessed on 22 June 2025).
- Harris, L. (2024). Algorithmic trading and portfolio optimization using big data analytics. Available online: https://www.researchgate.net/publication/386076052_Algorithmic_Trading_and_Portfolio_Optimization_Using_Big_Data_Analytics (accessed on 22 June 2025).
- Hart, W., Albarracín, D., Eagly, A. H., Brechan, I., Lindberg, M. J., & Merrill, L. (2009). Feeling validated versus being correct: A meta-analysis of selective exposure to information. Psychological Bulletin, 135(4), 555–588. [Google Scholar] [CrossRef]
- Hartley, J., Hamill, C., Batra, D., Seddon, D., Okhrati, R., & Khraishi, R. (2025). How personality traits shape LLM risk-taking behavior. arXiv. [Google Scholar] [CrossRef]
- Henning, T., Ojha, S. M., Spoon, R., Han, J., & Camerer, C. F. (2025). LLM trading: Analysis of LLM agent behavior in experimental asset markets. arXiv. [Google Scholar] [CrossRef]
- Henseler, J., Ringle, C., & Sarstedt, M. (2015). A new criterion for assessing discriminant validity in variance-based structural equation modeling. Journal of the Academy of Marketing Science, 43, 115–135. [Google Scholar] [CrossRef]
- Hoff, K. A., & Bashir, M. (2015). Trust in automation: Integrating empirical evidence on factors that influence trust. Human Factors, 57(3), 407–434. [Google Scholar] [CrossRef]
- Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81(396), 945–960. [Google Scholar] [CrossRef]
- Hommes, C. (2013). Behavioral rationality and heterogeneous expectations in complex economic systems (1st ed.). Cambridge University Press. [Google Scholar] [CrossRef]
- Huang, J., Xiao, M., Li, D., Jiang, Z., Yang, Y., Zhang, Y., Qian, L., Wang, Y., Peng, X., Ren, Y., Xiang, R., Chen, Z., Zhang, X., He, Y., Han, W., Chen, S., Shen, L., Kim, D., Yu, Y., … Tsujii, J. (2025). Open-FinLLMs: Open multimodal large language models for financial applications. arXiv. [Google Scholar] [CrossRef]
- Huang, L., Yu, W., Ma, W., Zhong, W., Feng, Z., Wang, H., Chen, Q., Peng, W., Feng, X., Qin, B., & Liu, T. (2025). A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. ACM Transactions on Information Systems, 43(2), 1–55. [Google Scholar] [CrossRef]
- Hurst, H. E. (1951). Long-term storage capacity of reservoirs. Transactions of the American Society of Civil Engineers, 116(1), 770–799. [Google Scholar] [CrossRef]
- Hutchins, E. (1995). Cognition in the wild (pp. xviii, 381). The MIT Press. [Google Scholar]
- Jadhav, A., Pang, N., & Zhou, Y. (2025). Large language models in equity markets: Applications, opportunities, and risks. Frontiers in Artificial Intelligence, 8, 1608365. [Google Scholar] [CrossRef] [PubMed]
- Jakesch, M., Hancock, J., & Naaman, M. (2023). Human heuristics for AI-generated language are flawed. Proceedings of the National Academy of Sciences of the United States of America, 120, e2208839120. [Google Scholar] [CrossRef] [PubMed]
- Jia, J., Yuan, Z., Pan, J., McNamara, P. E., & Chen, D. (2024). Decision-making behavior evaluation framework for LLMs under uncertain context. arXiv. [Google Scholar] [CrossRef]
- Jiang, H., Zhang, X., Cao, X., Breazeal, C., Roy, D., & Kabbara, J. (2024). PersonaLLM: Investigating the ability of large language models to express personality traits. arXiv. [Google Scholar] [CrossRef]
- Jiang, Z., Peng, C., & Yan, H. (2024). Personality differences and investment decision-making. Journal of Financial Economics, 153, 103776. [Google Scholar] [CrossRef]
- Johnson, S. G. B., Bilovich, A., & Tuckett, D. (2023). Conviction narrative theory: A theory of choice under radical uncertainty. Behavioral and Brain Sciences, 46, 1–26. [Google Scholar] [CrossRef]
- Kahneman, D. (2011). Thinking, fast and slow. Farrar, Straus and Giroux. [Google Scholar]
- Kang, H. (2021). Sample size determination and power analysis using the G*Power software. Journal of Educational Evaluation for Health Professions, 18, 17. [Google Scholar] [CrossRef]
- Karinshak, E., Liu, S. X., Park, J. S., & Hancock, J. T. (2023). Working with AI to persuade: Examining a large language model’s ability to generate pro-vaccination messages. Proceedings of the ACM on Human-Computer Interaction, 7(CSCW1), 1–29. [Google Scholar] [CrossRef]
- Kayani, U., Ullah, M., Aysan, A. F., Nazir, S., & Frempong, J. (2024). Quantile connectedness among digital assets, traditional assets, and renewable energy prices during extreme economic crisis. Technological Forecasting and Social Change, 208, 123635. [Google Scholar] [CrossRef]
- Khan, M. A., & Shabbir, H. (2025). Digital literacy and retail investing: Exploring market dynamics, efficiency, and stability in the digital era. Journal of Digital Literacy and Learning, 1, 20. [Google Scholar]
- Khorana, A., Chang, E. C., & Cheng, J. W. (1999). An examination of herd behavior in equity markets: An international perspective. Social Science Research Network. [Google Scholar] [CrossRef]
- Khuntia, S., & Pattanayak, J. K. (2018). Adaptive market hypothesis and evolving predictability of bitcoin. Economics Letters, 167, 26–28. [Google Scholar] [CrossRef]
- Kiely. (2025). Understanding performance benchmarks for LLM inference. Baseten Blog, Baseten. Available online: https://www.baseten.co/blog/understanding-performance-benchmarks-for-llm-inference/ (accessed on 24 June 2025).
- King, W. R., & He, J. (2006). A meta-analysis of the technology acceptance model. Information & Management, 43(6), 740–755. [Google Scholar] [CrossRef]
- Kirtac, K., & Germano, G. (2024). Sentiment trading with large language models. Finance Research Letters, 62, 105227. [Google Scholar] [CrossRef]
- Kobbeltved, T., & Wolff, K. (2009). The risk-as-feelings hypothesis in a theory-of-planned-behavior perspective. Judgment and Decision Making, 4(7), 567–586. [Google Scholar] [CrossRef]
- Korniotis, G. M., & Kumar, A. (2009). Do older investors make better investment decisions? Social Science Research Network. Available online: https://papers.ssrn.com/abstract=767125 (accessed on 25 June 2025).
- Kruger, J., & Dunning, D. (1999). Unskilled and unaware of it: How difficulties in recognizing one’s own incompetence lead to inflated self-assessments. Journal of Personality and Social Psychology, 77(6), 1121–1134. [Google Scholar] [CrossRef]
- Kuchinsky, S. E., Gallun, F. J., & Lee, A. K. C. (2024). Note on the dual-task paradigm and its use to measure listening effort. Trends in Hearing, 28, 23312165241292215. [Google Scholar] [CrossRef]
- Kumar, A. (2009). Who gambles in the stock market? The Journal of Finance, 64(4), 1889–1933. [Google Scholar] [CrossRef]
- Lakens, D. (2017). Equivalence tests: A practical primer for t tests, correlations, and meta-analyses. Social Psychological and Personality Science, 8(4), 355–362. [Google Scholar] [CrossRef] [PubMed]
- Lavie, N. (2005). Distracted and confused? Selective attention under load. Trends in Cognitive Sciences, 9(2), 75–82. [Google Scholar] [CrossRef]
- Leaver, M., & Reader, T. W. (2016). Human factors in financial trading: An analysis of trading incidents. Human Factors, 58(6), 814–832. [Google Scholar] [CrossRef] [PubMed]
- LeBaron, B. (2006). Agent-based computational finance. In L. Tesfatsion, & K. L. Judd (Eds.), Handbook of computational economics (pp. 1187–1233). Elsevier. [Google Scholar] [CrossRef]
- Lee, H.-P., Sarkar, A., Tankelevitch, L., Drosos, I., Rintel, S., Banks, R., & Wilson, N. (2025, April 26–May 1). The impact of generative AI on critical thinking: Self-reported reductions in cognitive effort and confidence effects from a survey of knowledge workers. 2025 CHI Conference on Human Factors in Computing Systems (pp. 1–22), Yokohama, Japan. [Google Scholar] [CrossRef]
- Lee, J. D., & See, K. A. (2004). Trust in automation: Designing for appropriate reliance. Human Factors: The Journal of the Human Factors and Ergonomics Society, 46(1), 50–80. [Google Scholar] [CrossRef]
- Lerner, J. S., & Keltner, D. (2001). Fear, anger, and risk. Journal of Personality and Social Psychology, 81(1), 146–159. [Google Scholar] [CrossRef] [PubMed]
- Lerner, J. S., Li, Y., Valdesolo, P., & Kassam, K. S. (2015). Emotion and decision making. Annual Review of Psychology, 66(1), 799–823. [Google Scholar] [CrossRef]
- Li, W. W., Kim, H., Cucuringu, M., & Ma, T. (2025). Can LLM-based financial investing strategies outperform the market in long run? arXiv. [Google Scholar] [CrossRef]
- Li, Y., Miao, Y., Ding, X., Krishnan, R., & Padman, R. (2025). Firm or fickle? Evaluating large language models consistency in sequential interactions. arXiv. [Google Scholar] [CrossRef]
- Liu, Z., Guo, X., Lou, F., Zeng, L., Niu, J., Wang, Z., Xu, J., Cai, W., Yang, Z., Zhao, X., Li, C., Xu, S., Chen, D., Chen, Y., Bai, Z., & Zhang, L. (2025). Fin-R1: A large language model for financial reasoning through reinforcement learning. arXiv. [Google Scholar] [CrossRef]
- Lo, A. (2004). Reconciling efficient markets with behavioral finance: The adaptive markets hypothesis. Journal of Investment Consulting, 7, 21–44. [Google Scholar]
- Lo, A. W. (2004). The adaptive markets hypothesis. The Journal of Portfolio Management, 30(5), 15–29. [Google Scholar] [CrossRef]
- Lo, A. W. (2019). Adaptive markets: Financial evolution at the speed of thought (2nd ed.). Princeton University Press. [Google Scholar]
- Lo, A. W., & MacKinlay, A. C. (1988). Stock market prices do not follow random walks: Evidence from a simple specification test. The Review of Financial Studies, 1(1), 41–66. [Google Scholar] [CrossRef]
- Loewenstein, G. F., Weber, E. U., Hsee, C. K., & Welch, N. (2001). Risk as feelings. Psychological Bulletin, 127(2), 267. [Google Scholar] [CrossRef]
- Logg, J., Minson, J., & Moore, D. (2019). Algorithm appreciation: People prefer algorithmic to human judgment. Organizational Behavior and Human Decision Processes, 151, 90–103. [Google Scholar] [CrossRef]
- Lopez-Lira, A. (2025). Can large language models trade? Testing financial theories with LLM agents in market simulations. arXiv. [Google Scholar] [CrossRef]
- Lopez-Lira, A., & Tang, Y. (2024). Can ChatGPT forecast stock price movements? Return predictability and large language models. arXiv. [Google Scholar] [CrossRef]
- Lou, J., & Sun, Y. (2024). Anchoring bias in large language models: An experimental study. arXiv. [Google Scholar] [CrossRef]
- Lux, T., & Zwinkels, R. C. J. (2018). Empirical validation of agent-based models. In Handbook of computational economics (pp. 437–488). Elsevier. [Google Scholar] [CrossRef]
- MacKinlay, A. C. (1997). Event studies in economics and finance. Journal of Economic Literature, 35(1), 13–39. [Google Scholar]
- Mainali, M., & Weber, R. O. (2025). Exploring cognitive attributes in financial decision-making. arXiv. [Google Scholar] [CrossRef]
- Marakas, G. M., Yi, M. Y., & Johnson, R. D. (1998). The multilevel and multifaceted character of computer self-efficacy: Toward clarification of the construct and an integrative framework for research. Information Systems Research, 9(2), 126–163. [Google Scholar] [CrossRef]
- Martin, L., Whitehouse, N., Yiu, S., Catterson, L., & Perera, R. (2024). Better call GPT, comparing large language models against lawyers. arXiv. [Google Scholar] [CrossRef]
- Martinez-Blasco, M., Serrano, V., Prior, F., & Cuadros, J. (2023). Analysis of an event study using the Fama–French five-factor model: Teaching approaches including spreadsheets and the R programming language. Financial Innovation, 9(1), 76. [Google Scholar] [CrossRef]
- McLean, R. D., & Pontiff, J. (2016). Does academic research destroy stock return predictability? The Journal of Finance, 71(1), 5–32. [Google Scholar] [CrossRef]
- McNulty, K. (2021). Handbook of regression modeling in people analytics: With examples in R and Python. Available online: https://peopleanalytics-regression-book.org/gitbook/power-tests.html (accessed on 25 June 2025).
- Meissner, P., & Wulf, T. (2013). Cognitive benefits of scenario planning: Its impact on biases and decision quality. Technological Forecasting and Social Change, 80(4), 801–814. [Google Scholar] [CrossRef]
- Middlebrooks, C. D., Kerr, T., & Castel, A. D. (2017). Selectively distracted: Divided attention and memory for important information. Psychological Science, 28(8), 1103–1115. [Google Scholar] [CrossRef]
- Miguel, A. F., & Su, D. (2019). Explaining differences in the flow-performance sensitivity of retail and institutional mutual funds—International evidence. Theoretical Economics Letters, 9(7), 2711–2731. [Google Scholar] [CrossRef]
- Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63(2), 81–97. [Google Scholar] [CrossRef]
- Morales-García, W. C., Sairitupa-Sanchez, L. Z., Morales-García, S. B., & Morales-García, M. (2024). Adaptation and psychometric properties of a brief version of the general self-efficacy scale for use with artificial intelligence (GSE-6AI) among university students. Frontiers in Education, 9, 1293437. [Google Scholar] [CrossRef]
- Naranjo, A., Nimalendran, M., & Wu, Y. (2023). Betting on elusive returns: Retail trading in complex options. Social Science Research Network. [Google Scholar] [CrossRef]
- Narayan, S. W., Rehman, M. U., Ren, Y.-S., & Ma, C. (2023). Is a correlation-based investment strategy beneficial for long-term international portfolio investors? Financial Innovation, 9(1), 64. [Google Scholar] [CrossRef]
- Nickerson, R. S. (1998). Confirmation bias: A ubiquitous phenomenon in many guises. Review of General Psychology, 2(2), 175–220. [Google Scholar] [CrossRef]
- Nie, Y., Kong, Y., Dong, X., Mulvey, J. M., Poor, H. V., Wen, Q., & Zohren, S. (2024). A survey of large language models for financial applications: Progress, prospects and challenges. arXiv. [Google Scholar] [CrossRef]
- Odean, T. (1998). Are investors reluctant to realize their losses? The Journal of Finance, 53(5), 1775–1798. [Google Scholar] [CrossRef]
- Parasuraman, R., & Riley, V. (1997). Humans and automation: Use, misuse, disuse, abuse. Human Factors, 39(2), 230–253. [Google Scholar] [CrossRef]
- Parasuraman, R., Sheridan, T. B., & Wickens, C. D. (2000). A model for types and levels of human interaction with automation. IEEE Transactions on Systems, Man, and Cybernetics—Part A: Systems and Humans, 30(3), 286–297. [Google Scholar] [CrossRef]
- Park, J., Konana, P., Gu, B., Kumar, A., & Raghunathan, R. (2010). Confirmation bias, overconfidence, and investment performance: Evidence from stock message boards. Social Science Research Network. [Google Scholar] [CrossRef]
- Park, J. S., O’Brien, J. C., Cai, C. J., Morris, M. R., Liang, P., & Bernstein, M. S. (2023). Generative agents: Interactive simulacra of human behavior. arXiv. [Google Scholar] [CrossRef]
- Parker, J. A., Schoar, A., & Sun, Y. (2020). Retail financial innovation and stock market dynamics: The case of target date funds. National Bureau of Economic Research. [Google Scholar] [CrossRef]
- Parte, L., Garvey, A. M., & Gonzalo-Angulo, J. A. (2018). Cognitive load theory: Why it’s important for international business teaching and financial reporting. Journal of Teaching in International Business, 29(2), 134–160. [Google Scholar] [CrossRef]
- Pavlou, P. A., & Fygenson, M. (2006). Understanding and predicting electronic commerce adoption: An extension of the theory of planned behavior. MIS Quarterly, 30(1), 115. [Google Scholar] [CrossRef]
- Payne, J. W., Bettman, J. R., & Johnson, E. J. (1993). The adaptive decision maker. Cambridge University Press. [Google Scholar]
- Payzan-LeNestour, E., Pradier, L., & Putniņš, T. J. (2023). Biased risk perceptions: Evidence from the laboratory and financial markets. Journal of Banking & Finance, 154, 106685. [Google Scholar] [CrossRef]
- Pástor, Ľ., & Veronesi, P. (2013). Political uncertainty and risk premia. Journal of Financial Economics, 110(3), 520–545. [Google Scholar] [CrossRef]
- Peabody, J. W., Luck, J., Glassman, P., Dresselhaus, T. R., & Lee, M. (2000). Comparison of vignettes, standardized patients, and chart abstraction: A prospective validation study of 3 methods for measuring quality. JAMA, 283(13), 1715–1722. [Google Scholar] [CrossRef]
- Peng, C. (2024). Emotion-impacted decision-making under risks. Advances in Social Behavior Research, 13, 68–76. [Google Scholar] [CrossRef]
- Peng, Y. (2024). Internet sentiment exacerbates intraday overtrading, evidence from A-Share market. arXiv. [Google Scholar] [CrossRef]
- Pernagallo, G., & Torrisi, B. (2022). A theory of information overload applied to perfectly efficient financial markets. Review of Behavioral Finance, 14(2), 223–236. [Google Scholar] [CrossRef]
- Persson, E., Barrafrem, K., Meunier, A., & Tinghög, G. (2019). The effect of decision fatigue on surgeons’ clinical decision making. Health Economics, 28(10), 1194–1203. [Google Scholar] [CrossRef]
- Pimenta, A., Carneiro, D., Novais, P., & Neves, J. (2014). Analysis of human performance as a measure of mental fatigue. In M. Polycarpou, A. C. P. L. F. de Carvalho, J.-S. Pan, M. Woźniak, H. Quintian, & E. Corchado (Eds.), Hybrid artificial intelligence systems (pp. 389–401). Springer International Publishing. [Google Scholar] [CrossRef]
- Pouget, S., Sauvagnat, J., & Villeneuve, S. (2017). A mind is a terrible thing to change: Confirmatory bias in financial markets. The Review of Financial Studies, 30(6), 2066–2109. [Google Scholar] [CrossRef]
- Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124(3), 372–422. [Google Scholar] [CrossRef]
- Rayner, K., Schotter, E. R., Masson, M. E. J., Potter, M. C., & Treiman, R. (2016). So much to read, so little time: How do we read, and can speed reading help? Psychological Science in the Public Interest, 17(1), 4–34. [Google Scholar] [CrossRef]
- Rendon-Velez, E., Van Leeuwen, P. M., Happee, R., Horváth, I., Van Der Vegte, W. F., & De Winter, J. C. F. (2016). The effects of time pressure on driver performance and physiological activity: A driving simulator study. Transportation Research Part F: Traffic Psychology and Behavior, 41, 150–169. [Google Scholar] [CrossRef]
- Richards, D. W., Rutterford, J., Kodwani, D., & Fenton-O’Creevy, M. (2017). Stock market investors’ use of stop losses and the disposition effect. The European Journal of Finance, 23(2), 130–152. [Google Scholar] [CrossRef]
- Risko, E. F., & Gilbert, S. J. (2016). Cognitive offloading. Trends in Cognitive Sciences, 20(9), 676–688. [Google Scholar] [CrossRef]
- Rose, J. M., Roberts, F. D., & Rose, A. M. (2004). Affective responses to financial data and multimedia: The effects of information load and cognitive load. International Journal of Accounting Information Systems, 5(1), 5–24. [Google Scholar] [CrossRef]
- Rubinstein, A. (2013). Response time and decision making: An experimental study. Judgment and Decision Making, 8(5), 540–551. [Google Scholar] [CrossRef]
- Ruggeri, K., Ashcroft-Jones, S., Abate Romero Landini, G., Al-Zahli, N., Alexander, N., Andersen, M. H., Bibilouri, K., Busch, K., Cafarelli, V., Chen, J., Doubravová, B., Dugué, T., Durrani, A. A., Dutra, N., Garcia-Garzon, E., Gomes, C., Gracheva, A., Grilc, N., Gürol, D. M., … Stock, F. (2023). The persistence of cognitive biases in financial decisions across economic groups. Scientific Reports, 13(1), 10329. [Google Scholar] [CrossRef] [PubMed]
- Salemi, A., & Zamani, H. (2024). Evaluating retrieval quality in retrieval-augmented generation. arXiv. Available online: http://arxiv.org/abs/2404.13781 (accessed on 31 August 2025).
- Schlegel, K., Sommer, N. R., & Mortillaro, M. (2025). Large language models are proficient in solving and creating emotional intelligence tests. Communications Psychology, 3(1), 80. [Google Scholar] [CrossRef]
- Seth, H., Talwar, S., Bhatia, A., Saxena, A., & Dhir, A. (2020). Consumer resistance and inertia of retail investors: Development of the resistance adoption inertia continuance (RAIC) framework. Journal of Retailing and Consumer Services, 55, 102071. [Google Scholar] [CrossRef]
- Sharpe, W. F. (1994). The Sharpe ratio. The Journal of Portfolio Management, 21(1), 49–58. [Google Scholar] [CrossRef]
- Sheeran, P., & Webb, T. L. (2016). The intention–behavior gap. Social and Personality Psychology Compass, 10(9), 503–518. [Google Scholar] [CrossRef]
- Shefrin, H., & Statman, M. (1985). The disposition to sell winners too early and ride losers too long: Theory and evidence. The Journal of Finance, 40(3), 777–790. [Google Scholar] [CrossRef]
- Shiller, R. J. (2017). Narrative economics. American Economic Review, 107(4), 967–1004. [Google Scholar] [CrossRef]
- Shinn, N., Cassano, F., Gopinath, A., Narasimhan, K., & Yao, S. (2024). Reflexion: Language agents with verbal reinforcement learning. Advances in Neural Information Processing Systems, 36, 1–19. Available online: https://proceedings.neurips.cc/paper_files/paper/2023/hash/1b44b878bb782e6954cd888628510e90-Abstract-Conference.html (accessed on 25 February 2024).
- Showalter, S., & Gropp, J. (2019). Validating weak-form market efficiency in United States stock markets with trend deterministic price data and machine learning. arXiv. [Google Scholar] [CrossRef]
- Simon, A. J., Gallen, C. L., Ziegler, D. A., Mishra, J., Marco, E. J., Anguera, J. A., & Gazzaley, A. (2023). Quantifying attention span across the lifespan. Frontiers in Cognition, 2, 1207428. [Google Scholar] [CrossRef]
- Simon, H. A. (1955). A behavioral model of rational choice. The Quarterly Journal of Economics, 69(1), 99–118. [Google Scholar] [CrossRef]
- Simons, D. J., & Chabris, C. F. (1999). Gorillas in our midst: Sustained inattentional blindness for dynamic events. Perception, 28(9), 1059–1074. [Google Scholar] [CrossRef] [PubMed]
- Singh, A. K., Devkota, S., Lamichhane, B., Dhakal, U., & Dhakal, C. (2023). The confidence-competence gap in large language models: A cognitive study. arXiv. [Google Scholar] [CrossRef]
- Singh, D., Malik, G., & Jha, A. (2024). Overconfidence bias among retail investors: A systematic review and future research directions. Investment Management and Financial Innovations, 21(1), 302–316. [Google Scholar] [CrossRef]
- Slovic, P., Finucane, M. L., Peters, E., & MacGregor, D. G. (2007). The affect heuristic. European Journal of Operational Research, 177(3), 1333–1352. [Google Scholar] [CrossRef]
- Sniehotta, F. (2009). An experimental test of the theory of planned behavior. Applied Psychology: Health and Well-Being, 1, 257–270. [Google Scholar] [CrossRef]
- Snijders, T. A., & Bosker, R. (2012). Multilevel analysis: An introduction to basic and advanced multilevel modeling. SAGE Publications Ltd. Available online: https://uk.sagepub.com/en-gb/eur/multilevel-analysis/book234191 (accessed on 26 June 2025).
- Song, J., Xu, Z., & Zhong, Y. (2025). Out-of-distribution generalization via composition: A lens through induction heads in Transformers. Proceedings of the National Academy of Sciences, 122(6), e2417182122. [Google Scholar] [CrossRef]
- Spatharioti, S. E., Rothschild, D. M., Goldstein, D. G., & Hofman, J. M. (2023). Comparing traditional and LLM-based search for consumer choice: A randomized experiment. arXiv. [Google Scholar] [CrossRef]
- Steyvers, M., Tejeda, H., Kumar, A., Belem, C., Karny, S., Hu, X., Mayer, L., & Smyth, P. (2025). What large language models know and what people think they know. Nature Machine Intelligence, 7(2), 221–231. [Google Scholar] [CrossRef]
- Sumita, Y., Takeuchi, K., & Kashima, H. (2024). Cognitive biases in large language models: A survey and mitigation experiments. arXiv. [Google Scholar] [CrossRef]
- Sun, C. (2023). Factor correlation and the cross section of asset returns: A correlation-robust approach. Available online: https://wp.lancs.ac.uk/finec2023/files/2023/02/FEC-2023-049-Chuanping-Sun-Final.pdf (accessed on 24 June 2025).
- Sun, F., Li, N., Wang, K., & Goette, L. (2025). Large language models are overconfident and amplify human bias. arXiv. [Google Scholar] [CrossRef]
- Sussman, R., & Gifford, R. (2019). Causality in the theory of planned behavior. Personality and Social Psychology Bulletin, 45(6), 920–933. [Google Scholar] [CrossRef]
- Sweller, J., Ayres, P., & Kalyuga, S. (2011). Cognitive load theory. Springer Science & Business Media. [Google Scholar]
- Tan, L., Zhang, X., & Zhang, X. (2023). Retail and institutional investor trading behaviors: Evidence from China. Social Science Research Network. [Google Scholar] [CrossRef]
- Tatsat, H., & Shater, A. (2025). Beyond the black box: Interpretability of LLMs in finance. arXiv. [Google Scholar] [CrossRef]
- Tetlock, P. C. (2007). Giving content to investor sentiment: The role of media in the stock market. The Journal of Finance, 62(3), 1139–1168. [Google Scholar] [CrossRef]
- Tilmann, G., & Raftery, A. (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477), 359–378. [Google Scholar] [CrossRef]
- Tuominen, J. (2023). Decisions under uncertainty are more messy than they seem. Behavioral and Brain Sciences, 46, e109. [Google Scholar] [CrossRef]
- Tversky, A., & Kahneman, D. (1986). Rational choice and the framing of decisions. The Journal of Business, 59(4), S251–S278. [Google Scholar] [CrossRef]
- Uhr, C., Meyer, S., & Hackethal, A. (2021). Smoking hot portfolios? Trading behavior, investment biases, and self-control failure. Journal of Empirical Finance, 63, 73–95. [Google Scholar] [CrossRef]
- Valeyre, S., & Aboura, S. (2024). LLMs for time series: An application for single stocks and statistical arbitrage. arXiv. [Google Scholar] [CrossRef]
- Varshney, N., Raj, S., Mishra, V., Chatterjee, A., Saeidi, A., Sarkar, R., & Baral, C. (2025). Investigating and addressing hallucinations of LLMs in tasks involving negation. In T. Cao, A. Das, T. Kumarage, Y. Wan, S. Krishna, N. Mehrabi, J. Dhamala, A. Ramakrishna, A. Galystan, A. Kumar, R. Gupta, & K.-W. Chang (Eds.), Proceedings of the 5th workshop on trustworthy NLP (TrustNLP 2025) (pp. 580–598). Association for Computational Linguistics. [Google Scholar] [CrossRef]
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems 30 (pp. 5998–6008). Curran Associates. [Google Scholar] [CrossRef]
- Venkatesh, V., & Davis, F. D. (2000). A theoretical extension of the technology acceptance model: Four longitudinal field studies. Management Science, 46(2), 186–204. [Google Scholar] [CrossRef]
- Venkatesh, V., Morris, M. G., Davis, G. B., & Davis, F. D. (2003). User acceptance of information technology: Toward a unified view. MIS Quarterly, 27(3), 425–478. [Google Scholar] [CrossRef]
- Vyetrenko, S., Byrd, D., Petosa, N., Mahfouz, M., Dervovic, D., Veloso, M., & Balch, T. (2020, October 15–16). Get real: Realism metrics for robust limit order book market simulations. First ACM International Conference on AI in Finance (pp. 1–8), New York, NY, USA. [Google Scholar] [CrossRef]
- Wang, D., Churchill, E., Maes, P., Fan, X., Shneiderman, B., Shi, Y., & Wang, Q. (2020, April 25–30). From human-human collaboration to human-AI collaboration: Designing AI systems that can work together with people. Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems (p. 6), Honolulu, HI, USA. [Google Scholar] [CrossRef]
- Wang, J., Jiang, H., Liu, Y., Ma, C., Zhang, X., Pan, Y., Liu, M., Gu, P., Xia, S., Li, W., Zhang, Y., Wu, Z., Liu, Z., Zhong, T., Ge, B., Zhang, T., Qiang, N., Hu, X., Jiang, X., … Zhang, S. (2024). A comprehensive review of multimodal large language models: Performance and challenges across different tasks. arXiv. [Google Scholar] [CrossRef]
- Wang, Q., Gao, Y., Tang, Z., Luo, B., & He, B. (2024). Enhancing LLM trading performance with fact-subjectivity aware reasoning. arXiv. [Google Scholar] [CrossRef]
- Wang, Q., Tang, Z., & He, B. (2025). From ChatGPT to DeepSeek: Can LLMs simulate humanity? arXiv. [Google Scholar] [CrossRef]
- Wang, Y.-Y., & Chuang, Y.-W. (2023). Artificial intelligence self-efficacy: Scale development and validation. Education and Information Technologies, 29(4), 4785–4808. [Google Scholar] [CrossRef]
- Wang, Z., Li, Y., Wu, J., Soon, J., & Zhang, X. (2023). FinVis-GPT: A multimodal large language model for financial chart analysis. arXiv. [Google Scholar] [CrossRef]
- Warkulat, S., & Pelster, M. (2024). Social media attention and retail investor behavior: Evidence from r/wallstreetbets. International Review of Financial Analysis, 96, 103721. [Google Scholar] [CrossRef]
- Warm, J. S., Parasuraman, R., & Matthews, G. (2008). Vigilance requires hard mental work and is stressful. Human Factors, 50(3), 433–441. [Google Scholar] [CrossRef] [PubMed]
- Webb, T., & Sheeran, P. (2006). Does changing behavioral intentions engender behavior change? A meta-analysis of the experimental evidence. Psychological Bulletin, 132(2), 249–268. [Google Scholar] [CrossRef]
- Weber, M., & Camerer, C. F. (1998). The disposition effect in securities trading: An experimental analysis. Journal of Economic Behavior & Organization, 33(2), 167–184. [Google Scholar] [CrossRef]
- Webster, J., & Watson, R. T. (2002). Analyzing the past to prepare for the future: Writing a literature review. MIS Quarterly, 26(2), xiii–xxiii. [Google Scholar]
- Wheat, C., & Eckerd, G. (2024). Returns-chasing and dip-buying among retail investors. Research Snapshot. Available online: https://www.jpmorganchase.com/institute/all-topics/financial-health-wealth-creation/returns-chasing-and-dip-buying-among-retail-investors (accessed on 22 June 2025).
- Wheeler, A., & Varner, J. D. (2024). Scalable agent-based modeling for complex financial market simulations. arXiv. [Google Scholar] [CrossRef]
- Wu, S., Irsoy, O., Lu, S., Dabravolski, V., Dredze, M., Gehrmann, S., Kambadur, P., Rosenberg, D., & Mann, G. (2023). BloombergGPT: A large language model for finance. arXiv. [Google Scholar] [CrossRef]
- Xiao, J. J., & Wu, G. (2006). Applying the theory of planned behavior to retain credit counseling clients. In D. C. Bagwell (Ed.), Proceedings of the association for financial counseling and planning education (pp. 91–101). [Google Scholar] [CrossRef]
- Xu, M. (2025, May 2). 0DTEs decoded: Positioning, trends, and market impact. Volatility Insights. Available online: https://www.cboe.com/insights/posts/0-dt-es-decoded-positioning-trends-and-market-impact/ (accessed on 22 June 2025).
- Xue, S., Zhou, F., Xu, Y., Jin, M., Wen, Q., Hao, H., Dai, Q., Jiang, C., Zhao, H., Xie, S., He, J., Zhang, J., & Mei, H. (2024). WeaverBird: Empowering financial decision-making with large language model, knowledge base, and search engine. arXiv. [Google Scholar] [CrossRef]
- Yang, H., Zhang, B., Wang, N., Guo, C., Zhang, X., Lin, L., Wang, J., Zhou, T., Guan, M., Zhang, R., & Wang, C. D. (2024). FinRobot: An open-source AI agent platform for financial applications using large language models. arXiv. [Google Scholar] [CrossRef]
- Yang, J., Tang, Y., Li, Y., Zhang, L., & Zhang, H. (2025). Dynamic hedging strategies in derivatives markets with LLM-driven sentiment and news analytics. arXiv. [Google Scholar] [CrossRef]
- Yang, Y., Zhang, Y., Wu, M., Zhang, K., Zhang, Y., Yu, H., Hu, Y., & Wang, B. (2025). TwinMarket: A scalable behavioral and social simulation for financial markets. arXiv. [Google Scholar] [CrossRef]
- Yin, R. K. (2018). Case study research and applications: Design and methods (6th ed.). Scribd. Available online: https://www.scribd.com/document/687414473/YIn-2018-Case-Study (accessed on 23 June 2025).
- Yin, S., Fu, C., Zhao, S., Li, K., Sun, X., Xu, T., & Chen, E. (2024). A survey on multimodal large language models. National Science Review, 11(12), nwae403. [Google Scholar] [CrossRef]
- Ying, L., Collins, K. M., Wong, L., Sucholutsky, I., Liu, R., Weller, A., Shu, T., Griffiths, T. L., & Tenenbaum, J. B. (2025). On benchmarking human-like intelligence in machines. arXiv. [Google Scholar] [CrossRef]
- Yu, Y., Li, H., Chen, Z., Jiang, Y., Li, Y., Zhang, D., Liu, R., Suchow, J. W., & Khashanah, K. (2023). FinMem: A performance-enhanced LLM trading agent with layered memory and character design. arXiv. [Google Scholar] [CrossRef]
- Yu, Y., Yao, Z., Li, H., Deng, Z., Cao, Y., Chen, Z., Suchow, J. W., Liu, R., Cui, Z., Zhang, D., Subbalakshmi, K., Xiong, G., He, Y., Huang, J., Li, D., & Xie, Q. (2024). FinCon: A synthesized LLM multi-agent system with conceptual verbal reinforcement for enhanced financial decision making. arXiv. [Google Scholar] [CrossRef]
- Zhang, K., Yang, J., Inala, J. P., Singh, C., Gao, J., Su, Y., & Wang, C. (2025). Towards understanding graphical perception in large multimodal models. arXiv. [Google Scholar] [CrossRef]
- Zhang, W., Zhao, L., Xia, H., Sun, S., Sun, J., Qin, M., Li, X., Zhao, Y., Zhao, Y., Cai, X., Zheng, L., Wang, X., & An, B. (2024, August 25–29). A multimodal foundation agent for financial trading: Tool-augmented, diversified, and generalist. 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp. 4314–4325), Barcelona, Spain. [Google Scholar] [CrossRef]
- Zhang, Y., Li, Y., Cui, L., Cai, D., Liu, L., Fu, T., Huang, X., Zhao, E., Zhang, Y., Chen, Y., Wang, L., Luu, A. T., Bi, W., Shi, F., & Shi, S. (2023). Siren’s song in the AI ocean: A survey on hallucination in large language models. arXiv. [Google Scholar] [CrossRef]
- Zhang, Y., Pan, Y., Zhong, T., Dong, P., Xie, K., Liu, Y., Jiang, H., Wu, Z., Liu, Z., Zhao, W., Zhang, W., Zhao, S., Zhang, T., Jiang, X., Shen, D., Liu, T., & Zhang, X. (2024). Potential of multimodal large language models for data mining of medical images and free-text reports. Meta-Radiology, 2(4), 100103. [Google Scholar] [CrossRef]



| Feature | Traditional PBC | PCA (AI-Scaffolded) |
|---|---|---|
| Source of confidence | Internal experience and knowledge | Access to intelligent external systems |
| Nature of reasoning | Self-reliant, effortful processing | Co-constructed with AI guidance |
| Knowledge access | Stored internally | Queried or retrieved dynamically |
| Behavioral boundary | Defined by personal capability | Extended by perceived machine cognition |
| Example | “I know how to trade options” | “I can trade options because GPT explains it” |

| EMH Test Focus | Operational Measure | Interpretation |
|---|---|---|
| Weak-form efficiency | Autocorrelation in post-trade returns; Hurst exponent analysis (Hurst, 1951) | Significant autocorrelation or persistence → potential inefficiency |
| Random walk behavior | Variance ratio tests (A. W. Lo & MacKinlay, 1988); cumulative return drift | Systematic drift → violation of EMH randomness |
| Risk-adjusted performance | Sharpe ratio (Sharpe, 1994) comparisons between LLM-assisted and baseline trades | Sustained alpha with LLM → weak-form EMH violation |
| Price adjustment speed | Event study of asset price movement (MacKinlay, 1997; Martinez-Blasco et al., 2023) after LLM-identified signals | Delayed reactions suggest semi-strong inefficiency |
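Three of the operational measures in the table above (return autocorrelation, the variance ratio, and the Sharpe ratio) can be illustrated with a minimal sketch. It uses only the Python standard library and assumes a plain list of per-period post-trade returns; the function names and the demo series are hypothetical and are not taken from the study. The variance ratio shown is a simplified overlapping estimator without the bias correction or robust test statistic of the full Lo and MacKinlay (1988) procedure, and the Hurst exponent and event-study measures are not covered.

```python
from statistics import mean, stdev


def autocorrelation(returns, lag=1):
    """Sample autocorrelation of a return series at the given lag."""
    mu = mean(returns)
    denom = sum((r - mu) ** 2 for r in returns)
    num = sum((returns[t] - mu) * (returns[t - lag] - mu)
              for t in range(lag, len(returns)))
    return num / denom


def variance_ratio(returns, q=2):
    """Simplified variance ratio VR(q).

    Variance of overlapping q-period returns divided by q times the
    one-period variance. Values near 1 are consistent with a random walk;
    values below 1 point to mean reversion, above 1 to persistence.
    """
    mu = mean(returns)
    n = len(returns)
    var_1 = sum((r - mu) ** 2 for r in returns) / n
    q_sums = [sum(returns[t:t + q]) for t in range(n - q + 1)]
    var_q = sum((s - q * mu) ** 2 for s in q_sums) / len(q_sums)
    return var_q / (q * var_1)


def sharpe_ratio(returns, risk_free=0.0):
    """Per-period Sharpe ratio: mean excess return over its sample std. dev."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)


# Hypothetical returns that flip sign every period (strong mean reversion).
demo = [0.01, -0.01] * 50
print(autocorrelation(demo), variance_ratio(demo), sharpe_ratio(demo))
```

In a comparison of LLM-assisted and baseline trades, the same functions would be applied to each group's return series separately; a constant series makes the denominators zero, so a production version would need to guard against that case.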

| AMH Construct | Observable Indicator | Measurement Method |
|---|---|---|
| Cognitive adaptation | Increase in complex strategies (e.g., spreads, straddles, delta-neutral) | Trade classification (pre/post LLM use) |
| Strategic evolution | Higher frequency of volatility exposure, use of Greeks in decision-making | Strategy tagging, prompt log analysis |
| Tool-conditioned efficiency | Return stabilization or reduced drawdowns in LLM-assisted trades | Rolling Sharpe ratios, drawdown histograms |
| Behavioral sophistication | Reduced herding, greater asset diversification | Correlation matrix of asset choices among users |
| Cross-sectional diffusion | Spread of institutional-grade strategies into retail segments | Option flow segmentation by account type |
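The "tool-conditioned efficiency" row above names rolling Sharpe ratios and drawdowns as measurement methods. The following is a minimal sketch of both, assuming a plain list of per-period returns from LLM-assisted trades; the function names, the window length, and the sample values are hypothetical choices for illustration, not the study's specification.

```python
from statistics import mean, stdev


def rolling_sharpe(returns, window):
    """Sharpe ratio over each trailing window (window must be >= 2).

    Returns None for windows with zero volatility, where the ratio
    is undefined.
    """
    out = []
    for end in range(window, len(returns) + 1):
        chunk = returns[end - window:end]
        sd = stdev(chunk)
        out.append(mean(chunk) / sd if sd > 0 else None)
    return out


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve,
    expressed as a positive fraction of the running peak."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


# Hypothetical return path: a gain, a sharp loss, then a partial recovery.
path = [0.10, -0.50, 0.20]
print(rolling_sharpe(path, window=2), max_drawdown(path))
```

Return stabilization would show up as a rolling Sharpe series that varies less across windows, and reduced drawdowns as a smaller `max_drawdown` value for LLM-assisted trades than for the baseline.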

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Gimmelberg, D.; Ludviga, I. Strategic Complexity and Behavioral Distortion: Retail Investing Under Large Language Model Augmentation. Int. J. Financial Stud. 2025, 13, 210. https://doi.org/10.3390/ijfs13040210

