Temporal Obfuscation Testing for LLM Structural Reasoning: From Single-Day Dealer Constraints to Persistent Market Regimes
Abstract
1. Introduction
- Research gap.
- Why 0DTE matters here.
1.1. Research Questions
- 1.
- Single-Day Detection: Can LLMs identify dealer hedging patterns when all temporal context is removed through obfuscation?
- 2.
- Raw Chain Superiority: Does strike-level data outperform parametric GEX summaries for structural detection?
- 3.
- Regime Selectivity: Can LLMs identify persistent 30-day regimes while rejecting transitional periods?
- 4.
- Market Structure Evolution: Did 0DTE proliferation create detectable structural change in dealer positioning regimes (Adams et al., 2025)?
1.2. Contributions
1.3. Positioning
1.4. Paper Organization
2. Background and Related Work
2.1. Dealer Hedging and Market Microstructure
2.2. Zero-Days-to-Expiration (0DTE) Options
2.3. LLM Reasoning Evaluation
2.4. LLM Applications in Finance
2.5. Regime Detection in Financial Markets
2.6. Obfuscation Testing
2.7. Research Gap
- 1.
- Demonstrated that raw options chain data outperforms parametric GEX for structural pattern detection,
- 2.
- Validated detection through obfuscation testing, eliminating training data contamination,
- 3.
- Established detection-alpha orthogonality proving mechanism identification independent of profitability,
- 4.
- Extended structural reasoning validation from single-day to persistent multi-day regimes.
3. Methodology
3.1. Obfuscation Testing Protocol
3.2. Causal Framework: WHO → WHOM → WHAT
3.3. Single-Day Pattern Detection
3.4. 30-Day Regime Detection Framework
3.5. GEX Calculation
3.6. Multi-Phase Validation Strategy
- Shuffled (277 windows): Randomize day order within real windows, destroying the temporal structure while preserving aggregate statistics.
- Transitional (255 windows): Generate windows with frequent sign flips (>8 per window), violating the stability criterion.
- Low-magnitude (277 windows): Generate windows with a magnitude below $3B, violating the magnitude criterion.
3.7. LLM Configuration
3.8. Markov-Switching Benchmark
3.9. LLM Usage Disclosure
4. Single-Day Validation Results
4.1. Detection Under Obfuscation
4.2. Raw Chain Validation
4.3. Detection-Alpha Orthogonality
4.4. Inverse P-Hacking Defense
5. Regime Detection and Market Evolution
- Statistical conventions used in this section:
5.1. Phase 1–3: Baseline and Full-Year Validation
5.2. Phase 2: Negative Controls
5.3. Phase 4: 2020 vs. 2024 Comparison
5.4. Phase 5: Multi-Year Temporal Evolution (2020–2025)
5.5. Threshold Sensitivity
5.6. Comparison with Markov-Switching Benchmark
5.7. LLM Reasoning Quality
6. Discussion
6.1. From Single-Day to Multi-Day: Bridging Two Temporal Scales
6.2. Validating Structural Reasoning
6.3. Market Structure Evolution and 0DTE Hypothesis
6.4. Dispersed Knowledge and Information Aggregation
6.5. Practical Implications
6.5.1. Risk Management
6.5.2. Market Efficiency
6.5.3. Practitioners: Data-Pipeline and Model-Deployment Design
6.6. Limitations and Future Work
- Single-asset scope: All reported results concern SPY. SPY is deliberately chosen as the highest-liquidity and earliest 0DTE-enabled U.S. equity benchmark, but this choice leaves cross-asset generalization empirically untested. Dealer positioning in QQQ, IWM, single-name equities, and non-equity underliers (futures, rates, FX) may exhibit different regime dynamics because of differences in option chain depth, 0DTE availability, and the composition of end-users. Cross-asset replication is the single highest-priority item for future work; a pre-registered protocol applying the same obfuscation and regime-classification framework to at least one additional ETF (QQQ) and one individual equity (e.g., NVDA, AAPL) would directly test the transferability claim.
- Single-LLM dependence: All 2221 evaluations were produced by a single reasoning model (OpenAI o4-mini). The detection rates reported here are therefore conditional on this specific model’s prior distribution over market-structure reasoning. Model-swap validation with alternative reasoning families (e.g., Anthropic Claude, OpenAI o3, Google Gemini, open-source reasoning models) using identical prompts and the same obfuscated sequences is a direct and low-cost extension. A cross-model agreement analysis would sharpen the distinction between the framework’s structural-reasoning claim and any o4-mini-specific artefacts.
- Lack of independent external validation: Our per-window ground-truth metrics (persistence, magnitude, sign flips) are computed from the same Alpha Vantage options feed used to construct the windows. We do not cross-validate detected regimes against an independent data source (CBOE DataShop, OPRA consolidated feed, or a commercial vendor such as SpotGamma or MenthorQ) or against an independent oracle of dealer positioning. External validation—both against a second options-data pipeline and against related microstructure observables (realized volatility, implied-realised spread, opening auction imbalance)—would strengthen the claim that the detected regimes correspond to a real cross-verified phenomenon rather than an artefact of any single data provider.
- End-of-day measurement: Our GEX_OI approach captures dealer inventory at the close but not intraday gamma dynamics; high-frequency flow data could refine detection, particularly for 0DTE contracts that expire within a single trading session. Intraday GEX surface reconstruction from streaming OPRA is a natural extension.
- Causal attribution: The 0DTE hypothesis is supported by temporal coincidence and a theoretical mechanism but remains circumstantial; observational data cannot exclude alternative explanations such as post-pandemic monetary-policy shifts, passive-flow concentration, or market-maker inventory changes. A natural experiment—for instance a temporary 0DTE suspension or a regulatory halt during a market stress episode—would provide stronger causal evidence. We treat the 0DTE correspondence as consistent with our structural-regime detection rather than as a demonstrated causal channel (see Section 6.3).
- Shuffle test asymmetry: The 61% false positive rate on 2024 shuffled data (versus 12.1% on 2020 shuffled data) reflects extreme regime persistence rather than framework failure—2024 regimes exhibit such dominant same-sign positioning that randomizing the day order rarely disrupts the aggregate signal. This asymmetry is itself informative, confirming that 2024 regimes are defined by aggregate dominance rather than temporal sequencing, but it means the shuffle test’s diagnostic power is lower in high-persistence regimes and should be interpreted accordingly.
- Threshold sensitivity: All tested parameter configurations maintained substantial 2024-versus-2020 discrimination (see Section 5 sensitivity analysis), but the chosen thresholds (persistence ≥ 70%, magnitude ≥ $5B, flips ≤ 5) represent empirically validated design choices rather than first-principles derivations. Future work should explore adaptive thresholding that responds to volatility regime, contract-maturity mix, or prevailing options notional.
7. Conclusions
- 1.
- Single-day structural reasoning: Obfuscation achieves 71.5% detection with 91.2% predictive accuracy, and raw-chain validation (92.3% vs. 61.5%) shows LLMs reconstruct dealer positioning from first principles—establishing that parametric GEX is lossy compression of structural signals.
- 2.
- Multi-day regime selectivity: 30-day windows yield 69.1 percentage-point discrimination between 2024 persistent regimes (81.2%, 95% CI [75.8, 86.1]%) and 2020 fragmented markets (12.1%, 95% CI [8.1, 16.6]%; Fisher’s exact p = 1.8 × 10−52, φ = 0.69), with 0% false positives on synthetic controls and 98% mechanical accuracy; the gap exceeds 50 pp in 40 of 45 alternative threshold configurations.
- 3.
- Market structure evolution: Across 1412 windows (2020–2025), the detection progresses non-monotonically from 3.7% (2021) to 100% (2024–2025), and the average GEX magnitude grows from $3.0B to $20.3B—a tipping-point pattern consistent with 0DTE-driven structural reorganization, though contemporaneous confounders (interest-rate regime, passive-flow concentration, dealer inventory) cannot be excluded with observational data.
- 4.
- Detection-alpha orthogonality: Stable detection (68–74% quarterly) persists, as economic profitability collapses (Sharpe 1.8 → 0.1), confirming detected patterns are structural market mechanics rather than tradeable inefficiencies.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| GEX | Gamma Exposure |
| LLM | Large Language Model |
| 0DTE | Zero Days to Expiration |
| OI | Open Interest |
Appendix A. Regime Detection LLM Prompt
Appendix A.1. Model and API Configuration
- Model:o4-mini (snapshot o4-mini-2025-04-16)
- Temperature: not supplied. o4-mini is a reasoning model that runs at a fixed internal sampling temperature; the sampling-temperature parameter is not exposed for the o1/o3/o4 families, so no temperature value is sent in the request.
- Maximum completion tokens: not set; the OpenAI default applies. Observed completion lengths across the 1412 windows ranged from 2457 to 5924 tokens, so no response was truncated.
- Response format: JSON is requested in the prompt text (Appendix A.2); no API-level response_format constraint is applied.
- Access mode: OpenAI Batch API, /v1/chat/completions endpoint, 24-h asynchronous completion window. Submissions are sized per validation phase rather than at a fixed per-batch count.
Appendix A.2. System Message and User Prompt
You are a market structure analyst specializing in dealer gamma positioning regimes.
TASK: Analyze this 30-day period and determine if it represents a PERSISTENT regime where dealer constraints create forced, directional flows.
## 30-DAY GEX DATA
{gex_data_table}
## REGIME CLASSIFICATION FRAMEWORK
### PERSISTENT REGIMES (Detect These)
**1. PERSISTENT POSITIVE REGIME**
- Definition: Dealers are LONG gamma, forced to sell into strength
- Criteria:
* >70% of days (21+/30) have positive net GEX
* Average magnitude >$5B
* <=5 sign flips across 30 days
* Stable directional constraint
**Mechanism**: When dealers hold long gamma: - Price rises -> Dealers MUST sell shares (rebalance) - Price falls -> Dealers MUST buy shares (rebalance) - Creates dampening, mean-reverting flows - Constraint is STRUCTURAL (dealers cannot avoid)
**2. PERSISTENT NEGATIVE REGIME**
- Definition: Dealers are SHORT gamma, forced to buy into strength
- Criteria:
* >70% of days (21+/30) have negative net GEX
* Average magnitude >$5B
* <=5 sign flips across 30 days
* Stable directional constraint
**Mechanism**: When dealers hold short gamma: - Price rises -> Dealers MUST buy shares (chase) - Price falls -> Dealers MUST sell shares (chase) - Creates amplifying, momentum flows - Constraint is STRUCTURAL (dealers cannot avoid)
---
### NON-REGIMES (Reject These)
**3. TRANSITIONAL (Reject)** - Frequent sign flips between positive/negative GEX - No dominant regime direction (less than 70% same sign) - Market in regime change period - Example: 15 positive days, 15 negative days (50/50 split)
**Why Reject**: No persistent constraint. Dealers face mixed conditions daily. Not a structural regime.
**4. LOW CONVICTION (Reject)** - Consistent sign BUT weak magnitude (<$5B average) - Example: 25 days positive, avg $2B GEX - Insufficient constraint to create persistent forced flows
**Why Reject**: Even if sign is consistent, magnitude too weak to force dealers into meaningful positions. Not a structural constraint.
---
## ANALYSIS QUESTIONS
Systematically evaluate the 30-day window:
**Step 1: Sign Persistence** 1. Count days with positive net GEX 2. Count days with negative net GEX 3. Calculate persistence percentage: max(positive_days, negative_days) / 30 * 100 4. Does it meet 70% threshold (21+ days)?
**Step 2: Magnitude Assessment**
1. Calculate average GEX magnitude (absolute value):
sum(|net_gex|) / 30
2. Is average magnitude >=$5B?
3. Check for extreme outliers that might distort average
**Step 3: Stability Check** 1. Count sign flips: How many times does GEX switch from pos->neg or neg->pos? 2. Are there <=5 sign flips across 30 days? 3. Stable regime should have low flip count
**Step 4: Regime Classification**
- If Steps 1, 2, 3 all pass AND positive dominates
-> PERSISTENT POSITIVE
- If Steps 1, 2, 3 all pass AND negative dominates
-> PERSISTENT NEGATIVE
- If Step 1 passes but Step 2 fails -> LOW CONVICTION (reject)
- If Step 1 fails -> TRANSITIONAL (reject)
---
## CONFIDENCE CALIBRATION (Mechanical Guidance)
Use these concrete anchors to calibrate confidence:
**90-100 (Very High Confidence)** - 25-30 days same sign (83-100% persistence) - Average magnitude >$10B - 0-2 sign flips (highly stable) - Example: "29 negative days, avg $15B, 1 flip"
**70-89 (High Confidence)** - 21-24 days same sign (70-80% persistence) - Average magnitude $5-10B - 2-4 sign flips (moderately stable) - Example: "23 negative days, avg $7B, 3 flips"
**50-69 (Borderline - Use with Caution)** - 18-20 days same sign (60-67% persistence) - Average magnitude $3-5B - 5-7 sign flips - Example: "20 negative days, avg $4B, 6 flips" - Note: Borderline cases should generally be REJECTED unless other factors strengthen confidence
**0-49 (Reject - Not Persistent)**
- <18 days same sign (<60% persistence)
- OR average magnitude <$3B
- OR >7 sign flips
- These are NOT persistent regimes
**Important**: Confidence is a FILTER, not a probability. Use it to distinguish clear regimes (70+) from borderline (50-69) from noise (<50).
---
## OUTPUT FORMAT (JSON)
Provide your analysis in this exact JSON structure:
{
"regime_detected": true/false,
"regime_type": "persistent_positive|persistent_negative|
transitional|low_conviction",
"positive_days": <count as integer>,
"negative_days": <count as integer>,
"avg_magnitude_billions": <value as number>,
"sign_flips": <count as integer>,
"persistence_pct": <percentage as number>,
"confidence": <integer 0-100>,
"reasoning": "Explain step-by-step why this is/isn’t a persistent
regime. Reference specific metrics (persistence %,
avg magnitude, sign flips). If rejecting, state which
criterion failed."
}
**IMPORTANT**: All numeric fields (confidence, positive_days, negative_days, sign_flips, avg_magnitude_billions, persistence_pct) MUST be numbers (integers or decimals), NOT words like "thirty-five" or "fifty".
**regime_detected Rules**: - ‘true‘ ONLY if regime_type is "persistent_positive" or "persistent_negative" - ‘false‘ if regime_type is "transitional" or "low_conviction"
---
## KEY PRINCIPLES
1. **Selectivity is Expected**: Most windows will NOT be persistent regimes (expect 30-50% detection rate)
2. **ALL Criteria Must Pass**: Persistence + Magnitude + Stability required for detection
3. **Rejection is Valid**: Saying "no persistent regime" is a correct answer for transitional/weak periods
4. **Mechanical Over Qualitative**: Use concrete thresholds
(70%, $5B, 5 flips) rather than subjective judgment
5. **Structural Focus**: Only detect when dealers are FORCED into directional positions by constraints
Analyze the 30-day GEX data above and provide your regime classification in JSON format.
Appendix A.3. Output Schema and Parsing
- regime_detected (boolean)—true only when regime_type is persistent_positive or persistent_negative; false otherwise.
- regime_type (string)—one of persistent_positive, persistent_negative, transitional, low_conviction.
- positive_days, negative_days, sign_flips (integers in [0, 30]).
- avg_magnitude_billions (float, USD billions).
- persistence_pct (float, percentage).
- confidence (integer 0–100).
- reasoning (string)—free-form step-by-step explanation; retained for the post-hoc reasoning-quality audit reported in Section 5.
| 1 | Phase 4 here uses the 220 windows with complete per-window metric records; the three excluded 2020 windows do not change the point estimates. |
References
- Adams, G., Dim, C., Eraker, B., Fontaine, J. S., Ornthanalai, C., & Vilkov, G. (2025). Do S&P500 options increase market volatility? Evidence from 0DTEs (SSRN Working Paper). SSRN ID 5641974. SSRN. [Google Scholar]
- Anderegg, B., Ulmann, F., & Sornette, D. (2022). The impact of option hedging on the spot market volatility. Journal of International Money and Finance, 124, 102627. [Google Scholar] [CrossRef]
- Ang, A., & Bekaert, G. (2002). International asset allocation with regime shifts. Review of Financial Studies, 15(4), 1137–1187. [Google Scholar] [CrossRef]
- Black, F., & Scholes, M. (1973). The pricing of options and corporate liabilities. Journal of Political Economy, 81(3), 637–654. [Google Scholar] [CrossRef] [PubMed]
- Brown, L. D., Cai, T. T., & DasGupta, A. (2001). Interval estimation for a binomial proportion. Statistical Science, 16(2), 101–133. [Google Scholar] [CrossRef]
- CBOE Global Markets. (2024). Zero days to expiration options (0DTE): Market structure and trading activity (CBOE Research Report). CBOE Insights. [Google Scholar]
- CBOE Global Markets. (2025). SPX 0DTE options jump to record 62% share in August. CBOE Insights. [Google Scholar]
- Dim, C., Eraker, B., & Vilkov, G. (2023). 0DTEs: Trading, gamma risk and volatility propagation (SSRN Working Paper). SSRN ID 4692190. SSRN. [CrossRef]
- Dong, Y., Jiang, X., Liu, H., Jin, Z., Gu, B., Yang, M., & Li, G. (2024). Generalization or memorization: Data contamination and trustworthy evaluation for large language models. In Findings of the association for computational linguistics: ACL 2024 (pp. 12039–12050). Association for Computational Linguistics. [Google Scholar]
- Fishman, R. (2023). All you ever wanted to know about gamma, op-ex, and option-driven equity flows (Technical report, Goldman Sachs Equity Derivatives Strategy). SpotGamma. [Google Scholar]
- Frey, R. (1997). Derivative asset analysis in models with level-dependent and stochastic volatility. CWI Quarterly, 10(1), 1–34. [Google Scholar]
- Gârleanu, N., Pedersen, L. H., & Poteshman, A. M. (2009). Demand-based option pricing. Review of Financial Studies, 22(10), 4259–4299. [Google Scholar] [CrossRef]
- Grossman, S. J., & Miller, M. H. (1988). Liquidity and market structure. Journal of Finance, 43(3), 617–633. [Google Scholar] [CrossRef]
- Hamilton, J. D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica, 57(2), 357–384. [Google Scholar] [CrossRef]
- Hayek, F. A. (1945). The use of knowledge in society. American Economic Review, 35(4), 519–530. [Google Scholar]
- Kirzner, I. M. (1973). Competition and entrepreneurship. University of Chicago Press. [Google Scholar]
- Kojima, T., Gu, S. S., Reid, M., Matsuo, Y., & Iwasawa, Y. (2022). Large language models are zero-shot reasoners. Advances in Neural Information Processing Systems, 35, 22199–22213. [Google Scholar]
- Lopez-Lira, A., & Tang, Y. (2023). Can ChatGPT forecast stock price movements? Return predictability and large language models. arXiv, arXiv:2304.07619. [Google Scholar] [CrossRef]
- Lopez-Lira, A., Tang, Y., & Zhu, M. (2025). The memorization problem: Can we trust LLMs’ economic forecasts? arXiv, arXiv:2504.14765. [Google Scholar] [CrossRef]
- Marcus, G., & Davis, E. (2019). Rebooting AI: Building artificial intelligence we can trust. Pantheon Books. [Google Scholar]
- McCoy, R. T., Yao, S., Friedman, D., Hardy, M., & Griffiths, T. L. (2024). Embers of autoregression: Understanding large language models through the problem they are trained to solve. Proceedings of the National Academy of Sciences, 121(41), e2322420121. [Google Scholar] [CrossRef] [PubMed]
- Ni, S. X., Pearson, N. D., & Poteshman, A. M. (2005). Stock price clustering on option expiration dates. Journal of Financial Economics, 78(1), 49–87. [Google Scholar] [CrossRef]
- Nystrup, P., Madsen, H., & Lindström, E. (2018). Dynamic portfolio optimization across hidden market regimes. Quantitative Finance, 18(1), 83–95. [Google Scholar] [CrossRef]
- OpenAI. (2024). Introducing openai o-series: A new series of reasoning models. OpenAI Research Blog. [Google Scholar]
- Regan, C., & Xie, Y. (2025). Inferring latent market forces: Evaluating LLM detection of gamma exposure patterns via obfuscation testing. In 2nd IEEE international workshop on large language models for finance (LLM-Finance), IEEE international conference on big data (BigData), Macau, China, 8–11 December 2025. IEEE. [Google Scholar]
- Ribeiro, M. T., Wu, T., Guestrin, C., & Singh, S. (2020). Beyond accuracy: Behavioral testing of NLP models with CheckList. In Proceedings of the 58th annual meeting of the association for computational linguistics (pp. 4902–4912). Association for Computational Linguistics. [Google Scholar] [CrossRef]
- SpotGamma. (2021). Understanding gamma exposure (Technical Documentation). Available online: https://spotgamma.com/gamma-exposure-gex/ (accessed on 28 March 2026).
- Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q., & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems, 35, 24824–24837. [Google Scholar]
- Yang, Y., Sun, E., Luo, D., & Wang, W. (2024). TradingAgents: Multi-agents LLM financial trading framework. arXiv, arXiv:2412.20138. [Google Scholar]








| Pattern | Framing | WHO | WHAT |
|---|---|---|---|
| Gamma Positioning | Technical/Greek | Dealers with −Γ | Pro-cyclical hedging |
| Stock Pinning | Behavioral | Market makers | Strike convergence |
| 0DTE Hedging | Temporal | Option writers | Rapid rebalancing |
| Metric | Result |
|---|---|
| Detection comparison | |
| Raw Chain Detection Rate | 92.3% (12/13) |
| GEX-Assisted Baseline | 61.5% (8/13) |
| Improvement | +30.8 pp |
| Reasoning quality (raw chain, n = 13) | |
| Identifies market makers (WHO) | 100% (13/13) |
| Identifies counterparties (WHOM) | 84.6% (11/13) |
| Explains hedging mechanism (WHAT) | 100% (13/13) |
| Avg reasoning score | 5.5/6 |
| Test | 2024 FP (95% CI) | 2020 FP (95% CI) | Criterion |
|---|---|---|---|
| Shuffle | 61.1% [48.1, 74.1]% (33/54) | 12.1% [8.1, 16.6]% (27/223) | diagnostic |
| Transitional | 0.0% [0.0, 10.7]% (0/32) | 0.0% [0.0, 1.7]% (0/223) | <10% |
| Low-Magnitude | 0.0% [0.0, 6.6]% (0/54) | 0.0% [0.0, 1.7]% (0/223) | <10% |
| Metric | 2024 | 2020 | Difference |
|---|---|---|---|
| Detection Rate | 81.2% [75.8, 86.1]% (181/223) | 12.1% [8.1, 16.6]% (27/223) | +69.1 pp |
| Avg Persistence (detected) | 98.2% | 100.0% | −1.8 pp |
| Avg Magnitude (detected) | $30.5B | $5.5B | +$25.0B |
| Avg Magnitude (rejected) | $31.8B | $2.2B | +$29.6B |
| Dominant Sign | Negative | Positive | Flip |
| 0DTE Volume Share (SPY) | ≈46% | <5% | – |
| Year | Win. | Det. | Rate | 95% CI | Avg GEX | Status |
|---|---|---|---|---|---|---|
| 2020 | 213 | 26 | 12.2% | [8.5, 17.3]% | $3.0B | Pre-regime |
| 2021 | 241 | 9 | 3.7% | [2.0, 6.9]% | $4.9B | Borderline |
| 2022 | 244 | 79 | 32.4% | [26.8, 38.5]% | $5.5B | Growing |
| 2023 | 228 | 46 | 20.2% | [15.5, 25.9]% | $9.6B | Inconsistent |
| 2024 | 241 | 241 | 100% | [98.4, 100.0]% | $20.3B | Structural shift |
| 2025 | 245 | 245 | 100% | [98.5, 100.0]% | $19.0B | Sustained |
| Total | 1412 | 646 | 45.8% | [43.2, 48.4]% | – | – |
| Year | HMM Input | N | LLM Rate | HMM Rate | Agree | κ |
|---|---|---|---|---|---|---|
| 2020 | SPY returns | 201 | 8.5% | 80.1% | 28.4% | 0.045 |
| 2024 | SPY returns | 222 | 81.1% | 87.4% | 68.5% | −0.178 |
| 2024 | Net GEX ($bn) | 221 | 81.0% | 65.2% | 84.2% | 0.610 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Regan, C.; Xie, Y. Temporal Obfuscation Testing for LLM Structural Reasoning: From Single-Day Dealer Constraints to Persistent Market Regimes. J. Risk Financial Manag. 2026, 19, 382. https://doi.org/10.3390/jrfm19060382
Regan C, Xie Y. Temporal Obfuscation Testing for LLM Structural Reasoning: From Single-Day Dealer Constraints to Persistent Market Regimes. Journal of Risk and Financial Management. 2026; 19(6):382. https://doi.org/10.3390/jrfm19060382
Chicago/Turabian StyleRegan, Christopher, and Ying Xie. 2026. "Temporal Obfuscation Testing for LLM Structural Reasoning: From Single-Day Dealer Constraints to Persistent Market Regimes" Journal of Risk and Financial Management 19, no. 6: 382. https://doi.org/10.3390/jrfm19060382
APA StyleRegan, C., & Xie, Y. (2026). Temporal Obfuscation Testing for LLM Structural Reasoning: From Single-Day Dealer Constraints to Persistent Market Regimes. Journal of Risk and Financial Management, 19(6), 382. https://doi.org/10.3390/jrfm19060382

