1. Introduction
The emergence of blockchain-based financial ecosystems has created new paradigms for decentralized data analytics and automated decision-making systems. Cryptocurrency markets, as the primary application domain of blockchain technology, generate rich multi-dimensional datasets that combine on-chain transaction flows with traditional market microstructure signals. This unique data environment presents both opportunities and challenges for AI-driven analytics, particularly in the context of developing automated trading systems that can operate within decentralized finance (DeFi) protocols and smart contract environments.
Cryptocurrency markets exhibit unique characteristics that challenge traditional financial prediction frameworks [1,2]. High volatility, 24/7 trading cycles, and diverse investor participation create complex market dynamics requiring specialized analytical approaches. The rapid growth of cryptocurrency trading volumes, exceeding USD 1 trillion daily across major exchanges, demands sophisticated prediction systems capable of generating consistent profits under realistic transaction costs [3,4].
Traditional cryptocurrency prediction research focuses primarily on single-timeframe analysis, typically employing daily price data or minute-level technical indicators in isolation [5]. This approach overlooks the fundamental interaction between macro-economic trends and microstructure dynamics that characterizes modern cryptocurrency markets [6,7]. Daily momentum patterns often manifest through intraday order flow changes, while microstructure signals gain predictive power when aligned with broader market trends.
The integration of multiple temporal scales presents both opportunities and challenges for cryptocurrency direction prediction [8]. Macro features derived from daily OHLCV data across multiple assets provide market-wide context and fundamental momentum indicators. Microstructure features extracted from minute-frequency order book snapshots capture real-time market sentiment and liquidity conditions. The temporal bridge between these domains occurs at intermediate horizons where daily directional bias influences minute-level market-making activities.
Existing prediction frameworks typically employ three-class classification schemes (Up, Down, No-trade) where models simultaneously learn directional prediction and execution timing decisions [9,10]. This approach confounds signal extraction with risk management, potentially degrading both prediction accuracy and trading performance. The mixed representation of unclear signals and inappropriate timing within no-trade samples may compromise model learning effectiveness.
Our confidence-threshold approach builds on established foundations in selective classification (Chow, 1970 [11]; Herbei & Wegkamp, 2006 [12]), where classifiers may abstain from predictions when confidence is insufficient. The framework also draws conceptual parallels with abstention learning in machine learning (Cortes et al., 2016 [13]) and uncertainty-based trading in quantitative finance (López de Prado, 2018 [14]). Abstention learning addresses scenarios where classifiers may decline predictions when uncertainty is high, optimizing coverage–accuracy trade-offs through rejection thresholds. Similarly, uncertainty-based portfolio construction incorporates prediction confidence into position sizing and execution decisions. While these frameworks establish theoretical foundations for confidence-aware decision-making and have been extensively developed for cost-sensitive learning, their application to cryptocurrency direction prediction with integrated macro–microstructure features and systematic threshold optimization across multiple temporal scales represents an empirical contribution addressing domain-specific challenges in decentralized finance.
Confidence-threshold mechanisms offer alternative approaches to execution control, enabling separation of directional prediction from trading decisions [15]. By training binary classifiers for pure directional signals and employing separate confidence-based execution rules, systems can optimize the precision–recall trade-off systematically. This decoupling allows explicit control over trading frequency versus signal quality, addressing fundamental challenges in cryptocurrency prediction system design.
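The decoupling described above can be sketched as a post hoc rule applied to a binary classifier's output. This is a minimal illustration, not the paper's implementation: it assumes the classifier emits a probability of an upward move, that confidence is the larger of P(up) and P(down), and that the same threshold τ applies to both directions. The function name is hypothetical.

```python
from typing import Optional


def execution_decision(p_up: float, tau: float = 0.8) -> Optional[str]:
    """Map a binary classifier's P(up) to a trade action, or abstain.

    Assumption: a symmetric threshold tau on max(P(up), P(down)).
    """
    if not 0.0 <= p_up <= 1.0:
        raise ValueError("p_up must be a probability in [0, 1]")
    if p_up >= tau:
        return "long"          # confident upward prediction
    if 1.0 - p_up >= tau:
        return "short"         # confident downward prediction
    return None                # confidence too low: no trade


decisions = [execution_decision(p) for p in (0.91, 0.55, 0.12, 0.80)]
print(decisions)  # ['long', None, 'short', 'long']
```

Because abstention happens after training, the directional model never has to learn a separate no-trade class; raising τ simply trades coverage for precision.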
Neural network architectures demonstrate consistent effectiveness for cryptocurrency prediction tasks, with Long Short-Term Memory (LSTM) networks achieving directional accuracies of 60–85% across multiple studies [9,10]. However, most research evaluates prediction accuracy rather than economic performance, limiting practical applicability assessment. The gap between statistical performance and trading profitability requires frameworks that explicitly optimize economic metrics under realistic operational constraints.
Feature engineering for cryptocurrency prediction typically emphasizes either technical indicators derived from price data [9] or market microstructure metrics from order book analysis [10]. Limited research investigates systematic integration of macro and microstructure signals across different temporal scales. The potential for cross-temporal feature interactions remains underexplored, particularly regarding optimal prediction horizons and signal quality thresholds.
Parameter optimization in cryptocurrency prediction systems often focuses on individual components rather than systematic exploration of joint parameter spaces. Horizon selection [16,17], signal quality requirements, and execution thresholds [15] interact in complex ways that may not be captured through independent optimization. Comprehensive parameter space analysis becomes essential for identifying optimal trading strategy configurations under different risk–return preferences.
The present research addresses these limitations through a two-class framework that integrates macro and microstructure features across multiple temporal scales. Our approach separates directional prediction from execution decisions using confidence-based thresholds, enabling systematic optimization of the precision–recall trade-off. We conduct comprehensive experiments across 11 major cryptocurrency pairs, exploring prediction horizons from 10 to 600 min, deadband thresholds from 2 to 20 basis points, and confidence levels of 0.6 and 0.8.
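The roles of the horizon and deadband parameters can be illustrated with a small labeling sketch. It assumes, as one plausible reading that is not confirmed by the text above, that a sample is labeled Up or Down only when the forward return over the horizon exceeds the deadband in absolute value, and is otherwise left out of the binary training set. The function name and the toy prices are hypothetical.

```python
from typing import List, Optional, Sequence


def deadband_labels(
    prices: Sequence[float], horizon: int, deadband_bps: float
) -> List[Optional[int]]:
    """Label each time step 1 (Up), 0 (Down), or None (inside the deadband).

    horizon is measured in bars (e.g. minutes); deadband_bps in basis points.
    """
    labels: List[Optional[int]] = []
    for t in range(len(prices)):
        if t + horizon >= len(prices):
            labels.append(None)  # no future price available yet
            continue
        ret_bps = (prices[t + horizon] / prices[t] - 1.0) * 1e4
        if ret_bps > deadband_bps:
            labels.append(1)
        elif ret_bps < -deadband_bps:
            labels.append(0)
        else:
            labels.append(None)  # move too small to count as a signal
    return labels


print(deadband_labels([100.0, 100.05, 100.3, 99.7], horizon=1, deadband_bps=10))
# [None, 1, 0, None]
```

Under this reading, a wider deadband keeps fewer but cleaner samples, which is why it interacts with the horizon and the confidence level.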
The research contributes to the cryptocurrency prediction literature through three primary advances. First, we develop a two-class binary classification framework that decouples directional prediction from execution timing decisions. Second, we implement systematic integration of macro momentum signals with microstructure dynamics through unified feature engineering. Third, we conduct comprehensive parameter space optimization across multiple dimensions to identify optimal trading strategy configurations.
Our experimental design employs rigorous temporal validation with symbol-wise splitting to prevent data leakage while maintaining realistic trading conditions. All performance evaluations incorporate transaction costs and focus on economic metrics relevant to practical trading applications. The results demonstrate significant improvements over baseline approaches, achieving peak profits of 167.64 basis points per trade, with directional accuracies of 75–95% on executed trades.
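A symbol-wise temporal split of the kind mentioned above can be sketched as follows. The split fractions, the record layout (`symbol` and `ts` keys), and the function name are assumptions made for illustration; the paper's actual proportions are not given in this section.

```python
from collections import defaultdict
from typing import Dict, List


def symbolwise_temporal_split(
    rows: List[dict], train_frac: float = 0.7, val_frac: float = 0.15
) -> Dict[str, List[dict]]:
    """Split each symbol's rows chronologically into train/val/test.

    Every validation row of a symbol is later than all of that symbol's
    training rows, and every test row is later than its validation rows.
    """
    by_symbol: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        by_symbol[row["symbol"]].append(row)

    parts: Dict[str, List[dict]] = {"train": [], "val": [], "test": []}
    for items in by_symbol.values():
        ordered = sorted(items, key=lambda r: r["ts"])  # oldest first
        n = len(ordered)
        i = int(n * train_frac)
        j = int(n * (train_frac + val_frac))
        parts["train"] += ordered[:i]
        parts["val"] += ordered[i:j]
        parts["test"] += ordered[j:]
    return parts


rows = [{"symbol": s, "ts": t} for s in ("BTC", "ETH") for t in range(20)]
parts = symbolwise_temporal_split(rows)
print({name: len(chunk) for name, chunk in parts.items()})
```

Splitting by time within each symbol, rather than shuffling rows, is what keeps future information out of the training set.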
The remainder of this paper proceeds as follows. Section 2 reviews the relevant literature on cryptocurrency prediction methods and performance benchmarks. Section 3 describes the two-class framework architecture and multi-scale feature integration methodology. Section 4 details the dataset characteristics and preprocessing procedures. Section 5 presents the experimental design and validation framework. Section 6 reports comprehensive results across both confidence regimes. Section 7 discusses economic interpretation, benchmark comparisons, and practical implementation considerations. Section 8 concludes with limitations and future research directions.
2. Literature Review
Cryptocurrency prediction research has evolved rapidly alongside market maturation, encompassing diverse methodological approaches from traditional time series analysis to advanced deep learning architectures. This review examines recent developments in cryptocurrency direction prediction, with particular emphasis on feature integration strategies, neural network architectures, and performance evaluation frameworks relevant to our two-class approach.
2.1. Neural Network Architectures for Cryptocurrency Prediction
Deep learning methods dominate contemporary cryptocurrency prediction research, with Long Short-Term Memory (LSTM) networks serving as the foundational architecture across multiple studies. Zhang et al. (2024) [18] conducted a comprehensive survey of deep learning applications in cryptocurrency markets, finding that LSTM models consistently achieve 83–84% average accuracy for Bitcoin and Ethereum prediction tasks. Their analysis reveals that ensemble methods combining multiple weak classifiers often outperform individual models, achieving explained variance scores of 0.97 and mean percentage errors around 0.06.
Attention mechanisms represent a significant advancement in cryptocurrency prediction architectures. Shang et al. (2024) [19] propose an attention-based CNN-BiGRU model for Ethereum price prediction, integrating blockchain information and external factors from 2017–2021 data. Their two-stage approach combines improved CNN for feature extraction with bidirectional GRU and attention mechanisms, achieving RMSE of 151.6 and MAE of 91.2, substantially outperforming traditional CNN-GRU (RMSE: 1067.1) and BiGRU (RMSE: 1065.7) baselines.
Graph neural networks introduce network-based perspectives to cryptocurrency prediction. Zhong et al. (2023) [20] develop LSTM-ReGAT, combining LSTM with Relationwise Graph Attention Networks for cryptocurrency price trend prediction. Their approach constructs a cryptocurrency network based on shared features including technological foundation, industry classification, and investor co-attention patterns. Testing on 645 cryptocurrencies over 995 days (March 2020–December 2022), they achieve AUC of 0.6615 and accuracy of 62.97%, representing modest but consistent improvements over LSTM baselines (AUC: 0.6546, accuracy: 62.27%).
2.2. Multi-Scale and Multi-Target Learning Approaches
Multi-target learning emerges as a promising direction for cryptocurrency prediction, leveraging correlations across multiple assets. Pellicani et al. (2025) [21] introduce CARROT, employing temporal clustering with Dynamic Time Warping to group correlated cryptocurrencies before training multi-target LSTM models for each cluster. Their approach processes 17 cryptocurrencies from January 2020 to December 2021, achieving an average 10% improvement in macro F1-score over single-target LSTMs, with the best performance showing 19% improvement using 6-month training intervals.
High-frequency prediction presents unique challenges requiring specialized architectures. Peng et al. (2024) [22] propose ACLMC (Attention-based CNN-LSTM for Multiple Cryptocurrencies) combined with novel triple trend labeling using local minimum series. Their approach integrates macro and microstructure features across multiple frequencies and currencies, achieving significant reduction in transaction numbers (approximately 90% compared to traditional methods) while maintaining profitable performance.
2.3. Feature Engineering and Selection Methods
Feature selection methodology significantly impacts cryptocurrency prediction performance. El Youssefi et al. (2025) [23] conduct a systematic investigation of feature selection methods applied to 130+ technical indicators for cryptocurrency price forecasting. Using mutual information (MI), recursive feature elimination (RFE), and recursive feature importance (RFI) methods with SVR, Huber, and KNN regressors, they achieve 80–85% feature reduction while maintaining or enhancing performance. Their results show peak R² values of 0.45–0.7 across BTC, ETH, and BNB pairs, with momentum and volatility indicators proving most important across timeframes.
Curvature-based approaches offer alternative feature engineering strategies. Zhang et al. (2024) [24] introduce a generalized visible curvature indicator (CCPIq) for cryptocurrency bubble identification and price trend prediction. Their method captures geometric properties of log-price trajectories, quantifying interactions between trend, acceleration, and volatility. Integration with LightGBM achieves classification accuracy improvements and trading performance with Sharpe ratios up to 2.93 for Ethereum, significantly outperforming traditional bubble identification methods.
2.4. Probabilistic and Uncertainty Quantification Methods
Uncertainty quantification represents an emerging focus in cryptocurrency prediction research. Golnari et al. (2024) [25] introduce Probabilistic Gated Recurrent Units (P-GRU) for Bitcoin price prediction with uncertainty quantification. Their approach integrates probabilistic attributes into standard GRU architecture, facilitating generation of probability distributions for predicted values. Testing on one year of Bitcoin data at 5 min intervals, they achieve an R² score of 0.99973 and MAPE of 0.00190, substantially outperforming traditional LSTM/GRU variants.
Potential field theory provides a theoretical foundation for cryptocurrency market characterization. Anoop et al. (2025) [26] present a Bayesian machine learning framework using potential field theory and Gaussian processes to model cryptocurrency price movements as trajectories in dynamical systems governed by time-varying potential fields. Their analysis of Bitcoin crash periods (2017–2021) shows that attractors captured market trends, volatility, and correlations, with mean attractor features improving LSTM prediction performance by 25–28% in terms of MSE reduction.
2.5. Trading Strategy Integration and Performance Evaluation
The integration of prediction models with trading strategies receives increasing attention in the recent literature. Kang et al. (2025) [27] investigate technical indicator integration with deep learning-based price forecasting across 12 models for cryptocurrency trading strategies. Their best performing strategy combines TimesNet with Bollinger Bands in ETH markets, achieving returns of 3.19, maximum drawdown of −7.46%, and Sharpe ratio of 3.56. Technical indicator integration shows significant improvements at 4 h intervals, though no improvement occurs at shorter 30 min intervals.
Portfolio construction and trading strategy evaluation require sophisticated frameworks. Viéitez et al. (2024) [28] develop machine learning systems for Ethereum prediction and knowledge-based investment strategies, testing regression approaches with GRU and LSTM networks alongside SVM classification for trend prediction. Their evaluation across different time periods with real cryptocurrency market data shows profit factors ranging from 1.14 to 5.16, with limited influence from sentiment analysis integration.
2.6. Market Microstructure and Behavioral Factors
Market microstructure analysis reveals important patterns relevant to cryptocurrency prediction. Liu et al. (2025) [29] investigate liquidity commonality across 50 major cryptocurrencies from 2016 to 2023, finding strong positive liquidity commonality, with most coefficients approximating 1.0. Their results show liquidity commonality peaks mid-week (Wednesday–Thursday: 0.481–0.453) compared to weekends (0.246–0.322), with seasonal patterns persisting after controlling for volatility and returns.
Momentum effects demonstrate regime-dependent characteristics in cryptocurrency markets. Hsieh et al. (2025) [30] examine how market-state transitions shape momentum profitability across 2130 cryptocurrencies using weekly data from 2015 to 2023. Their findings show momentum profits concentrated exclusively in UP-UP transitions (11.9–15.5 basis points weekly), with no significant momentum in other regime combinations, suggesting asymmetric belief-updating patterns among cryptocurrency investors.
2.7. Grey Systems and Alternative Forecasting Methods
Alternative methodological approaches provide complementary perspectives to neural network dominance. Yang et al. (2025) [31] propose grey multivariate convolution models (GMCN(1,N)) for short-term cryptocurrency price forecasting, using grey correlation analysis to select core influencing variables. Testing on Bitcoin, Ethereum, and Litecoin data between 2022 and 2023, they achieve highly accurate predictions with MAPE values of 1.58% (BTC), 1.12% (ETH), and 2.53% (LTC), demonstrating the effectiveness of grey systems theory for cryptocurrency prediction.
2.8. Research Gaps and Methodological Challenges
Systematic mapping studies reveal persistent challenges in cryptocurrency trading research. Nguyen and Chan (2024) [32] analyze 622 papers on cryptocurrency trading from 2015 to 2022, categorizing research into seven themes: pricing theories (208 papers), influential factors (165 papers), forecasting (119 papers), trading and portfolio management (76 papers), market evolution and regulation (65 papers), risk evaluation (54 papers), and trading platforms (10 papers). Their analysis shows that 75% of trading systems use multiple input sources, while machine learning approaches generally achieve less than 65% accuracy in price prediction tasks.
Acceptance and adoption factors influence cryptocurrency market dynamics beyond technical prediction capabilities. Madanchian et al. (2025) [33] conduct a systematic review of factors influencing cryptocurrency adoption, identifying motivators including privacy, curiosity, and investment potential, alongside inhibitors such as volatility, regulatory uncertainty, and security concerns. Their analysis reveals substantial research gaps in understanding adoption motivations and regional acceptance disparities.
The broader machine learning literature provides theoretical foundations for confidence-based decision-making. Abstention learning frameworks (Cortes et al., 2016 [13]) optimize classifier performance by rejecting uncertain predictions, trading coverage for accuracy through explicit rejection costs. In quantitative finance, uncertainty-aware portfolio construction (López de Prado, 2018 [14]) incorporates prediction confidence into position sizing and risk management. Our two-class framework extends these concepts to cryptocurrency markets by (1) separating directional prediction from execution decisions through post hoc confidence thresholding rather than integrated three-class learning; (2) systematically optimizing confidence thresholds using validation data across multiple prediction horizons and signal quality requirements; and (3) integrating cross-temporal macro–microstructure features that capture domain-specific cryptocurrency market dynamics. While conceptually related to abstention learning, our approach addresses unique challenges in decentralized financial systems, including 24/7 markets, minute-level execution constraints, and blockchain-native transaction cost structures.
2.9. Synthesis and Research Positioning
The literature reveals three primary research streams relevant to our investigation. First, architectural innovations focus on attention mechanisms, graph neural networks, and probabilistic approaches, with performance improvements typically ranging from 10 to 25% over baseline methods. Second, multi-scale and multi-target approaches demonstrate consistent benefits, particularly the 10–20% F1-score improvements shown by CARROT and similar systems. Third, feature engineering and selection methods prove critical, with studies achieving 80–85% dimensionality reduction while maintaining predictive performance.
Performance benchmarks from the literature establish context for evaluation frameworks. Directional accuracy typically ranges from 60 to 85% across studies, with higher accuracy achievable through stricter confidence requirements. Economic metrics show substantial variation, with Sharpe ratios of 2.5–3.6 representing strong performance, while profit factors of 1.1–5.2 indicate viable trading strategies under different market conditions.
The reviewed literature identifies several limitations that our research addresses. First, most studies focus on single-timeframe analysis, missing opportunities for cross-temporal signal integration. Second, confidence-based execution control remains underexplored, with most approaches using fixed prediction thresholds. Third, systematic parameter optimization across multiple dimensions (horizon, deadband, confidence) lacks comprehensive treatment in the existing work.
Our two-class framework with integrated macro–microstructure features addresses these gaps through explicit confidence-threshold optimization, unified multi-scale feature representation, and comprehensive parameter space exploration across 11 major cryptocurrency pairs.
7. Discussion
7.1. Economic Interpretation of Results
The experimental results demonstrate that cryptocurrency direction prediction using integrated macro–microstructure features can generate economically significant returns under realistic trading conditions. The peak performance of 167.64 basis points per trade (H400-DB10, τ = 0.8) represents substantial value creation when applied to institutional-scale trading volumes.
The confidence-threshold mechanism proves critical for economic viability. High confidence regimes (τ = 0.8) achieve 60.4% higher peak profits than moderate confidence conditions (τ = 0.6), confirming that precision–recall optimization directly translates to economic performance. This relationship validates the core hypothesis that separating directional prediction from execution decisions improves trading system effectiveness.
The coverage–profit trade-off reveals fundamental economic constraints in cryptocurrency markets. High confidence strategies sacrifice 82% of trading opportunities to achieve superior per-trade returns, indicating that genuinely predictable price movements occur infrequently but deliver substantial profits when correctly identified. This finding is broadly consistent with the efficient market hypothesis, in that predictable movements are rare, while pointing to exploitable inefficiencies at specific temporal scales.
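The trade-off can be made concrete with a small evaluation sketch that reports coverage, accuracy on executed trades, and mean net profit per trade for a given τ. It is an illustration under stated assumptions: a symmetric threshold on P(up), realized returns expressed in basis points over the prediction horizon, and a flat cost of 1 basis point charged once per executed trade. The function name and the toy numbers are hypothetical and are not results from the experiments.

```python
from typing import Dict, Sequence


def evaluate_threshold(
    p_up: Sequence[float],
    realized_bps: Sequence[float],
    tau: float,
    cost_bps: float = 1.0,
) -> Dict[str, float]:
    """Coverage, executed-trade accuracy, and mean net bps per trade at tau."""
    executed = 0
    correct = 0
    net_total = 0.0
    for p, r in zip(p_up, realized_bps):
        if p >= tau:
            side = 1           # go long
        elif 1.0 - p >= tau:
            side = -1          # go short
        else:
            continue           # abstain
        gross = side * r       # profit in bps before costs
        executed += 1
        if gross > 0:
            correct += 1
        net_total += gross - cost_bps
    n = len(p_up)
    return {
        "coverage": executed / n if n else 0.0,
        "accuracy": correct / executed if executed else float("nan"),
        "net_bps_per_trade": net_total / executed if executed else float("nan"),
    }


probs = [0.9, 0.7, 0.3, 0.1, 0.55]
moves = [20.0, -5.0, -8.0, -30.0, 2.0]
for tau in (0.6, 0.8):
    print(tau, evaluate_threshold(probs, moves, tau))
```

On these toy inputs the higher threshold executes fewer trades but earns more per trade, which is the qualitative pattern described above.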
Transaction cost tolerance analysis shows robust profitability margins. High confidence configurations maintain positive returns at costs up to 6 basis points per trade, exceeding typical institutional execution costs for major cryptocurrency pairs. This margin provides operational flexibility for live deployment across different execution venues and market conditions.
7.2. Comparison with Benchmark Strategies
Table 11 positions our results within the existing cryptocurrency prediction literature across multiple evaluation frameworks.
Our per-trade basis-point results (104–168 bps for the best configurations) are not directly comparable to the literature that reports profit factor, Sharpe ratio, or long-horizon returns; we therefore treat Table 11 as a contextual, not competitive, comparison. The Viéitez et al. profit factor of 5.16 represents percentage-based returns over longer holding periods, while our basis-point measures reflect per-trade efficiency over minutes-to-hours horizons.
The Sharpe ratios reported by Kang et al. (3.56) and Zhang et al. (2.93) suggest similar risk-adjusted performance levels, indicating potential performance ceilings in cryptocurrency markets. Our confidence-based approach offers comparable economic returns through a fundamentally different methodological pathway.
Accuracy comparisons with Zhong et al. show our directional accuracy (75–95% on executed trades) substantially exceeds their 62.97% classification performance, though their broader cryptocurrency coverage (645 vs. 11 symbols) provides different market exposure profiles.
The CARROT study’s macro F1-score improvement over single-target LSTM baselines (10% on average, 19% at best) aligns with our multi-target learning benefits, supporting the effectiveness of cross-cryptocurrency feature integration approaches.
7.3. Feature Contribution Analysis and Model Interpretability
While formal ablation studies (macro-only, micro-only, combined) remain for future work, post hoc feature importance analysis using permutation importance on the trained MLP provides insights into component contributions. Interpretability is critical for financial model deployment, where regulatory compliance and risk management require transparent decision-making processes (Chechkin et al., 2025 [35]).
We employ permutation importance as the primary interpretability method: for each feature, we randomly shuffle its values in the test set and measure the resulting degradation in directional accuracy. Features causing substantial performance drops when permuted are deemed important. Across optimal configurations (H400–600, τ = 0.8), macro momentum features (20-day moving averages, RSI, multi-horizon returns) rank highest in importance for longer prediction horizons (400–600 min), accounting for approximately 60–65% of total feature importance. Microstructure features (bid–ask imbalances, order book depth ratios, spread measures) contribute primarily at intermediate horizons (100–300 min), where short-term market-making dynamics influence directional outcomes.
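The permutation procedure described above can be sketched in a few lines. This is a generic illustration rather than the code used in the study: the model is represented by any `predict` callable, features are plain Python lists, and the number of repeats and the toy data are assumptions.

```python
import random
from typing import Callable, List, Sequence


def permutation_importance(
    predict: Callable[[List[float]], int],
    X: Sequence[List[float]],
    y: Sequence[int],
    n_repeats: int = 10,
    seed: int = 0,
) -> List[float]:
    """Mean drop in directional accuracy when each feature is shuffled."""
    rng = random.Random(seed)

    def accuracy(rows: Sequence[List[float]]) -> float:
        return sum(predict(r) == t for r, t in zip(rows, y)) / len(y)

    baseline = accuracy(X)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(shuffled))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy check: the stand-in model uses only feature 0, so feature 1 scores 0.
X = [[1.0 if i % 2 else -1.0, float(i % 3)] for i in range(20)]
y = [int(row[0] > 0) for row in X]
print(permutation_importance(lambda row: int(row[0] > 0), X, y))
```

Dividing each value by the sum of all importances gives a share of total importance, comparable in form to the 60–65% figure quoted for macro momentum features.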
For production deployment, more sophisticated interpretability frameworks such as SHAP (SHapley Additive exPlanations) would provide instance-level explanations of individual trading decisions. SHAP has proven effective for attention analysis in hybrid architectures combining neural networks with interpretable components (Chechkin et al., 2025 [35]). Integration of SHAP values could enable traders to understand why specific confidence thresholds were triggered and which feature combinations drove directional predictions, enhancing trust and facilitating human oversight. However, SHAP’s computational cost (10–100× slower than a forward pass) currently limits real-time application for minute-level trading. Future work should investigate lightweight approximation methods for SHAP in high-frequency financial contexts, potentially leveraging attention mechanisms to prioritize explanation computation for high-confidence trades only.
7.4. Practical Implementation Considerations
Live deployment of the two-class framework requires addressing several operational challenges not fully captured in backtesting environments. The confidence-threshold mechanism demands real-time probability calibration as market regimes shift, potentially requiring adaptive threshold adjustment beyond the fixed τ values evaluated experimentally.
Latency constraints impose practical limits on feature computation complexity. The 64-feature unified representation requires approximately 15 ms calculation time on standard hardware, compatible with minute-frequency decision cycles but potentially restrictive for higher-frequency applications.
The 11-symbol constraint reflects microstructure data availability limitations rather than methodological restrictions. Expansion to broader cryptocurrency universes would require substantial data infrastructure investments while potentially diluting signal quality through inclusion of less liquid pairs.
Execution implementation must account for market impact costs not captured in the 1-basis-point transaction cost assumption. High confidence regimes’ superior margins provide buffer against realistic market impact, particularly for institutional-scale position sizes.
Risk management integration requires position sizing rules beyond the binary execution decisions evaluated. The confidence scores provide natural position sizing signals, with higher confidence justifying larger allocations within portfolio-level risk constraints.
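One way to read "natural position sizing signals" is as a monotone map from confidence to allocation. The sketch below is purely illustrative: the linear ramp, the function name, and the 0.5% and 2% allocation bounds are hypothetical choices, not values evaluated in the experiments.

```python
def position_fraction(
    confidence: float,
    tau: float = 0.8,
    min_fraction: float = 0.005,
    max_fraction: float = 0.02,
) -> float:
    """Fraction of capital to allocate, given classifier confidence.

    Below tau the trade is skipped; from tau to 1.0 the allocation
    rises linearly from min_fraction to max_fraction (an assumption).
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if confidence < tau:
        return 0.0
    ramp = min(1.0, (confidence - tau) / (1.0 - tau))
    return min_fraction + (max_fraction - min_fraction) * ramp


for c in (0.75, 0.80, 0.90, 1.00):
    print(c, position_fraction(c))
```

Capping the result at `max_fraction` stands in for the portfolio-level risk constraint; a real system would derive that cap from its own risk budget.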
The unified macro–microstructure approach creates operational dependencies on multiple data streams with different update frequencies and reliability characteristics. Robust implementation requires graceful degradation capabilities when partial data becomes unavailable.
7.5. Limitations and Future Work
A critical limitation is the restriction to a single evaluation period (October 2023–October 2024). This timeframe coincides with specific cryptocurrency market conditions characterized by moderate volatility (Bitcoin volatility: 35–55% annualized) and recovering liquidity following the 2022–2023 market downturn. The framework’s performance under different market regimes (sustained bear markets, bull market euphoria, high-volatility crisis periods) remains unvalidated. Cross-regime robustness testing would require multi-year evaluation spanning complete market cycles, with separate validation periods for bull (>20% quarterly gains), bear (<−20% quarterly losses), and sideways (±10%) regimes. Without such validation, the risk of regime-specific overfitting cannot be ruled out, and the reported profitability may not generalize beyond the tested conditions. Future work should prioritize regime-conditional performance analysis and adaptive threshold mechanisms that adjust to detected market states.
A significant methodological limitation is the absence of direct baseline comparisons within our experimental framework. Rigorous assessment of predictive value requires comparison against naive strategies (random walk, momentum crossover, buy-and-hold) using identical temporal splits, transaction costs, and evaluation protocols. Classical time series models provide important benchmarks for volatility forecasting and trend detection in cryptocurrency markets; their absence limits our ability to quantify the added value of neural architectures and multi-scale feature integration over established statistical approaches. While Table 11 positions our results against published benchmarks from the prior literature, these comparisons suffer from heterogeneous evaluation frameworks (different time periods, asset selections, cost assumptions). Without internal baselines, we cannot quantify the magnitude of improvement attributable to our confidence-threshold approach versus general market trends or simple heuristics. For instance, a momentum strategy with similar transaction costs might achieve comparable profitability during trending periods. The absence of this comparison represents a critical gap that future work must address through controlled baseline experiments under identical conditions.
Feature selection via mutual information scoring was performed once on the complete training set without cross-validation or bootstrap stability analysis. This single-pass approach may introduce sensitivity to training data composition, potentially selecting features that exhibit high mutual information by chance rather than genuine predictive power. Production deployment should validate feature selection stability across bootstrap resamples and monitor feature importance drift over time as market dynamics evolve.
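The bootstrap stability check suggested above can be sketched as a selection-frequency count: resample the training set with replacement, rerun the selection, and record how often each feature lands in the top k. To keep the example self-contained, a simple class-mean gap stands in for mutual information scoring; that substitution, the function names, and the toy data are assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence


def mean_gap_scores(X: Sequence[List[float]], y: Sequence[int]) -> List[float]:
    """Stand-in relevance score (not mutual information):
    |mean(feature | y=1) - mean(feature | y=0)| for each feature."""
    pos = [row for row, t in zip(X, y) if t == 1]
    neg = [row for row, t in zip(X, y) if t == 0]
    d = len(X[0])
    if not pos or not neg:
        return [0.0] * d  # a resample with one class carries no signal
    return [
        abs(sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg))
        for j in range(d)
    ]


def selection_frequency(
    score_fn: Callable[[Sequence[List[float]], Sequence[int]], List[float]],
    X: Sequence[List[float]],
    y: Sequence[int],
    k: int,
    n_boot: int = 200,
    seed: int = 0,
) -> List[float]:
    """Share of bootstrap resamples in which each feature ranks in the top k."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    counts = [0] * d
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        scores = score_fn([X[i] for i in idx], [y[i] for i in idx])
        top_k = sorted(range(d), key=lambda j: scores[j], reverse=True)[:k]
        for j in top_k:
            counts[j] += 1
    return [c / n_boot for c in counts]


data_rng = random.Random(1)
y = [i % 2 for i in range(40)]
X = [[10.0 * t + data_rng.random(), data_rng.random(), data_rng.random()] for t in y]
print(selection_frequency(mean_gap_scores, X, y, k=1))
```

Features selected in nearly every resample are stable choices; those that appear only occasionally are the ones most likely to have been picked by chance in a single pass.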
The MLP architecture, while computationally efficient and suitable for real-time deployment, may not fully capture long-range temporal dependencies in cryptocurrency price dynamics. Preliminary experiments with LSTM architectures demonstrated 3–7% accuracy improvements across multiple metrics, suggesting potential performance gains from recurrent architectures. However, LSTM adoption was deferred pending investigation of federated learning integration, where recurrent parameter aggregation across decentralized exchanges presents technical challenges not present in feed-forward architectures. Future research will systematically evaluate LSTM variants within privacy-preserving collaborative learning frameworks suitable for blockchain-native trading applications.
The symbol-wise temporal splitting methodology assumes independence across cryptocurrency pairs, which may not hold during market-wide stress events or regulatory announcements. Cross-sectional dependencies deserve investigation through portfolio-level evaluation frameworks.
Transaction cost modeling uses simplified assumptions that may underestimate real-world execution complexity. Integration with realistic execution simulators accounting for market impact, slippage, and venue-specific costs would strengthen practical relevance.
The binary classification framework excludes volatility forecasting and risk factor modeling that could enhance portfolio construction beyond directional prediction. Multi-task learning approaches incorporating volatility and correlation prediction represent natural extensions.
Future research directions include adaptive confidence-threshold mechanisms responsive to changing market conditions, integration with portfolio optimization frameworks, and extension to traditional financial assets where similar macro–microstructure relationships may exist.
The confidence-threshold mechanism exhibits several limitations during extreme market regimes. In low-volume periods (e.g., weekends, holiday trading), reduced liquidity may degrade order book quality, causing microstructure features to produce spurious high-confidence signals that do not reflect genuine directional information. Preliminary analysis of weekend trading (excluded from main results) shows 12–18% degradation in direction accuracy despite similar confidence scores, indicating that calibration quality deteriorates when market depth falls below typical levels.
During high-volatility events (e.g., regulatory announcements, exchange failures), rapid price movements may invalidate the temporal assumptions underlying our prediction horizons (10–600 min). The evaluation period (October 2023–October 2024) exhibited moderate volatility (Bitcoin annualized volatility: 35–55%) and did not include extreme stress events comparable to the March 2020 COVID crash (200%+ volatility spike), May 2021 regulatory crackdown, or November 2022 FTX collapse. Robustness testing under high-volatility regimes (>100% annualized volatility) is essential for production deployment, as confidence calibration and feature relationships may break down during market dislocations. Dedicated stress-period backtesting on historical crisis episodes represents critical future work. Confidence scores during these periods often remain elevated despite reduced prediction reliability, as models trained on normal market conditions fail to recognize regime shifts. For instance, during the May 2023 exchange liquidity crisis (not in our evaluation period), backtested strategies would have generated substantial losses despite high-confidence signals, as confidence thresholds do not incorporate volatility regime detection.
The confidence mechanism also assumes relatively stable correlations between macro momentum and microstructure dynamics. During divergence periods (e.g., when daily trends reverse while intraday order flow persists), the feature integration approach may produce overconfident but incorrect signals. Incorporating volatility-adjusted confidence thresholds or regime-aware calibration could address these limitations but would require additional model complexity beyond our current framework. Future work should investigate adaptive confidence mechanisms that adjust thresholds based on detected market regime and liquidity conditions.
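One possible form of a volatility-adjusted threshold is sketched below. It is not part of the evaluated framework: the linear interpolation rule, the parameter names, and the default cut-offs (55% as the upper end of the observed moderate regime, 100% as the stress boundary mentioned above) are illustrative assumptions only.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # cryptocurrency markets trade 24/7


def annualized_volatility(minute_prices):
    """Realized volatility from minute-level log returns, annualized."""
    rets = [math.log(b / a) for a, b in zip(minute_prices, minute_prices[1:])]
    if len(rets) < 2:
        raise ValueError("need at least three prices")
    mu = sum(rets) / len(rets)
    var = sum((r - mu) ** 2 for r in rets) / (len(rets) - 1)
    return math.sqrt(var * MINUTES_PER_YEAR)


def regime_adjusted_threshold(base_tau, vol, calm_vol=0.55, stress_vol=1.00,
                              max_tau=0.95):
    """Hypothetical rule for a regime-aware confidence threshold.

    - vol <= calm_vol:   keep base_tau unchanged
    - vol >= stress_vol: return None, meaning abstain from trading
    - in between:        raise the threshold linearly toward max_tau
    """
    max_tau = max(max_tau, base_tau)  # never lower the threshold
    if vol >= stress_vol:
        return None
    if vol <= calm_vol:
        return base_tau
    frac = (vol - calm_vol) / (stress_vol - calm_vol)
    return base_tau + frac * (max_tau - base_tau)
```

A rule of this kind adds two regime parameters that would themselves need validation on stress-period data, which is the additional model complexity noted above.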
Several practical deployment and DeFi constraints remain unaddressed:
Gas Fee Impact: On-chain execution (e.g., Ethereum mainnet) incurs gas fees ranging from 5 to 50 basis points per transaction depending on network congestion. These costs would eliminate profitability for most configurations, as our peak profit of 167 bps assumes only 1 bp execution cost. Layer-2 solutions (Arbitrum, Optimism) reduce fees to 0.5–2 bps but introduce latency and liquidity fragmentation.
Execution Latency: Block confirmation delays (12–15 s on Ethereum) create timing risk where prices may move adversely between signal generation and on-chain execution. High-frequency configurations (H = 100–200 min) would suffer disproportionate slippage from confirmation lag.
Liquidity Constraints: Order book depth limits position sizing. While our evaluation uses basis-point returns implying small positions, institutional-scale deployment would face market impact costs not captured in our 1 bp assumption. Liquidity analysis (available depth at each confidence threshold) is absent.
Algorithmic Manipulation: Decentralized markets with transparent order books are vulnerable to front-running and sandwich attacks. A successful public strategy would attract adversarial trading, potentially degrading performance through adverse selection.
Feedback Effects: If the confidence-threshold approach achieves significant adoption, the predicted price movements may partially reflect the strategy’s own execution flow, creating self-referential dynamics that invalidate historical backtests.
These considerations suggest that “blockchain-native” should be understood as a potential application pending engineering work (off-chain computation with on-chain settlement, privacy-preserving execution) rather than a validated on-chain implementation. The framework’s economic viability in true DeFi environments remains an open question requiring dedicated deployment studies.
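To give a sense of scale for the execution-latency constraint, the sketch below converts an annualized volatility into a one-standard-deviation price move over a confirmation delay. It assumes a driftless random walk, so that volatility scales with the square root of time, and ignores jumps and intraday volatility patterns; the function name is hypothetical.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600  # 24/7 market, no trading-day adjustment


def latency_sigma_bps(annual_vol, delay_seconds):
    """One-standard-deviation price move (in bps) over a confirmation delay.

    Assumption: square-root-of-time scaling of annualized volatility.
    """
    return annual_vol * math.sqrt(delay_seconds / SECONDS_PER_YEAR) * 1e4


for vol, delay in [(0.35, 12), (0.55, 15)]:
    print(f"vol={vol:.0%}, delay={delay}s -> {latency_sigma_bps(vol, delay):.2f} bps")
```

Under these assumptions, the 35–55% annualized volatility range and the 12–15 s delays quoted above correspond to one-standard-deviation moves of roughly 2 to 4 bps, which is of the same order as the 1 bp execution cost assumed in the evaluation.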
8. Conclusions
This research presents an empirical investigation of confidence-threshold mechanisms for cryptocurrency direction prediction, systematically integrating macro and microstructure features across multiple temporal scales. The framework applies established selective classification principles to decentralized financial markets, demonstrating their effectiveness for cryptocurrency trading under realistic transaction costs. The key methodological innovation lies in separating directional prediction from execution decisions through confidence-based thresholds, enabling explicit optimization of the precision–recall trade-off in cryptocurrency trading applications.
Comprehensive experiments across 11 major cryptocurrency pairs demonstrate the framework’s effectiveness under realistic trading conditions. High confidence regimes (τ = 0.8) achieve peak profits of 167.64 basis points per trade with directional accuracies of 82–95% on executed trades. Moderate confidence regimes (τ = 0.6) maintain 50–65% market coverage while generating profits of 104.52 basis points per trade. These results substantially exceed typical academic benchmarks and demonstrate economic viability under institutional-scale trading volumes.
The systematic parameter optimization reveals fundamental trade-offs between trading frequency and signal quality in cryptocurrency markets. Optimal performance occurs at intermediate prediction horizons (400–600 min) where daily momentum trends manifest through intraday order flow patterns. The confidence-threshold mechanism proves critical for economic performance, with high confidence requirements improving profits by 60.4% while reducing median coverage from 44.9% to 0.28% (≈−99%).
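The coverage–profit trade-off described above can be illustrated with a minimal selective-classification sketch. It assumes that confidence is defined as max(p, 1 − p) for a predicted upward probability p and that a fixed cost is deducted per executed trade; both are assumptions of this sketch, and the function and argument names are hypothetical.

```python
def selective_trading_stats(probs_up, realized_bps, tau, cost_bps=1.0):
    """Coverage, accuracy, and net profit when trading only above threshold tau.

    probs_up[i]:     model probability of an upward move for sample i
    realized_bps[i]: signed realized return over the horizon, in bps
    Assumption: confidence is max(p, 1 - p); trades are skipped below tau.
    """
    n_trades = 0
    n_correct = 0
    total_net = 0.0
    for p, r in zip(probs_up, realized_bps):
        confidence = max(p, 1.0 - p)
        if confidence < tau:
            continue  # abstain: prediction exists, but no trade is executed
        direction = 1 if p >= 0.5 else -1
        n_trades += 1
        n_correct += direction * r > 0
        total_net += direction * r - cost_bps
    n = len(probs_up)
    return {
        "coverage": n_trades / n if n else 0.0,
        "accuracy": n_correct / n_trades if n_trades else float("nan"),
        "net_bps_per_trade": total_net / n_trades if n_trades else float("nan"),
    }
```

Evaluating this function over a grid of τ values on a held-out set traces the curve along which coverage falls as the per-trade accuracy and profit of executed trades rise.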
Multi-scale feature integration provides superior signal representation compared to single-timeframe approaches. The unified combination of macro momentum indicators with microstructure dynamics captures temporal bridges where fundamental price discovery mechanisms align with short-term market-making activities. This integration contributes to directional accuracies that exceed published benchmarks while maintaining economic profitability under realistic transaction costs.
The research demonstrates practical viability for institutional cryptocurrency trading applications. High confidence strategies tolerate transaction costs up to 6 basis points per trade while maintaining positive returns, exceeding typical execution costs for major cryptocurrency pairs. The framework’s robust performance across different parameter configurations provides operational flexibility for live deployment across varying market conditions.
The methodological framework developed in this research provides a foundation for blockchain-integrated financial analytics applications. The confidence-threshold mechanism offers particular advantages for smart contract-based trading systems, where binary execution decisions align naturally with on-chain transaction requirements and gas optimization constraints. The systematic parameter optimization approach enables adaptive configuration for different blockchain environments and consensus mechanisms.
Future extensions of this work could integrate on-chain transaction flow analysis with off-chain market signals to develop comprehensive blockchain-native prediction systems [36,37]. The integration of federated learning approaches could enable collaborative model training across multiple DeFi protocols while preserving privacy and reducing centralized dependencies. Additionally, the confidence scoring mechanism could be enhanced with cryptographic verification techniques to ensure signal integrity in decentralized trading environments.
The results presented in this study are derived from 11 major cryptocurrency pairs with high liquidity and complete macro–microstructure data coverage. Generalization to smaller-cap cryptocurrencies, synthetic blockchain assets, or emerging DeFi tokens requires empirical validation, as lower liquidity and different market microstructure characteristics may substantially alter performance. The framework’s applicability to non-cryptocurrency blockchain-native assets (e.g., tokenized securities, NFTs) remains untested and represents an important direction for future research [38,39]. Claims regarding decentralized finance integration should be understood as potential applications requiring additional engineering work for on-chain deployment, rather than validated production-ready implementations.
Several limitations constrain the generalizability of these findings. The evaluation period coincides with specific cryptocurrency market conditions that may not persist across different regulatory environments. The 11-symbol constraint reflects microstructure data availability rather than methodological limitations. Future research should investigate scalability across broader cryptocurrency universes and longer evaluation periods.