Design and Evaluation of Machine Learning-Based Investment Strategies in Equity Funds

Silva, Danillo Guimarães Cassiano da; Romão, Estaner Claro; Bargos, Fabiano Fernandes

doi:10.3390/ijfs14010016


by Danillo Guimarães Cassiano da Silva, Estaner Claro Romão and Fabiano Fernandes Bargos *

Department of Basic and Environmental Sciences, Lorena School of Engineering, University of São Paulo, Estrada Municipal do Campinho, 100, Lorena 12602-810, SP, Brazil

* Author to whom correspondence should be addressed.

Int. J. Financial Stud. 2026, 14(1), 16; https://doi.org/10.3390/ijfs14010016

Submission received: 28 October 2025 / Revised: 21 December 2025 / Accepted: 29 December 2025 / Published: 7 January 2026


Abstract

This study examines quantitative investment strategies for Brazilian equity funds, integrating traditional financial performance indicators with machine learning techniques to enhance fund selection. The main objective was to construct and validate predictive models for fund selection. The methodology involved collecting daily data from 2019 to 2025, computing a range of return and risk measures, and training models to classify returns over 1- and 3-month shifted windows. The 3-month models achieved the strongest predictive accuracy, exceeding 91%, with the Sharpe Ratio emerging as the most influential feature. A 12-month backtest (October/2024–September/2025) showed that ML-constructed portfolios delivered cumulative returns between 14.65% and 91.86%, depending on the selection criterion, substantially outperforming Brazil’s CDI risk-free benchmark (12.70%) and the Ibovespa (11.46%). These findings highlight the practical potential of ML-based fund selection, though successful implementation requires careful risk management and ongoing model validation.

Keywords: predictive analytics; multi-class classification; model accuracy; quantitative trading; random forest

1. Introduction

Machine learning (ML) has emerged as a powerful tool for detecting patterns in complex and high-dimensional datasets, often revealing relationships that traditional statistical techniques may miss. In finance, this capability can enhance prediction accuracy and provide investors with actionable insights into market behavior and emerging trends. Traditional portfolio management, however, relies heavily on historical data and standard financial metrics, which may be insufficient to capture the dynamic and non-stationary nature of financial markets. ML offers a complementary approach, enabling more adaptive evaluation and optimization of investment strategies.

This paper investigates the use of machine learning (ML) combined with conventional risk–return measures to identify promising investment opportunities in Brazilian equity funds, addressing both retail and institutional investors. The Brazilian market presents a unique challenge: in November 2025, the SELIC—the country’s benchmark interest rate—stood at 15% per year (do Brasil, 2025), one of the highest globally. As a result, any equity-based investment strategy must consistently outperform this elevated risk-free benchmark to be considered attractive, while managing the substantial additional risk required to do so. From a practical investment perspective, the relevant question is therefore whether a systematic fund-selection approach can deliver returns superior to the risk-free rate and the broad equity market.

Following the methodology introduced in (Bargos & Claro Romão, 2025), we construct a dataset from daily historical fund prices and compute multiple performance indicators over horizons ranging from 1 to 24 months. In contrast to the approach of (J. Chen et al., 2016), who develop a supervised learning model to directly forecast future fund returns, rank funds based on the predicted values, and then form portfolios, our framework reformulates the problem as a multiclass classification task. Rather than predicting continuous returns, each fund is assigned to one of three performance categories derived from the mean and standard deviation of its short-term future returns (shifted 1- and 3-month horizons). This formulation allows ML algorithms to discriminate among likely outperformers, average performers, and underperformers using patterns embedded in historical performance metrics, leading to a more interpretable and decision-oriented selection process.

We extend our previous work by integrating the Random Forest (RF) classifier into a full portfolio-construction and backtesting framework. Our objective is not to exhaustively explore portfolio-optimization techniques or dynamic rebalancing models (Ji & Lejeune, 2018; Jiang et al., 2020), nor to benchmark predictive models against alternative forecasting algorithms, but to evaluate the economic effectiveness of ML-guided fund selection under realistic investment conditions. Accordingly, the performance of ML-based portfolios is assessed relative to the CDI (risk-free rate) and the Ibovespa (broad equity market), which constitute the natural investment benchmarks faced by Brazilian investors. The results indicate that RF-based selection provides meaningful guidance in identifying attractive funds and can outperform these benchmarks over the evaluation period.

Our empirical analysis confirms the effectiveness of this approach. RF models trained to predict 3-month performance achieve accuracy, precision, and recall above 91% on out-of-sample data. Feature-importance results show that risk-adjusted indicators—especially the Sharpe Ratio—are consistently the most influential predictors. Furthermore, a 12-month backtesting simulation of portfolios composed of ML-selected funds produced a cumulative return of 31.03%, outperforming both market benchmarks and portfolios constructed from the top 20 funds ranked purely by recent returns. These findings highlight the practical value of incorporating ML into fund selection and portfolio design.

This study contributes to the literature by introducing a methodology that leverages a comprehensive set of financial performance indicators as predictive features and by demonstrating how ML can enhance portfolio design and fund selection through the identification of historical patterns in risk–return metrics.

The remainder of the manuscript is structured as follows. Section 2 reviews the relevant literature. Section 3 describes the dataset and methodological approach. Section 4 presents the empirical findings and discussion, and Section 5 summarizes the main contributions and implications for quantitative portfolio management.

2. Related Work

Modern investment analysis has historically been grounded in classical theories such as Modern Portfolio Theory (Elton et al., 2014; Markowitz, 1952), the Capital Asset Pricing Model (CAPM) (Perold, 2004; Sharpe, 1964; Treynor, 1965), and the Efficient Market Hypothesis (EMH) (Fama & Malkiel, 1970). These models established the foundations of risk management and asset pricing, but their assumptions—such as investor rationality and normally distributed returns—often diverge from empirical market behavior, which reveals anomalies and exploitable inefficiencies (Lima, 2003; Lo & MacKinlay, 1999).

The complexity and non-stationarity of financial data challenge traditional models. While ML provides powerful tools to address these issues, its direct application to raw price series often results in overfitting and noise capture rather than robust predictive signals (Breiman, 1996; Padilha & Carvalho, 2017). Consequently, recent studies emphasize integrating financial domain knowledge into predictive modeling through effective feature engineering that enhances the informational content of data.

ML has been increasingly employed to capture the nonlinear dynamics of financial markets. A substantial body of research has applied ML techniques to predict short-term market movements, typically focusing on directional forecasts for major stock indices rather than fund-level performance. A common approach applies supervised models—such as RF, Feedforward Neural Networks (FNN), and Gated Recurrent Units (GRU)—to forecast stock returns from financial indicators. Hu et al. (2018) use neural networks to forecast directional changes in the S&P 500 and Dow Jones indices. Other studies employ hybrid or advanced neural architectures, such as fuzzy–neural models for next-day DAX-30 trends (García et al., 2018). Portfolios are then formed by selecting assets predicted to perform best, with performance assessed against market benchmarks (Tsai et al., 2023). In this framework, prediction serves as the foundation for portfolio construction.

Ji et al. (2019) and Jiang et al. (2020) develop a dynamic multi-period portfolio optimization framework that incorporates machine learning forecasts into risk–aversion adjustments. Using technical indicators to predict market direction and a rolling-horizon evaluation, they show that models such as XGBoost can enhance portfolio performance relative to standard benchmarks.

Portfolio management has since evolved from qualitative decision-making to increasingly sophisticated quantitative methods. Advances in computational power and data availability have enabled strategies based on statistical and ML modeling to optimize investments, reduce biases, and improve the risk–return trade-off (Lundberg et al., 2020; Nelson, 2017). Quantitative analysis has thus become a cornerstone of modern portfolio construction.

More recent studies advance toward integrated and adaptive portfolio optimization. For example, the Price-Aware Logistic Regression (PALR) model classifies portfolio returns to dynamically adjust asset weights, enabling adaptation to changing market conditions (Baguda et al., 2025). Others combine robust optimization with Artificial Neural Networks (ANNs) to create portfolios resilient to uncertainty in the statistical parameters of asset returns (Miri et al., 2025). This line of research not only targets higher returns but also seeks robustness against model ambiguity, as in studies embedding ML-based forecasts within classical mean–variance frameworks (W. Chen et al., 2021).

A newer paradigm, Deep Reinforcement Learning (DRL), represents a shift from prediction-based selection to autonomous decision-making. Instead of forecasting returns, DRL agents learn end-to-end trading policies—sequences of buy, sell, or hold actions—designed to maximize cumulative rewards tied to portfolio value or Sharpe Ratio. Recent implementations integrate multiple data sources, such as time-series signals and sentiment extracted from financial news via Large Language Models (LLMs), to perform dynamic sector rotation (Yan et al., 2025). This evolution from predictive screening to autonomous portfolio control reflects a broader trend: ML is no longer merely a forecasting tool but a central component in the design and execution of adaptive investment strategies.

3. Methodology

The methodology consists of three stages—preprocessing, modeling, and backtesting—as illustrated in Figure 1. Data collection and computation of performance indicators were performed during preprocessing. In the modeling stage, the Random Forest (RF) algorithm was applied due to its efficiency, robustness, and interpretability. Finally, backtesting evaluated the models through projections and risk–return analysis. All analyses were implemented in Python 3.11.13 using the PyCaret 3.2.0 library.

3.1. Preprocessing

Data Description and Data Preparation

Daily price data were obtained from the CVM (Brazilian Securities and Exchange Commission), the sole public source of fund prices, which are available only at the daily frequency. We restrict the sample to funds with at least 100 shareholders to exclude exclusive or restricted vehicles and ensure that the analysis reflects publicly accessible, benchmark-oriented retail funds.

Our prior study showed that the models depend almost exclusively on short-term performance indicators, with features beyond 24 months contributing minimally (Bargos & Claro Romão, 2025). This evidence supports using a two-year window here, as extending the period would not improve predictive relevance.

Over the period January/2019–September/2025, we computed six standard performance metrics: Return, Volatility, Beta, Tracking Error, Sharpe Ratio, and Information Ratio. These indicators, summarized in Table 1, were calculated using established financial definitions (Bodie et al., 2024). For each fund, metrics were evaluated over five horizons (1, 3, 6, 12, and 24 months), yielding 30 features per fund. Monthly windows were approximated using daily data by assuming 21 trading days per month (252 per year). Performance data for Brazilian equity funds are publicly available from the CVM and financial platforms such as (Mais Retorno, 2025).
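The feature grid described above can be sketched in a few lines. This is only an illustration and not the authors' code: the names `METRICS`, `HORIZONS_MONTHS`, `feature_names`, `window_days`, and `trailing_return` are hypothetical, and the trailing return is computed as a simple price ratio, which is an assumption because the definitions in Table 1 are not reproduced here.

```python
# Illustrative sketch; all names and the simple-return definition are assumptions.
TRADING_DAYS_PER_MONTH = 21  # convention stated in the text (252 days per year)

METRICS = ["Return", "Volatility", "Beta", "TrackingError", "Sharpe", "InformationRatio"]
HORIZONS_MONTHS = [1, 3, 6, 12, 24]


def feature_names():
    """Return the 6 x 5 = 30 feature labels, e.g. 'Return_1M'."""
    return [f"{metric}_{months}M" for metric in METRICS for months in HORIZONS_MONTHS]


def window_days(months):
    """Approximate a monthly horizon by a fixed number of trading days."""
    return months * TRADING_DAYS_PER_MONTH


def trailing_return(prices, months):
    """Simple return over the last `months` months of a daily price list.

    Returns None when the price history is shorter than the window.
    """
    n = window_days(months)
    if len(prices) <= n:
        return None
    return prices[-1] / prices[-1 - n] - 1.0
```

With this convention the 24-month horizon spans 504 trading days, so a fund needs roughly two years of daily prices before all 30 features are defined.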

For benchmarking, we use the Ibovespa Index, which reflects the average performance of highly traded stocks on the São Paulo Stock Exchange, providing a gauge of the general performance of the Brazilian stock market (B3, 2024). The Selic rate, which serves as the reference interest rate in Brazil, represents the return on a risk-free asset within the Brazilian economy (do Brasil, 2025). Together, these benchmarks offer essential points of comparison for evaluating the risk-adjusted returns of the funds studied.

3.2. Modeling

3.2.1. Data Transformation and Labels Definition

The procedures employed in this paper follow the methodology first introduced in (Bargos & Claro Romão, 2025), which defined the data transformation process and the ML model setup. In short, two datasets were constructed to support the experiments: one with the 1-month return shifted forward and another with the 3-month return shifted. This transformation introduced time-lagged features, aligning historical data at month $t$ with future returns at $t + 1$.

Table 2 presents the structure of our dataset, highlighting in gray the column where the 1-month return has been shifted relative to the original 1-month return, which is used to construct the Target variable. An analogous transformation is performed to construct the 3-month shifted return. After creating Target, we remove the columns Fund ID, Date, Shifted_1M, and Return_1M, yielding one target variable and 29 performance-indicator features.

Removing the return at month t (Return_1M) ensures that only information available prior to the forecast horizon enters the model, preventing any form of look-ahead bias.
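The shift and column removal described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes one row per fund per month stored as a dictionary, and the helper names `add_shifted_return` and `drop_non_features` are hypothetical. The column names `Fund ID`, `Date`, `Shifted_1M`, and `Return_1M` follow the text.

```python
def add_shifted_return(rows, return_key="Return_1M", shifted_key="Shifted_1M"):
    """Attach next month's return to each fund-month row.

    `rows` is a list of dicts with keys 'Fund ID', 'Date' (sortable, one per
    month) and `return_key`. The last month of each fund has no future return
    and is therefore dropped.
    """
    by_fund = {}
    for row in rows:
        by_fund.setdefault(row["Fund ID"], []).append(row)

    shifted_rows = []
    for fund_rows in by_fund.values():
        ordered = sorted(fund_rows, key=lambda r: r["Date"])
        # Pair month t with month t + 1 of the same fund.
        for current, following in zip(ordered, ordered[1:]):
            new_row = dict(current)
            new_row[shifted_key] = following[return_key]
            shifted_rows.append(new_row)
    return shifted_rows


def drop_non_features(row, return_key="Return_1M", shifted_key="Shifted_1M"):
    """Remove identifiers, the shifted return and the month-t return."""
    dropped = {"Fund ID", "Date", shifted_key, return_key}
    return {key: value for key, value in row.items() if key not in dropped}
```

The shift is applied within each fund, so a future return from one fund is never attached to a row of another.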

Under this construction, the resulting dataset is a panel rather than a univariate time series. Each row corresponds to an independent fund–month observation; rows from different funds are unrelated, and even rows from the same fund are not sequentially exploited by the model because the features already summarize the relevant historical window. Consequently, shuffling rows does not introduce temporal leakage.

We also define the three categories based on the 1- and 3-month shifted returns ($R_{t+1}^{1M}$ and $R_{t+1}^{3M}$). Labels for training the classification models employed are determined based on the average return ($\bar{R}_{t+1}^{nM}$) and the standard deviation ($\sigma_{t+1}^{nM}$) of $R_{t+1}^{1M}$ and $R_{t+1}^{3M}$. To ensure balanced subsets, the labels are assigned as follows.

$$\mathrm{Class} = \begin{cases} 0 & \text{if } R_{t+1}^{nM} \leq \bar{R}_{t+1}^{nM} - 0.5\,\sigma_{t+1}^{nM} \\ 2 & \text{if } R_{t+1}^{nM} \geq \bar{R}_{t+1}^{nM} + 0.5\,\sigma_{t+1}^{nM} \\ 1 & \text{otherwise} \end{cases} \qquad (1)$$
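The labeling rule of Equation (1) can be illustrated with a short sketch. The function name `assign_classes` is hypothetical. Two assumptions are made here that the text does not settle: the mean and standard deviation are taken over whatever sample of shifted returns is passed in, and the population standard deviation is used.

```python
from statistics import mean, pstdev


def assign_classes(shifted_returns, k=0.5):
    """Label each shifted return as 0 (low), 1 (average) or 2 (high).

    Thresholds are mean - k * std and mean + k * std, with k = 0.5 as in
    Equation (1).
    """
    r_bar = mean(shifted_returns)
    sigma = pstdev(shifted_returns)  # assumption: population standard deviation
    lower, upper = r_bar - k * sigma, r_bar + k * sigma

    labels = []
    for r in shifted_returns:
        if r <= lower:
            labels.append(0)
        elif r >= upper:
            labels.append(2)
        else:
            labels.append(1)
    return labels
```

Narrowing the band around the mean (a smaller `k`) moves observations from Class 1 into the two tail classes, which is why the half-standard-deviation threshold influences class balance.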

Applying Equation (1) to the original data, the resulting dataset is fairly balanced between classes. The descriptive statistics of $R_{t+1}^{1M}$ and $R_{t+1}^{3M}$ are presented in Table 3.

3.2.2. Justification for Selecting Random Forest

We first train two models to classify the 1-month and 3-month shifted returns ($R_{t+1}^{1M}$ and $R_{t+1}^{3M}$) using data from January/2019 to September/2024. Then, following a rolling–expanding procedure, we append the next three months of observations and retrain the models. Repeating this step yields four training windows, ending in September/2024, December/2024, March/2025, and June/2025, so that the final models are trained on data from January/2019 to June/2025. In total, eight Random Forest models are trained (two horizons for each of the four windows).
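The rolling–expanding schedule above can be enumerated with a small sketch. The helper names `add_months` and `expanding_windows` are hypothetical. The dates are those given in the text, and the only assumption is that each window keeps the January/2019 start and extends the end date by three months.

```python
def add_months(year, month, n):
    """Return (year, month) shifted forward by n months."""
    index = year * 12 + (month - 1) + n
    return index // 12, index % 12 + 1


def expanding_windows(start=(2019, 1), first_end=(2024, 9), step_months=3, n_windows=4):
    """List (start, end) training windows sharing a start and growing by `step_months`."""
    return [(start, add_months(*first_end, i * step_months)) for i in range(n_windows)]


windows = expanding_windows()
horizons = ["1M", "3M"]
# One classifier per (horizon, training window) pair.
models_to_train = [(horizon, window) for window in windows for horizon in horizons]
```

Each window ends immediately before a rebalancing quarter, so the model used for a given quarter only sees data available before that quarter starts.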

This modeling choice is supported by the findings of (Bargos & Claro Romão, 2025), which compared fourteen machine-learning classifiers—including Random Forest, Extra Trees, LightGBM, Gradient Boosting, Quadratic Discriminant Analysis, Decision Tree Classifier, Logistic Regression, and k-Nearest Neighbors—for the same return-forecasting task. Their results showed that Random Forest offered the most reliable overall performance. In particular, RF provided the lowest rate of false positives when identifying funds with below-average returns (Class 0), exhibited stable precision, recall, and F1-scores across both 1- and 3-month horizons, and maintained strong predictive quality while avoiding extreme misclassifications such as confusing Class 0 with Class 2. These characteristics support the use of Random Forest as the primary classification model.

3.2.3. Machine Learning Model Setup

The datasets were divided into an 80% training set and a 20% test set. A 10-fold cross-validation procedure was employed for model validation and hyperparameter tuning. This approach aimed to reduce the variance of performance estimates and mitigate the risk of overfitting—a phenomenon in which the model fits the training data too closely, capturing not only the underlying patterns but also noise and random fluctuations.
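The split and cross-validation scheme can be sketched with the standard library alone. This is a simplified illustration, not the PyCaret internals: it uses a plain shuffled split without stratification, and the function names and the seed value are hypothetical.

```python
import random


def train_test_split_indices(n_rows, train_fraction=0.8, seed=42):
    """Shuffle row indices and split them into training and test parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = int(round(n_rows * train_fraction))
    return indices[:cut], indices[cut:]


def k_fold_indices(train_indices, k=10):
    """Yield (fit_indices, validation_indices) pairs.

    Each training row appears in exactly one validation fold.
    """
    folds = [train_indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        fit = [idx for m, fold in enumerate(folds) if m != i for idx in fold]
        yield fit, validation
```

The test indices are never passed to `k_fold_indices`, so hyperparameter tuning on the folds does not touch the held-out 20%.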

The evaluation metrics used to assess model performance included Accuracy, AUC (Area Under the Curve), F1-score, Kappa and MCC (Matthews Correlation Coefficient). These metrics were chosen to evaluate not only overall accuracy but also classification quality in multi-class contexts and under potential class imbalances. The inclusion of AUC, for example, enables the assessment of the model’s discriminative ability in multi-class settings, while MCC and Kappa provide robust measures of agreement that penalize biased classifications.
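To make the agreement measures concrete, the sketch below computes Accuracy, Cohen's Kappa, and the multiclass MCC from a confusion matrix. The function name `classification_summary` is hypothetical and the matrices in the usage note are made-up numbers, not results from this study. AUC and F1-score are omitted because they require predicted probabilities or per-class averaging choices that the text does not specify.

```python
from math import sqrt


def classification_summary(cm):
    """Accuracy, Cohen's kappa and multiclass MCC from a square confusion matrix.

    cm[i][j] counts observations of true class i predicted as class j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(k))
    true_counts = [sum(cm[i]) for i in range(k)]
    pred_counts = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    cross = sum(t * p for t, p in zip(true_counts, pred_counts))

    accuracy = correct / total
    expected = cross / total**2  # agreement expected by chance
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 0.0

    cov_tt = total**2 - sum(t * t for t in true_counts)
    cov_pp = total**2 - sum(p * p for p in pred_counts)
    mcc = (correct * total - cross) / sqrt(cov_tt * cov_pp) if cov_tt and cov_pp else 0.0
    return {"accuracy": accuracy, "kappa": kappa, "mcc": mcc}
```

For example, `classification_summary([[5, 1], [2, 4]])` gives an accuracy of 0.75 but a Kappa of 0.5, since half of the observed agreement would be expected by chance in that hypothetical matrix.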

RF models were implemented using the PyCaret classification module. Hyperparameters were tuned using a 10-fold cross-validation procedure, with the optimal configuration selected to maximize the mean F1-score across folds. Table 4 summarizes the main hyperparameters of the final models.

3.3. Backtesting Framework

3.3.1. Fund Selection Strategies

In this case study, we simulated a one-year investment scheme using three allocation strategies, with portfolio rebalancing performed every three months.

Strategies 1 (S1)—based on the 1-month ML models—and 2 (S2)—based on the 3-month ML models—are generated from the Random Forest classifiers trained in the rolling–expanding procedure described in Section 3.2.2. For each rebalancing date, we identify all funds predicted as high performers (Class 2) by the corresponding model. These funds are then ranked according to two criteria, evaluated in the immediately preceding month: (i) lower volatility and higher return, and (ii) higher return only. From each ranking, we select the top six funds and label them A–F for portfolio construction.
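The post-classification ranking step can be sketched as below. This is a hedged illustration only: the function name `top_funds` and the dictionary keys are hypothetical, and the way criterion (i) combines volatility and return is not spelled out in the text. The sketch assumes a ranking by previous-month return per unit of previous-month volatility, which is one possible reading and should not be taken as the authors' rule.

```python
def top_funds(candidates, criterion, n=6):
    """Pick `n` funds among those predicted as Class 2.

    `candidates` is a list of dicts with keys 'fund', 'predicted_class',
    'return_1m' and 'volatility_1m' (values from the preceding month).
    """
    high_performers = [c for c in candidates if c["predicted_class"] == 2]

    def by_return(c):
        return -c["return_1m"]

    def by_return_per_volatility(c):
        # Assumption: return divided by volatility stands in for
        # "lower volatility and higher return".
        return -c["return_1m"] / c["volatility_1m"]

    if criterion == "high_return":
        key = by_return
    elif criterion == "low_vol_high_return":
        key = by_return_per_volatility
    else:
        raise ValueError(f"unknown criterion: {criterion}")

    ranked = sorted(high_performers, key=key)
    return [c["fund"] for c in ranked[:n]]
```

The classifier output is the same under both criteria. Only the sort key changes, which mirrors the statement that the ranking rule is applied after classification.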

Portfolios are subsequently constructed by combining these six funds in groups of four, providing diversification while avoiding over-dilution of the ML signal, which results in 15 distinct portfolios for each strategy and allocation date (see Table 5). In addition to these 15 combinations, we also include a portfolio composed of all six selected funds.

For comparison and model evaluation, Strategy 3 (S3) was defined using 20 high-return funds—with at least 100 shareholders and that remained active during the testing period—selected solely based on their 1-month returns in September 2024. These funds were then combined in groups of four to form a third set of portfolios, totaling 4845 combinations.
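The portfolio counts quoted above follow directly from the number of four-element subsets, as this short sketch shows. The variable names are hypothetical, and the labels A–F are those used in the text.

```python
from itertools import combinations
from math import comb

selected = ["A", "B", "C", "D", "E", "F"]  # six ML-selected funds

# All four-fund portfolios for one strategy and allocation date.
four_fund_portfolios = list(combinations(selected, 4))

# The same set plus the portfolio holding all six funds (6F).
portfolios = four_fund_portfolios + [tuple(selected)]

# Number of four-fund portfolios that can be formed from the 20 funds of Strategy 3.
s3_count = comb(20, 4)
```

Here `len(four_fund_portfolios)` is 15 and `s3_count` is 4845, matching the figures in the text.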

Although many other funds registered with the CVM share similar characteristics (publicly accessible, active, and with at least 100 shareholders), their performance may differ substantially from the 20 selected. The purpose of Strategy 3, however, is not to exhaustively identify all possible alternatives but to provide a simple, naïve benchmark based solely on the most recent 1-month return. More importantly, the key reference points for evaluating the ML-based strategies are the CDI and Ibovespa benchmarks, which represent the Brazilian risk-free rate and the broad equity market, respectively.

3.3.2. Cumulative Return Computation

For each quarter, the cumulative returns were calculated as:

$$R_{p,t}^{cum} = \prod_{t_0 = 1}^{t} \left(1 + R_{p,t_0}\right) - 1 \qquad (2)$$

where $R_{p,t}^{cum}$ refers to the cumulative return for a portfolio $p$ over a period of $t$ days and $R_{p,t}$ is the daily return of the portfolio.
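Equation (2) is ordinary compounding of daily returns, which can be sketched in one function. The name `cumulative_return` is hypothetical and the returns in the usage note are made-up values.

```python
from math import prod


def cumulative_return(daily_returns):
    """Compound a sequence of daily returns into one cumulative return."""
    return prod(1.0 + r for r in daily_returns) - 1.0
```

For instance, a gain of 10% followed by a loss of 10% compounds to about -1%, not zero, which is why the product form is used instead of a sum of daily returns.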

3.3.3. Risk–Return Evaluation

To assess portfolio performance on a risk-adjusted basis, the Sharpe Ratio was employed. All portfolios were constructed under an equal-weighting scheme, where each of the $N$ assets received a weight $w_i = 1/N$.

Portfolio volatility ($\sigma_p$) was computed from the covariance matrix of asset returns ($\Sigma$) and the weight vector ($w$), according to

$$\sigma_p^2 = w^{T} \Sigma w. \qquad (3)$$
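The quadratic form in Equation (3) can be evaluated directly, as in the sketch below. The function name `portfolio_variance` is hypothetical and the covariance values in the usage note are invented for illustration.

```python
def portfolio_variance(cov, weights=None):
    """Compute w^T Sigma w for a covariance matrix given as nested lists.

    When `weights` is omitted, equal weights w_i = 1/N are used.
    """
    n = len(cov)
    if weights is None:
        weights = [1.0 / n] * n
    return sum(weights[i] * cov[i][j] * weights[j] for i in range(n) for j in range(n))
```

With two uncorrelated assets of variance 0.04 each, the equal-weighted portfolio variance is 0.02, half that of either asset. The covariance terms are what limit this diversification benefit when funds move together.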

For the risk–return plots, volatility was averaged over a 3-month rolling window ($\bar{\sigma}_{p,3M}$) and annualized using a factor of $\sqrt{252}$, where 252 corresponds to the usual number of business days. Portfolio return was measured as the cumulative return over each evaluation period ($R_{p}^{cum}$).

The Sharpe Ratio ($SHR_p$) was then calculated by dividing the portfolio’s excess cumulative return over the risk-free rate (CDI, $R_{f}^{cum}$) by its volatility, as expressed in Equation (4) and defined in Table 1.

$$SHR_p = \frac{R_{p}^{cum} - R_{f}^{cum}}{\bar{\sigma}_{p,3M}} \qquad (4)$$
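Equation (4) and the annualization step can be sketched as follows. The function names are hypothetical, the rolling-window averaging of volatility is assumed to have been done beforehand, and the numbers in the usage note are invented for illustration.

```python
from math import sqrt


def annualize_volatility(daily_volatility, periods_per_year=252):
    """Scale a daily volatility to an annual figure with the sqrt(252) factor."""
    return daily_volatility * sqrt(periods_per_year)


def sharpe_ratio(portfolio_cum_return, risk_free_cum_return, volatility):
    """Excess cumulative return over the risk-free rate per unit of volatility."""
    return (portfolio_cum_return - risk_free_cum_return) / volatility
```

As a hypothetical example, a portfolio with a 10% cumulative return, a 3% risk-free return, and 20% annualized volatility has a Sharpe Ratio of 0.35.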

3.4. Computational Resources

All computations were executed on a Dell XPS 8940 workstation (Dell Computadores do Brasil LTDA, Hortolandia, Brazil) with the following specifications: CPU: Intel(R) Core(TM) i7-10700 @ 2.90 GHz, 8 cores, 16 threads; Memory: 16 GB, DDR4-3200 MHz; GPU: GeForce RTX 3060; Storage: NVMe WDC 512 GB-Sandisk Corp; OS: Ubuntu (22.04.2-Ubuntu 6.5.0-41-generic).

4. Results and Discussion

This section presents the empirical evaluation of the proposed portfolio strategies, comparing their performance with reference portfolios and market benchmarks. The analysis focuses on cumulative returns obtained over four consecutive quarters, allowing for a consistent assessment of both short-term dynamics and overall annual performance. For each evaluation period, we report the portfolios achieving the highest and lowest returns, along with the aggregated results of a six-fund configuration. The comparison with CDI and Ibovespa benchmarks provides a baseline for contextualizing the effectiveness of the ML–based strategies relative to traditional market indicators.

4.1. Model Evaluation

Table 6 presents a consolidated summary of the Random Forest performance for the 1-month and 3-month forecasting horizons.

For the 1-month horizon, the model achieved a mean accuracy of 88.55% (std = 0.0071) and stable AUC, F1-score, Kappa, and MCC values, indicating consistent predictive performance. For the 3-month horizon, the model performed even better, achieving 91.46% accuracy (std = 0.0118), with consistently strong metrics across the board. These results confirm the robustness of the Random Forest classifier and align with previous research indicating higher predictive power for the 3-month horizon (Bargos & Claro Romão, 2025).

4.1.1. Confusion Matrix Analysis

To deepen the evaluation, we analyzed the confusion matrices of both models (Figure 2). Each matrix compares the actual and predicted classifications for the three return classes (0, 1, and 2).

The 1-month models (Figure 2a–d) exhibit consistent performance across classes. Class 0 (low-performing funds) is correctly identified in 85.8% to 88.7% of cases, while Class 2 (high-performing funds) shows slightly lower true-positive rates, ranging from 85.3% to 85.7%. As expected in ordinal classification problems, most errors occur between adjacent classes, with misclassification between Class 0 and Class 2 remaining below 3%.

For the 3-month models (Figure 2e–h), the classifiers exhibit stronger discriminatory power, with fewer inter-class misclassifications. Class 0 is correctly identified in 90.7% to 91.6% of cases. Class 2 classification also shows improved performance, with 89.1% to 90.0% of observations correctly classified, and misclassification of Class 0 as Class 2 remains below 1%. These results indicate that the 3-month models capture medium-term dynamics more effectively, yielding clearer separation between high- and average-performing funds, in agreement with Bargos and Claro Romão (2025).

4.1.2. Feature Importance Analysis

To identify which predictor variables most influenced the model’s predictions, a feature importance analysis was performed to quantify the contribution of each variable.

In Figure 3, we present the features that consistently ranked among the top 10 in importance across all predictive models. The results reveal a clear hierarchy in which risk-adjusted performance measures dominate, with the Sharpe Ratio emerging as the leading predictor, followed by short-term return metrics. Some models exhibit a sharply skewed importance distribution, where the top feature attains nearly twice the score of the next most relevant variable, underscoring its central role in short-horizon prediction. Detailed importance rankings for all models are provided in Appendix B.

4.2. Portfolio Construction and Backtesting Results

Table A1 lists the funds selected by the ML models for portfolio construction in Strategy 1 and Strategy 2 at each three-month rebalancing date, identified by their respective CNPJs (Brazil’s National Register of Legal Entities). For Strategy 3, the 20 high-return funds based on their 1-month returns in September 2024 are presented in Table A2.

4.2.1. Cumulative Return Analysis

We conducted a one-year backtesting exercise beginning in October 2024, using the real monthly returns of the funds selected by the models. Table 7 and Table 8 report the cumulative returns at the end of each of the four rebalancing intervals: October/2024–December/2024, January/2025–March/2025, April/2025–June/2025, and July/2025–September/2025. Figure A2 and Figure A3 complement these results by illustrating the evolution of the returns throughout the entire backtesting period.

As described in Section 3.3.1, the ML models identify high-performance candidates (Class 2), and the final selection of the six funds used to form the portfolios is based on two alternative criteria: (i) low volatility and high return, or (ii) high return only. Thus, the predictive model remains the same; what changes is solely the ranking rule applied after classification.

For context, our baseline benchmarks over the same period were the CDI (12.70%, risk-free rate) and the Ibovespa (11.46%, broad equity market), as shown in Table 8.

A first, straightforward investment option is to construct an equal-weighted portfolio composed of all six funds identified by the ML models (denoted as 6F in Table 7). Using the low-volatility, high-return criterion, the cumulative return was 25.03% for S1 and 47.81% for S2. When applying the high-return–only selection criterion, cumulative one-year returns range from 25.92% (S2) to 71.53% (S1). Thus, even the lowest observed return is roughly twice the CDI benchmark (1.97 times), while the highest is almost six times the CDI (5.63 times).

A second approach is to use the same six ML-selected funds to construct additional portfolios by forming combinations. We generate all possible four-fund combinations (15 in total), and Table 7 reports the best and worst outcomes within each strategy. Under the low-volatility, high-return criterion, cumulative one-year returns range from 14.65% to 52.38%. When applying the high-return-only criterion, the dispersion is substantially larger, with results spanning from 8.37% to 91.86%. Although the highest return is more than seven times the CDI, the presence of below-CDI outcomes (the 8.37% minimum) suggests that, while ML recommendations provide valuable guidance, additional constraints or diversification rules may be necessary to prevent overly concentrated or unstable portfolio configurations.
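The multiples of the CDI quoted in the two preceding paragraphs can be checked with simple arithmetic. The sketch below uses only figures reported in the text, and the dictionary labels are hypothetical shorthand.

```python
CDI = 12.70  # cumulative CDI return over the backtest, in percent

reported = {
    "6F lowest": 25.03,
    "6F highest": 71.53,
    "4-fund low-vol/high-return min": 14.65,
    "4-fund low-vol/high-return max": 52.38,
    "4-fund high-return-only min": 8.37,
    "4-fund high-return-only max": 91.86,
}

# Ratio of each reported cumulative return to the CDI, rounded to two decimals.
multiples = {name: round(value / CDI, 2) for name, value in reported.items()}

# Outcomes that fall short of the risk-free benchmark.
below_cdi = [name for name, value in reported.items() if value < CDI]
```

Only the worst high-return-only four-fund portfolio falls below the CDI in this list, which is the case that motivates the call for additional diversification rules.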

Finally, for S3—which generates 4845 portfolios as described in Section 3.3.1—the cumulative one-year returns range from −52.46% to 77.96%. Although the best-performing S3 portfolio surpasses most ML-based portfolios, the large negative outcome highlights the substantial risk of relying solely on past 1-month returns. This naive selection approach can produce huge gains but also exposes the investor to severe losses.

4.2.2. Risk–Return Performance Analysis

The risk–return trade-off is illustrated in Figure 4 and Figure 5, where cumulative return is plotted on the vertical axis and volatility on the horizontal axis.

Under the low-volatility and high-return criterion (Figure 4), S1 and S2 portfolios exhibit superior risk-adjusted performance during the first quarter, achieving the highest Sharpe ratios in the sample with a broadly similar behavior. Even though the S1 and S2 portfolios performed similarly in the first quarter, a direct comparison reveals a significant advantage for the 3-month strategy across the quarters.

During the second quarter, S1 portfolios exhibited substantially inferior performance relative to both S2 and S3. In contrast, S2 demonstrated high consistency, significantly outperforming the S3 portfolios in terms of the Sharpe ratio.

By the third quarter, while the Sharpe ratio of S1 surpassed those of the S3 portfolios, S2 maintained a marginally superior performance profile. In the final quarter, S3 portfolios display a clearer advantage, while S1 and S2 remain positioned within the intermediate risk–return region, with S2 exhibiting superior risk-adjusted returns compared to S1 portfolios.

For the high-return-only criterion (Figure 5), S1 portfolios dominate S3 through the first three quarters, combining high cumulative returns with comparatively controlled volatility. S2 portfolios show weaker risk-adjusted performance under this criterion, reflecting their more conservative predictive structure. In the final quarter, S3 outperformed both ML strategies, though S2 held a slight edge over S1. This underscores the sensitivity of return-focused selection rules to shifting market regimes. Overall, the high-return-only criterion emphasizes growth potential and favors S1 for most of the year, whereas the low-volatility and high-return criterion yields more resilient and balanced risk-adjusted outcomes, positioning S2 as the top performer throughout the majority of the analyzed period.

The comparative analysis reveals distinct performance profiles for the machine learning strategies across both criteria. Under the high-return-only framework, S1 maintains consistent dominance with superior Sharpe ratios for the majority of the period. However, S2 demonstrates greater resilience in the final quarter, slightly surpassing S1 even as both strategies trail S3. In contrast, the low-volatility and high-return criterion establishes S2 as the more robust ML strategy. After a comparable start between S1 and S2 in the first quarter, S2 significantly outpaces S1 in the second and maintains its risk-adjusted advantage through the remainder of the year. Ultimately, while S1 captures higher growth potential, S2’s superior consistency across both selection rules underscores the higher classification robustness of its predictive stage.

5. Conclusions

This study demonstrates the feasibility and practical value of ML-based investment strategies for Brazilian equity funds. The RF models effectively distinguished low-, average-, and high-performing funds, with the 3-month return classifier ($R_{t+1}^{3M}$) exhibiting superior accuracy and more consistent separation across classes.

Feature-importance analysis revealed that risk-adjusted performance indicators—most notably the Sharpe Ratio—dominate the predictive structure of the models. This suggests that the efficiency of return generation is a more informative signal than isolated measures of profitability or volatility. Conversely, the relatively low importance of systematic risk metrics such as Beta indicates that the models rely primarily on idiosyncratic fund characteristics over general market exposure.

Backtesting results further support the practical applicability of ML-guided fund selection. The choice between S1 and S2 ultimately depends on the investor’s risk profile. S1, based on the 1-month model, delivers the highest upside potential but also exhibits greater variability in outcomes. S2, based on the 3-month model, provides smoother and more stable performance, yielding more resilient risk-adjusted results across market regimes, in line with the superior classification stability observed in the predictive stage. Compared with these ML-based strategies, the naïve S3 benchmark shows higher downside risk. The low-volatility/high-return ranking rule consistently reduces return dispersion, whereas the high-return-only rule magnifies both gains and losses, underscoring the importance of post-classification ranking criteria when translating ML predictions into portfolio decisions. Importantly, every ML-constructed six-fund portfolio (6F) delivered a 12-month cumulative return at least 1.97 times that of the CDI, Brazil’s risk-free benchmark (25.03% versus 12.70% in the weakest case), demonstrating that the proposed ML framework can generate returns superior to an interest rate that is already among the highest globally.
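
The 1.97 multiple quoted above can be reproduced from the 12-month rows of Tables 7 and 8. The following minimal sketch only divides the tabulated 6F returns by the tabulated CDI return; the dictionary labels are illustrative names, not identifiers used by the authors.

```python
# 12-month cumulative returns of the six-fund (6F) portfolios, copied from Table 7,
# and of the CDI, copied from Table 8.
six_fund_returns = {
    "S1, low-volatility/high-return": 0.2503,
    "S2, low-volatility/high-return": 0.4781,
    "S1, high-return only": 0.7153,
    "S2, high-return only": 0.2592,
}
cdi_return = 0.1270

# Ratio of each 6F return to the CDI return over the same 12 months.
ratios = {name: ret / cdi_return for name, ret in six_fund_returns.items()}
weakest = min(ratios, key=ratios.get)

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x CDI")
print(f"Weakest case: {weakest} ({ratios[weakest]:.2f}x)")
```

The weakest case is the S1 portfolio under the low-volatility/high-return criterion, which is where the 1.97 figure comes from.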

While promising, ML-driven investment strategies require careful implementation, as their performance depends on data quality, robust feature selection, and resilience across market regimes. This study has several limitations. First, the backtests used gross returns and did not incorporate management fees, transaction costs, or turnover frictions; future work will extend the analysis to net returns to assess real-world implementability.

Second, the evaluation relied mainly on classical metrics such as accuracy and the Sharpe Ratio. Our previous ten-year study showed that short-term indicators carry the highest predictive weight, supporting the use of these measures; however, future research will integrate drawdown-sensitive metrics—including the Calmar, Omega, and Sortino ratios—and the Deflated Sharpe Ratio (Bailey & López de Prado, 2014) to better capture downside risk and correct for selection bias.
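
As a rough illustration of the drawdown-sensitive metrics named above, the sketch below implements common textbook forms of the Sortino, Calmar, and Omega ratios with NumPy. The function names, the zero target and threshold, the annualization factor, and the demo series are all assumptions made for illustration; the source does not specify which variants will be adopted, and the Deflated Sharpe Ratio is not sketched here.

```python
import numpy as np

def sortino_ratio(returns, target=0.0):
    """Mean excess return over the target divided by the downside deviation."""
    excess = np.asarray(returns, dtype=float) - target
    downside = np.minimum(excess, 0.0)
    downside_dev = np.sqrt(np.mean(downside ** 2))
    return np.nan if downside_dev == 0 else excess.mean() / downside_dev

def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded wealth curve (positive number)."""
    wealth = np.cumprod(1.0 + np.asarray(returns, dtype=float))
    # Include the starting capital (1.0) when tracking the running peak.
    running_peak = np.maximum.accumulate(np.concatenate(([1.0], wealth)))[1:]
    return float(np.max(1.0 - wealth / running_peak))

def calmar_ratio(returns, periods_per_year=252):
    """Annualized compounded return divided by the maximum drawdown."""
    r = np.asarray(returns, dtype=float)
    annualized = np.prod(1.0 + r) ** (periods_per_year / r.size) - 1.0
    mdd = max_drawdown(r)
    return np.nan if mdd == 0 else annualized / mdd

def omega_ratio(returns, threshold=0.0):
    """Sum of gains above the threshold divided by the sum of shortfalls below it."""
    excess = np.asarray(returns, dtype=float) - threshold
    gains = excess[excess > 0].sum()
    losses = -excess[excess < 0].sum()
    return np.nan if losses == 0 else gains / losses

# Made-up monthly returns, used only to exercise the functions.
demo = [0.10, -0.05, 0.02, -0.10, 0.08]
print("Sortino:", sortino_ratio(demo))
print("Max drawdown:", max_drawdown(demo))
print("Calmar:", calmar_ratio(demo, periods_per_year=12))
print("Omega:", omega_ratio(demo))
```

Unlike the Sharpe Ratio, all three measures penalize only losses (downside deviation, peak-to-trough decline, or shortfall below a threshold), which is why they are natural complements to the metrics already reported.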

Finally, although the models rely predominantly on short-term features, future studies will examine longer and shorter historical windows, regime-dependent behavior, and sensitivity to different market environments. Additional improvements may include incorporating richer fund-level data, refining feature engineering, applying explainable ML techniques, and implementing automated retraining pipelines.

Overall, the results highlight the potential of ML-based approaches to enhance fund selection in markets characterized by many investment options and heterogeneous fund behavior. Future research should also explore the integration of ML predictions with optimized portfolio-construction frameworks, incorporate additional risk-sensitive performance metrics, and evaluate robustness across longer periods and varying market regimes. In summary, ML-based fund-selection strategies show strong potential, but their practical effectiveness depends on prudent risk management and continuous model validation.

Author Contributions

Conceptualization, D.G.C.d.S. and F.F.B.; Methodology, D.G.C.d.S. and F.F.B.; Software, D.G.C.d.S. and F.F.B.; Validation, D.G.C.d.S. and F.F.B.; Formal analysis, D.G.C.d.S., E.C.R. and F.F.B.; Investigation, D.G.C.d.S., E.C.R. and F.F.B.; Resources, F.F.B.; Data curation, D.G.C.d.S. and F.F.B.; Writing—original draft, D.G.C.d.S. and F.F.B.; Writing—review & editing, E.C.R. and F.F.B.; Visualization, D.G.C.d.S. and F.F.B.; Supervision, F.F.B.; Project administration, F.F.B.; Funding acquisition, F.F.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the University of São Paulo through the Unified Scholarship Program to Support the Education of Undergraduate Students (PUB-USP), grant no. 2855/2024.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data sets, materials and codes generated during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Identification of Funds Selected

Table A1. Funds chosen for Strategies 1 and 2 at each quarterly rebalancing based on ML predictions, identified by their CNPJs (National Register of Legal Entities, Brazil).

Low-Volatility, High-Return Criterion
Strategy 1
	October/24	January/25	April/25	July/25
A	01.656.101/0001-88	11.628.883/0001-03	36.326.980/0001-64	34.109.625/0001-08
B	29.982.932/0001-69	34.109.803/0001-09	06.940.782/0001-25	34.028.082/0001-02
C	17.340.392/0001-30	34.109.794/0001-48	09.290.813/0001-38	13.401.224/0001-57
D	36.249.328/0001-93	18.168.479/0001-35	55.075.238/0001-78	31.874.833/0001-05
E	28.588.902/0001-00	19.436.818/0001-80	00.601.692/0001-23	34.658.753/0001-00
F	36.352.424/0001-62	34.109.625/0001-08	32.295.829/0001-55	21.347.643/0001-86
Strategy 2
	October/24	January/25	April/25	July/25
A	01.656.101/0001-88	00.906.044/0001-85	34.896.516/0001-88	29.638.303/0001-16
B	32.744.771/0001-80	34.896.516/0001-88	19.436.830/0001-94	36.318.507/0001-35
C	34.109.794/0001-48	17.447.468/0001-21	31.216.976/0001-20	17.340.392/0001-30
D	34.109.803/0001-09	19.436.830/0001-94	31.341.360/0001-80	36.017.669/0001-33
E	21.347.643/0001-86	31.216.976/0001-20	17.447.468/0001-21	17.502.937/0001-68
F	29.982.932/0001-69	31.341.360/0001-80	24.215.286/0001-90	28.588.902/0001-00
High-Return Criterion
Strategy 1
	October/24	January/25	April/25	July/25
A	10.292.302/0001-34	10.292.302/0001-34	10.292.302/0001-34	17.502.937/0001-68
B	20.147.389/0001-00	35.625.840/0001-24	00.601.692/0001-23	17.503.172/0001-80
C	19.831.126/0001-36	35.650.540/0001-03	36.326.980/0001-64	34.658.753/0001-00
D	36.017.669/0001-33	13.199.100/0001-30	10.601.479/0001-75	20.485.402/0001-30
E	27.181.765/0001-21	35.725.802/0001-43	32.041.623/0001-07	31.874.833/0001-05
F	28.588.902/0001-00	35.354.967/0001-56	32.295.829/0001-55	36.249.328/0001-93
Strategy 2
	October/24	January/25	April/25	July/25
A	10.292.302/0001-34	27.500.674/0001-01	11.209.172/0001-96	36.327.455/0001-63
B	14.632.925/0001-60	03.917.778/0001-58	03.618.010/0001-83	27.500.674/0001-01
C	30.530.779/0001-18	03.917.096/0001-45	26.648.868/0001-96	21.689.246/0001-92
D	33.824.951/0001-34	07.470.226/0001-03	12.987.743/0001-86	27.181.765/0001-21
E	35.717.740/0001-28	08.336.054/0001-34	19.727.078/0001-30	36.249.317/0001-03
F	33.913.562/0001-85	11.060.594/0001-42	10.292.302/0001-34	29.152.427/0001-97

Table A2. Funds selected for Strategy 3. The 20 funds with the highest 1-month returns in September 2024, identified by their CNPJs (National Register of Legal Entities, Brazil) registered with the CVM. Only funds with at least 100 shareholders that remained active throughout the backtesting period were included.

1	2	3	4
35.717.740/0001-28	30.530.779/0001-18	33.824.951/0001-34	68.670.512/0001-07
5	6	7	8
35.602.471/0001-54	04.882.617/0001-39	04.892.107/0001-42	36.350.655/0001-37
9	10	11	12
04.889.781/0001-78	04.895.210/0001-46	04.881.177/0001-03	04.885.820/0001-69
13	14	15	16
09.130.395/0001-11	09.296.352/0001-00	04.881.682/0001-40	07.470.234/0001-41
17	18	19	20
10.292.302/0001-34	36.318.507/0001-35	34.218.740/0001-10	27.500.674/0001-01

Appendix B. Feature Importance

Figure A1. Feature Importance for the 1-month (a–d) and 3-month (e–h) predictive models.

Appendix C. Cumulative Returns

Figure A2. Cumulative returns of the portfolios across the analyzed quarters (Low-volatility, high-return criterion). Panels (a,c,e,g) correspond to Strategy 1, while panels (b,d,f,h) correspond to Strategy 2. In both cases, Strategy 3 and the benchmark indices (CDI and Ibovespa) are included for comparison. The individual portfolios of Strategy 3 are plotted with reduced opacity to illustrate their overall distribution without emphasizing specific trajectories.

Figure A3. Cumulative returns of the portfolios across the analyzed quarters (High-return criterion). Panels (a,c,e,g) correspond to Strategy 1, while panels (b,d,f,h) correspond to Strategy 2. In both cases, Strategy 3 and the benchmark indices (CDI and Ibovespa) are included for comparison. The individual portfolios of Strategy 3 are plotted with reduced opacity to illustrate their overall distribution without emphasizing specific trajectories.

References

B3. (n.d.). Ibovespa. Available online: https://www.b3.com.br/en_us/market-data-and-indices/indices/broad-indices/ibovespa.htm (accessed on 7 November 2024).
Baguda, Y. S., AlJahdali, H. M., & Taha, A. A. (2025). Dynamic portfolio return classification using price-aware logistic regression. Mathematics, 13(11), 1885.
Bailey, D. H., & López de Prado, M. (2014). The deflated Sharpe ratio: Correcting for selection bias, backtest overfitting and non-normality. Journal of Portfolio Management, 40(5), 94–107.
Bargos, F. F., & Claro Romão, E. (2025). Enhanced forecasting of equity fund returns using machine learning. Mathematical and Computational Applications, 30(1), 9.
Bodie, Z., Kane, A., & Marcus, A. (2024). Investments (13th ed.). McGraw-Hill Education.
Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140.
Chen, J., Wu, W., & Tindall, M. L. (2016). Hedge fund return prediction and fund selection: A machine-learning approach (Tech. Rep. No. 16-4). Federal Reserve Bank of Dallas.
Chen, W., Zhang, H., Mehlawat, M. K., & Jia, L. (2021). Mean-variance portfolio optimization using machine learning-based stock price prediction. Applied Soft Computing, 100, 106943.
Banco Central do Brasil. (2025). Selic rate—Monetary policy. Available online: https://www.bcb.gov.br/en/monetarypolicy/selicrate (accessed on 1 November 2025).
Elton, E. J., Gruber, M. J., Brown, S. J., & Goetzmann, W. N. (2014). Modern portfolio theory and investment analysis (9th ed.). Wiley.
Fama, E. F., & Malkiel, B. G. (1970). Efficient capital markets: A review of theory and empirical work. The Journal of Finance, 25(2), 383–417.
García, F., Guijarro, F., Oliver, J., & Tamošiūnienė, R. (2018). Hybrid fuzzy neural network to predict price direction in the German DAX-30 index. Technological and Economic Development of Economy, 24(6), 2161–2178. Available online: https://journals.vilniustech.lt/index.php/TEDE/article/view/6394 (accessed on 7 November 2024).
Hu, H., Tang, L., Zhang, S., & Wang, H. (2018). Predicting the direction of stock markets using optimized neural networks with Google Trends. Neurocomputing, 285, 188–195. Available online: https://www.sciencedirect.com/science/article/pii/S0925231218300572 (accessed on 7 November 2024).
Ji, R., Chang, K., & Jiang, Z. (2019, July 2–5). Risk-aversion adjusted portfolio optimization with predictive modeling. 2019 22th International Conference on Information Fusion (FUSION) (pp. 1–8), Ottawa, ON, Canada.
Ji, R., & Lejeune, M. A. (2018). Risk-budgeting multi-portfolio optimization with portfolio and marginal risk constraints. Annals of Operations Research, 262(2), 547–578.
Jiang, Z., Ji, R., & Chang, K.-C. (2020). A machine learning integrated portfolio rebalance framework with risk-aversion adjustment. Journal of Risk and Financial Management, 13(7), 155.
Lima, L. A. d. O. (2003). Auge e declínio da hipótese dos mercados eficientes. Brazilian Journal of Political Economy, 23(4), 531–546.
Lo, A. W., & MacKinlay, A. C. (1999). A non-random walk down wall street. Princeton University Press.
Lundberg, S. M., Erion, G., Chen, H., DeGrave, A., Prutkin, J. M., Nair, B., Katz, R., Himmelfarb, J., Bansal, N., & Lee, S.-I. (2020). From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence, 2(1), 56–67.
Mais Retorno. (2025). Mais retorno—Informações e Análises financeiras. Available online: https://maisretorno.com/ (accessed on 15 February 2025).
Markowitz, H. (1952). Portfolio selection. The Journal of Finance, 7(1), 77–91.
Miri, S., Salavati, E., & Shamsi, M. (2025). Robust portfolio selection under model ambiguity using deep learning. International Journal of Financial Studies, 13(1), 38.
Nelson, D. M. Q. (2017). Uso de redes neurais recorrentes para previsão de séries temporais financeiras [Dissertação de Mestrado, Universidade Federal de Minas Gerais]. Available online: http://hdl.handle.net/1843/ESBF-AM2NTS (accessed on 7 November 2024).
Padilha, V. A., & Carvalho, A. C. P. L. F. (2017). Mineração de dados. Available online: https://edisciplinas.usp.br/pluginfile.php/3904960/mod_resource/content/3/mineracaodadosbiologicos-parte1.pdf (accessed on 19 June 2023).
Perold, A. F. (2004). The capital asset pricing model. Journal of Economic Perspectives, 18(3), 3–24.
Sharpe, W. F. (1964). Capital asset prices: A theory of market equilibrium under conditions of risk. Journal of Finance, 19(3), 425–442.
Treynor, J. L. (1965). How to rate management of investment funds. Harvard Business Review, 43, 63–75.
Tsai, P.-F., Gao, C.-H., & Yuan, S.-M. (2023). Stock selection using machine learning based on financial ratios. Mathematics, 11(23), 4758.
Yan, Y., Zhang, C., An, Y., & Zhang, B. (2025). A deep-reinforcement-learning-based multi-source information fusion portfolio management approach via sector rotation. Electronics, 14(5), 1036.

Figure 1. Flowchart of the methodological framework showing the three stages: preprocessing, modeling, and backtesting.

Figure 2. Confusion matrices on test data for the two classifiers predicting 1-month (a–d) and 3-month (e–h) return ($R_{t+1}^{1M}$ and $R_{t+1}^{3M}$) classes. Overall, both achieved 84.4–94.6% True Positive rates for Classes 0 and 2, and 85.8–91.1% for Class 1.

Figure 3. Top 10 most important features across all 1-month (a) and 3-month (b) predictive models.

Figure 4. Cumulative returns of the portfolios as a function of their average volatilities over the analyzed quarters. Low-volatility, high-return criterion. Figures (a,c,e,g) correspond to Strategy 1, while figures (b,d,f,h) correspond to Strategy 2. Portfolios from 20 High-Return Funds correspond to Strategy 3.

Figure 5. Cumulative returns of the portfolios as a function of their average volatilities over the analyzed quarters. High-return only criterion. Figures (a,c,e,g) correspond to Strategy 1, while figures (b,d,f,h) correspond to Strategy 2. Portfolios from 20 High-Return Funds correspond to Strategy 3.

Table 1. Featured abbreviations and mathematical expressions.

R_{i,t} = \frac{P_{i,t} - P_{i,t-1}}{P_{i,t-1}}

\sigma_{i,t} = \sqrt{\frac{1}{N-1} \sum_{t=1}^{N} (R_{i,t} - \bar{R_{i}})^{2}}

\beta_{i,t} = \frac{Cov(R_{i,t}, R_{b})}{Var(R_{b})}

{TE}_{i,t} = \sqrt{\frac{1}{N} \sum_{t=1}^{N} (R_{i,t} - R_{b})^{2}}

{SHR}_{i,t} = \frac{\bar{R_{i}} - R_{f}}{\sigma_{i,t}}

{IR}_{i,t} = \frac{\bar{R_{i}} - R_{b}}{{TE}_{i,t}}

$R_{i,t}$: return of fund i on day t;
$P_{i,t}$: price of fund i on day t;
$P_{i,t-1}$: price of fund i on the previous day, t − 1;
$\sigma_{i,t}$: standard deviation of the fund’s returns (volatility);
N: number of observations (days);
$\bar{R_{i}}$: average return of fund i in a period;
$R_{b}$: return of the benchmark (Ibovespa index) in a period;
$R_{f}$: return on a risk-free asset (SELIC) in a period;
$Cov(R_{i,t}, R_{b})$: covariance between the fund’s return and the market return (here, the Ibovespa index);
$Var(R_{b})$: variance of the market return (here, the Ibovespa index);
${TE}_{i,t}$: tracking error of fund i on day t;
${SHR}_{i,t}$: Sharpe ratio of fund i on day t;
${IR}_{i,t}$: information ratio of fund i on day t.
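
The indicators in Table 1 can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names and the demo price series are made up, the benchmark return in the information ratio is taken as the mean of the benchmark's returns over the window, and the risk-free return is assumed to be expressed at the same frequency as the fund returns.

```python
import numpy as np

def daily_returns(prices):
    """R_{i,t} = (P_{i,t} - P_{i,t-1}) / P_{i,t-1}."""
    p = np.asarray(prices, dtype=float)
    return p[1:] / p[:-1] - 1.0

def volatility(r):
    """Sample standard deviation of returns (1/(N-1) normalization)."""
    return np.std(r, ddof=1)

def beta(r, r_b):
    """Cov(R_i, R_b) / Var(R_b), both with the same (N-1) normalization."""
    return np.cov(r, r_b, ddof=1)[0, 1] / np.var(r_b, ddof=1)

def tracking_error(r, r_b):
    """Root mean squared deviation from the benchmark (1/N normalization)."""
    diff = np.asarray(r, dtype=float) - np.asarray(r_b, dtype=float)
    return np.sqrt(np.mean(diff ** 2))

def sharpe_ratio(r, r_f):
    """(mean fund return - risk-free return) / volatility."""
    return (np.mean(r) - r_f) / volatility(r)

def information_ratio(r, r_b):
    """(mean fund return - mean benchmark return) / tracking error (assumed form)."""
    return (np.mean(r) - np.mean(r_b)) / tracking_error(r, r_b)

# Made-up price series and risk-free rate, used only to exercise the functions.
fund = daily_returns([100.0, 101.0, 100.5, 102.0, 101.0])
bench = daily_returns([100.0, 100.5, 100.2, 101.0, 100.8])
print("volatility:", volatility(fund))
print("beta:", beta(fund, bench))
print("tracking error:", tracking_error(fund, bench))
print("Sharpe:", sharpe_ratio(fund, r_f=0.0004))
print("information ratio:", information_ratio(fund, bench))
```

In practice each function would be applied over the 1-month, 3-month, and longer windows to produce the per-window feature columns illustrated in Table 2.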

Table 2. Illustration of the procedure used to generate the shifted-return columns and the target variable. After creating Target, we remove Fund ID, Date, Shifted_1M and Return_1M, yielding one target column and 29 performance-indicator features for model training. The 1-month shifted return (in gray), $R_{t+1}^{1M}$, is used in Equation (1) to define the class labels. The construction of the 3-month shifted return, $R_{t+1}^{3M}$, follows the same steps (removing Return_3M afterward).

Fund ID	Date	Target	Shifted 1M	Return 1M	Return 3M	…	Volatility 1M	Volatility 3M	…	Beta 1M	Beta 3M	…	Track Error	Sharpe Ratio	Info Ratio
1	January/19	0	−0.032	0.061	−0.003	…	0.402	0.399	…	1.295	1.125	…	…	…	…
1	February/19	0	−0.071	−0.032	0.002	…	0.361	0.387	…	1.174	1.214	…	…	…	…
1	March/19	2	0.131	−0.071	−0.045	…	0.309	0.336	…	0.927	1.044	…	…	…	…
⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮
n	February/25	1	0.092	0.142	0.036	…	0.334	0.333	…	0.920	1.007	…	…	…	…
n	March/25	2	0.190	0.092	0.145	…	0.300	0.315	…	0.791	0.893	…	…	…	…
n	April/25	1	0.049	0.190	0.470	…	0.376	0.333	…	1.290	0.979	…	…	…	…
⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮	⋮
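
The shifting procedure illustrated in Table 2 can be sketched with pandas as follows. The data frame is a toy example (only fund 1's first three returns echo the table), the `Volatility_1M` values are made up, and the class boundaries `LOWER` and `UPPER` are hypothetical stand-ins for Equation (1), which is not reproduced in this section.

```python
import pandas as pd

# Toy data: two funds, four months each.
df = pd.DataFrame({
    "Fund ID": [1, 1, 1, 1, 2, 2, 2, 2],
    "Date": pd.to_datetime(["2019-01-31", "2019-02-28", "2019-03-31", "2019-04-30"] * 2),
    "Return_1M": [0.061, -0.032, -0.071, 0.131, 0.010, 0.055, -0.040, 0.020],
    "Volatility_1M": [0.402, 0.361, 0.309, 0.320, 0.250, 0.260, 0.240, 0.255],
})

# Shift within each fund so that row t carries the return realized in month t + 1.
df = df.sort_values(["Fund ID", "Date"])
df["Shifted_1M"] = df.groupby("Fund ID")["Return_1M"].shift(-1)

# The last month of each fund has no future return and cannot be labeled.
df = df.dropna(subset=["Shifted_1M"]).copy()

# Hypothetical class boundaries standing in for Equation (1).
LOWER, UPPER = -0.03, 0.04
df["Target"] = pd.cut(
    df["Shifted_1M"],
    bins=[-float("inf"), LOWER, UPPER, float("inf")],
    labels=[0, 1, 2],
).astype(int)

# Remove identifiers and the return columns named in the caption before training.
features_and_target = df.drop(columns=["Fund ID", "Date", "Shifted_1M", "Return_1M"])
print(features_and_target)
```

Grouping by fund before shifting matters: a plain `shift(-1)` on the whole column would assign the first return of fund 2 as the "future" return of the last row of fund 1.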

Table 3. Descriptive statistics for the three classes (0, 1, and 2) used across the eight ML models, comprising four models based on 1-month returns ($R_{t+1}^{1M}$) and four based on 3-month returns ($R_{t+1}^{3M}$). We use an expanding-window approach: the initial models are trained on data from January/2019 to September/2024, and every three months we extend the dataset and retrain. The fourth and final window uses data up to June/2025.

		1M				3M
		Overall	Class 0	Class 1	Class 2	Overall	Class 0	Class 1	Class 2
January/2019 to September/2024	count	38,769	10,496	16,940	11,333	38,769	10,108	17,725	10,936
	mean	0.0047	−0.0763	0.0046	0.0800	0.0173	−0.1350	0.0167	0.1589
	std	0.0708	0.0600	0.0200	0.0381	0.1266	0.0895	0.0354	0.0770
	min	−1.1928	−1.1928	−0.0307	0.0401	−1.1880	−1.1880	−0.0460	0.0806
	25%	−0.0342	−0.0863	−0.0129	0.0553	−0.0495	−0.1762	−0.0133	0.1061
	50%	0.0064	−0.0596	0.0048	0.0711	0.0187	−0.1039	0.0159	0.1387
	75%	0.0485	−0.0425	0.0216	0.0936	0.0914	−0.0696	0.0476	0.1892
	max	0.6661	−0.0307	0.0401	0.6661	1.0305	−0.0460	0.0806	1.0305
January/2019 to December/2024	count	40,480	11,248	17,256	11,976	40,480	10,812	18,219	11,449
	mean	0.0041	−0.0754	0.0040	0.0787	0.0141	−0.1352	0.0135	0.1561
	std	0.0703	0.0586	0.0200	0.0378	0.1261	0.0866	0.0355	0.0768
	min	−1.1928	−1.1928	−0.0311	0.0392	−1.1880	−1.1880	−0.0489	0.0771
	25%	−0.0354	−0.0849	−0.0135	0.0540	−0.0556	−0.1720	−0.0168	0.1032
	50%	0.0055	−0.0590	0.0042	0.0699	0.0148	−0.1052	0.0127	0.1357
	75%	0.0483	−0.0424	0.0210	0.0922	0.0887	−0.0731	0.0442	0.1861
	max	0.6661	−0.0311	0.0392	0.6661	1.0305	−0.0490	0.0771	1.0305
January/2019 to March/2025	count	42,113	11,649	17,998	12,466	42,113	11,325	18,861	11,927
	mean	0.0041	−0.0748	0.0040	0.0780	0.0143	−0.1330	0.0140	0.1546
	std	0.0697	0.0581	0.0200	0.0376	0.1249	0.0858	0.0356	0.0764
	min	−1.1928	−1.1928	−0.0308	0.0390	−1.1880	−1.1880	−0.0482	0.0768
	25%	−0.0349	−0.0844	−0.0138	0.0532	−0.0552	−0.1698	−0.0165	0.1023
	50%	0.0057	−0.0585	0.0043	0.0692	0.0153	−0.1034	0.0133	0.1342
	75%	0.0477	−0.0419	0.0211	0.0915	0.0880	−0.0719	0.0451	0.1839
	max	0.6661	−0.0308	0.0390	0.6661	1.0305	−0.0482	0.0768	1.0305
January/2019 to June/2025	count	43,699	12,160	18,590	12,949	43,699	11,744	19,489	12,466
	mean	0.0041	−0.0739	0.0042	0.0774	0.0162	−0.1307	0.0158	0.1552
	std	0.0691	0.0573	0.0197	0.0374	0.1243	0.0857	0.0353	0.0751
	min	−1.1928	−1.1928	−0.0304	0.0387	−1.1880	−1.1880	−0.0460	0.0783
	25%	−0.0349	−0.0832	−0.0131	0.0526	−0.0526	−0.1674	−0.0148	0.1039
	50%	0.0060	−0.0579	0.0046	0.0686	0.0172	−0.1014	0.0150	0.1356
	75%	0.0473	−0.0418	0.0207	0.0905	0.0901	−0.0699	0.0469	0.1834
	max	0.6661	−0.0304	0.0387	0.6661	1.0305	−0.0460	0.0783	1.0305

Table 4. Random Forest hyperparameters used in the 1-month and 3-month prediction models.

Hyperparameter	Value	Description
n_estimators	100	Number of trees in the ensemble
max_depth	None	Maximum depth of each tree (no limit)
min_samples_split	2	Minimum samples required to split a node
min_samples_leaf	1	Minimum samples required at leaf nodes
criterion	gini	Split-quality metric
bootstrap	True	Whether bootstrap samples are used
max_features	sqrt	Features considered per split
class_weight	None	No class weight adjustment
random_state	123	Ensures reproducibility
cv	10-fold	Cross-validation strategy
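
For readers who want to reproduce the configuration in Table 4, the following sketch maps the listed hyperparameters onto scikit-learn's `RandomForestClassifier` and evaluates it with 10-fold cross-validation. The source does not state in this section which library was used, so scikit-learn is an assumption, and the data are synthetic (29 features and three classes, matching the dimensions described for Table 2).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the fund dataset: 29 features and 3 classes.
X, y = make_classification(
    n_samples=600, n_features=29, n_informative=10, n_classes=3, random_state=123
)

# Hyperparameters copied from Table 4.
model = RandomForestClassifier(
    n_estimators=100,
    max_depth=None,
    min_samples_split=2,
    min_samples_leaf=1,
    criterion="gini",
    bootstrap=True,
    max_features="sqrt",
    class_weight=None,
    random_state=123,
)

# 10-fold cross-validated accuracy; for classifiers, an integer cv uses stratified folds.
scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
print(f"accuracy: {scores.mean():.4f} +/- {scores.std():.4f}")
```

The mean and standard deviation printed here correspond in form to the "mean" and "std" rows of Table 6, although the values will differ because the data are synthetic.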

Table 5. Framework for creating the 15 portfolios by combining 6 funds, taken 4 at a time.

1	2	3	4	5	6	7	8	9	10	11	12	13	14	15
A	A	A	A	A	A	A	A	A	A	B	B	B	B	C
B	B	B	B	B	B	C	C	C	D	C	C	C	D	D
C	C	C	D	D	E	D	D	E	E	D	D	E	E	E
D	E	F	E	F	F	E	F	F	F	E	F	F	F	F
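
The 15 columns of Table 5 are the combinations of six funds taken four at a time, C(6, 4) = 15. A minimal sketch with Python's standard library enumerates them; the variable names are illustrative.

```python
from itertools import combinations

# The six selected funds, labeled A to F as in Table 5.
funds = ["A", "B", "C", "D", "E", "F"]

# All four-fund subsets, generated in lexicographic order.
portfolios = ["".join(combo) for combo in combinations(funds, 4)]

for number, portfolio in enumerate(portfolios, start=1):
    print(number, portfolio)
```

The enumeration order matches the column order of Table 5 (portfolio 1 is ABCD, portfolio 15 is CDEF), and each fund appears in exactly 10 of the 15 portfolios.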

Table 6. Consolidated Random Forest performance (mean and std) from 10-fold cross-validation for the 1-month and 3-month return classes.

			Accuracy	AUC	Recall	Precision	F1	Kappa	MCC
January/2019 to September/2024	1M	mean	0.8811	0.9715	0.8811	0.8828	0.8811	0.8159	0.8168
		std	0.0050	0.0018	0.0050	0.0048	0.0049	0.0078	0.0076
	3M	mean	0.9133	0.9824	0.9133	0.9141	0.9134	0.8646	0.8650
		std	0.0039	0.0012	0.0039	0.0039	0.0039	0.0062	0.0061
January/2019 to December/2024	1M	mean	0.8797	0.9709	0.8797	0.8813	0.8798	0.8149	0.8156
		std	0.0067	0.0029	0.0067	0.0066	0.0067	0.0103	0.0103
	3M	mean	0.9119	0.9828	0.9119	0.9127	0.9120	0.8630	0.8634
		std	0.0064	0.0018	0.0064	0.0063	0.0064	0.0101	0.0100
January/2019 to March/2025	1M	mean	0.8762	0.9692	0.8762	0.8782	0.8763	0.8092	0.8102
		std	0.0086	0.0018	0.0086	0.0081	0.0085	0.0134	0.0131
	3M	mean	0.9089	0.9816	0.9089	0.9097	0.9089	0.8584	0.8588
		std	0.0041	0.0013	0.0041	0.0043	0.0041	0.0063	0.0064
January/2019 to June/2025	1M	mean	0.8737	0.9693	0.8737	0.8758	0.8738	0.8055	0.8066
		std	0.0066	0.0025	0.0066	0.0064	0.0066	0.0103	0.0101
	3M	mean	0.9086	0.9817	0.9086	0.9093	0.9087	0.8582	0.8586
		std	0.0059	0.0017	0.0059	0.0059	0.0059	0.0092	0.0092

Table 7. Cumulative returns of the proposed strategies (S1 and S2). For each quarter, the table reports the portfolios with the highest and lowest returns, as well as the aggregated results of the six-fund portfolio. The last row presents the 12-month cumulative return.

Date	Low-Volatility, High-Return Criterion						High-Return Criterion
	S1			S2			S1			S2
	Highest	Lowest	6F	Highest	Lowest	6F	Highest	Lowest	6F	Highest	Lowest	6F
December/24	0.1324	0.1023	0.1154	0.1398	0.1201	0.1272	0.1959	0.0606	0.1103	−0.0055	−0.1377	−0.0606
March/25	−0.0583	−0.1227	−0.0871	0.1020	0.0929	0.0962	0.3074	0.1766	0.2591	0.0078	−0.0687	−0.0428
June/25	0.1982	0.1132	0.1611	0.1710	0.1685	0.1696	0.3302	0.2361	0.2785	0.2775	0.1939	0.2378
September/25	0.0658	0.0537	0.0609	0.1110	0.0618	0.0851	0.0851	0.0525	0.0673	0.1484	0.0962	0.1248
12-m c.r.	0.3380	0.1465	0.2503	0.5238	0.4432	0.4781	0.9186	0.5258	0.7153	0.4282	0.0837	0.2592

Table 8. Cumulative returns of the reference portfolios (S3) and the market benchmarks (CDI and Ibovespa). For each quarter, the table reports the highest and lowest S3 returns. The last row presents the 12-month cumulative return.

	S3		Benchmarks
Date	Highest	Lowest	CDI	Ibov.
December/24	0.1087	−0.2273	0.0268	−0.0875
March/25	0.2023	−0.2027	0.0299	0.0829
June/25	0.2692	−0.1286	0.0333	0.0660
September/25	0.1993	0.0340	0.0370	0.0532
12-m c.r.	0.7796	−0.5246	0.1270	0.1146
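
A quick arithmetic check on Tables 7 and 8 suggests that the "12-m c.r." rows coincide, to rounding, with the sum of the four quarterly returns and not with their compounded product. The sketch below verifies this for the 6F portfolios and the benchmarks; it is an observation about the tabulated numbers, not a statement of the authors' procedure, and the labels are illustrative.

```python
# Quarterly returns (December/24, March/25, June/25, September/25) and the
# reported 12-month figure, copied from Tables 7 and 8.
reported = {
    "S1 6F (low-vol/high-return)": ([0.1154, -0.0871, 0.1611, 0.0609], 0.2503),
    "S2 6F (low-vol/high-return)": ([0.1272, 0.0962, 0.1696, 0.0851], 0.4781),
    "S1 6F (high-return)": ([0.1103, 0.2591, 0.2785, 0.0673], 0.7153),
    "S2 6F (high-return)": ([-0.0606, -0.0428, 0.2378, 0.1248], 0.2592),
    "CDI": ([0.0268, 0.0299, 0.0333, 0.0370], 0.1270),
    "Ibovespa": ([-0.0875, 0.0829, 0.0660, 0.0532], 0.1146),
}

results = {}
for name, (quarters, twelve_month) in reported.items():
    additive = sum(quarters)
    compounded = 1.0
    for q in quarters:
        compounded *= 1.0 + q
    compounded -= 1.0
    results[name] = (additive, compounded, twelve_month)
    print(f"{name}: sum={additive:.4f} compounded={compounded:.4f} reported={twelve_month:.4f}")
```

Readers comparing these figures with other sources should keep in mind that compounding the same quarterly returns yields somewhat different 12-month values.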

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Silva, D.G.C.d.; Romão, E.C.; Bargos, F.F. Design and Evaluation of Machine Learning-Based Investment Strategies in Equity Funds. Int. J. Financial Stud. 2026, 14, 16. https://doi.org/10.3390/ijfs14010016

