Ergogenic Aids to Improve Physical Performance in Female Athletes: A Systematic Review with Meta-Analysis

Most intervention studies investigating the effects of ergogenic aids (EAs) on sports performance have been carried out in the male population. Thus, the aim of this systematic review and meta-analysis was to summarize the existing literature on the effects of EAs used by female athletes on performance. A literature search was conducted, and a descriptive analysis of the articles included in the systematic review was carried out. Meta-analyses could be performed on 32 of the included articles, evaluating performance in strength, sprint, and cardiovascular capacity. A random-effects model was used, and standardized mean differences (SMD) with 95% confidence intervals (CI) were reported. The results showed that caffeine helped to improve jumping performance, isometric strength values, and the number of repetitions until failure. Caffeine and sodium phosphate helped to improve sprint performance. Aerobic tests could be improved with the use of taurine, caffeine, and beta-alanine. No conclusive effects of beetroot juice, polyphenols, or creatine in improving aerobic performance were shown. In terms of anaerobic variables, both caffeine and sodium phosphate could help to improve repeated sprint ability. More studies are needed in female athletes that measure the effects of different EAs on sports performance, such as beetroot juice, beta-alanine, or sodium phosphate, as the studies to date are scarce and there are many types of EA that need to be further considered in this population, such as creatine and taurine.


Introduction
The ingestion of ergogenic aids (EAs) to enhance performance has become popular in recent years and is a strategy widely used by athletes [1]. Although the terms "EA" and "nutritional supplementation" are used interchangeably in the literature, there are nuances that differentiate them. Nutritional supplementation encompasses sources of nutrients or other substances with a physiological or nutritional effect that complement the normal diet [2]. EAs, on the other hand, include any short- and/or long-term nutritional practice that seeks to improve sports performance, to enhance training adaptations, and/or to help post-exercise recovery [3]. Therefore, all EAs are nutritional supplements, but not all supplements are EAs.
There is a plethora of sports EAs on the market that aim to achieve different indirect performance improvements, such as modulating the inflammatory response [4], reducing oxidative stress [5], improving homeostasis recovery [6], adapting signaling pathways [7], reducing fatigue [8], or improving aerobic [9] and short, high-intensity exercise performance [10]. For some of these EAs there is solid scientific evidence (e.g., caffeine (CAF), creatine (CRE), nitrate, beta-alanine (BA), and sodium bicarbonate (SB)) [11,12], while for others the evidence is lacking or is very limited and inconclusive (e.g., the amino acid N-acetylcysteine or polyphenols (PPs)) [12].
success and a reduced risk of occurrence/severity of injury when exercising and/or practicing sports (i.e., muscle strength, sprinting capacity, cardiorespiratory fitness, and all their magnitudes and components). Studies with women over 18 years old were included with no age limit. The exclusion criteria were studies that solely included measurements on parameters other than sports performance, studies which included the effects of EAs in both men and women without comparing the results by sex, and articles in which the participants used supplements without seeking ergogenic support. Conference proceedings, doctoral theses, dissertations, case studies, and other reviews and meta-analyses were excluded.
Initially, 341 items were identified. After eliminating duplicates and filtering according to the inclusion criteria, 43 articles were included in the review and 32 in the meta-analysis. The study selection flow chart is shown in Figure 1.

Data Extraction
The selection of articles and the data extraction on study source, study design, study quality, participants' sample size, participants' characteristics, ergogenic substances, and performance outcomes of the interventions were conducted by three independent reviewers (CRL, OLT, and RCE). If disagreements appeared, an additional author (VEF-E) was included, and the discrepancies were resolved through discussion. Trials were not excluded based on quality. A priori, the main outcomes analyzed were changes in the measures of physical performance triggered by the ingestion of EAs. All the evaluations were performed in duplicate, independently of each other. Disagreements in the analysis were resolved through consensus.

Risk of Bias
The risk of bias was assessed independently by two reviewers using the PEDro scale [35]. This tool provides information about the allocation and randomization process, the blinding process, the data obtained, and the statistical analysis. PEDro scores of 0–3 are rated 'poor', 4–5 'fair', 6–8 'good', and 9–10 'excellent'. Furthermore, a total PEDro score of 8/10 is considered optimal for trials assessing complex interventions (e.g., exercise).
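As an illustration of the rating bands quoted above, the mapping from a total PEDro score to its quality label can be sketched as follows. The function name `pedro_category` is hypothetical and is not part of the PEDro tool or of the review's workflow.

```python
def pedro_category(score: int) -> str:
    """Map a total PEDro score (0-10) to the quality band quoted in the text."""
    if not 0 <= score <= 10:
        raise ValueError("PEDro total score must be between 0 and 10")
    if score <= 3:
        return "poor"
    if score <= 5:
        return "fair"
    if score <= 8:
        return "good"
    return "excellent"


# Example: a trial scoring 8/10 falls in the 'good' band.
print(pedro_category(8))
```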

Statistical Analysis
To analyze the data, the studies were separated according to the physical abilities examined. All the variables that assessed the same physical capacity were analyzed together. Thus, an analysis was carried out to determine performance in strength (divided into jumping and power, isometric strength, and endurance strength), sprint, and cardiovascular capacity (aerobic and anaerobic).
Of the 43 articles selected in the review, 11 were discarded for the meta-analysis because they did not include sufficient data, or they analyzed other types of variables not related to exercise performance. Measurements were then taken from the EA intake groups and the placebo groups to compare them. The most representative variable of each physical capacity was included in the data analysis. In studies that included two EAs, each EA was compared independently with the placebo group, and both resulting variables were included.
The outcome measure used in the analysis was the standardized mean difference (SMD). A random-effects model was then fitted to the data. The restricted maximum-likelihood estimator (RML) [36] was used to calculate the amount of heterogeneity (i.e., tau²). In addition to the RML estimator, the Q-test for heterogeneity [37] and the I² statistic are reported. If any amount of heterogeneity was detected (i.e., tau² > 0, regardless of the Q-test's results), a prediction interval for the true outcomes was also given. To investigate whether studies might be outliers and/or influential in the context of the model, studentized residuals and Cook's distances were used. Studies with a studentized residual greater than the 100 × (1 − 0.05/(2 × k))th percentile of a standard normal distribution were considered to be potential outliers (i.e., using a Bonferroni correction with two-sided alpha = 0.05 for the k studies included in the meta-analysis). Studies with a Cook's distance greater than the median plus six times the interquartile range of the Cook's distances were considered influential. The rank correlation test and the regression test were applied to test for funnel plot asymmetry, using the standard error of the observed results as a predictor. All the analyses were performed with the free statistical software Jamovi (version 1.6.15) [38].
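To make the pooling step concrete, the following is a minimal sketch of a random-effects meta-analysis. It is not the authors' analysis: the paper used the restricted maximum-likelihood estimator in Jamovi, whereas this sketch substitutes the closed-form DerSimonian-Laird estimator of tau² for brevity, and it uses the Q-based (Higgins) definition of I². The function name `random_effects_dl` and the input values in the usage line are hypothetical.

```python
import math
from statistics import NormalDist


def random_effects_dl(effects, variances, level=0.95):
    """Pool effect sizes (e.g., SMDs) under a random-effects model.

    Assumption: tau^2 is estimated with the DerSimonian-Laird formula,
    not the restricted maximum-likelihood estimator used in the paper.
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two effects with matching variances")
    w = [1.0 / v for v in variances]                  # inverse-variance weights
    sum_w = sum(w)
    mu_fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)                     # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]      # random-effects weights
    mu = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    z_crit = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    return {
        "mu": mu, "se": se, "z": mu / se,
        "ci": (mu - z_crit * se, mu + z_crit * se),
        "Q": q, "df": df, "tau2": tau2, "I2": i2,
    }


# Hypothetical input: two SMDs with equal sampling variances.
result = random_effects_dl([0.0, 2.0], [0.5, 0.5])
print(result["mu"], result["tau2"], result["I2"])
```

Because the estimator differs from the one used in the paper, tau² and I² computed this way would not be expected to match the reported values exactly when heterogeneity is present.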

Results
The characteristics of the studies are presented in Table 1. The articles included dated from 2008 to 2022. Most studies were cross-sectional in nature (n = 35, 81.4%). The remaining studies incorporated interventions evaluating the longitudinal effects of EAs. Thirty-five out of the forty-three studies had a cross-over design (81.4%) and the remaining eight articles included a specific control group.
A total sample of n = 710 women was included in the qualitative analysis with a mean age of 26.5 years and a mean BMI of 22.7 kg/m².

Quantitative Analysis
The results are presented divided into the three physical capacities analyzed: strength, sprint, and cardiorespiratory performance. The variables were added independently of the EA used in order to see which EA produced the greatest increase in performance and on what specific test.

• Power and jumping performance
This capacity was analyzed in nine studies [29,39,40,43,46,51–53,71]. The SMD varied from 0.07 to 0.49, with all of the estimates being positive (100%). According to the random-effects model, the estimated average SMD was \hat{\mu} = 0.27 (95% CI: 0.02 to 0.52). Hence, the average outcome was significantly different from zero (z = 2.09, p = 0.037). There was no significant amount of heterogeneity in the true outcomes according to the Q-test (Q(8) = 1.28, p = 0.996, tau² = 0.00, I² = 0.00%). An exploration of the studentized residuals revealed that none of the studies had a value larger than ±2.77 and therefore there was no indication of outliers in the context of this model. Considering the Cook's distances, none of the studies could be considered overly influential (Figure 2). Neither the rank correlation nor the regression test indicated any funnel plot asymmetry (p = 0.477 and p = 0.735, respectively). Thus, the results showed that CAF helped to improve countermovement jump height and other jumping performance variables.
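The test statistics reported for this analysis can be sanity-checked with the standard library alone. The sketch below computes the two-sided p-value for a given z and the Bonferroni-corrected outlier cutoff described in the Statistical Analysis section, i.e., the 100 × (1 − 0.05/(2 × k))th percentile of the standard normal distribution. The function names `two_sided_p` and `outlier_cutoff` are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a z statistic."""
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))


def outlier_cutoff(k: int, alpha: float = 0.05) -> float:
    """Studentized-residual cutoff with a Bonferroni correction for k studies."""
    return STD_NORMAL.inv_cdf(1.0 - alpha / (2.0 * k))


# Power and jumping analysis: z = 2.09 with k = 9 studies.
print(round(two_sided_p(2.09), 3))   # 0.037
print(round(outlier_cutoff(9), 2))   # 2.77
```

The same cutoff function gives approximately 2.64, 2.73, 2.87, and 2.91 for k = 6, 8, 12, and 14, which are the thresholds quoted in the other analyses of this section.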

• Isometric strength
Isometric strength was analyzed in six studies [40,43,47,51,52,60]. The observed SMD ranged from −0.46 to 1.37, with most estimates being positive (83%). According to the random-effects model, the estimated average SMD was \hat{\mu} = 0.44 (95% CI: −0.08 to 0.97). Hence, the average outcome did not differ significantly from zero (z = 1.67, p = 0.094). Considering the Q-test, the true outcomes appear to be heterogeneous (Q(5) = 13.02, p = 0.023, tau² = 0.26, I² = 61.13%). A 95% prediction interval for the true outcomes ranged between −0.68 and 1.57. Thus, even though the average outcome is estimated to be positive, in some studies the true outcome may be negative. None of the variables had a value larger than ±2.64 according to an examination of the studentized residuals, and therefore there was no indication of outliers in this model. Based on the Cook's distances, none of the variables could be considered overly influential (Figure 3). Neither the rank correlation nor the regression test indicated any funnel plot asymmetry (p = 0.136 and p = 0.119, respectively). Therefore, the results showed that CAF could help to enhance time-to-task failure, grip strength, and peak torque values. No effects of BA on grip strength could be shown.
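The relationship between the confidence interval, tau², and the prediction interval reported for isometric strength can be illustrated with a short sketch. It assumes a normal-quantile prediction interval of the form μ̂ ± z × sqrt(tau² + SE²), with the standard error backed out of the reported 95% CI; the software used in the paper may apply a slightly different formula, and the function name `prediction_interval` is hypothetical.

```python
import math
from statistics import NormalDist


def prediction_interval(mu, ci_low, ci_high, tau2, level=0.95):
    """Approximate prediction interval for the true outcome of a new study.

    Assumption: normal quantiles are used, and the standard error of the
    pooled estimate is recovered from the reported confidence interval.
    """
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    se = (ci_high - ci_low) / (2.0 * z)       # SE implied by the reported CI
    half_width = z * math.sqrt(tau2 + se ** 2)
    return mu - half_width, mu + half_width


# Values reported for isometric strength: SMD 0.44, CI -0.08 to 0.97, tau^2 0.26.
low, high = prediction_interval(0.44, -0.08, 0.97, 0.26)
print(round(low, 2), round(high, 2))
```

With the rounded inputs from the text, the result agrees with the reported interval (−0.68 to 1.57) to within about 0.01, which is consistent with rounding of the published values.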

• Resistance strength
To analyze resistance strength performance, eight studies [40,44,45,47,50,51,60,72] were included in the analysis. The observed SMD ranged from −0.02 to 1.22, with most of the estimates being positive (88%). The estimated average SMD according to the random-effects model was \hat{\mu} = 0.45 (95% CI: 0.16 to 0.73). Hence, the average outcome was significantly different from zero (z = 3.05, p = 0.002). According to the Q-test, there was no significant amount of heterogeneity in the true outcomes (Q(7) = 7.80, p = 0.350, tau² = 0.02, I² = 13.89%). A 95% prediction interval for the true outcomes varied between 0.03 and 0.86. Consequently, although there may be some heterogeneity, the true outcomes of the studies are generally in the same direction as the estimated average outcome. The analysis of the studentized residuals showed that none of the studies had a value larger than ±2.73, so this model did not show outliers. None of the studies could be considered overly influential according to the Cook's distances (Figure 4). Neither the rank correlation nor the regression test indicated any funnel plot asymmetry (p = 0.275 and p = 0.100, respectively). Hence, the results indicated that CAF may help to enhance the number of repetitions until failure in lower body exercises.

Sprint
Sprint capacity was assessed in six studies [29,52,58,63,73,75], with a total of k = 8 variables included in the analysis. The observed SMD presented values between −0.62 and 0.67, with the majority of estimates being positive (88%). The estimated average SMD based on the random-effects model was \hat{\mu} = 0.21 (95% CI: −0.14 to 0.55). Therefore, the average outcome did not differ significantly from zero (z = 1.15, p = 0.248). Based on the Q-test, the true outcomes showed no significant amount of heterogeneity (Q(7) = 10.85, p = 0.145, tau² = 0.10, I² = 38.76%). A 95% prediction interval for the true outcomes ranged between −0.50 and 0.91. Thus, even though the average outcome is estimated to be positive, in some studies the true outcome may in fact be negative. The studentized residuals revealed that one variable (Ribeiro, 20 m-BA) [63] may be a potential outlier in the context of this model, with a value beyond ±2.73. In addition, based on the Cook's distances, this variable could be considered overly influential (Figure 5). The regression test indicated funnel plot asymmetry (p = 0.003) but the rank correlation test did not (p = 0.275). Therefore, the results showed that both SP and CAF help to improve sprint performance.

Cardiorespiratory Fitness
• Aerobic capacity
Aerobic capacity was analyzed in nine studies [28,54–56,59,64,68,74,77], including a total of k = 12 variables. The observed SMD varied from −0.26 to 1.09, with most estimates being positive (67%). The estimated average SMD based on the random-effects model was \hat{\mu} = 0.19 (95% CI: −0.06 to 0.43). Thus, the average outcome did not differ significantly from zero (z = 1.48, p = 0.140). The Q-test revealed no significant amount of heterogeneity in the true outcomes (Q(11) = 7.50, p = 0.757, tau² = 0.00, I² = 0.00%). The studentized residuals analysis showed that none of the variables had a value larger than ±2.87 and thus there was no indication of outliers in the context of this analysis. None of the variables could be considered overly influential according to the Cook's distances (Figure 6). Neither the rank correlation nor the regression test demonstrated any funnel plot asymmetry (p = 0.545 and p = 0.398, respectively). Thus, the results showed that TAU could enhance end power, while CAF could help to improve time trial performance, and time to exhaustion could be extended by BA consumption. No conclusive effects of BJ, PP, or CRE in improving aerobic performance were shown.

• Anaerobic capacity
This capacity was analyzed in 11 studies [10,28,29,46,53,58,64,69,70,73,76], including a total of k = 14 variables. The observed SMD ranged from −0.20 to 1.01, with most estimates being positive (64%). The estimated average SMD based on the random-effects model was \hat{\mu} = 0.22 (95% CI: 0.00 to 0.44). Thus, the average outcome was significantly different from zero (z = 2.00, p = 0.05). The Q-test revealed no significant amount of heterogeneity in the true outcomes (Q(13) = 6.59, p = 0.922, tau² = 0.00, I² = 0.00%). The studentized residuals revealed that none of the variables had a value larger than ±2.91, so this model was not affected by outliers. None of the variables could be considered overly influential based on the Cook's distances (Figure 7). Neither the rank correlation nor the regression test indicated any funnel plot asymmetry (p = 0.518 and p = 0.528, respectively). Therefore, the results showed that SP and CAF could help to improve repeated sprint performance. No conclusive results could be obtained for the other EAs.

Discussion
This systematic review with meta-analysis shows a comprehensive analysis of the efficacy of different EAs on sports performance in female athletes and recreational practitioners. Specifically, it analyzed which types of EA were the most used to improve each physical capacity and how they affected the different performance tests. The results showed that plenty of EAs are utilized, although the use of CAF predominates over the rest.
The meta-analysis carried out on strength capacity, specifically power performance, showed CAF as the most used EA (in eight of the nine articles included). The variables that manifested the biggest improvement when taking this EA were those related to jumping performance, specifically in the CMJ test, and to the velocity of upper limb resistance exercises. This result coincides with the study carried out by Grgic [14] where CAF ingestion impacted a wide array of outcomes during the CMJ test in male athletes. In addition, peak velocity in an upper limb resistance exercise seemed to be enhanced with the use of CAF. Velocity in resistance exercise has also been improved after the ingestion of CAF in other studies, such as those reported in the review of Raya-Gonzalez et al. [79].
Regarding the dose of CAF used, most studies used doses of 3 mg/kg or 6 mg/kg, which showed performance improvements in both males and females [43,80–82]. However, a lower dose of 0.9 to 2 mg/kg has also been shown to enhance mean velocity, muscular endurance, and muscular strength [83].
As for isometric strength, CAF was also the most utilized EA. This EA helped to increase isometric peak torque, which has been demonstrated in other studies in men to date [84]. As for hand grip strength, CAF seemed to enhance this variable as well, although most of the studies proving its effects have been carried out in the male population [85].
The most used EA to improve resistance strength performance was CAF, which helped to improve repetitions until failure in lower body exercises. Something similar is shown in the literature, although most studies in men focus on bench presses or upper body exercises to measure this parameter [86–88]. As for BA, no significant results can be drawn regarding the effects of this EA on strength enhancement in women. In most of the studies concerning the ingestion of BA, aerobic and anaerobic capacities are affected in terms of delaying the onset of fatigue, which allows athletes to perform longer or more intense training sessions [89]. Its effect on strength has only been measured in a few studies, most with males [90,91]. In the study of Outlaw et al. [92], muscular endurance was assessed after an eight-week protocol of resistance training supplemented with BA in a novice college female population. Muscular endurance improved, but with similar effects in the BA and placebo group, suggesting that the enhancement came from the training protocol itself rather than the BA consumption.
In terms of improving sprint performance, a heterogeneous range of EAs has been studied. Those that seem to be most effective are SP and CAF. The effects of CAF on sprint performance in female athletes are similar to the improvements found in men [82,93]. As for SP, the hypothesis behind its consumption is that increased phosphate content can enhance the rate of ATP and PCr resynthesis [94]. So far, there are no studies measuring the effects of this EA on isolated sprint improvement. The studies to date measure the effects on repeated sprinting ability [58,73,95–97].
In terms of aerobic performance, a variety of EAs were used. Due to this heterogeneity of EAs, hardly any consistent effects were shown, so no conclusive data can be drawn as to which EA would be better for use by female athletes. The use of TAU to improve end power [74], BA to reduce fatigue [64] when performing incremental exercise tests, and CAF in 20 km cycling time trials [55] stand out slightly.
Something similar occurs with anaerobic performance, where no studies show clearly significant results. The use of BJ in the study of Peeling et al. [69] showed better results. However, due to the small sample size and the high risk of bias, these data cannot be taken with confidence. For the improvement of repeated sprint ability, SP and CAF stand out slightly [29,58,73].
It is worth noting that one of the most used EAs to improve repeated sprint ability is CRE, with several published articles testing its efficacy in male athletes, mostly with positive effects [96,98–102]. However, for women there are only three articles that study its effects on anaerobic performance [65–67]. In these articles, the effects of CRE ingestion are measured with respect to a placebo in female soccer or futsal players. All three studies show improved performance in sprinting, jumping, and anaerobic power. However, in the study by Ramírez-Campillo et al., CRE consumption is integrated with a plyometric training program. Both the placebo and the CRE group significantly improved in comparison to the control group that did not perform plyometrics or take any EA. The CRE group obtained better results than the placebo group for peak jump power, squat jump performance, and mean sprint time variables, although it is difficult to assess if this was due to the CRE itself or to a greater adaptation to plyometric training.
Taking this into account, more studies measuring the effect of CRE on anaerobic capacity in female athletes are needed, leaving a gap in the scientific literature that needs to be addressed.
To our knowledge, this is the first systematic review with meta-analysis conducted on the use of EAs in female athletes, bringing together all the existing studies which analyze their impact on exercise performance. It is therefore a first step towards future interventions, showing where there is a gap in scientific knowledge.
The limitations of this study include the great heterogeneity found in terms of the type of EA used. Another limitation is the variety of exercise tests used for each physical capacity: because of the methodological differences between the studies, it has not been possible to perform a meta-analysis of the specific variables. Furthermore, the sporting ability of the athletes included in the studies differed, so the effects on performance may vary with training load and expertise level.
However, with this study we wanted to show which EAs are the most studied to date to improve strength, sprint, or cardiorespiratory capacity and in which specific tests they show better results. This information can be used by coaches and athletes to guide their choice of EA to improve exercise performance.

Conclusions
CAF is shown to be the ergogenic aid most commonly used by female athletes. It was found to be the best EA to improve strength performance, specifically jumping capacity, isometric strength values, and repetitions until failure.
Regarding sprinting and speed, CAF and SP are the EAs that show the best results. The positive effects of CRE that have been demonstrated in males cannot be proven for females due to the absence of well-conducted studies carried out in women.
As for cardiorespiratory fitness capacity, aerobic test results could be improved with the use of TAU, CAF, and BA. No conclusive effects of BJ, PP, or CRE in improving aerobic performance were shown. In terms of anaerobic variables, both CAF and SP could help to improve repeated sprint ability.
Nevertheless, more studies are needed in female athletes that measure the effects of different EAs on sports performance, such as BJ, BA, or SP, as studies to date are scarce and there are many types of EA that need to be further considered in this population, such as CRE and TAU.

Conflicts of Interest:
The authors declare no conflict of interest.