1. Introduction
The CrossFit® Open (CFO) is a multi-week, international fitness competition that serves as the preliminary qualifying stage of the CrossFit Games™ [1]. Each week, athletes are tasked with completing one or more physical tests (i.e., workouts) that uniquely challenge a combination of their strength, endurance, and sport-specific skill [2,3]. Currently, competitors are given four days to complete each test and submit their best score to the competition submission portal [4]. Performances are verified either in-person by a judge or by competition officials via video submission, and then ranked. Assigned ranks serve as points-earned (e.g., rank #1 earns 1 point, rank #10 earns 10 points), and points accumulate over each week of the competition. After the CFO concludes, the current rules identify the top 10% of competitors (i.e., the lowest scoring 10%) within each sex division, and those athletes advance to the next stage of competition [3]. Although some CFO tests may be repeated in later competitions [5], most are unique and the details of any test are not known until the week of its individual release. Athletes who aim to earn a rank within the top 10% should work on developing not only the physiological traits that might impact success [6,7,8,9,10,11,12,13], but also their strategic approach to pacing a variety of possible test designs [14,15]. Since it is impossible to know the specific details of future CFO tests [16], trainees may find benchmark workouts to be useful for monitoring progress and predicting future CFO success. In addition to several existing “named” workouts, whose details have been standardized across training facilities, each CFO test becomes a benchmark workout after its first appearance and can be incorporated into normal training. To this end, a recent article by Mangine and colleagues [17] published normative scores for men and women in each CFO test assigned between 2011 and 2021. Trainees can use these scores to estimate how their current performance might have ranked in the associated year(s) that a specific test appeared in CFO programming. An interesting finding related to the secondary aim of that study was the performance differences noted between men and women in nearly every test.
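The rank-as-points scoring described above can be illustrated with a minimal Python sketch. The athlete labels, the weekly ranks, and the helper names `total_points` and `advancing` are hypothetical, and ties and tie-break rules are ignored; this shows only the arithmetic, not the official scoring implementation.

```python
import math

def total_points(weekly_ranks):
    """Sum each athlete's weekly ranks; a lower total is better."""
    return {athlete: sum(ranks) for athlete, ranks in weekly_ranks.items()}

def advancing(weekly_ranks, fraction=0.10):
    """Return the athletes whose point totals fall in the lowest `fraction`."""
    totals = total_points(weekly_ranks)
    ordered = sorted(totals, key=totals.get)  # lowest total first
    n_advance = max(1, math.floor(len(ordered) * fraction))
    return ordered[:n_advance]

# Hypothetical ranks for ten athletes across three tests (illustration only)
ranks = {
    "A": [1, 2, 1], "B": [2, 1, 3], "C": [3, 3, 2], "D": [4, 5, 4],
    "E": [5, 4, 5], "F": [6, 6, 6], "G": [7, 8, 7], "H": [8, 7, 9],
    "I": [9, 10, 8], "J": [10, 9, 10],
}

print(total_points(ranks))
print(advancing(ranks))
```

In this invented field of ten, only one athlete falls within the lowest-scoring 10%.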
Although men and women compete in separate divisions [3], most of the time they are assigned the same list of exercises to complete in each CFO test (55 out of 60 total tests from 2011 to 2021). As with prescription for teen athletes, masters athletes, and the actual “scaled” division, prescription in the “as prescribed” (i.e., Rx) division is also scaled (i.e., modified) between men and women for one or more exercises [2]. It might be presumed that this particular scaled prescription is meant to account for natural, physiological differences between sexes [18,19] and avoid drastic differences in test difficulty. However, some exercise types or modalities have never received scaled prescription, despite being tied to physiological attributes that are relevant to sports performance and known to differ between sexes. In 91% of scaled tests, the programming component that was prescribed differently to men and women involved load assignments for weight-training exercises. Men are assigned heavier loads in an attempt to account for differences in strength capability [18]. Likewise, equating strength (or power) is a plausible reason for the scaling of non-weight-training exercises (in ~33% of tests), which is limited to the assigned medicine ball weight and target height for wall ball (WB) shots and the assigned box height for box jumps (BJ). Greater strength in men might also be inferred as the reason why gymnastics exercises are not scaled. Men are typically heavier than women [18,20], and would naturally require greater strength to maneuver their body about a pull-up bar or walk/push-up from a handstand position. In contrast, prescription for traditional cardiovascular modalities (usually rowing and jumping rope) has never differed between men and women in any CFO test [2].
CrossFit®-style workouts and CFO tests are commonly designed to encourage maximizing workout density [15]. When tests are scored by time-to-completion (TTC), they are best accomplished when the individual performs the assigned exercise repetitions as quickly as possible, efficiently transitions between exercises, and minimizes their autoregulated rest breaks. Minimizing transition time and breaks is even more important when tests ask competitors to complete ‘as many repetitions as possible’ (AMRAP) within an assigned duration, especially when there are physical limitations as to how quickly the individual exercises might be performed (e.g., the medicine ball cannot be made to drop faster from the target). The overall ability to maximize workout density within test durations lasting several minutes depends on the individual’s capacity to supply energy to exercising muscle and process deleterious metabolic byproducts [21,22], particularly when involving continuous effort movements (e.g., rowing and jumping rope). That is, CFO testing outcomes are affected by aerobic and anaerobic capacity [6,7,8,9,10,11,12], which are attributes often known to differ between men and women [19]. Thus, it was not surprising when no sex differences were reported for two CrossFit®-style (non-CFO) workouts that scaled all exercises (i.e., weight-training loads and rowing) except for one (i.e., burpees) [23]. Meanwhile, Mangine and colleagues [17] reported sex differences in 56 of the 60 CFO tests created between 2011 and 2022, with men significantly outperforming women in 41 tests (~68%). These widespread differences would suggest that the prescription was not appropriately scaled between men and women in most CFO tests. However, beyond that statement, there is little insight to be gained about sex differences in relation to scaled and unscaled workout components when the examination is limited to overall test performance. A more comprehensive understanding of the sex-based differences can only be gained after CFO tests are broken down into their individual components (i.e., each exercise, transition, and break). Currently, only a pair of small-sample (<12 participants) studies have broken down a CrossFit®-style workout [11] or CFO tests [14] into individual components (i.e., exercises, transitions, breaks), and neither made comparisons between men and women. In fact, no study has compared the pacing strategies employed by men and women for each component of any CrossFit®-style workout, nor has any study ever made such comparisons between competitors who would and would not advance beyond the CFO. Therefore, the purpose of this investigation was to examine the effect of sex and rank on pacing strategies employed in individual CFO test components. The findings of this study may provide useful insight into the factors that might explain why men and women, as well as higher- and lower-ranking competitors, score differently in CFO tests.
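To make the idea of breaking a test into components concrete, the following is a minimal Python sketch under stated assumptions. The timeline, its durations, and the function name `summarize` are invented for illustration, and workout density is taken here simply as total repetitions divided by total elapsed time; none of these values come from the study.

```python
from collections import defaultdict

# Hypothetical timeline for one competitor:
# (component, exercise label, seconds, repetitions completed)
timeline = [
    ("exercise", "row", 60.0, 20),
    ("transition", "to jump rope", 6.0, 0),
    ("exercise", "jump rope", 30.0, 50),
    ("break", "jump rope", 4.0, 0),
    ("exercise", "jump rope", 20.0, 30),
]

def summarize(timeline):
    """Total the time per component and compute repetitions per second per exercise."""
    seconds = defaultdict(float)
    work_time = defaultdict(float)
    reps = defaultdict(int)
    for component, label, duration, count in timeline:
        seconds[component] += duration
        if component == "exercise":
            work_time[label] += duration
            reps[label] += count
    rates = {label: reps[label] / work_time[label] for label in reps}
    return dict(seconds), rates

seconds, rates = summarize(timeline)

# Assumed definition: density = total repetitions / total elapsed seconds
density = sum(row[3] for row in timeline) / sum(row[2] for row in timeline)

print(seconds, rates, round(density, 3))
```

Separating the three component types in this way shows how two competitors with the same repetition rate can still differ in overall density through their transitions and breaks.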
3. Results
3.1. Overall Performance
Sex and rank differences in overall performance in each 2020 CFO test are illustrated in Figure 1. Except for test 2 (F = 3.5, p = 0.063, η2 = 0.02), significant main effects for sex were seen in absolute rank with all tests (F = 4.6–18.9, p < 0.05, η2 = 0.02–0.08). As expected, significant main effects for rank were observed in absolute rank with all tests (F = 55.8–77.8, p < 0.001, η2 = 0.27–0.33). Sex × rank interactions were seen for repetition completion rate in test 1 (F = 4.8, p = 0.030, η2 = 0.01), test 3 (F = 14.0, p < 0.001, η2 = 0.04), and test 5 (F = 14.6, p < 0.001, η2 = 0.02), including the tie-break time for test 5 (F = 45.6, p < 0.001, η2 = 0.22). For tests 2 and 4, main effects for sex (F = 31.9–128.6, p < 0.001, η2 = 0.12–0.31) and rank (F = 94.6–135.0, p < 0.001, η2 = 0.32–0.35) were noted for repetition completion rate.
3.2. Test 1 Component Pacing
Pacing measures averaged across 10 rounds of test 1, as well as their variability, are presented in Figure 2 and Table 3, respectively. Time × rank interactions were seen for G2OH repetition completion rate (F = 4.7, p = 0.033, η2 < 0.01) and G2OH breaks (F = 10.8, p = 0.001, η2 = 0.01) along with a main effect for sex for repetition completion rate (F = 22.4, p < 0.001, η2 = 0.10). No differences were seen with failed repetitions. Main effects for time (F = 69.3, p < 0.001, η2 = 0.06), sex (F = 82.6, p < 0.001, η2 = 0.24), and rank (F = 49.7, p < 0.001, η2 = 0.14) were seen for BFB repetition completion rate, while only a main effect for time with BFB breaks (F = 11.1, p = 0.001, η2 = 0.03) and a main effect for rank with BFB failed (“extra”) repetitions (F = 4.7, p = 0.032, η2 = 0.02) were noted. Main effects for time (F = 42.3–44.9, p < 0.001, η2 = 0.03–0.04), sex (F = 10.3–12.7, p < 0.002, η2 = 0.04–0.06), and rank (F = 20.0–40.9, p < 0.001, η2 = 0.10–0.18) were also noted for transitions to G2OH and BFB. No other differences were seen.
Analysis of test 1 variability revealed main effects for time in G2OH repetition rate slope (F = 32.1, p < 0.001, η2 = 0.11) and CV (F = 4.5, p = 0.036, η2 = 0.01), as well as the slope of G2OH breaks (F = 6.4, p = 0.013, η2 = 0.03). Main effects for sex (F = 7.4–7.5, p = 0.007, η2 = 0.02–0.03) and rank (F = 6.8–14.8, p < 0.010, η2 = 0.02–0.05) were also noted for the CVs of G2OH repetition rate and breaks. Time × sex interactions were seen with BFB repetition rate slope and CV (F = 6.0–14.6, p < 0.05, η2 = 0.01–0.03), and main effects for time were noted with the slope and CV of BFB breaks (F = 6.5–10.4, p < 0.05, η2 = 0.03). Finally, whereas time × rank and time × sex interactions were noted with the slopes of transitions to BFB (F = 11.3–13.9, p = 0.001, η2 = 0.04–0.05) and G2OH (F = 11.2–19.8, p = 0.001, η2 = 0.03–0.05), only main effects for time (F = 7.6, p = 0.006, η2 = 0.02), sex (F = 7.9, p = 0.006, η2 = 0.03), and rank (F = 5.1, p = 0.025, η2 = 0.02) were seen with the CV of transitions to G2OH. No other differences were observed.
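The two variability measures reported above, slope and CV, can be sketched in Python as follows. Because the methods are not part of this section, the sketch assumes that slope is the least-squares slope of a pacing measure regressed on round number and that CV is the sample standard deviation expressed as a percentage of the mean; the function names and the example rates are hypothetical.

```python
import statistics

def slope(values):
    """Least-squares slope of a pacing measure regressed on round number (1, 2, ...)."""
    n = len(values)
    rounds = range(1, n + 1)
    mean_x = sum(rounds) / n
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(rounds, values))
    sxx = sum((x - mean_x) ** 2 for x in rounds)
    return sxy / sxx

def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100

# Hypothetical repetition rates (repetitions per second) over five rounds
rates = [1.0, 0.9, 0.8, 0.7, 0.6]

print(slope(rates))
print(coefficient_of_variation(rates))
```

Under these assumptions, a negative slope indicates a pace that slows across rounds, while a larger CV indicates a less consistent pace regardless of direction.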
3.3. Test 2 Component Pacing
Pacing measures averaged across 20 min of test 2, as well as their variability, are presented in Figure 3 and Table 4, respectively. Main effects for sex (F = 61.1–286.0, p < 0.001, η2 = 0.25–0.60) and rank (F = 11.5–16.8, p < 0.001, η2 = 0.04–0.05) were seen with average DBT and DU repetition completion rates. During these two exercises, a time × sex interaction was seen with DBT breaks (F = 4.3, p = 0.040, η2 = 0.01) and a main effect for time with DU breaks (F = 32.2, p < 0.001, η2 = 0.04). With average TTB repetition rate, time × sex (F = 13.1, p < 0.001, η2 = 0.01), time × rank (F = 5.1, p = 0.025, η2 < 0.01), and sex × rank (F = 12.3, p = 0.001, η2 = 0.06) interactions were found, and only a main effect for time was seen for TTB breaks (F = 13.8, p < 0.001, η2 = 0.01). Of the three exercises, only a main effect for rank was seen in failed DU repetitions (F = 5.3, p = 0.023, η2 = 0.03); otherwise, failed repetitions were similar across competitors for DBT and TTB. Main effects for time (F = 44.2–51.3, p < 0.001, η2 = 0.04), sex (F = 20.2–21.4, p < 0.001, η2 = 0.08–0.09), and rank (F = 20.5–66.5, p < 0.001, η2 = 0.09–0.23) were noted when competitors transitioned to TTB and DBT, but main effects were limited to time (F = 97.6, p < 0.001, η2 = 0.07) and rank (F = 43.4, p < 0.001, η2 = 0.18) when transitioning to DU. No other differences were observed.
Analysis of test 2 variability revealed main effects for time (F = 7.5, p = 0.007, η2 = 0.02), sex (F = 28.0, p < 0.001, η2 = 0.08) and rank (F = 6.1, p = 0.015, η2 = 0.02) with the CV of DBT repetition rate, and a time × sex interaction (F = 4.1, p = 0.045, η2 = 0.01) for the CV of DBT breaks. A main effect for sex (F = 8.5, p = 0.004, η2 = 0.03) was seen with TTB rate slope, while main effects for time (F = 12.5, p = 0.001, η2 = 0.02), sex (F = 78.1, p < 0.001, η2 = 0.21), and rank (F = 11.4, p = 0.001, η2 = 0.03) were seen with the CV for TTB rate, along with a main effect for time with the CV for TTB breaks (F = 13.2, p < 0.001, η2 = 0.04). For DU, a sex × rank interaction (F = 4.6, p = 0.034, η2 = 0.01) was noted for the slope of DU rate, a main effect for rank (F = 4.9, p = 0.028, η2 = 0.01) with the slope of DU breaks, main effects for time (F = 7.3, p = 0.008, η2 = 0.01) and sex (F = 80.2, p < 0.001, η2 = 0.23) for the CVs of DU rate and breaks, along with a main effect for rank with the CV of DU rate (F = 21.4, p < 0.001, η2 = 0.06). Time × sex interactions were found with the slope of transitions between all three exercises (F = 4.0–12.5, p < 0.05, η2 = 0.01–0.03), along with time × rank interactions with the slope of transitions to TTB (F = 6.7, p = 0.011, η2 = 0.02) and DBT (F = 11.3, p = 0.001, η2 = 0.03). A sex × rank × time interaction was found for the CV of DU transitions (F = 10.8, p = 0.001, η2 = 0.02), and then main effects for time (F = 12.5–15.5, p = 0.001, η2 = 0.02–0.03), sex (F = 43.0–54.1, p < 0.001, η2 = 0.13–0.16), and rank (F = 25.4–36.5, p = 0.001, η2 = 0.08–0.11) for CVs of TTB and DBT transitions. No other differences were seen.
3.4. Test 3 Component Pacing
Pacing measures averaged across six rounds of test 3, as well as their variability, are presented in Figure 4 and Table 5, respectively. Main effects for time (F = 72.7–899.8, p < 0.001, η2 = 0.18–0.63), sex (F = 6.7–8.5, p < 0.01, η2 = 0.01–0.02), and rank (F = 17.0–57.2, p < 0.001, η2 = 0.05–0.07) were observed for average DL repetition rate and breaks with no differences amongst competitors with failed DL repetitions. A sex × rank interaction (F = 9.2, p = 0.003, η2 = 0.05) and main effect for time (F = 9.4, p = 0.003, η2 = 0.03) were noted with average transitions to HSPU-HSW. Then, sex × rank interactions were seen for the HSPU-HSW repetition rate (F = 4.4, p = 0.038, η2 = 0.01) and breaks (F = 7.1, p = 0.009, η2 = 0.03), along with a main effect for time for HSPU-HSW rate (F = 179.4, p < 0.001, η2 = 0.26), and time × rank (F = 12.9, p = 0.001, η2 = 0.05) and time × sex (F = 11.7, p = 0.001, η2 = 0.05) interactions for HSPU-HSW breaks. A time × rank interaction (F = 5.9, p = 0.016, η2 = 0.02) and main effect for sex (F = 7.0, p = 0.009, η2 = 0.02) were seen with HSPU-HSW failed repetitions, while main effects for sex (F = 4.5, p = 0.035, η2 = 0.03) and rank (F = 22.4, p < 0.001, η2 = 0.13) were noted with average transitions to DL. An insufficient number of competitors advanced to the fifth round of this test (top 10% men = 13, top 10% women = 7, remaining men = 2, remaining women = 0) and prevented comparisons involving transitions to DL in the last half of test 3.
Analysis of test 3 variability revealed a time × rank interaction for the CV of DL repetition rate (F = 13.0, p < 0.001, η2 = 0.04) and main effects for time with the slopes and CVs of DL rate (F = 6.4–22.8, p < 0.05, η2 = 0.02–0.14) and DL breaks (F = 21.1–37.4, p < 0.001, η2 = 0.13–0.14), and a main effect for rank was seen with the CV of DL breaks (F = 4.6, p = 0.033, η2 = 0.02). Main effects for rank were also seen with the slope and CV of transitions to HSPU-HSW (F = 4.8–5.7, p < 0.05, η2 = 0.02–0.03). Then a time × rank interaction for the CV (F = 4.6, p = 0.034, η2 = 0.02) and main sex effect for the slope (F = 8.3, p = 0.005, η2 = 0.05) of HSPU-HSW rate were noted, along with a main time effect in the CV of HSPU-HSW breaks (F = 11.4, p = 0.001, η2 = 0.05). Main effects for sex were also seen for the slope and CV (F = 9.2–14.9, p < 0.010, η2 = 0.06–0.08) along with a main rank effect with the CV (F = 28.4, p < 0.001, η2 = 0.15) of transitions to DL.
3.5. Test 4 Component Pacing
Pacing measures averaged across six rounds of test 4, as well as their variability, are presented in Figure 5 and Table 6, respectively. Analysis of averaged pacing across six rounds of test 4 revealed a sex × rank × time interaction for average BJ-SLSQ repetition completion rate (F = 5.0, p = 0.027, η2 = 0.02), a main time effect for breaks (F = 66.3, p < 0.001, η2 = 0.16), and a time × rank interaction for failed BJ-SLSQ repetitions (F = 6.3, p = 0.013, η2 = 0.02). A time × rank interaction (F = 4.7, p = 0.032, η2 = 0.01) and main sex effect (F = 7.3, p = 0.008, η2 = 0.01) were then seen with transitions to CNJ. For CNJ, a time × rank interaction (F = 20.4, p < 0.001, η2 < 0.01) and main sex effect (F = 6.0, p = 0.015, η2 < 0.01) were seen with repetition rate, while sex × time × rank (F = 7.1, p = 0.009, η2 = 0.01) and sex × time (F = 7.4, p = 0.007, η2 = 0.02) interactions were observed for breaks and failed repetitions, respectively. A sex × rank interaction (F = 4.1, p = 0.044, η2 = 0.02) was noted for transitions to BJ-SLSQ.
Analysis of test 4 variability revealed a time × rank interaction with the CV (F = 8.1, p = 0.005, η2 = 0.02) and main effects for time (F = 98.5, p < 0.001, η2 = 0.25) and sex (F = 4.5, p = 0.036, η2 = 0.01) with the slope of BJ-SLSQ rate. Time × sex (F = 4.9, p = 0.029, η2 = 0.01) and time × rank (F = 18.3, p < 0.001, η2 = 0.05) interactions were then noted for the slope of BJ-SLSQ breaks, but only a main time effect for the CV (F = 70.9, p < 0.001, η2 = 0.19). When transitioning to CNJ, a sex × rank × time interaction with the CV (F = 6.9, p = 0.009, η2 = 0.01) and a main time effect with slope (F = 27.3, p < 0.001, η2 = 0.10) were seen. A main time effect was observed for the slope of CNJ rate (F = 438.6, p < 0.001, η2 = 0.65), while time × sex (F = 17.0, p < 0.001, η2 = 0.04), time × rank (F = 80.4, p < 0.001, η2 = 0.18), and sex × rank (F = 5.6, p = 0.020, η2 = 0.01) interactions were observed with the CV. For CNJ breaks, a sex × rank × time interaction (F = 10.4, p = 0.002, η2 = 0.02) and time × rank interaction (F = 18.5, p < 0.001, η2 = 0.05) were noted for slope and CV, respectively. A main sex effect was also seen for the CV of CNJ breaks (F = 5.5, p = 0.020, η2 = 0.01). Time × rank (F = 9.2, p = 0.003, η2 = 0.02) and sex × rank (F = 8.5, p = 0.004, η2 = 0.02) interactions were found with the CV of transitions to BJ-SLSQ, along with a main time effect for slope (F = 9.8, p = 0.003, η2 = 0.07).
3.6. Test 5 Component Pacing
Pacing measures averaged throughout test 5, as well as their variability, are presented in Figure 6 and Table 7, respectively. Analysis of averaged pacing strategy revealed sex × rank interactions for the number of sets (F = 25.1, p < 0.001, η2 = 0.13) and time (F = 41.6, p < 0.001, η2 = 0.17) devoted to RMU and RMU repetition completion rate (F = 13.5, p < 0.001, η2 = 0.04). Sex × rank interactions were also noted for rowing calories completed per set (F = 5.5, p = 0.020, η2 = 0.04), rowing strokes completed per set (F = 3.9, p = 0.049, η2 = 0.03), transitions performed (F = 12.2, p = 0.001, η2 = 0.07), and total time devoted to transitions (F = 4.4, p = 0.038, η2 = 0.02). Main effects for sex were observed for the order of exercise completion (F = 13.7–92.9, p < 0.001, η2 = 0.08–0.38), the number of sets and time devoted to rowing (F = 7.1–139.3, p < 0.010, η2 = 0.05–0.47), total breaks taken (F = 53.3, p < 0.001, η2 = 0.26), total break time (F = 77.5, p < 0.001, η2 = 0.33), RMU repetitions per set (F = 155.2, p < 0.001, η2 = 0.47), rowing SPM (F = 8.1, p = 0.005, η2 = 0.05), rowing rate (F = 8.2, p = 0.005, η2 = 0.05), and failed RMU repetitions (F = 4.3, p = 0.039, η2 = 0.03). Main effects for rank were seen with the number of WBS sets (F = 4.6, p = 0.033, η2 = 0.03), total break time (F = 8.1, p = 0.005, η2 = 0.04), RMU repetitions per set (F = 22.8, p < 0.001, η2 = 0.07), and WBS repetitions per set (F = 4.7, p = 0.031, η2 = 0.03).
Analysis of test 5 variability revealed main effects for sex (F = 28.4, p < 0.001, η2 = 0.16) and rank (F = 8.3, p = 0.005, η2 = 0.05) for the CV of RMU rate, slope, and CV of RMU breaks (F = 4.5–35.1, p < 0.05, η2 = 0.02–0.24), and CV of RMU break time (F = 7.4–30.9, p < 0.010, η2 = 0.06–0.24), as well as a sex × rank interaction for the slope of RMU break time (F = 5.0, p = 0.027, η2 = 0.02). A sex × time interaction was also noted for the CV of rowing repetition rate (i.e., calories per stroke per second) (F = 9.1, p = 0.003, η2 = 0.06). Main sex effects were observed for the slopes and CVs of WBS breaks (F = 5.7–9.5, p < 0.05, η2 = 0.04–0.06) and break time (F = 7.1–8.3, p < 0.010, η2 = 0.04–0.05), and a main rank effect was seen with WBS break time slope (F = 4.2, p = 0.042, η2 = 0.03). Sex × rank interactions were found with the CV of transitions (F = 5.2, p = 0.024, η2 = 0.03), and slope and CV of transition time (F = 4.0–24.5, p < 0.05, η2 = 0.02–0.13). No other differences were seen.
4. Discussion
The purpose of this study was to examine sex and rank differences in pacing strategies employed by Rx competitors of the 2020 CFO. To observe differences, recorded efforts in each of the five tests programmed that year were collected from competitors who ranked within the top 10,000 places of the men’s and women’s divisions. The athletes were further sub-divided by whether they had earned an overall rank within the top 10% of all competitors within their respective sex-division in 2020. Comparisons were then made across sex divisions, ranks (i.e., top 10% and remaining), and test halves (except test 5) to assess differences in overall pace, repetition completion rate for individual exercises, the use of breaks, transition efficiency, failed repetitions, and how each of these varied across the duration of exercise. As expected, top 10% competitors generally outpaced remaining competitors in each test and within the top 10%, men outpaced women in three of the five tests. Interestingly, the remaining men (i.e., those who did not place inside the top 10%) completed four tests just as fast as the top 10% women, and exceeded their pace in the fifth test (test 5). Analysis of test components provided further insight into which test aspects were advantageous to competitor classifications. Men (in general) and the top 10% of competitors (men and women) were usually faster in completing repetitions in approximately 60% of all prescribed exercises, and their pace varied less in approximately 40% of exercises. The top 10% of competitors more consistently transitioned between exercises nearly 80% of the time, while taking more consistent breaks in about half of the tests. Men and the top 10% of competitors were particularly faster in transitioning during tests 1 and 2. Among the classifications, the clearest distinctions were seen with gymnastics pacing followed by pacing when performing resistance training exercises with higher relative loads. 
These data greatly expand on a previous pilot study of ten 2016 CFO competitors [14], and this is the first study to examine the effect of sex and rank on pacing strategy in discontinuous, multi-modal exercise.
The 2020 CFO featured three tests that required a high volume of gymnastics exercises to be performed [2]. The competitors in this study repeated a set of six TTB repetitions an average of 20 times within a 20-min time limit (120 total repetitions) on test 2. To finish test 3, competitors had to complete 45 HSPU repetitions and traverse 150 feet while walking on their hands, and test 5 required 40 RMUs. The present study found that the top 10% men (and men in general) more quickly transitioned to these exercises and completed repetitions at a faster rate than all other competitors. In contrast, the remaining women were slowest in these exercises or often failed to even perform or complete the assigned gymnastic work. In fact, nearly 80% of the remaining women failed to complete a single HSW repetition (i.e., walk five feet), whereas more than 70% of all other athletes in this study accomplished this in test 3. In test 5, the remaining women only averaged 10.7 RMU repetitions, while the top 10% women averaged 30.6, the remaining men averaged 39.5, and the top 10% men averaged 40 repetitions. Moreover, while men typically completed all RMU repetitions before completing any other test 5 exercise, women almost always completed them last and required approximately three more sets in total. These findings support recent observations made by Mangine and colleagues [5,17]. In the calculation of normative scores for all CFO tests from 2011 to 2021, moderate-to-large performance differences favoring men were noted for nine of the thirteen CFO tests that required high-volume gymnastics to be completed within a 10–20-min duration [17]. In a follow-up study, men more consistently outperformed women whenever a CFO test involving a high volume of gymnastics was officially repeated, in spite of athletes having an average of 2.4 years to improve their performance from the previous iteration [5]. The most obvious explanation for this is that men typically possess more upper-body strength endurance than women [18,19], and CFO gymnastics prescription has always been exactly the same for Rx competitors in both sex divisions [2]. That is, although physiological capability differences are expected, the men’s and women’s division competitors have always been prescribed the same amount of gymnastic work. One might argue that the gymnastic prescription is not the same because body mass is usually higher in men [18,20]. However, if that expected difference was sufficient to equate work, then men and women should have been able to complete a similar number of repetitions in these exercises and at a similar rate.
The same expectation might be assumed to be true for tests involving resistance training movements (tests 1–4). Unlike gymnastics, resistance training loads are customarily different for competitors in the men’s and women’s divisions [2], presumably to account for known strength differences [18]. Since these differences typically cease to exist when loads are made relative to body mass [26,27,28], a reasonable hypothesis expects men and women to be capable of completing repetitions at a similar pace when using adequately scaled loads. Nevertheless, of all the comparisons made in this study, sex differences favoring men in resistance training movements were most expected. This was because CFO loads are not prescribed relative to body mass, but rather, they are apparently based on an estimated percent difference in strength [2]. Previously, Mangine and colleagues [17] noted faster completion rates by men in 65% of CFO tests that incorporated a resistance training exercise. Men were faster particularly when the test assigned higher relative loads to women (68.3 ± 2.7% of loads assigned to men), and slower than women when the test assigned lesser relative loads (64.7 ± 4.0% of loads assigned to men). Those observations were supported by our results. In tests 1 and 2, loads assigned to women were 68–70% of those assigned to men, and regardless of rank, men outperformed women in nearly every aspect of those tests. Likewise, test 3 paired higher relative DL loads for women (65.1–68.9% of loads assigned to men) with the previously discussed gymnastics, and men more quickly transitioned to DL (first half only) and performed repetitions at a faster rate. Though it remains unclear how the difficulties women had with gymnastics affected their DL repetition completion rate, it may be surmised that the combination of the two impacted their capability of progressing through the last half of test 3, and in turn, the resultant metrics of variability (i.e., slope and CV) examined in this study. Indeed, approximately 50% of the top 10% women and remaining men completed 4–7 DL repetitions in round 5 and less than 1% of the top 10% men failed to complete a DL repetition in the final round. Conversely, the remaining women only averaged 12 of 21 repetitions in round 4, and then 94% and 100% failed to perform a single DL repetition in rounds 5 and 6, respectively. The absence of their data in later rounds would have led to a more weighted contribution from round 4 data when calculating variables over the test’s second half; thus, those variables would be representative of comparatively less work.
While a similar pattern could also be observed with CNJ repetitions across rounds 4–6 of test 4, a test that paired lower relative CNJ loads (64.5 ± 2.2% of the loads assigned to men) with partially scaled calisthenics (BJs and SLSQs), no sex differences were seen in repetition completion rate. Instead, the only relevant differences that might explain why men generally completed this test faster had to do with first-half consistency, their speed in transitions, and fewer failed repetitions over the last half of the test. The outcomes observed in the later halves of tests 3 and 4 should be viewed with caution.
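The relative loads quoted above express the women's assigned load as a percentage of the men's assigned load. A minimal Python sketch of that arithmetic follows; the load pairs (in pounds) and the function name `relative_load` are assumptions used only for illustration and are not listed in this section.

```python
def relative_load(women_load, men_load):
    """Women's assigned load as a percentage of the men's assigned load."""
    return women_load / men_load * 100

# Assumed example load pairs in pounds (women, men); illustration only
pairs = [(155, 225), (205, 315)]
percentages = [round(relative_load(w, m), 1) for w, m in pairs]

print(percentages)
```

These assumed pairs give 68.9% and 65.1%, which coincide with the endpoints of the range quoted for the DL loads in test 3.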
Given how men generally outpaced women, the differences between the top 10% women and the remaining men are interesting to explore for the purposes of answering the hypothetical question of whether either could excel in the other’s sex-division. Although no differences were seen between these two groups in overall performance during the first four tests, the remaining men outpaced the top 10% women in test 5 by approximately 11.2%. Additionally, examination of specific test components showed that these men completed TTB, HSPU, and RMU repetitions at faster rates, and were more consistent in their CNJ rate, whereas top 10% women were faster in performing HSW repetitions. No other specific differences were seen, and visual inspection of these sub-groups’ means when a main effect for sex was noted only implied an advantage for men in test 5. No other clear pattern of advantages was seen for either of these two sub-groups in tests 1–4. When contextualizing these findings, it is important to remember that the competitors selected for inclusion in this study ranked within the top 10,000 places of their respective divisions. After applying previously described criteria for Rx competitors [17], that placement threshold in men could more accurately be described as the point that approximately distinguished their top 20%. These sub-group comparisons suggest that 2020 CFO performances by the top 10% women were similar, with a few exceptions, to those of men who ranked between the top 10% and 20% of their division. This difference is consistent with previously reported scoring differences between a women’s division top 10% score in any CFO test (2011–2021) and where that score would have ranked amongst men [17]. Not counting maximal strength tests, a women’s division top 10% score for any test would, on average, place them 7.4% lower in the men’s division (or 6.6% lower for tests programmed between 2016 and 2021). Conversely, a top 10% score for men was, on average, similar to a top 2% score in women. The underlying reasons for this are still unclear, but the findings of this study implicate gymnastics and assigned resistance training loads as the most likely factors.
Having a better understanding of why men more commonly outperform women in CFO tests is probably more important for competitive team events and training program design than it is for individual CFO competitors. This is because, currently, men and women compete in separate divisions and their respective performances have no impact on the other’s rankings [3]. Conversely, the disparities seen between the top 10% and remaining competitors help to provide insight into the pacing strategies of those who ultimately advance beyond the CFO. In tests 1 and 2, the top 10% of competitors uniformly outperformed the remaining competitors within their respective sex-division in nearly every facet of these two tests. They consistently completed exercise repetitions at a faster rate over the entire workout, committed fewer failed (or extra) repetitions, took shorter breaks, and transitioned more quickly between exercises. Higher ranking competitors were more consistent in completing HSPU/HSW (test 3, men only) and RMU repetitions (test 5, women only) at faster rates. These observations support the idea that performances in CrossFit®-style workouts are determined by one’s capability of maintaining a faster and more consistent pace for the duration of exercise [14,15]. Though multiple areas of fitness have been found to predict performance [6,7,8,9,10,11,12,13], it would seem that for the 2020 CFO, skill and stamina in these particular aspects best distinguished performance between the top 10% and remaining competitors.
There were, however, instances when top 10% competitors moved faster but were less consistent. The top 10% women completed TTB repetitions faster than the remaining women, but the remaining competitors (men and women) generally kept a more consistent pace. Higher-ranking competitors were also faster but less consistent in DL repetition rates and transitions to both DL and HSPU/HSW (in men only) during test 3, as well as in CNJ repetition rates and transitions to CNJ and to SLSQ in test 4. Additionally, shorter test 5 breaks were seen in top 10% competitors but with inconsistencies amongst individual exercises. RMU breaks and break time were more variable while WBS breaks became progressively shorter in top 10% competitors. One explanation for each of these outcomes may be related to the unequal amount of work completed amongst participants. Scoring well in CFO tests is usually dependent on the number of repetitions completed within a given time limit, or completing all assigned work more quickly and/or before time runs out [
2,
3], and this inevitably leads to more work being completed by higher-ranking competitors. For instance, within the 20-min time limit of test 2, the top 10% women completed approximately four extra rounds (~22 more TTB repetitions) compared to remaining women, and those additional repetitions would have factored into calculated averages, slopes, and CVs. The same can be said about the large percentage of remaining women who failed to reach rounds five and six of tests 3 and 4. Likewise, only two-thirds of remaining women completed more than two RMU repetitions. With fewer existing data points, calculated averages would have been more heavily weighted towards repetitions completed when athletes were less fatigued, while calculated measures of variability would have been disproportionately low compared to those calculated for the top 10% of competitors. Although this explanation suggests that some of the observed variability differences should be viewed with skepticism, it is possible that competitors who are capable of averaging a significantly faster pace over the course of a test can afford to be less consistent. Indeed, not counting the extremely low CVs seen in remaining women during the last halves of tests 3 and 4, higher-ranking competitors were found to be 5–30% faster than remaining competitors but fractionally less consistent across all these instances. Based on this, one might hypothesize that there is a limit to how much speed should be sacrificed for the sake of a more consistent pace.
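The effect of unequal work on these summary measures can be shown with a short sketch. It assumes that pace is summarized by the mean, a least-squares slope across rounds, and the coefficient of variation (CV = SD / mean × 100); the function `pace_summary` and the repetition rates are hypothetical and do not reproduce the study's data or its exact calculations.

```python
from statistics import mean, stdev


def pace_summary(rates):
    """Summarize per-round repetition rates (e.g., repetitions per second).

    Returns the mean rate, the least-squares slope across rounds, and the
    coefficient of variation expressed as a percentage.
    """
    n = len(rates)
    if n < 2:
        raise ValueError("at least two rounds are needed")
    avg = mean(rates)
    cv = 100.0 * stdev(rates) / avg
    rounds = range(1, n + 1)
    round_avg = mean(rounds)
    # Ordinary least-squares slope of rate regressed on round number.
    slope = (
        sum((x - round_avg) * (y - avg) for x, y in zip(rounds, rates))
        / sum((x - round_avg) ** 2 for x in rounds)
    )
    return {"mean": avg, "slope": slope, "cv": cv}


# Hypothetical rates for an athlete who slows over six rounds.
six_rounds = [0.50, 0.48, 0.45, 0.40, 0.34, 0.28]
# The same fatigue profile, but the effort ends after three rounds.
three_rounds = six_rounds[:3]

full = pace_summary(six_rounds)
truncated = pace_summary(three_rounds)
print(full)
print(truncated)
```

In this invented example, the truncated effort yields a higher mean rate and a much lower CV than the six-round effort, even though both follow the same fatigue profile. This mirrors the concern raised above about comparing variability between groups that completed different amounts of work.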
When reviewing these findings, it is important to maintain perspective and consider them within the context of this study’s inherent limitations. For instance, included competitors might be viewed as a specific sub-group within the top 10,000 athletes. This is because only those who submitted video recordings of their best attempt to the online leaderboard [
4] for all five tests were considered for inclusion; a requirement that eliminated more than 90% of men and 95% of women who met our rank criteria. The decision to only consider the top 10,000 athletes within each division was made simply because video submissions became more scarce beyond this point, and recorded efforts were obviously necessary for video analysis to be possible. The secondary decision to only include competitors who submitted recordings for all five tests was made to ensure that different group compositions could not be a confounding factor when collating results across all test comparisons. While group compositions still varied for each test because individual efforts were excluded for not meeting a test’s programming and/or movement standards, these instances were relatively few (<5% of cases across all tests). The video submission requirement also provided evidence that the authenticity of the effort had been certified by competition officials [
3], though it is still questionable as to whether each submission was critically examined by said officials. As noted above, cases still needed to be removed from analyses for reasons such as miscounted repetitions and incomplete efforts. It is also possible that the present sample was not representative of all athletes within the top 10,000 who recorded their efforts on all five tests. In lieu of video submission, competitors had the option of completing tests at a CrossFit
® affiliate in front of a certified judge, who then submitted the scoresheet to certify the effort [
3]. Possessing a video recording of the effort is still considered a best practice in case the validity of an attempt is questioned, but these may be submitted discreetly to competition officials. Regardless, the true number of competitors who actually recorded their efforts on all five tests but did not submit the video cannot be estimated, nor can their reasons for doing so be known. Another important limitation to mention involved the manner in which competitors were grouped by rank. This study used previously described criteria to identify a competitor’s effort as a valid attempt using Rx standards [
17], and these criteria are more stringent than those required to earn an Rx rank in the CFO [
2,
3]. Briefly, the study criteria were designed to exclude attempts where it was apparent that the competitor intentionally performed only a limited number of repetitions for the entire test solely for the purpose of earning an Rx rank. These criteria reduced the overall pool of competitors and thus affected percent rank calculations. Consequently, top 10% competitors defined in this study may actually be representative of a more exclusive group (i.e., higher ranking) than their stated rank implies. Nevertheless, these inclusion criteria were necessary to ensure that legitimate and complete efforts were being used for all comparisons, and that each effort was as consistent as possible with those of similarly ranked athletes. Any data missing from an effort (e.g., for athletes who did not attempt repetitions in rounds five and six of tests 3 and 4) were missing because this was a common occurrence amongst athletes of similar rank.
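The way a reduced pool shifts percent rank can be shown with simple arithmetic. The sketch below assumes, purely for illustration, that every excluded attempt ranked below the cutoff of interest; the function name and the pool sizes are hypothetical and are not taken from this study.

```python
def equivalent_full_pool_percent(study_cutoff_percent, full_pool_size, excluded_count):
    """Express a percent-rank cutoff from the reduced pool relative to the full pool.

    Assumes all excluded attempts ranked below the cutoff, so the athletes
    inside the cutoff are the same people in both pools.
    """
    if not 0 <= excluded_count < full_pool_size:
        raise ValueError("excluded_count must be in [0, full_pool_size)")
    retained = full_pool_size - excluded_count
    athletes_in_group = retained * study_cutoff_percent / 100.0
    return 100.0 * athletes_in_group / full_pool_size


# Hypothetical pool: 1000 Rx-ranked athletes, 200 excluded by stricter criteria.
print(equivalent_full_pool_percent(10, 1000, 200))
```

Under these invented numbers, the top 10% of the 800 retained athletes is 80 people, which corresponds to the top 8% of the original 1000.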