How Confident Can We Be in Modelling Female Swimming Performance in Adolescence?

The purpose of this research was to determine the expected progression of adolescent female swimming performances using a longitudinal approach. The performances of 514 female swimmers (12–19 year olds) who participated in one or more FINA-regulated annual international schools’ swimming championships over an eight-year period were analysed. Quadratic functions for each of the seven individual events (50, 100, 200 m freestyle, 100 m backstroke, breaststroke, butterfly, 200 m individual medley) were determined using mixed linear models. The predicted threshold of peak performance ranged from 16.8 ± 0.2 (200 m individual medley) to 20.6 ± 0.1 (100 m butterfly) years of age, preceded by gradual rates of improvement (mean rate of 1.6% per year). However, following cross validation, only three events (100 m backstroke, 200 m individual medley and 200 m freestyle) produced reliable models. Identifying the factors that contribute to the progression of female performance in this transitory period of life remains challenging, not least since the onset of puberty is likely to have occurred prior to reaching 12 years of age, the minimum competition age for this championship.


Introduction
Based on the increasing pressure for nations to develop talented athletes and win medals at the highest level, many sporting bodies have directed strategies and resources to increasing performance levels in all sports; swimming is no exception [1][2][3]. Trying to separate the performance gains that are made by athletes due to training as opposed to natural growth and development has been one of the most important challenges to overcome. Malina [4] highlighted the need for longitudinal studies to better understand how and when athletes' performances progressed. There have been a number of approaches to predictive modelling in a variety of different sports, including physiological, mathematical or probability strategies [5]. However, these authors suggest that until all factors such as biomechanical, physiological and psychological parameters that influence human performance are fully understood and accounted for, modelling will continue to lack sufficient accuracy to meaningfully predict future performance. Nevertheless, numerous studies have considered how changes in physical, physiological and biomechanical parameters affect performance during adolescence [6,7].
To date, research exploring the development of youth swimmers during adolescence has focussed mainly on male subjects [8][9][10][11] with comparatively fewer targeting solely young females [7,12]. In one of the few studies on young female swimmers, Lätt, Jürimäe, Haljaste, Cicchella, Purge and Jürimäe [7] found that development of biomechanical factors such as velocity, stroke length, stroke rate and in particular stroke index, rather than bioenergetics, contributed more to improved performance times in the 400 m freestyle event.
The performance gap between adult males and females in swimming has reportedly been stable at 8.9% since 1979 [13]. Despite the negligible differences in swimming performance between the sexes before puberty, from age 12 years onwards the performance gap appears to increase [14]. Indeed, it is the greater stroke-specific power of males compared with females that is purported to be a key contributing factor to this difference [15,16]. However, it has been proposed that since females mature physically earlier than males, they are better equipped to compete equitably with older females after reaching the age of 15 years [14]. From a physical standpoint, males can only start competing with an equal chance of success against mature males from the age of 17 years [14].
Baxter-Jones [17] questioned the age at which athletes should formally start competing and this debate remains as relevant today. In contradiction to competition entry requirements, the Amateur Swimming Association's (ASA) "The Swimmer Pathway" [18] advocated that only 15 year old female swimmers should consider racing at the "training to compete" stage of the Long Term Athlete Development model [19]. However, Grange and Gordon [20] indicated that the youngest competition age was 9 years and the distances over which these younger swimmers competed continued to change, with no distinction being made between sexes [21]. Furthermore, the latest version of the ASA handbook does not make any reference to race distances for these younger swimmers [22]. Despite this, Light, Harvey and Memmert [3] found that, given the appropriate setting, club swimmers drawn from France, Germany and Australia (mean age of 10.39 ± 1.07 years), were in fact demonstrating early specialisation and were not averse to competing at an early age. The findings of Barynina and Vaitsekhovskii [23] suggested that young swimmers would benefit from later specialisation within the sport (after the age of 12 years) and less training before reaching the age of 11 years. These findings add support to the sampling approach to sport advocated by the Development Model of Sports Participation [24]. However, Erlandson, Sherar, Mirwald, Maffulli and Baxter-Jones [12] found the development process of young female elite athletes did not appear to be adversely affected by intensive participation in sports, including swimming. The multitude of conflicting ideas regarding the minimum age for specialisation and/or competition suggested by various research groups, sporting bodies and development models confirms that, as yet, there is no definitive conclusion to this debate.
Longitudinal studies have the potential to help coaches gain perspective on the success of young athletes and enable them to give sound career advice [8]. A longitudinal study by Sokolovas [25] was one of the first to draw attention to the value of tracking elite swimmers retrospectively through their careers. With recent improvements in statistical methods, Allen et al. [26] and Dormehl et al. [27] have extended this concept by creating mixed linear models of elite-level and sub-elite adolescent male swimmers, respectively.
Since there are many challenges associated with constructing accurate models of human performance, let alone the performance of young female sub-elite swimmers, it is unsurprising that no quantifiable baseline model currently exists. While it is tempting to create an all-encompassing model of swimming as a single sport, it is of more value to coaches and swimmers to acknowledge the individual specialisms within this multi-disciplinary sport. The aim of the present study was therefore to create the first models of the performance progression of sub-elite adolescent female swimmers for common strokes and distances. Identifying the threshold ages of peak performance in adolescent female swimmers could provide coaches and sporting associations with some potentially useful benchmarking tools to identify talent, and possibly provide evidence to determine realistic qualifying times as well as a justifiable minimum competition age for females.

Methods
Performance times for all female entrants (n = 514, aged between 12 and 19 years) who competed in one of seven individual events (Table 1) were extracted from the official results of an annual schools' swimming championships from 2006 to 2013. The 13 competing schools were American, British and International schools, predominantly located in Western Europe. Team sizes were limited and the competition rules restricted swimmers to a maximum of three individual events per championship. The data were in the public domain and downloaded from the relevant tournament websites. All swimmers were assigned individual identity codes to ensure anonymity. The study was approved by the institutional ethics committee and conformed to the recommendations of the Declaration of Helsinki. The single best performances in each of the seven events entered (in either the heats or the finals) over the 8-year analysis period are described in Table 1. The swimmers' ages at the time of each competition were also obtained.
Note to Table 1: The drop in the number of repeat performances was likely to have been caused by a change in event choice, team selection, the transitory nature of scholars at international schools, injury or dropout. * This row of data denotes the total number of swimmers competing in each event, since the table sums the consecutive number of years swum. For example, the total number of entrants in the 50 m freestyle event was 414, 167 of whom competed for two or more years and 2 of whom went on to swim in this event for 6 consecutive years (the maximum number of years over which any swimmer could compete between the ages of 12 and 19 years).

Statistical Analysis
The raw datasets for all performances in each of the seven events were tested for normality using the Shapiro-Francia test [28] in STATA ver. 13 (StataCorp. 2013. Stata Statistical Software: Release 13. College Station, TX, USA: StataCorp LP). The trajectories of the curves showing the progression in performance during maturation were analysed using mixed or multi-level modelling (MLM) in STATA. Time was zero centred at 12 years of age, using an unstructured covariance approach. The fit of the models in fixed and random effects were compared with maximum likelihoods, using a hierarchical method. The final models were quadratic functions for fixed effects ($y = ax^2 + bx + c$). The fixed effects of time represented polynomial changes of the population with age and the random effects reflected individual deviations from the sample mean trajectory. Inter-class correlation coefficients were calculated and $R^2$ values determined in order to measure the difference between and within person variability and effect size respectively.
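To illustrate the shape of the fixed-effects curve only, the following minimal sketch fits a quadratic to performance times with age zero centred at 12 years, using ordinary least squares. It is not the authors' STATA mixed model: it omits the random effects and the covariance structure entirely, and the function name `fit_quadratic` and the example data are hypothetical.

```python
def fit_quadratic(ages, times, centre=12.0):
    """Least-squares fit of time = a*x**2 + b*x + c, where x = age - centre.

    Simplified illustration: ordinary least squares, no random effects.
    """
    xs = [age - centre for age in ages]
    # s[k] = sum of x**k, t[k] = sum of y * x**k (terms of the normal equations).
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, times)) for k in range(3)]
    # Augmented 3x3 system for the unknowns (a, b, c).
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


# Hypothetical times (seconds) generated from a known quadratic, ages 12 to 18.
example_ages = [12, 13, 14, 15, 16, 17, 18]
example_times = [0.25 * (g - 12) ** 2 - 2.0 * (g - 12) + 70.0 for g in example_ages]
a, b, c = fit_quadratic(example_ages, example_times)
```

With centring at 12 years, the intercept c is the modelled time at age 12, which is why it serves as the reference value when rates of change are later expressed as percentages.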

Evaluation of Models
The datasets for certain events had non-normal distributions. As a result, to validate the proposed models, cross-validations were performed whereby the datasets were randomly split into 1/3 and 2/3 sub-groups. Cross-validation of models is highly recommended under such circumstances in order to determine the generalisability of the findings [29].
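The random split described above can be sketched as follows. This is a minimal illustration that assumes the split is made at the level of the swimmer (so that all of one swimmer's repeat performances stay in the same sub-group); the text does not state whether swimmers or individual performances were split. The function name `split_swimmers` and the fixed seed are hypothetical.

```python
import random


def split_swimmers(swimmer_ids, seed=0):
    """Randomly split unique swimmer IDs into roughly 1/3 and 2/3 sub-groups."""
    ids = sorted(set(swimmer_ids))   # de-duplicate repeat performances
    rng = random.Random(seed)        # seeded so the split is reproducible
    rng.shuffle(ids)
    cut = len(ids) // 3
    return ids[:cut], ids[cut:]


# Example with 514 anonymised identity codes (hypothetical values).
third, two_thirds = split_swimmers(range(514))
```

The model would then be refitted separately on each sub-group and its coefficients compared with those of the full model.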
The percentage rate of improvement was determined through differentiation of the quadratic functions for each event separately, as $y = \left(\frac{2a}{c} \times 100\right)x + \left(\frac{b}{c} \times 100\right)$, where y = percent change in performance time and x + 12 = age, in years. The threshold age of peak performance was calculated as the axis of symmetry of the quadratic function, i.e., $-\frac{b}{2a}$.
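As a worked illustration of the two calculations described above, the sketch below evaluates the percentage rate of change and the threshold age from the coefficients of a quadratic. The coefficient values are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from Table 2. The function names are likewise illustrative.

```python
def percent_change_per_year(a, b, c, age, centre=12.0):
    """Derivative of a*x**2 + b*x + c, as a percentage of c, at x = age - centre."""
    x = age - centre
    return (2 * a / c * 100) * x + (b / c * 100)


def threshold_age(a, b, centre=12.0):
    """Axis of symmetry, -b / (2a), converted from centred time back to age."""
    return centre - b / (2 * a)


# Hypothetical coefficients: time = 0.25*x**2 - 2.4*x + 70, with x = age - 12.
a, b, c = 0.25, -2.4, 70.0
peak = threshold_age(a, b)                            # x = 4.8, so about 16.8 years
rate_at_12 = percent_change_per_year(a, b, c, 12.0)   # negative: times are falling
rate_at_peak = percent_change_per_year(a, b, c, peak) # approximately zero
```

A negative percentage means the swimming time is decreasing, that is, performance is improving; the rate reaches zero at the threshold age.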

Results
Many of the probability values for the coefficients of the functions were greater than 0.05 (Table 2), resulting in reduced confidence in those models. This included the full model of the fixed quadratic for the 100 m butterfly and at least one of the cross-validation models for the 50 and 100 m freestyle in addition to the 100 m backstroke and breaststroke. Cross validation confirmed that the full models for the 200 m freestyle and the 100 m backstroke events fit the data well in comparison to those for the other events. In the remaining five events however, at least one coefficient of the cross-validation models fell just outside of the standard error (SE) of the full model, but all fell within the 95% confidence interval (C.I.) of the full model. Of all the models, the 100 m freestyle event had the poorest fit.
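The comparison of cross-validation coefficients with the full model can be sketched as below. This is an illustration only: it assumes the 95% confidence interval is approximated as the full-model coefficient ± 1.96 × SE, which may differ from the interval STATA reports, and the function name and numbers are hypothetical.

```python
def compare_to_full_model(full_coef, full_se, cv_coef, z=1.96):
    """Report whether a cross-validation coefficient lies within one SE and
    within an approximate 95% C.I. (coef +/- z * SE) of the full model."""
    diff = abs(cv_coef - full_coef)
    return {
        "within_se": diff <= full_se,
        "within_ci95": diff <= z * full_se,
    }


# Hypothetical case: just outside one SE, but inside the approximate 95% C.I.
result = compare_to_full_model(full_coef=0.25, full_se=0.05, cv_coef=0.31)
```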
The models indicate that female swimmers are likely to reach their threshold of peak performance earliest in the 200 m individual medley (16.8 years) and latest in the 100 m butterfly, the latter of which was predicted to occur beyond the age range of the dataset (Figure 1 and Table 3). The slowest rate of improvement between the ages of 12 and 16.8 years was observed in 100 m butterfly swimmers, whereas the greatest rate of improvement (over the same age range) was predicted to occur in the 200 m freestyle event. For the modelled improvement rates from 12 years through to the threshold age, 200 m freestyle swimmers remain the fastest improving, while breaststroke swimmers replace butterfly swimmers as the slowest to improve (Table 3).

Discussion
The aim of the study was to model the performance of female swimmers in all strokes between the ages of 12 and 19 years. However, only the 200 m freestyle, the 100 m backstroke and, to a lesser extent, the 200 m individual medley events produced functions that can be interpreted with any confidence (Table 2).
Although Kojima, Jamison and Stager [14] did not aim to determine a peak age, they predicted that females could already start competing equally with older females from as young as 15 years of age. In contrast, the quadratic functions of this study indicated that thresholds of peak performance occurred later, i.e., from the age of 16.8 years (Table 3). A possible reason for this apparent discrepancy is that our dataset only included females from the age of 12 years (Figure 1), as this was the minimum entry age for the particular competition studied, while in the Kojima study there were swimmers as young as 7 years of age [14]. The unexpectedly late age of predicted peak performance for swimmers competing in the 100 m butterfly (20.6 years) was largely due to the shallow gradients (approx. 1.1% per year) of modelled improvement for this event. According to Malina, et al. [30], puberty begins at approximately 8 to 10 years of age for females and the mean age of menarche has been reported as 12.9 years [31]. It is therefore possible that the majority of females in this study may already have experienced meaningful gains in performance due to maturational development prior to competing in these events.
The threshold ages of peak performance for the sub-elite female swimmers in this study were on average only 0.7 years younger than those of their male counterparts at the same championships [27], even though females are expected to mature approximately 2 years earlier [30]. This finding supports the authors' concerns about combining data on all strokes and distances into one single model, as the relatively late predicted age of peak performance in the butterfly will undoubtedly have contributed to the higher mean threshold age calculated for the females in this study. However, the relative rate of improvement for adolescent female swimmers is confounded by numerous additional factors. Since females mature earlier than males, their improvement between the ages of 12 and 19 years is likely to be affected less by biological processes and potentially more by external factors, including biomechanical development, psychological and social pressures [32]. While the growth and maturational process to adulthood starts prior to the age of 12 years for females, it has been questioned whether they have sufficient cognitive development to deal with the rigours of high level competition and the concomitant pressures [33], or whether they should be specialising at such a young age [34].
The expected plateau in performance as biological maturation nears its peak, experienced earlier in females than males, is a factor possibly leading to waning interest and commitment to training and potentially higher dropout rates in females [12]. In accordance with the findings of Cornett and Stager [35], who examined the effect of the number of entrants in a 50 yard freestyle event on the level of performance, it is also possible that the lower number of entrants in the older age groups (data not shown) may have contributed to reduced competitiveness in these groups. Nevertheless, these sub-elite females were predicted to attain their threshold of peak performance 5.1 years earlier than the peak performance age reported for elite-level swimmers in the same events analysed by Allen, Vandenbogaerde and Hopkins [26]. The difference is likely due in part to their study exclusively containing a narrower sample of elite swimmers and, importantly, including performance data that progressed beyond the teenage years.
While the predicted models in this study provide poor fit for many of the events, there is value in examining the comparisons between events. Females reach their threshold of peak performance in longer distance events such as the 200 m individual medley and the 200 m freestyle at a younger age than shorter distance events (Table 3), confirming a phenomenon reported on by Arellano, et al. [36] and Allen, Vandenbogaerde and Hopkins [26]. Swimmers competing in the 200 m freestyle event also demonstrated the highest rate of improvement between the ages of 12 and 16.8 years (Figure 1). It is possible that females improve most in the longer distance events due to changes in body composition as a result of puberty. Post-pubertal females are known to have greater buoyancy, which has been suggested to give them an energy efficiency advantage over males [37] and is most noticeable in longer distance events [13,38].

Practical Applications
Rather than being limited to mere mathematical comparisons of combined threshold times of numerous specialisms within swimming, the value of the individual models developed in this study promotes many potential applications for coaches, swimmers and governing bodies. Swimmers can set realistic targets for the following season and coaches can measure the performance of their adolescent female swimmers against the average expected progressions for each of the events modelled. Furthermore, swimmers who consistently exceed the modelled rates of progression might be considered for talent development or alternatively may be identified as early or late maturers. With further refinements of the models, they could one day also assist governing bodies in the setting of justifiable qualifying times for national and international competitions.

Conclusions
Despite the poor fit of some of the models generated, the novel analysis of individual events allows for some interesting comparisons to be made. The authors feel that this approach is of more value than a one size fits all model for the sport. The models suggest that females achieve thresholds of peak performance earlier in longer distance events. Use of this particular international schools' swimming competition provided a consistent minimum age over many consecutive years and consequently ensured high validity of the dataset. However, the slow rate of progression seen in the quadratic functions generated in comparison to those found for the male adolescents by Dormehl, Robertson and Williams [27] indicates that the process of maturation had likely already begun for many of the females in this study. Compared with data for male swimmers [27], confidently identifying the contribution of maturation to performance improvement in females through adolescence remains an elusive goal. Future research should therefore consider collecting longitudinal data on very young swimmers in competition, as these could generate more robust models and higher levels of confidence. Finding a suitable sub-elite competition setting for this may however prove difficult until such time as a consensus is reached on a suitable minimum age of competition, and whether this age should be the same for both males and females. Overcoming these issues could lead to the development of useful benchmarking tools for potential talent identification of sub-elite athletes or the setting of realistic development goals.
Author Contributions: S.J.D. and C.A.W. designed the study, S.J.D. collected the data, S.J.R. and S.J.D. statistically analysed the data, S.J.D. wrote the first draft, C.A.W. supervised the study. All authors critically reviewed, contributed to and approved the final manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.