The Multistage 20-m Shuttle Run Test for Predicting VO2Peak in 6–9-Year-Old Children: A Comparison with VO2Peak Predictive Equations

Simple Summary
Cardiorespiratory fitness is one of the main components of physical fitness. For children, a simple test that can be used to assess cardiorespiratory fitness is the multistage 20-m shuttle run test (20mSRT). Research has often used a portable gas analyzer to measure cardiorespiratory fitness in clinical and scientific settings; however, this may not be practical due to the high cost of the device. Moreover, the use of such a device with children is almost impracticable in school environments. Thus, to avoid using such a device, one possibility is to use equations for predicting peak oxygen consumption, which is recognized as one of the best indicators of aerobic fitness. In the present study, 22 equations were used to determine which predictive equations had greater agreement with VO2peak values measured by direct oximetry through performance of the 20mSRT. Furthermore, we verified if wearing and carrying a portable gas analyzer constrained the children’s performances. To accomplish these aims, 67 boys and 63 girls were included in the analysis. Our results showed that only six predictive equations correctly predicted the peak oxygen consumption. In addition, for girls, higher values of maximal speed, total laps, and total time were found when a portable gas analyzer was used. This information is helpful to strength and conditioning professionals and to schoolteachers if portable gas analyzers are unavailable or if the environment is not suitable for such assessments.

Abstract
This study aimed (i) to verify if using and carrying a portable gas analyzer (PGA) constrained the performance of school children on the multistage 20-m shuttle run test (20mSRT), (ii) to verify which peak oxygen consumption (VO2peak) predictive equations have greater agreement with VO2peak values measured by direct oximetry using the 20mSRT.
The study participants were 130 children (67 boys (age 7.76 ± 0.97 years) and 63 girls (age 7.59 ± 0.91 years)), who performed two randomized trials of the 20mSRT with and without a PGA. Twenty-two predictive equations predicted the VO2peak values through the performance of the test with and without a PGA. Without a PGA, lower values of maximal speed (MS), total laps (TL), and total time (TT) were found for girls than for boys with a PGA. Only six equations were considered to correctly predict VO2peak. In general, higher MS, TL, and TT values were found with the use of a PGA. The predicted VO2peak values from the 20mSRT varied significantly among the published predictive equations. Therefore, we suggest that the six equations that presented satisfactory accuracy could be practically used to examine cardiorespiratory fitness in schools and in research with large populations when direct measurement of VO2peak is not feasible.


Introduction
Cardiorespiratory fitness (CRF) is a component of physical fitness commonly used in schools [1][2][3] that can be assessed, in children and adolescents, through oxygen consumption (VO2) during a maximal test [4] to obtain the peak oxygen consumption (VO2peak) value, which is usually recognized as one of the best indicators of aerobic fitness [5,6].
In young people, VO 2 peak is reached in the final stage of a maximal effort [7]. Children and adolescents with low CRF tend to present a higher risk of developing metabolic syndrome [8], cardiovascular diseases [9], and depression [10]. In addition, low CRF has strong associations with cardiovascular risk factors such as atherosclerotic vascular disease and abdominal adiposity with origins in pediatric years [11,12], which supports the association between CRF and health-related outcomes in youth and highlights the importance of CRF assessment in children. For instance, low CRF in children has been shown to negatively impact the functional ability of daily tasks and consequently affect life quality. Moreover, it has been reported that low CRF maintenance, from the early years, tends to continue over time [13]. This information highlights the importance of assessing CRF in children to understand their status, since VO 2 peak is considered to be a gold standard assessment [4].
In the school context, indirect field tests are mostly used, as they generally demand low cost, shorter execution time, and are easy to apply with a higher number of participants [14]. The 20-m shuttle run test (20mSRT), designed by Léger et al. [15], is one of the most used field protocols for youth, and it is included in different batteries of CRF tests [5,16]. The 20mSRT requires a small area and almost no equipment, and it can be performed with several participants simultaneously which consequently increases their motivation [17].
To predict VO2peak, Léger et al. [15] developed an equation for 8–19-year-old youth and reported a correlation of r = 0.71 between predicted VO2peak and a retro-extrapolation VO2 measured in the final stage of the test [2]. Recently, a systematic review showed moderate to strong evidence for five equations [1] to predict VO2peak values using the 20mSRT in youth [15,18,19,20]. However, although previous studies showed a relationship between directly measured VO2peak values and those predicted using equations, with correlations ranging between r = 0.71 and 0.96 [1,2], other studies have tested the validity of the equations and found that they performed poorly for predicting VO2peak [21][22][23][24].
The prediction of VO2peak from the 20mSRT does not include variables such as height, weight, body mass index, body surface area, or skinfold measurements, which have been recommended to improve the validity of the results [20,25]. Others have suggested that sex and age could affect the predictive power of the 20mSRT [1,18,20,26,27]. However, several predictive equations for estimating VO2peak have presented varied results due to the different characteristics of participants, such as age, maturation, sex, and body composition [26,27]. In this sense, an equation that can effectively predict VO2peak should be validated and should produce small variations in amplitude among the predicted results [22].
Despite the proven usefulness of VO2peak for achieving quality and consistency of data in children, its use remains a challenge [16,28]. The 20mSRT is the most used test in the school context for estimating CRF through the prediction of VO2peak values [1,3,16,29], but the validity of predictive equations is still questioned by some authors [2]. Hence, it is relevant to determine which predictive equations provide the best accuracy for children and youth populations, as well as which variables provide more validity to those equations [1].
To measure CRF in field tests conducted in clinical and scientific settings, studies have often used portable gas analyzers (PGA) [29][30][31][32][33]. However, few studies have investigated using the K4b2 PGA (COSMED, Rome, Italy), one of the frequently used devices [34][35][36], and its influence on the 20mSRT performance. A PGA consists of a face mask attached with a head harness, monitoring wires, and a portable unit fitted with a battery pack in a harness adjusted to the child's trunk, which weighs approximately 1 kg. Wearing a PGA can produce discomfort for participants and, consequently, impair CRF performance [36]. It seems that only Selvadurai et al. [30] and Itaborahy et al. [35] analyzed the interference capacity of a PGA during a maximal stress test in children, although with different procedures.
Given the previous concerns, in this study, we aimed (i) to verify if using and carrying a PGA constrains the performance of children on the 20mSRT, (ii) to verify which of the VO 2 peak predictive equations has greater validity as compared with VO 2 peak values measured by direct oximetry from the 20mSRT, (iii) to analyze if fat mass influences the results of the predictive equations for predicting VO 2 peak values. The following hypotheses were stated: (1) Using a PGA may constrain CRF performance. (2) Some existing VO 2 peak predictive equations are not validated. (3) Fat mass may influence the results of the predictive equations.

Materials and Methods
A non-probabilistic convenience sampling method was used to recruit participants. Before the study, all children provided verbal consent, and their parents/guardians provided written informed consent. In addition, this study was approved by the Ethics committee of the Polytechnic Institute of Santarém (approval number 102019Desporto).

Participants
One hundred and thirty white (Caucasian) children (67 boys, age 7.76 ± 0.12 years and 63 girls, age 7.59 ± 0.12 years), as determined by parents' place of birth, were recruited via word-of-mouth and through local school districts, all attending elementary schools in central and southern Portugal. To be included in the study, participants only needed to be apparently healthy without any contraindication to participate in a maximal CRF test. The exclusion criteria were a motor deficiency, medical contraindication to physical exercise or diagnosed disease or illness, or orthopedic issues that would limit their ability to run. During the collection period, children, parents, and teachers were asked not to perform any physical sports activities on the day before the tests. Although the initial sample was 140 children (70 boys and 70 girls), there was a 7% (n = 10) dropout because some children did not perform one of the tests (n = 4) or because the tests did not meet the eligibility criteria (n = 6).

Procedures
The study protocol involved three non-consecutive days for assessments. On the first day, the participants were certified regarding their health and clinical history by a technician in the presence of the classroom teacher. Body fat was determined with triceps and subscapular skinfold thickness measurements using a skinfold caliper, in a room with stable temperature and humidity (from 20 to 22 °C and from 50 to 60%, respectively). The second and third days were completed within a week, when participants completed two assessments of the 20mSRT at the same hour to avoid diurnal variation. All anthropometric and 20mSRT assessments were performed by the same certified fitness professional.

Anthropometric Assessment
During the first day, participants' heights were measured and recorded to the nearest 0.1 cm using a tape measure, and the Frankfort plane procedures were applied [37,38]. Weight was measured to the nearest 100 g, using a standard scale (Beam Balance-Stadiometer model 700, SECA, Vogel & Halke, Hamburg, Germany). For body fat determination, triceps and subscapular skinfolds were measured using ISAK protocols with a Slim Guide skinfold caliper (Creative Health Products, Plymouth, MI, USA). The triceps skinfold (TricSKF) was measured on the back of the right arm over the triceps muscle, midway between the elbow and the acromion process of the scapula [39]. The subscapular skinfold (SubSKF) was measured 2 cm below the lower angle of the right scapula. A single evaluator assessed the anthropometric measurements, which were performed three times. Then, the average of the measurements was calculated [40,41]. All skinfolds were measured to the nearest 0.5 mm. The fat mass percentage was estimated using the Slaughter equation [41]. Specifically, the sum of the triceps and subscapular skinfold thicknesses < 35 mm was used for both girls and boys. The following formulas were applied, where the term in parentheses is the sum of the triceps and subscapular skinfolds (mm):

$$\mathrm{BF}\,(\%) = 1.21\,(\mathrm{TricSKF} + \mathrm{SubSKF}) - 0.008\,(\mathrm{TricSKF} + \mathrm{SubSKF})^{2} - 1.7 \quad \text{(boys)}$$

$$\mathrm{BF}\,(\%) = 1.33\,(\mathrm{TricSKF} + \mathrm{SubSKF}) - 0.013\,(\mathrm{TricSKF} + \mathrm{SubSKF})^{2} - 2.5 \quad \text{(girls)}$$
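As an illustration only, the two body fat formulas above can be sketched in Python. The function name, the argument names, and the `"boy"`/`"girl"` labels are hypothetical choices made for this example; the coefficients and the < 35 mm restriction are the ones quoted in the text.

```python
def slaughter_body_fat_percent(triceps_mm: float, subscapular_mm: float, sex: str) -> float:
    """Estimate body fat (%) from the sum of two skinfolds (mm).

    Uses the coefficients quoted in the text; valid only when the
    skinfold sum is below 35 mm, as stated there.
    """
    skinfold_sum = triceps_mm + subscapular_mm
    if skinfold_sum >= 35:
        raise ValueError("these formulas apply only to skinfold sums below 35 mm")
    if sex == "boy":
        return 1.21 * skinfold_sum - 0.008 * skinfold_sum ** 2 - 1.7
    if sex == "girl":
        return 1.33 * skinfold_sum - 0.013 * skinfold_sum ** 2 - 2.5
    raise ValueError("sex must be 'boy' or 'girl'")
```

For a hypothetical skinfold sum of 20 mm, the sketch gives roughly 19.3% for a boy and 18.9% for a girl.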
As described by Lohman et al. [40], the waist circumference (WC) measurement was taken at the level of the umbilicus with a non-elastic flexible Rosscraft measuring tape (Rosscraft, Surrey, Canada) with a margin of error of 0.1 cm. A measure was taken with the subject standing without clothing covering the waist area.

20mSRT Test Protocol
The 20mSRT is a progressive intensity test. The initial speed is 8.0 km/h, which is increased by 0.5 km/h in each stage after the first minute. For better clarity, each lap is followed by one beep, while each minute is followed by three beeps [42]. The protocol was conducted during the morning in sport pavilions at four schools in cities in central and southern Portugal, during the spring (ambient temperature and relative humidity from 20 to 24 °C and from 50 to 65%, respectively).
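The speed progression described above can be sketched as a small calculation. This is a minimal illustration that assumes stage 1 runs at the initial 8.0 km/h and that each later one-minute stage adds 0.5 km/h; the function names are hypothetical and the time per shuttle is derived simply as distance divided by speed.

```python
def stage_speed_kmh(stage: int, initial_kmh: float = 8.0, step_kmh: float = 0.5) -> float:
    """Running speed (km/h) for a 1-indexed stage, assuming a fixed increment per stage."""
    if stage < 1:
        raise ValueError("stage numbering starts at 1")
    return initial_kmh + step_kmh * (stage - 1)


def seconds_per_shuttle(stage: int, distance_m: float = 20.0) -> float:
    """Time allowed to cover one 20 m shuttle at the speed of the given stage."""
    speed_m_per_s = stage_speed_kmh(stage) / 3.6  # convert km/h to m/s
    return distance_m / speed_m_per_s
```

Under these assumptions, a child has 9.0 s to cover each 20 m shuttle in stage 1 and about 8.5 s in stage 2.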
The standardized procedures of The Cooper Institute [43] for the 20mSRT were followed. The test was clearly explained to each participant, and all participants performed the test in their physical education classes. Before the application of protocol testing, a five-minute warm-up was performed with lower and upper limb stretching exercises and walking across the 20 m field test area.
As Scalco et al. [34] proposed during the test, all participants were verbally encouraged to achieve the best possible result. The types of verbal encouragement used were: "very well", "let's go", "way to go", "you can do it", "cheer up", and "you are almost there", spoken every 60 s, throughout the process, uninterrupted, and to all participants.
The test finished when a participant did not cover, for the second time, the 20 m distance between the two lines within the specific time of the stage (controlled by the beep sounds). Finally, the total number of laps (TL) completed and the speed reached at the end of the 20mSRT for each participant were used for analysis [43].
As mentioned, all participants performed two trials of the 20mSRT (within a week). During one trial, participants wore a face mask and a PGA for gas analysis, while in the other trial, they did not use them. A random order of the test with and without a face mask and PGA was applied.

Prediction of Peak Oxygen Consumption
Peak oxygen consumption (VO2peak) was measured by direct oximetry with a Cosmed K4b2 PGA analyzer (Cosmed, Rome, Italy), which had been previously validated [44]. In addition, the VO2peak values were also predicted by using 22 predictive equations (Table 1).
The cardiorespiratory data were collected as previously described by Silva et al. [47]. Specifically, the heart rate (HR) was measured using a Polar T31 sensor (Polar Electro Oy, Kempele, Finland) coupled to a Cosmed K4b2 analyzer. During each test, the HR and VO2 values of the participants were continuously monitored by telemetry. The size of the mask was adjusted to the child's face, and the device's harness was adjusted to the child's trunk, carrying the portable unit in the chest area and the battery at the level of the shoulder blades. This compact device was attached without constricting the children's movements. The K4b2 weighed 475 g, the total equipment weight (including harness and battery) did not exceed approximately 1 kg, and it was not expected to significantly affect the energy demands of the subjects [52]. To assure the best fit and minimal dead space for pediatric testing, the mask chosen should be smaller or pediatric specific. After placing the mask, the mouthpiece was covered, and children were asked to make a forced expiration in order to check the sealing [53].
Before each use of the analyzer, calibration tests were performed [53], as described by the manufacturer, COSMED Srl (Rome, Italy). The gas analyzer calibration procedures before the start of each test were as follows: 45-min warm-up period for the device, calibration with ambient air, calibration with reference gas (16.7% O 2 and 5.7% CO 2 ), gas transition time calibration, and turbine calibration (with 3000 mL syringe) (Quinton Instruments, Seattle, WA, USA). Respiratory parameters were recorded breath by breath, and the values of VO 2 were recorded for an average of 10 s [54,55].
There is no well-accepted definition of a VO2 plateau in pediatric testing, possibly because it is often absent [56][57][58]. Therefore, performance on the 20mSRT was considered to be maximal when the participant achieved at least two of the following adopted physiological criteria and one subjective criterion, as proposed in a previous study [59]. The physiological criteria included: exceeding the 2nd ventilatory threshold, interrupted as the maximal exhaustion test; respiratory exchange ratio (RER) score ≥ 1.0 [60]; VO2peak with the highest VO2 in mL/kg/min elicited during a progressive exercise test to exhaustion; reaching the age-predicted maximal heart rate within ±10 bpm (Tanaka equation, 208 − 0.7 × age) [61]. The subjective criteria included: signs of maximal effort such as profuse sweating, facial flushing, and unsteady gait during the run [62]. Participants who did not achieve the previous criteria were excluded from the analysis (n = 6).
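The decision rule above can be sketched as follows. This is only an illustrative reading of the criteria, not the authors' procedure: it assumes the heart rate criterion means a peak HR within 10 bpm of the Tanaka prediction, it treats the ventilatory threshold and the subjective signs as yes/no inputs judged by the assessor, and it leaves out the VO2peak criterion because the text defines it rather than giving a cut-off. All function and parameter names are hypothetical.

```python
def tanaka_hr_max(age_years: float) -> float:
    """Age-predicted maximal heart rate (bpm) from the Tanaka equation quoted in the text."""
    return 208 - 0.7 * age_years


def is_maximal_effort(age_years: float, hr_peak_bpm: float, rer_peak: float,
                      above_second_vt: bool, subjective_signs: bool) -> bool:
    """Return True when at least two physiological criteria and the subjective criterion are met."""
    physiological_criteria = [
        above_second_vt,                                         # exceeded 2nd ventilatory threshold
        rer_peak >= 1.0,                                         # RER criterion
        abs(hr_peak_bpm - tanaka_hr_max(age_years)) <= 10,       # HR within +/-10 bpm (assumed reading)
    ]
    return sum(physiological_criteria) >= 2 and subjective_signs
```

For a hypothetical 8-year-old, the predicted maximal heart rate is about 202 bpm, so a peak of 198 bpm with an RER of 1.05 and visible signs of exhaustion would count as a maximal effort in this sketch.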

Statistical Analysis
The measured VO2peak values, the predicted VO2peak values using the 22 equations (Table 1), and all interest variable means and standard deviations (SD) were calculated for the total sample and divided by gender. Mean differences and confidence intervals at 95% (CI, 95%) were calculated for the comparisons between measured VO2peak values and those predicted using equations. The Kolmogorov-Smirnov and Levene tests were used to test the assumptions of normality and homoscedasticity, respectively. Gender influences on the anthropometric, metabolic, cardiovascular, and performance variables from the 20mSRT and equations were determined with an independent t-test. Means comparisons between performance variables using, or not using, the K4b2 PGA were performed with paired sample t-tests, in the total population and between genders. A paired sample t-test was performed to estimate mean differences between measured VO2peak values and predicted VO2peak values using predictive equations. Correlations were determined between measured VO2peak values and predicted VO2peak values using equations, with and without the K4b2 PGA. The Sigmaplot version 14.0 software (Systat Software, San Jose, CA) was used to create the Bland-Altman plots. The graphical dispersion arrangement of Bland and Altman [63] allowed the visualization of the mean differences and the upper and lower limits according to two standard deviations of the differences in the measurements. The significance level considered for all tests was p < 0.05. All predictive equations were adjusted to fat mass (%). The statistical analyses were computed and performed using IBM SPSS Statistics for Windows, Version 27.0. (IBM Corp., Armonk, NY, USA). Hedges' g effect size was also calculated for comparisons between genders, while Cohen's d was calculated for the comparisons between use of the PGA and non-use of the PGA.
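To make the effect size step concrete, the sketch below computes Hedges' g for two independent groups and labels its magnitude. The study's analyses were run in SPSS, so this is not the authors' code: the pooled-SD formula with the usual small-sample correction factor 1 − 3/(4(n1 + n2) − 9) is an assumption about how g was obtained, the function names are hypothetical, and the magnitude labels use the Hopkins thresholds cited in this section [64].

```python
import math
from statistics import mean, stdev


def hedges_g(group_a, group_b):
    """Standardized mean difference between two independent groups with small-sample correction."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * stdev(group_a) ** 2 +
                  (n_b - 1) * stdev(group_b) ** 2) / (n_a + n_b - 2)
    cohen_d = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
    correction = 1 - 3 / (4 * (n_a + n_b) - 9)  # assumed bias-correction factor
    return cohen_d * correction


def classify_effect_size(effect_size):
    """Label the magnitude of an effect size using the Hopkins thresholds cited in the text."""
    magnitude = abs(effect_size)
    if magnitude > 4.0:
        return "nearly perfect"
    if magnitude > 2.0:
        return "very large"
    if magnitude > 1.2:
        return "large"
    if magnitude > 0.6:
        return "moderate"
    if magnitude > 0.2:
        return "small"
    return "trivial"
```

With these thresholds, an effect size of 0.661 is labeled moderate and one of 0.384 is labeled small.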
The Hopkins' thresholds for effect size statistics were used, as follows: ≤0.2, trivial; >0.2, small; >0.6, moderate; >1.2, large; >2.0, very large; and >4.0, nearly perfect [64].

Results
Table 2 presents the means and SDs of anthropometric, performance, metabolic, and cardiovascular variables assessed during the 20mSRT. There were differences between boys and girls in the triceps skinfold (t(128) = 2.187, p < 0.05 and g = 0.384) and in all performance variables: MS with PGA (t(103.364) = −3.406, p < 0.001 and g = 0.661); MS without PGA (t(115.743) = −3.686, p < 0.001 and g = 0.674); TL with PGA (t(104.408) = −3.538, p < 0.001 and g = 0.610); TL without PGA (t(112.288) = −3.976, p < 0.001 and g = 0.689); TT with PGA (t(108.866) = −3.687, p < 0.001 and g = 0.638); TT without PGA (t(115.869) = −4.203, p < 0.001 and g = 0.729); and VO2peak (t(125.632) = −2.749, p < 0.05 and g = 0.48).
Table 3 presents differences between performance variables of the 20mSRT (maximal speed, total laps, and total time with and without the PGA). Table 4 presents the 22 VO2peak predictive equations reported in the literature. Since many of the equations use gender as a variable, they were compared with the measured VO2peak values from our study for the total sample and both genders. Table 5 presents the mean differences and comparisons between the measured VO2peak values and the predicted VO2peak values (highlighted with the letter "a") for each predictive equation for girls, boys, and overall participants (highlighted with the letter "b"). Table 6 presents only the non-significant VO2peak predictive equations, since they are the valuable equations to be discussed in the total sample. The SEE values ranged between 2.24 (Equation #4) and 7.09 mL·kg⁻¹·min⁻¹ (Equation #21, for the total sample).
The validation coefficients (correlation between estimated and measured VO2peak) were significant for all equations (0.927 > r > 0.618, p < 0.001; and one presented 0.286 > r > 0.283, p < 0.05). In Tables 6 and 7, the values are adjusted for fat mass (%). Table 6 presents the limits of agreement (LoA) and range (upper LoA − lower LoA) for the entire sample. Equation #4 presents the smallest SEE value between the measured and estimated VO2peak values in the two tests with and without a PGA. However, it also presents the highest slope, meaning that the equation overpredicts the VO2peak in participants with lower VO2peak and underpredicts the VO2peak in participants with higher VO2peak. Equation #21 presents the lowest slope but has the lowest R and the lowest range. Equation #1 presents a reasonably large d (p < 0.05), the highest range, and the highest slope (p < 0.0001).
Table 3. Comparison of the 20mSRT performance variables in the two trials with and without the portable gas analyzer.
Table 7 presents only the non-significant VO2peak predictive equations since they are the valuable equations to be discussed for both genders. Bland-Altman graphs were plotted to examine the bias distribution and assess the agreement between the measured VO2peak values and the predicted VO2peak values. The Bland-Altman plots provide the systematic bias and random error between the measured and predicted VO2peak values, which are presented in Figure 1. Positive linear distributions of Equation #1 (Figure 1A,B), Equation #4 (Figure 1C,D), and Equation #6 (Figure 1E) are found. Equation #21 presents a more random dispersion of scores (Figure 1F,G).
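For readers who want to reproduce the agreement statistics, the bias and limits of agreement behind a Bland-Altman plot can be sketched as below. The study produced its plots in Sigmaplot, so this is an independent illustration: it follows the description in the Statistical Analysis section (mean difference ± two standard deviations of the differences), while the function name and the choice of measured minus predicted as the direction of the difference are assumptions.

```python
from statistics import mean, stdev


def bland_altman_limits(measured, predicted, n_sd=2.0):
    """Return (bias, lower LoA, upper LoA) for paired measured and predicted values."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    differences = [m - p for m, p in zip(measured, predicted)]  # assumed direction
    bias = mean(differences)            # systematic bias
    spread = n_sd * stdev(differences)  # random error band
    return bias, bias - spread, bias + spread
```

The range reported alongside the limits of agreement is then the upper limit minus the lower limit.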

Discussion
In this study, we aimed: (i) to verify if carrying and using a PGA constrains the performance of elementary children who perform the 20mSRT, (ii) to verify which of the VO2peak predictive equations have greater validity as compared with VO2peak values measured by direct oximetry using the 20mSRT, (iii) to analyze if fat mass influences the results of predictive equations for VO2peak. The main findings showed that the MS, TL, and TT were higher for the boys than the girls, with and without the PGA. In addition, higher values were reported with the PGA than without the PGA for both girls and boys (except for boys in TT). Moreover, among the 22 predictive equations, only six equations were considered to estimate VO2peak correctly.
To discuss all objectives of the study, this section is organized into three subsections: (1) Constraints associated with using a portable gas analyzer; (2) agreement of VO2peak predictive equations as compared with VO 2 peak measured by direct oximetry; and (3) fat mass influence on the results of predictive equations for VO 2 peak.

Constraints Associated with Using a Portable Gas Analyzer
In this study, differences were found in the 20mSRT performance variables (maximal speed, total laps, and total time) when using a PGA versus without a PGA, for the entire sample. Interestingly, when the analysis was performed by gender, only the group of girls presented differences. Specifically, without the PGA, MS was lower, TL were fewer, and TT was less for girls than for boys with the PGA. Based on the HRpeak, the participants of the present study showed values that were similar to those of other studies [30,35]. Task commitment and, eventually, motivational aspects can explain these results; however, the technicians gave proper verbal feedback and encouragement during testing [43].
To the best of our knowledge, only two studies have analyzed the constraints of using a PGA for assessing CRF in a child and youth population. Selvadurai et al. [30] assessed 93 children and adolescents with cystic fibrosis, through the 20mSRT with and without a PGA (Cardiovit 100 CS Spirometry Module) but supported on a trolley placed on tracks and pushed by a technician. No differences in the cardiorespiratory responses were found (respiratory rate, Borg scale values, and peripheral oxygen saturation), as well as in the distance traveled, with or without using a PGA. The authors attributed the results to a light mask that weighed less than 100 g, which may have minimized any discomfort. However, their results were not in agreement with the results reported by Itaborahy et al. [35], who analyzed the reproducibility of intergroup performances performing a modified shuttle walk test, a 6-min walk test, and an ADL-Glittre for pediatric populations. They found reproducible results among themselves, but they were not reproducible when performed with or without a PGA (Cosmed K4b2, Cosmed, Rome, Italy). However, the authors stated that the performances in the 6-min walk and modified shuttle walk tests were not significantly different with or without a PGA.

Agreement of VO 2 peak Based on Predictive Equations as Compared with VO 2 peak Measured by Direct Oximetry
Accuracy is necessary to establish associations between cardiorespiratory fitness and other explored variables. The validity of previously published equations (22 equations) has been verified in Portuguese children. The main finding of our study was that, in 6-9-year-old children, different predictive equations presented variations for predicting VO 2 peak through the 20mSRT. Overall, predicted VO 2 peak values were only adequate in six equations (Equations #1, #3, #4, #6, #21 and Equation #12, only in boys). From the majority of the 22 existing predictive equations tested, the generalization and accuracy of the group or gender predicted VO 2 peak values seem to be unacceptable.
Regarding the total sample, a comparison of the predicted VO2peak values in the two 20mSRT trials (with and without a PGA) with the measured VO2peak values showed that, among the 22 equations, only Equations #1, #3, #4, #6 (but only in the trial performed without using the PGA), and #21 presented no differences. These results are in line with an evaluation carried out in a systematic review by Menezes-Júnior et al. [65], in which Equations #1, #6, and #7 had a strong level of evidence (equations validated by three or more studies with low risk of bias), and Equations #2, #3, and #4 were classified as moderate evidence. Menezes-Júnior et al. [65] also stated that Equation #6 presents higher evidence and reliability for predicting VO2peak values, for girls and boys.
Although Equation #18 was developed using oxygen consumption data collected while the participants completed the 20mSRT, its predictive power did not prove to be accurate. Another study by [24] analyzed the VO 2 peak of Portuguese children while performing the 20mSRT with a PGA as compared with the predicted VO 2 peak values from the FITNESSGRAM reports and other equations [18,20,26,43]. They found that the FITNESSGRAM software was not significantly different from directly measured VO 2 peak values [24].

Fat Mass Influence on the Results of Predictive Equations for Predicting VO 2 peak
The different parameters that integrate the equations can give greater or lesser predictive power. Body mass index (BMI) has been considered to be the main body size predictor of VO2peak in several studies [18,27,51], indicating significantly improved performance of the published equations. The BMI can have a wide influence on CRF in children and adolescents [16] and has a robust association with body mass. Some studies [18,24,43] have suggested that BMI or skinfold thickness, age, and gender can provide the best predictions of VO2peak as compared with those that only include age. However, this study showed that the equations that only include age by Leger et al. [42] for the entire sample and by gender and the equation by Menezes-Junior et al. [1] for boys only presented reasonable predictions of VO2peak.
In addition, it has been reported that, in children between 11 and 17 years old, fat-free mass was the variable that affected VO 2 peak. Recently, some authors [27] have stated that the proportion-to-body mass scale exercise variables assume an underlying set of specific statistical assumptions that are rarely met in pediatric exercise research [2].
In this sense, a recent study by Menezes-Junior et al. [1] used the z-score of the BMI (BMI-z) in their equation and reported findings that suggested equations with the BMI-z and its combination with %FM were more accurate and suitable for predicting VO 2 peak values in boys. In the aforementioned study [1], the inclusion of %FM indicated improvements in the models, especially in boys. In contrast with our study, the Menezes-Junior [1] equation (that integrated the %FM) was only predictive for VO2peak in girls (6-9 years old). Despite the difference between the mean age of the sample in the present study (7.68 ± 0.08 years) as compared with the Menezes-Junior study [1] (13.37 ± 1.84 years), it is possible that the higher %FM in girls, which increases during puberty, may reduce their aerobic performance [21,66]. In contrast, the negative impact on boys' CRF caused by an increase in body fat may have resulted from an unfit lifestyle that was not considered in the present analysis.
Although they used BMI to predict VO2peak, Mahar et al. [50] considered 20mSRT performance and age sufficient to predict VO2peak. However, they did not take into account that measures of adiposity other than BMI had higher correlations with VO2peak. For instance, the correlation of FM with VO2peak was r = −0.38, while the correlation of BMI with VO2peak was r = −0.22 [50]. A study by Ayala-Guzmán and Ortiz-Hernández [21] showed the same tendency: VO2peak correlated more strongly with FM (r = −0.61) than with BMI (r = −0.22).
In the present study, only Equation #2 considered both gender and triceps skinfold; nevertheless, it was not predictive of VO2peak in the present sample. On the other hand, several equations have included gender [1,20,21,24,45–48,51]. Our results showed significant differences in test performance by gender and, more precisely, in girls.
Some studies have observed that body fat was a better predictor in the equations than BMI [21]. Nevertheless, in the present study, the two equations that integrated fat mass, Equations #2 and #21, were not found to be good predictors of VO2peak. Considering that excessive fat mass can decrease cardiorespiratory performance [21,67], this variable has been suggested to better explain variations in VO2peak [1].
Contrary to the previous suggestion, in this study, we showed that Equation #21 presented a good prediction of VO2peak using the interaction of height and age. One justification for this result could be that older and taller adolescents may have an advantage when performing the 20mSRT. However, such analysis was not considered in the present research.
Another study also assessed the agreement between VO2peak directly measured through the 20mSRT and VO2peak predicted using five different equations, in 13-19-year-old children [28], and from those equations, seven equations were used in our study, namely, Equations #1, #2, #3, #4, #6, #7, and #8. In the study by Ruiz et al. [28], the 20mSRT was performed by boys and girls with a mean age older than in the present study (girls, 14.6 ± 1.5 years and boys, 15.0 ± 1.6 years). All participants wore the Cosmed K4b2 PGA, and the measured VO2peak was 47.1 ± 8.1 mL/kg/min, which was slightly lower than the values of our study, as were the corresponding means of predicted VO2peak values: 41.5 ± 5.2 mL/kg/min for Equation #1, 44.2 ± 5.6 mL/kg/min for Equation #4, 45.7 ± 5.0 mL/kg/min for Equation #2, and 43 ± 5.5 mL/kg/min for Equation #7.
Most equations have been developed from data collected during a maximal treadmill protocol. Although previous research has shown that VO2peak values measured during the 20mSRT and on a treadmill were not statistically different [68], the treadmill involves continuous walking or running, while the 20mSRT consists of intermittent bouts of 20 m laps. The 20mSRT is more similar to the sporadic, varying-intensity activity patterns that youth typically engage in. As most equations were developed using data obtained from treadmill assessments, the accuracy of the VO2peak estimations should be similar across equations. However, the results from this study showed that the most accurate of all 22 equations were Equations #1, #3, #4, #6 (but only in the trial performed without using the PGA), and #21.
According to the literature, Equation #1 was the first to predict VO2peak values in children and adolescents [42], and a recent study has presented substantial evidence supporting it [1]. In this study, Equations #1, #4, and #6 presented a strong association between the predicted and the measured VO2peak values. As compared with Menezes-Junior et al. [1], this study presented lower SEE values for the aforementioned equations. In addition, as compared with the study by Ruiz et al. [19], Equation #1 had a lower value in our study (4.27 vs. 2.52 mL/kg/min, respectively). Although Equation #21 had a lower correlation and a high SEE (7.07 and 7.09 mL/kg/min with and without PGA, respectively), it may still be a potential equation for predicting VO2peak values in girls, boys, and the overall sample, since no significant differences were found between the measured and predicted VO2peak values in either condition. The Bland-Altman plots presented in this study (Figure 1) showed evidence of differences between the measured and the predicted VO2peak values. Equation #21 did not appear to consistently under- or overestimate VO2peak in unfit or fit children, whereas this trend was evident with Equations #1, #4, and #6. For those equations, the predictions tend to overestimate the measured values at lower VO2peak values and to underestimate them at higher VO2peak values. Only the plots for the Menezes-Junior et al. [1] equation showed a random error, which suggested a normal distribution of the differences between the methods, while the other plots presented a positive linear distribution of the values.
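The agreement statistics behind a Bland-Altman plot can be sketched in a few lines. The example below is a minimal illustration, not the analysis pipeline used in this study: the function name and the sample values are hypothetical, and the difference is assumed to be defined as measured minus predicted. Under that convention, a positive slope of the differences against the pairwise means corresponds to overestimation at low VO2peak and underestimation at high VO2peak.

```python
import statistics


def bland_altman(measured, predicted):
    """Return bias, 95% limits of agreement, and proportional-bias slope.

    Assumption: difference = measured - predicted (a convention chosen
    for this sketch, not taken from the paper).
    """
    diffs = [m - p for m, p in zip(measured, predicted)]
    means = [(m + p) / 2 for m, p in zip(measured, predicted)]

    bias = statistics.mean(diffs)      # mean difference between methods
    sd = statistics.stdev(diffs)       # sample SD of the differences
    lower = bias - 1.96 * sd           # lower 95% limit of agreement
    upper = bias + 1.96 * sd           # upper 95% limit of agreement

    # Least-squares slope of the differences on the pairwise means;
    # a slope far from zero indicates proportional bias.
    mean_x = statistics.mean(means)
    sxx = sum((x - mean_x) ** 2 for x in means)
    sxy = sum((x - mean_x) * (d - bias) for x, d in zip(means, diffs))
    slope = sxy / sxx

    return {"bias": bias, "lower": lower, "upper": upper, "slope": slope}


# Hypothetical values in mL/kg/min, for illustration only.
measured = [40.0, 45.0, 50.0, 55.0, 60.0]
predicted = [44.0, 46.0, 50.0, 53.0, 56.0]
result = bland_altman(measured, predicted)
print(result)
```

A slope close to zero with differences scattered randomly around the bias would match the pattern described for Equation #21, whereas a clearly positive slope would match the pattern described for Equations #1, #4, and #6.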
The better validity of some equations could be related to the 20mSRT performance variable they use: some use the number of laps, while others use the final speed or stage, which is recorded only once per minute of the test [18–20,42]. Thus, when the 20mSRT was stopped moments after the close of a stage or speed advance, only the previous data were considered [69]. However, another study showed equivalent results for equations that included lap number or maximal velocity. Nevertheless, the authors recommended using lap numbers as a practical application [21].
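The size of the information lost when only the last completed stage is credited can be illustrated with a short calculation. The sketch below rests on two assumptions that are not restated in this section and should be checked against the original sources: the commonly cited form of the Leger et al. equation (VO2peak = 31.025 + 3.238 S − 3.248 A + 0.1536 S A, with S the final speed in km/h and A the age in years), and a protocol that starts at 8.5 km/h and increases by 0.5 km/h per one-minute stage. The function names are hypothetical.

```python
def stage_speed_kmh(stage):
    """Running speed for a given stage (assumed: 8.5 km/h start, +0.5 km/h per stage)."""
    return 8.5 + 0.5 * (stage - 1)


def leger_vo2peak(speed_kmh, age_years):
    """Predicted VO2peak in mL/kg/min (assumed coefficients of the Leger et al. equation)."""
    return (31.025
            + 3.238 * speed_kmh
            - 3.248 * age_years
            + 0.1536 * speed_kmh * age_years)


age = 8  # hypothetical child, in years

# A child who stops partway through stage 5 is credited with stage 4 only.
credited = leger_vo2peak(stage_speed_kmh(4), age)
next_stage = leger_vo2peak(stage_speed_kmh(5), age)

print(f"credited (stage 4): {credited:.2f} mL/kg/min")
print(f"next stage (stage 5): {next_stage:.2f} mL/kg/min")
print(f"resolution of one stage: {next_stage - credited:.2f} mL/kg/min")
```

Under these assumptions, one stage corresponds to a step of roughly 2 mL/kg/min in predicted VO2peak for an 8-year-old, which is the granularity that lap-based equations avoid.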
The lack of agreement between our measured data and the values predicted by the different equations may be explained by the race/ethnic makeup of the sample groups, the lack of overweight and obese subjects, and the average age. Another limitation may be the convenience sampling technique used, which may have introduced selection bias. Finally, sleep patterns were not controlled on the days preceding the CRF tests, which may have interfered with the results. These factors can limit the generalizability of the results provided by the different equations.
Despite the previous limitations, the present study has several strengths. For instance, no allergic reactions were reported concerning the use of a PGA. In addition, there was a tendency for better performances for both the girls and the boys when using a PGA. Finally, the present study identifies the equations that are better for predicting VO2peak values.

Conclusions
In this study, we found differences in the 20mSRT performance variables when using a PGA versus without a PGA, for the entire sample and for girls, where higher values of maximal speed, total laps, and time were found with the use of the PGA/K4b2. Thus, the first hypothesis was rejected since the PGA did not constrain such variables for girls while the opposite was observed for boys.
The reported data further indicate that, in 6-9-year-old children, predicted VO2peak values from the 20mSRT can vary depending on the predictive equation used. Predictions of VO2peak values were adequate with only six equations: #1 [42], #3 and #4 [20], #6 [18], #21 [1], and #12 only in boys [47]. Therefore, the second hypothesis of the present study was confirmed, which suggests that these equations could be practically used to examine CRF in schools and in research with large populations when direct measurement of VO2peak is not feasible.
Finally, the third hypothesis was rejected, since fat mass did not show a stronger influence on the abovementioned six most adequate equations for predicting VO2peak values from the 20mSRT.
For future research, it is suggested that this study be replicated with a larger sample for each age and with other brands of PGA.