The value of vertical jumping is well-established for training and testing across many sports and exercise settings [1]. The ability to absorb and redirect force demonstrates large correlations with sprint speed (r = 0.70–0.91) and is a primary consideration in many team sports and track and field events [3]. Thus, understanding exercise parameters, such as jump height and contact times, allows coaches to manipulate explosive multi-joint exercises to train many different adaptations, including speed, power, reactive strength, force absorption, and proprioception [4]. However, a key factor for coaches using jump exercises for training and testing is the accuracy of these measurements.
There are several methods available to monitor jump performance, ranging from a ruler and a wall to expensive, advanced clinical technologies. Traditionally, the stand-and-reach test (Vertec), which estimates center-of-mass displacement from subjects swatting aside plastic vanes, has demonstrated large biases, both over-estimating jump height by 11.2 cm versus the Optojump [6] and under-estimating it by 2.4 ± 6.6 cm (mean ± standard deviation) compared to force-plates [7]. This typically occurs due to set-up variability within an athlete’s arm swing, wherein more or less arm involvement can drastically affect the jump height reached. As a result, newer methodologies encourage static hand placement, either gripping a wooden dowel or with the hands placed akimbo. Research-grade multiaxial force-plates are generally considered the “gold standard” for accurately measuring the ground reaction forces created during jumping due to their multiple inlaid sensors, high capacities, and increased sensitivity. However, newer technologies, including linear position transducers, mobile phone-based applications, micro-lasers, and accelerometers, attempt to bring this accuracy to the practitioner more easily and affordably [8].
Typically, several factors can influence a device’s accuracy, including the measurement method, sampling rate, testing area, athlete characteristics, instrument set-up, verbal instructions, and post-production algorithms. Jump height calculations based on displacement and take-off velocity have traditionally been the most accurate, but many newer technologies have reported high validity using flight time methods instead [9]. Accelerometers use small inertial sensors to measure velocity, while linear position transducers use manual cable movement to measure vertical displacement [10]. Both can be mounted directly on the athlete’s body via a belt or band, or on a held bar. Therefore, instrument set-up, sensor placement, and anterior-posterior trunk deviations may affect measurement accuracy [11]. For accelerometers, the inclusion of a gyroscope may account for landing position and trunk inclination [11], while linear transducers are advised to be set up as vertically as possible. In contrast, camera-based apps and micro-lasers primarily use flight time to calculate jump height and therefore rely on proper jump and landing mechanics for accuracy [12]. The main difficulty with body-mounted accelerometers and camera-based applications is the correct identification of take-off and landing [13]. While camera-based applications rely on manual frame selection to determine take-off and landing [10], body-mounted accelerometers use automatic algorithms. However, some authors have advised that similarly lengthy manual identification methods are necessary to reduce systematic bias in body-mounted accelerometers [13].
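Because several of the devices above derive their outputs from flight time rather than take-off velocity, the underlying arithmetic and the take-off/landing identification problem are worth making explicit. The following minimal Python sketch is illustrative only: the 10 N threshold, the RSI definition (jump height divided by ground contact time), and all function and variable names are our own assumptions, not any manufacturer’s algorithm.

```python
import numpy as np

G = 9.81  # gravitational acceleration (m/s^2)

def height_from_flight_time(flight_time_s: float) -> float:
    """Jump height from flight time via projectile motion: h = g * t^2 / 8."""
    return G * flight_time_s ** 2 / 8.0

def height_from_takeoff_velocity(v_takeoff: float) -> float:
    """Jump height from take-off velocity: h = v^2 / (2 * g)."""
    return v_takeoff ** 2 / (2.0 * G)

def reactive_strength_index(jump_height_m: float, contact_time_s: float) -> float:
    """One common RSI definition: jump height divided by ground contact time."""
    return jump_height_m / contact_time_s

def flight_phase_from_force(fz: np.ndarray, fs_hz: float, threshold_n: float = 10.0):
    """Crude take-off/landing identification from a vertical GRF trace.

    Take-off is the first sample where force drops below the threshold and
    landing the first sample where it rises back above it. Real systems use
    filtered, more robust criteria; this only illustrates the principle.
    """
    unloaded = fz < threshold_n
    takeoff_idx = int(np.argmax(unloaded))                          # first True
    landing_idx = takeoff_idx + int(np.argmax(~unloaded[takeoff_idx:]))
    return takeoff_idx, landing_idx, (landing_idx - takeoff_idx) / fs_hz

# A 0.55 s flight corresponds to ~0.37 m of rise; a 0.22 s contact gives RSI ~1.7
h = height_from_flight_time(0.55)
print(round(h, 3), round(reactive_strength_index(h, 0.22), 2))
```

As the first formula makes clear, any mechanism that lengthens the measured flight time (for example, landing with flexed hips and knees) inflates the estimated height, which is why flight-time methods depend so heavily on landing mechanics.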
While the validity and reliability of several devices have been examined, some common limitations exist. For example, most studies only examine homogeneous subjects and jumps, making it difficult to ascertain accuracy with both good and poor performers [14]. Most studies have also utilized only a few statistical strategies, including limits of agreement (LoA), coefficient of variation (CV), intraclass correlation coefficient (ICC), Pearson’s r, and/or p-values, which raises some issues. For instance, the most commonly reported reliability statistics, the ICC and r, are dependent on between-subject variability, which minimally affects the typical error of measurement (TEM) and CV [17]. Similarly, p-values only assess systematic bias between measures and are largely dependent on sample size [18]. Additionally, whilst systematic bias gives important information on whether a specific data-point is likely to be an under- or over-estimation compared to the criterion gold standard, it does not provide insight into the reliability of an estimate. Reliability is also often derived from a single session (i.e., within-trial variation), which has limited application to test-retest methodologies [8]. Finally, while several studies have examined the validity and reliability of force-plates, motion capture, and accelerometry-based technologies [13], the G-Flight photo-cell system, an affordable and extremely portable means of assessing jump performance, has yet to be independently validated.

Therefore, the primary aim of this investigation was to compare the validity and reliability of estimated jumping performance from three portable, commercially available products (G-Flight, PUSH, two-dimensional motion capture) assessed simultaneously against a criterion laboratory-grade force-plate. Specifically, the authors aimed to address differences in absolute measurement output, intra- and inter-session variation, and the composition of measurement error between the portable devices and the force-plate. In doing so, we aimed to provide practical information for coaches to determine which technology is best suited to their needs. We hypothesized that the G-Flight would hold strong validity and reliability, like other laser-based technologies (e.g., Optojump), while the PUSH would be the least accurate and reliable of the four products due to the added difficulty for manufacturers of developing algorithms that adequately estimate take-off and landing.
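To make the distinction between correlation-based and error-based reliability statistics concrete, the short Python sketch below simulates two test-retest data sets with identical trial-to-trial error (~1 cm) but very different between-subject spread. The subject values are invented purely for illustration, and the TEM and CV formulas follow the standard test-retest definitions (TEM = SD of the difference scores / √2), not the exact procedures of any cited study.

```python
import numpy as np

def typical_error(trial1: np.ndarray, trial2: np.ndarray) -> float:
    """Typical error of measurement: SD of the difference scores / sqrt(2)."""
    return float(np.std(trial1 - trial2, ddof=1) / np.sqrt(2))

def cv_percent(trial1: np.ndarray, trial2: np.ndarray) -> float:
    """Typical error expressed as a percentage of the grand mean."""
    return 100.0 * typical_error(trial1, trial2) / float(np.mean([trial1, trial2]))

rng = np.random.default_rng(0)
true_homog = np.full(10, 0.35)              # homogeneous squad: everyone jumps ~35 cm
true_heterog = np.linspace(0.20, 0.55, 10)  # heterogeneous squad: 20-55 cm range

for label, true_scores in [("homogeneous", true_homog), ("heterogeneous", true_heterog)]:
    t1 = true_scores + rng.normal(0.0, 0.01, 10)   # ~1 cm of trial-to-trial error
    t2 = true_scores + rng.normal(0.0, 0.01, 10)
    r = np.corrcoef(t1, t2)[0, 1]
    print(f"{label:13s} TEM = {typical_error(t1, t2) * 100:.1f} cm, "
          f"CV = {cv_percent(t1, t2):.1f}%, r = {r:.2f}")
```

The absolute error (TEM, CV) is essentially identical in both cases, whereas r collapses in the homogeneous group, mirroring the point above that ICC and r reward between-subject spread rather than measurement precision.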
The main purpose of this investigation was to examine the validity and reliability of commercially available portable technologies against a lab-grade research force-plate. For jump height, both the G-Flight and PUSH overestimated maximal height by ~4 cm, while no difference in maximal height was observed between the Mo-Cap and force-plate calculations. Compared with the force-plate, contact times were significantly longer for Mo-Cap and significantly shorter for PUSH. RSI was not significantly different between devices. In general, variability was higher for CMJ and DJ heights when measured with the G-Flight compared with the force-plate, but it still fell into the ‘acceptable’ range (ICC > 0.67, CV < 10%).
These results are similar to other publications reporting trivial overestimations (0.25–1.8 cm) using more automated 2-D motion capture apps such as MyJump [8], compared to force-plates. In comparison, the G-Flight and PUSH calculations revealed significant overestimations for SJ and CMJ heights (+3–4.5 cm). Alternatively, only PUSH overestimated jump height when performing the DJ (+4.1 cm). Wee et al. (2018) similarly reported overestimations (though somewhat larger) of 14.4 cm with PUSH and 12 cm with GymAware relative to force-plates. Differences in the magnitude of overestimation are most likely due to sensor placement, with the upper-spine placement likely to increase error due to trunk inclination compared with the lumbosacral position used in the current study. It is also important to note that the LoA for all devices, jump measures, and jump heights were very flat (r² = 0.0002–0.223) (Figure 4), indicating that little to no systematic bias was present between high (≤48.5 cm, ≥191 ms, RSI ≤ 1.50) and low (≥8.7 cm, ≤584 ms, RSI ≥ 0.26) performing jumpers (Table 1). Therefore, researchers and practitioners can utilize any of the examined technologies across a wide range of subjects, provided it is understood that the devices are generally not interchangeable. However, it should also be noted that the three largest biases were all in PUSH-derived jump heights, suggesting that PUSH may not be the best choice when assessing changes in jump height due to training or acute fatigue.
To the authors’ knowledge, very few published studies have analyzed the variance in jump height, contact time, or RSI across different measurement devices over three jump types. Furthermore, this is the first study examining the G-Flight micro-sensor system. Mo-Cap, G-Flight, and PUSH generally held testing variability similar to that of the force-plate. However, while ‘acceptable’, the G-Flight was commonly significantly more variable than the force-plate (Table 2). This increased variability does not preclude the use of the G-Flight, so long as practitioners understand that a larger shift in jump performance will be required before they can be sure a real change has occurred. Therefore, it is recommended that practitioners calculate the smallest worthwhile change from the present results, or for their own specific tests and populations.
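As a hedged illustration of that recommendation, the sketch below computes a smallest worthwhile change as 0.2 × the between-subject standard deviation (a common Hopkins-style convention, not a prescription from the present study) and compares it against a device’s typical error; the squad values and the 2 cm error are invented for the example.

```python
import numpy as np

def smallest_worthwhile_change(baseline_scores: np.ndarray, factor: float = 0.2) -> float:
    """SWC as a fraction (conventionally 0.2) of the between-subject SD."""
    return factor * float(np.std(baseline_scores, ddof=1))

# Hypothetical squad CMJ heights (m) and a device typical error from test-retest data (m)
squad_cmj = np.array([0.28, 0.31, 0.33, 0.35, 0.37, 0.40, 0.44])
device_typical_error_m = 0.02

swc = smallest_worthwhile_change(squad_cmj)
verdict = "can detect" if device_typical_error_m < swc else "cannot confidently detect"
print(f"SWC = {swc * 100:.1f} cm, device error = {device_typical_error_m * 100:.1f} cm "
      f"-> the device {verdict} the smallest worthwhile change")
```

When the typical error exceeds the SWC, as in this example, a practitioner would need to see a change larger than the error band (or pool repeated trials) before calling it real; this is the practical consequence of a wider CV for jump height.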
There are some limitations to the present study. Firstly, to standardize jump technique and minimize repetitions with large forward or backward displacement (where the G-Flight laser would not be tripped), jumps performed with an arm-swing were not included, limiting the maximal jump heights examined. Therefore, future studies should examine the validity and reliability of very high jumps. While purely anecdotal, it is plausible that the G-Flight consistently overestimated jump height due to occasional forward displacement during jumps, combined with the toes contacting the force-plate before the heel. Similarly, the G-Flight could be tripped by the removal of the midfoot a few milliseconds before toe-off. As such, the G-Flight micro-sensor could be tripped slightly before or after the other technologies, and it is recommended that future studies utilizing the G-Flight instruct subjects to land flat-footed. Likewise, it is important to note that all jump metrics were calculated using flight time, rather than take-off velocity via the impulse-momentum method, a decision made to ensure a fair comparison between devices. However, readers should be aware of the inherent issues with flight-time calculations, including landing with excessively flexed knees and hips. Randomizing jump types and including extremely high and low performers could also have been beneficial, and examining jumps with an arm-swing would improve ecological validity. Finally, it should be recognized that while precedent exists for the specific variability cut-offs used in the present study [29], no universal consensus exists [17]. Therefore, practitioners may wish to apply their own inference scales.
Athletes, practitioners, and researchers can apply the findings of the present investigation in several ways. Depending on the variables of interest, time, money, and practical application, all of the technologies can be practically implemented. While Mo-Cap was the most valid and affordable technology, it also involved the greatest processing time. For teams with a small support staff or many athletes, the added hours of analysis may not be practical on a consistent basis.

The G-Flight slightly overestimated CMJ and SJ height and held the greatest variability, but it was both valid and reliable for contact time and RSI. Practically speaking, this technology was more accurate for time-sensitive metrics but significantly more variable than the force-plate for jump height between sessions. This is unsurprising considering that the G-Flight is based solely on flight time, and varying movement strategies will produce large variance in jump height for similar contact times. Since the G-Flight only offers a small number of variables, analysis of movement strategy is nearly impossible. However, quick processing time and ease of set-up make it a good tool for testing large groups efficiently, although some familiarization may be necessary to ensure proper foot placement and landing cues. Moreover, with novice athletes or athletes with little jumping experience, jump-height variation may be magnified, and coaches will need a larger shift in performance to account for the wider CV. For weekly readiness or monitoring needs, this device may not be sensitive enough to detect minute changes, whereas changes across weeks or months may be more easily recognized.

PUSH held moderate over- and under-estimations for jump height and contact time, respectively, but allows analysis of a myriad of different exercises. However, the cost of each additional unit, or the need to swap athlete set-ups mid-exercise, makes testing large groups difficult, making this technology more beneficial for in-depth analysis of individuals and small groups. Practitioners should consider the number of athletes, available processing time, athlete experience, and exercise demands before purchasing equipment.