Use, Validity and Reliability of Inertial Movement Units in Volleyball: Systematic Review of the Scientific Literature

The use of inertial devices in sport has become increasingly common. The aim of this study was to examine the validity and reliability of multiple devices for measuring jump height in volleyball. The search was carried out in four databases (PubMed, Scopus, Web of Science and SPORTDiscus) using keywords and Boolean operators. Twenty-one studies that met the established selection criteria were selected. The studies focused on determining the validity and reliability of IMUs (52.38%), on controlling and quantifying external load (28.57%) and on describing differences between playing positions (19.05%). Indoor volleyball was the modality in which IMUs were used the most. The most evaluated population was elite, adult and senior athletes. The IMUs were used both in training and in competition, mainly to evaluate the number of jumps, the height of the jumps and some biomechanical aspects. Validity criteria were established and the validity values for jump counting were good. The evidence on the reliability of the devices is contradictory. IMUs are used in volleyball to count and measure vertical displacements and/or to compare these measurements across playing positions and training, or to determine the external load of the athletes. They show good validity, although inter-measurement reliability needs to be improved. Further studies are suggested to position IMUs as measuring instruments to analyze jumping and sport performance of players and teams.


Introduction
The use of inertial motion units (IMU) has enabled sport scientists, coaches and athletes to obtain physiological, kinematic and spatial positioning data [1,2]. These data provide locomotor variables (e.g., distance, number of sprints, player load) [3], movement variables (e.g., velocity, acceleration) [4] and sport-specific patterns (e.g., player skills) [5]. These variables have been used to improve physical performance [6], monitor technical and tactical performance [7] and improve the injury prevention and recovery processes [8].
The main advantages of this type of technology may lie in its application in the real-life sport context [9], ease of use [10], large volume of stored information [11], real-time monitoring [12] and large data diversity [13]. In team sports, the data collected by these devices have generally focused on describing efforts to which athletes are subjected in competition and training [14]. To detail these efforts, the description of movement patterns [15], specific movements [16] or load indices extracted from the movements performed [17] have been used. The most analyzed variables have been heart rate [18], distances covered [19], speed of movements [20] and relative intensities [21]. In indoor team sports, the use of these variables has been mainly to describe training load [22].
In volleyball, the training and competition load alternates low and high intensity actions [23]. Of these actions, jumping is the most frequent high-intensity effort [24]. Therefore, training in this sport should have, as one of its objectives, the development of jumping as a specific capacity [25]. Thus, the assessment of jumping can provide significant information on the sporting and clinical needs of athletes [26]. Field and laboratory tests have been used for the assessment of jumps [27]. The use of technology has further refined jump analysis, primarily using video analysis methodologies [28] and biomechanical analysis [29]. These jump analysis methodologies allow for an accurate determination of height, velocity, force exerted and even joint angle parameters. For these variables, few systems report data in real time and from several athletes simultaneously.
With the development of IMUs, it has become possible to use this technology to measure the height and frequency of jumps [30]. A systematic review by Clemente [30] examined the validity and reliability of IMUs for measuring jumps in controlled laboratory situations. However, that review did not analyze the application of IMUs by sport, and it concluded that, of the sixteen devices that underwent reliability and validity testing, only seven showed acceptable results. The analysis of validity and reliability in ecological settings is essential, as one of the main uses of IMUs is in sporting contexts. This type of research is important to determine how accurately and precisely aspects of sport performance can be quantified using IMUs and to what extent their use is justified. Thus, although previous studies have assessed the validity and reliability of IMUs for measuring jumping, they were usually conducted in controlled laboratory situations and did not specify the type of sport. Although information on the use of IMUs in volleyball is limited, their usefulness is evident. Therefore, the aim of this study was to examine the validity and reliability of multiple devices for measuring jump height in volleyball, determining the degree of validity and reliability of these devices commonly used by coaches. The specific objectives of this research were: (a) to systematically identify scientific publications that have used IMUs as assessment devices in volleyball and (b) to analyze the use, validity and reliability of IMUs in this sport.

Method
This manuscript is a systematic review [31] of peer-reviewed scientific papers related to the use, validity and reliability of IMUs used to assess performance in volleyball. The Web of Science (Web of Science Core Collection, MEDLINE, Current Contents Connect, Derwent Innovations Index, KCI-Korean Journal Database, Russian Science Citation Index and Scielo Citation Index), PubMed, SPORTDiscus and Scopus electronic databases were searched on 30 December 2022 using the keywords "volleyball" and "accelerometry" or "accelerometer" or "gyroscope" or "inertial" or "sensor" or "wearable" or "measurement unit" or "wearable system" or "device" or "IMU" or "MEMS" or "microelectromechanical" and "jump" or "activity profiles" or "specific movements". The reference lists of the included studies were also reviewed to identify eligible studies that had not appeared through our search strategy. The same reference-list review was performed for the articles extracted from external sources. Any disagreements were resolved by consensus between two investigators (A.M.V. and A.S.L.) and arbitration by a third investigator (J.P.O.).
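The keyword combination above can be written as a single Boolean string. The following is only a minimal sketch: the grouping with parentheses (one OR group per concept, joined by AND) is an assumption, since the text does not state operator precedence, and the names `or_group` and `build_query` are hypothetical helpers rather than part of any database interface.

```python
# Keyword groups taken from the search strategy described above.
SPORT_TERMS = ["volleyball"]
DEVICE_TERMS = [
    "accelerometry", "accelerometer", "gyroscope", "inertial", "sensor",
    "wearable", "measurement unit", "wearable system", "device", "IMU",
    "MEMS", "microelectromechanical",
]
OUTCOME_TERMS = ["jump", "activity profiles", "specific movements"]


def or_group(terms):
    """Quote each term, join the terms with OR and wrap them in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(*groups):
    """Join the OR groups with AND (assumed grouping, not stated in the text)."""
    return " AND ".join(or_group(group) for group in groups)


QUERY = build_query(SPORT_TERMS, DEVICE_TERMS, OUTCOME_TERMS)
print(QUERY)
```

Each database has its own field tags and syntax, so a string like this would still need adapting before use.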
One investigator (A.S.L.) was in charge of conducting the electronic searches, identifying relevant studies and extracting the data in a standardized and non-pooled manner. The systematic review was performed according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines [32] and guidelines for conducting systematic reviews in sport science [33] (Figure 1). In the present review, the inclusion criteria were: (1) the sample includes only volleyball athletes of any level, age and gender; (2) IMUs were used for data collection; and (3) original articles only, from the field of sport sciences. All included studies were deemed to have had appropriate ethical approval by a competent review committee. Studies were excluded if: (a) the sample involved other athletes in addition to volleyball players; (b) they used devices other than IMUs; or (c) the document was a review, letter to the editor, trial registration, protocol proposal, editorial, book chapter or conference abstract, or any document not related to the field of sport sciences.

Data Extraction and Analyzed Variables
The Cochrane Consumers and Communication Review Group's data extraction protocol [34] was used to group four characteristics of the studies: (a) methodological characteristics, (b) substantive characteristics, (c) validity characteristics, and (d) reliability characteristics.
Results describing "methodological" characteristics detailed: modality, design, subjects, level, gender and age, commercial name of the IMUs, technical characteristics, context of the studies and variables analyzed. The "substantive" characteristics detailed: quality, objectives, results and applications. "Validity" characteristics described: context, criterion instrument, variables identified, validated instrument, statistical analysis and value. "Reliability" characteristics detailed: criterion instrument, validated instrument, algorithm, measured variables, statistical analysis and value. First, one researcher (A.S.L.) extracted the data from the included studies and a second researcher (A.M.V.) then checked the extracted data. Disagreements were resolved by consensus.

Quality of the Studies
Two authors (A.M.V. and A.S.L.) analyzed the risk of reporting bias of the selected studies using an adapted version of the STROBE evaluation criteria, as in other studies such as the one by O'Reilly et al. [35]. Following this methodology, each article was evaluated using 10 specific items (listed at the bottom of Table 1). Any disagreement in the evaluation of a study and/or item was discussed and resolved by consensus between the two authors. The study rating was interpreted qualitatively following O'Reilly et al. [35]: a study rated from 0 to 7 points was considered of low quality, while a study rated from 8 to 10 points was considered of high quality.
Note: provide an informative and balanced summary of the methods conducted and the main findings (item 1); establish specific objectives and hypotheses (item 2); indicate the inclusion/exclusion criteria, as well as the sources and methods of selection of the participants (item 3); provide data sources and details of the evaluation methods for each variable of interest; describe the comparability of the evaluation methods (item 4); explain how quantitative variables were used; describe and justify which groups were chosen (item 5); present the characteristics of the study participants (item 6); summarize the key results in a manner consistent with the objectives of the study (item 7); analyze and present the limitations of the study; discuss both the direction and the magnitude of any potential bias (item 8); offer a cautious interpretation of the results (item 9); and indicate the source of funding and the role of the funders of the study (item 10).
The methodological risk of bias was assessed using the methodological index for nonrandomized studies (MINORS) by two authors (A.M.V. and A.S.L.) [55]. MINORS comprises twelve items, four of which are only applicable to comparative studies. Each item is scored 0 when the criterion is not reported in the article, 1 if it is reported but not sufficiently met, or 2 when it is adequately met. Higher scores indicate a good methodological quality of the article and a low risk of bias. Therefore, the highest possible score is 16 for non-comparative studies and 24 for comparative studies. MINORS has provided acceptable inter- and intra-rater reliability, internal consistency, content validity, and discriminant validity [55,56].
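The two scoring rules described in this section can be sketched as follows. This is an illustration only: the function names are hypothetical, and scoring each adapted STROBE item as 0 or 1 is an assumption inferred from the 0 to 10 point range, not something stated explicitly in the text.

```python
def strobe_quality(item_scores):
    """Classify a study from ten adapted-STROBE item scores.

    Assumption: each item is scored 0 (not met) or 1 (met), so the total
    ranges from 0 to 10. Totals of 8-10 are 'high', 0-7 are 'low'.
    """
    if len(item_scores) != 10 or any(s not in (0, 1) for s in item_scores):
        raise ValueError("expected ten item scores, each 0 or 1")
    total = sum(item_scores)
    return total, "high" if total >= 8 else "low"


def minors_score(item_scores, comparative):
    """Return (total, maximum) for a MINORS assessment.

    Items are scored 0, 1 or 2. Non-comparative studies use 8 items
    (maximum 16); comparative studies use all 12 items (maximum 24).
    """
    expected_items = 12 if comparative else 8
    if len(item_scores) != expected_items:
        raise ValueError(f"expected {expected_items} item scores")
    if any(s not in (0, 1, 2) for s in item_scores):
        raise ValueError("each MINORS item must be scored 0, 1 or 2")
    return sum(item_scores), 2 * expected_items


# Hypothetical ratings, for illustration only.
print(strobe_quality([1, 1, 1, 1, 1, 1, 1, 1, 0, 0]))
print(minors_score([2, 2, 1, 2, 1, 2, 0, 2], comparative=False))
```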

Identification and Selection of Studies
The process of search, identification and selection of studies is illustrated in Figure 1.

Methodological Quality
The overall reporting risk of bias of the cross-sectional studies can be found in Table 1. The results of the methodological risk of bias of the articles included in this review can be found in Table 2.
Note: The MINORS checklist assesses the following items (2 = high quality; 1 = medium quality; 0 = low quality): clearly defined objective (item 1); inclusion of consecutive patients (item 2); information collected retrospectively (item 3); assessments adjusted to objective (item 4); evaluations carried out in a neutral way (item 5); follow-up phase consistent with the objective (item 6); dropout rate during follow-up less than 5% (item 7); a control group having the gold standard intervention (item 8); contemporary groups (item 9); baseline equivalence of groups (item 10); prospective calculation of the sample size (item 11); and appropriate statistical analysis (item 12).
Table 3 shows the methodological characteristics of the studies. Of the 22 studies found, 19 studies (90.47%) correspond to the indoor modality and only 3 (9.53%) to the beach modality. The number of subjects who participated in the studies and who used IMUs in competition, training or evaluation sessions ranged from 5 subjects [44] to 115 subjects [36]. Regarding the level of the participants, most samples came from elite teams (90.47%), with the remainder from local-level teams (9.37%). The age of the participants was between 16.1 and 27.6 years; 57.17% of the studies assessed men, 23.77% assessed women and 19.06% assessed both sexes.
Regarding the objectives of the studies, 52.38% focused on determining the validity and reliability of the IMUs. A total of 28.57% focused on controlling and quantifying the external load and 19.05% on describing differences between playing positions. The studies that aimed to compare the jumps recorded by the IMUs with variables such as internal load [45,54], playing position [24,25,44], training [44] and/or match play [25] are presented in Table 4.

Table 4 (excerpt). Objectives, results and applications of IMUs in the included studies.

Study: [row incomplete]
Applications of IMUs: Caution to control external training load, taking into account landings.

Study: de Leeuw, 2022 [38]
Objective: Identify and correlate injury risks through external load and wellbeing indicators in a season.
Results: 70% of players indicating "difficulty in training" were related to jumping loads; high differences between players.
Applications of IMUs: Caution to use jumping frequency as a predictor of injury if thresholds are not individualized.

Study: [row incomplete]
Applications of IMUs: Useful for control and individualization of external load; caution in assessing heights.

Study: Gielen, 2022 [40]
Objective: Determine the relationship between internal and external load over the course of a season.
Results: Significant correlations between maximum accelerations and maximum HR in the warm-up jumps (p = 0.62/0.49), not significant in the game; high correlation between activity and average HR in matches (p = 0.67).
Applications of IMUs: Usefulness for external load control; caution with the relationship between external and internal load.

Study: Jarning, 2015 [41]
Objective: Determine whether acceleration measured with an accelerometer identifies jumps.
Results: The serve and the smash could not be distinguished from movements without jumping (p = 0.422 and 0.999).
Applications of IMUs: The methodology used is not useful for jump counting.

Study: Joao, 2021 [42]
Objective: Quantify the external load of players.
Results: Difference between playing positions in external load parameters (p = 0.000) and in jump height between sets (p = 0.004).
Applications of IMUs: Usefulness for external load and fatigue monitoring in competition.

Study: Kupperman, 2021 [43]
Objective: Quantify external and internal load in a season and describe differences between playing positions.
Results: High correlation between RPE and IMU data (p ≤ 0.001); significant differences in IMU data between playing positions (p ≤ 0.001/>0.004).
Applications of IMUs: Usefulness for monitoring and individualization of training load and fatigue.

Study: Lima, 2019a [25]
Objective: Describe jumps in playing positions and sets.
Results: Difference between positions and types and intensities of jumping; no differences in heights between sets.
Applications of IMUs: Usefulness to control and individualize the external training load.

Study: Setuain, 2021 [52]
Objective: Evaluate vertical jump mechanics before and after a controlled load (volume and intensity) of a training session.
Results: A 10% decrease in post-training vertical ground reaction force was observed (p = 0.02).
Applications of IMUs: Useful for controlling fatigue through jumping ability.

Study: [row incomplete]
Applications of IMUs: Utility for external load control; caution in assessing jump heights.

Study: Skazalski, 2018b [53]
Objective: Compare jumps and playing positions.
Results: Setters performed more jumps; opposites performed more high-intensity jumps.
Applications of IMUs: Usefulness for control and individualization of external training load.

Study: Vlantes, 2017 [54]
Objective: Describe internal and external loads and relate them to each other.
Results: Differences between playing positions in internal and external load (p < 0.01); difference between sets of matches (p < 0.05).
Applications of IMUs: Usefulness for individualization of training load.

In this review we found seven studies [10,24,39,41,47,50,51] that used a criterion of concurrent validity of measures. To do so, they compared the data collected by the device with data collected visually. This visual inspection was performed by one [39,41,47,50] or two expert judges [10,24,51] and all these studies were conducted retrospectively.
As for the commercial device that underwent the most validity testing, the results of this review indicate that it was the Vert device [10,24,36,37,47,51]. Overall, this device was found to be reliable for measuring jump height. When this device was compared to a video camera motion analysis system [10], it showed good to excellent correlations (0.879-0.998). Table 6 shows the results of the reliability characteristics of the studies.

Discussion
The objectives of this study were: (a) to systematically identify the scientific publications that have used IMUs as assessment devices in volleyball and (b) to analyze the use, validity and reliability of IMUs in this sport.

Methodological Characteristics
Regarding the use of IMUs, a predominance in indoor environments compared to outdoor scenarios stands out. These results follow the same trend as other research conducted in volleyball that focused on analyzing other variables such as injuries [57] and training methods [58]. This may be because beach volleyball is a novel modality compared to indoor volleyball, and its body of knowledge does not reach similar volumes. Associated with this, a greater popularity of indoor volleyball and thus a greater number of participants may justify more interest and evidence in this modality. The use of IMUs, however, may be equally beneficial in both modalities. In turn, the number and characteristics of the participants in which IMUs were used are quite heterogeneous. In terms of age, few studies evaluated juveniles. Additionally, no studies were found in children. The use of IMUs in youngsters and children could be very useful, as it would provide information on the characteristics of competition in formative stages, which would help to adapt training methods. Gender differences can be explained by the bias that exists in science [59]. The use of IMUs would serve women and men equally well, and a gender comparison of the data could help to understand similarities and differences in performance and physical demands.
Regarding the level of the teams evaluated, the use in elite teams stands out. In most studies, data were acquired in training and competition [24,25,38,39,42-45,49,53,54]; others collected data in the laboratory [36,37,46] or in structured practices [50]. Some authors have combined these contexts depending on the aims of the studies [10,41,47,51]. In controlled laboratory conditions, it is easier to standardize protocols and performances; however, in sport settings, real data are obtained from the demands of competition.
On the technical aspects of IMUs, the results show that the most commonly used type of inertial sensor was the 3D accelerometer. Regarding the variety of devices available on the market, six were identified (Vert, Catapult, Suunto, Shimmer, Blast and Zephyr BioHarness). The most widely used (Vert) is a device specifically designed and marketed for volleyball [10], which explains its recurrence among the studies. The other devices can be adapted to any sport and provide data on mechanical and functional capacities, as well as external loading.
Regarding the placement of the sensors, the central area of the body is the most used, specifically the iliac crest, the sternum, the lumbopelvic area, the thoracic vertebrae and the scapula, with a few studies placing sensors on the ankles and tibia. The iliac crest is the body area where the use of these devices has been most validated for volleyball [10,25,36-38,44,45,47,49,51,53]. One possible reason is that, since the devices are designed for jump quantification, the iliac crest represents a central area of the body and therefore concentrates much of the athlete's mass. Previous studies have validated devices placed on the iliac crest by comparing them with values obtained in CMJ tests [24]. The placement of IMUs should not limit movement or cause discomfort for athletes. In fact, placing the device on the back, near the scapulae, provides security, as it prevents the device from detaching and even minimizes the risk of injury to the athlete [54].
Finally, the variables collected in the majority of studies were jump count (77%, n = 17) and height (63%; n = 14). Thirty-two percent (n = 7) of the studies combined variables derived from count, height and time. Two studies used algorithms to express external load indices [24,54]. In this sense, the monitoring of jumps in volleyball seems to be an indicator of the greatest interest for coaches. The monitoring of jumps provides relevant data for coaches to control the training load and the athlete's performance [60,61]. However, it is debatable whether jump count and height are sufficient estimators to understand the training load. In this regard, algorithms have been proposed that combine count, height, travel speed and athlete mass [24]. In some devices, these load indices have acceptable validity and reliability as a measure of load [62]. However, due to the specific characteristics of volleyball and each playing position, they must be specifically validated. Additionally, and due to the individual characteristics of each athlete, they should be combined with internal load measures [45,54] to have a more accurate value of the load to which the athlete is subjected.

Substantive Characteristics
The aims of the studies focused on three aspects: determining the validity and reliability of IMUs (52.38%), monitoring and quantifying external load (28.57%) and describing differences between playing positions (19.05%). In other field sports [30,63,64], the use of IMUs has principally focused on monitoring training load, detecting risks of overtraining and assessing sport performance. For example, the use of devices to monitor external load has shown a positive relationship with internal load in training and competition [42,43]. Thus, Lima et al. [45] have found high relationships between number of jumps and RPE. It has also been useful to monitor performance during matches by controlling the number of jumps between sets [25,44]. The use of IMUs has allowed the identification of differences between playing positions in terms of the number of jumps and the height reached in the jumps [44,49]. The middle blocker recorded the highest number of jumps, while the setter recorded the lowest number [49]. Furthermore, as observed in the study by Bahr et al. [65], there are also sex differences in the total number of jumps recorded during training and matches in young elite volleyball players. All of the above shows possible practical applications to determine and individualize the training load and to use this information to improve performance and control the risk of injury.

Validity of IMUs in Volleyball
Studies that examine the validity of IMUs are important as they reflect the degree to which an instrument is representative of the variable it is intended to measure. The most commonly used criterion of validity was concurrent [10,24,39,41,47,50,51]. The comparison of data obtained by technological devices with data from visual observation is a widely used technique for device validation. In this sense, studies by MacDonald et al. [47], Gageler et al. [39] and Charlton et al. [10] correctly identified 97-99% of volleyball-specific jumps in comparison to other movements (e.g., displacements, hits, serves, etc.). This method compares the frequency of jumps detected by both visual inspection and IMU (true positives), records detected by visual inspection but not by IMU (false negatives) and records not detected by visual inspection but detected by IMU (false positives). In all studies, the percentage of true positives was above 95% [10,39].
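The comparison between visual inspection and IMU counts can be summarized with two standard ratios. The sketch below is illustrative: the function name is hypothetical and the counts are made-up numbers, not data from any of the reviewed studies. Jumps seen by the observers but missed by the IMU are treated as false negatives, and IMU detections with no observed jump as false positives.

```python
def detection_metrics(true_positives, false_negatives, false_positives):
    """Summarize jump detection against visual inspection.

    sensitivity: share of observed jumps that the IMU also detected.
    precision:   share of IMU detections that were real, observed jumps.
    """
    observed_jumps = true_positives + false_negatives
    imu_detections = true_positives + false_positives
    if observed_jumps == 0 or imu_detections == 0:
        raise ValueError("need at least one observed jump and one detection")
    return {
        "sensitivity": true_positives / observed_jumps,
        "precision": true_positives / imu_detections,
    }


# Hypothetical session: 100 observed jumps, 97 detected, 2 spurious detections.
metrics = detection_metrics(true_positives=97, false_negatives=3, false_positives=2)
print(metrics)
```

Reporting both ratios matters because a device can detect nearly every jump while still counting other movements as jumps.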
Only the study by Jarning et al. [41] did not differentiate jumping in the serve and smash from other movements. This may be because only acceleration data were used and the algorithm did not allow differentiation. Regarding the types of jumps observed by visual inspection and counted by IMUs, a comparison between studies is difficult because the definitions of jumps differ (e.g., Charlton et al. [10] and MacDonald et al. [47]) and, in the studies where specific jumps were observed (e.g., spike, block, serve, etc.), the definitions were not reported. The absence of, or differences in, operational definitions of the actions that are observed and quantified is one of the main problems to be solved in future research, with the aim of providing greater logical and content validity [66]. Future studies should also define the collected variables precisely, describe the reliability of the observations in the visual inspection and explain the data control process [67].

Reliability of IMUs in Volleyball
As in the validity studies, the criterion for establishing the reliability of the IMUs and determining accuracy was to compare them with data obtained with a gold-standard instrument and to analyze the agreement between them. The results of this systematic review indicate that the instruments used as criteria for measurement comparison ranged from mechanical devices, such as the Vertec [24,36,48], to force platforms [24,39,50], video camera analysis systems [10,47] and other IMUs [10,24,37]. In this sense, the criterion instrument should present evidence of proven reliability, so the use of the Vertec and other IMUs as "gold standard" criterion instruments should be treated with caution [68,69].
In terms of the statistical techniques used to establish the agreement of the measurements, Pearson's correlation coefficient (r) stands out. In the study by MacDonald et al. [47], strong correlations were also observed between Vert and a 3D-motion analysis video system (r: 0.88-0.89), with narrower limits of agreement (−6.1 to 9.8 cm). This may be because this work used a laboratory jumping protocol (CMJ) and elite athletes. However, in this work the device underestimated the maximum jump height by 2.5 cm compared to the reference method. The authors of this study (MacDonald et al. [47]) stated that Vert could not detect small changes in performance given the standardized standard errors. In this sense, the devices should be sensitive enough to identify small changes in jump height. A more recent study [51], which compared the results of the Vert device with data obtained on a force platform, reported findings similar to those of Charlton et al. [10] and MacDonald et al. [47], with a mean error of 3.02-3.13 cm and limits of agreement of 7.65-6.60. Vert has utility for quantifying jumping load during training and competition in volleyball, but further studies are needed to make generalizations regarding the use of Vert to assess changes in jumping performance [30].
However, Pearson's correlation coefficient, used in some studies [10,24,36,39,47], is not the most appropriate statistic for determining agreement between devices. In fact, the intra-class correlation coefficient has characteristics that make it a better estimator (e.g., Schmidt et al. [51]; Damji et al. [37]; Markovic et al. [46]; Schleitzer et al. [50]; Montoye et al. [48]). Additionally, Bland-Altman statistics are highlighted as a means to analyze the limits of agreement between devices (e.g., Schmidt et al. [51]; Damji et al. [37]; Markovic et al. [46]; Schleitzer et al. [50]; Montoye et al. [48]). It is important to include in the statistical analysis the calculation of the minimum detectable change (e.g., Skazalski et al. [24]) as an estimator of the minimum degree of difference needed to determine whether there are differences between the two measuring instruments [70]. It would therefore be desirable for a reliability analysis to include a set of statistics that provide information on both the level of agreement and the magnitude of errors.
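As a rough illustration of two of the statistics named above, the sketch below computes Bland-Altman bias and 95% limits of agreement, and a minimum detectable change derived from the standard error of measurement. The function names and jump heights are hypothetical, and the formulas used (bias ± 1.96 SD of the differences; SEM = SD × √(1 − ICC); MDC95 = 1.96 × √2 × SEM) are the commonly used textbook forms, which the reviewed studies may or may not have applied in exactly this way.

```python
import math
import statistics


def bland_altman(device, criterion):
    """Return (bias, lower limit, upper limit) of agreement at 95%."""
    if len(device) != len(criterion) or len(device) < 2:
        raise ValueError("need two paired samples with at least two values")
    differences = [d - c for d, c in zip(device, criterion)]
    bias = statistics.mean(differences)
    sd = statistics.stdev(differences)  # sample SD of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def minimum_detectable_change(between_subject_sd, icc):
    """MDC95 from the standard error of measurement (SEM)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    sem = between_subject_sd * math.sqrt(1.0 - icc)
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical jump heights in cm (IMU vs. criterion instrument).
imu_heights = [50.0, 52.0, 48.0, 55.0]
criterion_heights = [48.0, 51.0, 47.0, 52.0]
print(bland_altman(imu_heights, criterion_heights))
print(minimum_detectable_change(between_subject_sd=5.0, icc=0.91))
```

Unlike a correlation coefficient, these values are expressed in the measurement unit, so they show directly whether a given change in jump height exceeds the measurement error.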
In general, the results of studies that analyze the reliability of measurements show that devices have a measurement error in quantifying jump height, in some cases overestimating it [24] and in others underestimating it [10,24,47]. These differences may be due to the methodology and instruments used to measure jump height. It is therefore difficult to make comparisons between studies, as many of them do not explain the method used to establish vertical displacement, which would allow possible systematic measurement errors to be detected. Of all the studies found, only three [39,46,50] detail the algorithm used to calculate the jump distance. It is understandable that commercial brands do not disclose the mathematical calculations for estimating this height. However, knowledge of these calculations would help to understand one of the possible causes of technological error. Specifically, it would help to identify the systematic error of the measurement and its possible solutions.
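To illustrate why the calculation method matters, the sketch below uses the flight-time equation, h = g·t²/8, a widely used way of estimating jump height. This is an assumption chosen for illustration: the text does not state that any of the reviewed devices uses this method, and the flight times are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2


def jump_height_from_flight_time(flight_time_s, g=G):
    """Estimate jump height (m) from flight time (s).

    Assumes take-off and landing occur at the same height and posture, so
    the rise lasts half the flight time: h = 0.5 * g * (t / 2) ** 2.
    """
    if flight_time_s < 0:
        raise ValueError("flight time must be non-negative")
    return g * flight_time_s ** 2 / 8.0


# Hypothetical example: effect of a 10 ms timing error on the estimate.
reference = jump_height_from_flight_time(0.50)
with_timing_error = jump_height_from_flight_time(0.51)
print(f"height error: {(with_timing_error - reference) * 100:.2f} cm")
```

Under this method, a small and consistent error in detecting take-off or landing would shift every height estimate in the same direction, which is one way a systematic error could arise.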
Overall, studies suggest that the devices have high sensitivity for detecting jumps, albeit with considerable errors in height estimation. These errors can be important when the aim is to detect changes in athletes' performance. Nevertheless, if the measurement error is known, the devices remain useful and provide benefits in real environments [71].

Conclusions
In general, it can be concluded that the studies conducted in volleyball using IMU devices have aimed to validate and measure the reliability of these devices for counting and measuring vertical displacements and/or comparing these measures with the playing position, training or determining the external load of the athletes. Validity measures for jump counting have been shown to be good to excellent, while reliability measures for height estimation have shown conflicting data. When the devices are used in real-world settings, they have proven to be reliable tools for quantifying and individualizing training load.

Limitations of the Paper and Future Approaches
Beyond the variables covered in this research, knowledge of the magnitude and direction of the applied force is also important in volleyball. Usually, a combination of multiple (two or three) uniaxial accelerometers with an IMU is used to detect these variables, but their high cost, size and overall system complexity make routine use difficult [72,73]. Future research could focus on establishing the reliability and validity of these variables.
The study's findings highlight the relevance of considering the recording system used to analyze kinematic data in volleyball, especially among senior players. The use of IMUs in youth and children could be very useful, as it would provide information on the characteristics of competition in formative stages, which would help to adapt training methods.

Data Availability Statement:
The datasets generated from the study are available from the corresponding author on reasonable request.

Conflicts of Interest:
The authors declared no conflict of interest regarding the publication of this manuscript.