Validity and Reliability of Mobile Applications for Assessing Strength, Power, Velocity, and Change-of-Direction: A Systematic Review

This systematic review aimed to (1) identify and summarize studies that have examined the validity of apps for measuring human strength, power, velocity, and change-of-direction, and (2) identify and summarize studies that have examined the reliability of apps for measuring human strength, power, velocity, and change-of-direction. A systematic review of Cochrane Library, EBSCO, PubMed, Scielo, Scopus, SPORTDiscus, and Web of Science databases was performed, according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. From the 435 studies initially identified, 23 were fully reviewed, and their outcome measures were extracted and analyzed. In total, 11 mobile applications were analyzed and summarized for their validity and reliability to test movement velocity, movement time, movement displacement, power output, and workload. The present systematic review revealed that the tested apps are valid and reliable for measuring bar movement velocity during lower and upper body resistance exercises; however, systematic bias was detected with heavier loads.


Introduction
Performance and fitness assessments are common processes related to the individualization of training [1][2][3][4]. Different physical qualities can be screened in a fitness assessment battery [5][6][7][8]. The most typical ones are related to neuromuscular-related qualities, with strength and power [9][10][11], velocity [12,13], and change-of-direction [14,15] being the most prevalent. Typically, strength is assessed considering the lifted load or the velocity at which the load is lifted [16][17][18]. In the case of neuromuscular power (or impulse), not only is weightlifting monitored but so are other movements for which height, flight time, or contact time are considered (e.g., jumping) [19,20]. For assessing strength and power, dynamometers [21,22], linear transducers [23,24], optoelectronic systems [25,26], or force plates [27,28] are usually used to measure the movements and their intensity [29]. In the case of running velocity (sprinting) or change-of-direction tests, the time of movement between two points is usually the common outcome [30]. Photocells and timers are considered gold standard instruments for measuring this parameter [31,32].
Such assessments are typically performed in a laboratory or field-based context. However, the cost of some gold standard instruments can prevent the widespread adoption of performance or fitness assessments by strength and conditioning coaches across different economic contexts and practical scenarios [33]. Nevertheless, continuous improvements in the sensors and tools included in mobile devices have made it possible to develop mobile applications (apps) that serve as alternatives to gold standard instruments [34]. In fact, the development of apps for the sports sciences is ongoing, providing a wide range of opportunities to those with limited access to expensive or gold standard instruments [35].
As mentioned, typical outcomes related to strength and power, velocity, and change-of-direction actions have focused on the velocity, time, or displacement of a movement [36]. In principle, these outcomes can be measured using image-based or video-based analyses that rely on smartphone cameras [37][38][39]. Although they are not automatic, a wide range of apps offer simple and user-friendly processes for collecting and processing data. However, this does not remove the need for a human operator to perform the operations, which might increase the risk of inaccuracy or imprecision. Therefore, a growing number of original studies have tested the validity and reliability of these sports science apps [40,41], aiming to determine whether they can be used for performance and fitness assessments.
Mobile applications have a wide range of uses and are frequently employed by sports scientists, strength and conditioning coaches, and practitioners to measure physical conditioning [42]. The limited accessibility of gold standard measurement devices, and their much higher cost relative to mobile applications, favors the adoption of apps by these professionals [33]. Practitioners use apps to measure various physical parameters [3,34]; for example, balance [43], distance [44], and physical activity [45]. In addition, research has reported that the use of mobile applications can improve physical fitness by increasing physical activity levels [46].
Evidence on the use of sports science apps has been systematized in some recent systematic reviews [47][48][49]. However, no study (as far as we know) has analyzed the validity and reliability of fitness and performance assessment apps. This is of paramount importance, since the inaccurate use of these systems when interpreting human performance could lead to inadequate decisions related to training design. In fact, if variation in performance is due to the inaccuracy or imprecision of the systems, the interpretation of results will not be appropriate.
For that reason, it is important to summarize the evidence regarding the validity and precision levels of sports science apps for measuring human strength, power, velocity, and change-of-direction capacities. Therefore, the purpose of this systematic review was two-fold: (1) to identify and summarize studies that have examined the validity of apps for measuring human strength, power, velocity, and change-of-direction, and (2) to identify and summarize studies that have examined the reliability of apps for measuring human strength, power, velocity, and change-of-direction.

Materials and Methods
The systematic review strategy was conducted according to PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines [50]. The protocol was registered with the International Platform of Registered Systematic Review and Meta-Analysis Protocols with the number 202110089 and the DOI number 10.37766/inplasy2021.1.0089.

Eligibility Criteria
The inclusion and exclusion criteria can be found in Table 1.

(1) Inclusion: Test of a mobile application in sport and exercise. Exclusion: Instruments other than mobile applications (e.g., computer software).
(2) Inclusion: Tests conducted on healthy athletes or recreationally active healthy adults for strength (e.g., resistance training exercises/movements), power (e.g., jumping, lifting movements), velocity (e.g., linear sprinting), and change-of-direction. Exclusion: Tests not conducted on athletes (e.g., pregnant women, the elderly) or on healthy active adults (i.e., injured participants) for strength-, power-, velocity-, and change-of-direction-related movements (e.g., assessment of instruments without human action involved).
(3) Inclusion: Estimation of movement velocity, movement time (e.g., difference in time to complete a movement), or movement displacement (e.g., jump height). Exclusion: Estimation of outcomes other than movement velocity, movement time, and movement displacement.
(4) Inclusion: For validity, the apps were compared to a recognized gold standard: (i) movement velocity (e.g., radar gun; isoinertial dynamometer consisting of a cable-extension linear position transducer; optoelectronic system); (ii) movement time (e.g., photocells); (iii) movement displacement (e.g., force plates, optoelectronic system). Exclusion: For validity, the apps were not compared with recognized gold standard methods, or were compared only with other apps.
(5) Inclusion: For validity, at least one of the following measures was reported: (i) typical error; (ii) mean absolute error; (iii) correlation coefficient; or (iv) standard error of the estimate. Exclusion: For validity, the outcomes presented were not typical error, mean absolute error, correlation coefficient, or standard error of the estimate.
(6) Inclusion: For reliability, at least one of the following measures was reported: (i) intraclass correlation coefficient; (ii) coefficient of variation; (iii) standardized typical error; or (iv) standard error of measurement. Exclusion: For reliability, the outcomes presented were not (i) intraclass correlation coefficient; (ii) coefficient of variation; (iii) standardized typical error; or (iv) standard error of measurement.
(7) Inclusion: Only original, full-text studies written in English. Exclusion: Studies written in a language other than English; article types other than original research (e.g., reviews, letters to the editor, trial registrations, protocol proposals, editorials, book chapters, and conference abstracts).
The title, abstract, and reference list of each study were independently screened by the two authors to locate potentially relevant studies. Additionally, they reviewed the full versions of the included papers in detail to identify articles that met the selection criteria. An additional search within the reference lists of the included records was conducted to retrieve further relevant studies. Discrepancies regarding the selection process were discussed and resolved with a third author (FMC and MRG). Possible errata for the included articles were considered.

Information Sources and Search
Electronic databases (Cochrane Library, PubMed, Scielo, and Web of Science) were searched for relevant publications prior to 16 January 2021. Keywords and synonyms were entered in various combinations in the title, abstract, or keywords as follows: ("sport*" OR "exercise*" OR "athletic performance" OR "physical performance" OR "movement*"), AND ("mobile app*" OR "app*" OR "smartphone" OR "iphone"), AND ("Validity" OR "Accuracy" OR "Reliability" OR "Precision" OR "Varia*" OR "Repeatability" OR "Reproducibility" OR "Consistency" OR "noise"), AND (power OR velocity OR strength OR "change of direction"). Additionally, the reference lists of the retrieved studies were manually searched to identify potentially eligible studies not captured by the electronic searches. Finally, an external expert was contacted to verify the final list of references included in this systematic review and to determine whether any study had been missed by our search. Possible errata were searched for each included study.

Data Extraction
A data extraction sheet was prepared in Microsoft Excel (Microsoft Corporation, Redmond, WA, USA) in accordance with the Cochrane Consumers and Communication Review Group's data extraction template [51]. The Excel sheet was used to assess the inclusion requirements and was subsequently tested on all selected studies. The process was independently conducted by the two authors. Any disagreement regarding study eligibility was resolved through discussion. Full-text articles excluded, with reasons, were recorded. All records were stored in the sheet.

Data Items
The following information was extracted from the included original articles: (i) validity measure (e.g., typical error, absolute mean error, correlation coefficient); and (ii) reliability measure (e.g., intraclass correlation coefficient [ICC] and/or typical error of measurement [TEM] (%) and/or coefficient of variation [CV] (%) and/or standard error of measurement [SEM]). Additionally, the following data items were extracted: (i) type of study design, number of participants (n), age-group (youth, adults or both), sex (men, women or both), training level (untrained, trained); (ii) characteristics of the apps and comparator (for the case of validity studies); (iii) characteristics of the experimental approach to the problem, procedures and settings of each study.

Methodological Assessment
Two authors performed the methodological assessment of the studies eligible for inclusion using an adapted version of the STROBE assessment criteria, as applied in O'Reilly et al. [52]. Hence, each article was evaluated using 10 specific criteria. Any disagreement was discussed and resolved by a consensus decision. The study ratings were qualitatively interpreted following O'Reilly et al. [52]: a study scoring from 0 to 7 was considered to have a high risk of bias (low quality), while a study rated from 7 to 10 points was considered to have a low risk of bias (high quality).

Study Identification and Selection
The database search identified a total of 435 titles (Cochrane = 117; PubMed = 108; Scielo = 70; Web of Science = 140). In addition, one article was added from external sources. These studies were then exported to reference manager software (EndNote™ X9, Clarivate Analytics, Philadelphia, PA, USA). The selection process can be observed in Figure 1.

Methodological Quality
The overall methodological quality of the cross-sectional studies can be found in Table 2. Note: the ten assessment criteria were: (1) provide in the abstract an informative and balanced summary of what was performed and what was found; (2) state specific objectives, including any prespecified hypotheses; (3) provide the eligibility criteria, and the sources and methods of selection of participants; (4) for each variable of interest, offer sources of data and details of methods of assessment (measurement), and describe the comparability of assessment methods if there is more than one group; (5) explain how quantitative variables were handled in the analyses and, if applicable, describe which groupings were chosen and why; (6) give the characteristics of study participants; (7) summarize key results with reference to study objectives; (8) discuss limitations of the study, considering sources of potential bias or imprecision, and discuss both the direction and magnitude of any potential bias; (9) give a cautious overall interpretation of results considering objectives, limitations, multiplicity of analyses, results from similar studies, and other relevant evidence; (10) provide the source of funding and the role of the funders for the present study and, if applicable, for the original study on which the present article is based.

Results of Individual Studies: Validity of Mobile Applications
Information on the validity levels obtained in the included studies can be found in Table 4. For the My Jump App and My Jump App 2, the correlation coefficient values of validity were between 0.926 and 0.995 [15,28,32,37,42]. For PowerLift and My Lift, the Pearson r values were r = 0.729-0.964 [18,30,34,45]. For the Ergo Arm Meter, the Pearson r value was r = 0.999 [36]. For the Smartphone Accelerometer, the Pearson r values were r = 0.54-0.93 [35,41]. For the SpeedClock App, the Pearson r value was r = 0.93 [33]. For the MySprint App, the SEE values were 0.007-0.015 m·s−1, and the Pearson r values were r = 0.989-0.999 [31]. For the ILoad App, the SEE values were 0.003-0.004 m·s−1, and the Pearson r values were r = 0.98-0.99 [29,47]. For the Stryd App, the SEE value was <7.3% and the Pearson r value was r = 0.911. Finally, for the CODtimer App, the SEE value was 0.03 s and the Pearson r value was r = 0.998.
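As a point of reference for how these two validity statistics relate, the sketch below computes Pearson's r and the SEE for a pair of hypothetical paired device readings. The data, sample size, and function names are illustrative assumptions, not values or methods from the included studies.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two paired samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def see(x, y):
    """Standard error of the estimate for y predicted from x
    (residual standard deviation of the simple linear regression)."""
    n = len(x)
    my = sum(y) / n
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / (n - 1))
    r = pearson_r(x, y)
    return sy * math.sqrt(1 - r ** 2) * math.sqrt((n - 1) / (n - 2))

# Hypothetical mean bar velocities (m/s): gold standard vs. app
gold = [0.45, 0.58, 0.71, 0.83, 0.95, 1.10]
app = [0.46, 0.57, 0.73, 0.82, 0.97, 1.08]
print(f"r = {pearson_r(gold, app):.3f}, SEE = {see(gold, app):.3f} m/s")
```

Note that a high r with a non-trivial SEE (as reported for some apps above) means the two devices rank lifts the same way while individual readings still deviate from the criterion.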

Results of Individual Studies: Reliability of Mobile Applications
Information on the reliability levels obtained in the included studies can be found in Table 5. For the My Jump App and My Jump App 2, the ICC values of reliability were 0.492-0.999 and the CV values were between 3.4% and 12% [15,19,32,37,42]. For the PowerLift and My Lift App, the ICC values of reliability were 0.70-0.989 [17,18,27,30,44,45] and the CV values were between 3.97% and 10.4% [17,27,30,34,44]. For the Ergo Arm Meter, the SEM value of reliability was <13.1°/s [36]. For the Smartphone Accelerometer, the ICC values of reliability were 0.634-0.99 [35,41]. For the MySprint App, the ICC value of reliability was 1 and the CV values were 0.027-0.14% [31]. For the ILoad App, the ICC value of reliability was 0.941 [47] and the CV values were between 5.61% and 9.79% [29]. For the Stryd App, the ICC value was ≥0.980, the CV value was ≥4.3%, and the SEM was 12.5 W. Finally, for the CODtimer App, the ICC values of reliability ranged from 0.671 to 0.840, and the CV values were between 2.2% and 3.2%.
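For readers less familiar with these reliability statistics, the typical error and CV can be obtained from two repeated trials. The sketch below uses Hopkins' difference-score method and hypothetical jump heights; it is one common way to compute these quantities, not necessarily the exact procedure of each included study.

```python
import math

def typical_error(trial1, trial2):
    """Within-subject typical error from two repeated trials:
    SD of the difference scores divided by sqrt(2)."""
    diffs = [b - a for a, b in zip(trial1, trial2)]
    n = len(diffs)
    md = sum(diffs) / n
    sd_diff = math.sqrt(sum((d - md) ** 2 for d in diffs) / (n - 1))
    return sd_diff / math.sqrt(2)

def cv_percent(trial1, trial2):
    """Coefficient of variation: typical error as a % of the grand mean."""
    grand_mean = (sum(trial1) + sum(trial2)) / (len(trial1) + len(trial2))
    return 100 * typical_error(trial1, trial2) / grand_mean

# Hypothetical CMJ heights (cm) for six athletes on two occasions
day1 = [31.2, 35.8, 28.4, 40.1, 33.6, 37.9]
day2 = [30.7, 36.4, 29.1, 39.5, 34.2, 37.2]
print(f"typical error = {typical_error(day1, day2):.2f} cm, "
      f"CV = {cv_percent(day1, day2):.1f}%")
```

A CV below roughly 10%, the threshold several of the included studies apply, indicates that repeated measurements vary little relative to the mean score.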

Discussion
The need to assess and monitor the physical and performance status of athletes has led sports professionals to use equipment that might not be available in some sports and health contexts. Therefore, the use of mobile apps for these purposes has been gaining interest among the sports and scientific communities. However, coaches need to be confident that these apps measure what they are supposed to measure, and that their measurements are consistent and repeatable over time.
From the 24 included articles, both validity and reliability were tested for 11 different apps. However, one of the articles [60] tested only reliability. This discussion is organized based on the aims of assessing each app, considering the different models used for the same measures.

Validity of Mobile Applications

Strength Apps
According to this systematic review, the Power Lift/My Lift app (which are the same) seems to be the most often used mobile app for assessing the strength status of humans. Furthermore, the studies revealed that, overall, the My Lift app is a valid tool for measuring displacement and velocity data based on different strength-based exercises.
Thompson et al. [68] compared linear position transducers (LPTs), inertial measurement units (IMUs), and the My Lift app using an iPhone 7 with a 3D capture system that records time displacement data. The authors found that the LPT system had the greatest validity, and that the My Lift app's validity (r ≥ 0.88) was similar to that of the LPT.
However, when using the My Lift app, the recorded data were limited to mean velocities [68]. Similarly, another study compared the My Lift app with a 3D capture system and found strong to very strong correlations between them for peak forward, backward, and vertical displacements, suggesting that the app is valid [40]. Furthermore, in contrast with the study of Thompson et al. [68], the peak vertical velocity from the My Lift app was analyzed, and had the greatest correlation with the gold standard equipment (r = 0.902), although it also had a higher standard error (SEE = 0.124 m·s−1) than the other displacement measures [40].
Interestingly, Courel-Ibáñez et al. [59] revealed a linear relationship (r = 0.920-0.939) between velocity outcomes derived from the My Lift app and a linear velocity transducer (LVT), considered by the authors as the gold standard device. However, the app produced absolute mean errors of 29.6% and 27.7% of 1RM, and SEEs of 0.117 m·s−1 and 0.08 m·s−1 for the bench press and back squat exercises, respectively. In fact, the same authors [59] suggested that the use of Pearson's correlation coefficients might not be appropriate for analyzing the validity outcomes of a device, especially for devices that measure sensitive variables, such as bar velocity.
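The caution about Pearson's correlation reflects that correlation quantifies association, not agreement: two devices can correlate almost perfectly while one consistently over-reads. Bland-Altman analysis addresses this by computing the mean bias and limits of agreement directly. A minimal sketch with hypothetical bar velocities (not data from the cited study):

```python
import math

def bland_altman(device_a, device_b):
    """Mean bias and 95% limits of agreement (mean diff ± 1.96 SD)."""
    diffs = [a - b for a, b in zip(device_a, device_b)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical mean velocities (m/s): app vs. linear velocity transducer.
# The app over-reads by a small, consistent margin, so r would be near 1
# even though agreement is imperfect.
app = [0.52, 0.66, 0.79, 0.90, 1.02]
lvt = [0.50, 0.63, 0.75, 0.85, 0.96]
bias, lo, hi = bland_altman(app, lvt)
print(f"bias = {bias:.3f} m/s, LoA = [{lo:.3f}, {hi:.3f}] m/s")
```

In this illustration the correlation is close to 1, yet the app reads systematically high, which only the bias term reveals.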
Notwithstanding the fact that, overall, the studies revealed acceptable validity of the My Lift app for measuring different displacement velocities across different exercises, most of the studies compared the My Lift app with different "reference" devices. In fact, while some authors refer to 3D capture systems as the gold standard device for velocity-based training (VBT) [40,68], others refer to LPTs as the gold standard [56]. Indeed, other studies noted that there is no evidence supporting the use of a 3D system as a reference device [59]. Therefore, more homogeneous study methodologies are needed to confirm these findings regarding the validity of the My Lift app.
In addition, two of the included studies tested the validity of the iLoad app [60,64]. Both studies compared the iLoad app with two different linear transducer systems. Despite the methodological differences between these two studies, the authors suggested that the app is a valid tool for measuring mean velocity during lower and upper body exercises. However, coaches need to manually operate the iLoad app when the exercise starts and stops, which may introduce human error.
Furthermore, two other studies used basic smartphone accelerometer data to assess the mean bar velocity of different strength exercises [56,69]. The study of Viecelli et al. [69] revealed that the accelerometer app had a strong correlation (r > 0.93; p < 0.05) and a small absolute mean error (0.16%) when compared to a video recording system. Conversely, the other study [56] compared the accelerometer app with an LPT, revealing a lower correlation with the "reference" device (r = 0.54) than Viecelli et al. [69]. Moreover, the authors suggested that the app may not be completely valid for measuring strength because meaningful differences were found in mean velocities with higher lifting loads >90% 1RM [56].
A relevant issue regarding the studies that analyzed the validity of strength apps is the fact that some of them used Smith machines to try to eliminate horizontal bar displacements during exercises [62,72], while others used free-weight-based exercises [40,68]. As such, one can argue that lower bias is expected in studies using fixed-bar exercises when compared to those using free weights. Therefore, professionals using VBT should rely on the validity of devices that were tested with an apparatus similar to the one they will use with their clients or athletes. Overall, the My Lift app seems to be the most often studied and valid option for measuring human strength.

Power Apps
Of the studies included in this systematic review, three tested the validity of the My Jump app [41,61,66] and two tested the validity of the My Jump 2 app for analyzing jump height [57] and reactive strength index (RSI) measures [37]. The validity of the My Jump app for measuring CMJ jump height was tested using an iPhone 5s. Good accuracy (r = 0.995, p < 0.001) and a mean absolute error of 1.1-1.3 cm were recorded when compared with a force platform that was considered the "gold standard" device [41]. Another study that compared the same app on an iPhone 6 with a contact platform revealed almost perfect correlations for height measures of CMJ, SJ, and DJ (from a 40 cm box), with a standard error of 0.1 cm for all slow and fast stretch shortening cycle jumps [61]. Further, Stanton et al. [66] revealed that the My Jump app had a strong correlation (r > 0.99, p < 0.001) with a force plate for both CMJ and DJ. Moreover, the study that tested the validity of My Jump 2, regarding jump height, revealed the app's validity (r = 0.98) when compared to a force platform and when compared to a yardstick apparatus [57].
When analyzing peak power using the My Jump app, an almost perfect correlation (r = 0.926) was found between the app and the Vertec jump system [71]. However, that same study showed a lower correlation (r = 0.813) when analyzing jump height [71]. This finding contrasts with the overall results of studies that revealed relatively high correlation values for jump height. In the study that analyzed the RSI measure using the My Jump 2 app, near-perfect correlations were found between the app and a force platform for the RSI values obtained for the 20 cm (r = 0.938) and 40 cm (r = 0.969) DJ heights [37]. However, the peak power measure revealed weak correlations for the 20 cm (r = 0.655) and 40 cm (r = 0.571) heights.
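The quantities these jump apps report can be derived from video frame counts alone. The sketch below shows the ballistic flight-time relation (h = g·t²/8) that video-based jump apps typically rely on, plus one common RSI definition (flight time divided by contact time; another common variant divides jump height by contact time). The frame counts and frame rate are illustrative assumptions, not values from the included studies.

```python
G = 9.81  # gravitational acceleration, m/s^2

def frames_to_seconds(n_frames, fps):
    """Convert a frame count from a video recording to seconds."""
    return n_frames / fps

def jump_height_m(flight_time_s):
    """Jump height from flight time for a ballistic jump: h = g * t^2 / 8."""
    return G * flight_time_s ** 2 / 8

def rsi(flight_time_s, contact_time_s):
    """Reactive strength index, flight-time / contact-time variant."""
    return flight_time_s / contact_time_s

# Illustrative drop jump filmed at 240 fps:
# 120 frames of flight, 48 frames of ground contact
flight = frames_to_seconds(120, 240)    # 0.50 s
contact = frames_to_seconds(48, 240)    # 0.20 s
print(f"height = {jump_height_m(flight) * 100:.1f} cm, "
      f"RSI = {rsi(flight, contact):.2f}")
```

Because height grows with the square of flight time, a small error in identifying take-off or landing frames is amplified in the estimated height, which helps explain why jump height is generally measured more accurately than derived quantities such as peak power.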
In summary, the My Jump and My Jump 2 apps are considered valid tools for assessing the vertical height and reactive strength index from different jump protocols using CMJ, SJ, and DJ. However, peak power assessments might not be as accurate as jump height assessments.

Velocity Apps
Regarding running performance, three different apps were included in the present systematic review [58,65,67]. The MySprint app was compared to timing photocells and a radar gun to test its validity [65]. The results suggested that the app is valid, as near-perfect correlations were recorded between the app and the timing photocells for 40-m sprint splits (standard error = 0.007-0.015 s). Further, the MySprint app showed almost perfect correlations with the radar gun for measures of the power, force, velocity, and mechanical properties of sprint performance [65,73]. However, the app requires the user to manually select the frames from the video recording, which can introduce error into its measurements.
The SpeedClock app showed excellent agreement when compared to timing lights, revealing only a slight bias between the two devices [67]. Although the SpeedClock app was determined to be a valid tool, this finding is based only on a 10-m flying sprint. Thus, the validity of this app for measuring sprint running performance above 10 m remains unknown. An issue that must be addressed is the fact that these apps are accessible on different smartphone brands and models, which record videos at varying frames per second; this could influence the accuracy and systematic bias of such apps. As such, it could be difficult to compare studies that test the validity of mobile apps for measuring running performance. Moreover, few studies have confirmed the validity of such apps in specific populations (e.g., athletes who participate in specific sports).
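The frames-per-second concern can be made concrete: a one-frame error in selecting the start or end of a sprint shifts the measured time by 1/fps, and over a short distance this translates directly into a velocity error. A sketch with illustrative numbers (not values from the included studies):

```python
def frame_time_resolution(fps):
    """Smallest time step resolvable from video: one frame period, in s."""
    return 1.0 / fps

def speed_error(distance_m, true_time_s, frame_error_frames, fps):
    """Absolute speed error if frame selection is off by a given
    number of frames over a timed distance."""
    true_speed = distance_m / true_time_s
    biased_time = true_time_s + frame_error_frames * frame_time_resolution(fps)
    return abs(true_speed - distance_m / biased_time)

# Illustrative 10 m flying sprint completed in 1.25 s (8 m/s)
for fps in (30, 240):
    err = speed_error(10, 1.25, 1, fps)
    print(f"{fps} fps: one-frame selection error -> {err:.3f} m/s")
```

Under these assumed numbers, a single misselected frame at 30 fps produces roughly an order of magnitude more speed error than at 240 fps, illustrating why the recording frame rate of the smartphone model matters for such apps.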
Furthermore, as the Stryd app assesses running power output, we have added this app to the velocity apps section [58]. The mentioned study tested and confirmed the validity of the app. The authors revealed that the power output measured by the Stryd app had strong associations (r = 0.911) with VO2max, which was obtained in a running-based incremental test. However, a standard error of 7.3% was found [58]. The same study also revealed that the Stryd app has the benefit of being connected with a sport watch. The literature is scarce regarding measures of power output using the Stryd app. For these reasons, future studies should focus on extending evidence of this app's validity to other populations and methodologies, as the mentioned study included a small sample of only 12 male endurance athletes.
Despite the scarcity of studies on the validity of running-based apps, all apps that have been analyzed in such studies have been considered valid for the measures of movement displacement, velocity, time, and power output. Nevertheless, the standard errors of such apps must be carefully considered, as the user must manipulate the apps manually, which could increase the probability of human errors, especially when velocity is being measured.

Change-of-Direction Apps
Of the studies included in the present systematic review, only one tested the validity of a mobile app for measuring change-of-direction performance [55]. It showed that the CODtimer app had a very high correlation (r = 0.998) and a standard error of only 0.03 s relative to timing gates for measuring change-of-direction total time [55]. Although that study demonstrated the validity of the app, the authors suggested that it might not be valid for change-of-direction tests other than those used in their study. For these reasons, future studies using the CODtimer app with different change-of-direction tests are needed to ensure the validity of the app in different situations.

Reliability of Mobile Applications

Strength Apps
Comparisons between the Power Lift/My Lift app and an LPT, a 3D motion capture system, and a 3-axis accelerometer, gyroscope, and magnetometer showed ICC values of up to 0.989 for the measures of bar mean velocity, peak vertical velocity, and peak forward and backward displacements [40,53,54]. However, none of these three studies included any information regarding the error of measurement or coefficients of variation of the app measurements. Other studies [39,59] that also compared the app with diverse LPTs revealed that, even though the My Lift app presented ICC values between 0.973 and 0.993, the coefficient of variation ranged from 5.02% to 10.4%. The authors of those two studies [39,59] did not recommend using this app due to its substantial systematic bias. Conversely, Pérez-Castilla et al. [62] found small systematic bias and lower ICC values (0.70) than the abovementioned studies. Moreover, when measuring bar velocity, Thompson et al. [68] found coefficients of variation of <10% (for loads up to 70% of 1RM) and >10% (for loads above 90% of 1RM).
Furthermore, only one of the included studies tested the reliability of the iLoad app [64]. In line with the abovementioned study of Thompson et al. [68] regarding the My Lift app, the study of Pérez-Castilla et al. [64] revealed acceptable reliability of the iLoad app when measuring bar velocity at lower 1RM percentages (the coefficients of variation ranged from 5.61% to 9.79%). However, when 1RM percentages were higher, the coefficient of variation values exceeded 10%, and the same pattern with similar values was found for the LPT system that the authors used in the same study [64]. For these reasons, professionals must be careful when using the iLoad app to measure bar velocity when heavier loads are involved, as the data extracted may be misleading. Moreover, using basic accelerometer data from a smartphone seems to offer acceptable reliability [56,69], although, once again, greater differences in mean bar velocity were found with heavier loads despite the good agreement with an LPT.
Velocity-based training (VBT) has been a topic of great interest given its practicability and ease of use. The most common equipment used for VBT seems to be LPTs and IMUs. However, these devices are expensive, and mobile apps are a potentially affordable, valid, and reliable alternative. Still, despite smartphone apps' ability to measure bar velocity with good validity, they show greater systematic bias than gold standard measures, especially considering that the user is required to manually select the frames of video recordings.

Power Apps
The reliability of the My Jump app has also been tested. After analyzing five CMJs, an almost perfect agreement was found (ICC = 0.999), with coefficients of variation of 3.4-3.6% for jump height using an iPhone 5s [41]. Another study also found an almost perfect agreement (ICC = 0.97-0.99) for DJ (from a 40 cm box), SJ, and CMJ heights when compared to a contact platform (coefficients of variation ranged between 3.8% and 7.6%) [61]. Stanton et al. [66] reported ICC values of 0.997 for CMJ and 0.998 for DJ heights. However, between-days systematic bias was detected for both CMJ and DJ mean values when the My Jump app was compared with a force platform. Moreover, the same authors [66] revealed that the force plate showed lower values than the app at higher CMJ jump heights, and higher values at lower jump heights. As for the DJ, the force plate produced higher values than the app at all jump heights [66].
Furthermore, when the My Jump app was used to analyze peak power, only moderate ICC values were recorded for both males and females, with a wider confidence interval (CI) for males than for females, spanning poor to excellent ICC values [71]. The same study [71] revealed only poor absolute agreement for both males and females for the jump height measure; however, the authors compared the My Jump app with the Vertec system, which is not considered a gold standard for assessing power performance.
Regarding My Jump 2, two studies analyzed the reliability of the app for jump height and RSI measures [37,57]. My Jump 2 revealed acceptable intra-rater reliability for detecting changes in jump height, with small variation between repeated tests [57]; however, the same study revealed that the app had only moderate reliability (CV = 6.7%) when compared with the gold standard force platform. The other study that used the My Jump 2 app [37] also revealed near-perfect agreement between the app and a force platform for DJ jump height, contact time at 20 cm, and RSI measurements for 20 cm and 40 cm DJ heights, although weak agreement was found for mean power. The RSI data extracted from the My Jump 2 app for the 20 cm DJ had lower variation (CV = 6.7%) than the RSI data for higher DJ heights [37]. Nevertheless, more studies need to be conducted on this new version of the app, as most studies have focused only on the first version. Overall, the My Jump app has been found to be a reliable tool for measuring jump height.
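For context on the RSI measurements discussed above: the reactive strength index relates how high an athlete rebounds to how briefly they contact the ground. A short sketch with hypothetical values (RSI definitions vary; jump height divided by contact time is one common form, flight time divided by contact time another):

```python
def reactive_strength_index(jump_height_m, contact_time_s):
    """RSI as jump height (m) / ground contact time (s) -- one common
    definition; flight time / contact time is also widely used."""
    return jump_height_m / contact_time_s

# Hypothetical 20 cm drop jump: 0.30 m rebound height, 0.22 s contact time
rsi = reactive_strength_index(0.30, 0.22)
print(f"RSI = {rsi:.2f}")
```

Since RSI divides one frame-derived quantity by another, its CV compounds the measurement error of both, which helps explain why RSI variation grows at higher DJ heights.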

Velocity Apps
There is a lack of studies on the reliability of running-based velocity apps. In one such study, the MySprint app, a radar gun, and photocells yielded ICC values of 0.987 and 1 for mechanical variables and time measures, respectively [65]. Moreover, the same study revealed that the app produced a very low coefficient of variation in repeated trials (similar to the values found for the photocells and radar gun) for time and mechanical measures [65]. Similarly, the Stryd app revealed almost perfect ICC values (<10% coefficient of variation) when used to measure running power output in both indoor and outdoor settings, highlighting its suitability for consistent use across environments. The use of these apps for measuring running-based velocity properties seems to be reliable, although more studies should be conducted to confirm this.
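App-based sprint timing works on the same frame-counting principle as the jump apps: split times come from manually identified start and finish frames, and mean velocity follows as distance over time. A minimal sketch with hypothetical values (a 240 fps slow-motion recording is assumed here purely for illustration):

```python
def split_time(start_frame, end_frame, fps):
    """Elapsed time (s) between two manually identified video frames."""
    return (end_frame - start_frame) / fps

def mean_velocity(distance_m, time_s):
    """Mean velocity (m/s) over a known distance."""
    return distance_m / time_s

# Hypothetical 10 m sprint timed from a 240 fps recording
t = split_time(start_frame=0, end_frame=444, fps=240)  # 444 frames -> 1.85 s
print(f"10 m time = {t:.2f} s, mean velocity = {mean_velocity(10.0, t):.2f} m/s")
```

At 240 fps a one-frame selection error is about 4 ms, which is consistent with the very low CVs reported relative to photocells and radar.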

Change-of-Direction Apps
The study of Balsalobre-Fernández et al. [55] revealed that the CODtimer app had near-perfect agreement with timing gates for measuring the total time in a change-of-direction test. The app presented ICC values (0.671-0.840) similar to those of the timing gates for repeated trials, along with similarly low coefficients of variation (2.2% to 3.2%). Interestingly, the same study revealed that the app had moderate reliability for the left limb and good reliability for the right limb, resulting in similar limb asymmetry values between the app and the timing gates. Although there is a lack of studies regarding change-of-direction apps, the CODtimer app can be an affordable choice for measuring change-of-direction ability when expensive devices, such as timing gates or photocells, are not available.
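The limb-asymmetry comparison above can be reproduced with a simple index. A sketch using the common (max − min)/max formulation and hypothetical left/right times (not data from the cited study):

```python
def asymmetry_index(left_time_s, right_time_s):
    """Inter-limb asymmetry (%) from left/right change-of-direction times,
    using the common (max - min) / max * 100 formulation."""
    slower = max(left_time_s, right_time_s)
    faster = min(left_time_s, right_time_s)
    return 100.0 * (slower - faster) / slower

# Hypothetical change-of-direction test times: left 2.50 s, right 2.40 s
print(f"asymmetry = {asymmetry_index(2.50, 2.40):.1f}%")
```

Because the index depends only on the ratio of the two times, app-versus-gates agreement on asymmetry can hold even when the two systems differ slightly in absolute timing.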

Study Limitations, Future Research, and Practical Implications
Studies on the validity and reliability of mobile apps have several limitations that may bias their findings. These include (i) limited sample sizes; (ii) the lack of studies on specific populations, such as young athletes, adults, males, and females; (iii) the use of distinct testing protocols; (iv) the use of different smartphone brands and models across the selected studies; and (v) a greater focus on the validity and reliability of strength apps. Future studies should analyze the validity and reliability of such apps in specific populations with larger sample sizes, and more consistent testing protocols and study methodologies should be adopted regarding the type of population, sample size, and smartphone brand and model.
Regarding the practical applications and the validity (Table 6) and reliability (Table 7) of the mobile applications, the My Jump and My Jump 2 apps, which use the smartphone's 120 Hz high-speed camera as a video recorder, are valid tools for assessing the reactive strength index, as well as movement displacement in terms of vertical height, namely CMJ, SJ, and DJ. The Power Lift/My Lift app is considered a valid and reliable application for measuring peak velocity (vertical, horizontal, forward, and back displacement) frame by frame. The Ergo Arm Meter uses 3D data from the built-in accelerometer and gyroscope and is considered a valid and accurate tool for measuring medium- to high-velocity movements of the arm in the sagittal plane. The smartphone accelerometer, a triaxial sensor, is considered a valid and reliable tool for assessing resistance exercise and peak vertical velocity. The Speedclock app, which records video at 60 frames per second, is considered a valid tool for measuring 10-m sprint performance [33]; however, the corresponding study did not analyze this tool's reliability.