A Scoping Review of the Validity and Reliability of Smartphone Accelerometers When Collecting Kinematic Gait Data

The aim of this scoping review is to evaluate and summarize the existing literature that considers the validity and/or reliability of smartphone accelerometer applications when compared to ‘gold standard’ kinematic data collection (for example, motion capture). An electronic keyword search was performed on three databases to identify appropriate research. This research was then examined for details of measures and methodology and general study characteristics to identify related themes. No restrictions were placed on the date of publication, type of smartphone, or participant demographics. In total, 21 papers were reviewed to synthesize themes and approaches used and to identify future research priorities. The validity and reliability of smartphone-based accelerometry data have been assessed against motion capture, pressure walkways, and IMUs as ‘gold standard’ technology and they have been found to be accurate and reliable. This suggests that smartphone accelerometers can provide a cheap and accurate alternative to gather kinematic data, which can be used in ecologically valid environments to potentially increase diversity in research participation. However, some studies suggest that body placement may affect the accuracy of the result, and that position data correlate better than actual acceleration values, which should be considered in any future implementation of smartphone technology. Future research comparing different capture frequencies and resulting noise, and different walking surfaces, would be useful.


Introduction
As smartphone technology becomes more ubiquitous, using the sensors of the phones in our pockets becomes a cheap and convenient method to gather gait data. Using mobile phones to evaluate human movement and to diagnose and track pathological gait offers practitioners an effective way to gather and evaluate data, but a key concern for use in clinical practice is the accuracy of these data. Despite the increasing use of mobile phone technology within our daily lives, the development of apps to exploit the sensors available within these devices appears more limited, which may be due to concerns about the accuracy of these data when compared to existing methods of data collection, such as motion capture or inertial measurement units used in a laboratory setting.
Whereas previous studies have reviewed wearable technology in gait more generally [1,2,3] or when wearables are used to evaluate a specific clinical pathology [4,5,6,7], it is important to remember that smartphones, unlike other wearable technology, are simply not designed for gait analysis. Therefore, these devices may be considered less accurate and more prone to error, as accelerometer data capture is not their primary use. To evaluate the accuracy of these devices in measuring kinematic data, it is important to compare smartphones to other gold-standard technology, such as motion capture, force plates, or research-standard accelerometers, and evaluate the concurrent validity and/or inter-method reliability of each measure [8]. As smartphone use is so widespread, evaluating the reliability and validity of this technology allows us to conclude whether simple smartphone apps can be used in gait analysis to capture kinematic parameters, and which issues and protocols need to be considered to ensure that these data are consistent and valuable.
This scoping review was conducted to systematically evaluate research quantifying concurrent validity and/or inter-method reliability comparing smartphone accelerometers to gold-standard measures. This will allow the identification of key themes and approaches used, and of any gaps in that research, to inform future work in this area.

Protocol
This study follows the methodology for scoping reviews established by Arksey and O'Malley [9] and extended by Levac et al. [10]. In addition, the approach and execution of this review have been informed by the updated guidance issued by the Joanna Briggs Institute Scoping Review Methodology Group [11]. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement extension for scoping reviews [12] has been followed to structure the reporting of this review, and a completed PRISMA-ScR checklist can be found in Appendix A.

Eligibility Criteria
Studies were considered eligible if they evaluated the concurrent validity or inter-method reliability of smartphone accelerometer data. There were no restrictions based on publication date, but as the search considered smartphone data, results were expected to be limited to studies since approximately 2000 due to the evolution and uptake of smartphone use. Reviews and conference papers were excluded, but these were manually checked to ensure that any relevant citations were included in the review. Papers published in languages other than English were included provided English translations were available.
Studies were excluded if they considered balance rather than gait parameters, or assessed static rather than dynamic movement. Further, studies were excluded unless they compared the accelerometer data (from a smartphone) with another method of objective kinematic data collection, for example, motion capture or inertial measurement units. Where studies only considered distance or time walked, such as the 6-min walk test, or total minutes of physical activity, these were excluded as no kinematic gait characteristics were evaluated. Where studies included a mixture of both gait and balance tasks, such as the timed up and go test, these were only included if the walking section of the trial was used to evaluate kinematic data such as stride time or step length. There were no restrictions placed on the operating system or type of smartphone used.

Information Sources
An electronic search of three databases was performed (PubMed, SPORTDiscus, and Web of Science) to identify relevant papers for inclusion. The search strategy was developed by three authors (C.S., M.A.T., A.M.) and refined via discussion. Google Scholar was used to check for additional grey literature to identify unpublished studies and reduce publication bias. The final search results were exported into RefWorks. The literature search was performed between 23 and 24 September 2023.

Search
The search strategy included the following keywords: (gait OR walk* OR ambul*) AND (smartphone OR phone OR android) AND (valid* OR reliab* OR accur*). No further refinement or restriction was placed on the search, to ensure the maximum number of studies was returned for consideration and to maximise recall.

Selection of Sources of Evidence
Studies were selected following abstract and keyword review and subsequent full-text screening. To ensure consistency, one author (C.S.) performed the screening and applied the exclusion criteria, and this was validated by the other authors (M.A.T., A.M.). Any paper considered valid for inclusion was then full-text screened, and studies were included based on a consensus between all authors.

Data Charting
The data charting form was based on a previous scoping review conducted by this research group [13] and refined via discussion based on the scope of this review. Data charting was initially conducted in Excel (Microsoft 365, version 2309) by one author (C.S.) and then reviewed for accuracy (M.A.T., A.M.). Revisions to the data charting form were made iteratively via ongoing discussion as different themes emerged from the studies under review.

Data Items
The data extracted from each study included the demographic information for the participants and the pathological condition considered (if any). We also extracted information on the methods, including the comparator, the name of the app being evaluated (if provided), the capture frequencies compared, the location(s) of the phone during the trials, and the nature of the trial (overground, laboratory walkway, treadmill). The duration and speed of each trial were extracted, along with the gait characteristic(s) under analysis. In addition, the method of assessing validity and/or reliability, including any sample size considerations, was extracted to allow the synthesis of approaches.
In common with other scoping reviews, an overall measurement of study quality has not been performed, but relevant study characteristics relating to methodological quality have been extracted for synthesis to gain an understanding of the development of the study protocols and the potential gaps in methodology [14,15].

Synthesis of Results
Studies were grouped based on the gait characteristics considered, the type of laboratory kinematic data used for comparison, and the method of evaluating validity and/or reliability. Any systematic reviews resulting from the search were examined to ensure any relevant citations were also included in the studies, as appropriate.

Study Selection
The screening and exclusion of papers is shown in Figure 1, following PRISMA reporting guidelines [16].
After duplicates were removed, 3056 studies were considered valid for screening. A total of 2427 studies were excluded from this review as they did not evaluate kinematic data relating to gait, 141 did not use the smartphone as the primary method of data collection, and 72 used sensors other than the accelerometer (for example, video capture). In total, 72 studies did not specifically assess agreement, concurrent validity, or inter-method reliability, and 41 were excluded due to not including a comparison to a gold-standard method (for example, only evaluating the test-retest reliability of the smartphone). A total of 124 papers were excluded as these presented reviews, study protocols, and conference papers. The remaining papers were considered eligible for review.

Study Characteristics
The basic demographic information for each of the studies included is shown in Table 1 below; the mean and standard deviation are shown unless specified, and left blank if these values were not provided in the paper reviewed. Ages and mass have been rounded to 1 decimal place, and heights to 2 decimal places, if supplied at a higher precision, and stated as-is if provided at the lower precision. The studies are presented in reverse chronological order to show changes in reporting/methods over time. Notes: PD = Parkinson's disease; TKA = total knee arthroplasty; THA = total hip arthroplasty; OA = osteoarthritis.
Three studies did not include information about the biological sex of the participants [24,31,36].Overall, the studies reviewed have recruited more females (n = 296) than males (n = 226), so this is not fully representative of the average population.Further, it is recognised that gait is affected by biological sex in both healthy adults [38] and within a pathological population [39].As these studies are all comparing two measures when evaluating the same individual's gait, then any difference in biological sex may not be considered important, as long as variety is represented, but this is not explicitly discussed.
The study location has been determined from the methods sections, or the author affiliations if not stated.In four studies [26,28,29,32], it has not been possible to determine the location of the data collection, with the authors being affiliated with both Thailand and the USA.
Many studies focus on healthy participants, but pathological populations are also represented, in particular Parkinson's disease.A broad range of ages are represented in these papers, which suggests that the research conducted is generalisable to a wider sample.The mass of the participants in each sample is infrequently reported, and none of the studies had exclusion criteria relating to mass or BMI, which may suggest that the researchers do not consider this a confounding variable when assessing gait characteristics despite the potential accuracy issues due to soft tissue artefacts [40].

Results of Individual Sources of Evidence
Details of the 'gold standard' comparator, smartphone information, and walking protocols are presented in Tables 2 and 3 below. Notes: PWS = preferred walking speed.

Equipment
Studies use different methods of data capture for the comparator technology, but twelve studies use motion capture to determine changes in marker position. Some studies also include additional technology such as footswitches [17], IMUs [25,31,36], or video [29]. Other comparator equipment was based on IMUs [24,27,30], accelerometers [37], or pressure-sensitive walkways [21,22,32,34], and one study captured video and identified gait events from this for comparison with the smartphone data [26].

Capture Frequency
The capture frequencies used for the smartphones varied from 15 Hz to 100 Hz. Four studies [21,29,32,37] document using the Android SENSOR_DELAY_FASTEST setting [41], which uses the fastest capture rate available; this rate has increased over time as smartphone technology has improved.
The capture frequencies for the comparator are often matched to the smartphone capture frequency or set to a larger value and then resampled to the same time points.
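The matching step can be illustrated with a short sketch: both streams are projected onto a common, uniform time base before comparison. This is a hypothetical example with synthetic data (the 50 Hz grid, jitter magnitude, and 1 Hz test signal are assumptions, not values from any reviewed study); `np.interp` performs the linear interpolation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Smartphone: nominally 50 Hz, but with timestamp jitter.
t_phone = np.cumsum(rng.normal(loc=0.02, scale=0.002, size=500))
a_phone = np.sin(2 * np.pi * 1.0 * t_phone)       # 1 Hz gait-like signal

# Comparator: clean 200 Hz capture of the same signal.
t_ref = np.arange(0, t_phone[-1], 1 / 200)
a_ref = np.sin(2 * np.pi * 1.0 * t_ref)

# Resample both onto a common 50 Hz grid inside the overlapping window,
# so that samples can be compared point-by-point.
t_common = np.arange(t_phone[0], t_phone[-1], 1 / 50)
a_phone_rs = np.interp(t_common, t_phone, a_phone)
a_ref_rs = np.interp(t_common, t_ref, a_ref)

# Agreement between the two resampled streams.
rmse = np.sqrt(np.mean((a_phone_rs - a_ref_rs) ** 2))
```

Without this step, the two devices' samples would not refer to the same instants, and any point-wise comparison would be distorted by the timing mismatch rather than by genuine measurement differences.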

Location of Markers and Phone
The number of markers used with the motion capture technology varied from a single marker to a full-body 53-marker set. Markers were often placed on or near the smartphone [18,23,36,37]. In seven studies, the smartphone was placed in a position that would replicate day-to-day use, for example, a front pocket [17,24,25,26,27], or in different locations to evaluate whether a change in body position affected the reliability [28,29,32]. Placement of the smartphone on the lumbar spine was also used [18,19,21,23,29,30,33,35,36,37], as this is often the standard placement for accelerometers to evaluate movement and determine lower body gait events [42]. In addition, one study placed the smartphone on the sternum [31] and one on the navel [34].

Walking Protocols
Studies mostly use preferred walking speed, although the protocol for determining these speeds is often lacking in the method description. The protocol for determining the preferred walking speed is stated explicitly in two studies [17,20], and the cues used to initiate the participants are stated in two studies [24,29]. Some studies vary the speed to evaluate whether this affects the accuracy of comparison between the smartphone and 'gold standard' device: in one study [20] a fixed speed is used which is specified numerically, one uses metronome cueing to fix the average speed and increases this by 10% [34], another [24] specifies the verbal cues used to obtain a fast or slow speed, whereas other studies that consider speed changes do not clearly explain the protocol used to determine this [29,31,32,33].
The majority of the studies were conducted indoors, with one study [29] also using an outdoor level pedestrian walkway, and a further study considering outdoor walking and obstacle crossing [28]. Two studies used a treadmill due to the need to control the data captured or to fix speeds [17,20], and one used corridors [27], but the majority of the other studies used laboratory-based hard floor walkways [19,21,22,24,25,26,29,30,31,32,33,34,35,36,37]. One study also used the participants' indoor home environment in addition to the treadmill [20]. The surface used in the trials was not reported in two studies [18,23].
Dual task trials are included in six studies [18,20,23,24,27,30]. Different protocols are used, with the dual task in some studies consisting of the participants turning their head from side to side while walking [18,23], a cognitive task such as the 'serial sevens' or 'serial threes' test [20,24,27], or a combination of both numerical and verbal cognitive tasks [30].
The duration of each trial varies, and is expressed in either distance or walking time. As one trial considers a non-linear analysis of the data [17], this requires a longer time series to fully capture the nature of the temporal gait changes, and should exceed 500 stride intervals for fractal analysis [43] or 200 strides for entropy analysis [44]. The remaining studies consider linear measures such as means and coefficients of variation, and so do not have the same requirement for a long time series; trial lengths vary from 6 steps [19] or 6 s [18,23] to 120 s of walking data [24,26]. None of the papers reviewed that consider linear measures includes a justification of trial length.
Turns are included in trials in five studies [24,26,29,30,33], and are included in the trial but excluded from the subsequent analysis in four studies [20,22,27,34]. Subjects walked barefoot in five studies [18,22,25,32,35], without shoes in one study [34], and in normal shoes in five studies [17,24,27,29,37]; otherwise this was not stated. Obstacle crossing and uneven surfaces were considered in one study [28]. Inclines and steps have not been included.

Analysis
The signal processing, analysis, gait events identified, and reliability measures are summarised in Table 4 below.
Sample size calculations are explicitly included in five studies [17,18,20,24,26], and one further study states the calculated sample size but not the values or methods used to obtain it [34]. Studies that evaluate the required sample size either base the calculation on attaining an intraclass correlation coefficient (ICC) of ≥0.8 [18,20], based on the results of previous studies [17,26], or one study [24] uses the recommendations from Bujang and Baharum [45]. One study [17] uses non-linear analysis when evaluating reliability, specifically detrended fluctuation analysis, approximate entropy, and sample entropy of a time series without filtering/smoothing. When linear measures are considered in the same study, the data are filtered prior to analysis. In other studies that include filtering, the cut-off frequencies range from 2 Hz to 20 Hz, with some studies [21,28,30,32] also adding additional filtering of the anterior-posterior signal based on previous work by Zijlstra and Hof [47].
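The filtering step described above can be sketched with a zero-phase low-pass Butterworth filter, a common choice for accelerometry prior to linear analysis. This is a hypothetical example: the 10 Hz cut-off and 4th order are assumptions chosen from within the 2-20 Hz range reported, not values taken from any particular reviewed study.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 100.0                                   # sampling frequency (Hz)
t = np.arange(0, 10, 1 / fs)

gait = np.sin(2 * np.pi * 1.0 * t)           # 1 Hz gait-like component
noise = 0.5 * np.sin(2 * np.pi * 30.0 * t)   # 30 Hz high-frequency noise
signal = gait + noise

# 4th-order low-pass Butterworth with a 10 Hz cut-off.
b, a = butter(4, 10.0, btype="low", fs=fs)

# filtfilt applies the filter forwards and backwards, giving zero phase
# shift, so the timing of gait events is not distorted by the filter.
filtered = filtfilt(b, a, signal)

# Residual error against the clean gait component.
residual = np.sqrt(np.mean((filtered - gait) ** 2))
```

Zero-phase filtering matters here because a phase-shifting filter would displace peaks in time and bias any gait events derived from them.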
The actual acceleration values are used in the reliability analysis in two studies [19,36], whereas the majority of the other papers consider discrete events that can be derived from the original time series (e.g., stride time).
The majority of studies included in this review use ICCs to evaluate inter-method reliability, and also include Bland-Altman limits of agreement or Pearson correlation coefficients to evaluate concurrent validity in addition to this. However, when interpreting the ICC value, different ranges have been used to quantify the result. The majority of papers reviewed that implement ICCs [17,19,20,24] use the ranges specified by Koo and Li [8]; that is, <0.5 poor, 0.5-0.75 moderate, 0.75-0.90 good, >0.90 excellent. However, two papers [18,23] use ranges specified by Munro [48]: <0.50 poor, 0.50-0.69 moderate, 0.70-0.89 high, >0.90 excellent; two studies [28,32] use ranges recommended by Cicchetti [49]: <0.40 poor, 0.40-0.60 fair, 0.60-0.75 good, >0.75 excellent; one study [36] uses ranges recommended by Shrout and Fleiss [50]: <0.40 poor, 0.40-0.75 fair to good, >0.75 excellent; and one study [30] uses an uncited set of ranges: ≤0.59 low, 0.60-0.69 marginal, 0.70-0.79 adequate, 0.80-0.89 high, >0.90 very high. The discrepancy between these ranges is shown in Figure 2.
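The practical consequence of these differing conventions can be sketched numerically. This is a hypothetical illustration: the boundary values are transcribed from the ranges quoted above, with the small gaps in Munro's published bands closed for simplicity.

```python
# Each convention is a list of (upper_bound, label) pairs; an ICC value
# falls into the first band whose upper bound it is below.
CONVENTIONS = {
    "Koo & Li":  [(0.50, "poor"), (0.75, "moderate"),
                  (0.90, "good"), (1.01, "excellent")],
    "Munro":     [(0.50, "poor"), (0.70, "moderate"),
                  (0.90, "high"), (1.01, "excellent")],
    "Cicchetti": [(0.40, "poor"), (0.60, "fair"),
                  (0.75, "good"), (1.01, "excellent")],
}

def classify(icc, convention):
    """Return the qualitative label for an ICC under a given convention."""
    for upper, label in CONVENTIONS[convention]:
        if icc < upper:
            return label
    return "excellent"

# The same ICC of 0.78 is 'good' under Koo & Li but 'excellent' under
# Cicchetti, which is why reporting the numerical value matters.
```

This makes explicit why a verbal label alone ('good', 'excellent') cannot be compared across studies that adopt different conventions.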

Findings
Many papers reported an excellent correlation either via the ICC [17], Pearson correlation coefficient [26,27,28,33,37], or Bland-Altman limits of agreement [25,31]. Olson et al. [18] concluded that step time had an excellent reliability, whereas step length was good. Other papers achieved good to excellent reliability [24,29,30,35]. Kuntapun et al. [28] evaluated level walking, irregular surfaces, and obstacle crossing, and found high to very high correlations for gait characteristics but low to high correlations for centre of mass (COM) displacement. Steins et al. [36] found the position data to be excellent, but the actual acceleration to only be good (>0.54). Grouios et al. [19] conclude that smartphones are a valid and reliable alternative to motion capture technology, but their results include ICC values from −0.348 to 0.796 and Pearson correlation coefficients of −0.464 to 0.460, which do not seem to support this. Shema-Shiratzky et al. [22] evaluated both left and right sides and concluded that smartphones have an excellent validity compared with a pressure-sensitive walkway for cadence but only achieved an adequate correlation for single limb support, double limb support, and stance phase. Kelly et al. [21] also found a strong correlation between the smartphone and the walkway for cadence. When considering different body positions, Silsupadol et al. [32] found phone placement may be important, with body and belt placement resulting in an excellent reliability when compared to the gold standard, whereas bag, hand, and pocket placements were good.

Summary of Evidence
The choice of the gold standard equipment used to evaluate the validity and reliability of the smartphone data capture is not justified in any of the studies, so this may relate to convenience or to previous studies conducted by the research groups. In particular, there are research groups and co-authors common to several papers, which may suggest that later papers develop earlier research, which could imply methodological bias. However, this also means that limitations identified in earlier papers can be addressed in later research, such as the lack of turns identified in the protocol for Silsupadol et al. [32], which is addressed in the 2020 paper [29].
The choice of capture frequency is important to ensure that the quickest system changes are captured, with 24 Hz suggested as the minimum for walking trials [51] due to the Nyquist sampling theorem. One study has a sampling rate (15 Hz) that may not capture all the required data [19], although low sampling rates (12.5 Hz) have been used successfully to capture data about cadence in older people with osteoarthritis [52]. However, high sampling frequencies may increase the chance of noise in the data, so clear justification of the choice of sampling frequency is needed to reduce the risk of oversampling and associated error, which may affect the evaluation of reliability if there is error present in one sample and not the other.
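The aliasing risk behind the Nyquist constraint can be demonstrated with a short numerical sketch (hypothetical values, not taken from any reviewed study): a 10 Hz signal component sampled at 15 Hz, below the 20 Hz it would require, folds back to a spurious 5 Hz component in the captured data.

```python
import numpy as np

fs = 15.0                           # sampling frequency (Hz)
t = np.arange(0, 2, 1 / fs)         # 2 s of samples
x = np.sin(2 * np.pi * 10.0 * t)    # true 10 Hz component

# Spectrum of the sampled data: the energy appears at |fs - f| = 5 Hz,
# not at the true 10 Hz, because 10 Hz exceeds the 7.5 Hz Nyquist limit.
spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(len(x), 1 / fs)
dominant = freqs[np.argmax(spectrum)]
```

In other words, an undersampled system does not merely miss fast events; it misreports them as slower ones, which would silently distort any frequency-dependent gait measure.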
There is a range of different-length trials present in the reviewed papers, but this is not justified other than when discussing non-linear analysis and the requirement for many data points [17]. The trial length should also be considered in conjunction with the capture frequency to establish the number of data points available for analysis in each case; this varies in the studies reviewed from approximately 600 data points [18,23] to 12,000 captured data points [24,26], which is a considerable difference. As some of the studies include an older or pathological population, the trial length should be considered further to ensure that fatigue does not affect the gait pattern or increase the risk of adverse events.
The protocol for determining preferred walking speed is often missing from method descriptions, and this has been found to be problematic, with speed being a potential confounding variable in gait analysis and with recommendations that it should be standardised to avoid ambiguity [53]. In particular, the use of specific cues can affect the speed selected by the participant [54] and result in a preferred speed that is not optimal. The protocol for choosing a self-selected speed has been specified in two studies [17,20], and the cue used in another [24]; this is important to ensure that studies are repeatable and methods are rigorously reported.

Ecological Validity
Many studies attempt to replicate laboratory-based testing when deciding the placement of the smartphone, such as strapping it to the lower back or sternum. While this makes sense as a robust way of checking reliability against gold standard technology, which may be applied in the same area, it implies a lack of ecological validity, as this is not where research participants will carry their smartphone in a real-world situation. Other studies have taken this into account in their placement of the smartphone during testing, with more focus on body positions in which the smartphone may actually be carried, such as the front pocket, or close to one hip. Further studies [28,32] have validated different body positions for the smartphone, which may inform recommendations for research participants in terms of where to keep their device during walking trials to maximise accuracy. There is limited research on smartphone location while walking, but a study of younger women (aged 15-40 years) found that preferred smartphone locations also included hanging around the neck, or tucked into a bra [55], so further analysis of smartphone body locations and the effect of these on the reliability of kinematic data is warranted.
Similarly, walking barefoot in some trials lacks ecological validity if smartphone accelerometry data are to be used in a real-world setting. The trials in the reviewed studies were often conducted on laboratory walkways, with only two studies using an outdoor setting [28,29], which would replicate real-world data collection. Various studies included in this review also included dual task components to replicate real-world data collection; however, these often involve cognitive or motor tasks that do not replicate what the participant may experience when walking in real life. Thus, rather than simply walking and talking, the dual task components include mathematical tasks or head-turning tasks, which are perhaps unrealistic. The studies reviewed suggest that dual tasking, whether captured via smartphone or gold standard, is comparable, accurate, and reliable, which would also suggest that simpler dual task components may have good reliability.
Turns are not dealt with consistently in the studies reviewed, with some deliberately excluding these as they disrupt stride timing [46]. In other studies, turns are included as these represent real-world gait more accurately due to the quantity of turns experienced in activities of daily living [56], and they can be accurately identified within a time series [57]. As the papers reviewed consider the validity and reliability of smartphones compared to gold standard systems, it could be argued that turns should be included as representative of usual gait, and that the two systems should handle these in the same way if we are to conclude that the smartphone is a reliable alternative measure. It should also be considered that some of the studies reviewed focused on Parkinson's disease or older adult fallers, and turns are considered to be a contributory factor in negative events such as freezing of gait [58] or increased falls risk [59], so capturing kinematic data during turning may be particularly useful in these populations.

Analysis
The raw acceleration data are often resampled, as smartphones do not sample at reliable time intervals, and so the data need to be interpolated to ensure that the data points represent the same capture points. Many studies reviewed have reported the need to resample or interpolate the data, and failing to deal with this issue could be a cause of poor results, as it would introduce lags into the time series. Various algorithms have been used to determine specific gait events, but the need to identify specific gait events rather than consistent features in the time signal has not been clearly explained. For evaluating stride time, for example, treating successive peaks or troughs in the signal as the same consistent point, even though these may not correspond to a specific gait event, could be as valid as identifying heel strikes; this strategy has been employed in some of the studies reviewed.
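The peak-based strategy described above can be sketched as follows. This is a hypothetical example on a synthetic signal; `scipy.signal.find_peaks` with a height and minimum-spacing constraint stands in for the event-detection algorithms used in the reviewed studies, and the 1 Hz signal stands in for a stride frequency of one stride per second.

```python
import numpy as np
from scipy.signal import find_peaks

fs = 100.0
t = np.arange(0, 10, 1 / fs)

# Synthetic vertical-acceleration-like signal: 1 Hz periodicity plus noise.
rng = np.random.default_rng(1)
accel = np.sin(2 * np.pi * 1.0 * t) + 0.05 * rng.normal(size=t.size)

# Detect one prominent peak per stride: require a minimum height and at
# least 0.5 s between peaks to avoid double-counting within a stride.
peaks, _ = find_peaks(accel, height=0.5, distance=int(0.5 * fs))

# The interval between consecutive peaks estimates the stride time,
# without ever labelling a specific gait event such as heel strike.
stride_times = np.diff(t[peaks])
mean_stride = stride_times.mean()
```

The point is that any feature occurring once per stride at a consistent phase yields the same interval statistics as a named gait event, even if its anatomical meaning is unknown.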
It should be noted that the Grouios et al. [19] paper attempts to test the reliability of each acceleration value gathered, whereas most other papers reviewed reduce the sample data points by extracting discrete data, such as stride length, to use in their reliability analysis. Steins et al. [36] also consider acceleration data directly and find that the actual raw accelerations have a fair to excellent reliability, whereas the position data obtained by double integration of the acceleration series had a higher reliability. This suggests that analysing data derived from discrete gait events, such as stride length or step time, may be more valid than using the accelerations directly, and that the acceleration signal may include more noise and potential error.
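The double-integration step, and one reason derived position series can behave differently from the raw accelerations, can be illustrated with a toy example (hypothetical values; the constant-acceleration signal is chosen only so the exact answer is known). Note how a small constant bias in the acceleration grows quadratically in the integrated position.

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

fs = 100.0
t = np.arange(0, 5, 1 / fs)
accel = np.full_like(t, 2.0)                        # constant 2 m/s^2

# Integrate twice: acceleration -> velocity -> position.
vel = cumulative_trapezoid(accel, t, initial=0.0)   # v(t) = 2t
pos = cumulative_trapezoid(vel, t, initial=0.0)     # x(t) = t^2

# A 0.01 m/s^2 sensor bias produces a position error of 0.005 * t^2,
# i.e. integration accumulates (and amplifies) low-frequency error.
pos_biased = cumulative_trapezoid(
    cumulative_trapezoid(accel + 0.01, t, initial=0.0), t, initial=0.0)
drift = pos_biased[-1] - pos[-1]
```

This cuts both ways: integration smooths high-frequency noise (which may explain the higher reliability of position data reported above) while accumulating any low-frequency bias, which is why drift correction is commonly needed before interpreting integrated signals.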
Sample size calculations are included in later studies, which may relate to increasing rigour in reporting over time, with published articles having more defined reporting standards to adhere to [60]. The ranges used to determine whether reliability is 'good' or 'excellent' are not consistent across the studies reviewed, but most studies also report the numerical value of the ICC to allow comparison between studies.
There is a range of approaches adopted in the studies reviewed, with agreement analysed via Bland-Altman methods, concurrent validity analysed via correlation, and inter-method reliability analysed using ICC. In some cases, the language used could be more precise in explaining the choice to assess concurrent validity rather than inter-method reliability, rather than using more ambiguous terms such as 'feasibility' and 'accuracy'. When studies use Bland-Altman plots or Pearson correlations rather than ICC, this is often not justified, and one study uses an analysis of variance (ANOVA), which is much more limited in use than ICC for determining reliability [61]. Pearson correlations alone may be misleading, as these do not measure reliability or agreement between methods [62], which may be why several studies considered multiple methods of determining validity and/or reliability.
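The limitation of Pearson correlation noted above can be demonstrated directly: two methods separated by a systematic offset correlate almost perfectly while a Bland-Altman analysis exposes the bias. This is a hypothetical sketch with synthetic cadence-like data (the means, offsets, and noise level are assumptions, not values from any reviewed study).

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic 'gold standard' cadence values (steps/min) for 30 subjects,
# and a 'smartphone' that reads 5 units high with some random error.
gold = rng.normal(loc=60.0, scale=5.0, size=30)
phone = gold + 5.0 + rng.normal(scale=1.0, size=30)

# Pearson correlation: very high despite the systematic offset.
r = np.corrcoef(gold, phone)[0, 1]

# Bland-Altman analysis: the mean difference (bias) and the 95% limits
# of agreement make the 5-unit disagreement explicit.
diff = phone - gold
bias = diff.mean()
sd = diff.std(ddof=1)
loa = (bias - 1.96 * sd, bias + 1.96 * sd)
```

Here the correlation suggests the two methods are interchangeable, while the Bland-Altman bias shows they are not, which is exactly why several of the reviewed studies report both.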

Limitations
A scoping review approach has been used here to evaluate the breadth and depth of research in a specific area and to identify the approaches used, in order to inform future research. Although we searched grey literature, it is possible that publication bias may have affected the studies included in this review. In particular, pilot or preliminary studies may not have been published in peer-reviewed journals due to small sample sizes or a lack of significant findings [63]. As is standard for scoping reviews, an evaluation of the quality of each study has not been performed [14,15], but we have extracted key themes and approaches to allow readers to assess methodological quality and rigour for themselves.

Conclusions
A range of different smartphone makes and models have been considered in the studies reviewed, as have differing walking speeds and dual-task components. The reliability of smartphone-based accelerometry data has been assessed against motion capture, pressure walkways, and IMUs as 'gold standard' technologies, and the smartphone data have been found to be accurate and reliable. A range of different methods have been used to identify gait events, to process and analyse the data, and to evaluate reliability. This suggests that smartphone accelerometers can provide a cheap and accurate alternative for gathering kinematic data, which can be used in ecologically valid environments and could potentially increase diversity in research participation.

Recommendations for Future Research
The studies reviewed cover a range of capture frequencies, but no study explicitly compared different capture frequencies to determine whether this affects reliability. As smartphones are not designed to capture accelerometry data for gait analysis, it is plausible that increasing the capture frequency could add noise to the signal; it would therefore be important to identify the optimal capture frequency for smartphone use, rather than simply capturing at the maximum frequency possible. In addition, considering different walking surfaces would increase the generalisability of the research and its relevance to real-world data collection and the dissemination of smartphone-based data capture 'in the wild'.
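One way such a capture-frequency comparison could be piloted before collecting human data is on synthetic signals: the sketch below recovers a known step frequency from a noisy sinusoidal proxy for an accelerometer trace, sampled at several candidate capture frequencies. The signal model, noise level, and frequencies are illustrative assumptions only, not a protocol from the reviewed studies.

```python
import numpy as np

def estimated_cadence(fs, step_freq=1.8, duration=30.0, noise=0.3, seed=1):
    """Recover the dominant step frequency (Hz) from a synthetic
    accelerometer trace sampled at fs Hz via the spectral peak."""
    rng = np.random.default_rng(seed)
    t = np.arange(0.0, duration, 1 / fs)
    acc = np.sin(2 * np.pi * step_freq * t) + noise * rng.standard_normal(t.size)
    spec = np.abs(np.fft.rfft(acc - acc.mean()))   # drop DC, take magnitude
    freqs = np.fft.rfftfreq(acc.size, d=1 / fs)
    return freqs[spec.argmax()]

# Compare recovery error across candidate capture frequencies.
for fs in (25, 50, 100):
    err = abs(estimated_cadence(fs) - 1.8)
    print(f"{fs:>4} Hz capture: cadence error {err:.3f} Hz")
```

A real comparison would of course need genuine gait data and sensor-specific noise characteristics, but a simulation of this kind helps frame which capture frequencies are worth testing.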

Figure A1. Preferred reporting items for systematic reviews and meta-analyses extension for scoping reviews (PRISMA-ScR) checklist [12]:

Item 4, Objectives (Introduction): Provide an explicit statement of the questions and objectives being addressed with reference to their key elements (e.g., population or participants, concepts, and context) or other relevant key elements used to conceptualize the review questions and/or objectives.

Item 5, Protocol and registration: Indicate whether a review protocol exists; state if and where it can be accessed (e.g., a Web address); and if available, provide registration information, including the registration number.

Item 6, Eligibility criteria: Specify characteristics of the sources of evidence used as eligibility criteria (e.g., years considered, language, and publication status), and provide a rationale.

Item 7, Information sources: Describe all information sources in the search (e.g., databases with dates of coverage and contact with authors to identify additional sources), as well as the date the most recent search was executed.

Item 8, Search (Section 2.4): Present the full electronic search strategy for at least 1 database, including any limits used, such that it could be repeated.

Item 9, Selection of sources of evidence (Section 2.5): State the process for selecting sources of evidence (i.e., screening and eligibility) included in the scoping review.

Item 10, Data charting process (Section 2.6): Describe the methods of charting data from the included sources of evidence (e.g., calibrated forms or forms that have been tested by the team before their use, and whether data charting was done independently or in duplicate) and any processes for obtaining and confirming data from investigators.

Item 11, Data items (Section 2.7): List and define all variables for which data were sought and any assumptions and simplifications made.

Item 12, Critical appraisal of individual sources of evidence (N/A): If done, provide a rationale for conducting a critical appraisal of included sources of evidence; describe the methods used and how this information was used in any data synthesis (if appropriate).

Item 13, Synthesis of results (Section 2.8): Describe the methods of handling and summarizing the data that were charted.

Item 14, Selection of sources of evidence (Section 3.1): Give numbers of sources of evidence screened, assessed for eligibility, and included in the review.

Table 2. Results from individual sources of evidence: equipment.

Table 3. Results from individual sources of evidence: walking protocols.

Table 4. Processing and analysis.