# Are Gait Patterns during In-Lab Running Representative of Gait Patterns during Real-World Training? An Experimental Study


## Abstract


## 1. Introduction

## 2. Materials and Methods

#### 2.1. Overview

#### 2.2. Participants

**Cohort 1.** The inclusion criteria for Cohort 1 were designed to capture a pool of runners representative of the broader population of runners. Healthy runners aged 18 and older were recruited, with no upper limit on age. Participants were required to run at least three times per week with at least one run of 40 min or longer, to have no current musculoskeletal injury that prevented them from doing their usual running training, and to meet American College of Sports Medicine preparticipation guidelines for exercise [13]. Runners were recruited from the community via social media, flyers at local running stores, and in-person recruitment at local running events. Recruitment and data collection for Cohort 1 took place in Greenville, North Carolina, which is located in a region with predominantly flat terrain. All participants provided written informed consent, and the study was approved by the East Carolina University and Medical Center IRB and the Indiana University IRB (protocols #21-001137 and #12040). The sample size for Cohort 1 was determined via a learning curve power analysis for a predictive modeling goal detailed elsewhere [14], which indicated that a minimum of 40 participants were needed.

**Cohort 2.** The inclusion criteria for Cohort 2 were designed to construct a more homogeneous and specialized population of runners to assess the generalizability of in-lab data to a new population of athletes. One such specialized population often studied in prospective research on running injuries is young adult female runners, who may be at greater risk of overuse injury (e.g., Davis et al., Rauh et al. [15,16]). In service of this goal of testing the generalizability of findings from the in-lab data, Cohort 2 included women aged 18–32 who fulfilled the same inclusion criteria as Cohort 1 (running at least three times per week with one run lasting at least 40 min, and no current injuries or contraindications for exercise). Runners were recruited from students at a large university via flyers on campus and at a local running club. Recruitment and data collection for Cohort 2 took place in Bloomington, Indiana, which is located in a region with predominantly hilly terrain. All participants provided written informed consent, and the study was approved by the Indiana University IRB (protocol #17923). The sample size for Cohort 2 was chosen to recruit a similar number of female subjects as in Cohort 1.

#### 2.3. Wearable Sensors and Gait Metrics

#### 2.4. Protocol

**Cohort 1, in-lab run.** Participants in Cohort 1 first completed a 38 min in-lab treadmill run at speeds ranging from 30% slower to 25% faster than each runner’s self-reported preferred running speed for a “typical training run.” This range of speeds was designed to increase the variability in each runner’s gait as observed in the lab, as most gait-related parameters change as a function of speed. The range of speeds was selected by comparing data on self-reported preferred running speed from a previous in-lab study [24] with known values for the typical walk–run transition speed in healthy adults [25] and predictive equations for estimating lactate threshold from training pace [26]. The range of 30% slower to 25% faster kept the slowest speeds above the walk–run transition for most adults, avoiding uncomfortably slow speeds, and kept the fastest speeds below each runner’s predicted lactate threshold, avoiding excessive fatigue. The speeds were presented in a semirandomized fashion, with the slowest two speeds first, then a block of randomized speeds, followed by the fastest two speeds at the end. This ordering was chosen to maximize the range of speeds covered by each subject and to minimize early-onset fatigue that would prevent subjects from completing the protocol. To minimize any potential order effects, each subject was randomly assigned one of four randomized block orders (speed ordering for each protocol provided in Table S2). The in-lab treadmill run, which took place as a part of a larger study [14], was completed in a motion capture lab while equipped with the wearable sensors.
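The semirandomized ordering described above can be illustrated with a short Python sketch. It is not the study's protocol generator: the even spacing of speeds between the two limits, the function name `build_speed_protocol`, and the use of a seeded shuffle in place of the four fixed block orders from Table S2 are all assumptions made for illustration. The default of 12 trials follows the number of running trials reported in Section 2.5.

```python
import random
from typing import List, Optional


def build_speed_protocol(preferred_speed: float,
                         n_trials: int = 12,
                         slowest: float = 0.70,
                         fastest: float = 1.25,
                         seed: Optional[int] = None) -> List[float]:
    """Return treadmill speeds (same units as preferred_speed) in semirandomized order."""
    # Assumption: trial speeds are evenly spaced between 30% slower and 25% faster.
    step = (fastest - slowest) / (n_trials - 1)
    speeds = [round(preferred_speed * (slowest + i * step), 3) for i in range(n_trials)]

    # Slowest two speeds first, fastest two last, everything in between shuffled.
    middle = speeds[2:-2]
    random.Random(seed).shuffle(middle)  # stand-in for the fixed block orders in Table S2
    return speeds[:2] + middle + speeds[-2:]
```

For a runner with a preferred speed of 3.0 m/s, `build_speed_protocol(3.0, seed=1)` starts at 2.1 and 2.25 m/s and ends at 3.6 and 3.75 m/s, with the eight intermediate speeds in shuffled order.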

**Cohort 2, measured course run.** Participants in Cohort 2 first completed a 2.4 km run on a measured out-and-back course while under observation by research staff and while equipped with the wearable sensors. The measured course consisted of known segments of flat and straight running, left turns, right turns, inclines, and declines (Figure 1), all of which were confirmed via mapping software (OpenStreetMap, accessed 28 April 2023). The flat and straight segment was a portion of a concrete running track with incline and decline magnitudes of <0.5% grade. The left and right turns were turning portions on this same track. The incline and decline segments were on a paved sidewalk with an average grade of 5.5%.

**Cohorts 1 and 2, real-world runs.** After completing the in-lab or measured course run, participants in Cohort 1 and Cohort 2, respectively, were sent home with the wearable sensors. Both cohorts were instructed to record five outdoor runs, with no restrictions on the course, terrain, run distance, or pace. In this way, the data from the real-world runs were designed to be a representative sample of the participants’ typical gait patterns during their typical day-to-day training. For both cohorts, five real-world runs were selected because previous research has shown that this number of runs is sufficient to generate a stable characterization of the distribution of a runner’s gait pattern during real-world running [27].

#### 2.5. Data Extraction

**Cohort 1, in-lab run.** Sensor data from each of the 12 running trials were trimmed to exclude the first 25 s and the last 15 s of each segment to remove any effects caused by changes in the treadmill belt speed. During these portions of the protocol, the gait metrics recorded by the devices lag behind the runner’s true gait metrics because the treadmill belt speed is not constant; belt speed changes gradually over several seconds to avoid causing a trip hazard.
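A minimal Python sketch of this trimming step is given below. The 25 s and 15 s margins come from the text; the function name `trim_trial` and the `sample_rate_hz` argument are assumptions, since the sampling rate of the exported sensor data is not stated in this section.

```python
from typing import List, Sequence


def trim_trial(samples: Sequence[float],
               sample_rate_hz: float,
               lead_s: float = 25.0,
               tail_s: float = 15.0) -> List[float]:
    """Drop the first lead_s and last tail_s seconds of one treadmill trial."""
    start = int(round(lead_s * sample_rate_hz))
    stop = len(samples) - int(round(tail_s * sample_rate_hz))
    # A trial shorter than the two margins combined yields no usable data.
    if stop <= start:
        return []
    return list(samples[start:stop])
```

With data sampled at 1 Hz, a 180 s trial would keep samples 25 through 164, leaving 140 s of steady-speed running.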

**Cohort 2, measured course run.** GNSS-based position data were cross-referenced against latitude-longitude bounding boxes to extract segments of the measured course run that took place within the known segments of flat and straight running, left turns, right turns, inclines, and declines (Figure 1). These segments were identified using a ground-truth course measured using mapping software and elevation data from OpenStreetMap.
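The bounding-box lookup can be sketched in Python as follows. This is an illustration only: the `BoundingBox` type, the `label_point` function, and any coordinates passed to them are hypothetical, and the actual box coordinates used in the study are not given in this section.

```python
from typing import NamedTuple, Optional, Sequence


class BoundingBox(NamedTuple):
    """Axis-aligned latitude-longitude box for one known course segment."""
    label: str
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float

    def contains(self, lat: float, lon: float) -> bool:
        return (self.lat_min <= lat <= self.lat_max
                and self.lon_min <= lon <= self.lon_max)


def label_point(lat: float, lon: float,
                boxes: Sequence[BoundingBox]) -> Optional[str]:
    """Return the label of the first box containing the point, or None."""
    for box in boxes:
        if box.contains(lat, lon):
            return box.label
    return None
```

Because the incline and decline share one out-and-back segment, position alone may not separate them; a complete implementation would presumably also use the direction of travel or the order of samples in time, which this sketch leaves out.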

**Cohorts 1 and 2, real-world runs.** Since real-world running data contain some amount of standing and walking, portions of the real-world data that had speeds below 1.56 m/s or a cadence below 100 strides per minute were excluded, following similar strategies used in previous work [27,28]. These criteria excluded 6.5% of the real-world data.
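As a sketch, the exclusion rule amounts to a simple per-sample filter. The two thresholds are taken from the text; the dictionary keys `speed` and `cadence` and the function names are assumptions about how the data might be organized.

```python
from typing import Dict, List

MIN_SPEED_M_S = 1.56               # speeds below this are treated as walking or standing
MIN_CADENCE_STRIDES_PER_MIN = 100  # cadence threshold as stated in the text


def is_running(speed_m_s: float, cadence: float) -> bool:
    """A sample is kept only if it meets both the speed and cadence thresholds."""
    return speed_m_s >= MIN_SPEED_M_S and cadence >= MIN_CADENCE_STRIDES_PER_MIN


def keep_running_samples(samples: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Return only the samples classified as running."""
    return [s for s in samples if is_running(s["speed"], s["cadence"])]
```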

#### 2.6. Data Processing

#### 2.7. Identification of Flat and Straight Segments in Real-World Running

#### 2.8. Statistical Comparison of Gait Patterns

**Univariate analysis.** The univariate analysis treated each gait metric separately. For each gait metric and each comparison, the overlap between a reference distribution $D$ and a new distribution $D^{*}$ was calculated as follows:

- Calculate the central 95% range of the points in $D$. This corresponds to the [2.5%, 97.5%] quantiles of this gait metric in this distribution.
- Calculate the proportion of points in $D^{*}$ which fall within this central 95% range calculated from $D$.
- Data points in $D^{*}$ which fall within this central 95% range were considered to be well-represented by the reference distribution $D$ (Figure 4).
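The steps above can be sketched in Python using only the standard library. This is a minimal illustration rather than the study's code; the quantile interpolation rule (the `inclusive` method of `statistics.quantiles`) and the function names are assumptions.

```python
from statistics import quantiles
from typing import Sequence, Tuple


def central_95_range(reference: Sequence[float]) -> Tuple[float, float]:
    """Return the 2.5% and 97.5% quantiles of the reference distribution D."""
    # n=40 gives cut points at 2.5%, 5%, ..., 97.5%; keep the first and last.
    cuts = quantiles(reference, n=40, method="inclusive")
    return cuts[0], cuts[-1]


def univariate_overlap(reference: Sequence[float], new: Sequence[float]) -> float:
    """Proportion of points in D* that fall within the central 95% range of D."""
    low, high = central_95_range(reference)
    inside = sum(low <= x <= high for x in new)
    return inside / len(new)
```

For a reference of the integers 0 through 100, the central 95% range is [2.5, 97.5], so a new sample of `[0, 3, 50, 97, 100, 120]` has an overlap of 0.5.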

**Depth analysis.** The univariate analysis cannot detect changes in how multiple gait metrics covary. For example, if the speed–stride length relationship differs for a particular subject inside the lab versus outside the lab, this shift would not be fully captured by the univariate analysis. A multivariate approach using depth statistics was used to address this shortcoming, as follows:

1. Define the reference distribution of gait patterns, $D$ (e.g., a set of five-dimensional vectors containing all in-lab gait data from one subject in Cohort 1).
2. Calculate the half-space depth for all points $d \in D$, with respect to the reference distribution $D$.
3. Calculate the 95% depth cutoff by finding the 95% quantile value of the depth values for all points in $D$.
4. Define the new distribution of gait patterns, $D^{*}$ (e.g., real-world data from the same subject in Cohort 1).
5. Calculate the half-space depth for all points $d^{*} \in D^{*}$, with respect to the distribution $D$.
6. Calculate the proportion of points in $D^{*}$ that are as deep or deeper than the 95% depth cutoff determined in step 3.
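The procedure above can be illustrated with a pure-Python sketch under two loudly stated assumptions. First, exact half-space depth is replaced here by a random-projection approximation (in the spirit of the random Tukey depth [32]); which depth implementation the study used is not stated in this section. Second, the 95% depth cutoff is interpreted as the depth that the deepest 95% of reference points meet or exceed, consistent with the description of the 95% convex hull in Figure 5. All function names are hypothetical.

```python
import math
import random
from bisect import bisect_left, bisect_right
from typing import List, Sequence

Point = Sequence[float]


def random_directions(dim: int, n_dirs: int, rng: random.Random) -> List[List[float]]:
    """Draw random unit vectors by normalizing Gaussian samples."""
    directions = []
    for _ in range(n_dirs):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        directions.append([c / norm for c in v])
    return directions


def project(points: Sequence[Point], direction: Sequence[float]) -> List[float]:
    """Project each point onto one direction."""
    return [sum(x * u for x, u in zip(p, direction)) for p in points]


def approx_halfspace_depth(points: Sequence[Point], reference: Sequence[Point],
                           directions: Sequence[Sequence[float]]) -> List[float]:
    """Approximate depth: smallest one-sided reference fraction over all directions."""
    n = len(reference)
    depths = [1.0] * len(points)
    for u in directions:
        ref_proj = sorted(project(reference, u))
        for i, value in enumerate(project(points, u)):
            at_or_below = bisect_right(ref_proj, value)      # reference projections <= value
            at_or_above = n - bisect_left(ref_proj, value)   # reference projections >= value
            depths[i] = min(depths[i], at_or_below / n, at_or_above / n)
    return depths


def depth_overlap(reference: Sequence[Point], new: Sequence[Point],
                  n_dirs: int = 500, seed: int = 0) -> float:
    """Proportion of new points at least as deep as the reference's 95% cutoff."""
    rng = random.Random(seed)
    directions = random_directions(len(reference[0]), n_dirs, rng)
    ref_depths = sorted(approx_halfspace_depth(reference, reference, directions))
    # Assumption: cutoff chosen so that about 95% of reference points are as deep or deeper.
    cutoff = ref_depths[math.floor(0.05 * len(ref_depths))]
    new_depths = approx_halfspace_depth(new, reference, directions)
    return sum(d >= cutoff for d in new_depths) / len(new_depths)
```

Because only a finite set of directions is searched, the approximate depth can overestimate the true half-space depth; using more directions tightens the approximation at the cost of run time.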

**Gait distribution comparisons and summary statistics.** To quantify similarities between in-lab and real-world gait patterns, the gait pattern distribution comparisons listed in Table 1 were made using both the univariate and depth analysis approaches outlined above. The comparisons listed in Table 1 address the questions posed in the introduction, which examine the degree to which in-lab gait is representative of real-world gait for the same runner, a runner from the same population of participants, and a runner from a different population.

## 3. Results

#### 3.1. Participants, Recruitment, and Running Data Summary

#### 3.2. Univariate Analysis of Gait Pattern Overlap

**Analysis 1.** When using in-lab data from one subject as the reference distribution, real-world data from that same subject showed average overlap ranging from 65.7% to 95.2% across gait metrics, but with some subjects displaying overlap below 50% (Figure 6A). Average overlap was lowest for vertical oscillation (74.5%) and leg stiffness (65.7%), indicating a change in these gait metrics when a subject was running in the lab versus in the real world.

**Analysis 2.** When using “leave-one-subject-out” in-lab data from Cohort 1 as the reference distribution, real-world data were well-represented by the in-lab data. Overlap for speed (98.3%), step length (96.6%), vertical oscillation (92.4%), stance time (97.3%), and leg stiffness (91.3%) were all close to 95%, indicating strong overlap between distributions. However, one to five outlying subjects had distributional overlap below 50%, with some near zero (Figure 6B). These individuals had marked differences in their real-world step length, ground contact time, and leg stiffness, which drove this poor overlap.

**Analysis 3.** When using in-lab data from Cohort 1 as the reference distribution, real-world data from Cohort 2 were well-represented by the in-lab data for speed (95.9%), step length (94.7%), vertical oscillation (99.4%), and stance time (97.9%), but to a lesser extent for leg stiffness (88.3%), whose mean overlap was lowered by three subjects with lower distributional overlap (Figure 6C).

**Analysis 4.** When using real-world data from Cohort 1 as the reference distribution, real-world data from Cohort 2 were well-represented by Cohort 1's real-world data for speed (93.0%), step length (94.0%), vertical oscillation (99.6%), stance time (88.6%), and leg stiffness (95.0%), albeit with lower overlap for two subjects (Figure 6D).

**Stratification by flat and straight segments.** For all analyses, overlap changed by less than five percentage points when real-world data were restricted to flat, straight segments (red vs. blue points in Figure 6).

#### 3.3. Depth Analysis of Gait Pattern Overlap

**Analysis 1.** When using in-lab data from one subject as the reference distribution, real-world data from the same subject showed an average of 32.5% overlap (Figure 6A). For ten subjects, overlap was less than 10%.

**Analysis 2.** When using in-lab data from Cohort 1 as the reference distribution, real-world data from a new subject from Cohort 1 showed overlap of 89.5%. However, the confidence intervals for this average overlap excluded 95%, the overlap that would be expected for data drawn from the same underlying distribution, because the overlap criterion was defined as falling within the central 95% of the reference distribution (Figure 6B).

**Analysis 3.** When using in-lab data from Cohort 1 as the reference distribution, real-world data from Cohort 2 overlapped 90.3% with the in-lab data, with confidence intervals including 95%. One outlying individual had overlap near zero, caused by running slower and with shorter steps than most of the data in Cohort 1 (Figure 6C).

**Analysis 4.** When using real-world data from Cohort 1 as the reference distribution, real-world data from Cohort 2 showed overlap of 91.6%, with confidence intervals including 95%. One outlying individual had zero overlap; this individual ran slower and with shorter steps than any of the runners in Cohort 1 (Figure 6D).

**Stratification by flat and straight segments.** As with the univariate analysis, stratification of real-world data to include only flat, straight segments resulted in a change of less than five percentage points for all the analyses (red vs. blue points in Figure 6), indicating that the turns, inclines, and declines encountered during real-world running were not the primary driver of differences between in-lab and real-world gait patterns.

#### 3.4. Sensitivity Analysis of Gait Pattern Metric Choice

## 4. Discussion

#### 4.1. Distributional Shifts in Real-World Data

#### 4.2. Effects of Inclines, Declines, and Turns

#### 4.3. Gait Pattern Overlap across Different Populations of Runners

#### 4.4. Comparison with the Previous Literature

#### 4.5. Limitations

## 5. Conclusions

## Supplementary Materials

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

1. Ceyssens, L.; Vanelderen, R.; Barton, C.; Malliaras, P.; Dingenen, B. Biomechanical risk factors associated with running-related injuries: A systematic review. Sports Med. **2019**, 49, 1095–1115.
2. Vannatta, C.N.; Heinert, B.L.; Kernozek, T.W. Biomechanical risk factors for running-related injury differ by sample population: A systematic review and meta-analysis. Clin. Biomech. **2020**, 75, 104991.
3. Moore, I.S. Is there an economical running technique? A review of modifiable biomechanical factors affecting running economy. Sports Med. **2016**, 46, 793–807.
4. Willwacher, S.; Kurz, M.; Robbin, J.; Thelen, M.; Hamill, J.; Kelly, L.; Mai, P. Running-related biomechanical risk factors for overuse injuries in distance runners: A systematic review considering injury specificity and the potentials for future research. Sports Med. **2022**, 52, 1863–1877.
5. Van Hooren, B.; Fuller, J.T.; Buckley, J.D.; Miller, J.R.; Sewell, K.; Rao, G.; Barton, C.; Bishop, C.; Willy, R.W. Is motorized treadmill running biomechanically comparable to overground running? A systematic review and meta-analysis of cross-over studies. Sports Med. **2020**, 50, 785–813.
6. Lafferty, L.; Wawrzyniak, J.; Chambers, M.; Pagliarulo, T.; Berg, A.; Hawila, N.; Silvis, M. Clinical indoor running gait analysis may not approximate outdoor running gait based on novel drone technology. Sports Health **2022**, 14, 710–716.
7. Benson, L.C.; Clermont, C.A.; Ferber, R. New considerations for collecting biomechanical data using wearable sensors: The effect of different running environments. Front. Bioeng. Biotechnol. **2020**, 8, 86.
8. Nielsen, R.Ø.; Bertelsen, M.L.; Ramskov, D.; Damsted, C.; Brund, R.K.; Parner, E.T.; Sørensen, H.; Rasmussen, S.; Kjærgaard, S. The Garmin-RUNSAFE Running Health Study on the aetiology of running-related injuries: Rationale and design of an 18-month prospective cohort study including runners worldwide. BMJ Open **2019**, 9, e032627.
9. Benson, L.C.; Räisänen, A.M.; Clermont, C.A.; Ferber, R. Is this the real life, or is this just laboratory? A scoping review of IMU-based running gait analysis. Sensors **2022**, 22, 1722.
10. Taborri, J.; Keogh, J.; Kos, A.; Santuz, A.; Umek, A.; Urbanczyk, C.; van der Kruk, E.; Rossi, S. Sport biomechanics applications using inertial, force, and EMG sensors: A literature overview. Appl. Bionics Biomech. **2020**, 2020, 2041549.
11. Blickhan, R. The spring-mass model for running and hopping. J. Biomech. **1989**, 22, 1217–1227.
12. Matijevich, E.S.; Scott, L.R.; Volgyesi, P.; Derry, K.H.; Zelik, K.E. Combining wearable sensor signals, machine learning and biomechanics to estimate tibial bone force and damage during running. Hum. Mov. Sci. **2020**, 74, 102690.
13. Riebe, D.; Franklin, B.A.; Thompson, P.D.; Garber, C.E.; Whitfield, G.P.; Magal, M.; Pescatello, L.S. Updating ACSM’s recommendations for exercise preparticipation health screening. Med. Sci. Sports Exerc. **2015**, 47, 2473–2479.
14. Davis IV, J.J. Understanding Internal Biomechanical Loads during Running Using Wearable Sensors; Indiana University: Bloomington, IN, USA, 2023.
15. Davis, I.S.; Bowser, B.J.; Mullineaux, D.R. Greater vertical impact loading in female runners with medically diagnosed injuries: A prospective investigation. Br. J. Sports Med. **2016**, 50, 887–892.
16. Rauh, M.J.; Barrack, M.; Nichols, J.F. Associations between the female athlete triad and injury among high school runners. Int. J. Sports Phys. Ther. **2014**, 9, 948.
17. Nielsen, R.O.; Cederholm, P.; Buist, I.; Sørensen, H.; Lind, M.; Rasmussen, S. Can GPS be used to detect deleterious progression in training volume among runners? J. Strength. Cond. Res. **2013**, 27, 1471–1478.
18. Navalta, J.W.; Montes, J.; Bodell, N.G.; Aguilar, C.D.; Radzak, K.; Manning, J.W.; DeBeliso, M. Reliability of trail walking and running tasks using the Stryd power meter. Int. J. Sports Med. **2019**, 40, 498–502.
19. Imbach, F.; Candau, R.; Chailan, R.; Perrey, S. Validity of the Stryd Power Meter in Measuring Running Parameters at Submaximal Speeds. Sports **2020**, 8, 103.
20. Andersen, C.; Skovsgaard, N. Reliability and validity of Garmin Forerunner 735XT for measuring running dynamics in-field. Sport. Technol. Thesis, Aalborg Universitet, Aalborg, Denmark, **2017**.
21. Adams, D.; Pozzi, F.; Carroll, A.; Rombach, A.; Zeni Jr, J. Validity and reliability of a commercial fitness watch for measuring running dynamics. J. Orthop. Sports Phys. Ther. **2016**, 46, 471–476.
22. Mercer, J.A.; Bezodis, N.E.; Russell, M.; Purdy, A.; DeLion, D. Kinetic consequences of constraining running behavior. J. Sci. Med. Sport. **2005**, 4, 144.
23. Zandbergen, M.A.; Buurke, J.H.; Veltink, P.H.; Reenalda, J. Quantifying and correcting for speed and stride frequency effects on running mechanics in fatiguing outdoor running. Front. Sports Act. Living **2023**, 5, 1085513.
24. Davis, J.J., IV; Gruber, A.H. Leg Stiffness, Joint Stiffness, and Running-Related Injury: Evidence From a Prospective Cohort Study. Orthop. J. Sports Med. **2021**, 9, 23259671211011213.
25. Diedrich, F.J.; Warren, W.H., Jr. Why change gaits? Dynamics of the walk-run transition. J. Exp. Psychol. Hum. Percept. Perform. **1995**, 21, 183.
26. Daniels, J. Daniels’ Running Formula; Human Kinetics: Champaign, IL, USA, 2013.
27. Benson, L.C.; Ahamed, N.U.; Kobsar, D.; Ferber, R. New considerations for collecting biomechanical data using wearable sensors: Number of level runs to define a stable running pattern with a single IMU. J. Biomech. **2019**, 85, 187–192.
28. Rowe, D.; Welk, G.; Heil, D.; Mahar, M.; Kemble, C.; Calabro, M.; Camenisch, K. Stride rate recommendations for moderate-intensity walking. Med. Sci. Sports Exerc. **2011**, 43, 312–318.
29. Mosler, K.; Mozharovskyi, P. Choosing among notions of multivariate depth statistics. Stat. Sci. **2022**, 37, 348–368.
30. Davison, A.C.; Hinkley, D.V. Bootstrap Methods and Their Application; Cambridge University Press: Cambridge, UK, 1997.
31. Lange, T.; Mosler, K.; Mozharovskyi, P. Fast nonparametric classification based on data depth. Stat. Papers **2014**, 55, 49–69.
32. Cuesta-Albertos, J.A.; Nieto-Reyes, A. The random Tukey depth. Comput. Stat. Data Anal. **2008**, 52, 4979–4988.
33. Hastie, T.J.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: Berlin/Heidelberg, Germany, 2009.
34. Dillon, S.; Burke, A.; Whyte, E.F.; O’Connor, S.; Gore, S.; Moran, K.A. Are impact accelerations during treadmill running representative of those produced overground? Gait Posture **2022**, 98, 195–202.
35. Milner, C.E.; Hawkins, J.L.; Aubol, K.G. Tibial Acceleration during Running Is Higher in Field Testing Than Indoor Testing. Med. Sci. Sports Exerc. **2020**, 52, 1361–1366.
36. Hong, Y.; Wang, L.; Li, J.X.; Zhou, J.H. Comparison of plantar loads during treadmill and overground running. J. Sci. Med. Sport. **2012**, 15, 554–560.
37. García-Pérez, J.A.; Pérez-Soriano, P.; Llana Belloch, S.; Lucas-Cuevas, Á.G.; Sánchez-Zuriaga, D. Effects of treadmill running and fatigue on impact acceleration in distance running. Sports Biomech. **2014**, 13, 259–266.
38. Dixon, P.; Schütte, K.; Vanwanseele, B.; Jacobs, J.; Dennerlein, J.; Schiffman, J.; Fournier, P.; Hu, B. Machine learning algorithms can classify outdoor terrain types during running using accelerometry data. Gait Posture **2019**, 74, 176–181.
39. Uhlrich, S.D.; Falisse, A.; Kidziński, Ł.; Muccini, J.; Ko, M.; Chaudhari, A.S.; Hicks, J.L.; Delp, S.L. OpenCap: 3D human movement dynamics from smartphone videos. PLoS Comput. Biol. **2023**, 19, e1011462.
40. Dorschky, E.; Nitschke, M.; Seifer, A.-K.; van den Bogert, A.J.; Eskofier, B.M. Estimation of gait kinematics and kinetics from inertial sensor data using optimal control of musculoskeletal models. J. Biomech. **2019**, 95, 109278.
41. Pearl, O.; Shin, S.; Godura, A.; Bergbreiter, S.; Halilaj, E. Fusion of Video and Inertial Sensing Data via Dynamic Optimization of a Biomechanical Model. J. Biomech. **2023**, 155, 111617.
42. Brund, R.B.; Waagepetersen, R.; O Nielsen, R.; Rasmussen, J.; Nielsen, M.S.; Andersen, C.H.; de Zee, M. How Precisely Can Easily Accessible Variables Predict Achilles and Patellar Tendon Forces during Running? Sensors **2021**, 21, 7418.

**Figure 1.** Route for the 2.4 km measured course run completed by participants in Cohort 2. X- and Y-axes represent longitude and latitude. The route consisted of one counter-clockwise loop, an out-and-back segment on the incline/decline, and one clockwise loop. The true route as determined using mapping software (OpenStreetMap) is shown in dark blue; actual global navigation satellite system (GNSS) data recorded by the participants are shown in gray. Each gray line represents the course run completed by one subject. Shaded and labeled boxes indicate areas known to contain flat and straight running, left turns, right turns, inclines, and declines. Black line shows 100 m to scale.

**Figure 2.** Illustration of turn rate data from one participant during a portion of the known course run. At each global navigation satellite system (GNSS) location sample, the runner’s current heading is represented by a directional arrow. The central difference derivative of this heading is the runner’s current turn rate, in degrees per second; this turn rate is illustrated by the color of each heading arrow. In this case, the turn rate is positive, indicative of turning to the left (following the right-hand rule convention for vectors). For ease of visualization, the GNSS data have been downsampled by a factor of four in this figure.

**Figure 3.** Illustration of applying threshold values to identify flat and straight running on segments of a known course run with turns, inclines, declines, and flat ground. (**A**) Subject-by-subject distribution of calculated incline/decline data from the known course run in Cohort 2 on known segments of downhill, flat and straight ground, and uphill running. Dashed lines show ±2.28% grade, the empirically-determined cut-off that retains >99% of running on the known flat, straight segment. (**B**) Subject-by-subject turn rates for the left turn, flat and straight, and right turn segments of the known course run in Cohort 2. Dashed lines show ±6.34 deg/s, the empirically-determined cut-off that retains >99% of running on flat, straight ground. For both the incline/decline and turn rate data, the known segments are clearly and reliably identified across subjects. Each line shows the trajectory of one participant from Cohort 2.
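The turn rate calculation described in Figure 2 and the thresholds shown in Figure 3 can be combined into a short Python sketch. The ±2.28% grade and ±6.34 deg/s cut-offs are taken from the caption above; the function names, the fixed sampling interval `dt_s`, the one-sided differences at the two endpoints, and the heading unwrapping step are assumptions made for illustration.

```python
from typing import List, Sequence

GRADE_LIMIT_PCT = 2.28        # cut-off from Figure 3A
TURN_RATE_LIMIT_DEG_S = 6.34  # cut-off from Figure 3B


def unwrap_degrees(headings_deg: Sequence[float]) -> List[float]:
    """Remove 360-degree jumps so consecutive headings differ by less than 180 degrees."""
    unwrapped = [headings_deg[0]]
    for h in headings_deg[1:]:
        delta = (h - unwrapped[-1] + 180.0) % 360.0 - 180.0
        unwrapped.append(unwrapped[-1] + delta)
    return unwrapped


def turn_rate(headings_deg: Sequence[float], dt_s: float) -> List[float]:
    """Central-difference derivative of heading, in degrees per second."""
    if len(headings_deg) < 2:
        raise ValueError("need at least two heading samples")
    h = unwrap_degrees(headings_deg)
    last = len(h) - 1
    rates = []
    for i in range(len(h)):
        lo, hi = max(i - 1, 0), min(i + 1, last)  # one-sided at the endpoints
        rates.append((h[hi] - h[lo]) / ((hi - lo) * dt_s))
    return rates


def flat_straight_mask(grade_pct: Sequence[float],
                       turn_rates_deg_s: Sequence[float]) -> List[bool]:
    """True where a sample is within both the grade and turn rate cut-offs."""
    return [abs(g) <= GRADE_LIMIT_PCT and abs(r) <= TURN_RATE_LIMIT_DEG_S
            for g, r in zip(grade_pct, turn_rates_deg_s)]
```

With headings measured counterclockwise, a steady left turn such as `[350, 355, 0, 5, 10]` sampled once per second gives a turn rate of +5 deg/s throughout, despite the wrap from 355 to 0 degrees.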

**Figure 4.** Example of univariate analysis applied to each of the five gait metrics from in-lab and real-world data from the same runner. The real-world data are represented by the green data points on the top portion of each plot panel, and the in-lab data are represented by the blue points on the bottom portion of each panel. Overlap is quantified as the proportion of real-world data that fall within the central 95% of the in-lab data (illustrated by the blue shaded region).

**Figure 5.** Illustrative example of half-space depth calculation for data in two dimensions ($K=2$). Given a reference distribution as a point cloud $D$ (shown in panel (**A**)), the depth value of any point $d$ (highlighted in red) relative to the reference distribution can be calculated as the minimum proportion of the reference data that can be “sliced off” by a half-space (in 2D, a line) that contains $d$ (shown in panel (**B**)). In the example here, very few data points are sliced off by the half-space, and as such, the point $d$ has a low depth value associated with it. In comparison, the points at the center of the point cloud (yellow) have higher depth values. The depth for each point in a new distribution $D^{*}$ can also be calculated with respect to the original reference distribution $D$ (panel (**C**)). A point can be considered “well-represented” if it falls within the 95% convex hull of the reference distribution $D$. This convex hull is shown as the green shell in panel (**D**). The 95% convex hull contains the deepest 95% of the data from the reference distribution $D$. Though this figure illustrates depth using data in only two dimensions, the same methods can be applied to data in higher dimensions as well.

**Figure 6.** Distribution overlap results for each of the four analyses, using both the univariate analysis and the depth analysis. Each point represents data from one subject; black lines in crossbars represent the mean across subjects, and shaded regions in the crossbars represent bootstrapped 95% confidence intervals. Left panels show univariate analysis considering each gait metric separately; right panel shows depth analysis, which considers all gait metrics jointly. Dashed line shows the expected amount of overlap at the 95% confidence level (i.e., 95% overlap). Panel (**A**) shows overlap between one runner’s in-lab data (reference distribution) and that same runner’s real-world data (new distribution). Panel (**B**) shows overlap between in-lab data from all subjects and real-world data from a new runner from the same population (Cohort 1). Panel (**C**) shows overlap between in-lab data from all subjects and real-world data from a new runner from a new population (Cohort 2). Panel (**D**) shows overlap between real-world data from one population (Cohort 1) and real-world data from another population (Cohort 2). For all analyses, stratifying real-world data to only include running on flat, straight segments resulted in <5% change in distributional overlap (blue vs. red points and shaded crossbars).
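The crossbars in Figure 6 report bootstrapped 95% confidence intervals for the mean overlap across subjects. As an illustration, one common variant, the percentile bootstrap, can be sketched in Python as follows. The caption does not state which bootstrap variant or how many resamples were used, so the percentile method, the default of 2000 resamples, and the function name are assumptions.

```python
import random
from statistics import mean
from typing import Sequence, Tuple


def bootstrap_mean_ci(values: Sequence[float],
                      n_boot: int = 2000,
                      level: float = 0.95,
                      seed: int = 0) -> Tuple[float, float, float]:
    """Return (mean, lower, upper) using a percentile bootstrap over subjects."""
    rng = random.Random(seed)
    n = len(values)
    # Resample subjects with replacement and record the mean of each resample.
    boot_means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_boot))
    alpha = (1.0 - level) / 2.0
    lower = boot_means[int(alpha * n_boot)]
    upper = boot_means[max(int((1.0 - alpha) * n_boot) - 1, 0)]
    return mean(values), lower, upper
```

Here each element of `values` would be one subject's overlap proportion, so the interval reflects between-subject variability in overlap rather than variability within a run.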

**Figure 7.** Sensitivity analysis of in-lab versus real-world data. Analysis shows effects of iteratively reducing the number of gait metrics used to represent the runner’s gait pattern. Black bar shows mean overlap across subjects, and shaded box shows bootstrapped 95% confidence interval for the mean. The left-most column ($K=5$) is the original analysis (Analysis 1). This analysis shows that leg stiffness and vertical oscillation drive some of the poor overlap between in-lab and real-world running, but even when gait pattern is characterized only by speed, step length, and stance time, mean overlap remains below 50%. The high overlap when considering only speed and step length ($K=2$) indicates that a change in speed–step length strategy (i.e., how runners modulate their cadence and step length to achieve a given speed) cannot explain the poor overlap between in-lab and real-world running.

**Figure 8.** Visualization of the five-dimensional gait pattern for one subject in Cohort 1, projected to two dimensions using multidimensional scaling, a dimensionality reduction technique [33]. Panels (**A**–**C**) show the in-lab (red) and real-world (blue) gait pattern data for this runner. Some distributional overlap exists, but this runner’s real-world gait pattern shows a clear distributional shift away from the in-lab distribution. When using the in-lab data as a reference distribution for depth analysis (panel (**D**)), only 30.1% of this runner’s real-world data (panel (**E**)) fall within the 95% depth threshold of the in-lab data distribution. For this runner, over half of the real-world data have a depth of zero (shown as gray data points in panel (**E**)), meaning they are completely outside of the distribution of gait patterns seen during in-lab running.

| Analysis | Comparison | Reference Distribution | New Data Distribution |
|---|---|---|---|
| Analysis 1 | In-lab vs. real-world running (same runner) | Cohort 1 in-lab data from one subject | Cohort 1 real-world data from same subject |
| Analysis 2 | In-lab vs. real-world running (new runner, same population) | Cohort 1 in-lab data from all but one subject | Cohort 1 real-world data from one left-out subject |
| Analysis 3 | In-lab vs. real-world running (new runner, new population and location) | Cohort 1 in-lab data from all subjects | Cohort 2 real-world data from all subjects |
| Analysis 4 | Real-world running in one population vs. real-world running in new population and location | Cohort 1 real-world data from all subjects | Cohort 2 real-world data from all subjects |
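The leave-one-subject-out pairing used for Analysis 2 in the table above can be sketched in Python as a small generator. The dictionary layout (subject identifier mapped to a list of gait observations) and the function name are assumptions about how the data might be organized, not a description of the study's code.

```python
from typing import Dict, Iterator, List, Tuple, TypeVar

Row = TypeVar("Row")


def leave_one_subject_out(
    in_lab: Dict[str, List[Row]],
    real_world: Dict[str, List[Row]],
) -> Iterator[Tuple[str, List[Row], List[Row]]]:
    """Yield (subject, pooled in-lab data from all other subjects, that subject's real-world data)."""
    for held_out, new_data in real_world.items():
        # Reference distribution: in-lab data from every subject except the held-out one.
        reference = [row
                     for subject, rows in in_lab.items()
                     if subject != held_out
                     for row in rows]
        yield held_out, reference, new_data
```

Each yielded pair can then be passed to whichever overlap measure is in use, giving one overlap value per held-out subject.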

| | Min | 1st Quartile | Median | 3rd Quartile | Max |
|---|---|---|---|---|---|
| **Cohort 1 (N = 49 participants)** | | | | | |
| Age (years) | 18 | 24 | 29 | 35 | 58 |
| Height (m) | 1.54 | 1.68 | 1.74 | 1.81 | 1.93 |
| Mass (kg) | 43 | 60.9 | 66.7 | 79.2 | 110.5 |
| Body Mass Index (kg/m²) | 17.49 | 20.51 | 22.3 | 24.35 | 31.6 |
| Training volume (km/wk) | 16.09 | 24.14 | 40.23 | 51.5 | 88.51 |
| Experience (years) | 2 | 7 | 10 | 15 | 47 |
| Sex | 25 M, 24 F | | | | |
| Category (self-reported) | Novice: 0, Recreational: 29, Competitive: 18, Elite: 2 | | | | |
| **Cohort 2 (N = 19 participants)** | | | | | |
| Age (years) | 18 | 19 | 20 | 21 | 32 |
| Height (m) | 1.52 | 1.59 | 1.66 | 1.69 | 1.77 |
| Mass (kg) | 47 | 53.35 | 57.3 | 63.7 | 80.8 |
| Body Mass Index (kg/m²) | 18.35 | 20.74 | 22.05 | 22.64 | 28.09 |
| Training volume (km/wk) | 12.87 | 21.73 | 32.19 | 41.04 | 112.65 |
| Experience (years) | 2 | 6 | 7 | 9 | 18 |
| Sex | 0 M, 19 F | | | | |
| Category (self-reported) | Novice: 0, Recreational: 14, Competitive: 4, Elite: 1 | | | | |


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Davis, J.J., IV; Meardon, S.A.; Brown, A.W.; Raglin, J.S.; Harezlak, J.; Gruber, A.H.
Are Gait Patterns during In-Lab Running Representative of Gait Patterns during Real-World Training? An Experimental Study. *Sensors* **2024**, *24*, 2892.
https://doi.org/10.3390/s24092892
