Estimation of Kinetics Using IMUs to Monitor and Aid in Clinical Decision-Making during ACL Rehabilitation: A Systematic Review

After an ACL injury, rehabilitation consists of multiple phases, and progress between these phases is guided by subjective visual assessments of activities such as running, hopping, jump landing, etc. Estimation of objective kinetic measures like knee joint moments and GRF during assessment can help physiotherapists gain insights on knee loading and tailor rehabilitation protocols. Conventional methods deployed to estimate kinetics require complex, expensive systems and are limited to laboratory settings. Alternatively, multiple algorithms have been proposed in the literature to estimate kinetics from kinematics measured using only IMUs. However, the knowledge about their accuracy and generalizability for patient populations is still limited. Therefore, this article aims to identify the available algorithms for the estimation of kinetic parameters using kinematics measured only from IMUs and to evaluate their applicability in ACL rehabilitation through a comprehensive systematic review. The papers identified through the search were categorized based on the modelling techniques and kinetic parameters of interest, and subsequently compared based on the accuracies achieved and applicability for ACL patients during rehabilitation. IMUs have exhibited potential in estimating kinetic parameters with good accuracy, particularly for sagittal movements in healthy cohorts. However, several shortcomings were identified and future directions for improvement have been proposed, including extension of proposed algorithms to accommodate multiplanar movements and validation of the proposed techniques in diverse patient populations and in particular the ACL population.


Introduction
The anterior cruciate ligament (ACL) is one of the major stabilizing ligaments of the knee [1]. An ACL injury occurs when this ligament is stretched or torn and accounts for 64% of athletic knee injuries [1,2]. It commonly occurs in highly demanding sports such as soccer, volleyball, skiing, basketball, and football, as they involve cutting, pivoting, and jumping actions [2]. Based on the complexity of the injury, patients are recommended either conservative treatment with rehabilitation or reconstruction surgery. Regardless of the treatment decision, all these patients undergo rehabilitation. Rehabilitation after an ACL injury takes 9-12 months and usually consists of multiple phases based on the severity of the injury and end objective [3]. These phases include various dynamic exercises and tasks with different focuses like muscle strengthening, balance improvement, return to sports training, and training for prevention of re-injury [3][4][5].
In everyday practice, physiotherapists assess the functional status of the patient through visual observations to devise a rehabilitation plan and to make phase transition decisions. Since the treatment is currently based on subjective visual observations during clinical visits, there is huge potential to further optimize the training of patients using quantitative assessment of relevant biomechanical parameters.
An initial literature review showed that knee moments and patellofemoral joint forces are important biomechanical parameters to be assessed during the ACL rehabilitation process [6]. In addition to measuring knee moments, ankle and hip moments may also be useful in designing training programs for athletes to prevent ACL injuries/reinjuries and to facilitate the return to sports programs [7]. However, physiotherapists cannot directly measure these parameters. This makes it important to use additional measurement tools during assessment by physiotherapists.
A traditional approach to computing joint kinetics is using a clinical gait lab with optical marker systems and force plates, which is also considered the gold standard. Net joint moments can be computed with the traditional bottom-up inverse dynamics method using measured ground reaction forces (GRF) and ground reaction moments (GRM). GRF and GRM can be measured using force plates and instrumented treadmills [8][9][10][11][12]. However, these systems have several disadvantages: they are expensive, require extensive time for patient preparation, have bulky setups, and involve extensive manual post-processing. Systems such as instrumented treadmills that measure GRF may also alter the natural pattern of gait [13,14]. Force plates, on the other hand, restrict measurements inside the lab to a few steps, making them not ideal for sports monitoring [15]. Another alternative for measuring ground reaction forces outside the lab setting is instrumented insoles. These insoles measure foot pressure and thereby estimate GRF [16,17]. However, they face technical challenges such as difficulty in accounting for frictional forces [18]. These insoles also have not been evaluated for their durability, especially for sports-like conditions, and measure only 1D GRF, while 3D GRF is important for monitoring sideward movements such as those performed during ACL rehabilitation. Since ACL injuries predominantly occur in athletes, it is essential to monitor exercise/sports movements and to allow for measurements in remote settings, which makes conventional measurement systems unsuitable.
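The bottom-up inverse dynamics step mentioned above can be illustrated for the most distal segment (the foot). The following is a minimal sketch under stated assumptions, not a method taken from any reviewed article: the function name `foot_inverse_dynamics`, its parameters, and the rigid-segment inputs are hypothetical, all vectors are assumed to be expressed in one global frame, and the rate of change of angular momentum is assumed to be supplied by the caller.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def cross(a: Vec3, b: Vec3) -> Vec3:
    """Right-handed cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def foot_inverse_dynamics(mass: float, com_acc: Vec3, ang_momentum_rate: Vec3,
                          grf: Vec3, grm: Vec3, r_cop: Vec3, r_ankle: Vec3,
                          gravity: Vec3 = (0.0, 0.0, -9.81)) -> Tuple[Vec3, Vec3]:
    """Net ankle force and moment acting on the foot (hypothetical helper).

    r_cop and r_ankle point from the foot center of mass to the center of
    pressure and to the ankle joint. ang_momentum_rate is the rate of change
    of angular momentum of the foot, I*alpha + omega x (I*omega).
    """
    # Newton: m*a = F_ankle + GRF + m*g  ->  F_ankle = m*(a - g) - GRF
    f_ankle = tuple(mass * (a - g) - f
                    for a, g, f in zip(com_acc, gravity, grf))
    # Euler about the center of mass:
    # dH/dt = M_ankle + GRM + r_cop x GRF + r_ankle x F_ankle
    m_from_grf = cross(r_cop, grf)
    m_from_joint_force = cross(r_ankle, f_ankle)
    m_ankle = tuple(h - t - c - j for h, t, c, j in
                    zip(ang_momentum_rate, grm, m_from_grf, m_from_joint_force))
    return f_ankle, m_ankle
```

For the next segment up the chain (the shank), the negated ankle force and moment would take the place of the GRF and GRM, which is what makes the procedure "bottom-up".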
Inertial measurement units (IMUs) offer an alternative to optical marker systems for quantitative gait analysis. They are increasingly used in research, as they are small, portable, and suitable for outdoor and sports monitoring [19,20]. Their validity for kinematic gait analysis has been well studied and compared to optical reference systems [21]. Several algorithms also exist in the literature to estimate GRF and GRM, net joint moments, and other kinetic parameters using the measured kinematics from IMUs. However, their reliability is not well compared or studied. The algorithms in the literature can be grouped into three categories.
1. Algorithms that estimate only GRF and GRM;
2. Algorithms that use a two-step approach, where GRF and GRM are estimated first and then the predicted GRF and GRM are used to estimate other kinetic parameters like net joint moments;
3. Algorithms that apply new approaches to directly estimate net joint moments and/or net joint forces.
Ancillao et al. [22] conducted a systematic review on the estimation of GRFs and GRMs using IMUs. The primary objective of that review was to identify and discuss methods to estimate GRF and GRM from IMU data. Algorithms in the first two categories published up to the year 2018 were included in that review. Mundt et al. [23] in 2021 partly investigated the third category and compared three neural network (NN) approaches to estimate joint moments from IMUs for level-walking data: multilayer perceptron (MLP) networks, convolutional neural networks (CNNs), and recurrent neural networks such as long short-term memory (LSTM) networks.
Currently, to the best of the authors' knowledge, there are no reviews that identified algorithms in all three categories and evaluated them for their applicability in clinical decision-making for ACL rehabilitation. Thus, a systematic review that compares and evaluates available algorithms for estimation of kinetic parameters (joint kinetics, GRF, and GRM) using only IMU data will provide insights on the state of the art regarding the accuracy, reliability, and applicability of the available algorithms. In addition, it will help to identify the gaps and opportunities for further research and open new avenues for clinical decision-making for ACL rehabilitation and for other conditions. This might also be beneficial for other sports-related rehabilitation and injury prevention training. Therefore, the objective of this systematic review is to identify and discuss algorithms for the estimation of lower limb kinetic parameters using only IMUs without additional force information. To achieve this objective, the following research questions were formulated.

1. What are the available algorithms to estimate GRF and GRM using only IMU data as input and then use the predicted GRF and GRM to estimate lower limb joint kinetics?
2. What are the available algorithms to directly estimate lower limb joint kinetics using only IMU data as input?
3. What is the accuracy, reliability, and applicability of the identified algorithms for pathological gait and ACL-related tasks?

Study Design
This systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement (Supplementary Table S1) [24]. The steps included the formation of inclusion-exclusion criteria, a pre-planned data analysis and extraction pipeline, and pre-defined quality assessment metrics. The methodology was registered in advance at the international database of systematic reviews PROSPERO (CRD42022304911) [25].

Search Strategy
The databases used for this review were PubMed, Scopus (Elsevier), and SPORTDiscus (EBSCOhost). Three keyword groups were identified, and commonly accepted variations of the keywords were synthesized. A search query for each database was formulated using the Boolean operator "AND" between the groups and "OR" between the terms within the same group. Multiple successive searches were performed based on an expanded search query from previous search results. The finalized full search query is available in Appendix A (Table A1). The identified keyword groups were Group 1: "lower extremity" AND "kinetics" OR "ACL", Group 2: "inertial measurement unit", and Group 3: "algorithms".
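The query construction described above (OR within a group, AND between groups) can be sketched in a few lines. This is only an illustration: the helper name `build_query` is hypothetical, the keyword variations shown are invented placeholders, and the actual query is the one given in Appendix A (Table A1). Database-specific field tags are not modelled.

```python
from typing import List, Sequence


def build_query(groups: Sequence[Sequence[str]]) -> str:
    """Join terms within a group with OR and the groups with AND."""
    clauses: List[str] = []
    for terms in groups:
        quoted = ['"{}"'.format(term) for term in terms]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


# Placeholder variations, NOT the terms used in the review.
example_groups = [
    ["lower extremity", "lower limb"],
    ["inertial measurement unit", "IMU"],
    ["algorithms", "estimation"],
]
example_query = build_query(example_groups)
```

Keeping the groups as data makes it easy to expand a group with new variations between successive searches and regenerate the full query.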

Inclusion/Exclusion Criteria
All original studies that used kinematic data from IMUs or simulated inertial data were included. Reviews, comparison studies that did not propose any new method or algorithms, abstracts, and veterinary studies were excluded. Studies were also excluded when they utilized force platforms, insoles, or instrumented shoes as external force input or electromyography (EMG) data as input for algorithms to estimate kinetic parameters. Studies were included only when the estimated kinetic parameters were compared to an existing reference system with known performance characteristics.

Study Selection and Quality Assessment
Duplicate articles were identified and removed using the reference manager EndNote 20.2.1 and the Rayyan web application (Rayyan System Inc., Cambridge, MA, USA), resulting in 4444 articles. The identified articles were then loaded into ASReview LAB 0.19 for screening (S.K.). ASReview is an artificial intelligence tool that helps with identifying relevant papers. It requires a set of papers that will be included in the final review as a prior knowledge set and as a validation set. The tool then trains an active learning model based on the input data and presents papers in order of relevance for the reviewer to perform title and abstract screening. The tool also retrains the model based on every subsequent user decision [26]. The screening was continued until 150 consecutive papers were marked irrelevant (chosen stop criterion). The title and abstract screening in ASReview (S.K.) involved a total of 1082 articles and resulted in 187 relevant papers. To mitigate potential bias introduced by the first reviewer's selections, 20% of the articles from the initial 893 excluded papers (1082-187) were added to Rayyan along with the 187 papers for a consecutive title and abstract screening. Double-blinded abstract screening (S.K. and B.-J.F.v.B.) of the identified articles was then performed using Rayyan. The inter-rater conflict was 19%, and the final decision of inclusion/exclusion was based on discussion between the reviewers until 100% agreement was reached. After abstract screening, 75 papers were included for full-text screening. All articles obtained from the update search were manually screened (S.K.), and 9 articles were included. Full-text screening was performed by the first reviewer (S.K.), and 71 articles were included for quantitative data synthesis. The process pipeline for article screening and selection is provided in Figure 1.
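The stop criterion used during screening (stop once a fixed number of consecutive records have been marked irrelevant) can be sketched as follows. This is a simplified illustration and not ASReview code: the function `screen_until_stop` is hypothetical, and the re-ranking by the active learning model between decisions is deliberately left out, so the input is simply the sequence of reviewer decisions in the order the records were shown.

```python
from typing import Iterable, Tuple


def screen_until_stop(decisions: Iterable[bool],
                      stop_after: int = 150) -> Tuple[int, int]:
    """Apply a consecutive-irrelevant stop rule to reviewer decisions.

    decisions yields True for "relevant" and False for "irrelevant".
    Returns (number of records screened, number marked relevant).
    """
    screened = 0
    relevant = 0
    consecutive_irrelevant = 0
    for is_relevant in decisions:
        screened += 1
        if is_relevant:
            relevant += 1
            consecutive_irrelevant = 0  # any relevant record resets the run
        else:
            consecutive_irrelevant += 1
        if consecutive_irrelevant >= stop_after:
            break
    return screened, relevant
```

With `stop_after=150`, screening ends at the 150th irrelevant record in a row, so the records screened always include that final run of irrelevant decisions.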
Quality assessment of the included studies was performed (S.K.) using a 14-point checklist comprising the items listed in Table 1. Items 1, 2, 4, 6, 7, 8, 10, 11, 12, and 14 were directly replicated from Strom et al. [27]. Items 3 and 13 were modified and adapted from the same table to better align with the specific focus of this systematic review. Additionally, we added items 5 and 9 to address additional aspects relevant to our quality assessment. For detailed assessments and scores of all included studies, including explanations and descriptions of each item, please refer to Supplementary Table S2.
Sensors 2024, 24, x FOR PEER REVIEW

Data Extraction
The data extraction was performed by the first reviewer (S.K.) using ATLAS.ti 23 for Windows. The following information was extracted to answer the research questions and sub-questions. General information on publication type, year, and author information was collected to gain an overview of all the papers. Information on study settings, methods, sensors used, population evaluated (healthy/pathological), sample size, and types of movements studied was extracted to understand the input requirements for the algorithms, identify gaps, and discuss the applicability of the algorithms. The type of algorithms proposed, including other information about the algorithms such as assumptions and processing steps, was also extracted. Estimated outcome parameters and their accuracy metrics in comparison to the reference system used were extracted. To ensure precision and accuracy in our review, we refrained from extracting information directly from graphical representations such as error plots. Instead, we focused on obtaining numerical values when explicitly provided in the text or abstract of the source papers.

Publication Year
The included articles were grouped based on year of publication and are depicted in Figure 2. The first publication found was from the year 1996 [28]. The highest number of included articles was published in 2020, with 16 publications. The year 2023 in Figure 2 only includes articles published until 15 April 2023.

Participant Characteristics
The validation of algorithms identified in the included papers was predominantly performed on healthy young adults and especially athletic cohorts (Figure 3). Only four articles (∼6%) validated their algorithms on people with pathological gait [29][30][31][32]. Stetter et al. evaluated their algorithm on healthy volunteers with bowlegs, which was suggested to mimic the varus misalignment in patients with knee osteoarthritis [33]. The female population was underrepresented in 55 articles (∼80%), while 6 articles (∼8%) [30,32,[34][35][36][37] did not report complete information on the gender distribution of the study population.

Measurement Systems and Sensor Placement
Most of the included studies used tri-axial gyroscopes and/or tri-axial accelerometers; biaxial and uniaxial accelerometers were also used in a few studies [36,38,39]. Information on the sensor systems used, number of on-body sensors, and sensor characteristics is listed in Table 2. Not all studies reported the specifications of the sensor used; therefore, additional information was obtained through reference tracking and from the manufacturer website and is included in Table 2.

Types of Algorithms and Estimated Parameters of Interest
Based on the approach taken, several types of models used to estimate kinetics can be identified. The four main types we identified are (i) biomechanical models, (ii) musculoskeletal models, (iii) statistical models, and (iv) machine learning models.
Biomechanical (BM) models in this context refer to algorithms that use kinematics measured from IMUs along with inertial properties of (rigid) body segments and their biomechanical relationships to compute kinetics by applying Newton-Euler motion equations. However, these equations become indeterminate during double support, when the distribution of load between the feet is unknown. Additional functions like the smooth transition approach were introduced to deal with this problem [79], but some articles only focused on estimating GRF during single support.
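The Newton part of such a BM model, and the indeterminacy during double support, can be sketched as follows. This is a minimal illustration under stated assumptions rather than any reviewed algorithm: the helper names are hypothetical, segment center-of-mass accelerations are assumed to be already expressed in a global frame with gravity removed, and the load-sharing input `trailing_share` is a placeholder for whatever distribution function a given method supplies. It is not the smooth transition function of [79].

```python
from typing import Iterable, Sequence, Tuple

Vec3 = Tuple[float, float, float]
GRAVITY: Vec3 = (0.0, 0.0, -9.81)  # global frame, z pointing up


def total_grf(segments: Iterable[Tuple[float, Sequence[float]]]) -> Vec3:
    """Total external force from Newton's second law summed over segments.

    segments yields (mass in kg, center-of-mass acceleration in m/s^2).
    Assumes the ground is the only external contact.
    """
    fx = fy = fz = 0.0
    for mass, acc in segments:
        fx += mass * (acc[0] - GRAVITY[0])
        fy += mass * (acc[1] - GRAVITY[1])
        fz += mass * (acc[2] - GRAVITY[2])
    return (fx, fy, fz)


def split_double_support(total: Vec3, trailing_share: float) -> Tuple[Vec3, Vec3]:
    """Distribute the total GRF over the leading and trailing foot.

    trailing_share (0..1) must come from an extra assumption, because the
    equations of motion only determine the sum of the two forces.
    """
    share = min(max(trailing_share, 0.0), 1.0)
    trailing = tuple(share * c for c in total)
    leading = tuple((1.0 - share) * c for c in total)
    return leading, trailing
```

During single support the total force is the force under the stance foot, which is why some articles restricted their estimates to that phase.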
Musculoskeletal (MS)-based modelling approaches extend biomechanical-based models that only include rigid segment modelling by including properties of muscles, bones, and ligaments. The combination of IMU-measured kinematics and MS modelling enables the estimation of muscle forces along with joint kinetics. The parameters estimated using MS modelling-based methods included GRF and GRM [78], GRF [71,81], joint reaction forces [76,78], and joint moments [76,78,81].
Statistical modelling (SM) relies on finding relationships between measured kinematics from IMUs and measured kinetics from the force plate to build prediction equations using regression-based techniques. Knowledge of basic biomechanical relationships between measured kinetics and kinematics is used to select inputs and identify relationships between them. These methods were built to estimate mainly peak vertical GRF (pVGRF), peak GRF (pGRF), peak loading rate, and GRF braking and propulsion point metrics from accelerometer data [14,31,39,40,48,57,90]. However, Chien et al. used backward regression analysis to estimate the full GRF profile [47].
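The regression idea behind SM can be illustrated with a one-predictor ordinary least squares fit. This is a sketch only: the function `fit_line` is a hypothetical helper, the data points are synthetic values invented for the example, and the reviewed studies typically used more predictors (for example body mass) and their own model forms.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y need the same length and at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic illustration: peak acceleration (g) vs. force-plate pVGRF (BW).
peak_acc_g = [1.8, 2.4, 3.1, 3.9]
pvgrf_bw = [1.9, 2.2, 2.6, 3.0]
slope, intercept = fit_line(peak_acc_g, pvgrf_bw)


def predict_pvgrf(acc_g: float) -> float:
    """Prediction equation built from the fitted coefficients."""
    return slope * acc_g + intercept
```

The fitted equation outputs a single discrete metric per step or landing, which matches the observation that SM was mostly used for peak values rather than continuous profiles.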

Discussion
The main objective and scope of this systematic review was to identify existing algorithms to estimate kinetic parameters using IMU data and to evaluate their accuracy, applicability, and reliability for ACL rehabilitation. A structured approach was taken to address the objectives, which started with a preliminary literature search to identify relevant kinetic parameters for ACL rehabilitation. Following this, a systematic literature search and data extraction and synthesis were conducted and reported. A significant increase in the number of papers was seen after 2017. We found that a significant number of new algorithms have been proposed since the last available systematic review related to estimation of ground reaction forces using IMUs in 2018, and therefore, this review provides several new insights for estimating GRFs and GRMs and adds new information on existing algorithms for estimating joint kinetics.

Modelling Techniques and Estimated Kinetic Parameters
ML-based models were the most common, accounting for around 45% of the reviewed articles, followed by BM (∼38%). All four types of proposed algorithms have their own advantages and disadvantages. The SM approach provides methods to estimate kinetic parameters and can also help with studying various interaction effects on the estimation of kinetic parameters. Authors who utilized SM often also studied the effects of age, body mass, and sex on the prediction of kinetics [39,40,48,90]. The use of SM approaches was primarily observed for the estimation of 1D GRF and peak GRF metrics. Additionally, some studies utilized SM approaches to estimate kinetic parameters derived from GRF, such as loading rate [14,48,57] and propulsion and braking metrics [31]. Although these parameters offer several useful insights, SM approaches were not used to estimate 3D continuous kinetic parameters for various dynamic activities. BM and MS are based on well-established principles of human anatomy and therefore also provide meaningful interpretations of the obtained metrics. However, BM is often sensitive to the input data, such as measured anthropometric data and generalized anatomical properties, which can affect the prediction outcomes. BMs are also built on multiple assumptions about specific gait patterns and the standard anthropometric characteristics of a healthy population, which makes their adaptability to movement types that were not studied and to patient groups questionable.
ML-based models, on the other hand, can handle complex relationships in data and can be trained on diverse datasets, making them more generalizable, but this often comes at the cost of requiring large, high-quality datasets for training. Such datasets can be challenging to obtain, especially for patient populations. Some of the reviewed articles adopted strategies for creating new inertial data from existing datasets to increase the size of their datasets [30,73,77,80]. According to Verheul et al. [67], ML-based techniques must be used with caution, as they are computational methods and thus may not account for and explain underlying physiological mechanisms. Although BMs consider physiological mechanisms, they still do not account for muscle forces or activations. MS-based models, on the other hand, account for muscle properties and forces and therefore may model physiological mechanisms more accurately. This property of MS-based models made it possible to estimate joint reaction forces [76,78].
Intersegmental forces and forces on joints were estimated only by BM and MS models, except for [68], where Stetter et al. used ML modelling (ANN) to estimate knee joint forces. Three-dimensional GRM was only estimated in three studies [78][79][80] using BM, ML, and MS models. The limited number of studies in the literature on the estimation of GRM highlights its potential for future research in this area. It is particularly important because GRM is an important input for understanding how GRF generates moments at a particular joint. The importance of GRM along with GRF as an input for the estimation of joint kinetics was also discussed by Karatsidis et al. [79].

Modelling Techniques, Tasks Studied, and Accuracies of Estimated Kinetic Parameters
In general, the large variety of accuracy metrics and their differences, along with the various activities studied, makes it challenging to assess the accuracies and compare the performance of the different proposed models in the reviewed articles. Sharma et al. [62] also discussed that it is difficult to compare the accuracies of various approaches from the literature due to differences in experimental methodologies (activities tested, conditions, and equipment used). In general, the highest accuracies achieved for estimation of kinetics were based on ML models, followed by BM models. It is important to note that the same algorithm, when validated for a particular activity at varying speeds, resulted in different accuracies. Activities performed at a comfortable pace resulted in the best performance of the reported models, while the errors noticeably increased for higher and lower speeds [36,41,44,49,51,53,67]. However, this trend was not found in [74,77], where fast walking resulted in a more accurate estimation than normal walking, although slow walking resulted in higher NRMSE and RMSE. This may be because of a decrease in the dynamic range of activity during slow movements. To better understand the effect of varying speeds on accuracies, future studies should first examine the effect of speed variations on IMU measurements at the kinematic level and compare the results to optical marker systems.
All articles that achieved the lowest RMSE/rRMSE for the estimation of 3D GRF, 3D GRM, pGRF, and pVGRF studied either walking or running, except Cerfoglio et al., who studied vertical drop jump [85]. It is also important to note that all these activities take place primarily in the sagittal plane and have the highest magnitude of GRF in the vertical direction, while M-LGRF and APGRF are of lower magnitude. Additionally, in most of the papers, M-LGRF and APGRFs were indicated to have lower RMSEs compared to VGRFs. However, the magnitudes of M-LGRF and APGRF are significantly lower than VGRF, and therefore, it is important to look at metrics that provide a normalized view of the errors considering the variability of data, such as rRMSE and nRMSE.
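The difference between absolute and normalized error metrics can be made concrete with a short sketch. The definitions below are assumptions chosen for illustration, since the reviewed articles do not share one convention: nRMSE is taken here as RMSE divided by the range of the reference signal, and rRMSE as RMSE divided by the mean of the reference and estimate ranges, both in percent.

```python
import math
from typing import Sequence


def rmse(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """Root mean square error between two equally long signals."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("signals must be non-empty and of equal length")
    squared = [(e - r) ** 2 for r, e in zip(reference, estimate)]
    return math.sqrt(sum(squared) / len(squared))


def signal_range(signal: Sequence[float]) -> float:
    """Peak-to-peak amplitude of a signal."""
    return max(signal) - min(signal)


def nrmse_percent(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """RMSE normalized to the range of the reference (assumed convention)."""
    return 100.0 * rmse(reference, estimate) / signal_range(reference)


def rrmse_percent(reference: Sequence[float], estimate: Sequence[float]) -> float:
    """RMSE normalized to the mean range of both signals (assumed convention)."""
    mean_range = 0.5 * (signal_range(reference) + signal_range(estimate))
    return 100.0 * rmse(reference, estimate) / mean_range
```

With these definitions, a low-magnitude component such as M-LGRF can show a smaller RMSE than VGRF and still a larger nRMSE, because the same absolute error is a bigger fraction of its range.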
Side-to-side jump was studied by Recinos et al., but only VGRF was estimated, with an RMSE of 0.108-0.274 (normalized to bodyweight) [71]. Johnson et al. [80] proposed ML algorithms to estimate 3D GRF and 3D GRM for running and sidestepping activities. However, only correlation coefficients of the models were reported. "Good" correlation greater than 0.85 was found for 3D GRF and "moderate" correlation between 0.6 and 0.7 was found for 3D GRM. Therefore, it would be beneficial to evaluate the applicability of these models to estimate 3D GRF for multiplanar tasks. A similar observation can be made for the estimation of joint moments.
Chaaban et al. [84] validated their approach of predicting VGRF and knee kinetics in healthy volunteers. They compared their computed RMSE values against differences between reported kinetics values of ACL patients and healthy volunteers from the literature (also termed "clinical difference"). These clinical differences reported were 0.24 and 0.035 for VGRF and knee flexion-extension moment, respectively. They suggested that since their model achieved an RMSE of 0.21 and 0.027 for VGRF and knee flexion-extension moment, respectively (lower than the clinical difference), their algorithm is appropriate. However, asymmetries and other factors in patients may lead to lower accuracies when validated for ACL patients.

Effect of Sensor Location and Sensor Characteristics on the Accuracy of Estimated Parameters
Around 56% of the included articles (40 articles) used three or fewer on-body sensors. Only 6 articles used 17 sensors and measured whole-body kinematics. Using a minimal set of sensors was evidently given priority. Although some articles used multiple sensors, they explicitly tried to find a minimal and optimal sensor set. Multiple walking studies reported the pelvis sensor to be the most accurate [32,93]. Although the pelvis is known to be a good sensor location for estimating GRF due to the assumption of being close to the body's center of mass, Havashinezhadian et al. [32] discussed that this assumption may not be valid for aging populations and pathological gait such as in knee osteoarthritis due to changes in upper-body stability. Havashinezhadian et al. [32] reported that the top of the shoe is the best location to predict 3D GRF in medial knee osteoarthritis patients. The seventh cervical vertebra was also reported to be a good sensor location for the estimation of VGRF [70]. Revi et al. used three sensors (pelvis, thigh, and shank) to estimate APGRF and found that even the removal of one of the sensors significantly affected the model's accuracy in both healthy participants and post-stroke patients. Kerns et al. studied countermovement jumps using a sensor on the trunk and found that trunk movement during countermovement jump caused an increase in error estimates [34]. Therefore, it is important to consider the parameters of interest, type of task to be measured, and patient population characteristics when deciding on an optimal sensor placement.
Along with the choice of the best placement location, the repeatability of IMU placement is also critical for accurate and repeatable kinetics estimations. Tan et al. studied the influence of IMU misplacement errors on GRF prediction and found that position misplacement of one sensor caused only a 1% difference; however, when eight sensors were misplaced, it led to an error in GRF estimation of 4-6%. Errors in the orientation of eight IMUs caused a difference of 20% in the estimated 3D GRF. The authors therefore suggested that IMU placement is critical to obtaining accurate accelerometer and gyroscope data for kinetic estimations. However, these errors can be compensated for by applying additional steps such as rotating the sensor data to a segment frame using a static, helical axis, or any other "body segment" calibration [96,97].
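As an illustration of what a static calibration can look like, the following is a minimal sketch and not the procedure of the cited works. It assumes that, during quiet standing, the mean accelerometer reading points along the vertical, and that the segment's long axis is vertical in that pose. All names (`rotation_aligning`, `static_calibration`, `segment_vertical`) are hypothetical. The sketch corrects sensor tilt only; rotation about the vertical axis remains undetermined and would need a second reference, such as a functional movement.

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(R, v):
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]

def rotation_aligning(a, b):
    """Rotation matrix R such that R @ a_unit == b_unit (Rodrigues form)."""
    a, b = normalize(a), normalize(b)
    v = cross(a, b)
    c = sum(x * y for x, y in zip(a, b))
    if c < -1 + 1e-9:
        raise ValueError("vectors are opposite; rotation axis is ambiguous")
    K = [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]
    K2 = matmul(K, K)
    k = 1.0 / (1.0 + c)
    return [[(1.0 if i == j else 0.0) + K[i][j] + k * K2[i][j]
             for j in range(3)] for i in range(3)]

def static_calibration(static_acc, segment_vertical=(0.0, 0.0, 1.0)):
    """Estimate the sensor-to-segment tilt correction from a standing trial."""
    mean = [sum(s[i] for s in static_acc) / len(static_acc) for i in range(3)]
    return rotation_aligning(mean, segment_vertical)

# Synthetic standing trial: sensor tilted 10 degrees about its x-axis.
tilt = math.radians(10)
standing = [[0.0, 9.81 * math.sin(tilt), 9.81 * math.cos(tilt)]] * 50
R = static_calibration(standing)
aligned = matvec(R, standing[0])  # approximately [0, 0, 9.81]
```

The resulting matrix would then be applied to every accelerometer and gyroscope sample of the dynamic trial before the data are passed to a kinetics model.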
Of the included articles, 27 (38%) measured activities at a sampling rate of 100-200 Hz; however, Sharma et al. [62] suggested that this may not be sufficient to track highly dynamic activities like sprint running, where about five steps per second can be observed. They acquired kinematic data at 400 Hz and stated that it was sufficient for their measurements. This is consistent with the other included studies that measured dynamic activities such as running, jumping, and landing tasks, which also used higher sampling rates. However, three studies that studied walking also used a higher sampling rate (400-1000 Hz) [42,62,81]. Therefore, sampling rates higher than 100-200 Hz may be needed for application in ACL rehabilitation, as it involves dynamic activities such as change-in-direction, jumping, and landing tasks. A sampling frequency of 100 Hz would nevertheless be sufficient for slower movements such as walking and double-leg and single-leg squats.
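The sampling-rate argument can be made concrete with simple arithmetic. The sketch below is illustrative only; the function name `samples_per_step` is hypothetical, and the step frequency of five steps per second is the figure quoted above for sprint running.

```python
def samples_per_step(sampling_rate_hz, steps_per_second):
    """Number of samples that fall within one full step cycle."""
    return sampling_rate_hz / steps_per_second

# At five steps per second, one step lasts 0.2 s.
for rate in (100, 200, 400):
    print(rate, "Hz:", samples_per_step(rate, 5), "samples per step")
```

At 100 Hz, a 0.2 s step is described by only 20 samples, compared with 80 samples at 400 Hz. Because a sprint step also contains a flight phase, the ground-contact portion of each step is covered by fewer samples still.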

Applicability for ACL Rehabilitation
To assess the applicability of the algorithms proposed in the literature, it is crucial to consider not just the achieved accuracies but also the validated activity and the generalizability to patient populations. Figure 3 shows that most of the articles were validated only on healthy cohorts, except for four studies [29-32]. The accuracies of the reviewed approaches may be different in patients when compared to healthy cohorts. This is supported by the results of Liu et al. [29], who studied STS movements in both healthy people and people with mild lower-limb dyskinesia. The VGRF estimates had higher RMSE and correlation coefficients in the patient group compared to healthy volunteers. Liu et al. attributed the differences to the patients performing the tasks at a slower pace, which increased muscle tremble and sway in body motion and in turn contaminated the measured signal, causing lower accuracies. Some authors validated their algorithms on asymmetrical gait movements and found lower accuracies than those achieved on normal movement [15,59,75]. These results indicate that it is important to validate the algorithms on pathological gait, as the assumptions of the algorithms may not be valid for patient groups and may therefore result in lower accuracies. It is also known from the literature that ACL patients have increased knee flexion for at least a year after reconstruction [98,99]; hence, it is important to validate the models with data from ACL patients. However, none of the articles included in this review validated their algorithms on ACL patients.
Many of the included articles (54%) studied only walking. Important ACL rehabilitation-specific tasks such as single-leg hop and triple hop have not been studied. Change-in-direction tasks could be studied more extensively to include multiplanar movements. Another important finding is that the female population was underrepresented in 80% of the included articles. Neugebauer et al. studied the estimation of GRF in youth during walking and running and discussed that sex is an influencing factor in the estimation of GRFs [39]. It is also important to note that the female population has an increased risk of ACL injury [100]. This increased risk may be attributed to neuromuscular differences such as a reduced effectiveness in stiffening the knee, causing increased tibial translation [101]. Therefore, it is essential to have equal female representation in the validation process. Additionally, it may be useful to develop strategies or methodologies that account for gender and other anthropometric differences to ensure the generalizability of the proposed algorithms. The number of participants also differed considerably among the included studies. The median sample size was 12 participants, with individual sample sizes ranging from 1 [28,35,71,94] to 131 [57] participants. A total of 25 articles (35%) [15,28,29,34-37,50,51,53,59-63,69-71,74,76,77,82,87,92,94] involved fewer than 10 participants. It is important to ensure that the sample size provides enough statistical power to detect meaningful differences and obtain meaningful results.
A power analysis was conducted using G*Power 3.1.9.7 for a t-test comparing two independent means with an alpha of 0.05, a power of 0.80, and a medium effect size of 0.5. The chosen values were based on the work of Shirazi et al. [102]. The computed required sample size was 51. Only 8 of the papers included in this study tested more than 51 participants [14,23,48,49,54,57,58,90]. It is important to recognize that the calculated sample size serves as an indicator rather than a definitive guideline, as the required sample size can vary based on several factors and depends largely on the study design and population. Therefore, future research should aim to establish sample sizes aligned with expected effect sizes and carefully evaluate the statistical significance of the findings.
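For readers who want to sanity-check such a calculation, the sketch below uses the textbook normal approximation for two independent means. It is not the exact noncentral-t computation performed by G*Power, so its results fall slightly below exact values. The function name `n_per_group` is hypothetical, and whether the reported figure of 51 refers to a one-tailed or two-tailed test, or to a per-group or total count, is not stated in the text and is treated here as an open assumption.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80, two_sided=True):
    """Approximate per-group sample size for comparing two independent means.

    Normal approximation: n = 2 * ((z_alpha + z_beta) / d) ** 2, rounded up.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

one_sided = n_per_group(0.5, two_sided=False)  # 50 per group
two_sided = n_per_group(0.5, two_sided=True)   # 63 per group
print(one_sided, two_sided)
```

The one-sided approximation of 50 per group is close to the reported value of 51, which would be consistent with a one-tailed setting, although the text does not confirm this.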

Limitations of the Included Evidence, Review Process, and Future Directions
Although this systematic review was conducted comprehensively and methodically, it is crucial to acknowledge the potential biases and constraints that may influence the generalizability of the results. Firstly, although the articles were diligently searched for in three major databases, a few articles may have been missed due to indexing, resulting in inclusion bias. The included articles were also limited to those published in English. The decision to include only articles that validated their methods on human beings, while improving the quality of the included results, may have limited the inclusiveness of the review.
The varying sensor placement locations, experimental protocols, and reporting metrics used in the included articles made direct comparison, and identification of the overall most accurate model, challenging. In addition, some articles validated their methods on only a small number of subjects, bringing their reliability and generalizability into question. Therefore, it is important to consider these limitations when drawing conclusions about the implications of the results. To address these challenges, future research should adhere to standardized methodologies, such as maintaining an adequate sample size, choosing the right validation groups, employing standard experimental protocols, and choosing appropriate accuracy metrics. This also emphasizes the need for large, diverse, open datasets involving both patients and healthy subjects performing activities relevant to ACL rehabilitation, which would enable efficient testing and comparison of existing and new algorithms. Such efforts would be crucial for facilitating comparisons across studies in the literature as well as enhancing the relevance and applicability of research findings.

Conclusions
In this comprehensive systematic review, we aimed to assess the current state of the art in using IMUs to estimate GRF and joint kinetics, specifically focusing on their potential use in ACL rehabilitation. The results of this review indicate that IMUs have good potential to estimate GRF and other joint kinetic parameters with good accuracy for movements primarily in the sagittal plane for healthy cohorts. However, none of these algorithms have been validated on ACL patients. Kinetic estimations for multiplanar ACL-relevant movements such as side hops or change-in-direction tasks have not been well studied.
Combining multiple models such as BM along with ML-based techniques could help overcome their individual limitations and therefore help achieve more accurate estimates of joint kinetics and GRF. Gaining a deeper understanding of kinetic parameters and their implications can help clinicians assess ACL patients during rehabilitation and can also be extended to a broader rehabilitation context for other conditions. Additionally, it can also help build tailored gait-retraining strategies and develop future injury prevention techniques during athletic training.

Note: All search queries with multiple successive searches (indicated with #) were combined with the "AND" operator.

Figure 1. Systematic review process pipeline according to PRISMA.

3 Sensors
Description of inclusion and/or exclusion criteria along with information about volunteers and/or patients (IV/EV): 1, 0.5 or 0
Methods (Performance Bias)
4 Data collection is clearly described and reliable (IV/EV): 1, 0.5 or 0
5 Description about activities measured, validation tasks, warmup (IV/EV): 1, 0.5 or 0

Figure 2. Number of publications per year.

Figure 3. Participant demographics depicted along with the corresponding number of articles.

Figure 4. Sensor placement locations along with the corresponding number of articles. The ventral trunk includes sensors placed on the sternum, chest, and torso. The dorsal trunk includes sensors placed on the upper/mid-dorsal trunk. The feet include sensors placed on both the feet and the shoes. The lower back includes sensors placed on the fifth lumbar vertebra, pelvis, and sacrum.

Figure 5. Activities measured in the included articles.

Figure 6. Sankey diagram representing several types of models used and all estimated kinetic parameters.
