Impact of PCA Pre-Normalization Methods on Ground Reaction Force Estimation Accuracy

Ground reaction force (GRF) components can be estimated using insole pressure sensors. Principal component analysis combined with machine learning (PCA-ML) is widely used for this task. PCA reduces dimensionality and requires pre-normalization. In this paper, we evaluated the impact of twelve pre-normalization methods, using three PCA-ML methods, on the accuracy of GRF component estimation. Accuracy was assessed using laboratory data from gold-standard force plate measurements. Data were collected from nine subjects during slow- and normal-speed walking activities. We tested the ANN (artificial neural network) and LS (least square) methods while also exploring support vector regression (SVR), a method not previously examined in the literature, to the best of our knowledge. In the context of our work, our results suggest that the same normalization method can produce the worst or the best accuracy results, depending on the ML method. For example, the body weight normalization method yields good results for PCA-ANN but the worst performance for PCA-SVR. For PCA-ANN and PCA-LS, the vector standardization normalization method is recommended. For PCA-SVR, the mean method is recommended. The final message is not to define a normalization method a priori independently of the ML method.


Introduction
Normalization is a crucial data pre-processing step in machine learning (ML) estimation. Indeed, if measurements have heterogeneous values, quantities with low values are taken into account less than quantities with higher values in the ML estimation procedure. In this way, normalization assigns the same importance to all measurements. The authors in [1] used the robust scaler method in conjunction with an ML method to provide an automated methodology for accurately categorizing various types of defects in industrial IoT ball bearings. The authors in [2] used the min-max normalization method to forecast monthly precipitation. The authors in [3] used the min-max method to estimate position in indoor navigation. The authors in [4] estimated the vertical component of the ground reaction force (GRF) from step sound using body weight normalization. Honert et al. [5] estimated the vertical and anterior-posterior components of the GRF using the Z-score (ZS) normalization method.
ML methods can also be combined with principal component analysis (PCA), referred to as PCA-ML methods in this paper. Again, it is essential to incorporate a normalization method by centering and normalizing the data points before PCA transformation. PCA transforms a high-dimensional input dataset into a low-dimensional dataset. The motivation behind this transformation is to reduce input data dimensionality, aiming to minimize computational costs in embedded devices. This transformation not only reduces input data dimensionality but may also enhance ML estimation performance. The authors in [6] used the min-max method to assess the prediction risk associated with the digital transformation of manufacturing supply chains. The authors in [7] used the ZS method to predict diabetic retinopathy. The authors in [8] used the min-max method to predict power load. Also, ZS normalization is the most commonly suggested method when using PCA, as mentioned in references [9-11].
This brief state-of-the-art review shows that normalization is required when using ML or PCA-ML methods, whatever the domain of application. However, the authors do not justify why they use or recommend the chosen normalization method. Moreover, normalization methods can be categorized into two classes: statistical approaches and physics approaches. Statistical methods encompass standard techniques, like the min-max and ZS methods. Physics methods rely on parameters specific to the domain of the database [4,12-14]. This is the reason why studying the impact of normalization methods on ML estimation accuracy must be carried out in a specific domain. Our domain is biomechanics.
In this field, the estimation of GRF components is required in some clinical or biomechanical studies, particularly for the analysis of posture and movement [12]. Instrumented insoles are current low-cost solutions for GRF component estimation. These devices measure pressure through many sensors, from which GRF components can be estimated. Industrial pressure insole systems only evaluate the Fz component through a simple linear combination of the pressure sensor values weighted by their individual sensor surface areas. This low-cost approach yields estimation results with limited precision. More sophisticated estimation methods rely on ML principles, which aim to identify the link between insole plantar pressure (PP) data and GRF components in 3D by learning input/output examples. In order to minimize computational costs in instrumented insoles, we focus only on PCA-ML methods to estimate GRF components.
We hereby propose a detailed state-of-the-art review of normalization methods with the PCA-ML procedure, limited to our domain of application. We first introduce the ML methods used. The authors in [12,13,15] employed artificial neural networks (ANNs) in conjunction with PCA to reduce the dimensionality of PP data to estimate GRF components. Rouhani et al. [12] compared the PCA-ANN, PCA-locally linear neuro fuzzy, and PCA-least square (LS) methods for estimating GRF components. Sim et al. [13] compared the three methods presented by Rouhani et al. [12] and also included the PCA-wavelet neural network method in their comparison. The authors of references [12,13] estimated GRF components for walking activities, while Joo et al. [15] focused on estimating GRF components for golf activities.
Second, we introduce the normalization methods used. These methods may impact the quality of GRF component estimation. While some studies normalized PP data to insole length [12] or to body weight [13] (physics methods), others, like [15], proposed normalization within the range of [-1, 1] (statistical methods). However, none of these authors justified their choice of the normalization method. Additionally, only a few normalization methods have been explored, despite the many existing in the literature.
To the best of our knowledge, no study has evaluated the benefit of PCA methods in combination with normalization approaches. In the present study, we thus propose to assess the impact of twelve different normalization methods from the literature on the accuracy of estimating GRF components. The three components (vertical component (Fz), anterior-posterior component (Fy), and medial-lateral component (Fx)) will be investigated using ANN and LS. This is the first contribution. Also, we will evaluate the performance of the support vector regression (SVR) method, as another comparative method, to estimate GRF components, which has never been tested before in the literature, to the best of our knowledge. This is the second contribution.
To carry out the proposed study, we need to evaluate the estimation accuracy of each PCA-ML method in a supervised context. This is achieved in standardized laboratory conditions by using force plates. The force plates serve as the reference for measured GRF components (ground truth).
Figure 1 presents the flow chart for estimating GRF components from PP, where PP represents the input data, PPnorm represents the normalized input data, and W is the projection matrix determined by PCA. The estimation accuracy is computed after ML modeling. In the first stage, learning is carried out using the training dataset (red) for modeling. In the second stage, testing is carried out using the test dataset (blue) for computing performance metrics.

Figure 1.
Processing flow chart for estimating GRF components from PP. The PP training set is used for ML modeling by using the corresponding GRF force plate data (in red). The PP testing set is used to evaluate the performance of GRF components by using the corresponding GRF force plate data (in blue). The gray rectangle indicates the classical pre-normalization PCA-ML pipeline.
Each block of Figure 1 will be detailed in the following section.

Materials and Protocol
The experimental equipment used in our study included two pressure insole systems (Moticon ReGo AG, Munich, Germany) equipped with 16 capacitive pressure sensors each and two force plates (model BMS600900, dimensions of 600 mm × 900 mm; AMTI, Watertown, MA, USA), as shown in Figure 2.

Nine healthy male subjects (height: 178 ± 4.2 cm; weight: 77 ± 11.1 kg) participated in our study. Before enrollment, participants received detailed information on the study objective and procedure and provided written informed consent, complying with the ethical standards of the Declaration of Helsinki (2013). They wore their own shoes with Moticon insoles, which were all of the same size (42 EU). Prior to the experiment, each subject performed three exercises to calibrate the Moticon insoles: a slow walk, standing still, and shifts in body weight. The subjects then performed two different tasks on the two force plates to obtain measured GRF component data for both feet (one force plate per foot; Figure 2). These tasks were the following (the reported numbers indicate the minimum and maximum numbers of steps per subject): (1) Normal walking (7-20 steps); (2) Slow walking (8-22 steps).
The measured GRF component data from the two force plates and the PP data from the Moticon insole systems were sampled at 100 Hz.
The Fx and Fy components of the force plate frame may not have the same orientation as those of the foot frame. The foot progression angle (FPA) is normally used to transform the Fx and Fy components from the force plate frame to the foot frame. In our study, the subjects walked in a straight line along the Fy direction (Figure 2), which is the laboratory's axis of progression (ensured by the protocol, which requires the right foot to be placed on one force plate and the left foot on the other). Furthermore, all the subjects participating in this study were healthy (no pathological orientation deviation). For healthy subjects, the study by Caderby et al. [16] showed that estimating the FPA is neither obvious nor standardized and can produce very different FPA values, although the difference remains less than ten degrees. We therefore believe that not transforming the data with the FPA has little impact on our results. Moreover, the FPA value may be tainted by errors, and its use could degrade the quality of GRF component estimation. For these reasons, we did not apply any transformation from the force plate frame to the foot frame: for the sake of simplification, the Fx and Fy components of the force plate frame are considered as those of the foot frame.
In the absence of a direct method for digitally or analogically synchronizing force plate data with insole data, we chose a post-processing time-shift synchronization approach for both the right and left feet. For this task, we utilized the unique Fz component provided by the Moticon insole system, computed as the sum over the sensors of the plantar pressure values weighted by their individual sensor areas. We refer to this estimation as Fz_insole. The approach involves shifting the Fz_insole curve relative to the vertical force curve of the force plate (referred to as Fz_force_plate) across a given time range. Subsequently, we compute the root mean square error (RMSE) and the correlation coefficient (R) as two functions of the time shift. The time-shift value with the lowest RMSE corresponds to the optimal moment for synchronizing the data from the insoles with those from the force plates. A high R value ensures sufficient correlation between the two curves and validates the time-shift value. Figure 3 illustrates this synchronization procedure.
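The time-shift search described above can be sketched as follows. This is a minimal illustration on synthetic signals sampled at 100 Hz; the function name, the shift range, and the artificial lag are our assumptions, not the study's actual data:

```python
import numpy as np

def synchronize(fz_insole, fz_plate, max_shift=200):
    """Find the time shift (in samples) that best aligns the insole Fz
    curve with the force-plate Fz curve by minimizing the RMSE over a
    range of candidate shifts; the correlation coefficient R at the
    chosen shift is kept as a sanity check on the alignment."""
    best = (None, np.inf, 0.0)  # (shift, rmse, r)
    n = min(len(fz_insole), len(fz_plate)) - max_shift
    for shift in range(max_shift + 1):
        a = fz_insole[shift:shift + n]
        b = fz_plate[:n]
        rmse = np.sqrt(np.mean((a - b) ** 2))
        r = np.corrcoef(a, b)[0, 1]
        if rmse < best[1]:
            best = (shift, rmse, r)
    return best

# Synthetic example: the insole curve lags the plate curve by 30 samples (0.3 s).
t = np.arange(1000) / 100.0
fz_plate = 700.0 * np.clip(np.sin(2 * np.pi * 0.8 * t), 0, None)
fz_insole = np.roll(fz_plate, 30)  # hypothetical 30-sample lag
shift, rmse, r = synchronize(fz_insole, fz_plate)
```

On this synthetic pair, the minimum-RMSE shift recovers the injected 30-sample lag, and the high R value at that shift validates it, mirroring the procedure described above.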

Normalization Methods
Normalization is an important step in data pre-processing, as it can significantly impact estimation accuracy. Additionally, by ensuring that all variables are equally important and on the same scale, normalization can help speed up the training phase. In our study, we present and apply twelve normalization methods: min-max in the range [0, 1] or [−1, 1], mean, Z-score, robust scaler, vector standardization, maximum linear standardization, decimal scaling, median, and tanh (statistical methods); body weight and insole length (anthropometric/physics methods).
Min-max (MM) [17] is one of the most popular normalizing methods. Given a row of data X = [x_1, x_2, ..., x_n], where n is the length of the data, the normalized data using the min-max method in the range [0, 1] (MM[0, 1]) are given as

$$x_i' = \frac{x_i - \min(X)}{\max(X) - \min(X)}$$

However, this method may not be robust, because it is highly sensitive to outliers [17]. It can be generalized to adjust the data within the range [a, b] [18]:

$$x_i' = a + \frac{(x_i - \min(X))(b - a)}{\max(X) - \min(X)}$$

We opted for the min-max method to normalize our data within the ranges [0, 1] and [−1, 1]. The mean method [18] shifts the mean of the data to zero and rescales the dynamic range of the data:

$$x_i' = \frac{x_i - \mu(X)}{\max(X) - \min(X)}$$

where µ(X) represents the mean of the data. A drawback of this method is its sensitivity to outliers.
The most classical method is the Z-score (ZS) [17] technique, which centers and reduces the data as

$$x_i' = \frac{x_i - \mu(X)}{\sigma(X)}$$

where σ(X) is the standard deviation. The ZS method is less sensitive to outliers than the mean method. The robust scaler (RS) [18] method is an order statistics-based technique utilized for data that contain outliers. Once normalized, the data exhibit a median of zero:

$$x_i' = \frac{x_i - x_{50}}{x_{75} - x_{25}}$$

where x_{50} is the median, x_{75} is the third quartile, and x_{25} is the first quartile. Anysz et al. [19] defined the vector standardization (VS) method as

$$x_i' = \frac{x_i}{\sqrt{\sum_{j=1}^{n} x_j^2}}$$

They also proposed [19] the maximum linear standardization (MLS) technique, which involves dividing each sample by the maximum value of the data:

$$x_i' = \frac{x_i}{\max(X)}$$

Decimal scaling (DS) [17] is used particularly when the data are distributed on a logarithmic scale. The normalization writes

$$x_i' = \frac{x_i}{10^d}$$

where d is the number of digits of the maximum absolute value of the data [17]: d = ⌈log₁₀(max(|X|))⌉.
The median (Med) normalization method involves dividing each sample by the median of the data [20]:

$$x_i' = \frac{x_i}{x_{50}}$$

The tanh normalization method, based on the robust estimators proposed by Hampel et al. [21], can be used to scale data within the range [0, 1]:

$$x_i' = \frac{1}{2}\left[\tanh\left(0.01\,\frac{x_i - \mu(X)}{\sigma(X)}\right) + 1\right]$$

The last two normalization methods involve anthropometric measures. The body weight (BW) normalization method divides the data by the body weight of the subject [14]:

$$x_i' = \frac{x_i}{BW}$$

The insole length (LI) normalization method, proposed by Rouhani et al. [13], divides the data by the length of the insole pressure system:

$$x_i' = \frac{x_i}{LI}$$

The input (PP) data must be normalized for each pressure sensor when using PCA prior to SVR and LS. In the case of PCA-ANN, the ANN-estimated outputs are normalized values that require denormalization post-processing to retrieve the appropriate magnitude of the GRF components. The denormalization applied to the estimated output is the reverse of the normalization applied to the input. Note that the authors in [13] propose a specific use of the LI method by denormalizing the estimated output with the BW method.
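A few of the methods above can be sketched in Python as follows. This is a minimal illustration on a made-up data row; the function names are ours, and the body weight value passed to the physics method is hypothetical:

```python
import numpy as np

def min_max(x, a=0.0, b=1.0):
    # Min-max scaling to the range [a, b] (defaults give MM[0, 1])
    return a + (x - x.min()) * (b - a) / (x.max() - x.min())

def mean_norm(x):
    # Mean method: zero-mean, rescaled by the dynamic range
    return (x - x.mean()) / (x.max() - x.min())

def z_score(x):
    # ZS method: center and reduce by the standard deviation
    return (x - x.mean()) / x.std()

def robust_scaler(x):
    # RS method: median-centered, scaled by the interquartile range
    q25, q50, q75 = np.percentile(x, [25, 50, 75])
    return (x - q50) / (q75 - q25)

def vector_standardization(x):
    # VS method: divide each sample by the Euclidean norm of the row
    return x / np.linalg.norm(x)

def body_weight(x, bw):
    # BW physics method: divide by the subject's body weight (bw is hypothetical)
    return x / bw

X = np.array([10.0, 40.0, 25.0, 5.0])  # illustrative pressure row
```

Each function maps one row X to its normalized counterpart; for example, `min_max(X)` returns values spanning exactly [0, 1], and `vector_standardization(X)` returns a unit-norm row.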

Principal Component Analysis
Principal component analysis (PCA) is a technique used to reduce the dimensionality of a dataset (PP data) that contains a large number of dependent variables. The goal is to retain as much of the important information in the dataset as possible [22] by transforming the variables into uncorrelated variables, known as principal components (PCs). Applying PCA to a dataset offers several advantages, such as the following [23]: (1) It takes less computation time; (2) Redundant, irrelevant, and noisy data can be removed; (3) Data quality can be improved; (4) Some ML methods do not perform well on high-dimensional data; to address this issue and improve accuracy, for example, in ANN, it can be helpful to reduce the dimension of the data.
The steps of PCA [22] begin with the normalization of the PP data. Subsequently, we calculate the covariance matrix of the PP data points, from which we calculate the eigenvectors and their corresponding eigenvalues. Finally, we select the k eigenvectors, also called principal components (PCs), that explain the most cumulative variance of the eigenvalues. We use the k PCs to create the projection matrix, called the W matrix. We transform the data points into a new set with k dimensions using the W matrix. We used PCA to reduce the dimension of the input data from 16 to k PCs, taking k as the smallest value for which the cumulative variance was higher than 98% [13,16].
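The PCA steps above can be sketched as follows. This is a minimal NumPy illustration on synthetic stand-in data (the 98% threshold matches the text, but the data, dimensions, and function name are assumptions):

```python
import numpy as np

def pca_projection(PP, threshold=0.98):
    """Build the projection matrix W from the smallest number k of
    principal components whose cumulative explained variance exceeds
    `threshold`, following the steps described in the text."""
    Xc = PP - PP.mean(axis=0)           # center (normalization assumed done upstream)
    C = np.cov(Xc, rowvar=False)        # covariance matrix of the sensors
    eigval, eigvec = np.linalg.eigh(C)  # eigh: symmetric matrix, ascending order
    order = np.argsort(eigval)[::-1]    # sort eigenpairs by decreasing variance
    eigval, eigvec = eigval[order], eigvec[:, order]
    cumvar = np.cumsum(eigval) / eigval.sum()
    k = int(np.searchsorted(cumvar, threshold) + 1)
    W = eigvec[:, :k]                   # projection matrix, shape (n_sensors, k)
    return W, k

rng = np.random.default_rng(0)
# Synthetic stand-in for 16-sensor PP data with a dominant low-dimensional structure
latent = rng.normal(size=(500, 3))
PP = latent @ rng.normal(size=(3, 16)) + 0.01 * rng.normal(size=(500, 16))
W, k = pca_projection(PP)
scores = (PP - PP.mean(axis=0)) @ W     # reduced data, shape (500, k)
```

The columns of W are orthonormal eigenvectors of the covariance matrix, so projecting with W yields the k-dimensional, decorrelated scores fed to the ML methods.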

Machine Learning Methods for GRF Component Estimation
• Artificial Neural Network (ANN) The artificial neural network (ANN) is a robust algorithm, inspired by the functioning of the human brain, that can recognize specific patterns among a vast amount of data and can perform multiple tasks simultaneously. The ANN uses back-propagation to update the weights between the layers, the biases, and the activation function parameters so as to estimate an output closer to the measured output. We conducted a series of experiments on the right foot with varied parameters, following a systematic order. We started with an initial configuration of 2 hidden layers with (128, 256) nodes, a batch size of 32, the Adamax optimizer, a learning rate of 0.01, the activation function set to leaky_relu, and the mean normalization method. We then modified the number of hidden layers and their nodes, which are shown in parentheses, as 1 hidden layer (150); 2 hidden layers (256, 128), (200, 400), (400, 200); 3 hidden layers (600, 400, 200); and 4 hidden layers (800, 600, 400, 200). The optimizer was adjusted by experimenting with Adam, SGD with momentum values of 0.5 and 0.9, and Adamax. We investigated the impact of varying the learning rate, with values of 0.001, 0.01, 0.05, and 0.1. The batch size was systematically altered, exploring values of 1, 16, 32, 64, and 128. Lastly, we explored different activation functions, including sigmoid, relu, tanh, leaky_relu, and wavelet.
The optimal parameters identified from these simulations for ANN modeling, which yielded the highest accuracy of GRF component estimation for the right foot, are as follows: a learning rate of 0.01, a batch size of 32, and the Adamax optimization algorithm. The ANN topology is shown in Figure 4. The left and right insoles are symmetric; for the sake of simplification, we assume that the ANN model architectures and parameters for the left foot are identical to those for the right foot. Figure 4 depicts the network topology used to estimate the GRF components, where the input vector X = [PC_1, PC_2, ..., PC_k] is followed by two hidden layers consisting of 400 and 200 nodes, respectively. The activation function of both hidden layers is represented by h_i, chosen to be a relu activation function. The output layer's activations are represented by y_1, y_2, and y_3, which stand for the three GRF components.
The outputs of node j in the first and second hidden layers, denoted h_1(j) and h_2(j), respectively, are given by

$$h_1(j) = h\left(\sum_i w_{1,i,j}\, x_i + b_{1,j}\right)$$

and

$$h_2(j) = h\left(\sum_i w_{2,i,j}\, h_1(i) + b_{2,j}\right)$$

where w_{1,i,j}, w_{2,i,j}, b_{1,j}, and b_{2,j} are the weights and biases of the two hidden layers, respectively. The output layer employs the identity activation function, and its calculation is expressed as

$$y_j = \sum_i w_{3,i,j}\, h_2(i) + b_{3,j}$$

where w_{3,i,j} is the weight connecting the last hidden layer to the output layer, and b_{3,j} is the bias of the output layer.
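The forward pass described by these equations can be sketched as follows. This is a minimal NumPy illustration with random weights, only to show the layer shapes and activations; it is not the trained model, and the value of k is illustrative:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def forward(x, params):
    """Forward pass matching the equations above: two relu hidden
    layers of 400 and 200 nodes, then an identity output layer giving
    the three GRF component estimates."""
    W1, b1, W2, b2, W3, b3 = params
    h1 = relu(W1 @ x + b1)   # h1(j) = relu(sum_i w_{1,i,j} x_i + b_{1,j})
    h2 = relu(W2 @ h1 + b2)  # h2(j) = relu(sum_i w_{2,i,j} h1(i) + b_{2,j})
    y = W3 @ h2 + b3         # identity output: estimates of [Fx, Fy, Fz]
    return y

rng = np.random.default_rng(1)
k = 5  # number of retained principal components (illustrative)
params = (rng.normal(scale=0.05, size=(400, k)), np.zeros(400),
          rng.normal(scale=0.05, size=(200, 400)), np.zeros(200),
          rng.normal(scale=0.05, size=(3, 200)), np.zeros(3))
y = forward(rng.normal(size=k), params)
```

A single input vector of k PCA scores thus yields a 3-vector of GRF component estimates; training (back-propagation, Adamax) is omitted here.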

•
Least Square (LS) Method The least square (LS) method is a regression method that finds a linear model connecting the output with the inputs based on the experimental data. The fundamental concept behind this approach is the minimization of the quadratic criterion between the measured and estimated output quantities, which are related through the chosen linear model.
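The LS idea can be sketched as follows, assuming synthetic PCA scores and a noise-free linear relation (all names and values here are illustrative, not the study's data):

```python
import numpy as np

# Fit a linear map from k PCA scores (plus an intercept) to one GRF
# component by minimizing the quadratic criterion ||A w - fz||^2.
rng = np.random.default_rng(2)
scores = rng.normal(size=(300, 4))              # stand-in PCA-reduced PP data
true_w = np.array([120.0, -30.0, 15.0, 5.0])    # hypothetical linear model
fz = scores @ true_w + 700.0                    # synthetic Fz with an offset

A = np.hstack([scores, np.ones((300, 1))])      # design matrix with bias column
coef, *_ = np.linalg.lstsq(A, fz, rcond=None)   # least-square solution
fz_hat = A @ coef                               # estimated Fz
```

Because the synthetic relation is exactly linear, the recovered coefficients match `true_w` and the offset; on real PP data the residual reflects how far the PP-to-GRF link departs from linearity, which is the limitation discussed for PCA-LS later in the paper.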

•
Support Vector Regression (SVR) Method Support vector regression (SVR) is a supervised statistical learning algorithm used for solving regression problems. For classification tasks, the analogous algorithm SVM (support vector machine) can be employed. The decision boundary is represented by the two black lines in Figure 5, while the red line denotes the hyperplane.
The distance between the hyperplane and the two decision boundary lines is denoted by ξ, which is a parameter that can be chosen, while the variables ζ′ and ζ indicate the errors between the two decision boundary lines and the measured data. The objective of this approach is to identify the hyperplane function (which can be a nonlinear function) that maximizes the number of measured data points within the decision boundary [24].

We conducted tests with linear and radial basis function (RBF) kernels while varying the parameters for the right foot. Specifically, for both kernels, we tested ξ = 15 and 20 with different values of C, including 0.1, 1, 10, 50, and 100 (C is a regularizing parameter that determines the tolerance for deviations between the kernel and the measured data). Additionally, the supplementary γ parameter was tested with values of 0.1, 1, 10, 50, and 100 for the RBF kernel.
From these simulations, it was found that the RBF kernel model with ξ = 20, C = 50, and γ = 10 showed the highest accuracy in estimating GRF components for the right foot. The left and right insoles are symmetric, implying that the SVR model parameters for the left foot are identical to those for the right foot.
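An SVR model with these retained settings can be sketched with scikit-learn on synthetic stand-in data. Mapping the paper's margin parameter ξ to scikit-learn's `epsilon` is our assumption, and the data below are illustrative:

```python
import numpy as np
from sklearn.svm import SVR

# RBF-kernel SVR with the settings retained for the right foot:
# C = 50, gamma = 10, and the margin parameter (xi = 20 in the text)
# mapped to epsilon = 20.
rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, size=(400, 3))            # stand-in PCA scores
fz = 600.0 * np.exp(-np.sum(X ** 2, axis=1))     # synthetic nonlinear Fz

model = SVR(kernel="rbf", C=50, gamma=10, epsilon=20)
model.fit(X, fz)
fz_hat = model.predict(X)
rmse = np.sqrt(np.mean((fz - fz_hat) ** 2))      # evaluation metric used in the paper
```

The RBF kernel lets the regression hyperplane be nonlinear in the PCA scores, which is why PCA-SVR can outperform the linear PCA-LS model, as discussed in the results.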
The two methods mentioned above (LS and SVR) were implemented using Python (v3.10.7) and an Intel Xeon Gold 5218 R @ 2.10 GHz CPU.

Machine Learning Modeling
In order to assess the impact of normalization methods on PCA in conjunction with ANN, LS, and SVR for estimating GRF components, performance indicators were calculated between the estimated (by insole PP data) and measured (by force plate data) GRF components using the whole test datasets of both feet with intrasubject (intras) and intersubject (inters) strategies. The indicators were reported in terms of the correlation coefficient (R) and the root mean square error (RMSE).
To construct the ML methods for estimating GRF components, we utilized the datasets of 8 subjects for slow and normal activity for both feet, and 70% of the steps from the datasets for each activity of each subject were used for the training set. The total number of steps for each activity is shown in Table 1. An additional 10% of the steps for each activity were used solely for the validation set of the PCA-ANN model. Our model was evaluated using the intras strategy, which involved testing the model on the test datasets of the same 8 subjects, representing 20% of the steps for each activity. Additionally, we assessed the generalization capacity of our model using the inters strategy, where all the remaining data of the 9th subject were used for evaluation purposes. Table 1 presents the number of samples and steps for the whole dataset. For the three ML methods, both the intras and inters strategies included the rotation of the training and test datasets to ensure robust results, employing a leave-one-subject-out cross-validation approach. This resulted in the creation of a total of 108 models (9 models for each of the 12 normalization methods) for each foot and each ML method.
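The leave-one-subject-out rotation described above can be sketched as follows (the subject labels are illustrative placeholders):

```python
# Each of the 9 subjects serves once as the inter-subject test set,
# while the steps of the remaining 8 subjects are split 70/10/20 into
# training, validation (PCA-ANN only), and intra-subject test sets.
subjects = [f"S{i}" for i in range(1, 10)]

def loso_folds(subjects):
    """Yield (training pool, held-out subject) pairs, one per rotation."""
    for held_out in subjects:
        train_pool = [s for s in subjects if s != held_out]
        yield train_pool, held_out

folds = list(loso_folds(subjects))
```

With 9 rotations and 12 normalization methods, this yields the 108 models per foot and per ML method reported above.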

Metrics
We used the RMSE and the R coefficient [12,13,15] to evaluate the accuracy of the models for GRF component estimation:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$$

and

$$R = \frac{\sum_{i=1}^{n}(y_i - \mu(Y))(\hat{y}_i - \mu(\hat{Y}))}{\sqrt{\sum_{i=1}^{n}(y_i - \mu(Y))^2 \sum_{i=1}^{n}(\hat{y}_i - \mu(\hat{Y}))^2}}$$

where n is the number of data points for the GRF components; y_i is the force plate-measured GRF component (Fx, Fy, or Fz) at time i; ŷ_i is the estimated GRF component; Y = [y_1, ..., y_n] are the measured GRF components with mean µ(Y); and Ŷ = [ŷ_1, ..., ŷ_n] are the estimated GRF components with mean µ(Ŷ).
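These two metrics can be sketched directly from their definitions (the sample vectors below are illustrative, not measured data):

```python
import numpy as np

def rmse(y, y_hat):
    # Root mean square error between measured and estimated components
    return np.sqrt(np.mean((y - y_hat) ** 2))

def r_coef(y, y_hat):
    # Pearson correlation coefficient between measured and estimated curves
    num = np.sum((y - y.mean()) * (y_hat - y_hat.mean()))
    den = np.sqrt(np.sum((y - y.mean()) ** 2) * np.sum((y_hat - y_hat.mean()) ** 2))
    return num / den

y = np.array([0.0, 100.0, 700.0, 650.0, 50.0])      # illustrative measured Fz samples
y_hat = np.array([5.0, 110.0, 690.0, 655.0, 45.0])  # illustrative estimates
```

A perfect estimate gives RMSE = 0 and R = 1; a close one, as in the example pair, gives a small RMSE and an R near 1.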

Impact of Normalization Methods on PCA for ANN, LS, and SVR for Estimating GRF Components
The nine models from the PCA-ANN, LS, and SVR methods were evaluated using the RMSE and R metrics for both feet with both strategies. Tables 2-5 present the average and standard deviation (SD) of these metrics. The values are reported for all normalization methods and the entire test dataset. The best estimation results considering the RMSE and R values are highlighted in bold for both the intras and inters strategies. The optimal number of PCs, explaining more than 98% [12,15] of the cumulative variance of the PP data, is also presented in these tables. After analyzing Tables 2 and 3, we conducted an analysis to determine the most effective ML method and the corresponding normalization method for estimating GRF components for both feet using the two strategies. For the left foot with the intras strategy, we found that PCA-SVR was the most effective method for estimating Fx, Fy, and Fz using the mean, MM[−1, 1], and mean normalization methods, respectively. For the inters strategy, PCA-LS was found to be the most effective method for estimating Fx and Fy using the LI and mean normalization methods, respectively. For the estimation of Fz, PCA-ANN was the most effective method using the tanh method. Based on the results from Tables 4 and 5 for the right foot, regarding the intras strategy, PCA-ANN was the most effective method for estimating Fx, Fy, and Fz, using the ZS, tanh and Med, and MM[0, 1] normalization methods, respectively. Concerning the inters strategy, for the estimation of Fx, Fy, and Fz, PCA-ANN was the most effective method using the VS and MLS, tanh, and VS normalization methods. We can conclude that for the estimation of GRF components, PCA-ANN was the most effective method, followed by PCA-SVR and then PCA-LS in decreasing order of effectiveness.

An Illustration Example of Slow and Normal Walking
Figure 6 provides an illustration of the estimated GRF components for a single step of the whole test dataset of the right foot with the intras and inters strategies. The normalization technique that produces the most accurate estimation for each ML method in Tables 4 and 5 is employed to estimate each GRF component (e.g., for the right foot, using the PCA-SVR model, MM[0, 1] is applied to estimate Fz with the intras strategy). Table 6 presents the RMSE (R) values, corresponding to Figure 6, between the estimated and measured GRF components. The RMSE and R values are calculated between the instant of initial heel contact (On-Heel) and the instant of foot lift-off (Toe-Off). Based on Table 6, the GRF components estimated with the intras and inters strategies during slow walking are more accurate than those during normal walking.

Table 6. RMSE (R) between the estimated (by insole PP data) and measured (by force plate data) GRF components for slow and normal walking using the intras and inters strategies for the right foot, corresponding to Figure 6. The most accurate normalization method for each ML method in Tables 4 and 5 is used to estimate each GRF component.

Tables 2-6 indicate that the performance in estimating Fx and Fy using the optimal normalization method is quite comparable. However, the estimation of the Fz component shows a difference, with the PCA-ANN method providing the best outcomes, except in one case. Furthermore, as depicted in Figure 6, in instances where the measured GRF components from the force plate are equal to 0, particularly for the Fz component, PCA-LS and PCA-SVR may provide an estimation that differs from 0 due to the constant bias parameter (b). In such cases, PCA-ANN is a better option.
From Tables 2-5, the metrics show very similar results for the right and left feet. This supports the hypothesis that a single model can serve both feet.
We can note that the optimal results are achieved by PCA-ANN, followed by PCA-SVR and then PCA-LS, with 7, 3, and 2 optimal configurations, respectively, out of the 12 configurations (three GRF components, two feet, and two strategies). PCA-LS and PCA-SVR both search for the best regression hyperplane; this hyperplane is linear for LS and nonlinear for SVR (with the RBF kernel chosen here), which explains why better results were obtained with PCA-SVR than with PCA-LS. Even though SVR provides the best results for some specific configurations (3 of the 12 optimal ones), we recommend the use of ANN, which gives the best results in the majority of configurations (7 of the 12 optimal ones).
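The linear-versus-nonlinear contrast between the two hyperplane regressors can be sketched on the same PC scores. This is an illustrative setup only: the data, target, number of PCs, and the SVR hyperparameter `C` are assumptions, not values from the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR

rng = np.random.default_rng(1)
pp = rng.normal(size=(300, 16))                 # synthetic stand-in for PP data
target = np.sin(3.0 * pp[:, 0]) + 0.1 * rng.normal(size=300)  # nonlinear target

scores = PCA(n_components=8).fit_transform(pp)  # same PC scores for both models

ls = LinearRegression().fit(scores, target)       # linear hyperplane (LS)
svr = SVR(kernel="rbf", C=10.0).fit(scores, target)  # nonlinear RBF hyperplane

ls_pred = ls.predict(scores)
svr_pred = svr.predict(scores)
```

On a strongly nonlinear target such as this one, the RBF fit is typically the tighter of the two, mirroring why PCA-SVR outperformed PCA-LS.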
From Tables 2-6, it can be inferred that the effectiveness of each normalization technique with PCA varies depending on the ML method (ANN, SVR, or LS) and the strategy (intras or inters) employed for estimating the GRF components of both feet. Contrary to the suggestions of previous studies [9][10][11], our findings indicate that the ZS normalization method is not always the optimal approach for estimating GRF components with PCA. For example, using the ZS method to estimate the GRF components of both feet with PCA-SVR and both strategies is not recommended, because the estimation precision is low (error values from 1.2 to 2.7 times greater than the optimal results).
Similarly, the authors in [12,13] recommended the BW and LI methods for estimating the GRF when employing PCA. However, for the PCA-SVR and PCA-LS methods, in the case of Fz estimation for both feet with both strategies, these two methods performed poorly, with error values from 1.1 to 4.1 times greater than the optimal results.
In conclusion, for estimating the GRF components with both strategies and both feet, we recommend the VS normalization method with PCA-ANN (error values from 1 to 1.2 times the optimal results) and PCA-LS (from 1 to 1.4 times the optimal results). For PCA-SVR, we recommend the mean method (from 1 to 1.2 times the optimal results). Moreover, the same normalization method may produce either poor or excellent accuracy depending on the ML method employed; a normalization method should therefore not be defined a priori, independently of the ML method.
As an example, the BW method can be the least effective one for PCA-SVR: for the intras strategy, in the estimation of the Fz component of the left foot, it produced an error (271.29 N) more than three times the optimal error (59.35 N).
When the goal is to compare or test the performance of the three ML methods on the same PC data, we recommend the mean or VS normalization methods for estimating the GRF components with both strategies and both feet. These two methods showed favorable results with all three ML methods when compared with the optimal results.
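For reference, the two recommended normalizations can be sketched as below. The exact formulas are those defined in the paper; here we assume the common definitions: mean normalization as (x − mean)/(max − min) per sensor, and vector standardization as dividing each sample vector by its Euclidean norm.

```python
import numpy as np

def mean_norm(x):
    """Mean normalization, assuming the common definition
    (x - mean) / (max - min), applied per sensor (column)."""
    return (x - x.mean(axis=0)) / (x.max(axis=0) - x.min(axis=0))

def vector_standardization(x):
    """Vector standardization, assuming each sample (row)
    is divided by its Euclidean norm."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# Tiny two-sensor example
pp = np.array([[3.0, 4.0], [6.0, 8.0], [0.0, 1.0]])
vs = vector_standardization(pp)   # every row has unit norm
mn = mean_norm(pp)                # every column has zero mean
```

Either transform is applied to the PP data before PCA; the choice should follow the ML method, as argued above.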
PCA can reduce computation time by reducing the input dimensionality. Here, it reduces the 16 sensor inputs to between 6 and 10 PCs (Tables 2-5), depending on the normalization method. This calls for a careful choice of the normalization method to minimize the number of PCs.
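The dependence of the PC count on the pre-normalization can be reproduced in a few lines. This sketch uses synthetic data in which the sensors have very different ranges (an assumption, for illustration); ZS is shown via scikit-learn's `StandardScaler`.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def n_pcs(x, threshold=0.98):
    """Smallest PC count exceeding the cumulative-variance threshold."""
    cum = np.cumsum(PCA().fit(x).explained_variance_ratio_)
    return int(np.searchsorted(cum, threshold, side="right") + 1)

rng = np.random.default_rng(2)
# Synthetic stand-in: 16 insole sensors with very different ranges
pp = rng.normal(size=(400, 16)) * np.logspace(0, 3, 16)

raw_k = n_pcs(pp)                                  # large-range sensors dominate
zs_k = n_pcs(StandardScaler().fit_transform(pp))   # variance spread over sensors
```

On such data, the unnormalized variance is concentrated in a few sensors, so `raw_k` is smaller than `zs_k`: the normalization choice directly changes how many PCs must be kept.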

Conclusions
We conducted a comprehensive study of normalization methods with PCA-ML for GRF estimation, exploring 12 statistical and physical normalization methods instead of the 3 proposed by the authors in [12,13,15]. Moreover, we studied the ZS normalization method, which is widely used or recommended with PCA [9][10][11] in other application domains, although the rationale for choosing it was not explained in those works. We employed PCA-SVR, a method that, to the best of our knowledge, had never been examined in the literature on GRF component estimation. The PCA-SVR method achieved the second-highest accuracy in estimating the GRF components, surpassed only by PCA-ANN. Finally, this paper recommends specific normalization methods according to the ML method employed: for PCA-ANN and PCA-LS, the VS normalization method; for PCA-SVR, the mean method.
However, our study is limited to normal foot morphology and normal walking; further research is needed for other foot characteristics. In other words, for different conditions, the normalization methods combined with the three ML methods and PCA need to be re-evaluated for estimating the GRF components.

Figure 1. Processing flow chart for estimating GRF components from PP. The PP training set is used for ML modeling by using the corresponding GRF force plate data (in red). The PP testing set is used to evaluate the performance of GRF components by using the corresponding GRF force plate data (in blue). The gray rectangle indicates the classical pre-normalization PCA-ML pipeline.
Figure 3 presents an example of the synchronization by time-shifting between Fz_insole and Fz_force_plate for the right foot during one step.

Figure 3. Synchronization by using the time shift with the lowest RMSE between Fz_insole and Fz_force_plate for the right foot during one step.


Figure 4. The topology of the ANN for estimating the three GRF components.

Figure 4 depicts the network topology used to estimate the GRF components, where the input vector X = [P_C1, P_C2, ..., P_Ck] is followed by two hidden layers consisting of 400 and 200 nodes, respectively. The activation function of both hidden layers, denoted h_i, is chosen to be the relu function. The outputs of the output layer, y_1, y_2, and y_3, stand for the three GRF components. The outputs of node j in the first and second hidden layers are given by the following:
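The topology described in Figure 4 can be sketched with scikit-learn's `MLPRegressor`. This is a minimal stand-in, not the paper's implementation: the number of PCs, the training data, and the solver settings are assumptions; only the layer sizes (400 and 200), the relu activation, and the 3 outputs follow the text.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
k = 8                                   # hypothetical number of retained PCs
X = rng.normal(size=(200, k))           # stand-in PC scores of the PP data
Y = rng.normal(size=(200, 3))           # stand-in Fx, Fy, Fz targets

# Two hidden relu layers of 400 and 200 nodes, 3 outputs (Fx, Fy, Fz)
ann = MLPRegressor(hidden_layer_sizes=(400, 200), activation="relu",
                   max_iter=50, random_state=0)
ann.fit(X, Y)
pred = ann.predict(X)                   # one row of (Fx, Fy, Fz) per sample
```

Because the output layer has no bias-dominated constant term once trained on data that includes the swing phase, the ANN can return estimates near 0 when the foot is off the plate, which matches the discussion of Figure 6.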

Figure 5 .
Figure 5. Example of a hyperplane function that maximizes the number of measured data within the decision boundary.
ζ and ζ* indicate the errors between the two decision boundary lines and the measured data.


Figure 6. The estimated and measured GRF components for a single step during the "slow walking" and "normal walking" activities for the right foot using the intras and inters strategies.

Table 1. The number of samples (steps) of the dataset for the 9 subjects.

Table 2. Average ± SD of RMSE (R) between the estimated (by insole PP data) and measured (by force plate data) Fx and Fy components for the 9 models using the intras and inters strategies with the test dataset of the left foot.

Table 3. Average ± SD of RMSE (R) between the estimated (by insole PP data) and measured (by force plate data) Fz component for the 9 models using the intras and inters strategies with the test dataset of the left foot.

Table 4. Average ± SD of RMSE (R) between the estimated (by insole PP data) and measured (by force plate data) Fx and Fy components for the 9 models using the intras and inters strategies with the test dataset of the right foot.

Table 5. Average ± SD of RMSE (R) between the estimated (by insole PP data) and measured (by force plate data) Fz component for the 9 models using the intras and inters strategies with the test dataset of the right foot.
