PERSIST: A Multimodal Dataset for the Prediction of Perceived Exertion during Resistance Training

Measuring and adjusting the training load is essential in resistance training, as training overload can increase the risk of injuries. At the same time, too little load does not deliver the desired training effects. Usually, external load is quantified using objective measurements, such as lifted weight distributed across sets and repetitions per exercise. Internal training load is usually assessed using questionnaires or ratings of perceived exertion (RPE). A standard RPE scale is the Borg scale, which ranges from 6 (no exertion) to 20 (the highest exertion ever experienced). Researchers have investigated predicting RPE for different sports using sensor modalities and machine learning methods, such as Support Vector Regression or Random Forests. This paper presents PERSIST, a novel dataset for predicting PERceived exertion during reSIStance Training. We recorded multiple sensor modalities simultaneously, including inertial measurement units (IMU), electrocardiography (ECG), and motion capture (MoCap). The MoCap data has been synchronized to the IMU and ECG data. We also provide heart rate variability (HRV) parameters obtained from the ECG signal. Our dataset contains data from twelve young and healthy male participants with at least one year of resistance training experience. Subjects performed twelve sets of squats on a Flywheel platform with twelve repetitions per set. After each set, subjects reported their current RPE. We chose the squat exercise as it involves the largest muscle group. This paper demonstrates how to access the dataset. We further present an exploratory data analysis and show how researchers can use IMU and ECG data to predict perceived exertion.


Introduction
Controlling the exercise load during resistance and aerobic training is crucial for optimal training programming. While excessive training loads increase the risk of injuries, training loads below a certain threshold do not induce optimal training effects [1]. Usually, the external load is quantified by objective measurements, such as the distance traveled or the weight lifted. However, subjective measurements, such as questionnaires or Ratings of Perceived Exertion (RPEs), can provide valuable information about the internal load of athletes [2]. An RPE scale is a numerical scale on which athletes rate their perceived exertion. A standard scale is the Borg scale [3], which ranges from 6 to 20, where 6 represents no exertion and 20 is the highest exertion ever experienced. In recent years, research efforts have been made to predict RPE values during physical exercise on the fly using unobtrusive sensor systems. According to a study by Davidson et al. [4], manual reporting of RPE values is challenging due to compliance errors, recall bias, peer pressure, or dishonest reporting. Data-driven models trained on sensor data could predict subjective RPE values in real time and alleviate these reporting issues. Especially during the COVID pandemic, the need for home monitoring systems increased, as many people could not attend gym or sports lessons. By predicting the perceived exertion in the form of RPE values, an automated exercise feedback coach could issue warnings once the training load becomes excessive.
Many studies have investigated physical exertion prediction for different protocols using machine learning on sensor data. For this, various sensor modalities, such as Electromyography (EMG), heart rate (HR), Inertial Measurement Units (IMUs), Electrodermal activity (EDA), optical Motion Capture (MoCap), and Global Positioning System (GPS), have been used. Moreover, different training environments have been investigated, ranging from unrestricted outdoor training, such as free running or football training, to controlled lab studies, including resistance training protocols. An overview of related studies aiming to predict RPE values is provided in Table 1. It shows the cohort information, the sensor modalities used, and information about the training protocol. The studies have used different versions of the RPE scale, including the classic Borg scale, the Borg CR-10 scale, and custom scales. The Borg CR-10 scale is a modified version of the Borg scale, which ranges from 0 to 10. Some studies have modified the target for the RPE prediction by normalizing the RPE values or dividing the RPE scale into different intensity classes. Table 1 also presents the different prediction targets in more detail. The older studies presented here have used traditional machine learning approaches, such as Random Forest (RF), Gradient Boosting Regression Trees (GBRT), and Support Vector Regression (SVR). Recent studies, e.g., by Jiang et al. [5], have also investigated deep learning-based methods, such as Convolutional Neural Networks (CNNs).
Table 1. Related work sorted by year. Studies include the first author's name, the study population size, the recorded sensor modalities, the target RPE scale, and whether the dataset is publicly available (PA). Some datasets might only be accessible by asking the authors.
As shown in Table 1, the presented studies share the goal of predicting RPE but differ in their study setups. Pernek et al. [6] only included upper limb strength exercises performed using dumbbells, which raises the question of whether the exercises properly induced fatigue in the subjects. Other existing studies were conducted during team training sessions [7,8,10]. Unrestricted experiment settings have the disadvantage of potentially including uncontrolled variables and thus lack reproducibility for other researchers. The study most closely related to ours was conducted by Jiang et al. [5]. It also included the squat exercise, among two other exercises. However, the authors did not define strict inclusion criteria. Subjects performed sets of five repetitions each until exhaustion, but the exertion was not confirmed, e.g., by taking lactate measurements. The authors showed that subjects performed very differently: the lowest number of sets was nine, while the best subject performed 52 sets. Our dataset contributes to the existing body of research, as it contains a homogeneous cohort performing a single exercise in a controlled lab environment. The induced muscle fatigue was confirmed by blood lactate analysis. We have chosen the squat exercise to exhaust the lower extremity muscles, the body's largest muscle group. The squat is one of the integral exercises in resistance and conditioning training and represents an overall measure of lower-body strength [11]. A squat simultaneously activates many large muscles in the body, including the glutes, the quadriceps, and the hamstrings. Instead of using an external weight with a barbell and plates, we decided to use a Flywheel platform, a training device that generates load independently of gravity. The training device makes it possible to generate an eccentric overload, i.e., a higher load in the phase in which a muscle lengthens while it is contracting. Studies have shown that Flywheel training is effective in increasing muscle mass and strength, while also offering benefits for rehabilitation and injury prevention [12]. In particular, the Flywheel can induce significant fatigue in the lower extremity muscles while preventing axial loading on the spine, a factor which could greatly increase injury risk, especially under fatigue. In this respect, it offers an advantage over free weight exercises, which often use barbells and dumbbells. From a technical perspective, the Flywheel platform is advantageous because the cameras can observe the participant without obstruction, which is not always the case when using barbells and dumbbells.
We recorded multiple sensor modalities, including Electrocardiography (ECG), IMU, 3D cameras, and physical measurements obtained from a Flywheel training device. To the best of our knowledge, the datasets presented in Table 1 are not publicly available or are available only upon request, which makes it hard to reproduce the results or improve the machine learning algorithms. We aim to contribute to the field of estimating RPE ratings via machine learning by making our dataset available to other researchers.
The rest of the paper is structured as follows: Section 2 provides an overview of the data collection and the measurement methods used. Section 3 explains the dataset structure and highlights methods for processing the obtained data. In Section 4, we present a data analysis of multiple modalities. The paper ends with a conclusion in Section 5.

Materials and Methods
This section presents the study setup, the defined protocol, and the sensors used for the data collection. We have selected the squat exercise for our protocol as it targets the largest muscle groups of the body and, therefore, should induce fatigue quickly and reliably. Instead of performing squats with a weighted barbell placed on the shoulders, we chose a Flywheel training platform. The Flywheel generates resistance from the inertia of a weight plate that the athlete accelerates. The load in Flywheel training is determined by the diameter of the inertial disc as well as by its angular velocity [13]. The ethics review board (ERB) of the University of Potsdam approved this study (Application no. 21/2021). The data recording took place between April and May 2021 in the lab of the Connected Healthcare chair at the Hasso Plattner Institute, University of Potsdam.

Participants
For our study, we recruited sixteen young and healthy male participants. Participants were between 18 and 30 years old and underwent screening using the Physical Activity Readiness Questionnaire (PAR-Q), a frequently used questionnaire to assess the physical state [14]. Furthermore, participants needed to have performed regular resistance training for at least one year. Table 2 shows the anthropometric data and the athletes' average weekly workout times. Due to the SARS-CoV-2 pandemic, we inquired about the athletes' weekly training times and how these had changed since the second lockdown in Germany and the associated closing of gyms. At this early stage, we only included male participants to have a homogeneous population. Additionally, subjects had to be able to execute the squat exercise correctly, i.e., bringing the thighs parallel to the floor. Every participant gave written consent before the data recording. Of the 16 individuals who participated in this study, 12 gave their consent to share data anonymously.

Flywheel Training Machine
The Flywheel training machine (Exxentric Training, Sweden), shown in Figure 1, does not work with an external weight that is accelerated towards the ground by gravity. Instead, it creates and controls the force and training intensity with a spinning inertial weight plate. The Flywheel consists of a platform with a wheel connected to a harness. Pulling the harness up accelerates the wheel and creates a moment of force. The wheel can be stopped by maintaining a static position or by performing a countermovement that neutralizes the energy. For the squat exercise, the athlete wears a hip harness that is connected to the wheel via a belt, which is wrapped around a transmission shaft fixed to the Flywheel at the other end. The starting position is in a squat, as shown in Figure 2a. When the participant extends the knees and hips to move the center of mass upwards, the belt unwraps from the shaft and spins up the Flywheel. The upward motion is caused by concentric contraction of the knee and hip extensors. At the topmost position, as shown in Figure 2b, the Flywheel induces a downward pull on the belt, which the athlete has to counteract. Biomechanically, the participant neutralizes the Flywheel's rotational energy by controlling the downward motion with an eccentric movement. When the motion halts, the subject is again in the starting position. In contrast to barbell squats, the starting position is the squat itself, not the standing position. The Flywheel platform can be operated with plates of different sizes. All participants in our experiment used the medium-sized plate (0.025 kg·m²). For squats with external weight, such as a barbell, load is typically determined by measuring the one-repetition maximum (1 RM) of an athlete and using a certain percentage of that value for training. However, such load quantification is impossible in Flywheel training [13]. We determined the training load for each participant by having them perform several repetitions with maximum effort. Here, they had to apply force against the belt strapped around their waist as fast and hard as possible. This force was transferred via a strap to an axle around which the inertial weight was secured. The approach is similar to that reported by Raeder et al. [11]. During this test, the so-called max speed test, the time of each repetition was first measured. Then, the participants had to perform all subsequent squats at 90% of this velocity. To ensure the timing was correct, all repetitions during the fatigue protocol were guided by a visual metronome.
The Flywheel platform comes with an optional measurement unit, the so-called kMeter. This sensor measures the Flywheel's rotation at 500 Hz. It calculates information such as the concentric, eccentric, and average power (in watts), energy (in kilojoules), number of repetitions, and vertical movement (in cm). The accuracy of the kMeter sensor was evaluated using a force plate as a reference in the study by Weakley et al. [15]. The kMeter device is positioned underneath the Flywheel platform. We recorded the kMeter data during the experiments by streaming the sensor data to an Android smartphone in real time using the kMeter app.

Study Setup
We used multiple sensor sources during our experiments, including IMU, ECG, RGB-D cameras, and the kMeter device (as mentioned in the previous section). Figure 3a shows the study setup in the laboratory, including the camera placement. Figure 3b shows the placement of IMU and ECG sensors, as further explained in the following sections.

Inertial Measurement Unit Sensors
IMU sensors measure their own movement in three dimensions. They measure linear acceleration (m/s²) and angular velocity (deg/s) with three accelerometers and three gyroscopes, respectively, each set arranged orthogonally. For our data collection, we used six Physilog 5 (GaitUp ® Corporation, Lausanne, Switzerland) IMU sensors that recorded 3D acceleration and 3D angular velocity. Figure 3b shows a sensor unit with its 3D axes. The sensors sampled data at a frequency of 128 Hz.
We decided on our sensor locations based on related studies and our own experience. Following the study by O'Reilly et al. [16], we placed a sensor on the back at the height of the fifth lumbar vertebra. We placed another sensor on the sternum to measure chest displacement relative to the lower back. Increased relative movement between the sternum and lower back might indicate an incorrect posture of the participant that could be prominent in the data. Four IMU sensors were placed on the right and left thigh and the right and left calf, as proposed by Lee et al. [17]. Figure 3b shows the entire sensor placement. The six sensors streamed the data in real time via Bluetooth to a custom Android application developed for online streaming of sensor data (SensorHub [18]).

Electrocardiography Device
ECG data was recorded using the one-channel Faros 180 sensor (Bittium ® Corporation, Oulu, Finland). ECG sensors measure the electrical activity of the heart muscle, where the QRS complex is the most prominent pattern in every heartbeat. The QRS complex reflects the ventricular stimulation, with the R peak as the point of maximum expansion of stimulation of the heart muscle cells. This is reflected as the peak with the maximum amplitude in the ECG signal. The electrode placement of our one-channel system is shown in Figure 3b. The Faros 180 sampled the ECG data at 1000 Hz and sent it directly to the SensorHub app. The ECG signal was recorded during the entire session, including the resting phases of the protocol. From the ECG data, so-called Heart Rate Variability (HRV) parameters can be calculated by measuring the time between successive R peaks. HRV parameters are derived from the change in the intervals between R peaks and can be interpreted to provide a wealth of information about the status of a subject [19]. A higher heart rate implies a greater strain on the cardiovascular system, for example as a result of exercising. Overall, HRV parameters can be split into time-domain, frequency-domain, and non-linear parameters. The Faros 180 also integrates an accelerometer that samples 3D acceleration data at a sampling frequency of 100 Hz.
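To illustrate how time-domain parameters follow from R peaks, the sketch below derives RR intervals, the mean RR interval, and the mean heart rate from a list of R-peak timestamps. It is a minimal illustration, not the Kubios pipeline used for the dataset: the function names are hypothetical, R-peak detection is assumed to have been done already, and RMSSD is included only as a common example of a time-domain HRV parameter.

```python
def rr_intervals_ms(r_peak_times_s):
    """Successive R-R intervals in milliseconds from R-peak times in seconds."""
    return [(b - a) * 1000.0 for a, b in zip(r_peak_times_s, r_peak_times_s[1:])]


def mean_rr_ms(rr_ms):
    """Mean duration of the RR intervals within a window (ms)."""
    return sum(rr_ms) / len(rr_ms)


def mean_heart_rate_bpm(rr_ms):
    """Average heart rate in beats per minute (60,000 ms per minute)."""
    return 60000.0 / mean_rr_ms(rr_ms)


def rmssd_ms(rr_ms):
    """Root mean square of successive RR differences (assumed example parameter)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return (sum(d * d for d in diffs) / len(diffs)) ** 0.5


# Hypothetical R peaks, one every 0.5 s (a perfectly regular rhythm)
r_peaks = [0.0, 0.5, 1.0, 1.5, 2.0]
rr = rr_intervals_ms(r_peaks)
print(mean_rr_ms(rr), mean_heart_rate_bpm(rr), rmssd_ms(rr))
```

A perfectly regular rhythm has zero variability, so RMSSD is zero in this toy input; real recordings yield non-zero values.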

Microsoft Azure Kinect Cameras
During the exercise part of the protocol, the subjects were recorded using two Microsoft Azure Kinect cameras. This camera combines a 12 MP RGB camera, an infrared emitter and receiver, a 7-microphone array, and an IMU sensor. The camera uses time-of-flight technology to create depth images with a 1 MP resolution (1024 × 1024 px). The core feature of this Kinect camera and its predecessors is the available skeleton tracking algorithm, which can track up to 32 landmarks of users in 3D space. As investigated in the study by Ryselis et al. [20], markerless skeleton tracking on monocular camera systems has problems detecting complicated poses that deviate from standard poses. This problem also occurs in functional movements, such as the squat exercise. Therefore, their study investigated a three-camera Kinect system that fused kinematic data from all cameras and was tested during a functional sport protocol. They analyzed limb length, which is the distance between two adjacent joints. Limb length should stay constant as bones are rigid. The authors assessed the intra-session variability of normalized limb lengths obtained from the camera system using the intraclass correlation coefficient (ICC). They defined an intra-session as a single session divided into two parts of equal length. The authors obtained a test-retest reliability of ICC = 0.892. Another study by Kotsifaki et al. [21] investigated the reliability of a dual Kinect camera system using the predecessor, Microsoft Kinect v2. They evaluated the single-leg squat against a gold-standard marker-based MoCap system. This study found that agreement improved when using a dual Kinect system instead of a single camera. The authors found high agreement in the peak angles during the single-leg squat, with an ICC(2, k) of 0.665 ≤ ICC ≤ 0.932. Moreover, the SEM ranged from 2.5 to 4.1 degrees. In a previous study, we evaluated the pose-tracking accuracy of the Microsoft Kinect v2 and Azure Kinect against a Vicon gold-standard MoCap system during treadmill walking [22]. The results indicated that the skeleton tracking algorithms deliver similar pose tracking errors, while the Azure Kinect provides better accuracy for the foot and ankle markers. Therefore, we have used multiple Azure Kinect cameras to improve the skeleton tracking quality, similar to the study presented by Xing et al. [23]. The Kinect cameras were each placed at an angle of approximately 45 degrees. Figure 4 shows two simultaneously captured depth images. Both cameras captured data at 30 Hz. The Microsoft Azure Kinect offers an easy-to-use temporal synchronization of multiple devices via hardware using a 3.5 mm audio cable. It allows for two different configurations, the star and the daisy chain. We have defined one camera as the master and the other as a subordinate device and connected both using the appropriate wiring on the sync ports. Data were recorded using the Microsoft Azure Kinect recorder tool, which saved the incoming RGB and depth camera streams in the Matroska file format (.mkv). After the recording, the skeleton data was extracted using the Microsoft Azure Kinect Body Tracking SDK version 1.1.2, the latest version at the time of writing. Due to data protection regulations, we only share 2D and 3D joint positions and 3D joint orientations.

Protocol Definition
The protocol of this study is shown in Figure 5 and took approx. 90 min. As a first step, lactate measurements were taken from the earlobe (EKF Diagnostics, Cardiff, UK). This was followed by five minutes of rest, during which participants watched a relaxing video. Then a second blood sample was taken from the earlobe. Afterwards, the participants performed a warm-up set for two minutes. Then, the target repetition time was determined by asking the participants to perform a few repetitions as fast as possible. Ninety percent of the mean duration of the repetitions was set as the target time for each repetition during the protocol. The fatigue protocol consisted of four series, each followed by a break of 180 s to allow for adequate rest during the exercise. All series consisted of three sets that took about 35 s each. Breaks of 60 s were included after the first and second set of each series, while the 180 s series break followed the third and last set of each series. Each set contained 12 squats on the Flywheel training machine. After every set, subjects reported their current RPE rating. After the fatigue protocol, blood lactate was measured one last time. A significant increase in blood lactate indicates intense exercise, as the body can no longer clear all the lactate produced [24]. Then subjects rested for 20 min, during which ECG data was measured to confirm the return of the heart parameters to baseline. Finally, 15 min after the last squat, subjects reported their session RPE.
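The timing of the fatigue protocol can be made explicit with a short sketch. It builds a nominal timeline from the figures stated above (four series, three sets of about 35 s, 60 s breaks within a series, 180 s after each series). The function name and the fixed 35 s set duration are assumptions for illustration; actual set durations in the recordings vary per subject.

```python
def build_protocol(n_series=4, sets_per_series=3, set_s=35,
                   set_break_s=60, series_break_s=180):
    """Return a nominal timeline as a list of (label, start_s, end_s) tuples."""
    timeline, t = [], 0
    for series in range(1, n_series + 1):
        for s in range(1, sets_per_series + 1):
            # One set of 12 squats, assumed to last set_s seconds
            timeline.append((f"series {series} set {s}", t, t + set_s))
            t += set_s
            # Long break after the last set of a series, short break otherwise
            pause = series_break_s if s == sets_per_series else set_break_s
            timeline.append(("break", t, t + pause))
            t += pause
    return timeline


timeline = build_protocol()
sets = [entry for entry in timeline if entry[0] != "break"]
print(len(sets), "sets,", timeline[-1][2] / 60, "min nominal duration")
```

Under these nominal values, the fatigue protocol comprises 12 sets (144 squats) and lasts 27 min including all breaks.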

Data
This section describes the dataset's structure and presents various data processing methods.

Dataset Structure
The dataset is organized in a subject-centered structure, as shown in Figure 6. The root level of each subject folder contains meta files with subject-specific information, which is explained in the following paragraph. Each sensor modality (MoCap, IMU, ECG) is stored in a respective folder. The IMU and ECG data are available in two versions: truncated and untruncated. The truncated version does not contain data from the resting phases. For the IMU and ECG data, the recorded sensor timestamps are relative to the recording time, i.e., they start at zero seconds. We further provide preprocessed HRV and MoCap data, as explained in Sections 3.2 and 3.

ECG Data Processing
The Faros 180 sensor saves the ECG data in the European Data Format (EDF). The sensor's accelerometer data is stored in a comma-separated value (CSV) file. It is hard to interpret raw ECG signals directly. We have calculated HRV parameters using the proprietary Kubios Premium [25] software to gain more insights. Kubios calculates many HRV parameters for a recording in time windows of adjustable lengths. Table 3 shows the set of available HRV parameters. The minimum window length of the Kubios software is 30 s. Larger windows contain more information. However, a set of twelve squats in the protocol usually took around 35 s, with the heart starting recovery to baseline immediately after. Therefore, we chose a short window size to minimize the effects of other repetitions or breaks on the measured data. The Kubios report files are stored in .txt format and are machine readable after skipping the header information. The HRV parameters in the time domain are derived from the RR interval, the temporal distance between two consecutive R peaks measured in ms. The mean RR parameter is the mean duration of RR intervals within a given window. The heart rate is the average number of heartbeats per minute. The Training Impulse (TRIMP) parameter is more complex. It shows how the training load has accumulated in the training session. It is the product of training volume in minutes and the training intensity, modeled as the heart rate reserve information ∆HR (as shown in Equation (1)), according to Morton et al. [26].
$$\Delta HR = \frac{HR_{sample} - HR_{rest}}{HR_{max} - HR_{rest}} \qquad (1)$$
In this equation, $HR_{sample}$, $HR_{rest}$, and $HR_{max}$ refer to the heart rate of the current window, the resting heart rate, and the maximum heart rate. The final TRIMP parameter is calculated as shown in Equation (2), where $T$ refers to the training duration.
$$TRIMP = T \cdot \Delta HR \qquad (2)$$
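As a worked example of the TRIMP description above, the following sketch computes the heart rate reserve fraction and the product of training duration and intensity. It assumes the plain product form stated in the text; any additional weighting factor that Kubios may apply internally is not modeled, and the function names and example heart rates are hypothetical.

```python
def heart_rate_reserve(hr_sample, hr_rest, hr_max):
    """Fraction of the heart rate reserve used in the current window (Delta HR)."""
    if hr_max <= hr_rest:
        raise ValueError("hr_max must be greater than hr_rest")
    return (hr_sample - hr_rest) / (hr_max - hr_rest)


def trimp(duration_min, hr_sample, hr_rest, hr_max):
    """Training impulse as duration (min) times intensity (Delta HR).

    Assumption: unweighted product, as described in the text.
    """
    return duration_min * heart_rate_reserve(hr_sample, hr_rest, hr_max)


# Hypothetical values: 125 bpm during the window, 60 bpm at rest, 190 bpm maximum
delta_hr = heart_rate_reserve(125, 60, 190)   # (125 - 60) / (190 - 60)
print(delta_hr, trimp(2.0, 125, 60, 190))
```

With these values, half of the heart rate reserve is used, so two minutes of training accumulate a TRIMP of 1.0.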

Skeleton Data Processing
As mentioned in Section 2.6, we used two Microsoft Azure Kinect cameras from two different viewpoints. Thus, the two skeletons' time series are in two different local 3D camera coordinate systems. Each skeleton contains measurement errors, so we aim to fuse both skeletons to compensate for one camera's measurement errors with the other. Therefore, we begin by transforming both skeletons into a global coordinate system before applying fusion methods. We solve this problem by finding a rotation $R \in \mathbb{R}^{3 \times 3}$ and a translation $t \in \mathbb{R}^{3}$ that register the left skeleton to the right skeleton. To this end, the left and right skeletons are denoted as $R_i^j, L_i^j \in \mathbb{R}^{3}$ for joints $j \in \{1, \ldots, 32\}$ and timestamps $i \in \{1, \ldots, M\}$. We re-order both skeletons into point sets $P = \{p_1, \ldots, p_n\}$ and $Q = \{q_1, \ldots, q_n\}$ by flattening all joints and timestamps, with $p_i, q_i \in \mathbb{R}^{3}$. We then use the SVD-based Kabsch algorithm to minimize the cost function given in Equation (3).
$$(R, t) = \underset{R,\, t}{\arg\min} \sum_{i=1}^{n} \left\lVert (R\, p_i + t) - q_i \right\rVert^2 \qquad (3)$$
The intermediate result is two overlapping skeletons that contain measurement errors and potentially large outliers, as shown in Figure 7a. When fusing both skeletons using a simple average filter, the final result would be affected by outliers from one of the two skeletons. Thus, we implemented a more advanced fusion method that considers the nature of human movement. The assumption is that human movement is generally smooth, so measurement errors cause larger jumps between frames. Therefore, we increase the weights $w_i^R$ or $w_i^L$ of the respective camera if the joint has a smaller gradient between two consecutive frames, as shown in Equation (4). In our experiments, the exponent $\alpha$ further punishes the weights and is set to $\alpha = 1.4$. The final result is a fused skeleton $F$ with joints $f_i$ calculated as shown in Equation (5).
Figure 7 shows an example of the knee joint where both cameras show large outliers that are compensated by the other camera. Finally, a 4th order Butterworth filter was applied to the fused skeleton data.
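The registration and fusion steps can be sketched as follows. The Kabsch part is the standard SVD-based solution for a rigid transform between paired point sets. The fusion part is only one plausible reading of the description above: it assumes weights inversely proportional to the frame-to-frame joint displacement raised to the power α and a normalized weighted average; the exact weight definition used for the dataset may differ. All function names and the small constant `eps` are hypothetical.

```python
import numpy as np


def kabsch(P, Q):
    """Rigid transform (R, t) minimizing sum ||R p_i + t - q_i||^2 for (n, 3) arrays."""
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    H = (P - p_mean).T @ (Q - q_mean)            # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean
    return R, t


def apply_transform(skeleton, R, t):
    """Apply (R, t) to a skeleton of shape (frames, joints, 3)."""
    return skeleton @ R.T + t


def fuse_skeletons(left, right, alpha=1.4, eps=1e-6):
    """Weighted average of two registered skeletons of shape (frames, joints, 3).

    Assumption: a joint that jumps less between consecutive frames is trusted
    more, with weight 1 / (displacement + eps) ** alpha.
    """
    def weights(x):
        step = np.linalg.norm(np.diff(x, axis=0, prepend=x[:1]), axis=-1)
        return 1.0 / (step + eps) ** alpha

    w_l, w_r = weights(left), weights(right)
    total = (w_l + w_r)[..., None]
    return (w_l[..., None] * left + w_r[..., None] * right) / total
```

In use, the left skeleton would be flattened to an (n, 3) point set together with the right one, `kabsch` would estimate the transform, and `apply_transform` would move the left skeleton into the right camera's coordinate system before calling `fuse_skeletons`.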

Synchronization of Azure Kinect and IMU Data
As mentioned above, the ECG and Physilog IMU sensors were already temporally synchronized. The Azure Kinect cameras only recorded data during the physical exercise. The Kinect and the ECG or IMU modalities must be temporally synchronized for sensor fusion use cases. For this purpose, we selected an IMU sensor at a similar location to one of the Kinect markers, e.g., the chest IMU sensor and the sternum marker. We calculated the acceleration of the Kinect marker along the vertical axis. Subsequently, both acceleration signals can be synchronized using cross-correlation. Figure 8 shows an example set where the Azure Kinect camera was temporally aligned with the IMU signals. We filtered the IMU data using a 4th order Butterworth filter before applying cross-correlation.
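The synchronization idea can be sketched with NumPy. The sketch differentiates the vertical marker position twice to obtain an acceleration, resamples a signal to a common rate, and estimates the lag at which the cross-correlation peaks. It is an illustration under assumptions: the function names are hypothetical, linear interpolation is used for resampling (the Kinect records at 30 Hz and the IMUs at 128 Hz), and the Butterworth filtering step is omitted.

```python
import numpy as np


def vertical_acceleration(position, fs):
    """Second time derivative of a 1-D position signal sampled at fs Hz."""
    dt = 1.0 / fs
    return np.gradient(np.gradient(position, dt), dt)


def resample(signal, fs_in, fs_out):
    """Resample a 1-D signal to fs_out Hz using linear interpolation."""
    t_in = np.arange(len(signal)) / fs_in
    n_out = int(np.floor(t_in[-1] * fs_out)) + 1
    t_out = np.arange(n_out) / fs_out
    return np.interp(t_out, t_in, signal)


def estimate_lag(reference, other):
    """Samples by which `other` is delayed relative to `reference`.

    Both signals must share the same sampling rate. A positive result means
    that `other` starts later than `reference`.
    """
    a = (reference - reference.mean()) / reference.std()
    b = (other - other.mean()) / other.std()
    xcorr = np.correlate(b, a, mode="full")
    # Index len(a) - 1 of the full cross-correlation corresponds to zero lag
    return int(np.argmax(xcorr)) - (len(a) - 1)
```

Dividing the estimated lag by the common sampling rate gives the time offset in seconds, which can then be applied to the Kinect timestamps.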

Evaluation
In this section, we investigate the dataset by conducting an exploratory data analysis, mainly on the Flywheel data modalities. Further, we present a prior study on predicting perceived exertion from IMU and HRV data.

Exploratory Data Analysis
We begin our dataset exploration by looking at the distribution of the RPE values reported by the twelve subjects. Figure 9 presents the distribution of RPE values as a heatmap (Figure 9a) and as a histogram (Figure 9b). One subject stopped the experiment due to extreme exhaustion. As a next step, we analyze the Flywheel data by looking at the average power. We take the sensor readings from the kMeter device, as explained in Section 2.2. Outliers in the kMeter data were removed using a z-score filter with a threshold of three standard deviations (3σ). We compare the data collected for each repetition to the RPE value reported for the corresponding set. Figure 10 shows one subject's average power (over the concentric and eccentric phases) for the entire protocol and for individual sets. We calculate Pearson's correlation coefficient (PCC) between the reported RPE values and both the individual repetitions and the mean values of each set. Since we hypothesized that performance decreases within a set, we also show a linear regression over all repetitions in each set. For this subject, the average power decreases over the entire protocol while the RPE values increase, which leads to a strong negative correlation between RPE and average power (PCC = −0.82). In contrast, the subject shown in Figure 11 shows no correlation between average power and RPE values. The power output seems to remain constant over the entire protocol while the RPE values increase. For individual sets, the average power sometimes even increases, as shown by a positive slope of the regression lines. These two figures show that the average power correlates with the reported RPE values for some subjects but not for others. We therefore investigate the correlations between each subject's reported RPE values and the other kMeter features, each averaged per set. Figure 12 shows the correlations for all subjects and features. For most subjects, the average repetition duration is positively correlated with the RPE values, i.e., the more fatigued the subject, the more slowly the movement is executed. In contrast, the correlation of the average power is negative for nine out of twelve subjects, i.e., the average power decreases during the protocol. However, for the other subjects, the correlation of the average power is close to zero (PCC = 0.05) or even clearly positive (PCC > 0.5), making it difficult to use this feature alone for prediction. In this initial exploratory data analysis, we only investigated the Flywheel modality, as the kMeter provides physical measurements aggregated as high-level information. Our data exploration revealed that most Flywheel parameters correlate with the reported RPE values, either positively or negatively. However, exceptions exist where individual subjects performed very differently from the others, e.g., the first two subjects showed inverted correlations for most features. It is necessary to conduct further data analysis on the other modalities, as they can reveal more information and trends in the data.
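The per-subject correlation analysis can be reproduced along the following lines. The sketch removes outliers with a z-score threshold of three, averages a kMeter feature per set, and computes Pearson's correlation coefficient with the RPE values. The function names and the numbers in the example are hypothetical and do not come from the dataset.

```python
import numpy as np


def zscore_filter(values, threshold=3.0):
    """Drop values that lie more than `threshold` standard deviations from the mean."""
    values = np.asarray(values, dtype=float)
    std = values.std()
    if std == 0:
        return values
    z = np.abs(values - values.mean()) / std
    return values[z <= threshold]


def set_means(feature_per_set):
    """Mean of a feature for each set, computed after outlier removal."""
    return np.array([zscore_filter(reps).mean() for reps in feature_per_set])


def pearson(x, y):
    """Pearson's correlation coefficient between two 1-D sequences."""
    return float(np.corrcoef(x, y)[0, 1])


# Hypothetical subject: average power (W) of three repetitions in four sets
power_per_set = [[305, 300, 295], [295, 290, 285], [285, 280, 275], [275, 270, 265]]
rpe_per_set = [11, 13, 15, 17]
print(pearson(set_means(power_per_set), rpe_per_set))
```

A coefficient near −1 in this toy example mirrors the pattern of the subject in Figure 10, where power drops as the reported exertion rises.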

Prediction of Subjective Exertion Using Heart Rate and IMU Data
In our previous study, published in [27], we investigated how to predict subjective exertion using machine learning on the collected heart rate and movement signals from the IMU sensors. We used data from all 16 subjects. We further investigated the benefit of the HRV data by training on IMU data alone and on combined IMU and HRV data. Subsequently, we investigated the impact of individual features. The objective of this study was to use only wearable sensors and not to include other modalities, such as the cameras and Flywheel data. The motivation was to develop a sensor system that can be worn entirely on the body and could, in the future, work in the wild without laboratory restrictions.
The collected IMU movement data was processed using a sliding window approach with different sizes and overlaps. Furthermore, we filtered the IMU data using a 4th order Butterworth filter with different cut-off frequencies. Subsequently, we calculated statistical features for each window of IMU data, separately for every sensor and sensor axis. With six sensors, six axes per sensor, and eight statistical features, we obtain a feature vector with 6 · 6 · 8 = 288 entries per window. Our feature set includes minimum, maximum, mean, median, root mean square (RMS), kurtosis, skewness, and standard deviation. The HRV data were processed using the Kubios Premium software, as explained in Section 3.2. To obtain the maximum number of windows, we used a 30-s sliding window configuration, the smallest reasonable configuration for calculating HRV parameters. After processing the entire dataset, we combined the HRV and IMU data windows by selecting the HRV window closest in time to every IMU data window.
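The windowed feature extraction can be sketched as follows. The code cuts a multi-channel IMU recording into overlapping windows and computes the eight listed statistics per channel, which yields 288 values for 36 channels. The window size, the overlap, the channel layout, and the use of excess kurtosis are assumptions for illustration; the filtering step is omitted.

```python
import numpy as np


def sliding_windows(signal, size, overlap):
    """Yield windows of `size` samples with fractional `overlap` (0 <= overlap < 1)."""
    step = max(1, int(size * (1.0 - overlap)))
    for start in range(0, len(signal) - size + 1, step):
        yield signal[start:start + size]


def window_features(window):
    """Eight statistics per channel for a window of shape (samples, channels)."""
    mean = window.mean(axis=0)
    std = window.std(axis=0)
    centered = window - mean
    safe_std = np.where(std > 0, std, 1.0)       # avoid division by zero
    skew = (centered ** 3).mean(axis=0) / safe_std ** 3
    kurt = np.where(std > 0, (centered ** 4).mean(axis=0) / safe_std ** 4 - 3.0, 0.0)
    rms = np.sqrt((window ** 2).mean(axis=0))
    feats = np.stack([window.min(axis=0), window.max(axis=0), mean,
                      np.median(window, axis=0), rms, kurt, skew, std])
    return feats.T.ravel()                       # grouped by channel


def extract_features(signal, size, overlap):
    """Feature matrix of shape (windows, channels * 8)."""
    return np.array([window_features(w) for w in sliding_windows(signal, size, overlap)])


# Hypothetical recording: 10 s at 128 Hz with 6 sensors x 6 axes = 36 channels
recording = np.random.default_rng(0).standard_normal((1280, 36))
features = extract_features(recording, size=256, overlap=0.5)
print(features.shape)
```

Each row of the resulting matrix corresponds to one window and can be paired with the RPE value of the set in which the window was recorded.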
After processing the IMU and HRV data, multiple machine learning models were trained, including Gradient-Boosting Regression Trees (GBRT), Support Vector Regression (SVR) with linear and radial basis function kernels, and random forest (RF). We trained the models for multiple epochs on the shuffled data. We evaluated the machine learning models using leave-one-subject-out (LOSO) cross-validation to obtain a fair evaluation on unseen subjects and to guard against overfitting to individual subjects. Evaluation metrics were mean absolute percentage error (MAPE), coefficient of determination (R²), mean squared error (MSE), and root mean squared error (RMSE). Table 4 summarizes the results of the different models. The GBRT model achieved the best result. We further investigated the feature importances by training an SVR model on both IMU and HRV data. Table 5 shows the ten most important features across both data modalities. We conclude that the most important feature is the TRIMP HRV feature. More details about these findings are available in Albert et al. [27].
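The LOSO evaluation protocol and the four reported metrics can be sketched as follows. This is a model-agnostic illustration, not the training code from [27]: the mean predictor is a deliberately trivial stand-in for GBRT, SVR, or RF, and the function names and toy data are assumptions. In practice, a library splitter such as scikit-learn's `LeaveOneGroupOut` provides the same kind of split.

```python
import math


def loso_splits(subject_ids):
    """Yield (held_out, train_idx, test_idx), holding out one subject per fold."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


def regression_metrics(y_true, y_pred):
    """MSE, RMSE, MAPE, and R^2 for one fold."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    # Borg RPE values are >= 6, so dividing by the true value is safe here.
    mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot if ss_tot > 0 else float("nan")
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAPE": mape, "R2": r2}


def fit_mean(X_train, y_train):
    """Stand-in model (assumption): always predicts the mean training RPE."""
    mean = sum(y_train) / len(y_train)
    return lambda X: [mean for _ in X]


def evaluate_loso(X, y, subject_ids, fit):
    """Train on all subjects but one, test on the held-out subject, per fold."""
    results = {}
    for held_out, train, test in loso_splits(subject_ids):
        predict = fit([X[i] for i in train], [y[i] for i in train])
        y_pred = predict([X[i] for i in test])
        results[held_out] = regression_metrics([y[i] for i in test], y_pred)
    return results


# Toy data invented for illustration (hypothetical, not from PERSIST).
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6]]
y = [10, 14, 11, 15, 12, 16]
subjects = ["s1", "s1", "s2", "s2", "s3", "s3"]
results = evaluate_loso(X, y, subjects, fit_mean)
```

Because no window of the held-out subject is seen during training, each fold measures how well a model generalizes to a new person, which is the relevant scenario for a wearable feedback system.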
This initial study has shown that it is possible to predict perceived exertion from HRV and IMU data using conventional machine learning models. When investigating the feature importance, the TRIMP feature was ranked as the most important feature. As Table 4 reveals, when the models are trained using IMU data alone, the results are much worse, as indicated by the R² metric, which lies in the range −0.05 ≤ R² ≤ 0.08. An R² of zero means that a model performs no better than always predicting the mean RPE, which makes the model not useful. Therefore, the HRV data is necessary to improve the prediction results significantly.
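For reference, the interpretation of R² near zero follows from the standard definition of the coefficient of determination. The notation is introduced here for illustration and is an assumption: y_i denotes the reported RPE of sample i, ŷ_i the predicted RPE, and ȳ the mean of the reported values in the evaluated data.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad
\hat{y}_i = \bar{y} \ \text{for all } i \;\Rightarrow\; R^2 = 1 - 1 = 0 .
```

A negative value, such as the lower bound of −0.05 reported for the IMU-only models, means that the squared prediction error exceeds that of the constant mean prediction.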

Discussion
This paper presents a dataset to predict the subjective perceived exertion, represented as RPE values. It includes data from a homogeneous population performing squats, an exercise involving the hip and knee extensors. We selected the Flywheel to perform the squat exercise. Blood lactate measurements confirmed that muscle fatigue was induced by the protocol. We recorded data using multiple sensor modalities, including IMU, ECG, and MoCap data. This dataset contributes to the goal of RPE prediction as it provides multi-modal data recorded in a controlled lab environment.
Although this dataset offers potential for future studies to detect fatigue, we want to highlight possible limitations of the collected dataset. One limitation is the small sample size. Although we recorded 16 subjects in total, only 12 subjects consented to the publication of their data. Another limitation is that subjects differed in their familiarity with the RPE scale. Not all subjects were familiar with the Borg scale, possibly introducing a bias in the collected RPE values. A further possible limitation is the accuracy of the MoCap data obtained with the Kinect camera. We used Azure Kinect, the latest generation of the Kinect camera at the time of writing. However, its pose estimation lacks accuracy compared to marker-based motion capture systems [28].
In this paper, we presented a preliminary exploratory data analysis of the collected data and an experiment on predicting RPE values using only IMU and HRV data. This experiment showed that predicting RPE values with IMU and HRV is possible, primarily due to the HRV data, especially the TRIMP feature. This implies that an RPE prediction model based on wearable sensors could assist athletes or coaches as a biofeedback system. However, further research is necessary to investigate additional research questions, potentially leading to new practical implications. One example is fatigue prediction using solely the MoCap data by analyzing the posture change during the fatigue protocol. This approach could alleviate the need for wearable systems, thus increasing the athletes' comfort during training. Moreover, marker placement is not necessary with the markerless skeleton tracking of the Azure Kinect camera, which reduces the setup time and effort. Another research question is the combination of all sensor data, i.e., multi-modal prediction of subjective exertion. More advanced methods, such as CNNs or time-series models, including Transformers or RNNs, could improve the prediction accuracy. So far, we have only used conventional machine learning methods with handcrafted statistical features on IMU and HRV data. Incorporating temporal context could further improve prediction accuracy.
Our dataset was collected in a laboratory environment as this study is still early research in RPE prediction. In this controlled setting, we aimed to control as many independent variables as possible by defining a narrow protocol and strictly setting the

Figure 1. The Flywheel training device consists of the platform, the rotating flywheel in the front, and the belt coming out of the center of the platform. For squats, a hip harness is connected to the belt of the platform.

Figure 2. An athlete performing the squat exercise on the Flywheel training machine. The Flywheel is operated without external weights, using only the athlete's invested energy.

Figure 3. Overview of the study setup (a) and the placement of IMU and ECG sensors on the participants (b). The red boxes indicate the IMU sensors and the blue circles indicate the ECG electrodes. Figure (b) also shows an IMU with its three axes.
Figure 4. Two simultaneously captured depth maps from the left and right Azure Kinect cameras showing a subject performing the squat exercise on the Flywheel: (a) depth map from the left camera; (b) depth map from the right camera.

Figure 5. Protocol definition of the entire study, including the pre- and post-test and the fatigue protocol consisting of four series with three sets and twelve repetitions.

Figure 6. Dataset structure. Each subject has its own folder with subfolders for different data modalities.

Figure 7. Skeleton fusion from master and subordinate device. Figure (a) shows the mapping of the subordinate skeleton (red) to the master skeleton (green). In this frame, the knee joint of the master skeleton shows an outlier. Thus, the fused skeleton (blue) puts more weight on the subordinate skeleton. Figure (b) shows a fused trajectory with corrected outliers. The fused trajectory was filtered.

Figure 8. Synchronization of the Physilog IMU sensors and the Azure Kinect camera. The signals were synchronized using the second derivative of the y-axis position of the Pelvis joint from the Azure Kinect.

Figure 9. Analysis of the collected RPE values of the subjects. Figure (a) shows a heatmap of the reported RPE values per subject. Figure (b) presents a histogram of the reported RPE values.

Figure 10. Analysis of the average power of the Flywheel kMeter data for a subject where the average power negatively correlates with the provided RPE values. The black dots represent the average power for each repetition. The red line shows the corresponding RPE values. The local trend within each set is shown using linear regression (background colors alternate for each set). The mean value of each set is indicated as a red cross.

Figure 11. Analysis of the average power of the Flywheel kMeter data for a subject where the average power is not correlated with the reported RPE values. The black dots represent the average power for each repetition. The red line shows the corresponding RPE values. The local trend within each set is shown using linear regression (background colors alternate for each set). The mean value of each set is indicated as a red cross.

Figure 12. Matrix of Pearson correlation coefficients (PCC) between the reported RPE values and all Flywheel kMeter features for the individual subjects.

Table 2. Participant characteristics and information about the weekly training time, before and since the second lockdown in Germany. SD stands for standard deviation.

Table 3. List of Kubios HRV export parameters in the three different domains.

Table 4. Training results of four machine learning models predicting perceived exertion using IMU features alone and a combination of IMU and HRV features.

Table 5. The ten most important features identified by training an SVR model on the IMU and HRV data.