1. Introduction
As the global population continues to age and the rehabilitation needs of disabled groups increase, there has been a notable increase in the utilisation of exoskeleton robots in a variety of settings, including medical rehabilitation and industrial assistance [
1,
2]. Performance evaluation is essential to optimising these devices and their functionality. However, traditional single-source signal evaluation methods have limitations, such as incomplete information and insufficient accuracy, which make it difficult to quantify exoskeleton performance accurately in complex application scenarios. This study proposes an exoskeleton performance evaluation method based on multi-source signal fusion. Integrating bioelectrical, kinematic and kinetic signals with other multidimensional data supports a more comprehensive and accurate quantitative evaluation system. Such a system can make the optimisation of exoskeleton performance more efficient, and it is also important for advancing rehabilitation and medical treatment and for improving the safety and efficiency of industrial production. It is a pivotal step in the intelligent development of exoskeleton technology [
3].
At this stage, exoskeleton performance evaluation indexes principally comprise motion performance indexes, human physiological performance indexes and comfort evaluation indexes. Motion performance indexes cover the exoskeleton’s assistance effect, human movement flexibility and joint range of motion. Human physiological performance indexes principally concern the rate of energy consumption and the extent of muscle fatigue. Comfort indexes principally encompass parameters such as fit, ease of wearing, breathability and heat dissipation.
Subjective questionnaires, including the Likert-type scale [
4] and the NASA Task Load Index (NASA-TLX) [
5], are utilised to obtain subjective evaluation results based on the user’s perception. These are typically utilised for the purpose of comfort assessment in the domain of human–computer interaction. Concurrently, objective measurements are utilised to procure objective data on exoskeleton utilisation and to quantitatively evaluate its effectiveness. For instance, inertial measurement units (IMU) [
6] and pressure sensors [
7] can be employed to obtain objective motion metrics, such as gait parameters (e.g., step length and frequency) [
8], joint angles, acceleration, angular velocity and balance performance [
9]. Furthermore, physiological signal monitoring has become common practice for assessing biomechanical and physiological metrics during exoskeleton use. Electromyography (EMG) is a well-established technique for measuring muscle activity, while other monitored measures include oxygen uptake, heart rate, pulmonary ventilation and blood pressure [
10,
11]. In conclusion, a plethora of studies have utilised a variety of methodologies to evaluate exoskeletons.
Nevertheless, the abovementioned single-indicator evaluation method is one-sided and lacks a unified standard. Consequently, there is an urgent need for a widely applicable, unified method to effectively evaluate lower-limb wearable exoskeletons.
A number of research institutes, both domestic and international, have conducted pertinent studies on the human–computer performance evaluation of lower-limb assisted exoskeletons, with the objective of addressing the issue of single and one-sided means of evaluation [
12,
13,
14,
15]. Wietse evaluated the human–computer performance of knee-assisted exoskeletons based on three indexes: mechanical power, kinematics, and metabolic exertion. However, that study was limited to comparing conditions with and without the exoskeleton [
12]. Christian DiNatali developed a framework for evaluating the performance of lower-limb-assisted exoskeletons. This framework was based on learning and incorporated human–computer interaction considerations. The contention was made that the disparity in motion between an exoskeleton and the human skeleton it assists should not be regarded as a disadvantage. Instead, it should be evaluated and quantified as a system design feature. Nevertheless, the author acknowledged that the analytical tools at his disposal were not yet fully developed [
13]. Carlson developed a framework for evaluating exoskeleton performance. However, this evaluation framework did not account for environmental diversity factors [
14]. In order to evaluate the human–machine efficacy of the knee-assisted exoskeleton that he had developed, Kamran Shamaei analysed the kinematic parameters of the left and right legs. However, evaluation indexes based exclusively on joint angle and joint moment graphs were found to be inadequate for assessing exoskeleton equipment [
15]. In their study, Tamantini et al. evaluated an assistive walking exoskeleton by proposing a multimodal psychophysiological state assessment. This assessment employed a data-driven fuzzy logic approach to estimate four psychophysiological indicators: energy expenditure, fatigue, attention, and stress. The efficacy of the method was validated through an experiment involving ten healthy subjects. Utilising data-driven fuzzy logic technology, the system integrated literature-based correlations between cardiopulmonary function and physiological indicator changes through the implementation of customised fuzzy functions and rules [
16].
Existing research indicates that evaluation methods relying on a single signal type are susceptible to noise interference and fail to capture fully the complex interaction between exoskeletons and the human body. Despite the increasing prevalence of multi-source signal fusion, neither a standardised process for acquiring and processing signals nor a unified assessment framework has yet been firmly established. Moreover, the question of how to extract effective features from large amounts of multi-source data and establish a scientific and reasonable quantitative assessment model remains open. It is therefore necessary to explore more efficient multi-source signal fusion strategies and quantitative assessment methods in order to overcome the current research bottleneck and ensure the accuracy and reliability of exoskeleton effectiveness assessment.
Currently, there are few AI-assisted methods for evaluating exoskeleton performance. Most AI applications are focused on monitoring muscle fatigue or conducting research into classification. This paper describes the integration of AI-assisted muscle fatigue classification into a system for evaluating exoskeleton performance.
The present paper puts forward a new, comprehensive exoskeleton effectiveness evaluation index system and quantitative evaluation standard. The system encompasses four core evaluation dimensions: human metabolism, human–machine synergy, working condition performance and perception of the assistance effect. A comprehensive and objective exoskeleton performance evaluation system is obtained by fusing multiple sensor signals, including sEMG, MMG, IMU, heart rate, and blood oxygen signals. The hierarchical analysis method and expert scoring method are then combined to provide quantitative assessment criteria for each index. The quantitative assessment results for the effectiveness of the exoskeleton are finally obtained. This paper sets out to compare and analyse the hip-knee exoskeleton and the single hip exoskeleton in order to ascertain their respective advantages and disadvantages. This demonstrates that the assessment method overcomes the limitations of traditional assessment modes by transitioning from subjective perception to objective data and from single indices to systematic evaluations. The present paper combines subjective perception with objective data to evaluate the assistive effects of exoskeletons, thus providing a new assessment method for optimising exoskeleton product performance and evaluating application outcomes.
2. Materials and Methods
The proposed exoskeleton performance evaluation metric system is based on multi-source signal fusion, utilising signals collected from various sensors. The sEMG and MMG signals indicate the degree of muscle fatigue, while the IMU signals represent the human motion state. Both the sEMG and MMG signals undergo a series of preprocessing steps, after which MPF feature values are extracted. Subsequently, a muscle fatigue classification model is constructed using the BERT + BP algorithm to obtain muscle fatigue level results. The IMU signals are processed to extract the range of motion about three joint axes. These metrics are combined with subjective evaluation scales to form the exoskeleton performance assessment framework. The workflow for the evaluation of exoskeleton performance is illustrated in
Figure 1.
In order to validate the exoskeleton performance evaluation metric system and quantitative assessment standards proposed in this paper, the methodology was applied to performance evaluation testing of two material-handling assistive exoskeletons. In this study, the same evaluator performs multiple index tests, including a muscle fatigue test, a metabolic index test and a joint angle change test, and completes a subjective evaluation scoring form. The objective of the experiment is to quantitatively assess the efficacy of the two exoskeletons during the squatting exercise. Tests are carried out with and without the exoskeleton for both the single-hip exoskeleton and the hip-knee exoskeleton. The primary distinction between the two devices lies in whether the knee joints are assisted and in where assistance is applied to the thighs. The two exoskeletons are shown physically in
Figure 1.
The single-hip exoskeleton weighs around 8 kg and is designed for people between 165 and 182 cm tall. The single-hip exoskeleton is shown in
Figure 2a,b. It features three motor-driven joints at the shoulders, waist and hips. These joints have a range of motion that is engineered to accommodate normal human activities, including squatting, walking, and running. The hip-knee exoskeleton weighs approximately 8 kg in total and features three motor-driven joints at the waist, hip and knee. The hip-knee exoskeleton is shown in
Figure 2c.
The core differences between the two exoskeleton prototypes lie in three areas: first, whether they incorporate knee joint assistance; second, the position at which the thigh restraint structures provide assistance; and third, whether shoulder joint assistance is included. With these functional requirements met, the two exoskeletons have nearly identical weights, which minimises the influence of their own weight on the subsequent comparative test results.
The experimenter begins by squatting to retrieve a 10 kg weight from the ground, then stands up straight and repeats the squat movement. Throughout, the experimenter must keep the upper body upright to reduce the risk of lumbar injury from the heavy object. The movement is repeated until the experimenter reaches a state of fatigue. The experimenter performs the handling test to a metronome set at 180 beats per minute (BPM), completing one movement every four beats, as shown in
Figure 3.
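The pacing implied by the metronome can be checked with a short calculation. The sketch below assumes the protocol means one movement per four metronome beats; the function name `movement_timing` is hypothetical and introduced only for illustration.

```python
def movement_timing(bpm: float, beats_per_movement: int):
    """Return (seconds per movement, movements per minute) for a metronome-paced task."""
    if bpm <= 0 or beats_per_movement <= 0:
        raise ValueError("bpm and beats_per_movement must be positive")
    seconds_per_beat = 60.0 / bpm
    seconds_per_movement = seconds_per_beat * beats_per_movement
    movements_per_minute = bpm / beats_per_movement
    return seconds_per_movement, movements_per_minute


# Values stated in the protocol: 180 BPM, one movement every four beats (assumed reading).
duration_s, rate_per_min = movement_timing(180, 4)
print(f"{duration_s:.2f} s per movement, {rate_per_min:.0f} movements per minute")
# prints "1.33 s per movement, 45 movements per minute"
```

Under this reading, each movement lasts about 1.33 s, which fixes the movement rhythm across subjects and conditions.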
The experiment was conducted in accordance with the ethical guidelines set out in the Declaration of Helsinki, which was adopted in Helsinki, Finland in June 1964 and revised in Edinburgh, Scotland in October 2000. The experimental procedures were approved by the Ethics Committee of Nanjing Qixia District Hospital (ethical review number 2024-QX029). Prior to participation in the experiment, all subjects were deemed to be in a state of health, with no evidence of disease or damage present in the tested muscles. Before participating in the experiment, all volunteers were informed of the details of the experiment and signed informed consent. The recruitment of subjects in this study started on 20 December 2024 and ended on 31 December 2024.
2.1. Experiments on the Motion Performance of Exoskeletons
The primary objective of the exoskeleton performance experiment is to evaluate the range of joint angle changes that occur prior to and following the subject’s donning of the exoskeleton. Additionally, the experiment seeks to ascertain any restrictions imposed on the subject’s body movement. In this experiment, 3D motion capture was utilised to collect joint motion data from multiple subjects at varying step speeds. Subsequently, the hip and knee joint motion angles were analysed using iSen software (Version 2021). The experiment employs the Spanish iSen inertial 3D motion measurement system, which relies exclusively on wireless communication with the computer. The system is capable of accommodating up to eight sensors within a whole-body configuration, with an acquisition frequency of 200 Hz. The inertial sensing system is shown in
Figure 4.
The IMU sensor is the core data acquisition device, integrating a three-axis accelerometer, a three-axis gyroscope and a three-axis magnetometer. The fusion of multi-sensor data facilitates the real-time capture of three-dimensional alterations in human joint posture. During the testing phase, the researchers adhered to rigorous ergonomic standards. This entailed the deployment of IMU sensors at pivotal nodes of the joint motion chain. This ensured precise alignment of the sensor axes with the anatomical coordinate system of the human body. In order to minimise interference from clothing friction and motion artefacts, medical-grade elastic straps were used for fixation. Prior to the official test, static and dynamic calibrations were carried out in order to verify the accuracy and stability of the sensor data.
The five subjects were healthy young men aged 23–26, with heights ranging from 168 to 183 cm and weights ranging from 60 to 80 kg. Each subject underwent three replicate experiments. The two sets of experimental data with similar test outcomes were retained from the results, while data exhibiting sudden changes or anomalies were excluded. Prior to the commencement of the experiment, the subjects underwent a 72 h period of rest, during which no strenuous exercise was performed. Prior to the commencement of the muscle fatigue experiments, a series of experimental movement training exercises were conducted with the objective of ensuring proficiency in the movements. The IMU sensors were installed in accordance with the configuration depicted in
Figure 4. One sensor was placed in the posterior lumbar position for localisation, and four sensors were placed bilaterally on the thighs and calves to collect IMU signals during exercise. The transmission of all coordinate data was facilitated wirelessly to the host computer via a wireless acquisition box.
2.2. Experiments on the Cost of Human Metabolism
The human metabolic cost experiments comprised surface electromyographic signal (sEMG) and mechanomyography (MMG) fusion muscle fatigue tests, as well as heart rate and blood oxygen tests. The squatting exercise test was performed with and without the exoskeleton for both wearers, and three sensors were used to comprehensively evaluate the exoskeleton’s human physiological performance in the process of human assistance.
As a significant biological signal reflecting the electrophysiological activity of muscles, the sEMG, captured in real time by an array of electrodes attached to the skin surface, is able to detect the weak electrical signals generated during muscle excitation. The signal’s characteristics, such as root mean square (RMS) and mean power frequency (MPF), can provide an intuitive reflection of muscle contraction strength and fatigue state. In contrast, the MMG signal is capable of capturing mechanical vibration information generated during muscle contraction through piezoelectric or acceleration sensors. This signal has been shown to uniquely characterise muscle fibre recruitment patterns, force output stability and other properties. The combination of these two signals provides a comprehensive and multifaceted assessment of muscle condition, encompassing electrophysiological and biomechanical dimensions.
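The two features named above can be sketched as follows. This is a minimal, standard-library illustration rather than the processing pipeline used in the study: the function names are hypothetical, the mean power frequency is taken as the power-weighted average frequency of the one-sided spectrum, and a naive DFT is used only to keep the example self-contained (an FFT routine would normally be used for real recordings).

```python
import cmath
import math


def rms(signal):
    """Root mean square amplitude of a sampled signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mean_power_frequency(signal, fs):
    """Mean power frequency in Hz: sum(f * P(f)) / sum(P(f)) over positive frequencies.

    Uses a naive O(n^2) DFT so that only the standard library is needed.
    """
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]  # remove the DC offset
    weighted_sum = 0.0
    power_sum = 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centred))
        power = abs(coeff) ** 2
        weighted_sum += (k * fs / n) * power
        power_sum += power
    return weighted_sum / power_sum if power_sum > 0 else 0.0


# Synthetic test tones (not measured data): a 100 Hz tone and a 50 Hz tone.
fs = 1000
high = [math.sin(2 * math.pi * 100 * i / fs) for i in range(200)]
low = [math.sin(2 * math.pi * 50 * i / fs) for i in range(200)]
print(mean_power_frequency(high, fs), mean_power_frequency(low, fs), rms(low))
```

In sEMG analysis, a downward shift of the MPF over the course of a sustained task is commonly read as a sign of developing fatigue, which is why the lower-frequency tone stands in for the "fatigued" case here.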
In this experiment, sEMG signals were acquired using a Cometa wireless EMG device (a wireless sEMG acquisition device manufactured by Cometa in Bareggio, MI, Italy). The device has a sensing delay of less than 500 µs, 16 sensors and a maximum sampling rate of 4000 Hz; to prevent signal loss, a sampling rate of 2000 Hz was used in this experiment.
The MMG signals were acquired with the iSen 3D inertial kinematic measurement system, which is manufactured in Spain and communicates wirelessly with the computer. The MMG signal reflects the gross muscle changes caused by active or stimulated motor units during brief phases of dynamic or constant-length contraction of the muscle-tendon unit. A posture sensor detects the oscillations of the muscle surface, and the resulting vibration constitutes the MMG signal [
17,
18].
Before the experiment, the data acquisition protocol must be defined, including the acquisition sites, sampling frequency and placement requirements. For the hand-held weight squatting action designed for this experiment, muscles were selected that show greater activation during the movement and whose fatigue state changes more readily. Using the MMG signal acquisition system designed in this paper, five muscles strongly affected by the action were selected: the rectus femoris, the vastus medialis, the biceps femoris, the vastus lateralis and the tibialis anterior. These muscles are categorised into four groups: the quadriceps muscle group, the posterior thigh muscle group, the back muscle group and the calf muscle group [
19]. The specific information is shown in
Table 1.
The subjects comprised young men aged between 23 and 26 years who were healthy and between 168 and 183 cm tall and between 60 and 80 kg in weight. Each subject underwent three replicate experiments. The two sets of experimental data with similar test outcomes were retained from the results, while data exhibiting sudden changes or anomalies were excluded. It is noteworthy that the participants had not engaged in strenuous exercise for a period of 72 h prior to the commencement of the test. Prior to the muscle fatigue experiment, training in the test movements was conducted to ensure proficiency. The ambient temperature of the test environment was set at 24 °C, as the generation of EMG signals is affected by human sweat. The subjects were required to perform a five-minute warm-up period to prevent muscle spasms during the experiment. Subsequently, the patch electrodes and the MMG signal sensor were attached to the elevated portion of the muscle belly of the target muscle. The placement of the patch electrodes is illustrated in
Figure 5. The MMG acquisition sensor, as installed after the exoskeleton robot is fully donned, is shown in
Figure 6.
Since two signals need to be measured at the same time, the sensors are attached to opposite sides of the body to make the experiment easier to carry out. Studies have shown that the muscles on both sides of the body exhibit similar patterns of change. After signal processing and normalisation, it is possible to approximate the signals from both sides as originating from the same side for further analysis.
The human metabolic cost experiment measures the wearer’s blood oxygen and heart rate at the beginning and end of the experiment. The instrument used is a finger-clip pulse oximeter manufactured by Shanghai Haier Medical Technology Co., a company based in Shanghai, China. The physical instrument and the measurement process are shown in
Figure 7. The pulse oximeter works on the principle of photoplethysmography (photoelectric volumetric pulse wave tracing). It emits red (660 nm) and infrared (940 nm) light onto the skin surface and exploits the differential absorption of haemoglobin at the two wavelengths. It collects the transmitted or reflected light and calculates blood oxygen saturation in real time through photoelectric conversion and signal processing. Concurrently, the heart rate is calculated from changes in the time interval of the pulse wave. The device is portable, easy to operate and non-invasive, and it enables continuous and stable acquisition of physiological parameters during exercise. The experimental test diagram of the sensor is shown in
Figure 8.
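The interval-based heart rate calculation described for the pulse oximeter can be illustrated with a minimal sketch. It assumes that pulse-wave peaks have already been detected and time-stamped; the function name `heart_rate_bpm` and the peak times are hypothetical, and the device’s internal algorithm is not documented here.

```python
def heart_rate_bpm(peak_times_s):
    """Estimate heart rate (beats per minute) from pulse-wave peak timestamps in seconds."""
    if len(peak_times_s) < 2:
        raise ValueError("need at least two pulse peaks")
    intervals = [b - a for a, b in zip(peak_times_s, peak_times_s[1:])]
    if any(interval <= 0 for interval in intervals):
        raise ValueError("peak times must be strictly increasing")
    mean_interval = sum(intervals) / len(intervals)
    return 60.0 / mean_interval


# Hypothetical peak times in seconds (not measured data).
peaks = [0.00, 0.80, 1.62, 2.40, 3.20]
print(round(heart_rate_bpm(peaks), 1))
```

Averaging several intervals rather than using a single one reduces the effect of beat-to-beat variability on the estimate.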
3. Theory
This paper reports a study in which the squatting exercise test was conducted with and without an exoskeleton. The effectiveness of the exoskeleton was evaluated through changes in joint angles, muscle fatigue, heart rate and blood oxygen levels, together with a subjective evaluation. Each sensor-derived value was assessed quantitatively, and the hierarchical analysis method (analytic hierarchy process, AHP) was employed to arrange the evaluation indexes into hierarchical levels. Expert scoring determined the weight of each assessment index, and the exoskeleton’s effectiveness was then assessed quantitatively.
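The weighting step can be illustrated with a short sketch, assuming that the hierarchical analysis method is the standard analytic hierarchy process (AHP) and that expert scores are expressed as a pairwise comparison matrix. The row geometric-mean approximation is used here for brevity; the function names, the matrix and the index scores are all hypothetical and are not the values used in this study.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    if any(len(row) != n for row in pairwise):
        raise ValueError("pairwise comparison matrix must be square")
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def overall_score(scores, weights):
    """Weighted sum of per-index scores (each score assumed to lie in [0, 1])."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(s * w for s, w in zip(scores, weights))


# Hypothetical expert judgements for three indexes:
# index A is judged twice as important as B and four times as important as C.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)  # approximately [0.571, 0.286, 0.143]
print(overall_score([1.0, 0.6, 0.8], weights))
```

For a perfectly consistent matrix such as this one, the geometric-mean weights coincide with the principal-eigenvector weights; for real expert judgements a consistency check would normally accompany the calculation.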
3.1. Quantitative Assessment of Joint Angle Changes
In the domain of human–machine cooperative rehabilitation and assistance, the precise quantification of the effect of exoskeleton devices on human motor function enhancement constitutes the fundamental concern of scientific research and engineering application. In this paper, we methodically examine alterations in joint angles during movement by the wearer, employing inertial measurement unit (IMU) sensors. Subsequent incorporation of these modifications into the quantitative evaluation system for exoskeleton performance facilitates the construction of a scientific, repeatable evaluation index.
In research on human movement biomechanics or the evaluation of exoskeleton assistive devices, joint angle data must be analysed systematically. First, joint angle data in the three basic movement directions (flexion and extension, adduction and abduction, and internal and external rotation) are acquired in real time using IMUs, motion capture systems and other sensor devices, in order to determine the dynamic range of each joint angle. The primary purpose of this analysis is to verify that the exoskeleton’s joint movement characteristics closely match human physiological motion. This involves ensuring that the range of joint motion covers the angles required for daily activities or work tasks, such as the hip, knee and waist angles needed for walking, bending over and squatting. It also involves assessing whether motion transitions are smooth and flexible enough to adapt to varying movement rhythms. Whether the wearer is walking slowly or turning rapidly, the exoskeleton must synchronise with posture changes without causing lag or resistance, ensuring natural, unburdened movement.
In Gabriel’s study, it was found that the range of error in joint angle change during each stabilised walk was different for normal people, with joint angle errors being within ±4% without the assistive device and ±2.63% with it [
20]. Although these errors in joint angle change did not exceed 5%, joint angle change still differed between the two conditions. In this study, the difference in joint angle change between tests with and without the exoskeleton was used as a kinematic measure. Because the subjects’ movement state itself varies between the two conditions, a tolerance of ±10% in joint angle change was deemed acceptable for this measure. The quantitative evaluation followed fixed criteria: joint angle changes were compared between trials with and without the exoskeleton. If the joint angle changed by no more than 10% between the two conditions, the joint was deemed to be moving normally and was assigned a quantitative score of 1. If the change exceeded 10%, the score was calculated as ‘1.1 minus the percentage of change before and after’, with the percentage expressed as a decimal fraction.
In order to provide a comprehensive reflection of the overall motion performance of the joints, the quantitative assessment results for each joint in three directions of motion are summarised. Subsequently, the comprehensive scores for all joints are calculated using either a weighted or arithmetic average. This process ultimately yields precise quantitative assessment scores for joint angles. These scores provide a scientific basis for the optimisation of exoskeleton devices and the formulation of rehabilitation training programmes.
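The scoring rule described above can be sketched as follows. This is an illustrative reading, not the authors’ implementation: it assumes the "percentage of change" is the absolute change in range of motion relative to the value without the exoskeleton, expressed as a fraction, and it uses the arithmetic average across joint axes. The function names and the range-of-motion values are hypothetical.

```python
def joint_angle_score(rom_without: float, rom_with: float, tolerance: float = 0.10) -> float:
    """Score one joint axis from its range of motion (ROM) without and with the exoskeleton."""
    if rom_without <= 0:
        raise ValueError("baseline ROM must be positive")
    change = abs(rom_with - rom_without) / rom_without  # fractional change (assumed definition)
    if change <= tolerance:
        return 1.0
    # "1.1 minus the percentage of change" when tolerance is 0.10
    return 1.0 + tolerance - change


def overall_joint_score(rom_pairs):
    """Arithmetic mean of the per-axis scores; a weighted mean could be used instead."""
    scores = [joint_angle_score(without, with_exo) for without, with_exo in rom_pairs]
    return sum(scores) / len(scores)


# Hypothetical ROM values in degrees as (without, with) pairs, not measured data.
pairs = [(100.0, 95.0), (120.0, 96.0), (30.0, 30.0)]
print(overall_joint_score(pairs))
```

With this rule the score is continuous at the 10% threshold (1.1 − 0.1 = 1). The text does not state how very large changes are handled, so the sketch leaves the score unclamped.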
3.2. Quantitative Assessment of Muscle Fatigue
In the performance evaluation system of the human–machine fusion exoskeleton, the precise quantification of muscle fatigue is pivotal in measuring the device’s assistive effect and the human body’s adaptability. The conventional single-sensor evaluation method is subject to certain limitations, including an inadequate information dimension and a vulnerability to interference. In this paper, a novel approach is adopted, incorporating sEMG and MMG signal fusion technology to develop a multimodal, data-driven muscle fatigue dynamic monitoring model. This model incorporates changes in muscle fatigue into the core index system for the quantitative evaluation of exoskeleton performance.
In this paper, the two signals are first preprocessed by filtering and smoothing. Feature values are then extracted from each signal, specifically the MPF feature values. Subsequently, the two signals are fused, and the muscle fatigue classification result for each test is obtained from the muscle fatigue classification model, which is based on the BP + BERT algorithm [
21].
In a preceding study, VT-Lowe’s exoskeleton was utilised to assess the maximal changes in twelve muscles prior to and following weightlifting tests. The results of the study demonstrated a significant reduction in back muscle peaks, with a decrease of approximately 30%. In contrast, the reduction in leg muscle peaks was approximately 15% [
22]. In Bosch’s study, which involved 18 participants performing a simulated assembly task, muscle activity decreased by 35–38% when wearing the exoskeleton [
23]. Looze’s review study analysed 40 articles on 26 different exoskeletons and found a 10–40% reduction in back muscle activity during dynamic lifting and static testing [
24]. Consequently, a 30% decrease in muscle activity was utilised as the pivotal metric in this paper’s quantitative assessment process.
In the present study, the muscle fatigue level of five test subjects was measured before and after exercise. The fatigue level values for each state were summed, and the mean of these sums formed the overall fatigue index. The scoring criteria are as follows: if the overall fatigue index is reduced by 30% or more when wearing the exoskeleton compared with not wearing it, the exoskeleton is regarded as most effective at relieving muscle fatigue and receives a quantitative score of 1. If the decrease is less than 30%, the muscle fatigue score of each exoskeleton device is calculated as “percentage decrease divided by 0.3”, with the percentage expressed as a decimal fraction.
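The scoring criterion can be written compactly as follows. The symbols are introduced here for illustration only: F_without and F_with denote the overall fatigue index without and with the exoskeleton, r is the relative decrease, and S_fatigue is the resulting score.

```latex
r = \frac{F_{\text{without}} - F_{\text{with}}}{F_{\text{without}}},
\qquad
S_{\text{fatigue}} =
\begin{cases}
1, & r \ge 0.3, \\[4pt]
\dfrac{r}{0.3}, & 0 \le r < 0.3 .
\end{cases}
```

As a hypothetical example, an index of 10 without the exoskeleton and 8.2 with it gives r = 0.18 and a score of 0.18/0.3 = 0.6. The case r < 0, in which fatigue increases with the exoskeleton, is not covered by the stated criteria.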
This quantitative assessment method intuitively reflects the degree to which the exoskeleton improves muscle fatigue. It provides an objective, quantitative scientific basis for optimising exoskeleton design, adjusting exercise training intensity and formulating rehabilitation treatment plans. This contributes to enhancing the practicality and effectiveness of exoskeleton devices.
3.3. Quantitative Assessment of Heart Rate and Oxygen Saturation
This section evaluates the impact of exoskeleton devices on human exercise metabolism through heart rate and oxygen saturation. These are pivotal physiological indicators that reflect cardiopulmonary function and the balance of oxygen supply and demand in the body, and their dynamic changes offer a significant foundation for quantifying exoskeleton effectiveness. Utilising pulse oximetry, a non-invasive monitoring technology, this paper systematically explores how heart rate and oxygen saturation change before and after squatting exercises. Subsequently, these physiological indicators are incorporated into a quantitative assessment system for the performance of exoskeletons. This provides physiological support for the optimisation of device performance and sports rehabilitation applications.
Considerable research has been conducted on metabolic cost assessment. In treadmill trials, ankle exoskeletons resulted in an 11 ± 4% reduction in the net metabolic cost of walking for test subjects, compared with walking without them [
25]. Furthermore, Zhang’s study determined that utilising an unpowered hip exoskeleton resulted in a 9.14% reduction in metabolic cost for test subjects when wearing the exoskeleton, in comparison to those not wearing one [
26]. These findings indicate that both single-joint and unpowered exoskeletons reduced the metabolic cost of exercise by roughly 10%. Patrick’s study found that a personalised exoskeleton reduced the metabolic cost of walking by 17 ± 5% over one hour and running by 23 ± 8% [
27]. In summary, the present study considered a 30% reduction to be a criterion for the quantitative assessment of heart rate and oxygen saturation.
The quantitative assessment followed a standardised procedure. The heart rate and oxygen saturation data obtained from each trial were averaged. Physiological data measured without the exoskeleton served as the reference for comparing the changes in physiological values when the exoskeleton was used. If the mean heart rate and blood oxygen levels declined by 30% or more in comparison to the initial values, the exoskeleton was judged to have a substantial effect on the physiological burden imposed on the human body, corresponding to a quantitative score of 1. If the decline was less than 30%, the calculation model “Percentage of Decrease/0.3” was employed to derive a quantitative score for each exoskeleton device in regulating heart rate and blood oxygen.
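The procedure above, averaging the trials and then applying the “Percentage of Decrease/0.3” rule, can be sketched as follows. The function names and the heart rate values are hypothetical, and clamping the score at 0 when the indicator increases is an assumption, since the text does not cover that case.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def reduction_score(reference_trials, exo_trials, target: float = 0.30) -> float:
    """Score = min(1, fractional decrease / target), following 'Percentage of Decrease/0.3'."""
    reference = mean(reference_trials)
    with_exo = mean(exo_trials)
    if reference <= 0:
        raise ValueError("reference mean must be positive")
    decrease = (reference - with_exo) / reference
    # Assumption: an increase (negative decrease) scores 0; the text does not specify this case.
    return max(0.0, min(1.0, decrease / target))


# Hypothetical end-of-test heart rates (beats per minute) from two retained trials.
hr_without = [150.0, 154.0]
hr_with = [132.0, 136.0]
print(round(reduction_score(hr_without, hr_with), 3))
```

The same function can be applied to the oxygen saturation readings, giving one score per indicator that can then enter the weighted evaluation.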
This quantitative assessment system transforms changes in physiological indicators into quantifiable scores. It can therefore be used to assess the extent to which exoskeleton devices affect the human circulatory system and to compare the physiological regulation efficacy of different exoskeleton products. It also provides data to support the ergonomic optimisation of exoskeleton equipment, the improvement of wearing comfort, the formulation of clinical rehabilitation programmes, and the precise application and iterative upgrading of exoskeleton technology in medicine, healthcare and industry.
3.4. Quantitative Assessment of Subjective Indicators
While objective physiological and kinematic data provide a scientific quantitative basis for evaluating the performance of exoskeleton devices, the wearer’s subjective feelings, as direct feedback on the human–computer interaction experience, play an irreplaceable role in the evaluation system. The present paper sets out a systematic subjective evaluation system that extends exoskeleton performance evaluation from the user experience perspective by assigning a quantitative score to a number of key indexes. This provides a significant reference point for optimising device comfort and enhancing functionality.
Subjective evaluations are widely used in existing research. In Cai’s study, for instance, subjects completed the System Usability Scale (SUS), a questionnaire designed to collect subjective evaluations and recommendations about the device from the wearer [28]. In Resquin’s study, subjects completed subjective tests of overall satisfaction and emotional reaction, which serve as subjective evaluations of the assistive device [29]. Hussain’s study demonstrated that there is a paucity of specific usability scales for exoskeletons at this stage. It was determined through experimental analysis that a usability assessment questionnaire should encompass four primary factors: mobility, adjustability, manoeuvrability, and safety [30].
The subjective evaluation system was constructed on the basis of in-depth interviews and a comprehensive review of the relevant literature. In conjunction with the application scenarios and design objectives of the exoskeleton device, four core evaluation dimensions were identified: movement stability, ease of operation, perception of the assistance effect, and fatigue tolerance. Each dimension was then broken down into specific indicators. For instance, ‘ease of operation’ encompasses indicators such as ‘ease of putting on’, ‘ease of taking off’, ‘ease of adjusting’, ‘ease of picking up’ and ‘ease of putting down’, among others. To ensure the objectivity and comparability of the evaluation results, the internationally recognised 5-point Likert scale was used, where 1 point represents ‘extremely unsatisfactory/weak’ and 5 points represent ‘very satisfactory/strong’. Subjects rated each indicator independently, based on their actual experience, after completing the prescribed exercise tasks.
With regard to the implementation process, the researchers gave the wearer a comprehensive explanation of the evaluation indexes and the scoring criteria prior to the test, and conducted simulated scoring exercises to ensure a consistent understanding. The subject then completed the subjective evaluation questionnaire immediately after the weighted squat test while wearing the exoskeleton device. To mitigate the impact of memory bias on the outcomes, the questionnaire was completed and collected on-site immediately following the test. During data processing, descriptive statistics were first used to analyse the scores of each index, and the mean value was calculated to visualise the distribution of the wearer’s subjective evaluation. The indicator scores within each of the four core evaluation dimensions were averaged and then divided by five, normalising each dimension to a maximum of 1.
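The normalisation step can be sketched as follows. This is an illustrative example only: the function name is hypothetical, the ratings shown are invented placeholders rather than study data, and pooling all ratings of a dimension into one list before averaging is an assumption.

```python
from statistics import mean

LIKERT_MAX = 5  # top of the 5-point Likert scale used in the questionnaire


def normalise_dimensions(ratings_by_dimension):
    """Average the Likert ratings of each dimension and divide by five."""
    normalised = {}
    for dimension, ratings in ratings_by_dimension.items():
        if not ratings:
            raise ValueError(f"no ratings supplied for {dimension!r}")
        if any(r < 1 or r > LIKERT_MAX for r in ratings):
            raise ValueError(f"ratings for {dimension!r} must lie between 1 and 5")
        normalised[dimension] = mean(ratings) / LIKERT_MAX
    return normalised


# Hypothetical ratings for the four core dimensions (placeholders, not study data).
example_ratings = {
    "movement stability": [4, 5, 4],
    "ease of operation": [3, 4, 3, 4, 4],
    "perception of the assistance effect": [4, 4, 5],
    "fatigue tolerance": [3, 3, 4],
}
normalised_scores = normalise_dimensions(example_ratings)
```

Dividing by the scale maximum places every dimension between 0.2 and 1, so the subjective scores share an upper bound of 1 with the objective indicator scores.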
The integration of subjective indicators within the quantitative assessment framework for exoskeleton effectiveness serves to address the limitations of objective data in capturing human–computer interaction experience, thereby facilitating the identification of potential design issues from the user’s viewpoint. Subsequent research will synthesise other evaluation indexes with a view to further deepening the quantitative analysis of subjective evaluation and promoting the iterative upgrading of exoskeleton devices in terms of user experience.
3.5. A Quantitative Assessment Model for Integrated Effectiveness
The present paper sets out a quantitative assessment of exoskeleton performance that combines the analytic hierarchy process with the expert scoring method. Firstly, the analytic hierarchy process is used to decompose exoskeleton performance evaluation into a multi-level index system covering aspects such as power performance, wearing comfort and system stability. Subsequently, experts in mechanical engineering, ergonomics and other relevant fields are invited to evaluate the relative importance of the indicators using the expert scoring method, construct a judgement matrix and determine the indicator weights through mathematical computation. Concurrently, the performance of the exoskeleton on each specific index is quantitatively evaluated. Ultimately, a comprehensive, objective and quantitative evaluation of the exoskeleton’s performance is obtained by combining the weights with the scores for each index.
Exoskeleton devices are complex systems that integrate mechanical engineering, biomedicine and other disciplines. Consequently, their performance evaluation must take into account technical parameters, human–computer interaction and other multidimensional elements. It is challenging for a single assessment method to accurately reflect the device’s overall performance. The present paper therefore integrates the analytic hierarchy process (AHP) with the expert scoring method to construct a systematic, structured, quantitative assessment model of comprehensive performance. This provides a scientific basis for decision-making in the research, development, optimisation and application of exoskeleton devices.
In the preliminary phase of model construction, the exoskeleton performance evaluation system is decomposed into three layers (target, criterion and indicator) using the AHP method, based on existing research results and industry standards. The target layer is the comprehensive performance of exoskeleton devices. The criterion layer is subdivided into four core dimensions: human–machine synergy; human metabolism; working condition performance; and assisting effect. The indicator layer expands each criterion into specific indicators (e.g., human metabolism comprises muscle fatigue, heart rate, blood oxygen and subjective fatigue tolerance), forming a complete assessment framework comprising eight specific indicators.
To guarantee the scientific rigour and authority of the evaluation weights, the research team adopted strict criteria for selecting experts. A total of 22 senior specialists in mechanical engineering, ergonomics, biomedical engineering and related fields were invited to form the evaluation team. These experts were required to have extensive experience in the development of exoskeleton devices, a robust theoretical knowledge base, and substantial academic influence in related domains. In light of their varying specialisms, the 22 experts were divided into four categories. Each category is assigned its own expert scoring coefficient, as shown in Table 2.
The eight indicators were evaluated by the designated expert group. To ensure the objectivity and comparability of the evaluation results, the internationally recognised 5-point Likert scale was used, where 1 represents ‘extremely unimportant’ and 5 represents ‘very important’. The experts scored the indicators independently based on their theoretical knowledge. The scores allocated by each expert were multiplied by that expert’s scoring coefficient, then summed and normalised to obtain the weighting for each indicator. Table 3 shows the calculation of the specific weight coefficient for each indicator.
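The weighting step described above can be sketched in Python. This is a minimal illustration under stated assumptions: the function name and the sample numbers are hypothetical, and normalising so that the weights sum to 1 is an assumed reading of "normalised", since the exact procedure is given only in the tables.

```python
def indicator_weights(scores, coefficients):
    """Turn expert importance scores into indicator weights.

    scores[j][i]    -- Likert importance score given by expert j to indicator i
    coefficients[j] -- scoring coefficient of expert j (set by expert category)
    """
    if not scores or len(scores) != len(coefficients):
        raise ValueError("need one coefficient per expert")
    n_indicators = len(scores[0])

    totals = [0.0] * n_indicators
    for row, coefficient in zip(scores, coefficients):
        if len(row) != n_indicators:
            raise ValueError("every expert must score every indicator")
        for i, score in enumerate(row):
            totals[i] += coefficient * score  # coefficient-weighted sum per indicator

    grand_total = sum(totals)
    # Assumption: weights are normalised to sum to 1.
    return [total / grand_total for total in totals]


# Two hypothetical experts scoring two indicators (placeholders, not study data).
weights = indicator_weights([[5, 3], [4, 4]], [1.0, 0.5])
```

In this toy case the weighted totals are 7 and 5, so the weights are 7/12 and 5/12.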
In the context of expert scoring methods, consistency is the extent to which multiple experts’ judgements or scores for a given set of evaluation indicators (or proposals) agree and converge. Highly consistent results indicate coordination within the expert group and lend credibility to the conclusions. Conversely, low consistency necessitates re-evaluation or adjustment. Where multiple experts evaluate or rank a series of subjects, the consistency of the group’s opinions must therefore be assessed, and Kendall’s coefficient of concordance is a widely used metric for this purpose. In the present study, 22 experts individually scored eight indicators, so Kendall’s coefficient of concordance was used to assess the consistency among the experts.
Since this study involves scoring individual indicators, the scores must first be converted into rankings. For each expert, the eight indicators are ranked according to that expert’s scores. If scores are identical, the average rank is taken.
First, calculate the total rank sum for each indicator by summing all ranks assigned to it, yielding the total rank sum $R_i$ for each indicator across all 22 experts. Then compute the sum of squared deviations ($S$) of the total rank sums:

$$S = \sum_{i=1}^{n} \left(R_i - \bar{R}\right)^2$$

where $\bar{R}$ is the average of all $R_i$.

Calculate Kendall’s coefficient of concordance $W$:

$$W = \frac{12S}{m^2\left(n^3 - n\right) - m\sum T}$$

where $S$ denotes the sum of squared deviations, $m$ represents the number of experts, $n$ indicates the number of objects and $\sum T$ is the correction factor related to the number of identical rank groups assigned by each expert:

$$T = \sum \left(k^3 - k\right)$$

where $k$ represents the number of indicators within each group of identical ranks. If an expert has multiple groups of identical ranks, each group is calculated separately before summing.
Applying the above formula yields a final value of W = 0.585. A significance test is then performed on this result.
The results indicate that Kendall’s coefficient of concordance of W = 0.585 signifies a moderate to strong level of agreement in the experts’ ranking of the eight indicators. The chi-square test confirms that the observed level of agreement among the experts is statistically highly significant and highly unlikely to be random. Given this level of consistency, the study can calculate the average score for each indicator. These scores can then be used for subsequent weight determination or decision analysis.
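The consistency check described above (converting scores to ranks, summing ranks per indicator, and applying the tie-corrected formula for W) can be sketched in plain Python. The function names are hypothetical, and the chi-square statistic uses the standard large-sample approximation χ² = m(n − 1)W with n − 1 degrees of freedom, which is assumed here to be the test the text refers to.

```python
from collections import Counter


def average_ranks(scores):
    """Rank one expert's scores (1 = lowest); tied scores share their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for idx in order[pos:end + 1]:
            ranks[idx] = shared
        pos = end + 1
    return ranks


def tie_correction(scores):
    """T = sum of (k^3 - k) over each group of k identical scores for one expert."""
    return sum(k ** 3 - k for k in Counter(scores).values())


def kendalls_w(score_matrix):
    """Return (W, chi2) for a matrix with one row of indicator scores per expert."""
    m = len(score_matrix)      # number of experts
    n = len(score_matrix[0])   # number of indicators
    rank_rows = [average_ranks(row) for row in score_matrix]
    rank_sums = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_rank_sum = sum(rank_sums) / n
    s = sum((r - mean_rank_sum) ** 2 for r in rank_sums)
    t_total = sum(tie_correction(row) for row in score_matrix)
    w = 12 * s / (m ** 2 * (n ** 3 - n) - m * t_total)
    chi2 = m * (n - 1) * w     # compared against chi-square with n - 1 d.o.f.
    return w, chi2
```

Ranking from lowest to highest rather than highest to lowest does not change W. The formula is undefined if every expert gives all indicators the same score, because the denominator becomes zero. With m = 22, n = 8 and the reported W = 0.585, the approximation gives χ² = 22 × 7 × 0.585 ≈ 90.09 on 7 degrees of freedom, well above the 0.001 critical value of about 24.32.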
Finally, each indicator score is multiplied by its weight, the products are summed, and the result is divided by the sum of the weights. This normalisation yields the exoskeleton device’s comprehensive effectiveness score. Quantifying the indicator weights balances differences in the importance of the various assessment dimensions, thus avoiding one-sided subjective judgements and overcoming the limitations of single objective data. For instance, when evaluating a lower limb exoskeleton device, the model can identify deficiencies in human–machine coordination. This helps the R&D team optimise the control algorithms, thereby enhancing the device’s overall performance. Subsequent research will enhance the evaluation index system by integrating it with machine learning algorithms to dynamically optimise weight allocation, thereby improving the accuracy and universality of the evaluation model.
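The aggregation step is a weighted mean, sketched below. The function name and the sample values are hypothetical and do not reproduce the weights in Table 3.

```python
def comprehensive_score(indicator_scores, weights):
    """Weighted mean of indicator scores: sum(w_i * s_i) / sum(w_i)."""
    if not indicator_scores or len(indicator_scores) != len(weights):
        raise ValueError("need exactly one weight per indicator score")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(w * s for w, s in zip(weights, indicator_scores))
    return weighted_sum / total_weight


# Two hypothetical indicators: scores 1.0 and 0.5 with weights 3 and 1.
example_score = comprehensive_score([1.0, 0.5], [3, 1])
```

Dividing by the sum of the weights means the weights need not already sum to 1; here (3 × 1.0 + 1 × 0.5) / 4 = 0.875.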
5. Discussion
In comparison with existing studies on single exoskeleton evaluation indexes [22,23,24,25,26,27,28,29,30], the present study contains a more comprehensive set of indexes and can obtain key performance indexes relating to human metabolism, human–machine synergism, working conditions and the perception of the exoskeleton’s assisting effect under different working conditions. In comparison with existing comprehensive performance evaluation studies of exoskeletons [12,13,14,15,31,32,33], the present study encompasses a more extensive array of indicators and a broader spectrum of testing surfaces. Furthermore, the study puts forward a range of quantitative evaluation methods for each indicator, derived from existing indicators. This provides a more comprehensive and nuanced set of quantitative evaluation methods for exoskeleton performance.
The final scores for the hip and knee exoskeleton and the single hip exoskeleton are 0.625 and 0.845, respectively. The final scores indicate that the single hip exoskeleton performs better overall than the hip and knee exoskeleton. However, the two exoskeletons have distinct strengths and weaknesses on individual indicators. Consequently, specific modifications can be made based on the quantitative evaluation results for these indicators.
For the hip and knee exoskeleton, all four specific indicators under the human metabolism criterion were found to be suboptimal, with scores of 0.399, 0.816, 0.390 and 0. This finding suggests that there are issues with the exoskeleton’s structural design, as it does not fit the human body well and does not achieve the expected assisting effect. The mechanical structure of the joint assist system may require subsequent modifications, as the current exoskeleton design does not align well with human movement patterns. However, the exoskeleton received high scores on the four indicators covering human–machine synergy, working condition performance and perception of the assisting effect: 0.756, 0.92, 0.8 and 0.84, respectively. This finding indicates that the exoskeleton is well designed for human–computer interaction and has the potential to enhance the user experience for testers.
The single hip exoskeleton demonstrates efficacy across six indicators, distributed across three domains: human metabolism, working conditions, and the perception of the assisting effect. The specific scores are 1.241, 0.896, 1.350, 0, 0.827 and 0.9, respectively, indicating good performance in structural design, function fulfilment and so on. However, it performs less well on the two human–machine synergy indicators, with scores of 0.667 and 0.78. These two indicators suggest deficiencies in the exoskeleton’s human–computer interaction design, including the cumbersome length adjustment process and the discomfort experienced when the device comes into contact with the human body. Subsequent iterations of the exoskeleton can be updated based on the specific scores of the indicators in this performance evaluation.
6. Conclusions
The present study conducted squatting tests with subjects wearing two different exoskeletons and wearing no exoskeleton. The tests synthesised four aspects: human metabolism, human–machine synergism, working condition performance, and the perception of the assisting effect. A total of eight indexes, four objective and four subjective, were used to conduct a comprehensive quantitative assessment of the efficacy of the two exoskeletons. The objective of the study was to develop a quantitative assessment tool for exoskeletons, select the superior exoskeleton based on comprehensive scores and provide specific recommendations for updating and improving the tested exoskeletons based on the score of each metric.
The results indicate that the hip and knee exoskeleton attained a final score of 0.625, while the single hip exoskeleton achieved a final score of 0.845. In comparison with single-hip exoskeletons, lumbar-hip exoskeletons have been shown to offer a superior level of comfort to the user. Nevertheless, the paucity of data in this study precludes a detailed comparative analysis of the two exoskeletons, and specific modification suggestions may require further testing. However, the findings from preliminary testing indicate that this method can effectively quantify key performance metrics of exoskeletons under various working conditions, including human metabolic status, human–machine coordination, task performance, and perceived assist effectiveness. In comparison with single-signal evaluation methods, this approach has been shown to enhance assessment accuracy and reliability to a significant degree. The system provides robust technical support and a decision-making framework for the optimisation of exoskeleton performance and personalised adaptation, thereby propelling exoskeleton technology towards greater efficiency and intelligence.