Method and Test Course for the Evaluation of Industrial Exoskeletons

Abstract: In recent years, the trend for implementing exoskeletons in industrial workplaces has significantly increased. A variety of systems have been developed to support different tasks, body parts, and movements. As no standardized procedure for evaluating industrial exoskeletons is currently available, conducted laboratory and field tests with different setups and methodologies aim to provide evidence of, e.g., the support for selected isolated activities. Accordingly, a comparison between exoskeletons and their workplace applicability proves to be challenging. In order to address this issue, this paper presents a generic method and modular test course for evaluating industrial exoskeletons: First, the seven-phase model proposes steps for the comprehensive evaluation of exoskeletons. Second, the test course comprises a quick check of the system’s operational requirements as well as workstations for an application-related evaluation of exoskeletons’ (short-term) effects. Due to the vastness and heterogeneity of possible application scenarios, the test course offers a pool of modular configurable stations or tasks, and thus enables a guided self-evaluation for different protagonists. Finally, several exemplary exoskeletons supporting varying body regions passed the test course to evaluate and reflect its representativity and suitability as well as to derive discernible trends regarding the applicability and effectiveness of exoskeleton types.


Introduction
The daily use of exoskeletons is attracting increasing interest in industrial environments. As a human-centered approach, exoskeletons provide physical support to the workforce, and thus may prove successful in preventing work-related musculoskeletal disorders (WMSD) in the long term [1,2]. In industrial applications, WMSD are mainly caused by demanding working conditions such as strenuous and repetitive movements, or awkward working postures, occurring in, e.g., assembly and logistics tasks and potentially leading to absenteeism, presenteeism, or a reduction in quality of life among the workforce [3].
Recently, the number of commercially available exoskeletons for industrial applications in production and logistics has risen sharply [4]. The systems support different body parts such as the upper extremities, trunk, or lower limbs and feature various technical properties, morphologies, and kinds of support [2,5]. Thus, potential users of industrial exoskeletons face the difficult decision of selecting the most appropriate system [6,7], as necessary information about exoskeletons is either inconsistently labeled or lacking altogether. For instance, this concerns various characterizations of the system's support, clear application guidelines (e.g., regarding wearing time, risk assessment, hygiene, maintenance), or specifications of technical characteristics (e.g., regarding actuators, force curves, operating times) and operational requirements (e.g., regarding movability, compatibility with personal protective or working equipment). Additionally, study results depend on the respective study setup (e.g., selection of the system's power level, sample's characteristics, or selected tasks with their properties) [8] and should thus only be viewed in the context of each investigation [9]. Besides, the evaluation methodologies for industrial exoskeletons are not standardized [10] and often analyze limited constructs or items with different testing procedures and methods applied to less representative samples [11]. Focused tasks in evaluation studies often consider a fraction of workplace settings, and thus only cover restricted patterns of manual activity profiles and their requirements. Evaluators also often admit further study limitations concerning, e.g., reductions in the broad scope of possible activities or user profiles (e.g., [12–15]) as well as the focus on short-term effects (e.g., [16,17]).
These days, several initiatives for harmonizing the description and especially the evaluation of industrial exoskeletons in both regulatory committees (e.g., American Society for Testing and Materials (ASTM) Committee F48, European Committee for Standardization (CEN) CWA 17664:2021) and scientific communities take place. For instance, the ASTM works on standards for labeling, training, operating, and testing practices [18]. The CEN proposes a performance test method for walking on uneven terrain [19]. The EUROBENCH project aims to develop various tools to assess and benchmark robotic systems considering multiple aspects [20]. Additionally, scientific reviews provide informative overviews of applied evaluations of either exoskeletal prototypes or commercial systems. For instance, they focus on the user's metabolic costs with upper-body exoskeletons [21] or muscular activity, body loading, and experience with trunk exoskeletons [22,23]. In contrast, Hoffmann et al. [11] present a generic prevalence matrix of different applied types of analysis combined with their respective research objects for deriving patterns and best practices for prospective evaluation methodologies. Besides, Baer et al. [24] investigate the statistical effects of exoskeletons on biomechanical stress and strain through a meta-analysis.
Other approaches introduce different kinds of standardized test environments for uniformly evaluating, benchmarking, or comparing exoskeletons by passing representative test tasks, stations, or batteries of specific application profiles. Here, the use of coordinated or complementary methods enables a comprehensive, evidence-based evaluation of the system's performance or applicability. For instance, Hefferle et al. [25] combine multiple evaluation methods (e.g., local/global and subjective/objective) in systematically varied static, dynamic, and simulated assembly tasks. Bostelman et al. [26] propose a reconfigurable testbed for load positioning tasks and analyze the heart rate, visual assessment, and perception of the test person(s). Additionally, Taborri et al. [27] present an automated testbed for balance assessment while wearing exoskeletons. Baltrusch et al. [28] and Kozinc et al. [29] assess the functional performance of trunk exoskeletons with objective observations or quantitative and subjective measures on several motoric tasks, respectively. In this respect, Luger et al. [16] additionally simulate industrial tasks such as pallet box lifting, fastening, and lattice box lifting, arranged in a triangle orientation for considering pathways. Alternatively, (wearable) robots are compared and discussed in the frame of, e.g., RoboCup [30], Exoworkathlon [31], or Cybathlon [32]. In conclusion, test courses already exist and are rising in number due to their practical relevance, but they do not describe a holistic approach for evaluating exoskeletons, as they primarily focus on a few selected tasks and types of analysis.
From a practical standpoint, standardizing the evaluation of industrial exoskeletons is always a trade-off between (a) reducing the vastness of possible industrial application scenarios to a manageable and compressed level and (b) maintaining the overall representativeness of the assessed test scenario(s). Furthermore, various protagonists pursue different interests and core themes in evaluating exoskeletons. For instance, industrial companies focus on, e.g., exoskeletal effects on work performances, reduction of sick days, potential benefits on the company's reputation in society, and on human resources, as well as the employees' acceptance. On the other hand, system users are most interested in, e.g., the physical support, the operative safety, the overall usability, and the long-term prevention effect. Manufacturers and system developers mainly deal with determining the (physical and mechanical) support, validating the technical functionality, incorporating exoskeletons in different working fields, and optimizing the overall system usability and acceptance.
Other protagonists might be scientific institutions, testing institutes, or insurance companies. As a result, rather than prescribing fixed test setups, it is preferable to individually guide protagonists through the evaluation process by demonstrating a consistent framework and sensitizing them to relevant aspects [6]. However, the evaluation processes of different protagonists are similar and can be divided into representative steps within the proposed comprehensive seven-phase model. Thus, in the spirit of a guided self-evaluation, this paper presents a generic and holistic evaluation method for (a) improving the understanding of the individual characteristics of exoskeletal application scenarios and (b) individually deriving relevant evaluation aspects from provided test pools. This approach allows individual adaptations, improves the representation of realistic and relevant working conditions, enables a more profound assessment of the system's workplace applicability, and maintains the general comparability of different test setups and exoskeletons. Another novelty is to combine single characteristics or tasks in a simulated and integrated workplace with interrelated activity profiles and multiple tasks. This improvement enables a more application-oriented evaluation, shortens the investigation period for the test persons, and improves the practicability for the evaluation protagonists. Finally, exemplary study results from the test course application are presented and reflected for selected industrial exoskeletons.

Method for the Evaluation of Industrial Exoskeletons
The number and complexity of both application scenarios and heterogeneous industrial exoskeletons require a generic methodology for their holistic evaluation. In order to address the different interests and aims of protagonists, the evaluation must be capable of considering a wide range of aspects and cannot be limited to a few selected criteria or tasks [8]. For this purpose, the seven-phase model in Figure 1 proposes the three stages (1) setup, (2) conduct, and (3) implication, with seven subordinate phases for evaluation intentions. Since the model should be applicable and adaptable for different protagonists, it leaves scope for adjustments to focus on individual aspects or more relevant evaluation stages. In the overall concept of the model, the evaluation process does not solely comprise the actual examination of exoskeletons but also the pre-initialization and setup as well as the subsequent practical application of findings. Regardless of the focus of the evaluation, new impressions and application-oriented specifications should iteratively flow into the procedure, individually tailoring the assessment to specific use cases. Thus, the seven-phase model envisages a circular process and intends to facilitate the evaluation of exoskeletons through a uniform, systematic procedure while leaving the protagonist the freedom of guided and adaptable self-evaluation. In the following, the different stages and phases are described in detail.
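The stage and phase structure described above can be written down as a small data structure. The following is a minimal sketch in Python and not part of the original method; the names `Stage`, `Phase`, `next_phase`, and `phases_of` are illustrative assumptions. It maps the seven phases to the three stages and models the circular process by wrapping from phase VII back to phase I.

```python
from __future__ import annotations

from enum import Enum


class Stage(Enum):
    SETUP = 1
    CONDUCT = 2
    IMPLICATION = 3


class Phase(Enum):
    # Each member stores (phase number, owning stage), following the grouping in the text.
    CHARACTERIZATION = (1, Stage.SETUP)
    PREPARATION = (2, Stage.SETUP)
    PRE_EVALUATION = (3, Stage.CONDUCT)
    CORE_EVALUATION = (4, Stage.CONDUCT)
    POST_EVALUATION = (5, Stage.CONDUCT)
    ANALYSIS = (6, Stage.IMPLICATION)
    REFLECTION = (7, Stage.IMPLICATION)

    @property
    def number(self) -> int:
        return self.value[0]

    @property
    def stage(self) -> Stage:
        return self.value[1]


def next_phase(current: Phase) -> Phase:
    """Return the following phase; phase VII wraps around to phase I."""
    ordered = sorted(Phase, key=lambda p: p.number)
    # Phase numbers are 1-based, so number % 7 is the 0-based index of the successor.
    return ordered[current.number % len(ordered)]


def phases_of(stage: Stage) -> list[Phase]:
    """List the phases that belong to one stage, in ascending order."""
    return [p for p in sorted(Phase, key=lambda p: p.number) if p.stage is stage]
```

Calling `next_phase(Phase.REFLECTION)` returns `Phase.CHARACTERIZATION`, which mirrors the idea that findings from one evaluation feed the characterization of the next.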

First Stage-Setup
The setup of the evaluation mainly focuses on the classification of support situations and exoskeletons as well as the preparation of the evaluation environment (phases one and two).

Phase I-Characterization
This phase thoroughly analyzes the addressed application scenarios in industrial environments and characterizes these support situations as an interaction of the system user, the applied exoskeleton, and the working environment, each with their own sets of characteristics [6,33]. This analysis makes it possible to derive and select exoskeletons with appropriate morphological and functional properties (e.g., actuation principle, stiffness, level of support) [7,22] for the targeted application field. In addition, suitable test batteries and measurement techniques can be chosen for the later evaluation.

Phase II-Preparation
During the preparation phase, the evaluation environment, with its specific infrastructure and test stations, is set up. As evaluation items and criteria should be determined prior to the conducting stage, it is also necessary to choose appropriate measurement techniques (e.g., electromyography, motion capture analysis, questionnaire), and thus prepare and check the needed research equipment (e.g., surface electrodes or inertial measurement units) for attachability on test person(s) and for technical functionality. The selection of appropriate, applicable, valid, and reliable methods is indispensable for the success of an evaluation. Besides, an initial familiarization with the system's functionality includes a check of the exoskeleton's fit and operability. Furthermore, the suitability of the testing environment in general (e.g., in terms of smooth operational availability or freedom from (electromagnetic) disturbances) should be verified.

Second Stage-Conduct
This stage comprises the pre-, core-, and post-evaluation, where test scenarios (including initial tests) are executed to comprehensively evaluate exoskeletons using qualitative and quantitative measurement methods and investigate long-term and learning effects (phases three to five).

Phase III-Pre-Evaluation
The concluded initialization and setup of the evaluation environment allows the concrete planning of the specific test scenarios to be executed in the subsequent phase. Accordingly, the test stations and tasks need to be capable of simulating comparable industrial applications as practically as possible. Initial tests of exoskeletons in the respective test scenarios prove the applicability and suitability of the chosen measurement methods. Among other aspects, the pre-evaluation includes the examination of the operational requirements of the exoskeleton.

Phase IV-Core-Evaluation
The core-evaluation is the central part of the seven-phase model. In this phase, the prior determined measurement techniques are applied to investigate the posed hypotheses relating to the supportive characteristics of exoskeletons on the user. To consider personal perceptions and measurable objective effects, subjective and objective assessment methods should be complementarily applied. By taking both qualitative and quantitative data, the short-term results potentially become more profound and comprehensive as well as enabling an in-depth evaluation of relevant aspects.

Phase V-Post-Evaluation
As the core-evaluation mainly gathers short-term data, the post-evaluation additionally aims to gain complementary information on the long-term effects of wearing an exoskeleton. Rather than focusing only on immediate results, it relies on either accompanying studies over an extended period or repeated short-term studies. In this respect, biomechanical investigations in a provided framework can help determine possible changes in movement patterns or task executions over time. Surveys can reveal the learning effects of applying exoskeletons on a handling or process level. A regular interval of four to six weeks is recommended.

Third Stage-Implication
The implication stage comprises the analysis of measured data and the subsequent derivation of findings, finally enabling us to derive recommendations for action and improvement initiatives (phases six and seven).

Phase VI-Analysis
The sixth phase reflects the conducted evaluation(s) by analyzing and interpreting the data generated from the studies and surveys as either qualitative or quantitative descriptions of the results. The results focus on the derivation of gained insights and profound findings. The provided and analyzed data on the exoskeleton's effects can be structurally stored in a database and prospectively used for, e.g., comparisons to similar evaluation scenarios, biomechanical simulations, or machine learning purposes to optimize the human-machine interaction. Larger sample sizes may enable statistical significance of the results.

Phase VII-Reflection
Phase seven draws conclusions and makes recommendations for action for the specific use of exoskeletons in an industrial workplace, limited to the application scenario regarded in the preceding phases. Tailoring the used measurement techniques to the focused task(s) allows us to determine the general and specific suitability and feasibility of evaluated exoskeletons, respectively. The derived insights and findings indicate possible improvement initiatives for practical implementation and evaluations. They also may help with using exoskeletons more appropriately and efficiently in practical applications.

Test Course for the Evaluation of Exoskeletons
For a comprehensive and harmonized evaluation of industrial exoskeletons, a test course is suitable to investigate their functionalities and effects. Since different kinds of support can appear [5], this approach primarily focuses on facilitating and adding movements as well as stabilizing postures as common industrial scenarios for exoskeletons. Accordingly, a test course is proposed and designed, splitting up the evaluation into the complementary assessment of operational requirements and effects resulting from the use of exoskeletons. Therefore, it takes up the spirit of Bostelman et al. [26] by its reconfigurability, modularity, and variability to enable the simulation of the vastness in industrial tasks. Similarly, but advancing to cumulative studies on motoric movements (e.g., [27]) as well as industrial tasks (e.g., [16,25]), a sequence of test activities and movement profiles is arranged in the test course and operationalized as a test battery. It is more holistic since it is not limited to isolated industrial tasks or types of analysis but considers an integrated evaluation of necessary boundary conditions and situations for using exoskeletons. Thus, the approach comprises a multifunctional testing infrastructure with standardized reusable, movable, and individually adaptable modules.
The following aggregated compilation of representative operational requirements and occupational tasks draws on the authors' personal experience from employment in industrial companies, several field and laboratory studies conducted with multiple different industrial exoskeletons over recent years, as well as investigations from research projects such as "smartASSIST" (2014-2020) and "Exo@work" (since 2018), funded by the German Federal Ministry of Education and Research (BMBF, grant number 16SV7114) and the German Social Accident Insurance Institution for the trade and distribution industry (BGHW), respectively.

Conceptual Framework for the Setup of the Test Course
The conceptual framework is assigned to the first stage of the seven-phase model in Figure 1, as it specifies the setup for subsequent evaluation of exoskeletons, and thus recommends a viable way to tailor the holistic test course to specific practical realization. Its challenge remains in containing characteristic tasks but being applicable for various industrial applications and distinguishing between exoskeletons and their applicability in a regarded industrial environment due to their operational requirements and wearing effects.
In order to evaluate exoskeletons with intended industrial use in a test course, modeling and simulation of functions and real-life tasks are required. In analogy to different settings and activity profiles in industrial sectors (e.g., the aircraft and automotive industry, mechanical engineering, handicraft, or (intra-)logistics), the tasks of the test course need to match and represent the respective characteristics on an aggregate level. Since evaluation aspects and test scenarios depend on each protagonist's perspective and interests, each protagonist should critically analyze the support situation where an exoskeleton is intended to be used concerning its specifications first. This step is essential for a targeted examination of operational requirements, representative industrial tasks, and types of analysis in the test course.
For appropriate modeling of industrial scenarios and workplaces, the test course consists of two complementary parts with different purposes: a test pool of operational requirements focusing on handling the exoskeleton, potential restrictions of motoric movements and secondary activities, and safety aspects, as well as a test pool of main tasks with different sets of characteristics derived from industrial activities. While the operational requirements are (objectively) evaluated qualitatively according to a standardized test protocol, the industry-related tasks are (both objectively and subjectively) assessed quantitatively, randomized, and in individual configurations for each application scenario. In order to provide evidence of potential effects of exoskeletal use, baseline measurements without exoskeletons are usually needed. The choice of applied types of analysis (e.g., motion capture, electromyography, load cells) depends on the protagonist's affiliation, method knowledge, and personal interest. Inspirations of possible methods can be derived by the overview of Hoffmann et al. [11]. Due to the setup and arrangement of the test course, a single-run evaluation with a minimum of one test person and a standardized test protocol is sufficient for examining operational requirements, whereas a multi-run evaluation with several test persons is suitable for industrial tasks. The resulting insights and recommendations are supposed to flow back in real application scenarios. Figure 2 schematically visualizes a framework for tailoring and selecting relevant operational requirements and industrial tasks for evaluation. As a summary, it depicts the aggregation of both test pools to an overall test course environment and outlines its interrelations and dependencies to the real industrial environment. Appl. Sci. 2021, 11, x FOR PEER REVIEW 7 of 19

Figure 2. Interrelation between real application scenarios and test course environment.
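To illustrate why baseline measurements without an exoskeleton are needed, the following sketch computes the relative change of a measured quantity between a baseline condition and an exoskeleton condition. It is an illustrative assumption, not the authors' analysis procedure: the function name `relative_change`, the use of the mean as summary statistic, and all sample values are hypothetical.

```python
from __future__ import annotations

from statistics import mean


def relative_change(baseline: list[float], with_exo: list[float]) -> float:
    """Relative change of the mean (in percent) of the exoskeleton condition vs. baseline.

    Negative values mean the measured quantity (e.g., muscle activity) is lower
    with the exoskeleton than without it.
    """
    if not baseline or not with_exo:
        raise ValueError("both conditions need at least one measurement")
    base = mean(baseline)
    if base == 0:
        raise ValueError("baseline mean is zero; relative change is undefined")
    return (mean(with_exo) - base) / base * 100.0


# Hypothetical values in arbitrary units, one value per test run.
baseline_runs = [20.0, 22.0, 18.0]
exo_runs = [15.0, 16.0, 14.0]
change = relative_change(baseline_runs, exo_runs)
```

Without the baseline runs, the values measured with the exoskeleton could not be interpreted as an effect of the system, which is the reason the framework pairs both conditions for each task.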

Evaluation of Operational Requirements
Regarding the operational requirements, four different categories with varying focuses and aims are identified to qualitatively evaluate the general usability of the exoskeleton(s) in the first step. The four categories are (1) exoskeleton handling, (2) motoric movements, (3) secondary activities, and (4) operating ability. In order to analyze relevant aspects, these categories are broken down into twenty specific requirements. Nevertheless, the pool does not replace any (mandatory or legally binding) tests, certifications, or risk analyses at workplaces for system manufacturers and employers. It should merely provide operational testing scenarios for system users with regard to the focused application purpose. For clear coding, the identification tag OR (Operational Requirement) with ascending numbers is used. The test course comprises the following pool of operational requirements:
Category 1 (exoskeleton handling) focuses on elementary functions in the process of utilizing the exoskeleton. Related requirements are:
- (Independent) Donning (OR01): System users need to don the exoskeleton. The duration and the possibility of independently donning the system are evaluated, as these aspects influence workplace organization.
- (Independent) Doffing (OR02): System users need to doff the exoskeleton. As with donning, the duration and the possibility of independently doffing the system are of primary interest, as these aspects influence workplace organization and safety.
- Operability/Control (OR03): System users can operate the exoskeleton on their own. If the system offers different support modes, they need to be easily adjustable and applicable for the user, as these aspects influence the usability and the time for familiarization.
- Decoupling of support (OR04): Certain body postures in specific working tasks (e.g., hip flexion while walking underneath a lowered ceiling, shoulder abduction while lifting a box from the ground) might technically induce exoskeletal support, which disturbs the system user more than it helps. Thus, it might be necessary to decouple or block the system's support in such working situations.

- Compatibility with safety clothes (OR20): Regardless of the external structure on the body, system users need to remain capable of wearing safety clothes (e.g., a safety vest). Additionally, the exoskeleton must hamper neither the visibility of the safety vest nor the movability of kinematic elements of the exoskeleton or of the system users themselves.
All operational requirements can be evaluated with a test protocol and an objective observation with a quick check merely on a binary basis, passing individual thresholds (e.g., time, angle, and range) based on the protagonist's previous workplace analysis. To additionally take the personal feelings or acceptance of the system users into account, subjective methods such as interviews and observations can complement the objective criteria-based evaluation. Figure 3 details the crucial functions that are eligible for evaluating the operational requirements of an exoskeleton.
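The binary quick check against individual thresholds can be sketched as follows. This is a minimal illustration under stated assumptions: the class `Requirement`, the function `quick_check`, and in particular the threshold values and measurements are hypothetical, since each protagonist has to derive the thresholds from their own workplace analysis. Only the tags OR01 and OR02 and their category are taken from the text.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    tag: str                      # identification tag, e.g., "OR01"
    name: str
    category: int                 # 1 = handling, 2 = motoric movements, 3 = secondary activities, 4 = operating ability
    threshold: float              # individual limit from the workplace analysis (assumed value)
    unit: str
    lower_is_better: bool = True  # True for durations, False for, e.g., required joint angles

    def passed(self, measured: float) -> bool:
        """Binary decision: does the measured value meet the threshold?"""
        if self.lower_is_better:
            return measured <= self.threshold
        return measured >= self.threshold


def quick_check(requirements: list[Requirement], measurements: dict[str, float]) -> dict[str, bool]:
    """Return pass/fail per tag for every requirement that has a measurement."""
    return {r.tag: r.passed(measurements[r.tag]) for r in requirements if r.tag in measurements}


# Hypothetical thresholds and observations for two handling requirements.
pool = [
    Requirement("OR01", "(Independent) Donning", 1, threshold=60.0, unit="s"),
    Requirement("OR02", "(Independent) Doffing", 1, threshold=30.0, unit="s"),
]
observed = {"OR01": 45.0, "OR02": 40.0}
results = quick_check(pool, observed)
```

With these assumed numbers, donning passes and doffing fails. Subjective feedback from interviews would be recorded alongside this result rather than inside it, keeping the criteria-based part strictly binary.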

Evaluation of Industrial Tasks
Regarding the modeling and simulation of industrial activities in a test course, nine clusters form a test pool of tasks with varying focus and properties. Tasks with similar, comparable characteristics (e.g., different activities above head level equivalently requiring applied normal forces vertically away from the body, granular precision activities such as the insertion of objects into predefined positions or the plugging of components) are aggregated into coherent clusters, while requirements and characteristics vary between the clusters. For better classification, three characteristics specify the focus for each representative task in terms of dynamics, granularity, and handedness. Concerning the dynamics of test tasks, dynamic and static activities are distinguished. Here, the supported body part is decisive for the characterization of the task. For instance, while torquing above head level, the trunk and upper extremities usually remain static, while the arms move up and down dynamically. In this case, the task would be considered a dynamic task. Concerning the granularity of test tasks, coarse and fine work are distinguished. This categorization helps distinguish between precision-focused operations and rough operations involving the handling of larger objects. The third categorization addresses handedness and describes whether a task is performed single-handed or with both hands. For better comparability between different test persons, the handedness (e.g., left- or right-handed) can individually be adapted.
In the following, one exemplary task represents each cluster. The identification tag IT (Industrial Task) with ascending numbers codes the activities uniquely and precisely. Depending on the focus of the investigation, an individual modification of the stations remains possible. The test course represents the following pool of tasks:
- Overhead torquing (IT01): This test task focuses on the static or dynamic performance of activities above head level, where, e.g., activities require the use of additional tools. The task is performed dynamically if the arm is cyclically lowered vertically after each torquing and moved upwards again. If the arm remains in the same posture or moves only for minor adjustments, the task is considered static. The torquing can be specified as, e.g., drilling or screwing. A possible modification is to perform the task in front of the body.
- Grinding walls (IT02): The characteristic of this task is using a long-range tool with a higher dead weight, requiring a two-handed operation. The application focuses on large-scale, rotational, dynamic movements in vertical or horizontal directions.
In addition, all nine clusters listed above can vary due to the forced posture required to execute the task. For instance, the activities might either be performed in a confined environment (e.g., under a lowered ceiling, between shelves with a small horizontal distance) or in ergonomically unfavorable postures (e.g., with the upper body twisted, squatting, or below floor level). All nine tasks are adaptable to the specific working conditions applied. Accordingly, the setup can double the number of modeled tasks.
Furthermore, it is possible to configure, parametrize, or adapt the described generic representative tasks with several variation parameters in order to raise the total range of simulated application scenarios and the representativeness for industrial workplaces. The variation parameters comprise work height, spatial orientation, object size/weight, electric tool use, processing sequence, distance/range, and the number of objects. Each parameter is differently applicable for the respective task. Table 1 lists the nine exemplary tasks, specifies respective characteristics, and assigns arising variation parameters for potential individual adjustments.
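The classification scheme described above can be sketched as a small data model. The following Python sketch is illustrative only: the class and field names are assumptions, and the variation parameters assigned to IT02 are placeholders rather than the entries of Table 1. Only the three characteristics, the seven variation parameters, and the forced-posture doubling are taken from the text.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Dynamics(Enum):
    STATIC = "static"
    DYNAMIC = "dynamic"


class Granularity(Enum):
    COARSE = "coarse"
    FINE = "fine"


class Handedness(Enum):
    SINGLE = "single-handed"
    BOTH = "both hands"


# The seven variation parameters named in the text.
VARIATION_PARAMETERS = frozenset({
    "work height", "spatial orientation", "object size/weight",
    "electric tool use", "processing sequence", "distance/range",
    "number of objects",
})


@dataclass(frozen=True)
class IndustrialTask:
    tag: str                      # e.g. "IT02"
    name: str
    dynamics: Dynamics
    granularity: Granularity
    handedness: Handedness
    variation: frozenset = frozenset()
    forced_posture: bool = False  # confined space or unfavorable posture

    def __post_init__(self):
        # Reject parameters that are not part of the seven named ones.
        unknown = self.variation - VARIATION_PARAMETERS
        if unknown:
            raise ValueError(f"unknown variation parameters: {sorted(unknown)}")

    def in_forced_posture(self):
        """Return the forced-posture variant of this task."""
        return replace(self, forced_posture=True)


# IT02 as characterized in the text: dynamic, coarse, two-handed.
# The variation set below is a placeholder, not the Table 1 assignment.
grinding = IndustrialTask(
    "IT02", "Grinding walls",
    Dynamics.DYNAMIC, Granularity.COARSE, Handedness.BOTH,
    variation=frozenset({"work height", "spatial orientation"}),
)

pool = [grinding]
# Adding a forced-posture variant of every task doubles the pool.
expanded = pool + [task.in_forced_posture() for task in pool]
```

Applied to the full pool of nine tasks, the same expansion would yield eighteen modeled tasks before any variation parameter is changed.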
Since evaluation focuses vary, several types of (objective and subjective) analyses are generally applicable and mainly depend on the protagonist's interest. Appropriate dimensions or aspects can be derived as best practice approaches from a prevalence matrix for conducted evaluations of industrial exoskeletons [11]. For instance, the analysis of physical relief most commonly applies electromyography to determine changes in muscular activity, followed by surveys or analyses of metabolic costs (e.g., heart rate or oxygen consumption). Besides, it is common to analyze movement patterns (e.g., with optical marker systems or inertial measurement units) regarding changes in motion sequences (e.g., velocity or joint angles) as well as applied forces (e.g., with load cells, force plates, or dynamometers) to determine the mechanical support. In contrast, analyses of mental support or working speed are currently not widespread. Studies concerning the ability to concentrate, proneness to errors, or mental fatigue (e.g., [14,34]) can serve as inspiration. Working speed or productivity is determined by Alabdulkarim et al. [12] with an individual maximum acceptable working frequency, by Wang et al. [35] with motion capture, and by Madinei et al. [36] with precision assembly tasks. However, not every type of analysis, especially among the objective ones, can be used for every task, since the tasks examine different criteria and thus different evaluation aspects. For instance, the center of pressure analysis cannot be applied to carrying boxes (IT08) or operating material trolleys (IT09) due to conflicting space requirements. In addition, work precision analyses are not meaningful for coarse work such as grinding walls (IT02) or hanging objects (IT03). It also needs to be mentioned that analyzing, e.g., metabolic costs, maximum acceptable frequencies, or mental support methodologically requires longer durations of task execution.
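The restrictions named in this paragraph can be expressed as a simple applicability check. The sketch below is a hypothetical helper, not part of the presented method: the function name, the analysis labels, and the `long_duration` flag are assumptions, and the excluded task sets contain only the examples mentioned in the text, which are not claimed to be exhaustive.

```python
# Tasks for which the text rules out a given analysis (examples only).
NO_CENTER_OF_PRESSURE = {"IT08", "IT09"}   # conflicting space requirements
COARSE_TASKS = {"IT02", "IT03"}            # work precision not meaningful

# Analyses that, per the text, require longer task execution.
LONG_DURATION_ANALYSES = {
    "metabolic costs",
    "maximum acceptable frequency",
    "mental support",
}


def is_applicable(analysis, task_tag, long_duration=False):
    """Return True if the analysis is not ruled out for the task.

    `long_duration` states whether the planned protocol executes the
    task long enough for duration-dependent analyses.
    """
    if analysis == "center of pressure" and task_tag in NO_CENTER_OF_PRESSURE:
        return False
    if analysis == "work precision" and task_tag in COARSE_TASKS:
        return False
    if analysis in LONG_DURATION_ANALYSES and not long_duration:
        return False
    return True


def applicable_analyses(analyses, task_tag, long_duration=False):
    """Filter a list of candidate analyses for one task."""
    return [a for a in analyses if is_applicable(a, task_tag, long_duration)]
```

For example, `applicable_analyses(["electromyography", "work precision"], "IT02")` keeps only electromyography, because work precision is excluded for the coarse task IT02.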
- Stair ramp (TI02): The stair ramp includes steps, a handrail, and a fenced plateau. An inner hollow reduces the total weight and helps the ramp remain easily movable. The plateau can be used as the reversal point or for simulating narrow working places, and the stairs to model tasks in forced postures (e.g., in a stooped posture, below floor level). Accordingly, the ramp enables the evaluation of exoskeletons regarding, e.g., OR09, IT05, IT06, and ITXX*.
- Adaptable wall (TI03): The wall comprises several horizontal profiles for the placement and individual height adaption of different horizontal or vertical working boards (TI04 to TI08). The item provides a necessary basis for possible evaluations of, e.g., the tasks IT01, IT02, IT03, IT04, IT05, and ITXX*, and especially of activities performed above head level.
- Screwing board (TI04): Depending on the working tool, the screwing board lays the foundation for two ways of evaluating IT01. First, pre-fixed screws can be torqued in a bar with a nut runner (single-handed). Second, several screws can be directly screwed (in a predefined way) with an electric screwdriver (both-handed).
- Plasterboard wall (TI05): Several connected plasterboards allow simulating different working tasks on walls or the ceiling, e.g., grinding, cleaning, mounting, or painting tasks. Accordingly, coarse requirements of, e.g., task IT02 can be covered. After each test person, the initial situation can be restored since plasterboards can easily be repaired with priming material or replaced.
- Suspension device (TI06): The suspension device uses pipe clamps tightened with a screwable strap. It fixes cylindrical, elongated tubes by being clamped into the corresponding holder. Due to its design, the item enables any possible variant of hanging objects (IT03).
- Clamping board (TI07): The clamping board consists of object clamps with a snapping function when pressing the object (e.g., tube) inside. The clamping board has specifically been designed for task IT04.
- Holed pegboard (TI08): The pegboard comprises drilled holes, cylindric pins (with a smaller diameter to precisely fit into the drilled holes), and a collecting pan. Due to installed permanent solenoids, the bolts also stick to the pegboard in vertical or upside-down orientations. Besides, the pegboard is easily mountable to the adaptable wall or working boards. The holed pegboard sets a necessary basis for task IT05.
Figure 4 provides an overview of the addressed test items of the modular and reconfigurable infrastructure as well as exemplary practical applications with regard to the evaluation of operational requirements and industrial tasks.
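The assignment of test items to tasks listed above lends itself to a lookup when planning a setup. The following sketch is a minimal illustration under stated assumptions: the dictionary contains only the item-to-task relations named in the list above (which the text introduces with "e.g." and which are therefore not exhaustive), and the function name `items_for` is hypothetical.

```python
# Test items and the tasks or requirements the text names for them.
ITEM_TASKS = {
    "TI02": {"OR09", "IT05", "IT06", "ITXX*"},                  # stair ramp
    "TI03": {"IT01", "IT02", "IT03", "IT04", "IT05", "ITXX*"},  # adaptable wall
    "TI04": {"IT01"},                                           # screwing board
    "TI05": {"IT02"},                                           # plasterboard wall
    "TI06": {"IT03"},                                           # suspension device
    "TI07": {"IT04"},                                           # clamping board
    "TI08": {"IT05"},                                           # holed pegboard
}


def items_for(task_tag):
    """Return the sorted test items that the text links to a task."""
    return sorted(item for item, tasks in ITEM_TASKS.items() if task_tag in tasks)


def items_for_course(task_tags):
    """Return the sorted union of test items for a selection of tasks."""
    needed = set()
    for tag in task_tags:
        needed.update(items_for(tag))
    return sorted(needed)
```

A course combining IT01 and IT02, for instance, would draw on the adaptable wall together with the screwing board and the plasterboard wall, which reflects how the modular infrastructure keeps the amount of equipment manageable.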
Commercially available tools such as screwdrivers, drilling machines, height-adjustable worktables and shelf spaces, chairs and benches with or without back and armrest, pallets, and material carts are not specified, since these items are assumed to be given or market-available. Provided weight bags can be used to simulate different workloads. Due to their uniform weight, the number of bags allows for scaling the test weight as desired.

Results
Eight heterogeneous, exemplarily chosen exoskeletons were critically put through all described operational requirements and industrial tasks of the designed test course. As Table 2 illustrates, exoskeletons for different body parts with varying modes of actuation were applied.

Suitability of the Test Course
The principal suitability of the presented test course for evaluating exoskeletons in a harmonized way, especially regarding the representativity and applicability of the operational requirements and industrial tasks, can be assessed. In this respect, the decisive factor was whether using different exoskeletons revealed varying results due to their different functional and morphological characteristics. The modular and reconfigurable test infrastructure is capable of realizing various test setups while keeping the amount of equipment at a manageable level. Owing to the modular approach, the test course is suitable for evaluating different types of exoskeletons with regard to their requirements and usability for movement tasks (e.g., sitting down, picking up objects, walking in narrow aisles) and application contexts (e.g., personal protective equipment).

Applicability and Effectiveness of Exoskeleton Types
In addition, the application of exemplary exoskeletons in the test course shows discernible trends with regard to the applicability and effectiveness of exoskeleton types. The described trends are generalized and not universally applicable, since the effects of different exoskeletons and exoskeleton types vary and the trends are based solely on the test course evaluation.

Mode of Actuation
- In comparison to passive exoskeletons, active systems are more suitable for tasks with dynamic movement sequences and high variance, because their support performance can, in principle, be adapted versatilely, as the application of exoskeletons mainly in IT01, IT02, IT07, and IT08 shows.
- Passive systems are mainly suitable for static holding and stabilization tasks with only minor variations (e.g., IT04 and IT05). Due to the passive drive (e.g., spring), the energy for force support must first be actively supplied to the system by the user. Accordingly, passive systems have proven to be especially suitable for activities without required load changes.
- Both types often offer a possibility to deactivate the force support, whereby active systems can automatically switch off the support for selected movements (e.g., OR03, OR16). On the other hand, passive systems usually have to be manually unlocked, though not all exoskeletons possess this option (e.g., OR04, OR16).

Morphological Structure
- Soft systems, so-called exosuits, are characterized by materials fitting close to the body. Thus, these systems are particularly suitable for working contexts requiring the (invisible) provision of a high level of wearer comfort (e.g., in narrow aisles (ITXX*) or underneath personal protective equipment (OR10)). Correspondingly, exosuits mainly provide support for holding and stabilization tasks (e.g., IT05, IT06). However, the level of support is generally limited to a low level.
- Rigid exoskeletons offer a higher potential for force support than soft systems, but usually require a larger operation space (e.g., IT01, IT02, ITXX*). Thus, the adaptability with working or personal protective equipment can potentially be restricted (e.g., OR10).

Effectiveness
- As the evaluation of all operational requirements assigned to the secondary activities (OR09 to OR15) as well as industrial tasks (IT01 to IT09) proves, exoskeletons are differently suited to support system users performing main and secondary activities (e.g., OR11, OR13) or to continue to operate working aids such as industrial trucks (e.g., IT09).
- As the test course application of operational requirements and industrial tasks confirms, exoskeletons are primarily designed for one use case and to support the system user in one specific application, correspondingly. Secondary activities are often limited, e.g., the arms are still pushed up when bending forward in passive shoulder exoskeletons. If designed correctly, active systems with situation recognition have more far-reaching possibilities for adapting their support without hindering secondary activities.
- Even though exoskeletons are capable of supporting system users by their functionality, the morphological structure or operating principle can potentially restrict the suitability (e.g., inertial active exoskeletons following or performing dynamic movements) of exoskeletons, as high-dynamic movements might be hindered (e.g., OR08, OR12, IT06, IT07).

Discussion
In the context of this paper, a seven-phase model for the evaluation of exoskeletons has been designed, operationalized by means of a test course, and tested in practice using eight exemplary systems. The validation focused on testing the practical applicability of the seven-phase model and the suitability of the test course with regard to mapping various industrial application scenarios and achieving different results for different exoskeletons. Accordingly, at this stage of the investigation, the comparability of exoskeletons based on the studies performed was of secondary interest. Nevertheless, first recommendations for the targeted and appropriate use of exoskeleton types have been derived.

Seven-Phase Model
The seven-phase model with the test course as the practical core of this method enables an evidence-based evaluation of exoskeletons in a harmonized but practice-oriented test environment. In this respect, the seven-phase model describes significant steps for comprehensively evaluating exoskeletons. It does not solely focus on the conduct of the evaluation itself but also on relevant earlier (setup) and subsequent (implication) stages. Accordingly, the evaluation results do not purely assess the systems but can also provide significant knowledge for different user groups and stakeholders, as the test course helps (future) end users gain applicable information regarding the appropriate use of exoskeletons. Besides, the evaluation process and results provide insights for exoskeleton manufacturers, since system configurations and modes of operation can be sharpened or designed with regard to specific application scenarios. This could potentially reduce development and engineering costs, since exoskeletons can be comprehensively evaluated prior to their industrial implementation. Nevertheless, the informative value remains coupled to the considered evaluation context.

Test Course
Regarding the test course, the complexity of industrial application scenarios of exoskeletons does not merely require a uniform setup, but rather a multifunctional configuration of infrastructure built from reusable, movable, and individually adaptable standardized modules. Thus, the test course enables an evaluation of exoskeletons not only for selected isolated activities but also for interrelated activity profiles. This benefit is achieved by combining industrial tasks and setting them up in different arrangements. In addition to the task-based evaluation of exoskeletons for industrial suitability, tests of operational requirements complement the test course as a second integral part. With regard to evaluating operational requirements, the number of tested exoskeletons plays only a minor role compared to the variance of support systems, since the respective functionalities are evaluated in binary terms (functionality either given or not). In this respect, the evaluation of this test pool based on eight exemplary systems is sufficient to attest to the practical applicability of the test pool. To compress the vastness of different activities and keep the number of tasks in the test course manageable, a trade-off between the representativeness and the complexity of the tasks depicted has been necessary. Even though not every activity can be precisely modeled, the test course has proven successful in clustering industrial tasks when similar characteristics are present. The conducted studies have focused primarily on the evaluation of objective criteria, whereas the recording of subjective and perceived aspects has been of secondary interest.
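The binary evaluation of operational requirements (functionality either given or not) can be illustrated with a small sketch. Everything specific in it is an assumption made for illustration: the system names `exo_a` and `exo_b`, the selected OR tags, the True/False outcomes, and both function names are placeholders and do not reproduce results of the study.

```python
def fulfilled_share(checks):
    """Share of operational requirements a system fulfils (0.0 to 1.0)."""
    return sum(checks.values()) / len(checks)


def discriminating_requirements(results):
    """Return the requirement tags on which the tested systems differ.

    `results` maps a system name to a dict of requirement tag -> bool.
    A requirement discriminates if at least two systems have different
    binary outcomes for it.
    """
    tags = sorted({tag for checks in results.values() for tag in checks})
    differing = []
    for tag in tags:
        outcomes = {checks[tag] for checks in results.values() if tag in checks}
        if len(outcomes) > 1:
            differing.append(tag)
    return differing


# Placeholder outcomes, invented purely to demonstrate the functions.
results = {
    "exo_a": {"OR03": True, "OR10": False, "OR16": True},
    "exo_b": {"OR03": True, "OR10": True, "OR16": False},
}

differing = discriminating_requirements(results)
```

With such binary outcomes, the variance between systems matters more than their number: a requirement on which all systems agree says little about the suitability of the test pool, whereas differing outcomes indicate that the requirement separates exoskeleton types.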

Limitations
However, the conducted exemplary application of the test course features several limitations, since the number of evaluated systems and users is limited. For a more detailed investigation, an accompanying study of both subjective and objective effects is indispensable for the performance evaluation of exoskeletons in the test course, since the respective criteria decisively influence the acceptance and usability of exoskeletons. Not only do exoskeletons need to have the required functionality, but subjective factors (e.g., perceived wearer comfort or willingness to use the exoskeleton) should also be examined more closely. Complementary surveys, interviews, or observations may help provide this additional information. A larger pool of exoskeletons and larger sample sizes should also be considered as the studies expand.
The mode and level of support were fixed while the test subjects performed the tasks, whereas the comparability between subjects could have been improved if the support level had been individualized in relation to the subject's weight. All system users were healthy, young, and experienced in handling exoskeletons, which does not mirror the situation in real applications. Because only one size was available, the tested exoskeletons could be adapted to the subjects' different body proportions and heights only to varying degrees. Accordingly, the tested exoskeletons did not suit all test persons equally well.

Future Work
As the test course is set up, the next step will focus on the practical realization of the second stage of the seven-phase model (conduct). In order to address the limitations, the investigations and studies will be expanded to prove the effectiveness of exoskeletons by applying biomechanical, physiological, and cognitive analysis methods. The focus lies on the further validation of the test course, as the test tasks may help quantify supportive effects. Besides, the gathered data (e.g., electromyographic data on the muscular activity indicating relieving and straining impacts by exoskeletons, motion capture data on differing joint angles using exoskeletons compared to baseline) may reveal apparent redundancies between separate test tasks and necessary adjustments in the setup of the test course. In order to disseminate the insights and findings gained through the evaluation of exoskeletons in the test course, the results will be used to guide appropriate application and evaluation of exoskeletons in industrial scenarios. In this regard, detailed recommendations for action will be derived for the targeted use of industrial exoskeletons.

Conclusions
All in all, the test course has proven to be successful in practically realizing the seven-phase model. The test course helps evaluate the operational requirements of industrial exoskeletons and their supportive effect on users while performing industrial tasks. The evaluation of the eight exemplary systems in the test course has shown and attested differences in the applicability and effectiveness between exoskeleton types. Besides, the method for evaluating exoskeletons may also raise awareness among future users with regard to the appropriate and targeted use of support systems.

Institutional Review Board Statement: Ethical review and approval were waived for this study, due to the methodologic and construction-oriented focus of the paper.

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement: Not applicable.