Motion Capture Technology in Industrial Applications: A Systematic Review

The rapid technological advancements of Industry 4.0 have opened up new vectors for novel industrial processes that require advanced sensing solutions for their realization. Motion capture (MoCap) sensors, such as visual cameras and inertial measurement units (IMUs), are frequently adopted in industrial settings to support solutions in robotics, additive manufacturing, teleworking and human safety. This review synthesizes and evaluates studies investigating the use of MoCap technologies in industry-related research. A search was performed in Embase, Scopus, Web of Science and Google Scholar. Only studies in English, from 2015 onwards, on primary and secondary industrial applications were considered. The quality of the articles was appraised with the AXIS tool. Studies were categorized based on the type of sensors used, the beneficiary industry sector, and the type of application. Study characteristics, key methods and findings were also summarized. In total, 1682 records were identified, and 59 were included in this review. Twenty-one and 38 studies were assessed as being prone to medium and low risks of bias, respectively. Camera-based sensors and IMUs were used in 40% and 70% of the studies, respectively. Construction (30.5%), robotics (15.3%) and automotive (10.2%) were the most researched industry sectors, whilst health and safety (64.4%) and the improvement of industrial processes or products (17%) were the most targeted applications. Inertial sensors were the first choice for industrial MoCap applications. Camera-based MoCap systems performed better in robotic applications, but camera obstructions caused by workers and machinery were the most challenging issue. Advancements in machine learning algorithms have been shown to increase the capabilities of MoCap systems in applications such as activity and fatigue detection as well as tool condition monitoring and object recognition.


Introduction
Motion capture (MoCap) is the process of digitally tracking and recording the movements of objects or living beings in space. Different technologies and techniques have been developed to capture motion. Camera-based systems with infrared (IR) cameras, for example, can be used to triangulate the location of retroreflective rigid bodies attached to the targeted subject. Depth-sensitive cameras, projecting light towards an object, can estimate depth based on the time delay from light emission to backscattered light detection [1]. Systems based on inertial sensors [2], electromagnetic fields [3] and potentiometers that track the relative movements of articulated structures [4] also exist. Hybrid systems combine different MoCap technologies in order to improve precision and reduce camera occlusions [5]. Research has also focused on the handling and processing of high-dimensional data sets with a wide range of analysis techniques, such as machine learning [6], Kalman filters [7], hierarchical clustering [8] and more.
acceleration of an object) for either primary or secondary industrial applications were included; in this context, an industrial application was defined as any process related to the extraction of raw materials (e.g., metals or farming), or the manufacturing and assembly of goods (e.g., cars or buildings). Therefore, proof-of-concept papers that were not tested experimentally, simulations, and studies concerning white-collar workers (e.g., office or other non-manual workers) were excluded; additionally, works employing sensors that can only indirectly measure motion (e.g., electromyography (EMG) in conjunction with machine learning algorithms [21]) were also omitted. Articles were included only if the participants' sample size (where applicable), and the type, number and placement of all used sensors were reported.
Journal papers were prioritized in the event where their contents were also covered in earlier conference publications; in cases where this overlap was only partial, multiple publications were included.
All articles were imported to a standard software for publishing and managing bibliographies and duplicates were automatically removed. Two independent reviewers screened all titles and abstracts and labelled each article based on its conformity with the aims of the study. Articles that both reviewers deemed as non-compliant with the predefined inclusion criteria were excluded from further review. The remaining articles were then fully screened, and each reviewer provided reasons for every exclusion. Conflicts between the reviewers were debated until both parties agreed to a conclusion. Finally, the reference lists of all approved articles were browsed to discover eligible articles that were not previously found; once more, both reviewers individually performed full-text screenings and evaluated all newly found publications.

Assessment of Risk of Bias
Two reviewers assessed the risk of bias and the quality of all considered studies using an adapted version of the AXIS appraisal tool for cross-sectional studies [22]. Questions 6, 7, 13, 14, 15 and 20 of the original AXIS appraisal tool were disregarded since they assess issues that are not often apparent in studies concerning industrial applications, such as taking samples from representative populations, non-responders, non-response bias and participants' consent. The remaining elements of the AXIS list were adapted to form twelve questions that could be answered with a "yes" or a "no" and were used to appraise each study (Table 1) by summing all affirmative responses and providing a concluding score out of 12. Studies ranked below 6 were viewed as having a high risk of bias, while studies with ratings over 7 and 10 were considered of medium or low risk, respectively. The average study ratings of both reviewers were also computed to confirm the inter-rater assessment consistency.

Table 1 (partial). Adapted AXIS appraisal questions, given as adapted question number, original AXIS item number and question text.
Were the outcome variables measured correctly using instruments/measurements that had been trialled, piloted or published previously?
Q6 (AXIS 10): Is it clear what was used to determine statistical significance and/or precision estimates (e.g., p-values, confidence intervals)?
Q7 (AXIS 11): Were the methods sufficiently described to enable them to be repeated?
RESULTS
Q8 (AXIS 12): Were the basic data adequately described?
Q9 (AXIS 16): Were the results presented for all the analyses described in the methods?
DISCUSSION
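The scoring rule described above can be sketched as follows. This is a minimal illustration, not the reviewers' actual procedure: the function names are hypothetical, and the cut-offs between bands are an assumption, because the text only states "below 6" for high risk and "over 7 and 10" for medium and low risk.

```python
from typing import Sequence


def axis_score(answers: Sequence[bool]) -> int:
    """Sum the affirmative answers to the twelve adapted AXIS questions."""
    if len(answers) != 12:
        raise ValueError("expected exactly 12 yes/no answers")
    return sum(1 for answer in answers if answer)


def risk_band(score: int) -> str:
    """Map a score out of 12 to a risk-of-bias band.

    ASSUMPTION: scores below 6 are 'high', scores of 10 or more are 'low',
    and everything in between is 'medium'. The source does not pin down
    the exact boundaries.
    """
    if not 0 <= score <= 12:
        raise ValueError("score must be between 0 and 12")
    if score < 6:
        return "high"
    if score >= 10:
        return "low"
    return "medium"
```

Under these assumed cut-offs, a study answering "yes" to ten questions would be placed in the low-risk band.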

Data Extraction
Data from all considered articles were extracted by each reviewer independently using predefined tables. Cumulative logged data were systematically monitored by both parties to ensure coherent data collection. Authors' names, year of publication, sample size, sensor placement (binned: machinery, upper, lower or full body), number and types of all used sensors (e.g., IMU or cameras), secondary validation systems (where applicable), and main findings were all recorded. In reference to their respective broad objectives, all considered articles were allocated into four groups based on whether they aimed to ensure workers' health and safety (e.g., postural assessment, preventing musculoskeletal injuries, detecting trips and falls), to directly increase workers' productivity (e.g., workers' location and walk path analysis), to conduct machinery monitoring and quality control (e.g., cutting tool inspections), or to improve an industrial process (e.g., hybrid assembly systems) or the design of a product (e.g., car seats). If a work could fall into more than one category [23] (e.g., health and safety, and workers' productivity), the paper was allocated in the most prominent category. Additionally, the directly beneficiary industry sector was recorded (e.g., construction, aerospace, automotive, or energy); in the instance of a widespread application, the corresponding article was labelled as "generic". Studies that employed machine learning were additionally logged, along with the used algorithm, type of input data, training dataset, output and performance.

Search Results
Database searching returned 1682 records (Figure 1). After removing duplicates (n = 185), the titles, keywords and abstracts of 1497 articles were screened and 1353 records were excluded as they did not meet the inclusion criteria. The remaining articles (n = 144) were assessed for eligibility, and 47 papers were retained in the final analysis. Twelve more records were added after screening the reference lists of the eligible papers, bringing the total number of included studies to 59. Four, 13 and 16 records were published in 2015, annually from 2016 to 2018, and in 2019, respectively, underlining the increasing interest of the research community in the topic.
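The record counts above can be checked with a few lines of arithmetic. The sketch below uses only the numbers reported in the text; the variable names are illustrative.

```python
# Record flow as reported in the search results.
identified = 1682
duplicates = 185
screened = identified - duplicates             # titles/abstracts screened
excluded_at_screening = 1353
full_text_assessed = screened - excluded_at_screening
retained_after_full_text = 47
added_from_reference_lists = 12
included = retained_after_full_text + added_from_reference_lists

# Publications per year: 4 in 2015, 13 in each of 2016-2018, 16 in 2019.
per_year = {2015: 4, 2016: 13, 2017: 13, 2018: 13, 2019: 16}

print(screened, full_text_assessed, included, sum(per_year.values()))
```

The yearly counts sum to the same total as the included studies, so the two ways of counting agree.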

Risk Assessment
Twenty-one and 38 studies were assessed as being prone to medium and low risks of bias, respectively (Table 2). None of the considered articles scored lower than six on the employed appraisal checklist. All reviewed articles presented reliable measurements (Q5) and conclusions that were justified by their results (Q10); yet, many authors inadequately reported or justified sample characteristics (Q3, 37%), study limitations (Q11, 53%) and funding or possible conflict sources (Q12, 51%). Statistics (Q6, 81%) and general methods (Q7, 88%) were typically described in depth. Generally, studies were favourably assessed against all the remaining items of the employed appraisal tool (Q1, 95%; Q2, 92%; Q4, 93%; Q8, 98%; Q9, 93%). The assessments of both reviewers were consistent and comparable, with average review scores of 9.9 ± 1.6 and 9.9 ± 0.9.

MoCap Technologies in Industry
In the reviewed studies, pose and position estimation was carried out with either inertial or camera-based sensors (i.e., RGB, infrared, depth or optical cameras), or with a combination of the two (Table 3). Inertial sensors have been widely employed across all industry sectors (49.2% of the reviewed works), whether the tracked object was an automated tool, the end effector of a robot [30,37,64], or the operator [27,36,39]. In 30.5% of the reviewed studies, camera-based off-the-shelf devices such as RGB, IR and depth cameras, mostly coming from the gaming industry (e.g., Microsoft Kinect and Xbox 360), were successfully employed for human activity tracking, and gesture or posture classification [25,77]. Inertial and camera-based sensors were used in synergy in 10.2% of the considered works, in the tracking of the operator's body during labour or the operator's interaction with an automated system (e.g., robotic arm). EMG, ultra-wide band (UWB) nets, resistive bending sensors or scanning sonars were used along with IMUs to improve pose and position estimation in five studies (8.5%). One study also coupled an IMU sensor with CCTV and radio measurements. Generally, IMU and camera-based sensors were used consistently in industry during the last 5 years (Figure 2).
Considering that the most frequently adopted sensors used in industry were IMUs (e.g., Xsens MVN) and marker-based or marker-less (e.g., Kinect) camera systems, their characteristics, advantages and disadvantages were also mapped (Table 4) in order to evaluate how appropriate each sensor type is to the different applications. Naturally, the characteristics of each system vary greatly depending on the number, placement, settings and calibration requirements of the sensors; yet, general recommendations can be made for the adoption of a particular type of sensor for distinct tasks. Additionally, given the required level of accuracy, capture volume, budget and workplace limitations or other considerations, Table 4 shows the specifications and most favoured industrial applications for each type of sensor (e.g., activity recognition, or human-robot collaboration).
Table 4 (recoverable entries). Favoured applications: human-robot collaboration [42], robot trajectory planning [52]; activity tracking [34], gesture or pose classification [25,45,53]. Notes: 1 Based on a sample layout with 24 Prime x 41 OptiTrack cameras. 2 Based on a sample layout with 4 Flex 3 OptiTrack cameras. 3 Based on the specs of the Xsens MTW Awinda. 4 Based on the Xsens MVN. 5 Based on the Kinect V2.
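The per-category shares reported for Table 3 can be related to the overall figures of roughly 70% (IMUs) and 40% (cameras) quoted elsewhere in this review. The sketch below is a worked check under stated assumptions: the study counts per category are not given in the text and are back-calculated here from the percentages and the total of 59 studies, and the single CCTV study is assumed not to be counted as camera-based MoCap.

```python
TOTAL_STUDIES = 59

# ASSUMPTION: counts reconstructed from the reported percentages.
counts = {
    "imu_only": 29,           # reported as 49.2%
    "camera_only": 18,        # reported as 30.5%
    "imu_and_camera": 6,      # reported as 10.2%
    "imu_plus_other": 5,      # reported as 8.5% (EMG, UWB, bending sensors, sonar)
    "imu_cctv_radio": 1,      # single study
}


def share(n: int) -> float:
    """Percentage of all reviewed studies, rounded to one decimal place."""
    return round(100 * n / TOTAL_STUDIES, 1)


any_imu = (counts["imu_only"] + counts["imu_and_camera"]
           + counts["imu_plus_other"] + counts["imu_cctv_radio"])
any_camera = counts["camera_only"] + counts["imu_and_camera"]

for name, n in counts.items():
    print(f"{name}: {n} studies, {share(n)}%")
print(f"any IMU: {any_imu} studies, {share(any_imu)}%")
print(f"any camera: {any_camera} studies, {share(any_camera)}%")
```

With these assumed counts, the number of studies using any inertial unit comes to 41, which matches the figure given in the study design summary.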

Types of Industry Sectors
Most frequently, MoCap technologies were adopted by the construction industry (Table 5, 30.5%), followed by applications on the improvement of industrial robots (22%), automotive and bicycle manufacturing (10.2%), and agriculture and timber (8.5%). On a few occasions, authors engaged in applications in the food (5.1%) and aerospace industries (3.4%), while energy, petroleum and steel industries were each discussed in a single study (1.7%). All remaining applications were considered as generic (22%) with typical examples of studies monitoring physical fatigue [48,71], posture [45] and neck-shoulder pain [74] in workers. Construction, generic and robotic applications were the only researched topics in 2015, while automotive, agriculture and food industrial applications were explored every year after 2016; MoCap technologies in the aerospace, energy, steel and petroleum industries were disseminated only recently (Figure 3, left).

MoCap Industrial Applications
MoCap techniques for industrial applications were primarily used for the assessment of health and safety risks in the working environment (Table 6, 64.4%), whilst fatigue and proper posture were the most targeted issues [48,49,72]. The research interest of the industry in health and safety MoCap applications increased steadily over the reviewed period (Figure 3, right). Productivity evaluation was the second most widespread application (20.3%), with studies typically aiming to identify inefficiencies or alternative approaches to improve industrial processes. Similarly, MoCap techniques were also employed to directly improve workers' productivity (10.1%), whereas 8.5% of the studies focused on task monitoring [17] or on the quality control of an industrial process [30].

MoCap Data Processing
In the majority of the reviewed works, raw sensor recordings were subject to data synchronization, pre-processing, and classification. Data synchronization was occasionally reported as part of the pre-processing stage and included in the data fusion algorithm [24,34,36], but technical details were frequently omitted in the reviewed studies [27,28]; when the synchronization strategy was reported, either a master control unit [36,50,51,54] or a common communication network [15,31,67] was used. Different sampling rates of data streams were addressed by linear interpolation and cross-correlation [73] techniques, or by introducing a known event that triggers all the sensors [29,47,49,55].
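As an illustration of the interpolation and cross-correlation approach mentioned above, the following sketch resamples a slower stream onto a reference time base and estimates the residual time offset between the two. It is a generic example using NumPy, not the pipeline of any reviewed study; the function names and the assumption of a shared, roughly similar signal shape in both streams are illustrative.

```python
import numpy as np


def resample_linear(t_src, x_src, t_dst):
    """Linearly interpolate samples (t_src, x_src) onto the time base t_dst."""
    return np.interp(t_dst, t_src, x_src)


def estimate_lag(ref, sig, fs):
    """Estimate how many seconds `sig` lags behind `ref`.

    Both arrays must already share the sampling rate `fs` (Hz).
    A positive result means that `sig` is delayed relative to `ref`.
    """
    ref0 = ref - ref.mean()
    sig0 = sig - sig.mean()
    xcorr = np.correlate(sig0, ref0, mode="full")
    # In 'full' mode, index len(ref) - 1 corresponds to zero lag.
    lag_samples = int(np.argmax(xcorr)) - (len(ref0) - 1)
    return lag_samples / fs
```

A typical use would be to resample the lower-rate stream onto the time base of the higher-rate one, estimate the lag, and then shift one of the streams by that amount before fusing them.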
Data classification was obtained by establishing thresholds or via machine learning classifiers. An example of a threshold was given by [39], where trunk flexion of over 90° was selected to identify a high ergonomic risk, or by [31], where the position of the operator's centre of mass and the increasing palm pressure identified a reach-and-pick task. Such thresholds were obtained based on observations or established models and standards (e.g., RULA: Rapid Upper Limb Assessment, and REBA: Rapid Entire Body Assessment scores). Machine learning techniques were employed in 18.6% of the reviewed works (Table 7), aiming to build an unsupervised or semi-supervised system able to improve its own robustness and accuracy while increasing the number of outcomes that were correctly predicted. The most used algorithms were Artificial Neural Network (ANN), Support Vector Machine (SVM) and Random Forest (RF), with ANN and SVM being mostly employed for binary or three-group classification, and RF for multiclass classification. The accuracy of the developed machine learning algorithms typically ranged from 93% to 99% (Table 7).
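Threshold-based classification of the kind described above can be sketched in a few lines. The example below uses the 90° trunk flexion limit reported for [39]; the function names, the input format (a list of trunk flexion angles in degrees) and the summary metric are assumptions made for illustration only.

```python
from typing import List, Sequence

# Trunk flexion above this angle is treated as high ergonomic risk, following [39].
HIGH_RISK_TRUNK_FLEXION_DEG = 90.0


def flag_high_risk(trunk_flexion_deg: Sequence[float],
                   threshold: float = HIGH_RISK_TRUNK_FLEXION_DEG) -> List[bool]:
    """Return one flag per sample: True when flexion is strictly over the threshold."""
    return [angle > threshold for angle in trunk_flexion_deg]


def high_risk_fraction(trunk_flexion_deg: Sequence[float],
                       threshold: float = HIGH_RISK_TRUNK_FLEXION_DEG) -> float:
    """Fraction of samples spent above the threshold (hypothetical summary metric)."""
    flags = flag_high_risk(trunk_flexion_deg, threshold)
    return sum(flags) / len(flags) if flags else 0.0
```

A fixed threshold is transparent and easy to audit against standards such as RULA or REBA, which is one reason it remains common alongside machine learning classifiers.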

Study Designs and Accuracy Assessments
Overall, the reviewed studies dealt with small sample sizes of fewer than twenty participants, with the exception of Tao et al. [56], Muller et al. [38] and Hallman et al. [74], who recruited 29, 42 and 625 participants, respectively. Eighteen, 13 and 8 studies placed IMU sensors on the upper, full and lower body, respectively, while six authors attached IMUs to machinery (Table 8). Out of the 41 studies that employed inertial units (70% of all the works), the majority of the authors used fewer than three sensors (25 studies, Table 8), while seven groups used 17 sensors, as part of a pre-developed biomechanical model with systems such as the Xsens MVN, to capture full body movements. Sensor placement for all the studies that did not adopt pre-developed models is graphically depicted in Figure 4. Six studies accompanied motion tracking technologies with EMG sensors [29,49-51,54,57], two with force plates [73,75], two with pressure mats [61,62] and one with instrumented shoes [73]. Two works also used the Oculus Rift virtual reality headset to remotely assess industrial locations and control robotic elements [39,43]. The tracking accuracy of the developed systems was directly assessed against gold-standard MoCap systems (e.g., Vicon or Optotrak; Table 8, in bold) in six works [14,15,55,59,73,77], while the classification or identification accuracy of a process was frequently evaluated with visual inspection of video or phone cameras [15,29,36,44,60,63,69]. A thorough diagram showing the connections between type of industry, application and MoCap system for each considered study is also presented in Figure 5.

Table 8 (recoverable entries). Study characteristics and key findings.
The orientation of a robot (clock face and orientation angles) for pipe inspection can be estimated via an inverse model using an on-board IMU.
Motion analysis system (45 × passive markers), 2 × camcorders: The 3D pose reconstruction can be achieved by integrating morphological constraints and discriminative computer vision. The performance was activity-dependent and was affected by self and object occlusion.
Welding robot path planning with an error in the trajectory of the end-effector of less than 3 mm.
The transfer of assembly knowledge between workers is faster with printed instructions rather than with the developed smart assembly workplace system (p-value = 7 × 10^-9), as tested in the assembly of a bicycle e-hub.
Identification of physical fall hazards in construction; results showed a strong correlation between the location of hazards and the workers' responses (0.83).
[59]; 4; Lower Body; 2; Osprey IRC system: Distinguish hazardous from normal conditions on construction jobsites with 1.2 to 6.5 mean absolute percentage error in non-hazard and 5.4 to 12.7 in hazardous environments.
[32]: Presentation of a robot vision system based on CNN and a Monte Carlo algorithm with a success rate of 97.3% for the pick-and-place task.
[70]; 15; -; -; 1 × depth camera: A system aiming to warn a person while washing hands if improper application of soap was detected based on hand gestures, with 94% gesture detection accuracy.

Discussion
Industry 4.0 has introduced new processes that require advanced sensing solutions. The use of MoCap technologies in industry has been steadily increasing over the years, enabling the development of smart solutions that can provide advanced position estimation, aid in automated decision-making processes, improve infrastructure inspection, enable teleoperation, and increase the safety of human workers. The majority of the MoCap systems that were used in industry were IMU-based (in 70% of the studies, Table 3), whilst camera-based sensors were employed less frequently (40%), most likely due to their increased operational and processing cost, and other functional limitations, such as camera obstructions by workers and machinery, which were reported as the most challenging issues [25,45,55]. Findings suggest that the selection of the optimal MoCap system to adopt was primarily driven by the type of application (Figure 5); for instance, monitoring and quality control were mainly achieved via IMU sensors, while productivity improvement was mainly achieved via camera-based (marker-less) systems. Type of industry was the second factor that had an impact on the choice of a MoCap system (Figure 5); for example, in a highly dynamic environment, such as a construction site, where the setup and configuration of the physical space change over time, wearable inertial sensors were the best option, since they could ensure a robust and continuous assessment of the operator's health and safety. In industrial robot manufacturing, by contrast, where the environmental constraints are known and constant, health and safety issues were primarily addressed by camera-based systems.
The increased use of IMUs promoted the development of advanced algorithmic methods (e.g., Kalman filters and machine learning, Table 7) for data processing and estimation of parameters that are not directly measured with inertial systems [17]. Optoelectronic technologies performed better and with higher tracking accuracy in human-robot collaboration tasks [33] and robot trajectory planning [32,46,52], due to the favourable conditions in such applications (e.g., the limited working volume and the known robot configurations), which allowed cameras to avoid obstructions. In general, hybrid systems that incorporate both vision and inertial sensors were found to have improved tracking performance in noisy and highly dynamic industrial environments, compensating for drift issues of inertial sensors and long-term occlusions which can affect camera-based systems [15,40]. For instance, in Papaioannou et al. [40], the trajectory tracking error caused by occlusion in a hybrid system was approximately half that of a camera-based tracking system.
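The complementary behaviour of inertial and camera-based tracking described above can be illustrated with a deliberately simple one-dimensional sketch. It is not the method of Papaioannou et al. [40] or of any other reviewed study: the function name, the fixed correction gain and the use of `None` to mark occluded camera frames are all assumptions chosen for illustration.

```python
from typing import List, Optional, Sequence


def fuse_position(imu_velocity: Sequence[float],
                  camera_position: Sequence[Optional[float]],
                  dt: float,
                  gain: float = 0.2,
                  x0: float = 0.0) -> List[float]:
    """Fuse IMU-derived velocity with intermittent camera position fixes.

    Each step first integrates the IMU velocity (which drifts if the
    velocity is biased) and then, when the camera sees the target, pulls
    the estimate towards the camera measurement by a fixed gain.
    A camera sample of None marks an occluded frame.
    """
    estimate = x0
    track = []
    for velocity, cam in zip(imu_velocity, camera_position):
        estimate += velocity * dt                 # inertial prediction
        if cam is not None:                       # camera correction when visible
            estimate += gain * (cam - estimate)
        track.append(estimate)
    return track
```

During an occlusion the estimate follows the IMU alone and slowly drifts; once the camera regains sight of the target, the accumulated drift is corrected. A Kalman filter would replace the fixed gain with one computed from the assumed noise levels of each sensor.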
Workers' health and safety was found to be the most prolific research area. Even though wearable sensors are widely used in clinical settings for the remote monitoring of physiological parameters (e.g., heart rate, blood pressure, body temperature, VO2), only a single study [26] has employed multiple sensors for the measurement of such metrics in industrial scenarios. This can be attributed to the industries involved being interested in the prevention of work-related incidents that can lead to absence from work, rather than in the normative function of the workers' body. As anticipated, health and safety research focused on the most common musculoskeletal conditions (e.g., back pain) and injuries (e.g., trips or injuries due to bad body posture), while the industries in which workers deal with heavy biomechanical loads or a high risk of accidents (e.g., construction, Table 5) were the industries that drove the research. Fatigue and postural distress were also successfully detected by wearable inertial MoCap technologies [27,39,49,71,72]. When MoCap systems were combined with EMG sensors (Table 8), the musculoskeletal and postural evaluation of workers during generic physical activities (Table 5) was improved [29,48-51,54,57]. Inertial sensors also showed good results for the identification of hazardous events such as trips and falls in the construction industry [44,58,60,65,66,69,75], but the positions and numbers of the used IMUs were reported to impact on the intra-subject activity identification [26]. For example, fewer IMUs placed on specific anatomical sections (e.g., hip and neck) showed task classification performance similar to that of a greater number of IMUs distributed over the entire body [36]. In Kim et al. [36], a task classifier based on just two IMUs on the hip and head of the subject reached an accuracy of 0.7617, against 0.7983 for the classifier based on 17 IMUs placed on the entire body.
Activity recognition was also well performed by IMUs, and combined with activity duration measurements, made the evaluation of workers' productivity in jobsites possible [24]. This topic was also the focus of interest for more than 10% of the studies in the past years (Table 6). However, when the assessment involved the identification or classification of tasks [26], secondary sensors were frequently needed in addition to the IMUs (force cells, temperature sensors, etc.).
Advancements were also reported in the development of efficient data classification algorithms that require large data streams, such as machine learning-based classifiers (Table 7). The usage of such algorithms has been documented in 11 works out of a total of 59, and was accompanied by a very high level of accuracy. The classification output of the reviewed algorithms differed greatly between the reviewed works, and covered applications from activity and fatigue detection to tool condition monitoring and object recognition (Table 7). However, the need for large training datasets, which usually require expert manual labelling to be produced, contrasted with the very small sample sizes that were typically recruited (Table 8), thus potentially impeding the broader use of machine learning beyond the proof-of-concept stage in applied cases in industry. The general lack of information regarding the real-time capability of the presented classification algorithms was also identified as a potential drawback in real-world applications, suggesting that more work is required to address this challenge. Yet, the reviewed works generally outlined the capacity of MoCap sensors in conjunction with machine learning to provide solutions for activity recognition, tool inspection and ergonomic assessment in the workplace. These findings highlighted how the research activity on wearable systems for industrial applications is moving towards solutions that can be easily embedded in work clothes. Improving key factors such as wearability, washability, battery duration, data storage and edge computing will therefore be essential. This improvement in hardware design will have a direct impact on the amount and the quality of the collected data. This, in turn, will have a beneficial effect on software development, especially for machine learning applications, where large quantities of data are required.
In this regard, attempts should be made for the further development and commercial distribution of processing algorithms that would improve the ease of use of such systems and the data processing.
Direct evaluation of the accuracy and tracking performance of a developed MoCap system [14,55] was generally achieved through comparisons with a high-accuracy camera-based system. This is so far the most reliable process, as it guarantees an appropriate ground truth reference. However, the performance of algorithmic processes (e.g., evaluation of body postures or near-miss fall detections) was typically validated against visual observations of video recordings [69] or the ground truth that was provided by experts in the field [78], which potentially biases the reported accuracy of the respective method. As regards the use of commercially available MoCap solutions, a comparison was made of their limitations, advantages and applicability to industrial applications (Table 4), while the accuracy of off-the-shelf MoCap systems has also been extensively reviewed by van der Kruk and Reijne [82].
Even though all the reviewed works were individually assessed as being prone to medium or low risk of bias (Table 2), the main limitation at the study level was that more than half of the reviewed works (51%) did not properly report funding sources and conflicts of interest. This may indicate a critical source of bias, particularly in studies directly driven by the beneficiary industry, or in works that demonstrate MoCap systems that may become commercially available in the future. A limitation of this review stems from potential publication bias and selective reporting across studies, which may affect the accumulation of thorough evidence in the field. Efforts by industry bodies to incorporate MoCap applications in their facilities that were either unsuccessful or not disseminated in scientific journals were likely overlooked in this review. Finally, another limitation at the review level arises from the short review period, which restricted the reported findings to a period of five years; however, the selected review period returned an adequate number of records to justify the conclusions and expose trends (e.g., Figure 3), while also facilitating the reporting of multiple aspects of the reviewed articles, such as the studies' design and key findings (Table 8).

Conclusions
This systematic review has highlighted how the Industry 4.0 framework has led industrial environments to slowly incorporate MoCap solutions, mainly to improve workers' health and safety, increase productivity and improve industrial processes. Research was predominantly driven by the construction, robot manufacturing and automotive sectors. IMUs are still seen as the first choice for such applications, as they are relatively simple to operate, cost-effective, and have minimal impact on the industrial workflow. Moreover, inertial sensors have, over the years, met the performance (e.g., low power consumption, modularity) and size requirements needed for body activity monitoring, mostly in the form of wearable off-the-shelf systems.
In the coming years, the sensors and systems used in advanced industrial applications will become smarter, with built-in functions and embedded algorithms, such as machine learning and Kalman filters, incorporated in the processing of data streams retrieved by IMUs in order to increase their functionality and present a substitute for highly accurate (and expensive) camera-based MoCap systems. Furthermore, systems are expected to become smaller and more portable in order to interfere less with the workers and workplace, while real-time (bio)feedback should accompany health and safety applications to aid the adoption and acceptance of such technologies by industry workers. Marker-less MoCap systems, such as the Kinect, are low cost and offer adequate accuracy for certain classification and activity tracking tasks; however, efforts should be made to further develop and commercially distribute processing algorithms that would improve their ease of use and their capability to carry out data processing tasks. Optoelectronic systems have been widely and consistently used in robotics in recent years, particularly in the research field of collaborative systems, and have been shown to increase the safety of human operators. In the future, falling prices of optoelectronic sensors and the release of more compact and easier-to-implement hybrid and data fusion solutions, as well as next-generation wearable lens-less cameras [83][84][85], will lead to fewer obstructions on jobsites and improve the practicality of camera-based approaches in other industry sectors.

Conflicts of Interest:
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Table A1. Database search strings.