1. Introduction
The Internet has transformed society, both in the way we communicate and in our ability to access information. The evolution of technology has enabled the development of different and more effective devices that integrate into, and contribute to, an ever-growing internet network. This evolution has contributed to the emergence of new concepts, one of them being cyber-systems, defined as internet-based systems that aggregate several networking, processing, communication, and storage methods. This has allowed the development of cyber-physical systems (CPSs), which integrate software and communication with physical processes and in turn provide modeling abstractions, analysis, and design techniques for the entire system [1].
CPSs combine sensors and actuators into an intelligent system, with visualization methods (such as multimedia analytics) that help humans in many of their daily activities. Examples of their functionalities include flight control, electronic lodge windows in planes, anti-theft devices in automobiles, remote representation of house security, area administrations in priority development areas (PDAs), pacemakers in people, and robot-controlled operations in the medical field and in the field of e-business (Figure 1).
The development of CPSs covers a wide range of application areas, one of which is motion analysis—defined as the study of a sequence of images to obtain information on the movements that occur in these images, using computer vision, image processing, machine vision, and high-speed photography methods and software [
2].
Motion analysis focuses on the study and analysis of the movements performed by individuals during various activities in their natural, real execution environments. Nowadays, owing to new technologies for the acquisition and analysis of image data, in most situations there is no need to place position markers on the body of the individuals being analyzed. It has therefore become possible to analyze human movements in the natural environment of the activity, and the emergence of these markerless methods has contributed to the development and availability of new motion capture systems that are more accessible in terms of implementation, cost, and ease of use [3].
Among the different areas where motion analysis is applied, sport is an obvious one. Here, equipment that is easy to purchase and install, and whose operation does not interfere with the tasks carried out by the individuals under study, makes it possible, on the one hand, to collect, analyze, and present athletes’ performance data from training and competition in real time and, on the other, to develop automatic scoring and refereeing systems [3]. This trend contributes to the development of athlete performance feedback systems. Using real-time motion analysis methods, these systems can measure accelerations and forces and recognize and track movements, using these data to quantify the effort applied in executing a movement, among many other features. This evolution in real-time movement analysis has allowed the development of technological solutions that help athletes, coaches, and referees in certain tasks [4]. The main objective of these systems is to reduce the time the coach needs to analyze the athlete’s performance, enabling faster improvement in that performance [5,6].
Real-time motion analysis systems are divided into two main groups: optical systems, with or without markers, and non-optical systems, which use sensors placed on the athlete’s body to obtain movement information. Optical systems that use markers to identify specific areas of the individual’s body (e.g., joints) perform better. However, these results come at the price of a more complex and costly implementation, which is also more intrusive for the subjects under study. Markerless systems, on the other hand, are usually easier to implement and less intrusive during data collection [7].
One sport, more specifically a martial art, in which a motion analysis system would add value to the analysis of how athletes execute their movements is taekwondo. Taekwondo was introduced in Portugal in 1974 by Grand Master David Chung Sun Yong [8]. In recent decades, taekwondo has gained popularity and attracted many practitioners, having been granted Olympic sport status in 1994 and joined the Olympic program at the Sydney 2000 Olympic Games [9]. Since then, the number of federated athletes has been increasing, currently exceeding 4500 athletes in Portugal according to the Pordata website [10].
In taekwondo, as in martial arts in general, practice is based on the execution of dedicated movements that require complex skills and tactical excellence. The movements most often performed by athletes consist of specific postures and dynamic activities (blows, punches, kicks, throws, blocks, takedowns, among others) performed individually or in contact with the opponent. Most movements are highly dynamic, high in intensity, and very short in duration, and are usually called ballistic movements [11].
Regardless of the sport, evaluating the performance of athletes is a complex and difficult task for coaches. Technological development has produced systems that are easy to use and acquire and that assist coaches in assessing athletes’ performance, especially in sports with the greatest social impact and the greatest financial capacity to acquire such systems. In taekwondo, however, this development has not translated into specific technological tools that help evaluate athletes’ performance in a training environment. The evaluation of the athlete’s performance during training is currently carried out manually by the coach, either through occasional visual observation or through the review of training videos. Both are time-consuming, which makes it difficult for the coach to give the athlete quick feedback with which to change and adapt the training process [12]. Some of the systems developed for this evaluation provide relevant information about the athlete’s performance, such as velocity, acceleration, force, and displacement, among other characteristics [13,14,15]. Hence the need arose for a dedicated system to help evaluate the performance of taekwondo athletes in real time. To that end, this paper contributes a new method for identifying and quantifying the movements performed by taekwondo athletes during training sessions using deep-learning methods.
This paper is organized into six chapters, as described by the flowchart shown in Figure 2. The second chapter presents the state of the art; the third chapter presents the methodology used to validate the project; the fourth chapter describes the project’s architecture and the elements that constitute it; the fifth chapter presents the results obtained during the tests performed on the system; and the sixth chapter presents the final considerations along with proposals for further development of the project.
  2. State of the Art
Technological evolution has enabled the development of methods and tools whose objective is to help people in the execution of certain complex tasks. Among these tools, several devices have emerged that enable the monitoring and analysis of movements performed by the human body [
16].
Considering the martial arts, which include taekwondo, the evaluation of the athletes’ performance largely depends on the analysis of their movements. Therefore, it is of great importance for athletes and coaches to apply motion analysis methods during training. It offers the possibility of improving technical skills, through the correction of the trainees’ body movement, so that they can perform the movements correctly and more effectively in any sport [
17].
Among the studies based on motion analysis, some focus mainly on the movements and position of the athletes’ hands. This is the case of Suarez and Murphy [18], who present a summary of the different techniques used for classifying gestures and for locating hands.
Another area to which the evolution of technology has contributed is image and video acquisition. However, the steady increase in image resolution is not always an advantage: as the capture quality of traditional cameras rises, they become increasingly sensitive to the brightness and colors present in the environment, which makes correct digital analysis of the image difficult. Depth cameras have emerged as a suitable alternative in these situations. They capture depth images regardless of the orientation or intensity of the lighting or the color scheme of the environment. Among the existing depth sensors, one of the most used in scientific research on gesture classification and hand location is the Microsoft Kinect [19]. This is mainly due to its availability, portability, ease of configuration, and ability to create 3D images without markers. These characteristics enable the development of accessible 3D video movement systems, which can be used for the kinematic analysis of human joints and body segments in different areas.
Among the studies carried out using the Microsoft Kinect camera, Patsadu et al. used it to recognize human movements by applying data mining classification methods to video streams of twenty human body joint positions [20]. For the study, a system was created to recognize gesture patterns such as standing up, sitting down, and lying down. Back-propagation neural networks, decision trees, support vector machines, and naive Bayes were compared as classification methods. The best-performing method was the back-propagation neural network, which obtained 100% accuracy in recognizing the gestures under analysis. The average accuracy across all classification methods was 93.72%, which also confirmed that the Kinect camera effectively fulfills its purpose when applied to human body recognition.
Also using the Microsoft Kinect, Zerpa et al. compared its displacement measurements with those of the marker-based Peak Motus system [21]. The results led to the conclusion that the Microsoft Kinect, being a markerless system, was easier than Peak Motus in the configuration, data collection, and analysis steps.
Another study using the Microsoft Kinect presents a methodology for evaluating the performance of taekwondo athletes in real time. Image-processing techniques are used to identify and record the number of occurrences of an athlete’s movement in a taekwondo training environment [12]. Movements are identified by calculating the angles between the joints of the human body and comparing them with the reference values of each movement previously stored in a database.
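The joint-angle comparison described in that study can be illustrated with a minimal sketch. It assumes that each joint is given as an (x, y, z) coordinate and that a movement matches when every measured angle lies within a tolerance of the stored reference. The function names, the 10° tolerance, and the example coordinates are hypothetical and are not taken from the cited work.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at joint b, formed by the segments b->a and b->c."""
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    if norm == 0.0:
        raise ValueError("coincident joints: the angle is undefined")
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


def matches_reference(measured, reference, tolerance_deg=10.0):
    """True when every reference angle is matched within the tolerance (assumed rule)."""
    return all(
        abs(measured[name] - reference[name]) <= tolerance_deg
        for name in reference
    )


# Hypothetical hip, knee, and ankle positions in metres.
hip, knee, ankle = (0.0, 1.0, 0.0), (0.0, 0.5, 0.0), (0.5, 0.5, 0.0)
knee_angle = joint_angle(hip, knee, ankle)
```

In a real system the reference angles would be read from the database of movements rather than hard-coded, and the tolerance would need to be tuned per movement.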
Still on the subject of 3D cameras, another study presents a system for evaluating the performance of taekwondo athletes in real time during training sessions [15]. The 3D camera used in that study, however, is the Orbbec Astra, which was chosen over the Microsoft Kinect 2 because it is smaller, weighs less, and has a greater maximum range. The resulting system makes it possible to store information about athletes and movements. The movement data consist of the real-world Cartesian coordinates of the joints of the human body. The system provides the athlete and coach with information on the movements of the joints of the athlete’s hands and feet as numerical Cartesian coordinates, graphs of Cartesian coordinates, and graphs of velocity, all in real time.
Despite the significant number of studies that use the Microsoft Kinect for kinematic analysis of human movement, more studies are needed to guarantee its reliability and validity [22]. Nevertheless, some studies provide evidence that the evolution of markerless systems will greatly change the analysis of human movement and lead to their increasing use in human motion capture studies [23,24].
Not all motion analysis studies use cameras to obtain data. Some develop systems that use movement sensors placed at specific locations on the individual’s body to collect data on the movements performed. This approach could be an asset in motion analysis and contributes to the performance analysis of athletes. However, according to the literature, these systems still have limitations that need to be addressed [25].
Other authors follow a different approach using magneto-inertial technology, as they consider it to be a reliable tool capable of improving the athlete’s performance. They believe that this technology could be effective in injury prevention and improvement in training specificity in relation to the athlete’s profile, since the sensors used make it possible to estimate temporal, dynamic and kinematic parameters [
26].
Not all motion analysis studies focus on motion identification or performance analysis. Others seek to study the impacts suffered by athletes when practicing their sport [
27]. The authors in these studies use smart sensors and sensor fusion as tools to obtain information for analysis, creating application systems in the areas of biomedicine and sports, using techniques and application methods capable of processing physical variables associated with the human body. These systems can be applied in different areas of intervention such as rehabilitation, and development of an athlete’s performance, among others.
Regarding the motion sensor approach, one study used inertial measurement units (IMUs) and confirmed that impact signals combined with IMU data can provide a reliable scoring method, to which heart rate measurement can be added to monitor the athlete’s physical state. The approach consisted of integrating a “non-invasive” sensor system into the clothes worn by taekwondo athletes. Pressure sensors, thin-film piezoresistive force sensors, and accelerometers were used to measure the impact. Bluetooth was chosen for communication between the sensors and the computer, and this transmission protocol revealed a bandwidth limitation [28].
  3. Methodology
The system is designed as a technological solution for evaluating the performance of taekwondo athletes in real time. One of its intended functionalities is to identify and quantify the movements performed by athletes in real time. To satisfy this requirement, Human Action Recognition (HAR) techniques are needed to monitor and recognize human body movements. The literature shows different approaches to recognizing movements performed by individuals [29,30]. Accordingly, the methodologies and approaches used in these studies were tested to find the one that gives the best results for the particularities of the data.
  3.1. Deep-Learning Methodologies
Deep learning is a machine-learning technique that teaches computers to perform tasks in a way that resembles how humans perform them. Different deep-learning architectures exist, each using analysis methodologies suited to certain types of data and analysis contexts.
AlexNet’s win in the 2012 ImageNet competition demonstrated the effectiveness of convolutional neural networks (CNNs), which have since been applied successfully in many applications [31]. The good results obtained in image classification and object recognition led to these techniques also being used in sequential data analysis, both in video data and in 1D raw sequential data [32]. Convolution layers apply convolutions to the image to extract features that distinguish each defined class.
Recurrent neural networks (RNNs), a category that includes long short-term memory (LSTM) networks, are suited to problems involving time-ordered data, such as text recognition, audio files, and GPS paths. Such problems require information from the previous and subsequent moments to be known in order to process the current information [33].
It was for this purpose that recurrent cells were initially created, but they suffered from problems such as the vanishing gradient and the difficulty of retaining long-term (old) information, so that only short-term (recent) information was considered. The introduction of LSTM cells by Hochreiter and Schmidhuber in 1997 solved this problem [34]. As the name implies (long short-term), LSTM cells contain structures that make it possible to define the importance of the information received from t − 1, determining whether this information should be considered in the current cell and whether it should be passed on to t + 1.
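For reference, the gating mechanism can be written out as follows. This is the commonly used LSTM formulation with a forget gate, a component added after the original 1997 proposal, and it is not necessarily the exact variant used in this work. The symbols are notation introduced here: x_t is the input at time t, h_t the hidden state, c_t the cell state, σ the logistic sigmoid, and ⊙ the element-wise product.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) && \text{(forget gate: how much of } c_{t-1} \text{ to keep)} \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) && \text{(input gate: how much new information to store)} \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) && \text{(output gate: how much to pass on to } t+1\text{)} \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
```

The cell state c_t carries the long-term information, while the gates decide at each step what is kept from t − 1 and what is transmitted to t + 1.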
Other authors used the WISDM (Wireless Sensor Data Mining) dataset with data collected through a simple sensor that allows the identification of human movement in x, y, and z coordinates. This approach led to the conclusion that with this network model it was feasible to obtain good results for the classification of human activities such as running, sitting, standing, etc.
Models derived from LSTM have also been presented, such as the “Global Context-Aware Attention LSTM” model [35], which can assign greater importance to certain skeletal joints. This is useful because not all joints carry useful information, and some may introduce noise into the model. With this approach, the authors obtained better results than with the traditional LSTM model.
  3.2. Data Acquisition
The choice of data collection method is important if relevant results are to be obtained from the analysis. The method must therefore guarantee a correct and accurate collection of the intended data. When the analysis relies on deep-learning techniques for motion recognition, accurate data are vital to achieving the best results.
The system presented in this paper uses an Orbbec Astra 3D camera with a depth sensor for data collection. The camera is placed 1 m above the ground, in a straight line with the athlete, who stands 3 m away and performs the movements perpendicular to the camera. Data collection was carried out at the Polytechnic Institute of Cávado and Ave (IPCA) and at the University of Minho (UM), Portugal. To obtain data on taekwondo movements at different levels of execution, the group of athletes was diversified in terms of grade. Eight taekwondo athletes of different grades therefore took part in the data collection (three orange belts, seventh kup; two green belts, sixth kup; two blue belts, fourth kup; and one black belt, first dan), performing the movements defined for the developed system (Table 1).
Data collection allows the creation of a dataset with data on movements performed by taekwondo athletes. The dataset consists of three movements. Given the predominance of leg movements in taekwondo practice, alongside some arm movements, two leg movements (Miro Tchagui and Ap Tchagui) and one arm movement (Jirugui) were selected. The dataset is used to run the developed system and to verify the reliability of its data collection and analysis. In the future, the dataset is intended to include other movements performed in taekwondo.
This task was carried out using a system previously developed within the scope of this project [5,36], which collects the Cartesian coordinates of the joints in a three-dimensional environment for the movements performed by the athlete. The software is configured to register and analyze data at an interval of one millisecond between records, to guarantee the same data collection frequency regardless of the platform (computer) on which the system runs. The dataset is composed of three classes, each representing a different technique/movement performed by the athlete, with 200 samples per class, covering arm and leg techniques, along with a still position, performed by taekwondo athletes. The taekwondo techniques collected for the dataset are described in Table 1. In addition, non-movement (standing) data were collected and added to the system as a separate class in order to distinguish between movement and stasis.
Figure 3 shows the raw data over a sequence of 80 samples for the movement of the right joint, in x, y, and z coordinates, respectively. This dataset is essential for training the deep-learning classification methods needed to obtain correct results.
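One way to prepare such recordings for a sequence classifier is to cut each recording into fixed-length windows and attach a class label to each window. The sketch below assumes a window length of 80 frames, borrowed from the sequence length shown in Figure 3, and a hypothetical label encoding. Neither is confirmed as the preprocessing actually used in this work.

```python
# Hypothetical integer encoding of the three movements plus the standing class.
LABELS = {"miro_tchagui": 0, "ap_tchagui": 1, "jirugui": 2, "standing": 3}


def make_windows(frames, window=80, step=80):
    """Split a list of (x, y, z) frames into fixed-length windows.

    A trailing remainder shorter than `window` is dropped.
    """
    return [
        frames[start:start + window]
        for start in range(0, len(frames) - window + 1, step)
    ]


def build_dataset(recordings, window=80):
    """Turn (class_name, frames) recordings into parallel sample and label lists."""
    samples, labels = [], []
    for class_name, frames in recordings:
        for win in make_windows(frames, window=window, step=window):
            samples.append(win)
            labels.append(LABELS[class_name])
    return samples, labels
```

Choosing a step smaller than the window would produce overlapping windows, which increases the number of training samples at the cost of correlation between them.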
  4. Experimental Design
The main objective of the project presented in this paper is to develop a technological solution that makes it possible to evaluate the performance of taekwondo athletes in real time during training. Because such a system is complex, its development requires planning and is divided into specific tasks, each contributing elements needed by the system. The system is intended to provide different kinds of information from the analysis of the collected data: statistical results, biomechanical results, and movement analysis. The statistical results are obtained by identifying and quantifying the movements performed by the athlete, which allows the athlete’s evolution to be evaluated over the course of the training sessions. The biomechanical and movement analysis results give the acceleration, velocity, and applied force of the athlete’s movements.
This chapter presents the tasks that were performed and their role in the functioning of the final system. It covers the structure, the software, the IMUs, and the user interface developed so far.
  4.1. Framework
The system framework defined for the project is composed of a set of elements with specific characteristics, which when grouped determine and provide a tool capable of satisfying the project requirements.
During execution, the system registers data at an interval of one millisecond between records. Complex and effective motion data acquisition systems currently exist, such as the XSens MVN, but their cost of acquisition and use is high. There are, however, other devices, such as RGB 3D cameras with depth sensors, that are simpler to operate, less expensive, perform well, and are capable of collecting the same type of data. As shown in [5], a comparative study between the XSens MVN system and the Microsoft Kinect v2 demonstrated the reliability and precision of the RGB camera with a depth sensor in collecting data on the movements performed by taekwondo athletes. Thus, the developed system consists of a 3D camera with RGB and depth sensors, a computer for data processing and storage, motion sensors, and the developed software, as shown in Figure 4.
Among the 3D cameras suitable for the system, the Orbbec Astra was chosen after comparing its specifications with those of the Microsoft Kinect v2, the depth sensor most used in research (Table 2). The comparison showed that the Orbbec Astra is smaller, weighs less, does not require external power, and has a greater maximum range [15]. Another factor in the choice of the Orbbec Astra 3D camera was that it is among the best in its segment when compared with other 3D cameras with similar characteristics, with a very small measurement error of 0.2% [37].
In addition to the 3D camera, the system consisted of a computer, software developed specifically for the system and four motion sensors that use three-axis gyroscopes and accelerometers to obtain movement data. All system components were selected and developed with a focus on creating an easy-to-use system with the lowest possible implementation cost, in order to ensure greater ease of access to the system by taekwondo athletes and coaches.
The main functions of these components are to collect data on the athletes’ movements and process that information to return meaningful data to the coach and the athletes: identifying and quantifying the movements performed and calculating and presenting, for example, the velocity, acceleration, and applied force of the athletes’ limbs in real time [15].
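As an illustration of how velocity, acceleration, and force can be derived from sampled joint coordinates, the sketch below uses simple finite differences between consecutive samples and Newton’s second law. It assumes positions in metres, a constant sampling interval `dt` in seconds, and an effective limb mass supplied by the caller. These names and the estimation method are assumptions for illustration and may differ from the calculations used in the system.

```python
import math


def finite_difference(samples, dt):
    """First-order difference between consecutive 3D samples, divided by dt."""
    return [
        tuple((nxt[i] - cur[i]) / dt for i in range(3))
        for cur, nxt in zip(samples, samples[1:])
    ]


def magnitude(vector):
    """Euclidean norm of a 3D vector."""
    return math.sqrt(sum(component * component for component in vector))


def applied_force(effective_mass_kg, acceleration):
    """Force magnitude in newtons from F = m * |a| (effective mass is assumed)."""
    return effective_mass_kg * magnitude(acceleration)


# Hypothetical ankle positions (metres) sampled every 0.1 s.
positions = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.3, 0.0, 0.0)]
velocities = finite_difference(positions, dt=0.1)        # m/s
accelerations = finite_difference(velocities, dt=0.1)    # m/s^2
```

Finite differences amplify measurement noise, so in practice the coordinates are usually smoothed (for example with a moving average) before differentiation.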
  4.2. Software
The developed software can be considered the main element of the system, as it defines the operation and functionalities needed for the system to work correctly and for the information to be interpreted and presented correctly. Since the framework includes a computer on which the software runs, and the most used operating system on computers is Microsoft Windows [38], it was decided to develop a Windows application. It was written in C# using the Visual Studio integrated development environment (IDE), which was used to build the interface for acquiring and querying data and all the other code needed to implement the required functionalities. The information collected and processed is stored in a structured database and organized using the Structured Query Language (SQL).
To obtain data through the depth sensor of the Orbbec Astra 3D camera, a middleware solution for recognizing gestures for 3D skeleton tracking was used, namely the Nuitrack™ SDK. The choice of integrating this middleware was made because it allowed a correct acquisition of the values of the Cartesian coordinates of the athletes’ joints, considering the reach distance of the Orbbec Astra camera and correct recognition of the athlete’s body (
Figure 5).
One of the functionalities of the developed software is to gather information on the movements performed by the athletes during taekwondo practice in order to create a dataset. The dataset is organized by classes, where each class contains the information for a certain movement, namely the Cartesian coordinates of the athletes’ joints. These values are the X, Y, and Z coordinates of the joints in a three-dimensional environment during the movement. Figure 6 presents a graphical representation of the data obtained for a left ankle during the execution of an athlete’s movement.
One of the features of the system is to identify and quantify the movements performed by taekwondo athletes during training in real time. To implement this functionality, deep-learning methodologies were used to perform action recognition, analyzing and interpreting the movement from the information obtained from the skeleton. The dataset created is composed of three classes (Ap Tchagui, Miro Tchagui, and Jirugui), each representing a movement performed by the taekwondo athlete.
  4.3. Inertial Measurement Units
The most recent 3D cameras have features, such as depth sensors, which allow them to obtain additional information to the collected image, which provides a more efficient and complete data collection.
As described above, the developed system uses a 3D camera to acquire data on the athletes’ movements, collecting the coordinates of the body’s joints in a three-dimensional environment during the athlete’s training period. Despite the efficiency of the 3D camera in data collection, the movements under analysis involve moments of body rotation or overlapping limbs, so occlusion may occur and result in incorrect readings or loss of data. To prevent this, motion sensors were integrated in the form of inertial measurement units. These units have an accelerometer and a gyroscope, which improve accuracy and help obtain correct data, and they are worn by the athletes at the extremities of the upper and lower limbs (hands and feet). These locations were chosen because in taekwondo the movements are carried out mainly with the legs and arms, so the sensors can capture all movements performed with the upper and lower limbs. Using four sensors at these body regions also serves the objective of developing a low-cost and easy-to-use system (Figure 7).
These sensors must remain attached to the athlete during the execution of the movements, so their fixation must be resistant, comfortable, and as unobtrusive as possible so as not to affect the athlete’s performance. With this in mind, specific containers were designed and built to accommodate the hardware components in a compact and comfortable way (Figure 8).
Athletes perform fast, wide movements when practicing a martial art, and in taekwondo movement is concentrated in the lower limbs. The size and weight of the motion sensors worn by the athletes must therefore be considered. The sensors should be as unobtrusive as possible, so that they do not interfere with the athletes’ movements and allow them to be performed comfortably. In addition to intrusiveness, the cost and availability of the components and the ease of integrating them were also considered.
Based on these requirements, the following components were selected for the motion sensor units: a Wemos D1 mini Wi-Fi board; a GY 521 MPU 6050 sensor; a battery shield; and two 3.7 V 190 mAh Li-Po batteries.
The component responsible for data transmission was the Wemos D1 mini, a Wi-Fi card based on the ESP-8266, with 11 digital input/output pins and 1 analog input, visible in 
Figure 9a) [
39].
To obtain the acceleration and angular velocity data, the GY 521 MPU 6050 sensor was chosen (
Figure 9b), as it combines a three-axis gyroscope and a three-axis accelerometer and communicates over a standard I2C interface. The accelerometer has a selectable full-scale range of ±2 to ±16 g, and the gyroscope has a selectable full-scale range of ±250 to ±2000°/s [
40]. To control the power supply and charging of the system, a battery shield (
Figure 9c) was added, which allows the system to run from a battery and manages the switch between battery power and external power when connected to a USB charger [
41]. Finally, the battery selected was a 3.7 V 190 mAh Li-Po battery (
Figure 9d), chosen for its low weight, small size, and low cost, while still allowing the system to operate for an adequate period.
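To illustrate how the raw readings of this sensor relate to physical units, the following minimal sketch converts 16-bit register values into g and °/s. The scale factors are those listed in the MPU-6050 datasheet for each full-scale range; the function names and the default ranges (±2 g and ±250°/s) are assumptions made for illustration, since the configuration actually used in the system is not stated here.

```python
# Sensitivity scale factors from the MPU-6050 datasheet, keyed by full-scale range.
ACCEL_LSB_PER_G = {2: 16384.0, 4: 8192.0, 8: 4096.0, 16: 2048.0}
GYRO_LSB_PER_DPS = {250: 131.0, 500: 65.5, 1000: 32.8, 2000: 16.4}


def to_signed16(high: int, low: int) -> int:
    """Combine two register bytes (high byte first) into a signed 16-bit value."""
    value = (high << 8) | low
    return value - 0x10000 if value & 0x8000 else value


def accel_g(raw: int, full_scale_g: int = 2) -> float:
    """Convert a raw accelerometer reading to g (default range is an assumption)."""
    return raw / ACCEL_LSB_PER_G[full_scale_g]


def gyro_dps(raw: int, full_scale_dps: int = 250) -> float:
    """Convert a raw gyroscope reading to degrees per second (default range is an assumption)."""
    return raw / GYRO_LSB_PER_DPS[full_scale_dps]
```

A wider range trades resolution for headroom, which may matter for fast kicks: at ±16 g each least significant bit represents eight times more acceleration than at ±2 g.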
As this system was composed of several sensors, one for each limb, a communication structure was defined for the correct transmission and reception of data (
Figure 10). This structure manages the communication between the sensors and the software where the data are processed. A router provides access to the Wi-Fi network and manages communication between the components of the entire system. The sensor firmware was developed in ESP-IDF, Espressif’s official IoT development framework, together with a specific library that provides a correct and accurate reading of the MPU 6050 data. The User Datagram Protocol (UDP) was chosen for data communication, as it offers a simple communication model with a minimum of protocol mechanism.
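The receiving side of such a UDP link could be sketched as follows. This is not the system's actual implementation: the packet layout (a sensor identifier, a millisecond timestamp, and six raw 16-bit readings), the port number, and all names are hypothetical and chosen only to show how connectionless datagrams from several sensors might be collected on the computer.

```python
import socket
import struct
from typing import Iterator, NamedTuple, Optional, Tuple

# Hypothetical packet layout (little-endian): uint8 sensor id, uint32 timestamp in ms,
# then six int16 raw readings (ax, ay, az, gx, gy, gz).
PACKET_FORMAT = "<BI6h"
PACKET_SIZE = struct.calcsize(PACKET_FORMAT)


class ImuSample(NamedTuple):
    sensor_id: int
    timestamp_ms: int
    accel_raw: Tuple[int, int, int]
    gyro_raw: Tuple[int, int, int]


def parse_packet(data: bytes) -> ImuSample:
    """Decode one datagram; raise ValueError if its size does not match the layout."""
    if len(data) != PACKET_SIZE:
        raise ValueError(f"expected {PACKET_SIZE} bytes, got {len(data)}")
    sensor_id, ts, ax, ay, az, gx, gy, gz = struct.unpack(PACKET_FORMAT, data)
    return ImuSample(sensor_id, ts, (ax, ay, az), (gx, gy, gz))


def receive_samples(host: str = "0.0.0.0", port: int = 5005,
                    max_packets: Optional[int] = None) -> Iterator[ImuSample]:
    """Yield samples from all sensors that send datagrams to this port (port is an assumption)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        count = 0
        while max_packets is None or count < max_packets:
            data, _addr = sock.recvfrom(1024)
            try:
                yield parse_packet(data)
            except ValueError:
                continue  # UDP offers no delivery guarantees, so malformed datagrams are skipped
            count += 1
```

Because UDP neither retransmits nor orders datagrams, carrying a sensor identifier and a timestamp in every packet lets the receiver separate the four streams and detect gaps without any connection state.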
  4.4. User Interface
Any software that users interact with, whether to enter data or to access and retrieve it, needs to provide an interface that enables this interaction.
As this system was aimed at taekwondo coaches and athletes, this need for interaction applied to both data input and output. Thus, during the development of the software, care was taken to create a user interface that met these requirements and was intuitive to use. The software layout was composed of several interfaces, each one performing a certain action (
Figure 11). The software menu consisted of the following options: Home; Add athlete; Training session; Consult training; Configure sensors; Exit.
When starting the software or selecting the “Home” option in the menu, the user saw the interface of 
Figure 11a, which showed how many athletes were in the database and the number of training sessions carried out, as well as updated news and the YouTube channel of the World Taekwondo Federation. The “Add athlete” option gave access to the interface (
Figure 11b) that provided the fields needed to create the athlete’s record and the buttons to save or cancel the information. The “Training session” option gave access to the interface (
Figure 11c) that enabled the selection of the athlete who would carry out the training, automatically filling in the fields with the athlete’s data, so that only the technique to be trained and the training location had to be added. The “Continue” button gave access to the interface (
Figure 11d) where, once the user clicked the “Start training” button, the visual information of the athlete’s joints obtained by the depth sensor was presented together with the real-time values of acceleration, speed, and strength of the athlete’s feet and hands.
Once the training had started, the interface (
Figure 11e) presented the visual information of the athlete’s joints, obtained by the depth sensor, and the values of acceleration, speed, and strength of the athlete’s feet and hands in real time until the training was finished by clicking the “Finish training” button. To consult training information, the user returned to the initial interface and selected the “Consult training” option from the menu, which led to the interface (
Figure 11f) where it was possible to select which athlete to consult, with all the training sessions performed by that athlete shown in a table. Selecting one of the sessions in the table displayed the interface (
Figure 11g) with information about the session and two buttons, “Back” and “Export”. Selecting the “Export” button created an xlsx file containing the values collected during training, such as the Cartesian coordinates and the acceleration, velocity, and force values. The “Configure sensors” menu option presented the interface (
Figure 11h) where it was possible to verify the operation of the motion sensors and calibrate them before starting the training sessions. The “Exit” menu option terminated the software.
The layout presented above is the one currently implemented in the developed software. However, a new version of the layout is being developed. The project from which this software results envisages adding other functionalities to the system while maintaining usability and making the interface more up to date and intuitive, considering its area of application. To this end, a study was carried out to explore how technology combined with design can help optimize and improve the performance of taekwondo athletes [
42]. The preliminary version of the software interface design resulting from this study is presented in 
Figure 12.
  5. Results
There are different deep-learning architectures, each suited to certain analysis contexts and data types. To understand which architecture best suited the data types in the dataset, several architectures were tested, specifically an LSTM, a Convolutional Neural Network plus LSTM (CNN+LSTM), and an LSTM with integrated convolution (ConvLSTM). To test each of the deep-learning architectures, the steps shown in 
Figure 13 were performed.
Each deep-learning architecture produced its own confusion matrix. As a performance measure for classification problems, this matrix helped to determine how the classification models performed on the test dataset. The values in the confusion matrix describe how the model behaved in each of the analyzed classes; the closer the proportion of correctly classified samples is to 1, the greater the accuracy of the model. This allowed us to understand which deep-learning model was most suitable for the type of data to be analyzed.
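As an illustration of how a confusion matrix and the accuracy derived from it can be computed, consider the minimal sketch below. It uses plain Python and integer class indices; the function names are chosen for this example and are not taken from the project's code.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def accuracy(matrix: List[List[int]]) -> float:
    """Fraction of all samples that lie on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_recall(matrix: List[List[int]]) -> List[float]:
    """Diagonal of the row-normalized matrix: share of each true class classified correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]
```

Off-diagonal cells show which pairs of classes the model confuses, which is the kind of information used when discussing techniques that resemble each other.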
After testing the different deep-learning architectures and analyzing the results obtained with each of them, the models with the best results were applied to the taekwondo dataset [
43]. This dataset was created with the system described above and was used for training, validation, and inference with the deep-learning models presented above. The data were split into 90% for training and 10% for validation. Observing the confusion matrices of the models tested (
Figure 14, 
Figure 15 and 
Figure 16), we could verify that the standard error of measurement in the overall class of movements was <=11% for the worst case (CNN-LSTM model, movement 
Ap Tchagui).
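A 90%/10% division of a dataset into training and validation subsets can be sketched as follows. Whether the original split was shuffled or stratified by movement class is not stated, so the seeded random shuffle used here is an assumption, as are the function and parameter names.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_train_validation(samples: Sequence[T], validation_fraction: float = 0.10,
                           seed: int = 0) -> Tuple[List[T], List[T]]:
    """Randomly hold out a fraction of the samples for validation (shuffling is an assumption)."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    n_validation = round(len(samples) * validation_fraction)
    validation_indices = set(indices[:n_validation])
    train = [s for i, s in enumerate(samples) if i not in validation_indices]
    validation = [s for i, s in enumerate(samples) if i in validation_indices]
    return train, validation
```

With movement classes of unequal size, a stratified split would keep the class proportions equal in both subsets; the simple version above does not guarantee that.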
When analyzing the training results of the different architectures, the training loss and accuracy values indicate how well the model fits the training data. Thus, the architecture with the lowest loss and the highest accuracy is the most suitable for the system data.
As in the LSTM model, 18 epochs were used to train the ConvLSTM model. After training with this number of epochs, the validation results of this model showed an accuracy of 0.9730 (
Figure 15). The ConvLSTM model therefore produced the worst results of the models considered in this study. This result may be explained by the fact that in some cases there was confusion both between the non-movement and the 
Miro Tchagui movement, and between the non-movement and the 
Jirugui movement.
The CNN LSTM model was the last one to be trained. This was a hybrid model composed of CNN layers followed by an LSTM. This model reached an accuracy of 0.9820 on the validation data (
Figure 16). In the CNN LSTM model there was also some confusion between the 
Ap Tchagui movement and the 
Miro Tchagui movement, which are similar techniques.
The results presented in 
Table 3 make it possible to state that all the trained models achieved very satisfactory results. The best performance was obtained by the LSTM model, which, despite being the simplest model, presented the best results on the same validation data, with an accuracy of 0.9910 (
Figure 14), followed by the CNN LSTM model (
Figure 16), with an accuracy of 0.9820 and, last, the ConvLSTM model, with an accuracy of 0.9730 (
Figure 15).
Besides the validation results, another important aspect is which of the models offers the best temporal performance in real-time operation, since one of the main objectives of this system is to identify and quantify the movements of taekwondo athletes in real time. To measure the inference time under conditions closer to those found in the real context of the application, an HTTP server was created to respond to the inference requests sent by clients (
Figure 17).
The client is thus responsible for sending the request via HTTP POST with the movement sequence as a JSON object. On the server side, the data are pre-processed and then fed to the previously trained model, which returns the class corresponding to the movement inferred by the network (
Figure 18).
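The request and response flow described above can be sketched with Python's standard library. This is an illustrative outline under stated assumptions, not the server used in the study: the JSON field name `sequence`, the class labels, the port, and in particular the `predict` function, which is a trivial placeholder standing in for the trained network, are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import List

# Hypothetical and incomplete label set, used only for illustration.
CLASS_NAMES = ["no_movement", "jirugui", "ap_tchagui", "miro_tchagui"]


def preprocess(sequence) -> List[List[float]]:
    """Check that the sequence is a non-empty list of non-empty frames and cast values to float."""
    if not isinstance(sequence, list) or not sequence:
        raise ValueError("sequence must be a non-empty list of frames")
    if not all(isinstance(frame, list) and frame for frame in sequence):
        raise ValueError("each frame must be a non-empty list of numbers")
    return [[float(value) for value in frame] for frame in sequence]


def predict(frames: List[List[float]]) -> int:
    """Placeholder for the trained model: a threshold on mean magnitude, NOT a real classifier."""
    n_values = sum(len(frame) for frame in frames)
    mean_abs = sum(abs(v) for frame in frames for v in frame) / n_values
    return 0 if mean_abs < 0.1 else 1


def handle_payload(body: bytes) -> dict:
    """Decode the JSON request body, run pre-processing and inference, build the response."""
    payload = json.loads(body)
    frames = preprocess(payload["sequence"])
    return {"class": CLASS_NAMES[predict(frames)]}


class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            result, status = handle_payload(self.rfile.read(length)), 200
        except (ValueError, KeyError, TypeError) as exc:
            result, status = {"error": str(exc)}, 400
        data = json.dumps(result).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def run_server(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Serve inference requests until interrupted (host and port are assumptions)."""
    HTTPServer((host, port), InferenceHandler).serve_forever()
```

Calling `run_server()` starts the server; a client would then POST a body such as `{"sequence": [[0.1, 0.2, 0.3]]}` and read the inferred class from the JSON response. Keeping the decoding and inference in `handle_payload` separates the model logic from the HTTP layer, so the response time of each part can be measured independently.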
All models were tested to see whether there were significant differences in response time between them. From the results presented in 
Table 4, it was possible to conclude that the differences in response time were not significant (282 to 289 ms). The model that best suited our needs was therefore the one with the best inference accuracy, the LSTM model.
The choice of the deep-learning architecture to be included is extremely important for the correct and efficient functioning of the developed system, as it will allow an accurate identification of the taekwondo movements and the presentation to the user in real time. Moreover, it contributes to a greater degree of reliability and validity of the system.
  6. Discussion and Future Work
The project presented in this paper aims to develop a technological tool to evaluate the performance of taekwondo athletes in real time by designing and implementing a user-friendly, low-cost system. Due to the specifics of the movements performed in taekwondo, this system should be as unobtrusive as possible for the athletes. The system consists of a 3D camera with a depth sensor, wearable motion sensors on the upper and lower limbs of the athletes, and a computer running software developed specifically for the system.
This system makes it possible to collect data on the athletes’ movements and provides real-time values of speed, acceleration, and strength for the athlete’s four limbs. It also identifies and counts the movements performed by the athletes. For this, the LSTM deep-learning architecture is applied to the dataset created with the collected data. The LSTM architecture was chosen after testing it against other architectures, namely CNN+LSTM and ConvLSTM, with LSTM obtaining the best accuracy, 0.9910.
Although this system was developed for taekwondo practice, it could be used in other sports with similar characteristics. It is therefore expected that it will be tested and implemented in other martial arts and sports, such as kung fu, hapkido, kickboxing, judo, boxing, athletics, tennis, and badminton. In all of these, the evaluation of the athletes’ performance is extremely important and is carried out through the analysis of the athletes’ movements during the practice of the sport.
As future work, we intend to continue collecting data on more taekwondo movements in order to enlarge and improve the dataset. This will expand the number of movements that the system is able to identify and count. Throughout the project, we also intend to gather information that will support a sound statistical assessment of the reliability and validity of the developed system.
Regarding the data output made available to athletes and coaches, we intend to add more information to the results of the training consultations, for example, a chronological view of the values obtained by the athlete throughout training, with graphs to make the information more visual. At the same time, we intend to migrate the current layout of the software to the newly developed layout, which presents the information with a more current design and organization and in a more visual way, making the user experience more pleasant and intuitive.
Mobile devices are now a constant presence in everyday life, and some currently have processing capacity comparable to that of personal computers. Thanks to this high processing capacity, achieved through multiprocessor architectures, they can also be used to perform a variety of tasks. We therefore also plan to develop a version of the software for mobile devices that provides functionality similar to the desktop version.
The developed system uses a 3D RGB camera with a depth sensor, which obtains data on the athletes’ movements from the coordinates of the joints of the human body, identified via software. Despite the reliability of this implementation, the characteristics of the movements under analysis mean that joints may go undetected during rotation movements, causing occlusion. To fill this gap, IMU-based motion sensors were developed, which provide additional information and allow these possible data gaps to be filled. These sensors communicate via Wi-Fi with the system element that stores and analyzes the data (the computer). This method of communication, although effective, depends on a Wi-Fi network being available in the place where the system is used, which could be a handicap, making it impossible to use the system in certain places and on certain occasions. We therefore intend to develop a version of the system whose components do not depend on Wi-Fi to communicate.
   
  
    Author Contributions
Conceptualization, P.C., P.B., F.F., T.S., N.M., F.S. and V.C.; methodology, P.C., P.B., F.F., T.S., N.M., F.S. and V.C.; software, P.C., P.B., F.F. and T.S.; validation, P.C., P.B., F.F., T.S., N.M., F.S. and V.C.; formal analysis, P.C. and V.C.; investigation, P.C., P.B., F.F. and T.S.; resources, P.C., P.B., F.F., T.S., N.M., F.S. and V.C.; writing – original draft preparation, P.C., P.B., T.S. and V.C.; writing – review and editing, P.C. and V.C.; visualization, P.C. and V.C.; supervision, N.M., F.S. and V.C.; project administration, F.S. and V.C.; funding acquisition, F.S. and V.C. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by FCT—Fundação para a Ciência e Tecnologia (Portugal) grant number SFRH/BD/121994/2016 and FCT RD Units Projects Scope: UIDB/04077/2020, UIDB/00319/2020, UIDB/05549/2020 and UIDP/05549/2020.
Data Availability Statement
Not applicable.
Acknowledgments
The authors are grateful to FCT which partially supported this work financially. Additional thanks go to coaches Joaquim Peixoto and Pedro Campaniço from Sport Club Braga Taekwondo Team (Portugal) and Suraj Maugi from Minho University Taekwondo Team (Portugal).
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Cyber-Physical Systems (CPS) (nsf10515). Available online: https://www.nsf.gov/pubs/2010/nsf10515/nsf10515.htm (accessed on 23 November 2022).
2. Cyber-Physical Systems - A Concept Map. Available online: https://ptolemy.berkeley.edu/projects/cps// (accessed on 23 November 2022).
3. King, B.A.; Paulson, L.D. News Briefs. Computer (Long Beach Calif.) 2007, 40, 13–16.
4. Thomas, G.; Gade, R.; Moeslund, T.B.; Carr, P.; Hilton, A. Computer Vision for Sports: Current Applications and Research Topics. Comput. Vis. Image Underst. 2017, 159, 3–18.
5. Cunha, P.; Carvalho, V.; Soares, F. Real-Time Data Movements Acquisition of Taekwondo Athletes: First Insights. Lect. Notes Electr. Eng. 2019, 505, 251–258.
6. Liebermann, D.G.; Katz, L.; Hughes, M.D.; Bartlett, R.M.; McClements, J.; Franks, I.M. Advances in the Application of Information Technology to Sport Performance. J. Sport. Sci. 2010, 20, 755–769.
7. Pueo Ortega, B.; Jiménez Olmedo, J.M. Application of Motion Capture Technology for Sport Performance Analysis. Retos Nuevas Tend. Educ. Física Deporte Recreación 2017, 32, 241–247.
8. Chung, S.-Y. Wiki Sporting. Available online: https://www.wikisporting.com/index.php?title=Chung_Sun-Yong (accessed on 23 November 2022).
9. [World Taekwondo] World Taekwondo Celebrate 25 Years on the Olympic Programme. Available online: http://www.worldtaekwondo.org/competition/view.html?nid=131436 (accessed on 23 November 2022).
10. Dados Sobre Jogadores Federados em Portugal | Pordata. Available online: https://www.pordata.pt/portugal/praticantes+desportivos+federados+total+e+por+todas+as+federacoes+desportivas-2227 (accessed on 23 November 2022).
11. VencesBrito, A.M.; Rodrigues Ferreira, M.A.; Cortes, N.; Fernandes, O.; Pezarat-Correia, P. Kinematic and Electromyographic Analyses of a Karate Punch. J. Electromyogr. Kinesiol. 2011, 21, 1023–1029.
12. Pinto, T.; Faria, E.; Cunha, P.; Soares, F.; Carvalho, V.; Carvalho, H. Recording of Occurrences through Image Processing in Taekwondo Training: First Insights. Lect. Notes Comput. Vis. Biomech. 2018, 27, 427–436.
13. Arastey, G.M. Computer Vision in Sport | Sport Performance Analysis. Available online: https://www.sportperformanceanalysis.com/article/computer-vision-in-sport (accessed on 23 November 2022).
14. Nadig, M.; Kumar, S. Measurement of Velocity and Acceleration of Human Movement for Analysis of Body Dynamics. Measurement 2015, 3.
15. Cunha, P.; Carvalho, V.; Soares, F. Development of a Real-Time Evaluation System for Top Taekwondo Athletes SPERTA. In Proceedings of the Ninth International Conference on Sensor Device Technologies and Applications, Venice, Italy, 16–20 September 2018; pp. 140–145.
16. Vera-Rivera, J.L.; Ortega-Parra, A.J.; Ramírez-Ortiz, Y.A. Impact of Technology on the Evolution of Sports Training. J. Phys. Conf. Ser. 2019, 1386, 012144.
17. Păunescu, C.; Paunescu, M.; Haddad, M. Evaluation & Assessment in Taekwondo. In Performance Optimization in Taekwondo: From Laboratory to Field; Monoem, H., Ed.; OMICS Group eBooks: Foster City, CA, USA, 2014; pp. 61–71.
18. Suarez, J.; Murphy, R.R. Hand Gesture Recognition with Depth Images: A Review. In Proceedings of the IEEE International Workshop on Robot and Human Interactive Communication, Paris, France, 9–13 September 2012; pp. 411–417.
19. Kinect Para Windows - Windows Apps | Microsoft Learn. Available online: https://learn.microsoft.com/pt-pt/windows/apps/design/devices/kinect-for-windows (accessed on 24 November 2022).
20. Patsadu, O.; Nukoolkit, C.; Watanapa, B. Human Gesture Recognition Using Kinect Camera. In Proceedings of the JCSSE 2012 - 9th International Joint Conference on Computer Science and Software Engineering, Bangkok, Thailand, 30 May–1 June 2012; pp. 28–32.
21. Zerpa, C.; Lees, C.; Patel, P.; Pryzsucha, E. The Use of Microsoft Kinect for Human Movement Analysis. Int. J. Sport. Sci. 2015, 5, 120–127.
22. Polak, E.; Kulasa, J.; Vences de Brito, A.; Castro, M.A.; Fernandes, O. Motion Analysis Systems as Optimization Training Tools in Combat Sports and Martial Arts. Rev. Artes Marciales Asiáticas 2015, 10, 105–123.
23. Robertson, D.G.E.; Caldwell, G.E.; Hamill, J.; Kamen, G.; Whittlesey, S.N. Research Methods in Biomechanics - Google Livros. Available online: https://books.google.pt/books?hl=pt-PT&lr=&id=_u56DwAAQBAJ&oi=fnd&pg=PR1&dq=Research+Methods+in+Biomechanics&ots=ConkQFLtgN&sig=riVY54GFRMkgP43XWHZ5ZM8RqZw&redir_esc=y#v=onepage&q=Research%20Methods%20in%20Biomechanics&f=false (accessed on 24 November 2022).
24. Corazza, S.; Mündermann, L.; Gambaretto, E.; Ferrigno, G.; Andriacchi, T.P. Markerless Motion Capture through Visual Hull, Articulated ICP and Subject Specific Model Generation. Int. J. Comput. Vis. 2009, 87, 156–169.
25. Li, R.T.; Kling, S.R.; Salata, M.J.; Cupp, S.A.; Sheehan, J.; Voos, J.E. Wearable Performance Devices in Sports Medicine. Approach 2015, 8, 74–78.
26. Camomilla, V.; Bergamini, E.; Fantozzi, S.; Vannozzi, G. Trends Supporting the In-Field Use of Wearable Inertial Sensors for Sport Performance Evaluation: A Systematic Review. Sensors 2018, 18, 873.
27. Mendes, J.J.A.; Vieira, M.E.M.; Pires, M.B.; Stevan, S.L. Sensor Fusion and Smart Sensor in Sports and Biomedical Applications. Sensors 2016, 16, 1569.
28. Amaro, B.; Antunes, J.; Cunha, P.; Soares, F.; Carvalho, V.; Carvalho, H. Monitoring of Bioelectrical and Biomechanical Signals in Taekwondo Training: First Insights. Lect. Notes Comput. Vis. Biomech. 2018, 27, 417–426.
29. Wang, P.; Li, W.; Ogunbona, P.; Wan, J.; Escalera, S. RGB-D-Based Human Motion Recognition with Deep Learning: A Survey. Comput. Vis. Image Underst. 2018, 171, 118–139.
30. Kong, Y.; Fu, Y. Human Action Recognition and Prediction: A Survey. Int. J. Comput. Vis. 2022, 130, 1366–1401.
31. Ismail Fawaz, H.; Forestier, G.; Weber, J.; Idoumghar, L.; Muller, P.A. Deep Learning for Time Series Classification: A Review. Data Min. Knowl. Discov. 2019, 33, 917–963.
32. Khan, A.; Sohail, A.; Zahoora, U.; Qureshi, A.S. A Survey of the Recent Architectures of Deep Convolutional Neural Networks. Artif. Intell. Rev. 2020, 53, 5455–5516.
33. Mittal, A. Understanding RNN and LSTM. What is Neural Network? by Aditi Mittal | Medium. Available online: https://aditi-mittal.medium.com/understanding-rnn-and-lstm-f7cdf6dfc14e (accessed on 24 November 2022).
34. Understanding LSTM Networks - Colah’s Blog. Available online: https://colah.github.io/posts/2015-08-Understanding-LSTMs/ (accessed on 24 November 2022).
35. Liu, J.; Wang, G.; Hu, P.; Duan, L.-Y.; Kot, A.C. Global Context-Aware Attention LSTM Networks for 3D Action Recognition. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 1647–1656.
36. Cunha, P.; Barbosa, P.; Ferreira, F.; Fitas, C.; Carvalho, V.; Soares, F. Real-Time Evaluation System for Top Taekwondo Athletes: Project Overview. In Proceedings of the BIODEVICES 2021 - 14th International Conference on Biomedical Electronics and Devices, Vienna, Austria, 11–13 February 2021; Part of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2021. pp. 209–220.
37. Giancola, S.; Valenti, M.; Sala, R. Metrological Qualification of the Orbbec Astra STM Structured-Light Camera. SpringerBriefs Comput. Sci. 2018, 1, 61–69.
38. Operating System Market Share Portugal | Statcounter Global Stats. Available online: https://gs.statcounter.com/os-market-share/all/portugal (accessed on 30 November 2022).
39. LOLIN D1 Mini v3.1.0 - WEMOS Documentation. Available online: https://www.wemos.cc/en/latest/d1/d1_mini_3.1.0.html (accessed on 24 November 2022).
40. MPU6050 Module Pinout, Configuration, Features, Arduino Interfacing & Datasheet. Available online: https://components101.com/sensors/mpu6050-module (accessed on 24 November 2022).
41. Battery Shield - WEMOS Documentation. Available online: https://www.wemos.cc/en/latest/d1_mini_shield/battery.html (accessed on 24 November 2022).
42. Silva, T.; Martins, N.; Cunha, P.; Carvalho, V.; Soares, F. Development and Design of an Evaluation Interface for Taekwondo Athletes: First Insights. Lect. Notes Inst. Comput. Sci. Soc.-Inform. Telecommun. Eng. 2022, 435, 7–20.
43. Barbosa, P.; Cunha, P.; Carvalho, V.; Soares, F. Classification of Taekwondo Techniques Using Deep Learning Methods: First Insights. In Proceedings of the BIODEVICES 2021 - 14th International Conference on Biomedical Electronics and Devices, Vienna, Austria, 11–13 February 2021; Part of the 14th International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2021. pp. 201–208.
  
    
  
  
Figure 1. Diagrammatic layout for CPSs.

Figure 2. Paper structure flowchart.

Figure 3. Raw data from right hand joint during Jirugui movement.

Figure 4. Framework system architecture.

Figure 5. Identification of skeletal joints with Nuitrack™ middleware, via depth sensor.

Figure 6. Raw data from left ankle joint.

Figure 7. Positioning of motion sensors in the taekwondo athlete.

Figure 8. Motion sensor system hardware container developed.

Figure 9. Wemos D1 mini Wi-Fi board (a), GY 521 MPU 6050 (b), battery shield (c) and Li-Po battery motion sensor connection diagram (d).

Figure 10. Motion sensors’ system architecture diagram.

Figure 11. Software interfaces layout. Application start screen (a), athlete data entry screen (b), data entry screen for the training to be performed (c), training start screen (d), training visualization screen with speed, acceleration and strength data (e), training consultation screen by athlete (f), athlete training data display screen (g) and motion sensors configuration screen (h).

Figure 12. Preliminary version of the software interface design.

Figure 13. Deep-learning methods testing system diagram.

Figure 14. Training results and confusion matrix from LSTM model.

Figure 15. Training results and confusion matrix from ConvLSTM model.

Figure 16. Training results and confusion matrix from CNN LSTM model.

Figure 17. POST inference.

Figure 18. POST inference response.
  
 
  
    
  
  
Table 1. Taekwondo movements collected.

Table 2. Orbbec Astra and Kinect v2 specifications comparison.

| Specification | Orbbec Astra | Kinect v2 |
|---|---|---|
| Size | 160 × 30 × 40 mm | 249 × 66 × 67 mm |
| Weight | 300 g | 970 g |
| Range | 0.4–8 m | 0.5–4.5 m |
| Depth Image Size | 640 × 480 px (VGA), 16 bit @ 30 fps | 512 × 424 px @ 30 fps |
| RGB Image Size | 1280 × 960 px @ 10 fps | 1920 × 1080 px @ 30 fps |
| Field of View | 60° H × 49.5° V (73° diagonal) | 70.6° H × 60° V (89.5° diagonal) |
| Microphones | 2 | 4 |
| External Power | No | Yes |

Table 3. Accuracy results from all tested deep-learning architectures.

| Model | Accuracy |
|---|---|
| LSTM | 0.9910 |
| CNN LSTM | 0.9820 |
| ConvLSTM | 0.9730 |

Table 4. Summary of HTTP inference response time.

| Model | HTTP Response Time (ms) | Total Parameters |
|---|---|---|
| LSTM | 288 | 82,904 |
| CNN LSTM | 289 | 197,368 |
| ConvLSTM | 282 | 276,184 |
      
 
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
      
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).