Intelligent Performance Evaluation in Rowing Sport Using a Graph-Matching Network

Rowing competitions require consistent rowing strokes among crew members to achieve optimal performance. However, existing motion analysis techniques often rely on wearable sensors, which are inconvenient for athletes. The aim of our work is to use a graph-matching network to analyze the similarity in rowers' rowing postures and, on that basis, pair rowers to improve the performance of their rowing team. This study proposed a novel video-based performance analysis system to analyze paired rowers using a graph-matching network. The proposed system first detected human joint points using the OpenPose system, and then the graph-embedding model and graph-matching network model were applied to analyze similarities in rowing postures between paired rowers. When analyzing the postures of the paired rowers, the proposed system detected the same starting point of their rowing postures to achieve more accurate pairing results. Finally, variations in the similarities were displayed using the proposed time-period similarity processing. The experimental results show that the proposed time-period similarity processing of the 2D graph-embedding model (GEM) had the best pairing results.


Introduction
In recent years, many studies have been conducted that analyzed the performance of individual players in sports domains [1][2][3][4][5][6][7][8]. While there are many studies that focus on motion analysis, most require athletes to wear sensors to acquire their posture data; however, this method easily introduces noise due to friction during motions [1,2,8]. Therefore, the video-based contactless approach, which aims to acquire whole posture data using a simple camera without placing sensors on body parts, has made it easier and more convenient for scholars to analyze a variety of postures in many exciting sports domains [3,4,6,7].
Video-based motion analysis has witnessed numerous applications, including human-computer interaction systems [3], human action understanding systems [4], medical assistance systems [9], and human pose estimation [10], all based on deep learning and recognition models. The OpenPose system [10] has proven to be valuable in capturing the skeletal joints of individuals, even in complex postures, using a simple camera without the need for specialized hardware like the Kinect device [3,9]. However, as most previous applications are relevant only for recognizing simple postures, the issue of how to analyze the complex postures of sports players has been neither addressed nor effectively analyzed.
Several previous works [10][11][12][13][14][15][16][17][18][19][20][21][22] have utilized the OpenPose model [10] for various applications. Qiao et al. [11] used a series of coordinate trajectories of joint points to draw a curve that was used to determine whether Tai Chi movements were standard. Tsai et al. [12] estimated the depth distance between a person and a lens in a single image using the OpenPose model to capture the coordinates of the human body keypoints. Nakai et al. [13] proposed a prediction method for a basketball shooting system, which detected human body keypoints using the OpenPose model and predicted basketball free throw shooting using a logistic regression model. In addition, several applications based on deep learning have also been proposed. Toshev et al. [14] proposed a state-of-the-art approach based on deep neural networks (DNNs) for human pose estimation. Xiao et al. [15] proposed a simple baseline method based on ResNet architectures [16] as the backbone network for human pose estimation and tracking. Zhang et al. [17] proposed a golf analysis system that includes a human detection subsystem and a performance analysis subsystem. In the human detection subsystem, they applied the OpenPose model, human tracking, and an LSTM deep learning model to detect golf players' postures, while the performance analysis subsystem scored the comparison results of the golf players' postures. Theagarajan et al. [18] proposed an automatically generated visual analytics and player statistics system for soccer players based on convolutional neural networks (CNNs) and deep convolutional generative adversarial networks (DCGANs). These systems can successfully perform tasks such as predicting basketball shots and assessing whether sports movements are standard. However, they still experience a decrease in accuracy when used in complex environments. For instance, object appearance variations and lighting changes, factors present in a complex environment, make it challenging for the systems to accurately capture and analyze information from images [19][20][21]. Moreover, the issue of pairing athletes to work together to enhance the performance of a sports team has not been addressed.
This study proposed a video-based performance analysis system for analyzing the performance of rowing pairs using a graph-matching network. The system first detected and acquired human joint points from a rowing video using the OpenPose system. Then, the detected human joints were represented as graph structures from which the features of the rowing posture process were extracted. The rowing posture feature of each video frame was extracted using the graph-embedding model (GEM) and graph-matching network (GMN) model. Then, a video comparison baseline process was developed to align the two video sequences, and the performances of the rowing pairs were measured by calculating the rowing posture similarities in the pair using the GEM and GMN models. Finally, the proposed time-period similarity processing method was used to distinguish the degree of similarity changes in the players in the video segments. Experiments were carried out using a dataset of 15 test rowing players, and the results show that the proposed approach can effectively evaluate the performances of rowing pairs and provide good suggestions for coaches when creating player groupings and training programs.
In the sport of rowing, consistency in the rowing movements among team members has a significant impact. Well-coordinated rowing motions contribute to achieving outstanding performance. Therefore, similarity in rowing postures among athletes has great significance. The purpose of this study is to analyze paired rowers using a graph-matching network to enhance the performance of a team sport. The remainder of this paper is organized as follows. Section 2 gives a brief review of related works, including the OpenPose system [10] and graph neural networks (GNNs) [18][19][20][21][22]. Section 3 provides the details of the proposed approach. Section 4 presents the experimental results, and Section 5 offers concluding remarks.

The OpenPose System
The OpenPose system, developed at Carnegie Mellon University, is a widely used, deep learning-based, real-time pose estimation system [10]. This system is capable of detecting multi-person poses in real time by leveraging the COCO, MPII, and body25 models, with the number of human joints being determined based on the specific model used. Among these models, the OpenPose system with the body25 model yields more accurate keypoint results compared with the COCO and MPII models. Figure 1 illustrates detection results obtained using the OpenPose system with the body25 model. Each detected keypoint in the resulting image contains three pieces of information: the X coordinate, the Y coordinate, and the confidence C.
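As a minimal sketch of how these triples can be read, the snippet below unpacks a flat list of 75 numbers into 25 (X, Y, C) tuples. It assumes the flat `[x0, y0, c0, x1, y1, c1, ...]` layout that OpenPose writes to the `pose_keypoints_2d` field of its JSON output; the helper name `unpack_keypoints` and the sample values are hypothetical and are not part of the proposed system.

```python
from typing import List, Tuple

Keypoint = Tuple[float, float, float]  # (X, Y, C)

def unpack_keypoints(flat: List[float], num_joints: int = 25) -> List[Keypoint]:
    """Group a flat [x0, y0, c0, x1, y1, c1, ...] list into (X, Y, C) triples."""
    if len(flat) != 3 * num_joints:
        raise ValueError(f"expected {3 * num_joints} values, got {len(flat)}")
    return [(flat[3 * b], flat[3 * b + 1], flat[3 * b + 2]) for b in range(num_joints)]

# Hypothetical frame: joint b sits at (10*b, 5*b) with confidence 0.9.
flat = []
for b in range(25):
    flat.extend([10.0 * b, 5.0 * b, 0.9])

joints = unpack_keypoints(flat)
```

Keeping the confidence value next to each coordinate makes it straightforward to discard or down-weight joints that were detected unreliably before any graph is built.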
The OpenPose system is primarily built upon the human posture evaluation algorithm, with a core component known as part affinity fields (PAFs). This method uses a bottom-up detection approach, where it initially identifies joint point positions on the human body and subsequently extends to form the complete skeleton. As a result, the number of people being detected has minimal impact on the computational time required by OpenPose.

Graph Neural Networks
Deep learning has recently been applied extensively in image and sound recognition tasks, predominantly in Euclidean space. However, existing analysis methods like convolutional neural networks (CNNs) and recurrent neural networks (RNNs) encounter challenges when dealing with non-Euclidean spaces, such as matching human joint points. Consequently, researchers have introduced deep learning models based on graph neural networks (GNNs) [22][23][24][25][26] to tackle non-Euclidean spaces.
Although Sperduti and Starita initially proposed using neural networks for graph analysis in 1997 [20], it was not until 2005 that Gori et al. [25] presented a complete GNN architecture. Li et al. [26] further introduced the graph-embedding model (GEM) and graph-matching network (GMN), based on the GNN framework, for graph-pairing tasks. GEM primarily embeds each graph into a low-dimensional vector using message transfer among adjacent nodes, allowing for distance calculation and similarity measurement between graphs. The graph-matching network (GMN), built upon the three-layer architecture of GEM, enhances accuracy in graph pairing by introducing a cross-graph message exchange mechanism alongside the node message propagation within the graph. Additionally, GNNs have been extensively applied in various practical applications and research areas, including human movement recognition [27,28], identity analysis using motion [29], football prediction analysis [18,30], and similarity-based pairing tasks [31,32].
This study aimed to develop a video-based approach to analyze the performance of rowing poses of rowing pairs. To enable effective automatic comparison of rowing poses, the OpenPose system was used to extract robust rowing pose features and convert them into a graph structure that is robust to external factors, such as appearance (e.g., size, shape, and color) and lighting conditions. Subsequently, the GEM and GMN models were utilized to analyze the similarities in rowing postures between each pair of rowers.

The Proposed Approach
This section presents details of the proposed approach for analyzing the performance of rowing pairs. As shown in Figure 2, the proposed approach consists of three main steps: feature extraction, baseline comparison analysis, and rowing performance measurement. The details of the proposed approach are described as follows.
Figure 2. The proposed analysis approach for rowing player pairs.

Feature Extraction
This study used a video-based performance analysis system to analyze the performance of paired rowers; that is, the human body joints that make up the rowing pose features were detected and extracted using the OpenPose system. Furthermore, analyses were conducted to identify similarities in the rowing postures of paired rowers.
Let a video sequence contain L frames $f_1, \ldots, f_L$, where each frame $f_i$, $i \in [1, L]$, contains an image of M × N pixels. In order to extract human object features of the rowing pose in each video frame, the proposed approach first utilizes the OpenPose system with the body25 model to detect the human joint points of rowing poses. Each human joint point from the OpenPose system consists of three values, namely, the X coordinate, the Y coordinate, and the confidence C (0.0 ≤ C ≤ 1.0). Therefore, the set of body joints in the ith frame is obtained as $\{(X_i^b, Y_i^b)\}_{b=1}^{B}$, where B is the number of human body joints. Figure 3a-i shows the detected human joint points in the consecutive rowing poses from the catch position to the recovery position during a rowing period.
After obtaining the rowing pose features of the human body joints, the features are presented as a graph structure in a graph-matching network (GMN) for further performance measurement. Figure 4 shows an example of two key rowing poses, the catch pose and the finish pose, which correspond to the detected graph structures of the rowing poses shown in Figure 3a and Figure 3f, respectively, where the joints of the human body are represented with vertices, and the connections between the joint points of the human body are represented with edges.
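The vertex and edge representation described above can be sketched as follows. This is an illustration only: the limb list `BODY25_EDGES` is an assumed transcription of the body25 skeleton and should be checked against the OpenPose documentation, and the function name `build_pose_graph` and the sample joints are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed body25 limb pairs (joint index, joint index); verify against the
# OpenPose documentation before relying on them.
BODY25_EDGES: List[Tuple[int, int]] = [
    (0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (5, 6), (6, 7), (1, 8),
    (8, 9), (9, 10), (10, 11), (8, 12), (12, 13), (13, 14),
    (0, 15), (15, 17), (0, 16), (16, 18),
    (14, 19), (19, 20), (14, 21), (11, 22), (22, 23), (11, 24),
]

def build_pose_graph(joints: List[Tuple[float, float, float]]):
    """Return (vertices, adjacency): one (X, Y) vertex per joint, undirected limbs as edges."""
    vertices = [(x, y) for x, y, _confidence in joints]
    adjacency: Dict[int, List[int]] = {b: [] for b in range(len(vertices))}
    for u, v in BODY25_EDGES:
        adjacency[u].append(v)
        adjacency[v].append(u)
    return vertices, adjacency

# Hypothetical joints; real values would come from the pose detector.
joints = [(float(b), float(2 * b), 1.0) for b in range(25)]
vertices, adjacency = build_pose_graph(joints)
```

Because the edge list is fixed by the skeleton, every frame produces a graph with the same topology, and only the vertex coordinates change from one rowing pose to the next.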

Baseline Comparison Analysis
To assess the performance of each pair of rowers, this study examined similarities between their rowing postures. The proposed system utilized a graph-matching network model, which contains a graph-embedding model (GEM) and a graph-matching network (GMN) [26], in order to extract the rowing posture feature vector from each video frame. Then, the similarities in the two rowing poses of each pair of video frames were calculated, which further demonstrated the performance of the paired rowers.

The Input Data for the Model
In order to acquire the graph feature vectors (GF), twenty-five joint points were represented by graph structures, and for each rowing posture, the detected X and Y coordinates of human joints were input into the GEM and GMN models. Li et al. [26] proposed that the input space of the GEM and GMN models requires one-dimensional data; that is, the X and Y coordinates are input separately into the graph-matching network. However, this study found that one-dimensional inputs were unable to distinguish between two rowing poses. Therefore, this research improved on previous work by proposing a two-dimensional input space for the coordinates of human joint points.
Let the human joint point graph structures be $G_i = (V_i, E_i)$, $i \in [1, L]$, where $V_i$ is the set of human joint point nodes in the ith frame and $E_i$ is the set of edges that link the human joint point nodes. For the GEM and GMN models, the detected human joint points are formed as one-dimensional inputs $\{X_i\}_{i=1}^{L}$ and $\{Y_i\}_{i=1}^{L}$, respectively [26]. The proposed two-dimensional input data are $\{(X_i, Y_i)\}_{i=1}^{L}$. Both the one-dimensional and the two-dimensional human joint data were input into the GEM and GMN models to acquire the rowing pose feature for each frame.
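The difference between the two input spaces can be illustrated with a small sketch. The helper names and the two toy poses below are hypothetical; the example only shows one way in which separate X and Y inputs can lose information, and it is not claimed to be the exact failure observed in the experiments.

```python
def to_1d_inputs(joints):
    """Split (X, Y) joints into two separate one-feature-per-node inputs."""
    xs = [[x] for x, _y in joints]
    ys = [[y] for _x, y in joints]
    return xs, ys

def to_2d_input(joints):
    """Keep both coordinates together as a two-feature-per-node input."""
    return [[x, y] for x, y in joints]

# Two toy poses that share every X coordinate but differ in Y.
pose_a = [(100.0, 200.0), (100.0, 260.0), (140.0, 300.0)]
pose_b = [(100.0, 180.0), (100.0, 240.0), (140.0, 330.0)]

xs_a, ys_a = to_1d_inputs(pose_a)
xs_b, ys_b = to_1d_inputs(pose_b)

same_x_features = xs_a == xs_b                                  # X-only input cannot tell them apart
same_2d_features = to_2d_input(pose_a) == to_2d_input(pose_b)   # 2D input can
```

With the X-only input, the two poses present identical node features to the network, whereas the two-dimensional input keeps them distinct.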

The Graph-Embedding Model (GEM)
The GEM and GMN models include a graph encoder layer, a propagation layer, and an aggregator layer [24]. In the graph encoder layer, feature encoding is performed on each node and edge of the graph, as follows:
$$h_i^{0} = \mathrm{MLP}_{node}(x_{V_i}), \qquad e_{ij} = \mathrm{MLP}_{edge}(x_{E_{ij}}),$$
where $x_{V_i}$, $x_{E_{ij}}$, and MLP are the node feature vector, the edge feature vector, and the neural network of the hidden layer, respectively. It should be noted that if there is no available message for the edge and node, $x_{V_i}$ and $x_{E_{ij}}$ are set to 1 [24].
To exchange node values in the propagation layer, each transfer generates a value, and each node $h_i^t$ receives values from adjacent nodes through the edges, resulting in the value of a new node $h_i^{t+1}$, which is defined as follows:
$$Value_{j \to i} = f_{value}(h_i^t, h_j^t, e_{ij}), \qquad h_i^{t+1} = f_{node}\Big(h_i^t, \sum_{j} Value_{j \to i}\Big),$$
where $f_{value}$ is a function based on MLP concatenation, $f_{node}$ is any core network of MLP, LSTM, or GRU, $h_j^t$ is the starting point of node value transfer, $h_i^t$ is the end point that receives node values, and $e_{ij}$ is the edge that the node value passes through.
The aggregator layer aggregates nodes to obtain a graph feature vector GF, which is computed using:
$$GF = \mathrm{MLP}_G\Big(\sum_{i \in V} \sigma\big(\mathrm{MLP}_{gate}(h_i^T)\big) \odot \mathrm{MLP}(h_i^T)\Big),$$
where $\sigma$ is the logistic sigmoid used as a gate, $\odot$ is the Hadamard product, and T is the number of message-passing rounds. After T rounds of value passing, the feature vectors of the rowing postures for each frame, $\{GF_i\}_{i=1}^{L}$, are obtained. To calculate the dissimilarity in two rowing postures between each pair of frames for a pair of players, the proposed approach uses the Euclidean distance metric to calculate the dissimilarity score ($DISSIM_{SCORE}$) between the two rowing feature vectors of rowers p1 and p2, i.e.:
$$DISSIM_{SCORE}(p1, p2) = \lVert GF_{p1} - GF_{p2} \rVert_2, \tag{6}$$
where p1 and p2 denote the indices of the paired rowers. In this study, 2 out of the 15 rowers were used for each similarity analysis. A lower value of $DISSIM_{SCORE}$ denotes higher similarity between the rowing postures of two rowers, which makes them more suitable to work together to improve the overall rowing performance, and vice versa.
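The three layers can be sketched end to end in a few lines of NumPy. This is a minimal illustration under several assumptions, not the trained model used in the experiments: the weights are random and untrained, each MLP is replaced by a single tanh layer, the hidden sizes are arbitrary, the toy graph has four joints in a chain, and all names (`init_params`, `gem_embed`, `dissim_score`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN, OUT = 16, 8  # hypothetical layer sizes

def init_params(in_dim):
    """Random, untrained single-layer weights standing in for the learned MLPs."""
    w = lambda a, b: rng.normal(scale=0.3, size=(a, b))
    return {"enc": w(in_dim, HIDDEN), "value": w(2 * HIDDEN + 1, HIDDEN),
            "node": w(2 * HIDDEN, HIDDEN), "gate": w(HIDDEN, HIDDEN),
            "out": w(HIDDEN, HIDDEN), "graph": w(HIDDEN, OUT)}

def gem_embed(x_nodes, edges, params, rounds=5):
    """Encode nodes, pass messages for `rounds` rounds, then gate-and-sum into GF."""
    h = np.tanh(x_nodes @ params["enc"])                      # encoder: h_i^0
    # Each limb is used in both directions: source j sends a value to target i.
    src = np.array([j for i, j in edges] + [i for i, j in edges])
    dst = np.array([i for i, j in edges] + [j for i, j in edges])
    e = np.ones((len(src), 1))                                # no edge features: set to 1
    for _ in range(rounds):
        values = np.tanh(np.concatenate([h[dst], h[src], e], axis=1) @ params["value"])
        agg = np.zeros_like(h)
        np.add.at(agg, dst, values)                           # sum of incoming values per node
        h = np.tanh(np.concatenate([h, agg], axis=1) @ params["node"])  # h_i^{t+1}
    gate = 1.0 / (1.0 + np.exp(-(h @ params["gate"])))        # sigmoid gate
    return (gate * (h @ params["out"])).sum(axis=0) @ params["graph"]   # GF

def dissim_score(gf_p1, gf_p2):
    """Euclidean distance between two graph feature vectors."""
    return float(np.linalg.norm(gf_p1 - gf_p2))

edges = [(0, 1), (1, 2), (2, 3)]                              # toy 4-joint chain
params = init_params(in_dim=2)                                # 2D (X, Y) node input
pose_a = np.array([[0.50, 0.20], [0.50, 0.45], [0.62, 0.70], [0.40, 0.95]])
pose_b = pose_a + np.array([0.0, 0.1])                        # same pose, shifted in Y
gf_a = gem_embed(pose_a, edges, params)
gf_b = gem_embed(pose_b, edges, params)
```

Both poses are embedded with the same parameters, so identical poses yield a score of zero and any difference in the score reflects a difference in the input graphs.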

The Graph-Matching Network (GMN)
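This study also extracted feature vectors from each video frame using a graph-matching network (GMN). The main difference between the GEM and GMN models is that the propagation layers in the GMN model not only exchange node messages within each graph but also add a cross-graph message exchange mechanism. Let two graphs be $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$. Each graph $G = (V, E)$ represents a set of nodes V and edges E. Each node $i \in V$ is associated with a feature vector $x_i$, and each edge $(i, j) \in E$ is associated with a feature vector $x_{ij}$. In the propagation layer, the computation is defined as:
$$h_i^{t+1} = f_{node}\big(h_i^t, \mathrm{sum}(Value_i), \mathrm{sum}(\mu_i)\big),$$
where $\mathrm{sum}(Value_i) = \sum_j Value_{j \to i}$ represents the sum of all values from nodes j to node i within the same graph, and $\mathrm{sum}(\mu_i) = \sum_j \mu_{j \to i}$ represents the sum of all cross-graph matching values from nodes j in the other graph to node i. $f_{match}$ is a function of pairing cross-graph node value exchange, which is defined through the attention weight
$$a_{j \to i} = \frac{\exp\big(s_h(h_i^t, h_j^t)\big)}{\sum_{j'} \exp\big(s_h(h_i^t, h_{j'}^t)\big)},$$
where $s_h()$ is the Euclidean squared distance metric. In order to pair the most similar nodes in a graph with another graph, the difference value for all nodes should be computed, i.e.,
$$\mu_{j \to i} = f_{match}(h_i^t, h_j^t) = a_{j \to i}\,(h_i^t - h_j^t),$$
and the total cross-graph value is:
$$\sum_j \mu_{j \to i} = \sum_j a_{j \to i}\,(h_i^t - h_j^t) = h_i^t - \sum_j a_{j \to i}\, h_j^t,$$
where $a_{j \to i}$ is the attention weight and $\sum_j \mu_{j \to i}$ is the difference measured between $h_i^t$ and its closest neighbor in another graph. Finally, the aggregator layer aggregates the nodes under each graph to obtain graph feature vectors $GF_1$ and $GF_2$, which are defined as follows:
$$GF_1 = f_G\big(\{h_i^T\}_{i \in V_1}\big), \qquad GF_2 = f_G\big(\{h_i^T\}_{i \in V_2}\big),$$
where $f_G$ denotes the aggregator layer of the GEM. The dissimilarity of the pair of rowing postures is then computed using Equation (6).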
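The cross-graph term can be made concrete with a short sketch in plain Python. It assumes that the similarity fed to the softmax is the negative squared Euclidean distance, so that the nearest node in the other graph receives the largest attention weight; that sign convention, the function names, and the toy node states are assumptions for illustration.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two node state vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def cross_graph_match(h1, h2):
    """For each node i of graph 1, return h_i minus the attention-weighted mean of graph 2."""
    mus = []
    for hi in h1:
        # Assumed similarity: negative squared distance, so closer nodes score higher.
        scores = [-sq_dist(hi, hj) for hj in h2]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]     # numerically stable softmax
        total = sum(weights)
        attn = [w / total for w in weights]               # a_{j->i}, sums to 1 over j
        blended = [sum(a * hj[d] for a, hj in zip(attn, h2)) for d in range(len(hi))]
        mus.append([hi[d] - blended[d] for d in range(len(hi))])
    return mus

h1 = [[0.0, 0.0], [10.0, 10.0]]
h2_same = [[10.0, 10.0], [0.0, 0.0]]       # same node states in a different order
h2_shift = [[0.0, 3.0], [10.0, 13.0]]      # every node moved by +3 in the second feature

mu_same = cross_graph_match(h1, h2_same)
mu_shift = cross_graph_match(h1, h2_shift)
```

When every node has an exact counterpart in the other graph, the matching term is close to zero regardless of node order; when the counterpart is displaced, the term recovers approximately that displacement, which is the signal the GMN adds on top of the GEM.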
It should be noted that, in order to accurately measure the similarity in each pair of rowing postures between two video sequences, the proposed system must measure the rowing poses from the same frame baseline; hence, this study proposed a video comparison baseline process. First, the starting video frame $S_1$ in the first video sequence $V_1$ was selected; then, the frame of the second video sequence $V_2$ with the lowest dissimilarity to $S_1$ was searched for. The starting video frame $S_2$ of the second video was obtained using:
$$S_2 = \arg\min_{f_i} \mathrm{DISSIM}\big(GF^{V_1}_{S_1}, GF^{V_2}_{f_i}\big),$$
where GF is the rowing feature vector, $f_i$ is the candidate frame offset in the second video, and DISSIM() is the dissimilarity function computed using the Euclidean distance.
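The baseline search described above amounts to an argmin over candidate frames of the second video. The sketch below assumes the per-frame feature vectors are already available as lists of floats; the function names, the optional `max_offset` limit, and the synthetic periodic features are hypothetical.

```python
import math

def dissim(gf_a, gf_b):
    """Euclidean distance between two per-frame feature vectors."""
    return math.dist(gf_a, gf_b)

def find_start_frame(gf_v1, gf_v2, s1=0, max_offset=None):
    """Return the frame of video 2 whose feature is closest to frame s1 of video 1."""
    limit = len(gf_v2) if max_offset is None else min(len(gf_v2), max_offset)
    return min(range(limit), key=lambda f: dissim(gf_v1[s1], gf_v2[f]))

# Synthetic periodic "stroke" features; video 2 lags video 1 by 4 frames.
gf_v1 = [[math.sin(0.3 * t), math.cos(0.3 * t)] for t in range(20)]
gf_v2 = [[math.sin(0.3 * (t - 4)), math.cos(0.3 * (t - 4))] for t in range(20)]

s2 = find_start_frame(gf_v1, gf_v2, s1=0)
```

Limiting the search with `max_offset` to roughly one stroke period would avoid matching the same pose in a later stroke; whether the original system applies such a limit is not stated.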

Time-Period Similarity Processing
The purpose of the proposed time-period similarity processing (TPS) is to clearly distinguish the degree of similarity changes between the rowing postures of the paired rowers in the video segments.
In this study, the proposed approach performed segment similarity analyses in units of 100 frames. The time-period similarity TPS of the kth segment was calculated as follows:
$$TPS_k = np\big(\mathrm{DISSIM}(X)\big) = \frac{1}{100} \sum_{X = 100(k-1)+1}^{100k} \mathrm{DISSIM}(X),$$
where np() takes the average value over the segment of 100 frames and DISSIM(X) is the dissimilarity measurement for the pair of rowing postures in frame X.
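A minimal sketch of this segment averaging is given below. The function name and the synthetic dissimilarity curve are hypothetical, and averaging a shorter final segment over its actual length is an assumption, since the text does not say how a trailing partial segment is handled.

```python
def time_period_similarity(dissim_per_frame, segment=100):
    """Average the frame-wise dissimilarity over consecutive segments of `segment` frames."""
    tps = []
    for start in range(0, len(dissim_per_frame), segment):
        chunk = dissim_per_frame[start:start + segment]
        tps.append(sum(chunk) / len(chunk))   # a shorter last chunk uses its own length
    return tps

# Hypothetical 3600-frame curve (60 fps for one minute): low at first, higher later.
scores = [0.2] * 500 + [0.8] * 3100
tps = time_period_similarity(scores)
```

Averaging over 100 frames smooths the frame-to-frame fluctuation within a stroke, so a sustained change in dissimilarity shows up as a clear step between segments.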

Results and Analysis
This section presents the experimental results of the proposed method. In this experiment, 15 high school rowers, who had undergone training and practice in rowing techniques, used indoor rowing machines to simulate one minute of the actual rowing process [33,34], and each rower maintained a moderate rowing speed. The test videos collected from these 15 rowers formed the dataset used in our experiments. The test videos were recorded at 60 frames per second (fps), and the resolution of each frame was 1920 × 1080. All experiments were run on a computer equipped with an Intel Core i9-10900K CPU, 16 GB of RAM, and an NVIDIA GeForce RTX 2080 GPU, and the analysis was implemented in Python 3.

The Rowing Posture Similarity Analysis
This study trained the graph neural networks for 5000 iterations, with five rounds of node message exchange in each, and the trained model was used to evaluate rowing posture similarities in the rowers. In the experiment, the 15 rowers were numbered from 0 to 14, and after pairing and grouping, four models were used to calculate similarities in the rowing postures of the 2 rowers in each group. The models were as follows: 1D input of the graph-embedding model, 2D input of the graph-embedding model, 1D input of the graph-matching network, and 2D input of the graph-matching network.
Tables 1 and 2 show the results of the rowing posture similarity analysis for all rower pairs using the 1D and 2D graph-embedding models (GEMs), respectively. In the dissimilarity matrix, which used the squared Euclidean distance defined in Equation (6) for similarity calculation, the red numbers indicate the lowest similarity in each column, while the green numbers indicate the highest similarity in each column. The lower the value, the greater the resemblance between the rowing postures, which indicates a higher suitability for assigning these rowers to the same group to further optimize their team's performance in the sport.
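Reading such a matrix column by column can be sketched as follows. The 4 × 4 matrix below contains made-up values for illustration only and does not reproduce Tables 1 or 2; the function name is hypothetical.

```python
def best_and_worst_partner(matrix):
    """For each column, return (most similar rower, least similar rower), ignoring the diagonal."""
    n = len(matrix)
    result = {}
    for col in range(n):
        others = [row for row in range(n) if row != col]
        best = min(others, key=lambda r: matrix[r][col])    # lowest dissimilarity
        worst = max(others, key=lambda r: matrix[r][col])   # highest dissimilarity
        result[col] = (best, worst)
    return result

# Made-up symmetric dissimilarity matrix for four rowers.
matrix = [
    [0.0, 0.4, 0.9, 0.2],
    [0.4, 0.0, 0.5, 0.7],
    [0.9, 0.5, 0.0, 0.3],
    [0.2, 0.7, 0.3, 0.0],
]
partners = best_and_worst_partner(matrix)
```

The diagonal is skipped because a rower compared with themselves always has zero dissimilarity and would otherwise be reported as their own best partner.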

Rowing Posture Analysis
Figures 5 and 6 show the posture similarity degree changes in the 1D and 2D inputs for the graph-embedding model (GEM) and graph-matching network (GMN), respectively, during the one-minute rowing process. The green solid line in the figure represents the most similar rower pair, No. 0 and No. 14, while the red dotted line represents the most dissimilar rower pair, No. 0 and No. 3. As shown in Figures 5a and 6a, the differences in rowing postures between two rowers are not clearly distinguished in the 1D input for GEM and GMN. However, the improved 2D input for GEM and GMN can slightly distinguish differences, as shown in Figures 5c and 6c. Furthermore, our proposed time-period similarity analysis approach can significantly distinguish temporal differences in rowing motions, as shown in Figures 5b,d and 6b,d. Moreover, Figures 5c,d and 6c,d show that in the initial 500 frames, the postures of rowers No. 0 and No. 3, which had the worst similarity overall, were relatively similar. However, their differences in posture became larger after about 500 frames. In contrast, rowers No. 0 and No. 14 maintained high similarities after 500 frames.
Figure 7 shows the similarities in the rowing posture results for two pairs of rowers regarding the 2D input for the graph-embedding model (GEM) at frame numbers 0~3600. The green solid line in the figure represents the rowing pair with the most similar rowing postures, while the red dotted line represents the rowing pair with the most dissimilar rowing postures. Figure 7a shows the rowing posture similarities in paired rowers using the 2D input GEM, while Figure 7b shows the proposed time-period similarity analysis results of Figure 7a. It can be found that the rowing postures of the two rowers were not stable before frame number 500 at the beginning of rowing; however, in the middle part of the video, the rowing postures of the two rowers became stable, as marked by the green line.

Visual Validation
To verify the effectiveness of the approach proposed in this study, the skeletons of the rowing postures for two groups of rowers were superimposed and their posture similarities were visualized for comparison, using frames 0~3600 for each group. Group 1 consisted of rowers No. 0 and No. 3 and had the lowest rowing posture similarities, while Group 2 consisted of rowers No. 0 and No. 14 and had the highest rowing posture similarities.
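The superimposed skeletons are compared visually in this study. As a numeric companion to such an overlay, one could measure how far apart matching joints are once both skeletons share a common origin. The sketch below is an assumption for illustration, not the validation procedure used here: the function name `mean_joint_distance`, the choice of root joint, and the toy coordinates are all hypothetical.

```python
import math


def mean_joint_distance(skel_a, skel_b, root=0):
    """Mean Euclidean distance between matching joints after root alignment.

    skel_a, skel_b: lists of (x, y) joint coordinates in the same joint order.
    root: index of the joint used to translate both skeletons to a common
          origin (hypothetical choice; any stable joint could serve).
    """
    if len(skel_a) != len(skel_b):
        raise ValueError("skeletons must have the same number of joints")
    ax, ay = skel_a[root]
    bx, by = skel_b[root]
    total = 0.0
    for (xa, ya), (xb, yb) in zip(skel_a, skel_b):
        # Compare joint positions relative to each skeleton's root joint.
        total += math.hypot((xa - ax) - (xb - bx), (ya - ay) - (yb - by))
    return total / len(skel_a)


# Toy three-joint skeletons: pose_b is pose_a shifted by (10, 5).
pose_a = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0)]
pose_b = [(10.0, 5.0), (11.0, 7.0), (12.0, 6.0)]
same_pose_gap = mean_joint_distance(pose_a, pose_b)
```

A value near zero means the two skeletons overlap after alignment, while larger values correspond to frames in which the overlaid skeletons visibly diverge.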
Figure 8 shows the results of superimposing the skeletons of rowers No. 0 and No. 3, the pair with the lowest rowing posture similarities, in which red is rower No. 0 and black is rower No. 3. Figure 8a-e shows that at the beginning of rowing, i.e., frame numbers 0~500, the skeletons of the rowing postures matched because the rowing rhythms of the two rowers were almost the same. It should be noted that the initial motions in the rowing poses of rower No. 0 and rower No. 3, as well as their rhythms, were very similar. However, as time progressed and physical exertion increased, their rowing poses and rhythms could not be consistently maintained, and their rowing poses gradually became dissimilar. This observation is in line with the prediction results of the rowing posture analysis using the 2D GMN analysis model in the previous section [26]. In addition, Figure 8f-j shows that the rowing postures of the two rowers were already quite different; hence, the skeletons of the rowing postures did not match after frame number 500.
Figure 9 shows the results of superimposing the skeletons of rowers No. 0 and No. 14, the pair with the highest rowing posture similarities, where red is rower No. 0 and black is rower No. 14. Figure 9a-e shows that the rowing postures of the two rowers were not well matched at the beginning of rowing, i.e., frame numbers 0~500. However, the skeletons of their rowing postures began to match after frame number 500, as shown in Figure 9f-j. It should be noted that the initial motion of rower No. 0's rowing pose was standard, but their rowing rhythm could not be consistently maintained. On the other hand, rower No. 14's rowing pose at the beginning of rowing was not quite standard, but their rowing rhythm was stable. As a result, rowers No. 0 and No. 14 were not well matched at the beginning of rowing. However, as the rowing rhythm was adjusted, the rowing poses of the two rowers gradually became more similar. The results of this experiment show that the similarity values obtained using the proposed method were all in line with the results of the visualized superimposed skeleton comparison. The final similarity matrix can be used to judge whether any two rowers are suitable to be grouped to increase the rowing speed.
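Using a pairwise matrix to decide which rowers to group can be illustrated by ranking all pairs by their dissimilarity. This is a hedged sketch that assumes a symmetric dissimilarity matrix in which lower values mean more similar postures; the function name `rank_pairs`, the rower labels, and the matrix values are made up for illustration and are not taken from Table 1 or Table 2.

```python
from itertools import combinations


def rank_pairs(dissimilarity, rower_ids):
    """Return (value, rower_a, rower_b) tuples, most similar pair first."""
    pairs = []
    for i, j in combinations(range(len(rower_ids)), 2):
        # Only the upper triangle is read, since the matrix is symmetric.
        pairs.append((dissimilarity[i][j], rower_ids[i], rower_ids[j]))
    pairs.sort(key=lambda item: item[0])
    return pairs


# Made-up values for three hypothetical rowers; not data from this study.
ids = ["A", "B", "C"]
matrix = [
    [0.0, 8.5, 1.2],
    [8.5, 0.0, 4.0],
    [1.2, 4.0, 0.0],
]
ranked = rank_pairs(matrix, ids)
best = ranked[0]    # lowest dissimilarity: candidate pairing
worst = ranked[-1]  # highest dissimilarity: least suitable pairing
```

A coach-facing tool could present the head of this ranking as candidate pairings and the tail as pairs to avoid, which mirrors the green and red highlighting described for the dissimilarity tables.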

Conclusions
This study presented an efficient approach for evaluating the performance of rowing pairs using a graph-matching network. The proposed approach used the OpenPose system to capture the rowers' postures during the rowing process and acquire the positions of their joints. The 2D coordinates of the detected joints were then input into the GEM and GMN models to extract the feature vectors of each rowing posture, from which the similarities in each pair of rowing postures were efficiently calculated. The proposed video-based analysis was then used to evaluate the performance of the rowing pairs. The proposed time-period similarity analysis clearly distinguished how the degree of similarity between rowers changed over time. The experiment demonstrated that the proposed approach could effectively evaluate the performance of rowing pairs and provide useful suggestions for coaches when grouping and training rowers.
The main limitation of the proposed approach is that this study only simulated rowing on indoor rowing machines, so the approach may face several constraints when applied to actual outdoor rowing scenarios. The primary strength of this study lies in the novel use of graph-embedding models and graph-matching networks for rowing posture analysis. Although it focuses on rowing, the system could also be applied to other sports to help enhance overall team performance. In the future, how to use 3D posture features and integrate a 3D graph neural network model to improve the proposed system merits further study.


Figure 2. The proposed analysis approach for rowing player pairs.

Figure 3. The human joint points detected during a rowing period using the OpenPose system. (a-i) The consecutive rowing poses from the catch position to the recovery position during a rowing period.

Figure 4. Illustration showing the detected rowing pose graph structures, which correspond to Figure 3a and Figure 3f, respectively.


Figure 8. Rower numbers 0 and 3: (a-e) results comparing every 30 frames after frame number 0 and (f-j) results comparing every 15 frames after frame number 500.

Figure 9. Rower numbers 0 and 14: (a-e) results comparing every 30 frames after frame number 0 and (f-j) results comparing every 30 frames after frame number 500.

Table 1. The dissimilarity matrix among all player pairs using the 1D input graph-embedding model (×10⁷). The green numbers indicate the most similar pairing results, while the red numbers represent the least similar pairing results.

Table 2. The dissimilarity matrix among all player pairs using the 2D input graph-embedding model (×10⁷). The green numbers indicate the most similar pairing results, while the red numbers represent the least similar pairing results.