1. Introduction
Motion capture data, which are highly realistic and can be processed in real time, have important applications in simulation training, film and television production, entertainment, virtual reality, and many other fields. However, raw motion capture data are voluminous and difficult to reuse. To facilitate the transmission, storage, and reuse of motion capture data, many keyframe extraction techniques have been proposed, which reduce the storage space required and accelerate data processing.
Keyframe extraction techniques for motion capture data select representative frames from the entire movement sequence. At present, popular keyframe extraction methods can be divided into those based on clustering, those based on curve simplification, and others.
Methods based on clustering divide the motion sequence into groups with similar features and then select one frame from each group as a keyframe. For example, Liu et al. [1] divided N frames of motion data into K sets according to the features of the motion data and then selected the first frame in each set as the keyframe. Park and Shin [2] represented the motion sequences with quaternions and then applied PCA (Principal Component Analysis) and the K-means method to process the motion data, and the scattered data were used for interpolation to extract the keyframes. Zhu and Wang [3] proposed a similar clustering method for extracting keyframes, in which the high-dimensional original data were mapped to a low-dimensional space and the mean squared error was then used to segment the low-dimensional data. Each segmentation point reflected a pose change in the motion data and was therefore selected as a keyframe. Halit and Capin [4] also reduced the dimensionality of the data and obtained the motion feature from the Gaussian-weighted average of the frames, selected frames above the average as candidate frames, and then removed the unimportant frames to yield the final keyframes.
In methods based on curve simplification, each frame of the motion capture data is regarded as a point in a high-dimensional space, and these points, ordered in time, form a trajectory curve. A curve simplification algorithm is then used to extract feature points from this curve, which give the keyframes of the motion sequence. Lim and Thalmann [5] extended the curve simplification algorithm and determined the highest curve that indicated the motion capture data, before finally extracting the extreme points of the curve as the keyframes. However, this method required repeated attempts with different error levels, depending on the compression rate, to meet the requirements. Togawa and Okuda [6] proposed a keyframe extraction method based on location. The movement was regarded as a set of curves formed by the joint rotations in each frame, and the locations of all the joints were calculated to determine the key points and reduce the number of keyframes, finally yielding the keyframes. Assa et al. [7] used multidimensional scaling analysis to map the high-dimensional motion data to a low-dimensional space and then applied curve simplification in the low-dimensional space to extract the keyframes. Yang et al. [8] proposed the double curve simplification algorithm, which used the set of angles formed by the human limb bones and the center of the bones as features and then applied a stratified curve simplification algorithm to extract the keyframes. Peng [9] proposed a method based on the center distance to extract the keyframes. Zhang et al. [10] extracted keyframes based on the joint distance, using nine joint features to represent the original motion sequences and then applying PCA to reduce the dimensionality of the data and obtain one-dimensional curves. The extreme points of the curve were extracted as the initial keyframes, and the final keyframes were then interpolated between the extreme points according to a threshold.
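As an illustration of the extreme-point idea used by several of these curve-based methods, the following is a minimal sketch, not the implementation of any cited work. It assumes that the motion has already been reduced to a one-dimensional feature curve (for example by PCA); the function name `local_extrema` and the toy sine curve are hypothetical.

```python
import math

def local_extrema(curve):
    """Return indices of interior points where the slope of the curve changes sign."""
    extrema = []
    for i in range(1, len(curve) - 1):
        left = curve[i] - curve[i - 1]
        right = curve[i + 1] - curve[i]
        if left * right < 0:
            extrema.append(i)
    return extrema

# Toy one-dimensional feature curve (hypothetical): one sine period over 101 frames.
curve = [math.sin(2.0 * math.pi * i / 100) for i in range(101)]
candidates = local_extrema(curve)  # candidate keyframe indices
```

In a real pipeline the candidate set would typically be refined further, for instance by adding frames between extrema when a threshold is exceeded, as described for [10].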
Among other keyframe extraction methods, Liu [11] used spherical linear interpolation in place of linear interpolation and introduced a speed error, which added a measure of the joint rotation characteristics. The frame cut method was used to obtain a reconstruction error curve, and the keyframes were extracted automatically based on the best compression ratio. However, this method could not guarantee that the keyframes obtained at the optimal compression ratio had the minimum reconstruction error. Lee et al. [12] constructed an objective function that treated the keyframes and the compression ratio as the goal and then used a genetic algorithm to extract keyframes from a dynamic grid sequence. Liu et al. [13,14] improved this method and proposed a keyframe extraction algorithm based on a simplex hybrid genetic algorithm (SMHGA), which incorporates the simplex method, a relatively powerful local search technique, to increase the local search ability of the genetic algorithm. This method eliminated the parameter adjustment process and improved the local search ability. However, genetic algorithms are readily trapped in local extrema, which makes it difficult to find the global optimal solution. Cai et al. [15] proposed a keyframe extraction method for motion capture data based on either the compression ratio or the reconstruction error, which was divided into two stages: frame preselection followed by optimized selection based on the reconstruction error.
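Several of the methods above trade off the compression ratio against the reconstruction error. The sketch below shows one plausible way to compute both quantities; it is an assumption for illustration, not the definition used in any cited work. Frames are treated as plain feature vectors rebuilt by linear interpolation (rotation data would more likely use spherical linear interpolation, as in [11]), and the key indices are assumed to be sorted and to include the first and last frames. All function names are hypothetical.

```python
import math

def reconstruct(frames, keys):
    """Rebuild every frame by linear interpolation between neighbouring keyframes."""
    rebuilt = [None] * len(frames)
    for a, b in zip(keys, keys[1:]):
        for i in range(a, b + 1):
            w = (i - a) / (b - a)
            rebuilt[i] = [(1 - w) * x + w * y for x, y in zip(frames[a], frames[b])]
    return rebuilt

def reconstruction_error(frames, keys):
    """Mean Euclidean distance between the original and the rebuilt frames."""
    rebuilt = reconstruct(frames, keys)
    return sum(math.dist(f, r) for f, r in zip(frames, rebuilt)) / len(frames)

def compression_ratio(frames, keys):
    """Fraction of frames retained as keyframes."""
    return len(keys) / len(frames)

# Toy sequence (hypothetical) with one feature per frame.
frames = [[0.0], [1.0], [4.0], [3.0], [2.0]]
err_three_keys = reconstruction_error(frames, [0, 2, 4])
err_two_keys = reconstruction_error(frames, [0, 4])
```

Keeping the turning point at frame 2 gives a much smaller error than keeping only the endpoints, at the cost of a higher compression ratio.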
This study presents a keyframe extraction method for human motion capture data based on a multiple population genetic algorithm (MPGA). We define a fitness function that assesses the reasonableness of a keyframe extraction result and set the minimum fitness value as the optimization target. The algorithm does not require a manually specified threshold and can extract the optimal keyframe set.
2. Multiple Population Genetic Algorithm
Genetic algorithms are highly parallel, randomized, and adaptive global optimization methods based on probabilistic search, which imitate natural selection and the mechanism of evolution. They are highly robust and have the capacity for global search. However, genetic algorithms also have shortcomings, such as premature convergence. MPGA was introduced to address these problems.
MPGA breaks with the standard genetic algorithm (SGA) framework, which relies on the evolution of a single population, by introducing multiple populations to optimize the search process. The populations are assigned different control parameters to achieve different search purposes, so that the algorithm takes both global search and local search into account. The populations evolve independently and communicate with each other via an immigration operator, which regularly (every generation) introduces the best individual obtained in each population into another population, thereby achieving information exchange between populations. The specific operation rule is that the best individual in the source population replaces the worst individual in the target population. In each generation, the best individual in each of the other populations is also selected by an artificial selection operator and preserved in the essence population, so that the best individuals found so far are not lost.
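The immigration rule can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say which population is the target of which source, so a ring topology (population i sends to population i + 1) is assumed here; lower fitness is treated as better, in line with the minimization target of this study; and the function name `immigrate` and the toy fitness are hypothetical.

```python
def immigrate(populations, fitness):
    """Replace the worst individual of each target population with a copy of
    the best individual of its source population (assumed ring: i -> i + 1)."""
    # Choose all migrants first so that each population sends its pre-migration best.
    migrants = [min(pop, key=fitness) for pop in populations]
    for src, migrant in enumerate(migrants):
        target = populations[(src + 1) % len(populations)]
        worst = max(range(len(target)), key=lambda j: fitness(target[j]))
        target[worst] = list(migrant)
    return populations

# Toy example: three populations of binary individuals, fitness = number of ones.
pops = [
    [[1, 1, 1], [0, 0, 1]],
    [[1, 1, 0], [0, 0, 0]],
    [[1, 0, 1], [1, 1, 1]],
]
immigrate(pops, fitness=sum)
```

After the call, each population has lost its worst individual and gained the best one from its predecessor in the ring.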
SMHGA and MPGA belong to the same family of genetic algorithms, and both eliminate the parameter adjustment process and improve the capacity for local search. SMHGA is a hybrid approach that combines GA with a probabilistic simplex method (SM), which directs the search in promising directions to speed up convergence and improve the quality of the solutions [14]. SMHGA evolves a single population, whereas MPGA employs multiple populations to optimize the search process; thus, MPGA covers a larger search space and can more readily find the global optimum.
2.1. Structure of MPGA
The structure of MPGA is shown in Figure 1.
2.2. Description of MPGA
Step 1. Generate the initial populations. Use binary coding to generate MP initial populations Chrom, each containing N individuals. Different populations are given different control parameters (e.g., mutation probability and crossover probability).
Step 2. Calculate the fitness of each individual in each population Chrom and then sort the fitness values from highest to lowest within each population.
Step 3. Apply selection, crossover, and mutation operations to each population separately. The populations are independent of each other but co-evolve.
Step 4. Immigration operations. Each initial population is relatively independent but the exchange of information occurs between populations via the immigration operation, where the worst individual in the target population is replaced by the best individual in the source population.
Step 5. Artificial selection operations. In each generation, use an artificial selection operator to select the best individual from each of the other populations and preserve these individuals in the essence population.
Step 6. Find the best individual in the essence population. Check whether the current optimal value and the previous optimal value are the same. If this is not the case, update the optimal values.
Step 7. Check whether the evolutionary process satisfies the termination condition. If it meets the condition, the algorithm proceeds to Step 8. If it is not satisfied, then return to Step 2.
Step 8. End of the evolutionary process. Output the optimal value and the best individual that attains it.
The algorithm iterates until the termination condition is satisfied: the optimal value remains unchanged for MAXGEN consecutive generations, where MAXGEN is specified in the initial settings. The individuals in the essence population then hold well-optimized values, and the best of them is selected as the keyframe set of the movement sequence.
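The steps above can be sketched in code as follows. This is a simplified illustration rather than the implementation used in this study, and several details are assumptions not fixed by the text: tournament selection, one-point crossover, bit-flip mutation, a ring topology for immigration, the parameter ranges, and a toy fitness function (Hamming distance to a hypothetical target bit string, to be minimized) standing in for the keyframe fitness function. The names `evolve` and `mpga` are hypothetical.

```python
import random

def evolve(pop, fitness, pc, pm, rng):
    """Step 3 (assumed operators): tournament selection, one-point crossover,
    and bit-flip mutation applied to a single population."""
    def pick():
        a, b = rng.sample(pop, 2)
        return a if fitness(a) <= fitness(b) else b
    children = []
    while len(children) < len(pop):
        p1, p2 = list(pick()), list(pick())
        if rng.random() < pc:
            cut = rng.randrange(1, len(p1))
            p1[cut:], p2[cut:] = p2[cut:], p1[cut:]
        for child in (p1, p2):
            children.append([1 - bit if rng.random() < pm else bit for bit in child])
    return children[:len(pop)]

def mpga(fitness, n_bits, n_pops=4, pop_size=20, maxgen=10, seed=0):
    rng = random.Random(seed)
    # Step 1: binary-coded populations, each with its own (pc, pm) parameters.
    pops = [[[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
            for _ in range(n_pops)]
    params = [(rng.uniform(0.7, 0.9), rng.uniform(0.001, 0.05)) for _ in range(n_pops)]
    essence = [None] * n_pops
    best_value, unchanged, history = None, 0, []
    while unchanged < maxgen:  # Step 7: stop after maxgen unchanged generations
        # Steps 2-3: evaluate and evolve each population independently.
        pops = [evolve(p, fitness, pc, pm, rng) for p, (pc, pm) in zip(pops, params)]
        # Step 4: immigration, best of source replaces worst of target (ring).
        migrants = [min(p, key=fitness) for p in pops]
        for k, migrant in enumerate(migrants):
            target = pops[(k + 1) % n_pops]
            worst = max(range(pop_size), key=lambda j: fitness(target[j]))
            target[worst] = list(migrant)
        # Step 5: artificial selection keeps the best individuals in the essence population.
        for k, p in enumerate(pops):
            cand = min(p, key=fitness)
            if essence[k] is None or fitness(cand) < fitness(essence[k]):
                essence[k] = list(cand)
        # Step 6: compare the current optimum with the previous one.
        current = min(fitness(e) for e in essence)
        if best_value is None or current < best_value:
            best_value, unchanged = current, 0
        else:
            unchanged += 1
        history.append(best_value)
    # Step 8: output the best individual and the optimal value per generation.
    return min(essence, key=fitness), history

# Toy fitness (hypothetical): Hamming distance to a target keyframe mask.
target_mask = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0]
toy_fitness = lambda ind: sum(a != b for a, b in zip(ind, target_mask))
best, history = mpga(toy_fitness, n_bits=len(target_mask))
```

Because the essence population is only updated when a better individual appears, the recorded optimum never worsens, and the loop ends once it has stayed the same for `maxgen` generations in a row.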