Article

Effect of Data Representation for Time Series Classification—A Comparative Study and a New Proposal

by
Kotaro Nakano
1 and
Basabi Chakraborty
2,*
1
Graduate School of Software and Information Science, Iwate Prefectural University, Iwate 020-0693, Japan
2
Faculty of Software and Information Science, Iwate Prefectural University, Iwate 020-0693, Japan
*
Author to whom correspondence should be addressed.
Mach. Learn. Knowl. Extr. 2019, 1(4), 1100-1120; https://doi.org/10.3390/make1040062
Submission received: 29 October 2019 / Revised: 3 December 2019 / Accepted: 4 December 2019 / Published: 6 December 2019
(This article belongs to the Section Data)

Abstract

Time series classification (TSC) is becoming very important in the area of pattern recognition with the increasing availability of time series data from various natural and real-life phenomena. TSC is a challenging problem because the attributes are ordered, so traditional machine learning algorithms designed for static data are not well suited to processing temporal data. With the gradual increase of computing power, a large number of TSC algorithms have been developed recently. In addition to traditional feature-based, model-based and distance-based algorithms, ensembles and deep networks have recently become popular for time series classification. Time series are typically long, and classifying raw data is computationally expensive in terms of both processing and storage, so representation techniques that reduce the data and ease visualization are needed for accurate classification. In this work, a recurrence plot-based data representation is proposed, and time series classification with this representation in conjunction with deep neural network-based classifiers has been studied. A simulation experiment with 85 benchmark data sets from the UCR repository has been undertaken with several state-of-the-art time series classification algorithms in addition to our proposed scheme for a comparative study. It was found that, among non-ensemble algorithms, the proposed algorithm produces the highest classification accuracy for most of the data sets.

1. Introduction

A time series is an ordered sequence of data points, abundant in nature as well as in real life. Due to the increasing use of various sensors, the advancement of ICT (Information and Communication Technology) and the decreased cost of storage, huge amounts of time series data are collected and stored regularly in various application domains. This high volume of time series data needs to be analysed for meaningful use. Classification is an important task in time series analysis [1], with applications ranging from biometric authentication such as online signature verification [2], to electroencephalogram (EEG) and electrocardiogram (ECG) analysis in the medical and health care field [3], to stock price and exchange rate analysis in finance [4], to human activity recognition [5,6].
Traditional time series classification algorithms can be grouped into three categories: model-based, feature-based and distance-based. The first category of approaches builds a model for each class from the raw time series data by fitting its parameters to that class, and new data is classified according to the class model that best fits it. Models used in time series classification are mainly statistical, such as Gaussian, Poisson, autoregressive [7], Markov and hidden Markov models (HMM) [8], or based on neural networks. Naive Bayes is the simplest model and is used in text classification [9]. Hidden Markov models are successfully used for biological sequence classification. Some neural network models, such as the recurrent neural network (RNN), are suitable for temporal data classification. Probabilistic distance measures are generally suitable for model-based classification of time series.
The second category consists of extracting meaningful features from the time series, transforming the time series into a feature vector and then classifying it with traditional machine learning classifiers. The choice of appropriate features plays an important role in this approach. A number of techniques have been proposed for feature subset selection that compactly represent a high-dimensional time series as one row to facilitate the application of traditional feature selection algorithms such as recursive feature elimination (RFE) and zero-norm optimization [10,11]. Time series shapelets, characteristic subsequences of the original series, have recently been proposed as features for time series classification [12]. Another group of techniques extracts features from the original time series by using various transformations such as the Fourier and wavelet transforms. In Reference [13], a family of techniques has been introduced to perform unsupervised feature selection on time series data based on common principal component analysis (CPCA), a generalization of PCA for multivariate data items where all the data items have the same number of dimensions. Any distance metric can then be used for classification of the feature-based representation of the time series data.
The third category of approaches is based on developing efficient distance functions to measure the similarity between two raw time series, together with a good traditional classifier for clustering or classification. The similarity or dissimilarity measure is the most important component of this approach. Euclidean distance is the most widely used measure, together with a nearest neighbour classifier, for time series classification. Although computationally simple, it requires the two series to be of equal length and is sensitive to time distortion. Elastic similarity measures such as Dynamic Time Warping (DTW) [14] and its variants overcome these problems and seem to be the most successful similarity measures for time series classification in spite of their high computational cost. The combination of DTW and the k-nearest neighbour classifier is known to be a very efficient approach and was considered the best one until a few years ago. A comparative study of different distance measures can be found in Reference [15].
Recently, ensemble-based approaches have been developed in which different classifiers are combined to achieve a higher degree of accuracy. Different ensemble paradigms integrate various feature sets or classifiers. The Elastic Ensemble (PROP) [16] combines 11 classifiers based on elastic distance measures with a weighted ensemble scheme. The Collective of Transformation-based Ensembles (COTE) [17] is another ensemble of 35 different classifiers based on different feature subsets from the time and frequency domains. The Hierarchical Vote Collective of Transformation-based Ensembles (HIVE-COTE) [18] is an extended version of COTE. However, the computational times for ensemble classifiers are quite high compared to a single classifier, even with the increased use of high-performance computers. A good comparative evaluation of recent time series classification algorithms can be found in Reference [19].
Due to increased interest in GPU-based computing, deep learning models are also becoming popular and have been successfully applied to the time series classification problem. A good review of the most successful applications of deep neural networks (DNN) can be found in Fawaz et al. [20]. Deep learning approaches for TSC can be grouped into two categories: generative and discriminative models. Among the various DNN models developed for different tasks, the Convolutional Neural Network (CNN) is the most widely applied architecture for TSC problems, probably due to its robustness and shorter training time compared to other complex architectures. A review of CNN models can be found in Reference [21]. Two baseline CNN models are used in Reference [22], one being a fully convolutional neural network (FCN) and the other a residual network (ResNet). CNN and ResNet are known to be the most successful and effective deep neural networks so far, according to Reference [20]. Recurrent neural networks such as LSTM (Long Short Term Memory) have also been used for human activity recognition from one-dimensional time series collected by different sensors, or for classifying stocks [23].
For efficient classification, a raw time series should be preprocessed into a new representation before being fed to a CNN: it needs to be converted into a set of fixed-length vectors (to be used with a 1D CNN) or into a matrix before being fed to a 2D CNN. The most popular transformation methods are the Gramian Angular Field (GAF) [24,25] and the Markov Transition Field (MTF) [26], which encode time series signals as images for input to a 2D CNN. Another way of transforming a one-dimensional time series into a two-dimensional matrix is to use the recurrence plot [27]. This paper investigates the performance of recurrence plot-based time series representation with two DNN models, namely the fully convolutional network (FCN) and ResNet, on time series classification problems. A modification of the recurrence-based representation has been proposed, and the classification efficiency of the new representation has been examined against other representative classification algorithms from the literature through simulation experiments with 85 benchmark data sets from the UCR time series data repository. The next section contains a brief description of time series representation and classification as the background of the present work. Our proposed TSC approach with the modified recurrence plot is presented in Section 3. Section 4 describes the comparative study by simulation experiments, followed by the simulation results and analysis in Section 5. The final section contains the summary of the work and the conclusion.

2. Time Series Representation and Classification

Approaches for time series classification can also roughly be grouped into approaches based on raw time series data and approaches based on transformed data in which the time series is converted into a set of feature vectors. Figure 1 represents the grouping of popular time series classification approaches after preprocessing of the data. The group of approaches on the left consists of representing the time series as a vector of global static features, selecting the most appropriate features and classifying with traditional machine learning models such as SVM (support vector machine), KNN (k-nearest neighbour) or CART (decision tree). The block on the right represents the approaches that classify raw time series by KNN using various similarity measures. The middle group represents various representation schemes (feature extraction) for classification by deep neural networks or other machine learning classifiers. Our approach in this work falls into this middle category.

2.1. Feature-Based Representation

Feature-based approaches to classification are generally faster than raw time series-based approaches. Feature extraction from time series can be done either in the time domain or in the frequency domain. Moreover, features can be derived either from subsequences of a time series, characterizing local patterns, or from the whole time series, expressing global patterns. Features computed from different subsections of a time series are combined to form a bag-of-features framework (TSBF) for time series classification in Baydogan et al. [28].
Some feature-based representations convert a time series into a vector of feature values, which are generally statistical measures of the time series computed over windows (the whole time series is divided into a sequence of fixed-length or sliding windows), such as the mean, standard deviation, skewness, kurtosis and their successive increments [29]. Such features are unable to preserve the dynamic information embedded in the time series. Another class of feature-based representations consists of various transformations of the time series into the frequency domain, such as the DFT (Discrete Fourier Transform), SVD (Singular Value Decomposition), DCT (Discrete Cosine Transform), DWT (Discrete Wavelet Transform) and so forth. Timmer et al. [30] used a variety of time and frequency domain features to represent hand tremor time series. Morchen [31] used different features from the frequency domain for classification of different time series. Wang [32] used 13 features for classification of univariate and multivariate time series.
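As a concrete illustration of this style of representation, the sketch below computes a few window-based statistical features and low-order DFT magnitudes with NumPy/SciPy; the window length, number of coefficients and the synthetic series are illustrative choices, not the feature sets used in the cited works.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def window_features(series, window=50):
    """Mean, standard deviation, skewness and kurtosis over non-overlapping windows."""
    feats = []
    for start in range(0, len(series) - window + 1, window):
        w = series[start:start + window]
        feats.extend([w.mean(), w.std(), skew(w), kurtosis(w)])
    return np.array(feats)

def spectral_features(series, k=8):
    """Magnitudes of the k lowest DFT coefficients as simple frequency-domain features."""
    return np.abs(np.fft.rfft(series))[:k]

x = np.sin(np.linspace(0, 20, 200)) + 0.1 * np.random.randn(200)
feature_vector = np.concatenate([window_features(x), spectral_features(x)])
```

Such a fixed-length vector can then be passed to any conventional classifier.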
In feature-based approaches to TSC, classification accuracy depends strongly on the extracted and selected features rather than on the classification model. The choice of features to characterize a time series is subjective and non-systematic. The best feature subset is also task-dependent, and there is no single way of choosing features for all time series classification problems. All such approaches require careful data preprocessing and selection of appropriate features for efficient classification.

2.2. Time Series Classification with Deep Neural Network

Deep neural networks (DNN), known to be capable of automatic feature extraction, are now becoming very popular and have many successful applications in the field of image processing [33]. In addition to images, sequential text and audio data can also be processed successfully with deep neural networks. Motivated by this success, DNNs, especially convolutional neural networks (CNN), are increasingly used for TSC problems, as time series resemble text and audio data in their sequential nature.
A multi-channel CNN (MC-CNN), in which filters are applied to each channel and the features are flattened across channels before being fed to a fully connected layer, is proposed in Zheng et al. [34]. A multi-scale convolutional neural network (MCNN) has been proposed for univariate time series classification, in which three types of representation (down sampling, skip sampling and sliding window) are used to preprocess the raw time series before it is input to the network [35]. Another work is based on the similar idea of simultaneously exploiting multiple branches of the same type of representation for time series classification [36].
Wang [22] suggested two other CNNs for time series classification, the fully convolutional neural network (FCN) without subsampling layers and ResNet (Residual Network). With the addition of some learning techniques, these two models produce better performance than MCNN or COTE, as demonstrated by simulations with the UCR benchmark data sets. An ensemble method of deep networks is proposed in Reference [37], in which LSTM (Long Short Term Memory) and FCN models are used individually and their outputs are concatenated and passed through a softmax classifier for the final decision. Although deep neural networks achieve quite good accuracy on time series classification problems, the high preprocessing effort and the tuning of a large set of hyperparameters make them difficult to use in a real situation.

2.3. Recurrence Plot for Deep Neural Network

There are basically two main approaches for time series classification with convolutional neural networks. In one approach, the traditional CNN is modified to accept a one-dimensional time series as input; in the other, the time series is converted into a 2D image to be used with a conventional CNN. There are various methods for transforming time series signals into images, such as the Gramian Angular Field (GAF) [24,25], the Markov Transition Field (MTF) [26] and the Recurrence Plot (RP), a tool from chaos theory for visualizing time series.
Silva et al. [38] used the Campana-Keogh distance, an image similarity measure, as a similarity measure (CK-1) between the two recurrence plots corresponding to two time series and found an improvement in classification accuracy compared to Euclidean distance and dynamic time warping. Hatami et al. [39] used RPs as inputs to a CNN for TSC problems. In a subsequent paper [40], the authors applied bag-of-features concepts to recurrence plots and generated bags of recurrence patterns as time series representations for classification with a Support Vector Machine (SVM) classifier. Michael et al. [41] defined the cross recurrence plot (CRP), an extension of the recurrence plot that visualizes similar recurring patterns in two time series, and proposed another similarity measure called the cross recurrence plot compression distance (CRPCD), a modification of the work in Reference [38]. Recurrence quantification analysis (RQA) [42] was developed to quantify differences between the recurrence plots of two dynamical systems and has been used as a similarity measure in time series classification tasks in several recent works [43,44,45]. To our knowledge, there is no research considering the modification of the recurrence plot itself to be used with deep networks for better classification accuracy in a time series classification problem.
The recurrence plot (RP), introduced by Eckmann et al. [27], is a tool to visualize recurrent behaviour such as periodicity or irregular cyclicity, typical phenomena of the nonlinear dynamical systems that generate time series. It is a 2D plot encoding a 1D time series which provides a way to visualize the recurrence behaviour of the trajectory through phase space and enables us to investigate certain aspects of the m-dimensional phase space trajectory through a 2D representation. It can be defined by the following equation:
R_{i,j} = \Theta\left( \epsilon - \| x_i - x_j \| \right), \quad x(\cdot) \in \mathbb{R}^m, \quad i, j = 1, \ldots, n, \qquad (1)
where x is a time series of length n, $x_i$ and $x_j$ are the phase-space states (subsequences) observed at positions i and j of the time series, $\|\cdot\|$ is a norm (e.g., the Euclidean norm) between the observations, and $\epsilon$ is the recurrence threshold, chosen in such a way that noise is filtered out but the recurrence structures are preserved. $\Theta$ is the Heaviside step function. According to Equation (1), the recurrences of the phase states at times i and j are placed in a square matrix of black and white dots, with recurrence marked by black dots.
CR_{i,j} = \Theta\left( \epsilon - \| x_i - x_j \| \right), \quad x_i, x_j \in \mathbb{R}^m, \quad i = 1, \ldots, n, \; j = 1, \ldots, l. \qquad (2)
The cross recurrence plot (CRP), defined in Equation (2), is an extension of the RP which shows all the times at which a state in one time series occurs in the other time series. When the lengths n and l of the two time series differ, the CRP matrix becomes non-square.
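For concreteness, a minimal sketch of Equations (1) and (2) is shown below, using the raw (non-embedded) samples as states, the absolute difference as the norm, and an illustrative threshold; it is not the exact implementation used later in this paper.

```python
import numpy as np

def recurrence_plot(x, eps):
    """Binary recurrence plot of Equation (1): R[i, j] = 1 if |x_i - x_j| <= eps."""
    d = np.abs(x[:, None] - x[None, :])      # pairwise distances, shape (n, n)
    return (d <= eps).astype(np.uint8)       # Heaviside step applied to eps - d

def cross_recurrence_plot(x, y, eps):
    """Cross recurrence plot of Equation (2); non-square when len(x) != len(y)."""
    d = np.abs(x[:, None] - y[None, :])      # shape (n, l)
    return (d <= eps).astype(np.uint8)

x = np.sin(np.linspace(0, 4 * np.pi, 100))
y = np.sin(np.linspace(0, 4 * np.pi, 80) + 0.5)
R = recurrence_plot(x, eps=0.1)
CR = cross_recurrence_plot(x, y, eps=0.1)
```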

3. Proposed TSC Approach by DNN with Modified Recurrence Plot

In this work, time series classification with deep neural networks has been investigated using a proposed modification of the recurrence plot to improve classification performance. Based on our literature survey, we considered two architectures, the fully convolutional network (FCN) and the residual network (ResNet), with three types of data representation: the traditional recurrence plot and two proposed modifications.
The first step in the proposed classification approach is the recurrence plot generation. The recurrence plot is a simple tool for reconstructing a nonlinear dynamical system from the observed time series, based on the embedding theorem. The embedding theorem, proposed by Takens and extended by Sauer [46], guarantees that the phase space of time-delayed vectors of sufficiently large dimension will capture the structure of the original phase space.
A deterministic time series signal $\{ s_n(t) \}_{t=1}^{T_n}$ $(n = 1, 2, \ldots, N)$ can be embedded as a sequence of time-delay coordinate vectors $v_{s_n}(t)$, known as the experimental attractor, with an appropriate choice of the embedding dimension m, the minimum number of coordinates needed to represent the time series with no overlapping in the state space, and the delay time $\tau$, the time lag between the time series points taken as coordinates:
v_{s_n}(t) = \{ s_n(t), s_n(t+\tau), \ldots, s_n(t + (m-1)\tau) \}. \qquad (3)
For correct reconstruction of the attractor, a good estimate of the embedding parameters m and $\tau$ is needed. There is a variety of heuristic techniques for estimating these parameters [47]. The most popular method for estimating m is the false nearest neighbours method proposed by Kennel, and the most popular technique for estimating $\tau$ is average mutual information.
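A minimal sketch of the delay embedding of Equation (3) is given below; the values of m and τ are placeholders, since in practice they would be estimated, for example, with false nearest neighbours and average mutual information as noted above.

```python
import numpy as np

def delay_embed(s, m, tau):
    """Time-delay embedding (Equation (3)): each row is [s(t), s(t+tau), ..., s(t+(m-1)tau)]."""
    n_vectors = len(s) - (m - 1) * tau
    return np.column_stack([s[i * tau : i * tau + n_vectors] for i in range(m)])

s = np.sin(np.linspace(0, 8 * np.pi, 300))
V = delay_embed(s, m=3, tau=4)   # shape (292, 3): 300 - (3 - 1) * 4 delay vectors of dimension 3
```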

3.1. Recurrence Plot (RP) Generation

After estimation of the embedding parameters, a time series can be converted to a recurrence plot. The recurrence plot is an array of dots in an n × n square, where a dot is placed at (i, j) whenever $x_j$ is sufficiently close to $x_i$. By choosing an embedding dimension m, the m-dimensional orbit of $x_i$ can be constructed by the method of time delays. Then $r_i$ is chosen such that the ball of radius $r_i$ centred at $x_i$ in $\mathbb{R}^m$ contains a reasonable number of other points of the orbit. Finally, a dot is plotted at each point (i, j) for which $x_j$ lies in the ball of radius $r_i$ centred at $x_i$, and the generated image is called the recurrence plot. The practical steps of generation are:
  • Estimation of proper embedding parameters m and τ .
  • Embedding of time series data with Equation (3).
  • Calculation of the Euclidean distance matrix $D_{i,j} = \mathrm{dist}(v_i, v_j)$.
  • The square distance matrix is finally converted to a grey-scale image as the input to the CNN for classification.
The generated square matrix is symmetric across the diagonal; the lower-left and upper-right triangular parts contain the same information.
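A sketch of these four steps, with m and τ treated as already-estimated constants and the mapping of distances to grey levels as an assumption about the image conversion, could look like this:

```python
import numpy as np

def rp_image(s, m=3, tau=4):
    """Grey-scale recurrence image: delay-embed the series (Equation (3)), compute the
    pairwise Euclidean distance matrix D, and rescale it to 8-bit grey levels."""
    n = len(s) - (m - 1) * tau
    V = np.column_stack([s[i * tau : i * tau + n] for i in range(m)])
    diff = V[:, None, :] - V[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))                 # D[i, j] = ||v_i - v_j||
    D = (D - D.min()) / (D.max() - D.min() + 1e-12)       # normalize distances to [0, 1]
    return (255 * D).astype(np.uint8)

img = rp_image(np.sin(np.linspace(0, 8 * np.pi, 300)))
# The image would then be resized (e.g., with PIL or OpenCV) to the fixed input size of the CNN.
```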

3.2. Proposed Modified Recurrence Plot (Recurrence Plot Raw RP1)

In our previous study [48] of time series classification on 85 benchmark data sets from the UCR repository, using a convolutional neural network (CNN) similar to the one used in Reference [39], it was found that, for most of the data sets, the two-dimensional recurrence plot representation of the input data with a CNN produces better classification accuracy than one-dimensional raw time series data with a 1NN classifier and Euclidean distance or DTW as the similarity measure. The simulation study was done with recurrence plots generated for different m and τ values. However, the following issues were noticed that need further consideration.
  • It was found that for some data sets it was possible to improve classification accuracy by tuning the parameters m and τ, while for other data sets tuning did not work. As an explanation, it is assumed that, during generation of the recurrence plot, if the change in the time series is small, the distance values in the matrix become close to zero, resulting in poor classification accuracy; those types of time series are better classified with the 1NN classifier and the DTW measure using the raw time series.
  • Due to the symmetry of the square recurrence plot image across the diagonal, only one triangular part is needed to represent the data; the other part is redundant and only increases the computational burden.
  • The computational cost increases with the size of the input image, so the recurrence plot image should be the smallest size that preserves the characteristic pattern of the time series for classification; resizing of the input image is therefore needed to reduce the computational burden.
To alleviate the above points, a modified image representation of the input data is proposed here in which one triangular half of the square image retains the recurrence plot of the input data and the other half contains information from the raw data, removing the redundancy in the input image representation and allowing different types of time series to be classified with similar accuracy. Finally, the image is resized, and it is checked that the resizing does not affect classification accuracy. The steps of generation of the transformed image are summarized below and shown in Figure 2.
  • Estimation of proper embedding parameters m and τ .
  • Embedding of time series data with Equation (3).
  • Calculation of the Euclidean distance matrix $D_{i,j} = \mathrm{dist}(v_i, v_j)$.
  • Normalization of the distance values to lie between 0.0 and 1.0 to form the square matrix A.
  • Another square matrix B is formed with the original time series values shifted by τ. Suppose the normalized original time series is represented by S, consisting of 11 points; its distribution in a square matrix B with τ = 2 is shown in the left square of the figure.
  • The final square matrix F is designed by combining the recurrence plot information in A with the raw-data information in B: the upper triangle of F (excluding the diagonal) takes the upper triangle of the recurrence plot matrix A, and the lower triangle (including the diagonal) takes the lower triangle of the raw-series matrix B, as shown in the right square of the figure.
  • Finally, the square matrix F is converted to an image (RP1) and resized to a suitable size as the representation of the time series.
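A sketch of this construction under our reading of the steps above is given below; the exact layout of the shifted raw values in B follows Figure 2, so the row-wise wrap-around layout used here, as well as the values of m and τ, are only illustrative assumptions.

```python
import numpy as np

def rp1_image(s, m=3, tau=4):
    """Sketch of the proposed RP1 representation: one triangle holds the normalized
    recurrence distances (matrix A), the other holds tau-shifted raw values (matrix B)."""
    n = len(s) - (m - 1) * tau
    V = np.column_stack([s[i * tau : i * tau + n] for i in range(m)])
    diff = V[:, None, :] - V[None, :, :]
    A = np.sqrt((diff ** 2).sum(axis=-1))
    A = (A - A.min()) / (A.max() - A.min() + 1e-12)          # normalized distance matrix A

    s_norm = (s - s.min()) / (s.max() - s.min() + 1e-12)
    B = np.empty((n, n))
    for i in range(n):                                        # row i: series shifted by i * tau
        B[i] = s_norm[(np.arange(n) + i * tau) % len(s)]      # layout is an assumption (cf. Figure 2)

    F = np.triu(A, k=1) + np.tril(B)                          # RP above the diagonal, raw data below
    return (255 * F).astype(np.uint8)

img = rp1_image(np.sin(np.linspace(0, 8 * np.pi, 300)))
```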

Recurrence Plot DTW (RP2)

In another version of the time series representation, step 3 of the recurrence plot algorithm, the distance matrix calculation, is modified so that dynamic time warping (DTW) is used. The DTW distance matrix $DTW(i,j)$ (the distance between the points $p_i$ and $q_j$ of the two time series under the best alignment) is obtained by the following Algorithm 1.
Algorithm 1: Calculation of DTW
  for i = 0 to n do
    for j = 0 to l do
      Cost = D(p_i, q_j)
      DTW(i, j) = Cost + min(DTW(i-1, j), DTW(i, j-1), DTW(i-1, j-1))
    end for
  end for
  return DTW(n, l)
D(p_i, q_j) represents the Euclidean distance between p_i and q_j.
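For reference, a minimal dynamic-programming sketch of this computation, using the standard cumulative-cost recurrence with no warping-window constraint, is:

```python
import numpy as np

def dtw_distance(p, q):
    """Dynamic time warping distance between 1-D series p and q (cf. Algorithm 1)."""
    n, l = len(p), len(q)
    D = np.full((n + 1, l + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, l + 1):
            cost = abs(p[i - 1] - q[j - 1])        # pointwise (Euclidean) distance
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, l]

print(dtw_distance(np.array([0.0, 1.0, 2.0]), np.array([0.0, 2.0])))
```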

3.3. Classification by FCN and ResNet

In this work, the fully convolutional neural network (FCN) and the residual network (ResNet) have been used for time series classification. The basic structure of the FCN used is shown in Figure 3. It consists of the input layer followed by two sets of convolutional and max-pooling layers, two fully connected layers and the output layer. The number of neurons in the first fully connected layer depends on the input image size (input image size × feature map size), and the second fully connected layer has 512 neurons. We used three input image sizes: 70 × 70, 100 × 100 and 200 × 200. The detailed parameters, chosen after some trial and error with the model, are shown in Table 1. The basic structure of the ResNet used in this work is the same as in Reference [49] and is shown in Figure 4. The input image size for ResNet is restricted to 50 × 50 for all time series.
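A Keras-style sketch of this network, wired with the hyperparameters of Table 1, might look like the following; the input size, number of classes, padding and optimizer choice are illustrative assumptions, not the exact training script used in the experiments.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_cnn(input_size=70, n_classes=10):
    """Sketch of the classifier of Section 3.3: two (conv + max-pooling) blocks,
    two fully connected layers and a softmax output, with the settings of Table 1."""
    return tf.keras.Sequential([
        layers.Conv2D(64, kernel_size=3, strides=1, activation="relu", padding="same",
                      input_shape=(input_size, input_size, 1)),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(32, kernel_size=3, strides=1, activation="relu", padding="same"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),                        # size depends on the input image size
        layers.Dense(512, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(n_classes, activation="softmax"),
    ])

model = build_cnn()
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.002),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=200, ...) would then train for the 200 epochs of Table 1.
```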

4. Comparative Study and Simulation Experiments

The proposed approaches based on FCN and ResNet with three types of recurrence plot-based data representation (RP, RP1 and RP2) for time series classification have been evaluated with benchmark data sets from the UCR archive. A comparative study has been done to verify the classification efficiency of the proposed approaches against some other popular and successful approaches for TSC. We selected the following classification approaches for the comparative study.
  • 1NN classifier with Euclidean distance as the similarity measure using raw time series. This is the simplest approach and has the lowest computational cost. However, this approach cannot be used to compare two time series of unequal length.
  • 1NN classifier with DTW (dynamic time warping) as the similarity measure between two time series. This is the most popular approach; it produces high classification accuracy but has high computational cost. The algorithm is presented in the previous section.
  • 1NN classifier with the longest common subsequence (LCSS) [50] as the similarity measure. LCSS is a variant of edit distance which also matches two time series by allowing them to stretch, like DTW. It has two parameters: ϵ, a matching threshold (two points from two time series are considered to match if their distance is less than ϵ), and δ, a warping threshold which controls the window size for matching. It is known to be more robust to noise and outliers than DTW.
  • Cross Translation Error (CTE), a similarity measure for two time series, was developed previously by one of the authors for the online signature verification problem and is based on the delay vector representation of time series. The details can be found in Reference [51]. It is computationally very light, although its classification accuracy is lower. The calculation process is described briefly here, and a short numerical sketch is given after this list.
    • Let $v_{s_i}(t)$ and $v_{s_e}(t)$ denote the m-dimensional delay vectors generated from time series $s_i(t)$ and $s_e(t)$ respectively, according to Equation (3).
    • A random vector $v_{s_i}(k)$ is picked from $v_{s_i}(t)$. Let the vector of $v_{s_e}(t)$ nearest to $v_{s_i}(k)$ be $v_{s_e}(k')$, where the index $k'$ of the nearest vector is defined as
      k' = \arg\min_t \| v_{s_i}(k) - v_{s_e}(t) \|.
    • For the vectors $v_{s_i}(k)$ and $v_{s_e}(k')$, the one-step transition along each orbit is calculated as
      V_{s_i}(k) = v_{s_i}(k+1) - v_{s_i}(k),
      V_{s_e}(k') = v_{s_e}(k'+1) - v_{s_e}(k').
    • The cross translation error $e_{cte}$ is calculated from $V_{s_i}(k)$ and $V_{s_e}(k')$ as
      e_{cte} = \frac{1}{2} \left( \frac{ | V_{s_i}(k) - \bar{V} | }{ | \bar{V} | } + \frac{ | V_{s_e}(k') - \bar{V} | }{ | \bar{V} | } \right),
      where $\bar{V}$ denotes the average of $V_{s_i}(k)$ and $V_{s_e}(k')$.
    • $e_{cte}$ is calculated L times for different random selections of $v_{s_i}(k)$, and the median of $e_{cte}^{i}$ $(i = 1, 2, \ldots, L)$ is taken as
      M(e_{cte}) = \mathrm{Median}(e_{cte}^{1}, \ldots, e_{cte}^{L}).
      The final cross translation error $E_{cte}$ is calculated by averaging over Q repetitions of the procedure, to suppress the statistical error introduced by the random sampling:
      E_{cte} = \frac{1}{Q} \sum_{i=1}^{Q} M_i(e_{cte}).
  • Time series bag of features (TSBF) is an extension of the time series forest (TSF) with multiple stages. The first stage generates a subseries classification problem and the second stage forms class probability estimates for each subseries. The third stage constructs a bag of features from these probabilities, and finally a random forest classifier is built on the bag-of-features representation. The details can be found in Reference [28].
  • We also used one-dimensional FCN and ResNet with raw time series data for classification, to compare the 2D recurrence plot approach against 1D raw time series input. Due to limited computational resources while implementing ResNet, we compressed the time series for recurrence map generation; for a fair comparison, we used the same compressed time series for the one-dimensional versions of FCN and ResNet.
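The sketch below illustrates the CTE computation described above under our reading of the steps; the embedding parameters, L, Q and the test signals are placeholder choices, and the clipping of the nearest-neighbour index is a small implementation convenience.

```python
import numpy as np

def delay_embed(s, m, tau):
    n = len(s) - (m - 1) * tau
    return np.column_stack([s[i * tau : i * tau + n] for i in range(m)])

def cross_translation_error(s_i, s_e, m=3, tau=2, L=20, Q=5, rng=None):
    """Illustrative sketch of the CTE similarity measure between two time series."""
    rng = np.random.default_rng() if rng is None else rng
    vi, ve = delay_embed(np.asarray(s_i), m, tau), delay_embed(np.asarray(s_e), m, tau)
    medians = []
    for _ in range(Q):
        errors = []
        for _ in range(L):
            k = rng.integers(0, len(vi) - 1)                     # random vector from the first series
            kp = np.argmin(np.linalg.norm(vi[k] - ve, axis=1))   # nearest vector in the second series
            kp = min(kp, len(ve) - 2)                            # keep a one-step transition available
            Vi, Ve = vi[k + 1] - vi[k], ve[kp + 1] - ve[kp]      # one-step transitions along each orbit
            Vbar = (Vi + Ve) / 2.0
            denom = np.linalg.norm(Vbar) + 1e-12
            errors.append(0.5 * (np.linalg.norm(Vi - Vbar) + np.linalg.norm(Ve - Vbar)) / denom)
        medians.append(np.median(errors))
    return float(np.mean(medians))

e = cross_translation_error(np.sin(np.linspace(0, 8 * np.pi, 200)),
                            np.sin(np.linspace(0, 8 * np.pi, 200) + 0.3))
```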

4.1. Dataset Used

The simulation experiments were done with the benchmark datasets from the UCR/UEA time series classification archive [52]. We used 85 data sets, details of which are presented on the archive website. The data sets contain time series of various characteristics: lengths range from 24 to 2709, and the number of classes varies from 2 to 60. Some data sets have a very small training set. The data sets are collected from different application domains and can be divided into seven categories: Image Outline (29), Sensor Readings (16), Motion Capture (14), Spectrographs (7), ECG Measurements (7), Electric Device Profiles (6) and Simulated Data (6), where the numbers in brackets give the number of data sets in each category.

4.2. Simulation Experiments

The following simulation experiments for time series classification were carried out on the benchmark data sets, using the training and test splits provided with the original data sets and 10-fold cross validation for each classifier. For the convolutional neural networks, some trial-and-error experiments were done to find appropriate hyperparameter settings; the hyperparameters were set to the best-performing values, which are reported in the next section. For ResNet, due to time limitations, we used previously reported parameters.
  • FCN classifier with three types of recurrence plot representation: RP, the original one; RP2, in which DTW is used for the distance calculation of the recurrence plot; and RP1, our proposed modified recurrence plot in which raw data is combined with the recurrence plot.
  • The above experiments are repeated with ResNet with the same three types of recurrence plots.
  • Experiments were done with the nearest neighbour classifier with Euclidean distance and with DTW, using the original raw time series.
  • 1NN classifier with edit distance-based approaches, LCSS (longest common subsequence), TWED (time warped edit distance) and MSM (Move-Split-Merge), are used for classification using the original raw time series.
  • Cross Translation Error (CTE), based on the concept of the multidimensional delay vector representation, with a 1NN classifier.
  • A feature-based approach TSBF with random forest classifier is used.
We attempted to implement ensemble-based algorithms on the data sets but due to lack of proper computing resources, we restricted our comparative study to non-ensemble algorithms.

5. Simulation Results and Analysis

Table 2 presents the classification accuracies on the 85 data sets with the different classification approaches. In all tables, the highest value in every row is presented in bold and represents the best classification accuracy obtained for the particular data set. Column 1 lists the data sets, and column 2 gives the classification accuracies of FCN with the traditional recurrence plot, similar to the work presented in Reference [39]. Columns 3, 4 and 5 give the classification accuracies for FCN with recurrence plot RP2, FCN with the proposed modified recurrence plot RP1 and ResNet with RP1, respectively. We found that RP1 produces better classification accuracies than RP and RP2, so we did not present (RP2 + ResNet) results. The remaining columns give the classification accuracies for Euclidean distance, DTW, LCSS, CTE and TSBF. We did not include the results of TWED and MSM, as they have poor classification accuracies compared to the methods presented in the table. It is found that no algorithm is best for all the data sets. Although TSBF produces the best classification accuracy for most of the data sets, its average classification accuracy over the 85 data sets is not the highest among all the methods. Our proposed method (RP1 + ResNet) achieves the highest average classification accuracy over the 85 data sets. RP2 uses DTW for the distance calculation, which increases the computational cost as well as the accuracy for some of the data sets; as a whole, the increase in classification accuracy is not significant compared to the increase in computational cost. However, our proposed modification of the recurrence plot, RP1, seems to have the strongest effect on classification accuracy, and this modification does not increase the computational cost. From this table, TSBF, RP1 + FCN and RP1 + ResNet appear to be the effective classifiers.
Table 3 presents the comparison of classification accuracies of the different one-dimensional deep networks with raw time series input and the two-dimensional deep network-based algorithms with recurrence plot input. We excluded TSBF here to focus on the results of the recurrence plot-based methods. Column 4 and column 6 give the results of the 1D convolutional neural network and 1D ResNet, respectively. It is found that the results of the two-dimensional deep networks with recurrence plot input are far better than those of the one-dimensional deep networks with raw time series input for most of the data sets. From this table it is also found that the classification accuracy of column 7 (RP1 + ResNet) is the highest for most of the data sets. It can be concluded that ResNet with our proposed modified recurrence plot input produces the best average classification accuracy and the highest classification accuracy for most of the data sets. Also, the variability of the classification accuracies among different data sets is the lowest (the same as DTW).
Table 4 compares the classification performance of our proposed classification algorithm, ResNet with the modified recurrence plot RP1 as input (RP1 + ResNet), the best among all recurrence plot-based algorithms, with TSBF, the best classifier among all the others considered in this work (where the recurrence plot is not used). It is clearly seen that our proposed method gives better classification performance for a greater number of data sets than TSBF.
Table 5 displays the average classification accuracies for each category of time series with all the algorithms. It is clear from Table 5 that ResNet with our proposed modified recurrence plot RP1 is the best algorithm for four of the seven categories of time series data, while one category (Spectro) has the highest classification accuracy with (RP1 + FCN). If we consider only the algorithm (RP1 + ResNet) for comparison, it produces the highest classification accuracy for five of the seven categories because, for the Spectro category, this algorithm has the second best value. Only for two categories, Device and Motion, did TSBF outperform our proposed method. It is also found that ResNet with RP1 has the best performance among all the recurrence plot-based methods (as noted in the average classification accuracy over the 85 data sets in the table). This is also evident in Figure 5, in which the recurrence plot-based methods are compared; recurrence plot denotes RP, recurrence plot with DTW denotes RP2 and recurrence plot raw denotes RP1. It is seen from Figure 5 that (RP1 + ResNet) has the best classification accuracy for five categories and is not significantly different from (RP1 + FCN) in the other two categories.
To test the statistical significance of the differences between approaches, we followed the methodology described in Reference [53]. Critical difference (CD) plots of the different algorithms, for the individual 85 data sets and for the 7 categories of data sets, are presented in Figure 6 and Figure 7 respectively, in which recurrence plot denotes RP, recurrence plot with DTW denotes RP2 and recurrence plot raw denotes RP1. In both figures, it is seen that our proposed approach (RP1 + ResNet) ranks higher than the other approaches. From Figure 6, it is seen that the algorithm (RP1 + ResNet) has the highest rank and that there is no significant difference between the algorithms (RP1 + ResNet) and (RP1 + FCN), which are significantly better than the other algorithms. The same conclusion can be drawn from Figure 7.
Regarding computational cost, it is difficult to compare all the algorithms by implementing them on the same platform. Needless to say, the parameter search of deep neural network architectures takes time, and our reported results might not correspond to the most optimized architecture. On the other hand, for NN-DTW, the warping window size has a considerable effect on the final accuracy, and we did not put significant effort into searching for the best warping window. As a rough comparison, our proposed representation technique based on the recurrence plot and deep networks considerably improved classification accuracy without incurring additional computational cost compared to other popular non-ensemble and deep network-based algorithms.

6. Conclusions

In this paper, the effect of time series data representation methods on time series classification problems, in terms of increased classification accuracy with affordable computational cost and interpretability, has been studied. Our study focused mainly on recurrence plot-based representations of time series for use with deep network-based classifiers. Since reported results show that, among the several deep network architectures, CNN and ResNet perform better than others on time series classification problems, the fully convolutional network (FCN) and the residual network (ResNet) have been used in our work. A new modified recurrence plot representation of time series has been proposed which judiciously includes information from the raw time series in the recurrence plot framework, without much additional computational cost, to improve classification accuracy.
The use of the recurrence plot as the input representation increases the interpretability of the classification method compared to raw time series input. Deep networks are known to be black boxes which inherently extract the features of the time series for grouping. Although this is convenient, the process is invisible to users. Recurrence plots are more visually interpretable than raw time series, and humans can later use the results of classification by the deep network to establish a correlation between the structure of the recurrence plot and the categories of time series. The interpretability of the classification process can also be increased by extracting explicit features from the time series and then classifying the time series by those features, which allows users to relate the classes to the characteristics of the time series. However, the proper selection of a feature set is important for efficient classification, and there is no general way to do that.
In our work, the modification of the recurrence plot by mixing information from the raw time series data into the recurrence plot allows us to consider static and dynamic features of the time series simultaneously and extends the use of the recurrence plot to a wide variety of time series data. Due to computational resource limitations, we optimized the size of the recurrence plot in such a way that the computational limitations could be overcome without much degradation of classification accuracy, and we selected a 50 × 50 image size as input to ResNet for all time series, irrespective of their original length, according to our computational environment. The computational cost of deep network-based approaches with original time series input increases with the length of the time series, as the network complexity (number of parameters) increases, which in turn increases the training time. Our approach is an attempt to balance computational burden and classification accuracy, with general applicability to different types of time series, and also to add interpretability. Of course, increasing the size of the recurrence plot might increase the classification accuracy for some time series. We did not tune the size individually for all the time series at this stage; there is scope for further improvement of classification accuracy at the cost of more computation time.
A comparative study has been done with some of the state-of-the-art algorithms, and it was found that our proposed approach can produce better classification accuracy for most of the data sets. For comparison, we did not include ensemble algorithms; although ensemble algorithms produce better classification accuracy, the computational cost of finding the proper combination is too high. It has been found from the comparative study that our proposed algorithm performs better than popular traditional non-ensemble algorithms on time series datasets from most of the domains available in the benchmark data set repository.

Author Contributions

Conceptualization, B.C.; Investigation, K.N.; Methodology, K.N.; Software, K.N.; Writing—original draft, K.N.; Writing—review & editing, B.C.

Funding

This research received no external funding.

Acknowledgments

This research was supported by PRML laboratory of Department of Software and Information Science, Iwate Prefectural University, Japan.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Esling, P.; Agon, C. Time series data mining. ACM Comput. Surv. 2012, 45, 12.1–12.34. [Google Scholar] [CrossRef] [Green Version]
  2. Tamilarasi, K.; Nithya Kalyani, S. A survey on signature verification based algorithms. In Proceedings of the IEEE International Conference on Electrical, Instrumentation and Communication Engineering (ICEICE), Karur, India, 27–28 April 2017; pp. 1–3. [Google Scholar]
  3. Wang, J.; Liu, P.; She, M.F.H.; Nahavandi, S.; Kouzani, A. Bag-of-words representation for biomedical time series classification. Biomed. Signal Process. Control 2013, 8, 634–644. [Google Scholar] [CrossRef] [Green Version]
  4. Fisher, T.; Krauss, C. Deep Learning with Long Short-Term Memory Networks for Financial Market Predictions. Eur. J. Oper. Res. 2018, 270, 654–669. [Google Scholar] [CrossRef] [Green Version]
  5. Lara, O.D.; Labrador, M. A survey on human activity recognition using wearable sensors. IEEE Commun. Surv. Tutor. 2013, 15, 1192–1209. [Google Scholar] [CrossRef]
  6. Singh, D.; Merdivan, E.; Psychoula, I.; Kropf, J.; Hanke, S.; Geist, M.; Holzinger, A. Human activity recognition using recurrent neural networks. In Machine Learning and Knowledge Extraction; Lecture Notes in Computer Science LNCS 10410; Springer/Nature: London, UK, 2017; pp. 267–274. [Google Scholar] [CrossRef] [Green Version]
  7. Kini, B.V.; Sekhar, C.C. Large margin mixture of AR models for time series classification. Appl. Soft Comput. 2013, 13, 361–371. [Google Scholar] [CrossRef]
  8. Antonucci, A.; De Rosa, R.; Giusti, A.; Giusti, A.; Cuzzolin, F. Robust classification of multivariate time series by imprecise hidden Markov models. Int. J. Approx. Reason. 2015, 56, 249–263. [Google Scholar] [CrossRef]
  9. Kim, S.B.; Han, K.S.; Rim, H.C.; Myaeng, S.H. Some effective techniques for naive bayes text classification. IEEE Trans. Knowl. Data Eng. 2006, 18, 1457–1466. [Google Scholar]
  10. Lal, T.N.; Schroder, M.; Hinterberger, T.; Weston, J.; Bogdan, M.; Birbaumer, N.; Scholkopf, B. Support Vector Channel Selection in BCI. IEEE Trans. Biomed. Eng. 2004, 51, 1003–1010. [Google Scholar] [CrossRef] [Green Version]
  11. Chakraborty, B. Feature selection and classification techniques for multivariate time series. In Proceedings of the Second International Conference on Innovative Computing, Information and Control (ICICIC 2007), Kumamoto, Japan, 5–7 September 2007. [Google Scholar]
  12. Ye, L.; Keogh, E. Time series shapelets: A new primitive for data mining. In Proceedings of the ACM SIGKDD International Conference of Knowledge Discovery and Data Mining, Paris, France, 28 June–1 July 2009; pp. 947–956. [Google Scholar]
  13. Yoon, H.; Yang, K.; Sahabi, C. Feature subset selection and feature ranking for multivariate time series. IEEE Trans. Knowl. Data Eng. 2005, 17, 1186–1198. [Google Scholar] [CrossRef] [Green Version]
  14. Berndt, D.J.; Clifford, J. Using dynamic time warping to find patterns in time series. In Proceedings of the 3rd International Conference on Knowledge Discovery and Data Mining (AAAIWS 94), Seattle, WA, USA, 31 July–1 August 1994; pp. 359–370. [Google Scholar]
  15. Wang, X.; Mueen, A.; Ding, H.; Trajcevski, G.; Scheuermann, P.; Keogh, E. Experimental comparison of representation methods and distance measures for time series data. Data Min. Knowl. Discov. 2013, 26, 275–309. [Google Scholar] [CrossRef] [Green Version]
  16. Lines, J.; Bagnall, A. Time series classification with ensembles of elastic distance measures. Data Min. Knowl. Discov. 2015, 29, 565–592. [Google Scholar] [CrossRef]
  17. Bagnall, A.; Lines, J.; Hills, J.; Bostrom, A. Time series classification with COTE: The collective of transform-based ensembles. IEEE Trans. Knowl. Data Eng. 2015, 27, 2522–2535. [Google Scholar] [CrossRef]
  18. Lines, J.; Taylor, S.; Bagnall, A. Time series classification with HIVE-COTE: The hierarchical vote collective of transformation based ensembles. ACM Trans. Knowl. Discov. Data 2018, 12, 52:1–52:35. [Google Scholar] [CrossRef] [Green Version]
  19. Bagnall, A.; Bostrom, A.; Large, J.; Lines, J. The great time series classification bake off: A review and experimental evaluation of recent algorithmic advances. Data Min. Knowl. Discov. 2017, 31, 606–660. [Google Scholar] [CrossRef] [Green Version]
  20. Fawaz, H.I.; Forestier, G.; Weber, J.; Idoumghar, L.; Muller, P.A. Deep Learning for time series classification: A review. Data Min. Knowl. Discov. 2019, 33, 917–963. [Google Scholar] [CrossRef] [Green Version]
  21. Sadouk, L. CNN Approaches for Time Series Classification. Convolutional Neural Netw. 2018. [Google Scholar] [CrossRef] [Green Version]
  22. Wang, Z.; Yan, W.; Oates, T. Time series classification from scratch with deep neural networks: A strong base line. In Proceedings of the IEEE IJCNN, Anchorage, AK, USA, 14–19 May 2017; pp. 1578–1585. [Google Scholar]
  23. Borovkova, S.; Tsiamas, S. An ensemble of LSTM neural networks for high-frequency stock market classification. J. Forecast. 2019, 38, 600–619. [Google Scholar] [CrossRef] [Green Version]
  24. Wang, Z.; Oates, T. Imaging time series to improve classification and imputation. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Buenos Aires, Argentina, 25–31 July 2015; pp. 3939–3945. [Google Scholar]
  25. Wang, Z.; Oates, T. Encoding time series as images for visual inspection and classification using tiled convolutional neural networks. In Proceedings of the AAAI Conference, Austin, TX, USA, 25–30 January 2015; pp. 40–46. [Google Scholar]
  26. Wang, Z.; Oates, T. Spatially encoding temporal correlations to classify temporal data using convolutional neural networks. arXiv 2015, arXiv:1509.07481v1. [Google Scholar]
  27. Eckmann, J.; Kamphorst, S.; Ruelle, D. Recurrence plots of dynamical systems. EPL (EuroPhys. Lett.) 1987, 4, 973–977. [Google Scholar] [CrossRef] [Green Version]
  28. Baydogan, M.G.; Runger, G.; Tuv, E. A Bag-of-Features Framework to classify Time Series. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 2796–2802. [Google Scholar] [CrossRef]
  29. Nanopoulos, A.; Alcock, R.; Manolopoulos, Y. Feature- based classification of time-series data. In Information Processing and Technology; Nova Science Publishers, Inc.: New York, NY, USA, 2001; pp. 49–61. [Google Scholar]
  30. Timmer, J.; Gantert, C.; Deuschl, G.; Honerkamp, J. Characteristics of hand tremor time series. Biol. Cybern. 1993, 70, 75–80. [Google Scholar] [CrossRef] [PubMed]
  31. Morchen, F. Time Series Feature Extraction for Data Mining Using DWT and DFT; Technical report; Phillips University Marburg: Marburg, Germany, 2003. [Google Scholar]
  32. Wang, X.; Smith, K.; Hyndman, R. Characteristic based clustering for time series. Data Min. Knowl. Discov. 2006, 13, 335–364. [Google Scholar] [CrossRef]
  33. LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  34. Zheng, Y.; Liu, Q.; Chen, E.; Ge, Y.; Zhao, J.L. Exploiting multichannels deep convolutional neural networks for multivariate time series classification. Front. Comput. Sci. 2016, 10, 96–112. [Google Scholar] [CrossRef]
  35. Cui, Z.; Chen, W.; Chen, Y. Multi-scale convolutional neural network for time series classification. arXiv 2016, arXiv:1603.06995. [Google Scholar]
  36. Wang, W.; Chen, C.; Wang, W.; Rai, P.; Carin, L. Earliness- aware deep convolutional networks for early time series classification. arXiv 2016, arXiv:1611.04578. [Google Scholar]
  37. Karim, F.; Majumdar, S.; Darabi, H.; Chen, S. LSTM Fully Convolutional Networks for Time Series Classification. IEEE Access 2018, 6, 1662–1669. [Google Scholar] [CrossRef]
  38. Silva, D.F.; Batista, G.E. Time Series Classification Using Compression Distance of Recurrence Plots. In Proceedings of the IEEE 13th International Conference on Data Mining, Dallas, TX, USA, 7–10 December 2013; pp. 687–696. [Google Scholar]
  39. Hatami, N.; Gavet, Y.; Debayale, J. Classification of time series images using deep convolutional neural networks. In Proceedings of the International conference on machine vision (ICMV), Vienna, Austria, 13–15 November 2017. [Google Scholar]
  40. Hatami, N.; Gavet, Y.; Debayale, J. Bag of recurrence patterns representations for time series classification. Pattern Anal. Appl. 2018, 22, 877–887. [Google Scholar] [CrossRef] [Green Version]
  41. Michael, T.; Spiegel, S.; Albayrak, S. Time Series Classification using Compressed Recurrence Plots. In Proceedings of the NFMCP Workshop @ ECML-PKDD 2015, Porto, Portugal, 7 September 2015; pp. 178–187. [Google Scholar]
  42. Spiegel, S.; Marwan, N. Time and Again: Time Series Mining via Recurrence Quantification Analysis. In Proceedings of the ECML PKDD, Rive del Garda, Italy, 19–23 September 2016. [Google Scholar]
  43. Spiegel, S.; Jain, B.J.; Albayrak, S. A Recurrence Plot-Based Distance Measures. In Translational Recurrences. Springer Proceedings in Mathematics & Statistics; Marwan, N., Riley, M., Giuliani, A., Webber, C., Jr., Eds.; Springer: Cham, Switzerland, 2014; Volume 103. [Google Scholar]
  44. Spiegel, S.; Albayrak, S. An order-invariant time series distance measure-Position on recent developments in time series analysis. In Proceedings of the International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (KDIR), Barcelona, Spain, 4–7 October 2012. [Google Scholar]
  45. Spiegel, S. Discovery of driving behavior patterns. In Smart Information Services; Computational Intelligence for Real-Life Applications; Springer: Berlin/Heidelberg, Germany, 2015. [Google Scholar]
  46. Alligood, K.T.; Sauer, T.; Yorke, J. Chaos: An Introduction to Dynamical Systems; Springer: New York, NY, USA, 1997. [Google Scholar]
  47. Aberbanel, H.D.I. Analysis of Observed Chaotic Data; Springer: New York, NY, USA, 1996. [Google Scholar]
  48. Nakano, K.; Chakraborty, B. Effect of Data Representation Method for Effective Mining of Time Series Data. In Proceedings of the 2019 IEEE International Conference on Big Data and Smart Computing (BigComp), Kyoto, Japan, 27 February–2 March 2019; pp. 1–6. [Google Scholar]
  49. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference of Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  50. Vlachos, M.; Gunopoulos, D.; Kollios, G. Discovering Similar Multidimensional Trajectories. In Proceedings of the 18th International Conference on Data Engineering, Washington, DC, USA, 26 February–1 March 2002; pp. 673–684. [Google Scholar]
  51. Manabe, Y.; Chakraborty, B. Identity Detection from Online Handwriting Time Series. In Proceedings of the SMCia08, Muroran, Japan, 25–27 June 2008; pp. 365–370. [Google Scholar]
  52. Bagnall, A.; Lines, J. The UEA TSC Website. Available online: http://timeseriesclassification.com (accessed on 6 December 2019).
  53. Demsar, J. Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 2006, 7, 1–30. [Google Scholar]
Figure 1. Approach of time series data classification.
Figure 2. Generation of modified recurrence plot.
Figure 3. Basic structure of the FCN used.
Figure 4. Basic structure of the ResNet used.
Figure 5. Category-based classification accuracies.
Figure 6. Critical difference plot for different classifiers (individual 85 data sets).
Figure 7. Critical difference plot for different classifiers (7 categories of data sets).
Table 1. Parameters of FCN.
Parameters | Value
Epoch | 200
Drop Out | 0.5
Learning rate | 0.002
Activation function | ReLU
Kernel size of convolution layer | 3
Stride | 1
Size of max pooling | 2 × 2
Feature map of first convolution | 64
Feature map of second convolution | 32
Table 2. Classification Accuracies with Different Algorithms.
Datasets | RP + FCN | RP2 + FCN | RP1 + FCN | RP1 + ResNet | EUCLID | DTW | LCSS | CTE | TSBF
50words | 0.657 | 0.679 | 0.675 | 0.635 | 0.631 | 0.690 | 0.635 | 0.301 | 0.776
Adiac | 0.711 | 0.627 | 0.742 | 0.652 | 0.611 | 0.604 | 0.028 | 0.412 | 0.291
ArrowHead | 0.629 | 0.600 | 0.640 | 0.829 | 0.800 | 0.703 | 0.423 | 0.594 | 0.841
Beef | 0.867 | 0.867 | 0.867 | 0.800 | 0.667 | 0.633 | 0.333 | 0.567 | 0.850
BeetleFly | 0.750 | 0.750 | 1.000 | 0.950 | 0.750 | 0.700 | 0.800 | 0.900 | 0.682
BirdChicken | 0.800 | 0.800 | 0.850 | 0.750 | 0.550 | 0.750 | 0.650 | 0.900 | 0.975
Car | 0.883 | 0.850 | 0.883 | 0.850 | 0.733 | 0.733 | 0.433 | 0.600 | 0.917
CBF | 0.999 | 0.999 | 0.998 | 0.999 | 0.852 | 0.997 | 0.943 | 0.689 | 0.787
ChlorineConcentration | 0.473 | 0.484 | 0.484 | 0.740 | 0.650 | 0.648 | 0.386 | 0.656 | 0.969
CinC_ECG_torso | 0.990 | 0.988 | 0.987 | 0.949 | 0.897 | 0.651 | 0.925 | 0.564 | 0.879
Coffee | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.536 | 0.857 | 0.676
Computers | 0.588 | 0.600 | 0.604 | 0.744 | 0.576 | 0.700 | 0.524 | 0.556 | 0.853
Cricket_X | 0.687 | 0.677 | 0.736 | 0.708 | 0.577 | 0.754 | 0.651 | 0.379 | 0.758
Cricket_Y | 0.703 | 0.703 | 0.715 | 0.669 | 0.567 | 0.744 | 0.649 | 0.372 | 0.600
Cricket_Z | 0.692 | 0.685 | 0.726 | 0.690 | 0.587 | 0.754 | 0.656 | 0.354 | 0.901
DiatomSizeReduction | 0.974 | 0.971 | 0.977 | 0.990 | 0.935 | 0.967 | 0.301 | 0.856 | 0.702
DistalPhalanxOutlineAgeGroup | 0.628 | 0.620 | 0.650 | 0.840 | 0.783 | 0.792 | 0.265 | 0.735 | 0.975
DistalPhalanxOutlineCorrect | 0.812 | 0.797 | 0.813 | 0.810 | 0.752 | 0.768 | 0.512 | 0.673 | 0.495
DistalPhalanxTW | 0.685 | 0.663 | 0.783 | 0.785 | 0.728 | 0.710 | 0.075 | 0.730 | 0.960
Earthquakes | 0.733 | 0.739 | 0.755 | 0.776 | 0.674 | 0.742 | 0.733 | 0.646 | 0.969
ECG200 | 0.950 | 0.960 | 0.940 | 0.910 | 0.880 | 0.770 | 0.880 | 0.800 | 0.930
ECG5000 | 0.753 | 0.705 | 0.734 | 0.941 | 0.925 | 0.924 | 0.933 | 0.913 | 0.618
ECGFiveDays | 0.987 | 0.973 | 0.981 | 0.972 | 0.797 | 0.768 | 0.943 | 0.727 | 0.692
ElectricDevices | 0.493 | 0.476 | 0.559 | 0.691 | 0.551 | 0.601 | 0.573 | 0.465 | 0.940
FaceAll | 0.462 | 0.459 | 0.463 | 0.801 | 0.714 | 0.808 | 0.751 | 0.504 | 0.860
FaceFour | 0.977 | 0.955 | 0.955 | 0.966 | 0.784 | 0.830 | 0.841 | 0.455 | 0.680
FacesUCR | 0.886 | 0.886 | 0.919 | 0.868 | 0.769 | 0.905 | 0.872 | 0.497 | 0.745
FISH | 0.914 | 0.880 | 0.931 | 0.880 | 0.783 | 0.823 | 0.149 | 0.406 | 0.514
FordA | 0.908 | 0.882 | 0.914 | 0.846 | 0.659 | 0.562 | 0.696 | 0.617 | 0.793
FordB | 0.809 | 0.760 | 0.855 | 0.749 | 0.558 | 0.594 | 0.618 | 0.552 | 0.782
Gun_Point | 0.967 | 0.967 | 0.973 | 0.980 | 0.913 | 0.907 | 0.733 | 0.913 | 0.881
Ham | 0.733 | 0.714 | 0.743 | 0.743 | 0.600 | 0.467 | 0.533 | 0.590 | 0.517
HandOutlines | 0.867 | 0.876 | 0.871 | 0.867 | 0.801 | 0.798 | 0.699 | 0.617 | 0.677
Haptics | 0.458 | 0.412 | 0.484 | 0.471 | 0.370 | 0.377 | 0.305 | 0.315 | 0.813
Herring | 0.656 | 0.641 | 0.656 | 0.641 | 0.516 | 0.531 | 0.594 | 0.563 | 0.804
InlineSkate | 0.382 | 0.355 | 0.393 | 0.356 | 0.342 | 0.384 | 0.220 | 0.291 | 0.825
InsectWingbeatSound | 0.639 | 0.638 | 0.658 | 0.564 | 0.562 | 0.355 | 0.570 | 0.145 | 0.860
ItalyPowerDemand | 0.974 | 0.964 | 0.976 | 0.972 | 0.955 | 0.950 | 0.801 | 0.878 | 0.709
LargeKitchenAppliances | 0.552 | 0.528 | 0.571 | 0.653 | 0.493 | 0.795 | 0.533 | 0.365 | 0.721
Lighting2 | 0.820 | 0.770 | 0.836 | 0.902 | 0.754 | 0.869 | 0.787 | 0.754 | 0.986
Lighting7 | 0.740 | 0.726 | 0.767 | 0.658 | 0.575 | 0.726 | 0.575 | 0.521 | 0.993
MALLAT | 0.949 | 0.951 | 0.953 | 0.918 | 0.914 | 0.934 | 0.541 | 0.609 | 0.780
Meat | 0.733 | 0.750 | 0.933 | 0.983 | 0.933 | 0.933 | 0.333 | 0.917 | 0.668
MedicalImages | 0.634 | 0.616 | 0.712 | 0.751 | 0.684 | 0.737 | 0.664 | 0.663 | 0.858
MiddlePhalanxOutlineAgeGroup | 0.545 | 0.540 | 0.523 | 0.765 | 0.740 | 0.750 | 0.270 | 0.555 | 0.400
MiddlePhalanxOutlineCorrect | 0.795 | 0.800 | 0.813 | 0.792 | 0.753 | 0.648 | 0.353 | 0.605 | 0.858
MiddlePhalanxTW | 0.569 | 0.539 | 0.564 | 0.599 | 0.561 | 0.584 | 0.404 | 0.581 | 0.688
MoteStrain | 0.863 | 0.857 | 0.887 | 0.844 | 0.879 | 0.835 | 0.859 | 0.908 | 0.535
NonInvasiveFatalECG_Thorax1 | 0.791 | 0.785 | 0.860 | 0.915 | 0.829 | 0.791 | 0.141 | 0.240 | 0.828
NonInvasiveFatalECG_Thorax2 | 0.804 | 0.796 | 0.864 | 0.931 | 0.880 | 0.865 | 0.253 | 0.294 | 0.770
OliveOil | 0.800 | 0.733 | 0.700 | 0.433 | 0.867 | 0.833 | 0.167 | 0.833 | 0.844
OSULeaf | 0.636 | 0.616 | 0.661 | 0.674 | 0.521 | 0.591 | 0.541 | 0.463 | 0.979
PhalangesOutlinesCorrect | 0.834 | 0.818 | 0.841 | 0.831 | 0.761 | 0.728 | 0.640 | 0.674 | 0.801
Phoneme | 0.071 | 0.066 | 0.083 | 0.190 | 0.109 | 0.228 | 0.140 | 0.195 | 0.714
Plane | 0.981 | 0.971 | 0.981 | 1.000 | 0.962 | 1.000 | 0.800 | 0.990 | 0.762
ProximalPhalanxOutlineAgeGroup | 0.693 | 0.668 | 0.644 | 0.844 | 0.785 | 0.805 | 0.429 | 0.820 | 0.770
ProximalPhalanxOutlineCorrect | 0.904 | 0.866 | 0.911 | 0.863 | 0.808 | 0.784 | 0.684 | 0.756 | 0.754
ProximalPhalanxTW | 0.688 | 0.665 | 0.755 | 0.793 | 0.707 | 0.737 | 0.450 | 0.710 | 0.936
RefrigerationDevices | 0.483 | 0.477 | 0.475 | 0.523 | 0.395 | 0.464 | 0.424 | 0.432 | 0.683
ScreenType | 0.376 | 0.363 | 0.392 | 0.456 | 0.360 | 0.397 | 0.360 | 0.413 | 0.832
ShapeletSim | 0.644 | 0.800 | 0.667 | 0.956 | 0.539 | 0.650 | 0.633 | 0.811 | 0.726
ShapesAll | 0.430 | 0.420 | 0.470 | 0.782 | 0.752 | 0.768 | 0.497 | 0.372 | 0.704
SmallKitchenAppliances | 0.541 | 0.515 | 0.547 | 0.587 | 0.344 | 0.643 | 0.299 | 0.491 | 0.875
SonyAIBORobotSurface | 0.882 | 0.870 | 0.884 | 0.940 | 0.696 | 0.725 | 0.712 | 0.714 | 0.680
SonyAIBORobotSurfaceII | 0.845 | 0.848 | 0.861 | 0.866 | 0.859 | 0.831 | 0.832 | 0.780 | 0.888
StarLightCurves | 0.974 | 0.964 | 0.976 | 0.972 | 0.849 | 0.907 | 0.827 | 0.903 | 0.721
Strawberry | 0.887 | 0.886 | 0.914 | 0.954 | 0.938 | 0.940 | 0.408 | 0.923 | 0.908
SwedishLeaf0.8350.8340.8940.9200.7890.7920.2960.6530.823
Symbols0.8970.8890.9310.9210.8990.9500.7710.8610.675
synthetic_control0.4900.4730.7230.9970.8800.9930.9400.6730.770
ToeSegmentation10.5960.6010.6710.8550.6800.7720.7110.7060.952
ToeSegmentation20.7230.7080.7770.9150.8080.8380.8540.8460.982
Trace0.9901.0001.0001.0000.7601.0000.6900.8200.778
TwoLeadECG0.9040.4831.0001.0000.9071.0000.9930.3700.786
Two_Patterns0.4920.9051.0000.9830.7470.9040.5160.8870.940
UWaveGestureLibraryAll0.9530.6350.9710.7940.7390.7270.6580.3970.775
uWaveGestureLibrary_X0.6540.6450.8240.7000.6620.6340.5860.3260.607
uWaveGestureLibrary_Y0.6610.6470.7520.7300.6500.6580.6160.3800.862
uWaveGestureLibrary_Z0.6640.9510.9720.9440.9480.8920.9480.2790.654
wafer0.9940.9940.9970.9980.9950.9800.9950.9700.495
Wine0.5930.7040.6480.8150.6110.5740.5000.7040.786
WordsSynonyms0.5990.5940.6070.6110.6180.6490.6100.2950.770
Worms0.4480.4590.5410.5300.3650.4640.3700.5250.985
WormsTwoClass0.6410.6520.6960.7070.5860.6630.5800.7020.994
yoga0.8630.8650.8690.8470.8300.8360.5970.6500.783
Average | 0.736 | 0.728 | 0.774 | 0.800 | 0.712 | 0.744 | 0.576 | 0.611 | 0.783
Win | 6 | 6 | 22 | 22 | 2 | 7 | 0 | 1 | 35
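The Average and Win rows of Table 2 (and the corresponding rows of Tables 3 and 4) summarize each column: the mean accuracy over the 85 data sets and the number of data sets on which a classifier attains the best accuracy. A brief pandas sketch of this summary is given below; the file name and column labels are placeholders, and ties are assumed to be credited to every classifier sharing the maximum.

```python
# Sketch of how the Average and Win rows of an accuracy table can be
# derived. "results.csv" and its layout are placeholders; the real
# per-data-set accuracies are those listed in Table 2.
import pandas as pd

df = pd.read_csv("results.csv", index_col="Datasets")   # rows: data sets, columns: classifiers

average = df.mean(axis=0)                                # mean accuracy per classifier
best = df.max(axis=1)                                    # best accuracy on each data set
wins = df.eq(best, axis=0).sum(axis=0)                   # ties credit every best classifier

summary = pd.DataFrame({"Average": average, "Win": wins})
print(summary.round(3))
```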
Table 3. Classification Accuracies for Deep Network-based Classifiers (TSBF excluded).
Datasets | RP + FCN | RP2 + FCN | 1D FCN | RP1 + FCN | 1D ResNet | RP1 + ResNet | EUCLID | DTW
50words | 0.657 | 0.679 | 0.613 | 0.675 | 0.652 | 0.635 | 0.631 | 0.690
Adiac | 0.711 | 0.627 | 0.671 | 0.742 | 0.644 | 0.652 | 0.611 | 0.604
ArrowHead | 0.629 | 0.600 | 0.443 | 0.640 | 0.713 | 0.829 | 0.800 | 0.703
Beef | 0.867 | 0.867 | 0.703 | 0.867 | 0.700 | 0.800 | 0.667 | 0.633
BeetleFly | 0.750 | 0.750 | 0.900 | 1.000 | 0.710 | 0.950 | 0.750 | 0.700
BirdChicken | 0.800 | 0.800 | 0.705 | 0.850 | 0.750 | 0.750 | 0.550 | 0.750
Car | 0.883 | 0.850 | 0.768 | 0.883 | 0.713 | 0.850 | 0.733 | 0.733
CBF | 0.999 | 0.999 | 0.963 | 0.998 | 0.846 | 0.999 | 0.852 | 0.997
ChlorineConcentration | 0.473 | 0.484 | 0.414 | 0.484 | 0.740 | 0.740 | 0.650 | 0.648
CinC_ECG_torso | 0.990 | 0.988 | 0.940 | 0.987 | 0.835 | 0.949 | 0.897 | 0.651
Coffee | 1.000 | 1.000 | 1.000 | 1.000 | 0.989 | 1.000 | 1.000 | 1.000
Computers | 0.588 | 0.600 | 0.541 | 0.604 | 0.584 | 0.744 | 0.576 | 0.700
Cricket_X | 0.687 | 0.677 | 0.669 | 0.736 | 0.636 | 0.708 | 0.577 | 0.754
Cricket_Y | 0.703 | 0.703 | 0.659 | 0.715 | 0.601 | 0.669 | 0.567 | 0.744
Cricket_Z | 0.692 | 0.685 | 0.686 | 0.726 | 0.645 | 0.690 | 0.587 | 0.754
DiatomSizeReduction | 0.974 | 0.971 | 0.971 | 0.977 | 0.878 | 0.990 | 0.935 | 0.967
DistalPhalanxOutlineAgeGroup | 0.628 | 0.620 | 0.584 | 0.650 | 0.779 | 0.840 | 0.783 | 0.792
DistalPhalanxOutlineCorrect | 0.812 | 0.797 | 0.794 | 0.813 | 0.725 | 0.810 | 0.752 | 0.768
DistalPhalanxTW | 0.685 | 0.663 | 0.769 | 0.783 | 0.720 | 0.785 | 0.728 | 0.710
Earthquakes | 0.733 | 0.739 | 0.608 | 0.755 | 0.779 | 0.776 | 0.674 | 0.742
ECG200 | 0.950 | 0.960 | 0.872 | 0.940 | 0.887 | 0.910 | 0.880 | 0.770
ECG5000 | 0.753 | 0.705 | 0.588 | 0.734 | 0.928 | 0.941 | 0.925 | 0.924
ECGFiveDays | 0.987 | 0.973 | 0.968 | 0.981 | 0.828 | 0.972 | 0.797 | 0.768
ElectricDevices | 0.493 | 0.476 | 0.387 | 0.559 | 0.657 | 0.691 | 0.551 | 0.601
FaceAll | 0.462 | 0.459 | 0.216 | 0.463 | 0.739 | 0.801 | 0.714 | 0.808
FaceFour | 0.977 | 0.955 | 0.865 | 0.955 | 0.765 | 0.966 | 0.784 | 0.830
FacesUCR | 0.886 | 0.886 | 0.845 | 0.919 | 0.802 | 0.868 | 0.769 | 0.905
FISH | 0.914 | 0.880 | 0.877 | 0.931 | 0.818 | 0.880 | 0.783 | 0.823
FordA | 0.908 | 0.882 | 0.836 | 0.914 | 0.753 | 0.846 | 0.659 | 0.562
FordB | 0.809 | 0.760 | 0.772 | 0.855 | 0.630 | 0.749 | 0.558 | 0.594
Gun_Point | 0.967 | 0.967 | 0.883 | 0.973 | 0.901 | 0.980 | 0.913 | 0.907
Ham | 0.733 | 0.714 | 0.685 | 0.743 | 0.741 | 0.743 | 0.600 | 0.467
HandOutlines | 0.867 | 0.876 | 0.853 | 0.871 | 0.808 | 0.867 | 0.801 | 0.798
Haptics | 0.458 | 0.412 | 0.427 | 0.484 | 0.399 | 0.471 | 0.370 | 0.377
Herring | 0.656 | 0.641 | 0.586 | 0.656 | 0.552 | 0.641 | 0.516 | 0.531
InlineSkate | 0.382 | 0.355 | 0.303 | 0.393 | 0.277 | 0.356 | 0.342 | 0.384
InsectWingbeatSound | 0.639 | 0.638 | 0.661 | 0.658 | 0.528 | 0.564 | 0.562 | 0.355
ItalyPowerDemand | 0.974 | 0.964 | 0.972 | 0.976 | 0.924 | 0.972 | 0.955 | 0.950
LargeKitchenAppliances | 0.552 | 0.528 | 0.462 | 0.571 | 0.633 | 0.653 | 0.493 | 0.795
Lighting2 | 0.820 | 0.770 | 0.726 | 0.836 | 0.716 | 0.902 | 0.754 | 0.869
Lighting7 | 0.740 | 0.726 | 0.716 | 0.767 | 0.575 | 0.658 | 0.575 | 0.726
MALLAT | 0.949 | 0.951 | 0.951 | 0.953 | 0.837 | 0.918 | 0.914 | 0.934
Meat | 0.733 | 0.750 | 0.892 | 0.933 | 0.763 | 0.983 | 0.933 | 0.933
MedicalImages | 0.634 | 0.616 | 0.551 | 0.712 | 0.700 | 0.751 | 0.684 | 0.737
MiddlePhalanxOutlineAgeGroup | 0.545 | 0.540 | 0.461 | 0.523 | 0.731 | 0.765 | 0.740 | 0.750
MiddlePhalanxOutlineCorrect | 0.795 | 0.800 | 0.620 | 0.813 | 0.727 | 0.792 | 0.753 | 0.648
MiddlePhalanxTW | 0.569 | 0.539 | 0.595 | 0.564 | 0.576 | 0.599 | 0.561 | 0.584
MoteStrain | 0.863 | 0.857 | 0.871 | 0.887 | 0.808 | 0.844 | 0.879 | 0.835
NonInvasiveFatalECG_Thorax1 | 0.791 | 0.785 | 0.814 | 0.860 | 0.900 | 0.915 | 0.829 | 0.791
NonInvasiveFatalECG_Thorax2 | 0.804 | 0.796 | 0.833 | 0.864 | 0.928 | 0.931 | 0.880 | 0.865
OliveOil | 0.800 | 0.733 | 0.727 | 0.700 | 0.340 | 0.433 | 0.867 | 0.833
OSULeaf | 0.636 | 0.616 | 0.570 | 0.661 | 0.559 | 0.674 | 0.521 | 0.591
PhalangesOutlinesCorrect | 0.834 | 0.818 | 0.789 | 0.841 | 0.813 | 0.831 | 0.761 | 0.728
Phoneme | 0.071 | 0.066 | 0.055 | 0.083 | 0.137 | 0.190 | 0.109 | 0.228
Plane | 0.981 | 0.971 | 0.978 | 0.981 | 0.960 | 1.000 | 0.962 | 1.000
ProximalPhalanxOutlineAgeGroup | 0.693 | 0.668 | 0.547 | 0.644 | 0.807 | 0.844 | 0.785 | 0.805
ProximalPhalanxOutlineCorrect | 0.904 | 0.866 | 0.820 | 0.911 | 0.885 | 0.863 | 0.808 | 0.784
ProximalPhalanxTW | 0.688 | 0.665 | 0.746 | 0.755 | 0.759 | 0.793 | 0.707 | 0.737
RefrigerationDevices | 0.483 | 0.477 | 0.336 | 0.475 | 0.443 | 0.523 | 0.395 | 0.464
ScreenType | 0.376 | 0.363 | 0.360 | 0.392 | 0.374 | 0.456 | 0.360 | 0.397
ShapeletSim | 0.644 | 0.800 | 0.606 | 0.667 | 0.779 | 0.956 | 0.539 | 0.650
ShapesAll | 0.430 | 0.420 | 0.349 | 0.470 | 0.715 | 0.782 | 0.752 | 0.768
SmallKitchenAppliances | 0.541 | 0.515 | 0.462 | 0.547 | 0.655 | 0.587 | 0.344 | 0.643
SonyAIBORobotSurface | 0.882 | 0.870 | 0.856 | 0.884 | 0.776 | 0.940 | 0.696 | 0.725
SonyAIBORobotSurfaceII | 0.845 | 0.848 | 0.834 | 0.861 | 0.718 | 0.866 | 0.859 | 0.831
StarLightCurves | 0.974 | 0.964 | 0.954 | 0.976 | 0.963 | 0.972 | 0.849 | 0.907
Strawberry | 0.887 | 0.886 | 0.838 | 0.914 | 0.947 | 0.954 | 0.938 | 0.940
SwedishLeaf | 0.835 | 0.834 | 0.838 | 0.894 | 0.875 | 0.920 | 0.789 | 0.792
Symbols | 0.897 | 0.889 | 0.896 | 0.931 | 0.697 | 0.921 | 0.899 | 0.950
synthetic_control | 0.490 | 0.473 | 0.501 | 0.723 | 0.978 | 0.997 | 0.880 | 0.993
ToeSegmentation1 | 0.596 | 0.601 | 0.536 | 0.671 | 0.746 | 0.855 | 0.680 | 0.772
ToeSegmentation2 | 0.723 | 0.708 | 0.666 | 0.777 | 0.722 | 0.915 | 0.808 | 0.838
Trace | 0.990 | 1.000 | 0.985 | 1.000 | 1.000 | 1.000 | 0.760 | 1.000
TwoLeadECG | 0.904 | 0.483 | 1.000 | 1.000 | 0.999 | 1.000 | 0.907 | 1.000
Two_Patterns | 0.492 | 0.905 | 0.888 | 1.000 | 0.806 | 0.983 | 0.747 | 0.904
UWaveGestureLibraryAll | 0.953 | 0.635 | 0.820 | 0.971 | 0.766 | 0.794 | 0.739 | 0.727
uWaveGestureLibrary_X | 0.654 | 0.645 | 0.714 | 0.824 | 0.651 | 0.700 | 0.662 | 0.634
uWaveGestureLibrary_Y | 0.661 | 0.647 | 0.734 | 0.752 | 0.697 | 0.730 | 0.650 | 0.658
uWaveGestureLibrary_Z | 0.664 | 0.951 | 0.966 | 0.972 | 0.918 | 0.944 | 0.948 | 0.892
wafer | 0.994 | 0.994 | 0.990 | 0.997 | 0.990 | 0.998 | 0.995 | 0.980
Wine | 0.593 | 0.704 | 0.596 | 0.648 | 0.563 | 0.815 | 0.611 | 0.574
WordsSynonyms | 0.599 | 0.594 | 0.513 | 0.607 | 0.505 | 0.611 | 0.618 | 0.649
Worms | 0.448 | 0.459 | 0.418 | 0.541 | 0.423 | 0.530 | 0.365 | 0.464
WormsTwoClass | 0.641 | 0.652 | 0.598 | 0.696 | 0.630 | 0.707 | 0.586 | 0.663
yoga | 0.863 | 0.865 | 0.816 | 0.869 | 0.806 | 0.847 | 0.830 | 0.836
average | 0.736 | 0.728 | 0.703 | 0.774 | 0.728 | 0.800 | 0.712 | 0.744
variance | 0.034 | 0.035 | 0.042 | 0.033 | 0.028 | 0.027 | 0.030 | 0.027
wins | 8 | 6 | 3 | 32 | 4 | 39 | 2 | 13
Table 4. Comparison of Proposed Classifier and TSBF.
Datasets | RP1 + ResNet | TSBF | Datasets | RP1 + ResNet | TSBF
50words | 0.635 | 0.776 | MiddlePhalanxOutlineAgeGroup | 0.765 | 0.400
Adiac | 0.652 | 0.291 | MiddlePhalanxOutlineCorrect | 0.792 | 0.858
ArrowHead | 0.829 | 0.841 | MiddlePhalanxTW | 0.599 | 0.688
Beef | 0.800 | 0.850 | MoteStrain | 0.844 | 0.535
BeetleFly | 0.950 | 0.682 | NonInvasiveFatalECG_Thorax1 | 0.915 | 0.828
BirdChicken | 0.750 | 0.975 | NonInvasiveFatalECG_Thorax2 | 0.931 | 0.770
Car | 0.850 | 0.917 | OliveOil | 0.433 | 0.844
CBF | 0.999 | 0.787 | OSULeaf | 0.674 | 0.979
ChlorineConcentration | 0.740 | 0.969 | PhalangesOutlinesCorrect | 0.831 | 0.801
CinC_ECG_torso | 0.949 | 0.879 | Phoneme | 0.190 | 0.714
Coffee | 1.000 | 0.676 | Plane | 1.000 | 0.762
Computers | 0.744 | 0.853 | ProximalPhalanxOutlineAgeGroup | 0.844 | 0.770
Cricket_X | 0.708 | 0.758 | ProximalPhalanxOutlineCorrect | 0.863 | 0.754
Cricket_Y | 0.669 | 0.600 | ProximalPhalanxTW | 0.793 | 0.936
Cricket_Z | 0.690 | 0.901 | RefrigerationDevices | 0.523 | 0.683
DiatomSizeReduction | 0.990 | 0.702 | ScreenType | 0.456 | 0.832
DistalPhalanxOutlineAgeGroup | 0.840 | 0.975 | ShapeletSim | 0.956 | 0.726
DistalPhalanxOutlineCorrect | 0.810 | 0.495 | ShapesAll | 0.782 | 0.704
DistalPhalanxTW | 0.785 | 0.960 | SmallKitchenAppliances | 0.587 | 0.875
Earthquakes | 0.776 | 0.969 | SonyAIBORobotSurface | 0.940 | 0.680
ECG200 | 0.910 | 0.930 | SonyAIBORobotSurfaceII | 0.866 | 0.888
ECG5000 | 0.941 | 0.618 | StarLightCurves | 0.972 | 0.721
ECGFiveDays | 0.972 | 0.692 | Strawberry | 0.954 | 0.908
ElectricDevices | 0.691 | 0.940 | SwedishLeaf | 0.920 | 0.823
FaceAll | 0.801 | 0.860 | Symbols | 0.921 | 0.675
FaceFour | 0.966 | 0.680 | synthetic_control | 0.997 | 0.770
FacesUCR | 0.868 | 0.745 | ToeSegmentation1 | 0.855 | 0.952
FISH | 0.880 | 0.514 | ToeSegmentation2 | 0.915 | 0.982
FordA | 0.846 | 0.793 | Trace | 1.000 | 0.778
FordB | 0.749 | 0.782 | TwoLeadECG | 1.000 | 0.786
Gun_Point | 0.980 | 0.881 | Two_Patterns | 0.983 | 0.940
Ham | 0.743 | 0.517 | UWaveGestureLibraryAll | 0.794 | 0.775
HandOutlines | 0.867 | 0.677 | uWaveGestureLibrary_X | 0.700 | 0.607
Haptics | 0.471 | 0.813 | uWaveGestureLibrary_Y | 0.730 | 0.862
Herring | 0.641 | 0.804 | uWaveGestureLibrary_Z | 0.944 | 0.654
InlineSkate | 0.356 | 0.825 | wafer | 0.998 | 0.495
InsectWingbeatSound | 0.564 | 0.860 | Wine | 0.815 | 0.786
ItalyPowerDemand | 0.972 | 0.709 | WordsSynonyms | 0.611 | 0.770
LargeKitchenAppliances | 0.653 | 0.721 | Worms | 0.530 | 0.985
Lighting2 | 0.902 | 0.986 | WormsTwoClass | 0.707 | 0.994
Lighting7 | 0.658 | 0.993 | yoga | 0.847 | 0.783
MALLAT | 0.918 | 0.780 | average | 0.800 | 0.783
Meat | 0.983 | 0.668 | variance | 0.027 | 0.020
MedicalImages | 0.751 | 0.858 | wins | 45 | 40
Table 5. Average classification accuracies according to category.
Category | RP + FCN | RP2 + FCN | RP1 + FCN | RP1 + ResNet | EUCLID | DTW | LCSS | CTE | TSBF
Image | 0.736 | 0.722 | 0.763 | 0.802 | 0.728 | 0.750 | 0.510 | 0.614 | 0.751
Spectro | 0.802 | 0.808 | 0.829 | 0.818 | 0.802 | 0.769 | 0.401 | 0.770 | 0.750
Sensor | 0.809 | 0.798 | 0.821 | 0.823 | 0.729 | 0.741 | 0.688 | 0.678 | 0.802
Device | 0.506 | 0.493 | 0.524 | 0.609 | 0.453 | 0.600 | 0.452 | 0.454 | 0.817
Motion | 0.659 | 0.650 | 0.731 | 0.718 | 0.628 | 0.683 | 0.610 | 0.485 | 0.828
ECG | 0.865 | 0.784 | 0.896 | 0.945 | 0.870 | 0.853 | 0.691 | 0.557 | 0.771
Simulated | 0.715 | 0.826 | 0.868 | 0.970 | 0.787 | 0.896 | 0.715 | 0.734 | 0.801
