Abstract
Recently, predicting multivariate time-series (MTS) has attracted much attention as a way to obtain richer semantics with similar or better performance. In this paper, we propose a tri-partition alphabet-based state (tri-state) prediction method for symbolic MTSs. First, for each variable, the set of all symbols, i.e., the alphabet, is divided into strong, medium, and weak regions using two user-specified thresholds. With the tri-partitioned alphabet, the tri-state takes the form of a matrix. One dimension contains all of the variables. The other is a feature vector that includes the most likely occurring strong, medium, and weak symbols. Second, a tri-partition strategy based on the deviation degree is proposed. We introduce the piecewise and symbolic aggregate approximation techniques to polymerize and discretize the original MTS. In this way, a stronger symbol corresponds to a larger deviation. Moreover, most popular numerical or symbolic similarity or distance metrics can be combined. Third, we propose an along–across similarity model to obtain the k-nearest matrix neighbors. This model considers the associations among the time stamps and variables simultaneously. Fourth, we design two post-filling strategies to obtain a completed tri-state. The experimental results on datasets from four domains show that (1) the tri-state has greater recall but lower precision; (2) the two post-filling strategies can slightly improve the recall; and (3) the along–across similarity model composed of the Triangle and Jaccard metrics is recommended first for new datasets.
1. Introduction
Time-series analysis [] has long been a subject that has attracted researchers from a diverse range of fields, including pattern discovery [,,,], clustering [,,], classification [,], prediction [], causality [], and anomaly detection []. Time-series prediction is one of the most sought-after and, arguably, most challenging tasks []. It has played an important role in a wide range of fields, including the industrial [], financial [], health [], traffic [,], and environmental [] fields, for several decades. For multivariate time-series (MTSs), existing methods inherently assume interdependencies among variables. In other words, each variable depends not only on its historical values but also on the other variables. To efficiently and effectively exploit the latent interdependencies among variables, many techniques, such as deep learning-based ones [,,,,], matrix or tensor decomposition-based ones [,], k-nearest neighbor (kNN)-based ones [,,,], and others [,,,], have been proposed. However, methods that obtain richer semantics with similar or better performance remain rare.
The trisecting–acting–outcome (TAO) model [] of thinking in threes [] to understand and process a whole via three distinct and related parts [] has inspired many novel and significant theories and applications. Recently, theories such as three-way formal concept analysis [] and three-way cognition computing [,] have focused on concept learning via multi-granularity from the viewpoint of cognition. The three-way fuzzy sets method [], three-way decisions space [], sequential three-way decisions [], and generalized three-way decision models [,,] have been proposed. Moreover, applications include the three-way recommender system [], three-way active learning [], three-way clustering [], tri-partition neighborhood covering reduction [], three-way spam filtering [], three-way face recognition [], and the tri-alphabet-based sequence pattern []. However, the extension of TAO to MTS prediction needs to be studied in depth.
In this paper, a tri-partition alphabet-based state (tri-state) prediction method for symbolic multivariate time-series (MTS) is proposed. First, with the symbolic aggregate approximation (SAX) [] technique, g symbols are generated from the piecewise aggregate approximation (PAA) [] version of the MTS and the hypothesis of a probability distribution function. The most common choice, the standard normal distribution N(0, 1), is used here. Hence, the breakpoints can be obtained by evenly partitioning the area under N(0, 1) into g parts. As these breakpoints also quantify the degree of deviation from the expectation, the two thresholds α and β (α > β) can be specified from them. Hence, if the absolute value of a breakpoint is not less than α, the symbol is called a strong element. If the absolute value of a breakpoint is less than β, the symbol is called a weak element. Otherwise, the symbol is called a medium element. This way, for each variable of the given MTS, its alphabet, i.e., the set of symbols, is partitioned into the strong, medium, and weak regions.
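As a hedged sketch of this breakpoint computation (the function name `sax_breakpoints` and the use of Python's standard `NormalDist` are our own choices), the g − 1 equiprobable breakpoints under N(0, 1) can be obtained as follows:

```python
from statistics import NormalDist

def sax_breakpoints(g):
    """The g - 1 quantiles that split the area under the N(0, 1)
    probability density into g equal parts."""
    nd = NormalDist(0, 1)
    return [nd.inv_cdf(i / g) for i in range(1, g)]

print(sax_breakpoints(4))  # three breakpoints, symmetric about 0
```

Because the breakpoints are symmetric around the mean, their absolute values directly serve as the deviation degrees from which α and β can be chosen.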
Second, on the basis of the tri-partitioned alphabet, the predicted tri-state takes the form of a matrix of size n × 3 (n is the number of variables). For each variable, we simultaneously predict the three most likely occurring symbols from the strong, medium, and weak regions. The state defined by the existing work only contains one case, while the tri-state includes up to 3^n cases. Note that our method does not simply take the top three most likely occurring symbols as the prediction result, because the deviation degree can provide some new orthogonal information. This way, the outliers are more noticeable for users.
Third, an along–across similarity model to generate the k-nearest matrix neighbors (kNMN) is presented. The along similarity considers the associations of the time stamps. The across similarity focuses on the relations between the variables. Additionally, with the PAA- and SAX-MTSs, the most popular numerical or symbolic metrics can be combined regardless of whether they are similarities or distances. Given a sliding window w, the PAA- and SAX-MTSs can be transformed into temporal subsequences, called instances. Here, m is the number of time stamps, and all instances are matrices of shape w × n. Moreover, the latest state following each instance is denoted as the decision information, called the label. With the optimal k labels, the tri-state can finally be predicted using the traditional voting strategy.
Fourth, two post-filling strategies, called the individual and related ones, are designed to fill the possibly missing symbols of each variable. The reason why the tri-state may be incomplete is that no strong, medium, or weak symbols may occur after all matrix instances. For brevity, given a tri-state, we assume that the strong symbol of its i-th variable is missing. The individual filling strategy (IFS) directly scans the historical data of the i-th variable to obtain its most frequently occurring strong symbol. The related filling strategy (RFS) considers the associations between the i-th variable and the other variables. The other variable that is the most linearly related to it serves as its condition.
The main contributions of this paper are presented as follows:
- Tri-state. It provides three kinds of symbols for each variable simultaneously. The proposed deviation degree-based alphabet tri-partition strategy makes the outliers more noticeable for experts. Moreover, the IFS and RFS are designed to obtain a completed tri-state.
- Along–across similarity model. The similarities between time stamps and variables are considered simultaneously. This model provides a framework for the integration of the popular similarity or distance metrics.
- Combination of the popular numerical or symbolic metrics. The PAA- and SAX-MTSs are simultaneously used in the above similarity model. The PAA-MTS is available for the numerical metrics, while the SAX-MTS fits the symbolic ones.
The experimental results obtained on four real-world datasets show that (1) in terms of precision, the state is 30% to 50% higher than the three kinds of tri-states, while for recall, the three kinds of tri-states are 10% higher than the state; (2) the IFS and RFS can slightly improve the recall by approximately 1%; and (3) the along–across similarity model composed of the Triangle and Jaccard metrics is recommended first for new datasets. Note that the IFS and RFS are necessary only if the tri-state is incomplete. In other words, when the obtained tri-state is already complete, there is no difference among the three kinds of tri-states.
The rest of this paper is organized as follows. Section 2 reviews the existing work on time-series prediction. Section 3 presents the fundamental definitions of the tri-state. Section 4 proposes the algorithm for tri-state prediction. Section 5 discusses the performance of the prediction algorithm on four real-world datasets. Section 6 lists the conclusions and future work of this paper.
2. Time-Series Prediction
Various techniques have been proposed for predicting time-series. These methods can be categorized into the deep learning-based ones [,,,,], matrix or tensor decomposition-based ones [,], k-nearest neighbor (kNN)-based ones [,,,], etc. [,,,].
Among the deep learning-based methods, to solve the volatility problem of wind power, Ju [] constructed a forecasting model based on a convolutional neural network and LightGBM. Ma et al. proposed a deep learning-based method, namely, a transferred bidirectional long short-term memory model, for air-quality prediction []. Weytjens et al. predicted accounts receivable cash flows by employing methods applicable to companies with many customers and many transactions [].
In terms of the matrix or tensor decomposition-based ones, Shi et al. proposed a strategy that combines low-rank Tucker decomposition into a unified framework []. Ma et al. proposed a deep spatial-temporal tensor factorization framework, which provides a general design for high-dimensional time-series forecasting []. To model the inherent rhythms and seasonality of time-series as global patterns, Chen et al. [] proposed a low-rank autoregressive tensor completion framework to model multivariate time-series data. To generalize the effect of distance and reachability, Wu et al. [] developed an inductive graph neural network kriging model to recover data for unsampled sensors on a network graph structure.
For the kNN-based ones, Zhang et al. [] proposed a new two-stage methodology that combines ensemble empirical mode decomposition with a multidimensional kNN model to simultaneously forecast the closing and high prices of stocks. Xu et al. [] proposed an algorithm based on the kernel kNN to predict road traffic states in time-series. Yin et al. [] proposed a multivariate predicting method and discussed the prediction performance of MTS by comparing it with the univariate time-series and kNN nonparametric regression models. Martinez et al. [] devised an automatic tool, i.e., one that works without human intervention, which is nevertheless effective and efficient and can be applied to accurately forecast many time-series.
Other techniques were also used for MTS prediction. To handle multivariate long nonstationary time-series, Shen et al. [] proposed a fast prediction model based on a combination of an elastic net and a higher-order fuzzy cognitive map. Chen et al. [] proposed a weighted least squares support vector machine-based approach for univariate and multivariate time-series forecasting. To predict future outbreaks of methicillin-resistant Staphylococcus aureus, Jimenez et al. [] proposed the use of artificial intelligence—specifically time-series forecasting techniques. The orthogonal decision tree may fail to capture the geometrical structure of data samples, so Qiu et al. [] attempted to study oblique random forests in the context of time-series forecasting.
3. Models and Problem Statement
In this section, we first introduce the definitions of the original multivariate time-series (MTS) and its piecewise aggregate approximation (PAA) and symbolic aggregate approximation (SAX) versions. Second, we propose an along–across similarity model and the problem of state prediction. Third, we define the strategy of alphabet tri-partition and the problem of tri-partition alphabet-based state prediction. The notations are introduced in Table 1.
Table 1.
Notations.
3.1. Data Model
The PAA and SAX versions of MTS are defined on the basis of the original numerical MTS.
Definition 1.
An original numerical MTS is the quadruple:
where is the finite set of time points, is the finite set of variables, {real number} is the value ranges of variable a, and is the mapping function. For brevity, can be denoted by . We further assume that ().
Definition 2.
The PAA-MTS has similar forms to Definition 1. However, two differences are present, namely (i) , , and (ii) :
Example 1.
Figure 1 shows an example of the transition of NO from the original numerical MTS () to the PAA version of MTS (). Here, and . This way, the dimension is reduced from 100 to 10.
Figure 1.
The original numerical MTS and PAA-MTS.
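The PAA transform of Definition 2 can be sketched as follows (a minimal illustration assuming the series length is divisible by the number of segments; normalization is omitted):

```python
def paa(series, segments):
    """Piecewise aggregate approximation: average the series over
    `segments` equal-length frames, reducing its dimensionality."""
    frame = len(series) // segments
    return [sum(series[i * frame:(i + 1) * frame]) / frame
            for i in range(segments)]

print(paa([1, 2, 3, 4, 5, 6, 7, 8], 4))  # → [1.5, 3.5, 5.5, 7.5]
```

Applied per variable, this is the step that reduces the dimension from 100 to 10 in Example 1.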
Definition 3.
The SAX-MTS also has a similar form to Definition 2. The only difference is that the numerical value is transformed into a symbolic one. To produce symbols with equiprobability, a set of breakpoints dividing the area of under the probability distribution function (PDF) of is required. Therefore, let containing g symbols; then, we have:
where , and , and are defined as and , respectively.
Example 2.
Table 2 shows a lookup table of breakpoints for the distribution. In practice, g can be set as an integer that is not less than 2. Notably, means that .
Table 2.
The breakpoints for the distribution [].
Example 3.
Figure 2 shows an example of the transition from PAA-MTS to SAX-MTS. Here, we let , , and its alphabet {a, b, c, d, e, f, g}.
Figure 2.
The PAA-MTS and SAX-MTS for NO.
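A hedged sketch of the PAA-to-SAX step: each normalized PAA value is mapped to one of g symbols by locating it among the equiprobable N(0, 1) breakpoints (the helper name `sax_symbolize` is ours):

```python
from bisect import bisect_left
from statistics import NormalDist

def sax_symbolize(paa_values, g):
    """Map normalized PAA values to g symbols 'a', 'b', ... using
    equiprobable N(0, 1) breakpoints, as in Definition 3."""
    nd = NormalDist(0, 1)
    breakpoints = [nd.inv_cdf(i / g) for i in range(1, g)]
    # bisect_left finds which of the g bands the value falls into
    return [chr(ord('a') + bisect_left(breakpoints, v)) for v in paa_values]

print(sax_symbolize([-1.2, 0.1, 1.5], 4))  # → ['a', 'c', 'd']
```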
Example 4.
Table 3 shows an example of SAX-MTS with three variables (i.e., {SO (), NO (), and PM2.5 ()}) and 10 time stamps (i.e., ). For variable SO, symbols a and f are missing. For variable NO, symbols b and e are missing. For variable PM2.5, symbols d and e are missing. This phenomenon is temporary and disappears once the data are big enough.
Table 3.
An example of SAX-MTS and PAA-MTS.
3.2. State
First, a formal description of the state is introduced as follows. Additionally, we describe what type of prediction result the SAX-MTS state is.
Definition 4.
Given a SAX-MTS : ,
is called a state of SAX-MTS at time . Moreover, the state of PAA-MTS (i.e., =) is formally similar to this one.
Example 5.
With Table 3, {e, g, f} is called a state of SAX-MTS at time . Accordingly, {0.447, 1.038, 0.846} is called a state of PAA-MTS at time .
Second, the state is denoted as a known label. This way, the corresponding instance of is defined as follows.
Definition 5.
Given an SAX-MTS and a sliding window , an instance with the matrix form is:
where , , and , , . For brevity, can be denoted by when n and w are specified:
Example 6.
This way, the set of all instances can be denoted by
where .
Example 7.
With Table 3, let ; then, . Hence, .
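The construction of instances and labels (Definitions 4 and 5) can be sketched as follows; the list-of-sequences data layout, with one row per variable, is our own assumption:

```python
def make_instances(sax_mts, w):
    """Slice an n-variable SAX-MTS (a list of per-variable symbol
    sequences, all of length m) into sliding-window matrix instances
    of width w, each paired with the state right after it (its label)."""
    m = len(sax_mts[0])
    instances, labels = [], []
    for t in range(m - w):
        instances.append([var[t:t + w] for var in sax_mts])  # n x w matrix
        labels.append([var[t + w] for var in sax_mts])       # next state
    return instances, labels

mts = [list("gfecba"), list("abcdef")]
X, y = make_instances(mts, 3)
print(len(X), y[0])  # 3 instances; the first label is the state at t = 3
```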
Third, , the set of instance–label pair {(, )} can be constructed for the k-nearest matrix neighbors (kNMN).
Definition 6.
Given an instance , any is called the set of kNMN of if and:
where Δ is the along–across similarity of the given matrix pair.
Note that the neighborhood for may not be unique, since some other matrices may have the same similarity with it.
Fourth, the along–across similarity model is proposed to obtain the neighborhood by merging the popular similarity and distance metrics.
Definition 7.
Given PAA-MTS , SAX-MTS , and sliding window size w, the similarity between the two matrix-based instances and is:
where the row vector similarity is:
and the column vector similarity is:
where:
Note that the row or column vector in Equation (11) is indeed one of and , corresponding to the PAA- and SAX-MTSs, respectively. Moreover, the data types of the vectors in Equations (9) and (10) must coincide. In other words, the pairs of vectors in Equations (9) and (10) come either both from the PAA-MTS or both from the SAX-MTS. Namely, the case where one is from the PAA-MTS while the other is from the SAX-MTS is not permitted.
Table 4 presents the availability of similarities and distances for Equation (11). Two things need to be further explained. One is the availability of the metrics. Given any two indices r and c (metric IDs), PAA(r) = True or PAA(c) = True indicates that the r-th or the c-th metric fits the numerical data. Similarly, SAX(r) = True or SAX(c) = True means that the r-th or the c-th metric fits the symbolic data. For example, PAA(0) = True indicates that the Euclidean distance fits the PAA-MTS but not the SAX-MTS.
Table 4.
The availability of similarities and distances.
The other is the transformation from distance to similarity. As similarity and distance metrics are used simultaneously here, the distances need to be transformed into similarities. Therefore, given two vectors and , the transformation from distance to similarity is:
where d denotes the distance between and . This way, 100 combinations of distances and similarities exist. Their performances are discussed in Section 5.
Example 8.
With Table 3 and Table 4, let (Jaccard similarity, PAA True), (Manhattan distance, SAX True), and , and the along–across similarity between and (i.e., ) is illustrated as follows. First, the SAX-MTS and PAA-MTS of is:
Those of is:
Second, the row vector similarity . More specifically, the Jaccard similarity between row vectors (g, f, f) and (g, f, g) is . Third, the column similarity . More specifically, the Manhattan distance between column vectors and is . Finally, .
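Example 8's ingredients can be sketched as follows. The Jaccard similarity, the Manhattan distance, a distance-to-similarity transform, and the combination of the along (row-wise, over SAX rows) and across (column-wise, over PAA columns) parts are shown. The transform sim = 1/(1 + d) and the plain averaging of the two parts are assumptions, since the paper's exact formulas are not reproduced above:

```python
def jaccard(u, v):
    """Set-based Jaccard similarity between two symbol vectors."""
    s, t = set(u), set(v)
    return len(s & t) / len(s | t)

def manhattan(u, v):
    """Manhattan (L1) distance between two numeric vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

def to_similarity(d):
    """One common distance-to-similarity transform (an assumption)."""
    return 1.0 / (1.0 + d)

def along_across(sax_a, sax_b, paa_a, paa_b):
    """Average the row-wise similarity of the SAX instances and the
    column-wise similarity of the PAA instances (averaging is an
    assumed combination rule)."""
    along = sum(jaccard(r1, r2) for r1, r2 in zip(sax_a, sax_b)) / len(sax_a)
    across = sum(to_similarity(manhattan(c1, c2))
                 for c1, c2 in zip(zip(*paa_a), zip(*paa_b))) / len(paa_a[0])
    return (along + across) / 2
```

Any of the metrics in Table 4 can be swapped in for `jaccard` or `manhattan`, provided it matches the data type (symbolic for SAX, numeric for PAA).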
Fifth, given a future time stamp (e.g., ), the state (e.g., ) at this time is unknown. Formally, the state occurring at time is denoted as =, , ⋯, . To obtain the components of with the kNN-like method, the instances, neighbors, and labels were defined above. Therefore, the label of , i.e., can be predicted with the following voting strategy.
Definition 8.
Given a SAX-MTS , and , , each component of is:
and:
where , if the condition is True; otherwise, .
Example 9.
First, the of is found. Using the process shown in Example 8, the along–across similarities of are listed in Table 5. Let the size of , i.e., ; then, .
Table 5.
The similarity matrix of .
Second, with the three nearest neighbors, the states/labels after them can be obtained. Namely, (g, f, g), (g, f, f), (c, d, c), and (e, f, f).
Third, the results of voting can hence be obtained as g, f, f. Namely, (g, f, f). More specifically, in terms of , .
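The voting of Definition 8 can be sketched per variable over the k nearest labels; the toy run below reuses the four labels of Example 9:

```python
from collections import Counter

def vote_state(labels):
    """Majority vote per variable: each label is a tuple holding one
    symbol per variable; the most frequent symbol wins per position."""
    n = len(labels[0])
    return tuple(Counter(lbl[i] for lbl in labels).most_common(1)[0][0]
                 for i in range(n))

votes = [("g", "f", "g"), ("g", "f", "f"), ("c", "d", "c"), ("e", "f", "f")]
print(vote_state(votes))  # → ('g', 'f', 'f'), matching Example 9
```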
Sixth, in general, the prediction performance is better when the difference between the predicted and actual states is smaller. The measures of prediction performance, such as the precision and recall, are introduced here. The precision and recall of the state have the same form, namely:
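A hedged sketch of how such measures can behave for a state versus a tri-state (the paper's exact formulas are not reproduced above; here a hit means the actual symbol appears among the predicted candidates, precision divides by the number of predicted symbols, and recall by the number of variables, both assumptions):

```python
def state_measures(predicted, actual):
    """State: one symbol per variable; precision and recall coincide
    as the fraction of correctly predicted variables."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(predicted), hits / len(actual)

def tri_state_measures(tri_pred, actual):
    """Tri-state: up to three candidate symbols per variable; a hit
    means the actual symbol is among the candidates."""
    hits = sum(a in cand for cand, a in zip(tri_pred, actual))
    total = sum(len(cand) for cand in tri_pred)
    return hits / total, hits / len(actual)
```

This mirrors the experimental finding that tri-states trade precision (more predicted symbols in the denominator) for recall (more chances to hit the actual symbol).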
Finally, with the above definitions, the problem of state prediction is proposed as follows.
Problem 1.
kNMN-based state prediction for MTS:
Input: , , w and k;
Output:
Although two types of datasets, i.e., the PAA- and SAX-MTSs, are both used here, the space complexity remains the same. The time complexity is closely related to the size of the matrix instance and similarity metrics for vectors.
Example 10.
With Table 4, let and ; given PAA-MTS , SAX-MTS S, and the sliding window w, the time complexities of the row and column vectors’ similarity between two matrix instances are both . Moreover, the size of is ; hence, the time complexity of our method is .
3.3. Tri-State
To enrich the semantics of predictions, we extend each component of the state to a column vector of length 3. For each vector, different components carry different semantics. This way, the form of the prediction is changed from a vector into a matrix.
First, we introduce the definition of the tri-partition alphabet as follows.
Definition 9.
Given an SAX-MTS , ,
is called a tri-partition alphabet of a if
- ; and
- .
Additionally, we call , , and the strong, medium, and weak regions of attribute , respectively.
Example 11.
With Table 3, the range of values for variable NO () is {a, b, c, d, e, f, g}. Let {a, g}, {b, f}, and {c, d, e}, is called a tri-partition alphabet of NO.
Definition 10.
Given a SAX-MTS : and , a tri-state at time stamp is:
where , , , and .
Compared with the state in Definition 4, each component is replaced with a vector of three symbols. Moreover, this predicted vector can be interpreted as the most probable symbols from the strong, medium, and weak regions, respectively. Note that the three-way state is unnecessary for historical data.
Therefore, we present the voting strategy for the three-way state prediction as follows. Given :
Practically, regions , , and can be obtained using various partition strategies and have meaningful explanations. Here, we partition the range of symbolic values for each attribute using the following strategy. Based on Equations (2) and (3), we can evaluate the level deviating from the mean for each symbol. For each attribute , given a set of thresholds pair with cardinality n, where , . , the tri-partition strategy is formally described as
The combination of PAA-MTS and SAX-MTS is first used here. The breakpoint is , and always holds. Hence, always belongs to .
However, up to 2n thresholds need to be specified. Therefore, for brevity, we assume that all variables share the same pair of thresholds α and β. Consequently, we have , and , which can be denoted by . More choices are available for α and β with a greater g. Moreover, if , then . When the threshold is set to , can be set to or 0.
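The deviation degree-based tri-partition can be sketched as follows. Each symbol's band between two adjacent breakpoints is scored by its largest absolute bounding breakpoint (edge bands score infinity); this scoring rule and the demo threshold values are our own assumptions, chosen so that the result matches Example 11:

```python
from statistics import NormalDist

def tri_partition(g, alpha, beta):
    """Split the g-symbol alphabet into strong/medium/weak regions by
    each symbol's deviation from the mean under N(0, 1)."""
    nd = NormalDist(0, 1)
    bps = [nd.inv_cdf(i / g) for i in range(1, g)]
    strong, medium, weak = set(), set(), set()
    for i in range(g):
        sym = chr(ord('a') + i)
        # deviation of the band (bps[i-1], bps[i]); edge bands are unbounded
        if i == 0 or i == g - 1:
            dev = float('inf')
        else:
            dev = max(abs(bps[i - 1]), abs(bps[i]))
        if dev >= alpha:
            strong.add(sym)
        elif dev < beta:
            weak.add(sym)
        else:
            medium.add(sym)
    return strong, medium, weak

# with g = 7 and assumed thresholds, this reproduces Example 11:
print(tri_partition(7, 1.5, 0.6))
```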
However, the predicted tri-state is incomplete if no strong, medium, or weak symbols are found following all of the matrix neighbors. Namely, what the current method can guarantee is that each variable has at least one predicted symbol. Formally, given a tri-state at , , we have:
Example 12.
With Table 3, let and , , where {a, g}, {b, f} and {c, d, e}. Based on the four labels of Example 9, i.e., (g, f, g), (g, f, f), (c, d, c), and (e, f, f), the predicted tri-state at is
Note that ϕ means that the symbol of the current position is temporarily unknown. More specifically, the strong symbol of and the medium symbol of are unknown. Here, “c / e” indicates that the final predicted symbol is randomly selected from these candidates. For brevity, the symbol c was selected.
Moreover, the precision for the incomplete tri-state is calculated as follows:
To remedy this defect, i.e., to obtain a completed tri-state, we propose two simple and effective filling strategies, called the individual and related ones, respectively. For each attribute, if one or two symbols are missing, the individual filling strategy (IFS) predicts them with the most frequent ones in the attribute's own historical data. Then, , the IFS can be formally described as follows:
where:
Example 13.
According to Example 12 and Table 3, for variable , b. This is because IFS-Count(b) = > IFS-Count(f) = 0. For variable , a. This is because IFS-Count(a) = > IFS-Count(g) = . Hence, the tri-state filled by the IFS is
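The IFS can be sketched as a frequency count over a variable's own history, restricted to the missing region (the toy history below is illustrative, not the data of Table 3):

```python
from collections import Counter

def ifs_fill(history, region):
    """Individual filling strategy: the most frequent symbol from
    `region` in the variable's own history; None if none occurred."""
    counts = Counter(s for s in history if s in region)
    return counts.most_common(1)[0][0] if counts else None

print(ifs_fill(list("gcadcgaa"), {"a", "g"}))  # → 'a' (three a's vs. two g's)
```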
The related filling strategy (RFS) predicts the missing symbols by considering the association relationships between any pair of variables. Given two variables and (), is the most linear related variable of . Namely, . Hence, their predicted vectors are and . Then, the RFS can be formally described as follows:
where:
Example 14.
Based on Example 12 and Table 3, the Pearson correlations among the three variables are listed as follows. Pearson, Pearson, and Pearson. Hence, for the variable , the most related one is . Then, when (, g) happens, the happening symbols set of is {g}. When (, f) happens, the happening symbols set of is {g, e, e}. When (, c) happens, the happening symbols set of is {c}. No medium symbol for by the RFS is available. Therefore, the result is b, which is predicted using the IFS.
Then, for the variable , the most related one is . Then, when (, g) happens, the happening symbols’ set of is {f, f}. When (, b) happens, the happening symbols set of is {a, a, c}. When (, c) happens, the happening symbols set of is {d, c}. Therefore, the result of is a.
Accordingly, the tri-state filled by the RFS is This result of the RFS is consistent with that of the IFS.
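The RFS can be sketched in two steps: select the most linearly related variable by the absolute Pearson correlation, then count the target's symbols at the time stamps where the related variable showed the conditioning symbol (the data layout and helper names are ours):

```python
from collections import Counter

def pearson(u, v):
    """Plain Pearson correlation coefficient of two numeric sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)

def most_related(target_paa, others_paa):
    """Index of the other variable with the largest |Pearson| value."""
    return max(range(len(others_paa)),
               key=lambda i: abs(pearson(target_paa, others_paa[i])))

def rfs_fill(target_hist, related_hist, related_symbol, region):
    """Most frequent symbol of the target, from `region`, at the time
    stamps where the related variable showed `related_symbol`;
    None if no such symbol occurred (fall back to the IFS then)."""
    counts = Counter(t for t, r in zip(target_hist, related_hist)
                     if r == related_symbol and t in region)
    return counts.most_common(1)[0][0] if counts else None
```

As in Example 14, when `rfs_fill` returns None, the missing symbol is predicted using the IFS instead.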
Moreover, the precision for the completed tri-state (IFS- and RFS-ones) is calculated as follows:
- ;
- ; and .
Finally, with all of the above definitions, we can define the problem of three-way state prediction as follows:
Problem 2.
Tri-state prediction for MTS.
Input: , , w, k, α, and β;
Output:
Compared with Problem 1, Problem 2 has two more parameters, α and β. The first process, which generates Σ, is required, but it has a polynomial time complexity . The output is a matrix P of size n × 3. Hence, we can obtain the three most likely occurring symbols from the strong, medium, and weak regions, respectively. Note that Problem 1 obtains one predicted state at a time, while Problem 2 can obtain up to 3^n possible states. Notably, the time and space complexities of the two problems remain the same.
4. Algorithms
In this section, the framework of the three-way state prediction algorithm with k nearest matrix neighbors (kNMN-3WSP) is shown in Figure 3. Three stages, namely kNMN construction, alphabet tri-partition, and three-way state prediction, are proposed. Note that the PAA and SAX datasets are the inputs of all stages. In Stages II and III, they are omitted for brevity.
Figure 3.
The process of the kNMN-3WSP algorithm.
4.1. Stage I
Algorithm 1 presents the details of Stage I. First, is a pair of indices that identifies the distances and similarities from Table 4. In other words, we have . Moreover, if , the similarity between two row vectors is measured using the Euclidean distance. If , the similarity between two column vectors is measured using the Triangle metric. Second, the cardinalities of and all elements in are , and . Third, the availability of PAA and SAX is the key to integrating and S. They are mutually exclusive.
Algorithm 2 presents the details of Line 4. With the last two columns of Table 4, if the similarity metric supports PAA-MTS, PAA(r) or PAA(c) is True (T). For example, PAA(0) = PAA(1) = True, and PAA(2) = PAA(3) = False (F). Finally, the time complexity of this stage is .
| Algorithm 1kNMN construction. |
| Input: , , w, k and ; |
| Output: ; |
| Method: Construction. |
|
| Algorithm 2 Similarity computation. |
| Input: , and ; |
| Output: ; |
| Method: Similarity. |
|
4.2. Stage II
Algorithm 3 describes the details of Stage II. First, the variable g is specified to generate the SAX version of the MTS. In other words, g is the number of symbols for each attribute. Second, if , Λ is empty. When , no other choices are available except for . Finally, the time complexity of this stage is only .
| Algorithm 3 Alphabet tri-partition. |
| Input: , , and ; |
| Output: ; |
| Method: Tri-partition. |
|
4.3. Stage III
Algorithm 4 discusses the details of Stage III. First, is the label of attribute . The predicted symbol is the one with the maximal frequency. Second, the purpose of using the index to count is to improve the efficiency of this algorithm. In Line 6, Count is a mapping function for the count matrix in which the size is . The l-th position stores the frequency of (). Generally, the matrix is denoted by = ((Count(1), , (Count(2), , ⋯, (Count(g), ). For example, with Table 3, let the indices of H, M, and L be 0, 1, and 2, respectively, (). Matrix ((3, 0), (2, 1), (4, 2)) means the frequencies of H, M, and L are 3, 2, and 4, respectively. Moreover, = (3, 0), = 3, and = 0.
In Line 9, the count matrices are listed in descending order of count. Generally, are subject to (1) , , and (2) , . For example, the matrix ((3, 0), (2, 1), (4, 2)) is transformed into ((4, 2), (3, 0), (2, 1)). In Lines 10–21, the algorithm searches for the symbol with the largest count from each of the strong, medium, and weak regions. There is no need to continue searching once all three symbols of the current variable are known. The time complexity of this stage is .
Finally, the RFS considers more information than the IFS, but their time and space complexities are the same, namely . This way, we can obtain four kinds of states called the state, tri-state, IFS-based tri-state (IFS-tri-state), and RFS-based tri-state (RFS-tri-state).
| Algorithm 4 Three-way state prediction. |
| Input: , , N and ; |
| Output: ; |
| Method: Prediction. |
|
5. Experiments
We discuss the following issues through experiments:
- The prediction performance of our along–across similarity model;
- The stability of the similarity metrics combination.
5.1. Dataset and Experiment Settings
Experiments are undertaken on four datasets from four different domains, i.e., the environmental, financial, industrial, and health domains. The most important information from these datasets is listed in Table 6.
Table 6.
The outlines of the datasets.
With Table 4, 100 combinations need to be discussed. The test set consists of the last 20% of the above MTSs. However, the training set is dynamic at different time points within the test set. Generally, for each time point , the training set contains all records within the time range . In other words, 80% is the smallest training-set ratio when the time point i is .
5.2. Prediction Performance
Figure 4 and Figure 5 show the precision, recall, and F1-measure of the four kinds of states on the four datasets’ test sets. Commonly, the form , , indicates the indices of the row and column metrics, respectively. For example, means that the row metric is Levenshtein and the column metric is Jaccard. Second, with increasing k, the precisions of the state, tri-state, IFS-based tri-state, and RFS-based tri-state decrease. Third, the precision of the state is better than that of the others. Moreover, the precision of the tri-state is slightly better than that of the IFS- and RFS-based ones. The precisions of the IFS- and RFS-based tri-states are almost identical. This is because the three kinds of tri-states provide two additional symbols for each variable. However, the tri-state may be incomplete while the IFS- and RFS-based ones are complete. Therefore, the precision of the tri-state lies between that of the state and that of the IFS- and RFS-based tri-states. This can be observed in Figure 4b,c.
Figure 4.
The precisions of four state prediction strategies. (a) Dataset I. (b) Dataset II. (c) Dataset III. (d) Dataset IV.
Figure 5.
The recalls of four state prediction strategies. (a) Dataset I. (b) Dataset II. (c) Dataset III. (d) Dataset IV.
In Figure 5, the recalls of the three kinds of tri-states are better than that of the state. Moreover, the recalls of the IFS- and RFS-based tri-states are the highest. Similarly, the recall of the tri-state is also between that of the state and the IFS- and RFS-based tri-states. Interestingly, the recall of the IFS- and RFS-based tri-states on the Stocks (Dataset II) can reach 95% and 93%, respectively.
Compared with the state, the three kinds of tri-states have better recall but worse precision. Although the improvement of IFS- and RFS-based tri-states is not significant compared to the tri-state, more information can be provided. In most cases, is the first choice for precision and recall.
5.3. Stability
Table 7 and Table 8 list the top 10 metric combinations for precision and recall with the four kinds of states on the four datasets’ test sets. We can observe that some metric combinations are repeated. Hence, these combinations are considered more stable, occurring with higher frequency/probability in different datasets. For stronger discrimination, we additionally introduce a weighted ranking strategy for each metric combination.
Table 7.
The top 10 metric combinations for precision.
Table 8.
The top 10 metric combinations for recall.
Given a metric combination :
is the stability metric of x. Among them, is the occurrence count across the four datasets. Moreover:
is the average ranking of x. If x does not occur in dataset i, . Otherwise, is the ranking of x.
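A hedged sketch of such a weighted ranking: score each metric combination by its occurrence count across datasets, penalized by its average rank, with a fixed penalty rank for datasets where it does not appear. The exact formula above is not reproduced, so every constant here is an assumption:

```python
def stability(rankings, max_rank=10):
    """rankings holds one entry per dataset: the combination's rank
    (1 = best) or None if it missed that dataset's top list.
    Score = occurrence count / average rank, where a miss is
    penalized with rank max_rank + 1 (both rules are assumptions)."""
    occ = sum(1 for r in rankings if r is not None)
    avg = sum(r if r is not None else max_rank + 1
              for r in rankings) / len(rankings)
    return occ / avg

# a combination appearing often with good ranks beats a one-off top-5 hit
print(stability([1, 2, None, 3]) > stability([5, None, None, None]))  # → True
```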
Figure 6 shows the most stable metric combinations for precision. In terms of the state, combination is the first choice. In terms of the tri-state, combination is the first choice. In terms of the IFS-based tri-state, combination is the first choice. In terms of the RFS-based tri-state, combination is the first choice.
Figure 6.
The most stable metric combinations for precision. (a) State. (b) Tri-state. (c) IFS-tri-state. (d) RFS-tri-state.
Figure 7 shows the most stable metric combinations for recall; for each of the four kinds of states, the respective first-choice combination is indicated.
Figure 7.
The most stable metric combinations for recall. (a) State. (b) Tri-state. (c) IFS-tri-state. (d) RFS-tri-state.
From the above observations, the eighth metric, i.e., the Jaccard similarity, is the most frequently used, followed by the second, i.e., the Manhattan distance.
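Both recommended metrics are standard; as a quick reference, a minimal sketch (symbol sets for the Jaccard similarity, numeric vectors for the Manhattan distance):

```python
def jaccard(a, b):
    """Jaccard similarity of two symbol sets: |A ∩ B| / |A ∪ B|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def manhattan(u, v):
    """Manhattan (L1) distance between two numeric vectors."""
    return sum(abs(x - y) for x, y in zip(u, v))

print(jaccard({"a", "b", "c"}, {"b", "c", "d"}))    # → 0.5
print(manhattan([1.0, 2.0, 3.0], [2.0, 0.0, 3.0]))  # → 3.0
```

The pairing is natural for this method: the Jaccard similarity suits the discretized SAX symbols, while the Manhattan distance suits the numerical PAA values.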
6. Conclusions
In this paper, a new tri-state and its prediction problem were defined on multivariate time-series (MTS). With the tri-state, the most likely occurring strong, medium, and weak symbols can be obtained. Second, a deviation degree-based tri-partition strategy and its algorithm were designed; for each variable, a symbol is considered stronger the further it deviates from the average value. Third, the along–across similarity model was proposed to capture the association relationships among both time stamps and variables. Fourth, integrating the PAA and SAX versions of an MTS allows numerical and symbolic similarities or distances to be combined. Finally, for a new dataset, the first-choice parameter settings are the neighborhood size, the Jaccard similarity, and the Manhattan distance.
The following research topics deserve further investigation:
- More alphabet tri-partition strategies;
- More tri-state completion strategies;
- Adaptive learning of the parameters by cost-sensitive learning; and
- More intelligent metrics combination strategies, e.g., integrated learning.
Author Contributions
Conceptualization, Z.-H.Z.; methodology, Z.-H.Z. and Z.-C.W.; software, Z.-C.W. and Z.-H.Z.; validation, Z.-C.W., J.-G.G. and S.-P.S.; formal analysis, Z.-H.Z., W.D. and Z.-C.W.; investigation, J.-G.G. and S.-P.S.; resources, X.-B.Z.; data curation, G.-S.C., J.-G.G. and S.-P.S.; writing—original draft preparation, Z.-H.Z. and W.D.; writing—review and editing, X.-B.Z., S.-P.S. and G.-S.C.; visualization, Z.-C.W., J.-G.G. and G.-S.C.; supervision, X.-B.Z. and W.D.; project administration, W.D.; funding acquisition, X.-B.Z. All authors have read and agreed to the published version of the manuscript.
Funding
This research was jointly funded by the National Natural Science Foundation of China (grant number 41604114); the Sichuan Science and Technology Program (grant numbers 2019ZYZF0169, 2019YFG0307, 2021YFS0407); the A Ba Achievements Transformation Program (grant numbers 19CGZH0006, R21CGZH0001); the Chengdu Science and technology planning project (grant number 2021-YF05-00933-SN); and the Sichuan Tourism University Scientific Research Projects of China (grant number 2020SCTU14).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Wei, W.W.S. Multivariate Time Series Analysis and Applications; John Wiley & Sons: Hoboken, NJ, USA, 2018. [Google Scholar]
- Park, H.; Jung, J.Y. SAX-ARM: Deviant event pattern discovery from multivariate time series using symbolic aggregate approximation and association rule mining. Expert Syst. Appl. 2020, 141, 112950. [Google Scholar] [CrossRef]
- Xu, J.; Tang, L.; Zeng, C.; Li, T. Pattern discovery via constraint programming. Knowl.-Based Syst. 2016, 94, 23–32. [Google Scholar] [CrossRef]
- Zhang, Z.H.; Min, F. Frequent state transition patterns of multivariate time series. IEEE Access 2019, 7, 142934–142946. [Google Scholar] [CrossRef]
- Zhang, Z.H.; Min, F.; Chen, G.S.; Shen, S.P.; Wen, Z.C.; Zhou, X.B. Tri-Partition state alphabet-based sequential pattern for multivariate time series. Cogn. Comput. 2021, 1–19. [Google Scholar] [CrossRef]
- Cheng, R.; Hu, H.; Tan, X.; Bai, Y. Initialization by a novel clustering for wavelet neural network as time series predictor. Comput. Intell. Neurosci. 2015, 2015, 1–9. [Google Scholar] [CrossRef] [PubMed]
- Li, H.L. Multivariate time series clustering based on common principal component analysis. Neurocomputing 2019, 349, 239–247. [Google Scholar] [CrossRef]
- Li, H.L.; Liu, Z.C. Multivariate time series clustering based on complex network. Pattern Recognit. 2021, 115, 107919. [Google Scholar] [CrossRef]
- Baldán, F.J.; Benítez, J.M. Multivariate times series classification through an interpretable representation. Inf. Sci. 2021, 569, 596–614. [Google Scholar] [CrossRef]
- Baydogan, M.G.; Runger, G. Learning a symbolic representation for multivariate time series classification. Data Min. Knowl. Discov. 2015, 29, 400–422. [Google Scholar] [CrossRef]
- Araújo, R.D.A. A class of hybrid morphological perceptrons with application in time series forecasting. Knowl.-Based Syst. 2011, 24, 513–529. [Google Scholar] [CrossRef]
- Sugihara, G.; May, R.; Ye, H.; Hsieh, C.H.; Deyle, E.; Fogarty, M.; Munch, S. Detecting Causality in Complex Ecosystems. Science 2012, 338, 496–500. [Google Scholar] [CrossRef] [PubMed]
- Ren, H.R.; Liu, M.M.; Li, Z.W.; Pedrycz, W. A piecewise aggregate pattern representation approach for anomaly detection in time series. Knowl.-Based Syst. 2017, 135, 29–39. [Google Scholar] [CrossRef]
- Ju, Y.; Sun, G.Y.; Chen, Q.H.; Zhang, M.; Zhu, H.X.; Rehman, M.U. A model combining convolutional neural network and LightGBM algorithm for ultra-short-term wind power forecasting. IEEE Access 2019, 7, 28309–28318. [Google Scholar] [CrossRef]
- Zhang, N.; Lin, A.; Shang, P. Multidimensional k-nearest neighbor model based on EEMD for financial time series forecasting. Phys. A Stat. Mech. Appl. 2017, 477, 161–173. [Google Scholar] [CrossRef]
- Shen, F.; Liu, J.; Wu, K. Multivariate Time Series Forecasting based on Elastic Net and High-Order Fuzzy Cognitive Maps: A Case Study on Human Action Prediction through EEG Signals. IEEE Trans. Fuzzy Syst. 2020, 29, 2336–2348. [Google Scholar] [CrossRef]
- Xu, D.W.; Wang, Y.D.; Peng, P.; Shen, B.L.; Deng, Z.; Guo, H.F. Real-time road traffic state prediction based on kernel-kNN. Transp. A Transp. Sci. 2020, 16, 104–118. [Google Scholar] [CrossRef]
- Yin, Y.; Shang, P.J. Forecasting traffic time series with multivariate predicting method. Appl. Math. Comput. 2016, 291, 266–278. [Google Scholar] [CrossRef]
- Ma, J.; Cheng, J.C.; Lin, C.Q.; Tan, Y.; Zhang, J.C. Improving air quality prediction accuracy at larger temporal resolutions using deep learning and transfer learning techniques. Atmos. Environ. 2019, 214, 116885. [Google Scholar] [CrossRef]
- Liu, P.H.; Liu, J.; Wu, K. CNN-FCM: System modeling promotes stability of deep learning in time series prediction. Knowl.-Based Syst. 2020, 203, 106081. [Google Scholar] [CrossRef]
- Martínez, F.; Frías, M.P.; Pérez, M.D.; Rivera, A.J. A methodology for applying k-nearest neighbor to time series forecasting. Artif. Intell. Rev. 2019, 52, 2019–2037. [Google Scholar] [CrossRef]
- Weytjens, H.; Lohmann, E.; Kleinsteuber, M. Cash flow prediction: MLP and LSTM compared to ARIMA and Prophet. Electron. Commer. Res. 2021, 21, 371–391. [Google Scholar] [CrossRef]
- Zhou, Y.; Cheung, Y.M. Bayesian low-tubal-rank robust tensor factorization with multi-rank determination. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 43, 62–76. [Google Scholar] [CrossRef]
- Zhou, Y.; Lu, H.; Cheung, Y.M. Probabilistic rank-one tensor analysis with concurrent regularizations. IEEE Trans. Cybern. 2021, 51, 3496–3509. [Google Scholar] [CrossRef]
- Chen, T.T.; Lee, S.J. A weighted LS-SVM based learning system for time series forecasting. Inf. Sci. 2015, 299, 99–116. [Google Scholar] [CrossRef]
- Jimenez, F.; Palma, J.; Sanchez, G.; Marin, D.; Palacios, M.F.; López, M.L. Feature selection based multivariate time series forecasting: An application to antibiotic resistance outbreaks prediction. Artif. Intell. Med. 2020, 104, 101818. [Google Scholar] [CrossRef]
- Qiu, X.H.; Zhang, L.; Suganthan, P.N.; Amaratunga, G.A.J. Oblique random forest ensemble via least square estimation for time series forecasting. Inf. Sci. 2017, 420, 249–262. [Google Scholar] [CrossRef]
- Yao, Y.Y. The geometry of three-way decision. Appl. Intell. 2021, 51, 6298–6325. [Google Scholar] [CrossRef]
- Yao, Y.Y. Set-theoretic models of three-way decision. Granul. Comput. 2021, 6, 133–148. [Google Scholar] [CrossRef]
- Yao, Y.Y. Tri-level thinking: Models of three-way decision. Int. J. Mach. Learn. Cybern. 2020, 11, 947–959. [Google Scholar] [CrossRef]
- Sang, B.B.; Guo, Y.T.; Shi, D.R.; Xu, W.H. Decision-theoretic rough set model of multi-source decision systems. Int. J. Mach. Learn. Cybern. 2018, 9, 1941–1954. [Google Scholar] [CrossRef]
- Li, J.H.; Huang, C.C.; Qi, J.J.; Qian, Y.H.; Liu, W.Q. Three-way cognitive concept learning via multi-granularity. Inf. Sci. 2017, 378, 244–263. [Google Scholar] [CrossRef]
- Yao, Y.Y. Three-way decisions and cognitive computing. Cogn. Comput. 2016, 8, 543–554. [Google Scholar] [CrossRef]
- Deng, X.F.; Yao, Y.Y. Decision-theoretic three-way approximations of fuzzy sets. Inf. Sci. 2014, 279, 702–715. [Google Scholar] [CrossRef]
- Hu, B.Q. Three-way decisions space and three-way decisions. Inf. Sci. 2014, 281, 21–52. [Google Scholar] [CrossRef]
- Qian, J.; Liu, C.H.; Miao, D.Q.; Yue, X.D. Sequential three-way decisions via multi-granularity. Inf. Sci. 2020, 507, 606–629. [Google Scholar] [CrossRef]
- Li, X.N.; Yi, H.J.; She, Y.H.; Sun, B.Z. Generalized three-way decision models based on subset evaluation. Int. J. Approx. Reason. 2017, 83, 142–159. [Google Scholar] [CrossRef]
- Liu, D.; Liang, D.C.; Wang, C.C. A novel three-way decision model based on incomplete information system. Knowl.-Based Syst. 2016, 91, 32–45. [Google Scholar] [CrossRef]
- Xu, W.H.; Li, M.M.; Wang, X.Z. Information Fusion Based on Information Entropy in Fuzzy Multi-source Incomplete Information System. Int. J. Fuzzy Syst. 2017, 19, 1200–1216. [Google Scholar] [CrossRef]
- Zhang, H.R.; Min, F.; Shi, B. Regression-based three-way recommendation. Inf. Sci. 2017, 378, 444–461. [Google Scholar] [CrossRef]
- Wang, M.; Min, F.; Zhang, Z.H.; Wu, Y.X. Active learning through density clustering. Expert Syst. Appl. 2017, 85, 305–317. [Google Scholar] [CrossRef]
- Yu, H.; Wang, X.C.; Wang, G.Y.; Zeng, X.H. An active three-way clustering method via low-rank matrices for multi-view data. Inf. Sci. 2020, 507, 823–839. [Google Scholar] [CrossRef]
- Yue, X.D.; Chen, Y.F.; Miao, D.Q.; Qian, J. Tri-partition neighborhood covering reduction for robust classification. Int. J. Approx. Reason. 2016, 83, 371–384. [Google Scholar] [CrossRef]
- Zhou, B.; Yao, Y.Y.; Luo, J.G. Cost-sensitive three-way email spam filtering. J. Intell. Inf. Syst. 2014, 42, 19–45. [Google Scholar] [CrossRef]
- Li, H.X.; Zhang, L.B.; Huang, B.; Zhou, X.Z. Sequential three-way decision and granulation for cost-sensitive face recognition. Knowl.-Based Syst. 2016, 91, 241–251. [Google Scholar] [CrossRef]
- Min, F.; Zhang, Z.H.; Zhai, W.J.; Shen, R.P. Frequent pattern discovery with tri-partition alphabets. Inf. Sci. 2020, 507, 715–732. [Google Scholar] [CrossRef]
- Lin, J.; Keogh, E.; Wei, L.; Lonardi, S. Experiencing SAX: A novel symbolic representation of time series. Data Min. Knowl. Discov. 2007, 15, 107–144. [Google Scholar] [CrossRef]
- Shi, Q.Q.; Yin, J.M.; Cai, J.J.; Cichocki, A.; Yokota, T.; Chen, L.; Yuan, M.X.; Zeng, J. Block Hankel tensor ARIMA for multiple short time series forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 5758–5766. [Google Scholar]
- Ma, X.Y.; Zhang, L.; Xu, L.; Liu, Z.C.; Chen, G.; Xiao, Z.L.; Wang, Y.; Wu, Z.T. Large-scale user visits understanding and forecasting with deep spatial-temporal tensor factorization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 2403–2411. [Google Scholar]
- Chen, X.Y.; Sun, L.J. Low-rank autoregressive tensor completion for multivariate time series forecasting. arXiv 2020, arXiv:2006.10436. [Google Scholar]
- Wu, Y.K.; Zhuang, D.Y.; Labbe, A.; Sun, L.J. Inductive graph neural networks for spatiotemporal kriging. arXiv 2020, arXiv:2006.07527. [Google Scholar]
- Lonardi, S.; Lin, J.; Keogh, E.; Chiu, B.Y.C. Efficient discovery of unusual patterns in time series. New Gener. Comput. 2006, 25, 61–93. [Google Scholar] [CrossRef]
- Amir, A.; Charalampopoulos, P.; Pissis, S.P.; Radoszewski, J. Dynamic and internal longest common substring. Algorithmica 2020, 82, 3707–3743. [Google Scholar] [CrossRef]
- Behara, K.N.; Bhaskar, A.; Chung, E. A novel approach for the structural comparison of origin-destination matrices: Levenshtein distance. Transp. Res. Part C Emerg. Technol. 2020, 111, 513–530. [Google Scholar] [CrossRef]
- Chung, N.C.; Miasojedow, B.; Startek, M.; Gambin, A. Jaccard/Tanimoto similarity test and estimation methods for biological presence-absence data. BMC Bioinform. 2019, 20, 644. [Google Scholar] [CrossRef] [PubMed]
- Sun, S.B.; Zhang, Z.H.; Dong, X.L.; Zhang, H.R.; Li, T.J.; Zhang, L.; Min, F. Integrating Triangle and Jaccard similarities for recommendation. PLoS ONE 2017, 12, e0183570. [Google Scholar] [CrossRef] [PubMed]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).