A Fast Weighted Fuzzy C-Medoids Clustering for Time Series Data Based on P-Splines

The rapid growth of digital information has produced massive amounts of time series data with rich features, and most time series data are noisy and contain outlier samples, which degrades clustering quality. To efficiently discover the hidden statistical information in such data, this study proposes a fast weighted fuzzy C-medoids clustering algorithm based on P-splines (PS-WFCMdd) for time series datasets. Specifically, the P-spline method is used to fit functional data to the original time series, and the resulting smooth fit is used as the input of the clustering algorithm, enhancing its ability to process the dataset. We then define a new weighting method to reduce the influence of outlier sample points during weighted fuzzy C-medoids clustering, improving the robustness of our algorithm. We further propose using the third version of Mueen's Algorithm for Similarity Search (MASS 3) to measure the similarity between time series quickly and accurately, improving clustering efficiency. The new algorithm is compared with several other time series clustering algorithms, and its performance is evaluated experimentally on different types of time series. The experimental results show that our method speeds up data processing and achieves consistently good scores across the clustering evaluation indices.


Introduction
With the rapid development of computer information technology, research on time series clustering is gaining momentum in fields such as finance, biology, medicine, meteorology, electricity, industry, and agriculture [1][2][3]. To better summarize and discover the useful information in these data, time series clustering has received extensive attention from researchers [4]. Through clustering, we can mine key hidden information in time series data and find regularities among time series sample points, providing indispensable groundwork for forecasting and anomaly detection on time series data. Consequently, the study of time series clustering is of great significance.
Real-world data are often imperfect. Owing to human error, machine failure, or unavoidable natural influences, noise points and outlier samples may appear in the data, and time series data are no exception. To reduce the impact of abnormal fluctuations in time series data on clustering, some researchers smooth the data before clustering. Abraham et al. (2003) [5] proposed fitting functional data with B-splines and partitioning the estimated model coefficients with a K-means algorithm, proving the strong consistency and effectiveness of the clustering method. Iorio et al. (2016) [6] used a P-spline smoother to model the raw data and then clustered the fitted data objects with the k-means algorithm according to the best spline coefficients. D'Urso et al. (2021) [7,8] applied B-spline fitting coefficients to robust fuzzy clustering and proposed four different robust fuzzy clustering methods for time series data. Among similarity measures, the Euclidean distance (ED) is simple and common, but it is susceptible to time offsets and has a narrow range of application. DTW is a well-known time series similarity measure [35] and remains the most widely used and improved algorithm so far, but its computational cost is O(N²) when the sample length is N. Among the edit distances, the TWED algorithm satisfies the triangle inequality and can handle time series with different sampling rates, making it an excellent similarity measure, but its time complexity matches that of DTW. DTW and TWED are popular, but their high time complexity leads to low clustering efficiency. This paper therefore adopts the MASS 3 algorithm [36] from the matrix profile literature to measure the similarity between time series; it is robust and can improve the accuracy of time series clustering.
Moreover, the matrix profile has some outstanding advantages, such as simplicity, freedom from parameters, and relatively low time and space complexity [37].
In response to the problems discussed above, we propose a fast weighted fuzzy C-medoids clustering algorithm for time series data based on P-splines, named PS-WFCMdd. PS-WFCMdd copes well with the negative effects of noise and outliers in time series data and offers fast, efficient clustering performance. The main contributions of this paper are threefold:
• Abnormal fluctuations and noise in time series data usually make it challenging to define a suitable clustering model and increase the burden of the associated measurements. To address this, we use P-splines to fit the clustering data. The smoothed fit reduces abnormal fluctuations relative to the original data and avoids the influence of noise on fuzzy clustering. Taking the fitted data as the input of the clustering algorithm makes it easier to find similar sample points during clustering;
• In traditional fuzzy clustering, no matter how far a sample point is from a cluster center, its membership degree for that cluster is never 0, which makes FCM vulnerable to boundary samples. In the proposed PS-WFCMdd, we design a new weighted fuzzy C-medoids scheme that reduces the influence of outlier sample points on cluster center updates by defining a new weighting method;
• Among time series similarity metrics, ED is simple and fast but vulnerable to time offsets, while DTW has high time complexity and low efficiency on large-scale time series. As a time series measure, MASS 3 computes quickly with small space overhead thanks to its segmented computation. We therefore apply MASS 3 within weighted fuzzy C-medoids to measure the similarity between time series data and medoids more accurately and quickly.
The rest of the paper is organized as follows: in Section 2, we introduce some basic knowledge used in our algorithm. Section 3 presents our method. The experimental results are presented in Section 4, which also includes a discussion. Finally, the conclusions are presented in Section 5.

Time Series: A P-Spline Representation
B-splines are formed by connecting polynomial segments in a specific way. As an excellent curve representation, they maintain the continuity and local shape controllability of the fitted data after fitting a time series [38].
Suppose the set containing N time series is T = {T_1, T_2, ..., T_i, ..., T_N}, where each T_i is observed at points T_i^j, and B_h(T_i^j; p) denotes the value at T_i^j of the h-th B-spline of degree p for a given equidistant grid of knots. The fitted representation of the i-th time series is

  T_i^j = Σ_h b_h B_h(T_i^j; p) + θ,

where θ is the error vector and the B-spline coefficients b_h are estimated through simple linear least squares. B-splines perform well for nonparametric modeling problems but offer limited control over smoothness and data fit. To this end, Eilers and Marx [9] use a relatively large number of knots together with a difference penalty on the coefficients of adjacent B-splines, mimicking the classical spline penalty on the integral of the squared second derivative. Integrating this difference penalty into the B-spline basis yields P-splines. With a dense set of splines, the curve is obtained by minimizing

  S(b) = ‖T_i − B b‖²,

where the elements of the basis matrix B are B_h(T_i^j; p). Compared with the original sequence data, the fitted curve given by the unpenalized b can differ considerably; to avoid overfitting, b is estimated in a penalized regression setting:

  S_p(b) = ‖T_i − B b‖² + λ ‖D_d b‖²,

where D_d is a d-th order difference penalty matrix such that D_d b = Δ^d b. The parameter λ continuously controls the smoothness of the final fit: overfitting occurs as λ approaches zero, while for high values of λ the fitted function approaches a polynomial model. The optimal spline coefficients follow as

  b̂ = (BᵀB + λ D_dᵀ D_d)⁻¹ Bᵀ T_i,

and the P-spline fit is L(T_i) = B b̂. (4)

Thus, the linear least-squares fitting model based on P-splines is obtained by applying a difference penalty to the coefficients of the B-spline curve. In a broad sense, the P-spline is an extension of polynomial curve fitting within a linear regression model.
P-splines have many useful properties: P-splines show no boundary effects, just like many types of kernel smoothers; P-splines can fit polynomial data exactly; P-splines can conserve moments of the data.
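The penalized fit above can be sketched compactly in NumPy. The following is a minimal illustration, not the paper's implementation: the B-spline basis is built by the Cox-de Boor recursion on equidistant knots, a third-order difference penalty is added, and the normal equations are solved. The knot count and λ are illustrative choices, not the paper's settings.

```python
import numpy as np

def bspline_basis(x, knots, degree):
    """Cox-de Boor recursion: design matrix whose columns are the
    B-splines of the given degree on `knots`, evaluated at x."""
    B = np.zeros((len(x), len(knots) - 1))
    for i in range(len(knots) - 1):
        B[:, i] = (x >= knots[i]) & (x < knots[i + 1])  # degree-0 indicators
    for d in range(1, degree + 1):
        Bn = np.zeros((len(x), len(knots) - d - 1))
        for i in range(len(knots) - d - 1):
            den1 = knots[i + d] - knots[i]
            den2 = knots[i + d + 1] - knots[i + 1]
            if den1 > 0:
                Bn[:, i] += (x - knots[i]) / den1 * B[:, i]
            if den2 > 0:
                Bn[:, i] += (knots[i + d + 1] - x) / den2 * B[:, i + 1]
        B = Bn
    return B

def pspline_fit(t, y, n_knots=20, degree=3, penalty_order=3, lam=0.5):
    """P-spline smoothing: minimize ||y - B b||^2 + lam * ||D_d b||^2."""
    inner = np.linspace(t.min(), t.max(), n_knots)
    step = inner[1] - inner[0]
    # Pad the equidistant knot grid at both ends for the basis degree.
    knots = np.concatenate([inner[0] - step * np.arange(degree, 0, -1),
                            inner,
                            inner[-1] + step * np.arange(1, degree + 1)])
    B = bspline_basis(t, knots, degree)
    D = np.diff(np.eye(B.shape[1]), n=penalty_order, axis=0)  # D_d
    b = np.linalg.solve(B.T @ B + lam * (D.T @ D), B.T @ y)
    return B @ b  # the smoothed series L(T_i)

# Example: smooth a noisy sine.
t = np.linspace(0.0, 1.0, 200)
rng = np.random.default_rng(1)
y_true = np.sin(2 * np.pi * t)
y = y_true + 0.1 * rng.standard_normal(t.size)
fit = pspline_fit(t, y)
```

Increasing `lam` drives the fit toward a polynomial, while `lam` near zero reproduces the unpenalized B-spline fit, matching the role of λ described above.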
Figure 1 compares two time series with different fluctuations after P-spline fitting. Whether for the noisy time series A or the slightly irregular time series B, the curve obtained by P-spline fitting is smoother than the original while reasonably retaining the key characteristics of the series. Therefore, this paper proposes to first fit the time series with a P-spline, which removes the noise peaks contained in the series while retaining the contour and trend of the sample, and then to use the fitted data as the clustering input to improve the final clustering effect.

Weighted FCMdd
The FCMdd algorithm assigns each data object to clusters with graded membership values, thereby generating fuzzy clusters. Compared with the k-medoids algorithm, which assigns each sample to exactly one category, FCMdd can reasonably model the uncertainty in the category of a time series. In fact, the objective function of FCMdd is the same as that of FCM, except that the cluster centers V are selected from the real sample space rather than being synthetic points in continuous space generated by the algorithm.
Suppose the set containing N time series is T = {T_1, T_2, ..., T_i, ..., T_N}, and let V = {v_1, v_2, ..., v_C}, v_k ∈ T, be the cluster centers (medoids) obtained during clustering. The objective function of FCMdd is defined as

  J(U, V) = Σ_{k=1}^{C} Σ_{i=1}^{N} (u_ik)^m d(T_i, v_k),

where C is the number of clusters; u_ik is the membership degree of the i-th sample in the k-th cluster; m is a fuzzification parameter controlling the degree of clustering fuzziness (the larger m is, the wider the fuzzy decision boundary; its default value is 2); U is the membership matrix; V is the set of cluster centers; and d(T_i, v_k) is the dissimilarity between object T_i and cluster center v_k. Both the cluster assignments and the medoid updates of k-medoids are discrete, while the cluster assignments of FCMdd are continuous variables. For a randomly initialized or updated group of C medoids, FCMdd updates u_ik as follows:

  u_ik = (1/d(T_i, v_k))^{1/(m−1)} / Σ_{j=1}^{C} (1/d(T_i, v_j))^{1/(m−1)}. (6)

Any clustering algorithm needs to select the best cluster centers; for FCMdd, selecting the best instance object as medoid is one of the cores of the algorithm. The commonest approach is to choose as the new medoid the object with the smallest total distance to all objects in the dataset, but this makes the algorithm's time complexity very high. We instead consider as medoid candidates only the q sample points with the largest membership in each cluster. The update formula of cluster center v_k can then be defined as

  v_k = argmin_{x ∈ ξ} Σ_{i=1}^{N} (u_ik)^m d(T_i, x),

where ξ is the set of q medoid candidates.
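The two updates above can be sketched in a few lines of NumPy. This is a minimal illustration with names and toy data of our own choosing (the candidate count `q` is illustrative):

```python
import numpy as np

def fcmdd_memberships(D, m=2.0, eps=1e-12):
    """Membership update: D is an N x C matrix of dissimilarities to the
    current medoids; u_ik is proportional to (1/d_ik)^(1/(m-1)), and each
    row of U sums to 1. eps guards against zero distances."""
    inv = np.maximum(D, eps) ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def update_medoids(pairwise, U, m=2.0, q=3):
    """Medoid update restricted to the q samples with the largest
    membership per cluster; pairwise is the N x N dissimilarity matrix."""
    medoids = []
    for k in range(U.shape[1]):
        candidates = np.argsort(U[:, k])[-q:]          # the candidate set xi
        costs = [(U[:, k] ** m) @ pairwise[:, c] for c in candidates]
        medoids.append(int(candidates[int(np.argmin(costs))]))
    return medoids

# Toy example: two well-separated 1-D groups, initial medoids 0 and 3.
X = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
pairwise = np.abs(X[:, None] - X[None, :])
U = fcmdd_memberships(pairwise[:, [0, 3]])
medoids = update_medoids(pairwise, U)
```

After one update, each medoid lands inside its dense group, since the candidate search only scans the q highest-membership samples rather than the whole dataset.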
The weighted fuzzy C-medoids clustering algorithm (WFCMdd) attaches a weight w_i to each data object T_i in the objective function. The objective function minimized by WFCMdd can be defined as

  J_w(U, V) = Σ_{k=1}^{C} Σ_{i=1}^{N} w_i (u_ik)^m d(T_i, v_k).

Under the constraint Σ_{k=1}^{C} u_ik = 1, the values of u_ik are updated as in Equation (6). The new medoid v_k of the k-th cluster is defined as

  v_k = argmin_{x ∈ ξ} Σ_{i=1}^{N} w_i (u_ik)^m d(T_i, x).
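The effect of the weights on the medoid choice can be seen in a small sketch (our own names and toy data): with uniform weights, two distant points drag the medoid toward the edge of the dense region, while down-weighting them restores a central medoid.

```python
import numpy as np

def weighted_medoid(pairwise, u_k, w, candidates, m=2.0):
    """Weighted medoid choice for one cluster: minimize
    sum_i w_i * u_ik^m * d(T_i, x) over the candidate set."""
    costs = [(w * u_k ** m) @ pairwise[:, c] for c in candidates]
    return int(candidates[int(np.argmin(costs))])

# Three close samples and two far ones; all with equal membership.
X = np.array([0.0, 0.1, 0.2, 4.0, 4.1])
pairwise = np.abs(X[:, None] - X[None, :])
u_k = np.ones(5)
cand = np.arange(5)
m_plain = weighted_medoid(pairwise, u_k, np.ones(5), cand)
m_damped = weighted_medoid(pairwise, u_k, np.array([1.0, 1.0, 1.0, 0.01, 0.01]), cand)
```

With uniform weights the minimizer is sample 2, pulled toward the far pair; with the far pair down-weighted, the medoid moves to sample 1, inside the dense region.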

MASS 3
MASS is an algorithm for computing the distance profile of a query against a time series. It requires only a small space overhead, with a small constant factor, allowing large datasets to be processed in main memory. As MASS evolved, MASS 2 and MASS 3 were developed, of which MASS 3 is the best: it uses the fast Fourier transform (FFT) to perform a half convolution, which computes all sliding dot products between the query and the sliding windows. To avoid numerical error in the convolution when the sample length n is very large, MASS 3 processes the data in segments rather than all n sample points at once; consecutive segments must overlap by exactly m − 1 samples to keep the distance profile continuous. The block size is a power of 2, though the last block may be smaller. MASS 3 does not depend on the query length m, does not suffer from the curse of dimensionality, and its computational cost is O(n log n).
The specific calculation steps are as follows:
• Scan the time series T with a sliding window of the query length m.
• z-normalize the query to obtain Q.
• Compute the standard deviations of all moving windows of fixed size in one linear scan: calculate and store the cumulative sums of T and of T², subtract shifted cumulative sums to obtain the sum over any window, and from the two window sums compute the mean and standard deviation of every window:

  μ_i = (1/m) Σ_{j=i}^{i+m−1} t_j,  σ_i = sqrt( (1/m) Σ_{j=i}^{i+m−1} t_j² − μ_i² ).

• Use a half convolution to compute all sliding dot products QT between the query and the sliding windows; treating the two sequences as vectors of polynomial coefficients, the half convolution is computed as

  QT = IFFT( FFT(T) ∘ FFT(reverse(Q)) ).

• The final vectorized working formula for the distance profile is

  D[i] = sqrt( 2m ( 1 − (QT[i] − m μ_Q μ_i) / (m σ_Q σ_i) ) ).
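The steps above can be sketched as the un-segmented version of this computation (basic MASS; MASS 3 additionally processes T in the overlapping power-of-two blocks described above). Names and structure are ours:

```python
import numpy as np

def moving_mean_std(T, m):
    """Mean and std of every length-m window via two cumulative sums."""
    s = np.concatenate(([0.0], np.cumsum(T)))
    s2 = np.concatenate(([0.0], np.cumsum(T * T)))
    mu = (s[m:] - s[:-m]) / m
    var = (s2[m:] - s2[:-m]) / m - mu * mu
    return mu, np.sqrt(np.maximum(var, 0.0))

def mass(T, Q):
    """z-normalized Euclidean distance profile of Q against all windows of T."""
    n, m = len(T), len(Q)
    mu, sig = moving_mean_std(T, m)          # assumes no constant window (sig > 0)
    Qz = (Q - Q.mean()) / Q.std()            # z-normalized query
    # FFT half convolution with the reversed query gives all sliding dot products.
    dots = np.fft.irfft(np.fft.rfft(T) * np.fft.rfft(Qz[::-1], n), n)[m - 1:n]
    # Since Qz has zero mean and unit variance, d^2 = 2m(1 - dots / (m * sig)).
    d2 = 2.0 * m * (1.0 - dots / (m * sig))
    return np.sqrt(np.maximum(d2, 0.0))

# Example: the profile is (near) zero exactly where the query was extracted.
T = np.random.default_rng(0).standard_normal(128)
Q = T[40:48].copy()
profile = mass(T, Q)
```

Because the query is z-normalized up front, the general working formula collapses to the simpler expression in the comment; the full formula with μ_Q and σ_Q applies when Q is left raw.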

Weighted Fuzzy C-Medoids Clustering Based on P-Splines
During FCMdd clustering, once an outlier sample has been assigned to some cluster, the constraint that its memberships sum to one forces it to obtain a high membership value in the next iterative update. In turn, the selected medoids are biased, which ultimately affects the clustering results. The traditional WFCMdd assigns different weights to the data sample points during clustering to obtain a better clustering effect, but this does not remove the negative impact of outlier samples on the medoid updates. To solve this problem, we propose to judge the importance of each data point relative to the new medoids according to distance: data in denser regions of the data space gain importance, while peripheral (outlier) data lose importance. Concretely, after the initial medoids V are obtained, the update weight W_i of sample point x_i with respect to each cluster center v_k is a decreasing function of the distance d(x_i, v_k) scaled by std, the standard deviation of the input data. Updating the weights in this way makes W_i larger for data points closer to v_k, and smaller otherwise. Based on the above, we propose a fast weighted fuzzy C-medoids algorithm; the specific flow of PS-WFCMdd is shown in Figure 2. The input to the algorithm is the original time series dataset T = {T_1, T_2, ..., T_i, ..., T_N}. Each sample in T is fitted with a P-spline to obtain the fitted dataset L; the P-spline smooths each sample while preserving its main characteristics. The fitted dataset L is used as the clustering input, and the initial medoids are randomly selected from it. MASS 3 efficiently computes the similarity between each fitted data object and the medoids, which is converted into the corresponding fuzzy membership matrix.
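The excerpt above does not reproduce the exact weight expression, so the sketch below uses a Gaussian kernel of the medoid distance scaled by the data's standard deviation — a hypothetical form consistent with the description (samples close to their medoid get weights near 1, outliers near 0):

```python
import numpy as np

def update_weights(dist_to_medoid, std):
    """Hypothetical weighting consistent with the text: a Gaussian kernel
    of each sample's distance to its medoid, scaled by the data std.
    Not the paper's exact formula, which is not shown in this excerpt."""
    return np.exp(-(dist_to_medoid ** 2) / (2.0 * std ** 2))

# Three regular samples and one outlier (last entry).
d = np.array([0.0, 0.5, 1.0, 5.0])
w = update_weights(d, std=1.0)
```

Any monotonically decreasing, std-scaled kernel would have the same qualitative effect: the outlier's contribution to the weighted medoid update is driven toward zero.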
Each iteration produces a C × N fuzzy membership matrix, in which the color intensity of each square represents the membership of a sample point relative to a cluster center. In subsequent medoid updates, to address FCMdd's sensitivity to outlier samples, we propose a new weighting method for updating the medoids: the weight between each data object and the new medoids is redefined according to distance, increasing the weight of data objects close to the medoids and reducing the weight of outlier objects far from them. This improves the clustering efficiency and robustness of PS-WFCMdd. In short, after the time series are transformed into projection coefficients on the P-spline basis, they are clustered with the new weights based on WFCMdd, with MASS 3 measuring the similarity between time series data and medoids during clustering; the objective function of PS-WFCMdd is the weighted FCMdd objective with MASS 3 as the dissimilarity, and iterating the membership and medoid updates yields the final membership matrix and medoids. According to Equations (4), (13), (15) and (16), the steps of PS-WFCMdd can be summarized as Algorithm 1:
7: Update the new medoids L(T_i1), L(T_i2), ..., L(T_iC) using Equation (16);
8: Update the weights W_k using Equation (13);
9: iter = iter + 1;
10: Until [L(T_i1), L(T_i2), ..., L(T_iC)]_old = [L(T_i1), L(T_i2), ..., L(T_iC)] or iter = max_iter;
11: Calculate the classification labels Y of the samples from the final fuzzy membership matrix U.

Experiment Preparation and Evaluation Index
To demonstrate the effectiveness of the proposed method on time series datasets of different dimensions and lengths, we selected 18 real-world datasets from the open time series archive UCR for evaluation: 10 common time series datasets, described in detail in Table 1, and 8 large-scale time series datasets, described in Table 2. Across all datasets, the series length ranges from 96 to 2844, the dataset size from 40 to 8236, and the number of classes from 2 to 8.
To make the experiments more complete, the chosen datasets are diverse; a brief introduction follows. Car records the contours of four different types of cars extracted from traffic video. Lightning2 contains transient electromagnetic events recorded by a suite of optical and radio-frequency instruments. Wafer is a database of measurements recorded during the processing of silicon wafers for semiconductor manufacturing.
StarLightCurves records celestial brightness over time. Car, Lightning2, Wafer, and StarLightCurves are all time series databases constructed from process control measurements recorded by various sensors. Beef, Meat, Strawberry, and Rock are classification records of items made by various spectrometers. BeetleFly, FaceFour, Fish, and Herring are image databases recast as contour problems. Both ECG200 and ECG5000 track the electrical activity recorded during heartbeats. ChlorineConcentration and Mallat are simulated datasets constructed by software applications. GunPoint records the hand movement trajectories of two actors. HouseTwenty is a dataset recorded using smart home devices. Note that PS-WFCMdd, PS-k-means, and PS-WFCMdd-sD use identical P-spline parameter settings on each experimental dataset. Each series is modeled by P-splines with cubic bases and third-order penalties. For the P-spline modeling, a fixed smoothing parameter λ is chosen for each dataset, with λ ranging from 0.3 to 0.6. The distance metric for both FCM and PS-k-means is ED. The number of centers for each dataset is also fixed: we use the number of classes in Tables 1 and 2 as the value of C.
In the experiments, the proposed PS-WFCMdd algorithm is compared with six other time series clustering algorithms: the traditional FCM algorithm; K-means based on P-splines [6] (PS-k-means); the K-means algorithm based on soft-DTW [39] (an improvement of DTW that serves as a differentiable loss function by taking the soft minimum over all alignment costs), denoted K-means-sD [40]; the soft-DTW-based K-medoids algorithm (K-medoids-sD); the K-shape algorithm based on a cross-correlation distance measure [41]; and, to better demonstrate the merit of MASS 3, a variant of PS-WFCMdd with MASS 3 replaced by soft-DTW, denoted PS-WFCMdd-sD.
In the validation step, we use three widely used criteria to measure the performance of the clustering results. Because the Rand index penalizes insufficiently, clustering algorithms generally obtain relatively high scores under it, and their performance cannot be clearly distinguished. We therefore use the adjusted Rand index (ARI) [42], an improved version of the Rand index designed to remove the influence of random labels on the evaluation. The Fowlkes-Mallows index (FMI) [43] can be regarded as the geometric mean of precision and recall; it remains informative even when the number of clusters found on a noisy dataset differs from that of the source data, making FMI one of the key metrics for evaluating clustering effectiveness. Mutual information measures how much the uncertainty about one labeling decreases when the other is known, and Normalized Mutual Information (NMI) is one of the most popular information-theoretic metrics in community detection; it rescales the mutual information score to lie between 0 (no mutual information) and 1 (perfect correlation). These criteria are briefly described below.
The ARI score lies in [−1, 1]; negative scores indicate poor results, and a score of 1 indicates that the clustering results agree perfectly with the true labels. Let TP be the number of true positive pairs (similar samples grouped into the same cluster), TN the number of true negative pairs (dissimilar samples grouped into different clusters), FP the number of false positive pairs (dissimilar samples grouped into the same cluster), and FN the number of false negative pairs (similar samples grouped into different clusters). The ARI is defined as

  ARI = 2(TP · TN − FN · FP) / ((TP + FN)(FN + TN) + (TP + FP)(FP + TN)). (17)

The parameters in the expression of the FMI have the same meaning as in Equation (17), and the FMI can be defined as

  FMI = TP / sqrt((TP + FP)(TP + FN)).
The minimum score of the Fowlkes-Mallows index is 0, indicating that the clustering results are different from the actual class labels; the highest score is 1, indicating that all data have been correctly classified.
The NMI ranges over [0, 1], with an optimal value of 1. Given ground-truth class labels η and cluster labels Y produced by a clustering algorithm, the NMI is defined as

  NMI(η, Y) = 2 · MI(η, Y) / (H(η) + H(Y)),

where H(·) denotes entropy and the mutual information MI is given by

  MI(η, Y) = Σ_i Σ_j (|η_i ∩ Y_j| / N) log( N |η_i ∩ Y_j| / (|η_i| |Y_j|) ),

where |η_i| is the number of samples in cluster η_i.
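All three indices can be computed directly from the contingency table of the two labelings. The sketch below is a self-contained NumPy version (function names are ours; for NMI we use the arithmetic-mean normalization, one common convention):

```python
import numpy as np
from math import comb, log

def contingency(y_true, y_pred):
    """Contingency table: rows index true classes, columns predicted clusters."""
    _, ci = np.unique(y_true, return_inverse=True)
    _, cj = np.unique(y_pred, return_inverse=True)
    M = np.zeros((ci.max() + 1, cj.max() + 1), dtype=int)
    np.add.at(M, (ci, cj), 1)
    return M

def ari(y_true, y_pred):
    """Adjusted Rand index via pair counts from the contingency table."""
    M = contingency(y_true, y_pred)
    n = int(M.sum())
    sum_ij = sum(comb(int(v), 2) for v in M.ravel())
    sum_a = sum(comb(int(v), 2) for v in M.sum(axis=1))
    sum_b = sum(comb(int(v), 2) for v in M.sum(axis=0))
    expected = sum_a * sum_b / comb(n, 2)
    return (sum_ij - expected) / ((sum_a + sum_b) / 2.0 - expected)

def fmi(y_true, y_pred):
    """Fowlkes-Mallows index: TP / sqrt((TP+FP)(TP+FN))."""
    M = contingency(y_true, y_pred)
    tp = sum(comb(int(v), 2) for v in M.ravel())
    tp_fp = sum(comb(int(v), 2) for v in M.sum(axis=0))
    tp_fn = sum(comb(int(v), 2) for v in M.sum(axis=1))
    return tp / np.sqrt(tp_fp * tp_fn)

def nmi(y_true, y_pred):
    """NMI with arithmetic-mean normalization of the two entropies."""
    P = contingency(y_true, y_pred).astype(float)
    P /= P.sum()
    pi, pj = P.sum(axis=1), P.sum(axis=0)
    mi = sum(P[i, j] * log(P[i, j] / (pi[i] * pj[j]))
             for i in range(P.shape[0]) for j in range(P.shape[1]) if P[i, j] > 0)
    h_i = -sum(p * log(p) for p in pi if p > 0)
    h_j = -sum(p * log(p) for p in pj if p > 0)
    return mi / ((h_i + h_j) / 2.0)

# Example labelings: a permuted-but-identical partition, and a merged one.
y_true = [0, 0, 1, 1, 2, 2]
y_same = [1, 1, 2, 2, 0, 0]   # same partition, labels permuted
y_merge = [0, 0, 0, 1, 1, 1]  # two true classes merged
```

All three scores are invariant to label permutation, so `y_same` scores 1 under every index, while the merged labeling is penalized by all three.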

Results of PS-WFCMdd Method on Ten Common Time Series Datasets
To confirm that the PS-WFCMdd method is effective on common low-dimensional datasets, the first part of this subsection compares the performance of PS-WFCMdd and the six other clustering methods on the 10 low-dimensional datasets in Table 1. Tables 3-5 show the ARI, FMI, and NMI scores of the seven clustering methods on each dataset, where bold entries mark the best value in each comparison. Table 3 shows the clustering results under ARI. On the Beef, GunPoint, and Meat datasets, the PS-WFCMdd algorithm has a clear score advantage. On the BeetleFly, Car, ECG200, FaceFour, and Lightning2 datasets, PS-WFCMdd also achieves the best ARI scores, though by smaller margins. On the Fish and Herring datasets, the K-means-sD algorithm scores best under ARI, followed by the SRFCM algorithm, with PS-WFCMdd next. On the FaceFour, Fish, Lightning2, and Meat datasets, there is no significant ARI difference between PS-WFCMdd-sD and PS-WFCMdd. Table 4 shows the clustering results under FMI. On the Beef and Meat datasets, PS-WFCMdd has a clear score advantage, and on the BeetleFly, Fish, and GunPoint datasets its FMI score is also the best. On the Car, FaceFour, and Herring datasets, the PS-WFCMdd-sD algorithm scores best under FMI, and the SRFCM algorithm has a certain advantage over the remaining algorithms. On the ECG200 dataset, the K-means-sD algorithm scores best under FMI; on the Lightning2 dataset, the K-shape algorithm scores best. Table 5 shows the clustering results under NMI. On the Beef, BeetleFly, GunPoint, Car, ECG200, FaceFour, and Meat datasets, PS-WFCMdd has clear score advantages. On the Fish dataset, the K-means-sD algorithm again achieves the best NMI score.
On the Car, FaceFour, and Herring datasets, the NMI score of the K-means-sD algorithm is the best, and the PS-WFCMdd algorithm has a certain advantage over the remaining algorithms.
Analyzing the three tables above, we conclude that our algorithm achieves relatively excellent performance under all three evaluation indices on ordinary low-dimensional time series datasets. The K-means-based variants also achieve good results on some indices, so their performance should not be underestimated. Although the PS-WFCMdd-sD algorithm performs well under the FMI index, its overall performance is still inferior to that of PS-WFCMdd. This indicates that MASS 3 performs well on low-dimensional data in terms of clustering effect; moreover, PS-WFCMdd-sD has high time complexity, which we analyze further in Section 4.4.

Results of PS-WFCMdd Method on Eight Large-Scale Time Series Datasets
To confirm that the PS-WFCMdd method is also effective on the large-scale datasets in Table 2, this section compares the clustering results of PS-WFCMdd and the six other clustering methods on those eight datasets. Figures 3-5 show the ARI, FMI, and NMI scores of the seven clustering methods on each dataset.
From Figure 3, we can see that on the ChlorineConcentration, Rock, and Strawberry datasets, the PS-WFCMdd algorithm has clear score advantages. On the ECG5000, HouseTwenty, and Wafer datasets, its ARI score is also the best. On the StarLightCurves dataset, the PS-WFCMdd-sD algorithm scores best under ARI; on the Mallat dataset, the K-shape algorithm scores best. The ARI scores of PS-WFCMdd-sD and PS-WFCMdd differ little on the ChlorineConcentration and HouseTwenty data. According to the results in Figure 4, on the Rock, StarLightCurves, and Strawberry datasets, the PS-WFCMdd algorithm has clear advantages, and on the ECG5000 and Mallat datasets it is narrowly optimal. The FCM algorithm performs excellently on the ChlorineConcentration dataset, the PS-k-means algorithm achieves the highest score on the HouseTwenty dataset, and on the Wafer dataset the FMI score of the PS-WFCMdd-sD algorithm stands out.
Figure 5 shows that on the ChlorineConcentration, Rock, StarLightCurves, Strawberry, and Wafer datasets, the PS-WFCMdd algorithm has clear score advantages; on the HouseTwenty dataset, it achieves only a small advantage; on the ECG5000 dataset, the FCM algorithm obtains the best NMI score; and on the Mallat dataset, the K-shape algorithm has the best NMI score.
From the above analysis, we conclude that our algorithm achieves excellent performance under all three evaluation indices on large-scale time series datasets. On some datasets, the FCM and K-shape algorithms also achieve good results, indicating that these two algorithms have certain advantages in processing large-scale time series data. The PS-WFCMdd-sD algorithm performs well under the three indices, but its overall performance is still inferior to that of PS-WFCMdd.

Experimental Analysis from the Perspective of Data Characteristics
To explain the robustness of the PS-WFCMdd algorithm more intuitively in terms of data features, Figure 6 shows the trend of all experimental data together with the cluster centers obtained by PS-WFCMdd. The K-means-sD algorithm, commonly used in time series clustering, is selected for the corresponding visual comparison in Figure 7. In both figures, the colored lines represent cluster centers and the light-colored lines the original sample set. Note also that K-means-sD performs comparatively well overall in the experiments of this paper. Observing the cluster centers in the two visualizations, we can see that because K-means-sD gradually generates synthetic cluster centers during clustering, the trends of some of its centers differ considerably from the original data, most visibly on BeetleFly, HouseTwenty, and Wafer. In contrast, PS-WFCMdd selects its cluster centers from the original samples, so the trends of the obtained centers match those of the original data.
We then assess the data in terms of their time series dynamics. From the source datasets alone, BeetleFly, Car, Fish, Herring, and ChlorineConcentration all exhibit periodic behavior, with BeetleFly showing the largest amplitude. On Fish and Herring, PS-WFCMdd performs only moderately on the three evaluation indices, while K-means-sD obtains better results; on Car and ChlorineConcentration, the advantage of PS-WFCMdd is small; on BeetleFly, PS-WFCMdd performs well. This suggests that on data with obvious periodicity and stable trends, which contain few noisy or abnormal samples, PS-WFCMdd holds little advantage. The remaining 13 datasets contain many irregular dynamic time series, which inevitably include substantial noise and edge data, and on these PS-WFCMdd performs comparatively well. However, HouseTwenty and Wafer contain abrupt changes, and such data often pose harder modeling problems: PS-WFCMdd performs best among all compared algorithms on them, but not outstandingly so, and it does not yet offer a practical solution for time series with abrupt changes. Based on the above experimental analysis, fitting time series data with P-splines and using the fitted data as the clustering input can effectively improve the final clustering effect. Under the influence of the new weights, the PS-WFCMdd algorithm still obtains high clustering validity scores when the sample data contain noise and outlier samples. However, our algorithm also has limitations: compared with most algorithms, PS-WFCMdd requires tuning more smoothing parameters, so it is not very concise. When faced with clearly periodic, stable-trend data or irregularly changing data with abrupt changes, our algorithm performs only moderately and needs further optimization and improvement.

Time Complexity Analysis
Table 6 lists the time complexity of each compared algorithm and of PS-WFCMdd, where N is the number of samples, C the number of clusters, and K the number of iterations. The time complexity of PS-WFCMdd equals that of K-shape and is second only to the FCM and PS-K-means algorithms; the remaining clustering algorithms have higher complexity. Because of the difference in time complexity between soft-DTW and MASS 3, the running times of PS-WFCMdd-sD and PS-WFCMdd differ by an order of magnitude, and the advantage of PS-WFCMdd is more obvious on large time series.

Conclusions
This paper proposes a clustering method for time series, PS-WFCMdd, characterized by being fast, parsimonious, and noise-resistant. The idea behind our proposal is simple and effective: it mitigates the effect of noisy points and outlier samples on fuzzy clustering. We recommend fitting each series with a P-spline smoother before clustering and using the fitted data as input to the cluster analysis. We also define a new weighting method to reduce the impact of outlier sample points on medoid updates; the new weight allocation makes the selected medoids more appropriate. In our experiments, MASS 3 proved more advantageous than soft-DTW within the new weighted fuzzy C-medoids method, measuring the similarity between time series data and medoids more accurately and quickly. In addition, the performance of our P-spline-based weighted fuzzy C-medoids method was analyzed through three clustering evaluation indices in Section 4. The experimental results show that our algorithm ensures high-quality clustering within reasonable computing time, on both ordinary and large time series datasets.
In future work, we will study better cluster initialization methods to reduce the number of iterations, and further improve the distance measure to reduce the time complexity and thus the overall cost of the algorithm. Many open problems remain: the characteristics of each time series dataset differ, and obtaining higher clustering accuracy requires analyzing these characteristics more carefully, designing clustering methods suited to them, and selecting more appropriate similarity measures. We plan to pursue these directions in future research. Finally, the code for PS-WFCMdd is available at: https://github.com/houqinchen/PS-WFCMdd (accessed on 1 February 2022).