A Summary of F-Transform Techniques in Data Analysis

Abstract: The fuzzy transform is a technique for approximating a function of one or more variables that researchers have applied in various image and data analysis tasks. In this work we present a summary of the fuzzy transform methods proposed in recent years in different data mining disciplines, such as the detection of relationships between features, the extraction of association rules, time series analysis and data classification. After defining the Fuzzy Transform in one or more dimensions, and exploring the constraint of sufficient data density with respect to the fuzzy partitions, we analyze the data analysis approaches based on the Fuzzy Transform recently proposed in the literature. In particular, we examine the strategies adopted in these approaches for managing the constraint of sufficient data density, and the performance results obtained compared with those measured by adopting other methods in the literature. The last section is dedicated to final considerations and future scenarios for using the Fuzzy Transform in the analysis of massive and high-dimensional data.


Introduction
Fuzzy Transform (for short, F-transform) [1,2] is a recent soft computing approximation technique, successfully used in numerous applications in image and data analysis (see, e.g., [3] for an in-depth discussion on this matter).
In particular, the properties of the F-transform in information aggregation and function approximation favor its use in many data analysis and data mining problems.
The aim of this paper is to provide an in-depth overview of soft computing data analysis techniques based on the use of the F-transform proposed in the literature.
Some variations of the basic functions used to construct the F-transform are proposed in [4], in which they are given by B-spline functions, and in [5], where the basic functions are given by block pulse functions.
Recently an extension of the basic F-transform to a higher-degree F-transform was introduced in [6] by generalizing the case of constant (zero-order) components to the case of m-order polynomial components. In [7,8] the applicability of the m-order F-transform is discussed, and an application of the first-degree F-transform in seasonal time series forecasting is presented in [9]. However, while they increase the accuracy and precision of the results compared to basic F-transforms, the higher-degree fuzzy transforms are computationally more complex to manage, and this makes them unsuitable for use in data analysis applications, especially in the presence of datasets of high cardinality and size.
In this work we focus on the application of the basic (zero-order) F-transform in data analysis. We will discuss the techniques proposed in the literature that employ the direct and inverse zero-order F-transform in data mining problems, such as dependencies between attributes, time series analysis and data classification, analyzing their critical points and performance benefits.
F-transform techniques were initially applied in image analysis, in which the constraint of sufficient density described in Section 2 is always respected. In data analysis, however, the application of the F-transform requires this constraint to be managed explicitly, together with the choice of suitable fuzzy partitions of the domains of the input variables. The fuzzy partitions cannot be too fine, in order to guarantee sufficient data density, nor too coarse-grained, in order to guarantee high performance levels.
In Section 2 we introduce the one-dimensional and multi-dimensional F-transforms, providing a summary of their characteristics. In particular, we analyze the constraint of sufficient density of the data, which is of extreme importance in the use of F-transform techniques in data analysis. In Section 3 we discuss the methods proposed in the literature that apply the multi-dimensional F-transform to the analysis of dependencies between attributes in the data and to the detection of association rules. Section 4 focuses on the F-transform techniques applied in time series analysis. In Section 5 a classification method based on the multi-dimensional F-transform is discussed. Final considerations are contained in Section 6. A list with descriptions of all acronyms and abbreviations in the text is given in Appendix A.

Basic Functions
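Let X = [a,b] be a closed interval in R and {x1, x2, …, xn} be a set of n fixed points in [a,b], called nodes, such that 3 ≤ n and a = x1 < x2 < … < xn = b.
In [1,2] the following definition of fuzzy partition of X was introduced: the fuzzy sets A1, …, An: [a,b] → [0,1] form a fuzzy partition of X, and are called basic functions, if the following conditions hold:
1. Ak(xk) = 1 for every k = 1, …, n;
2. Ak(x) = 0 if x ∉ ($x_{k-1}$, $x_{k+1}$), where we set x0 = a and $x_{n+1}$ = b;
3. Ak is continuous on [a,b];
4. Ak is strictly increasing on [$x_{k-1}$, xk] for k = 2, …, n and strictly decreasing on [xk, $x_{k+1}$] for k = 1, …, n − 1;
5. A1(x) + A2(x) + … + An(x) = 1 for every x in [a,b].
The fuzzy partition is called h-uniform if the nodes are equidistant, i.e., xk = a + h·(k − 1), where h = (b − a)/(n − 1). For an h-uniform fuzzy partition the following additional properties hold:
1. Ak(xk − x) = Ak(xk + x) for every x ∈ [0,h] and k = 2, …, n − 1;
2. $A_{k+1}(x) = A_k(x - h)$ for every x ∈ [xk, $x_{k+1}$] and k = 2, …, n − 1.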
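As an illustration, the sketch below builds an h-uniform fuzzy partition with triangular basic functions, which are one common choice satisfying the definition above. The function names `uniform_nodes` and `triangular` are illustrative and do not come from [1,2].

```python
def uniform_nodes(a, b, n):
    """Return n equidistant nodes x_1 = a < ... < x_n = b and the step h."""
    if n < 3:
        raise ValueError("the text assumes n >= 3")
    h = (b - a) / (n - 1)
    return [a + h * k for k in range(n)], h


def triangular(x, node, h):
    """Membership degree of x in the triangular basic function centred at node.

    Assumption: triangular shape; any shape meeting the definition would do.
    """
    return max(0.0, 1.0 - abs(x - node) / h)


nodes, h = uniform_nodes(0.0, 1.0, 5)
x = 0.3
degrees = [triangular(x, node, h) for node in nodes]
print(degrees)       # only the two basic functions around x are non-zero
print(sum(degrees))  # the degrees add up to 1, up to rounding
```

Each point of [a,b] belongs to at most two adjacent basic functions, and the membership degrees always sum to one, which is the property that makes the inverse F-transform a weighted combination of the components.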

One-Dimensional Direct and Inverse F-Transform
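Let f be a continuous function defined on [a,b]. The n-tuple [F1, F2, …, Fn], whose components are given by

$$F_k = \frac{\int_a^b f(x)\,A_k(x)\,dx}{\int_a^b A_k(x)\,dx}, \qquad k = 1, \ldots, n,$$

is called the fuzzy transform of f with respect to {A1, A2, …, An}. The Fk are called components of the F-transform. If the fuzzy partition {A1, A2, …, An} is uniform with nodes x1, x2, …, xn, the components are given (cf. [2], Lemma 1) by the formula:

$$F_1 = \frac{2}{h}\int_{x_1}^{x_2} f(x)\,A_1(x)\,dx, \qquad F_k = \frac{1}{h}\int_{x_{k-1}}^{x_{k+1}} f(x)\,A_k(x)\,dx \;\; (k = 2, \ldots, n-1), \qquad F_n = \frac{2}{h}\int_{x_{n-1}}^{x_n} f(x)\,A_n(x)\,dx$$

Now we define the following function on [a,b], given by a linear combination of the basic functions in which the weights are the F-transform components:

$$f_{F,n}(x) = \sum_{k=1}^{n} F_k\,A_k(x)$$

It is called the inverse F-transform of f with respect to the uniform fuzzy partition {A1, A2, …, An}. An important theorem proves that the function $f_{F,n}$ approximates the continuous function f on [a,b] with arbitrary precision. We state this theorem below; its proof is given in [2].

Theorem 1. Let f be a continuous function on [a,b]. Then for every ε > 0 there exist an integer n(ε) and a fuzzy partition {A1, …, $A_{n(\varepsilon)}$} of [a,b] such that $|f(x) - f_{F,n(\varepsilon)}(x)| < \varepsilon$ for every x ∈ [a,b].

Theorem 1 concerns the approximation of a known continuous function f, but in many cases we only know the values that f assumes in a set of m points p1, ..., pm ∈ [a,b].
We assume that the set P of these points is sufficiently dense with respect to the fixed fuzzy partition, i.e., for each k = 1, …, n there exists an index j ∈ {1, …, m} such that Ak(pj) > 0. Then we can define the n-tuple [F1, F2, …, Fn] as the discrete F-transform of f with respect to {A1, A2, …, An}, where each Fk is given by

$$F_k = \frac{\sum_{j=1}^{m} f(p_j)\,A_k(p_j)}{\sum_{j=1}^{m} A_k(p_j)}, \qquad k = 1, \ldots, n$$

Then we call the discrete inverse F-transform of f with respect to {A1, A2, …, An} the following function, defined in the same points p1, ..., pm of [a,b]:

$$f_{F,n}(p_j) = \sum_{k=1}^{n} F_k\,A_k(p_j), \qquad j = 1, \ldots, m \qquad (6)$$

Analogously to Theorem 1, we have the following approximation theorem (its proof is given in [2], Theorem 5).

Theorem 2. Let f be a function whose values are known in a set P of points p1, …, pm of [a,b]. Then for every ε > 0 there exist an integer n(ε) and a fuzzy partition {A1, …, $A_{n(\varepsilon)}$} of [a,b] such that P is sufficiently dense with respect to it and $|f(p_j) - f_{F,n(\varepsilon)}(p_j)| < \varepsilon$ for every pj ∈ P.

Theorem 2 states that the discrete inverse F-transform (6) approximates the original function f in each point of P with arbitrary precision.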
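The discrete direct and inverse F-transforms described above can be sketched as follows. This is a minimal illustration assuming triangular basic functions on a uniform partition; the names `direct_ft` and `inverse_ft` are hypothetical and not taken from [1,2].

```python
def triangular(x, node, h):
    """Triangular basic function centred at node with half-width h (assumed shape)."""
    return max(0.0, 1.0 - abs(x - node) / h)


def direct_ft(points, values, nodes, h):
    """Discrete direct F-transform: one weighted mean of the data per basic function."""
    components = []
    for node in nodes:
        weights = [triangular(p, node, h) for p in points]
        total = sum(weights)
        if total == 0.0:
            # No data point lies in the support of this basic function.
            raise ValueError("data not sufficiently dense for this partition")
        components.append(sum(w * v for w, v in zip(weights, values)) / total)
    return components


def inverse_ft(x, components, nodes, h):
    """Discrete inverse F-transform evaluated at x."""
    return sum(F * triangular(x, node, h) for F, node in zip(components, nodes))


# Example: f(x) = x**2 known in 101 points of [0, 1], partition with n = 11 nodes.
n = 11
h = 1.0 / (n - 1)
nodes = [k * h for k in range(n)]
points = [j / 100 for j in range(101)]
values = [p ** 2 for p in points]
F = direct_ft(points, values, nodes, h)
worst = max(abs(inverse_ft(p, F, nodes, h) - v) for p, v in zip(points, values))
print(worst)  # largest reconstruction error over the sample points
```

The largest error in this example occurs near the right end of the interval, where the first and last basic functions only average the data on one side of their node.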

Multi-Dimensional Direct and Inverse F-Transform
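The one-dimensional F-transform can be extended to approximate continuous functions of k variables. Let f(x1, …, xk) be a function defined on [a1,b1] × … × [ak,bk] whose values are known in a set P of m points pj = (pj1, …, pjk), j = 1, …, m, and, for each i = 1, …, k, let {$A_{i1}$, …, $A_{in_i}$} be a fuzzy partition of [ai,bi]. The set P is sufficiently dense with respect to these fuzzy partitions if for each combination of indices (h1, …, hk) there exists a point pj in P such that $A_{1h_1}(p_{j1}) \cdots A_{kh_k}(p_{jk}) > 0$. In this case the components of the direct multi-dimensional F-transform of f are given by

$$F_{h_1 \ldots h_k} = \frac{\sum_{j=1}^{m} f(p_{j1}, \ldots, p_{jk})\,A_{1h_1}(p_{j1}) \cdots A_{kh_k}(p_{jk})}{\sum_{j=1}^{m} A_{1h_1}(p_{j1}) \cdots A_{kh_k}(p_{jk})}$$

If the set P is sufficiently dense with respect to the fuzzy partition we can define the inverse multi-dimensional F-transform of f with respect to the basic functions $A_{1h_1}, \ldots, A_{kh_k}$ as

$$f^F_{n_1 \ldots n_k}(p_{j1}, \ldots, p_{jk}) = \sum_{h_1=1}^{n_1} \cdots \sum_{h_k=1}^{n_k} F_{h_1 \ldots h_k}\,A_{1h_1}(p_{j1}) \cdots A_{kh_k}(p_{jk}) \qquad (10)$$

Figure 2 shows two examples of fuzzy partitions that are more coarse-grained than the fuzzy partitions shown in Figure 1. In both cases the data points are sufficiently dense with respect to the fuzzy partitions. It is necessary to set the size of the fuzzy partitions properly. In fact, fuzzy partitions that are too fine can make the data points not sufficiently dense with respect to them; on the contrary, fuzzy partitions that are too coarse-grained, while guaranteeing the sufficient density of the data points, can significantly reduce the performance of the regression analysis methods in which the inverse multi-dimensional fuzzy transform is used as a regression function.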
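The trade-off between fine and coarse partitions can be made concrete with a small check of the sufficient density constraint in several dimensions. The sketch below assumes triangular basic functions on uniform partitions; `is_sufficiently_dense` is a hypothetical helper name, not a function from the cited works.

```python
from itertools import product


def triangular(x, node, h):
    """Triangular basic function centred at node with half-width h (assumed shape)."""
    return max(0.0, 1.0 - abs(x - node) / h)


def uniform_nodes(a, b, n):
    """Nodes and step of a uniform partition of [a, b] with n basic functions."""
    h = (b - a) / (n - 1)
    return [a + k * h for k in range(n)], h


def is_sufficiently_dense(points, partitions):
    """partitions[i] = (nodes, h) for the i-th input variable.

    The data are dense if every combination of basic functions, one per
    variable, has at least one point with positive membership in all of them.
    """
    for combo in product(*(nodes for nodes, _ in partitions)):
        covered = any(
            all(triangular(p[i], combo[i], partitions[i][1]) > 0.0
                for i in range(len(partitions)))
            for p in points
        )
        if not covered:
            return False
    return True


grid = [(x / 4, y / 4) for x in range(5) for y in range(5)]  # 25 points filling the unit square
diagonal = [(t / 4, t / 4) for t in range(5)]                # 5 points on the diagonal only
coarse = [uniform_nodes(0.0, 1.0, 3)] * 2
fine = [uniform_nodes(0.0, 1.0, 9)] * 2
print(is_sufficiently_dense(grid, coarse))      # True
print(is_sufficiently_dense(diagonal, coarse))  # False
print(is_sufficiently_dense(grid, fine))        # False
```

The number of combinations grows as the product of the partition sizes, so this exhaustive check is only practical for a small number of input variables.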

Multi-Dimensional F-Transform Techniques to Detect Dependency between Attributes in Datasets
The multi-dimensional F-transform was applied by many researchers to detect dependency among numerical features in datasets.
In [10,11] the multi-dimensional discrete F-transform is applied to find dependency between attributes in the data.
Following [10,11], a dataset with r features can be schematized as a relation with r attributes and m instances, as in Table 1.

Table 1. Schema of a relation with r attributes and m instances.

      X1    …    Xi    …    Xr
O1    p11   …    p1i   …    p1r
…     …     …    …     …    …
Oj    pj1   …    pji   …    pjr
…     …     …    …     …    …
Om    pm1   …    pmi   …    pmr

Here X1, …, Xi, …, Xr are the attributes and O1, …, Oj, …, Om (m > r) are the objects in the dataset; each object Oj is given by an r-dimensional data point (pj1, …, pji, …, pjr), where pji is the value of the attribute Xi for the object Oj.
The attribute Xi is a variable assuming values in the real interval [ai,bi] defined by setting ai = min{p1i, …,pmi} and bi = max{p1i, …,pmi}.
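Suppose we want to evaluate whether an attribute Xz depends on k attributes X1, …, Xk (k < r, z ∉ {1, …, k}) in the form Xz = H(X1, …, Xk), where the function H is known only in the m data points. For each interval [ai,bi], i = 1, …, k, a fuzzy partition {$A_{i1}$, $A_{i2}$, …, $A_{in_i}$} is created with ni ≥ 3. If the set of m points is sufficiently dense with respect to these fuzzy partitions, we can define the multi-dimensional direct F-transform of H, whose (h1, h2, …, hk)th component is given by

$$F_{h_1 h_2 \ldots h_k} = \frac{\sum_{j=1}^{m} p_{jz}\,A_{1h_1}(p_{j1}) \cdots A_{kh_k}(p_{jk})}{\sum_{j=1}^{m} A_{1h_1}(p_{j1}) \cdots A_{kh_k}(p_{jk})} \qquad (13)$$

Using formula (10), the inverse F-transform of H in the point Pj = (pj1, …, pjk) is given by

$$H^F_{n_1 \ldots n_k}(p_{j1}, \ldots, p_{jk}) = \sum_{h_1=1}^{n_1} \cdots \sum_{h_k=1}^{n_k} F_{h_1 \ldots h_k}\,A_{1h_1}(p_{j1}) \cdots A_{kh_k}(p_{jk}) \qquad (14)$$

In [10] a measure of the dependency of Xz on X1, …, Xk evaluated by (14) is given by the statistical index of determinacy:

$$r_c^2 = \frac{\sum_{j=1}^{m} \left( H^F_{n_1 \ldots n_k}(p_{j1}, \ldots, p_{jk}) - \hat{p}_z \right)^2}{\sum_{j=1}^{m} \left( p_{jz} - \hat{p}_z \right)^2} \qquad (15)$$

where $\hat{p}_z$ is the average of the values p1z, p2z, …, pmz of the attribute Xz. The index $r_c^2$ takes the value 1 when the inverse F-transform (14) fits perfectly to the data.
A variation of formula (15), used in multiple regression analysis to take into account the number of independent variables k and the size m of the data sample, is given by (Johnson and Wichern, 1998):

$$r'^{2}_{c} = 1 - \left( 1 - r_c^2 \right) \frac{m - 1}{m - k - 1} \qquad (16)$$

The function H in the point (x1, x2, …, xk) is approximated by the following formula:

$$H^F_{n_1 \ldots n_k}(x_1, \ldots, x_k) = \sum_{h_1=1}^{n_1} \cdots \sum_{h_k=1}^{n_k} F_{h_1 \ldots h_k}\,A_{1h_1}(x_1) \cdots A_{kh_k}(x_k) \qquad (17)$$

In [10] the inverse multi-dimensional F-transform is applied to find dependency among attributes in a dataset containing economic data measured quarterly in the Czech Republic starting from 1997. The two indices of determinacy (15) and (16) are used to evaluate the existence of such dependency.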
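The two indices can be computed in a few lines. The sketch below assumes that the index of determinacy (15) is the ratio between the explained and the total sum of squares, and that (16) is the usual adjusted form from multiple regression; both forms are assumptions of this example, as are the function names.

```python
def index_of_determinacy(observed, fitted):
    """Explained sum of squares over total sum of squares (assumed form of (15))."""
    mean = sum(observed) / len(observed)
    explained = sum((f - mean) ** 2 for f in fitted)
    total = sum((y - mean) ** 2 for y in observed)
    return explained / total


def adjusted_index(r2, m, k):
    """Adjusted index for m data points and k input attributes (assumed form of (16))."""
    if m - k - 1 <= 0:
        raise ValueError("need more data points than input attributes plus one")
    return 1.0 - (1.0 - r2) * (m - 1) / (m - k - 1)


observed = [1.0, 2.0, 3.0, 4.0]
print(index_of_determinacy(observed, observed))  # a perfect fit gives 1.0
print(adjusted_index(0.5, m=11, k=2))            # 0.375
```

The adjusted index penalizes models with many input attributes relative to the number of data points, so it never exceeds the unadjusted value when the latter is at most 1.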
The results obtained show that the inverse multi-dimensional F-transform performs well as a regression function for the analysis of the dependency between numerical attributes in the datasets. However, it is necessary to determine the optimal fuzzy partitions of the domains of the input attributes and to detect when the data points are not sufficiently dense. In [11] an algorithm has been proposed that finds the optimal fuzzy partitions and checks that the data points are sufficiently dense with respect to the fuzzy partitions. This algorithm is schematized in Figure 3. To reduce the computational costs, the same number n of fuzzy sets is assigned to each of the fuzzy partitions of the input attribute domains. Initially the minimum value n = 3 is set; in each cycle the algorithm checks that the data points are sufficiently dense with respect to the fuzzy partitions, then calculates the direct multi-dimensional F-transform and, for each data point, the inverse multi-dimensional F-transform, and finally measures the value of the index of determinacy. If this value exceeds a predetermined threshold α, the algorithm ends by returning the components of the direct F-transform; otherwise a further iteration is performed in which the number n is increased by one unit. If, during an iteration, the data points are not sufficiently dense with respect to the fuzzy partitions, the algorithm terminates by reporting that it has not found the dependency of Xz on the attributes X1, ..., Xk.
In [11] this algorithm is executed to explore the dependency between oceanographic and surface meteorological attributes of a dataset containing data measured by a series of buoys positioned throughout the equatorial Pacific Ocean and used to analyze the El Niño/Southern Oscillation (ENSO) cycles.
The application of the multi-dimensional F-transform as a machine learning regression function can become expensive in the presence of massive datasets, in which the number of data points and the number of features are high. In a recent work [12] an extension of the algorithm proposed in [11] is presented, called MFAD (Massive F-transform Attribute Dependency), aimed at finding dependencies between numerical attributes in massive datasets. MFAD applies a uniform sampling algorithm to partition the dataset into subsets having the same cardinality. The F-transform attribute dependency algorithm [11] is executed on each subset, returning the multi-dimensional direct F-transform components (13) and the index of determinacy (16). Let Fq be the direct F-transform vector obtained by applying the F-transform attribute dependency algorithm to the qth subset, where q = 1, …, s.
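The iterative search described above can be sketched for the simplest case of a single input attribute (k = 1). This is not the implementation of [11]: triangular basic functions, the explained-over-total form of the index of determinacy, the cap `n_max` and all function names are assumptions of this example.

```python
import math


def triangular(x, node, h):
    """Triangular basic function centred at node with half-width h (assumed shape)."""
    return max(0.0, 1.0 - abs(x - node) / h)


def fit(xs, ys, n):
    """Direct F-transform on a uniform n-node partition of [min(xs), max(xs)].

    Returns (nodes, h, components), or None if the data are not sufficiently dense.
    """
    a, b = min(xs), max(xs)
    h = (b - a) / (n - 1)
    nodes = [a + k * h for k in range(n)]
    components = []
    for node in nodes:
        w = [triangular(x, node, h) for x in xs]
        if sum(w) == 0.0:
            return None
        components.append(sum(wi * yi for wi, yi in zip(w, ys)) / sum(w))
    return nodes, h, components


def inverse(x, nodes, h, components):
    """Inverse F-transform evaluated at x."""
    return sum(F * triangular(x, node, h) for F, node in zip(components, nodes))


def determinacy(ys, fitted):
    """Index of determinacy, assumed here to be explained over total sum of squares."""
    mean = sum(ys) / len(ys)
    return sum((f - mean) ** 2 for f in fitted) / sum((y - mean) ** 2 for y in ys)


def find_dependency(xs, ys, alpha, n_max=50):
    """Increase n from 3 until the index exceeds alpha or density fails.

    n_max is a safety cap added for this sketch; it is not part of the description.
    """
    for n in range(3, n_max + 1):
        model = fit(xs, ys, n)
        if model is None:
            return None  # data not sufficiently dense: dependency not found
        fitted = [inverse(x, *model) for x in xs]
        r2 = determinacy(ys, fitted)
        if r2 > alpha:
            return n, model[2], r2
    return None


xs = [j / 200 for j in range(201)]
ys = [math.sin(2 * math.pi * x) for x in xs]
result = find_dependency(xs, ys, alpha=0.8)
print(result[0], round(result[2], 3))  # partition size found and its index
```

With only two data points the search stops immediately, because the middle basic function of the initial three-node partition contains no data.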
The functional dependency of Xz on X1, X2, …, Xk in the form Xz = H(X1, X2, …, Xk) in a point (x1, x2, …, xk) is evaluated by computing the following weighted average:

$$H^F(x_1, \ldots, x_k) = \frac{\sum_{q=1}^{s} w_q(x_1, \ldots, x_k)\,H^F_q(x_1, \ldots, x_k)}{\sum_{q=1}^{s} w_q(x_1, \ldots, x_k)}$$

where $H^F_q(x_1, \ldots, x_k)$, q = 1, …, s, is the value of the inverse multi-dimensional F-transform in the point (x1, x2, …, xk) obtained by (17) using the qth direct F-transform Fq, and the weight $w_q(x_1, \ldots, x_k)$, q = 1, …, s, is determined by the index of determinacy (16) obtained for the qth subset.
The MFAD method is schematized in Figure 4. Each subset is treated separately by applying the F-transform attribute dependency algorithm. The regression function is constituted by the weighted average of the single inverse fuzzy transforms, where the weights are the values of the index of determinacy obtained for each subset.
To test the MFAD algorithm, in [12] it was applied to a large dataset given by the Italian National Statistical Institute census database, with 140 numerical features related to census characteristics and measured for all the 402,678 Italian census tracts. In their tests the authors execute the MFAD algorithm by varying the number s of subsets and compare the results with those obtained by applying the classical F-transform attribute dependency algorithm [11] to the entire dataset (s = 1). Table 2 shows the final index of determinacy obtained by applying MFAD to explore the dependency of Xz = Families in owned residences on the attribute X1 = Resident population with job or capital income, setting a threshold α = 0.8. The results of the tests performed in [12] on large datasets show that the performances are comparable with the ones obtained using the well-known Support Vector Regression (SVM) and Multilayer Perceptron (MLP) regression methods.
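The two building blocks of this scheme, the partition of the dataset into subsets of equal cardinality and the weighted combination of the per-subset predictions, can be sketched as follows. The random shuffle used for sampling, the dropping of leftover rows and the function names are assumptions of this example, not details taken from [12].

```python
import random


def split_uniform(rows, s, seed=0):
    """Shuffle the rows and cut them into s subsets of equal cardinality.

    Rows that do not fit evenly are dropped (a simplification of this sketch).
    """
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    size = len(rows) // s
    return [rows[q * size:(q + 1) * size] for q in range(s)]


def combine_predictions(predictions, weights):
    """Weighted average of the per-subset inverse F-transform values at one point.

    weights[q] plays the role of the index of determinacy of the q-th subset.
    """
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no subset provides a usable model at this point")
    return sum(w * p for w, p in zip(weights, predictions)) / total


subsets = split_uniform(range(10), s=3)
print([len(subset) for subset in subsets])            # [3, 3, 3]
print(combine_predictions([1.0, 3.0], [0.5, 0.5]))    # 2.0
```

Subsets whose model fits the data better receive a larger weight, so a poorly fitting subset has limited influence on the final regression value.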

Multi-Dimensional F-Transform Techniques for Mining Association Rules
In [10] a method based on the multi-dimensional F-transform for mining association rules in the data is proposed. The inverse multi-dimensional F-transform (14), applied to find a dependency of the attribute Xz on the attributes X1, …, Xk in the form Xz = H(X1, …, Xk), can be used to mine association rules.
However, unlike the functions describing dependency between attributes, associations are fuzzy functions which establish a correspondence between universes of fuzzy sets.
Let U1, …, Uk be the domains of k attributes partitioned by fuzzy sets: an association functionally joins some fuzzy sets from the partitions of U1, …, Uk with fuzzy sets over the respective F-transform components.
Let {$A_{i1}$, …, $A_{ih}$, …, $A_{in_i}$} be a uniform fuzzy partition of the domain of the ith attribute Xi, constructed as basic functions of this domain. The fuzzy partition is obtained on the ni nodes $x_{i1}$, …, $x_{in_i}$ in the domain Ui. Each association is supported by two parameters, namely the degree of support r and the degree of confidence γ. In [10] the multi-dimensional F-transform is applied in order to discover association rules of the form (20), whose antecedent is the conjunction of k clauses, one for each attribute Xi, i = 1, …, k. In the ith clause the fuzzy set $A_{ih_i}$ models the meaning of the linguistic expression "approximately $x_{ih_i}$", so that the clause can be read as "Xi is approximately $x_{ih_i}$". The label C in the consequent is one of the following linguistic expressions characterizing the (h1, …, hk)th component of the F-transform: Sm (small), Me (medium), Bi (big); it may be combined with one of the following linguistic hedges: Ex (extremely), Si (significantly), Ve (very), empty hedge, ML (more or less), Ro (roughly), QR (quite roughly), VR (very roughly). Let Oj, j = 1, 2, …, m, be the jth data point with components (pj1, pj2, …, pjk, pjz).
To measure the strength of the fuzzy rule (20), in [10] a membership function of an induced fuzzy set on the set of m data points {O1, …, Om} is defined by considering the antecedent of the hth rule (20) and the membership degrees $A_{ih_i}(p_{ji})$ of the jth data point to the fuzzy set of the ith attribute. The degree of support of the association rule (20) is computed from this induced fuzzy set. The strength of the hth association rule is evaluated by measuring the degree of support r and the degree of confidence γ. If the two parameters are greater than or equal to a degree of support threshold and a degree of confidence threshold, respectively, the association is found.
In [10] this method is tested on a dataset of measures of air pollution produced on a road related to traffic volumes and weather conditions, collected by the Norwegian Public Roads Administration.

F-Transform Techniques for Time Series Analysis
Time series forecasting involves methods for fitting over historical data referring to measures of an observable series and using them to predict future observations. A time series is given by a set of data measured at different times listed in time order. Let y be a measured parameter and y(t) the measure performed at the time t. A time series is a function y: t ∈ N → y(t)∈ R known in n regular time steps y(1),y(2), …,y(n), where y(i), i = 1,2, …,n, is the measured value of y at the ith time step.
Time series forecasting techniques assess the value of y in the m future time steps y(n + 1), ..., y(n + m), where the value y(t + 1) at the step t + 1 is evaluated as a function of the previous p + 1 measured values y(t), y(t − 1), ..., y(t − p). Let y(t), t = 1, 2, …, T, be a time series. It can be decomposed into the following two terms:

$$y(t) = f(t) + r(t) \qquad (25)$$

The term f(t) is a deterministic part, called trend; the term r(t) is an additional random function, called residuals, giving the random error with respect to the trend at the time t. A general model of a stationary time series, in which y(t) is a linear function of the p previous measured values y(t − 1), ..., y(t − p), is the Auto-Regressive model of order p, AR(p), given by ([13,14]):

$$y(t) = \sum_{i=1}^{p} \alpha_i\,y(t - i) + \varepsilon_t \qquad (26)$$

The p coefficients α1, …, αp must satisfy some constraints, and the term εt is the statistical white noise giving the fluctuations in the observations that cannot be explained by the model.
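A minimal sketch of forecasting with an autoregressive model of order p follows. It assumes that the coefficients are already known and replaces the white noise term with its expected value of zero; the function names are illustrative.

```python
def ar_forecast(series, coeffs):
    """One-step forecast: coeffs[0] * y(t) + coeffs[1] * y(t-1) + ...

    The noise term is omitted (replaced by its zero mean).
    """
    p = len(coeffs)
    if len(series) < p:
        raise ValueError("need at least p observations")
    recent = series[-1:-p - 1:-1]  # y(t), y(t-1), ..., y(t-p+1)
    return sum(a * y for a, y in zip(coeffs, recent))


def forecast_ahead(series, coeffs, m):
    """Forecast m future steps, feeding each forecast back as a new observation."""
    history = list(series)
    for _ in range(m):
        history.append(ar_forecast(history, coeffs))
    return history[len(series):]


print(ar_forecast([1.0, 2.0, 3.0], [0.5, 0.5]))        # 2.5
print(forecast_ahead([1.0, 2.0], [2.0, -1.0], 3))      # [3.0, 4.0, 5.0]
```

The coefficients in the second call are chosen only to make the arithmetic easy to follow; they extrapolate a straight line and do not describe a stationary series.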

One-Dimensional F-Transform Time Series Models
In [15,16] the one-dimensional F-transform is applied to approximate the trend f(t) in (25). Let {y(t), t = 1,2…,T} be a time series given by a set of data y(t) measured in T regular time intervals. Let {t1 = 1, t2, …, tn = T} be a set of n nodes of the interval [1,T], where 3 ≤ n ≤ T, and {A1,...,An} be the basic functions of a uniform fuzzy partition of the interval [1,T].
If the dataset given by the time series {y(t), t = 1, 2, …, T} is sufficiently dense with respect to this fuzzy partition, then there exists the direct one-dimensional F-transform of f with components

$$F_k = \frac{\sum_{t=1}^{T} y(t)\,A_k(t)}{\sum_{t=1}^{T} A_k(t)}, \qquad k = 1, \ldots, n$$

Let Pk, k = 1, ..., n, be the subset of {1, 2, …, T} given by the time steps t for which Ak(t) > 0. We can decompose y(t) as:

$$y(t) = F_k + r_{tk}, \qquad t \in P_k$$

where $r_{tk}$ is the kth residual of y(t) with respect to Ak, given by $r_{tk} = y(t) - F_k$. Based on the autoregressive model (26), in [15,16] the kth component Fk, which assesses the trend in the kth node, is given by a linear combination of the p previous components:

$$F_k = \sum_{i=1}^{p} \alpha_i\,F_{k-i}$$

In [15,16] p = 3 is set. The calculated values of the components up to Fn are used to forecast the unknown value $F_{n+1}$ as

$$F_{n+1} = \hat{\alpha}_1 F_n + \hat{\alpha}_2 F_{n-1} + \hat{\alpha}_3 F_{n-2}$$

The values $\hat{\alpha}_1$, $\hat{\alpha}_2$, $\hat{\alpha}_3$ chosen for the three coefficients α1, α2, α3 minimize the absolute difference between the predicted and the calculated values of Fn. In [15] a numerical method and a Multilayer Perceptron neural network are used to find the optimal values of the coefficients α1, α2, α3. In [16] a method based on fuzzy relations is proposed to find the best values of the three coefficients.
In [16] comparisons with the ARIMA autoregressive model and with other fuzzy-based time series models are performed, and the MAPE and SMAPE indexes are used to measure the forecast errors; the authors showed that their F-transform-based time series forecasting model has the best performance.
In [17] the one-dimensional F-transform is proposed to filter the high frequencies in the time series. A time series can be additively decomposed into three components: a trend cycle, a seasonal component, and noise. The authors prove that the one-dimensional F-transform acts as a low-pass filter, removing or significantly reducing the seasonal and noise components; then, the inverse F-transform optimally approximates the trend component.
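The low-pass behaviour can be observed on a synthetic series. The sketch below is only an illustration under stated assumptions (triangular basic functions, a linear trend with slope 0.05, a seasonal component of period 12 and 11 nodes); it is not the analysis carried out in [17], and `smooth` is a hypothetical helper.

```python
import math


def triangular(x, node, h):
    """Triangular basic function centred at node with half-width h (assumed shape)."""
    return max(0.0, 1.0 - abs(x - node) / h)


def smooth(series, n):
    """Direct then inverse F-transform of a series over the time steps 1..T."""
    T = len(series)
    h = (T - 1) / (n - 1)
    nodes = [1 + k * h for k in range(n)]
    times = range(1, T + 1)
    components = []
    for node in nodes:
        w = [triangular(t, node, h) for t in times]
        components.append(sum(wi * y for wi, y in zip(w, series)) / sum(w))
    return [sum(F * triangular(t, node, h) for F, node in zip(components, nodes))
            for t in times]


T = 240
trend = [0.05 * t for t in range(1, T + 1)]
series = [tr + math.sin(2 * math.pi * t / 12) for t, tr in zip(range(1, T + 1), trend)]
smoothed = smooth(series, n=11)

# Mean absolute distance from the true trend, before and after the F-transform.
raw_dev = sum(abs(y - tr) for y, tr in zip(series, trend)) / T
ft_dev = sum(abs(s - tr) for s, tr in zip(smoothed, trend)) / T
print(raw_dev, ft_dev)
```

Each basic function spans several seasonal periods here, so the seasonal oscillation largely averages out in the components; the remaining deviation is concentrated near the two ends of the series.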

Multi-Dimensional F-Transform Time Series Model
In [18] a time series forecasting model based on the multi-dimensional F-transform is proposed. The authors applied their method to the well-known Mackey-Glass time series, generated by the delay differential equation:

$$\frac{dy(t)}{dt} = \frac{\beta\,y(t - \tau)}{1 + y(t - \tau)^{n}} - \gamma\,y(t)$$

where β, γ and n are positive parameters and τ is the delay. In [18] the function y(t) is approximated from the six previous values y(t − 6), y(t − 5), …, y(t − 1) by constructing a multi-dimensional F-transform that approximates the output variable y as a function of the six variables xi = y(t − i), i = 1, …, 6.
To construct the components of the direct multi-dimensional F-transform, the N points ($x_1^{(j)}$, $x_2^{(j)}$, …, $x_6^{(j)}$, $y^{(j)}$) are considered, where j = 1, …, N. They are given by

$$F_{h_1 \ldots h_6} = \frac{\sum_{j=1}^{N} y^{(j)}\,A_{1h_1}(x_1^{(j)}) \cdots A_{6h_6}(x_6^{(j)})}{\sum_{j=1}^{N} A_{1h_1}(x_1^{(j)}) \cdots A_{6h_6}(x_6^{(j)})}$$

The inverse F-transform is given by

$$y^F_{n_1 \ldots n_6}(x_1, \ldots, x_6) = \sum_{h_1=1}^{n_1} \cdots \sum_{h_6=1}^{n_6} F_{h_1 \ldots h_6}\,A_{1h_1}(x_1) \cdots A_{6h_6}(x_6) \qquad (35)$$

To assess the value of the function y(t) at the time t from the values obtained in the six previous time steps, xi = y(t − i), i = 1, …, 6, formula (35) is applied, obtaining the following:

$$y(t) \approx y^F_{n_1 \ldots n_6}(y(t - 1), y(t - 2), \ldots, y(t - 6))$$

In [18] the authors compare the results obtained by applying this method to the Mackey-Glass time series with those obtained by using the well-known Wang and Mendel method and with the results obtained using a local Wavelet Neural Network with three layers, six input nodes, 10 hidden nodes and one output node. They measure the MAPE, RMSE and MADMEAN indices, showing that the multi-dimensional time series method has the best performance.
The multi-dimensional fuzzy transform method [18] can be generalized to any function with a dependency on k input parameters. In [19] it is applied to forecasting problems in spatial analysis. The framework proposed in [19] is schematized in Figure 5. The area of study is partitioned into subzones. For each subzone a training dataset with the measures of the characteristics of the subzone in a specified period is extracted. Then, the time series corresponding to a measured characteristic f(t) from a time t = 0 to t = T is constructed and the multi-dimensional F-transform prediction method [18] is applied to assess the value of f at the time T + Δt. The RMSE and the MADMEAN are used to evaluate the performance of the forecasting model. Finally, two thematic maps, one of the predicted value of the characteristic at the time T + Δt and one of the prediction error in each subzone, are produced after performing a fuzzification process. This approach is encapsulated in a Geographical Information System and is tested in [19] to analyze the demographic balance data measured every month in the period 01/01/2003-31/10/2014 in the municipalities of the Cilento and Vallo di Diano National Park, located in the province of Salerno (Italy). The birth rate and death rate in November 2014 in each municipality are evaluated. The mean RMSE obtained is under 0.01.
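The N training points used above are obtained by sliding a window over the series. A minimal sketch, with the hypothetical helper name `lagged_samples`, is:

```python
def lagged_samples(series, lags=6):
    """Build pairs ((y(t-1), ..., y(t-lags)), y(t)) from a list of observations."""
    samples = []
    for t in range(lags, len(series)):
        inputs = tuple(series[t - i] for i in range(1, lags + 1))
        samples.append((inputs, series[t]))
    return samples


series = list(range(10))  # stand-in for measured values y(0), ..., y(9)
samples = lagged_samples(series)
print(len(samples))       # 4
print(samples[0])         # ((5, 4, 3, 2, 1, 0), 6)
```

Each pair gives the six input coordinates and the output value of one data point; a series of T observations yields T − 6 such points.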

F-Transform Seasonal Time Series Model
In some time series a phenomenon called seasonality is present, given by a repetitive and regular pattern of changes that repeats over S time periods. For example, in a monthly time series S = 12, in an hourly time series S = 24, and so on.
Some well-known statistical models, such as the Seasonal Auto Regressive Integrated Moving Average (SARIMA) models [20,21], are used to forecast the value of the output variable at a time t as a combination of the trend with a seasonal component.
In [22] a seasonal time series forecasting method based on F-transforms is proposed, called Time Series Seasonal F-transform (TSSF). A polynomial best fit is applied to extract the trend; then the data are de-trended by subtracting the trend from the time series, and the de-trended time series is partitioned into S subsets. The one-dimensional F-transform is applied to each subset to assess the corresponding seasonality.
To assess the value of the output variable y at the time t included in the sth season, with s in {1, 2, …, S}, we calculate the inverse F-transform $f^F_{n(s)}(t)$.
Let Fh, where h = 1, 2, …, n(s), be the hth component of the one-dimensional direct F-transform calculated by using a fuzzy partition of n(s) basic functions of the domain of the sth subset. The one-dimensional inverse F-transform calculated at the time t is given by

$$f^F_{n(s)}(t) = \sum_{h=1}^{n(s)} F_h\,A_h(t) \qquad (37)$$

The forecasted value $\hat{y}(t)$ of the output y at the time t included in season s is

$$\hat{y}(t) = trend(t) + f^F_{n(s)}(t)$$

where the term trend(t) is the assessed value of the trend of the time series at the time t. In the TSSF model, the technique proposed in [11] is applied to verify that each subset of data is sufficiently dense with respect to the fuzzy partition and to find the best fuzzy partition. To find the best fuzzy partition for each subset the MADMEAN measure (39) is calculated, i.e., the ratio between the mean absolute deviation of the fitted values from the observed ones and the mean of the observed values. The number of fuzzy sets of the initial fuzzy partition is set to 3; then, the sufficient density of the data with respect to the fuzzy partition is verified and the direct F-transform is calculated. The inverse F-transform in each time $t^{(j)}$, where j = 1, …, Ms, is calculated by formula (37) and, finally, the MADMEAN index (39) is measured. If the MADMEAN index is below a fixed threshold, then the process stops and the direct F-transform components are stored; otherwise, the number of fuzzy sets of the fuzzy partition n(s) is increased by one unit and the previous steps are iterated. This process is executed for each seasonal subset.
Figure 6 shows the flow diagram of the TSSF model in [22].
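The error indices mentioned in this section can be sketched as follows, assuming their usual definitions (root mean square error, mean absolute percentage error, and mean absolute deviation divided by the mean of the observations). The exact form used in [22] for MADMEAN is not reproduced here, so this is an assumption of the example.

```python
import math


def rmse(observed, predicted):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def mape(observed, predicted):
    """Mean absolute percentage error; observed values must be non-zero."""
    return 100.0 / len(observed) * sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted))


def madmean(observed, predicted):
    """Mean absolute deviation divided by the mean of the observed values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / sum(observed)


observed = [2.0, 4.0]
predicted = [1.0, 5.0]
print(rmse(observed, predicted))     # 1.0
print(mape(observed, predicted))     # 37.5
print(madmean(observed, predicted))
```

Unlike MAPE, the MADMEAN ratio stays well defined when single observations are zero, provided the mean of the observations is not.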
In [22] many tests are performed comparing the performance of TSSF with that of other forecasting algorithms applied to seasonal time series. Comparisons are executed with respect to the statistical Average Seasonal Variation (avgSV) and Seasonal ARIMA models [21], the model based on the multi-dimensional F-transform (MF-tr) [18] and the soft computing forecasting models Support Vector Machine (SVM) [23] and Automatic Design of Artificial Neural Networks (ADANN) [24]. Table 3 shows the RMSE obtained by applying these models to a set of 14 seasonal time series giving the daily mean temperature measured by 14 weather monitoring stations located in the province of Genova (Italy). In each experiment, the month is used as seasonality and each dataset is partitioned into twelve subsets. The results in Table 3 show that the performance of TSSF is better than that obtained by using avgSV, SARIMA and MF-tr, and comparable with that obtained by using SVM and ADANN. In addition, SVM and ADANN are computationally more complex to manage than TSSF. A critical point of TSSF is its inability to manage irregular time series, in which it is complex to evaluate time series patterns in the data.
In [9] an extension of the TSSF model has been proposed, based on the use of the first-order F-transform. This model improves the performance of the TSSF model but increases its computational complexity.

F-Transform in Data Classification
In Section 3 we analyzed techniques that use the multi-dimensional F-transform as a regression function to explore dependency between data ([10,11]). In [25] a classification method based on the use of the multi-dimensional F-transform is proposed. The proposed algorithm, called MFC (Multi-dimensional F-transform Classification), computes the direct and inverse multi-dimensional F-transforms to classify data points.
The learning dataset is given by a set of data points characterized by a pair (X,Y), where X is a vector of s numerical features (X1,…Xs) and Y is the class feature designated as class which has C categories, labelled with the values 1,2, …,C.
The multi-dimensional F-transform is applied to explore a relation between attributes in the form:

$$Y = f(X_1, \ldots, X_s)$$

MFC uses the multi-dimensional inverse F-transform to approximate the function f. To control the presence of over-fitting, the K-fold cross validation resampling algorithm is applied. K-fold cross validation is a well-known resampling technique in which the dataset is partitioned into K subsets of equal size called folds. The classification algorithm is iterated K times. At each iteration one fold constitutes the validation set and the union of the other K − 1 folds forms the training set, used to train the classifier. With respect to other resampling techniques, K-fold is more efficient in dealing with the over-fitting problem, as in K-fold each fold is treated once as a validation set.
Let P = (p1, p2, …, ps) be a data point. Formally, if $F^{(k)}$ is the multi-dimensional direct F-transform calculated by using the kth fold and $f^{F,(k)}_{n_1 \ldots n_s}(p_1, \ldots, p_s)$ is the value of the corresponding multi-dimensional inverse F-transform calculated in P, then an average of the K inverse F-transforms in the point P is calculated as

$$f^F_{n_1 \ldots n_s}(p_1, \ldots, p_s) = \frac{1}{K} \sum_{k=1}^{K} f^{F,(k)}_{n_1 \ldots n_s}(p_1, \ldots, p_s)$$

The final index CV1 gives the average, over the K iterations, of the percentage of misclassified data points in the training sets, and the final index CV2 gives the average of the percentage of misclassified data points in the validation sets. CV1 and CV2 are used to evaluate the performance of MFC. If CV1 is under a fixed threshold α and CV2 is under a fixed threshold β, then the algorithm stops; otherwise a finer set of fuzzy partitions of the domains of the s input variables is constructed and the process is iterated.
Figure 7 shows the flow diagram of MFC. In [25] comparison tests are performed on over 100 classification datasets extracted from the University of California, Irvine (for short, UCI) Machine Learning repository and from the Knowledge Extraction based on Evolutionary Learning (KEEL) repository.
Table 4 shows the mean accuracy, precision and recall classification measures obtained by running MFC, the decision tree-based J48 [26], Multi-Layer Perceptron [27], naive Bayes [28] and Lazy K-Nearest Neighbor IBK [29]. These results show that MFC provides better classification performance than the naive Bayes and Lazy IBK algorithms, and performance comparable with that of the decision tree J48 and Multilayer Perceptron algorithms.
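The fold construction used in K-fold cross validation can be sketched as follows. The helper name `k_fold_splits` is illustrative; in practice the indices are usually shuffled first, and when the number of samples is not a multiple of K the fold sizes differ by at most one.

```python
def k_fold_splits(n_samples, K):
    """Yield (train_indices, validation_indices) for each of the K folds."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // K + (1 if i < n_samples % K else 0) for i in range(K)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


for train, validation in k_fold_splits(10, 5):
    print(validation, len(train))
```

Every sample appears in exactly one validation set, so each data point is used once for validation and K − 1 times for training.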
A weak point of the MFC algorithm is its high computational complexity, which makes it unsuitable for managing massive and high-dimensional datasets.
The integration with data compression and feature selection approaches in the preprocessing phase can reduce these high computational costs. An approach that integrates Principal Component Analysis (PCA) feature reduction techniques with the higher-degree F-transform has been proposed in [30] for image classification. A mixed model that integrates higher-degree F-transform and PCA techniques could be tested in data classification to reduce the number of features and improve the accuracy and precision of the classifier model, without significantly increasing the time consumption.

Conclusions
This paper presents a summary of the data analysis techniques proposed in the literature based on the use of the F-transform in one or more dimensions. We initially presented the definition of the one-dimensional direct and inverse F-transform, showing how it can be used to approximate a continuous function on a real interval. We then extended this concept to the multi-dimensional F-transform, showing how it can be used in regression analysis. In particular, attention was paid to the constraint of sufficient data density with respect to fuzzy partitions, which is extremely important for the choice of the optimal cardinality of fuzzy partitions. Then, the methods proposed in the literature for the analysis of the dependency between attributes in the data and for the extraction of association rules through the use of direct and inverse multi-dimensional F-transforms were presented and analyzed. An extensive discussion was devoted to the different time series analysis techniques based on the F-transforms proposed in the literature. Finally, a classification method recently presented in the literature based on the multi-dimensional F-transform was described.
The use of F-transform-based approaches in data analysis remains an evolving research field. We foresee that new approaches based on the F-transform may be presented that reduce the time consumption and computational complexity which currently prevent the application of these techniques to massive and high-dimensional data. Such a reduction would also allow high-order F-transforms to be used in data analysis, improving the performance obtained using the zero-order F-transform. Hybrid strategies that combine the high-order F-transform with a reduction of the data size could lead to an optimal trade-off between the quality of the results and the processing times.
In the future, the multidimensional zero and high-order fuzzy transform methods may be included into soft computing hybrid models for the analysis of risk prediction and damage assessment proposed in recent soft computing risk analysis and forecasting models such as damage assessment of existing buildings [31] and entity assessment of the damage that can be produced on them by seismic events [32]. Moreover, fuzzy transform methods can be applied for the solution of fuzzy differential equations [33] and fuzzy partial equations [34] in data analysis models for complex systems.