Article

Assessment of Feature Selection Algorithms for Knowledge Discovery from Experimental Data

Working Group Electrotechnical Systems of Mechatronics, Kaiserslautern University of Applied Sciences, 67659 Kaiserslautern, Germany
* Author to whom correspondence should be addressed.
Machines 2026, 14(1), 104; https://doi.org/10.3390/machines14010104
Submission received: 1 December 2025 / Revised: 9 January 2026 / Accepted: 13 January 2026 / Published: 16 January 2026
(This article belongs to the Special Issue Reliable Testing and Monitoring of Motor-Pump Drives)

Abstract

Maintenance and repair play a crucial role in industry. Smart systems for technical diagnostics can help to save money and to prevent the breakdown of machines and plants. These systems and their classifiers benefit from plausible features because they tend toward robust classification. Although concepts for knowledge discovery are well known in various scientific fields, they are not established in the field of rotating machines. Knowledge discovery from experimental data is a framework that combines valid methods for knowledge discovery with expert knowledge and automated experiments. For the central data mining step, feature selection algorithms based on heuristic or metaheuristic search are established. The objective is to identify plausible patterns with a limited number of features and the best combination of these features. The results in this work show which strategies best align with the requirements of knowledge discovery from experimental data to find plausible features. For this study, well-configured search strategies, namely, sequential forward selection and ant colony optimization, were applied to real data. The data represent several fault severity levels for parallel misalignment and cavitation. The plausible feature vectors and features exhibited good behavior when applied to new targets. It is expected that the obtained knowledge will be transferable to new classification tasks with only minimal optimization of the reference data or the classifier.

1. Introduction

Maintenance and repair are crucial for the reliable operation of plants and units in industry. The objective is always to reduce costs due to wear and failure. In addition, the cost of the measurements should also be low. Three basic strategies are used to deal with used equipment [1]: time-based, fault-based, and condition-based maintenance. They differ in terms of the provision of spare parts, the duration of the repair, and the monitoring effort. Condition-based maintenance is of special interest because the breakdown reserve can be used to a greater extent than with a time-based strategy, and without the risk of breakdown that a fault-based strategy entails. To put this into effect, sophisticated monitoring is necessary. This could be realized using multiple sensors placed on the parts concerned. With this strategy, all necessary information is gathered through measurement. The drawback here is the cost, which is only reasonable if the losses would otherwise exceed the investment in technical diagnostics or if there is a high risk of breakdown. A different approach is low-sensor monitoring, in which all necessary information is obtained by a few sensors and smart algorithms. This cost-effective measure makes condition-based maintenance applicable even to smaller plants and units. Furthermore, the low number of sensors per unit also allows the monitoring to be extended to machine fleets. Finally, the fact that only electrical signals are processed enables monitoring in areas that are restricted due to lack of accessibility, confined spaces, aggressive fluids, dust, or the danger of explosion. This is possible because the sensors can be attached at the end of the power cable.
In the field of technical diagnostics of rotating machinery, several approaches have been established to improve the diagnostics regarding monitoring on the algorithm side. Analytical approaches improve knowledge by comparing models with real measurements [2]. The complexity of a model drastically increases with a higher number of relations. This is why this approach is only used to identify and describe basic phenomena. The obtained knowledge then can be used for optimizing diagnostic algorithms. Data-driven approaches use data from experiments or field measurements. They allow for systems to be analyzed under multiple influences. These methods focus on the relations between features, a fault, and several disturbances. Although there are powerful algorithms for data mining that easily detect hidden patterns, the identification of basic phenomena can be challenging. The reason for this is that not all features identified by a search algorithm are mono-causal. The objective here is to separate the features in a found feature vector and to check them for plausibility.
These approaches and technical diagnostics itself are mainly shaped by two independent fields: electrical engineering and computer science. On the one hand, publications from the field of electrical engineering mainly focus on the machines and the description of fundamental effects [3,4,5,6,7,8,9]. This method is called the traditional method, according to [10]. Nevertheless, in this field, a large amount of expert knowledge is available. On the other hand, computer scientists focus on algorithms that allow for knowledge to be derived from Big Data [11,12,13,14,15,16,17] and in a systematic way [10]. In this field, state-of-the-art algorithms and methods for knowledge discovery are discussed. Both fields yield important results for technical diagnostics. Unfortunately, their basic concepts are often not combined. This means that studies in the field of electrical engineering lack concepts such as knowledge discovery in databases (KDD) as a systematic approach to finding new knowledge. Studies in the field of computer science, on the other hand, suffer from a lack of expert knowledge and data. In addition, they cannot apply their methods to every specific scientific area or adapt them accordingly. As a result, issues with knowledge discovery itself, implementation, or execution are not discussed.
In the following, a new framework called Knowledge Discovery from Experimental Data (KDED) [18] will be presented. The aim of this method is to find fundamental relations between the measured signals and the condition of a machine. These findings should make the decision-making process in classification transparent, should clarify the limits of a classification, and should be transferable to similar problems. In order to achieve this, expert knowledge about rotating machines and the concept of KDD were combined. The alignment of automated experiments and data mining then allows for relations to be found between faults and unknown features, even if distortions occur. By preparing the data through the experiments, the algorithm used for the data mining is forced to select only potentially plausible features and feature sets. To check the results for plausibility, only general metrics, like kurtosis, variance, or RMS, and expert features from motor current signature analysis (MCSA) were calculated. These features are physically interpretable. By combining all physical quantities with these metrics, a big feature pool is built up. Therefore, data mining needs powerful search algorithms, which have to find all relevant features and feature combinations.
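To make the feature pool construction concrete, here is a minimal Python sketch, assuming the pool is built by crossing every measured quantity with a set of general metrics; the real pool also contains spectral and MCSA expert features not shown here:

```python
import numpy as np

def general_metrics(x):
    """General, physically interpretable metrics named in the text:
    RMS, variance, and kurtosis of one measured signal."""
    x = np.asarray(x, dtype=float)
    mu, sigma = np.mean(x), np.std(x)
    return {
        "rms": float(np.sqrt(np.mean(x ** 2))),
        "variance": float(np.var(x)),
        # Kurtosis as the normalized fourth central moment E[(x-mu)^4]/sigma^4
        "kurtosis": float(np.mean((x - mu) ** 4) / sigma ** 4),
    }

def build_feature_pool(signals):
    """Cross every physical quantity with every metric -> feature pool.
    `signals` maps a signal name (e.g., a phase current) to its samples."""
    return {f"{name}_{metric}": value
            for name, x in signals.items()
            for metric, value in general_metrics(x).items()}
```

For a pure sine (such as an ideal phase current), this yields an RMS of amplitude/√2 and a kurtosis of 1.5, which is why such metrics react to fault-induced distortion.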
The method described in this paper is based on publications that will be listed and described in the following. In recent decades, the community around MCSA has spent great effort on identifying basic phenomena and related features. These relations were assessed for faults such as broken rotor bars [19], eccentricity in the rotor shaft [20], and bearing faults [21]. The range was enlarged with the faults of a coupled machine, such as gear faults [22], misalignment [3], and cavitation [23]. In addition to basic phenomena that influence the spectra in the case of a fault, further reasons for changing a specific component in the spectra were identified. The most important disturbance in technical diagnostics is a changing load, which was researched in References [5,24]. In addition, in [25] it was shown that different faults also can influence the same component in the spectra. This means that the diagnosis of several targets becomes difficult. To solve this issue, studies to find new mono-causal features were conducted, e.g., in [26], based on instantaneous power, or in [27], based on space vector components. Another important research path is Advanced Transient Current Signature Analysis (ATCSA), which examines the transient part of a motor signal by applying Wavelet Transformation (WT). This smart approach uses the startup of a motor, which occurs periodically in most rotating machine applications. The effects on which the approach focuses are independent of the load, as described in [28]. An assessment of eligibility for the mentioned fault was conducted in [29].
Information-rich features are a limited option for solving problems in technical diagnostics. The other option is to combine features into feature vectors that can deal with multi-dependencies. Algorithms from the field of Artificial Intelligence (AI) are capable of this task. In [30], an overview of the most important algorithms is given, separated into Deep Learning and Machine Learning. Studies such as [25,31,32] show the successful application of Machine Learning, while [33,34,35] show the application of Deep Learning.
Although these examples obtained good results, data-driven approaches do come with some issues. The first is the need for sufficient [36] and high-quality data [34]. These data are needed, on the one hand, for training and, on the other hand, for validation. According to [37], using online training in combination with anomaly detection could be a solution. The second issue is how to transfer trained models to new problems. In [38], transfer learning was used to bring the parameters of an already trained convolutional neural network to a new model. The third issue is optimization. In many studies, AI algorithms were applied without adapted dimension reduction or hyperparameter tuning. There are two ways of reducing the dimension: transformation into a lower feature space or feature (subset) selection. The objectives of feature selection are to reduce both the classification error and the number of features [39]. This helps to identify the relevant information and to check the results for plausibility, as the features remain physically interpretable, as discussed in [40,41].
Besides the progress made with powerful AI algorithms applied directly to technical diagnostics tasks for data mining, important work has been carried out in describing methods for finding valid knowledge in data. One strategy, which was first described in [17], is knowledge discovery in a database. The developed framework has five parts, which are described in detail in [16]: data selection, preprocessing, transformation, data mining, and interpretation. This indicates that the extraction of knowledge is more than the application of data mining, but data mining is the central process. These findings were also reported in [42]. As a consequence, the alignment of KDD with experiments to achieve valid knowledge for fault (shaft) misalignment was presented in [43]. The result was a list of feature sets that are capable of diagnosing misalignment on different machines. An assessment of KDD for knowledge extraction was conducted in [44]. A hands-on application of KDD was demonstrated in [45]. The objective of the study was to find features and basic phenomena related to wear on a milling tool. Another work, ref. [46], addresses the interpretation step to obtain valid feature sets from experiments with low variation in plants and machines. Unlike the other works, the authors used a method called Feature-Extraction, -Selection, Classification/Regression (FESC/R). This method uses the same setup for the data mining as KDED, but it lacks the aligned experiment and the interpretation. Depending on the field in which KDD is applied, there are different data mining strategies, as mentioned in [14]. For the field of technical diagnostics, FESC/R is a suitable choice, and so KDED utilizes this technique.
KDED is designed to find features with one experiment that allow for technical diagnostics on every machine set of the same kind. Recent studies have researched this issue. As described in ref. [47], the problem is that the distribution of the training data differs from the distribution in the field; this is called domain shift. The reasons for this are influences like load or temperature that are not considered during the analysis. Leave-one-group-out (LOGO) validation is a method to check the results against unconsidered influences, as described in [48]. Although LOGO is intended for application within KDED, it has not been implemented yet. Nevertheless, in ref. [49] it was shown that concepts based on FESC/R deal better with domain shift than deep learning-based concepts.
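As an illustration of the LOGO idea, a minimal, library-free split generator can be sketched as follows; each group would correspond to one machine or operating condition whose unconsidered influence should be checked:

```python
import numpy as np

def leave_one_group_out(groups):
    """Yield (train_idx, test_idx) pairs, holding out one group at a time.

    Training on all but one group and testing on the held-out group
    exposes sensitivity to influences (e.g., load, temperature, machine)
    that cause domain shift when they are not considered."""
    groups = np.asarray(groups)
    for g in np.unique(groups):
        yield np.where(groups != g)[0], np.where(groups == g)[0]
```

A classifier trained per split would then report one held-out validation error per group.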
According to [50], feature selection is the major process in data mining for technical diagnostics. In feature selection, a feature pool is searched for features and combinations of features, respectively, to form feature sets that are information-rich. This enables the optimization of a classifier used in technical diagnostics. Depending on their structure, the algorithms are classified as filter, wrapper, embedded, or hybrid [51]. Another classification can be made using the basic search strategy [52]: exhaustive, heuristic, or metaheuristic. Representatives of the heuristic class are sequential forward selection (SFS) and sequential backward selection (SBS); representatives of the metaheuristic class are ant colony optimization (ACO) [53,54,55] and simulated annealing [56]. The latter algorithms are capable of finding the globally best results [39]. An issue with SFS and SBS is the nesting effect, which is described in [57]. The authors also describe a solution, which led to the further development of sequential forward floating selection (SFFS) and sequential backward floating selection (SBFS).
The contribution of this paper is a detailed description of KDED and its placement in the field of knowledge discovery. The relevance of plausible over statistical features is emphasized. In addition, the characteristics of metaheuristic search strategies in the context of knowledge discovery are shown with real-world data. A special spotlight is on the ability to find feature vectors that fulfill all the needs of KDED.
To this end, ant colony optimization (ACO) was applied. This algorithm is capable of dealing with five major requirements of KDED: selecting the globally best feature sets, combining features, avoiding the nesting effect, multi-objective optimization, and a feasible search. With the basic idea of ACO, the search for globally best feature sets and the combination of features are fulfilled. In order to avoid the nesting effect, the concept of expenditure is introduced. The ACO algorithm also enables multi-objective optimization. This means that the cost function respects both the classification error and the number of features. Since KDED looks for explainable results, the number of features in a feature vector is limited. Considering this fact, the search speed can be improved by limiting the vector length. For this, the concept of stamina is implemented.
KDED and ACO were applied to data from two fault scenarios. The first fault is parallel misalignment. Misalignment can occur between the shafts of a motor and a working machine when they have to be separately mounted. Furthermore, misalignment occurs during operation because of vibration or thermal expansion. During the operation of such a system, load changes are common, so the load is intended as a disturbance. The second fault is cavitation, which arises when upstream pressure in a pump system is not sufficient. If this occurs, vapor bubbles arise in the media. With the increasing pressure of the downstream, these bubbles implode and could damage nearby surfaces. For pump systems, changing volumetric flows is common. Because of this, the flow is a disturbance in this scenario.

2. Methodology

2.1. Knowledge Discovery from Experimental Data

As shown in the introduction, there are powerful expert features and algorithms for the technical diagnostics of rotating machines. Tests and experiments on partially or fully automated test benches for rotating machines are daily business. Furthermore, there is KDD, a well-discussed framework that allows for the extraction of valid knowledge from data. In the following, a new framework will be presented that combines all the advantages of these aspects. The core elements are the use of general and expert features, the alignment of the experiment and the data mining, and verification of the results and their plausibility. All of this is performed to find basic phenomena for a diagnostic task that should allow for the transfer of classification models.
The framework of Knowledge Discovery from Experimental Data (KDED) is depicted and compared with KDD in Figure 1. The first difference between the two frameworks is a database in the case of KDD and an experiment in the case of KDED. KDD first appeared in economics to find knowledge in a company’s database and in astronomy to find patterns in sky images [10]. In these scenarios, new data cannot be created for the processing; they can only be selected, transformed, or combined. For KDED, the data comes from an experiment designed for data mining. This means it is guaranteed that the information is there, the data builds clusters [58], and the data is optimized for the feature selection algorithm. In addition, it is guaranteed that there is enough data for all classes.
The second difference is the feedback. Depending on the results, modifications such as exchanging the feature selection algorithm, tuning the search parameters, or selecting the data can be made with both methods. For KDD no feedback to update the data is intended, but for KDED the modification of the data generation itself is possible. This feedback allows for optimization of the data that is to be processed. This is necessary when the results show a plausible data arrangement but bad classification performance. In this case, the step size for the sampling has to be adjusted. Also possible is a good classification but non-plausible arrangement. In these cases, the design of the experiment needs to be revised, and the process needs to be repeated. Some structural faults that can occur with data generation through experiments are described in [42]. When the results fulfill all the criteria of KDED, the found features describe basic phenomena. In addition to the differences in the structure, both frameworks can also differ in terms of the realization of the various steps, but in this paper, the focus is on KDED. Its steps will be described in the following.
For the supply of data, KDED uses an experiment. To ensure a low number of repetitions and modifications, the experiment needs a proper Design of Experiment (DoE). An example of this is provided in [59] for the analysis of bearings. The first question to answer is which signals have to be measured. To answer this question, it makes sense to first define the target and the disturbance. In a diagnostic task, the target is the quantity that must be classified correctly, and disturbance refers to those quantities that appear during a process and have a negative influence on the classification. For the DoE, the target is mostly clear because it is the fault. Selecting the disturbances requires estimating further major influences such as load, speed, volumetric flow, or temperature. If the number is unclear, expected disturbances can be added; the results will show if there is an influence. In contrast, the results can also show that a major influence was not considered. Because of this, expertise is needed to recognize all major influences. The next question is about quantization. With the experiment, a grid is defined. The grid is defined by the range and the number of steps required to quantify each quantity. This decision depends on the scope of the application and on whether non-linearities are expected in the range. A good starting point is to use four steps for the target and three steps for each disturbance. Another criterion is the scatter of the calculated features, which defines the step size of the target and the disturbances. In the first round, it should be ensured that there is no intersection of the clusters with a bigger step size. This ensures low fault rates in case of suitable feature vectors. The results will show the possible resolution of the fault severity. The expectation is to find a pattern that is comparable with the grid. Figure 2 shows the desired behavior. 
All the set conditions appear as native clusters and in the same order in the found dimension.
The result of the DoE is a list that combines the target with all disturbances. The list can quickly increase in size because all quantification steps need to be combined to sense all relevant information. Conducting such an experiment will take hours or days. This is why another relevant aspect of KDED is the use of partially or fully automated experiments. This can be realized with automated test benches, which are normally used for efficiency or long-term tests.
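The full-factorial combination of target and disturbance steps described above can be sketched as follows; the numeric levels are illustrative placeholders (the intermediate fault levels are assumptions), mirroring four target steps and three steps for one disturbance:

```python
from itertools import product

def doe_grid(target_steps, disturbance_steps):
    """Full-factorial DoE grid: every target level combined with every
    combination of disturbance levels."""
    return [(t, *d) for t in target_steps for d in product(*disturbance_steps)]

# Illustrative grid: four fault-severity steps x three load steps -> 12 conditions.
grid = doe_grid([0.00, 0.04, 0.08, 0.11], [[72, 86, 100]])
```

Adding a second disturbance with three steps would already triple the list to 36 conditions, which is why automated test benches are essential.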
Once the data is available, the pre-processing step is conducted, which is depicted in Figure 3. The first part of this step is correction of the data structure. This includes modifying the signal length, modifying the number of samples, and removing faulty samples. The second part is feature extraction, with the following objectives:
  • Calculation of physically interpretable features → feature pool.
  • Compression.
  • Keeping the relevant information.
Figure 3. KDED step pre-processing.
The first item on the list is relevant to the assessment and interpretation step. Interpretation is only possible if the features remain physically interpretable and can be checked against signal and rotating field theory. Results with good classification but no plausibility will be discarded. In order to calculate one feature, feature extractors based on a three-step scheme are applied to the signals. By combining these three steps, where the dashed boxes can be executed or not, features in the time domain and the spectra are created. A more detailed explanation is provided in [42]. All features together build the feature pool. The second item affects the data mining step. Compression of the data can make the search more efficient. The reduced execution time then allows for a deeper search. The third item ensures the quality of the KDED results. During feature extraction or pre-filtering, a loss of information is possible. To avoid this, or at least to be aware of any loss, KDED is designed to be transparent and systematic.
Data mining is the next step once the feature pool is available. It is the central step in KDED, in which an algorithm lists all the recommended feature vectors that are capable of classifying the data correctly. The step is described in Figure 4. With grouping, the target and the conditions for the feature selection are set. This is achieved by rearranging the prime classes that contain data from a single condition sampled in the experiment. By merging a target class and one or several disturbance classes, new and more complex classes arise. With this measure, the feature selection algorithm is forced to select only robust features that allow for classification under realistic circumstances. As previously mentioned regarding experiments that consider the major influences on a system, the amount of data quickly increases. In addition, valuable results can only be achieved with a deep search. As a result, the execution time can drastically increase. To keep the feature selection executable, a reduction in the feature pool may be necessary. Several papers focus on hybrid search strategies [60] and especially on the selection and configuration of a pre-filter [61]. The challenge in the application of pre-filters is the need to drop only features without any information. For proper feature selection, a wrapper algorithm is used. This class of search algorithm involves a classifier and its classification result as the criterion. According to [39], the advantages of doing this are as follows:
  • Assessment of feature combinations.
  • Optimization of the classification error.
  • High trust level because the classifier is involved.
Figure 4. KDED step Data Mining.
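The grouping step described above, i.e., merging prime classes into complex classes labelled by the target only, can be sketched as follows (a simplified sketch; `samples` is assumed to pair each DoE condition tuple with its feature vector):

```python
def group_prime_classes(samples, target_index=0):
    """Merge prime classes (one per experimental condition) into complex
    classes labelled only by the target, so each class spans all
    disturbance levels and robust features are favored."""
    merged = {}
    for condition, features in samples:
        merged.setdefault(condition[target_index], []).append(features)
    return merged
```

A classifier trained on such merged classes can only succeed with features that are insensitive to the disturbances hidden inside each class.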
The drawback of wrapper approaches is that they have a greater tendency to overfit than filter approaches [50]. To deal with this issue, a validation step is interposed. This means that the search algorithm rates the validation error instead of the classification error. As search strategies, heuristics such as SFS/SBS or metaheuristics such as ACO are meaningful. In most cases, an exhaustive search cannot be performed because of the $\binom{M}{m_{\max}}$ combinations that lead to long execution times. Metaheuristic strategies have a big advantage over greedy algorithms as they can find global optima. This aspect is essential for KDED. The application of ACO in the context of KDED will be shown in the next section.
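The size of the search space behind this argument is easy to illustrate; with an assumed pool of M = 200 features and vectors of up to three features, an exhaustive wrapper search would already have to rate over a million subsets:

```python
from math import comb

M, m_max = 200, 3  # illustrative pool size and maximum vector length
n_subsets = sum(comb(M, m) for m in range(1, m_max + 1))
# n_subsets grows combinatorially with m_max, which is why heuristic or
# metaheuristic search is used instead of an exhaustive one.
```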
The final step of KDED is assessment and interpretation. This step is conducted when a search algorithm has created a recommendation list of feature vectors. It is important to emphasize that the data mining step only delivers recommendations because the results have not been checked for plausibility. In order to validate the recommendations and achieve a broad understanding of them, Figure 5 shows the criteria and the structure used to apply them. Depending on the search algorithm and its parameters, the number of features for each feature vector can vary. As a first measure, simplicity helps to reduce the list. Simplicity means that in the case of two feature vectors that have the same classification error, the one with fewer features is more valuable [10]. In addition, less complex feature vectors are easier to interpret in the sense of KDED. As a threshold, Equation (1) is applied:
$m = D + 1$
where $m$ is the number of features of a vector, $D$ is the number of disturbances, and the $+1$ represents the target. According to this, a vector needs only one feature per influence. For some search strategies, simplicity can be included in the cost function. This leads to a multi-objective feature selection [39]. In these cases, the criterion is not part of the interpretation step. Another criterion is (Pearson) correlation. This allows for the strength of the inner connections to be assessed. One goal of KDED is to optimize classification by applying the right features. This means that good results can be achieved with different classifiers based on different concepts. For the validation of the feature vectors, k-Nearest Neighbors, Artificial Neural Networks, and Support Vector Machines are used. Finally, the arrangement of the clusters is visually checked. A good selection has to show all prime classes and an evolution that correlates with the increasing influences set up in the experiment. An example is depicted in Figure 2.
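Two of these interpretation criteria, simplicity (Equation (1)) and Pearson correlation, can be sketched as follows; `recommendations` is assumed to map a tuple of feature names to its validation error:

```python
import numpy as np

def simplicity_filter(recommendations, n_disturbances):
    """Keep only feature vectors with m <= D + 1 features (Equation (1)):
    one feature per influence."""
    m_max = n_disturbances + 1
    return {vec: err for vec, err in recommendations.items() if len(vec) <= m_max}

def pearson(x, y):
    """Pearson correlation between two features of a recommended vector,
    used to rate the strength of their inner connection."""
    return float(np.corrcoef(np.asarray(x, float), np.asarray(y, float))[0, 1])
```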

2.2. Ant Colony Optimization for KDED

Metaheuristic optimization algorithms are especially suitable for KDED because of their ability to search big feature spaces and find global optima. Out of the high number of algorithms belonging to this category, Ant Colony Optimization can be ideally modified for KDED. Further relevant aspects for the presented algorithm are as follows:
  • Tradeoff between minimal classification error and a minimal number of features.
  • Dealing with previously found solutions.
The basic idea of using ACO for feature selection is that a number of ants (agents) explore feature combinations (paths), which are then assessed by the number of features and the classification error (prey). While the ants compete against each other in one round, the winner leaves a marker (pheromone track) that helps to find better solutions in the following round. The selection of a feature depends on the pheromones and a random value. The basic idea of the algorithm requires modification to fulfill all the requirements of KDED. The recommendations are depicted in Figure 6. The procedure has three loops: competition, epoch, and list. The competition and epoch loops are known from the basic algorithm and should deliver the globally best results. These loops are extended with a list loop, which should find redundant feature vectors. After each epoch, a new feature vector is recommended, and the selected features are marked as expended. Expenditure is a weight that allows for previously selected features to be kept in the search space [62]. With this measure, unique features can be identified and used for several feature combinations.
In the first step, a route is planned for every agent. For this, (2) is applied:
$P_j = \eta \cdot \beta_j \cdot \tau_{i,j} \cdot \xi_j$
where η is a random number, β is the expenditure, τ is the pheromone level between the current and the next possible feature, and ξ is a mask that excludes already chosen features. This equation is the basic method for calculating a score to choose a feature. A probabilistic factor that initiates new feature combinations is important in this kind of optimization. Another aspect that is important for ACO’s application in feature selection is the length of the travel and the number of features. To determine this, the stamina of an agent is evaluated. This is achieved using (3).
$s = \eta \cdot f(m, E_m) > 0.5$
where $\eta$ is a random number between zero and one. The function $f(m, E_m)$ is any sigmoid function that decreases with an increasing number of features $m$ and has its inflection point at the expected number of features. The parameter $E_m$ is updated using the best result per epoch and can be initialized with (1). For this study, the logistic function was chosen:
$f(m, E_m) = \dfrac{1}{1 + e^{\,0.25 \cdot \frac{M}{E_m} (m - E_m)}}$
where $M$ is the number of features within the feature pool. The threshold of 0.5 means an equal weighting of the chance to proceed and the chance to stop. After the features are selected, they are evaluated using the k-nearest neighbors classifier. To account for the number of features, the performance (prey) is calculated using (5):
$\gamma = 1 - \mathrm{err} - 0.03 \cdot m$
where $\mathrm{err}$ is the classification error and $m$ the number of features. The result is compared with the best result to date. If the new result is better, it is saved together with the feature vector; otherwise, it is discarded. When all agents have been evaluated, the pheromone levels of the best vector’s features are updated. The best result of the competition cycle is compared with the best result to date of the epoch cycle; again, the better result is kept together with its feature vector. The update is followed by checking for the end of the epoch. The following abortion criteria are possible:
  • n-competitions executed.
  • n-rounds king of the hill.
  • Prey greater than n.
All criteria come with individual advantages and disadvantages. Executing n competitions can be time-consuming even when the best result no longer improves; the advantage is that the algorithm keeps its chance to seek a global optimum. A prey threshold of n forces the algorithm to accept only high-value results, but it is not clear in advance whether this performance can be reached and, with a low threshold, no cycles remain in which to find a global optimum. For the application of the algorithm in this paper, n rounds king of the hill was selected. This criterion allows the cycles to wait for a global optimum. When an epoch ends, a candidate for the recommendation list has been found. To find the next candidate, the prey weighting is updated: features of the best results receive a lower weight so that their chance of selection decreases. For a new epoch, the pheromone level of all features is reset. To check for an end of the search, the same stopping criteria mentioned above can be used. In order to obtain a list of candidates, the number of epochs was set to ten.
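The competition-and-epoch loop described above can be sketched as follows. The stamina rule (3)/(4), the prey score (5), the pheromone update, and all hyperparameter values (slope 0.25, penalty 0.03, agent count, patience) are reconstructions and illustrative choices, not the authors' implementation:

```python
import math
import random

def aco_feature_search(error_fn, n_features, expected_m=2,
                       n_agents=20, patience=10, seed=0):
    # Sketch of the search described above. error_fn(indices) stands in
    # for the k-nearest-neighbors evaluation and returns an error in [0, 1].
    rng = random.Random(seed)
    pheromone = [1.0] * n_features

    def stamina_ok(m):
        # Eqs. (3)/(4): keep walking while eta * f(m, E_m) > 0.5, with a
        # logistic f that decreases in m and has its inflection at expected_m.
        f = 1.0 / (1.0 + math.exp(0.25 * (m - expected_m)))
        return rng.random() * f > 0.5

    def prey(err, m):
        # Eq. (5): reward low error, penalize long feature vectors.
        return (1.0 - err) - 0.03 * m

    best_vec, best_prey, stale = None, -math.inf, 0
    while stale < patience:                 # n rounds "king of the hill"
        round_vec, round_prey = None, -math.inf
        for _ in range(n_agents):
            chosen = []
            while len(chosen) < n_features:
                # Pheromone-weighted, randomized pick over the remaining
                # features (the eta * tau score of (2), simplified).
                cand = [i for i in range(n_features) if i not in chosen]
                scores = [pheromone[i] * rng.random() for i in cand]
                chosen.append(cand[scores.index(max(scores))])
                if not stamina_ok(len(chosen)):
                    break
            score = prey(error_fn(tuple(sorted(chosen))), len(chosen))
            if score > round_prey:
                round_vec, round_prey = chosen, score
        for i in round_vec:                 # reinforce the round winner
            pheromone[i] += 0.5
        if round_prey > best_prey:          # king-of-the-hill bookkeeping
            best_vec, best_prey, stale = sorted(round_vec), round_prey, 0
        else:
            stale += 1
    return best_vec, best_prey
```

In practice, error_fn would wrap a k-nearest-neighbors cross-validation restricted to the candidate feature columns.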

3. Results and Discussion

For the discussion presented in this section, two experiments were conducted to generate data for the KDED. The first experiment measured the three-phase current and one voltage on a 7.5 kW ASM powered by the mains. The signals were measured in twelve conditions with two major influences: parallel misalignment from 0 to 0.11 mm in four steps and load from 72 to 100% of the rated load in three steps. The misalignment was induced with distance plates, and the load was set by configuring a coupled brake. Further details of the experiment can be found in [42]. In the second experiment, the same signals were measured on a 1.1 kW motor that propelled a circulation pump with a directly mounted impeller. The experiment comprised nine conditions with two major influences: cavitation from 0 to a 10% drop in the pump pressure in three steps and flow from 90 to 110% of the rated flow in three steps. To set up the cavitation, a vacuum pump reduced the pressure in the liquid by evacuating the air in the connected tank. The flow was set by opening and closing a valve mounted in the hydraulic circuit. In the following, the results of the KDED applied to the generated data are discussed. First, the results of the KDED conducted once with SFS and once with ACO are compared; aspects of interest are the number of vectors, the classification error, and the features that appeared. The assessment and interpretation step of KDED is then applied to the ACO results.

3.1. Comparison Between SFS and ACO

A statistical overview of the results from SFS and ACO is listed in Table 1. The comparison shows the individual advantages of the two algorithms for KDED. Both algorithms had to calculate the 40 best feature vectors (row 12). The first three rows (rows 1–3) of the table list the number of feature vectors found with an error rate of exactly 0, between 0 and 1%, and between 1 and 5%. The vectors are counted only once, starting with the group with the lowest threshold. It can be seen that SFS performed slightly better on the parallel misalignment data when classification error was the only objective. This means that ACO did not explore the feature space completely enough to find the two best feature vectors; a longer search could resolve this. Otherwise, for both parallel misalignment and cavitation, ACO found more feature vectors below a 5% error rate, which increases the chance of finding useful knowledge.
Rows 4 to 6 list the number of features per vector. For parallel misalignment, all obtained vectors had two features. This means that the data provide enough solutions in the expected dimension, and these were discovered by both algorithms. In the case of cavitation, SFS found 40 vectors with two dimensions. Because of its search strategy and the comparison, the maximum number of features in a vector was two. As a consequence, the total number of feature vectors below a 5% error rate was only 22, compared to ACO, which found 28 feature vectors with two features. For ACO, there was no fixed threshold for the number of features, which led to 12 feature vectors with three features. Because the expected dimension is two, more demanding search parameters could lead to a higher number of two-dimensional feature vectors.
As depicted in Figure 5, the criterion used to assess the found features is the correlation. Rows 7 and 8 list the number of feature vectors with a Pearson correlation (PC) above 0.95 and 0.98, respectively. The counting includes the correlation of a single feature with parallel misalignment, load, and angular misalignment in the case of the fault parallel misalignment, and with cavitation and volumetric flow in the case of the fault cavitation. For parallel misalignment, two features with a very high correlation were found with SFS and ACO, respectively. At the lower correlation threshold, more features were found with SFS. Nevertheless, row 11 shows that more KDED-conform features were found with ACO. This means that matching features do not need a very high correlation to build a plausible result. A different picture emerges for cavitation: for both SFS and ACO, only two very highly correlated features were found. An explanation is provided in the next section.
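The correlation counts in rows 7 and 8 rest on the Pearson coefficient between a single feature and each physical influence. A minimal sketch of this plausibility check (an illustrative helper, not the authors' code):

```python
import math

def pearson(x, y):
    # Plain Pearson correlation coefficient of two equal-length series.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def max_abs_correlation(feature, influences):
    # Correlate one feature with every major influence (e.g. parallel
    # misalignment and load) and keep the strongest absolute value,
    # mirroring the PC > 0.95 / PC > 0.98 counts of Table 1.
    return max(abs(pearson(feature, inf)) for inf in influences)
```

A feature would then be counted in row 7 when max_abs_correlation(...) exceeds 0.95.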
Row 10 lists the number of features found by both search strategies. For both data sets, approximately 50% of the overall number of features were discovered by both algorithms.
Row 11 lists the number of feature vectors that comply with the KDED criteria. In the case of parallel misalignment, SFS found 15 and ACO 36 of these feature vectors. The opposite holds for the cavitation data: here, SFS found 11 and ACO 5 such vectors. The lower number of suitable feature vectors in the case of cavitation is confirmed by the assessment and interpretation step of KDED; in fact, the cavitation data showed a worse separation capability than the parallel misalignment data. ACO seems to deal better with optimal data. However, a limitation to two-dimensional feature vectors could lead to more of the desired feature combinations. Considering the best error rates and the number of KDED-compliant results, the ACO algorithm is the better choice for KDED: its classification errors were slightly higher, but the number of useful combinations was also higher. This is because the algorithm does not discard features and optimizes each combination, which is the most important property for KDED.

3.2. KDED—Assessment and Interpretation

As described in the introduction, KDED was designed to search for plausible results. Accordingly, no results were accepted without an assessment and interpretation of the recommendations from the data mining step. As seen in Table 1, there are results with classification errors below 5%. The question now is whether these results are plausible. Not all results found by ACO are plausible. An example is shown in Figure 7 on the left-hand side. The depicted pattern shows all prime classes derived from the twelve conditions set in the experiment, which is one aspect of the KDED search. Unfortunately, the prime classes do not show an arrangement that fits the order of the variables: the arrangement only fits the fault severity, not the load. In the examined case, the overlap of the clusters was acceptable; the severity of the overlap only provides a hint regarding the maximal resolution of the physical magnitudes.
In contrast, Figure 7 depicts, on the right-hand side, a pattern from data mining with SFS that fulfills all the requirements of KDED. The pattern shows all prime classes that were set during the experiment. In addition, the arrangement follows the order of the major influences induced with the experiment. The correlation of the features with the major influences yields the following results:
  • IL1-SS: with load 0.961; with parallel misalignment −0.01.
  • IL3-ECC1-k1m: with load −0.098; with parallel misalignment 0.968.
This indicates that each of the features describes one influence. Compensation for multi-dependence is not necessary.
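This one-influence-per-feature condition can be checked mechanically. A small sketch with hypothetical thresholds (0.9 for "strong" and 0.15 for "weak" are illustrative choices, not values from the paper):

```python
def describes_one_influence(corr_matrix, strong=0.9, weak=0.15):
    # corr_matrix[i][j]: Pearson correlation of feature i with influence j.
    # Each feature must correlate strongly with exactly one influence and
    # only weakly with all others, so no compensation for multi-dependence
    # is needed.
    for row in corr_matrix:
        mags = sorted((abs(c) for c in row), reverse=True)
        if mags[0] < strong or any(m > weak for m in mags[1:]):
            return False
    return True
```

For the misalignment pair above, describes_one_influence([[0.961, -0.01], [-0.098, 0.968]]) holds, whereas the cavitation correlations reported in the next paragraph would fail this check.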
For the fault cavitation, KDED-conform results were found with SFS. The pattern is depicted in Figure 8. It shows all nine prime classes that were set during the experiment and, furthermore, the arrangement follows the order of the major influences induced in the experiment. Only one prime class lies outside the expected alignment, and one class scatters more than the others; nevertheless, the selection is accepted. It is planned to repeat this experiment with optimized parameters for setting the cavitation, because the 10% drop in head was difficult to keep stable. Solving this issue should lead to the exact alignment. The features correlate with the main influences as follows:
  • IL2-SRM: with flow 0.832; with cavitation −0.468.
  • IL1-Seg11-PP1: with flow 0.178; with cavitation 0.586.
Figure 8. Pattern of a plausible result from the data mining step when applying ACO to cavitation data.
In this case, the correlation with the disturbance is high, and the correlation with the target is only medium. However, both features correlate with cavitation, which still leads to classifiable results.

4. Conclusions

In this paper, a new framework called KDED was presented. The framework is designed to search for plausible features and feature combinations that can solve a technical diagnostics problem and is expected to overcome domain shift issues.
Also, an evaluation is always needed. Data mining remains the central step in KDD and KDED. To improve the search, a novel ACO algorithm was presented. It was designed to fulfill all KDED needs: overcoming the nesting effect (completeness), avoiding feature vectors with a high classification error (relevance), keeping the number of features low (interpretability), and achieving global optima (quality) within a feasible execution time. Although the nesting effect was not relevant to the real-world data, the mechanism was tested separately. The expectation that the ACO algorithm would find the globally best features was not met with the selected hyperparameters; moreover, SFS alone found two feature vectors with a classification error of zero. Nevertheless, ACO found 2.4 times as many KDED-conform feature vectors (36:15) in the case of parallel misalignment. For cavitation, ACO found 45% of the conform vectors found by SFS (5:11). An explanation for this is that the algorithm, in this case, tends to select a third feature; considering the rule regarding simplicity and the results of SFS, this third feature is not necessary and prevents meaningful combinations. A slight optimization of the hyperparameters could solve this problem. The number of KDED-conform results remains the most meaningful value in the context of KDED. Because of this, meta-heuristic search strategies such as ACO are suitable for knowledge discovery.
The application of ACO also showed some limitations. First, ACO has many more hyperparameters, and a well-tuned set of these parameters is crucial for finding suitable results. Second, the globally best results are of special interest in KDED; unfortunately, the implemented algorithm in combination with the selected hyperparameters did not deliver them. It is expected that optimization in this direction would drastically increase the execution time.
Examples of plausible results for the faults parallel misalignment and cavitation were presented and discussed. These results were obtained with SFS and ACO. The good characteristics of the results are reflected in high correlation coefficients. As an example, a power-related feature, the sum of squares, and an eccentricity-related feature, MCSA-ECC1, lead, in the case of parallel misalignment, to a classification error of zero: the target can be perfectly classified even if the load varies. A similar result, with some limitations, was presented for the case of cavitation. The first feature, the peak position, is slip-related and correlates with cavitation. The second feature, the square root mean, is related to the power consumption and correlates with the flow. These findings of basic relations can easily be applied to new problems of the same kind.
Further developments will bring leave-one-group-out (LOGO) validation to KDED; with this, evidence for the transferability of the feature sets will be obtained. Another improvement is the application of hybrid methods, with which a shorter search, especially in the case of ACO, will be realized.

Author Contributions

Conceptualization, S.B. and S.U.; methodology, S.B.; software, S.B.; validation, S.B.; formal analysis, S.B.; investigation, S.B.; resources, S.U.; data curation, S.B.; writing—original draft preparation, S.B.; writing—review and editing, S.B.; visualization, S.B.; supervision, S.U.; project administration, S.U.; funding acquisition, S.U. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
m: Number of features within a feature vector
M: Number of features within the feature pool
err: Error rate of the classification
KDD: Knowledge Discovery in Databases
KDED: Knowledge Discovery from Experimental Data
DM: Data Mining
SFS: Sequential Forward Selection
SBS: Sequential Backward Selection
ACO: Ant Colony Optimization
FESC/R: Feature Extraction, Feature Selection, Classification or Regression

References

  1. Leidinger, B. Wertorientierte Instandhaltung, 2nd ed.; SpringerLink, Springer Gabler: Wiesbaden, Germany, 2017. [Google Scholar]
  2. Kliman, G.B.; Koegl, R.A.; Stein, J.; Endicott, R.D.; Madden, M.W. Noninvasive detection of broken rotor bars in operating induction motors. IEEE Trans. Energy Convers. 1988, 3, 873–879. [Google Scholar] [CrossRef] [PubMed]
  3. Popaleny, P.; Antonino-Daviu, J. Electric Motors Condition Monitoring Using Currents and Vibrations Analyses. In Proceedings of the 2018 XIII International Conference on Electrical Machines (ICEM), Alexandroupoli, Greece, 1–3 September 2018. [Google Scholar] [CrossRef]
  4. Bonaldi, E.L.; Oliveira, L.E.L.; da Silva, J.G.B.; Lambert-Torres, G.; da Silva, L.E.B. Detecting load failures using the induction motor as a transducer. In Proceedings of the 2008 10th International Conference on Control, Automation, Robotics and Vision, Hanoi, Vietnam, 17–20 December 2008. [Google Scholar] [CrossRef]
  5. Wu, L.; Habetler, T.G.; Harley, R.G. A Review of Separating Mechanical Load Effects from Rotor Faults Detection in Induction Motors. In Proceedings of the 2007 IEEE International Symposium on Diagnostics for Electric Machines, Power Electronics and Drives, Cracow, Poland, 6–8 September 2007. [Google Scholar] [CrossRef]
  6. Verma, A.K.; Sarangi, S.; Kolekar, M.H. Shaft Misalignment Detection using Stator Current Monitoring. Int. J. Adv. Comput. Res. 2013, 3, 305. [Google Scholar]
  7. Kumar, C.; Krishnan, G.; Sarangi, S. Experimental investigation on misalignment fault detection in induction motors using current and vibration signature analysis. In Proceedings of the 2015 International Conference on Futuristic Trends on Computational Analysis and Knowledge Management (ABLAZE), Greater Noida, India, 25–27 February 2015. [Google Scholar] [CrossRef]
  8. Park, Y.; Jeong, M.; Lee, S.B.; Antonino-Daviu, J.A.; Teska, M. Influence of blade pass frequency vibrations on MCSA-based rotor fault detection of induction motors. In Proceedings of the 2016 IEEE Energy Conversion Congress and Exposition (ECCE), Milwaukee, WI, USA, 18–22 September 2016. [Google Scholar] [CrossRef]
  9. Dorrell, D.G.; Thomson, W.T.; Roach, S. Analysis of airgap flux, current and vibration signals as a function of the combination of static and dynamic airgap eccentricity in 3-phase induction motors. In Proceedings of the IAS 1995, Conference Record of the 1995 IEEE Industry Applications Conference Thirtieth IAS Annual Meeting, Orlando, FL, USA, 8–12 October 1995. [Google Scholar] [CrossRef]
  10. Fayyad, U.M.; Piatetsky-Shapiro, G.; Smyth, P. From Data Mining to Knowledge Discovery in Databases. AI Mag. 1996, 17, 37–54. [Google Scholar] [CrossRef]
  11. Helwig, N.J. Zustandsbewertung Industrieller Prozesse Mittels Multivariater Sensordatenanalyse am Beispiel hydraulischer und Elektromechanischer Antriebssysteme, 1st ed.; Number 30 in Aktuelle Berichte aus der Mikrosystemtechnik—Recent Developments in MEMS; ShakerVerlag: Düren, Germany, 2019. [Google Scholar]
  12. Beniwal, S. Classification and Feature Selection Techniques in Data Mining. Int. J. Eng. Res. Technol. (IJERT) 2016, 1, 1–6. [Google Scholar]
  13. Quah, K.H.; Quek, C. MCES: A Novel Monte Carlo Evaluative Selection Approach for Objective Feature Selections. IEEE Trans. Neural Netw. 2007, 18, 431–448. [Google Scholar] [CrossRef]
  14. Chen, Q.; Chen, C.; Chen, X. An Intelligent Knowledge Discovery System with a Novel Knowledge Acquisition Methodology. In Proceedings of the Fourth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2007), Haikou, China, 24–27 August 2007. [Google Scholar] [CrossRef]
  15. Sun, W.; Zhang, X.; Wang, H. A knowledge discovery system for detecting and visualizing knowledge evolution of a research field. In Proceedings of the 2013 10th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), Shenyang, China, 23–25 July 2013. [Google Scholar] [CrossRef]
  16. Luo, Q. Advancing Knowledge Discovery and Data Mining. In Proceedings of the First International Workshop on Knowledge Discovery and Data Mining (WKDD 2008), Adelaide, Australia, 23–24 January 2008. [Google Scholar] [CrossRef]
  17. Fayyad, U.; Piatetsky-Shapiro, G.; Smyth, P. Knowledge Discovery and Data Mining: Towards a Unifying Framework. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, OR, USA, 2–4 August 1996. [Google Scholar]
  18. Bold, S. Identifikation Robuster Merkmale für die Technische Diagnostik von Netzgespeisten Asynchronmotoren und der Angetriebenen Arbeitsmaschinen. Ph.D. Thesis, Saarland University and State Library, Saarbrücken, Germany, 2025. [Google Scholar] [CrossRef]
  19. Kral, C.; Haumer, A.; Grabner, C. Broken Rotor Bars in Squirrel Cage Induction Machines— Modeling and Simulation. In Lecture Notes in Electrical Engineering; Springer: Amsterdam, The Netherlands, 2010; pp. 81–91. [Google Scholar] [CrossRef]
  20. Cameron, J.R.; Thomson, W.T.; Dow, A.B. Vibration and current monitoring for detecting airgap eccentricity in large induction motors. IEE Proc. Electr. Power Appl. 1986, 133, 155. [Google Scholar] [CrossRef]
  21. Martinez-Montes, E.; Jimenez-Chillaron, L.; Gilabert-Marzal, J.; Antonino-Daviu, J.; Quijano-Lopez, A. Evaluation of the Detectability of Bearing Faults at Different Load Levels Through the Analysis of Stator Currents. In Proceedings of the 2018 XIII International Conference on Electrical Machines (ICEM), Alexandroupoli, Greece, 1–3 September 2018. [Google Scholar] [CrossRef]
  22. Kia, S.H.; Marzebali, M.H.; Henao, H.; Capolino, G.A.; Faiz, J. Simulation and experimental analyses of planetary gear tooth defect using electrical and mechanical signatures of wound rotor induction generators. In Proceedings of the 2017 IEEE 11th International Symposium on Diagnostics for Electrical Machines, Power Electronics and Drives (SDEMPED), Tinos, Greece, 29 August–1 September 2017. [Google Scholar] [CrossRef]
  23. Perovic, S. Diagnosis of Pump Faults and Flow Regimes. Ph.D. Thesis, University of Sussex, Brighton, UK, 2000. [Google Scholar]
  24. Obaid, R.R.; Habetler, T.G. Effect of load on detecting mechanical faults in small induction motors. In Proceedings of the 4th IEEE International Symposium on Diagnostics for Electric Machines, Power Electronics and Drives (SDEMPED), Atlanta, GA, USA, 24–26 August 2003. [Google Scholar] [CrossRef]
  25. Tahir, M.M.; Hussain, A.; Badshah, S.; Khan, A.Q.; Iqbal, N. Classification of unbalance and misalignment faults in rotor using multi-axis time domain features. In Proceedings of the 2016 International Conference on Emerging Technologies (ICET), Islamabad, Pakistan, 18–19 October 2016. [Google Scholar] [CrossRef]
  26. Bossio, J.M.; Bossio, G.R.; Angelo, C.H.D. Angular misalignment in induction motors with flexible coupling. In Proceedings of the 2009 35th Annual Conference of IEEE Industrial Electronics, Porto, Portugal, 3–5 November 2009. [Google Scholar] [CrossRef]
  27. Kostic-Perovic, D.; Arkan, M.; Unsworth, P. Induction motor fault detection by space vector angular fluctuation. In Proceedings of the Conference Record of the 2000 IEEE Industry Applications Conference. Thirty-Fifth IAS Annual Meeting and World Conference on Industrial Applications of Electrical Energy (Cat. No.00CH37129), Rome, Italy, 8–12 October 2000. [Google Scholar] [CrossRef]
  28. Antonino-Daviu, J.A.; Quijano-Lopez, A.; Rubbiolo, M.; Climente-Alarcon, V. Advanced Analysis of Motor Currents for the Diagnosis of the Rotor Condition in Electric Motors Operating in Mining Facilities. IEEE Trans. Ind. Appl. 2018, 54, 3934–3942. [Google Scholar] [CrossRef]
  29. Antonino-Daviu, J.; Popaleny, P. Detection of Induction Motor Coupling Unbalanced and Misalignment via Advanced Transient Current Signature Analysis. In Proceedings of the 2018 XIII International Conference on Electrical Machines (ICEM), Alexandroupoli, Greece, 3–6 September 2018. [Google Scholar] [CrossRef]
  30. Nath, A.G.; Udmale, S.S.; Singh, S.K. Role of artificial intelligence in rotor fault diagnosis: A comprehensive review. Artif. Intell. Rev. 2020, 54, 2609–2668. [Google Scholar] [CrossRef]
  31. Rapur, J.S.; Tiwari, R. Automation of multi-fault diagnosing of centrifugal pumps using multi-class support vector machine with vibration and motor current signals in frequency domain. J. Braz. Soc. Mech. Sci. Eng. 2018, 40, 278. [Google Scholar] [CrossRef]
  32. Haroun, S.; Seghir, A.N.; Touati, S.; Hamdani, S. Misalignment fault detection and diagnosis using AR model of torque signal. In Proceedings of the 2015 IEEE 10th International Symposium on Diagnostics for Electrical Machines, Power Electronics and Drives (SDEMPED), Guarda, Portugal, 1–4 September 2015. [Google Scholar] [CrossRef]
  33. Kumar, P.; Hati, A.S. Convolutional neural network with batch normalisation for fault detection in squirrel cage induction motor. IET Electr. Power Appl. 2020, 15, 39–50. [Google Scholar] [CrossRef]
  34. Shao, S.Y.; Sun, W.J.; Yan, R.Q.; Wang, P.; Gao, R.X. A Deep Learning Approach for Fault Diagnosis of Induction Motors in Manufacturing. Chin. J. Mech. Eng. 2017, 30, 1347–1356. [Google Scholar] [CrossRef]
  35. Wang, Q.; Huang, R.; Xiong, J.; Yang, J.; Dong, X.; Wu, Y.; Wu, Y.; Lu, T. A survey on fault diagnosis of rotating machinery based on machine learning. Meas. Sci. Technol. 2024, 35, 102001. [Google Scholar] [CrossRef]
  36. Akpudo, U.E.; Hur, J.W. Towards bearing failure prognostics: A practical comparison between data-driven methods for industrial applications. J. Mech. Sci. Technol. 2020, 34, 4161–4172. [Google Scholar] [CrossRef]
  37. Calabrese, F.; Regattieri, A.; Bortolini, M.; Galizia, F.G.; Visentini, L. Feature-Based Multi-Class Classification and Novelty Detection for Fault Diagnosis of Industrial Machinery. Appl. Sci. 2021, 11, 9580. [Google Scholar] [CrossRef]
  38. Hasan, M.J.; Sohaib, M.; Kim, J.M. A Multitask-Aided Transfer Learning-Based Diagnostic Framework for Bearings under Inconsistent Working Conditions. Sensors 2020, 20, 7205. [Google Scholar] [CrossRef] [PubMed]
  39. Xue, B.; Zhang, M.; Browne, W.N.; Yao, X. A Survey on Evolutionary Computation Approaches to Feature Selection. IEEE Trans. Evol. Comput. 2016, 20, 606–626. [Google Scholar] [CrossRef]
  40. Bania, R.K. Survey on Feature Selection for Data Reduction. Int. J. Comput. Appl. 2014, 94, 1–7. [Google Scholar] [CrossRef]
  41. Ghosh, S.; Pramanik, P. A Combined Framework for Dimensionality Reduction of Hyperspectral Images using Feature Selection and Feature Extraction. In Proceedings of the 2019 IEEE Recent Advances in Geoscience and Remote Sensing: Technologies, Standards and Applications (TENGARSS), Kochi, India, 17–20 October 2019. [Google Scholar] [CrossRef]
  42. Bold, S.; Urschel, S. Feature Identification for Diagnosing Misalignment under the Influence of Parameter Variation. In Proceedings of the 2022 International Conference on Electrical Machines (ICEM), Valencia, Spain, 5–8 September 2022. [Google Scholar] [CrossRef]
  43. Bold, S.; Urschel, S. A Knowledge Discovery Process Extended to Experimental Data for the Identification of Motor Misalignment Patterns. Machines 2023, 11, 827. [Google Scholar] [CrossRef]
  44. Sacha, D.; Stoffel, A.; Stoffel, F.; Kwon, B.C.; Ellis, G.; Keim, D.A. Knowledge Generation Model for Visual Analytics. IEEE Trans. Vis. Comput. Graph. 2014, 20, 1604–1613. [Google Scholar] [CrossRef]
  45. Dymora, P.; Mazurek, M.; Bomba, S. A Comparative Analysis of Selected Predictive Algorithms in Control of Machine Processes. Energies 2022, 15, 1895. [Google Scholar] [CrossRef]
  46. Fuchs, C.; Klein, S.; Schauer, J.; Schütze, A.; Schneider, T. Eine Methode zur erklärbaren Merkmalsextraktion aus dem Zeit- und Frequenzbereich für Condition Monitoring. In Proceedings of the 22. GMA/ITG—Fachtagung Sensoren und Messsysteme 2024, Wunstorf, Germany, 11–12 June 2024. [Google Scholar] [CrossRef]
  47. Goodarzi, P.; Schütze, A.; Schneider, T. Domain shifts in industrial condition monitoring: A comparative analysis of automated machine learning models. J. Sens. Sens. Syst. 2025, 14, 119–132. [Google Scholar] [CrossRef]
  48. Goodarzi, P.; Klein, S.; Schütze, A.; Schneider, T. Comparing Different Feature Extraction Methods in Condition Monitoring Applications. In Proceedings of the 2023 IEEE International Instrumentation and Measurement Technology Conference (I2MTC), IEEE, Kuala Lumpur, Malaysia, 22–25 May 2023; pp. 1–6. [Google Scholar] [CrossRef]
  49. Hidayaturrahman; Heryadi, Y.; Lukas; Suparta, W.; Arifin, Y. Exploring Shifted Domain Problem within MNIST Dataset. In Proceedings of the 2023 International Conference on Computer Science, Information Technology and Engineering (ICCoSITE), Jakarta, Indonesia, 16 February 2023; pp. 750–754. [Google Scholar] [CrossRef]
  50. Visalakshi, S.; Radha, V. A literature review of feature selection techniques and applications: Review of feature selection in data mining. In Proceedings of the 2014 IEEE International Conference on Computational Intelligence and Computing Research, Coimbatore, India, 18–20 December 2014. [Google Scholar] [CrossRef]
  51. Kaur, A.; Guleria, K.; Trivedi, N.K. Feature Selection in Machine Learning: Methods and Comparison. In Proceedings of the 2021 International Conference on Advance Computing and Innovative Technologies in Engineering (ICACITE), Greater Noida, India, 4–5 March 2021. [Google Scholar] [CrossRef]
  52. Liu, H.H. Feature Selection for Knowledge Discovery and Data Mining. In The Springer International Series in Engineering and Computer Science Ser.; Springer: Berlin, Germany, 1998. [Google Scholar]
  53. Colorni, A.; Dorigo, M.; Maniezzo, V.; Varela, F.J.; Bourgine, P.E. Distributed Optimization by Ant Colonies. In Proceedings of the European Conference on Artificial Life, Paris, France, 11–13 December 1992. [Google Scholar]
  54. Dorigo, M.; Maniezzo, V.; Colorni, A. Ant system: Optimization by a colony of cooperating agents. IEEE Trans. Syst. Man. Cybern. Part (Cybern.) 1996, 26, 29–41. [Google Scholar] [CrossRef]
  55. Subbotin, S.; Oleynik, A. Modifications of Ant Colony Optimization Method for Feature Selection. In Proceedings of the 2007 9th International Conference—The Experience of Designing and Applications of CAD Systems in Microelectronics, Lviv, Ukraine, 19–24 February 2007. [Google Scholar] [CrossRef]
  56. Yang, Z.; Ren, J. An Effective Two-Stage Feature Selection Method with Parameters Optimized by Simulated Annealing Algorithm. In Proceedings of the 2018 5th International Conference on Information, Cybernetics, and Computational Social Systems (ICCSS), Hangzhou, China, 16–19 August 2018. [Google Scholar] [CrossRef]
  57. Pudil, P.; Novovičová, J.; Kittler, J. Floating search methods in feature selection. Pattern Recognit. Lett. 1994, 15, 1119–1125. [Google Scholar] [CrossRef]
  58. Runkler, T.A. Data Mining; Springer eBook Collection, Vieweg+Teubner: Berlin, Germany, 2015. [Google Scholar]
  59. Schnur, C.; Goodarzi, P.; Robin, Y.; Schauer, J.; Schütze, A. A Machine Learning Dataset of Artificial Inner Ring Damage on Cylindrical Roller Bearings Measured Under Varying Cross-Influences. Data 2025, 10, 77. [Google Scholar] [CrossRef]
  60. Thejas, G.S.; Garg, R.; Iyengar, S.S.; Sunitha, N.R.; Badrinath, P.; Chennupati, S. Metric and Accuracy Ranked Feature Inclusion: Hybrids of Filter and Wrapper Feature Selection Approaches. IEEE Access 2021, 9, 128687–128701. [Google Scholar] [CrossRef]
  61. Singh, B.; Sankhwar, J.S.; Vyas, O.P. Optimization of feature selection method for high dimensional data using fisher score and minimum spanning tree. In Proceedings of the 2014 Annual IEEE India Conference (INDICON), Pune, India, 11–13 December 2014. [Google Scholar] [CrossRef]
  62. Wang, Z.; Gao, S.; Zhou, M.; Sato, S.; Cheng, J.; Wang, J. Information-Theory-based Nondominated Sorting Ant Colony Optimization for Multiobjective Feature Selection in Classification. IEEE Trans. Cybern. 2022, 53, 5276–5289. [Google Scholar] [CrossRef]
Figure 1. Comparison of the knowledge discovery frameworks KDD (top) and KDED (bottom).
Figure 2. Transition from the conditions set in the experiment (left) to native cluster (right).
Figure 5. KDED step Assessment and Interpretation.
Figure 6. ACO algorithm for KDED.
Figure 7. Pattern of an implausible (left) and a plausible (right) result from the data mining step when applying ACO to parallel misalignment data.
Table 1. Comparison of SFS and ACO results for parallel misalignment and cavitation data.
No. | Evaluation | Misalignment SFS | Misalignment ACO | Cavitation SFS | Cavitation ACO
1 | # Vectors err = 0 | 2 | 0 | 0 | 0
2 | # Vectors err < 1% | 9 | 9 | 0 | 0
3 | # Vectors err < 5% | 8 | 29 | 22 | 29
4 | # Vectors with 1 feature | 0 | 0 | 0 | 0
5 | # Vectors with 2 features | 40 | 40 | 40 | 28
6 | # Vectors with 3 features | 0 | 0 | 0 | 12
7 | # Vectors with PC > 0.95 | 4 | 7 | 26 | 22
8 | # Vectors with PC > 0.98 | 2 | 2 | 2 | 2
9 | Total # of features | 80 | 80 | 80 | 92
10 | # Same features | 43 | 43 | 46 | 46
11 | # Vectors KDED conform | 15 | 36 | 11 | 5
12 | Total # of vectors | 40 | 40 | 40 | 40

Share and Cite

MDPI and ACS Style

Bold, S.; Urschel, S. Assessment of Feature Selection Algorithms for Knowledge Discovery from Experimental Data. Machines 2026, 14, 104. https://doi.org/10.3390/machines14010104

