Efficient Fault Diagnosis of Elevator Cabin Door Drives Using Machine Learning with Data Reduction for Reliable Transmission

Gęca, Jakub; Czerwiński, Dariusz; Drzymała, Bartosz; Kolano, Krzysztof

doi:10.3390/app15137017

Open AccessArticle

Efficient Fault Diagnosis of Elevator Cabin Door Drives Using Machine Learning with Data Reduction for Reliable Transmission

¹

Department of Electrical Drives and Machines, Faculty of Electrical Engineering and Computer Science, Lublin University of Technology, Nadbystrzycka 38A, 20-618 Lublin, Poland

²

Department of Applied Informatics, Faculty of Mathematics and Information Technology, Lublin University of Technology, Nadbystrzycka 38A, 20-618 Lublin, Poland

^*

Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(13), 7017; https://doi.org/10.3390/app15137017

Submission received: 19 May 2025 / Revised: 9 June 2025 / Accepted: 12 June 2025 / Published: 22 June 2025

(This article belongs to the Special Issue Industrial AI: Applications in Fault Detection, Diagnosis, and Prognosis—2nd Edition)

Download

Browse Figures

Versions Notes

Abstract

This article addresses the issue of the elevator cabin door drive system failure diagnosis. The analyzed component is one of the most critical and the most vulnerable part of the entire elevator. Existing solutions in the literature include methods such as spectral analysis of system vibrations, motor current signature analysis, fishbone diagrams, fault trees, multi-agent systems, image recognition, and machine learning techniques. However, there is a noticeable gap in comprehensive studies that specifically address classification of the multiple types of system components failures, class imbalance in the dataset, and the need to reduce data transmitted over the elevator’s internal bus. The developed diagnostic system measures the drive system’s parameters, processes them to reduce data, and classifies 11 device failures. This was achieved by constructing a test bench with a prototype cabin door drive system, identifying the most frequent system faults, developing a data preprocessing method that aggregates every driving cycle to one sample, reducing the transmitted data by 300 times, and using machine learning for modeling. A comparative analysis of the fault detection performance of seven different machine learning algorithms was conducted. An optimal cross-validation method and hyperparameter optimization techniques were employed to fine-tune each model, achieving a recall of over 97% and an F1 score approximately 97%. Finally, the developed data preparation method was implemented in the cabin door drive controller.

Keywords:

classification algorithms; data transmission; elevators; fault detection; fault diagnosis; machine learning; predictive maintenance; preventive maintenance

1. Introduction

Due to the growing number of residential buildings and ongoing urbanization, the current number of these devices is undoubtedly greater. Elevators today serve not only as vertical transportation systems but also as a critical factor in enhancing building accessibility for people with disabilities. Furthermore, in multi-story buildings, elevators become an indispensable part of the infrastructure. These devices are also one of the fundamental tools for enabling the mobility of elderly individuals, a factor of increasing importance in the context of Europe’s aging population.

As the significance of elevators for modern life increases, so do the demands imposed on these systems. Key considerations include technical parameters such as load capacity, travel speed, lifting height, and energy efficiency. Ensuring user comfort also requires an appropriate control algorithm and an optimized speed curve. Another crucial factor is the reliability of these systems, and the downtime associated with potential failures.

Although elevators are now commonplace in many buildings, they continue to be complex electromechanical systems. Failures can result not only in material losses but also pose safety risks to passengers and their property. This also ties into psychological aspects, as the enclosed space of an elevator cabin can induce stress and severe psychological discomfort during extended periods of entrapment caused by malfunctions. For these reasons, planning an appropriate maintenance strategy for elevator systems and implementing methods for rapid fault detection and identification become critically important challenges.

According to the data in [1], cabin door failures account for a significant percentage of all defects occurring in elevators, especially those related to improper use or maintenance. The authors in both [2] and [3], indicate that elevator cabin door system failures can account for up to 80% of all elevator system failures. Elevators are now common in many buildings and are complex electromechanical systems, whose failures can lead not only to property damage but also put traveling passengers at risk. Due to the complex structure of elevator systems and bearing in mind their important role in the life of modern man, many attempts have been made to develop solutions to diagnose failures of their components. Failure analysis is possible on the basis of the measurement data of equipment operating parameters and historical information about failures. In order to call the obtained results authoritative, it is necessary to collect a sufficient amount of data representing each case of failure and the accompanying conditions. This is crucial for finding potential causes of failure and its symptoms, therefore, building systems that allow data acquisition and storage has become an important direction of development for elevators. The dispersed nature of the location of the manufacturer’s elevators indicates that data acquisition from various devices must be performed remotely. An example of such a system can be found in [4]. The increase in popularity of remote data acquisition and monitoring systems for elevators has also led to the development of fault detection systems. In the publication from [5], the authors developed a system that collects elevator operational data, such as door opening signals, door closing signals, internal control signals, and the elevator stop signal through an IoT module. The data was then analyzed to verify whether the current elevator stop is caused by power loss and to monitor the presence of trapped passengers in the cabin. In addition, authors in [6] presented attempts to develop a system for measuring vibration levels and acoustic signals using a smartphone, which then sent data to a central server. I. Skog in [7] modeled elevator traffic intensity as an inhomogeneous Poisson process and then used a generalized linear model to describe how the intensity of the process changes over time. Furthermore, he used statistical tests to determine whether an elevator is stationary for an unusually long time. The application of the proposed method was illustrated by an example in which detected faults were compared with the elevator’s service log.

One of the best known ways to analyze the state of electromechanical systems for proper functioning is spectral analysis using vibration signals [8]. For this analysis, it is necessary to measure the vibration level in three axes of the device in question. In the case of an elevator, spectral analysis makes it possible to extract the frequencies characteristic of the movement under normal conditions, as well as emergency conditions. Frequency analysis of vibration signals can suppress noise created during measurements, preserve phase information, and describe non-linear couplings between signals [9]. The authors in [10,11] additionally used a bi-spectrum autoregressive model for their analysis. They emphasize that the shapes and distributions of AR spectrum peaks are clearly different for normal and damaged elevators. Spectral analysis using acoustic or vibration signals; however, requires measurements to be made at high sampling rates so that spectral changes at different frequencies can be captured.

Machine learning is one of the most modern methods used in diagnostics and fault detection of many devices, including elevators. Due to the wide choice of algorithms and the ability to operate on data extracted from functioning equipment, machine learning has gained popularity in the scientific and business world. Mishra and Huhtala employed a feature extraction structure based on autoencoders, followed by fault detection, using random forests in three of their studies. In [12], together with Krogerus, they used accelerometer signals for fault detection. Subsequently, in [13], they proposed the extraction of elevator ride profiles and incorporated additional magnetic signals for detection. In [14], they improved detection accuracy by omitting both the ride profiling and magnetic signals, and presented results obtained from a larger dataset collected over nine months from three elevators, validating the method on entirely new data. Based on the collected data, the researchers applied a feature extraction method using a deep autoencoder. The algorithm is an example of an unsupervised artificial neural network [15], whose task is to efficiently encode data into a form with fewer dimensions. This approach allows extraction of the most characteristic features of a given set. The features created by the autoencoder that characterize the moving elevator were used for binary classification using a random forest algorithm. The same authors also attempted to use other classifiers, such as multilayer perceptron [16]. The authors in [17] used accelerometer measurements and a convolutional neural network with one-dimensional convolution layers for fault diagnosis, whose parameters were selected using particle swarm optimization. A different approach to fault diagnosis of the elevator main drive was presented by the authors in [18], who used elevator vibration signals and then subjected them to noise reduction using wavelet functions. The authors also proposed feature extraction related to the signal envelope and classification, using a stochastically constructed neural network. The idea behind this model is to configure the network randomly with limited need for human intervention. Instead of training a model with a fixed architecture, the process of constructing a neural network starts with a structure of small size and then incrementally adds hidden nodes until acceptable results are achieved. This approach does not require any prior knowledge of the network complexity for a given task [19]. Raia et al. [20] used machine learning to monitor the state of a sheave–rope transmission.

The literature also includes ways to diagnose elevator failures using methods that combine signal processing theory and machine learning algorithms. Xu and Huang [21] used the autoregressive coefficient of signals as an input value to a support vector machine. Vibration signals in elevator fault diagnosis are also used as input features to algorithms, such as neural networks [22] in traditional and fuzzy versions [23], as well as convolutional neural networks [24].

Another way to diagnose the elevator main drive was presented by Flores et al. [25] using an induction electric motor that is the elevator’s main drive component for analysis. The authors noted that faults in shaft guides are associated with an increase in friction in vertical motion, which is perceived by the motor as an increase in load torque. They used current signature analysis to detect such faults. To do this, they measured the current signal at a sampling rate of 1 kHz and then calculated the root mean square for every 20 samples of the signal. Then, by comparing the signals of a normally functioning and damaged elevator, it was possible to determine where the guides had degraded.

The literature also shows interest in more challenging methods such as image analysis and transfer learning. The authors in [26] used images of ropes in an elevator shaft and an autocorrelation model to analyze their damage. In contrast, researchers in [27] estimated the remaining useful life of an elevator based on acoustic measurements of various elevators, using transfer learning and algorithms, such as Graph Neural Network and Long Short-Term Memory Network.

Among the most sensitive points of the elevator system is certainly the brake, which provides additional protection for the elevator in the event of power loss. The authors in [28] used an Ishikawa diagram, also known as a fish diagram, which is used for cause-and-effect analysis, to study elevator brake failure. It provides a visual representation of the various events’ causes and suggests solutions to problems. The authors describe the method in two steps. The first is a problem cause analysis, including their hierarchical division and the selection of relevant factors. Then a fish diagram is drawn, in which the most important causes of errors are written on the main bones, and then following down the hierarchy, smaller and smaller bones are drawn. The presented method, of course, requires expertise in the construction and operation of lifting equipment. An in-depth analysis of the phenomena occurring in these systems makes it possible to correctly develop the diagram, taking into account the effects and causes. In addition, it is worth noting that, depending on the type of damage, it may be necessary to build multiple diagrams, since failures differ not only in their potential effects, but also in their level of harm to the entire system.

Failure diagnosis methods that require expert knowledge include fault trees. Trees are built for individual failures, with each tree designed to identify and prioritize potential causes of failures. Primary incidents represent the causes of failures or other events and can be directly related to a particular piece of equipment. Incidents are then grouped into so-called cut sets, which arise when several basic events occur and cause another phenomenon. Such a cause-and-effect relationship between events leads to the localization of system weak spots. In addition, in this way, it is possible to determine what is the minimum path that leads to the occurrence of failure. Fault diagnosis using fault trees also implies estimating the probability of individual events, which leads to the development of a procedure for searching for a faulty component. The authors in [29] emphasize that such searches can be very costly, as checking for individual events often involves tests that are time-consuming (resulting in downtime costs) but also require activities such as entering the elevator shaft. Therefore, researchers are trying to develop trees in such a way that the troubleshooting procedure is as efficient as possible. Reliability analysis is sometimes not feasible when there are difficulties in obtaining the exact probability of base events for an elevator system. In such cases, it may be reasonable to use fault trees built using fuzzy logic [30].

The high complexity of elevator systems makes the identification of potential failures and their causes a complicated issue, although of course this is highly dependent on the manufacturer and design of a particular device. Researchers in [31] study the effect of the pressure difference caused by the stack effect on the occurrence of the cabin doors re-opening problem. The problem stems from the fact that the increased pressure difference between the corridor and the elevator shaft causes an increase in motion resistance when the door closes, which can be interpreted by the control system as blocking the door and trigger re-opening. Zhang and Zubair [32] identified a total of 17 different elevator system faults, four of which were related to the cabin door drive and four to the corridor-side access door. The authors listed the following fault categories for both of these systems: mechanism failure, instrumentation defect, blockage, and other. As noted in the introduction to this paper, wide-ranging defects in the cabin doors of passenger elevators are among the most common elevator failures [33]. Therefore, in the scientific world, one can also find many attempts to diagnose passenger elevator cabin door failures, which are expected to contribute to increasing the reliability of these devices and improving maintenance efficiency. One such attempt using multi-agent systems is presented in [34].

Due to its high efficiency, machine learning was also used to diagnose the failures of elevator cabin door drive systems. In [35], the authors developed a model to diagnose mechanical faults associated with different types of door blocking. Among them, they distinguished—normal operation of the drive, using hand to block the door, blocking object in the door gap, object directly blocking the door. The authors used the electrical and mechanical parameters of the device’s operation and past maintenance data sampled every 40 ms and then, using the sliding window technique, they calculated and added statistical features for a window length from 1 to 12 s. The authors compared the performance of four machine learning methods, with the best results obtained by the XGBoost algorithm (more than 90% classification accuracy with an F1 score of about 90%). However, the paper did not address the problem of strong class imbalance in the dataset, which is always present in real fault databases. Diagnostics of faults related to mechanical drive-blocking and caused by human factors are presented in [36]. The authors used operational and maintenance data of the elevator sampled at 100 Hz and then applied a convolutional neural network for feature extraction, which was then used to train a random forest cascade model. The final model contained 4 layers, each of which generated 201 random forests. The authors obtained high classification efficiency, but the article lacks information regarding the feasibility of implementing the developed method, which may have to do with its high level of complexity. In addition, the researchers also did not take into account the uneven distribution of classes in the dataset. The researchers in [37] developed a model for elevator door fault detection by focusing only on the door’s locking mechanism. For this purpose, they used neural networks optimized using an MPGA (multi-population genetic algorithm). Failure identification of the elevator cabin door drive divided into normal driving with completion, normal interrupted by the user, and abnormal driving (with failure) is presented in [38]. The researchers used 17 different quantities characterizing the operation of a passenger elevator cabin door sampled at a frequency of 8 Hz. Then, they used an autoencoder for feature extraction and a support vector machine for classification. The selection of the model’s hyperparameter values was performed using Bayesian optimization and k-fold cross-validation. It is worth mentioning that the authors obtained high efficiency (average F1 above 0.99) despite the inclusion of class imbalance.

The above review of the world literature in the field of passenger elevator malfunction diagnostics shows the lack of a comprehensive study of the most frequently failing system, which is the cabin doors. Although this topic has been known in principle since the early days of passenger elevators, only recently have attempts been made in the world literature to use modern diagnostic tools to improve the reliability of these devices and streamline the maintenance process. Among the studies reviewed, the vast majority focus attention on a single component of the system (for example, the locking mechanism) or carry out detection, distinguishing only failures related to door blocking. Other articles present a classification with a distinction between normal and emergency operations. Such an approach, however, does not allow the development of a system that would inform the elevator maintainer which component has most likely failed. In addition, not all researchers have taken into account in their studies the problem of class imbalance associated with the fact that under the conditions of normal operation of any system, faults occur occasionally and are a minority among the operating data of equipment. It is also worth mentioning that methods such as fault trees, despite the fact that they provide information about the potential causes of each component’s failure, are not based on the actual measurement data of the device and require verification tests to be performed. Another problem of some of the methods outlined above is the need to perform measurements at high frequency, which in the case of remote acquisition can be very costly since such a large amount of data collected for many devices would require huge disk spaces or cloud storage. An additional consideration in the case of a large number of measurements is the necessity to transmit them over an internal elevator bus, which has a limited throughput. The flow of data on this bus must not be disrupted, as the most important information on the operation of the device is transmitted there. Any problems caused by excessive load on the transmission can result in malfunctions.

Key contributions and novel aspects of this article are as follows:

Inclusion of a comprehensive dataset encompassing 10 distinct types of elevator door failures to ensure robust and diverse failure modeling and development of a data preprocessing method to reduce the amount of data transmitted, ensuring that diagnostic data transmission does not overload the CAN bus, which is shared with other critical elevator components.
Comparison of fault identification effectiveness across different machine learning algorithms, including logistic regression, decision trees, support vector machines, stochastic gradient descent (SGD) classifiers, random forests, and gradient boosting. Additionally, hyperparameter tuning and validation of results were performed.
Proposal of a practical implementation strategy for deploying the developed data preprocessing method directly on the elevator door drive controller for real-time diagnostic applications.

Figure 1 presents the chronological diagram of the proposed method. Initially, a test stand was built with a prototype elevator door drive system. Next, the most common failures in such systems were identified, and measurements were taken for normal operation and various types of faults. The data was then preprocessed and prepared for modeling using machine learning methods. Different data reduction techniques were also compared, including a custom method that reduced data volume by aggregating statistical features such as the mean and standard deviation. In the next stage, the effectiveness of seven machine learning algorithms was tested, hyperparameter tuning was performed, and the results were evaluated. Finally, the developed data reduction method was implemented on the cabin door drive controller. A detailed description of each stage of the work is provided in the following sections of the article.

The structure of the remaining part of the paper is as follows: Section 2 presents the data acquisition process, preprocessing methods, and the performance metrics used to assess fault diagnosis effectiveness. Section 3 provides a comparison of the fault detection performance of seven different machine learning algorithms. Section 4 describes the implementation of the original method for reducing the amount of data transmitted by the elevator cabin door drive controller. Section 5 contains the summary, and Section 6 outlines the limitations of the proposed method and the planned future work.

2. Data Collection and Preprocessing

2.1. Data Collection

In properly designed equipment, failures should be incidental events. For this reason, collecting a sufficiently comprehensive database of elevator cabin door faults can be a long-term process. In addition, most faults depend on the machine’s depletion rate and intensity of use. This means that the implementation of the measurement system should first be carried out on older equipment, thus interfering with the already existing system, which is a significant complication. On the other hand, deliberately triggering elevator failures to record their symptoms can lead to accelerated aging of the device’s components, as well as cause downtime in normal operation. Therefore, it was decided to build a test stand with the prototype cabin door drive system shown in Figure 2.

At this stage, the question of distinguishing between two types of doors found in elevators needs clarification. The first is the shaft door, which is installed on each floor and is an external element visible to the user. Their main function is to prevent unwanted access to the shaft, mainly in situations when the cabin is in motion or on another floor. Shaft doors do not have their own propulsion system, but only a mechanism that allows them to produce a closing moment when the cabin is outside the landing. This system takes the form of a spring or weight attached in such a way as to cause the door to close automatically. It is worth mentioning that the mechanism also places an additional load on the cabin door drive system (stretching the spring or lifting the weight). Reflecting this system on the test stand is the element marked with the number 6. Access to the inside of the elevator cabin is provided by the cabin door, the only pair which moves along with the elevator cabin and has its own drive using an electric motor. The mechanical coupling between the two types of doors is carried out by a special mechanism 2. This system has guides that move away from each other when the doors are fully closed and locked. When the elevator reaches the corresponding floor, a roller, which is part of the shaft door, enters between the guides. Then, when the door opens, the guides close, tightening the roller and coupling the two mechanisms together. This allows for coordinated movement of the doors and also prevents them from opening at the wrong moment. As already mentioned, simply closing the doors is not sufficient enough to ensure the safety of passengers in a moving cabin. An additional form of protection is provided by locking the door to prevent it from opening. The locking system 3 is also coupled with the guide mechanism of the shaft door connection. The main drive of the cabin door consists of a permanent magnet synchronous motor 5 equipped with an absolute encoder and a controller 8 that simultaneously performs other functions related to the operation of the drive system, such as communication with the main lift module. A toothed pulley has been placed on the shaft of the motor. On the other side of the door mechanism, there is a second pulley on a bearing-supported axle, which during the tests was replaced by a PMSM motor 1 from PERFOPOL, Starachowice, Poland used to simulate bearing failure by introducing an additional load torque depending on the angle of rotation and controlled by controller 10. The actuator that enables the door to move is a toothed drive belt 4. The test bench is also equipped with a measurement system, which consists of temperature sensors LM35 from Texas Instruments, Dallas, TX, USA placed inside the drive motor and under the mount of its controller, a triaxial accelerometer 7 ADXL345 from Analog Devices, Cambridge, MA, USA that allows measurement of vibrations on the door beam, and a microcontroller STM32F302 from ST Microelectronics, Geneva, Switzerland, equipped with 12-bit analog-to-digital converters, for acquisition of measurement data. In order for the drive system to work properly the cabin door drive controller measures phase currents, supply voltage, speed and motor shaft position. It was decided to use the existing measurement part of the drive controller and modify its software so that the measured values could be sent via the CAN bus to the controller collecting the data.

The system described above made it possible to trigger different types of faults and make measurements of the most important diagnostic parameters. In contrast to diagnostic methods based on the measurement of motor phase currents, it was decided to use the available information on the d-axis and q-axis components of motor current. These values, unlike the motor phase currents, are not alternating values, which makes it possible to reduce their sampling frequency.

The analyzed failures, along with their frequency of occurrence and graphs of the q-axis current component for opening and closing, are shown in Table 1. For malfunctions that have the character of sudden events, the red line indicates the moment of occurrence. In other cases, the graphs show faults that are the result of progressive degradation of individual components of the elevator door drive, the symptoms of which occur during the entire door cycle, i.e., closing or opening. Additionally, the characteristic fragments related to the door drive operation are marked on the normal operation charts (without damage).

One of the most important diagnostic parameters in the failure analysis of drive systems is the q-axis current component, since it reflects the torque developed by the motor. During the study, the phase currents of the PMSM motor were measured, and the d- and q-axis current components were calculated using the Clarke and Park transformations ([39]), which was necessary in the FOC (field oriented control) process. Additional diagnostic parameters were the output signals of the PI controllers, the voltage measured by the control system, as well as quantities related to the movement of the cabin door, such as speed, position, and direction of rotation. Based on the previously conducted analyses presented in [40] it can be concluded that a significant proportion of drive system failures result in excessive temperature rise in their most important components, such as the motor and the controller. For this reason, the measurement system was equipped with analog sensors for the temperature inside the motor and the control converter. The sensors were connected to the inputs of the analog-to-digital converter of the measurement microcontroller. In addition, a literature review concluded that a signal of the system’s vibrations could be an important parameter in terms of the possibility of identifying faults. The stand was therefore supplemented with a three-axis accelerometer located on the mechanism’s beam, sending information about accelerations in three axes via the SPI bus. Data acquisition was carried out using STM Studio v3.6.0 software, dedicated to viewing and logging data to a text file. Measurements were taken in normal operation and for the cabin door faults described earlier, resulting in a total of more than 3.9 million data records, which were given labels related to the operating states of the drive system.

In the first stage of data processing, records containing missing measurements that may have occurred under the influence of a malfunctioning measurement system were removed. These records accounted for 0.008% of the total measurement data. Due to the lack of a sensor providing information about the reference (ambient) temperature, it was decided to convert the temperature measurement results into their derivative after time. This made it possible to make the measurements independent of changes in ambient temperature between measurement series while retaining information about rapid temperature changes occurring during faults. In addition, taking into account the sensitivity of the acceleration measurement system to factors from the environment, acceleration measurements were made with the drive turned off before the actual tests. This made it possible to determine the level of vibration caused by the surroundings, that is, nearby traffic routes and other equipment operating in the building.

Fault identification models using machine learning methods require the collection of a sufficiently large dataset that contains representative examples of all considered types of device faults. Due to the nature of the signals associated with the occurrence of some faults, it is necessary to obtain measurements at a frequency sufficient to capture all the necessary information. The maximum measurements sampling frequency was 50 Hz, due to the limitations of the measurement system in terms of data transmission capabilities, but this value is similar to those used in the literature. In addition, based on an analysis of the cabin door control software, 11 of the diagnostic quantities available in the module were selected, which could have a significant impact on the ability to identify faults. In the normal operation of a passenger elevator, data from individual modules (such as the main module, cabin door controller, and cabin and floor modules) are transmitted via an internal CAN bus, the speed of which is 125 kbps due to length limitations (the bus connects multiple modules located throughout the whole elevator shaft). The elevator’s internal bus is the place where all information on the functioning of the elevator and its most important components is exchanged, so its operation must not be disturbed by an excessive amount of transmitted diagnostic data, as this could lead to delays or errors in the transmission of data crucial to the proper operation of the entire system. Transmission of the selected diagnostic data via the internal elevator bus would require transmission of 32 kb per second (11 4-byte variables plus overhead due to the CAN frame structure, every 20 ms), which is more than 25% of the throughput of the internal CAN bus. Obviously, such a high load on the bus by only one function of a single module is unacceptable and could overload the transmission. The collection of the device’s operational data at high frequency leads to the need for frequent uploads and the collection of significant amounts of information on the main server, which contains a constantly updated fault database. Taking into account the above values, it can be determined that transferring data from such an elevator to the central server would require about 10 GB of transfer per month over the GPRS network for just one device. In addition to the required transfer, it would be necessary to provide adequate disk space necessary for data storage. For one elevator, this would be about 124 GB per year, which, with an estimated production of 700 units, results in about 87 TB of memory to be filled. Given the ever-increasing number of devices being produced, the number of resources needed to store this data would be enormous and very expensive. Therefore, there is a need to reduce the amount of diagnostic data sent via the elevator bus and stored on a central server. This task became the basis for the study of methods for preprocessing information from the cabin door drive system of an elevator.

2.2. Metrics and Validation

The selection of an appropriate data preprocessing method, as well as the machine learning algorithm and its hyperparameters, is possible after a series of trials testing the performance of the model for different settings. In order to analyze effectiveness, it is required to determine performance metrics that most reliably reflect the actual behavior of the model in the face of new data. In properly functioning systems, faults are sporadic events, the frequency of which depends on the depletion of the device, the maintenance performed, and the intensity of use. It follows that among the operating data of the drive system, failures will be a significant minority of the normal operation samples. In such a case, the algorithm can be trained to obtain a very good percentage of correct classifications (it will correctly classify most samples), but not actually detect a single failure (samples marked as normal operation will be correctly classified). Therefore, verification of the model’s performance cannot be obtained only on the basis of accuracy, i.e., the percentage of correctly classified samples. Developing an effective fault identification model requires finding the right compromise between precision and sensitivity. It is essential that the model detects as many faults as possible while generating few false alarms. A measure that reflects the ratio between sensitivity and precision is the F1 score. The F1 score calculated for each label, and finding unweighted mean, was used to analyze the performance of each classifier model.

Data preprocessing method tests that allow reducing the amount of data used for modeling were carried out using simple machine learning algorithms, such as linear regression, logistic regression, and decision tree. The selected algorithms form the basis of more advanced methods, which improve them while allowing them to reduce the overfitting and underfitting. It was assumed that if improvements in the performance of simple algorithms are obtained, better results would be obtainable by the methods that constitute their improved versions. In addition, the algorithms used, due to their simplicity, also provided fast learning, which allowed many trials to be carried out in a relatively short time.

In order to develop an effective model for identifying faults in cabin door drive systems, it is necessary to carry out a number of trials to identify the data preparation method and the set of hyperparameters of the machine learning algorithm for which the fault detection error will be the smallest. However, it should be borne in mind that if during each trial, the performance of the algorithms will be checked on the same test set, then, as a result, the selected model will become the best but only for this particular set and may prove to be inadequate for predicting new samples. The test set has a certain limited size and will never be able to reflect all the relationships that were present in the training data. Therefore, it can be concluded that the model’s ability to generalize will be poor when it is prepared for a specific case of the test set. This means that the model will be biased, and its effectiveness for new test samples will be low. One solution to this problem may be to use a division into three subsets: training, validation (used to select hyperparameter values), and testing. However, the use of this method requires a significant number of samples, and in addition, the accuracy of the predictions may still depend on some random selection of the dataset split points. In practice, cross-validation is the most commonly used and extremely effective method to avoid problems with generalization accuracy. Stratified k-fold cross-validation was used to determine the performance of the model during the conducted tests. This method involves dividing a dataset into k equal parts and then teaching and testing the algorithm k times, with each iteration training the algorithm on k-1 subsets while the remaining subset is used for validation. The chosen stratified variant of cross-validation allows the set to be partitioned in such a way that in each of the k subsets the distribution of classes is similar. This allows the class distribution to be more accurately reflected during testing in cases where the original dataset is class imbalanced. The principle of data splitting based on used 5-fold cross-validation is illustrated in Figure 3. Preprocessing also included encoding class labels associated with system faults and normal operation. In addition, the data has been standardized to ensure the correct operation of machine learning algorithms.

2.3. Preprocessing Methods

The first of the data reduction preprocessing methods studied was the sampling rate reduction. To do this, sampling was performed from the original dataset. Then, the performance of simple machine learning algorithms was tested for each of the newly created datasets (sampled with different frequency). Independence from the method of data partitioning was obtained by using stratified five-fold cross-validation. The results of the experiment are shown in Figure 4.

Another of the proposed processing methods involved averaging the measurements over a period of time. Analogous values of the period to the previous method were carried over so that it would be possible to compare the effectiveness of classification at a certain level of data reduction. It can be seen in Figure 5 that the use of a simple statistic such as the average makes it possible to improve the performance of error identification especially in the lower sampling frequency range, which is desirable from the point of view of the designed system, as it allows for a reduction in data used for modeling.

Considering the improvement in the results obtained by using averaging, the authors proposed a method of preprocessing the measurement data, which involved calculating the moving mean and standard deviation of each measurement for a period defined as the width of the time window, and then adding these values as additional features to the dataset. The schematic of the described processing method is presented in Figure 6. With this method, new features were introduced to define the so-called short-term history of the device. For this purpose, it was necessary to choose one value of the sampling frequency for which the trials would be carried out. Therefore, it was decided to analyze the performance of fault classification for sampling frequencies in the range for which the F1 score during previous tests was the highest. Based on the results of the experiment, it was decided to choose a sampling frequency of 2.3 Hz. During further analysis, the effect of the time window width on the failure classification effectiveness was examined.

Figure 7 shows a comparison of the F1 score of fault classification by linear regression, logistic regression, and decision tree algorithms using the proposed processing method involving the addition of new variables representing the mean and standard deviation, calculated for different time window widths and a sampling frequency of 2.3 Hz. In the graph, one can notice the different impact of the window width for the linear methods (linear and logistic regression) and the decision tree. This may be due to the instability of the tree generation even with small changes in the input data (visible especially in the range of the time window width from 7 to 8.4 s). Introducing calculated statistics as new features in the dataset increased the number of features three times. However, given that the new features result in a significant improvement in performance for low sampling rates, it is possible to send three times larger portions of data but at much longer intervals.

On the basis of the conducted research on the impact of the data preparation method on the effectiveness of the diagnostics of the elevator cabin door drive system faults, it can be concluded that the method proposed by the authors, which assumes the use of mean values and standard deviations during the analysis, makes it possible to reduce the number of samples needed for modeling while retaining much of the useful information regarding the operation of the device in certain time intervals, of which a time window length of 7.4 s can be considered a good compromise between linear algorithms and a decision tree. It is worth noting that the operation of the elevator’s cabin door is cyclic in nature, related to opening and closing. The above observations led to the development of a new method of data preparation to reduce the amount of transmitted data.

Another of the processing methods developed involves aggregating a given cabin door cycle into a single record containing information about the mean and standard deviation of all diagnostic parameters. This means that the number of samples in the dataset is reduced to the number of cabin door operating cycles, while retaining useful information regarding the values of diagnostic parameters in the form of statistics. The method involves dividing the dataset into the operating cycles of the drive system and then replacing the diagnostic quantities of each cycle with their averages and standard deviations, as presented in Figure 8. In addition, a new parameter related to the length of the operating cycle was introduced. To ensure that the classification results are independent of the cycle order, the processed dataset was randomly shuffled. The results obtained for the described method are presented in Figure 9.

It can be seen that the proposed processing method produces much better fault classification results while reducing the amount of data from more than 3.9 million records to about 6700 (the number of cycles). Thus, it can be concluded that despite doubling the number of features, it was possible to reduce the amount of data by more than 300 times, and the required data transfer rate has been reduced to just 105 b/s (22 4-byte variables plus CAN overhead, every 11.7 s—the average cycle time), which is 0.08% of the available throughput of the elevator internal bus, which translates into about 34 MB of transfer per month and just over 0.4 GB of required disk space per year for one elevator. The proposed method significantly reduces the amount of diagnostic data transferred, which is beneficial, especially given the need to use the communication infrastructure of the elevator control system to transfer data between components and the device sending data to the central server.

3. Classification with Machine Learning

In the next step, after preprocessing the data, machine learning methods were selected for elevator cabin door drive failure classification. In the research, it was decided to compare the performance of the most popular classification algorithms available in the scikit-learn library of the Python 3.8 language, such as Logistic Regression (LR), Decision Tree (DT), Support vector machine (SVM), Stochastic Gradient Descent (SGD), K-Nearest Neighbors (KNN), Random Forest (RF), and Gradient Boosting (GB). The first comparison criterion was effectiveness in handling complex data (the ability to model intricate decision boundaries). In this regard, logistic regression performed the worst, as it is limited to modeling linear functions. When it comes to generalization capability and resistance to overfitting, these aspects largely depend on the appropriate selection of model parameters. Almost all of the algorithms include built-in regularization mechanisms, but the strength of regularization must be properly tuned. Performance on imbalanced datasets is particularly relevant in the context of fault detection. The k-nearest neighbors (KNN) algorithm is especially vulnerable to issues caused by data imbalance, as it makes decisions based solely on the local neighborhood of a sample. SVM and SGD algorithms are also sensitive to class imbalance; however, they can be adapted by incorporating class weighting mechanisms based on class quantities. Model complexity and possibility of implementation on embedded devices also pose challenges for KNN since the algorithm requires storing the entire dataset in memory. Ensemble algorithms are similarly problematic for embedded environments due to their computational complexity, which is heavily influenced by the number of weak estimators used in classification. In industrial applications related to fault diagnostics, interpretability is also a crucial factor, as these algorithms are intended to support operators and maintenance personnel. While the principles behind algorithms such as support vector machines (SVM) and KNN are relatively straightforward, interpreting their decision boundaries is significantly more difficult compared to, for instance, decision trees, which can be visualized. An additional advantage of tree-based algorithms is their ability to assess feature importance, which can serve as a valuable diagnostic tool. The final comparison criterion is scalability, which becomes increasingly important when collecting data from multiple devices. For large datasets, the training time of KNN and SVM algorithms is considerably longer compared to other models. A summary of the algorithm comparison is presented in Table 2.

An important aspect of fault detection modeling is the careful fine-tuning of the model to achieve the best possible performance. To this end, a process of hyperparameter tuning of the algorithms was carried out using random search with cross-validation. It requires defining a continuous or discrete distribution for each of the algorithm’s hyperparameters and then randomly sampling from this distribution to create a model, which is then trained and evaluated using stratified cross-validation. Parameter values and results for the tested algorithm configurations are recorded to then select the best settings. A diagram of how the random search method works is shown in Figure 10. It is worth noting that the method used allows the number of iterations to be adjusted to check the performance of the model regardless of the number of existing combinations of parameter values. In addition, the random search method makes it possible to analyze parameters that do not significantly affect the complexity of the model without sacrificing performance—in the case of the frequently used grid search method, adding a new attribute will significantly increase the number of combinations as it searches through all possible sets. In this work, it was decided to perform 100 search iterations for each algorithm. A summary of the best hyperparameters for each model obtained after tuning is shown in Table 3.

The values of precision, recall, and F1 score for all analyzed models with the best selected parameters are shown in Figure 11. It should be noted that for each of the algorithms, an improvement was obtained compared to fault identification with default hyperparameters. In most cases, the increase in efficiency was a few percent, but for algorithms such as logistic regression and SGD, the differences were much greater (up to several percent higher sensitivity). In addition, it can be seen from the presented results that each of the algorithms achieved an accuracy above 95%, but in many cases the values of the other metrics are much lower.

The development of an effective system for diagnosing elevator cabin door drive system failures requires the selection of the most effective algorithm to be implemented on the central server. In Figure 11, it can be seen that the best results in terms of sensitivity and F1 score are obtained by the random forest algorithm. Therefore, it was decided to check a confusion matrix for this method based on the last split of the cross-validation dataset (Figure 12). It can be seen that the algorithm has the biggest difficulty in classification of the faults associated with the automatic door closing system breakage. This is probably due to the fact that during the fault, the drive operates exactly the same as in normal operation, with the only difference being a slight increase in current when opening the door and a decrease when closing it. In addition, a few incorrect classifications for different types of faults are visible. Of particular interest is the triple confusion between bearing damage and drive belt stretch of 1 tooth. Thus, it seems that in some respect these two types of failure cycles are similar, and the investigation of this can be the subject of further research. On the other hand, it is worth noting that the use of a properly prepared fault diagnosis model makes it possible to distinguish failures of different types effectively. It is even possible to determine which motor controller transistor has failed.

4. Implementation

The developed model assumes calculating statistics, such as the mean and standard deviation, for the duration of the cabin door drive operation cycle. In order to reduce the amount of memory needed to collect measurement data from the entire duty cycle (this is necessary because the drive controller chip does not have enough memory), online statistics should be calculated and updated as each new sample arrives. In order to avoid the problem of so-called catastrophic cancellation [41] the use of the Welford algorithm has been proposed [42,43] updating the mean and variance after the appearance of a new measurement sample. The following formulas are used in the Welford algorithm for updating statistics:

{\bar{x}}_{n} = {\bar{x}}_{n - 1} + \frac{x_{n} - {\bar{x}}_{n - 1}}{n},

(1)

M_{2, n} = M_{2, n - 1} + (x_{n} - {\bar{x}}_{n - 1}) (x_{n} - {\bar{x}}_{n}),

(2)

σ_{n}^{2} = \frac{M_{2, n}}{n},

(3)

where

{\bar{x}}_{n}

denotes mean of n samples,

M_{2, n}

is the sum of squares of differences from the current mean, and

σ_{n}^{2}

—variance.

This preprocessing method has been successfully implemented on a cabin door drive controller equipped with an STM32F103 from ST Microelectronics, Geneva, Switzerland series microcontroller running at 72 MHz without adversely affecting the operation of the entire system. It is worth noting that the computational complexity of Welford’s algorithm is O(N), making it lightweight and suitable for the microprocessor of the drive controller. Additionally, reducing the amount of transmitted data decreases the load on the CAN bus.

Figure 13 presents a schematic of the system being implemented, which consists of a cabin door drive, a communication–measurement module, and a central server where the machine learning model will be deployed. The drive controller measures quantities, such as current and rotational speed, which are essential for vector control. These data are then processed using the proposed aggregation method and transmitted via the CAN bus to the communication–measurement module. The module also measures temperature and vibrations on the drive beam and transmits the complete dataset to the central server, where diagnostics will be performed. In addition, work is underway to implement a complete system that will also enable remote viewing of the lift’s operation, collection of operational data and identification of faults in real time. Thanks to the structure developed in this way, it will be possible in the future, once enough data has been collected, to upgrade the system to perform predictive maintenance functions.

5. Conclusions

Elevators are an integral part of modern society’s functioning and their number is constantly growing. The increase in the importance of these devices is related, among other things, to the growth of large cities. Nowadays, elevators are mandatory in skyscrapers, as well as in public places where it is important to ensure accessibility for people with disabilities.

The article presents a system that involves taking measurements of the basic drive system diagnostic parameters, processing them in a data-limited manner, and then classifying 11 different operating states of the device. The goal was achieved by building a stand with a prototype cabin door drive system, along with the necessary measuring equipment. Then, possible ways of triggering the faults were determined, and measurement quantities were selected that can have a significant impact on the performance of identifying system failures. In addition, analyses were carried out on the impact of the data preprocessing method on the performance of cabin door drive system fault diagnosis by various types of machine learning algorithms. A processing method has been developed, which, thanks to the diagnostic parameters cycle aggregation, allows to reduce the amount of data transmitted by the internal elevator bus, reducing the risk of disturbances in the operation of the communication channel. The bus throughput required to transmit diagnostic information was reduced from 32 kb/s to 0.105 kb/s, resulting in a reduction in the required monthly data transfer to the central server from 10 GB to 34 MB. The required amount of memory needed for data storage was thus reduced from 124 GB to 0.4 GB per year for one elevator. It is worth mentioning that the achieved reduction in the amount of transferred data did not result in the fault classification effectiveness decrease, while enabling the proper functioning of the entire device, providing the opportunity to install the device in places where there is no access to broadband Internet connection, and reducing the cost of both transmission and data storage. In addition, the conducted analyses took into account the problem of class imbalance as well as the selection of hyperparameter values of each machine learning algorithm, which resulted in a recall and an F1 score of approximately 97% for the random forest algorithm. Table 4 presents a comparison of cabin door fault diagnostics for methods described in the literature and the method proposed in this article. It was also tested that the developed processing method is possible to implement on the microprocessor of the cabin door drive controller and does not cause operational disturbances.

6. Limitations and Future Work

The conducted research is, of course, not free from limitations resulting from the adopted methodology. First and foremost, it was not possible to obtain data on malfunctions from operational elevators, as this would have been very time-consuming—properly designed systems experience malfunctions very rarely. Additionally, deliberately inducing malfunctions in a functioning device could lead to a reduction in its service life. However, it is worth emphasizing that the developed model of the elevator cabin door drive system is a faithful replica of real devices manufactured by a company specializing in elevator production, with which the authors have collaborated. The prototype system was built using components provided through this collaboration.

Moreover, the methods for simulating faults developed for research purposes were consulted with specialists in the field of cabin door drives. It should also be noted that although the exact values of diagnostic parameters, such as motor current or torque, may vary depending on the specific design of the device, for failures such as mechanical drive blockage, the symptoms will be nearly identical to those in a functioning system. In the case of failures such as a broken drive belt, the signals at the exact moment of the fault were not captured; however, the general symptoms of a broken belt are the same and enable easy detection of the malfunction and quick notification of the device maintenance personnel. From the perspective of rapid notification, it does not matter whether the maintenance department receives information about the malfunction one second or one minute after it occurs. It should also be kept in mind that the mechanical design of cabin door drives itself protects users and personnel from injury in the event of any of these failures. The primary objective of the research was to enable quick diagnosis and notification to restore the device to normal operation as soon as possible.

As part of future work, broader analyses of the system’s vibration measurements are planned. It will also be possible to examine the impact of the accelerometer’s placement and the method of processing measurement data (e.g., Fourier transform) on the effectiveness of the diagnostic model. Additionally, feature importance studies for the dataset will be conducted. It is likely that some diagnostic parameters can be discarded due to their minimal impact on the model’s performance, thereby reducing the number of required measurement sensors.

Work is currently underway to implement a monitoring and data collection system for malfunctions in operational elevators. In the future, this system will be used for research focused on predicting malfunctions in advance, enabling predictive maintenance.

Author Contributions

Conceptualization, D.C. and K.K.; Methodology, D.C. and K.K.; Software, J.G. and B.D.; Validation, J.G.; Formal analysis, D.C.; Investigation, J.G. and B.D.; Resources, K.K.; Data curation, D.C. and K.K.; Writing—original draft, J.G. and B.D.; Writing—review & editing, D.C. and K.K.; Visualization, J.G. and B.D.; Supervision, D.C. and K.K.; Project administration, D.C.; Funding acquisition, K.K. All authors have read and agreed to the published version of the manuscript.

Funding

The APC was funded by Lublin University of Technology grant no. FD-20/EE-2/605, FD-20/IT-3/050 and FD-20/EE-2/602.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy.

Conflicts of Interest

The authors declare no conflict of interest.

References

Yanbin, Z.; Shangchang, F.; Zheyi, L.; Kuangye, N. The Statistics and Analysis of Annual Fault Repair Report of An Elevator Company. In Proceedings of the 2020 2nd International Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI), Taiyuan, China, 23–25 October 2020; pp. 576–579. [Google Scholar]
Qibing, W.; Yonggang, L.; Dongliu, L.; Xiaoqiang, Z.; Ren, L.; Hongmei, Z.; Hao, Z. MCU System-Based Intelligent High-Speed Elevator Door Operator Fault Analysis and Research. In Proceedings of the 3rd International Conference on Automation, Control and Robotics Engineering (CACRE 2018), Chengdu, China, 19–22 July 2018; Volume 428. [Google Scholar]
Mao, Q.; Dong, Y.; Yang, J. Health Assessment and Fault Warning Method for Elevator Door System. In Proceedings of the 2022 5th International Conference on Artificial Intelligence and Big Data (ICAIBD), Chengdu, China, 27–30 May 2022; pp. 12–16. [Google Scholar]
Liu, X.K.; Yao, C.; Yu, H.N. Research on Web-Based Elevator Failure Remote Monitoring System. Appl. Mech. Mater. 2014, 494–495, 797–800. [Google Scholar] [CrossRef]
Huang, Q.; Cao, J.; Sun, R. Design and Implementation of an Elevator Power Failure Warning System. In Proceedings of the 2020 International Conference on Computer Science and Communication Technology (ICCSCT) 2020, Hangzhou, China, 25–26 July 2020; Volume 1621. [Google Scholar]
Gao, Y.; Wang, L.; Liu, Y.; Hong, K.; Wang, B. Design of Elevator Fault Diagnosis Cloud System Based on Smart Phones. In Proceedings of the 2019 Chinese Control and Decision Conference (CCDC), Nanchang, China, 3–5 June 2019; pp. 1318–1321. [Google Scholar]
Skog, I. Nonintrusive Elevator System Fault Detection Using Learned Traffic Patterns. IEEE Sens. Lett. 2020, 4, 1–4. [Google Scholar] [CrossRef]
Zheng, B. Intelligent Fault Diagnosis of Key Mechanical Components for Elevator. In Proceedings of the 2021 CAA Symposium on Fault Detection, Supervision, and Safety for Technical Processes (SAFEPROCESS), Chengdu, China, 17–18 December 2021. [Google Scholar]
Zheng, Y.; Huang, Y. High Order Spectrum Characteristics of the Elevator Fault. Adv. Mater. Res. 2013, 605–607, 739–743. [Google Scholar] [CrossRef]
Chen, J.; Liu, X. AR Bispectrum in Fault Diagnosis of Elevator Machinery. In Proceedings of the 2012 IEEE Fifth International Conference on Advanced Computational Intelligence (ICACI), Nanjing, China, 18–20 October 2012; pp. 1149–1152. [Google Scholar]
Yi, J.; Huang, Y. Fault Diagnosis of Elevator Based AR Bi-Spectrum. Adv. Mater. Res. 2012, 468–471, 1743–1748. [Google Scholar] [CrossRef]
Mishra, K.M.; Krogerus, T.R.; Huhtala, K.J. Fault Detection of Elevator Systems Using Deep Autoencoder Feature Extraction. In Proceedings of the 2019 13th International Conference on Research Challenges in Information Science (RCIS), Brussels, Belgium, 29–31 May 2019; Volume 2019. [Google Scholar]
Mishra, K.M.; Huhtala, K. Elevator Fault Detection Using Profile Extraction and Deep Autoencoder Feature Extraction for Acceleration and Magnetic Signals. Appl. Sci. 2019, 9, 2990. [Google Scholar] [CrossRef]
Mishra, K.M.; Huhtala, K. Condition Monitoring of Elevator Systems Using Deep Neural Network. In Proceedings of the ICORES 2020—Proceedings of the 9th International Conference on Operations Research and Enterprise Systems, Valletta, Malta, 22–24 February 2020; pp. 381–387. [Google Scholar]
Hinton, G.E.; Salakhutdinov, R.R. Reducing the Dimensionality of Data with Neural Networks. Science 2006, 313, 504–507. [Google Scholar] [CrossRef] [PubMed]
Mishra, K.M.; Huhtala, K.J. Fault Detection of Elevator Systems Using Multilayer Perceptron Neural Network. In Proceedings of the 2019 24th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), Zaragoza, Spain, 10–13 September 2019; pp. 904–909. [Google Scholar]
Wu, C.; Xu, J.; Tan, G.; Zhu, Y.; Nayem, H.M. Elevator Fault Diagnosis Method Based on PSO-1DCNN. In Advances in Transdisciplinary Engineering; IOS Press Ebooks: Amsterdam, The Netherlands, 2024; Volume 51, pp. 180–186. [Google Scholar]
Dong, T.; Zang, C.; Zeng, P. Fault Diagnosis Study of Elevator Based on Stochastic Configuration Networks. In Proceedings of the 2022 4th International Conference on Industrial Artificial Intelligence (IAI), Shenyang, China, 24–27 August 2022. [Google Scholar]
Wang, D.; Li, M. Stochastic Configuration Networks: Fundamentals and Algorithms. IEEE Trans. Cybern. 2017, 47, 3466–3479. [Google Scholar] [CrossRef] [PubMed]
Raia, M.R.; Ailincai, A.; Baicoianu, A.; Husar, C.; Irimia, C. Condition Monitoring of Industrial Elevators Based on Machine Learning Models. In Proceedings of the 2023 IEEE 28th International Conference on Emerging Technologies and Factory Automation (ETFA), Sinaia, Romania, 12–15 September 2023; pp. 1–5. [Google Scholar]
Xu, S.; Huang, Y. The Fault Diagnosis of Elevator Based on the Autoregressive Model and the Support Vector Machine. Appl. Mech. Mater. 2013, 271, 1689–1694. [Google Scholar] [CrossRef]
Zheng, Q.; Zhao, C. Wavelet Packet Decomposition and Neural Network Based Fault Diagnosis for Elevator Excessive Vibration. In Proceedings of the 2019 Chinese Automation Congress (CAC), Hangzhou, China, 22–24 November 2019; pp. 5105–5110. [Google Scholar]
Wang, P.; He, W.; Yan, W. Fault Diagnosis of Elevator Braking System Based on Wavelet Packet Algorithm and Fuzzy Neural Network. In Proceedings of the 2009 9th International Conference on Electronic Measurement & Instruments, Beijing, China, 16–19 August 2009; pp. 1028–1031. [Google Scholar]
Jia, M.; Gao, X.; Li, H.; Pang, H. Elevator Running Fault Monitoring Method Based on Vibration Signal. Shock Vib. 2021, 2021, 4547030. [Google Scholar] [CrossRef]
Flores, A.Q.; Carvalho, J.B.; Cardoso, A.J.M. Mechanical Fault Detection in an Elevator by Remote Monitoring. In Proceedings of the 2008 18th International Conference on Electrical Machines, Vilamoura, Portugal, 6–9 September 2008. [Google Scholar]
Yaman, O.; Karakose, M. Auto Correlation Based Elevator Rope Monitoring and Fault Detection Approach with Image Processing. In Proceedings of the 2017 International Artificial Intelligence and Data Processing Symposium (IDAP), Malatya, Turkey, 16–17 September 2017. [Google Scholar]
Pan, J.; Shao, C.; Dai, Y.; Wei, Y.; Chen, W.; Lin, Z. Research on Fault Prediction Method of Elevator Door System Based on Transfer Learning. Sensors 2024, 24, 2135. [Google Scholar] [CrossRef] [PubMed]
Xiaolun, W.; Zhang, X. Research on Elevator Braking Failure Assessment Model Based on Fishbone Diagram and AHP. In Proceedings of the 2020 International Conference on Artificial Intelligence and Computer Engineering (ICAICE), Beijing, China, 23–25 October 2020; pp. 260–263. [Google Scholar]
Niu, D.; Qi, C.; Li, G.; Li, H.; Pang, H. Rapid Fault Diagnosis Method of Elevator System Based on Multiattribute Decision Making. Shock Vib. 2021, 2021, 9939493. [Google Scholar] [CrossRef]
Xu, Y.C.; Li, X.S.; Tao, R.; Guo, S.; Gou, M.; Li, K. Research on the Elevator System with Fault Diagnosis Method Based on Fuzzy Fault Tree. Adv. Mater. Res. 2014, 1014, 90–93. [Google Scholar] [CrossRef]
Lee, D.-S.; Ji, K.-H.; Jing, J.; Jo, J.-H. Experimental Study on Elevator Door Reopening Problems Caused by Stack Induced Pressure Differences across the Elevator Door in Buildings. Build. Environ. 2022, 221, 109271. [Google Scholar] [CrossRef]
Zhang, X.; Zubair, M.U. Extending the Useful Life of Elevators through Appropriate Maintenance Strategies. J. Build. Eng. 2022, 51, 104347. [Google Scholar] [CrossRef]
Yuan, S.; Ge, M.; Qiu, H.; Lee, J.; Xu, Y. Intelligent Diagnosis in Electromechanical Operation Systems. In Proceedings of the IEEE International Conference on Robotics and Automation, 2004. Proceedings. ICRA ‘04. 2004, New Orleans, LA, USA, 26 April–1 May 2004; Volume 2004, pp. 2267–2272. [Google Scholar]
Zhang, G.; Huang, S.; Yuan, Y. The Study of Elevator Fault Diagnosis Based on Multi-Agent System. In Proceedings of the 2009 International Conference on Computational Intelligence and Software Engineering, Wuhan, China, 11–13 December 2009. [Google Scholar]
Liang, T.; Chen, C.; Wang, T.; Zhang, A.; Qin, J. A Machine Learning-Based Approach for Elevator Door System Fault Diagnosis. In Proceedings of the 2022 IEEE 18th International Conference on Automation Science and Engineering (CASE), Mexico City, Mexico, 20–24 August 2022; Volume 2022, pp. 28–33. [Google Scholar]
Zhang, A.; Chen, C.; Wang, T.; Cheng, L. Fault Diagnosis of Elevator Door Machines Based on Deep Convolutional Forest. In Proceedings of the 2022 27th International Conference on Automation and Computing (ICAC), Bristol, UK, 1–3 September 2022. [Google Scholar]
Qiu, J.; Qiu, B.; Yang, L. Fault Prediction of Elevator Door Lock Based on MPGA-BP Algorithm. In Proceedings of the 2021 2nd International Conference on Modeling, Big Data Analytics and Simulation (MBDAS 2021), Shenzhen, China, 14–15 November 2021; Volume 2179. [Google Scholar]
Chae, H.; Lee, J.E.; Oh, K.-Y. Fault Diagnosis of Elevator Doors Using Control State Information. IEEE Access 2022, 10, 7207–7222. [Google Scholar] [CrossRef]
Casadei, D.; Profumo, F.; Serra, G.; Tani, A. FOC and DTC: Two Viable Schemes for Induction Motors Torque Control. IEEE Trans. Power Electron. 2002, 17, 779–787. [Google Scholar] [CrossRef]
Czerwinski, D.; Gęca, J.; Kolano, K. Machine Learning for Sensorless Temperature Estimation of a BLDC Motor. Sensors 2021, 21, 4655. [Google Scholar] [CrossRef] [PubMed]
Chan, T.F.; Golub, G.H.; Leveque, R.J. Algorithms for Computing the Sample Variance: Analysis and Recommendations. Am. Stat. 1983, 37, 242–247. [Google Scholar] [CrossRef]
Welford, B.P. Note on a Method for Calculating Corrected Sums of Squares and Products. Technometrics 1962, 4, 419–420. [Google Scholar] [CrossRef]
Knuth, D.E. Seminumerical Algorithms. In The Art of Computer Programming, 3rd ed.; Addison-Wesley Longman Publishing Co., Inc.: Boston, MA, USA, 1997; Volume 2, ISBN 978-0-201-89684-8. [Google Scholar]

Figure 1. Chronological diagram of the proposed method.

Figure 2. Test stand with key components marked (a) front view and (b) rear view: 1—bearing failure simulation PMSM motor, 2—shaft door connection system, 3—lock, 4—drive belt, 5—PMSM drive motor with internal temperature sensor and absolute encoder, 6—shaft door closing weight, 7—guide bar, 8—triaxial accelerometer, 9—drive motor controller with temperature sensor, 10—measurement controllers, 11—bearing failure simulation motor controller.

Figure 3. Sample scheme for 5-fold, stratified cross-validation.

Figure 4. Performance metrics for fault classification by linear regression, logistic regression, and decision tree algorithms at limited sampling rate.

Figure 5. Performance metrics for fault classification by linear regression, logistic regression, and decision tree algorithms for the averaging processing method.

Figure 6. Schematic of the third processing method that introduces new features representing the mean and standard deviation for different time window widths and a sampling rate of 2.3 Hz.

Figure 7. Performance metrics for fault classification by linear regression, logistic regression and decision tree algorithms using a processing method that introduces new features representing the mean and standard deviation for different time window widths and a sampling rate of 2.3 Hz.

Figure 8. Schematic of a processing method that reduces the number of samples by applying data aggregation within the cabin door operation cycle using statistical features.

Figure 9. Performance of linear regression, logistic regression and decision tree algorithms for processing with aggregation of operation cycles.

Figure 10. Random search.

Figure 11. Fault classification performance metrics for selected algorithms after hyperparameter tuning.

Figure 12. Confusion matrix for the random forest algorithm.

Figure 13. Cabin door drive fault diagnosis system scheme.

Table 1. Elevator cabin door failures.

Type of Fault (Designation)	Severity	Description	Percentage of Records
No failures (nu)	-	3,465,880	88.20%
Drive block (bn)	Medium The drive blockage causes the system to halt; however, if the obstructing object is removed, the drive can resume operation after the elevator user presses a button.	22,880	0.58%
Drive belt stretched by 1 tooth (rp1z)	Low The stretching of the drive belt is a progressive phenomenon, and its early detection is crucial to protect the drive from the consequences of further elongation. In the case of stretching by one belt tooth, significant issues do not typically arise; however, in extreme cases, belt slippage on the drive pulley may occur.	35,159	0.89%
Drive belt stretched by 2 teeth (rp2z)	High For a stretch of two teeth, the drive belt tension becomes insufficient, resulting in belt slippage on the pulley during the generation of the torque required to lock the doors. Such behavior makes proper door closure impossible and may consequently lead to further degradation of the drive transmission components.	50,041	1.27%
Drive belt breakage (up)	High Breakage of the drive belt completely disables the drive movement.	51,116	1.30%
Bearing damage (lu)	Medium Bearing damage causes an additional load torque on the motor. In extreme cases, it may lead to complete bearing seizure and, consequently, drive failure. However, early detection of damage can prevent such severe consequences.	106,417	2.71%
Automatic door closing system breakage (us)	Medium A malfunction of the automatic closing system does not cause significant effects on the driver itself. However, if another door failure occurs simultaneously, the elevator doors will not close automatically, which may pose a safety risk to users.	26,384	0.67%
Shaft door connection breakage (upd)	High Damage to the shaft doors connection system is a serious fault that completely disallows the door to be unlocked and opened.	41,276	1.05%
Damage to the transistor in the lower branch of the control bridge (dt)	High Damage to any transistor in the drive controller results in operation without one of the motor phases, causing significant torque oscillations. Under heavy load, such as during door locking, the motor cannot generate sufficient torque to overcome the mechanism’s resistance. Consequently, the doors will not lock despite being closed.	49,519	1.26%
Damage to the transistor in the upper branch of the control bridge (gt)	High As above	54,311	1.38%
Encoder failure (en)	Medium Encoder misalignment leads to incorrect rotor position readings, resulting in improper generation of the stator voltage space vector. Consequently, for a given motor current, the shaft torque is reduced. Achieving the required torque will therefore cause a significant increase in the motor’s current consumption.	26,433	0.67%

Table 2. Comparison of machine learning algorithms.

Algorithm	LR	DT	SVM	SGD	KNN	RF	GB
Effectiveness against complex data	poor	average	good	average	average	good	good
Generalization ability and robustness against overfitting	good	average	average	average	poor	good	average
Effectiveness against imbalanced data	average	average	average	average	poor	good	good
Complexity/applicability on embedded devices	good	good	average	good	poor	average	average
Interpretability	good	good	poor	good	poor	average	average
Scalability	good	average	poor	good	poor	average	average

Table 3. Machine learning models’ selected parameters.

Model	Hyperparameter	Value
KNN	p	4
	n_neighbors	2
	metric	cityblock
Logistic regression	C	23.3
	solver	newton-cg
	class_weight	None
SGD	alpha	4.97 × 10⁻⁵
	epsilon	9.62 × 10⁻⁵
	eta0	3.53
	l1_ratio	0.811
	learning_rate	adaptive
	loss	perceptron
	power_t	0.0399
	class_weight	None
SVM	C	39.4
	coef0	0.119
	decision_function_shape	OVO
	degree	3
	gamma	auto
	kernel	linear
	class_weight	None
Decision Tree	min_samples_split	5
	min_samples_leaf	2
	max_features	None
	max_depth	None
	criterion	log_loss
	class_weight	None
Random Forest	criterion	entropy
	max_depth	7
	max_features	sqrt
	max_samples	0.915
	min_samples_split	5
	min_samples_leaf	5
	n_estimators	250
	class_weight	balanced
Gradient Boosting	criterion	firedman_mse
	learning_rate	0.0858
	max_depth	None
	max_features	sqrt
	min_samples_split	10
	min_samples_leaf	5
	n_estimators	250
	subsample	0.801

Table 4. Comparison of cabin door drives fault diagnosis methods.

Reference	Method	Number of Door States Taken into Account	Data Reduction	Number of ML Methods Compared	Effectiveness
[34]	Multi-agent System	7	no	NA	ND
[35]	Machine learning	4	sliding window with sampling from 40 ms to 1 s	4	F1 score ~90%
[36]	Machine learning	4	sliding window with sampling from 10 ms to 1 s	3	F1 score ~100%
[37]	Machine learning	5	no	1	ND
[38]	Machine learning	3	no	3	F1 score ~99%
this article	Machine learning	11	cycle aggregation with transmission after every cycle (~12 s)	7	F1 score ~97%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Gęca, J.; Czerwiński, D.; Drzymała, B.; Kolano, K. Efficient Fault Diagnosis of Elevator Cabin Door Drives Using Machine Learning with Data Reduction for Reliable Transmission. Appl. Sci. 2025, 15, 7017. https://doi.org/10.3390/app15137017

AMA Style

Gęca J, Czerwiński D, Drzymała B, Kolano K. Efficient Fault Diagnosis of Elevator Cabin Door Drives Using Machine Learning with Data Reduction for Reliable Transmission. Applied Sciences. 2025; 15(13):7017. https://doi.org/10.3390/app15137017

Chicago/Turabian Style

Gęca, Jakub, Dariusz Czerwiński, Bartosz Drzymała, and Krzysztof Kolano. 2025. "Efficient Fault Diagnosis of Elevator Cabin Door Drives Using Machine Learning with Data Reduction for Reliable Transmission" Applied Sciences 15, no. 13: 7017. https://doi.org/10.3390/app15137017

APA Style

Gęca, J., Czerwiński, D., Drzymała, B., & Kolano, K. (2025). Efficient Fault Diagnosis of Elevator Cabin Door Drives Using Machine Learning with Data Reduction for Reliable Transmission. Applied Sciences, 15(13), 7017. https://doi.org/10.3390/app15137017

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Efficient Fault Diagnosis of Elevator Cabin Door Drives Using Machine Learning with Data Reduction for Reliable Transmission

Abstract

1. Introduction

2. Data Collection and Preprocessing

2.1. Data Collection

2.2. Metrics and Validation

2.3. Preprocessing Methods

3. Classification with Machine Learning

4. Implementation

5. Conclusions

6. Limitations and Future Work

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI