1. Introduction
The development of systems enabling autonomous vessel navigation has recently attracted increasing interest [1,2]. One of the most important requirements in the development of autonomous vessels is the ability to monitor the proper operation of the propulsion engine. Timely fault detection in engine operation increases safety, prevents damage, and extends engine lifespan, consequently reducing maintenance costs. Diesel engines are the most commonly used engines for propulsion and power supply in vessels [3]. Therefore, the ability to detect irregularities in the operation of diesel engines is of utmost importance for ensuring the high reliability of the entire system.
Of particular interest are systems that detect faults in engine operation by analyzing the vibration signals [4,5] or acoustic signals [6,7] produced by the engine. Previous research has presented various methods for detecting faults in diesel engine operation from such signals. These methods include signal analysis in the frequency domain using the fast Fourier transform (FFT) [8,9], signal analysis using the wavelet transform (WT) [6,7,10,11,12], and signal decomposition methods such as empirical mode decomposition (EMD) [13,14], variational mode decomposition (VMD) [4,5], and principal component analysis (PCA) [6,15]. Recently, neural-network-based systems for detecting faults in ship systems have attracted increasing interest [5,16,17,18].
In [8], the authors investigated the possibility of monitoring the operation of internal combustion engines by analyzing the acoustic signal captured in the immediate vicinity of the engine. The detection of knocking, misfiring, and intake faults was considered. The proposed method, which is based on frequency analysis using the FFT, enables the detection of faults in engine operation; however, the presented results do not include data on the reliability of fault detection.
A time–frequency analysis of acoustic signals using the wavelet packet transform (WPT) for feature extraction, based on which engine operation is classified, was presented in [6]. After feature extraction, three different approaches were compared: standard classification, Bayesian optimization, and a PCA method combined with Bayesian optimization. The proposed method enables reliable misfire detection; however, it requires a significant amount of time for training and testing.
In [19], a method for feature extraction based on the mel-frequency cepstrum (MFC) and VMD analysis of vibration signals was presented. The method performs fault detection using the K-nearest neighbor classifier and was tested specifically on valve clearance faults. It is computationally demanding owing to its high-dimensional feature sets; thus, an improvement using vector quantization (VQ) was proposed to partially alleviate this issue.
Fault detection in engine operation using a combination of adaptive recursive variational mode decomposition (ARVMD) and component energy distribution spectrum (CEDS) on signals obtained from vibration measurements is proposed in [4]. ARVMD is used to extract intrinsic mode functions (IMFs), from which central frequencies and energies in unit frequency bands are obtained. Final classification is achieved by ranking the correlations of CEDS.
Detection of spectral anomalies using a variational autoencoder (VAE) is proposed in [17,20]. The proposed algorithm involves collecting data during both normal and faulty engine operation, feature extraction, and a training phase in which a VAE is established and then used for anomaly detection. The algorithm enabled the detection of various faults with high reliability. However, methods that use deep learning techniques require a sufficient amount of training data, which, in some situations, is a significant drawback for the practical implementation of detection systems.
Beyond the aforementioned approaches, several advanced spectral techniques have been developed in other disciplines that may inspire improvements in engine fault detection. Least-squares wavelet analysis (LSWA) and its cross-wavelet extension compute time–frequency representations by directly fitting sinusoidal components to irregularly sampled data, providing instantaneous frequency estimates without pre-processing [21]. These methods have been applied to astronomical and interferometry time series and show strong performance in detecting anomalies and coupling between signals. Multichannel antileakage least-squares spectral analysis (MALLSSA) and the antileakage Fourier transform (ALFT) iteratively estimate and subtract dominant Fourier components from irregularly sampled hydrophone and seismic records to mitigate spectral leakage [22,23]. While offering improved spectral estimation over conventional FFTs, these methods are computationally intensive and require iterative optimization. Recent work on compressed sensing of vibration signals for rotating machinery faults constructed an order basis using randomly sampled rotational speed data; this sparse representation achieves up to twenty-fold compression and robust reconstruction under speed variations [24,25]. In [26], a wavelet-based spatiotemporal sparse quaternion dictionary learning (WSTS-QDL) method is proposed for the reconstruction of multi-channel vibration data. The approach exploits quaternion transforms for handling multi-dimensional channels, integrates wavelet decomposition for spatiotemporal feature extraction, and applies sparse dictionary learning to accurately reconstruct vibration signals. In [27], a deep learning-based sparsity-free compressive sensing method is developed for high-accuracy reconstruction of structural vibration responses. The approach avoids the limitations of traditional sparsity assumptions by leveraging neural networks to directly learn the mapping between compressed measurements and full vibration signals, thereby improving reconstruction accuracy. In [28], a wind turbine fault diagnosis method is proposed that combines compressive sensing with a lightweight SqueezeNet model. Compressive sensing is employed to efficiently reduce the dimensionality of vibration data, while the SqueezeNet-based deep learning model enables accurate and computationally efficient fault classification.
To the best of our knowledge, such techniques have not yet been applied to marine diesel engine fault detection. Our work complements these developments by proposing a simpler FFT-based measure that can operate under varying engine speeds with minimal training data.
Although a substantial body of literature exists on diesel engine fault detection, several gaps remain. Many time–frequency and machine learning approaches rely on extensive labeled data and are sensitive to engine speed, while recent spectral methods such as LSWA, MALLSSA, ALFT, and compressed sensing provide improved frequency estimation or sparse representations, yet have been applied primarily in astronomy, geophysics, and bearings diagnostics. None of these techniques have been explored for marine diesel engines, and there is a lack of simple methods that can simultaneously handle acoustic and vibration measurements across variable speeds.
The main contributions of this paper are summarized as follows:
Novel frequency-domain fault measure: We introduce a simple metric based on the ratio of DFT magnitude spectra obtained from monitoring and training data and use the two largest spectral peaks to compute a distance that distinguishes normal from faulty operation.
Unified processing of acoustic and vibration signals: The proposed algorithm applies identically to microphone and accelerometer signals, demonstrating comparable detection performance for both modalities and highlighting the universality of the approach.
Comprehensive experimental evaluation: We validate the method on a six-cylinder marine diesel engine at five rotational speeds, emulate faults by disabling individual cylinders, and report detailed performance metrics including accuracy, precision, recall, and F1-score.
Parameter analysis and practical guidelines: The influence of window functions, FFT frame length, the number of cycles used for training and detection, and the threshold scaling parameter is systematically analyzed, providing guidance for practitioners.
The work presented in this paper builds on our previous work published in [29]. That paper described the effect observed during the research, specifically, that a vector could be defined whose element distribution could be analyzed to classify engine operation, and presented the basic operating principle of the algorithm. In this work, the measure and threshold for classification are defined and elaborated upon in detail, along with the training process, and the performance of the proposed algorithm is thoroughly evaluated.
This paper presents an algorithm that classifies engine operation as either correct or faulty by analyzing the signals obtained from a microphone or an accelerometer. The algorithm is based on signal analysis in the frequency domain. The recognition process includes a training phase followed by detection. Training is performed by analyzing the signals over a certain number of full diesel engine operating cycles, and both training and detection can be conducted at any engine speed. Classification is based on a threshold whose value is determined during the training phase. A key advantage of the proposed algorithm is its simplicity: the same algorithm detects operating faults from signals obtained from either a microphone or an accelerometer, which also makes practical implementation straightforward. In addition, the reliability of detection can be increased by adding more microphones or accelerometers. The method was tested on a marine diesel engine using acoustic and vibration signals, with an engine fault emulated by deactivating one cylinder. The proposed algorithm successfully classified engine operation as either normal or faulty at different engine speeds.
The remainder of this paper is organized as follows. Section 2 describes the materials and methods. The results are presented in Section 3, followed by a detailed discussion in Section 4. Finally, Section 5 provides the conclusions of this study.
3. Results
Measurements were conducted in a ship’s engine room on a four-stroke diesel engine with six cylinders. The engine displacement per cylinder was 4.88 L, the power output was 2525 kW, and the maximum rotational speed was 1900 revolutions per minute (RPM). Two microphones were placed in close proximity to the engine, and four accelerometers were mounted on the engine housing. In addition, an optical tachometer sensor was positioned to register each complete rotation of the flywheel. The engine room with the measuring equipment, a microphone, and an accelerometer is shown in Figure 2. The signals from the microphones, accelerometers, and optical tachometer sensor, totaling seven channels, were digitized using an analog-to-digital (A/D) converter with a sampling frequency of 51,200 Hz and a resolution of 24 bits per sample. Sampling was synchronized across all channels. Sound and vibration measurements were conducted on a properly functioning engine at 600, 900, 1200, 1500, and 1800 RPM. Faulty engine operation was simulated by disabling one cylinder. Two independent measurements were conducted, one with the first cylinder deactivated and one with the fifth.
The performance of the proposed algorithm is shown in Figure 3 and Figure 4.
Figure 3 displays the results for engine operation at 600, 900, and 1200 RPM (from left to right), while
Figure 4 shows the results for operation at 1500 and 1800 RPM, also from left to right. To calculate expression (
3), in the time domain, a Hamming window function was used with a sample size of
. The presented results pertain to a measurement scenario in which the outcomes of proper engine operation are compared with the results obtained when the first cylinder was deactivated. Hereafter, the term ’block’ will be used to denote multiple consecutive full cycles of engine operation. For each block of
full cycles, the value
d was calculated according to Equation (
9), after which the average value
was computed. Let
denote the value obtained for the
i-th block; it follows that
, where
denotes the number of blocks of the collected data for each microphone or accelerometer at different revolution speeds during normal and faulty operations. Owing to limited measurement capabilities, the duration of the collected audio recordings was at least 30 s for each engine revolution count for which the measurements were conducted. A specific challenge was encountered during data collection when one cylinder was deactivated, given the potential damage that could occur if the engine operated for an extended period with a deactivated cylinder. The number of complete cycles in the specified time interval being measured,
, varied from around 15 to 80. The blue bars in
Figure 3 and
Figure 4 represent the values of
obtained from data collected during normal operation, while the orange bars correspond to the data collected during faulty operation. Faulty operation refers to a situation in which the first cylinder was disabled. Also, on the graphs for each individual measurement, the double deviation for each average value of
is depicted for normal as well as faulty engine operation. Specifically, the mean value
is shown along with the double standard deviation
. In all cases, the vector
was calculated from
full cycles of engine operation, using data collected during the training phase. From the obtained results depicted in
Figure 3 and
Figure 4, a clear difference in the mean value of the measure
was evident between proper engine operation and faulty engine operation, considering various sensors at different engine speeds. A weaker detection capability is observed in
Figure 3 at 900 RPM, where the classification threshold was situated within the deviation range of
from the mean value of
. Independent data were used for training for all tested rotational speeds and for all sensors. The training data were not used to test the method.
Figure 3 and
Figure 4 demonstrate that the value of
was significantly influenced by whether the engine operated with all cylinders active or with the first cylinder deactivated. This pattern was consistent across all tested engine rotational speeds, both for microphones and accelerometers.
To classify the mode of operation as either correct or faulty, it was necessary to determine a threshold. By analyzing the collected data, we concluded that the threshold could be determined from the training data, from which vector
was also calculated. The procedure for threshold determination was as follows: data collected during all full cycles used for training, 100 in the case of the results shown in
Figure 3 and
Figure 4, were divided into 20 blocks, each consisting of five full cycles of engine operation. For each block of
full cycles, the value
d was calculated according to Equation (
9), after which the average value
was computed. Let
denote the value obtained for the
k-th block; it follows that
. The parameter
was crucial for determining the threshold to be used to classify engine operation as either correct or faulty.
Through experimental analysis, we determined that the ‘optimal threshold’ for classifying engine operation could be defined using the parameter
, according to the criterion of maximizing classification accuracy (Acc):
where
C is a parameter for optimization and Acc is defined as
$$\mathrm{Acc} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}},$$
where TP, TN, FP, and FN denote the true positive, true negative, false positive, and false negative test results, respectively. These counts are the total numbers of events determined from the data collected from each microphone and accelerometer. In
Figure 3 and
Figure 4, the horizontal line on the bar graphs represents a threshold of
, i.e.,
. Further details on how the parameter
was determined are provided in the explanation of the results shown in
Figure 5. A more reliable estimate of the threshold can be obtained by assessing it from a training dataset that is not used for calculating the vector
. In such cases, a longer training sequence is required. However, the obtained results justify the proposed approach. Algorithm 1 presents the proposed method.
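The threshold selection described above can be illustrated with a short sketch. It assumes only what the text states: the threshold is C times the average of the block values of d obtained from the training data, operation is classified as faulty when d lies above the threshold, and C is chosen to maximize Acc. The function names, the grid of C values, and the way the validation values are passed in are hypothetical choices made for illustration; the computation of d itself (Equation (9)) is not reproduced here.

```python
import numpy as np

def accuracy(tp, tn, fp, fn):
    """Acc = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)

def classify(d_values, threshold):
    """Boolean array: True (faulty) where d lies above the threshold."""
    return np.asarray(d_values, dtype=float) > threshold

def sweep_c(d_train_blocks, d_normal, d_faulty, c_grid):
    """Return (best C, best accuracy, resulting threshold).

    d_train_blocks: d values of the training blocks (assumed already computed)
    d_normal, d_faulty: labeled validation d values (assumed already computed)
    c_grid: candidate values of C (hypothetical grid)
    """
    d_bar = float(np.mean(d_train_blocks))  # average d over training blocks
    best_c, best_acc = None, -1.0
    for c in c_grid:
        thr = c * d_bar
        fp = int(np.sum(classify(d_normal, thr)))  # normal flagged as faulty
        tn = len(d_normal) - fp
        tp = int(np.sum(classify(d_faulty, thr)))  # faulty flagged as faulty
        fn = len(d_faulty) - tp
        acc = accuracy(tp, tn, fp, fn)
        if acc > best_acc:
            best_c, best_acc = c, acc
    return best_c, best_acc, best_c * d_bar
```

A call such as `sweep_c(d_train, d_ok, d_bad, np.arange(1.0, 1.6, 0.05))` would scan C in steps of 0.05; when several values of C give the same accuracy, this sketch keeps the smallest one.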
Figure 5 shows the accuracy achieved by the proposed algorithm as a function of the parameter
C, which, when multiplied by
, defines the threshold value used for classifying engine operation as either correct or faulty.
Algorithm 1. Algorithm for training and detecting engine operating faults.
while monitoring continues do
    … (number of samples within the m-th full cycle)
    for … do
        if … then
            …
        end if
    end for
    if … then
        for … do
            calculate-d(…)
        end for
    else
        calculate-d(…)
        if … then
            …
        else
            …
        end if
    end if
end while
function calculate-d(…)
    findPeaks(…)
    return d
end function
function findPeaks(…)
    return …
end function
The images, from left to right, correspond to
= 50, 75, and 100 full cycles used for training. From these, 10, 15, and 20 full cycles, respectively, were used to form a block, from which the
values were calculated. When averaged, these yielded the
value. Each image shows graphs depicting the dependence of classification accuracy on the parameter
C. Individual graphs correspond to the performances achieved by analyzing data for method validation, using
= 5, 10, 15, and 20 full cycles to calculate the
d value using Equation (
9). Based on this value, the operation was classified as correct or faulty, depending on whether it was below or above the
. A Hamming window function was used with a sample size of
, and the obtained results correspond to the scenario in which faulty engine operation was simulated by deactivating the fifth cylinder. The choice of the parameter
C was highly important as the accuracy of the classification depended on it. Analyzing the obtained results, it can be observed that for
(left image), the maximum accuracy was achieved when
C was chosen in the range of 1.35 to 1.45, depending on the value of
. However, since the maximum achievable accuracy for
was approximately 0.92, this case is not particularly interesting given that significantly better accuracy values were obtained by increasing the number of full cycles required for training, as can be seen in the middle and right images. In the case when
and 100 (middle and right images), the highest accuracy was achieved when
C was in the range of 1.2 to 1.25 and when
or 20. From the obtained results, it can be concluded that for the values of variables
and
relevant for practical use, namely, 75 and 100 for
and 15 and 20 for
, maximum accuracy was achieved when
C was in the range of 1.2 to 1.25.
Figure 6 and
Figure 7 depict the dependence of accuracy on the number of full cycles,
, used to calculate the value of
d for different values of full cycles used for training,
, in the case of simulating errors in operation by deactivating the first and fifth cylinder, respectively. To calculate the threshold, the optimal parameter
was used, which maximized accuracy. According to
Figure 5, for practical values (
or 100,
or 20), the maximum accuracy was achieved when
C lay between 1.2 and 1.25. Therefore,
was selected as a representative near-optimal value. A Hamming window was applied, and
.
From the results, it is evident that the number of full cycles required to calculate the value of d, which determined the threshold for classification, affected the detection accuracy. With full cycles, an accuracy of approximately 98% was achieved, while increasing to 20 yielded no significant improvement in accuracy when . It was even slightly reduced when the error was simulated by deactivating the first cylinder. A smaller number of full cycles of engine operation required for classification, , is desirable because it requires less time to determine whether an error has occurred. This is particularly important at lower engine speeds because there are fewer full cycles per unit time, which prolongs the time needed to collect data for assessment. For example, at 600 RPM, 15 full cycles take 3 s, while, at 1800 RPM, 15 full cycles take 1 s.
From the displayed results, it is particularly interesting that higher accuracy was achieved when training was conducted using full cycles of engine operation compared to when . As with , it is preferable for training to require a smaller number of full cycles of engine operation, which means shorter training times. For example, at 600 RPM, 75 full cycles take 15 s, while, at 1800 RPM, 75 full cycles take 5 s.
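The durations quoted above follow directly from the fact that one full cycle of a four-stroke engine spans two crankshaft revolutions. The following minimal sketch reproduces that arithmetic; the function name and its parameters are illustrative only.

```python
def cycles_duration_s(n_cycles, rpm, revs_per_cycle=2):
    """Seconds needed to observe n_cycles full cycles at a given engine speed.

    revs_per_cycle=2 reflects a four-stroke engine (two revolutions per cycle).
    """
    revs_per_second = rpm / 60.0
    return n_cycles * revs_per_cycle / revs_per_second

for rpm in (600, 1800):
    # 15 cycles (detection example) and 75 cycles (training example)
    print(rpm, cycles_duration_s(15, rpm), cycles_duration_s(75, rpm))
```

At 600 RPM this gives 3 s and 15 s, and at 1800 RPM it gives 1 s and 5 s, matching the values stated in the text.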
The displayed results show that the achieved accuracy was similar when faulty operation was emulated by deactivating the first cylinder compared to deactivating the fifth cylinder. Considering the obtained accuracy, it can be concluded that the optimal choice was and . Unfortunately, owing to objective circumstances, we were not able to explore the possibility of detecting other types of potential faults in engine operation as the measurements were conducted on an engine in commercial operation, and the risk of any damage was not acceptable.
Table 1 provides data on the total number of events (TP, FP, FN, and TN) collected from all sensors, along with the corresponding accuracies observed for all measurements, conducted in the case of deactivation of the first or fifth cylinders.
In
Figure 8, the classification success based on the data collected from all microphones and accelerometers is graphically illustrated for all engine speeds tested during normal engine operation and operation when the first or fifth cylinder was deactivated. The results presented correspond to the scenario where
= 50 and
= 5, using the Hamming window function and
N = 4096. The incorrect classification is marked with a red dot, whereas the correct one is marked with a green dot. The presented results do not correspond to the scenario in which
was chosen to achieve the maximum accuracy. The purpose of the illustration is to provide insight into the classification potential at different engine speeds for all microphones and accelerometers. The results show that at certain engine speeds, in this case 900 RPM, the classification was not satisfactory during normal engine operation because it was incorrect in a large number of cases. It can also be observed that when the fifth cylinder was deactivated, the classification was incorrect in the majority of cases for the data collected from accelerometer Accl 2 at 1200 RPM. Similar observations can be made for Mic 1 at 900 and 1500 RPM and Mic 2 at 1500 and 1800 RPM. It should be noted that the classification was often unsuccessful for data collected from one particular microphone or accelerometer while, under the same conditions, being successful for data collected from the other microphones or accelerometers. Hence, the principle of classification based on the majority could be applied.
With the first cylinder turned off, majority classification failed in only two cases, indicated by a red vertical line at 600 RPM: the data collected from two microphones and one accelerometer (Mic 1, Mic 2, and Accl 1) led to the faulty operation being classified as correct, while the data collected from the remaining three accelerometers (Accl 2, Accl 3, and Accl 4) led to the correct classification. With the fifth cylinder turned off, the majority classification would be correct in every instance because, at most, two out of the six microphones and accelerometers yielded incorrect classifications simultaneously.
By applying the same classification principle with
and
, for which the maximum precision was achieved, the classification was incorrect in only one case, in which correct engine operation was classified as faulty using the data collected from both microphones (Mic 1 and Mic 2) and one accelerometer (Accl 3), whereas the data from the remaining three accelerometers yielded the correct classification; this case is marked by the red vertical line in Figure 9. In all other cases, for both correct and faulty operation, regardless of whether the first or fifth cylinder was disabled, the classification was correct.
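The majority principle discussed above can be sketched as follows. The text does not specify how a three-to-three split among the six sensors is resolved; it only counts such splits as misclassifications. Returning "undecided" on a tie is therefore an assumption of this sketch, as are the function name and the boolean encoding of the per-sensor decisions.

```python
def majority_vote(sensor_flags):
    """Combine per-sensor decisions (True = sensor reports a fault).

    Returns "faulty" or "normal" when a strict majority exists and
    "undecided" on a tie (tie handling is an assumption of this sketch).
    """
    flags = list(sensor_flags)
    n_fault = sum(1 for f in flags if f)
    n_normal = len(flags) - n_fault
    if n_fault > n_normal:
        return "faulty"
    if n_normal > n_fault:
        return "normal"
    return "undecided"

# Six sensors: two microphones and four accelerometers.
print(majority_vote([True, True, True, True, False, False]))   # faulty
print(majority_vote([True, True, True, False, False, False]))  # undecided
```

With an even number of sensors, a deployment would need an explicit tie rule, for example treating an undecided vote as a warning.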
Additional Performance Metrics
While accuracy provides an overall measure of correct classifications, other metrics such as precision, recall, and F1-score are informative when dealing with imbalanced data [31]. Precision reflects the proportion of detections that were actually faults, recall indicates the proportion of actual faults that were correctly detected, and the F1-score is their harmonic mean.
Table 2 summarizes these metrics for the representative case where
and
for both the first and fifth cylinder faults. The values were computed from the confusion matrix entries in
Table 1.
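For reference, the three metrics can be computed from confusion matrix entries as in the sketch below, which uses the standard definitions with a fault treated as the positive class. The function name is illustrative, and the counts in the usage line are placeholders rather than values from Table 1.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard definitions; returns 0.0 for a metric whose denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Placeholder counts for illustration only (not taken from Table 1).
print(precision_recall_f1(tp=8, fp=2, fn=2))
```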
As described in
Section 2.1, the proposed method for classifying engine operation is based on calculating the value of
d according to Equation (
9), for which it is necessary to determine the two largest values in the vector
defined in Equation (
8). In
Figure 10, examples of the highest values of elements in vector
are shown for correct engine operation in the left column images and for faulty operation with the first cylinder turned off in the right column images at different engine speeds. The displayed images correspond to the case when
,
, and
and the Hamming window function was applied. The red dots indicate the highest values of the samples selected according to the proposed algorithm. In some cases, the largest values were clustered around a certain position
n; in such cases, the next largest value from that group was not considered. Examples are the images depicting correct engine operation at 1500 RPM and faulty engine operation at 900 RPM. To avoid selecting neighboring maximum values, the second maximum was required to lie at least ten positions away from the position of the maximum value. The displayed images show that the maximum sample values were lower for normal engine operation than for faulty engine operation, which allows for classification.
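The peak selection rule described above (largest value, then the largest value at least ten positions away from it) might be sketched as follows. The function name and return format are hypothetical, and the input stands for the vector of Equation (8), which is assumed to be already computed; the subsequent calculation of d is not shown.

```python
import numpy as np

def find_two_peaks(v, min_distance=10):
    """Return ((index, value) of the maximum,
               (index, value) of the largest element at least
               min_distance positions away from that maximum)."""
    v = np.asarray(v, dtype=float)
    first = int(np.argmax(v))
    idx = np.arange(v.size)
    allowed = np.abs(idx - first) >= min_distance  # excludes the cluster around the maximum
    if not allowed.any():
        raise ValueError("vector too short for the requested separation")
    second = int(idx[allowed][np.argmax(v[allowed])])
    return (first, float(v[first])), (second, float(v[second]))

# Toy vector: a cluster around position 20 and a separate peak at position 35.
v = np.zeros(50)
v[20], v[22], v[35] = 5.0, 4.0, 3.0
print(find_two_peaks(v))  # position 22 is skipped because it is too close to 20
```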
By analyzing the obtained results, we determined that the accuracy of the proposed method also depends on the applied window function.
Table 3 presents the achieved accuracy using different window functions. The results pertain to the case where there was no fault in operation and when the first cylinder was excluded, with
,
, and
. The obtained results show that the use of a window function is justified. The poorest result was obtained when no window function was applied, i.e., with a rectangular window, while the best result was achieved with the Hamming window function.
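One general effect of windowing is reduced spectral leakage. The sketch below compares a rectangular and a Hamming window on a synthetic tone that falls between two FFT bins; it does not reproduce the results in Table 3 and makes no claim about why the Hamming window performed best in the experiments. The test signal, the frame length of 1024 samples, and the helper names are assumptions made for illustration.

```python
import numpy as np

def magnitude_spectrum(x, window="hamming"):
    """One-sided FFT magnitude of x after applying the chosen window."""
    x = np.asarray(x, dtype=float)
    if window == "hamming":
        w = np.hamming(x.size)
    elif window == "rectangular":
        w = np.ones(x.size)
    else:
        raise ValueError("unsupported window: " + window)
    return np.abs(np.fft.rfft(x * w))

def leakage_ratio(x, window, guard=5):
    """Largest magnitude outside +/- guard bins of the peak, relative to the peak."""
    mag = magnitude_spectrum(x, window)
    k = int(np.argmax(mag))
    outside = np.ones(mag.size, dtype=bool)
    outside[max(0, k - guard): k + guard + 1] = False
    return float(mag[outside].max() / mag[k])

n = 1024
t = np.arange(n)
x = np.sin(2 * np.pi * 100.5 * t / n)  # tone halfway between bins 100 and 101

print("rectangular:", leakage_ratio(x, "rectangular"))
print("hamming:    ", leakage_ratio(x, "hamming"))
```

For this tone, the energy that spreads beyond the main lobe is noticeably smaller with the Hamming window than with the rectangular one.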
The achieved accuracy also depended on the number of samples used to calculate the FFT.
Table 4 provides an overview of the achieved detection accuracy for disabling the first or fifth cylinder for different FFT frame lengths. The results indicate that, as expected, the detection performance decreased with a reduction in the number of samples, but not significantly. This suggests that the upper frequency limit covered in the signal analysis is not critical for the success of fault detection in engine operation. The most critical situation occurred at the lowest engine speed, which, in this case, was 600 RPM. Given that the sampled signal
is resampled at
N samples within two full cycles of motor operation, sampling frequency after resampling can be expressed as
where
represents the number of revolutions of the engine per unit of time. The division by a factor of 2 is because a full cycle of engine operation involves two revolutions. In the case where the rotation speed was 600 RPM and for
signal samples, according to Equation (
12), the sampling frequency after resampling is
Hz. It follows that the maximum frequency covered by the analysis was
Hz. Although the frequency content of the signal was analyzed only up to a maximum of 320 Hz, the accuracy of fault detection was relatively high, approximately 95%. This was because the proposed method for detecting engine faults is based on finding the two maximum values of the samples in vector
and calculating the value of
d according to Equation (
9). By analyzing the distribution of the signal samples
, it was observed that the distribution differed between normal and faulty engine operation. In
Figure 11, histograms of the signal samples
are depicted at 600 and 1800 RPM for
and
in both normal and faulty engine operation. From the displayed histograms, it can be observed that the maximum values of the signal samples
were higher in the case of faulty engine operation than in normal operation. This held true for different rotational speeds and values of
N. The results are presented for 600 and 1800 RPM and
and
for simplicity; however, from other conducted experiments and displayed results, it can be concluded that this holds true in general. It can be concluded that for the success of detecting faulty motor operation using the proposed method, the distribution of signal
, i.e., the presence of higher maximum values of signal samples in faulty engine operation than in normal operation, is crucial.
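The relation between frame length, engine speed, and the analyzed frequency range can be checked with a small calculation. The sketch assumes that the resampled rate equals the number of samples per full cycle multiplied by the number of full cycles per second, with one full cycle spanning two revolutions. The value of 128 samples per full cycle is a hypothetical choice, picked only because it reproduces the 320 Hz limit quoted in the text; it is not taken from the original equations.

```python
def resampled_rate_hz(samples_per_cycle, rpm, revs_per_cycle=2):
    """Sampling rate after resampling to a fixed number of samples per full cycle."""
    cycles_per_second = (rpm / 60.0) / revs_per_cycle
    return samples_per_cycle * cycles_per_second

fs = resampled_rate_hz(128, 600)  # 128 is a hypothetical frame length
f_max = fs / 2                    # highest frequency covered by the analysis (Nyquist)
print(fs, f_max)
```

Under these assumptions, the rate at 600 RPM is 640 Hz and the analysis covers frequencies up to 320 Hz; at higher engine speeds the same frame length covers a proportionally wider band.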
Figure 12 shows an example of the distribution of variable
d during correct and incorrect engine operations at 1800 RPM. The figure illustrates the difference in the distribution of variable
d between correct and incorrect engine operations. Specifically, the values of variable
d are higher during incorrect engine operation than during correct engine operation, which enables classification.