Article

Meta-Learning-Based LSTM-Autoencoder for Low-Data Anomaly Detection in Retrofitted CNC Machine Using Multi-Machine Datasets

1 Department of Industrial & Management Engineering, Hanyang University, 222, Wangsimni-ro, Seongdong-gu, Seoul 04763, Republic of Korea
2 Department of Industrial & Management Engineering, Hanyang University ERICA, 55, Hanyangdaehak-ro, Sangnok-gu, Ansan 15588, Republic of Korea
* Author to whom correspondence should be addressed.
Systems 2025, 13(7), 534; https://doi.org/10.3390/systems13070534
Submission received: 31 May 2025 / Revised: 27 June 2025 / Accepted: 27 June 2025 / Published: 1 July 2025
(This article belongs to the Special Issue Data-Driven Analysis of Industrial Systems Using AI)

Abstract

In recent manufacturing environments, the use of digitally retrofitted equipment has grown substantially, yet this trend also amplifies the challenge of ensuring stable operation through effective anomaly detection. Retrofitted systems suffer from two critical obstacles: a severe scarcity of labeled data and substantial variability in operational patterns across machines and products. To overcome these issues, this study introduces a novel anomaly detection framework that integrates Model-Agnostic Meta-Learning (MAML) with a Long Short-Term Memory Autoencoder (LSTM-Autoencoder) under a multi-machine-based task formulation. By constructing meta-tasks from time-series datasets collected on multiple five-axis computer numerical control (CNC) machines, our method enables rapid adaptation to unseen machines and production scenarios with only a few training examples. The experimental results demonstrate that, even under data-scarce conditions, the proposed model achieves an accuracy of 98.02% and an F1-score of 94.74%, representing improvements of 4.2 percentage points in accuracy and 16.9 percentage points in F1-score over conventional transfer learning approaches. Furthermore, in cross-validation on entirely new machine data, our framework outperforms existing models by 18.1% in accuracy, evidencing superior generalization capability. These findings suggest that the proposed multi-machine-based Model-Agnostic Meta-Learning Long Short-Term Memory Autoencoder (MAML LSTM-Autoencoder) can significantly enhance operational efficiency and reduce maintenance costs in retrofitted manufacturing equipment, thereby improving overall productivity and paving the way for real-time industrial deployment.

1. Introduction

Recently, digital transformation in manufacturing has emerged as a core strategy for improving productivity and quality, with particular emphasis on the digital retrofitting of aging equipment [1,2]. For instance, upgrading a three-axis computer numerical control (CNC) machine by adding a two-axis indexing table to form a five-axis machine enables complex three-dimensional machining at approximately 30–50% of the cost of procuring a new five-axis system. The principal advantage of digital retrofitting lies in its ability to enhance equipment availability and efficiency by integrating smart manufacturing capabilities into existing machinery without requiring substantial capital investment for replacement [3].
Ensuring the successful operation of retrofitted equipment necessitates predictive maintenance technologies capable of real-time condition monitoring and anomaly detection [4,5]. Anomaly detection techniques identify minute signs of equipment degradation early, thereby substantially reducing unnecessary maintenance costs and downtime before failures occur [6,7,8]. However, predictive maintenance in retrofit environments faces two significant technical challenges. First, retrofitted equipment typically lacks sufficient operational data, complicating the training of data-driven anomaly detection models [9,10]. Second, frequent changes in manufactured products induce variations in operational patterns, undermining the generalizability of a single anomaly detection model across different product types [11,12,13].
Existing research on time-series anomaly detection primarily employs Long Short-Term Memory (LSTM) autoencoders to effectively capture long-range dependencies [14,15,16,17]. Nevertheless, in data-scarce environments, LSTM-AE models suffer from overfitting and consequent performance degradation [18,19]. Transfer learning has been applied as a mitigation strategy. However, most implementations remain confined to single-source, single-target scenarios, leaving generalization performance in highly heterogeneous retrofit settings uncertain [20]. Although meta-learning techniques, which enable rapid adaptation in low-data regimes, have recently attracted attention, prior studies mainly validate these methods based on single-machine datasets, limiting demonstration of generalization across diverse equipment environments [21,22]. Moreover, few investigations have empirically assessed meta-learning approaches in manufacturing contexts reflecting machine heterogeneity or evaluated their ability to generalize to previously unseen equipment.
To address these gaps, this study proposes a meta-learning-based LSTM-AE anomaly detection model leveraging tasks defined across multiple machines. Meta-learning trains initial model parameters that enable rapid adaptation to new tasks with minimal data and demonstrates superior adaptability in anomaly diagnosis of rotating machinery compared to conventional models [23]. The Model-Agnostic Meta-Learning (MAML) framework is adopted due to its minimal parameter overhead and fast fine-tuning capability, making it suitable for field deployment.
The contributions of the proposed method are fourfold. First, meta-learning tasks are constructed using a target retrofit machine and five functionally identical five-axis CNC machines, incorporating a range of products and operating conditions. Second, the meta-trained model swiftly adapts to a new retrofit machine with minimal data in a few-shot setting. Third, comprehensive evaluation compares the proposed model’s performance with that of standard anomaly detection techniques, transfer learning, and conventional single-dataset meta-learning methods. Finally, the proposed model maintains high performance even when manufactured products change, indicating suitability for the target machine.
To validate the proposed method, two experiments are conducted. The first experiment assesses adaptation and generalization under data-limited conditions, achieving an accuracy of 98.02% and an F1-score of 94.74% using only 1,095 sequences collected over approximately one month—improvements of 4.2 percentage points in accuracy and 16.9 percentage points in F1-score over transfer learning. The second experiment evaluates generalization to variations in machine and product conditions, obtaining an accuracy of 98.13% and an F1-score of 95.93%, improving accuracy by 18.1% over a single-machine MAML model and lowering the reconstruction error threshold by 73% for more precise anomaly detection. These results are expected to significantly enhance equipment availability and production efficiency in actual retrofit manufacturing environments, minimizing economic losses.
The remainder of this paper is organized as follows. Section 2 reviews related work. Section 3 provides background knowledge. Section 4 details the proposed multi-machine-task MAML LSTM-AE approach. Section 5 describes the experimental design and results. Finally, Section 6 concludes and discusses limitations and future research directions.

2. Related Work

Digital retrofit systems—which modernize sensors and control architectures to enhance equipment productivity—are garnering attention; however, immediately after retrofit implementation, limited operational history results in extremely scarce data. According to Li et al., the majority of existing prognostics and health management (PHM) studies have focused on approaches that rely on large volumes of data, and research addressing the realistic scenario of data scarcity remains comparatively rare [24]. Similarly, Sanchez-Londoño et al. have identified data paucity and accessibility challenges as major technical barriers in predictive maintenance research for retrofitted equipment [25].
Table 1 summarizes the methodologies, target systems, and key limitations of existing studies on anomaly detection for retrofitted equipment. Selvaraj et al. (2023) implemented a real-time anomaly detection system using power consumption data from retrofitted CNC machines [26,27,28]. Cekik and Turan (2025) likewise proposed an LSTM-based anomaly detector leveraging vibration data, illustrating that most approaches heavily depend on both data volume and quality [29].
However, because these methods require sufficient data collection after retrofitting to ensure model performance, they leave a practical gap in fault diagnosis during the initial period of limited operational history. To address this, recent research has explored models that combine transfer learning, data augmentation, and unsupervised learning using only small datasets. For example, Li et al. (2023) applied transfer learning of an LSTM Autoencoder trained on a three-axis CNC machine to other CNC systems, maintaining performance under low-data conditions [14]. Demetgul et al. (2023) further integrated Wasserstein deep convolutional GAN (W-DCGAN) for data augmentation within a transfer learning framework to mitigate real-world data scarcity [30]. Nevertheless, these studies either focus on single-source–single-target transfer learning [31] or rely on data collected from factory acceptance test platforms for augmentation, which introduces challenges in generalizing to new targets and in acquiring actual equipment data [32].
Recently, research has increasingly turned to meta-learning as a means of enhancing model performance with only small amounts of data, without being constrained by source diversity or target generalization issues. In these meta-learning studies, training data are typically divided into episodes within a single dataset to facilitate multi-task learning. For example, Chang et al. (2023) employed a MAML-based meta-learning framework to improve model generalization for equipment anomalies in semiconductor wafer fabrication processes [31]. Similarly, Yao et al. (2025) applied meta-learning to industrial datasets with scarce annotations, enabling label-free learning [32]. However, because these approaches must partition tasks from the same dataset, they still face challenges in fully overcoming situations where the target dataset remains insufficiently large.
In summary, prior research has sought to overcome data scarcity by employing single-source–single-target transfer learning or by partitioning tasks within a single dataset to achieve both model generalization and data efficiency through meta-learning. However, these approaches have proven inadequate for securing reliable performance in real manufacturing environments immediately following retrofit, where only minimal data are available. To address this gap, our study treats five analogous five-axis CNC machines as independent meta-tasks to train an LSTM-Autoencoder-based meta-learning model, which is then applied via low-data fine-tuning to a severely data-constrained retrofitted five-axis CNC machine. This strategy enables instant adaptation to changes in the products processed by the retrofitted machine, and, by leveraging the abundant data from existing five-axis systems during meta-training, achieves significant performance gains even with minimal target data.

3. Background

3.1. Long Short-Term Memory Autoencoder (LSTM-AE)

The Long Short-Term Memory Autoencoder (LSTM-AE) is an unsupervised learning model that integrates the Long Short-Term Memory (LSTM) architecture—a specialized form of recurrent neural network (RNN)—with the autoencoder framework to capture both complex temporal dependencies and intrinsic patterns in time-series data. This architecture is particularly effective for handling multivariate sensor streams where capturing both short- and long-term temporal dependencies is critical for anomaly detection.
As illustrated in Figure 1, the LSTM-AE consists of two main components: the encoder and the decoder. The encoder receives input time-series data $X = \{x_1, x_2, \ldots, x_T\}$, typically represented as a three-dimensional tensor (batch size × sequence length × feature dimension). The stacked LSTM layers sequentially process each time step, capturing nonlinear temporal relationships and yielding a compact latent representation $Z$ after processing the entire sequence. This latent vector serves as a condensed summary of the input, effectively filtering out noise and preserving essential patterns in a lower-dimensional space.
The decoder reconstructs the original sequence $\hat{X} = \{\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_T\}$ using the latent representation $Z$ as the initial state. This reconstruction process also employs stacked LSTM layers followed by dense layers that generate the output at each time step. Model training is conducted by minimizing the mean squared error (MSE) between the input $X$ and its reconstruction $\hat{X}$, encouraging the encoder to learn features that are most critical for accurate reconstruction.
For anomaly detection, the model is trained exclusively on normal operational data. During inference, sequences that produce reconstruction errors exceeding a predefined threshold are flagged as anomalies, indicating potential deviations from the learned normal behavior. This approach has been successfully applied in industrial scenarios such as equipment monitoring, predictive maintenance, and quality control, where detecting subtle temporal anomalies is essential for ensuring stable operation.
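To make the architecture concrete, the following is a minimal PyTorch sketch of an LSTM-AE of the kind described above. The layer sizes mirror the hyperparameters later reported in Table 5, but the class and variable names are our own illustrative choices rather than the authors' implementation.

```python
# A minimal LSTM-AE sketch (illustrative, not the authors' exact code).
import torch
import torch.nn as nn

class LSTMAutoencoder(nn.Module):
    def __init__(self, n_features: int, hidden_dim: int = 128, latent_dim: int = 64):
        super().__init__()
        # Encoder: stacked LSTM compresses (batch, seq_len, n_features) into z
        self.encoder = nn.LSTM(n_features, hidden_dim, num_layers=2, batch_first=True)
        self.to_latent = nn.Linear(hidden_dim, latent_dim)
        # Decoder: reconstructs the sequence from the repeated latent vector
        self.decoder = nn.LSTM(latent_dim, hidden_dim, num_layers=2, batch_first=True)
        self.output_layer = nn.Linear(hidden_dim, n_features)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _, (h_n, _) = self.encoder(x)                   # h_n: (layers, batch, hidden)
        z = self.to_latent(h_n[-1])                     # latent summary of the sequence
        z_seq = z.unsqueeze(1).repeat(1, x.size(1), 1)  # repeat z at each time step
        dec_out, _ = self.decoder(z_seq)
        return self.output_layer(dec_out)               # x_hat, same shape as x

# Training uses MSE between x and x_hat on normal data only; at inference,
# sequences whose reconstruction error exceeds a threshold are flagged.
model = LSTMAutoencoder(n_features=5)                   # five axis-load channels
criterion = nn.MSELoss()
```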

3.2. Model-Agnostic Meta-Learning (MAML)

Model-Agnostic Meta-Learning (MAML) is a meta-learning algorithm that, drawing on experience gained from multiple related tasks, identifies a set of initial model parameters which can be rapidly fine-tuned to new tasks. Figure 2 provides a concise schematic illustration of the MAML procedure.
The core idea of MAML is to learn feature representations that are commonly useful across multiple tasks, thereby identifying an initialization of model parameters capable of effective adaptation to a new task with minimal data. Unlike conventional supervised learning, which trains a separate model for each task, MAML finds a parameter initialization that generalizes across diverse tasks and achieves high performance on a new task after only a few gradient steps.
Meta-learning, understood as “learning to learn,” accumulates experience over a set of related tasks $T_i$ to acquire rapid adaptation capabilities for unseen tasks. Each meta-task $T_i$ comprises a support set $S_i$ and a query set $Q_i$. The MAML training procedure employs a bi-level optimization: the inner loop performs fast adaptation on individual tasks, while the outer loop updates the shared meta-parameters across all tasks.
Specifically, the model parameters $\theta$ of the LSTM-AE are randomly initialized. During the inner loop for each task $T_i$, the parameters $\theta$ are temporarily updated on the support set $S_i$, allowing rapid adaptation to the normal operational time-series patterns specific to the corresponding equipment. The adaptation step is defined as shown in Equation (1):

$$\theta_i' = \theta - \alpha \nabla_\theta \mathcal{L}_{T_i}(f_\theta, S_i) \tag{1}$$

where $\alpha$ is the inner-loop learning rate, $\theta_i'$ denotes the parameters adapted to task $T_i$, and $\mathcal{L}_{T_i}$ is the loss evaluated on the support set $S_i$.
In the subsequent outer loop, the loss is computed on each query set $Q_i$ using the adapted parameters $\theta_i'$, and the shared meta-parameters $\theta$ are updated. Equation (2) defines this meta-update as

$$\theta \leftarrow \theta - \beta \nabla_\theta \sum_{i=1}^{N} \mathcal{L}_{T_i}(f_{\theta_i'}, Q_i) \tag{2}$$

where $\beta$ is the outer-loop learning rate and $N$ is the number of meta-tasks.
By iterating these inner and outer loops across all tasks in the meta-training dataset, the model learns an initialization θ with strong generalization performance across diverse equipment and operating conditions. During the meta-testing phase, this trained model delivers effective anomaly detection on a new target task using only a small amount of normal data.
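As a sketch of this bi-level procedure, the snippet below implements a first-order approximation of Equations (1) and (2) in PyTorch; the second-order term of full MAML is omitted for brevity. Here `model` can be any reconstruction network, e.g., the LSTM-AE sketched in Section 3.1, and the `tasks` iterable of (support, query) tensor pairs is a hypothetical placeholder.

```python
# First-order MAML sketch for a reconstruction model (Equations (1)-(2)).
# Assumption: each task supplies (support, query) tensors of shape
# (batch, seq_len, n_features), and model(x) returns a reconstruction of x.
import copy
import torch
import torch.nn.functional as F

def maml_step(model, tasks, alpha=1e-3, beta=1e-4, inner_steps=1):
    meta_grads = [torch.zeros_like(p) for p in model.parameters()]
    for support, query in tasks:
        learner = copy.deepcopy(model)                 # start from shared theta
        inner_opt = torch.optim.SGD(learner.parameters(), lr=alpha)
        for _ in range(inner_steps):                   # inner loop: Eq. (1)
            inner_opt.zero_grad()
            F.mse_loss(learner(support), support).backward()
            inner_opt.step()
        query_loss = F.mse_loss(learner(query), query) # evaluate adapted theta_i'
        for acc, g in zip(meta_grads,
                          torch.autograd.grad(query_loss, learner.parameters())):
            acc += g                                   # accumulate task gradients
    with torch.no_grad():                              # outer loop: Eq. (2)
        for p, g in zip(model.parameters(), meta_grads):
            p -= beta * g / len(tasks)
```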

4. Multi-Machine-Based MAML LSTM-AE

The proposed meta-learning-based anomaly detection framework leverages time-series sensor data collected from multiple five-axis CNC machines that are functionally equivalent to the retrofitted equipment, enabling effective fault detection using only a small amount of data from the target machine. Figure 3 illustrates the overall architecture of the model.
While the Model-Agnostic Meta-Learning (MAML) framework itself has been extensively explored in prior anomaly detection research, this study differentiates itself by introducing a multi-machine-based meta-task formulation. Unlike conventional approaches that partition tasks within a single dataset, the proposed method defines each physical machine—operating under distinct product types and machining conditions—as an independent meta-task. This explicit incorporation of inter-machine heterogeneity into meta-training allows the model to learn more generalized and transferable representations, thereby enhancing its adaptability to newly retrofitted machines even under limited data availability.
The framework comprises three major stages. In the first stage, time-series sensor data are collected from the target retrofitted machine and five analogous five-axis CNC systems. These multi-machine datasets then undergo systematic preprocessing, and each machine’s cleaned data is formulated as a separate meta-task. In the second stage, those meta-tasks are used to train a generalized meta-learning LSTM-Autoencoder model. Finally, in the third stage, the meta-trained model is fine-tuned on the limited data from the target machine and applied to detect anomalies during actual operation.

4.1. Multi-Machine Data Collection and Preprocessing

Data are collected from the target retrofitted machine and five analogous five-axis CNC systems to train the meta-learning-based LSTM-AE anomaly detection model. Because each machine processes different products under varying conditions, both the data characteristics and time-series lengths differ. Consequently, a systematic data collection and preprocessing pipeline that accommodates these variations is employed. Figure 4 depicts the overall workflow of these stages.
All time-series data generated by the five-axis CNC machines (X, Y, Z, A, and B axes) are initially ingested into a central database. From this repository, a first-pass extraction of records containing equipment metadata, operational status, process details, sensor measurements, and anomaly annotations is conducted. The extracted fields then serve as inputs for the preprocessing pipeline, where they are transformed into the formats required for model training.
Table 2 lists the primary data items extracted prior to preprocessing. Equipment metadata include unique identifier codes for each machine represented in the meta-tasks. Operational status captures the current mode of each machine, such as idle, running, or emergency stop. Process details provide information on production quantities and per-cycle work parameters. Sensor measurements consist of real-time axis load values, used to monitor machine health continuously. Anomaly annotations are recorded as alarm codes (ALARM_CODE), which the control system logs automatically whenever an abnormal condition or event is detected; these codes play a crucial role in diagnosing machine behavior and detecting early signs of failure.
In the raw-data filtering stage, three criteria are applied. First, only records with MACHINE_MODE set to Memory (Auto) are retained—this mode corresponds to CNC programs executing automatically from internal memory without operator intervention. Likewise, the machine is considered active only when OPERATE_MODE equals OPERATE, ALARM, or EMERGENCY; IDLE, STOP, and other non-operational states are excluded to remove irrelevant logs. Second, any production cycle in which the five axis load values remain constant for the entire duration is discarded, as such constancy indicates a paused state rather than actual machining. Third, cycles during which ALARM_CODE occurs continuously are excluded, since these typically reflect non-fault events (e.g., operator overrides or spurious alerts) rather than genuine anomalies. Applying these criteria removes logs generated by maintenance activities or operator actions, leaving only meaningful time-series segments.
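The following pandas sketch illustrates these three criteria. The column names follow Table 2, but the specific string codes and the use of the PROSSED_QTY counter to delimit production cycles are assumptions about the underlying schema.

```python
# A hedged sketch of the three raw-data filtering criteria (Section 4.1).
import pandas as pd

def filter_raw_logs(df: pd.DataFrame) -> pd.DataFrame:
    load_cols = ["LOAD_1", "LOAD_2", "LOAD_3", "LOAD_4", "LOAD_5"]
    # Criterion 1: automatic execution mode and active operating states only
    df = df[df["MACHINE_MODE"] == "Memory (Auto)"]
    df = df[df["OPERATE_MODE"].isin(["OPERATE", "ALARM", "EMERGENCY"])]
    kept = []
    for _, cycle in df.groupby("PROSSED_QTY"):       # one production cycle each
        # Criterion 2: drop cycles where every axis load is constant (paused state)
        if (cycle[load_cols].nunique() == 1).all():
            continue
        # Criterion 3: drop cycles where ALARM_CODE is present at every time step
        if cycle["ALARM_CODE"].notna().all():
            continue
        kept.append(cycle)
    return pd.concat(kept, ignore_index=True)
```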
Figure 5 illustrates the identification of normal and abnormal data for model training and evaluation. Because true anomalies are rare during normal operation, obtaining abnormal samples poses a significant challenge in equipment-based anomaly detection. In retrofit environments, however, the controlled induction of ALARM_CODE—via manipulated inputs or simulated fault scenarios—provides the limited abnormal data needed for validation.
A portion of the anomalous data is generated by replaying alarm-code occurrence patterns observed during actual equipment operation so that they are naturally reflected in the series; ALARM_CODE events that occur organically in service are also included. For each production cycle, the ratio $p \in [0, 1]$ of time steps at which an ALARM_CODE is present is computed, and labeling is performed based on this value. Equation (3) defines the calculation of $p$ as

$$p = \frac{1}{L} \sum_{i=1}^{L} \mathbb{1}\left[\mathrm{ALARM\_CODE}_i \neq \mathrm{NULL}\right] \tag{3}$$

where $L$ denotes the total number of time steps in the given production cycle and $\mathbb{1}[\cdot]$ is the indicator function that returns 1 if the condition holds and 0 otherwise. Based on the computed value $p$, each production cycle is assigned either a “Normal” or “Anomaly” label. Table 3 summarizes the definitions of these labels and how they are applied.
These labeling criteria ensure that only ALARM_CODE patterns corresponding to genuine fault conditions are treated as anomalies, while ambiguous alarm intervals arising from operator interventions or controlled test scenarios are excluded from both training and evaluation. This approach enhances the reliability of data labels and prevents distortions in model performance.
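As a minimal illustration, the function below computes $p$ for one production cycle and applies the 50% threshold stated in Section 4.2; any finer-grained boundaries in Table 3, such as those excluding ambiguous intervals, are not reproduced here.

```python
# Alarm-ratio labeling per Equation (3); `cycle` is one production cycle as a
# pandas DataFrame whose ALARM_CODE column is NaN when no alarm is present.
import pandas as pd

def label_cycle(cycle: pd.DataFrame) -> str:
    L = len(cycle)                                # total time steps in the cycle
    p = cycle["ALARM_CODE"].notna().sum() / L     # fraction of steps with an alarm
    return "Anomaly" if p > 0.5 else "Normal"     # 50% rule from Section 4.2
```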
Time-series length normalization is then applied to the preprocessed raw data and the labeled normal sequences. Because each five-axis CNC machine varies in product type and process complexity, the number of time steps per production cycle ranges from tens to hundreds and is not fixed. LSTM-AE models, however, require input sequences of uniform length, making normalization necessary. To this end, data are first segmented by PROSSED_QTY—with each segment representing one production cycle—and the distribution of sequence lengths across all cycles is analyzed. A strong concentration of lengths within a specific interval indicates that deviations outside this range likely result from missing data due to external interruptions or sensor errors. Accordingly, the modal sequence length is identified, and the interquartile range (IQR) around this mode is used to select only those cycles whose lengths fall within the central quartile boundary. Sequences outside this range are treated as outliers and excluded from the training dataset.
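A sketch of this cycle-length screening is given below. Identifying the mode as the most frequent length follows directly from the text, while the exact IQR fence construction is our assumption; the reference length $L_{ref}$ is formalized in the next paragraph.

```python
# Mode + IQR screening of production-cycle lengths (fence construction assumed).
import numpy as np

def select_by_length(cycles):
    lengths = np.array([len(c) for c in cycles])
    vals, counts = np.unique(lengths, return_counts=True)
    L_ref = int(vals[counts.argmax()])                    # modal cycle length
    q1, q3 = np.percentile(lengths, [25, 75])
    lo, hi = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)   # IQR fence (assumed)
    kept = [c for c in cycles if lo <= len(c) <= hi]      # drop outlier-length cycles
    return L_ref, kept
```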
Here, the time-series length $L$ is defined as the number of time steps in each production cycle, and the reference length $L_{ref}$ is defined as the modal length among the normal sequences. However, even after applying this criterion, differences in sequence lengths persist. Therefore, to conform to the LSTM-AE input structure, an additional normalization process is applied. If a sequence is longer than the reference ($L > L_{ref}$), uniform resampling is performed across the entire sequence to extract time steps at equal intervals and reduce its length to $L_{ref}$. Conversely, if a sequence is shorter than the reference ($L < L_{ref}$), simple moving average (SMA)-based interpolation is applied to each load axis (LOAD_1 through LOAD_5) to fill in the missing time steps. The interpolation is defined by Equation (4):

$$\hat{x}_t = \frac{1}{2\omega + 1} \sum_{i=-\omega}^{\omega} x_{t+i} \tag{4}$$

where $\hat{x}_t$ denotes the interpolated value at time step $t$, $\omega$ is the half-width of the moving-average window, and $x_{t+i}$ represents the actual observations at neighboring time steps in the original series.
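A numpy sketch of both branches follows: uniform index resampling when $L > L_{ref}$ and an SMA window average (Equation (4)) when $L < L_{ref}$. How fractional target positions and sequence edges are handled is our assumption.

```python
# Length normalization to L_ref: uniform resampling (long) or SMA fill (short).
import numpy as np

def normalize_length(seq: np.ndarray, L_ref: int, omega: int = 2) -> np.ndarray:
    L = seq.shape[0]                                  # seq: (L, n_load_axes)
    if L > L_ref:                                     # downsample at equal intervals
        idx = np.linspace(0, L - 1, L_ref).round().astype(int)
        return seq[idx]
    if L < L_ref:                                     # SMA interpolation, Eq. (4)
        out = np.empty((L_ref, seq.shape[1]))
        for t, pos in enumerate(np.linspace(0, L - 1, L_ref)):
            lo = max(0, int(pos) - omega)             # clip window at the edges
            hi = min(L, int(pos) + omega + 1)
            out[t] = seq[lo:hi].mean(axis=0)          # window average around pos
        return out
    return seq
```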
This preprocessing pipeline transforms irregular time-series data collected under diverse production conditions into a uniform format suitable for model training. Noise, outliers, and uneven sequence lengths are corrected without distorting underlying temporal patterns, thereby preserving continuity and enabling direct comparability across sequences. Maintaining a consistent input structure establishes a stable foundation for meta-learning and enhances cross-domain generalization performance during training.

4.2. Multi-Machine-Based Task Construction

The core contribution of this study lies in the construction of meta-tasks from multi-machine data. A meta-task is defined as a learning unit in which individual tasks are formulated from the normal time-series data collected for each manufacturing machine, enabling rapid adaptation to new environments. Most existing meta-learning-based anomaly detection studies generate meta-tasks by arbitrarily partitioning normal data from a single machine or within a homogeneous domain. Such an approach yields only minor inter-task distributional differences, rendering cross-domain generalization fragile and failing to capture machine heterogeneity, such as variations in product type, sensor configuration, and production cycle, thereby limiting applicability to new equipment or products.
To address these limitations, each meta-task is defined independently using time-series data from multiple machines that closely resemble the target retrofitted system. Training across these diverse equipment environments enables the model to learn generalized representations of domain-specific characteristics, so that anomalous patterns can be detected accurately in a new machine environment using only a small amount of data.
The selected five-axis CNC machines exhibit not only unit-to-unit variability but also significant changes in machining conditions, cycle lengths, and load signal patterns as workpieces change over time. Such variability may even occur within a single machine and poses a major challenge for anomaly detection methods based on a single model. In contrast, a meta-learning-based approach that incorporates diverse domains effectively captures the characteristics arising from heterogeneous equipment environments, thereby improving anomaly detection accuracy in new contexts such as retrofitted machines. This strategy outperforms conventional techniques by enabling rapid learning and adaptation even when data are severely limited.
Figure 6 illustrates the procedure for defining meta-tasks from each machine’s data and partitioning them into support and query sets. In this implementation, meta-tasks are constructed using data collected from six five-axis CNC machines (Machines 1–6), each processing different products and thus representing an independent task that clearly reflects cross-domain heterogeneity.
Five machines (Machines 1–5) constitute the meta-training dataset $D_{meta\text{-}train}$, while the retrofitted machine (Machine 6) serves as the meta-test dataset $D_{meta\text{-}test}$ for final performance evaluation. Each machine defines a separate task $T_i$, within which data are partitioned into a support set $S_i$ and a query set $Q_i$ according to the MAML protocol. The support set $S_i$ comprises a subset of normal time-series data collected from the corresponding machine (e.g., $D_1^{train}$ for Machine 1 in Figure 6) and is used during meta-training to perform an inner-loop update of the LSTM-AE parameters, effectively acting as a fine-tuning step.
The query set $Q_i$ contains the remaining sequences—both normal and anomalous—collected from the same machine (denoted $D_1^{test}$ for Machine 1 in Figure 6). Anomalous sequences are defined by the labeling criterion as those production cycles in which the proportion of time steps with an ALARM_CODE exceeds 50%. $Q_i$ thus serves to evaluate reconstruction performance after the inner-loop update and to drive the outer-loop meta-parameter update. In the meta-test phase for the retrofitted machine (Machine 6), $S_i$ consists of a small set of normal sequences $D_{new}^{train}$, while $Q_i$ comprises the unlabeled mixed normal and anomalous sequences $D_{new}^{test}$.
This support–query-based task construction strategy enables rapid adaptation in data-scarce scenarios—such as newly installed equipment—or when production conditions change. By applying identical preprocessing and labeling criteria across all tasks, structural consistency of the meta-learning process is maintained and experiment reproducibility is ensured. Consequently, the proposed methodology can be generalized across diverse manufacturing environments and achieves effective anomaly detection on new equipment using only minimal data.

4.2.1. Multi-Machine Task Sequence Normalization

In the multi-machine task-based MAML LSTM-AE architecture proposed in this study, each CNC machine exhibits different sequence lengths. When producing a single workpiece, the cycle length per machine varies substantially, ranging from 280 to 380 time steps. Accordingly, the representative cycle length for each machine is defined by its modal value, as established in Section 4.1: $n_1 = 280$, $n_2 = 320$, $n_3 = 333$, $n_4 = 350$, $n_5 = 380$, and $n_{target} = 335$. This motivates a normalization strategy for integrating multiple machines with heterogeneous cycle lengths into a unified meta-model training process. In particular, for anomaly detection using a MAML-based LSTM-AE, variability in sequence length and misalignment of event timing significantly impair both adaptation speed and detection performance.
Sequence length normalization is essential when applying a MAML-based LSTM-AE to multi-machine anomaly detection for several reasons. First, from a network architecture perspective, the LSTM encoder–decoder is designed to receive batches of uniformly sized sequences; if sequence lengths differ, tensor-dimension mismatches render training and inference infeasible. Second, in the meta-learning context, both the inner and outer loops of MAML rely on computing and aggregating gradients across tasks using the same parameter set $\theta$; differing input dimensions across tasks prevent valid gradient comparison and accumulation during meta-updates. Third, because the autoencoder's reconstruction loss ($\|X - \hat{X}\|$) is computed at each time step, sequence lengths must be identical across tasks to ensure that the loss magnitude is evaluated on a consistent scale—otherwise, meta-loss comparisons and updates become distorted. Finally, if cycle-length differences themselves constitute a source of variability, the meta-model's adaptation speed slows; by unifying all sequences to the reference length, the model can focus solely on genuine pattern differences (e.g., variations in load profiles), enabling rapid and stable adaptation to new machines with minimal data.
To address these challenges, sequence length normalization is performed through Dynamic Time Warping (DTW)-based alignment and resampling. DTW is a classical algorithm that computes the optimal alignment between two time series by minimizing the cumulative distance along a warping path, thereby compensating for temporal variations in speed or phase [33]. Formally, for two sequences $X = (x_1, x_2, \ldots, x_m)$ and $Y = (y_1, y_2, \ldots, y_n)$, DTW computes Equation (5):

$$\mathrm{DTW}(X, Y) = \min_{\pi \in \mathcal{A}} \sum_{(i, j) \in \pi} d(x_i, y_j) \tag{5}$$

where $\pi$ denotes the alignment path within the admissible set $\mathcal{A}$ and $d(x_i, y_j)$ is typically the Euclidean distance between aligned points. However, conventional DTW is non-differentiable, limiting its integration with neural networks. Soft-DTW introduces a differentiable relaxation by replacing the min operator with a soft-min formulation [34], as given in Equation (6):

$$\mathrm{DTW}_\gamma(X, Y) = -\gamma \log \sum_{\pi \in \mathcal{A}} \exp\left(-\frac{C(\pi)}{\gamma}\right) \tag{6}$$

where $\gamma$ is the smoothing parameter and $C(\pi)$ is the cumulative cost along path $\pi$.
In our approach, we first compute a global reference sequence—called the barycenter—from the $K$ machine time series using DTW Barycenter Averaging (DBA), according to Equation (7):

$$b^* = \operatorname*{arg\,min}_{b} \sum_{k=1}^{6} \mathrm{DTW}^2\left(X^{(k)}, b\right) \tag{7}$$

This barycenter $b^*$ captures the common temporal structure across all series. Next, we align each series $X^{(k)}$ to $b^*$ by computing the optimal warping path $\pi^*$ via Soft-DTW. Finally, we resample the warped series to a fixed length $L = 333$ using linear interpolation (Equation (8)):

$$\tilde{X}^{(k)} = \mathrm{Resample}\left(\mathrm{Warp}\left(X^{(k)}, \pi^*\right), 333\right) \tag{8}$$
Through this normalization process, all machines’ time series attain a uniform length of 333 while preserving each machine’s intrinsic temporal patterns. This ensures that the MAML LSTM-AE model can learn common anomaly patterns across machines while still reflecting machine-specific characteristics. Moreover, the normalized sequences enable efficient batch processing and consistent gradient computation during meta-parameter updates. As a result, this sequence-length normalization serves as a critical preprocessing step securing both the performance and stability of the meta-learning-based anomaly detection model in a multi-machine environment.
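A hedged sketch of this pipeline using the tslearn library is given below. tslearn's `dtw_barycenter_averaging` and `dtw_path` are real APIs, but for brevity the sketch uses the classical DTW path in place of Soft-DTW, and the warp-then-average step is a simplified stand-in for the paper's warp-and-resample procedure.

```python
# DBA barycenter + DTW alignment to a fixed length (Eqs. (7)-(8), simplified).
import numpy as np
from tslearn.barycenters import dtw_barycenter_averaging
from tslearn.metrics import dtw_path

def align_machines(series_list, L_target=333):
    # Equation (7): barycenter of all machine sequences under DTW (length L_target)
    barycenter = dtw_barycenter_averaging(series_list, barycenter_size=L_target)
    aligned = []
    for X in series_list:
        path, _ = dtw_path(X, barycenter)         # optimal warping path pi*
        warped = np.zeros_like(barycenter)        # average the X points mapped to
        counts = np.zeros(len(barycenter))        # each barycenter index (Warp step)
        for i, j in path:
            warped[j] += X[i]
            counts[j] += 1
        aligned.append(warped / counts[:, None])  # Equation (8): fixed length L_target
    return barycenter, aligned
```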

4.2.2. Input Data Normalization

The selected load signals share the same physical units but exhibit differing distributions in absolute magnitude and variance due to variations in machining conditions across machines and products. Such heterogeneity in input distributions can unduly bias certain channels during training or adversely affect overall parameter convergence. In particular, within the meta-learning framework applied in this study, unifying time-series characteristics from different domains (machines/products) into a common input structure is essential; hence, the normalization of each input variable is required.
Z-score normalization is applied to the time series of each load axis, transforming each series to have a mean of zero and a standard deviation of one. The normalized load value $x_t^z$ at time step $t$ is defined as in Equation (9):

$$x_t^z = \frac{x_t - \mu}{\sigma} \tag{9}$$

where $\mu$ and $\sigma$ denote the mean and standard deviation of the corresponding load-axis series.
This normalization procedure removes scale discrepancies among input channels and ensures that, during network training, all features contribute equally. Moreover, by consistently aligning input distributions across machines and tasks, it enhances the generalization performance of the meta-learning-based anomaly detection model.
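For completeness, a per-axis implementation of Equation (9) follows; the small epsilon guarding constant channels is our own addition.

```python
# Per-channel z-score normalization (Equation (9)).
import numpy as np

def zscore(seq: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    mu = seq.mean(axis=0)                # per-axis mean
    sigma = seq.std(axis=0)              # per-axis standard deviation
    return (seq - mu) / (sigma + eps)    # eps avoids division by zero (our addition)
```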

4.2.3. Metadata Statistics

Table 4 summarizes the basic statistics (sample count, mean, and standard deviation) of each axis load signal (LOAD_1–LOAD_5) for the six five-axis CNC machines, computed after preprocessing (outlier removal, time-series normalization, etc.). Machines M001–M005 were monitored continuously for approximately five months, yielding the large sample volumes shown in Table 4. In contrast, the retrofitted machine M006 was monitored for only one month, producing significantly fewer samples (1,095 sequences). For all machines, usable samples are inherently limited by production schedules—machines operate only during active product runs—and further reduced by preprocessing that removes noisy or incomplete records. M006's dataset is additionally constrained by its shorter monitoring period.
The mean values of all five axis load signals differ notably between machines, particularly evident in LOAD_3, ranging from 0.5090 to 0.6190. Additionally, the standard deviations across all load signals span a broad range (0.0677–0.2278), clearly highlighting the diverse operational characteristics such as sensor configurations, machining speeds, and types of processed workpieces. Specifically, machines M002 and M006 exhibit larger standard deviations in LOAD signals, indicating more variable operational patterns, potentially due to frequent product changes or varying machining conditions. Such pronounced heterogeneity underscores the complexity involved in generalizing a single anomaly detection model across all machines and supports our approach of modeling each machine as a distinct meta-task within the proposed MAML-based anomaly detection framework. In particular, the limited data availability from the retrofitted machine M006 further motivates the meta-learning methodology, aiming to leverage information from well-instrumented machines (M001–M005) to facilitate efficient adaptation to the target retrofit scenario.

4.3. MAML-Based LSTM-AE Anomaly Detection Model

In MAML-based anomaly detection, $N$ meta-tasks are given—that is, a set of anomaly detection problems $\{T_1, T_2, \ldots, T_N\}$. Each task is constructed as described in Section 4.2. The meta-learning process for anomaly detection is divided into two phases: meta-training and meta-testing. In the meta-training phase, the model's initial parameters are learned from the normal time-series data collected across the diverse machines (tasks). This phase aims to acquire generalized feature representations that enable rapid adaptation to new equipment environments. In the meta-testing phase, the learned meta-parameters are employed to perform fast adaptation and anomaly detection on a new retrofitted machine by fine-tuning with only a small amount of target-machine data.

4.3.1. Meta-Training

Figure 7 illustrates the inner-loop update during the meta-training phase. Each task $T_i$ is constructed from the normal sensor data collected across $N$ distinct machines. For every $T_i$, the task-specific parameter $\theta_i'$ is obtained by updating the shared initialization $\theta$ using the support set $S_i = D_i^{train}$ according to Equation (10):

$$\theta_i' \leftarrow \theta - \alpha \nabla_\theta \mathcal{L}_{T_i}\left(f_\theta, D_i^{train}\right) \tag{10}$$

Here, $\theta$ denotes the common meta-initialization parameters, $\theta_i'$ represents the parameters adapted specifically for task $T_i$, $\alpha$ is the inner-loop learning rate, and $\mathcal{L}$ is the reconstruction loss of the LSTM-AE. The function $f_\theta$ denotes the LSTM-Autoencoder model parameterized by $\theta$, which maps the input time-series sensor data to its latent representation and reconstructs it to minimize reconstruction errors.
This update is executed in parallel for all tasks, enabling each θ i to capture the normal operational pattern of its respective machine, thereby laying the groundwork for effective anomaly detection in the subsequent query set.
Figure 8 depicts the outer-loop meta-optimization during the meta-training phase. After obtaining each task-specific parameter $\theta_i'$ via the inner loop, it is evaluated on the query set $Q_i = D_i^{test}$, which contains both normal and anomalous sequences.
As shown in Equation (11), the common parameter $\theta$ is then updated for each task $T_i$:

$$\theta \leftarrow \theta - \beta \nabla_\theta \mathcal{L}_{T_i}\left(f_{\theta_i'}, D_i^{test}\right) \tag{11}$$

where $\beta$ is the outer-loop learning rate and $\mathcal{L}_{T_i}\left(f_{\theta_i'}, D_i^{test}\right)$ denotes the test loss on the query set. The function $f_{\theta_i'}$ refers to the task-adapted LSTM-Autoencoder model, parameterized by $\theta_i'$, which is fine-tuned on the support set $D_i^{train}$ for task $T_i$. This model reconstructs the input time series and generates reconstruction errors used for anomaly detection.
To ensure that the meta-loss captures both reconstruction fidelity and anomaly detection performance, the following definition is used in Equation (12):

$$\mathcal{L}_{meta}\left(f_{\theta_i'}, D_i^{test}\right) = \lambda_1 \mathcal{L}_{recon}\left(f_{\theta_i'}, D_i^{test}\right) + \lambda_2 \mathcal{L}_{anomaly}\left(f_{\theta_i'}, D_i^{test}\right) \tag{12}$$

where $\mathcal{L}_{recon}$ is the reconstruction loss, $\mathcal{L}_{anomaly}$ is an anomaly detection-based loss term, and $\lambda_1$ and $\lambda_2$ are weighting hyperparameters. The combined meta-loss $\mathcal{L}_{meta}$, defined in Equation (12), is then used as the task loss $\mathcal{L}_{T_i}\left(f_{\theta_i'}, D_i^{test}\right)$ in the outer-loop meta-update step shown in Equation (13).
Repeating this update across all $N$ meta-tasks, each using the meta-loss $\mathcal{L}_{meta}$, yields the aggregate meta-update given in Equation (13):

$$\theta \leftarrow \theta - \beta \nabla_\theta \sum_{i=1}^{N} \mathcal{L}_{T_i}\left(f_{\theta_i'}, D_i^{test}\right) \tag{13}$$

The resulting parameter $\theta$ serves as a generalized initialization that encapsulates knowledge from diverse machines, enabling rapid low-data adaptation and accurate anomaly detection on a new retrofitted system during the meta-testing phase.
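Extending the first-order MAML sketch from Section 3.2, the function below illustrates how the combined meta-loss of Equation (12) could be evaluated on a query set. The paper does not specify the exact form of $\mathcal{L}_{anomaly}$; the margin-based term here is our assumption, as are the $\lambda$ values.

```python
# Combined query-set meta-loss (Equation (12)); the anomaly term is assumed.
import torch
import torch.nn.functional as F

def meta_loss(learner, D_test, labels, lam1=1.0, lam2=0.5, margin=1.0):
    recon = learner(D_test)                               # reconstruct query sequences
    err = ((recon - D_test) ** 2).mean(dim=(1, 2))        # per-sequence MSE
    l_recon = err[labels == 0].mean()                     # fidelity on normal data
    l_anomaly = F.relu(margin - err[labels == 1]).mean()  # keep anomaly errors high
    return lam1 * l_recon + lam2 * l_anomaly              # weighted sum, Eq. (12)
```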

4.3.2. Meta-Testing

Figure 9 schematizes the meta-testing procedure on the new target equipment. The generalized parameter $\theta$ learned during the meta-training phase (Figure 7 and Figure 8) is fine-tuned on the unseen task $T_6$ using only a small amount of target-machine data, enabling rapid adaptation and anomaly detection. The meta-test dataset $D_{meta\text{-}test}$ corresponds to the retrofitted machine (Machine 6) and comprises the two subsets defined in Section 4.2: $D_{new}^{train}$, a small labeled normal dataset serving as the support set $S_i$, and $D_{new}^{test}$, the dataset—containing both normal and anomalous sequences—serving as the query set $Q_i$.
Using only $D_{new}^{train}$, the meta-trained initialization parameter $\theta$ is adapted through an inner-loop update specific to the new equipment, as given in Equation (14):

$$\theta_{new}' \leftarrow \theta - \alpha \nabla_\theta \mathcal{L}_{T_{new}}\left(f_\theta, D_{new}^{train}\right) \tag{14}$$

Here, $\theta$ is the meta-trained initialization parameter, $\alpha$ denotes the inner-loop learning rate, and $\mathcal{L}(\cdot)$ is the reconstruction loss of the LSTM-AE. The resulting $\theta_{new}'$ is specialized to the normal operating patterns of the retrofitted machine.
During inference, this fine-tuned model is applied to the unlabeled test set $D_{new}^{test}$ to detect anomalies, following the procedure defined in Equation (15):

$$\hat{y} = f_{\theta_{new}'}\left(x_{test}\right) \tag{15}$$

where $f_{\theta_{new}'}$ denotes the fine-tuned model and $x_{test} \in D_{new}^{test}$ is the test input. The resulting prediction $\hat{y}$ indicates, for each time step or input sample, whether an anomaly is present, thereby enabling the early detection and mitigation of fault conditions in live operational environments.
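A compact sketch of this phase is shown below: a brief inner-loop fine-tune on the small normal set (Equation (14)), followed by thresholded anomaly decisions (Equation (15)). The 30-epoch budget follows the fine-tuning setup of Section 5.3, and the mean-plus-three-standard-deviations threshold follows Section 5.1; both are otherwise configurable assumptions.

```python
# Meta-testing on the retrofitted machine: fine-tune, then flag anomalies.
import torch
import torch.nn.functional as F

def adapt_and_detect(model, D_new_train, D_new_test, alpha=1e-3, epochs=30):
    opt = torch.optim.SGD(model.parameters(), lr=alpha)
    for _ in range(epochs):                              # inner-loop update, Eq. (14)
        opt.zero_grad()
        F.mse_loss(model(D_new_train), D_new_train).backward()
        opt.step()
    with torch.no_grad():
        train_err = ((model(D_new_train) - D_new_train) ** 2).mean(dim=(1, 2))
        tau = train_err.mean() + 3 * train_err.std()     # threshold rule (Section 5.1)
        test_err = ((model(D_new_test) - D_new_test) ** 2).mean(dim=(1, 2))
        return test_err > tau                            # anomaly flags, Eq. (15)
```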
Moreover, the meta-testing procedure serves as a key component of a low-data predictive maintenance system, allowing rapid adaptation and inference on new equipment with only a small amount of labeled normal data, without requiring complete retraining of the entire model. This significantly reduces labeling costs in industrial settings and broadens the model’s applicability across diverse equipment types, enhancing both generalizability and scalability.
In particular, for retrofitted machines or other newly introduced systems where data scarcity is prevalent, the model leverages the generalized initialization $\theta$, learned from multiple existing machines (tasks $T_1$–$T_5$), to achieve fast and effective adaptation. This maximizes the utility of cross-machine data and greatly expands the feasibility of applying predictive maintenance in retrofit installations across various industrial domains, delivering high-performance anomaly detection with minimal normal data.

5. Experiments

This section describes in detail the design and results of two experiments conducted to validate the performance of the proposed multi-machine meta-learning model. The first experiment compared the proposed approach against the conventional LSTM-AE method using a small amount of data from the retrofitted machine. The second experiment evaluated generalization capability by comparing the single-machine meta-learning model with the proposed multi-machine meta-learning model on datasets drawn from other five-axis CNC machines rather than the retrofitted unit.

5.1. Experimental Setup

All experiments were carried out using time-series sensor data collected in an actual manufacturing environment. Data were obtained from one retrofitted CNC machine and five additional machines in regular operation, each processing a different product family. For the retrofitted machine, approximately one month of normal-operation data was collected. The multi-machine dataset comprised five months of long-term data, with each machine’s LOAD_1 through LOAD_5 axis load values monitored as features.
Table 5 lists the hyperparameter settings used in the experiments. The proposed model architecture integrates an LSTM-AE with the MAML meta-learning framework. The encoder consists of two LSTM layers, each with a hidden-state dimension of 128. The dimensionality of the latent space was set to 64 to balance representational capacity and computational efficiency. During training, a learning rate of $1 \times 10^{-4}$, a batch size of 64, and a dropout rate of 0.2 were used, and early stopping was employed to prevent overfitting. The detection threshold was determined statistically as the mean reconstruction error of the normal data plus three times its standard deviation.
Table 6 presents the quantitative metrics used to evaluate the model’s performance. Anomaly detection capability was assessed using widely adopted measures—accuracy, precision, recall, F1-score, and the confusion matrix—while reconstruction error quantified the model’s ability to faithfully reconstruct normal time-series patterns.

5.2. Experiment 1: Performance Comparison with Baseline Models in a Low-Data Setting

In the first experiment, the performance of the proposed multi-machine, multi-task MAML LSTM-AE approach was evaluated against three baseline methods commonly employed for anomaly detection in data-scarce manufacturing environments. The baselines comprised (1) a standard LSTM-AE trained solely on the retrofit machine’s dataset, (2) a transfer-learning scheme that pre-trains on other machine datasets before fine-tuning on the retrofit data, and (3) a single-machine MAML LSTM-AE model, in which meta-learning is performed using only the retrofit dataset. The proposed method, by contrast, leverages meta-learning across five distinct five-axis CNC machines.
All models were trained and evaluated on approximately 1050 normal sequences and 45 anomalous sequences collected over one month from the retrofitted machine, thereby emulating a low-data scenario that reflects real-world constraints: extensive data collection for legacy equipment incurs significant time and cost, and expert labeling is both time-intensive and expensive, making selective annotation a practical necessity. Consequently, only a limited volume of labeled data is available for model development.

Experiment 1 Results

To assess the classification performance of the proposed multi-machine-based MAML LSTM-AE model, a quantitative comparison was conducted against three baseline methods. Specifically, the four models—basic LSTM-AE, transfer learning, single-machine-based MAML LSTM-AE, and the multi-machine-based MAML LSTM-AE—were evaluated using the anomaly detection metrics defined in Table 6 (accuracy, precision, recall, F1-score, and the confusion matrix components TP, TN, FP, and FN). The results of this evaluation are summarized in Table 7.
In the comparative experiments, the basic LSTM-AE and the single-machine-based MAML LSTM-AE exhibited markedly lower accuracy. The basic LSTM-AE achieves only 0.8413 accuracy, indicating that, when trained on 80% of the approximately 1050 sequences, its anomaly detection capability degrades due to insufficient data. Likewise, the single-machine-based MAML LSTM-AE struggles under the same data constraints, as partitioning a single retrofit dataset into meta-tasks does not provide enough diversity or volume for effective meta-learning. By contrast, both the transfer-learning-based model and the proposed multi-machine-based MAML LSTM-AE demonstrate substantially higher accuracy. The transfer learning approach attains 0.9405 accuracy, confirming that pre-training on ample source data followed by fine-tuning on the retrofit dataset can maintain robust performance even in low-data scenarios. The multi-machine-based MAML LSTM-AE achieves the highest accuracy of 0.9802, indicating that leveraging meta-training across five distinct CNC machines yields a generalized initialization that remains highly robust when fine-tuned on the limited retrofit data.
In manufacturing environments, deploying an anomaly detection model requires not only high accuracy but also a careful balance between false positives (misclassifying normal data as anomalous) and false negatives (failing to detect true anomalies), as captured by precision and recall. These metrics are jointly summarized by their harmonic mean, the F1-score. As reported in Table 7, the multi-machine-based MAML LSTM-AE model achieves an F1-score of 0.9474, accurately identifying all anomalous instances and thus recording the highest F1-score among the evaluated methods. This represents a 16.9-percentage-point improvement over the transfer-learning baseline—a practically significant advance. In particular, sustaining perfect recall while substantially boosting precision highlights the core contribution of this approach. Furthermore, attaining an F1-score of 0.9474 with only approximately one month's worth of limited data underscores the model's effectiveness under severe data scarcity. Overall, these results demonstrate that the proposed method maintains an excellent trade-off between precision and recall, delivering superior anomaly detection performance. The combination of flawless recall and relatively high precision meets the stringent reliability requirements typical of industrial deployment.
Figure 10 illustrates the reconstruction error distributions on test data for each model. The two left-hand plots correspond to the basic LSTM-AE and the single-machine-based MAML LSTM-AE: although anomalous sequences exhibit marginally higher errors than normal ones, the two distributions overlap substantially, limiting classification performance.
This observation aligns with the quantitative metrics in Table 7, confirming that a plain LSTM-AE architecture and a single-task MAML model fail to capture subtle distinctions in complex time-series patterns. The upper-right plot shows the transfer learning model’s error distribution, where normal and anomalous data are relatively well separated—an outcome of extensive pretraining on large multi-machine datasets that enables effective learning of intricate temporal features. Finally, the proposed multi-machine-based MAML LSTM-AE model records a decision threshold of 0.0104, the lowest among all compared approaches, indicating the strongest reconstruction capability on normal data. Moreover, it produces a clear separation between normal and anomalous distributions, demonstrating that the multi-task MAML framework effectively internalizes diverse normal patterns while consistently generating high reconstruction errors for anomalies.
A comprehensive analysis of the quantitative metrics in Table 7 and the reconstruction error distributions in Figure 10 demonstrates that the proposed multi-machine-based MAML LSTM-AE model outperforms all comparison methods across every performance criterion. In particular, it achieves perfect anomaly detection (recall = 1.0) while maintaining a comparatively low false-positive rate, thereby satisfying the stringent reliability and practicality requirements of industrial deployment. These results provide compelling evidence of the efficacy of the meta-learning approach, which leverages data from multiple machines. Specifically, relative to a single-machine MAML model, this method improves accuracy by approximately 13% and increases the F1-score by roughly 2.3×, confirming that incorporating multi-machine data effectively mitigates the challenges posed by limited samples.
The superior performance of the proposed multi-machine meta-task framework indicates its suitability for real-world manufacturing settings characterized by data scarcity. Its rapid adaptability to new equipment and products further underscores its practical applicability. Altogether, these findings constitute strong empirical support for the use of multi-machine-based meta-learning as a highly effective solution for anomaly detection in the manufacturing domain.
The training and inference speeds of the proposed LSTM autoencoder approach were quantitatively assessed using an NVIDIA GeForce RTX 3060 GPU (12 GB, CUDA 12.6). Table 8 summarizes the training and inference performance across the model training strategies, adjusted to reflect the actual dataset sizes. The basic LSTM-AE and single-task MAML models were both trained solely on the M006 dataset (364,635 samples, 1,095 sequences), requiring approximately 2 and 3 min per epoch, respectively. Their total training times were approximately 3.3 h and 5 h. The difference in training duration stems from the additional meta-optimization process included in the single-task MAML strategy. Transfer learning leveraged M005 (3,468 sequences) for pretraining and M006 (1,095 sequences) for fine-tuning. The two phases required approximately 6.5 and 2 min per epoch, respectively, amounting to a total training time of about 14.1 h. The multi-task MAML model was meta-trained on a significantly larger dataset consisting of 13,188 sequences collected from M001 to M005. Each meta-training epoch took approximately 20 min, and fine-tuning on M006 required an additional 3 min per epoch, resulting in a total training time of approximately 38.3 h.
Additionally, since all strategies share the same LSTM autoencoder architecture during inference, inference performance (in terms of throughput and latency) remains nearly identical across methods. The primary differences in computational cost arise during the training phase, particularly for the multi-task MAML approach, which requires greater resources due to larger datasets and more complex meta-learning processes. Nevertheless, this approach delivers superior performance and adaptability, especially in situations where data from the target equipment is scarce. In real-world deployments under such data-constrained conditions, multi-task MAML consistently achieved outstanding anomaly detection accuracy and provided highly reliable results. These findings indicate that, despite increased training demands, multi-task MAML is particularly well-suited for industrial environments where scarcity of target equipment training data is common—especially when data from similar equipment or related operational contexts can be leveraged during training. By effectively utilizing information from analogous machines, multi-task MAML enables robust generalization and ensures reliable real-time anomaly detection for target equipment, while maintaining practical feasibility for industrial deployment.

5.3. Experiment 2: Evaluation of Generalization Under Equipment Change

In real-world manufacturing environments, anomaly detection models must be generalizable across diverse machines producing different products. Conventionally, each new machine requires retraining from scratch on a sufficient volume of its own data, which is time-and resource-intensive. A model that can adapt to a new machine’s data with only a small amount of fine-tuning thus offers significant practical value. Although a single-machine MAML LSTM-AE model can be generalized to some extent, its ability to capture complex temporal patterns is limited when the meta-training data originate from only one machine. By contrast, the proposed multi-machine-based MAML LSTM-AE model is exposed to a much wider variety of time-series patterns during meta-training and is thus expected to reconstruct these patterns more faithfully even when the target machine or product changes.
To verify this, Experiment 2 compared the generalization performance of the single-machine-based MAML LSTM-AE model against that of the multi-machine-based MAML LSTM-AE model. Both models were evaluated under the same data collection conditions, model architecture, and hyperparameter settings used in Experiment 1. To mimic a low-data scenario, data from the retrofit machine were withheld, and data from another five-axis CNC machine—producing a different product—were selected instead, with the fine-tuning set limited to approximately 30% of its available sequences. For the single-machine MAML model, the meta-model trained exclusively on the retrofit machine data was fine-tuned for 30 epochs on the 30% subset of the new machine's normal sequences. Likewise, for the multi-machine MAML model, the meta-model trained across five machines was fine-tuned for 30 epochs on the same low-data dataset. Both models were then evaluated on the held-out test sequences—containing both normal and anomalous samples—to assess their anomaly detection performance under limited data and changed equipment conditions.

Experiment 2 Results

To compare the generalization performance of the proposed multi-machine-based MAML LSTM-AE model with that of the single-machine-based MAML LSTM-AE, both a quantitative evaluation and an analysis of reconstruction-error distributions were conducted. Accuracy, precision, recall, F1-score, and the confusion-matrix components (TP, TN, FP, FN), as defined in Table 6, were measured; the results are summarized in Table 9.
The single-machine-based MAML LSTM-AE model exhibits a notable drop in accuracy compared with Experiment 1's retrofit-target results (from 0.9802 to 0.8315). Although meta-learning typically enhances generalization, a meta-model trained on only one machine's data struggles to adapt when the target shifts to a different machine and product. In contrast, the proposed multi-machine-based MAML LSTM-AE maintains high detection performance despite the change in target data. This robustness stems from meta-training on five heterogeneous machines, which equips the model to generalize effectively to new equipment.
Figure 11 presents the low-data fine-tuning learning curves and reconstruction-error distributions for both models. The single-machine model fails to converge adequately within 30 epochs, reflecting that meta-training on a single machine exposed it to too narrow a range of time-series patterns. Its decision threshold of 0.0411 results in substantial overlap between the normal and anomalous error distributions, yielding unstable detection performance. By contrast, the multi-machine-based model converges stably under the same low-data conditions and yields a much lower threshold of 0.0111. Normal sequences cluster tightly at low reconstruction errors, while anomalies incur consistently higher errors, demonstrating the multi-task MAML framework's ability to learn diverse normal patterns and generate discriminative reconstruction errors for anomalies.
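As a concrete illustration of the thresholding rule used throughout (Table 5: mean plus three standard deviations of the reconstruction error on normal training sequences), the following hedged sketch computes per-sequence errors and flags anomalies. `model` and the sequence tensors are assumed to follow the shapes used earlier; the function names are hypothetical.

```python
import torch

# Sketch of the statistical decision rule from Table 5: threshold =
# mean + 3 * std of reconstruction errors on normal training sequences.
# `model`, `normal_train_seqs`, and `test_seqs` are illustrative names.

@torch.no_grad()
def reconstruction_errors(model, sequences):
    model.eval()
    recon = model(sequences)
    # Per-sequence MSE, averaged over time steps and the five load channels.
    return ((sequences - recon) ** 2).mean(dim=(1, 2))

def detect_anomalies(model, normal_train_seqs, test_seqs, k=3.0):
    train_err = reconstruction_errors(model, normal_train_seqs)
    threshold = train_err.mean() + k * train_err.std()
    test_err = reconstruction_errors(model, test_seqs)
    return test_err > threshold, threshold.item()  # True = flagged anomalous
```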
Together, the two experiments show that the proposed multi-machine-based MAML LSTM-AE model delivers robust anomaly detection both immediately after equipment retrofit, when data are extremely scarce, and under changing equipment and products. In the data-limited initial retrofit stage (Experiment 1), the model successfully leverages knowledge transferred from previously seen machines with only a handful of training samples, achieving high recall and a low false-positive rate. This outcome confirms that the meta-learning framework can adapt rapidly to a novel machine environment by compensating for shifts in feature distributions. In the generalization evaluation (Experiment 2), the model maintains stable anomaly discrimination even on entirely different machines, accommodating diverse operating conditions without overfitting. These results suggest that, in real-world manufacturing settings where inter-machine data variability is high and gathering labeled data on new equipment is challenging, the proposed method holds strong promise as a real-time anomaly detection solution. Future work will expand validation to a broader array of processes and sensor configurations and integrate online learning techniques to enable continuous adaptation.

6. Conclusions

This study proposes a meta-learning-based, multi-machine-task MAML LSTM-Autoencoder (MAML LSTM-AE) model for anomaly detection on retrofitted five-axis CNC machines. In manufacturing environments characterized by limited data and frequent product changes, the multi-machine meta-learning approach overcomes the shortcomings of conventional methods. The validity of the proposed model is confirmed through two experiments tailored to real-world operating conditions: it demonstrates both high adaptability and robust anomaly detection performance under extreme data scarcity and in the presence of new equipment and product variants.
To summarize the key findings: with only 1050 normal sample sequences collected over approximately one month, the model achieves 98.02% accuracy and a 94.74% F1-score, improvements of 4.2 and 16.9 percentage points, respectively, over standard transfer learning, and attains perfect recall (100%), detecting all anomalies without omission. When evaluated on data from completely different equipment, the model maintains 98.13% accuracy and a 95.93% F1-score (an 18.1% relative gain in accuracy and roughly a 2.5-fold increase in F1-score over the single-machine MAML baseline) and reduces the reconstruction-error threshold by 73% (from 0.0411 to 0.0111), enabling more precise anomaly discrimination. By simultaneously improving precision (90.00-92.19%) and sustaining perfect recall, the approach meets the stringent reliability requirements of industrial deployment, minimizing false positives and eliminating false negatives.
These results position the multi-machine meta-learning framework as a practical solution that immediately addresses data limitations and generalizes across diverse manufacturing settings. It is applicable not only to data-scarce retrofit installations but also to early deployment on newly installed equipment, offering a strategic means to enhance both maintenance efficiency and production-line stability.
However, this work has certain limitations. The experiments were conducted exclusively on five-axis CNC machines, and further validation is needed to confirm generalizability to other equipment types. Effective meta-training also presupposes the availability of pre-existing data from similar machines, and optimizing inference latency and resource usage for real-time deployment remains an open challenge. Additional difficulties arise when the discrepancy between target and source machines exceeds certain bounds: if the target machine operates under significantly different conditions or processes, or produces notably different product types from the source machines, the meta-learned initialization may not adequately capture the target machine's characteristics. Such discrepancies could limit fast adaptation, necessitating additional fine-tuning or more extensive datasets to achieve effective convergence.
Future research should focus on developing lightweight, edge-compatible architectures paired with online learning techniques to achieve genuine real-time monitoring. Integrating explainable AI methods (e.g., SHAP and LIME) would enhance transparency and facilitate root-cause analysis. To handle large target-source discrepancies, novel strategies are needed, such as hybrid meta-learning that combines generalized meta-initialization with targeted parameter adaptation, transfer learning to bridge domain gaps, and adaptive normalization or domain-adaptation schemes that explicitly mitigate inter-machine variability. Finally, extensive experimental validation across the automotive, aerospace, and semiconductor sectors would further establish this approach within Industry 4.0.

Author Contributions

Conceptualization, J.-M.W. and S.-H.J.; methodology, J.-M.W. and S.-H.J.; software, J.-H.S. and S.-H.J.; validation, J.-H.S.; formal analysis, J.-M.W. and S.-H.J.; investigation, J.-H.S.; resources, K.-M.S.; data curation, S.-H.J.; writing—original draft preparation, J.-M.W., S.-H.J. and J.-H.S.; writing—review and editing, K.-M.S.; visualization, J.-H.S.; supervision, K.-M.S.; project administration, K.-M.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Machinery and Equipment Industry Technology Development (R&D) Program (RS-2024-00444566, Demonstration of digital retrofit technology through upgrading of old manufacturing equipment controllers) funded by the Ministry of Trade, Industry and Energy (MOTIE, Republic of Korea).

Data Availability Statement

Data are available on request due to restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Kolla, S.S.V.; Lourenço, D.M.; Kumar, A.A.; Plapper, P. Retrofitting of legacy machines in the context of Industrial Internet of Things (IIoT). Procedia Comput. Sci. 2022, 200, 62–70.
2. Alqoud, A.; Schaefer, D.; Milisavljevic-Syed, J. Industry 4.0: A systematic review of legacy manufacturing system digital retrofitting. Manuf. Rev. 2022, 9, 32.
3. Ilari, S.; Di Carlo, F.; Ciarapica, F.E.; Bevilacqua, M. Machine tool transition from Industry 3.0 to 4.0: A comparison between old machine retrofitting and the purchase of new machines from a triple bottom line perspective. Sustainability 2021, 13, 10441.
4. Schüppstuhl, T.; Tracht, K.; Roßmann, J. (Eds.) Tagungsband des 4. Kongresses Montage Handhabung Industrieroboter; Springer: Berlin, Germany, 2019.
5. Di Carlo, F.; Mazzuto, G.; Bevilacqua, M.; Ciarapica, F.E. Retrofitting a process plant in an Industry 4.0 perspective for improving safety and maintenance performance. Sustainability 2021, 13, 646.
6. Anomaly Detection in Industrial Sensors: A Machine Learning-Driven Approach to Predictive Maintenance. Available online: https://www.researchgate.net/publication/386441949_Anomaly_Detection_in_Industrial_Sensors_A_Machine_Learning-Driven_Approach_to_Predictive_Maintenance (accessed on 29 May 2025).
7. AI-Enabled Anomaly Detection in Industrial Systems: A New Era in Predictive Maintenance. Available online: https://www.researchgate.net/publication/386443676_AI-Enabled_Anomaly_Detection_in_Industrial_Systems_A_New_Era_in_Predictive_Maintenance (accessed on 29 May 2025).
8. Shiva, K.; Etikani, P.; Venkata, V.; Mittal, A.; Dave, A.; Kanchetti, D.; Thakkar, D.; Munirathnam, R. Anomaly detection in sensor data with machine learning: Predictive maintenance for industrial systems. J. Electr. Syst. 2024, 20, 454–462.
9. Holtz, D.; Kaymakci, C.; Leuthe, D.; Wenninger, S.; Sauer, A. A data-efficient active learning architecture for anomaly detection in industrial time series data. Flex. Serv. Manuf. J. 2025, 1, 1–32.
10. Pietrangeli, I.; Mazzuto, G.; Ciarapica, F.E.; Bevilacqua, M. Smart Retrofit: An innovative and sustainable solution. Machines 2023, 11, 523.
11. Abdallah, M.; Joung, B.G.; Lee, W.J.; Mousoulis, C.; Raghunathan, N.; Shakouri, A.; Sutherland, J.W.; Bagchi, S. Anomaly detection and inter-sensor transfer learning on smart manufacturing datasets. Sensors 2023, 23, 486.
12. Maschler, B.; Pham, T.T.H.; Weyrich, M. Regularization-based continual learning for anomaly detection in discrete manufacturing. Procedia CIRP 2021, 104, 452–457.
13. Maschler, B.; Knödel, T.; Weyrich, M. Towards deep industrial transfer learning for anomaly detection on time series data. In Proceedings of the 2021 26th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), Vasteras, Sweden, 7–10 September 2021; pp. 1–8.
14. Li, E.; Bedi, S.; Melek, W. Anomaly detection in three-axis CNC machines using LSTM networks and transfer learning. Int. J. Adv. Manuf. Technol. 2023, 127, 5185–5198.
15. Peixoto, J.; Sousa, J.; Carvalho, R.; Soares, M.; Cardoso, R.; Reis, A. Anomaly detection with a LSTM autoencoder using InfluxDB. In Flexible Automation and Intelligent Manufacturing: Establishing Bridges for More Sustainable Manufacturing Systems; Silva, F.J.G., Ferreira, L.P., Sá, J.C., Pereira, M.T., Pinto, C.M.A., Eds.; Springer Nature: Cham, Switzerland, 2024; pp. 69–76.
16. Shin, Y.; Na, K.Y.; Kim, S.E.; Kyung, E.J.; Choi, H.G.; Jeong, J. LSTM-Autoencoder based detection of time-series noise signals for water supply and sewer pipe leakages. Water 2024, 16, 2631.
17. Homayouni, H.; Ghosh, S.; Ray, I.; Gondalia, S.; Duggan, J.; Kahn, M.G. An autocorrelation-based LSTM-Autoencoder for anomaly detection on time-series data. In Proceedings of the 2020 IEEE International Conference on Big Data (Big Data), Virtual Event, 10–13 December 2020; pp. 5068–5077.
18. Berahmand, K.; Daneshfar, F.; Salehi, E.S.; Li, Y.; Xu, Y. Autoencoders and their applications in machine learning: A survey. Artif. Intell. Rev. 2024, 57, 28.
19. Safonova, A.; Ghazaryan, G.; Stiller, S.; Main-Knorn, M.; Nendel, C.; Ryo, M. Ten deep learning techniques to address small data problems with remote sensing. Int. J. Appl. Earth Obs. Geoinf. 2023, 125, 103569.
20. Zhuang, F.; Qi, Z.; Duan, K.; Xi, D.; Zhu, Y.; Zhu, H.; Xiong, H.; He, Q. A comprehensive survey on transfer learning. Proc. IEEE 2021, 109, 43–76.
21. Deng, C.; Wu, Q.; Miao, J.; Lu, S.; Xie, L. Meta-transfer learning for predicting position-dependent milling stability across varying overhang lengths under limited experimental data. J. Manuf. Process. 2025, 149, 43–56.
22. Chen, Y.; Xu, X.; Liu, C. Few-shot meta transfer learning-based damage detection of composite structures. Smart Mater. Struct. 2024, 33, 025027.
23. Song, W.; Wu, D.; Shen, W.; Boulet, B. Meta-learning based early fault detection for rolling bearings via few-shot anomaly detection. arXiv 2023, arXiv:2204.12637.
24. Li, C.; Li, S.; Feng, Y.; Gryllias, K.; Gu, F.; Pecht, M. Small data challenges for intelligent prognostics and health management: A review. Artif. Intell. Rev. 2024, 57, 214.
25. Sanchez-Londono, D.; Barbieri, G.; Fumagalli, L. Smart retrofitting in maintenance: A systematic literature review. J. Intell. Manuf. 2022, 34, 1–19.
26. Xu, Z.; Selvaraj, V.; Min, S. State identification of a 5-axis ultra-precision CNC machine tool using energy consumption data assisted by multi-output densely connected 1D-CNN model. J. Intell. Manuf. 2024, 35, 147–160.
27. Selvaraj, V.; Min, S. Real-time fault identification system for a retrofitted ultra-precision CNC machine from equipment's power consumption data: A case study of an implementation. Int. J. Precis. Eng. Manuf. Green Technol. 2022, 10, 925–941.
28. Selvaraj, V.; Xu, Z.; Min, S. Intelligent operation monitoring of an ultra-precision CNC machine tool using energy data. Int. J. Precis. Eng. Manuf. Green Technol. 2022, 10, 59–69.
29. Çekik, R.; Turan, A. Deep learning for anomaly detection in CNC machine vibration data: A RoughLSTM-based approach. Appl. Sci. 2025, 15, 3179.
30. Demetgul, M.; Zheng, Q.; Tansel, I.N.; Fleischer, J. Monitoring the misalignment of machine tools with autoencoders after they are trained with transfer learning data. Int. J. Adv. Manuf. Technol. 2023, 128, 3357–3373.
31. Chang, B.R.; Tsai, H.F.; Mo, H.Y. Ensemble meta-learning-based robust chipping prediction for wafer dicing. Electronics 2024, 13, 1802.
32. Yao, M.; Tao, D.; Qi, P.; Gao, R. Rethinking discrepancy analysis: Anomaly detection via meta-learning powered dual-source representation differentiation. IEEE Trans. Autom. Sci. Eng. 2024, 22, 8579–8592.
33. Sakoe, H.; Chiba, S. Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans. Acoust. Speech Signal Process. 1978, 26, 43–49.
34. Cuturi, M.; Blondel, M. Soft-DTW: A differentiable loss function for time-series. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 894–903.
Figure 1. LSTM-Autoencoder architecture.
Figure 2. Model-Agnostic Meta-Learning (MAML) architecture.
Figure 3. Framework of the proposed work.
Figure 4. Workflow of data collection and preprocessing.
Figure 5. Construction of normal and anomalous data.
Figure 6. Construction of meta-tasks.
Figure 7. Meta-training stage: support-set construction from multiple normal machine cycles.
Figure 8. Meta-training stage: query-set construction from multiple machines' unlabeled cycles.
Figure 9. Meta-testing stage: adaptation to target machine and inference on unlabeled cycles.
Figure 10. Reconstruction error distribution of models (basic LSTM-AE, transfer learning, single-machine-based MAML LSTM-AE, multi-machine-based MAML LSTM-AE).
Figure 11. Reconstruction error distribution and fine-tuning loss of models (single-machine-based MAML LSTM-AE, multi-machine-based MAML LSTM-AE).
Table 1. Comparison of anomaly detection studies on retrofit systems or low-data situations.

| Related Research | Methodology | Target System and Approach | Limitations |
|---|---|---|---|
| Selvaraj et al. (2023) [26,27,28] | Anomaly scoring metric (Mahalanobis distance) | Retrofitted 5-axis CNC machine | Anomaly detection studies were conducted after sufficient data had been collected, so the issue of data acquisition timeframe was not addressed. |
| Çekik & Turan (2025) [29] | RoughLSTM | Retrofitted 5-axis CNC machine | Relies on expert-labeled vibration data, making the collection of large, balanced datasets both time-consuming and resource-intensive. |
| Li, Bedi & Melek (2023) [14] | LSTM-AE + transfer learning | Low-data situations in a 5-axis CNC machine | Uncertainty in model generalization due to the single-source, single-target transfer learning approach. |
| Demetgul et al. (2023) [30] | LSTM-AE + transfer learning + data augmentation | Low-data situations in a 3-axis CNC machine | Only normal data were collected; anomaly data had to be synthetically generated from experimental test platforms. |
| Chang et al. (2023) [31] | MAML | Low-data situations in a wafer dicing machine | Requires sufficient and well-distributed data; experiments showed poor prediction accuracy when training data are limited. |
| Yao et al. (2025) [32] | AuoDual | Unlabeled data situations in five large-scale real-world industries | Few anomalous samples can destabilize representation learning, degrading detection performance. |
| Proposed work | LSTM-AE | Unlabeled and low-data situations in retrofitted 5-axis CNC machines | Requires data from multiple identical 5-axis machines, making it difficult to generalize to industrial sites. |
Table 2. Column description for the five-axis CNC machine dataset.

| Category | Columns | Contents | Example |
|---|---|---|---|
| Machine info | MACHINE_CODE | Machine code | M001 |
| Operation info | OCCUR_DATE | Data timestamp | YYYY.MM.DD hh:mm:ss |
| | MACHINE_MODE | Machine mode | MEMORY (Auto) |
| | OPERATE_MODE | Operation status | OPERATE, ALARM, EMEGENCY, IDLE, STOP, SUSPEND |
| | MACHINE_STATUS | Machine operational state | STOP (0), RUNNING (1), ALARM (2), OFF (3) |
| Process info | PROSSED_QTY | Processed product count (until level) | 14,934 (14,934th product) |
| Sensor data | LOAD 1–5 | Axis load values | 1.757644 |
| Anomaly info | ALARM_CODE | Anomaly alarm code | SV040: Servo position error; SV041: Servo following error; SV045: Motor overload; SV049: Servo delay; SV060: Axis parameter error |
Table 4. Basic statistics of axis load signals (LOAD_1–LOAD_5) produced by different machines.

| Machine | Column | Sample Count | Mean | Std |
|---|---|---|---|---|
| M001 | LOAD_1 | 900,598 (2704 sequences) | 0.50542456 | 0.10112762 |
| | LOAD_2 | 900,598 (2704 sequences) | 0.44761983 | 0.12700728 |
| | LOAD_3 | 900,598 (2704 sequences) | 0.59069705 | 0.17027938 |
| | LOAD_4 | 900,598 (2704 sequences) | 0.36839223 | 0.09298372 |
| | LOAD_5 | 900,598 (2704 sequences) | 0.61748427 | 0.1081329 |
| M002 | LOAD_1 | 709,840 (2131 sequences) | 0.59514379 | 0.1574048 |
| | LOAD_2 | 709,840 (2131 sequences) | 0.40344976 | 0.20567667 |
| | LOAD_3 | 709,840 (2131 sequences) | 0.53153394 | 0.2277552 |
| | LOAD_4 | 709,840 (2131 sequences) | 0.2220627 | 0.14370497 |
| | LOAD_5 | 709,840 (2131 sequences) | 0.4302011 | 0.1505984 |
| M003 | LOAD_1 | 523,476 (1572 sequences) | 0.54203787 | 0.09018785 |
| | LOAD_2 | 523,476 (1572 sequences) | 0.52382211 | 0.10038 |
| | LOAD_3 | 523,476 (1572 sequences) | 0.51152321 | 0.10031709 |
| | LOAD_4 | 523,476 (1572 sequences) | 0.51351772 | 0.11441465 |
| | LOAD_5 | 523,476 (1572 sequences) | 0.5131019 | 0.19326683 |
| M004 | LOAD_1 | 1,103,229 (3313 sequences) | 0.43255012 | 0.06768404 |
| | LOAD_2 | 1,103,229 (3313 sequences) | 0.49672249 | 0.06791739 |
| | LOAD_3 | 1,103,229 (3313 sequences) | 0.50904567 | 0.07837191 |
| | LOAD_4 | 1,103,229 (3313 sequences) | 0.52217536 | 0.09814365 |
| | LOAD_5 | 1,103,229 (3313 sequences) | 0.5320233 | 0.17759579 |
| M005 | LOAD_1 | 1,154,844 (3468 sequences) | 0.44928078 | 0.10224675 |
| | LOAD_2 | 1,154,844 (3468 sequences) | 0.39194564 | 0.16300031 |
| | LOAD_3 | 1,154,844 (3468 sequences) | 0.52487395 | 0.11619587 |
| | LOAD_4 | 1,154,844 (3468 sequences) | 0.49783855 | 0.1232668 |
| | LOAD_5 | 1,154,844 (3468 sequences) | 0.39670919 | 0.0879473 |
| M006 (Retrofitted Machine) | LOAD_1 | 364,635 (1095 sequences) | 0.55664807 | 0.08672394 |
| | LOAD_2 | 364,635 (1095 sequences) | 0.39203232 | 0.16062026 |
| | LOAD_3 | 364,635 (1095 sequences) | 0.61901911 | 0.11195613 |
| | LOAD_4 | 364,635 (1095 sequences) | 0.41580323 | 0.10295453 |
| | LOAD_5 | 364,635 (1095 sequences) | 0.48648136 | 0.09712615 |
Table 3. Labeling criteria based on the alarm code occurrence ratio per cycle.

| Category | Mathematical Condition | Description | Usage |
|---|---|---|---|
| Normal | p = 0 | No alarm events in the cycle (all ALARM_CODE are NULL) | Used for training and evaluation |
| Minor Anomaly | 0 < p < 0.5 | Some alarm events occurred, but fewer than 50% of the entries are non-NULL | Can be used for threshold tuning or semi-supervised training |
| Anomaly | p ≥ 0.5 | More than 50% of entries are non-NULL ALARM_CODE | Used for evaluation only |
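The labeling rule in Table 3 reduces to computing, for each production cycle, the fraction p of records whose ALARM_CODE is non-NULL. A minimal pandas sketch, assuming the column names of Table 2 and a hypothetical cycle grouping key, is:

```python
import pandas as pd

# Per-cycle labeling rule of Table 3: p is the fraction of records in a
# production cycle whose ALARM_CODE is non-NULL. Column names follow
# Table 2; the CYCLE_ID grouping key below is a hypothetical addition.

def label_cycle(cycle_df: pd.DataFrame) -> str:
    p = cycle_df["ALARM_CODE"].notna().mean()  # share of non-NULL alarm entries
    if p == 0:
        return "normal"          # used for training and evaluation
    elif p < 0.5:
        return "minor_anomaly"   # threshold tuning / semi-supervised use
    else:
        return "anomaly"         # used for evaluation only

# Illustrative usage, assuming cycles are already delimited:
# labels = df.groupby("CYCLE_ID").apply(label_cycle)
```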
Table 5. Experimental hyperparameter settings.

| Category | Parameter | Value |
|---|---|---|
| Model Architecture | Model type | LSTM Autoencoder |
| | Input dimension | Automatically determined |
| | Hidden layer dimension | 128 |
| | Latent-space dimension | 64 |
| | Number of LSTM layers | 2 |
| | Dropout rate | 0.2 |
| Training Parameters | Learning rate | 0.0001 |
| | Epochs | 100 |
| | Batch size | 64 |
| | Training/validation split | 0.8/0.2 |
| | Early-stopping patience | 20 |
| | Gradient clipping threshold | 1.0 |
| Data Preprocessing | Sequence length | 333 |
| | Production-cycle length | 333 |
| | Normalization method | z-score |
| Anomaly Detection Threshold | Thresholding method | Statistical (mean + std dev) |
| | Thresholding multiplier | 3 |
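For concreteness, a PyTorch module matching the Table 5 architecture settings (two LSTM layers, hidden dimension 128, latent dimension 64, dropout 0.2, sequence length 333 over the five axis-load channels) might be sketched as follows. The layer arrangement is an assumption: Table 5 fixes the sizes, not the exact wiring.

```python
import torch
import torch.nn as nn

# Sketch of an LSTM-Autoencoder consistent with Table 5 (2 LSTM layers,
# hidden size 128, latent size 64, dropout 0.2) applied to z-scored
# sequences of length 333 over five axis-load channels.

class LSTMAutoencoder(nn.Module):
    def __init__(self, n_features=5, hidden=128, latent=64,
                 n_layers=2, dropout=0.2):
        super().__init__()
        self.encoder = nn.LSTM(n_features, hidden, n_layers,
                               batch_first=True, dropout=dropout)
        self.to_latent = nn.Linear(hidden, latent)
        self.from_latent = nn.Linear(latent, hidden)
        self.decoder = nn.LSTM(hidden, hidden, n_layers,
                               batch_first=True, dropout=dropout)
        self.head = nn.Linear(hidden, n_features)

    def forward(self, x):                       # x: [batch, 333, 5]
        _, (h, _) = self.encoder(x)
        z = self.to_latent(h[-1])               # latent code: [batch, 64]
        # Repeat the latent code across the sequence length for decoding.
        dec_in = self.from_latent(z).unsqueeze(1).repeat(1, x.size(1), 1)
        out, _ = self.decoder(dec_in)
        return self.head(out)                   # reconstruction: [batch, 333, 5]
```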
Table 6. Performance metrics.

| Category | Performance Metric | Contents |
|---|---|---|
| Anomaly Detection Performance | Confusion matrix | TP (true positive), FP (false positive), FN (false negative), TN (true negative) |
| | Accuracy | $\frac{TP + TN}{TP + TN + FP + FN}$ |
| | Precision | $\frac{TP}{TP + FP}$ |
| | Recall | $\frac{TP}{TP + FN}$ |
| | F1-score | $\frac{2 \times (Precision \times Recall)}{Precision + Recall}$ |
| Reconstruction Performance | Reconstruction error | The mean squared error between the input and its reconstruction |
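The Table 6 formulas translate directly into code; the following small helper computes all four metrics from confusion-matrix counts (the guards against division by zero are an addition for robustness).

```python
# Metric definitions from Table 6, computed from confusion-matrix counts.

def metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1_score": f1}
```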
Table 7. Model performance metrics.

| Model | Accuracy | Precision | Recall | F1-Score | TP | FP | TN | FN |
|---|---|---|---|---|---|---|---|---|
| Basic LSTM-AE | 0.8413 | 1.0000 | 0.1111 | 0.2000 | 5 | 0 | 207 | 40 |
| Transfer learning | 0.9405 | 0.9412 | 0.7111 | 0.8101 | 32 | 2 | 205 | 13 |
| Single-machine-based MAML LSTM-AE | 0.8517 | 0.8462 | 0.2444 | 0.3793 | 11 | 2 | 205 | 34 |
| Proposed multi-machine-based MAML LSTM-AE | 0.9802 | 0.9000 | 1.0000 | 0.9474 | 45 | 5 | 202 | 0 |
Table 8. Comparison of training duration and inference latency across model strategies.

| Model | Dataset | Time per Epoch (min) | Total Training Time (100 Epochs) | Avg. Inference Time (ms/Sample) |
|---|---|---|---|---|
| Basic LSTM | M006 (364,635 samples, 1095 sequences) | ≈2 | ≈3.3 h | ≈0.2 |
| Single-task MAML | M006 (364,635 samples, 1095 sequences) | ≈3 | ≈5 h | ≈0.2 |
| Transfer Learning (pretrain/fine-tune) | Pretrain: M005 (1,154,844 samples, 3468 sequences); fine-tune: M006 | Pretrain ≈ 6.5; fine-tune ≈ 2 | ≈14.1 h | ≈0.2 |
| Multi-task MAML (pretrain/fine-tune) | Pretrain: M001–M005 (≈4,491,987 samples, 13,188 sequences); fine-tune: M006 | Pretrain ≈ 20; fine-tune ≈ 3 | ≈38.3 h | ≈0.2 |
Table 9. Performance metrics of multi-machine-based model vs. single-machine-based model.

| Model | Accuracy | Precision | Recall | F1-Score | TP | FP | TN | FN |
|---|---|---|---|---|---|---|---|---|
| Single-machine-based MAML LSTM-AE | 0.8315 | 1.0000 | 0.2373 | 0.3836 | 14 | 0 | 208 | 45 |
| Proposed multi-machine-based MAML LSTM-AE | 0.9813 | 0.9219 | 1.0000 | 0.9593 | 59 | 5 | 203 | 0 |