
A Low-Code Visual Framework for Deep Learning-Based Remaining Useful Life Prediction

1 State Key Laboratory of Nuclear Power Safety Technology and Equipment, China Nuclear Power Engineering Co., Ltd., Shenzhen 518172, China
2 Shenzhen Key Laboratory of Nuclear and Radiation Safety, Institute for Advanced Study in Nuclear Energy & Safety, College of Physics and Optoelectronic Engineering, Shenzhen University, Shenzhen 518060, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work and should be considered as co-first authors.
Processes 2025, 13(8), 2366; https://doi.org/10.3390/pr13082366
Submission received: 6 June 2025 / Revised: 10 July 2025 / Accepted: 23 July 2025 / Published: 25 July 2025

Abstract

In the context of intelligent manufacturing, deep learning (DL)-based remaining useful life (RUL) prediction has become a research hotspot in the field of Prognostics and Health Management (PHM). Traditional approaches often require strong programming skills and repeated model building, posing a high entry barrier. To address this, in this study, we propose and implement a visualization tool that supports multiple model selections and result visualization and eliminates the need for complex coding and mathematical derivations, helping users to efficiently conduct RUL prediction with lower technical requirements. This study introduces and summarizes various novel neural network models for DL-based RUL prediction. The models are validated using the NASA and HNEI datasets, and among the validated models, the LSTM model best met the requirements for RUL prediction. To achieve the low-code application of deep learning for RUL prediction, the following tasks were performed: (1) multiple models were developed using the Python (3.9.18) language and implemented on the PyTorch (1.12.1) framework, giving users the freedom to choose their desired model; (2) a user-friendly, low-code RUL prediction interface was built using Streamlit, enabling users to easily make predictions; (3) the visualization of prediction results was implemented using Matplotlib (3.8.2), allowing users to better understand and analyze the results. In addition, the tool offers functionalities such as automatic hyperparameter tuning to optimize the performance of the prediction model and reduce the complexity of operations.

1. Introduction

The health of long-running production equipment inherently degrades during operation, so the risk of failure increases as the equipment ages. Additionally, the rapid advancement of technology has made production equipment structures more complex and intricate. As a result, when malfunctions occur, the costs of repairs and the losses from production interruptions increase significantly, adversely affecting the factory’s economic efficiency. Statistical data indicate that repair costs typically range from 10% to 40% of a factory’s production costs, representing a significant financial burden. Consequently, factories must implement effective maintenance strategies to ensure the smooth and uninterrupted operation of their equipment. In light of these challenges, the field of Prognostics and Health Management (PHM) has gained considerable attention, offering promising solutions for addressing these issues.
The primary objective of PHM is the accurate prediction of the remaining useful life (RUL) of equipment [1,2]. In general, equipment performance degrades naturally over time, whether or not the equipment is operating. However, accurately determining the extent of equipment degradation is challenging, making it difficult to ascertain whether the equipment can still perform its tasks and how much normal operational time remains. The value of RUL prediction lies in its ability to address this challenge.
Traditional remaining useful life (RUL) prediction methods largely rely on physical modeling and empirical formulas. While these methods have achieved some success in specific domains, they often require time-consuming modeling, struggle to adapt to environmental changes, and lack generalization capabilities when applied to complex systems. Data-driven RUL prediction has emerged as a research hotspot with the rapid development of big data and deep learning. Deep learning techniques, such as long short-term memory (LSTM) networks and convolutional neural networks (CNNs), can automatically extract features and capture complex temporal relationships, providing more accurate predictions. However, the practical application of deep learning still faces significant technical barriers—particularly in model construction, training, and tuning—which require deep expertise and considerable computational resources. Consequently, the challenge of lowering these technical barriers to enable non-experts to utilize deep learning for RUL prediction remains an urgent issue.
In this study, we propose a low-code deep learning platform that simplifies the application of deep learning models and reduces technical barriers. The platform provides an efficient and user-friendly tool for personnel in the field of equipment health management, allowing them to perform data importation, feature selection, model training, and prediction visualization through a graphical interface, all without the need for complex coding. This approach not only reduces the difficulty of applying deep learning but also promotes its widespread use in RUL prediction.
The main contributions of our work are as follows:
(1) Model Implementation: Multiple representative and novel neural network models suitable for time series RUL prediction tasks—such as LSTM, GRU, and 1D CNN models—were developed using the Python programming language and implemented on the PyTorch deep learning framework. This modular design allows users to freely select and switch between different models based on their task requirements or data characteristics.
(2) Interface Development: A low-code and interactive user interface was developed using the Streamlit framework. This interface supports operations such as data uploading, model selection, training initiation, and result downloading, enabling users without deep programming expertise to easily complete the RUL prediction process.
(3) Result Visualization: To improve the interpretability of model outputs, the system incorporates visualization modules built with Matplotlib (3.8.2). These modules provide intuitive plots such as training curves, prediction vs. actual RUL trends, and residual error distributions, helping users better evaluate model performance.
(4) Intelligent Parameter Optimization: To reduce the manual effort involved in model tuning, the system includes functions for automatic hyperparameter optimization. This feature leverages search algorithms (e.g., grid search or random search) to recommend the optimal training parameters, thereby enhancing the prediction accuracy and efficiency.
(5) Scalability and Extensibility: The architecture of the tool was designed with scalability in mind and supports the modular integration of additional models and datasets, making it suitable not only for academic research but also for industrial applications involving various types of equipment and operational scenarios.
The organization of the remainder of this paper is as follows. Section 2 provides a detailed literature review on the existing deep learning-based remaining useful life (RUL) prediction methods and other low-code platforms. Section 3 explains the theoretical background of deep learning and the low-code framework. Section 4 elaborates on the implementation of the proposed method and the system architecture. Section 5 discusses the experiments and result analysis. Section 6 presents the conclusions and future work.

2. Related Work

2.1. Overview of Deep Learning for RUL Prediction

Remaining useful life (RUL) prediction is a core issue in Prognostics and Health Management (PHM) that involves predicting the remaining lifespan of equipment by monitoring its operating condition, which helps with planning maintenance and preventing failures in advance. There are many RUL prediction methods, with the mainstream approaches involving model-based prediction, data-driven techniques, or a combination of both.
Gao Z. et al. [3] proposed a multiscale temporal memory Transformer framework for the remaining useful life (RUL) prediction of industrial robots, addressing challenges in detecting state changes and capturing multi-term temporal dependencies. Their method, validated on a self-built industrial robot platform, outperforms other advanced techniques by accurately locating state change points and achieving high-precision RUL prediction.
Wang L. et al. [4] proposed a novel predictive maintenance framework and developed a hybrid deep learning model for RUL prediction and two mixed-integer linear programming models to minimize/maximize the maintenance time, using a fast hybrid metaheuristic algorithm to solve large-scale problems effectively.
Zhou Y. et al. [5] proposed a deep ensemble RUL prediction method based on a multimodal interactive attention spatial–temporal network (MIASTN). The method integrates deep base learners, uses signal processing techniques to transform vibration data for improved interpretability, and employs a learning ensemble strategy.
Kim M. et al. [6] proposed a physics-informed deep learning framework for explainable remaining useful life (RUL) prediction. The framework enhances the RUL accuracy and interpretability by transforming raw measurements into five physics-informed features using a multiscale deep convolutional neural network to extract temporal patterns and applying layer-wise relevance propagation for failure mode identification.
Chen Z. et al. [7] proposed an attention-based deep learning framework for predicting the RUL of machines. The framework integrates both handcrafted features and automatically learned features, and the LSTM network learns sequential patterns from raw sensory data.
Cheng C. et al. [8] proposed a deep learning framework for predicting the RUL of rolling element bearings (REBs) that uses the Hilbert–Huang transform to extract a degradation indicator from raw vibrations, with a CNN to identify patterns and an ϵ-support vector regression model for RUL prediction, showing a superior performance across varying conditions.
Deep learning-based RUL prediction has gained significant attention for its robustness, wide applicability, and high precision due to the continuous advancement of sensor technology and the ongoing optimization of deep learning models. Deep learning has demonstrated significant performance advantages in RUL prediction but also has certain limitations, one of which is the high technical threshold required for its implementation. Practitioners must engage in data processing, model selection, and building the necessary computing framework, all of which require substantial programming skills and a foundational understanding of deep learning. This complexity can deter many potential users.

2.2. Low-Code Platform Applications

Low-code platforms (LCPs) enable users to quickly build applications with minimal coding through a visual interface. As data science and machine learning grow in popularity, LCPs are being adopted to speed up model development and data analysis. We explore their applications in these fields, focusing on open-source platforms and AutoML advancements.
The KNIME Analytics Platform [9] offers a wide range of extensions and plugins to support various data processing, analysis, modeling, and visualization functions, meeting the needs of complex data science tasks, and its modular design allows users to perform data analysis by dragging and dropping components, offering high flexibility. However, beginners may find the interface and functional modules of the KNIME Analytics Platform complex.
The Orange data mining toolbox [10] has a simple and user-friendly interface, making it suitable for beginners. Moreover, its modular design enables data analysis through drag-and-drop components, and it provides various machine learning algorithms. As an open-source and free project, users can freely download, modify, and extend its functionality. However, its performance is weaker when handling complex feature engineering or large-scale data.
AutoML [11,12] automates algorithm selection, feature engineering, and hyperparameter tuning, lowering the technical barrier for machine learning and helping non-experts quickly develop applications. However, while it saves time and effort, it has limited flexibility and may struggle with complex or specialized tasks, and due to its lack of transparency and high computational resource demands, it can face black-box issues and performance bottlenecks.
The table below (Table 1) provides a detailed comparison of the three different platform frameworks, covering their strengths and weaknesses in terms of their functionalities, performances, ease of use, and other aspects.
The low-code platform in this study is designed for beginners, allowing them to customize the number of layers in the model, and as Python (3.9.18)-based software, it provides the underlying source code to help users understand and build their own neural network models. Unlike other platforms, our focus is on teaching users the model-building process and enabling the flexible application of deep learning techniques.

3. Theoretical Method

DL technology, as a forefront area of machine learning, derives its outstanding capabilities from its ability to automatically learn abstract features and complex patterns, thereby achieving highly accurate predictions and decision making across various domains [13]. At the technical level, deep learning models are typically constructed as multi-layer neural networks with a large number of parameters to handle high-dimensional data, such as images, text, and time series. Structures such as convolutional neural networks (CNNs) and long short-term memory (LSTM) networks excel at processing spatial and temporal information [2,14]. The software developed in this project offers various foundational models, including CNN and LSTM models. For instance, with CNNs, users can adjust model parameters such as the input window size and the numbers of convolutional and normalization layers in pursuit of improved predictive results.
Deep learning technology has shown significant potential in the RUL prediction field. Through training on sensor data from equipment or systems, deep learning models can predict their future lifespans and performances, aiding in more precise maintenance planning and resource management. Deep learning can also be used to identify subtle trends and potential anomalies in time series data, enhancing the RUL estimation accuracy.

3.1. Basic Principles of the Model

In this study, we comprehensively applied various deep learning models, including LSTM, GRU, CNN, and Transformer architectures. We provide a brief introduction to their core structures and fundamental principles below.

3.1.1. Long Short-Term Memory

Long short-term memory (LSTM) [15] was introduced by Hochreiter and Schmidhuber, and its core innovation lies in the introduction of memory cells and a triple-gating mechanism, which addresses the problem of long-term dependencies through a constant error carousel (CEC). The LSTM network structure consists of multiple memory and gate units. Each memory unit typically includes an input gate, an output gate, and an internal state, and these components work together as follows (Figure 1):
The input gate determines the degree to which new information enters the memory unit, and its activation value ($i_t$) is calculated via the sigmoid activation function using the current input ($x_t$) and the previous hidden state ($h_{t-1}$):

$$i_t = \sigma\left(W_i x_t + W_{hi} h_{t-1} + b_i\right)$$

where $\sigma$ is the sigmoid activation function, $W_i$ and $W_{hi}$ are the weight matrices, and $b_i$ is the bias term.

The forget gate determines when the information in the memory unit is forgotten, and its activation value ($f_t$) is calculated via the sigmoid activation function using the current input ($x_t$) and the previous hidden state ($h_{t-1}$):

$$f_t = \sigma\left(W_{if} x_t + W_{hf} h_{t-1} + b_f\right)$$

where $W_{if}$ and $W_{hf}$ are the weight matrices, and $b_f$ is the bias term. The internal state of the memory unit is updated by combining the outputs of the input and forget gates. First, the candidate state ($\tilde{c}_t$) is calculated as follows:

$$\tilde{c}_t = \tanh\left(W_{ic} x_t + W_{hc} h_{t-1} + b_c\right)$$

Then, the internal state ($c_t$) is updated as follows:

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$

where $\tilde{c}_t$ represents the candidate state, $c_t$ is the updated internal state, $\odot$ denotes element-wise multiplication, and $\tanh$ is the hyperbolic tangent activation function. $W_{ic}$ and $W_{hc}$ are the weight matrices, and $b_c$ is the bias term. The output gate determines when the information in the memory unit is output. The activation value of the output gate ($o_t$) is calculated via the sigmoid activation function using the current input ($x_t$) and the previous hidden state ($h_{t-1}$):

$$o_t = \sigma\left(W_{io} x_t + W_{ho} h_{t-1} + b_o\right)$$

The output of the memory unit ($h_t$) is calculated as follows:

$$h_t = o_t \odot \tanh\left(c_t\right)$$

where $o_t$ is the activation value of the output gate, $h_t$ is the output of the memory unit, $W_{io}$ and $W_{ho}$ are the weight matrices, and $b_o$ is the bias term.
The LSTM employs a special gradient descent algorithm that combines truncated backpropagation to update the network weights. This method is as efficient as traditional backpropagation through time (BPTT) in terms of its computational performance, and by limiting the spread of error signals, it avoids gradient explosion and gradient disappearance.
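To make the gate equations above concrete, the following is a minimal PyTorch sketch of a single LSTM cell step. The class name, layer layout, and dimensions are illustrative assumptions rather than the exact code in the tool's model.py; in practice the built-in nn.LSTM module wraps these computations efficiently.

```python
import torch
import torch.nn as nn

class LSTMCellSketch(nn.Module):
    """Minimal LSTM cell implementing the gate equations above (illustrative only)."""
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # Each gate has an input-to-hidden weight (with bias) and a hidden-to-hidden weight.
        self.W_i, self.W_hi = nn.Linear(input_size, hidden_size), nn.Linear(hidden_size, hidden_size, bias=False)
        self.W_if, self.W_hf = nn.Linear(input_size, hidden_size), nn.Linear(hidden_size, hidden_size, bias=False)
        self.W_ic, self.W_hc = nn.Linear(input_size, hidden_size), nn.Linear(hidden_size, hidden_size, bias=False)
        self.W_io, self.W_ho = nn.Linear(input_size, hidden_size), nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, x_t, h_prev, c_prev):
        i_t = torch.sigmoid(self.W_i(x_t) + self.W_hi(h_prev))    # input gate
        f_t = torch.sigmoid(self.W_if(x_t) + self.W_hf(h_prev))   # forget gate
        c_tilde = torch.tanh(self.W_ic(x_t) + self.W_hc(h_prev))  # candidate state
        c_t = f_t * c_prev + i_t * c_tilde                         # updated internal state
        o_t = torch.sigmoid(self.W_io(x_t) + self.W_ho(h_prev))   # output gate
        h_t = o_t * torch.tanh(c_t)                                # cell output
        return h_t, c_t
```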

3.1.2. Gated Recurrent Units

The GRU model was proposed by Cho et al., and its core innovation lies in the introduction of the update and reset gate mechanisms [16]. The GRU network structure consists of multiple cyclic units, each of which controls the flow of information through the update and reset gates (Figure 2).
The role of the update gate is to determine the degree to which the current unit state needs to be updated, and its activation value ($z_t$) is calculated via the sigmoid activation function using the current input ($x_t$) and the previous hidden state ($h_{t-1}$):

$$z_t = \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right)$$

where $\sigma$ is the sigmoid activation function, $W_z$ and $U_z$ are the weight matrices, and $b_z$ is the bias term.

The reset gate determines the extent to which the current unit state needs to be forgotten, and its activation value ($r_t$) is calculated via the sigmoid activation function using the current input ($x_t$) and the previous hidden state ($h_{t-1}$):

$$r_t = \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right)$$

where $W_r$ and $U_r$ are the weight matrices, and $b_r$ is the bias term. The candidate hidden state ($\tilde{h}_t$) combines the output of the reset gate and is calculated as follows:

$$\tilde{h}_t = \tanh\left(W x_t + U\left(r_t \odot h_{t-1}\right) + b_c\right)$$

where $\odot$ represents element-wise multiplication, $W$ and $U$ are the weight matrices, $b_c$ is the bias term, and $\tanh$ is the hyperbolic tangent activation function. The final hidden state ($h_t$) is obtained by using the update gate ($z_t$) to perform a weighted sum of the previous hidden state ($h_{t-1}$) and the candidate hidden state ($\tilde{h}_t$):

$$h_t = \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t$$

where $h_t$ is the hidden state at the current moment, and $z_t$ is the activation value of the update gate.
Both the GRU and LSTM models are advanced variants of recurrent neural networks designed to address the issues of gradient disappearance and gradient explosion that traditional RNNs face when processing long sequences. The incorporated gating mechanisms control the information flow, thereby better capturing both short-term and long-term dependencies.
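For comparison, a correspondingly minimal sketch of one GRU cell step is shown below; again, the class name and layer layout are illustrative assumptions, and PyTorch's built-in nn.GRU would normally be used.

```python
import torch
import torch.nn as nn

class GRUCellSketch(nn.Module):
    """Minimal GRU cell following the update/reset-gate equations above (illustrative only)."""
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.W_z = nn.Linear(input_size, hidden_size)
        self.U_z = nn.Linear(hidden_size, hidden_size, bias=False)
        self.W_r = nn.Linear(input_size, hidden_size)
        self.U_r = nn.Linear(hidden_size, hidden_size, bias=False)
        self.W_h = nn.Linear(input_size, hidden_size)
        self.U_h = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, x_t, h_prev):
        z_t = torch.sigmoid(self.W_z(x_t) + self.U_z(h_prev))         # update gate
        r_t = torch.sigmoid(self.W_r(x_t) + self.U_r(h_prev))         # reset gate
        h_tilde = torch.tanh(self.W_h(x_t) + self.U_h(r_t * h_prev))  # candidate hidden state
        return (1.0 - z_t) * h_prev + z_t * h_tilde                   # final hidden state
```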

3.1.3. Convolutional Neural Networks

Convolutional neural networks (CNNs) are well-suited for image processing tasks due to their hierarchical feature extraction structure, inspired by biological visual systems [16]. CNNs excel at tasks such as image recognition and classification by automatically learning useful features from raw data.
The convolutional layer extracts features from an input image using small filters (kernels), sliding over the image and applying element-wise multiplication and summation to produce a feature map. The operation is as follows:
$$F(x, y) = (I * K)(x, y) = \sum_{i,j} I(x + i, y + j)\, K(i, j)$$
where I is the image, K is the kernel, and F is the feature map. This process reduces the number of parameters and helps capture spatial hierarchies.
Pooling reduces the spatial dimensions of the feature map, minimizing computation. Max pooling selects the maximum value from each region, while average pooling computes the average:
$$\mathrm{MaxPooling}(x, y) = \max\big(I(x, y)\big)$$
$$\mathrm{AvgPooling}(x, y) = \frac{1}{n} \sum_{i=1}^{n} I(x_i, y_i)$$
where I(x,y) represents the pixel values of the input image, and n is the number of elements in the pooling region. Pooling helps with downsampling the feature map, reducing the computational complexity, and making the network more robust to small translations or deformations of the input image.
Fully connected layers integrate the features to make final predictions. Each neuron connects to every neuron in the previous layer, and the output is computed as follows:
$$y = W x + b$$
where W is the weight matrix, x is the input, and b is the bias.
Activation functions introduce nonlinearity into the neural network, enabling it to learn complex patterns and relationships in the data. Without activation functions, the network would only be able to perform linear transformations, limiting its expressive power. One common activation function is the ReLU function:
$$\mathrm{ReLU}(x) = \max(0, x)$$
Backpropagation is the core algorithm for training convolutional neural networks, updating the parameters by calculating the gradients of the loss function with respect to the network parameters in order to minimize the loss. The weight update rule is as follows:
$$\theta = \theta - \eta \frac{\partial L}{\partial \theta}$$
where η is the learning rate, and L is the loss. Through repeated iterations, the network gradually optimizes its weights, improving its performance on the classification task.
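The following sketch illustrates how these building blocks (convolution, pooling, a fully connected head, the ReLU activation, and one Adam update step) could be combined into a small 1D CNN for windowed sensor sequences. All layer sizes, the class name, and the random tensors are illustrative assumptions rather than the tool's actual configuration.

```python
import torch
import torch.nn as nn

class CNN1DRegressor(nn.Module):
    """Illustrative 1D CNN for windowed sensor sequences; layer sizes are placeholders."""
    def __init__(self, n_features: int, window: int, hidden: int = 64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_features, hidden, kernel_size=3, padding=1),  # convolution over the time axis
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=2),                              # max pooling halves the time axis
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.head = nn.Linear(hidden * (window // 2), 1)              # fully connected output y = Wx + b

    def forward(self, x):
        # x: (batch, window, n_features) -> Conv1d expects (batch, channels, length)
        x = self.features(x.transpose(1, 2))
        return self.head(x.flatten(1))

# One Adam step realizes the update rule theta <- theta - eta * dL/dtheta (with adaptive moments).
model = CNN1DRegressor(n_features=3, window=6)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss = nn.MSELoss()(model(torch.randn(8, 6, 3)), torch.randn(8, 1))
loss.backward()
optimizer.step()
```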

3.1.4. Transformers

The Transformer model is a neural network architecture based on the attention mechanism and is designed to handle sequence-to-sequence tasks such as machine translation and text generation [13]. Moreover, the Transformer model completely abandons the traditional recurrent and convolutional network structures, relying solely on the attention mechanism to capture the global dependencies between inputs and outputs.
The Transformer processes sequence data through a multi-head self-attention mechanism and positional encoding, and it significantly improves the training efficiency by taking advantage of parallel computing.
Self-attention allows the model to consider all positions in the sequence when computing the representation of a particular position, calculating the correlations (typically via dot products) between elements and using them to weight and sum values, resulting in a new representation. Multi-head attention projects the input into multiple subspaces, performs self-attention independently in each, and concatenates the results before applying a linear transformation.
Because the Transformer does not rely on recurrent or convolutional structures, it cannot naturally capture positional information in the way that RNNs or CNNs do. To address this, it adds positional encoding. The positional encoding is a vector with the same dimension as the input embedding and is used to represent the position of each element in the sequence. The positional encoding is calculated as follows:

$$PE_{(pos,\,2i)} = \sin\left(\frac{pos}{10000^{2i/d_{\mathrm{model}}}}\right), \quad PE_{(pos,\,2i+1)} = \cos\left(\frac{pos}{10000^{2i/d_{\mathrm{model}}}}\right)$$

where $pos$ is the position in the sequence, $i$ is the dimension index, and $d_{\mathrm{model}}$ is the embedding dimension.
The Transformer uses an encoder–decoder architecture, where the encoder converts the input sequence into a continuous representation, and the decoder generates the output sequence based on this representation.
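The sketch below illustrates the standard sinusoidal positional encoding together with an encoder-only Transformer built from PyTorch's nn.TransformerEncoder. The hyperparameters and class name are illustrative assumptions rather than the exact architecture used by the tool.

```python
import math
import torch
import torch.nn as nn

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Standard sinusoidal positional encoding (assumes an even d_model)."""
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float32) * (-math.log(10000.0) / d_model))
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
    return pe

class TransformerRULSketch(nn.Module):
    """Encoder-only Transformer for windowed sensor data; all hyperparameters are placeholders."""
    def __init__(self, n_features: int, d_model: int = 64, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x):  # x: (batch, window, n_features)
        h = self.embed(x)
        h = h + sinusoidal_positional_encoding(h.size(1), h.size(2)).to(h.device)
        h = self.encoder(h)              # multi-head self-attention over the window
        return self.head(h[:, -1, :])    # RUL predicted from the last position's representation
```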

3.2. DL-Based RUL Prediction

DL-based RUL prediction is a data-driven method that relies on deep neural network models to make precise predictions. These models use a vast amount of data collected from sensors and devices to achieve accurate predictions. Deep learning models can capture subtle variations in the equipment performance over time by automatically learning complex data patterns and features, thereby providing accurate estimates of the remaining life.
LSTM, a recurrent neural network (RNN) variant, is particularly suitable for time series data. LSTM possesses self-memory and the ability to handle long sequences of data, and its self-memory allows it to effectively manage long-term dependencies in time series data, which is crucial for RUL prediction. The equipment operating history often spans multiple time steps, where past operations and environmental conditions may significantly impact the current and future device status. LSTM can capture these intricate time dependencies, making it a powerful tool for remaining useful life prediction.
The GRU model simplifies the LSTM structure by including only the update and reset gates, and it directly stores information through the hidden state. In practical applications, whether to choose the GRU or LSTM model depends on the specific task requirements. The GRU model is used to predict when a device might fail at a certain point in the future, and its advantage lies in its efficient ability to handle time series data, making it particularly suitable for dynamic and changing industrial equipment prediction tasks.
CNNs, renowned for their automatic feature extraction capabilities, efficiently capture local and global features in time series data and are especially useful for analyzing the impacts of different operation and environmental condition time periods on the remaining life in the equipment’s operational history. Utilizing CNNs for RUL prediction provides robust feature extraction abilities in time series data analysis, adeptly handling multi-dimensional data and thereby supporting an improved RUL prediction performance and maintenance efficiency. Additionally, CNNs have certain advantages in terms of their computational efficiency, reducing complexity and significantly improving data processing for large-scale datasets.
RUL prediction methods based on Transformer models make full use of their high expressive power and attention mechanism. The attention mechanism is a core component of the Transformer, mimicking how the human brain allocates resources [17], and it can automatically identify which parts of the time series data need more attention at different time points, which is crucial in the RUL prediction field. In the equipment’s operational history, environmental variables do not remain constant, and changes in the operations and environmental conditions during specific time periods may have a significant impact on the device lifespan. Through the attention mechanism, the Transformer can automatically identify and weigh these critical time periods and features. Furthermore, unlike traditional recurrent neural networks (RNNs) [18,19], the Transformer is not hindered by long-term dependency issues, making it particularly suitable for handling long time series data. Additionally, the Transformer has the ability to extract contextual features (in RUL prediction, contextual features typically refer to factors such as the equipment operations, maintenance history, and environmental conditions). Traditional methods often require the manual generation of these features, but Transformer-based methods can automatically extract this contextual information from raw time series data [20], because the Transformer’s self-attention mechanism allows it to capture the contextual relationships between different elements in the data, especially in time series data.
The software implemented in this study offers a range of deep learning models, including the four aforementioned models, enabling users to compare the predictive performances of different models for the same case, facilitating their selection based on specific use scenarios for improved prediction outcomes.

3.3. Low-Code Platform Design Concept

In this study, we designed a low-code, user-friendly visualization tool to lower the technical barriers for using deep learning in predicting the RUL of equipment. Many potential users find it difficult to directly use deep learning tools due to the high technical requirements. Therefore, simplifying the operational process was the core objective.
PyTorch (1.12.1) was chosen as the deep learning framework due to its dynamic computation graphs and flexibility, making it well-suited for handling complex tasks such as RUL prediction. The rich ecosystem of Python tools and libraries, such as Streamlit, further enhances the development efficiency. Streamlit, an innovative Python library, helps developers to quickly create interactive web applications due to its ease of use and high interactivity, eliminating the need for complex web development steps and further improving the accessibility of the technology.
To simplify the process, we developed a web-based interactive interface that allows users to upload data files and select features, labels, etc., without writing code for data preprocessing. Additionally, the platform provides a variety of predefined deep learning models (such as LSTM, GRU, CNN, and Transformer models), enabling users to easily select and adjust model parameters, such as the learning rate and training epochs, further reducing the technical barrier.
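As an illustration of how such a low-code interface can be assembled, the following Streamlit sketch wires up data upload, feature and label selection, model choice, and basic hyperparameters. The widget labels, model list, and the commented train_and_evaluate call are hypothetical and do not reproduce the tool's actual source.

```python
# streamlit_app.py — illustrative low-code RUL interface (run with: streamlit run streamlit_app.py)
import pandas as pd
import streamlit as st

st.title("RUL Prediction (Low-Code Demo)")

uploaded = st.file_uploader("Upload sensor data (CSV)", type="csv")
if uploaded is not None:
    df = pd.read_csv(uploaded)
    features = st.multiselect("Feature columns", df.columns.tolist())
    label = st.selectbox("Label (RUL) column", df.columns.tolist())
    model_name = st.selectbox("Model", ["LSTM", "GRU", "CNN", "Transformer"])
    lr = st.number_input("Learning rate", value=0.001, format="%.4f")
    epochs = st.slider("Training epochs", 1, 200, 64)

    if st.button("Start training") and features:
        st.write(f"Training a {model_name} model for {epochs} epochs (lr={lr}) ...")
        # train_and_evaluate(df[features], df[label], model_name, lr, epochs)  # hypothetical call
```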
The platform also includes a visualization module, where users can view real-time charts to track the training progress and compare the predicted RUL with the actual RUL. This visualization helps users to better understand the model’s behavior, enabling more effective decision making.
The platform uses a modular design, allowing users to flexibly combine different functional modules based on their needs. This flexibility ensures that both beginners and experienced users can efficiently complete RUL prediction tasks.

4. Software Design

Before designing and implementing the DL-based RUL prediction software, it is crucial to analyze the software’s usage scenarios and user profiles. As an interactive platform, users can choose data files from the initial interface, and they can also select feature, label, time, and identifier columns from the data table. Moreover, users can actively participate in data preprocessing by setting parameters such as the data batch size and time window divisions, and during the model selection phase, they can choose the most suitable model based on the data provided. They can also configure model-specific parameters, such as the number of training iterations and the learning rate to optimize the performance. These settings are configured into the software’s configuration files through the Flask framework. The software automatically reads this information to process the data and configure the model.
Once training is complete, the software outputs the evaluation metrics of the model training process and provides a visualization of these metrics. Users can select evaluation metric functions and examine the function graphs to gain insights into the model’s training progress. The trained model is saved within the software for later use. During the testing and prediction phases, the software provides users access to the pretrained models. The prediction results are shown in the form of line graphs, allowing users to better understand the equipment’s degradation trends.

4.1. Software Structure

4.1.1. Software Architecture Selection

The software architecture combines a Command-Line Interface (CLI) client with a web interface, providing users with multiple interactive pathways and enabling a distributed interactive application. Users initiate the interactive web interface hosted on a web server by inputting commands into the CLI client within their terminal. The web server receives requests from the CLI client and is responsible for executing the core logic of the application, data processing, and service calls. Users can communicate with the web server through their web browser, where the web interface provides visual data representation and functional operations.
To ensure the smooth operation of the CLI client, users need to maintain specific runtime environments and dependencies on their local terminals. This hybrid architecture allows users to interact with the software in a command-line manner in their terminals and view data and perform operations graphically through the web interface. This architecture is suitable for applications that require flexible interactions between users in both local and web interfaces, ensuring that the data and functionalities are presented visually while balancing the flexibility of the CLI and the user-friendliness of the web interface.

4.1.2. Software Architecture Implementation

The software architecture adheres to strict modular design principles with the primary goal of ensuring the system’s high scalability and maintainability (The specific module development details can be found in Appendix A). The software is built in the form of a series of core modules, including the user interface, data processing, model selection, and visualization modules, among others. These modules collaborate with each other to provide a wide range of features and offer a comprehensive user experience (Figure 3).
The software features a modular architecture, and within this architecture, the user interface module allows users to import local data through a user-friendly interface. Users can also precisely specify parameters related to the data, including the feature, label, time, and sequence columns.
The data processing module undertakes critical data-related tasks, including but not limited to data partitioning and normalization and the creation of data windows. These tasks are essential to ensure that the data are prepared and cleaned before they are used by the model.
The model selection module is responsible for executing the training process of the machine learning model upon receiving the data prepared by the data processing module. During this process, users have the authority to choose the desired model architecture and adjust the model parameters as needed to meet their specific requirements. Once training is completed, the model is saved within the software for later testing and application (Figure 4).
Finally, the visualization module ensures that the model’s evaluation metrics are presented in a highly visual manner, enabling users to better understand the model’s performance and test results and gain profound insights into its behavior.
Users directly input data during the testing phase. At this point, the software automatically performs data integrity checks to verify whether the testing data include the selected feature columns from the training dataset. If the necessary feature columns are missing in the testing data, the software generates an error report and prevents users from proceeding with further actions. After ensuring the integrity of the testing data, the software normalizes these input data to ensure that they fall within an acceptable range for the subsequent model testing.
Users have the freedom to select one from multiple available testing models, and the software applies that model to the testing data provided by the user for predictions. The final test results are decoupled and presented to the user in the form of line graphs, allowing them to have a clear understanding of the model’s performance and prediction results (Figure 5).
This modular design enhances the software’s scalability and makes it easier to maintain. Each module in the system has well-defined tasks and responsibilities, helping to reduce the system complexity and improve maintainability, while also providing a strong foundation for future feature expansion. The architecture offers users powerful data processing and model training capabilities while ensuring a highly visual presentation of the data and model performance. As a result, the software’s flexibility and user friendliness are improved. This design principle gives the software a sound structure, allowing its various tasks and components to work together effectively.

4.1.3. Modular Design Principles

The software design adhered to a set of key modular design principles, including the Single Responsibility Principle (SRP), the Open–Closed Principle (OCP), and the Interface Segregation Principle (ISP). These principles are foundational in the field of software engineering and aim to ensure a high level of maintainability, scalability, and low coupling in software systems. The Single Responsibility Principle mandates that each module has a well-defined responsibility, thereby increasing the code cohesion and clarity. The Open–Closed Principle emphasizes that systems should be open for extension but closed for modification, supporting seamless system functionality extension through the use of abstractions and interfaces without the need to modify existing code. The Interface Segregation Principle ensures that modules only depend on the minimal interfaces they need, reducing unnecessary dependencies and increasing module independence. The application of these principles ensures that each module in the system is responsible for a single, independent duty, and through appropriate abstraction and interface definitions, it allows for the extension of the system’s functionality without modifying existing code, thereby guaranteeing maintainability and extensibility in the software.

4.2. Application Module Details

The software is composed of several major modules—the user interface, data processing, model selection, and visualization modules—which are all designed for deep learning-based equipment RUL prediction. Each module has distinct functions and responsibilities, collectively providing users with comprehensive data analysis and visualization tools.

4.2.1. Interface Interaction Module

The user interface module serves as the interface between the user and the software, offering a user-friendly interface for importing, processing, and displaying data (Figure 6). In terms of data, users input data files, select the feature columns to be used as input, and choose the label columns as output. If there is a device identifier column in the data table, users should select it as the sequence number column. Users can also select a model and decide on the data step size, model optimizer, training iterations, loss functions, and other model configurations within the interface.
We chose to use the Streamlit library to design the front-end interface, as Streamlit excels at transforming data science and machine learning results into interactive web applications. User input information is stored in the dataset.yaml configuration file through the Flask framework. This file provides the necessary parameters for the subsequent model training, such as the number of training epochs and the input and output dimensions.

4.2.2. Data Processing Module

The primary responsibility of this module is to process user input data, including dataset partitioning, normalization, and windowing. Users first define the categorization of the columns in the input data to facilitate the processing of data via this module (Figure 7).
Once the input data are determined, the module checks the existence of the files to ensure their accessibility. This is a crucial step, as data science applications often require loading data from external files or databases. By verifying that the file content is not empty, the module can prevent potential runtime errors and ensure smooth data loading.
The module proceeds to partition the dataset. Dataset partitioning is primarily performed using the train_test_split function from the sklearn library. The partition ratio is determined by the “Split” value in the dataset.yaml configuration file, and the partitioning is performed randomly. Data normalization is achieved using the MinMaxScaler function, which scales data values to the range [0, 1]. This normalization process aims to accelerate the model training, enhance the model stability, and improve the model’s robustness to data with different scales.
Additionally, this module is responsible for data windowing. The module determines the data step size based on the “Step” variable in the dataset.yaml file and subsequently divides the data into windows to facilitate model training (Figure 8).
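A minimal sketch of these three steps (random partitioning with train_test_split, [0, 1] scaling with MinMaxScaler, and sliding-window construction driven by a step size) is shown below; the file name, column names, and helper function are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

def make_windows(X: np.ndarray, y: np.ndarray, step: int):
    """Slice a sequence into overlapping windows of length `step` (illustrative helper)."""
    xs, ys = [], []
    for i in range(len(X) - step):
        xs.append(X[i:i + step])   # window of consecutive measurements
        ys.append(y[i + step])     # RUL label aligned with the end of the window
    return np.array(xs), np.array(ys)

df = pd.read_csv("train_FD002.csv")                 # hypothetical file name
features, label = ["sensor_1", "sensor_2"], "RUL"   # hypothetical column names
X = MinMaxScaler().fit_transform(df[features])      # scale features to [0, 1]
X_win, y_win = make_windows(X, df[label].to_numpy(), step=5)                       # "Step" value
X_train, X_test, y_train, y_test = train_test_split(X_win, y_win, test_size=0.2)   # "Split" ratio
```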

4.2.3. Model Selection Module

The model selection module (Figure 9) is one of the core components of the software, developed using Python and the PyTorch framework, and its primary objective is to provide users with a diverse range of model choices. To achieve this, we have developed various deep learning models, including convolutional neural networks (CNNs), long short-term memory (LSTM) networks, and Transformers, all centrally stored in the model.py file. This diversity of choices allows users to select the model structure that best suits their specific task requirements.
Moreover, shallow models may not meet the training requirements when dealing with complex tasks and large-scale datasets. Therefore, we provide options for setting the number of model layers, allowing users to personalize their models more effectively.
The selected model type, the number of model layers, and the other model parameters chosen by the user are written into the dataset.yaml file. During the model training phase, the get_model function retrieves essential model parameters from the dataset.yaml file, such as the model’s hidden dimensions, the number of hidden layers, and the number of training epochs.
The software uses the Adam optimizer for model optimization. The Adam optimizer cleverly combines the gradient descent and momentum methods, adjusting the learning rate adaptively based on the moving averages of both the gradient and squared gradient for each parameter. This adaptation helps automatically adjust the learning rate during different training phases, improving the model stability and accelerating convergence for parameter optimization. Additionally, Adam introduces a momentum term to expedite convergence and reduce oscillations in gradient descent, and through a bias correction mechanism, it also corrects inaccuracies in gradient estimates during the early training stages.
The train_and_evaluate function within the module receives these parameters and the dataset, conducts the training, and outputs the trained model, along with the loss data generated during the training process.
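The following sketch shows what a train_and_evaluate-style loop with the Adam optimizer could look like; it is an illustrative assumption about the training procedure, not the tool's actual implementation.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def train_and_evaluate_sketch(model, X_train, y_train, epochs=64, lr=1e-3, batch_size=32):
    """Illustrative training loop with the Adam optimizer; returns the model and per-epoch losses."""
    dataset = TensorDataset(torch.as_tensor(X_train, dtype=torch.float32),
                            torch.as_tensor(y_train, dtype=torch.float32).unsqueeze(1))
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # adaptive learning rates with momentum
    criterion = nn.MSELoss()
    losses = []
    for _ in range(epochs):
        epoch_loss = 0.0
        for xb, yb in loader:
            optimizer.zero_grad()
            loss = criterion(model(xb), yb)
            loss.backward()        # backpropagate gradients of the loss
            optimizer.step()       # Adam parameter update
            epoch_loss += loss.item() * len(xb)
        losses.append(epoch_loss / len(dataset))
    return model, losses
```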

4.2.4. Visualization Module

The visualization module was developed using the Matplotlib library, which is a widely used Python data visualization library in academic research and data analysis. The purpose of this module is to provide users with a convenient tool for data analysis and presentation, allowing for a clearer presentation of the loss values, root-mean-squared errors (RMSEs), and mean-squared errors (MSEs) during the training process, as well as the trends in the prediction results. The data generated during model training, as well as the prediction results from testing, are visualized through this module. These visualization tools assist users in better understanding the model performance and prediction trends, supporting their decision making and analysis tasks.
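A minimal Matplotlib sketch of the kind of plots this module produces (a training-loss curve and a predicted-vs.-actual RUL trend) is given below; the function name and figure layout are illustrative.

```python
import matplotlib.pyplot as plt

def plot_results(train_losses, y_true, y_pred):
    """Illustrative plots: training-loss curve and predicted vs. actual RUL trend."""
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.plot(train_losses)
    ax1.set_xlabel("Epoch")
    ax1.set_ylabel("Training loss (MSE)")
    ax2.plot(y_true, label="Actual RUL")
    ax2.plot(y_pred, label="Predicted RUL")
    ax2.set_xlabel("Time step")
    ax2.set_ylabel("RUL")
    ax2.legend()
    fig.tight_layout()
    return fig  # in a Streamlit app, the figure can be rendered with st.pyplot(fig)
```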

5. Software Operation and Practical Examples

5.1. Example Introduction

5.1.1. NASA Turbofan Engine Dataset

The software’s testing data were obtained from NASA’s Prognostics Center of Excellence (PCoE) Prognostics Data Repository [21]. The Turbofan Engine Degradation Simulation Dataset was provided by the NASA Ames Predictive Maintenance and Health Management Center of Excellence. This dataset, generated using the Commercial Modular Aero-Propulsion System Simulation (C-MAPSS) tool, simulates the degradation of turbofan engines under various operating conditions and fault modes.
The dataset consists of four different subsets (FD001, FD002, FD003, FD004), each of which operates under different operating conditions and failure modes (Table 2).
Each engine has different initial wear and manufacturing variations when it starts, which users are unaware of (Table 3). These wear and manufacturing variations are considered normal and do not indicate a fault condition. The data are contaminated by sensor noise. Each time series begins with the engine running normally and then experiences a failure at a certain point. In the training set, the failure progresses until the system fails. In the test set, the time series ends at a certain point before the system fails. The goal is to predict the remaining operating cycle before the failure in the test set, especially the number of remaining cycles after the engine’s last full operation. The actual remaining useful life (RUL) values of the test data are also provided.

5.1.2. HNEI Battery RUL Dataset

The second dataset used in this study was the Hawaii Natural Energy Institute (HNEI) battery dataset [22], which supports research on battery remaining useful life (RUL) prediction. This dataset contains multiple cells subjected to repeated charging and discharging cycles under varying operating conditions (Table 4). Key features include the current, voltage, temperature, and capacity. Each time series represents the full lifecycle of a battery cell until the capacity drops below a specified threshold. Unlike the NASA turbofan engine dataset, the HNEI battery dataset represents electrochemical degradation processes. The goal is to predict how many cycles remain before the battery reaches its end of life, defined as 80% of the nominal capacity.
Based on the original experimental records, the data are processed via feature engineering and normalization preprocessing to form structured time series samples. Each record represents the state characteristics of the battery at a specific cycle stage and is associated with its corresponding RUL label.

5.2. Application Example

5.2.1. Introduction to Evaluation Metrics

To evaluate the performance of the deep learning model in predicting remaining useful life (RUL), we used several key metrics, including the mean-squared error (MSE), root-mean-squared error (RMSE), R2, and weighted mean absolute percentage error (WMAPE).
The MSE (mean-squared error) is a commonly used metric that measures the average of the squared differences between the predicted values ($\hat{y}_i$) and the actual values ($y_i$). Specifically, for a dataset with $n$ data points in total, the formula for calculating the MSE is as follows:

$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2$$
Because the MSE is the average of squared errors, it is always non-negative. The MSE is zero only when all the predicted values match the actual values exactly. Furthermore, because the MSE is the average of squared errors, larger errors are amplified, making it more sensitive to larger errors. For example, a difference of 2 between a predicted value and an actual value results in a squared error of 4, whereas a difference of 10 results in a squared error of 100, which significantly contributes to the MSE. Additionally, the MSE is measured in square units of the predicted and actual values. For instance, if both are measured in meters (m), the MSE is measured in square meters (m2), which can make the MSE less intuitive in interpretation.
The RMSE (root-mean-squared error) is the square root of the MSE. The RMSE squares the errors between the predicted and true values, averages them, and then takes the square root, and its calculation formula is as follows:

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}$$
The root-mean-squared error (RMSE) is the square root of the mean-squared error (MSE), meaning that it retains the MSE’s penalty for errors but, compared to the MSE, it has units that are consistent with those of the predicted and actual values. For example, if the units of the predicted and actual values are in meters (m), then the RMSE also has units of meters (m), making it more intuitive in interpretation. Similar to the MSE, the RMSE is non-negative, and it is only zero when all the predicted values exactly match the actual values.
Because the RMSE is the square root of the mean-squared error, it is still sensitive to large errors. However, due to the square-root operation, the RMSE amplifies large errors less than the MSE does.
The $R^2$ (coefficient of determination) is an important index for evaluating the goodness of fit of a regression model. The $R^2$ reflects the explanatory power of the model for the fluctuations in the observed data, and its calculation formula is as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}$$

where $\bar{y}$ represents the average of the true values. The $R^2$ value typically ranges from 0 to 1; the closer the value is to 1, the better the model explains the variation in the data; when the $R^2$ is 1, the model perfectly fits the data. The $R^2$ is sensitive to outliers and does not accurately reflect the actual size of prediction errors, so it should be used in conjunction with the MAE or MSE.
The WMAPE (weighted mean absolute percentage error) is a weighted version of the MAPE used to eliminate the sensitivity of the MAPE to zero values, and its calculation formula is as follows:
$$WMAPE = \frac{\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|}{\sum_{i=1}^{n} \left| y_i \right|}$$
Unlike the MAPE, the WMAPE does not average each sample but uses the weighted sum of the actual values as a normalization factor, giving greater weight to errors in larger samples. The value of this metric typically ranges from 0 to 1, with smaller values indicating lower error rates, often expressed as a percentage. The WMAPE is more stable, making it particularly suitable for practical applications where the actual values frequently contain many small or zero values.
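The four metrics can be computed directly from the predicted and actual RUL arrays, as in the following NumPy sketch (an illustrative helper, not part of the tool's code base):

```python
import numpy as np

def evaluate(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Compute the MSE, RMSE, R2, and WMAPE defined above (illustrative helper)."""
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    rmse = float(np.sqrt(mse))
    r2 = float(1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2))
    wmape = float(np.sum(np.abs(err)) / np.sum(np.abs(y_true)))
    return {"MSE": mse, "RMSE": rmse, "R2": r2, "WMAPE": wmape}

# Example: evaluate(np.array([100.0, 80.0, 60.0]), np.array([95.0, 82.0, 58.0]))
```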

5.2.2. NASA Dataset Application

The data input utilizes the FD002 dataset. In the initial interface, data are input, and the “oil temperature,” “sensor measurement 1,” and “sensor measurement 2” are selected as the data feature columns. The device run cycles are chosen as the time column. The remaining life is calculated by subtracting the current run cycle from the maximum run cycle for each device, and this value is set as the label column. In data processing, a time step of 5 is selected, and the data batch size is set to 32.
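For illustration, the RUL label described here (the maximum run cycle minus the current run cycle, per device) could be derived with pandas as follows; the file and column names are hypothetical.

```python
import pandas as pd

df = pd.read_csv("train_FD002.csv")  # hypothetical file name
# RUL = (maximum run cycle observed for each device) - (current run cycle)
df["RUL"] = df.groupby("unit_id")["cycle"].transform("max") - df["cycle"]  # hypothetical column names
```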
Regarding the model parameters, the number of training rounds (epochs) is set to 64, and the Adam optimizer is selected. The model’s learning rate can be adjusted through this optimizer to reach the training goals faster. The hidden-layer dimension is set to 128, the number of layers is set to 3, and the FD004_68 dataset is used for prediction.
The training results and a sample prediction instance are as follows:
This case study demonstrates that in the task of predicting the remaining useful life (RUL) of equipment for the NASA dataset, the Transformer model outperformed the LSTM, GRU, and CNN models. The experimental results (Table 5 and Table 6) show that the Transformer model achieved the highest R2 score (0.9606) and the lowest WMAPE (0.0983), indicating its outstanding ability to capture the underlying patterns of time series data. These results highlight the advantages of the Transformer in complex time-dependent modeling for remaining useful life prediction. In contrast, the LSTM, GRU, and CNN models exhibited relatively lower prediction performances. Although the LSTM and CNN models achieved lower MSE and RMSE values compared with those of the Transformer model, their R2 scores (0.7944 and 0.8269, respectively) and WMAPEs (both above 0.20) suggest that they were less effective at capturing the full variance and trend of the time series data. The GRU model performed the worst among the four, likely due to its simpler gating mechanism, which may limit its ability to model long-range dependencies in data. These shortcomings highlight the importance of basing the model selection on the task complexity and data characteristics.
In this specific case, users are able to use deep learning for equipment remaining life prediction without the need to delve into the intricacies of neural network construction. Instead, this interactive interface streamlines the workflow, allowing users to focus on key steps. First, users only need to provide training data without the need to construct complex neural network architectures themselves. Users can explicitly choose the training model that suits their problem and thereby shape the basic structure of the model effectively.
Furthermore, users are empowered to fine-tune the model parameters at appropriate times to meet their specific requirements. The model training process takes data into the user-defined model architecture and, through multiple iterations, automatically adjusts the model parameters to efficiently minimize prediction errors, enabling it to capture data features and patterns effectively to generate a robust model. After training, users can effectively test the model and obtain information on the predicted remaining life of the equipment in the testing phase. The key feature of this process is its simplicity, as in-depth knowledge of deep learning is not required to obtain high-quality prediction results from the system.
This approach has a positive impact on the popularization of deep learning applications by simplifying complex task workflows and improving the accessibility to deep learning technology for ordinary users. The user friendliness combined with the high-quality prediction results provided by the system gives it potential widespread applicability in the field of equipment remaining life prediction.

5.2.3. HNEI Battery Dataset Application

In this section, the application of the HNEI battery dataset is demonstrated using the developed software platform. The dataset comprises multiple lithium-ion battery cells subjected to repeated charge–discharge cycles under varying load conditions. The primary objective is to predict the remaining useful life (RUL) of the battery, defined as the number of cycles remaining before the battery capacity falls below a predefined threshold (typically 80% of the initial capacity).
In the data input interface, the HNEI dataset is loaded, and relevant features such as the “voltage,” “current,” and “temperature” are selected as the input variables. The “cycle” count is designated as the time column to represent the operational timeline of each battery unit. The target variable—i.e., the RUL label—is derived by subtracting the current cycle index from the total number of cycles recorded for that battery cell. This calculation yields a time-to-failure label consistent with the regression-based prediction framework used for the turbofan engine dataset.
For data preprocessing, a time window size of 20 is selected, which determines the length of the historical sequence used to predict the remaining life at each time step. The data batch size for training is set to 64, allowing for efficient mini-batch gradient updates. Prior to model training, the dataset is normalized to ensure the consistent scaling of the feature values and reduce the influence of unit differences among sensors.
Regarding the model configuration, the same deep learning architectures used in Section 5.2.2 are employed to facilitate the performance comparison. These include the GRU, LSTM, CNN, and Transformer models. The training process adopts the Adam optimizer, with a learning rate of 0.001 and a hidden-layer dimension of 128. Each model is trained for 64 epochs. This consistency in the model parameters ensures the comparability of performances across datasets and highlights the flexibility of the software in adapting to different data domains.
During training, the system automatically adjusts the model parameters through backpropagation and loss minimization to optimize the prediction performance. After training is completed, users can initiate the prediction phase on a test subset of the battery data. The predicted RUL values are visualized alongside ground-truth labels, enabling the intuitive evaluation of the model accuracy.
According to Table 7 and Table 8, the CNN model achieved the best prediction performance on the HNEI battery dataset, with a test MSE of 0.00014, an RMSE of 0.0108, an R2 of 0.9955, and a WMAPE of 0.0379. Overall, however, the error gaps among the CNN, LSTM, and GRU models were small, and all three fit the battery life data well, indicating strong stability and accuracy on this type of sequential data. The LSTM model reached an RMSE of 0.0161 with an R2 of 0.9963, and the GRU model an RMSE of 0.0146 with an R2 of 0.9973; in terms of MSE and RMSE, both were only slightly behind the CNN.
In contrast, the performance of the Transformer model on this dataset was relatively weak. Although its R2 still reached 0.9924, its MSE and RMSE were significantly higher than those of the other models. This might be because the battery sensor data exhibit strong linear correlations, whereas the Transformer is better suited to data with complex nonlinear relationships or long-range dependencies. In scenarios where the data features are relatively stable and the structure is relatively simple, its advantages cannot be fully exploited, and it may instead suffer a decline in generalization ability.
Taken together with the prediction results for the turbofan engine dataset in Section 5.2.1, these findings show that different models perform differently on different types of data, further verifying the practicality and scalability of the software system. Users can flexibly choose a model according to the specific data characteristics and task requirements to achieve the optimal prediction effect.

5.3. Model Training Efficiency and Resource Requirements

Considering that some users in practical applications may lack high-performance GPU devices, and to lower the hardware barrier for model configuration and enhance the system's adaptability, we carried out the model training experiments in an ordinary CPU environment to evaluate training efficiency and resource consumption under common hardware conditions.
The experimental hardware platform was configured with an Intel Core i7-12700H processor (Intel Corporation, Santa Clara, CA, USA) and 16 GB of memory, running Windows 11. On this basis, the training efficiencies of the four mainstream models (LSTM, GRU, CNN, and Transformer) were evaluated on the NASA turbofan engine and HNEI battery datasets. The main statistical indicators were the total time (in seconds) required for the complete training process of a single model and the peak memory usage (in KB).
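The measurement tooling is not prescribed by the system itself; one lightweight way to collect comparable figures in a pure-CPU run is sketched below using time.perf_counter and tracemalloc. Note that tracemalloc only tracks Python-level allocations, so the reported peak is indicative rather than a full process-memory measurement.

```python
import time
import tracemalloc

def profile_training(train_fn, *args, **kwargs):
    """Run a training function once; report wall time (s) and peak Python heap usage (KB)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = train_fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak_bytes / 1024.0

# Hypothetical usage:
# trained_model, seconds, peak_kb = profile_training(run_train, data)
```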
Table 9 presents the training time and memory usage of each model in the CPU environment for both datasets.
According to Table 9, the training time and memory usage of the Transformer model were significantly higher than those of the other models on both datasets, indicating a stronger dependence on computing resources. In contrast, the GRU and LSTM models required less training time and memory, giving them a clear lightweight advantage and making them suitable for resource-constrained environments. The CNN model was extremely efficient in terms of memory usage and is especially suitable for scenarios that require both prediction accuracy and deployment efficiency.
Furthermore, compared to the NASA dataset, the HNEI battery dataset generally required less training time for the same model, which is attributable to its smaller sample size, shorter time series, and correspondingly lower computational complexity.
Overall, the system can efficiently complete model training and inference without GPU acceleration, lowering the barrier to applying deep learning in engineering practice. Users can choose an appropriate model based on their hardware configuration and application scenario, balancing accuracy, efficiency, and resource consumption, which provides diverse solutions for predicting the remaining useful life of equipment.

5.4. Comparative Analysis of Model Performance Across Datasets

To comprehensively evaluate the adaptability of the designed system to different types of data and the generalization ability of the models, in this section, we compare the model prediction results on the NASA turbofan engine and HNEI battery datasets (Table 10). These two datasets represent the typical degradation processes of mechanical systems and electrochemical systems, respectively, with different data patterns and degradation mechanisms, and possess good representativeness for comparison.
The LSTM and CNN models demonstrated excellent generalization capabilities on both datasets. Among them, the CNN model had the most outstanding performance on the HNEI battery dataset, with a test MSE of 0.00014 and an RMSE of only 0.0108, indicating its significant advantage in handling data with stationary and periodic characteristics. The LSTM model achieved the best performance on the NASA dataset (MSE: 0.0136; RMSE: 0.1040), demonstrating a strong modeling ability for complex mechanical degradation processes.
The GRU model achieved relatively stable prediction results on both datasets, performing slightly worse than the LSTM and CNN models, but with a significant advantage in terms of its training efficiency (refer to Section 5.3). Therefore, it has practical application value in lightweight deployment scenarios.
In contrast, the Transformer model performed unsatisfactorily on both datasets, with especially high errors on the NASA dataset. This suggests that it has difficulty fully learning the degradation patterns in small-scale time series that lack rich contextual information, and it also incurs a large resource overhead. In industrial applications, the data scale and structure should therefore be weighed carefully before adopting it.
The above analysis verifies the performance differences of the models across data types and demonstrates the multi-scenario adaptability of the system. Users can flexibly select a model architecture based on the specific data characteristics of their devices; the system can adapt to the complex degradation trends of industrial equipment as well as handle fine-grained sequential data such as battery measurements, exhibiting good cross-domain prediction capability and scalability.

6. Conclusions

In this research, we discussed the crucial role of deep learning in the remaining life prediction of equipment, while also highlighting the considerable technical barriers and the time and effort required to repeatedly build models. In light of this, we developed deep learning-driven software for the remaining life prediction of equipment, which consists of four main modules: interface interaction, data processing, model selection, and visualization. Users can input training and testing data through the interface, select data features, and set parameters such as the time window size. The software provides various neural network models for the user's selection, such as the LSTM, CNN, GRU, and Transformer models, and also allows users to customize the training parameters. During the testing phase, users can input test data and choose a pretrained model, and the software will generate comparison plots between the predicted and actual values, as well as calculate the MSE evaluation metric.
Future work will focus on improving the user friendliness of the software and implementing automatic parameter tuning for the models. Efforts will also be directed towards further optimizing model performance to enhance generalization and accuracy. Once the implementation conditions are met, real users will be recruited to validate the low-code platform.

Author Contributions

Methodology, S.C., M.Y. and J.W.; Software, Y.L.; Investigation, Y.N.; Writing—original draft, Y.L.; Writing—review & editing, J.C.; Supervision, S.C.; Project administration, M.W. and B.Z.; Funding acquisition, S.C. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the GuangDong Basic and Applied Basic Research Foundation (grant nos. 2022A1515110545 and 2024A1515012727), the Shenzhen Science and Technology Program (grant no. RCBS20231211090600001), the LingChuang Research Project of the China National Nuclear Corporation (grant no. CNNC-LCKY-202263), the Shenzhen Pengcheng Peacock Plan Talent Project (no. 827-000885), the Shenzhen Science and Technology Innovation Commission Key Technical Project (JSGG20210713091539014), and the Shenzhen Science and Technology Program (ZDSYS20230626091501002).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

Authors Yuhan Lin, Ming Wang, and Bing Zhang were employed by China Nuclear Power Engineering Co., Ltd. (State Key Laboratory of Nuclear Power Safety Technology and Equipment). The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Appendix A

Key Classes, Interfaces, and Functions

In the implementation of the low-code DL-based RUL prediction software, a series of well-defined Python classes and interface functions form the backbone of its functionality. These are distributed across several key modules, including the user interface (streamlit_version.py), training/testing logic (main.py), and model construction and evaluation (model.py, train.py, and predict.py). Below is a summary of the critical components.
For the model training, the primary entry point is the run_train(data) function. This function first reads the dataset.yaml configuration file to determine the input features, output labels, model architecture, and training parameters, and it then splits the input dataset into training and validation sets and uses the custom_data_loader() function to construct PyTorch-compatible DataLoader objects. The model structure is generated dynamically through the get_model() function, and the entire training process is carried out using the train_and_evaluate() function. Upon completion, the function returns the trained model object along with the evaluation metrics, including the loss, mean-squared error (MSE), and root-mean-squared error (RMSE), which are later used for visualization and analysis.
The testing phase is handled by the run_test(data, trained_model_path, has_labels=True) function, which loads a trained model and performs predictions on the user-provided test data. This function chooses between different test routines depending on the presence or absence of labels in the dataset; Transformer-based models are tested using test_model2(), while other models are evaluated using the standard test_model() function. The function also includes mechanisms to verify whether the required columns exist in the test data, improving the robustness and usability.
The model generation process is encapsulated in the get_model(config) function, which dynamically returns an appropriate model object based on the parameters defined in the configuration file. The supported models include LSTM, CNN, and GRU models and various Transformer variants. Users can freely select among these options through the front-end interface, and the system constructs the model instance accordingly.
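The sketch below illustrates the dispatch idea behind get_model() for the recurrent variants; the configuration keys (model_type, input_features, hidden_size) are assumptions for the example, and the CNN/Transformer branches are omitted for brevity.

```python
import torch.nn as nn

class RNNRegressor(nn.Module):
    """Generic recurrent regressor used to illustrate LSTM/GRU dispatch."""
    def __init__(self, rnn_cls, n_features: int, hidden_size: int = 128):
        super().__init__()
        self.rnn = rnn_cls(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.rnn(x)
        return self.head(out[:, -1, :])

def get_model_sketch(config: dict) -> nn.Module:
    """Simplified dispatch: build a model from the parsed YAML configuration."""
    n_features = len(config["input_features"])
    hidden = config.get("hidden_size", 128)
    name = config["model_type"]
    if name == "LSTM":
        return RNNRegressor(nn.LSTM, n_features, hidden)
    if name == "GRU":
        return RNNRegressor(nn.GRU, n_features, hidden)
    raise ValueError(f"Model type {name!r} is not covered by this sketch")
```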
For data loading, the custom_data_loader() and former_data_loader() functions are responsible for preprocessing raw data. These functions perform feature–label separation, normalization, and sliding-window segmentation to generate training samples based on the user’s column selections. The former_data_loader() function is specifically designed to support Transformer-based models by formatting the input data to match the encoder’s expected structure.
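After windowing, the samples must be wrapped into PyTorch DataLoader objects; a minimal sketch of that step (covering only the non-Transformer path) is shown below.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

def to_data_loader(x_windows, y_rul, batch_size: int = 64, shuffle: bool = True) -> DataLoader:
    """Wrap windowed arrays into a DataLoader for mini-batch training (illustrative)."""
    dataset = TensorDataset(
        torch.tensor(x_windows, dtype=torch.float32),
        torch.tensor(y_rul, dtype=torch.float32).unsqueeze(1),
    )
    return DataLoader(dataset, batch_size=batch_size, shuffle=shuffle)
```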
The training procedure itself is executed via the train_and_evaluate() function, which receives the model, data loaders, loss function, and optimizer and orchestrates the full training loop. This function tracks the loss values at each epoch and returns results for the subsequent visualization and evaluation.
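A compact training loop in the same spirit is sketched below; it is a simplified stand-in for train_and_evaluate(), not the released implementation, and records only the validation loss per epoch.

```python
import torch

def train_and_evaluate_sketch(model, train_loader, val_loader, criterion, optimizer, epochs: int = 64):
    """Minimal training loop: optimize on the training set, track validation loss per epoch."""
    history = []
    for _ in range(epochs):
        model.train()
        for xb, yb in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(xb), yb)
            loss.backward()
            optimizer.step()

        model.eval()
        with torch.no_grad():
            val_loss = sum(criterion(model(xb), yb).item() for xb, yb in val_loader)
        history.append(val_loss / max(len(val_loader), 1))
    return model, history
```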
The front-end interface is implemented using Streamlit, with the main logic contained in the streamlit_version.py file. Through this interface, users can upload datasets, select input, output, timestamp, and identifier columns and configure key parameters, such as the sequence length, batch size, model type, optimizer, loss function, and learning rate. All user inputs are automatically written to the dataset.yaml file for backend processing. Once training is complete, the model structure is displayed, and Matplotlib is used to render visualizations, such as training loss curves and prediction results, enabling users to better understand the model behavior and performance.
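The following fragment sketches how such a Streamlit front end can collect user selections and persist them to dataset.yaml; the widget set and configuration keys are illustrative and do not reproduce streamlit_version.py verbatim.

```python
import streamlit as st
import pandas as pd
import yaml

st.title("Low-code RUL prediction")  # illustrative front-end sketch

uploaded = st.file_uploader("Upload training data (CSV)", type="csv")
if uploaded is not None:
    df = pd.read_csv(uploaded)
    features = st.multiselect("Input feature columns", df.columns.tolist())
    target = st.selectbox("Target (RUL) column", df.columns.tolist())
    window = st.number_input("Time window size", min_value=1, value=20)
    batch = st.number_input("Batch size", min_value=1, value=64)
    model_type = st.selectbox("Model", ["LSTM", "GRU", "CNN", "Transformer"])

    if st.button("Start training"):
        config = {
            "input_features": features,
            "target": target,
            "window": int(window),
            "batch_size": int(batch),
            "model_type": model_type,
        }
        with open("dataset.yaml", "w", encoding="utf-8") as f:
            yaml.safe_dump(config, f)      # the backend reads this file
        st.success("Configuration written to dataset.yaml")
```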
Configuration management is handled by the configs.read_configs() function, which ensures that all the training and testing parameters are consistently sourced from the same YAML file. This centralized parameter control improves maintainability and eliminates inconsistencies. Through the coordinated operation of these core classes and interfaces, the software achieves a complete closed-loop workflow from data ingestion to prediction, and it ensures the performance of deep learning models, while significantly lowering the operational complexity, offering users an efficient and user-friendly RUL prediction platform.

Figure 1. Diagram of the LSTM framework.
Figure 2. Diagram of the GRU framework.
Figure 3. Overall software architecture.
Figure 4. Architecture diagram of the software training phase.
Figure 5. Architecture diagram of the software testing phase.
Figure 6. Initial software interface.
Figure 7. Determination of data columns.
Figure 8. Data batch and time window size determination.
Figure 9. Model selection.
Table 1. A comparison of the three platform frameworks.
Platform/Tool | Advantages | Disadvantages
KNIME | Strong plugins and functionality, supports multiple data sources, AutoML integration | Steep learning curve, outdated interface, performance bottlenecks with large data
Orange | Simple to use, open-source, supports various algorithms | Less powerful than KNIME, limited data processing ability, fewer plugins
AutoML (general) | Automated workflows, quick prototyping, reduces human intervention | Low flexibility, black-box issue, high resource consumption
Table 2. Specific information for each subset.
Dataset | Number of Training Trajectories | Number of Test Trajectories | Fault Pattern
FD001 | 100 | 100 | 1 (HPC degradation)
FD002 | 260 | 259 | 1 (HPC degradation)
FD003 | 100 | 100 | 2 (HPC degradation, fan degradation)
FD004 | 248 | 249 | 2 (HPC degradation, fan degradation)
Table 3. Detailed content of the NASA turbofan engine dataset.
Unit number | Indicates the engine number.
Time (period) | Indicates the number of cycles for which the engine has been running.
Operational settings | Includes three operational setting variables: flight altitude, throttle resolver angle (TRA), and Mach number.
Sensor measurements | Measurements from 21 sensors, such as the total temperature at the fan inlet (T2) and the total temperature at the low-pressure compressor outlet (T24).
Table 4. Detailed content of the HNEI battery dataset.
cycle_index | The number of charge–discharge cycles the battery has undergone, reflecting its aging process.
discharge_time | The time (s) it takes for the battery to go from full charge to the set cut-off voltage in a complete discharge cycle; an indicator of health.
decrement_3.6–3.4 V | The time (s) taken for the voltage to drop from 3.6 V to 3.4 V; closely related to the voltage decay rate in this interval and a sensitive feature of the degradation process.
max_discharge_voltage | The highest voltage during discharge (usually measured at the beginning of discharge); can be used to track the decline in battery capacity.
min_charge_voltage | The minimum voltage during charging (usually measured at the beginning of charging); reflects the change in the battery state to some extent.
time_at_4.15 V | The duration (s) for which the battery voltage remains around 4.15 V; reflects the constant-voltage phase and indirectly reveals the internal resistance and capacity characteristics.
constant_current_time (TCC) | The time (s) during which the current is held constant during discharge; corresponds to changes in the internal electrical characteristics and is often associated with the aging state of the battery.
charging_time | The total time (s) of a full charge cycle; its variation reflects the remaining available capacity.
RUL (Remaining Useful Life) | The prediction target of the model.
Table 5. Validation results of different algorithms on the NASA dataset.
Model | Numerical Visualization of Evaluation Indicators
LSTM | Processes 13 02366 i001
GRU | Processes 13 02366 i002
CNN | Processes 13 02366 i003
Transformer | Processes 13 02366 i004
Table 6. Validation results of different algorithms on the NASA dataset.
Model | Results of Prediction | Test MSE | Test RMSE | R2 | WMAPE
LSTM | Processes 13 02366 i005 | 0.0136 | 0.1040 | 0.7944 | 0.2052
GRU | Processes 13 02366 i006 | 0.0299 | 0.1569 | 0.7596 | 0.2306
CNN | Processes 13 02366 i007 | 0.0209 | 0.1332 | 0.8269 | 0.2051
Transformer | Processes 13 02366 i008 | 0.1485 | 0.3851 | 0.9606 | 0.098259
Table 7. Validation results of different algorithms on the HNEI dataset.
Model | Numerical Visualization of Evaluation Indicators
LSTM | Processes 13 02366 i009
GRU | Processes 13 02366 i010
CNN | Processes 13 02366 i011
Transformer | Processes 13 02366 i012
Table 8. Validation results of different algorithms on the HNEI dataset.
Model | Results of Prediction | Test MSE | Test RMSE | R2 | WMAPE
LSTM | Processes 13 02366 i013 | 0.00029 | 0.0161 | 0.9963 | 0.0348
GRU | Processes 13 02366 i014 | 0.00022 | 0.0146 | 0.9973 | 0.0244
CNN | Processes 13 02366 i015 | 0.00014 | 0.0108 | 0.9955 | 0.0379
Transformer | Processes 13 02366 i016 | 0.00167 | 0.0389 | 0.9924 | 0.030296
Table 9. The training time and memory usage of each model in the CPU environment for both datasets.
Model | Dataset | Training Time (s) | Memory Usage (KB)
LSTM | NASA | 2325.80 | 1270
GRU | NASA | 1023.85 | 983
CNN | NASA | 2232.86 | 147
Transformer | NASA | 9305.78 | 6690
LSTM | HNEI | 363.83 | 1315
GRU | HNEI | 549.52 | 987
CNN | HNEI | 312.37 | 149
Transformer | HNEI | 2060.61 | 2997
Table 10. The model predictions on the NASA turbofan engine and HNEI battery datasets.
Model | NASA Test MSE | NASA Test RMSE | HNEI Test MSE | HNEI Test RMSE
GRU | 0.0299 | 0.1569 | 0.00029 | 0.0161
LSTM | 0.0136 | 0.1040 | 0.00022 | 0.0146
CNN | 0.0209 | 0.1332 | 0.00014 | 0.0108
Transformer | 0.1485 | 0.3851 | 0.00167 | 0.0389