Predictive Maintenance in the Automotive Sector: A Literature Review

Abstract: With the rapid advancement of sensor and network technology, there has been a notable increase in the availability of condition-monitoring data such as vibration, temperature, pressure, voltage, and other electrical and mechanical parameters. With the introduction of big data, it is possible to prevent potential failures and estimate the remaining useful life of equipment by developing advanced mathematical models and artificial intelligence (AI) techniques. These approaches make it possible to take maintenance actions quickly and appropriately. In this scenario, this paper presents a systematic literature review of statistical inference approaches, stochastic methods, and AI techniques for predictive maintenance in the automotive sector. It provides a summary of these approaches, their main results, challenges, and opportunities, and it supports new research work on vehicle predictive maintenance.


Introduction
Thanks to new digital technologies, it is possible to interconnect, in industrial processes, production machines with their software. This technological progress has several advantages, including accelerating processes related to digital data collection, optimizing production process times, producing higher quality goods at lower costs, and having all the information necessary to implement strategic decisions that support the business [1]. In the production field, the connection between physical systems (in particular machines) and IT systems is the core of Industry 4.0 (I4.0). Rüßmann et al. [2] describe the major technological trends that are the building blocks of I4.0 and explore their potential technical and economic benefits for manufacturers and suppliers of production equipment. In the context of I4.0, the collection and comprehensive evaluation of data from different sources (equipment and production systems, as well as business and customer management systems) will become the standard to support decision-making in real time.
The contribution of I4.0, particularly relevant in the automotive sector, has led to a change in the maintenance paradigm [3,4], both in terms of vehicle production [5,6] and subsequent maintenance [7][8][9][10]. We can say that the automotive world is one of the most advanced in adopting the Internet of Things (IoT) [11]. This innovative technology allows the creation of a wide range of features and services that would have seemed impossible to develop only a few years ago. As an example, we can consider all the applications related to the management and maintenance of vehicles. In this scenario, sensors and IoT management systems minimize the inconvenience and time consumption related to maintenance [12]. Thanks to the computing capacity provided by edge computing systems, it is possible to analyze vehicle parameters and alert operators to any critical issues. This makes it possible to promptly notify, in advance, the driver or vehicle manager that the vehicle may require some intervention, avoiding the occurrence of serious damage and preserving the safety of those on board. To understand the benefits of predictive maintenance, it is useful to recall traditional management techniques. Industrial and process plants typically employ three types of maintenance management [13] (see Figure 1):
• Run-to-failure (RtF) or reactive maintenance, where maintenance interventions are performed only after the occurrence of failures. This approach is common when equipment failure does not significantly affect operations or productivity.
• Planned preventive maintenance (PvM), also called time-based or scheduled maintenance, which involves taking the necessary precautions and actions to reduce the likelihood of equipment failure and to prevent accidents or failures before they occur. It is performed regularly while the equipment is still running so that it does not fail unexpectedly. In terms of complexity, this maintenance strategy lies between run-to-failure and predictive maintenance.
• Predictive maintenance (PdM), which employs condition-monitoring technology to measure equipment performance through IoT systems that connect electronic devices to mechanical and digital machines and collect a significant amount of data. Data are collected over time to monitor the state of the equipment and to construct models that can help prevent failures.
In the next section, we describe a cost model for continuous-time maintenance which shows that the predictive approach allows one to optimize costs.

The Maintenance Costs
In choosing the most suitable maintenance strategy, the costs involved must be taken into consideration. An effective solution certainly implies a reduction of expenses and an increase in productivity. The cost model varies with the maintenance strategy applied. For the reactive maintenance strategy, the repair action is performed only once the equipment has stopped working, so there is only the corrective replacement cost (C_c). For the preventive maintenance strategy, sequential maintenance actions are scheduled, and the overall cost often includes costs related to preventive replacement (C_p), inspection costs (C_i), costs related to unit downtime (C_d), and the costs associated with corrective replacement (C_c). In particular, in [14], Grall et al. propose a cost model for continuous-time predictive maintenance that aims to find an optimal prevention threshold. The objective function to be minimized, EC_∞, represents the total expected cost for long-term maintenance. The cumulative maintenance cost can be expressed as

C(t) = C_i N_i(t) + C_p N_p(t) + C_c N_c(t) + C_d d(t),

where N_i(t), N_p(t), and N_c(t) represent, respectively, the number of inspections, preventive repairs, and corrective repairs carried out in the time interval [0, t], while d(t) represents the duration of the machine inactivity time in [0, t]. Therefore, the cost function to be minimized is defined as follows:

EC_∞ = lim_{t→∞} E[C(t)]/t,

where E[C(t)] is the expected value of the maintenance cost. For the predictive maintenance strategy, maintenance actions are performed according to the results of the failure prediction, so the cost model is usually associated with the estimation of the remaining useful life (RUL) and depends on the specific system or equipment [15]. As pointed out in [16], reactive maintenance has the lowest prevention cost, while preventive maintenance has the lowest repair cost due to well-planned machine downtime.
Predictive maintenance, instead, achieves the best compromise between repair cost and prevention cost (see Figure 2). Ideally, this maintenance strategy provides for a lower frequency of maintenance and prevents unexpected repair costs without incurring the costs associated with excessive prevention.
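As a concrete illustration, the cumulative cost C(t) and the long-run cost rate can be evaluated from observed maintenance counts. The following Python sketch uses purely hypothetical cost figures and counts (the function names and all values are illustrative assumptions, not taken from [14]):

```python
# Hypothetical cost figures: Ci inspection, Cp preventive replacement,
# Cc corrective replacement, Cd downtime cost per unit time.
def cumulative_cost(Ni, Np, Nc, d, Ci=10.0, Cp=100.0, Cc=500.0, Cd=50.0):
    """Cumulative maintenance cost C(t) = Ci*Ni + Cp*Np + Cc*Nc + Cd*d,
    with Ni, Np, Nc the counts of inspections, preventive repairs, and
    corrective repairs in [0, t], and d the accumulated downtime."""
    return Ci * Ni + Cp * Np + Cc * Nc + Cd * d

def long_run_cost_rate(Ni, Np, Nc, d, t, **costs):
    """Empirical estimate of EC_inf: cost accumulated in [0, t] per unit time."""
    return cumulative_cost(Ni, Np, Nc, d, **costs) / t

# Example: 12 inspections, 2 preventive repairs, 1 corrective repair,
# and 3 time units of downtime over a horizon of 100 time units.
rate = long_run_cost_rate(12, 2, 1, 3, t=100.0)  # -> 9.7 per unit time
```

Comparing this rate across candidate prevention thresholds is, in essence, how the optimal threshold of the model in [14] is selected.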
As shown in the US Department of Energy report [17], the predictive approach can save about 8-12% compared with planned preventive maintenance. Predictive maintenance increases asset lifespan and decreases equipment downtime and the costs of spare parts and labor. Moreover, this approach improves worker safety, increases plant reliability, and optimizes the equipment's operation, leading to immediate energy savings. On the other hand, it requires initial capital costs for acquiring and setting up diagnostic equipment. In addition, it needs investment in employee training to use the predictive maintenance technology adopted by the company effectively. Generally speaking, the advantages of this approach outweigh the disadvantages. Surveys on average industrial savings showed that companies eliminated 70-75% of asset breakdowns, reduced maintenance costs by 25-30%, and increased production by 20-25% after implementing a predictive maintenance program. The return on investment (ROI) averaged 10 times, making it a sound investment.
Throughout the paper, we thus focus our analysis on predictive maintenance in the automotive sector.

Predictive Maintenance
Predictive maintenance is a very complex process; indeed, a real-time view of the state of health and reliability of industrial machines requires collecting data from the different sensors of the system. This maintenance strategy is developed in four phases [18]: (1) Collecting data from different sensors of the system.
Fault diagnostics and prognostics are two research topics that have attracted attention from both academia and industry. The purpose of diagnostics is to detect, isolate, and identify a fault that has occurred. There are usually two crucial steps in fault diagnostics: (1) Feature extraction and selection: in this phase, the discriminating features of the raw data are extracted and selected. (2) Fault classification: the main task of this phase is to classify the different faults and identify the causes of the failure using the selected discriminating features.
Prognostics is based on observing the variation of a system's operating parameters during its normal operating cycle. It makes it possible to predict a failure before it occurs and to estimate the equipment's RUL. It is generally performed in three key steps: (1) Construction of the health indicator (HI): HIs are indexes constructed to represent the health of the equipment. The information provided by the implemented diagnostic and prognostic methods can support the maintenance decision-making process. Maintenance personnel can perform maintenance actions in advance to effectively prevent equipment failure. A schematic representation of predictive maintenance is shown in Figure 3.
The process of gathering and interpreting data acquired from the physical world is made possible by artificial intelligence (AI) techniques and machine learning algorithms [20,21]. Sajid et al. [22] identify several methods for predictive maintenance:
• Physical model approach, which uses a physics-based or mathematical model of the system to assess the degradation of components. The accuracy of this approach relies on the model, and statistical methods are also used to validate it [23].
• Knowledge-based approach, which relies on knowledge or expertise about the system to reduce its complexity. Expert systems and fuzzy logic belong to this category [24].
• Data-driven approach, which exploits computational power and a large amount of data. This class is divided into three types: statistical models [25], stochastic models [26], and machine learning models.
• Digital twin approach, which combines data and models and creates a link between the physical world and the digital one [27].
This work aims to present a literature review of approaches that have emerged in recent years as powerful tools for predictive maintenance in the context of the transportation and vehicle industry. Data-driven machine learning algorithms require an effective analysis of a huge amount of historical and real-time data from multiple streams (sensors and computer systems [28]). Therefore, data preprocessing has a significant impact on the performance of a machine learning algorithm [19,29,30].
In the following sections, we describe these methods and provide an overview of recent research contributions in predictive maintenance. This paper is organized as follows. Section 2 introduces the physics-based models presented in the literature. Section 3 describes the so-called knowledge-based models, which simulate the skills and behavior of experts. Section 4 describes the most common traditional machine learning techniques and the main deep learning techniques. The use of digital twin technology for predictive maintenance is illustrated in Section 5, and Section 6 compares the different approaches.

Physics-Based Models
A first approach to fault prediction is the formulation of physics-based models characterized by a physical description of the machine degradation process. Nowadays, even though data-based methods are mainly implemented, physics-based models may be the more appropriate choice in some areas (including the monitoring of offshore turbines and of maritime and military systems) [31]. From a mathematical point of view, this approach correlates the phenomenon of wear with the useful life of components. Among the variables considered in the formulation of the physical-mathematical model are various physical quantities describing the thermal, mechanical, chemical, and electrical nature of the analyzed component. Describing the impact they have on the health of machinery is a rather tricky task, because this type of solution requires deep domain knowledge. Once the model is formulated, sensors must be available that provide, as inputs, the values assumed by the quantities considered relevant in the analysis and modeling phase. The main advantage of this type of approach is that its outputs can be described precisely, because it is based on a physical description of the process. As for accuracy, it is strongly correlated with the quality of the analysis and modeling by the domain experts. On the other hand, the negative aspects are the complexity, the high cost of implementation, and the high specificity to the system, which limit the possibility of reuse and extension. We now recall some recent results in the literature. In [23], the authors propose a simplified physics-based model to describe a compact angular head (roller hemming) RHEvo, a tool for hemming mainly used in production lines and consisting of mechanical parts such as springs, rollers, skates, and bearings.
After developing the model, the authors use a neural network to estimate the current state of the internal components. From the physical analysis of the internal springs, one can observe that aging affects the elastic coefficient due to the fatigue degradation processes. Finally, an estimate of the internal spring's RUL is calculated with a stochastic model. One can use the proposed approach for various devices that use springs; in fact, the coil springs of traction and compression have numerous uses, particularly the suspension systems of automobiles, the recoil mechanisms of weapons, and the shut-off valves in engines.
In [31], the authors present a series of methods and tools that improve predictive maintenance. They examine specific cases to demonstrate how the developed methods and tools could be implemented in different fields. In particular, the authors state that vibration-based machinery health monitoring techniques can help detect damage, diagnose the health of a system, and predict the remaining life of the machinery. Furthermore, they emphasize the importance of in-depth knowledge of system dynamics in the development of well-performing algorithms. An approach to developing physics-based models is presented, underlining the need to understand the physics of the faults in order to predict the life of systems under highly variable operating conditions. Finally, they present a decision support tool that helps users choose the most suitable approach to predictive maintenance and the most appropriate condition-monitoring technique.

Knowledge-Based Models
Domain experts are also relied upon to create knowledge-based models, since this approach aims to simulate the skills and behavior of the experts. Therefore, after a formalization of knowledge, it is possible to reproduce it and apply it automatically. Expert systems are programs that use experts' knowledge in a given field and apply inference mechanisms to emulate their reasoning and provide support and practical solutions. Among the most common approaches for implementing this type of model are rule-based systems and fuzzy logic. Rule-based systems have the advantages of simplicity of implementation and interpretability, but they can perform poorly, especially when one needs to express complicated conditions or when the number of rules is very high. Similarly, fuzzy logic allows describing the system state by imitating human decision-making processes, making the formalization process and the description of the model more straightforward and intuitive. For expert systems, as for physical models, the results are highly dependent on the quality and level of accuracy achieved by the model and are highly specific.
In the literature, this approach is often applied in combination with data-driven methods. For example, Zhou et al. [24] tackle the problem of real-time, onboard electric vehicle fault diagnosis, which is hampered by the limited computing power and storage capacity available on board. Combining a neural network with fuzzy logic, they propose a low-complexity onboard fault diagnosis method to monitor the vehicle status and give early warning of accidents. The authors collect real data relating to the components of three different electric vehicles and propose a neural-network-based training method to define the correlation between data types and fault types. Subsequently, using this correlation, a classification method based on fuzzy logic is introduced, making it possible to evaluate the vehicle state and prevent anomalies and malfunctions. The simulation results indicate that the onboard method can correctly diagnose vehicle faults with an accuracy of 88%.

Data-Driven Methods
As vehicle features have rapidly grown in complexity, traditional rule-based diagnostic systems have become very limited. Therefore, more sophisticated data-driven approaches need to be investigated in the search for more efficient solutions. In this section, we present the state of the art on data-driven approaches recently introduced for predictive maintenance in the transport and vehicle industry. They can be classified into the following:
• Statistical approaches.
• Stochastic approaches.
• Machine learning techniques.

Statistical and Stochastic Approaches
Statistical and stochastic approaches make it possible to deal with complex systems whose evolution over time is not easily predictable. As we will see in this section, the application of statistical methods for the prediction, estimation, and optimization of the probability of survival and the average life span of a system can be advantageous in some specific cases related to the operation of mechanical components such as the battery of electric vehicles or spur gears of a car. Below, some recent results in the literature are briefly described.
In [32], the authors use statistical analysis to diagnose battery faults. This method is efficient and accurate and can predict faults in advance. They use standard big-data statistical analysis methods to determine the probability of error on the voltage of the battery cell terminals, and they obtain a 3σ multilevel screening diagnosis of errors based on the Gaussian distribution. The authors also apply a neural network algorithm and combine the fault diagnosis results for a specific car with the statistical fitting of large samples. Moreover, they build a comprehensive method for diagnosing battery faults and perform a corresponding analysis between the statistical results and the actual breakdowns of the vehicle.
In [33], the authors focus on fault diagnosis of series-connected lithium-ion batteries. In particular, they use the mean squared error (MSE) to quantify the mean squared discrepancy between the experimental data and the data obtained through simulation and to describe the voltage state of each cell. This also provides a preliminary assessment of the voltage. Abnormal voltage values are then analyzed using the Z-score parameter, and in this way, one establishes whether a fault has occurred.
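A minimal sketch of this kind of 3σ/Z-score screening on cell voltages follows; the data are synthetic, and the thresholds and values are illustrative, not those used in [32,33]:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical history: 500 terminal-voltage readings (V) of a cell
# under normal operation, followed by two injected faulty readings.
normal = rng.normal(loc=3.70, scale=0.01, size=500)
readings = np.concatenate([normal, [3.55, 3.52]])

mu, sigma = normal.mean(), normal.std(ddof=1)

# 3-sigma screening: the Z-score measures how many standard deviations
# a reading deviates from normal operation; flag readings outside the
# band [mu - 3*sigma, mu + 3*sigma].
z = (readings - mu) / sigma
faulty_idx = np.flatnonzero(np.abs(z) > 3.0)  # includes indices 500 and 501
```

In practice, the reference statistics mu and sigma would be estimated per cell from fault-free operating data rather than assumed known.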
In [25], Ashok Raj et al. show that one of the approaches used to estimate the severity of a failure in spur gears is the analysis of statistical parameters extracted from the vibration signals. The parameters used in this paper are the fourth-order normalized statistical moment and the kurtosis index, while the technique applied for the extraction of these signals is empirical mode decomposition (EMD). This approach decomposes the vibration signal of the machine into several intrinsic mode functions (IMFs) to acquire its local characteristics both in the frequency domain and in the time domain. Finally, local characteristics are identified from the information obtained. The results reveal that the EMD-based vibration technique is the most suitable for fault detection and severity estimation. It has also been shown that using the statistical parameters obtained from the IMFs is very effective for early fault detection compared with using the parameters extracted from the original raw signals. This approach can prove to be a powerful tool for identifying various developing defects in a spur gear system.
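The role of the fourth-order normalized moment (kurtosis) as an impulsiveness indicator can be sketched as follows. The vibration signals below are synthetic assumptions, and the EMD step itself (typically performed with a dedicated library) is omitted:

```python
import numpy as np

def kurtosis(x):
    """Fourth-order normalized statistical moment: fourth central
    moment divided by the squared variance. Impulsive gear faults
    fatten the tails of the amplitude distribution and inflate it."""
    x = np.asarray(x, dtype=float)
    m = x.mean()
    s2 = ((x - m) ** 2).mean()
    return ((x - m) ** 4).mean() / s2 ** 2

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 2000)

# Hypothetical healthy vibration: a gear-mesh tone plus broadband noise.
healthy = np.sin(2 * np.pi * 50 * t) + 0.3 * rng.standard_normal(t.size)

# Hypothetical faulty vibration: same signal plus periodic impacts,
# as produced by a localized tooth defect once per revolution.
faulty = healthy.copy()
faulty[::200] += 5.0

k_h, k_f = kurtosis(healthy), kurtosis(faulty)
# k_f exceeds k_h: the impacts raise the fourth moment much faster
# than the variance.
```

In the cited approach, this feature would be computed on each IMF produced by EMD rather than on the raw signal.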
Garay et al. [6] describe degradation processes with the following expression:

X(t) = h(t, ε(t)),

where the stochastic process X(t) is a function h(·) of time, depending on an error term ε(t) that characterizes the variability and uncertainty of the parameters involved in the process.
Shen et al. [26] propose an innovative stochastic model based on the two-stage Wiener process to describe lithium-ion batteries' degradation behavior, taking into account different degradation phases.
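A two-stage Wiener process of the kind used in [26] can be sketched as a Brownian motion whose drift changes at a change point; all parameter values below are illustrative assumptions, not those estimated in [26]:

```python
import numpy as np

rng = np.random.default_rng(42)

def two_stage_wiener(n_steps=1000, dt=0.01, mu1=0.5, mu2=2.0,
                     sigma=0.2, change_point=500):
    """Simulate a degradation path X(t) with drift mu1 before the
    change point and mu2 after it, plus Brownian noise:
    dX = mu dt + sigma dW. Euler discretization with step dt."""
    drift = np.where(np.arange(n_steps) < change_point, mu1, mu2)
    increments = drift * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_steps)
    return np.concatenate([[0.0], np.cumsum(increments)])

path = two_stage_wiener()
# The mean slope roughly quadruples after the change point, mimicking
# the accelerated capacity fade of an aging lithium-ion cell.
```

The RUL then follows as the first time the simulated path crosses a failure threshold, which can be estimated by Monte Carlo over many such paths.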
Recent developments in industrial systems provide us with a large amount of time series data from sensors, logs, system settings, physical measurements, etc. These data provide insights into complex systems and can be used to detect anomalies. However, the characteristics of these time series data, such as high dimensionality and complex dependencies between variables, pose great challenges to existing anomaly detection algorithms. Therefore, when analyzing complex systems in which variables exhibit strong correlations, multivariate statistical methodologies provide more precise results than univariate techniques. Principal component analysis (PCA) is one of the most effective multivariate statistical techniques and finds applications in process monitoring and control, fault detection and diagnosis, and sensor validation in various process industries. In exploratory data analysis, PCA can reduce data dimensionality and, consequently, computation time. The use of this technique for fault detection was discussed in [34], where a PCA-based fault amplification methodology was developed for estimating the fault propagation path in industrial systems. This methodology can be applied to small systems with limited variables; however, it becomes more complex and time-consuming as the number of variables increases.
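A minimal numpy-only sketch of PCA-based fault detection via the squared prediction error (the Q statistic) is given below; the data, latent structure, and control limit are synthetic assumptions, not the methodology of [34]:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical normal-operation data: 300 samples of 5 correlated
# process variables driven by 2 latent factors plus small noise.
latent = rng.standard_normal((300, 2))
mixing = rng.standard_normal((2, 5))
X = latent @ mixing + 0.05 * rng.standard_normal((300, 5))

# PCA via SVD of the mean-centered data; keep 2 components.
mu = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mu, full_matrices=False)
P = Vt[:2].T  # loadings of the retained principal subspace

def spe(samples):
    """Squared prediction error (Q statistic): residual energy left
    after projecting samples onto the retained principal subspace."""
    R = (samples - mu) - (samples - mu) @ P @ P.T
    return (R ** 2).sum(axis=1)

limit = np.percentile(spe(X), 99)  # empirical 99% control limit

# A faulty sample: one variable drifts, breaking the normal
# correlation structure, so its residual jumps above the limit.
x_fault = X[:1].copy()
x_fault[0, 3] += 2.0
fault_detected = spe(x_fault)[0] > limit
```

In industrial practice the control limit is usually derived from a theoretical distribution of the Q statistic rather than a raw percentile, but the detection logic is the same.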
Generally speaking, a fault in one process variable can influence the errors in the other variables involved, making fault detection more difficult and time-consuming. In this case, another useful tool is the Granger causality (GC) algorithm, which can estimate causal relationships among variables and help detect the root cause of faults. GC algorithms find application in the process [35] and energy industries [36,37] due to their simple implementation and the reliable interpretation of their empirical findings.
For instance, Bhat et al. [38] model Granger causal relationships between pairs of sensor data streams to detect changes in their dependencies. They compare the method on simulated signals with the Pearson correlation and show that the method sufficiently handles noise and lags in the signals and provides appreciable dependency detection. Nevertheless, the results show that the method is also prone to detecting false positives. Therefore, this method can be used as a weak detection of faults, but other methods, such as the use of a structural model, are required to detect and diagnose faults reliably.
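The core idea of Granger causality (does the past of x improve the prediction of y beyond y's own past?) can be sketched with plain least squares. This is a simplified variance-ratio score, not the full F-test machinery used in the cited works:

```python
import numpy as np

rng = np.random.default_rng(3)

def granger_score(x, y, lag=2):
    """Ratio of residual sums of squares of two least-squares models
    of y: one using only y's own lags (restricted), one also using
    x's lags (full). A ratio well above 1 suggests x Granger-causes y."""
    n = len(y)
    Y = y[lag:]
    A_r = np.column_stack([np.ones(n - lag)] +
                          [y[lag - k:n - k] for k in range(1, lag + 1)])
    A_f = np.column_stack([A_r] +
                          [x[lag - k:n - k] for k in range(1, lag + 1)])
    def rss(A):
        beta, *_ = np.linalg.lstsq(A, Y, rcond=None)
        return np.sum((Y - A @ beta) ** 2)
    return rss(A_r) / rss(A_f)

# Hypothetical sensor pair: y depends on x delayed by one step.
x = rng.standard_normal(500)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.3 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.standard_normal()

score = granger_score(x, y)  # noticeably above 1
```

The reverse score, granger_score(y, x), stays near 1 on this data, which is how directionality of the dependency is read off.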
Qiu et al. [39] propose a novel method based on Granger causality to detect anomalies related to dependency changes in multivariate time series. They also investigate several stochastic and parallel optimization algorithms to speed up their approach. The empirical results verify the effectiveness of this method in terms of accuracy and persistence. In [40], Kordes et al. discuss an automotive application of the Granger causality algorithm. The approach is applied to model all possible causal relationships between sensor signals recorded directly from CAN bus in-vehicle networks (IVNs), which connect electronic control units (ECUs). Most of the communication on the IVNs directly affects the comfort or even the safety of the driver; therefore, it is necessary to monitor these systems to find the cause and effect of a fault. In the case of mechanical wear, automatic fault detection was achieved with positive results both in a simulated situation and on real data. According to the results presented in [40], modeling causal relationships between time series applies very well to the sensor signals of IVNs.
Finally, as pointed out in [34], combining PCA models with GC algorithms enables more efficient process monitoring.

Machine Learning Algorithms
Machine learning is a subset of artificial intelligence (AI) and deals with creating systems that learn and improve performance based on the data they use. There are four types of machine learning algorithms currently used: supervised, semi-supervised, unsupervised, and reinforcement learning algorithms. The difference between these four types of algorithms is defined by how each algorithm learns the data to make predictions.
In unsupervised learning, the data is not labeled, and the model is formulated so that it identifies patterns and structures in the data on its own. In semi-supervised learning, the input data is a combination of labeled and unlabeled data.
In supervised learning, the ML model uses labeled training data. In other words, input variables and the corresponding outputs are supplied to the machine so that it learns a mapping from inputs to outputs, often adjusting the model iteratively. This process is repeated until the model achieves the desired level of accuracy on the training data and can correctly predict outputs for new data. Supervised learning is probably the type of machine learning most frequently used in practical applications.
Finally, reinforcement learning enables the system to learn, by trial and error within a set of rules, which actions are most beneficial. Concerning applications in the automotive industry, reinforcement learning has been fundamental to the development of self-driving vehicles, which learn to recognize the surrounding environment (from the data collected by GPS, sensors, etc.) and to adapt their "behavior" to the specific situations they face.
Machine learning algorithms require an effective analysis of a considerable amount of historical data and real-time data extrapolated through multiple streams (sensors and IT systems) [28]. Therefore, the data preprocessing phase has a significant impact on the performance of machine learning algorithms [19,29,30].
This section explores the traditional machine learning approaches and more advanced deep learning methods, which are usually employed for predictive maintenance in the automotive domain.

Traditional Algorithms
This section explores the traditional machine learning algorithms typically employed for predictive maintenance in the automotive industry. For comparison, recall that the term deep learning (DL) refers to a subset of artificial intelligence and machine learning that uses multilayer artificial neural networks to estimate a better mapping function between specific inputs and outputs; DL algorithms require huge amounts of data to achieve high accuracy and have been widely used in many automotive applications, such as autonomous driving and manufacturing [41]. The most widespread traditional algorithms used in predictive maintenance are linear regression (LR), Gaussian process regression (GPR), artificial neural networks (ANN), decision trees (DT), support vector machines (SVM), and k-nearest neighbors (k-NN).

Linear Regression
Linear regression (LR) analysis is a statistical technique for investigating and modeling the functional relationship between a dependent variable (response) and independent variables (predictors). If we denote by Y the dependent variable and x_1, x_2, . . . , x_N the independent variables, then the linear model relating these variables is

Y = β_0 + ∑_{j=1}^{N} β_j x_j + ε,

where β_0, β_1, . . . , β_N are the model parameters and ε represents the difference between the value of Y and the model used to represent it, β_0 + ∑_{j=1}^{N} β_j x_j. Typically, ε ∼ N(0, σ²), and so ε is referred to as the error term, accounting for the failure of the model to fit the data exactly. The adjective linear is employed to indicate that the model is linear in the parameters β_0, β_1, . . . , β_N, not that Y is a linear function of the x_j.
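The model above can be fitted by ordinary least squares; a minimal sketch on synthetic data follows (the true coefficients and noise level are assumptions chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data generated from y = 2 + 3*x1 - 1*x2 + noise.
X = rng.standard_normal((200, 2))
y = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.standard_normal(200)

# Ordinary least squares: prepend a column of ones for beta_0 and
# solve the normal equations via lstsq.
A = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

# beta recovers (2, 3, -1) up to noise; the residual standard
# deviation estimates sigma of the error term.
residual_std = np.std(y - A @ beta, ddof=3)
```

The same computation underlies the multiple linear regression (MLR) models discussed next, only with domain-specific predictors.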
Dehning et al. in [42] provide an insight into how the multiple linear regression approach can be used to identify and quantify factors influencing the energy intensity of automotive plants. The presented model aims at supporting strategic decision-making and forecasting the future energy demand of automotive plants. The model can be used for different purposes for automotive companies and other stakeholders.
Kong et al. [43] discuss the development of multiple linear regression (MLR)-based spring durability models for predicting the fatigue life of automotive coil springs from the vertical vibrations of the vehicle and the natural frequencies of the vehicle suspension system. In these models, the fatigue life of the automotive coil spring, f(x_i), is the dependent variable, whereas the weighted acceleration (vertical vibrations of the vehicle) and the natural frequencies of the vehicle suspension system are used as the independent variables x_i. The results indicate that the MLR-based spring durability models can predict the fatigue life of automotive coil springs with reasonable accuracy. The fault prediction task is formulated in [44] as both a regression and a classification problem. In particular, the authors compare two ML approaches: the first is an autoregression model of vehicle failure ratios based on past information; the second is the aggregation of individual vehicle failure predictions based on each vehicle's personal usage.

Gaussian Process Regression
Gaussian process regression (GPR) is a nonparametric, Bayesian approach that has been widely used for regression and classification tasks. This algorithm is an efficient tool for developing forecasting models and producing predictions by incorporating prior knowledge through kernels. Aye et al. [45] proposed an integrated GPR model to predict the RUL of slow-speed bearings and achieved a low prediction error.
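The GPR posterior mean under a squared-exponential kernel can be sketched in a few lines of numpy; the training data (a sine trend standing in for a smooth degradation signal) and the kernel hyperparameters are illustrative assumptions:

```python
import numpy as np

def rbf(a, b, length_scale=1.0):
    """Squared-exponential (RBF) kernel: encodes the prior belief
    that the underlying function is smooth."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

# Hypothetical condition-monitoring readings of a smooth trend.
x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = np.sin(x_train)
x_test = np.array([1.0, 1.5])

noise = 1e-6  # small jitter for numerical stability
K = rbf(x_train, x_train) + noise * np.eye(len(x_train))

# GP posterior mean: k(x*, X) [K(X, X) + noise*I]^{-1} y.
mean = rbf(x_test, x_train) @ np.linalg.solve(K, y_train)
# mean[0] reproduces the training target at x = 1; mean[1]
# interpolates between the observations near sin(1.5).
```

The posterior covariance, omitted here, is what gives GPR its appeal for RUL prediction: it quantifies the uncertainty of each forecast.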
However, in some cases, approaches such as linear regression or GPR can provide inaccurate estimates. In fact, the work of Tosun et al. [46] compares the results obtained by linear regression (LR) and artificial neural networks (ANN) to predict some performance and emission data of a diesel engine fueled with alcohol and biodiesel blends. They show that while linear regression was lacking in predicting the desired parameters, it is possible to obtain more accurate results using ANN. The idea behind artificial neural networks is illustrated below.

Artificial Neural Network
An artificial neural network (ANN) consists of hundreds or thousands of interconnected artificial "neurons" (also called processing units). They are called "neural networks" because the behavior of these units resembles that of biological neurons. The processing units are organized into input and output units. The input units receive information, and the neural network attempts to learn from it to produce the outputs. Just as humans need rules and guidelines to obtain a result, ANNs use a training algorithm named backpropagation to refine the output results. Initially, an ANN goes through a training phase in which it learns to recognize patterns in the data. During this supervised learning phase, the network compares the actual output produced with what it should have produced, the desired output. The difference between the two results is minimized by backpropagation of the error. In other words, the network works backward, going from the output units to the input units, adjusting the connection weights until the difference between the actual and the desired result produces the least possible error.
In an ANN, we denote by w^l_{jk} the weight of the connection from the kth neuron of the (l − 1)th layer to the jth neuron of the lth layer (see Figure 4). The output f^l_j(x) of the jth neuron in the lth layer is determined by the outputs of the (l − 1)th layer:

f^l_j(x) = g( Σ_k w^l_{jk} f^{l−1}_k(x) + b^l_j ),

where b^l_j is the bias of the neuron, f^0(x) = x is the input, and the so-called activation function g(·) is a nonlinear, nondecreasing function. The most commonly used activation functions are the following: 1. The sigmoid function g(x) = 1/(1 + e^{−x}). 2. The hyperbolic tangent function g(x) = tanh(x). This technique has several applications in maintenance [46][47][48][49]. A particularly interesting model is the new ANN architecture presented in [50], which significantly improves the prediction of vehicle powertrain failures while reducing the input data size.
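The forward rule and backpropagation described above can be sketched in a few lines of numpy. This is a didactic toy, not any of the cited models: the layer sizes, learning rate, and XOR-like data are all hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train(X, y, hidden=8, lr=0.5, epochs=2000, seed=0):
    # One hidden layer with sigmoid activations; weights chosen randomly
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 1, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0, 1, (hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        # forward pass: f^1 = g(X W1 + b1), f^2 = g(f^1 W2 + b2)
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        err = out - y                      # actual minus desired output
        losses.append(float(np.mean(err ** 2)))
        # backpropagation: push the error from the output layer back
        d_out = err * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out / len(X)
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * X.T @ d_h / len(X)
        b1 -= lr * d_h.mean(axis=0)
    return losses

# XOR-like toy data: not linearly separable, so the hidden layer matters
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])
losses = train(X, y)
print(losses[0], losses[-1])
```

Running the loop drives the mean squared error down over the epochs, which is exactly the weight-adjustment process the paragraph describes.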

Support Vector Machine
Another supervised learning technique is the support vector machine (SVM). Support vector machines perform the classification task by constructing, in a higher-dimensional space, the hyperplane that optimally separates the data into two categories. The SVM algorithm finds the maximum margin that separates the two categories of data and then places the hyperplane in the center of that margin. The term margin means the minimum distance of points of the two classes in the training set from the identified hyperplane. The boundary that separates the classes is called the decision boundary. Therefore, the points closest to the decision boundary are at the same distance from the optimal hyperplane.
Let us consider a training set {(x_1, y_1), (x_2, y_2), . . . , (x_n, y_n)}, where x_i ∈ R^d are multidimensional patterns and y_i ∈ {−1, 1} are the labels of the two classes. The equation of a generic hyperplane is f(x) = w^T x + b. The goal of SVM is to find the parameters w and b of the linear function f that identify the optimal hyperplane. The points closest to the decision boundary define the margin. Considering two generic points x_1, x_2 on opposite sides of the margin such that f(x_1) = 1 and f(x_2) = −1, the margin is equal to (f(x_1) − f(x_2))/||w|| = 2/||w||. Therefore, maximizing the margin is equivalent to minimizing ||w|| or ||w||^2/2 (see Figure 5). To find the optimal hyperplane, SVM thus solves the following convex optimization problem (which admits a global minimum):

min_{w,b} (1/2) w^T w   subject to   y_i (w^T x_i + b) ≥ 1,   i = 1, . . . , n,

where w^T denotes the transposed vector of w. Patterns in the training set that lie on the margin are called support vectors. These patterns, which constitute the most complex cases, completely define the problem solution, which can be expressed exclusively as a function of these patterns, regardless of the size of the space d and the number n of elements in the training set. In practice, data are often not linearly separable by a hyperplane, and therefore a more sophisticated SVM is used.
To handle data that are not linearly separable, we introduce slack variables ξ_i, i = 1, . . . , n, and modify the separation constraints as

y_i (w^T x_i + b) ≥ 1 − ξ_i,   ξ_i ≥ 0,   i = 1, . . . , n.

For each pattern x_i in the training set, the variable ξ_i encodes the deviation from the margin; for separable patterns, the corresponding ξ_i take the value 0. In this case, the optimal hyperplane must maximize the margin and, at the same time, minimize the number of incorrectly classified elements. The objective function and, consequently, the optimization problem become

min_{w,b,ξ} (1/2) w^T w + C Σ_{i=1}^{n} ξ_i,

where the coefficient C is a hyperparameter that the user must choose in the algorithm implementation phase; it indicates the relative importance of the classification errors with respect to the width of the margin. SVM also provides an important extension of the theory initially developed for hyperplanes to the nonlinear case of separating patterns with very complex surfaces. In this case, we introduce a nonlinear function φ mapping the patterns from the space R^d to a space R^m of greater dimension (m > d). In the space R^m, where the degrees of freedom are greater, the patterns φ(x_1), φ(x_2), . . . , φ(x_n) can be more easily separated by a hyperplane using the general theory. This is equivalent to separating the patterns x_1, x_2, . . . , x_n in R^d with arbitrarily complex surfaces.
To determine the separation surface, we define the scalar product of two patterns mapped into the space R^m as a function, called the kernel, K(x_i, x_j) = φ(x_i)^T φ(x_j). This allows solving the optimization problem without particular complications compared to the linear case [51].
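The soft-margin problem above can be illustrated with a minimal linear SVM trained by subgradient descent on the hinge loss. This is a didactic sketch of the optimization, not a production solver: the toy clusters, the value of C, the learning rate, and the number of epochs are all hypothetical choices.

```python
import numpy as np

def train_svm(X, y, C=1.0, lr=0.01, epochs=500):
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1          # patterns violating y_i (w^T x_i + b) >= 1
        # subgradient of (1/2)||w||^2 + C * sum of hinge losses
        grad_w = w - C * (y[viol, None] * X[viol]).sum(axis=0)
        grad_b = -C * y[viol].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Two well-separated toy clusters labeled -1 and +1
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)
w, b = train_svm(X, y)
pred = np.sign(X @ w + b)
print((pred == y).mean())           # training accuracy
```

The hyperparameter C plays exactly the role described above: a large C penalizes margin violations heavily, while a small C tolerates them in exchange for a wider margin.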
The SVM algorithm is widely used in maintenance, especially for the classification of faults. For example, Jeong et al. [52] propose a fault diagnosis algorithm to robustly detect sensor faults in vehicle suspension systems and evaluate it using an SVM. This machine learning-based method reduces the effort required to design fault diagnosis algorithms while achieving excellent performance. Nevertheless, the approach has some drawbacks, such as performance that varies with the configuration of the residual dataset used for learning. Moreover, their dataset is collected from simulations; if the proposed fault diagnosis method were applied to an actual vehicle, real test data would have to be collected in various scenarios. Biddle et al. [53] use the SVM technique to detect and identify faults in sensors for autonomous vehicle control systems, and they propose a novel predictive algorithm to identify degrading performance in a sensor and predict the time at which a fault will occur. The experimental results show good performance, with a relatively simple implementation resulting in a prediction accuracy of 75.35%.

k-Nearest Neighbors
Another machine learning technique is the k-nearest neighbors (k-NN) algorithm, used mainly for pattern recognition and fault classification in the context of predictive maintenance. Given a test instance, the algorithm searches for the k training instances closest to it and uses them to predict the class (or value) of the test instance.
The distance between two instances x = (x_1, x_2, ..., x_n) and y = (y_1, y_2, ..., y_n) is calculated as

d(x, y) = sqrt( Σ_{i=1}^{n} w_i (x_i − y_i)^2 ),

where n is the number of features in the dataset and w_i is the weight of feature i. When each w_i is set to 1, the distance between two instances becomes the Euclidean distance. The value of k should be an odd number for binary classification [54]. In the case study of [55], a cyber-physical system analyzes and records the vibration data of an electric motor to identify and classify the vibration severity and implement predictive maintenance. The vibration severity was classified through the k-nearest neighbors (k-NN) algorithm.
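A minimal k-NN classifier with the weighted distance defined above can be written as follows. The feature weights, the value of k, and the toy vibration-severity data are all hypothetical, chosen only to make the majority-vote mechanism concrete.

```python
import numpy as np
from collections import Counter

def weighted_distance(x, y, w):
    # d(x, y) = sqrt( sum_i w_i (x_i - y_i)^2 )
    return np.sqrt(np.sum(w * (x - y) ** 2))

def knn_predict(X_train, y_train, x_new, k=3, w=None):
    if w is None:
        w = np.ones(X_train.shape[1])   # w_i = 1 gives the Euclidean distance
    dists = [weighted_distance(x, x_new, w) for x in X_train]
    nearest = np.argsort(dists)[:k]     # indices of the k closest instances
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]   # majority vote among the k neighbors

# Toy data: (RMS vibration amplitude, dominant frequency) -> severity label
X_train = np.array([[0.1, 50.], [0.2, 52.], [0.9, 120.], [1.1, 118.]])
y_train = np.array(["normal", "normal", "severe", "severe"])
print(knn_predict(X_train, y_train, np.array([1.0, 119.]), k=3))
```

With k = 3, the two "severe" neighbors outvote the closest "normal" one, so the new instance is labeled "severe"; an odd k avoids ties in the binary case, as noted above.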
Vasavi et al. [48] present an edge computing-based fault prediction system that predicts vehicle health in real time using internal and external sensors. The results show that combining the ANN and k-NN algorithms achieves better accuracy than either algorithm applied individually.

Decision Tree
Another algorithm used for fault classification is the so-called decision tree (DT). The decision tree is a supervised learning technique that can be used both to predict discrete variables (in this case, we are talking about classification) and to predict continuous variables (in this case, we are talking about regression). It is mainly used as a tree-structured classifier, where the internal nodes represent the characteristics of a dataset, the branches represent the decision rules, and each leaf node represents the result. There are two types of nodes in a decision tree: the decision node and the leaf node. Decision nodes are used to make decisions and have multiple branches, while leaf nodes are the outputs of those decisions and contain no further branches [56]. Decisions, or tests, are made based on the characteristics of the used dataset. It is called a decision tree because it starts from the root node, which expands into further branches, building a tree structure. Several algorithms are used to build a tree. Among the best known are the classification and regression tree (CART) algorithm [57], ID3 [58], and C4.5 [59]. For a detailed analysis of the aforementioned algorithms, please refer to [60,61].
An example of application of this technique for the fault classification is given in [62], where to better understand the real faults of axle box bearings, the authors identify five different fault types by applying the C4.5 decision tree algorithm.
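The core of algorithms such as CART is the choice of the split at each decision node: every candidate threshold on every feature is evaluated, and the one with the lowest weighted Gini impurity wins. The sketch below shows only this split-selection step under a hypothetical toy dataset (temperature and vibration readings mapped to fault labels); a full tree would apply it recursively.

```python
from collections import Counter

def gini(labels):
    # Gini impurity: 1 - sum of squared class proportions
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    best = (None, None, float("inf"))   # (feature, threshold, impurity)
    n_features = len(X[0])
    for f in range(n_features):
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i, row in enumerate(X) if row[f] <= t]
            right = [y[i] for i, row in enumerate(X) if row[f] > t]
            # weighted Gini impurity of the two child nodes
            imp = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if imp < best[2]:
                best = (f, t, imp)
    return best

# Toy data: (temperature, vibration) -> fault / no-fault labels
X = [[70, 0.1], [72, 0.2], [95, 0.8], [98, 0.9]]
y = ["ok", "ok", "fault", "fault"]
feature, threshold, impurity = best_split(X, y)
print(feature, threshold, impurity)
```

Here the split "temperature <= 72" separates the two classes perfectly, giving an impurity of zero, so it becomes the decision rule at the root node.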

Deep Learning Approaches
This section explores the more advanced deep learning methods, which are usually employed for predictive maintenance in the automotive domain.
Deep learning (DL) refers to a subset of AI and machine learning that uses multilayered artificial neural networks to estimate a better mapping function between given inputs and outputs. To achieve high accuracy, DL algorithms require huge amounts of data. DL has been employed in many automotive sectors, such as autonomous driving, vehicle development, and manufacturing [41].
Let us discuss some applications of deep learning techniques for predictive maintenance in the automotive field.
In [66], two machine learning-based methods are developed for heavy-vehicle lead-acid battery prognosis, one based on long short-term memory (LSTM) neural networks and one on random survival forest (RSF). The lead-acid battery is mainly used when starting the engine and heating and cooling the passenger compartment. It is an essential part of the electrical system and is vital for the safe operation of the vehicle.
A novel health monitoring system based on an LSTM network is proposed in [67] to estimate the remaining fatigue life of an automotive suspension.
In [68], the authors focus on utilizing deep learning to build a diagnostic system that efficiently and effectively predicts a wide range of faults by relying on a new model, called the deep symptoms-based model (deep-SBM). The performance of this approach was compared against the state-of-the-art models, and better results have been reported in terms of accuracy, precision, and F-score.
Particularly interesting is the ensemble learning technique, a machine learning paradigm that combines different machine learning techniques in a single predictive model to improve the overall accuracy of artificial intelligence algorithms [69]. For example, the method proposed in [70] combines the autoencoder (AE) technique with the long short-term memory (LSTM) algorithm to predict time series of mechanical failures. Another approach of this type can be found in [71], where a model for predicting the RUL of electric valves is obtained by combining the convolutional autoencoder (CAE) and LSTM algorithms. The obtained results show a significant improvement in RUL prediction compared to the estimates obtained with other ML techniques and, consequently, a relatively good accuracy in the prediction of equipment failures.
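Several of the works above build on the LSTM cell, whose gating mechanism is what lets it retain long-range information in condition-monitoring sequences. The sketch below shows one forward step of a standard LSTM cell in plain numpy; the sizes (3 input channels, 4 hidden units) and random weights are hypothetical, since a real model learns them from data.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b hold the stacked parameters of the four gates."""
    z = W @ x + U @ h_prev + b          # pre-activations for all gates at once
    H = len(h_prev)
    f = sigmoid(z[0:H])                 # forget gate: what to drop from memory
    i = sigmoid(z[H:2*H])               # input gate: what to write to memory
    o = sigmoid(z[2*H:3*H])             # output gate: what to expose
    g = np.tanh(z[3*H:4*H])             # candidate cell state
    c = f * c_prev + i * g              # new cell (long-term) state
    h = o * np.tanh(c)                  # new hidden (short-term) state
    return h, c

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4                      # e.g. 3 sensor channels, 4 hidden units
W = rng.normal(0, 0.1, (4 * n_hid, n_in))
U = rng.normal(0, 0.1, (4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(0, 1, (5, n_in)):   # run over a short sensor sequence
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape, c.shape)
```

The separate cell state c is what distinguishes the LSTM from a plain recurrent network: the forget and input gates let gradients flow over many time steps, which is why LSTMs suit RUL and time-between-failure prediction from long sensor histories.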
Surrounding factors such as weather, traffic, and terrain could influence the vehicle lifecycle. It is only recently that these factors have been taken into account in the study of automobile time-between-failure (TBF) prediction modeling [72]. With GPS information, these real-time data can be collected from external sources and transmitted to the cloud. As pointed out in this paper, it is possible to integrate these real-time telematics data with historical maintenance data to establish a more accurate automobile maintenance prediction model, offering real-time health condition monitoring and RUL prediction. For this purpose, a novel deep learning architecture called a merged-LSTM (M-LSTM) network is proposed to build a TBF prediction model based on multisource data. The experimental results show that the introduction of these data can improve TBF prediction modeling.
In [73], a CNN classifier is proposed for real-time multisensor monitoring to capture faulty signals and construct sensors' health index (HI). The proposed fault detection system obtained an accuracy of 99.84%.
In [74], the idea of ensemble learning is applied, and the relevance vector machine (RVM) is used as a weak learner within the ensemble learning framework to predict the health trend under uncertainty. On this basis, this novel prediction model allows effectively converting point estimates into continuous estimates.

Digital Twin Technology
Digital twin (DT) technology refers to a complete physical and functional description of a physical component, product, or entire system with all operational data. A product digital twin establishes a virtual connection that opens the way to real-time monitoring of the entire lifecycle of that product. In other words, a digital twin, like a virtual prototype, is a dynamic digital representation of a physical system. However, unlike a virtual prototype, a digital twin is a virtual instance of a physical (twin) system that is continually updated with data on its performance, maintenance, and health throughout the entire lifecycle of the physical system [75]. For a detailed discussion of the technologies that support DT and its applications in different sectors, see [76]. In recent years, digital twin technology has been widely used to support automotive maintenance processes. For example, in [77], the authors describe the application of digital twin technology to support the predictive maintenance of an automotive braking system. The vehicle brake pressure was measured at different speeds using the ThingWorx Internet of Things (IoT) platform. The data acquired through this platform are used to predict brake wear via a CAD model implemented in CREO Simulate. The approach applied in [78] combines physics-based modeling techniques (0-D, 1-D, 3-D) to create a digital twin that allows predicting brake pad wear in a conventional car. In [79], a multidimensional digital twin model dedicated to the product lifecycle is developed to improve the production quality and maintenance efficiency of the constant velocity joint of a car. The constant velocity joint is one of the main components of the automobile transmission system, and its reliability and stability are essential for the good functioning of the vehicle.
This device represents a key factor in the realization of the steering and propulsion of the vehicle, the quality of which directly impacts safety, maneuverability, and comfort. The use of computer simulation technology reduces production and test costs so that the digital twin model can diagnose faults and optimize the design.

Comparison among Different Approaches
Generally speaking, selecting the most appropriate machine learning algorithm for predictive maintenance depends on many factors, from the type of issue at hand to the nature of data, and involves conducting experiments, evaluating different approaches, and tuning parameters. In the literature, there are many comparisons among different machine learning approaches.
For example, in [47], the authors use six machine learning algorithms, namely, artificial neural network (ANN), support vector machine (SVM), linear regression (LR), Gaussian process regression (GPR), ensemble bagging, and ensemble boosting, for estimating lithium-ion batteries' state of charge (SoC). As a result of this comparison, the proposed ANN and GPR approaches achieved strong performance, outperforming the other methods in terms of mean absolute error. Therefore, ANN and GPR could help design the optimum battery management system for electric vehicles based on SoC predictions. In [80], a hybrid data-driven algorithm combining the benefits of GPR and long short-term memory was proposed to improve the accuracy of RUL prediction for lithium-ion (Li-ion) batteries with reliable uncertainty management. In [26], a novel two-stage Wiener process model is proposed to describe the degradation behavior of lithium-ion batteries in different degradation stages.
According to [9], four classifiers were compared, namely, SVM, DT, RF, and k-NN. The results show that all algorithms are very accurate, especially the SVM classifier, which obtained the best performance on the four operating systems, with accuracies of 96.6%, 98.7%, 98%, and 96.6%. The lowest accuracy of the SVM model, 96.6%, was achieved on the ignition and cooling systems, while the best accuracy, 98.5%, was achieved on the fuel system.
The purpose of the study in [49] is to show the feasibility of using different machine learning approaches, implemented as classification predictors, for fault detection tasks, including random forest (RF), support vector machine (SVM), artificial neural network (ANN) variants, and Gaussian processes (GP). The authors use training and testing datasets of different standardized driving cycles generated by a simulation testbed for fault diagnosis in turbocharged petrol engine systems. The best results are achieved by the random forest method, since its minimum accuracy, 0.88539, is greater than the second-best maximum accuracy, 0.806120, achieved by the support vector machine method. Furthermore, it is possible to increase the accuracy of all methods by low-pass filtering the outputs.
In Table 1, we present a comparison among existing works from three perspectives: methods, applications, and data types. Table 1. Summary of the most recent papers on predictive maintenance in the automotive sector. The data types are Real Data (RD) and Synthetic Data (SD).

Ref.   Year   Application                                                Data Type
[78]   2017   Prediction of brake pad wear in a car                      SD
[77]   2019   Predictive maintenance of an automotive braking system     SD
[79]   2021   Maintenance of the constant velocity joint of a car        SD

Conclusions
Timely and adequate maintenance actions are essential for the operation of industrial equipment, as they can significantly improve the reliability, availability, and safety of the equipment and minimize failures. Predictive maintenance (PdM) is an advanced maintenance strategy that allows predicting potential failures and taking maintenance actions in a timely and appropriate manner [18]. It has gradually replaced traditional maintenance strategies, including reactive and preventive maintenance (also known as scheduled maintenance). In recent years, with the rapid advancement of sensor and network technology, there has been a notable increase in the availability of data such as vibration, temperature, pressure, and other types of electrical and mechanical equipment condition-monitoring data. With the development of big data, artificial intelligence (AI) techniques, especially machine learning and deep learning, have been widely applied in current predictive maintenance systems.
In recent years, numerous research articles in predictive maintenance, including theoretical studies and industrial applications, have been published in scientific journals and research reports. This work aims to provide a brief overview of recent research contributions on techniques used for predictive maintenance, especially in the automotive field. We have seen how deep learning methods, on the one hand, guarantee better accuracy in predicting failures, and on the other, require a greater amount of data than traditional machine learning techniques. The case studies analyzed in this work show how machine learning can effectively predict failures or anomalies in a wide range of applications and how it has improved (and will continue to do so) the toolset for predictive maintenance [7]. We have seen how hybrid models and physical models represent the most reasonable choice in some cases, such as those analyzed in [31], in which a large set of data is not available. Finally, we analyzed the role of digital twin technology in predictive maintenance. Digital twins give car manufacturers a greater ability to diagnose abnormal conditions and predict the remaining useful life of degradable components, improving vehicle performance and safety.
One of the main limitations of these contributions, also recognized in other reviews [7], is the non-availability of real datasets. These data are usually considered highly confidential by automotive companies. Consequently, it is not possible to qualitatively compare novel approaches with the state of the art, because the datasets on which previous approaches were tested are usually unavailable online. Another limitation is that it is difficult to evaluate the validity of the developed methods using real data. Real data are often not, or only partially, labeled, and annotating data is time-consuming and requires expert knowledge. Nevertheless, to build methods that yield more robust results, it is necessary to test models with labeled data, even if they were trained only on unlabeled data.
Future research could address the application of general predictive maintenance achievements to automotive use cases and compare the approaches present in the literature by testing, when possible, different models on the same real dataset. Another future research perspective is the development of models obtained by combining different approaches to provide more efficient predictive analytics.
Author Contributions: These authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: