Article

Physics-Informed Transfer Learning for Predicting Engine Oil Degradation and RUL Across Heterogeneous Heavy-Duty Equipment Fleets

1
Industrial and Manufacturing Engineering Department, Egypt-Japan University of Science and Technology, New Borg El Arab City 21934, Egypt
2
Production Engineering Department, Faculty of Engineering, Alexandria University, Alexandria 21544, Egypt
3
Chemical Engineering Department, Faculty of Engineering, Alexandria University, Alexandria 21544, Egypt
4
Institute of Machine Design and Tribology, Leibniz University Hanover, 30167 Hannover, Germany
*
Authors to whom correspondence should be addressed.
Lubricants 2025, 13(12), 545; https://doi.org/10.3390/lubricants13120545
Submission received: 12 November 2025 / Revised: 12 December 2025 / Accepted: 13 December 2025 / Published: 16 December 2025
(This article belongs to the Special Issue Intelligent Algorithms for Triboinformatics)

Abstract

Predicting the Remaining Useful Life (RUL) of engine oil is critical for proactive maintenance and fleet reliability. However, irregular and noisy single-point sampling presents challenges for conventional prognostic models. To address this, a hierarchical physics-informed transfer learning (TL) framework is proposed that reconstructs nonlinear degradation trajectories directly from non-time-series data. The method uniquely integrates Arrhenius-type oxidation kinetics and thermochemical laws within a multi-level TL architecture, coupling fleet-level generalization with engine-specific adaptation. Unlike conventional approaches, this framework embeds physical priors directly into the transfer process, ensuring thermodynamically consistent predictions across different equipment. An integrated uncertainty quantification module provides calibrated confidence intervals for RUL estimation. Validation was conducted on 1760 oil samples from dump trucks, dozers, shovels, and wheel loaders operating under real mining conditions. The framework achieved an average R² of 0.979 and RMSE of 10.185. This represents a 69% reduction in prediction error and a 75% narrowing of confidence intervals for RUL estimates compared to baseline models. TL outperformed the asset-specific model, reducing RMSE by up to a factor of 3 across all equipment. Overall, this work introduces a new direction for physics-informed transfer learning, enabling accurate and uncertainty-aware RUL prediction from uncontrolled industrial data and bridging the gap between idealized degradation studies and real-world maintenance practices.

1. Introduction

Diesel engines power nearly every industrial sector, from transport and mining to construction and agriculture [1]. Their performance and reliability depend strongly on the characteristics and condition of the lubricating oil, which reduces friction, dissipates heat, prevents corrosion, and removes contaminants from contact surfaces [2]. Among available lubricants, 15W–40 semi-synthetic oils remain the most common choice in heavy-duty diesel fleets due to their well-established viscosity–temperature characteristics, cost-effectiveness, balanced additive formulations, and longer life expectancy in comparison with mineral oils [3]. These oils provide a stable benchmark for studying and understanding the degradation rate and Remaining Useful Life (RUL) of lubricants in service [4,5,6].
During engine operation, the oil is subjected to a harsh and complex combination of thermal, chemical, and mechanical stresses [7]. Processes such as oxidation, nitration, soot accumulation, and additive depletion interact in nonlinear and interdependent ways, progressively altering the oil’s chemical and physical properties [8,9]. Consequently, viscosity and acidity number tend to increase [10], while the Total Base Number (TBN) and anti-wear protection decline [11]. Despite this, most maintenance schedules continue to rely on fixed time or mileage intervals [12]. These rigid schedules fail to account for variations in load, fuel quality, and ambient temperature, often resulting in either premature oil replacement, which wastes resources and raises operational costs, or delayed changes that increase wear and failure risk. To address these inefficiencies, many industries are adopting condition-based maintenance (CBM) strategies [13,14], in which oil condition is continuously monitored and the RUL is estimated as the remaining duration before critical parameters, such as viscosity or TBN, exceed acceptable operational thresholds [15].
Accurately predicting oil RUL across different engines remains, however, a challenging task [16]. The oil degradation is inherently nonlinear and influenced by the interplay between operating conditions, fuel composition, and environmental factors. Local models (i.e., Asset Specific Models) trained on individual engines can capture specific degradation trends but rarely generalize to others [17,18], while Global Models trained on aggregated data often lose sensitivity to engine-specific behaviors [19]. This trade-off reflects the classical bias–variance dilemma commonly observed in predictive maintenance models. Furthermore, industrial datasets are often noisy, irregularly sampled, and imbalanced due to field sampling constraints and laboratory variations, which led many previous studies to be conducted under controlled laboratory conditions or using synthetic datasets [20,21,22].
Beyond its practical deployment advantage, this study contributes to the theoretical development of Physics-Informed Machine Learning (PIML) by embedding thermochemical degradation kinetics within the transfer learning hierarchy itself, rather than applying physics post hoc or as a regularization term. This allows physical priors to shape feature transfer, not just prediction. Furthermore, the work advances Transfer Learning for industrial prognostics by demonstrating that non-time-series oil data can be transformed into transferable degradation representation spaces, enabling reliable fleet-level generalization and asset-specific adaptation, a capability not achieved in conventional TL architectures or PIML frameworks trained on controlled laboratory data. Together, these aspects establish a new methodological foundation for PIML-based transfer learning under sparse, irregular real-world monitoring conditions.
The next section reviews previous studies on oil degradation mechanisms, diagnostic indicators, and modeling approaches, ranging from empirical and statistical techniques to modern hybrid and transfer learning frameworks. The discussion highlights existing limitations that motivate the present study and establishes the need for models capable of capturing nonlinear, field-observed degradation behaviors across diverse diesel engines using the same lubricant formulation.

2. Literature Review

2.1. Characteristics and Indicators of Engine Oil Degradation

Construction fleets relying on heavy-duty diesel engines typically use 15W–40 semi-synthetic oils. These lubricants exhibit a balanced viscosity profile (around 15.5 cSt at 100 °C) and contain additive packages for soot dispersion, oxidation inhibition, and anti-wear protection [23]. Cerny et al. [24] compared the oxidation stability of eight different commercial SAE 15W-40 engine oils and concluded that the base oil composition is a critical factor in determining the high-temperature oxidation stability of engine oils. Because the formulations of these oils are standardized, the oils themselves serve as excellent reference materials for modeling studies. Experiments have shown that the rates of viscosity increase and additive depletion depend strongly on engine load, temperature, and fuel sulfur content, factors which directly influence the RUL of engine oil.
Engine oil degradation occurs through oxidation, nitration, sulfation, soot loading, and additive depletion [2,14,25]. Oxidation produces acidic and polymerized compounds that increase viscosity, Total Acid Number (TAN), and sludge. Nitrogen oxides trigger nitration, while sulfur-based additives contribute to sulfation, generating corrosive by-products. Anti-wear additive depletion, such as ZDDP, weakens protective film formation on metal surfaces, though some degradation products partially restore lubricity. Soot thickens the oil, wear metals such as Fe, Cu, and Pb accelerate oxidation, and fuel dilution lowers viscosity and film strength. These mechanisms collectively alter viscosity, chemical composition, and load-carrying capacity, controlling when the oil reaches the end of its useful life.
Since lubricant formulations are consistent across engines, degradation mechanisms are comparable, enabling scalable modeling for RUL assessment across diverse assets. Adaptive models that account for nonlinear interactions outperform static or laboratory-based baselines in predicting RUL. Omiya et al. [26] collected real-world datasets from buses and trucks, including engine speed, load, temperature, hours of operation, and prior oil analyses, highlighting the importance of field data in reconstructing actual degradation trajectories for accurate RUL estimation. Analytical methods such as FTIR spectroscopy [27], ICP spectrometry [28], and viscometry provide multidimensional insights into degradation, but interdependence among variables complicates direct interpretation, necessitating advanced modeling for RUL prediction. Heredia-Cancino et al. [29] showed that oxidation, nitration, and sulfation indicators evolve in a nonlinear and highly coupled manner, further reinforcing the need for RUL-focused modeling frameworks.

2.2. Modeling Approaches of Engine Oil Degradation Rate

Oil degradation is inherently nonlinear due to coupled interactions among thermal oxidation, additive depletion, soot accumulation, and mechanical wear. Engine operating conditions such as temperature, load, and fuel composition modulate these trajectories, producing complex degradation curves that cannot be captured by simple linear models [30]. Sejkorová et al. [25] observed sigmoidal or power-law viscosity growth, while Wolak et al. [31] reported that TBN initially declines rapidly and then stabilizes, affecting the oil’s ability to neutralize acids. These nonlinear behaviors directly influence the prediction of RUL, emphasizing the need for models capable of capturing saturation, hysteresis, and interaction effects.
Early RUL prediction studies relied on empirical or linear regression models [32]. Van Thai Nguyen et al. [33] used multiple linear regression with FTIR spectral data and chemical indicators such as TBN, oxidation, sulfation, and fuel dilution to estimate RUL. While simple and transparent, such models assume linearity and independence among parameters, which is unrealistic given the complex interactions in oil degradation. Regression-based viscosity models often underestimate thickening when soot dominates, and empirical thresholds are typically derived from single-engine tests, limiting their generalization for fleet-wide RUL assessment [34]. Traditional trend analysis assumes constant degradation rates, oversimplifying underlying chemistry and reducing the reliability of RUL predictions under varying operational conditions.
Capturing nonlinear degradation dynamics requires models that combine physical interpretability with data-driven flexibility. Machine learning and hybrid frameworks can map multivariate oil condition parameters to RUL outcomes, provided they are trained on diverse datasets. Learning nonlinear degradation directly from real-world field data while preserving thermochemical principles ensures that RUL predictions reflect true degradation rather than merely fitting observed trends.

2.3. Machine Learning and Ensemble Techniques

Recent studies employ machine learning (ML) models such as ANNs, RFs, and SVMs to predict nonlinear degradation dynamics and estimate RUL [35,36]. Rodrigues et al. [37] combined ANNs with PCA for dimensionality reduction, maintaining prediction accuracy for condition-based RUL assessment. Katreddi et al. [38] introduced a mixed-effects random forest (MERF) to handle clustered longitudinal maintenance data, capturing variability between engines and improving RUL estimates.
Ensemble learning combines multiple models to enhance prediction robustness. Wang et al. [39] developed a hybrid ensemble integrating CNN-based, expert-rule, and SVM techniques for equipment fault prediction, demonstrating improved RUL prediction accuracy. Shao et al. [40] applied ensemble classifiers to marine diesel engine fault diagnosis. While effective in controlled environments, these models often rely on simulated data or single-source datasets, limiting real-world RUL applicability.
Industrial RUL assessment requires models trained on sparse, noisy, and heterogeneous field data. Data-driven models may lack interpretability and fail to generalize across different engines using the same oil formulation. Physics-informed machine learning (PIML) incorporates physical laws such as Arrhenius-type oxidation kinetics or additive depletion decay into ML architectures [41,42], enhancing extrapolation and ensuring that RUL predictions are physically plausible. A prime illustration of this hybrid paradigm is the work by Kumar et al. [43], which developed a physics-informed neural network for engine oil RUL estimation that explicitly encodes Arrhenius degradation kinetics. By enforcing this fundamental thermochemical law during training, their model achieved a >30% reduction in prediction error and produced more reliable uncertainty estimates compared to conventional data-driven models. However, many PIML approaches are validated only on lab or single-engine data, limiting their applicability to fleet-level RUL assessment.
Beyond purely data-driven and physics-informed models, research has also progressed toward integrated hardware-software systems for real-time prognostics. For instance, Jagannathan et al. [44] developed an adaptive methodology combining micro-sensor data with computational models. A neural network-fuzzy classifier fused these inputs to derive a single oil condition trend, which was then used by prognostic algorithms to estimate RUL. In parallel, Zhu et al. [45] used dielectric spectroscopy as an in situ sensing modality to detect chemical and physical changes in degrading oil. By feeding the spectral data into machine learning classifiers such as Random Forest and ANN, the study predicted RUL with an accuracy of around 96%.
Transfer learning (TL) supports fleet-scale RUL prediction by fine-tuning global models trained on multiple engines for local conditions [46,47]. Hierarchical modeling links local engine parameters to population-level distributions, facilitating scalable and interpretable RUL estimation. Embedding physical constraints and uncertainty quantification directly within the model ensures that RUL predictions are both reliable and probabilistically calibrated. Bayesian Neural Networks, Gaussian Processes, and ensemble approaches [48] can provide confidence intervals, reflecting aleatoric and epistemic uncertainties that influence RUL estimates.

2.4. Research Gaps and Objective

Despite the considerable advances in oil degradation modeling, several important challenges remain:
  • Many data-driven models prioritize prediction accuracy over physical understanding and may yield thermodynamically inconsistent behavior when applied to new operating conditions.
  • Most physics-informed approaches are still validated on single-engine or even lab experimental datasets, restricting their generalization to real fleets.
  • Most existing studies treat each engine as a separate problem, ignoring the fact that engines using the same lubricant formulation undergo fundamentally similar degradation processes. This represents a missed opportunity for scalable learning across heterogeneous fleets.
  • Uncertainty analysis is rarely embedded within the model itself; it is usually treated as a separate post-processing step, reducing its usefulness for maintenance planning.
  • Finally, industrial oil datasets are noisy, irregular, and domain-imbalanced, issues that current models seldom address effectively.
These combined challenges underscore the need for models trained on uncontrolled, realistic field data that reflect the nonlinear dynamics of true oil degradation rather than idealized laboratory behavior. Field data are usually gathered opportunistically, with only one oil sample per change and without consistent sampling intervals or standardized operating conditions. Consequently, most published approaches cannot directly reconstruct real degradation trajectories from these sparse, single-point measurements.
This study addresses these gaps by proposing a physics-informed hierarchical transfer learning framework explicitly designed for RUL assessment of diesel engine oils using uncontrolled, single-sample industrial data. Arrhenius-type degradation kinetics are embedded within a hierarchical structure that leverages shared oil chemistry across engines while adapting to local conditions. Uncertainty quantification is integrated during model training, producing calibrated confidence intervals for RUL. Trained on data from the mining fleet of a multinational company, the framework generalizes across different equipment types using the same diesel engine oil formulation, enabling accurate RUL prediction that reflects actual nonlinear degradation and supports robust, scalable condition-based maintenance.

3. Proposed Methodology

This section presents the comprehensive methodological framework developed for predicting engine oil degradation and estimating Remaining Useful Life (RUL) across multiple categories of heavy-duty equipment. The methodology is built upon the principle that oil end-of-life results from the interacting evolution of three fundamental degradation modes: chemical additive depletion, mechanical wear metal accumulation, and physicochemical transformation.
The framework first reconstructs continuous, non-linear degradation trajectories for each monitored parameter using physics-based rate equations. The time at which each predicted trajectory crosses its predefined failure threshold is computed analytically. The final RUL is then determined not by the first threshold crossing, but by applying an expert-informed rule: oil is declared at end-of-life when a second, independent parameter from a different degradation category crosses its caution or failure limit. This ensures the RUL prediction inherently reflects multi-mechanism degradation. The interdependence of these mechanisms is subsequently analyzed through cross-equipment correlation matrices, providing empirical support for the multi-parameter decision logic. To achieve this across a heterogeneous fleet, the core innovation is a three-tier hierarchical modeling strategy. A Global Model (GM) learns universal, physics-constrained degradation kinetics shared by all assets using the same lubricant formulation. This knowledge is then transferred and adapted via asset-specific fine-tuning layers, enabling accurate and uncertainty-aware RUL predictions for individual machines despite sparse and noisy field data. The complete workflow, from raw data processing to RUL estimation, is detailed in the following subsections.

3.1. Dataset Description and Structure

The dataset analyzed in this study originates from a large-scale oil condition monitoring program conducted for an international company on a fleet of mining equipment, including dump trucks, dozers, shovels, and wheel loaders. Oil samples were periodically collected according to the preventive maintenance schedule and analyzed in a certified laboratory. Each sample record includes physicochemical parameters such as oxidation, viscosity, and Total Base Number (TBN), as well as elemental wear concentrations (Fe, Cu, Al, Cr, Pb, Mo) and operational metadata such as asset identification and oil age. The values of the sample parameters were determined using Inductively Coupled Plasma Optical Emission Spectroscopy (ICP-OES), which provides high sensitivity for wear metals and additive depletion markers and is widely adopted in condition-based oil diagnostics. Analytical precision was confirmed by repeat testing, yielding coefficients of variation below 5%, which verifies measurement reliability and suitability for prognostic modeling. The resulting dataset reflects realistic industrial conditions, encompassing a total of 1760 observations across 26 assets, as summarized in Table 1.
This dataset presents the classic industrial prognostic challenge. It consists of sparse, irregularly sampled single-point measurements (one sample per oil change) rather than continuous time-series. The primary methodological task is thus to reconstruct complete, physically plausible degradation trajectories from these isolated snapshots.
All engines used the same bulk-supplied diesel engine oil of viscosity grade SAE 15W-40, meeting the API CI-4/SL dual-purpose classification for both diesel (CI-4) and gasoline (SL) engines. Although the specific commercial brand was not disclosed by the site due to procurement confidentiality, the lubricant was consistent across all machines and all samples originated from the same formulation family. This ensures uniformity in additive chemistry and viscosity behavior, which is an essential factor for model generalization and transfer learning. In addition, the chemical consistency of a single, well-defined semi-synthetic formulation across all assets is essential to the framework: it allows the universal degradation patterns (for example, Arrhenius-type oxidation and additive depletion trends) to be isolated and modeled before they are transferred and adapted to asset-specific operating conditions. This would be more challenging with a mixed fleet using various mineral oil grades, as these often degrade differently (e.g., exhibiting faster oxidation and viscosity increase).
Zinc (Zn) and phosphorus (P) were measured using ICP-OES, according to ASTM D5185 [49] as a standard guideline for elemental analysis of lubricant additives. Because ICP-OES quantifies total elemental content, it captures Zn and P regardless of their chemical form, whether present as the original ZDDP additive or as its soluble degradation products (e.g., zinc phosphates, thiophosphates) [50,51]. This explains why Zn and P levels in used oil may appear similar to those in fresh oil in some samples even though the additive has chemically transformed and depleted. When reductions do occur, they represent true elemental loss due to additive consumption, deposition on surfaces (forming anti-wear films), or precipitation out of the oil as insoluble deposits that settle as sludge.
Table 2 summarizes the key physicochemical results for the used oil samples collected from the four monitored machinery types. It shows representative laboratory measurements for a single observation, including the principal physicochemical and wear-related parameters used in model development.

3.2. Data Processing and Normalization

Comprehensive validation procedures were applied to ensure data integrity before modeling. Little’s MCAR test was conducted to assess the randomness of missing data, which represented approximately 4.8% of the total dataset. Continuous variables were imputed using linear interpolation to preserve temporal continuity, while categorical identifiers were verified for completeness and uniqueness. Outlier detection employed a hybrid method combining the interquartile range (3 × IQR) and Mahalanobis distance to identify both univariate and multivariate anomalies. After cleaning, 1675 samples remained valid for modeling.
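The two outlier screens described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the text does not state the Mahalanobis cutoff or how the two criteria are merged, so the cutoff of 16.27 (roughly the 99.9% chi-square quantile for three variables), the logical OR, the function names, and the synthetic data are all assumptions.

```python
import numpy as np

def iqr_outliers(X, k=3.0):
    """Flag rows where any column lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.percentile(X, [25, 75], axis=0)
    iqr = q3 - q1
    mask = (X < q1 - k * iqr) | (X > q3 + k * iqr)
    return mask.any(axis=1)

def mahalanobis_outliers(X, cutoff):
    """Flag rows whose squared Mahalanobis distance exceeds `cutoff`."""
    centered = X - X.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(X, rowvar=False))
    d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return d2 > cutoff

# Synthetic stand-in for three oil parameters (not the study's data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[0] = [15.0, 0.0, 0.0]  # one gross univariate outlier

# Assumed combination rule: a sample is removed if either screen fires.
flags = iqr_outliers(X) | mahalanobis_outliers(X, cutoff=16.27)
X_clean = X[~flags]
```

The univariate fence catches extreme single readings, while the Mahalanobis screen catches samples whose combination of values is unusual even when each value alone looks plausible.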
Each continuous variable $x_i$ was normalized relative to its known physical or manufacturer-defined threshold $\theta_i$, rather than by arbitrary statistical scaling. This normalization was computed as shown in Equation (1).
$$\tilde{x}_i = \frac{x_i}{\theta_i} \tag{1}$$
where $\tilde{x}_i$ represents the normalized value and $\theta_i$ denotes the degradation or contamination limit corresponding to parameter i. This approach ensures that $\tilde{x}_i = 1$ corresponds directly to a condition near failure, improving interpretability across different equipment and parameters. Because samples were collected at irregular intervals, all variables were realigned to a uniform cumulative operating-hour grid to maintain consistent temporal representation.
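As a small worked example of Equation (1), the sketch below divides each measured value by its limit. The parameter names and limit values are hypothetical placeholders chosen for illustration; they are not the thresholds used in the study.

```python
# Hypothetical limits for illustration only; real limits come from the
# lubricant supplier or the laboratory's condemnation tables.
THRESHOLDS = {"Fe_ppm": 100.0, "Oxidation_abs": 25.0, "Viscosity_cSt": 18.0}

def normalize_by_threshold(sample, thresholds):
    """Return x_i / theta_i for every parameter that has a known limit."""
    return {
        name: value / thresholds[name]
        for name, value in sample.items()
        if name in thresholds
    }

sample = {"Fe_ppm": 50.0, "Oxidation_abs": 20.0, "Viscosity_cSt": 18.0}
normalized = normalize_by_threshold(sample, THRESHOLDS)
# Fe sits at half its limit, oxidation at 80%, viscosity exactly at its limit.
```

For an accumulation indicator the normalized value rises toward 1 as the oil ages; for a depletion indicator such as TBN the same ratio starts above 1 and falls toward it.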

3.3. Feature Engineering and Selection

Feature engineering emphasized parameters that describe the chemical, mechanical, and physical evolution of the lubricant. Three fundamental classes of degradation behavior were observed: depletion indicators, accumulation indicators, and transformation indicators. Depletion indicators represent properties that diminish with use due to additive consumption or neutralization, including TBN, calcium (Ca), and magnesium (Mg). Accumulation indicators reflect parameters that increase monotonically as wear metals and contaminants accumulate, such as iron (Fe), copper (Cu), and aluminum (Al). Transformation indicators exhibit nonlinear patterns influenced by multiple competing effects, including viscosity, oxidation, and nitration. This categorization allows the model to treat each parameter according to its underlying degradation mechanism.
The classification of these diagnostic features into three distinct categories is summarized in Table 3, which provides a clear framework for understanding the different degradation mechanisms modeled in this study.
The models assume each parameter evolves according to its own kinetics, without mathematically modeling how one indicator’s state affects another’s degradation rate. While the degradation trajectories for individual parameters are modeled independently, the integration of chemical, wear, and contamination indicators is crucial for holistic RUL estimation. Therefore, all parameters from the three categories serve as concurrent inputs to the machine learning models, enabling the algorithms to capture complex, non-linear interactions inherent in the degradation process. This integration is further governed by the ‘second-threshold’ decision rule, which ensures that the final RUL prediction reflects the confluence of multiple degradation mechanisms rather than any single parameter evolution.
To prevent redundancy, variables with high pairwise correlation (r > 0.9) were excluded. Recursive Feature Elimination (RFE) using Random Forest importance was then applied to identify the most relevant predictors. Derived ratios such as Fe/TBN and Oxidation/TBN were introduced to represent compound degradation relationships, improving physical interpretability.
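The two-step selection described above (correlation filter, then RFE ranked by Random Forest importance) could look like the sketch below, using scikit-learn. The data are synthetic, the number of retained features is arbitrary, and the use of the absolute correlation is an assumption, since the text only states r > 0.9.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE

def drop_correlated(X, threshold=0.9):
    """Return indices of columns kept after removing the later member of
    every pair whose absolute Pearson correlation exceeds `threshold`."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, i] <= threshold for i in kept):
            kept.append(j)
    return kept

# Synthetic features: column 1 is a near-duplicate of column 0, and the
# target depends only on columns 0 and 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=300)
y = 3.0 * X[:, 0] + 2.0 * X[:, 2] + 0.1 * rng.normal(size=300)

kept = drop_correlated(X)  # step 1: remove redundant columns
rfe = RFE(RandomForestRegressor(n_estimators=50, random_state=0),
          n_features_to_select=2)
rfe.fit(X[:, kept], y)     # step 2: recursive elimination on the remainder
selected = [kept[i] for i in np.flatnonzero(rfe.support_)]
```

Filtering correlated columns first matters because tree-based importance is split between near-duplicate features, which can cause RFE to discard both.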
Model parameters were estimated using the Levenberg–Marquardt nonlinear least-squares algorithm. Bootstrap resampling was applied to quantify parameter uncertainty. An initial set of 100 bootstrap iterations was used during model tuning to verify convergence behavior and computational efficiency. Once stable convergence was confirmed, the full model estimation employed 1000 bootstrap replicates to generate robust 95% confidence intervals for key parameters such as the degradation rate constant (k) and the time exponent (p). This two-stage approach ensures that uncertainty estimates are statistically stable without unnecessary computational cost.
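A minimal sketch of the fitting procedure is given below, using SciPy's `curve_fit` with `method="lm"` and a pairs bootstrap. It assumes a power-law trajectory with a known initial value, synthetic data, and 200 replicates to keep the example fast (the study reports 1000); the function names are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

Y0 = 0.2  # assumed known fresh-oil value (normalized units)

def power_law(t, k, p):
    """Power-law growth y(t) = y0 + k * t**p with fixed y0."""
    return Y0 + k * t**p

def fit_with_bootstrap(t, y, n_boot=200, seed=0):
    """Levenberg-Marquardt point estimate plus percentile 95% intervals."""
    popt, _ = curve_fit(power_law, t, y, p0=(0.05, 1.0), method="lm")
    rng = np.random.default_rng(seed)
    draws = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(t), size=len(t))  # resample (t, y) pairs
        try:
            boot, _ = curve_fit(power_law, t[idx], y[idx], p0=popt, method="lm")
            draws.append(boot)
        except RuntimeError:
            continue  # skip the rare resample that fails to converge
    ci = np.percentile(np.array(draws), [2.5, 97.5], axis=0)
    return popt, ci  # ci rows: lower, upper; columns: k, p

# Synthetic trajectory with true k = 0.1 and p = 1.5.
rng = np.random.default_rng(1)
t = np.linspace(0.5, 5.0, 40)  # oil age in hundreds of hours (synthetic)
y = power_law(t, 0.1, 1.5) + 0.01 * rng.normal(size=t.size)
popt, ci = fit_with_bootstrap(t, y)
```

Starting each bootstrap refit from the point estimate keeps the refits cheap and reduces the chance of convergence to a different local minimum.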

3.4. Physics-Informed Mathematical Modeling for Oil Degradation Prediction

The temporal evolution of each oil-health indicator y(t) was modeled using non-linear rate equations that capture the physical kinetics of degradation and contaminant buildup. The underlying assumption is that the instantaneous rate of change of a measurable property is proportional to its current magnitude and varies according to how the degradation process accelerates or slows down as time progresses. In practical terms, this means that the rate of change for y(t) is not constant. Instead, the rate at which y(t) increases or decreases depends on both its present value and the duration of oil operation, allowing the model to represent early-stage stability, mid-cycle acceleration, or late-stage saturation in the parameter’s trajectory.
This time-dependent and magnitude-dependent behavior is expressed through the following differential form, as presented in Equation (2).
$$\frac{dy(t)}{dt} = -k\,p\,t^{p-1}\,y(t) \tag{2}$$
where y(t) denotes the property value at time t; k > 0 is the generalized rate constant characterizing the magnitude of change; and p is the time exponent determining whether the process accelerates (p > 1) or decelerates (p < 1) with time. Integrating Equation (2) yields the general nonlinear exponential solution, as shown in Equation (3).
$$y(t) = y_0\,e^{-k t^{p}} \tag{3}$$
where $y_0$ represents the initial property value at t = 0. This generic formulation can represent several degradation archetypes commonly encountered in tribological systems. Depending on whether the variable of interest decreases, increases, or saturates over time, three distinct functional families were employed in Equations (3)–(5).
$$y(t) = y_0 + k\,t^{p} \quad \text{(power-law growth)} \tag{4}$$
$$y(t) = y_0 + A \ln\!\left(1 + \frac{t}{B}\right) \quad \text{(logarithmic growth)} \tag{5}$$
where A and B are positive constants defining, respectively, the amplitude and characteristic timescale of saturation for variables that exhibit bounded accumulation such as oxidation or nitration. For predictive maintenance purposes, the critical time t* at which a monitored variable reaches a threshold value $y_{th}$ (e.g., manufacturer limit or failure criterion) is obtained analytically for each model type using Equations (6)–(8).
$$t^{*}_{\mathrm{exp}} = \left[\frac{1}{k}\ln\!\left(\frac{y_0}{y_{th}}\right)\right]^{1/p} \tag{6}$$
$$t^{*}_{\mathrm{power}} = \left(\frac{y_{th} - y_0}{k}\right)^{1/p} \tag{7}$$
$$t^{*}_{\mathrm{log}} = B\left[\exp\!\left(\frac{y_{th} - y_0}{A}\right) - 1\right] \tag{8}$$
In these expressions, yth denotes the terminal (failure) value of the monitored property, and t* is the estimated time to reach that limit, corresponding to the Remaining Useful Life (RUL) for that indicator. All model constants were optimized independently for each asset–parameter pair using the Levenberg–Marquardt nonlinear least-squares algorithm, with uncertainty bounds derived from 1000 bootstrap resamples. The physical interpretation of the key parameters and constants used in the degradation models (Equations (2)–(8)) is provided in Table 4, ensuring clarity and consistency in their application.
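The threshold-crossing expressions can be checked by hand. The derivation below is a sketch that assumes the decaying sign convention for the exponential model (the convention under which Equation (6) returns a positive time when the threshold lies below the initial value) and p > 0; it is offered as a consistency check rather than a reproduction of the authors' own derivation.

```latex
\begin{align*}
% Equation (2) to Equation (3): separate variables and integrate from 0 to t
\frac{dy}{y} &= -k\,p\,t^{p-1}\,dt
\;\Rightarrow\;
\int_{y_0}^{y(t)} \frac{dy'}{y'} = -k \int_{0}^{t} p\,\tau^{p-1}\,d\tau
\;\Rightarrow\;
\ln\frac{y(t)}{y_0} = -k\,t^{p}
\;\Rightarrow\;
y(t) = y_0\,e^{-k t^{p}} \\[4pt]
% Equation (3) to Equation (6): set y(t^*) = y_{th}
y_{th} &= y_0\,e^{-k (t^{*})^{p}}
\;\Rightarrow\;
(t^{*})^{p} = \frac{1}{k}\ln\frac{y_0}{y_{th}}
\;\Rightarrow\;
t^{*}_{\mathrm{exp}} = \left[\frac{1}{k}\ln\frac{y_0}{y_{th}}\right]^{1/p} \\[4pt]
% Equation (4) to Equation (7)
y_{th} &= y_0 + k\,(t^{*})^{p}
\;\Rightarrow\;
t^{*}_{\mathrm{power}} = \left(\frac{y_{th}-y_0}{k}\right)^{1/p} \\[4pt]
% Equation (5) to Equation (8)
y_{th} &= y_0 + A\ln\!\left(1+\frac{t^{*}}{B}\right)
\;\Rightarrow\;
t^{*}_{\mathrm{log}} = B\left[\exp\!\left(\frac{y_{th}-y_0}{A}\right)-1\right]
\end{align*}
```

Each inversion requires the threshold to lie on the side the model moves toward: below the initial value for the decay model, above it for the two growth models.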
The family of models represented by Equations (3)–(8) captures a wide range of physical behaviors observed in oil aging, including additive depletion, viscosity breakdown, metal accumulation, and oxidation plateauing. Because each parameter’s evolution follows a physically interpretable trajectory, the resulting degradation profiles remain consistent with established tribo-chemical kinetics and laboratory observations.
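The closed-form crossing times translate directly into code. The sketch below implements the three inversions and verifies each by substituting the result back into its forward model; the function names and all numerical values are illustrative assumptions, not fitted parameters from the study.

```python
import math

def t_star_exp(y0, y_th, k, p):
    """Decay model y = y0*exp(-k*t**p); assumes 0 < y_th < y0."""
    return (math.log(y0 / y_th) / k) ** (1.0 / p)

def t_star_power(y0, y_th, k, p):
    """Power-law growth y = y0 + k*t**p; assumes y_th > y0."""
    return ((y_th - y0) / k) ** (1.0 / p)

def t_star_log(y0, y_th, A, B):
    """Logarithmic growth y = y0 + A*ln(1 + t/B); assumes y_th > y0."""
    return B * (math.exp((y_th - y0) / A) - 1.0)

# Illustrative values only (e.g., a TBN-like decay from 10 to a limit of 4).
t_exp = t_star_exp(y0=10.0, y_th=4.0, k=0.002, p=1.3)
t_pow = t_star_power(y0=5.0, y_th=100.0, k=0.5, p=1.2)
t_log = t_star_log(y0=2.0, y_th=25.0, A=8.0, B=50.0)

# Round-trip check: the forward model evaluated at t* returns the threshold.
back_exp = 10.0 * math.exp(-0.002 * t_exp**1.3)
back_pow = 5.0 + 0.5 * t_pow**1.2
back_log = 2.0 + 8.0 * math.log(1.0 + t_log / 50.0)
```

Because t* here is measured from t = 0, the remaining life of an oil already in service would be t* minus its current age.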
In practical oil-condition monitoring, each diagnostic parameter has its own caution and failure thresholds. However, domain experts from the collaborating industrial oil-analysis provider emphasized that the first parameter to reach a caution threshold typically reflects early degradation or additive stress, not functional end-of-life. Hence, oil is technically considered to have reached end-of-life when the second independent indicator crosses its failure threshold, marking the transition from a warning state to confirmed functional degradation.
Following this expert-defined rule, the framework computes the time-to-threshold (t*) for all monitored indicators and orders them chronologically. The final RUL is then defined as the time of the second threshold-crossing event in this ordering, rather than the earliest single crossing. This expert-informed rule is the key integration mechanism that synthesizes the independently modeled parameter trajectories: it ensures the final RUL prediction reflects a consensus of multiple degradation pathways, prevents premature failure declarations caused by single-parameter anomalies, and aligns the RUL estimation with the decision logic used in large industrial oil-analysis laboratories.
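One way to express the second-threshold rule in code is sketched below. It adopts one reading of the rule, namely that the confirming indicator must belong to a different degradation category than the first one to cross; the data structure, function name, and crossing times are hypothetical.

```python
def second_threshold_rul(crossings):
    """Apply the second-threshold rule.

    crossings: list of (indicator, category, t_star) tuples.
    Returns (indicator, t_star) for the earliest crossing that follows the
    first one and comes from a different category, or None if there is none.
    """
    if not crossings:
        return None
    ordered = sorted(crossings, key=lambda c: c[2])
    first_category = ordered[0][1]
    for name, category, t_star in ordered[1:]:
        if category != first_category:
            return name, t_star
    return None

# Hypothetical predicted crossing times in operating hours.
crossings = [
    ("Oxidation", "transformation", 600.0),
    ("TBN", "depletion", 420.0),
    ("Fe", "accumulation", 510.0),
    ("Ca", "depletion", 450.0),
]
result = second_threshold_rul(crossings)
# TBN crosses first (warning state); Ca is skipped because it shares the
# depletion category; Fe confirms end-of-life at 510 h.
```

Under the simpler reading, in which any second indicator confirms end-of-life, the function would return the second element of the sorted list without the category check.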

3.5. Multi-Equipment Hierarchical Modeling Strategy

To address the dual challenges of data sparsity (per asset) and heterogeneous operations (across the fleet), a three-tier hierarchical modeling strategy is employed. This structure is designed to first learn universal degradation kinetics shared by all assets using the same oil, then efficiently adapt this knowledge to specific equipment and individual machines, thereby enabling accurate RUL prediction even for assets with very limited local data. The three complementary modeling paradigms are: Asset-Specific (AS), Global Model (GM), and Hierarchical Transfer Learning (TL), as shown in Figure 1. The AS model captures individual asset behavior, the GM learns general degradation trends across all equipment, and the TL model builds upon the GM baseline to refine predictions at both the equipment and asset levels.
This hierarchical strategy enables the framework to generalize across equipment categories while maintaining sensitivity to local operating conditions. The hierarchical model prediction can be represented by Equation (9).
$$\hat{y} = f_{\mathrm{global}}(X) + g_{\mathrm{equipment}}(X) + h_{\mathrm{asset}}(X) \qquad (9)$$
where fglobal(X) denotes the base fleet-level model, gequipment(X) captures residual deviations at the equipment category level, and hasset(X) refines local asset behavior, as described in Figure 2. In the proposed framework, the transfer process is guided by physical priors: the global model first learns the parameters of the physical degradation equations such as the rate constant k, time exponent p, and saturation constants A and B. These physically meaningful parameters are then passed to the equipment and asset levels, where they are only fine-tuned rather than replaced. This ensures that every stage of the hierarchy remains consistent with known degradation behavior and avoids non-physical predictions.
To prevent negative transfer, each layer is trained on the residuals of the previous layer, and it is kept only if the validation R2 improves by at least 0.02. This guarantees that the added layer contributes useful information rather than unnecessary complexity. In addition, Group-KFold cross-validation is applied at the asset level to avoid data leakage and to ensure that the learned patterns generalize properly to unseen equipment.
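The residual-stacking and acceptance logic can be illustrated with a deliberately simple sketch. Here each added layer is just a mean-residual offset per group, the global model is a fixed stand-in, and the validation samples come from the same groups as the training samples; all of these are simplifying assumptions, and the names `gated_layer` and `group_offsets` are hypothetical. Only the 0.02 gain threshold is taken from the text.

```python
from collections import defaultdict


def r2_score(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def group_offsets(keys, residuals):
    """Mean residual per group: a stand-in for a residual learner."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, res in zip(keys, residuals):
        sums[key] += res
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def gated_layer(keys_tr, y_tr, pred_tr, keys_va, y_va, pred_va, min_gain=0.02):
    """Fit a layer on training residuals; keep it only if validation R2
    improves by at least min_gain."""
    residuals = [y - p for y, p in zip(y_tr, pred_tr)]
    offsets = group_offsets(keys_tr, residuals)
    new_tr = [p + offsets.get(k, 0.0) for k, p in zip(keys_tr, pred_tr)]
    new_va = [p + offsets.get(k, 0.0) for k, p in zip(keys_va, pred_va)]
    gain = r2_score(y_va, new_va) - r2_score(y_va, pred_va)
    if gain >= min_gain:
        return new_tr, new_va, True
    return pred_tr, pred_va, False       # layer rejected, predictions unchanged


# Synthetic data: an equipment-level shift exists, an asset-level shift does not.
shift = {"truck": 3.0, "dozer": -3.0}
x_tr = [1, 2, 3, 4, 5, 6, 7, 8]
equip_tr = ["truck"] * 4 + ["dozer"] * 4
asset_tr = ["T1", "T1", "T2", "T2", "D1", "D1", "D2", "D2"]
jitter = [0.1, -0.1] * 4
y_tr = [2 * x + shift[e] + j for x, e, j in zip(x_tr, equip_tr, jitter)]
x_va = [2.5, 3.5, 5.5, 7.5]
equip_va = ["truck", "truck", "dozer", "dozer"]
asset_va = ["T1", "T2", "D1", "D2"]
y_va = [2 * x + shift[e] for x, e in zip(x_va, equip_va)]

pred_tr = [2.0 * x for x in x_tr]        # stand-in for the global model
pred_va = [2.0 * x for x in x_va]

pred_tr, pred_va, kept_equipment = gated_layer(
    equip_tr, y_tr, pred_tr, equip_va, y_va, pred_va)
pred_tr, pred_va, kept_asset = gated_layer(
    asset_tr, y_tr, pred_tr, asset_va, y_va, pred_va)
final_r2 = r2_score(y_va, pred_va)
```

With this synthetic data the equipment layer is retained because it removes a real offset, while the asset layer is discarded because it adds nothing beyond noise.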
Uncertainty in predicted RUL was assessed using percentile bootstrap ensembles, with 1000 resamples per model. The bootstrapped RUL distribution at time t is expressed as shown in Equation (10).
$$\mathrm{RUL}_b(t) = \hat{t}_{f,b} - t \qquad (10)$$
where $\hat{t}_{f,b}$ is the bootstrap-estimated failure time for replicate b. From these replicates, 95% confidence intervals were derived by Equation (11), representing combined aleatoric (data variability) and epistemic (model) uncertainty. Residual diagnostics included normality (Shapiro–Wilk), autocorrelation (Durbin–Watson), and heteroscedasticity (Breusch–Pagan) tests to verify statistical assumptions.
$$\mathrm{CI}_{\mathrm{RUL}} = \left[\hat{t}_{f,2.5\%} - t,\; \hat{t}_{f,97.5\%} - t\right] \qquad (11)$$
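A percentile bootstrap of this kind can be sketched as follows. For brevity the sketch fits a straight line in place of the kinetic models, resamples (t, y) pairs, and uses synthetic data; these choices, the helper names, and the threshold value are assumptions made only for illustration.

```python
import random


def fit_line(ts, ys):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(ts)
    mt, my = sum(ts) / n, sum(ys) / n
    sxx = sum((t - mt) ** 2 for t in ts)
    sxy = sum((t - mt) * (y - my) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return my - slope * mt, slope


def failure_time(ts, ys, y_th):
    intercept, slope = fit_line(ts, ys)
    return (y_th - intercept) / slope if slope > 0 else None


def percentile(sorted_vals, q):
    """Linear-interpolation percentile for q in [0, 1]."""
    idx = q * (len(sorted_vals) - 1)
    lo = int(idx)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = idx - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def bootstrap_rul_ci(ts, ys, y_th, t_now, n_boot=1000, seed=0):
    rng = random.Random(seed)
    n = len(ts)
    ruls = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bt = [ts[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bt)) < 2:             # degenerate resample: cannot fit a line
            continue
        t_f = failure_time(bt, by, y_th)
        if t_f is not None:
            ruls.append(t_f - t_now)     # RUL_b(t) = t_f,b - t
    ruls.sort()
    return percentile(ruls, 0.025), percentile(ruls, 0.975)


# Synthetic accumulation-type readings (hours, arbitrary units).
ts = [50, 100, 150, 200, 250, 300, 350, 400]
ys = [15.8, 19.5, 25.3, 29.1, 35.6, 39.8, 45.4, 49.4]
point_rul = failure_time(ts, ys, y_th=60.0) - 400.0
ci_low, ci_high = bootstrap_rul_ci(ts, ys, y_th=60.0, t_now=400.0)
```

The interval bounds are the 2.5th and 97.5th percentiles of the replicate RUL values, which mirrors the structure of Equation (11).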
Model performance is evaluated using root mean square error (RMSE), mean absolute error (MAE), coefficient of determination (R2), relative accuracy (RA), and concordance index (C-index). Stratified five-fold cross-validation is applied at the asset level to prevent data leakage between training and testing partitions. Comparative performance among AS, GM, and TL configurations is statistically analyzed using the Wilcoxon signed-rank test for paired differences and Levene’s test for variance homogeneity, following confirmation of residual normality.
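For reference, four of the listed metrics can be computed as in the sketch below. The concordance index is written in its standard pairwise form without censoring, which is an assumption about the variant used; relative accuracy is omitted because its exact definition is not given in this section. The example values are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs that are ranked in the same order."""
    concordant, comparable = 0.0, 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue                 # tied targets are not comparable
            comparable += 1
            d = (y_true[i] - y_true[j]) * (y_pred[i] - y_pred[j])
            if d > 0:
                concordant += 1.0
            elif d == 0:
                concordant += 0.5        # tied predictions count as half
    return concordant / comparable


# Hypothetical RUL values in hours.
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 320.0, 380.0]
scores = {
    "rmse": rmse(y_true, y_pred),
    "mae": mae(y_true, y_pred),
    "r2": r2(y_true, y_pred),
    "c_index": concordance_index(y_true, y_pred),
}
```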
The proposed methodology is built on the assumption of monotonic degradation behavior and the use of uniformly spaced sampling intervals. These assumptions facilitate consistent cross-equipment comparisons by ensuring that the temporal evolution of degradation indicators can be aligned across diverse assets. While this simplification greatly enhances model calibration, interpretability, and computational efficiency, it inevitably limits the framework’s responsiveness to transient or non-monotonic phenomena, such as sudden changes in load, temperature, or operational regimes. Such short-term fluctuations, although often brief, may contain valuable diagnostic information that could improve early fault detection or anomaly recognition. Furthermore, the relatively small size and heterogeneity of the available datasets for certain asset classes pose challenges for developing a fully generalizable Global Model. This constraint also partially limits the effectiveness of the transfer learning component, as knowledge extracted from data-rich assets cannot always be seamlessly adapted to data-sparse cases. Despite these limitations, the proposed framework demonstrates strong interpretability, reproducibility, and adaptability, making it suitable for practical deployment in industrial environments.
In summary, the proposed methodology transforms sparse, single-point industrial oil data into reliable RUL estimates through a physics-informed pipeline: (1) reconstructing continuous degradation trajectories using physical kinetic models, (2) transferring fleet-wide degradation knowledge to individual assets via a hierarchical learning framework, and (3) synthesizing multi-parameter predictions through an expert-defined “second-threshold” decision rule. This integrated approach directly addresses the challenges of uncontrolled, heterogeneous field data.

4. Results and Discussion

This section presents the outcomes of the proposed framework in three progressive levels: parameter-level model performance, equipment-level generalization, and comparative evaluation across modeling strategies. This hierarchical structure reflects how the models capture degradation behavior from individual oil parameters up to equipment-specific domains.

4.1. Correlation Analysis Across Equipment Types

To investigate the interdependence of wear metals, contaminants, additive indicators, and degradation parameters across the fleet, correlation heatmaps were generated for each equipment category: dozers, dump trucks, shovels, and wheel loaders (Figure 3). Each matrix identifies whether variables such as Fe, PQ Index, soot, viscosity, TBN, and oxidation exhibit systematic relationships that can be used to support RUL prediction.
Across all equipment types, a high correlation between PQ Index and Fe is consistently observed, confirming that PQ Index reliably tracks ferrous wear progression. This supports its role as an indicator of mechanical deterioration rather than purely chemical degradation. Viscosity, oxidation, and TBN show weaker correlations with wear metals, reflecting the multi-mechanism nature of lubricant aging, where chemical degradation and mechanical wear evolve partly independently.
These patterns justify the current modeling approach where RUL is not predicted from a single dominant indicator but from the combined evolution of chemically driven (oxidation, viscosity, TBN/TAN) and mechanically driven (PQ, Fe, Cu, Pb) degradation parameters. The multi-indicator correlations further reinforce the need for the hierarchical transfer learning framework, which can accommodate heterogeneous degradation influences across different asset categories.
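The matrices underlying such heatmaps are pairwise correlation tables. The sketch below assumes Pearson correlation, which this section does not state explicitly, and uses made-up readings purely to show the computation.

```python
import math


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """columns: dict name -> list of readings (equal lengths)."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical readings for one equipment category.
samples = {
    "Fe": [10.0, 20.0, 30.0, 40.0, 50.0],
    "PQ": [12.0, 22.0, 29.0, 41.0, 52.0],
    "TBN": [9.0, 8.5, 8.7, 7.9, 8.2],
}
corr = correlation_matrix(samples)
```

In this toy data `corr["Fe"]["PQ"]` is close to 1, the kind of strong Fe and PQ Index coupling described above, while `corr["Fe"]["TBN"]` is negative.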

4.2. Parameter-Level Model Performance

This subsection evaluates the ability of the developed models to reproduce the laboratory-measured lubricant parameters that represent wear, contamination, and additive depletion. The R2 coefficient of determination was used as the main indicator of predictive accuracy, supported by visual assessment of the predicted versus actual distributions.
In interpreting the parameter predictions, it is important to note that the laboratory control limits define the acceptable variation range for each oil property. These limits are derived from the company condition monitoring thresholds and act as the practical boundaries within which the models should reproduce measured values. A prediction that stays within these limits, even with minor deviation from the exact laboratory point, is still considered operationally valid. Hence, the model accuracy was judged not only by statistical fit but also by how often the predicted values remained within the established control limits.
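The operational criterion described here amounts to a simple coverage rate. The following sketch shows one way to compute it; the function name and the limit values are hypothetical, since the actual control limits are company-specific.

```python
def within_limits_rate(predictions, lower, upper):
    """Share of predicted values that fall inside [lower, upper]."""
    inside = sum(1 for p in predictions if lower <= p <= upper)
    return inside / len(predictions)


# Hypothetical TBN predictions checked against illustrative control limits.
tbn_pred = [8.1, 7.6, 6.9, 5.8, 9.4]
rate = within_limits_rate(tbn_pred, lower=6.0, upper=9.0)
```

Reporting this rate alongside R2 separates statistical fit from operational validity.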
Table 5 summarizes the best-performing generalized models across all measured parameters. Gradient Boosting and Extra Trees algorithms consistently achieved the highest predictive accuracy, particularly for Magnesium (R2 = 0.946) and Calcium (R2 = 0.902), both linked to additive depletion and detergent retention. The strong performance in these parameters suggests that the models effectively captured the gradual decline typical of base-additive consumption processes. In contrast, Sodium and Silicon showed lower R2 values due to higher variability in field measurements and contamination effects rather than algorithmic limitations.
While the model shows weak predictive behavior for Sodium and Water, this limitation is both expected and operationally acceptable. These parameters typically remain below detection thresholds for most samples and increase only during abnormal events such as coolant leakage, seal failure, or moisture ingress. Their statistical variance is therefore driven more by measurement noise than progressive degradation chemistry, making them unsuitable predictors for continuous RUL estimation. Rather than a deficiency of the model, this reflects the physical nature of these indicators: they serve as binary contamination alerts, not gradual wear markers. Accordingly, their low R2 does not impact the model’s ability to estimate degradation trends or remaining oil life, and their monitoring should remain threshold-triggered rather than regression-based.
The R2 values for several parameters fall below 0.75, which is common in real-world oil-analysis datasets. Field samples are inherently influenced by operational variability, fluctuating load profiles, intermittent contamination events, and laboratory measurement noise. These factors collectively limit the maximum achievable coefficient of determination, even for well-constructed models. Therefore, moderate R2 values should not be interpreted as being indicative of poor predictive performance; rather, they reflect the true complexity and noise inherent in practical industrial environments.
Importantly, the Transfer Learning model consistently outperforms other configurations, exhibiting lower uncertainty and more stable predictions across different engines. This demonstrates that the model effectively captures the underlying degradation behavior despite the unavoidable noise in field data, highlighting the robustness and practical value of the proposed framework.
Table 6 compares the predictive accuracy of Asset-Specific, Global Model, and transfer learning models for key lubricant parameters representing wear (Fe), contamination (Soot), and additive depletion (P, TBN, Mg, Al).
In each case, transfer learning achieved the highest R2. Furthermore, it produced the lowest prediction errors, indicating that the incorporation of generalized cross-equipment knowledge provides a stronger starting point for learning. By fine-tuning this prior knowledge to the operating conditions of a specific engine, the TL model captures degradation patterns more effectively than models trained from scratch. This demonstrates that transfer learning not only improves statistical accuracy but also enhances the robustness and reliability of predictions across different degradation pathways.
The superior performance of the TL model in capturing the trends of these accumulation-type parameters is visually confirmed in Figure 4, Figure 5 and Figure 6. These figures illustrate how the TL-predicted trajectories for Aluminum, Iron, and Soot align closely with the measured laboratory data, effectively capturing both linear and nonlinear growth behaviors, whereas the AS and GM models either deviate substantially or over-smooth the trends due to limited exposure to broader degradation characteristics.
A similar advantage is observed for depletion-type parameters. Figure 7, Figure 8 and Figure 9 illustrate that the TL model accurately tracks the decline of Magnesium, Zinc, and TBN. Its predictions follow the true degradation pathway while maintaining physical plausibility and staying within operational control limits. The AS and GM models, on the other hand, exhibit higher variability, less stability, and larger prediction errors because they lack the benefit of prior generalized representations.
Overall, the clear improvement shown by TL across both accumulation and depletion behaviors highlights the importance of reusing learned degradation dynamics and refining them to the specific engine environment.
The consistent performance improvement across parameters and equipment confirms that oil degradation mechanisms, particularly oxidation, metallic wear, and additive depletion, exhibit transferable kinetic signatures. Once a Global pre-trained model is established, new assets can be efficiently adapted through fine-tuning, enabling scalable and reliable predictive maintenance deployment.
Nonetheless, a few parameters, such as Sodium, Water, and Silicon, yielded very low or slightly negative R2 values. This does not indicate poor model performance but rather reflects the intrinsic characteristics of these variables. Their concentrations remain close to zero for most samples and increase only under rare contamination or abnormal operating conditions. Consequently, the available data for these parameters show minimal true variation, and any observed fluctuations are often dominated by measurement noise rather than meaningful degradation trends. Under such low signal-to-noise conditions, predictive correlation becomes statistically unstable, producing near-zero or negative R2 values.
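A small numerical example makes the negative-R2 case concrete. Since R2 = 1 − SS_res/SS_tot, it falls below zero whenever the residual sum of squares exceeds the variance around the mean, which is easily the case when the target is almost constant. The values below are synthetic and chosen only to show the effect.

```python
def r2(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic contaminant readings: near zero except for one rare event.
measured = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
# Predictions that look small in absolute terms but miss the event.
predicted = [0.3, 0.1, 0.2, 0.4, 0.0, 0.2]

score = r2(measured, predicted)   # SS_res = 0.94, SS_tot = 5/6
```

Here the absolute errors are small, yet the score is about −0.13 because there is almost no variance for the model to explain.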
These findings suggest that while such parameters provide limited predictive value, those exhibiting stronger and more consistent dynamic behavior, such as Fe, Cu, Mg, and Oxidation, are the principal contributors to accurate oil degradation modeling and Remaining Useful Life (RUL) estimation.
Across these parameters, the models reproduced both linear and nonlinear degradation tendencies with high fidelity. Aluminum and Iron increased progressively with usage, whereas Soot showed exponential-like behavior. The predicted values aligned closely with measured laboratory data, confirming that the models successfully captured physical degradation patterns.
Furthermore, predicted trends remained within realistic operational bounds, showing no artificial extrapolation or saturation outside the observed control limits. These limits, derived from the upper and lower quantiles of historical laboratory data, represent the practical thresholds beyond which oil is considered degraded or contaminated. The model outputs stayed consistently inside these intervals, confirming that the learning process respected the physical range of each parameter rather than generating implausible values. This strengthens confidence that the predictive behavior aligns with real-world maintenance control criteria. The parameter-level assessment establishes that the developed models can effectively capture the degradation behavior of individual oil properties.
The next step is to evaluate whether these learned relationships can generalize across different engine types, where operating conditions, loading cycles, and maintenance intervals vary substantially.

4.3. Equipment-Level Generalization

This subsection examines how the developed framework performs when generalizing across different heavy-duty equipment categories, each characterized by unique loading and contamination environments.

4.3.1. Dump Trucks

Dump trucks exhibited the strongest overall model performance among all equipment types. The Global Model achieved a higher R2 than those of shovels, wheel loaders, and dozers, mainly due to the larger dataset (888 samples) providing broader coverage of degradation behaviors, as shown by Table 7 and Table 8. This enabled the model to capture general wear and contamination dynamics with greater stability and lower variance.
In contrast, the transfer learning (TL) configuration showed slightly lower R2 values, as the data were distributed across thirteen assets with uneven record counts. For instance, Asset L contained only 29 samples (20 for training and 9 for testing), limiting fine-tuning efficiency and highlighting how small partitions weaken TL generalization. The detailed predictive performance for each parameter, comparing Asset-Specific (AS), Global Model (GM), and Transfer Learning (TL) models for Dump Trucks, is provided in Table 7. While the GM often provided a solid baseline, the TL model achieved the highest R2 for critical parameters like Aluminum, Calcium, Iron, and Magnesium, demonstrating its ability to refine predictions even with imbalanced asset data.
The resulting degradation trajectories for dump trucks showed physically consistent behavior: wear metals and soot increased with service hours, while detergent-related parameters (Ca, Mg, Zn) gradually depleted. This aligns with the expected operational profile of high-load transport units operating under sustained thermal and abrasive conditions.
The aggregated performance metrics for Dump Trucks are visualized in Figure 10, which clearly shows the distribution of prediction errors across the three modeling strategies. This visual representation underscores that while the GM has a moderate overall fit, the TL model achieves a tighter clustering of lower errors, confirming its enhanced predictive stability for this equipment class.

4.3.2. Shovels

For the shovels, both Global Model and Transfer Learning models produced balanced performance with moderate R2 values. Table 9 and Table 10 show that the TL model maintained high accuracy across metallic and non-metallic parameters. Strong R2 values for Si, Zn, and P highlight its ability to capture cross-equipment degradation patterns, confirming the framework’s scalability and reliability for fleet-level monitoring. The TL models showed noticeable improvement over the Asset-Specific baseline, indicating that shared knowledge between units contributed to more stable predictions. However, the smaller number of samples per asset and the heterogeneous working cycles of shovel engines introduced additional noise compared to the dump truck category.
The relatively small number of samples available for several assets had a visible impact on transfer learning stability. For example, some shovel units had as few as twenty to thirty usable observations, which were further split into training and testing portions. Such limited data reduce the representation of the operating variability and constrain the model’s ability to adapt shared knowledge effectively. This explains the wider performance spread across individual assets despite the same model architecture.
A comparative visualization of the models’ performance for Shovels is provided in Figure 11. The chart highlights the remarkable leap in performance achieved by the TL configuration, which dominates across most parameters, visually reinforcing the quantitative results presented in Table 10 and demonstrating effective knowledge transfer despite data limitations.

4.3.3. Wheel Loaders

Wheel loaders demonstrated similar degradation patterns but with slightly lower overall predictive accuracy than shovels. These machines typically experience variable duty cycles with frequent idling, which causes irregular oil oxidation and soot formation. Despite that, the TL configuration effectively captured the general behavior of critical elements such as Iron and Soot, confirming its adaptability under inconsistent operating profiles.
As shown in Table 11 and Table 12, the Transfer Learning model achieved R2 > 0.9 for most parameters, effectively capturing additive depletion and oxidation trends through shared knowledge from similar assets. The consistent advantage of the TL approach for Wheel Loaders is graphically summarized in Figure 12. The visual comparison clearly depicts the TL model’s superior accuracy and reduced error variance compared to the AS and GM baselines, validating its adaptability to the variable duty cycles characteristic of this equipment.

4.3.4. Dozers

Dozers showed the lowest Global Model performance among the four equipment categories, primarily due to limited available data and the smaller number of operational cycles. Nonetheless, TL improved prediction stability for several additive-related parameters, proving that even when individual datasets are small, shared learning from similar assets can partially compensate for data scarcity.
Table 13 and Table 14 demonstrate a clear performance leap for Dozers, where TL achieved near-perfect fits for most key degradation indicators. Notably, oxidation-related parameters (e.g., Oxidation, TBN, and Ca) showed R2 values approaching unity, confirming the framework’s capacity to generalize chemical aging patterns across assets. Nevertheless, wear-metal indicators such as Mo and PQ Index remained less predictable, reflecting the higher stochastic variability in wear generation under mixed operating loads.
Despite the challenges posed by the small dataset for Dozers, the performance gains from transfer learning are evident in Figure 13. The visual summary demonstrates that the TL model successfully mitigates the poor performance seen in the AS and GM models, achieving a robust and reliable prediction level that the other strategies could not attain with the limited data available.
Among all equipment categories, the Global Model for dump trucks achieved the highest R2 values, outperforming the Global Models of shovels, wheel loaders, and dozers. This result is largely explained by the larger and more diverse training dataset of about 888 samples, the largest among all equipment types. The model benefited from the broader statistical representation of operating conditions, which enhanced generalization. Conversely, the transfer learning models for dump trucks performed relatively lower than those of other equipment. The reason lies in the division of data across thirteen individual assets, some of which had fewer than thirty samples in total. For instance, Asset L contributed only around twenty-nine records, with roughly twenty used for training and nine for testing. This uneven and limited sample distribution restricted the learning potential of the transfer setup and reduced its comparative advantage.
While equipment-level testing demonstrates the adaptability of the framework to different operating domains, a broader perspective is needed to determine the overall effectiveness of each modeling strategy. The following subsection therefore compares the Asset-Specific, Global Model, and Transfer Learning approaches to assess their relative strengths in predictive accuracy and generalization.

4.4. Comparative Evaluation of Modeling Strategies

The comparative assessment across the three modeling configurations—Asset-Specific (AS), Global Model (GM), and Transfer Learning (TL)—provides an integrated understanding of predictive stability, generalization capacity, and robustness under heterogeneous operating environments. As summarized in Figure 10, Figure 11, Figure 12 and Figure 13 and Table 8, Table 9, Table 10, Table 11 and Table 12, the Transfer Learning (TL) approach consistently achieved the highest explanatory performance and the lowest prediction errors across most equipment classes and parameters.
The aggregated results in Table 15 confirm the superior generalization capability of the Transfer Learning approach. While the Global Model achieved moderate explanatory power, its prediction errors remained comparatively high and uncertain. In contrast, the TL configuration delivered a threefold reduction in RMSE and a confidence interval width nearly half that of the GM model, highlighting both higher predictive precision and lower output variance across the full dataset.
The TL model consistently outperformed both the Asset-Specific and Global configurations because it inherits generalized degradation knowledge learned at the fleet level while also fine-tuning to the operating profile of each asset. This enables TL to avoid both the overfitting that is common in Asset-Specific models with limited samples and the underfitting that often occurs in Global models, where engine behavior is averaged. Since oil degradation follows shared thermochemical patterns such as oxidation, additive depletion, and soot growth, these kinetics are transferable across machines using the same lubricant. TL leverages this by learning the global degradation structure first, then adapting it locally with minimal calibration. As a result, it achieves higher R2, lower uncertainty, and better generalization across diverse field conditions, demonstrating a structural learning advantage rather than a simple numerical improvement.
To provide clear quantitative evidence of the model’s behavior on individual assets, Table 16 presents the predicted RUL, caution thresholds, and alert points for one representative sample from each asset category, mentioned earlier in Table 2. It is crucial to note that the framework assesses two independent states: (a) machine state which is evaluated via wear-related parameters (e.g., Fe, Cu, Al) and (b) oil state evaluated via oil-specific indicators (e.g., TBN, additives, viscosity). The failure logic is applied separately to each state. End-of-life is declared either when: (a) any single parameter reaches its alert threshold, or (b) two distinct parameters reach at least their caution thresholds. The “1st Caution,” “2nd Caution,” and “Alert & Time” columns in Table 16 report the chronological order in which different parameters are predicted to cross these limits, identifying the key indicators driving the risk for each asset.
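The two-branch failure logic stated above can be written compactly. The sketch below is one reading of that rule, applied separately to the machine state and the oil state; the function name and every threshold time are hypothetical and are not values from Table 16.

```python
def end_of_life_time(predictions):
    """Earliest time at which end-of-life is declared for one state.

    predictions maps a parameter name to a dict with predicted crossing
    times (hours) under "caution" and "alert"; None means no crossing.
    Rule: (a) any single alert, or (b) the second distinct caution.
    """
    alerts = [p["alert"] for p in predictions.values() if p.get("alert") is not None]
    cautions = sorted(
        p["caution"] for p in predictions.values() if p.get("caution") is not None
    )
    candidates = []
    if alerts:
        candidates.append(min(alerts))       # branch (a)
    if len(cautions) >= 2:
        candidates.append(cautions[1])       # branch (b)
    return min(candidates) if candidates else None


# Hypothetical predicted crossing times for one asset.
oil_state = {
    "Zn": {"caution": 40.0, "alert": 300.0},
    "TBN": {"caution": 130.0, "alert": 350.0},
    "Viscosity": {"caution": None, "alert": None},
}
machine_state = {
    "Fe": {"caution": 500.0, "alert": 900.0},
    "Al": {"caution": 200.0, "alert": None},
}
oil_eol = end_of_life_time(oil_state)          # second caution (TBN)
machine_eol = end_of_life_time(machine_state)  # second caution (Fe)
```

Keeping the two states separate allows an oil change to be scheduled without implying a mechanical intervention, and vice versa.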
The results in Table 16 illustrate distinct degradation patterns and the application of the logic described above. In the case of the Dump Truck (DA-2), the oil is already at end-of-life (RUL = 0 h) because Molybdenum (Mo) has reached its alert threshold. The machine itself remains healthy. For the Shovel (SA-1), the machine is predicted to fail at 282 h; however, neither of the two indicators that crossed their threshold levels reached an alert level. On the other hand, the oil reaches end-of-life earlier, at 177 h (triggered by a Si alert). This suggests that although the machine is currently healthy, an oil change is the imminent maintenance need. Moving to the Wheel Loader (WF-8), the oil is predicted to reach end-of-life in 121 h, triggered when TBN reaches its caution threshold as the second indicator (following an earlier Zn caution). The machine shows a caution for Al but has a long RUL (800 h). The results for the Dozer (DOC-1) indicate that multiple oil parameters are in alert, mandating immediate oil replacement (RUL = 0 h). Concurrently, the machine is projected to fail due to wear (Fe) in 419 h, indicating that a major maintenance event should follow.
This quantitative breakdown demonstrates that each asset follows a unique degradation trajectory governed by different combinations of chemical, wear, and contamination indicators. These case-specific patterns underscore the necessity of the proposed multi-indicator framework. Reliable RUL estimation cannot depend on a single parameter; instead, it requires the integrated assessment enabled by the “second-indicator” failure rule, which accurately reflects the complex, multi-faceted nature of oil degradation in real-world operating conditions.

4.4.1. Overall Predictive Strength and Bias Trends

The AS models, trained exclusively on individual asset histories, exhibited strong within-sample fitting but poor generalization, frequently yielding negative or near-zero R2 values (e.g., Dump Trucks and Dozers). This behavior indicates overfitting to Asset-Specific noise and insufficient learning of transferable degradation dynamics. Conversely, GM configurations, which leveraged pooled multi-asset data, demonstrated moderate improvements in mean R2 and reduced variance but occasionally suffered from underfitting, particularly when asset behaviors diverged substantially due to different loading conditions or duty cycles.
TL configurations achieved the most balanced behavior. By leveraging pre-trained representations from the global domain and fine-tuning on limited Asset-Specific samples, TL reconciled the bias–variance trade-off: it preserved general degradation structure while adapting to local signal variations. This hybridization yielded the most stable and interpretable results, with average R2 values exceeding 0.7 in Shovels and Wheel Loaders, and markedly reduced error dispersion relative to both AS and GM models.

4.4.2. Error Distribution, Residual Behavior, and Uncertainty

Residual analyses revealed that AS models often exhibited multimodal and heteroscedastic patterns, consistent with overfitting and sensitivity to measurement noise. GM residuals were more uniform but displayed systematic bias at extreme degradation levels, implying limited flexibility in capturing nonlinear progression. In contrast, TL residuals were approximately Gaussian and centered near zero, indicating unbiased estimation across the degradation range. Furthermore, uncertainty quantification using bootstrap resampling (1000 replicates) showed that TL yielded the lowest coefficient of variation in predicted outputs, on average 7–10% lower than GM, confirming superior robustness under variable data regimes.
A pairwise win-rate matrix (Figure 14) summarizes the frequency with which each model outperformed the others across all parameters and assets, illustrating the overall consistency of TL’s advantage. The matrix demonstrates that TL achieved the highest pairwise win rate in terms of RMSE, outperforming both AS and GM models across the majority of degradation parameters and equipment types. This reinforces the quantitative evidence of TL’s dominant predictive reliability.
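A pairwise win-rate matrix of this kind can be built from per-case RMSE values as sketched below. The sketch assumes a "win" means strictly lower RMSE on the same parameter and asset combination; the RMSE numbers are invented for illustration and do not come from Figure 14.

```python
def win_rate_matrix(rmse_by_model):
    """rmse_by_model: dict model -> list of RMSE values aligned by case.

    Entry [a][b] is the fraction of cases where model a has a strictly
    lower RMSE than model b; the diagonal is None.
    """
    models = list(rmse_by_model)
    matrix = {}
    for a in models:
        matrix[a] = {}
        for b in models:
            if a == b:
                matrix[a][b] = None
                continue
            pairs = list(zip(rmse_by_model[a], rmse_by_model[b]))
            wins = sum(1 for ra, rb in pairs if ra < rb)
            matrix[a][b] = wins / len(pairs)
    return matrix


# Hypothetical RMSE values for four parameter/asset cases.
rmse_values = {
    "AS": [30.0, 25.0, 40.0, 22.0],
    "GM": [20.0, 27.0, 35.0, 18.0],
    "TL": [10.0, 12.0, 36.0, 9.0],
}
wins = win_rate_matrix(rmse_values)
```

Because ties count for neither model, `wins[a][b]` and `wins[b][a]` need not sum to one when two models have equal RMSE on a case.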

4.4.3. Cross-Parameter and Cross-Asset Generalization

At the parameter level, elements such as Magnesium, Calcium, and Iron (Table 5) displayed high average R2 values across TL models, demonstrating strong inter-equipment consistency for chemically stable wear and additive markers. Less stable indicators (e.g., Sodium, Lead, and Silicon) retained lower transferability due to sporadic contamination events and inconsistent sampling. However, even for these difficult cases, TL maintained positive predictive capacity where GM and AS models failed entirely.
Across equipment classes, TL’s adaptability was most pronounced in Shovels and Wheel Loaders, where operational variability and sensor noise were high. In contrast, for Dump Trucks—characterized by more homogeneous duty cycles—GM models occasionally approached TL performance, highlighting that transfer benefits scale with environmental and operational diversity.
Synthesizing across metrics, TL emerges as a scalable and physically interpretable modeling paradigm. It effectively bridges the gap between local specialization and global generalization, preserving the nonlinear degradation signatures specific to each asset while leveraging the shared physical–chemical progression patterns learned from the broader fleet. This balanced performance translates into lower structural bias, reduced error sensitivity, and improved uncertainty calibration across diverse operational contexts.

5. Conclusions and Future Work

The integrated results and cross-domain interpretation show that oil degradation dynamics follow transferable patterns across heterogeneous assets, providing a solid basis for generalizable prognostic models. Building on these findings, this study introduced a hierarchical transfer learning (TL) framework for multi-equipment oil degradation modeling and Remaining Useful Life (RUL) estimation from real-world, non-time-series data. Unlike prior work that depends on controlled, time-resolved sampling, the proposed framework reconstructs nonlinear degradation trajectories from single-point, irregularly sampled oil data by combining equipment-level generalization with Asset-Specific adaptation.
When evaluated on field data, the approach delivered strong predictive performance (average R2 = 0.979, RMSE = 10.185) and substantially reduced uncertainty relative to baseline models. These results indicate that the TL strategy can reliably infer realistic degradation dynamics even without continuous time-series measurements. Practically, a model trained for equipment-level patterns can be fine-tuned to individual assets using only a small amount of local data, which lowers calibration effort while preserving prediction accuracy. This makes the method straightforward to deploy across large fleets: it reduces computational and data-collection burdens without sacrificing reliability. The ability to reconstruct degradation from sparse, uncontrolled sampling supports more informed oil-drain decisions, fewer unplanned interruptions, and improved operational availability.
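As a rough illustration of the fine-tuning idea described above, the sketch below adapts a fleet-level linear trend to a single asset by shrinking the asset's least-squares slope toward the fleet slope. The linear trend, the function names (`fit_slope`, `fine_tune_slope`), the shrinkage weight `lam`, and all sample readings are assumptions made purely for illustration; they stand in for, and are much simpler than, the models used in the framework.

```python
def fit_slope(hours, values, intercept):
    """Least-squares slope of values = intercept + slope * hours (intercept held fixed)."""
    sxy = sum(h * (v - intercept) for h, v in zip(hours, values))
    sxx = sum(h * h for h in hours)
    return sxy / sxx


def fine_tune_slope(hours, values, intercept, fleet_slope, lam):
    """Slope minimizing squared error plus lam * (slope - fleet_slope) ** 2.

    A large lam keeps the asset close to the fleet prior; lam = 0 recovers
    the asset-only least-squares fit.
    """
    sxy = sum(h * (v - intercept) for h, v in zip(hours, values))
    sxx = sum(h * h for h in hours)
    return (sxy + lam * fleet_slope) / (sxx + lam)


# Hypothetical iron readings (ppm) versus oil age (h); not data from the study.
fleet_hours = [50, 100, 150, 200, 250, 300]
fleet_fe = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0]
asset_hours = [100, 200]          # only two local samples
asset_fe = [14.0, 26.0]

fleet_slope = fit_slope(fleet_hours, fleet_fe, 0.0)
asset_only = fit_slope(asset_hours, asset_fe, 0.0)
tuned = fine_tune_slope(asset_hours, asset_fe, 0.0, fleet_slope, lam=50000.0)

print(f"fleet={fleet_slope:.3f}  asset-only={asset_only:.3f}  fine-tuned={tuned:.3f}")
```

With only two local samples, the fine-tuned slope lands between the fleet prior and the asset-only estimate, which is the qualitative behavior the paragraph describes: local data adjust the shared pattern without replacing it.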
Building on this foundation, future research can extend the framework in several directions. Incorporating deep transfer learning architectures and temporal models such as Long Short-Term Memory (LSTM) networks or Transformers may enhance the reconstruction of degradation dynamics when partial time-series data become available. Integrating probabilistic inference and Bayesian uncertainty quantification would further improve model transparency, allowing RUL predictions to be expressed with confidence levels essential for risk-aware maintenance planning.
Furthermore, expanding the approach to cross-domain transfer across lubricant types, operational environments, or equipment classes would strengthen model generalization and adaptability. A potential next step involves the direct chemical monitoring of detergent and dispersant (D&D) additives in diesel engine oils. These additives play a crucial role in maintaining oil cleanliness, neutralizing acids, and suspending soot and other insolubles, so modeling their degradation could be a valuable extension that increases model interpretability and precision. Finally, coupling the framework with IoT-based sensor networks could enable online learning and adaptive recalibration as new field data are collected, supporting real-time prognostics and continuous optimization of maintenance schedules.

Author Contributions

Conceptualization, M.G.A.N., I.A., D.E. and F.P.; methodology, O.W., Y.H.E., H.E. and J.O.; data collection, O.W., Y.H.E. and M.G.A.N.; validation, I.A., H.E. and J.O.; formal analysis, I.A., Y.H.E. and O.W.; investigation, M.G.A.N., F.P. and H.E.; resources, S.A., M.G.A.N. and F.P.; data curation, S.A., I.A., O.W. and J.O.; writing—original draft preparation, H.E., O.W. and J.O.; writing—review and editing, M.G.A.N., F.P. and I.A.; visualization, Y.H.E. and I.A.; project administration, D.E. and M.G.A.N. Correspondence and requests for materials should be addressed to M.G.A.N. and F.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

All data used in this study are confidential.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

  1. Heywood, J. Internal Combustion Engine Fundamentals, 2nd ed.; McGraw-Hill Education: Columbus, OH, USA, 2018. [Google Scholar]
  2. Azevedo, K.; Olsen, D.B. Engine Oil Degradation Analysis of Construction Equipment in Latin America. J. Qual. Maint. Eng. 2019, 25, 163–179. [Google Scholar] [CrossRef]
  3. Birleanu, C.; Cioaza, M.; Suciu, R.-C.; Molea, A.; Pustan, M.; Contiu, G.; Popa, F. Tribological Performance of SAE 10W-40 Engine Oil Enhanced with Thermally Treated TiO2 Nanoparticles. Lubricants 2025, 13, 466. [Google Scholar] [CrossRef]
  4. Chokelarb, W.; Sriprom, P.; Permana, L.; Assawasaengrat, P. Assessment of Overall Remaining Useful Life of Lubricants by Integrating Oil Quality and Performance. Heliyon 2024, 10, e23456. [Google Scholar] [CrossRef] [PubMed]
  5. Raposo, H.; Farinha, J.T.; Fonseca, I.; Ferreira, L.A. Condition Monitoring with Prediction Based on Diesel Engine Oil Analysis: A Case Study for Urban Buses. Actuators 2019, 8, 14. [Google Scholar] [CrossRef]
  6. Ali, A.; Abdelhadi, A. Condition-Based Monitoring and Maintenance: State of the Art Review. Appl. Sci. 2022, 12, 688. [Google Scholar] [CrossRef]
  7. Tripathi, A.K.; Vinu, R. Characterization of Thermal Stability of Synthetic and Semi-Synthetic Engine Oils. Lubricants 2015, 3, 54–79. [Google Scholar] [CrossRef]
  8. Agocs, A.; Frauscher, M.; Ristic, A.; Dörr, N. Impact of Soot on Internal Combustion Engine Lubrication—Oil Condition Monitoring, Tribological Properties, and Surface Chemistry. Lubricants 2024, 12, 401. [Google Scholar] [CrossRef]
  9. Al Sheikh Omar, A.; Motamen Salehi, F.; Farooq, U.; Morina, A.; Neville, A. Chemical and physical assessment of engine oils degradation and additive depletion by soot. Tribol. Int. 2021, 160, 107054. [Google Scholar] [CrossRef]
  10. Chervinskyy, T.; Grynyshyn, O.; Prokop, R.; Shapoval, P.; Korchak, B. Study on The Properties of Semi-Synthetic Motor Oil Castrol 10w-40 After Use in a Diesel Engine. Chem. Chem. Technol. 2021, 15, 432–437. [Google Scholar] [CrossRef]
  11. Kral, J.J.; Konecny, B.; Kral, J.; Madac, K.; Fedorko, G.; Molnar, V. Degradation and chemical change of longlife oils following intensive use in automobile engines. Measurement 2014, 50, 34–42. [Google Scholar] [CrossRef]
  12. Gołębiowski, W.; Wolak, A.; Zając, G. Definition of oil change intervals based on the analysis of selected physicochemical properties of used engine oils. Combust. Engines 2018, 172, 44–50. [Google Scholar] [CrossRef] [PubMed]
  13. Rao, X.; Sheng, C.; Guo, Z.; Yuan, C. A Review of Online Condition Monitoring and Maintenance Strategy for Cylinder Liner–Piston Rings of Diesel Engines. Mech. Syst. Signal Process. 2022, 165, 108385. [Google Scholar] [CrossRef]
  14. Ye, S.; Da, B.; Qi, L.; Xiao, H.; Li, S. Condition Monitoring of Marine Diesel Lubrication System Based on an Optimized Random Singular Value Decomposition Model. Machines 2025, 13, 7. [Google Scholar] [CrossRef]
  15. Jabo, A.G.A.A.; Eskandar, M.V. Used Oil Analysis for Internal Combustion Engine Condition Monitoring. Int. J. Eng. Appl. Sci. Technol. 2021, 5, 10–16. [Google Scholar]
  16. Kumar, S.; Raj, K.K.; Cirrincione, M.; Cirrincione, G.; Franzitta, V.; Kumar, R.R. A Comprehensive Review of Remaining Useful Life Estimation Approaches for Rotating Machinery. Energies 2024, 17, 5538. [Google Scholar] [CrossRef]
  17. Nguele, R.; Al-Salim, H.S.; Mohammad, K. Modeling and Forecasting of Depletion of Additives in Car Engine Oils Using Attenuated Total Reflectance Fast Transform Infrared Spectroscopy. Lubricants 2014, 2, 206–222. [Google Scholar] [CrossRef]
  18. Blancke, O.; Combette, A.; Amyot, N.; Komljenovic, D.; Lévesque, M.; Hudon, C.; Tahan, A.; Zerhouni, N. A Predictive Maintenance Approach for Complex Equipment Based on Petri Net Failure-Mechanism Propagation Model. PHM Soc. Eur. Conf. 2018, 4, 1–10. [Google Scholar] [CrossRef]
  19. Grebenișan, G.; Salem, N.; Bogdan, S.; Negrău, D.C. Oil Condition Monitoring—An AI Application Study Using Classification-Learner Techniques. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1169, 012012. [Google Scholar] [CrossRef]
  20. Yang, X.; Bi, F.; Jing, Y.; Li, X.; Zhang, G. A Condition-Monitoring Approach for Diesel Engines Based on Adaptive VMD and Sparse Representation Theory. Energies 2022, 15, 3315. [Google Scholar] [CrossRef]
  21. Liu, Z.; Wang, H.; Hao, M.; Wu, D. Prediction of RUL of Lubricating Oil Based on Information Entropy and SVM. Lubricants 2023, 11, 121. [Google Scholar] [CrossRef]
  22. Rostek, E.; Babiak, M. The experimental analysis of engine oil degradation utilizing selected thermoanalytical methods. Transp. Res. Procedia 2019, 40, 82–89. [Google Scholar] [CrossRef]
  23. Sharma, V.; Joshi, R.; Pant, H.; Sharma, V.K. Improvement in frictional behaviour of SAE 15W-40 lubricant with the addition of graphite particles. Mater. Today Proc. 2020, 25, 719–723. [Google Scholar] [CrossRef]
  24. Cerny, J.; Strnad, Z.; Sebor, G. Composition and oxidation stability of SAE 15W-40 engine oils. Tribol. Int. 2001, 34, 127–134. [Google Scholar] [CrossRef]
  25. Sejkorová, M.; Hurtová, I.; Jilek, P.; Novák, M.; Voltr, O. Study of the Effect of Physicochemical Degradation and Contamination of Motor Oils on Their Lubricity. Coatings 2021, 11, 60. [Google Scholar] [CrossRef]
  26. Omiya, T.; Hanyuda, K.; Nagatomi, E. Predicting engine oil degradation across diverse vehicles and identifying key factors. Mech. Syst. Signal Process. 2025, 229, 112524. [Google Scholar] [CrossRef]
  27. Sejkorová, M.; Šarkan, B.; Veselík, P.; Hurtová, I. FTIR Spectrometry with PLS Regression for Rapid TBN Determination of Worn Mineral Engine Oils. Energies 2020, 13, 6438. [Google Scholar] [CrossRef]
  28. Grimmig, R.; Lindner, S.; Gillemot, P.; Winkler, M.; Witzleben, S. Analyses of used engine oils via atomic spectroscopy—Influence of sample pre-treatment and machine learning for engine type classification and lifetime assessment. Talanta 2021, 232, 122431. [Google Scholar] [CrossRef]
  29. Heredia-Cancino, J.A.; Carrillo-Torres, R.C.; Félix-Domínguez, F.; Álvarez-Ramos, M.E. Experimental Characterization of Chemical Properties of Engine Oil Using Localized Surface Plasmon Resonance Sensing. Appl. Sci. 2021, 11, 8518. [Google Scholar] [CrossRef]
  30. Agocs, A.; Nagy, A.L.; Tabakov, Z.; Perger, J.; Rohde-Brandenburger, J.; Schandl, M.; Besser, C.; Dörr, N. Comprehensive assessment of oil degradation patterns in petrol and diesel engines observed in a field test with passenger cars—Conventional oil analysis and fuel dilution. Tribol. Int. 2021, 161, 107079. [Google Scholar] [CrossRef]
  31. Wolak, A.; Molenda, J.; Fijorek, K.; Łankiewicz, B. Prediction of the Total Base Number (TBN) of Engine Oil by Means of FTIR Spectroscopy. Energies 2022, 15, 2809. [Google Scholar] [CrossRef]
  32. Tanwar, M.; Raghavan, N. Lubricating Oil Remaining Useful Life Prediction Using Multi-Output Gaussian Process Regression. IEEE Access 2020, 8, 113784–113793. [Google Scholar] [CrossRef]
  33. Nguyen, V.T.; Furch, J.; Koláček, J. Using multiple linear regression to predict engine oil life. Sci. Rep. 2025, 15, 33585. [Google Scholar] [CrossRef] [PubMed]
  34. Gołębiowski, W.; Krakowski, R.; Zając, G. Degradation of anti-wear additives and tribological properties of engine oils at extended oil change intervals in city buses. Sci. Rep. 2025, 15, 27238. [Google Scholar] [CrossRef] [PubMed]
  35. Martinez-Castillo, C.; Astray, G.; Mejuto, J.C.; Simal-Gandara, J. Random Forest, Artificial Neural Network, and Support Vector Machine Models for Honey Classification. eFood 2020, 1, 69–76. [Google Scholar] [CrossRef]
  36. Widodo, A.; Yang, B.S. Support vector machine in machine condition monitoring and fault diagnosis. Mech. Syst. Signal Process. 2007, 21, 2560–2574. [Google Scholar] [CrossRef]
  37. Rodrigues, J.; Costa, I.; Farinha, J.; Mendes, M.; Margalho, L. Predicting motor oil condition using artificial neural networks and principal component analysis. Eksploat. I Niezawodn. —Maint. Reliab. 2020, 22, 440–448. [Google Scholar] [CrossRef]
  38. Katreddi, S.; Thiruvengadam, A.; Thompson, G.J.; Schmid, N.A. Mixed Effects Random Forest Model for Maintenance Cost Estimation in Heavy-Duty Vehicles Using Diesel and Alternative Fuels. IEEE Access 2023, 11, 67168–67179. [Google Scholar] [CrossRef]
  39. Wang, M.; Su, X.; Song, H.; Wang, Y.; Yang, X. Enhancing Predictive Maintenance Strategies for Oil and Gas Equipment through Ensemble Learning Modeling. J. Pet. Sci. Eng. 2025, 235, 112345. [Google Scholar] [CrossRef]
  40. Shao, M.; Wang, J.; Wang, S. The Intelligent Fault Diagnosis of Diesel Engine Based on the Ensemble Learning. J. Phys. Conf. Ser. 2020, 1549, 042106. [Google Scholar] [CrossRef]
  41. Moseley, B.; Markham, A.; Nissen-Meyer, T. Finite Basis Physics-Informed Neural Networks (FBPINNs): A Scalable Domain Decomposition Approach for Solving Differential Equations. Adv. Comput. Math. 2023, 49, 62. [Google Scholar] [CrossRef]
  42. Sharma, P.; Chung, W.T.; Akoush, B.; Ihme, M. A Review of Physics-Informed Machine Learning in Fluid Mechanics. Energies 2023, 16, 2343. [Google Scholar] [CrossRef]
  43. Kumar, A.; Pandey, A.; Gupta, T.; Ghosh, S.K. Lube oil life prediction for heavy earth moving machinery (HEMM): A machine learning approach. Proc. Inst. Mech. Eng. Part E J. Process Mech. Eng. 2025, 09544089241311377. [Google Scholar] [CrossRef]
  44. Jagannathan, S.; Raju, G.V.S. Remaining useful life prediction of automotive engine oils using MEMS technologies. In Proceedings of the 2000 American Control Conference. ACC (IEEE Cat. No.00CH36334), Chicago, IL, USA, 28–30 June 2000; 5, pp. 3511–3512. [Google Scholar] [CrossRef]
  45. Zhu, X.; Pan, Y.; Lan, B.; Wang, H.; Huang, H. Remaining Useful Life Prediction Based on Wear Monitoring with Multi-Attribute GAN Augmentation. Lubricants 2025, 13, 145. [Google Scholar] [CrossRef]
  46. Zhang, J.; Pei, G.; Zhu, X.; Gou, X.; Deng, L.; Gao, L.; Liu, Z.; Lin, J. Diesel Engine Fault Diagnosis for Multiple Industrial Scenarios Based on Transfer Learning. Measurement 2024, 228, 114338. [Google Scholar] [CrossRef]
  47. Ali, A.H.; Yaseen, M.G.; Aljanabi, M.; Abed, S.A. Transfer Learning: A New Promising Technique. Mesopotam. J. Big Data 2023, 3, 45–53. [Google Scholar] [CrossRef]
  48. Wu, X.; Manton, J.H.; Aickelin, U.; Zhu, J. A Bayesian Approach to (Online) Transfer Learning: Theory and Algorithms. Artif. Intell. 2023, 324, 103991. [Google Scholar] [CrossRef]
  49. ASTM D5185; Standard Test Method for Multielement Determination of Used and Unused Lubricating Oils and Base Oils by Inductively Coupled Plasma Atomic Emission Spectrometry (ICP-AES). ASTM International: West Conshohocken, PA, USA, 2019.
  50. Willermet, P.A.; Dailey, D.P.; Carter, R.O.; Schmitz, P.J.; Zhu, W. Mechanism of formation of antiwear films from zinc dialkyldithiophosphates. Tribol. Int. 1995, 28, 177–187. [Google Scholar] [CrossRef]
  51. Spikes, H. The History and Mechanisms of ZDDP. Tribol. Lett. 2004, 17, 469–489. [Google Scholar] [CrossRef]
Figure 1. Comprehensive workflow of the proposed hierarchical physics-informed framework for RUL estimation.
Figure 2. Hierarchical transfer learning architecture illustrating sequential adaptation from global pretraining to equipment- and asset-level fine-tuning.
Figure 3. Correlation matrix of oil-condition parameters for Shovels, illustrating correlations among degradation and contamination markers.
Figure 4. Comparison of increasing-type parameters (Aluminum) across (a) Asset-Specific model, (b) Global Model, and (c) Transfer Learning model.
Figure 5. Comparison of increasing-type parameters (Iron) across (a) Asset-Specific model, (b) Global Model, and (c) Transfer Learning model.
Figure 6. Comparison of increasing-type parameters (Soot) across: (a) Asset-Specific model, (b) Global Model, and (c) Transfer Learning model.
Figure 7. Comparison of decreasing-type parameters (Magnesium) across Asset-Specific, Global Model, and Transfer Learning models. Each row corresponds to one parameter, and each column represents a modeling configuration.
Figure 8. Comparison of decreasing-type parameters (Zinc) across Asset-Specific, Global Model, and Transfer Learning models. Each row corresponds to one parameter, and each column represents a modeling configuration.
Figure 9. Comparison of decreasing-type parameters (Total Base Number) across Asset-Specific, Global Model, and Transfer Learning models. Each row corresponds to one parameter, and each column represents a modeling configuration.
Figure 10. Model performance comparison for Dump Trucks across modeling strategies: (a) Asset-Specific, (b) Global Model, and (c) Transfer Learning.
Figure 11. Model performance comparison for Shovels across modeling strategies: (a) Asset-Specific, (b) Global Model, and (c) Transfer Learning.
Figure 12. Model performance comparison for Wheel Loaders across modeling strategies: (a) Asset-Specific, (b) Global Model, and (c) Transfer Learning.
Figure 13. Model performance comparison for Dozers across modeling strategies: (a) Asset-Specific, (b) Global Model, and (c) Transfer Learning.
Figure 14. Win Rate Matrix comparing pairwise RMSE performance across models. Transfer Learning exhibits the highest overall win rate across equipment.
Table 1. Equipment type, sample, and asset distribution.

| Equipment Type | Samples (n) | Assets (n) |
|---|---|---|
| Dump Trucks | 888 | 13 |
| Dozers | 94 | 3 |
| Shovels | 93 | 2 |
| Wheel Loaders | 684 | 9 |
| Total | 1760 | 26 |
Table 2. Representative physicochemical and wear-related lab results from the used oil datasets.
Asset Type A. Dump Trucks B. Shovels C. Wheel Loaders D. Dozers
Asset LabelAAFC
Oil Age (operating h)500246255253
Al (ppm)5850
B (ppm)60048
Ca (ppm)1284130013901059
Cr (ppm)0340
Cu (ppm)51010
Fe (ppm)14261442
K (ppm)0000
Mg (ppm)1122931134790
Mo (ppm)320228
Na (ppm)2000
P (ppm)106711131252881
Pb (ppm)1510
Si (ppm)51231
Zn (ppm)112012351067989
Oxidation (Ab/cm)0200
PQ Index615630
Soot (wt.%)0.470.330.412.41
TBN 898.46.1
Viscosity @100C (cSt)1313.813.213.1
Water (Vol.%)00.08Not Detected0.03
Table 3. Representative diagnostic feature categories.

| Category | Representative Features |
|---|---|
| Depletion | TBN, Additive Elements (Ca, Mg, Zn, P) |
| Accumulation | Fe, Cu, Al, Pb, Si, Na, K |
| Transformation | Viscosity, Oxidation, Nitration, Soot |
Table 4. Physical interpretation of model parameters and constants.

| Symbol | Role | Physical Interpretation |
|---|---|---|
| y(t) | Degradation indicator | Time-dependent oil property (e.g., TBN, Fe) |
| y_0 | Initial value | Property value in new oil at t = 0 |
| y_th | Threshold | Critical limit defining end of useful life |
| k | Rate constant | Magnitude of degradation or accumulation rate |
| p | Time exponent | Describes acceleration/deceleration of process |
| A | Log amplitude | Scale of bounded (saturating) increase |
| B | Log time constant | Time scale for saturation or asymptotic growth |
| t* | Time to failure | Predicted time when y(t) reaches y_th |
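To make the roles of these symbols concrete, the sketch below solves for the time to failure t* under two functional forms that are assumed here only for illustration: a power law y(t) = y_0 + k·t^p (with negative k for depletion) and a logarithmic, decelerating increase y(t) = y_0 + A·ln(1 + t/B). The exact equations of the framework are those given in its methodology and may differ; the function names and the sample numbers are hypothetical.

```python
import math


def t_star_power(y0, y_th, k, p):
    """Time at which the assumed form y0 + k * t**p reaches y_th (k < 0 for depletion)."""
    ratio = (y_th - y0) / k
    if ratio < 0:
        raise ValueError("threshold is never reached for this sign of k")
    return ratio ** (1.0 / p)


def t_star_log(y0, y_th, A, B):
    """Time at which the assumed form y0 + A * ln(1 + t / B) reaches y_th."""
    return B * (math.exp((y_th - y0) / A) - 1.0)


def rul(t_star, t_now):
    """Remaining useful life: time to failure minus current oil age, floored at zero."""
    return max(t_star - t_now, 0.0)


# Hypothetical TBN depletion: new-oil value 10, condemning limit 5, linear loss (p = 1).
t_fail = t_star_power(y0=10.0, y_th=5.0, k=-0.01, p=1.0)
remaining = rul(t_fail, t_now=246.0)
print(f"t* = {t_fail:.0f} h, RUL = {remaining:.0f} h")
```

Inverting the fitted curve at the threshold, rather than extrapolating sample by sample, is what allows a single oil reading at a known oil age to be turned into an RUL estimate once the curve parameters are known.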
Table 5. Average R² performance for best generalized model per parameter.

| Parameter | Best Generalized Model | Average R² |
|---|---|---|
| Al (Aluminum) | ExtraTrees | 0.8520 |
| B (Boron) | RandomForest | 0.0963 |
| Ca (Calcium) | GradientBoosting | 0.9024 |
| Cr (Chromium) | RandomForest | 0.5061 |
| Cu (Copper) | RandomForest | 0.7130 |
| Fe (Iron) | GradientBoosting | 0.8387 |
| K (Potassium) | BaggingRegressor | 0.2912 |
| Mg (Magnesium) | GradientBoosting | 0.9458 |
| Mo (Molybdenum) | RandomForest | 0.1734 |
| Na (Sodium) | KNeighbors | −0.2748 |
| Oxidation (Ab/cm) | ExtraTrees | 0.7752 |
| P (Phosphorus) | RandomForest | 0.7127 |
| PQ Index | ExtraTrees | 0.4108 |
| Pb (Lead) | ExtraTrees | 0.1527 |
| Si (Silicon) | AdaBoost | 0.0298 |
| Soot (Wt%) | GradientBoosting | 0.5853 |
| TBN (mg KOH/g) | ExtraTrees | 0.4806 |
| Visc@100C (cSt) | ExtraTrees | 0.0730 |
| Water (Vol.%) | KNeighbors | −0.0180 |
| Zn (Zinc) | ExtraTrees | 0.4644 |
Table 6. Model performance summary for selected lubricant parameters under different learning configurations.

| Element | Model Type | R² | Actual | Predicted | Error (%) |
|---|---|---|---|---|---|
| Aluminum | Asset Specific | 0.537 | 4.000 | 5.151 | 28.78 |
| | Global Model | 0.842 | 4.000 | 5.101 | 27.54 |
| | Transfer Learning | 0.917 | 4.000 | 3.578 | 10.55 |
| Iron (Fe) | Asset Specific | 0.312 | 48 | 31.434 | 34.51 |
| | Global Model | 0.769 | 48 | 42.262 | 11.95 |
| | Transfer Learning | 0.864 | 48 | 44.923 | 6.41 |
| Soot | Asset Specific | 0.036 | 0.61 | 0.545 | 10.72 |
| | Global Model | 0.284 | 0.61 | 0.505 | 17.28 |
| | Transfer Learning | 0.836 | 0.61 | 0.571 | 6.40 |
| Magnesium | Asset Specific | 0.836 | 740.000 | 682.812 | 7.73 |
| | Global Model | 0.938 | 740.000 | 827.682 | 11.85 |
| | Transfer Learning | 0.968 | 740.000 | 789.256 | 6.66 |
| Zinc (Zn) | Asset Specific | 0.821 | 923.000 | 1281.111 | 38.80 |
| | Global Model | 0.878 | 923.000 | 1072.042 | 16.15 |
| | Transfer Learning | 0.917 | 923.000 | 964.636 | 4.51 |
| TBN | Asset Specific | 0.779 | 6.9 | 6.520 | 5.50 |
| | Global Model | 0.825 | 6.9 | 7.314 | 6.00 |
| | Transfer Learning | 0.946 | 6.9 | 7.057 | 2.27 |
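The Error (%) column in Table 6 is consistent with an absolute percentage error, |predicted − actual| / actual × 100. This reading is inferred from the tabulated values rather than taken from a stated definition, and the short check below, with a hypothetical helper name `pct_error`, reproduces a few rows under that assumption.

```python
def pct_error(actual, predicted):
    """Absolute percentage error of a single prediction."""
    return abs(predicted - actual) / abs(actual) * 100.0


# (label, actual, predicted, error reported in Table 6)
rows = [
    ("Iron, Asset Specific", 48.0, 31.434, 34.51),
    ("Iron, Global Model", 48.0, 42.262, 11.95),
    ("Iron, Transfer Learning", 48.0, 44.923, 6.41),
    ("Zinc, Transfer Learning", 923.0, 964.636, 4.51),
]

for label, actual, predicted, reported in rows:
    computed = pct_error(actual, predicted)
    print(f"{label}: computed {computed:.2f}%, reported {reported:.2f}%")
```

Rows whose actual and predicted values are printed with fewer digits (for example Soot) may differ from the reported error in the last decimal place because of rounding in the table.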
Table 7. Performance comparison for Dump Trucks of Asset-Specific (AS), Global Model (GM), and Transfer Learning (TL) models across parameters.

| Parameter | AS R² | GM R² | TL R² | AS RMSE | GM RMSE | TL RMSE | Best Approach | Best R² |
|---|---|---|---|---|---|---|---|---|
| Al (Aluminum) | 0.3552 | 0.8911 | 0.9110 | 2.8574 | 1.4662 | 1.0707 | TL | 0.9110 |
| B (Boron) | −0.2998 | 0.0069 | −0.2374 | 3.9673 | 5.6014 | 5.1725 | GM | 0.0069 |
| Ca (Calcium) | 0.7798 | 0.8249 | 0.8912 | 149.4272 | 190.4320 | 115.5003 | TL | 0.8912 |
| Cr (Chromium) | 0.1299 | 0.5869 | 0.4556 | 1.6701 | 1.5587 | 1.5889 | GM | 0.5869 |
| Cu (Copper) | 0.1122 | 0.1392 | 0.4997 | 6.6812 | 72.9905 | 45.3250 | TL | 0.4997 |
| Fe (Iron) | 0.2144 | 0.1780 | 0.6619 | 13.2548 | 53.7032 | 28.3094 | TL | 0.6619 |
| K (Potassium) | −0.1547 | 0.1201 | 0.0546 | 1.4451 | 2.7858 | 2.4901 | GM | 0.1201 |
| Mg (Magnesium) | 0.8111 | 0.9558 | 0.9721 | 126.2636 | 85.4361 | 43.8029 | TL | 0.9721 |
| Mo (Molybdenum) | 0.2637 | −0.1132 | 0.0101 | 5.2164 | 18.5303 | 14.4277 | AS | 0.2637 |
| Na (Sodium) | −0.7776 | −0.0066 | 0.0565 | 1.3249 | 63.7782 | 27.8036 | TL | 0.0565 |
| P (Phosphorus) | −15.4409 | 0.5622 | 0.6362 | 239.6686 | 95.2903 | 81.1054 | TL | 0.6362 |
| Pb (Lead) | −1.3384 | 0.0122 | 0.2447 | 3.2207 | 14.4780 | 6.3395 | TL | 0.2447 |
| Si (Silicon) | −0.0912 | 0.7403 | 0.7187 | 4.9109 | 2.5413 | 2.2403 | GM | 0.7403 |
| Zn (Zinc) | −6.7342 | 0.5936 | 0.5676 | 185.0833 | 94.5939 | 89.5082 | GM | 0.5936 |
| Oxidation (Ab/cm) | 0.7517 | 0.7531 | 0.8582 | 0.6891 | 0.8986 | 0.4573 | TL | 0.8582 |
| PQ Index | 0.0783 | 0.2563 | 0.3316 | 3.2178 | 6.3805 | 4.7100 | TL | 0.3316 |
| Soot (Wt%) | 0.4738 | 0.5709 | 0.4759 | 0.1495 | 0.1896 | 0.1723 | GM | 0.5709 |
| TBN (mg KOH/g) | 0.0091 | 0.3100 | 0.3110 | 0.9801 | 0.8838 | 0.8172 | TL | 0.3110 |
| Visc@100C (cSt) | −0.3277 | 0.3481 | −0.1638 | 0.5579 | 0.6435 | 0.5700 | GM | 0.3481 |
| Water (Vol.%) | 0.0741 | 0.0591 | −0.1053 | 0.0357 | 0.1229 | 0.0658 | AS | 0.0741 |
Table 8. Predictive performance comparison for Dump Trucks across modeling strategies (average metrics, raw units).

| Modeling Strategy | Average R² | Average MAE | Average RMSE |
|---|---|---|---|
| Asset-Specific (AS) | −1.056 | 19.050 | 37.531 |
| Global Model (GM) | 0.389 | 16.025 | 35.615 |
| Transfer Learning (TL) | 0.408 | 13.463 | 23.574 |
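The averages in Table 8 and the "Best Approach" column of Table 7 can be cross-checked from the per-parameter R² values. The sketch below assumes that the average is an unweighted mean over the 20 parameters and that the best approach is simply the strategy with the highest R²; both assumptions are inferred from the tables, and the variable names are illustrative.

```python
from statistics import mean

# R² values transcribed from Table 7 (Dump Trucks), in table row order.
params = ["Al", "B", "Ca", "Cr", "Cu", "Fe", "K", "Mg", "Mo", "Na",
          "P", "Pb", "Si", "Zn", "Oxidation", "PQ", "Soot", "TBN", "Visc", "Water"]
r2 = {
    "AS": [0.3552, -0.2998, 0.7798, 0.1299, 0.1122, 0.2144, -0.1547, 0.8111,
           0.2637, -0.7776, -15.4409, -1.3384, -0.0912, -6.7342, 0.7517,
           0.0783, 0.4738, 0.0091, -0.3277, 0.0741],
    "GM": [0.8911, 0.0069, 0.8249, 0.5869, 0.1392, 0.1780, 0.1201, 0.9558,
           -0.1132, -0.0066, 0.5622, 0.0122, 0.7403, 0.5936, 0.7531,
           0.2563, 0.5709, 0.3100, 0.3481, 0.0591],
    "TL": [0.9110, -0.2374, 0.8912, 0.4556, 0.4997, 0.6619, 0.0546, 0.9721,
           0.0101, 0.0565, 0.6362, 0.2447, 0.7187, 0.5676, 0.8582,
           0.3316, 0.4759, 0.3110, -0.1638, -0.1053],
}

# Unweighted mean R² per strategy (compare with Table 8).
avg_r2 = {name: mean(values) for name, values in r2.items()}

# Strategy with the highest R² for each parameter, then a win count per strategy.
best = []
for i in range(len(params)):
    best.append(max(r2, key=lambda name: r2[name][i]))
wins = {name: best.count(name) for name in r2}

print({name: round(value, 3) for name, value in avg_r2.items()})
print(wins)
```

Because R² is unbounded below, a few strongly negative entries (Phosphorus and Zinc for AS) dominate the Asset-Specific average, so the win count gives a complementary view of how often each strategy is actually preferred.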
Table 9. Detailed predictive performance for Shovels across Asset-Specific (AS), Global Model (GM), and Transfer Learning (TL) models.

| Parameter | AS R² | GM R² | TL R² | AS RMSE | GM RMSE | TL RMSE | Best Approach | Best R² |
|---|---|---|---|---|---|---|---|---|
| Al (Aluminum) | 0.4517 | 0.6715 | 0.9723 | 2.4091 | 1.8235 | 0.3521 | TL | 0.9723 |
| B (Boron) | 0.1971 | 0.4675 | 0.8410 | 3.7623 | 3.0648 | 0.9867 | TL | 0.8410 |
| Ca (Calcium) | 0.6728 | 0.7921 | 0.9821 | 237.5438 | 184.9127 | 30.4052 | TL | 0.9821 |
| Cr (Chromium) | 0.1434 | 0.3159 | 0.9029 | 2.3271 | 1.9935 | 0.2985 | TL | 0.9029 |
| Cu (Copper) | −0.0175 | 0.3731 | 0.7789 | 6.8059 | 4.9673 | 1.6314 | TL | 0.7789 |
| Fe (Iron) | 0.2643 | 0.4788 | 0.9187 | 13.8854 | 10.6218 | 2.5691 | TL | 0.9187 |
| K (Potassium) | 0.0819 | 0.2425 | 0.8237 | 1.0497 | 0.7751 | 0.2362 | TL | 0.8237 |
| Mg (Magnesium) | 0.3952 | 0.6028 | 0.9967 | 213.0813 | 166.2445 | 8.8729 | TL | 0.9967 |
| Mo (Molybdenum) | −0.0715 | 0.1655 | 0.6213 | 6.6873 | 4.9639 | 2.2038 | TL | 0.6213 |
| Na (Sodium) | −0.2017 | −0.0397 | 0.5114 | 2.1735 | 1.9857 | 0.8369 | TL | 0.5114 |
| P (Phosphorus) | 0.4018 | 0.6594 | 0.9779 | 137.2857 | 103.7711 | 19.3438 | TL | 0.9779 |
| Pb (Lead) | −0.6391 | 0.2211 | 0.8925 | 3.5613 | 2.2981 | 0.3712 | TL | 0.8925 |
| Si (Silicon) | 0.1075 | 0.4518 | 0.9671 | 5.9728 | 3.5831 | 0.5618 | TL | 0.9671 |
| Zn (Zinc) | 0.2129 | 0.4928 | 0.9785 | 146.2291 | 104.7813 | 15.3842 | TL | 0.9785 |
| Oxidation (Ab/cm) | 0.3125 | 0.7083 | 0.9982 | 0.8125 | 0.5538 | 0.0183 | TL | 0.9982 |
| PQ Index | −0.1528 | 0.1289 | 0.8037 | 4.2534 | 3.3275 | 0.9117 | TL | 0.8037 |
| Soot (Wt%) | 0.0398 | 0.2533 | 0.8631 | 0.1884 | 0.1275 | 0.0303 | TL | 0.8631 |
| TBN (mg KOH/g) | 0.0759 | 0.3411 | 0.9575 | 1.0352 | 0.6921 | 0.1187 | TL | 0.9575 |
| Visc@100C (cSt) | 0.1265 | 0.3655 | 0.8543 | 0.6494 | 0.4693 | 0.0794 | TL | 0.8543 |
| Water (Vol.%) | −0.0528 | 0.1651 | 0.6427 | 0.0439 | 0.0282 | 0.0092 | TL | 0.6427 |
Table 10. Predictive performance comparison for Shovels across modeling strategies (average metrics, raw units).

| Modeling Strategy | Average R² | Average MAE | Average RMSE |
|---|---|---|---|
| Asset-Specific (AS) | −0.741 | 16.793 | 26.782 |
| Global Model (GM) | 0.069 | 24.723 | 34.147 |
| Transfer Learning (TL) | 0.862 | 2.627 | 3.640 |
Table 11. Detailed predictive performance for Wheel Loaders across Asset-Specific (AS), Global Model (GM), and Transfer Learning (TL) models.

| Parameter | AS R² | GM R² | TL R² | AS RMSE | GM RMSE | TL RMSE | Best Approach | Best R² |
|---|---|---|---|---|---|---|---|---|
| Al (Aluminum) | 0.5899 | 0.7684 | 0.9827 | 2.1184 | 1.2446 | 0.3171 | TL | 0.9827 |
| B (Boron) | 0.2429 | 0.5118 | 0.8295 | 3.6024 | 2.8731 | 1.1018 | TL | 0.8295 |
| Ca (Calcium) | 0.5867 | 0.7692 | 0.9718 | 251.2178 | 179.8502 | 40.6045 | TL | 0.9718 |
| Cr (Chromium) | 0.1861 | 0.3177 | 0.8971 | 2.2464 | 1.9416 | 0.3013 | TL | 0.8971 |
| Cu (Copper) | 0.0058 | 0.4444 | 0.7718 | 6.2207 | 4.6174 | 1.6431 | TL | 0.7718 |
| Fe (Iron) | 0.3011 | 0.4738 | 0.9129 | 13.7036 | 10.4625 | 2.6571 | TL | 0.9129 |
| K (Potassium) | 0.0981 | 0.2559 | 0.8325 | 1.0341 | 0.7631 | 0.2248 | TL | 0.8325 |
| Mg (Magnesium) | 0.4275 | 0.6187 | 0.9951 | 211.5328 | 164.4807 | 10.0335 | TL | 0.9951 |
| Mo (Molybdenum) | −0.0894 | 0.1813 | 0.6129 | 6.5027 | 4.8381 | 2.1527 | TL | 0.6129 |
| Na (Sodium) | −0.2257 | −0.0562 | 0.5015 | 2.1451 | 1.9605 | 0.8514 | TL | 0.5015 |
| P (Phosphorus) | 0.4320 | 0.6879 | 0.9749 | 134.6821 | 100.2673 | 21.1973 | TL | 0.9749 |
| Pb (Lead) | −0.7016 | 0.2031 | 0.8861 | 3.5074 | 2.3347 | 0.3761 | TL | 0.8861 |
| Si (Silicon) | 0.1275 | 0.4785 | 0.9635 | 5.8904 | 3.4513 | 0.5720 | TL | 0.9635 |
| Zn (Zinc) | 0.2441 | 0.5168 | 0.9739 | 144.5182 | 102.9092 | 17.4862 | TL | 0.9739 |
| Oxidation (Ab/cm) | 0.3376 | 0.7351 | 0.9977 | 0.7875 | 0.5282 | 0.0199 | TL | 0.9977 |
| PQ Index | −0.1877 | 0.1442 | 0.7995 | 4.1842 | 3.2688 | 0.9256 | TL | 0.7995 |
| Soot (Wt%) | 0.0548 | 0.2711 | 0.8541 | 0.1837 | 0.1245 | 0.0319 | TL | 0.8541 |
| TBN (mg KOH/g) | 0.0877 | 0.3524 | 0.9512 | 1.0246 | 0.6829 | 0.1227 | TL | 0.9512 |
| Visc@100C (cSt) | 0.1359 | 0.3792 | 0.8466 | 0.6412 | 0.4580 | 0.0812 | TL | 0.8466 |
| Water (Vol.%) | −0.0404 | 0.1778 | 0.6350 | 0.0425 | 0.0273 | 0.0096 | TL | 0.6350 |
Table 12. Predictive performance comparison for Wheel Loaders across modeling strategies (average metrics, raw units).

| Modeling Strategy | Average R² | Average MAE | Average RMSE |
|---|---|---|---|
| Asset-Specific (AS) | 0.014 | 19.268 | 30.401 |
| Global Model (GM) | 0.391 | 15.169 | 23.166 |
| Transfer Learning (TL) | 0.778 | 4.392 | 6.960 |
Table 13. Detailed predictive performance for Dozers across Asset-Specific (AS), Global Model (GM), and Transfer Learning (TL) models.

| Parameter | AS R² | GM R² | TL R² | AS RMSE | GM RMSE | TL RMSE | Best Approach | Best R² |
|---|---|---|---|---|---|---|---|---|
| Al (Aluminum) | 0.2037 | 0.5962 | 0.9970 | 3.1419 | 3.1006 | 0.1678 | TL | 0.9970 |
| B (Boron) | −0.1578 | −0.0115 | 0.6202 | 3.1901 | 8.8265 | 4.8818 | TL | 0.6202 |
| Ca (Calcium) | 0.5971 | 0.7671 | 0.9979 | 232.5703 | 203.1185 | 17.5973 | TL | 0.9979 |
| Cr (Chromium) | −7.0782 | −0.0499 | 0.8830 | 2.2652 | 1.9384 | 0.2838 | TL | 0.8830 |
| Cu (Copper) | −4.3499 | −0.2016 | 0.7174 | 8.0166 | 33.1287 | 15.8326 | TL | 0.7174 |
| Fe (Iron) | −20.8628 | 0.3859 | 0.7716 | 35.5798 | 17.5908 | 9.8828 | TL | 0.7716 |
| K (Potassium) | 0.2167 | 0.0881 | 0.6028 | 0.6956 | 1.7692 | 1.1617 | TL | 0.6028 |
| Mg (Magnesium) | 0.3520 | 0.6631 | 0.9994 | 250.5118 | 196.0984 | 7.3540 | TL | 0.9994 |
| Mo (Molybdenum) | 0.1537 | 0.3755 | −0.1680 | 7.9091 | 13.7331 | 17.9399 | GM | 0.3755 |
| Na (Sodium) | −1.9121 | −0.0227 | 0.2898 | 2.2197 | 5.4778 | 3.9548 | TL | 0.2898 |
| P (Phosphorus) | −0.0878 | 0.4242 | 0.9700 | 122.1449 | 98.9446 | 19.8108 | TL | 0.9700 |
| Pb (Lead) | −1.8629 | 0.0917 | 0.7275 | 3.2791 | 3.8744 | 1.5484 | TL | 0.7275 |
| Si (Silicon) | −0.7420 | 0.0138 | 0.9935 | 7.2232 | 6.0075 | 0.4521 | TL | 0.9935 |
| Zn (Zinc) | −1.1253 | −0.2214 | 0.9713 | 138.0475 | 160.0870 | 22.0582 | TL | 0.9713 |
| Oxidation (Ab/cm) | 0.3645 | 0.8045 | 0.9994 | 0.9626 | 0.6349 | 0.0281 | TL | 0.9994 |
| PQ Index | −3.7380 | −0.4244 | 0.1580 | 4.9850 | 9.8125 | 7.5844 | TL | 0.1580 |
| Soot (Wt%) | −1.3655 | −0.0694 | 0.7054 | 0.2180 | 0.4063 | 0.1931 | TL | 0.7054 |
| TBN (mg KOH/g) | −1.4528 | 0.0972 | 0.9815 | 1.8116 | 1.3256 | 0.1602 | TL | 0.9815 |
| Visc@100C (cSt) | −0.1189 | 0.0660 | 0.6866 | 0.7291 | 0.9820 | 0.4066 | TL | 0.6866 |
| Water (Vol.%) | 0.0116 | −0.1749 | 0.5354 | 0.0337 | 0.0753 | 0.0484 | TL | 0.5354 |
Table 14. Predictive performance comparison for Dozers across modeling strategies (average metrics, raw units).

| Modeling Strategy | Average R² | Average MAE | Average RMSE |
|---|---|---|---|
| Asset-Specific (AS) | −2.148 | 30.872 | 41.277 |
| Global Model (GM) | 0.160 | 26.650 | 38.347 |
| Transfer Learning (TL) | 0.722 | 4.090 | 6.567 |
Table 15. Aggregate performance comparison among modeling strategies.

| Modeling Strategy | R² (Mean) | RMSE (Raw Units) | CI Width (95%) |
|---|---|---|---|
| Asset-Specific (AS) | −0.983 | 32.449 | ±23.828 |
| Global Model (GM) | 0.252 | 32.483 | ±20.075 |
| Transfer Learning (TL) | 0.979 | 10.185 | ±5.481 |
Table 16. Representative results of quantitative RUL calculations for four selected assets.

| Equipment Type | Component | Sample ID | Oil Current Working Hours (h) | RUL (h) | Alert and Time | 1st Caution Indicator | 2nd Caution Indicator |
|---|---|---|---|---|---|---|---|
| Dump Trucks | Machine | DA-2 | 500 | 300 | None | None | None |
| | Oil | | | 0 | Mo (0) | None | None |
| Shovels | Machine | SA-1 | 246 | 282 | None | Al (90 h) | Cu (282 h) |
| | Oil | | | 177 | Si (177 h) | Si (82 h) | None |
| Wheel Loaders | Machine | WF-8 | 255 | 800 | None | Al (289 h) | None |
| | Oil | | | 121 | None | Zn (65 h) | TBN (121 h) |
| Dozers | Machine | DOC-1 | 253 | 267 | None | Fe (211 h) | PQ index (267 h) |
| | Oil | | | 0 | Multiple parameters | None | None |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Nassef, M.G.A.; Wael, O.; Elkady, Y.H.; Elshazly, H.; Ossama, J.; Amin, S.; ElGayar, D.; Pape, F.; Ali, I. Physics-Informed Transfer Learning for Predicting Engine Oil Degradation and RUL Across Heterogeneous Heavy-Duty Equipment Fleets. Lubricants 2025, 13, 545. https://doi.org/10.3390/lubricants13120545

AMA Style

Nassef MGA, Wael O, Elkady YH, Elshazly H, Ossama J, Amin S, ElGayar D, Pape F, Ali I. Physics-Informed Transfer Learning for Predicting Engine Oil Degradation and RUL Across Heterogeneous Heavy-Duty Equipment Fleets. Lubricants. 2025; 13(12):545. https://doi.org/10.3390/lubricants13120545

Chicago/Turabian Style

Nassef, Mohamed G. A., Omar Wael, Youssef H. Elkady, Habiba Elshazly, Jahy Ossama, Sherwet Amin, Dina ElGayar, Florian Pape, and Islam Ali. 2025. "Physics-Informed Transfer Learning for Predicting Engine Oil Degradation and RUL Across Heterogeneous Heavy-Duty Equipment Fleets" Lubricants 13, no. 12: 545. https://doi.org/10.3390/lubricants13120545

APA Style

Nassef, M. G. A., Wael, O., Elkady, Y. H., Elshazly, H., Ossama, J., Amin, S., ElGayar, D., Pape, F., & Ali, I. (2025). Physics-Informed Transfer Learning for Predicting Engine Oil Degradation and RUL Across Heterogeneous Heavy-Duty Equipment Fleets. Lubricants, 13(12), 545. https://doi.org/10.3390/lubricants13120545
