Data-Driven Predictive Modeling for Investigating the Impact of Gear Manufacturing Parameters on Noise Levels in Electric Vehicle Drivetrains

Horváth, Krisztián

doi:10.3390/wevj16080426

Open Access Review

by

Krisztián Horváth

Department of Whole Vehicle Engineering, Audi Hungaria Faculty of Vehicle Engineering, Széchenyi István University, Egyetem tér 1, H-9026 Győr, Hungary

World Electr. Veh. J. 2025, 16(8), 426; https://doi.org/10.3390/wevj16080426

Submission received: 4 June 2025 / Revised: 25 July 2025 / Accepted: 28 July 2025 / Published: 30 July 2025

(This article belongs to the Special Issue Dynamic Modeling, Identification, and Advanced Control of Intelligent Electric Vehicles)

Abstract

Reducing gear noise in electric vehicle (EV) drivetrains is crucial due to the absence of internal combustion engine noise, making even minor acoustic disturbances noticeable. Manufacturing parameters significantly influence gear-generated noise, yet traditional analytical methods often fail to predict these complex relationships accurately. This research addresses this gap by introducing a data-driven approach using machine learning (ML) to predict gear noise levels from manufacturing and sensor-derived data. The presented methodology encompasses systematic data collection from various production stages—including soft and hard machining, heat treatment, honing, rolling tests, and end-of-line (EOL) acoustic measurements. Predictive models employing Random Forest, Gradient Boosting (XGBoost), and Neural Network algorithms were developed and compared to traditional statistical approaches. The analysis identified critical manufacturing parameters, such as surface waviness, profile errors, and tooth geometry deviations, significantly influencing noise generation. Advanced ML models, specifically Random Forest, XGBoost, and deep neural networks, demonstrated superior prediction accuracy, providing early-stage identification of gear units likely to exceed acceptable noise thresholds. Integrating these data-driven models into manufacturing processes enables early detection of potential noise issues, reduces quality assurance costs, and supports sustainable manufacturing by minimizing prototype production and resource consumption. This research enhances the understanding of gear noise formation and offers practical solutions for real-time quality assurance.

Keywords:

gear noise; data-driven; machine learning; predictive modeling; manufacturing parameters; quality control

1. Introduction

The noise emissions of gear transmissions are primarily induced by periodic force fluctuations occurring during gear meshing, known as transmission error (TE) [1]. This process, the elastic deformation of the teeth, and variations in mesh stiffness generate vibrations, which propagate through the gearbox housing and radiate as noise. In practice, several factors contribute to TE and gear noise: non-ideal (finite) contact points of the teeth cause stiffness fluctuations, shaft misalignments introduce deviations, and fine geometric inaccuracies on the tooth surface also influence noise generation. Manufacturing micro-errors (e.g., waviness, profile deviations) or differences in surface roughness lead to small variations in gear meshing, which ultimately affect noise emissions [2,3,4]. Studies have confirmed that gears manufactured within tolerance limits can still cause excessive noise in assembled gearboxes. This issue has been frequently observed in electric vehicle (EV) drivetrains, where all components are dimensionally precise, yet some gear sets fail to meet noise requirements [4]. This suggests that traditional checks (such as simple inspections of profile and pitch deviations) do not always capture noise-critical discrepancies, and that gear noise is highly sensitive to even minute manufacturing variations (often referred to as the “ghost noise” phenomenon) [5]. Although the issue of gear noise has persisted for decades, no universal solution has been found due to the complexity of the phenomenon, and predicting and controlling noise remains a challenge in gear design [6].

Over the past decade, manufacturers and researchers have begun to complement finite element and multibody simulations with data-driven techniques. Inline vibration measurements, post-process optical inspections, and EOL acoustic tests now generate large datasets that can be mined for hidden relationships between manufacturing parameters and noise performance. Machine learning (ML) algorithms (ranging from ensemble trees through Gradient Boosting machines to deep neural networks) have demonstrated encouraging predictive accuracy; however, published studies vary widely in test setups, feature sets, validation strategies, and reporting metrics. A consolidated view of these methods, along with their industrial applicability and limitations, is still missing. Beyond synthesizing recent literature, this review also incorporates the author’s own industrial experience and data-driven experiments related to NVH quality assessment using machine learning. Practical challenges, modeling limitations, and interpretability insights gathered from real-world implementations are discussed alongside published findings.

This paper first summarizes the main physical phenomena and manufacturing-related factors influencing gear noise generation. It then reviews industrial measurement techniques and noise inspection approaches, focusing on both inline and offline methods. The subsequent section presents various ML methods for gear noise prediction, comparing their advantages, limitations, and practical implications. Current trends, innovative applications, and future research directions are then discussed, highlighting opportunities for further improvement. Finally, the article provides concluding remarks and practical recommendations to assist researchers and industry practitioners in implementing data-driven predictive modeling effectively.

2. Gear Noise Mechanisms and Manufacturing-Related Factors

This section summarizes key physical phenomena underlying gear noise generation, emphasizing aspects critical to gear manufacturing.

As detailed in the Introduction, gear noise is primarily linked to transmission error (TE), which itself is affected by even minor deviations in tooth microgeometry and surface finish. For instance, Henriksson (2020) and Wang et al. (2023) have shown through nonlinear multibody simulation and experimental validation that lightweight gear designs are more sensitive to TE fluctuations. These findings support the idea that traditional tolerance-based quality checks are insufficient, and more advanced predictive techniques are needed to assess gear noise potential during production [7,8].

In electric vehicle drivetrains, gear noise has become an even more pressing issue in recent years. Electric motors operate at extremely high rotational speeds (up to 15,000–30,000 rpm), exposing gear meshing to higher-frequency excitations and resulting in a broader noise spectrum. Additionally, the absence of an internal combustion engine’s background noise in EVs makes previously masked transmission noises much more noticeable. Industry experts have emphasized that optimized macro- and microgeometry, especially via honing and superfinishing, play a key role in frictional and tonal noise reduction in EV drivetrains [9]. Consequently, EV drivetrain gears must be designed and manufactured to tighter requirements, necessitating higher precision classes and tighter tolerances, which increase manufacturing challenges and costs. Literature indicates that automotive manufacturers prioritize noise reduction in e-mobility, as drivetrain noise has become one of the dominant components of in-cabin sound in the absence of engine masking.
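The link between shaft speed and excitation frequency can be illustrated with the standard gear mesh frequency relation, f_mesh = z · n / 60, where z is the number of teeth and n the shaft speed in rpm. The sketch below is only an illustration: the 25-tooth pinion is a hypothetical value and is not taken from any of the reviewed studies.

```python
def mesh_frequency_hz(teeth: int, shaft_rpm: float, harmonic: int = 1) -> float:
    """Gear mesh frequency (Hz) of the requested harmonic: harmonic * z * n / 60."""
    return harmonic * teeth * shaft_rpm / 60.0


# Hypothetical 25-tooth input pinion (tooth count is an assumption for illustration).
TEETH = 25
for rpm in (3_000, 15_000, 30_000):
    f1 = mesh_frequency_hz(TEETH, rpm)
    f2 = mesh_frequency_hz(TEETH, rpm, harmonic=2)
    print(f"{rpm:>6} rpm: mesh frequency {f1:7.0f} Hz, 2nd harmonic {f2:7.0f} Hz")
```

Under these assumed values, the fundamental mesh tone and its harmonics move well into the kilohertz range at typical EV motor speeds, which is consistent with the broader noise spectrum described above.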

Strategies for Gear Noise Reduction

Various strategies exist for reducing gear noise. First, the macrogeometry of the gearbox (module, number of teeth, helix angle, etc.) must be carefully selected. A high contact ratio ensures that multiple teeth share the load simultaneously, reducing the force fluctuations on individual teeth and minimizing transmission error. However, macrogeometry alone is insufficient: fine-tuning requires microgeometry modifications. These include tooth flank corrections (e.g., tip relief, lead crowning), which optimize tooth contact under load. The goal of these measures is to reduce peak-to-peak transmission error and ensure smoother gear meshing, leading to lower excitation noise.
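Peak-to-peak transmission error is simply the difference between the largest and smallest TE values over the evaluated rotation. A minimal sketch, assuming a purely sinusoidal TE trace at the mesh period and invented amplitudes for an unmodified and a relieved flank (neither is measured data):

```python
import math


def peak_to_peak(te_um):
    """Peak-to-peak value of a sampled transmission error trace (micrometres)."""
    return max(te_um) - min(te_um)


def synthetic_te(mesh_amplitude_um, samples_per_mesh=64, mesh_cycles=5):
    """Idealised TE trace: one sinusoid per mesh cycle (an assumed signal shape)."""
    n = samples_per_mesh * mesh_cycles
    return [
        mesh_amplitude_um * math.sin(2.0 * math.pi * i / samples_per_mesh)
        for i in range(n)
    ]


# Hypothetical amplitudes: before and after a microgeometry correction.
te_unmodified = synthetic_te(1.5)
te_relieved = synthetic_te(0.6)

print(f"unmodified: {peak_to_peak(te_unmodified):.2f} um peak-to-peak")
print(f"relieved:   {peak_to_peak(te_relieved):.2f} um peak-to-peak")
```

In practice the TE trace would come from a loaded tooth contact analysis or a single flank measurement rather than from a synthetic sinusoid.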

Tribological factors (friction and lubrication) also significantly impact gear noise. Once TE is minimized, frictional noise from gears becomes a major noise source in EV drivetrains. This fine whining noise component is influenced by surface roughness and lubricant viscosity. As early as 2001, Don Houser and his team demonstrated that superfinishing and optimized lubricants significantly reduce frictional noise compared to conventionally ground surfaces. The industry has adopted this approach: critical gear applications (e.g., premium vehicle differentials, helicopter main gearboxes) often employ additional finishing operations [5].

Early experimental work by Masuda et al. (1986) already demonstrated that the choice of tooth flank finishing method (e.g., grinding vs. honing) shifts excitation levels by several decibels, confirming that surface generation processes have a first-order impact on radiated gear noise [10].

Several studies have highlighted how waviness and surface roughness—induced during finishing processes like honing or profile grinding—can lead to specific frequency modulations in gear noise. Tian et al. (2024) provide a detailed overview of how modern finishing techniques can minimize tonal noise components, while Choi et al. (2023) emphasize the critical influence of macrogeometry tolerances and shaft alignments on excitation forces. These insights further confirm the need for high-resolution surface characterization as input for noise prediction [11,12].

Innovative manufacturing solutions have also emerged to intentionally influence the noise spectrum of gears. One such technique is modulating tooth surface texture: during grinding, the dressing process of the grinding tool (grinding worm) is deliberately adjusted to create controlled waviness on the tooth surface. This results in a periodic pattern with micrometer-scale amplitude on the gear surface. The purpose is to reduce the tonal quality of gear noise: instead of a high-amplitude fundamental tone, the process introduces multiple low-amplitude harmonics (ghost orders). These harmonics partially mask the main tone, making the noise less sharp or disturbing to human perception. However, excessive ripple can create unwanted noise components. Such solutions require extensive experiments to prove that subjective noise reduction in gear systems can be achieved by fine-tuning the manufacturing parameters.

The literature indicates that gear noise is a complex function of manufacturing precision and design parameters. Flawless design (high contact ratio, optimal profile) is necessary but insufficient for quiet operation; precise manufacturing is equally crucial. Since traditional physical modeling approaches (e.g., analytical or finite element calculations) cannot always predict all noise phenomena, researchers have recently turned to data-driven methods. ML-based prediction models, which statistically uncover the relationship between measured manufacturing deviations and noise levels, have demonstrated success. Recent studies confirm their effectiveness: for example, Lee and Park (2023) found that ML algorithms could reliably detect patterns between gear measurement data and noise test results, enabling accurate noise predictions based on manufacturing data. The following sections explore how this is applied in the industry and what technical tools are available for data-driven noise prediction [6].

A summary of key publications identified in the literature review is provided in Table 1.

One of the earliest efforts to model gear noise using data-driven approaches came from statistical regression models. These models relied on process and geometry parameters—such as tooth profile deviation, runout, and roughness—as inputs to predict measured noise levels. Chen and Xu (2010) developed a statistical framework for gear noise prediction in manufacturing environments, demonstrating that selected dimensional parameters could explain a large proportion of the variance in final acoustic output. Their work laid the foundation for later ML-based methods by highlighting which parameters correlate most strongly with noise [13].

Table 1. Summary of the key referenced studies identified in the literature review.

Author(s) & Year	Focus Area	Methodology/Tools	Key Findings
Houser et al. (2001) [5]	Frictional noise reduction	Experimental finishing methods	Superfinishing and optimized lubricants reduce frictional gear noise
Henriksson (2020) [7]	TE in lightweight gears	Nonlinear Multibody Dynamics (MBD) simulation, validation	Lightweight gears more sensitive to TE fluctuations
Wang et al. (2023) [8]	TE prediction in lightweight designs	Nonlinear multibody approach	Gear design must consider increased TE due to reduced mass
Lee & Park (2023) [6]	Gear whine prediction via ML	XGBoost, regression vs. ensemble methods	ML outperformed traditional regression in gear noise prediction
Choi et al. (2023) [12]	Macrogeometry impact on gear performance	Simulation & sensitivity analysis	Small macrogeometry errors can amplify excitation forces
Tian et al. (2024) [11]	Gear finishing techniques	Literature review of honing/grinding	Modern finishing reduces tonal noise via surface smoothing
Rajkumar et al. (2025) [14]	AI-Digital Twin for NVH components	ML + Digital Twin architecture	Enables dynamic tolerance adaptation for gear NVH
Zhong et al. (2023) [15]	Predictive maintenance with Digital Twin	Review of Digital Twin applications in manufacturing	Real-time deviation monitoring enhances prediction accuracy
Sun et al. (2024) [16]	Acoustic prediction under data imbalance	Multi-kernel SVR + regularization	Robust forecasting possible despite skewed data distributions
Gleason Corp. (2023) [17]	Inline gear noise inspection	GRSL system (rolling + laser)	Enables 100% inspection and predictive NVH evaluation
Scania (2017) [18]	Acoustic anomaly detection in engines	Deep learning anomaly detection	Augmentation + semi-supervised ML handles limited fault data
Chen & Xu (2010) [13]	Statistical modeling of gear noise	Regression analysis	Early quantitative attempts at noise estimation from gear geometry
Masuda et al. (1986) [10]	Tooth flank finish effect	Experimental vibration and finish comparison	Finishing method strongly affects noise generation
Aurich (2023) [19]	Electromobility gear noise challenges	Review of gear quality and manufacturing tech	Emphasizes e-mobility’s demand for quieter, high-quality gears
H2020 ECO-Drive (2021) [20]	System-level NVH optimization	EU-funded research project	Proposes integrated noise control across drivetrain system

3. Industrial Measurement Techniques and Noise Inspection Approaches

Here, we describe various measurement techniques commonly applied in industrial settings, highlighting their practical relevance in gear noise inspection.

In the industry, ensuring the noise quality of gear transmissions is crucial, particularly in electric vehicle drivetrains. The traditional approach relies on extensive NVH (Noise, Vibration, and Harshness) testing of prototypes and finished products, followed by design or manufacturing refinements based on test results. Most automotive manufacturers conduct EOL noise inspections on gear transmissions: the assembled gearbox or drivetrain is run on a test bench, and sensors measure vibration and noise to filter out excessively noisy units. However, this is typically a sample-based or statistical inspection—ideally, every single gear should undergo noise quality checks. In practice, the limitation has been the time required for measurement (a full precision gear geometry inspection or noise test can take several minutes, while manufacturing cycle times are much shorter), leading to selective inspections within production batches. However, new challenges have pushed companies toward achieving 100% inspection, prompting the emergence of innovative technical solutions.

One pioneering industrial development is the Gleason GRSL (Gear Rolling System with Integrated Laser), which enables inline noise inspection of every single gear during manufacturing. This system combines traditional dual flank rolling tests with a laser scanning system. During measurement, the gear is engaged in rolling contact with a master gear while two high-speed laser sensors scan both flanks of the rotating gear across its full width. As a result, the system captures high-resolution surface geometry data in a fraction of the time required by conventional contact-based measurements. The GRSL effectively enables real-time, cycle-time-compatible inspection of all produced gears. This leads to 100% inline quality control, detecting not only dimensional errors but also predicting whether a gear is likely to cause noise issues in the final gearbox. Moreover, the measured waviness and shape data allow the software to estimate the expected noise behavior, providing a predictive assessment. The Gleason system can even integrate with gear design software, allowing real-world manufacturing variations to be fed back into gear mesh analysis models, refining designs based on actual production tolerances [17]. This marks the beginning of a closed-loop control approach in manufacturing, where measurement data are used to automatically adjust machine settings for subsequent parts when necessary. For electric drivetrain systems, this limitation is increasingly addressed through in-process acoustic inspection methods. Türich and Deininger (2024) report that laser scanning and double-flank roll testing are being integrated into production lines to detect deviations affecting gear noise already during manufacturing [21].

Beyond traditional inspection systems, companies are now experimenting with digital twin technologies to enhance predictive accuracy. Zhong et al. (2023) reviewed how digital twins can monitor manufacturing deviations in real-time and update gear mesh models accordingly, while Rajkumar et al. (2025) presented a combined AI digital twin architecture for dynamic tolerance adjustment in critical NVH components [14,15].

In addition to inline measurement advancements, improvements have been made in EOL noise inspection techniques. Several specialized testing methods help identify noise-prone gears before final assembly. One such method is the Single Flank Test (SFT), in which a gear and a master gear rotate in quasi-static engagement while high-precision angular sensors measure relative displacement. This provides an instantaneous transmission error (TE) curve, revealing meshing precision across different frequency bands. While highly accurate and repeatable, SFT is slow and therefore not feasible for mass-production-level testing. A faster alternative is the Structure-Borne Noise (SBN) test, where the gear is run at higher rotational speeds (e.g., 1000 RPM), and accelerometers measure vibrations on the test bench. The advantage of SBN is that it is extremely fast (taking just seconds), though its reliability is sometimes influenced by test rig resonance. To mitigate this, manufacturers use torsional acceleration testing (TAT), where rotational acceleration sensors are placed on shafts, measuring only the torsional vibration generated by the gear mesh. These methods are widely employed by automotive manufacturers to identify noisy gear pairs before final transmission assembly, ensuring that the most acoustically problematic components are removed before they enter vehicles.
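To illustrate how a TE curve from a single flank test can be broken down by frequency band, the following sketch evaluates a single-bin discrete Fourier transform at integer shaft orders. The signal is synthetic: the tooth count (23), the ghost order (37), and all amplitudes are assumptions chosen only to make the mesh orders and a waviness-induced ghost order visible.

```python
import cmath
import math


def order_amplitude(signal, order):
    """Amplitude of the component with `order` cycles per revolution (single-bin DFT)."""
    n = len(signal)
    acc = sum(
        s * cmath.exp(-2j * math.pi * order * k / n) for k, s in enumerate(signal)
    )
    return 2.0 * abs(acc) / n


TEETH = 23       # hypothetical tooth count -> mesh order 23
GHOST = 37       # hypothetical ghost order (not a multiple of the tooth count)
SAMPLES = 2048   # angular samples covering exactly one revolution

te = [
    1.0 * math.sin(2 * math.pi * TEETH * k / SAMPLES)          # mesh order
    + 0.3 * math.sin(2 * math.pi * 2 * TEETH * k / SAMPLES)    # 2nd mesh harmonic
    + 0.2 * math.sin(2 * math.pi * GHOST * k / SAMPLES)        # ghost order
    for k in range(SAMPLES)
]

spectrum = {order: order_amplitude(te, order) for order in range(1, 60)}
dominant = sorted(spectrum, key=spectrum.get, reverse=True)[:3]
print("dominant orders:", dominant)
```

A production implementation would more likely use an FFT library and angle-synchronous resampling; the explicit sum is used here only to keep the sketch dependency-free.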

In recent years, a new paradigm has emerged in manufacturing: data-driven quality management, which leverages Industry 4.0 technologies to uncover predictive patterns in large datasets. In this approach, manufacturers integrate sensor data, machine logs, and quality control measurements to implement predictive analytics, identifying potential defects before they occur. Traditional quality assurance is often reactive, whereas predictive quality control monitors products and processes in real-time. If certain manufacturing trends shift in an undesirable direction, an AI algorithm can flag that upcoming parts are likely to fail quality checks, allowing for timely corrective action. This is particularly useful in complex noise and vibration issues, where multiple interrelated variables influence the outcome. As a result, ML and artificial intelligence (AI) are increasingly being integrated into industrial quality control processes, not just for post-production analysis but also for real-time decision support.
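One simple way to flag a manufacturing trend that is drifting in an undesirable direction is an exponentially weighted moving average (EWMA) control chart. This is a classical statistical process control tool offered here as an illustrative stand-in for the AI algorithms mentioned above, not as the method used in any cited system. The monitored quantity (a waviness amplitude), its target, its standard deviation, and the data are all invented.

```python
def ewma_alarms(values, target, sigma, lam=0.2, width=3.0):
    """Return the indices at which the EWMA statistic leaves its control limits."""
    z = target
    alarms = []
    for i, x in enumerate(values, start=1):
        z = lam * x + (1.0 - lam) * z
        # Exact time-dependent standard deviation of the EWMA statistic.
        std = sigma * (lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * i))) ** 0.5
        if abs(z - target) > width * std:
            alarms.append(i - 1)
    return alarms


# Hypothetical waviness amplitudes (um): 20 stable parts, then a slow upward drift.
TARGET, SIGMA = 0.50, 0.05
stable = [TARGET + 0.02 * (-1) ** i for i in range(20)]
drifting = [TARGET + 0.01 * j for j in range(1, 21)]
alarms = ewma_alarms(stable + drifting, TARGET, SIGMA)

print("first alarm at part index:", alarms[0] if alarms else None)
```

The smoothing factor `lam` and the limit `width` are tuning choices: smaller `lam` reacts more slowly but is more sensitive to small sustained shifts.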

3.1. Industry Examples of Machine Learning-Based Noise Prediction

Several industrial case studies demonstrate the effectiveness of ML in gear noise prediction. In a recent study, Lee and Park (2023) examined an automotive gearbox, investigating how measured gear deviations (microgeometry errors) correlate with gear noise levels. Their findings showed that an ML-based model (using decision trees and ensemble algorithms) could predict gear whine noise with significantly higher accuracy than traditional linear models. They experimented with multiple ML algorithms, including decision trees, Random Forest, and Gradient Boosting (XGBoost, LightGBM), and compared them against classical multiple linear regression. The best-performing model was XGBoost (version 1.7.6), which outperformed simple linear regression in terms of predictive accuracy. This study provided strong evidence that gear manufacturing measurement data and noise test results can be linked through ML, enabling early noise detection based on manufacturing data [6].

Similar data-driven projects have yielded success in other fields as well. For instance, tire rolling noise is a critical NVH factor in electric vehicles. In collaboration with Hyundai-Kia, Nexen Tire developed an AI-based system for predicting and reducing tire noise. Since 2018, their research has utilized deep learning and big data analytics to analyze the relationship between tire tread patterns and noise levels. The project led to the creation of new tire designs optimized for reduced noise, achieving a 1–3 dB reduction in both interior and exterior vehicle noise levels. Moreover, AI integration shortened the development cycle, reducing the number of physical prototypes needed [22].

ML-based noise prediction is also transforming NVH development in the automotive industry. Hyundai Motor Group researchers have used big data analysis of vehicle test results to map the contributions of different components to in-cabin noise. This data-driven approach enabled faster identification of noise sources compared to traditional trial-and-error methods. Similar principles are being explored in aerospace, where AI is used to predict airframe and landing gear noise based on extensive experimental data [23].

3.2. Machine Learning in Quality Control and Predictive Maintenance

Beyond product development, ML-based noise prediction is increasingly applied in predictive maintenance. For example, Scania Trucks implemented an AI-driven acoustic anomaly detection system for engine noise diagnostics. Using deep learning, they trained a model to differentiate between normal and defective engine sounds. However, they encountered a common industrial challenge: while they had abundant data, the number of defective samples was very low, creating a class imbalance problem. This is a recurring issue in predictive maintenance, where defects are rare but critical. Scania’s solution involved data augmentation techniques (synthetically generating additional failure cases) and semi-supervised learning, which allowed the model to detect anomalies even with limited faulty samples [18].
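The idea of synthetically generating additional failure cases can be sketched with a very simple augmentation scheme: resampling the few defective feature vectors and perturbing them with small multiplicative Gaussian jitter. This is a generic technique shown under stated assumptions, not a description of the Scania implementation; the feature values and the 5% jitter level are hypothetical.

```python
import random


def augment_minority(samples, target_count, jitter=0.05, seed=0):
    """Pad a list of minority-class feature vectors up to `target_count`.

    New vectors are copies of randomly chosen originals, with each feature
    scaled by (1 + Gaussian noise of relative standard deviation `jitter`).
    """
    rng = random.Random(seed)
    augmented = [list(s) for s in samples]
    while len(augmented) < target_count:
        base = rng.choice(samples)
        augmented.append([v * (1.0 + rng.gauss(0.0, jitter)) for v in base])
    return augmented


# Hypothetical defective units: [band energy, crest factor] (invented values).
faulty = [[0.82, 4.1], [0.91, 3.8], [0.77, 4.6]]
balanced = augment_minority(faulty, target_count=12)

print(len(faulty), "original ->", len(balanced), "after augmentation")
```

Augmented samples should only ever be added to the training split; validating on synthetic failures would overstate model performance.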

This class imbalance challenge is not unique to engine noise analysis—similar issues arise in electric motor NVH and tire noise applications. For example, Sun et al. (2024) successfully used Support Vector Regression with kernel optimization to forecast vehicle body noise levels, emphasizing the importance of advanced regularization in imbalanced datasets [16].

Hard finishing processes, such as grinding and honing, are commonly used after heat treatment to correct shape and surface errors. However, recent studies show that these steps may also introduce surface waviness or residual form deviations due to tool wear or thermal effects. These unintended deviations have been identified as excitation sources in tonal gear noise, particularly at high rotation speeds where transmission error is sensitive to flank quality [19].

In summary, data-driven predictive modeling is increasingly being adopted in industrial noise quality assurance, from inline gear measurement systems to ML models for noise forecasting. The next section explores the technical tools and best practices required to develop a successful predictive model for industrial applications.

4. Data-Driven Predictive Modeling Techniques and Best Practices

The following part introduces ML algorithms widely utilized in predicting gear noise, alongside best practices for effective implementation.

4.1. Applicable Machine Learning Models (Algorithms)

Industrial noise prediction is fundamentally a regression problem, where input features (manufacturing parameters) are used to predict a continuous output variable (noise level, typically measured in dB). Several ML approaches can be applied:

Linear Regression: A baseline model assuming that noise levels are a linear combination of manufacturing parameters. While simple and interpretable, it struggles with nonlinear dependencies, which are common in real-world noise phenomena. Linear regression is often used as a benchmark against which more advanced models are evaluated.
Decision Trees: A hierarchical model that splits data into progressively smaller subsets based on threshold conditions. Each terminal node represents a predicted noise level or category. Decision trees can capture nonlinear relationships and are easy to interpret, but they tend to overfit if not properly constrained.
Random Forest: An ensemble method that constructs multiple decision trees on random data subsets and aggregates their outputs. This approach reduces variance and improves stability compared to a single decision tree. Random Forest is well-suited for industrial datasets with many input variables, automatically ranking feature importance. However, interpretability is lower than that of a single tree.
Gradient Boosting (e.g., XGBoost, LightGBM): Another ensemble method that iteratively improves predictions by training new models to correct the errors of previous ones. These models have demonstrated high accuracy in industrial datasets, particularly for gear noise prediction. Studies have shown that XGBoost outperforms linear regression in predicting gear noise.
Deep neural networks (DNNs): Multilayer artificial neural networks capable of learning complex patterns. Used in regression settings, deep neural networks (e.g., Generalized Regression Neural Networks, GRNNs) can approximate the relationship between microgeometry modifications and radiated noise. While powerful, neural networks require large datasets and extensive computational resources. They also function as black box models, making interpretability a challenge.

A comparative summary of machine learning algorithms is presented in Table 2.
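The role of linear regression as a benchmark can be illustrated with a small, dependency-free sketch. It assumes a synthetic relationship in which noise grows with the square of a signed profile deviation, so the sign of the deviation does not matter; this relationship, the coefficients, and the data are invented. A one-feature least squares fit is compared with a small hand-written regression tree standing in for the tree-based family; in practice a library implementation of Random Forest or Gradient Boosting would be used instead.

```python
import random


def fit_linear(x, y):
    """Ordinary least squares for a single feature; returns a predict function."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return lambda v: my + slope * (v - mx)


def fit_tree(points, depth):
    """Small regression tree on (x, y) pairs sorted by x; returns a predict function."""
    ys = [p[1] for p in points]
    mean = sum(ys) / len(ys)
    if depth == 0 or len(points) < 8:
        return lambda v: mean
    best_sse, best_i = None, None
    for i in range(4, len(points) - 3):  # keep at least 4 samples on each side
        left, right = ys[:i], ys[i:]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
        if best_sse is None or sse < best_sse:
            best_sse, best_i = sse, i
    threshold = (points[best_i - 1][0] + points[best_i][0]) / 2.0
    lower = fit_tree(points[:best_i], depth - 1)
    upper = fit_tree(points[best_i:], depth - 1)
    return lambda v: lower(v) if v <= threshold else upper(v)


def r_squared(predict, x, y):
    """Coefficient of determination of `predict` on the given data."""
    my = sum(y) / len(y)
    ss_res = sum((b - predict(a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


rng = random.Random(42)


def sample(n):
    # Assumed ground truth: noise (dB) rises with the squared deviation (um).
    x = [rng.uniform(-5.0, 5.0) for _ in range(n)]
    y = [60.0 + 0.3 * a * a + rng.gauss(0.0, 0.5) for a in x]
    return x, y


x_train, y_train = sample(200)
x_test, y_test = sample(100)

linear = fit_linear(x_train, y_train)
tree = fit_tree(sorted(zip(x_train, y_train)), depth=3)

r2_linear = r_squared(linear, x_test, y_test)
r2_tree = r_squared(tree, x_test, y_test)
print(f"held-out R^2: linear {r2_linear:.2f}, tree {r2_tree:.2f}")
```

Because the assumed relationship is symmetric about zero, the linear baseline explains almost none of the variance on held-out data, whereas even a shallow tree captures most of it.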

While several ML algorithms are applicable to noise prediction, their performance and suitability depend on factors such as dataset size, feature dimensionality, and real-time constraints. Tree-based models typically outperform linear regression when dealing with complex, nonlinear patterns in gear noise, especially when the dataset includes interaction effects between profile deviations and surface waviness. However, these models are sensitive to hyperparameters, which must be optimized via techniques such as grid search or Bayesian optimization [24].

Support Vector Regression (SVR) excels in low-data scenarios but requires careful kernel selection and regularization tuning (C, ε), as poor choices can lead to underfitting or overfitting. Deep neural networks (DNNs), while capable of capturing high-dimensional nonlinear relationships, often require thousands of labeled samples and GPU-accelerated training, making them more suitable for enterprises with large-scale data infrastructure.

Ultimately, model selection should balance prediction accuracy, interpretability, and computational cost, depending on whether the goal is fast inline screening or precise numerical prediction of dB levels.

Figure 1 provides a comparative overview of the key ML methods applied in gear noise prediction, including Random Forest, Gradient Boosting (XGBoost), Support Vector Regression (SVR), and Deep Neural Networks (DNN). As illustrated, Random Forest offers balanced performance across prediction accuracy, interpretability, and computational speed, making it a practical baseline for rapid prototyping and feature screening. Gradient Boosting methods (XGBoost, LightGBM) show a notable advantage in prediction accuracy, particularly when sufficient training data (>1000 samples) and effective hyperparameter tuning strategies are employed, thus becoming the most suitable choice for production-level deployment. While SVR performs well in scenarios with limited data availability, it often struggles with scalability and computational efficiency, making it more suitable for smaller-scale precision tasks. Deep neural networks are the most powerful in capturing highly complex, nonlinear relationships, especially when working with large datasets or high-dimensional input data (such as raw acoustic spectra or image-based measurements). However, their deployment in industrial contexts can be challenging due to their significant computational requirements, lower interpretability, and greater tendency to overfit if not carefully regularized. Therefore, the choice among these algorithms should be guided by dataset characteristics, computational resources, required prediction speed, and interpretability constraints.

For many ML algorithms, hyperparameter tuning significantly affects predictive accuracy. While grid search provides exhaustive coverage, it becomes computationally expensive for large parameter spaces. Random search offers a more efficient alternative, and Bayesian optimization methods have recently gained popularity due to their ability to find optimal configurations with fewer iterations. These methods are particularly suited to industrial gear noise prediction, where model evaluation can be costly. Studies have confirmed that Bayesian tuning improves model generalization, especially in noisy, nonlinear datasets [25,26].
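The difference between grid search and random search under a fixed evaluation budget can be sketched as follows. The function `validation_rmse` is a hypothetical stand-in for an expensive cross-validated model evaluation; its shape, the two hyperparameters, and their ranges are assumptions made for illustration. Bayesian optimization is not shown because it typically relies on a dedicated library.

```python
import itertools
import math
import random


def validation_rmse(learning_rate, max_depth):
    """Hypothetical surrogate for a costly cross-validated RMSE (dB)."""
    return (
        1.0
        + 4.0 * (math.log10(learning_rate) + 1.3) ** 2
        + 0.05 * (max_depth - 5) ** 2
    )


# Grid search: every combination of a few hand-picked values (4 x 4 = 16 runs).
grid_trials = [
    (validation_rmse(lr, depth), lr, depth)
    for lr, depth in itertools.product([0.001, 0.01, 0.1, 1.0], [2, 4, 6, 8])
]

# Random search: the same budget, sampled log-uniformly / uniformly from the ranges.
rng = random.Random(0)
random_trials = []
for _ in range(len(grid_trials)):
    lr = 10.0 ** rng.uniform(-3.0, 0.0)
    depth = rng.randint(2, 8)
    random_trials.append((validation_rmse(lr, depth), lr, depth))

best_grid = min(grid_trials, key=lambda t: t[0])
best_random = min(random_trials, key=lambda t: t[0])
print(f"grid:   rmse={best_grid[0]:.3f} lr={best_grid[1]:.4f} depth={best_grid[2]}")
print(f"random: rmse={best_random[0]:.3f} lr={best_random[1]:.4f} depth={best_random[2]}")
```

With the same number of evaluations, the grid only ever tries four distinct learning rates, whereas random search tries a different value in every run; this is the usual argument for random search when only a few hyperparameters dominate performance.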

Interpretability trade-offs. While tree-based ensembles like Random Forest and XGBoost offer high prediction accuracy, they often act as “black boxes,” reducing transparency for engineers. To address this, tools such as SHAP (SHapley Additive exPlanations) and permutation importance allow post hoc model explanation. SHAP in particular is widely used in manufacturing because it provides both global and local interpretability, helping engineers understand how individual manufacturing deviations contribute to noise levels [27]. Incorporating SHAP analysis into gear production pipelines can enhance model trust and facilitate decision-making.
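Permutation importance can be sketched without any ML library, since it only needs a fitted model's predict function: each feature column is shuffled in turn, and the resulting increase in prediction error is taken as that feature's importance. In the sketch below, `model` is a hypothetical already-fitted noise model with invented coefficients, and the dataset is synthetic. SHAP is not shown because it requires a dedicated package.

```python
import random


def model(row):
    """Hypothetical fitted noise model (dB); coefficients are invented."""
    waviness, pitch_error, hardness = row
    return 65.0 + 8.0 * waviness + 1.0 * pitch_error + 0.0 * hardness


def mse(rows, targets):
    """Mean squared error of `model` on the given rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(rows, targets, n_repeats=10, seed=0):
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(rows, targets)
    importances = []
    for col in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
            increases.append(mse(permuted, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic evaluation set: [waviness (um), pitch error (um), hardness (HRC)].
data_rng = random.Random(1)
rows = [
    [data_rng.uniform(0.0, 1.0), data_rng.uniform(0.0, 1.0), data_rng.uniform(58.0, 62.0)]
    for _ in range(200)
]
targets = [model(r) + data_rng.gauss(0.0, 0.3) for r in rows]

importances = permutation_importance(rows, targets)
for name, value in zip(("waviness", "pitch_error", "hardness"), importances):
    print(f"{name:12s} {value:8.3f}")
```

The ranking reflects only what the model uses, not necessarily the physical cause of the noise; strongly correlated inputs can share or mask importance.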

Detailed Comparison of the Three Most-Used ML Methods in Gear Noise Studies

Table 3 compares the three most widely used algorithms—Random Forest, XGBoost/LightGBM, and deep neural networks—in terms of prediction accuracy, data and computational requirements, and suitability for industrial deployment.

Figure 2 summarizes the end-to-end processing pipeline used for gear noise prediction, from data acquisition to model deployment and feedback integration.

4.2. Data Collection and Preparation in an Industrial Environment

A high-quality dataset is the foundation of a robust predictive model. The first step is identifying which manufacturing parameters to collect. In gear manufacturing, relevant input features include:

Dimensional and shape deviations (profile error, pitch error, runout, eccentricity).
Surface roughness and waviness characteristics.
Material properties (hardness, microstructure).
Manufacturing process variables (cutting tool settings, grinding parameters, heat treatment profiles).
Acoustic test results (sound pressure levels at different speeds and loads).

Surface finish remains a critical predictor in gear noise regression models, especially in datasets including honing or super-finishing parameters. Liew and Nee (2015) demonstrated that hybrid predictive models combining physical descriptors with statistical learners can accurately estimate surface roughness outcomes in gear manufacturing [28].

Noise level data are typically obtained through experimental noise measurements, either in a controlled lab environment or during EOL testing. To ensure consistency, tests must be conducted under uniform conditions (e.g., semi-anechoic chamber, fixed microphone position).

One challenge in industrial environments is integrating multiple data sources—gear measurement systems, test benches, and production machines often generate data in different formats. Data synchronization is essential to align manufacturing parameters with corresponding noise measurements. Pre-processing steps include noise filtering, outlier removal, and standardization.
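Two of the pre-processing steps named above, outlier removal and standardization, can be sketched as follows. The interquartile-range rule and the sample roughness values are assumptions chosen for illustration; other outlier criteria may suit a given measurement system better.

```python
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical Ra readings in micrometres; the last one mimics a sensor spike.
roughness_ra = [0.41, 0.39, 0.44, 0.40, 0.42, 0.38, 0.43, 3.90]
cleaned = remove_outliers_iqr(roughness_ra)
scaled = standardize(cleaned)
```

In a real pipeline the scaling statistics would be computed on the training split only and then reused for new parts, so that information from the test data does not leak into the model.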

Feature selection is also critical: when a dataset contains many parameters, it is important to identify the most relevant ones. Techniques such as LASSO regression (which applies an L1 penalty that shrinks the coefficients of uninformative variables to zero) and Principal Component Analysis (PCA) (which reduces dimensionality) help simplify models. Decision tree algorithms also naturally highlight which features contribute most to noise levels.
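How LASSO discards uninformative variables can be seen in a small coordinate-descent sketch. It minimizes (1/2n)·||y − Xw||² + α·||w||₁ without an intercept; the synthetic data, the choice of α, and the assumption that only the first two of four features matter are all illustrative.

```python
import random


def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=50):
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * w[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, alpha) / z
    return w


rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(100)]
# Only the first two (standardized, hypothetical) features drive the target.
y = [3.0 * row[0] + 1.5 * row[1] + rng.gauss(0, 0.1) for row in X]
weights = lasso_coordinate_descent(X, y, alpha=0.2)
```

The two irrelevant features end up with coefficients of exactly zero, while the relevant ones are retained with somewhat shrunken magnitudes, which is the price of the L1 penalty.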

Recommended workflow for new researchers aiming to implement predictive gear noise models in industry:

Define the target variable(s): Decide whether the goal is to predict RMS sound pressure levels, classify parts based on noise thresholds, or rank gears by noise severity.
Identify key input features: Gather relevant gear manufacturing data—e.g., tooth profile deviations, surface roughness (Ra, Rz), process temperature, tool wear indicators, etc.
Synchronize measurement sources: Ensure that dimensional measurements and EOL noise tests are timestamp-aligned or batch-correlated.
Clean and pre-process data: Remove outliers (e.g., due to measurement error), normalize data (especially when combining metrics with different scales), and consider dimensionality reduction if needed.
Split data for training and testing: Use k-fold cross-validation for robustness; if data are limited, use Leave-One-Out (LOO) cross-validation, and if samples are ordered in time (e.g., by production batch), use time-series cross-validation.
Select and train models: Start with interpretable models (e.g., decision trees), then proceed to ensemble models (e.g., RF, XGBoost) or SVR depending on data volume and complexity.
Tune hyperparameters: Use techniques like grid search, random search, or more advanced Bayesian optimization for better accuracy.
Validate model: Use metrics such as R², MAE, and RMSE; visualize residuals and error distributions to identify patterns.
Deploy the model in manufacturing: Connect prediction outputs to digital dashboards, programmable logic controllers (PLCs), or Manufacturing Execution Systems (MES).
Monitor and retrain: Establish a feedback loop to detect data drift, update the model periodically, and involve manufacturing engineers in model interpretation.
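For the data-splitting step in the workflow above, Leave-One-Out cross-validation on a very small dataset might look like the following sketch. The single-feature least-squares line and the six waviness/noise pairs are hypothetical and serve only to show the mechanics.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_mae(xs, ys):
    """Hold out each sample once, train on the rest, average the absolute errors."""
    errors = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append(abs(slope * xs[i] + intercept - ys[i]))
    return sum(errors) / len(errors)


# Hypothetical measurements: flank waviness (micrometres) vs. EOL noise (dB).
waviness_um = [0.2, 0.4, 0.5, 0.7, 0.9, 1.1]
noise_db = [72.1, 73.8, 75.2, 76.9, 79.0, 80.7]
loo_mae = leave_one_out_mae(waviness_um, noise_db)
```

Every part is used for testing exactly once, which makes the most of scarce labelled data at the cost of one model fit per sample.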

4.3. Handling Imbalanced Data and Model Validation

A common challenge in industrial settings is imbalanced data: most manufactured gears meet noise standards, while only a small percentage exhibit excessive noise. This imbalance makes it difficult to train a model effectively, as “noisy” samples are underrepresented. Techniques to address this include:

Oversampling: Generating additional “noisy” samples via data augmentation.
Anomaly detection: Training models to recognize “unusual” cases instead of explicitly classifying normal vs. faulty parts.

Model validation is essential before deployment. Standard validation approaches include:

Train–test split (e.g., 80–20%): Training the model on most data and testing on a reserved subset.
Cross-validation: Dividing data into multiple subsets and training on different partitions to ensure robustness.
Performance metrics: Evaluating predictions using Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and R² scores.
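The three performance metrics listed above can be computed directly from measured and predicted noise levels. The four value pairs in this sketch are made up for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R² for paired measurements and predictions."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "MAE": sum(abs(r) for r in residuals) / n,
        "RMSE": math.sqrt(ss_res / n),
        "R2": 1.0 - ss_res / ss_tot,
    }


measured_db = [74.0, 78.0, 82.0, 76.0]   # hypothetical EOL measurements
predicted_db = [75.0, 77.0, 80.0, 76.0]  # hypothetical model outputs
metrics = regression_metrics(measured_db, predicted_db)
```

RMSE penalizes the single 2 dB miss more heavily than MAE does, which is why the two are usually reported together.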

Industrial models must not only be accurate but also practical for decision-making. If a binary classification is used (e.g., “acceptable” vs. “unacceptable” noise), then false positives (unnecessarily rejecting good parts) and false negatives (allowing noisy parts to pass) must be carefully balanced.
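One way to balance the two error types is to choose the decision threshold that minimizes an expected cost. In the sketch below, the predicted levels, the labels, and the assumption that a missed noisy part costs five times as much as an unnecessary rejection are all hypothetical.

```python
def confusion_counts(y_true, scores, threshold):
    """Return (tp, fp, tn, fn) when parts scoring >= threshold are flagged noisy."""
    tp = fp = tn = fn = 0
    for truth, score in zip(y_true, scores):
        flagged = score >= threshold
        if flagged and truth:
            tp += 1
        elif flagged:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def expected_cost(y_true, scores, threshold, cost_fp, cost_fn):
    _, fp, _, fn = confusion_counts(y_true, scores, threshold)
    return fp * cost_fp + fn * cost_fn


is_noisy = [0, 0, 0, 0, 1, 0, 1, 1]               # hypothetical EOL labels
predicted_db = [71, 74, 77, 79, 78, 81, 83, 86]   # hypothetical model outputs
candidate_thresholds = [76, 78, 80, 82]
best_threshold = min(
    candidate_thresholds,
    key=lambda t: expected_cost(is_noisy, predicted_db, t, cost_fp=1.0, cost_fn=5.0),
)
```

With these costs the selected threshold sits below the nominal limit, accepting a few extra rejections in exchange for not passing any noisy part.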

In many gear manufacturing datasets, the number of noisy (defective) parts is much smaller than the acceptable ones, leading to class imbalance. Synthetic oversampling techniques such as SMOTE (Synthetic Minority Oversampling Technique) are widely used to generate synthetic samples of the minority class by interpolating between neighboring minority samples in feature space [29]. Recent industrial applications also explore SMOTE combined with XGBoost (so-called SMOTE-XGBoost pipelines), which have shown 8–12% improvement in F1 score compared to random undersampling [30]. These techniques help balance training data and reduce the risk of the model ignoring rare but critical noise cases.
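The interpolation idea behind SMOTE can be sketched in a few lines. This is a simplified illustration, not the reference algorithm or any library implementation: the function name, the two-feature minority samples, and the neighbour count are assumptions.

```python
import random


def smote_like_oversample(minority, n_new, k=2, rng=None):
    """Create synthetic points on segments between a sample and a near neighbour."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (squared distance).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical noisy gears described by (waviness, profile error) in micrometres.
noisy_gears = [[1.2, 0.8], [1.4, 0.9], [1.1, 1.0], [1.5, 0.7]]
synthetic_gears = smote_like_oversample(noisy_gears, n_new=6)
```

Because each synthetic point lies between two real minority samples, oversampling should be applied to the training folds only; otherwise near-copies of test samples can leak into training.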

For industrial practitioners, model-agnostic interpretability methods such as SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) are increasingly recognized as useful tools. SHAP enables global interpretation of feature importance, supporting the identification of key manufacturing factors influencing gear noise. LIME, in contrast, offers local explanations of individual predictions, which can assist quality engineers in diagnosing noisy gear units. These techniques can complement existing ML-based pipelines by enhancing transparency and trust in automated decisions.
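The quantity that SHAP approximates, the Shapley value of each feature for a single prediction, can be computed exactly when only a few features are involved. The sketch below does this by enumerating all feature subsets; the `predict` function, the all-zero baseline, and the feature values are hypothetical, and replacing absent features with a single baseline is a simplification of what SHAP explainers do.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley attribution; absent features take their baseline value."""
    n = len(instance)

    def value(subset):
        x = [instance[i] if i in subset else baseline[i] for i in range(n)]
        return predict(x)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        phi.append(total)
    return phi


def predict(x):
    """Hypothetical noise model with an interaction between waviness and runout."""
    waviness, profile_error, runout = x
    return 70.0 + 6.0 * waviness + 2.0 * profile_error + 4.0 * waviness * runout


gear = [1.0, 0.5, 0.5]
reference = [0.0, 0.0, 0.0]
contributions = shapley_values(predict, gear, reference)
```

The attributions add up to the difference between the prediction for this gear and the baseline prediction, and the interaction term is split evenly between the two features involved. Exact enumeration grows exponentially with the number of features, which is why practical tools rely on approximations.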

4.4. Industrial Implementation and Continuous Improvement

Once validated, the model can be integrated into the production environment. This may involve:

Embedding it into quality control software.
Linking it to real-time manufacturing systems (e.g., machine PLCs).
Using automated alerts when predicted noise levels exceed acceptable limits.

Continuous monitoring is necessary—ML models require periodic retraining as production conditions change (e.g., new tools, materials). Additionally, model outputs should be analyzed by engineers to refine manufacturing tolerances based on predictive insights.
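A very simple retraining trigger compares a recent window of an input feature against the distribution seen during training. The z-score rule, the limit of 3, and the roughness values below are assumptions for illustration; production systems typically monitor many features and the prediction errors as well.

```python
import math
import statistics


def mean_shift_z(reference, recent):
    """Z-score of the recent-window mean relative to the reference data."""
    standard_error = statistics.stdev(reference) / math.sqrt(len(recent))
    return (statistics.fmean(recent) - statistics.fmean(reference)) / standard_error


def needs_retraining(reference, recent, z_limit=3.0):
    return abs(mean_shift_z(reference, recent)) > z_limit


# Hypothetical Ra values (micrometres) from training and from two later windows.
training_ra = [0.40, 0.42, 0.39, 0.41, 0.43, 0.40, 0.38, 0.42]
window_stable = [0.41, 0.40, 0.42, 0.39]
window_after_tool_change = [0.50, 0.52, 0.49, 0.51]

stable_flag = needs_retraining(training_ra, window_stable)
shifted_flag = needs_retraining(training_ra, window_after_tool_change)
```

The second window, which mimics a tool change, shifts the feature mean far outside the training range and raises the flag, while the first does not.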

4.5. Hyperparameter Tuning Challenges in Industrial Contexts

Several studies emphasize that tuning hyperparameters such as learning rate, tree depth, and regularization strength is critical when modeling industrial NVH data, particularly in the presence of batch-dependent heteroscedastic noise and concept drift [31,32]. Nested cross-validation and Bayesian optimization are commonly used to reduce overfitting and obtain unbiased error estimates [33].
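The structure of nested cross-validation can be sketched as follows: an inner loop selects the hyperparameter using only the outer training data, and the outer loop estimates the error of the whole tuning procedure. The k-nearest-neighbour regressor, the candidate values, and the synthetic data are hypothetical choices that keep the example short.

```python
def k_fold_indices(n, k):
    """Interleaved fold assignment: fold i holds indices i, i+k, i+2k, ..."""
    return [list(range(i, n, k)) for i in range(k)]


def knn_predict(train, x, k):
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mae(train, test, k):
    return sum(abs(knn_predict(train, x, k) - y) for x, y in test) / len(test)


def split(data, fold):
    held_out = set(fold)
    train = [d for i, d in enumerate(data) if i not in held_out]
    return train, [data[i] for i in fold]


def cv_mae(data, k_neighbours, n_folds):
    scores = [mae(*split(data, fold), k_neighbours)
              for fold in k_fold_indices(len(data), n_folds)]
    return sum(scores) / len(scores)


def nested_cv(data, candidates, outer_folds=4, inner_folds=3):
    outer_scores = []
    for fold in k_fold_indices(len(data), outer_folds):
        outer_train, outer_test = split(data, fold)
        # Inner loop: tune using the outer training data only.
        best_k = min(candidates, key=lambda k: cv_mae(outer_train, k, inner_folds))
        outer_scores.append(mae(outer_train, outer_test, best_k))
    return sum(outer_scores) / len(outer_scores)


# Hypothetical (waviness, noise in dB) pairs on a noise-free line.
data = [(i / 10, 70.0 + i) for i in range(24)]
nested_estimate = nested_cv(data, candidates=[1, 3, 5])
```

Because the outer test folds never influence the choice of hyperparameter, the resulting error estimate is not optimistically biased by the tuning itself.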

4.6. Interpretability vs. Accuracy in Black Box Models

While models like XGBoost or deep neural networks offer high predictive accuracy, their lack of transparency can hinder industrial deployment. Explainable AI (XAI) tools, such as SHAP [34], have been increasingly adopted to interpret global feature importance and provide actionable insights. Some works even propose surrogate decision trees to approximate complex models for engineering use [35].

4.7. Handling Class Imbalance and Rare Failure Prediction

In gearbox datasets, noisy units are often rare (<1%), making standard regression or classification models biased toward the majority class. Techniques like SMOTE and its variants, especially Borderline-SMOTE [36], are widely applied to generate synthetic minority samples. In such imbalanced settings, Precision–Recall curves and AUCPR are preferred over ROC–AUC [37].
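Why Precision–Recall metrics are more informative than ROC–AUC for rare failures can be shown with a small constructed example. The scores below are hypothetical: 2 noisy units among 100, each outranked by only a few acceptable units.

```python
def average_precision(y_true, scores):
    """Mean of the precision values at the rank of each positive sample."""
    ranked = sorted(zip(scores, y_true), reverse=True)
    hits, total = 0, 0.0
    for rank, (_, truth) in enumerate(ranked, start=1):
        if truth:
            hits += 1
            total += hits / rank
    return total / hits


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative."""
    positives = [s for s, t in zip(scores, y_true) if t]
    negatives = [s for s, t in zip(scores, y_true) if not t]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# 98 acceptable units with scores 0.00 ... 0.97, plus 2 noisy units.
scores = [i / 100 for i in range(98)] + [0.945, 0.925]
labels = [0] * 98 + [1, 1]

ap = average_precision(labels, scores)
auc = roc_auc(labels, scores)
```

ROC–AUC stays near 0.96 because the noisy units outrank most acceptable ones, yet average precision is only about 0.27: most of the top-ranked parts flagged for inspection would still be false alarms.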

4.8. Illustrative Case Study—End-to-End Data Flow in an EV-Gearbox Line

A high-volume e-drive gearbox plant equipped with a Gleason GRSL inline inspection cell [17] streams three complementary data channels for every gear:

Inline vibration signals captured during dual-flank rolling.
Optical waviness and profile maps from the laser scanners.
Standard geometry + MES metadata (tool ID, feed rate, heat treatment batch).

During a typical five-shift batch, approximately 2000–3000 gears are produced, and 8–10% exceed the EOL acoustic threshold of 80 dB, figures consistent with earlier industry reports [17,19].

Pre-processing:

Timestamps align the three streams; a light outlier filter removes errant spikes; features are normalized and merged into a single matrix that mixes geometry, surface metrics and spectral vibration descriptors.
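Aligning two streams by timestamp can be sketched as a nearest-match join with a tolerance. The record layout, the 0.1 s tolerance, and the sample values are assumptions for illustration; the case study does not specify how the alignment is implemented.

```python
from bisect import bisect_left


def nearest_join(left, right, tolerance):
    """Pair each left record with the closest right record within `tolerance` seconds.

    Both inputs are lists of (timestamp_s, value) sorted by timestamp.
    """
    right_times = [t for t, _ in right]
    merged = []
    for t, value in left:
        pos = bisect_left(right_times, t)
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(right)]
        if not candidates:
            continue
        best = min(candidates, key=lambda i: abs(right_times[i] - t))
        if abs(right_times[best] - t) <= tolerance:
            merged.append((t, value, right[best][1]))
    return merged


# Hypothetical streams: rolling-test vibration RMS and optical waviness (µm).
vibration = [(10.00, 0.31), (12.00, 0.35), (14.00, 0.90)]
waviness = [(10.02, 0.40), (12.05, 0.42), (15.50, 0.47)]
merged_rows = nearest_join(vibration, waviness, tolerance=0.1)
```

Records without a partner inside the tolerance are dropped rather than paired with a distant measurement, so unmatched parts can be counted and investigated separately.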

Model training and validation:

Random Forest is trained first for rapid prototyping and feature screening, reaching R² ≈ 0.78–0.85 and 2.5–3.5 dB MAE on the consolidated data [6,13].
After hyperparameter tuning, XGBoost/LightGBM lifts accuracy to R² ≈ 0.85–0.92 with 1.8–2.5 dB MAE, while keeping inference latency below the GRSL cycle time limit (<10 ms on an edge PC) [17].
A compact deep neural network prototype attains R² up to 0.94 when >10 k labelled samples are available but requires GPU hardware and longer training time; on smaller datasets, it offers only marginal gains [22].

Cross-validation plus a held-out test batch confirm the same hierarchy: gradient boosting > Random Forest, with DNNs competitive only in data-rich scenarios.

Interpretability:

SHAP analysis consistently attributes ≈60–75% of model output variance to three surface quality metrics—flank waviness amplitude (Wa), profile error (Fp) and radial run-out (Fr). Local LIME explanations help quality engineers investigate borderline gears flagged for re-finishing, turning model output into concrete shop floor actions [34].

Operational deployment and impact:

The tuned XGBoost model is embedded in the inline inspection software: gears predicted to exceed the noise threshold are automatically diverted for corrective finishing, while SHAP dashboards guide wheel-dressing and alignment adjustments.

4.9. Practical Implementation Challenges and Lessons Learned

The authors’ pilot deployment of the proposed data-driven workflow on an electric-drive gearbox line exposed a series of non-trivial, largely undocumented difficulties that go well beyond algorithm selection. The first—and ultimately most time-consuming—task was data reconciliation across heterogeneous shop floor systems. Geometry measurements, inline rolling-test spectra, and end-of-line (EOL) acoustic records resided on separate servers protected by different security policies; although each part carried a unique identifier, incomplete or duplicated scans meant that only ≈87% of the physical parts could be unambiguously matched on the first try. Establishing a secure, automated extractor (SFTP + token authentication) and building a metadata catalogue of file types, sampling rates, and units reduced manual download time from several hours per batch to less than ten minutes but required close collaboration with the manufacturer’s IT and quality assurance teams.

A second bottleneck concerned physical and numerical homogenization of the raw signals. Order-domain spectra were available in both dB and linear [m s⁻²] form, while certain vibration channels were logged in rad s⁻². To avoid biasing the model by scale differences, all spectra were resampled on a common order grid (0.1-order resolution up to 200 orders) and converted to a consistent dB re 1 m s⁻² reference. Timestamp misalignments of up to 120 ms between vibration channels and rotational encoders were detected; a cross-correlation-based alignment routine restored synchronicity with sub-millisecond accuracy.
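The two operations described here, conversion to a common dB reference and lag estimation by cross-correlation, can be illustrated with a minimal sketch. The pulse shape, the 12-sample delay, and the brute-force search are assumptions for illustration and do not reproduce the routine used in the pilot deployment.

```python
import math


def to_db(amplitude, reference=1.0):
    """Amplitude level in dB relative to `reference` (e.g., 1 m/s²)."""
    return 20.0 * math.log10(amplitude / reference)


def best_lag(signal_a, signal_b, max_lag):
    """Lag in samples by which signal_b trails signal_a (brute-force cross-correlation)."""
    def correlation(lag):
        return sum(
            signal_a[i] * signal_b[i + lag]
            for i in range(len(signal_a))
            if 0 <= i + lag < len(signal_b)
        )
    return max(range(-max_lag, max_lag + 1), key=correlation)


# Hypothetical reference pulse and a copy delayed by 12 samples.
pulse = [0.0] * 50
pulse[10:15] = [1.0, 2.0, 3.0, 2.0, 1.0]
delayed = [0.0] * 12 + pulse[:-12]

lag_samples = best_lag(pulse, delayed, max_lag=20)
level_db = to_db(10.0)
```

At a hypothetical sampling rate of 100 Hz, the recovered 12-sample lag would correspond to 120 ms. Resolving lags below one sample period would require a higher sampling rate or interpolation around the correlation peak.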

From a modelling perspective, the extreme class imbalance (only 8–10% of the ≈3000 gears per batch exceeded the 80 dB limit) proved more detrimental than algorithmic overfitting. Synthetic oversampling (SMOTE) improved recall on noisy parts by 14%, yet the authors found that framing the task as an anomaly detection problem, with an auto-encoder trained on “good” parts only, yielded an equally high F₁-score without artificially altering the data distribution. Nevertheless, when the objective was precise dB regression, tree-based ensembles remained superior: an XGBoost model, trained on 1200 labelled parts and Bayesian-optimized for depth and learning rate, achieved R² = 0.88 and MAE = 2.1 dB on a hold-out lot, while inference latency stayed below 10 ms, satisfying inline cycle time constraints.

Perhaps the most significant qualitative outcome was the importance of domain knowledge-driven feature engineering. Simple, physically interpretable descriptors—peak amplitude of mesh orders, flank waviness RMS, radial runout—consistently outranked high-dimensional spectral kurtosis or wavelet coefficients in SHAP importance plots. Moreover, unsupervised PCA revealed tight clusters corresponding to honing tool life and heat treatment charge, confirming that the latent structure captured by the model aligns with real manufacturing states rather than spurious correlations.

5. Current Trends, Applications, and Future Directions

This chapter summarizes current research trends, discusses innovative industrial applications, and outlines key challenges and directions for future studies.

Recent research in gear noise prediction has increasingly moved towards integrating data-driven methodologies into industrial practice. Studies from 2010 to 2025 consistently show that advanced ML algorithms—particularly ensemble methods (Random Forest, XGBoost) and deep neural networks—significantly outperform traditional statistical models and physics-based simulations when sufficient data are available. Researchers have extensively investigated the role of manufacturing deviations, such as tooth waviness, profile errors, and surface roughness, demonstrating clear relationships between these parameters and radiated gear noise ([6,7,10,11,12,13]). Additionally, innovative measurement techniques, such as inline optical metrology and digital twins, have emerged as practical solutions for real-time gear quality assessment, highlighting the shift toward predictive and proactive quality management ([14,15,17,21]).

Several challenges, however, remain prominent. Data imbalance, limited sample sizes for rare defect scenarios, interpretability concerns, and high computational demands—especially in real-time industrial environments—continue to restrict broader implementation ([16,25,29,31]). Consequently, the latest studies emphasize hybrid modeling approaches that combine ML with physics-based simulations, as well as explainable AI techniques, to overcome these limitations ([27,34,35]). The research community widely acknowledges that future progress in this domain will likely hinge upon improved model interpretability, efficient handling of imbalanced data, and robust integration into digital manufacturing workflows ([38,39,40,41,42]).

5.1. Scalability and Applications in Other Manufacturing Processes

The ML-based predictive noise modeling approach is not limited to gear manufacturing; it can be extended to numerous other industrial applications where acoustic properties are critical. Any process where product noise levels or vibrations affect quality could benefit from similar data-driven methods. Examples include:

Bearings: Rolling element bearings generate operational noise due to surface roughness, misalignment, and geometric deviations. Modern EOL noise tests already exist for bearings, where faulty parts are identified based on vibration signatures. An ML-based predictive model could anticipate bearing noise issues based on manufacturing metrology data before assembly.
Electric Motors and Generators: EV motors and alternators are prone to electromagnetic and mechanical noise caused by imbalances, winding misalignments, or resonance effects. Predictive noise modeling could analyze manufacturing data to preemptively detect motors that may produce excessive noise under operation.
Tires: Tire tread design significantly impacts rolling noise, which is a key NVH factor in EVs. Nexen Tire and Hyundai have already demonstrated how big data and deep learning can optimize tire tread patterns to minimize noise emissions. Expanding such models to other noise-sensitive rubber components, such as engine mounts or suspension bushings, is a promising direction.
Gearboxes in Aerospace and Heavy Machinery: Helicopter transmissions, railway gearboxes, and industrial powertrains also require strict noise control. Predictive models could improve the selection of microgeometry modifications in aerospace and railway gearboxes, where weight constraints and extreme operating conditions make noise reduction particularly challenging.

5.2. Leveraging AI for Real-Time, Large-Scale Noise Prediction

With advancements in cloud computing and edge computing, ML-based noise prediction can be scaled across multiple production lines and facilities. Instead of training a separate model for each factory, a centralized AI model could analyze data from multiple plants, detecting global manufacturing trends and process variations that affect noise quality.

Another promising development is real-time, inline noise prediction: modern ML algorithms (especially decision tree-based models like XGBoost and LightGBM) are computationally efficient, meaning predictions can be generated instantaneously within manufacturing cycle times. This enables:

Dynamic process control: If an ML model predicts that a part is likely to exceed noise limits, then the manufacturing process (e.g., grinding parameters, heat treatment conditions) can be adjusted in real-time to compensate.
Automated defect detection: Integration with inline laser scanning could allow automated sorting of potentially noisy gears, preventing faulty parts from entering final assembly.
Continuous process optimization: Long-term trend analysis of noise levels can guide maintenance scheduling and process adjustments to ensure stable manufacturing quality.

5.3. Integration of Digital Twin Technology

A Digital Twin is a virtual representation of a physical system that continuously updates based on real-world sensor data. By integrating ML-based noise prediction models into digital twins, manufacturers could:

Monitor noise quality throughout the production process: Instead of waiting for final product testing, noise trends could be tracked as components move through different production stages.
Predict noise performance before final assembly: If a specific batch of components shows a higher likelihood of noise issues, then adjustments can be made before parts are assembled into a final product.
Optimize process parameters dynamically: ML algorithms could recommend real-time parameter adjustments to maintain optimal quality with minimal scrap and rework.

Several research projects, such as the ECO-Drive H2020 initiative, are exploring system-level NVH optimization by integrating component-level noise prediction models into broader drivetrain simulations. This approach aims to reduce noise at the full drivetrain level rather than just optimizing individual components [20].

Recent studies highlight that digital twin technology can bridge the gap between measurement data and high-fidelity simulations, enabling real-time fault detection and performance prediction for gearboxes. A typical digital twin model consists of four core elements: the physical entity (the real gearbox and its sensor data), a virtual twin (geometric, physical, behavioral and rules-based models), a shared twin data layer, and services that deliver functions such as monitoring, fault warnings and performance optimization. Zhang et al. emphasize that information flows both ways: updates in the physical system are mirrored in the virtual twin, while insights from simulations feed back into manufacturing control [43].

Figure 3 illustrates this integrated architecture. The physical entity and virtual twin exchange data through a bidirectional link, while the central twin data repository collects sensor and simulation data (vibration spectra, temperature, load conditions). Data-driven algorithms operate in this layer to perform predictive analytics and anomaly detection. The service layer presents actionable insights to engineers and allows interventions (such as process adjustments or preventive maintenance). The feedback arrows signify the closed-loop nature of the system: actions recommended by the service layer are applied to the physical system, and new data update the virtual twin [43].

5.4. Combining Data-Driven and Physics-Based Noise Prediction

Traditional noise prediction methods rely on finite element method (FEM) simulations and MBD models to estimate vibration and noise levels. However, these simulations are computationally intensive and often struggle to account for real-world manufacturing deviations.

A promising research direction is the hybrid integration of physics-based and ML models:

ML models can be trained on simulation data to create faster, surrogate models (metamodels) that approximate the noise response of a system without running full simulations.
Experimental noise measurements can be fed into ML models to calibrate FEM simulations, improving their accuracy by incorporating real-world variability.
Optimization algorithms (e.g., Particle Swarm Optimization, Genetic Algorithms) can be combined with ML models to search for the optimal microgeometry modifications that minimize noise, as seen in previous research on neural network-based gear noise reduction.
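The third item above, pairing an optimizer with a fast surrogate, can be sketched with a basic Particle Swarm Optimization loop. The quadratic `surrogate_noise_db` function stands in for a trained ML metamodel, and its coefficients, the parameter bounds, and the swarm settings are all hypothetical.

```python
import random


def surrogate_noise_db(crowning_um, tip_relief_um):
    """Hypothetical surrogate: predicted noise for two microgeometry values."""
    return 72.0 + 0.05 * (crowning_um - 8.0) ** 2 + 0.08 * (tip_relief_um - 5.0) ** 2


def particle_swarm(cost, bounds, n_particles=20, n_iter=100, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in pos]
    personal_cost = [cost(*p) for p in pos]
    leader = min(range(n_particles), key=personal_cost.__getitem__)
    global_best, global_cost = personal_best[leader][:], personal_cost[leader]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                # Inertia plus attraction to the personal and global best positions.
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (personal_best[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (global_best[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(*pos[i])
            if c < personal_cost[i]:
                personal_best[i], personal_cost[i] = pos[i][:], c
                if c < global_cost:
                    global_best, global_cost = pos[i][:], c
    return global_best, global_cost


best_geometry, best_noise = particle_swarm(
    surrogate_noise_db, bounds=[(0.0, 20.0), (0.0, 15.0)]
)
```

Each cost evaluation here is a cheap function call; with a full FEM or MBD simulation in its place, the 2000 evaluations used by this swarm would be far more expensive, which is the motivation for surrogate models.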

In hybrid noise prediction frameworks, accurately modeling the effects of manufacturing variability is essential. Certain gear modification parameters—such as pitch error, crowning, and lead deviation—exhibit strong correlations with tonal noise and dynamic amplification. These parameters often interact nonlinearly, making them suitable targets for SHAP-based interpretability and Digital Twin integration.

Table 4 summarizes the most relevant parameters, their typical manufacturing ranges, and their expected impact on gear noise behavior, based on literature and industrial observations.

5.5. Predictive Maintenance and Lifecycle Monitoring

Beyond manufacturing, ML-based noise models can support predictive maintenance by identifying parts that may degrade noisily over time. Some potential applications include:

EV drivetrain monitoring: ML models could analyze real-time gearbox vibration data to predict when noise levels will exceed acceptable limits, enabling proactive servicing before a vehicle reaches an unacceptable noise level.
Industrial gearbox monitoring: Predicting gear wear and pitting based on noise trends in heavy machinery and wind turbines.
Automated warranty claim analysis: Manufacturers could track production data and customer complaints to determine if certain manufacturing deviations correlate with long-term noise problems in vehicles.

5.6. Current Research Status

The field of data-driven gear noise prediction has evolved rapidly in recent years, transitioning from purely physics-based simulations to hybrid approaches that integrate ML and digital twin concepts. This section highlights this progression, summarizes key publications from 2020–2025, visualizes the shift in research focus, and identifies remaining gaps.

5.6.1. Evolution from Physics-Only to Hybrid Models

Traditional gear noise analysis relied heavily on FEM, MBD, and frequency–domain tools. These techniques effectively modeled structural resonances, TE, and gear meshing dynamics [44,45]. However, their limitations—such as long computation times and reliance on expert tuning—motivated a shift toward data-driven strategies. As ML methods like Random Forest, Support Vector Regression, and SHAP interpretability matured, researchers began combining physical modeling with data analytics. This hybridization culminated in the rise of digital twin frameworks, which continuously update simulations using real-time measurements. These DT systems bridge the gap between simulated and actual system behavior, enabling predictive diagnostics and adaptive control [46].

5.6.2. Emerging Trends: Digital Twins and Inline ML

Hybrid digital twin implementations have become a dominant research trend by 2024. As illustrated conceptually in Figure 4, the field has progressed from a pre-2020 era of purely physics-based NVH simulations toward the current era where physics models are coupled with data-driven intelligence. Recent reviews underscore this evolution: for example, Habbouche et al. (2025) note that gearbox monitoring techniques have culminated in the emergence of Digital Twin technology, combining model-based and AI-driven approaches. In practical terms, this means gear engineers now aim to continuously synchronize analytical gear models with real-time data from sensors, creating living models that predict noise and wear in real time. The integration of machine learning into operational gear systems (“inline ML”) is also on the rise. Early steps can be seen in industry and academia, such as employing neural networks alongside end-of-line noise tests or using digital twins for real-time gear test bench monitoring. These approaches enable on-the-fly prediction of gear noise or faults during operation or manufacturing, allowing proactive adjustments. Industry experts have begun to acknowledge this shift: Singh (Ohio State Gear Lab) observed that the tools and methods for gear noise mitigation are rapidly changing as we approach the mid-2020s. There is growing interest in faster simulations, integration of CAE with data analytics, and verification/validation techniques to build trust in these new tools. In fact, gear research roadmaps now explicitly mention incorporating AI for design optimization and developing digital twin tools for gear systems. Overall, the emerging consensus is that combining domain knowledge (physics) with data-driven learning will yield the next generation of gear noise prediction models—ones that are both accurate and adaptive to real-world conditions [46].

5.6.3. Remaining Gaps and Future Directions

Despite major advances, several challenges persist. First, there is no standardized benchmark dataset for training and comparing gear noise prediction models, limiting cross-study validation. Second, interpretability of complex ML models (e.g., deep neural networks) remains limited, which restricts their industrial adoption. Third, real-time deployment under production conditions is still rare; most DT or ML implementations remain in prototype or test bench phases. Overcoming these gaps requires (1) collaborative dataset sharing initiatives, (2) integration of explainable AI techniques into gear diagnostics, and (3) system-level validation frameworks for inline deployment [44,47].

5.7. Future Research Directions

The combination of data-driven models, real-time analytics, and system-level noise control represents a paradigm shift in NVH engineering. Some key areas for further research include:

Developing interpretable AI models: Black box ML models (e.g., deep neural networks) are powerful but difficult to interpret. Future research should focus on explainable AI (XAI) techniques to make predictions more understandable to engineers [38,48].
Expanding data sources for noise modeling: Combining manufacturing metrology data, operational sensor data, and subjective noise perception studies could provide a holistic understanding of gear noise [39].
Integrating AI-based noise models into design workflows: Closing the loop between manufacturing, testing, and product design would allow early-stage noise performance evaluation, reducing the need for costly prototypes [40].
Another promising research stream lies in combining explainable AI (XAI) with digital twin environments. As Kobayashi (2024) and Nagrani (2025) suggest, integrating interpretable models in smart manufacturing allows engineers not only to trust predictions but to understand the physical meaning behind anomalies, forming the basis for continuous NVH improvement [41,42].

5.8. Industrial Applications of ML in NVH Quality

To further illustrate the maturity and industrial relevance of machine learning in NVH-related applications, Table 5 compiles representative use cases from major automotive OEMs and suppliers. These examples cover diverse implementation contexts such as end-of-line diagnostics, subjective noise mapping, and pattern-based anomaly detection using deep learning architectures.

5.9. Suggested Research Framework for Future Studies

In response to the complexity and multiscale nature of NVH phenomena in gearbox applications, we propose a structured research framework that integrates design, manufacturing, and testing domains with data-driven modeling approaches (see Figure 5).

The concept emphasizes a closed-loop learning cycle between simulation-based prediction, end-of-line measurements, and machine learning interpretation, enabling early design corrections and reduced reliance on physical prototypes.

The framework covers three primary axes:

Multiscale modeling, from gear microgeometry to full system housing simulations.
Physical measurements, including vibration spectra, transmission error, and material microstructure.
Data-driven models, utilizing both supervised and unsupervised methods for pattern discovery and predictive analytics.

This integrated methodology supports explainable and actionable NVH predictions, and facilitates early anomaly detection, even during production ramp-up stages.

6. Concluding Remarks and Practical Recommendations

In conclusion, we summarize the main findings of this review and provide practical recommendations for future research and industry practice.

ML-based predictive modeling is emerging as a highly effective approach for revolutionizing gear noise quality assurance in industrial manufacturing. Compared to traditional regression and physics-only simulation techniques, modern ensemble models like XGBoost and Random Forest consistently demonstrate superior prediction accuracy—particularly when manufacturing noise arises from subtle, high-dimensional surface deviations such as waviness or misalignment.

However, prediction quality is only as good as the data used. High-quality, well-labeled datasets that capture the full variation of microgeometric and process parameters are essential. In industrial practice, class imbalance remains a serious challenge: the proportion of parts with unacceptable noise is often less than 1%, making supervised learning models prone to bias toward the majority (acceptable) class. To address this, oversampling, semi-supervised learning, and anomaly detection must be considered, especially when high-resolution acoustic labeling is expensive or rare.

Key Recommendations:

For small datasets, Support Vector Regression (SVR) or decision trees may provide stable, interpretable results.
For nonlinear, multisource datasets, ensemble methods (e.g., Random Forest, XGBoost) offer an optimal trade-off between accuracy and speed.
In real-time applications, models must prioritize inference speed, robustness, and maintainability over black box complexity.

Explainability is crucial for adoption in manufacturing—models must provide actionable insights, not just outputs. Key recommendations for future industrial research directions are summarized in Table 6.
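One simple, model-agnostic way to obtain such insights is permutation importance: each input feature is shuffled in turn, and the resulting increase in prediction error indicates how strongly the model relies on it. The Python sketch below is a minimal illustration of that technique, used here in place of SHAP for brevity. The feature names, the toy predictor, and its coefficients are hypothetical and do not represent a measured relationship between manufacturing parameters and noise.

```python
import random


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MAE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_absolute_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the data set with only feature j permuted.
            X_perm = [row[:j] + [column[i]] + row[j + 1:] for i, row in enumerate(X)]
            permuted = mean_absolute_error(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(row):
    """Hypothetical model: noise depends on pitch error, not on roughness."""
    pitch_error_um, roughness_ra_um = row
    return 60.0 + 0.8 * pitch_error_um + 0.0 * roughness_ra_um


# Synthetic data for illustration only.
data_rng = random.Random(1)
X = [[data_rng.uniform(0.0, 10.0), data_rng.uniform(0.2, 0.8)] for _ in range(50)]
y = [toy_predict(row) for row in X]

importances = permutation_importance(toy_predict, X, y)
for name, value in zip(["pitch_error_um", "roughness_ra_um"], importances):
    print(name, round(value, 3))
```

Because the toy model ignores roughness, shuffling that column leaves the error unchanged, while shuffling pitch error raises it. On real production data the ranking would depend on the trained model and on correlations between features, which permutation importance does not disentangle.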

With continued research and industry–academic collaboration, data-driven NVH prediction has the potential to eliminate costly overengineering, reduce reliance on prototypes, and support sustainable and robust drivetrain development—particularly in electric mobility where gear noise is no longer masked by engines.

Funding

This research was supported by the EKÖP-25-3-I-SZE-82 University Research Scholarship Program of the Ministry for Culture and Innovation from the source of the National Research, Development and Innovation Fund.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The author declares no conflict of interest.

References

1. Chin, Z.; Smith, W.; Borghesani, P.; Randall, R.; Peng, Z. Absolute transmission error: A simple new tool for assessing gear wear. Mech. Syst. Signal Process. 2021, 146, 107070.
2. Chen, Z.; Shao, Y. Dynamic features of planetary gear train with tooth errors. Proc. Inst. Mech. Eng. Part C J. Mech. Eng. Sci. 2015, 229, 1769–1781.
3. Davoli, P.; Gorla, C.; Rosa, F.; Rossi, F.; Boni, G. Transmission error and noise emission of spur gears: A theoretical and experimental approach. In Proceedings of the ASME 2007 IDETC/CIE Conference, Las Vegas, NV, USA, 4–7 September 2007; American Society of Mechanical Engineers: New York, NY, USA, 2007; pp. 443–449.
4. Ahmad, M.; Brimmers, J.; Brecher, C. Influence of long-wave deviations on the quasi-static and dynamic excitation behaviour at higher speeds. Appl. Acoust. 2020, 165, 107307.
5. Houser, D.R.; Harianto, J.; Harianto, J. Gear noise: Causes and control. Gear Technol. 2001, 18, 10–19.
6. Lee, S.H.; Park, K.P. Development of a prediction model for the gear-whine noise of transmission using machine learning. Int. J. Precis. Eng. Manuf. 2023, 24, 1793–1803.
7. Henriksson, J. Simulation and validation of the transmission error, meshing stiffness and vibration in gear systems. J. Sound Vibration, 2020; Advance Online Publication.
8. Wang, J.; Zhang, Y.; Liu, X.; Zhou, Q. Evaluating lightweight gear transmission error: A novel nonlinear multibody approach. Front. Mech. Eng. 2023, 9, 1228696.
9. Winkelmann, L. Tackling EV noise reduction. Gear Solutions Magazine. 15 October 2022. Available online: https://gearsolutions.com/departments/materials-matter/tackling-ev-noise-reduction/ (accessed on 27 July 2025).
10. Masuda, T.; Inoue, M.; Iida, T.; Aoki, T. Prediction of gear noise considering the influence of tooth-flank finishing method. J. Vib. Acoust. Stress Reliab. Des. 1986, 108, 121–130.
11. Tian, X.; Li, Y.; Liu, Z.; Sun, J. High-speed and low-noise gear finishing by gear grinding and honing: A review. Chin. J. Mech. Eng. 2024, 37, 10.
12. Choi, W.J.; Kim, J.; Lee, Y.; Park, K.P. Effects of manufacturing errors of gear macro-geometry on gear performance. Sci. Rep. 2023, 13, 27204.
13. Chen, G.; Xu, Y. A statistical model for gear noise prediction in gearbox manufacturing. In Proceedings of the 2010 IEEE International Conference on Industrial Engineering and Engineering Management 2010, Xiamen, China, 29–31 October 2010.
14. Rajkumar, S.; Singh, R.; Kumar, P.; Gupta, A. Predictive maintenance algorithms, artificial intelligence, digital twin. Mathematics 2025, 13, 981.
15. Zhong, D.; Zhao, X.; Li, Y.; Zhang, D. Overview of predictive maintenance based on digital-twin technology. Heliyon 2023, 9, e14534.
16. Sun, P.; Huang, J.; Zhang, L.; Liu, Y. Multi-objective prediction of the sound-insulation performance of a vehicle body system using multiple kernel learning–support vector regression. Electronics 2024, 13, 538.
17. Gleason Corporation. GRSL: Gear Rolling System with Integrated Laser for 100% Noise Inspection [White Paper]; Gleason Corporation: Ludwigsburg, Germany, 2023.
18. Scania. Machine Learning in Industrial Quality Control-Acoustic Deviation Detection. Master’s Thesis, KTH Royal Institute of Technology, Stockholm, Sweden, 2017.
19. Aurich, B. Aspects of gear noise, quality, and manufacturing technologies for electromobility. Gear Technology Magazine, 21 February 2023; 60–64.
20. H2020 ECO-Drive Project. System-Level NVH Optimisation for Sustainable Electric Drivetrains. Horizon 2020 Grant Agreement No 858018 (1 March 2020–29 February 2024). Available online: https://cordis.europa.eu/project/id/858018 (accessed on 27 July 2025).
21. Türich, A.; Deininger, K. Noise Analysis for e-Drive Gears and in-Process Gear Inspection. Gear Solutions Magazine. 15 February 2024. Available online: https://gearsolutions.com/features/noise-analysis-for-e-drive-gears-and-in-process-gear-inspection/ (accessed on 27 July 2025).
22. Nexen Tire. Prediction System to Reduce Tire Noise Using AI and Big Data [Press Release]; Nexen Tire: Yangsan-si, Republic of Korea, 2020.
23. Hyundai Motor Group & Hoseo University. Big Data Analysis for Vehicle NVH Development; Technical Report; Hoseo University; Hyundai Motor Group: Seoul, Republic of Korea, 2022.
24. Sun, H.; Wang, C.; Cao, X. An adaptive anti-noise gear fault diagnosis method based on attention residual prototypical network under limited samples. Appl. Soft Comput. 2022, 125, 109120.
25. Cihan, P. Bayesian hyperparameter optimization of machine-learning models for predicting biomass gasification gases. Appl. Sci. 2025, 15, 1018.
26. González-Duque, M.; Michael, R.; Bartels, S.; Zainchkovskyy, Y.; Hauberg, S.; Boomsma, W. A survey and benchmark of high-dimensional Bayesian optimization of discrete sequences. arXiv 2024, arXiv:2406.04739.
27. Molnar, C. Interpretable Machine Learning: A Guide for Making Black-Box Models Explainable, 3rd ed.; Leanpub: Victoria, BC, Canada, 2023.
28. Liew, W.Y.; Nee, A.Y.C. Hybrid predictive models for surface roughness in gear manufacturing. Procedia CIRP 2015, 34, 225–230.
29. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357.
30. Han, Y.; Wei, Z.; Huang, G. An imbalance data quality monitoring based on SMOTE-XGBoost supported by edge computing. Sci. Rep. 2024, 14, 10151.
31. Bischl, B.; Binder, M.; Lang, M.; Pfahringer, B.; Kotthoff, L. Hyperparameter tuning in machine learning. J. Mach. Learn. Res. 2021, 22, 1–114.
32. Snoek, J.; Larochelle, H.; Adams, R.P. Practical Bayesian optimization of machine learning algorithms. In Advances in Neural Information Processing Systems 25, Proceedings of the 26th Annual Conference on Neural Information Processing Systems 2012, Lake Tahoe, NV, USA, 3–6 December 2012; Curran Associates, Inc.: Red Hook, NY, USA, 2012; pp. 2951–2959.
33. Varma, S.; Simon, R. Bias in error estimation when using cross-validation for model selection. BMC Bioinform. 2006, 7, 91.
34. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems 30, Proceedings of the 31st Annual Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates, Inc.: Red Hook, NY, USA, 2017; pp. 4765–4774.
35. Tan, Y.; Yang, G.; Zhu, X. Interpretable surrogate trees for gradient boosting. Int. J. Data Sci. Anal. 2022, 13, 59–73.
36. Han, H.; Wang, W.; Mao, B. Borderline-SMOTE: A new over-sampling method in imbalanced data sets learning. In Proceedings of the 2005 International Conference on Intelligent Computing, Hefei, China, 23–26 August 2005; pp. 878–887.
37. Saito, T.; Rehmsmeier, M. The precision–recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS ONE 2015, 10, e0118432.
38. Puthanveettil Madathil, A.; Luo, X.; Liu, Q.; Walker, C.; Madarkar, R.; Qin, Y. A review of explainable artificial intelligence in smart manufacturing. Int. J. Prod. Res. 2025, 1–44.
39. Kibrete, F.; Woldemichael, D.E.; Gebremedhen, H.S. Multi-sensor data fusion in intelligent fault diagnosis of rotating machines: A comprehensive review. Measurement 2024, 232, 114658.
40. Wang, J.; Shi, L.; Ding, F.; Jinli, L.; Hou, L.; Enming, M. A digital twin modeling and application for gear rack drilling rigs lifting system. Sci. Rep. 2024, 14, 23711.
41. Kobayashi, K.; Alam, S.B. Explainable, interpretable and trustworthy AI for an intelligent digital twin: A case study on remaining useful life. Eng. Appl. Artif. Intell. 2024, 129, 107620.
42. Nagrani, S.R.; Narwane, V.S. Systematic literature review on digital twins in predictive maintenance. Ind. Eng. J. 2025, 18, 19–25.
43. Zhang, Q.; Wu, Z.; An, B.; Sun, R.; Cui, Y. Digital Twin-Based Technical Research on Comprehensive Gear Fault Diagnosis and Structural Performance Evaluation. Sensors 2025, 25, 2775.
44. Wilk-Jakubowski, J.L.; Pawlik, L.; Frej, D.; Wilk-Jakubowski, G. The Evolution of Machine Learning in Vibration and Acoustics: A Decade of Innovation (2015–2024). Appl. Sci. 2025, 15, 6549.
45. Shi, Z.; Liu, B.; Yue, H.; Wu, X.; Wang, S. Noise Reduction of Two-Speed Automatic Transmission for Pure Electric Vehicles. Vehicles 2023, 5, 248–265.
46. Habbouche, H.; Amirat, Y.; Benbouzid, M. Leveraging Digital Twins and AI for Enhanced Gearbox Condition Monitoring in Wind Turbines: A Review. Appl. Sci. 2025, 15, 5725.
47. Li, J.; Wang, S.; Yang, J.; Zhang, H.; Zhao, H. A Digital Twin-Based State Monitoring Method of Gear Test Bench. Appl. Sci. 2023, 13, 3291.
48. You, K.; Lian, Z.; Gu, Y. A performance-interpretable intelligent fusion of sound and vibration signals for bearing fault diagnosis via dynamic CAME. Nonlinear Dyn. 2024, 112, 20903–20940.

Figure 1. Comparative radar chart of ML models: hyperparameter tuning considerations.

Figure 2. Data-driven predictive modeling workflow.

Figure 3. Integration of digital twin architecture.

Figure 4. Research timeline showing the transition from FEM and signal processing (pre-2020) to integrated ML and digital twin applications (2023–2025), based on recent literature [20,47].

Figure 5. Suggested research framework integrating multiscale simulations, experimental measurements, and machine learning into a closed-loop NVH optimization process.

Table 2. Comparison of ML Algorithms for Industrial Gear Noise Prediction.

Algorithm	Strengths	Limitations	Industrial Suitability (NVH)	Example References
Linear Regression	Simple. Fast. Interpretable. Good baseline	Struggles with nonlinearity. Low accuracy in complex data	Suitable as baseline model. Not recommended for nonlinear gear noise cases	Chen & Xu (2010) [13]
Decision Tree	Interpretable. Handles nonlinearity. Fast inference	Prone to overfitting. Unstable predictions	Useful for feature selection or as part of ensembles	Choi et al. (2023) [12]
Random Forest	Robust. Handles high-dimensional data. Provides feature importance	Slower inference. Less interpretable than single trees	Common in gear and bearing fault detection	Lee & Park (2023) [6]; Chen & Xu (2010) [13]
XGBoost/LightGBM	High accuracy. Handles nonlinearity. Fast training/inference	Sensitive to hyperparameters. Less transparent	Excellent for gear whine and TE-based noise prediction	Lee & Park (2023) [6]
Support Vector Regression (SVR)	Performs well on small datasets. Good generalization.	Sensitive to kernel choice. Slow for large data	Suitable for precise NVH estimation in limited data regimes	Sun et al. (2024) [16]
Deep Neural Network (DNN)	Captures complex patterns. Suitable for big data	Requires large datasets. Black box nature. High computational cost	Used in tire and motor NVH. Less common for gear unless data-rich	Nexen Tire (2020) [22]; Hyundai Motor Group & Hoseo University (2022) [23]

Table 3. Summary of Scientific Applications of ML for Gearbox NVH Prediction.

Criterion	Random Forest (RF)	Gradient Boosting (XGBoost/LightGBM)	Deep Neural Networks (DNN)	Key Sources
Typical accuracy trend	Good baseline; usually within ±2–4 dB of best ensemble on tabular gear metrology data	Consistently highest accuracy on tabular data (≈5–10% MAE improvement over RF)	Can surpass tree models when >10 k labelled samples or raw spectra/images are used	[6,25]
Data volume needed for stable model	≈500–1000 labelled parts	≈1000–2000 parts (benefits strongly from >2 k)	≥10,000 parts for reliable generalization	[6,25]
Training & inference speed (CPU/edge)	Fast (seconds–minutes); inference <10 ms	Moderate (needs hyperparameter tuning); inference <10 ms (GPU/CPU)	Slowest (minutes–hours train; 10–50 ms inference)	[25]
Interpretability	Medium—built-in feature importance, partial dependence	Medium/low—needs SHAP/gain analysis	Low—requires XAI (SHAP, surrogate trees)	[27]
Hyperparameter sensitivity	Low–moderate (n trees, depth)	High (learning rate, depth, subsample)	Very high (layers, LR, dropout)	[25]
Overfitting tendency	Moderate; bagging mitigates	Moderate–high; needs early stopping & regularisation	High without strong regularisation & dropout	[25]
Inline/real-time suitability	Proven in PLC/PC-based inline QC	LightGBM & XGBoost demonstrably run within cycle time (<1 s)	Edge-GPU OK; heavy for PLC	[17]
Best-fit role in pipeline	Rapid prototyping, feature screening, small-to-mid datasets	Production-grade regression/classification with balanced speed/accuracy	Vision or raw spectra pipelines, large-scale R&D, anomaly detection

Table 4. Manufacturing parameters and their typical influence on gear noise levels. Compiled based on industrial experience, literature, and SHAP-based insights (e.g., [6,16]).

Manufacturing Parameter	Typical Range	Noise Sensitivity	Effect on Noise (qualitative)
Tooth Profile Modification (µm)	0–30	High	↑ if overmodified
Lead Modification (µm)	0–25	Medium	↕ depends on meshing
Tooth Crowning (µm)	0–20	High	↑ in high-speed
Surface Roughness (Ra, µm)	0.2–0.8	Medium	↑ with poor lubrication
Pitch Error (µm)	±10	Very High	↑↑ tonal noise
Runout (µm)	±5	High	↑ amplitude modulation
Material Batch Variance	Low/Medium/High	Medium	↕ varies with stiffness
Heat Treatment Deviation	±20 °C/±15 min	Low–Medium	↕ affects residual stress and dynamic response

Table 5. Industrial Use of Machine Learning in NVH-Quality Applications.

Company/Consortium	Application Area	ML Method	Context/Notes
Nexen Tire	Tire NVH quality classification	CNN (DL)	Vibration-based defect detection
Hyundai Motor	Powertrain NVH fault detection	Deep NN	Anomaly detection during EOL testing
BMW	EOL NVH diagnostics	XGBoost	Engine/gearbox vibration classification
ZF Friedrichshafen	Gearbox NVH clustering	Auto-encoder + KMeans	Noise pattern mining from end-of-line data
Bosch	Electric motor noise detection	LSTM	Temporal analysis of NVH data streams
Toyota	Cabin NVH profiling	CNN, SVM	Mapping subjective noise comfort to design variants
Continental	Tire–road interaction NVH	CNN	Predictive modeling of pattern-induced noise

Table 6. 5-Year Industrial Research Roadmap for ML-Based Gear Noise Prediction.

Year	Focus Area	Expected Outcome
2025	Data pipeline integration	Full digital traceability from machining to acoustic test benches
2026	Real-time predictive-model deployment	Inline XGBoost/SVR models running within cycle time limits
2027	Hybrid AI + physics-based simulation models	Fast surrogate models for TE + NVH prediction
2028	Explainable AI and uncertainty quantification	Visual dashboards to support decision-making and root cause analysis
2029	Closed-loop manufacturing + AI self-tuning systems	Automatic parameter tuning in grinding/honing based on ML feedback

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the author. Published by MDPI on behalf of the World Electric Vehicle Association. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
