1. Introduction
Although traditional product quality control methods have long served as the cornerstone of industrial competitiveness, the transition towards intelligent and high-end manufacturing introduces greater complexities and elevated standards for quality control. This is particularly evident in assembly processes, where dimensional compatibility directly determines product performance and stability. Conventional quality control approaches, which primarily rely on manual expertise and univariate statistical analysis [
1], are often inadequate to address the error interactions and cumulative effects inherent to complex multi-stage manufacturing systems (MMS) [
2]. As a result, quality variations often become difficult to trace back to specific processes or root causes.
With the advent of Industry 4.0, manufacturing is undergoing a data-driven and intelligence-centred transformation. Modern manufacturing systems, especially in domains such as automobile manufacturing, are characterised not only by structural complexity and multi-step processes but also by high coupling and nonlinear behaviour. Traditional methods show clear limitations in handling multi-source, high-dimensional data and in uncovering hidden quality correlations. There is therefore an urgent need for more intelligent and adaptive quality control strategies that shift the paradigm from post-event detection and reactive correction to predicting quality issues, identifying critical deviations, and tracing root causes. By integrating real-time sensor data, process parameters, and historical quality records, an intelligent quality control system can identify potential deviations in advance, pinpoint key process variables, and track error propagation paths across manufacturing stages. This transformation not only enables more precise and efficient quality management but also lays a solid foundation for next-generation smart manufacturing systems with self-awareness and autonomous decision-making capabilities. In this context, machine learning and neural networks have emerged as pivotal technologies for enhancing process quality [
3]. Artificial neural networks (ANNs) are among the most successful machine learning methods. Owing to their flexible, parallel composition of neurons and their ability to approximate arbitrary functions over varied input forms, ANNs provide a feasible solution for modern engineering modelling [
4,
5]. The BP neural network denotes a specific category of feedforward ANN whose training is fundamentally based on the back propagation algorithm. In recent years, BP neural networks have been successfully applied to general and adaptive quality prediction and control in the fields of manufacturing and processing [
6,
7], exhibiting advantages in rapid and accurate prediction.
Accordingly, this study presents an integrated dimensional chain quality prediction and traceability system that combines principal component analysis (PCA), BP neural networks, and permutation importance. Firstly, PCA is used to reduce the high-dimensional, correlated data and extract essential features, eliminating redundancy to ensure reliable and precise model inputs. Secondly, the system integrates neural network methodologies from a data-driven perspective to establish the nonlinear predictive relationship between body-in-white dimensions and matching quality. Finally, measuring the importance of the variables addresses the limitations of conventional neural networks in identifying the root causes of deviations and pinpointing their primary sources. This method synergistically enhances prediction accuracy and interpretability and introduces a novel dimensional assembly quality analysis technique for automotive manufacturing.
2. Literature Review
Conventional assembly quality control relies on worker expertise and measurement instruments. While adequate for simple scenarios, it falls short in contemporary, complicated assemblies. These complex assemblies demand synchronised, hierarchical data management from diverse sources. To address this, physically based modelling approaches, particularly deviation propagation modelling, are suggested for identifying quality variations [
8].
Recent mechanism-based models [
9,
10,
11,
12,
13] mathematically analyse assembly deviations, overcoming empirical limits. However, their inability to precisely quantify complex multi-factor coupling and adapt in real-time (relying on offline modelling) renders them inadequate for modern complex assembly. Efficient online methods for prediction and control are urgently needed.
Developing sophisticated control models for predicting assembly quality is essential in intricate product production. Accurate forecasting of product quality metrics enables early issue detection and proactive strategies to maintain acceptable thresholds. Product quality prediction methods are often classified as either physical methods or data-driven approaches. Data-driven methodologies are more pragmatic and adaptable, involving training predictive models and extracting trends from previous data. New techniques offer benefits by eliminating the intricate modelling processes conventionally needed in assembly activities. Ref. [
14] integrated quality tools, the Genetic Algorithm (GA), and the Distributed Computing Continuum (DCC); Ref. [
15] used Bayesian sampling for joint design; Refs. [
16,
17] proposed a data-driven approach to creating a predictive grinding error model to improve tolerance allocation; and Ref. [
18] applied Man, Machine, Material, Method, Measure, and Environment (5M1E)/Fuzzy Analytic Hierarchy Process (FAHP) to assembly error analysis.
Manufacturing firms must examine and handle substantial process data before employing quality prediction methodologies, with data quality being crucial in machine learning algorithms [
19]. Consequently, adopting more stable and interpretable quality prediction models is imperative. Prevalent quality prediction methodologies encompass particle swarm optimisation [
20,
21], grey theory [
22,
23], support vector machines [
24], neural networks [
25], and random forest algorithms [
26]. Each possesses distinct advantages tailored to various application contexts, data conditions, and problem types. Recent studies have advanced quality control through data-driven methods: Ref. [
27] applied time-series Machine Learning (ML) to automotive bumper inspection, while Ref. [
28] combined random forest and FAHP for aerospace defect root-cause analysis. For assembly deviation, Ref. [
29] modelled transmission mechanisms, Ref. [
30] developed online control for multidimensional error coupling, and Ref. [
31] integrated digital twins for geometric deviation propagation analysis. Prediction algorithms were enhanced by [
32], using Least Squares Support Vector Regression (LSSVR)-enhanced Particle Swarm Optimization (PSO) and Ref. [
33], employing ML techniques (including random forests) for imbalanced data. In industrial manufacturing, neural network predictions often overlook critical information within the original data. Ref. [
34] proposed hierarchical residual networks with stacked autoencoders; Ref. [
35] boosted accuracy via multi-model fusion; and Ref. [
36] applied the failure sign algorithm of the firefly neural network (FSAFNN) to gear machining deviations. Advanced detection was addressed by [
37] using interpretable Artificial Intelligence (AI) for defect linkage mining, and Ref. [
38] implemented multilayer transformers for multi-source time-series anomaly detection.
Currently, most existing assembly quality optimisation methods rely primarily on design models to achieve simulation-based offline optimisation. Assembly processes face uncertainties from measurement, location, fixturing, and tightening forces, limiting error source identification and quality assurance. Neural networks are among the most prevalent machine learning algorithms for process and quality management in manufacturing, used primarily for real-time analysis or for identifying defective products via image recognition. Their workflow can be divided into two major processes, namely forward propagation and backward propagation of errors (backpropagation). In the forward propagation process, the input layer receives external input signals and passes them to the hidden layer; the hidden layer performs a nonlinear signal transformation; the processed information is then transmitted to the output layer, which outputs the result. In the backpropagation stage, the error is propagated backward from the output layer through the hidden layer(s) toward the input layer, and the weights of the connections between neurons in each layer are updated according to the principle of error gradient descent [
39]. Ref. [
40] used power signals with feed-forward networks for spot weld diameter prediction, outperforming regression models. For complex manufacturing scenarios, Ref. [
41] identified key quality features in multi-process production using PageRank algorithms, while Ref. [
42] enhanced prediction precision with Phased Dual-Attention Long Short-Term Memory (PDA-LSTM) networks. Process-specific innovations included [
43] optimising blast furnace control via a BP neural network, Ref. [
6] developing a Random Subspace Method (RSM)-BP neural network for small-batch surface roughness prediction, and Ref. [
44] combining the Principal Component Analysis (PCA) dimensionality reduction with deep neural networks for internal crack analysis. Ref. [
45] used a BP neural network to predict adjustment errors, and heuristic algorithms guided by the structural characteristics of the BP neural network are embedded into the Machine Learning framework to construct a bi-level optimisation strategy that enhances model performance.
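The forward and backward propagation workflow described above can be illustrated with a minimal NumPy sketch. This is a hedged, self-contained example, not the authors' implementation: the data, network shape (one hidden layer of eight tanh neurons), and learning rate are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data: four "measurement" inputs and one quality indicator.
X = rng.normal(size=(200, 4))
y = np.sin(X.sum(axis=1, keepdims=True))

# One hidden layer of 8 tanh neurons; weights initialised randomly.
W1 = rng.normal(scale=0.5, size=(4, 8)); b1 = np.zeros((1, 8))
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros((1, 1))
lr = 0.05

def forward(X):
    h = np.tanh(X @ W1 + b1)      # hidden layer: nonlinear transformation
    return h, h @ W2 + b2         # output layer: linear read-out

mse0 = float(np.mean((forward(X)[1] - y) ** 2))    # error before training

for _ in range(2000):
    h, out = forward(X)
    err = out - y                                  # output-layer error
    # Backward propagation: gradients flow output -> hidden -> input.
    dW2 = h.T @ err / len(X); db2 = err.mean(axis=0, keepdims=True)
    dh = (err @ W2.T) * (1.0 - h ** 2)             # tanh derivative
    dW1 = X.T @ dh / len(X); db1 = dh.mean(axis=0, keepdims=True)
    # Gradient-descent update of weights and thresholds (biases).
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

mse = float(np.mean((forward(X)[1] - y) ** 2))
print(f"MSE before: {mse0:.4f}  after: {mse:.4f}")
```

The loop makes the two phases explicit: a forward pass producing the output, then an error-gradient pass that updates every layer's weights, exactly the cycle the BP algorithm repeats until the loss converges.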
Despite advancements in physical modelling and data-driven techniques in contemporary assembly quality control research, a substantial bottleneck persists regarding the demand for high-dimensional and tightly coupled dimensional chain analysis in automotive manufacturing. Physical models' offline nature impedes dynamic coupling quantification, while prevalent data-driven approaches (e.g., LSTM, Random Forest) suffer from "black-box" limitations that obscure deviation sources and hinder actionable optimisation. For complex assemblies like body-in-white, characterised by high dimensionality and multicollinearity, current methods rely on manual feature filtering or sequential prediction and traceability, risking inefficiency and error. Neural networks correlate parameters with quality but miss transmission paths; traditional statistics (e.g., ANOVA) identify problematic processes but fail with nonlinear interactions and real-time data. Moreover, most research compartmentalises data dimensionality reduction, predictive modelling, and root cause analysis, overlooking the possibility of synergistic optimisation among these elements and leading to inadequate model flexibility in dynamic production environments.
3. Data Collection, Preprocessing, and Analysis
3.1. Dimensional Chain Construction for Body-in-White Assembly
To analyse the rear panel fit quality, the assembly drawing of the rear panel must be established to support further node-to-node influence analysis, using a certain automobile company as an example, as shown in
Figure 1. Since body-in-white components are typically joined by welding, with sequences limited by accessibility and part interference, optimising the process design requires systematic assembly sequence modelling. This study uses a directed graph to represent the assembly relationship, capturing both sequence dependency and physical connection logic. The assembly graph is defined as $G = (V, E)$, where $V$ is the set of components and $E$ is the set of relationships describing their connections. Solid lines indicate direct physical connections, while dashed lines represent sequence constraints due to welding accessibility; for example, if part $p_i$ blocks the weld area of part $p_j$, then $p_j$ must be assembled before $p_i$.
Based on the architectural diagram and the welding accessibility constraints observed by the automobile company, the pertinent components of the rear panel can be interrelated as depicted in
Figure 2. The numbers correspond to those in
Figure 1. Solid lines indicate direct physical connections between components, while dashed lines denote a sequential order of operations between components that are not directly connected. This convention stems from practical manufacturing constraints. During auto body welding, many components cannot be welded simultaneously due to spatial interference.
Based on
Figure 2, the adjacency matrix of the directed graph $G$ can be obtained. Through a graph-theoretic transformation, it can be converted into numerical ordering data while retaining the assembly priority, so that the computer can recognise the assembly sequence, represented with {} for subassemblies and () for subordinate components, while maintaining the hierarchical relationship. The interrelations among specific assemblies can be articulated in the following manner:
Let $A$ represent an assembly, defined as $A = \{p_1, p_2, \ldots, p_n\}$, where $p_j \in V$ and $n$ signifies the total number of pieces.
Consequently, the assembly dimension chain for the whole vehicle of this automobile company may be articulated in this nested bracket notation. The assembly sequence of the various components and their interrelated assembly relationships can be identified from the brackets; specifically, (2), (4), and (3) represent the primary assemblies, while (8) denotes the subassembly, which is ultimately integrated to constitute the assembly of (1).
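The directed-graph representation and its adjacency matrix can be sketched in Python. The edges below are hypothetical stand-ins for the arcs of Figure 2 (they are not the company's actual connection data), and a topological sort recovers a feasible assembly sequence from the precedence constraints:

```python
from collections import deque

# Hypothetical edges standing in for Figure 2's arcs: part (8) feeds
# subassembly (3), and (2), (4), (3) join to form the final assembly (1).
edges = {8: {3}, 2: {1}, 4: {1}, 3: {1}}
nodes = {1, 2, 3, 4, 8}

# Adjacency matrix of the directed graph G = (V, E).
idx = {n: i for i, n in enumerate(sorted(nodes))}
A = [[0] * len(nodes) for _ in nodes]
for u, succs in edges.items():
    for v in succs:
        A[idx[u]][idx[v]] = 1

# Kahn's topological sort recovers a feasible assembly sequence that
# respects every precedence constraint encoded in the graph.
indeg = {n: 0 for n in nodes}
for succs in edges.values():
    for v in succs:
        indeg[v] += 1
queue = deque(sorted(n for n in nodes if indeg[n] == 0))
order = []
while queue:
    u = queue.popleft()
    order.append(u)
    for v in sorted(edges.get(u, ())):
        indeg[v] -= 1
        if indeg[v] == 0:
            queue.append(v)

print(order)  # → [2, 4, 8, 3, 1]
```

Every part appears before the assembly that consumes it, which is exactly the property the bracket notation encodes.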
3.2. Data Collection
This study focuses on the dimensional matching quality of a vehicle’s rear section, which is evaluated by comparing the physical measurements against theoretical design specifications. The analysis is based on a dataset obtained from the automobile manufacturer, comprising rear panel measurement data for a specific Body-in-White model from June 2022 to October 2023. All data were acquired using a calibrated Three-Coordinate Measuring Machine (CMM) system, which provides an objective assessment of deviations at key installation and mating points. The fundamental quality criterion is whether the foundational dimensions of the rear section are within the specified tolerance range. Consequently, if it is within the tolerance range, the body matching quality is considered to have met the required control standards.
To construct the dimensional chain for rear panel fit quality, measurement points related to five key assemblies are selected based on engineering expertise: Side Inner Panel Assembly (039), Side Outer Panel Assembly (051), Rear Floor Assembly (101), Floor Assembly (709), and Body-in-White Skeleton Assembly (701), corresponding, respectively, to parts (4), (2), (8), (3), and (1) as shown in the previous section. Assemblies 709 and 701 are re-assessed daily, while the remaining subassemblies are evaluated twice a week on average. Additionally, overlapping measurement points exist between sub-assemblies and assemblies, resulting in repeated assessments of their respective components. A total of 119,910 measurement data points is ultimately accumulated.
Using the measurement point ‘NLASL1201_V_AA’ in 709 as an example, the measurement data is shown in
Table 1. X, Y, and Z serve as the primary criteria for dimensional evaluation, whereas the standard D value is employed to augment the assessment of surface alterations.
3.3. Data Preprocessing
The initial modelling attempts using the raw measurement data yield suboptimal prediction performance, indicating the likely presence of significant noise, sporadic measurement errors, and non-systematic variations. To enhance the signal-to-noise ratio and extract the stable, systematic relationships crucial for quality prediction, a targeted preprocessing workflow is designed and implemented. The workflow comprises three sequential stages: data classification, gross error elimination, and data denoising, utilising selected data from 709 as examples. Its overall effectiveness is validated by the subsequent improvements in model stability and predictive accuracy.
3.3.1. Data Classification
The unprocessed comma-separated values (CSV) data are classified and structured by measurement points to create an analytical matrix-type data framework encompassing multidimensional measurement metrics, with the reorganised data presented in
Table 2.
3.3.2. Gross Error Elimination
Prior to dimensionality reduction with PCA and training the neural network, all numerical features (i.e., the X, Y, and Z values of each measurement point) are standardised to have a mean of zero and a unit standard deviation. This step is crucial for PCA, which is sensitive to the scale of variables, and it also stabilises and accelerates the convergence of the neural network training.
The standardisation is applied to each data point independently using the Z-score normalisation formula:
$$z = \frac{x - \mu}{\sigma}$$
where $x$ is the original value of a data point for a given feature, $\mu$ is the mean of that feature calculated from the data set, and $\sigma$ is its standard deviation. The calculation formulas for $\mu$ and $\sigma$ are as follows:
$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \mu\right)^2}$$
This work employs the $3\sigma$ criterion for gross error rejection, which assumes normally distributed data. The computation over all X-direction data for NLASL1201_V_AA yields the mean $\mu$ and standard deviation $\sigma$ for this measurement point. The upper and lower tolerances of the data are calculated from these two values, as shown in Equations (6) and (7):
$$T_{upper} = \mu + 3\sigma \quad (6)$$
$$T_{lower} = \mu - 3\sigma \quad (7)$$
All data points are evaluated, and those that satisfy Equation (8) are removed:
$$\left|x_i - \mu\right| > 3\sigma \quad (8)$$
The $\sigma$ value is then recalculated, and the new $\sigma$ is used to screen for gross errors again until none remain.
Upon eliminating the three outliers (−1.64, 1.38, and 1.5), we obtain the new $\mu$ and $\sigma$ and retain the updated measurement data points.
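The iterative 3σ screening described above can be sketched as follows. This is a minimal illustration, not the plant's actual pipeline: the readings are synthetic, with the three gross errors quoted in the text injected into an otherwise well-behaved series.

```python
import numpy as np

def remove_gross_errors(x, k=3.0):
    """Iteratively drop points outside mean ± k·sigma until none remain."""
    x = np.asarray(x, dtype=float)
    while True:
        mu, sigma = x.mean(), x.std()
        keep = np.abs(x - mu) <= k * sigma
        if keep.all():
            return x
        x = x[keep]   # recompute mu and sigma on the reduced sample

# Synthetic X-direction readings for one measurement point, with the
# three gross errors quoted in the text (-1.64, 1.38, 1.5) injected.
inliers = [-0.05, -0.04, -0.03, -0.02, -0.01, 0.0,
           0.01, 0.02, 0.03, 0.04, 0.05] * 4
data = inliers + [-1.64, 1.38, 1.5]
clean = remove_gross_errors(data)
print(len(data) - len(clean))  # → 3
```

Recomputing μ and σ after every rejection pass is the key detail: the gross errors inflate σ on the first pass, so a single pass can under-reject.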
3.3.3. Data Denoising
This paper analyses the data distribution of NLASL1201_V_AA following the removal of gross errors, utilising a histogram and box plot, as shown in
Figure 3. In
Figure 3, the solid red line represents the theoretical normal distribution, while the dashed red lines and the blue shaded band both indicate the boundaries of the confidence interval. The black data points are the observed values, and assessing their position relative to the theoretical line and confidence band helps determine whether the data significantly deviate from a normal distribution. The analysis reveals that the measurement point exhibits normal distribution, with noise predominantly concentrated at both extremes of the data. A Gaussian filter, which is more responsive to normally distributed data, is chosen for processing to achieve data smoothing.
The fundamental parameter of Gaussian filtering is the standard deviation σ, which dictates the strength of the filter's smoothing effect. This study employs the Gaussian filtering function from the SciPy library in Python 3.9 to assess the Mean Squared Error (MSE) of the filtering outcomes across various σ values. For σ settings of 1.0, 1.5, and 2.0, the MSE values obtained are 0.0709, 0.1223, and 0.1107, respectively. Since a lower MSE indicates superior performance, the data filtered with a Gaussian filter with σ = 1.0 are ultimately selected for subsequent data mining.
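The σ comparison can be sketched with SciPy's `gaussian_filter1d`. This is a hedged illustration on a synthetic series (the real measurement data is not public), so the MSE values will not reproduce the paper's numbers; here the MSE is computed between the raw and filtered series, quantifying how strongly each setting alters the data.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

rng = np.random.default_rng(0)
# Synthetic stand-in for the cleaned NLASL1201_V_AA series: a slow
# systematic drift plus normally distributed measurement noise.
t = np.linspace(0.0, 1.0, 434)
raw = 0.1 * np.sin(2 * np.pi * t) + rng.normal(0.0, 0.05, size=t.size)

# Evaluate the sigma settings considered in the text (1.0, 1.5, 2.0);
# a larger sigma smooths more and departs further from the raw series.
mses = {}
for sigma in (1.0, 1.5, 2.0):
    smoothed = gaussian_filter1d(raw, sigma=sigma)
    mses[sigma] = float(np.mean((raw - smoothed) ** 2))

best = min(mses, key=mses.get)   # the gentlest setting, sigma = 1.0
print(best, mses)
```

In this raw-versus-filtered formulation the MSE grows monotonically with σ, since the Gaussian kernel removes progressively more of the signal's high-frequency content.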
As shown in Figure 4, the filtering effectively eliminates extreme noise while preserving the distribution tails, retaining the original features, and reducing the computational complexity of subsequent mining. Upon completion of data cleansing, a total of 88,970 measurement data points is obtained.
The preprocessing steps are essential for creating a robust and stable dataset for model training, effectively removing measurement noise and sporadic, non-systematic errors. However, the 3σ-based outlier removal and Gaussian smoothing, while targeting random noise, may also attenuate or remove rare yet potentially significant abnormal patterns that deviate from the normal process distribution. These patterns could correspond to infrequent but critical process faults, incoming material defects, or unique assembly events. Since this study aims to establish a reliable prediction model for routine dimensional control, by filtering these noises, the model is primarily optimised for predicting quality under stable, standard operating conditions; its ability to predict outcomes under extreme or unforeseeable fault conditions may be limited.
4. Model Development Based on PCA-BP for Dimensional Matching Quality Prediction
4.1. PCA Optimisation of Input Parameters
In this study, PCA is employed not for the conventional purpose of linear feature transformation, but as a robust feature selection technique to identify the most informative original measurement points from a high-dimensional, multicollinear dataset [
46]. We apply standard linear PCA for dimensionality reduction and feature selection. The calculations are performed using the PCA class from the scikit-learn library (version 1.3.0) in Python.
The BIW Assembly dimensional data is characterised by strong correlations among numerous measurement points due to physical constraints and process couplings. Simple univariate metrics, such as Pearson correlation coefficients between individual predictors and the target, could identify locally relevant points but may fail to capture the underlying, system-level variation patterns that govern overall assembly quality. PCA, by identifying the orthogonal directions of maximum variance in the entire predictor space, effectively reveals these dominant modes of variation. We then trace back to the original variables that contribute most to these key PCs.
This approach ensures that the selected predictors are not only individually significant but also collectively representative of the major systematic variation sources in the assembly process. The retained original measurement point values are then used directly as inputs to the BP neural network, preserving their physical interpretability for subsequent root cause analysis, which would be obscured if transformed principal components were used.
The dimensional matching quality of the rear panel is influenced by a multitude of pertinent factors. Factory-manufactured components (such as those in the Floor Assembly zone) are measured at an elevated frequency, whereas incoming components from suppliers (such as the Body Side compartment) are measured comparatively rarely; directly integrating the dimensional reports of incoming components into the model could therefore introduce unnecessary interference caused by the differing measurement cycles. We consequently prioritise the higher-level, factory-controlled components, preserving solely the 101 data within the 709 area. The model input thus comprises five essential components: 101 (rear floor assembly), 051 (side outer panel assembly), 039 (side inner panel assembly), 709 (floor assembly), and 701 (body-in-white skeleton assembly).
From the perspective of data analysis, 233 measurement points are recorded across the various locations; including all of them would considerably increase the computational complexity and resource consumption of the model. To manage the 233 measurement points and their correlations, PCA is employed to reduce dimensionality, removing superfluous variables while preserving essential features. Using the refined dataset of area 101 as a case study, this section details PCA's application process and key phases in rear panel dimensional modelling.
4.1.1. PCA Application Using 101 as an Example
Based on the previously cleaned and normalised data, the 101 area totals 33 variables, with 434 data points per variable and 14,322 data points cumulatively. The Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy is 0.699, and Bartlett's test of sphericity is statistically significant. These results suggest that the data are suitable for PCA.
The PCA shows that the first eight principal components (PCs) explain 91.45% of the total variance. PC1 (eigenvalue 2.975) contributes 41.46%, and PC2 (eigenvalue 1.927) contributes 26.85%. Given their dominant cumulative contribution and the minimal impact of later components, these eight PCs are selected as the evaluation criteria.
Based on the eight selected PCs, the contribution of variables to the PCs is ranked from highest to lowest according to PC1, and the statistical results are shown in
Table 3. To facilitate the selection of high-contribution variables, plotting all data yields a combined histogram of all 33 variables across the eight PCs.
Figure 5 shows the aggregated histogram for these components.
To more accurately assess the importance of each original variable in explaining the overall data variation, we adopt a weighted contribution calculation method. This method considers the proportion of each PC in explaining the overall variance, thereby assigning appropriate weights to the variable contributions within different PCs.
Let $w_k$ denote the variance contribution ratio (i.e., the proportion of its eigenvalue to the total of all eigenvalues) of the $k$-th principal component, and let $c_{jk}$ represent the original contribution of variable $j$ in the $k$-th PC.
The number of PCs is selected through their cumulative contribution rate [47], which can effectively demonstrate the importance of their comprehensive influence in the measurements. By summing the weighted contributions of each variable across the selected eight PCs, the cumulative contribution of each variable to all PCs can be calculated using the following formulas, where $m$ represents the total number of variables.
Cumulative weighted contribution of an individual variable:
$$C_j = \sum_{k=1}^{8} w_k \, c_{jk}$$
Cumulative weighted contribution of all variables:
$$C_{total} = \sum_{j=1}^{m} C_j$$
Cumulative weighted contribution share of an individual variable:
$$S_j = \frac{C_j}{C_{total}}$$
Parameter explanation for the formulas: for an original matrix with $n$ samples and $m$ indicators, we first compute the column-wise mean $\mu_j$ and standard deviation $\sigma_j$; then, we transform the raw data $x_{ij}$ into standardised data $z_{ij} = (x_{ij} - \mu_j)/\sigma_j$ to obtain the standardised matrix $Z$. Next, based on this standardised matrix, we calculate the $m \times m$ covariance matrix $R = \frac{1}{n-1} Z^{\mathsf{T}} Z$, where the element $r_{ij}$ represents the correlation coefficient between indicator $i$ and indicator $j$. Finally, we solve for the eigenvalues $\lambda$ of the covariance matrix $R$, which satisfy $\left|R - \lambda I\right| = 0$ [48].
The cumulative weighted contribution ratio of all variables is calculated and ranked across the eight PCs, as shown in
Table 4. This ranking supports the analysis of cumulative contribution rates and the subsequent dimensionality reduction. The first 11 variables account for the top 80% of the contribution to the information content variation. To optimise the analysis cost and reduce the number of calculations, these variables are selected as the input variables for the BP neural network.
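The weighted-contribution selection can be sketched as follows. The data here are synthetic stand-ins for area 101 (the real CMM measurements are not public), and measuring a variable's per-PC contribution by its absolute loading is one common convention assumed for illustration:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
# Synthetic stand-in for area 101: 434 samples of 33 correlated
# measurement-point variables driven by a few latent factors.
latent = rng.normal(size=(434, 5))
X = latent @ rng.normal(size=(5, 33)) + 0.1 * rng.normal(size=(434, 33))
X = (X - X.mean(axis=0)) / X.std(axis=0)        # z-score standardisation

pca = PCA(n_components=8).fit(X)
w = pca.explained_variance_ratio_               # variance share per PC
C = np.abs(pca.components_)                     # |loading| of var j in PC k

# Cumulative weighted contribution of each variable across the 8 PCs,
# then normalised to shares of the total (C_j and S_j in the text).
contrib = (w[:, None] * C).sum(axis=0)
share = contrib / contrib.sum()

# Keep the smallest variable set covering 80% of the total contribution.
order = np.argsort(share)[::-1]
cum = np.cumsum(share[order])
selected = order[: int(np.searchsorted(cum, 0.80)) + 1]
print(len(selected), "variables retained as BP-network inputs")
```

Weighting each PC's loadings by its explained-variance ratio prevents a variable that dominates a minor PC from outranking one that contributes broadly to the dominant PCs.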
4.1.2. Principal Component Evaluation and Selection
For 101, the 11 selected variables (for ease of understanding, hereinafter referred to as "measurement points") are distributed on the digital model as shown in
Figure 6.
These points cluster primarily on the inclined support of the rear floor assembly and the flange edge area on the side of the rear floor. The analysis shows that these regions involve non-focused flange edges. During production, conventional Reference Point System (RPS) areas and critical surfaces are secured in position using pins and fixtures. However, the flange edges and styling surfaces have no dedicated clamping points and instead rely on the inherent stiffness of the stamped part to maintain their shape. Crucially, the inclined support and flange edge area form the main welding joint between the rear panel and side inner panel, ensuring final connection dimensions.
Similarly, PCA is performed on the four major areas (051, 039, 709, and 701) in the same way, yielding 32 selected measurement points for 701, 23 for 039, 21 for 051, and 9 for 709. The 701, 039, and 051 regions possess a greater number of measurement points, as these areas are monitored comprehensively, encompassing both the inner and outer side panels along with the final vehicle dimensions; the corresponding impact area is therefore larger. Consequently, this paper uses the filtered data from these high-value measurement points to develop the neural network model.
4.2. Configuration of the BP Neural Network Algorithm
The BP neural network is a multi-layer feedforward network trained via the error back-propagation algorithm. The core of its learning process involves using gradient descent to iteratively adjust the network's weights and biases, thereby minimising a predefined loss function (e.g., MSE) between the actual and desired outputs. The architectural configuration and learning dynamics of the network are governed by a set of hyperparameters, including the number of hidden layers, the number of neurons per layer, the choice of activation functions, and the learning rate. These hyperparameters are typically determined empirically before training commences. The BP neural network can effectively model the complex geometric constraints and cumulative tolerance relationships in a dimensional chain, automatically adjusting its weights and thresholds according to the observed errors. This adaptability allows the model to continuously iterate and optimise, adapting to new dimensional chain datasets. For example, Liu et al. [
49] used a BP neural network to perform accurate dimensional prediction and precision verification of the weld seam dimensional chain model.
This paper investigates two neural network modelling strategies—single neural network and sequential neural network—considering the hierarchical nature of measurement points within the part structure. To identify the most suitable neural network for the automobile company, both strategies will be employed to construct the models separately, and the ultimate method will be determined based on the final validation results. From the results of PCA in
Section 4.1.2, a cumulative total of 41,664 data points is obtained from 96 measurement points (the data volume of each measurement point is 434). This model will allocate 70% of the data to the training set, comprising 29,184 measurement data points, 20% to the validation set, totalling 8352 measurement data points, and retain 10% for model testing, amounting to 4128 measurement data points. The BP neural network model in this study is implemented using the MLPRegressor class from the scikit-learn machine learning library (version 1.3.0) in Python.
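The 70/20/10 split and the MLPRegressor configuration can be sketched as follows. The data are randomly generated stand-ins with the study's shapes (434 instances, 64 input and 32 output measurement points), so the reported score is illustrative only:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(7)
# Hypothetical stand-in shaped like the study's data: 434 assembly
# instances, 64 upstream measurement points in, 32 final points out.
X = rng.normal(size=(434, 64))
Y = X @ rng.normal(scale=0.2, size=(64, 32)) \
    + 0.05 * rng.normal(size=(434, 32))

# 70/20/10 split: carve off the 10% test set first, then split the
# remaining 90% so that 2/9 of it (i.e. 20% overall) is validation.
X_tmp, X_test, Y_tmp, Y_test = train_test_split(X, Y, test_size=0.10,
                                                random_state=0)
X_train, X_val, Y_train, Y_val = train_test_split(X_tmp, Y_tmp,
                                                  test_size=2 / 9,
                                                  random_state=0)

# Two hidden layers with 65 and 35 neurons, matching the text.
model = MLPRegressor(hidden_layer_sizes=(65, 35), max_iter=2000,
                     random_state=0).fit(X_train, Y_train)
print(f"validation R^2 = {model.score(X_val, Y_val):.3f}")
```

Note that `MLPRegressor` handles the multi-output target matrix directly, so all 32 final-area measurement points are predicted by a single network, as in the single-network strategy described above.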
4.2.1. Data Matrix Construction and Rationale
The input data matrix X and output matrix Y for the BP neural network are constructed as in
Table 5. Each row in the dataset represents a unique assembly instance (e.g., a specific vehicle body at a specific measurement time). The columns represent the dimensional variables of different areas.
This design is dictated by the physical assembly sequence and deviation propagation path shown in
Figure 2. Components 101, 051, 039, and 709 are precursors and sub-assemblies that are joined together to form the final 701 Skeleton Assembly. Their individual dimensional states collectively determine the final dimensional state of 701. Modelling the relationship $Y_{701} = f(X_{101}, X_{051}, X_{039}, X_{709})$ directly captures this cumulative effect of upstream variation on the final assembly quality, enabling predictive rather than reactive control.
The matching of data rows is achieved by aligning timestamps and product identifiers. For each completed 701 assembly sample, the measurement system traces back and associates the most recent pre-assembly measurement records of all its upstream areas (101, 051, 039, 709) based on the production timestamp or batch number. Despite differences in measurement frequencies across these components, this chronological backtracking constructs input variables that reflect the pre-assembly state corresponding to each 701 sample. During modelling, each individual feature column is one-dimensional: the X, Y, and Z values of every measurement point (e.g., NLASL1201_V_AA) are incorporated as independent feature columns in the input matrix, ensuring that the model learns the causal relationship from the complete spatial deviations of upstream parts to the final assembly quality.
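A minimal sketch of this chronological backtracking can be written with pandas `merge_asof`; the column names, dates, and values below are hypothetical:

```python
import pandas as pd

# Hypothetical measurement logs: the final assembly area (701) is
# measured daily, while an upstream area (e.g. 051) is measured less
# frequently.
final = pd.DataFrame({
    "ts": pd.to_datetime(["2023-01-02", "2023-01-03", "2023-01-04"]),
    "y_701": [0.12, 0.08, 0.15],
})
upstream = pd.DataFrame({
    "ts": pd.to_datetime(["2023-01-01", "2023-01-03"]),
    "x_051": [0.30, 0.25],
})

# For every 701 sample, attach the most recent earlier upstream record
# (chronological backtracking); both frames must be sorted by "ts".
matched = pd.merge_asof(final, upstream, on="ts", direction="backward")
print(matched["x_051"].tolist())  # → [0.3, 0.25, 0.25]
```

The `direction="backward"` argument enforces that only pre-assembly measurements are attached, so no information from after the 701 measurement leaks into its input row.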
4.2.2. Single Neural Network Model
This neural network models all measurement points directly. The measurement points of 101, 051, 039, and 709 are used as X factors, amounting to a total of 64 groups with 27,776 measurement data points. The data of 701, totalling 32 groups with 13,888 measurement data points, are used as Y factors.
Two hidden layers are chosen to balance dimensional chain complexity against the overfitting risk of excessive layers. The number of neurons in the hidden layers is determined by the following empirical Equation (12) [
50]:
N_h = N_s / (α (N_i + N_o))    (12)
where N_i denotes the quantity of neurons in the input layer; N_o signifies the quantity of neurons in the output layer; N_s represents the number of samples in the training set; N_h indicates the computed number of neurons; and α is a variable with arbitrary values that can be self-assigned, often within the range of 2 to 10.
Based on this empirical formula, the maximum and minimum neuron counts are calculated separately, giving a fundamental tuning range of 30 to 152 for the hidden layers. In this study, a total of 100 neurons is chosen as the starting point for network configuration, with more allocated to the first hidden layer (65) than to the second (35).
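Assuming the empirical rule of Equation (12) takes the common form N_h = N_s / (α (N_i + N_o)) with α self-assigned in [2, 10], the quoted 30-to-152 range can be verified numerically:

```python
# Sanity check of the empirical neuron-count bounds, assuming the rule
# N_h = N_s / (alpha * (N_i + N_o)) with alpha between 2 and 10.
N_s = 29184          # training samples
N_i, N_o = 64, 32    # input / output neurons

n_min = N_s // (10 * (N_i + N_o))  # alpha = 10 -> fewest neurons
n_max = N_s // (2 * (N_i + N_o))   # alpha = 2  -> most neurons
print(n_min, n_max)  # -> 30 152
```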
Employing the two-hidden-layer architecture is based on a balance between model capacity and the risk of overfitting. The dimensional chain problem, while nonlinear, does not require excessively deep feature transformations. A single hidden layer may lack the expressive power to capture complex interactions among the numerous factors in the dimensional chain model, whereas three or more layers could easily overfit due to the high dimensionality of the input and the limited number of training samples. Adding more layers would also increase computational cost without a guaranteed improvement in performance.
The initial neuron counts (65 and 35) are derived from the empirical rule (Equation (12)) as a starting point. Recognising the general nature of this rule, we conduct a systematic experimental evaluation to justify our final choice. We test multiple network configurations around the initial estimate, including but not limited to [50/25], [80/40], and [100/50]. The performance of these architectures is compared on the validation set using the primary metrics (R² and RMSE).
The [65/35] configuration consistently achieves an optimal balance: it delivers high accuracy while maintaining lower validation error than smaller networks like [50/25], and exhibits less overfitting (a smaller generalisation gap between training and validation sets) than larger networks like [100/50]. Therefore, the [65/35] architecture is selected as the optimal configuration for the single neural network model. The training of the sequential neural network in the following text follows the same procedure as in this section.
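The architecture sweep described above can be sketched as a simple loop over candidate hidden-layer pairs; the training and validation arrays below are random placeholders (the real matrices come from the measurement system), so the architecture selected on this toy data carries no meaning:

```python
# Sketch of the hidden-layer architecture sweep: each candidate is trained
# on the same split and compared on validation RMSE. Data are placeholders.
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(300, 64)), rng.normal(size=(300, 32))
X_val, y_val = rng.normal(size=(80, 64)), rng.normal(size=(80, 32))

candidates = [(50, 25), (65, 35), (80, 40), (100, 50)]
val_rmse = {}
for sizes in candidates:
    net = MLPRegressor(hidden_layer_sizes=sizes, max_iter=50, random_state=0)
    net.fit(X_train, y_train)
    val_rmse[sizes] = float(
        np.sqrt(mean_squared_error(y_val, net.predict(X_val))))

best = min(val_rmse, key=val_rmse.get)  # architecture with lowest val RMSE
print(best)
```

In practice the selection would additionally compare training-versus-validation gaps (the overfitting check mentioned above), not validation error alone.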
The Gaussian activation function is employed in the hidden layers because the measurement data are approximately normally distributed. The learning rate is typically chosen in the range of 0 to 1, and the traditional default value of 0.1 is adopted. Given the localised size changes in the dimensional chain, weight decay is adopted as the training penalty. The number of iterations is set to five for each model to keep training time manageable. The finalised BP network hyperparameters are summarised in
Table 6.
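A hedged sketch of the hyperparameter setup follows. Note that scikit-learn's MLPRegressor offers only the identity, logistic, tanh, and relu activations, so tanh is used here as a stand-in for the Gaussian activation described in the text, and the weight-decay penalty maps to the `alpha` parameter; the toy fit exists purely to show the configured network runs end to end:

```python
# Hedged sketch of the BP network configuration (tanh is an assumed
# stand-in: MLPRegressor has no Gaussian activation; alpha = weight decay).
import numpy as np
from sklearn.neural_network import MLPRegressor

model = MLPRegressor(
    hidden_layer_sizes=(65, 35),  # two hidden layers, 65 + 35 neurons
    activation="tanh",            # stand-in for the Gaussian activation
    learning_rate_init=0.1,       # default learning rate cited in the text
    alpha=1e-4,                   # L2 weight decay (illustrative strength)
    max_iter=5,                   # iteration cap stated in the text
    random_state=42,
)

# Toy fit on placeholder data to demonstrate the configuration runs.
rng = np.random.default_rng(0)
X_demo, y_demo = rng.normal(size=(64, 64)), rng.normal(size=(64, 32))
model.fit(X_demo, y_demo)
print(model.predict(X_demo).shape)
```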
4.2.3. Sequential Neural Network Model
Sub-models are developed sequentially following the assembly hierarchy: first from 101 to 709, then integrating 709, 039, and 051 to reach the 701 body assembly. Sequential building requires the output of the first network (the predicted 709 response) to feed into the second. Thus, the models for 101 and 709 must be designed first. Data selection (training, validation, test) follows the previous methodology, differing only in the new network parameters.
The 101 has 11 measurement points; 709 has 9. Neuron/layer counts for the first network are calculated using the established formula and principles:
Based on comprehensive principles, the initial model uses 30 hidden neurons (20 in the first layer, 10 in the second) with other parameters unchanged. Using the 709 prediction values alongside 051 and 039 data (totalling 53 input components) as inputs, the sequential neural network calculates the 701 output response (32 output factors) with two hidden layers totalling 100 neurons (65 in the first, 35 in the second).
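The chaining of the two networks can be sketched as below. Shapes follow the counts in the text (11 points for 101, 9 for 709, 53 stage-two inputs, 32 outputs for 701); the 44-column width of the 051/039 block is inferred as 53 − 9, and all data are random placeholders:

```python
# Minimal sketch of the sequential scheme: net1 maps area 101 to area 709,
# and its predictions are concatenated with the 051/039 data to feed net2,
# which estimates the 701 response. Data are random placeholders.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_101 = rng.normal(size=(200, 11))        # 11 measurement points of 101
y_709 = rng.normal(size=(200, 9))         # 9 measurement points of 709
X_051_039 = rng.normal(size=(200, 44))    # inferred: 53 - 9 = 44 columns
y_701 = rng.normal(size=(200, 32))        # 32 output factors of 701

net1 = MLPRegressor(hidden_layer_sizes=(20, 10), max_iter=200, random_state=0)
net1.fit(X_101, y_709)

# Chain: predicted 709 values replace measured ones as inputs to net2.
X_stage2 = np.hstack([net1.predict(X_101), X_051_039])   # shape (200, 53)
net2 = MLPRegressor(hidden_layer_sizes=(65, 35), max_iter=200, random_state=0)
net2.fit(X_stage2, y_701)
print(net2.predict(X_stage2).shape)
```

Because net2 is trained on net1's predictions rather than measured 709 values, any error in net1 propagates into the second stage, which is one plausible contributor to the overfitting reported for the sequential model below.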
5. Results and Discussion
5.1. Performance of Different Models
5.1.1. Model Performance Evaluation Metrics
In this study, we selected several indicators to quantitatively analyse the performance of various models, as shown in
Table 7.
Here, y_i is the true value; ŷ_i is the predicted value; ȳ is the mean of the observed series; and n indicates the size of the sample collection.
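With these definitions, the Table 7 metrics can be computed directly; a minimal sketch with illustrative arrays:

```python
# Computing the evaluation metrics from Table 7 on illustrative data.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([0.9, 1.1, 0.8, 1.0])
y_pred = np.array([0.95, 1.05, 0.85, 1.00])

sse = float(np.sum((y_true - y_pred) ** 2))        # Sum of Squares due to Error
mae = mean_absolute_error(y_true, y_pred)          # Mean Absolute Error
rmse = float(np.sqrt(mean_squared_error(y_true, y_pred)))  # Root Mean Sq. Error
r2 = r2_score(y_true, y_pred)                      # coefficient of determination

print(round(sse, 4), round(mae, 4), round(rmse, 4), round(r2, 4))
```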
5.1.2. Comparison and Optimisation of Neural Network Models
After running the neural network models on the training set, the predicted values and model formulas of 701 measurement points can be obtained. The model is then adjusted using the validation set, and its final performance is evaluated on the test set. By comparing the performance metrics across the training, validation, and test sets, the model’s goodness of fit and generalisation capability can be comprehensively assessed.
Table 8 presents the outcomes of NLSTS1235_O_BA_701, demonstrating the comprehensive operation of the single neural network, which will serve as a case study for the evaluative methodology in this paper.
The 70%–20%–10% training–validation–test split provides robust model evaluation. Cross-validation confirms its superior performance: high R² values across all sets indicate an excellent fit, while low prediction errors (Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Sum of Squares due to Error (SSE)) demonstrate high accuracy.
The comprehensive analysis of the aforementioned results indicates that the neural network model exhibits superior performance across all evaluation metrics of NLSTS1235_O_BA_701. The model demonstrates a strong fit on the training data while exhibiting comparable performance on the validation and test sets, indicating robust generalisation capabilities and effective prediction of new data. This model is expected to be dependable and efficient in actual applications. All measurement data points are evaluated using the same procedure, with the results illustrated in
Figure 7.
Figure 7 shows that most points exhibit high model usability; the exception is NLSTS1308_R_BA_Y_701, whose SSE declines across the three datasets. Overall, this suggests that the model possesses satisfactory predictive performance on the test set, negating the need for further model optimisation.
The evaluation results of the sequential neural network are shown in
Figure 8. It is found that there are significant differences in the three datasets at many points.
Analysis of the randomly selected NLSTS1274_O_BA_701 reveals significant dataset disparities (R² values of 0.8731, 0.7489, and 0.6903 across the training, validation, and test sets), indicating overfitting. Neuron augmentation is prioritised over adding layers to enhance the fit without excessive training time, and the hidden-layer neuron counts are increased accordingly in the final configuration.
Figure 9 shows that the augmented model (light lines) significantly outperforms the initial model (dark lines) across all metrics.
After the second model training, there is, again, a lack of model explanation at NLSTS1308_R_BA_Y_701 (R²: 0.8731, 0.7489, and 0.6903; errors of 0.1332, 0.1853, and 0.2041 across the three sets). All other points achieve satisfactory R² and error values. However, neuron augmentation triples the training time, increasing computational demands. Final model selection confirms that the single neural network outperforms the sequential approach across validation metrics under equivalent resource constraints, as shown in
Figure 10.
At the same time, the NLSTS1308_R_BA_Y_701 point, with the worst status in both models, is extracted to compare the difference between the two evaluations of this data point, which can be observed in
Figure 11.
In
Figure 11, the light pillars (single neural network) outperform the dark pillars (sequential neural network) across all metrics, with higher R² and lower MAE/SSE/RMSE, confirming the single network's superior fitting capability.
In order to visualise the relative merits of the two models, the evaluation data of the sequential neural network for NLSTS1308_R_BA_Y_701 are subtracted from the single neural network's evaluation data and illustrated in a histogram, as shown in
Figure 12.
A positive R² difference signifies that the single neural network is superior, while the larger negative differences in the remaining three evaluation metrics likewise indicate its superior performance. Therefore, the single neural network is ultimately selected for subsequent application.
5.1.3. Practical Verification of the Single Neural Network Model
To validate the single neural network’s generalisability, a new production dataset is employed. This validation set is constructed from one measurement per day over five consecutive days, with model performance evaluated with respect to the MAE, MSE, and RMSE metrics.
Figure 13 demonstrates strong predictive accuracy across all measurement points, with MAE, MSE, and RMSE values all below 0.1. In particular, MAE values are as low as 0.003573 at NLSTS1307_H_BA_701 and 0.006894 at PLSTS1108_R_BA_701, a level of accuracy further underscored by the exceptional precision at NLSTS1264_U_BA_701 (MSE = 0.000000011). Validation using new production data confirms the model’s robustness in handling dimensional variations during routine operations, affirming its practical generalisability.
5.2. Feature Traceability: Pinpointing with Permutation Importance
To ensure transparency and interpretability of the predictive model and trace the sources of dimensional deviations, we employ Permutation Importance (PI). Among various feature importance evaluation methods (e.g., SHAP values, Gini Importance, and Leave-One-Covariate-Out), PI is selected for its model-agnostic nature, interpretability, and its unique ability to disentangle a feature’s independent influence from its interaction effects with other features. This is critical for root cause analysis in complex assemblies.
For a trained model, the importance of a feature is calculated by randomly shuffling the values of that feature in the validation dataset and measuring the resulting increase in the model’s prediction error (MSE). This process is repeated multiple times to obtain a stable estimate. The reported importance score is the average increase in MSE across all iterations. In this study, the permutation importance function from the scikit-learn library is used, with the number of permutations set to K = 50, and we calculate two metrics:
Main Effects: The average increase in MSE when only the feature in question is permuted. This shows the feature’s direct, independent contribution to the model’s prediction accuracy.
Total Effects: The average increase in MSE when the feature in question is permuted together with all other features simultaneously. This captures the feature’s total contribution, including both its main effect and all its interactions with other features.
A Total Effect significantly larger than its corresponding Main Effect indicates that the feature exerts substantial influence through interactions with other process variables. Conversely, a feature with a high Main Effect is a dominant independent driver of quality deviation. The values presented in
Table 9 represent the mean contribution percentages derived from this procedure, providing a ranked list of critical deviation sources for targeted quality intervention.
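The PI procedure described above can be sketched with scikit-learn's built-in function (K = 50 permutations). The data and model here are placeholders; note that scikit-learn computes only the single-feature (Main Effect) importances, so the joint-permutation Total Effects would need an additional custom loop not shown here:

```python
# Sketch of permutation importance with n_repeats = K = 50. The target
# depends strongly on feature 0, so PI should rank it as dominant.
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=300)

model = MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000,
                     random_state=0).fit(X, y)

# With neg-MSE scoring, importances_mean[i] is the average MSE increase
# when only feature i is shuffled -- the "Main Effect" of the text.
result = permutation_importance(model, X, y, n_repeats=50,
                                scoring="neg_mean_squared_error",
                                random_state=0)

ranking = np.argsort(result.importances_mean)[::-1]
print(ranking[0])  # the dominant feature should be feature 0
```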
Table 9 reveals NLQRA1201_L_AA_Y_709, NLVDS1204_L_BA_Y_709, and NLQRA1202_L_AA_Y_709 as dominant variables with high main and total effects on predictions, marking them as critical for error control and optimisation. Notably, some variables exhibit minimal main effects but substantial total effects, indicating significant interaction-driven indirect influence. This demonstrates that model outcomes depend on both individual variables and their synergistic relationships, necessitating explicit treatment of interactions in the modelling process.
6. Empirical Analysis
This section applies the neural network prediction model to solve the sudden overshoot problem of the body-in-white dimensions.
During the manufacturing of a specific vehicle model, the measurement data at point NLSTD1311_L_BA_Y in the luggage compartment area of 051 exhibited an abnormal fluctuation. The recorded value surged to 1.58 mm, a significant deviation from the conventional mean of 0.89 mm. The traditional single-piece optimisation was deemed unsuitable due to its high time and cost requirements, compounded by a lack of historical data on how adjustments in this area would impact the rear cover matching. To address this, the PCA-BP prediction framework was employed, which integrated data from the side panel and floor assemblies. The model predicted that only one measurement point, NLSTS1249_O_BA_701, would exceed the tolerance (1.053 mm > +1 mm threshold), as detailed in
Table 10.
Upon thoroughly comparing the measurement report, it is revealed that the area is merely a process-monitoring point of the rear cover matching area, which has a long-term propensity to sit on the high side. Since the fluctuation range is small and within the controllable range of the actual loading, the region is judged to be risk-free, and no special intervention is made for the out-of-tolerance part. Following repeated routine measurements, no apparent abnormalities are detected in the report of 701, and the actual measurement result at this point is 1.01 mm, which closely aligns with the model prediction.
Based on the above analysis, PI validation is further conducted. To visually highlight the primary sources of assembly variation,
Figure 14 presents a Pareto diagram of the contribution share based on the Total Effects.
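The Pareto ranking behind such a diagram is simply a descending sort of contribution shares with a cumulative sum. The percentages below are illustrative stand-ins, chosen only so the three key points sum to the 38% quoted in the discussion; they are not the paper's actual values:

```python
# Hedged sketch of the Pareto computation: sort total-effect contribution
# shares descending and accumulate them. Values are illustrative only.
import numpy as np

names = np.array(["NLQRA1201_L_AA_Y_709", "NLVDS1204_L_BA_Y_709",
                  "NLQRA1202_L_AA_Y_709", "NLSTD1311_L_BA_Y"])
share = np.array([16.0, 12.0, 10.0, 5.0])   # placeholder percentages

order = np.argsort(share)[::-1]             # descending: Pareto order
cumulative = np.cumsum(share[order])
for n, s, c in zip(names[order], share[order], cumulative):
    print(f"{n:22s} {s:4.1f}% (cum {c:5.1f}%)")
```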
As shown in
Figure 14, the contribution analysis clearly indicates that the anomalous point (NLSTD1311_L_BA_Y) is only a secondary factor in the overall dimensional variation. The primary sources of variation—key measurement points such as NLQRA1201_L_AA_Y_709, NLVDS1204_L_BA_Y_709, and NLQRA1202_L_AA_Y_709, which collectively account for 38% of the total effect—have already been identified during the model-building phase. In practice, these points are located in well-controlled process areas where fixture positioning, welding sequence, and part conformity are rigorously monitored. As a result, their production variability remains very low (standard deviation < 0.05), and they rarely drift out of tolerance under normal conditions. This explains why, despite the model attributing strong influence to them, the actual assembly fluctuation in this case is minimal, and the corresponding 701 measurement results consistently meet specifications with reduced variation.
The key insight from this case is the shift in quality strategy it enables. By distinguishing primary variation drivers from secondary anomalies through the PCA-BP model, we can implement differentiated control: stringent preventive monitoring is maintained for the few critical points, while non-critical deviations (like the one observed) do not trigger immediate, costly interventions. This approach directly reduces the rework rate compared to the period before model implementation.
From a cost perspective, this strategy yields savings in two major ways. First, avoiding unnecessary rework on non-critical anomalies saves direct labour, material, and downtime costs. Second, and more significantly, the preventive focus on key points drastically reduces the occurrence of major defects, thereby minimising high-cost rework, line stoppages, and delays. These benefits are intrinsically linked to the model’s ability to isolate major variation sources from minor noise.
Furthermore, the neural-network-based method itself creates substantial efficiency gains. Without it, resolving such an anomaly would require immediate 701 skeleton measurements (taking three hours per assembly due to lack of historical data) and 100% manual inspection of rear panels using gap rulers and flatness gauges. The predictive model eliminates these resource-intensive, ad hoc measurement requirements, significantly reducing quality control costs while sustaining high operational efficiency.
It is worth emphasising that, to enhance the practical convenience and efficiency of the model proposed, it is essential to establish a dimensional quality control digital platform. This platform should be equipped with core functionalities such as data uploading, automated computation, results visualisation, web-based publishing, and a human–machine interface (HMI). The development and implementation of such a digital platform within the automobile company discussed in this paper has significantly bolstered the effectiveness and efficiency of its overall dimensional matching quality control. For instance, the intuitive visualisation interface coupled with intelligent early-warning mechanisms has markedly improved the timeliness and accuracy of anomaly detection, thereby reducing human misjudgement. This integrated, data-driven approach has achieved a 50% reduction in the analysis-and-resolution cycle for dimensional quality issues, demonstrating a substantial advancement towards intelligent, proactive quality management.
7. Conclusions
This study proposes an integrated PCA-BP neural network technique with permutation importance for quality prediction and traceability in body-in-white rear panel dimension chains. PCA first reduces the dimensionality of the high-dimensional data; using rear floorboard data as a case study, 8 principal components identify 11 key measurement points while retaining 91.452% of the information content. The evaluation of critical measurement points across five major regions (101, 039, 051, 709, 701) provides key reference indices for developing a neural network aligned with the rear panel dimensional chain, reducing computational burden and enhancing model accuracy.
Neural network development involves model configuration and dataset distribution. During model creation and optimisation, a two-hidden-layer structure is implemented, with the optimal number of neurons established based on data features and empirical calculations. The Gaussian activation function and weight-decay mechanisms enhance generalisation. According to the characteristics of the dimension chain, two modelling approaches (single and sequential neural networks) are comprehensively compared. Results confirm that the single neural network model delivers superior metrics and robust generalisation.
To address the neural network’s ‘black box’ issue that hinders bias source identification, this work evaluates variable importance via permutation importance to enhance interpretability and traceability. It identifies critical bias sources (e.g., NLQRA1201_L_AA_Y_709 in region 709) and elucidates how variable interactions influence quality. The proposed PCA-BP neural network model enables active intervention in dimensional fluctuation. Empirically, the model alerts to Side Panel to Rear End Mating discrepancies, with the measured result (1.01 mm) closely aligning with the predicted value (1.053 mm). It also reduces emergency measuring and manual inspection costs, while lowering rework rates. This data-driven predictive control paradigm offers significant advantages for automotive assembly quality assurance.
Compared to conventional methods, this framework substantially improves prediction accuracy, computational efficiency, and interpretability, providing both theoretical support and a practical pathway for intelligent quality control.
Despite the promising results, this study has several limitations. Future research can be carried out in the following aspects:
- (1)
Extension to different vehicle models and production lines
The datasets used for empirical analysis are sourced from a specific vehicle model and a single production line of a particular automotive manufacturer, specifically focusing on the body-in-white rear panel assembly process. Consequently, the proposed methodology and findings are, to a certain extent, dependent on specific process conditions and production environment, which means their generalisability to other vehicle models, different manufacturers, or varied tooling and production line layouts has not been sufficiently validated. Future work will involve applying the framework to other vehicle models and production lines from the same manufacturer and testing it under diverse process conditions to systematically evaluate its robustness and generalisability. Through cross-model comparisons, the method will be further refined for a wider range of automotive manufacturing scenarios.
- (2)
Systematic parameter sensitivity analysis for data preprocessing
The data preprocessing strategy employed in this study may filter out some rare anomalous signals, which could affect the model’s predictive performance under extreme conditions. Future work will conduct a systematic parameter sensitivity analysis of data preprocessing steps, such as filtering and normalisation, to quantify the impact on model performance. This will shift preprocessing from an empirically driven selection to an evidence-based optimisation, thereby enhancing the overall robustness of the framework.
- (3)
Comparison with different machine learning algorithms
This study is limited to examining the performance of single neural networks and sequential neural networks, without comparing them with other representative data-driven methods. Future work plans to conduct a multidimensional comparison with machine learning algorithms such as Random Forest and Support Vector Machines to comprehensively evaluate their respective strengths and weaknesses. Building on this, we will further explore hybrid models that integrate the advantages of different algorithms to address more complex quality prediction tasks.
- (4)
Integration with cloud manufacturing and real-time quality control
Future work will focus on real-time data interoperability and multimodal fusion across cloud manufacturing networks, investigating multi-agent collaboration mechanisms that incorporate suppliers and other stakeholders to enhance the model’s adaptability to process variations and support real-time quality control in distributed production ecosystems.