1. Introduction
With economic development and continuous technological progress, there is an increasing demand for polymer materials with new and improved properties [1,2]. The popularity of these materials stems from their low weight and relatively high resistance to external conditions [3]. Among a wide range of polymers, polyethylene (PE) holds particular importance due to its combination of low water absorption, high impact and abrasion resistance, and strong durability. A key factor behind its widespread use is its economic viability: it is inexpensive to produce and can be recycled. PE therefore aligns well with the Sustainable Development Goals, as its recyclability supports circular economy practices. Within this group, polyethylene films are widely used in construction, agriculture, and packaging applications, where mechanical performance and durability are critical.
The morphology of PE films plays a crucial role in determining their mechanical properties and, in turn, their suitability for various applications. Within the polyethylene family, low-density polyethylene (LDPE) has garnered particular attention due to its flexibility, low density, and ease of processing [4]. These properties result from its highly branched molecular structure, which hinders chain packing and yields a predominantly amorphous morphology. This morphology enables LDPE to exhibit significant elongation at break and favorable mechanical properties, albeit at the cost of reduced tensile strength compared to high-density polyethylene (HDPE). The key physical and mechanical properties of LDPE films are influenced by crystallinity, crystal size, thickness, and molecular orientation [5,6,7].
Mechanical properties, including tensile strength and elastic modulus, increase with the degree of crystallinity, and the effect is most pronounced for yield strength [5]. The molecular weight of polyethylene directly affects its strength, with higher molecular weight resulting in greater strength due to limited molecular mobility. However, crystallinity tends to decrease as molecular weight increases, leading to a trade-off that must be carefully managed to achieve desired performance characteristics [8,9]. Given the complexity of these interdependencies, systematic experimental studies combined with advanced modeling techniques are crucial for optimizing the performance of LDPE materials [3,10,11]. The correlation between structural characteristics and mechanical properties suggests that the elastic modulus and tensile strength are primarily determined by the orientation of polymer chain segments in the amorphous phase, whereas thermal stability is provided by the crystalline structure. Mechanical properties can also be modified by controlling production conditions, such as temperature gradients, which influence material crystallinity [12].
The elastic modulus of LDPE films varies widely, ranging from 96.5 to 262 MPa, with tensile strength between 4.1 and 15.9 MPa [7,11]. In other studies, the elastic modulus ranged from 113 to 230 MPa, while tensile strength ranged from 11 to 37.9 MPa [11]. These differences do not represent contradictions but reflect variations in polymer grade, film thickness, processing conditions, and testing methodologies, indicating that reported mechanical properties are highly sensitive to production and structural parameters.
Although LDPE is relatively low in strength, it remains a widely used material. The elongation at break of LDPE films is influenced by aging, blending with other polymers, and the addition of specific additives, with values varying according to these factors [
13].
Despite extensive experimental research, predictive modeling of LDPE film mechanical properties remains challenging due to the nonlinear and multivariate nature of the factors that influence them. The processes affecting the final parameters of studied films are complex, multistage, and multidimensional, often exhibiting nonlinear effects on mechanical properties. Technological parameters are usually selected based on reference data or standard formulations, followed by multiple trials to eliminate errors and achieve desired outcomes. Optimization of production costs, reduction in defects, and enhancement of mechanical and structural properties can be achieved through the application of appropriate statistical techniques.
Forecasting is a key tool for anticipating phenomena that have not yet been observed. The definitions of forecasting in the literature commonly agree that a forecast is an assessment, made in advance, of future, uncertain events [14]. Scientific forecasts must be based on well-established theories that can be empirically verified. In recent years, fields such as statistics, econometrics, and computer science have developed various tools that enable forecasting. According to the No Free Lunch theorem [15], no single model performs best across all problems, so the model must be matched to the task at hand. This limitation is particularly relevant for polymer materials, where traditional linear or empirical models often fail to capture complex interactions between structural and surface-related parameters.
Machine learning (ML) is an interdisciplinary field that combines probability theory, statistics, approximation theory, and algorithmic complexity theory. ML enables the construction of data-driven models capable of identifying nonlinear relationships without explicitly defined physical equations. It has increasingly been applied to support prediction in systems characterized by high variability and complex structure–property relationships [16,17,18,19]. Data-driven modeling strategies have been successfully adopted in civil engineering, where both classical statistical methods and advanced ML techniques are employed to predict mechanical performance, durability, and failure behavior of materials and structural components. Recent studies have demonstrated that ML-based approaches are particularly effective in capturing multivariate dependencies involving material composition, microstructural features, and macroscopic responses, often surpassing traditional empirical or semi-empirical models in predictive accuracy [18,20,21]. These findings underscore the growing role of ML as a complementary tool to physics-based modeling frameworks across engineering disciplines.
However, many existing ML-based approaches to polymer property prediction rely on limited input parameters, simplified datasets, or neglect surface topography effects, which are particularly relevant for polymer films [
22].
The purpose of this article is to present an innovative application of machine learning algorithms (MLA), including Neural Network, Gradient Boosting, and XGBoost, to predict the tensile strength of LDPE films in the transverse (TD) and machine (MD) directions based on surface roughness parameters and physical properties such as mass per unit area and film thickness. By explicitly incorporating surface roughness alongside basic physical parameters, this study addresses a gap identified in existing predictive approaches. This paper aims to demonstrate the potential of ML-based methods as supportive tools for improving material assessment and contributing to more efficient and sustainable polymer film production.
2. Materials and Methods
2.1. Building Films
The analysis focused on LDPE building films of varying thicknesses, produced using the extrusion blow molding method. Two types of films were examined: vapor-proof films (VFY) and construction films (IFB). Microscopic images of the films are shown in
Figure 1. Due to the nature of recycled polymer streams, detailed information on the original product composition and full processing history of the recycled LDPE was not available. This reflects realistic industrial recycling conditions and constitutes an inherent characteristic of recycled materials.
2.2. Mass per Unit Area and Material Thickness
The average mass per unit area was determined through three measurements, whereas the average thickness was determined from 60 measurements of the tested construction films. The obtained thickness values were then compared with the nominal thickness declared by the manufacturer. Both the mass per unit area and thickness measurements were conducted in accordance with the PN-EN 1849-2 standard [
23].
Test specimens with a known area were weighed with an accuracy of 0.01 g to determine the mass per unit area. The specimens were square-shaped, with an area of 10,000 ± 100 mm², and were cut at least 100 mm away from the film edges. Three samples were taken from each film at intervals of approximately 500 mm. Before weighing, the specimens were conditioned for about 20 h at a temperature of 23 ± 2 °C and a relative humidity of 50 ± 5%. The average mass per unit area was then calculated with a precision of 0.1 g/m².
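The calculation described above can be sketched as follows. This is a minimal illustration, assuming the nominal specimen area of 10,000 mm²; the function name and the specimen masses are hypothetical and are not measured data from this study.

```python
def mean_mass_per_unit_area(masses_g, area_mm2=10_000):
    """Average mass per unit area in g/m2, reported to 0.1 g/m2.

    masses_g : specimen masses in grams (weighed to 0.01 g)
    area_mm2 : specimen area in mm2 (nominal value assumed here)
    """
    area_m2 = area_mm2 / 1_000_000  # 1 m2 = 1,000,000 mm2
    per_specimen = [mass / area_m2 for mass in masses_g]
    return round(sum(per_specimen) / len(per_specimen), 1)


# Hypothetical masses of three specimens taken from one film.
example_masses_g = [1.84, 1.86, 1.85]
print(mean_mass_per_unit_area(example_masses_g), "g/m2")
```

Because the specimen area is 0.01 m², a weighing accuracy of 0.01 g corresponds to 1 g/m² per specimen; the 0.1 g/m² figure refers to how the averaged result is reported.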
Thickness measurements were performed using a mechanical micrometer with an accuracy of 0.01 mm. These measurements were conducted on samples prepared for strength testing, with six readings taken from each sample.
2.3. Strength Tests
The tensile properties of the tested building films were evaluated in accordance with the PN-EN 12311-2 standard [24] using an Instron 5966 testing machine equipped with mechanical grips and operated with Bluehill 2 software.
Two sets of samples were prepared for each material for the strength tests: five samples for the machine direction (MD) and five for the transverse direction (TD). Each sample measured 50 mm by 200 mm and was randomly cut from the tested material using a template, ensuring a minimum distance of 100 mm from the material edge. The longer dimension of each sample aligned with the direction being tested.
Before testing, the samples were conditioned for 24 h at a temperature of 23 ± 2 °C and a relative humidity of 50 ± 5%. The static tensile tests were carried out under the same temperature conditions, with the machine grips moving apart at a constant speed of 100 ± 10 mm/min.
It should be noted that the mechanical properties of polyethylene films are strongly influenced by molecular weight and molecular weight distribution, which were not available for the analyzed materials in this study. Consequently, these parameters were not included as input variables in the machine learning models. The absence of molecular-level descriptors may limit the predictive accuracy and generalizability of the developed models when applied to LDPE films with significantly different polymer grades or processing histories. Nevertheless, the proposed approach remains effective for comparative analysis and prediction within the investigated material set, where surface roughness parameters and physical properties indirectly reflect underlying structural differences.
2.4. Microscopic Analysis of Sample Surfaces
In construction materials and components, attention must be paid not only to dimensional tolerances and geometric precision, but also to how the surface geometry affects functional performance. The geometric surface structure (GSS) plays a crucial role in determining properties such as mechanical strength, wear resistance, resistance to contact loading, and the integrity of joints. Assessment of GSS is commonly carried out through the evaluation of two- and three-dimensional surface roughness parameters in compliance with the ISO 4287 [25] and ISO 25178 [26] standards.
Accordingly, alongside mechanical testing, a detailed investigation of the surface microstructure of the examined films was performed. The purpose of this analysis was to quantify the structural changes that occurred in the samples after failure and to establish correlations between surface roughness parameters and mechanical strength indicators. Such relationships may enable the prediction of selected performance properties of the films. Specimens of the VFY vapor barrier film and the IFB construction film were analyzed using a Keyence VHX-6000 series digital microscope equipped with a VH-Z20R/Z20T universal zoom lens, together with a Keyence VR-5000 series wide-area 3D measurement system. The acquired images facilitated the assessment of surface roughness as well as the identification and measurement of inclusions, micro-defects, and surface topography. An example of a 2D and 3D roughness measurement result is shown in
Figure 2 for illustrative purposes. The upper part of
Figure 2 shows a three-dimensional (3D) surface height map of the analyzed sample area, where the color scale represents height variations across the surface. This visualization enables the identification of local surface features such as peaks, depressions, and defect clusters. The central part of the figure presents a two-dimensional (2D) image of the same surface region with the measurement area highlighted, from which the 3D roughness parameters were calculated. The evaluated areal roughness parameters (e.g., Sa, Sq, Sz, Ssk, Sku, Sp, Sv) are listed, providing a quantitative description of surface morphology over the entire measurement area. The lower part displays the measurement line used to determine traditional 2D roughness parameters, together with the corresponding roughness profile extracted along this line. The profile is shown in a mirrored view to clearly visualize height fluctuations and surface irregularities along the selected cross-section. It should be emphasized that, as demonstrated by Kowalski et al. [
27] and confirmed in Refs. [28,29,30], surface characteristics evaluated exclusively in a planar (2D) system may be subject to significant errors, particularly when the analyzed surface exhibits anisotropic features. Linear roughness parameters often fail to capture extreme surface points, such as deep valleys or high peaks, and therefore may not adequately represent the true surface condition. For this reason, relying solely on linear roughness measurements, such as those presented in the lower part of
Figure 2, may lead to incomplete or misleading interpretations. Consequently, the discussion and further analysis in this paper are based exclusively on 3D surface roughness parameters, ensuring a more accurate description of surface morphology and its influence on mechanical performance.
In the correlation analysis between surface topography and the tensile strength of LDPE films, the parameters Sa, Sq, Sz, Ssk, and Sku were considered, whereas the parameters Sp and Sv were not directly included in the quantitative correlation analysis.
The parameters Sa and Sq describe the global level of surface roughness and are directly related to the structural uniformity of the material and the potential for stress concentration at the surface. The parameter Sz represents the total height of the surface profile, encompassing the full range of surface irregularities, including extreme defects that may serve as crack initiation sites under tensile loading.
The parameters Ssk and Sku provide information on the statistical characteristics of the surface height distribution. Skewness (Ssk) indicates whether the surface topography is dominated by peaks or valleys, while kurtosis (Sku) identifies the presence of sharp asperities and localized stress concentrators. Both parameters are particularly relevant in the context of fracture mechanics and damage initiation in thin polymer films.
In contrast, the parameters Sp and Sv describe single extreme surface points and are more sensitive to local measurement artifacts. Therefore, they have limited statistical representativeness for the entire measurement area and were used only as auxiliary descriptors of surface topography, rather than for quantitative correlation analysis with tensile strength.
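For reference, the areal height parameters discussed above can be computed from a height map as in the following sketch. It follows the standard moment-based definitions (heights taken relative to the mean plane) and assumes that levelling and filtering have already been applied by the measurement software; the function name and the small test surface are hypothetical.

```python
import math


def areal_height_parameters(height_map):
    """Compute Sa, Sq, Sz, Ssk, Sku, Sp, Sv from a 2D grid of heights.

    height_map : list of rows, each a list of heights (e.g., in micrometres).
    A perfectly flat surface (Sq = 0) is not handled in this sketch.
    """
    z = [value for row in height_map for value in row]
    n = len(z)
    mean_plane = sum(z) / n
    d = [value - mean_plane for value in z]  # deviations from the mean plane

    sa = sum(abs(v) for v in d) / n               # arithmetic mean height
    sq = math.sqrt(sum(v * v for v in d) / n)     # root mean square height
    sp = max(d)                                   # maximum peak height
    sv = abs(min(d))                              # maximum pit depth
    ssk = sum(v ** 3 for v in d) / (n * sq ** 3)  # skewness
    sku = sum(v ** 4 for v in d) / (n * sq ** 4)  # kurtosis
    return {"Sa": sa, "Sq": sq, "Sz": sp + sv, "Ssk": ssk, "Sku": sku,
            "Sp": sp, "Sv": sv}


# Hypothetical surface: flat background with one isolated peak.
surface = [[0, 0, 0],
           [0, 9, 0],
           [0, 0, 0]]
for name, value in areal_height_parameters(surface).items():
    print(f"{name} = {value:.3f}")
```

The single isolated peak yields a positive Ssk and a Sku well above 3, which illustrates why Sp and Sv, and through them Sz, respond strongly to individual extreme points.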
2.5. Selection and Rationale of Machine Learning Algorithms
The prediction of tensile strength in polymer films represents a non-linear regression problem characterized by a limited number of experimental observations and multiple correlated physical and surface parameters. Under such conditions, model selection must strike a balance between predictive accuracy, robustness against overfitting, and interpretability. Three machine learning algorithms were selected for this study: Gradient Boosting Machine (GBM), Extreme Gradient Boosting (XGBoost), and a feed-forward Neural Network (NN). These models were deliberately chosen to represent different levels of model complexity and learning mechanisms, enabling a comprehensive comparison of ensemble-based and neural approaches in the context of materials characterization [
31,
32,
33].
Gradient Boosting Machine (GBM) was selected due to its proven effectiveness in modeling non-linear relationships in structured experimental datasets. By sequentially fitting shallow decision trees to residual errors, GBM achieves high predictive accuracy while maintaining robustness in relatively small datasets. Intrinsic regularization through tree depth, learning rate, and node size makes this method particularly suitable for materials science applications where overfitting is a concern [
34,
35,
36].
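The residual-fitting mechanism described above can be illustrated with a deliberately small, from-scratch sketch using one-split trees (stumps) on a single feature. It is not the implementation used in this study; the function names, the data, and the settings (`n_trees`, `learning_rate`) are hypothetical.

```python
def fit_stump(x, residuals):
    """Return (threshold, left_mean, right_mean) minimizing squared error.

    Requires at least two distinct values in x.
    """
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def fit_gbm(x, y, n_trees=50, learning_rate=0.1):
    """Sequentially fit stumps to the residuals of the current ensemble."""
    base = sum(y) / len(y)
    prediction = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, prediction)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        # Shrink each correction by the learning rate before adding it.
        prediction = [
            pi + learning_rate * (left_mean if xi <= threshold else right_mean)
            for xi, pi in zip(x, prediction)
        ]
    return {"base": base, "learning_rate": learning_rate, "stumps": stumps}


def predict_gbm(model, xi):
    correction = sum(left if xi <= threshold else right
                     for threshold, left, right in model["stumps"])
    return model["base"] + model["learning_rate"] * correction


# Hypothetical data: film thickness (mm) versus tensile strength.
thickness = [0.15, 0.15, 0.20, 0.20, 0.30, 0.30]
strength = [8.0, 8.4, 10.1, 9.9, 14.0, 14.2]
model = fit_gbm(thickness, strength)
print([round(predict_gbm(model, t), 2) for t in (0.15, 0.20, 0.30)])
```

A smaller learning rate or fewer trees gives a smoother, more regularized fit, which is the kind of control referred to above for small datasets.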
Extreme Gradient Boosting (XGBoost) extends the GBM framework by incorporating additional regularization mechanisms and efficient computational strategies. The use of L1/L2 penalties and subsampling improves generalization performance, especially in datasets with correlated physical and surface parameters. Due to its strong performance in engineering regression tasks, XGBoost was selected as a state-of-the-art ensemble method for predicting tensile strength [
31,
37].
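For orientation, the regularized objective usually associated with XGBoost can be written as below. The notation is the conventional one and is an assumption here, not taken from this study: $l$ is the loss, $f_k$ the $k$-th tree, $T$ its number of leaves, $w_j$ its leaf weights, and $\gamma$, $\lambda$, $\alpha$ the complexity, L2, and L1 penalty coefficients.

```latex
\mathcal{L} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega\left(f_k\right),
\qquad
\Omega(f) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} w_j^{2} + \alpha \sum_{j=1}^{T} \lvert w_j \rvert
```

Larger values of $\lambda$ and $\alpha$ shrink the leaf weights, which limits the influence of any single tree when the input parameters are correlated.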
A feed-forward Neural Network with a single hidden layer was employed to represent a flexible, non-parametric modeling approach. Although neural networks can capture complex nonlinear interactions, their susceptibility to overfitting in smaller datasets requires careful control of the architecture and preprocessing. Therefore, a shallow network configuration was adopted to ensure stability and enable a fair comparison with tree-based ensemble models [
21,
32,
33,
38].
During the exploratory analysis, simpler linear models and kernel-based approaches (e.g., Support Vector Regression) were preliminarily evaluated but were excluded from the final comparison due to their inferior predictive performance and reduced robustness across validation folds.
2.6. Model Training, Validation Strategy, and Preprocessing
All models were trained and evaluated using a consistent and reproducible pipeline designed to minimize overfitting and ensure fair comparison between algorithms.
2.6.1. Dataset Splitting, Cross-Validation, and Hyperparameter Optimization
For both transverse (TD) and machine (MD) directions, the available datasets (114 and 115 samples, respectively) were randomly split into training (80%) and test (20%) subsets using stratified sampling with respect to the target variable. The test set was held out entirely from model training and hyperparameter optimization and used exclusively for final performance evaluation.
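Because tensile strength is a continuous variable, stratifying on it requires some form of binning. The source does not state how this was done; the sketch below assumes quantile bins, and the function name, the number of bins, and the seed are hypothetical choices.

```python
import random


def stratified_split(targets, test_fraction=0.2, n_bins=5, seed=42):
    """Return (train_idx, test_idx), sampling the test set within quantile bins.

    targets : list of continuous target values (e.g., tensile strength)
    """
    order = sorted(range(len(targets)), key=lambda i: targets[i])
    n = len(order)
    # Cut the sorted indices into n_bins groups of nearly equal size.
    bins = [order[k * n // n_bins:(k + 1) * n // n_bins] for k in range(n_bins)]

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for group in bins:
        group = list(group)
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test_idx.extend(group[:n_test])
        train_idx.extend(group[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical targets for a dataset of 114 samples.
targets = [5.0 + 0.1 * i for i in range(114)]
train_idx, test_idx = stratified_split(targets)
print(len(train_idx), len(test_idx))
```

Sampling within bins keeps the low, medium, and high tensile strength ranges represented in both subsets, which matters when only about a hundred observations are available.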
Hyperparameter tuning was conducted exclusively on the training data using a five-fold cross-validation (5-fold CV) approach. This approach ensured that model selection was based on cross-validated performance rather than a single train–test split, reducing the risk of optimistic bias in small datasets. The optimal hyperparameters were selected by maximizing the cross-validated coefficient of determination.
Table 1 provides a summary of the tuned hyperparameters and search spaces.
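The selection procedure described above can be sketched as a grid search scored by the mean R² over five folds. To keep the example self-contained, a one-feature k-nearest-neighbour regressor stands in for the actual models; the data, the grid, and all function names are hypothetical.

```python
import random


def r2_score(y_true, y_pred):
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training points nearest to `query`."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: abs(train_x[i] - query))[:k]
    return sum(train_y[i] for i in nearest) / k


def cross_validated_r2(x, y, k, n_folds=5, seed=0):
    """Mean R2 over n_folds folds, computed on the training data only."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        tx, ty = [x[i] for i in train], [y[i] for i in train]
        preds = [knn_predict(tx, ty, x[i], k) for i in fold]
        scores.append(r2_score([y[i] for i in fold], preds))
    return sum(scores) / n_folds


def grid_search(x, y, k_grid):
    """Return the k with the highest cross-validated R2 and all scores."""
    results = {k: cross_validated_r2(x, y, k) for k in k_grid}
    return max(results, key=results.get), results


# Hypothetical training data: thickness (mm) and noisy tensile strength.
rng = random.Random(1)
train_x = [rng.uniform(0.1, 0.3) for _ in range(90)]
train_y = [40 * xi + rng.gauss(0, 0.5) for xi in train_x]
best_k, scores = grid_search(train_x, train_y, k_grid=[1, 3, 5, 9, 15])
print(best_k, {k: round(v, 3) for k, v in scores.items()})
```

The held-out test set plays no role in this loop; it would be used only once, after the best configuration has been refitted on the full training subset.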
2.6.2. Performance Metrics
Model performance was assessed using three complementary metrics: coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE) [21,39].
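The three metrics follow their standard definitions and can be computed as in the sketch below; the function name and the example values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, and MAE for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "R2": 1 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
    }


# Hypothetical observed and predicted tensile strengths.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
print(regression_metrics(observed, predicted))
```

RMSE and MAE are expressed in the units of the target variable, whereas R² is dimensionless; RMSE penalizes large individual errors more heavily than MAE.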
3. Results
3.1. Mass per Unit Area and Thickness
The average mass per unit area and the average thickness of the tested building films are presented in Table 2. A higher mass per unit area is advantageous, as it enhances resistance to damage during installation and improves durability for long-term protection. For polyethylene (PE) films, a direct relationship exists between mass per unit area and thickness when material density remains constant; a greater mass per unit area corresponds to a thicker film.
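For a uniform film, the relationship stated above is simply mass per unit area = density × thickness. The sketch below assumes a density of 0.92 g/cm³, a typical literature value for LDPE that is not reported in this study; the function name is hypothetical.

```python
def expected_mass_per_area(thickness_mm, density_g_cm3=0.92):
    """Mass per unit area (g/m2) of a uniform film of given thickness.

    density_g_cm3 : assumed material density; 0.92 g/cm3 is a typical
                    LDPE value, not a measured one.
    """
    density_g_m3 = density_g_cm3 * 1_000_000  # 1 m3 = 1,000,000 cm3
    thickness_m = thickness_mm / 1000
    return density_g_m3 * thickness_m


for nominal_mm in (0.15, 0.20, 0.30):
    print(nominal_mm, "mm ->", round(expected_mass_per_area(nominal_mm), 1), "g/m2")
```

Comparing such an estimate with the measured mass per unit area gives a quick plausibility check on the thickness readings, bearing in mind that recycled material may deviate from the assumed density.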
The measured thicknesses and mass per unit area of the tested films (VFY 0.20, VFY 0.15, IFB 0.30, IFB 0.20, and IFB 0.15) are shown in
Table 2. While the thicknesses of the tested films are similar, they are lower than the values specified by the manufacturer.
The average thickness values for the tested films fall within the manufacturer’s specified tolerance of ±50%. However, individual film samples occasionally fall below the lower tolerance limits, as shown in
Figure 3.
3.2. Mechanical Properties
During the tensile tests, no specimen slipped from the grips or failed at a distance of less than 10 mm from the machine clamps. All samples fractured in or close to the central region, confirming the proper execution of the tests.
Figure 4a–c illustrates the tensile strength (TS), strain at tensile strength (STS), and strain at break (SB) for the two types of analyzed films (VFY and IFB) with different thicknesses and tension directions (MD and TD). The coefficient of variation for tensile strength did not exceed 19.8% for vapor-proof films and 18.7% for construction films. The results for each film are summarized in Table 3.
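The coefficient of variation quoted above is the standard deviation expressed as a percentage of the mean. The source does not state whether the sample or the population standard deviation was used; the sketch below assumes the sample form, and the five strength values are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, in percent."""
    return 100 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical tensile strengths of five specimens of one film and direction.
tensile_strength = [12.1, 13.4, 11.8, 14.9, 12.6]
print(f"CV = {coefficient_of_variation(tensile_strength):.1f}%")
```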
3.3. Roughness Measurements
Surface topography analysis was conducted to evaluate parameters related to height, roughness, and isotropy of the examined surfaces, including both valleys and peaks. The surface mapping results indicate that the VFY 0.15 film exhibits the lowest values of root mean square height (Sq), maximum height (Sz), and arithmetic mean height (Sa) (Table 4). The kurtosis of the height distribution (Sku) describes how sharply peaked that distribution is. A Sku value of 3 corresponds to a normal (Gaussian) height distribution. Examination of the data presented in Table 4 shows that, in all cases, the Sku values exceed 3, indicating a more peaked and narrow height distribution curve. Higher Sku values are associated with an increased presence of surface defects, such as deep valleys and pronounced peaks. Consequently, the IFB 0.20 film is characterized by the highest density of surface defects, which was also observable by visual inspection.
3.4. Model Training and Evaluation
3.4.1. Predictive Performance of Machine Learning Models
Machine learning models were developed to predict the tensile strength of LDPE building films in the transverse (TD) and machine (MD) directions using surface roughness parameters and physical properties such as mass per unit area and film thickness. Due to the limited dataset size (114 samples for TD and 115 for MD), particular emphasis was placed on model validation and the prevention of overfitting through stratified train–test splitting and cross-validation-based hyperparameter selection.
The results summarized in Table 5 indicate that all three algorithms were capable of capturing the relationship between the selected parameters and tensile strength, with XGBoost consistently achieving the highest predictive accuracy in both material directions. Importantly, model performance was evaluated not only using the coefficient of determination (R²), but also with absolute (MAE) and squared (RMSE) error metrics, providing a physically interpretable assessment of prediction accuracy. In the transverse direction (TD), XGBoost achieved the best overall performance, with an R² of 0.827 and the lowest prediction error (MAE = 0.589). Gradient Boosting showed a slightly lower predictive capability, while the neural network produced comparable R² values but with increased prediction uncertainty, as reflected by higher MAE and RMSE values.
The predictive accuracy improved markedly for all models in the machine direction (MD). XGBoost achieved an R² of 0.908, indicating a strong agreement between predicted and experimental tensile strength values. Gradient Boosting and the neural network also demonstrated high predictive performance in MD, with R² values exceeding 0.86, although accompanied by moderately higher prediction errors.
The optimal hyperparameter configurations selected via cross-validation for each model and material direction are summarized in
Table 6. These configurations correspond to the models whose predictive performance is reported in
Table 5.
In the TD (
Figure 5), all models exhibit a clear positive correlation between predicted and observed tensile strength values. XGBoost shows the tightest clustering of data points around the regression line, consistent with its superior quantitative performance reported in
Table 5. Gradient Boosting displays greater scatter, particularly at lower tensile strength values, suggesting reduced robustness in capturing subtle variations in this direction. The neural network achieves a reasonably good fit, although deviations from the ideal prediction line remain visible, reflecting its higher error metrics.
For the MD (Figure 6), the predictive performance of all models improves significantly. XGBoost predictions closely follow the experimental data across the full range of tensile strength values, confirming its high R² and low error values. Gradient Boosting also demonstrates improved accuracy compared to the TD, while the neural network maintains stable performance with only slightly increased dispersion at higher tensile strength levels.
The observed differences between TD and MD predictions reflect the inherent anisotropy of LDPE films, where mechanical properties are more consistently structured along the machine direction due to processing-induced molecular orientation. The superior performance of tree-based ensemble methods, particularly XGBoost, can be attributed to their ability to model non-linear interactions between surface roughness descriptors and bulk mechanical properties without imposing restrictive parametric assumptions. Although neural networks demonstrated competitive performance, their slightly higher prediction errors may be attributed to the limited dataset size, which constrains the effective learning of complex feature interactions. Nevertheless, the consistency between cross-validated training performance and independent test set results indicates that overfitting was successfully mitigated.
3.4.2. Model Error Analysis and Residual Diagnostics
Figure 7 and Figure 8 present the residual distributions for the three models (XGBoost, Gradient Boosting, and Neural Network) in the transverse direction (TD, Figure 7) and the machine direction (MD, Figure 8). Residuals are the differences between predicted and actual values and help to assess model accuracy and prediction errors. The residual distribution for XGBoost in the TD is symmetric and resembles a normal distribution, indicating balanced prediction errors. The center of the distribution is close to zero, suggesting that the model generally predicts tensile strength values well. The distribution is moderately wide, which indicates moderate error variability. Gradient Boosting has a more stretched distribution, with larger errors on the negative side. This suggests that the model underestimates certain tensile strength values in the TD. The distribution is not symmetric, indicating that model accuracy varies across data ranges. The neural network in the TD exhibits a more irregular residual distribution, suggesting greater variability in prediction errors. However, the distribution is relatively narrow, which may indicate smaller but less stable prediction errors. The center of the distribution is close to zero, but there are clear deviations on both sides.
In the machine direction (MD), XGBoost shows a residual distribution very similar to that in the TD, symmetric and centered close to zero. The model predicts tensile strength in the MD with greater precision, as evidenced by the slightly narrower residual distribution. Gradient Boosting in the MD has a more symmetrical residual distribution than in the TD, although it is still somewhat stretched on the right side. The model shows improved predictions in this direction, but some deviations remain in the right tail of the distribution. The neural network in the MD exhibits a narrower and more concentrated residual distribution than in the TD. Although the model still shows some irregularities in the error distribution, the residuals are closer to zero, suggesting a better model fit in the MD.
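The qualitative descriptions of the residual histograms (centering, width, asymmetry) can be complemented by simple summary statistics, as in the sketch below. Residuals are taken as predicted minus observed, following the definition given above; the function name and the example values are hypothetical.

```python
import math


def residual_summary(y_true, y_pred):
    """Mean, standard deviation, and skewness of residuals (predicted - observed)."""
    residuals = [p - t for t, p in zip(y_true, y_pred)]
    n = len(residuals)
    mean = sum(residuals) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in residuals) / n)
    skewness = sum((r - mean) ** 3 for r in residuals) / (n * sd ** 3)
    return {"mean": mean, "sd": sd, "skewness": skewness}


# Hypothetical example: one strongly underestimated specimen.
observed = [10.0, 10.0, 10.0, 10.0, 10.0]
predicted = [7.0, 9.5, 10.0, 10.5, 10.5]
print(residual_summary(observed, predicted))
```

A mean near zero indicates no overall bias, the standard deviation reflects the width of the distribution, and a negative skewness corresponds to a tail of underestimated values.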
Figure 9 and Figure 10 illustrate the relationship between standardized residuals and leverage for the analyzed algorithms. This relationship can reveal outliers or systematic errors and is commonly used to assess model stability and quality. For XGBoost in the transverse direction (TD), the chart shows a random distribution of points across leverage values, suggesting that the model performs well. The absence of noticeable patterns implies a lack of systematic errors and relatively good model stability. For Gradient Boosting, the chart is similar to that of XGBoost, with points evenly distributed around the axis. The absence of a discernible trend indicates that Gradient Boosting in the TD also does not exhibit significant systematic errors. However, a few points with higher leverage values are visible and may be more influential. The leverage chart for the neural network shows no clear trends in residuals, although a few outliers with higher leverage values suggest the potential presence of influential points.
In the machine direction (MD), XGBoost also shows a stable distribution of points around the regression line, with no noticeable patterns, which indicates the robustness of the model in this direction. A few points lie close to a leverage value of 0.2 and may be influential, but this is not considered problematic. Gradient Boosting in the MD shows slightly more clustered points at lower leverage values, but, as in the previous cases, there are no clear patterns of systematic errors. The chart indicates that the model is stable across most of the data. For the neural network, no clear trend is visible in the MD, as was also the case in the TD. The points are more evenly spread around the regression line, suggesting that the model performs better in the MD than in the TD. A few points are more dispersed but do not form noticeable systematic patterns.
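The source does not state how leverage was obtained for the tree-based and neural models. As a point of reference only, the classical definitions for a one-predictor linear regression are sketched below: leverage is the diagonal of the hat matrix, and each residual is standardized by its estimated standard deviation. The function name and the data are hypothetical.

```python
import math


def leverage_and_standardized_residuals(x, y):
    """Leverage and internally standardized residuals for y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard error with n - 2 degrees of freedom (two parameters).
    s = math.sqrt(sum(e * e for e in residuals) / (n - 2))
    # Hat-matrix diagonal for simple linear regression.
    leverage = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    standardized = [e / (s * math.sqrt(1 - h))
                    for e, h in zip(residuals, leverage)]
    return leverage, standardized


# Hypothetical data with one point far from the others in x.
x = [1.0, 2.0, 3.0, 4.0, 10.0]
y = [1.1, 1.9, 3.2, 3.9, 10.3]
leverage, standardized = leverage_and_standardized_residuals(x, y)
for h, r in zip(leverage, standardized):
    print(f"leverage = {h:.2f}, standardized residual = {r:+.2f}")
```

In this setting the leverages sum to the number of fitted parameters, and a common rule of thumb treats points with leverage above 2p/n (p parameters, n observations) as potentially influential.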
3.4.3. SHAP-Based Interpretation of Tensile Strength Prediction
To support model interpretation and to verify whether the machine learning models learned physically meaningful relationships, SHAP (SHapley Additive exPlanations) analysis was applied to all three algorithms for both transverse (TD) and machine (MD) directions. Two complementary SHAP visualizations were used: mean absolute SHAP values (Figure 11 and Figure 12), which quantify the overall magnitude of feature contributions, and SHAP summary (beeswarm) plots (Figure 13 and Figure 14), which additionally indicate the direction and variability of feature influences on individual samples.
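The idea behind SHAP values can be illustrated by computing exact Shapley values for a small model with three inputs, where "absent" features are replaced by a baseline value. This is a conceptual sketch, not the procedure used in this study: practical SHAP implementations rely on faster model-specific or sampling-based algorithms, and the toy model, its coefficients, and the input values below are invented for illustration.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x` relative to `baseline`.

    Averages each feature's marginal contribution over all feature orderings,
    so it is only practical for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous_output = model(current)
        for feature in order:
            current[feature] = x[feature]  # switch this feature "on"
            output = model(current)
            phi[feature] += output - previous_output
            previous_output = output
    return [value / len(orders) for value in phi]


def toy_model(features):
    # Hypothetical stand-in for a trained regressor (invented coefficients).
    thickness, ssk, sku = features
    return 40 * thickness - 0.5 * ssk + 0.2 * thickness * sku


sample = [0.30, 1.0, 5.0]    # hypothetical specimen: T (mm), Ssk, Sku
baseline = [0.20, 0.0, 3.0]  # hypothetical reference values
phi = shapley_values(toy_model, sample, baseline)
for name, value in zip(["T", "Ssk", "Sku"], phi):
    print(f"{name}: {value:+.3f}")
```

The contributions sum to the difference between the prediction for the specimen and the prediction for the baseline. Averaging their absolute values over many specimens gives a global importance ranking of the kind shown in bar plots, while the signed per-specimen values are what a beeswarm plot displays.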
In the TD (
Figure 11), both tree-based models, XGBoost and Gradient Boosting, identify film thickness (T) as the dominant parameter determining the prediction of tensile strength. This result is consistent with the fundamental relationship between the structure and properties of polymer films, where the response to stretching is proportional to the cross-sectional geometry and molecular orientation. Surface topography parameters, particularly skewness (Ssk) and kurtosis (Sku), also exhibit a significant influence, suggesting that asymmetry and distribution of surface features affect stress transfer and damage initiation in the transverse direction. Mechanical parameters related to tensile behavior in TD (e.g., STS_TD) further support the importance of anisotropy-related variables.
The Neural Network model shows a broader distribution of feature contributions in the bar plot representation, with thickness remaining influential but accompanied by several surface and mechanical parameters. Importantly, the beeswarm plot (
Figure 13) reveals clear feature-to-feature variation and non-uniform SHAP distributions, indicating that the model assigns different levels of importance to each input rather than treating all inputs uniformly. The NN nevertheless captures a more distributed representation of contributing factors, which may reflect its tendency to model multiple interacting parameters rather than emphasizing a small subset of dominant variables.
The beeswarm visualization further provides insight into the directionality of effects. For thickness, higher values generally contribute positively to tensile strength predictions, while several surface parameters show both positive and negative SHAP values depending on their magnitude, highlighting non-linear and specimen-dependent interactions that are not accessible from global importance rankings alone.
In the MD (
Figure 12), film thickness (T) again emerges as a key contributor across all models, confirming its central role in tensile strength prediction regardless of orientation. In contrast to TD, mass per unit area (M) becomes more influential, particularly in the XGBoost and Neural Network models. This is physically plausible, as M integrates thickness and density-related effects that are closely linked to processing conditions and molecular alignment in the machine direction.
Gradient Boosting places strong emphasis on thickness and selected roughness and strength-related parameters, distributing importance more evenly than XGBoost but still prioritizing physically interpretable features. The Neural Network exhibits a distributed attribution pattern similar to that observed in the TD. However, the beeswarm plot (
Figure 14) demonstrates meaningful variation in SHAP values across features and specimens, confirming that the model differentiates between parameters rather than treating them uniformly.
The beeswarm plots reveal that increases in thickness and mass per unit area predominantly contribute to higher predicted tensile strength in MD, while surface roughness parameters show more complex, bidirectional effects. This behavior suggests that surface morphology modulates tensile response indirectly, potentially through its influence on stress concentration and interlayer interactions rather than acting as a purely monotonic factor.
SHAP analysis confirms that all three models rely on physically meaningful parameters and capture anisotropy-related differences between TD and MD. Tree-based models tend to concentrate predictive power on a smaller set of dominant variables, whereas the Neural Network distributes importance across multiple interacting features. The inclusion of beeswarm plots was essential for identifying the direction and variability of feature effects and for avoiding misleading interpretations based solely on global importance measures.
4. Discussion
4.1. Effect of Thickness
The thickness measurements confirm that the tested films comply with the manufacturer’s declared thickness tolerances (MDV), but variability within individual samples can negatively affect their mechanical properties. Thinner films are more susceptible to mechanical damage, which reduces their durability under adverse environmental conditions, such as UV radiation. These findings align with previous studies by Rennert et al. [
40] and Szlachetka et al. [
28], which identified greater thickness variability in PE films produced by the blown extrusion method compared to cast films. This increased variability may be attributed to the inherent nature of the blown extrusion process, which is less precise in controlling thickness uniformity across batches. The error bars in
Figure 3 illustrate standard errors, $SE = \sigma/\sqrt{n}$ (where σ is the mean standard deviation and n is the population size), providing insight into the variability of the thickness measurements for both the machine direction (MD) and transverse direction (TD). The observed variability, while within acceptable tolerances, highlights the importance of optimizing production processes to minimize inconsistencies and improve the overall performance of LDPE films. By reducing the variability in film thickness and ensuring compliance with specified tolerances, manufacturers can enhance the mechanical properties and longevity of construction films, thereby addressing the demands for durable and sustainable materials in construction applications.
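The standard error used for such error bars is conventionally the standard deviation divided by the square root of the number of measurements. A minimal sketch follows; the use of the sample standard deviation and the thickness readings are assumptions for illustration, not values from the study.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical thickness readings in mm for one film, not study data
thickness_mm = [0.198, 0.203, 0.195, 0.207, 0.201]
se = standard_error(thickness_mm)
```

Because the standard error shrinks with the square root of n, quadrupling the number of thickness readings would halve the error bar for the same underlying scatter.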
4.2. Changes of Mechanical Properties
Despite the consistent tensile strength values, significant differences in strain at tensile strength and strain at break were observed in certain cases. As shown in
Figure 4b–c, the vapor-proof film VFY 0.20 exhibited average strain at tensile strength (STS) and strain at break (SB) in the MD of 349.44% and 479.88%, respectively, while in the transverse direction (TD), these values were 228.31% and 515.71%. On the other hand, the IFB 0.15 film showed average strain at tensile strength (STS) and strain at break (SB) in the MD of 269.17% and 287.74%, and in the TD, these values were 184.98% and 469.96%.
Similar results were obtained in [
28], where the strain at break for two types of LDPE films ranged from approximately 57% to about 671% in the MD and from around 379% to about 635% in the TD, while the strain at tensile strength ranged from 6% to 474% in the MD and from 103% to 630% in the TD. Such large differences between the strains could be caused by the low mass per unit area and/or low thickness, as confirmed by [
28] and the visible heterogeneity of the film surface microstructure [
41].
In the conducted studies, the IFB 0.20 film exhibited the lowest average tensile strength in the MD (10.05 MPa), while the IFB 0.15 film showed the highest (14.56 MPa). As illustrated in
Figure 4a, the tested construction films generally have higher tensile strength in the MD than in the TD, except for the IFB 0.30 film. However, it is worth noting that the differences in tensile strength based on stretching direction are insignificant, averaging 1.26 MPa for VFY films and 1.09 MPa for IFB films. Hatfield [
42] demonstrated that this is a characteristic feature of films produced by the blown extrusion method. The average tensile strength of the tested samples is 11.07 MPa. For comparison, the tensile strength of LDPE films with thicknesses ranging from 0.05 to 0.15 mm, measured by Rennert et al. [
40], averaged 16.7 MPa. At the same time, Dilara and Briassoulis [
11] reported a tensile strength range for LDPE films from 4.1 MPa to 15.9 MPa, depending on density (910 kg/m³ and 925 kg/m³). However, these values remain 10–20 times lower than those of polypropylene roofing membranes [
30].
The strain at tensile strength (STS) and strain at break (SB) of the VFY 0.20 film and IFB 0.30 film exhibited the highest coefficients of variation (96.3% for VFY 0.20 and 104.8% for IFB 0.30). The average coefficient of variation for STS and SB for VFY films is 75.2% and 19.2%, respectively, while for IFB films, it is 75.1% and 48.9%.
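The coefficient of variation reported here is the standard deviation expressed as a percentage of the mean. A minimal sketch under stated assumptions (sample standard deviation, hypothetical strain readings that are not taken from the study):

```python
import statistics


def coefficient_of_variation(values):
    """Coefficient of variation in percent: 100 * sample std / mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical strain-at-break readings in percent for one film and direction
strain_at_break = [120.0, 480.0, 95.0, 510.0, 300.0]
cv_percent = coefficient_of_variation(strain_at_break)
```

A value near or above 100% means the scatter is as large as the mean itself, which is the situation described for the most variable films.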
In conclusion, while the mechanical properties of the tested films meet basic standards, variability in thickness, mass per unit area, and strain behavior necessitates improved manufacturing processes to enhance film performance and reliability in construction applications [
28].
4.3. Surface Roughness Characteristics
The height distribution skewness parameter (Ssk) describes the degree of symmetry of surface roughness amplitudes with respect to the mean reference plane. Negative Ssk values are indicative of plateau-type surface features, whereas positive values correspond to surfaces dominated by sharp protrusions. As shown in
Table 2, only the vapor-proof film VFY 0.20 exhibits skewness values close to zero (−0.01 in the MD and 0.01 in the TD), suggesting a nearly symmetrical height distribution. In contrast, Ssk values for the other films are positive, indicating that the height distribution is biased below the mean plane of the surface. Since film thickness alone is not a decisive parameter in the context of tensile strength, this study considers machine learning algorithms that take into account both thickness and roughness parameters (
Table 3) as well as sample deformation.
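For reference, Ssk and Sku are the third and fourth standardized moments of the surface height distribution. The sketch below evaluates them on a discrete list of heights; the function name and the height values are hypothetical, and a real areal measurement would first be levelled and filtered before these moments are taken.

```python
def height_moments(z):
    """Return (Sq, Ssk, Sku) for a list of surface heights z.

    Sq  : root-mean-square height about the mean plane
    Ssk : mean cubed deviation divided by Sq^3 (skewness)
    Sku : mean fourth-power deviation divided by Sq^4 (kurtosis)
    """
    n = len(z)
    mean_z = sum(z) / n
    dev = [zi - mean_z for zi in z]
    sq = (sum(d ** 2 for d in dev) / n) ** 0.5
    ssk = sum(d ** 3 for d in dev) / (n * sq ** 3)
    sku = sum(d ** 4 for d in dev) / (n * sq ** 4)
    return sq, ssk, sku


# Hypothetical height profiles in micrometres, not study data
symmetric = [-1.0, 0.0, 1.0]
peaked = [0.0, 0.0, 0.0, 0.0, 4.0]  # mostly flat with one sharp protrusion

_, ssk_symmetric, sku_symmetric = height_moments(symmetric)
_, ssk_peaked, _ = height_moments(peaked)
```

The symmetric profile gives Ssk equal to zero, while the profile with an isolated peak gives a positive Ssk, matching the interpretation of positive skewness as a surface dominated by protrusions.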
4.4. Machine Learning Algorithms
The results demonstrate that machine learning algorithms can effectively predict the tensile strength of films based on various surface and physical properties. Among the algorithms tested, XGBoost provided the best overall performance in predicting tensile strength in both the transverse (TD) and machine (MD) directions. While the Gradient Boosting model performed well, its lower R² values, especially in the TD, indicated difficulties in making accurate predictions. The Neural Network, while slightly less precise than XGBoost, still performed well, demonstrating that a more generalized model approach can be effective for predicting tensile strength. Residual analysis for each algorithm revealed that XGBoost had a symmetric and relatively narrow residual distribution, indicating that it made the most accurate predictions. Gradient Boosting exhibited greater variability in its residuals, while the neural network displayed more pronounced irregularities in its error distribution. However, the influence plots showed no significant systematic errors or outliers affecting the models’ stability. Feature importance analysis further revealed that film thickness (T) was the most influential feature for all models, particularly in the MD. Machine learning algorithms (MLAs) have also been used to predict mechanical properties, such as tensile strength, impact strength, and flexural strength, based on oven residence time during polymer molding. Such an analysis was conducted for linear low-density polyethylene (LLDPE) [
43]. The algorithms utilized included Decision Trees (DT) as the baseline predictive model, ensemble methods such as Random Forest, Extremely Randomized Trees (Extra Trees), Gradient Boosting Decision Trees (GBDT), and AdaBoost, as well as the Firefly Algorithm (FA), which was applied as a hyperparameter optimizer for each model, improving their performance. In that study, the best results were achieved by the FA-ET model (Firefly-Optimized Extra Trees), with R² values of 0.9994 for tensile strength, 0.9995 for impact strength, and 0.9968 for flexural strength. The FA-GBDT and FA-BDT models also demonstrated high precision; however, FA-ET exhibited the highest agreement with the actual data [
43].
The dataset in that study comprised 25 observations, which may limit the generalizability of the results. However, the authors rightly emphasize the potential of using machine learning algorithms to optimize industrial processes, leading to time and cost savings. Furthermore, predicting mechanical properties based on limited input data provides a better understanding of the impact of process parameters on product quality. Looking ahead, it is crucial to work with larger datasets and implement more complex models with multiple input parameters. These parameters should include those related to the production (molding) process, material composition, and final material characteristics to achieve more robust and accurate predictions.
5. Conclusions
It is worth noting that the presented results are based on a limited experimental dataset and internal validation; therefore, the conclusions should be interpreted within the context of the specific materials and conditions studied.
The study examines LDPE construction and vapor-proof films, analyzing their tensile strength, thickness, and surface characteristics. The findings highlight the influence of mass per unit area on durability, the variability in tensile strength and strain at break across different film types and stretching directions, and the impact of surface defects on performance, emphasizing the need for stricter quality control, particularly for construction films. The key findings on mechanical properties are listed below:
Higher mass per unit area improves durability and protection.
Tensile strength was tested in the machine direction (MD) and the transverse direction (TD). IFB 0.15 showed the highest tensile strength (14.56 MPa in MD), while IFB 0.20 had the lowest (10.05 MPa in MD). Generally, MD showed slightly higher strength than TD, with small differences between VFY and IFB films.
Significant variations in strain at tensile strength (STS) and strain at break (SB) were observed, with thinner films exhibiting greater variability. VFY 0.20 had the highest strain at break in TD (515.71%), while IFB 0.15 showed notable variability in MD and TD strains.
Surface roughness analysis revealed that IFB 0.20 had the most defects (high Sku parameter), visible as deep depressions and high peaks. VFY 0.15 had the lowest roughness and defect counts.
Construction films (IFB) require higher stability and performance as they protect structural elements, whereas vapor barrier films (VFY) are used mainly in roof structures with less critical functions.
The machine learning results are consistent with the fundamental principles of polymer mechanics. Film thickness and mass per unit area, identified as key predictive features, directly influence load-bearing capacity and stress distribution during tensile deformation. Similarly, surface roughness parameters reflect the presence of defects and stress concentrators, which can promote localized deformation and premature failure. The observed directional differences between MD and TD are consistent with the anisotropic molecular orientation induced during film processing. These qualitative links indicate that the machine learning findings are physically meaningful rather than purely statistical.
Building upon the experimental dataset, established machine learning algorithms, namely Gradient Boosting Machine (GBM), Extreme Gradient Boosting (XGBoost), and a feed-forward Neural Network, were implemented, trained, and optimized to predict tensile strength in both TD and MD. A consistent training and validation framework incorporating stratified train–test splitting and cross-validation-based hyperparameter optimization was applied to minimize overfitting and ensure a fair comparison between models.
All three models demonstrated the ability to capture the relationships between surface descriptors, physical properties, and tensile strength. However, XGBoost consistently achieved the highest predictive performance in both directions, with R² values of 0.827 in TD and 0.908 in MD, accompanied by the lowest MAE and RMSE values. Gradient Boosting yielded slightly lower accuracy, particularly in the TD, while the Neural Network showed stable but more generalized performance, with moderately higher prediction errors. The improved predictive accuracy observed in MD for all models reflects the lower variability and stronger structural alignment inherent to the machine direction in LDPE films.
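The three metrics used for this comparison can be computed directly from measured and predicted values. The sketch below uses the standard definitions of the coefficient of determination, mean absolute error, and root-mean-square error; the tensile strength values are hypothetical and do not reproduce the reported results.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MAE and RMSE for paired measured and predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(sum(r ** 2 for r in residuals) / n)
    mean_true = sum(y_true) / n
    ss_res = sum(r ** 2 for r in residuals)            # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total variation
    r2 = 1.0 - ss_res / ss_tot
    return {"R2": r2, "MAE": mae, "RMSE": rmse}


# Hypothetical tensile strengths in MPa, not study data
measured = [10.1, 11.3, 12.8, 14.5, 10.9]
predicted = [10.4, 11.0, 12.5, 14.0, 11.4]
metrics = regression_metrics(measured, predicted)
```

MAE and RMSE are in the units of the target (MPa here), whereas R² is dimensionless, so the three together describe both the absolute size of the errors and the share of variation the model explains.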
SHAP-based interpretability analysis confirmed that the machine learning models learned physically meaningful relationships rather than purely statistical correlations. Film thickness emerged as the dominant predictor of tensile strength across all models and directions, consistent with fundamental principles of polymer mechanics. In the MD, mass per unit area gained additional importance, reflecting its combined influence on thickness, density, and processing-induced molecular orientation. Surface roughness parameters, particularly skewness and kurtosis, exhibited non-linear and bidirectional effects, suggesting their indirect role in stress concentration and damage initiation rather than a simple monotonic influence.
Overall, within the scope of the investigated dataset, XGBoost proved to be the most accurate and robust model for tensile strength prediction, effectively balancing predictive performance and interpretability. Neural Networks, while versatile and stable, exhibited reduced precision under limited data conditions, whereas Gradient Boosting showed comparatively lower robustness, especially in the transverse direction.
The developed machine learning models are applicable within the range of material properties and processing conditions represented by the studied LDPE films. Extending these models to other polyethylene grades or substantially different manufacturing conditions would require additional, representative training data. Future work should therefore focus on expanding the dataset, incorporating processing parameters, and exploring more complex multi-input models to further enhance prediction robustness and generalizability.
6. Summary
Machine learning algorithms offer an effective framework for predicting the mechanical performance of recycled LDPE films by modeling the complex, non-linear relationships between surface roughness parameters, physical characteristics such as film thickness and mass per unit area, and tensile strength in different stretching directions. In this study, established machine learning models—XGBoost, Gradient Boosting, and feed-forward Neural Networks—were implemented and systematically compared for predicting tensile strength in both the machine direction (MD) and transverse direction (TD).
All applied models were able to integrate multiple correlated input variables and capture their interdependencies; however, XGBoost consistently provided the highest predictive accuracy and robustness. Feature importance and SHAP-based interpretability analyses demonstrated that film thickness is the dominant predictor of tensile strength, while mass per unit area and selected surface roughness parameters (e.g., skewness and kurtosis) contribute in a direction-dependent and non-linear manner. These findings are consistent with established principles of polymer mechanics and processing-induced anisotropy in LDPE films.
Residual analysis further confirmed the superior performance of XGBoost, which exhibited symmetric and narrowly distributed prediction errors, indicating balanced and reliable predictions. Gradient Boosting showed greater residual dispersion, particularly in the transverse direction, while the Neural Network demonstrated stable but more generalized performance under limited data conditions.
Overall, the results highlight the ability of machine learning to enhance tensile strength prediction by integrating diverse experimental variables, identifying key structure–property relationships, and accounting for directional anisotropy. Although the findings are limited to the investigated dataset and material system, they demonstrate the strong potential of ML-based approaches as decision-support tools for quality control, material optimization, and process design in the manufacturing of recycled polymer films.