1. Introduction
The Small Punch Test (SPT) has been developed as a small sample technique for the evaluation of mechanical properties of structural materials. It has been investigated as an alternative or complementary method to classical approaches such as the Uniaxial Tensile Test (UTT) [1], with applications that extend to the estimation of the yield strength, ultimate tensile strength, creep properties, and ductile-to-brittle transition temperature (DBTT) [2,3,4]. This method has gained recognition because it requires only a minimal volume of material and can be applied to components in service, where the extraction of standard tensile samples is often not feasible. Due to these advantages, the SPT has attracted attention in power generation, aerospace, and civil engineering, where condition monitoring of critical components is essential to maintaining structural integrity and optimizing maintenance strategies.
In addition to its technical advantages, the SPT has become a valuable tool for the assessment of material degradation in steels exposed to service. Because degradation processes such as creep, fatigue, and thermal embrittlement manifest primarily via changes in mechanical properties, the SPT provides a practical way of identifying loss of strength or ductility without requiring large destructive samples. Numerous studies [2,5,6,7,8] have demonstrated its applicability for tracking degradation-related changes in yield strength, ultimate tensile strength, and DBTT, making the method an attractive candidate for integration into structural health monitoring frameworks.
Despite these benefits, direct correlation between SPT results and those obtained from conventional methods remains a challenge. The difficulty comes from the differences in stress states between the small punch configuration and the standard tensile loading. Although transformations have been proposed to convert parameters derived from SPT curves into their UTT equivalents, the resulting correlations have often been material-specific and not universally applicable [9,10,11]. Even in recent standards, normalized calculation procedures yield inconsistent accuracy across different steels, and repeated calibration remains necessary for industrial use. As a result, research has focused on complementary approaches that can reduce the need for empirical calibration.
Among such approaches, the use of machine learning, particularly neural network models, has gained attention. Early studies [12] demonstrated that artificial neural networks could approximate the inverse relationship between SPT load–displacement curves and material properties by training on data generated from finite element simulations, often based on the Gurson–Tvergaard–Needleman (GTN) damage model [13,14,15]. In these works, simulated SPT curves under varying input parameters served as training data, with neural networks used to reconstruct stress–strain behavior or estimate key mechanical properties. Studies using the Stuttgart Neural Network Simulator (SNNS) [16] and later backpropagation-based models confirmed the feasibility of this approach [17]. For boiler steels such as 10GN2MFA, 08Ch18N10T, and 14MoV6-3, reported average prediction errors were 1–3% for ultimate tensile strength and 4–8% for yield strength [17]. Although data generated from finite element (FE) simulations are invaluable for exploring mechanical responses under controlled conditions and for enriching limited experimental datasets [18,19], they may not always capture the stochastic variability present in real tests. In the present work, we intentionally relied solely on experimentally measured SPT and UTT data to ensure that the developed CNN model was validated on physically observed behavior rather than simulated approximations.
To address these shortcomings, recent research has begun to emphasize the use of experimental databases, although comprehensive studies remain scarce. The availability of paired SPT and UTT curves from the same material offers an opportunity to train neural networks directly on experimental evidence, bypassing the reliance on simulations. This shift is particularly important in the context of degradation monitoring, where service exposure introduces microstructural changes that are not easily replicated in simulations but are directly reflected in experimental data. At the same time, it provides a stringent test of the ability of neural networks to generalize under conditions of relatively limited dataset size, which remains a characteristic constraint in this domain.
Because the present work relies on a relatively small number of experimental data points, special attention was given to the prevention of overfitting and to the effective use of available data. In the broader machine learning literature, small-sample challenges are often addressed through techniques such as data augmentation, transfer learning, regularization, or architecture simplification. In this study, these limitations were mitigated by pairing all available SPT and UTT curves within the same material group to form unique training examples and applying dropout regularization throughout the CNN architecture. This strategy allowed the network to generalize effectively while preserving the physical consistency of the experimental database.
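The pairing step described above can be illustrated with a minimal sketch. It assumes one plausible reading of the strategy, namely that every SPT curve is combined with every UTT curve of the same material group (a Cartesian product). The function name, the group labels, and the placeholder curve objects are hypothetical and are not taken from the study.

```python
from itertools import product


def build_training_pairs(spt_curves, utt_curves):
    """Pair every SPT curve with every UTT curve of the same material group.

    Both arguments map a group label (hypothetical, e.g. steel plus state)
    to a list of curves. A curve can be any object, e.g. a list of points.
    """
    pairs = []
    for group, spt_list in spt_curves.items():
        utt_list = utt_curves.get(group, [])
        # Cartesian product: n SPT curves x m UTT curves -> n * m examples.
        for spt, utt in product(spt_list, utt_list):
            pairs.append((group, spt, utt))
    return pairs


# Placeholder strings stand in for measured curves (illustrative only).
spt = {"A": ["spt_a1", "spt_a2", "spt_a3"], "B": ["spt_b1"]}
utt = {"A": ["utt_a1", "utt_a2"], "B": ["utt_b1", "utt_b2"]}
pairs = build_training_pairs(spt, utt)
print(len(pairs))  # 3 * 2 + 1 * 2 = 8 training examples
```

Under this reading, a group with n SPT curves and m UTT curves yields n × m examples, which enlarges the training set without introducing synthetic curves. Curves are never paired across groups, so each example remains physically consistent.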
In the present study, an experimental approach has been adopted to investigate the potential of neural networks to predict UTT-equivalent behavior from SPT measurements. An experimental database containing paired SPT and UTT data has been prepared for three boiler steels (10H2M, 13HMF, and 15HM) in both new and service-degraded states. Based on these data, a convolutional neural network (CNN) designed for curve-to-curve prediction has been trained and evaluated. The working hypothesis is that by exploiting local curve features, CNN models are capable of reducing the systematic bias of the SPT and providing more accurate property estimations.
The evaluation has been carried out with emphasis on two key aspects. First, the predicted force–displacement curves have been transformed into stress–strain data to allow for extraction of the yield strength and the ultimate tensile strength, with these values then compared directly to the UTT reference data. Second, validation procedures have been applied to assess the generalization of the model and identify the influence of outliers on the predictive accuracy. In line with the adopted criterion, predictions have been considered successful if they provide values closer to the UTT results than to the baseline SPT measurements.
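The two evaluation steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used in the study: engineering stress and strain are computed from the initial cross-section and gauge length, the tensile strength is taken as the maximum engineering stress, and the yield strength is approximated by a 0.2% offset rule with a known elastic modulus and no interpolation between samples. The success criterion is implemented in its literal reading. All function names and numerical values are hypothetical.

```python
def engineering_curve(force_n, elongation_mm, area_mm2, gauge_mm):
    """Convert force (N) and elongation (mm) to engineering stress (MPa) and strain."""
    stress = [f / area_mm2 for f in force_n]  # N/mm^2 equals MPa
    strain = [d / gauge_mm for d in elongation_mm]
    return stress, strain


def tensile_strength(stress):
    """Ultimate tensile strength: the maximum engineering stress."""
    return max(stress)


def offset_yield_strength(stress, strain, modulus_mpa, offset=0.002):
    """Stress at the first sample lying on or below the offset line.

    The offset line has slope `modulus_mpa` and starts at strain `offset`
    (assumed 0.2% rule). Returns None if the curve never reaches the line.
    """
    for s, e in zip(stress, strain):
        if e > offset and s <= modulus_mpa * (e - offset):
            return s
    return None


def closer_to_utt(predicted, utt_ref, spt_baseline):
    """Literal reading of the criterion: nearer to the UTT value than to the SPT value."""
    return abs(predicted - utt_ref) < abs(predicted - spt_baseline)


# Synthetic bilinear curve (made-up numbers, not data from the study).
E, AREA, GAUGE = 200_000.0, 10.0, 50.0
elongation = [i * 0.025 for i in range(41)]
force = [min(E * d / GAUGE, 300.0 + 1000.0 * d / GAUGE) * AREA for d in elongation]

stress, strain = engineering_curve(force, elongation, AREA, GAUGE)
rm = tensile_strength(stress)
rp02 = offset_yield_strength(stress, strain, E)
success = closer_to_utt(predicted=452.0, utt_ref=450.0, spt_baseline=475.0)
print(rm, rp02, success)
```

A production implementation would interpolate the intersection with the offset line and handle curves with a distinct upper yield point separately, which the sketch omits for brevity.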
By situating neural network predictions within this framework, the present work contributes to ongoing efforts to establish the SPT as a reliable technique for structural health assessment. Specifically, this paper positions the SPT not only as a miniature testing method but also as a viable route for automated evaluation of degradation through mechanical property assessment. In particular, it provides an experimental demonstration that CNN-based models can correct known biases of the SPT and deliver UTT-consistent predictions across multiple steels and degradation states. At the same time, the study highlights limitations related to yield strength detection and outlier sensitivity, indicating directions for future development. Through these contributions, this work aims to bridge the gap between the testing of miniature specimens and conventional mechanical characterization, offering a pathway toward the automated data-driven evaluation of structural steels in service.
Recent years have also seen a rapid expansion of machine learning (ML) and neural network (NN) applications in materials science beyond the Small Punch Test domain. Convolutional and fully connected architectures have been successfully employed to predict mechanical properties directly from hardness measurements or simplified mechanical inputs. Such data-driven approaches demonstrate the ability of NNs to capture complex nonlinear relations between material features and macroscopic strength parameters even in cases where traditional empirical models fail. The present study follows this trend by adapting a convolutional architecture for curve-to-curve translation between SPT and UTT responses, providing a new perspective on how ML can bridge miniature testing with standard tensile characterization.
During the past two decades, numerous attempts have been made to link SPT results with standard tensile parameters using models based on neural networks. Early approaches relied primarily on synthetic data obtained from finite element (FE) simulations. Abendroth [12] trained feed-forward neural networks on the output of the Gurson–Tvergaard–Needleman (GTN) model in order to approximate inverse mappings between the SPT curve and the material parameters for several steels. Linse [16] extended this concept using the Stuttgart Neural Network Simulator to reconstruct full SPT load–displacement curves through multiple displacement-specific networks. Subsequent studies [17,20] emphasized the need for large simulated datasets, often consisting of thousands of FE-generated curves, in order to achieve acceptable accuracy in predicting hardening or damage parameters. Despite these advances, such models remained limited by their reliance on idealized data and poor generalization across materials. More recent efforts have investigated various architectures, including multilayer perceptrons (MLP), convolutional neural networks (CNN), and Bayesian networks for different materials [18,19,21]. Although these studies have confirmed the potential of deep learning for mechanical property prediction, most of them used simulated input data and focused on scalar parameter estimation rather than full-curve reconstruction. In contrast, the present study employs a one-dimensional CNN trained exclusively on experimentally measured SPT and UTT data to achieve direct curve-to-curve translation. This approach extends previous machine learning applications beyond parameter identification, enabling physically interpretable prediction of complete UTT-equivalent responses from miniature SPT experiments.
This paper introduces the analyzed materials, which were subjected to both conventional and unconventional testing methods, and takes a compositional approach to correlating SPT and UTT results. In the following sections, this paper:
Describes the evaluation of the capability of the SPT to reflect degradation-induced changes in mechanical properties.
Describes the implementation, practical testing, and results of a convolutional neural network for automating the interpretation of SPT results.
Benchmarks CNN-based predictions against SPT results in the context of UTT correlation.
Based on the goals presented above, the following research questions are addressed:
Can CNNs trained on paired SPT–UTT data outperform conventional SPT correlations in predicting UTT-equivalent curves?
Can CNN-based models capture service-induced degradation effects across different steels?
How robust are CNN predictions under experimental noise and outliers?
4. Discussion
The results obtained in this study demonstrate that convolutional neural networks (CNNs) can serve as a reliable tool for predicting Uniaxial Tensile Test (UTT) properties from Small Punch Test (SPT) data. The working hypothesis was that SPT curves would provide estimates of the yield strength (Re) and ultimate tensile strength (Rm) with precision comparable to conventional UTT measurements when processed through neural models. The present findings largely confirm this assumption, particularly for tensile strength, while also highlighting material- and state-dependent challenges for yield strength determination.
For tensile strength, CNN predictions consistently reproduced UTT reference values in all of the investigated steels (10H2M, 13HMF, and 15HM), with deviations limited to only a few percent. This precision exceeded that of the direct interpretation of SPT, which systematically overestimated the tensile strength by 20–30 MPa. Thus, the present study confirms that the SPT alone tends to bias absolute values, while data-driven models can correct for this offset. The high reproducibility of the CNN outputs further supports the applicability of such models in industrial contexts where precise estimation of tensile properties from miniature samples is a requirement.
In contrast, predicting the yield strength presented greater difficulty. The 10H2M steel proved difficult to assess because its curves lack a visible yield point, a characteristic that the CNN predictions inherited. For the other materials, 13HMF and 15HM, the CNN predictions showed improved alignment with the UTT. Both types of yield behavior can be seen in the example generated curves. The finding that the CNN systematically outperformed the SPT in predicting Re supports the hypothesis that convolutional architectures are better suited to capturing curve-level features, such as the saddle-shaped plateau that precedes yielding.
An important implication of these results is the sensitivity of neural networks to outliers. In 13HMF steel, an anomalous sample markedly affected the precision of the prediction and reduced the performance metrics. This confirms that outlier management is critical when building training databases for small datasets. Although the decision to remove or retain outliers remains open, this work highlights the need for robust preprocessing protocols to prevent such data points from disproportionately influencing model generalization.
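One possible form of such a preprocessing step is sketched below. It uses the modified z-score based on the median absolute deviation (MAD), a standard robust statistic that is swapped in here for illustration; it is not a protocol used or proposed in this study. The function name, the threshold of 3.5 (a commonly recommended default), and the sample values are assumptions.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Flag values whose modified z-score exceeds the threshold.

    The score is 0.6745 * (x - median) / MAD, where the constant 0.6745
    makes the MAD comparable to a standard deviation for normal data.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # All deviations collapse to zero: nothing can be ranked as an outlier.
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


# Invented tensile strength values in MPa (not data from the study).
rm_values = [452.0, 448.0, 455.0, 450.0, 510.0]
flags = flag_outliers(rm_values)
print(flags)  # only the last sample is flagged
```

Because the median and the MAD are barely affected by a single extreme sample, this check stays usable for the small group sizes typical of SPT databases. Flagging a sample does not settle whether to remove it, which remains an engineering decision.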
This study shows that CNNs can reduce the systematic bias inherent in the SPT, generating estimates of material properties with a level of consistency that confirms their robustness for curve-to-curve prediction. The findings demonstrate that CNNs can effectively translate SPT data into UTT-equivalent stress–strain behavior, bridging a longstanding methodological gap and allowing for automated evaluation of boiler steels in service. This positions deep learning as a practical tool for material state assessment, particularly when direct tensile testing is impractical or destructive. At the same time, the results highlight persistent challenges, especially in yield strength prediction, which may benefit from hybrid physics-informed models and uncertainty quantification to increase reliability in safety-critical applications.
Nevertheless, the applicability of such networks remains largely material-specific. Models trained on 10H2M, 13HMF, and 15HM steels learn curve geometries and unique degradation patterns for these alloys, and cannot be assumed to generalize to other steels without adaptation. Factors such as differences in microstructure, deformation mechanisms, and sensitivity to boundary conditions limit the ability of the network to extrapolate beyond its training domain. To extend the proposed methodology, additional experimental datasets would be required for each new steel, allowing transfer learning or retraining of dedicated decoder heads. In this way, the approach could gradually evolve into a more universal framework for SPT-to-UTT translation; however, for now its use should be regarded as restricted to the same steels on which it was trained.